Wednesday, April 5, 2023

Dealing with KDE "plasmashell" freezing

I've been using KDE for over a decade now, and something started happening in the past year or two (at least on Kubuntu 22.04): my whole screen would mostly freeze.  Generally, I'd still be able to alt-tab between windows, interact with them, etc., but I couldn't click on or interact with anything related to the window manager (the title bars, the task bar, etc.).

In my case, I'd immediately notice when I came back to my desk: there had obviously been a notification at some point, but the rendering got all screwed up:


In this image, you can see that the notification toast window has no visible content and instead looks like the KDE background image.  Also, the time is locked at 5:29 PM, which is when this problem happened (I didn't get back to my desk until 8:30 AM the next morning).

The general fix for this is to use a shell (if you have one open, great; if not, press ctrl+alt+F2 to jump to the console) and kill "plasmashell":
killall plasmashell

Once that's done, your desktop should be less broken, but it won't have the taskbar, etc.  From there, you can press alt+F2 to open the "run" window and type in:
plasmashell --replace


You can also run this from a terminal somewhere, but you need to make sure that your "DISPLAY" environment variable is set up correctly, etc.  I find it easier to do it from the run window (and I don't have to worry about redirecting its output anywhere, since "plasmashell" does generate some logging noise).
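For reference, the terminal version looks something like the sketch below.  The DISPLAY value is an assumption (":0" is common for the first X session, but yours may differ), and the redirect/nohup bits just keep the logging noise out of your terminal and let plasmashell survive the terminal closing:

```shell
# Kill the stuck shell, then restart it detached from this terminal.
# DISPLAY=:0 is an assumption; check your actual display first (e.g. with `w`).
killall plasmashell
DISPLAY=:0 nohup plasmashell --replace >/dev/null 2>&1 &
```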


Friday, February 10, 2023

Using a dynamic PVC on Kubernetes agents in Jenkins

I recently had to create a Jenkins job that needed to use a lot of disk space. The short version of the story is that the job needed to dump the contents of a Postgres database and upload that to Artifactory, and the "jfrog" command line tool won't let you stream an upload, so the entire dump had to be present on disk in order for it to work.

I run my Jenkins on Kubernetes, and the Kubernetes hosts absolutely didn't have the disk space needed to dump this database, and it was definitely too big to use a memory-based filesystem.

The solution was to use a dynamic Persistent Volume Claim, which is maybe(?) implemented as an ephemeral volume in Kubernetes, but the exact details of what it does under the hood aren't important.  What is important is that, as part of the job running, a new Persistent Volume Claim (PVC) gets created and is available for all of the containers in the pod.  When the job finishes, the PVC gets destroyed.  Perfect.

I couldn't figure out how to create a dynamic PVC as an ordinary volume that would get mounted on all of my containers (it's a thing, but apparently not for a declarative pipeline), but I was able to get the "workspace" dynamic PVC working.

A "workspace" volume is shared across all of the containers in the pod and has the Jenkins workspace mounted.  This has all of the Git contents, including the Jenkinsfile, for the job (I'm assuming that you're using Git-based jobs here).  Since all of the containers share the same workspace volume, any work done in one container is instantly visible in all of the others, without the need for Jenkins stashes or anything.

The biggest problem that I ran into was the permissions on the "workspace" file system.  Each of my containers had a different idea of what the UID of the user running the container would be, and all of the containers have to agree on the permissions around their "workspace" volume.

I ended up cheating and just forcing all of my containers to run as root (UID 0), since (1) everyone could agree on that, and (2) I didn't have to worry about "sudo" not being installed on some of the containers that needed to install packages as part of their setup.

Using "workspace" volumes

To use a "workspace" volume, set workspaceVolume inside the kubernetes block:

kubernetes {
   workspaceVolume dynamicPVC(accessModes: 'ReadWriteOnce', requestsSize: '300Gi')
   yaml '''
---
apiVersion: v1
kind: Pod
spec:
   securityContext:
      fsGroup: 0
      runAsGroup: 0
      runAsUser: 0
   containers:
[...]

In this example, we allocate a 300GiB volume for the duration of the job running.

In addition, you can see that I set the user and group information to 0 (for "root"), which let me work around all the annoying UID mismatches across the containers.  If you only have one container, then obviously you don't have to do this.  Also, if you have full control of your containers, then you can probably set them up with a known user with a fixed UID who can sudo, etc., as necessary.
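To put that snippet in context, here's a rough sketch of how the kubernetes block fits into a full declarative pipeline.  The container name, image, and stage are made up for illustration; the workspaceVolume and securityContext pieces match what's described above:

```groovy
pipeline {
  agent {
    kubernetes {
      // Allocate a 300GiB PVC shared by every container for this run.
      workspaceVolume dynamicPVC(accessModes: 'ReadWriteOnce', requestsSize: '300Gi')
      yaml '''
---
apiVersion: v1
kind: Pod
spec:
  securityContext:
    fsGroup: 0
    runAsGroup: 0
    runAsUser: 0
  containers:
    - name: postgres-tools   # hypothetical container name/image
      image: postgres:15
      command: ['sleep', 'infinity']
'''
    }
  }
  stages {
    stage('Dump') {
      steps {
        container('postgres-tools') {
          // The workspace is on the dynamic PVC, so this shows its size.
          sh 'df -h .'
        }
      }
    }
  }
}
```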

For more information about using Kubernetes agents in Jenkins, see the official docs, but (at least as of the time of this writing) they're missing a whole lot of information about volume-related things.

Troubleshooting

If you see Jenkins trying to create and then delete pods over and over and over again, you have something else wrong.  In my case, the Kubernetes service account that Jenkins uses didn't have any permissions around "persistentvolumeclaims" objects, so every time that the Pod was created, it would fail and try again.

I was only able to see the errors in the Jenkins logs in Kubernetes; they looked something like this:

Caused: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.100.0.1:443/api/v1/namespaces/cicd/persistentvolumeclaims. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. persistentvolumeclaims is forbidden: User "system:serviceaccount:cicd:default" cannot create resource "persistentvolumeclaims" in API group "" in the namespace "cicd".

I didn't have the patience to figure out exactly what was needed, so I just gave it everything:

- verbs:
    - create
    - delete
    - get
    - list
    - patch
    - update
    - watch
  apiGroups:
    - ''
  resources:
    - persistentvolumeclaims
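For reference, the snippet above is just a rules: entry; a complete Role and RoleBinding granting it to the service account from the error message might look like this (the "jenkins-pvc" name is made up):

```yaml
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jenkins-pvc        # hypothetical name
  namespace: cicd
rules:
  - apiGroups: ['']
    resources: ['persistentvolumeclaims']
    verbs: ['create', 'delete', 'get', 'list', 'patch', 'update', 'watch']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jenkins-pvc        # hypothetical name
  namespace: cicd
subjects:
  - kind: ServiceAccount
    name: default
    namespace: cicd
roleRef:
  kind: Role
  name: jenkins-pvc
  apiGroup: rbac.authorization.k8s.io
```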


Tuesday, January 31, 2023

Use a custom login page when using Apache to require sign-in

Apache has its own built-in authentication system(s) for providing access control to a site that it's hosting.  You've probably encountered this before using "basic" authentication backed by a flatfile created and edited using the htpasswd command.

If you do this using the common guides on the Internet (for example, this guide from Apache itself), then when you go to your site, you'll be presented with your browser's built-in basic-authentication dialog box asking for a username and password.  If you provide valid credentials, then you'll be moved on through to the main site, and if you don't, then it'll dump you to a plain "401 Unauthorized" page.

This works fine, but it has three main drawbacks:

  1. Password managers (such as LastPass) can't detect this dialog box and autofill it, which is very annoying.
  2. On some mobile browsers, the dialog gets in the way of normal operations.  Even if you have multiple tabs open, whatever tab is trying to get you to log in will get in the way and force you to deal with it.
  3. If you're using Windows authentication, the browser might detect the 401 error and attempt to sign you in using your domain credentials.  If the server expects a different set of credentials, then Windows's auto-login attempts will keep you from actually logging in.

(And the built-in popup is really ugly, and it submits the password in plaintext, etc., etc.)

Apache "Form" Authentication

To solve this problem, Apache has a type of authentication called "form" that adds an extra step involving an HTML form (that's fully customizable).

The workflow is as follows:

  1. Create a login HTML page (you'll have to provide the page).
  2. Register a handler for that page to POST to (Apache already has the handler).
  3. Update any "Directory" or "Location" blocks in your Apache config to use the "form" authentication type instead of "basic".

You'll also need these modules installed and enabled:
  1. mod_auth_form
  2. mod_request
  3. mod_session
  4. mod_session_cookie

On Ubuntu, I believe that these were all installed out of the box but needed to be enabled separately.  On Red Hat, I had to install the mod_session package, but everything was otherwise already enabled.
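On Ubuntu, enabling them looks something like this (the module names here are the a2enmod names; you'll need to restart Apache afterward):

```shell
sudo a2enmod auth_form request session session_cookie
sudo systemctl restart apache2
```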

Example

If you want to try out "form" authentication, I recommend that you get everything working with "basic" authentication first.  This is especially true if you have multiple directories that need to be configured separately.

For this example, I'm going to use our Nagios server.

There were two directories that needed to be protected: "/usr/local/nagios/sbin" and "/usr/local/nagios/share".  This setup is generally described by this document (although it covers "digest" authentication instead of "basic").

For both directories that already had "AuthType" set up, the changes are simple:

  1. Change AuthType Basic to AuthType form.
  2. Change AuthBasicProvider to AuthFormProvider.
  3. Add the login redirect: AuthFormLoginRequiredLocation "/login.html"
  4. Enable sessions: Session On
  5. Set a cookie name: SessionCookieName session path=/
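Put together, a protected block might end up looking something like this sketch; the AuthName and htpasswd path are placeholders, so keep whatever your "basic" setup already used:

```apache
<Directory "/usr/local/nagios/share">
  AuthType form
  AuthName "Nagios Access"
  AuthFormProvider file
  AuthUserFile /path/to/your/htpasswd.users
  Require valid-user

  AuthFormLoginRequiredLocation "/login.html"
  Session On
  SessionCookieName session path=/
</Directory>
```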

I decided to put my login page at "/login.html" because that makes sense, but you could put it anywhere (and even host it on a different server if you specify a full URL instead of just a path).

That page should contain a "form" with two "input" elements: "httpd_username" and "httpd_password".  The form "action" should be set to "/do-login.html" (or whatever handler you want to register with Apache).

At its simplest, "login.html" looks like this:

<form method="POST" action="/do-login.html">
  Username: <input type="text" name="httpd_username" value="" />
  Password: <input type="password" name="httpd_password" value="" />
  <input type="submit" name="login" value="Login" />
</form>

You'll probably want an "html" tag, a title and body and such, maybe some CSS, but this'll get the job done.

The last step is to register the thing that'll actually process the form data: "/do-login.html"

In your Apache config, add a "location" for it:

<Location "/do-login.html">
  SetHandler form-login-handler

  AuthType form
  AuthName "Nagios Access"
  AuthFormProvider file
  AuthUserFile /path/to/your/htpasswd.users

  AuthFormLoginRequiredLocation "/login.html"
  AuthFormLoginSuccessLocation "/nagios/"

  Session On
  SessionCookieName session path=/
</Location>

The key thing here is SetHandler form-login-handler.  This tells Apache to use its built-in form handler to take the values from httpd_username and httpd_password and compare them against your authentication provider(s) (in this example, it's just a flatfile, but you could use LDAP, etc.).

The other two options handle the last bit of navigation.  AuthFormLoginRequiredLocation sends you back to the login page if the username/password combination didn't work (you could potentially have another page here with an error message pre-written).  AuthFormLoginSuccessLocation sends you to the place where you want the user to go after login (I'm sending the user to the main Nagios page, but you could send them anywhere).

Notes

Other Authentication Providers

I've just covered the "file" authentication provider here.  If you use "ldap" and/or any others, then that config will need to be copied to every single place where you have "form" authentication set up, just like you would if you were only using the "file" provider.
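As a sketch of what an "ldap" provider would add to each of those sections (the URL and bind details are placeholders for whatever your directory actually uses):

```apache
AuthFormProvider ldap
AuthLDAPURL "ldap://ldap.example.com/ou=People,dc=example,dc=com?uid"
AuthLDAPBindDN "cn=apache,ou=Services,dc=example,dc=com"
AuthLDAPBindPassword "hunter2"
```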

I found this to be really annoying, since I had two directories to protect plus the form handler, which means every extra provider adds another 4 lines or so to each of those three config sections, but what matters is that it works.