Thursday, April 15, 2021

Using "errors.Is" to detect "connection reset by peer" and work around it

 I maintain an application that ties into Emergency Reporting using their REST API.  When an item is updated, I have a Google Cloud Task that attempts to publish a change to a web hook, which connects to the Emergency Reporting API and creates a new incident in that system.  Because it's in Cloud Tasks, if the task fails for any reason, Cloud Tasks will attempt to retry the task until it succeeds.  Cool.

I also have it set up to send any log messages at warning level or higher to a Slack channel.  Also cool.

However, in December of 2020, Emergency Reporting switched to some kind of Microsoft-managed authentication system for their API, and this has only brought problems.  The most common of which is that the authentication API will frequently fail with a "connection reset by peer" error.  My Emergency Reporting wrapper detects this and logs it; my web hook detects a sign-in failure and logs that; and the whole Cloud Task detects that the web hook has failed and logs that.  Cloud Tasks automatically retries the task, which makes another post to the web hook, and everything succeeds the second time.  But by now, I've accumulated a bunch of warnings in the Slack channel.  Not cool.

So here's the thing: the Emergency Reporting API can fail for a lot of reasons, and I'd like to be notified when something important actually happens.  But a standard, run-of-the-mill TCP "connection reset by peer" error is not important at all.

Here's an example of the kind of error that Go's http.Client.PostForm returns in this case:

Could not post form: Post https://login.emergencyreporting.com/login.emergencyreporting.com/B2C_1A_PasswordGrant/oauth2/v2.0/token: read tcp [fddf:3978:feb1:d745::c001]:33391->[2620:1ec:29::19]:443: read: connection reset by peer

Looking at the error, it looks like there are 4 layers of error:

  1. The HTTP post
  2. The particular TCP read
  3. A generic "read"
  4. A generic "connection reset by peer"
What I really want to do in this case is detect a generic "connection reset by peer" error and quietly retry the operation, allowing all other errors to be handled as true errors.  Doing string-comparison operations on error text is rarely a good idea, so what does that leave us with?

Go 1.13 adds support for "error wrapping", where one error can "wrap" another one, while still allowing programs to make decisions based on the "wrapped" error.  You may call "errors.Is" to determine if any error in an error chain matches a particular target.

Fortunately, all of the packages in this particular chain of errors utilize this feature.  In particular, the syscall package has a set of distinct Errno errors for each low-level error, including "connection reset by peer" (ECONNRESET).

This lets us do something like this:

tokenResponse, err = client.GenerateToken()
if err != nil {
   // If this was a connection-reset error, then continue to the next retry.
   if errors.Is(err, syscall.ECONNRESET) {
      logrus.Info("Got back a syscall.ECONNRESET from Emergency Reporting.")
      // [attempt to retry the operation]
   } else {
      // This was some other kind of error that we can't handle.
      // [log a proper error message and fail]
   }
}

Since using "errors.Is" to detect the "connection reset by peer" error, I haven't received a single annoying, pointless error message in my Slack channel.  I did have to spend a bit of time trying to figure out what that ultimate, underlying error was, but after that, it's been working flawlessly.

Monday, January 25, 2021

Using LDAP groups to limit access to a Radius server (freeRADIUS 3.0)

Note: this is an updated version of a prior entry for freeRADIUS 3.0.

Anytime I need to create a VPN (to my home network, to my AWS network, etc.), I use SoftEther.  SoftEther is OpenVPN-compatible, supports L2TP/IPsec, and has some neat settings around VPN over ICMP and DNS.  Anyway, once you get it set up, it generally just works (except for the cronjob that you need to make to trim its massive log files daily).

At work, we use LDAP for our user authentication and permissions, but SoftEther doesn't support LDAP.  It does, however, support Radius, and freeRADIUS supports using LDAP as a module, so you can easily set up a quick Radius proxy for LDAP.

Quick recap on setting up freeRADIUS with LDAP

I'm assuming that you already have an LDAP server.

Install freeRADIUS and the LDAP module.
sudo apt install freeradius freeradius-ldap
sudo systemctl enable freeradius
sudo systemctl start freeradius

Enable the LDAP module via symlink:
ln -sfn ../mods-available/ldap /etc/freeradius/3.0/mods-enabled/ldap

Then turn on the LDAP module by editing /etc/freeradius/3.0/sites-enabled/default and uncommenting the "ldap" line under the "authorize" block.
authorize {
...
   ldap
...

You'll need to add an "if" statement to set the "Auth-Type"; do this immediately after that "ldap" line.
   if ((ok || updated) && User-Password) {
      update {
         control:Auth-Type := ldap
      }
   }

And the same for the "Auth-Type LDAP" block.
authorize {
...
   Auth-Type LDAP {
      ldap
   }
...

Cool; at this point, freeRADIUS will use whatever LDAP setup is in the /etc/freeradius/3.0/mods-enabled/ldap file.  It won't work (because it's not set up for your LDAP server), that's all that you need in order to back your Radius server with your LDAP server.

Next up, we'll look at configuring it to actually talk to your LDAP server.

Configuring the LDAP module

/etc/freeradius/3.0/mods-enabled/ldap is where the LDAP configuration lives.  In order to understand exactly what's going on, you should know a few things.
  1. Run-time variables, like the current user name, are written as %{Variable-Name}.  For example, the current user name is %{User-Name}.
  2. Similar to shell variables, you can have conditional values.  The basic syntax is %{%{Variable-1}:-${Variable-2}}.  A typical pattern that you'll see is using the "stripped" user name (the user name without any realm information), but if that's not defined, then use the actual user name: %{%{Stripped-User-Name}:-%{User-Name}}
For your basic LDAP integration (if you provide a valid username and password, you can sign in), you'll need to set the following values in the "ldap" block:
  1. server; this is the hostname or address of your server.  If you're running freeRADIUS on the same LDAP server, then this will be "localhost".
  2. identity; this is the DN for the "bind" user.  That's the user that freeRADIUS will log in as in order to search the directory tree and do its LDAP stuff.  This is typically a read-only user.
  3. password; this is the password for the user configured in identity.
  4. base_dn; this is the base DN to use for all user searches.  It's usually something like dc=example,dc=com, but that'll depend on your LDAP setup.  You'll generally want to set this as the base for all of your users (maybe something like ou=users,dc=example,dc=com, etc.).
Here's an example that assumes that your users are all under ou=users,dc=example,dc=com:
server = "my-ldap-server.example.com"
identity = "uid=my-bind-user,ou=service-users,dc=example,dc=com"
password = "abc123"
base_dn = "ou=users,dc=example,dc=com"

Users

You'll also need to set up user-level things in the "user" block:
  1. filter; this is the LDAP search condition that freeRADIUS will use to try to find the matching LDAP user for the user name that just tried to sign in via Radius.  This is where run-time variables will come into play.  For out-of-the-box OpenLDAP, something like this will generally work: (uid=%{%{Stripped-User-Name}:-%{User-Name}}).  What this means is look for an entity in LDAP (under the base DN defined in basedn) with a uid property of the Radius user name.  Yes, you need the surrounding parentheses.  No, I don't make the rules.
Here's an example that uses "uid" for the user name.
filter = "(uid=%{%{Stripped-User-Name}:-%{User-Name}})"

Remember, filter can be any LDAP filter, so if there were a property that you also wanted to check (such as isAllowedToDoRadius or something), then you could check for that, as well.  For example:
filter = "(&(uid=%{%{Stripped-User-Name}:-%{User-Name}})(isAllowedToDoRadius=yes))"

Filtering by group

So, that'll let any LDAP user authenticate with Radius.  Maybe you want that, maybe you don't.  In my case, I have a whole bunch of users, but I only want a small subset to be able to VPN in using SoftEther.  I added those users to the "vpn-users" group in LDAP.

Note that there are two general grouping strategies in LDAP:
  1. Groups-have-users; in this strategy, the group entity lists the users within the group.  This is the default OpenLDAP strategy.
  2. Users-have-groups; in this strategy, the user entity lists the groups that it belongs to.
If you want to have freeRADIUS respect your groups, you'll need to set the following in /etc/freeradius/3.0/mods-enabled/ldap in the "groups" block:
  1. name_attribute = cn (which turns on tracking groups); and
  2. One of these two options, which each correspond to one of the LDAP grouping strategies:
    1. membership_filter; this is an LDAP filter to use to query for all of the groups that the user belongs to.
    2. membership_attribute; this is the property on the user entity that lists the groups that the user belongs to.
If your groups have users, this might look like:
name_attribute = cn
membership_filter = "(&(objectClass=posixGroup)(memberUid=%{%{Stripped-User-Name}:-%{User-Name}}))"

If your users have groups, this might look like:
name_attribute = cn
membership_attribute = groupName

With that set up, freeRADIUS will now know which groups the user belongs to, but it won't do anything with them.

The last step is to set up some group rules in /etc/freeradius/3.0/users.  There will probably be a few entries in that file already, but by default, none of them will be LDAP-related.  So, at the very bottom, add the LDAP group rules.

Note: In my case, this file was a symlink to "mods-config/files/authorize".  The symlink was a convenience for backward-compatibility in editing the config files; freeRADIUS doesn't actually load "users"; rather, it loads "mods-config/files/authorize", so make sure that you're actually modifying the correct file.

The simplest grouping rules will look like this:
DEFAULT LDAP-Group == "your-group-name-here"
DEFAULT Auth-Type := Reject
  Reply-Message = "Sorry, you're not part of an authorized group."

This generally means: you have to a member of "your-group-name-here" or else you'll be rejected (and here's the message to send you).

In my case, my group is "vpn-users", so it looks like this:
DEFAULT LDAP-Group == "vpn-users", Auth-Type := Accept
DEFAULT Auth-Type := Reject
  Reply-Message = "Sorry, you're not part of an authorized group."

Once that's done, restart freeradius and you'll be good to go.
sudo systemctl restart freeradius

To test to see if it worked, you can run the radtest command:
radtest -x ${username} ${password} ${address} ${port} ${secret}

For example, in our case, this might look like:
radtest -x some-user abc123 my-radius-server.example.com 1812 the-gold-is-under-the-bridge

On success, you'll see something like:
rad_recv: Access-Accept packet

On failure, you'll see something like:
rad_recv: Access-Reject packet

Hopefully this helped a bit; I struggle every time I need to do anything with LDAP or Radius.  It's always really hard to find the documentation for what I'm looking for.