Thursday, April 15, 2021

Using "errors.Is" to detect "connection reset by peer" and work around it

 I maintain an application that ties into Emergency Reporting using their REST API.  When an item is updated, I have a Google Cloud Task that attempts to publish a change to a web hook, which connects to the Emergency Reporting API and creates a new incident in that system.  Because it's in Cloud Tasks, if the task fails for any reason, Cloud Tasks will attempt to retry the task until it succeeds.  Cool.

I also have it set up to send any log messages at warning level or higher to a Slack channel.  Also cool.

However, in December of 2020, Emergency Reporting switched to some kind of Microsoft-managed authentication system for their API, and this has only brought problems.  The most common of which is that the authentication API will frequently fail with a "connection reset by peer" error.  My Emergency Reporting wrapper detects this and logs it; my web hook detects a sign-in failure and logs that; and the whole Cloud Task detects that the web hook has failed and logs that.  Cloud Tasks automatically retries the task, which makes another post to the web hook, and everything succeeds the second time.  But by now, I've accumulated a bunch of warnings in the Slack channel.  Not cool.

So here's the thing: the Emergency Reporting API can fail for a lot of reasons, and I'd like to be notified when something important actually happens.  But a standard, run-of-the-mill TCP "connection reset by peer" error is not important at all.

Here's an example of the kind of error that Go's http.Client.PostForm returns in this case:

Could not post form: Post https://login.emergencyreporting.com/login.emergencyreporting.com/B2C_1A_PasswordGrant/oauth2/v2.0/token: read tcp [fddf:3978:feb1:d745::c001]:33391->[2620:1ec:29::19]:443: read: connection reset by peer

Looking at the error, it looks like there are 4 layers of error:

  1. The HTTP post
  2. The particular TCP read
  3. A generic "read"
  4. A generic "connection reset by peer"
What I really want to do in this case is detect a generic "connection reset by peer" error and quietly retry the operation, allowing all other errors to be handled as true errors.  Doing string-comparison operations on error text is rarely a good idea, so what does that leave us with?

Go 1.13 adds support for "error wrapping", where one error can "wrap" another one, while still allowing programs to make decisions based on the "wrapped" error.  You may call "errors.Is" to determine if any error in an error chain matches a particular target.

Fortunately, all of the packages in this particular chain of errors utilize this feature.  In particular, the syscall package has a set of distinct Errno errors for each low-level error, including "connection reset by peer" (ECONNRESET).

This lets us do something like this:

tokenResponse, err = client.GenerateToken()
if err != nil {
   // If this was a connection-reset error, then continue to the next retry.
   if errors.Is(err, syscall.ECONNRESET) {
      logrus.Info("Got back a syscall.ECONNRESET from Emergency Reporting.")
      // [attempt to retry the operation]
   } else {
      // This was some other kind of error that we can't handle.
      // [log a proper error message and fail]
   }
}

Since using "errors.Is" to detect the "connection reset by peer" error, I haven't received a single annoying, pointless error message in my Slack channel.  I did have to spend a bit of time trying to figure out what that ultimate, underlying error was, but after that, it's been working flawlessly.

Monday, January 25, 2021

Using LDAP groups to limit access to a Radius server (freeRADIUS 3.0)

Note: this is an updated version of a prior entry for freeRADIUS 3.0.

Anytime I need to create a VPN (to my home network, to my AWS network, etc.), I use SoftEther.  SoftEther is OpenVPN-compatible, supports L2TP/IPsec, and has some neat settings around VPN over ICMP and DNS.  Anyway, once you get it set up, it generally just works (except for the cronjob that you need to make to trim its massive log files daily).

At work, we use LDAP for our user authentication and permissions, but SoftEther doesn't support LDAP.  It does, however, support Radius, and freeRADIUS supports using LDAP as a module, so you can easily set up a quick Radius proxy for LDAP.

Quick recap on setting up freeRADIUS with LDAP

I'm assuming that you already have an LDAP server.

Install freeRADIUS and the LDAP module.
sudo apt install freeradius freeradius-ldap
sudo systemctl enable freeradius
sudo systemctl start freeradius

Enable the LDAP module via symlink:
ln -sfn ../mods-available/ldap /etc/freeradius/3.0/mods-enabled/ldap

Then turn on the LDAP module by editing /etc/freeradius/3.0/sites-enabled/default and uncommenting the "ldap" line under the "authorize" block.
authorize {
...
   ldap
...

You'll need to add an "if" statement to set the "Auth-Type"; do this immediately after that "ldap" line.
   if ((ok || updated) && User-Password) {
      update {
         control:Auth-Type := ldap
      }
   }

And the same for the "Auth-Type LDAP" block.
authorize {
...
   Auth-Type LDAP {
      ldap
   }
...

Cool; at this point, freeRADIUS will use whatever LDAP setup is in the /etc/freeradius/3.0/mods-enabled/ldap file.  It won't work (because it's not set up for your LDAP server), that's all that you need in order to back your Radius server with your LDAP server.

Next up, we'll look at configuring it to actually talk to your LDAP server.

Configuring the LDAP module

/etc/freeradius/3.0/mods-enabled/ldap is where the LDAP configuration lives.  In order to understand exactly what's going on, you should know a few things.
  1. Run-time variables, like the current user name, are written as %{Variable-Name}.  For example, the current user name is %{User-Name}.
  2. Similar to shell variables, you can have conditional values.  The basic syntax is %{%{Variable-1}:-${Variable-2}}.  A typical pattern that you'll see is using the "stripped" user name (the user name without any realm information), but if that's not defined, then use the actual user name: %{%{Stripped-User-Name}:-%{User-Name}}
For your basic LDAP integration (if you provide a valid username and password, you can sign in), you'll need to set the following values in the "ldap" block:
  1. server; this is the hostname or address of your server.  If you're running freeRADIUS on the same LDAP server, then this will be "localhost".
  2. identity; this is the DN for the "bind" user.  That's the user that freeRADIUS will log in as in order to search the directory tree and do its LDAP stuff.  This is typically a read-only user.
  3. password; this is the password for the user configured in identity.
  4. base_dn; this is the base DN to use for all user searches.  It's usually something like dc=example,dc=com, but that'll depend on your LDAP setup.  You'll generally want to set this as the base for all of your users (maybe something like ou=users,dc=example,dc=com, etc.).
Here's an example that assumes that your users are all under ou=users,dc=example,dc=com:
server = "my-ldap-server.example.com"
identity = "uid=my-bind-user,ou=service-users,dc=example,dc=com"
password = "abc123"
base_dn = "ou=users,dc=example,dc=com"

Users

You'll also need to set up user-level things in the "user" block:
  1. filter; this is the LDAP search condition that freeRADIUS will use to try to find the matching LDAP user for the user name that just tried to sign in via Radius.  This is where run-time variables will come into play.  For out-of-the-box OpenLDAP, something like this will generally work: (uid=%{%{Stripped-User-Name}:-%{User-Name}}).  What this means is look for an entity in LDAP (under the base DN defined in basedn) with a uid property of the Radius user name.  Yes, you need the surrounding parentheses.  No, I don't make the rules.
Here's an example that uses "uid" for the user name.
filter = "(uid=%{%{Stripped-User-Name}:-%{User-Name}})"

Remember, filter can be any LDAP filter, so if there were a property that you also wanted to check (such as isAllowedToDoRadius or something), then you could check for that, as well.  For example:
filter = "(&(uid=%{%{Stripped-User-Name}:-%{User-Name}})(isAllowedToDoRadius=yes))"

Filtering by group

So, that'll let any LDAP user authenticate with Radius.  Maybe you want that, maybe you don't.  In my case, I have a whole bunch of users, but I only want a small subset to be able to VPN in using SoftEther.  I added those users to the "vpn-users" group in LDAP.

Note that there are two general grouping strategies in LDAP:
  1. Groups-have-users; in this strategy, the group entity lists the users within the group.  This is the default OpenLDAP strategy.
  2. Users-have-groups; in this strategy, the user entity lists the groups that it belongs to.
If you want to have freeRADIUS respect your groups, you'll need to set the following in /etc/freeradius/3.0/mods-enabled/ldap in the "groups" block:
  1. name_attribute = cn (which turns on tracking groups); and
  2. One of these two options, which each correspond to one of the LDAP grouping strategies:
    1. membership_filter; this is an LDAP filter to use to query for all of the groups that the user belongs to.
    2. membership_attribute; this is the property on the user entity that lists the groups that the user belongs to.
If your groups have users, this might look like:
name_attribute = cn
membership_filter = "(&(objectClass=posixGroup)(memberUid=%{%{Stripped-User-Name}:-%{User-Name}}))"

If your users have groups, this might look like:
name_attribute = cn
membership_attribute = groupName

With that set up, freeRADIUS will now know which groups the user belongs to, but it won't do anything with them.

The last step is to set up some group rules in /etc/freeradius/3.0/users.  There will probably be a few entries in that file already, but by default, none of them will be LDAP-related.  So, at the very bottom, add the LDAP group rules.

Note: In my case, this file was a symlink to "mods-config/files/authorize".  The symlink was a convenience for backward-compatibility in editing the config files; freeRADIUS doesn't actually load "users"; rather, it loads "mods-config/files/authorize", so make sure that you're actually modifying the correct file.

The simplest grouping rules will look like this:
DEFAULT LDAP-Group == "your-group-name-here"
DEFAULT Auth-Type := Reject
  Reply-Message = "Sorry, you're not part of an authorized group."

This generally means: you have to a member of "your-group-name-here" or else you'll be rejected (and here's the message to send you).

In my case, my group is "vpn-users", so it looks like this:
DEFAULT LDAP-Group == "vpn-users", Auth-Type := Accept
DEFAULT Auth-Type := Reject
  Reply-Message = "Sorry, you're not part of an authorized group."

Once that's done, restart freeradius and you'll be good to go.
sudo systemctl restart freeradius

To test to see if it worked, you can run the radtest command:
radtest -x ${username} ${password} ${address} ${port} ${secret}

For example, in our case, this might look like:
radtest -x some-user abc123 my-radius-server.example.com 1812 the-gold-is-under-the-bridge

On success, you'll see something like:
rad_recv: Access-Accept packet

On failure, you'll see something like:
rad_recv: Access-Reject packet

Hopefully this helped a bit; I struggle every time I need to do anything with LDAP or Radius.  It's always really hard to find the documentation for what I'm looking for.

Tuesday, December 22, 2020

Using LDAP groups to limit access to a Radius server

Anytime I need to create a VPN (to my home network, to my AWS network, etc.), I use SoftEther.  SoftEther is OpenVPN-compatible, supports L2TP/IPsec, and has some neat settings around VPN over ICMP and DNS.  Anyway, once you get it set up, it generally just works (except for the cronjob that you need to make to trim its massive log files daily).

At work, we use LDAP for our user authentication and permissions, but SoftEther doesn't support LDAP.  It does, however, support Radius, and freeRADIUS supports using LDAP as a module, so you can easily set up a quick Radius proxy for LDAP.

Quick recap on setting up freeRADIUS with LDAP

I'm assuming that you already have an LDAP server.

Install freeRADIUS and the LDAP module.
sudo apt install freeradius freeradius-ldap
sudo systemctl enable freeradius
sudo systemctl start freeradius

Then turn on the LDAP module by editing /etc/freeradius/sites-enabled/default and uncommenting the "ldap" line under the "authorize" block.
authorize {
...
   ldap
...

And the same for the "Auth-Type LDAP" block.
authorize {
...
   Auth-Type LDAP {
      ldap
   }
...

Cool; at this point, freeRADIUS will use whatever LDAP setup is in the /etc/freeradius/modules/ldap file.  It won't work (because it's not set up for your LDAP server), that's all that you need in order to back your Radius server with your LDAP server.

Next up, we'll look at configuring it to actually talk to your LDAP server.

Configuring the LDAP module

/etc/freeradius/modules/ldap is where the LDAP configuration lives.  In order to understand exactly what's going on, you should know a few things.
  1. Run-time variables, like the current user name, are written as %{Variable-Name}.  For example, the current user name is %{User-Name}.
  2. Similar to shell variables, you can have conditional values.  The basic syntax is %{%{Variable-1}:-${Variable-2}}.  A typical pattern that you'll see is using the "stripped" user name (the user name without any realm information), but if that's not defined, then use the actual user name: %{%{Stripped-User-Name}:-%{User-Name}}
For your basic LDAP integration (if you provide a valid username and password, you can sign in), you'll need to set the following values in the "ldap" block:
  1. server; this is the hostname or address of your server.  If you're running freeRADIUS on the same LDAP server, then this will be "localhost".
  2. identity; this is the DN for the "bind" user.  That's the user that freeRADIUS will log in as in order to search the directory tree and do its LDAP stuff.  This is typically a read-only user.
  3. password; this is the password for the user configured in identity.
  4. basedn; this is the base DN to use for all user searches.  It's usually something like dc=example,dc=com, but that'll depend on your LDAP setup.  You'll generally want to set this as the base for all of your users (maybe something like ou=users,dc=example,dc=com, etc.).
  5. filter; this is the LDAP search condition that freeRADIUS will use to try to find the matching LDAP user for the user name that just tried to sign in via Radius.  This is where run-time variables will come into play.  For out-of-the-box OpenLDAP, something like this will generally work: (uid=%{%{Stripped-User-Name}:-%{User-Name}}).  What this means is look for an entity in LDAP (under the base DN defined in basedn) with a uid property of the Radius user name.  Yes, you need the surrounding parentheses.  No, I don't make the rules.
Here's an example that assumes that your users are all under ou=users,dc=example,dc=com and have a uid property that is their user name:
server = "my-ldap-server.example.com"
identity = "uid=my-bind-user,ou=service-users,dc=example,dc=com"
password = "abc123"
basedn = "ou=users,dc=example,dc=com"
filter = "(uid=%{%{Stripped-User-Name}:-%{User-Name}})"

Remember, filter can be any LDAP filter, so if there were a property that you also wanted to check (such as isAllowedToDoRadius or something), then you could check for that, as well.  For example:
filter = "(&(uid=%{%{Stripped-User-Name}:-%{User-Name}})(isAllowedToDoRadius=yes))"

Filtering by group

So, that'll let any LDAP user authenticate with Radius.  Maybe you want that, maybe you don't.  In my case, I have a whole bunch of users, but I only want a small subset to be able to VPN in using SoftEther.  I added those users to the "vpn-users" group in LDAP.

Note that there are two general grouping strategies in LDAP:
  1. Groups-have-users; in this strategy, the group entity lists the users within the group.  This is the default OpenLDAP strategy.
  2. Users-have-groups; in this strategy, the user entity lists the groups that it belongs to.
If you want to have freeRADIUS respect your groups, you'll need to set the following in /etc/freeradius/modules/ldap:
  1. groupname_attribute = cn (which turns on tracking groups); and
  2. One of these two options, which each correspond to one of the LDAP grouping strategies:
    1. groupmembership_filter; this is an LDAP filter to use to query for all of the groups that the user belongs to.
    2. groupmembership_attribute; this is the property on the user entity that lists the groups that the user belongs to.
If your groups have users, this might look like:
groupname_attribute = cn
groupmembership_filter = "(&(objectClass=posixGroup)(memberUid=%{%{Stripped-User-Name}:-%{User-Name}}))"

If your users have groups, this might look like:
groupname_attribute = cn
groupmembership_attribute = groupName

With that set up, freeRADIUS will now know which groups the user belongs to, but it won't do anything with them.

The last step is to set up some group rules in /etc/freeradius/users.  There will probably be a few entries in that file already, but by default, none of them will be LDAP-related.  So, at the very bottom, add the LDAP group rules.

The simplest grouping rules will look like this:
DEFAULT LDAP-Group == "your-group-name-here"
DEFAULT Auth-Type := Reject
  Reply-Message = "Sorry, you're not part of an authorized group."

This generally means: you have to a member of "your-group-name-here" or else you'll be rejected (and here's the message to send you).

In my case, my group is "vpn-users", so it looks like this:
DEFAULT LDAP-Group == "vpn-users"
DEFAULT Auth-Type := Reject
  Reply-Message = "Sorry, you're not part of an authorized group."

Once that's done, restart freeradius and you'll be good to go.
sudo systemctl restart freeradius

To test to see if it worked, you can run the radtest command:
radtest -x ${username} ${password} ${address} ${port} ${secret}

For example, in our case, this might look like:
radtest -x some-user abc123 my-radius-server.example.com 1812 the-gold-is-under-the-bridge

On success, you'll see something like:
rad_recv: Access-Accept packet

On failure, you'll see something like:
rad_recv: Access-Reject packet

Hopefully this helped a bit; I struggle every time I need to do anything with LDAP or Radius.  It's always really hard to find the documentation for what I'm looking for.

Monday, December 7, 2020

Working with the Google Datastore emulator

 I do a good chunk of my business in Google App Engine; you package up your web application, send it to GCP, and then it takes care of scaling and uptime and all that stuff.

When I started out in 2014, I created my main application in Java because that was the least-crappy language that was supported.  However, in 2020, there are a whole lot more languages (in particular: Go).  I've slowly been working on porting my application from Java 8 to Go 1.14.  Along the way, I've run into some really annoying issues.

For today, I'm going to be focusing on the Datastore emulator.  In "old" App Engine (Java 8, Go 1.11, Python 2, etc.), they gave you a whole emulator suite.  Your application ran inside of that suite, and you had fake Google-based App Engine authentication, inbound e-mail, and a Datastore emulator that also had a web UI that you could use to see your entities and manipulate them.  The Datastore emulator's web UI wasn't as good as the current one that you get in production, but it was good enough to use for development.

Well, in "new" App Engine, the emulator suite is gone, and now you have to emulate or mock every aspect of App Engine that you plan on using.  It's not a huge deal, but it is a bit inconvenient.  In particular, you now have to start your own Datastore emulator.

It's easy to start:

gcloud config set project <your-project-id>;
gcloud beta emulators datastore start;

There are some environment variables that you'll need to export for the various libraries to detect and use instead of the production instance; run this to see them:

gcloud beta emulators datastore env-init;

That part is fine.

There are also two halfway-decent third party web UIs for the Datastore emulator:

  1. https://github.com/GabiAxel/google-cloud-gui
  2. https://github.com/streamrail/dsui

I fought for hours trying to figure out why either of those two web UIs didn't work.  Neither would show any namespaces (and thus, neither would show any entities).

The short answer is that despite what the Datastore emulator claims it's using for the project ID, the only thing that it actually uses is "dummy-emulator-datastore-project".

I got a hint about it by poking around in the emulator's data file, and I got some confirmation in this file, which is the only thing on the Internet at the time of this writing that references that string: https://code.googlesource.com/gocloud/+/master/datastore/datastore.go

So, if you start the Datastore emulator according to the instructions and either of those two web UIs aren't working, try setting the project ID to "dummy-emulator-datastore-project".

  1. In "google-cloud-gui", you set the project ID in the UI when you hit the "+" button to create a new project.
  2. In "dsui", you set the project ID using the "--projectId" flag.

Saturday, July 18, 2020

Google Cloud Functions Issues Upgrading From Go 1.11 To 1.13

I use Google Cloud Functions with Go.  However, I upgraded from Go 1.11 to Go 1.13 (because Go 1.11 is being deprecated) and ran into some annoying, undocumented issues.

Static Files And The Current Working Directory

One of my Cloud Functions acts as a tiny web server; it has a few static HTML files that it serves in addition to its dynamic things.

In Go 1.11, Cloud Functions put the static files (and all the source files, for that matter) in the working directory of the function.  This (1) makes sense, and (2) makes testing easy.

However, in Go 1.13, Cloud Functions puts the static files (and all of the source files) is placed in the ./serverless_function_source_code directory.  Why?  Who knows.  All that mattered is that after a simple version upgrade, all of my stuff broke because it couldn't find files that it was able to find before the upgrade.

I found that using a sync.Once to attempt to change the current working directory (if necessary) is a fairly clean backward-compatible way of handling this issue.

Here's an example; it's fairly verbose, but you could rip out most of the logging if you don't want or need it.

// GoogleCloudFunctionSourceDirectory is where Google Cloud will put the source code that was uploaded.
//
const GoogleCloudFunctionSourceDirectory = "serverless_function_source_code"

// once is an object that will only execute its function one time.
//
// Because we want to log during our initialization, we need to handle this in a non-standard
// function and keep track of our initialization status.
var once sync.Once

// Initialize initializes the application.
//
// Primarily, this changes the current working directory.
func Initialize(log *logrus.Logger) {
log.Infof("Initializing the application.")

path, err := os.Getwd()
if err != nil {
log.Warnf("Could not find the current working directory: %v", err)
}
log.Infof("Current working directory: %s", path)

log.Infof("Looking for top-level source directory: %s", GoogleCloudFunctionSourceDirectory)
fileInfo, err := os.Stat(GoogleCloudFunctionSourceDirectory)
if err == nil && fileInfo.IsDir() {
log.Infof("Found top-level source directory: %s", GoogleCloudFunctionSourceDirectory)
err = os.Chdir(GoogleCloudFunctionSourceDirectory)
if err != nil {
log.Warnf("Could not change to directory %q: %v", GoogleCloudFunctionSourceDirectory, err)
}
}

log.Infof("Initialization complete.")
}

// CloudFunction is an HTTP Cloud Function with a request parameter.
func CloudFunction(w http.ResponseWriter, r *http.Request) {
log := logrus.New()

// Initialize our application if we haven't already.
once.Do(func() { Initialize(log) })

// YOUR CLOUD FUNCTION LOGIC HERE
}

For more information, see the Cloud Functions concepts docs.

Logging And Environment Variables

For whatever reason, Cloud Functions with Go don't log at anything other than the "default" log level; this means that all of my carefully crafted log messages all just get dumped into the logs at the same severity.

I've been using gcfhook with logrus to get around this, but it's not an ideal solution.  That combination works by nullifying all output of the application and then adding a logrus hook that connects to the StackDriver API to send proper logs over the network.  It works fine, but it's silly to have to make a network connection to a logging API when the application itself can output directly.

As of Go 1.13, Cloud Functions will no longer set the FUNCTION_NAME, FUNCTION_REGION, and GCP_PROJECT environment variables.  This is a problem because we need those three pieces of information in order to use the StackDriver API to send the log messages.  You could publish those environment variables back as part of your deployment, but I'd prefer not to.

Fortunately, Cloud Functions can now parse (poorly documented) JSON-formatted lines from stdout and stderr, resulting in proper log messages with severities.  The Cloud Functions docs refer to this as "structured logging", but the docs don't seem to apply correctly.  Cloud Run has a document on how these JSON-formatted lines should look, but it's still a bit hazy.

Anyway, the gcfstructuredlogformatter package introduces a logrus formatter that outputs JSON instead of plain text for logs.  This eliminates the need for the extra environment variables and generally simplifies the logging workflow.  It should only be a couple of lines of code to sub out gcfhook for gcfstructuredlogformatter.

Here's an example:

// CloudFunction is an HTTP Cloud Function with a request parameter.
func CloudFunction(w http.ResponseWriter, r *http.Request) {
log := logrus.New()

if value := os.Getenv("FUNCTION_TARGET"); value == "" {
log.Infof("FUNCTION_TARGET is not set; falling back to normal logging.")
} else {
formatter := gcfstructuredlogformatter.New()

log.SetFormatter(formatter)
}

log.Infof("This is an info message.")
log.Warnf("This is a warning message.")
log.Errorf("This is an error message.")

// YOUR CLOUD FUNCTION LOGIC HERE
}

Hopefully this stopped you from banging your head against the wall for a few hours like I was doing as I tried to frantically figure out why the upgrade had failed in such weird ways.


Wednesday, June 3, 2020

Sharing a single screen in Slack for Linux

I have a bunch of monitors, and for whatever reason, Slack for Linux refuses to let me limit my screen sharing to a single monitor or application.  This means that if I try to share my screen on a call, no one can see or read anything because they just see a giant, wide view of three monitors' worth of pixels crammed into their Slack window (typically only one monitor wide).

In my experience, disabling monitors/displays is just not worth it; I'll have to spend too much time getting everything set back up correctly afterward, and that's really inconvenient and annoying.

The solution that I've landed on is Xephyr; Xephyr runs a second X11 server inside a new window, so when I need to get on a call where I'll have to share my screen, I simply:
  1. Launch a new Xephyr display.
  2. Close Slack.
  3. Open Slack on the Xephyr display.
  4. Open whatever else I'll need to share in the Xephyr display, typically a web browser or a terminal.
  5. Get on the Slack call and share my "screen".
Some small details:
  • You'll need to open Xephyr with the resolution that you want; given window decorations and such, you may need to play around with this a bit.  Once you find out what works, put it in a script.
  • In order to resize windows in Xephyr, it'll need to be running a window manager.  I struggled to get any "startx"-related things working, but I found that "twm" worked well enough for my purposes.
  • Some applications, such as Chrome, won't open on two displays at the same time.  I just open a different browser in my Xephyr display (for example, I use "google-chrome" normally and "chromium-browser" in Xephyr), but you can also run Chrome using a different profile directory and it'll run in the other display.
Install Xephyr and TWM:
sudo apt install xserver-xephyr twm

Run Zephyr, Slack, and Chromium:
# Launch Xephyr and create display ":1".
Xephyr -ac -noreset -screen 1920x1000 :1 &
# Start a window manager in Xephyr.
DISPLAY=:1 twm &>/dev/null &
# Open Slack in Xephyr.
DISPLAY=:1 slack &>/dev/null &
# Open Chromium in Xephyr.
DISPLAY=:1 chromium-browser &>/dev/null &

It's kind of dirty, but it works extremely well, and I don't have to worry about messing with my monitor setup when I need to give a presentation.

Edit: an earlier version of this post used "Xephyr -bc -ac -noreset -screen 1920x1000 :1 &" for the Xephyr command; I can't get this to work with "-bc" anymore; I must have copied the wrong command when I published the post.

Wednesday, January 29, 2020

Unit-testing reCAPTCHA v2 and v3 in Go

I recently worked on a project where we allowed new users to sign up for our system with a form.  A new user would need to provide us with her name, her e-mail address, and a password.  In order to prevent spamming, we used reCAPTCHA v3, and so that meant that we also submitted a reCAPTCHA token along with the rest of the new-user data.

Unit-testing the sign-up process was fairly simple if we turned off the reCAPTCHA requirement, but the weakest link in the whole process is the one part that we could not control: reCAPTCHA.  It would be foolish not to have test coverage around the reCAPTCHA workflow.

So, how do you unit-test reCAPTCHA?

Focus: Server-side testing

For the purposes of this post, I'm going to be focusing on testing reCAPTCHA on the server side.  This means that I'm not concerned with validating that users acted like humans fiddling around on a website.  Instead, I'm concerned with what our sign-up endpoint does when it receives valid and invalid reCAPTCHA tokens.

reCAPTCHA in Go

There are a variety of Go package to provide reCAPTCHA support; however, only one of them (1) has support for Go modules, and (2) has support for unit testing built in:

For docs, see:

When you create a new reCAPTCHA site, you're given public and private keys (short little strings, nothing huge).  The public key is used on the client side when you make your connection to the reCAPTCHA API, and a response token is provided back.  The private key is used on the server side to connect to the reCAPTCHA API and validate the response token.

Since the client side will likely be a line or two of Javascript that generates a token, our server-side work will be focused on validating that token.

Assuming that the newly generated token is "NEW_TOKEN" and that the private key is "YOUR_PRIVATE_KEY", then this is all you have to do in order to validate that token:

import "github.com/tekkamanendless/go-recaptcha"

// ...

recaptchaVerifier := recaptcha.New("YOUR_PRIVATE_KEY")
success, err := recaptchaVerifier.Verify("NEW_TOKEN")
if err != nil {
   // Fail with some 500-level error about not being able to verify the token
}


if !success {
   // Fail with some 400-level error about not being a human
}


// The token is valid!


And that's it!  All we really care about is whether or not it worked.

Unit testing

tekkamanendless/go-recaptcha includes a package called "recaptchatest" that provides a fake reCAPTCHA API running as an httptest.Server instance.  This server simulates enough of the reCAPTCHA API to let you do the kinds of testing that you need to.

Just like the actual reCAPTCHA service, you can create multiple "sites" on the test server.  Each site will have a public and private key, and you can call the NewResponseToken method of a site to have that site generate a valid token for that site.

In terms of design, you'll set up the test server, the test site, and the valid token in advance of your test.  When you create your Recaptcha instance with the test site's private key, all you have to do is set the VerifyEndpoint property of that instance to point to the test server (otherwise, it would try to talk to the real reCAPTCHA API and fail).

Here's a simple example:

import (
   "github.com/tekkamanendless/go-recaptcha" 
   "github.com/tekkamanendless/go-recaptcha/recaptchatest"
)

// ...

// Create a new reCAPTCHA test server, site, and valid token before the main test.
testServer := recaptchatest.NewServer()
defer testServer.Close()

site := testServer.NewSite()
token := site.NewResponseToken()

// Create the reCAPTCHA verifier with the site's private key.
recaptchaVerifier := recaptcha.New(site.PrivateKey)

// Override the endpoint so that it uses the test server.
recaptchaVerifier.VerifyEndpoint = testServer.VerifyEndpoint()

// Run your test.

// ...

// Validate that the reCAPTCHA token is good.
success, err := recaptchaVerifier.Verify(token)
assert.Nil(t, err)
assert.True(t, success)

The recaptchatest test server doesn't do too much that's fancy, but it will properly return a failure if the same token is verified twice or if the token is too old.  It also has some functions to let you tweak the token properties so you don't have to wait around for 2 minutes for a token to age out; you can make one that's already too old (see Site.GenerateToken for more information).