Sunday, December 20, 2015

Google App Engine cron jobs run as no user

Google App Engine (GAE) is a great platform to use when developing just about any new application.  I use it for a couple of personal projects, as well as for some fire department apps (dispatch and inventory management).

Today I'd like to talk about two things: users and cron jobs.

Users

One of the really convenient things about GAE is that it has built-in support for Google authentication (no surprise there).  This means that you can let GAE take care of your sign-in system (and trust me, handling the single-sign-on three-way handshake isn't all that fun).  With GAE, you can easily go one of two routes:
  1. Certain paths in your web application are automatically required to have a logged-in user.  If someone tries to access a path without being logged in, GAE will redirect him to a sign-in window and then bring him back when he's done.
  2. Server-side, you can ask GAE for log-in and log-out URLs, and you can direct a user to these at any time to have him log in or out.
I personally prefer the second route because there's no strange redirection and all of my endpoints behave as I expect them to.  For example, if I have a JSON REST API, a user (who has not signed in yet) can access the endpoint and be given a normal 400-level error (in JSON) instead of being redirected to a Google sign-in page.  My REST clients much prefer this.

To see if a user is signed in, a Java Servlet can call the "getCurrentUser()" method of the "UserService" instance.  If the result is null, then no one is logged in.  Otherwise, you get back a "User" object, and between it and the "UserService" you have a few helpful methods to tell you about the user:
  1. "user.getEmail()"; this returns the user's e-mail address.
  2. "user.getUserId()"; this returns a unique ID for the user (as a String).
  3. "userService.isUserAdmin()"; this returns whether or not the user is an administrator of the GAE application.  Note that this one lives on the "UserService", not on the "User".
For my apps, user authentication is whitelist style.  I check the domain portion of the e-mail address (everything after the "@") to see if the user is in one of the domains that I care about (I typically build apps internal to an organization that uses Google's mail system), and I check the administrator status to grant administrator powers to my admins.
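
As a rough sketch of that domain check (not my production code), here is a small helper in plain Java.  The class name "DomainWhitelist", its method names, and the domains "example.com" and "example.org" are all made up for illustration; in a real Servlet you would pass in the result of "user.getEmail()".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: checks an e-mail address against a domain whitelist. */
final class DomainWhitelist {
    // Assumed example domains; replace these with the domains you care about.
    private static final Set<String> ALLOWED_DOMAINS = Collections.unmodifiableSet(
            new HashSet<String>(Arrays.asList("example.com", "example.org")));

    private DomainWhitelist() {}

    /** Returns the lower-cased text after the last '@', or null if there is none. */
    static String domainOf(String email) {
        if (email == null) {
            return null;
        }
        int at = email.lastIndexOf('@');
        if (at < 0 || at == email.length() - 1) {
            return null; // No '@' at all, or nothing after it.
        }
        return email.substring(at + 1).toLowerCase(Locale.ROOT);
    }

    /** Returns true only if the whole domain exactly matches a whitelisted one. */
    static boolean isAllowed(String email) {
        String domain = domainOf(email);
        return domain != null && ALLOWED_DOMAINS.contains(domain);
    }
}
```

Comparing the whole domain (rather than using something like "contains()" on the address) keeps an address such as "mallory@example.com.evil.net" from slipping through.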

If someone tries to access a sensitive API endpoint without being logged in appropriately, I'll send back a 400-level error stating that the user is not signed in with an appropriate account.

Pretty easy stuff.

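// Imports needed at the top of the Servlet source file:
import com.google.appengine.api.users.User;
import com.google.appengine.api.users.UserService;
import com.google.appengine.api.users.UserServiceFactory;

// Inside the Servlet's request handler:
UserService userService = UserServiceFactory.getUserService();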
User user = userService.getCurrentUser();
if( user == null ) {
   // There is no user logged in.
} else {
   // The user is logged in.
   System.out.println( "User is logged in: " + userService.isUserLoggedIn() );
   System.out.println( "User is administrator: " + userService.isUserAdmin() );
   System.out.println( "User:" );
   System.out.println( "   Auth Domain: " + user.getAuthDomain() );
   System.out.println( "   E-mail: " + user.getEmail() );
   System.out.println( "   Federated ID: " + user.getFederatedIdentity() );
   System.out.println( "   Nickname: " + user.getNickname() );
   System.out.println( "   User ID: " + user.getUserId() );
}

Warning: you cannot call "userService.isUserAdmin()" unless a user is already logged in.  If you try to, then it will throw an "IllegalStateException", so check "getCurrentUser()" (or "isUserLoggedIn()") first.

Cron Jobs

Another thing that GAE can do is schedule cron jobs.  Basically, these are page requests that are scheduled like normal Linux "cron" jobs.  So if you need to have some task performed regularly, create an endpoint for it and schedule a job to access that endpoint.

Cron jobs act as if an administrator is making the request, so they can access all paths with "admin" requirements.  However, you cannot check this using "userService.isUserAdmin()" because cron jobs do not run as any particular user; for a cron request, "getCurrentUser()" returns null.

To determine if a request is coming from the cron scheduler, you have to check for the "X-Appengine-Cron" header.  This header cannot be faked (except by admins); if an outside client tries to set it, GAE will quietly remove it before the request reaches your Servlet.

To detect a cron job, you have to check for the header and make sure that its value is "true".

String cronHeader = request.getHeader("X-Appengine-Cron");
if( "true".equals(cronHeader) ) {
   log.info( "Cron service is making this request." );
}

Ultimately, if I'm checking to see whether a user is allowed to access a particular section, I go through these steps:
  1. Start by assuming no access at all.
  2. If no user is logged in:
    1. Is the "X-Appengine-Cron" header set to "true"?  If so, then allow administrative access.
  3. If a user is logged in:
    1. Is the user a GAE admin of the application?  If so, then allow administrative access.
    2. Is the user's e-mail domain in the whitelist of basic user access?  If so, then allow basic access.
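
The steps above can be sketched as a single decision function.  This is only an illustration in plain Java so that it stands on its own: the "AccessPolicy" class, the "Access" enum, and the parameter names are hypothetical.  In a Servlet, the arguments would come from "user.getEmail()" (or null when "getCurrentUser()" returns null), "userService.isUserAdmin()", and "request.getHeader("X-Appengine-Cron")".

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of the access decision described in the steps above. */
final class AccessPolicy {
    enum Access { NONE, BASIC, ADMIN }

    private final Set<String> allowedDomains = new HashSet<String>();

    /** @param domains whitelisted e-mail domains (assumed values, e.g. "example.com") */
    AccessPolicy(String... domains) {
        for (String domain : domains) {
            allowedDomains.add(domain.toLowerCase(Locale.ROOT));
        }
    }

    /**
     * @param email      the signed-in user's e-mail, or null if no user is logged in
     * @param isAdmin    whether the signed-in user is a GAE admin (ignored if email is null)
     * @param cronHeader the value of the "X-Appengine-Cron" header, or null if absent
     */
    Access decide(String email, boolean isAdmin, String cronHeader) {
        // No user is logged in: only the cron scheduler gets in, and as an admin.
        if (email == null) {
            return "true".equals(cronHeader) ? Access.ADMIN : Access.NONE;
        }
        // A user is logged in: admins first, then the domain whitelist.
        if (isAdmin) {
            return Access.ADMIN;
        }
        int at = email.lastIndexOf('@');
        String domain = (at < 0) ? "" : email.substring(at + 1).toLowerCase(Locale.ROOT);
        return allowedDomains.contains(domain) ? Access.BASIC : Access.NONE;
    }
}
```

Anything that does not match one of the rules falls through to "NONE", which is the "assume no access at all" starting point from step 1.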