Flow Control in Node.js

Late last summer I began working on Cloudkick’s new deployment tool, cast, which is written in Node.js. At the time I had only minimal experience with JavaScript, having occasionally used it for frontend work on personal projects. After a crash course from Bjorn, Cloudkick’s in-house JavaScript wizard, I was able to write mostly working code.

The Problem

As others have found, however, writing a Node app that is maintainable is quite different from just writing one that works.

In cast, we frequently perform long sequences of filesystem operations. For example, for each application deployed in cast we maintain a symlink to the ‘current’ version. Because a symlink can be atomically replaced with a call to rename, the ‘current’ link will always point to the single current version of the application.

The sequence of operations we need to perform to update the ‘current’ symlink is:

  1. Determine the path to the new version. We have a method that can do this for us, but it hits the filesystem so calls to it are asynchronous.
  2. Make sure the new version exists. Again, we’re hitting the filesystem so this happens asynchronously as well.
  3. Create a new symlink pointing to the specified version. Another filesystem operation.
  4. Swap the new symlink into place using ‘rename’.

The naive solution comes out looking something like this:

Instance.prototype.activate_version = function(version, callback) {
  var self = this;
  var new_version_link = path.join(self.root, 'new');
  var current_version_link = path.join(self.root, 'current');

  self.get_version_path(version, function(err, vp) {
    if (err) {
      return callback(err);
    }
    path.exists(vp, function(exists) {
      if (!exists) {
        return callback(new Error('Cannot activate nonexistent version \'' + version + '\''));
      }
      fs.symlink(path.resolve(vp), new_version_link, function(err) {
        if (err) {
          return callback(err);
        }
        fs.rename(new_version_link, current_version_link, callback);
      });
    });
  });
};

This isn’t too bad, but it’s already getting unwieldy after only four operations (other methods perform ten or more such operations). There are a few problems here:

  1. We quickly end up with very deep nesting, which pushes our code further and further to the right and limits the space left on each line.
  2. Inserting a new operation early in the chain forces us to add another level of indentation to every subsequent operation. This causes commits to contain massive diffs, which obscure the actual changes and can hide other accidental (or, theoretically, malicious) ones.
  3. It’s just plain difficult to read.

The Solution

This isn’t exactly revolutionary, but the first step is to use one of the many flow control modules available for Node. For cast we’ve been using async and I’ve grown quite attached to it, but there are a lot of other nice ones available as well. Using async, our ‘activate_version’ method now looks like this:

Instance.prototype.activate_version = function(version, callback) {
  var self = this;
  var new_version_path;
  var new_version_link = path.join(self.root, 'new');
  var current_version_link = path.join(self.root, 'current');

  async.series([
    // Get the path to the specified version
    function(callback) {
      self.get_version_path(version, function(err, vp) {
        new_version_path = vp;
        return callback(err);
      });
    },

    // Make sure the version exists
    function(callback) {
      path.exists(new_version_path, function(exists) {
        var err = null;
        if (!exists) {
          err = new Error('Cannot activate nonexistent version \'' + version + '\'');
        }
        return callback(err);
      });
    },

    // Create the new link
    function(callback) {
      fs.symlink(path.resolve(new_version_path), new_version_link, callback);
    },

    // Atomically move it into place
    async.apply(fs.rename, new_version_link, current_version_link)
  ], callback);
};

I don’t want to turn this into an async tutorial, so check out the documentation (or code) if you’re curious about the specifics, but in short we get a number of benefits here:

  1. We describe our list of actions by providing a list of functions: it’s very intuitive.
  2. We avoid both of the problems I mentioned that result from nesting.
  3. We provide a single final callback that is always fired, allowing centralized error handling, logging, etc. Alternatively, if the signatures match, we can pass the method’s callback argument to async.
  4. We can use async’s ‘parallel’, ‘waterfall’ or several other methods to change the behavior without significantly altering the pattern.
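To make the pattern concrete, here is a minimal sketch of what a series-style runner does, with a parallel variant alongside it. `runSeries`, `runParallel` and `step` are hypothetical helpers written for illustration only; they are not async’s implementation, and the real library handles more cases (collecting results, for one).

```javascript
// Hypothetical helper: run tasks one at a time, stopping at the first error.
function runSeries(tasks, done) {
  var index = 0;

  function next(err) {
    // Finish on the first error, or once every task has called back
    if (err || index === tasks.length) {
      return done(err || null);
    }
    var task = tasks[index++];
    task(next);
  }

  next(null);
}

// Hypothetical helper: start every task at once and call 'done' exactly once,
// either on the first error or after the last task has called back.
function runParallel(tasks, done) {
  var remaining = tasks.length;
  var finished = false;

  if (remaining === 0) {
    return done(null);
  }

  tasks.forEach(function(task) {
    task(function(err) {
      if (finished) {
        return;
      }
      remaining--;
      if (err || remaining === 0) {
        finished = true;
        done(err || null);
      }
    });
  });
}

// Build a task that records its name after a short delay
function step(name, log) {
  return function(callback) {
    setTimeout(function() {
      log.push(name);
      callback(null);
    }, 10);
  };
}

var log = [];
runSeries([step('first', log), step('second', log)], function(err) {
  // The single final callback: one place for error handling and logging
  console.log(err, log);
});
```

Swapping `runSeries` for `runParallel` in the last call changes how the tasks are scheduled without touching the task list or the final callback, which is the property that makes this style easy to rearrange.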

Once you’re using some sort of flow control library, there are some things to keep in mind that help keep things manageable:

  1. Don’t chain multiple asynchronous calls within one function passed to async. If two calls are part of the same logical operation then chances are you should make that operation its own asynchronous function.
  2. If you have a group of logically related calls that could be parallelized, split them into their own function and use async.parallel() (or some equivalent).
  3. If you need to be able to undo your operations in reverse order, you can build a stack of reverse operations as you go, then execute it with async in the final callback.
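The third point is easier to see in code. The sketch below is an illustration under assumptions, not code from cast: `runSeries` is a hypothetical stand-in for async.series, `runWithRollback` is a made-up name, and the three operations are placeholders that only record what they did. Each forward step that succeeds pushes its reverse step onto a stack, and the final callback runs that stack in reverse order if anything failed.

```javascript
// Hypothetical stand-in for async.series: run tasks in order, stop on first error.
function runSeries(tasks, done) {
  var index = 0;
  function next(err) {
    if (err || index === tasks.length) {
      return done(err || null);
    }
    tasks[index++](next);
  }
  next(null);
}

// Run the forward steps; on failure, undo the completed steps, newest first.
// Each step is an object of the form { run: function(cb), undo: function(cb) }.
function runWithRollback(steps, callback) {
  var undoStack = [];

  var tasks = steps.map(function(step) {
    return function(cb) {
      step.run(function(err) {
        if (!err) {
          // Only steps that succeeded need to be undone later
          undoStack.push(step.undo);
        }
        cb(err);
      });
    };
  });

  runSeries(tasks, function(err) {
    if (!err) {
      return callback(null);
    }
    // Reverse the stack so the most recent operation is undone first
    runSeries(undoStack.reverse(), function(undoErr) {
      // Report the original failure. This sketch ignores undoErr, and a
      // failing undo step stops the remaining undo steps from running.
      callback(err);
    });
  });
}

// Placeholder operations that only record what happened
var actions = [];
function record(name, fails) {
  return function(cb) {
    actions.push(name);
    cb(fails ? new Error(name + ' failed') : null);
  };
}

runWithRollback([
  { run: record('makeDir'), undo: record('removeDir') },
  { run: record('writeConfig'), undo: record('deleteConfig') },
  { run: record('startService', true), undo: record('stopService') }
], function(err) {
  console.log(err.message, actions);
});
```

Because the third step fails before its undo is pushed, only the first two steps are rolled back, in the opposite order to the one they ran in. A real implementation would also need to decide what to do when an undo step itself fails.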