Form an orderly queue!

About once a month, I seem to stumble across an article that states that one reason JavaScript is easy to use is that, because it is single-threaded, you don’t get the synchronisation problems you see in multi-threaded environments. You can throw away everything you know about locks, mutexes, semaphores, and so on, and you never need to worry about two bits of your program running concurrently and causing consistency errors in shared data structures.

In a very pedantic technical sense, JavaScript code is ‘thread-safe’. There is only one thread, so of course you don’t need to worry about multiple threads co-operating safely. However, the same kinds of problems that threads cause also plague JavaScript programs, and they will bite you unless you know how to avoid them.

But first, let’s state what we mean by ‘thread-safe’. In JavaScript, only one block of synchronous statements is ever running at a time. If you see code like this, you can guarantee that no other part of the program will be executed in between the statements and expressions in this block:

var myData = {hello: 'world'};
assertEqual(2 + 2, 4);
myData.anotherProperty = ['a', 'b', 'c'];
delete myData.hello;
callAnotherFunction('with', {some: ['args']});

Anywhere you see a sequence of synchronous statements, those statements will run one after another, and no other JavaScript will run concurrently with them. However, as soon as you make an asynchronous call, you create a pause in execution. Think of the indent in your callback function as the current block taking a breath, during which any other part of the program can run.

    var url = 'http://localhost:4567/';
    request(url, function(error, response, body) {
      var data = JSON.parse(body);
/*  ^^
    ||
    ++---- Here be dragons! */

The var url and var data lines are separated by a callback boundary; other parts of the program will run while you’re waiting for the request to finish, and you’ve no idea how long this gap will be, especially in environments like mobile browsers where latency can be both high and extremely variable. Indeed, when I’m browsing the web on my phone, I frequently see JavaScript apps behaving weirdly because the increased latency exposes scheduling bugs and race conditions that the author did not anticipate.

Asynchrony makes it harder to guarantee which order bits of your program will run in, and it leads to some of the same issues as threads: the unpredictable ordering of operations can put the program into weird states.

Let’s consider an example. Suppose you have a web server whose only function is to allow you to read and write an in-memory object. A GET request retrieves the current version of the object, and a PUT request replaces it with a new version.

var express = require('express'),
    app     = express();

var document = {};

app.use(express.json());

app.get('/', function(request, response) {
  response.json(document);
});

app.put('/', function(request, response) {
  document = request.body;
  response.send('OK');
});

app.listen(4567);

Let’s write some functions to interact with the server, to load and save the document that’s kept there.

var request = require('request');

var url = 'http://localhost:4567/';

var load = function(callback) {
  request.get(url, {json: true}, function(error, response, body) {
    callback(error, body);
  });
};

var save = function(document, callback) {
  var body    = JSON.stringify(document),
      headers = {'Content-Type': 'application/json'};

  request.put(url, {body: body, headers: headers}, callback);
};

If we make a document, save it, then make a change to it, save it again, and so on, this will fire off PUT requests to the server, one for each save() call. If we wait a while for those requests to complete, and then load() the document from the server, we can see what the result of the save() calls was.

var document = {one: 1};
save(document);
document.two = 2;
save(document);
document.three = 3;
save(document);

setTimeout(function() { load(console.log) }, 1000);

This code looks synchronous, and on my machine this program routinely prints {one: 1, two: 2, three: 3}. This shows that the server received the PUT requests in the same order that we sent them; it has kept the very last update with all the fields we added.

However, in the real world, things are rarely this simple. Everything I/O-related in JavaScript is asynchronous; those save() calls don’t block while waiting for the PUT request to finish, they just initiate the request and let the program keep chugging along. So although we initiate the requests in the correct order, variable network latency, or the whims of a load balancer on the server side, might mean they arrive at the application server out of order. We can simulate this by adding a random delay to the save() function, so each PUT request is deferred by up to 300ms.

var save = function(document, callback) {
  var body    = JSON.stringify(document),
      headers = {'Content-Type': 'application/json'},
      latency = 300 * Math.random();

  setTimeout(function() {
    request.put(url, {body: body, headers: headers}, callback);
  }, latency);
};

When we run the program with this change, it sometimes prints {one: 1}, sometimes {one: 1, two: 2} and sometimes {one: 1, two: 2, three: 3}. The randomised latency means it’s not predictable which request will arrive at the server last and end up being the value the server keeps at the end of the process.

So, the problem is that we have code that looks synchronous, but is actually firing off async messages to a remote server in the background. We want to save the document each time we make a change to it, but make sure those save requests are processed by the server in the right order, avoiding the race conditions that latency has introduced.

We could do this by using the save() function’s callback parameter to make sure we don’t start a new save() call until the previous one is finished (remembering to handle errors along the way):

save(document, function(error) {
  if (error) return handleError(error);
  document.two = 2;
  save(document, function(error) {
    if (error) return handleError(error);
    document.three = 3;
    save(document, function(error) {
      if (error) return handleError(error);
      load(console.log);
    });
  });
});

Alternatively, we can use the Async library to clean up our callback pyramid, but basically do the same thing as the above example:

var async = require('async');

async.series([
  function(cb) {
    save(document, cb);
  }, function(cb) {
    document.two = 2;
    save(document, cb);
  }, function(cb) {
    document.three = 3;
    save(document, cb);
  }
], function(error) {
  if (error) return handleError(error);
  load(console.log);
});

However, this approach only works if the changes and save() calls are all in the same place in the codebase; if multiple independent modules are changing the document and calling save(), you can’t use callbacks to co-ordinate things because the modules don’t talk directly to one another. This situation happens frequently in modular user interface code, where two UI components both hold a reference to a data model, which is saved to the server whenever a change is made to it.
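
As a sketch of that scenario, imagine two widgets that each hold a reference to the same model and save it whenever the user changes something. (The widget names and fields here are invented purely for illustration; plain EventEmitters stand in for real UI components.)

var EventEmitter = require('events').EventEmitter;

// Stand-ins for two independent UI components
var titleEditor = new EventEmitter(),
    tagPicker   = new EventEmitter();

// Both components share a reference to the same model object
var model = {title: 'Untitled', tags: []};

// The title editor saves the model when its field changes...
titleEditor.on('change', function(newTitle) {
  model.title = newTitle;
  save(model);
});

// ...and the tag picker does the same, possibly at almost the same moment.
// Neither component knows about the other, so neither can use callbacks to
// wait for the other's PUT request to finish before sending its own.
tagPicker.on('change', function(tags) {
  model.tags = tags;
  save(model);
});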

In this situation, the model itself must enforce that all save() calls arrive at the server in the same order as they were issued by the client, just as would happen if save() were a blocking operation. A blocking save() function enforces that only one save can happen at a time by not returning until the request has finished. An async save() function needs to provide the same guarantee: it must allow multiple parts of the codebase to call it, but it must wait for the previous request to finish before the next one is initiated. Only one PUT can be in flight at a time.

Since this is JavaScript, we can’t make save() blocking by putting a while (requestIsNotDone) loop in it; that would block the single-threaded event loop and stop the request from ever completing. It’s also no good to use a flag that makes further calls to save() fail while one is already in flight: that makes for messy code and drops half your save commands on the floor.
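
To see why, here’s a rough sketch of what that flag-based approach looks like (savingInFlight and saveWithFlag are made-up names); any update that arrives while a request is outstanding is simply discarded:

var savingInFlight = false;

var saveWithFlag = function(document, callback) {
  // If a PUT is already outstanding, this update is silently dropped
  if (savingInFlight) return;
  savingInFlight = true;

  save(document, function(error) {
    savingInFlight = false;
    if (callback) callback(error);
  });
};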

Fortunately, the problem of ‘make sure a series of async commands are executed in order’ can be elegantly solved with a queue. Instead of executing each command immediately, we push the command into a queue that executes things one at a time. Async has an API for doing exactly this; async.queue() takes a worker function and a concurrency level, where the worker function takes a value and a callback, does some work and calls the callback when it’s done.

var queue = async.queue(function(document, callback) {
  save(document, callback);
}, 1);

The concurrency level sets how many queued tasks can be processed concurrently (async operations can run concurrently even though we only have one thread); setting it to 1 means our queue will process documents one at a time. The above expression reduces to:

var queue = async.queue(save, 1);

If we push the updated documents into the queue, rather than calling save() directly, this makes sure the PUT requests are sent one at a time, and the program finishes by printing {one: 1, two: 2, three: 3} every time.

var document = {one: 1};
queue.push(document);
document.two = 2;
queue.push(document);
document.three = 3;
queue.push(document);

setTimeout(function() { load(console.log) }, 1000);
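
To connect this back to the modular UI situation from earlier, one option (again just a sketch, with invented names) is to hide the queue inside the model object itself, so independent components can keep calling save() on the model without co-ordinating with each other:

var async = require('async');

var Model = function(data) {
  this.data  = data;
  // One worker means only one PUT is ever in flight for this model
  this.queue = async.queue(save, 1);
};

Model.prototype.set = function(key, value) {
  this.data[key] = value;
};

Model.prototype.save = function(callback) {
  this.queue.push(this.data, callback);
};

// Any number of modules can now share the model and call save() freely;
// their updates reach the server one at a time, in the order they were made
var model = new Model({one: 1});
model.set('two', 2);
model.save();
model.set('three', 3);
model.save();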

Although this example is contrived, I have used this technique in real programs to make sure changes to files happen in a predictable order and don’t clobber one another. In Vault it’s used to prevent exactly the race we saw above when talking to the filesystem or a remote server, and in reStore I use it to make sure only one operation is performed on each user’s tree at a time, so you can’t, say, try to write a document while its parent directory is being deleted by a previous request. reStore’s Redis backend uses a SETNX-based lock to achieve the same thing, and Faye does the same to make sure only one garbage collection sweep is rounding up expired session data at a time.
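
The general shape of that SETNX pattern looks something like this; to be clear, this is not the actual code from reStore or Faye, just a simplified sketch using the node redis client, and a production version would also put an expiry on the lock so a crashed process can’t hold it forever:

var redis  = require('redis'),
    client = redis.createClient();

// Try to take the lock: SETNX only sets the key if it doesn't already exist
var withLock = function(name, task) {
  client.setnx('lock:' + name, Date.now(), function(error, acquired) {
    if (error || !acquired) return; // someone else holds the lock right now

    task(function() {
      client.del('lock:' + name);   // release the lock when the work is done
    });
  });
};

// Only one caller at a time gets to run the critical section
withLock('gc', function(release) {
  // ... sweep up expired session data here ...
  release();
});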

So, while JavaScript might be thread-safe, it’s certainly not free from race conditions, and since most JS programs talk to remote systems asynchronously over volatile network connections, it’s actually really easy to introduce race conditions without realising it if you only test on your office wifi network. While it’s certainly nice to be able to ignore concurrency within a block of synchronous code, it helps to have some queue and lock techniques in store for when you go async.