Promises are the monad of asynchronous programming

In my introduction to monads in JavaScript we saw a couple of monads and examined the commonality between them to expose an underlying design pattern. Before I get on to how that applies to asynchronous programming we need to take one final diversion and discuss polymorphism.

Consider the list monad that we implemented before:

var compose = function(f, g) {
  return function(x) { return f(g(x)) };
};

// unit :: a -> [a]
var unit = function(x) { return [x] };

// bind :: (a -> [a]) -> ([a] -> [a])
var bind = function(f) {
  return function(list) {
    var output = [];
    for (var i = 0, n = list.length; i < n; i++) {
      output = output.concat(f(list[i]));
    }
    return output;
  };
};

In our previous example, we just had functions that accepted an HTMLElement and returned a list of HTMLElements. Notice the type signature of bind above: (a -> [a]) -> ([a] -> [a]). That ‘a’ just means we can put any type in its place, but the signature a -> [a] implies that the functions must return a list of things of the same type as the input. This is actually not the case, and the correct signature is:

bind :: (a -> [b]) -> ([a] -> [b])

For example, bind may take a function that maps a single String to a list of HTMLElements, and return a function that maps a list of Strings to a list of HTMLElements. Consider these two functions: the first takes a string and returns a list of nodes with that tag name, and the second takes a node and returns a list of all the class names it has.

// byTagName :: String -> [HTMLElement]
var byTagName = function(name) {
  var nodes = document.getElementsByTagName(name);
  return Array.prototype.slice.call(nodes);
};

// classNames :: HTMLElement -> [String]
var classNames = function(node) {
  return node.className.split(/\s+/);
};

If we ignore the ‘returns a list of’ aspect, we’d expect to be able to compose these to get all the class names of all the links in a document:

var classNamesByTag = compose(classNames, byTagName);

Of course, we do have lists to contend with, but because the monad just cares about the lists, not what’s in the lists, we can use it to compose the functions:

// classNamesByTag :: [String] -> [String]
var classNamesByTag = compose(bind(classNames), bind(byTagName));

classNamesByTag(unit('a'))
// -> ['profile-links', 'signout-button', ...]

The monad just cares about the ‘list of’ part, not the contents of the list. Just as a reminder, let’s recast this using the pipeline syntax we developed in the previous article:

// bind :: [a] -> (a -> [b]) -> [b]
var bind = function(list, f) {
  var result = [];
  for (var i = 0, n = list.length; i < n; i++) {
    result = result.concat(f(list[i]));
  }
  return result;
};

// pipe :: [a] -> [a -> [b]] -> [b]
var pipe = function(x, functions) {
  for (var i = 0, n = functions.length; i < n; i++) {
    x = bind(x, functions[i]);
  }
  return x;
};

// for example
pipe(unit('a'), [byTagName, classNames])
// -> ['profile-links', 'signout-button', ...]

The pipe function doesn’t actually care that bind deals with lists, and we can write its signature in a more generic way:

pipe :: m a -> [a -> m b] -> m b

In this notation, m a means ‘a monadic wrapper around a’. For example, if the input is a list of strings, under the List monad the m refers to ‘list of’, and a refers to ‘string’. This concept of generic containers becomes extremely important when we consider how we can apply these ideas to asynchronous programming.

One common complaint levelled against Node.js is that it is easy to become mired in ‘callback hell’, where callbacks become nested so deeply it’s impossible to maintain the resulting code. There are various ways to solve this, but I think monads provide an interesting approach that highlights the need to separate business logic from code that simply glues data together.

Let’s consider a fairly contrived example: we have a file called urls.json, that contains a JSON document that contains a URL. We want to read the file, extract this URL, request the URL from the web, and print the response body. Monads encourage us to think in terms of types of data, and how they flow through a pipeline. We can model this problem using our pipeline syntax:

pipe(unit(__dirname + '/urls.json'),
        [ readFile,
          getUrl,
          httpGet,
          responseBody,
          print         ]);

Beginning with the pathname (a String), we can trace the data as it flows through this pipe:

  • readFile takes a String (pathname) and returns a String (file contents)
  • getUrl takes a String (a JSON document) and returns a URI object
  • httpGet takes a URI and returns a Response
  • responseBody takes a Response and returns a String
  • print takes a String and returns nothing

So each link in the chain produces a data type that is consumed by the next. But in Node, many of these operations are asynchronous. Rather than use continuation-passing style and deeply nested callbacks, how can we modify the return types of these functions to indicate the result might not be known yet? By wrapping them in Promises:

readFile      :: String   -> Promise String
getUrl        :: String   -> Promise URI
httpGet       :: URI      -> Promise Response
responseBody  :: Response -> Promise String
print         :: String   -> Promise null

The Promise monad needs three things: a wrapper object, a unit function to wrap a value in this object, and a bind function to help us compose the above functions. We can implement promises using the Deferrable module from JS.Class:

var Promise = new JS.Class({
  include: JS.Deferrable,
  
  initialize: function(value) {
    // if value is already known, succeed immediately
    if (value !== undefined) this.succeed(value);
  }
});

With this class, we can then implement the functions we need to solve our problem. The functions that can return a value immediately just return a value wrapped in a Promise object:

// readFile :: String -> Promise String
var readFile = function(path) {
  var promise = new Promise();
  fs.readFile(path, function(err, content) {
    promise.succeed(content);
  });
  return promise;
};

// getUrl :: String -> Promise URI
var getUrl = function(json) {
  var uri = url.parse(JSON.parse(json).url);
  return new Promise(uri);
};

// httpGet :: URI -> Promise Response
var httpGet = function(uri) {
  var client  = http.createClient(80, uri.hostname),
      request = client.request('GET', uri.pathname, {'Host': uri.hostname}),
      promise = new Promise();
  
  request.addListener('response', function(response) {
    promise.succeed(response);
  });
  request.end();
  return promise;
};

// responseBody :: Response -> Promise String
var responseBody = function(response) {
  var promise = new Promise(),
      body    = '';
  
  response.addListener('data', function(c) { body += c });
  response.addListener('end', function() {
    promise.succeed(body);
  });
  return promise;
};

// print :: String -> Promise null
var print = function(string) {
  return new Promise(sys.puts(string));
};

So we’ve now got a way to represent a deferred value simply using data types, no nested callbacks required. The next step is to implement unit and bind in such a way that the callback glue is contained within these functions. unit is very simple, it just needs to wrap a value in a Promise. bind is more complex: it accepts a Promise and a function. It must extract the value of the Promise, give this value to the function, which returns another Promise, then wait for this final Promise to complete.

// unit :: a -> Promise a
var unit = function(x) {
  return new Promise(x);
};

// bind :: Promise a -> (a -> Promise b) -> Promise b
var bind = function(input, f) {
  var output = new Promise();
  input.callback(function(x) {
    f(x).callback(function(y) {
      output.succeed(y);
    });
  });
  return output;
};

With these definitions, you should find that the pipeline shown above just works, for example if I’ve placed my GitHub API URL in the urls.json file:

pipe(unit(__dirname + '/urls.json'),
        [ readFile,
          getUrl,
          httpGet,
          responseBody,
          print         ]);

// prints:
// {"user":{"name":"James Coglan","location":"London, UK", ...

I hope this has shown why Promises (also known as Deferreds in jQuery and Dojo) are valuable, how they can help you pipe data through your code, and how this piping glue can be contained in quite a neat way. On the client side, you’ll find this pattern applies equally well to Ajax and animation; in fact MethodChain (which I’ve been using as async glue for years) is just the Promise monad applied to method calls instead of function calls.

The full working code for this article is on GitHub if you’d like to tinker with it. After that, you can dive into jQuery’s Deferred API. There’s also talk of promises making a return to Node, after they were banished in favour of continuation-passing style, so you may find you can do this without writing so much plumbing in future.