Announcing Canopy, a Treetop-like PEG compiler for JavaScript

This is a very brief announcement to say that I’ve just released a new PEG parser-compiler for JavaScript, called Canopy. It generates fast, self-contained parser modules from grammar definition files and runs in all the major browsers and on CommonJS platforms including Node.js, Narwhal and RingoJS.

Why do we need another PEG compiler? We don’t. PEG.js is excellent, actively used and maintained and has a really nice website. Canopy exists mostly for reasons of personal taste: I wanted something more akin to Treetop, which lets you keep the grammar definition separate from any methods you add to the parse tree. In fact, Canopy goes further than this: you have to keep them separate, you cannot have inline JavaScript in the grammar files. JavaScript goes in JavaScript files with all your other JavaScript.

It’s taken rather a while to release. I initially wrote it as part of a huge yak-shaving exercise: I was trying to get Terminus to work in IE, which doesn’t support the document.evaluate() API. I thought it would be a fun idea to reimplement it, so I created a project called Pathology, an ad-hoc, informally specified, bug-ridden slow implementation of half of XPath for IE. Of course this meant parsing XPath queries, which I wasn’t going to do by hand, so I thought it would be even more fun to learn how parser compilers work and build one. Pathology never really panned out, although it turns out Android browsers don’t have XPath either so I might revive it, you never know.

I also used Canopy to build Fargo, my fiber-aware version of Scheme. It’s great for getting a new language off the ground quickly.

So, after two years of off-and-on development, and after some interest from a few other people, a couple of months ago I finally got around to documenting it, getting rid of some annoying dependencies (the parsers it generates are now completely self-contained and work on lots of JS platforms), and testing it properly. It’s available by running npm install -g canopy, or from the website, along with the documentation. Let me know what you think.

Designing Vault’s generator algorithm

A few weeks ago now, I released Vault, a stateless password generator. It took about six months to develop, which sounds ridiculous for something that’s only a couple hundred lines of JavaScript, but during that time I learned a fair bit about cryptography that changed the generator’s design, and I’d like to discuss that here.

Vault’s password generation algorithm went through about four distinct iterations between the 0.1 release and the for-public-consumption 0.2 version. The strategy throughout could be summarized as:

  • Take two values from the user and make a hash (A) out of them
  • Construct a character set (B) acceptable to the target site
  • Encode the hash A using the character set B

However the details of how you do each of these things affect the security of the system in important ways, as we’ll see.

The inputs to the process are:

  • The phrase: a master passphrase the person uses to generate all her passwords
  • The service: the name of the site the person is logging into
  • Optionally, a set of character constraints on the generated password, e.g. ‘must not contain spaces’, or ‘must have at least two uppercase letters’

The initial design worked like this: you start by calculating the SHA-256 hash of a combination of the phrase, service and a UUID that’s fixed in Vault’s codebase. (Vault uses either crypto-js or Node’s crypto module, depending on where it’s running.)

var hash = CryptoJS.SHA256(phrase + Vault.UUID + service);

Let’s take a concrete example:

var phrase  = 'you look nice today',
    service = 'gmail';

var hash = CryptoJS.SHA256(phrase + Vault.UUID + service);
// -> 'af7382c35e42ae2a5599a497d69ee5484559925265e609d69dc5026a4a214d90'

Then you turn this hash into binary form, like so:

Vault.toBits = function(digit) {
  var string = parseInt(digit, 16).toString(2);
  while (string.length < 4) string = '0' + string;
  return string;
};

var binary = hash.split('').map(Vault.toBits).join('');
// -> '1010111101110011100000101100001101011110010000101010111000101010 ...'

So now you have a big blob of random-looking data based on the phrase and service, and you need to encode this in a format acceptable to the site you’re logging into. This means picking an appropriate character set. Vault has a collection of different types of characters (these are from the stable 0.2 release):

Vault.LOWER     = 'abcdefghijklmnopqrstuvwxyz'.split('');
Vault.UPPER     = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.split('');
Vault.ALPHA     = Vault.LOWER.concat(Vault.UPPER);
Vault.NUMBER    = '0123456789'.split('');
Vault.ALPHANUM  = Vault.ALPHA.concat(Vault.NUMBER);
Vault.SPACE     = [' '];
Vault.DASH      = ['-', '_'];
Vault.SYMBOL    = '!"#$%&\'()*+,./:;<=>?@[\\]^{|}~'.split('').concat(Vault.DASH);
Vault.ALL       = Vault.ALPHANUM.concat(Vault.SPACE).concat(Vault.SYMBOL);

Vault.ALL ends up having 94 unique characters in it: all the printable ASCII characters, except for the backtick which I couldn’t find on my Android keyboard, so I thought it might cause problems. In the simplest case, you just encode the bits of the hash using Vault.ALL, which means selecting characters from a set of 94.

To select a number, we first figure out how many bits we need.

Math.ceil(Math.log(94)/Math.log(2))
// -> 7

So we shift 7 bits off the front of our hash, and see what number it gives us. In this case, the first 7 bits are 1010111, which is 87. Vault.ALL[87] is ^, so ^ is the first character of our password. The next 7 bits are 1011100 which is 92, so our password is now ^-. We continue this until we have a password of the desired length.

Obviously, some strings of 7 bits give values outside the range 0–93. I’ll come to this later.
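As a rough sketch of this simplest case, the snippet below rebuilds the 94-character list from the definitions above and reads 7-bit groups off the front of the binary string. The encode() helper is a name made up for illustration, not part of Vault's API, and it simply gives up on out-of-range values since that problem is dealt with further on.

```javascript
// Character sets as defined in the post; ALL has 94 entries.
var LOWER  = 'abcdefghijklmnopqrstuvwxyz'.split(''),
    UPPER  = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.split(''),
    NUMBER = '0123456789'.split(''),
    SYMBOL = '!"#$%&\'()*+,./:;<=>?@[\\]^{|}~'.split('').concat(['-', '_']),
    ALL    = LOWER.concat(UPPER, NUMBER, [' '], SYMBOL);

// Hypothetical helper: turn the first `length` 7-bit groups of `binary`
// into characters from ALL.
function encode(binary, length) {
  var password = '';
  for (var i = 0; i < length; i++) {
    var group = binary.slice(i * 7, i * 7 + 7);
    if (group.length < 7) throw new Error('Ran out of bits');

    var index = parseInt(group, 2);
    if (index >= ALL.length) throw new Error('Out of range: ' + index);

    password += ALL[index];
  }
  return password;
}

// The first 14 bits of the example hash: 1010111 (87) and 1011100 (92)
console.log(encode('10101111011100', 2)); // -> '^-'
```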

But this system does not allow for any constraints. Say we want the password to contain no spaces and at least 2 uppercase letters, which we represent like this:

var constraints = {space: 0, upper: 2};

We need to force the generator to obey these constraints in a secure, deterministic way. This means the ordering needs to appear random when sampled over many phrase/service pairs (i.e. the uppercase letters must not always appear in the same place), but must always be the same for a given pair. We’ve already got a deterministic random number: our binary hash. We can use it both to select characters, and select which set to draw each character from.

Let’s say we want an 8-character password with the above constraints. We start by making an empty list and a copy of Vault.ALL:

 var required = [],
     allowed  = Vault.ALL.slice();

required will be a list of character sets as long as the required password. In order to make sure our password gets the required sets, we begin by pushing each required set onto the list the required number of times. We want 2 uppercase characters, so we add Vault.UPPER to the list twice. Then we remove any disallowed characters from the allowed list, and push copies of the allowed list onto required until it’s long enough – in this case, 8 characters long. In code this looks like:

for (var type in constraints) {
  var n   = constraints[type],
      set = Vault[type.toUpperCase()];
  
  if (n === 0) {
    removeCharacters(allowed, set);
  } else {
    while (n--) required.push(set);
  }
}
while (required.length < length)
  required.push(allowed);

In our example, this means required ends up looking like this, where allowed is Vault.ALL with Vault.SPACE removed:

required = [Vault.UPPER, Vault.UPPER, allowed, allowed,
            allowed, allowed, allowed, allowed];

Wait! Did you spot the deliberate mistake? That’s right, for...in loops on objects do not guarantee order, so the constraints {number: 3, upper: 2} and {upper: 2, number: 3} might result in different orderings of required and hence different passwords. This is clearly wrong, so let’s fix it:

var TYPES = 'LOWER UPPER NUMBER SPACE DASH SYMBOL'.split(' ');

for (var i = 0, m = TYPES.length; i < m; i++) {
  var n   = constraints[TYPES[i].toLowerCase()],
      set = Vault[TYPES[i]];
  
  // etc.
}
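Putting the fixed ordering together with the earlier loop might look something like the sketch below. removeCharacters() is never shown in the post, so the version here is an assumption about what it does (remove every member of a set from the list, in place), and buildRequired() is a hypothetical wrapper name. The type names follow the Vault.NUMBER set defined earlier.

```javascript
var Vault = {};
Vault.LOWER  = 'abcdefghijklmnopqrstuvwxyz'.split('');
Vault.UPPER  = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.split('');
Vault.NUMBER = '0123456789'.split('');
Vault.SPACE  = [' '];
Vault.DASH   = ['-', '_'];
Vault.SYMBOL = '!"#$%&\'()*+,./:;<=>?@[\\]^{|}~'.split('').concat(Vault.DASH);
Vault.ALL    = Vault.LOWER.concat(Vault.UPPER, Vault.NUMBER, Vault.SPACE, Vault.SYMBOL);

var TYPES = 'LOWER UPPER NUMBER SPACE DASH SYMBOL'.split(' ');

// Assumed behaviour: delete every member of `set` from `list`, in place.
function removeCharacters(list, set) {
  for (var i = list.length - 1; i >= 0; i--) {
    if (set.indexOf(list[i]) >= 0) list.splice(i, 1);
  }
}

// Hypothetical wrapper: build the list of character sets, one per position.
function buildRequired(constraints, length) {
  var required = [],
      allowed  = Vault.ALL.slice();

  // Walk the types in a fixed order so key order in `constraints` is irrelevant
  for (var i = 0, m = TYPES.length; i < m; i++) {
    var n   = constraints[TYPES[i].toLowerCase()],
        set = Vault[TYPES[i]];

    if (n === undefined) continue;   // no constraint on this type

    if (n === 0) {
      removeCharacters(allowed, set);
    } else {
      while (n--) required.push(set);
    }
  }
  while (required.length < length)
    required.push(allowed);

  return required;
}

var required = buildRequired({space: 0, upper: 2}, 8);
console.log(required.length); // -> 8
```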

Now we have everything we need. Again, we take our hash and take some bits off the front to select items from lists. First thing we need to know is which set from required to use. required is 8 items long, so we need a 3-bit number. The first 3 bits are 101 which gives 5, and required[5] is the allowed set. We pop this out of the list using required.splice(5,1), and use the next few bits to select a value from allowed as before. We go through our list of bits, using them to pick a character set and then a character from that set, until we’ve used all the items in the required list. This gives an apparently random ordering of character sets, and random selection from each set. Easy.
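The selection loop described here can be sketched as follows. BitReader, choose() and generate() are invented names for illustration, not Vault's real internals. One loud assumption: when a group of bits gives an index that is too large, this sketch discards it and reads the next group, which is not what the early versions of Vault did.

```javascript
var LOWER  = 'abcdefghijklmnopqrstuvwxyz'.split(''),
    UPPER  = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.split(''),
    NUMBER = '0123456789'.split(''),
    SYMBOL = '!"#$%&\'()*+,./:;<=>?@[\\]^{|}~'.split('').concat(['-', '_']),
    ALL    = LOWER.concat(UPPER, NUMBER, [' '], SYMBOL);

// Reads fixed-width groups of bits off the front of a binary string.
function BitReader(binary) {
  this.binary = binary;
  this.pos    = 0;
}

BitReader.prototype.take = function(width) {
  if (this.pos + width > this.binary.length) throw new Error('Ran out of bits');
  var bits = this.binary.slice(this.pos, this.pos + width);
  this.pos += width;
  return width === 0 ? 0 : parseInt(bits, 2);
};

// Pick an index in the range 0..size-1.
// ASSUMPTION: out-of-range values are thrown away and the next group is tried.
BitReader.prototype.choose = function(size) {
  var width = 0, index;
  while ((1 << width) < size) width++;   // smallest width covering `size` values
  do { index = this.take(width); } while (index >= size);
  return index;
};

// Pick a set from `required`, then a character from that set, until none remain.
function generate(binary, required) {
  var reader = new BitReader(binary),
      sets   = required.slice(),
      result = '';

  while (sets.length > 0) {
    var set = sets.splice(reader.choose(sets.length), 1)[0];
    result += set[reader.choose(set.length)];
  }
  return result;
}

var allowed  = ALL.filter(function(c) { return c !== ' ' }),
    required = [UPPER, UPPER, allowed, allowed, allowed, allowed, allowed, allowed];

// The example hash from earlier, converted to a string of bits
var hash   = 'af7382c35e42ae2a5599a497d69ee5484559925265e609d69dc5026a4a214d90',
    binary = hash.split('').map(function(digit) {
      return ('0000' + parseInt(digit, 16).toString(2)).slice(-4);
    }).join('');

console.log(generate(binary, required));
```

Because both the set and the character are picked from the same bit stream, the same inputs always give the same password, while the positions of the uppercase letters vary from one hash to the next.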

Well, not quite. There are several implementation details that ended up making this design insecure that needed fixing to get to 0.2. See if you can spot them before continuing.

Okay, the most glaring one first:

var hash = CryptoJS.SHA256(phrase + Vault.UUID + service);

As I’ve previously explained, this is vulnerable to an extension attack on the service name. If someone discovered your gmail password, say, they could figure out your password for a service whose name begins with the letters gmail. A small risk, and possibly confounded by character constraints, but worth fixing. I switched to using HMAC-SHA256 to make the hash, and suffixed the service name with the UUID.

var hash = CryptoJS.HmacSHA256(service + Vault.UUID, phrase);

This effectively means the user is signing the service name using their passphrase, a more sensible model for what Vault does, and it gets rid of the extension attack risk.
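For comparison, here is a minimal sketch of the same signing step using Node's built-in crypto module instead of crypto-js. The UUID value is a placeholder rather than the one in Vault's codebase, and signService() is a hypothetical helper name.

```javascript
var crypto = require('crypto');

// Placeholder only: NOT the UUID that ships with Vault.
var UUID = '00000000-0000-0000-0000-000000000000';

// The phrase is the HMAC key; the service name (plus UUID) is the message.
function signService(phrase, service) {
  return crypto.createHmac('sha256', phrase)
               .update(service + UUID)
               .digest('hex');
}

var hash = signService('you look nice today', 'gmail');
console.log(hash.length); // -> 64 hex digits, i.e. 256 bits
```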

The second problem is not a threat to the security of the passwords Vault is capable of generating, but a practical problem: because SHA-256 produces a finite output, we can only generate passwords up to a certain length, depending on how many bits you spend on selecting character sets. Users didn’t want this artificial-seeming limit, so we needed a way to get as many bits as we like. Fortunately there’s a standard crypto function for that: the key-expanding function PBKDF2. This is based on a hashing function (the default is HMAC-SHA1, so we’re still using a MAC rather than a hash), but can safely generate arbitrarily large output. We just calculate how many bits we need for a password of a certain length, and generate them:

var hash = CryptoJS.PBKDF2(phrase,
                           service + Vault.UUID,
                           {keySize: k, iterations: i});

PBKDF2 takes an iterations argument that tells it how many rounds of hashing to do; this is basically used to make the function more expensive. This is important on the server-side where passwords must be stored using expensive hashes, but in Vault it’s only used because it can generate arbitrary amounts of output. (I arbitrarily set it to use 8 iterations, though unless this number’s very large it doesn’t make much practical difference what you use. However expensive you make it for the user to generate the password, they’re still sending it over a wire and asking someone to store it, and the burden of protection needs to lie with the server at that point. Making Vault as slow as server-side hashing ought to be would just be annoying.)
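A sketch of the 'as many bits as we like' idea using Node's crypto.pbkdf2Sync follows. Note that Node takes the output length in bytes. The UUID is again a placeholder, generateBits() is a made-up name, and the choice of SHA-1 with 8 iterations just mirrors the numbers mentioned above.

```javascript
var crypto = require('crypto');

// Placeholder only: NOT the UUID that ships with Vault.
var UUID = '00000000-0000-0000-0000-000000000000';

// Hypothetical helper: derive `bitCount` bits as a string of 0s and 1s.
function generateBits(phrase, service, bitCount) {
  var byteCount = Math.ceil(bitCount / 8),
      key       = crypto.pbkdf2Sync(phrase, service + UUID, 8, byteCount, 'sha1'),
      binary    = '';

  for (var i = 0; i < key.length; i++) {
    // Left-pad each byte to 8 binary digits
    binary += ('00000000' + key[i].toString(2)).slice(-8);
  }
  return binary.slice(0, bitCount);
}

// More bits than a single SHA-256 hash could supply
var bits = generateBits('you look nice today', 'gmail', 300);
console.log(bits.length); // -> 300
```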

Great, so now we’re combining phrase and service to make a secure hash as large as we like, job done, right? Nope. Remember I said I’d come back to how the bits are turned into numbers to select items from lists? Well it turns out that was badly broken for a long time.

Say you need to select an item from a 94-ary list, which Vault does very frequently. 94 sits somewhere between 64 and 128, so to make sure you cover all possible values you need to use 7 bits to select a number. But this could give you a number you can’t use: say you get 1110001 (113), what do you do with it? Well, Vault’s original plan was: if you take 7 bits and get a number you can’t use, only keep the first 6 bits, and leave the 7th in the queue. This gets you numbers you can always use, but has the horrible side effect of producing bias: it makes some numbers more likely than others.

We can show this with an experiment. Here’s a small script that models Vault’s old encoding process, for selecting digits between 0 and 9:

var stream  = [],
    size    = 65536,
    results = {};

while (size--) stream.push(Math.floor(Math.random() * 2));

while (stream.length > 4) {
  var bits = stream.splice(0,4),
      num  = parseInt(bits.join(''), 2);
  
  if (num > 9) {
    stream.unshift(bits.pop());
    num = parseInt(bits.join(''), 2);
  }
  results[num] = (results[num] || 0) + 1;
}

console.log(results);

When we run this, we see that some numbers are substantially more likely than others:

$ node test.js 
{ '0': 1137,
  '1': 1120,
  '2': 1132,
  '3': 1121,
  '4': 1117,
  '5': 3473,
  '6': 3339,
  '7': 3369,
  '8': 1116,
  '9': 1144 }

This same systematic bias was present in Vault too. It meant that some orderings of character sets were more likely than others, and within each set some characters were more likely than others, by a factor of 3. If given a collection of passwords, this bias was enough to easily identify them as being generated by Vault rather than being truly random noise. For the generated passwords to be safe, they had to be evenly distributed.

The easiest way to implement this is to simply throw a number away and try the next group of bits if the number is too large. This is safe but inefficient. After asking around on Twitter, Christian Perfect showed me a better way that attempts to ‘recycle’ digits that are too large by pushing them into higher-order streams. This works great, and is exactly what Vault implements. It means Vault passwords now have the same character distribution as genuine pseudo-random noise.
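To make the simple throw-it-away approach concrete (this is the inefficient-but-safe version, not the recycling scheme Vault ended up with), here is a small sketch. sampleDigits() is an invented name, and the bit stream is a string of 0s and 1s.

```javascript
// Read fixed-width groups from `stream` and keep only the values below `base`.
function sampleDigits(stream, base) {
  var width = 0, results = [];
  while ((1 << width) < base) width++;   // e.g. 4 bits for base 10

  for (var pos = 0; pos + width <= stream.length; pos += width) {
    var num = parseInt(stream.slice(pos, pos + width), 2);
    if (num < base) results.push(num);   // otherwise discard the whole group
  }
  return results;
}

// Feed in every possible 4-bit group exactly once
var stream = '';
for (var n = 0; n < 16; n++) stream += ('0000' + n.toString(2)).slice(-4);

console.log(sampleDigits(stream, 10)); // -> [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
```

Each digit comes out exactly once when every 4-bit group is equally represented, so no digit is favoured. The cost is that 6 of every 16 groups are wasted.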

Those are the major changes Vault went through before I decided it was safe enough to release. An important factor all these changes have in common is that they’re all irreversible: every one of them changes the output of the generator such that a set of inputs gives one password today and another password tomorrow. This is not acceptable so all these bugs needed stamping out before going live. There are other potential problems that can be fixed without breaking things though. For example, if it turns out there are cases where we’re not generating enough bits, we can just bump up the PBKDF2 key size and it won’t change existing outputs. Asking PBKDF2 to double its output size replaces ABC with ABCDEF: the bits you were already generating don’t change. Similarly, we can introduce new character sets without breaking old passwords as long as we add them without changing the order the existing sets appear in Vault.TYPES, which affects set ordering before encoding.
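The claim that asking for more PBKDF2 output leaves the existing bits alone can be checked with a few lines of Node. The salt below is a stand-in rather than Vault's real service-plus-UUID value.

```javascript
var crypto = require('crypto');

var phrase = 'you look nice today',
    salt   = 'gmail-example-salt';   // stand-in salt, for illustration only

// Same inputs and iteration count, different output sizes (in bytes)
var shortKey = crypto.pbkdf2Sync(phrase, salt, 8, 16, 'sha1').toString('hex'),
    longKey  = crypto.pbkdf2Sync(phrase, salt, 8, 32, 'sha1').toString('hex');

// The longer output begins with the shorter one
console.log(longKey.slice(0, shortKey.length) === shortKey); // -> true
```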

Backwards compatibility is a deal-breaker on this project: if you mess it up, everyone loses their passwords. The milestone that let me ship was realizing that all the remaining potential problems were reversibly fixable. Fortunately they’ve not come up yet.

Why you should never use hash functions for message authentication

The general thrust of this post is: use a MAC function like HMAC to sign data, don’t use hash functions. Although not all hash functions suffer from the problem I’m going to illustrate, in general using a hash function for message authentication comes with a lot of potential problems because those functions aren’t designed for this task. You shouldn’t try to work around it by creatively processing the inputs or inventing some fancy way of chaining hash functions. Just use the functions that were designed for this task instead of inventing your own crypto schemes.

This morning I was reading an article on the Intridea blog about signed idempotent action links. These are links you can embed in emails to your users that let them perform authenticated actions on your site, without going through a login screen or any other interactions. This technique relies on embedding action data in the link, for example an email from Twitter could include your user ID and the name of another user, and following the link would make you follow that user on Twitter.

But these parameters are typically easily guessable and so we must provide an authentication mechanism, otherwise anyone can construct URLs to modify other accounts however they like. To do this, we ‘sign’ the action data by combining it with a secret value, for example a global secret token stored on our servers, or the user’s password hash to produce a ‘tag’. This ‘tag’ does not reveal the secret values used in its construction but lets us verify that the link was generated by our site and for the correct user. The article gives the following methods you could add to a user class to let them sign and verify actions, assuming the user has an id and a secret_token. (I don’t mean to pick on Intridea, I’ve seen a lot of other people make this mistake and I only found out recently why you shouldn’t do it.)

class User
  def sign_action(action, *params)
    Digest::SHA1.hexdigest(
      "--signed--#{id}-#{action}-#{params.join('-')}-#{secret_token}"
    )
  end
  
  def verify(signature, action, *params)
    signature == sign_action(action, *params)
  end
end

This combines the action data with a secret token and calculates the SHA-1 hash; this hash is then appended to the URL to go in the email, and verified when the link is followed. Seems fine, right? SHA-1 doesn’t reveal anything about its input, so the secret is safe. This might be true (modulo the existence of rainbow tables) but hash functions still don’t provide any guarantee that a (message,tag) pair is genuine. To see why, we need to consider how hash functions work.

Many hash functions are based on something called the ‘Merkle-Damgard iterated construction’, which works like this: say you have a long string you want to hash, that looks like this:

+--------------------------+
| abcdefghijklmnopqrstuvwx |
+--------------------------+

Your message gets split into fixed-size blocks like this:

+--------+  +--------+  +--------+  +--------+
| abcdef |  | ghijkl |  | mnopqr |  | stuvwx |
+--------+  +--------+  +--------+  +--------+

It is then prefixed with a value called the ‘initialization vector’ or IV, which is a global constant that’s part of the hash function’s definition. This IV is the same size as the message blocks:

    IV          M0          M1          M2          M3
+--------+  +--------+  +--------+  +--------+  +--------+
| Zn8AGy |  | abcdef |  | ghijkl |  | mnopqr |  | stuvwx |
+--------+  +--------+  +--------+  +--------+  +--------+

This sequence is then folded using a compression function h(). The details of h() depend on the hashing function, but the only thing that concerns us here is that the compression function takes two message blocks and returns another block of the same size.

So, first we take IV and M0 and compute h(IV,M0) to get H0. Then we take H0 and M1 and compute H1 = h(H0,M1) and so on down the chain.

    IV          M0          M1          M2          M3
+--------+  +--------+  +--------+  +--------+  +--------+
| Zn8AGy |  | abcdef |  | ghijkl |  | mnopqr |  | stuvwx |
+--------+  +--------+  +--------+  +--------+  +--------+
     |           |           |           |           |
     |         +-+-+       +-+-+       +-+-+       +-+-+
     +-------->| h |--H0-->| h |--H1-->| h |--H2-->| h |--> H3
               +---+       +---+       +---+       +---+

Whatever value comes out the end of the chain is the result of the hash function; this is how hash functions take arbitrary-size input and produce fixed-size output. But what does it mean for message authentication? Well say you have a (message,tag) pair, like our action params and SHA-1 hash from above. The construction of hash functions means that if you know the value of hash(string), you can easily work out the value of hash(string + modification) without knowing what string is: you just take the hash you already have, and do some more rounds of Merkle-Damgard with your modification. So, if we’re signing data by doing this:

tag = sha1(secret + message)

we’re then going to send (message,tag) over the wire in an email. The Merkle-Damgard construction means an attacker can take these values and easily compute the following:

fake_tag = sha1(secret + message + modification)

The attacker can do this without knowing secret, and can therefore construct new (message,tag) pairs that look genuine to the application. The creation of new pairs by untrusted parties is called an extension attack, and common hash functions are vulnerable to it.
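To see the mechanics without the details of SHA-1, here is a toy model of the iterated construction in JavaScript. Everything in it is made up for illustration: the compression function h() is a trivial mixer with no security whatsoever, the IV is arbitrary, and there is no padding, so inputs must be a whole number of blocks. Attacks on real hash functions also have to account for the padding the hash appends, which this sketch ignores.

```javascript
var BLOCK_SIZE = 6,            // characters per block, as in the diagrams
    IV         = 0x5a6e3841;   // arbitrary constant standing in for the IV

// Toy compression function: mixes one block into a 32-bit chaining value.
function h(state, block) {
  for (var i = 0; i < block.length; i++) {
    state = Math.imul(state ^ block.charCodeAt(i), 16777619) >>> 0;
  }
  return state;
}

// Fold h() over the blocks of `input`, starting from `state`.
function fold(state, input) {
  if (input.length % BLOCK_SIZE !== 0) throw new Error('Toy hash has no padding');
  for (var i = 0; i < input.length; i += BLOCK_SIZE) {
    state = h(state, input.slice(i, i + BLOCK_SIZE));
  }
  return state;
}

function toyHash(input) {
  return fold(IV, input);
}

var secret       = 'sekret',   // known only to the server
    message      = 'follow',
    modification = '-admin';

// The server computes the tag and sends (message, tag) over the wire
var tag = toyHash(secret + message);

// The attacker carries on folding from the tag, never touching `secret`
var fakeTag = fold(tag, modification);

console.log(fakeTag === toyHash(secret + message + modification)); // -> true
```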

Fortunately, there’s a function that is resistant to extension attacks and is designed precisely for signing messages: HMAC, short for Hash-based Message Authentication Code. In Ruby you can use the OpenSSL module from the standard library for this:

sha1 = OpenSSL::Digest.new('sha1')
tag  = OpenSSL::HMAC.hexdigest(sha1, secret_token, message)

Although HMAC is based on hash functions, it’s constructed in a way that prevents extension attacks and other classes of ‘existential forgery’, i.e. people constructing fake-but-valid (message,tag) pairs. If you’re signing data, you should use it.

But wait: there’s more! Take another look at the verify() function:

def verify(signature, action, *params)
  signature == sign_action(action, *params)
end

See anything wrong with it? It’s actually vulnerable to a timing attack: the String#== method as commonly implemented compares strings character-by-character (or byte-by-byte) and exits with false as soon as it finds an unequal pair. This fact means that an attacker can determine the first correct character of the tag by submitting requests to a signed URL with a different first character in the tag each time, and stopping when the request takes a little longer than usual. After guessing the first character they can move onto the second, and so on until they’ve guessed the whole correct tag.

The easiest way to defeat this attack is, instead of directly comparing two strings, compare their mappings under a collision-resistant hash function:

def verify(signature, action, *params)
  expected = Digest::SHA1.hexdigest(sign_action(action, *params))
  actual = Digest::SHA1.hexdigest(signature)
  expected == actual
end

This strategy means that if any of the supplied tag is wrong, it will probably differ on the first character when compared to the hash of the expected signature. Even if it doesn’t sometimes, you’ve destroyed the linear relationship between the correctness of the string and the time the comparison takes.
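The same idea in JavaScript, for Node users, might look like the sketch below. It hashes both values as described, then additionally compares the digests with crypto.timingSafeEqual, a constant-time comparison that Node provides. That second step is an addition of this sketch rather than something from the Ruby code above, and safeEqual() is a hypothetical helper name.

```javascript
var crypto = require('crypto');

// Map a value to a fixed-length digest so both sides have equal length.
function digest(value) {
  return crypto.createHash('sha256').update(String(value)).digest();
}

// Hypothetical helper: compare two tags without leaking where they differ.
function safeEqual(expected, actual) {
  return crypto.timingSafeEqual(digest(expected), digest(actual));
}

console.log(safeEqual('a1b2c3', 'a1b2c3')); // -> true
console.log(safeEqual('a1b2c3', 'a1b2c4')); // -> false
```

Hashing first matters here for a practical reason too: timingSafeEqual throws if its two arguments have different lengths, and the digests are always the same size.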

Finally, you should make sure your application does not exit early if the tag is invalid. You should do all the data processing you would normally do, just short of modifying the database, and check the tag last. If you return early you risk another timing attack.

All the information in this post I learned from Stanford’s introduction to cryptography, which is an excellent six-week primer on the topic. It’s a little mathsy but not nearly as much as it could be, meaning it’s fairly accessible and focuses on practical problems and intuition, and is really useful to anyone handling user data for a living. I’d go so far as to say it should be required reading for any web developer. It starts up again on Monday June 11: go sign up and make sure you’re not the next victim of the password leaks we’ve seen this week.

Source map support added to Packr and Jake

I’m a little late announcing this, in fact I held off until I’d been using this code for a couple of months to make sure there weren’t any glaring surprises. But the good news is, all the JavaScript libraries I ship from now on will come with source maps, and yours can too.

What’s a source map, you ask. Well, as this article explains, a source map is just a metadata file that maps locations in a minified (or otherwise compiled) JavaScript file back to locations in the source code the author is working on. This means you can load minified/compiled code into a browser, and when log messages or exceptions occur, the browser can show you the filename and line number in your source code that such events come from, rather than somewhere in line 1 of whatever huge bundle of code you’re actually deploying. This is really useful.

Telling the browser about this metadata is really easy. If you look at the latest release of Faye, you’ll see it contains three client-side files:

  • faye-browser.js – the source code
  • faye-browser-min.js – the minified code
  • faye-browser-min.js.map – the source map

faye-browser-min.js is what gets loaded in the browser. At the end of the file is a single line comment that looks like this:

//@ sourceMappingURL=faye-browser-min.js.map

This line tells the browser to use the file faye-browser-min.js.map as the source map for the JavaScript file that includes this comment. sourceMappingURL is resolved relative to the URL of the containing script, so as long as the script and the map are in the same directory this directive works fine.

The file faye-browser-min.js.map looks like this:

{
  "version": 3,
  "file": "faye-browser-min.js",
  "sourceRoot": "",
  "sources": ["faye-browser.js"],
  "names": ["_advice", "_callback", "_callbacks", "_cancelled", "_cbCount", "_channels", ...],
  "mappings": "AAAA,IAAI,MAAQ,OAAO,QAAU,SAAW,QACxC,GAAI,OAAO,UAAY,WAAY,OAAO,KAAO,KAEjD,KAAK..."
}

The sources field refers to the source code the map relates to: a source map can describe how one file was concatenated from a set of source files. The browser uses the sources and mappings data to show you the source code instead of the minified code when you use the web inspector, and it only loads the map and the source code if you have the inspector open so it doesn’t add latency for normal users.

Now if you want to use this new browser feature on your own projects, I have a couple of tools you should look at. First is Packr, my port of Dean Edwards’ Packer tool in Ruby. Say you have two source files:

# example_a.js

// When the minified code is loaded into a browser, you should see the call to
// console.log() attributed to example_a.js:4

console.log('Hello from file A');

# example_b.js

var display = function(message) {
  alert(message + ' from file B');
};

display('Ahoy there');

You can run this in the shell:

packr example_a.js example_b.js -o min/example-min.js -h '/* Copyright 2012 */' --shrink-vars

And it will generate a minified file and a source map for you:

# example-min.js

/* Copyright 2012 */
console.log('Hello from file A');var display=function(a){alert(a+' from file B')};display('Ahoy there');
//@ sourceMappingURL=example-min.js.map

# example-min.js.map

{
  "version": 3,
  "file": "example-min.js",
  "sourceRoot": "",
  "sources": ["../example_a.js", "../example_b.js"],
  "names": ["message"],
  "mappings": ";AAGA,QAAQ,KAAK,MAAM,KAAK,KAAK,ICH7B,IAAI,QAAU,SAASA,GACrB,MAAMA,IAAY,KAAK,KAAK,KAG9B,SAAS,KAAK;"
}
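The mappings string looks opaque, but it is just semicolon-separated lines of comma-separated segments, each one a run of base-64 VLQ numbers. As an illustration (decodeSegment() is a name invented here, not part of Packr), this sketch decodes a single segment from the map above:

```javascript
var BASE64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Decode one segment of a version 3 source map into a list of integers.
function decodeSegment(segment) {
  var values = [], value = 0, shift = 0;

  for (var i = 0; i < segment.length; i++) {
    var digit = BASE64.indexOf(segment.charAt(i));
    if (digit < 0) throw new Error('Invalid character: ' + segment.charAt(i));

    value += (digit & 31) << shift;     // low 5 bits carry data
    if (digit & 32) {                   // bit 6 set: more digits follow
      shift += 5;
    } else {
      var negative = value & 1;         // lowest bit of the result is the sign
      value = value >>> 1;
      values.push(negative ? -value : value);
      value = 0;
      shift = 0;
    }
  }
  return values;
}

// First segment on the second line of example-min.js
console.log(decodeSegment('AAGA')); // -> [ 0, 0, 3, 0 ]
```

The four fields are the column in the generated file, the index into sources, the source line and the source column, all zero-based and each relative to the previous segment. So AAGA says that column 0 of the line after the header comment comes from sources[0], zero-based line 3, column 0: that is example_a.js:4, as promised by the comment in the source file.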

Packr lets you remove whitespace, compress variable names, use base-62 encoding, insert header comments on minified output, and can be used as a CLI or as a Ruby library. See the readme for more info.

Second is a tool I’ve been using to build all my JavaScript projects for a few years now, Jake. Jake is based on Packr, but is designed for larger projects. It’s designed to let you describe the structure of a project using a simple YAML file, and build the whole thing easily with one shell command. Based on Packr’s source map support, I’ve added the ability to tell Jake to generate source maps for one build based on the output of another. For example, say your build produces both unminified (src) and minified (min) copies of your code; then you can tell Jake to generate a source map for the min build that refers to code in the src build as follows:

---
builds:
  src:
    minify: false
  min:
    minify: true
    shrink_vars: true
    source_map: src

The source_map: src line means, generate a source map for min files that refers to locations in the corresponding src file. This is how the Faye files above are generated. Again, the Jake readme has more information.

We’re likely to see source map support in other tools soon; adding it to Packr was fairly easy because of Packr’s very simple model for compressing code. Basically it’s a regex-replacement-based compressor rather than a full JavaScript parser, so source maps could be added as a post-processing step rather than being based on AST annotations requiring deep integration. Unfortunately this means that if you like omitting semicolons from your code you might have to wait a little longer to use this stuff.

Faye 0.8: the refactoring

I’m pleased to finally announce the release of Faye 0.8 after a few months of reorganising the 0.7 codebase to make it more modular, and splitting parts of it out into separate projects. Before I get to what’s changed, I’m going to get the API changes out of the way: this is the stuff you need to know if you’re upgrading.

I hate introducing API changes but I’m afraid these really couldn’t be avoided. They’re really configuration changes so you shouldn’t need to change a lot of code. Please get on the mailing list if you have problems.

First, if you’re running the Ruby server you need to tell it which web server you’re using. Faye now supports the Thin, Rainbows and Goliath web servers, and you need to tell the WebSocket layer which set of adapters to load. For a ‘hello world’ app this looks like this:

# config.ru
require 'faye'
Faye::WebSocket.load_adapter('thin')

app = Faye::RackAdapter.new(:mount => '/faye', :timeout => 25)

run app

Depending on whether you run your application with rackup or thin, the load_adapter call might not be strictly necessary, but better to have it in there just in case. See the Faye Ruby docs and the faye-websocket documentation on how to run your app with different Ruby servers.

Second, if you use the Redis backend, you need to install a new library and change the engine.type setting in your server. Instead of specifying the name of the engine, this field now takes a reference to the engine object. In Ruby that looks like this:

# config.ru
# First run: gem install faye-redis

require 'faye'
require 'faye/redis'

bayeux = Faye::RackAdapter.new(
  :mount   => '/',
  :timeout => 25,
  :engine  => {
    :type  => Faye::Redis,
    :host  => 'redis.example.com',
    # more options
  }
)

And on Node:

// First run: npm install faye-redis

var faye  = require('faye'),
    redis = require('faye-redis');

var bayeux = new faye.NodeAdapter({
  mount:    '/',
  timeout:  25,
  engine: {
    type:   redis,
    host:   'redis.example.com',
    // more options
  }
});

Apart from this, Faye 0.8 should be backward-compatible with previous releases.

Having got the administrivia out of the way, what’s new? Well, the main focus of the 0.8 release is modularity. Two major components of Faye – the WebSocket support and the Redis engine – have been split off into their own packages. I’ve been blogging here already about faye-websocket (rubygem, npm package), and the major work over the last few months has gone into making this a really solid WebSocket and EventSource implementation. Because of its new-found freedom outside the main Faye project, it’s been adopted by other projects, notably SockJS, Cramp and Poltergeist. This adoption, particularly from SockJS, has meant more feedback, bug fixes and performance improvements and resulted in a really solid WebSocket library that improves Faye’s performance.

This new library has had a beneficial impact on Faye’s transport layer. faye-websocket is much faster than Faye’s 0.7 WebSocket code, supports EventSource and new WebSocket features, and runs on more Ruby servers: Faye 0.7 was confined to Thin whereas it now also runs on Rainbows and Goliath. On top of this, Faye 0.8 adds a new EventSource-based transport to support Opera 11 and browsers where proxies block WebSocket, and improves how it uses WebSocket. Previously, Faye’s WebSocket transport used the same polling-based /meta/connect cycle as other transports. It was faster than HTTP, but not optimal. Faye 0.8 now breaks out of this request/response pattern and pushes messages to WebSocket and EventSource connections as soon as they arrive at the server, without returning the /meta/connect poll. This results in lower latency, particularly when delivering messages at high volume.

The second major change is that the Redis engine is now a separate library, faye-redis (rubygem, npm package). This has two important benefits. First, the main Faye package no longer depends on Redis clients, and in particular the Node version no longer depends on packages with C extensions, so no compiler is needed to install it. Second, it means Faye’s backend is now totally pluggable and third parties can implement their own engines: the API is thoroughly documented for Ruby and Node. The engine is a small piece of code (for example here’s the Ruby in-memory engine) but it really defines Faye’s behaviour. This layer is not concerned with transport negotiation (HTTP, WebSocket, etc) or even the Bayeux protocol format, it just implements the messaging business logic and stores the state of the system. You can easily implement your own engine to run on top of another messaging/storage stack, or change the messaging semantics if you like. Faye has a complete set of tests you can run to check your engine – see the Ruby and Node projects for examples.

Finally, there have been a couple of changes to the client. We’ve switched from exponential-backoff to fixed-interval for trying to reconnect after the client loses its connection to the server, and this interval is configurable using the retry setting:

// Attempts to reconnect every 5 seconds
var client = new Faye.Client('http://example.com/bayeux', {
  retry: 5
});
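
To make the difference between the two strategies concrete, here is a rough sketch of the delay schedules they produce. The helper names are made up for illustration and this is not Faye’s actual reconnection code; the initial delay of 1 second for the exponential case is also just an assumption.

```javascript
// Hypothetical helpers, not part of Faye: compute the delay (in seconds)
// before each of the first `attempts` reconnection attempts.
function fixedIntervalDelays(retry, attempts) {
  var delays = [];
  for (var i = 0; i < attempts; i++) delays.push(retry);
  return delays;
}

function exponentialBackoffDelays(initial, attempts) {
  var delays = [];
  for (var i = 0; i < attempts; i++) delays.push(initial * Math.pow(2, i));
  return delays;
}

console.log(fixedIntervalDelays(5, 4));      // [ 5, 5, 5, 5 ]
console.log(exponentialBackoffDelays(1, 4)); // [ 1, 2, 4, 8 ]
```

With a fixed interval the worst-case wait after the server comes back is bounded by the retry value, however long the outage lasted.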

You can also set headers for long-polling requests on the client; this is useful for talking to OAuth-protected servers, for example:

client.setHeader('Authorization', 'OAuth ' + accessToken);

And finally, there’s a new server-side setting, ping, that controls how often the server sends keep-alive data over WebSocket and EventSource connections. This data is ignored by the client, but helps keep the connection open through proxies that like to kill idle connections.

// Sends pings every 10 seconds

var bayeux = new faye.NodeAdapter({
  mount: '/bayeux',
  ping:  10
});

And that just about wraps things up for this release. Since 0.7.1, the main Faye codebase has shed over 2,000 lines of code into other projects that we can easily ship incremental updates to without affecting Faye itself. It’s more performant, leaner and more modular and I know there are already projects doing cool things with it. If you’re using Faye for interesting projects, I’d love to hear from you on the mailing list.

Organizing a project with JS.Packages

I’ve been asked by a few users of JS.Class to explain how I use it to organize projects. I’ve been meaning to write this up for quite a while, ever since we adopted it at Songkick for managing our client-side codebase. Specifically, we use JS.Packages to organize our code, and JS.Test to test it, and I’m mostly going to talk about JS.Packages here.

JS.Packages is my personal hat-throw into the ring of JavaScript module loaders. It’s designed to separate dependency metadata from source code, and be capable of loading just about anything as efficiently as possible. It works at a more abstract level than most script loaders: users specify objects they want to use, rather than scripts they want to load, allowing JS.Packages to optimize downloads for them and load modules that have their own loading strategies, all through a single interface, the JS.require() function.

As an example, I’m going to show how we at Songkick use JS.Packages within our main Rails app. We manage our JavaScript and CSS by doing as much as possible in those languages, and finding simple ways to integrate with the Rails stack. JS.Packages lets us specify where our scripts live and how they depend on each other in pure JavaScript, making this information portable. We use JS.require() to load our codebase onto static pages for running unit tests without the Rails stack, and we use jsbuild and AssetHat to package it for deployment. Nowhere in our setup do we need to manage lists of script tags or worry about load order.

The first rule of our codebase is: every class/module lives in its own file, much like how we organize our Ruby code. And this means every namespace: even if a namespace has no methods of its own but just contains other classes, we give it a file so that other files don’t have to guess whether the namespace is defined or not. For example a file containing a UI widget class might look like this:

// public/javascripts/songkick/ui/widget.js

Songkick.UI.Widget = function() {
  // ...
};

This file does not have to check whether Songkick or Songkick.UI is defined, it just assumes they are. The namespaces are each defined in their own file:

// public/javascripts/songkick.js
Songkick = {};

// public/javascripts/songkick/ui.js
Songkick.UI = {};

Notice how each major class or namespace lives in a file named after the module it contains; this makes it easier to find things while hacking and lets us take advantage of the autoload() feature in JS.Packages to keep our dependency data small. It looks redundant at first, but it helps maintain predictability as the codebase grows. It results in more files, but we bundle everything for production so we keep our code browsable without sacrificing performance. I’ll cover bundling later on.

To drive out the implementation of our UI widget, we use JS.Test to write a spec for it. I’m just going to give it some random behaviour for now to demonstrate how we get everything wired up.

// test/js/songkick/ui/widget_spec.js

Songkick.UI.WidgetSpec = JS.Test.describe("Songkick.UI.Widget", function() { with(this) {
  before(function() { with(this) {
    this.widget = new Songkick.UI.Widget("foo")
  }})
  
  it("returns its attributes", function() { with(this) {
    assertEqual( {name: "foo"}, widget.getAttributes() )
  }})
}})

So now we’ve got a test and some skeleton source code, how do we run the tests? First, we need a static page to load up the JS.Packages loader, our manifest (which we’ll get to in a second) and a script that runs the tests:

// test/js/browser.html

<!doctype html>
<html>
  <head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8">
    <title>JavaScript tests</title>
  </head>
  <body>
    
    <script type="text/javascript">ROOT = '../..'</script>
    <script type="text/javascript" src="../../vendor/jsclass/min/loader.js"></script>
    <script type="text/javascript" src="../../public/javascripts/manifest.js"></script>
    <script type="text/javascript" src="./runner.js"></script>
    
  </body>
</html>

The file runner.js should be very simple: ideally we just want to load Songkick.UI.WidgetSpec and run it:

// test/js/runner.js

// Don't cache files during tests
JS.cacheBust = true;

JS.require('JS.Test', function() {
  
  JS.require(
    'Songkick.UI.WidgetSpec',
    // more specs as the app grows...
    function() { JS.Test.autorun() });
});

The final missing piece is the manifest, the file that says where our files are stored and how they depend on each other. Let’s start with a manifest that uses autoload() to specify all our scripts’ locations; I’ll present the code and explain what each line does.

// public/javascripts/manifest.js

JS.Packages(function() { with(this) {
  var ROOT = JS.ENV.ROOT || '.'
  
  autoload(/^(.*)Spec$/,     {from: ROOT + '/test/js', require: '$1'});
  autoload(/^(.*)\.[^\.]+$/, {from: ROOT + '/public/javascripts', require: '$1'});
  autoload(/^(.*)$/,         {from: ROOT + '/public/javascripts'});
}});

The ROOT setting simply lets us override the root directory for the manifest, as we do on our test page. After that, we have three autoload() statements. When you call JS.require() with an object that’s not been explicitly configured, the autoload() rules are examined in order until a match for the name is found.

The first rule says that object names matching /^(.*)Spec$/ (that is, test files) should be loaded from the test/js directory. For example, Songkick.UI.WidgetSpec should be found in test/js/songkick/ui/widget_spec.js. The require: '$1' means that the object depends on the object captured by the regex, so Songkick.UI.WidgetSpec requires Songkick.UI.Widget to be loaded first, as you’d expect.

The second rule makes sure that the containing namespace for any object is loaded before the object itself. For example, it makes sure Songkick.UI is loaded before Songkick.UI.Widget, and Songkick before Songkick.UI. The regex captures everything up to the final . in the name, and makes sure it’s loaded using require: '$1'.

The third rule is a catch-all: any object not matched by the above rules should be loaded from public/javascripts. Because of the preceding rule, this only matches root objects, i.e. it matches Songkick but not Songkick.UI. Taken together, these rules say: load all objects from public/javascripts, and make sure any containing namespaces are loaded first.
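
If it helps to see the matching logic spelled out, here is a toy resolver for these three rules. It is only a sketch of the behaviour described above, not how JS.Packages is implemented; the rules table, pathFor() and resolve() are names I’ve invented, and the name-to-path conversion is an assumption based on the file names used in this article.

```javascript
// Illustrative only: a toy resolver for the three autoload() rules.
var rules = [
  {pattern: /^(.*)Spec$/,     from: 'test/js',            require: '$1'},
  {pattern: /^(.*)\.[^\.]+$/, from: 'public/javascripts', require: '$1'},
  {pattern: /^(.*)$/,         from: 'public/javascripts'}
];

// Assumed naming convention: Songkick.UI.WidgetSpec -> songkick/ui/widget_spec.js
function pathFor(name) {
  return name.split('.').map(function(part) {
    return part.replace(/([a-z])([A-Z])/g, '$1_$2').toLowerCase();
  }).join('/') + '.js';
}

// Returns the files needed for `name`, dependencies first.
function resolve(name, loaded) {
  loaded = loaded || [];
  for (var i = 0; i < rules.length; i++) {
    var match = name.match(rules[i].pattern);
    if (!match) continue;                       // try the next rule
    if (rules[i].require) {
      resolve(rules[i].require.replace('$1', match[1]), loaded);
    }
    var file = rules[i].from + '/' + pathFor(name);
    if (loaded.indexOf(file) < 0) loaded.push(file);
    break;                                      // first matching rule wins
  }
  return loaded;
}

console.log(resolve('Songkick.UI.WidgetSpec').join('\n'));
// public/javascripts/songkick.js
// public/javascripts/songkick/ui.js
// public/javascripts/songkick/ui/widget.js
// test/js/songkick/ui/widget_spec.js
```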

Let’s implement the code needed to make the test pass. We’re going to use jQuery to do some trivial operation; the details aren’t important but it causes a dependency problem that I’ll illustrate next.

// public/javascripts/songkick/ui/widget.js

Songkick.UI.Widget = function(name) {
  this._name = name;
};

Songkick.UI.Widget.prototype.getAttributes = function() {
  return jQuery.extend({}, {name: this._name});
};

If you open the page test/js/browser.html, you’ll see that the test fails with an error.

The test doesn’t work because jQuery is not loaded; this means part of our codebase depends on it but JS.Packages doesn’t know that. Remember runner.js just requires Songkick.UI.WidgetSpec? We can use jsbuild to see which files get loaded when we require this object. (jsbuild is a command-line tool I wrote after an internal project at Amazon, that was using JS.Class, decided they needed to pre-compile their code for static analysis rather than loading it dynamically at runtime. You can install it by running npm install -g jsclass.)

$ jsbuild -m public/javascripts/manifest.js -o paths Songkick.UI.WidgetSpec
public/javascripts/songkick.js
public/javascripts/songkick/ui.js
public/javascripts/songkick/ui/widget.js
test/js/songkick/ui/widget_spec.js

As expected, it loads the containing namespaces, the Widget class, and the spec, in that order. But the Widget class depends on jQuery, so we need to tell JS.Packages about this. However, rather than adding it as a dependency to every UI module in our application, we can use a naming convention trick: all our UI modules require Songkick.UI to be loaded first, so we can make everything in that namespace depend on jQuery by making the namespace itself depend on jQuery. We update our manifest like so:

// public/javascripts/manifest.js

JS.Packages(function() { with(this) {
  var ROOT = JS.ENV.ROOT || '.';
  
  file('https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js')
    .provides('jQuery', '$');
  
  autoload(/^(.*)Spec$/,     {from: ROOT + '/test/js', require: '$1'});
  autoload(/^(.*)\.[^\.]+$/, {from: ROOT + '/public/javascripts', require: '$1'});
  autoload(/^(.*)$/,         {from: ROOT + '/public/javascripts'});
  
  pkg('Songkick.UI').requires('jQuery');
}});

Running jsbuild again shows jQuery will be loaded, and if you reload the tests now they will pass:

$ jsbuild -m public/javascripts/manifest.js -o paths Songkick.UI.WidgetSpec
public/javascripts/songkick.js
https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js
public/javascripts/songkick/ui.js
public/javascripts/songkick/ui/widget.js
test/js/songkick/ui/widget_spec.js

So we’ve now got a working UI widget, and we can use exactly the same approach to load it in our Rails app: load the JS.Packages library and our manifest, and call JS.require('Songkick.UI.Widget'). But in production, we’d rather not be downloading all those tiny little files one at a time, it’s much more efficient to bundle them into one file.

To bundle our JavaScript and CSS for Rails, we use AssetHat, or rather a fork we made to tweak a few things. Our fork notwithstanding, AssetHat is the closest of the handful of Rails packaging solutions we tried that did everything we needed, and I highly recommend it.

AssetHat uses a file called config/assets.yml, in which you list all the bundles you want and which files should go in each section. But I’d rather specify which objects I want in each bundle; we already have tooling that figures out which files we need and in what order so I’d rather not duplicate that information. Fortunately, AssetHat lets you put ERB in your config, and we use this to shell out to jsbuild to construct our bundles for us.

First, we write a jsbuild bundles file that says which objects our application needs. We exclude jQuery from the bundle because we’ll probably load that from Google’s CDN.

// config/bundles.json
{
  "app" : {
    "exclude" : [ "jQuery" ],
    "include" : [
      "Songkick.UI.Widget"
    ]
  }
}

This is a minimal format that’s close to what the application developer works with: objects. It’s easy to figure out which objects your app needs, less simple to make sure you only load the files you need and get them in the right order, in both your test pages and your application code. We can use jsbuild to tell us which files will go into this bundle:

$ jsbuild -m public/javascripts/manifest.js -b config/bundles.json -o paths app
public/javascripts/songkick.js
public/javascripts/songkick/ui.js
public/javascripts/songkick/ui/widget.js

Now all we need to do is pipe this information into AssetHat. This is easily done with a little ERB magic:

// config/assets.yml
# ...
js:
  <%  def js_bundles
        JSON.parse(File.read('config/bundles.json')).keys
      end
      
      def paths_for_js_bundle(name)
        jsbuild = 'jsbuild -m public/javascripts/manifest.js -b config/bundles.json'
        `#{jsbuild} -o paths -d public/javascripts #{name}`.split("\n")
      end
  %>
  
  bundles:
  <% js_bundles.each do |name| %>
    <%= name %>:
    <% paths_for_js_bundle(name).each do |path| %>
      - <%= path %>
    <% end %>
  <% end %>

Running the minification task takes the bundles we’ve defined in bundles.json and packages them for us:

$ rake asset_hat:minify
Minifying CSS/JS...

 Wrote JS bundle: public/javascripts/bundles/app.min.js
        contains: public/javascripts/songkick.js
        contains: public/javascripts/songkick/ui.js
        contains: public/javascripts/songkick/ui/widget.js
        MINIFIED: 14.4% (Engine: jsmin)

This bundle can now be loaded in your Rails views very easily:

<%= include_js :bundle => 'app' %>

This will render script tags for each individual file in the bundle during development, and a single script tag containing all the code in production. (You may have to disable the asset pipeline in recent Rails versions to make this work.)

So that’s our JavaScript strategy. As I said earlier, the core concern is to express dependency information in one place, away from the source code, in a portable format that can be used just as easily in a static web page as in your production web framework. Using autoload() and some simple naming conventions, you can get all these benefits while keeping the configuration very small indeed.

But wait, there’s more!

As a demonstration of how valuable it is to have portable dependency data and tests, consider the situation where we now want to run tests from the command line, or during our CI process. We can load the exact same files we load in the browser, plus a little stubbing of the jQuery API, and make our tests run on Node:

// test/js/node.js

require('jsclass');
require('../../public/javascripts/manifest');

JS.ENV.jQuery = {
  extend: function(a, b) {
    for (var k in b) a[k] = b[k];
    return a;
  }
};

JS.ENV.$ = JS.ENV.jQuery;

require('./runner');

And lo and behold, our tests run:

$ node test/js/node.js 
Loaded suite Songkick.UI.Widget

Started
.
Finished in 0.003 seconds
1 tests, 1 assertions, 0 failures, 0 errors

Similarly, we can write a quick PhantomJS script to parse the log messages that JS.Test emits:

// test/js/phantom.js

var page = new WebPage();

page.onConsoleMessage = function(message) {
  try {
    var result = JSON.parse(message).jstest;
    if ('total' in result && 'fail' in result) {
      console.log(message);
      var status = (!result.fail && !result.error) ? 0 : 1;
      phantom.exit(status);
    }
  } catch (e) {}
};

page.open('test/js/browser.html');

We can now run our tests on a real WebKit instance from the command line:

$ phantomjs test/js/phantom.js 
{"jstest":{"fail":0,"error":0,"total":1}}

One nice side-effect of doing as much of this as possible in JavaScript is that it improves your API design and makes you decouple your JS from your server-side stack; if it can’t be done through HTML and JavaScript, your code doesn’t do it. This makes it easy to keep your code portable, making it easier to reuse across applications with different server-side stacks.

The cost of privacy

I have a bone to pick with a certain oddly prevalent piece of received wisdom in the JavaScript community. I’ve been meaning to rant about this properly for what seems like geologic amounts of time, but I finally hit a concrete example today that both broke the camel’s back and gave me something I could actually illustrate my point with.

Let’s talk about private variables, or more precisely, defining API methods inside a closure, giving them privileged access to data that code outside the closure cannot see. This is used in several JavaScript design patterns, for example when writing constructors or using the module pattern:

// Defining a constructor
var Foo = function() {
  var privateData = {hello: 'world', etc: 'etc'};

  this.publicMethod = function(key) {
    return privateData[key];
  };
};

// Defining a single object
var Bar = (function() {
  var privateData = {hello: 'world', etc: 'etc'};
  
  return {
    publicMethod: function(key) {
      return privateData[key];
    }
  };
})();

The reason people use this pattern, in fact the only reason for doing so as far as I can tell, is encapsulation. They want to keep the internal state of something private, so that access to it is predictable and it can’t put the object in a weird or inconsistent state, or leak implementation details to consumers of the API. These are all laudable design goals.

However, encapsulation does not need to be rigorously enforced by the machine, and using this style has all sorts of annoying costs that I’ll get to in just a second. Encapsulation is something you get by deliberately designing interfaces and architectures, by communicating with your team/users, and through documentation and tests. Trying to enforce it in code shows a level of paranoia that isn’t necessary in most situations, and the costs grossly outweigh the minimal encapsulation benefit it provides.

So, I guess I should show you what I’m talking about. Okay, hands up who can read the jQuery.Deferred documentation and then tell me what this code does?

var DelayedTask = function() {
  jQuery.Deferred.call(this);
};
DelayedTask.prototype = new jQuery.Deferred();

var task = new DelayedTask();
task.done(function(value) { console.log('Done!', value) });

task.resolve('the value');

var again = new DelayedTask();
again.done(function(value) { console.log('Again!', value) });

I asked this on Twitter earlier and got one correct response. This code prints the following:

Done! the value
Again! the value

But from the source code, what it seems to be trying to do is as follows:

  • Create a subclass of jQuery.Deferred
  • Instantiate a copy of the subclass, and add a callback to it
  • Resolve the instance, invoking the callback
  • Create a second instance of the subclass and add a callback to it
  • Do not resolve the second instance

But it does not do this: the second instance is somehow already resolved, and the callback we add is unexpectedly invoked. What’s going on?

Well, let’s examine what we expect to happen when we try to subclass in JavaScript:

var DelayedTask = function() {
  jQuery.Deferred.call(this);
};
DelayedTask.prototype = new jQuery.Deferred();

We assign the prototype of our subclass to an instance of the superclass, putting all its API methods in our subclass’s prototype. But this not only puts the superclass’s methods into our prototype, it also puts the object’s state in there, so in our constructor we call jQuery.Deferred.call(this). This applies the superclass’s constructor function to our new instance, setting up a fresh copy of any state the object might need.
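
For comparison, here is a sketch of the same technique applied to a superclass that really is a constructor and keeps its state on this. Task is a made-up class for illustration; it has nothing to do with jQuery.

```javascript
// Hypothetical superclass that keeps its state on `this`.
var Task = function() {
  this._callbacks = [];
};
Task.prototype.done = function(callback) {
  this._callbacks.push(callback);
  return this;
};

var DelayedTask = function() {
  Task.call(this);               // gives each instance its own _callbacks
};
DelayedTask.prototype = new Task();

var first  = new DelayedTask(),
    second = new DelayedTask();

first.done(function() {});

console.log(first._callbacks.length);    // 1
console.log(second._callbacks.length);   // 0
console.log(first.done === second.done); // true: the method is shared
```

Each instance gets fresh state from the superclass constructor, while the methods are inherited through the prototype chain.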

So why doesn’t this work? Well it turns out that inside the jQuery.Deferred function you’ll find code that essentially does this:

jQuery.Deferred = function() {
  var doneCallbacks = [],
      failCallbacks = [];
  // more private state

  var done = function(callback) {
    doneCallbacks.push(callback);
  };

  var fail = function(callback) {
    failCallbacks.push(callback);
  };
  
  return {
    done: done,
    fail: fail,
    // more API methods
  };
};

So now we see what’s really going on: jQuery.Deferred, despite its capitalized name and the docs’ instruction to invoke it with new, is not a constructor. It’s a normal function that creates and returns an object, rather than acting on the object created by new. Its methods are not stored in a prototype, they are created anew every time you create a deferred object. As such they are bound to the private data inside the Deferred closure, and cannot be reused and applied to other objects, such as objects that try to inherit this API. It also means that calling jQuery.Deferred.call(this) in our constructor is pointless, since it just returns a new object and does not modify this at all.
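
You can reproduce the effect without jQuery. The following sketch uses a tiny stand-in factory written in the same closure style; Deferred here is my own simplification for illustration, not jQuery’s source.

```javascript
// Stand-in for a closure-based factory: state lives in local variables.
function Deferred() {
  var callbacks = [], resolved = false, value;

  return {
    done: function(callback) {
      if (resolved) callback(value); else callbacks.push(callback);
      return this;
    },
    resolve: function(v) {
      if (resolved) return this;
      resolved = true;
      value = v;
      callbacks.forEach(function(callback) { callback(v); });
      return this;
    }
  };
}

var DelayedTask = function() {
  Deferred.call(this);   // builds a new object and throws it away
};
DelayedTask.prototype = new Deferred();

var log = [];

var task = new DelayedTask();
task.done(function(value) { log.push('Done! ' + value); });
task.resolve('the value');

var again = new DelayedTask();
again.done(function(value) { log.push('Again! ' + value); });

console.log(log); // [ 'Done! the value', 'Again! the value' ]
```

Every DelayedTask inherits the one done() and resolve() pair created for the prototype, so they all share the single closure behind it.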

This concept of binding is important. All JavaScript function invocations have an implicit argument: this. It refers to the receiving object if the function is called as a method (i.e. o in o.m()) or can be set explicitly using the first argument to call() or apply(). Being able to invoke a method against any object is part of what makes them useful; a single function can be used by all the objects in a class and give different output depending on each object’s state. A function that does not use this to refer to state, but instead refers to variables inside a closure, can only act on that data; if you want to reuse its behaviour you have to reimplement it or manually delegate to it.
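
Here is a small sketch of that distinction, using made-up objects. The first function reads its state from this and so works against any receiver; the second reads it from a closure and ignores the receiver entirely.

```javascript
// A method that reads its state from `this` works on any suitable object.
function describe() {
  return 'I am ' + this.name;
}

var alice = {name: 'Alice', describe: describe},
    bob   = {name: 'Bob'};

console.log(alice.describe());      // I am Alice
console.log(describe.call(bob));    // I am Bob

// A method that reads its state from a closure ignores the receiver.
function makeGreeter(name) {
  return {
    describe: function() { return 'I am ' + name; }
  };
}

var carol = makeGreeter('Carol');
console.log(carol.describe.call(bob)); // I am Carol
```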

Manual delegation means that in order to implement a ‘subtype’, we keep an instance of the supertype as an instance variable and reimplement its API, delegating calls to the stored object.

var DelayedTask = function() {
  this._deferred = new jQuery.Deferred();
};

DelayedTask.prototype.always = function() {
  this._deferred.always.apply(this._deferred, arguments);
  return this;
};

// 17 more method definitions

One correspondent on Twitter suggested I do this to dynamically inherit the jQuery API:

var DelayedTask = function() {
  jQuery.extend(this, new jQuery.Deferred());
};

This makes a new instance of the superclass, and copies its API onto the subclass instance. This avoids the problem of manual delegation, but it introduces four new problems:

  • It assumes the methods are correctly bound to new jQuery.Deferred(), so they will work correctly if invoked as methods on another object. This happens to be true in our case but it’s a risky assumption to make about all objects.
  • These methods will return a reference to the jQuery.Deferred instance, rather than the DelayedTask object, breaking an abstraction boundary.
  • for/in loops are slow; object creation now takes O(N) time where N is the size of its API.
  • Doing this will clobber any methods you define in your prototype, so if you want to override any methods you have to put them in the constructor.
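
The second and fourth problems are easy to demonstrate with stand-ins. In this sketch extend() mimics a shallow jQuery.extend() and makeDeferred() mimics a factory whose methods return the inner object; both are hypothetical simplifications rather than jQuery’s real behaviour.

```javascript
// Stand-in for a shallow jQuery.extend(): copies enumerable properties.
function extend(target, source) {
  for (var key in source) target[key] = source[key];
  return target;
}

// Stand-in for a closure-based factory.
function makeDeferred() {
  var self = {
    done: function() { return self; }   // returns the inner object
  };
  return self;
}

var DelayedTask = function() {
  extend(this, makeDeferred());
};
DelayedTask.prototype.done = function() { return 'overridden'; };

var task = new DelayedTask();

console.log(task.done() === 'overridden'); // false: own property shadows the prototype
console.log(task.done() === task);         // false: the inner object leaks out
```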

So we’ve shown that defining your methods in the constructor, rather than in the prototype, hurts reusability and increases maintenance effort, especially in such a dynamic, malleable language as JavaScript. It also makes code harder to understand: if I have to read the whole bloated body of a function to figure out that it’s not really a constructor, because it explicitly returns an object after defining a ton of methods, that’s a maintenance problem. Here are several things I expect to be true if you name a function using an uppercase first letter:

  • It must be invoked using new to function correctly
  • The object returned by new Foo() gives true for object instanceof Foo
  • It can be subclassed easily using the technique shown above
  • Its public API can be seen by inspecting its prototype
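
A quick sketch shows how a factory dressed up as a constructor fails a couple of these checks. Point and FakePoint are made-up examples.

```javascript
// A real constructor: methods live on the prototype.
var Point = function(x) { this.x = x; };
Point.prototype.getX = function() { return this.x; };

// A factory dressed up as a constructor: it returns its own object.
var FakePoint = function(x) {
  return { getX: function() { return x; } };
};

console.log(new Point(1) instanceof Point);          // true
console.log(new FakePoint(1) instanceof FakePoint);  // false
console.log(Object.keys(Point.prototype));           // [ 'getX' ]
console.log(Object.keys(FakePoint.prototype));       // []
```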

One of JavaScript’s key problems is that it’s not semantically rich enough to accurately convey the author’s intent. This can largely be solved using libraries and consistent style, but any time you need to read to the end of a function (including all possible early exits) to figure out if it’s really a constructor or not, that causes maintenance problems.

I’ve shown that it causes reusability problems and shared state, and that it makes code harder to understand, but if I know JavaScript programmers you’re probably not convinced. Okay, let’s try science:

require('jsclass');
JS.require('JS.Benchmark');

var Foo = function() {};
Foo.prototype.method = function() {};

var Bar = function() {
  this.method = function() {};
};

JS.Benchmark.measure('prototype', 1000000, {
  test: function() { new Foo() }
});

JS.Benchmark.measure('constructor', 1000000, {
  test: function() { new Bar() }
});

We have two classes: one defines a method on its prototype, and the other defines it inside its constructor. Let’s see how they perform:

$ node test.js 
 BENCHMARK [47ms +/- 8%] prototype
 BENCHMARK [265ms +/- 26%] constructor

(Another pet peeve: benchmarks with no statistical error margins.)

Defining the API inside the constructor – just one no-op method – is substantially slower than if the method is defined on the prototype. Every time you invoke Bar, it has to construct its entire API from scratch, instead of simply setting a pointer to the prototype to inherit methods. It also uses more memory: all those function objects aren’t free, and are more likely to leak memory since they’re inside a closure. As the API gets bigger, this problem only gets worse.
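
You can see the extra allocation directly by comparing function identity across instances. This sketch reuses the same two class shapes as the benchmark.

```javascript
var Foo = function() {};
Foo.prototype.method = function() {};

var Bar = function() {
  this.method = function() {};
};

console.log(new Foo().method === new Foo().method); // true: one shared function
console.log(new Bar().method === new Bar().method); // false: a new function per instance
console.log(new Foo().hasOwnProperty('method'));    // false
console.log(new Bar().hasOwnProperty('method'));    // true
```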

Defining methods this way vastly increases the runtime of constructors, increases memory use, and forces manual delegation, adding an extra function dispatch to every method call in the subclass. JavaScript function calls are expensive: two of the best ways to improve the performance of JavaScript code are to inline function calls, and make constructors cheaper (a third being to aggressively remove loops). The original version of Sylvester defined methods in its constructors, and the first big performance win involved moving to prototypes. One of the factors that made faye-websocket much faster was removing unnecessary function calls and loops.

As a sweet bonus, if your instances all share the same copy of their methods, you can do useful things like test coverage analysis, which is impossible if every instance gets a fresh set of methods.
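
As a rough idea of what that looks like, here is a sketch that wraps each prototype method once and counts calls across all instances. Widget and instrument() are hypothetical names; a real coverage tool would do a lot more than this.

```javascript
var Widget = function(name) { this._name = name; };
Widget.prototype.getName = function() { return this._name; };

// Hypothetical instrumentation helper: wraps every prototype method once.
function instrument(klass) {
  var counts = {};
  Object.keys(klass.prototype).forEach(function(name) {
    var original = klass.prototype[name];
    if (typeof original !== 'function') return;
    counts[name] = 0;
    klass.prototype[name] = function() {
      counts[name] += 1;
      return original.apply(this, arguments);
    };
  });
  return counts;
}

var counts = instrument(Widget);
new Widget('a').getName();
new Widget('b').getName();
console.log(counts); // { getName: 2 }
```

Because the wrapper is installed on the prototype, it sees calls made on every instance, including ones created after instrumentation.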

Yes, I know most of the time object creation and function calls do not dominate the runtime of an application, but when they do, you will wish you’d written your code in a style less likely to introduce performance problems. Writing code with prototypes is no more costly than using closures in terms of development effort, and avoids all the problems I’ve listed above, and I like sticking to habits that result in less maintenance work.

If you really want privacy, you need to ask yourself these questions. First, is your API actually guaranteeing privacy? It’s easy to let one little object reference slip out through an API and your whole privacy claim is blown. Second, is it worth all the above costs? And third, can you better communicate the design intent of your code without incurring these costs? For example, many people prefix ‘private’ fields with an underscore to signal you shouldn’t call them. I go one step further and compress my code using Packer, which obfuscates these underscored names. You can still reuse my methods because they’re not bound to private state, but it’s very clear which methods are public and which aren’t. I’m not going to stop you using them, but the risk is very clearly stated.

Finally consider the real reason we’ve been told global variables are evil, and we should encapsulate things as much as possible. Global variables are evil because they are an example of implicit shared state. This is definitely something to avoid, but you need to know it’s this you’re avoiding, and not global variables per se. The methods in the jQuery.Deferred API still have an implicit shared state problem that in some sense is worse than the global variable problem: the user cannot completely determine the function’s output from its inputs, because the user cannot change the object the function acts upon. The function’s behaviour is bound to state that the user cannot see or replace.

CommonJS doesn’t really solve this problem either, it just moves it to the filesystem so multiple versions of a library can co-exist and each module can get its own copies of its dependencies. (I’d argue this is a waste of memory and start-up time for very little reward.) You still have a global shared namespace (both in the filesystem and the JavaScript runtime), and you can still change the public API of a CommonJS module, just as you can for anything defined using the module pattern. There’s only so far you can go in locking down your code, at some point some of it has to walk out into the mean wide world and interact with other programs. Deal with it, and quit punishing your users with bad design decisions.

faye-websocket 0.3: EventSource support, and two more Ruby servers

The latest iteration of faye-websocket has just been released for Ruby and Node, probably the last release before I get back to making progress on Faye itself. It contains two major new features: EventSource support, and support for the Rainbows and Goliath web servers for the Ruby version.

EventSource is a server-push protocol that’s supported by many modern browsers. On the client side, you open a connection and listen for messages:

var es = new EventSource('http://example.com/events');
es.onmessage = function(event) {
  // process event.data
};

This sends a GET request with Accept: text/event-stream to the server and holds the connection open, and the server can then send messages to the client via a streaming HTTP response with Content-Type: text/event-stream. It’s a one-way connection so the client cannot send messages to the server over this connection; it must use separate HTTP requests. However, EventSource uses a much simpler protocol than WebSocket, and looks more like ‘normal’ HTTP so it has less trouble getting through proxies.
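
To give a sense of how simple the protocol is, here is a sketch of a function that serializes one message in the event stream format: optional event and id fields, one data line per line of payload, and a blank line to end the message. formatEvent() is a name I’ve made up for illustration; it is not part of faye-websocket.

```javascript
// Formats one message in the text/event-stream wire format.
function formatEvent(data, options) {
  options = options || {};
  var frame = '';
  if (options.event) frame += 'event: ' + options.event + '\n';
  if (options.id)    frame += 'id: ' + options.id + '\n';
  String(data).split('\n').forEach(function(line) {
    frame += 'data: ' + line + '\n';
  });
  return frame + '\n';   // blank line ends the message
}

console.log(JSON.stringify(formatEvent('Hello')));
// "data: Hello\n\n"
```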

On the server side, supporting this requires a lot of the same code as WebSocket, and I might use it in Faye later on, so I decided to add support for it in faye-websocket. In your Rack app, you can now easily handle EventSource connections using an API similar to WebSocket:

require 'faye/websocket'

App = lambda do |env|
  if Faye::EventSource.eventsource?(env)
    es = Faye::EventSource.new(env)

    # Periodically send messages
    loop = EM.add_periodic_timer(1) { es.send('Hello') }
    
    es.onclose = lambda do |event|
      EM.cancel_timer(loop)
      es = nil
    end

    # Async Rack response
    es.rack_response
  
  else
    # Normal HTTP
    [200, {'Content-Type' => 'text/plain'}, ['Hello']]
  end
end

Just like WebSocket, EventSource is designed as a convenient wrapper around the Rack environment and underlying TCP connection that deals with the wire protocol and connection details for you. It tries not to make any assumptions or force constraints on your application design. There are a lot of WebSocket libraries around whose interfaces look more like Rails controllers; black-box Rack components or full-stack servers that hide the socket object and force you to respond to WebSockets on one endpoint, and normal HTTP on another. Faye needs to be able to speak many different transport protocols over a single endpoint, which is why this library is designed to be usable inside any Rack adapter while leaving routing decisions up to you.

The benefit of interacting with the sockets as first-class objects is that you can pass them to other parts of your application, which can deal with them as a simple abstraction. For example, if your application just needs to push data to the client, as many WebSocket apps do, you can maintain a socket pool full of objects that respond to send(). When the application wants to push data, it selects the right connection and calls send() on it, without worrying whether it’s a WebSocket or EventSource connection. This gives you some flexibility around which transport your client uses.
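As a rough sketch of that idea, a pool only needs to know that each connection responds to send(). SocketPool, FakeConnection and the client IDs below are hypothetical names used for illustration; in a real app the pooled objects would be Faye::WebSocket or Faye::EventSource instances.

```ruby
# Hypothetical sketch: SocketPool and the client IDs are illustrative
# and are not part of faye-websocket.
class SocketPool
  def initialize
    @connections = {}
  end

  # `connection` is anything that responds to send(message)
  def add(client_id, connection)
    @connections[client_id] = connection
  end

  def remove(client_id)
    @connections.delete(client_id)
  end

  # Returns true if the message was handed to a connection, false otherwise
  def push(client_id, message)
    connection = @connections[client_id]
    return false unless connection
    connection.send(message)
    true
  end
end

# Stand-in for a WebSocket or EventSource connection object
class FakeConnection
  attr_reader :sent

  def initialize
    @sent = []
  end

  def send(message)
    @sent << message
  end
end

pool  = SocketPool.new
alice = FakeConnection.new
pool.add('alice', alice)
pool.push('alice', 'Hello')
```

The code calling push() never finds out which transport is behind the connection, so you can swap one for the other without touching it.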

The Node API to this is very similar and both are fully documented on GitHub: Ruby docs, Node docs.

The final big change is that the Ruby version now works under a broader range of web servers; it now supports Rack apps running under Thin, Rainbows and Goliath. Hopefully, providing a portable socket implementation that’s easy to drop into any Rack app will open up possibilities for more portable async app frameworks, decoupling the application from the network transport just as Rack did for HTTP. faye-websocket extracted its initial Thin and Rainbows adapters from the Cramp project, and there’s a chance Cramp can now remove its WebSocket code and rely on a simple abstraction for binding the app framework to the web.

As usual, download from Rubygems and npm, and get on the Faye mailing list if you have problems.

Black-box criteria

Tim Bray recently published an article called Type-System Criteria, in which he makes the argument that Java, and statically-typed languages in general, are better suited to mobile development than the dynamically-typed languages that are more prevalent in web development circles. The reason he gives for this boils down to API surface size:

Another observation that I think is partially but not entirely a consequence of API scale is testing difficulty. In my experience it’s pretty easy and straightforward to unit-test Web Apps. There aren’t that many APIs to mock out, and at the end of the day, these things take data in off the wire and emit other data down the wire and are thus tractable to black-box, in whole or in part.

On the other hand, I’ve found that testing mobile apps is a major pain in the ass. I think the big reason is all those APIs. Your average method in a mobile app responds to an event and twiddles APIs in the mobile framework. If you test at all completely you end up with this huge tangle of mocks that pretty soon start getting in the way of seeing what’s actually going on.

The argument goes that, as the API surface you need to integrate with becomes larger, static type systems become more attractive. I don’t disagree, in part because I don’t have nearly enough experience with static languages to have an informed opinion on them. But at a gut level I believe this to be true; in fact, I’d be willing to bet that a majority of the bugs I’ve written while refactoring software could have been caught by a static type checker (and not even a very sophisticated one, at that).

But the excerpt I quoted above contains a code smell, and it points to another reason why mobile development is difficult. It’s not the size of the APIs that’s the big problem: it’s the nature of the application.

Web application servers are comparatively easy to test because the tests can be written by talking to an encapsulated black box. You throw a request (or several) at a web server, you read what comes back, and check it looks like what you expected. On the other hand, testing web application clients is much more complex: instead of doing simple call/response testing, you have to initiate events within the application’s environment, and then monitor changes to that environment that you expect the events to cause. The core difference here is that client-side programs tend to be what I’m going to refer to as ‘stateful user interfaces’, and mobile (and desktop) software falls into the same category.

What exactly do I mean by ‘stateful user interface’? When you call a web server, you don’t need to hold onto any state on your end: you ask the server a question by sending it a request, and it sends back a fully-formed, self-contained response. When you’ve checked that response, you throw it away and start the next test. In contrast, stateful user interfaces are long-running processes in which incremental changes are made to what the user sees. Instead of getting a fresh new page, just a part of the view is changed, or a sound is emitted, or a notification generated, or a vibration initiated. The programming paradigm in a server environment emphasises call/response, statelessness and immutability; in a client environment you have side effects, state and incremental change. Testing in such environments is hard.

I think this, rather than large API surface, is the real problem. Large API surfaces are only a problem if your application code talks to them directly, and this is much more common in side-effect-heavy applications. Unit tests in these environments tend to be messy for several reasons:

  • Application code responds to events triggered by the host environment
  • Business logic produces its output by modifying the host environment rather than returning values
  • It is hard or impossible to reset the environment to a clean state between tests

The third reason is a particular problem when unit testing client-side JavaScript, and I’ve seen plenty of tests where the state of the page or the implementation of event listeners is such that it becomes very difficult to keep each test independent of the others. You also have the problem that anything that causes a page refresh will cause your test runner to vanish. (I wrote about this exact problem in Refactoring towards testable JavaScript.)

So if side-effect-heavy programs cause large API surfaces to be a problem, what should we do about it? The answer comes down to something I think of as ‘avoiding framework-isms’. This means that any time you have a framework or host environment in which user input or third-party code drives your application, the sooner you can dispatch to something you control the better. The classic example of this is the ‘fat model, skinny controller’ mantra popular in the Rails community: rather than dump lots of code in a controller that’s only invoked by the host server and framework, turn the request into calls to models. This way, the bulk of the logic is in objects that you control the interface to, and that are easy to create and manipulate, properties that also make them easy to test.

In client-side JavaScript and other stateful user interfaces, this means keeping event listeners small. Ideally an event listener should extract all the necessary data from the event and the current application state, and use this to make a black-box call to a module containing the real business logic. It means making sure orthogonal components of a user interface do not talk to each other directly, but publish data changes via a message bus. And it means writing business logic that returns results rather than causes side-effects; the side-effects again being dealt with by thin bindings to the host environment.

I’ll finish up with a small but illustrative example. Say you’re writing a WebSocket implementation, and the protocol mandates that when you call socket.send('Hello, world!') then the bytes 81 8d ed a3 88 c3 a5 c6 e4 af 82 8f a8 b4 82 d1 e4 a7 cc should be written to the TCP socket. You could write a test for it by mocking out the whole network stack (which I’ve probably glossed over considerably here):

describe WebSocket do
  before do
    @tcp_socket = mock('TCP socket')
    TCP.should_receive(:connect).with('example.com', 80).and_return @tcp_socket
    @web_socket = WebSocket.new('ws://example.com/')
  end
  
  it "writes a message to the socket" do
    @tcp_socket.should_receive(:write).with [0x81, 0x8d, 0xed, 0xa3, 0x88, 0xc3, 0xa5, 0xc6, 0xe4, 0xaf, 0x82, 0x8f, 0xa8, 0xb4, 0x82, 0xd1, 0xe4, 0xa7, 0xcc]
    @web_socket.send("Hello, world!")
  end
  
  # More mock-based protocol tests...
end

Or you could test it by implementing a pure function that turns text into WebSocket frames, leaving the code that actually deals with networking doing only that and nothing else:

describe WebSocket::Parser do
  before do
    @parser = WebSocket::Parser.new
  end
  
  it "turns text into message frames" do
    @parser.frame("Hello, world!").should == [0x81, 0x8d, 0xed, 0xa3, 0x88, 0xc3, 0xa5, 0xc6, 0xe4, 0xaf, 0x82, 0x8f, 0xa8, 0xb4, 0x82, 0xd1, 0xe4, 0xa7, 0xcc]
  end
  
  # More protocol implementation tests...
end

describe WebSocket do
  before do
    @tcp_socket = mock('TCP socket')
    TCP.should_receive(:connect).with('example.com', 80).and_return @tcp_socket
    
    @parser = mock('parser')
    WebSocket::Parser.should_receive(:new).and_return @parser
    
    @web_socket = WebSocket.new('ws://example.com/')
  end
  
  it "converts text to frames and sends them" do
    frame = mock('frame')
    @parser.should_receive(:frame).with("Hello, world!").and_return frame
    @tcp_socket.should_receive(:write).with(frame)
    @web_socket.send("Hello, world!")
  end
  
  # And we're done here
end

This separates the business logic (implementing the WebSocket protocol) away from the side effects to the host environment (writing to network connections). This results in code that’s more modular, much easier to test, and less coupled to the API surface of the host environment. If a static type system helps you with that then have at it, but recognize when it’s a symptom of a deeper problem.
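For the curious, here is a minimal sketch of what such a pure framing function might look like for short text messages. The text_frame name and the explicit mask_key argument are assumptions of mine: client-to-server frames are masked with a randomly chosen 4-byte key, so to keep the function pure (and the test deterministic) this sketch takes the key as a parameter rather than generating it. It only handles payloads of up to 125 bytes.

```ruby
# Hypothetical sketch of a pure framing function. The name and the explicit
# mask_key argument are assumptions made for illustration.
def text_frame(text, mask_key)
  payload = text.encode('UTF-8').bytes
  raise ArgumentError, 'payload too long for this sketch' if payload.length > 125
  raise ArgumentError, 'mask key must be 4 bytes' unless mask_key.length == 4

  header = [
    0x80 | 0x01,           # FIN bit set, opcode 0x1 (text frame)
    0x80 | payload.length  # MASK bit set, 7-bit payload length
  ]

  # Each payload byte is XORed with the mask key, cycling through its 4 bytes
  masked = payload.each_with_index.map { |byte, i| byte ^ mask_key[i % 4] }

  header + mask_key + masked
end

frame = text_frame('Hello, world!', [0xed, 0xa3, 0x88, 0xc3])
puts frame.map { |byte| '%02x' % byte }.join(' ')
# prints: 81 8d ed a3 88 c3 a5 c6 e4 af 82 8f a8 b4 82 d1 e4 a7 cc
```

Nothing here touches a socket, so testing it needs no mocks at all: you pass values in and compare what comes out.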

faye-websocket 0.2: big performance boost, and subprotocol support

I’ve just released the 0.2 version of faye-websocket for Ruby and Node. This release benefits from the fact that the SockJS project is now using faye-websocket to handle WebSocket connections; my thanks to them for finding the performance bugs and missing features that went into making this release.

The biggest difference in this release is performance. In 0.1, the Node version had a rather interesting performance profile and was pretty slow. We’ve now made a bunch of optimisations that give it a more predictable performance profile across message sizes and increase performance across the range by orders of magnitude.

The following benchmarks were produced using the ws client to send 1000 messages of various sizes to an echo server:

Benchmarks for Node 0.6.6

On Ruby, the change is not so dramatic but for most message sizes performance is improved by at least a factor of 2. This is in part achieved by writing part of the parser in C; this being my first Ruby C extension there may be problems with it so please get on the mailing list if you find any.

Benchmarks for Ruby 1.9.3

There is one new feature in the form of Sec-WebSocket-Protocol support. With the latest WebSocket protocol, the one currently shipping in Chrome and Firefox, you can specify which application protocol(s) you want to use over the socket, for example:

// client-side
var ws = new WebSocket('ws://example.com/', ['irc', 'xmpp']);

On the server side, you can specify which protocols the server supports and the first of these that matches a protocol supported by the client will be selected and sent back to the client as part of the handshake.

// server-side
var ws = new WebSocket(request, socket, head, ['bayeux', 'irc']);
ws.protocol // -> 'irc'
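The selection rule itself is tiny. As a sketch in Ruby, assuming the client’s protocols arrive as a comma-separated header value (select_protocol is a hypothetical helper written for illustration, not the library’s actual code):

```ruby
# Hypothetical helper showing the selection rule; not faye-websocket's code.
# Returns the first server protocol the client also offers, or nil if none match.
def select_protocol(server_protocols, client_protocols)
  server_protocols.find { |protocol| client_protocols.include?(protocol) }
end

# Value of the client's Sec-WebSocket-Protocol header
client_protocols = 'irc, xmpp'.split(/\s*,\s*/)

puts select_protocol(['bayeux', 'irc'], client_protocols)
# prints: irc
```

Because the server’s list is searched in order, the server’s preference wins when the client offers more than one protocol that both sides support.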

I’m really pleased with the progress on this project since decoupling it from Faye, and stoked that SockJS has adopted it. They’ve helped me improve things a great deal just in the last few days and it’s great to know these changes will go into making Faye faster.