Faye 0.8.4: more efficient socket connections

Update: Version 0.8.5 was released shortly afterward to fix a URL parsing bug in this release.

I just released Faye 0.8.4, a drop-in replacement for previous releases. It includes various little fixes, including working around iOS’s new POST-caching bug, making sure JSON-P requests don’t exceed URL size limits, checking EventSource actually works to detect broken releases of Opera, and fixing relative URL resolution in Internet Explorer. But the biggest improvement is in how it negotiates which transport to use. TL;DR: it now makes half as many connections to the server to establish a WebSocket connection. Read on for more detail.

One responsibility of the Bayeux protocol on which Faye is based is figuring out which transport type to use between the client and the server. It does this using an upgrading process as follows. To initiate a connection, the client uses a vanilla HTTP request to send a message on the /meta/handshake channel; in the browser this is done using XMLHttpRequest for same-origin requests and JSON-P otherwise. The server’s response to this message includes two important things: a randomly generated client ID, and a list of transport types supported by the server.

$ curl -X POST http://localhost/bayeux /
    -H 'Content-Type: application/json' \
    -d '{"channel": "/meta/handshake", "supportedConnectionTypes": ["long-polling", "websocket"], "version": "1.0"}'

{
  "channel":    "/meta/handshake",
  "successful": true,
  "version":    "1.0",
  "clientId":   "irta1b0nh93z90baok4b1gbcw3p1emmcj50",
  "supportedConnectionTypes":[
    "long-polling",
    "cross-origin-long-polling",
    "callback-polling",
    "websocket",
    "eventsource",
    "in-process"
  ]
}

Once it knows what the server supports, the client can pick a new transport. But, it can’t just take this list at face value: even if the server supports WebSockets and the client has a WebSocket object available, the intervening network, proxies and so on may break the connection. So, the client needs to begin trying connections to find out which transports actually work before upgrading from the vanilla HTTP transport it is using. This testing is asynchronous, so in the meantime the client continues to use the vanilla transport.

In practise this results in the following sequence of events:

  1. Select a vanilla transport, either long-polling (XHR) or callback-polling (JSON-P)
  2. Send the /meta/handshake message to the server
  3. Receive a response, store client ID and list of supported transports
  4. Begin testing WebSocket and EventSource connections in the background
  5. Begin sending publish and subscribe messages using the original transport
  6. Eventually step 4 completes and the transport is upgraded

This typically results in four connections to the server during set-up, in the best case where WebSocket works:

  1. First POST request sending handshake message
  2. Trial WebSocket connection during transport selection
  3. Second POST request to send subscriptions and begin polling
  4. Second WebSocket connection used to actually send messages

Faye 0.8.4 improves this in two ways. First, it begins testing all the transports it has available before it knows what the server supports. This means a WebSocket connection is opened even before the handshake message is sent over HTTP. Second, it caches the connection it uses to test the transport, and reuses it for sending messages. So, in the best case, by the time the server’s handshake response arrives we already have a WebSocket open and can begin using it, so we establish a connection with only two connections.

You may ask, since we’re proactively testing whether WebSocket works, why can’t we rely on that and send the initial handshake over a socket connection, taking the connection count down to one? Well, in the case where WebSockets don’t work, because either the server or an intervening proxy doesn’t support it, it can take a long time to find out that things are broken. In some cases, you don’t get an error event from the WebSocket client until the TCP connection times out, and this can take a long time. This is what the CometD client does, and it renders the client unusable for the first 60 seconds because it tries to send the handshake over an unresponsive socket. So while in the best case you use one less connection, if there’s any problem the client degrades horribly. (It also doesn’t bother testing the connection first, it just assumes that the availability of WebSocket means everything’s fine, which futher compounds its responsiveness problem.)

By using an upgrade strategy and testing transports in the background, it means the client always has a transport it can use to send and receive messages, with no interruption in service. The improved responsiveness when there are problems is easily worth that one extra request in the best case.

One final thing to mention about this release is that I’ve finally written up a guide to securing Faye and other socket-based applications, which includes authenticating both publish and subscribe access, and preventing CSRF and XSS attacks. I’ve decided that educating people about this is better than providing canned extensions for this, since different applications do require different things. If you have experience with socket security and want to contribute, just send me a pull request.

JS.Class 3.0.8: source maps, prototype stubs, and async error catching

I don’t usually blog point releases, but JS.Class releases tend to be infrequent these days, and mostly polish what’s there rather than significantly changing things. This release is no different, but the few changes it contains make it significantly more usable.

First, it now (like everything I ship for the browser) comes with source maps. Thanks to Jake, this was a simple configuration change.

Second, it fixes a bug in the stubbing library that means instance methods on prototypes can now be stubbed. For example, I was recently writing some new tests for Songkick’s Spotify app, which we run these tests in Chrome. (Being based on WebKit, the Spotify platform is close enough that you can write useful unit tests and run them in Chrome or v8/Node.) Spotify adds some methods to built-in prototypes though, and our code relies on them, so they need to be present while running tests in Chrome. I could just implement them globally, but there are other use cases where you just need to stub a method on all instances of a class during one test. So, this now works, and the stub is removed (and verified if it’s a mock) at the end of the test:

stub(String.prototype, "decodeForText", function() { return this.valueOf() })
"foo".decodeForText() // -> "foo"

Finally, I’ve fixed a major issue that’s been bugging me with JS.Test. As I’ve done more projects on Node.js, I’ve found that it’s way too easy to crash the test run completely because an error was thrown in a section of async code. Because it’s outside the test framework’s call stack, it doesn’t get caught, and Node just bails out:

$ npm test

> restore@0.1.0 test /home/james/projects/restore
> node spec/runner.js

Loaded suite WebFinger, OAuth, Storage, Stores, File store

Started
..
/home/james/projects/restore/node_modules/jsclass/src/test.js:1899
          throw new JS.Test.Mocking.UnexpectedCallError(message);
                ^
Error: <store> received call to authorize() with unexpected arguments:
( "the_client_id", "zebcoe", { "the_scope": [ "r", "w" ] }, #function )
npm ERR! Test failed.  See above for more details.
npm ERR! not ok code 0

This happens most often because I have a test that uses mocks, for example when I send a certain request to a server, I expect the server to tell the underlying model to do something.

it("authorizes the client", function() { with(this) {
  expect(store, "authorize").given("the_client_id", "zebcoe", {the_scope: ["r", "w"]}).yielding([null, "a_token"])
  http_post("/auth", auth_params)
}})

When I change the mock expectation this makes previously working code call a method with unexpected arguments, which throws an error, and because the HTTP request is processed asynchronously, the error is not caught. But it also happens for all sorts of other reasons, for example you have code that calls fs.readFile(), then processes the contents before calling you back – if the pre-processing fails, the error crashes the process.

Well now this error gets caught, so you get useful feedback from your tests when these types of errors happen:

$ npm test

> restore@0.1.0 test /home/james/projects/restore
> node spec/runner.js

Loaded suite WebFinger, OAuth, Storage, Stores, File store

Started
..E...............................................

1) Error:
OAuth with valid login credentials authorizes the client:
Error: <store> received call to authorize() with unexpected arguments:
( "the_client_id", "zebcoe", { "the_scope": [ "r", "w" ] }, #function )

Finished in 0.851 seconds
50 tests, 111 assertions, 0 failures, 1 errors

npm ERR! Test failed.  See above for more details.
npm ERR! not ok code 0

Now the error is caught, the tests all finish, and you get a clear report about which test caused the error.

This functionality is supported on Node.js and in the browser. As far as I know (and I’ve tried a lot of different frameworks) the only other test frameworks that do this are Mocha and Buster. If you have a similar problem, you can catch uncaught errors like this:

// Node.js
process.addListener('uncaughtException', function(error) {
  // handle error
});

// Browsers
window.addEventListener('error', function(event) {
  // handle event
}, false);

On Node, this is particularly useful for stopping servers crashing in case of an error. In the browser, it’s mostly useful for reporting, only because the argument to the callback is a DOM event rather than an exception object, the information you can get out of it tends to be lacking. Note that for old IEs you’ll need to use window.attachEvent('onerror'), and Opera only supports catching these errors with window.onerror.

While researching this, I was really surprised to see how many very widely used frameworks don’t do this. The best alternative I’ve seen is in Jasmine: this example does not report the async error, but the test times out because it is never resumed. jasmine-node doesn’t catch this at all, which is why the test is only run if global.window exists. I’ve seen several other frameworks that either crash on Node, or in the browser simply stop updating the view, giving you no feedback that the test runner has halted without running all the tests.

Since most frameworks don’t catch these errors, I would assume this isn’t actually a problem for most people. Is this true? I’d like to know how other people deal with this situation.

If you want to give JS.Class a go, just run npm install jsclass or download it from the website.

Your first Ruby native extension: Java

In my last post, I covered how to get started writing code in C and wiring it up to your Ruby code. While that code technically will work in JRuby, it’s preferred to write native JRuby extensions in Java rather than in C. In this article we’ll add JRuby support to our gem.

Let’s start with the Java code itself. Just like we did in C, we need to define the business logic itself as a Java method, and some glue code to set up modules and expose the methods to the Ruby runtime. The JRuby APIs are fairly easy to google, but getting things wired up correctly is quite challenging if you’re new to Java. The slightly obtuse-looking wiring here mostly concerns getting a reference to the current JRuby runtime, which parts of the API need (e.g. see RubyArray.newArray()).

Here’s the required code, which we put in ext/faye_websocket/FayeWebSocketService.java:

package com.jcoglan.faye;

import java.lang.Long;
import java.io.IOException;

import org.jruby.Ruby;
import org.jruby.RubyArray;
import org.jruby.RubyClass;
import org.jruby.RubyFixnum;
import org.jruby.RubyModule;
import org.jruby.RubyObject;
import org.jruby.anno.JRubyMethod;
import org.jruby.runtime.ObjectAllocator;
import org.jruby.runtime.ThreadContext;
import org.jruby.runtime.builtin.IRubyObject;
import org.jruby.runtime.load.BasicLibraryService;

public class FayeWebSocketService implements BasicLibraryService {
  private Ruby runtime;
  
  // Initial setup function. Takes a reference to the current JRuby runtime and
  // sets up our modules. For JRuby, we will define mask() as an instance method
  // on a specially created class, Faye::WebSocketMask.
  public boolean basicLoad(Ruby runtime) throws IOException {
    this.runtime = runtime;
    RubyModule faye = runtime.defineModule("Faye");
    
    // Create the WebSocketMask class. defineClassUnder() takes a name, a
    // reference to the superclass -- runtime.getObject() gets you the Object
    // class for the current runtime -- and an allocator function that says
    // which Java object to constuct when you call new() on the class.
    RubyClass webSocket = faye.defineClassUnder("WebSocketMask", runtime.getObject(), new ObjectAllocator() {
      public IRubyObject allocate(Ruby runtime, RubyClass rubyClass) {
        return new WebSocket(runtime, rubyClass);
      }
    });
    
    webSocket.defineAnnotatedMethods(WebSocket.class);
    return true;
  }
  
  // The Java class that backs the Ruby class Faye::WebSocketMask. Its methods
  // annotated with @JRubyMethod become exposed as instance methods on the Ruby
  // class through the call to defineAnnotatedMethods() above.
  public class WebSocket extends RubyObject {
    public WebSocket(final Ruby runtime, RubyClass rubyClass) {
      super(runtime, rubyClass);
    }
    
    @JRubyMethod
    public IRubyObject mask(ThreadContext context, IRubyObject payload, IRubyObject mask) {
      int n = ((RubyArray)payload).getLength(), i;
      long p, m;
      RubyArray unmasked = RubyArray.newArray(runtime, n);
      
      long[] maskArray = {
        (Long)((RubyArray)mask).get(0),
        (Long)((RubyArray)mask).get(1),
        (Long)((RubyArray)mask).get(2),
        (Long)((RubyArray)mask).get(3)
      };
      
      for (i = 0; i < n; i++) {
        p = (Long)((RubyArray)payload).get(i);
        m = maskArray[i % 4];
        unmasked.set(i, p ^ m);
      }
      return unmasked;
    }
  }
}

There’s another strategy you can use for JRuby, which is more like FFI: you define the logic you want in pure Java, and then tell JRuby to expose that to Ruby, mapping data types between the two runtimes. I tried that for this problem and it ended up being slower, so I went with the approach above which uses the JRuby APIs directly.

Next thing we need is code to compile it. Find the Rakefile from the C example and modify it like so: we need to define a different compiler task based on which runtime we’re using.

# Rakefile

spec = Gem::Specification.load('faye-websocket.gemspec')

if RUBY_PLATFORM =~ /java/
  require 'rake/javaextensiontask'
  Rake::JavaExtensionTask.new('faye_websocket', spec)
else
  require 'rake/extensiontask'
  Rake::ExtensionTask.new('faye_websocket', spec)
end

You should be able to run rake compile on JRuby and it should just work. It will create a new file in lib called faye_websocket.jar, which is the compiled Java bytecode package.

We also need a little more glue to load this code on JRuby, and we need some additional logic to map our intended single method WebSocket.mask() onto the Java method WebSocketMask#mask(). Create the file lib/faye.rb with this content:

# lib/faye.rb

# This loads either faye_websocket.so, faye_websocket.bundle or
# faye_websocket.jar, depending on your Ruby platform and OS
require File.expand_path('../faye_websocket', __FILE__)

module Faye
  module WebSocket
    
    if RUBY_PLATFORM =~ /java/
      require 'jruby'
      com.jcoglan.faye.FayeWebSocketService.new.basicLoad(JRuby.runtime)
      
      def self.mask(*args)
        @mask ||= WebSocketMask.new
        @mask.mask(*args)
      end
    end
    
  end
end

As you can see, we need to use the JRuby API to load the extension when running on JRuby. This code will load the native code and then add any glue we need to make everything work correctly. If you run rake compile && irb -r ./lib/faye on either MRI or JRuby you’ll find that the Faye::WebSocket.mask() method works as expected.

Finally there’s the question of packaging. Unlike compiled C code, compiled Java code is portable to any JVM, so JRuby extensions are not compiled on site. Instead, you put the .jar file in your gem. Your gemspec needs to tell Rubygems to compile on site for MRI, but include the .jar for JRuby.

# faye-webosocket.gemspec

Gem::Specification.new do |s|
  s.name    = "faye-websocket"
  s.version = "0.4.0"
  s.summary = "WebSockets for Ruby"
  s.author  = "James Coglan"
  
  files = Dir.glob("ext/**/*.{c,java,rb}") +
          Dir.glob("lib/**/*.rb")
  
  if RUBY_PLATFORM =~ /java/
    s.platform = "java"
    files << "lib/faye_websocket.jar"
  else
    s.extensions << "ext/faye_websocket/extconf.rb"
  end
  
  s.files = files
  
  s.add_development_dependency "rake-compiler"
end

I’ve put "lib/faye_websocket.jar" as an explicit addition rather than including it in the Dir.glob for two reasons: you don’t want it in the MRI gem, and you want gem build to fail if the file is missing when building on JRuby, which can easily happen if you forget.

Now the final part, which isn’t obvious the first time you do this: you need to release two gems, one generic one and one for JRuby. The s.platform = "java" line means you’ll get a different gem file when building on JRuby. All you need to do is build the gem on two different platforms, and remember to compile before building the JRuby gems.

$ rbenv shell 1.9.3-p194 
$ gem build faye-websocket.gemspec 
  Successfully built RubyGem
  Name: faye-websocket
  Version: 0.4.0
  File: faye-websocket-0.4.0.gem

$ rbenv shell jruby-1.6.7 
$ rake compile
install -c tmp/java/faye_websocket/faye_websocket.jar lib/faye_websocket.jar
$ gem build faye-websocket.gemspec 
  Successfully built RubyGem
  Name: faye-websocket
  Version: 0.4.0
  File: faye-websocket-0.4.0-java.gem

Just push those two gem files to Rubygems.org, and you’re done.

Your first Ruby native extension: C

A few months back I released faye-websocket 0.4, my first gem that contained native code. After a few mis-step releases I got a working build for MRI and JRuby, but getting there was a little tricky. What follows is a quick how-to from someone who knows barely any C or Java, to explain how to wire your code up for release.

This only covers one possible use case for native code: rewriting a pure function in native code to make it faster. It does not cover binding to native libraries, or FFI, just writing some vanilla C/Java to make some hot code faster. In faye-websocket’s case, this is the function in question:

bytes = [72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]
mask  = [23, 142, 94, 24]

Faye::WebSocket.mask(bytes, mask)
# => [95, 235, 50, 116, 120, 162, 126, 111, 120, 252, 50, 124, 54]

It takes an arbitrary list of bytes, and a list of four bytes, and XORs the first set using the second set (this is part of how data is encoded in WebSocket frames). You’d implement it in Ruby like this:

def mask(payload, mask)
  result = []
  payload.each_with_index do |byte, i|
    result[i] = byte ^ mask[i % 4]
  end
  result
end

It turns out Ruby’s quite slow at doing this, and you get a big performance boost by writing in C. Remember that what we want to do is define a singleton method called mask(payload, mask) on the module Faye::WebSocket. I’ll just show you all the code for doing that in C, which we’re going to save in ext/faye_websocket/faye_websocket.c. All I know about the Ruby C APIs I got through Google, and I’ve annotated this code with some useful tips.

// ext/faye_websocket/faye_websocket.c

#include <ruby.h>

// Allocate two VALUE variables to hold the modules we'll create. Ruby values
// are all of type VALUE. Qnil is the C representation of Ruby's nil.
VALUE Faye = Qnil;
VALUE FayeWebSocket = Qnil;

// Declare a couple of functions. The first is initialization code that runs
// when this file is loaded, and the second is the actual business logic we're
// implementing.
void Init_faye_websocket();
VALUE method_faye_websocket_mask(VALUE self, VALUE payload, VALUE mask);

// Initial setup function, takes no arguments and returns nothing. Some API
// notes:
// 
// * rb_define_module() creates and returns a top-level module by name
// 
// * rb_define_module_under() takes a module and a name, and creates a new
//   module within the given one
// 
// * rb_define_singleton_method() take a module, the method name, a reference to
//   a C function, and the method's arity, and exposes the C function as a
//   single method on the given module
// 
void Init_faye_websocket() {
  Faye = rb_define_module("Faye");
  FayeWebSocket = rb_define_module_under(Faye, "WebSocket");
  rb_define_singleton_method(FayeWebSocket, "mask", method_faye_websocket_mask, 2);
}

// The business logic -- this is the function we're exposing to Ruby. It returns
// a Ruby VALUE, and takes three VALUE arguments: the receiver object, and the
// method parameters. Notes on APIs used here:
// 
// * RARRAY_LEN(VALUE) returns the length of a Ruby array object
// * rb_ary_new2(int) creates a new Ruby array with the given length
// * rb_ary_entry(VALUE, int) returns the nth element of a Ruby array
// * NUM2INT converts a Ruby Fixnum object to a C int
// * INT2NUM converts a C int to a Ruby Fixnum object
// * rb_ary_store(VALUE, int, VALUE) sets the nth element of a Ruby array
// 
VALUE method_faye_websocket_mask(VALUE self, VALUE payload, VALUE mask) {
  int n = RARRAY_LEN(payload), i, p, m;
  VALUE unmasked = rb_ary_new2(n);
  
  int mask_array[] = {
    NUM2INT(rb_ary_entry(mask, 0)),
    NUM2INT(rb_ary_entry(mask, 1)),
    NUM2INT(rb_ary_entry(mask, 2)),
    NUM2INT(rb_ary_entry(mask, 3))
  };
  
  for (i = 0; i < n; i++) {
    p = NUM2INT(rb_ary_entry(payload, i));
    m = mask_array[i % 4];
    rb_ary_store(unmasked, i, INT2NUM(p ^ m));
  }
  return unmasked;
}

Now we’ve got our C code done, we need some glue to compile it and load it from Ruby. I use rake-compiler for this. Your project needs an extconf.rb, in the same directory as the C code:

# ext/faye_websocket/extconf.rb

require 'mkmf'
extension_name = 'faye_websocket'
dir_config(extension_name)
create_makefile(extension_name)

Now let’s make a skeleton gemspec and Rakefile for the project. For a C extension, you ship the source code as part of the gem and it gets compiled on site using the extensions field from the gemspec.

# faye-websocket.gemspec

Gem::Specification.new do |s|
  s.name    = "faye-websocket"
  s.version = "0.4.0"
  s.summary = "WebSockets for Ruby"
  s.author  = "James Coglan"
  
  s.files = Dir.glob("ext/**/*.{c,rb}") +
            Dir.glob("lib/**/*.rb")
  
  s.extensions << "ext/faye_websocket/extconf.rb"
  
  s.add_development_dependency "rake-compiler"
end
# Rakefile

require 'rake/extensiontask'
spec = Gem::Specification.load('faye-websocket.gemspec')
Rake::ExtensionTask.new('faye_websocket', spec)

So the project looks like this at this point:

ext/
    faye_websocket/
        extconf.rb
        faye_websocket.c
faye-websocket.gemspec
Rakefile

This is now enough to run rake compile, wherein rake-compiler works its magic. This will create a Makefile, and some *.o and *.so files. You should not check these into git, or ship them as part of the gem – these compilation artifacts are created on install.

$ rake compile
mkdir -p lib
mkdir -p tmp/x86_64-linux/faye_websocket/1.9.3
cd tmp/x86_64-linux/faye_websocket/1.9.3
/home/james/.rbenv/versions/1.9.3-p194/bin/ruby -I. ../../../../ext/faye_websocket/extconf.rb
creating Makefile
cd -
cd tmp/x86_64-linux/faye_websocket/1.9.3
make
compiling ../../../../ext/faye_websocket/faye_websocket.c
linking shared-object faye_websocket.so
cd -
install -c tmp/x86_64-linux/faye_websocket/1.9.3/faye_websocket.so lib/faye_websocket.so

The file lib/faye_websocket.so is directly loadable from Ruby – let’s try it out:

$ irb -r ./lib/faye_websocket
>> Faye::WebSocket.mask [1,2,3,4], [5,6,7,8]
=> [4, 4, 4, 12]

So the final part of the process is to load this file from our main library code, for example:

# lib/faye/websocket.rb

require File.expand_path('../../faye_websocket', __FILE__)

module Faye
  module WebSocket
    # all your Ruby logic
  end
end

And that’s all there is to it. Just a few points to remember:

  • By convention, rake-compiler expects extensions to be in ext/
  • Make sure the C source and extconf.rb are included in the gemspec
  • Don’t put compilation output in the gem or in source control
  • Remember to recompile between editing C code and running tests

In the next article I’ll show you how to add JRuby support to your native gem.

Announcing Canopy, a Treetop-like PEG compiler for JavaScript

This is very brief announcement to say that I’ve just released a new PEG parser-compiler for JavaScript, called Canopy. It generates fast, self-contained parser modules from grammar definition files and runs in all the major browsers and on CommonJS platforms including Node.js, Narwhal and RingoJS.

Why do we need another PEG compiler? We don’t. PEG.js is excellent, actively used and maintained and has a really nice website. Canopy exists mostly for reasons of personal taste: I wanted something more akin to Treetop, which lets you keep the grammar definition separate from any methods you add to the parse tree. In fact, Canopy goes further than this: you have to keep them separate, you cannot have inline JavaScript in the grammar files. JavaScript goes in JavaScript files with all your other JavaScript.

It’s taken rather a while to release. I initially wrote it as part of a huge yak-shaving exercise: I was trying to get Terminus to work in IE, which doesn’t support the document.evaluate() API. I thought it would be a fun idea to reimplement it, so I created a project called Pathology, an ad-hoc, informally specified, bug-ridden slow implementation of half of XPath for IE. Of course this meant parsing XPath queries, which I wasn’t going to do by hand, so I thought it would even more fun to learn how parser compilers work and build one. Pathology never really panned out, although it turns out Android browsers don’t have XPath either so I might revive it, you never know.

I also used Canopy to build Fargo, my fiber-aware version of Scheme. It’s great for getting a new language off the ground quickly.

So, after two years of off-and-on development, and after some interest from a few other people, a couple of months ago I finally got around to documenting it, getting rid of some annoying dependencies (the parsers it generates are now completely self-contained and work on lots of JS platforms), and testing it properly. It’s available by running npm install -g canopy, or from the website, along with the documentation. Let me know what you think.

Designing Vault’s generator algorithm

A few weeks ago now, I released Vault, a stateless password generator. It took about six months to develop, which sounds ridiculous for something that’s only a couple hundred lines of JavaScript, but during that time I learned a fair bit about cryptography that changed the generator’s design, and I’d like to discuss that here.

Vault’s password generation algorithm went through about four distinct iterations between the 0.1 release and the for-public-consumption 0.2 version. The strategy throughout could be summarized as:

  • Take two values from the user and make a hash (A) out of them
  • Construct a character set (B) acceptable by the target site
  • Encode the hash A using the character set B

However the details of how you do each of these things affect the security of the system in important ways, as we’ll see.

The inputs to the process are:

  • The phrase: a master passphrase the person uses to generate all her passwords
  • The service: the name of the site the person is logging into
  • Optionally, a set of character constraints on the generated password, e.g. ‘must not contain spaces’, or ‘must have at least two uppercase letters’

The initial design worked like this: you start by calculating the SHA-256 hash of a combination of the phrase, service and a UUID that’s fixed in Vault’s codebase. (Vault uses either crypto-js or Node’s crypto module, depending on where it’s running.)

var hash = CryptoJS.SHA256(phrase + Vault.UUID + service);

Let’s take a concrete example:

var phrase  = 'you look nice today',
    service = 'gmail';

var hash = CryptoJS.SHA256(phrase + Vault.UUID + service);
// -> 'af7382c35e42ae2a5599a497d69ee5484559925265e609d69dc5026a4a214d90'

Then you turn this hash into binary form, like so:

Vault.toBits = function(digit) {
  var string = parseInt(digit, 16).toString(2);
  while (string.length < 4) string = '0' + string;
  return string;
};

var binary = hash.split('').map(Vault.toBits).join('');
// -> '1010111101110011100000101100001101011110010000101010111000101010 ...'

So now you have a big blob of random-looking data based on the phrase and service, and you need to encode this in a format acceptable to the site you’re logging into. This means picking an appropriate character set. Vault has a collection of different types of characters (these are from the stable 0.2 release):

Vault.LOWER     = 'abcdefghijklmnopqrstuvwxyz'.split('');
Vault.UPPER     = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.split('');
Vault.ALPHA     = Vault.LOWER.concat(Vault.UPPER);
Vault.NUMBER    = '0123456789'.split('');
Vault.ALPHANUM  = Vault.ALPHA.concat(Vault.NUMBER);
Vault.SPACE     = [' '];
Vault.DASH      = ['-', '_'];
Vault.SYMBOL    = '!"#$%&\'()*+,./:;< =>?@[\\]^{|}~'.split('').concat(Vault.DASH);
Vault.ALL       = Vault.ALPHANUM.concat(Vault.SPACE).concat(Vault.SYMBOL);

Vault.ALL ends up having 94 unique characters in it: all the printable ASCII characters, except for the backtick which I couldn’t find on my Android keyboard, so I thought it might cause problems. In the simplest case, you just encode the bits of the hash using Vault.ALL, which means selecting characters from a set of 94.

To select a number, we first figure out how many bits we need.

Math.ceil(Math.log(94)/Math.log(2))
// -> 7

So we shift 7 bits off the front of our hash, and see what number it gives us. In this case, the first 7 bits are 1010111, which is 87. Vault.ALL[87] is ^, so ^ is the first character of our password. The next 7 bits are 1011100 which is 92, so our password is now ^-. We continue this until we have a password of the desired length.

Obviously, some strings of 7 bits give values outside the range 0–93. I’ll come to this later.

But this system does not allow for any constraints. Say we want the password to contain no spaces and at least 2 uppercase letters, which we represent like this:

var constraints = {space: 0, upper: 2};

We need to force the generator to obey these constraints in a secure, deterministic way. This means the ordering needs to appear random when sampled over many phrase/service pairs (i.e. the uppercase letters must not always appear in the same place), but must always be the same for a given pair. We’ve already got a deterministic random number: our binary hash. We can use it both to select characters, and select which set to draw each character from.

Let’s say we want an 8-character password with the above constraints. We start by making an empty list and a copy of Vault.ALL:

 var required = [],
     allowed  = Vault.ALL.slice();

required will be a list of character sets as long as the required password. In order to make sure our password gets the required sets, we begin by pushing each required set onto the list the required number of times. We want 2 uppercase characters, so we add Vault.UPPER to the list twice. Then we remove any disallowed characters from the allowed list, and push copies of the allowed list onto required until it’s long enough – in this case, 8 characters long. In code this looks like:

for (var type in constraints) {
  var n   = constraints[type],
      set = Vault[type.toUpperCase()];
  
  if (n === 0) {
    removeCharacters(allowed, set);
  } else {
    while (n--) required.push(set);
  }
}
while (required.length < length)
  required.push(allowed);

In our example, this means required ends up looking like this, where allowed is Vault.ALL with Vault.SPACE removed:

required = [Vault.UPPER, Vault.UPPER, allowed, allowed,
            allowed, allowed, allowed, allowed];

Wait! Did you spot the deliberate mistake? That’s right, for in loops on objects do not guarantee order, so the constraints {digit: 3, upper: 2} and {upper: 2, digit: 3} might result in different orderings of required and hence different passwords. This is clearly wrong, let’s fix it:

var TYPES = 'LOWER UPPER DIGIT SPACE DASH SYMBOL'.split(' ');

for (var i = 0, m = TYPES.length; i < m; i++) {
  var n   = constraints[TYPES[i].toLowerCase()],
      set = Vault[TYPES[i]];
  
  // etc.
}

Now we have everything we need. Again, we take our hash and take some bits off the front to select items from lists. First thing we need to know is which set from required to use. required is 8 items long, so we need a 3-bit number. The first 3 bits are 101 which gives 5, and required[5] is the allowed set. We pop this out of the list using required.splice(5,1), and use the next few bits to select a value from allowed as before. We go through our list of bits, using them to pick a character set and than a character from that set, until we’ve used all the items in the required list. This gives an apparently random ordering of character sets, and random selection from each set. Easy.

Well, not quite. There are several implementation details that ended up making this design insecure that needed fixing to get to 0.2. See if you can spot them before continuing.

Okay, the most glaring one first:

var hash = CryptoJS.SHA256(phrase + Vault.UUID + service);

As I’ve previously explained, this is vulnerable to an extension attack on the service name. If someone discovered your gmail password, say, they could figure out your password for a service whose name begins with the letters gmail. A small risk, and possibly confounded by character constraints, but worth fixing. I switched to using HMAC-SHA256 to make the hash, and suffixed the service name with the UUID.

var hash = CryptoJS.HmacSHA256(secret + Vault.UUID, phrase);

This effectively means the user is signing the service name using their passphrase, a more sensible model for what Vault does, and it gets rid of the extension attack risk.

The second problem is not a threat to the security of the passwords Vault is capable of generating, but a practical problem: because SHA-256 produces a finite output, we can only generate passwords up to a certain length, depending on how many bits you spend on selecting character sets. Users didn’t want this artificial-seeming limit, so we needed a way to get as many bits as we like. Fortunately there’s a standard crypto function for that: the key-expanding function PBKDF2. This is based on a hashing function (the default is HMAC-SHA1, so we’re still using a MAC rather than a hash), but can safely generate arbitrarily large output. We just calculate how many bits we need for a password of a certain length, and generate them:

var hash = CryptoJS.PBKDF2(phrase,
                           service + Vault.UUID,
                           {keySize: k, iterations: i});

PBKDF2 takes an iterations argument that tells it how many rounds of hashing to do; this is basically used to make the function more expensive. This is important on the server-side where passwords must be stored using expensive hashes, but in Vault it’s only used because it can generate arbitrary amounts of output. (I arbitrarily set it to use 8 iterations, though unless this number’s very large it doesn’t make much practical difference what you use. However expensive you make it for the user to generate the password, they’re still sending it over a wire and asking someone to store it, and the burden of protection needs to lie with the server at that point. Making Vault as slow as server-side hashing ought to be would just be annoying.)

Great, so now we’re combining phrase and service to make a secure hash as large as we like, job done, right? Nope. Remember I said I’d come back to how the bits are turned into numbers to select items from lists? Well it turns out that was badly broken for a long time.

Say you need to select an item from a 94-ary list, which Vault does very frequently. 94 sits somewhere between 64 and 128, so to make sure you cover all possible values you need to use 7 bits to select a number. But this could give you a number you can’t use: say you get 1110001 (113), what do you do with it? Well, Vault’s original plan was: if you take 7 bits and get a number you can’t use, only keep the first 6 bits, and leave the 7th in the queue. This gets you numbers you can always use, but has the horrible side effect of producing bias: it makes some numbers more likely than others.

We can show this with an experiment. Here’s a small script that models Vault’s old encoding process, for selecting digits between 0 and 9:

var stream  = [],
    size    = 65536,
    results = {};

while (size--) stream.push(Math.floor(Math.random() * 2));

while (stream.length > 4) {
  var bits = stream.splice(0,4),
      num  = parseInt(bits.join(''), 2);
  
  if (num > 9) {
    stream.unshift(bits.pop());
    num = parseInt(bits.join(''), 2);
  }
  results[num] = (results[num] || 0) + 1;
}

console.log(results);

When we run this, we see that some numbers are substantially more likely than others:

$ node test.js 
{ '0': 1137,
  '1': 1120,
  '2': 1132,
  '3': 1121,
  '4': 1117,
  '5': 3473,
  '6': 3339,
  '7': 3369,
  '8': 1116,
  '9': 1144 }

This same systematic bias was present in Vault too. It meant that some orderings of character sets were more likely than others, and within each set some characters were more likely than others, by a factor of 3. If given a collection of passwords, this bias was enough to easily identify them as being generated by Vault rather than being truly random noise. For the generated passwords to be safe, they had to be evenly distributed.

The easiest way to implement this is to simply throw a number away and try the next group of bits if the number if too large. This is safe but inefficient. After asking around on Twitter, Christian Perfect showed me a better way that attempts to ‘recycle’ digits that are too large but pushing them into higher-order streams. This works great, and is exactly what Vault implements. It means Vault passwords now have the same character distribution as genuine pseudo-random noise.

Those are the major changes Vault went through before I decided it was safe enough to release. An important factor all these changes have in common is that they’re all irreversible: every one of them changes the output of the generator such that a set of inputs gives one password today and another password tomorrow. This is not acceptable so all these bugs needed stamping out before going live. There are other potential problems that can be fixed without breaking things though. For example, if it turns out there are cases where we’re not generating enough bits, we can just bump up the PBKDF2 key size and it won’t change existing outputs. Asking PBKDF2 to double its output size replaces ABC with ABCDEF: the bits you were already generating don’t change. Similarly, we can introduce new character sets without breaking old passwords as long as we add them without changing the order the existing sets appear in Vault.TYPES, which affects set ordering before encoding.

Backwards compatibility is a deal-breaker on this project: if you mess it up, everyone loses their passwords. The milestone the let me ship was realizing that all the remaining potential problems were reversibly fixable. Fortunately they’ve not come up yet.

Announcing Vault: safer passwords for the web

TL;DR: This is something of a rant. The important bit: I’ve built an open-source storage-free available-anywhere secure password generator called Vault. It’s available online and for your terminal.

What with recent weeks being a pretty bad time to be a LinkedIn, eHarmony or Last.fm user, I’ve been putting the final touches on something I hope will make it much easier for Internet users to keep their accounts safe. But before I dive into that, let’s review the problems users are facing right now.

Anyone who uses the Web is likely to have signed up for a few, maybe dozens, maybe hundreds of online services, covering a range of applications from sharing pictures of kittens to storing all of one’s money and personal information. Many of these accounts are identified by the user’s email address, and most people use the same password for every service they sign up for, because that’s what’s easiest (and often because nobody told them not to). So while technically all of your data is in nice little silos, a breach of one service provider’s database can often lead to the attacker gaining access to accounts on lots of other services. People aren’t attacking eHarmony because they want to know your ideal date – they want your email password, because that’s the key to everything else. The money in your bank, the marketable information Facebook knows about you, your whole identity.

So, the sensible thing to do is use a different password for every service you use. Then when one site’s database gets stolen, none of your other accounts is at risk. Well, sort of. How true this is depends on how good you are at picking passwords. Actually, that’s a lie. It relies on you not picking passwords: the best way to make sure that one database leak doesn’t lead to others is to use completely uncorrelated passwords, i.e. get a random number generator to pick them for you. Passwords chosen by people are highly likely to follow some predictable pattern, or include affordances for memorability that make them easier to crack.

But random passwords have a problem: they’re not memorable. This is actually a good thing for security but it makes them hard to use safely. Advice for dealing with this ranges from writing your passwords down (a horrible idea) to using a password manager like 1Password, KeePassX or LastPass to store your passwords in an encrypted database locked with one master password. Unfortunately most people still don’t know about these tools and so you end up with break-ins caused by someone leaving their password on a post-in note stuck to their monitor.

While these tools are useful, and certainly much better than using the same password everywhere or writing them down, I think they still suffer from a number of problems, namely: adoption, security and portability.

All these tools require you to install something on your various devices. They typically consist of an application that manages storage of passwords, a random password generator, and maybe a browser plugin and syncing support. Some of them also require payment. And while I don’t begrudge anyone trying to make money from software, it must be recognised that all of this creates friction that makes it less likely any given person will adopt the product. Less adoption means more unsafe accounts out there, creating more potential for break-ins. Trading off security against convenience is a very hard problem, but I think we can make people substantially safer than they currently are, with far less ceremony than these tools involve.

The second problem is security. While a good password manager will use a sensible encryption scheme to protect your data, it’s worth questioning whether storage is necessary at all. Any sensitive stored information requires very careful treatment when it is stored on disk or transmitted over a network; the safest data is that which is never stored anywhere and exists only in someone’s head.

(While I’m on the topic of storage, it’s worth pointing this out to service providers: because everyone uses the same password everywhere, if you run a site that requires people to log in, you have their email password. I don’t care if all your site does is let people rank versions of that one picture of a fishing boat on the beach at dusk that everyone seems to love – you have people’s email passwords and you had better take damn good care of them. This means hashing them using an expensive salted hashing algorithm like PBKDF2, Bcrypt or scrypt with a tunable work factor. It should take well over 100ms to hash someone’s password, more if you’re using a slow programming language. It must be computationally expensive for someone to guess a single password; using hashing algorithms designed for speed makes it easy for anyone to brute-force your passwords. I have, worryingly, seen several people advocate rate limiting or IP blocking on your login system for this purpose. This is completely useless: the most productive thing for an attacker to do is not to guess username-password pairs through your login form, where a round-trip takes at least 10ms, but to steal your whole database and analyse it on their own hardware. Your rate limiting does nothing to stop this, and the expense of guessing a password must be baked into the hashes themselves.)

Anyway, where was I. Right, password managers, yes. The third problem with password managers is partly a consequence of the storage problem: portability. Password managers have two factors conspiring to make them less portable: they require software installation and so must provide versions for every platform they wish to target, and they must provide a way to make your password data available everywhere you need it. I’ve seen too many products (‘too many’ can mean ‘one’ if it’s something you really care about) do an ‘experimental’ Linux release and then abandon it – I don’t trust that it’s going to continue to be worth any company’s time supporting all the platforms I need to have my passwords available on. And portability doesn’t just mean availability on all my devices. What if you’re travelling, or out without your phone and you need to pop into an Internet café? What if your password manager simply doesn’t work on your phone? What if your password manager doesn’t provide online retrieval, or it’s down for maintenance? You’re stuck.

So, you can’t use the same password everywhere, creating good passwords is hard, bad password policies make it even harder, and password managers suck. But, back in December I found a good idea in the aforelinked article:

I generate a different password for every service, based on a convoluted master password and the name of the thing. I do this because it’s what you’re supposed to do; it’s what security nerds (including myself for the purposes of this post) tell everyone else to do. “Ho ho!” we all chuckled to ourselves after the Gawker leak, and the subsequent breakins to various other things that used the same passwords. “If only these chumps had been generating different random passwords for every service!”

So my passwords look like 'fC`29ap5w78r3IJ, or Ab3HE4 2Iv5hJk\K, or mw@\_h< ~o04neHiJ{. Those are actual examples i just generated. I’m eating my own dogfood, so to speak.

Fuck Passwords

Now remember the qualities we want from our passwords: different for every site, and uncorrelated. The word ‘uncorrelated’ is usually taken to mean ‘use a random number generator and store the result somewhere’. But we can get passwords that look random (technically, this means there’s no polynomial-time algorithm that can distinguish them from genuine random noise) in a deterministic way: a function that produces random-looking output but gives the same result every time if given the same input. We can just use hashing functions for this: their output is pseudorandom, deterministic, and changes drastically if the input is slightly changed. This means we can generate a set of random-looking passwords from one master passphrase and a string that’s different for each service we use, and because it’s deterministic we don’t need to store the result anywhere. This means we can put up a static web page that implements such a function, never stores or transmits the user’s data, and is available everywhere and hostable by anybody.

So that’s what I’ve built. A simple page that takes your master passphrase and service name, and generates a very strong password from them. It lets you select which characters are required/forbidden, to work around sites with bad password policies. It has a command-line version based on Node.js. And, needless to say, it’s open source and anyone can host a copy of it.

The core idea of Vault is that it works without storing anything by default. This means no installation, no storage to secure and sync, and total portability: it’s just a static web page instantly accessible from anywhere. You can host your own copy for when my site’s down, put a copy on your laptop or internal network, whatever you want. The storage options offered by the command-line version are simply a convenience: I can get my Gmail password by typing vault gmail | pbcopy, but if I’m on another machine without my stored settings I can still get my password from the web version. I am currently investigating ways of providing similar convenience on the web, but until then you can do something quite easy: write your character settings down. The character constraints of websites’ passwords are public knowledge already, there’s nothing dangerous about keeping a note of them in a text file or in your wallet.

But, I hear you cry, how does this protect anyone? Can’t an attacker just use Vault themselves to pre-process my password before trying it? Yes, of course they can, but they could have just tried your password before, so they’ve not gained any new power. Vault protects you in two important ways. First, the passwords it generates with the default settings have about 130 bits of entropy (20 characters from a uniformly distributed 94-character alphabet). This is far higher than a dictionary-word password, making it resistant to brute-force attacks, in fact zxcvbn estimates they will take billions of times the age of the universe to crack by brute force. Second, the fact that a set of Vault passwords looks entirely random means that if attacker gets one of your passwords they cannot predict any others or discover your master passphrase. This is especially true if you include your username in the the service name when generating a password so you get a universally unique password. Plus, it gives you the freedom to choose a good master passphrase rather than being constrained by the awful password policies affecting many websites.

I’ve been using this system for six months now, and the generator algorithm has been through several rounds of design improvements to the point where I’m now happy calling it stable and freezing it. (I’d go into how it works but I’ve already rambled on for 1700 words so I’ll save that for another post.) Initial response from the few people I’ve shown it to has been really positive, and I’d encourage you to try it out if you’re still using one password everywhere. Read the FAQ first before diving in.

I’ll close by saying that I’m absolutely not trying to compete with the non-password-based authentication systems that are coming out. I really want BrowserID to succeed and kill passwords for good, but we’re not there yet. I’m just trying to do something that works with the web as it is today, that’s reasonably secure while still being extremely convenient, and is much better than what the majority of people do with their passwords. It’s a stepping stone until we have something better.

Why you should never use hash functions for message authentication

The general thrust of this post is: use a MAC function like HMAC to sign data, don’t use hash functions. Although not all hash functions suffer from the problem I’m going to illustrate, in general using a hash function for message authentication comes with a lot of potential problems because those functions aren’t designed for this task. You shouldn’t try to work around it by creatively processing the inputs or inventing some fancy way of chaining hash functions. Just use the functions that were designed for this task instead of inventing your own crypto schemes.

This morning I was reading an article on the Intridea blog about signed idempotent action links. These are links you can embed in emails to your users that let them perform authenticated actions on your site, without going through a login screen or any other interactions. This technique relies on embedding action data in the link, for example an email from Twitter could include your user ID and the name of another user, and following the link would make you follow that user on Twitter.

But these parameters are typically easily guessable and so we must provide an authentication mechanism, otherwise anyone can construct URLs to modify other accounts however they like. To do this, we ‘sign’ the action data by combining it with a secret value, for example a global secret token stored on our servers, or the user’s password hash to produce a ‘tag’. This ‘tag’ does not reveal the secret values used in its construction but lets us verify that the link was generated by our site and for the correct user. The article gives the following methods you could add to a user class to let them sign and verify actions, assuming the user has an id and a secret_token. (I don’t mean to pick on Intridea, I’ve seen a lot of other people make this mistake and I only found out recently why you shouldn’t do it.)

class User
  def sign_action(action, *params)
    Digest::SHA1.hexdigest(
      "--signed--#{id}-#{action}-#{params.join('-')}-#{secret_token}"
    )
  end
  
  def verify(signature, action, *params)
    signature == sign_action(action, *params)
  end
end

This combines the action data with a secret token and calculates the SHA-1 hash; this hash is then appended to the URL to go in the email, and verified when the link is followed. Seems fine, right? SHA-1 doesn’t reveal anything about its input, so the secret is safe. This might be true (modulo the existence of rainbow tables) but hash functions still don’t provide any guarantee that a (message,tag) pair is genuine. To see why, we need to consider how hash functions work.

Many hash functions are based on something called the ‘Merkle-Damgard iterated construction’, which works like this: say you have a long string you want to hash, that looks like this:

+--------------------------+
| abcdefghijklmnopqrstuvwx |
+--------------------------+

Your message gets split into fixed-size blocks like this:

+--------+  +--------+  +--------+  +--------+
| abcdef |  | ghijkl |  | mnopqr |  | stuvwx |
+--------+  +--------+  +--------+  +--------+

It is then prefixed with a value called the ‘initialization vector’ or IV, which is a global constant that’s part of the hash function’s definition. This IV is the same size as the message blocks:

    IV          M0          M1          M2          M3
+--------+  +--------+  +--------+  +--------+  +--------+
| Zn8AGy |  | abcdef |  | ghijkl |  | mnopqr |  | stuvwx |
+--------+  +--------+  +--------+  +--------+  +--------+

This sequence is then folded using a compression function h(). The details of the h() depend on the hashing function, but the only thing that concerns us here is that the compression function takes two message blocks and returns another block of the same size.

So, first we take IV and M0 and compute h(IV,M0) to get H0. Then we take H0 and M1 and compute H1 = h(H0,M1) and so on down the chain.

    IV          M0          M1          M2          M3
+--------+  +--------+  +--------+  +--------+  +--------+
| Zn8AGy |  | abcdef |  | ghijkl |  | mnopqr |  | stuvwx |
+--------+  +--------+  +--------+  +--------+  +--------+
     |           |           |           |           |
     |         +-+-+       +-+-+       +-+-+       +-+-+
     +-------->| h |--H0-->| h |--H1-->| h |--H2-->| h |--> H3
               +---+       +---+       +---+       +---+

Whatever value comes out the end of the chain is the result of the hash function; this is how hash functions take arbitrary-size input and produce fixed-size output. But what does it mean for message authentication? Well say you have a (message,tag) pair, like our action params and SHA-1 hash from above. The construction of hash functions means that if you know the value of hash(string), you can easily work out the value of hash(string + modification) without knowing what string is: you just take the hash you already have, and do some more rounds of Merkle-Damgard with your modification. So, if we’re signing data by doing this:

tag = sha1(secret + message)

we’re then going to send (message,tag) over the wire in an email. The Merkle-Damgard construction means an attacker can take these values and easily compute the following:

fake_tag = sha1(secret + message + modification)

The attacker can do this without knowing secret, and can therefore construct new (message,tag) pairs that look genuine to the application. The creation of new pairs by untrusted parties is called an extension attack, and common hash functions are vulnerable to it.

Fortunately, there’s a function that is resistant to extension attacks and is designed precisely for signing messages: HMAC, short for Hash-based Message Authentication Code. In Ruby you can use the OpenSSL module from the standard library for this:

sha1 = OpenSSL::Digest::Digest.new('sha1')
tag  = OpenSSL::HMAC.hexdigest(sha1, secret_token, message)

Although HMAC is based on hash functions, it’s constructed in a way that prevents extension attacks and other classes of ‘existential forgery’, i.e. people constructing fake-but-valid (message,tag) pairs. If you’re signing data, you should use it.

But wait: there’s more! Take another look at the verify() function:

def verify(signature, action, *params)
  signature == sign_action(action, *params)
end

See anything wrong with it? It’s actually vulnerable to a timing attack: the String#== method as commonly implemented compares strings character-by-character (or byte-by-byte) and exits with false as soon as it finds an unequal pair. This fact means that an attacker can determine the first correct character of the tag by submitting requests to a signed URL with a different first character in the tag each time, and stopping when the request takes a little longer than usual. After guessing the first character they can move onto the second, and so on until they’ve guessed the whole correct tag.

The easiest way to defeat this attack is, instead of directly comparing two strings, compare their mappings under a collision-resistant hash function:

def verify(signature, action, *params)
  expected = Digest::SHA1.hexdigest(sign_action(action, *params))
  actual = Digest::SHA1.hexdigest(signature)
  expected == actual
end

This strategy means that if any of the supplied tag is wrong, it will probably differ on the first character when compared to the hash of the expected signature. Even if it doesn’t sometimes, you’ve destroyed the linear relationship between the correctness of the string and the time the comparison takes.

Finally, you should make sure your application does not exit early if the tag is invalid. You should do all the data processing you would normally do, just short of modifying the database, and check the tag last. If you return early you risk another timing attack.

All the information in this post I learned from Stanford’s introduction to cryptography, which is an excellent six-week primer on the topic. It’s a little mathsy but not nearly as much as it could be, meaning it’s fairly accessible and focuses on practical problems and intuition, and is really useful to anyone handling user data for a living. I’d go so far as to say it should be required reading for any web developer. It starts up again on Monday June 11: go sign up and make sure you’re not the next victim of the password leaks we’ve seen this week.

Source map support added to Packr and Jake

I’m a little late announcing this, in fact I held off until I’d been using this code for a couple of months to make sure there weren’t any glaring surprises. But the good news is, all the JavaScript libraries I ship from now on will come with source maps, and yours can too.

What’s a source map, you ask. Well, as this article explains, a source map is just a metadata file that maps locations in a minified (or otherwise compiled) JavaScript file back to locations in the source code the author is working on. This means you can load minified/compiled code into a browser, and when log messages or exceptions occur, the browser can show you the filename and line number in your source code that such events come from, rather than somewhere in line 1 of whatever huge bundle of code you’re actually deploying. This is really useful.

Telling the browser about this metadata is really easy. If you look at the latest release of Faye, you’ll see it contains three client-side files:

  • faye-browser.js – the source code
  • faye-browser-min.js – the minified code
  • faye-browser-min.js.map – the source map

faye-browser-min.js is what gets loaded in the browser. At the end of the file is a single line comment that looks like this:

//@ sourceMappingURL=faye-browser-min.js.map

This line tells the browser to use the file faye-browser-min.js.map as the source map for the JavaScript file that includes this comment. sourceMappingURL is resolved relative to the URL of the containing script, so as long as the script and the map are in the same directory this directive works fine.

The file faye-browser-min.js.map looks like this:

{
  "version": 3,
  "file": "faye-browser-min.js",
  "sourceRoot": "",
  "sources": ["faye-browser.js"],
  "names": ["_advice", "_callback", "_callbacks", "_cancelled", "_cbCount", "_channels", ...],
  "mappings": "AAAA,IAAI,MAAQ,OAAO,QAAU,SAAW,QACxC,GAAI,OAAO,UAAY,WAAY,OAAO,KAAO,KAEjD,KAAK..."
}

The sources field refers to the source code the map relates to: a source map can describe how one file was concatenated from a set of source files. The browser uses the sources and mappings data to show you the source code instead of the minified code when you use the web inspector, and it only loads the map and the source code if you have the inspector open so it doesn’t add latency for normal users.

Now if you want to use this new browser feature on your own projects, I have a couple of tools you should look at. First is Packr, my port of Dead Edwards’ Packer tool in Ruby. Say you have two source files:

# example_a.js

1. // When the minified code is loaded into a browser, you should see the call to
2. // console.log() attributed to example_a.js:4
3. 
4. console.log('Hello from file A');
# example_b.js

1. var display = function(message) {
2.   alert(message + ' from file B');
3. };
4. 
5. display('Ahoy there');

You can run this in the shell:

packr example_a.js example_b.js -o min/example-min.js -h '/* Copyright 2012 */' --shrink-vars

And it will generate a minified file and a source map for you:

# example-min.js

/* Copyright 2012 */
console.log('Hello from file A');var display=function(a){alert(a+' from file B')};display('Ahoy there');
//@ sourceMappingURL=example-min.js.map
# example-min.js.map

{
  "version": 3,
  "file": "example-min.js",
  "sourceRoot": "",
  "sources": ["../example_a.js", "../example_b.js"],
  "names": ["message"],
  "mappings": ";AAGA,QAAQ,KAAK,MAAM,KAAK,KAAK,ICH7B,IAAI,QAAU,SAASA,GACrB,MAAMA,IAAY,KAAK,KAAK,KAG9B,SAAS,KAAK;"
}

Packr lets you remove whitespace, compress variable names, use base-62 encoding, insert header comments on minified output, and can be used as a CLI or as a Ruby library. See the readme for more info.

Second is a tool I’ve been using to build all my JavaScript projects for a few years now, Jake. Jake is based on Packr, but is designed for larger projects. It’s designed to let you describe the structure of a project using a simple YAML file, and build the whole thing easily with one shell command. Based on Packr’s source map support, I’ve added the ability to tell Jake to generate source maps for one build based on the output of another. For example, say your build produces both unminified (src) and minified (min) copies of your code; then you can tell Jake to generate a source map for the min build that refers to code in the src build as follows:

---
builds:
  src:
    minify: false
  min:
    minify: true
    shrink_vars: true
    source_map: src

The source_map: src line means, generate a source map for min files that refers to locations in the corresponding src file. This is how the Faye files above are generated. Again, the Jake readme has more information.

We’re likely to see source map support in other tools soon; adding it to Packr was fairly easy because of Packr’s very simple model for compressing code. Basically it’s a regex-replacement-based compressor rather than a full JavaScript parser, so source maps could be added as a post-processing step rather than being based on AST annotations requiring deep integration. Unfortunately this means that if you like omitting semicolons from your code you might have to wait a little longer to use this stuff.

Faye 0.8: the refactoring

I’m pleased to finally announce the release of Faye 0.8 after a few months of reorganising the 0.7 codebase to make it more modular, and splitting parts of it out into separate projects. Before I get to what’s changed, I’m going to get the API changes out of the way: this is the stuff you need to know if you’re upgrading.

I hate introducing API changes but I’m afraid these really couldn’t be avoided. They’re really configuration changes so you shouldn’t need to change a lot of code. Please get on the mailing list if you have problems.

First, if you’re running the Ruby server you need to tell it which web server you’re using. Faye now supports the Thin, Rainbows and Goliath web servers, and you need to tell the WebSocket layer which set of adapters to load. For a ‘hello world’ app this looks like this:

# config.ru
require 'faye'
Faye::WebSocket.load_adapter('thin')

app = Faye::RackAdapter.new(:mount => '/faye', :timeout => 25)

run app

Depending on whether you run your application with rackup or thin, the load_adapter call might not be strictly necessary, but better to have it in there just in case. See the Faye Ruby docs and the faye-websocket documentation on how to run your app with different Ruby servers.

Second, if you use the Redis backend, you need to install a new library and change the engine.type setting in your server. Instead of specifying the name of the engine, this field now takes a reference to the engine object. In Ruby that looks like this:

# config.ru
# First run: gem install faye-redis

require 'faye'
require 'faye/redis'

bayeux = Faye::RackAdapter.new(
  :mount   => '/',
  :timeout => 25,
  :engine  => {
    :type  => Faye::Redis,
    :host  => 'redis.example.com',
    # more options
  }
)

And on Node:

// First run: npm install faye-redis

var faye  = require('faye'),
    redis = require('faye-redis');

var bayeux = new faye.NodeAdapter({
  mount:    '/',
  timeout:  25,
  engine: {
    type:   redis,
    host:   'redis.example.com',
    // more options
  }
});

Apart from this, Faye 0.8 should be backward-compatible with previous releases.

Having got the administrivia out of the way, what’s new? Well, the main focus of the 0.8 release is modularity. Two major components of Faye – the WebSocket support and the Redis engine – have been split off into their own packages. I’ve been blogging here already about faye-websocket (rubygem, npm package), and the major work over the last few months has gone into making this a really solid WebSocket and EventSource implementation. Because of its new-found freedom outside the main Faye project, it’s been adopted by other projects, notably SockJS, Cramp and Poltergeist. This adoption, particularly from SockJS, has meant more feedback, bug fixes and performance improvements and resulted in a really solid WebSocket library that improves Faye’s performance.

This new library has had a beneficial impact on Faye’s transport layer. faye-websocket is much faster than Faye’s 0.7 WebSocket code, supports EventSource and new WebSocket features, and runs on more Ruby servers: Faye 0.7 was confined to Thin whereas it now also runs on Rainbows and Goliath. On top of this, Faye 0.8 adds a new EventSource-based transport to support Opera 11 and browsers where proxies block WebSocket, and improves how it uses WebSocket. Previously, Faye’s WebSocket transport used the same polling-based /meta/connect cycle as other transports. It was faster than HTTP, but not optimal. Faye 0.8 now breaks out of this request/response pattern and pushes messages to WebSocket and EventSource connections as soon as they arrive at the server, without returning the /meta/connect poll. This results in lower latency, particular when delivering messages at high volume.

The second major change is that the Redis engine is now a separate library, faye-redis (rubygem, npm package). This has two important benefits. First, the main Faye package no longer depends on Redis clients, and in particular the Node version no longer depends on packages with C extensions, so no compiler is needed to install it. Second, it means Faye’s backend is now totally pluggable and third parties can implement their own engines: the API is thoroughly documented for Ruby and Node. The engine is a small piece of code (for example here’s the Ruby in-memory engine) but it really defines Faye’s behaviour. This layer is not concerned with transport negotiation (HTTP, WebSocket, etc) or even the Bayeux protocol format, it just implements the messaging business logic and stores the state of the system. You can easily implement your own engine to run on top of another messaging/storage stack, or change the messaging semantics if you like. Faye has a complete set of tests you can run to check your engine – see the Ruby and Node projects for examples.

Finally, there have been a couple of changes to the client. We’ve switched from exponential-backoff to fixed-interval for trying to reconnect after the client loses its connection to the server, and this interval is configurable using the retry setting:

// Attempts to reconnect every 5 seconds
var client = new Faye.Client('http://example.com/bayeux', {
  retry: 5
});

You can also set headers for long-polling requests on the client; this is useful for talking to OAuth-protected servers, for example:

client.setHeader('Authorization', 'OAuth ' + accessToken);

And finally, there’s a new server-side setting, ping, that controls how often the server sends keep-alive data over WebSocket and EventSource connections. This data is ignored by the client, but helps keep the connection open through proxies that like to kill idle connections.

// Sends pings every 10 seconds

var bayeux = new faye.NodeAdapter({
  mount: '/bayeux',
  ping:  10
});

And that just about wraps things up for this release. Since 0.7.1, the main Faye codebase has shed over 2,000 lines of code into other projects that we can easily ship incremental updates to without affecting Faye itself. It’s more performant, leaner and more modular and I know there are already projects doing cool things with it. If you’re using Faye for interesting projects, I’d love to hear from you on the mailing list.