Your first Ruby native extension: Java

In my last post, I covered how to get started writing code in C and wiring it up to your Ruby code. While that code technically will work in JRuby, it’s preferred to write native JRuby extensions in Java rather than in C. In this article we’ll add JRuby support to our gem.

Let’s start with the Java code itself. Just like we did in C, we need to define the business logic itself as a Java method, and some glue code to set up modules and expose the methods to the Ruby runtime. The JRuby APIs are fairly easy to google, but getting things wired up correctly is quite challenging if you’re new to Java. The slightly obtuse-looking wiring here mostly concerns getting a reference to the current JRuby runtime, which parts of the API need (e.g. see RubyArray.newArray()).

Here’s the required code, which we put in ext/faye_websocket/FayeWebSocketService.java:

package com.jcoglan.faye;

import java.lang.Long;
import java.io.IOException;

import org.jruby.Ruby;
import org.jruby.RubyArray;
import org.jruby.RubyClass;
import org.jruby.RubyFixnum;
import org.jruby.RubyModule;
import org.jruby.RubyObject;
import org.jruby.anno.JRubyMethod;
import org.jruby.runtime.ObjectAllocator;
import org.jruby.runtime.ThreadContext;
import org.jruby.runtime.builtin.IRubyObject;
import org.jruby.runtime.load.BasicLibraryService;

public class FayeWebSocketService implements BasicLibraryService {
  private Ruby runtime;
  
  // Initial setup function. Takes a reference to the current JRuby runtime and
  // sets up our modules. For JRuby, we will define mask() as an instance method
  // on a specially created class, Faye::WebSocketMask.
  public boolean basicLoad(Ruby runtime) throws IOException {
    this.runtime = runtime;
    RubyModule faye = runtime.defineModule("Faye");
    
    // Create the WebSocketMask class. defineClassUnder() takes a name, a
    // reference to the superclass -- runtime.getObject() gets you the Object
    // class for the current runtime -- and an allocator function that says
    // which Java object to construct when you call new() on the class.
    RubyClass webSocket = faye.defineClassUnder("WebSocketMask", runtime.getObject(), new ObjectAllocator() {
      public IRubyObject allocate(Ruby runtime, RubyClass rubyClass) {
        return new WebSocket(runtime, rubyClass);
      }
    });
    
    webSocket.defineAnnotatedMethods(WebSocket.class);
    return true;
  }
  
  // The Java class that backs the Ruby class Faye::WebSocketMask. Its methods
  // annotated with @JRubyMethod become exposed as instance methods on the Ruby
  // class through the call to defineAnnotatedMethods() above.
  public class WebSocket extends RubyObject {
    public WebSocket(final Ruby runtime, RubyClass rubyClass) {
      super(runtime, rubyClass);
    }
    
    @JRubyMethod
    public IRubyObject mask(ThreadContext context, IRubyObject payload, IRubyObject mask) {
      int n = ((RubyArray)payload).getLength(), i;
      long p, m;
      RubyArray unmasked = RubyArray.newArray(runtime, n);
      
      long[] maskArray = {
        (Long)((RubyArray)mask).get(0),
        (Long)((RubyArray)mask).get(1),
        (Long)((RubyArray)mask).get(2),
        (Long)((RubyArray)mask).get(3)
      };
      
      for (i = 0; i < n; i++) {
        p = (Long)((RubyArray)payload).get(i);
        m = maskArray[i % 4];
        unmasked.set(i, p ^ m);
      }
      return unmasked;
    }
  }
}

There’s another strategy you can use for JRuby, which is more like FFI: you define the logic you want in pure Java, and then tell JRuby to expose that to Ruby, mapping data types between the two runtimes. I tried that for this problem and it ended up being slower, so I went with the approach above which uses the JRuby APIs directly.

Next thing we need is code to compile it. Find the Rakefile from the C example and modify it like so: we need to define a different compiler task based on which runtime we’re using.

# Rakefile

spec = Gem::Specification.load('faye-websocket.gemspec')

if RUBY_PLATFORM =~ /java/
  require 'rake/javaextensiontask'
  Rake::JavaExtensionTask.new('faye_websocket', spec)
else
  require 'rake/extensiontask'
  Rake::ExtensionTask.new('faye_websocket', spec)
end

You should be able to run rake compile on JRuby and it should just work. It will create a new file in lib called faye_websocket.jar, which is the compiled Java bytecode package.

We also need a little more glue to load this code on JRuby, and some additional logic to map our intended singleton method WebSocket.mask() onto the Java-backed instance method WebSocketMask#mask(). Create the file lib/faye.rb with this content:

# lib/faye.rb

# This loads either faye_websocket.so, faye_websocket.bundle or
# faye_websocket.jar, depending on your Ruby platform and OS
require File.expand_path('../faye_websocket', __FILE__)

module Faye
  module WebSocket
    
    if RUBY_PLATFORM =~ /java/
      require 'jruby'
      com.jcoglan.faye.FayeWebSocketService.new.basicLoad(JRuby.runtime)
      
      def self.mask(*args)
        @mask ||= WebSocketMask.new
        @mask.mask(*args)
      end
    end
    
  end
end

As you can see, we need to use the JRuby API to load the extension when running on JRuby. This code will load the native code and then add any glue we need to make everything work correctly. If you run rake compile && irb -r ./lib/faye on either MRI or JRuby you’ll find that the Faye::WebSocket.mask() method works as expected.

Finally there’s the question of packaging. Unlike compiled C code, compiled Java code is portable to any JVM, so JRuby extensions are not compiled on site. Instead, you put the .jar file in your gem. Your gemspec needs to tell Rubygems to compile on site for MRI, but include the .jar for JRuby.

# faye-websocket.gemspec

Gem::Specification.new do |s|
  s.name    = "faye-websocket"
  s.version = "0.4.0"
  s.summary = "WebSockets for Ruby"
  s.author  = "James Coglan"
  
  files = Dir.glob("ext/**/*.{c,java,rb}") +
          Dir.glob("lib/**/*.rb")
  
  if RUBY_PLATFORM =~ /java/
    s.platform = "java"
    files << "lib/faye_websocket.jar"
  else
    s.extensions << "ext/faye_websocket/extconf.rb"
  end
  
  s.files = files
  
  s.add_development_dependency "rake-compiler"
end

I’ve put "lib/faye_websocket.jar" as an explicit addition rather than including it in the Dir.glob for two reasons: you don’t want it in the MRI gem, and you want gem build to fail if the file is missing when building on JRuby, which can easily happen if you forget.

Now the final part, which isn’t obvious the first time you do this: you need to release two gems, one generic and one for JRuby. The s.platform = "java" line means you’ll get a different gem file when building on JRuby. All you need to do is build the gem on two different platforms, and remember to compile before building the JRuby gem.

$ rbenv shell 1.9.3-p194 
$ gem build faye-websocket.gemspec 
  Successfully built RubyGem
  Name: faye-websocket
  Version: 0.4.0
  File: faye-websocket-0.4.0.gem

$ rbenv shell jruby-1.6.7 
$ rake compile
install -c tmp/java/faye_websocket/faye_websocket.jar lib/faye_websocket.jar
$ gem build faye-websocket.gemspec 
  Successfully built RubyGem
  Name: faye-websocket
  Version: 0.4.0
  File: faye-websocket-0.4.0-java.gem

Just push those two gem files to Rubygems.org, and you’re done.

Your first Ruby native extension: C

A few months back I released faye-websocket 0.4, my first gem that contained native code. After a few mis-step releases I got a working build for MRI and JRuby, but getting there was a little tricky. What follows is a quick how-to from someone who knows barely any C or Java, to explain how to wire your code up for release.

This only covers one possible use case for native code: rewriting a pure function in native code to make it faster. It does not cover binding to native libraries, or FFI, just writing some vanilla C/Java to make some hot code faster. In faye-websocket’s case, this is the function in question:

bytes = [72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]
mask  = [23, 142, 94, 24]

Faye::WebSocket.mask(bytes, mask)
# => [95, 235, 50, 116, 120, 162, 126, 111, 120, 252, 50, 124, 54]

It takes an arbitrary list of bytes, and a list of four bytes, and XORs the first set using the second set (this is part of how data is encoded in WebSocket frames). You’d implement it in Ruby like this:

def mask(payload, mask)
  result = []
  payload.each_with_index do |byte, i|
    result[i] = byte ^ mask[i % 4]
  end
  result
end
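Incidentally, because XOR is its own inverse, masking twice with the same key returns the original payload, which gives us an easy sanity check. Here's a condensed version of the same function, using the example bytes from the top of the article:

```ruby
# XOR masking is self-inverse: applying mask() twice with the same key
# round-trips the payload.
def mask(payload, mask)
  payload.each_with_index.map { |byte, i| byte ^ mask[i % 4] }
end

bytes = [72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 33]
key   = [23, 142, 94, 24]

masked = mask(bytes, key)
masked                     # => [95, 235, 50, 116, 120, 162, 126, 111, 120, 252, 50, 124, 54]
mask(masked, key) == bytes # => true
```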

It turns out Ruby’s quite slow at doing this, and you get a big performance boost by writing in C. Remember that what we want to do is define a singleton method called mask(payload, mask) on the module Faye::WebSocket. I’ll just show you all the code for doing that in C, which we’re going to save in ext/faye_websocket/faye_websocket.c. All I know about the Ruby C APIs I got through Google, and I’ve annotated this code with some useful tips.

// ext/faye_websocket/faye_websocket.c

#include <ruby.h>

// Allocate two VALUE variables to hold the modules we'll create. Ruby values
// are all of type VALUE. Qnil is the C representation of Ruby's nil.
VALUE Faye = Qnil;
VALUE FayeWebSocket = Qnil;

// Declare a couple of functions. The first is initialization code that runs
// when this file is loaded, and the second is the actual business logic we're
// implementing.
void Init_faye_websocket();
VALUE method_faye_websocket_mask(VALUE self, VALUE payload, VALUE mask);

// Initial setup function, takes no arguments and returns nothing. Some API
// notes:
// 
// * rb_define_module() creates and returns a top-level module by name
// 
// * rb_define_module_under() takes a module and a name, and creates a new
//   module within the given one
// 
// * rb_define_singleton_method() takes a module, the method name, a reference
//   to a C function, and the method's arity, and exposes the C function as a
//   singleton method on the given module
// 
void Init_faye_websocket() {
  Faye = rb_define_module("Faye");
  FayeWebSocket = rb_define_module_under(Faye, "WebSocket");
  rb_define_singleton_method(FayeWebSocket, "mask", method_faye_websocket_mask, 2);
}

// The business logic -- this is the function we're exposing to Ruby. It returns
// a Ruby VALUE, and takes three VALUE arguments: the receiver object, and the
// method parameters. Notes on APIs used here:
// 
// * RARRAY_LEN(VALUE) returns the length of a Ruby array object
// * rb_ary_new2(int) creates a new Ruby array with the given length
// * rb_ary_entry(VALUE, int) returns the nth element of a Ruby array
// * NUM2INT converts a Ruby Fixnum object to a C int
// * INT2NUM converts a C int to a Ruby Fixnum object
// * rb_ary_store(VALUE, int, VALUE) sets the nth element of a Ruby array
// 
VALUE method_faye_websocket_mask(VALUE self, VALUE payload, VALUE mask) {
  int n = RARRAY_LEN(payload), i, p, m;
  VALUE unmasked = rb_ary_new2(n);
  
  int mask_array[] = {
    NUM2INT(rb_ary_entry(mask, 0)),
    NUM2INT(rb_ary_entry(mask, 1)),
    NUM2INT(rb_ary_entry(mask, 2)),
    NUM2INT(rb_ary_entry(mask, 3))
  };
  
  for (i = 0; i < n; i++) {
    p = NUM2INT(rb_ary_entry(payload, i));
    m = mask_array[i % 4];
    rb_ary_store(unmasked, i, INT2NUM(p ^ m));
  }
  return unmasked;
}

Now we’ve got our C code done, we need some glue to compile it and load it from Ruby. I use rake-compiler for this. Your project needs an extconf.rb, in the same directory as the C code:

# ext/faye_websocket/extconf.rb

require 'mkmf'
extension_name = 'faye_websocket'
dir_config(extension_name)
create_makefile(extension_name)

Now let’s make a skeleton gemspec and Rakefile for the project. For a C extension, you ship the source code as part of the gem and it gets compiled on site using the extensions field from the gemspec.

# faye-websocket.gemspec

Gem::Specification.new do |s|
  s.name    = "faye-websocket"
  s.version = "0.4.0"
  s.summary = "WebSockets for Ruby"
  s.author  = "James Coglan"
  
  s.files = Dir.glob("ext/**/*.{c,rb}") +
            Dir.glob("lib/**/*.rb")
  
  s.extensions << "ext/faye_websocket/extconf.rb"
  
  s.add_development_dependency "rake-compiler"
end

# Rakefile

require 'rake/extensiontask'
spec = Gem::Specification.load('faye-websocket.gemspec')
Rake::ExtensionTask.new('faye_websocket', spec)

So the project looks like this at this point:

ext/
    faye_websocket/
        extconf.rb
        faye_websocket.c
faye-websocket.gemspec
Rakefile

This is now enough to run rake compile, wherein rake-compiler works its magic. This will create a Makefile, and some *.o and *.so files. You should not check these into git, or ship them as part of the gem – these compilation artifacts are created on install.

$ rake compile
mkdir -p lib
mkdir -p tmp/x86_64-linux/faye_websocket/1.9.3
cd tmp/x86_64-linux/faye_websocket/1.9.3
/home/james/.rbenv/versions/1.9.3-p194/bin/ruby -I. ../../../../ext/faye_websocket/extconf.rb
creating Makefile
cd -
cd tmp/x86_64-linux/faye_websocket/1.9.3
make
compiling ../../../../ext/faye_websocket/faye_websocket.c
linking shared-object faye_websocket.so
cd -
install -c tmp/x86_64-linux/faye_websocket/1.9.3/faye_websocket.so lib/faye_websocket.so

The file lib/faye_websocket.so is directly loadable from Ruby – let’s try it out:

$ irb -r ./lib/faye_websocket
>> Faye::WebSocket.mask [1,2,3,4], [5,6,7,8]
=> [4, 4, 4, 12]

So the final part of the process is to load this file from our main library code, for example:

# lib/faye/websocket.rb

require File.expand_path('../../faye_websocket', __FILE__)

module Faye
  module WebSocket
    # all your Ruby logic
  end
end

And that’s all there is to it. Just a few points to remember:

  • By convention, rake-compiler expects extensions to be in ext/
  • Make sure the C source and extconf.rb are included in the gemspec
  • Don’t put compilation output in the gem or in source control
  • Remember to recompile between editing C code and running tests

In the next article I’ll show you how to add JRuby support to your native gem.

Why you should never use hash functions for message authentication

The general thrust of this post is: use a MAC function like HMAC to sign data, don’t use hash functions. Although not all hash functions suffer from the problem I’m going to illustrate, in general using a hash function for message authentication comes with a lot of potential problems because those functions aren’t designed for this task. You shouldn’t try to work around it by creatively processing the inputs or inventing some fancy way of chaining hash functions. Just use the functions that were designed for this task instead of inventing your own crypto schemes.

This morning I was reading an article on the Intridea blog about signed idempotent action links. These are links you can embed in emails to your users that let them perform authenticated actions on your site, without going through a login screen or any other interactions. This technique relies on embedding action data in the link, for example an email from Twitter could include your user ID and the name of another user, and following the link would make you follow that user on Twitter.

But these parameters are typically easily guessable and so we must provide an authentication mechanism, otherwise anyone can construct URLs to modify other accounts however they like. To do this, we ‘sign’ the action data by combining it with a secret value, for example a global secret token stored on our servers, or the user’s password hash to produce a ‘tag’. This ‘tag’ does not reveal the secret values used in its construction but lets us verify that the link was generated by our site and for the correct user. The article gives the following methods you could add to a user class to let them sign and verify actions, assuming the user has an id and a secret_token. (I don’t mean to pick on Intridea, I’ve seen a lot of other people make this mistake and I only found out recently why you shouldn’t do it.)

class User
  def sign_action(action, *params)
    Digest::SHA1.hexdigest(
      "--signed--#{id}-#{action}-#{params.join('-')}-#{secret_token}"
    )
  end
  
  def verify(signature, action, *params)
    signature == sign_action(action, *params)
  end
end

This combines the action data with a secret token and calculates the SHA-1 hash; this hash is then appended to the URL to go in the email, and verified when the link is followed. Seems fine, right? SHA-1 doesn’t reveal anything about its input, so the secret is safe. This might be true (modulo the existence of rainbow tables) but hash functions still don’t provide any guarantee that a (message,tag) pair is genuine. To see why, we need to consider how hash functions work.

Many hash functions are based on something called the ‘Merkle-Damgard iterated construction’, which works as follows. Say you have a long string you want to hash:

+--------------------------+
| abcdefghijklmnopqrstuvwx |
+--------------------------+

Your message gets split into fixed-size blocks like this:

+--------+  +--------+  +--------+  +--------+
| abcdef |  | ghijkl |  | mnopqr |  | stuvwx |
+--------+  +--------+  +--------+  +--------+

It is then prefixed with a value called the ‘initialization vector’ or IV, which is a global constant that’s part of the hash function’s definition. This IV is the same size as the message blocks:

    IV          M0          M1          M2          M3
+--------+  +--------+  +--------+  +--------+  +--------+
| Zn8AGy |  | abcdef |  | ghijkl |  | mnopqr |  | stuvwx |
+--------+  +--------+  +--------+  +--------+  +--------+

This sequence is then folded using a compression function h(). The details of h() depend on the hash function, but the only thing that concerns us here is that the compression function takes two blocks and returns another block of the same size.

So, first we take IV and M0 and compute h(IV,M0) to get H0. Then we take H0 and M1 and compute H1 = h(H0,M1) and so on down the chain.

    IV          M0          M1          M2          M3
+--------+  +--------+  +--------+  +--------+  +--------+
| Zn8AGy |  | abcdef |  | ghijkl |  | mnopqr |  | stuvwx |
+--------+  +--------+  +--------+  +--------+  +--------+
     |           |           |           |           |
     |         +-+-+       +-+-+       +-+-+       +-+-+
     +-------->| h |--H0-->| h |--H1-->| h |--H2-->| h |--> H3
               +---+       +---+       +---+       +---+

Whatever value comes out the end of the chain is the result of the hash function; this is how hash functions take arbitrary-size input and produce fixed-size output. But what does it mean for message authentication? Well say you have a (message,tag) pair, like our action params and SHA-1 hash from above. The construction of hash functions means that if you know the value of hash(string), you can easily work out the value of hash(string + modification) without knowing what string is: you just take the hash you already have, and do some more rounds of Merkle-Damgard with your modification. So, if we’re signing data by doing this:

tag = sha1(secret + message)

we’re then going to send (message,tag) over the wire in an email. The Merkle-Damgard construction means an attacker can take these values and easily compute the following:

fake_tag = sha1(secret + message + modification)

The attacker can do this without knowing secret, and can therefore construct new (message,tag) pairs that look genuine to the application. The creation of new pairs by untrusted parties is called an extension attack, and common hash functions are vulnerable to it.
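To make this concrete, here's a toy Merkle-Damgard hash with a made-up compression function. This is an illustration of the structure, not a real hash; real functions also pad the message with its length, so a real attack has to append that padding too, but the principle is identical. Because the tag is just the final chain state, an attacker who knows only the tag can keep folding new blocks into it:

```ruby
BLOCK = 4
IV    = 0x5F3759DF

# A made-up compression function -- for illustration only. It takes the
# current state and a message block and returns a new state of the same size.
def compress(state, block)
  ((state ^ block) * 31 + 7) % (2**32)
end

# Split a byte array into BLOCK-sized big-endian integers, zero-padded.
def blocks(bytes)
  padded = bytes + [0] * ((-bytes.length) % BLOCK)
  padded.each_slice(BLOCK).map { |slice| slice.pack('C4').unpack('N').first }
end

# The Merkle-Damgard chain: fold the blocks through compress(), starting
# from the IV -- or from any chain state we already have.
def toy_hash(bytes, state = IV)
  blocks(bytes).reduce(state) { |s, b| compress(s, b) }
end

secret  = 'sekrit!!'.bytes   # the attacker never sees this
message = 'follow=1'.bytes
tag     = toy_hash(secret + message)

# Extension attack: resume hashing from the published tag alone.
extension  = '&admin=1'.bytes
forged_tag = toy_hash(extension, tag)
real_tag   = toy_hash(secret + message + extension)
forged_tag == real_tag   # => true, with no knowledge of the secret
```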

Fortunately, there’s a function that is resistant to extension attacks and is designed precisely for signing messages: HMAC, short for Hash-based Message Authentication Code. In Ruby you can use the OpenSSL module from the standard library for this:

sha1 = OpenSSL::Digest::Digest.new('sha1')
tag  = OpenSSL::HMAC.hexdigest(sha1, secret_token, message)

Although HMAC is based on hash functions, it’s constructed in a way that prevents extension attacks and other classes of ‘existential forgery’, i.e. people constructing fake-but-valid (message,tag) pairs. If you’re signing data, you should use it.
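Applied to the earlier User example, the signing method might look like this. It's a sketch: the constructor is hypothetical, added only to make the snippet self-contained.

```ruby
require 'openssl'

class User
  attr_reader :id, :secret_token

  # Hypothetical constructor, for illustration only
  def initialize(id, secret_token)
    @id, @secret_token = id, secret_token
  end

  def sign_action(action, *params)
    message = "#{id}-#{action}-#{params.join('-')}"
    # The secret is used as the HMAC key rather than concatenated into the
    # hashed input, which is what defeats extension attacks
    OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('sha1'), secret_token, message)
  end
end
```

verify() keeps the same shape as before, comparing the supplied tag against sign_action()'s output.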

But wait: there’s more! Take another look at the verify() function:

def verify(signature, action, *params)
  signature == sign_action(action, *params)
end

See anything wrong with it? It’s actually vulnerable to a timing attack: the String#== method as commonly implemented compares strings character-by-character (or byte-by-byte) and exits with false as soon as it finds an unequal pair. This fact means that an attacker can determine the first correct character of the tag by submitting requests to a signed URL with a different first character in the tag each time, and stopping when the request takes a little longer than usual. After guessing the first character they can move onto the second, and so on until they’ve guessed the whole correct tag.

The easiest way to defeat this attack is, instead of directly comparing two strings, compare their mappings under a collision-resistant hash function:

def verify(signature, action, *params)
  expected = Digest::SHA1.hexdigest(sign_action(action, *params))
  actual = Digest::SHA1.hexdigest(signature)
  expected == actual
end

This strategy means that if any part of the supplied tag is wrong, the two hashed values will probably differ at their first character. Even when they don’t, you’ve destroyed the linear relationship between how much of the tag is correct and how long the comparison takes.
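Another common defence, instead of hashing both sides, is a comparison that XORs every byte pair and only inspects the accumulated difference at the end, so its running time doesn't depend on where the strings first differ. This is the approach many libraries now ship as a secure_compare helper; a minimal sketch:

```ruby
# Constant-time comparison: accumulate the XOR of every byte pair and only
# check the result at the end, so a mismatch at position 0 takes just as
# long as a mismatch at the last position.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize
  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end
```

The early return on mismatched lengths is fine here, since valid tags always have the same fixed length.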

Finally, you should make sure your application does not exit early if the tag is invalid. You should do all the data processing you would normally do, just short of modifying the database, and check the tag last. If you return early you risk another timing attack.

All the information in this post I learned from Stanford’s introduction to cryptography, which is an excellent six-week primer on the topic. It’s a little mathsy but not nearly as much as it could be, meaning it’s fairly accessible and focuses on practical problems and intuition, and is really useful to anyone handling user data for a living. I’d go so far as to say it should be required reading for any web developer. It starts up again on Monday June 11: go sign up and make sure you’re not the next victim of the password leaks we’ve seen this week.

Source map support added to Packr and Jake

I’m a little late announcing this, in fact I held off until I’d been using this code for a couple of months to make sure there weren’t any glaring surprises. But the good news is, all the JavaScript libraries I ship from now on will come with source maps, and yours can too.

What’s a source map, you ask. Well, as this article explains, a source map is just a metadata file that maps locations in a minified (or otherwise compiled) JavaScript file back to locations in the source code the author is working on. This means you can load minified/compiled code into a browser, and when log messages or exceptions occur, the browser can show you the filename and line number in your source code that such events come from, rather than somewhere in line 1 of whatever huge bundle of code you’re actually deploying. This is really useful.

Telling the browser about this metadata is really easy. If you look at the latest release of Faye, you’ll see it contains three client-side files:

  • faye-browser.js – the source code
  • faye-browser-min.js – the minified code
  • faye-browser-min.js.map – the source map

faye-browser-min.js is what gets loaded in the browser. At the end of the file is a single line comment that looks like this:

//@ sourceMappingURL=faye-browser-min.js.map

This line tells the browser to use the file faye-browser-min.js.map as the source map for the JavaScript file that includes this comment. sourceMappingURL is resolved relative to the URL of the containing script, so as long as the script and the map are in the same directory this directive works fine.

The file faye-browser-min.js.map looks like this:

{
  "version": 3,
  "file": "faye-browser-min.js",
  "sourceRoot": "",
  "sources": ["faye-browser.js"],
  "names": ["_advice", "_callback", "_callbacks", "_cancelled", "_cbCount", "_channels", ...],
  "mappings": "AAAA,IAAI,MAAQ,OAAO,QAAU,SAAW,QACxC,GAAI,OAAO,UAAY,WAAY,OAAO,KAAO,KAEjD,KAAK..."
}

The sources field refers to the source code the map relates to: a source map can describe how one file was concatenated from a set of source files. The browser uses the sources and mappings data to show you the source code instead of the minified code when you use the web inspector, and it only loads the map and the source code if you have the inspector open so it doesn’t add latency for normal users.
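The mappings string itself is a run of Base64 VLQ-encoded integers, as defined by the version 3 source map format. Purely to illustrate the encoding, a minimal decoder for a single segment might look like this:

```ruby
# Decode one Base64 VLQ segment from a source map's "mappings" field.
# Each character carries 5 data bits plus a continuation bit; the low bit
# of each assembled value is its sign.
B64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

def decode_vlq(segment)
  values, value, shift = [], 0, 0
  segment.each_char do |char|
    digit  = B64.index(char)
    value |= (digit & 0x1f) << shift
    if digit & 0x20 != 0   # continuation bit set: more digits follow
      shift += 5
    else
      values << (value.odd? ? -(value >> 1) : value >> 1)
      value, shift = 0, 0
    end
  end
  values
end

decode_vlq('AAAA')  # => [0, 0, 0, 0]
decode_vlq('AAgBC') # => [0, 0, 16, 1]
```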

Now if you want to use this new browser feature on your own projects, I have a couple of tools you should look at. First is Packr, my Ruby port of Dean Edwards’ Packer tool. Say you have two source files:

# example_a.js

1. // When the minified code is loaded into a browser, you should see the call to
2. // console.log() attributed to example_a.js:4
3. 
4. console.log('Hello from file A');

# example_b.js

1. var display = function(message) {
2.   alert(message + ' from file B');
3. };
4. 
5. display('Ahoy there');

You can run this in the shell:

packr example_a.js example_b.js -o min/example-min.js -h '/* Copyright 2012 */' --shrink-vars

And it will generate a minified file and a source map for you:

# example-min.js

/* Copyright 2012 */
console.log('Hello from file A');var display=function(a){alert(a+' from file B')};display('Ahoy there');
//@ sourceMappingURL=example-min.js.map

# example-min.js.map

{
  "version": 3,
  "file": "example-min.js",
  "sourceRoot": "",
  "sources": ["../example_a.js", "../example_b.js"],
  "names": ["message"],
  "mappings": ";AAGA,QAAQ,KAAK,MAAM,KAAK,KAAK,ICH7B,IAAI,QAAU,SAASA,GACrB,MAAMA,IAAY,KAAK,KAAK,KAG9B,SAAS,KAAK;"
}

Packr lets you remove whitespace, compress variable names, use base-62 encoding, insert header comments on minified output, and can be used as a CLI or as a Ruby library. See the readme for more info.

Second is a tool I’ve been using to build all my JavaScript projects for a few years now, Jake. Jake is built on Packr but aimed at larger projects: it lets you describe the structure of a project using a simple YAML file and build the whole thing with one shell command. Based on Packr’s source map support, I’ve added the ability to tell Jake to generate source maps for one build based on the output of another. For example, say your build produces both unminified (src) and minified (min) copies of your code; then you can tell Jake to generate a source map for the min build that refers to code in the src build as follows:

---
builds:
  src:
    minify: false
  min:
    minify: true
    shrink_vars: true
    source_map: src

The source_map: src line means, generate a source map for min files that refers to locations in the corresponding src file. This is how the Faye files above are generated. Again, the Jake readme has more information.

We’re likely to see source map support in other tools soon; adding it to Packr was fairly easy because of Packr’s very simple model for compressing code. Basically it’s a regex-replacement-based compressor rather than a full JavaScript parser, so source maps could be added as a post-processing step rather than being based on AST annotations requiring deep integration. Unfortunately this means that if you like omitting semicolons from your code you might have to wait a little longer to use this stuff.

Faye 0.8: the refactoring

I’m pleased to finally announce the release of Faye 0.8 after a few months of reorganising the 0.7 codebase to make it more modular, and splitting parts of it out into separate projects. Before I get to what’s changed, I’m going to get the API changes out of the way: this is the stuff you need to know if you’re upgrading.

I hate introducing API changes but I’m afraid these really couldn’t be avoided. They’re really configuration changes so you shouldn’t need to change a lot of code. Please get on the mailing list if you have problems.

First, if you’re running the Ruby server you need to tell it which web server you’re using. Faye now supports the Thin, Rainbows and Goliath web servers, and you need to tell the WebSocket layer which set of adapters to load. For a ‘hello world’ app this looks like this:

# config.ru
require 'faye'
Faye::WebSocket.load_adapter('thin')

app = Faye::RackAdapter.new(:mount => '/faye', :timeout => 25)

run app

Depending on whether you run your application with rackup or thin, the load_adapter call might not be strictly necessary, but better to have it in there just in case. See the Faye Ruby docs and the faye-websocket documentation on how to run your app with different Ruby servers.

Second, if you use the Redis backend, you need to install a new library and change the engine.type setting in your server. Instead of specifying the name of the engine, this field now takes a reference to the engine object. In Ruby that looks like this:

# config.ru
# First run: gem install faye-redis

require 'faye'
require 'faye/redis'

bayeux = Faye::RackAdapter.new(
  :mount   => '/',
  :timeout => 25,
  :engine  => {
    :type  => Faye::Redis,
    :host  => 'redis.example.com',
    # more options
  }
)

And on Node:

// First run: npm install faye-redis

var faye  = require('faye'),
    redis = require('faye-redis');

var bayeux = new faye.NodeAdapter({
  mount:    '/',
  timeout:  25,
  engine: {
    type:   redis,
    host:   'redis.example.com',
    // more options
  }
});

Apart from this, Faye 0.8 should be backward-compatible with previous releases.

Having got the administrivia out of the way, what’s new? Well, the main focus of the 0.8 release is modularity. Two major components of Faye – the WebSocket support and the Redis engine – have been split off into their own packages. I’ve been blogging here already about faye-websocket (rubygem, npm package), and the major work over the last few months has gone into making this a really solid WebSocket and EventSource implementation. Because of its new-found freedom outside the main Faye project, it’s been adopted by other projects, notably SockJS, Cramp and Poltergeist. This adoption, particularly from SockJS, has meant more feedback, bug fixes and performance improvements and resulted in a really solid WebSocket library that improves Faye’s performance.

This new library has had a beneficial impact on Faye’s transport layer. faye-websocket is much faster than Faye’s 0.7 WebSocket code, supports EventSource and new WebSocket features, and runs on more Ruby servers: Faye 0.7 was confined to Thin whereas it now also runs on Rainbows and Goliath. On top of this, Faye 0.8 adds a new EventSource-based transport to support Opera 11 and browsers where proxies block WebSocket, and improves how it uses WebSocket. Previously, Faye’s WebSocket transport used the same polling-based /meta/connect cycle as other transports. It was faster than HTTP, but not optimal. Faye 0.8 now breaks out of this request/response pattern and pushes messages to WebSocket and EventSource connections as soon as they arrive at the server, without waiting to return the /meta/connect poll. This results in lower latency, particularly when delivering messages at high volume.

The second major change is that the Redis engine is now a separate library, faye-redis (rubygem, npm package). This has two important benefits. First, the main Faye package no longer depends on Redis clients, and in particular the Node version no longer depends on packages with C extensions, so no compiler is needed to install it. Second, it means Faye’s backend is now totally pluggable and third parties can implement their own engines: the API is thoroughly documented for Ruby and Node. The engine is a small piece of code (for example here’s the Ruby in-memory engine) but it really defines Faye’s behaviour. This layer is not concerned with transport negotiation (HTTP, WebSocket, etc) or even the Bayeux protocol format, it just implements the messaging business logic and stores the state of the system. You can easily implement your own engine to run on top of another messaging/storage stack, or change the messaging semantics if you like. Faye has a complete set of tests you can run to check your engine – see the Ruby and Node projects for examples.

Finally, there have been a couple of changes to the client. We’ve switched from exponential backoff to a fixed interval for reconnecting after the client loses its connection to the server, and this interval is configurable using the retry setting:

// Attempts to reconnect every 5 seconds
var client = new Faye.Client('http://example.com/bayeux', {
  retry: 5
});

You can also set headers for long-polling requests on the client; this is useful for talking to OAuth-protected servers, for example:

client.setHeader('Authorization', 'OAuth ' + accessToken);

And finally, there’s a new server-side setting, ping, that controls how often the server sends keep-alive data over WebSocket and EventSource connections. This data is ignored by the client, but helps keep the connection open through proxies that like to kill idle connections.

// Sends pings every 10 seconds
var bayeux = new faye.NodeAdapter({
  mount: '/bayeux',
  ping:  10
});

And that just about wraps things up for this release. Since 0.7.1, the main Faye codebase has shed over 2,000 lines of code into other projects that we can easily ship incremental updates to without affecting Faye itself. It’s more performant, leaner and more modular and I know there are already projects doing cool things with it. If you’re using Faye for interesting projects, I’d love to hear from you on the mailing list.

faye-websocket 0.3: EventSource support, and two more Ruby servers

The latest iteration of faye-websocket has just been released for Ruby and Node, probably the last release before I get back to making progress on Faye itself. It contains two major new features: EventSource support, and support for the Rainbows and Goliath web servers for the Ruby version.

EventSource is a server-push protocol that’s supported by many modern browsers. On the client side, you open a connection and listen for messages:

var es = new EventSource('http://example.com/events');
es.onmessage = function(event) {
  // process event.data
};

This sends a GET request with Accept: text/event-stream to the server and holds the connection open; the server can then push messages to the client via a streaming HTTP response with Content-Type: text/event-stream. It’s a one-way connection so the client cannot send messages to the server over this connection; it must use separate HTTP requests. However, EventSource uses a much simpler protocol than WebSocket, and looks more like ‘normal’ HTTP so it has less trouble getting through proxies.
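Incidentally, the wire format the server streams back is line-based and very simple: each message is one or more data: lines terminated by a blank line. Here’s a sketch of the framing – this follows the format defined by the EventSource spec rather than faye-websocket’s actual code, and CRLF line endings are shown though a lone LF is equally valid:

```javascript
// Frame a message for a text/event-stream response. Multi-line payloads
// become several "data:" lines; a blank line terminates the event.
function sseFrame(data, id) {
  var frame = '';
  if (id !== undefined) frame += 'id: ' + id + '\r\n';
  frame += data.split(/\r\n|\r|\n/).map(function(line) {
    return 'data: ' + line;
  }).join('\r\n');
  return frame + '\r\n\r\n';
}
```

The optional id field lets a reconnecting client resume from where it left off, via the Last-Event-ID request header.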

On the server side, supporting this requires a lot of the same code as WebSocket, and I might use it in Faye later on, so I decided to add support for it in faye-websocket. In your Rack app, you can now easily handle EventSource connections using an API similar to WebSocket:

require 'faye/websocket'

App = lambda do |env|
  if Faye::EventSource.eventsource?(env)
    es = Faye::EventSource.new(env)

    # Periodically send messages
    loop = EM.add_periodic_timer(1) { es.send('Hello') }
    
    es.onclose = lambda do |event|
      EM.cancel_timer(loop)
      es = nil
    end

    # Async Rack response
    es.rack_response
  
  else
    # Normal HTTP
    [200, {'Content-Type' => 'text/plain'}, ['Hello']]
  end
end

Just like WebSocket, EventSource is designed as a convenient wrapper around the Rack environment and underlying TCP connection that deals with the wire protocol and connection details for you. It tries not to make any assumptions or force constraints on your application design. There are a lot of WebSocket libraries around whose interfaces look more like Rails controllers: black-box Rack components or full-stack servers that hide the socket object and force you to respond to WebSockets on one endpoint, and normal HTTP on another. Faye needs to be able to speak many different transport protocols over a single endpoint, which is why this library is designed to be usable inside any Rack adapter while leaving routing decisions up to you.

The benefit of interacting with the sockets as first-class objects is that you can pass them to other parts of your application, which can deal with them as a simple abstraction. For example, if your application just needs to push data to the client, as many WebSocket apps do, you can maintain a socket pool full of objects that respond to send(). When the application wants to push data, it selects the right connection and calls send() on it, without worrying whether it’s a WebSocket or EventSource connection. This gives you some flexibility around which transport your client uses.
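A minimal sketch of that pool idea – the class and method names here are invented for illustration, not part of faye-websocket:

```javascript
// A minimal connection pool: it only assumes each stored object responds to
// send(), so WebSocket and EventSource connections can be mixed freely.
var SocketPool = function() { this._sockets = {}; };

SocketPool.prototype.add = function(userId, socket) {
  this._sockets[userId] = socket;
};

SocketPool.prototype.remove = function(userId) {
  delete this._sockets[userId];
};

SocketPool.prototype.deliver = function(userId, message) {
  var socket = this._sockets[userId];
  if (socket) socket.send(message);  // transport-agnostic push
};
```

Because both connection types expose send(), the application code that pushes data never needs to know which transport a given client negotiated.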

The Node API to this is very similar and both are fully documented on GitHub: Ruby docs, Node docs.

The final big change is that the Ruby version now works under a broader range of web servers; it now supports Rack apps running under Thin, Rainbows and Goliath. Hopefully, providing a portable socket implementation that’s easy to drop into any Rack app will open up possibilities for more portable async app frameworks, decoupling the application from the network transport just as Rack did for HTTP. faye-websocket extracted its initial Thin and Rainbows adapters from the Cramp project, and there’s a chance Cramp can now remove its WebSocket code and rely on a simple abstraction for binding the app framework to the web.

As usual, download from Rubygems and npm, and get on the Faye mailing list if you have problems.

Black-box criteria

Tim Bray recently published an article called Type-System Criteria, in which he makes the argument that Java, or statically-typed languages in general, is better suited to mobile development than the dynamically-typed languages that are more prevalent in web development circles. The reason he gives for this boils down to API surface size:

Another observation that I think is partially but not entirely a consequence of API scale is testing difficulty. In my experience it’s pretty easy and straightforward to unit-test Web Apps. There aren’t that many APIs to mock out, and at the end of the day, these things take data in off the wire and emit other data down the wire and are thus tractable to black-box, in whole or in part.

On the other hand, I’ve found that testing mobile apps is a major pain in the ass. I think the big reason is all those APIs. Your average method in a mobile app responds to an event and twiddles APIs in the mobile framework. If you test at all completely you end up with this huge tangle of mocks that pretty soon start getting in the way of seeing what’s actually going on.

The argument goes that, as the API surface you need to integrate with becomes larger, so static type systems become more attractive. I don’t disagree, in part because I don’t have nearly enough experience with static languages to have an informed opinion on them. But at a gut level I believe this to be true; in fact, I’d be willing to bet that a majority of the bugs I’ve written while refactoring software could have been caught by a static type checker (and not even a very sophisticated one, at that).

But the excerpt I quoted above contains a code smell, and it points to another reason why mobile development is difficult. It’s not the size of the APIs that’s the big problem: it’s the nature of the application.

Web application servers are comparatively easy to test because the tests can be written by talking to an encapsulated black box. You throw a request (or several) at a web server, you read what comes back, and check it looks like what you expected. On the other hand, testing web application clients is much more complex: instead of doing simple call/response testing, you have to initiate events within the application’s environment, and then monitor changes to that environment that you expect the events to cause. The core difference here is that client-side programs tend to be what I’m going to refer to as ‘stateful user interfaces’, and mobile (and desktop) software falls into the same category.

What exactly do I mean by ‘stateful user interface’? When you call a web server, you don’t need to hold onto any state on your end: you ask the server a question by sending it a request, and it sends back a fully-formed, self-contained response. When you’ve checked that response, you throw it away and start the next test. In contrast, stateful user interfaces are long-running processes in which incremental changes are made to what the user sees. Instead of getting a fresh new page, just a part of the view is changed, or a sound is emitted, or a notification generated, or a vibration initiated. The programming paradigm in a server environment emphasises call/response, statelessness and immutability; in a client environment you have side effects, state and incremental change. Testing in such environments is hard.

I think this, rather than large API surface, is the real problem. Large API surfaces are only a problem if your application code talks to them directly, and this is much more common in side-effect-heavy applications. Unit tests in these environments tend to be messy for several reasons:

  • Application code responds to events triggered by the host environment
  • Business logic produces its output by modifying the host environment rather than returning values
  • It is hard or impossible to reset the environment to a clean state between tests

The third reason is a particular problem when unit testing client-side JavaScript, and I’ve seen plenty of tests where the state of the page or the implementation of event listeners is such that it becomes very difficult to keep each test independent of the others. You also have the problem that anything that causes a page refresh will cause your test runner to vanish. (I wrote about this exact problem in Refactoring towards testable JavaScript.)

So if side-effect-heavy programs cause large API surfaces to be a problem, what should we do about it? The answer comes down to something I think of as ‘avoiding framework-isms’. This means that any time you have a framework or host environment in which user input or third-party code drives your application, the sooner you can dispatch to something you control the better. The classic example of this is the ‘fat model, skinny controller’ mantra popular in the Rails community: rather than dump lots of code in a controller that’s only invoked by the host server and framework, turn the request into calls to models. This way, the bulk of the logic is in objects that you control the interface to, and that are easy to create and manipulate, properties that also make them easy to test.

In client-side JavaScript and other stateful user interfaces, this means keeping event listeners small. Ideally an event listener should extract all the necessary data from the event and the current application state, and use this to make a black-box call to a module containing the real business logic. It means making sure orthogonal components of a user interface do not talk to each other directly, but publish data changes via a message bus. And it means writing business logic that returns results rather than causes side-effects; the side-effects again being dealt with by thin bindings to the host environment.
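As a sketch of that shape, here’s a pure validation module that returns results rather than touching the page; the module is trivially testable, while the event listener shrinks to a thin binding. All names here are invented for illustration:

```javascript
// Pure business logic: takes data in, returns a result, touches nothing else
function validateSignup(fields) {
  var errors = [];
  if (!/@/.test(fields.email || '')) errors.push('email');
  if ((fields.password || '').length < 8) errors.push('password');
  return errors;
}

// Thin binding: extract data from the event, delegate, apply side effects.
// (Assumes a browser environment; form and renderErrors are illustrative.)
//
// form.addEventListener('submit', function(event) {
//   event.preventDefault();
//   var errors = validateSignup({
//     email:    form.email.value,
//     password: form.password.value
//   });
//   renderErrors(errors);
// });
```

Only the commented binding needs a live DOM; everything worth testing runs from the command line.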

I’ll finish up with a small but illustrative example. Say you’re writing a WebSocket implementation, and the protocol mandates that when you call socket.send('Hello, world!') then the bytes 81 8d ed a3 88 c3 a5 c6 e4 af 82 8f a8 b4 82 d1 e4 a7 cc should be written to the TCP socket. You could write a test for it by mocking out the whole network stack (which I’ve probably glossed over considerably here):

describe WebSocket do
  before do
    @tcp_socket = mock('TCP socket')
    TCP.should_receive(:connect).with('example.com', 80).and_return @tcp_socket
    @web_socket = WebSocket.new('ws://example.com/')
  end
  
  it "writes a message to the socket" do
    @tcp_socket.should_receive(:write).with [0x81, 0x8d, 0xed, 0xa3, 0x88, 0xc3, 0xa5, 0xc6, 0xe4, 0xaf, 0x82, 0x8f, 0xa8, 0xb4, 0x82, 0xd1, 0xe4, 0xa7, 0xcc]
    @web_socket.send("Hello, world!")
  end
  
  # More mock-based protocol tests...
end

Or you could test it by implementing a pure function that turns text into WebSocket frames, leaving the code that actually deals with networking doing only that and nothing else:

describe WebSocket::Parser do
  before do
    @parser = WebSocket::Parser.new
  end
  
  it "turns text into message frames" do
    @parser.frame("Hello, world!").should == [0x81, 0x8d, 0xed, 0xa3, 0x88, 0xc3, 0xa5, 0xc6, 0xe4, 0xaf, 0x82, 0x8f, 0xa8, 0xb4, 0x82, 0xd1, 0xe4, 0xa7, 0xcc]
  end
  
  # More protocol implementation tests...
end

describe WebSocket do
  before do
    @tcp_socket = mock('TCP socket')
    TCP.should_receive(:connect).with('example.com', 80).and_return @tcp_socket
    
    @parser = mock('parser')
    WebSocket::Parser.should_receive(:new).and_return @parser
    
    @web_socket = WebSocket.new('ws://example.com/')
  end
  
  it "converts text to frames and sends them" do
    frame = mock('frame')
    @parser.should_receive(:frame).with("Hello, world!").and_return frame
    @tcp_socket.should_receive(:write).with(frame)
    @web_socket.send("Hello, world!")
  end
  
  # And we're done here
end

This separates the business logic (implementing the WebSocket protocol) away from the side effects to the host environment (writing to network connections). This results in code that’s more modular, much easier to test, and less coupled to the API surface of the host environment. If a static type system helps you with that then have at it, but recognize when it’s a symptom of a deeper problem.

faye-websocket 0.2: big performance boost, and subprotocol support

I’ve just released the 0.2 version of faye-websocket for Ruby and Node. This release benefits from the fact that the SockJS project is now using faye-websocket to handle WebSocket connections; my thanks to them for finding the performance bugs and missing features that went into making this release.

The biggest difference in this release is performance. In 0.1, the Node version had a rather interesting performance profile and was pretty slow. We’ve now made a bunch of optimisations that give it a more predictable performance profile across message sizes, and increase performance across the range by orders of magnitude.

The following benchmarks were produced using the ws client to send 1000 messages of various sizes to an echo server:

Benchmarks for Node 0.6.6

On Ruby, the change is not so dramatic, but for most message sizes performance is improved by at least a factor of 2. This is achieved in part by writing part of the parser in C; this being my first Ruby C extension, there may be problems with it, so please get on the mailing list if you find any.

Benchmarks for Ruby 1.9.3

There is one new feature in the form of Sec-WebSocket-Protocol support. With the latest WebSocket protocol, the one currently shipping in Chrome and Firefox, you can specify which application protocol(s) you want to use over the socket, for example:

// client-side
var ws = new WebSocket('ws://example.com/', ['irc', 'xmpp']);

On the server side, you can specify which protocols the server supports and the first of these that matches a protocol supported by the client will be selected and sent back to the client as part of the handshake.

// server-side
var ws = new WebSocket(request, socket, head, ['bayeux', 'irc']);
ws.protocol // -> 'irc'
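The selection rule is simple: take the server’s protocols in order and pick the first one the client also offered. A sketch of that negotiation logic (illustrative, not faye-websocket’s actual code):

```javascript
// Pick the first server-supported protocol that the client also offered;
// the result is echoed back in the Sec-WebSocket-Protocol response header.
function selectProtocol(serverProtocols, clientProtocols) {
  for (var i = 0; i < serverProtocols.length; i++) {
    if (clientProtocols.indexOf(serverProtocols[i]) >= 0)
      return serverProtocols[i];
  }
  return null;
}
```

With the server offering ['bayeux', 'irc'] and the client requesting ['irc', 'xmpp'], this yields 'irc', as in the example above.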

I’m really pleased with the progress on this project since decoupling it from Faye, and stoked that SockJS has adopted it. They’ve helped me improve things a great deal just in the last few days and it’s great to know these changes will go into making Faye faster.

Announcing faye-websocket, a standards-compliant WebSocket library

While announcing last week’s release of Faye 0.7, I mentioned a couple of things: first, Faye now exposes its WebSocket APIs for everyone to use, and second that I planned to extract some components in order to slim Faye down for the 0.8 release.

Well the first step in that process is now done, and faye-websocket packages are now available for Node and Ruby, installable using npm install faye-websocket and gem install faye-websocket respectively. This process has already shrunk the Faye codebase by over 2,300 LOC, but I suppose you’re still wondering why we need another WebSocket library.

It was never my intention to turn Faye into a plain-WebSocket project. I just implemented enough of the protocol to get Faye working, and left it at that. But recently I’ve taken a look around at the other packages for both Node and Ruby, and it turns out pretty much all of them suffer from at least one of the following problems:

  • They implement the protocol incompletely
  • They are coupled to a particular framework or larger system, rather than being stand-alone WebSocket libraries
  • They expose non-standard APIs to application code

The first is a simple matter of maintenance; I could contribute patches to projects to make them work. And it turns out you can’t take a piecemeal approach to implementing wire protocols: you need to deal with whatever might come your way. This is the Internet, after all. I found several libraries where sending them totally valid WebSocket data causes them to crash, simply because they are incomplete.

The second is somewhat trickier, depending on the nature of the project. Some projects are simple WebSocket implementations. Some are transport abstractions aiming to provide a socket-like interface on top of other protocols. And some are full-blown messaging systems with semantics over and above those provided by WebSockets. Faye falls into the latter camp, although its WebSocket code has always been decoupled from the rest of the codebase.

The third is harder to fix, though. If you change the API for a project, you affect all its users. Its author probably has reasons for the API they picked, reasons you weren’t privy to when examining potential use cases. Patches that change the user-facing API of a library are much less likely to be accepted.

But the problem posed by non-standard APIs is, to me, quite serious. The Web Standards movement came about because web developers wanted to write portable code, and have a reasonable expectation of that code working when delivered to users. The WebSocket API is part of that same effort, to provide Web authors with standard interfaces to code against in order to produce portable software. As I have ranted about at length before, ignorance of code portability constitutes a missed opportunity for anyone developing Node programs.

Every time somebody invents a new API for a standard piece of Web infrastructure, they make everyone else’s code less portable. The aim with faye-websocket is to keep the standard WebSocket API so that code developed for the browser can be reused on the server. For example, the Faye pub/sub messaging client uses a ‘transport’ object to send messages to the server; there’s a transport based on XMLHttpRequest, one based on JSON-P, one based on Node’s HTTP library, and one based on WebSockets. If using an HTTP transport, one must pick whether to use XMLHttpRequest or Node HTTP, based on the environment. With WebSockets, we can reuse the same transport class on the server and in the browser, because Faye’s WebSockets have the same API you get in the browser. This means less code to write, and more code you can automatically test from the command line.

And this doesn’t just apply to WebSocket clients. After the handshake, the WebSocket protocol is symmetric: both peers can send and receive messages with the same protocol operating in both directions. It seems to me that any code based on WebSockets ought to be equally happy handling a server-side connection or a client-side one, since both ends of the connection have exactly the same capabilities. For this reason, faye-websocket wraps server-side connections with the standard API:

var WebSocket = require('faye-websocket'),
    http      = require('http'),
    server    = http.createServer();

server.addListener('upgrade', function(request, socket, head) {
  var ws = new WebSocket(request, socket, head);
  
  ws.onmessage = function(event) {
    ws.send(event.data);
  };
  
  ws.onclose = function(event) {
    console.log('close', event.code, event.reason);
    ws = null;
  };
});

server.listen(8000);

Server-side connections transparently select between draft-75, draft-76 or hybi-07-compatible protocols and handle text, binary, ping/pong, close and fragmented messages. The client is hybi-08, but has been tested up to version 17.

Of course, nothing is ever perfect on the first release but faye-websocket does have one of the most complete protocol implementations around, and I’d like to see more implementations adopt the standard API. The standard interfaces might not be to your taste, but they benefit the ecosystem by giving everyone a predictable playing field. If you maintain a WebSocket project, please test it using the Autobahn test suite (here are Faye’s server and client results) and consider exposing the standard API to your users.

Faye 0.7: new event APIs and an open WebSocket stack

I’m very excited to announce the release of Faye 0.7, available now through gem and npm. This release focuses on two main areas: new event APIs for hooking into the framework, and a polished stand-alone WebSocket implementation.

Let’s deal with the new APIs first. The main one is an API for listening to what’s going on inside the engine. People have historically used extensions for various sorts of monitoring but there are certain events extensions can’t catch. For example, when a client session ends due to inactivity rather than because the client sent a /meta/disconnect message, there’s no way to detect that. Well now you can:

var bayeux = new Faye.NodeAdapter({mount: '/faye', timeout: 45});

bayeux.bind('disconnect', function(clientId) {
  // event listener logic
});

There’s a complete API for listening to all the major events in Faye’s pub/sub model, and I encourage you to use it over extensions if all you want is a little monitoring.

The second new event API is on the client side. Faye will always attempt to reconnect for you if the network connection is dropped, but often it’s useful to signal the connection loss to the user. You can now do this using these events on the client:

client.bind('transport:down', function() {
  // Fires when the connection is lost
});
client.bind('transport:up', function() {
  // Fires when the connection is established
});

You don’t need to reconnect the client yourself; these events are just there to notify you of what’s going on. And to give you even more feedback, publish() now supports the same Deferrable API as subscribe().
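The Deferrable pattern comes from EventMachine: the returned object lets you chain success and failure listeners, which fire immediately if the outcome is already known. Here’s a minimal sketch of how such an object behaves – an illustration of the pattern, not Faye’s internal class:

```javascript
// Minimal Deferrable: collects listeners while pending, replays the
// resolved arguments to listeners attached after resolution.
var Deferrable = function() {
  this._state     = 'pending';
  this._callbacks = [];
  this._errbacks  = [];
};

// Register a success listener; fires immediately if already succeeded
Deferrable.prototype.callback = function(fn) {
  if (this._state === 'succeeded') fn.apply(null, this._args);
  else this._callbacks.push(fn);
  return this;
};

// Register a failure listener; fires immediately if already failed
Deferrable.prototype.errback = function(fn) {
  if (this._state === 'failed') fn.apply(null, this._args);
  else this._errbacks.push(fn);
  return this;
};

Deferrable.prototype.succeed = function() {
  var args = this._args = arguments;
  this._state = 'succeeded';
  this._callbacks.forEach(function(fn) { fn.apply(null, args); });
};

Deferrable.prototype.fail = function() {
  var args = this._args = arguments;
  this._state = 'failed';
  this._errbacks.forEach(function(fn) { fn.apply(null, args); });
};
```

Under this API, a call shaped like client.publish('/channel', message).callback(fn).errback(fn) lets you react to the outcome of a publish.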

Now onto the WebSocket changes. The first of these is that Faye now includes a WebSocket client, and will use this instead of HTTP for server-side clients. The server-side HTTP transports now support cookies, and all transports support SSL. The WebSocket stack has received a lot of bug fixes based on the Autobahn test suite, and is now one of the most complete WebSocket implementations available for Node or Ruby. (Want proof? Here’s the server and client test report.)

Faye’s WebSocket stack also has the advantage of having been reasonably decoupled from the rest of the codebase since it was first introduced in version 0.5, and now I’ve taken the decision to open it up for everyone to use. It makes adding WebSocket support to existing Node and Rack apps very easy indeed; you can drop sockets right into your existing application and they expose the same API you use in the browser:

var http = require('http'),
    faye = require('faye');

var server = http.createServer();

server.addListener('upgrade', function(request, socket, head) {
  var ws = new faye.WebSocket(request, socket, head);

  ws.onmessage = function(event) {
    ws.send(event.data);
  };

  ws.onclose = function(event) {
    console.log('close', event.code, event.reason);
    ws = null;
  };
});

server.listen(8000);

The sockets implement the standard WebSocket API on both the server- and client-side. They handle text and binary data, and transparently handle ping/pong, close, and fragmented messages for you. The server supports a wide range of protocol versions and the client should be able to talk to any modern WebSocket server.

If, however, you don’t want to use WebSockets for your Faye client, you can easily switch them off:

client.disable('websocket');

I think the WebSocket tools are important because while there are plenty of messaging frameworks and transport abstractions around (e.g. libraries that present the WebSocket API on top of other wire transports), the state of pure WebSocket implementations seems to be somewhat lacking at the moment. Hopefully this can improve the situation as we move towards broader socket support across the web.

So that’s about it for the 0.7 release. Work is already underway on the 0.8 release, and the current plan is to focus on refactoring. The Faye distribution contains several components, such as the WebSocket code and the Redis backend, that can be broken off into separate projects. Indeed, work is already underway on extracting the WebSocket code for Ruby. As usual, this release should be backward compatible but if you have problems don’t hesitate to post on the mailing list.