Do I need DI?

This article is based on a talk of the same name, on the topic of dependency injection, which I gave at Eurucamp 2012. I thought I would write it up since it’s of some relevance to the current discussion about TDD, and I frequently find myself wanting to reiterate that talk’s argument. It is only related to current events insofar as I find them emblematic of most conversations about TDD; since the talk is nearly two years old it does not directly address the current debate but tries to illustrate a style of argument I wish were more common.

DHH’s keynote at RailsConf 2014 and his ensuing declaration that TDD is dead have got the Ruby community talking about testing again. My frustration with the majority of talks and writing on testing, TDD, agile development, software architecture and object-oriented design is that a great deal of it consists of generalised arguments, with very little in the way of concrete case studies. Despite being a perfectly capable programmer who has written automated tests for all sorts of software, I’ve sat through any number of talks on these topics that went right over my head. The impression I’m left with is that talks on TDD often only make sense if the audience already understands and agrees with the speaker’s argument. They are often littered with appeals to authority figures and books that only land if the listener knows the cited authority or has read the particular book. At worst, they are presented as ‘movements’, wherein all the old ways of thinking must be discarded in favour of the new one true way.

Now, grand general assertions make for entertaining and engaging talks, but they are unhelpful. Real-world software development is an exercise in design, and to design things based on general rules, without reference to the context of the problem at hand, is not to practice design at all. We need to learn to make decisions based on the properties of the problem we’re dealing with, not by appealing to advice handed down without any context.

This is why JavaScript Testing Recipes almost entirely concerns itself with the application of a few simple techniques to a wide array of real programs, including discussion of cases where those techniques are not applicable or are too costly due to the requirements of a particular problem. I wish more talks were like this, and it’s what I was hoping to do when I wrote this talk originally.

So, this article is not really about dependency injection per se, but about how you might decide when to use it. DI, much like TDD or any other programming support tool, is not de facto good or bad, but has pros and cons that determine when you might decide to use it. It is not even good or bad in the context of any defined problem; it is a design tool, and its usefulness varies according to the mental model of the person using it.

When we ask whether an object-oriented program is well-designed, we often use proxies for this question, like asking whether the code obeys the ‘SOLID principles’:

  • Single responsibility principle
  • Open/closed principle
  • Liskov substitution principle
  • Interface segregation principle
  • Dependency inversion principle

We might look at each class in the system, decide whether or not it obeys each of these principles, and if most of the classes in the system are SOLID then we declare the system to be well-designed. However, this approach encourages us to factor our code in a certain way without regard for what that code means, what it models, how that model is realised in code, and who it is intended to be used by. We should not treat these principles as normative rules, but as observations: programs that are easy to maintain tend to exhibit these properties more often than not, but they are not a barometer against which to accept or reject a design.

Across the fence from the people advocating for SOLID, and TDD, and various prescriptive architectural styles, are the people who want to tell you off for following these rules, assessing your designs by whether you’re practicing ‘true’ OO, or becoming an architecture astronaut, reinventing Java’s morass of AbstractBeanListenerAdapterFactory classes.

I don’t find the rhetoric of either of these camps helpful: both offer decontextualised normative advice based on statistical observations of systems they happen to have worked on or heard of. They don’t know the system you’re working on.

Your job is not to keep Alan Kay happy. Your job is to keep shipping useful product at a sustainable pace, and it’s on you to choose practices that help you do that.

So let’s talk about DI, and when to use it. To begin with, what is dependency injection? Take a look at the following code for a putative GitHub API client:

class GitHub::Client
  def get_user(name)
    u = URI.parse("https://api.github.com/users/#{name}")
    http = Net::HTTP.new(u.host, u.port)
    http.use_ssl = true
    response = http.request_get(u.path)
    if response.code == '200'
      data = JSON.parse(response.body)
      GitHub::User.new(data)
    else
      raise GitHub::NotFound
    end
  end
end

In order to get the account data for a GitHub user, this class needs to make an HTTP request to api.github.com. It does this by creating an instance of Net::HTTP, using that to make a request, checking the response is successful, parsing the JSON data that comes back, and constructing a user object with that data.

An advocate for TDD and DI might quibble with the design of this class: it’s too hard to test because there’s no way, via the class’s interface, to intercept its HTTP interactions with an external server – something we don’t want to depend on during our tests. A common response to this in Ruby is that you can stub out Net::HTTP.new to return a mock object, but this is often problematic: stubbing out singleton methods does not just affect the component you’re testing, it affects any other component of the system or the test framework that relies on this functionality. In fact, DHH’s pet example of bad DI uses the Time class, which is particularly problematic to stub out since testing frameworks often need to do time computations to enforce timeouts and record how long your tests took to run. Globally stubbing out Date is so problematic in JavaScript that JSTR dedicates a six-page section to dealing with dates and times.
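
To make the problem concrete, here is roughly what the global-stubbing approach looks like with rspec-mocks – just a sketch, with made-up doubles and response data:

describe GitHub::Client do
  it 'maps a 200 response to a user object' do
    http     = double('http')
    response = double('response', :code => '200', :body => '{"login":"jcoglan"}')

    # This stub is global: for the duration of the example, *every* caller of
    # Net::HTTP.new gets our fake, not just the GitHub::Client under test.
    allow(Net::HTTP).to receive(:new).and_return(http)
    allow(http).to receive(:use_ssl=)
    allow(http).to receive(:request_get).and_return(response)

    expect(GitHub::Client.new.get_user('jcoglan')).to be_a(GitHub::User)
  end
end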

Dependency injection is one technique for solving this problem: instead of the class reaching out into the global namespace and calling a singleton method to fabricate an HTTP client itself, the class is given an HTTP client as an argument to its constructor. That client offers a high-level API that makes the contract between the class and its dependency clearer. Say we want the GitHub::Client class to only deal with knowing which paths to find data at, and how to map the response data to domain objects. We could push all the concerns of parsing URIs, making requests, checking response codes and parsing the body – that is, all the concerns not specifically related to GitHub – down into a high-level HTTP client, and pass that into the GitHub::Client constructor.

class GitHub::Client
  def initialize(http_client)
    @http = http_client
  end

  def get_user(name)
    data = @http.get("/users/#{name}").parsed_body
    GitHub::User.new(data)
  end
end

This makes the class easier to test: rather than globally stubbing out Net::HTTP.new and mocking its messy API, we can pass in a fake client during testing and make simple mock expectations about what should be called.
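
For example – a sketch, assuming a response object with a parsed_body method as implied by the code above:

describe GitHub::Client do
  it 'fetches user data from the right path and wraps it in a domain object' do
    http   = double('http_client')
    client = GitHub::Client.new(http)

    # The fake only needs to honour the narrow contract the class relies on.
    response = double('response', :parsed_body => {'login' => 'jcoglan'})
    expect(http).to receive(:get).with('/users/jcoglan').and_return(response)

    expect(client.get_user('jcoglan')).to be_a(GitHub::User)
  end
end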

So dependency injection means passing in any collaborators an object needs as arguments to its constructor or methods. At its most basic, this means removing any calls to constructors and other singleton methods and passing in the required objects instead. But, this is often accompanied by refactoring the objects involved to make their interactions simpler and clearer.

So that’s what DI is, but what is it for? Many advocates will tell you that it exists to make testing easier, and indeed it does often accomplish that. But testability is not usually an end-user requirement for a component; being testable is often a proxy for being usable. A class is easy to test if it’s easy to set up and use, without requiring a complex environment of dependencies in order for the class to work properly. Applying dependency injection solely to achieve better tests is a bad move: design is all about dealing with constraints, and we must take all the applicable constraints into consideration when designing a class’s API. The patterns we use must emerge in response to the program’s requirements, rather than being dogmatically imposed in the name of ‘best practices’.

This is best illustrated by a concrete example. I maintain a library called Faye, a pub/sub messaging system for the web. It uses a JSON-based protocol called Bayeux between the client and the server; the client begins by sending a handshake message to the server:

{
  "channel":  "/meta/handshake",
  "version":  "1.0",
  "supportedConnectionTypes": ["long-polling", "callback-polling", "websocket"]
}

and the server replies with a message containing an ID for the client, and a list of transport types the server supports:

{
  "channel":    "/meta/handshake",
  "successful": true,
  "clientId":   "cpym7ufcmkebx4nnki5loe36f",
  "version":    "1.0",
  "supportedConnectionTypes": ["long-polling", "callback-polling", "websocket"]
}

Once it has a client ID, the client can tell the server it wants to subscribe to a channel by sending this message:

{
  "channel":      "/meta/subscribe",
  "clientId":     "cpym7ufcmkebx4nnki5loe36f",
  "subscription": "/foo"
}

and the server acknowledges the subscription:

{
  "channel":    "/meta/subscribe",
  "clientId":   "cpym7ufcmkebx4nnki5loe36f",
  "successful": true
}

After the client has subscribed to any channels it is interested in, it sends a ‘connect’ message that tells the server that it wants to poll for new messages:

{
  "channel":  "/meta/connect",
  "clientId": "cpym7ufcmkebx4nnki5loe36f"
}

When the server receives a message on any channel the client is subscribed to, it sends a response to the connect message with the new message attached. The client then sends another connect message to poll for more messages. (When using a socket-based connection, new messages can be pushed to the client immediately, without a pending connect request; the connect messages then act as a keep-alive heartbeat mechanism.)

So Faye is fundamentally a client-server system rather than a true peer-to-peer one. The architecture as we understand it so far is very simple:

                          +--------+
                          | Client |
                          +---+----+
                              |
                              V
                          +--------+
                          | Server |
                          +--------+

However there is more to it than that. Notice the supportedConnectionTypes field in the handshake messages: the client and server allow these messages to be sent and received via a number of different transport mechanisms, including XMLHttpRequest, JSON-P, EventSource and WebSocket. So we can add another layer of indirection to our architecture:

                          +--------+
                          | Client |
                          +---+----+
                              |
                              V
    +-----------+-------------+----------------+-------+
    | WebSocket | EventSource | XMLHttpRequest | JSONP |
    +-----------+-------------+----------------+-------+
                              |
                              V
                          +--------+
                          | Server |
                          +--------+

We now have a question: the Server class allows for multiple implementations of the same concept – sending messages over an HTTP connection – to co-exist, rather than having one mechanism hard-coded. Above, we allowed our GitHub::Client class to take an HTTP client as a parameter, letting us change which HTTP client we wanted to use. Surely we have another opportunity to do the same thing here, to construct servers with different kinds of connection handlers, including the possibility of using a fake connection handler to make testing easier?

xhr_server         = Server.new(XMLHttpRequestHandler.new)
jsonp_server       = Server.new(JSONPHandler.new)
eventsource_server = Server.new(EventSourceHandler.new)
websocket_server   = Server.new(WebSocketHandler.new)

If we’re doing DI by the book, this seems like the right approach: to separate the concern of dealing with the abstract protocol embodied in the JSON messages from the transport and serialization mechanism that delivers those messages. But it doesn’t meet the constraints of the problem: the server has to allow for clients using any of these transport mechanisms. The whole point of having transport negotiation is that different clients on different networks will have different capabilities, and the central server needs to accommodate them all. So, although there is a possibility for building classes that all implement an abstract connection handling API in different ways (and this is indeed how the WebSocket portion of the codebase deals with all the different WebSocket protocol versions – see the websocket-driver gem), the server must use all of these connection handlers rather than accepting one of them as a parameter.
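
In other words, the server’s shape is closer to this sketch, where the class and method names are hypothetical rather than Faye’s actual internals:

class Server
  def initialize
    # One of every handler: any client might connect with any transport.
    @handlers = [
      WebSocketHandler.new,
      EventSourceHandler.new,
      XMLHttpRequestHandler.new,
      JSONPHandler.new
    ]
  end

  def handle(request)
    # Dispatch each incoming request to whichever handler understands it.
    handler = @handlers.find { |h| h.accepts?(request) }
    handler.call(request)
  end
end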

What about the client side? The Client class uses only one of the four transport classes at a time, to mediate its communication with the server. The client depends on having a transport mechanism available, but the choice of which transport to use can be deferred until runtime. This seems like a more promising avenue for applying DI. So, should we make its API look like this?

var url    = 'http://www.example.com/faye',
    client = new Client(new WebSocketTransport(url));

Again, the answer is no. Allowing Client to accept a transport implementation as a constructor parameter gives the system that creates the client (whether that’s a user directly writing Faye client code or part of an application that makes use of Faye) a free choice over which transport to use. But, the choice of transport in this situation is not a free choice; it’s determined by the capabilities of the host browser runtime, the transports that the server says it supports, and which transports we can detect as actually working over the user’s network connection to the server – even if the client and server claim to support WebSocket, an intermediate proxy can stop WebSocket from working.

It is actually part of the client’s responsibility to choose which transport to use, based on automated analysis of these constraints at runtime. The same application, running on different browsers on different networks, will require different transports to be used, even throughout the lifetime of a single client instance. If the Client constructor took a transport as an argument, then the user of the Client class would have to conduct all this detective work themselves. Hiding this work is a core problem the client is designed to solve, so it should not accept a transport object via injection. Even though it would improve testability, it would result in the wrong abstraction being presented to the user.

Finally, let’s look at the internal structure of the Server class. It’s actually composed of multiple layers, and until version 0.6 those layers looked something like this:

    +--------------+
    |  RackServer  |    --> * WebSocket, XHR, JSON-P
    +-------+------+
            |                           |   message
            V                           V   objects
    +---------------+
    | BayeuxHandler |   --> * Bayeux message protocol
    +---------------+       * Connections / subscriptions
                            * Message queues

The RackServer deals with the front-end HTTP handling logic, processing all the different transport types and extracting JSON messages from them. Those JSON messages are then handed off to the BayeuxHandler, which contains all the transport-independent protocol logic. It processes the contents of the various JSON messages I showed you above, stores client IDs and their subscriptions and message queues, and routes incoming messages to all the subscribed clients. Now, some of the concerns of this class can be factored out:

    +--------------+
    |  RackServer  |    --> * WebSocket, XHR, JSON-P
    +-------+------+
            |                           |   message
            V                           V   objects
    +---------------+
    | BayeuxHandler |   --> * Bayeux message protocol
    +---------------+     +-------------------------------+
                          | * Connections / subscriptions |
                          | * Message queues              |
                          +-------------------------------+
                                        STATE

The BayeuxHandler class actually deals with two things: implementing the protocol as described by the JSON messages, and storing the state of the system – which clients are active, which channels they are subscribed to, and which messages are queued for delivery to which clients. There are many potential ways of implementing this state storage without changing the details of the protocol, and so in version 0.6 this concern was extracted into an Engine class:

    +--------------+
    |  RackServer  |    --> * WebSocket, XHR, JSON-P
    +-------+------+
            |                           |   message
            V                           V   objects
    +---------------+
    | BayeuxHandler |   --> * Bayeux message protocol
    +---------------+
            |
            V
    +--------------+
    |    Engine    |    --> * Subscriptions
    +--------------+        * Message queues

There are two engine implementations available: one using in-memory storage, and one backed by Redis, using Redis’s pub/sub mechanism for IPC.

                      +--------------+
                      |  RackServer  |
                      +-------+------+
                              |
                              V
                      +---------------+
                      | BayeuxHandler |
                      +---------------+
                              |
                    +---------+---------+
                    |                   |
            +--------------+     +-------------+
            | MemoryEngine |     | RedisEngine |
            +--------------+     +-------------+

Finally, we have an honest candidate for dependency injection: whether the stack uses the in-memory or Redis engine makes absolutely no difference to the rest of the stack. It’s a contained implementation detail: given any object that implements the Engine API correctly, the BayeuxHandler and all of the components that sit above it will not be able to tell the difference. The choice of engine is a free choice that the user can make entirely on their own terms, without being bound by environmental constraints as we saw with client-side transport negotiation.

However, we don’t have multiple choices at the BayeuxHandler layer of the stack: there is only one Bayeux protocol, it’s an open standard, and there aren’t multiple competing implementations of this component. It’s just in-process computation that takes messages extracted by the RackServer, validates them, determines their meaning and delegates any state changes to the engine.

So, BayeuxHandler can be parameterized on which engine object it uses, but the Server will always construct a BayeuxHandler (as well as any transport-handling objects it needs). A highly simplified version of RackServer that only deals with HTTP POST would look like this, taking an engine as input and handing it down to the BayeuxHandler:

# rack_server.rb

require 'json'
require 'rack'

class RackServer
  def initialize(engine)
    @handler = BayeuxHandler.new(engine)
  end

  def call(env)
    request  = Rack::Request.new(env)
    message  = JSON.parse(request.params['message'])
    response = @handler.process(message)
    [
      200,
      {'Content-Type' => 'application/json'},
      [JSON.dump(response)]
    ]
  end
end

A user would then start up the server like so:

server = RackServer.new(RedisEngine.new)

thin = Rack::Handler.get('thin')
thin.run(server, :Port => 80)

Now, in this scenario we get good tests – tests that make mock expectations about one object’s contract with another without stubbing globally visible bindings – as a side effect of the code fulfilling its design requirements, and of it being easy to use at the right level of abstraction for the problem it solves. Here are a couple of tests that assert that the server tells the engine to do the right things, and that the server relays information generated by the engine back to the client.

require './rack_server'
require 'rack/test'

describe RackServer do
  include Rack::Test::Methods

  let(:engine) { double('engine') }
  let(:app)    { RackServer.new(engine) }

  describe 'handshake' do
    let(:message) { {
      'channel' => '/meta/handshake',
      'version' => '1.0',
      'supportedConnectionTypes' => ['long-polling']
    } }

    it 'tells the engine to create a new client session' do
      expect(engine).to receive(:create_client).and_return 'new-client-id'
      post '/bayeux', :message => JSON.dump(message)
    end

    it 'returns the new client ID in the response' do
      allow(engine).to receive(:create_client).and_return 'new-client-id'
      post '/bayeux', :message => JSON.dump(message)
      expect(JSON.parse(last_response.body)).to include('clientId' => 'new-client-id')
    end
  end
end

Even when we do decide to use dependency injection, we face some trade-offs. By making the external interface for constructing an object more complicated, we gain some flexibility but lose some convenience. For example, a convenient way to read files looks like this:

File.read('path/to/file.txt')

while a flexible way to read files might look like this:

FileInputStream fis = new FileInputStream("path/to/file.txt");
InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
BufferedReader br = new BufferedReader(isr);
// ...

Yes, I have picked extreme examples, and I may still have got the Java version wrong. The important point is that you can build the former API on top of the latter, but not the reverse: you can wrap flexible building blocks in a convenience API – just look at jQuery or any UI toolkit – but going the other way is much harder, and often impossible. Suppose you want to use the file I/O code independently of the string encoding code: File.read(path) does not expose those building blocks, so you’re going to need to find them somewhere else.
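
As a sketch of the workable direction with the file example – the Files module here is hypothetical, a thin convenience wrapper over the flexible pieces:

module Files
  # The building blocks (File.open, an explicit encoding) stay available to
  # callers who need them; the common case collapses to a single call.
  def self.read(path, encoding = 'UTF-8')
    File.open(path, "r:#{encoding}") { |f| f.read }
  end
end

Files.read('path/to/file.txt')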

When I was at Songkick, we built the clients for our backend services much like the GitHub example above: each client is instantiated with an HTTP client adapter, letting us easily switch which HTTP library we used (yes, we had to do this in production and it took an afternoon), as well as letting us pass in a fake for testing. But for calling these clients from a controller, we wrapped them in a convenience module that constructed canonical instances for us:

module Services
  def self.github
    @github ||= GitHub::Client.new(HttpClient.new('https://api.github.com'))
  end

  # and so on for other services
end

So controller code just made a call like Services.github.get_user('jcoglan'), which is pretty much as convenient as ActiveRecord::Base.find().
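
In a hypothetical controller, that reads as:

class UsersController < ApplicationController
  def show
    # The controller neither knows nor cares which HTTP library sits
    # beneath the canonical client instance.
    @user = Services.github.get_user(params[:name])
  end
end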

To summarise, there is certainly a place for DI, or for any architectural technique, but you must let the requirements of the problem – not just testing concerns – drive the patterns you use. In the case of DI, I reach for it when:

  • There are multiple implementations of an abstract API that a caller might use
  • The code’s client has a free choice over which implementation to use, rather than environmental factors dictating this choice
  • I want to provide a plugin API, as in the case of Faye’s engine system or the protocol handlers in websocket-driver

Beyond DI, I would like to see much more of our design discussion focus on just that: design, in context, with case studies, without deferring to generalisation. Design patterns must be derived from the program’s requirements, not imposed in an attempt to shoe-horn the program into a predefined shape. Ultimately, rather than focussing on testing for its own sake, we must focus on usability. Testability will soon follow.

Building JavaScript projects with Make

As a long-time Ruby and JavaScript user, I’ve seen my share of build tools. Rake, Jake, Cake, Grunt, Gulp, the Rails asset pipeline… I’ve even invented one or two of my own. I’ve always wondered why every language ecosystem feels the need to invent its own build tools, and I’ve often felt like they get in my way. Too often, Rake tasks are just wrappers around existing executables like rspec or cucumber, or require custom glue code to hook a tool into the build system — witness the explosion of grunt-* packages on npm. I find Grunt particularly problematic; its configuration is verbose and indirect, and its plugins bake in assumptions that you cannot change. For example, grunt-contrib-handlebars currently depends on handlebars~1.1.2, when the current version of Handlebars is 1.3.0. It seems strange to have your build tools choose which version of your libraries you must use.

Most of the tasks we need to build our projects can be run using existing executables in the shell. CoffeeScript, Uglify, Browserify, testing tools, PhantomJS, all have perfectly good executables already and it seems silly to require extra tooling and plugins just so I can glue them together using Grunt. In the Node community we talk a lot about ‘the Unix way’, but the current crop of build tools don’t seem Unixy to me: they’re highly coupled, obfuscatory and verbose, wrapping particular versions of executables in hundreds of lines of custom integration code and configuration.

Fortunately, Unix has had a perfectly good build tool for decades, and it’s called Make. Though widely perceived as being only for C projects, Make is a general-purpose build tool that can be used on any kind of project. I’m currently using it to build my book, generating EPUB, MOBI and PDF from AsciiDoc and checking all the code in the book is successfully tested before any files are generated. (I’ve put my build process on GitHub if you’re interested.) While writing the book, I decided to use Make for the book’s example projects themselves, and found it to be a very quick way to get all the usual JavaScript build steps set up.

In this article I’m going to build a project that uses CoffeeScript source code and Handlebars templates, and is tested using jstest on PhantomJS. To start with, let’s cover the project’s source files. The contents of these files aren’t too important; the important thing is what we need to do with them to make them ready to run. There are a couple of .coffee files in the lib directory, just a small Backbone model and view:

# lib/concert.coffee

Concert = Backbone.Model.extend()

window.Concert = Concert

# lib/concert_view.coffee

ConcertView = Backbone.View.extend
  initialize: ->
    @render()
    @model.on "change", => @render()

  render: ->
    html = Handlebars.templates.concert(@model.attributes)
    @$el.html(html)

window.ConcertView = ConcertView

(You could use CommonJS modules instead of window globals and build the project with Browserify; hopefully, having read this article, it will be obvious how to add that to your project.)

And, we have a template for displaying a little information about a concert:

<!-- templates/concert.handlebars -->

<div class="concert">
  <h2 class="artist">{{artist}}</h2>
  <h3 class="venue">{{venueName}}, {{cityName}}, {{country}}</h3>
</div>

There are also a couple of test suites in spec/*.coffee that test the template and view from above:

# spec/concert_template_spec.coffee

JS.Test.describe "templates.concert()", ->
  @before ->
    @concert =
      artist:    "Boredoms",
      venueName: "The Forum",
      cityName:  "Kentish Town",
      country:   "UK"

    @html = $(Handlebars.templates.concert(@concert))

  @it "renders the artist name", ->
    @assertEqual "Boredoms", @html.find(".artist").text()

  @it "renders the venue details", ->
    @assertEqual "The Forum, Kentish Town, UK", @html.find(".venue").text()
# spec/concert_view_spec.coffee

JS.Test.describe "ConcertView", ->
  @before ->
    @fixture = $(".fixture").html('<div class="concert"></div>')

    @concert = new Concert
      artist:    "Boredoms",
      venueName: "The Forum",
      cityName:  "Kentish Town",
      country:   "UK"

    new ConcertView(el: @fixture.find(".concert"), model: @concert)

  @it "renders the artist name", ->
    @assertEqual "Boredoms", @fixture.find(".artist").text()

  @it "updates the artist name if it changes", ->
    @concert.set "artist", "Low"
    @assertEqual "Low", @fixture.find(".artist").text()

Now, to get from these files to working code, we need to do a few things:

  • Compile all the CoffeeScript to JavaScript
  • Compile all the Handlebars templates to a JS file
  • Combine the app’s libraries and compiled source code into a single file
  • Minify the bundled app code using Uglify
  • Run the tests after making sure all the files are up to date

Make is a great tool for managing these tasks, as they are essentially relationships between files. Make does not have any baked-in assumptions about what type of project you have or what languages you’re using, it’s simply a tool for organising sets of shell commands. It has a very simple model: you describe which files each build file in your project depends on, and how to regenerate build files if they are out of date. For example, if file a.txt is generated by concatenating b.txt and c.txt, then we would write this rule in our Makefile:

a.txt: b.txt c.txt
	cat b.txt c.txt > a.txt

The first line says that a.txt (the target) is generated from b.txt and c.txt (the dependencies). a.txt will only be rebuilt if one of its dependencies has changed since a.txt itself was last built; Make always tries to skip unnecessary work by checking the last-modified times of files. The second line (the recipe) says how to regenerate the target if it’s out of date, which in this case is a simple matter of piping cat into the target file. Recipe lines must begin with a tab, not spaces; I deal with this by adding the following to my .vimrc:

autocmd filetype make setlocal noexpandtab

Let’s start by installing the dependencies for this project. Add the following to package.json and run npm install:

{
  "dependencies": {
    "backbone":      "~1.1.0",
    "underscore":    "~1.5.0"
  },

  "devDependencies": {
    "coffee-script": "~1.7.0",
    "handlebars":    "~1.3.0",
    "jstest":        "~1.0.0",
    "uglify-js":     "~2.4.0"
  }
}

This installs all the build tools we’ll need, and all of them have executables that npm places in node_modules/.bin.

Let’s write a rule for building our Handlebars templates. We want to compile all the templates — that’s templates/*.handlebars — into the single file build/templates.js. Here’s a rule for this:

PATH  := node_modules/.bin:$(PATH)
SHELL := /bin/bash

build/templates.js: templates/*.handlebars
	mkdir -p $(dir $@)
	handlebars templates/*.handlebars > $@

The first line adds the executables from npm to the Unix $PATH variable so that we can refer to, say, handlebars by its name without typing out its full path. (Installing programs ‘globally’ just means installing them into a directory that is usually listed in $PATH by default.) The first line of the recipe uses mkdir to make sure the directory we’re compiling the templates into exists; $@ is a special Make variable that contains the pathname of the target we’re trying to build, and the dir function takes the directory part of that pathname.

The rule duplicates the names of the source and target files, and we often use variables to remove this duplication:

PATH  := node_modules/.bin:$(PATH)
SHELL := /bin/bash

template_source := templates/*.handlebars
template_js     := build/templates.js

$(template_js): $(template_source)
	mkdir -p $(dir $@)
	handlebars $(template_source) > $@

With this Makefile, running make in the shell will generate build/templates.js, or do nothing if that file is already up to date.

$ touch templates/concert.handlebars 

$ make
mkdir -p build/
handlebars templates/*.handlebars > build/templates.js

$ make
make: `build/templates.js' is up to date.

Next up, we need to compile our CoffeeScript. We want to say that every file lib/foo.coffee generates a corresponding file build/lib/foo.js, and likewise every file spec/foo_spec.coffee generates build/spec/foo_spec.js. We can use Make’s wildcard function to find the names of all the CoffeeScript files in lib and spec, and generate lists of JavaScript files from those names using pattern substitution. The expression $(files:%.coffee=build/%.js) means: for every filename in the list files, replace %.coffee with build/%.js, for example turning lib/foo.coffee into build/lib/foo.js. We also use a pattern-based rule to describe how to compile any CoffeeScript file to its JavaScript counterpart. Here are the rules:

source_files := $(wildcard lib/*.coffee)
build_files  := $(source_files:%.coffee=build/%.js)

spec_coffee  := $(wildcard spec/*.coffee)
spec_js      := $(spec_coffee:%.coffee=build/%.js)

build/%.js: %.coffee
	coffee -co $(dir $@) $<

We need to compute the names of all the generated JavaScript files because later targets will depend on them. If a target simply depended on build/*.js but we’d not built those files yet, the build wouldn’t work correctly. With this configuration, Make sets the variables to these values:

source_files := lib/concert.coffee lib/concert_view.coffee
build_files  := build/lib/concert.js build/lib/concert_view.js
spec_coffee  := spec/concert_template_spec.coffee spec/concert_view_spec.coffee
spec_js      := build/spec/concert_template_spec.js build/spec/concert_view_spec.js

So, Make now knows the names of all the generated files before they exist. The pattern rule states that any file build/foo/bar.js is generated from foo/bar.coffee, and uses coffee -co to compile each file into a given output directory. We use $@ as before to get the name of the current target, and $< gives the name of the first dependency of the current target. These variables are essential when dealing with pattern-based rules like this.

Pattern rules are invoked if Make has not been told explicitly how to build a particular file. If you run make build/lib/concert.js you’ll see that it generates the named file from the pattern rule.
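
For example, on a clean checkout:

$ make build/lib/concert.js
coffee -co build/lib/ lib/concert.coffee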

Now that we’ve compiled all our code, we can concatenate and minify it. We want to take all the files in build_files, and template_js, and any third party libraries we need, and use Uglify to compress them. A rule for this is straightforward; note how app_bundle depends on the upstream files so that if any of them change, Make knows it needs to rebuild app_bundle.

app_bundle := build/app.js

libraries  := vendor/jquery.js \
              node_modules/handlebars/dist/handlebars.runtime.js \
              node_modules/underscore/underscore.js \
              node_modules/backbone/backbone.js

$(app_bundle): $(libraries) $(build_files) $(template_js)
	uglifyjs -cmo $@ $^

Here we’ve used another of Make’s automatic variables: $^ is a list of all the dependencies, separated with spaces. It’s handy when all your recipe does is combine all the dependencies into an aggregate file.

It’s customary to make a target called all that depends on all your project’s compiled files, and to make this the first rule in the Makefile so that running make with no arguments will run this rule. The all rule is also what’s called ‘phony’: all is not the name of an actual file, just the name of a task we want to run, and so Make should not look for a file called all and check its last-modified time before proceeding. Targets are marked as phony by making them dependencies of the special .PHONY target.

.PHONY: all

all: $(app_bundle)

And finally, we need a task to run our tests. Let’s create a little web page for doing that:

<!-- test.html -->

<!doctype html>
<html>
  <head>
    <meta charset="utf-8">
    <title>jstest</title>
  </head>
  <body>

    <div class="fixture"></div>

    <script src="./build/app.js"></script>

    <script src="./node_modules/jstest/jstest.js"></script>
    <script src="./build/spec/concert_template_spec.js"></script>
    <script src="./build/spec/concert_view_spec.js"></script>

    <script>
      JS.Test.autorun()
    </script>

  </body>
</html>

and a PhantomJS script for launching this page and displaying the results:

// phantom.js

var JS = require("./node_modules/jstest/jstest")

var reporter = new JS.Test.Reporters.Headless({})
reporter.open("test.html")

and to top it all off, a Make task that runs the tests after making sure all the files are up to date (this task is also phony since it does not generate any files):

test: $(app_bundle) $(spec_js)
	phantomjs phantom.js

It’s also customary to add a phony task called clean that deletes any generated files from the project, putting it back in its ‘clean’ state:

clean:
	rm -rf build

So, the whole finished Makefile looks like this, containing instructions for compiling all the source code, building a single app bundle, and running the tests:

PATH  := node_modules/.bin:$(PATH)
SHELL := /bin/bash

source_files    := $(wildcard lib/*.coffee)
build_files     := $(source_files:%.coffee=build/%.js)
template_source := templates/*.handlebars
template_js     := build/templates.js
app_bundle      := build/app.js
spec_coffee     := $(wildcard spec/*.coffee)
spec_js         := $(spec_coffee:%.coffee=build/%.js)

libraries       := vendor/jquery.js \
                   node_modules/handlebars/dist/handlebars.runtime.js \
                   node_modules/underscore/underscore.js \
                   node_modules/backbone/backbone.js

.PHONY: all clean test

all: $(app_bundle)

build/%.js: %.coffee
	coffee -co $(dir $@) $<

$(template_js): $(template_source)
	mkdir -p $(dir $@)
	handlebars $(template_source) > $@

$(app_bundle): $(libraries) $(build_files) $(template_js)
	uglifyjs -cmo $@ $^

test: $(app_bundle) $(spec_js)
	phantomjs phantom.js

clean:
	rm -rf build

If you run make test, you’ll see all the compiled files are generated on the first run, but not regenerated afterward. They will only be regenerated if the source files change and you run a task that depends on them. Expressing the dependencies between files lets Make save you a lot of time waiting for things to recompile unnecessarily.

$ make test 
coffee -co build/lib/ lib/concert.coffee
coffee -co build/lib/ lib/concert_view.coffee
mkdir -p build/
handlebars templates/*.handlebars > build/templates.js
uglifyjs -cmo build/app.js vendor/jquery.js node_modules/handlebars/dist/handlebars.runtime.js \
                           node_modules/underscore/underscore.js node_modules/backbone/backbone.js \
                           build/lib/concert.js build/lib/concert_view.js build/templates.js
coffee -co build/spec/ spec/concert_template_spec.coffee
coffee -co build/spec/ spec/concert_view_spec.coffee
phantomjs phantom.js
Loaded suite: templates.concert(), ConcertView

....

Finished in 0.015 seconds
4 tests, 4 assertions, 0 failures, 0 errors

We’ve got rather a lot done with very little configuration, and no need for plugins to glue the programs we want to use into Make. We can use whatever programs we want: Make will happily execute what we tell it to, without us needing to write any glue code between Make and the compiler tools themselves. That means fewer things to install, audit and keep up to date, and more time getting on with your project.

You can add whatever build steps you like to these recipes, so long as you follow the pattern of describing relationships between files. Other tools like JSHint can be added in the same way, even as dependencies of other tasks so that downstream tasks won’t run unless your tests pass. That’s how I build my book: the dependencies are set up so that the book won’t build unless all the example tests pass first.

Having all these steps automated is a big time-saver, and setting them up so quickly, without needing to install any tools beyond what comes with Unix, means you’ve no excuse not to get your project organised. It also encourages you to write the functionality you need as generic scripts, rather than hiding it away inside plugins that only work with a particular build system. You save time and the whole community benefits. Not bad for a build tool from the seventies.

Running RSpec tests from the browser

As a fun demo of the flexibility of jstest, I thought I’d show how you can use it to run tests that aren’t even JavaScript. This is a silly example, but it demonstrates the power of the framework, and I’ve used these capabilities to solve real testing problems when running JavaScript on unusual platforms.

jstest has a plugin system for changing the output format: all the different output formats and runner adapters are objects that implement a standard reporter interface, and the test runner invokes the interface’s methods to notify the reporter of what’s going on so it can produce output. The event objects passed to these methods are self-contained JavaScript data structures that can be easily serialized as JSON, and indeed this JSON stream is one of the built-in output formats.

The nice thing about the JSON format is that it makes it easy to send reporter data over a wire, parse it, and give it to another reporter running in a different process, so you can print browser results in a terminal and vice versa. This is how the PhantomJS integration works: the JSON reporter writes to the browser console, and PhantomJS picks this data up and reformats it using one of the text-based output formats in the terminal.

The docs for the JSON reporter show an example of running a server-side test suite using the browser UI, by making the server process emit JSON, sending this JSON over a WebSocket and handing it to the browser reporter. The nice thing about this is the WebSocket and browser don’t care where the JSON came from – any process that emits jstest-compatible JSON will do. So, we can use this system to run Ruby tests!

To begin with, let’s write a little RSpec test:

# spec/ruby_spec.rb

describe Array do
  describe 'map' do
    before do
      @strings = %w[foo bar]
    end

    it 'returns a new array by mapping the elements through the block' do
      @strings.map(&:upcase).should == %w[FOO BAR]
    end
  end
end

Now we just need to make RSpec emit JSON, which we can do using a custom formatter:

$ rspec -r ./spec/json_formatter -f JsonFormatter ./spec
{"jstest":["startSuite",{"children":[],"size":1,"eventId":0,"timestamp":1372708952811}]}
{"jstest":["startContext",{"fullName":"Array","shortName":"Array","context":[],"children":["map"],"eventId":1,"timestamp":1372708952811}]}
{"jstest":["startContext",{"fullName":"Array map","shortName":"map","context":["Array"],"children":[],"eventId":2,"timestamp":1372708952812}]}
{"jstest":["startTest",{"fullName":"Array map returns a new array by mapping the elements through the block","shortName":"returns a new array by mapping the elements through the block","context":["Array","map"],"eventId":3,"timestamp":1372708952812}]}
{"jstest":["update",{"passed":true,"tests":1,"assertions":1,"failures":0,"errors":0,"eventId":4,"timestamp":1372708952812}]}
{"jstest":["endTest",{"fullName":"Array map returns a new array by mapping the elements through the block","shortName":"returns a new array by mapping the elements through the block","context":["Array","map"],"eventId":5,"timestamp":1372708952812}]}
{"jstest":["endContext",{"fullName":"Array map","shortName":"map","context":["Array"],"children":[],"eventId":6,"timestamp":1372708952812}]}
{"jstest":["endSuite",{"passed":true,"tests":1,"assertions":1,"failures":0,"errors":0,"eventId":7,"timestamp":1372708952812}]}

Next, we need a server, specifically a WebSocket server that will trigger a test run each time a connection is made. When we open a WebSocket to ws://localhost:8888/?test=map, the server should run this command and pipe the output into the WebSocket, sending each line of output as a separate message:

$ rspec -r ./spec/json_formatter -f JsonFormatter ./spec -e map

This is easily accomplished using the faye-websocket and split modules from npm:

// server.js

var child     = require('child_process'),
    http      = require('http'),
    url       = require('url'),
    split     = require('split'),
    WebSocket = require('faye-websocket')

var bin  = 'rspec',
    argv = ['-r', './spec/json_formatter', '-f', 'JsonFormatter', './spec']

var server = http.createServer()

server.on('upgrade', function(request, socket, body) {
  var ws = new WebSocket(request, socket, body),

      params  = url.parse(request.url, true).query,
      tests   = JSON.parse(params.test),

      options = tests.reduce(function(o, t) { return o.concat(['-e', t]) }, []),
      proc    = child.spawn(bin, argv.concat(options))

  proc.stdout.pipe(split()).pipe(ws)
})

server.listen(8888)

And finally, we need a web page that will open a socket, and channel the messages into the jstest browser reporter. We have a special class for this: JS.Test.Reporters.JSON.Reader takes lines of JSON output, parses them and dispatches the data to a reporter, making sure the messages are replayed in the right order.

By using JS.Test.Runner to get the current run options, we can tell which tests the user has selected to run, and send their names to the server, which will pass them on to rspec.

<!-- browser.html -->

<!doctype html>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>RSpec in the browser</title>
  </head>
  <body>

    <script type="text/javascript" src="./node_modules/jstest/jstest.js"></script>

    <script type="text/javascript">
      var options = new JS.Test.Runner().getOptions(),

          R       = JS.Test.Reporters,
          browser = new R.Browser(options),
          reader  = new R.JSON.Reader(browser)

      var test = encodeURIComponent(JSON.stringify(options.test)),
          ws   = new WebSocket('ws://localhost:8888/?test=' + test)

      ws.onmessage = function(event) {
        reader.read(event.data)
      }
    </script>

  </body>
</html>

If you start the server and open the web page, you should see the results in the browser!

Clicking the green arrows next to the tests reloads the page with that test selected, so we can use this to run a subset of our Ruby tests.

As I said, this is a silly example but it shows the power of the jstest reporting API. You can use this approach in reverse to send browser test results to the terminal and do other useful things.

For example, a while ago I was working on a Spotify application. It’s quite hard to make Spotify reload the page without completely shutting it down and restarting it. I wanted to drive the tests quickly from the terminal, so I made a little script to help me do this. I made a page in my app that opened a Faye connection to a server, and when it received a certain message it would reload the page using window.location and re-run my tests. The tests used a custom reporter to send updates over another Faye channel. My script would send the reload command, then listen for progress updates and channel them into one of jstest‘s text reporters. This only took about a page of code and hugely improved my productivity when building my app.

This demonstrates the power of having a simple serializable data format for test events, and reporters that work on any platform out of the box.

jstest 1.0: the cross-platform JavaScript test framework finally released as a standalone package

After hacking away on it for months, I’m happy to announce the 1.0 release of jstest as a standalone package. jstest has long been part of the jsclass library but over time it’s become the module I use the most: I test all my other JavaScript projects with it, not just those based on jsclass.

Using it as heavily as I do, I’ve noticed how its user interface can be troublesome. To begin with, you need to install a bunch of apparently unrelated stuff from jsclass in order to use it, you have to learn how the jsclass module system works, and plenty more besides. I wanted to make it easy to get cross-platform JavaScript testing from a simple, single-file package that you can drop into any project and have it just work, no matter what platform you’re running on. That’s what the new jstest package achieves.

It bundles up everything you need to run JavaScript tests into a single file that you can download from the website or from npm. The website contains full documentation for how to use and extend it, presenting it as a library in its own right rather than some obscure part of jsclass. I hope the new packaging and improved docs make it much easier to get started and use this library.

But, this is not simply a marketing or refactoring exercise. jstest has always been about running on as many JS platforms as possible, and this release is no different. But until now it’s been hard to integrate into other workflows: there was no way of changing the output format, or integrating with different test runners, and the platforms it did support – TestSwarm, and PhantomJS via a clunky JSON interface – were hard-coded. That all changes in 1.0 thanks to the addition of reporter plugins: an API for adding new output formats to the framework. It already includes a ton of useful output formats that work on all supported platforms, and adapters for many new test runners including Buster.JS, Karma, Teabag, Testem, Testling CI and TestSwarm.

The text output formats are built in such a way that they run on any platform, including in the browser, meaning you can now do things like use any of the output formats for PhantomJS browser tests or send test output over a socket and reformat it somewhere else. The reporters include several standard formats like TAP, TAP-Y/J and JUnit XML for integration with other reporting systems. There’s even an API you can use to make sure your own reporters work everywhere seamlessly.

This release also involves a new release of jsclass. Version 4.0 is really a maintenance release that supports this work, but notably turns all the library modules into CommonJS-compatible modules, so we’re finally free of our pre-Node legacy of managing dependencies via global variables. As before, they still work out of the box on all supported platforms, but if you’re using CommonJS they’ll be better behaved on your platform.

Finally, getting this release done frees me up to work on my testing book. While I emphatically don’t want the book to be about any particular tool, having a framework that just works out of the box is crucial for me to keep the book ‘usable’. Getting other frameworks to work cross-platform typically involves fiddling with a bunch of hacky plugins or googling to find out how other people managed it, and it’s important to me that the tools that ship with the book just work, without the reader having to figure out how some random plugin has changed since the book was published. I want to write about how to approach testing and architecture problems in general, rather than about any particular framework, so having tools that get out of the way is a big part of making the book useful and not wasting its readers’ time messing with project config.

So, as usual: download it, and let me know what you think.

And sign up for the book!

Terminus 0.5: now with Capybara 2.0 support and remote debugging

You might remember from my previous post that Terminus is a Capybara driver that lets you run your integration tests on any web browser on any device. Well, in mid-November Capybara 2.0 came out, and since I was at the excellent RuPy conference at the time, my conference hack project became making Terminus compatible with this new release.

I almost finished it that weekend, but not quite, and as always, once you’re home and back at work you lose focus on side projects. But, for my final release of 2012, I can happily announce Terminus 0.5, which makes Terminus compatible with both Capybara 1.1 and 2.0. It’s mostly a compatibility update, but it adds a couple of new features. First, Capybara’s screenshot API is supported when running tests with PhantomJS:

page.save_screenshot('screenshot.png')

And, it supports the PhantomJS remote debugger. You can call this API:

page.driver.debugger

This will pause the test execution, and open the WebKit remote debugger in Chrome so you can interact with the PhantomJS runtime through the WebKit developer tools. When testing on other browsers it simply pauses execution so you can inspect the browser where the tests are running.

As usual, ping the GitHub project if you find bugs.

Happy new year!

Terminus 0.4: Capybara for real browsers

As I occasionally mention, the original reason I built Faye was so I could control web browsers with Ruby. The end result was Terminus, a Capybara driver that controls real browsers. Since the last release, various improvements in Faye – including the extracted WebSocket module, removal of the Redis dependency and overall performance gains – have made a number of improvements to Terminus possible. Since Faye’s 0.8 release, I’ve been working on Terminus on-and-off and can now finally release version 0.4.

Terminus is a driver designed to control any browser on any device. To that end, this release adds support for the headless PhantomJS browser, as well as Android and IE8. In combination with the performance improvements, this makes Terminus a great option for headless and mobile testing. The interesting thing about Android and IE is that they do not support the document.evaluate() method for querying the DOM using XPath, and Capybara gives XPath queries to the driver to execute. In order to support these browsers, I had to write an XPath library, and in order to get that done quickly I wrote a PEG parser compiler. So that’s now three separate side projects that have sprung out of Terminus – talk about yak shaving.

But the big change in 0.4 is speed: Terminus 0.4 runs the Capybara test suite 3 to 5 times faster than 0.3 did. It does this using some trickery from Jon Leighton’s excellent Poltergeist driver, which just got to 1.0. Here’s how Terminus usually talks to the browser: first, the browser connects to a running Terminus server using Faye, and sends ping messages to advertise its presence:

        +---------+
        | Browser |
        +---------+
             |
             | ping
             V
        +---------+
        | Server  |
        +---------+

When you start your tests, the Terminus library connects to the server, discovers which browsers exist, and sends instructions to them. The browser executes the instructions and sends the results back to the Terminus library via the server.

        +---------+
        | Browser |
        +---------+
            ^  |
   commands |  | results
            |  V
        +---------+           +-------+
        | Server  |<--------->| Tests |
        +---------+           +-------+

As you can guess, the overhead of two socket connections and a pub/sub messaging protocol makes this a little slow. This is where the Poltergeist trick comes in. If the browser supports WebSocket, the Terminus library will boot a blocking WebSocket server in your test process and wait for the browser to connect to it. It can then use this socket to perform request/response with the browser – it sends a message over the socket and blocks until the browser sends a response. This turns out to be much faster than using Faye and running sleep() in a loop until a result arrives.

        +---------+
        | Browser |<--------------+
        +---------+               |
            ^  |                  | queries
   commands |  | results          |
            |  V                  V
        +---------+           +-------+
        | Server  |<--------->| Tests |
        +---------+           +-------+

The Faye connection is still used to advertise the browser’s existence and to bootstrap the connection, since it’s guaranteed to work whatever browser or network you’re on.

The cool thing about this is that Jon’s code reuses the Faye::WebSocket protocol parser, supporting both hixie-76 and hybi protocols, on a totally different I/O stack. Though Faye::WebSocket is written for EventMachine, I did try to keep the parser decoupled but had never actually tried to use it elsewhere, so it’s really nice to see it used like this.

Anyway, if you’re curious about Terminus you can find out more on the website.

JS.Class 3.0.8: source maps, prototype stubs, and async error catching

I don’t usually blog point releases, but JS.Class releases tend to be infrequent these days, and mostly polish what’s there rather than significantly changing things. This release is no different, but the few changes it contains make it considerably more usable.

First, it now (like everything I ship for the browser) comes with source maps. Thanks to Jake, this was a simple configuration change.

Second, it fixes a bug in the stubbing library that means instance methods on prototypes can now be stubbed. For example, I was recently writing some new tests for Songkick’s Spotify app, and we run these tests in Chrome. (Being based on WebKit, the Spotify platform is close enough that you can write useful unit tests and run them in Chrome or v8/Node.) Spotify adds some methods to built-in prototypes though, and our code relies on them, so they need to be present while running tests in Chrome. I could simply implement them globally, but there are other use cases where you need to stub a method on all instances of a class during one test. So, this now works, and the stub is removed (and verified if it’s a mock) at the end of the test:

stub(String.prototype, "decodeForText", function() { return this.valueOf() })
"foo".decodeForText() // -> "foo"

Finally, I’ve fixed a major issue that’s been bugging me with JS.Test. As I’ve done more projects on Node.js, I’ve found that it’s way too easy to crash the test run completely because an error was thrown in a section of async code. Because it’s outside the test framework’s call stack, it doesn’t get caught, and Node just bails out:

$ npm test

> restore@0.1.0 test /home/james/projects/restore
> node spec/runner.js

Loaded suite WebFinger, OAuth, Storage, Stores, File store

Started
..
/home/james/projects/restore/node_modules/jsclass/src/test.js:1899
          throw new JS.Test.Mocking.UnexpectedCallError(message);
                ^
Error: <store> received call to authorize() with unexpected arguments:
( "the_client_id", "zebcoe", { "the_scope": [ "r", "w" ] }, #function )
npm ERR! Test failed.  See above for more details.
npm ERR! not ok code 0

This happens most often when I have a test that uses mocks: for example, when I send a certain request to a server, I expect the server to tell the underlying model to do something.

it("authorizes the client", function() { with(this) {
  expect(store, "authorize").given("the_client_id", "zebcoe", {the_scope: ["r", "w"]}).yielding([null, "a_token"])
  http_post("/auth", auth_params)
}})

When I change the mock expectation, this makes previously working code call a method with unexpected arguments, which throws an error; because the HTTP request is processed asynchronously, the error is not caught. But it also happens for all sorts of other reasons: say you have code that calls fs.readFile(), then processes the contents before calling you back – if the pre-processing fails, the error crashes the process.
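
The underlying problem is easy to reproduce. In this sketch (the file name is made up), the callback runs on a fresh call stack scheduled by the event loop, so a try/catch wrapped around the test body can never see the error it throws:

// An error thrown in an async callback escapes any try/catch
// that wrapped the original call
var fs = require('fs');

function runTest() {
  fs.readFile('config.json', function(error, data) {
    // if the file is missing or contains bad JSON, this throws on
    // a call stack the test framework no longer occupies
    var config = JSON.parse(data);
    console.log(config);
  });
}

try {
  runTest(); // returns immediately; the callback fires later
} catch (error) {
  // never reached for errors thrown inside the callback
  console.log('caught: ' + error.message);
}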

Well now this error gets caught, so you get useful feedback from your tests when these types of errors happen:

$ npm test

> restore@0.1.0 test /home/james/projects/restore
> node spec/runner.js

Loaded suite WebFinger, OAuth, Storage, Stores, File store

Started
..E...............................................

1) Error:
OAuth with valid login credentials authorizes the client:
Error: <store> received call to authorize() with unexpected arguments:
( "the_client_id", "zebcoe", { "the_scope": [ "r", "w" ] }, #function )

Finished in 0.851 seconds
50 tests, 111 assertions, 0 failures, 1 errors

npm ERR! Test failed.  See above for more details.
npm ERR! not ok code 0

Now the error is caught, the tests all finish, and you get a clear report about which test caused the error.

This functionality is supported on Node.js and in the browser. As far as I know (and I’ve tried a lot of different frameworks) the only other test frameworks that do this are Mocha and Buster. If you have a similar problem, you can catch uncaught errors like this:

// Node.js
process.addListener('uncaughtException', function(error) {
  // handle error
});

// Browsers
window.addEventListener('error', function(event) {
  // handle event
}, false);

On Node, this is particularly useful for stopping servers crashing in case of an error. In the browser it’s mostly useful for reporting: because the argument to the callback is a DOM event rather than an exception object, the information you can get out of it tends to be lacking. Note that for old IEs you’ll need to use window.attachEvent('onerror'), and Opera only supports catching these errors with window.onerror.
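
Putting those caveats together, a feature-detected wrapper might look something like this sketch (the normalized handler interface is my own invention):

// Hypothetical cross-browser wrapper based on the caveats above
function catchGlobalErrors(handler) {
  if (window.addEventListener) {
    // modern browsers: the handler receives a DOM event
    window.addEventListener('error', handler, false);
  } else if (window.attachEvent) {
    // old IE
    window.attachEvent('onerror', handler);
  } else {
    // Opera: message, URL and line number arrive as arguments
    window.onerror = function(message, url, line) {
      handler({message: message, url: url, line: line});
    };
  }
}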

While researching this, I was really surprised to see how many very widely used frameworks don’t do this. The best alternative I’ve seen is in Jasmine: this example does not report the async error, but the test times out because it is never resumed. jasmine-node doesn’t catch this at all, which is why the test is only run if global.window exists. I’ve seen several other frameworks that either crash on Node, or in the browser simply stop updating the view, giving you no feedback that the test runner has halted without running all the tests.

Since most frameworks don’t catch these errors, I would assume this isn’t actually a problem for most people. Is this true? I’d like to know how other people deal with this situation.

If you want to give JS.Class a go, just run npm install jsclass or download it from the website.

Organizing a project with JS.Packages

I’ve been asked by a few users of JS.Class to explain how I use it to organize projects. I’ve been meaning to write this up for quite a while, ever since we adopted it at Songkick for managing our client-side codebase. Specifically, we use JS.Packages to organize our code, and JS.Test to test it, and I’m mostly going to talk about JS.Packages here.

JS.Packages is my personal hat-throw into the ring of JavaScript module loaders. It’s designed to separate dependency metadata from source code, and be capable of loading just about anything as efficiently as possible. It works at a more abstract level than most script loaders: users specify objects they want to use, rather than scripts they want to load, allowing JS.Packages to optimize downloads for them and load modules that have their own loading strategies, all through a single interface, the JS.require() function.

As an example, I’m going to show how we at Songkick use JS.Packages within our main Rails app. We manage our JavaScript and CSS by doing as much as possible in those languages, and finding simple ways to integrate with the Rails stack. JS.Packages lets us specify where our scripts live and how they depend on each other in pure JavaScript, making this information portable. We use JS.require() to load our codebase onto static pages for running unit tests without the Rails stack, and we use jsbuild and AssetHat to package it for deployment. Nowhere in our setup do we need to manage lists of script tags or worry about load order.

The first rule of our codebase is: every class/module lives in its own file, much like how we organize our Ruby code. And this means every namespace: even if a namespace has no methods of its own but just contains other classes, we give it a file so that other files don’t have to guess whether the namespace is defined or not. For example a file containing a UI widget class might look like this:

// public/javascripts/songkick/ui/widget.js

Songkick.UI.Widget = function() {
  // ...
};

This file does not have to check whether Songkick or Songkick.UI is defined, it just assumes they are. The namespaces are each defined in their own file:

// public/javascripts/songkick.js
Songkick = {};

// public/javascripts/songkick/ui.js
Songkick.UI = {};

Notice how each major class or namespace lives in a file named after the module it contains; this makes it easier to find things while hacking and lets us take advantage of the autoload() feature in JS.Packages to keep our dependency data small. It looks redundant at first, but it helps maintain predictability as the codebase grows. It results in more files, but we bundle everything for production so we keep our code browsable without sacrificing performance. I’ll cover bundling later on.

To drive out the implementation of our UI widget, we use JS.Test to write a spec for it. I’m just going to give it some random behaviour for now to demonstrate how we get everything wired up.

// test/js/songkick/ui/widget_spec.js

Songkick.UI.WidgetSpec = JS.Test.describe("Songkick.UI.Widget", function() { with(this) {
  before(function() { with(this) {
    this.widget = new Songkick.UI.Widget("foo")
  }})
  
  it("returns its attributes", function() { with(this) {
    assertEqual( {name: "foo"}, widget.getAttributes() )
  }})
}})

So now we’ve got a test and some skeleton source code – how do we run the tests? First, we need a static page to load up the JS.Packages loader, our manifest (which we’ll get to in a second) and a script that runs the tests:

// test/js/browser.html

<!doctype html>
<html>
  <head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8">
    <title>JavaScript tests</title>
  </head>
  <body>
    
    <script type="text/javascript">ROOT = '../..'</script>
    <script type="text/javascript" src="../../vendor/jsclass/min/loader.js"></script>
    <script type="text/javascript" src="../../public/javascripts/manifest.js"></script>
    <script type="text/javascript" src="./runner.js"></script>
    
  </body>
</html>

The file runner.js should be very simple: ideally we just want to load Songkick.UI.WidgetSpec and run it:

// test/js/runner.js

// Don't cache files during tests
JS.cacheBust = true;

JS.require('JS.Test', function() {
  
  JS.require(
    'Songkick.UI.WidgetSpec',
    // more specs as the app grows...
    function() { JS.Test.autorun() });
});

The final missing piece is the manifest, the file that says where our files are stored and how they depend on each other. Let’s start with a manifest that uses autoload() to specify all our scripts’ locations; I’ll present the code and explain what each line does.

// public/javascripts/manifest.js

JS.Packages(function() { with(this) {
  var ROOT = JS.ENV.ROOT || '.';
  
  autoload(/^(.*)Spec$/,     {from: ROOT + '/test/js', require: '$1'});
  autoload(/^(.*)\.[^\.]+$/, {from: ROOT + '/public/javascripts', require: '$1'});
  autoload(/^(.*)$/,         {from: ROOT + '/public/javascripts'});
}});

The ROOT setting simply lets us override the root directory for the manifest, as we do on our test page. After that, we have three autoload() statements. When you call JS.require() with an object that’s not been explicitly configured, the autoload() rules are examined in order until one matches the name.

The first rule says that object names matching /^(.*)Spec$/ (that is, test files) should be loaded from the test/js directory. For example, Songkick.UI.WidgetSpec should be found in test/js/songkick/ui/widget_spec.js. The require: '$1' means that the object depends on the object captured by the regex, so Songkick.UI.WidgetSpec requires Songkick.UI.Widget to be loaded first, as you’d expect.

The second rule makes sure that the containing namespace for any object is loaded before the object itself. For example, it makes sure Songkick.UI is loaded before Songkick.UI.Widget, and Songkick before Songkick.UI. The regex captures everything up to the final . in the name, and makes sure it’s loaded using require: '$1'.

The third rule is a catch-all: any object not matched by the above rules should be loaded from public/javascripts. Because of the preceding rule, this only matches root objects, i.e. it matches Songkick but not Songkick.UI. Taken together, these rules say: load all objects from public/javascripts, and make sure any containing namespaces are loaded first.

Let’s implement the code needed to make the test pass. We’re going to use jQuery to do some trivial operation; the details aren’t important but it causes a dependency problem that I’ll illustrate next.

// public/javascripts/songkick/ui/widget.js

Songkick.UI.Widget = function(name) {
  this._name = name;
};

Songkick.UI.Widget.prototype.getAttributes = function() {
  return jQuery.extend({}, {name: this._name});
};

If you open the page test/js/browser.html, you’ll see an error: the test doesn’t work because jQuery is not loaded. This means part of our codebase depends on jQuery, but JS.Packages doesn’t know that.

Remember that runner.js just requires Songkick.UI.WidgetSpec? We can use jsbuild to see which files get loaded when we require this object. (jsbuild is a command-line tool I wrote after an internal project at Amazon that was using JS.Class decided it needed to pre-compile its code for static analysis rather than loading it dynamically at runtime. You can install it by running npm install -g jsclass.)

$ jsbuild -m public/javascripts/manifest.js -o paths Songkick.UI.WidgetSpec
public/javascripts/songkick.js
public/javascripts/songkick/ui.js
public/javascripts/songkick/ui/widget.js
test/js/songkick/ui/widget_spec.js

As expected, it loads the containing namespaces, the Widget class, and the spec, in that order. But the Widget class depends on jQuery, so we need to tell JS.Packages about this. Rather than adding it as a dependency to every UI module in our application, we can use a naming convention trick: all our UI modules require Songkick.UI to be loaded first, so we can make everything in that namespace depend on jQuery by making the namespace itself depend on jQuery. We update our manifest like so:

// public/javascripts/manifest.js

JS.Packages(function() { with(this) {
  var ROOT = JS.ENV.ROOT || '.';
  
  file('https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js')
    .provides('jQuery', '$');
  
  autoload(/^(.*)Spec$/,     {from: ROOT + '/test/js', require: '$1'});
  autoload(/^(.*)\.[^\.]+$/, {from: ROOT + '/public/javascripts', require: '$1'});
  autoload(/^(.*)$/,         {from: ROOT + '/public/javascripts'});
  
  pkg('Songkick.UI').requires('jQuery');
}});

Running jsbuild again shows jQuery will be loaded, and if you reload the tests now they will pass:

$ jsbuild -m public/javascripts/manifest.js -o paths Songkick.UI.WidgetSpec
public/javascripts/songkick.js
https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js
public/javascripts/songkick/ui.js
public/javascripts/songkick/ui/widget.js
test/js/songkick/ui/widget_spec.js

So we’ve now got a working UI widget, and we can use exactly the same approach to load it in our Rails app: load the JS.Packages library and our manifest, and call JS.require('Songkick.UI.Widget'). But in production we’d rather not download all those tiny little files one at a time; it’s much more efficient to bundle them into one file.

To bundle our JavaScript and CSS for Rails, we use AssetHat, or rather a fork we made to tweak a few things. Our fork notwithstanding, of the handful of Rails packaging solutions we tried, AssetHat came closest to doing everything we needed, and I highly recommend it.

AssetHat uses a file called config/assets.yml, in which you list all the bundles you want and which files should go in each one. But I’d rather specify which objects I want in each bundle: we already have tooling that figures out which files we need and in what order, so I’d rather not duplicate that information. Fortunately, AssetHat lets you put ERB in your config, and we use this to shell out to jsbuild to construct our bundles for us.

First, we write a jsbuild bundles file that says which objects our application needs. We exclude jQuery from the bundle because we’ll probably load that from Google’s CDN.

// config/bundles.json
{
  "app" : {
    "exclude" : [ "jQuery" ],
    "include" : [
      "Songkick.UI.Widget"
    ]
  }
}

This is a minimal format that’s close to what the application developer works with: objects. It’s easy to figure out which objects your app needs, less simple to make sure you only load the files you need and get them in the right order, in both your test pages and your application code. We can use jsbuild to tell us which files will go into this bundle:

$ jsbuild -m public/javascripts/manifest.js -b config/bundles.json -o paths app
public/javascripts/songkick.js
public/javascripts/songkick/ui.js
public/javascripts/songkick/ui/widget.js

Now all we need to do is pipe this information into AssetHat. This is easily done with a little ERB magic:

// config/assets.yml
# ...
js:
  <%  def js_bundles
        JSON.parse(File.read('config/bundles.json')).keys
      end
      
      def paths_for_js_bundle(name)
        jsbuild = 'jsbuild -m public/javascripts/manifest.js -b config/bundles.json'
        `#{jsbuild} -o paths -d public/javascripts #{name}`.split("\n")
      end
  %>
  
  bundles:
  <% js_bundles.each do |name| %>
    <%= name %>:
    <% paths_for_js_bundle(name).each do |path| %>
      - <%= path %>
    <% end %>
  <% end %>

Running the minification task takes the bundles we’ve defined in bundles.json and packages them for us:

$ rake asset_hat:minify
Minifying CSS/JS...

 Wrote JS bundle: public/javascripts/bundles/app.min.js
        contains: public/javascripts/songkick.js
        contains: public/javascripts/songkick/ui.js
        contains: public/javascripts/songkick/ui/widget.js
        MINIFIED: 14.4% (Engine: jsmin)

This bundle can now be loaded in your Rails views very easily:

<%= include_js :bundle => 'app' %>

This will render script tags for each individual file in the bundle during development, and a single script tag containing all the code in production. (You may have to disable the asset pipeline in recent Rails versions to make this work.)
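
For instance, with the app bundle above, the rendered markup would look something like this sketch (paths illustrative):

<!-- development: one script tag per file in the bundle -->
<script type="text/javascript" src="/javascripts/songkick.js"></script>
<script type="text/javascript" src="/javascripts/songkick/ui.js"></script>
<script type="text/javascript" src="/javascripts/songkick/ui/widget.js"></script>

<!-- production: a single tag for the minified bundle -->
<script type="text/javascript" src="/javascripts/bundles/app.min.js"></script>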

So that’s our JavaScript strategy. As I said earlier, the core concern is to express dependency information in one place, away from the source code, in a portable format that can be used just as easily in a static web page as in your production web framework. Using autoload() and some simple naming conventions, you can get all these benefits while keeping the configuration very small indeed.

But wait, there’s more!

As a demonstration of how valuable it is to have portable dependency data and tests, consider the situation where we now want to run tests from the command line, or during our CI process. We can load the exact same files we load in the browser, plus a little stubbing of the jQuery API, and make our tests run on Node:

// test/js/node.js

require('jsclass');
require('../../public/javascripts/manifest');

JS.ENV.jQuery = {
  extend: function(a, b) {
    for (var k in b) a[k] = b[k];
    return a;
  }
};

JS.ENV.$ = JS.ENV.jQuery;

require('./runner');

And lo and behold, our tests run:

$ node test/js/node.js 
Loaded suite Songkick.UI.Widget

Started
.
Finished in 0.003 seconds
1 tests, 1 assertions, 0 failures, 0 errors

Similarly, we can write a quick PhantomJS script to parse the log messages that JS.Test emits:

// test/js/phantom.js

var page = new WebPage();

page.onConsoleMessage = function(message) {
  try {
    var result = JSON.parse(message).jstest;
    if ('total' in result && 'fail' in result) {
      console.log(message);
      var status = (!result.fail && !result.error) ? 0 : 1;
      phantom.exit(status);
    }
  } catch (e) {}
};

page.open('test/js/browser.html');

We can now run our tests on a real WebKit instance from the command line:

$ phantomjs test/js/phantom.js 
{"jstest":{"fail":0,"error":0,"total":1}}

One nice side-effect of doing as much of this as possible in JavaScript is that it improves your API design and forces you to decouple your JS from your server-side stack; if it can’t be done through HTML and JavaScript, your code doesn’t do it. This keeps your code portable, making it easier to reuse across applications with different server-side stacks.

Black-box criteria

Tim Bray recently published an article called Type-System Criteria, in which he argues that Java – or statically-typed languages in general – is better suited to mobile development than the dynamically-typed languages more prevalent in web development circles. The reason he gives boils down to API surface size:

Another observation that I think is partially but not entirely a consequence of API scale is testing difficulty. In my experience it’s pretty easy and straightforward to unit-test Web Apps. There aren’t that many APIs to mock out, and at the end of the day, these things take data in off the wire and emit other data down the wire and are thus tractable to black-box, in whole or in part.

On the other hand, I’ve found that testing mobile apps is a major pain in the ass. I think the big reason is all those APIs. Your average method in a mobile app responds to an event and twiddles APIs in the mobile framework. If you test at all completely you end up with this huge tangle of mocks that pretty soon start getting in the way of seeing what’s actually going on.

The argument goes that as the API surface you need to integrate with becomes larger, static type systems become more attractive. I don’t disagree, in part because I don’t have nearly enough experience with static languages to have an informed opinion on them. But at a gut level I believe this to be true; in fact, I’d be willing to bet that a majority of the bugs I’ve written while refactoring software could have been caught by a static type checker (and not even a very sophisticated one, at that).

But the excerpt I quoted above contains a code smell, and it points to another reason why mobile development is difficult. It’s not the size of the APIs that’s the big problem: it’s the nature of the application.

Web application servers are comparatively easy to test because the tests can be written by talking to an encapsulated black box. You throw a request (or several) at a web server, you read what comes back, and check it looks like what you expected. On the other hand, testing web application clients is much more complex: instead of doing simple call/response testing, you have to initiate events within the application’s environment, and then monitor changes to that environment that you expect the events to cause. The core difference here is that client-side programs tend to be what I’m going to refer to as ‘stateful user interfaces’, and mobile (and desktop) software falls into the same category.

What exactly do I mean by ‘stateful user interface’? When you call a web server, you don’t need to hold onto any state on your end: you ask the server a question by sending it a request, and it sends back a fully-formed, self-contained response. When you’ve checked that response, you throw it away and start the next test. In contrast, stateful user interfaces are long-running processes in which incremental changes are made to what the user sees. Instead of getting a fresh new page, just a part of the view is changed, or a sound is emitted, or a notification generated, or a vibration initiated. The programming paradigm in a server environment emphasises call/response, statelessness and immutability; in a client environment you have side effects, state and incremental change. Testing in such environments is hard.

I think this, rather than large API surface, is the real problem. Large API surfaces are only a problem if your application code talks to them directly, and this is much more common in side-effect-heavy applications. Unit tests in these environments tend to be messy for several reasons:

  • Application code responds to events triggered by the host environment
  • Business logic produces its output by modifying the host environment rather than returning values
  • It is hard or impossible to reset the environment to a clean state between tests

The third reason is a particular problem when unit testing client-side JavaScript, and I’ve seen plenty of tests where the state of the page or the implementation of event listeners is such that it becomes very difficult to keep each test independent of the others. You also have the problem that anything that causes a page refresh will cause your test runner to vanish. (I wrote about this exact problem in Refactoring towards testable JavaScript.)

So if side-effect-heavy programs cause large API surfaces to be a problem, what should we do about it? The answer comes down to something I think of as ‘avoiding framework-isms’. This means that any time you have a framework or host environment in which user input or third-party code drives your application, the sooner you can dispatch to something you control the better. The classic example of this is the ‘fat model, skinny controller’ mantra popular in the Rails community: rather than dump lots of code in a controller that’s only invoked by the host server and framework, turn the request into calls to models. This way, the bulk of the logic is in objects that you control the interface to, and that are easy to create and manipulate, properties that also make them easy to test.

In client-side JavaScript and other stateful user interfaces, this means keeping event listeners small. Ideally an event listener should extract all the necessary data from the event and the current application state, and use this to make a black-box call to a module containing the real business logic. It means making sure orthogonal components of a user interface do not talk to each other directly, but publish data changes via a message bus. And it means writing business logic that returns results rather than causing side effects, the side effects again being dealt with by thin bindings to the host environment.
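
As a sketch of what this looks like in practice – all the names here (Cart, CartView, MessageBus, currentCart) are invented for illustration – the listener extracts its input from the event, makes one call into a pure module, and leaves the side effects to thin bindings:

// Illustrative sketch: the listener only extracts data and delegates.
// Cart.addItem() is pure – it returns a new state rather than
// modifying the page or any shared state itself.
jQuery('#add-to-cart').on('click', function(event) {
  event.preventDefault();
  var itemId = jQuery(this).data('item-id');

  var result = Cart.addItem(currentCart, itemId);
  currentCart = result.cart;

  // thin bindings apply the result to the host environment
  CartView.render(result.cart);
  MessageBus.publish('cart:updated', result.cart);
});

Because Cart.addItem() simply maps inputs to outputs, it can be tested with plain call/response assertions; only the thin bindings ever need the host environment, and they stay small enough to verify by inspection.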

I’ll finish up with a small but illustrative example. Say you’re writing a WebSocket implementation, and the protocol mandates that when you call socket.send('Hello, world!') then the bytes 81 8d ed a3 88 c3 a5 c6 e4 af 82 8f a8 b4 82 d1 e4 a7 cc should be written to the TCP socket. You could write a test for it by mocking out the whole network stack (which I’ve probably glossed over considerably here):

describe WebSocket do
  before do
    @tcp_socket = mock('TCP socket')
    TCP.should_receive(:connect).with('example.com', 80).and_return @tcp_socket
    @web_socket = WebSocket.new('ws://example.com/')
  end
  
  it "writes a message to the socket" do
    @tcp_socket.should_receive(:write).with [0x81, 0x8d, 0xed, 0xa3, 0x88, 0xc3, 0xa5, 0xc6, 0xe4, 0xaf, 0x82, 0x8f, 0xa8, 0xb4, 0x82, 0xd1, 0xe4, 0xa7, 0xcc]
    @web_socket.send("Hello, world!")
  end
  
  # More mock-based protocol tests...
end

Or you could test it by implementing a pure function that turns text into WebSocket frames, leaving the code that actually deals with networking doing only that and nothing else:

describe WebSocket::Parser do
  before do
    @parser = WebSocket::Parser.new
  end
  
  it "turns text into message frames" do
    @parser.frame("Hello, world!").should == [0x81, 0x8d, 0xed, 0xa3, 0x88, 0xc3, 0xa5, 0xc6, 0xe4, 0xaf, 0x82, 0x8f, 0xa8, 0xb4, 0x82, 0xd1, 0xe4, 0xa7, 0xcc]
  end
  
  # More protocol implementation tests...
end

describe WebSocket do
  before do
    @tcp_socket = mock('TCP socket')
    TCP.should_receive(:connect).with('example.com', 80).and_return @tcp_socket
    
    @parser = mock('parser')
    WebSocket::Parser.should_receive(:new).and_return @parser
    
    @web_socket = WebSocket.new('ws://example.com/')
  end
  
  it "converts text to frames and sends them" do
    frame = mock('frame')
    @parser.should_receive(:frame).with("Hello, world!").and_return frame
    @tcp_socket.should_receive(:write).with(frame)
    @web_socket.send("Hello, world!")
  end
  
  # And we're done here
end

This separates the business logic (implementing the WebSocket protocol) from the side effects on the host environment (writing to network connections). The result is code that’s more modular, much easier to test, and less coupled to the API surface of the host environment. If a static type system helps you with that then have at it, but recognize when it’s a symptom of a deeper problem.

Terminus 0.3: control multiple browsers with Ruby

As you’ll have noticed if you made it to the end of my last post, there is a new release of Terminus. Terminus is a Capybara driver that is designed to let you control your app in any browser on any device, by sending all driver instructions to be executed on the client side in JavaScript.

This release is the first since Capybara 1.0, and supports the entire Capybara API. This includes:

  • Reading response headers and status codes
  • Handling cookies
  • Running JavaScript and receiving the results
  • Resynchronizing XHR requests (jQuery only)
  • Switching between frames and windows
  • Detecting infinite redirects

This is a superset of the supported features of the Rack::Test and Selenium drivers, and has the added bonus of letting you switch between browsers. When you have multiple browsers connected to your Terminus server, you can select which one you want to control by matching on the browser’s name, OS, version and current URL, for example:

Terminus.browser = {:name => /Safari/, :current_url => /pitchfork.com/}

You can select any browser that is ‘docked’, i.e. idling on the Terminus holding page:

Terminus.browser = :docked

Or you can simply pick one browser from the list:

Terminus.browser = Terminus.browsers.first

All this lets you control multiple browsers at once; for example, I’ve been using it to automate some of the Faye integration tests:

#================================================================
# Acquire some browsers and log into each with a username

NAMES = %w[alice bob carol]
BROWSERS = {}
Terminus.ensure_browsers 3

Terminus.browsers.each_with_index do |browser, i|
  name = NAMES[i]
  puts "#{name} is using #{browser}"
  BROWSERS[name] = browser
  Terminus.browser = browser
  visit '/'
  fill_in 'username', :with => name
  click_button 'Go'
end

#================================================================
# Send a message from each browser to every other browser,
# and check that it arrived. If it doesn't arrive, send all
# the browsers back to the dock and raise an exception

BROWSERS.each do |name, sender|
  BROWSERS.each do |at, target|
    next if at == name
    
    Terminus.browser = sender
    fill_in 'message', :with => "@#{at} Hello, world!"
    click_button 'Send'
    
    Terminus.browser = target
    unless page.has_content?("#{name}: @#{at} Hello, world!")
      Terminus.return_to_dock
      raise "Message did not make it from #{sender} to #{target}"
    end
  end
end

#================================================================
# Re-dock all the browsers when we're finished

Terminus.return_to_dock

So what’s not supported? Internet Explorer is still not supported because I cannot find a decent way to run XPath queries on it. I was working on Pathology to solve this, but I can’t get it to perform well enough for the workload Capybara throws at it. It might be possible to work around this by monkey-patching Capybara to pass through CSS selectors instead of compiling them to XPath, though. File attachments are not supported for security reasons, and there are still some bugs that show up if you do stuff you’re not supposed to, like using duplicate element IDs; these are particularly apparent on Opera. And finally, visiting remote hosts outside your application is supported, but it’s not particularly robust as yet.

You can find out more and see a video of it in action on its new website.