Terminus 0.5: now with Capybara 2.0 support and remote debugging

You might remember from my previous post that Terminus is a Capybara driver that lets you run your integration tests on any web browser on any device. Well, in mid-November Capybara 2.0 came out, and since I was at the excellent RuPy conference at the time, my conf hack project became getting Terminus compatible with this new release.

I almost finished it that weekend, but not quite, and as always once you’re home and back at work you lose focus on side projects. But, for my final release of 2012, I can happily announce that Terminus 0.5 is out, and it is compatible with both Capybara 1.1 and 2.0. It’s mostly a compatibility update but it adds a couple of new features. First, Capybara’s screenshot API is supported when running tests with PhantomJS:

page.save_screenshot('screenshot.png')

And, it supports the PhantomJS remote debugger. You can call this API:

page.driver.debugger

This will pause the test execution, and open the WebKit remote debugger in Chrome so you can interact with the PhantomJS runtime through the WebKit developer tools. When testing on other browsers it simply pauses execution so you can inspect the browser where the tests are running.

As usual, ping the GitHub project if you find bugs.

Happy new year!

Terminus 0.4: Capybara for real browsers

As I occasionally mention, the original reason I built Faye was so I could control web browsers with Ruby. The end result was Terminus, a Capybara driver that controls real browsers. Since the last release, various improvements in Faye – including the extracted WebSocket module, removal of the Redis dependency and overall performance gains – have made various improvements to Terminus possible. Since Faye’s 0.8 release, I’ve been working on Terminus on-and-off and can now finally release version 0.4.

Terminus is a driver designed to control any browser on any device. To that end, this release adds support for the headless PhantomJS browser, as well as Android and IE8. In combination with the performance improvements, this makes Terminus a great option for headless and mobile testing. The interesting thing about Android and IE is that they do not support the document.evaluate() method for querying the DOM using XPath, and Capybara gives XPath queries to the driver to execute. In order to support these browsers, I had to write an XPath library, and in order to get that done quickly I wrote a PEG parser compiler. So that’s now three separate side projects that have sprung out of Terminus – talk about yak shaving.

But the big change in 0.4 is speed: Terminus 0.4 runs the Capybara test suite 3 to 5 times faster than 0.3 did. It does this using some trickery from Jon Leighton’s excellent Poltergeist driver, which just got to 1.0. Here’s how Terminus usually talks to the browser: first, the browser connects to a running terminus server using Faye, and sends ping messages to advertise its presence:

        +---------+
        | Browser |
        +---------+
             |
             | ping
             V
        +---------+
        | Server  |
        +---------+

When you start your tests, the Terminus library connects to the server, discovers which browsers exist, and sends instructions to them. The browser executes the instructions and sends the results back to the Terminus library via the server.

        +---------+
        | Browser |
        +---------+
            ^  |
   commands |  | results
            |  V
        +---------+           +-------+
        | Server  |<--------->| Tests |
        +---------+           +-------+

As you can guess, the overhead of two socket connections and a pub/sub messaging protocol makes this a little slow. This is where the Poltergeist trick comes in. If the browser supports WebSocket, the Terminus library will boot a blocking WebSocket server in your test process, and wait for the browser to connect to it. It can then use this socket to perform request/response to the browser – it sends a message over the socket and blocks until the browser sends a response. This turns out to be much faster than using Faye and running sleep() in a loop until a result arrives.

        +---------+
        | Browser |<--------------+
        +---------+               |
            ^  |                  | queries
   commands |  | results          |
            |  V                  V
        +---------+           +-------+
        | Server  |<--------->| Tests |
        +---------+           +-------+

The Faye connection is still used to advertise the browser’s existence and to bootstrap the connection, since it’s guaranteed to work whatever browser or network you’re on.

The cool thing about this is that Jon’s code reuses the Faye::WebSocket protocol parser, supporting both hixie-76 and hybi protocols, on a totally different I/O stack. Though Faye::WebSocket is written for EventMachine, I did try to keep the parser decoupled but had never actually tried to use it elsewhere, so it’s really nice to see it used like this.

Anyway, if you’re curious about Terminus you can find out more on the website.

JS.Class 3.0.8: source maps, prototype stubs, and async error catching

I don’t usually blog point releases, but JS.Class releases tend to be infrequent these days, and mostly polish what’s there rather than significantly changing things. This release is no different, but the few changes it contains make it significantly more usable.

First, it now (like everything I ship for the browser) comes with source maps. Thanks to Jake, this was a simple configuration change.

Second, it fixes a bug in the stubbing library, which means instance methods on prototypes can now be stubbed. For example, I was recently writing some new tests for Songkick’s Spotify app, and we run these tests in Chrome. (Being based on WebKit, the Spotify platform is close enough that you can write useful unit tests and run them in Chrome or v8/Node.) Spotify adds some methods to built-in prototypes though, and our code relies on them, so they need to be present while running tests in Chrome. I could just implement them globally, but there are other use cases where you just need to stub a method on all instances of a class during one test. So, this now works, and the stub is removed (and verified if it’s a mock) at the end of the test:

stub(String.prototype, "decodeForText", function() { return this.valueOf() })
"foo".decodeForText() // -> "foo"

Finally, I’ve fixed a major issue that’s been bugging me with JS.Test. As I’ve done more projects on Node.js, I’ve found that it’s way too easy to crash the test run completely because an error was thrown in a section of async code. Because it’s outside the test framework’s call stack, it doesn’t get caught, and Node just bails out:

$ npm test

> restore@0.1.0 test /home/james/projects/restore
> node spec/runner.js

Loaded suite WebFinger, OAuth, Storage, Stores, File store

Started
..
/home/james/projects/restore/node_modules/jsclass/src/test.js:1899
          throw new JS.Test.Mocking.UnexpectedCallError(message);
                ^
Error: <store> received call to authorize() with unexpected arguments:
( "the_client_id", "zebcoe", { "the_scope": [ "r", "w" ] }, #function )
npm ERR! Test failed.  See above for more details.
npm ERR! not ok code 0

This happens most often because I have a test that uses mocks. For example, when I send a certain request to a server, I expect the server to tell the underlying model to do something.

it("authorizes the client", function() { with(this) {
  expect(store, "authorize").given("the_client_id", "zebcoe", {the_scope: ["r", "w"]}).yielding([null, "a_token"])
  http_post("/auth", auth_params)
}})

When I change the mock expectation, this makes previously working code call a method with unexpected arguments, which throws an error, and because the HTTP request is processed asynchronously, the error is not caught. But it also happens for all sorts of other reasons. For example, you might have code that calls fs.readFile(), then processes the contents before calling you back: if the pre-processing fails, the error crashes the process.

Well now this error gets caught, so you get useful feedback from your tests when these types of errors happen:

$ npm test

> restore@0.1.0 test /home/james/projects/restore
> node spec/runner.js

Loaded suite WebFinger, OAuth, Storage, Stores, File store

Started
..E...............................................

1) Error:
OAuth with valid login credentials authorizes the client:
Error: <store> received call to authorize() with unexpected arguments:
( "the_client_id", "zebcoe", { "the_scope": [ "r", "w" ] }, #function )

Finished in 0.851 seconds
50 tests, 111 assertions, 0 failures, 1 errors

npm ERR! Test failed.  See above for more details.
npm ERR! not ok code 0

Now the error is caught, the tests all finish, and you get a clear report about which test caused the error.

This functionality is supported on Node.js and in the browser. As far as I know (and I’ve tried a lot of different frameworks) the only other test frameworks that do this are Mocha and Buster. If you have a similar problem, you can catch uncaught errors like this:

// Node.js
process.addListener('uncaughtException', function(error) {
  // handle error
});

// Browsers
window.addEventListener('error', function(event) {
  // handle event
}, false);

On Node, this is particularly useful for stopping servers crashing in case of an error. In the browser, it’s mostly useful for reporting, although because the argument to the callback is a DOM event rather than an exception object, the information you can get out of it tends to be lacking. Note that for old IEs you’ll need to use window.attachEvent('onerror'), and Opera only supports catching these errors with window.onerror.
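
As a rough illustration of how a runner might attribute these errors to a test, here is a minimal sketch. The names createErrorCollector, begin, handler and report are hypothetical and are not part of JS.Test; the only idea is to remember which test is running when the global handler fires. The handler is called by hand here rather than registered with process or window, so the snippet stays synchronous.

```javascript
// Hypothetical helper, not part of JS.Test: remembers the running test
// and records any stray error against it.
function createErrorCollector() {
  var current = null, errors = [];
  return {
    // Call at the start of each test
    begin: function(testName) { current = testName; },

    // Intended to be registered as the global error callback,
    // e.g. the 'uncaughtException' listener on Node
    handler: function(error) {
      errors.push({test: current || '(no test running)', message: error.message});
    },

    // Returns a copy of everything recorded so far
    report: function() { return errors.slice(); }
  };
}

var collector = createErrorCollector();
collector.begin('authorizes the client');

// Simulate the global handler firing while that test is in progress
collector.handler(new Error('unexpected arguments'));

console.log(collector.report());
```

A real runner would also have to decide when a test with pending async work has actually finished, which this sketch ignores.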

While researching this, I was really surprised to see how many very widely used frameworks don’t do this. The best alternative I’ve seen is in Jasmine: this example does not report the async error, but the test times out because it is never resumed. jasmine-node doesn’t catch this at all, which is why the test is only run if global.window exists. I’ve seen several other frameworks that either crash on Node, or in the browser simply stop updating the view, giving you no feedback that the test runner has halted without running all the tests.

Since most frameworks don’t catch these errors, I would assume this isn’t actually a problem for most people. Is this true? I’d like to know how other people deal with this situation.

If you want to give JS.Class a go, just run npm install jsclass or download it from the website.

Organizing a project with JS.Packages

I’ve been asked by a few users of JS.Class to explain how I use it to organize projects. I’ve been meaning to write this up for quite a while, ever since we adopted it at Songkick for managing our client-side codebase. Specifically, we use JS.Packages to organize our code, and JS.Test to test it, and I’m mostly going to talk about JS.Packages here.

JS.Packages is my personal hat-throw into the ring of JavaScript module loaders. It’s designed to separate dependency metadata from source code, and be capable of loading just about anything as efficiently as possible. It works at a more abstract level than most script loaders: users specify objects they want to use, rather than scripts they want to load, allowing JS.Packages to optimize downloads for them and load modules that have their own loading strategies, all through a single interface, the JS.require() function.

As an example, I’m going to show how we at Songkick use JS.Packages within our main Rails app. We manage our JavaScript and CSS by doing as much as possible in those languages, and finding simple ways to integrate with the Rails stack. JS.Packages lets us specify where our scripts live and how they depend on each other in pure JavaScript, making this information portable. We use JS.require() to load our codebase onto static pages for running unit tests without the Rails stack, and we use jsbuild and AssetHat to package it for deployment. Nowhere in our setup do we need to manage lists of script tags or worry about load order.

The first rule of our codebase is: every class/module lives in its own file, much like how we organize our Ruby code. And this means every namespace: even if a namespace has no methods of its own but just contains other classes, we give it a file so that other files don’t have to guess whether the namespace is defined or not. For example a file containing a UI widget class might look like this:

// public/javascripts/songkick/ui/widget.js

Songkick.UI.Widget = function() {
  // ...
};

This file does not have to check whether Songkick or Songkick.UI is defined, it just assumes they are. The namespaces are each defined in their own file:

// public/javascripts/songkick.js
Songkick = {};

// public/javascripts/songkick/ui.js
Songkick.UI = {};

Notice how each major class or namespace lives in a file named after the module it contains; this makes it easier to find things while hacking and lets us take advantage of the autoload() feature in JS.Packages to keep our dependency data small. It looks redundant at first, but it helps maintain predictability as the codebase grows. It results in more files, but we bundle everything for production so we keep our code browsable without sacrificing performance. I’ll cover bundling later on.

To drive out the implementation of our UI widget, we use JS.Test to write a spec for it. I’m just going to give it some random behaviour for now to demonstrate how we get everything wired up.

// test/js/songkick/ui/widget_spec.js

Songkick.UI.WidgetSpec = JS.Test.describe("Songkick.UI.Widget", function() { with(this) {
  before(function() { with(this) {
    this.widget = new Songkick.UI.Widget("foo")
  }})
  
  it("returns its attributes", function() { with(this) {
    assertEqual( {name: "foo"}, widget.getAttributes() )
  }})
}})

So now we’ve got a test and some skeleton source code, how do we run the tests? First, we need a static page to load up the JS.Packages loader, our manifest (which we’ll get to in a second) and a script that runs the tests:

// test/js/browser.html

<!doctype html>
<html>
  <head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8">
    <title>JavaScript tests</title>
  </head>
  <body>
    
    <script type="text/javascript">ROOT = '../..'</script>
    <script type="text/javascript" src="../../vendor/jsclass/min/loader.js"></script>
    <script type="text/javascript" src="../../public/javascripts/manifest.js"></script>
    <script type="text/javascript" src="./runner.js"></script>
    
  </body>
</html>

The file runner.js should be very simple: ideally we just want to load Songkick.UI.WidgetSpec and run it:

// test/js/runner.js

// Don't cache files during tests
JS.cacheBust = true;

JS.require('JS.Test', function() {
  
  JS.require(
    'Songkick.UI.WidgetSpec',
    // more specs as the app grows...
    function() { JS.Test.autorun() });
});

The final missing piece is the manifest, the file that says where our files are stored and how they depend on each other. Let’s start with a manifest that uses autoload() to specify all our scripts’ locations; I’ll present the code and explain what each line does.

// public/javascripts/manifest.js

JS.Packages(function() { with(this) {
  var ROOT = JS.ENV.ROOT || '.'
  
  autoload(/^(.*)Spec$/,     {from: ROOT + '/test/js', require: '$1'});
  autoload(/^(.*)\.[^\.]+$/, {from: ROOT + '/public/javascripts', require: '$1'});
  autoload(/^(.*)$/,         {from: ROOT + '/public/javascripts'});
}});

The ROOT setting simply lets us override the root directory for the manifest, as we do on our test page. After that, we have three autoload() statements. When you call JS.require() with an object that’s not been explicitly configured, the autoload() rules are examined in order until a match for the name is found.

The first rule says that object names matching /^(.*)Spec$/ (that is, test files) should be loaded from the test/js directory. For example, Songkick.UI.WidgetSpec should be found in test/js/songkick/ui/widget_spec.js. The require: '$1' means that the object depends on the object captured by the regex, so Songkick.UI.WidgetSpec requires Songkick.UI.Widget to be loaded first, as you’d expect.

The second rule makes sure that the containing namespace for any object is loaded before the object itself. For example, it makes sure Songkick.UI is loaded before Songkick.UI.Widget, and Songkick before Songkick.UI. The regex captures everything up to the final . in the name, and makes sure it’s loaded using require: '$1'.

The third rule is a catch-all: any object not matched by the above rules should be loaded from public/javascripts. Because of the preceding rule, this only matches root objects, i.e. it matches Songkick but not Songkick.UI. Taken together, these rules say: load all objects from public/javascripts, and make sure any containing namespaces are loaded first.
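
To make the rule matching concrete, here is a small standalone sketch of how three such rules could resolve a name into a file and a dependency. This is not the JS.Packages implementation: the rules array, pathFor(), resolve() and loadOrder() are hypothetical names, ROOT is left out, and the name-to-path conversion (Songkick.UI.WidgetSpec becomes songkick/ui/widget_spec.js) is my assumption based on the file names used in this post.

```javascript
// Hypothetical model of the three autoload() rules, checked in order
var rules = [
  {pattern: /^(.*)Spec$/,     from: 'test/js',            require: '$1'},
  {pattern: /^(.*)\.[^\.]+$/, from: 'public/javascripts', require: '$1'},
  {pattern: /^(.*)$/,         from: 'public/javascripts'}
];

// Assumed convention: each name segment becomes a lowercase, underscored path segment
function pathFor(name) {
  return name.split('.').map(function(part) {
    return part.replace(/([a-z0-9])([A-Z])/g, '$1_$2').toLowerCase();
  }).join('/') + '.js';
}

// Find the first rule matching the name; return its file and dependencies
function resolve(name) {
  for (var i = 0; i < rules.length; i++) {
    var match = name.match(rules[i].pattern);
    if (!match) continue;
    return {
      path: rules[i].from + '/' + pathFor(name),
      requires: rules[i].require ? [rules[i].require.replace('$1', match[1])] : []
    };
  }
  return null;
}

// Walk the dependency chain, listing files with dependencies first
function loadOrder(name, seen) {
  seen = seen || [];
  var pkg = resolve(name);
  pkg.requires.forEach(function(dep) { loadOrder(dep, seen); });
  if (seen.indexOf(pkg.path) < 0) seen.push(pkg.path);
  return seen;
}

console.log(loadOrder('Songkick.UI.WidgetSpec'));
```

In this sketch the spec resolves through the first rule and requires Songkick.UI.Widget, which resolves through the second rule and requires Songkick.UI, and so on down to Songkick, which only the catch-all matches. The printed list therefore starts with the namespaces and ends with the spec file.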

Let’s implement the code needed to make the test pass. We’re going to use jQuery to do some trivial operation; the details aren’t important but it causes a dependency problem that I’ll illustrate next.

// public/javascripts/songkick/ui/widget.js

Songkick.UI.Widget = function(name) {
  this._name = name;
};

Songkick.UI.Widget.prototype.getAttributes = function() {
  return jQuery.extend({}, {name: this._name});
};

If you open the page test/js/browser.html, you’ll see an error.

The test doesn’t work because jQuery is not loaded; this means part of our codebase depends on it but JS.Packages doesn’t know that. Remember runner.js just requires Songkick.UI.WidgetSpec? We can use jsbuild to see which files get loaded when we require this object. (jsbuild is a command-line tool I wrote after an internal project at Amazon, that was using JS.Class, decided they needed to pre-compile their code for static analysis rather than loading it dynamically at runtime. You can install it by running npm install -g jsclass.)

$ jsbuild -m public/javascripts/manifest.js -o paths Songkick.UI.WidgetSpec
public/javascripts/songkick.js
public/javascripts/songkick/ui.js
public/javascripts/songkick/ui/widget.js
test/js/songkick/ui/widget_spec.js

As expected, it loads the containing namespaces, the Widget class, and the spec, in that order. But the Widget class depends on jQuery, so we need to tell JS.Packages about this. However, rather than adding it as a dependency to every UI module in our application, we can use a naming convention trick: all our UI modules require Songkick.UI to be loaded first, so we can make everything in that namespace depend on jQuery by making the namespace itself depend on jQuery. We update our manifest like so:

// public/javascripts/manifest.js

JS.Packages(function() { with(this) {
  var ROOT = JS.ENV.ROOT || '.';
  
  file('https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js')
    .provides('jQuery', '$');
  
  autoload(/^(.*)Spec$/,     {from: ROOT + '/test/js', require: '$1'});
  autoload(/^(.*)\.[^\.]+$/, {from: ROOT + '/public/javascripts', require: '$1'});
  autoload(/^(.*)$/,         {from: ROOT + '/public/javascripts'});
  
  pkg('Songkick.UI').requires('jQuery');
}});

Running jsbuild again shows jQuery will be loaded, and if you reload the tests now they will pass:

$ jsbuild -m public/javascripts/manifest.js -o paths Songkick.UI.WidgetSpec
public/javascripts/songkick.js
https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js
public/javascripts/songkick/ui.js
public/javascripts/songkick/ui/widget.js
test/js/songkick/ui/widget_spec.js

So we’ve now got a working UI widget, and we can use exactly the same approach to load it in our Rails app: load the JS.Packages library and our manifest, and call JS.require('Songkick.UI.Widget'). But in production, we’d rather not be downloading all those tiny little files one at a time; it’s much more efficient to bundle them into one file.

To bundle our JavaScript and CSS for Rails, we use AssetHat, or rather a fork we made to tweak a few things. Our fork notwithstanding, AssetHat is the closest of the handful of Rails packaging solutions we tried that did everything we needed, and I highly recommend it.

AssetHat uses a file called config/assets.yml, in which you list all the bundles you want and which files should go in each section. But I’d rather specify which objects I want in each bundle; we already have tooling that figures out which files we need and in what order so I’d rather not duplicate that information. But fortunately, AssetHat lets you put ERB in your config, and we use this to shell out to jsbuild to construct our bundles for us.

First, we write a jsbuild bundles file that says which objects our application needs. We exclude jQuery from the bundle because we’ll probably load that from Google’s CDN.

// config/bundles.json
{
  "app" : {
    "exclude" : [ "jQuery" ],
    "include" : [
      "Songkick.UI.Widget"
    ]
  }
}

This is a minimal format that’s close to what the application developer works with: objects. It’s easy to figure out which objects your app needs, less simple to make sure you only load the files you need and get them in the right order, in both your test pages and your application code. We can use jsbuild to tell us which files will go into this bundle:

$ jsbuild -m public/javascripts/manifest.js -b config/bundles.json -o paths app
public/javascripts/songkick.js
public/javascripts/songkick/ui.js
public/javascripts/songkick/ui/widget.js

Now all we need to do is pipe this information into AssetHat. This is easily done with a little ERB magic:

// config/assets.yml
# ...
js:
  <%  def js_bundles
        JSON.parse(File.read('config/bundles.json')).keys
      end
      
      def paths_for_js_bundle(name)
        jsbuild = 'jsbuild -m public/javascripts/manifest.js -b config/bundles.json'
        `#{jsbuild} -o paths -d public/javascripts #{name}`.split("\n")
      end
  %>
  
  bundles:
  <% js_bundles.each do |name| %>
    <%= name %>:
    <% paths_for_js_bundle(name).each do |path| %>
      - <%= path %>
    <% end %>
  <% end %>

Running the minification task takes the bundles we’ve defined in bundles.json and packages them for us:

$ rake asset_hat:minify
Minifying CSS/JS...

 Wrote JS bundle: public/javascripts/bundles/app.min.js
        contains: public/javascripts/songkick.js
        contains: public/javascripts/songkick/ui.js
        contains: public/javascripts/songkick/ui/widget.js
        MINIFIED: 14.4% (Engine: jsmin)

This bundle can now be loaded in your Rails views very easily:

<%= include_js :bundle => 'app' %>

This will render script tags for each individual file in the bundle during development, and a single script tag containing all the code in production. (You may have to disable the asset pipeline in recent Rails versions to make this work.)

So that’s our JavaScript strategy. As I said earlier, the core concern is to express dependency information in one place, away from the source code, in a portable format that can be used just as easily in a static web page as in your production web framework. Using autoload() and some simple naming conventions, you can get all these benefits while keeping the configuration very small indeed.

But wait, there’s more!

As a demonstration of how valuable it is to have portable dependency data and tests, consider the situation where we now want to run tests from the command line, or during our CI process. We can load the exact same files we load in the browser, plus a little stubbing of the jQuery API, and make our tests run on Node:

// test/js/node.js

require('jsclass');
require('../../public/javascripts/manifest');

JS.ENV.jQuery = {
  extend: function(a, b) {
    for (var k in b) a[k] = b[k];
    return a;
  }
};

JS.ENV.$ = JS.ENV.jQuery;

require('./runner');

And lo and behold, our tests run:

$ node test/js/node.js 
Loaded suite Songkick.UI.Widget

Started
.
Finished in 0.003 seconds
1 tests, 1 assertions, 0 failures, 0 errors

Similarly, we can write a quick PhantomJS script to parse the log messages that JS.Test emits:

// test/js/phantom.js

var page = new WebPage();

page.onConsoleMessage = function(message) {
  try {
    var result = JSON.parse(message).jstest;
    if ('total' in result && 'fail' in result) {
      console.log(message);
      var status = (!result.fail && !result.error) ? 0 : 1;
      phantom.exit(status);
    }
  } catch (e) {}
};

page.open('test/js/browser.html');

We can now run our tests on a real WebKit instance from the command line:

$ phantomjs test/js/phantom.js 
{"jstest":{"fail":0,"error":0,"total":1}}

One nice side-effect of doing as much of this as possible in JavaScript is that it improves your API design and makes you decouple your JS from your server-side stack; if it can’t be done through HTML and JavaScript, your code doesn’t do it. This makes it easy to keep your code portable, making it easier to reuse across applications with different server-side stacks.

Black-box criteria

Tim Bray recently published an article called Type-System Criteria, in which he makes the argument that Java, or statically-typed languages in general, is better-suited to mobile development than the dynamically-typed languages that are more prevalent in web development circles. The reason he gives for this boils down to API surface size:

Another observation that I think is partially but not entirely a consequence of API scale is testing difficulty. In my experience it’s pretty easy and straightforward to unit-test Web Apps. There aren’t that many APIs to mock out, and at the end of the day, these things take data in off the wire and emit other data down the wire and are thus tractable to black-box, in whole or in part.

On the other hand, I’ve found that testing mobile apps is a major pain in the ass. I think the big reason is all those APIs. Your average method in a mobile app responds to an event and twiddles APIs in the mobile framework. If you test at all completely you end up with this huge tangle of mocks that pretty soon start getting in the way of seeing what’s actually going on.

The argument goes that, as the API surface you need to integrate with becomes larger, static type systems become more attractive. I don’t disagree, in part because I don’t have nearly enough experience with static languages to have an informed opinion on them. But at a gut level I believe this to be true; in fact, I’d be willing to bet that a majority of the bugs I’ve written while refactoring software could have been caught by a static type checker (and not even a very sophisticated one, at that).

But the excerpt I quoted above contains a code smell, and it points to another reason why mobile development is difficult. It’s not the size of the APIs that’s the big problem: it’s the nature of the application.

Web application servers are comparatively easy to test because the tests can be written by talking to an encapsulated black box. You throw a request (or several) at a web server, you read what comes back, and check it looks like what you expected. On the other hand, testing web application clients is much more complex: instead of doing simple call/response testing, you have to initiate events within the application’s environment, and then monitor changes to that environment that you expect the events to cause. The core difference here is that client-side programs tend to be what I’m going to refer to as ‘stateful user interfaces’, and mobile (and desktop) software falls into the same category.

What exactly do I mean by ‘stateful user interface’? When you call a web server, you don’t need to hold onto any state on your end: you ask the server a question by sending it a request, and it sends back a fully-formed, self-contained response. When you’ve checked that response, you throw it away and start the next test. In contrast, stateful user interfaces are long-running processes in which incremental changes are made to what the user sees. Instead of getting a fresh new page, just a part of the view is changed, or a sound is emitted, or a notification generated, or a vibration initiated. The programming paradigm in a server environment emphasises call/response, statelessness and immutability; in a client environment you have side effects, state and incremental change. Testing in such environments is hard.

I think this, rather than large API surface, is the real problem. Large API surfaces are only a problem if your application code talks to them directly, and this is much more common in side-effect-heavy applications. Unit tests in these environments tend to be messy for several reasons:

  • Application code responds to events triggered by the host environment
  • Business logic produces its output by modifying the host environment rather than returning values
  • It is hard or impossible to reset the environment to a clean state between tests

The third reason is a particular problem when unit testing client-side JavaScript, and I’ve seen plenty of tests where the state of the page or the implementation of event listeners is such that it becomes very difficult to keep each test independent of the others. You also have the problem that anything that causes a page refresh will cause your test runner to vanish. (I wrote about this exact problem in Refactoring towards testable JavaScript.)

So if side-effect-heavy programs cause large API surfaces to be a problem, what should we do about it? The answer comes down to something I think of as ‘avoiding framework-isms’. This means that any time you have a framework or host environment in which user input or third-party code drives your application, the sooner you can dispatch to something you control the better. The classic example of this is the ‘fat model, skinny controller’ mantra popular in the Rails community: rather than dump lots of code in a controller that’s only invoked by the host server and framework, turn the request into calls to models. This way, the bulk of the logic is in objects that you control the interface to, and that are easy to create and manipulate, properties that also make them easy to test.

In client-side JavaScript and other stateful user interfaces, this means keeping event listeners small. Ideally an event listener should extract all the necessary data from the event and the current application state, and use this to make a black-box call to a module containing the real business logic. It means making sure orthogonal components of a user interface do not talk to each other directly, but publish data changes via a message bus. And it means writing business logic that returns results rather than causes side-effects; the side-effects again being dealt with by thin bindings to the host environment.
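
Here is a minimal sketch of that split, using a made-up character counter as the feature. The function names remainingCharacters() and bindCounter(), and the 'error' class name, are hypothetical and chosen only for illustration. The first function is pure and can be tested with plain values; the second is the thin binding that touches the host environment.

```javascript
// Pure business logic: takes values, returns a result, never touches the DOM
function remainingCharacters(text, limit) {
  var remaining = limit - text.length;
  return {remaining: remaining, overLimit: remaining < 0};
}

// Thin binding: extract data from the event, make a black-box call,
// then apply the result to the page as a side effect
function bindCounter(input, output, limit) {
  input.addEventListener('keyup', function(event) {
    var result = remainingCharacters(event.target.value, limit);
    output.textContent = String(result.remaining);
    output.className = result.overLimit ? 'error' : '';
  }, false);
}

// The logic can be exercised without a browser
console.log(remainingCharacters('Hello, world!', 10));
```

Most of the tests can then target remainingCharacters() directly, and the listener only needs a single test with stand-in input and output objects to check the wiring.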

I’ll finish up with a small but illustrative example. Say you’re writing a WebSocket implementation, and the protocol mandates that when you call socket.send('Hello, world!') then the bytes 81 8d ed a3 88 c3 a5 c6 e4 af 82 8f a8 b4 82 d1 e4 a7 cc should be written to the TCP socket. You could write a test for it by mocking out the whole network stack (which I’ve probably glossed over considerably here):

describe WebSocket do
  before do
    @tcp_socket = mock('TCP socket')
    TCP.should_receive(:connect).with('example.com', 80).and_return @tcp_socket
    @web_socket = WebSocket.new('ws://example.com/')
  end
  
  it "writes a message to the socket" do
    @tcp_socket.should_receive(:write).with [0x81, 0x8d, 0xed, 0xa3, 0x88, 0xc3, 0xa5, 0xc6, 0xe4, 0xaf, 0x82, 0x8f, 0xa8, 0xb4, 0x82, 0xd1, 0xe4, 0xa7, 0xcc]
    @web_socket.send("Hello, world!")
  end
  
  # More mock-based protocol tests...
end

Or you could test it by implementing a pure function that turns text into WebSocket frames, leaving the code that actually deals with networking doing only that and nothing else:

describe WebSocket::Parser do
  before do
    @parser = WebSocket::Parser.new
  end
  
  it "turns text into message frames" do
    @parser.frame("Hello, world!").should == [0x81, 0x8d, 0xed, 0xa3, 0x88, 0xc3, 0xa5, 0xc6, 0xe4, 0xaf, 0x82, 0x8f, 0xa8, 0xb4, 0x82, 0xd1, 0xe4, 0xa7, 0xcc]
  end
  
  # More protocol implementation tests...
end

describe WebSocket do
  before do
    @tcp_socket = mock('TCP socket')
    TCP.should_receive(:connect).with('example.com', 80).and_return @tcp_socket
    
    @parser = mock('parser')
    WebSocket::Parser.should_receive(:new).and_return @parser
    
    @web_socket = WebSocket.new('ws://example.com/')
  end
  
  it "converts text to frames and sends them" do
    frame = mock('frame')
    @parser.should_receive(:frame).with("Hello, world!").and_return frame
    @tcp_socket.should_receive(:write).with(frame)
    @web_socket.send("Hello, world!")
  end
  
  # And we're done here
end

This separates the business logic (implementing the WebSocket protocol) away from the side effects to the host environment (writing to network connections). This results in code that’s more modular, much easier to test, and less coupled to the API surface of the host environment. If a static type system helps you with that then have at it, but recognize when it’s a symptom of a deeper problem.
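
To make the ‘pure function’ idea concrete, here is a minimal sketch of such a framing function, written in JavaScript rather than the Ruby used in the specs above. It assumes a single unfragmented text frame with a payload of at most 125 bytes, and it takes the masking key as an argument so that the output is deterministic; frameText is a hypothetical name, not part of any real library:

```javascript
// Hypothetical helper: turns text into the bytes of one masked WebSocket
// text frame. Only handles payloads short enough for a 7-bit length field.
function frameText(text, maskKey) {
  var payload = Buffer.from(text, 'utf8');
  if (payload.length > 125)
    throw new Error('This sketch only supports payloads up to 125 bytes');

  // 0x81 = final fragment + text opcode; 0x80 marks the payload as masked
  var bytes = [0x81, 0x80 | payload.length];

  // The four-byte masking key follows the length
  for (var i = 0; i < 4; i++) bytes.push(maskKey[i]);

  // Each payload byte is XORed with the key, cycling through its four bytes
  for (var j = 0; j < payload.length; j++)
    bytes.push(payload[j] ^ maskKey[j % 4]);

  return bytes;
}
```

Calling frameText('Hello, world!', [0xed, 0xa3, 0x88, 0xc3]) produces the byte sequence quoted in the specs above. A real client would pick a fresh, unpredictable masking key for every frame; passing the key in is what lets the function stay pure and easy to test.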

Terminus 0.3: control multiple browsers with Ruby

As you’ll have noticed if you made it to the end of my last post, there is a new release of Terminus. Terminus is a Capybara driver that is designed to let you control your app in any browser on any device, by sending all driver instructions to be executed on the client side in JavaScript.

This release is the first since Capybara 1.0, and supports the entire Capybara API. This includes:

  • Reading response headers and status codes
  • Handling cookies
  • Running JavaScript and receiving the results
  • Resynchronizing XHR requests (jQuery only)
  • Switching between frames and windows
  • Detecting infinite redirects

This is a superset of the supported features of the Rack::Test and Selenium drivers, and has the added bonus of letting you switch between browsers. When you have multiple browsers connected to your Terminus server, you can select which one you want to control by matching on the browser’s name, OS, version and current URL, for example:

Terminus.browser = {:name => /Safari/, :current_url => /pitchfork.com/}

You can select any browser that is ‘docked’, i.e. idling on the Terminus holding page:

Terminus.browser = :docked

Or simply by selecting one browser from the list:

Terminus.browser = Terminus.browsers.first

All this lets you control multiple browsers at once; for example, I’ve been using it to automate some of the Faye integration tests:

#================================================================
# Acquire some browsers and log into each with a username

NAMES = %w[alice bob carol]
BROWSERS = {}
Terminus.ensure_browsers 3

Terminus.browsers.each_with_index do |browser, i|
  name = NAMES[i]
  puts "#{name} is using #{browser}"
  BROWSERS[name] = browser
  Terminus.browser = browser
  visit '/'
  fill_in 'username', :with => name
  click_button 'Go'
end

#================================================================
# Send a message from each browser to every other browser,
# and check that it arrived. If it doesn't arrive, send all
# the browsers back to the dock and raise an exception

BROWSERS.each do |name, sender|
  BROWSERS.each do |at, target|
    next if at == name
    
    Terminus.browser = sender
    fill_in 'message', :with => "@#{at} Hello, world!"
    click_button 'Send'
    
    Terminus.browser = target
    unless page.has_content?("#{name}: @#{at} Hello, world!")
      Terminus.return_to_dock
      raise "Message did not make it from #{sender} to #{target}"
    end
  end
end

#================================================================
# Re-dock all the browsers when we're finished

Terminus.return_to_dock

So what’s not supported? Internet Explorer is still not supported because I cannot find a decent way to run XPath queries on it. I was working on Pathology to solve this but I can’t get it to perform well enough for the workload Capybara throws at it. It might be possible to work around this by monkey-patching Capybara to pass through CSS selectors instead of compiling them to XPath, though. File attachments are not supported for security reasons, and there are still some bugs that show up if you do stuff you’re not supposed to, like using duplicate element IDs. These are particularly apparent on Opera. And finally visiting remote hosts outside your application is supported but is not particularly robust as yet.

You can find out more and see a video of it in action on its new website.

Refactoring towards testable JavaScript, part 3

This article is part 3 of a 3-part series.

We finished up the previous article having separated the business logic from the DOM interactions in our JavaScript, and adjusted our unit tests to take advantage of this. In the final part of this series, we’ll take a look at how to take the tests we have and run them across a range of browsers automatically to give us maximum confidence that our code works.

To automate cross-browser testing, I use a much-overlooked tool called TestSwarm. Developed by John Resig for testing jQuery, it takes care of tracking your test status in multiple browsers as you make commits to your project.

To set it up, we need to clone it from GitHub and create a directory within to host revisions of our project.

$ git clone git://github.com/jquery/testswarm.git
$ cd testswarm
$ cp config/config-sample.ini config.ini
$ mkdir -p changeset/jsapp

You’ll need to create a MySQL database for it:

CREATE USER 'testswarm'@'localhost' IDENTIFIED BY 'choose-a-password';
CREATE DATABASE testswarm;
GRANT ALL ON testswarm.* TO 'testswarm'@'localhost';

Then import the TestSwarm schema:

$ mysql -u testswarm -p testswarm < config/testswarm.sql
$ mysql -u testswarm -p testswarm < config/useragents.sql

Once you’ve added the database details to config.ini and set up an Apache VHost, you can visit your TestSwarm server and click ‘Signup’. Once you’ve filled in that form you’ll be able to grab your auth key from the database:


$ mysql testswarm -u testswarm -p
mysql> select auth from users;
+------------------------------------------+
| auth                                     |
+------------------------------------------+
| a962c548c22a591e8f150b9d9f6b673b6f212d08 |
+------------------------------------------+

Keep that auth code somewhere as you’ll need it later on. Now before we go any further, to show how TestSwarm works I want to deliberately break our application so that it doesn’t work in Internet Explorer. To do this I’m going to replace jQuery#bind with HTMLElement#addEventListener, and when we push our code to TestSwarm we should see it break.

To get our tests running on TestSwarm, we need a config file. I just grabbed one of the standard Perl scripts from the TestSwarm project and added my own configuration. This tells the script where your TestSwarm server is, where your code should be checked out, any build scripts you need to run, which files to load, etc. JS.Test includes TestSwarm support baked in so we don’t need to modify our tests at all to make them send reports to the TestSwarm server; we just need to load the same old spec/browser.html file we’ve been using all along. You should configure the Perl script to clone your project into the changeset/jsapp directory we created earlier: this is in TestSwarm’s public directory so web browsers will be able to load it from there. You’ll need to include the auth key we created earlier to submit jobs to the server.

Having created this file, we clone the project on our server somewhere and create a cron job to periodically update our copy and run the TestSwarm script: this means that new test jobs will be submitted whenever we commit to the project.

# crontab
* * * * * cd $HOME/projects/jsapp && git pull && perl spec/testswarm.pl

If you now open a couple of browsers and connect to the swarm, you’ll see tests begin to run. If you inspect the test results for our project you should see this:

The green box is Chrome reporting 5 passed tests, and the black box is IE8 reporting 7 errors. If we click through we see what happened:

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)

  • Error:
    FormValidator with valid data displays no errors:
    TypeError: Object doesn’t support this property or method
  • Error:
    FormValidator with an invalid name displays an error message:
    TypeError: Object doesn’t support this property or method
  • Error:
    FormValidator with an invalid email displays an error message:
    TypeError: Object doesn’t support this property or method
  • Error:
    FormValidator with an invalid argument displays an error message:
    TypeError: Object doesn’t support this property or method
  • Error:
    FormValidator with an invalid name displays an error message:
    TypeError: Object doesn’t support this property or method
  • Error:
    FormValidator with an invalid name displays an error message:
    TypeError: ‘submit’ is undefined
  • Error:
    FormValidator with an invalid name displays an error message:
    TypeError: ‘error’ is undefined

5 tests, 4 assertions, 0 failures, 7 errors

“Object doesn’t support this property or method” is IE’s way of saying you’re calling a method that doesn’t exist, in our case addEventListener(). If we make some changes so that we use attachEvent() instead in IE, then when TestSwarm picks up the change and runs our tests, they go green in IE.
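
The change could look something like the following sketch, which picks whichever event API the element supports. The helper name listen is hypothetical, and this is one possible way to write the fallback rather than the exact code from the project:

```javascript
// Hypothetical helper: registers a handler using whichever event API the
// element supports, and returns the name of the API that was used.
function listen(element, type, handler) {
  if (element.addEventListener) {
    element.addEventListener(type, handler, false);
    return 'addEventListener';
  }
  if (element.attachEvent) {
    // Old IE expects the event name with an 'on' prefix and does not set
    // `this` to the element, so we bind it explicitly.
    element.attachEvent('on' + type, function(event) {
      return handler.call(element, event || window.event);
    });
    return 'attachEvent';
  }
  throw new Error('No supported event API found');
}
```

In the validator this would replace the direct addEventListener() call, for example listen(form, 'submit', onSubmit), where form is a DOM element and onSubmit is whatever handler you already have.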

You can leave any number of browsers connected to the swarm and they will run tests automatically as you make commits to the project. This is the great advantage of sticking to portable unit tests for your JavaScript: it makes this kind of automation much easier, since you run tests in real browsers and don’t need a lot of additional tooling to set up fake environments. I run the JS.Class tests this way and it’s great for making sure my code works across all platforms before shipping a release.

The final big win, having set all this up, is that we can now delete some of our full-stack tests since they just duplicate our unit tests. When we’re testing a full-stack integration, we really just want to test at the level of abstraction the integration works at, i.e. test the glue that binds the pieces together, rather than testing all the different cases of every piece of business logic. In our case, this means testing two broad cases: either the form is valid or it is not valid. We have unit tests that cover in detail what ‘valid’ and ‘invalid’ mean, and we don’t need to duplicate these. We just need to test their effect on the application as a whole: either the form submits or it doesn’t. We can then instantly discard all the integration tests that cover the validation details, leaving the broad cases covered. Doing this rigorously will keep your integration tests to a minimum and keep your build running quickly.

To wrap up this series, I thought I’d mention a couple other things we can do with our tests to get extra coverage. The first is the JS.Test coverage tool. If we write our code using JS.Class, we can make the test framework report which methods were called during the test just by adding cover(FormValidator) to the top of the spec. When we run the tests we get a report:

$ node spec/console.js
Loaded suite FormValidator

Started
....

  +-----------------------------------+-------+
  | Method                            | Calls |
  +-----------------------------------+-------+
  | Method:FormValidator.validate     | 4     |
  | Method:FormValidator#handleSubmit | 0     |
  | Method:FormValidator#initialize   | 0     |
  +-----------------------------------+-------+

Finished in 0.005 seconds
4 tests, 4 assertions, 0 failures, 0 errors

If any of the methods are not called, the process exits with a non-zero exit status so you can treat your build as failing until all the methods are called during the test.
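
To illustrate the general idea of method-call coverage (this is a hypothetical sketch, not how JS.Test implements cover()), you can wrap each method in a counter and derive an exit status from the counts. The names cover, uncovered and exitStatus below are invented for this example:

```javascript
// Wraps the named methods of an object so that each call is counted.
// Returns the table of call counts, keyed by method name.
function cover(object, methodNames) {
  var calls = {};
  methodNames.forEach(function(name) {
    var original = object[name];
    calls[name] = 0;
    object[name] = function() {
      calls[name] += 1;
      return original.apply(this, arguments);
    };
  });
  return calls;
}

// Lists the methods that were never called.
function uncovered(calls) {
  return Object.keys(calls).filter(function(name) {
    return calls[name] === 0;
  });
}

// Zero if everything was exercised, non-zero otherwise.
function exitStatus(calls) {
  return uncovered(calls).length === 0 ? 0 : 1;
}
```

At the end of a test run you would pass exitStatus(calls) to process.exit() on Node, so that a build script sees a failure whenever a method was left unexercised.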

Finally, I have an ongoing experimental Capybara driver called Terminus that lets you run your Capybara-based tests on remote machines like phones, iPads and so on. If we change our Capybara driver as required, we can open a browser on a remote machine, connect to the Terminus server and run the tests on that machine, or on many machines at once if your tests involve communication between many clients.

Here’s the full list of software we’ve used in this series:

  • Sinatra – Ruby web application framework used to create our application stack
  • jQuery – client side DOM, Ajax and effects library used to handle form submissions
  • JS.Class, JS.Test – portable object system and testing framework for JavaScript
  • Cucumber – Ruby acceptance testing framework used for writing plain-text test scenarios
  • Capybara – web scripting API that can drive many different backends
  • Rack::Test – Capybara backend that talks to Rack applications directly with no wire traffic, suited to doing fast, in-process testing, does not support JavaScript
  • Selenium – Capybara backend that runs using a real browser, slower but supports JavaScript
  • Terminus – Capybara backend that can drive any remote browser using JavaScript
  • PhantomJS – headless distribution of WebKit, scriptable using JavaScript
  • TestSwarm – automated cross-browser CI server for tracking JavaScript unit tests across the project history

I’ll leave you with a few points to bear in mind to keep your JavaScript unit-testable:

  • Minimize DOM interaction – write your business logic in pure JavaScript, test it server-side, and use a ‘controller’ layer to bind this logic to your UI.
  • Keep controllers DOM focused – in JavaScript, ‘controllers’ in MVC parlance are basically your event handlers. They should handle user input, trigger actions in your business logic, and update the page as appropriate.
  • If you need a browser, use a real one – in my experience, given how easy it is to test on real browsers and minimize integration tests, fake DOM environments are often more pain than they’re worth. The important thing is to keep your code as portable as possible so you can adapt if you spot more suitable tools.

Refactoring towards testable JavaScript, part 2

This article is part 2 of a 3-part series.

At the end of the previous article, we’d just finished reproducing our full-stack Cucumber tests as pure JavaScript unit tests against the FormValidator class, ending up with this spec:

FORM_HTML = '\
    <form method="post" action="/accounts/create">\
      <label for="username">Username</label>\
      <input type="text" id="username" name="username">\
      \
      <label for="email">Email</label>\
      <input type="text" id="email" name="email">\
      \
      <div class="error"></div>\
      <input type="submit" value="Sign up">\
    </form>'

JS.require('JS.Test', function() {

  JS.Test.describe("FormValidator", function() { with(this) {
    before(function() {
      $("#fixture").html(FORM_HTML)
      new FormValidator($("form"))

      this.submit = $("form input[type=submit]")
      this.error  = $("form .error")
    })

    describe("with an invalid name", function() { with(this) {
      before(function() { with(this) {
        $("#username").val("Hagrid")
        submit.click()
      }})

      it("displays an error message", function() { with(this) {
        assertEqual( "Your name is invalid", error.html() )
      }})
    }})

    // ...
  }})

  JS.Test.autorun()
})

These run much faster than their full-stack counterparts, and they let us run the tests in any browser we like. But they’re still not ideal: we’ve made the mistake of tangling up the model with the view, testing validation logic by going through the UI layer. If we separate the business logic from the view logic, we’ll end up with validation functions written in pure JavaScript that doesn’t touch the DOM and that can be tested from the command line.

Before we do that though, let’s move the spec out of the HTML test page and into its own JavaScript file. This will make it easier to load on the command line when we get to that stage. This leaves our HTML page containing just the logic needed to load the code and the tests:

JS.Packages(function() { with(this) {
  
  file('../public/jquery.js')
      .provides('jQuery', '$')
  
  file('../public/form_validator.js')
      .provides('FormValidator')
      .requires('jQuery')
  
  autoload(/^(.*)Spec$/, {from: '../spec/javascript', require: '$1'})
}})

JS.require('JS.Test', function() {
  JS.require('FormValidatorSpec', JS.Test.method('autorun'))
})

We will eventually move this into its own file as well, but for now getting the spec into a separate file is the important step.

Recall our FormValidator class currently looks like this:

function FormValidator(form) {
  var username = form.find('#username'),
      email    = form.find('#email'),
      error    = form.find('.error');

  form.bind('submit', function() {
    if (username.val() === 'Wizard') {
      error.html('Your argument is invalid');
      return false;
    }
    else if (username.val() !== 'Harry') {
      error.html('Your name is invalid');
      return false;
    }
    else if (!/@/.test(email.val())) {
      error.html('Your email is invalid');
      return false;
    }
  });
};

We can refactor so that we get a validation function that doesn’t touch the DOM:

FormValidator = function(form) {
  form.bind('submit', function() {
    var params = form.serializeArray(),
        data   = {};
    
    for (var i = 0, n = params.length; i < n; i++)
      data[params[i].name] = params[i].value;
    
    var errors = FormValidator.validate(data);
    if (errors.length === 0) return true;
    
    form.find('.error').html(errors[0]);
    return false;
  });
};

FormValidator.validate = function(params) {
  var errors = [];
  
  if (params.username === 'Wizard')
    errors.push('Your argument is invalid');
  
  else if (params.username !== 'Harry')
    errors.push('Your name is invalid');
  
  else if (!/@/.test(params.email))
    errors.push('Your email is invalid');
  
  return errors;
};

Notice how FormValidator.validate() does not talk to the DOM at all: it doesn’t listen to events and it doesn’t modify the page. It just accepts a data object and returns a (hopefully empty) list of errors. The FormValidator initialization does the work of listening to form events, marshalling the form’s data, running the validation and printing any errors. The DOM interaction has been separated from the business logic.

This step lets us refactor our tests so that they don’t use the DOM, they just test the business logic:

JS.ENV.FormValidatorSpec = JS.Test.describe("FormValidator", function() { with(this) {
  describe("with valid data", function() { with(this) {
    before(function() { with(this) {
      this.errors = FormValidator.validate({username: "Harry", email: "wizard@hogwarts.com"})
    }})
    
    it("displays no errors", function() { with(this) {
      assertEqual( [], errors )
    }})
  }})
  
  describe("with an invalid name", function() { with(this) {
    before(function() { with(this) {
      this.errors = FormValidator.validate({username: "Hagrid"})
    }})
    
    it("displays an error message", function() { with(this) {
      assertEqual( ["Your name is invalid"], errors )
    }})
  }})
  
  // ...
}})

Testing the business logic without going through the DOM has let us add another test: if the form data is valid, the form submission proceeds unhindered and the page running the tests is unloaded, so we cannot test the valid case through the DOM. By testing the business logic directly, we can test the valid case without worrying about the page changing.

We load up our tests in the browser and once again they are all good.

Since our tests no longer talk to the DOM, we can run them on the command line. We move the logic that loads the code and tests out of the HTML page and into its own file, spec/runner.js:

var CWD = (typeof CWD === 'undefined') ? '.' : CWD

JS.Packages(function() { with(this) {
  file(CWD + '/public/form_validator.js')
      .provides('FormValidator')
  
  autoload(/^(.*)Spec$/, {from: CWD + '/spec/javascript', require: '$1'})
}})

JS.require('JS.Test', function() {
  JS.require('FormValidatorSpec', JS.Test.method('autorun'))
})

This just leaves the test page spec/browser.html needing to load the JS.Class seed file and runner.js:

<!doctype html>
<html>
  <head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8">
    <title>FormValidator tests</title>
    <script type="text/javascript" src="../vendor/js.class/build/min/loader.js"></script>
  </head>
  <body>
    
    <script type="text/javascript">CWD = '..'</script>
    <script type="text/javascript" src="./runner.js"></script>
    
  </body>
</html>

We’ve now moved all our JavaScript out of our HTML and we can run these new JavaScript files on the server side. All we need to do is create a file that performs the same job as spec/browser.html but for server-side platforms. We’ll call this file spec/console.js:

JSCLASS_PATH = 'vendor/js.class/build/src'

if (typeof require === 'function') {
  require('../' + JSCLASS_PATH + '/loader.js')
  require('./runner.js')
} else {
  load(JSCLASS_PATH + '/loader.js')
  load('spec/runner.js')
}

This file performs some feature detection to figure out how to load files. This is the only place we need to do this, since JS.Packages will figure out how to load files for us after this. Let’s try running this script with Node:

$ node spec/console.js
Loaded suite FormValidator

Started
....
Finished in 0.004 seconds
4 tests, 4 assertions, 0 failures, 0 errors

We’ve now got some lightning-fast unit tests of our JavaScript business logic that we can run from the command line. The portability of JS.Test means you can run these tests with Node, V8, Rhino and SpiderMonkey, and with a little adjustment to console.js (see the JS.Test documentation) you can even run them on Windows Script Host.

However, our test coverage has slipped a bit: we are no longer testing the interaction with the DOM at all. We ought to have at least a sanity test that our app is wired together correctly, and we can do this easily by adding a section at the end of our FormValidatorSpec beginning with this:

if (typeof document === 'undefined') return

We can then define a test after this to check the interaction with the DOM that will only be executed if the tests are running in a DOM environment.
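
The shape of a spec using this guard can be sketched as follows. The function formValidatorSpec and the describe callback are hypothetical stand-ins for the real spec body, used here only to show where the early return sits:

```javascript
// Hypothetical stand-in for a spec body. `describe` is any function that
// registers a named group of tests.
function formValidatorSpec(describe) {
  describe('validation rules', function() {
    // pure business-logic tests go here; they run everywhere
  });

  // Everything below needs a DOM, so stop here on server-side platforms
  if (typeof document === 'undefined') return;

  describe('DOM wiring', function() {
    // tests that fill in and submit the form go here
  });
}
```

On a platform with no document object only the first group is registered, while in a browser both are, so the same spec file works in both places.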

To round off this section, let’s get this DOM test running on the command line as well using PhantomJS. This is a headless browser based on WebKit that you can control using JavaScript. It also lets you catch console output emitted by the pages you load, which lets you monitor your tests. I recently made JS.Test emit JSON on the console for just this purpose.

We can create a script to load our test page and capture this output:

var page = new WebPage()

page.onConsoleMessage = function(message) {
  try {
    var event = JSON.parse(message).jstest
    if (!event) return
    
    if (event.status === 'failed')
      return console.log('FAILED: ' + event.test)
    
    if (event.total) {
      console.log(JSON.stringify(event))
      var status = (!event.fail && !event.error) ? 0 : 1
      phantom.exit(status)
    }
    
  } catch (e) {}
}

page.open('spec/browser.html')

As you can see, it’s just a case of parsing every console message we get and checking the data contained therein. If the message signals the end of the tests, we can exit with the appropriate exit status. Let’s give this script a spin:
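
In keeping with the theme of separating logic from side effects, the decision about what to do with each message could itself be pulled out into a pure function, leaving the PhantomJS callback to do only the logging and exiting. This is a sketch under that assumption; interpret is a hypothetical name, and the event fields are the ones used in the script above:

```javascript
// Hypothetical pure helper: decides what to do with one console message.
// Returns null to ignore it, or an object describing what to log and,
// when the run is over, which exit status to use.
function interpret(message) {
  var event;
  try {
    event = JSON.parse(message).jstest;
  } catch (e) {
    return null; // not a JSON test event, so ignore it
  }
  if (!event) return null;

  if (event.status === 'failed')
    return {log: 'FAILED: ' + event.test};

  if (event.total) {
    var status = (!event.fail && !event.error) ? 0 : 1;
    return {log: JSON.stringify(event), exit: status};
  }

  return null;
}
```

The onConsoleMessage callback would then call interpret(message), print the log field if there is one, and call phantom.exit() only when the result has an exit field. The parsing rules can be unit-tested on any platform without launching PhantomJS.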

$ phantomjs spec/phantom.js
{"fail":0,"error":0,"total":5}

So we’ve now got full DOM integration tests we can run on the command line, letting us roll this into our continuous integration cycle. You can run PhantomJS on server machines, although if you’re not running X on these machines you’ll need to run Xvfb to give PhantomJS a virtual X buffer to work with.

It’s worth mentioning at this point that I’ve never been a fan of browser simulators, that is fake DOM environments used during testing. They never behave quite like real browsers, and often involve a lot of elaborate environment set-up in other languages that makes your tests non-portable. I’ve found it far too easy to find bugs in them, for example HtmlUnit (which the Ruby tools Celerity and Akephalos are based on) will throw a NullPointerException when running our tests because of the cancelled form submission. Given how easy it is to use PhantomJS for unit testing and Selenium through Capybara for full-stack testing, and that these tools use real browsers, I don’t see a huge benefit to using simulators. I like to keep as much of my code as I can in simple, portable JavaScript that can easily be run in different environments to maintain flexibility.

In the final part of this series, we’ll cover how to strengthen your test coverage by automating your cross-browser testing process.

Refactoring towards testable JavaScript, part 1

This article is part 1 of a 3-part series.

As someone who does a lot of pure-JavaScript projects, I’ve settled into a pattern for organizing my code and its tests in a way I’m comfortable with. At Songkick, despite being obsessed with testing our Ruby code we’ve traditionally done a patchy job of testing our JavaScript. Some recent refactoring is giving us a chance to review our practices and I wanted to use this chance to see how easily we can test JavaScript within applications. I’m pleased to report that the tools available today make this an absolute breeze, and it’s quite easy to do a thorough job of putting together a sustainable testing strategy.

This is the first in a series of articles walking through a demo I presented internally at Songkick, showing various ways we can test our JavaScript and how we can refactor to keep these tests running quickly. I’ll be going through changes to a project and linking to Git commits as appropriate. We’ll cover a range of testing styles using tools written by me and others, all of which make JavaScript testing easy.

Let’s start off with version 0: we decide we want a new software product, and promptly decide to write a spec for it. We decide there should be a sign-up form, and there should be rules about what data is acceptable.

Feature: Signing up
  In order to show everyone what a badass I am
  As a developer
  I want to make my users sit through some JavaScript validation
  
  Background:
    Given I visit the sign-up form
  
  Scenario: Entering the wrong name
    When I enter an invalid name
    Then I should see "Your name is invalid"
  
  Scenario: Entering the wrong email address
    When I enter an invalid email address
    Then I should see "Your email is invalid"
  
  Scenario: Having an invalid argument
    When I use an invalid argument
    Then I should see "Your argument is invalid"
  
  Scenario: Entering valid data
    When I enter valid sign-up data
    Then I should see "You are a wizard, Harry!"

Great! Some detailed full-stack tests are a good starting point for making sure we build the right thing. Full of enthusiasm, we crack on and write some step definitions and an application that passes the tests. Here’s our little Sinatra application:

require 'sinatra'

get '/signup' do
  erb :signup
end

post '/accounts/create' do
  if params[:username] == 'Wizard'
    'Your argument is invalid'
  elsif params[:username] != 'Harry'
    'Your name is invalid'
  elsif params[:email] !~ /@/
    'Your email is invalid'
  else
    'You are a wizard, Harry!'
  end
end

And the view containing the sign-up form:

<form method="post" action="/accounts/create">
  <label for="username">Username</label>
  <input type="text" id="username" name="username">
  
  <label for="email">Email</label>
  <input type="text" id="email" name="email">
  
  <input type="submit" value="Sign up">
</form>

We run cucumber features/ and all is good:

$ cucumber features/
Feature: Signing up
  In order to show everyone what a badass I am
  As a developer
  I want to make my users sit through some JavaScript validation

  Background:                      # features/signup.feature:6
    Given I visit the sign-up form # features/step_definitions/app_steps.rb:1

  Scenario: Entering the wrong name          # features/signup.feature:9
    When I enter an invalid name             # features/step_definitions/app_steps.rb:5
    Then I should see "Your name is invalid" # features/step_definitions/app_steps.rb:27

  Scenario: Entering the wrong email address  # features/signup.feature:13
    When I enter an invalid email address     # features/step_definitions/app_steps.rb:10
    Then I should see "Your email is invalid" # features/step_definitions/app_steps.rb:27

  Scenario: Having an invalid argument           # features/signup.feature:17
    When I use an invalid argument               # features/step_definitions/app_steps.rb:16
    Then I should see "Your argument is invalid" # features/step_definitions/app_steps.rb:27

  Scenario: Entering valid data                  # features/signup.feature:21
    When I enter valid sign-up data              # features/step_definitions/app_steps.rb:21
    Then I should see "You are a wizard, Harry!" # features/step_definitions/app_steps.rb:27

4 scenarios (4 passed)
12 steps (12 passed)
0m0.582s

These tests are fast because we’re currently using Rack::Test, which talks directly to our Rack application in Ruby without needing to boot a server or go over the wire. That is specified by this in our features/support/env.rb:

Capybara.current_driver = :rack_test
Capybara.app = Sinatra::Application

So next we decide that we want to put the validation on the client side, rather than the server (not a good idea in general, but I needed an example everyone would be familiar with). We hollow out our application and move the validation into a script tag after the form:

post '/accounts/create' do
  'You are a wizard, Harry!'
end

<form method="post" action="/accounts/create">
  <label for="username">Username</label>
  <input type="text" id="username" name="username">
  
  <label for="email">Email</label>
  <input type="text" id="email" name="email">
  
  <div class="error"></div>
  <input type="submit" value="Sign up">
</form>

<script type="text/javascript">
  $('form').bind('submit', function() {
    if ($('#username').val() === 'Wizard') {
      $('.error').html('Your argument is invalid');
      return false;
    }
    else if ($('#username').val() !== 'Harry') {
      $('.error').html('Your name is invalid');
      return false;
    }
    else if (!/@/.test($('#email').val())) {
      $('.error').html('Your email is invalid');
      return false;
    }
  });
</script>

Now Rack::Test won’t run JavaScript, but not to worry – Capybara just lets us set Capybara.current_driver = :selenium and suddenly our tests are all executed in Firefox. But there’s one problem:

$ cucumber features/
# ...
4 scenarios (4 passed)
12 steps (12 passed)
0m9.891s

Our tests have gone from taking about 0.6 seconds to nearly 10 seconds: that’s roughly 17 times slower. Multiplied over a whole application test suite you’ll soon be wanting to throw all your tests away. We need to move more of this logic into unit tests if we want a sustainable testing strategy.

The first step is to get that JavaScript out of the view and into an external file that we can share between pages, and then just instantiate a copy of our new class where we need it.

function FormValidator(form) {
  var username = form.find('#username'),
      email    = form.find('#email'),
      error    = form.find('.error');
  
  form.bind('submit', function() {
    if (username.val() === 'Wizard') {
      error.html('Your argument is invalid');
      return false;
    }
    else if (username.val() !== 'Harry') {
      error.html('Your name is invalid');
      return false;
    }
    else if (!/@/.test(email.val())) {
      error.html('Your email is invalid');
      return false;
    }
  });
}

<form method="post" action="/accounts/create">
  <!-- ... -->
</form>

<script type="text/javascript">
  new FormValidator($('form'));
</script>

We can then test this class in isolation by creating a test page using the JS.Test framework (full spec page source on GitHub). This spec replicates what our Cucumber tests do, except that instead of loading the whole sign-up page every time, it just adds a form to the page, attaches the validator to it, then runs one of the validation examples.

FORM_HTML = '\
    <form method="post" action="/accounts/create">\
      <label for="username">Username</label>\
      <input type="text" id="username" name="username">\
      \
      <label for="email">Email</label>\
      <input type="text" id="email" name="email">\
      \
      <div class="error"></div>\
      <input type="submit" value="Sign up">\
    </form>'

JS.require('JS.Test', function() {
  
  JS.Test.describe("FormValidator", function() { with(this) {
    before(function() {
      $("#fixture").html(FORM_HTML)
      new FormValidator($("form"))
      
      this.submit = $("form input[type=submit]")
      this.error  = $("form .error")
    })
    
    describe("with an invalid name", function() { with(this) {
      before(function() { with(this) {
        $("#username").val("Hagrid")
        submit.click()
      }})
      
      it("displays an error message", function() { with(this) {
        assertEqual( "Your name is invalid", error.html() )
      }})
    }})
    
    // ...
  }})
  
  JS.Test.autorun()
})

We open our test page spec/browser.html up in a web browser and JS.Test confirms that all our JavaScript logic works.

This is a great place to stop for now: we’ve turned what were some full-stack tests that required booting our entire application into some unit tests that we can run quickly. In the next article we’ll get into how we can refactor this further to decouple our business logic from the DOM and get tests we can run from the command line.
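
As a rough preview of what decoupling the rules from the DOM could look like, here is a minimal sketch. The function name validateSignup and its params argument are my own assumptions for illustration, not code from the follow-up article; the rules and messages are copied from FormValidator above.

```javascript
// Hypothetical pure function: the same rules as FormValidator, with no DOM access.
// Returns an error message string, or null when the input is acceptable.
function validateSignup(params) {
  var username = params.username || '',
      email    = params.email    || '';

  if (username === 'Wizard') return 'Your argument is invalid';
  if (username !== 'Harry')  return 'Your name is invalid';
  if (!/@/.test(email))      return 'Your email is invalid';
  return null;
}

// Export for Node when available, so the same file can still load in a browser.
if (typeof module !== 'undefined' && module.exports) {
  module.exports = validateSignup;
}
```

Because this takes plain values and returns a plain value, it can be exercised without a browser, and a thin DOM wrapper would only need to read the fields, call it, and display whatever message comes back.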

Terminus driving multiple browsers

Conferences usually prompt me to hack on some loose thread I’ve not picked up in months. At this year’s Scottish Ruby Conference I decided I had to give a lightning talk on Terminus, my Capybara driver for scripting remote browsers. With a little hacking and a lot of sitting around waiting for tests to complete, I’ve got it up to date with a lot of the latest Capybara specs and added a really simple API for switching between browsers based on name, OS, version etc.

I’m not putting out a release just yet but I thought I’d share a couple videos of it in action; one’s a hi-res screen capture and one’s a fuzzy shaky mobile capture so you can see it running across multiple machines.

The app it’s running is the Faye example application – a chat app much like Twitter. Terminus logs in as a different user on each browser, then sends messages between all the pairs of browsers and checks that each message arrives on screen in the intended browser.
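
To give a feel for how quickly the number of message checks grows, here is a small sketch of the pairing logic only. It is not the Terminus script (that is Ruby and Capybara); the helper name browserPairs is hypothetical, and I'm assuming each pair is tested in both directions.

```javascript
// Hypothetical helper: list every ordered [sender, receiver] pair of distinct browsers.
function browserPairs(names) {
  var pairs = [];
  names.forEach(function(sender) {
    names.forEach(function(receiver) {
      // A browser does not send a message to itself.
      if (sender !== receiver) pairs.push([sender, receiver]);
    });
  });
  return pairs;
}

var pairs = browserPairs(['Firefox', 'Chrome', 'Safari']);
console.log(pairs.length); // 6: each of the 3 browsers sends to the other 2
```

With n browsers connected that is n × (n − 1) messages to send and verify, which goes some way to explaining the waiting around.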

Sadly, it looks like Internet Explorer support is a pipe dream for now. The XPath queries Capybara spits out are too much for my half-baked XPath engine to deal with, so unless someone comes up with a fast implementation of document.evaluate() for IE I’m leaving it alone.

Anyway, the videos:

The script that runs this is really simple: it’s just some Capybara calls with a little extra to tell Terminus to switch browsers. Check it out on GitHub.

As for what I’ve added to Terminus, here’s a quick run-down:

  • Headers and status code support
  • Multiple windows and iframes
  • Improved concurrency handling for running the same test in multiple browsers
  • Browser selection API
  • Removed the need for you to embed a script in your application
  • Very basic support for scripting remote applications

The addition of status and header support means it actually supports a superset of the behaviour supported by the Selenium driver, albeit considerably more slowly. The main win is being able to script remote devices – it’s great fun watching it control somebody’s iPad! I would love to see if people come up with novel uses for it.