Testing command-line apps with Cucumber

I recently wrote a tiny little tool called Claw to help me work on large codebases in gEdit. It provides a terminal that lets you search for files by name and content using very minimal syntax, and it numbers the search results so you can just type a number to open the file using any shell command. In the last few days I converted its tests from Test::Unit to Cucumber and added a lot more coverage, and found Cucumber to be very nice for testing command-line UIs.

Test::Unit and RSpec work fine when you’re just calling methods and testing their return values, stuff like assert [:foo, :bar].include?(:foo) or "ruby".upcase.should == "RUBY". The thing about command-line apps is that they mostly output their results as side effects, such as writing to standard out, modifying files, making network calls etc. Testing these things often involves stubs and mock objects and these can make your tests harder to follow, but Cucumber lets you hide all that behind easy-to-read prose.
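To make the side-effect problem concrete, here is a minimal sketch of the trick the rest of this post leans on: hand the object its output device instead of letting it print directly. The Greeter class and its greet method are hypothetical names invented for this illustration; only StringIO and puts come from Ruby itself.

```ruby
require 'stringio'

# Hypothetical example class: it prints to whatever object it is given
# rather than calling Kernel#puts directly.
class Greeter
  def initialize(io)
    @io = io
  end

  def greet(name)
    @io.puts "Hello, #{name}"
  end
end

# In a test we pass a StringIO and read back what was printed.
out = StringIO.new
Greeter.new(out).greet('world')
out.string # "Hello, world\n"

# For real use we would pass $stdout (or Kernel) instead.
Greeter.new($stdout).greet('world')
```

Nothing gets mocked here: the test simply swaps in an object it can read from.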

To illustrate, let’s write a little app ourselves using TDD. Let’s make a few directories:

mkdir tdd-terminal
cd tdd-terminal
mkdir lib
mkdir -p features/step_definitions
mkdir -p features/support

Let’s write a spec for our program. This is a stripped down version of the Claw spec that contains enough details to illustrate how to test command-line apps: we’ve got some visible output to verify, and a side effect of running the program: it should open files for us when we issue certain commands.

# features/search_for_files.feature

Feature: Search for files
  
  Background:
    Given I start the app with "-c gedit"
  
  Scenario: Find files by name
    Given I enter "foo"
    Then I should see
    """
    1. foo.txt
    """
  
  Scenario: Open a matching file
    Given I enter "foo"
    And I enter "1"
    Then "foo.txt" should be open in "gedit"

Let’s run the feature using Cucumber:

Feature: Search for files

  Background:                             # features/search_for_files.feature:3
    Given I start the app with "-c gedit" # features/search_for_files.feature:4

  Scenario: Find files by name            # features/search_for_files.feature:6
    Given I enter "foo"                   # features/search_for_files.feature:7
    Then I should see                     # features/search_for_files.feature:8
      """
      1. foo.txt
      """

  Scenario: Open a matching file             # features/search_for_files.feature:13
    Given I enter "foo"                      # features/search_for_files.feature:14
    And I enter "1"                          # features/search_for_files.feature:15
    Then "foo.txt" should be open in "gedit" # features/search_for_files.feature:16

2 scenarios (2 undefined)
7 steps (7 undefined)
0m0.041s

You can implement step definitions for undefined steps with these snippets:

Given /^I start the app with "([^\"]*)"$/ do |arg1|
  pending
end

Given /^I enter "([^\"]*)"$/ do |arg1|
  pending
end

Then /^I should see$/ do |string|
  pending
end

Then /^"([^\"]*)" should be open in "([^\"]*)"$/ do |arg1, arg2|
  pending
end

Take the generated step definitions and copy them into features/step_definitions/terminal_steps.rb. Re-run the tests and you should see the steps marked as pending or skipped rather than undefined. Let’s implement the first step, the one that starts our app.

Given /^I start the app with "([^\"]*)"$/ do |command|
  @io  = StringIO.new
  @app = Terminal.new(command.split(/\s+/), @io)
end

Here we take the command-line argument string and split it up to mimic what we’d expect to receive in ARGV when running the app for real. We also make an IO object and give it to the app: instead of calling Kernel#puts our app will call puts on whatever object we pass in. In production this would be Kernel but for testing we want an object that we can read from so we can check what the app prints. Run the tests again:

Feature: Search for files

  Background:                             # features/search_for_files.feature:3
    Given I start the app with "-c gedit" # features/step_definitions/terminal_steps.rb:1
      uninitialized constant Terminal (NameError)
      ./features/step_definitions/terminal_steps.rb:3:in `/^I start the app with "([^\"]*)"$/'
      features/search_for_files.feature:4:in `Given I start the app with "-c gedit"'

  Scenario: Find files by name            # features/search_for_files.feature:6
    Given I enter "foo"                   # features/step_definitions/terminal_steps.rb:6
    Then I should see                     # features/step_definitions/terminal_steps.rb:10
      """
      1. foo.txt
      """

  Scenario: Open a matching file             # features/search_for_files.feature:13
    Given I enter "foo"                      # features/step_definitions/terminal_steps.rb:6
    And I enter "1"                          # features/step_definitions/terminal_steps.rb:6
    Then "foo.txt" should be open in "gedit" # features/step_definitions/terminal_steps.rb:14

Failing Scenarios:
cucumber features/search_for_files.feature:6 # Scenario: Find files by name

2 scenarios (1 failed, 1 skipped)
7 steps (1 failed, 6 skipped)

We need to go and implement our Terminal class to get the first step working. First, enter the following in features/support/env.rb:

# features/support/env.rb

require File.dirname(__FILE__) + '/../../lib/terminal'
require 'spec/mocks'

This tells Cucumber to load our application and RSpec’s mocking library before running the tests. We need to give it something to load, so let’s put the following in lib/terminal.rb:

# lib/terminal.rb

require 'oyster'

class Terminal
  BIN_SPEC = Oyster.spec do
    string :command
  end
  
  def initialize(argv, io)
    @options = BIN_SPEC.parse(argv)
    @stdout  = io
  end
end

The constructor takes an array of command-line input (should be ARGV in production) and an output device to puts to. I’m using my Oyster gem to parse the input into an options hash, and storing a reference to the output device for later use. This should be enough code to turn our first step green:

Feature: Search for files

  Background:                             # features/search_for_files.feature:3
    Given I start the app with "-c gedit" # features/step_definitions/terminal_steps.rb:1

  Scenario: Find files by name            # features/search_for_files.feature:6
    Given I enter "foo"                   # features/step_definitions/terminal_steps.rb:6
      TODO (Cucumber::Pending)
      ./features/step_definitions/terminal_steps.rb:7:in `/^I enter "([^\"]*)"$/'
      features/search_for_files.feature:7:in `Given I enter "foo"'
    Then I should see                     # features/step_definitions/terminal_steps.rb:10
      """
      1. foo.txt
      """

  Scenario: Open a matching file             # features/search_for_files.feature:13
    Given I enter "foo"                      # features/step_definitions/terminal_steps.rb:6
      TODO (Cucumber::Pending)
      ./features/step_definitions/terminal_steps.rb:7:in `/^I enter "([^\"]*)"$/'
      features/search_for_files.feature:14:in `Given I enter "foo"'
    And I enter "1"                          # features/step_definitions/terminal_steps.rb:6
    Then "foo.txt" should be open in "gedit" # features/step_definitions/terminal_steps.rb:14

2 scenarios (2 pending)
7 steps (3 skipped, 2 pending, 2 passed)
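If you’d rather not depend on Oyster, roughly the same constructor can be sketched with OptionParser from the standard library. This is a swapped-in technique, not what Claw does: the class name OptparseTerminal, the options reader and the exact -c/--command flags are assumptions chosen to mirror the "-c gedit" input used in the feature.

```ruby
require 'optparse'
require 'stringio'

# Hypothetical stand-in for the Terminal class, using OptionParser
# instead of Oyster to turn argv into an options hash.
class OptparseTerminal
  attr_reader :options

  def initialize(argv, io)
    @options = {}
    parser = OptionParser.new do |opts|
      opts.on('-c', '--command COMMAND', 'Program used to open files') do |value|
        @options[:command] = value
      end
    end
    parser.parse(argv) # parse (no bang) leaves the argv array untouched
    @stdout = io
  end
end

app = OptparseTerminal.new('-c gedit'.split(/\s+/), StringIO.new)
app.options[:command] # "gedit"
```

The step definition doesn’t need to change: it still passes an array of strings and an IO object.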

Let’s implement the next step, “Given I enter”. This step just needs to send the command to the app, and the app just needs to accept it for the step to pass.

# features/step_definitions/terminal_steps.rb
Given /^I enter "([^\"]*)"$/ do |command|
  @app.interpret(command)
end

# lib/terminal.rb
  def interpret(command)
  end

Running the tests now gives:

Feature: Search for files

  Background:                             # features/search_for_files.feature:3
    Given I start the app with "-c gedit" # features/step_definitions/terminal_steps.rb:1

  Scenario: Find files by name            # features/search_for_files.feature:6
    Given I enter "foo"                   # features/step_definitions/terminal_steps.rb:6
    Then I should see                     # features/step_definitions/terminal_steps.rb:10
      """
      1. foo.txt
      """
      TODO (Cucumber::Pending)
      ./features/step_definitions/terminal_steps.rb:11:in `/^I should see$/'
      features/search_for_files.feature:8:in `Then I should see'

  Scenario: Open a matching file             # features/search_for_files.feature:13
    Given I enter "foo"                      # features/step_definitions/terminal_steps.rb:6
    And I enter "1"                          # features/step_definitions/terminal_steps.rb:6
    Then "foo.txt" should be open in "gedit" # features/step_definitions/terminal_steps.rb:14
      TODO (Cucumber::Pending)
      ./features/step_definitions/terminal_steps.rb:15:in `/^"([^\"]*)" should be open in "([^\"]*)"$/'
      features/search_for_files.feature:16:in `Then "foo.txt" should be open in "gedit"'

2 scenarios (2 pending)
7 steps (2 pending, 5 passed)

Now I’m going to tackle the remaining pending steps in one go since their implementation overlaps somewhat. First we need to implement the “I should see” step, which will just read from our @io object to find out what the app has printed:

# features/step_definitions/terminal_steps.rb

Then /^I should see$/ do |string|
  @io.rewind
  @io.read.gsub(/\n*$/, "").should == string
end
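The rewind in that step matters: after the app writes to the StringIO, the read position sits at the end of the buffer, so a plain read returns an empty string. Here’s a small sketch of that behaviour in isolation; the helper name printed_output is made up for this example.

```ruby
require 'stringio'

# Hypothetical helper: returns everything written to io so far,
# with trailing newlines removed (the same logic as the step above).
def printed_output(io)
  io.rewind
  io.read.gsub(/\n*$/, '')
end

io = StringIO.new
io.puts '1. foo.txt'

io.read            # "" because the position is just past what was written
printed_output(io) # "1. foo.txt"
io.string          # "1. foo.txt\n", the whole buffer regardless of position
```

StringIO#string is an alternative if you never want to think about the read position, though it only works for StringIO and not for other IO objects.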

To implement the step that checks that a file was opened, we want to know that the app shelled out to the relevant program to tell it to open the file. To support this, I’m going to stub out the backtick method (Kernel#`) on the app and collect any commands we’d normally send to the shell. In our first step:

# features/step_definitions/terminal_steps.rb

Given /^I start the app with "([^\"]*)"$/ do |command|
  @io  = StringIO.new
  @app = Terminal.new(command.split(/\s+/), @io)
  
  @commands = []
  @app.stub('`') { |cmd| @commands << cmd }
end

Then to implement the “should be open” step we can inspect this array to see if the right command was called:

# features/step_definitions/terminal_steps.rb

Then /^"([^\"]*)" should be open in "([^\"]*)"$/ do |path, program|
  @commands.should include("#{program} #{path}")
end
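The stub works because a backtick literal is really a method call on self, so it can be replaced on a single object. As a rough sketch of what’s going on, here is the same idea without RSpec’s mocking library; Opener, open_file and capture_shell_commands are hypothetical names for this illustration only.

```ruby
# Hypothetical class that shells out using a backtick literal.
class Opener
  def initialize(program)
    @program = program
  end

  def open_file(path)
    `#{@program} #{path}`
  end
end

# Replaces the backtick method on one object so that commands are
# recorded in an array instead of being run by the shell.
def capture_shell_commands(object)
  commands = []
  object.define_singleton_method('`') do |cmd|
    commands << cmd
    '' # mimic the String a real backtick call returns
  end
  commands
end

opener   = Opener.new('gedit')
commands = capture_shell_commands(opener)
opener.open_file('foo.txt')
commands # ["gedit foo.txt"], and gedit was never launched
```

Only that one object is affected; backticks everywhere else still run real commands.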

One more test run:

Feature: Search for files

  Background:                             # features/search_for_files.feature:3
    Given I start the app with "-c gedit" # features/step_definitions/terminal_steps.rb:1

  Scenario: Find files by name            # features/search_for_files.feature:6
    Given I enter "foo"                   # features/step_definitions/terminal_steps.rb:9
    Then I should see                     # features/step_definitions/terminal_steps.rb:13
      """
      1. foo.txt
      """
      expected: "1. foo.txt",
           got: "" (using ==)
      
       Diff:
      @@ -1,2 +1 @@
      -1. foo.txt
       (Spec::Expectations::ExpectationNotMetError)
      ./features/step_definitions/terminal_steps.rb:15:in `/^I should see$/'
      features/search_for_files.feature:8:in `Then I should see'

  Scenario: Open a matching file             # features/search_for_files.feature:13
    Given I enter "foo"                      # features/step_definitions/terminal_steps.rb:9
    And I enter "1"                          # features/step_definitions/terminal_steps.rb:9
    Then "foo.txt" should be open in "gedit" # features/step_definitions/terminal_steps.rb:18
      expected [] to include "gedit foo.txt" (Spec::Expectations::ExpectationNotMetError)
      ./features/step_definitions/terminal_steps.rb:19:in `/^"([^\"]*)" should be open in "([^\"]*)"$/'
      features/search_for_files.feature:16:in `Then "foo.txt" should be open in "gedit"'

Failing Scenarios:
cucumber features/search_for_files.feature:6 # Scenario: Find files by name
cucumber features/search_for_files.feature:13 # Scenario: Open a matching file

2 scenarios (2 failed)
7 steps (2 failed, 5 passed)

Finally, we implement code in the app to make these steps pass. I’m not going to actually write the search logic, just the logic to print output and open files so that our tests pass.

# lib/terminal.rb

require 'oyster'

class Terminal
  BIN_SPEC = Oyster.spec do
    string :command
  end
  
  def initialize(argv, io)
    @options = BIN_SPEC.parse(argv)
    @stdout  = io
    @results = []
  end
  
  def interpret(command)
    case command
    when /^\d+$/ then open_result(command.to_i - 1)
    else
      @results = command.split(/\s+/).map { |f| "#{f}.txt" }
      print_results
    end
  end
  
  def open_result(index)
    `#{ @options[:command] } #{ @results[index] }`
  end
  
  def print_results
    @results.each_with_index do |result, i|
      @stdout.puts "#{ i+1 }. #{ result }"
    end
  end
end

We should get a nice green list of cukes now:

Feature: Search for files

  Background:                             # features/search_for_files.feature:3
    Given I start the app with "-c gedit" # features/step_definitions/terminal_steps.rb:1

  Scenario: Find files by name            # features/search_for_files.feature:6
    Given I enter "foo"                   # features/step_definitions/terminal_steps.rb:9
    Then I should see                     # features/step_definitions/terminal_steps.rb:13
      """
      1. foo.txt
      """

  Scenario: Open a matching file             # features/search_for_files.feature:13
    Given I enter "foo"                      # features/step_definitions/terminal_steps.rb:9
    And I enter "1"                          # features/step_definitions/terminal_steps.rb:9
    Then "foo.txt" should be open in "gedit" # features/step_definitions/terminal_steps.rb:18

2 scenarios (2 passed)
7 steps (7 passed)

Now that you have a well-tested frontend for your app, you can flesh it out with business logic, adding more tests using “Given I enter”, “I should see” as you go. Finally, you can easily add an executable for your project that supplies real input/output objects and runs the application. In bin/terminal:

#!/usr/bin/env ruby

require 'rubygems'
require 'readline'
require File.dirname(__FILE__) + '/../lib/terminal'

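app = Terminal.new(ARGV, Kernel)

# Readline.readline returns nil at end of input (Ctrl-D), so stop there
# rather than passing nil to the app.
while (line = Readline.readline('> '))
  app.interpret(line)
end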

Just chmod +x bin/terminal and run it, and you should see it running just like your tests say it should.

Helium: a package server for JavaScript

Last week, my former employer theOTHERmedia open-sourced the last project I worked on there: Helium. It’s a web application that lets you deploy JavaScript packages from Git and load them on-demand into any website by including a single script tag. There’s been a lot of innovation in JavaScript deployment recently, and Helium fits a particular set of needs that I think most web agencies will be all too used to struggling with.

First, some background about the problems we were trying to solve. theOTHERmedia is a design and development agency with dozens of clients and a lot of code to keep under control. Their JavaScript stack is based around YUI and Ojay, as well as other common tools like Google Maps, Analytics etc. As part of client work, we inevitably ended up producing a lot of reusable components that didn’t really fit into the Ojay core but nevertheless needed to be shared between client projects.

Because they had no obvious place to live, these little libraries ended up being copy-pasted from project to project, producing a maintenance headache. If I fixed a bug in one copy, I’d have to track down all the other copies to patch them. Also, some clients required some small customisations: too small to merit generalising the code to accommodate them, but large enough that managed branching, merging and cherry-picking between branches would really help with maintenance.

The main day-to-day problems I’d run into were that these projects became hard to maintain, hard to track down, and that it wasn’t clear how to set them up. Developers would copy a component into their project and come to me when it didn’t work: inevitably some dependency or config setting was missing. I wanted to make it as transparent as possible to load any JavaScript object you liked into a site without running into these sorts of problems, and I wanted each library to be maintained in one repository and farmed out to client sites as required. I should be able to apply a bug fix and see it show up in all our clients’ sites immediately.

The solution to the maintenance problem was obvious: these libraries should have been under their own version control, and the need to fork libraries for certain clients and keep the forks up-to-date with bug fixes made Git the perfect choice for this. Git makes branching and merging easy, and lets you cherry-pick changes between branches if you don’t want to do a full merge. Problem one solved.

The second problem is deployment: we need to get the libraries out of Git, build them and distribute them to client sites. Clients need to specify which version (i.e. which branch or tag) of each library they want to use. Dependencies should also be handled automatically: I’m a big fan of package managers like aptitude on Ubuntu or RubyGems in the Ruby community as they make it easy to get software running without worrying about its internal dependencies.

This is where a number of other tools come in. Helium is really a wrapper around a set of smaller components that were all designed to be flexible enough that they can be trivially integrated. The main “front end” of Helium is powered by JS.Packages, a pure-JavaScript dependency manager that ships with JS.Class. It works at a very high level, letting you load JavaScript objects by name rather than by URL, figuring out dependencies for you and downloading files in parallel where possible. For example, it lets me write this:

require('YAHOO.util.Selector', 'GMap2', function() {
    var links = YAHOO.util.Selector.query('a'),
        map = new GMap2(container);
    
    // ...
});

This is very powerful, but it requires some configuration to tell JS.Packages where to find objects, what other objects they depend on and so on. To make the above work, you need the following configuration:

JS.Packages(function() { with(this) {
    file('http://yui.yahooapis.com/2.8.0/build/yahoo-dom-event/yahoo-dom-event.js')
        .provides('YAHOO',
                  'YAHOO.util.Dom',
                  'YAHOO.util.Event');
    
    file('http://yui.yahooapis.com/2.8.0/build/selector/selector-min.js')
        .provides('YAHOO.util.Selector')
        .requires('YAHOO.util.Dom');
    
    loader(function(cb) {
        var url = 'http://www.google.com/jsapi?key=' + Helium.GOOGLE_API_KEY;
        load(url, cb);
    })  .provides('google.load');
    
    loader(function(cb) { google.load('maps', '2.x', {callback: cb}) })
        .provides('GMap2', 'GClientGeocoder',
                  'GEvent', 'GLatLng', 'GMarker')
        .requires('google.load');
}});

Aside from the fact that writing this is fairly tedious, I didn’t want these config files to live in the client projects where they’d need just as much maintenance as the original libraries, especially since different branches of a library often have different dependencies.

Now most of this code tends to be of the file/provides/requires variety rather than the custom loader function variety, and this is code that’s very easy to generate. The final piece of the puzzle, Jake, is what ties this all together.

Jake is a build tool that mostly handles code concatenation and minification for larger JavaScript projects. The neat thing about it is that it lets you embed metadata (such as dependency information) and has an event system to let you know when build files are generated and where they live on disk. This means you can use it to build a manifest of all the files a project build generates, which JavaScript objects each file contains and which objects it depends on, just by placing some metadata in each project’s build configuration. We’ve now moved this data out of the client projects and into a central location: each library manages its own dependency data, and Helium uses Jake to extract this and build a manifest of all the libraries on your server and what their dependencies are.
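To show why the file/provides/requires style of configuration is easy to generate, here is a rough Ruby sketch that turns a list of built files into JS.Packages calls. It is not Helium’s or Jake’s actual code: the MANIFEST structure and the packages_config and js_args helpers are assumptions invented for this example, and only the file/provides/requires calls in the output come from the configuration shown earlier.

```ruby
require 'json'

# Hypothetical manifest: one entry per built file, listing the objects
# it provides and the objects it requires.
MANIFEST = [
  { :path     => 'http://yui.yahooapis.com/2.8.0/build/yahoo-dom-event/yahoo-dom-event.js',
    :provides => ['YAHOO', 'YAHOO.util.Dom', 'YAHOO.util.Event'],
    :requires => [] },
  { :path     => 'http://yui.yahooapis.com/2.8.0/build/selector/selector-min.js',
    :provides => ['YAHOO.util.Selector'],
    :requires => ['YAHOO.util.Dom'] }
]

# Formats a list of names as quoted JavaScript arguments.
def js_args(names)
  names.map { |name| name.to_json }.join(', ')
end

# Builds the JS.Packages configuration as a string of JavaScript.
def packages_config(manifest)
  entries = manifest.map do |pkg|
    call = "    file(#{pkg[:path].to_json})"
    call += "\n        .provides(#{js_args(pkg[:provides])})" unless pkg[:provides].empty?
    call += "\n        .requires(#{js_args(pkg[:requires])})" unless pkg[:requires].empty?
    call + ';'
  end
  "JS.Packages(function() { with(this) {\n#{entries.join("\n\n")}\n}});"
end

puts packages_config(MANIFEST)
```

Custom loader functions, like the Google Maps one above, can’t be generated this way and still have to be written by hand.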

Putting it all together, the whole process pulls projects out of Git, uses Jake to build every branch of each project and extract its dependency data, then uses this to generate a JS.Packages manifest: it’s this file that client sites include in their head section:

<script type="text/javascript"
        src="http://helium.yourcompany.com/js/helium.js"></script>

You then need to tell Helium which branch/tag of each library to use, and thereafter you can use require() to load any object you like. On the client side, Helium essentially acts as a versioning layer on top of JS.Packages.

<script type="text/javascript">
Helium.use('yui', '2.7.0');
Helium.use('ojay', '0.4.1');
</script>
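Once the versions are chosen, something has to work out which files to load and in what order. JS.Packages does this in the browser; purely as an illustration of the underlying idea, here is a Ruby sketch that resolves a load order from provides/requires data using TSort from the standard library. The FILES data, the widgets.js entry and the DependencyGraph class are hypothetical, and this makes no claim about how JS.Packages is implemented.

```ruby
require 'tsort'

# Hypothetical data: each file lists the objects it provides and requires.
FILES = {
  'widgets.js'         => { :provides => ['Widgets'],
                            :requires => ['YAHOO.util.Selector', 'YAHOO.util.Event'] },
  'selector-min.js'    => { :provides => ['YAHOO.util.Selector'],
                            :requires => ['YAHOO.util.Dom'] },
  'yahoo-dom-event.js' => { :provides => ['YAHOO', 'YAHOO.util.Dom', 'YAHOO.util.Event'],
                            :requires => [] }
}

# Treats files as graph nodes and "requires an object provided by" as edges.
class DependencyGraph
  include TSort

  def initialize(files)
    @files     = files
    @providers = {}
    files.each do |path, meta|
      meta[:provides].each { |name| @providers[name] = path }
    end
  end

  # TSort calls this to enumerate every node in the graph.
  def tsort_each_node(&block)
    @files.each_key(&block)
  end

  # TSort calls this to enumerate the files a given file depends on.
  def tsort_each_child(path, &block)
    @files[path][:requires].map { |name| @providers.fetch(name) }.uniq.each(&block)
  end
end

# tsort lists dependencies before the files that need them.
puts DependencyGraph.new(FILES).tsort
```

A circular dependency makes tsort raise TSort::Cyclic, which is a reasonable point at which to stop and complain.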

The intent is for Helium users to host their own installations of it, since providing a single package manifest for all the JavaScript libraries on the web would produce a huge file and negate the benefit of on-demand loading. If you want to check it out, the best place to start is the documentation on GitHub, which covers a ton of stuff I’ve not even mentioned here. In particular it explains how to make your own JavaScript projects Helium-deployable, a process I’ve tried hard to make as easy as possible. I’d love to see this adopted as a package distribution system for JavaScript, so if you have any feedback on how it can be improved, get over to GitHub and let us know!