Do I need DI?

This article is based on a talk of the same name, on the topic of dependency injection, which I gave at Eurucamp 2012. I thought I would write it up since it’s of some relevance to the current discussion about TDD, and I frequently find myself wanting to reiterate that talk’s argument. It is only related to current events insofar as I find them emblematic of most conversations about TDD; since the talk is nearly two years old it does not directly address the current debate but tries to illustrate a style of argument I wish were more common.

DHH’s keynote at RailsConf 2014 and his ensuing declaration that TDD is dead have got the Ruby community talking about testing again. My frustration with the majority of talks and writing on testing, TDD, agile development, software architecture and object-oriented design is that a great deal of it consists of generalised arguments, with very little in the way of concrete case studies. Despite being a perfectly capable programmer who has written automated tests for many different kinds of software, I’ve sat through any number of talks on these topics that went right over my head. The impression I’m left with is that talks on TDD often only make sense if the audience already understands and agrees with the speaker’s argument. They are often littered with appeals to authority figures and books that only land if the listener knows of the cited authority or has read the particular book. At worst, they are presented as ‘movements’, wherein all the old ways of thinking must be discarded in favour of the new one true way.

Now, grand general assertions make for entertaining and engaging talks, but they are unhelpful. Real-world software development is an exercise in design, and to design things based on general rules without reference to the context of the problem at hand is to not practice design properly. We need to learn to make decisions based on the properties of the problem we’re dealing with, not by appealing to advice handed down without any context.

This is why JavaScript Testing Recipes almost entirely concerns itself with the application of a few simple techniques to a wide array of real programs, including discussion of cases where those techniques are not applicable or are too costly due to the requirements of a particular problem. I wish more talks were like this, and it’s what I was hoping to do when I wrote this talk originally.

So, this article is not really about dependency injection per se, but about how you might decide when to use it. DI, much like TDD or any other programming support tool, is not de facto good or bad; it has pros and cons that make it suit some situations and not others. It is not even good or bad in the context of any defined problem; it is a design tool, and its usefulness varies according to the mental model of the person using it.

When we ask whether an object-oriented program is well-designed, we often use proxies for this question, like asking whether the code obeys the ‘SOLID principles’:

  • Single responsibility principle
  • Open/closed principle
  • Liskov substitution principle
  • Interface segregation principle
  • Dependency inversion principle

We might look at each class in the system, decide whether or not it obeys each of these principles, and if most of the classes in the system are SOLID then we declare the system to be well-designed. However, this approach encourages us to factor our code in a certain way without regard for what that code means, what it models, how that model is realised in code, and who it is intended to be used by. We should not treat these principles as normative rules, but as observations: programs that are easy to maintain tend to exhibit these properties more often than not, but they are not a barometer against which to accept or reject a design.

Across the fence from the people advocating for SOLID, and TDD, and various prescriptive architectural styles, are the people who want to tell you off for following these rules, assessing your designs by whether you’re practicing ‘true’ OO, or becoming an architecture astronaut, reinventing Java’s morass of AbstractBeanListenerAdapterFactory classes.

I don’t find the rhetoric of either of these camps helpful: both offer decontextualised normative advice based on statistical observations of systems they happen to have worked on or heard of. They don’t know the system you’re working on.

Your job is not to keep Alan Kay happy. Your job is to keep shipping useful product at a sustainable pace, and it’s on you to choose practices that help you do that.

So let’s talk about DI, and when to use it. To begin with, what is dependency injection? Take a look at the following code for a putative GitHub API client:

require 'json'
require 'net/http'
require 'uri'

class GitHub::Client
  def get_user(name)
    u = URI.parse("https://api.github.com/users/#{name}")
    http = Net::HTTP.new(u.host, u.port)
    http.use_ssl = true
    response = http.request_get(u.path)
    if response.code == '200'
      data = JSON.parse(response.body)
      GitHub::User.new(data)
    else
      raise GitHub::NotFound
    end
  end
end

In order to get the account data for a GitHub user, this class needs to make an HTTP request to api.github.com. It does this by creating an instance of Net::HTTP, using that to make a request, checking the response is successful, parsing the JSON data that comes back, and constructing a user object with that data.

An advocate for TDD and DI might quibble with the design of this class: it’s too hard to test because there’s no way, via the class’s interface, to intercept its HTTP interactions with an external server – something we don’t want to depend on during our tests. A common response to this in Ruby is that you can stub out Net::HTTP.new to return a mock object, but this is often problematic: stubbing out singleton methods does not just affect the component you’re testing, it affects any other component of the system or the test framework that relies on this functionality. In fact, DHH’s pet example of bad DI uses the Time class, which is particularly problematic to stub out since testing frameworks often need to do time computations to enforce timeouts and record how long your tests took to run. Globally stubbing out Date is so problematic in JavaScript that JSTR dedicates a six-page section to dealing with dates and times.
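
To make the problem concrete, this is the kind of global stub in question – a sketch in RSpec syntax:

# This replaces Net::HTTP.new for every caller in the process: the
# component under test, any other component, and the test framework
# itself if it happens to use Net::HTTP.
fake_http = double('Net::HTTP')
allow(Net::HTTP).to receive(:new).and_return(fake_http)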

Dependency injection is one technique for solving this problem: instead of the class reaching out into the global namespace and calling a singleton method to fabricate an HTTP client itself, the class is given an HTTP client as an argument to its constructor. That client offers a high-level API that makes the contract between the class and its dependency clearer. Say we want the GitHub::Client class to only deal with knowing which paths to find data at, and how to map the response data to domain objects. We could push all the concerns of parsing URIs, making requests, checking response codes and parsing the body – that is, all the concerns not specifically related to GitHub – down into a high-level HTTP client, and pass that into the GitHub::Client constructor.

class GitHub::Client
  def initialize(http_client)
    @http = http_client
  end

  def get_user(name)
    data = @http.get("/users/#{name}").parsed_body
    GitHub::User.new(data)
  end
end

This makes the class easier to test: rather than globally stubbing out Net::HTTP.new and mocking its messy API, we can pass in a fake client during testing and make simple mock expectations about what should be called.
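
For example, a test for get_user might now look like this – a minimal sketch using RSpec doubles, where the get/parsed_body contract is the one assumed by the refactored class above:

# Only the injected collaborator is faked; nothing global is touched.
http     = double('http client')
response = double('response', :parsed_body => {'login' => 'jcoglan'})
client   = GitHub::Client.new(http)

expect(http).to receive(:get).with('/users/jcoglan').and_return(response)
client.get_user('jcoglan')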

So dependency injection means passing in any collaborators an object needs as arguments to its constructor or methods. At its most basic, this means removing any calls to constructors and other singleton methods and passing in the required objects instead. But this is often accompanied by refactoring the objects involved to make their interactions simpler and clearer.

So that’s what DI is, but what is it for? Many advocates will tell you that it exists to make testing easier, and indeed it does often accomplish that. But testability is not usually an end-user requirement for a component; being testable is often a proxy for being usable. A class is easy to test if it’s easy to set up and use, without requiring a complex environment of dependencies in order for the class to work properly. Applying dependency injection solely to achieve better tests is a bad move: design is all about dealing with constraints, and we must take all the applicable constraints into consideration when designing a class’s API. The patterns we use must emerge in response to the program’s requirements, rather than being dogmatically imposed in the name of ‘best practices’.

This is best illustrated by a concrete example. I maintain a library called Faye, a pub/sub messaging system for the web. It uses a JSON-based protocol called Bayeux between the client and the server; the client begins by sending a handshake message to the server:

{
  "channel":  "/meta/handshake",
  "version":  "1.0",
  "supportedConnectionTypes": ["long-polling", "callback-polling", "websocket"]
}

and the server replies with a message containing an ID for the client, and a list of transport types the server supports:

{
  "channel":    "/meta/handshake",
  "successful": true,
  "clientId":   "cpym7ufcmkebx4nnki5loe36f",
  "version":    "1.0",
  "supportedConnectionTypes": ["long-polling", "callback-polling", "websocket"]
}

Once it has a client ID, the client can tell the server it wants to subscribe to a channel by sending this message:

{
  "channel":      "/meta/subscribe",
  "clientId":     "cpym7ufcmkebx4nnki5loe36f",
  "subscription": "/foo"
}

and the server acknowledges the subscription:

{
  "channel":    "/meta/subscribe",
  "clientId":   "cpym7ufcmkebx4nnki5loe36f",
  "successful": true
}

After the client has subscribed to any channels it is interested in, it sends a ‘connect’ message that tells the server that it wants to poll for new messages:

{
  "channel":  "/meta/connect",
  "clientId": "cpym7ufcmkebx4nnki5loe36f"
}

When the server receives a message on any channel the client is subscribed to, it sends a response to the connect message with the new message attached. The client then sends another connect message to poll for more messages. (When using a socket-based connection, new messages can be pushed to the client immediately, without a pending connect request; the connect messages then act as a keep-alive heartbeat mechanism.)

So Faye is fundamentally a client-server system rather than a true peer-to-peer one. The architecture as we understand it so far is very simple:

                          +--------+
                          | Client |
                          +---+----+
                              |
                              V
                          +--------+
                          | Server |
                          +--------+

However there is more to it than that. Notice the supportedConnectionTypes field in the handshake messages: the client and server allow these messages to be sent and received via a number of different transport mechanisms, including XMLHttpRequest, JSON-P, EventSource and WebSocket. So we can add another layer of indirection to our architecture:

                          +--------+
                          | Client |
                          +---+----+
                              |
                              V
    +-----------+-------------+----------------+-------+
    | WebSocket | EventSource | XMLHttpRequest | JSONP |
    +-----------+-------------+----------------+-------+
                              |
                              V
                          +--------+
                          | Server |
                          +--------+

We now have a question: the Server class allows for multiple implementations of the same concept – sending messages over an HTTP connection – to co-exist, rather than having one mechanism hard-coded. Above, we allowed our GitHub::Client class to take an HTTP client as a parameter, letting us change which HTTP client we wanted to use. Surely we have another opportunity to do the same thing here, to construct servers with different kinds of connection handlers, including the possibility of using a fake connection handler to make testing easier.

xhr_server         = Server.new(XMLHttpRequestHandler.new)
jsonp_server       = Server.new(JSONPHandler.new)
eventsource_server = Server.new(EventSourceHandler.new)
websocket_server   = Server.new(WebSocketHandler.new)

If we’re doing DI by the book, this seems like the right approach: to separate the concern of dealing with the abstract protocol embodied in the JSON messages from the transport and serialization mechanism that delivers those messages. But it doesn’t meet the constraints of the problem: the server has to allow for clients using any of these transport mechanisms. The whole point of having transport negotiation is that different clients on different networks will have different capabilities, and the central server needs to accommodate them all. So, although there is a possibility for building classes that all implement an abstract connection handling API in different ways (and this is indeed how the WebSocket portion of the codebase deals with all the different WebSocket protocol versions – see the websocket-driver gem), the server must use all of these connection handlers rather than accepting one of them as a parameter.
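
To put that constraint in code: rather than receiving one handler, the server owns all of them and picks one per request. This is a hypothetical sketch – the accepts? and handle methods are invented for illustration, not Faye’s actual internals:

# The server must construct every handler, because any client may
# arrive using any transport; the choice happens per request, not at
# construction time.
class Server
  HANDLER_CLASSES = [WebSocketHandler, EventSourceHandler,
                     XMLHttpRequestHandler, JSONPHandler]

  def initialize
    @handlers = HANDLER_CLASSES.map(&:new)
  end

  def handle(request)
    handler = @handlers.find { |h| h.accepts?(request) }
    handler.handle(request)
  end
end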

What about the client side? The Client class uses only one of the four transport classes at a time, to mediate its communication with the server. The client depends on having a transport mechanism available, but the choice of which transport to use can be deferred until runtime. This seems like a more promising avenue for applying DI. So, should we make its API look like this?

var url    = 'http://www.example.com/faye',
    client = new Client(new WebSocketTransport(url));

Again, the answer is no. Allowing Client to accept a transport implementation as a constructor parameter gives the system that creates the client (whether that’s a user directly writing Faye client code or part of an application that makes use of Faye) a free choice over which transport to use. But, the choice of transport in this situation is not a free choice; it’s determined by the capabilities of the host browser runtime, the transports that the server says it supports, and which transports we can detect as actually working over the user’s network connection to the server – even if the client and server claim to support WebSocket, an intermediate proxy can stop WebSocket from working.

It is actually part of the client’s responsibility to choose which transport to use, based on automated analysis of these constraints at runtime. The same application, running on different browsers on different networks, will require different transports to be used, even throughout the lifetime of a single client instance. If the Client constructor took a transport as an argument, then the user of the Client class would have to conduct all this detective work themselves. Hiding this work is a core problem the client is designed to solve, so it should not accept a transport object via injection. Even though it would improve testability, it would result in the wrong abstraction being presented to the user.
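
To give a flavour of what the client hides, here is a hypothetical sketch of transport selection – the names are invented, and Faye’s real logic is asynchronous and re-runs as network conditions change:

# Keep the transports the server advertises that this runtime also
# supports, then probe them in order of preference until one is shown
# to actually work over the current connection.
def select_transport(advertised_types)
  candidates = PREFERRED_TRANSPORTS.select do |transport|
    advertised_types.include?(transport.connection_type)
  end
  candidates.find { |t| t.usable?(@endpoint) } or
    raise 'no transport works in this environment'
end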

Finally, let’s look at the internal structure of the Server class. It’s actually composed of multiple layers, and until version 0.6 those layers looked something like this:

    +--------------+
    |  RackServer  |    --> * WebSocket, XHR, JSON-P
    +-------+------+
            |
            |  message objects
            V
    +---------------+
    | BayeuxHandler |   --> * Bayeux message protocol
    +---------------+       * Connections / subscriptions
                            * Message queues

The RackServer deals with the front-end HTTP handling logic, processing all the different transport types and extracting JSON messages from them. Those JSON messages are then handed off to the BayeuxHandler, which contains all the transport-independent protocol logic. It processes the contents of the various JSON messages I showed you above, stores client IDs and their subscriptions and message queues, and routes incoming messages to all the subscribed clients. Now, some of the concerns of this class can be factored out:

    +--------------+
    |  RackServer  |    --> * WebSocket, XHR, JSON-P
    +-------+------+
            |
            |  message objects
            V
    +---------------+
    | BayeuxHandler |   --> * Bayeux message protocol
    +---------------+     +-------------------------------+
                          | * Connections / subscriptions |
                          | * Message queues              |
                          +-------------------------------+
                                        STATE

The BayeuxHandler class actually deals with two things: implementing the protocol as described by the JSON messages, and storing the state of the system: which clients are active, which channels they are subscribed to, and which messages are queued for delivery to which clients. There are many potential ways of implementing this state storage without changing the details of the protocol, and so in version 0.6 this concern was extracted into an Engine class:

    +--------------+
    |  RackServer  |    --> * WebSocket, XHR, JSON-P
    +-------+------+
            |
            |  message objects
            V
    +---------------+
    | BayeuxHandler |   --> * Bayeux message protocol
    +---------------+
            |
            V
    +--------------+
    |    Engine    |    --> * Subscriptions
    +--------------+        * Message queues

There are two engine implementations available: in-memory storage, and a Redis database using Redis’s pub/sub mechanism for IPC.

                      +--------------+
                      |  RackServer  |
                      +-------+------+
                              |
                              V
                      +---------------+
                      | BayeuxHandler |
                      +---------------+
                              |
                    +---------+---------+
                    |                   |
            +--------------+     +-------------+
            | MemoryEngine |     | RedisEngine |
            +--------------+     +-------------+

Finally, we have an honest candidate for dependency injection: whether the stack uses the in-memory or Redis engine makes absolutely no difference to the rest of the stack. It’s a contained implementation detail; given any object that implements the Engine API correctly, the BayeuxHandler and all of the components that sit above it will not be able to tell the difference. The choice of engine is a free choice that the user can make entirely on their own terms, without being bound by environmental constraints as we saw with client-side transport negotiation.
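
For illustration, a minimal in-memory engine might look like this – the method names here are illustrative rather than Faye’s exact engine API:

require 'securerandom'

# An illustrative engine: any object answering these messages can be
# swapped in, for example one that keeps the same state in Redis
# rather than in local data structures.
class MemoryEngine
  def initialize
    @subscriptions = Hash.new { |hash, channel| hash[channel] = [] }
  end

  def create_client
    SecureRandom.hex(13)
  end

  def subscribe(client_id, channel)
    @subscriptions[channel] << client_id
  end

  def subscribers(channel)
    @subscriptions[channel]
  end
end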

However, we don’t have multiple choices at the BayeuxHandler layer of the stack: there is only one Bayeux protocol, it’s an open standard, and there aren’t multiple competing implementations of this component. It’s just in-process computation that takes messages extracted by the RackServer, validates them, determines their meaning and delegates any state changes to the engine.
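
To make the code below concrete, here is a drastically simplified sketch of such a handler, covering only the handshake and subscribe messages shown earlier – Faye’s real handler validates far more:

# Interpret each message, delegate all state changes to the injected
# engine, and build a Bayeux reply.
class BayeuxHandler
  def initialize(engine)
    @engine = engine
  end

  def process(message)
    case message['channel']
    when '/meta/handshake'
      { 'channel'    => '/meta/handshake',
        'successful' => true,
        'version'    => '1.0',
        'clientId'   => @engine.create_client }
    when '/meta/subscribe'
      @engine.subscribe(message['clientId'], message['subscription'])
      { 'channel'    => '/meta/subscribe',
        'clientId'   => message['clientId'],
        'successful' => true }
    else
      { 'channel' => message['channel'], 'successful' => false }
    end
  end
end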

So, BayeuxHandler can be parameterized on which engine object it uses, but the Server will always construct a BayeuxHandler (as well as any transport-handling objects it needs). A highly simplified version of RackServer that only deals with HTTP POST would look like this, taking an engine as input and handing it down to the BayeuxHandler:

require 'json'
require 'rack'

class RackServer
  def initialize(engine)
    @handler = BayeuxHandler.new(engine)
  end

  def call(env)
    request  = Rack::Request.new(env)
    message  = JSON.parse(request.params['message'])
    response = @handler.process(message)
    [
      200,
      {'Content-Type' => 'application/json'},
      [JSON.dump(response)]
    ]
  end
end

A user would then start up the server like so:

server = RackServer.new(RedisEngine.new)

thin = Rack::Handler.get('thin')
thin.run(server, :Port => 80)

Now, in this scenario we get good tests – tests that make mock expectations about one object’s contract with another without stubbing globally visible bindings – as a side effect of the code fulfilling its design requirements, and of being easy to use at the right level of abstraction for the problem it solves. Here are a couple of tests asserting that the server tells the engine to do the right things, and that the server relays information generated by the engine back to the client.

require './rack_server'
require 'rack/test'

describe RackServer do
  include Rack::Test::Methods

  let(:engine) { double('engine') }
  let(:app)    { RackServer.new(engine) }

  describe 'handshake' do
    let(:message) { {
      'channel' => '/meta/handshake',
      'version' => '1.0',
      'supportedConnectionTypes' => ['long-polling']
    } }

    it 'tells the engine to create a new client session' do
      expect(engine).to receive(:create_client).and_return 'new-client-id'
      post '/bayeux', :message => JSON.dump(message)
    end

    it 'returns the new client ID in the response' do
      allow(engine).to receive(:create_client).and_return 'new-client-id'
      post '/bayeux', :message => JSON.dump(message)
      expect(JSON.parse(last_response.body)).to include('clientId' => 'new-client-id')
    end
  end
end

Even when we do decide to use dependency injection, we face some trade-offs. By making the external interface for constructing an object more complicated, we gain some flexibility but lose some convenience. For example, a convenient way to read files looks like this:

File.read('path/to/file.txt')

while a flexible way to read files might look like this:

FileInputStream fis = new FileInputStream("path/to/file.txt");
InputStreamReader isr = new InputStreamReader(fis);
BufferedReader br = new BufferedReader(isr);
// ...

Yes, I have picked extreme examples. The important point is that you can build the former API on top of the latter, but not the other way around. You can wrap flexible building blocks in a convenience API – just look at jQuery or any UI toolkit – but going the other way is much harder, if not impossible. Suppose you want to use the file I/O code independently of the string encoding code? File.read(path) does not expose those building blocks, so you’re going to need to find them somewhere else.
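
As a sketch of that layering in Ruby – the module name is invented – a convenience method is just a thin wrapper that makes the common choices on the caller’s behalf:

# Convenience wrapped around flexibility: the one-liner picks sensible
# defaults (binary open, UTF-8 decode), while the underlying building
# blocks (File.open, the encoding machinery) remain available.
module FileUtil
  def self.read(path)
    File.open(path, 'rb') { |io| io.read.force_encoding('UTF-8') }
  end
end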

When I was at Songkick, we built the clients for our backend services much like the GitHub example above: each client is instantiated with an HTTP client adapter, letting us easily switch which HTTP library we used (yes, we had to do this in production and it took an afternoon), as well as letting us pass in a fake for testing. But for calling these clients from a controller, we wrapped them in a convenience module that constructed canonical instances for us:

module Services
  def self.github
    @github ||= GitHub::Client.new(HttpClient.new('https://api.github.com'))
  end

  # and so on for other services
end

So controller code just made a call like Services.github.get_user('jcoglan'), which is pretty much as convenient as ActiveRecord::Base.find().

To summarise, there is certainly a place for DI, or for any architectural technique, but you must let the requirements of the problem – not just testing concerns – drive the patterns you use. In the case of DI, I reach for it when:

  • There are multiple implementations of an abstract API that a caller might use
  • The code’s client has a free choice over which implementation to use, rather than environmental factors dictating this choice
  • I want to provide a plugin API, as in the case of Faye’s engine system or the protocol handlers in websocket-driver

Beyond DI, I would like to see much more of our design discussion focus on just that: design, in context, with case studies, without deferring to generalisation. Design patterns must be derived from the program’s requirements, not imposed in an attempt to shoe-horn the program into a predefined shape. Ultimately, rather than focussing on testing for its own sake, we must focus on usability. Testability will soon follow.