Of objects and operators

I was bitten this week by some unexpected behaviour in RSpec. I can’t remember the specifics of the problem but here’s a very similar one: when a Faye client connects to the server, it must perform what’s called a handshake. It sends the server a message and the server responds by sending a randomly generated client ID back. We don’t care what the ID is, we just care about the format of the response. So we do something like this (greatly simplified of course):

describe Faye::Server do
  include Rack::Test::Methods
  let(:app) { Faye::Server.new }
  
  describe :handshake do
    before do
      post '/bayeux', :message => '{"channel":"/meta/handshake"}'
    end
    
    it "returns a clientId" do
      json = JSON.parse(last_response.body)
      json.should == hash_including('clientId' => instance_of(String))
    end
  end
end

Let’s assume our server works properly, and returns a response body of {"clientId":"abc123"}. Our test will end up doing this:

{'clientId' => 'abc123'}.should == hash_including('clientId' => instance_of(String))
RSpec::Expectations::ExpectationNotMetError:
expected: #<RSpec::Mocks::ArgumentMatchers::HashIncludingMatcher:0x00000001be5ab8 @expected={"clientId"=>#<RSpec::Mocks::ArgumentMatchers::InstanceOf:0x00000001be65f8 @klass=String>}>
     got: {"clientId"=>"abc123"} (using ==)
Diff:
@@ -1,6 +1,2 @@
-#<RSpec::Mocks::ArgumentMatchers::HashIncludingMatcher:0x00000001be5ab8
- @expected=
-  {"clientId"=>
-    #<RSpec::Mocks::ArgumentMatchers::InstanceOf:0x00000001be65f8
-     @klass=String>}>
+{"clientId"=>"abc123"}

What happens if we swap the operands of == around?

hash_including('clientId' => instance_of(String)).should == {'clientId' => 'abc123'}
#=> true

So on the one hand, we’re told that A != B, but somehow B == A. What’s going on here? Well, you no doubt recognise hash_including and instance_of from RSpec’s mocking framework. It lets you write stubs like this:

o = Object.new
o.stub(:foo).with(instance_of String).and_return :chunky
o.stub(:foo).with(hash_including 'clientId' => 'abc123').and_return :bacon

o.foo('hello') #=> :chunky
o.foo('clientId' => 'abc123') #=> :bacon

These methods create matchers: objects that are used to pattern-match incoming arguments and dispatch the correct return value. They work by implementing custom equality methods that RSpec uses to tell which argument list matches the arguments passed to the method.

instance_of(String) == 'hello'
#=> true
hash_including('clientId' => 'abc123') == 'hello'
#=> false
instance_of(String) == {}
#=> false
hash_including('clientId' => 'abc123') == {'clientId' => 'abc123'}
#=> true

But if you try placing these matchers on the right-hand side of an equality expression, something weird happens:

'hello' == instance_of(String)
#=> false

These things ought to be equal, but they’re not, and the reason they’re not is quite simple: == is implemented as a method.

In object-oriented programming, a method is just a function attached to some object. When it is invoked, it has access to the object it is attached to, and all its state, as well as the arguments passed into the method. In Ruby, the equality method looks like this:

class MyClass
  def ==(other)
    # return true or false
  end
end
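
To make the asymmetry concrete, here is a minimal sketch. InstanceOfMatcher is a hypothetical stand-in written for this post, not RSpec's actual matcher class; it only illustrates that whichever object sits on the left of == is the receiver, and so gets to decide the answer.

```ruby
# Hypothetical stand-in for an argument matcher (not RSpec's real class).
class InstanceOfMatcher
  def initialize(klass)
    @klass = klass
  end

  # Invoked for `matcher == value`: the matcher is the receiver and decides.
  def ==(other)
    other.instance_of?(@klass)
  end
end

matcher = InstanceOfMatcher.new(String)

# Same call as matcher.==('hello'): our custom method runs.
matcher_on_left = (matcher == 'hello')   #=> true

# Same call as 'hello'.==(matcher): String's own == runs, and it knows
# nothing about our matcher.
matcher_on_right = ('hello' == matcher)  #=> false
```

Nothing in the second expression ever reaches InstanceOfMatcher#==, which is why swapping the operands changes the result.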

Now broadly speaking, I like programming this way. I find objects to be a useful way of organising the concepts in my programs into thematically linked groups. Methods let you tell objects what to do, and they let you ask objects questions. But what they don’t do terribly well is let you implement operators.

On the surface, it seems like a natural fit: invoking an operator could be seen as asking an object, “Are you equal to this other object?”. Letting your custom object types implement this question seems like a good idea, as it lets the language’s built-in idea of equality be extended. But the equality operator should not work this way: the question of whether two objects are equal is related to the concept of equality, not to the whims of either of the operands. By implementing equality as a method, we make one object entirely responsible for the question of whether two objects are equal, and it turns out this is not easily extensible.

Many operators have properties that are hard to get right if implemented as methods, and the most troublesome of them all is commutativity. Operators such as equality, addition and scalar multiplication have the property that the order of the operands does not matter: A + B = B + A, and if A = B then B = A (for equality, this property is usually called symmetry). It is not hard to see how, as we introduce more and more types of objects into a system, each type needs to know how to interact with every other type. In our example above, RSpec matchers know how to compare themselves to other Ruby objects, but the objects have no idea what an RSpec matcher is and so declare themselves not equal. Ruby provides a mechanism for working around this with the coerce method, but even this has problems: implementers need to remember to defer to it when given an unknown object type, it’s easy to create an infinite loop if both parties defer responsibility, and clearly many built-in Ruby objects don’t use it to defer equality decisions.
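
For readers who have not met coerce, here is a rough sketch of how it is typically used with arithmetic. The Meters class is hypothetical and invented for this example. When Integer#+ is handed an operand it does not recognise, it calls coerce on that operand and retries the operation with the pair it gets back.

```ruby
# Hypothetical value type, used only to illustrate coerce.
class Meters
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Handles Meters + Meters and Meters + plain number.
  def +(other)
    other_value = other.is_a?(Meters) ? other.value : other
    Meters.new(value + other_value)
  end

  # Called by Integer#+ when it meets a Meters on its right-hand side.
  # Returns [left, right] converted so that left + right makes sense.
  def coerce(numeric)
    [Meters.new(numeric), self]
  end
end

left_sum  = Meters.new(2) + 3   # Meters#+ handles this directly
right_sum = 3 + Meters.new(2)   # Integer#+ defers via Meters#coerce

left_sum.value   #=> 5
right_sum.value  #=> 5
```

This only works because Integer#+ chooses to ask. An operator implementation that never calls coerce gives the other operand no chance to take part.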

This little story highlights two key problems. First, given the semantics of operators in Ruby, RSpec’s assertion mechanism is unreliable. By placing the value under test on the left-hand side of the a.should == b expression, we make it responsible for deciding whether it is a valid value or not, when really that decision should be up to b, the expected value. This is one reason that I prefer Test::Unit’s assert_equal, which places the expected value first and invokes == on it rather than on the value under test. I’ve been told that custom RSpec matchers can solve this problem, but really that feels like far too much ceremony when a simple function call will suffice.
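
As an illustration of the expected-value-first style, here is a small sketch. The check_equal helper and both matcher classes are hypothetical, written for this post; they are not Test::Unit's or RSpec's implementations, only an imitation of the idea that == is sent to the expected value.

```ruby
# Hypothetical matchers, loosely modelled on instance_of and hash_including.
class InstanceOf
  def initialize(klass)
    @klass = klass
  end

  def ==(actual)
    actual.instance_of?(@klass)
  end
end

class HashIncluding
  def initialize(expected)
    @expected = expected
  end

  # The expected value is always the receiver of ==, so nested matchers work.
  def ==(actual)
    actual.is_a?(Hash) &&
      @expected.all? { |key, value| actual.key?(key) && value == actual[key] }
  end
end

# Hypothetical assertion helper: the expected value comes first and decides.
def check_equal(expected, actual)
  return true if expected == actual
  raise "expected #{expected.inspect}, got #{actual.inspect}"
end

expected = HashIncluding.new('clientId' => InstanceOf.new(String))
response = { 'clientId' => 'abc123' }

check_equal(expected, response)  #=> true
```

Calling check_equal(response, expected) would raise instead, because Hash#== would be the method making the decision.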

Second, the more insidious problem that we’ve seen repeated in Dart this week: operators should not be methods. Making one operand responsible for the whole operation might seem convenient but it does not scale as your program grows. A more sensible approach is to adopt a dispatch table where you can dynamically register new type signatures for existing operators; the closest thing I’ve seen to this is Clojure’s multimethods. I won’t say they’re the answer as I’ve not really used Clojure in anger, but they do look like a better solution than jamming operators into the ‘everything is an object’ model.
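
To give a flavour of what a dispatch table might look like, here is a very rough sketch in Ruby. It is not how Clojure’s multimethods work, and the Equality module, its register and check methods, and the InstanceOf struct are all hypothetical names made up for this example. The point is only that the comparison rule lives outside both operands and is found whichever way round they are given.

```ruby
# Hypothetical registry of equality rules, keyed by a pair of types.
module Equality
  RULES = {}

  # Register a rule for (left_type, right_type). The block always receives
  # its arguments in the registered order.
  def self.register(left_type, right_type, &rule)
    RULES[[left_type, right_type]] = rule
  end

  # Look for a rule matching the operands in either order; otherwise fall
  # back to the ordinary == method.
  def self.check(a, b)
    RULES.each do |(left_type, right_type), rule|
      return rule.call(a, b) if a.is_a?(left_type) && b.is_a?(right_type)
      return rule.call(b, a) if b.is_a?(left_type) && a.is_a?(right_type)
    end
    a == b
  end
end

# Hypothetical matcher that carries data but no equality logic of its own.
InstanceOf = Struct.new(:klass)

Equality.register(InstanceOf, Object) do |matcher, value|
  value.instance_of?(matcher.klass)
end

Equality.check(InstanceOf.new(String), 'hello')  #=> true
Equality.check('hello', InstanceOf.new(String))  #=> true
Equality.check('hello', 42)                      #=> false
```

A real design would need to decide what happens when several rules match the same pair, which this sketch ignores by taking the first one registered.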

Unfortunately, many things that are bad ideas in theory turn out to be useful in practice, and until we come up with convenient ways of doing the right thing, we’ll be lumbered with bad design decisions in our programming languages.