I was bitten this week by some unexpected behaviour in RSpec. I can’t remember the specifics of the problem but here’s a very similar one: when a Faye client connects to the server, it must perform what’s called a handshake. It sends the server a message and the server responds by sending a randomly generated client ID back. We don’t care what the ID is, we just care about the format of the response. So we do something like this (greatly simplified of course):
describe Faye::Server do
  include Rack::Test::Methods
  let(:app) { Faye::Server.new }
  describe :handshake do
    before do
      post '/bayeux', :message => '{"channel":"/meta/handshake"}'
    end
    it "returns a clientId" do
      json = JSON.parse(last_response.body)
      json.should == hash_including('clientId' => instance_of(String))
    end
  end
end
Let’s assume our server works properly, and returns a response body of {"clientId":"abc123"}. Our test will end up doing this:
{'clientId' => 'abc123'}.should == hash_including('clientId' => instance_of(String))
RSpec::Expectations::ExpectationNotMetError:
expected: #<RSpec::Mocks::ArgumentMatchers::HashIncludingMatcher:0x00000001be5ab8 @expected={"clientId"=>#<RSpec::Mocks::ArgumentMatchers::InstanceOf:0x00000001be65f8 @klass=String>}>
got: {"clientId"=>"abc123"} (using ==)
Diff:
@@ -1,6 +1,2 @@
-#<RSpec::Mocks::ArgumentMatchers::HashIncludingMatcher:0x00000001be5ab8
- @expected=
- {"clientId"=>
- #<RSpec::Mocks::ArgumentMatchers::InstanceOf:0x00000001be65f8
- @klass=String>}>
+{"clientId"=>"abc123"}
What happens if we swap the operands of == around?
hash_including('clientId' => instance_of(String)).should == {'clientId' => 'abc123'}
#=> true
So on the one hand, we’re told that A != B, but somehow B == A. What’s going on here? Well, you no doubt recognise hash_including and instance_of from RSpec’s mocking framework. It lets you write stubs like this:
o = Object.new
o.stub(:foo).with(instance_of String).and_return :chunky
o.stub(:foo).with(hash_including 'clientId' => 'abc123').and_return :bacon
o.foo('hello') #=> :chunky
o.foo('clientId' => 'abc123') #=> :bacon
These methods create matchers: objects that are used to pattern-match incoming arguments and dispatch the correct return value. They work by implementing custom equality methods that RSpec uses to tell which argument list matches the arguments passed to the method.
instance_of(String) == 'hello'
#=> true
hash_including('clientId' => 'abc123') == 'hello'
#=> false
instance_of(String) == {}
#=> false
hash_including('clientId' => 'abc123') == {'clientId' => 'abc123'}
#=> true
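To make the dispatch idea concrete, here is a minimal sketch of how a stub could use matcher equality to pick a return value. It is not RSpec’s actual implementation: InstanceOfSketch, HashIncludingSketch and TinyStub are hypothetical names made up for illustration.

```ruby
# Hypothetical stand-in for instance_of: equal to any direct instance of klass.
class InstanceOfSketch
  def initialize(klass)
    @klass = klass
  end

  def ==(other)
    other.instance_of?(@klass)
  end
end

# Hypothetical stand-in for hash_including: equal to any Hash that
# contains all of the expected key-value pairs.
class HashIncludingSketch
  def initialize(expected)
    @expected = expected
  end

  def ==(other)
    other.is_a?(Hash) && @expected.all? { |key, value| value == other[key] }
  end
end

# Hypothetical stub: remembers [expected_args, return_value] pairs and
# returns the value of the first pair whose expected_args match the call.
class TinyStub
  def initialize
    @rules = []
  end

  def with(*expected_args, returns:)
    @rules << [expected_args, returns]
    self
  end

  def call(*actual_args)
    # Array#== compares element by element, sending == to the elements of
    # the left-hand array, so the matchers are always the receivers here.
    rule = @rules.find { |expected_args, _| expected_args == actual_args }
    rule && rule.last
  end
end

foo = TinyStub.new
foo.with(InstanceOfSketch.new(String), returns: :chunky)
foo.with(HashIncludingSketch.new('clientId' => 'abc123'), returns: :bacon)

foo.call('hello')                     #=> :chunky
foo.call({ 'clientId' => 'abc123' })  #=> :bacon
```

Note that the sketch only works because the expected arguments sit on the left of the comparison inside call.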
But if you try placing these matchers on the right-hand side of an equality expression, something weird happens:
'hello' == instance_of(String)
#=> false
These things ought to be equal, but they’re not, and the reason they’re not is quite simple: == is implemented as a method.
In object-oriented programming, a method is just a function attached to some object. When it is invoked, it has access to the object it is attached to, and all its state, as well as the arguments passed into the method. In Ruby, the equality method looks like this:
class MyClass
  def ==(other)
    # return true or false
  end
end
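Because == is an ordinary method call, the left-hand operand alone decides which implementation runs. Here is a minimal sketch of that, using a hypothetical InstanceOfSketch class rather than RSpec’s real matcher:

```ruby
# Hypothetical matcher, not RSpec's implementation.
class InstanceOfSketch
  def initialize(klass)
    @klass = klass
  end

  def ==(other)
    other.instance_of?(@klass)
  end
end

matcher = InstanceOfSketch.new(String)

# a == b is sugar for a.==(b), so the receiver picks the method.
matcher.method(:==).owner  #=> InstanceOfSketch
'hello'.method(:==).owner  #=> String

matcher == 'hello'  #=> true, InstanceOfSketch#== runs
'hello' == matcher  #=> false, String#== runs and knows nothing about matchers
```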
Now broadly speaking, I like programming this way. I find objects to be a useful way of organising the concepts in my programs into thematically linked groups. Methods let you tell objects what to do, and they let you ask objects questions. But what they don’t do terribly well is let you implement operators.
On the surface, it seems like a natural fit: invoking an operator could be seen as asking an object, “Are you equal to this other object?”. Letting your custom object types implement this question seems like a good idea, as it lets the language’s built-in idea of equality be extended. But the equality operator should not work this way: the question of whether two objects are equal is related to the concept of equality, not to the whims of either of the operands. By implementing equality as a method, we make one object entirely responsible for the question of whether two objects are equal, and it turns out this is not easily extensible.
Many operators have properties that are hard to get right if implemented as methods, and the most troublesome of them all is commutativity. Operators such as equality, addition and scalar multiplication have the property that the order of the operands does not matter: A + B = B + A, and if A = B then B = A. It is not hard to see how, as we introduce more and more types of objects into a system, each type needs to know how to interact with every other type. In our example above, RSpec matchers know how to compare themselves to other Ruby objects, but the objects have no idea what an RSpec matcher is and so declare themselves not equal. Ruby provides a mechanism for working around this with the coerce method, but even this has problems. Users need to remember to defer to it if given an unknown object type, it’s easy to create an infinite loop if both parties defer responsibility, and clearly many built-in Ruby objects don’t use it to defer equality decisions.
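For arithmetic, the coerce protocol can be sketched as follows. Metres is a hypothetical class invented for this example; the only built-in behaviour relied on is that Integer#+ calls coerce on an operand it does not recognise.

```ruby
# Hypothetical value type, used only for illustration.
class Metres
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def +(other)
    other_value = other.is_a?(Metres) ? other.value : other
    Metres.new(value + other_value)
  end

  # Integer#+ calls this when handed a Metres it cannot add itself.
  # It expects a two-element array back and then evaluates first + second.
  def coerce(number)
    [Metres.new(number), self]
  end
end

(Metres.new(2) + 3).value  #=> 5, Metres#+ handles the Integer directly
(3 + Metres.new(2)).value  #=> 5, Integer#+ defers via Metres#coerce
```

The second call only works because Integer#+ chooses to defer; a receiver that never calls coerce leaves the other operand with no say in the result.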
This little story highlights two key problems. First, given the semantics of operators in Ruby, RSpec’s assertion mechanism is unreliable. By placing the value under test on the left-hand side of the a.should == b expression, we make it responsible for deciding whether it is a valid value or not, when really that decision should be up to b, the expected value. This is one reason that I prefer Test::Unit’s assert_equal, which places the expected value first and invokes == on it rather than on the value under test. I’ve been told that custom RSpec matchers can solve this problem, but really that feels like far too much ceremony when a simple function call will suffice.
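The difference in operand order can be sketched without either test framework. check_equal below is a hypothetical helper that mirrors the behaviour described above, with the expected value as the receiver of ==; HashIncludingSketch is likewise a made-up stand-in for hash_including.

```ruby
# Hypothetical stand-in for hash_including.
class HashIncludingSketch
  def initialize(expected)
    @expected = expected
  end

  def ==(other)
    other.is_a?(Hash) && @expected.all? { |key, value| value == other[key] }
  end
end

# Hypothetical helper: the expected value decides whether actual is acceptable.
def check_equal(expected, actual)
  raise "expected #{expected.inspect}, got #{actual.inspect}" unless expected == actual
  true
end

# Made-up sample response; only the clientId key comes from the example above.
actual  = { 'clientId' => 'abc123', 'successful' => true }
pattern = HashIncludingSketch.new('clientId' => 'abc123')

check_equal(pattern, actual)  #=> true, HashIncludingSketch#== is the receiver
actual == pattern             #=> false, Hash#== is the receiver
```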
Second, the more insidious problem that we’ve seen repeated in Dart this week: operators should not be methods. Making one operand responsible for the whole operation might seem convenient but it does not scale as your program grows. A more sensible approach is to adopt a dispatch table where you can dynamically register new type signatures for existing operators; the closest thing I’ve seen to this is Clojure’s multimethods. I won’t say they’re the answer as I’ve not really used Clojure in anger, but they do look like a better solution than jamming operators into the ‘everything is an object’ model.
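As a rough illustration of the dispatch-table idea, here is a sketch in Ruby. EqualityTable and its register and equals? methods are hypothetical names, and this is not how Clojure’s multimethods are implemented; it only shows equality rules living outside either operand and being looked up in both orders.

```ruby
# Hypothetical registry of equality rules keyed by a pair of classes.
module EqualityTable
  @rules = {}

  # Register a rule for comparing a left_class instance with a right_class instance.
  def self.register(left_class, right_class, &rule)
    @rules[[left_class, right_class]] = rule
  end

  # Look the pair up in both orders so the answer cannot depend on
  # which operand happens to be written first.
  def self.equals?(a, b)
    if (rule = @rules[[a.class, b.class]])
      rule.call(a, b)
    elsif (rule = @rules[[b.class, a.class]])
      rule.call(b, a)
    else
      a == b # no rule registered: fall back to the built-in method
    end
  end
end

# Hypothetical matcher that carries data but defines no == of its own.
class InstanceOfSketch
  attr_reader :klass

  def initialize(klass)
    @klass = klass
  end
end

EqualityTable.register(InstanceOfSketch, String) do |matcher, value|
  value.instance_of?(matcher.klass)
end

matcher = InstanceOfSketch.new(String)

EqualityTable.equals?(matcher, 'hello')  #=> true
EqualityTable.equals?('hello', matcher)  #=> true
EqualityTable.equals?('hello', 'hello')  #=> true, via the fallback
```

This sketch matches on exact classes only; a fuller version would need to decide how subclasses and conflicting rules are resolved.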
Unfortunately, many things that are bad ideas in theory turn out to be useful in practice, and until we come up with convenient ways of doing the right thing, we’ll be lumbered with bad design decisions in our programming languages.