How Ruby method dispatch works

I was asking around to see if anyone knew a good, short explanation of Ruby’s object and method dispatch system the other day, and the response from several people was, “no, you should write one.” So, here we are. I’m going to explain how Ruby’s object system works, including method lookup, inheritance, super calls, classes, mixins, and singleton methods. My understanding comes not from reading the MRI source but from reimplementing this system, once in JavaScript and once in Ruby. If you want to read a minimal but almost correct implementation that Ruby gist is not a bad place to start.

Because I’ve not actually read the source, this will explain what happens logically but it might not be what actually happens inside of Ruby. It’s just a model you can use to understand things.

Right, let’s start at the start. You can build almost all of Ruby’s object system out of Module. Think of a module as a bag of methods. For example, module A contains methods foo and bar.

    +----------+
    | module A |
    +----------+
    | def foo  |
    | def bar  |
    +----------+

When you write def foo ... end inside a Ruby module, you are adding that method to the module, that’s all. Now, a module can have any number of ‘parents’. When you write:

module B
  include A
end

all you are doing is adding A as a ‘parent’ of B. No methods are copied, we just create a pointer from B to A.

    +-----------+
    | module A  |
    +-----------+
    | def foo   |
    | def bar   |
    +-----------+
          ^
          |
    +-----+-----+
    | module B  |
    +-----------+
    | def hello |
    | def bye   |
    +-----------+

Now, a module can have many parents, and they form a tree. Take these modules:

module A
  def foo ; end
  def bar ; end
end

module B
  def hello ; end
  def bye ; end
end

module C
  include B
  def start ; end
  def stop ; end
end

module D
  include A
  include C
end

These form a tree like this, following their include relationships:

                                            +-----------+
                                            | module B  |
                                            +-----------+
                                            | def hello |
                                            | def bye   |
                                            +-----------+
                                                  ^
    +-----------+                           +-----+-----+
    | module A  |                           | module C  |
    +-----------+                           +-----------+
    | def foo   |                           | def start |
    | def bar   |                           | def stop  |
    +-----------+                           +-----------+
          ^                                       ^ 
          +-------------------+-------------------+
                              |
                        +-----+-----+
                        | module D  |
                        +-----------+

An important concept that affects how methods are dispatched is a module’s ‘ancestry’. You can ask a module for its ancestors and it will give you an array of modules:

>> D.ancestors
=> [D, C, B, A]

The important thing about this list is that it’s flat, rather than being a tree. It determines the order that we search modules in to find a method. To build this list, we start at D and run a depth-first right-to-left search of its tree. This is why the order of include calls is important: a module’s parents are ordered and this determines the order they are searched in.

When we want to dispatch a method, we look at each one of a module’s ancestors in turn, and stop at the first module that contains a method with the name we want. If none of the modules contain this method, we perform the search again but this time looking for the method called method_missing. If none of the modules contain that method, we throw a NoMethodError. The order of the ancestry resolves cases where two modules contain the same method: whichever comes earlier in the ancestors array wins.

We can use Ruby’s reflection capabilities to find out which method will be used when we invoke certain names:

>> D.instance_method(:foo)
=> #<UnboundMethod: D(A)#foo>

>> D.instance_method(:hello)
=> #<UnboundMethod: D(B)#hello>

>> D.instance_method(:start)
=> #<UnboundMethod: D(C)#start>

An UnboundMethod is just an object representing a method from a module, before it’s been bound to an object. When you see D(A)#foo, it means D has inherited the #foo method from A. If you dispatch #foo to an object that includes D, you’ll get the method defined in A.

Speaking of objects, why haven’t we made any yet? What good is a bag of methods will no objects to invoke them on? Well, that’s where Class comes in. In Ruby, Class is a subclass of Module, which sounds weird but just remember they’re data structures that hold methods. A Class is like a Module, in that it’s a thing that stores methods and can include other modules, but it also has some additional capabilities, the first of which is that it can create objects.

class K
  include D
end

k = K.new

Again, we can use reflection to see where each of the object’s methods come from:

>> k.method(:start)
=> #<Method: K(C)#start>

This shows that when we invoke k.start, we’ll get the #start method from module C. You’ll notice that while calling instance_method on a module gets us an UnboundMethod, calling method on an Object gets us a Method. The difference is that a Method is bound to an object; it’s a callable that, when you invoke #call on it, will do the same thing as calling k.start. UnboundMethods cannot be called directly since they have no object to be invoked on.

So it looks like we dispatch method calls by finding the class the object belongs to, then looking through that class’s ancestors until we find a matching method. Well, that’s almost true, but Ruby has another trick up its sleeve: singleton methods. You can add new methods to any object, and only that object, without adding them to a class. See:

>> def k.mart ; end

>> k.method(:mart)
=> #<Method: #<K:0x00000001f78248>.mart>

We can add them to modules too, since modules are just another kind of object:

>> def B.roll ; end

>> B.method(:roll)
=> #<Method: B.roll>

When a Method‘s name has a dot (.) instead of a hash (#) in it, it means the method exists only on that object instead of being contained in a module. But we said earlier that modules are the thing Ruby uses to store methods; plain old objects don’t have this power. So where are singleton methods stored?

Every object in Ruby (and remember, modules and classes are objects too) has what’s called a metaclass, also known as a singleton class, eigenclass or virtual class. The job of this class is simply to store the object’s singleton methods; by default it contains no methods and has the object’s class as its only parent. So for our object k, its full ancestor tree looks like this:

                                            +-----------+
                                            | module B  |
                                            +-----------+
                                                  ^
    +-----------+                           +-----+-----+
    | module A  |                           | module C  |
    +-----------+                           +-----------+
          ^                                       ^
          +-------------------+-------------------+
                              |
                        +-----+-----+
                        | module D  |
                        +-----------+
                              ^
                        +-----+-----+
                        | class K   |
                        +-----------+
                              ^
                        +-----+-----+         +---+
                        | metaclass |<~~~~~~~~+ k |
                        +-----------+         +---+

We can ask Ruby for an object’s metaclass, and reflect on it just like any other. Here we see the metaclass is an anonymous Class attached to the object k, and it has an instance method #mart that doesn’t exist in the K class.

>> k.singleton_class
=> #<Class:#<K:0x00000001f78248>>

>> k.singleton_class.instance_method(:mart)
=> #<UnboundMethod: #<Class:#<K:0x00000001f78248>>#mart>

>> K.instance_method(:mart)
NameError: undefined method `mart' for class `K'

One gotcha to look out for is that metaclasses don’t appear in their own #ancestors lists, but you should think of them being in their for the purposes of finding methods.

When we invoke methods on k, it asks its metaclass to find the method, and this uses the metaclass’s ancestry to locate the required method. Singleton methods live in the metaclass itself, so they are preferred over methods inherited from the object’s class or any of its ancestors.

Now we come to the second special property of classes, beyond their ability to create objects. Classes have a special form of inheritance called 'subclassing’. Every class has one and only one superclass, the default being Object. In terms of method lookup, you can think of a superclass as just being the class’s first parent module:

class Foo < Bar                   class Foo
  include Extras        =~          include Bar
end                                 include Extras
                                  end

So Foo.ancestors gives us [Foo, Extras, Bar] in both cases, and this determines method lookup order as usual. (Actually it gives us [Foo, Extras, Bar, Object, Kernel, BasicObject] but we’ll get to those letter modules in a minute.) Note that Ruby violates the Liskov substitution principle by not allowing classes to be given to include; only modules can be used this way, not their subtypes. The above snippet simply expresses what subclassing means for method lookup, and the code on the right will not run if Bar is a Class.

If subclassing is the same as including, why do we need it at all? Well, it does one extra thing: classes inherit their superclass’s singleton methods, but not those of included modules.

module Z
  def self.z ; :z ; end
end

class Bar
  def self.bar ; :bar ; end
end

class Foo < Bar
  include Z
end

# Singleton methods from Bar work on Foo ...
>> Bar.bar
=> :bar
>> Foo.bar
=> :bar

# ... but singleton methods from Z don't
>> Z.z
=> :z
>> Foo.z
NoMethodError: undefined method `z' for Foo:Class

We can model this in terms of parent relationships by saying that the subclass’s metaclass has the superclass’s metaclass as a parent:

    +-----+         +--------------+
    | Bar +~~~~~~~~>| #<Class:Bar> |
    +-----+         +--------------+
       ^                    ^
       |                    |
    +--+--+         +-------+------+
    | Foo +~~~~~~~~>| #<Class:Foo> |
    +-----+         +--------------+

And indeed if we reflect on Foo we see that its #bar method originates from Bar‘s metaclass.

>> Foo.method(:bar)
=> #<Method: Foo(Bar).bar>

>> Foo.singleton_class.instance_method(:bar)
=> #<UnboundMethod: #<Class:Bar>#bar>

We’ve seen how inheritance and method lookup in Ruby can be modelled as a tree of modules, with include and subclassing creating various parent relationships. This describes single and multiple inheritance of instance and singleton methods pretty well. Now let’s look at a few things that piggy-back on this model.

The first is the Object#extend method. Calling object.extend(M) makes the methods in module M available on object. It doesn’t copy the methods, it just adds M as a parent of the object’s metaclass. If object has class Thing, we get this relationship:

                     +-------+      +-----+
                     | Thing |      |  M  |
                     +-------+      +-----+
                         ^             ^
                         +-------+-----+
                                 |
    +--------+         +---------+-------+
    | object +~~~~~~~~>| #<Class:object> |
    +--------+         +-----------------+

So extending an object with a module is just the same thing as including that module in the object’s metaclass. (Actually there are some differences but they’re not relevant to the present discussion.) Given this tree, we see that when we invoke methods on object, the lookup process will prefer methods contained in M to those defined in Thing, and will prefer methods defined directly in the object’s metaclass over both of them.

This context is important: we cannot say methods in M take precedence over Thing in general, only when we’re talking about method calls to object. The method receiver’s ancestry is what’s important, and this shows up when we investigate how super works. Take this set of modules:

module X
  def call ; [:x] ; end
end

module Y
  def call ; super + [:y] ; end
end

class Test
  include X
  include Y
end

The ancestry of Test goes [Test, Y, X], so clearly if we call Test.new.call we will invoke the #call method from Y. But what happens when Y calls super? Y has no ancestors of its own, so there’s nowhere to dispatch the method to, right?

Nope. When we encounter a super call, what’s important is the ancestry of the object we made the original method call on (the 'receiver’), and nothing else. You can imagine method lookup as finding all the implementations of the given method in the ancestor list for the receiver’s metaclass:

>> t = Test.new

>> t.singleton_class.ancestors.map { |m|
     m.instance_methods(false).include?(:call) ? m.instance_method(:call) : nil
   }.compact

=> [#<UnboundMethod: Y#call>, #<UnboundMethod: X#call>]

To dispatch the method, we invoke the first method in this list. If that method calls super, we jump to the second, and so on until we run out of methods to invoke. If Test didn’t include module X, there would be no implementations of #call after the one from Y so that call to super would fail.

Sure enough, in our case Test.new.call returns [:x, :y].

We’re almost done, but I promised I’d explain what Object, Kernel and BasicObject are. BasicObject is the root class of the whole system; it’s a Class with no superclass. Object inherits from BasicObject, and is the default superclass of all user-defined classes. The difference between the two is that BasicObject has almost no methods defined in it, while Object has loads: core Ruby methods like #==, #__send__, #dup, #inspect, #instance_eval, #is_a?, #method, #respond_to?, and #to_s. Well, actually it doesn’t have all those methods itself, it gets them from Kernel. Kernel is just the module with all Ruby’s core object methods in it. So when we map out Ruby’s core object system we get the following:

    +---------------+                                                           +------------+
    |               |                                                           |            |
    |   +-----------+----------+     +-------------+    +--------+     +--------+--------+   |
    |   | #<Class:BasicObject> |<~~~~+ BasicObject |    | Kernel +~~~~>| #<Class:Kernel> |   |
    |   +----------------------+     +-------------+    +--------+     +-----------------+   |
    |               ^                       ^                ^                               |
    |               |                       +-------+--------+                               |
    |               |                               |                                        |
    |      +--------+--------+                 +----+---+                                    |
    |      | #<Class:Object> |<~~~~~~~~~~~~~~~~+ Object |                                    |
    |      +-----------------+                 +--------+                                    |
    |               ^                               ^                                        |
    |               |                               |                                        |
    |      +--------+--------+                 +----+---+                                    |
    |      | #<Class:Module> |<~~~~~~~~~~~~~~~~+ Module |<-----------------------------------+
    |      +-----------------+                 +--------+
    |               ^                               ^
    |               |                               |
    |      +--------+--------+                 +----+---+
    |      | #<Class:Class>  |<~~~~~~~~~~~~~~~~+ Class  |                                                                                                                                
    |      +-----------------+                 +--------+
    |                                               ^
    |                                               |
    +-----------------------------------------------+

This shows the core modules and classes in Ruby: BasicObject, Kernel, Object, Module and Class, their metaclasses, and how they are all related. Yes, BasicObject.singleton_class.superclass is Class. Ruby does some voodoo internally to make this circular relationship work. Anyway, if you want to understand Ruby method dispatch, just remember:

  • A module is a bag of methods
  • A module can have many parents
  • A class is a module that can make new objects
  • Every object has a metaclass that has the object’s class as its parent
  • Subclassing means linking two classes and their metaclasses
  • Methods are found via a depth-first right-to-left search of the receiver’s metaclass’s ancestry

No, I don’t know how refinements work. No-one does.

If you’ve enjoyed this article, you might enjoy my recently published book JavaScript Testing Recipes. It’s full of simple techniques for writing modular, maintainable JavaScript apps in the browser and on the server.