Before I start in on this: be aware I’m mostly writing this to force myself to
understand something by writing it down. If you get anything out of it, consider
it a bonus. I will be deriving Y()
in JavaScrpit, and giving a version in
Ruby.
After stumbling on this article on Raganwald last year (thoroughly interesting site by the way), I was initially intrigued and then totally frustrated by not being able to understand what the hell this Y combinator thing was. It’s of limited practical use to me, but it bugs me when I can’t wrap my brain around something. This is an attempt to remedy said frustration. You may find this Wikipedia article and Richard Gabriel’s The Why of Y (PDF, examples in Scheme) useful as we go along. I’ve cobbled this together from other articles linked from Raganwald and found through Google.
I assume we’re all familiar with the idea of a function as something that takes some input value and returns some output value. Say, the function for squaring numbers:
- f(x) = x²
Or in Javascript:
var square = function(x) {
return x * x;
};
The fixed points of a function are any input values for which f(x) is equal to x. So, the fixed points of f(x) = x² are 0 and 1.
Now, in mathematics and in any language that supports first-class functions (that is, functions that can be passed around as data), we have things called higher-order functions. These are functions that take another function as input, or return a function as output, or both. The fixed point of a higher order function f is another function p such that f(p) = p. (It may be more helpful to think in terms of functions actually being executed. The previous statement is equivalent to the statement that f(p)(x) = p(x) for all values of x.) Y (the Y combinator) is a special function that returns the fixed points of higher-order functions, that is to say:
- f(Y(f)) = Y(f)
Y is commonly use to allow anonymous recursion without assuming your host language supports it. Let’s say I want a function that calculates the factorial of a number:
var factorial = function(x) {
return x == 0 ? 1 : x * factorial(x-1);
};
Before we go any further: JavaScript supports anonymous recursion. The correct form of the above function is:
var factorial = function(x) {
return x == 0 ? 1 : x * arguments.callee(x-1);
};
arguments.callee
gives all functions an internal reference to themselves. This
article is about what to do if such a feature is not available in your language.
There’s a big problem with the above function (the one without
arguments.callee
): it relies on the name I’ve given the function externally.
If I do this:
var bar = factorial;
factorial = "something";
bar(5) // -> does not work
My factorial function ceases to work because, inside it, factorial
is no
longer what it expects. What we’re going to work towards here is a function that
acts as a factory for factorial functions:
var f = function(q) {
return function(x) {
return x == 0 ? 1 : x * q(x-1);
};
};
This function f
will return a factorial function. To recurse, it calls a local
function (only defined inside f
, not globally) called q
. We need to pass
this function to f
to make the recursion work correctly, but what function do
we need? q
needs to be the factorial function for the inner function to work
properly, but we’re back where we started: we can’t write a secure factorial
function because we can’t use anonymous recursion.
But, notice what’s happened: if we pass the factorial function to f
, it will
return the factorial function. The factorial function is thus a fixed point of
f
, and can be determined using the Y combinator. So how do we get there?
To start with, lets make the recursive function local to factorial
by passing
it in as a parameter:
var factorial = function(h, x) {
return x == 0 ? 1 : x * h(h, x-1);
};
factorial(factorial, 5) // -> 120
We now pass a function to tell factorial
how to recurse without refering to
any global variables. This interface is not really satisfactory though – I want
to be able to call functionName(value)
. So let’s create a function that
returns another function (this is called currying our factorial
function):
var factorial = function(h) {
return function(x) {
return x == 0 ? 1 : x * h(h)(x-1);
};
};
factorial(factorial)(5) // -> 120
That’s all well and good, but there’s a lot of cruft and duplication and we’ve
got h(h)(x)
in the function definition rather than a single q(x)
as above.
So, we create another nested function that creates a single reference to h(h)
:
var factorial = function(h) {
return function(x) {
var f = function(q, x) {
return x == 0 ? 1 : x * q(x-1);
};
return f(h(h), x);
};
};
factorial(factorial)(5) // -> 120
This is almost looking like our f
function form before. We just curry the
innermost function to complete the transformation:
var factorial = function(h) {
return function(x) {
var f = function(q) {
return function(x) {
return x == 0 ? 1 : x * q(x-1);
};
};
return f(h(h))(x);
};
};
factorial(factorial)(5) // -> 120
Now that inner function f
does not contain any references to variables outside
it, so we can pull it out and it will work the same:
var f = function(q) {
return function(x) {
return x == 0 ? 1 : x * q(x-1);
};
};
var factorial = function(h) {
return function(x) {
return f(h(h))(x);
};
};
factorial(factorial)(5) // -> 120
Now we have our original f
that we want the fixed point of. The fixed point is
the factorial function given by factorial(factorial)
, so we can write our
Y()
function to return it:
var Y = function(f) {
var g = function(h) {
return function(x) {
return f(h(h))(x);
};
};
return g(g);
};
This is more generally written as:
var Y = function(f) {
return (function(g) {
return g(g);
})(function(h) {
return function() {
return f(h(h)).apply(null, arguments);
};
});
};
Y()
will now return the fixed point of any higher-order function, which
is just what we wanted:
var factorial = Y(function(recurse) {
return function(x) {
return x == 0 ? 1 : x * recurse(x-1);
};
});
factorial(5) // -> 120
And presto – anonymous recursion with no dependence on global variables or
support from the host language. If you want to do this in Ruby (I’m not sure
whether it has an analog for arguments.callee
), it looks like this:
def y(&f)
lambda { |g| g[g] } [
lambda do |h|
lambda { |*args| f[h[h]][*args] }
end
]
end
factorial = y do |recurse|
lambda do |x|
x.zero? ? 1 : x * recurse[x-1]
end
end
The frustrating thing about Y is that once you’ve derived it, it’s totally impossible to tell what it does just by looking at it. If you made it this far, congratulations.