Why recursion matters, part 3: meaningful programs

In part 1 we learned that recursion underlies proof by induction, letting us guarantee the behaviour of functions algebraically. In part 2 we saw how defining functions as recursive equations enables lazy evaluation, so that the program only computes those things that are essential to generate its output. In this final part, I’d like to talk about what programs mean, and to do that I’m going to show you a tiny programming language that supports time travel.

I am going show you the whole implementation of this programming language in JavaScript, which I’m using so you know I’m not cheating. We know that, compared to Haskell, JavaScript can’t do algebraic analysis of your code or run it lazily, and it certain can’t run functions backwards, so I want you to know that everything this language can do is a result of the functions you’re about to see.

First, the language has two kinds of values: variables and pairs. A variable is simply a value that has a name. It just lets us talk about variables as first-class objects rather than merely using them.

class Variable {
  constructor(name) {
    this._name = name
  }

  inspect() {
    return this._name
  }
}

We’ll give every type an inspect() method so that we can display the results of programs in the terminal.

A pair is a value with a left field and a right field; it is the only compound data structure in this language. Though simple, pairs can be used to simulate many other structures, for example they’re commonly used to build linked lists.

class Pair {
  constructor(left, right) {
    this.left  = left
    this.right = right
  }

  inspect() {
    let left  = inspect(this.left),
        right = inspect(this.right)

    return `(${left} . ${right})`
  }
}

Then we come to the machinery for running programs, the first of which is a state. A state is a mapping from variables to values. States are immutable; every method on states will return a new state rather than mutating the receiver.

We construct a state from a list of the variables it contains, and a map that stores the values those variables are bound to.

class State {
  constructor(vars = [], values = new Map()) {
    this._vars   = vars
    this._values = values
  }

  inspect() {
    let pairs = Array.from(this._values).map((pair) => {
      return pair.map(inspect).join(' = ')
    })

    return `{ state ${inspect(this._vars)} ${pairs.join(', ')} }`
  }

  // ...

States can be changed in two ways. First, we can add variables to them; create_vars takes a list of names and turns them into a list of variables, and returns those variables plus a new state containing the current variables plus the new ones. Second, we can bind those variables to values; assign takes a variable and a value, it copies the current map of values, adds a new mapping to it, and returns a new state containing that mapping.

  create_vars(...names) {
    let new_vars = names.map((name) => new Variable(name))

    return [
      new State([...this._vars, ...new_vars], this._values),
      new_vars
    ]
  }

  assign(variable, value) {
    let new_values = new Map(Array.from(this._values))
    new_values.set(variable, value)

    return new State(this._vars, new_values)
  }

For example, if we make a new state and create a variable called x, then assign x to 42, we get a state in which x is bound to 42.

let [s, [x]] = new State().create_vars('x')

s.assign(x, 42)
// -> { state [x] x = 42 }

Next, State has operations built on these methods. First, walk essentially looks up the value of a variable in a state, including resolving chains of variables that refer to each other. If x is bound to y, and y to 3, then walking x gives 3. However, if x is not bound to anything, then walking x returns x itself. Walking a pair returns a new pair by walking both sides. The purpose of this method is to find out what a variable ultimately refers to, or to find out it’s bound to nothing and can be given a value.

  walk(term) {
    if (this._values.has(term)) {
      return this.walk(this._values.get(term))
    } else if (term instanceof Pair) {
      return new Pair(this.walk(term.left), this.walk(term.right))
    } else {
      return term
    }
  }

Then there is unify. This method simply asserts that two things are equal, and tries to return a state in which this is so. It walks the two inputs, and if they are equal, it returns the current state. If one is a variable after walking, which means it is not yet bound to anything, then we use assign to return a new state in which the value is bound to the other input. If both values are pairs, then we unify each side of them. If no way to unify the values can be found, we return null.

  unify(x, y) {
    [x, y] = [this.walk(x), this.walk(y)]

    if (x === y) return this

    if (x instanceof Variable) return this.assign(x, y)
    if (y instanceof Variable) return this.assign(y, x)

    if (x instanceof Pair && y instanceof Pair) {
      let state = this.unify(x.left, y.left)
      return state && state.unify(x.right, y.right)
    }

    return null
  }

For example, let’s make a state and two variables, x and y. If we unify x and 3, then unify y and x, then we get a state in which x and y both equal 3. If we unify a pair of 3 and x with a pair of y and a pair of 5 and y, then we get a state in which y is 3 (by the left-hand side), and x is a pair of 5 and 3 (by the right-hand side and the value of y).

let [s, [x, y]] = new State().create_vars('x', 'y')

s.unify(x, 3).unify(y, x)
// -> { state [x, y], x = 3, y = 3 }

s.unify(new Pair(3, x),
        new Pair(y, new Pair(5, y)))

// -> { state [x, y] y = 3, x = (5 . 3) }

The final elements of the language are four goals. A goal is a function that takes a State and returns a list of states; specifically it returns the set of states in which some condition is true, subject to the contraints imposed by the input state. There are four core functions for generating goals.

The first is called equal. It takes two values and returns a function that tries to unify the two values in the current state. If we get a state, we return it in a list, otherwise we return an empty list. For example, if we make a variable and create a goal where it’s equal to 1, when we call that goal with a state we get a state in which x is 1.

function equal(a, b) {
  return (state) => {
    state = state.unify(a, b)
    return state ? [state] : []
  }
}

// e.g.

let [s, [x]] = new State().create_vars('x'),
    goal     = equal(x, 1)

goal(s) // -> [ { state [x] x = 1 } ]

The second is called bind. It will be convenient to package up process of creating variables and turning them into a goal, into a single expression, and that’s what bind does. It takes a list of names, and a function that generates a goal. It converts the names into variables, and passes them to the given function to get a goal. Then it calls the goal with the new state. So, if we use bind to create two variables and return a goal where they are equal, then calling that goal returns a state in which they are equal.

function bind(names, build_goal) {
  return (state) => {
    let [new_state, vars] = state.create_vars(...names)
    let goal = build_goal(...vars)

    return goal(new_state)
  }
}

// e.g.

let goal = bind(['x', 'y'], (x, y) => equal(x, y))

goal(new State())
// -> [ { state [x, y] x = y } ]

either combines two goals and creates a goal that satisfies either of them. It simply concatenates all the states in which goal a is satisfied with all the states in which goal b is satisfied. If we ask for a goal where either x is 2 or x is 3, then we get one state where x is 2 and one where x is 3.

function either(a, b) {
  return (state) => {
    return [...a(state), ...b(state)]
  }
}

// e.g.

let goal = bind(['x'], (x) =>
             either(equal(x, 2), equal(x, 3)))

goal(new State())
// -> [ { state [x] x = 2 }, { state [x] x = 3 } ]

Finally, both gets all the states that satisfy goal a, then calls goal b in each of those states and combines the results, so it generates the states in which both goals a and b are satisfied.

function both(a, b) {
  return (state) => {
    return a(state).reduce((states, a_state) => {
      return [...states, ...b(a_state)]
    }, [])
  }
}

// e.g.

let goal = bind(['x', 'y'], (x, y) =>
             both(equal(x, 8), equal(y, 9)))

goal(new State())
// -> [ { state [x, y] x = 8, y = 9 } ]

All these goals can be composed. For example if we ask for a goal that both x is either red or blue, and y is yellow, then we get one state in which x is red and y is yellow and one in which x is blue and y is yellow.

let goal = bind(['x', 'y'], (x, y) =>
             both(
               either(equal(x, 'red'), equal(x, 'blue')),
               equal(y, 'yellow')))

goal(new State())

// -> [
//      { state [x, y] x = 'red', y = 'yellow' },
//      { state [x, y] x = 'blue', y = 'yellow' }
//    ]

Now, let’s write something interesting with this language. We can convert JavaScript arrays into recursive list we’ve seen from Haskell: a list is either null or a pair of a value and a list. We’ll create some functions for creating such structures from arrays and strings, and some functions for converting from lists back to arrays and strings.

const NULL = {}

function from_array(array) {
  if (array.length === 0)
    return NULL
  else
    return new Pair(array[0], from_array(array.slice(1)))
}

function to_array(pair) {
  if (pair === NULL)
    return []
  else
    return [pair.left].concat(to_array(pair.right))
}

function from_string(string) {
  return from_array(string.split(''))
}

function to_string(pair) {
  return to_array(pair).join('')
}

Here’s a list operation from Haskell: if we append an empty list with any other list, then the result is just the second list. Otherwise, we pair the first element of the first list with the result of appending the rest of the first list to the second.

append          :: [a] -> [a] -> [a]
append [] y     =  y
append (x:xs) y =  x : append xs y

-- e.g.   append [1, 2, 3] [4, 5]
--      = 1 : append [2, 3] [4, 5]
--      = 1 : (2 : append [3] [4, 5])
--      = 1 : (2 : (3 : append [] [4, 5]))
--      = 1 : (2 : (3 : [4, 5]))
--      = [1, 2, 3, 4, 5]

To express the same thing in this language, we describe append as a relationship between three values, rather than a function that takes two arguments and returns one. append is defined as follows: either x is empty and y equals z, or x and z have the same first element, and appending the rest of x with y gives the rest of z; the repeated use of head here means everywhere that variable appears must have the same value. This contains exactly the same information as the Haskell version, it’s just expressed differently.

function append(x, y, z) {
  return bind(['head', 'xs', 'zs'], (head, xs, zs) =>
    either(
      both(
        equal(x, NULL),
        equal(y, z)),
      both(
        both(
          equal(x, new Pair(head, xs)),
          equal(z, new Pair(head, zs))),
        append(xs, y, zs))))
}

Let’s see what this function can do. First, we can say that both x is equal to rec and y is equal to urse, and that appending x and y gives z. As we’re used to in conventional languages, the fact that z is recurse comes out the other side. The results function just grabs the first three variables from the state and resolves their values.

let goal = bind(['x', 'y', 'z'], (x, y, z) =>
  both(
    both(
      equal(x, from_string('rec')),
      equal(y, from_string('urse'))),
    append(x, y, z)))

goal(new State()).map(s => s.results(3).map(to_string))
// -> [ [ 'rec', 'urse', 'recurse' ] ]

Now, I said at the beginning that I was going to show you a language with time travel. And, this append function can be run backwards. We can remove the assignment to y and instead define z, and it will figure out that y has the value urse. It is as though time has been reversed, and we can supply the return value of a function and get its arguments.

let goal = bind(['x', 'y', 'z'], (x, y, z) =>
  both(
    both(
      equal(x, from_string('rec')),
      equal(z, from_string('recurse'))),
    append(x, y, z)))

goal(new State()).map(s => s.results(3).map(to_string))
// -> [ [ 'rec', 'urse', 'recurse' ] ]

We can use either to define multiple possible values for z, and append will dutifully find the right y value for each one.

let goal = bind(['x', 'y', 'z'], (x, y, z) =>
  both(
    both(
      equal(x, from_string('rec')),
      either(
        equal(z, from_string('recurse')),
        equal(z, from_string('reciprocal')))),
    append(x, y, z)))

goal(new State()).map(s => s.results(3).map(to_string))
// -> [
//      [ 'rec', 'urse', 'recurse' ],
//      [ 'rec', 'iprocal', 'reciprocal' ]
//    ]

Even more spooky: we can leave both x and y undefined and only define z, asserting merely that appending x and y gives recurse. In this event, it figures out all the possible values of x and y that can lead to the result. Not only can we reverse time, we can walk back through every possible history given the facts we have about the present.

let goal = bind(['x', 'y'], (x, y) =>
  append(x, y, from_string('recurse')))

goal(new State()).map(s => s.results(2).map(to_string))

// -> [
//      [ '', 'recurse' ],
//      [ 'r', 'ecurse' ],
//      [ 're', 'curse' ],
//      [ 'rec', 'urse' ],
//      [ 'recu', 'rse' ],
//      [ 'recur', 'se' ],
//      [ 'recurs', 'e' ],
//      [ 'recurse', '' ]
//    ]

But of course, time is not running backwards. What has really happened is that we have banished time entirely from our program. Motion, just as Zeno said, is inherently impossible. What we’ve done is describe append in such a way that the computer seems to understand what it means. But it doesn’t really, all this language can do is follow assertions that structures are equal. It is the expression of computations as structures, rather than as processes, that lets it do these tricks.

The language we’ve just implemented is called μKanren. This particular implementation is derived from Tom Stuart’s excellent talk, Hello, declarative world. It is a member of the family of logic programming languages, which includes Prolog and Clojure’s core.logic. In part 1 we talked about proving things about programs by using equations between a function and its definition, and this sometimes required working “backwards”, replacing a definition with its name. Logic programs express relationships in a way that means the computer can carry this kind of proof process, and logic languages are excellent vehicles for doing formal proofs of program semantics.

Logic programming also plays an important role in type systems. Haskell’s type system lets it pick a function implementation based on the requested return type, rather than just on the argument types. For example, it has a read function that tries to parse a string as the requested type:

> read "42" :: Int
42

> read "42" :: Bool
*** Exception: Prelude.read: no parse

> read "True" :: Bool
True

> read "True" :: Int
*** Exception: Prelude.read: no parse

> read "[1, 2, 3]" :: [Int]
[1,2,3]

Often the type can be inferred from context, for example the !! operator takes an Int on the right, and so this figures out that read needs to return an Int:

[4, 5, 6] !! (read "1")

In Generic returns in Rust we saw how Rust does a similar thing, selecting a version of collect() to use based on the returned collection type. This mechanism of going “backwards” from outputs to inputs is a key feature of logic programming, and it comes from expressing compution as recursive equations between structures, rather than as procedural loops.

This style of programming is creeping in in surprising places, for example Yarn constraints are written in Prolog. Learning to use recursion effectively is essential in logic programming, and once you get the hang of it, it enables all sorts of other ways to think about and run programs.