Hilbert’s rusty hotel

A couple of years ago, I created a programming language. It was an attempt to make an extremely minimal logic programming system, with almost no built-in functionality other than the ability to do first-order logic, for the purposes of following the formalism in some more maths-driven textbooks. Recently, having learned Rust, I thought it would be a good exercise to port this language, originally written in Ruby, as I believe building it in Rust will result it significant performance wins and help make it more usable.

Though I’ve been making good progress with it, a few things have been surprisingly hard. One of the problems I’ve run into is that, while Rust’s iterators are highly expressive, performant and lazy, which helps substantially in this problem space, they run into problems when you try to model infinite sets with them. It turns out this is a significant problem in logic programming, and so I needed to find a good solution. So, this is a story about maths, infinities, and how computers deal with them, but first, we need to understand the problem logic programming languages try to solve.

Logical rules become tricky to deal with when they contain recursive references to themselves. As an example, here’s a definition of the subtyping relation, as given in Types and Programming Languages (TAPL). The expression A <: B means that A is a subtype of B. One important property that this relation has is that it’s reflexive, which means that any type is a subtype of itself for the purposes of a type-checker. In the language I’m building, this rule is written as follows:

rule S-Refl { $T <: $T }

S-Refl is the name that TAPL gives this rule, and the repeated variable $T means this rule matches statements where the expressions on either side of the <: symbol are the same.

Second, the relation is transitive, which means that if some type S is a subtype of U, and that type in turn is a subtype of T, then we can say that S is a subtype of T. The rule below states that if S <: U, and U <: T, then S <: T, which is called S-Trans in TAPL.

rule S-Trans {
  $S <: $U / $U <: $T
  -------------------
        $S <: $T
}

These two rules don’t have any idea what S, T or U are, they simply give constraints that two types must obey in order to be considered subtypes. We’ll now add some concrete facts to these rules, stating that gerbils are a subtype of rodents, rodents a subtype of mammals, and mammals a subtype of animals.

rule S-Gerbil { Gerbil <: Rodent }
rule S-Rodent { Rodent <: Mammal }
rule S-Mammal { Mammal <: Animal }

Now, it ought to be possible to use these rules to generate a proof that gerbils are a subtype of animals. Indeed by combining the above definitions, we can construct such a proof:

                            ──────────────── S-Rodent   ──────────────── S-Mammal
                            Rodent <: Mammal            Mammal <: Animal
──────────────── S-Gerbil   ──────────────────────────────────────────── S-Trans
Gerbil <: Rodent                          Rodent <: Animal
────────────────────────────────────────────────────────── S-Trans
                     Gerbil <: Animal

The name to the right of each horizontal line indicates which rule produced that part of the proof; the statements above the line are the rule’s premises (if any) and the expression below the line is its conclusion. For example, the last two lines of this proof can be read as “whereas Gerbil <: Rodent, and Rodent <: Animal, therefore by the rule S-Trans we infer that Gerbil <: Animal”.

A popular language for doing this kind of programming is called Prolog. In Prolog, the above logical rules can be directly transcribed as follows.

% S-Refl
subtype(T, T).

% S-Trans
subtype(S, T) :- subtype(S, U), subtype(U, T).

subtype(gerbil, rodent).
subtype(rodent, mammal).
subtype(mammal, animal).

Now, let’s say we want to find all the things of which gerbil is a subtype; we’re expecting the results gerbil, rodent, mammal and animal. To do this computation, we give Prolog a query:

?- subtype(gerbil, X).

The Prolog interpreter will try to find all values of X that make this a true statement, that is, a statement matching the rules it’s been given. It starts with the first rule: subtype(T, T). It compares the expressions subtype(gerbil, X) and subtype(T, T), and figures out what must be true in order for them to be equal. For this to be the case, the sub-expressions inside the parentheses must be equal, and this tells us that T = gerbil and T = X, therefore X = gerbil. And so it’s found our first solution to subtype(gerbil, X).

Now, there are other results that can be found, by examining the other rules. The next rule in the program (S-Trans) begins with subtype(S, T), and matching this against the query subtype(gerbil, X) tells us S = gerbil and T = X. Having matched the start of the rule, we try to match its first premise, subtype(S, U). We know S = gerbil, so we now need to look for rules that match subtype(gerbil, U). But this is just the query we started with! So, from the first rule, we determine that U = gerbil. Similarly, matching the second premise subtype(U, T), where we now know that U = gerbil, gives us the same problem yet again and tells us that T = gerbil. And so, this again tells us that subtype(gerbil, gerbil), with the following proof:

──────────────── S-Refl   ──────────────── S-Refl
Gerbil <: Gerbil          Gerbil <: Gerbil
────────────────────────────────────────── S-Trans
             Gerbil <: Gerbil

Can we generate any more results? In Prolog, the answer is unfortunately that we cannot. Prolog’s search for results is depth-first; it will generate all the possible results of looking at the S-Trans rule, before moving on to any others. So far it’s generated one result, by using S-Refl to match both the premises, and next it will try to generate more results for the first premise, subtype(S, U), to see if they produce valid results. Well, having exhausted the possibilities of S-Refl, it moves onto S-Trans, and so S-Trans is now recursively calling itself. This loop will continue indefinitely, so that Prolog returns an infinite stream of X = gerbil for our initial query. The next proof it generates will be:

──────────────── S-Refl   ──────────────── S-Refl
Gerbil <: Gerbil          Gerbil <: Gerbil
────────────────────────────────────────── S-Trans   ──────────────── S-Refl
             Gerbil <: Gerbil                        Gerbil <: Gerbil
             ──────────────────────────────────────────────────────── S-Trans
                                 Gerbil <: Gerbil

After that it will just continue to infinitely recurse into the leftmost branch of this tree, never exploring other parts of the search space.

──────────────── S-Refl   ──────────────── S-Refl
Gerbil <: Gerbil          Gerbil <: Gerbil
────────────────────────────────────────── S-Trans   ──────────────── S-Refl
             Gerbil <: Gerbil                        Gerbil <: Gerbil
             ──────────────────────────────────────────────────────── S-Trans   ──────────────── S-Refl
                                 Gerbil <: Gerbil                               Gerbil <: Gerbil
                                 ──────────────────────────────────────────────────────────────── S-Trans
                                                        Gerbil <: Gerbil

The core problem here is that there are various rules that could produce matches for our query, but because one of them produces an infinite stream of results, we never visit the later rules and generate more interesting results. In order to reach those results, we can approach the problem breadth-first by generating one result for each rule, before looking for further results that any of the rules produce. This is the approach taken by the μKanren language. In this system, the first three proofs of subtype(gerbil, X) would be:

1.  ──────────────── S-Refl
    Gerbil <: Gerbil


2.  ──────────────── S-Refl   ──────────────── S-Refl
    Gerbil <: Gerbil          Gerbil <: Gerbil
    ────────────────────────────────────────── S-Trans
                 Gerbil <: Gerbil


3.  ──────────────── S-Gerbil
    Gerbil <: Rodent

We’ve already generated a result we couldn’t reach before: since we didn’t get stuck looking at results from S-Trans, we can visit S-Gerbil and later rules to see which ones match our query. Only after this first pass do we return to any of the rules and see if they produce more matches. S-Refl and S-Gerbil produce one result each, since they have no premises, but S-Trans will produce more. We can imagine using the above three results in place of this rule’s first premise, subtype(S, U) where S = gerbil. Using result 1 actually generates result 2 above, but using result 2 gives us the next proof:

4.  ──────────────── S-Refl   ──────────────── S-Refl
    Gerbil <: Gerbil          Gerbil <: Gerbil
    ────────────────────────────────────────── S-Trans   ──────────────── S-Refl
                 Gerbil <: Gerbil                        Gerbil <: Gerbil
                 ──────────────────────────────────────────────────────── S-Trans
                                     Gerbil <: Gerbil

For each choice of the first premise, we also want to visit different choices for the second premise, and the next result we generate combines S-Refl for the first premise with S-Trans for the second:

5.                            ──────────────── S-Refl   ──────────────── S-Refl
                              Gerbil <: Gerbil          Gerbil <: Gerbil
    ──────────────── S-Refl   ────────────────────────────────────────── S-Trans
    Gerbil <: Gerbil                       Gerbil <: Gerbil
    ─────────────────────────────────────────────────────── S-Trans
                       Gerbil <: Gerbil

We can also use S-Gerbil for the first premise and S-Relf for the second:

6.  ──────────────── S-Gerbil   ──────────────── S-Refl
    Gerbil <: Rodent            Rodent <: Rodent
    ──────────────────────────────────────────── S-Trans
                  Gerbil <: Rodent

These proofs aren’t generating any new conclusions yet, but they will eventually, as we’re exploring more branches of the search space than Prolog does when it recurses off down the left branch indefinitely. After generating more combinations of the above proofs, we’ll eventually generate this, where S-Trans uses two of the later rules for its premises:

    ──────────────── S-Gerbil   ──────────────── S-Rodent
    Gerbil <: Rodent            Rodent <: Mammal
    ──────────────────────────────────────────── S-Trans
                  Gerbil <: Mammal

Given enough time, the whole search space will be explored, and it’s simply a matter of waiting for a result you’re interested in, rather than never being able to generate it.

This problem of enumerating infinite sets is illustrated by the thought experiment known as Hilbert’s hotel. It describes an imaginary hotel that has an infinite number of rooms. One day, an infinite number of buses, each carrying an infinite number of passengers arrive at the hotel, and all the passengers need a place to stay the night. The problem is how to make sure all the visitors are given a room.

Surely, after the first infinite busload of visitors disembark and are allocated rooms, the hotel is now full, and all the other buses must be turned away. This is equivalent to following only the infinite stream of results from S-Trans and never visiting any of the other rules. Let’s illustrate this situation by representing the rooms as an array of boxes, and the number in each box identifies which bus, and which seat on that bus, the guest came from.

+-----+-----+-----+-----+-----+ ...
| 1.1 | 1.2 | 1.3 | 1.4 | 1.5 |
+-----+-----+-----+-----+-----+ ...

However, it turns out we can accommodate at least one more busload of guests. Once the hotel is “full” of the first bus’s passengers, we ask all those guests to move into the room whose number is double the number of their current room. The guest in room 1 moves to room 2, the one in room 2 to room 4, and so on. The hotel being infinite, it’s possible to move all the guests in this way and still leave all the odd-numbered rooms empty.

+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
|     | 1.1 |     | 1.2 |     | 1.3 |     | 1.4 |     | 1.5 |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...

The next busload of visitors can then move into the odd-numbered rooms, of which there is an infinite quantity, and the hotel is again full.

+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
| 2.1 | 1.1 | 2.2 | 1.2 | 2.3 | 1.3 | 2.4 | 1.4 | 2.5 | 1.5 |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...

What about all the other infinite buses waiting on the infinite driveway to unload their passengers? Well for each bus that arrives we can repeat this process, moving all the existing guests into the even-numbered rooms and letting the next bus fill up the odd numbers. And so, the infinite queue of infinitely large buses can all be accommodated.

+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
|     | 2.1 |     | 1.1 |     | 2.2 |     | 1.2 |     | 2.3 |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...

+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
| 3.1 | 2.1 | 3.2 | 1.1 | 3.3 | 2.2 | 3.4 | 1.2 | 3.5 | 2.3 |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...

This is a neat illustration of the paradoxes of infinity: that there are the same number of even (or odd) numbers as there are integers, and that an infinite set of infinite sets (the buses and their passengers) is no larger than a single infinite set (the hotel’s rooms). But it also gives us a clue to solve our infinite search problem.

A logic interpreter works by trying to match a given query against the conclusions of all the known rules. Matching a pair of expressions generates a state, a set of bindings recording the variables that were found in the expressions and what they’ve been bound to. For example, matching subtype(gerbil, X) against subtype(T, T), with no prior state, generates the state { T = gerbil, X = gerbil }. For each matching rule, we then try to match the rule’s first premise, in the state generated by the conclusion. This match may generate an infinite number of results, and thus an infinite stream of matching states. In our example above, we saw that the first premise of S-Trans, subtype(S, U), generated such a stream.

The search proceeds by matching the next premise in each of these states, in principle generating an infinite stream of infinite streams of states. For each state generated by the premise subtype(S, U), the next premise subtype(U, T) may also generate an infinite stream of states. And so the following premise, if there is one, needs to be checked in any of an infinite set of infinite sets of states, exactly the problem illustrated by Hilbert’s hotel.

Prolog gets stuck because it only looks at the set generated by the first premise, which never ends, and so it never moves on to consider the state space generated by the other premises, or other rules. It only lets the passengers from the first bus into the hotel. It works depth-first, whereas we need a breadth-first search to have any hope of reaching other states. However, it’s not as simple as that; a naive breadth-first search of a stream of streams would mean taking the first item from each stream, then the second, and so on. But if the outer stream is infinite, we can never finish taking the first elements of the inner streams, and the second elements are never visited. The first passenger on each bus gets a room, and everyone else must sleep in their seats.

To solve the problem, we need to interleave the sets. That’s what we were doing when we moved the existing guests into the even-numbered rooms in order to let the next busload into the hotel. Now, before looking at how to do this in Rust, it’s instructive to look at how it’s done in Haskell, which is probably the easiest language in which to implement this. Whereas Rust has lazy iterators, Haskell has lazy everything; all parts of Haskell programs are only evaluated when strictly necessary to produce some output. And like Rust, Haskell has infinite iterators: the notation [1..] means an infinite list of all the integers, starting from 1. We can’t print this whole list, but just like in Rust we can use take to get only the first N elements of it – take 5 [1..] gives the finite list [1, 2, 3, 4, 5].

Using map, we can generate an infinite list of infinite lists. map (\n -> [(100 * n)..]) [0..] is equivalent to the Rust expression (0..).map(|n| (100 * n)..), and produces an infinite list of the ranges [0..], [100..], [200..], and so on.

First we’ll look at two ways of accommodating the guests that don’t work. One way is to concatenate all the buses’ sets of passengers, but, as the first bus is infinite, we never finish admitting its passengers and all the other buses remain blocked. Here that’s demonstrated by only getting elements from the first range in the output, equivalent to a depth-first search.

let buses = map (\n -> [(100 * n)..]) [0..] in
  print (take 32 (concat buses))

-- -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
--     17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]

The other simple approach is to use a breadth-first search: take the first passenger from each bus, then the second, and so on. However, as there are an infinite number of buses, we never complete the first iteration, and we only get the first member of each set. In Haskell, this is equivalent to concat (transpose buses).

let buses = map (\n -> [(100 * n)..]) [0..] in
  print (take 32 (concat (transpose buses)))

-- -> [0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200,
--     1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300,
--     2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100]

To properly solve the problem, we need a function for interleaving two lists. In Haskell, lists can be destructured into their first element (head) and the rest of the list (tail) using the : operator, so (x:xs) pattern-matches a list of at least one element, binding that element to x and the rest of the list to xs. interleave takes two lists of any type ([a]) and returns another list. If the first list is empty, then the result of interleaving it with the second is just the second list. If the first list has at least one element, then the result is that element, followed by the result of interleaving the second list with the rest of the first – swapping the arguments is what achieves the interleaving effect.

interleave :: [a] -> [a] -> [a]
interleave []     ys = ys
interleave (x:xs) ys = x : interleave ys xs

To flatten an infinite list of infinite lists ([[a]]) into a single list ([a]), we interleave the first inner list with the result of flattening the rest of the lists.

flatten :: [[a]] -> [a]
flatten []     = []
flatten (x:xs) = interleave x (flatten xs)

This is a tiny amount of code, but Haskell’s lazy evaluation means we can directly write the mathematical definition of what we’re doing, and it will compile and run and give a meaningful answer:

let buses = map (\n -> [(100 * n)..]) [0..] in
  print (take 32 (flatten buses))

-- -> [0, 100, 1, 200, 2, 101, 3, 300, 4, 102, 5, 201, 6, 103, 7, 400, 8,
--     104, 9, 202, 10, 105, 11, 301, 12, 106, 13, 203, 14, 107, 15, 500]

The numbers in the result are not evenly distributed; every other element is taken from the first set. But that’s not so important as the fact that we are visiting every input list, rather only the first list, or only getting the first member of each list. Given an infinite amount of time, all the input lists will be expanded – all the buses can unload all their passengers.

In Haskell, this is possible because of how laziness relies on recursive definitions. interleave is defined in terms of itself, and flatten is defined in terms of itself and interleave. And the particular recursive structure of the definitions is what makes them work. To make lazy lists work, all we need to be able to do is find the next element, without evaluating the entire list. So say we need to flatten this list:

buses = [[0..], [100..], [200..], ..]

The flattened version of buses is given by this expression:

flatten buses

Can we determine just the first element of this, without evaluating the entire result? Well, let’s look at the definition of flatten, which tells us that flatten (x:xs) = interleave x (flatten xs). Dropping buses into this equation gives:

flatten buses = interleave [0..] (flatten [[100..], [200..], ..])

Now we look at the definition of interleave, which says interleave (x:xs) ys = x : interleave ys xs. The infinite list [0..] destructures into 0 : [1..], and so applying this to the above gives:

flatten buses = 0 : interleave (flatten [[100..], [200..], ..]) [1..]

We now know the first element of the result: 0! Let’s try one more round of substitutions:

0 : interleave (flatten [[100..], [200..], ..]) [1..]

-- destructure [[100..], [200..], ..] into [100..] : [[200..], ..]
= 0 : interleave (flatten ([100..] : [[200..], ..])) [1..]

-- apply definition flatten (x:xs) = interleave x (flatten xs)
= 0 : interleave (interleave [100..] (flatten [[200..], ..])) [1..]

-- destructure [100..] into 100 : [101..]
= 0 : interleave (interleave (100 : [101..]) (flatten [[200..], ..])) [1..]

-- apply definition interleave (x:xs) ys = x : interleave ys xs (inner)
= 0 : interleave (100 : interleave (flatten [[200..], ..]) [101..]) [1..]

-- apply definition interleave (x:xs) ys = x : interleave ys xs (outer)
= 0 : 100 : interleave [1..] (interleave (flatten [[200..], ..]) [101..])

So now we know that the result begins [0, 100, ..], without having to evaluate any further. The take function essentially does this until it’s generated the requested number of results. We can see elements being popped off the front of the inner ranges, and “pulled” through the definitions of flatten and interleave as they pattern-match the first element of their inputs to produce a new expression.

If you’re familiar with Rust, this should remind you of something: the Iterator trait. This trait defines the next() method, which pulls values out of the iterator on demand, allowing iterators to be composed lazily. However, while Iterator itself has a lazy interface, Rust in general is eagerly evaluated, and that’s going to pose a challenge for our implementation. We’ll also incur a fair amount of boilerplate as we convert between various types, so we’ll need a lot more code than we did in Haskell. But, once we have the right iterators defined, we can compose them in powerful ways.

Looking at the Haskell evaluation above, it appears there are three fundamental iterators present that we need to turn into Rust equivalents.

  • Sources, i.e. the actual streams of values that we want to combine; in our Haskell values these are infinite lists of ints, but they could be anything that implements Iterator<Item = T>.

  • Interleaves, which take two (or more) streams and return the next element by alternating between the inputs.

  • Flattens, which take a stream of streams, and return the next element by interleaving the first element with the result of flattening the rest.

We’ll start by defining a type alias. Since Iterator is a trait and we’ll need to store values that implement it, we need to create trait objects by boxing things. A BoxIter is a boxed iterator of some element type T.

type BoxIter<'a, T> = Box<dyn Iterator<Item = T> + 'a>;

First, let’s define Interleave. Although in Haskell this was defined recursively, in Rust it’s possible to define using a loop; the struct contains a fixed and finite number of iterators, and a cursor that points to the next one it’s going to try to pull from.

struct Interleave<'a, T> {
    cursor: usize,
    iters: Vec<BoxIter<'a, T>>,
}

We can create one of these structs given a set of iterators. To accept as broad a range of inputs as possible, we’ll make Interleave::new accept any IntoIterator value, whose items are themselves IntoIterator values that yield T. The new() function turns the input into an iterator, maps its contents to boxed iterators, and collects the result into a Vec, which is stored in the returned Interleave struct.

impl<'a, T> Interleave<'a, T> {
    pub fn new<I, J>(iters: I) -> Interleave<'a, T>
    where
        I: IntoIterator<Item = J>,
        J: IntoIterator<Item = T> + 'a,
    {
        let iters: Vec<_> = iters
            .into_iter()
            .map(|s| Box::new(s.into_iter()) as BoxIter<'a, T>)
            .collect();

        let cursor = if iters.is_empty() { 0 } else { iters.len() - 1 };

        Interleave { cursor, iters }
    }
}

The Iterator implementation for this struct is relatively straightforward; its role is to pull items from each of the iterators in turn. So, starting from the current cursor index, we call next() on each inner iterator until one of them returns Some. If we try every iterator and none of them returns anything, then we return None.

impl<T> Iterator for Interleave<'_, T> {
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        for _ in 0..self.iters.len() {
            self.cursor = (self.cursor + 1) % self.iters.len();
            let item = self.iters[self.cursor].next();

            if item.is_some() {
                return item;
            }
        }
        None
    }
}

This is enough to produce breadth-first behaviour, i.e. picking the first passenger from each bus. Note that we need to restrict buses to a finite set here using take(), so that we can create an Interleave from them.

let buses = (0..).map(|n| (100 * n)..).take(1_000);
let interleaved: Vec<_> = Interleave::new(buses).take(32).collect();
println!("{:?}", interleaved);

// -> [0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200,
//     1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300,
//     2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100]

In contrast, Rust’s built-in flatten() method produces depth-first behaviour, taking elements only from the first stream and not from any of the others. Here buses can be an infinite iterator, and iterating it for a while shows that it only takes elements from the first range.

let buses = (0..).map(|n| (100 * n)..);
let flat: Vec<_> = buses.flatten().take(32).collect();
println!("{:?}", flat);

// -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
//     17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]

Now we can tackle Flatten. Let’s remind ourselves of the Haskell definition of this iterator:

flatten :: [[a]] -> [a]
flatten []     = []
flatten (x:xs) = interleave x (flatten xs)

Given some iterator (x:xs), it takes the first element from it and interleaves that with the result of flattening the rest of the iterator. And so, it might be tempting to try to implement it like this:

fn flatten<'a, I, J, T>(mut streams: I) -> Option<Interleave<'a, T>>
where
    I: Iterator<Item = J> + 'a,
    J: Iterator<Item = T> + 'a,
    T: 'a,
{
    if let (Some(head), Some(tail)) = (streams.next(), flatten(streams)) {
        Some(Interleave::new(vec![
            Box::new(head) as BoxIter<'a, T>,
            Box::new(tail) as BoxIter<'a, T>,
        ]))
    } else {
        None
    }
}

However, the unconditional recursive call to flatten() means that if streams is infinite, this function will cause a stack overflow. Rust iterators are lazy in that they call next() only when more elements are requested, but the code for constructing them is not lazy, it’s just ordinary Rust code. To get laziness, we need to defer the creation of the tail iterator here until we actually need to get the next item from it.

So, we need to create a distinct iterator type that uses a closure to defer the creation of that inner Interleave struct. That closure will be called as needed to generate a value, which we’ll denote by wrapping it in a Thunk, another type we’ll define shortly. The closure will return an Option<Interleave<T>>; it’s optional because we’ll only generate one if the stream we’re given has a first element.

struct Flatten<'a, T> {
    thunk: Thunk<'a, Option<Interleave<'a, T>>>,
}

impl<'a, T> Flatten<'_, T> {
    fn new(mut streams: BoxIter<'a, BoxIter<'a, T>>) -> Flatten<'a, T> {
        let thunk = Thunk::new(|| {
            streams.next().map(|head| {
                let tail = Box::new(Flatten::new(streams));
                Interleave::new(vec![head, tail])
            })
        });

        Flatten { thunk }
    }
}

Implementing Iterator for this struct requires getting a mutable reference to the value inside the Thunk, which we can do using the AsMut trait; this will return &mut Option<Interleave<T>>, and we can pattern-match on that to return the next item from the wrapped stream, if it exists.

impl<T> Iterator for Flatten<'_, T> {
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        if let Some(stream) = self.thunk.as_mut() {
            stream.next()
        } else {
            None
        }
    }
}

The trick that makes this all work is the Thunk type. Its job is to defer the creation of a value by wrapping a closure that generates it. We only want to generate the inner value once, and so the Thunk exists in one of two states: Pending, which contains a boxed FnOnce that returns a T, or Forced, which contains a T. The Option inside Pending is a trick that lets us move the closure out of the struct, which is required if we want to call it. Because it’s wrapping a non-nullable pointer, the Option does not require any space to be allocated, it’s just there to make the borrow checker happy.

enum Thunk<'a, T: 'a> {
    Pending(Option<Box<dyn 'a + FnOnce() -> T>>),
    Forced(T),
}

impl<'a, T> Thunk<'a, T> {
    fn new<F>(gen: F) -> Thunk<'a, T>
    where
        F: 'a + FnOnce() -> T,
    {
        Thunk::Pending(Some(Box::new(gen)))
    }
}

Implementing AsMut for Thunk requires us to return &mut T. To do this, we need to evaluate the closure if it’s not already been used; if the thunk is Pending, then we can take() the closure out and call it, generating a T, which we store as a Forced value. Once this is done, we can assume the thunk is in the Forced state, so we pattern-match the value out of it and return a mutable reference to it.

impl<T> AsMut<T> for Thunk<'_, T> {
    fn as_mut(&mut self) -> &mut T {
        if let Thunk::Pending(gen) = self {
            let gen = gen.take().unwrap();
            *self = Thunk::Forced(gen());
        }

        match self {
            Thunk::Forced(ref mut value) => value,
            _ => panic!(),
        }
    }
}

This interface lets us get a mutable reference to the iterator inside the thunk, and thus ask it for the next item. Finally, we can add a little syntactic sugar so that we can build a Flatten from any nested IntoIterator value – just as we did for Interleave, this requires converting the inputs into boxed iterators so we can pass them to the normal Flatten constructor.

impl<'a, T> Flatten<'a, T> {
    pub fn from_iter<I, J>(streams: I) -> Flatten<'a, T>
    where
        I: IntoIterator<Item = J> + 'a,
        J: IntoIterator<Item = T> + 'a,
    {
        let iters = streams
            .into_iter()
            .map(|s| Box::new(s.into_iter()) as BoxIter<'a, T>);

        Flatten::new(Box::new(iters))
    }
}

With these three structs defined – Interleave, Thunk and Flatten – we can convert our infinite list of infinite lists into a flattened form that pulls items from every iterator in the set:

let buses = (0..).map(|n| (100 * n)..);
let flat: Vec<_> = Flatten::from_iter(buses).take(32).collect();
println!("{:?}", flat);

// -> [0, 100, 1, 200, 2, 101, 3, 300, 4, 102, 5, 201, 6, 103, 7, 400, 8,
//     104, 9, 202, 10, 105, 11, 301, 12, 106, 13, 203, 14, 107, 15, 500]

And so we’ve reproduced the behaviour that took a handful of lines in Haskell. Now, maybe I’m using the wrong language for this project, but there’s also been a good deal of it that Rust’s lazy iterators made substantially easier. The thing to remember is that if something returns an iterator – think map, filter, or take – then it may be lazy. But anything that returns a non-iterator value must be eager. The place I tripped over this the most was when using fold, especially since I was using it to transform one iterator into another. I expected the closure to be called lazily, but it wasn’t, and that forced the creation of all of the above. But once you have the building blocks you need, Rust’s iterators are remarkably useful and compose wonderfully, giving you expressive code without the overhead of many nested loops as you’d get if they were eager.