A couple of years ago, I created a programming language. It was an attempt to make an extremely minimal logic programming system, with almost no built-in functionality other than the ability to do first-order logic, for the purposes of following the formalism in some more maths-driven textbooks. Recently, having learned Rust, I thought it would be a good exercise to port this language, originally written in Ruby, as I believe building it in Rust will result in significant performance wins and help make it more usable.
Though I’ve been making good progress with it, a few things have been surprisingly hard. One of the problems I’ve run into is that, while Rust’s iterators are highly expressive, performant and lazy, which helps substantially in this problem space, they break down when you try to model infinite sets with them. It turns out this is a significant problem in logic programming, and so I needed to find a good solution. So, this is a story about maths, infinities, and how computers deal with them, but first, we need to understand the problem logic programming languages try to solve.
Logical rules become tricky to deal with when they contain recursive references
to themselves. As an example, here’s a definition of the subtyping relation,
as given in Types and Programming Languages (TAPL). The expression A <: B
means that A is a subtype of B. One important property that this relation has is
that it’s reflexive, which means that any type is a subtype of itself for the
purposes of a type-checker. In the language I’m building, this rule is written
as follows:
rule S-Refl { $T <: $T }
S-Refl is the name that TAPL gives this rule, and the repeated variable $T means
this rule matches statements where the expressions on either side of the <:
symbol are the same.
Second, the relation is transitive, which means that if some type S is a
subtype of U, and that type in turn is a subtype of T, then we can say that S
is a subtype of T. The rule below states that if S <: U, and U <: T, then
S <: T, which is called S-Trans in TAPL.
rule S-Trans {
  $S <: $U / $U <: $T
  -------------------
  $S <: $T
}
These two rules don’t have any idea what S, T or U are; they simply give
constraints that two types must obey in order to be considered subtypes. We’ll
now add some concrete facts to these rules, stating that gerbils are a subtype
of rodents, rodents a subtype of mammals, and mammals a subtype of animals.
rule S-Gerbil { Gerbil <: Rodent }
rule S-Rodent { Rodent <: Mammal }
rule S-Mammal { Mammal <: Animal }
Now, it ought to be possible to use these rules to generate a proof that gerbils are a subtype of animals. Indeed by combining the above definitions, we can construct such a proof:
                          ──────────────── S-Rodent ──────────────── S-Mammal
                          Rodent <: Mammal          Mammal <: Animal
──────────────── S-Gerbil ────────────────────────────────────────── S-Trans
Gerbil <: Rodent          Rodent <: Animal
────────────────────────────────────────── S-Trans
Gerbil <: Animal
The name to the right of each horizontal line indicates which rule produced that
part of the proof; the statements above the line are the rule’s premises (if
any) and the expression below the line is its conclusion. For example, the last
two lines of this proof can be read as “whereas Gerbil <: Rodent, and
Rodent <: Animal, therefore by the rule S-Trans we infer that
Gerbil <: Animal”.
A popular language for doing this kind of programming is called Prolog. In Prolog, the above logical rules can be directly transcribed as follows.
% S-Refl
subtype(T, T).
% S-Trans
subtype(S, T) :- subtype(S, U), subtype(U, T).
subtype(gerbil, rodent).
subtype(rodent, mammal).
subtype(mammal, animal).
Now, let’s say we want to find all the things of which gerbil is a subtype;
we’re expecting the results gerbil, rodent, mammal and animal. To do this
computation, we give Prolog a query:
?- subtype(gerbil, X).
The Prolog interpreter will try to find all values of X
that make this a true
statement, that is, a statement matching the rules it’s been given. It starts
with the first rule: subtype(T, T). It compares the expressions
subtype(gerbil, X) and subtype(T, T), and figures out what must be true in
order for them to be equal. For this to be the case, the sub-expressions
inside the parentheses must be equal, and this tells us that T = gerbil and
T = X, therefore X = gerbil. And so it’s found our first solution to
subtype(gerbil, X).
Now, there are other results that can be found by examining the other rules.
The next rule in the program (S-Trans) begins with subtype(S, T), and matching
this against the query subtype(gerbil, X) tells us S = gerbil and T = X.
Having matched the start of the rule, we try to match its first premise,
subtype(S, U). We know S = gerbil, so we now need to look for rules that match
subtype(gerbil, U). But this is just the query we started with! So, from the
first rule, we determine that U = gerbil. Similarly, matching the second
premise subtype(U, T), where we now know that U = gerbil, gives us the same
problem yet again and tells us that T = gerbil. And so, this again tells us
that subtype(gerbil, gerbil) holds, with the following proof:
──────────────── S-Refl ──────────────── S-Refl
Gerbil <: Gerbil        Gerbil <: Gerbil
──────────────────────────────────────── S-Trans
Gerbil <: Gerbil
Can we generate any more results? In Prolog, the answer is unfortunately that we
cannot. Prolog’s search for results is depth-first; it will generate all the
possible results of looking at the S-Trans rule, before moving on to any
others. So far it’s generated one result, by using S-Refl to match both the
premises, and next it will try to generate more results for the first premise,
subtype(S, U), to see if they produce valid results. Well, having exhausted
the possibilities of S-Refl, it moves on to S-Trans, and so S-Trans is now
recursively calling itself. This loop will continue indefinitely, so that
Prolog returns an infinite stream of X = gerbil for our initial query. The
next proof it generates will be:
──────────────── S-Refl ──────────────── S-Refl
Gerbil <: Gerbil        Gerbil <: Gerbil
──────────────────────────────────────── S-Trans ──────────────── S-Refl
Gerbil <: Gerbil                                 Gerbil <: Gerbil
───────────────────────────────────────────────────────────────── S-Trans
Gerbil <: Gerbil
After that it will just continue to infinitely recurse into the leftmost branch of this tree, never exploring other parts of the search space.
──────────────── S-Refl ──────────────── S-Refl
Gerbil <: Gerbil        Gerbil <: Gerbil
──────────────────────────────────────── S-Trans ──────────────── S-Refl
Gerbil <: Gerbil                                 Gerbil <: Gerbil
───────────────────────────────────────────────────────────────── S-Trans ──────────────── S-Refl
Gerbil <: Gerbil                                                          Gerbil <: Gerbil
────────────────────────────────────────────────────────────────────────────────────────── S-Trans
Gerbil <: Gerbil
The core problem here is that there are various rules that could produce matches
for our query, but because one of them produces an infinite stream of results,
we never visit the later rules and generate more interesting results. In order
to reach those results, we can approach the problem breadth-first by
generating one result for each rule, before looking for further results that any
of the rules produce. This is the approach taken by the μKanren language.
In this system, the first three proofs of subtype(gerbil, X) would be:
1. ──────────────── S-Refl
   Gerbil <: Gerbil
2. ──────────────── S-Refl ──────────────── S-Refl
   Gerbil <: Gerbil        Gerbil <: Gerbil
   ──────────────────────────────────────── S-Trans
   Gerbil <: Gerbil
3. ──────────────── S-Gerbil
   Gerbil <: Rodent
We’ve already generated a result we couldn’t reach before: since we didn’t get
stuck looking at results from S-Trans, we can visit S-Gerbil and later rules
to see which ones match our query. Only after this first pass do we return to
any of the rules and see if they produce more matches. S-Refl and S-Gerbil
produce one result each, since they have no premises, but S-Trans will produce
more. We can imagine using the above three results in place of this rule’s
first premise, subtype(S, U) where S = gerbil. Using result 1 actually
generates result 2 above, but using result 2 gives us the next proof:
4. ──────────────── S-Refl ──────────────── S-Refl
   Gerbil <: Gerbil        Gerbil <: Gerbil
   ──────────────────────────────────────── S-Trans ──────────────── S-Refl
   Gerbil <: Gerbil                                 Gerbil <: Gerbil
   ───────────────────────────────────────────────────────────────── S-Trans
   Gerbil <: Gerbil
For each choice of the first premise, we also want to visit different choices
for the second premise, and the next result we generate combines S-Refl for
the first premise with S-Trans for the second:
5.                         ──────────────── S-Refl ──────────────── S-Refl
                           Gerbil <: Gerbil        Gerbil <: Gerbil
   ──────────────── S-Refl ──────────────────────────────────────── S-Trans
   Gerbil <: Gerbil        Gerbil <: Gerbil
   ──────────────────────────────────────── S-Trans
   Gerbil <: Gerbil
We can also use S-Gerbil for the first premise and S-Refl for the second:
6. ──────────────── S-Gerbil ──────────────── S-Refl
   Gerbil <: Rodent          Rodent <: Rodent
   ────────────────────────────────────────── S-Trans
   Gerbil <: Rodent
These proofs aren’t generating any new conclusions yet, but they will
eventually, as we’re exploring more branches of the search space than Prolog
does when it recurses off down the left branch indefinitely. After generating
more combinations of the above proofs, we’ll eventually generate this, where
S-Trans uses two of the later rules for its premises:
──────────────── S-Gerbil ──────────────── S-Rodent
Gerbil <: Rodent          Rodent <: Mammal
────────────────────────────────────────── S-Trans
Gerbil <: Mammal
Given enough time, the whole search space will be explored, and it’s simply a matter of waiting for a result you’re interested in, rather than never being able to generate it.
This problem of enumerating infinite sets is illustrated by the thought experiment known as Hilbert’s hotel. It describes an imaginary hotel that has an infinite number of rooms. One day, an infinite number of buses, each carrying an infinite number of passengers, arrive at the hotel, and all the passengers need a place to stay the night. The problem is how to make sure all the visitors are given a room.
Surely, after the first infinite busload of visitors disembark and are allocated
rooms, the hotel is now full, and all the other buses must be turned away. This
is equivalent to following only the infinite stream of results from S-Trans
and never visiting any of the other rules. Let’s illustrate this situation by
representing the rooms as an array of boxes, and the number in each box
identifies which bus, and which seat on that bus, the guest came from.
+-----+-----+-----+-----+-----+ ...
| 1.1 | 1.2 | 1.3 | 1.4 | 1.5 |
+-----+-----+-----+-----+-----+ ...
However, it turns out we can accommodate at least one more busload of guests. Once the hotel is “full” of the first bus’s passengers, we ask all those guests to move into the room whose number is double the number of their current room. The guest in room 1 moves to room 2, the one in room 2 to room 4, and so on. The hotel being infinite, it’s possible to move all the guests in this way and still leave all the odd-numbered rooms empty.
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
|     | 1.1 |     | 1.2 |     | 1.3 |     | 1.4 |     | 1.5 |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
The next busload of visitors can then move into the odd-numbered rooms, of which there is an infinite quantity, and the hotel is again full.
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
| 2.1 | 1.1 | 2.2 | 1.2 | 2.3 | 1.3 | 2.4 | 1.4 | 2.5 | 1.5 |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
What about all the other infinite buses waiting on the infinite driveway to unload their passengers? Well for each bus that arrives we can repeat this process, moving all the existing guests into the even-numbered rooms and letting the next bus fill up the odd numbers. And so, the infinite queue of infinitely large buses can all be accommodated.
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
|     | 2.1 |     | 1.1 |     | 2.2 |     | 1.2 |     | 2.3 |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
| 3.1 | 2.1 | 3.2 | 1.1 | 3.3 | 2.2 | 3.4 | 1.2 | 3.5 | 2.3 |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+ ...
This is a neat illustration of the paradoxes of infinity: that there are the same number of even (or odd) numbers as there are integers, and that an infinite set of infinite sets (the buses and their passengers) is no larger than a single infinite set (the hotel’s rooms). But it also gives us a clue to solve our infinite search problem.
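To make that last claim concrete, here’s a minimal sketch in Rust (not the repeated shuffling described above, but one standard pairing): every positive integer factors uniquely as 2^bus * (2 * seat + 1), so each (bus, seat) pair can be assigned its own room, and every room receives exactly one guest.
fn room(bus: u32, seat: u64) -> u64 {
    // 2^bus is the even part of the room number, (2 * seat + 1) the odd part;
    // the factorisation is unique, so no two guests are sent to the same room.
    2u64.pow(bus) * (2 * seat + 1)
}
// The first few passengers of the first three buses (all 0-indexed):
for bus in 0..3 {
    let rooms: Vec<u64> = (0..5).map(|seat| room(bus, seat)).collect();
    println!("bus {}: rooms {:?}", bus, rooms);
}
// -> bus 0: rooms [1, 3, 5, 7, 9]
//    bus 1: rooms [2, 6, 10, 14, 18]
//    bus 2: rooms [4, 12, 20, 28, 36]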
A logic interpreter works by trying to match a given query against the
conclusions of all the known rules. Matching a pair of expressions generates a
state, a set of bindings recording the variables that were found in the
expressions and what they’ve been bound to. For example, matching
subtype(gerbil, X) against subtype(T, T), with no prior state, generates the
state { T = gerbil, X = gerbil }. For each matching rule, we then try to match
the rule’s first premise, in the state generated by matching the conclusion.
This match may generate an infinite number of results, and thus an infinite
stream of matching states. In our example above, we saw that the first premise
of S-Trans, subtype(S, U), generated such a stream.
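To make the idea of a state concrete, here’s a rough sketch of how matching could be modelled. The Term and State types below are illustrative, not the interpreter’s actual representation:
use std::collections::HashMap;

// Hypothetical, simplified terms: just variables and atoms, no compound expressions.
#[derive(Clone, Debug)]
enum Term {
    Var(String),
    Atom(String),
}

// A state is a set of bindings from variable names to terms.
type State = HashMap<String, Term>;

// Follow a variable through the bindings until we hit an atom or an unbound variable.
fn resolve(term: &Term, state: &State) -> Term {
    match term {
        Term::Var(name) => match state.get(name) {
            Some(bound) => resolve(bound, state),
            None => term.clone(),
        },
        _ => term.clone(),
    }
}

// Try to make two terms equal, returning an extended set of bindings if possible.
fn unify(a: &Term, b: &Term, state: &State) -> Option<State> {
    let (a, b) = (resolve(a, state), resolve(b, state));
    match (&a, &b) {
        (Term::Atom(x), Term::Atom(y)) if x == y => Some(state.clone()),
        (Term::Var(x), Term::Var(y)) if x == y => Some(state.clone()),
        (Term::Var(name), other) | (other, Term::Var(name)) => {
            let mut next = state.clone();
            next.insert(name.clone(), other.clone());
            Some(next)
        }
        _ => None,
    }
}
Matching subtype(gerbil, X) against subtype(T, T) then amounts to unifying the arguments pairwise:
let s0 = State::new();
let s1 = unify(&Term::Atom("gerbil".into()), &Term::Var("T".into()), &s0).unwrap();
let s2 = unify(&Term::Var("X".into()), &Term::Var("T".into()), &s1).unwrap();
println!("{:?}", s2); // -> roughly { "T": Atom("gerbil"), "X": Atom("gerbil") }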
The search proceeds by matching the next premise in each of these states, in
principle generating an infinite stream of infinite streams of states. For each
state generated by the premise subtype(S, U), the next premise subtype(U, T)
may also generate an infinite stream of states. And so the following premise,
if there is one, needs to be checked in each of an infinite set of infinite
sets of states, exactly the problem illustrated by Hilbert’s hotel.
Prolog gets stuck because it only looks at the set generated by the first premise, which never ends, and so it never moves on to consider the state space generated by the other premises, or other rules. It only lets the passengers from the first bus into the hotel. It works depth-first, whereas we need a breadth-first search to have any hope of reaching other states. However, it’s not as simple as that; a naive breadth-first search of a stream of streams would mean taking the first item from each stream, then the second, and so on. But if the outer stream is infinite, we can never finish taking the first elements of the inner streams, and the second elements are never visited. The first passenger on each bus gets a room, and everyone else must sleep in their seats.
To solve the problem, we need to interleave the sets. That’s what we were
doing when we moved the existing guests into the even-numbered rooms in order to
let the next busload into the hotel. Now, before looking at how to do this in
Rust, it’s instructive to look at how it’s done in Haskell, which is probably
the easiest language in which to implement this. Whereas Rust has lazy
iterators, Haskell has lazy everything; all parts of Haskell programs are only
evaluated when strictly necessary to produce some output. And like Rust, Haskell
has infinite iterators: the notation [1..] means an infinite list of all the
integers, starting from 1. We can’t print this whole list, but just like in
Rust we can use take to get only the first N elements of it – take 5 [1..]
gives the finite list [1, 2, 3, 4, 5].
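For reference, the Rust version of that last expression looks like this:
let first_five: Vec<u32> = (1..).take(5).collect();
println!("{:?}", first_five); // -> [1, 2, 3, 4, 5]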
Using map, we can generate an infinite list of infinite lists.
map (\n -> [(100 * n)..]) [0..] is equivalent to the Rust expression
(0..).map(|n| (100 * n)..), and produces an infinite list of the ranges
[0..], [100..], [200..], and so on.
First we’ll look at two ways of accommodating the guests that don’t work. One way is to concatenate all the buses’ sets of passengers, but, as the first bus is infinite, we never finish admitting its passengers and all the other buses remain blocked. Here that’s demonstrated by only getting elements from the first range in the output, equivalent to a depth-first search.
let buses = map (\n -> [(100 * n)..]) [0..] in
print (take 32 (concat buses))
-- -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
--     17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
The other simple approach is to use a breadth-first search: take the first
passenger from each bus, then the second, and so on. However, as there are an
infinite number of buses, we never complete the first iteration, and we only get
the first member of each set. In Haskell, this is equivalent to
concat (transpose buses).
let buses = map (\n -> [(100 * n)..]) [0..] in
print (take 32 (concat (transpose buses)))
-- -> [0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200,
--     1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300,
--     2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100]
To properly solve the problem, we need a function for interleaving two lists. In
Haskell, lists can be destructured into their first element (head) and the
rest of the list (tail) using the : operator, so (x:xs) pattern-matches a list
of at least one element, binding that element to x and the rest of the list to
xs. interleave takes two lists of any type ([a]) and returns another list. If
the first list is empty, then the result of interleaving it with the second is
just the second list. If the first list has at least one element, then the
result is that element, followed by the result of interleaving the second list
with the rest of the first – swapping the arguments is what achieves the
interleaving effect.
interleave :: [a] -> [a] -> [a]
interleave [] ys = ys
interleave (x:xs) ys = x : interleave ys xs
To flatten an infinite list of infinite lists ([[a]]) into a single list
([a]), we interleave the first inner list with the result of flattening the
rest of the lists.
flatten :: [[a]] -> [a]
flatten [] = []
flatten (x:xs) = interleave x (flatten xs)
This is a tiny amount of code, but Haskell’s lazy evaluation means we can directly write the mathematical definition of what we’re doing, and it will compile and run and give a meaningful answer:
let buses = map (\n -> [(100 * n)..]) [0..] in
print (take 32 (flatten buses))
-- -> [0, 100, 1, 200, 2, 101, 3, 300, 4, 102, 5, 201, 6, 103, 7, 400, 8,
--     104, 9, 202, 10, 105, 11, 301, 12, 106, 13, 203, 14, 107, 15, 500]
The numbers in the result are not evenly distributed; every other element is taken from the first set. But that’s not so important as the fact that we are visiting every input list, rather than only the first list, or only getting the first member of each list. Given an infinite amount of time, all the input lists will be expanded – all the buses can unload all their passengers.
In Haskell, this is possible because of how laziness interacts with recursive
definitions. interleave is defined in terms of itself, and flatten is defined
in terms of itself and interleave. And the particular recursive structure of
the definitions is what makes them work. To make lazy lists work, all we need
to be able to do is find the next element, without evaluating the entire list.
So say we need to flatten this list:
buses = [[0..], [100..], [200..], ..]
The flattened version of buses is given by this expression:
flatten buses
Can we determine just the first element of this, without evaluating the entire
result? Well, let’s look at the definition of flatten, which tells us that
flatten (x:xs) = interleave x (flatten xs). Dropping buses into this equation
gives:
flatten buses = interleave [0..] (flatten [[100..], [200..], ..])
Now we look at the definition of interleave, which says
interleave (x:xs) ys = x : interleave ys xs. The infinite list [0..]
destructures into 0 : [1..], and so applying this to the above gives:
flatten buses = 0 : interleave (flatten [[100..], [200..], ..]) [1..]
We now know the first element of the result: 0! Let’s try one more round of
substitutions:
0 : interleave (flatten [[100..], [200..], ..]) [1..]
-- destructure [[100..], [200..], ..] into [100..] : [[200..], ..]
= 0 : interleave (flatten ([100..] : [[200..], ..])) [1..]
-- apply definition flatten (x:xs) = interleave x (flatten xs)
= 0 : interleave (interleave [100..] (flatten [[200..], ..])) [1..]
-- destructure [100..] into 100 : [101..]
= 0 : interleave (interleave (100 : [101..]) (flatten [[200..], ..])) [1..]
-- apply definition interleave (x:xs) ys = x : interleave ys xs (inner)
= 0 : interleave (100 : interleave (flatten [[200..], ..]) [101..]) [1..]
-- apply definition interleave (x:xs) ys = x : interleave ys xs (outer)
= 0 : 100 : interleave [1..] (interleave (flatten [[200..], ..]) [101..])
So now we know that the result begins [0, 100, ..], without having to evaluate
any further. The take function essentially does this until it’s generated the
requested number of results. We can see elements being popped off the front of
the inner ranges, and “pulled” through the definitions of flatten and
interleave as they pattern-match the first element of their inputs to produce
a new expression.
If you’re familiar with Rust, this should remind you of something: the
Iterator trait. This trait defines the next() method, which pulls values out
of the iterator on demand, allowing iterators to be composed lazily. However,
while Iterator itself has a lazy interface, Rust in general is eagerly
evaluated, and that’s going to pose a challenge for our implementation. We’ll
also incur a fair amount of boilerplate as we convert between various types,
so we’ll need a lot more code than we did in Haskell. But, once we have the
right iterators defined, we can compose them in powerful ways.
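Before diving in, here’s a quick illustration of that eager/lazy split: building a chain of iterator adapters does no work by itself; the closure passed to map only runs as next() pulls values through it (here via collect()):
let iter = (0..3).map(|n| {
    println!("mapping {}", n);
    n * 10
});
println!("iterator built, nothing mapped yet");
let values: Vec<i32> = iter.collect();
println!("{:?}", values);
// -> iterator built, nothing mapped yet
//    mapping 0
//    mapping 1
//    mapping 2
//    [0, 10, 20]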
Looking at the Haskell evaluation above, it appears there are three fundamental iterators present that we need to turn into Rust equivalents.
- Sources, i.e. the actual streams of values that we want to combine; in our
  Haskell code these are infinite lists of ints, but they could be anything
  that implements Iterator<Item = T>.
- Interleaves, which take two (or more) streams and return the next element by
  alternating between the inputs.
- Flattens, which take a stream of streams, and return the next element by
  interleaving the first element with the result of flattening the rest.
We’ll start by defining a type alias. Since Iterator is a trait and we’ll need
to store values that implement it, we need to create trait objects by boxing
things. A BoxIter is a boxed iterator of some element type T.
type BoxIter<'a, T> = Box<dyn Iterator<Item = T> + 'a>;
First, let’s define Interleave. Although in Haskell this was defined
recursively, in Rust it’s possible to define it using a loop; the struct
contains a fixed and finite number of iterators, and a cursor that points to
the next one it’s going to try to pull from.
struct Interleave<'a, T> {
    cursor: usize,
    iters: Vec<BoxIter<'a, T>>,
}
We can create one of these structs given a set of iterators. To accept as broad
a range of inputs as possible, we’ll make Interleave::new accept any
IntoIterator value, whose items are themselves IntoIterator values that yield
T. The new() function turns the input into an iterator, maps its contents to
boxed iterators, and collects the result into a Vec, which is stored in the
returned Interleave struct.
impl<'a, T> Interleave<'a, T> {
    pub fn new<I, J>(iters: I) -> Interleave<'a, T>
    where
        I: IntoIterator<Item = J>,
        J: IntoIterator<Item = T> + 'a,
        J::IntoIter: 'a, // the produced iterators are boxed as 'a trait objects
    {
        let iters: Vec<_> = iters
            .into_iter()
            .map(|s| Box::new(s.into_iter()) as BoxIter<'a, T>)
            .collect();
        let cursor = if iters.is_empty() { 0 } else { iters.len() - 1 };
        Interleave { cursor, iters }
    }
}
The Iterator implementation for this struct is relatively straightforward; its
role is to pull items from each of the iterators in turn. So, starting from
the current cursor index, we call next() on each inner iterator until one of
them returns Some. If we try every iterator and none of them returns anything,
then we return None.
impl<T> Iterator for Interleave<'_, T> {
    type Item = T;
    fn next(&mut self) -> Option<Self::Item> {
        for _ in 0..self.iters.len() {
            self.cursor = (self.cursor + 1) % self.iters.len();
            let item = self.iters[self.cursor].next();
            if item.is_some() {
                return item;
            }
        }
        None
    }
}
This is enough to produce breadth-first behaviour, i.e. picking the first
passenger from each bus. Note that we need to restrict buses to a finite set
here using take(), so that we can create an Interleave from them.
let buses = (0..).map(|n| (100 * n)..).take(1_000);
let interleaved: Vec<_> = Interleave::new(buses).take(32).collect();
println!("{:?}", interleaved);
// -> [0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200,
//     1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300,
//     2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100]
In contrast, Rust’s built-in flatten() method produces depth-first behaviour,
taking elements only from the first stream and not from any of the others.
Here buses can be an infinite iterator, and iterating it for a while shows
that it only takes elements from the first range.
let buses = (0..).map(|n| (100 * n)..);
let flat: Vec<_> = buses.flatten().take(32).collect();
println!("{:?}", flat);
// -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
//     17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]
Now we can tackle Flatten. Let’s remind ourselves of the Haskell definition of
this iterator:
flatten :: [[a]] -> [a]
flatten [] = []
flatten (x:xs) = interleave x (flatten xs)
Given some iterator (x:xs), it takes the first element from it and interleaves
that with the result of flattening the rest of the iterator. And so, it might
be tempting to try to implement it like this:
fn flatten<'a, I, J, T>(mut streams: I) -> Option<Interleave<'a, T>>
where
    I: Iterator<Item = J> + 'a,
    J: Iterator<Item = T> + 'a,
    T: 'a,
{
    if let (Some(head), Some(tail)) = (streams.next(), flatten(streams)) {
        Some(Interleave::new(vec![
            Box::new(head) as BoxIter<'a, T>,
            Box::new(tail) as BoxIter<'a, T>,
        ]))
    } else {
        None
    }
}
However, the unconditional recursive call to flatten() means that if streams
is infinite, this function will cause a stack overflow. Rust iterators are
lazy in that they call next() only when more elements are requested, but the
code for constructing them is not lazy; it’s just ordinary Rust code. To get
laziness, we need to defer the creation of the tail iterator here until we
actually need to get the next item from it.
So, we need to create a distinct iterator type that uses a closure to defer the
creation of that inner Interleave struct. That closure will be called as
needed to generate a value, which we’ll denote by wrapping it in a Thunk,
another type we’ll define shortly. The closure will return an
Option<Interleave<T>>; it’s optional because we’ll only generate one if the
stream we’re given has a first element.
struct Flatten<'a, T> {
    thunk: Thunk<'a, Option<Interleave<'a, T>>>,
}
impl<'a, T> Flatten<'_, T> {
    fn new(mut streams: BoxIter<'a, BoxIter<'a, T>>) -> Flatten<'a, T> {
        let thunk = Thunk::new(|| {
            streams.next().map(|head| {
                let tail = Box::new(Flatten::new(streams));
                Interleave::new(vec![head, tail])
            })
        });
        Flatten { thunk }
    }
}
Implementing Iterator for this struct requires getting a mutable reference to
the value inside the Thunk, which we can do using the AsMut trait; this will
return &mut Option<Interleave<T>>, and we can pattern-match on that to return
the next item from the wrapped stream, if it exists.
impl<T> Iterator for Flatten<'_, T> {
    type Item = T;
    fn next(&mut self) -> Option<Self::Item> {
        if let Some(stream) = self.thunk.as_mut() {
            stream.next()
        } else {
            None
        }
    }
}
The trick that makes this all work is the Thunk type. Its job is to defer the
creation of a value by wrapping a closure that generates it. We only want to
generate the inner value once, and so the Thunk exists in one of two states:
Pending, which contains a boxed FnOnce that returns a T, or Forced, which
contains a T. The Option inside Pending is a trick that lets us move the
closure out of the struct, which is required if we want to call it. Because
it’s wrapping a non-nullable pointer, the Option does not take up any extra
space; it’s just there to make the borrow checker happy.
enum Thunk<'a, T: 'a> {
    Pending(Option<Box<dyn 'a + FnOnce() -> T>>),
    Forced(T),
}
impl<'a, T> Thunk<'a, T> {
    fn new<F>(gen: F) -> Thunk<'a, T>
    where
        F: 'a + FnOnce() -> T,
    {
        Thunk::Pending(Some(Box::new(gen)))
    }
}
Implementing AsMut for Thunk requires us to return &mut T. To do this, we need
to evaluate the closure if it hasn’t already been used; if the thunk is
Pending, then we can take() the closure out and call it, generating a T, which
we store as a Forced value. Once this is done, we can assume the thunk is in
the Forced state, so we pattern-match the value out of it and return a mutable
reference to it.
impl<T> AsMut<T> for Thunk<'_, T> {
    fn as_mut(&mut self) -> &mut T {
        if let Thunk::Pending(gen) = self {
            let gen = gen.take().unwrap();
            *self = Thunk::Forced(gen());
        }
        match self {
            Thunk::Forced(ref mut value) => value,
            _ => panic!(),
        }
    }
}
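To see the laziness in isolation, here’s a small usage sketch of the Thunk type we’ve just defined: the closure doesn’t run when the thunk is built, only when as_mut() is first called, and it runs exactly once.
let mut thunk = Thunk::new(|| {
    println!("computing...");
    42
});
println!("thunk created");
*thunk.as_mut() += 1; // the closure runs here; the result is stored as Forced
println!("value = {}", thunk.as_mut()); // the stored value is reused, not recomputed
// -> thunk created
//    computing...
//    value = 43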
This interface lets us get a mutable reference to the iterator inside the thunk,
and thus ask it for the next item. Finally, we can add a little syntactic sugar
so that we can build a Flatten from any nested IntoIterator value – just as we
did for Interleave, this requires converting the inputs into boxed iterators
so we can pass them to the normal Flatten constructor.
impl<'a, T> Flatten<'a, T> {
    pub fn from_iter<I, J>(streams: I) -> Flatten<'a, T>
    where
        I: IntoIterator<Item = J> + 'a,
        I::IntoIter: 'a, // both layers of iterators get boxed as 'a trait objects
        J: IntoIterator<Item = T> + 'a,
        J::IntoIter: 'a,
    {
        let iters = streams
            .into_iter()
            .map(|s| Box::new(s.into_iter()) as BoxIter<'a, T>);
        Flatten::new(Box::new(iters))
    }
}
With these three structs defined – Interleave, Thunk and Flatten – we can
convert our infinite list of infinite lists into a flattened form that pulls
items from every iterator in the set:
let buses = (0..).map(|n| (100 * n)..);
let flat: Vec<_> = Flatten::from_iter(buses).take(32).collect();
println!("{:?}", flat);
// -> [0, 100, 1, 200, 2, 101, 3, 300, 4, 102, 5, 201, 6, 103, 7, 400, 8,
//     104, 9, 202, 10, 105, 11, 301, 12, 106, 13, 203, 14, 107, 15, 500]
And so we’ve reproduced the behaviour that took a handful of lines in Haskell.
Now, maybe I’m using the wrong language for this project, but there’s also
been a good deal of it that Rust’s lazy iterators made substantially easier.
The thing to remember is that if something returns an iterator – think map,
filter, or take – then it may be lazy. But anything that returns a
non-iterator value must be eager. The place I tripped over this the most was
when using fold, especially since I was using it to transform one iterator
into another. I expected the closure to be called lazily, but it wasn’t, and
that forced the creation of all of the above. But once you have the building
blocks you need, Rust’s iterators are remarkably useful and compose
wonderfully, giving you expressive code without the overhead of many nested
loops as you’d get if they were eager.
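As a minimal sketch of that fold trap (not the actual interpreter code): fold has to consume its entire input before it can return anything, so folding an infinite iterator never finishes, whereas an adapter like scan yields its intermediate results lazily and can be cut short with take.
// This would never return, because fold must exhaust the iterator first:
// let total: u64 = (0u64..).fold(0, |acc, n| acc + n);

// scan produces a lazy iterator of running totals instead, so take() works:
let running_totals: Vec<u64> = (0u64..)
    .scan(0u64, |acc, n| {
        *acc += n;
        Some(*acc)
    })
    .take(5)
    .collect();
println!("{:?}", running_totals); // -> [0, 1, 3, 6, 10]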