Last month, I followed the Advent of Code series of puzzles that have been published each December for the last few years. I typically write my solutions in Ruby, but there were a few puzzles in this batch that I decided to solve in Rust. These problems ended up teaching me a lot about how one can use traits to make a Rust API feel like something you’d do in an untyped language, including expanding the functionality of built-in types in a safe, controlled way.
This year’s story was based around a virtual machine: a software simulation of a physical computer. This particular machine has a very small set of features. It works like a bytecode interpreter, where a program is represented as an array of integers – signed 64-bit integers, rather than just bytes. Once this program is loaded into the machine’s memory, it can modify any value within that memory region, but there are no other storage abstractions – no stack, no heap, no registers, and so on.
Here’s a little example program, which I’m feeding into a Machine
struct that
will execute it. I’ve broken the program into one instruction per line, and
annotated each line with the address of its first value – its index in the
array of values representing the program. The machine starts reading and
executing the program from its first element. Though I’ve written the program in
lines here, the machine just sees it as an array of numbers.
let mut machine = Machine::new(&[
    1, 9, 10, 12,   // 0
    2, 11, 12, 12,  // 4
    99,             // 8
    2, 3, 4, 0,     // 9
]);
Each instruction begins with an opcode that identifies one of the machine’s built-in operations. The first instruction above has opcode 1, which identifies the add operation. This takes three arguments, which are the numbers following the opcode: 9, 10, 12. All these arguments identify addresses in the machine’s memory at which to read or write values. The value at address 9 is 2, and the value at address 10 is 3. The add operation sums these values and writes the result at the address given by the third argument: 12. So after this operation, the machine’s memory beginning at address 9 contains:
2 3 4 5
After running this instruction, the instruction pointer (IP) is updated to point at the next instruction, at address 4. This has opcode 2, which is the multiply operation. It works just like add, except it multiplies rather than adding its operands. So here it takes the values from addresses 11 and 12 (4 and 5), and stores their product at address 12, so that the machine’s memory now ends with:
2 3 4 20
The IP now points at address 8, containing the opcode 99, which tells the machine to halt. So the effect of this program is to compute (2 + 3) * 4 and store the result in address 12.
machine.run();
println!("{}", machine.memory[12]);
// -> 20
Here’s a more complicated program, which computes the value of 2^32:
let mut machine = Machine::new(&[
    2, 18, 19, 18,  // 0
    1, 17, 20, 20,  // 4
    6, 20, 16,      // 8
    6, 15, 15,      // 11
    99,             // 14
    0, 14, -1,      // 15
    1, 2, 32,       // 18
]);
machine.run();
println!("{:?}", &machine.memory[18..]);
// -> [4294967296, 2, 0]
In the following, I will use the notation #x to mean “the value at address x”, so for example #3 means the value at address 3 and #4 = #5 means we write the value from address 5 into address 4.
The first two instructions in this program are familiar from above. Opcode 6 is a jump operation: if the value at the address in its first argument is zero, then the IP is set to the value at the address in the second argument, otherwise it is incremented to the next instruction as normal. So, the third instruction sets the IP to #16 if #20 == 0. The fourth instruction sets the IP to #15 if #15 == 0. Since #15 is zero, this works as an unconditional jump, taking the machine back to the start of the program.
Only locations 18 and 20 are modified by this program. The addresses 15 to 17 can be viewed as constants: data items the program needs to do its job, but which aren’t part of the program’s instructions. The addresses after this are used to store variables that change while the program runs. A higher-level writing of this program, where we’ve used labels rather than instruction addresses and resolved all constants, could be:
loop:
    #18 = #18 * #19
    #20 = #20 - 1
    if #20 == 0 then goto exit
    goto loop
exit:
    halt
data:
    18: 1
    19: 2
    20: 32
We can see that the effect of this program is to multiply the number 1 by 2, a total of 32 times, so that we compute 2^32. The result of this calculation ends up in #18.
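To make the effect concrete, here is the same loop written as ordinary Rust; the variable names are mine, chosen to mirror the addresses in the listing above.
let mut acc: i64 = 1;     // #18
let base: i64 = 2;        // #19
let mut count: i64 = 32;  // #20

loop {
    acc *= base;          // #18 = #18 * #19
    count -= 1;           // #20 = #20 - 1
    if count == 0 {       // if #20 == 0 then goto exit
        break;
    }
}

assert_eq!(acc, 4294967296);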
This is a simplified version of the virtual machine from Advent of Code; the real one has instructions for comparing addresses for equality or to check if one is less than another, and its opcodes can be modified so that the arguments are treated as literal values rather than as addresses. Here’s an implementation of the machine described above; read() and write() accept an offset and turn this into an address relative to the IP.
struct Machine {
    memory: Vec<i64>,
    ip: usize,
}

impl Machine {
    fn new(program: &[i64]) -> Self {
        let memory = Vec::from(program);
        Machine { memory, ip: 0 }
    }

    fn run(&mut self) {
        loop {
            match self.memory[self.ip] {
                1 => {
                    self.write(3, self.read(1) + self.read(2));
                    self.ip += 4;
                }
                2 => {
                    self.write(3, self.read(1) * self.read(2));
                    self.ip += 4;
                }
                6 => {
                    if self.read(1) == 0 {
                        self.ip = self.read(2) as usize;
                    } else {
                        self.ip += 3;
                    }
                }
                99 => break,
                opcode => panic!("unrecognised: {}", opcode),
            }
        }
    }

    fn read(&self, offset: usize) -> i64 {
        let address = self.memory[self.ip + offset];
        self.memory[address as usize]
    }

    fn write(&mut self, offset: usize, value: i64) {
        let address = self.memory[self.ip + offset];
        self.memory[address as usize] = value;
    }
}
Now, a program that computes one fixed thing is not especially useful; programs become more powerful when they can accept input from outside their own code, and output results back to the system in a convenient way, without the caller needing to know what address the program stores its results in. Here’s a program that extends our exponentiating example above with the ability to take the two operands as inputs, and emit the result as output.
let mut machine = Machine::new(&[
    3, 25,          // 0
    3, 26,          // 2
    2, 24, 25, 24,  // 4
    1, 23, 26, 26,  // 8
    6, 26, 22,      // 12
    6, 21, 18,      // 15
    4, 24,          // 18
    99,             // 20
    0, 18, -1,      // 21
    1, 0, 0,        // 24
]);
Opcode 3 is the input operation; it takes a single value from the input channel and writes it to the given address. Opcode 4 is the output operation, which reads the value at the given address and sends it to the output channel. Written in our higher-level assembly-like language, it looks like this:
#25 = input()
#26 = input()
loop:
    #24 = #24 * #25
    #26 = #26 - 1
    if #26 == 0 then goto exit
    goto loop
exit:
    output(#24)
    halt
data:
    24: 1
    25: 0
    26: 0
The question is, how should we implement input and output? In the early puzzles, there is a single machine that takes a fixed number of inputs known ahead of time, and produces a single output. Later on, we need to generate input by observing the program’s output, for example where the program represents a robot we’re trying to guide through a maze. Later still, there are many machines running at once with their inputs and outputs chained together in sequence, or in a network where one machine can send a message to any other. That’s a lot of different situations to handle!
We can make a start by picking something that seems reasonable for the simple cases and then seeing where we get stuck. For simple problems, it would be okay to supply the inputs as a Vec, and have the outputs also be a Vec that the machine can push values into. This would let us interact with the machine like so:
// compute 3 to the power 7
machine.input.extend(&[3, 7]);
machine.run();
println!("{:?}", machine.output);
// -> [2187]
Let’s go ahead and implement this design; Machine gains input and output fields which are both Vecs:
struct Machine {
    memory: Vec<i64>,
    ip: usize,
    input: Vec<i64>,
    output: Vec<i64>,
}

impl Machine {
    fn new(program: &[i64]) -> Self {
        Machine {
            memory: Vec::from(program),
            ip: 0,
            input: Vec::new(),
            output: Vec::new(),
        }
    }

    // ...
}
And the run() loop gains a couple of branches for matching opcodes 3 and 4:
fn run(&mut self) {
    loop {
        match self.memory[self.ip] {
            3 => {
                let value = self.input.remove(0);
                self.write(1, value);
                self.ip += 2;
            }
            4 => {
                let value = self.read(1);
                self.output.push(value);
                self.ip += 2;
            }
            // ...
        }
    }
}
The program now runs successfully and we can pass any two positive integers into the machine’s inputs to compute an exponent. Now, suppose we have an interactive program where we need to generate input by observing the output and responding to it. Our current design doesn’t let us make decisions like this, because the machine runs until it finds opcode 99, and we can’t inject code to monitor the output and use it to generate more input.
Whenever we want to insert arbitrary code into another process, closures seem like a good fit. What if, instead of Machine having two Vecs inside it, we gave it two closures – one of type FnMut() -> i64 that would produce inputs, and one of type FnMut(i64) to accept output? This would, for example, let us wrap an Iterator that returns the inputs, and a Vec to store the outputs, so we’re already accommodating different types for our input and output.
let program = [
    3, 25,          // 0
    3, 26,          // 2
    2, 24, 25, 24,  // 4
    // ...
];

let mut input = [3, 7].iter().cloned();
let mut output = vec![];

let mut machine = Machine::new_with_io(
    &program,
    || input.next().unwrap(),
    |x| output.push(x)
);
machine.run();
println!("{:?}", output);
To make this work, we’d need to store the two closures inside Machine, so they’re available for Machine::run to use. The Machine::new function is modified to call a new Machine::new_with_io function, passing in two do-nothing closures for the existing cases where we don’t need I/O.
struct Machine<'a> {
    memory: Vec<i64>,
    ip: usize,
    input: Box<dyn 'a + FnMut() -> i64>,
    output: Box<dyn 'a + FnMut(i64)>,
}

impl<'a> Machine<'a> {
    fn new(program: &[i64]) -> Self {
        Self::new_with_io(program, || 0, |_| {})
    }

    fn new_with_io<I, O>(program: &[i64], input: I, output: O) -> Self
    where
        I: 'a + FnMut() -> i64,
        O: 'a + FnMut(i64),
    {
        Machine {
            memory: Vec::from(program),
            ip: 0,
            input: Box::new(input),
            output: Box::new(output),
        }
    }

    // ...
}
The Machine::run function could then use these closures when it encounters opcodes 3 and 4; rather than interacting with a Vec, it just calls into the closures the caller supplied.
fn run(&mut self) {
    loop {
        match self.memory[self.ip] {
            3 => {
                let value = (self.input)();
                self.write(1, value);
                self.ip += 2;
            }
            4 => {
                let value = self.read(1);
                (self.output)(value);
                self.ip += 2;
            }
            // ...
        }
    }
}
However, if we try this design, the borrow checker yells at us.
error[E0502]: cannot borrow `output` as immutable because it is
also borrowed as mutable
|
| |x| output.push(x)
| --- ------ first borrow occurs due to use of `output` in closure
| |
| mutable borrow occurs here
...
| println!("{:?}", output);
| ^^^^^^ immutable borrow occurs here
| }
| - mutable borrow might be used here, when `machine` is dropped
| and runs the destructor for type `Machine<'_>`
The problem is that, when we create a closure, we capture any references the closure body needs in order for the calls it makes to be valid. The output closure, |x| output.push(x), contains a call to Vec::push, which takes &mut self, and so the closure contains a &mut to the output variable. As the closure becomes part of the Machine struct, it lives until machine goes out of scope at the end of the program. While machine is in scope, it contains a &mut to output, and so no other interaction with output is possible, and we cannot print it.
let mut input = [3, 7].iter().cloned();
let mut output = vec![];

let mut machine = Machine::new_with_io(
    &program,
    || input.next().unwrap(),
    |x| output.push(x) // mutable borrow
);

machine.run();

println!("{:?}", output); // immutable borrow

// machine is still live here
We could solve this by reducing the lifetime of machine, by moving its use into a function or a local block, but this is an ugly and short-term solution that won’t serve us for long.
that won’t serve us for long. The better approach, which Rust pushes us
constantly toward, is to make our structs smaller and our mutable references
narrower and shorter-lived. In this case, that means passing closures into
Machine::run
, where they’re needed, and not storing them in the Machine
struct. That is, the Machine
code now looks like this, with the struct and
new()
reverted to their original forms, but a new form of run()
introduced
to take the closures.
struct Machine {
    memory: Vec<i64>,
    ip: usize,
}

impl Machine {
    fn new(program: &[i64]) -> Self {
        let memory = Vec::from(program);
        Machine { memory, ip: 0 }
    }

    fn run(&mut self) {
        self.run_with_io(|| 0, |_| {});
    }

    fn run_with_io<I, O>(&mut self, mut input: I, mut output: O)
    where
        I: FnMut() -> i64,
        O: FnMut(i64),
    {
        loop {
            match self.memory[self.ip] {
                3 => {
                    let value = input();
                    self.write(1, value);
                    self.ip += 2;
                }
                4 => {
                    let value = self.read(1);
                    output(value);
                    self.ip += 2;
                }
                // ...
            }
        }
    }

    // ...
}
This design means the &mut for output only lives as long as the call to run_with_io(), so once it’s complete we’re free to read output again and we can print the program’s results.
let program = [
    3, 25,          // 0
    3, 26,          // 2
    2, 24, 25, 24,  // 4
    // ...
];

let mut machine = Machine::new(&program);
let mut input = [3, 7].iter().cloned();
let mut output = vec![];

machine.run_with_io(
    || input.next().unwrap(),
    |x| output.push(x)
);
println!("{:?}", output);
// -> [2187]
Now, this particular machine does not read any more input after emitting its sole output value, but it’s clear that by introducing FnMut closures, we’re able to insert code to handle the output, and this will be able to affect the choice of the next input value we generate.
Already we’ve seen a guiding rule for managing read/write access in Rust: keep your structs small, and only scope references to the functions that need them. Making overly broad structs tends to expand both the scope and lifetime of mutable borrows, making it hard to interact with multiple parts of a structure at the same time. Rust really pushes you towards a core principle of object-oriented design, forcing you to reduce structures to only those pieces of state that must live and be kept consistent together, and to break big objects into smaller ones that express which pieces of state are truly related.
For our next challenge, let’s suppose we want to exponentiate a number multiple times, for example we’d like to compute ((2^3)^3)^3. The laws of algebra tell us this is 2^27 and we could use our existing machine here, but humour me and imagine that we implement this by chaining several machines together.
We already have a machine that calculates pow(x, y). Repeated exponentiation just means applying the same process to the output of such a machine: ((2^3)^3)^3 is given by pow(pow(pow(2, 3), 3), 3). We can take the output of one machine, and feed it in as input to another.
The pow() machine has very simple control flow: it reads two inputs, does some computation, and returns an output. So, we can run one instance of it to completion, then take its output and use that as input for another iteration. In the following program, we start with an output array containing the initial input: 2. Then we perform three loops where the previous output becomes the new input and we assign a fresh output array, we push a second value onto the input array to provide the second operand – the exponent 3 – and then we run a fresh copy of the program to generate the next output.
let mut input: Vec<i64>;
let mut output = vec![2];

for _ in 0..3 {
    input = output;
    output = vec![];

    let mut machine = Machine::new(&program);
    input.push(3);
    machine.run_with_io(|| input.remove(0), |x| output.push(x));
}
println!("{:?}", output);
// -> [134217728]
As you can verify for yourself, 2^27 is indeed 134,217,728, so our chained machines appear to work. But, we’re already seeing a downside of using closures for I/O: we’re often using the same underlying types for managing input/output values, but we have to repeatedly tell Machine::run_with_io how to use them – we have to say || input.remove(0) or |x| output.push(x) rather than just passing our Vecs into run() and having them work transparently.
With Rust’s trait system we can do just that, and much more besides. To start, we need to change Machine::run_with_io so that instead of taking two closures, it takes one argument that implements a new trait called Input, and another that implements Output. These traits are parameterised, but a Machine deals with i64 values so we’ll want to input and output those. The Input trait will have a get() function for fetching input, while Output has a put() function, and we use those in place of closure calls for opcodes 3 and 4.
fn run_with_io<I, O>(&mut self, mut input: I, mut output: O)
where
    I: Input<i64>,
    O: Output<i64>,
{
    loop {
        match self.memory[self.ip] {
            3 => {
                let value = input.get();
                self.write(1, value);
                self.ip += 2;
            }
            4 => {
                let value = self.read(1);
                output.put(value);
                self.ip += 2;
            }
            // ...
        }
    }
}
This is the only change needed to Machine itself, and it opens up the ability to work with many more types – we just need implementations of Input and Output for all the types we want to use.
trait Input<T> {
    fn get(&mut self) -> T;
}

trait Output<T> {
    fn put(&mut self, value: T);
}
These traits take &mut self because in general, these functions will mutate the value they’re called on, for example by pushing a new value to a Vec or pulling the next value out of an Iterator inside a FnMut closure. Because Input::get and Output::put take &mut self, the input and output parameters to Machine::run_with_io need to be marked mut, but aside from that their types are completely opaque – any type can be passed here as long as it implements the right traits. If we’d instead said that output: &mut O where O: Output<i64> then we would only be able to pass &mut T types, rather than any type at all. For example, we’d need to prefix all closures we pass in with &mut, and some of the types we’ll use later can be mutated through a & reference rather than a &mut one. Therefore, we leave these parameter types as open-ended as possible.
To keep our existing examples working, we’ll need to implement Input<T> for closures of type FnMut() -> T, and Output<T> for closures of type FnMut(T).
impl<F, T> Input<T> for F
where
    F: FnMut() -> T,
{
    fn get(&mut self) -> T {
        self()
    }
}

impl<F, T> Output<T> for F
where
    F: FnMut(T),
{
    fn put(&mut self, value: T) {
        self(value);
    }
}
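With these blanket impls in place, the closure-based calls from earlier keep compiling unchanged; for example, reusing the exponentiation program from above:
let mut machine = Machine::new(&program);
let mut input = vec![3, 7];
let mut output = vec![];

// The closures satisfy Input<i64> and Output<i64> via the blanket impls.
machine.run_with_io(|| input.remove(0), |x| output.push(x));

println!("{:?}", output);
// -> [2187]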
But now that we have these traits, we can start implementing them for other built-in types like Vec, thus allowing those types to be used directly with Machine::run_with_io. A lot of our examples would work if we could pass a mutable Vec reference in, so let’s write an implementation for &mut Vec<T>:
impl<T> Input<T> for &mut Vec<T> {
    fn get(&mut self) -> T {
        self.remove(0)
    }
}

impl<T> Output<T> for &mut Vec<T> {
    fn put(&mut self, value: T) {
        self.push(value);
    }
}
With these impls in place we can now pass &mut Vec<i64> in directly, without having to write closures that explain how to work with them:
let mut input: Vec<i64>;
let mut output = vec![2];

for _ in 0..3 {
    input = output;
    output = vec![];

    let mut machine = Machine::new(&program);
    input.push(3);
    machine.run_with_io(&mut input, &mut output);
}
println!("{:?}", output);
// -> [134217728]
We had to implement the traits for &mut Vec<T> rather than for Vec<T>, because of the signature of run_with_io:
fn run_with_io<I, O>(&mut self, mut input: I, mut output: O)
where
    I: Input<i64>,
    O: Output<i64>,
If it took &mut I and &mut O, then we could implement the traits for Vec<T> and then passing in &mut Vec<T> would work. But since it takes just I and O, the traits must be implemented for the exact type we pass in, which in this case is &mut Vec<T>. Remember, T and &mut T are distinct types, and implementing a trait for one does not imply it’s implemented for the other. Functions that rely on traits will often take a &T or &mut T where T implements the trait, but the range of types we want to work with means I’ve gone with accepting any type T, rather than restricting to types matching &mut T.
Here’s one more type these traits would be useful for. Often, when a process produces a single output, it would be nice to have that written directly to a single variable, rather than allocating a Vec for it. For example, our exponentiating machine could write its answer into a &mut i64:
let mut machine = Machine::new(&program);
let mut output = 0;
machine.run_with_io(&mut vec![3, 7], &mut output);
println!("{}", output);
// -> 2187
To make this work, we might be tempted to implement Output<T> for &mut T:
impl<T> Output<T> for &mut T {
    fn put(&mut self, value: T) {
        **self = value;
    }
}
However, this won’t compile as-is:
error[E0119]: conflicting implementations of trait `Output<_>` for type `&mut _`:
|
| / impl<F, T> Output<T> for F
| | where
| | F: FnMut(T),
| | {
... |
| | }
| | }
| |_- first implementation here
...
| impl<T> Output<T> for &mut T {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `&mut _`
Rust allows for some very dynamic use of its type system, but a constraint your code must obey is: for every function/method call in your program, there must be a single fn that that call dispatches to, and it must be unambiguously decidable at compile time which fn the compiler should pick. If we implement the same trait multiple times for the same type, this rule is broken, because the compiler won’t know which impl of Input<T> to pick when it sees input.get().
The error above says that we can’t implement the trait for both &mut T and for F where F: FnMut(T), because there might be some type that matches both of these. Even if there isn’t one right now, there’s nothing to say one won’t be added to your program or to the Rust standard library in the future. And since we don’t want that to be a breaking change, you’re not allowed to write two impls that could match the same type.
Can we still do anything useful here? In this case we can, by being more specific: &mut i64 does not implement FnMut(i64), and furthermore the compiler guarantees that it never will; &mut T is a fundamental type, which means the compiler considers it a breaking change to implement any new traits for it.
impl Output<i64> for &mut i64 {
    fn put(&mut self, value: i64) {
        **self = value;
    }
}
We can implement Output<i64> for &mut i64 because Output<T> is defined in our crate. The Rust compiler won’t add new traits to &mut i64, so it will never match F where F: FnMut(T). And, other crates can’t write conflicting implementations because of the orphan rules, which roughly say an impl is not allowed if neither the implemented trait nor the type it’s being applied to is defined in the current crate.
The Fn traits are also fundamental, which means the compiler considers it a breaking change to implement them for any more types. So the conflict between implementations for &mut T and F where F: FnMut(T) must be because there’s an existing type matching both of them. And indeed there is: &mut F where F: FnMut(T)! FnMut<A> has a blanket implementation for &mut F where F: FnMut<A>, so &mut T and F where F: FnMut(A) can refer to the same type, if T: FnMut(A).
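You can see that blanket implementation at work in ordinary code, away from our machine; in this small, self-contained example (the names are mine), a &mut reference to a closure is passed where an FnMut is expected:
fn call_twice(mut f: impl FnMut(i64)) {
    f(1);
    f(2);
}

fn main() {
    let mut total = 0;
    let mut add = |x| total += x;

    // `&mut add` implements FnMut(i64) via the blanket impl for &mut F,
    // so it can be passed wherever an FnMut(i64) is expected.
    call_twice(&mut add);

    println!("{}", total);
    // -> 3
}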
This notion that every function call and type reference must unambiguously resolve to a single item at compile time is known as coherence, and you are violating the coherence rules if the compiler tells you that you have conflicting implementations for a trait.
In part 2 of this article, we’ll look at some other useful types we can use here, when we need to deal with concurrency or multiple input/output sources. We’ll also see some patterns we can use to deal with coherence errors, and some new tricks that const generics will enable.