Heist 0.3: complete set of Scheme data types

I actually tagged the 0.3.0 release of Heist, my Ruby Scheme implementation, about a month back, mostly to get it off my desk for a while. I’ve made a few minor tweaks and released 0.3.1 over the weekend, so now’s as good a time as any to go over what’s new.

The major milestone for this release was to complete the set of R5RS data types: Heist now supports vectors (including macro and quasiquotation support), characters and strings, including full libraries for each type. These work largely as advertised by the spec, although vector handling is slightly idiosyncratic.

Some Schemes require vectors to be quoted, as in '#(1 2 3). This makes the vector a constant (i.e. it is immutable), and every evaluation of it returns the same object. That is (in mzscheme):

> (define (f) '#(1 2 3))
> (define a (f))
> (define b (f))
> (eq? a b)
#t

> (define (f) (vector 1 2 3))
> (define a (f))
> (define b (f))
> (eq? a b)
#f

So we see that vector allocates a new object every time, but the quoted literal does not. This is still true in Heist, but Heist also allows non-quoted vectors. These return a new mutable object when evaluated.

(define (f) '#(1 2 3))
(define (g) #(1 2 3))

(define a (f)) (define b (f))
; => #(1 2 3)
(eq? a b)
; => #t
(vector-set! a 0 9)
; [error] Cannot modify vector constant

(define c (g)) (define d (g))
; => #(1 2 3)
(eq? c d)
; => #f
(vector-set! c 0 9)
c
; => #(9 2 3)
d
; => #(1 2 3)

The reason for this is that unquoted vectors must be allowed – they are part of the syntax-rules spec. But, making them immutable makes performing macro transformations very hard; particularly inlining the transformation into the AST is impossible since it involves setting an attribute or changing a cell on my now-frozen Heist::Runtime::Vector instance. So they need to be mutable (left unfrozen, in Ruby terms). If an unquoted literal evaluated to the same object every time, then the vector-set! procedure would be able to modify the parse tree, which is clearly undesirable. So, unquoted vector literals evaluate to a mutable copy of themselves.

With the Scheme type system completed, where do we go from here? The only pieces of R5RS I’ve not implemented so far are the file I/O procedures, and some of the more exotic continuation stuff like call-with-values and dynamic-wind (I’d love to figure these out but so far they’ve eluded me). My main reason for developing this was to help me with SICP, so I plan to keep going with that and fill in any blanks the book requires along the way.

I may provide faster Ruby implementations of the built-in procedures currently written in Scheme, but that’s not a priority. Paul Graham will tell you that Lisp “came about … as the byproduct of an attempt to axiomatize computation”, and Scheme is no exception. If you write a Scheme yourself, you soon start to find that there do not need to be very many functions built into the core language: a lot of the R5RS library can be written in Scheme itself. Most of the syntactic functions can be written as macros, and many of the data types just need a constructor, accessors and mutators and the user can build the rest on top. Witness the list functions: the core provides cons, car, cdr, set-car!, set-cdr! and pair?, and all the other ‘built-in’ list functions are written in Scheme. SICP chapter 2 reinforces this pattern of data abstraction and is well worth reading as a lesson in robust program design.

So, for the time being at least I’m happy letting Heist be a neat little example of this axiomatization idea at work. I’d check out the source now if I were you, before I get bored and decide the performance needs kicking up a notch.