Homework 4 - Error handling and the heap
Assignment Instructions
-
Accept the assignment on GitHub Classroom here.
-
Do the assignment 🐛.
-
Upload the assignment on Gradescope. The most convenient way to do this is by uploading your assignment repo through the integrated GitHub submission method on Gradescope, but you may also upload a .zip of your repo.
Introduction
In this homework, you'll implement lists using pairs and arrays. You'll get practice handling runtime errors and dealing with data on the heap.
Lists
As we saw in class, lists can be implemented using pairs. In fact, this is how lists are generally implemented in Scheme and other Lisp-like languages. A list is either:
- The empty list, for which we used
false
in class - A pair where the second element is a list
Instead of using false to signal the empty list, Scheme includes a special
value ()
(pronounced "nil").
Add support for ()
to both the interpreter and the compiler. In the compiler,
use 0b11111111
as the runtime representation of ()
. You will also need to add
support for displaying ()
to the runtime (see "The Runtime" section below).
Now we can build lists using pair
and ()
. Here are a few such lists:
()
(pair 1 ())
(pair 1 (pair 2 ()))
(pair (pair 1 2) ())
Add a primitive (list? e)
. It should evaluate to true
when its argument is a
list and false
otherwise.
Hint: implement list?
in the interpreter with a recursive helper function.
Hint: implement list?
in the compiler with a loop: you'll need to define a
label and repeatedly jump back to that label.
Vectors
We can also implement lists using arrays. In Scheme, array-backed lists are called vectors. You'll implement the following vector operations:
(vector n e)
creates a vector of lengthn
where all of the elements are the expressione
.n
must evaluate to a positive integer. For example,(vector 3 10)
produces a vector of length 3 where each entry is the integer 10.(vector? e)
returnstrue
ife
is a vector andfalse
otherwise.(vector-length v)
returns the length of the vectorv
.(vector-get v n)
returns the element of the vectorv
at (0-based) indexn
.n
must evaluate to a non-negative integer less than the length ofv
.(vector-set v n e)
sets the element at indexn
of the vectorv
toe
and evaluates to the vector.n
must evaluate to a non-negative integer less than the length ofv
.
Vectors should be printed as space-delimited lists of their values enclosed in square brackets. For instance, the following represent some simple vectors of numbers:
[1]
[1 2 3]
In the interpreter, you can implement vectors using OCaml's built-in array
type and the functions in the Array
module.
In the compiler, you should implement vectors on the heap. A vector of length
n
should occupy n+1
8-byte cells; the first cell should be used to store its
length. Use the three-bit tag 0b101
for vector values.
Hint: you may need to make use of an additional register in order to implement
some vector operations. If you do, you can use R9
in addition to the usual
Rax
and R8
.
Note: unlike the other data types in our language so far,
vectors are mutable. The following code should only store one vector, [10 21 10]
, on the heap:
(let ((v (vector 3 10))) (vector-set v 1 21))
It can be frustrating to nest together calls to vector-set
to create test cases.
We provide a new function for you,
(do <expr> <expr> ...)
, that will help with this.
Similar to OCaml's semicolon, which throws away the return value of a function with type Unit
,
the do
function will run each of its arguments but only return the final one.
For example, the program
(let ((v (vector 5 10))) (do (vector-set v 0 4) (vector-set v 1 8) (vector-set v 3 200) (pair (vector-get v 2) (vector-get v 3)) ) )
will return the pair (10, 200)
.
Error handling
For this assignment, unlike previous assignments, you are expected to implement
error-handling in the compiler. Invalid expressions, such as vector-length
on
a non-vector or vector-get
with an index greater than the vector's length,
should always cause runtime errors. Take a look at ensure_num
and ensure_pair
for some examples of the error handling we've implemented so far.
We're particularly interested in catching runtime errors (and the corresponding interpreter errors). You can assume that all of our tests are well-formed s-expressions that do not contain unbound variables.
For error testing with .lisp, instead of creating the normal .out files, you should create a file example.err
with empty content.
The Runtime
You'll need to change the runtime to handle displaying nil
and vector values.
In order to display vectors, it will likely be helpful to cast a vector's pointer
to the heap into a C pointer to a uint64_t
and then use
pointer arithmetic
to access each of the values in the vector.
Grammar
The grammar your new interpreter and compiler should support is as follows:
<expr> ::= <num> | <id> | true | false + | () | (<un_prim> <expr>) | (<bin_prim> <expr> <expr>) | (<tri_prim> <expr> <expr> <expr>) | (if <expr> <expr> <expr>) | (let ((<id> <expr>) ...) <expr>) | (do <expr> <expr> ...) <un_prim> ::= add1 | sub1 | zero? | num? | not | pair? | left | right + | list? + | vector? + | vector-length <bin_prim> ::= + | - | = | < | pair + | vector + | vector-get <tri_prim> ::= + vector-set