Homework 4 - Error handling and the heap

Assignment Instructions

  1. Accept the assignment on GitHub Classroom here.

  2. Do the assignment 🐫.

  3. Upload the assignment on Gradescope. The most convenient way to do this is by uploading your assignment repo through the integrated GitHub submission method on Gradescope, but you may also upload a .zip of your repo.

Introduction

In this homework, you'll implement lists using pairs and arrays. You'll get practice handling runtime errors and dealing with data on the heap.

Lists

As we saw in class, lists can be implemented using pairs. In fact, this is how lists are generally implemented in Scheme and other Lisp-like languages. A list is either:

Instead of using false to signal the empty list, Scheme includes a special value () (pronounced "nil").

Add support for () to both the interpreter and the compiler. In the compiler, use 0b11111111 as the runtime representation of ().

Now we can build lists using pair and (). Here are a few such lists:

Add a primitive (list? e). It should evaluate to true when its argument is a list and false otherwise.

Hint: implement list? in the interpreter with a recursive helper function.

Hint: implement list? in the compiler with a loop: you'll need to define a label and repeatedly jump back to that label.

Vectors

We can also implement lists using arrays. In Scheme, array-backed lists are called vectors. You'll implement the following vector operations:

Vectors should be printed as space-delimited lists of their values enclosed in square brackets. For instance, the following represent some simple vectors of numbers:

In the interpreter, you can implement vectors using OCaml's built-in array type and the functions in the Array module.

In the compiler, you should implement vectors on the heap. A vector of length n should occupy n+1 8-byte cells; the first cell should be used to store its length. Use the three-bit tag 0b101 for vector values.

Hint: you may need to make use of an additional register in order to implement some vector operations. If you do, you can use R9 in addition to the usual Rax and R8.

Note: unlike the other data types in our language so far, vectors are mutable. The following code should only store one vector, [10 21 10], on the heap:

(let ((v (vector 3 10))) (vector-set v 1 21))

It can be frustrating to nest together calls to vector-set to create test cases. We provide a new function for you, (do <expr> <expr> ...), that will help with this. Similar to OCaml's semicolon, which throws away the return value of a function with type Unit, the do function will run each of its arguments but only return the final one. For example, the program

(let ((v (vector 5 10)))
  (do 
    (vector-set v 0 4)
    (vector-set v 1 8)
    (vector-set v 3 200)
    (pair (vector-get v 2) (vector-get v 3))
  )
)

will return the pair (10, 200).

Error handling

For this assignment, unlike previous assignments, you are expected to implement error-handling in the compiler. Invalid expressions, such as vector-length on a non-vector or vector-get with an index greater than the vector's length, should always cause runtime errors. Take a look at ensure_num and ensure_pair for some examples of the error handling we've implemented so far.

For error testing with .lisp, instead of creating the normal .out files, you should create a file example.err with empty content.

The runtime

You'll need to change the runtime to handle displaying nil and vector values. In order to display vectors, it will likely be helpful to cast a vector's pointer to the heap into a C pointer to a uint64_t and then use pointer arithmetic to access each of the values in the vector.

Grammar

The grammar your new interpreter and compiler should support is as follows:

<expr> ::= <num>
         | <id>
         | true
         | false
+        | ()
         | (<un_prim> <expr>)
         | (<bin_prim> <expr> <expr>)
         | (<tri_prim> <expr> <expr> <expr>)
         | (if <expr> <expr> <expr>)
         | (let ((<id> <expr>) ...) <expr>)
         | (do <expr> <expr> ...)
             

<un_prim> ::= add1
            | sub1
            | zero?
            | num?
            | not
            | pair?
            | left
            | right
+           | list?
+           | vector?
+           | vector-length

<bin_prim> ::= +
             | -
             | =
             | <
             | pair
+            | vector
+            | vector-get

<tri_prim> ::=
+            vector-set