Pairs and the heap

Today we'll add a new type to our language: the pair type. A pair, as its name suggests, packages up two values into one value. We can then access the components individually. We'll implement a few operations to work with this new type:

This is a simple addition to our language, but it's quite powerful! In Scheme and other Lisp-like languages, the pair operation is often called cons (short for "construct") and pairs are called cons-cells. These cons-cells are used to implement complex nested structures like lists and trees. In our language, we could implement a list of values like this:

(pair 1 (pair 2 (pair 3 false)))

We can access the first element of the list with left and the rest of the list with right. In Lisp, the left and right operations are called car and cdr; the names refer to assembly instructions on the computer on which Lisp was originally implemented.

Pairs in the AST

Our AST refactor made it nice and easy for us to add operations to our language. left and right are primitive unary functions, and pair is a binary function. Stick them in the appropriate spots, update the prim*_of_string functions, and we're done:

type prim1 = Add1 | Sub1 | ZeroP | NumP | Not | Left | Right

type prim2 = Plus | Minus | Eq | Lt | Pair

Pairs in the interpreter

Our implementation of pairs in the interpreter will be straightforward because we'll rely on OCaml's own pair implementation. Pairs are a kind of value, so we'll need to add a constructor to our value type and implement printing:

interp.ml

type value = Number of int | Boolean of bool | Pair of (value * value)

let rec string_of_value (v : value) : string =
  match v with
  | Number n ->
      string_of_int n
  | Boolean b ->
      if b then "true" else "false"
  | Pair (v1, v2) ->
      Printf.sprintf "(pair %s %s)" (string_of_value v1) (string_of_value v2)

We'll then implement the pair operations:

interp.ml

let rec interp_exp env (exp : expr) : value =
  match exp with
  ...
  | Prim1 (Left, arg) -> (
    match interp_exp env arg with
    | Pair (l, _) ->
        l
    | _ ->
        raise (BadExpression exp) )
  | Prim1 (Right, arg) -> (
    match interp_exp env arg with
    | Pair (_, r) ->
        r
    | _ ->
        raise (BadExpression exp) )
  | Prim2 (Pair, l, r) ->
      Pair (interp_exp env l, interp_exp env r)

Pairs in the compiler

OK, so now we've got a value which is actually a pair of values. This is going to be a bit tricky to implement in the compiler–remember, all of our values have to fit in a 64-bit register! And since our other values are 64 bits long, it's going to be hard to fit two of them in 64 bits.

What if we added a level of indirection? Instead of trying to store a pair in a register, can we store the address of a memory location that stores a pair?

We've seen how to write to a section of memory called the stack; we've used it to store the values of variables, as well as temporary values used during computation. Can we put pairs on the stack?

If we try to do that, we're going to run into some trouble. Consider our let expression:

(let ((x <expr)) <body>) 

Right now, regardless of what expr is, we can compile the let by moving the value in rax (i.e., the value of expr) to the next available stack address. So how should we compile this expression?

(let ((x (pair 1 2))) (left x))

If that pair lives on the stack, we have to put x somewhere else! We'd need to somehow track how much stack space the expression used, and put x after it. What about an expression like

(let ((x (if (< 1 2) (pair 1 (pair 2 3)) (pair 1 2))) (left x))

How much stack space will we use for the let? At compile-time, we really don't know.

We need to put that pair someplace where the let won't clobber it. If the stack is for "short-lived" values whose size is known at compile-time, we need somewhere to put longer-lived values whose size might only be known at run-time.

This other area of memory is called the heap. The heap is where objects live in Python and Java. It's where OCaml puts complex objects (including pairs). In C, we can allocate space on the heap with the malloc call.

Just as the stack traditionally "grows" from higher to lower addresses, the heap will grow from lower addresses to higher addresses. Just as our program starts with the address of the bottom of the stack in the rsp register, it will start with the bottom of the heap in the rdi register (we'll see how we can make that happen in the runtime momentarily). We keep track of indexes into the stack at compile-time; the rsp register never changes. As we saw above, though, we won't be able to do that for the heap: at compile-time, we don't actually know how much heap space we'll use. So we'll need to increment the rdi register as our program executes in order to make sure it always points at the next available chunk of heap memory.

Let's see how that works by implementing the pair operation:

compile.ml

let rec compile_exp (tab : int symtab) (stack_index : int) (exp : expr) :
    directive list =
  match exp with
  | Prim2 (Pair, e1, e2) ->
      compile_exp tab stack_index e1
      @ [Mov (stack_address stack_index, Reg Rax)]
      @ compile_exp tab (stack_index - 8) e2
      @ [ Mov (Reg R8, stack_address stack_index)
        ; Mov (MemOffset (Reg Rdi, Imm 0), Reg R8)
        ; Mov (MemOffset (Reg Rdi, Imm 8), Reg Rax)
        ; Mov (Reg Rax, Reg Rdi)
        ; Add (Reg Rdi, Imm 16) ]

All we're doing here is moving the arguments to pair into memory, storing the right address in rax, and making sure rdi still points at the last place in the heap.

Let's go ahead and implement the runtime now. First, we'll need to make sure rdi points at a region of memory we can write to. We'll do that like this:

runtime.c

#include <stdlib.h>

extern uint64_t entry(void *heap);

// some definitions elided...

int main(int argc, char **argv) {
  print_value(entry((void*)malloc(4096)));
  return 0;
}

We're allocating 4096 bytes of space to use as our heap, then passing its address into our entry function. In the x86-64 C calling convention, the first argument to a function is passed in the rdi register, so this will do what we want it to!

Now, let's implement print_value for our pairs. Here we're going to run into some trouble. After all, at runtime a pair value is just a memory address. These addresses are just 64-bit integers! How will we know that we're looking at the address of a pair and not at an integer, or a boolean, or some future other heap-allocated type? We want to tag our pair addresses just like we're tagging our other types. We really can't shift them, though–x86-64 might use the whole 64-bit number!

We're going to use a little trick in order to make this work. All of our values are 64-bit numbers, and we're storing values in memory; so, we've been modeling memory as an array of 8-byte cells. As long as we only ever access memory like this, we'll only ever increment rdi by multiples of 8. So as long as rdi starts out as a multiple of 8, all of our addresses will end in 0b000. And, malloc is guaranteed to return a multiple of 8 (actually 16). So, since all of our actual addresses will end in 0b000, we can use those last three bits to store a tag.

This tag can't end in 0b00, since that's used by numbers. And it can't end in 0b111, since our tags for booleans (and characters, on the homework) end in 0b111. All of the other three-bit sequences, however, are fair game! We'll use 0b010 for pairs. Here's how that looks in the runtime:

runtime.c

#define heap_mask 0b111
#define pair_tag 0b010

void print_value(uint64_t value) {
  if ((value & num_mask) == num_tag) {
    int64_t ivalue = (int64_t)value;
    printf("%" PRIi64, ivalue >> num_shift);
  } else if ((value & bool_mask) == bool_tag) {
    if (value >> bool_shift) {
      printf("true");
    } else {
      printf("false");
    }
  } else if ((value & heap_mask) == pair_tag) {
    uint64_t v1 = *(uint64_t*)(value - pair_tag);
    uint64_t v2 = *(uint64_t*)(value - pair_tag + 8);
    printf("(pair ");
    print_value(v1);
    printf(" ");
    print_value(v2);
    printf(")");
  } 
  else {
    printf("BAD %" PRIu64, value);
  }
}

(Remember to recompile the runtime with gcc -c runtime.c -o runtime.o.)

And in the compiler:

compile.ml

let heap_mask = 0b111

let pair_tag = 0b010

let rec compile_exp (tab : int symtab) (stack_index : int) (exp : s_exp) :
    directive list =
  match exp with
  | Prim2 (Pair, e1, e2) ->
      compile_exp tab stack_index e1
      @ [Mov (stack_address stack_index, Reg Rax)]
      @ compile_exp tab (stack_index - 8) e2
      @ [ Mov (Reg R8, stack_address stack_index)
        ; Mov (MemOffset (Reg Rdi, Imm 0), Reg R8)
        ; Mov (MemOffset (Reg Rdi, Imm 8), Reg Rax)
        ; Mov (Reg Rax, Reg Rdi)
        ; Or (Reg Rax, Imm pair_tag)
        ; Add (Reg Rdi, Imm 16) ]

Now we can correctly compile and run programs that create pairs:

$ compile_and_run "(pair 1 false)";;

The left and right operations are now straightforward: we just need to remove the pair tag and access the right location in memory.

compile.ml

let rec compile_exp (tab : int symtab) (stack_index : int) (exp : expr) :
    directive list =
  match exp with
  | Prim1 (Left, arg) ->
      compile_exp tab stack_index arg
      @ [Mov (Reg Rax, MemOffset (Reg Rax, Imm (-pair_tag)))]
  | Prim1 (Right, arg) ->
      compile_exp tab stack_index arg
      @ [Mov (Reg Rax, MemOffset (Reg Rax, Imm (-pair_tag + 8)))]

A note about garbage

Once we're done with some memory on the stack, we decrement our stack index and keep executing the program, using that memory again if necessary. What happens when we're done with memory on the heap?

Right now, nothing! We never decrement rdi. For long-running programs, this could be a big problem. We're only allocating 4096 bytes for our heap–that could run out!

Reclaiming unused allocated memory is called garbage collection. Efficiently determining which memory can be safely reclaimed is a tough problem, and fast garbage collection is the subject of ongoing research. For now, we're not going to worry about reclaiming heap space. We'll talk more about garbage collection later in the course.