Naming expressions with let
Now that we know how to access memory, we can add support for names. Our
language will support OCaml-style variable binding: variables are
immutable, and are bound in expressions using
let. Some examples:
(let ((x 1)) x) ; => 1 (let ((x 3)) (let ((y (+ x 2)) (+ x y))) ; => 8 (let ((x 3)) (let ((x (+ x 2)) (+ x x))) ; => 10
We'll start with updating our AST, then our interpreter, then work on the compiler.
Let expressions in the AST type
Let expressions are a great example of the nicer abstraction that our ASTs allow
compared to s-expressions.
The syntax for
let is kind of funny: notice the double-parenthesized
((x 1)) above.
This is how Scheme's
let works, so we've done it that way as well. In the homework compiler, we support
multiple variables, where this notation makes a bit more sense:
(let ((x 1) (y 2)) (+ x y))
But for a unary
let, it looks like silly syntax. Our AST type can remove this silliness.
type expr = | True | False | Num of int | Var of string | If of expr * expr * expr | Prim1 of prim1 * expr | Prim2 of prim2 * expr * expr | Let of string * expr * expr
let expression pairs a variable name, an expression to bind that name to, and a body.
Notice the other new constructor,
Var of string.
To refer to the new variable in the body expression, we use
(let ((x 1)) (add1 x)) would be represented by the AST
Let "x" (Num 1) (Prim1 (Add1, Var "x")).
We eliminate the ugly syntax with a new case for our
| Lst [Sym "let"; Lst [Lst [Sym s; e]]; body] -> Let (s, expr_of_s_exp e, expr_of_s_exp body)
But to parse variables in the body, we need to add another case too:
| Sym var -> Var var
We add this after the
false patterns, so that those fire first.
Names in the interpreter
How should the interpreter evaluate this expression?
It depends! If the expression is nested inside a
(let ((x 1)) x)
then the interpreter should evaluate it to its bound value. Otherwise,
x is meaningless. So it seems like the interpreter needs to do
different things depending on context. We can't just do
interp_expr (Sym "x") and expect it to do the right thing!
We'll supply this context via another argument to the interpreter. This
argument stores a mapping between names (i.e., strings) and values. In
util.ml, we've defined a type called
symtab (short for "symbol
table") that maps strings to some other value. We can use symbol tables
$ open Csci1260.Util;; $ let st : int symtab = Symtab.empty;; $ let st' = Symtab.add "x" 2 st;; $ Symtab.find "x" st';; 2 $ Symtab.find_opt "x" st;; Exception: Not_found
Symbol tables are sort of like dictionaries in Python or Java, except
that they are immutable: notice that adding a key to
returned a new symbol table instead of changing
st. (For the curious,
our symbol tables use OCaml's
which is implemented using balanced binary trees.) We'll be using only a
few functions to manipulate our tables:
Symtab.emptyis the empty symbol table, containing no mappings
Symtab.add var value streturns a symbol table with all of
st's mappings plus a mapping between
value; any previous mapping for
Symtab.mem var streturns
sthas a mapping for
Symtab.find var streturns the value
varis mapped to in
st, raising an exception if no such mapping is present
OK, so we're going to need to modify our interpreter to take a
value symtab as an argument:
let rec interp_exp (env: value symtab) (exp : expr) : value = ...
env is short for "environment". None of our existing expressions
modify the environment, so they'll all just pass it through in their
Now that we have our environment, we can use it to interpret names:
let rec interp_exp (env: value symtab) (exp : expr) : value = match exp with (* some cases elided ... *) | Var var when Symtab.mem var env -> Symtab.find var env | Var _ -> raise (BadExpression exp)
That "when" restricts when this pattern fires: we'll only evaluate the right-hand side of the pattern if the "when" condition is true.
We'll also need to update
parse program |> expr_of_s_exp |> interp_exp Symtab.empty |> string_of_value
Let's see our interpreter work:
$ Interp.interp_exp Symtab.empty (parse "x");; <BadExpression error> $ Interp.interp_exp (Symtab.add "x" (Number 1) Symtab.empty) (parse "x");; Number 1 $ Interp.interp_exp (Symtab.add "x" (Number 1) Symtab.empty) (parse "(+ x x)");; Number 2
Cool! Now we just need to implement
let, which updates this
environment. It's pretty straightforward:
let rec interp_exp (env : value symtab) (exp : s_exp) : value = match exp with (* some cases elided... *) | Let (var, e, body) -> let e_value = interp_exp env e in interp_exp (Symtab.add var e_value env) body
Names in the compiler
In the compiler, we'll start with
let rec compile_exp (stack_index : int) (exp : s_exp) : directive list = match exp with | Let (var, e, body) -> compile_exp stack_index e @ ...
Once we've compiled
e, what should we do with it? We know we want to
body, and somehow have references to
body be able to
"find" the value of
e. We definitely can't leave that value in
rax–as we saw last time, it will get overwritten right away. We'll
need to "remember" this value for later use. Let's put it in memory!
let stack_address (stack_index : int) = MemOffset (Reg Rsp, Imm stack_index) let rec compile_exp (stack_index : int) (exp : s_exp) : directive list = match exp with | Let (var, e, body) -> compile_exp tab stack_index e @ [Mov (stack_address stack_index, Reg Rax)] @ ...
stack_address helper function lets us avoid a little repetition;
we'll want to rewrite the stack code from last time to use it as well).
Now we want to compile
body, but make sure our compiler knows to get
the right value from memory if it runs into
Sym var. Just like in the
interpreter, we can do this with a symbol table. In the interpreter, our
table mapped variables to values; here, we'll need to map variables to
let stack_address (stack_index : int) = MemOffset (Reg Rsp, Imm stack_index) let rec compile_exp (tab : int symtab) (stack_index : int) (exp : s_exp) : directive list = match exp with | Let (var, e, body) -> compile_exp tab stack_index e @ [Mov (stack_address stack_index, Reg Rax)] @ compile_exp (Symtab.add var stack_index tab) (stack_index - 8) body | Var var when Symtab.mem var tab -> [Mov (Reg Rax, stack_address (Symtab.find var tab))] | Var _ -> raise (BadExpression exp)
Here's what the compiler produces for
(let ((x 1)) (+ x 2)):
entry: mov rax, 4 mov [rsp + -8], rax mov rax, [rsp + -8] mov [rsp + -16], rax mov rax, 8 mov r8, [rsp + -16] add rax, r8 ret