Output and functions
We finished last class being able to take input from the user.
It makes sense to also be able to write output.
Specifically, we'll implement (print e) and (newline).
So far, our interactions with our compiler have been kind of REPL-like, maybe without the L: we read an S-expression, evaluate it, and print the result. Since every program in our language is a value, this makes sense. But when we add input and output side effects, it makes less sense. We might want to write a program that prints multiple things, or does a computation and prints nothing at all.
Adding output to the AST
On HW4, you may have seen that we provided you with a do function, which worked a lot like OCaml's ; operator.
This wasn't needed before because we weren't writing programs with side effects.
Now, it will be useful for us to do this kind of sequencing, so we'll implement do in the class interpreter and compiler as well.
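As a small illustration of what do should mean, here is a sketch in plain OCaml. The do_seq and step helpers are hypothetical names, not part of the class code: do_seq runs a list of thunks left to right and returns the value of the last one, the same way e1 ; e2 evaluates e1 for its side effect and then yields e2.

```ocaml
(* Hypothetical helper, for illustration only: run thunks in order and
   return the result of the last one, like OCaml's `;` chained together. *)
let do_seq (thunks : (unit -> 'a) list) : 'a =
  match thunks with
  | [] -> invalid_arg "do_seq: empty sequence"
  | first :: rest ->
      List.fold_left (fun _ thunk -> thunk ()) (first ()) rest

(* A log of side effects, so the evaluation order is observable. *)
let log : string list ref = ref []

(* Build a thunk that records its name and then returns a result. *)
let step (name : string) (result : int) : unit -> int =
 fun () ->
  log := name :: !log ;
  result

let last = do_seq [step "a" 1; step "b" 2; step "c" 3]

(* The log is built newest-first, so flip it to get evaluation order. *)
let order = List.rev !log
```

Here last is 3 and order is ["a"; "b"; "c"]: every step ran, in order, but only the final value was kept.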
ast.ml
type prim0 = ReadNum | Newline

let prim0_of_string (s : string) : prim0 option =
  match s with
  (* ... *)
  | "newline" -> Some Newline

type prim1 = Add1 | Sub1 | ZeroP | NumP | Not | Left | Right | Print

let prim1_of_string (s : string) : prim1 option =
  match s with
  (* ... *)
  | "print" -> Some Print

type expr =
  (* ... *)
  | Do of expr list
Adding output to the interpreter
Unsurprisingly, adding output to the interpreter looks a lot like adding input. We lean on OCaml's own output functions (here, output_string) to make it easy.
It's a somewhat arbitrary choice to have (print e) return true.
But it's an easy one to implement, and a sensible option: really, print is a "void" function, and we don't care about its return value.
interp.ml
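let output_channel = ref stdout

let rec interp_exp env (exp : expr) : value =
  match exp with
  (* ... *)
  | Do exps ->
      (* List.rev_map applies interp_exp to the expressions in order and
         returns the results reversed, so List.hd is the last value. *)
      exps |> List.rev_map (interp_exp env) |> List.hd
  | Prim1 (Print, e) ->
      (* Write through output_channel so tests can redirect the output. *)
      interp_exp env e |> string_of_value |> output_string !output_channel ;
      Boolean true
  | Prim0 Newline ->
      output_string !output_channel "\n" ;
      Boolean true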
We should also change our main calling function so that it doesn't print the return value automatically.
let interp (program : string) : unit =
  parse program |> interp_exp Symtab.empty |> ignore
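To see how these pieces fit together, here is a self-contained sketch of a much smaller interpreter. The expr and value types below are cut-down stand-ins for the class definitions, and output goes to a Buffer (an assumption made so the result is easy to inspect) rather than to a channel. Unlike the class code, an empty do yields true here, which is an arbitrary choice for the sketch.

```ocaml
(* Cut-down stand-ins for the class AST and value types. *)
type expr =
  | Num of int
  | Add1 of expr
  | Print of expr
  | Newline
  | Do of expr list

type value = Number of int | Boolean of bool

let string_of_value (v : value) : string =
  match v with
  | Number n -> string_of_int n
  | Boolean b -> if b then "true" else "false"

(* Assumption: collect output in a buffer instead of writing to stdout. *)
let out : Buffer.t = Buffer.create 16

let rec interp_exp (exp : expr) : value =
  match exp with
  | Num n -> Number n
  | Add1 e -> (
      match interp_exp e with
      | Number n -> Number (n + 1)
      | Boolean _ -> failwith "add1: expected a number" )
  | Print e ->
      Buffer.add_string out (string_of_value (interp_exp e)) ;
      Boolean true
  | Newline ->
      Buffer.add_string out "\n" ;
      Boolean true
  | Do exps ->
      (* Evaluate left to right; the value of the last expression wins. *)
      List.fold_left (fun _ e -> interp_exp e) (Boolean true) exps

(* (do (print 1) (newline) (print (add1 1))) *)
let result = interp_exp (Do [Print (Num 1); Newline; Print (Add1 (Num 1))])

let printed = Buffer.contents out
```

The program's value (result) is just true, while everything the user would see lives in printed. This is why the main function can safely ignore the return value.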
Adding output to the compiler
In the runtime
We have very little to do in the runtime.
In fact, we've already implemented a (C) version of print_value.
So we can just call this as an external function!
As in the interpreter, we only need to remove the call to it from our main calling function.
We'll add a function print_newline() that doesn't do very much.
runtime.c
int main(int argc, char **argv) {
  void *heap = (void *)malloc(4096);
  entry(heap);
  return 0;
}

void print_newline() {
  printf("\n");
}
We'll then recompile the runtime:
gcc -c runtime.c -o runtime.o
In the compiler
Our task list is familiar now. First, we'll make the assembler aware of the C functions we want to use.
compile.ml
let compile (program : expr) : string =
  [ Global "entry"
  ; Extern "error"
  ; Extern "read_num"
  ; Extern "print_value"
  ; Extern "print_newline"
  ; Label "entry" ]
  @ compile_exp Symtab.empty (-8) program
  @ [Ret]
  |> List.map string_of_directive
  |> String.concat "\n"
The compiler changes will look a lot like the read-num case.
Remember that the C calling convention expects to find the argument to print_value in rdi.
We have to make sure this value is there, and that we return the value true at the end in rax.
newline is even simpler, since we don't need to worry about the argument.
let rec compile_exp (tab : int symtab) (stack_index : int) (exp : expr) :
    directive list =
  match exp with
  | Do exps ->
      List.map (fun exp -> compile_exp tab stack_index exp) exps |> List.concat
  | Prim1 (Print, e) ->
      compile_exp tab stack_index e
      @ [ Mov (stack_address stack_index, Reg Rdi) (* save rdi *)
        ; Mov (Reg Rdi, Reg Rax) (* argument to print_value *)
        ; Add (Reg Rsp, Imm (align_stack_index stack_index))
        ; Call "print_value"
        ; Sub (Reg Rsp, Imm (align_stack_index stack_index))
        ; Mov (Reg Rdi, stack_address stack_index) (* restore rdi *)
        ; Mov (Reg Rax, operand_of_bool true) ]
  | Prim0 Newline ->
      [ Mov (stack_address stack_index, Reg Rdi) (* save rdi *)
      ; Add (Reg Rsp, Imm (align_stack_index stack_index))
      ; Call "print_newline"
      ; Sub (Reg Rsp, Imm (align_stack_index stack_index))
      ; Mov (Reg Rdi, stack_address stack_index) (* restore rdi *)
      ; Mov (Reg Rax, operand_of_bool true) ]
  (* ... *)

let compile_and_run (program : string) : unit =
  compile_to_file program ;
  ignore (Unix.system "nasm program.s -f elf64 -o program.o") ;
  ignore (Unix.system "gcc program.o runtime.o -o program -z noexecstack") ;
  ignore (Unix.system "./program")
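The align_stack_index helper is used above but not shown in this section. The following is a sketch of one plausible definition, not necessarily the class's: it assumes stack_index is a negative multiple of 8, and that rsp is 8 bytes off a 16-byte boundary when our code starts running (the caller's call instruction pushed a return address). The x86-64 C calling convention wants rsp to be 16-byte aligned at the point of a call.

```ocaml
(* Assumed definition, for illustration: the class helper may differ.
   rsp + offset is 16-byte aligned exactly when offset mod 16 = -8,
   given that rsp itself is 8 bytes off a 16-byte boundary. If the
   offset is not already of that form, move down one more 8-byte slot. *)
let align_stack_index (stack_index : int) : int =
  if stack_index mod 16 = -8 then stack_index else stack_index - 8

(* A few sample stack indices and their aligned offsets. *)
let aligned = List.map align_stack_index [-8; -16; -24; -32]
```

With these inputs, aligned is [-8; -24; -24; -40]. Rounding toward more negative offsets, rather than less negative ones, keeps every slot already in use (including the saved rdi at stack_index) at or above the adjusted rsp, so the called function cannot overwrite it.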
Updating difftest infrastructure
Our testing infrastructure is now all messed up.
First of all, our tests will need to print something; otherwise we won't be able to observe the results of the computations.
Second of all, we probably want to test (read-num), which expects user input!
We're not going to talk through the details of this testing infrastructure. The bottom line is, we should be able to give our tester input strings along with the programs we want to run, and compare the entire output.
interp.ml
let interp (program : string) : unit =
  interp_exp Symtab.empty (parse program) |> ignore

let interp_io (program : string) (input : string) =
  (* Unix.pipe returns (read end, write end). *)
  let input_pipe_ex, input_pipe_en = Unix.pipe () in
  let output_pipe_ex, output_pipe_en = Unix.pipe () in
  input_channel := Unix.in_channel_of_descr input_pipe_ex ;
  set_binary_mode_in !input_channel false ;
  output_channel := Unix.out_channel_of_descr output_pipe_en ;
  set_binary_mode_out !output_channel false ;
  let write_input_channel = Unix.out_channel_of_descr input_pipe_en in
  set_binary_mode_out write_input_channel false ;
  let read_output_channel = Unix.in_channel_of_descr output_pipe_ex in
  set_binary_mode_in read_output_channel false ;
  (* Feed the program its input, then close so it sees end-of-file. *)
  output_string write_input_channel input ;
  close_out write_input_channel ;
  interp program ;
  close_out !output_channel ;
  let r = input_all read_output_channel in
  (* Restore the default channels. *)
  input_channel := stdin ;
  output_channel := stdout ;
  r

let interp_err (program : string) (input : string) : string =
  try interp_io program input with BadExpression _ -> "ERROR"
compile.ml
let compile_and_run_io (program : string) (input : string) : string =
  compile_to_file program ;
  (* Same assembler and linker commands as compile_and_run
     (on macOS, nasm needs -f macho64 instead of -f elf64). *)
  ignore (Unix.system "nasm program.s -f elf64 -o program.o") ;
  ignore (Unix.system "gcc program.o runtime.o -o program -z noexecstack") ;
  let inp, outp = Unix.open_process "./program" in
  output_string outp input ;
  close_out outp ;
  let r = input_all inp in
  close_in inp ;
  r

let compile_and_run_err (program : string) (input : string) : string =
  try compile_and_run_io program input with BadExpression _ -> "ERROR"

let difftest (examples : (string * string) list) =
  let results =
    List.map
      (fun (ex, i) -> (compile_and_run_err ex i, Interp.interp_err ex i))
      examples
  in
  List.for_all (fun (r1, r2) -> r1 = r2) results

let test () = difftest [("(print (read-num))", "1")]
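The idea behind difftest can be tried out without a compiler at all. In the sketch below, the runner type and the two toy runners are hypothetical stand-ins for the interpreter and the compiled program: each takes a program and an input string and returns everything it printed. Instead of a single boolean, this variant returns the examples where the two disagree.

```ocaml
(* Hypothetical stand-in for "run this program on this input and return
   all of its output". *)
type runner = string -> string -> string

(* Return the (program, input) pairs on which the two runners disagree. *)
let failing_examples (run_a : runner) (run_b : runner)
    (examples : (string * string) list) : (string * string) list =
  List.filter
    (fun (program, input) -> run_a program input <> run_b program input)
    examples

(* Toy runners: both echo the input, but the second mishandles "0". *)
let echo : runner = fun _program input -> input

let buggy_echo : runner =
 fun _program input -> if input = "0" then "" else input

let examples = [("(print (read-num))", "1"); ("(print (read-num))", "0")]

let failures = failing_examples echo buggy_echo examples
```

Here failures contains only the example with input "0". Reporting the failing pairs is a small design choice that makes it easier to see which program and input broke.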
Functions
The next feature we add to our language won't take a ton of code, but it will be extremely powerful, increasing the scope of programs we can write: we'll be able to define functions.
Some example programs, to get in the spirit of things:
(define (id x) x)
(print (id 4))

(define (f x y) (+ x y))
(define (g x) (f x x))
(print (f 4 5))

(define (fib n)
  (if (< n 2)
    n
    (+ (fib (- n 1)) (fib (- n 2)))))
(print (fib (read-num)))

(define (even n) (or (zero? n) (odd (sub1 n))))
(define (odd n) (if (zero? n) false (even (sub1 n))))
(print (even (read-num)))
We'll rely on a few features of these programs.
- Programs are now lists of expressions, instead of single expressions!
- The definitions all come before the body.
- We're never going to pass a function like even as an argument to another function -- that is, functions are not values. (We'll get to this later in the course.)
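The first two points above suggest a simple way to split a parsed program into its definitions and its body. The sketch below is an illustration under assumptions, not the class implementation: the s_exp type is a cut-down stand-in for the parser's output, the name field of defn is a guess, and defn_of_s_exp is a hypothetical helper.

```ocaml
(* Cut-down s-expression type, standing in for the class parser's output. *)
type s_exp = Num of int | Sym of string | Lst of s_exp list

type defn = {name : string; args : string list; body : s_exp}

(* Recognize (define (name arg ...) body); anything else gives None. *)
let defn_of_s_exp (e : s_exp) : defn option =
  match e with
  | Lst [Sym "define"; Lst (Sym name :: params); body] ->
      let arg_name = function Sym x -> Some x | _ -> None in
      let names = List.filter_map arg_name params in
      (* Reject the definition if any parameter was not a symbol. *)
      if List.length names = List.length params then
        Some {name; args = names; body}
      else None
  | _ -> None

(* Split a program into its leading definitions and its final body. *)
let rec get_defns_and_body (exps : s_exp list) : defn list * s_exp =
  match exps with
  | [] -> failwith "program has no body"
  | [body] -> ([], body)
  | e :: rest -> (
      match defn_of_s_exp e with
      | Some d ->
          let defns, body = get_defns_and_body rest in
          (d :: defns, body)
      | None -> failwith "expected a definition before the body" )

(* (define (id x) x) (print (id 4)) *)
let program =
  [ Lst [Sym "define"; Lst [Sym "id"; Sym "x"]; Sym "x"]
  ; Lst [Sym "print"; Lst [Sym "id"; Num 4]] ]

let defns, body = get_defns_and_body program
```

For this program, defns holds the single definition of id and body is the (print (id 4)) expression.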
Some of the changes we make are structural, to account for lists of expressions as input. For now, we'll update just the interpreter.
Interpreter:
let rec interp_exp (defns : defn list) (env : value symtab) (exp : expr) :
    value =
  (* ... *)
  | Lst (Var f :: args) when is_defn defns f ->
      let defn = get_defn defns f in
      if List.length args = List.length defn.args then
        (* Evaluate the arguments, then bind them to the parameter names. *)
        let vals = List.map (interp_exp defns env) args in
        let fenv =
          List.combine defn.args vals
          |> List.fold_left
               (fun tab (arg, value) -> Symtab.add arg value tab)
               env
        in
        interp_exp defns fenv defn.body
      else raise (BadExpression exp)

let interp (program : string) : unit =
  let defns, body = parse_many program |> get_defns_and_body in
  interp_exp defns Symtab.empty body |> ignore
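To try the function-call rule in isolation, here is a self-contained sketch with a much smaller language. Everything in it is an assumption made for illustration: the expr type is cut down, integers stand in for all values (0 is false), and environments are association lists instead of a symtab. Unlike the code above, which extends the caller's environment, this sketch runs the body with only the parameters bound.

```ocaml
(* Cut-down expression type; the class interpreter has many more cases. *)
type expr =
  | Num of int
  | Var of string
  | Add of expr * expr
  | Sub of expr * expr
  | Lt of expr * expr
  | If of expr * expr * expr
  | Call of string * expr list

type defn = {name : string; args : string list; body : expr}

let rec interp_exp (defns : defn list) (env : (string * int) list)
    (exp : expr) : int =
  match exp with
  | Num n -> n
  | Var x -> List.assoc x env
  | Add (a, b) -> interp_exp defns env a + interp_exp defns env b
  | Sub (a, b) -> interp_exp defns env a - interp_exp defns env b
  | Lt (a, b) ->
      if interp_exp defns env a < interp_exp defns env b then 1 else 0
  | If (c, t, e) ->
      if interp_exp defns env c <> 0 then interp_exp defns env t
      else interp_exp defns env e
  | Call (f, actuals) ->
      let defn = List.find (fun d -> d.name = f) defns in
      if List.length actuals <> List.length defn.args then
        failwith ("wrong number of arguments to " ^ f)
      else
        (* Evaluate the arguments in the caller's environment... *)
        let vals = List.map (interp_exp defns env) actuals in
        (* ...then run the body with only the parameters bound. *)
        interp_exp defns (List.combine defn.args vals) defn.body

(* (define (fib n) (if (< n 2) n (+ (fib (- n 1)) (fib (- n 2))))) *)
let fib =
  { name = "fib"
  ; args = ["n"]
  ; body =
      If
        ( Lt (Var "n", Num 2)
        , Var "n"
        , Add
            ( Call ("fib", [Sub (Var "n", Num 1)])
            , Call ("fib", [Sub (Var "n", Num 2)]) ) ) }

let fib_10 = interp_exp [fib] [] (Call ("fib", [Num 10]))
```

Because the list of definitions is passed along unchanged on every call, fib can refer to itself, and mutually recursive functions like even and odd work the same way.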