Optimization: constant folding
Let's take a step back and look at the information pipeline in our compiler right now.
string --1--> token list --2--> s_exp --3--> ast_lam --4--> ast --5--> assembly --6--> binary
Each numbered arrow represents a transformation step.
- The tokenizer identifies language tokens in the input string.
- The parser turns a list of tokens into data with nested structure.
- A second "parsing" step abstracts away from the surface syntax.
- We eliminate a construct of the abstract syntax tree with lambda lifting.
- The main compilation step turns an AST into assembly code.
- The assembler and linker produce an executable binary.
We're done adding features to our language. From this point on, our goal is to improve this pipeline. We want to produce better executables -- for some definition of better! Mostly, we'll mean code that executes faster on our computers.
Different optimizations fit in this pipeline at different spots.
A common spot is at the AST level.
This kind of optimization has type `ast -> ast`: that is, it transforms an AST into an equivalent AST that has better performance characteristics.
We can visualize these as "loop" arrows at the AST stage of the pipeline.
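Concretely, our ASTs are values of type `Ast.expr`, so such a pass is a function of type `expr -> expr`, and passes compose by ordinary function application. Here's a minimal sketch (the `optimize` helper and its pass list are illustrative, not part of our compiler):

```ocaml
open Ast

(* A sketch: apply a sequence of AST-level optimization passes in order.
   Each pass maps an expr to an equivalent expr. *)
let optimize (passes : (expr -> expr) list) (e : expr) : expr =
  List.fold_left (fun e pass -> pass e) e passes
```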
Other kinds of optimizations (in particular, so-called "peephole optimizations") operate on the assembly code directly.
Constant folding
A while back, we talked about the possibility of a cheating compiler that would just call the interpreter and return the result. This worked when our programs were static. Introducing `(read-num)` ruled this out as a valid compilation strategy, since the return value was no longer known at compile time.
But some parts of our program are still static, and we can imagine identifying them and hard-coding their values into our compiled code. This optimization technique is known as constant folding.
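For example, `(+ 2 (+ 3 (- 5 4)))` doesn't depend on any input, so the compiler can replace it with `6` outright. In terms of the `expr` constructors from our `Ast` module, the rewrite we want is:

```ocaml
open Ast

(* before folding: (+ 2 (+ 3 (- 5 4))) *)
let before = Prim2 (Plus, Num 2, Prim2 (Plus, Num 3, Prim2 (Minus, Num 5, Num 4)))

(* after folding: the value is known at compile time *)
let after = Num 6
```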
Here's the implementation of constant folding:
constant_folding.ml
```ocaml
open Ast
open Util

let rec fold : expr -> expr = function
  | Prim1 (Add1, e) -> (
    (* fold the argument first; if it's now a literal, add at compile time *)
    match fold e with Num x -> Num (x + 1) | e -> Prim1 (Add1, e) )
  | Prim1 (Sub1, e) -> (
    match fold e with Num x -> Num (x - 1) | e -> Prim1 (Sub1, e) )
  | Prim1 (p, e) ->
      Prim1 (p, fold e)
  | Prim2 (Plus, e1, e2) -> (
    match (fold e1, fold e2) with
    | Num x, Num y ->
        Num (x + y)
    | e1, e2 ->
        Prim2 (Plus, e1, e2) )
  | Prim2 (Minus, e1, e2) -> (
    match (fold e1, fold e2) with
    | Num x, Num y ->
        Num (x - y)
    | e1, e2 ->
        Prim2 (Minus, e1, e2) )
  | Prim2 (p, e1, e2) ->
      Prim2 (p, fold e1, fold e2)
  (* for the remaining forms, fold subexpressions but keep the structure *)
  | If (e1, e2, e3) ->
      If (fold e1, fold e2, fold e3)
  | Let (v, e, b) ->
      Let (v, fold e, fold b)
  | Call (e, args) ->
      Call (fold e, List.map fold args)
  | Do exps ->
      Do (List.map fold exps)
  | e ->
      e

let fold_program (prog : program) =
  { defns=
      List.map
        (fun {name; args; body} -> {name; args; body= fold body})
        prog.defns
  ; body= fold prog.body }
```
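Folding is best-effort: subexpressions whose values aren't known until run time are left alone, and folding still happens around them. For instance (assuming the usual `Var` constructor for variable references, as in earlier chapters):

```ocaml
(* the static part (+ 1 2) folds to 3; the call to f is preserved *)
let e = Prim2 (Plus, Prim2 (Plus, Num 1, Num 2), Call (Var "f", []))

let folded = fold e
(* folded = Prim2 (Plus, Num 3, Call (Var "f", [])) *)
```

Notice also that an operation only folds when both operands are literals after folding; `fold` doesn't reassociate expressions, so something like `(+ (+ (f) 1) 2)` is left alone entirely.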
And we can add this optimization pass to our compiler:
compile.ml
```ocaml
let compile (program : s_exp list) : string =
  let prog = program_of_s_exps program in
  (* run the optimization pass before emitting any code *)
  let prog = Constant_folding.fold_program prog in
  [ Global "entry"
  ; Extern "error"
  ; Extern "read_num"
  ; Extern "print_value"
  ; Extern "print_newline"
  ; Label "entry" ]
  @ compile_exp prog.defns Symtab.empty (-8) prog.body true
  @ [Ret]
  @ (List.map (compile_defn prog.defns) prog.defns |> List.concat)
  |> List.map string_of_directive
  |> String.concat "\n"
```
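Note that `fold_program` runs on the fully lowered program, after `program_of_s_exps`: since it maps `fold` over every definition body as well as the main expression, definitions produced by lambda lifting get folded too.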
We can check that our optimization doesn't change the behavior of our programs:

> compile_and_run "(print (+ 2 (+ 3 (- 5 4))))"

The output should still be 6. And to see that the folding actually happened, we can look at the generated assembly: the whole arithmetic expression should be compiled down to a single constant rather than a sequence of additions and subtractions.
Safety check
There are two essential criteria for an optimization: it shouldn't change the meaning of our program, and it shouldn't make it any slower. Every time we discuss an optimization we should consider these criteria carefully.
And there are two related questions to ask: when can the optimization be applied, and when should it be applied?
In the case of constant folding, the answers to both are, pretty much, "always." This is a cheap, powerful optimization that does not threaten to change the meanings of our programs. But for other optimizations we may not have such a strong guarantee.
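To make the first criterion concrete, here's a hypothetical pass (invented for illustration, not something we'd add to our compiler) that has the right type but the wrong behavior:

```ocaml
open Ast

(* Hypothetical bad "optimization": collapse a do-block to its final
   expression. It has type expr -> expr, but it deletes effects --
   (do (print 1) 2) would no longer print anything. *)
let bad_pass : expr -> expr = function
  | Do exps -> ( match List.rev exps with last :: _ -> last | [] -> Do [] )
  | e -> e
```

A pass like this might well make programs faster, but it fails the first criterion, so it isn't an optimization at all.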