Homework 2 - Characters
Assignment Instructions
-
Accept the assignment on GitHub Classroom here.
-
Do the assignment 🐛.
-
Upload the assignment on Gradescope. The most convenient way to do this is by uploading your assignment repo through the integrated GitHub submission method on Gradescope, but you may also upload a .zip of your repo.
Code Structure and Assembly
This assignment comes with an interpreter and compiler that you will be expanding upon. This is the pattern that all future homeworks will follow. The general structure of each project is as follows:
./examples
: tests for compiler and interpreter. You will add test cases here everytime implementing a new language feature../lib
: the interesting parts of interpreters and compilers. Typically, you will modify both./lib/interp.ml
and./lib/compile.ml
in each assignment. Sometimes, you also need to modify./lib/runtime.c
../asm
: a library for producing assembly programs. Don't change anything there!./s_exp
: a library for working with S-expressions. Don't change anything there!./bin
: the "main functions" of the compiler and the interpreter. Don't change anything there!./test
: a library for testing your interpreter and compiler. Don't change anything there!
Our OCaml representation of assmebly in homeworks is identical to the representation you are familiar with from lecture. For reference, please use the x86-64 reference sheet on the course webpage. Additionally, the features already implemented in the stencil compiler contain relevant examples of assembly operations.
Introduction
In this homework, the interpreter and compiler in the stencil supports a language with numbers, booleans, and several unary operations. You'll be extending both the compiler and the interpreter with a new type for individual characters, as well as three additional unary operations:
(char? e)
, which returnstrue
if its argument is a character and false otherwise(char->num e)
, which converts its argument from a character to a number representing its ASCII code. When the argument is not a character, the result is undefined (and the interpreter should raise an exception).(num->char e)
, which converts its argument from a number representing an ASCII code to its corresponding character. When the argument is not a number, the result is undefined (and the interpreter should raise an exception).
The parser in the S_exp
module handles parsing characters. In our Lisp
dialect, characters are written as #\
followed by either a single (printable)
character, "newline", or "space". For instance, all of the following are
characters:
#\a
#\A
#\;
#\newline
#\space
Grammar
<bool> ::= true | false <expr> ::= <num> | <bool> | <char> | (<unary_op> <expr>) <unary_op> ::= num? | char? | add1 | sub1 | zero? | not | char->num | num->char
Testing
When you want to test if the result of a program (e.g. (add1 42)
) is as expected (e.g. 42
), you should:
- Write the program in an
examples/*.lisp
file (e.g.examples/test42.lisp
) - Write the expected output in an
examples/*.out
file (e.g.examples/test42.out
)
Some programs are expected to result in an error instead of a value. For example, (char->num 42)
should result in an error because 42
is not a char
. To test this behavior, you should use an *.err
file instead of *.out
. The content of *.err
files are ignored but you are free to put descriptions of why an error should occur there. (You are not required to make use of *.err
files in HW2, as we will not test your implementations against "bad" programs. In the future, we will explicitly state when you are required to throw errors.)
Task: Write a set of examples in the examples
directory that exercise
character functionality.
Similar to previous homeworks, you can compare the output of the compiler and the interpreter (as well as the
anticipated output in .out
files) by running dune runtest -f
.
Gradescope
When you submit your implementation to Gradescope, your suite of examples will be run against a reference interpreter and compiler. This is analogous to comparing your compiler with an existing implementation, one of the testing strategies we mentioned in class. You can always use Gradescope to check the expected behavior of an example.
If the reference implementation fails on any
of your examples, Gradescope will show you how its output differed from the
expected output of your example (if you wrote a *.out
file for it).
You can do this as many times as you want. We encourage you to use this option to develop a correct and thorough set of examples before you start working on your interpreter and compiler!
Grading
The completeness of your tests will be graded in two ways:
-
Chaff detection. Each assignment includes several chaffs—incorrect implementations of the project. Your test suite will be run against each chaff. If at least one of your tests fails for a given chaff, you’ll earn credit for “catching” it. There are usually 3–4 chaffs per assignment.
-
Manual review. TAs will also inspect your test suite to evaluate its overall thoroughness.
When writing your tests, make sure to (1) cover edge cases and (2) test how different features of the interpreter/compiler interact with each other.
Extending the interpreter
Inside of lib/interp.ml
, extend the interpreter with support for characters.
Task: Add Char
as a variant of the value
type (consisting of an OCaml
char
) and extend display_value
to support it.
Note: Be careful to handle #\space
and #\newline
correctly, as OCaml's
sprintf
will format them as '
' and '\n
'.
Task: Add support for character constants to interp_expr
.
Task: Add support for the new primitives operations to interp_primitive
.
Extending the runtime
Before extending the compiler, the runtime must be made aware of characters.
We'll represent characters during runtime as their ASCII values shifted left 8
places and tagged with 0b00001111
.
Task: Inside of lib/runtime/runtime.c
, extend the print_value
function
to handle printing characters in the #\a
form our Lisp dialect uses.
Note: Be careful to handle #\space
and #\newline
correctly, as C's
printf
will format them as '
' and '\n
'.
Extending the compiler
Inside of lib/compile.ml
, extend the compiler with support for characters.
Task: Add constants char_mask
and char_tag
to tag character immediates
as required by the runtime (similar to how we've tagged numbers and booleans).
Task: Implement a function operand_of_char : char -> operand
to make an
immediate from a character constant.
Task: Add handling for character constants to compile_expr
. This should be
similar to the handling for Num
and include a call to operand_of_char
.
Task: Add support for new character primitive operations to
compile_primitive
.
Running the compiler and the interpreter
You can run the interpreter on a file:
dune exec bin/interp.exe -- <file.lisp>
You can also run the compiler:
dune exec bin/compile.exe -- <file.lisp> output
The resulting .s
file (containing assembly code) and .exe
file (an
executable) will be in the output/
directory. You can run the executable with
./output/<file.lisp>.exe
You can also tell the compiler to run the executable immediately after compiling
by adding -r
, e.g.:
dune exec bin/compile.exe -- <file.lisp> output -r