Homework 2 - Characters

Assignment Instructions

  1. Accept the assignment on GitHub Classroom here.

  2. Do the assignment 🐛.

  3. Upload the assignment on Gradescope. The most convenient way to do this is by uploading your assignment repo through the integrated GitHub submission method on Gradescope, but you may also upload a .zip of your repo.

Code Structure and Assembly

This assignment comes with an interpreter and compiler that you will be expanding upon. This is the pattern that all future homeworks will follow. The general structure of each project is as follows:

Our OCaml representation of assmebly in homeworks is identical to the representation you are familiar with from lecture. For reference, please use the x86-64 reference sheet on the course webpage. Additionally, the features already implemented in the stencil compiler contain relevant examples of assembly operations.

Introduction

In this homework, the interpreter and compiler in the stencil supports a language with numbers, booleans, and several unary operations. You'll be extending both the compiler and the interpreter with a new type for individual characters, as well as three additional unary operations:

The parser in the S_exp module handles parsing characters. In our Lisp dialect, characters are written as #\ followed by either a single (printable) character, "newline", or "space". For instance, all of the following are characters:

Grammar

<bool> ::= true
         | false
<expr> ::= <num>
         | <bool>
         | <char>
         | (<unary_op> <expr>)
<unary_op> ::= num? | char?
             | add1 | sub1 | zero?
             | not
             | char->num | num->char

Testing

When you want to test if the result of a program (e.g. (add1 42)) is as expected (e.g. 42), you should:

Some programs are expected to result in an error instead of a value. For example, (char->num 42) should result in an error because 42 is not a char. To test this behavior, you should use an *.err file instead of *.out. The content of *.err files are ignored but you are free to put descriptions of why an error should occur there. (You are not required to make use of *.err files in HW2, as we will not test your implementations against "bad" programs. In the future, we will explicitly state when you are required to throw errors.)

Task: Write a set of examples in the examples directory that exercise character functionality.

Similar to previous homeworks, you can compare the output of the compiler and the interpreter (as well as the anticipated output in .out files) by running dune runtest -f.

Gradescope

When you submit your implementation to Gradescope, your suite of examples will be run against a reference interpreter and compiler. This is analogous to comparing your compiler with an existing implementation, one of the testing strategies we mentioned in class. You can always use Gradescope to check the expected behavior of an example.

If the reference implementation fails on any of your examples, Gradescope will show you how its output differed from the expected output of your example (if you wrote a *.out file for it).

You can do this as many times as you want. We encourage you to use this option to develop a correct and thorough set of examples before you start working on your interpreter and compiler!

Grading

The completeness of your tests will be graded in two ways:

  1. Chaff detection. Each assignment includes several chaffs—incorrect implementations of the project. Your test suite will be run against each chaff. If at least one of your tests fails for a given chaff, you’ll earn credit for “catching” it. There are usually 3–4 chaffs per assignment.

  2. Manual review. TAs will also inspect your test suite to evaluate its overall thoroughness.

When writing your tests, make sure to (1) cover edge cases and (2) test how different features of the interpreter/compiler interact with each other.

Extending the interpreter

Inside of lib/interp.ml, extend the interpreter with support for characters.

Task: Add Char as a variant of the value type (consisting of an OCaml char) and extend display_value to support it.

Note: Be careful to handle #\space and #\newline correctly, as OCaml's sprintf will format them as ' ' and '\n'.

Task: Add support for character constants to interp_expr.

Task: Add support for the new primitives operations to interp_primitive.

Extending the runtime

Before extending the compiler, the runtime must be made aware of characters. We'll represent characters during runtime as their ASCII values shifted left 8 places and tagged with 0b00001111.

Task: Inside of lib/runtime/runtime.c, extend the print_value function to handle printing characters in the #\a form our Lisp dialect uses.

Note: Be careful to handle #\space and #\newline correctly, as C's printf will format them as ' ' and '\n'.

Extending the compiler

Inside of lib/compile.ml, extend the compiler with support for characters.

Task: Add constants char_mask and char_tag to tag character immediates as required by the runtime (similar to how we've tagged numbers and booleans).

Task: Implement a function operand_of_char : char -> operand to make an immediate from a character constant.

Task: Add handling for character constants to compile_expr. This should be similar to the handling for Num and include a call to operand_of_char.

Task: Add support for new character primitive operations to compile_primitive.

Running the compiler and the interpreter

You can run the interpreter on a file:

dune exec bin/interp.exe -- <file.lisp>

You can also run the compiler:

dune exec bin/compile.exe -- <file.lisp> output

The resulting .s file (containing assembly code) and .exe file (an executable) will be in the output/ directory. You can run the executable with

./output/<file.lisp>.exe

You can also tell the compiler to run the executable immediately after compiling by adding -r, e.g.:

dune exec bin/compile.exe -- <file.lisp> output -r