LUND UNIVERSITY
Department of Computer Science
Lab 1, Optimizing Compilers, EDAN75
The purpose of this lab is to introduce you to the compiler and simulation
tool you will use in the design of your optimizing compiler. This is the
first of two labs and focusses on finding dominators in the control flow graph using the Lengauer-Tarjan algoritithm. The focus of the second
lab is on SSA Form.
- You need to run this lab on UNIX, and you need a C compiler which understands C99.
The lab environment consists of a compiler for a very small subset of C
and a RISC processor simulator. The compiler generates code for a
hypothetical RISC processor which it then simulates itself and reports
the number of clock cycles needed to execute the compiled program.
- Download the source code for the compiler according to instructions
given by your teacher.
- Untar it, e.g. by typing:
tar xvf vcc.tar
- Go to the vcc directory and type make. This should do the following:
- Build the compiler, called vcc.
- Compile the benchmark program a.c using gcc.
- Run the benchmark program, i.e. a.out, with output going into the
file correct.
- Compile and run (in one step) a.c using vcc
with output from the simulated program going to the file out.
- Do a diff out correct | head to check that we get the same output.
Output from diff indicates a compiler-bug.
vcc reports (on stderr) the number of clock cycles
needed to simulate the program (i.e., C = ...). vcc
also generates a file stats with detailed statistics.
- Remove the RETURN-statement in dom.c mentioned in the output from vcc.
- Look into the file func.h where vertex_t is defined.
- Make the depth-first search
procedure dfs() complete. It is in the file
dom.c. Type make which will produce a file cfg.dot.1.
As you can see, vcc still crashes but it should crash after having produced the file
cfg.dot.1.
-
Then use dot to visualize the CFG. A useful parameter to dot is -Tps to
generate a PostScript file which then can be converted to PDF with ps2pdf.
This file should show the control flow graph with nodes numbered using DFS.
- Insert a return statement after the call to print_cfg() in dominance().
Type make. You should now see the program running to completion and report the
number of clock cycles, and diff should not report any difference.
- Start a text editor with the file stats.
- First it shows
the total number of clock cycles needed, and then the number of
stall cycles due to the delayed-load architecture of the simulated
RISC processor. As you can see, the stall cycles is a significant
performance penalty. You can also see that not all of the source
code is executed so focus you optimizations on the code that will actually
contribute to the execution time!
- Then we statistics for each instruction.
- Start a text editor with the files a.c:* (i.e., a.c:0,
a.c:1, a.c:2, etc). The files a.c:N contain
three-address code which is dumped after certain steps during optimization.
- Make the rest of Lengauer-Tarjan algorithm complete (in the same file)
according to Algorithm 2.7 (naive version) in the book (you need to adapt the
algorithm slightly for our data structures). Use the fields domchild and
domsibling to construct the tree. The domchild points from an immediate
dominator to its first child and domsibling point to a sibling with the
same immediate dominator. See the file vcg.c how they are used.
- Inspect the dominator tree using the command sh rdot dt 2
and viewing the file dt.pdf.
(the number 2 may have changed if you printed additional graphs: look
for files dt.*.dot).
Requirements for the Lab
When you have performed the tasks above, contact your teacher and then
you will discuss what you have learnt during the lab.
Mon Sep 7 14:33:03 CEST 2020