Doc Thesis Dissertation: Extensible Compiler Construction

Date: June 05, 2006 (Monday) at 13:15

Torbjörn Ekman defends his Ph D Thesis: Extensible Compiler Construction

The faculty's opponent: Dr. Gilad Bracha, Distinguished Engineer, Sun Microsystems

Processing of programs is a core area in computer science. A compiler that translates source text to machine language is the most well-known kind of tool in this area, but there are numerous other kinds of related applications: source-to-source translators, refactoring tools, reengineering tools, metrics tools, consistency checkers, etc. These tools perform similar analyses and can therefore benefit from shared infrastructure. This thesis addresses the problem of how to build program-processing tools much more easily, providing high-level concise ways of programming the tools, and providing good modularity and extensibility, allowing for a high degree of reuse between tools.

We present Rewritable Reference Attributed Grammars (ReRAGs), a technique that builds on well-known software development techniques such as object-orientation, aspect-oriented software development, declarative programming, attribute grammars, and transformation systems. ReRAGs combine mechanisms from these areas into one coherent framework with synergistic effects on modularity and extensibility. These mechanisms support several different decomposition criteria for modularization that enable re-use: as separate computations on the program model, as a base language and language extensions, and the same decomposition as used in a language specification for traceability. ReRAGs allow such modules to be decoupled from each other and automatically resolves complex context-sensitive dependences.

An evaluation algorithm for the formalism is presented and implemented in the JastAdd tool which combines ReRAGs with Java. We have implemented a complete Java 1.4 compiler to demonstrate the full potential of the JastAdd system and to evaluate its mechanisms for modularization and extensibility. We show how name analysis for Java with its complex visibility rules involving nested scopes, inheritance, qualified access, and syntactic ambiguities can be modularized in the same way as the informal Java language specification. We have also extended our Java compiler with non-null types for detection of possible null-pointer violations at compile time. The extension is completely modular and the technique allows for so called pluggable type systems that can be enabled at will.

The techniques scale to real languages and large applications. Our generated Java compiler passes as many tests as production use compilers during compliance testing, compiles applications larger than 100.000 lines of code, and the executable specification is less than two-thirds the size of handwritten compilers.

Room: LTH E:B


Last modified Dec 9, 2011 12:59 pm