1.2: Compiler Design Overview
The need for compilers arises from the need for high-level languages, which are better suited than machine language to problem solving and software development by humans. Moreover, high-level languages are machine independent.
Because a compiler is a relatively complex system, typically has many users and developers, and will be maintained for many years, a formal process should be used for its development. The development process should extend from requirements through verification and validation, and should include reviews, tests, analysis and measurement, quality assurance, configuration control, and key documentation. The development process is part of the overall system life cycle process, which additionally includes deployment, maintenance, disposal, and archival storage. The compiler development process should consist of procedures for writing and documenting the needs and requirements, the architecture and design, and the construction, integration, and verification and validation of the compiler. Documentation should also include the formal foundations and techniques used, the tradeoffs made, the alternatives evaluated, and the alternative chosen for design or implementation. Full coverage of all of these in detail is beyond the scope of this course. As we proceed through this course, however, we include high-level needs, requirements, functions, performance considerations, and verification and validation issues for a compiler and its parts.
1.2.1: One-Pass vs. Multi-Pass
Read the paragraph on one-pass and multi-pass compilers. This distinction was more relevant in the early days of computing, when computer memory was limited and processing was much slower than on current computers.
A "pass" refers to reading and processing the input once, that is, the source program or a representation of it. A one-pass compiler is simpler to implement. However, more sophisticated processing, such as optimization, may require multiple passes over the source representation.
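As a rough illustration of why a second pass can be needed, the sketch below resolves a forward reference to a label in a tiny assembly-like listing. The listing format, the instruction names, and the function `resolve_labels` are hypothetical and invented for this example; they do not come from the course readings.

```python
def resolve_labels(lines):
    """Resolve label operands using two passes over a tiny listing."""
    # Pass 1: record the address of every label definition ("name:").
    labels = {}
    address = 0
    for line in lines:
        if line.endswith(":"):
            labels[line[:-1]] = address
        else:
            address += 1  # each instruction occupies one address

    # Pass 2: emit instructions, replacing label operands with addresses.
    output = []
    for line in lines:
        if line.endswith(":"):
            continue  # label definitions produce no instruction
        op, *args = line.split()
        args = [str(labels.get(arg, arg)) for arg in args]
        output.append(" ".join([op] + args))
    return output


# "end" is used before it is defined, so a single left-to-right pass
# would not yet know its address when it reaches the jump.
program = ["jmp end", "push 1", "end:", "halt"]
resolved = resolve_labels(program)
print(resolved)
```

A one-pass design can still handle such a case, for instance by leaving a placeholder and patching it later, but that adds bookkeeping that the two-pass version avoids.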
The compiler is composed of components, which perform the steps of the compilation process. Some of the components may make a separate pass over the source representation, hence the name multi-pass. Multi-pass compilers are used for translating more complex languages (for example, high-level language to high-level language), and for performing more complete optimizations. Moreover, multi-pass compilers are easier to verify and validate, because testing and proof of correctness can be accomplished for the smaller components. Multi-pass compilers are also used for translation to an assembler or machine language for a theoretical machine, and for translation to an intermediate representation such as bytecode. Bytecode, or p-code (portable code), is a compact encoding that uses one-byte instruction codes, each followed by any parameters for data or addresses, to represent the results of syntax and semantic analysis. The bytecode or p-code is independent of any particular machine. It is executed by a virtual machine residing on the target machine.
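To make the idea of bytecode executed by a virtual machine more concrete, here is a minimal sketch of a stack-based interpreter. The opcode values and the instruction set (`PUSH`, `ADD`, `MUL`, `HALT`) are assumptions made up for this example and do not correspond to any real bytecode format.

```python
# Hypothetical one-byte opcodes for a tiny stack machine.
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(code: bytes) -> int:
    """Interpret the bytecode and return the value on top of the stack."""
    stack = []
    pc = 0  # program counter: index of the next byte to read
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH:
            stack.append(code[pc])  # a one-byte operand follows the opcode
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#x}")


# Bytecode for the expression (2 + 3) * 4.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
result = run(program)
print(result)
```

The same `program` bytes could be run unchanged on any machine that has an implementation of `run`, which is the sense in which bytecode is machine independent.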
1.2.2: Structure of a Compiler
Read the CS143 Course Overview handout. You have studied the lecture "Introduction to Compilers" already. The handout gives an overview of the structure of a compiler with a good explanation.
This overview describes the front-end process, or analysis stage, of the compilation process. The intermediate code used as an example is TAC, three-address code. The front end, or analysis, consists of lexical analysis, syntax analysis or parsing, semantic analysis, and intermediate code generation. Lexical analysis may be preceded by preprocessing, depending on the language and the development environment.
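The last front-end step, intermediate code generation, can be sketched for a single assignment statement. This example is only an approximation under stated assumptions: it borrows Python's own `ast` module to stand in for lexical and syntax analysis, skips semantic analysis entirely, and uses an invented function name (`to_tac`) and temporary naming scheme (`t1`, `t2`, ...) rather than the notation of the handout.

```python
import ast


def to_tac(source: str):
    """Translate one assignment using +, -, *, / into three-address code."""
    tree = ast.parse(source)  # stands in for lexical and syntax analysis
    code = []
    counter = 0
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def gen(node):
        # Return the name that holds the value of this subexpression.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            left = gen(node.left)
            right = gen(node.right)
            temp = new_temp()
            code.append(f"{temp} = {left} {ops[type(node.op)]} {right}")
            return temp
        raise NotImplementedError(type(node).__name__)

    stmt = tree.body[0]
    if not isinstance(stmt, ast.Assign) or not isinstance(stmt.targets[0], ast.Name):
        raise NotImplementedError("only simple assignments are handled")
    result = gen(stmt.value)
    code.append(f"{stmt.targets[0].id} = {result}")
    return code


tac = to_tac("a = b + c * d")
for line in tac:
    print(line)
```

Each generated instruction has at most one operator and at most three addresses (two operands and a destination), which is where the name three-address code comes from.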
It also explains the back end, or synthesis stage, of the compilation process, which includes intermediate code optimization, object code generation, and object code optimization. The symbol table and error-handling topics apply to both the front end and the back end. The section on one-pass versus multi-pass complements the previous explanation of the number of "passes" that a compiler uses.
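As one small example of intermediate code optimization, the sketch below performs constant folding, with simple constant propagation, over straight-line three-address code. The tuple representation `(dest, op, arg1, arg2)` and the function name `fold_constants` are assumptions chosen for this illustration; the handout may represent TAC differently, and real optimizers must also account for control flow.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(tac):
    """Fold constant expressions in straight-line three-address code.

    tac is a list of (dest, op, arg1, arg2); op None means a plain copy.
    """
    known = {}  # names currently known to hold an integer constant
    out = []
    for dest, op, a, b in tac:
        # Substitute operands whose constant value is already known.
        a = known.get(a, a)
        b = known.get(b, b)
        if op is None:
            value = a
        elif isinstance(a, int) and isinstance(b, int):
            value = OPS[op](a, b)  # evaluate at compile time
        else:
            value = None
        if isinstance(value, int):
            known[dest] = value
            out.append((dest, None, value, None))
        else:
            known.pop(dest, None)  # dest no longer holds a known constant
            out.append((dest, op, a, b))
    return out


tac = [
    ("t1", "*", 2, 3),       # t1 = 2 * 3
    ("t2", "+", "t1", "x"),  # t2 = t1 + x
    ("y", None, "t2", None), # y = t2
]
optimized = fold_constants(tac)
for instruction in optimized:
    print(instruction)
```

After folding, the assignment to `t1` is no longer used by any later instruction. Removing it would be the job of a separate dead-code elimination step, which is one reason optimizers are often organized as several passes.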
The handout also has a very good section on history, which helps us understand why the structure of a compiler is as described in previous sections. Read the Historical Perspective section, which shows the important influence of programming languages on the development of compilers.