CH 1
COMPILER CONSTRUCTION
Salam R. Al-E’mari
Adham University College
Syllabus
• Prerequisites: 6803331-3 Programming Languages
• Textbook: “Compilers: Principles, Techniques, and Tools”, A. V. Aho, R. Sethi, J. D. Ullman; (c) 2010.
• Evaluation Plan:
Midterm exam: 20%
Final exam: 50%
Project: 15%
Homework: 10%
Quiz: 5%
CHAPTER 1
INTRODUCTION
Outline
1. Compilers and Interpreters
2. The structure of a compiler
3. Why learn about compilers?
4. The Evolution of Programming Languages
5. Summary
Compilers and Interpreters
• “Compilation”
• Translation of a program written in a source language into a
semantically equivalent program written in a target language
Source Program -> Compiler -> Target Program
Input -> Target Program
What is a compiler?
• A compiler is a program that translates (or compiles) a program written
in a high-level programming language (the source language), which is
suitable for human programmers, into the low-level machine
language (the target language) that is required by computers.
• During this process, the compiler also attempts to spot and
report obvious programmer mistakes that it detects during
translation.
Why do we use high-level languages for programming?
Using a high-level language for programming has a large impact on how fast programs
can be developed. The main reasons for this are:
1. Compared to machine language, the notation used by programming
languages is closer to the way humans think about problems.
2. The compiler can spot some obvious programming mistakes.
3. Programs written in a high-level language tend to be shorter than
equivalent programs written in machine language.
4. The same program can be compiled to many different machine
languages and, hence, be brought to run on many different machines.
Compilers and Interpreters (cont’d)
• “Interpretation”
• Performing the operations implied by the
source program
Source Program + Input -> Interpreter -> Output, Error messages
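As a minimal sketch of interpretation, the Python fragment below performs the operations of a tiny source program directly, one statement at a time, without producing a target program. The statement form (`name = operand + operand`) and the function name `interpret` are assumptions made for illustration only.

```python
def interpret(source, inputs):
    """Execute statements of the form 'x = a + b' directly; return all variables."""
    env = dict(inputs)  # the program's input values
    for lineno, line in enumerate(source.strip().splitlines(), start=1):
        target, _, expr = line.partition("=")
        if not expr:
            raise SyntaxError(f"line {lineno}: expected '='")
        total = 0
        for operand in expr.split("+"):
            operand = operand.strip()
            if operand.isdigit():
                total += int(operand)          # numeric literal
            elif operand in env:
                total += env[operand]          # previously defined variable
            else:
                # Errors are reported for the statement being interpreted.
                raise NameError(f"line {lineno}: undefined name {operand!r}")
        env[target.strip()] = total
    return env

program = """
a = b + c
d = a + 1
"""
result = interpret(program, {"b": 2, "c": 3})
print(result["d"])  # 6
```

Note that the source text is analyzed again on every run, which is one reason interpretation tends to be slower than running compiled code.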
Compiler vs. Interpreter
Compiler
• Takes the entire program as input.
• It is faster.
• Intermediate object code is generated.
• Requires more memory, due to the intermediate object code.
• The program does not need to be compiled every time.
• Errors are displayed after the entire program is checked.
• Debugging is comparatively hard.
• Ex: C, C++.
Interpreter
• Takes a single instruction as input.
• It is slower.
• No intermediate code is generated.
• Requires less memory, as no intermediate code is generated.
• Every time, the higher-level program is converted into a lower-level program.
• Errors are displayed for every instruction interpreted.
• Debugging is easy.
• Ex: Python, Ruby, BASIC.
Hybrid compiler
Compilation and interpretation may be combined to implement a programming
language: the compiler may produce intermediate-level code, which is then
interpreted rather than compiled to machine code.
Ex: Java
Source Program -> Translator (Compiler) -> Intermediate Program
Intermediate Program + Input -> Virtual machine (Interpreter) -> Output
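The hybrid scheme can be sketched in a few lines of Python: a translator turns the source into an intermediate program, and a separate virtual machine interprets that program on the input. The instruction names (`PUSH`, `LOAD`, `ADD`) and both function names are hypothetical and are not taken from any real virtual machine.

```python
def compile_to_bytecode(expr):
    """Translator: turn 'b + c + 1' into code for a hypothetical stack machine."""
    code = []
    for i, operand in enumerate(expr.split("+")):
        operand = operand.strip()
        if operand.isdigit():
            code.append(("PUSH", int(operand)))   # numeric literal
        else:
            code.append(("LOAD", operand))        # variable read from the input
        if i > 0:
            code.append(("ADD", None))            # combine with the running sum
    return code

def run(code, inputs):
    """Virtual machine: interpret the intermediate program."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        elif op == "LOAD":
            stack.append(inputs[arg])
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
    return stack.pop()

bytecode = compile_to_bytecode("b + c + 1")
print(bytecode)
print(run(bytecode, {"b": 2, "c": 3}))  # 6
```

The intermediate program is produced once and can then be run on many different inputs without translating the source again.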
The Analysis-Synthesis Model of Compilation
• There are two parts to compilation:
• Analysis determines the operations implied by the source program, which are
recorded in a tree structure
• Synthesis takes the tree structure and translates the operations therein into the
target program
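A small sketch of the two parts, assuming for convenience that the source language is a subset of Python so that the standard `ast` module can do the analysis. The target instructions (`LOAD`, `PUSH`, `ADD`, `STORE`) belong to a hypothetical stack machine invented for this example.

```python
import ast

def analyze(source):
    """Analysis: build a tree that records the operations in the source."""
    return ast.parse(source)

def synthesize(tree):
    """Synthesis: walk the tree and emit code for a hypothetical stack machine."""
    out = []

    def emit(node):
        if isinstance(node, ast.Module):
            for stmt in node.body:
                emit(stmt)
        elif isinstance(node, ast.Assign):
            emit(node.value)                          # evaluate right-hand side first
            out.append(f"STORE {node.targets[0].id}")
        elif isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            emit(node.left)
            emit(node.right)
            out.append("ADD")
        elif isinstance(node, ast.Name):
            out.append(f"LOAD {node.id}")
        elif isinstance(node, ast.Constant):
            out.append(f"PUSH {node.value}")
        else:
            raise NotImplementedError(type(node).__name__)

    emit(tree)
    return out

target = synthesize(analyze("a = b + c"))
print("\n".join(target))
# LOAD b
# LOAD c
# ADD
# STORE a
```

Because synthesis only looks at the tree, the same analysis step could feed a different `synthesize` function for another target.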
Other Tools that Use the Analysis-Synthesis Model
• Editors (syntax highlighting)
• Pretty printers (e.g. doxygen)
• Static checkers (e.g. lint and splint)
• Interpreters
• Text formatters (e.g. TeX and LaTeX)
• Silicon compilers (e.g. VHDL)
• Query interpreters/compilers (Databases)
Preprocessors, Compilers, Assemblers, and Linkers
Skeletal Source Program
-> Preprocessor -> Source Program
-> Compiler -> Target Assembly Program
-> Assembler -> Relocatable Object Code
-> Linker (together with Libraries and Relocatable Object Files)
Try for example: gcc -v myprog.c
Compiler-Construction Tools
1. Parser generators that automatically produce syntax analyzers from a grammatical
description of a programming language.
2. Scanner generators that produce lexical analyzers from a regular-expression description
of the tokens of a language.
3. Syntax-directed translation engines that produce collections of routines for walking a
parse tree and generating intermediate code.
4. Code-generator generators that produce a code generator from a collection of rules for
translating each operation of the intermediate language into the machine language for a
target machine.
5. Data-flow analysis engines that facilitate the gathering of information about how values
are transmitted from one part of a program to each other part. Data-flow analysis is a key
part of code optimization.
6. Compiler-construction toolkits that provide an integrated set of routines for
constructing various phases of a compiler.
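To illustrate the idea behind item 2, the sketch below builds a scanner function from a regular-expression description of the tokens, using Python's standard `re` module. The function name `make_scanner` and the token names are assumptions for this example; real scanner generators are considerably more elaborate.

```python
import re

def make_scanner(token_spec):
    """Build a scanner from (token_name, regex) pairs, tried in the given order."""
    pattern = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in token_spec))

    def scan(text):
        tokens, pos = [], 0
        while pos < len(text):
            m = pattern.match(text, pos)
            if m is None:
                raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
            if m.lastgroup != "SKIP":          # drop whitespace tokens
                tokens.append((m.lastgroup, m.group()))
            pos = m.end()
        return tokens

    return scan

scan = make_scanner([
    ("ID", r"[A-Za-z_]\w*"),
    ("NUM", r"\d+"),
    ("OP", r"[=+;]"),
    ("SKIP", r"\s+"),
])
tokens = scan("A = B + 42;")
print(tokens)
# [('ID', 'A'), ('OP', '='), ('ID', 'B'), ('OP', '+'), ('NUM', '42'), ('OP', ';')]
```

Changing the language's tokens only requires editing the specification list, not the scanning loop.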
Why learn about compilers?
• It is considered a topic that you should know in order to be “well-cultured” in
computer science.
• A good craftsman should know his tools, and compilers are important tools
for programmers and computer scientists.
• The techniques used for constructing a compiler are useful for other
purposes as well.
• There is a good chance that a programmer or computer scientist will need to
write a compiler or interpreter for a domain-specific language.
The Evolution of Programming Languages
Classification by generation
First-generation languages: machine languages
Second-generation languages: assembly languages
Third-generation languages: higher-level languages like Fortran, Cobol, Lisp, C, C++,
C#, and Java.
Fourth-generation languages: languages designed for specific applications,
like NOMAD for report generation, SQL for database queries, and Postscript
for text formatting.
Fifth-generation languages: the term has been applied to logic- and constraint-based
languages like Prolog and OPS5.
Impacts on Compilers
• The advances in programming languages placed new
demands on compiler writers.
• Compiler writers have to take maximal advantage of new
hardware capabilities.
• Good software-engineering techniques are essential for
creating and evolving modern language processors.
The Phases of a Compiler

Phase: Programmer
Output: Source string
Sample: A=B+C;

Phase: Scanner (performs lexical analysis)
Output: Token string and symbol table for identifiers
Sample: ‘A’, ‘=’, ‘B’, ‘+’, ‘C’, ‘;’

Phase: Parser (performs syntax analysis based on the grammar of the programming language)
Output: Parse tree or abstract syntax tree
Sample:
    ;
    |
    =
   / \
  A   +
     / \
    B   C
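The scanner and parser rows can be reproduced for the sample `A=B+C;` with a short Python sketch. The grammar in the docstring, the nested-tuple tree format, and the function names are assumptions chosen to match the sample; they are not a general-purpose design.

```python
import re

def scan(source):
    """Scanner: return the token string and a symbol table of identifiers."""
    # Characters outside this toy token set are silently skipped by findall.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[=+;]", source)
    symbols = sorted({t for t in tokens if t[0].isalpha() or t[0] == "_"})
    return tokens, symbols

def parse(tokens):
    """Parser for the assumed grammar: stmt -> ID '=' ID ('+' ID)* ';'"""
    pos = 0

    def take(expected=None):
        """Consume one token; with no argument, require an identifier."""
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        if expected is None:
            if not (tok[0].isalpha() or tok[0] == "_"):
                raise SyntaxError(f"expected identifier, got {tok!r}")
        elif tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    target = take()
    take("=")
    expr = take()
    while pos < len(tokens) and tokens[pos] == "+":
        take("+")
        expr = ("+", expr, take())     # left-associative addition
    take(";")
    return (";", ("=", target, expr))

tokens, symbols = scan("A=B+C;")
tree = parse(tokens)
print(tokens)   # ['A', '=', 'B', '+', 'C', ';']
print(symbols)  # ['A', 'B', 'C']
print(tree)     # (';', ('=', 'A', ('+', 'B', 'C')))
```

The nested tuples mirror the tree in the sample: `;` at the root, `=` below it, and `+` as the right child of `=`.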
• Passes
• A collection of phases may be executed only once (single pass) or multiple times
(multi pass)
• Single pass: usually requires everything to be defined before being used in
the source program
• Multi pass: the compiler may have to keep the entire program representation in memory
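The difference between the two strategies can be sketched with a toy check for undefined names. The program representation (a list of `("def", name)` and `("use", name)` pairs) and both function names are hypothetical simplifications.

```python
def check_single_pass(statements):
    """One pass: a name must be defined before it is used."""
    defined, errors = set(), []
    for kind, name in statements:
        if kind == "def":
            defined.add(name)
        elif name not in defined:
            errors.append(f"{name} used before definition")
    return errors

def check_two_pass(statements):
    """Pass 1 collects every definition; pass 2 checks each use against them."""
    # The whole program must be available for the second traversal.
    defined = {name for kind, name in statements if kind == "def"}
    return [f"{name} is undefined"
            for kind, name in statements
            if kind == "use" and name not in defined]

program = [("use", "helper"), ("def", "helper"), ("use", "missing")]
print(check_single_pass(program))
# ['helper used before definition', 'missing used before definition']
print(check_two_pass(program))
# ['missing is undefined']
```

The single-pass check rejects the forward reference to `helper`, while the two-pass check accepts it at the cost of traversing the stored program twice.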
Compiler-Construction Tools
Software development tools are available to implement
one or more compiler phases:
• Scanner generators
• Parser generators
• Syntax-directed translation engines
• Automatic code generators
• Data-flow engines
Summary
• Language Processors: An integrated software development environment includes many
different kinds of language processors such as compilers, interpreters, assemblers,
linkers, loaders, debuggers, profilers.
• Compiler Phases: A compiler operates as a sequence of phases, each of which
transforms the source program from one intermediate representation to another.
• Lexical Analyzer
• Syntax Analyzer
• Semantic Analyzer
• Intermediate Code Generator
• Machine-Independent Code Optimizer
• Code Generator
• Machine-Dependent Code Optimizer
• Machine and Assembly Languages: Machine languages were the first generation
programming languages, followed by assembly languages. Programming in these
languages was time consuming and error prone.