Part - 9: Compiler Design: 9.1 Introduction To Compilers
[Figure: a language-processing system. Preprocessor → Source program → Compiler → Assembler]
THE GATE ACADEMY PVT.LTD. H.O.: #74, KeshavaKrupa (third Floor), 30th Cross, 10th Main, Jayanagar 4th Block, Bangalore-11
: 080-65700750, info@thegateacademy.com © Copyright reserved. Web: www.thegateacademy.com Page 293
Quick Refresher Guide Compiler Design
COMPILERS
A compiler is a program that reads a program written in one language – the source
language – and translates it into an equivalent program in another language – the target
language.
As an important part of this translation process, the compiler reports to its user the
presence of errors in the source program.
Applications:
- Design of interfaces
- Design of language migration tools
- Design of re-engineering tools
Two-Pass Assembly:
The simplest form of assembler makes two passes over the input, where a pass consists of
reading an input file once. In the first pass, all the identifiers that denote storage locations are
found and stored in a symbol table.
In the second pass, the assembler scans the input again. This time, it translates each operation
code into the sequence of bits representing that operation in machine language, and it translates
each identifier representing a location into the address given for that identifier in the symbol
table.
The output of the second pass is usually relocatable machine code, meaning that it can be loaded
starting at any location L in memory; i.e., if L is added to all addresses in the code, then all
references will be correct. Thus, the output of the assembler must distinguish those portions of
instructions that refer to addresses that can be relocated.
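The two passes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy instruction set in OPCODES, the "label:" syntax, and the (opcode, operand, relocatable) output format are all hypothetical, and every instruction is assumed to occupy one word.

```python
# Hypothetical toy assembly language: one word per instruction.
OPCODES = {"LOAD": 0x1, "ADD": 0x2, "STORE": 0x3, "JMP": 0x4, "HALT": 0xF}


def first_pass(lines):
    """Pass 1: record every label and its relative address in a symbol table."""
    symbols, address = {}, 0
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            symbols[line[:-1]] = address   # label names the next instruction
        elif line:
            address += 1                   # each instruction takes one word
    return symbols


def second_pass(lines, symbols):
    """Pass 2: translate opcodes to numbers and identifiers to addresses.

    Each output word is (opcode, operand, relocatable). The flag marks
    operands that are addresses and must be adjusted at load time.
    """
    code = []
    for line in lines:
        line = line.strip()
        if not line or line.endswith(":"):
            continue
        parts = line.split()
        op = OPCODES[parts[0]]
        if len(parts) == 1:
            code.append((op, 0, False))
        elif parts[1] in symbols:
            code.append((op, symbols[parts[1]], True))
        else:
            code.append((op, int(parts[1]), False))
    return code


program = """
start:
    LOAD 5
    ADD 7
    JMP end
    HALT
end:
    JMP start
""".splitlines()

symbols = first_pass(program)
code = second_pass(program, symbols)
```

Here the forward reference to "end" is the reason a second pass is needed: its address is unknown when "JMP end" is first read. The relocatable flag is one simple way for the assembler to mark the address portions of instructions.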
Loaders and Link-Editors:
A program called a loader performs the two functions of loading and link-editing.
The process of loading consists of taking relocatable machine code, altering the relocatable
addresses and placing the altered instructions and data in memory at the proper locations.
The link-editor makes a single program from several files of relocatable machine code.
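A minimal sketch of these two functions follows, assuming each module is a list of hypothetical (opcode, operand, relocatable) words whose relocatable operands are addresses relative to the start of that module. Resolving external references between files, which a real link-editor also does, is left out.

```python
def link(modules):
    """Link-edit: concatenate modules into a single program.

    Relocatable operands are shifted by the offset at which their module
    lands in the combined code, so they stay correct relative to the
    start of the combined program.
    """
    combined = []
    for module in modules:
        offset = len(combined)
        combined.extend(
            (op, operand + offset if reloc else operand, reloc)
            for op, operand, reloc in module
        )
    return combined


def load(code, base):
    """Load: add the load address to every relocatable operand."""
    return [(op, operand + base if reloc else operand)
            for op, operand, reloc in code]


# Two hypothetical modules of relocatable code.
module_a = [(0x1, 5, False), (0x4, 0, True)]
module_b = [(0x2, 7, False), (0x4, 1, True)]

image = load(link([module_a, module_b]), base=1000)
```

Only operands flagged as relocatable change; the constants 5 and 7 are left alone, which is why the assembler has to mark address fields explicitly.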
[Figure: phases of a compiler. Source program → Lexical Analyzer → stream of tokens → Syntax Analyzer → parse tree → Semantic Analyzer → annotated parse tree → Intermediate Code Generation → intermediate form → Code Optimization → optimized intermediate form → Code Generation → assembly program. Symbol Table Management and Error Handling interact with every phase.]
Symbol-Table Management
A symbol table is a data structure containing a record for each identifier, with fields for the
attributes of the identifier. The compiler uses it to manage information about variables
and their attributes.
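One possible shape for such a structure is sketched below. The class name, the method names, and the use of a stack of dictionaries for nested scopes are assumptions made for illustration; the attribute fields (type, offset) are likewise only examples.

```python
class SymbolTable:
    """One record (a dict of attributes) per identifier.

    Scopes are kept as a stack of dictionaries; the innermost scope is last.
    """

    def __init__(self):
        self.scopes = [{}]

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def insert(self, name, **attributes):
        """Add a record for `name` in the current scope."""
        if name in self.scopes[-1]:
            raise KeyError(f"redeclaration of {name!r}")
        self.scopes[-1][name] = attributes

    def lookup(self, name):
        """Return the record for `name`, searching innermost scope first."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.insert("rate", type="float", offset=0)
table.enter_scope()
table.insert("count", type="int", offset=8)
```

Later phases can then call table.lookup("rate") to retrieve the type for checking or the offset for code generation.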
The syntax and semantic analysis phases usually handle a large fraction of the errors
detectable by the compiler.
The lexical phase can detect errors where the characters remaining in the input do not form
any token of the language.
Errors where the token stream violates the structure rules (syntax) of the language are
detected by the syntax analysis phase.
Analysis of the source program consists of three phases:
1. Linear or lexical analysis, in which the stream of characters making up the source program
is read from left to right and grouped into tokens, which are sequences of characters
having a collective meaning.
2. Hierarchical or syntax analysis, in which characters or tokens are grouped hierarchically
into nested collections with collective meaning.
3. Semantic analysis, in which certain checks are performed to ensure that the components
of a program fit together meaningfully.
Lexical Analysis:
The lexical analyzer is the first phase of a compiler. Its main task is to read the input
characters and produce as output a sequence of tokens that the parser uses for syntax
analysis.
Sometimes, lexical analyzers are divided into a cascade of two phases, the first called
"scanning" and the second "lexical analysis."
The scanner is responsible for doing simple tasks, while the lexical analyzer does the more
complex operations.
Syntax Analysis:
Semantic Analysis:
The semantic analysis phase checks the source program for semantic errors and gathers
type information for the subsequent code-generation phase.
It uses the hierarchical structure determined by the syntax-analysis phase to identify the
operators and operands of expressions and statements.
An important component of semantic analysis is type checking.
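As a rough sketch of type checking, the function below walks an expression tree bottom-up and inserts an int-to-float conversion wherever an integer operand meets a floating-point one. The node classes, the function name check, and the identifier names initial and rate are hypothetical; only the types int and float are modelled.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: object


@dataclass
class Id:
    name: str


@dataclass
class BinOp:
    op: str
    left: object
    right: object


@dataclass
class IntToFloat:
    operand: object


def check(node, types):
    """Return (typed_node, type) for an expression tree.

    `types` maps identifier names to 'int' or 'float'. An IntToFloat node
    is wrapped around an int operand whose sibling is a float.
    """
    if isinstance(node, Num):
        return node, "float" if isinstance(node.value, float) else "int"
    if isinstance(node, Id):
        if node.name not in types:
            raise TypeError(f"undeclared identifier {node.name!r}")
        return node, types[node.name]
    left, left_type = check(node.left, types)
    right, right_type = check(node.right, types)
    if left_type == right_type:
        return BinOp(node.op, left, right), left_type
    if left_type == "int":
        left = IntToFloat(left)
    else:
        right = IntToFloat(right)
    return BinOp(node.op, left, right), "float"


# initial + rate * 12, with both identifiers declared as float
tree = BinOp("+", Id("initial"), BinOp("*", Id("rate"), Num(12)))
typed, result_type = check(tree, {"initial": "float", "rate": "float"})
```

After the call, the integer constant 12 sits under an IntToFloat node and the whole expression has type float. The check relies on the operator and operand structure already built by syntax analysis.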
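[Figure: syntax tree for an assignment of the form id = id + id × 12 after type checking, with an int-to-float conversion node inserted above the integer constant 12.]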
Code Optimization:
The code optimization phase attempts to improve the intermediate code, so that faster-running
machine code will result. Some optimizations are trivial.
- Improves efficiency
- Occupies less memory
- Executes faster
temp1 = id3 × 12.0
id1 = id2 + temp1
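The kind of improvement this phase makes can be sketched on three-address code. The example below is an illustration, not a general optimizer: the (dest, op, arg1, arg2) tuple format, the operation names inttofloat and copy, and the identifier names are assumptions. It applies two simple rewrites: converting an integer constant to floating point at compile time, and removing a final copy from a temporary.

```python
def optimize(code):
    """Apply two simple rewrites to (dest, op, arg1, arg2) instructions.

    1. 'inttofloat' of an integer constant is evaluated at compile time,
       and the resulting constant is substituted into later instructions.
    2. A final 'x = copy t' is removed by writing the previous result
       straight into x (valid here because t is not used anywhere else).
    """
    constants = {}
    folded = []
    for dest, op, a, b in code:
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "inttofloat" and isinstance(a, int):
            constants[dest] = float(a)       # no instruction is emitted
        else:
            folded.append((dest, op, a, b))
    if (len(folded) >= 2 and folded[-1][1] == "copy"
            and folded[-1][2] == folded[-2][0]):
        _, op, a, b = folded[-2]
        folded[-2:] = [(folded[-1][0], op, a, b)]
    return folded


unoptimized = [
    ("t1", "inttofloat", 12, None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
optimized = optimize(unoptimized)
```

Four instructions shrink to two: a multiplication by the constant 12.0 and an addition that stores directly into id1.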
Code Generation:
The final phase of the compiler is the generation of target code, consisting normally of
relocatable machine code or assembly code. Memory locations are selected for each of the
variables used by the program. Then, intermediate instructions are each translated into a
sequence of machine instructions that perform the same task. A crucial aspect is the assignment
of variables to registers.
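[Example: target code consisting of a MUL instruction, an ADD instruction, and a MOV that stores the result into an identifier, with the operand identifiers held in registers.]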
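The translation step can be sketched as follows. The two-operand MOV/ADD/MUL instruction syntax, the register names, and the convention that temporaries start with "t" are all assumptions for illustration. Each temporary simply gets a fresh register; a real code generator would reuse registers and handle the case where it runs out of them.

```python
def generate(code):
    """Translate (dest, op, arg1, arg2) instructions into a hypothetical
    two-operand assembly, keeping temporaries in registers."""
    mnemonic = {"+": "ADD", "*": "MUL"}
    registers = {}                     # temporary name -> register holding it
    out = []

    def operand(x):
        if x in registers:
            return registers[x]        # value already lives in a register
        if isinstance(x, (int, float)):
            return f"#{x}"             # immediate constant
        return x                       # memory location of a variable

    for dest, op, a, b in code:
        reg = f"R{len(registers) + 1}"
        out.append(f"MOV {reg}, {operand(a)}")             # reg <- arg1
        out.append(f"{mnemonic[op]} {reg}, {operand(b)}")  # reg <- reg op arg2
        if dest.startswith("t"):
            registers[dest] = reg                          # keep temporary
        else:
            out.append(f"MOV {dest}, {reg}")               # store to variable
    return out


three_address = [
    ("t2", "*", "id3", 12.0),
    ("id1", "+", "id2", "t2"),
]
assembly = generate(three_address)
```

Because t2 stays in R1, the second instruction sequence can use it without a load, which illustrates why the assignment of variables to registers matters.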
Lexical Analysis
The process of forming tokens from an input stream of characters is called tokenization
and the lexer categorizes them according to a symbol type.
int number;
The substring ‘number’ is a lexeme for the token “identifier” or “ID”, ‘int’ is a lexeme
for the token “keyword”, and ‘;’ is a lexeme for the token “;”.
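Tokenization of this declaration can be sketched with regular expressions. The token names, the keyword subset, and the punctuation set below are assumptions chosen for the example, not a complete lexer for any particular language.

```python
import re

KEYWORDS = {"int", "float", "if", "else", "while", "return"}  # assumed subset

TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("PUNCT",  r"[;,(){}=+\-*/]"),
    ("SKIP",   r"\s+"),
    ("ERROR",  r"."),              # any character no other pattern accepts
]
PATTERN = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(text):
    """Group characters into (token, lexeme) pairs, left to right."""
    tokens = []
    for match in PATTERN.finditer(text):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue                               # whitespace is discarded
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        if kind == "ID" and lexeme in KEYWORDS:
            kind = "KEYWORD"                       # reserved word, not an ID
        elif kind == "PUNCT":
            kind = lexeme                          # the token is the symbol itself
        tokens.append((kind, lexeme))
    return tokens


tokens = tokenize("int number;")
```

The ERROR pattern shows how the lexical phase can report input characters that do not form any token of the language.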