
Compiler Design

CSE 353
Course Outcomes

CO1: Explain the concepts and different phases of compilation with compile-time error handling.
CO2: Represent language tokens using regular expressions, context-free grammars and finite automata, and design a lexical analyzer for a language.
CO3: Compare top-down with bottom-up parsers, and develop an appropriate parser to produce a parse-tree representation of the input.
CO4: Design syntax-directed translation schemes for a given context-free grammar.
CO5: Generate intermediate code for statements in a high-level language; explain the benefits and limitations of automatic memory management.
CO6: Apply optimization techniques to intermediate code and generate machine code for a high-level language program.
Introduction to the Language Processing System

(Diagram: HLL source program → pre-processor → pure HLL → compiler → assembly → assembler → relocatable machine code → loader/linker → executable.)
• Pre-Processor – The pre-processor handles all the #include directives by inserting the named files (file inclusion) and all the #define directives by macro expansion. It performs file inclusion, macro processing, augmentation, etc.

• Assembly Language – It is neither in binary form nor high-level. It is an intermediate form that combines machine instructions with other useful data needed for execution.

• Assembler – Every platform (hardware + OS) has its own assembler; assemblers are not universal. An assembler translates assembly language into machine code, and its output is called an object file.

• Interpreter – An interpreter, like a compiler, converts high-level language into low-level machine language, but the two differ in how they read the input. A compiler scans the entire program and translates it as a whole into machine code, whereas an interpreter translates and executes the program one statement at a time. Interpreted programs are therefore usually slower than compiled ones.
• Relocatable Machine Code – Code that can be loaded at any point in memory and run. Addresses within the program are kept in a form that accommodates moving the program in memory.

• Loader/Linker – The linker combines a number of object files into a single executable file. The loader then loads it into memory, converts the relocatable code into absolute code, and runs the program, resulting in a running program or an error message (or sometimes both).
Compiler
A compiler is a translator program that takes a program written in a high-level language (HLL), the source program, and translates it into an equivalent program in machine-level language (MLL), the target program. An important part of a compiler's job is reporting errors in the source program to the programmer.

Structure of Compiler
Executing a program written in an HLL consists of two basic steps: the source program must first be compiled (translated) into an object program; the resulting object program is then loaded into memory and executed.

(Figure: execution process of a source program in a compiler.)


LIST OF COMPILERS
1. Ada compilers
2. ALGOL compilers
3. BASIC compilers
4. C# compilers
5. C compilers
6. C++ compilers
7. COBOL compilers
8. Common Lisp compilers
9. ECMAScript interpreters
10. Fortran compilers
11. Java compilers
12. Pascal compilers
13. PL/I compilers
14. Python compilers
15. Smalltalk compilers
STRUCTURE OF THE COMPILER DESIGN
Phases of a compiler: A compiler operates in phases. A phase is a logically interrelated operation that takes the source program in one representation and produces output in another representation.

Compilation has two broad parts:

a. Analysis (machine independent / language dependent)
b. Synthesis (machine dependent / language independent)

The compilation process is partitioned into a number of sub-processes called 'phases'.


Structure of Compiler

(Diagram: source program → lexical analyzer → stream of tokens → syntax analyzer → parse tree → semantic analyzer → refined parse tree → intermediate code generator → three-address code → code optimizer → optimized code → code generator → target program.)
Lexical Analyzer -

• It is the first phase of the compiler.
• It gets input from the source program and produces tokens as output.
• It reads the characters one by one, from left to right, and forms tokens.
• Token: a logically cohesive sequence of characters. In a + b = 20, the lexemes a, b, +, = and 20 each form a separate token. The group of characters forming a token is called the lexeme.
• The lexical analyzer not only generates tokens but also classifies them (keywords, operators, identifiers, special symbols, etc.) and performs bookkeeping, for example entering a lexeme into the symbol table if it is not already there.
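
To make this concrete, here is a minimal lexer sketch in Python (an illustration only, not the course's reference implementation; the token names and the TOKEN_SPEC table are assumptions made for this example). It scans left to right, matches a pattern at the current position, and enters identifier lexemes into a symbol table:

import re

# (token name, pattern); the matched text is the lexeme, and the pair
# (name, lexeme) is the token. Rule order matters: first match wins here.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),            # whitespace is discarded, not emitted
]

def tokenize(code):
    pos = 0
    symbol_table = {}              # lexeme -> attributes, filled as IDs appear
    while pos < len(code):
        for name, pattern in TOKEN_SPEC:
            m = re.match(pattern, code[pos:])
            if m:
                lexeme = m.group(0)
                pos += len(lexeme)
                if name == "ID" and lexeme not in symbol_table:
                    symbol_table[lexeme] = {}   # enter lexeme if not present
                if name != "SKIP":
                    yield (name, lexeme)
                break
        else:
            raise SyntaxError(f"lexical error at position {pos}")

print(list(tokenize("a + b = 20")))
# [('ID', 'a'), ('OP', '+'), ('ID', 'b'), ('OP', '='), ('NUMBER', '20')]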

Syntax Analyzer –

• It is the second phase of the compiler. It is also known as the parser.
• It gets the token stream as input from the lexical analyzer and generates a syntax tree as output.
• Syntax tree: a tree in which interior nodes are operators and leaf nodes are operands.
• Example: for a = b + c * 2, the syntax tree is

        =
       / \
      a   +
         / \
        b   *
           / \
          c   2
Semantic Analyzer –
•It is the third phase of the compiler.
•It gets the parse tree from the syntax analyzer as input and checks whether the given construct is semantically correct.
•It performs type checking and implicit type conversions where required (e.g., converting integer operands to real).

Intermediate Code Generator –


•It is the fourth phase of the compiler.
•It gets input from the semantic analyzer and converts it into an intermediate representation such as three-address code.
•The three-address code consists of a sequence of instructions, each of which has at most three operands.
Example: for a = b + c * 2:
t1 = c * 2
t2 = b + t1
a = t2
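
As an illustration, the following Python sketch generates three-address code from a small tuple-encoded expression tree for a = b + c * 2 (the tree encoding and the helper new_temp are assumptions made for this example, not a fixed compiler API):

temp_count = 0
def new_temp():
    global temp_count
    temp_count += 1
    return f"t{temp_count}"

def gen(node, code):
    """Emit three-address instructions for a tuple tree; return the
    name (temporary, identifier or constant) holding the result."""
    if isinstance(node, str):              # leaf: identifier or constant
        return node
    op, left, right = node                 # interior node: (operator, l, r)
    l = gen(left, code)
    r = gen(right, code)
    t = new_temp()
    code.append(f"{t} = {l} {op} {r}")     # each instruction: <= 3 operands
    return t

tree = ("+", "b", ("*", "c", "2"))         # a = b + c * 2
code = []
code.append(f"a = {gen(tree, code)}")
print("\n".join(code))
# t1 = c * 2
# t2 = b + t1
# a = t2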
CODE OPTIMIZATION:
• It is the fifth phase of the compiler.
• It gets the intermediate code as input and produces optimized intermediate code as
output.
•This phase reduces the redundant code and attempts to improve the intermediate
code so that faster-running machine code will result.
•Code optimization does not change the result of the program.
•To improve the generated code, optimization involves:
  - detection and removal of dead (unreachable) code
  - calculation of constants in expressions and terms (constant folding)
  - collapsing of repeated expressions into a temporary variable (common sub-expression elimination)
  - loop unrolling
  - moving loop-invariant code outside the loop
  - removal of unwanted temporary variables
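
As a small illustration of one item above, the calculation of constants (constant folding), this Python sketch folds constant sub-expressions in three-address instructions; the instruction format "target = operand op operand" is an assumption made for this example:

def fold_constants(instructions):
    optimized = []
    for inst in instructions:
        target, expr = inst.split(" = ")
        parts = expr.split()
        # fold only when both operands are integer literals
        if len(parts) == 3 and parts[0].isdigit() and parts[2].isdigit():
            value = eval(expr)      # safe here: two digits and one operator
            optimized.append(f"{target} = {value}")
        else:
            optimized.append(inst)
    return optimized

print(fold_constants(["t1 = 4 * 2", "t2 = b + t1"]))
# ['t1 = 8', 't2 = b + t1']   -- result of the program is unchanged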

CODE GENERATION:
•It is the final phase of the compiler.
•It gets input from code optimization phase and produces the target code or object
code as result.
•Intermediate instructions are translated into a sequence of machine instructions that
perform the same task.
•Code generation involves: allocation of registers and memory, generation of correct references, generation of correct data types, and generation of missing code.
SYMBOL TABLE MANAGEMENT:
•Symbol table is used to store all the information about identifiers used in the program.
•It is a data structure containing a record for each identifier, with fields for the attributes
of the identifier.
•It allows the compiler to find the record for each identifier quickly and to store or retrieve data from that record.
•Whenever an identifier is detected in any of the phases, it is stored in the symbol table.
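
A minimal symbol table sketch in Python, assuming a hash table keyed by the identifier's lexeme (the attribute fields shown are illustrative, not a prescribed layout):

class SymbolTable:
    def __init__(self):
        self.records = {}                  # lexeme -> record of attributes

    def insert(self, name, **attributes):
        # store a record the first time an identifier is detected,
        # and merge in attributes discovered by later phases
        self.records.setdefault(name, {}).update(attributes)

    def lookup(self, name):
        # quick retrieval of the record for an identifier
        return self.records.get(name)

table = SymbolTable()
table.insert("a", type="int", scope="local")
print(table.lookup("a"))                   # {'type': 'int', 'scope': 'local'}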

ERROR HANDLING:
•Each phase can encounter errors. After detecting an error, a phase must handle the
error so that compilation can proceed.
•In lexical analysis, errors occur in separation of tokens.
•In syntax analysis, errors occur during construction of syntax tree.
•In semantic analysis, errors occur when the compiler detects constructs with the right syntactic structure but no meaning, e.g. during type checking.
•In code optimization, errors occur when the result is affected by the optimization. In
code generation, it shows error when code is missing etc.
Lexical Analyzer
Lexical analysis is the first phase of the compiler; the lexical analyzer is also known as a scanner. It converts the high-level input program into a sequence of tokens.
•Lexical analysis can be implemented with a deterministic finite automaton (DFA).
•Its task is to read the input characters and produce as output the sequence of tokens that the parser uses for syntax analysis.
•Upon receiving a "get next token" command from the parser, the lexical analyzer reads input characters until it can identify the next token.
TOKENS
A token is a string of characters, categorized according to the rules as a symbol (e.g.,
IDENTIFIER, NUMBER, COMMA).
The process of forming tokens from an input stream of characters is called
tokenization.
Consider this expression in the C programming language: sum = 3 + 2;

Lexeme   Token Type
sum      Identifier
=        Assignment operator
3        Number
+        Addition operator
2        Number
;        End of statement

LEXEME: A group of characters forming a token is called a lexeme.


PATTERN: A pattern is a description of the form that the lexemes of a token may
take. In the case of a keyword as a token, the pattern is just the sequence of
characters that form the keyword. For identifiers and some other tokens, the pattern
is a more complex structure that is matched by many strings.
How Lexical Analyzer functions

1. Tokenization i.e. Dividing the program into valid tokens.


2. Remove white space characters.
3. Remove comments.

The function of lexical analysis is to tokenize the program or statement and separate the tokens out.

How are the tokens separated from the program?

We design regular expressions to represent the tokens and construct an NFA/DFA to act as the token recognizer.

Transition diagram for an identifier:

start → 0 --letter--> 1 --delimiter--> 2 (accept)
                      ↺ letter/digit
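
The diagram can be encoded directly as a DFA. The following Python sketch mirrors the three states above (letter first, then letters/digits, then a delimiter to accept); treating '_' as a letter is an assumption of this example:

def is_identifier(s):
    state = 0
    for ch in s + " ":            # append a delimiter so the token can end
        if state == 0:
            if ch.isalpha() or ch == "_":
                state = 1         # 0 --letter--> 1
            else:
                return False      # no move from the start state: reject
        elif state == 1:
            if ch.isalnum() or ch == "_":
                state = 1         # 1 --letter/digit--> 1
            else:
                state = 2         # 1 --delimiter--> 2 (accepting)
        else:
            return False          # input continues after the token ended
    return state == 2

print(is_identifier("count1"))    # True
print(is_identifier("1count"))    # False (digit cannot start an identifier)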


Consider the program below and find the number of tokens.

int main()
{
// 2 variables
int a, b;
a = 10;
return 0;
}
Solution

'int' 'main' '(' ')' '{' 'int' 'a' ',' 'b' ';' 'a' '=' '10' ';' 'return' '0' ';' '}'

18 tokens in total; the comment is discarded by the lexical analyzer.

Question 2:
int max(x, y)
int x, y;
/* find max of x and y */
{
return (x > y ? x : y);
}
Solution:

'int' 'max' '(' 'x' ',' 'y' ')' 'int' 'x' ',' 'y' ';' '{' 'return' '(' 'x' '>' 'y' '?' 'x' ':' 'y' ')' ';' '}'

25 tokens in total; the comment is discarded.

Question 3:

printf("%d, hi", &x);

Solution: 'printf' '(' '"%d, hi"' ',' '&' 'x' ')' ';' — 8 tokens (a string literal counts as a single token).


Passes

A pass refers to the number of times the compiler goes through the source
code.

Two types of Passes in compiler


Single-pass compiler goes through the program only once. In other words,
the single pass compiler allows the source code to pass through each
compilation unit only once. It immediately translates each code section into
its final machine code.

Multi-pass compiler goes through the source code several times. In other
words, it allows the source code to pass through each compilation unit several
times. Each pass takes the result of the previous pass as input and creates
intermediate outputs. Therefore, the code improves in each pass. The final
code is generated after the final pass.

The main difference between phases and passes of a compiler is that phases are the steps in the compilation process, while passes are the number of times the compiler traverses the source code.
Bootstrapping and Cross Compiler
Bootstrapping is a process in which a simple language is used to translate a more complicated program, which in turn may handle far more complicated programs, and so on.

A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is running. For example, a compiler that runs on a Windows 7 PC but generates code that runs on an Android smartphone is a cross compiler.
Continued..

▪ Suppose we want to write a cross compiler for a new language X.
▪ The implementation language of this compiler is, say, Y and the target code being generated is in language Z. That is, we create XYZ.
▪ Now, if an existing compiler for Y runs on machine M and generates code for M, it is denoted YMM.
▪ If we run XYZ through YMM, we get a compiler XMZ:
▪ that is, a compiler for source language X that runs on machine M and generates target code in language Z.
Difference between Native Compiler and
Cross Compiler
Native Compiler                                        | Cross Compiler
Translates programs for the same                       | Translates programs for a different
hardware/platform/machine on which it is running.      | hardware/platform/machine from the one on which it is running.
Used to build programs for the same system/machine     | Used to build programs for other systems/machines,
and OS on which it is installed.                       | such as AVR/ARM.
Dependent on the system/machine and OS.                | Independent of the target system/machine and OS.
Generates executable files, e.g. .exe.                 | Can generate raw code, e.g. .hex.
TurboC and GCC are native compilers.                   | Keil is a cross compiler.


Issues in Lexical Analysis

Lexical analysis is the process of producing tokens from the source program.
It has the following issues:

•Lookahead

•Ambiguities
Lookahead
•Lookahead is required to decide where one token ends and the next token begins. Simple examples of lookahead issues are i vs. if, and = vs. ==.
•Therefore, a way to describe the lexemes of each token is required, together with a way to resolve ambiguities:
• Is if two variables i and f, or the keyword if?
• Is == two assignment operators = =, or one equality operator ==?
•Hence, the number of lookahead characters to consider must be fixed, and a way to describe the lexemes of each token is also needed.
Ambiguities

Lexical analysis programs written with lex accept ambiguous specifications and choose the longest match possible at each input point. When more than one expression can match the current input, lex chooses as follows:

•The longest match is preferred.
•Among rules that match the same number of characters, the rule given first is preferred.
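
A sketch of these two rules in Python (the rule table is an assumption made for illustration; real lex compiles the rules into a single DFA rather than trying each pattern separately):

import re

RULES = [                           # earlier rules win ties
    ("IF",     r"if"),              # rule 1: keyword
    ("ID",     r"[a-z]+"),          # rule 2: identifier
    ("EQ",     r"=="),              # equality
    ("ASSIGN", r"="),               # assignment
]

def next_token(code, pos):
    best = None
    for name, pattern in RULES:
        m = re.match(pattern, code[pos:])
        # keep strictly longer matches; on a tie, the earlier rule stays
        if m and (best is None or len(m.group(0)) > len(best[1])):
            best = (name, m.group(0))
    return best

print(next_token("ifx == 1", 0))    # ('ID', 'ifx')  longest match beats 'if'
print(next_token("if x", 0))        # ('IF', 'if')   tie on length: rule 1 wins
print(next_token("== 1", 0))        # ('EQ', '==')   '==' beats '='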
Lexical Errors
A character sequence that cannot be scanned into any valid token is a lexical error. Important facts about lexical errors:

▪ Lexical errors are not very common, but they should be managed by the scanner.

▪ Misspellings of identifiers, operators and keywords are considered lexical errors.

▪ Generally, a lexical error is caused by the appearance of some illegal character, mostly at the beginning of a token.
Error Recovery in Lexical Analyzer

The most common error recovery techniques:

▶ Remove one character from the remaining input.

▶ In panic mode, successive characters are ignored until a well-formed token is reached.

▶ Insert a missing character into the remaining input.

▶ Replace a character with another character.

▶ Transpose two adjacent characters.
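
A sketch of panic-mode recovery in Python, assuming a hypothetical predicate can_start_token that says which characters may begin a well-formed token:

def recover(code, pos, can_start_token):
    """Skip characters from pos until one can start a token again."""
    while pos < len(code) and not can_start_token(code[pos]):
        pos += 1                    # successive characters are ignored
    return pos

code = "@#x = 1"
pos = recover(code, 0, lambda ch: ch.isalnum() or ch in "=+-*/ ")
print(pos, code[pos:])              # 2 x = 1  (the illegal '@#' is skipped)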


Lexical Analyzer vs. Parser

Lexical Analyzer                        | Parser
Scans the input program.                | Performs syntax analysis.
Identifies tokens.                      | Creates an abstract representation of the code.
Inserts tokens into the symbol table.   | Updates symbol table entries.
Generates lexical errors.               | Generates a parse tree of the source code.
Parser
• The parser is the component of the compiler that structures the smaller elements (tokens) coming from the lexical analysis phase.
• A parser takes input in the form of a sequence of tokens and produces output in the form of a parse tree.
• Parsing is of two types: top-down parsing and bottom-up parsing.

(Diagram: stream of tokens + context-free grammar → parser → parse tree.)
Context-free grammar
Parsing Techniques

Parsing
├── Top-Down Parsing
│    ├── Backtracking Parsing
│    └── Non-Backtracking (Predictive) Parsing
│         ├── Recursive Descent Parsing
│         └── Table-Driven Predictive Parsing (LL(1))
└── Bottom-Up Parsing
     ├── Operator Precedence Parsing (works on operator grammars)
     └── Table-Driven LR Parsing
          ├── SLR Parsing
          ├── Canonical LR Parsing
          └── LALR Parsing
There are several types of parsing algorithms used in syntax
analysis, including:

• LL parsing: This is a top-down parsing algorithm that starts with the


root of the parse tree and constructs the tree by successively
expanding non-terminals. LL parsing is known for its simplicity and
ease of implementation.
• LR parsing: This is a bottom-up parsing algorithm that starts with the leaves of the parse tree and constructs the tree by successively reducing handles (substrings matching a production's right-hand side) to non-terminals. LR parsing is more powerful than LL parsing and can handle a larger class of grammars.
• LR(1) parsing: This is a variant of LR parsing that uses lookahead to
disambiguate the grammar.
• LALR parsing: This is a variant of LR parsing that uses a reduced set
of lookahead symbols to reduce the number of states in the LR
parser.
• Once the parse tree is constructed, the compiler can perform
semantic analysis to check if the source code makes sense and
follows the semantics of the programming language.
• The parse tree or AST can also be used in the code generation phase
of the compiler design to generate intermediate code or machine
code.
Parsing Techniques
Top-down parsers (LL(1), recursive descent)
• Start at the root of the parse tree and grow toward leaves
• Pick a production & try to match the input
• A bad “pick” may require backtracking
• Some grammars are backtrack-free (predictive parsing)

Bottom-up parsers (shift-reduce parser, LR(1), operator precedence)


• Start at the leaves and grow toward root
• As input is consumed, encode possibilities in an internal state
• Start in a state valid for legal first tokens
• Bottom-up parsers handle a large class of grammars
Bottom-Up Parsing

• Bottom-up parsing is also known as shift-reduce parsing.
• Bottom-up parsing is used to construct a parse tree for an input string.
• In bottom-up parsing, parsing starts with the input symbols and constructs the parse tree up to the start symbol by tracing out the rightmost derivation of the string in reverse.
Top–Down Parsing
• A parse tree is created from root to leaves.
• The traversal of the parse tree is a preorder traversal.
• Traces a leftmost derivation.
• Two types: backtracking parser and predictive parser.
• Guesses the structure of the parse tree from the next input; tries different structures and backtracks if they do not match.

Bottom–Up Parsing
• A parse tree is created from leaves to root.
• The traversal of the parse tree is the reverse of a postorder traversal.
• Traces a rightmost derivation (in reverse).
• More powerful than top-down parsing, e.g. shift-reduce parser, operator precedence parser, LR parser.
Basic Idea in Top-Down Parsing
• Top-Down Parsing is an attempt to find a left-most derivation
for an input string
• Example:
S -> c A d        Find a derivation for w = c a d
A -> a b | a

     S                    S                      S
   / | \       🡪        / | \     backtrack    / | \
  c  A  d              c  A  d       🡪        c  A  d
                         / \                     |
                        a   b                    a
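
The example can be coded directly as a backtracking top-down parser. This Python sketch tries A's alternatives in order and falls back from A -> a b to A -> a, mirroring the trees above (a sketch for this one grammar, not a general parser):

def parse_A(s, pos):
    # try the alternatives of A in order, backtracking on failure
    if s[pos:pos+2] == "ab":
        return pos + 2               # A -> a b
    if s[pos:pos+1] == "a":
        return pos + 1               # A -> a  (after backtracking)
    return None

def parse_S(s):
    if not s.startswith("c"):
        return False
    pos = parse_A(s, 1)
    if pos is None:
        return False
    return s[pos:] == "d"            # remaining input must be exactly 'd'

print(parse_S("cad"))    # True  (A -> ab fails, backtrack, A -> a succeeds)
print(parse_S("cabd"))   # True  (A -> ab succeeds directly)
print(parse_S("cbd"))    # False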
Bottom-Up Parsing
A bottom-up parser builds a derivation by working from the input
sentence back toward the start symbol S

S ⇒ γ0 ⇒ γ1 ⇒ γ2 ⇒ … ⇒ γn–1 ⇒ γn = sentence

To reduce γi to γi–1, match some RHS β against γi, then replace β with its corresponding LHS A (using the production A → β).

In terms of the parse tree, this works from the leaves to the root.
Finding Reductions

Consider the simple grammar

1  S → a A B e
2  A → A b c
3  A → b
4  B → d

and the input string abbcde.

Sentential Form    Next Reduction (Prod'n)
abbcde             3  (A → b)
a A bcde           2  (A → A b c)
a A de             4  (B → d)
a A B e            1  (S → a A B e)
S                  —

The trick is scanning the input and finding the next reduction; the mechanism for doing this must be efficient.
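
The table above can be reproduced with a deliberately naive Python sketch that rescans the sentential form for a production's right-hand side (longest RHS first). The greedy scan happens to find the correct handle for this grammar; real LR parsers track handles efficiently with a state machine instead of rescanning:

PRODUCTIONS = [            # (LHS, RHS), longer right-hand sides first
    ("S", "aABe"),
    ("A", "Abc"),
    ("A", "b"),
    ("B", "d"),
]

def reduce_to_start(sentence):
    form = sentence
    while form != "S":
        for lhs, rhs in PRODUCTIONS:
            i = form.find(rhs)
            if i != -1:
                print(f"{form:10} reduce by {lhs} -> {rhs}")
                form = form[:i] + lhs + form[i + len(rhs):]
                break
        else:
            return False           # no RHS matches: reject the input
    return True

print(reduce_to_start("abbcde"))   # prints the four reductions, then True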
Example

Grammar:
list → list + digit
     | list – digit
     | digit
digit → 0 | 1 | … | 9

Parse tree for 9-5+2:

            list
          /  |   \
      list   +   digit
     /  |  \       |
  list  –  digit   2
    |        |
  digit      5
    |
    9
Ambiguity
• A grammar can have more than one parse tree for a string.
• Consider the grammar

list 🡪 list + list
     | list – list
     | 0 | 1 | … | 9

• The string 9-5+2 has two parse trees:

        list                         list
      /  |   \                     /  |   \
   list  +   list               list  –   list
  /  |  \      |                  |      /  |  \
list –  list   2                  9   list  +  list
  |       |                             |       |
  9       5                             5       2

The left tree groups the string as (9-5)+2; the right tree groups it as 9-(5+2).
Ambiguity

• Ambiguity is problematic because the meaning of a program can become incorrect.
• Ambiguity can be handled in several ways:
  – Enforce associativity and precedence.
  – Rewrite the grammar (the cleanest way).
• There is no algorithm to automatically convert an arbitrary ambiguous grammar into an unambiguous grammar accepting the same language.
• Worse, there are inherently ambiguous languages!
Ambiguity in Programming Languages
• Dangling else problem:

stmt → if expr stmt
     | if expr stmt else stmt

• For this grammar, the string

if e1 if e2 s1 else s2

has two parse trees:

Parse tree 1 (else matched with the inner if):
stmt
├── if
├── expr: e1
└── stmt
    ├── if
    ├── expr: e2
    ├── stmt: s1
    ├── else
    └── stmt: s2

Parse tree 2 (else matched with the outer if):
stmt
├── if
├── expr: e1
├── stmt
│   ├── if
│   ├── expr: e2
│   └── stmt: s1
├── else
└── stmt: s2
Resolving the dangling else problem
• General rule: match each else with the closest previous unmatched if. The grammar can be rewritten as

stmt → matched-stmt
     | unmatched-stmt
matched-stmt → if expr matched-stmt else matched-stmt
     | others
unmatched-stmt → if expr stmt
     | if expr matched-stmt else unmatched-stmt
Associativity
• If an operand has an operator on both sides, the side whose operator takes the operand determines the associativity of that operator.
• In a+b+c, b is taken by the left +.
• +, -, *, / are left associative.
• ^ and = are right associative.
• Grammar to generate strings with a right-associative operator:

right 🡪 letter = right | letter
letter 🡪 a | b | … | z
Precedence
• The string a+5*2 has two possible interpretations because of two different parse trees, corresponding to (a+5)*2 and a+(5*2).
• Precedence determines the correct interpretation.
• Next, an example of how precedence rules are encoded in a grammar.
Precedence/Associativity in the Grammar for Arithmetic Expressions

Ambiguous:
E 🡪 E + E | E * E | (E) | num | id

Unambiguous, with precedence and associativity rules honored:
E 🡪 E + T | T
T 🡪 T * F | F
F 🡪 (E) | num | id

Try deriving 3+2+5 and 3+2*5 with each grammar.
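
To see the unambiguous grammar in action, here is a Python sketch that parses token lists with it, using the equivalent iterative form E -> T { + T }, T -> F { * F } to sidestep left recursion (a common implementation transformation, not part of the slides). Because T sits below E, * binds tighter than +, and the loops make both operators left associative:

def parse(tokens):
    pos = 0
    def peek():
        return tokens[pos] if pos < len(tokens) else None
    def expr():                     # E -> T { + T }
        nonlocal pos
        value = term()
        while peek() == "+":
            pos += 1
            value = ("+", value, term())    # left-associative grouping
        return value
    def term():                     # T -> F { * F }
        nonlocal pos
        value = factor()
        while peek() == "*":
            pos += 1
            value = ("*", value, factor())
        return value
    def factor():                   # F -> ( E ) | num | id
        nonlocal pos
        if peek() == "(":
            pos += 1
            value = expr()
            assert peek() == ")", "missing closing parenthesis"
            pos += 1
            return value
        tok = peek()
        pos += 1
        return tok
    return expr()

print(parse(["3", "+", "2", "*", "5"]))   # ('+', '3', ('*', '2', '5'))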
Suppose the production rules for the grammar of a language are:
S -> cAd
A -> bc | a
and the input string is “cad”.

Backtracking was needed to get the correct syntax tree, which is a complex process to implement.
There is an easier way to solve this, using the concepts of FIRST and FOLLOW sets in compiler design.
Why FIRST?
• We saw the need for backtracking in syntax analysis, which is a complex process to implement.

• There is an easier way to solve this problem: if the compiler knew in advance the first character of the string produced when a production rule is applied, it could compare it with the current character (token) in the input and decide wisely which production rule to apply.

• Example: S -> cAd A -> bc|a and the input string is “cad”.

• Thus, if we knew that after reading character ‘c’ in the input string and applying S -> cAd, the next character in the input string is ‘a’, then we would ignore the production rule A -> bc and directly use A -> a.

• Hence if parser knows first character of the string that can be obtained by
applying a production rule, then it can wisely apply the correct production rule
to get the correct syntax tree for the given input string.
Why FOLLOW?
• The parser faces one more problem.

• Consider the grammar A -> aBb, B -> c | ε, and the input string “ab”. As the first character in the input is a, the parser applies the rule A -> aBb.

• Now the parser checks for the second character of the input string which is b,
and Non-Terminal to derive is B

• The parser can’t get any string derivable from B that has b as its first character.

• But the Grammar does contain a production rule B -> ε, if that is applied then
B will vanish, and the parser gets the input “ab”.
Why FOLLOW?
• But the parser can apply B -> ε only when it knows that the character that follows B in the production rule is the same as the current character in the input. In the RHS of A -> aBb, b follows the non-terminal B, i.e. FOLLOW(B) = {b}, and the current input character read is also b.

• Hence the parser applies this rule. And it is able to get the string “ab” from the
given grammar.

FOLLOW lets a non-terminal vanish (derive ε) when that is needed to generate the string from the parse tree. The conclusion is that we need to find the FIRST and FOLLOW sets for a given grammar, so that the parser can properly apply the needed rule at the correct position.
FIRST
FIRST(X) for a grammar symbol X is the set of terminals that begin the strings
derivable from X.

Example:
E  -> T E’
E’ -> + T E’ | ε
T  -> F T’
T’ -> * F T’ | ε
F  -> (E) | id

FIRST(E) = { ( , id }
Rules for computing FIRST(X):
1. If X is a terminal, then FIRST(X) = { X }.
2. If X -> ε is a production, then add ε to FIRST(X).
3. If X -> Y1 Y2 … Yk is a production, add FIRST(Y1) \ {ε} to FIRST(X); if ε ∈ FIRST(Y1), also add FIRST(Y2) \ {ε}, and so on; if all of Y1 … Yk can derive ε, add ε to FIRST(X).

For the example grammar: FIRST(E) = FIRST(T) = FIRST(F) = { ( , id }, FIRST(E’) = { + , ε }, FIRST(T’) = { * , ε }.

FOLLOW
FOLLOW(A) for a non-terminal A is the set of terminals that can appear immediately to the right of A in some sentential form. Rules for computing FOLLOW(A):
1. Place $ (end of input) in FOLLOW(S), where S is the start symbol.
2. For a production A -> αBβ, add FIRST(β) \ {ε} to FOLLOW(B).
3. For a production A -> αB, or A -> αBβ where ε ∈ FIRST(β), add FOLLOW(A) to FOLLOW(B).

For the example grammar: FOLLOW(E) = FOLLOW(E’) = { ) , $ }, FOLLOW(T) = FOLLOW(T’) = { + , ) , $ }, FOLLOW(F) = { + , * , ) , $ }.
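
The FIRST sets above can be computed mechanically with a fixed-point iteration. A Python sketch for the example grammar follows (the grammar encoding and the EPS marker are assumptions of this example; FOLLOW sets can be computed with a similar loop using the three FOLLOW rules):

EPS = "eps"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                        # iterate to a fixed point
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                for sym in prod:
                    if sym == EPS:
                        add = {EPS}
                    elif sym in grammar:                 # non-terminal
                        add = first[sym] - {EPS}
                    else:                                # terminal
                        add = {sym}
                    if not add <= first[nt]:
                        first[nt] |= add
                        changed = True
                    # look at the next symbol only if this one can vanish
                    if sym in grammar and EPS in first[sym]:
                        continue
                    break
                else:
                    # every symbol can vanish: the production derives eps
                    if EPS not in first[nt]:
                        first[nt].add(EPS)
                        changed = True
    return first

for nt, s in first_sets(GRAMMAR).items():
    print(nt, "=", sorted(s))
# E = ['(', 'id']   E' = ['+', 'eps']   T = ['(', 'id']
# T' = ['*', 'eps']  F = ['(', 'id']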
Thank You
