CD Notes3

This document discusses the syntax analysis phase of a compiler, focusing on the construction of parsers using context-free grammar (CFG). It explains the components of CFG, the process of parsing, derivation methods, and the significance of parse trees, as well as issues like ambiguity and left recursion in grammar. Additionally, it covers techniques like left factoring and the concepts of 'first' and 'follow' sets in grammar to aid in parser decision-making.

Uploaded by

gbmudhol


PCET-NMVPM’s

Nutan College of Engineering and Research, Talegaon, Pune

DEPARTMENT OF COMPUTER ENGINEERING AND SCIENCE

UNIT 3
Syntax Analyzer
Syntax analysis, or parsing, is the second phase of a compiler. In this chapter, we shall
learn the basic concepts used in the construction of a parser.
The parser (syntax analyzer) receives the source code in the form of tokens from the
lexical analyzer and performs syntax analysis, which creates a tree-like intermediate
representation that depicts the grammatical structure of the token stream.
We have seen that a lexical analyzer can identify tokens with the help of regular
expressions and pattern rules. But a lexical analyzer cannot check the syntax of a given
sentence because of the limitations of regular expressions. Regular expressions cannot
check balanced tokens, such as nested parentheses. Therefore, this phase uses a context-free
grammar (CFG), whose language is recognized by a pushdown automaton.
The syntax of a language refers to the structure of the valid programs / statements of that
language. It is specified by rules known as productions, and a collection of such rules
is known as a grammar.
Parsing is the process of determining whether a stream of tokens is valid according to
the grammar.

Context Free Grammar:

What is grammar?
A grammar contains the set of rules used to construct the sentences of a language. In a
CFG we define such a set of rules, and from these rules we construct strings.

In lexical analysis a regular grammar is used, and the language it generates is
known as a regular language. But some languages cannot be generated by any
regular grammar. For example:

L = {a^m b^m | m ≥ 1}

The reason is that a machine for L may reach its final state only when the number of 'a's
and the number of 'b's in the input string are equal. To do that it has to count both
the 'a's and the 'b's, but because the value of m has no upper bound, a finite
automaton, which has only a fixed number of states, cannot keep such a count.

A finite state automaton has no data structure (stack) to serve as memory, unlike a
pushdown automaton. So it can accept some 'a's followed by some 'b's,
but not exactly as many 'b's as there were 'a's.

A context-free grammar is a type 2 grammar, and its language is recognized by a
pushdown automaton. That is the reason we use context-free
grammars in the syntax analysis phase: they can describe nested and balanced
constructs that regular grammars cannot.
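
The role of the stack can be illustrated with a small sketch. The function name accepts_anbn is an assumption made for this illustration; it mimics a pushdown automaton by pushing one marker for every 'a' and popping one for every 'b'.

```python
def accepts_anbn(s: str) -> bool:
    """Return True if s is in { a^m b^m | m >= 1 }, using an explicit stack."""
    stack = []
    i = 0
    # Phase 1: push one marker for every leading 'a'.
    while i < len(s) and s[i] == "a":
        stack.append("A")
        i += 1
    if not stack:
        return False  # at least one 'a' is required (m >= 1)
    # Phase 2: pop one marker for every 'b'.
    while i < len(s) and s[i] == "b":
        if not stack:
            return False  # more b's than a's
        stack.pop()
        i += 1
    # Accept only if the whole input was consumed and every 'a' was matched.
    return i == len(s) and not stack
```

A finite automaton would need a separate state for every possible count, which is why the unbounded stack is the essential extra ingredient here.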

A context-free grammar is a set of recursive rules used to generate patterns of
strings. It generates a context-free language (CFL) from a set of variables
that are defined recursively, in terms of one another, by a set of production
rules.

A context-free grammar (CFG) has four components:

• A set of terminal symbols, sometimes referred to as "tokens."
• A set of nonterminal symbols, sometimes called "syntactic variables."
• One nonterminal that is distinguished as the start symbol.
• A set of productions of the form LHS → RHS, where
  o LHS (called the head, or left side) is a single nonterminal symbol
  o RHS (called the body, or right side) consists of zero or more terminals
    and nonterminals.

• The terminals are the elementary symbols of the language defined by the
  grammar.
• Nonterminals impose a hierarchical structure on the language that is key
  to syntax analysis and translation.
• Conventionally, the productions for the start symbol are listed first.
• The productions specify the manner in which the terminals and
  nonterminals can be combined to form strings.

A context-free grammar G can be defined by a four-tuple:

G = (V, T, P, S)
where
G describes the grammar,
V describes a finite set of non-terminal symbols,
T describes a finite set of terminal symbols,
P describes a set of production rules,
S is the start symbol.

In a CFG, the start symbol is used to derive the string. You derive the string
by repeatedly replacing a non-terminal with the right-hand side of one of its
productions, until all non-terminals have been replaced by terminal symbols.
Production rules:

S → aSa
S → bSb
S → c
Now check that the string abbcbba can be derived from the given CFG.

S ⇒ aSa        (using S → aSa)
  ⇒ abSba      (using S → bSb)
  ⇒ abbSbba    (using S → bSb)
  ⇒ abbcbba    (using S → c)
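
The same check can be sketched in code. The function below is only an illustration for this particular grammar (S → aSa, S → bSb, S → c); the name derive_steps is hypothetical. It peels matching outer symbols off the target and records each sentential form.

```python
def derive_steps(target: str) -> list[str]:
    """Return the sentential forms that derive target from S, or [] if none exist."""
    steps = ["S"]
    left, right = "", ""
    middle = target  # the part that the remaining S still has to produce
    while True:
        if middle == "c":
            steps.append(left + "c" + right)  # apply S -> c
            return steps
        if len(middle) >= 3 and middle[0] == middle[-1] and middle[0] in "ab":
            ch = middle[0]                    # apply S -> aSa or S -> bSb
            left, right = left + ch, ch + right
            middle = middle[1:-1]
            steps.append(left + "S" + right)
        else:
            return []                         # no production fits
```

For the string abbcbba this yields the same four steps as the derivation written out above.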

Capabilities of CFG

The capabilities of CFGs include:

1. Context-free grammars are useful for describing most programming
languages.
2. If the grammar is properly designed, an efficient parser can be
constructed automatically.
3. Using associativity and precedence information, suitable
grammars for expressions can be constructed.
4. Context-free grammars can describe nested structures such as
balanced parentheses, matching begin-end, corresponding if-then-else's, and
so on.
Derivation
The syntax analyzer creates a parse tree to check the syntax or pattern of the
string for a particular language. There are two types of parser for generating
the parse tree: top-down parsers and bottom-up parsers. Both build the parse
tree, and each corresponds to one of two kinds of derivation:
1) LMD (Left Most Derivation) is used by top-down parsers.
2) RMD (Right Most Derivation) is used, traced in reverse, by bottom-up parsers.

A derivation is a sequence of applications of production rules. It is used to obtain
the input string from the start symbol. During parsing we have to take two decisions.
These are as follows:
1. We have to decide which non-terminal is to be replaced.
2. We have to decide the production rule by which the non-terminal will
be replaced.
For the first decision, we have two options.
1. Left-most Derivation
In the leftmost derivation, the leftmost non-terminal of the sentential form is
replaced at every step. So the string is produced from left to right.
Example:
S → S + S
S → S - S
S → a | b | c
String to be derived: a - b + c (Left Most Derivation)
S ⇒ S + S
  ⇒ S - S + S
  ⇒ a - S + S
  ⇒ a - b + S
  ⇒ a - b + c

2. Right-most Derivation
In the rightmost derivation, the rightmost non-terminal of the sentential form is
replaced at every step. So the string is produced from right to left.

Example: The grammar is:

S → S + S
S → S - S
S → a | b | c

String to be derived: a - b + c
The rightmost derivation is:
S ⇒ S - S
  ⇒ S - S + S
  ⇒ S - S + c
  ⇒ S - b + c
  ⇒ a - b + c
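
The difference between the two derivations can be replayed with a short sketch. The helper derive and its parameters are assumptions made for this illustration. It relies on S being the only non-terminal of the example grammar, and applies each chosen right-hand side to the leftmost or the rightmost S.

```python
def derive(start, choices, leftmost=True):
    """Apply each right-hand side in choices to the leftmost (or rightmost) S."""
    form = start
    steps = [form]
    for rhs in choices:
        pos = form.find("S") if leftmost else form.rfind("S")
        if pos < 0:
            raise ValueError("no non-terminal left to expand")
        form = form[:pos] + rhs + form[pos + 1:]  # replace that single S
        steps.append(form)
    return steps

# Leftmost and rightmost derivations of a-b+c from the examples above.
lmd = derive("S", ["S+S", "S-S", "a", "b", "c"], leftmost=True)
rmd = derive("S", ["S-S", "S+S", "c", "b", "a"], leftmost=False)
```

Both lists end in the same string, but the intermediate sentential forms differ because a different non-terminal is expanded at each step.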
• Parse tree
o A parse tree is a graphical representation of a derivation. Each node is labelled
  with a grammar symbol, which can be a terminal or a non-terminal.
o In parsing, the string is derived from the start symbol. The root of the
  parse tree is that start symbol.
o The parse tree reflects the precedence of operators. The deepest sub-tree is
  evaluated first. So the operator in a parent node has lower precedence
  than the operator in its sub-tree.
The parse tree follows these points:
o All leaf nodes have to be terminals.
o All interior nodes have to be non-terminals.
o Reading the leaves from left to right gives the original input string.
Example: Production rules are:
T → T + T | T * T
T → a | b | c
String to be derived: a * b + c
Steps 1 to 5: [figures showing the parse tree for a * b + c being built step by step]
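
The points about leaves and interior nodes can be checked with a small sketch. The tuple representation and the names leaf and tree_yield are assumptions made for this illustration, and the tree shown is one possible parse tree for a * b + c, the one in which a * b forms the deeper sub-tree.

```python
# A node is (label, children); a leaf has an empty list of children.
def leaf(symbol):
    return (symbol, [])

tree = ("T", [
    ("T", [("T", [leaf("a")]), leaf("*"), ("T", [leaf("b")])]),
    leaf("+"),
    ("T", [leaf("c")]),
])

def tree_yield(node):
    """Collect the leaf labels from left to right."""
    label, children = node
    if not children:
        return [label]
    out = []
    for child in children:
        out.extend(tree_yield(child))
    return out
```

Here " ".join(tree_yield(tree)) gives back the input string, and every interior node is labelled with the non-terminal T.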

Ambiguity

A grammar is said to be ambiguous if there exists more than one leftmost
derivation, more than one rightmost derivation, or more than one parse tree
for some input string. If the grammar is not ambiguous, it is called
unambiguous.

Example:
S → aSb | SS
S → ε
For the string aabb, the above grammar generates more than one parse tree:
[Figure: two parse trees for aabb]

An ambiguous grammar is not good for compiler construction.
No general method can automatically detect and remove ambiguity, but you can
remove ambiguity by rewriting the grammar so that each string has only one parse tree.

Two kinds of recursion are distinguished in a grammar. Recursion alone does not
make a grammar ambiguous, but a production that is both left and right recursive
(such as S → SS above) typically does:

1. Left recursion (leftmost symbol of R.H.S = L.H.S)

2. Right recursion (rightmost symbol of R.H.S = L.H.S)

A → Aα | β

The above grammar is left recursive because the non-terminal on the left side of the
production occurs at the first position on the right side. We can eliminate
left recursion by replacing this pair of productions with

1) A → βA′   2) A′ → αA′ | ε

In a left-recursive grammar, the expansion of A generates Aα, Aαα, Aααα, ... at
each step, so a top-down parser enters an infinite loop. The generated language is
βα*.

In right recursion (A → αA | β), the procedure A() first does some work to match α and
only then calls A() recursively. Matching α consumes input and so acts as a condition
check, so the parser cannot fall into an infinite loop. The language generated by
right recursion is α*β.

Most top-down parsers do not allow left recursion. Therefore we have
to eliminate left recursion without changing the language, i.e. βα*.

We want A → βα*, but a grammar should not contain the * symbol. So we write
A → βA′, and it is now the responsibility of A′ to generate any number of α's,
including zero:

1) A → βA′
2) A′ → αA′ | ε

Example 1 − Consider the left recursion in the following grammar.

E → E + T | T
T → T * F | F
F → (E) | id
Eliminate immediate left recursion from the grammar.

Solution
Comparing E → E + T | T with A → Aα | β:

E → E + T | T
A → A α   | β

∴ A = E, α = +T, β = T

A → Aα | β is changed to A → βA′ and A′ → αA′ | ε

A → βA′ means E → TE′
A′ → αA′ | ε means E′ → +TE′ | ε

Comparing T → T * F | F with A → Aα | β:

T → T * F | F
A → A α   | β

∴ A = T, α = *F, β = F

A → βA′ means T → FT′
A′ → αA′ | ε means T′ → *FT′ | ε
The production F → (E) | id does not have any left recursion.
∴ Combining productions 1, 2, 3, 4, 5, we get

E → TE′
E′ → +TE′ | ε
T → FT′
T′ → *FT′ | ε
F → (E) | id
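
The transformation A → Aα | β into A → βA′, A′ → αA′ | ε can be sketched as a small function. This is a minimal illustration that handles immediate left recursion only. The function name and the representation (each right-hand side as a list of symbols, with the empty list standing for ε) are assumptions made for this example.

```python
def eliminate_immediate_left_recursion(head, bodies):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon."""
    new_head = head + "'"
    # alpha parts: what follows the leading A in each left-recursive alternative
    alphas = [body[1:] for body in bodies if body and body[0] == head]
    # beta parts: the alternatives that do not start with A
    betas = [body for body in bodies if not body or body[0] != head]
    if not alphas:
        return {head: bodies}  # no immediate left recursion, nothing to change
    return {
        head: [beta + [new_head] for beta in betas],
        new_head: [alpha + [new_head] for alpha in alphas] + [[]],  # [] is epsilon
    }

result_E = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
result_F = eliminate_immediate_left_recursion("F", [["(", "E", ")"], ["id"]])
```

For E the result corresponds to E → TE′ and E′ → +TE′ | ε, while F is returned unchanged because it has no left-recursive alternative.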

So far we have seen that a grammar can be ambiguous or unambiguous, and
left recursive or right recursive. One more distinction is whether a grammar is
deterministic or non-deterministic.
A → αβ1 | αβ2 | αβ3
In the above grammar, after seeing α the parser cannot tell whether to continue with
β1, β2 or β3. So in a non-deterministic grammar there are several options for a single
symbol. A predictive parser can only handle grammars from which this kind of choice
has been removed, so the grammar has to be simplified first.

• Left Factoring
A grammar in which at least two productions for the same L.H.S (non-terminal
symbol) share a common prefix is known as a non-deterministic grammar, because
for one symbol it offers several productions. With such a grammar the parser
cannot choose a unique production from the next input symbol alone, and it often
needs to backtrack to find the correct production. Backtracking is time
consuming, and predictive top-down parsers do not backtrack at all. Converting a
non-deterministic grammar into a deterministic grammar is known as left factoring.
In left factoring,
• We make one production for each common prefix.
• The common prefix may be a terminal, a non-terminal, or a combination
  of both.
• The rest of each alternative is moved into new productions.
The grammar obtained after the process of left factoring is called a left
factored grammar.

Example 1: Do left factoring in the following grammar-

S → iEtS / iEtSeS / a
E → b

Solution-
The left factored grammar is-
S → iEtSS’ / a
S’ → eS / ε
E → b
Example 2: Do left factoring in the following grammar-

A → aAB / aBc / aAc

Solution :

A → aA’
A’ → AB / Bc / Ac
Again this grammar has the common prefix A in two alternatives of A’.
A’ → AA’’ / Bc
A’’ → B / c
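
The procedure used in both examples can be sketched as follows. This is an illustrative implementation, not a definitive one: the names left_factor and common_prefix, and the representation of a grammar as a dictionary from a non-terminal to a list of right-hand sides (lists of symbols, with the empty list for ε), are assumptions made for this example. New non-terminals are named by appending ' to the original name.

```python
def common_prefix(bodies):
    """Longest sequence of symbols shared by the start of all bodies."""
    prefix = []
    for symbols in zip(*bodies):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(grammar):
    result = {}
    used = set(grammar)            # names that are already taken
    work = list(grammar.items())   # non-terminals still to be processed
    while work:
        head, bodies = work.pop(0)
        groups = {}                # alternatives grouped by their first symbol
        for body in bodies:
            groups.setdefault(body[0] if body else None, []).append(body)
        new_bodies = []
        for first, group in groups.items():
            if first is None or len(group) == 1:
                new_bodies.extend(group)   # nothing to factor here
                continue
            prefix = common_prefix(group)
            fresh = head + "'"
            while fresh in used:
                fresh += "'"
            used.add(fresh)
            new_bodies.append(prefix + [fresh])
            # The new non-terminal derives the tails; it may need factoring too.
            work.append((fresh, [body[len(prefix):] for body in group]))
        result[head] = new_bodies
    return result

example2 = {"A": [["a", "A", "B"], ["a", "B", "c"], ["a", "A", "c"]]}
factored2 = left_factor(example2)
```

For Example 2 the result has A with the single alternative a A', then A' with the alternatives A A'' and B c, and A'' with the alternatives B and c.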

First and Follow

• Why find FIRST?
Without further information a top-down parser may need to backtrack,
which is a complex process to implement.
If the parser knew in advance the
“first character of the string produced when a production rule is applied”, it could
compare it with the current character or token in the input string and
wisely decide which production rule to apply.
Consider the following grammar:
S -> cAd
A -> bc|a
And the input string is “cad”.
After reading the character ‘c’ in the input string and applying S->cAd, the next
character in the input string is ‘a’. If the parser knows this, it
can ignore the production rule A->bc (because ‘b’ is the first
character of the string produced by this production rule, not ‘a’) and directly
use the production rule A->a (because ‘a’ is the first character of the string
produced by this production rule, and it is the same as the current character of the
input string, which is also ‘a’).
Hence, if the parser knows the first character of
the string that can be obtained by applying a production rule, it can
apply the correct production rule and build the correct syntax tree for the
given input string.
• Why FOLLOW?
The parser faces one more problem. Consider the grammar below to
understand it.
A -> aBb
B -> c | ε
Suppose the input string to parse is “ab”.

As the first character in the input is a, the parser applies the rule A->aBb.

    A
  / | \
 a  B  b
Now the parser checks the second character of the input string, which is
b. The Non-Terminal to expand is B, but the parser cannot find any string
derivable from B that has b as its first character.
But the grammar does contain the production rule B -> ε. If that is applied,
B vanishes and the parser gets the input “ab”, as shown below. But
the parser can apply it only when it knows that the character that follows B
in the production rule is the same as the current character in the input.
In the RHS of A -> aBb, b follows the Non-Terminal B, i.e. FOLLOW(B) = {b}, and the
current input character is also b. Hence the parser applies this rule,
and it is able to derive the string “ab” from the given grammar.
      A              A
    / | \           / \
   a  B  b   =>    a   b
      |
      ε
So FOLLOW lets a Non-Terminal vanish when that is needed to generate the
string from the parse tree.

In conclusion, we need to find the FIRST and FOLLOW sets for a given
grammar so that the parser can apply the right rule at the
right position.

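• FIRST
FIRST(X) for a grammar symbol X is the set of terminals that begin the strings
derivable from X. If X can derive the empty string, Є is also in FIRST(X).

Rules to compute FIRST set:

1) If x is a terminal, then FIRST(x) = { ‘x’ }

2) If x -> Є is a production rule, then add Є to FIRST(x).

3) If x -> Y1 Y2 ... Yk is a production rule, then add everything in FIRST(Y1) except Є
to FIRST(x). If FIRST(Y1) contains Є, also add FIRST(Y2) except Є, and so on. If
every one of Y1 ... Yk can derive Є, then add Є to FIRST(x).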
Example 1:
Production Rules of Grammar

E -> TE’
E’ -> +T E’|Є
T -> F T’
T’ -> *F T’ | Є
F -> (E) | id

FIRST sets
FIRST(E) = FIRST(T) = { ( , id }
FIRST(E’) = { +, Є }
FIRST(T) = FIRST(F) = { ( , id }
FIRST(T’) = { *, Є }
FIRST(F) = { ( , id }

Example 2:
Production Rules of Grammar
S -> ACB | Cbb | Ba
A -> da | BC
B -> g | Є
C -> h | Є

FIRST sets

FIRST(S) = FIRST(ACB) U FIRST(Cbb) U FIRST(Ba)
= { d, g, h, b, a, Є }
Wherever a symbol can derive Є, substitute Є for it and look at the FIRST of the
next symbol. In this grammar both B and C have Є productions. For example, in
A -> BC, if B derives Є then A begins with whatever C begins with, so h (from
FIRST(C)) is also in FIRST(A). If C derives Є as well, then Є is in FIRST(A).

FIRST(A) = { d } U FIRST(BC)
= { d, g, h, Є }
FIRST(B) = { g , Є }

FIRST(C) = { h , Є }
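
The FIRST sets of Example 2 can be reproduced with a short fixed-point computation. This is a sketch under stated assumptions: the grammar is stored as a dictionary from a non-terminal to a list of right-hand sides (lists of symbols, the empty list standing for Є), and the names compute_first and first_of_string are chosen for this illustration.

```python
EPS = "Є"

GRAMMAR = {
    "S": [["A", "C", "B"], ["C", "b", "b"], ["B", "a"]],
    "A": [["d", "a"], ["B", "C"]],
    "B": [["g"], []],
    "C": [["h"], []],
}

def first_of_string(symbols, first, grammar):
    """FIRST of a sequence of symbols, given the FIRST sets found so far."""
    out = set()
    for sym in symbols:
        sym_first = first[sym] if sym in grammar else {sym}  # terminal: itself
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out          # this symbol cannot vanish, so stop here
    out.add(EPS)                # every symbol can vanish (or the string is empty)
    return out

def compute_first(grammar):
    first = {head: set() for head in grammar}
    changed = True
    while changed:              # repeat until no set grows any more
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first, grammar)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

FIRST = compute_first(GRAMMAR)
```

The loop applies the rules again and again until nothing changes, which avoids having to order the non-terminals by hand.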

Follow
FOLLOW(X) is the set of terminals that can appear immediately to the right of
Non-Terminal X in some sentential form.
Rules to compute FOLLOW set:
1) $ is in FOLLOW(S) // where S is the starting Non-Terminal
2) If A -> pBq is a production, where B is a Non-Terminal and p and q are strings of
grammar symbols, then everything in FIRST(q) except Є is in FOLLOW(B).
3) If A -> pB is a production, then everything in FOLLOW(A) is in FOLLOW(B).
4) If A -> pBq is a production and FIRST(q) contains Є,
then FOLLOW(B) contains { FIRST(q) – Є } U FOLLOW(A).
Example :
Production Rules:
E -> TE’
E’ -> +T E’|Є
T -> F T’
T’ -> *F T’ | Є

F -> (E) | id

FIRST set
FIRST(E) = FIRST(T) = { ( , id }
FIRST(E’) = { +, Є }
FIRST(T) = FIRST(F) = { ( , id }
FIRST(T’) = { *, Є }
FIRST(F) = { ( , id }

FOLLOW Set
FOLLOW(E) = { $ , ) } // Note ')' is there because of the 5th production, F -> (E)
FOLLOW(E’) = FOLLOW(E) = { $, ) } // See 1st production rule
FOLLOW(T) = { FIRST(E’) – Є } U FOLLOW(E’) U FOLLOW(E) = { + , $ , ) }
FOLLOW(T’) = FOLLOW(T) = {+,$,)}
FOLLOW(F) = { FIRST(T’) – Є } U FOLLOW(T’) U FOLLOW(T) = { *, +, $, ) }
Example 2:
Production Rules:
S -> aBDh
B -> cC
C -> bC | Є
D -> EF
E -> g | Є
F -> f | Є

FIRST set
FIRST(S) = { a }
FIRST(B) = { c }
FIRST(C) = { b , Є }
FIRST(D) = FIRST(E) U FIRST(F) = { g, f, Є }
FIRST(E) = { g , Є }
FIRST(F) = { f , Є }

FOLLOW Set
FOLLOW(S) = { $ }
FOLLOW(B) = { FIRST(D) – Є } U FIRST(h) = { g , f , h }
FOLLOW(C) = FOLLOW(B) = { g , f , h }
FOLLOW(D) = FIRST(h) = { h }
FOLLOW(E) = { FIRST(F) – Є } U FOLLOW(D) = { f , h }
FOLLOW(F) = FOLLOW(D) = { h }
Example 3:
Production Rules:

S -> ACB|Cbb|Ba

A -> da|BC

B-> g|Є

C-> h| Є

FIRST set

FIRST(S) = FIRST(ACB) U FIRST(Cbb) U FIRST(Ba) = { d, g, h, Є, b, a }

FIRST(A) = { d } U { FIRST(B) – Є } U FIRST(C) = { d, g, h, Є }

FIRST(B) = { g, Є }

FIRST(C) = { h, Є }

FOLLOW Set

FOLLOW(S) = { $ }

FOLLOW(A) = { h, g, $ }

FOLLOW(B) = { a, $, h, g }

FOLLOW(C) = { b, g, $, h }
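
The FOLLOW rules can be applied mechanically in the same fixed-point style. The sketch below is self-contained and therefore recomputes FIRST as well. The function names and the dictionary representation of the grammar (the empty list standing for Є) are assumptions made for this illustration, and the grammar used is the one from Example 3.

```python
EPS, END = "Є", "$"

GRAMMAR = {
    "S": [["A", "C", "B"], ["C", "b", "b"], ["B", "a"]],
    "A": [["d", "a"], ["B", "C"]],
    "B": [["g"], []],
    "C": [["h"], []],
}

def first_of(symbols, first):
    out = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # a terminal is its own FIRST set
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    return out | {EPS}

def compute_first(grammar):
    first = {head: set() for head in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def compute_follow(grammar, start):
    first = compute_first(grammar)
    follow = {head: set() for head in grammar}
    follow[start].add(END)                      # rule 1
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue                # only non-terminals have FOLLOW
                    rest = first_of(body[i + 1:], first)
                    new = rest - {EPS}          # rule 2
                    if EPS in rest:             # rules 3 and 4
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FOLLOW = compute_follow(GRAMMAR, "S")
```

The resulting sets can be compared with the FOLLOW sets listed for Example 3 above.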

Note:

1. Є as a FOLLOW doesn’t mean anything (Є is an empty string), so Є never appears in a FOLLOW set.

2. $ is called the end-marker. It represents the end of the input string and is
used while parsing to indicate that the input string has been completely
processed.
3. The grammar used above is a Context-Free Grammar (CFG). The syntax of a
programming language can be specified using a CFG.
4. A production of a CFG has the form A -> B, where A is a single Non-Terminal and B is a
string of grammar symbols (i.e. Terminals as well as Non-Terminals).

LL(1) Parser
Here the first L means that the input is scanned from left to right, and the
second L means that this parsing technique produces a leftmost derivation.
Finally, the 1 is the number of look-ahead symbols, i.e. how many input symbols
the parser looks at when it makes a decision.

Algorithm to construct LL(1) Parsing Table:

Step 1: First check for left recursion in the grammar. If there is left recursion
in the grammar, remove it and go to step 2.
Step 2: Calculate First() and Follow() for all non-terminals.
• First(): If we derive all possible strings from a variable,
  the terminal symbols that can begin those strings form its First set.
• Follow(): The terminal symbols that can follow a variable in the
  process of derivation.
Step 3: For each production A –> α:
• Find First(α) and, for each terminal in First(α), make the entry A –> α in the
  table.
• If First(α) contains ε (epsilon), find Follow(A) and,
  for each terminal in Follow(A), make the entry A –> α in the table.
• If First(α) contains ε and Follow(A) contains $, then
  make the entry A –> α in the table for $.
The parsing table is thus built from the two functions First() and Follow().

In the table, the rows are labelled with the non-terminals and the columns
with the terminal symbols. All the null productions of the grammar
go under the Follow elements, and the remaining productions lie under
the elements of the First set.
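
Step 3 can be sketched in code. To keep the example short, the First and Follow sets of the expression grammar used earlier in these notes are written in by hand instead of being computed. The names build_table and first_of, and the dictionary layout of the table, are assumptions made for this illustration.

```python
EPS = "ε"

PRODUCTIONS = [
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]),
    ("E'", []),                      # E' -> ε
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]),
    ("T'", []),                      # T' -> ε
    ("F", ["id"]),
    ("F", ["(", "E", ")"]),
]
FIRST = {"E": {"id", "("}, "E'": {"+", EPS}, "T": {"id", "("},
         "T'": {"*", EPS}, "F": {"id", "("}}
FOLLOW = {"E": {"$", ")"}, "E'": {"$", ")"}, "T": {"+", "$", ")"},
          "T'": {"+", "$", ")"}, "F": {"*", "+", "$", ")"}}

def first_of(body):
    """First set of a right-hand side, using the FIRST table above."""
    out = set()
    for sym in body:
        sym_first = FIRST.get(sym, {sym})   # a terminal is its own First set
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    return out | {EPS}

def build_table():
    table = {}
    for head, body in PRODUCTIONS:
        body_first = first_of(body)
        lookaheads = body_first - {EPS}
        if EPS in body_first:               # null production: use Follow(head)
            lookaheads |= FOLLOW[head]
        for terminal in lookaheads:
            if (head, terminal) in table:
                raise ValueError(f"conflict at ({head}, {terminal}): not LL(1)")
            table[(head, terminal)] = body
    return table

TABLE = build_table()
```

A cell that would receive two productions raises an error in this sketch, which corresponds to the grammar not being LL(1).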

Example-1:
Consider the Grammar:
E --> TE'
E' --> +TE' | ε
T --> FT'
T' --> *FT' | ε
F --> id | (E)
*ε denotes epsilon

Find their First and Follow sets:

Production          First         Follow
E –> TE’            { id, ( }     { $, ) }
E’ –> +TE’/ε        { +, ε }      { $, ) }
T –> FT’            { id, ( }     { +, $, ) }
T’ –> *FT’/ε        { *, ε }      { +, $, ) }
F –> id/(E)         { id, ( }     { *, +, $, ) }

Now, the LL(1) Parsing Table is:

       id          +             *             (           )          $
E      E –> TE’                                E –> TE’
E’                 E’ –> +TE’                              E’ –> ε    E’ –> ε
T      T –> FT’                                T –> FT’
T’                 T’ –> ε       T’ –> *FT’                T’ –> ε    T’ –> ε
F      F –> id                                 F –> (E)

Here you can also write production numbers rather than the production rules themselves.

As you can see, all the null productions are placed under the Follow set of
their symbol, and all the remaining productions lie under the First set of that
symbol.

Note: Not every grammar is suitable for an LL(1) parsing table. It is possible
that one cell contains more than one production. If every cell contains at most
one production, then the grammar is LL(1), i.e. it can be parsed by an LL(1) parser.
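
How the table drives the parser can be shown with a minimal sketch. The table below is copied by hand from the LL(1) parsing table of this example. The function name ll1_parse and the use of a Python list as the stack are assumptions made for this illustration. Blank cells are simply absent from the dictionary.

```python
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens):
    """Return True if the list of tokens is accepted by the table above."""
    tokens = tokens + ["$"]          # append the end-marker
    stack = ["$", "E"]               # start symbol on top of the end-marker
    pos = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, lookahead))
            if body is None:
                return False         # blank cell: syntax error
            stack.extend(reversed(body))  # leftmost symbol ends up on top
        elif top == lookahead:
            pos += 1                 # match a terminal (or the end-marker)
        else:
            return False
    return pos == len(tokens)
```

With a single look-ahead token the parser never has to backtrack: every step is either a table lookup or a direct match.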

Operator Precedence Parser

An operator precedence parser is a bottom-up parser that interprets an
operator grammar. This parser is used only for operator grammars.
Unlike most other parsing techniques, operator precedence parsing can also handle
ambiguous operator grammars, because the precedence relations resolve the ambiguity.
It is mainly used to parse mathematical expressions in a
compiler. First we have to look at operator grammars.

Operator Grammar:

A grammar that is used to define mathematical operators is called an operator
grammar or operator precedence grammar. Such grammars have the
restriction that no production has either an empty right-hand side (null
productions) or two adjacent non-terminals in its right-hand side.

Examples –
• This is an example of an operator grammar:
  o E->E+E/E*E/id
• However, the grammar given below is not an operator grammar,
  because two non-terminals are adjacent to each other:
  o S->SAS/a
  o A->bSb/b
• We can convert it into an operator grammar, though:
  o S->SbSbS/SbS/a
  o A->bSb/b
There are two methods for determining what precedence relations should
hold between a pair of terminals:
1. Use the conventional associativity and precedence of the operators.
2. First construct an unambiguous grammar for the language, a grammar that reflects
the correct associativity and precedence in its parse trees.
This parser relies on the following three precedence relations: ⋖, ⋗ and ≐
(written as <, > and = in the rest of these notes).
• a ⋖ b  This means a “yields precedence to” b.
• a ⋗ b  This means a “takes precedence over” b.
• a ≐ b  This means a “has same precedence as” b.

The given grammar is:

E->E+E/E*E/id
The operator relation table contains only terminals (rows give the terminal on
top of the stack, columns the look-ahead terminal):

        id     +      *      $
id             >      >      >
+       <      >      <      >
*       <      >      >      >
$       <      <      <

• No relation is given between id and id, because two identifiers are never
  compared: two operands cannot appear side by side.
• For id and +, id is given the higher precedence, because an identifier
  always has higher precedence than any operator.
• For (+, +) we first read the row terminal and then the column terminal. Both
  have the same precedence, so we have to check the associativity of
  the operator. The operators (+, -, *, /) are left associative, so we write the > sign
  for the row terminal.
• $ always has the lowest precedence.
• Using the operator precedence table we then parse the input string.

Suppose the string to parse is: id + id * id

1) Whenever the top of the stack has lower or equal precedence compared with the
look-ahead symbol of the input buffer, we push (shift) the look-ahead terminal onto the stack.
2) Whenever the top of the stack has higher precedence, we pop the terminal from
the stack.
3) Initially the top of the stack is $.
4) TOS (top of the stack) is $ and the look-ahead is at “id”, so ($, id) is
compared. The relation is <, so we push “id” and advance the look-ahead.
5) Now TOS is “id” and the look-ahead is at “+”, so (id, +) is compared. The table
shows >, so we pop “id” from the stack. TOS is $ again, and ($, +) is <, so we
push “+” and advance the look-ahead.
6) Now TOS is “+” and the look-ahead is at “id”, so (+, id) is compared. It
shows <, so we push “id” onto the stack again.
7) We continue in the same way until both the top of the stack and the look-ahead are $.
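
The push and pop steps above can be replayed with a small sketch. The relation table REL is an assumption, reconstructed from the rules stated in these notes (id has the highest precedence, + and * are left associative, * binds tighter than +, $ has the lowest precedence). The function tracks terminals only, as the walkthrough does, and does not check the popped symbols against the productions, so it illustrates the mechanism rather than a complete parser.

```python
# REL[top_of_stack][look_ahead] -> "<" or ">"; a missing entry is an error.
REL = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+":  {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*":  {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$":  {"id": "<", "+": "<", "*": "<"},
}

def precedence_parse(tokens):
    """Return the terminals in the order in which they are popped."""
    tokens = tokens + ["$"]
    stack = ["$"]
    popped = []
    pos = 0
    while True:
        top, look = stack[-1], tokens[pos]
        if top == "$" and look == "$":
            return popped                    # input consumed, stack empty
        rel = REL[top].get(look)
        if rel in ("<", "="):
            stack.append(look)               # push and advance the look-ahead
            pos += 1
        elif rel == ">":
            popped.append(stack.pop())       # pop the top terminal
        else:
            raise SyntaxError(f"no relation between {top} and {look}")

order = precedence_parse(["id", "+", "id", "*", "id"])
```

In the resulting order, * is popped before +, which reflects that the multiplication is grouped first.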
• The operator relation table has a disadvantage: if we have n operators, the
  size of the table is n*n, so the space required is O(n^2). To decrease
  the size of the table, we use an operator function table.
• Operator precedence parsers usually do not store the precedence table
  with the relations. Instead they use precedence functions that map terminal symbols
  to integers, and the precedence relations between the symbols are
  implemented by numerical comparison. The parsing table can be encoded
  by two precedence functions f and g that map terminal symbols to integers.
  We select f and g such that:

f(a) < g(b) whenever a yields precedence to b
f(a) = g(b) whenever a and b have the same precedence
f(a) > g(b) whenever a takes precedence over b
Example – Consider the following grammar:
E -> E + E/E * E/( E )/id
[Figure: directed graph representing the precedence functions]

Since there is no cycle in the graph, we can build the function table.
For each node of the graph we find the longest path that starts at it:
fid -> g* -> f+ -> g+ -> f$
gid -> f* -> g* -> f+ -> g+ -> f$

From these paths we fill the function table. For example, the longest path from fid
has 4 arrows, so we write 4 for fid.
The longest path from f+ has 2 arrows, so we fill in 2 for f+.

The size of the table is 2n.

• One disadvantage of function tables is that even where the relation table has
  blank entries, the function table has non-blank entries. Blank
  entries denote errors. Hence the error detection capability of the relation
  table is greater than that of the function table.
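
The longest-path construction can be sketched as follows. The relation table REL for the terminals id, +, * and $ is an assumption, reconstructed from the precedence rules given in these notes. The sketch also assumes that the graph has no cycle and that no pair of terminals has equal precedence. The function name precedence_functions is hypothetical.

```python
from functools import lru_cache

REL = {
    "id": {"+": ">", "*": ">", "$": ">"},
    "+":  {"id": "<", "+": ">", "*": "<", "$": ">"},
    "*":  {"id": "<", "+": ">", "*": ">", "$": ">"},
    "$":  {"id": "<", "+": "<", "*": "<"},
}
TERMINALS = ["id", "+", "*", "$"]

def precedence_functions(rel, terminals):
    # One node f_a and one node g_a for every terminal a.
    edges = {(kind, t): [] for kind in "fg" for t in terminals}
    for a, row in rel.items():
        for b, r in row.items():
            if r == ">":
                edges[("f", a)].append(("g", b))   # a > b: edge f_a -> g_b
            elif r == "<":
                edges[("g", b)].append(("f", a))   # a < b: edge g_b -> f_a

    @lru_cache(maxsize=None)
    def longest(node):
        # Number of arrows on the longest path that starts at node.
        return max((1 + longest(nxt) for nxt in edges[node]), default=0)

    f = {t: longest(("f", t)) for t in terminals}
    g = {t: longest(("g", t)) for t in terminals}
    return f, g

f, g = precedence_functions(REL, TERMINALS)
```

The values for fid and f+ agree with the ones read off the paths above, and comparing f(a) with g(b) reproduces every non-blank entry of the relation table with only 2n stored numbers.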

3 pages
Philips hr7761 hr7762 Food-Processor
No ratings yet
Philips hr7761 hr7762 Food-Processor
5 pages
UNIT 5 MEXICAN Edited
No ratings yet
UNIT 5 MEXICAN Edited
13 pages
Careers in Water Resources Engineering
No ratings yet
Careers in Water Resources Engineering
11 pages
Hkmw-How Do Plants Breathe
No ratings yet
Hkmw-How Do Plants Breathe
2 pages
PIG Paper: Dress Code in Herricks High School
100% (1)
PIG Paper: Dress Code in Herricks High School
14 pages
Corporation Sqe Reviewer 1
No ratings yet
Corporation Sqe Reviewer 1
3 pages
Course 6
14% (7)
Course 6
2 pages
2 Ms Annual Syllabus Distribution Roaissat M 2020-2021
0% (1)
2 Ms Annual Syllabus Distribution Roaissat M 2020-2021
2 pages
Master of Ceremony
No ratings yet
Master of Ceremony
2 pages
Lesson Plan - Variables On Both Sides With Notes
No ratings yet
Lesson Plan - Variables On Both Sides With Notes
2 pages
Exercise and Nonspecific Low Back Pain A Literature Review
No ratings yet
Exercise and Nonspecific Low Back Pain A Literature Review
7 pages
Utkal University: Form No
No ratings yet
Utkal University: Form No
4 pages
Lect 9-10 Choosing Brand Elements To Build Brand Equity
No ratings yet
Lect 9-10 Choosing Brand Elements To Build Brand Equity
36 pages
VMZINC - Double Lock Standing Seam - Webpage
No ratings yet
VMZINC - Double Lock Standing Seam - Webpage
2 pages
Handout II. Rearrange The Dialogue Sentences in A Correct Way
No ratings yet
Handout II. Rearrange The Dialogue Sentences in A Correct Way
1 page
Gene Expression Programming: Fundamentals and Applications
From Everand
Gene Expression Programming: Fundamentals and Applications
Fouad Sabry
No ratings yet
Introduction to Algorithms
From Everand
Introduction to Algorithms
S VASIST
No ratings yet

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

CD Notes3

Uploaded by

CD Notes3

Uploaded by

PCET-NMVPM’s

Nutan College of Engineering and Research, Talegaon, Pune

Context Free Grammar:

In lexical analysis, a regular grammar is used and the language generated is

Finite State Automaton has no data structure (stack) - memory as in case

So, a context-free grammar is a type 2 grammar, and it is recognized by

• A context-free grammar (CFG) has four components:

Context free grammar is a set of recursive rules used to generate patterns of

• A set of terminal symbols, sometimes referred to as "tokens."

A set of productions in the form: LHS → RHS where

• RHS (called body, or right side) consists of zero or more terminals

A context-free grammar G can be defined as a 4-tuple:

A CFG has the following capabilities:

Derivation is a sequence of production rules. It is used to get the input string

Example: The grammar is:

o It is a graphical representation of symbols, which can be terminals or

A grammar is said to be ambiguous if there exists more than one leftmost

A grammar can be unsuitable for top-down parsing in 2 cases

1. Left recursion (leftmost symbol of the R.H.S. = L.H.S.)

The above grammar is left recursive because the left side of the production is

In a left-recursive grammar, expanding A generates Aα, Aαα, Aααα at

Most top-down parsers cannot handle left recursion. Therefore, we have

A → βA’: now it is the responsibility of A’ to generate any number of α or

Example 1 − Consider the left recursion in the grammar.

Comparing T → T * F | F with A → Aα | β
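
The rewrite from A → Aα | β to A → βA’, A’ → αA’ | ε can be sketched in Python as below. This is only an illustration, assuming the production being compared is T → T * F | F. The function name and the representation (each production body is a list of symbols, and the empty list stands for ε) are assumptions made for this sketch, not part of the notes.

```python
def eliminate_immediate_left_recursion(nt, bodies):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    nt     : the non-terminal name, e.g. "T"
    bodies : list of production bodies, each a list of symbols
             (an empty list stands for epsilon)
    """
    new_nt = nt + "'"
    # alpha parts: bodies that start with the non-terminal itself
    alphas = [b[1:] for b in bodies if b and b[0] == nt]
    # beta parts: every other body
    betas = [b for b in bodies if not b or b[0] != nt]
    if not alphas:
        # No immediate left recursion, so nothing changes.
        return {nt: bodies}
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


result = eliminate_immediate_left_recursion("T", [["T", "*", "F"], ["F"]])
print(result)
```

Here α is `* F` and β is `F`, so the result corresponds to T → FT’ and T’ → *FT’ | ε. The sketch handles only immediate left recursion, not recursion that passes through other non-terminals.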

So far we have seen that a grammar can be ambiguous or unambiguous,

• The common prefix may be a terminal or a non-terminal or a combination

Example 1: Left factor the following grammar:

A → aAB / aBc / aAc

First and Follow

Rules to compute FIRST set:

1) If x is a terminal, then FIRST(x) = { ‘x’ }

FIRST(S) = FIRST(ACB) ∪ FIRST(Cbb) ∪ FIRST(Ba); wherever ε is there, put it

In the grammar, C and B have ε-productions; if we substitute ε for B, then the first of A is C and

FIRST(S) = FIRST(A) ∪ FIRST(B) ∪ FIRST(C) = { d, g, h, ε, b, a }

FIRST(A) = { d } ∪ { FIRST(B) − ε } ∪ FIRST(C) = { d, g, h, ε }
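
The FIRST computation can be checked with a small fixed-point sketch in Python. The grammar below is an assumption reconstructed from the sets shown above: the S productions come from the text, while A → da | BC, B → g | ε and C → h | ε are assumed so that the results agree with the FIRST(S) and FIRST(A) values given. The function name and the data layout are also illustrative only.

```python
EPS = "ε"


def first_sets(grammar):
    """Compute FIRST for every non-terminal by iterating until nothing changes.

    grammar maps a non-terminal to a list of bodies; each body is a list of
    symbols and the empty list stands for an ε-production.
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                before = len(first[nt])
                nullable = True  # stays True only if every symbol can vanish
                for sym in body:
                    if sym in grammar:  # non-terminal
                        first[nt] |= first[sym] - {EPS}
                        if EPS not in first[sym]:
                            nullable = False
                            break
                    else:  # terminal: FIRST(x) = {x}
                        first[nt].add(sym)
                        nullable = False
                        break
                if nullable:
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first


# Assumed grammar (see the note above).
grammar = {
    "S": [["A", "C", "B"], ["C", "b", "b"], ["B", "a"]],
    "A": [["d", "a"], ["B", "C"]],
    "B": [["g"], []],
    "C": [["h"], []],
}

first = first_sets(grammar)
for nt in grammar:
    print(nt, sorted(first[nt]))
```

The loop repeats because a set found late (for example FIRST(C)) can enlarge a set processed earlier (FIRST(A), then FIRST(S)).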

1. ε never appears in a FOLLOW set (ε is the empty string).

Algorithm to construct LL(1) Parsing Table:

Find their First and Follow sets:

T –> FT’    FIRST = { id, ( }    FOLLOW = { +, $, ) }

F –> id/(E)    FIRST = { id, ( }    FOLLOW = { *, +, $, ) }

E:   on id: E –> TE’;   on (: E –> TE’

E’:  on +: E’ –> +TE’;   on ): E’ –> ε;   on $: E’ –> ε

T’:  on +: T’ –> ε;   on *: T’ –> *FT’;   on ): T’ –> ε;   on $: T’ –> ε

F:   on id: F –> id;   on (: F –> (E)
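
A table-driven LL(1) parse using these entries can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the column for each entry is inferred from the FIRST and FOLLOW sets, the T row (T –> FT’ on id and on "(") is assumed from FIRST(T) = { id, ( }, and the names `TABLE`, `NONTERMINALS` and `ll1_parse` are hypothetical.

```python
# Parsing table M[non-terminal, lookahead] -> production body.
# An empty list stands for an ε-production.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],  # assumed row
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens):
    """Return True if the token list is accepted, False on a syntax error."""
    tokens = list(tokens) + ["$"]  # end marker
    stack = ["$", "E"]             # start symbol on top of the end marker
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:       # blank table entry means error
                return False
            stack.extend(reversed(body))  # leftmost symbol ends up on top
        elif top == look:          # terminal (or $) matches the input
            pos += 1
        else:
            return False
    return pos == len(tokens)


print(ll1_parse("id + id * id".split()))
print(ll1_parse("id + * id".split()))
```

The body is pushed in reverse so that the parser always expands the leftmost non-terminal first, which is what makes the parse a leftmost derivation.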

Operator Precedence Parser

A grammar that is used to define mathematical operators is called an operator

The given grammar is :

• No relation is given between id and id, as id will not be

Suppose we want to parse the string: id + id * id

to integers, and the precedence relations between the symbols are

f(a) < g(b) whenever a yields precedence to b

g_id -> f_* -> g_* -> f_+ -> g_+ -> f_$

Size of the table is 2n.
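
How the two functions f and g replace the full relation table can be sketched in Python. The numeric values below are an assumption: they are the usual textbook values for the terminals id, +, * and $, chosen so that g(id) = 5 matches the length of the longest path shown above. The names `F`, `G`, `relation` and `reductions` are hypothetical, and the stack holds terminals only (non-terminals are omitted for brevity).

```python
# Assumed precedence-function values (2n integers for n = 4 terminals).
F = {"id": 4, "+": 2, "*": 4, "$": 0}
G = {"id": 5, "+": 1, "*": 3, "$": 0}


def relation(a, b):
    """Recover the precedence relation between a (on stack) and b (input)."""
    if F[a] < G[b]:
        return "<"
    if F[a] > G[b]:
        return ">"
    return "="


def reductions(tokens):
    """Shift-reduce over terminals; return terminals in the order reduced."""
    stack = ["$"]
    rest = list(tokens) + ["$"]
    pos = 0
    reduced = []
    while True:
        a, b = stack[-1], rest[pos]
        if a == "$" and b == "$":
            return reduced            # accept
        if relation(a, b) in ("<", "="):
            stack.append(b)           # shift
            pos += 1
        else:
            # Reduce: pop until the stack top yields precedence
            # to the terminal most recently popped.
            while True:
                popped = stack.pop()
                reduced.append(popped)
                if relation(stack[-1], popped) == "<":
                    break


order = reductions("id + id * id".split())
print(order)
```

For id + id * id the `*` is reduced before the `+`, which reflects its higher precedence. One known trade-off of storing only f and g is that blank (error) entries of the relation table are lost, so this sketch does not detect every malformed input.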
