0% found this document useful (0 votes)
35 views

Top Down Parsing

The document discusses top-down parsing and left factoring of grammars. It explains the concepts of recursive descent parsing, predictive parsing, and construction of LL(1) parsing tables. Examples are provided to illustrate left factoring, predictive parsing automation, and use of parsing tables to parse strings.

Uploaded by

Vedant Deshmukh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
35 views

Top Down Parsing

The document discusses top-down parsing and left factoring of grammars. It explains the concepts of recursive descent parsing, predictive parsing, and construction of LL(1) parsing tables. Examples are provided to illustrate left factoring, predictive parsing automation, and use of parsing tables to parse strings.

Uploaded by

Vedant Deshmukh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 38

TOP-DOWN PARSING

Recursive-Descent, Predictive Parsing


Mrs.Soma Ghosh
gsn.comp@coeptech.ac.in
Prior to top-down parsing
• Checklist :

1. Remove ambiguity if possible by


rewriting the grammar
2. Remove left- recursion, otherwise it may
lead to an infinite loop.
3. Do left- factoring.

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Left- factoring
• In predictive parsing , the prediction is made about which
rule to follow to parse the non-terminal by reading the
following input symbols
• In case of predictive parsing, left-factoring helps remove
removable ambiguity.
• “Left factoring is a grammar transformation that is useful
for producing a grammar suitable for predictive parsing.
The basic idea is that when it is not clear which of two
alternative productions to use to expand a non-terminal
A, we may be able to rewrite the A-productions to defer
the decision until we have seen enough of the input to
make the right choice.”
- Aho,Ullman,Sethi

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Left-factoring
• Here is a grammar rule that is ambiguous:
A -> xP1 | xP2 | xP3 | xP4 ….| xPn
Where x & Pi’s are strings of terminals and non-terminals
and x !=
If we rewrite it as
A-> xP’
P’ -> P1|P2|P3 …|Pn

We call that the grammar has been “left-factored”, and the


apparent ambiguity has been removed. Repeating this
for every rule left-factors a grammar completely

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Example
• stmt -> if exp then stmt endif |
if exp then stmt endif else stmt endif

We can left factor it as follows :

stmt -> if exp then stmt endif ELSEFUNC


ELSEFUNC -> else stmt endif | (epsilon)

Thereby removing the ambiguity

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Parsers: Recursive-Descent
• Recursive, Uses backtracking
• Tries to find a leftmost derivation
• Unless the grammar is ambiguous or left-recursive, it
finds a suitable parse tree
• But is rarely used as programming constructs can be
parsed without backtracking

Consider the grammar:


S cAd | bd
A ab | a
and the string “cad”

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Recursive parsing with backtracking : example
S
Following the first rule, S->cAd
S->cAd to parse S
c A d

The next non=term in c A d A -> ab


line A is parsed using
first rule, A -> ab , but
turns out INCORRECT, a b
parser backtracks

c A d A -> a
Next rule to parse A is taken
A->a, turns out CORRECT ,
Parser stops a
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
Predictive parser
• It is a recursive-descent parser that needs
no backtracking
• Suppose
A -> A1 | A2 | ….| An
• If the non-terminal to be expanded next is
‘A’ , then the choice of rule is made on the
basis of the current input symbol ‘a’ .

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Procedure
• Make a transition diagram( like dfa/nfa) for every
rule of the grammar.
• Optimize the dfa by reducing the number of
states, yielding the final transition diagram
• To parse a string, simulate the string on the
transition diagram
• If after consuming the input the transition
diagram reaches an accept state, it is parsed.

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Example
The grammar is as follows
• E -> E + T | T
• T-> T*F|F
• F -> (E) | id
After removing left-recursion , left-factoring
The rules are :

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Rules and their transition diagrams
E T T’
• E->T T’ START

T’
• T’ -> +T T’ |  + T
T

• T -> F T’’ T F T’’


• T -> *F T’’ |  T’
+
T T
• T -> (E) |id T ( E )

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in) id
Optimization
After optimization it yields the following DFA
like structures:

+ *
T F
START

FINAL 

T ( E )

id
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
SIMULATION METHOD
• Start from the start state
• If a terminal comes consume it, move to next
state
• If a non – terminal comes go to the state of the
“dfa” of the non-term and return on reaching the
final state
• Return to the original “dfa” and continue parsing
• If on completion( reading input string
completely), you reach a final state, string is
successfully parsed.
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
Disadvantage :
• It is inherently a recursive parser, so it
consumes a lot of memory as the stack
grows.
• To remove this recursion, we use
LL-parser, which uses a table for lookup.

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
EExample of LL(1) grammar
• E -> TE’
• E -> +TE’|ε
• T -> FT’
• T’ -> *FT’|ε
• F -> (E)|id

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
First and Follow

Symbol FIRST FOLLOW


E (,id $,)
E’ +,ë $,)
T (,id +,$,)
T’ *,ë +,$,)
F (,id *,+,$,)
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
Algo for Construction of predictive
parsing table :
1. For each production Aa of grammar G, do
steps 2 and 3
2. For each terminal 'a' in FIRST(a) , add Aa in
M[A,a].
3. If e is in FIRST(a) , add Aa to M[A,b] for
each terminal b in FOLLOW(A). If ë is in
FIRST(a ) , and $ is in FOLLOW(A), then add
Aa to M[A,$]
4. Make each undefined entry as “ERROR”, i.e.
An error entry.

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Generated Parser Table
For String id + id * id
Non INPUT
Terminal SYMBOLS

id + * ( ) $

E E -> TE’ E -> TE’

E’ E -> +TE’ E’ -> ε E’ -> ε

T T -> FT’ T -> FT’

T’ T’ -> ε T’ -> *FT’ T’ -> ε T’ -> ε

F F -> id F -> (E)

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
How to control the parser?

 If X=a=$ , parser halts, string accepted.


 If X=a !=$ , parser pops X, and advances the

input pointer to point to next input symbol.


 If X is a non-terminal, the program consults

entry M[X,a] of the parsing table M. Replace


the top of stack(X) with production rule
corresponding to entry in table. If entry =
ERROR, call error recovery routine.

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
MATCHED STACK INPUT ACTION
E$ id+id * id$
TE’$ id+id * id$ E->TE’
FT’E’$ id+id * id$ T->FT’
id T’E’$ id+id * id$ F->id
id T’E’$ +id * id$ Match id
id E’$ +id * id$ T’->Є
id +TE’$ +id * id$ E’-> +TE’
id+ TE’$ id * id$ Match +
id+ FT’E’$ id * id$ T-> FT’
id+ idT’E’$ id * id$ F-> id
id+id T’E’$ * id$ Match id
id+id * FT’E’$ * id$ T’-> *FT’
id+id * FT’E’$ id$ Match *
id+id * idT’E’$ id$ F-> id
id+id * id T’E’$ $ Match id
id+id * id E’$ $ T’-> Є
Mrs.Soma Ghosh (gsn.comp@co
id+id * id $
eptech.ac.in) $ E’-> Є
What does LL signify ?
The first L means that the scanning takes place from
Left to right.
The second L means that the Left derivation is
produced first.

The prime requirements are : -


 Stack
 Parsing Table
 Input buffer
 Parsing program .

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
What does LL signify ?
The first L means that the scanning takes place from
Left to right.
The second L means that the Left derivation is
produced first.

The prime requirements are : -


 Stack
 Parsing Table
 Input buffer
 Parsing program .

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
 Inputbuffer contains the string to be
parsed, followed by $ ,a symbol used to
indicate end of the input string. The stack
indicates a sequence of grammar
symbols with $ on the bottom,indicating
bottom of the stack. Initially, the stack
contains the start symbol of the grammar
on the top of $. The parsing table is a 2-D
array M[A,a] where A is a nonterminal,
and a is a terminal or the symbol $.
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
 How to control the parser?
 If X=a=$ , parser halts, string accepted.
 If X=a !=$ , parser pops X, and advances the input pointer
to point to next input symbol.
 If X is a non-terminal, the program consults entry M[X,a]
of the parsing table M. Replace the top of stack(X) with
production rule corresponding to entry in table. If entry =
ERROR, call error recovery routine.

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Algo for Construction of predictive
parsing table :
1. For each production Aa of grammar G, do
steps 2 and 3
2. For each terminal 'a' in FIRST(a) , add Aa in
M[A,a].
3. If e is in FIRST(a) , add Aa to M[A,b] for each
terminal b in FOLLOW(A). If ë is in FIRST(a ) ,
and $ is in FOLLOW(A), then add Aa to M[A,$]
4. Make each undefined entry as “ERROR”, i.e. An
error entry.

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Example:

Grammar
ETE'
E'+TE' | ë
T FT'
T'*FT' | ë
F(E) | id
( ë stands for epsilon)

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
First and Follow
Symbol FIRST FOLLOW
E (,id $,)
E’ +,ë $,)
T (,id +,$,)
T’ *,ë +,$,)
F (,id *,+,$,)
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
Building the table
Id + * ( ) $
E ETE ETE
’ ’
E E’+T E’ë E’
’ E’ ë
T TFT’ TFT’
T T’ë T’*FT T’ë T’
’ ’ ë
F Fid F(E)
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
Input=id+id*id
Stack Input buffer
$E id+id*id$
$E'T' Id+id*id$
$E'T'F Id+id*id$
$E'T'id Id+id*id$
$E'T' +id*id$
$E' +id*id$
$E'T+ +id*id$
$E'T Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
id*id$
Stack Input Buffer
$E'T'F id*id$
$E'T'id id*id$
$E'T' *id$
$E'T'F* *id$
$E'T'F id$
$E'T'id id$
$E'T' $
$E' $
$ Accepted
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
Thus, we can easily construct an
LL parse with 1 lookahead. Since
one look ahead is involved, we
also call it an LL(1) parser.
There are grammars which may requite LL(k) parsing.
For e.g. look at next example…..

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Grammar:
FIRST FOLLOW
SiEtSS’ | a
S’Es | ë S a,I $,ë
Eb
S’ $,ë $,ë
E b t
Note that this is
If then else statement

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Parse Table
a b e i t $

S Sa SiEtS
S’
S’ Së Së
Se
S
E Eb
Ambiguity
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
 The grammar is ambiguous and it
is evident by the fact that we have
two entries corresponding to
M[S’,e] containing S € and S’
eS. This ambiguity can be
resolved if we choose
S’eS i.e associating the else’s with the closest previous “then”.

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Note that the ambiguity will be
solved if we use LL(2) parser,
i.e. always see for the two
input symbols. How?
When input is ‘e’ then it looks at next input. Depending on the
next input we choose the appropriate rule.
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
LL(1) grammars have distinct
properties. -No ambiguous
grammar or left recursive
grammar can be LL(1).
A grammar is LL(1) if and only if whenever a production A
C | D the following conditions hold:
…contd
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)
1)For no terminal a both C and
D derive strings beginning with
a.
Thus First(C) != First(D)
2)At most one of C or D can derive €
3) If C* € then D does not derive any string beginning with
terminal Follow(A).

Mrs.Soma Ghosh (gsn.comp@co


eptech.ac.in)
Mrs.Soma Ghosh (gsn.comp@co
eptech.ac.in)

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy