7- Parsing Techniques- Top Down Parsing

Uploaded by

نواف الشهراني


Parsing Techniques

Top-Down Parsing
Parsing
• The lexical analyzer has translated the source program into a sequence of tokens.
• The parser must translate the sequence of tokens into an intermediate representation.
• Assume the interface is that the parser can call getNextToken to get the next token from the lexical analyzer.
• The parser can also call a function named emit that outputs the intermediate representation, which is left unspecified for now.
• The parser outputs error messages if the syntax of the source program is wrong.
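A minimal sketch of that interface in Python, for illustration only. getNextToken and emit are the names used above (spelled get_next_token here). The Lexer class, the whitespace-separated token format, the toy input language NUMBER ('+' NUMBER)*, and the tuple-shaped intermediate representation are all assumptions, since the slides leave these details unspecified.

```python
class Lexer:
    """Hypothetical lexical analyzer: hands out one token per call."""

    def __init__(self, text):
        # Assumption: tokens are separated by whitespace; "$" marks end of input.
        self._tokens = text.split() + ["$"]
        self._pos = 0

    def get_next_token(self):
        token = self._tokens[self._pos]
        if self._pos < len(self._tokens) - 1:
            self._pos += 1          # stay on "$" once the input is exhausted
        return token


emitted = []                        # collects the intermediate representation


def emit(item):
    """Hypothetical emit: here it just records the item."""
    emitted.append(item)


def parse_sum(lexer):
    """Toy parser for NUMBER ('+' NUMBER)*; raises SyntaxError on bad input."""
    token = lexer.get_next_token()
    if not token.isdigit():
        raise SyntaxError(f"expected a number, got {token!r}")
    emit(("push", int(token)))
    token = lexer.get_next_token()
    while token == "+":
        token = lexer.get_next_token()
        if not token.isdigit():
            raise SyntaxError(f"expected a number, got {token!r}")
        emit(("push", int(token)))
        emit(("add",))
        token = lexer.get_next_token()
    if token != "$":
        raise SyntaxError(f"unexpected token {token!r}")


parse_sum(Lexer("2 + 2"))
print(emitted)  # [('push', 2), ('push', 2), ('add',)]
```

The parser never looks at the source text directly: it only pulls tokens and pushes output, which is the division of labor the bullets describe.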
Parsing
• Parsing is the sequence of steps used to check whether a string conforms to a given grammar, so that we can decide whether the string belongs to the language.
• The result of the parsing process is called a parse tree.
• If the string (a line of code) is incorrect, error messages are returned to the programmer; these are called syntax errors.
• Logical errors in the program are not detected by the compiler and therefore generate no error messages; they show up only in the program's final output.
Different Types of Parsing
• Top-Down Parsing: beginning with the start symbol, try to guess the productions to apply to end up at the user's program.
• Bottom-Up Parsing: beginning with the user's program, try to apply productions in reverse to convert the program back into the start symbol.
Parsing Techniques
• There are two parsing techniques that the compiler can use to parse strings:
• Top-down parsing: the easiest to follow logically and by hand. It starts from the start symbol of the grammar and tries to find a sequence of rules that leads to the target string in the line of code.
• Bottom-up parsing: the opposite of the previous method. It starts from the line of code and tries to work back from its elements, through the grammar rules, to the start symbol.
Top-Down Parsing
• A top-down parsing algorithm parses an input string in such a way that the implied traversal of the parse tree occurs from the root to the leaves.
• Top-down parsers come in two forms: backtracking parsers and predictive parsers.
• A predictive parser attempts to predict the next construction using one or more lookahead tokens.
• Two well-known top-down parsing methods are recursive-descent parsing and LL(1) parsing.
• Recursive-descent parsing is the most suitable method for a handwritten parser.

Parsing: Top-Down, Bottom-Up
• Using the following grammar:
E → E + E
  | 2
parse the string 2 + 2.
The string can be parsed top-down, starting from the start symbol, as follows:
E → E+E → 2+E → 2+2
It can be parsed bottom-up, starting from the string and ending at the start symbol, as follows:
2+2 → E+2 → E+E → E
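The two sequences above can be reproduced by plain string rewriting. This is only a sketch: the function names are hypothetical, and the order of rule choices is supplied by hand rather than discovered by a parser.

```python
RULES = {"E": ["E+E", "2"]}  # grammar from the slide: E -> E+E | 2


def derive_top_down(choices, start="E"):
    """Rewrite the leftmost E with each chosen right-hand side in turn."""
    form = start
    steps = [form]
    for rhs in choices:
        if rhs not in RULES["E"] or "E" not in form:
            raise ValueError(f"cannot apply E -> {rhs} to {form}")
        form = form.replace("E", rhs, 1)   # leftmost occurrence only
        steps.append(form)
    return steps


def reduce_bottom_up(string, handles):
    """Replace the leftmost occurrence of each right-hand side by E."""
    form = string
    steps = [form]
    for rhs in handles:
        if rhs not in RULES["E"] or rhs not in form:
            raise ValueError(f"cannot reduce {rhs} in {form}")
        form = form.replace(rhs, "E", 1)
        steps.append(form)
    return steps


print(" -> ".join(derive_top_down(["E+E", "2", "2"])))
# E -> E+E -> 2+E -> 2+2
print(" -> ".join(reduce_bottom_up("2+2", ["2", "2", "E+E"])))
# 2+2 -> E+2 -> E+E -> E
```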
Recursive-Descent Parsing
• Recursive-descent parsing, in its backtracking form, follows these steps:
• The rule to apply is chosen based on the terminal symbols in the string being parsed, always working from the leftmost side.
• If the chosen path turns out not to match the string, the parser backtracks: it abandons that path and tries another of the available alternatives. Backtracking may happen several times.
• It is an inefficient way of parsing because of the large amount of trial and error involved.
S → aBc
B → rk
  | r
Input: arc
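The backtracking on the input arc can be sketched as follows. Using Python generators to enumerate alternatives is an implementation choice made here, not something the slides prescribe, and the names parse, parse_sequence, and accepts are hypothetical.

```python
GRAMMAR = {
    "S": [["a", "B", "c"]],
    "B": [["r", "k"], ["r"]],   # alternatives are tried in this order
}


def parse(symbol, text, pos, log):
    """Yield every position at which `symbol` can end when started at `pos`."""
    if symbol not in GRAMMAR:                      # terminal symbol
        if text.startswith(symbol, pos):
            yield pos + len(symbol)
        return
    for alternative in GRAMMAR[symbol]:
        log.append(f"try {symbol} -> {' '.join(alternative)} at {pos}")
        yield from parse_sequence(alternative, text, pos, log)


def parse_sequence(symbols, text, pos, log):
    """Match a list of symbols one after another, backtracking on failure."""
    if not symbols:
        yield pos
        return
    for middle in parse(symbols[0], text, pos, log):
        yield from parse_sequence(symbols[1:], text, middle, log)


def accepts(text):
    log = []
    ok = any(end == len(text) for end in parse("S", text, 0, log))
    return ok, log


ok, log = accepts("arc")
print(ok)    # True
for line in log:
    print(line)
# try S -> a B c at 0
# try B -> r k at 1
# try B -> r at 1
```

The log shows the wasted attempt: B → rk is tried first, fails at k, and only then is B → r tried.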
Predictive Parsing
Stmt → if ......
  | while ......
  | cout ......
  | for .....
• With the rule above, the alternative of Stmt to use can be predicted by looking at the first token of the string being parsed.
• For example, when the string being parsed begins with the terminal if, the alternative to use is determined: it is the first alternative, the one for if.
• When we are trying to expand the nonterminal Stmt and the current token is if, we must choose the first production rule.
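The choice described above amounts to a lookup on the current token. A minimal sketch, assuming the alternatives are numbered 1 to 4 in the order listed; predict_stmt and STMT_ALTERNATIVES are hypothetical names, and the bodies of the alternatives stay elided as in the rule.

```python
# First token of each Stmt alternative -> number of the alternative to use.
STMT_ALTERNATIVES = {"if": 1, "while": 2, "cout": 3, "for": 4}


def predict_stmt(current_token):
    """Pick the Stmt alternative from one token of lookahead, without backtracking."""
    try:
        return STMT_ALTERNATIVES[current_token]
    except KeyError:
        raise SyntaxError(f"no Stmt alternative starts with {current_token!r}") from None


print(predict_stmt("if"))     # 1
print(predict_stmt("while"))  # 2
```

Because every alternative starts with a different terminal, the lookup never has two candidates, which is what makes prediction possible here.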
First Predictive Parser: LL(1)
• Top-down parser: starts with the start symbol on the stack and repeatedly replaces nonterminals until the string is generated.
• Predictive parser: predicts the next rewrite rule.
• First L of LL: reads the input string left to right.
• Second L of LL: produces a leftmost derivation.
• k in LL(k): the number of lookahead symbols.
First Predictive Parser: LL(1)
• Top-down, predictive parsing LL(1):
• L: Left-to-right scan of the tokens
• L: Leftmost derivation
• (1): One token of lookahead
Top-Down Parsing
LL(1) Parser Components
• The parser consists of 4 basic elements, as shown in the following figure:
[Figure: the LL(1) parser with its input buffer, stack, parsing table, and output]
LL(1) Parser Components
• The input buffer
  Holds the string to be parsed. We always assume that it ends with a special end-of-string (stop) symbol, $.
• Parsing output
  The sequence of rules that make up the derivation used to parse the string held in the input buffer.
LL(1) Parser Components
• Stack
  • At the bottom of the stack there is a special symbol, $, that marks the end of the parsing process and is used to stop.
  • At the beginning of parsing, the stack contains only the end symbol and the start symbol: $S.
  • The grammar symbols used during parsing are pushed and popped as the steps proceed.
  • When the stack has been emptied down to the end symbol $ and the input has also reached $, the string has been parsed correctly.
• Parsing table
  • A table T[N, a] that specifies which alternative to use at each point when parsing a string.
  • Each row represents a nonterminal symbol.
  • Each column represents a terminal symbol or the special symbol $.
  • Each cell holds the rule to use in that case.
LL(1) Parsing Table
• Assuming that we have the following grammar, the parsing table will look like the table below:
S → aBa
B → bB | ε

                 Terminals      End symbol
                 a       b      $
Nonterminals  S  (rule)
              B

LL(1) Parsing Table

Exercise 1
S → aBa
B → bB
  | ε
Using the LL(1) method with the grammar above, find the following:
• The parsing table design.
• The parse of the string abba.
LL(1) Parsing Table
S → aBa
B → bB
  | ε

     a          b          $
S    S → aBa
B    B → ε      B → bB

LL(1) Parsing Table

Secondly, Parsing
S → aBa
B → bB
  | ε

     a          b          $
S    S → aBa
B    B → ε      B → bB

LL(1) Parsing Table

stack    input    output
$S       abba$    S → aBa
$aBa     abba$
$aB      bba$     B → bB
$aBb     bba$
$aB      ba$      B → bB
$aBb     ba$
$aB      a$       B → ε
$a       a$
$        $        Accept, successful completion
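The stack trace above can be reproduced with a small table-driven loop. This is a sketch under stated assumptions: the names TABLE, NONTERMINALS, and ll1_parse are hypothetical, each input character is treated as one token, and the stack is printed with its top on the right, as in the trace.

```python
# Parsing table for S -> aBa, B -> bB | ε; an empty tuple stands for ε.
TABLE = {
    ("S", "a"): ("a", "B", "a"),
    ("B", "b"): ("b", "B"),
    ("B", "a"): (),
}
NONTERMINALS = {"S", "B"}


def ll1_parse(text, start="S"):
    """Return the trace as (stack, remaining input, output) rows."""
    stack = ["$", start]               # top of the stack is the end of the list
    tokens = list(text) + ["$"]
    pos = 0
    trace = []
    while True:
        top, look = stack[-1], tokens[pos]
        row = ["".join(stack), "".join(tokens[pos:]), ""]
        if top == "$" and look == "$":
            row[2] = "Accept"
            trace.append(tuple(row))
            return trace
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no rule for {top} on {look!r}")
            row[2] = f"{top} -> {''.join(rhs) or 'ε'}"
            stack.pop()
            stack.extend(reversed(rhs))  # push the right-hand side reversed
        elif top == look:
            stack.pop()                  # terminal matches: consume it
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, got {look!r}")
        trace.append(tuple(row))


for stack, remaining, output in ll1_parse("abba"):
    print(f"{stack:<8}{remaining:<8}{output}")
```

Pushing the right-hand side in reverse keeps its first symbol on top of the stack, which is why the parser always expands the leftmost nonterminal.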
LL(1) Parser – Example1 (cont.)

Derivation (leftmost):
S → aBa → abBa → abbBa → abba

Parse tree:
        S
      / | \
     a  B  a
       / \
      b   B
         / \
        b   B
            |
            ε


CS416 Compiler Design
Parsing by Stack
• The previous parsing steps using the stack are as follows:
Exercise 2
S → aK
K → bK
  | X
X → c

Using the LL(1) method with the grammar above, find the following:
• The parsing table design.
• The parse of the string abbc.
First: Designing the parsing table
S → aK
K → bK
  | X
X → c

     a         b         c         $
S    S → aK

LL(1) Parsing Table

First: Designing the parsing table
S → aK
K → bK
  | X
X → c

     a         b         c         $
S    S → aK
K              K → bK

LL(1) Parsing Table

First: Designing the parsing table
S → aK
K → bK
  | X
X → c

     a         b         c         $
S    S → aK
K              K → bK    K → X

LL(1) Parsing Table

First: Designing the parsing table
S → aK
K → bK
  | X
X → c

     a         b         c         $
S    S → aK
K              K → bK    K → X
X                        X → c

LL(1) Parsing Table

First: Designing the parsing table
1. S → aK
2. K → bK
3.   | X
4. X → c

     a    b    c    $
S    1
K         2    3
X              4
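The slides fill in the table by inspection. One standard way to obtain the same cells, not stated in the slides, is to place each rule in the column of every terminal that can begin its right-hand side (its FIRST set). The sketch below assumes a grammar with no ε-rules and no left recursion, which holds for this exercise; the names RULES, first, and build_table are hypothetical.

```python
RULES = [
    ("S", ("a", "K")),   # rule 1
    ("K", ("b", "K")),   # rule 2
    ("K", ("X",)),       # rule 3
    ("X", ("c",)),       # rule 4
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def first(symbol):
    """Terminals that can begin a string derived from `symbol`.

    Assumes no ε-rules and no left recursion, so only the first symbol
    of each right-hand side matters and the recursion terminates.
    """
    if symbol not in NONTERMINALS:
        return {symbol}
    result = set()
    for lhs, rhs in RULES:
        if lhs == symbol:
            result |= first(rhs[0])
    return result


def build_table():
    """Map (nonterminal, terminal) to the number of the rule to apply."""
    table = {}
    for number, (lhs, rhs) in enumerate(RULES, start=1):
        for terminal in first(rhs[0]):
            if (lhs, terminal) in table:
                raise ValueError("two rules compete for one cell: not LL(1)")
            table[(lhs, terminal)] = number
    return table


for cell, rule in sorted(build_table().items()):
    print(cell, rule)
```

A grammar with ε-rules, such as the one in Exercise 1, would also need the terminals that can follow a nonterminal, which this sketch deliberately leaves out.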
Secondly, Parsing
1. S → aK
2. K → bK
3.   | X
4. X → c

     a         b         c         $
S    S → aK
K              K → bK    K → X
X                        X → c

LL(1) Parsing Table

stack    input    output
$S       abbc$    S → aK
$Ka      abbc$
$K       bbc$     K → bK
$Kb      bbc$
$K       bc$      K → bK
$Kb      bc$
$K       c$       K → X
$X       c$       X → c
$c       c$
$        $        Accept, successful completion
Parsing by Stack
• The previous parsing steps using the stack are as follows:
Exercises
Using the grammar above, parse the following strings:
• abc
• abbbc
• abrc
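One way to check answers to these exercises is a hand-written predictive recursive-descent parser for the grammar S → aK, K → bK | X, X → c, with one method per nonterminal. This is a sketch only: the Parser class and the accepts helper are hypothetical names, and each character is treated as one token.

```python
class Parser:
    """Predictive recursive-descent parser with one token of lookahead."""

    def __init__(self, text):
        self.tokens = list(text) + ["$"]   # "$" marks the end of the input
        self.pos = 0

    @property
    def look(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.look != expected:
            raise SyntaxError(f"expected {expected!r}, got {self.look!r}")
        self.pos += 1

    def parse_S(self):                     # S -> aK
        self.match("a")
        self.parse_K()

    def parse_K(self):                     # K -> bK | X
        if self.look == "b":
            self.match("b")
            self.parse_K()
        elif self.look == "c":             # c is the only token that can start X
            self.parse_X()
        else:
            raise SyntaxError(f"unexpected {self.look!r} while parsing K")

    def parse_X(self):                     # X -> c
        self.match("c")


def accepts(text):
    parser = Parser(text)
    try:
        parser.parse_S()
        parser.match("$")                  # the whole input must be consumed
    except SyntaxError:
        return False
    return True


for string in ["abc", "abbbc", "abrc"]:
    print(string, accepts(string))
# abc True
# abbbc True
# abrc False
```

The string abrc is rejected because no rule for K applies when the lookahead is r, the same situation as an empty cell in the parsing table.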
Exercise 3
Using the following grammar:
S → E
E → id
E → (E Op E)
Op → +
Op → *
• Construct the LL(1) parsing table.
• Parse id + ( id * id )
LL(1) Parsing Table Construction
• S → E
• E → id
• E → (E Op E)
• Op → +
• Op → *

id ( ) + * $