7- Parsing Techniques- Top Down Parsing
Top-Down Parsing
Parsing
The lexical analyzer has translated the source
program into a sequence of tokens.
The parser must translate the sequence of
tokens into an intermediate representation.
Assume that the interface is that the parser can
call getNextToken to get the next token from the
lexical analyzer,
and that the parser can call a function called emit
that puts out intermediate representations,
currently unspecified.
The parser outputs error messages if the syntax
of the source program is wrong.
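The interface described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Lexer` class, the `(kind, text)` token shape, the `"EOF"` marker, and the `push ...` output lines are all hypothetical, since the slides leave the intermediate representation unspecified.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text); a hypothetical token shape


class Lexer:
    """Stand-in lexical analyzer that hands out pre-split tokens one at a time."""

    def __init__(self, tokens: List[Token]) -> None:
        self._tokens = iter(tokens)

    def get_next_token(self) -> Token:
        # Return ("EOF", "") once the input is exhausted.
        return next(self._tokens, ("EOF", ""))


output: List[str] = []


def emit(ir: str) -> None:
    """Record one piece of intermediate representation (format is an assumption)."""
    output.append(ir)


def parse(lexer: Lexer) -> None:
    """Toy parser: accepts a sequence of 'num' tokens and emits one line per token."""
    token = lexer.get_next_token()
    while token[0] != "EOF":
        if token[0] != "num":
            # Syntax errors are reported to the programmer.
            raise SyntaxError(f"unexpected token {token[1]!r}")
        emit(f"push {token[1]}")
        token = lexer.get_next_token()


parse(Lexer([("num", "2"), ("num", "3")]))
print(output)  # ['push 2', 'push 3']
```

The parser only ever sees the lexer through `get_next_token` and only produces results through `emit`, which is the separation the slide describes.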
Parsing
Parsing is a set of steps used to check whether a
formula (a line of code) conforms to the rules of a
grammar, so that we can judge whether that
formula belongs to the language or not.
The result of the parsing process is called a parse
tree.
If the formula (line of code) is incorrect, error
messages about the formula are returned to the
programmer; these are called syntax errors.
Logical errors in the program are not detected by
the compiler and therefore do not generate any
error messages; the error shows up in the final output of
the program.
Different Types of Parsing
Top-Down Parsing:
Beginning with the start symbol, try to
guess the productions to apply to end
up at the user's program.
Bottom-Up Parsing:
Beginning with the user's program, try
to apply productions in reverse to
convert the program back into the
start symbol.
Parsing Techniques
There are two parsing techniques that can be used
by the compiler to parse formulas:
Top-down parsing
It is the easier of the two to follow and to carry out by hand.
It starts with the start symbol of the grammar and
tries to find a derivation that reaches the
formula in the line of code.
Bottom-Up Parsing
It is the opposite of the previous method: we
start from the line of code and try to reduce
its elements, using the grammar rules, until we
reach the start symbol.
Top-Down Parsing
A top-down parsing algorithm parses an input string in such a
way that the implied traversal of the parse tree occurs from
the root to the leaves.
Top-down parsers come in two forms:
backtracking parsers and predictive parsers.
A predictive parser attempts to predict the next construction
using one or more lookahead tokens.
Two well-known top-down parsing methods are:
recursive-descent parsing and LL(1) parsing.
Recursive descent parsing is the most suitable method for a
handwritten parser.
Parsing: Top-Down, Bottom-Up
Using the following grammar:
E → E + E
  | 2
parse the formula 2 + 2.
The formula can be parsed from top to bottom, starting
from the start symbol, as follows:
E ⇒ E+E ⇒ 2+E ⇒ 2+2
It can be parsed from bottom to top, starting with the
formula and ending with the start symbol, where each step
applies a production in reverse:
2+2 ⇒ E+2 ⇒ E+E ⇒ E
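The top-down derivation can be reproduced with a few lines of Python. This is a minimal sketch: the function name `leftmost_derivation` and the way the chosen productions are passed in as a list are assumptions made for illustration, not part of the slides.

```python
def leftmost_derivation(start, choices, nonterminals):
    """Apply each chosen right-hand side to the leftmost nonterminal in turn."""
    form = [start]
    steps = ["".join(form)]
    for rhs in choices:
        # Find the leftmost nonterminal in the current sentential form.
        i = next(k for k, sym in enumerate(form) if sym in nonterminals)
        form[i:i + 1] = rhs  # replace it with the chosen right-hand side
        steps.append("".join(form))
    return steps


# Productions chosen: E -> E+E, then E -> 2, then E -> 2.
steps = leftmost_derivation("E", [["E", "+", "E"], ["2"], ["2"]], {"E"})
print(" => ".join(steps))  # E => E+E => 2+E => 2+2
```

Reading the printed steps from right to left gives the order of a bottom-up parse's result, but not its steps: the bottom-up sequence in the slides reduces the leftmost 2 first.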
Recursive-Descent Parsing
Recursive-descent parsing (with backtracking) follows these steps:
The rule to apply is chosen based on the terminal symbols
in the formula to be parsed, always working from the
leftmost side.
If a path is chosen that does not match the
formula, the parser backs up, discards it, and tries another path from
among the available alternatives; it may have to
back up several times.
It is an inefficient way of parsing because of the large amount of trial
and error involved.
S → aBc
B → rk
  | r
Input: arc
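The trial-and-error behaviour on the input arc can be sketched as follows. This is one possible way to write a backtracking top-down parser, using Python generators to enumerate alternatives; the names `GRAMMAR`, `match`, `parse`, and the `log` list are assumptions introduced for the example.

```python
GRAMMAR = {
    "S": [["a", "B", "c"]],
    "B": [["r", "k"], ["r"]],
}


def match(symbols, text, pos, log):
    """Yield every end position at which `symbols` can match text[pos:]."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:  # nonterminal: try each alternative in order
        for alt in GRAMMAR[first]:
            log.append(f"try {first} -> {''.join(alt)}")
            for mid in match(alt, text, pos, log):
                yield from match(rest, text, mid, log)
    elif pos < len(text) and text[pos] == first:  # terminal: must equal the input
        yield from match(rest, text, pos + 1, log)
    # Otherwise nothing is yielded, which makes the caller back up.


def parse(text):
    log = []
    ok = any(end == len(text) for end in match(["S"], text, 0, log))
    return ok, log


ok, log = parse("arc")
print(ok)   # True
print(log)  # ['try S -> aBc', 'try B -> rk', 'try B -> r']
```

The log shows the wasted attempt: B → rk is tried first, fails when k does not match c, and only then is B → r tried.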
Predictive Parsing
Stmt → if ......
     | while ......
     | cout ......
     | for .....
Using the rule above, it is possible to predict which
alternative of Stmt to use by
looking at the first token of the formula being
parsed.
For example, when the formula begins with the
terminal if, the alternative to use is the first
one, the one that starts with if.
When we are trying to expand the nonterminal Stmt and the
current token is if, we must choose the first production rule.
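The prediction step can be sketched as a simple lookup on the current token. The right-hand sides are elided in the slides, so they are kept elided here; the names `STMT_ALTERNATIVES` and `choose_stmt_production` are hypothetical.

```python
# Only the leading keyword of each alternative is known from the slides.
STMT_ALTERNATIVES = {
    "if": "Stmt -> if ...",
    "while": "Stmt -> while ...",
    "cout": "Stmt -> cout ...",
    "for": "Stmt -> for ...",
}


def choose_stmt_production(lookahead: str) -> str:
    """Pick the Stmt alternative from one token of lookahead, with no backtracking."""
    try:
        return STMT_ALTERNATIVES[lookahead]
    except KeyError:
        raise SyntaxError(f"no Stmt alternative starts with {lookahead!r}") from None


print(choose_stmt_production("if"))  # Stmt -> if ...
```

This works because each alternative starts with a different terminal, so a single token is enough to decide.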
First Predictive Parser: LL(1)
top-down parser: starts with the start symbol on the
stack and repeatedly replaces nonterminals until the
string is generated.
predictive parser: predicts the next rewrite rule.
First L of LL: reads the input string left to
right.
Second L of LL: produces a leftmost
derivation.
k: number of lookahead symbols (k = 1 for LL(1)).
Top-down, predictive parsing LL(1):
L: Left-to-right scan of the tokens
L: Leftmost derivation.
(1): One token of lookahead
Top-Down Parsing
LL(1) Parser Components
The parser consists of 4 basic elements, as
shown in the following figure:
[Figure: LL(1) parser components: Input Buffer, Stack, Parsing Table, Output]
The input buffer
Holds the formula to be parsed. We
always assume that it ends with a
special symbol that marks the end of the
string, which is the
symbol $.
Parsing output
The sequence of rules that make up the
derivation used to parse the formula
stored in the input buffer.
Stack
• At the bottom of the stack there is a special symbol, $, that marks the
end of the parsing process.
• At the beginning of parsing, the stack contains only the end symbol
and the start symbol: $S.
• The grammar symbols used in the parsing process are pushed and popped
as the steps proceed.
• When the stack contains only the end symbol $ and the remaining input is
also $, the formula has been parsed correctly.
Parsing table
• Table T[N,a] specifies which alternative to use when nonterminal N is on
top of the stack and a is the next input symbol.
• Each row represents a nonterminal symbol.
• Each column represents a terminal symbol, as well as the special symbol
$.
• Each cell holds the rule used in that case.
LL(1) Parsing Table
Assuming that we have the following grammar, the
parsing table has the following layout. Each row label is a
nonterminal, and each filled cell holds a rule:
S → aBa
B → bB
  | ε
      a          b          $
S     S → aBa
B
Using the LL(1) method with the preceding grammar, do the
following:
Design the parsing table.
Parse the formula abba.
LL(1) Parsing Table
S → aBa
B → bB
  | ε
      a          b          $
S     S → aBa
B     B → ε      B → bB
CS416 Compiler Design
Parsing by Parsing
The parsing steps for the previous formula, using the stack, are as
follows:
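The stack-based steps for abba can be generated with a small table-driven driver. This is a sketch under assumptions: the names `TABLE`, `END`, `EPSILON`, and `ll1_parse` are illustrative, the table encodes S → aBa, B → bB | ε as given above, and the stack is printed with its top at the right.

```python
END = "$"
EPSILON = ()  # empty right-hand side

# TABLE[nonterminal][lookahead] = right-hand side to push
TABLE = {
    "S": {"a": ("a", "B", "a")},
    "B": {"b": ("b", "B"), "a": EPSILON},
}


def ll1_parse(text, start="S"):
    """Return the (stack, input, output) trace rows, or raise SyntaxError."""
    tokens = list(text) + [END]
    stack = [END, start]  # the top of the stack is the end of the list
    pos, rows = 0, []
    while True:
        top, look = stack[-1], tokens[pos]
        snapshot = ("".join(stack), "".join(tokens[pos:]))
        if top == END and look == END:
            rows.append(snapshot + ("accept",))
            return rows
        if top in TABLE:  # nonterminal: expand using the table
            rhs = TABLE[top].get(look)
            if rhs is None:
                raise SyntaxError(f"no rule for ({top}, {look})")
            stack.pop()
            stack.extend(reversed(rhs))  # first symbol of the rhs ends up on top
            rows.append(snapshot + (f"{top} -> {''.join(rhs) or 'epsilon'}",))
        elif top == look:  # terminal: match it and advance the input
            stack.pop()
            pos += 1
            rows.append(snapshot + (f"match {look}",))
        else:
            raise SyntaxError(f"expected {top}, found {look}")


for stack, rest, action in ll1_parse("abba"):
    print(f"{stack:<6}{rest:<7}{action}")
```

Each printed row is one line of the 3-column model (Stack, Input, Output); the last row has only $ on the stack and $ in the input, which is the accept condition.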
Exercise 2
1. S → aK
2. K → bK
3.   | X
4. X → c
Parsing table:
      a         b         c         $
S     S → aK
K               K → bK    K → X
X                         X → c
The same table using rule numbers:
      a    b    c    $
S     1
K          2    3
X               4
Secondly, Parsing, using the numbered rules and the table above.
LL(1) Parsing Table Construction
S → E
E → id
E → (E Op E)
Op → +
Op → *
The table is filled in one rule at a time: each rule is entered in the row of
its left-hand side, under every terminal that can begin its right-hand side.
      id         (               )    +         *         $
S     S → E      S → E
E     E → id     E → (E Op E)
Op                                    Op → +    Op → *
LL(1) Parsing Table Construction
1. S → E
2. E → id
3. E → (E Op E)
4. Op → +
5. Op → *
      id   (    )    +    *    $
S     1    1
E     2    3
Op                   4    5
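The numbered table can also be computed mechanically. The sketch below uses FIRST sets, which is one standard way to fill an LL(1) table; it relies on the assumption that no rule derives the empty string (true for this grammar), so FOLLOW sets are not needed. The names `RULES`, `first`, and `build_table` are illustrative.

```python
RULES = [  # numbered as in the slides; index 0 is unused
    None,
    ("S", ["E"]),
    ("E", ["id"]),
    ("E", ["(", "E", "Op", "E", ")"]),
    ("Op", ["+"]),
    ("Op", ["*"]),
]
NONTERMINALS = {"S", "E", "Op"}


def first(symbol, seen=frozenset()):
    """FIRST set of one symbol; valid only because no rule derives the empty string."""
    if symbol not in NONTERMINALS:
        return {symbol}
    result = set()
    for lhs, rhs in RULES[1:]:
        # `seen` stops the recursion from looping on left-recursive rules.
        if lhs == symbol and rhs[0] not in seen:
            result |= first(rhs[0], seen | {symbol})
    return result


def build_table():
    """Map (nonterminal, terminal) to the number of the rule to apply."""
    table = {}
    for number, (lhs, rhs) in enumerate(RULES[1:], start=1):
        for terminal in first(rhs[0]):
            if (lhs, terminal) in table:
                raise ValueError(f"conflict at ({lhs}, {terminal}): not LL(1)")
            table[(lhs, terminal)] = number
    return table


table = build_table()
for key in sorted(table):
    print(key, table[key])
```

A conflict (two rules competing for the same cell) would mean the grammar is not LL(1); for this grammar every cell receives at most one rule.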
Exercises
Using the preceding grammar, parse the following formulas:
(id + id)
(id + id) * id
(id * id)
id * id
Exercise 4
STMT → if EXPR then STMT (1)
| while EXPR do STMT (2)
| EXPR ; (3)
EXPR → TERM == id (4)
| zero? TERM (5)
| not EXPR (6)
| ++ id (7)
| -- id (8)
TERM → id (9)
| constant (10)
Constructing the parsing table
The table for the grammar above is filled in row by row; each cell holds the
number of the rule to apply.
        if  then  while  do  zero?  not  ++  --  id  constant  ==  ;  $
STMT    1         2          3      3    3   3   3   3
EXPR                         5      6    7   8   4   4
TERM                                             9   10
Exercises
Parse the following formulas:
constant = id;
if constant then ++id;
while id do not constant = id;
using:
the 3-column model (Stack, Input, Output)
the stack with a drawing