0% found this document useful (0 votes)
48 views

D LR Parsing

D-lr-parsing
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
48 views

D LR Parsing

D-lr-parsing
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 41

CSE 401 – Compilers

LR Parsing
Hal Perkins
Autumn 2010

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-1


Agenda
„ LR Parsing
„ Table-driven Parsers
„ Parser States
„ Shift-Reduce and Reduce-Reduce
conflicts

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-2


LR(1) Parsing
„ We’ll look at LR(1) parsers
„ Left to right scan, Rightmost derivation, 1
symbol lookahead
„ Almost all practical programming
languages have an LR(1) grammar
„ LALR(1), SLR(1), etc. – subsets of LR(1)
„ LALR(1) can parse most real languages, tables
are more compact, and is used by YACC/Bison/
CUP/etc.
10/10/2010 © 2002-10 Hal Perkins & UW CSE D-3
Bottom-Up Parsing
„ Idea: Read the input left to right
„ Whenever we’ve matched the right
hand side of a production, reduce it to
the appropriate non-terminal and add
that non-terminal to the parse tree
„ The upper edge of this partial parse
tree is known as the frontier

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-4


Example
„ Grammar „ Bottom-up Parse

S ::= aAB e
A ::= Abc | b
B ::= d

a b b c d e
10/10/2010 © 2002-10 Hal Perkins & UW CSE D-5
Details
„ The bottom-up parser reconstructs a reverse
rightmost derivation
„ Given the rightmost derivation
S =>β1=>β2=>…=>βn-2=>βn-1=>βn = w
the parser will first discover βn-1=>βn , then
βn-2=>βn-1 , etc.
„ Parsing terminates when
„ β1 reduced to S (start symbol, success), or
„ No match can be found (syntax error)

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-6


How Do We Parse with This?
„ Key: given what we’ve already seen and the
next input symbol, decide what to do.
„ Choices:
„ Perform a reduction
„ Look ahead further
„ Can reduce A=>β if both of these hold:
„ A=>β is a valid production
„ A=>β is a step in this rightmost derivation
„ This is known as a shift-reduce parser

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-7


Sentential Forms
„ If S =>* α, the string α is called a sentential
form of the of the grammar
„ In the derivation
S =>β1=>β2=>…=>βn-2=>βn-1=>βn = w
each of the βi are sentential forms
„ A sentential form in a rightmost derivation is
called a right-sentential form (similarly for
leftmost and left-sentential)

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-8


Handles
„ Informally, a substring of the tree
frontier that matches the right side of a
production
„ Even if A::=β is a production, β is a handle
only if it matches the frontier at a point
where A::=β was used in that derivation
„ β may appear in many other places in the
frontier without being a handle for that
particular production

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-9


Handles (cont.)
„ Formally, a handle of a right-sentential
form γ is a production A ::= β and a
position in γ where β may be replaced
by A to produce the previous right-
sentential form in the rightmost
derivation of γ

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-10


Handle Examples
„ In the derivation
S => aABe => aAde => aAbcde => abbcde
„ abbcde is a right sentential form whose
handle is A::=b at position 2
„ aAbcde is a right sentential form whose
handle is A::=Abc at position 4
„ Note: some books take the left of the match as
the position

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-11


Implementing Shift-Reduce
Parsers
„ Key Data structures
„ A stack holding the frontier of the tree
„ A string with the remaining input

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-12


Shift-Reduce Parser
Operations
„ Reduce – if the top of the stack is the
right side of a handle A::=β, pop the
right side β and push the left side A
„ Shift – push the next input symbol onto
the stack
„ Accept – announce success
„ Error – syntax error discovered

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-13


S ::= aABe
A ::= Abc | b

Shift-Reduce Example B ::= d

Stack Input Action


$ abbcde$ shift

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-14


How Do We Automate This?
„ Def. Viable prefix – a prefix of a right-
sentential form that can appear on the stack
of the shift-reduce parser
„ Equivalent: a prefix of a right-sentential form that
does not continue past the rightmost handle of
that sentential form
„ Idea: Construct a DFA to recognize viable
prefixes given the stack and remaining input
„ Perform reductions when we recognize them

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-15


S ::= aABe
A ::= Abc | b
DFA for prefixes of B ::= d

e
accept
8 9 S ::= aABe
$ B

start a A b c
1 2 3 6 7 A ::= Abc

b d

4 5

A ::= b B ::= d

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-16


S ::= aABe
A ::= Abc | b
Trace B ::= d

accept e
8 9 S ::= aABe
Stack Input B
$ abbcde$ start
$
a A b c
1 2 3 6 7 A ::= Abc
b d

4 5
A ::= b B ::= d

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-17


Observations
„ Way too much backtracking
„ We want the parser to run in time
proportional to the length of the input
„ Where the heck did this DFA come from
anyway?
„ From the underlying grammar
„ We’ll defer construction details for now

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-18


Avoiding DFA Rescanning
„ Observation: after a reduction, the contents
of the stack are the same as before except
for the new non-terminal on top
„ ∴ Scanning the stack will take us through the
same transitions as before until the last one
„ ∴ If we record state numbers on the stack, we
can go directly to the appropriate state when we
pop the right hand side of a production from the
stack

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-19


Stack
„ Change the stack to contain pairs of
states and symbols from the grammar
$s0 X1 s1 X2 s2 … Xn sn
„ State s0 represents the accept state
(Not always added – depends on particular presentation)

„ Observation: in an actual parser, only the state numbers need


to be pushed, since they implicitly contain the symbol
information, but for explanations it’s clearer to use both.

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-20


Encoding the DFA in a Table
„ A shift-reduce parser’s DFA can be
encoded in two tables
„ One row for each state
„ action table encodes what to do given the
current state and the next input symbol
„ goto table encodes the transitions to take
after a reduction

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-21


Actions (1)
„ Given the current state and input
symbol, the main possible actions are
„ si – shift the input symbol and state i onto
the stack (i.e., shift and move to state i )
„ rj – reduce using grammar production j
„ The production number tells us how many
<symbol, state> pairs to pop off the stack

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-22


Actions (2)
„ Other possible action table entries
„ accept
„ blank – no transition – syntax error
„ A LR parser will detect an error as soon as
possible on a left-to-right scan
„ A real compiler needs to produce an error
message, recover, and continue parsing when
this happens

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-23


Goto
„ When a reduction is performed,
<symbol, state> pairs are popped from
the stack revealing a state uncovered_s
on the top of the stack
„ goto[uncovered_s , A] is the new state
to push on the stack when reducing
production A ::= β (after popping β and
revealing state uncovered_s on top)
10/10/2010 © 2002-10 Hal Perkins & UW CSE D-24
S ::= aABe
A ::= Abc | b
Reminder: DFA for B ::= d

accept e
8 9 S ::= aABe
$ B

start a A b c
1 2 3 6 7 A ::= Abc

b d

4 5

A ::= b B ::= d

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-25


1. S ::= aABe
2. A ::= Abc
3. A ::= b
LR Parse Table for 4. B ::= d
action goto
State
a b c d e $ A B S
1 s2 acc g1
2 s4 g3
3 s6 s5 g8
4 r3 r3 r3 r3 r3 r3
5 r4 r4 r4 r4 r4 r4
6 s7
7 r2 r2 r2 r2 r2 r2
8 s9
9 r1 r1 r1 r1 r1 r1

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-26


LR Parsing Algorithm (1)
word = scanner.getToken(); } else if (action[s, word] = accept ) {
while (true) { return;
s = top of stack; } else {
if (action[s, word] = si ) { // no entry in action table
push word; push i (state); report syntax error;
word = scanner.getToken(); halt or attempt recovery;
} else if (action[s, word] = rj ) { }
pop 2 * length of right side of
production j (2*|β|);
uncovered_s = top of stack;
push left side A of production j ;
push state goto[uncovered_s, A];
}

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-27


1. S ::= aABe
2. A ::= Abc
3. A ::= b
Example 4. B ::= d

action goto
Stack Input S
a b c d e $ A B S
$ abbcde$
1 s2 ac g1

2 s4 g3

3 s6 s5 g8

4 r3 r3 r3 r3 r3 r3

5 r4 r4 r4 r4 r4 r4

6 s7

7 r2 r2 r2 r2 r2 r2

8 s9

9 r1 r1 r1 r1 r1 r1

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-28


LR States
„ Idea is that each state encodes
„ The set of all possible productions that we
could be looking at, given the current state
of the parse, and
„ Where we are in the right hand side of
each of those productions

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-29


Items
„ An item is a production with a dot in
the right hand side
„ Example: Items for production A ::= XY
A ::= .XY
A ::= X.Y
A ::= XY.
„ Idea: The dot represents a position in
the production
10/10/2010 © 2002-10 Hal Perkins & UW CSE D-30
S ::= aABe
A ::= Abc | b
DFA for B ::= d

1 8 9
$ e S ::= aABe.
S ::= .aABe accept S ::= aAB.e
B
a
2 3
S ::= a.ABe S ::= aA.Be 6
A b
A ::= .Abc A ::= A.bc A ::= Ab.c
A ::= .b B ::= .d c
7
b d
4 5 A ::= Abc.
A ::= b. B ::= d.

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-31


Problems with Grammars
„ Grammars can cause problems when
constructing a LR parser
„ Shift-reduce conflicts
„ Reduce-reduce conflicts

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-32


Shift-Reduce Conflicts
„ Situation: both a shift and a reduce are
possible at a given point in the parse
(equivalently: in a particular state of the
DFA)
„ Classic example: if-else statement
S ::= ifthen S | ifthen S else S

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-33


1. S ::= ifthen S
Parser States for 2. S ::= ifthen S else S

S ::= . ifthen S
1
S ::= . ifthen S else S „ State 3 has a shift-
ifthen reduce conflict
2 S ::= ifthen . S „ Can shift past else
S ::= ifthen . S else S into state 4 (s4)
S
„ Can reduce (r1)
3 S ::= ifthen S . S ::= ifthen S
S ::= ifthen S . else S
else
4 S ::= ifthen S else . S (Note: other S ::= . ifthen
items not included in states
2-4 to save space)

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-34


Solving Shift-Reduce Conflicts
„ Fix the grammar
„ Done in Java reference grammar, others
„ Use a parse tool with a “longest match”
rule – i.e., if there is a conflict, choose
to shift instead of reduce
„ Does exactly what we want for if-else case
„ Guideline: a few shift-reduce conflicts are
fine, but be sure they do what you want
10/10/2010 © 2002-10 Hal Perkins & UW CSE D-35
Reduce-Reduce Conflicts
„ Situation: two different reductions are
possible in a given state
„ Contrived example
S ::= A
S ::= B
A ::= x
B ::= x

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-36


1. S ::= A
2. S ::= B
Parser States for 3.
4.
A ::= x
B ::= x
S ::= .A
1
S ::= .B „ State 2 has a
A ::= .x reduce-reduce
B ::= .x
x
conflict (r3, r4)
2
A ::= x.
B ::= x.

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-37


Handling Reduce-Reduce
Conflicts
„ These normally indicate a serious
problem with the grammar.
„ Fixes
„ Use a different kind of parser generator
that takes lookahead information into
account when constructing the states
„ Most practical tools use this information
„ Fix the grammar

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-38


Another Reduce-Reduce
Conflict
„ Suppose the grammar separates
arithmetic and boolean expressions
expr ::= aexp | bexp
aexp ::= aexp * aident | aident
bexp ::= bexp && bident | bident
aident ::= id
bident ::= id
„ This will create a reduce-reduce conflict

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-39


Covering Grammars
„ A solution is to merge aident and bident into
a single non-terminal (or use id in place of
aident and bident everywhere they appear)
„ This is a covering grammar
„ Includes some programs that are not generated
by the original grammar
„ Use the type checker or other static semantic
analysis to weed out illegal programs later

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-40


Coming Attractions
„ Constructing LR tables
„ We’ll present a simple version (SLR(0)) in
lecture, then talk about extending it to
LR(1)
„ LL parsers and recursive descent
„ Continue reading ch. 3

10/10/2010 © 2002-10 Hal Perkins & UW CSE D-41

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy