Parsing: Compiler Design


Grammars

Recursive-Descent Parsing

Top-Down Predictive Parsing

Parsing
Compiler Design

CSE 504

1. Grammars
2. Recursive-Descent Parsing
3. Top-Down Predictive Parsing


Version: 1.3 20:16:32 2012/02/14 Compiled at 07:22 on 2012/02/22 Compiler Design Parsing CSE 504 1 / 37


Parsing

A.k.a. Syntax Analysis
Recognize sentences in a language.
Discover the structure of a document/program.
Construct (implicitly or explicitly) a tree (called a parse tree) to represent the structure.
The parse tree is used to guide translation.

Grammars
The syntactic structure of a language is defined using grammars.
Grammars (like regular expressions) specify a set of strings over an alphabet.
Efficient recognizers (automata) can be constructed to determine whether a string is in the language.
Language hierarchy:
    Finite Languages (FL): Enumeration
    Regular Languages (RL ⊃ FL): Regular Expressions
    Context-Free Languages (CFL ⊃ RL): Context-Free Grammars

Regular Languages

Languages represented by regular expressions
Languages recognized by finite automata
Examples:
    {a, b, c}
    {ε, a, b, aa, ab, ba, bb, ...}
    {(ab)^n | n ≥ 0}
Not regular: {a^n b^n | n ≥ 0}


Context-Free Grammars

Terminal Symbols: Tokens
Nonterminal Symbols: set of strings made up of tokens
Productions: Rules for constructing the set of strings associated with nonterminal symbols.
    Example: Stmt → while Expr do Stmt
Start symbol: a nonterminal symbol that represents the set of all strings in the language.

Grammars
Notation where recursion is explicit.
Examples:
    {ε, a, b, aa, ab, ba, bb, ...} = L((a | b)*):
        S → ε
        S → E S
        E → a
        E → b
      Notational shorthand:
        S → ε | E S
        E → a | b
    {a^n b^n | n ≥ 0}:
        S → ε
        S → a S b
    {w | no. of a's in w = no. of b's in w}: not expressible as a regular expression, but context-free, e.g. S → ε | a S b S | b S a S.
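The grammar S → ε | a S b can be read directly as a recursive membership test. The following is a minimal sketch in Python; the function name derives_anbn is hypothetical and only illustrates how the explicit recursion in the grammar carries over to code.

```python
def derives_anbn(s: str) -> bool:
    """Recognize {a^n b^n | n >= 0} by following S -> ε | a S b."""
    if s == "":
        return True                      # S -> ε
    if s[0] == "a" and s[-1] == "b":
        return derives_anbn(s[1:-1])     # S -> a S b: strip one a and one b
    return False                         # no production applies


if __name__ == "__main__":
    for word in ["", "ab", "aabb", "abab", "aab"]:
        print(repr(word), derives_anbn(word))
```

Each recursive call corresponds to one use of S → a S b, so the recursion depth equals n.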


The First Useful Example

E → E + E
E → E - E
E → E * E
E → E / E
E → ( E )
E → id

L(E) = {id, id + id, id - id, ..., id + (id * id) - id, ...}


Context-Free Grammars: Notations


Production: rule with nonterminal symbol on left hand side, and a (possibly empty) sequence of terminal or nonterminal symbols on the right hand side.
Notations:
    Terminals: lower case letters, digits, punctuation
    Nonterminals: Upper case letters
    Arbitrary Terminals/Nonterminals: X, Y, Z
    Strings of Terminals: u, v, w
    Strings of Terminals/Nonterminals: α, β, γ
    Start Symbol: S

Derivations
Grammar:
    E → E + E
    E → id

E derives id + id:
    E ⇒ E + E
      ⇒ E + id
      ⇒ id + id

αAβ ⇒ αγβ iff A → γ is a production in the grammar.
α ⇒* β if α derives β in zero or more steps. Example: E ⇒* id + id
Sentence: A sequence of terminal symbols w such that S ⇒+ w (where S is the start symbol)
Sentential Form: A sequence of terminal/nonterminal symbols α such that S ⇒* α

Derivations
Grammar:
    E → E + E
    E → id

Leftmost derivation: Leftmost nonterminal is replaced first:
    E ⇒ E + E
      ⇒ id + E
      ⇒ id + id
Written as E ⇒*lm id + id

Rightmost derivation: Rightmost nonterminal is replaced first:
    E ⇒ E + E
      ⇒ E + id
      ⇒ id + id
Written as E ⇒*rm id + id
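The two derivation orders can be replayed mechanically. The sketch below is illustrative only: the names GRAMMAR, rewrite and derive are hypothetical, and a derivation is given as a list of production indices (0 for E → E + E, 1 for E → id).

```python
# Productions of the example grammar; index 0 is E -> E + E, index 1 is E -> id.
GRAMMAR = {"E": [["E", "+", "E"], ["id"]]}


def rewrite(form, choice, leftmost=True):
    """Replace the leftmost (or rightmost) nonterminal of a sentential form."""
    positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
    i = positions[0] if leftmost else positions[-1]
    return form[:i] + GRAMMAR[form[i]][choice] + form[i + 1:]


def derive(choices, leftmost=True):
    """Return every sentential form produced by applying `choices` in order."""
    form = ["E"]
    steps = [" ".join(form)]
    for choice in choices:
        form = rewrite(form, choice, leftmost)
        steps.append(" ".join(form))
    return steps


if __name__ == "__main__":
    print(" => ".join(derive([0, 1, 1], leftmost=True)))
    print(" => ".join(derive([0, 1, 1], leftmost=False)))
```

Both calls end in id + id; only the intermediate sentential forms differ.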

Parse Trees
A Parse Tree is a graphical representation of a derivation.

Grammar:
    E → E + E
    E → id

Both derivations
    E ⇒ E + E ⇒ id + E ⇒ id + id
    E ⇒ E + E ⇒ E + id ⇒ id + id
correspond to the same parse tree:

          E
        / | \
       E  +  E
       |     |
       id    id

A Parse Tree succinctly captures the structure of a sentence.

Ambiguity
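A Grammar is ambiguous if there are multiple parse trees for the same sentence.

Grammar:
    E → E + E
    E → E * E
    E → id
Sentence: id + id * id

[Figure: the two parse trees for id + id * id, one grouping the sentence as (id + id) * id and the other as id + (id * id).]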
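Ambiguity can be observed by enumerating parse trees. The following sketch, which is not part of the original slides, lists every parse tree of a token sequence under E → E + E | E * E | id by trying each operator as the root; the function name parse_trees and the tuple encoding of trees are assumptions made for illustration.

```python
from functools import lru_cache


def parse_trees(tokens):
    """Return all parse trees for `tokens` under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(i, j):
        # All trees deriving tokens[i:j] from E.
        result = []
        if j - i == 1 and tokens[i] == "id":
            result.append("id")                      # E -> id
        for k in range(i + 1, j - 1):                # candidate root operator
            if tokens[k] in ("+", "*"):
                for left in trees(i, k):
                    for right in trees(k + 1, j):
                        result.append((left, tokens[k], right))
        return tuple(result)

    return list(trees(0, len(tokens)))


if __name__ == "__main__":
    for tree in parse_trees(["id", "+", "id", "*", "id"]):
        print(tree)
```

A sentence with more than one tree in the returned list witnesses the ambiguity of the grammar.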


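Disambiguation
Express preference for one parse tree over others.
Example: id + id * id
The usual precedence of * over + means that the tree grouping the sentence as id + (id * id) is preferred.

[Figure: the two parse trees for id + id * id, with the id + (id * id) tree marked "Preferred".]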

Parsing
Construct a parse tree for a given string.

Grammar:
    S → (S)S
    S → a
    S → ε

Parse tree for (a)a:

            S
         / | | \
        (  S )  S
           |    |
           a    a

Parse tree for (a)(a):

            S
         / | | \
        (  S )  S
           |  / | | \
           a (  S )  S
                |    |
                a    ε


A Procedure for Parsing

Grammar:
    S → a

Algorithm:
    parse_S() {
      switch (input_token) {
        case TOKEN_A:
          consume(TOKEN_A);
          return;
        default:
          /* Parse Error */
      }
    }


A Procedure for Parsing (Contd.)


Grammar:
    S → a
    S → ε

Algorithm:
    parse_S() {
      switch (input_token) {
        case TOKEN_A:    /* Production 1 */
          consume(TOKEN_A);
          return;
        case TOKEN_EOF:  /* Production 2 */
          return;
        default:
          /* Parse Error */
      }
    }


A Procedure for Parsing (Contd.)


Grammar:
    S → (S)S
    S → a
    S → ε

Algorithm:
    parse_S() {
      switch (input_token) {
        case TOKEN_OPEN_PAREN:   /* Production 1 */
          consume(TOKEN_OPEN_PAREN);
          parse_S();
          consume(TOKEN_CLOSE_PAREN);
          parse_S();
          return;
        case TOKEN_A:            /* Production 2 */
          consume(TOKEN_A);
          return;
        case TOKEN_CLOSE_PAREN:
        case TOKEN_EOF:          /* Production 3 */
          return;
        default:
          /* Parse Error */
      }
    }
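The procedure for S → (S)S | a | ε can be turned into a small runnable recognizer. This is a sketch under simplifying assumptions: each input character is one token, the end of input is represented by the string "EOF", and the names Parser, ParseError and accepts are hypothetical.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


class Parser:
    def __init__(self, text):
        self.tokens = list(text) + ["EOF"]   # one character per token
        self.pos = 0

    @property
    def token(self):
        return self.tokens[self.pos]         # current lookahead token

    def consume(self, expected):
        if self.token != expected:
            raise ParseError(f"expected {expected!r}, found {self.token!r}")
        self.pos += 1

    def parse_S(self):
        if self.token == "(":                # S -> ( S ) S
            self.consume("(")
            self.parse_S()
            self.consume(")")
            self.parse_S()
        elif self.token == "a":              # S -> a
            self.consume("a")
        elif self.token in (")", "EOF"):     # S -> ε
            return
        else:
            raise ParseError(f"unexpected token {self.token!r}")


def accepts(text):
    """Return True if `text` is a sentence of the grammar."""
    parser = Parser(text)
    try:
        parser.parse_S()
        parser.consume("EOF")                # the whole input must be used
        return True
    except ParseError:
        return False


if __name__ == "__main__":
    for text in ["(a)a", "(a)(a)", "", "(a", "aa"]:
        print(repr(text), accepts(text))
```

The production is chosen from the lookahead token alone, so no input is ever re-read.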


Predictive Parsing: Restrictions

May not be able to choose a unique production:
    S → a B d
    B → b
    B → b c
In general, we may need a backtracking parser: Recursive Descent Parsing
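A backtracking recognizer for this grammar can be sketched with generators: every alternative of a nonterminal is tried in turn, and a failed continuation simply resumes with the next alternative. The code is illustrative, not the method of the slides; the names GRAMMAR, match and accepts are hypothetical, and the approach would loop forever on a left-recursive grammar.

```python
GRAMMAR = {
    "S": [["a", "B", "d"]],
    "B": [["b"], ["b", "c"]],
}


def match(symbols, text, pos):
    """Yield every input position reachable after deriving `symbols` from text[pos:]."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        for alternative in GRAMMAR[first]:            # try each production in turn
            for mid in match(alternative, text, pos):
                yield from match(rest, text, mid)     # backtrack if this fails
    elif pos < len(text) and text[pos] == first:
        yield from match(rest, text, pos + 1)         # terminal matches the input


def accepts(text):
    """Return True if some sequence of choices consumes the whole input."""
    return any(end == len(text) for end in match(["S"], text, 0))


if __name__ == "__main__":
    for text in ["abd", "abcd", "abc", "ad"]:
        print(text, accepts(text))
```

For the input abcd the first alternative B → b is tried and abandoned before B → b c succeeds, which is exactly the choice a one-token lookahead cannot make.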


FIRST and FOLLOW


Grammar: S → (S)S | a | ε

FIRST(X) = First symbol of any string that can be derived from X.
    FIRST(S) = {(, a, ε}
FOLLOW(A) = First symbol that, in some derivation of a sentence in the language, appears immediately after A.
    FOLLOW(S) = {), EOF}

[Figure: a parse tree rooted at S with a subtree rooted at C. The first symbol a of the string derived from C satisfies a ∈ FIRST(C); the symbol b that immediately follows that string satisfies b ∈ FOLLOW(C).]

FIRST and FOLLOW

Grammar:
    S → ε
    S → A S B
    A → a
    B → b

FIRST(X): First terminal in some α such that X ⇒* α.
FOLLOW(A): First terminal in some β such that S ⇒* αAβ.

    FIRST(S) = {a, ε}      FOLLOW(S) = {b, EOF}
    FIRST(A) = {a}         FOLLOW(A) = {a, b}
    FIRST(B) = {b}         FOLLOW(B) = {b, EOF}


Definition of FIRST

Grammar:
    S → ε
    S → A S B
    A → a
    B → b

FIRST(α) is the smallest set such that

    α =                              Property of FIRST(α)
    a, a terminal                    a ∈ FIRST(α)
    A, a nonterminal                 A → ε ∈ G  ⟹  ε ∈ FIRST(α)
                                     A → β ∈ G, β ≠ ε  ⟹  FIRST(β) ⊆ FIRST(α)
    X1 X2 ... Xk, a string of        FIRST(X1) − {ε} ⊆ FIRST(α)
    terminals and nonterminals       FIRST(Xi) − {ε} ⊆ FIRST(α) if ∀j < i, ε ∈ FIRST(Xj)
                                     ε ∈ FIRST(α) if ∀j ≤ k, ε ∈ FIRST(Xj)

For the grammar above:
    FIRST(A) ⊇ FIRST(a) ⊇ {a}
    FIRST(B) ⊇ FIRST(b) ⊇ {b}
    FIRST(S) ⊇ {ε}, and FIRST(S) ⊇ FIRST(A S B) ⊇ FIRST(A) ⊇ {a}
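"The smallest set such that" suggests a fixed-point computation: start with empty sets and keep applying the rules until nothing changes. The sketch below does this for the example grammar; the representation (a list of (lhs, rhs) pairs, with an empty rhs standing for ε) and the names first_of_string and compute_first are assumptions made for illustration.

```python
EPS = "ε"
GRAMMAR = [
    ("S", []),                 # S -> ε
    ("S", ["A", "S", "B"]),    # S -> A S B
    ("A", ["a"]),              # A -> a
    ("B", ["b"]),              # B -> b
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_of_string(symbols, first):
    """FIRST of a string X1 ... Xk, given the current FIRST sets of the nonterminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in NONTERMINALS else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result      # Xi cannot vanish, so later symbols are not visible
    result.add(EPS)            # every Xi can derive ε (or the string is empty)
    return result


def compute_first():
    """Iterate the defining rules until the least fixed point is reached."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first_of_string(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


if __name__ == "__main__":
    for nt, symbols in sorted(compute_first().items()):
        print(f"FIRST({nt}) = {sorted(symbols)}")
```

Because sets only grow and are bounded by the finite set of terminals plus ε, the loop terminates.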


Definition of FOLLOW

Grammar:
    S → ε
    S → A S B
    A → a
    B → b

FOLLOW(A) is the smallest set such that

    A                                Property of FOLLOW(A)
    A = S, the start symbol          EOF ∈ FOLLOW(S)   (Book notation: $ ∈ FOLLOW(S))
    B → αAβ ∈ G                      FIRST(β) − {ε} ⊆ FOLLOW(A)
    B → αA, or                       FOLLOW(B) ⊆ FOLLOW(A)
    B → αAβ with ε ∈ FIRST(β)

For the grammar above:
    FOLLOW(S) ⊇ {EOF}, and FOLLOW(S) ⊇ FIRST(B) ⊇ {b}
    FOLLOW(A) ⊇ FIRST(S B) ⊇ {a, b}
    FOLLOW(B) ⊇ FOLLOW(S) ⊇ {b, EOF}
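FOLLOW can be computed by the same kind of fixed-point iteration. To keep the sketch short, the FIRST sets are taken from the slides as constants rather than recomputed; the data layout and the names first_of_string and compute_follow are hypothetical.

```python
EPS, EOF = "ε", "EOF"
GRAMMAR = [
    ("S", []),                 # S -> ε
    ("S", ["A", "S", "B"]),    # S -> A S B
    ("A", ["a"]),              # A -> a
    ("B", ["b"]),              # B -> b
]
FIRST = {"S": {"a", EPS}, "A": {"a"}, "B": {"b"}}   # values from the slides
START = "S"


def first_of_string(symbols):
    """FIRST of a string of grammar symbols (a terminal is its own FIRST)."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow():
    """Apply the FOLLOW rules repeatedly until no set changes."""
    follow = {nt: set() for nt in FIRST}
    follow[START].add(EOF)                        # EOF follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in FIRST:              # only nonterminals have FOLLOW
                    continue
                beta_first = first_of_string(rhs[i + 1:])
                new = beta_first - {EPS}          # FIRST(β) − {ε} ⊆ FOLLOW(sym)
                if EPS in beta_first:             # β is empty or can vanish
                    new |= follow[lhs]            # FOLLOW(lhs) ⊆ FOLLOW(sym)
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


if __name__ == "__main__":
    for nt, symbols in sorted(compute_follow().items()):
        print(f"FOLLOW({nt}) = {sorted(symbols)}")
```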


Parsing Table
Grammar:
    S → ε
    S → A S B
    A → a
    B → b

Algorithm:
    parse_S() {
      switch (input_token) {
        case TOKEN_A:    /* S → A S B */
          parse_A(); parse_S(); parse_B();
          return;
        case TOKEN_B:
        case TOKEN_EOF:  /* S → ε */
          return;
        default:
          /* Parse Error */
      }
    }

Parsing Table:

    Nonterminal | Input Symbol
                | a           b        EOF
    S           | S → A S B   S → ε    S → ε


Table-driven Parsing

Grammar:
    S → ε
    S → A S B
    A → a
    B → b

Parsing Table:

    Nonterminal | Input Symbol
                | a           b        EOF
    S           | S → A S B   S → ε    S → ε
    A           | A → a
    B           |             B → b


Nonrecursive Parsing

Instead of recursion, use an explicit stack along with the parsing table.
Data objects:
    Parsing Table: M(A, a), a two-dimensional array, dimensions indexed by nonterminal symbols (A) and terminal symbols (a).
    A Stack of terminal/nonterminal symbols
    Input stream of tokens
The above data structures are manipulated using a table-driven parsing program.


Table-driven Parsing: a Sketch


stack initialized to hold the start symbol.
while (! stack.isEmpty()) {
  X = stack.top();
  if (X is a terminal symbol) {
    stack.pop();          /* a matched terminal leaves the stack */
    consume(X);
  }
  else  /* X is a nonterminal */
    if (M[X, input_token] = X → Y1 Y2 ... Yk) {
      stack.pop();
      for i = k downto 1 do
        stack.push(Yi);
    }
    else
      /* Syntax Error */
}
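The sketch above can be made runnable with a dictionary as the parsing table. The following is a minimal illustration for the grammar S → ε | A S B, A → a, B → b; the names TABLE and parse, the one-character tokens, and the choice to push EOF at the bottom of the stack (so that leftover input is rejected) are assumptions of this example.

```python
EOF = "EOF"
NONTERMINALS = {"S", "A", "B"}
# M[nonterminal, lookahead] -> right-hand side of the production to apply.
TABLE = {
    ("S", "a"): ["A", "S", "B"],
    ("S", "b"): [],                # S -> ε
    ("S", EOF): [],                # S -> ε
    ("A", "a"): ["a"],
    ("B", "b"): ["b"],
}


def parse(tokens, start="S"):
    """Return the productions applied, in order, or raise SyntaxError."""
    tokens = list(tokens) + [EOF]
    pos = 0
    stack = [EOF, start]           # the top of the stack is the end of the list
    applied = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                raise SyntaxError(f"no entry M[{top}, {look}]")
            applied.append((top, tuple(rhs)))
            stack.extend(reversed(rhs))    # push Yk ... Y1 so that Y1 is on top
        elif top == look:
            pos += 1                       # terminal (or EOF) matches: consume it
        else:
            raise SyntaxError(f"expected {top}, found {look}")
    return applied


if __name__ == "__main__":
    for lhs, rhs in parse("aabb"):
        print(lhs, "->", " ".join(rhs) or "ε")
```

The sequence of applied productions is a leftmost derivation of the input.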


Constructing Parsing Table

Grammar:
    S → ε
    S → A S B
    A → a
    B → b

    FIRST(S) = {a, ε}      FOLLOW(S) = {b, EOF}
    FIRST(A) = {a}         FOLLOW(A) = {a, b}
    FIRST(B) = {b}         FOLLOW(B) = {b, EOF}

Parsing Table:

    Nonterminal | Input Symbol
                | a           b        EOF
    S           | S → A S B   S → ε    S → ε
    A           | A → a
    B           |             B → b


A Procedure to Construct Parsing Tables


FIRST(X): First terminal in some α such that X ⇒* α.
FOLLOW(A): First terminal in some β such that S ⇒* αAβ.

Algorithm table_construct(G) {
  for each A → α ∈ G {
    for each a ∈ FIRST(α) such that a ≠ ε
      add A → α to M[A, a];
    if ε ∈ FIRST(α)
      for each b ∈ FOLLOW(A)
        add A → α to M[A, b];
  }
}
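The construction procedure translates almost line for line into code. In this sketch the FIRST and FOLLOW sets of the example grammar S → ε | A S B, A → a, B → b are taken from the slides as constants; the names build_table and first_of_string and the tuple encoding of right-hand sides are hypothetical.

```python
EPS, EOF = "ε", "EOF"
GRAMMAR = [
    ("S", ()),                 # S -> ε
    ("S", ("A", "S", "B")),    # S -> A S B
    ("A", ("a",)),             # A -> a
    ("B", ("b",)),             # B -> b
]
FIRST = {"S": {"a", EPS}, "A": {"a"}, "B": {"b"}}             # from the slides
FOLLOW = {"S": {"b", EOF}, "A": {"a", "b"}, "B": {"b", EOF}}  # from the slides


def first_of_string(symbols):
    """FIRST of a right-hand side (a terminal is its own FIRST)."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table():
    """Map (nonterminal, terminal) to the list of productions placed in that cell."""
    table = {}
    for lhs, rhs in GRAMMAR:
        rhs_first = first_of_string(rhs)
        lookaheads = rhs_first - {EPS}         # every a in FIRST(α), a != ε
        if EPS in rhs_first:
            lookaheads |= FOLLOW[lhs]          # every b in FOLLOW(A)
        for terminal in lookaheads:
            table.setdefault((lhs, terminal), []).append(rhs)
    return table


if __name__ == "__main__":
    table = build_table()
    for (nt, terminal), entries in sorted(table.items()):
        print(f"M[{nt}, {terminal}] = {entries}")
    print("conflict-free:", all(len(entries) == 1 for entries in table.values()))
```

Storing a list per cell makes conflicts visible: a cell with more than one production means the table cannot drive a predictive parser.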


Parsing Table: Another Example

Grammar:
    S → (S)S
    S → a
    S → ε

FIRST(S) = {(, a, ε}
FOLLOW(S) = {), EOF}

    Nonterminal | Input Symbol
                | a        (           )        EOF
    S           | S → a    S → (S)S    S → ε    S → ε

LL(1) Grammar: When the grammar's recursive descent parsing table has no conflicts (i.e. each cell has at most one entry).


Recursive Descent Parsing: Restrictions


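Grammar cannot be left-recursive.
Example: E → E + E | a

    parse_E() {
      switch (input_token) {
        case TOKEN_A:  /* Production 1 */
          parse_E(); consume(TOKEN_PLUS); parse_E();
          return;
        case TOKEN_A:  /* Production 2 */
          consume(TOKEN_A);
          return;
      }
    }

Both productions are selected by the same token, and Production 1 calls parse_E() again without consuming any input, so the procedure never terminates.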


Removing Left Recursion

A → A a
A → b

L(A) = {b, ba, baa, baaa, baaaa, ...}

A  → b A'
A' → ε
A' → a A'
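The transformation above follows a general pattern: A → A α | β becomes A → β A', A' → α A' | ε. A minimal sketch of that pattern for immediate left recursion only is given below; the function name remove_left_recursion and the list encoding (an empty list standing for ε) are assumptions of this example.

```python
def remove_left_recursion(nonterminal, productions):
    """Rewrite A -> A α | β into A -> β A', A' -> α A' | ε (immediate recursion only)."""
    new_nt = nonterminal + "'"
    # α parts of the left-recursive productions A -> A α.
    recursive = [rhs[1:] for rhs in productions if rhs and rhs[0] == nonterminal]
    # β parts: productions that do not start with A.
    others = [rhs for rhs in productions if not rhs or rhs[0] != nonterminal]
    if not recursive:
        return {nonterminal: productions}          # nothing to do
    return {
        nonterminal: [beta + [new_nt] for beta in others],
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],  # [] stands for ε
    }


if __name__ == "__main__":
    result = remove_left_recursion("A", [["A", "a"], ["b"]])
    for lhs, alternatives in result.items():
        for rhs in alternatives:
            print(lhs, "->", " ".join(rhs) or "ε")
```

For A → A a | b this prints A → b A', A' → a A' and A' → ε, the same grammar as above with the last two productions in the opposite order.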


Removing Left Recursion: Another Example

E → E + E
E → id

E  → id E'
E' → + id E'
E' → ε

Parsing with LL(1) Grammars: Example 1


Parsing Table:

    Nonterminal | Input Symbol
                | a           b        EOF
    S           | S → A S B   S → ε    S → ε
    A           | A → a
    B           |             B → b

    Stack       Input Stream   Rule         Derivation
    $S          a a b b $      S → A S B    S ⇒ A S B
    $B S A      a a b b $      A → a          ⇒ a S B
    $B S a      a a b b $
    $B S        a b b $        S → A S B      ⇒ a A S B B
    $B B S A    a b b $        A → a          ⇒ a a S B B
    $B B S a    a b b $
    $B B S      b b $          S → ε          ⇒ a a B B
    $B B        b b $          B → b          ⇒ a a b B
    $B b        b b $
    $B          b $            B → b          ⇒ a a b b
    $b          b $
    $           $

Parsing with LL(1) Grammars: Example 2


Parsing Table:

    Nonterminal | Input Symbol
                | id           +               EOF
    E           | E → id E'
    E'          |              E' → + id E'    E' → ε

    Stack       Input Stream   Rule            Derivation
    $E          id + id $      E → id E'       E ⇒ id E'
    $E' id      id + id $
    $E'         + id $         E' → + id E'      ⇒ id + id E'
    $E' id +    + id $
    $E' id      id $
    $E'         $              E' → ε            ⇒ id + id
    $           $

LL(1) Derivations

Left to Right Scan of input
Leftmost Derivation
(1) look ahead 1 token at each step

Alternative characterization of LL(1) Grammars: Whenever A → α | β ∈ G
    1. FIRST(α) ∩ FIRST(β) = { }, and
    2. if α ⇒* ε then FIRST(β) ∩ FOLLOW(A) = { }.

Corollary: No Ambiguous Grammar is LL(1).
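The two conditions can be checked directly once FIRST and FOLLOW are known. The sketch below tests one pair of alternatives A → α | β; the function name is_ll1_pair is hypothetical, the sets are passed in by the caller, and condition 2 is applied in both directions because either alternative may be the one that derives ε.

```python
EPS = "ε"


def is_ll1_pair(first_alpha, first_beta, follow_a):
    """Check the LL(1) conditions for one pair of alternatives A -> α | β."""
    if first_alpha & first_beta:
        return False          # condition 1: FIRST(α) and FIRST(β) overlap
    if EPS in first_alpha and first_beta & follow_a:
        return False          # condition 2: α can vanish and β clashes with FOLLOW(A)
    if EPS in first_beta and first_alpha & follow_a:
        return False          # condition 2 with the roles of α and β swapped
    return True


if __name__ == "__main__":
    # S -> A S B | ε with FIRST(A S B) = {a}, FOLLOW(S) = {b, EOF} (from the slides).
    print(is_ll1_pair({"a"}, {EPS}, {"b", "EOF"}))
    # Hypothetical sets in which the nullable alternative clashes with FOLLOW.
    print(is_ll1_pair({"+"}, {EPS}, {"+", "EOF"}))
```

A grammar passes only if every pair of alternatives of every nonterminal passes.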

Other Parsing Algorithms


LR-, LALR-, SLR, ...
    Table-driven bottom-up parsers (build parse trees from leaves to root).
    Parsing time is linear in the length of the input string.
    The set of LR(k) grammars includes all LL(k) grammars (but some grammars are not LR(k) for any k).
Operator precedence parsers
Chart parsers (used in Natural Language Processing)
Cocke-Kasami-Younger & Earley parsers: can handle arbitrary context-free grammars (but the parsers may take quadratic or cubic time).
