Lecture 7 - Context Free Grammars
Lecture 7:
Context-Free Grammars
Jos Uiterwijk
uiterwijk@maastrichtuniversity.nl
Contents
3
Rewrite Systems and Grammars
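A rewrite system (or production system or rule-based
system) is a list of rules together with an algorithm for applying
them. Each rule has a left-hand side and a right-hand side.
Example rules:
S → aSb
aS → ε
aSb → bSabSa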
4
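Simple-rewrite
simple-rewrite(R: rewrite system, w: initial string) =
1. Set working-string to w.
2. Until told by R to halt do:
   Match the left-hand side of some rule against some part of working-string.
   Replace the matched part of working-string with the right-hand side of that rule.
3. Return working-string.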
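The procedure above can be sketched in Python. This is a minimal illustration, not the lecture's own code: the function name `simple_rewrite`, the `max_steps` cutoff, and the matching policy (first rule in list order, leftmost occurrence) are all assumptions, since a rewrite system leaves the choice of rule and match position open. The empty string stands for ε.

```python
def simple_rewrite(rules, w, max_steps=100):
    """Rewrite w with rules until no rule matches or max_steps is reached.

    rules: list of (lhs, rhs) string pairs; "" stands for ε.
    Policy (an assumption): try rules in list order and rewrite the
    leftmost occurrence of the first left-hand side that matches.
    """
    working = w
    for _ in range(max_steps):
        for lhs, rhs in rules:
            pos = working.find(lhs)
            if pos != -1:
                # Replace the matched part with the rule's right-hand side.
                working = working[:pos] + rhs + working[pos + len(lhs):]
                break
        else:
            # No rule matched anywhere: halt.
            return working
    return working  # step budget exhausted


# Rules aS → ε and S → aSb applied to SaS: SaS, S, aSb, b.
print(simple_rewrite([("aS", ""), ("S", "aSb")], "SaS"))
```

A different rule order or match position can produce a different result, or no termination at all, which is why the cutoff is there.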
5
A Rewrite System Formalism
6
An Example
w = SaS
Rules:
[1] S → aSb
[2] aS → ε
● When to quit?
7
Rule-Based Systems
Examples of rule-based systems
● Expert systems
● Cognitive modeling
8
Grammars Define Languages
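A grammar is a set of rules that are stated in terms of
two alphabets:
● a terminal alphabet, Σ, whose symbols make up the strings in L(G), and
● a nonterminal alphabet, whose symbols serve as working symbols while the grammar operates.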
9
Using a Grammar to Derive a String
Simple-rewrite (G, S) will generate the strings in L(G).
S ⇒ aSb ⇒ aaSbb ⇒ …
10
Generating Many Strings
• Multiple rules may match.
11
Generating Many Strings
• One rule may match in more than one way.
12
When to Stop
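May stop when:
1. The working string no longer contains any nonterminal symbols (including the case in which it is ε).
2. There are nonterminal symbols in the working string, but none of them appears on the left-hand side of any rule in the grammar.
Example:
Rules: S → aSb, S → bTa, and S → ε
S ⇒ aSb ⇒ aaSbb ⇒ aabb stops by (1).
S ⇒ aSb ⇒ abTab stops by (2), since no rule has T on its left-hand side.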
14
When to Stop
It is possible that neither (1) nor (2) is achieved.
Example:
15
Context-free Grammars, Languages, and PDAs
[Diagram: a context-free grammar defines a context-free language; a PDA accepts a context-free language.]
16
More Powerful Grammars
Regular grammars must always produce strings one character at
a time, moving left to right.
18
Example 1: AⁿBⁿ
S → ε
S → aSb
19
Example 2: Balanced Parentheses
S → ε
S → SS
S → (S)
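One way to check that these three rules generate exactly the balanced strings is to compute, for a small length bound, everything the rules can produce and compare it with a direct counter-based test. The sketch below does this by iterating the rules to a fixed point; the function names and the bound of 6 are illustrative assumptions.

```python
from itertools import product


def balanced_by_grammar(n):
    """All strings of length <= n generated by S → ε | SS | (S)."""
    lang = {""}                                    # S → ε
    while True:
        new = set(lang)
        # S → SS: concatenate two strings already known to be derivable.
        new |= {x + y for x in lang for y in lang if len(x) + len(y) <= n}
        # S → (S): wrap a derivable string in one pair of parentheses.
        new |= {"(" + x + ")" for x in lang if len(x) + 2 <= n}
        if new == lang:                            # nothing new: fixed point
            return lang
        lang = new


def is_balanced(s):
    """Direct check: depth never goes negative and ends at zero."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


all_strings = {"".join(p) for k in range(7) for p in product("()", repeat=k)}
print(balanced_by_grammar(6) == {s for s in all_strings if is_balanced(s)})
```

This only tests agreement up to the chosen length; it is a sanity check, not a proof.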
20
Definition: Context-Free Grammars
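A context-free grammar G is a quadruple,
(V, Σ, R, S), where:
● V is the rule alphabet, which contains nonterminals and terminals,
● Σ (the set of terminals) is a subset of V,
● R (the set of rules) is a finite subset of (V − Σ) × V*,
● S (the start symbol) is an element of V − Σ.
Example:
({S, a, b}, {a, b}, {S → aSb, S → ε}, S)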
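The quadruple maps directly onto a small data structure. The sketch below is one possible encoding, assuming every symbol is a single character and ε is written as the empty string; the names `CFG` and `is_well_formed` are illustrative.

```python
from typing import NamedTuple


class CFG(NamedTuple):
    V: frozenset       # rule alphabet: nonterminals and terminals
    sigma: frozenset   # terminals, a subset of V
    R: frozenset       # rules as (lhs, rhs) pairs; rhs is a string, "" for ε
    S: str             # start symbol


def is_well_formed(g: CFG) -> bool:
    """Check the side conditions of the definition."""
    nonterminals = g.V - g.sigma
    return (
        g.sigma <= g.V                       # Σ is a subset of V
        and g.S in nonterminals              # S is an element of V − Σ
        and all(
            lhs in nonterminals and all(sym in g.V for sym in rhs)
            for lhs, rhs in g.R              # each rule is in (V − Σ) × V*
        )
    )


example = CFG(frozenset("Sab"), frozenset("ab"),
              frozenset({("S", "aSb"), ("S", "")}), "S")
print(is_well_formed(example))
```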
21
Derivations
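We define the derives-in-one-step relation ⇒G as:
x ⇒G y iff x = αAβ,
A → γ is in R, and
y = αγβ.
w0 ⇒G w1 ⇒G w2 ⇒G . . . ⇒G wn for some n ≥ 0 is a
derivation in G. We write x ⇒G* y when such a derivation leads from x to y.
Example (with the rules S → aSb and S → ε):
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb, so S ⇒* aaabbb.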
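The one-step relation and its closure can be tried out directly. In the sketch below, `one_step` returns every string reachable in one step, and `derives` searches breadth-first for a derivation. The names, the single-character symbol encoding, and the `max_len` cutoff are assumptions; the cutoff keeps the search finite, so a `False` answer is only reliable when no derivation of the target needs a longer intermediate string.

```python
from collections import deque

RULES = {"S": ["aSb", ""]}   # S → aSb | ε


def one_step(rules, x):
    """Return every y with x ⇒ y: rewrite one nonterminal occurrence."""
    return {x[:i] + rhs + x[i + 1:]
            for i, sym in enumerate(x)
            for rhs in rules.get(sym, ())}


def derives(rules, start, target, max_len=None):
    """Breadth-first search for start ⇒* target."""
    if max_len is None:
        max_len = 2 * len(target) + 2     # heuristic bound (an assumption)
    seen = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        if x == target:
            return True
        for y in one_step(rules, x):
            if len(y) <= max_len and y not in seen:
                seen.add(y)
                queue.append(y)
    return False


print(derives(RULES, "S", "aaabbb"))
```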
23
Definition of a Context-Free
Grammar
A language L is context-free iff it is generated by some
context-free grammar G.
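To see which strings a given grammar generates, one can compute the part of its language up to a length bound. The sketch below treats the rules as equations over sets of strings and iterates to a least fixed point. The function name `language_up_to` and the encoding (one character per symbol, nonterminals are the dictionary keys, "" for ε) are assumptions made for illustration.

```python
def language_up_to(rules, start, n):
    """Terminal strings of length <= n derivable from start."""
    lang = {x: set() for x in rules}
    changed = True
    while changed:
        changed = False
        for x, alternatives in rules.items():
            for rhs in alternatives:
                # Build every string the right-hand side can yield so far.
                partial = {""}
                for sym in rhs:
                    choices = lang[sym] if sym in rules else {sym}
                    partial = {p + c for p in partial for c in choices
                               if len(p) + len(c) <= n}
                if not partial <= lang[x]:
                    lang[x] |= partial
                    changed = True
    return lang[start]


anbn = {"S": ["aSb", ""]}            # S → aSb | ε
print(sorted(language_up_to(anbn, "S", 6), key=len))
```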
24
Recursive Grammar Rules
• Examples:
S → (S)
S → (T)
T → (S)
25
Self-Embedding Grammar Rules
• A rule in a grammar G is self-embedding iff it has the form
X → w1Yw2, where Y ⇒* w3Xw4 and
both w1w3 and w4w2 are in Σ+.
• A grammar is self-embedding iff it contains at least one
self-embedding rule.
• Examples:
S → aSa is self-embedding
S → aT
T → Sa is self-embedding
26
Where Context-Free Grammars
Get Their Power
• If a context-free grammar G is not self-embedding
then L(G) is regular.
27
PalEven = {wwᴿ : w ∈ {a, b}*}
R = { S → aSa
S → bSb
S → ε }.
28
Equal Numbers of a’s and b’s
R = { S → aSb
S → bSa
S → SS
S → ε }.
29
BNF
A notation for writing practical context-free grammars
• The symbol | should be read as “or”.
Examples of nonterminals:
<program>
<variable>
30
BNF for a Java Fragment
Etcetera
31
Spam Generation
Source: How Many Ways Can You Spell V1@gra? by Brian Hayes,
American Scientist, July-August 2007
http://www.americanscientist.org/issues/pub/2007/7/how-many-ways-can-you-spell-v1gra
32
HTML
<ul>
<li>Item 1, which will include a sublist</li>
<ul>
<li>First item in sublist</li>
<li>Second item in sublist</li>
</ul>
<li>Item 2</li>
</ul>
A grammar:
/* Text is a sequence of elements.
HTMLtext → Element HTMLtext | ε
Element → UL | LI | … (and other kinds of elements that
are allowed in the body of an HTML document)
/* The <ul> and </ul> tags must match.
UL → <ul> HTMLtext </ul>
/* The <li> and </li> tags must match.
LI → <li> HTMLtext </li>
33
English
S → NP VP
NP → the Nominal | a Nominal | Nominal |
ProperNoun | NP PP
Nominal → N | Adjs N
N → cat | dogs | bear | girl | chocolate | rifle
ProperNoun → Chris | Fluffy
Adjs → Adj Adjs | Adj
Adj → young | older | smart
VP → V | V NP | VP PP
V → like | likes | thinks | shots | smells
PP → Prep NP
Prep → with
34
Designing Context-Free Grammars
AⁿBⁿ
A → BC
A → aAb
35
Outside-In Structure and RNA Folding
36
Concatenating Independent
Languages
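Let L = {aⁿbⁿcᵐ : n, m ≥ 0}. The aⁿbⁿ region and the cᵐ region are independent, so generate each with its own nonterminal and concatenate them:
R = { S → NC
N → aNb
N → ε
C → cC
C → ε }.
By contrast, the rules
R = { S → MS
S → ε
M → aMb
M → ε }
generate zero or more concatenated blocks, each of the form aⁿbⁿ.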
38
Another Example: Unequal a’s and b’s
L = {aⁿbᵐ : n ≠ m}
R =
{ S → A | B
A → aA | aAb | a /* More a's than b's.
B → Bb | aBb | b /* More b's than a's.
}
40
Unproductive Nonterminals
removeunproductive(G: CFG) =
1. G′ = G.
2. Mark every nonterminal symbol in G′ as unproductive.
3. Mark every terminal symbol in G′ as productive.
4. Until one entire pass has been made without any new
   symbol being marked do:
      For each rule X → α in R do:
         If every symbol in α has been marked as
         productive and X has not yet been marked as
         productive then:
            Mark X as productive.
5. Remove from G′ every unproductive symbol.
6. Remove from G′ every rule that contains an
   unproductive symbol.
7. Return G′.
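The marking procedure translates almost line for line into Python. The sketch below assumes rules are (left-hand side, right-hand side) pairs with one character per symbol and "" for ε; the function name and the example grammar constants are illustrative, with the grammar taken from the simplification example in these slides.

```python
NONTERMINALS = set("SABCD")
TERMINALS = set("ab")
RULES = [("S", "AB"), ("S", "AC"), ("A", "aAb"), ("A", ""),
         ("B", "aA"), ("C", "bCa"), ("D", "AB")]


def remove_unproductive(nonterminals, terminals, rules):
    """Return (productive nonterminals, rules without unproductive symbols)."""
    productive = set(terminals)            # step 3: terminals are productive
    changed = True
    while changed:                         # step 4: repeat until a quiet pass
        changed = False
        for lhs, rhs in rules:
            # An empty right-hand side (ε) trivially satisfies the test.
            if lhs not in productive and all(s in productive for s in rhs):
                productive.add(lhs)
                changed = True
    kept_nonterminals = {x for x in nonterminals if x in productive}
    kept_rules = [(lhs, rhs) for lhs, rhs in rules
                  if lhs in productive and all(s in productive for s in rhs)]
    return kept_nonterminals, kept_rules


nts, rules = remove_unproductive(NONTERMINALS, TERMINALS, RULES)
print(sorted(nts), rules)
```

On this grammar C is never marked, because its only rule, C → bCa, mentions C itself, so C and the rule S → AC are dropped.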
41
Simplifying Context-Free Grammars
G = ({S, A, B, C, D, a, b}, {a, b}, R, S), where
R =
{ S → AB | AC
A → aAb | ε
B → aA
C → bCa
D → AB }
42
Unreachable Nonterminals
removeunreachable(G: CFG) =
1. G′ = G.
2. Mark S as reachable.
3. Mark every other nonterminal symbol as unreachable.
4. Until one entire pass has been made without any new
   symbol being marked do:
      For each rule X → αAβ (where A ∈ V − Σ) in R do:
         If X has been marked as reachable and A has not then:
            Mark A as reachable.
5. Remove from G′ every unreachable symbol.
6. Remove from G′ every rule with an unreachable symbol on
   the left-hand side.
7. Return G′.
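A Python sketch of the same marking loop follows, under the same assumed encoding as before: rules are (left-hand side, right-hand side) pairs with one character per symbol and "" for ε. The function name and the constants are illustrative; the grammar is the one from the simplification example in these slides.

```python
NONTERMINALS = set("SABCD")
RULES = [("S", "AB"), ("S", "AC"), ("A", "aAb"), ("A", ""),
         ("B", "aA"), ("C", "bCa"), ("D", "AB")]


def remove_unreachable(nonterminals, start, rules):
    """Return (reachable nonterminals, rules whose lhs is reachable)."""
    reachable = {start}                    # step 2: S is reachable
    changed = True
    while changed:                         # step 4: repeat until a quiet pass
        changed = False
        for lhs, rhs in rules:
            if lhs in reachable:
                for sym in rhs:
                    if sym in nonterminals and sym not in reachable:
                        reachable.add(sym)
                        changed = True
    kept_rules = [(lhs, rhs) for lhs, rhs in rules if lhs in reachable]
    return reachable, kept_rules


reachable, kept = remove_unreachable(NONTERMINALS, "S", RULES)
print(sorted(reachable), kept)
```

Here D is dropped because no rule reachable from S mentions it. Applied to the original grammar, C is still reachable through S → AC; it only disappears if unproductive symbols are removed first.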
44
Structure
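Context-free languages:
We care about the structure of the strings derived.
[Figure: parse tree of the expression 3 + 5 * 7. The root E has children E + E; the left E yields id (3); the right E expands to E * E, whose operands yield id (5) and id (7).]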
45
Derivations
To capture structure, we must capture the path we took
through the grammar. Derivations do that.
Example:
S → ε
S → SS
S → (S)
Two derivations of (())(), which apply the same six steps in different orders:
Steps 1 2 3 4 5 6:
S ⇒ SS ⇒ (S)S ⇒ ((S))S ⇒ (())S ⇒ (())(S) ⇒ (())()
Steps 1 2 3 5 4 6:
S ⇒ SS ⇒ (S)S ⇒ ((S))S ⇒ ((S))(S) ⇒ (())(S) ⇒ (())()
[Figure: parse tree for (())(). The root S has two S children. The left child expands to (S), whose inner S expands to (S) and then to ε. The right child expands to (S), whose inner S expands to ε.]
47
Parse Trees
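A parse tree, derived by a grammar G = (V, Σ, R, S), is
a rooted, ordered tree in which:
● Every leaf node is labeled with an element of Σ ∪ {ε},
● The root node is labeled S,
● Every other node is labeled with some element of V − Σ, and
● If m is a nonleaf node labeled X and the children of m
are labeled x1, x2, …, xn, then R contains the rule
X → x1x2…xn.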
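The rule condition on nonleaf nodes is easy to check mechanically. The sketch below uses an assumed encoding: an interior node is a pair (label, list of children), a leaf is a terminal written as a plain string, and a node with no children stands for an application of X → ε. The names `tree_yield` and `is_valid` are illustrative; the grammar is the balanced-parentheses grammar used earlier.

```python
# Rules of S → ε | SS | (S), stored as (lhs, tuple of rhs symbols).
RULES = {("S", ()), ("S", ("S", "S")), ("S", ("(", "S", ")"))}


def tree_yield(tree):
    """Concatenate the leaves from left to right."""
    if isinstance(tree, str):
        return tree
    _, children = tree
    return "".join(tree_yield(child) for child in children)


def is_valid(tree, rules):
    """Check that every interior node matches some rule X → x1x2…xn."""
    if isinstance(tree, str):        # leaves are assumed to be terminals
        return True
    label, children = tree
    child_labels = tuple(c if isinstance(c, str) else c[0] for c in children)
    return ((label, child_labels) in rules
            and all(is_valid(child, rules) for child in children))


eps = ("S", [])                                   # S → ε
left = ("S", ["(", ("S", ["(", eps, ")"]), ")"])  # yields (())
right = ("S", ["(", eps, ")"])                    # yields ()
tree = ("S", [left, right])
print(tree_yield(tree), is_valid(tree, RULES))
```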
48
Structure in English
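[Figure: parse tree with root S and children NP and VP. The subject NP contains a Nominal that expands to Adjs N, with Adjs expanding to Adj. The VP expands to V NP, whose NP contains a Nominal that expands to N.]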
50
Algorithms Care How We Search
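[Figure: parse tree with root S and two S children. The left child expands to (S), whose inner S expands to (S). The right child expands to (S), whose inner S expands to ε.]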
52
Ambiguity
A grammar G is ambiguous iff there is at least one string
in L(G) for which G produces more than one parse tree.
53
An Arithmetic Expression Grammar
E → E + E
E → E * E
E → (E)
E → id
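Counting parse trees makes the ambiguity of this grammar concrete. The sketch below counts, for a token list, how many distinct parse trees the four rules allow, by trying every rule on every span and memoizing the results. The function name and the token encoding (a list containing "id", "+", "*", "(" and ")") are assumptions.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Number of parse trees for tokens under E → E+E | E*E | (E) | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of trees deriving tokens[i:j] from E.
        if j - i < 1:
            return 0
        total = 0
        if j - i == 1 and tokens[i] == "id":                  # E → id
            total += 1
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                      # E → (E)
        for k in range(i + 1, j - 1):                         # E → E op E
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("id + id * id".split()))
```

For id + id * id the count is 2: one tree groups the multiplication first, the other groups the addition first.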
54
Even a Very Simple Grammar Can be
Highly Ambiguous
S → ε
S → SS
S → (S)
55
Derivation is Not Necessarily Unique
This is True for Regular Languages Too
Regular Expression:
(a ∪ b)*a(a ∪ b)*
Two ways to generate aaa:
choose a from (a ∪ b)*, then
choose a from (a ∪ b)*, then
choose a, then
choose ε from (a ∪ b)*.
or
choose ε from (a ∪ b)*, then
choose a, then
choose a from (a ∪ b)*, then
choose a from (a ∪ b)*.
Regular Grammar:
S → a
S → bS
S → aS
S → aT
T → a
T → b
T → aT
T → bT
56
Inherent Ambiguity
Sometimes we can avoid ambiguity, but …
Example:
57
Inherent Ambiguity
L = {aⁿbⁿcᵐ : n, m ≥ 0} ∪ {aⁿbᵐcᵐ : n, m ≥ 0}.
S → S₁ | S₂
S₁ → S₁c | A /* Generate all strings in {aⁿbⁿcᵐ}.
A → aAb | ε
S₂ → aS₂ | B /* Generate all strings in {aⁿbᵐcᵐ}.
B → bBc | ε
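In this grammar, any string that belongs to both halves of the union can be derived through S₁ and through S₂, which gives two parse trees that differ at the root. The sketch below identifies those strings by counting letters rather than by parsing; the function names are illustrative assumptions.

```python
import re


def counts(w):
    """Return (#a, #b, #c) if w has the shape a*b*c*, else None."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    return tuple(len(g) for g in m.groups()) if m else None


def via_s1(w):
    """True if w is in {a^n b^n c^m}, the language generated from S1."""
    c = counts(w)
    return c is not None and c[0] == c[1]


def via_s2(w):
    """True if w is in {a^n b^m c^m}, the language generated from S2."""
    c = counts(w)
    return c is not None and c[1] == c[2]


def has_two_parses(w):
    """True if w is derivable both through S1 and through S2."""
    return via_s1(w) and via_s2(w)


print(has_two_parses("aabbcc"), has_two_parses("aabbc"))
```

The strings flagged this way are exactly those of the form aⁿbⁿcⁿ.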
59