Unit 3 CFG
Prepared By
Archana Kadam
Assistant Professor
Unit-3 Context-Free Grammar
Introduction, Regular Grammar, Context Free Grammar- Definition,
Derivation. Sentential form, parse tree, Ambiguous Grammar,
Simplification of CFG: Eliminating unit productions, useless production,
useless symbols, Greibach normal form, Chomsky normal form. Types of
Grammar: Chomsky Hierarchy, Context Free Language (CFL): Closure
properties of CFL.
Case Study- CFG for Parenthesis Match- XML and Document Type
Definitions, Natural Language Processing- Text Parsing
Definition
■ A context-free grammar (CFG) G is a
quadruple (V, T, P, S) where
■ V: a set of non-terminal symbols
■ T: a set of terminals (V ∩ T = ∅)
■ P: a set of rules (P: V → (V ∪ T)*)
■ S: a start symbol.
Example
■ V = {q, f}
■ T = {0, 1}
■ P = {q → 11q, q → 00f,
f → 11f, f → ε }
■ S=q
■ Compactly: P = {q → 11q | 00f, f → 11f | ε}
How do we use rules?
■ If A → B is a rule, then xAy ⇒ xBy, and we say that
xAy derives xBy.
[Figure: the regular languages drawn inside the context-free languages (CFL)]
Context Free Grammar
■ Construct the CFG for L = { a^n b^n | n ≥ 1 }
■ L = {ab, aabb, aaabbb, …}
P : S -> aSb | ab
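The grammar S -> aSb | ab can be read directly as a recursive membership check and as a generator. The following is a minimal sketch under the assumption that every symbol is a single character; the function names `in_anbn` and `generate_anbn` are illustrative and not part of the slides.

```python
def in_anbn(s: str) -> bool:
    """Return True if s can be derived from S -> aSb | ab."""
    # Base production: S -> ab
    if s == "ab":
        return True
    # Recursive production: S -> aSb, so strip the outer a...b and recurse
    if len(s) >= 4 and s[0] == "a" and s[-1] == "b":
        return in_anbn(s[1:-1])
    return False


def generate_anbn(n: int) -> str:
    """Apply S -> aSb (n - 1) times, then S -> ab once."""
    if n < 1:
        raise ValueError("n must be at least 1")
    form = "S"
    for _ in range(n - 1):
        form = form.replace("S", "aSb")
    return form.replace("S", "ab")


if __name__ == "__main__":
    print(generate_anbn(3))   # aaabbb
    print(in_anbn("aabb"))    # True
    print(in_anbn("abab"))    # False
```

Each recursive call mirrors one use of S -> aSb, so the depth of the recursion equals the number of derivation steps.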
Construct the CFG for L of palindrome
strings over {a, b}. S -> aSa | bSb | a | b | ε
Derivations
■ There are three ways to derive a string
from a grammar
1. Leftmost derivation: always expand the leftmost
non-terminal first
Derivations Examples
■ Consider the following CFG, G = ({S, A}, {a, b}, P, S)
■ where P: S -> aAS | a
■ A -> SbA | SS | ba
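A leftmost derivation in this grammar can be replayed step by step. The sketch below assumes single-character symbols and treats the dictionary keys as the non-terminals; the function name `leftmost_derivation` and the target string aabbaa are illustrative choices, not taken from the slides.

```python
def leftmost_derivation(productions, start, choices):
    """Replay a leftmost derivation and return every sentential form.

    choices is a list of (head, body) pairs; each pair must rewrite the
    leftmost non-terminal of the current sentential form.
    """
    form = start
    steps = [form]
    for head, body in choices:
        if body not in productions.get(head, []):
            raise ValueError(f"{head} -> {body} is not a production")
        # Index of the leftmost non-terminal in the current form
        idx = next((i for i, sym in enumerate(form) if sym in productions), None)
        if idx is None or form[idx] != head:
            raise ValueError(f"{head} is not the leftmost non-terminal of {form}")
        form = form[:idx] + body + form[idx + 1:]
        steps.append(form)
    return steps


P = {"S": ["aAS", "a"], "A": ["SbA", "SS", "ba"]}
choices = [("S", "aAS"), ("A", "SbA"), ("S", "a"), ("A", "ba"), ("S", "a")]

if __name__ == "__main__":
    print(" => ".join(leftmost_derivation(P, "S", choices)))
    # S => aAS => aSbAS => aabAS => aabbaS => aabbaa
```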
Ambiguous Context Free Grammar
■ A CFG for a language is said to be
ambiguous if there exists at least one
string that can be generated in more
than one way.
■ That is, the string has more than one leftmost
derivation, more than one rightmost derivation, and
more than one derivation tree.
Ambiguous Context Free Grammar
■ Example :
■ E -> E + E | E * E | id
Ambiguous Context Free Grammar
S -> iCtS | iCtSeS | a
C -> b
Consider the string ‘ibtibtaea’.
Since this string has two derivation trees, the grammar is ambiguous.
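The two parses can be written out as two different leftmost derivations of the same string. This is a worked sketch that uses only the productions S -> iCtS | iCtSeS | a and C -> b; the labels (1) and (2) are added here for reference. In (1) the e attaches to the inner i, in (2) it attaches to the outer i.

```latex
\begin{align*}
\text{(1)}\quad S &\Rightarrow iCtS \Rightarrow ibtS \Rightarrow ibtiCtSeS
   \Rightarrow ibtibtSeS \Rightarrow ibtibtaeS \Rightarrow ibtibtaea \\
\text{(2)}\quad S &\Rightarrow iCtSeS \Rightarrow ibtSeS \Rightarrow ibtiCtSeS
   \Rightarrow ibtibtSeS \Rightarrow ibtibtaeS \Rightarrow ibtibtaea
\end{align*}
```

The sentential forms coincide from the third step on, but the sequences of productions differ (the first step uses S -> iCtS in one case and S -> iCtSeS in the other), so the derivation trees are different.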
Removal of Ambiguity
Example :
E -> E + E | E * E | id
Move higher-precedence operators further away from the start symbol
Use other non-terminals (one per precedence level) to take care of it
E -> E + T | T
T -> T * F | F
F -> id
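One way to check the effect of this rewrite is to count derivation trees for id + id * id under both grammars. The sketch below is a simple span-based counter that assumes the grammar has no ε-productions and no cycles of unit productions; the function name `count_parse_trees` and the dictionary encoding are illustrative.

```python
from functools import lru_cache


def count_parse_trees(grammar, start, tokens):
    """Count derivation trees of `tokens` from `start`.

    grammar maps each non-terminal to a list of bodies (tuples of symbols).
    Assumes no epsilon-productions and no unit-production cycles.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_symbol(sym, i, j):
        if sym not in grammar:  # terminal: must match exactly one token
            return 1 if j - i == 1 and tokens[i] == sym else 0
        return sum(count_body(body, i, j) for body in grammar[sym])

    @lru_cache(maxsize=None)
    def count_body(body, i, j):
        if len(body) == 1:
            return count_symbol(body[0], i, j)
        first, rest = body[0], body[1:]
        total = 0
        # first covers tokens[i:k]; keep at least one token per remaining symbol
        for k in range(i + 1, j - len(rest) + 1):
            left = count_symbol(first, i, k)
            if left:
                total += left * count_body(rest, k, j)
        return total

    return count_symbol(start, 0, len(tokens))


AMBIGUOUS = {"E": [("E", "+", "E"), ("E", "*", "E"), ("id",)]}
UNAMBIGUOUS = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("id",)],
}
TOKENS = ["id", "+", "id", "*", "id"]

if __name__ == "__main__":
    print(count_parse_trees(AMBIGUOUS, "E", TOKENS))    # 2
    print(count_parse_trees(UNAMBIGUOUS, "E", TOKENS))  # 1
```

A count above 1 for some string shows ambiguity; a count of 1 for one string does not by itself prove that a grammar is unambiguous.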
CFG Simplification
Three ways to simplify/clean a CFG
1. Eliminate useless symbols (clean)
2. Eliminate ε-productions, A 🡺 ε (simplify)
3. Eliminate unit productions, A 🡺 B (simplify)
Eliminating useless symbols
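A symbol X is reachable if there exists:
■ S 🡺* α X β
A symbol X is generating if there exists:
■ X 🡺* w, for some string of terminals w
A symbol is useful only if it is both reachable and generating.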
Algorithm to detect useless
symbols
1. First, eliminate all symbols that are not
generating
2. Next, eliminate all symbols that are not
reachable
Example: Useless symbols
■ S🡺AB | a
■ A🡺 b
1. A, S are generating
2. B is not generating (and therefore B is useless)
3. ==> Eliminating B… (i.e., remove all productions that involve B)
   S🡺 a
   A🡺b
4. Now, A is not reachable and therefore is useless
5. Simplified G: S 🡺 a
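The two passes used in this example (generating first, then reachable) can be sketched as follows. The code assumes single-character symbols and bodies written as strings; the function names are illustrative.

```python
def generating_symbols(productions, terminals):
    """Symbols that derive some string of terminals (fixed-point iteration)."""
    generating = set(terminals)
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in generating:
                continue
            if any(all(sym in generating for sym in body) for body in bodies):
                generating.add(head)
                changed = True
    return generating


def reachable_symbols(productions, start):
    """Symbols that appear in some sentential form derived from start."""
    reachable = {start}
    stack = [start]
    while stack:
        head = stack.pop()
        for body in productions.get(head, []):
            for sym in body:
                if sym not in reachable:
                    reachable.add(sym)
                    stack.append(sym)
    return reachable


def remove_useless(productions, terminals, start):
    # Pass 1: drop non-generating symbols and every production that uses them
    gen = generating_symbols(productions, terminals)
    step1 = {
        head: [b for b in bodies if all(sym in gen for sym in b)]
        for head, bodies in productions.items()
        if head in gen
    }
    # Pass 2: drop symbols that are no longer reachable from the start symbol
    reach = reachable_symbols(step1, start)
    return {head: bodies for head, bodies in step1.items() if head in reach}


if __name__ == "__main__":
    G = {"S": ["AB", "a"], "A": ["b"]}  # B has no productions
    print(remove_useless(G, {"a", "b"}, "S"))  # {'S': ['a']}
```

Running the reachability pass first would keep A, because S -> AB still makes A reachable at that point, which is why the generating pass comes first.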
Eliminating ε-productions (A🡺ε)
What’s the point of removing ε-productions?
Algorithm to detect all nullable
variables
■ Basis:
■ If A🡺 ε is a production in G, then A is
nullable
(note: A can still have other productions)
■ Induction:
■ If there is a production B🡺 C1C2…Ck,
where every Ci is nullable, then B is also
nullable
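The basis and induction steps amount to a fixed-point computation. A minimal sketch, assuming single-character symbols and the empty string "" standing for an ε body; the function name `nullable_variables` is illustrative.

```python
def nullable_variables(productions):
    """Return the set of variables A with A =>* epsilon."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in nullable:
                continue
            for body in bodies:
                # Basis: body == "" (all() over nothing is True)
                # Induction: every symbol of the body is already nullable
                if all(sym in nullable for sym in body):
                    nullable.add(head)
                    changed = True
                    break
    return nullable


if __name__ == "__main__":
    G = {"S": ["AB"], "A": ["aAA", ""], "B": ["bBB", ""]}
    print(sorted(nullable_variables(G)))  # ['A', 'B', 'S']
```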
Eliminating ε-productions
Given: G=(V,T,P,S)
Algorithm:
1. Detect all nullable variables in G
2. Then construct G1=(V,T,P1,S) as follows:
i. For each production of the form: A🡺X1X2…Xk, where
k≥1, suppose m out of the k Xi’s are nullable symbols
ii. Then G1 will have 2^m versions of this production,
i.e., all combinations where each nullable Xi is either present or absent
(if m = k, leave out the version with every Xi absent, since that would be A🡺ε)
iii. Alternatively, if a production is of the form: A🡺ε, then
remove it
Example: Eliminating
ε-productions
■ Let L be the language represented by the following CFG G:
i. S🡺AB
ii. A🡺aAA | ε
iii. B🡺bBB | ε
Goal: To construct G1, which is the grammar for L-{ε}
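A sketch of the construction for this example, assuming single-character symbols and "" for an ε body. The helper names are illustrative; the nullable computation is repeated here so that the block is self-contained.

```python
from itertools import product


def nullable_variables(productions):
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in nullable and any(
                all(sym in nullable for sym in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_epsilon(productions):
    """Build the productions of G1, which generates L(G) - {epsilon}."""
    nullable = nullable_variables(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = []
        for body in bodies:
            # A nullable symbol may be kept or dropped; others must be kept
            options = [(sym, "") if sym in nullable else (sym,) for sym in body]
            for combo in product(*options):
                candidate = "".join(combo)
                # Skip the all-absent version (it would be an epsilon body)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


if __name__ == "__main__":
    G = {"S": ["AB"], "A": ["aAA", ""], "B": ["bBB", ""]}
    for head, bodies in eliminate_epsilon(G).items():
        print(head, "->", " | ".join(bodies))
    # S -> AB | A | B
    # A -> aAA | aA | a
    # B -> bBB | bB | b
```

Note that the result contains the unit productions S -> A and S -> B, which did not exist in G.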
Eliminating unit productions
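■ Algorithm:
Step 1) eliminate ε-productions
Step 2) eliminate unit productions
Step 3) eliminate useless symbols
Again, the order is important! Why? Removing ε-productions can create
unit productions, and removing unit productions can leave symbols useless.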
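Step 2 can be sketched as follows: for each variable, collect the variables reachable through unit productions only, then copy their non-unit bodies. The example grammar is the expression grammar E -> E + T | T, T -> T * F | F, F -> id used earlier in the unit; bodies are tuples of symbols and the function name is illustrative.

```python
def eliminate_unit_productions(productions):
    """Replace unit productions A -> B by the non-unit bodies of B."""
    variables = set(productions)

    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    result = {}
    for head in productions:
        # Variables reachable from head via unit productions only
        # (the list grows while it is being scanned)
        reach = [head]
        for var in reach:
            for body in productions[var]:
                if is_unit(body) and body[0] not in reach:
                    reach.append(body[0])
        # Copy every non-unit body of every reachable variable
        bodies = []
        for var in reach:
            for body in productions[var]:
                if not is_unit(body) and body not in bodies:
                    bodies.append(body)
        result[head] = bodies
    return result


if __name__ == "__main__":
    G = {
        "E": [("E", "+", "T"), ("T",)],
        "T": [("T", "*", "F"), ("F",)],
        "F": [("id",)],
    }
    for head, bodies in eliminate_unit_productions(G).items():
        print(head, "->", " | ".join(" ".join(b) for b in bodies))
    # E -> E + T | T * F | id
    # T -> T * F | id
    # F -> id
```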
Normal Forms
Why normal forms?
■ If all productions of the grammar could be
expressed in the same form(s), then:
Chomsky Normal Form (CNF)
Let G be a CFG for some L-{ε}
Definition:
G is said to be in Chomsky Normal Form if all
its productions are in one of the following
two forms:
i. A 🡺 BC where A,B,C are variables, or
ii. A🡺a where a is a terminal
■ G has no useless symbols
■ G has no unit productions
■ G has no ε-productions
CNF checklist
Is this grammar in CNF?
G1:
1. E 🡺 E+T | T*F | (E) | Ia | Ib | I0 | I1
2. T 🡺 T*F | (E) | Ia | Ib | I0 | I1
3. F 🡺 (E) | Ia | Ib | I0 | I1
4. I 🡺 a | b | Ia | Ib | I0 | I1
Checklist:
• G has no ε-productions
• G has no unit productions
• G has no useless symbols
• But…
• the normal form for productions is violated
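The last checklist item can be verified mechanically. The sketch below lists every production of G1 that is neither of the form A 🡺 BC nor A 🡺 a. It assumes single-character symbols, with the dictionary keys as the variables; the function name `cnf_violations` is illustrative.

```python
def cnf_violations(productions):
    """Return the (head, body) pairs that break the CNF production forms."""
    variables = set(productions)
    violations = []
    for head, bodies in productions.items():
        for body in bodies:
            is_pair = len(body) == 2 and all(sym in variables for sym in body)
            is_terminal = len(body) == 1 and body[0] not in variables
            if not (is_pair or is_terminal):
                violations.append((head, body))
    return violations


G1 = {
    "E": ["E+T", "T*F", "(E)", "Ia", "Ib", "I0", "I1"],
    "T": ["T*F", "(E)", "Ia", "Ib", "I0", "I1"],
    "F": ["(E)", "Ia", "Ib", "I0", "I1"],
    "I": ["a", "b", "Ia", "Ib", "I0", "I1"],
}

if __name__ == "__main__":
    bad = cnf_violations(G1)
    print(len(bad))            # 22
    print(("I", "a") in bad)   # False: I -> a is already in CNF form
```

Only I 🡺 a and I 🡺 b pass; bodies such as Ia fail because a terminal appears next to a variable.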
Example #2
G:
1. E 🡺 E+T | T*F | (E) | Ia | Ib | I0 | I1
2. T 🡺 T*F | (E) | Ia | Ib | I0 | I1
3. F 🡺 (E) | Ia | Ib | I0 | I1
4. I 🡺 a | b | Ia | Ib | I0 | I1
Step (1): replace each terminal inside a longer body with a new variable
1. E 🡺 E X+ T | T X* F | X( E X) | I Xa | I Xb | I X0 | I X1
2. T 🡺 T X* F | X( E X) | I Xa | I Xb | I X0 | I X1
3. F 🡺 X( E X) | I Xa | I Xb | I X0 | I X1
4. I 🡺 a | b | I Xa | I Xb | I X0 | I X1
5. X+ 🡺 +
6. X* 🡺 *
7. X( 🡺 (
8. …….
Step (2): break every body longer than two variables into a chain of two-variable productions
Other Normal Forms
■ Greibach Normal Form (GNF)
■ All productions are of the form
A ==> a α
where a is a terminal and α is a string of zero or
more non-terminals
Greibach Normal Form (GNF)
■ Two ways to convert a grammar to GNF
■ Theorem 1: By substitution method
■ Example:
B -> ABA | AB
A -> aA | a
Using theorem 1, substitute the A-productions for the leading A in the
productions of B:
B -> aABA | aBA | aAB | aB
Now all productions are in the required format.
(Note: do not substitute into productions that are already in the required format.)
Greibach Normal Form (GNF)
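■ Two ways to convert a grammar to GNF
■ Theorem 2: used to handle left-recursive
grammars
■ If the CFG contains productions of the form
A -> Aα1 | … | Aαr | β1 | … | βs (no βi starts with A),
then they can be replaced, using a new variable Z, by
A -> βi | βi Z and Z -> αj | αj Z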
Greibach Normal Form (GNF)
Example:
A -> AB | aB | bB | c
B -> b
To bring this grammar into GNF, apply
theorem 2 to handle the left-recursive
production A -> AB
Here, α1 = B,
β1 = aB, β2 = bB, β3 = c
Then,
A -> aB | bB | c
A -> aBZ | bBZ | cZ
Z -> B | BZ
Now A's productions are in GNF.
We still need to handle Z, so apply
theorem 1 (substitution) for B. So,
Z -> b | bZ
Final answer:
A -> aB | bB | c
A -> aBZ | bBZ | cZ
Z -> b | bZ
B -> b
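The two theorems used in this example can be replayed in a few lines. This is a sketch under the assumption that every symbol is a single character and bodies are strings; the function names `remove_left_recursion` and `substitute_leading` are illustrative, not standard library calls.

```python
def remove_left_recursion(head, bodies, new_var):
    """Theorem 2: split A -> A alpha | beta into A -> beta | beta Z, Z -> alpha | alpha Z."""
    alphas = [body[1:] for body in bodies if body.startswith(head)]
    betas = [body for body in bodies if not body.startswith(head)]
    if not alphas:
        return {head: list(bodies)}  # nothing to do: no left recursion
    return {
        head: betas + [beta + new_var for beta in betas],
        new_var: alphas + [alpha + new_var for alpha in alphas],
    }


def substitute_leading(bodies, var, var_bodies):
    """Theorem 1: replace a leading `var` by each body of `var`."""
    result = []
    for body in bodies:
        if body.startswith(var):
            result.extend(vb + body[1:] for vb in var_bodies)
        else:
            result.append(body)
    return result


if __name__ == "__main__":
    step = remove_left_recursion("A", ["AB", "aB", "bB", "c"], "Z")
    print(step["A"])  # ['aB', 'bB', 'c', 'aBZ', 'bBZ', 'cZ']
    print(step["Z"])  # ['B', 'BZ']
    print(substitute_leading(step["Z"], "B", ["b"]))  # ['b', 'bZ']
```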
Chomsky Hierarchy
26-09-2023 43
Introduction
■ Different classes of phrase structure
grammar are obtained by placing a few
constraints on the productions
■ It is a containment hierarchy
■ Described by Chomsky-Schützenberger
■ Suggested four different classes
Type-0 (Unrestricted grammar)
Type-1 (Context sensitive grammar)
Type-2 (Context free grammar)
Type-3 (Regular grammar)
Type-0 (Unrestricted grammar)
Type-3 (Regular grammar)
1. Left-linear grammar:
■ allowed productions are: A −> Bw, A −> w, where w is a string of terminals
■ The rule S −> ε is allowed only if S doesn’t
appear on RHS
■ For example:
S −> Ca | Bb,
C −> Bb,
B −> Ba | b
Type-3 (Regular grammar)
2. Right-linear grammar:
■ allowed productions are: A −> wB, A −> w, where w is a string of terminals
■ The rule S −> ε is allowed only if S
doesn’t appear on RHS
■ For example: S −>0A, A −>0A | 1
CFL Closure Properties
Closure Property Results
■ CFLs are closed under:
■ Union
■ Concatenation
■ Kleene closure operator
■ Substitution
■ Homomorphism, inverse homomorphism
■ Reversal
■ CFLs are not closed under:
■ Intersection
■ Difference
■ Complementation
Substitution of a CFL:
example
■ Let L = language of binary palindromes s.t., substitutions for 0
and 1 are defined as follows:
■ s(0) = {a^n b^n | n ≥ 1}, s(1) = {xx, yy}
■ Prove that s(L) is also a CFL.
S => S0 S S0 | S1 S S1 | ε
S0 => aS0b | ab
S1 => xx | yy
CFLs are closed under union
Let L1 and L2 be CFLs
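To show: L1 U L2 is also a CFL
Let us show by using the result of Substitution
■ Let Lnew = {a, b} and s(a) = L1, s(b) = L2; then L1 U L2 = s(Lnew)
■ In grammar terms, with start symbols S1 and S2: S -> S1 | S2
Concatenation L1L2 is handled the same way with S -> S1 S2, for example:
S -> S1 S2
S1 -> aS1a | bS1b | a | b | ε
S2 -> aS2b | ε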
CFLs are closed under
Kleene Closure
■ Let L be a CFL
■ Let Lnew = {a}* and s(a) = L
■ Then, L* = s(Lnew)
We won’t use substitution to prove this result
Some negative closure results
■ $L_1 \cap L_2 = \overline{\overline{L_1} \cup \overline{L_2}}$ (De Morgan's law)
Logic: if CFLs were to be closed under complementation
🡺 the whole right hand side becomes a CFL (because
CFL is closed for union)
🡺 the left hand side (intersection) is also a CFL
🡺 but we just showed CFLs are
NOT closed under intersection!
🡺 CFLs cannot be closed under complementation.
Decision Properties
■ Emptiness test
■ Generating test
■ Reachability test
■ Membership test
■ PDA acceptance
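The membership test is commonly carried out with the CYK algorithm on a grammar in Chomsky Normal Form. The following is a minimal sketch; the CNF grammar for { a^n b^n | n ≥ 1 } (S -> AB | AC, C -> SB, A -> a, B -> b) is an assumed example written for this illustration, and the function name `cyk_accepts` is hypothetical.

```python
def cyk_accepts(productions, start, word):
    """CYK membership test for a grammar in Chomsky Normal Form."""
    n = len(word)
    if n == 0:
        return False  # a CNF grammar as defined here never derives epsilon
    # table[i][l] = variables that derive word[i:i + l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        for head, bodies in productions.items():
            if ch in bodies:  # production head -> ch
                table[i][1].add(head)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for head, bodies in productions.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            table[i][length].add(head)
    return start in table[0][n]


# Assumed CNF grammar for { a^n b^n | n >= 1 }
G = {"S": ["AB", "AC"], "C": ["SB"], "A": ["a"], "B": ["b"]}

if __name__ == "__main__":
    print(cyk_accepts(G, "S", "aabb"))  # True
    print(cyk_accepts(G, "S", "abab"))  # False
```

The table has O(n^2) cells and each cell tries O(n) split points, so the test runs in time cubic in the length of the word for a fixed grammar.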
Application of CFL
■ Compiler –Parsing
■ Eg a=b;
■ <STMT> -> <Assign STMT> | <If STMT>
|<while loop>|<for loop>
■ <while loop>-> while <condition><block of stmt>
■ <block of stmt> -> <List of STMT>|<STMT>
■ <Assign STMT> -> <ID>=<ID>
Application of CFL
■ Markup Languages
■ DOC -> ε | Element DOC
■ Element -> Text | <EM> DOC </EM> | <P>
DOC </P> | <OL> List </OL>
■ XML
■ DTD
memo :
addressee: sender date title body;
addressee : TEXT;
sender : TEXT;
date : DATE;
title : TEXT;
body : TEXT;
For example, the following would be a valid memo:
<memo> <addressee>John</addressee>
<sender>Carla</sender> <date>1998-09-01</date>
<title>New coffee maker</title>
<body> The new coffee maker has been installed! Operation is
simple: put a cup in the opening and press the red button.
</body>
</memo>
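The memo rule (addressee, sender, date, title, body, in that order) can be checked with Python's standard `xml.etree.ElementTree` parser. This is a structural sketch only: it checks well-formedness and the order of the child elements, not that the date is a valid DATE, and it is not a full DTD validator. The function name `is_valid_memo` is illustrative.

```python
import xml.etree.ElementTree as ET

# Child elements required by the memo rule, in order
EXPECTED_CHILDREN = ["addressee", "sender", "date", "title", "body"]


def is_valid_memo(xml_text: str) -> bool:
    """Check that xml_text is well-formed and matches the memo structure."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # e.g. a mismatched closing tag
    return root.tag == "memo" and [child.tag for child in root] == EXPECTED_CHILDREN


MEMO = """<memo> <addressee>John</addressee>
<sender>Carla</sender> <date>1998-09-01</date>
<title>New coffee maker</title>
<body> The new coffee maker has been installed! </body>
</memo>"""

if __name__ == "__main__":
    print(is_valid_memo(MEMO))                                        # True
    print(is_valid_memo(MEMO.replace("</sender>", "</addressee>")))   # False
```

Matching opening and closing tags is the same balanced-nesting requirement as in the parenthesis-matching grammar, which is why a mismatched closing tag is rejected at the parsing stage.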
“Undecidable” problems for
CFL
■ Is a given CFG G ambiguous?
■ Is a given CFL inherently ambiguous?
■ Is the intersection of two CFLs empty?
■ Are two CFLs the same?
■ Is a given L(G) equal to ∑*?