H5 Context Free Grammar

The document discusses context-free grammars and languages. It provides an informal example of a palindrome language and defines a context-free grammar to represent that language. It then defines the formal components of a context-free grammar and describes how to represent expressions and identifiers using variables and production rules. Finally, it discusses parse trees and derivations as ways to represent the structure and generate strings from a context-free grammar.
CMSC 114– Context-free Grammar and Languages Basics

Many languages are not REGULAR. Thus we need to consider a larger class of languages called "Context-Free Languages (CFL's)".

CONTEXT-FREE GRAMMAR (CFG) – another notation for describing languages; a formal notation for expressing recursive definitions of languages

The natural, recursive notation of CFG's has:

• played a central role in compiler technology since the 1960's
• enhanced the implementation of parsers (functions that discover the structure of a program)
• been used to describe document formats, especially for information exchange on the WWW

Note: grammar – defines a language; the strings of the language follow certain rules

INFORMAL EXAMPLE – language of palindromes (strings that read the same forward and backward)

Example – OTTO, MADAMIMADAM, 0110, 11011 and ε

A palindrome:

• begins and ends with the same symbol
• when the first and last symbols are removed, the resulting string is also a palindrome

BASIS: ε, 0 and 1 are palindromes.

INDUCTION: If w is a palindrome, so are 0w0 and 1w1.

Written as a grammar with the single variable P, this definition has five rules:

1. P -> ε
2. P -> 0
3. P -> 1
4. P -> 0P0
5. P -> 1P1

The first three rules (1-3) form the BASIS. They tell us that the class of palindromes includes the strings ε, 0 and 1. None of the right sides of these rules (the portions following the arrows) contains a variable, which is why they form a basis for the definition.

The last two rules (4-5) form the INDUCTIVE part of the definition. For instance, rule 4 says that if we take any string w from the class P, then 0w0 is also in class P. Rule 5 likewise tells us that 1w1 is also in P.
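The basis and induction above translate directly into a recursive test. The following is a minimal sketch in Python; the function name `is_palindrome` and the restriction to the alphabet {0, 1} are assumptions made for illustration and are not part of the handout.

```python
def is_palindrome(w: str) -> bool:
    """Recursive test that mirrors the basis/induction definition over {0, 1}."""
    # Only strings over the alphabet {0, 1} are considered here (an assumption).
    if any(c not in "01" for c in w):
        return False
    # BASIS: the empty string (epsilon), 0 and 1 are palindromes.
    if len(w) <= 1:
        return True
    # INDUCTION: w = 0u0 or w = 1u1, where u is itself a palindrome.
    return w[0] == w[-1] and is_palindrome(w[1:-1])


print(is_palindrome("0110"))   # True
print(is_palindrome("01"))     # False
```

Each recursive call strips the first and last symbol, which is exactly the "remove the first and last symbols" step of the informal description.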

FORMAL DEFINITION OF CFG:

A context-free grammar is a quadruple

G = (V, T, P, S)

where
V is a finite set of variables.
T is a finite set of terminals.
P is a finite set of productions of the form
    A -> α
  where A is a variable and α ∈ (V ∪ T)*
S is a designated variable called the start symbol.

In the palindrome grammar:

• 0 and 1 are TERMINALS
• P is a VARIABLE (or nonterminal)
• P is in this grammar also the START SYMBOL
• rules 1-5 are PRODUCTIONS (or rules)
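One way to make the quadruple concrete is to store it as data and enumerate short strings of the language. The sketch below is illustrative only: the `Grammar` class, the `generate` function, and the encoding of the five palindrome rules (P -> ε | 0 | 1 | 0P0 | 1P1, taken from the basis and induction) are assumptions, not notation from the handout. The search prunes on the number of terminals, so it is only guaranteed to stop for grammars like this one, where the number of variables in a sentential form stays bounded.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    variables: frozenset   # V
    terminals: frozenset   # T
    productions: tuple     # P, as (head, body) pairs; body is a tuple of symbols
    start: str             # S


# Palindrome grammar: P -> epsilon | 0 | 1 | 0P0 | 1P1 (assumed encoding).
PALINDROME = Grammar(
    variables=frozenset({"P"}),
    terminals=frozenset({"0", "1"}),
    productions=(
        ("P", ()),
        ("P", ("0",)),
        ("P", ("1",)),
        ("P", ("0", "P", "0")),
        ("P", ("1", "P", "1")),
    ),
    start="P",
)


def generate(g: Grammar, max_len: int) -> list:
    """Terminal strings of length <= max_len derivable from the start symbol."""
    seen, result = set(), set()
    queue = deque([(g.start,)])
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, if any.
        idx = next((i for i, s in enumerate(form) if s in g.variables), None)
        if idx is None:
            result.add("".join(form))   # no variables left: a terminal string
            continue
        for head, body in g.productions:
            if head != form[idx]:
                continue
            new = form[:idx] + body + form[idx + 1:]
            # Prune sentential forms that already hold too many terminals.
            if sum(1 for s in new if s in g.terminals) > max_len:
                continue
            if new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(result, key=lambda s: (len(s), s))


print(generate(PALINDROME, 2))   # ['', '0', '1', '00', '11']
```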

COLEGIO DE LOS BANOS 5



CFG THAT REPRESENTS EXPRESSIONS:

Limitations:

• Operators + and * (representing addition and multiplication)
• Arguments (values passed) are identifiers
• Two variables to use:
    I represents identifiers
    E represents expressions (combinations of values and operators that will be evaluated)
• Every identifier must begin with a or b, which may be followed by any string in {a, b, 0, 1}*
• Language of identifiers:

    (a + b)(a + b + 0 + 1)*

• Formal definition: G = ({E, I}, {+, *, (, ), a, b, 0, 1}, A, E)

where A: (production rules)

1.  E -> E + E   (E can be 2 expressions connected by a + sign)
2.  E -> E * E   (E can be 2 expressions connected by a * sign)
3.  E -> (E)     (parenthesized expression)
4.  E -> I       (basic rule for expressions – E can be a single identifier)
5.  I -> a       (a is an identifier)
6.  I -> b       (b is an identifier)
7.  I -> Ia      (Identifier followed by a is another identifier)
8.  I -> Ib      (Identifier followed by b is another identifier)
9.  I -> I0      (Identifier followed by 0 is another identifier)
10. I -> I1      (Identifier followed by 1 is another identifier)

PARSE TREE:

• Alternative representation to derivations that tells about the syntactic structure of a string w
• Root – labeled by the start symbol
• There can be several parse trees for the same string (called ambiguity)
• Yield – concatenation of the leaves from left to right of the parse tree (terminal string)

[Figure: parse tree for a palindrome (yield – 0110)]

[Figure: parse tree for an expression (yield – a*(a + b00))]

NOTE:

Why are such grammars called `context free'?

Because every rule has only one symbol on the left-hand side, and wherever we see that symbol while doing a derivation, we are free to replace it with the right-hand side. That is, the `context' in which the left-hand symbol occurs is unimportant: we can always use the rule to make the rewrite while doing a derivation.

The language generated by a context-free grammar is the set of terminal strings that can be derived starting from the start symbol.
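The yield of a parse tree can be computed by walking the tree and joining its leaves from left to right. Below is a minimal sketch; the nested-tuple encoding of the tree, the name `tree_yield`, and the use of the empty string for an ε leaf are assumptions made for illustration. The tree shown is the palindrome tree with yield 0110, built from the rules P -> 0P0, P -> 1P1 and P -> ε.

```python
def tree_yield(node) -> str:
    """Concatenate the leaves of a parse tree from left to right."""
    if isinstance(node, str):
        return node                      # a leaf: a terminal, or "" for epsilon
    _label, children = node              # interior node: (variable, children)
    return "".join(tree_yield(child) for child in children)


# P -> 0P0, then P -> 1P1, then P -> epsilon.
TREE_0110 = ("P", ["0", ("P", ["1", ("P", [""]), "1"]), "0"])

print(tree_yield(TREE_0110))   # 0110
```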


DERIVATIONS USING A GRAMMAR:

Productions of a CFG are used to infer that certain strings are in the language of a certain variable.

Two approaches:

1. Recursive inference – use the production rules from body to head
2. Derivation – use the production rules from head to body

EXAMPLE OF RECURSIVE INFERENCE:

a*(a + b00)

       String         Lang   Prod used   String(s) used
i      a              I      5           -
ii     b              I      6           -
iii    b0             I      9           ii
iv     b00            I      9           iii
v      a              E      4           i
vi     b00            E      4           iv
vii    a + b00        E      1           v, vi
viii   (a + b00)      E      3           vii
ix     a * (a + b00)  E      2           v, viii

DERIVATION (=>)

*For every parse tree, there is a unique leftmost and rightmost derivation.

CONSIDER THE GRAMMAR FOR THE LANGUAGE: (Using Approach B)

[Figure: two parse trees (yield – aabbccdd)]

[Figure: two leftmost derivations]
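The recursive-inference table for a*(a + b00) can be reproduced mechanically by applying productions from body to head until nothing new is found. The sketch below is one possible way to do it, not the handout's procedure: the names `PRODUCTIONS` and `infer` are hypothetical, spaces are dropped from the target string, and the search only keeps substrings of the target so that it terminates.

```python
# Productions 1-10 of the expression grammar, as (head, body) pairs.
PRODUCTIONS = [
    ("E", ["E", "+", "E"]), ("E", ["E", "*", "E"]),
    ("E", ["(", "E", ")"]), ("E", ["I"]),
    ("I", ["a"]), ("I", ["b"]),
    ("I", ["I", "a"]), ("I", ["I", "b"]),
    ("I", ["I", "0"]), ("I", ["I", "1"]),
]
VARIABLES = {"E", "I"}


def infer(target: str) -> dict:
    """For each variable, the substrings of target inferred to be in its language."""
    # Every non-empty substring of the target; nothing else is worth inferring.
    pieces = {target[i:j]
              for i in range(len(target))
              for j in range(i + 1, len(target) + 1)}
    known = {v: set() for v in VARIABLES}
    changed = True
    while changed:                       # repeat until a full pass adds nothing
        changed = False
        for head, body in PRODUCTIONS:
            # Build the body left to right from already-known strings.
            candidates = {""}
            for symbol in body:
                options = known[symbol] if symbol in VARIABLES else {symbol}
                candidates = {c + o for c in candidates for o in options
                              if c + o in pieces}
            new = candidates - known[head]
            if new:                      # body -> head: the head's language grows
                known[head] |= new
                changed = True
    return known


known = infer("a*(a+b00)")
print("b00" in known["I"])          # True  (table row iv)
print("a*(a+b00)" in known["E"])    # True  (table row ix)
```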

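TWO TYPES:

• Leftmost derivation – at each step the leftmost variable is replaced
• Rightmost derivation – at each step the rightmost variable is replaced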
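A leftmost or rightmost derivation can be replayed by rewriting, at each step, the leftmost or rightmost variable of the current sentential form. The following sketch assumes the expression grammar with productions 1-10 (variables E and I over a, b, 0, 1); the names `RULES` and `derive`, and the two rule sequences for a*(a+b00), are illustrative and not taken from the handout.

```python
# Productions 1-10 of the expression grammar, keyed by rule number.
RULES = {
    1: ("E", ["E", "+", "E"]), 2: ("E", ["E", "*", "E"]),
    3: ("E", ["(", "E", ")"]), 4: ("E", ["I"]),
    5: ("I", ["a"]), 6: ("I", ["b"]),
    7: ("I", ["I", "a"]), 8: ("I", ["I", "b"]),
    9: ("I", ["I", "0"]), 10: ("I", ["I", "1"]),
}
VARIABLES = {"E", "I"}


def derive(start: str, rule_numbers, leftmost: bool = True) -> list:
    """Apply the rules in order; return every sentential form of the derivation."""
    form = [start]
    steps = ["".join(form)]
    for n in rule_numbers:
        head, body = RULES[n]
        positions = [i for i, s in enumerate(form) if s in VARIABLES]
        if not positions:
            raise ValueError("no variable left to rewrite")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != head:
            raise ValueError(f"rule {n} does not apply to {form[i]}")
        form[i:i + 1] = body             # head -> body rewrite
        steps.append("".join(form))
    return steps


left = derive("E", [2, 4, 5, 3, 1, 4, 5, 4, 9, 9, 6], leftmost=True)
right = derive("E", [2, 3, 1, 4, 9, 9, 6, 4, 5, 4, 5], leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

Both sequences end in the same string, a*(a+b00), but pass through different sentential forms along the way.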

Example A:


A language that generates strings of 0's and 1's such that each string starts with either 0 or 1, followed by 101.

Language: (0 + 1)101

Strings included: 0101, 1101

Formal definition: G = ({E, I}, {+, *, (, ), 0, 1}, A, E)

where A: (production rules)

1. E -> E + E   (E can be 2 expressions connected by a + sign)
2. E -> E * E   (E can be 2 expressions connected by a * sign)
3. E -> (E)     (parenthesized expression)
4. E -> I       (basic rule for expressions – E can be a single identifier)
5. I -> 0       (0 is an identifier)
6. I -> 1       (1 is an identifier)
7. I -> I1      (Identifier followed by 1 is another identifier)
8. I -> I0      (Identifier followed by 0 is another identifier)
9. I -> I01     (Identifier followed by 01 is another identifier)

PARSE TREE:

[Figure: parse tree with root E and children E * E; the left E expands to (E) with E -> E + E, the right E expands to I; yield – (0 + 1) * 101]

DETERMINE IF CERTAIN STRINGS ARE IN THE LANGUAGE OF A CERTAIN VARIABLE.

A. Recursive Inference:

       String          Lang   Prod used   String(s) used
i      0               I      5           -
ii     1               I      6           -
iii    10              I      8           ii
iv     101             I      7           iii
v      0               E      4           i
vi     1               E      4           ii
vii    101             E      4           iv
viii   0 + 1           E      1           v, vi
ix     (0 + 1)         E      3           viii
x      (0 + 1) * 101   E      2           ix, vii

B. Derivations:

Leftmost:                      (Rule used)

E => E * E                     (2)
  => (E) * E                   (3)
  => (E + E) * E               (1)
  => (I + E) * E               (4)
  => (0 + E) * E               (5)
  => (0 + I) * E               (4)
  => (0 + 1) * E               (6)
  => (0 + 1) * I               (4)
  => (0 + 1) * I1              (7)
  => (0 + 1) * I01             (8)
  => (0 + 1) * 101             (6)

Rightmost:                     (Rule used)

E => E * E                     (2)
  => E * I                     (4)
  => E * I1                    (7)
  => E * I01                   (8)
  => E * 101                   (6)
  => (E) * 101                 (3)
  => (E + E) * 101             (1)
  => (E + I) * 101             (4)
  => (E + 1) * 101             (6)
  => (I + 1) * 101             (4)
  => (0 + 1) * 101             (5)

EXAMPLE B:
Given a context-free grammar with:


Formal definition: G = ({S, A, B}, {0, 1}, A, S)

where A: (production rules)

1. S -> A1B
2. A -> 0A | ε
3. B -> 0B | 1B | ε

Parse Tree:

[Figure: parse tree with root S and children A 1 B; yield – 00101]

DERIVATIONS:

Leftmost:            (Rule No.)

S => A1B             (1)
  => 0A1B            (2)
  => 00A1B           (2)
  => 001B            (2)
  => 0010B           (3)
  => 00101B          (3)
  => 00101           (3)   (yield)

Rightmost:           (Rule No.)

S => A1B             (1)
  => A10B            (3)
  => A101B           (3)
  => A101            (3)
  => 0A101           (2)
  => 00A101          (2)
  => 00101           (2)   (yield)

EXAMPLE C:
Given a context-free grammar with:

Formal definition: G = ({S, A, B}, {a, b}, A, S)

where A: (production rules)

1. S -> ASB
2. S -> AB
3. A -> a
4. B -> b

Parse Tree:

[Figure: parse tree with root S and children A S B; the inner S expands to A B; yield = aabb]

DERIVATIONS:

Leftmost:            (Rule No.)

S => ASB             (1)
  => aSB             (3)
  => aABB            (2)
  => aaBB            (3)
  => aabB            (4)
  => aabb            (4)   (yield)

Rightmost:           (Rule No.)

S => ASB             (1)
  => ASb             (4)
  => AABb            (2)
  => AAbb            (4)
  => Aabb            (3)
  => aabb            (3)   (yield)

Example A:
Language: (g + h) + (10g + h)


Strings included: g, h, 10g

Formal definition: G = ({E, I}, {+, *, (, ), g, h, 0, 1}, A, E)

where A: (production rules)

1. E -> E + E   (E can be 2 expressions connected by a + sign)
2. E -> E * E   (E can be 2 expressions connected by a * sign)
3. E -> (E)     (parenthesized expression)
4. E -> I       (basic rule for expressions – E can be a single identifier)
5. I -> g       (g is an identifier)
6. I -> h       (h is an identifier)
7. I -> 1       (1 is an identifier)
8. I -> I0      (Identifier followed by 0 is another identifier)
9. I -> Ig      (Identifier followed by g is another identifier)

PARSE TREE:

[Figure: parse tree with root E and children E + E; each E expands to (E) with E -> E + E; yield – (g + h) + (10g + h)]

DETERMINE IF CERTAIN STRINGS ARE IN THE LANGUAGE OF A CERTAIN VARIABLE.

C. Recursive Inference:

       String                  Lang   Prod used   String(s) used
i      g                       I      5           -
ii     h                       I      6           -
iii    1                       I      7           -
iv     10                      I      8           iii
v      10g                     I      9           iv
vi     g                       E      4           i
vii    h                       E      4           ii
viii   10g                     E      4           v
ix     g + h                   E      1           vi, vii
x      (g + h)                 E      3           ix
xi     10g + h                 E      1           viii, vii
xii    (10g + h)               E      3           xi
xiii   (g + h) + (10g + h)     E      1           x, xii

LEFTMOST DERIVATION:

Example B:
Language: (g1h)(g0 + g1 + h)


Strings included: g1hg0, g1hg1, g1hh

Formal definition: G = ({E, I}, {+, *, (, ), g, h, 0, 1}, A, E)

where A: (production rules)

1. E -> E + E   (E can be 2 expressions connected by a + sign)
2. E -> E * E   (E can be 2 expressions connected by a * sign)
3. E -> (E)     (parenthesized expression)
4. E -> I       (basic rule for expressions – E can be a single identifier)
5. I -> g       (g is an identifier)
6. I -> h       (h is an identifier)
7. I -> I1      (Identifier followed by 1 is another identifier)
8. I -> Ih      (Identifier followed by h is another identifier)
9. I -> I0      (Identifier followed by 0 is another identifier)

PARSE TREE:

[Figure: parse tree with root E and children E * E; the left E expands to I (g1h), the right E expands to (E) with nested E -> E + E; yield – g1h * (g0 + g1 + h)]

DETERMINE IF CERTAIN STRINGS ARE IN THE LANGUAGE OF A CERTAIN VARIABLE.

D. Recursive Inference:

       String   Lang   Prod used   String(s) used
i
ii
iii
iv
v
vi
vii
viii
ix
x
xi
xii
xiii

LEFTMOST DERIVATION:

RIGHTMOST DERIVATION:
