Context Free Grammars

This document discusses context-free languages and grammars (CFLs and CFGs). It begins by noting that not all languages are regular and introduces context-free languages as a class larger than regular languages. Context-free languages can be defined using context-free grammars (CFGs), which allow for recursive notation. CFGs have applications in areas like parse trees and compilers. An example CFG is given for the language of binary palindromes. The key components of a CFG are defined. More examples of CFGs are provided, including for simple expressions, markup languages, and derivations.

Uploaded by

adityahammad02
Context-Free Languages & Grammars (CFLs & CFGs)
Chapter 3

Not all languages are regular
◼ So what happens to the languages that are not regular?
◼ Can we still come up with a language recognizer?
◼ i.e., something that will accept (or reject) strings that belong (or do not belong) to the language?

Context-Free Languages
◼ A language class larger than the class of regular languages
◼ Supports natural, recursive notation called “context-free grammar”
◼ Applications:
◼ Parse trees, compilers
◼ XML
[Figure: nested sets, with the regular languages (FA/RE) contained in the context-free languages (PDA/CFG)]

An Example
◼ A palindrome is a word that reads identically from both ends
◼ E.g., madam, redivider, malayalam, 010010010
◼ Let L = { w | w is a binary palindrome }
◼ Is L regular?
◼ No.
◼ Proof:
◼ Let w = 0^N 1 0^N (assuming N to be the pumping-lemma constant)
◼ By the pumping lemma, w can be rewritten as xyz, such that xy^k z is also in L (for any k≥0)
◼ But |xy|≤N and y≠ε
◼ ==> y = 0^+ (y consists only of 0s from the leading block)
◼ ==> xy^k z will NOT be in L for k=0, since xz = 0^(N-|y|) 1 0^N has fewer 0s before the 1 than after it
◼ ==> Contradiction

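The pumping-down step can be checked mechanically for a small constant. The sketch below is only an illustration: the value N = 4 and the helper names `is_palindrome` and `pump` are assumptions, not part of the slides. It enumerates every split w = xyz with |xy| ≤ N and non-empty y, and confirms that removing y always breaks the palindrome.

```python
def is_palindrome(w: str) -> bool:
    """Return True if w reads the same from both ends."""
    return w == w[::-1]


def pump(x: str, y: str, z: str, k: int) -> str:
    """Return x y^k z, i.e. the string with y repeated k times."""
    return x + y * k + z


N = 4  # illustrative stand-in for the pumping-lemma constant (assumption)
w = "0" * N + "1" + "0" * N

# Every split w = xyz with |xy| <= N and y non-empty keeps y inside the leading 0s.
splits = [
    (w[:i], w[i:j], w[j:])
    for i in range(N + 1)
    for j in range(i + 1, N + 1)
]

# Pumping down (k = 0) removes 0s from the left block only, breaking the symmetry.
all_broken = all(not is_palindrome(pump(x, y, z, 0)) for x, y, z in splits)
```

A check for one value of N is not a proof; it only mirrors the argument that y must lie in the leading block of 0s.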
But the language of palindromes…
is a CFL, because it supports recursive substitution (in the form of a CFG)
◼ This is because we can construct a “grammar” like this:
Productions:
1. A ==> ε
2. A ==> 0
3. A ==> 1
4. A ==> 0A0
5. A ==> 1A1
Same as: A => 0A0 | 1A1 | 0 | 1 | ε
Here A is the variable (non-terminal); 0 and 1 are the terminals.
How does this grammar work?

How does the CFG for palindromes work?
An input string belongs to the language (i.e., is accepted) iff it can be generated by the CFG
G: A => 0A0 | 1A1 | 0 | 1 | ε
◼ Example: w=01110
◼ G can generate w as follows:
1. A => 0A0
2. => 01A10
3. => 01110
Generating a string from a grammar:
1. Pick and choose a sequence of productions that would allow us to generate the string.
2. At every step, substitute one variable with one of its productions.

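Because every production of this grammar either stops or wraps the variable in a matching pair of symbols, generation can be mirrored by a short recursive check. This is a minimal sketch with a hypothetical function name, `generated_by_A`; it is specific to the palindrome grammar and is not a general CFG parser.

```python
def generated_by_A(w: str) -> bool:
    """Return True if A =>* w under A => 0A0 | 1A1 | 0 | 1 | ε."""
    if w in ("", "0", "1"):                  # A => ε | 0 | 1
        return True
    if len(w) >= 2 and w[0] == w[-1] and w[0] in "01":
        return generated_by_A(w[1:-1])       # A => 0A0 or A => 1A1
    return False
```

For w = 01110 the recursion peels 0…0, then 1…1, and stops at the single 1, matching the three derivation steps listed above.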
Context-Free Grammar: Definition
◼ A context-free grammar G=(V,T,P,S), where:
◼ V: set of variables or non-terminals
◼ T: set of terminals (= alphabet U {ε})
◼ P: set of productions, each of which is of the form
V ==> α1 | α2 | …
◼ Where each αi is an arbitrary string of variables and terminals
◼ S: start variable
CFG for the language of binary palindromes:
G=({A},{0,1},P,A)
P: A ==> 0 A 0 | 1 A 1 | 0 | 1 | ε

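One way to hold the four components in a program is a small record type. The sketch below is an assumption-laden illustration: the class name `CFG`, its field names, and the `is_well_formed` check are hypothetical, and ε is represented as an empty production body rather than as a member of T.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset    # V
    terminals: frozenset    # T
    productions: tuple      # P, as (head, body) pairs; body is a tuple of symbols
    start: str              # S

    def is_well_formed(self) -> bool:
        """Check that S is in V, V and T are disjoint, and bodies use only V ∪ T."""
        symbols = self.variables | self.terminals
        return (
            self.start in self.variables
            and not (self.variables & self.terminals)
            and all(
                head in self.variables and all(s in symbols for s in body)
                for head, body in self.productions
            )
        )


# G = ({A}, {0,1}, P, A) for binary palindromes.
palindromes = CFG(
    variables=frozenset({"A"}),
    terminals=frozenset({"0", "1"}),
    productions=(
        ("A", ("0", "A", "0")),
        ("A", ("1", "A", "1")),
        ("A", ("0",)),
        ("A", ("1",)),
        ("A", ()),              # A ==> ε, written as an empty body
    ),
    start="A",
)
```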
More examples
◼ Parenthesis matching in code
◼ Syntax checking
◼ In scenarios where there is a general need for:
◼ Matching a symbol with another symbol, or
◼ Matching a count of one symbol with that of another symbol, or
◼ Recursively substituting one symbol with a string of other symbols

Applications of CFLs & CFGs
◼ Compilers use parsers for syntactic checking
◼ Parsers can be expressed as CFGs
1. Balancing parentheses:
◼ B ==> BB | (B) | Statement
◼ Statement ==> …
2. If-then-else:
◼ S ==> SS | if Condition then Statement else Statement | if Condition then Statement | Statement
◼ Condition ==> …
◼ Statement ==> …
3. C brace matching { … }
4. Pascal begin-end matching
5. YACC (Yet Another Compiler-Compiler)

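The parenthesis-balancing idea can be sketched as a tiny recursive-descent recognizer. Note the assumptions: the slides leave Statement unspecified, so this sketch handles bare parentheses only, and it uses the rewritten grammar B ==> (B)B | ε rather than B ==> BB | (B) | Statement, because the original form is left-recursive. The function names are hypothetical.

```python
def parse_B(s: str, i: int = 0) -> int:
    """Consume B ==> ( B ) B | ε starting at position i; return the position reached."""
    while i < len(s) and s[i] == "(":
        i = parse_B(s, i + 1)               # the inner B
        if i >= len(s) or s[i] != ")":
            raise ValueError(f"expected ')' at position {i}")
        i += 1                              # the matching ')'; the loop handles the trailing B
    return i


def is_balanced(s: str) -> bool:
    """Return True if s consists solely of balanced parentheses."""
    try:
        return parse_B(s) == len(s)
    except ValueError:
        return False
```

The recursion depth tracks the nesting depth, which is exactly the unbounded counting that a finite automaton cannot do.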
More applications
◼ Markup languages
◼ Nested Tag Matching
◼ HTML
◼ <html> … <p> … <a href=…> … </a> </p> … </html>
◼ XML
◼ <PC> … <MODEL> … </MODEL> … <RAM> … </RAM> … </PC>

Tag-Markup Languages
Roll ==> <ROLL> Class Students </ROLL>
Class ==> <CLASS> Text </CLASS>
Text ==> Char Text | Char
Char ==> a | b | … | z | A | B | … | Z
Students ==> Student Students | ε
Student ==> <STUD> Text </STUD>
Here, the left-hand side of each production denotes one non-terminal (e.g., “Roll”, “Class”, etc.)
Those symbols on the right-hand side for which no productions (i.e., substitutions) are defined are terminals (e.g., ‘a’, ‘b’, ‘<’, ‘>’, “ROLL”, etc.)

Simple Expressions…
◼ We can write a CFG for accepting simple expressions
◼ G = (V,T,P,S)
◼ V = {E,F}
◼ T = {0,1,a,b,+,*,(,)}
◼ S = E
◼ P:
◼ E ==> E+E | E*E | (E) | F
◼ F ==> aF | bF | 0F | 1F | a | b | 0 | 1

Context-Free Language
◼ The language of a CFG, G=(V,T,P,S), denoted by L(G), is the set of terminal strings that have a derivation from the start variable S.
◼ L(G) = { w in T* | S ==>*G w }

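The definition of L(G) can be made concrete by enumerating derivations breadth-first. The sketch below does this for the binary-palindrome grammar A => 0A0 | 1A1 | 0 | 1 | ε, up to a length bound. The names `PRODUCTIONS` and `language_up_to` are hypothetical, and the pruning rule assumes every recursive production adds terminals, which holds for this grammar but not for CFGs in general.

```python
from collections import deque

PRODUCTIONS = {"A": ["0A0", "1A1", "0", "1", ""]}  # "" stands for ε


def language_up_to(max_len: int, start: str = "A") -> set:
    """Return every terminal string w with start =>* w and |w| <= max_len."""
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost variable, if any.
        pos = next((i for i, sym in enumerate(form) if sym in PRODUCTIONS), None)
        if pos is None:
            found.add(form)                 # no variables left: a terminal string
            continue
        for body in PRODUCTIONS[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            terminals = sum(1 for sym in new if sym not in PRODUCTIONS)
            # Prune sentential forms that already carry too many terminals.
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found
```

With a bound of 3 this yields the nine binary palindromes of length at most 3, including the empty string.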
Left-most & Right-most Derivation Styles
G:
E => E+E | E*E | (E) | F
F => aF | bF | 0F | 1F | ε
Derive the string a*(ab+10) from G: E =*=>G a*(ab+10)
Left-most derivation (always substitute the leftmost variable):
◼ E
◼ ==> E * E
◼ ==> F * E
◼ ==> aF * E
◼ ==> a * E
◼ ==> a * (E)
◼ ==> a * (E + E)
◼ ==> a * (F + E)
◼ ==> a * (aF + E)
◼ ==> a * (abF + E)
◼ ==> a * (ab + E)
◼ ==> a * (ab + F)
◼ ==> a * (ab + 1F)
◼ ==> a * (ab + 10F)
◼ ==> a * (ab + 10)
Right-most derivation (always substitute the rightmost variable):
◼ E
◼ ==> E * E
◼ ==> E * (E)
◼ ==> E * (E + E)
◼ ==> E * (E + F)
◼ ==> E * (E + 1F)
◼ ==> E * (E + 10F)
◼ ==> E * (E + 10)
◼ ==> E * (F + 10)
◼ ==> E * (aF + 10)
◼ ==> E * (abF + 10)
◼ ==> E * (ab + 10)
◼ ==> F * (ab + 10)
◼ ==> aF * (ab + 10)
◼ ==> a * (ab + 10)

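A left-most derivation is fully determined by the sequence of productions applied, so it can be replayed step by step. The sketch below assumes the grammar E => E+E | E*E | (E) | F, F => aF | bF | 0F | 1F | ε, with ε as the last alternative for F, which is what the steps aF => a and 10F => 10 imply. The names `leftmost_step` and `STEPS` are hypothetical.

```python
VARIABLES = {"E", "F"}


def leftmost_step(form: str, head: str, body: str) -> str:
    """Replace the leftmost variable of form, which must equal head, with body."""
    for i, sym in enumerate(form):
        if sym in VARIABLES:
            if sym != head:
                raise ValueError(f"leftmost variable is {sym}, not {head}")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no variable left to substitute")


# Productions used by the left-most derivation of a*(ab+10), in order.
STEPS = [
    ("E", "E*E"), ("E", "F"), ("F", "aF"), ("F", ""),     # reaches a*E
    ("E", "(E)"), ("E", "E+E"), ("E", "F"),               # reaches a*(F+E)
    ("F", "aF"), ("F", "bF"), ("F", ""),                  # reaches a*(ab+E)
    ("E", "F"), ("F", "1F"), ("F", "0F"), ("F", ""),      # reaches a*(ab+10)
]

form = "E"
for head, body in STEPS:
    form = leftmost_step(form, head, body)
```

The helper raises an error if a step tries to expand anything other than the leftmost variable, which is the defining constraint of this derivation style.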
Leftmost vs. Rightmost derivations
Q1) For every leftmost derivation, there is a rightmost derivation, and vice versa. True or False?
True - will use parse trees to prove this
Q2) Does every word generated by a CFG have a leftmost and a rightmost derivation?
Yes - easy to prove (reverse direction)
Q3) Could there be words which have more than one leftmost (or rightmost) derivation?
Yes - depending on the grammar

Parse Trees
◼ Each derivation in a CFG can be represented using a parse tree:
◼ Each internal node is labeled by a variable in V
◼ Each leaf is a terminal symbol (or ε)
◼ For a production A ==> X1X2…Xk, any internal node labeled A has k children, labeled X1, X2, …, Xk from left to right
Parse tree for a production and all other subsequent productions:
A ==> X1..Xi..Xk
[Figure: root A with children X1 … Xi … Xk, left to right]

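Examples
Parse tree for a + 1
G: E => E+E | E*E | (E) | F
   F => aF | bF | 0F | 1F | 0 | 1 | a | b
E
├─ E
│  └─ F
│     └─ a
├─ +
└─ E
   └─ F
      └─ 1
Parse tree for 0110
G: A => 0A0 | 1A1 | 0 | 1 | ε
A
├─ 0
├─ A
│  ├─ 1
│  ├─ A
│  │  └─ ε
│  └─ 1
└─ 0
Reading a tree from the root down gives a derivation; reading it from the leaves up gives a recursive inference.
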
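The string a parse tree represents is its yield: the leaf labels read from left to right. The following sketch encodes the two example trees (for a + 1 and for 0110) as nested tuples and computes their yields. The representation, with "" as the label of an ε leaf, and the names `leaf` and `tree_yield` are assumptions made for illustration.

```python
# A tree is (label, [children]); a leaf has an empty child list.
# The empty string "" is used as the label of an ε leaf.
def leaf(symbol: str):
    return (symbol, [])


def tree_yield(tree) -> str:
    """Concatenate the leaf labels from left to right."""
    label, children = tree
    if not children:
        return label
    return "".join(tree_yield(child) for child in children)


# Parse tree for a + 1 using E => E+E, E => F, F => a, F => 1.
expr_tree = ("E", [
    ("E", [("F", [leaf("a")])]),
    leaf("+"),
    ("E", [("F", [leaf("1")])]),
])

# Parse tree for 0110 using A => 0A0, A => 1A1, A => ε.
pal_tree = ("A", [
    leaf("0"),
    ("A", [leaf("1"), ("A", [leaf("")]), leaf("1")]),
    leaf("0"),
])
```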
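Parse Trees, Derivations, and Recursive Inferences
Production: A ==> X1..Xi..Xk
[Figure: node A with children X1 … Xi … Xk; read top-down it is a derivation, read bottom-up it is a recursive inference]
[Figure: cycle of equivalences linking recursive inference, parse tree, left-most derivation, right-most derivation, and derivation]
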
Interchangeability of different CFG representations
◼ Parse tree ==> left-most derivation
◼ DFS left to right
◼ Parse tree ==> right-most derivation
◼ DFS right to left
◼ Hence, left-most derivation <==> right-most derivation (each can be obtained from the other via the parse tree)
◼ Derivation ==> Recursive inference
◼ Reverse the order of productions
◼ Recursive inference ==> Parse trees
◼ bottom-up traversal of parse tree

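The first two conversions can be sketched directly: keep the current sentential form as a list of subtrees and repeatedly expand the leftmost (or rightmost) internal node. The tuple representation of trees and the function name `derivation` are assumptions made for this illustration.

```python
def derivation(tree, leftmost: bool = True) -> list:
    """List the sentential forms obtained by expanding one internal node at a time."""
    form = [tree]                            # current sentential form, as subtrees
    steps = ["".join(t[0] for t in form)]
    while True:
        order = range(len(form)) if leftmost else reversed(range(len(form)))
        pos = next((i for i in order if form[i][1]), None)
        if pos is None:                      # only leaves remain
            return steps
        form[pos:pos + 1] = form[pos][1]     # replace the node by its children
        steps.append("".join(t[0] for t in form))


# Parse tree for a + 1; each node is (label, [children]) and leaves have no children.
tree = ("E", [
    ("E", [("F", [("a", [])])]),
    ("+", []),
    ("E", [("F", [("1", [])])]),
])
```

Both traversals start at E and end at a+1, but pass through different intermediate forms, for example F+E on the way left to right versus E+F on the way right to left.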
CFLs & Regular Languages
What kind of grammars result for regular languages?
◼ A CFG is said to be right-linear if all the productions are one of the following two forms: A ==> wB (or) A ==> w
Where:
• A & B are variables,
• w is a string of terminals
◼ Theorem 1: Every right-linear CFG generates a regular language
◼ Theorem 2: Every regular language has a right-linear grammar
◼ Theorem 3: Left-linear CFGs also represent RLs

Some Examples
[Figure: two finite automata over states A, B, C with transitions on 0 and 1; question for each: Right linear CFG?]
Grammar (question: Finite Automaton?):
➢ A => 01B | C
B => 11B | 0C | 1A
C => 1A | 0 | 1

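A right-linear grammar can be run much like a nondeterministic finite automaton: the current variable plays the role of the state, and each production A ==> wB consumes w and moves to B. The sketch below applies this to the grammar A => 01B | C, B => 11B | 0C | 1A, C => 1A | 0 | 1. The encoding in `GRAMMAR` and the function name `accepts` are hypothetical, and the search over (variable, position) pairs is one possible simulation, not a construction taken from the slides.

```python
# head -> list of (terminal string w, next variable or None when the form is A ==> w)
GRAMMAR = {
    "A": [("01", "B"), ("", "C")],
    "B": [("11", "B"), ("0", "C"), ("1", "A")],
    "C": [("1", "A"), ("0", None), ("1", None)],
}


def accepts(w: str, start: str = "A") -> bool:
    """Treat (variable, input position) pairs as NFA states and search them."""
    stack, seen = [(start, 0)], set()
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        var, i = state
        for prefix, nxt in GRAMMAR[var]:
            if not w.startswith(prefix, i):
                continue
            j = i + len(prefix)
            if nxt is None:
                if j == len(w):             # A ==> w form: must consume all input
                    return True
            else:
                stack.append((nxt, j))      # A ==> wB form: continue from B
    return False
```

Because there are only finitely many variables and input positions, the search always terminates, which reflects why such grammars stay within the regular languages.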
Ambiguity in CFGs and CFLs

Ambiguity in CFGs
◼ A CFG is said to be ambiguous if there exists a string which has more than one left-most derivation
Example:
S ==> AS | ε
A ==> A1 | 0A1 | 01
Input string: 00111
Can be derived in two ways:
LM derivation #1:
S => AS
=> 0A1S
=> 0A11S
=> 00111S
=> 00111
LM derivation #2:
S => AS
=> A1S
=> 0A11S
=> 00111S
=> 00111

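Since each parse tree corresponds to exactly one left-most derivation, ambiguity for a given string can be tested by counting parse trees. The sketch below does this for the grammar S ==> AS | ε, A ==> A1 | 0A1 | 01 with one counting function per variable. The function names are hypothetical and the recursion is written by hand for this grammar only.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_A(s: str) -> int:
    """Number of parse trees with root A and yield s, for A ==> A1 | 0A1 | 01."""
    total = 1 if s == "01" else 0               # A ==> 01
    if len(s) >= 2 and s.endswith("1"):
        total += count_A(s[:-1])                # A ==> A1
    if len(s) >= 3 and s[0] == "0" and s[-1] == "1":
        total += count_A(s[1:-1])               # A ==> 0A1
    return total


@lru_cache(maxsize=None)
def count_S(s: str) -> int:
    """Number of parse trees with root S and yield s, for S ==> AS | ε."""
    total = 1 if s == "" else 0                 # S ==> ε
    for k in range(1, len(s) + 1):              # S ==> AS, where A yields s[:k]
        total += count_A(s[:k]) * count_S(s[k:])
    return total
```

A count above 1 for any string shows the grammar is ambiguous; for 00111 the two trees differ in whether A ==> 0A1 or A ==> A1 is applied first.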
Why does ambiguity matter?
E ==> E + E | E * E | (E) | a | b | c | 0 | 1
string = a * b + c
• LM derivation #1:
• E => E + E => E * E + E ==>* a * b + c
Parse tree: root E + E, where the left E expands to E * E (yielding a, b) and the right E yields c; this groups as (a*b)+c
• LM derivation #2:
• E => E * E => a * E => a * E + E ==>* a * b + c
Parse tree: root E * E, where the left E yields a and the right E expands to E + E (yielding b, c); this groups as a*(b+c)
Values are different!
The calculated value depends on which of the two parse trees is actually used.

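The difference becomes tangible once the operands are given values. In the sketch below the two parse trees are written as nested tuples and evaluated. The values a = 2, b = 3, c = 4, the tuple encoding, and the name `evaluate` are all assumptions chosen for illustration.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(tree, env):
    """Evaluate a tree that is either an operand name or (op, left, right)."""
    if isinstance(tree, str):
        return env[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, env), evaluate(right, env))


tree_1 = ("+", ("*", "a", "b"), "c")    # from LM derivation #1: (a*b)+c
tree_2 = ("*", "a", ("+", "b", "c"))    # from LM derivation #2: a*(b+c)

env = {"a": 2, "b": 3, "c": 4}          # illustrative values, not from the slides
```

With these values the first tree evaluates to 10 and the second to 14, even though both trees have the same yield a * b + c.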
Inherently Ambiguous CFLs
◼ However, for some languages, it may not be possible to remove ambiguity
◼ A CFL is said to be inherently ambiguous if every CFG that describes it is ambiguous
Example:
◼ L = { a^n b^n c^m d^m | n,m≥1 } U { a^n b^m c^m d^n | n,m≥1 }
◼ L is inherently ambiguous
◼ Why? Consider the input string a^n b^n c^n d^n, which belongs to both sets in the union
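The overlap between the two halves of L can be checked with two small membership predicates. This sketch only shows that strings of the form a^n b^n c^n d^n satisfy both conditions; it does not prove inherent ambiguity. The function names are hypothetical.

```python
import re


def in_first_half(w: str) -> bool:
    """w = a^n b^n c^m d^m with n, m >= 1."""
    m = re.fullmatch(r"(a+)(b+)(c+)(d+)", w)
    return bool(m) and len(m[1]) == len(m[2]) and len(m[3]) == len(m[4])


def in_second_half(w: str) -> bool:
    """w = a^n b^m c^m d^n with n, m >= 1."""
    m = re.fullmatch(r"(a+)(b+)(c+)(d+)", w)
    return bool(m) and len(m[1]) == len(m[4]) and len(m[2]) == len(m[3])
```

For example, aabbccdd passes both predicates, whereas aabbcd passes only the first and abbccd only the second.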
