Regular Expressions

This is the NFA-ε corresponding to the regular expression r = 0(0+1)*.

Uploaded by

Aryan Panchal

Regular Expressions

• Highlights:
– A regular expression is used to specify a language, and it does so
precisely.
– Regular expressions are very intuitive.
– Regular expressions are very useful in a variety of contexts.
– Given a regular expression, an NFA-ε can be constructed from it
automatically.
– Hence an NFA, a DFA, and a corresponding program can also be
constructed, all automatically!

1
Two Operations

• Concatenation:
– x = 010
– y = 1101
– xy = 010 1101

• Language Concatenation: L1L2 = {xy | x is in L1 and y is in L2}


– L1 = {01, 00}
– L2 = {11, 010}
– L1L2 = {01 11, 01 010, 00 11, 00 010}

• Language Union:
– L1 = {01, 00}
– L2 = {01, 11, 010}
– L1 U L2 = {01, 00, 11, 010}
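
The two language operations can be tried out on finite languages with Python sets. This is a minimal sketch; the function name `concat` is an illustrative choice, not something from the slides.

```python
def concat(L1, L2):
    """Language concatenation: every x in L1 followed by every y in L2."""
    return {x + y for x in L1 for y in L2}


# Concatenation example from the slide.
print(sorted(concat({"01", "00"}, {"11", "010"})))

# Language union is ordinary set union; the shared string 01 appears only once.
print(sorted({"01", "00"} | {"01", "11", "010"}))
```

Note that concatenation is not commutative in general: `concat(L1, L2)` and `concat(L2, L1)` usually differ.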

2
Operations on Languages

• Let L, L1, L2 be subsets of Σ*

• Concatenation: L1L2 = {xy | x is in L1 and y is in L2}

• Concatenating a language with itself:

    L^0 = {ε}
    L^i = L L^(i-1), for all i >= 1

3
Kleene closure
Say L = L^1 = {a, abc, ba}, over Σ = {a, b, c}

Then L^2 = {aa, aabc, aba, abca, abcabc, abcba, baa, baabc, baba}

L^3 = {a, abc, ba} L^2

…

And L^0 = {ε}

Kleene closure of L: L* = {ε} U L^1 U L^2 U L^3 U …
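
The powers of a language follow directly from the recursive definition. The sketch below assumes finite languages stored as Python sets, with the empty string standing in for ε; `power` and `kleene_upto` are hypothetical helper names. Since L* is infinite whenever L contains a nonempty string, only a finite part of it is computed.

```python
def concat(L1, L2):
    return {x + y for x in L1 for y in L2}


def power(L, i):
    """L^i, following L^0 = {ε} and L^i = L L^(i-1)."""
    result = {""}  # the empty string plays the role of ε
    for _ in range(i):
        result = concat(L, result)
    return result


def kleene_upto(L, n):
    """L^0 U L^1 U ... U L^n, a finite approximation of L*."""
    return set().union(*(power(L, i) for i in range(n + 1)))


L = {"a", "abc", "ba"}
print(sorted(power(L, 2)))
print(len(kleene_upto(L, 3)))
```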

4
Operations on Languages

• Let L, L1, L2 be subsets of Σ*

• Concatenation: L1L2 = {xy | x is in L1 and y is in L2}

• Union is set union of L1 and L2

• Kleene Closure: L* = U_{i >= 0} L^i = L^0 U L^1 U L^2 U …

• Positive Closure: L+ = U_{i >= 1} L^i = L^1 U L^2 U …

• Question: Does L+ contain ε?

5
Definition of a Regular Expression

• Let Σ be an alphabet. The regular expressions over Σ are:

– Ø     Represents the empty set { }
– ε     Represents the set {ε}
– a     Represents the set {a}, one string of length 1, for any symbol a in Σ

Let r and s be regular expressions that represent the sets R and S, respectively.

– r+s   Represents the set R U S (lowest precedence, level 3)
– rs    Represents the set RS (precedence level 2)
– r*    Represents the set R* (highest precedence, level 1)
– (r)   Represents the set R (not an operator; parentheses override precedence)

• If r is a regular expression, then L(r) is used to denote the corresponding language.
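
The inductive definition maps naturally onto a recursive evaluator. The following is a sketch under stated assumptions: regular expressions are encoded as nested tuples (an encoding chosen here for illustration, not taken from the slides), and only the strings of L(r) up to a length bound are computed, since L(r) may be infinite.

```python
def language(r, max_len):
    """Strings of L(r) with length <= max_len; r is a nested tuple."""
    kind = r[0]
    if kind == "empty":   # Ø represents { }
        return set()
    if kind == "eps":     # ε represents {ε}
        return {""}
    if kind == "sym":     # a represents {a}
        return {r[1]} if max_len >= 1 else set()
    if kind == "union":   # r+s represents R U S
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "cat":     # rs represents RS
        R, S = language(r[1], max_len), language(r[2], max_len)
        return {x + y for x in R for y in S if len(x) + len(y) <= max_len}
    if kind == "star":    # r* represents R*
        R = language(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:   # keep appending strings of R until nothing new fits
            frontier = {x + y for x in frontier for y in R
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node: {kind}")


# 0(0+1)* written as a tree; precedence is explicit in the nesting.
r = ("cat", ("sym", "0"), ("star", ("union", ("sym", "0"), ("sym", "1"))))
print(sorted(language(r, 3)))
```

Because the tree structure already records which operator applies to what, parentheses and precedence levels are only needed when an expression is written as a flat string.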

6
• Examples: Let Σ = {0, 1}

(0 + 1)*                       All strings of 0’s and 1’s
01*                            0 followed by any number of 1’s
0(0 + 1)*                      All strings of 0’s and 1’s, beginning with a 0
(0 + 1)*1                      All strings of 0’s and 1’s, ending with a 1
(0 + 1)*0(0 + 1)*              All strings of 0’s and 1’s containing at least one 0
(0 + 1)*0(0 + 1)*0(0 + 1)*     All strings of 0’s and 1’s containing at least two 0’s
(0 + 1)*01*01*                 All strings of 0’s and 1’s containing at least two 0’s
(1 + 01*0)*                    All strings of 0’s and 1’s containing an even number of 0’s
1*(01*01*)*                    All strings of 0’s and 1’s containing an even number of 0’s
(1*01*0)*1*                    All strings of 0’s and 1’s containing an even number of 0’s
(0 + 1)* = (0*1*)*             Any string, i.e., Σ*, where Σ = {0, 1} as in all cases here

• Question: Is there a unique minimum regular expression for a given language?

7
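
The claimed languages can be checked by brute force on all short strings. This sketch uses Python's `re` module, where union is written `|` rather than `+`; the length bound of 8 and the predicates are choices made here for illustration. Passing the check is evidence, not a proof.

```python
import re
from itertools import product


def strings_upto(n, alphabet="01"):
    """All strings over the alphabet of length 0..n."""
    for k in range(n + 1):
        for t in product(alphabet, repeat=k):
            yield "".join(t)


# (pattern in Python syntax, predicate stating the claimed language)
CLAIMS = [
    (r"(0|1)*",               lambda w: True),
    (r"01*",                  lambda w: w[:1] == "0" and "0" not in w[1:]),
    (r"0(0|1)*",              lambda w: w.startswith("0")),
    (r"(0|1)*1",              lambda w: w.endswith("1")),
    (r"(0|1)*0(0|1)*",        lambda w: w.count("0") >= 1),
    (r"(0|1)*0(0|1)*0(0|1)*", lambda w: w.count("0") >= 2),
    (r"(0|1)*01*01*",         lambda w: w.count("0") >= 2),
    (r"(1|01*0)*",            lambda w: w.count("0") % 2 == 0),
    (r"1*(01*01*)*",          lambda w: w.count("0") % 2 == 0),
    (r"(1*01*0)*1*",          lambda w: w.count("0") % 2 == 0),
    (r"(0*1*)*",              lambda w: True),
]


def counterexamples(max_len=8):
    """Pairs (pattern, string) where the regex and the claimed language disagree."""
    return [(p, w) for p, claimed in CLAIMS for w in strings_upto(max_len)
            if (re.fullmatch(p, w) is not None) != claimed(w)]


print(counterexamples())
```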


• Identities:

1. Øu = uØ = Ø              Like multiplying by 0
2. εu = uε = u              Like multiplying by 1
3. Ø* = ε                   Recall L* = U_{i >= 0} L^i = L^0 U L^1 U L^2 U …
4. ε* = ε                   i.e., {ε}* = {ε}
5. u+v = v+u
6. u+Ø = u
7. u+u = u
8. u* = (u*)*
9. u(v+w) = uv+uw           [which operation is hidden before the parenthesis?]
10. (u+v)w = uw+vw
11. (uv)*u = u(vu)*         [note: you have to have a single u, at start or end]
                            [note: (uv)* =/= u*v*]
12. (u+v)* = (u*+v)*
           = u*(u+v)*
           = (u+vu*)*
           = (u*v*)*
           = u*(vu*)*
           = (u*v)*u*

8
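
Identities 11 and 12 can be sanity-checked on concrete instances by comparing the sets of short strings that each side matches. The choices u = ab, v = c (for identity 11) and u = a, v = b (for identity 12), as well as the length bound 7, are assumptions made for this sketch; agreement on short strings is supporting evidence only.

```python
import re
from itertools import product


def language(pattern, alphabet, max_len):
    """All strings over `alphabet` of length <= max_len that fully match `pattern`."""
    return {"".join(t) for k in range(max_len + 1)
            for t in product(alphabet, repeat=k)
            if re.fullmatch(pattern, "".join(t))}


# Identity 11 with u = ab and v = c: (uv)*u = u(vu)*
lhs = language(r"(?:abc)*ab", "abc", 7)
rhs = language(r"ab(?:cab)*", "abc", 7)

# Identity 12 with u = a and v = b: the seven equivalent forms of (u+v)*
forms = [r"(?:a|b)*", r"(?:a*|b)*", r"a*(?:a|b)*", r"(?:a|ba*)*",
         r"(?:a*b*)*", r"a*(?:ba*)*", r"(?:a*b)*a*"]
languages = [language(p, "ab", 7) for p in forms]

# The warning (uv)* =/= u*v*: abab is in (ab)* but not in a*b*.
differs = language(r"(?:ab)*", "ab", 4) != language(r"a*b*", "ab", 4)

print(lhs == rhs, all(l == languages[0] for l in languages), differs)
```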
Equivalence of Regular Expressions
and NFA-εs
• Note:
Throughout the following, keep in mind that a string is accepted by an NFA-ε
if there exists ANY path from the start state to any final state.

• Lemma 1: Let r be a regular expression. Then there exists an NFA-ε M such
that L(M) = L(r). Furthermore, M has exactly one final state with no
transitions out of it.

• Proof: (by induction on the number of operators, denoted by OP(r), in r).

9
Basis: OP(r) = 0

Then r is either Ø, ε, or a, for some symbol a in Σ

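For Ø:

    q0        qf          (start state q0, final state qf, no transitions)

For ε:

    qf                    (a single state that is both start and final)

For a:

    q0 --a--> qf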
10
Inductive Hypothesis: Suppose there exists a k >= 0 such that for any regular
expression r where 0 <= OP(r) <= k, there exists an NFA-ε M such that L(M) = L(r).
Furthermore, suppose that M has exactly one final state.

Inductive Step: Let r be a regular expression with k + 1 operators (OP(r) = k + 1),
where k + 1 >= 1.

Case 1) r = r1 + r2

Since OP(r) = k +1, it follows that 0<= OP(r1), OP(r2) <= k. By the inductive
hypothesis there exist NFA-ε machines M1 and M2 such that L(M1) = L(r1) and
L(M2) = L(r2). Furthermore, both M1 and M2 have exactly one final state.

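Construct M as:

    q0 --ε--> q1 [M1] f1 --ε--> qf
    q0 --ε--> q2 [M2] f2 --ε--> qf

where q0 is a single new start state and qf is a single new final state.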
11
Case 2) r = r1r2

Since OP(r) = k+1, it follows that 0<= OP(r1), OP(r2) <= k. By the inductive hypothesis there
exist NFA-ε machines M1 and M2 such that L(M1) = L(r1) and L(M2) = L(r2). Furthermore,
both M1 and M2 have exactly one final state.

Construct M as:

    q1 [M1] f1 --ε--> q2 [M2] f2

where q1 is the start state and f2 is the final state of M.

Case 3) r = r1*

Since OP(r) = k+1, it follows that 0<= OP(r1) <= k. By the inductive hypothesis there exists
an NFA-ε machine M1 such that L(M1) = L(r1). Furthermore, M1 has exactly one final state.
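Construct M as:

    q0 --ε--> q1 [M1] f1 --ε--> qf

together with two more ε-transitions: f1 --ε--> q1 (to repeat M1) and
q0 --ε--> qf (to accept ε).

12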
• Example:

r = 0(0+1)*

Decomposition:

    r  = r1 r2
    r1 = 0
    r2 = (0+1)* = r3*
    r3 = 0+1 = r4 + r5
    r4 = 0
    r5 = 1

The NFA-ε is built bottom-up, one step per sub-expression:

    1. r5 = 1:         q0 --1--> q1
    2. r4 = 0:         q2 --0--> q3
    3. r3 = r4 + r5:   new start state q4 and new final state q5, with
                       q4 --ε--> q0, q4 --ε--> q2, q1 --ε--> q5, q3 --ε--> q5
    4. r2 = r3*:       new start state q6 and new final state qf, with
                       q6 --ε--> q4, q5 --ε--> qf, q5 --ε--> q4, q6 --ε--> qf
    5. r1 = 0:         q8 --0--> q9
    6. r = r1 r2:      q9 --ε--> q6

The start state of the complete NFA-ε is q8 and its only final state is qf.
18
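
The basis and the three inductive cases of Lemma 1 can be sketched as four small functions that combine NFA-ε fragments. The representation below (a triple of start state, final state, and a list of labelled edges, with `None` as the ε label) and all function names are assumptions made for illustration; the simulation in `accepts` is a simple unoptimized ε-closure search.

```python
from itertools import count

_ids = count()  # supplies fresh state numbers (a simplification for this sketch)


def symbol(a):
    """Basis: q0 --a--> qf."""
    s, f = next(_ids), next(_ids)
    return s, f, [(s, a, f)]


def union(m1, m2):
    """Case 1: new start and final states joined to M1 and M2 by ε-moves."""
    (s1, f1, e1), (s2, f2, e2) = m1, m2
    s, f = next(_ids), next(_ids)
    return s, f, e1 + e2 + [(s, None, s1), (s, None, s2), (f1, None, f), (f2, None, f)]


def concat(m1, m2):
    """Case 2: ε-move from the final state of M1 to the start state of M2."""
    (s1, f1, e1), (s2, f2, e2) = m1, m2
    return s1, f2, e1 + e2 + [(f1, None, s2)]


def star(m1):
    """Case 3: ε-moves to enter, leave, repeat, and bypass M1."""
    s1, f1, e1 = m1
    s, f = next(_ids), next(_ids)
    return s, f, e1 + [(s, None, s1), (f1, None, f), (f1, None, s1), (s, None, f)]


def accepts(nfa, w):
    """True if some path labelled w leads from the start state to the final state."""
    start, final, edges = nfa

    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            p = stack.pop()
            for src, label, dst in edges:
                if src == p and label is None and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for a in w:
        moved = {dst for src, label, dst in edges if src in current and label == a}
        current = closure(moved)
    return final in current


# r = 0(0+1)*, assembled in the same order as the decomposition r = r1 r2, r2 = r3*, r3 = r4 + r5
nfa = concat(symbol("0"), star(union(symbol("0"), symbol("1"))))
print(accepts(nfa, "0110"), accepts(nfa, "10"))
```

Each operator adds at most two states, so the machine for 0(0+1)* has ten states in total, which matches the state count of the slide's construction (q0 to q6, q8, q9 and qf).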
Equivalence Proved So Far

• DFA ≡ NFA ≡ NFA-ε

• Every regular expression has an NFA-ε, so r.e. ⊆ NFA-ε

• We did not show how to convert an NFA-ε to a regular expression, so the
  equivalence of r.e. to the machines is not shown yet.

• At this stage we know that every language denoted by an r.e. is a regular
  language, but not the other way round.

• We will now show how to convert a DFA to a regular expression for the language it accepts.

19


Definitions Required to Convert a DFA
to a Regular Expression

• Let M = (Q, Σ, δ, q1, F) be a DFA with state set Q = {q1, q2, …, qn}, and
define:

    R_{i,j} = { x | x is in Σ* and δ(qi, x) = qj }

R_{i,j} is the set of all strings that define a path in M from qi to qj.

• Note that states have been numbered starting at q1, not q0!

20
• Example:

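[Figure: transition diagram of a DFA with states q1, …, q5 over Σ = {0, 1}; the arcs are not recoverable from the extracted text.]

R_{2,3} = {0, 001, 00101, 011, …}
R_{1,4} = {01, 00101, …}
R_{3,3} = {11, 100, …}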

21
• In words: R^k_{i,j} is the set of all the strings that define a path in M from qi to qj
but that passes through no state numbered greater than k.

• Definition:

R^k_{i,j} = { x | x is in Σ* and δ(qi, x) = qj, and there is no u with 1 <= |u| < |x|
and x = uv such that δ(qi, u) = qp where p > k }

• Note that i > k or j > k is allowed; only the intermediate states on the
path from qi to qj may not be numbered greater than k.

22
• Example:
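[Figure: transition diagram of the DFA with states q1, …, q5 over Σ = {0, 1}; the arcs are not recoverable from the extracted text.]

R^4_{2,3} = {0, 1000, 011, …}
111 is not in R^4_{2,3} because it goes via q5

R^1_{2,3} = {0}
111 is not in R^1_{2,3}
101 is not in R^1_{2,3}

R^5_{2,3} = R_{2,3}   (any state may be on the path now)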

23
• Observations:

1) R^n_{i,j} = R_{i,j}, where n is the number of states

2) R^(k-1)_{i,j} is a subset of R^k_{i,j}

3) L(M) = U_{q in F} R^n_{1,q} = U_{q in F} R_{1,q}

4) R^0_{i,j} = {a | δ(qi, a) = qj}           if i != j (possibly Ø)
   R^0_{i,j} = {a | δ(qi, a) = qj} U {ε}     if i = j
   Easily computed from the DFA!

5) R^k_{i,j} = R^(k-1)_{i,k} (R^(k-1)_{k,k})* R^(k-1)_{k,j} U R^(k-1)_{i,j}
   Now you see the purpose of introducing k: so that we can write it as a
   regular expression.
24
• Notes on 5:

5) R^k_{i,j} = R^(k-1)_{i,k} (R^(k-1)_{k,k})* R^(k-1)_{k,j} U R^(k-1)_{i,j}

• Consider the paths from qi to qj represented by the strings in R^k_{i,j}.

• If x is a string in R^k_{i,j} then no state numbered > k may be passed through when
processing x, and either:
– qk is not passed through, i.e., x is in R^(k-1)_{i,j}
– qk is passed through one or more times, i.e., x is in R^(k-1)_{i,k} (R^(k-1)_{k,k})* R^(k-1)_{k,j}
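
Observations 1, 4 and 5 can be checked numerically by computing each set R^k_{i,j} restricted to strings of bounded length. The sketch below is a brute-force check, not part of the proof; the three-state DFA, the length bound 6, and all function names are assumptions chosen for illustration.

```python
from itertools import product


def concat(A, B, max_len):
    return {x + y for x in A for y in B if len(x) + len(y) <= max_len}


def star(A, max_len):
    result, frontier = {""}, {""}
    while frontier:
        frontier = concat(frontier, A, max_len) - result
        result |= frontier
    return result


def R_tables(n, delta, alphabet, max_len):
    """Return [R^0, R^1, ..., R^n], each a dict (i, j) -> set of strings."""
    states = range(1, n + 1)
    # Observation 4: single symbols, plus ε on the diagonal.
    R = {(i, j): {a for a in alphabet if delta[(i, a)] == j} | ({""} if i == j else set())
         for i in states for j in states}
    tables = [R]
    for k in states:
        # Observation 5: either avoid qk, or go to qk, loop on it, and leave it.
        R = {(i, j): concat(concat(R[(i, k)], star(R[(k, k)], max_len), max_len),
                            R[(k, j)], max_len) | R[(i, j)]
             for i in states for j in states}
        tables.append(R)
    return tables


def run(delta, i, w):
    """State reached from qi after reading w."""
    for a in w:
        i = delta[(i, a)]
    return i


# Assumed example DFA with states 1..3 over {0, 1}.
delta = {(1, "0"): 2, (1, "1"): 3, (2, "0"): 1, (2, "1"): 3, (3, "0"): 2, (3, "1"): 2}
tables = R_tables(3, delta, "01", max_len=6)

# Observation 1: R^n_{i,j} equals R_{i,j}, computed here directly by running the DFA.
words = ["".join(t) for k in range(7) for t in product("01", repeat=k)]
direct = {(i, j): {w for w in words if run(delta, i, w) == j}
          for i in (1, 2, 3) for j in (1, 2, 3)}
print(tables[3] == direct)
```

Truncating every intermediate set to the length bound loses nothing below the bound, because any string of length at most 6 can only be split into pieces of length at most 6.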

25
• Lemma 2: Let M = (Q, Σ, δ, q1, F) be a DFA. Then there exists a regular expression r
such that L(M) = L(r).

• Proof:
First we will show (by induction on k) that for all i, j, and k, where 1 <= i,j <= n
and 0 <= k <= n, there exists a regular expression r such that L(r) = R^k_{i,j}.

Basis: k = 0

R^0_{i,j} contains single symbols, one for each transition from qi to qj, and possibly ε if i = j.

case 1) No transitions from qi to qj and i != j

    r^0_{i,j} = Ø

case 2) At least one (m >= 1) transition from qi to qj and i != j

    r^0_{i,j} = a1 + a2 + a3 + … + am, where δ(qi, ap) = qj for all 1 <= p <= m

26
case 3) No transitions from qi to qj and i = j

    r^0_{i,j} = ε

case 4) At least one (m >= 1) transition from qi to qj and i = j

    r^0_{i,j} = a1 + a2 + a3 + … + am + ε, where δ(qi, ap) = qj for all 1 <= p <= m

Inductive Hypothesis:
Suppose that R^(k-1)_{i,j} can be represented by the regular expression r^(k-1)_{i,j} for all
1 <= i,j <= n, and some k >= 1.

Inductive Step:
Consider R^k_{i,j} = R^(k-1)_{i,k} (R^(k-1)_{k,k})* R^(k-1)_{k,j} U R^(k-1)_{i,j}. By the inductive
hypothesis there exist regular expressions r^(k-1)_{i,k}, r^(k-1)_{k,k}, r^(k-1)_{k,j}, and r^(k-1)_{i,j}
generating R^(k-1)_{i,k}, R^(k-1)_{k,k}, R^(k-1)_{k,j}, and R^(k-1)_{i,j}, respectively. Thus, if we let

    r^k_{i,j} = r^(k-1)_{i,k} (r^(k-1)_{k,k})* r^(k-1)_{k,j} + r^(k-1)_{i,j}

then r^k_{i,j} is a regular expression generating R^k_{i,j}, i.e., L(r^k_{i,j}) = R^k_{i,j}.

27
• Finally, if F = {qj1, qj2, …, qjr}, then

    r^n_{1,j1} + r^n_{1,j2} + … + r^n_{1,jr}

is a regular expression generating L(M).

• Note: not only does this prove that the regular expressions generate the regular
languages, but it also provides an algorithm for computing such an expression!
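
The proof translates almost line by line into a table-filling procedure. The sketch below builds the expressions as strings in the slide notation and applies only a few of the identities (Øu = Ø, εu = u, u+Ø = u, u+u = u, Ø* = ε* = ε) to keep them readable; it does not try to find a shortest expression. The helper names, the single-character alphabet, and the three-state example DFA with final states q2 and q3 are assumptions made for illustration.

```python
EMPTY, EPS = "Ø", "ε"


def is_union(r):
    """True if r has a + outside all parentheses."""
    depth = 0
    for ch in r:
        depth += (ch == "(") - (ch == ")")
        if ch == "+" and depth == 0:
            return True
    return False


def cat(r, s):
    if EMPTY in (r, s):
        return EMPTY                      # Øu = uØ = Ø
    if r == EPS:
        return s                          # εu = u
    if s == EPS:
        return r                          # uε = u
    wrap = lambda t: f"({t})" if is_union(t) else t
    return wrap(r) + wrap(s)


def union(r, s):
    if r == EMPTY or r == s:
        return s                          # Ø+u = u, u+u = u
    if s == EMPTY:
        return r                          # u+Ø = u
    return f"{r}+{s}"


def star(r):
    if r in (EMPTY, EPS):
        return EPS                        # Ø* = ε, ε* = ε
    return f"{r}*" if len(r) == 1 else f"({r})*"


def basis(n, delta, alphabet):
    """The k = 0 column: cases 1 to 4 of the basis."""
    r = {}
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            expr = EMPTY
            for a in alphabet:
                if delta[(i, a)] == j:
                    expr = union(expr, a)
            r[(i, j)] = union(expr, EPS) if i == j else expr
    return r


def step(r, k, n):
    """Column k from column k-1, using the formula of the inductive step."""
    return {(i, j): union(cat(cat(r[(i, k)], star(r[(k, k)])), r[(k, j)]), r[(i, j)])
            for i in range(1, n + 1) for j in range(1, n + 1)}


def dfa_to_regex(n, delta, alphabet, finals):
    r = basis(n, delta, alphabet)
    for k in range(1, n + 1):
        r = step(r, k, n)
    result = EMPTY
    for q in sorted(finals):              # sum of r^n_{1,j} over the final states
        result = union(result, r[(1, q)])
    return result


# Assumed example DFA: states 1..3, start state 1, final states 2 and 3.
delta = {(1, "0"): 2, (1, "1"): 3, (2, "0"): 1, (2, "1"): 3, (3, "0"): 2, (3, "1"): 2}
column1 = step(basis(3, delta, "01"), 1, 3)
print(column1[(2, 3)])
print(dfa_to_regex(3, delta, "01", {2, 3}))
```

The final expression is correct but long, since every step can roughly quadruple the size of an entry; simplifying by hand with the identities, as the slides do, gives much shorter results.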

28
• Example:

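[Figure: DFA with start state q1 and transitions q1 --0--> q2, q1 --1--> q3, q2 --0--> q1, q2 --1--> q3, q3 --0/1--> q2.]

The first table column is computed from the DFA.

              k=0      k=1      k=2

r^k_{1,1}     ε
r^k_{1,2}     0
r^k_{1,3}     1
r^k_{2,1}     0
r^k_{2,2}     ε
r^k_{2,3}     1
r^k_{3,1}     Ø
r^k_{3,2}     0+1
r^k_{3,3}     ε

29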
• All remaining columns are computed from the previous column using the
formula.

    r^1_{2,3} = r^0_{2,1} (r^0_{1,1})* r^0_{1,3} + r^0_{2,3}
              = 0 (ε)* 1 + 1
              = 01 + 1

              k=0      k=1        k=2

r^k_{1,1}     ε        ε
r^k_{1,2}     0        0
r^k_{1,3}     1        1
r^k_{2,1}     0        0
r^k_{2,2}     ε        ε + 00
r^k_{2,3}     1        1 + 01
r^k_{3,1}     Ø        Ø
r^k_{3,2}     0+1      0+1
r^k_{3,3}     ε        ε
30
    r^2_{1,3} = r^1_{1,2} (r^1_{2,2})* r^1_{2,3} + r^1_{1,3}
              = 0 (ε + 00)* (1 + 01) + 1
              = (odd 0’s)1 + (even 0’s)1 + 1
              = 0*1

              k=0      k=1        k=2

r^k_{1,1}     ε        ε          (00)*
r^k_{1,2}     0        0          0(00)*
r^k_{1,3}     1        1          0*1
r^k_{2,1}     0        0          0(00)*
r^k_{2,2}     ε        ε + 00     (00)*
r^k_{2,3}     1        1 + 01     0*1
r^k_{3,1}     Ø        Ø          (0 + 1)(00)*0
r^k_{3,2}     0+1      0+1        (0 + 1)(00)*
r^k_{3,3}     ε        ε          ε + (0 + 1)0*1

31
• To complete the regular expression for the language, we compute:

    r^3_{1,2} + r^3_{1,3}   [complete this]

              k=0      k=1        k=2                k=3

r^k_{1,1}     ε        ε          (00)*
r^k_{1,2}     0        0          0(00)*
r^k_{1,3}     1        1          0*1
r^k_{2,1}     0        0          0(00)*
r^k_{2,2}     ε        ε + 00     (00)*
r^k_{2,3}     1        1 + 01     0*1
r^k_{3,1}     Ø        Ø          (0 + 1)(00)*0
r^k_{3,2}     0+1      0+1        (0 + 1)(00)*
r^k_{3,3}     ε        ε          ε + (0 + 1)0*1

32
Now we have proved equivalence of all

• DFA ≡ NFA ≡ NFA-ε

• A DFA can be converted to a regular expression, so DFA ⊆ r.e.

• r.e. ⊆ NFA-ε

• So, r.e. ≡ NFA-ε, or

• DFA ≡ NFA ≡ NFA-ε ≡ r.e.

• (note my abuse of concepts: an r.e. denotes a language, while the others are machines)

• We proved that regular expressions express the regular languages, and only the regular languages


33
• Theorem: Let L be a language. Then there exists a regular expression r
such that L = L(r) if and only if there exists a DFA M such that L = L(M).

• Proof:

(if) Suppose there exists a DFA M such that L = L(M). Then by Lemma 2
there exists a regular expression r such that L = L(r).

(only if) Suppose there exists a regular expression r such that L = L(r). Then
by Lemma 1 there exists an NFA-ε M' such that L = L(M'), and since
DFA ≡ NFA ≡ NFA-ε there exists a DFA M such that L = L(M). □

• Corollary: The regular expressions define the regular languages.

• Note: The conversion from a regular expression to a DFA and a program
accepting L(r) is now complete, and fully automated!

34
