Unit 4: Regular Expressions

Regular expressions describe regular languages and can be converted to nondeterministic finite automata. A regular expression is converted to an NFA by applying rules that turn union, concatenation, and Kleene star into states and transitions. Conversely, a deterministic finite automaton can be converted to a regular expression by eliminating states and relabeling edges until a single edge labeled with a regular expression remains. Regular expressions and finite automata are therefore equivalent ways to describe the regular languages.

Uploaded by

muhammad atiq
Copyright
© All Rights Reserved

Unit 4

Regular Expressions

Reading: Sipser, chapter 1

1
Overview

• Regular expressions

• Equivalence of RE and RL

2
Regular Expressions
• Regular languages (RL) are often described
by means of algebraic expressions called
regular expressions (RE).
• In arithmetic we use the +, * operations to
construct expressions: (2+3)*5
• The value of the arithmetic expression is the
number 25.
• The value of a regular expression is a regular
language.
3
Regular operations
• In RE we use regular operations to construct
expressions describing regular languages:
( 0 + 1 )* ◦ 0
where:
– r+s means r OR s
– r* means Kleene star of r
– r◦s (or rs) means concatenation of r and s

4
Formal definition
• A set of regular expressions over an alphabet
Σ is defined inductively as follows:

Basis:
∅, ε, and σ (for all σ∈Σ) are regular
expressions.
Induction:
If r and s are RE then the following expressions
are also RE:
– (r)
– r+s
– r◦s
– r*
5
Examples over Σ={a,b}
ε, a, a+b, b*, (a+b)b, ab*, a*+b*
• To avoid using many parentheses, the
operations have the following priority
hierarchy:
1. * - highest (do it first)
2. ◦
3. + - lowest (do it last)
• Example: (b+(a◦(b*))) = b + ab*
(Notation: The symbol ◦ can be dropped. Σ means
(σ1+σ2+σ3+…), r⁺ means rr*.)
6
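The priority hierarchy can be made concrete with a small recursive-descent parser. The sketch below is only an illustration and not part of the slides: the names `parse` and `show` and the nested-tuple tree are assumptions, and concatenation is written by juxtaposition (no ◦ symbol).

```python
def parse(text):
    """Parse a regular expression into a nested tuple, with priority * > concatenation > +."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def union():                      # lowest priority: r+s
        nonlocal pos
        node = concat()
        while peek() == "+":
            pos += 1
            node = ("+", node, concat())
        return node

    def concat():                     # middle priority: rs
        node = starred()
        while peek() is not None and peek() not in "+)":
            node = ("cat", node, starred())
        return node

    def starred():                    # highest priority: r*
        nonlocal pos
        node = atom()
        while peek() == "*":
            pos += 1
            node = ("*", node)
        return node

    def atom():                       # a single letter or a parenthesized expression
        nonlocal pos
        ch = peek()
        if ch is None or ch in "+*)":
            raise ValueError(f"unexpected {ch!r} at position {pos}")
        pos += 1
        if ch == "(":
            node = union()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return node
        return ch

    tree = union()
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
    return tree


def show(tree):
    """Print the tree with every operation fully parenthesized."""
    if isinstance(tree, str):
        return tree
    if tree[0] == "*":
        return f"({show(tree[1])}*)"
    op = "+" if tree[0] == "+" else ""
    return f"({show(tree[1])}{op}{show(tree[2])})"


print(show(parse("b+ab*")))   # (b+(a(b*)))
```

The printed form matches the fully parenthesized side of the example on the slide, with ◦ dropped.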
Regular expressions and
regular languages
We associate each regular expression r with a
regular language L(r) as follows:
• L(∅)=∅,
• L(ε)={ε},
• L(σ)={σ} for each σ∈Σ,
• L(r+s)=L(r)∪L(s),
• L(r◦s)=L(r)◦L(s),
• L(r*)=(L(r))*.
7
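The six clauses above translate almost line by line into a recursive function. Since L(r) may be infinite, the sketch below computes only the words of length at most n. The tuple encoding of expressions ("empty", "eps", "sym", "union", "cat", "star") and the name `language` are assumptions made for this illustration.

```python
def language(r, n):
    """Return the words of L(r) that have length at most n."""
    kind = r[0]
    if kind == "empty":                       # L(∅) = ∅
        return set()
    if kind == "eps":                         # L(ε) = {ε}
        return {""}
    if kind == "sym":                         # L(σ) = {σ}
        return {r[1]} if n >= 1 else set()
    if kind == "union":                       # L(r+s) = L(r) ∪ L(s)
        return language(r[1], n) | language(r[2], n)
    if kind == "cat":                         # L(r◦s) = L(r)◦L(s)
        left, right = language(r[1], n), language(r[2], n)
        return {u + v for u in left for v in right if len(u + v) <= n}
    if kind == "star":                        # L(r*) = (L(r))*
        base = language(r[1], n)
        words, frontier = {""}, {""}
        while frontier:                       # keep appending words of L(r) until nothing new fits
            frontier = {u + v for u in frontier for v in base
                        if v and len(u + v) <= n} - words
            words |= frontier
        return words
    raise ValueError(f"unknown expression {r!r}")


# (0+1)*◦0, the expression from the "Regular operations" slide
zero, one = ("sym", "0"), ("sym", "1")
example = ("cat", ("star", ("union", zero, one)), zero)
print(sorted(language(example, 2)))   # ['0', '00', '10']
```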
Examples over Σ={0,1}
In Class: Describe each language as
a regular expression.

L1 = { w | w has a single 1}
L2 = { w | w has at least one 1}
L3 = { w | w contains the string 110}
L4 = { w | |w| mod 2 =0}
L5 = { w | w starts with 1}
L6 = { w | w ends with 00}

8
Examples over Σ={0,1}, cont’d

L7 = { w | w starts with 0 and ends with 10}


L8 = { w | w contains the string 010 or the string 101 }
L9 = { w | w starts and ends with the same letter }
L10 = {0,101}
L11 = {w | w does not contain 11 as a substring}
L12 = {w | #1(w) is even}
L13={w | w does not contain 101}

9
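The slides leave these exercises for class. One way to check a proposed answer is to compare it with the defining condition on every word up to some length. The expressions below are candidate answers that do not come from the slides, written in Python `re` syntax (`|` instead of +, `?` for "+ε"); L9 and L13 are left out. A clean run shows only that no mismatch exists up to the tested length; it is not a proof.

```python
import re
from itertools import product


def words(max_len, alphabet="01"):
    """Yield every word over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


# name -> (candidate regular expression, membership condition from the slide)
CANDIDATES = {
    "L1":  ("0*10*",                 lambda w: w.count("1") == 1),
    "L2":  ("(0|1)*1(0|1)*",         lambda w: "1" in w),
    "L3":  ("(0|1)*110(0|1)*",       lambda w: "110" in w),
    "L4":  ("((0|1)(0|1))*",         lambda w: len(w) % 2 == 0),
    "L5":  ("1(0|1)*",               lambda w: w.startswith("1")),
    "L6":  ("(0|1)*00",              lambda w: w.endswith("00")),
    "L7":  ("0(0|1)*10",             lambda w: w.startswith("0") and w.endswith("10")),
    "L8":  ("(0|1)*(010|101)(0|1)*", lambda w: "010" in w or "101" in w),
    "L10": ("0|101",                 lambda w: w in {"0", "101"}),
    "L11": ("(0|10)*1?",             lambda w: "11" not in w),
    "L12": ("(0*10*1)*0*",           lambda w: w.count("1") % 2 == 0),
}


def mismatches(max_len=8):
    """List (language, word) pairs where the candidate and the condition disagree."""
    return [(name, w)
            for name, (pattern, member) in CANDIDATES.items()
            for w in words(max_len)
            if bool(re.fullmatch(pattern, w)) != member(w)]


print(mismatches())   # []
```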
Properties of regular expressions 1

Useful properties of regular expressions:


• r+s=s+r
• r+∅=∅+r=r
• r+r=r
• r∅=∅r=∅
• rr*=r⁺
• rε=εr=r

10
Properties of regular expressions 2

• r(s+t)=rs+rt
• r+(s+t)=(r+s)+t
• r(st)=(rs)t
• r*=(r*)*=r*r*=r*+r
• r*+r⁺=r*

11
Equivalence of
regular expressions

• To prove that two regular expressions r and s
are equivalent we need to show that
L[r]⊆L[s] and L[s]⊆L[r].

• To show that two regular expressions are not
equivalent we have to find a word that
belongs to the language of one expression and
does not belong to the language of the other.

12
Example

Let r=ε+(0+1)*1
Let s=(0*1)*
Are r and s equivalent?

Answer: Yes. We will prove
L[s] ⊆ L[r] and L[r] ⊆ L[s].

13
L[r]⊇L[s]
r=ε+(0+1)*1 s=(0*1)*

• Let w∈L[s], where s=(0*1)*.
• Then w=ε or w=x₁x₂..xₙ, n>0, such that xᵢ∈L[0*1].
• If w=ε then w∈L[r].
• If w=x₁x₂..xₙ then we can represent
w=w’1=x₁x₂..xₙ₋₁0^z1 with z ≥ 0.
Moreover, w’=x₁x₂..xₙ₋₁0^z∈L[(0+1)*],
implying w’1∈L[(0+1)*1]⊆L[r].

14
L[r]⊆L[s]
r=ε+(0+1)*1 s=(0*1)*

• Let w∈L[r], where r=ε+(0+1)*1.
• If w=ε then w∈L[s] (by definition of *).
• If w≠ε then w can be represented as w=w’1
where w’∈L[(0+1)*]. Assume that w’ contains k
instances of the letter 1. This means that w’ can
be written as w’=x₁1x₂1..xₖ1xₖ₊₁ where xᵢ∈L[0*].
But then w=w’1=
=(x₁1)(x₂1)…(xₖ₊₁1)∈L[(0*1)(0*1)...(0*1)].
So w∈L[(0*1)*].
15
Another example
Are r and s equivalent?
r=(0+1)*1+0*
s=(1+0)(0*1)*

Answer: No.
• Consider the word w = ε.
• w∈L[r], where r=(0+1)*1+0*, because w∈L[0*].
• But w∉L[s], where s=(1+0)(0*1)*, as all words in L[s]
have at least one letter.
16
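The search for a distinguishing word can be automated for short words. The sketch below is an illustration only: the helper name `first_difference` is an assumption, and the expressions are rewritten in Python `re` syntax (`|` for +, `(...)?` for "ε + ..."). A result of `None` means only that no counterexample exists up to `max_len`; it does not replace the two-inclusion proof.

```python
import re
from itertools import product


def first_difference(r, s, alphabet="01", max_len=8):
    """Return a shortest word on which r and s disagree, or None if none is found."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if bool(re.fullmatch(r, w)) != bool(re.fullmatch(s, w)):
                return w
    return None


# ε+(0+1)*1 versus (0*1)*: no difference up to length 8
print(first_difference(r"((0|1)*1)?", r"(0*1)*"))             # None

# (0+1)*1+0* versus (1+0)(0*1)*: the empty word separates them
print(repr(first_difference(r"(0|1)*1|0*", r"(1|0)(0*1)*")))  # ''
```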
Equivalence of RE with FA
• Regular expressions and finite automata are
equivalent in terms of the languages they
describe.

Theorem:
A language is regular iff some regular
expression describes it.

This theorem has two directions. We prove
each direction separately.
17
Equivalences so far…

[Diagram: Regular Languages, DFA, NFA, and Regular Expressions, linked by equivalence arrows]
18
Converting RE into FA

• If a language is described by a regular
expression, then it is regular.

Proof idea:
Build an NFA: transform the regular
expression r into a nondeterministic finite
automaton N that accepts the language L(r).

19
Converting RE into RL

Regular Expression → NFA → Regular Language

20
RE to FA Algorithm

• Given r we start the algorithm with N having a
start state, a single accepting state and an
edge labeled r:

q --r--> p

Note: We assume for the moment that transitions
can be labeled with REs, not just letters.
21
RE to FA Algorithm (cont.)
Now transform this machine into an NFA N by
applying the following rules until all the edges
are labeled with either a letter σ from Σ or ε:

1. If an edge is labeled ∅, then delete this edge.

i --∅--> j    becomes    i     j
22
RE to FA Algorithm (cont.)

2. Transform any diagram of the type

i --(r+s)--> j

into the diagram

i --r--> j
i --s--> j    (two parallel edges from i to j)
23
RE to FA Algorithm (cont.)

3. Transform any diagram of the type

i --rs--> j

into the diagram

i --r--> ( ) --s--> j    (with a new intermediate state)

24
RE to FA Algorithm (cont.)

4. Transform any diagram of the type

i --r*--> j

into the diagram

i --ε--> ( ) --ε--> j    (new intermediate state with a self-loop labeled r)

25
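Rules 1 to 4 can be run mechanically with a worklist of edges that still carry a compound label. The sketch below is a minimal illustration under stated assumptions: expressions use a tuple encoding ("empty", "eps", "sym", "union", "cat", "star"), state 0 is the start state, state 1 is the single accepting state, and the function names are invented for this example.

```python
from itertools import count

EPS = ("eps",)


def re_to_nfa(r):
    """Refine the edge 0 --r--> 1 with rules 1-4 until every label is a letter or ε."""
    fresh = count(2)                      # states 0 (start) and 1 (accept) already exist
    todo = [(0, r, 1)]                    # edges whose label is still an expression
    edges = []                            # finished edges (i, letter, j); letter None means ε
    while todo:
        i, label, j = todo.pop()
        kind = label[0]
        if kind == "empty":               # rule 1: delete the edge
            continue
        elif kind == "union":             # rule 2: two parallel edges
            todo += [(i, label[1], j), (i, label[2], j)]
        elif kind == "cat":               # rule 3: new intermediate state
            m = next(fresh)
            todo += [(i, label[1], m), (m, label[2], j)]
        elif kind == "star":              # rule 4: ε, self-loop on a new state, ε
            m = next(fresh)
            todo += [(i, EPS, m), (m, label[1], m), (m, EPS, j)]
        elif kind == "eps":
            edges.append((i, None, j))
        else:                             # ("sym", a)
            edges.append((i, label[1], j))
    return edges


def closure(states, edges):
    """All states reachable from `states` through ε-edges only."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for i, a, j in edges:
            if i == p and a is None and j not in seen:
                seen.add(j)
                stack.append(j)
    return seen


def accepts(edges, word):
    """Simulate the NFA on `word` by tracking the set of reachable states."""
    current = closure({0}, edges)
    for ch in word:
        current = closure({j for i, a, j in edges if i in current and a == ch}, edges)
    return 1 in current


# b*+ab
nfa = re_to_nfa(("union", ("star", ("sym", "b")), ("cat", ("sym", "a"), ("sym", "b"))))
print(len(nfa), accepts(nfa, "bbb"), accepts(nfa, "ab"), accepts(nfa, "ba"))   # 5 True True False
```

The order in which edges are refined does not affect the resulting language, so a plain stack is enough for the worklist.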
Example

Construct an NFA for the regular expression
b*+ab.

Solution: We start by drawing the diagram:

s --(b*+ab)--> f

26
Example (cont.)

Next we apply rule 2 for b*+ab:

s --b*--> f
s --ab--> f

27
Example (cont.)

Next we apply rule 3 for ab:

s --b*--> f
s --a--> ( ) --b--> f
28
Example (cont.)
Next we apply rule 4 for b*:

s --ε--> ( ) --ε--> f    (self-loop labeled b on the new state)
s --a--> ( ) --b--> f
29
The final NFA
s --ε--> ( ) --ε--> f    (self-loop labeled b on the middle state)
s --a--> ( ) --b--> f

Example 2: Draw an NFA for (ab+a)*

Example 3: Draw an NFA for Σ*101Σ*
30
Converting FA into RE

• If a language is regular, then it can be described
by a regular expression.

Proof idea:
Transform some DFA N into a regular
expression r such that L(r)=L(N).

31
Converting RL into RE

Regular Language → DFA → Regular Expression

32
Converting FA into RE

• The algorithm will perform a sequence of
transformations that contract the DFA into
new machines with edges labeled with
regular expressions.
• It stops when the machine has:
1. two states: Start, Finish
2. one edge with a regular expression on it.

33
Generalized NFA
• Before we start we first convert the DFA into a
Generalized NFA (GNFA):
– GNFA might have RE as labels.
– GNFA has a single accept state.
– The start state has arrows going out but none coming in.
– The accept state has arrows coming in but none going
out.
– Except for the start and accept state, one arrow goes
from every state to every other state (including itself).
s --r--> f
34
Converting DFA into GNFA
The input is a DFA or an NFA N=(Q, Σ, δ, q0, F).
Perform the following steps:

1. Create a new start state s and draw a new edge
labeled with ε from s to q0: δ(s,ε)=q0.

s --ε--> q0      f

35
Converting DFA into GNFA (cont.)

2. Create a new single accepting state f and
draw new edges labeled ε from all the original
accepting states to f.
Formally: For each q∈F, δ(q,ε)=f.

s --ε--> q0      f

36
Converting DFA into GNFA (cont.)
3. For each pair of states i and j that have more
than one edge between them (in the same
direction), replace all the edges by a single
edge labeled with the RE formed by the sum of
the labels of these edges.
4. If there is no edge <i,j> then label(i,j)=∅.

i --r--> j
i --s--> j    becomes    i --(r+s)--> j
37
Example: DFA to GNFA

q0 --a--> q1    (q1 has a self-loop labeled a,b)
q0 --b--> q2    (q2 has a self-loop labeled a,b)

38
Example: DFA to GNFA (cont.)

s --ε--> q0 --a--> q1 --ε--> f    (q1 has a self-loop labeled a,b)
         q0 --b--> q2             (q2 has a self-loop labeled a,b)

39
Example: DFA to GNFA (cont.)

s --ε--> q0 --a--> q1 --ε--> f    (q1 has a self-loop labeled a+b)
         q0 --b--> q2             (q2 has a self-loop labeled a+b)

40
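The four conversion steps fit in a few lines once the GNFA is stored as a dictionary from state pairs to labels. This is a sketch under assumptions: the function name `dfa_to_gnfa` is invented here, labels are plain strings with "+" for union, the names "s" and "f" are assumed not to occur among the DFA states, and a pair that is absent from the dictionary stands for the label ∅.

```python
def dfa_to_gnfa(delta, start, accepting):
    """Build GNFA edge labels from a DFA transition table {(state, letter): state}."""
    label = {}
    for (p, a), q in sorted(delta.items()):
        # step 3: parallel edges collapse into one edge labeled with their sum
        label[(p, q)] = a if (p, q) not in label else label[(p, q)] + "+" + a
    label[("s", start)] = "ε"             # step 1: new start state s
    for q in accepting:
        label[(q, "f")] = "ε"             # step 2: new single accepting state f
    return label                          # step 4: missing pairs are labeled ∅


# The DFA from the example: q0 is the start state, q1 the only accepting state.
delta = {("q0", "a"): "q1", ("q0", "b"): "q2",
         ("q1", "a"): "q1", ("q1", "b"): "q1",
         ("q2", "a"): "q2", ("q2", "b"): "q2"}
gnfa = dfa_to_gnfa(delta, "q0", {"q1"})
print(gnfa[("q1", "q1")], gnfa[("s", "q0")], gnfa[("q1", "f")])   # a+b ε ε
```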
Converting GNFA into RE
• Let old(i,j) denote the label on edge <i,j> of
the current GNFA.
• Construct a sequence of new machines by
eliminating one state at a time until the only
two states remaining are s and f.
• The state elimination order is arbitrary.
• When a state is eliminated, a new
(equivalent) machine is constructed.
• The label on <s,f> in the final machine is
the required RE.
41
Converting GNFA into RE (cont.)
Eliminating state k
• For each pair of states (i,j) where i,j≠k, the
label of (i,j) will be updated as follows:

new(i,j)=old(i,j) + old(i,k)old(k,k)*old(k,j)

Before:  i --old(i,j)--> j,  i --old(i,k)--> k --old(k,j)--> j,  k has a self-loop labeled old(k,k)
After:   i --old(i,j)+old(i,k)old(k,k)*old(k,j)--> j
42
Converting GNFA into RE (cont.)
• The states of the new machine are those of
the current machine with state k eliminated.
• The edges of the new machine are those
edges (i,j) for which new(i,j) has been
calculated.

• The algorithm terminates when s and f are the
only two remaining states. The regular
expression new(s,f) represents the language
of the original automaton.

43
Example: GNFA to RE

s --ε--> q0 --a--> q1 --ε--> f    (q1 has a self-loop labeled a+b)
         q0 --b--> q2             (q2 has a self-loop labeled a+b)
44
Example: GNFA to RE
Eliminate state q2
• No paths pass through q2: no edge leaves q2 for
another state. So nothing needs to change
after the deletion of state q2.

s --ε--> q0 --a--> q1 --ε--> f    (q1 has a self-loop labeled a+b)
         q0 --b--> q2             (q2 has a self-loop labeled a+b)
45
Example: GNFA to RE
Eliminate state q2
• No paths pass through q2: no edge leaves q2 for
another state. So nothing needs to change
after the deletion of state q2.

s --ε--> q0 --a--> q1 --ε--> f    (q1 has a self-loop labeled a+b)

46
Example: GNFA to RE
Eliminate state q0
• The only path through it is s → q1.
• We add an edge that is labeled by the regular
expression associated with the deleted edges:
new(s,q1)=old(s,q1)+old(s,q0)old(q0,q0)*old(q0,q1)=
=∅+ε∅*a=a

s --ε--> q0 --a--> q1 --ε--> f    (q1 has a self-loop labeled a+b)


47
Example: GNFA to RE
Eliminate state q0
• The only path through it is s → q1.
• We add an edge that is labeled by the regular
expression associated with the deleted edges:
new(s,q1)=old(s,q1)+old(s,q0)old(q0,q0)*old(q0,q1)=
=∅+ε∅*a=a

s --a--> q1 --ε--> f    (q1 has a self-loop labeled a+b)

48
Example: GNFA to RE
Eliminate state q1
• The only path through it is s → f.
new(s,f)=old(s,f)+old(s,q1)old(q1,q1)*old(q1,f)=
=∅+a(a+b)*ε = a(a+b)*

s --a--> q1 --ε--> f    (q1 has a self-loop labeled a+b)

49
Example: GNFA to RE
Eliminate state q1
• The only path through it is s → f.
new(s,f)=old(s,f)+old(s,q1)old(q1,q1)*old(q1,f)=
=∅+a(a+b)*ε = a(a+b)*

s --a(a+b)*--> f

50
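The worked example can be replayed with a short state-elimination routine. This is a sketch, not the slides' own code: labels are plain strings, a missing dictionary entry (`None`) stands for ∅, the string "ε" stands for the empty word, and the helpers `union`, `concat`, and `star` apply only the simplifications r+∅=r, r∅=∅, rε=r, and ∅*=ε*=ε. All names are assumptions made for this illustration.

```python
EMPTY = None    # label of a missing edge, i.e. the empty language ∅
EPS = "ε"       # label for the empty word


def top_level_union(r):
    """True if r contains a + outside all parentheses."""
    depth = 0
    for ch in r:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "+" and depth == 0:
            return True
    return False


def union(r, s):
    if r is EMPTY:                        # ∅+s = s
        return s
    if s is EMPTY or r == s:              # r+∅ = r and r+r = r
        return r
    return f"{r}+{s}"


def concat(r, s):
    if r is EMPTY or s is EMPTY:          # r∅ = ∅r = ∅
        return EMPTY
    if r == EPS:                          # εs = s
        return s
    if s == EPS:                          # rε = r
        return r
    wrap = lambda x: f"({x})" if top_level_union(x) else x
    return wrap(r) + wrap(s)


def star(r):
    if r is EMPTY or r == EPS:            # ∅* = ε* = ε
        return EPS
    return f"{r}*" if len(r) == 1 else f"({r})*"


def eliminate(states, label, order):
    """Eliminate the states in `order` and return the final label on <s,f>."""
    label, remaining = dict(label), list(states)
    for k in order:
        remaining.remove(k)
        loop = star(label.get((k, k)))
        for i in remaining:
            for j in remaining:
                # new(i,j) = old(i,j) + old(i,k) old(k,k)* old(k,j)
                detour = concat(concat(label.get((i, k)), loop), label.get((k, j)))
                new = union(label.get((i, j)), detour)
                if new is not EMPTY:
                    label[(i, j)] = new
        label = {edge: r for edge, r in label.items() if k not in edge}
    return label.get(("s", "f"))


gnfa = {("s", "q0"): EPS, ("q0", "q1"): "a", ("q0", "q2"): "b",
        ("q1", "q1"): "a+b", ("q2", "q2"): "a+b", ("q1", "f"): EPS}
states = ["s", "q0", "q1", "q2", "f"]
print(eliminate(states, gnfa, ["q2", "q0", "q1"]))   # a(a+b)*
```

With the elimination order used on the slides (q2, q0, q1) the routine reproduces a(a+b)*. Other orders give an equivalent expression, though in general not the same string.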
Example II
What is the regular expression of L(A)?

[Diagram: automaton A with states q0 and q1 and edges labeled a, b, and a,b]

Solution: In class
51
Example III
What is the regular expression of L(A)?

[Diagram: automaton A with states q0, q1, and q2 and edges labeled a and b]

Solution: In class
52
