CSC 341: DFA Decision Procedures

DFA Decision Procedures


When studying models of computation, we are primarily interested in two kinds of results:

• Closure Properties (Sipser §1.2)—given an operation over a particular sort of language (e.g.,
the regular languages), does the operation preserve the sort of the input languages? For
example, the result of concatenating two regular languages is always a regular language.
• Decision Procedures—given a machine (e.g., a DFA), can we determine if a proposition about
that machine holds? For example, is a given DFA minimal with respect to the number of
states that it contains?

Here, we are concerned with the second kind of results. In particular, we will study the following
problems:

• Emptiness: given a DFA (equivalently, an NFA or regular expression), is its language empty?
• Acceptance: given a DFA (equivalently, an NFA or regular expression) and an input string,
does the DFA accept that input string?
• Minimization: given a DFA, is it minimal with respect to the number of states that it contains?
• Equivalence: given two DFAs (equivalently, NFAs or regular expressions), are the DFAs
equivalent? That is, do they accept the same set of strings and reject the same set of strings?

Emptiness and Acceptance


The language of a DFA D is empty if it is the empty set, i.e., L(D) = ∅. The emptiness problem
takes a DFA D as input and asks whether L(D) = ∅. A naïve strategy for solving this problem
is to run D on every possible string w ∈ Σ∗, returning false if any such w is accepted by D. This
does not work because there are infinitely many such strings, so we never have the opportunity
to return true! Instead, we must determine whether L(D) = ∅ by inspecting the structure of D. To
get a feel for how to check this, consider the following DFA:

The language of this DFA is non-empty. It at least accepts the string 10. In contrast, consider the
following DFA:

1 This work is licensed under a “CC BY-NC-SA 4.0” license.


This DFA is identical to the first, except that it does not have any accepting states. Because of this,
no string will ever be accepted by the DFA and so its language is empty. In light of this, it is clear
that the language of any DFA whose set of accept states is empty will be empty as well. However,
there is one other case we must consider.

Here, the DFA has an accepting state, q1. However, there is no way to reach that accepting state
from the start state. To formalize this idea, let’s define the notion of reachability in a DFA.
Definition 1. In a DFA D = (Q, Σ, δ, q0, F), a state q′ ∈ Q is reachable from another state q ∈ Q
if there exists a w ∈ Σ∗ such that δ(· · · δ(δ(q, w0), w1) · · · , wn) = q′ where w = w0 w1 · · · wn.
Intuitively, a state q′ is reachable from a state q if there exists a series of characters we can read to
go from q to q′ according to the transition function δ. In the case where q = q0 and q′ ∈ F, the
series of characters we read corresponds to a string accepted by the DFA.

Thus, the emptiness problem is a matter of determining reachability in a DFA. We can define the
solution to the emptiness problem as follows:
Theorem 1. Given a DFA D = (Q, Σ, δ, q0 , F ), L(D) = ∅ if and only if no qf ∈ F is reachable
from q0 .
Proof. In both directions of the biconditional, correctness follows from the fact that the definition of
reachability is analogous to the definition of acceptance of a DFA: “There exists a chain of characters
that take the machine from the start state to an accepting state.”
Note that we don’t need to enumerate all possible strings to test reachability. It is sufficient to
simply perform a breadth- or depth-first search starting from q0 to see if we are able to reach an
accepting state.
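
The search described above can be sketched in a few lines of Python. This is only an illustration, not part of the original reading: the function name is_empty, the dictionary encoding of δ, and the two-state DFA used to exercise it are all assumptions made for the example.

```python
from collections import deque

def is_empty(alphabet, delta, start, accepting):
    """Return True iff no accepting state is reachable from `start`.

    `delta` maps (state, symbol) pairs to states (an assumed encoding of δ).
    """
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in accepting:
            return False  # an accepting state is reachable, so L(D) is non-empty
        for symbol in alphabet:
            nxt = delta[(state, symbol)]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True  # the search finished without meeting an accepting state

# Hypothetical DFA: q1 has transitions of its own but cannot be reached from q0.
delta = {
    ("q0", "0"): "q0", ("q0", "1"): "q0",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}
unreachable_accept = is_empty("01", delta, "q0", {"q1"})  # accepting state is unreachable
start_accepts = is_empty("01", delta, "q0", {"q0"})       # start state itself accepts
```

Because each state enters the queue at most once, the search visits at most |Q| states, no matter how many strings are in Σ∗.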

Likewise, the acceptance problem considers a DFA D and a string w and asks whether w ∈ L(D).
This also corresponds to reachability:

Theorem 2. Given a DFA D = (Q, Σ, δ, q0, F) and string w ∈ Σ∗, w ∈ L(D) if and only if some
qf ∈ F is reachable from q0 using w.

Proof. Again, both directions of the biconditional are immediate from the definition of reachability
and DFA acceptance.

Here, we simply need to simulate the execution of D on w, one character at a time, and check
whether the state we end in is an accepting state.
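
As a rough sketch of that simulation in Python: the function name accepts, the dictionary encoding of δ, and the parity DFA below (which accepts strings with an odd number of 1s) are assumptions chosen for illustration and do not come from the reading.

```python
def accepts(delta, start, accepting, w):
    """Simulate the DFA on w and report whether it ends in an accepting state."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]  # one application of δ per character
    return state in accepting

# Hypothetical DFA over {0, 1}: accepts strings containing an odd number of 1s.
parity_delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
three_ones = accepts(parity_delta, "even", {"odd"}, "1011")  # three 1s
two_ones = accepts(parity_delta, "even", {"odd"}, "11")      # two 1s
empty_string = accepts(parity_delta, "even", {"odd"}, "")    # zero 1s
```

The loop performs exactly |w| transitions, so acceptance is decided in time linear in the length of the input string.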

Minimization and Equivalence


Consider the following DFA:

The language of this DFA is the set of strings described by the following regular expression:
(0 ∪ 1)01. However, this is a simpler DFA that recognizes the same language:

This DFA is simpler in the sense that it has one fewer state than the original DFA. We could
try to reduce this DFA further, but it turns out that this DFA is the smallest DFA recognizing
the language (0 ∪ 1)01. The process of taking a DFA that recognizes language L and finding a
corresponding DFA with fewer states that also recognizes L is called minimization.

The intuition behind minimizing a DFA is that some states in a non-minimal DFA exhibit the same
behavior. For example, consider q1 and q2 in the original DFA. Even though they are different states,
for any string w, simulating the machine on input w starting at either q1 or q2 gives the same
result: either both runs accept the string or both reject it. This is because
for any character, q1 and q2 both move to the same states (q3 on a 0 and q5 on a 1). In this sense,
q1 and q2 are indistinguishable—we can combine them without any change in the behavior of the
DFA. Formally, we define two states to be distinguishable as follows:

Definition 2. In a DFA D = (Q, Σ, δ, q0, F), states p, q ∈ Q are distinguishable if there exists a
string w ∈ Σ∗ such that δ∗(p, w) = p′ and δ∗(q, w) = q′ and either p′ ∈ F, q′ ∉ F or p′ ∉ F, q′ ∈ F.
Here δ∗(p, w) denotes the state reached by reading all of w, one character at a time, starting from p.

Indistinguishability is defined similarly:

Definition 3. In a DFA D = (Q, Σ, δ, q0, F), states p, q ∈ Q are indistinguishable if for all w ∈ Σ∗,
δ∗(p, w) = p′ and δ∗(q, w) = q′ and either p′ ∈ F, q′ ∈ F or p′ ∉ F, q′ ∉ F.
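
To make the two definitions concrete, here is a small Python sketch that searches for a distinguishing string by brute force. The names run and find_witness are hypothetical, and the transition table is reconstructed from the transitions this reading quotes for its six-state example; the moves out of q4 and q5 are not quoted anywhere, so sending them to q5 is an assumption.

```python
from itertools import product

# Reconstructed example DFA for (0 ∪ 1)01; the q4 and q5 rows are assumed.
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q2",
    ("q1", "0"): "q3", ("q1", "1"): "q5",
    ("q2", "0"): "q3", ("q2", "1"): "q5",
    ("q3", "0"): "q5", ("q3", "1"): "q4",
    ("q4", "0"): "q5", ("q4", "1"): "q5",
    ("q5", "0"): "q5", ("q5", "1"): "q5",
}
ACCEPTING = {"q4"}

def run(delta, state, w):
    """Return the state reached from `state` after reading every character of w."""
    for symbol in w:
        state = delta[(state, symbol)]
    return state

def find_witness(delta, accepting, p, q, alphabet="01", max_len=4):
    """Return a shortest string of length <= max_len distinguishing p and q, else None."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            w = "".join(chars)
            if (run(delta, p, w) in accepting) != (run(delta, q, w) in accepting):
                return w
    return None

witness_q0_q1 = find_witness(DELTA, ACCEPTING, "q0", "q1")
witness_q1_q2 = find_witness(DELTA, ACCEPTING, "q1", "q2")
```

Under these assumptions the search reports "01" for q0 and q1 (only the run from q1 ends in q4) and finds nothing for q1 and q2. A bounded search that returns None is only evidence of indistinguishability, not a proof, which is one reason a systematic procedure is preferable.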

With this in mind, the DFA minimization process works in three steps:

1. First, remove from the DFA any states that are unreachable from the start state (using the
definition of reachability from the previous section).
2. Discover pairs of states in the DFA that are indistinguishable.
3. Combine indistinguishable states to achieve a final, minimal DFA.

To discover pairs of indistinguishable states, we use a table-filling algorithm that tracks which pairs
of states are known to be distinguishable. We repeatedly refine the table, discovering new pairs of
states that are distinguishable until we reach a point where we make no new discoveries. The pairs
left unmarked in the final table are indistinguishable and thus combinable in the final step of the
algorithm.

As an example, consider running the table-filling algorithm on the original DFA. Our table has the
following shape:

q0   ×   ×   ×   ×   ×   ×
q1       ×   ×   ×   ×   ×
q2           ×   ×   ×   ×
q3               ×   ×   ×
q4                   ×   ×
q5                       ×
     q0  q1  q2  q3  q4  q5

An entry in the table is marked with a circle (◦) if we have discovered that the corresponding pair of
states are distinguishable. If the entry is empty, we have not (yet) distinguished the states; entries
that are still empty when the algorithm finishes denote indistinguishable pairs. Entries marked with
a cross (×) are ignored: they either pair a state with itself or duplicate another entry, since the
ordering within a pair of states does not matter.

First, from our definition of distinguishable states, we see that any accepting state is distinguishable
from any non-accepting state (take w to be the empty string). Since q4 is the only accepting state,
we therefore mark every pair of states that contains q4 as distinguished:

q0   ×   ×   ×   ×   ×   ×
q1       ×   ×   ×   ×   ×
q2           ×   ×   ×   ×
q3               ×   ×   ×
q4   ◦   ◦   ◦   ◦   ×   ×
q5                   ◦   ×
     q0  q1  q2  q3  q4  q5

Now, we repeat the following process until we no longer change the table:

1. For each pair of states (p, q) and for each possible character a ∈ Σ, consider the new pair of
states (δ(p, a), δ(q, a)).
2. If (p, q) is undistinguished, but (δ(p, a), δ(q, a)) is distinguished, then mark (p, q) as
distinguished.
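
The marking loop above can be sketched in Python as follows. The function name distinguishable_pairs and the set-of-frozensets encoding of the table are choices made for this illustration. The transition table is reconstructed from the transitions quoted in this reading; the moves out of q4 and q5 are not quoted, so sending them to q5 is an assumption.

```python
from itertools import combinations

STATES = ["q0", "q1", "q2", "q3", "q4", "q5"]
# Reconstructed example DFA for (0 ∪ 1)01; the q4 and q5 rows are assumed.
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q2",
    ("q1", "0"): "q3", ("q1", "1"): "q5",
    ("q2", "0"): "q3", ("q2", "1"): "q5",
    ("q3", "0"): "q5", ("q3", "1"): "q4",
    ("q4", "0"): "q5", ("q4", "1"): "q5",
    ("q5", "0"): "q5", ("q5", "1"): "q5",
}

def distinguishable_pairs(states, alphabet, delta, accepting):
    """Return the set of marked (distinguishable) pairs, each stored as a frozenset."""
    # Initial marks: exactly one of the two states is accepting.
    marked = {
        frozenset((p, q))
        for p, q in combinations(states, 2)
        if (p in accepting) != (q in accepting)
    }
    changed = True
    while changed:  # repeat until a full pass adds no new mark
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in alphabet:
                succ = frozenset((delta[(p, a)], delta[(q, a)]))
                # A successor "pair" made of a single state can never be marked.
                if len(succ) == 2 and succ in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked

marked = distinguishable_pairs(STATES, "01", DELTA, {"q4"})
unmarked = [pq for pq in combinations(STATES, 2) if frozenset(pq) not in marked]
```

This sketch records new marks as soon as they are found, so the number of passes may differ from a round-by-round walkthrough, but the final set of marks does not depend on the order in which pairs are visited.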

For example, on the first iteration of our process, we first consider the pair (q1 , q0 ) (the top-left
corner of the table although the order does not matter) and the possible transitions from it:

• On input 0, (δ(q1 , 0), δ(q0 , 0)) = (q3 , q1 ).

• On input 1, (δ(q1 , 1), δ(q0 , 1)) = (q5 , q2 ).

We note that (q3 , q1 ) and (q5 , q2 ) are not marked (i.e., are considered undistinguished), so we do
not change the entry for (q1 , q0 ). In contrast, consider the pair (q2 , q3 ):

• On input 0, (δ(q2 , 0), δ(q3 , 0)) = (q3 , q5 ).

• On input 1, (δ(q2 , 1), δ(q3 , 1)) = (q5 , q4 ).

While (q3, q5) is not marked, (q5, q4) is marked, so we mark (q2, q3). Continuing this process, we
arrive at the following updated table:

q0   ×   ×   ×   ×   ×   ×
q1       ×   ×   ×   ×   ×
q2           ×   ×   ×   ×
q3   ◦   ◦   ◦   ×   ×   ×
q4   ◦   ◦   ◦   ◦   ×   ×
q5               ◦   ◦   ×
     q0  q1  q2  q3  q4  q5

Note that because the only pairs marked as distinguished initially were those involving q4, only
pairs of states that transition into a pair containing q4 are marked as distinguished in this first
round. The only state with a transition into q4 is q3, so only pairs of states involving q3 were
marked in this round.

Since we modified the table in the first round, we repeat the process again for all the unmarked
pairs of states. On the next iteration, the updated table is:

q0   ×   ×   ×   ×   ×   ×
q1   ◦   ×   ×   ×   ×   ×
q2   ◦       ×   ×   ×   ×
q3   ◦   ◦   ◦   ×   ×   ×
q4   ◦   ◦   ◦   ◦   ×   ×
q5       ◦   ◦   ◦   ◦   ×
     q0  q1  q2  q3  q4  q5

And on the final iteration, the updated table is:

q0   ×   ×   ×   ×   ×   ×
q1   ◦   ×   ×   ×   ×   ×
q2   ◦       ×   ×   ×   ×
q3   ◦   ◦   ◦   ×   ×   ×
q4   ◦   ◦   ◦   ◦   ×   ×
q5   ◦   ◦   ◦   ◦   ◦   ×
     q0  q1  q2  q3  q4  q5

(I recommend stepping through the algorithm and checking your work with the tables above.) In
the final iteration, the pair (q2 , q1 ) is left unmarked. To see why we’ll never mark (q2 , q1 ), look at
its transitions:

• On input 0, (δ(q2 , 0), δ(q1 , 0)) = (q3 , q3 ).


• On input 1, (δ(q2 , 1), δ(q1 , 1)) = (q5 , q5 ).

Both transitions lead to a pair made up of a single state, and a state is never distinguishable from
itself, so neither (q3, q3) nor (q5, q5) can ever be marked as distinguished.

Because of this, successive iterations of the process do not change the table. Therefore, this is the
final result and we note that q2 and q1 are indistinguishable. Collapsing these two states into a
single state results in the minimized DFA given above. A surprising but important fact about our
minimization algorithm is that it produces a unique DFA.
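
The collapsing step can be sketched in Python as shown here. The function name merge_indistinguishable and the choice to keep the alphabetically smaller state name as the representative are assumptions made for this illustration, and the sketch presumes unreachable states have already been removed. The transition table is reconstructed from the transitions quoted in this reading; the moves out of q4 and q5 are assumed to go to q5.

```python
def merge_indistinguishable(states, alphabet, delta, start, accepting, pairs):
    """Collapse each group of indistinguishable states into one representative."""
    rep = {s: s for s in states}  # rep[s] points toward the representative of s

    def find(s):
        while rep[s] != s:
            s = rep[s]
        return s

    for p, q in pairs:
        rp, rq = find(p), find(q)
        if rp != rq:
            keep, drop = sorted((rp, rq))  # keep the smaller name (arbitrary choice)
            rep[drop] = keep

    new_states = sorted({find(s) for s in states})
    # Merged states agree on where they go, up to merging, so this is well defined.
    new_delta = {(find(s), a): find(delta[(s, a)]) for s in states for a in alphabet}
    new_accepting = {find(s) for s in accepting}
    return new_states, new_delta, find(start), new_accepting

STATES = ["q0", "q1", "q2", "q3", "q4", "q5"]
# Reconstructed example DFA for (0 ∪ 1)01; the q4 and q5 rows are assumed.
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q2",
    ("q1", "0"): "q3", ("q1", "1"): "q5",
    ("q2", "0"): "q3", ("q2", "1"): "q5",
    ("q3", "0"): "q5", ("q3", "1"): "q4",
    ("q4", "0"): "q5", ("q4", "1"): "q5",
    ("q5", "0"): "q5", ("q5", "1"): "q5",
}
# The only pair left unmarked in the final table is (q1, q2).
new_states, new_delta, new_start, new_accepting = merge_indistinguishable(
    STATES, "01", DELTA, "q0", {"q4"}, [("q1", "q2")]
)
```

In the result, q2 disappears and both transitions out of q0 lead to q1, giving a five-state DFA.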

Theorem 3. Consider a DFA D. There exists a unique DFA D′ = (Q′, Σ, δ′, q0′, F′) (up to
renaming of states) such that L(D) = L(D′) and, for any other DFA D″ = (Q″, Σ, δ″, q0″, F″) such
that L(D) = L(D″), we have |Q′| ≤ |Q″|. The table-filling algorithm above derives this D′ for some
given DFA D.

We’ll withhold the proof of this claim until later when we have additional machinery to reason about
the equivalence classes of states discovered by our algorithm. With this theorem, we can tackle the
closely-related problem of DFA equivalence:

Definition 4. Let D1 and D2 be DFAs. D1 is equivalent to D2, written D1 ≡ D2, if L(D1) =
L(D2). That is, D1 accepts the same strings as D2, and D1 rejects the same strings as D2.

Luckily, because the minimal DFA D0 for some DFA D is unique, we can use the following procedure
to see if two DFAs D1 and D2 are equivalent:

1. Run the minimization procedure on D1 and D2 to produce minimal DFAs D1′ and D2′.
2. If D1′ and D2′ are identical (up to renaming of states), then D1 ≡ D2; otherwise, D1 and D2 are
not equivalent.
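
Step 2 asks whether two minimal DFAs are the same up to renaming of states. One way to check this, sketched below in Python, is to walk both machines in lockstep from their start states and build the renaming along the way. The function name same_up_to_renaming, the (delta, start, accepting) tuple encoding, and the parity DFAs used as examples are assumptions for this illustration. The sketch presumes both inputs are already minimal, so every state is reachable.

```python
from collections import deque

def same_up_to_renaming(alphabet, d1, d2):
    """Check whether two minimal DFAs differ only in the names of their states."""
    delta1, start1, acc1 = d1
    delta2, start2, acc2 = d2
    states1 = {s for s, _ in delta1}
    states2 = {s for s, _ in delta2}
    if len(states1) != len(states2):
        return False
    rename = {start1: start2}  # candidate renaming from states of d1 to states of d2
    queue = deque([start1])
    while queue:
        s1 = queue.popleft()
        s2 = rename[s1]
        if (s1 in acc1) != (s2 in acc2):
            return False  # matched states disagree on acceptance
        for a in alphabet:
            t1, t2 = delta1[(s1, a)], delta2[(s2, a)]
            if t1 not in rename:
                rename[t1] = t2
                queue.append(t1)
            elif rename[t1] != t2:
                return False  # the renaming would have to send t1 to two states
    # The renaming must use every state of d2 exactly once.
    return len(set(rename.values())) == len(states2)

# Hypothetical minimal DFAs: two for "odd number of 1s", one for "even number of 1s".
odd = ({("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"}, "e", {"o"})
odd_renamed = ({("A", "0"): "A", ("A", "1"): "B", ("B", "0"): "B", ("B", "1"): "A"}, "A", {"B"})
even = ({("A", "0"): "A", ("A", "1"): "B", ("B", "0"): "B", ("B", "1"): "A"}, "A", {"A"})

renamed_match = same_up_to_renaming("01", odd, odd_renamed)
different_language = same_up_to_renaming("01", odd, even)
```

Because the start states must correspond and each transition forces the next correspondence, there is at most one candidate renaming, so no search over permutations of states is needed.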

Acknowledgments
This reading was written by Dr. Peter-Michael Osera. The only changes made were in regard to
spacing, and in notation to match Introduction to the Theory of Computation, by Sipser, 3rd edition.
