Compilers Design: M. T. Bennani Assistant Professor, FST - El Manar University, LISI-INSAT

This document discusses the implementation of lexical analysis in compilers using finite automata. It begins by covering regular expressions and how they are used to specify the lexical structure of a language. It then discusses finite automata models like deterministic finite automata (DFAs) and nondeterministic finite automata (NFAs) that are used to implement these regular expressions. The document provides examples of converting regular expressions to NFAs and NFAs to equivalent DFAs, which can then be implemented as tables for efficient lexical analysis during compilation.


Regular expressions Finite automata Implementation

Compilers Design

M. T. Bennani
Assistant Professor, FST - El Manar University, LISI-INSAT

-Academic Year 2017-2018-

Lecture III. Lexical Analysis Implementation

Outline

- Specifying lexical structure using regular expressions

- Finite automata
  - Deterministic Finite Automata (DFAs)
  - Non-deterministic Finite Automata (NFAs)

- Implementation
  - RegExp ⇒ NFA ⇒ DFA ⇒ Tables

- Exercises


Regular expressions extension

Variation in regular expression notation


- Union: A | B ⇔ A + B
- Option: A + ε ⇔ A?
- Range: 'a' + 'b' + ... + 'z' ⇔ [a-z]
- Excluded range: complement of [a-z] ⇔ [^a-z]
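
As a rough illustration, the four notations above can be tried out with Python's `re` module, which writes union as `|`. This is only a sketch: the literals "ab" and "cd" standing in for A and B, and the helper name `in_language`, are assumptions made for the example.

```python
import re

# Each line pairs a notation from the slide with a rough Python `re` equivalent.
# "ab" and "cd" stand in for the sub-expressions A and B (illustrative only).
union = re.compile(r"ab|cd")        # A + B, written A | B
option = re.compile(r"(?:ab)?")     # A + ε, written A?
lower = re.compile(r"[a-z]")        # 'a' + 'b' + ... + 'z'
not_lower = re.compile(r"[^a-z]")   # complement of [a-z]


def in_language(pattern, text):
    """Return True when the whole of `text` belongs to L(pattern)."""
    return pattern.fullmatch(text) is not None
```

For instance, `in_language(option, "")` holds because the option form includes the empty string ε.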


Regular expression to lexical specification (1)

1. Write a regular expression for the lexemes of each token
   - Number = digit+
   - Keyword = 'if' + 'else' + ...
   - Identifier = letter(letter + digit)*
2. Construct R, matching all lexemes for all tokens
   - R = Keyword + Identifier + Number + ...
   - R = R1 + R2 + ...
3. Check whether a prefix of the input X1 ... Xn belongs to the language
   - For 1 ≤ i ≤ n, check whether X1 ... Xi ∈ L(R)
4. On success
   - X1 ... Xi ∈ L(Rj) for some j
   - Remove the lexeme X1 ... Xi from the input and go to (3)
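
Step 3 can be sketched in Python as a brute-force check of every prefix. The token regexes below are assumptions (ASCII letters and digits, and only the two keywords named on the slide), and `matching_prefix_lengths` is a hypothetical helper name.

```python
import re

# Assumed token definitions, restricted to ASCII for illustration.
NUMBER = r"[0-9]+"
KEYWORD = r"if|else"
IDENTIFIER = r"[A-Za-z][A-Za-z0-9]*"

# R = Keyword + Identifier + Number
R = re.compile(f"(?:{KEYWORD})|(?:{IDENTIFIER})|(?:{NUMBER})")


def matching_prefix_lengths(text):
    """Return every i (1 <= i <= n) such that the prefix text[:i] is in L(R)."""
    return [i for i in range(1, len(text) + 1) if R.fullmatch(text[:i])]
```

On the input `"if1 x"` the prefixes of length 1, 2 and 3 all belong to L(R), which is exactly the situation where a lexer needs a rule for choosing how much input to consume.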


Ambiguities
- How much input is used? What if
  - X1 ... Xi ∈ L(R) and also
  - X1 ... Xk ∈ L(R), with i ≠ k

Rule 1
Pick the longest possible string in L(R): the "maximal munch" algorithm
- Which token is used? What if
  - X1 ... Xi ∈ L(Rj) and also
  - X1 ... Xi ∈ L(Rk)

Rule 2
Use the rule listed first (j if j < k)
- Treats "if" as a keyword, not an identifier
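
The two rules can be combined in a small tokenizer. The following is a minimal sketch, not the lecture's implementation: the rule set, the token names and the helpers `longest_match` and `tokenize` are assumptions. It tries prefixes from longest to shortest with `fullmatch`, because Python's `|` picks the first alternative that matches rather than the longest one.

```python
import re

# Rules in priority order (hypothetical token set; earlier = higher priority).
RULES = [
    ("KEYWORD", re.compile(r"if|else")),
    ("IDENTIFIER", re.compile(r"[A-Za-z][A-Za-z0-9]*")),
    ("NUMBER", re.compile(r"[0-9]+")),
    ("WHITESPACE", re.compile(r"[ \t\n]+")),
]


def longest_match(pattern, text, pos):
    """Length of the longest prefix of text[pos:] in L(pattern), or 0 if none."""
    for end in range(len(text), pos, -1):
        if pattern.fullmatch(text, pos, end):
            return end - pos
    return 0


def tokenize(text):
    """Split text into (token, lexeme) pairs using Rule 1, then Rule 2."""
    tokens, pos = [], 0
    while pos < len(text):
        best_len, best_name = 0, None
        for name, pattern in RULES:
            length = longest_match(pattern, text, pos)
            # Strict '>' keeps the earlier rule when lengths tie (Rule 2).
            if length > best_len:
                best_len, best_name = length, name
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens
```

With these rules, `tokenize("if ifx 42")` classifies `if` as a keyword (Rule 2 breaks the tie with Identifier) and `ifx` as an identifier (Rule 1 prefers the longer match).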


Error Handling

- What if no rule matches a prefix of the input?
  - Write a rule matching all "bad" strings
  - Put it last (lowest priority)
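
One possible way to realize this is a catch-all rule that matches any single character and is listed last. The sketch below assumes a small hypothetical rule set and the token name `ERROR`; it picks the longest match and breaks ties in favour of the earlier rule, so the catch-all only wins when nothing else matches.

```python
import re

# Hypothetical rule list; the catch-all ERROR rule is deliberately last.
RULES = [
    ("NUMBER", re.compile(r"[0-9]+")),
    ("IDENTIFIER", re.compile(r"[A-Za-z][A-Za-z0-9]*")),
    ("WHITESPACE", re.compile(r"[ \t\n]+")),
    ("ERROR", re.compile(r".", re.DOTALL)),  # any single character
]


def next_token(text, pos):
    """Return (name, lexeme) for the longest match at pos; ties go to the earlier rule."""
    best = None
    for name, pattern in RULES:
        for end in range(len(text), pos, -1):
            if pattern.fullmatch(text, pos, end):
                if best is None or end - pos > len(best[1]):
                    best = (name, text[pos:end])
                break  # longest match for this rule found
    return best


def tokenize(text):
    """Tokenize the whole input; bad characters come out as ERROR tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        name, lexeme = next_token(text, pos)  # never None: ERROR always matches
        tokens.append((name, lexeme))
        pos += len(lexeme)
    return tokens
```

Because the error rule consumes one character at a time, the lexer can report the bad character and resume scanning right after it.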


Definitions

- Regular expressions = specification
- Finite automata = implementation

A finite automaton consists of

- An input alphabet Σ
- A set of states S
- A start state n
- A set of accepting states F ⊆ S
- A set of transitions: state → state, each labeled with an input symbol


Notations
- Transition: S1 →a S2
  - In state S1 on input "a", go to state S2
- If end of input and in an accepting state ⇒ accept
- Otherwise ⇒ reject

[Figure: graphical notation for a state, the initial state, a final state, and a transition]

Examples
- Design a finite automaton that accepts only "1"
- Design a finite automaton that accepts any number of 1's followed by a single 0
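
One possible answer to the two exercises, written as a Python sketch. The class name `FiniteAutomaton`, the state names `s0` and `s1`, and the choice to reject when no transition is defined are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class FiniteAutomaton:
    alphabet: frozenset      # input alphabet Σ
    states: frozenset        # set of states S
    start: str               # start state n
    accepting: frozenset     # accepting states F ⊆ S
    transitions: dict        # (state, symbol) -> next state

    def accepts(self, text):
        state = self.start
        for symbol in text:
            key = (state, symbol)
            if key not in self.transitions:
                return False          # assumed: a missing transition rejects
            state = self.transitions[key]
        # End of input: accept only when in an accepting state.
        return state in self.accepting


# Accepts only "1".
only_one = FiniteAutomaton(
    alphabet=frozenset("01"),
    states=frozenset({"s0", "s1"}),
    start="s0",
    accepting=frozenset({"s1"}),
    transitions={("s0", "1"): "s1"},
)

# Accepts any number of 1's followed by a single 0.
ones_then_zero = FiniteAutomaton(
    alphabet=frozenset("01"),
    states=frozenset({"s0", "s1"}),
    start="s0",
    accepting=frozenset({"s1"}),
    transitions={("s0", "1"): "s0", ("s0", "0"): "s1"},
)
```

Both machines have the same two states and differ only in their transitions: the second one loops on 1 in `s0` before moving to `s1` on 0.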


Automata examples
- Alphabet: {0, 1}
- What language does this automaton recognize?

Epsilon moves
- Another kind of transition: ε-moves
  - A →ε B
- The machine can move from state A to state B without reading input
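
Because ε-moves consume no input, it is often useful to compute every state reachable through ε-moves alone (commonly called the ε-closure). A minimal Python sketch follows; the empty string as the ε marker, the dictionary layout and the example machine are assumptions.

```python
EPSILON = ""  # assumed marker for ε-labeled transitions


def epsilon_closure(states, transitions):
    """Return all states reachable from `states` using only ε-moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for target in transitions.get((state, EPSILON), ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return frozenset(closure)


# Hypothetical machine: A →ε B, B →ε C, C →1 D
transitions = {
    ("A", EPSILON): {"B"},
    ("B", EPSILON): {"C"},
    ("C", "1"): {"D"},
}
```

Here the closure of {A} is {A, B, C}: the machine can be in any of these three states before reading its first symbol.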


Deterministic and Nondeterministic Automata


Deterministic Finite Automata (DFA)
- One transition per input per state
- No ε-moves
⇒ A DFA can take only one path through the state graph
  - Completely determined by the input
Nondeterministic Finite Automata (NFA)
- Can have multiple transitions for one input in a given state
- Can have ε-moves
⇒ NFAs can choose
  - Whether to make ε-moves
  - Which of multiple transitions for a single input to take


NFA and DFA


NFA acceptance
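- Consider the following NFA (the leading - marks the start state A, the trailing + marks the accepting state C):

  State | 0     | 1
  -A    | {A,B} | A
  B     | C     | ∅
  C+    | ∅     | ∅

- For the input 1 0 0, the NFA accepts the input if it can get to the final state, that is, if at least one sequence of choices ends in C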
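
Acceptance can be checked by tracking the set of all states the NFA could be in after each symbol. The sketch below encodes the table above; reading `-` as the start marker and `+` as the accepting marker is an assumption, as are the names `NFA`, `START`, `ACCEPTING` and `nfa_accepts`.

```python
# Transition table from the slide: rows are states, columns are inputs.
NFA = {
    "A": {"0": {"A", "B"}, "1": {"A"}},
    "B": {"0": {"C"}, "1": set()},
    "C": {"0": set(), "1": set()},
}
START = "A"          # assumed from the '-' marker on A
ACCEPTING = {"C"}    # assumed from the '+' marker on C


def nfa_accepts(text):
    """Accept when at least one path through the NFA ends in an accepting state."""
    current = {START}
    for symbol in text:
        # Follow every possible transition from every current state.
        current = {t for s in current for t in NFA[s][symbol]}
    return bool(current & ACCEPTING)
```

For the input 1 0 0 the reachable sets are {A}, then {A, B}, then {A, B, C}, so the input is accepted.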

Comparison
- NFAs and DFAs recognize the same set of languages (regular languages)
- DFAs are faster to execute because there are no choices to consider
- For a given language, an NFA can be simpler than a DFA

NFA to DFA (1/2)

Step 1 - Create the state table of the given NFA.
Step 2 - Create a blank state table for the equivalent DFA, with one column per input symbol.
Step 3 - Mark the start state of the DFA as q0 (the same as the start state of the NFA).
Step 4 - For each input symbol, find the combination of NFA states {Q0, Q1, ..., Qn} reachable from the current DFA state.
Step 5 - Each time a new DFA state appears in the input-symbol columns, apply step 4 to it; when no new state appears, go to step 6.
Step 6 - The DFA states that contain any of the final states of the NFA are the final states of the equivalent DFA.


NFA to DFA (2/2)

- Each state of the DFA is a non-empty subset of the states of the NFA
- Start state
  - The set of NFA states reachable through ε-moves from the NFA start state
- Add a transition S →a S' to the DFA iff
  - S' is the set of NFA states reachable from any state in S after seeing the input a, considering ε-moves as well
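
The construction described above is usually called the subset construction. A compact Python sketch is given below; the empty string as the ε marker, the dictionary encoding of transitions and the function names are assumptions, and the example NFA is the three-state machine from the NFA acceptance table earlier in the lecture.

```python
from collections import deque

EPSILON = ""  # assumed marker for ε-moves


def epsilon_closure(states, delta):
    """All NFA states reachable from `states` through ε-moves only."""
    closure, stack = set(states), list(states)
    while stack:
        for t in delta.get((stack.pop(), EPSILON), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def nfa_to_dfa(alphabet, delta, start, accepting):
    """Subset construction. Returns (dfa_start, dfa_delta, dfa_accepting)."""
    dfa_start = epsilon_closure({start}, delta)
    dfa_delta, seen, queue = {}, {dfa_start}, deque([dfa_start])
    while queue:
        subset = queue.popleft()
        for symbol in sorted(alphabet):
            moved = {t for s in subset for t in delta.get((s, symbol), ())}
            target = epsilon_closure(moved, delta)
            if not target:
                continue  # only non-empty subsets become DFA states
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A DFA state is final when it contains at least one NFA final state.
    dfa_accepting = {s for s in seen if s & set(accepting)}
    return dfa_start, dfa_delta, dfa_accepting


# NFA from the acceptance example: start A, accepting C.
delta = {("A", "0"): {"A", "B"}, ("A", "1"): {"A"}, ("B", "0"): {"C"}}
dfa_start, dfa_delta, dfa_accepting = nfa_to_dfa({"0", "1"}, delta, "A", {"C"})
```

For this NFA the construction produces three DFA states, {A}, {A, B} and {A, B, C}, of which only {A, B, C} is final.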


Example

The regular expression is: (1+0)*1
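
The slide's diagrams are not reproduced here, so the following Python sketch works from an assumed NFA for this expression: q0 loops on 0 and 1, and q0 →1 q1 with q1 accepting. Applying the subset construction by hand gives two DFA states, named P = {q0} and Q = {q0, q1} for this example. The code checks that DFA against Python's `re` on all short binary strings.

```python
import re
from itertools import product

# Hand-derived DFA (state names P and Q are assumptions):
#   P = {q0} is the start state, Q = {q0, q1} is the accepting state.
DFA_TABLE = {
    ("P", "0"): "P", ("P", "1"): "Q",
    ("Q", "0"): "P", ("Q", "1"): "Q",
}


def dfa_accepts(text):
    """Run the DFA on a string over {0, 1}."""
    state = "P"
    for symbol in text:
        state = DFA_TABLE[(state, symbol)]
    return state == "Q"


def agrees_with_regex(max_len):
    """Compare the DFA with the regex on every binary string up to max_len."""
    pattern = re.compile(r"(?:1|0)*1")
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            text = "".join(chars)
            if dfa_accepts(text) != (pattern.fullmatch(text) is not None):
                return False
    return True
```

The DFA simply remembers whether the last symbol read was a 1, which matches the language of the expression: all binary strings ending in 1.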


- A DFA can be implemented by a 2D table T
  - One dimension is "states"
  - The other dimension is "input symbol"
  - For each transition Si →a Sk, define T[i,a] = k
- DFA execution
  - If in state Si on input a, read T[i,a] = k and move to state Sk
  - Very efficient
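
A table-driven DFA might look like the following Python sketch, which recognizes Identifier = letter(letter + digit)*. The grouping of characters into three column classes, the explicit dead state S2 and all names are assumptions made for the example.

```python
# Column index for each character class (assumed ASCII classes).
LETTER, DIGIT, OTHER = 0, 1, 2


def column(ch):
    """Map an input character to its column in the table."""
    if ch.isascii() and ch.isalpha():
        return LETTER
    if ch.isascii() and ch.isdigit():
        return DIGIT
    return OTHER


# T[i][a] = k encodes the transition Si →a Sk.
# S0: start, S1: inside an identifier (accepting), S2: dead state.
T = [
    # LETTER DIGIT OTHER
    [1,      2,    2],   # S0
    [1,      1,    2],   # S1
    [2,      2,    2],   # S2
]
ACCEPTING = {1}


def is_identifier(text):
    """Run the DFA: one table lookup per input character."""
    state = 0
    for ch in text:
        state = T[state][column(ch)]
    return state in ACCEPTING
```

Mapping characters to a few classes keeps the table at three columns instead of one column per character, which is one simple way to limit table size.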

Note
- NFA to DFA conversion is at the heart of tools such as flex
- But DFAs can be huge
- In practice, flex-like tools trade off speed for space in the choice of NFA and DFA representations
