Specification of Tokens

This document defines key concepts related to strings, languages, and regular expressions. It introduces strings as sequences of symbols from an alphabet. Languages are defined as sets of strings over a fixed alphabet. Regular expressions provide a notation for specifying patterns in strings using operations like concatenation, union, and Kleene closure. The document also outlines properties and limitations of regular expressions.

Uploaded by

SMARTELLIGENT


SPECIFICATION OF TOKENS

Strings and Languages
• Regular Expressions are an important notation for specifying patterns.

• Alphabet – any finite set of symbols

e.g. ASCII, the binary alphabet {0,1}, Unicode, EBCDIC, Latin-1

• String – a finite sequence of symbols drawn from an alphabet

– e.g. banana (over the ASCII alphabet)
– Length of a string s => |s|, e.g. |banana| = 6
– Empty string => ε, with |ε| = 0

• Other terms relating to strings: prefix; suffix; substring; proper prefix, suffix, or substring (non-empty, not the entire string); subsequence

• Language – a set of strings over a fixed alphabet

Languages
• A language, L, is simply any set of strings over a fixed alphabet.

Alphabet                     Languages
{0,1}                        {0,10,100,1000,100000,…}
                             {0,1,00,11,000,111,…}
{a,b,c}                      {abc,aabbcc,aaabbbccc,…}
{A,…,Z}                      {FOR,WHILE,GOTO,…}
{A,…,Z,a,…,z,0,…,9,          {all legal PASCAL programs}
 +,-,…,<,>,…}

Special Languages:
∅ - the empty language
{ε} - contains only the empty string ε

String operations
• Given string: banana
• Prefix: ban, banana
• Suffix: ana, banana
• Substring: nan, ban, ana, banana
• Subsequence: bnan, nn
• Proper prefix and suffix: a prefix or suffix that is non-empty and not the entire string, e.g. ban and ana (but not banana)

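The terms above can be checked mechanically. The following is a minimal Python sketch; the helper names (prefixes, suffixes, substrings, is_subsequence, proper) are illustrative and do not come from the slides.

```python
def prefixes(s):
    """All prefixes of s, from the empty string up to s itself."""
    return [s[:i] for i in range(len(s) + 1)]


def suffixes(s):
    """All suffixes of s, from s itself down to the empty string."""
    return [s[i:] for i in range(len(s) + 1)]


def substrings(s):
    """All distinct substrings of s (every prefix of every suffix)."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}


def is_subsequence(t, s):
    """True if t is obtained from s by deleting zero or more symbols."""
    it = iter(s)
    # Each `ch in it` consumes the iterator up to the first match,
    # so the symbols of t must appear in s in the same order.
    return all(ch in it for ch in t)


def proper(parts, s):
    """Keep only the proper parts: non-empty and not the entire string."""
    return [p for p in parts if p and p != s]


s = "banana"
print(prefixes(s))
print(proper(suffixes(s), s))
print("nan" in substrings(s), is_subsequence("bnan", s), is_subsequence("nn", s))
```

Note that a subsequence need not be contiguous, which is why bnan qualifies as a subsequence of banana although it is not a substring.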
String Operations
• Concatenation
– xy; sε = εs = s; ε is the identity for concatenation
– s^0 = ε; for i > 0, s^i = s^(i-1) s

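As a small illustration of the recursive definition of string exponentiation, here is a Python sketch in which the empty Python string plays the role of ε. The function name power is an assumption made for this example.

```python
def power(s, i):
    """s^0 is the empty string; s^i is s^(i-1) followed by s for i > 0."""
    if i == 0:
        return ""
    return power(s, i - 1) + s


# The empty string is the identity for concatenation.
assert "" + "ab" == "ab" + "" == "ab"

print(power("ab", 3))
```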
Operations on Languages

OPERATION                                  DEFINITION
union of L and M, written L ∪ M            L ∪ M = {s | s is in L or s is in M}
concatenation of L and M, written LM       LM = {st | s is in L and t is in M}
Kleene closure of L, written L*            L* = ⋃_{i=0}^{∞} L^i
                                           L* denotes "zero or more concatenations of" L
positive closure of L, written L+          L+ = ⋃_{i=1}^{∞} L^i
                                           L+ denotes "one or more concatenations of" L
exponentiation of L, written L^i           L^0 = {ε}, L^1 = L, L^2 = LL

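The operations in the table can be tried out on small finite languages. The sketch below models a language as a Python set of strings. Because L* and L+ are infinite in general, it only computes the union of the powers up to a chosen bound n. The names concat, lang_power and closure_up_to are illustrative.

```python
def concat(L, M):
    """LM = {st | s is in L and t is in M}."""
    return {s + t for s in L for t in M}


def lang_power(L, i):
    """L^0 = {ε}; L^i = L^(i-1) L."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result


def closure_up_to(L, n, start=0):
    """Union of L^start .. L^n: a finite slice of L* (start=0) or L+ (start=1)."""
    out = set()
    for i in range(start, n + 1):
        out |= lang_power(L, i)
    return out


L = {"a", "b"}
M = {"0", "1"}
print(sorted(L | M))                         # union
print(sorted(concat(L, M)))                  # concatenation
print(sorted(closure_up_to(L, 2)))           # part of L*
print(sorted(closure_up_to(L, 2, start=1)))  # part of L+
```

The only difference between the two closures is the starting index of the union: L* includes L^0 = {ε}, while L+ starts at L^1.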
Operations on Languages
Let L be the set of letters {A, …, Z, a, …, z} and D the set of digits {0, …, 9}.
• L ∪ D is the set of letters and digits
• LD is the set of strings consisting of a letter followed by a digit
• L^4 is the set of all four-letter strings
• L* is the set of all strings of letters, including ε
• D+ is the set of strings of one or more digits

Say What?
L = {A, B, C, D}    D = {1, 2, 3}
• L ∪ D
{A, B, C, D, 1, 2, 3}
• LD
{A1, A2, A3, B1, B2, B3, C1, C2, C3, D1, D2, D3}
• L^2
{AA, AB, AC, AD, BA, BB, BC, BD, CA, …, DD}
• L*
{all possible strings over L, plus ε}
• L+
L* − {ε}
• L (L ∪ D)
Valid: {A1, AA, B3, CD}   Invalid: {321, 4A2}
• L (L ∪ D)*
Valid: {A, A1, A23, D3, DA3, …}   Invalid: {31}

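The valid and invalid strings listed for the last two languages can be checked with a short membership test. This Python sketch uses L = {A, B, C, D} and D = {1, 2, 3} from the slide; the function names are made up for the example.

```python
L = set("ABCD")
D = set("123")


def in_L_LuD(s):
    """s is in L(L ∪ D): a letter followed by exactly one letter or digit."""
    return len(s) == 2 and s[0] in L and s[1] in (L | D)


def in_L_LuD_star(s):
    """s is in L(L ∪ D)*: a letter followed by zero or more letters or digits."""
    return len(s) >= 1 and s[0] in L and all(c in (L | D) for c in s[1:])


for s in ["A1", "AA", "B3", "CD", "321", "4A2"]:
    print(s, in_L_LuD(s))

for s in ["A", "A1", "A23", "D3", "DA3", "31"]:
    print(s, in_L_LuD_star(s))
```

Both 321 and 31 are rejected because they start with a digit, and 4A2 is rejected because 4 is in neither L nor D.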
Regular Expressions
• A regular expression is a set of rules / techniques for constructing sequences of symbols (strings) from an alphabet.

• Let Σ be an alphabet and r a regular expression over Σ. Then L(r) is the language characterized by the rules of r.

Regular Expressions
• Defined over an alphabet Σ

• ε represents {ε}, the set containing the empty string

• If a is a symbol in Σ, then a is a regular expression denoting {a}, the set containing the string a

• If r and s are regular expressions denoting the languages L(r) and L(s), then:
– (r)|(s) is a regular expression denoting L(r) ∪ L(s)
– (r)(s) is a regular expression denoting L(r)L(s)
– (r)* is a regular expression denoting (L(r))*
– (r) is a regular expression denoting L(r)

• Precedence: * (left associative), then concatenation (left associative), then | (left associative)

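The inductive definition above translates almost line by line into a small evaluator. The following Python sketch represents a regular expression as a tree and computes the strings of L(r) up to a length bound n, since L(r) may be infinite. The class names Eps, Sym, Alt, Cat and Star and the function lang are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eps:
    """The expression ε."""


@dataclass(frozen=True)
class Sym:
    a: str  # a single symbol of the alphabet


@dataclass(frozen=True)
class Alt:
    r: object  # (r)|(s)
    s: object


@dataclass(frozen=True)
class Cat:
    r: object  # (r)(s)
    s: object


@dataclass(frozen=True)
class Star:
    r: object  # (r)*


def lang(r, n):
    """Strings of L(r) of length at most n, following the inductive rules."""
    if isinstance(r, Eps):
        return {""}
    if isinstance(r, Sym):
        return {r.a} if n >= 1 else set()
    if isinstance(r, Alt):
        return lang(r.r, n) | lang(r.s, n)
    if isinstance(r, Cat):
        return {x + y for x in lang(r.r, n) for y in lang(r.s, n)
                if len(x + y) <= n}
    if isinstance(r, Star):
        base = lang(r.r, n) - {""}
        result, frontier = {""}, {""}
        while frontier:
            # Append one more non-empty piece; stop when nothing new fits.
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= n} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {r!r}")


# By the precedence rules, a|b*c is read as a | ((b*) c).
expr = Alt(Sym("a"), Cat(Star(Sym("b")), Sym("c")))
print(sorted(lang(expr, 3)))
```

Building the tree by hand makes the precedence rules explicit: the star applies to b only, the concatenation joins b* with c, and the alternation is applied last.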
Regular Expressions
Alphabet = {a, b}
1. a|b denotes {a, b}
2. (a|b)(a|b) denotes {aa, ab, ba, bb}
3. a* denotes {ε, a, aa, …}
4. (a|b)* denotes all strings of a's and b's, including the empty string ε
5. a|a*b denotes the string a, together with all strings of zero or more a's followed by b

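These five examples can be explored with Python's re module, whose syntax for |, *, concatenation and parentheses agrees with the notation used here (re offers many extensions that are not regular expressions in the formal sense, so this is only an approximation of the theory). The sketch lists every string over {a, b} up to a length bound that a pattern matches in full; the helper name language is illustrative.

```python
import re
from itertools import product


def language(pattern, alphabet="ab", max_len=2):
    """Strings over `alphabet` of length <= max_len that `pattern` matches entirely."""
    words = ("".join(p)
             for n in range(max_len + 1)
             for p in product(alphabet, repeat=n))
    # fullmatch requires the whole string to match, as L(r) membership does.
    return [w for w in words if re.fullmatch(pattern, w)]


for pattern in ["a|b", "(a|b)(a|b)", "a*", "(a|b)*", "a|a*b"]:
    print(pattern, language(pattern))
```

For a|a*b the result contains b and ab but not aa, which reflects the precedence rules: the pattern is read as a | (a*b), not as (a|a*)b.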
Algebraic Properties of Regular Expressions

AXIOM                          DESCRIPTION
r|s = s|r                      | is commutative
r|(s|t) = (r|s)|t              | is associative
(rs)t = r(st)                  concatenation is associative
r(s|t) = rs|rt
(s|t)r = sr|tr                 concatenation distributes over |
εr = r
rε = r                         ε is the identity element for concatenation
r* = (r|ε)*                    relation between * and ε
r** = r*                       * is idempotent

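Some of these laws can be sanity-checked by comparing two patterns on every string up to a bounded length. The Python sketch below does this with the re module for one arbitrary choice of r, s and t. Agreement on short strings is only evidence, not a proof, and the helper name same_language is an assumption of this example.

```python
import re
from itertools import product


def same_language(p, q, alphabet="ab", max_len=5):
    """True if p and q accept exactly the same strings of length <= max_len."""
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            w = "".join(tup)
            if bool(re.fullmatch(p, w)) != bool(re.fullmatch(q, w)):
                return False
    return True


# Arbitrary sample expressions, parenthesized so they can be substituted safely.
r, s, t = "(ab)", "(b*)", "(a|bb)"

laws = {
    "| is commutative": (f"{r}|{s}", f"{s}|{r}"),
    "concatenation distributes over |": (f"{r}({s}|{t})", f"{r}{s}|{r}{t}"),
    "r* = (r|ε)*": (f"{r}*", f"({r}|)*"),
    "* is idempotent": (f"({r}*)*", f"{r}*"),
}

for name, (left, right) in laws.items():
    print(name, same_language(left, right))
```

In Python's pattern syntax an empty alternative, as in (r|), stands in for ε.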
Regular Definitions
• Names may be given to regular expressions; these names can then be used like symbols
• Let Σ be an alphabet of basic symbols. A regular definition is a sequence of definitions of the form
d_1 → r_1
d_2 → r_2
...
d_n → r_n
where each d_i is a distinct name, and each r_i is a regular expression over the symbols in Σ ∪ {d_1, d_2, …, d_{i-1}}

Regular Definitions
• Example 1:
– letter → A|B|…|Z|a|b|…|z
– digit → 0|1|…|9
– id → letter (letter | digit)*
• Example 2:
– digit → 0 | 1 | 2 | … | 9
– digits → digit digit*
– optional_fraction → . digits | ε
– optional_exponent → ( E ( + | - | ε ) digits ) | ε
– num → digits optional_fraction optional_exponent

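A regular definition can be mimicked in code by building each pattern from the names defined before it. The Python sketch below does this for Example 1 with the re module. The character classes [A-Za-z] and [0-9] stand in for the long alternations, and the variable is called ident because id is a Python built-in; both choices are assumptions of this example.

```python
import re

# Each name is defined only in terms of earlier names, as in a regular definition.
letter = "[A-Za-z]"
digit = "[0-9]"
ident = f"{letter}(?:{letter}|{digit})*"   # id → letter (letter | digit)*

IDENT = re.compile(ident)

for text in ["x", "count", "x1", "a2b3", "9lives", "", "a_b"]:
    print(repr(text), bool(IDENT.fullmatch(text)))
```

The strings 9lives, the empty string and a_b are rejected: an identifier must start with a letter, must be non-empty, and the underscore is neither a letter nor a digit under this definition.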
Regular Definitions
• Shorthand
– One or more instances: r+ denotes rr*
– Zero or one instance: r? denotes r|ε
– Character classes: [a-z] denotes a|b|…|z

Example
• digit → 0 | 1 | 2 | … | 9
• digits → digit+
• optional_fraction → (. digits)?
• optional_exponent → (E (+ | -)? digits)?
• num → digits optional_fraction optional_exponent

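The shorthand definition of num maps directly onto Python's re syntax, which supports +, ? and character classes. The sketch below composes the pattern name by name; the sample inputs are made up for illustration.

```python
import re

digit = "[0-9]"
digits = f"{digit}+"                          # digits → digit+
optional_fraction = rf"(?:\.{digits})?"       # (. digits)?
optional_exponent = f"(?:E[+-]?{digits})?"    # (E (+ | -)? digits)?
num = f"{digits}{optional_fraction}{optional_exponent}"

NUM = re.compile(num)

for text in ["42", "3.14", "6.02E23", "1E-9", "1E+5", ".5", "1.", "1E"]:
    print(text, bool(NUM.fullmatch(text)))
```

Under this definition .5, 1. and 1E are not numbers: digits must come before the fraction, and both the fraction and the exponent require at least one digit once they are started.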
Limitations of Regular Expressions
• Some languages cannot be described by any regular expression
• Regular expressions cannot describe balanced or nested constructs
– Example: the set of all strings of balanced parentheses
– This can be done with a context-free grammar (CFG)
• Regular expressions cannot describe repeated strings
– Example: {wcw | w is a string of a's and b's}
– This language cannot be described by a CFG either
• Regular expressions can denote only a fixed number of repetitions or an unspecified number of repetitions of a construct
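
To see why balanced parentheses are out of reach, note that a recognizer has to remember how many parentheses are currently open, and that number is unbounded. A regular expression corresponds to a machine with a fixed, finite number of states, so it cannot keep such a count. The Python sketch below is a hand-written checker, not a regular expression; its counter plays the role of the stack a context-free parser would use. The function name balanced is illustrative.

```python
def balanced(text):
    """True if the parentheses in text are balanced and properly nested."""
    depth = 0  # number of currently open parentheses (unbounded in general)
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:   # a ')' with no matching '('
                return False
    return depth == 0       # every '(' must have been closed


for text in ["", "()", "(()())", "(()", "())(", "((()))"]:
    print(repr(text), balanced(text))
```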
