L02_Programming_RE plc

Regular expressions (regex) are formal languages used for matching, searching, and replacing text patterns without mathematical operations. They have a rich history dating back to the 1940s and are implemented in various programming languages, with different engines like PCRE and POSIX. The document covers the syntax, metacharacters, and practical applications of regex, including examples in Python.

Uploaded by

Rasha Elsayed Sakr


Regular Expressions in Programming
CSE 307 – Principles of Programming Languages
Stony Brook University
http://www.cs.stonybrook.edu/~cse307

What are Regular Expressions?
 Formal language representing a text pattern interpreted
by a regular expression processor
 Used for matching, searching and replacing text
 There are no variables and you cannot do
mathematical operations (such as: you cannot add
1+1) – it is not a programming language
 Frequently you will hear them called regex or RE for
short (or pluralized "regexes")

(c) Paul Fodor (CS Stony Brook)
What are Regular Expressions?
 Usage examples:
 Test if a phone number has the correct number of digits
 Test if an email address has the correct format
 Test if a Social Security Number is in the correct format
 Search a text for words that contain digits
 Find duplicate words in a text
 Replace all occurrences of "Bob" and "Bobby" with "Robert"
 Count the number of times "science" is preceded by
"computer" or "information"
 Convert a file's tab indentation to space indentation
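A few of the tasks above can be sketched with Python's re module. This is a minimal illustration only: the sample sentences and variable names are assumptions made up for the example, not part of the original slides.

```python
import re

# Search a text for words that contain digits.
words_with_digits = re.findall(r"\b\w*\d\w*\b",
                               "room B12 is next to lab 7a and the hall")

# Find duplicate (immediately repeated) words; \1 refers back to group 1.
duplicates = re.findall(r"\b(\w+) \1\b",
                        "this is is a test of the the parser")

# Replace all occurrences of "Bob" and "Bobby" with "Robert".
# The \b word boundaries keep "Bobcat" from being rewritten.
renamed = re.sub(r"\bBob(by)?\b", "Robert", "Bob met Bobby and Bobcat")

print(words_with_digits)  # ['B12', '7a']
print(duplicates)         # ['is', 'the']
print(renamed)            # Robert met Robert and Bobcat
```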

What are Regular Expressions?
 But what does "matches" mean?
 A text matches a regular expression if it is correctly
described by the regex
>>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
>>> m
<re.Match object; span=(0, 12), match='Isaac Newton'>

>>> m.group(0) # The entire match

'Isaac Newton'

>>> m.group(1) # The first parenthesized subgroup.

'Isaac'

>>> m.group(2) # The second parenthesized subgroup.

'Newton'
History of Regular Expressions
 1943: Warren McCulloch and Walter Pitts developed
models of how the nervous system works
 1956: Stephen Kleene described these models with an
algebra called "regular sets" and created a notation to
express them called "regular expressions"
 1968: Ken Thompson implements regular expressions in
ed, a Unix text editor
 Example: g/Regular Expression/p
 meaning Global Regular Expression Print (grep)
 g = global / whole file; p= print

History of Regular Expressions
 grep evolved into egrep
 but broke backward compatibility
 Therefore, in 1986, everyone came together and defined POSIX
(Portable Operating System Interface)
 Basic Regular Expressions (BREs)
 Extended Regular Expressions (EREs)
 1986: Henry Spencer releases the regex library in C
 Many incorporated it in other languages and tools
 1987: Larry Wall released Perl
 Used Spencer's regex library
 Added powerful features
 Everybody wanted to have it in their languages: Perl Compatible RE
(PCRE) library, Java, Javascript, C#/VB/.NET, MySQL, PHP,
Python, Ruby
Regular Expressions Engines
 Main versions / standards:
 PCRE
 POSIX BRE
 POSIX ERE
 The differences between them are subtle
 POSIX BRE is used mainly by older UNIX tools, for compatibility reasons
 In use:
 Unix (POSIX BRE, POSIX ERE)
 PHP (PCRE)
 Apache (v1: POSIX ERE, v2: PCRE)
 MySQL (POSIX ERE)
 Each of these languages is improving, so check their manuals

Python Regular Expressions
 https://docs.python.org/3/library/re.html
 re.split is more powerful than str.split:
>>> "ab bc cd".split()
['ab', 'bc', 'cd']
 Import the re module:
>>> import re
>>> re.split(" ", "ab bc cd")
['ab', 'bc', 'cd']
>>> re.split(r"\d", "ab1bc4cd")
['ab', 'bc', 'cd']
>>> re.split(r"\d*", "ab13bc44cd443gg")
['', 'a', 'b', '', 'b', 'c', '', 'c', 'd', '', 'g', 'g', '']
Python Regular Expressions
>>> re.split(r"\d+", "ab13bc44cd443gg")
['ab', 'bc', 'cd', 'gg']

>>> m = re.search('(?<=abc)def', 'abcdef')

>>> m
<re.Match object; span=(3, 6), match='def'>

Online Regular Expressions
 https://regexpal.com

Regular Expressions
 Strings:
 "car" matches "car"
 "car" also matches the first three letters in "cartoon"
 "car" does not match "c_a_r"
 Similar to search in a word processor
 Case-sensitive (by default): "car" does not match "Car"
 Metacharacters:
 Have a special meaning
 Like mathematical operators
 Transform char sequences into powerful patterns
 Only a few characters to learn: \ . * + - { } [ ] ( ) ^ $ | ? : ! =
 May have multiple meanings
 Depend on the context in which they are used
 Variation between regex engines
The wildcard character
 As in card games: a wildcard can stand in for any other card in the
pattern
Metacharacter Meaning
. Any character except newline

 Examples:
 "h.t" matches "hat", "hot", "hit"
 ".a.a.a" matches "banana", "papaya"
 "h.t" does not match "heat" or "Hot"
 Common mistake:
 "9.00" matches "9.00", but it also matches "9500" and "9-00"
 We should write regular expressions to match what we
want and ONLY what we want (we don't want to be overly
permissive, we don't want false positives, and we don't want the
regular expression to match what we are not looking for)
Escaping Metacharacters
 Allow the use of metacharacters as characters:
 "\." matches "."
Metacharacter Meaning
\ Escape the next character
 "9\.00" matches only "9.00", but not "9500" or "9-00"
 Match a backslash by escaping it with a backslash:
 "\\" matches only "\"
 "C:\\Users\\Paul" matches "C:\Users\Paul"
 Only for metacharacters
 Literal characters should not be escaped, because escaping may give them a special meaning, e.g., r"\n" matches a newline
 Sometimes we want both meanings
 Example: we want to match files of the name: "1_report.txt", "2_report.txt",…
 "._report\.txt" uses the first . as a wildcard and the second \. as the period itself
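The difference between the wildcard dot and the escaped dot can be checked directly in Python. The sketch below uses re.fullmatch so that the whole candidate string must match; the candidate lists and variable names are illustrative assumptions. re.escape is a standard-library helper that escapes the metacharacters in a literal string.

```python
import re

candidates = ["9.00", "9500", "9-00"]

# Unescaped dot: a wildcard, so all three candidates match.
loose = [s for s in candidates if re.fullmatch(r"9.00", s)]

# Escaped dot: only the literal period matches.
strict = [s for s in candidates if re.fullmatch(r"9\.00", s)]

# re.escape() builds a pattern that matches the given text literally,
# which saves escaping every backslash by hand.
path_pattern = re.escape(r"C:\Users\Paul")
path_ok = re.fullmatch(path_pattern, r"C:\Users\Paul") is not None

# Both meanings in one pattern: the first dot is a wildcard,
# the second (escaped) dot is the period itself.
files = ["1_report.txt", "2_report.txt", "1_reportXtxt"]
reports = [f for f in files if re.fullmatch(r"._report\.txt", f)]

print(loose)    # ['9.00', '9500', '9-00']
print(strict)   # ['9.00']
print(path_ok)  # True
print(reports)  # ['1_report.txt', '2_report.txt']
```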

Other special characters
 Tabs: \t
 Line returns: \r (carriage return), \n (newline), \r\n
 Unicode codes: \u00A9
 Hexadecimal codes: \xA9 (exactly two hex digits)

Character sets
Metacharacter Meaning
[ Begin character set
] End character set

 Matches any of the characters inside the set

 But only one character
 Order of characters does not matter
 Examples:
 "[aeiou]" matches a single vowel, such as: "a" or "e"
 "gr[ae]y" matches "gray" or "grey"
 "gr[ae]t" does not match "great"

Character ranges
 [a-z] = [abcdefghijklmnopqrstuvwxyz]
 The range metacharacter - denotes a character range only when it is inside a
character set; elsewhere it is a literal dash
 It represents all characters between two characters
 Examples:
 [0-9]
 [A-Za-z]
 [0-9A-Za-z]
 [0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9] matches phone "631-632-9820"
 [0-9][0-9][0-9][0-9][0-9] matches zip code "90210"
 [A-Z0-9][A-Z0-9][A-Z0-9] [A-Z0-9][A-Z0-9][A-Z0-9] matches strings shaped like Canadian postal codes,
such as "VC5 B6T"
 Caution:
 What is [50-99]?
 It is not {50,51,…,99}
 It is the same as [0-9]: the set already contains 5 and 9
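The behaviour of character sets and ranges, including the [50-99] pitfall, can be verified with a short Python sketch. The test strings and the names used here (such as fifty_to_99) are assumptions chosen for illustration.

```python
import re

# A set matches exactly one character taken from the set.
greys = re.findall(r"gr[ae]y", "gray grey griy great")

# [50-99] means: the character '5', the range '0'-'9', and the character '9'.
# It therefore matches the same single characters as [0-9].
digits = "0123456789"
same_as_digit_set = (re.findall(r"[50-99]", digits)
                     == re.findall(r"[0-9]", digits))

# The phone-number pattern from the examples above.
phone = re.compile(r"[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]")
phone_ok = phone.fullmatch("631-632-9820") is not None

# One way to match the numbers 50..99: describe each digit position.
fifty_to_99 = re.compile(r"[5-9][0-9]")
in_range = [n for n in ["49", "50", "75", "99", "100"]
            if fifty_to_99.fullmatch(n)]

print(greys)              # ['gray', 'grey']
print(same_as_digit_set)  # True
print(phone_ok)           # True
print(in_range)           # ['50', '75', '99']
```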
Negative Character sets
Metacharacter Meaning
^ Negate a character set
 Caret (^) = not one of several characters
 Add ^ as the first character inside a character set
 Still represents one character
 Examples:
 [^aeiou] matches any one character that is not a lower case vowel
 [^aeiouAEIOU] matches any one character that is not a vowel (non-vowel)
 [^a-zA-Z] matches any one character that is not a letter
 see[^mn] matches "seek" and "sees", but not "seem" or "seen"
 see[^mn] matches "see " because space matches [^mn]
 see[^mn] does not match "see" because there is no character after "see"
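A negated set still consumes one character, which is why "see" alone fails. The following sketch (the results dictionary is an illustrative name) runs the pattern against each of the strings discussed above.

```python
import re

pattern = re.compile(r"see[^mn]")

# Map each test string to whether the pattern is found in it.
results = {text: pattern.search(text) is not None
           for text in ["seek", "sees", "seem", "seen", "see ", "see"]}

for text, found in results.items():
    print(repr(text), found)
```

Only "seek", "sees" and "see " (with a trailing space) are reported as matches.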

Metacharacters
 Metacharacters inside Character sets are already escaped:
 Do not need to escape them again
 Examples:
 h[o.]t matches "hot" and "h.t"
 Exceptions: metacharacters that have to do with character sets: ]-^\
 Examples:
 [[\]] matches "[" or "]"
 var[[(][0-9][)\]] matches "var(1)" or "var[2]"

 Exception to exception: "10[-/]10" matches "10-10" or "10/10"

 - does not need to be escaped because it is not a range
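These rules can be tried out in Python, with one caveat: Python's re module issues a FutureWarning ("possible nested set") when an unescaped [ appears inside a character set, so the sketch below writes \[ instead. The sample strings and variable names are assumptions for illustration.

```python
import re

# A dot inside a set is a literal period, not a wildcard.
dot_in_set = re.findall(r"h[o.]t", "hot h.t hat")

# A set containing both bracket characters; ] must be escaped,
# and [ is escaped here only to avoid Python's nested-set warning.
brackets = re.findall(r"[\[\]]", "a[b]c")

# An opening bracket or parenthesis, one digit, then a closing one.
indexed = re.findall(r"var[\[(][0-9][)\]]", "var(1) var[2] var() var{3}")

# A hyphen placed first in the set cannot form a range, so it is literal.
dates = re.findall(r"10[-/]10", "10-10 10/10 10.10")

print(dot_in_set)  # ['hot', 'h.t']
print(brackets)    # ['[', ']']
print(indexed)     # ['var(1)', 'var[2]']
print(dates)       # ['10-10', '10/10']
```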

Shorthand character sets
Shorthand  Meaning             Equivalent
\d         Digit               [0-9]
\w         Word character      [a-zA-Z0-9_]
\s         Whitespace          [ \t\n\r]
\D         Not digit           [^0-9]
\W         Not word character  [^a-zA-Z0-9_]
\S         Not whitespace      [^ \t\n\r]

 Underscore (_) is a word character
 Hyphen (-) is not a word character
 Shorthand sets were introduced in Perl and are not supported in many Unix tools
 "\d\d\d" matches "123"
 "\w\w\w" matches "123" and "ABC" and "1_A"
 "\w\s\w\w" matches "I am", but not "Am I"
 "[^\d]" matches "a"
 "[^\d\w]" is not the same as "[\D\W]" (the latter accepts "a")
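The last point is easy to get wrong, so here is a small Python check. [^\d\w] means "neither a digit nor a word character", while [\D\W] means "not a digit, or not a word character", which accepts letters. The sample string is an assumption for illustration.

```python
import re

sample = "a1_ -"

# Neither a digit nor a word character: only the space and the hyphen.
neither = re.findall(r"[^\d\w]", sample)

# Not a digit OR not a word character: everything except the digit.
either = re.findall(r"[\D\W]", sample)

# \w\s\w\w needs one word character, whitespace, then two word characters.
i_am = re.fullmatch(r"\w\s\w\w", "I am") is not None
am_i = re.fullmatch(r"\w\s\w\w", "Am I") is not None

print(neither)     # [' ', '-']
print(either)      # ['a', '_', ' ', '-']
print(i_am, am_i)  # True False
```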
POSIX Bracket Expressions

Repetition
Metacharacter Meaning
* Preceding item zero or more times
+ Preceding item one or more times
? Preceding item zero or one time

 Examples:
 apples* matches "apple" and "apples" and "applessssssss"
 apples+ matches "apples" and "applessssssss"
 apples? matches "apple" and "apples"
 \d* matches "123"
 colou?r matches "color" and "colour"
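The three repetition metacharacters can be compared side by side in Python. The helper full_matches and the word list are hypothetical names introduced for this sketch; re.fullmatch requires the whole word to match.

```python
import re

words = ["apple", "apples", "applessss"]

def full_matches(pattern):
    """Return the words that the pattern matches in their entirety."""
    return [w for w in words if re.fullmatch(pattern, w)]

star = full_matches(r"apples*")      # zero or more "s"
plus = full_matches(r"apples+")      # one or more "s"
optional = full_matches(r"apples?")  # zero or one "s"

# The optional "u" accepts both spellings but not a doubled "u".
spellings = re.findall(r"colou?r", "color colour colouur")

print(star)       # ['apple', 'apples', 'applessss']
print(plus)       # ['apples', 'applessss']
print(optional)   # ['apple', 'apples']
print(spellings)  # ['color', 'colour']
```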

Quantified Repetition
Metacharacter Meaning
{ Start quantified repetition of preceding item
} End quantified repetition of preceding item

 {min,max}
 min and max must be non-negative integers
 min should always be included (some engines, such as Python's re, treat an omitted min as 0)
 min can be 0
 max is optional
 Syntaxes:
 \d{4,8} matches numbers with 4 to 8 digits
 \d{4} matches numbers with exactly 4 digits
 \d{4,} matches numbers with minimum 4 digits
 \d{0,} is equivalent to \d*
 \d{1,} is equivalent to \d+
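The quantified forms above can be exercised with a short Python sketch. The list of digit strings and the helper full_matches are assumptions made for the example.

```python
import re

numbers = ["123", "1234", "12345678", "123456789"]

def full_matches(pattern):
    """Return the digit strings matched in their entirety by pattern."""
    return [n for n in numbers if re.fullmatch(pattern, n)]

four_to_eight = full_matches(r"\d{4,8}")  # between 4 and 8 digits
exactly_four = full_matches(r"\d{4}")     # exactly 4 digits
at_least_four = full_matches(r"\d{4,}")   # 4 digits or more

# {0,} behaves like * and {1,} behaves like +.
star_equivalent = full_matches(r"\d{0,}") == full_matches(r"\d*")
plus_equivalent = full_matches(r"\d{1,}") == full_matches(r"\d+")

print(four_to_eight)   # ['1234', '12345678']
print(exactly_four)    # ['1234']
print(at_least_four)   # ['1234', '12345678', '123456789']
print(star_equivalent, plus_equivalent)  # True True
```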
Greedy Expressions
 Standard repetition quantifiers are greedy:
 expressions try to match the longest possible string
 \d* matches the entire string "1234" and not just "123", "1",
or "23"
 Lazy expressions:
 match as little as possible before giving control to the next
part of the expression
 ? makes the preceding quantifier into a lazy quantifier
 *?
 +?
 {min,max}?
 ??
 Example:
 "apples??" matches "apple" in "apples"
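Greedy and lazy quantifiers differ only in how much text they consume, which a Python sketch makes visible. The tag-like sample string is an assumption for illustration.

```python
import re

tags = "<spam><ham>"

# Greedy: .* runs to the last ">" it can reach.
greedy = re.search(r"<.*>", tags).group()

# Lazy: .*? stops at the first ">" that lets the pattern succeed.
lazy = re.search(r"<.*?>", tags).group()

# ? is greedy (takes the "s"); ?? is lazy (leaves it out).
greedy_optional = re.search(r"apples?", "apples").group()
lazy_optional = re.search(r"apples??", "apples").group()

# A lazy \d*? at the start of a string is satisfied by the empty string.
greedy_digits = re.search(r"\d*", "1234").group()
lazy_digits = re.search(r"\d*?", "1234").group()

print(greedy)           # <spam><ham>
print(lazy)             # <spam>
print(greedy_optional)  # apples
print(lazy_optional)    # apple
print(repr(greedy_digits), repr(lazy_digits))  # '1234' ''
```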
Grouping metacharacters
Metacharacter Meaning
( Start grouped expression
) End grouped expression

 Group a large part to apply repetition to it

 "(abc)*" matches "abc" and "abcabcabc"
 "(in)?dependent" matches "dependent" and "independent"
 Makes expressions easier to read
 Cannot be used inside character sets
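A brief Python sketch shows what grouping changes: with parentheses the quantifier applies to the whole group, and without them only to the preceding character. The variable names are illustrative assumptions.

```python
import re

# The quantifier applies to the whole group "abc".
grouped = re.fullmatch(r"(abc)*", "abcabcabc") is not None

# Without parentheses, * applies only to the preceding "c".
ungrouped = re.fullmatch(r"abc*", "abcabcabc") is not None
trailing_c = re.fullmatch(r"abc*", "abccc") is not None

# An optional group makes the prefix "in" optional.
both = [w for w in ["dependent", "independent", "undependent"]
        if re.fullmatch(r"(in)?dependent", w)]

# Inside a character set, parentheses are ordinary literal characters.
literal_parens = re.findall(r"[(ab)]", "(a)")

print(grouped, ungrouped, trailing_c)  # True False True
print(both)                            # ['dependent', 'independent']
print(literal_parens)                  # ['(', 'a', ')']
```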

Metacharacters
 $ Matches the ending position of the string or the position
just before a string-ending newline.
 In line-based tools, it matches the ending position of any line.
 [hc]at$ matches "hat" and "cat", but only at the end of the string or line.
 ^ Matches the beginning of a line or string.
 | The choice (also known as alternation or set union) operator matches
either the expression before or the expression after the operator.
 For example, abc|def matches "abc" or "def".
 \A Matches the beginning of a string (but not an internal line).
 \z Matches the end of a string (but not an internal line); Python's re module spells this anchor \Z.
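In Python, ^ and $ match only at the ends of the whole string unless the re.MULTILINE flag is given, whereas \A and \Z never match at internal lines. The two-line sample text below is an assumption for illustration.

```python
import re

text = "that cat\nbig hat"

# Default mode: $ matches only at the end of the string.
end_of_string = re.findall(r"[hc]at$", text)

# MULTILINE mode: $ also matches at the end of each line.
end_of_line = re.findall(r"[hc]at$", text, re.MULTILINE)

# ^ matches at every line start in MULTILINE mode; \A only at the string start.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
string_start = re.findall(r"\A\w+", text, re.MULTILINE)

# \Z matches only at the very end of the string, even in MULTILINE mode.
string_end = re.findall(r"\w+\Z", text, re.MULTILINE)

# Alternation: either side of | may match.
either = re.findall(r"abc|def", "abcdef")

print(end_of_string)  # ['hat']
print(end_of_line)    # ['cat', 'hat']
print(line_starts)    # ['that', 'big']
print(string_start)   # ['that']
print(string_end)     # ['hat']
print(either)         # ['abc', 'def']
```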

Summary: Frequently Used Regular Expressions

Python match and search Functions
 re.match(r, s) returns a match object if the regex r
matches at the start of string s
import re

# Three digits, two digits, four digits, separated by hyphens.
regex = r"\d{3}-\d{2}-\d{4}"
ssn = input("Enter SSN: ")
match1 = re.match(regex, ssn)  # anchored at the start of the string
if match1 is not None:
    print(ssn, "is a valid SSN")
    print("start position of the matched text is "
          + str(match1.start()))
    print("start and end position of the matched text is "
          + str(match1.span()))
else:
    print(ssn, "is not a valid SSN")

Enter SSN: 123-12-1234 more text

123-12-1234 more text is a valid SSN
start position of the matched text is 0
start and end position of the matched text is (0, 11)
Python match and search Functions
 Invoking re.match returns a match object if the string
matches the regex pattern at the start of the string.
 Otherwise, it returns None.
 The program checks whether there is a match.
 If so, it invokes the match object's start() method to return
the start position of the matched text in the string and the
span() method to return the start and end positions of the
matched text in a tuple.
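Because re.match only anchors the pattern at the start of the string, input such as "123-12-1234 more text" is accepted. When the whole input must be an SSN, one option is re.fullmatch, which is part of the standard re module. The function name is_valid_ssn below is a hypothetical helper introduced for this sketch.

```python
import re

SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")

def is_valid_ssn(text):
    """Return True only if the entire text is in SSN format."""
    return SSN_PATTERN.fullmatch(text) is not None

# match() succeeds as long as the string starts with an SSN.
starts_with_ssn = SSN_PATTERN.match("123-12-1234 more text") is not None

print(starts_with_ssn)                        # True
print(is_valid_ssn("123-12-1234 more text"))  # False
print(is_valid_ssn("123-12-1234"))            # True
```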

Python match and search Functions
 re.search(r, s) returns a match object if the regex r matches
anywhere in string s
import re

# Three digits, two digits, four digits, separated by hyphens.
regex = r"\d{3}-\d{2}-\d{4}"
text = input("Enter a text: ")
match1 = re.search(regex, text)  # scans the whole string for a match
if match1 is not None:
    print(text, "contains a valid SSN")
    print("start position of the matched text is "
          + str(match1.start()))
    print("start and end position of the matched text is "
          + str(match1.span()))
else:
    print(text, "does not contain a valid SSN")

Enter a text: The ssn for Smith is 343-34-3490

The ssn for Smith is 343-34-3490 contains a valid SSN
start position of the matched text is 21
start and end position of the matched text is (21, 32)
Flags
 For the functions in the re module, an optional flags parameter
can be used to modify how the pattern is matched
 For example, in the following statement
re.search("a{3}", "AaaBe", re.IGNORECASE)
the string "AaaBe" matches the pattern a{3} case-insensitively
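The effect of a flag is easiest to see by running the same search with and without it. The sketch below also shows two related standard features of Python's re module: combining flags with | and writing a flag inline as (?i). The multi-line sample string is an assumption for illustration.

```python
import re

# Without the flag the search is case-sensitive and finds nothing.
without_flag = re.search(r"a{3}", "AaaBe")

# With re.IGNORECASE, "Aaa" counts as three a's.
with_flag = re.search(r"a{3}", "AaaBe", re.IGNORECASE).group()

# Flags are combined with the bitwise OR operator.
lines = "Apple\navocado\nbanana"
a_words = re.findall(r"^a\w*", lines, re.IGNORECASE | re.MULTILINE)

# The same flag can be written inline at the start of the pattern.
inline = re.search(r"(?i)a{3}", "AaaBe").group()

print(without_flag)  # None
print(with_flag)     # Aaa
print(a_words)       # ['Apple', 'avocado']
print(inline)        # Aaa
```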

Findall
 findall(pattern, string [, flags]) returns a list of
strings giving all nonoverlapping matches of pattern in string. If the
pattern has one group, it returns a list of that group's matches; if the
pattern has more than one group, it returns a list of tuples
>>> re.findall('<(.*?)>','<spam> /<ham><eggs>')
['spam', 'ham', 'eggs']
>>> re.findall('<(.*?)>/?<(.*?)>',
'<spam>/<ham> ... <eggs><cheese>')
[('spam', 'ham'), ('eggs', 'cheese')]

Sub and Compile
 sub(pattern, repl, string [, count, flags])
returns the string obtained by replacing the (first count) leftmost
nonoverlapping occurrences of pattern (a string or a pattern object) in
string by repl (which may be a string with backslash escapes that may
back-reference a matched group, or a function that is passed a single match
object and returns the replacement string).
 compile(pattern [, flags]) compiles a regular expression
pattern string into a regular expression pattern object, for later matching.
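The forms of sub and compile described above can be sketched as follows. The sample strings and the helper function double are illustrative assumptions; the calls themselves are standard re functions.

```python
import re

# Backslash escapes in repl refer back to matched groups.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "Isaac Newton")

# count limits how many leftmost occurrences are replaced.
first_only = re.sub(r"\d", "#", "a1b2c3", count=1)

def double(match):
    """Replacement function: receives a match object, returns a string."""
    return str(int(match.group()) * 2)

doubled = re.sub(r"\d+", double, "3 apples and 10 pears")

# A compiled pattern object offers the same operations as methods.
ssn = re.compile(r"\d{3}-\d{2}-\d{4}")
masked = ssn.sub("XXX-XX-XXXX", "The ssn for Smith is 343-34-3490")

print(swapped)     # Newton Isaac
print(first_only)  # a#b2c3
print(doubled)     # 6 apples and 20 pears
print(masked)      # The ssn for Smith is XXX-XX-XXXX
```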

Groups
 Groups: extract substrings matched by REs in '()' parts
 (R) Matches any regular expression inside (), and delimits a group (retains
matched substring)
 \N Matches the contents of the group of the same number N: '(.+) \1' matches
“42 42”
import re
patt = re.compile("A(.)B(.)C(.)") # saves 3 substrings
mobj = patt.match("A0B1C2") # each '()' is a group, 1..n
print(mobj.group(1), mobj.group(2), mobj.group(3))
patt = re.compile("A(.*)B(.*)C(.*)") # saves 3 substrings
mobj = patt.match("A000B111C222") # groups() gives all groups
print(mobj.groups())
print(re.search("(A|X)(B|Y)(C|Z)D", "..AYCD..").groups())
print(re.search("(?P<a>A|X)(?P<b>B|Y)(?P<c>C|Z)D",
"..AYCD..").groupdict())
patt = re.compile(r"[\t ]*#\s*define\s*([a-z0-9_]*)\s*(.*)")
mobj = patt.search(" # define spam 1 + 2 + 3") # parts of C #define
print(mobj.groups()) # \s is whitespace

Groups
python re-groups.py
0 1 2
('000', '111', '222')
('A', 'Y', 'C')
{'a': 'A', 'b': 'Y', 'c': 'C'}
('spam', '1 + 2 + 3')

Groups
 When a match or search function or method is successful, you get back a
match object
 group(g) group(g1, g2, ...) Return the substring that matched a
parenthesized group (or groups) in the pattern. Accept group numbers or names.
Group numbers start at 1; group 0 is the entire string matched by the pattern. Returns
a tuple when passed multiple group numbers, and group number defaults to 0 if
omitted
 groups() Returns a tuple of all groups’ substrings of the match (for
group numbers 1 and higher).
 start([group]) end([group]) Indices of the start and end of the
substring matched by group (or the entire matched string, if no group is
passed).
 span([group]) Returns the two-item tuple: (start(group),
end(group))
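The match object methods listed above can be tried on a single search. The sample text and the group names first and last are assumptions chosen for this sketch.

```python
import re

text = ">> Isaac Newton, physicist"
m = re.search(r"(?P<first>\w+) (?P<last>\w+)", text)

whole = m.group()             # group number defaults to 0: the entire match
pair = m.group(1, 2)          # several groups at once give a tuple
by_name = m.group("last")     # groups can also be accessed by name
all_groups = m.groups()       # groups 1 and higher, as a tuple

whole_span = m.span()         # (start, end) of the entire match
first_span = m.span(1)        # (start, end) of group 1
last_start = m.start("last")  # start index of the named group
last_end = m.end("last")      # end index of the named group

print(whole)       # Isaac Newton
print(pair)        # ('Isaac', 'Newton')
print(by_name)     # Newton
print(all_groups)  # ('Isaac', 'Newton')
print(whole_span, first_span, last_start, last_end)  # (3, 15) (3, 8) 9 15
```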
