Regular Expressions Cheat Sheet
Learn regular expressions online at www.DataCamp.com

A regular expression (regex or regexp) is a pattern of characters that describes a piece of text. To process regexes, you use a “regex engine.” Each engine uses a slightly different syntax, called a regex flavor. Python and R, two common programming languages discussed on DataCamp, each have their own engine.

Since regex describes patterns of text, it can be used to check for the existence of patterns in a text, extract substrings from longer strings, and help make adjustments to text. A regex can be very simple, describing a specific word, or more advanced, finding general patterns of characters such as the top-level domain in a URL.

> Definitions
Literal character: A literal character is the most basic regular expression you can use. It simply matches the actual character you write. So if you are trying to represent an “r,” you would write r.
Metacharacter: Metacharacters signify to the regex engine that a character has a special meaning rather than its literal one. Many are written with a \ in front, and they can do things like signify the start or end of a line.
Character class: A character class tells the engine to match one of a set of characters. It is signified by [ and ] with the characters you are looking for in the middle of the brackets.
Capture group: A capture group is signified by opening and closing round parentheses. Capture groups allow you to group regexes together to apply other regex features, like quantifiers (see below), to the group.

> Anchors
Syntax  Description                                     Example pattern  Example matches  Example non-matches
^       match start of line                             ^r               rabbit, raccoon  parrot, ferret
$       match end of line                               t$               rabbit, foot     trap, star
\A      match start of string                           \Ar              rabbit           parrot
\Z      match end of string                             t\Z              foot             star
\b      match characters at the start or end of a word  \bfox\b          the fox ate      foxskin scarf
\B      match characters in the middle of a word        \Bee\B           beef             tree

> Matching types of character
Rather than matching specific characters, you can match specific types of characters such as letters, numbers, and more.
Syntax          Description                                   Example pattern  Example matches               Example non-matches
.               match anything except a line break            c.e              clean, cheap                  acert, cent
\d              match a digit                                 \d               6060-842, 2b|^2b              two, **___
\D              match a non-digit                             \D               The 5 cats ate, 12 Angry men  52, 10032
\w              match word characters                         \wee\w           trees, bee4                   The bee, eels eat meat
\W              match non-word characters                     \Wbat\W          Swing the bat fast            bat53
\s              match whitespace                              \sfox\s          the fox ate, his fox ran      it's the fox., foxfur
\metacharacter  escape a metacharacter to match it literally  \., \^           The cat ate., 2^3             the cat ate, 23

> Character classes
Character classes are sets or ranges of characters.
Syntax  Description                         Example pattern  Example matches  Example non-matches
[xy]    match one of several characters     gr[ea]y          gray, grey       green, greek
[x-y]   match one of a range of characters  [a-e]            amber, brand     fox, join
[^xy]   match one character not listed      gre[^ay]         green, greek     gray, grey

> Repetition
Rather than matching single instances of characters, you can match repeated characters.
Syntax          Description                                          Example pattern  Example matches  Example non-matches
x*              match zero or more times                             ar*o             carrot           artichoke
x+              match one or more times                              re+              green, tree      trap, ruined
x?              match zero or one times                              ro?a             roast, rant      root, rear
x{m}            match m times                                        \we{2}\w         deer, seer       red, enter
x{m,}           match m or more times                                2{3,}4           2222224          123
x{m,n}          match between m and n times                          12{1,3}3         1222384          1222223
x*?, x+?, etc.  match the minimum number of times (lazy quantifier)  re+?             freeeee          roasted

> Capturing, alternating & backreferences
In order to extract specific parts of a string, you can capture those parts, and even name the parts that you captured.
Syntax      Description                                    Example pattern               Example matches             Example non-matches
(x)         capture a pattern                              (iss)+                        Mississippi, missed         mist, persist
(?:x)       create a group without capturing               (?:ab)(cd)                    abcd (Group 1: cd)
(?<name>x)  create a named capture group                   (?<first>\d)(?<second>\d)\d*  1325 (first: 1, second: 3)  2, hello
(x|y)       match several alternative patterns             (re|ba)                       red, banter                 rant, bear
\n          reference capture n (group index starts at 1)  (b)(\w*)\1                    blob, bribe                 bear, bring

> Lookahead
You can specify that specific characters must appear before or after your match, without including those characters in the match.
Syntax  Description                                               Example pattern       Example matches      Example non-matches
(?=x)   looks ahead at the next characters without using them     an(?=an), iss(?=ipp)  banana, Mississippi  band, missed
(?!x)   looks ahead at the next characters, which must not match  ai(?!n)               fail, brail          faint, train
(?<=x)  looks at the previous characters without using them       (?<=tr)a              trail, translate     bear, streak

> Literal matches and modifiers
Modifiers are settings that change the way the matching rules work.
Syntax      Description                                           Example pattern  Example matches  Example non-matches
(?i)x(?-i)  match case-insensitively                              (?i)te(?-i)      sTep, tEach      Trench, bear
(?x)x(?-x)  match ignoring whitespace                             (?x)t a p(?-x)   tap, tapdance    c a t, rot a potato
(?s)x(?-s)  DOTALL mode, which makes the “.” include line breaks
(?m)x(?-m)  makes ^ and $ match per line rather than per string   ^eat and sleep$  eat and sleep

> Unicode
Grapheme: A grapheme is what a reader sees as a single character. Each grapheme is made up of one or more code points in a sequence.
Syntax  Description                                        Example pattern  Example matches            Example non-matches
\X      match graphemes                                    \u0040gmail      @gmail, www.email@gmail    gmail, @aol
\X      match special characters like ones with an accent  \u0065\u0300
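
The anchors on this sheet (^, $, \A, \b, \B) can be tried out with a short Python sketch. It assumes Python's built-in re module and reuses the sheet's own example words. The variable names are illustrative only, and other regex flavors may treat some anchors slightly differently.

```python
import re

# ^ anchors a pattern to the start of the text, $ to the end.
starts_with_r = [w for w in ["rabbit", "raccoon", "parrot", "ferret"] if re.search(r"^r", w)]
ends_with_t = [w for w in ["rabbit", "foot", "trap", "star"] if re.search(r"t$", w)]

# \b needs a word boundary on both sides of "fox"; \B needs "ee" to sit
# strictly inside a run of word characters.
whole_word_fox = [s for s in ["the fox ate", "foxskin scarf"] if re.search(r"\bfox\b", s)]
inner_ee = [w for w in ["beef", "tree"] if re.search(r"\Bee\B", w)]

# With re.MULTILINE, ^ also matches right after each line break, while \A
# still matches only at the very start of the string.
text = "parrot\nrabbit"
caret_hits = re.findall(r"^r\w+", text, flags=re.MULTILINE)
string_start_hits = re.findall(r"\Ar\w+", text, flags=re.MULTILINE)

print(starts_with_r, ends_with_t, whole_word_fox, inner_ee)
print(caret_hits, string_start_hits)
```

The last two lines show why \A is better described as "start of string" than "start of line": in multiline mode ^ finds "rabbit" on the second line, while \A finds nothing.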
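
Character types (\d, \D, \w, \s), bracketed character classes, and escaping can be checked the same way. The following is a minimal sketch using Python's re module and the sheet's example strings. The variable names are made up for illustration.

```python
import re

# Shorthand types: \d is a digit, \D a non-digit, \s whitespace.
digits = re.findall(r"\d", "6060-842")
non_digit_runs = re.findall(r"\D+", "The 5 cats ate")
spaced_fox = re.findall(r"\sfox\s", "the fox ate")

# \wee\w needs a word character on both sides of "ee".
word_hits = re.findall(r"\wee\w", "trees bee4 The bee")

# Bracketed classes match exactly one character from a set or range.
gray_or_grey = [w for w in ["gray", "grey", "green", "greek"] if re.search(r"gr[ea]y", w)]
has_a_to_e = [w for w in ["amber", "brand", "fox", "join"] if re.search(r"[a-e]", w)]

# Escaping a metacharacter makes it match literally.
literal_dot = [s for s in ["The cat ate.", "the cat ate"] if re.search(r"\.", s)]
literal_caret = re.findall(r"\^", "2^3")

print(digits, non_digit_runs, spaced_fox, word_hits)
print(gray_or_grey, has_a_to_e, literal_dot, literal_caret)
```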
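
The difference between greedy and lazy quantifiers is easiest to see by comparing what each one actually captures. This sketch assumes Python's re module. The patterns come from the repetition examples on this sheet, and the variable names are illustrative.

```python
import re

# Greedy quantifiers take as much as they can, lazy ones as little.
greedy = re.search(r"re+", "freeeee").group()
lazy = re.search(r"re+?", "freeeee").group()

# ? makes the preceding item optional.
optional_o = [re.search(r"ro?a", w).group() for w in ["roast", "rant"]]

# {m}, {m,} and {m,n} bound the number of repetitions.
exactly_two = [w for w in ["deer", "seer", "red", "enter"] if re.search(r"\we{2}\w", w)]
three_or_more = re.search(r"2{3,}4", "2222224").group()
one_to_three = [s for s in ["1222384", "1222223"] if re.search(r"12{1,3}3", s)]

print(greedy, lazy, optional_o)
print(exactly_two, three_or_more, one_to_three)
```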
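
Capture groups, alternation, and backreferences can be sketched as follows. One flavor difference to be aware of: Python's re module writes named groups as (?P<name>...) rather than the (?<name>...) form shown on this sheet, so the pattern below is adapted accordingly. The variable names are illustrative.

```python
import re

# Named groups: Python's re module spells them (?P<name>...).
m = re.search(r"(?P<first>\d)(?P<second>\d)\d*", "1325")
named = (m.group(0), m.group("first"), m.group("second"))

# A non-capturing group (?:...) groups without creating a numbered capture.
only_cd = re.search(r"(?:ab)(cd)", "abcd").groups()

# A repeated group keeps only its last repetition as the captured text.
iss = re.search(r"(iss)+", "Mississippi")
iss_result = (iss.group(0), iss.group(1))

# Alternation, and a backreference (\1) to whatever group 1 captured.
alternatives = [w for w in ["red", "banter", "rant", "bear"] if re.search(r"(re|ba)", w)]
b_again = [w for w in ["blob", "bribe", "bear", "bring"] if re.search(r"(b)(\w*)\1", w)]

print(named, only_cd, iss_result)
print(alternatives, b_again)
```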
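
Lookahead and lookbehind check the surrounding text without adding it to the match. The sketch below, again assuming Python's re module and illustrative variable names, prints the matched text and its position to show that the looked-at characters are not included.

```python
import re

# Positive lookahead: "iss" only where "ipp" follows; "ipp" is not consumed.
hit = re.search(r"iss(?=ipp)", "Mississippi")
lookahead_result = (hit.group(), hit.span())
banana_hits = re.findall(r"an(?=an)", "banana")

# Negative lookahead: "ai" only where the next character is not "n".
not_before_n = [w for w in ["fail", "brail", "faint", "train"] if re.search(r"ai(?!n)", w)]

# Positive lookbehind: "a" only where "tr" comes directly before it.
after_tr = [w for w in ["trail", "translate", "bear", "streak"] if re.search(r"(?<=tr)a", w)]

print(lookahead_result, banana_hits)
print(not_before_n, after_tr)
```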
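
Modifier syntax varies between flavors. In Python's re module, the usual way to apply a modifier is either a flag argument (re.IGNORECASE, re.VERBOSE, re.DOTALL, re.MULTILINE) or a scoped inline group such as (?i:...), rather than the (?i)x(?-i) on/off pair shown on this sheet. The following sketch makes that assumption; the sample strings "first\nsecond" and "work\neat and sleep\nrepeat" are invented for illustration.

```python
import re

# Case-insensitive matching, scoped to the group (?i:...).
case_hits = [w for w in ["sTep", "tEach", "Trench", "bear"] if re.search(r"(?i:te)", w)]

# re.VERBOSE ignores whitespace in the pattern, so "t a p" means "tap".
verbose_hits = [s for s in ["tap", "tapdance", "c a t", "rot a potato"]
                if re.search(r"t a p", s, flags=re.VERBOSE)]

# By default "." does not match a line break; re.DOTALL lets it.
two_lines = "first\nsecond"
dot_default = re.search(r"first.second", two_lines)
dot_dotall = re.search(r"first.second", two_lines, flags=re.DOTALL)

# re.MULTILINE makes ^ and $ match at the start and end of every line.
schedule = "work\neat and sleep\nrepeat"
whole_string_only = re.findall(r"^eat and sleep$", schedule)
per_line = re.findall(r"^eat and sleep$", schedule, flags=re.MULTILINE)

print(case_hits, verbose_hits)
print(dot_default, dot_dotall is not None, whole_string_only, per_line)
```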
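
The idea of a grapheme built from more than one code point can be illustrated with the accented letter è, which can be stored either as one code point (\u00e8) or as e followed by a combining accent (\u0065\u0300). Python's built-in re module does not provide \X, so this sketch only shows the code point counts using the standard library. Matching whole graphemes with \X would need an engine that supports it, such as the third-party regex package.

```python
import re
import unicodedata

composed = "\u00e8"          # one code point
decomposed = "\u0065\u0300"  # "e" plus a combining grave accent

lengths = (len(composed), len(decomposed))

# "." matches one code point, so the decomposed form needs two of them.
one_dot = re.fullmatch(r".", decomposed)
two_dots = re.fullmatch(r"..", decomposed)

# NFC normalization combines the pair into the single code point form.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

print(lengths, one_dot, two_dots is not None, same_after_nfc)
```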