Regular Expressions
Regular expressions are patterns that allow you to describe, match, or parse text. With regular expressions, you can
find and replace text, verify that input data follows a required format, and perform similar text-processing tasks.
Here's a scenario: you want to verify that the telephone number entered by a user on a form matches a format,
say, ###-###-#### (where # represents a number). One way to solve this could be:
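A manual check of that kind might look like the sketch below. The function name isPatternManual is hypothetical, and the character-by-character loop is just one assumed way of doing the check without regex.

```javascript
// Hypothetical manual check (no regex): expects exactly ###-###-####
function isPatternManual(userInput) {
  if (typeof userInput !== 'string' || userInput.length !== 12) {
    return false;
  }
  for (let i = 0; i < userInput.length; i++) {
    const ch = userInput[i];
    if (i === 3 || i === 7) {
      // Positions 3 and 7 must hold the dashes
      if (ch !== '-') return false;
    } else if (ch < '0' || ch > '9') {
      // Every other position must be a digit 0-9
      return false;
    }
  }
  return true;
}

console.log(isPatternManual('555-123-4567')); // true
console.log(isPatternManual('5551234567'));   // false
```

It works, but the format is buried in index arithmetic, which makes it harder to read and to change.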
Alternatively, we can use a regular expression here like this:
function isPattern(userInput) {
  return /^\d{3}-\d{3}-\d{4}$/.test(userInput);
}
Notice how we’ve refactored the code using regex. Amazing, right? That is the power of regular expressions.
How to Create A Regular Expression
In JavaScript, you can create a regular expression in either of two ways:
Method #1: using a regular expression literal. This consists of a pattern enclosed in forward slashes. You
can write this with or without a flag (we will see what flag means shortly). The syntax is as follows:
const regExpLiteral = /pattern/; // Without flags

// Example with an assumed sample string and pattern:
const regExpStr = 'Hello world';
console.log(regExpStr.match(/hello/));
// Output: null
We get null because the characters in the string do not appear as specified in the pattern. A literal pattern such
as /hello/ means h followed by e followed by l followed by l followed by o, in exactly that order and case.
How to use a regex constructor:
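// Syntax: RegExp(pattern [, flags])
const str = 'Hello world'; // sample string, assumed for illustration
const regExpConstructor = new RegExp('hello', 'i'); // sample pattern, assumed
console.log(str.match(regExpConstructor));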
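The constructor is most useful when the pattern is only known at run time. Here is a minimal sketch; the variable names and the search word are hypothetical. Note that a backslash has to be doubled when the pattern is written as a string.

```javascript
// A literal and a constructor call that describe the same pattern.
// Inside a string, each backslash must be written twice.
const fromLiteral = /\d{3}-\d{4}/;
const fromConstructor = new RegExp('\\d{3}-\\d{4}');
console.log(fromLiteral.source === fromConstructor.source); // true

// Hypothetical run-time value used to build a pattern dynamically.
const searchWord = 'cat';
const startsWithWord = new RegExp('^' + searchWord, 'i'); // 'i' = ignore case
console.log(startsWithWord.test('Cat and mouse'));     // true
console.log(startsWithWord.test('The cat and mouse')); // false
```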
const regexPattern1 = /^cat/; // ^ anchors the pattern to the start of the line
console.log(regexPattern1.test('The cat and mouse')); // Output: false because the line does not start with cat
// Syntax 2: /...\b/
// Syntax 3: /\b....\b/
// Search for a stand-alone word that begins and ends with the pattern ward
const regexPattern3 = /\bward\b/gi;
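The three word-boundary forms can be compared side by side. This is a small sketch with assumed sample words; the variable names are illustrative only.

```javascript
const startsWord = /\bward/i;   // 'ward' at the start of a word
const endsWord = /ward\b/i;     // 'ward' at the end of a word
const standAlone = /\bward\b/i; // 'ward' as a whole word

console.log(startsWord.test('wardrobe'));         // true
console.log(endsWord.test('backward'));           // true
console.log(standAlone.test('backward'));         // false
console.log(standAlone.test('The Ward is open')); // true

// With the g flag, test() remembers where it stopped (lastIndex),
// so repeated calls on the same regex object can alternate.
const withGlobalFlag = /\bward\b/gi;
console.log(withGlobalFlag.test('ward')); // true
console.log(withGlobalFlag.test('ward')); // false
```

For a simple yes/no check with test(), leaving out the g flag avoids that surprise.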
console.log(regexPattern.test('helo')); // Output: true
const regExp = /abc/;
console.log(regExp.exec('abcdef'));
// Output: ['abc', index: 0, input: 'abcdef', groups: undefined]
console.log(regExp.exec('bcadef'));
// Output: null
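When the pattern has the g flag, exec() can be called in a loop to walk through every match. Below is a minimal sketch with an assumed sample string.

```javascript
const digitRuns = /\d+/g; // the g flag lets exec() continue from lastIndex
const sample = 'a1 b22 c333';
const found = [];
let match;

// exec() returns null once there are no more matches
while ((match = digitRuns.exec(sample)) !== null) {
  found.push(`${match[0]}@${match.index}`);
}

console.log(found); // [ '1@1', '22@4', '333@8' ]
```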
Also, there are string methods that accept regular expressions as a parameter like match(), replace(), replaceAll(),
matchAll(), search(), and split().
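Here is a short sketch of those string methods in action, using an assumed sample string:

```javascript
const animals = 'cat, bat, rat';

const allWords = animals.match(/[cbr]at/g);       // every match as an array
const firstOnly = animals.replace(/at/, 'ow');    // replaces the first match only
const everyOne = animals.replaceAll(/at/g, 'ow'); // replaceAll needs the g flag
const position = animals.search(/bat/);           // index of the first match, or -1
const pieces = animals.split(/,\s*/);             // split on a comma plus optional spaces
const firstLetters = [...animals.matchAll(/(\w)at/g)].map((m) => m[1]);

console.log(allWords);     // [ 'cat', 'bat', 'rat' ]
console.log(firstOnly);    // cow, bat, rat
console.log(everyOne);     // cow, bow, row
console.log(position);     // 5
console.log(pieces);       // [ 'cat', 'bat', 'rat' ]
console.log(firstLetters); // [ 'c', 'b', 'r' ]
```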
Regex Examples
Here are some examples to reinforce some of the concepts we've learned in this article.
First example: How to use a regex pattern to match an email address:
const regexPattern = /^[(\w\d\W)+]+@[\w+]+\.[\w+]+$/i;
console.log(regexPattern.test('abcdef123@gmailcom'));
// Output: false, missing dot
console.log(regexPattern.test('abcdef123gmail.'));
// Output: false, missing the @ symbol and the ending 'com'
console.log(regexPattern.test('abcdef123@gmail.com'));
// Output: true, the input matches the pattern correctly
Let's interpret the pattern. Here's what's happening:
/ represents the start of the regular expression pattern.
^ anchors the match to the start of the line.
[(\w\d\W)+]+ matches one or more characters from the character class. Inside a character class, the parentheses
and + are literal characters, not a group and a quantifier, and because \w and \W together cover every character,
this class effectively matches any character.
@ matches the literal @ in the email format.
[\w+]+ matches one or more word characters (or literal + signs, since + is literal inside a character class).
\. escapes the dot so it matches a literal dot.
[\w+]+$ matches one or more word characters (or literal + signs) and is anchored at the end of the line.
/ ends the pattern.
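Because the first character class accepts any character, the pattern above is quite permissive. The sketch below compares it with a slightly tighter alternative; the name simpleEmail and its pattern are assumptions for illustration, not a complete email validator.

```javascript
const originalEmail = /^[(\w\d\W)+]+@[\w+]+\.[\w+]+$/i;

// Assumed alternative: no whitespace or @ allowed in any of the three parts.
const simpleEmail = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

console.log(originalEmail.test('a b@c@gmail.com'));   // true (spaces and extra @ slip through)
console.log(simpleEmail.test('a b@c@gmail.com'));     // false
console.log(simpleEmail.test('abcdef123@gmail.com')); // true
console.log(simpleEmail.test('abcdef123@gmailcom'));  // false, missing dot
```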
Alright, next example: how to match a URL with format http://example.com or https://www.example.com:
const pattern = /^[https?]+:\/\/((w{3}\.)?[\w+]+)\.[\w+]+$/i;
console.log(pattern.test('https://www.example.com'));
// Output: true
console.log(pattern.test('http://example.com'));
// Output: true
console.log(pattern.test('https://example'));
// Output: false
Let's also interpret this pattern. Here's what's happening:
/...../ represents the start and end of the regex pattern.
^ asserts the start of the line.
[https?]+ matches one or more of the characters h, t, p, s, and ?. Inside a character class, ? is a literal
character, so it does not make 's' optional; the class accepts both http and https, but also any other mix of
these characters.
: matches a literal colon.
\/\/ matches the two forward slashes, which are escaped.
(w{3}\.)? matches the character w three times and the dot that follows immediately. The ? after the group makes
it optional.
[\w+]+ matches one or more word characters (or literal + signs).
\. matches a literal dot.
[\w+]+$ matches one or more word characters (or literal + signs) and is anchored at the end of the line.
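Since [https?] is a character class, the pattern also accepts scrambled schemes. The sketch below shows this next to a stricter variant; strictUrl is an assumed name and deliberately simple (no paths, ports, or subdomains other than www).

```javascript
const originalUrl = /^[https?]+:\/\/((w{3}\.)?[\w+]+)\.[\w+]+$/i;

// Assumed stricter variant: literal "http", optional "s", optional "www."
const strictUrl = /^https?:\/\/(www\.)?[\w-]+\.\w+$/i;

console.log(originalUrl.test('ptth://example.com'));    // true (any mix of h, t, p, s, ?)
console.log(strictUrl.test('ptth://example.com'));      // false
console.log(strictUrl.test('https://www.example.com')); // true
console.log(strictUrl.test('http://example.com'));      // true
console.log(strictUrl.test('https://example'));         // false
```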
JavaScript Regex
In JavaScript, a Regular Expression (RegEx) is an object that describes a sequence of characters used for defining a
search pattern. For example,
/^a...s$/
The above code defines a RegEx pattern. The pattern is: any five-character string starting with a and ending with s.
A pattern defined using RegEx can be used to match against a string.
Expression   String      Matched?
/^a...s$/    abs         No match
             alias       Match
             abyss       Match
             Alias       No match
             An abacus   No match
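The rows of this table can be checked with a few lines of code. This is a small sketch; the variable names are illustrative.

```javascript
const fiveChars = /^a...s$/;
const samples = ['abs', 'alias', 'abyss', 'Alias', 'An abacus'];

// test() returns true when the string matches the pattern
const results = samples.map((s) => fiveChars.test(s));
console.log(results); // [ false, true, true, false, false ]
```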
Create a RegEx
There are two ways you can create a regular expression in JavaScript.
1. Using a regular expression literal:
The regular expression consists of a pattern enclosed between slashes /. For example,
const regularExp = /abc/;
Here, /abc/ is a regular expression.
2. Using the RegExp() constructor function:
You can also create a regular expression by calling the RegExp() constructor function. For example,
const regularExp = new RegExp('abc');
For example,
const regex = new RegExp(/^a...s$/);
console.log(regex.test('alias')); // true
In the above example, the string alias matches with the RegEx pattern /^a...s$/. Here, the test() method is used to
check if the string matches the pattern.
There are several other methods available to use with JavaScript RegEx. Before we explore them, let's learn about
regular expressions themselves.
If you already know the basics of RegEx, jump to JavaScript RegEx Methods.
Specify Pattern Using RegEx
To specify regular expressions, metacharacters are used. In the above example (/^a...s$/), ^ and $ are
metacharacters.
MetaCharacters
Metacharacters are characters that are interpreted in a special way by a RegEx engine. Here's a list of
metacharacters:
[] . ^ $ * + ? {} () \ |
[] - Square brackets
Square brackets specify a set of characters you wish to match.
Expression   String      Matched?
[abc]        a           1 match
             ac          2 matches
             Hey Jude    No match
             abc de ca   5 matches
Here, [abc] will match if the string you are trying to match contains any of a, b, or c.
You can also specify a range of characters using - inside square brackets.
[a-e] is the same as [abcde].
[1-4] is the same as [1234].
[0-39] is the same as [01239].
You can complement (invert) the character set by using caret ^ symbol at the start of a square-bracket.
[^abc] means any character except a or b or c.
[^0-9] means any non-digit character.
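A quick sketch of sets, ranges, and negated sets, using assumed sample strings:

```javascript
const setMatches = 'abc de ca'.match(/[abc]/g);  // every a, b, or c
const noMatches = 'Hey Jude'.match(/[abc]/g);    // null when nothing matches
const rangeMatches = '2024-09'.match(/[0-39]/g); // 0, 1, 2, 3, or 9
const nonDigits = 'a1b2'.match(/[^0-9]/g);       // anything except a digit

console.log(setMatches.length); // 5
console.log(noMatches);         // null
console.log(rangeMatches);      // [ '2', '0', '2', '0', '9' ]
console.log(nonDigits);         // [ 'a', 'b' ]
```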
. - Period
A period matches any single character (except newline '\n').
Expression   String   Matched?
..           a        No match
             ac       1 match
             acd      1 match
             acde     2 matches (contains 4 characters)
^ - Caret
The caret symbol ^ is used to check if a string starts with a certain character.
Expression   String   Matched?
^a           a        1 match
             abc      1 match
             bac      No match
^ab          abc      1 match
             acb      No match (starts with a but not followed by b)
$ - Dollar
The dollar symbol $ is used to check if a string ends with a certain character.
Expression   String    Matched?
a$           a         1 match
             formula   1 match
             cab       No match
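The period, caret, and dollar rows can be reproduced with a short sketch like this one (variable names are illustrative):

```javascript
// Period: each match consumes two characters
const pairs = 'acde'.match(/../g);
const tooShort = 'a'.match(/../g);
console.log(pairs);    // [ 'ac', 'de' ]
console.log(tooShort); // null

// Caret: the match must begin at the start of the string
console.log(/^ab/.test('abc')); // true
console.log(/^ab/.test('acb')); // false

// Dollar: the match must finish at the end of the string
console.log(/a$/.test('formula')); // true
console.log(/a$/.test('cab'));     // false
```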
* - Star
The star symbol * matches zero or more occurrences of the pattern to its left.
Expression   String   Matched?
ma*n         mn       1 match
             man      1 match
             mann     1 match
             main     No match (a is not followed by n)
             woman    1 match
+ - Plus
The plus symbol + matches one or more occurrences of the pattern to its left.
Expression   String   Matched?
ma+n         mn       No match (no a character)
             man      1 match
             mann     1 match
             main     No match (a is not followed by n)
             woman    1 match
? - Question Mark
The question mark symbol ? matches zero or one occurrence of the pattern to its left.
Expression   String   Matched?
ma?n         mn       1 match
             man      1 match
             maan     No match (more than one a character)
             main     No match (a is not followed by n)
             woman    1 match
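To compare the three quantifiers on the same inputs, here is a small sketch. The helper countMatches is hypothetical, written only for this comparison.

```javascript
// Hypothetical helper: counts non-overlapping matches of a pattern in a string
function countMatches(pattern, text) {
  const globalPattern = new RegExp(pattern.source, 'g');
  return (text.match(globalPattern) || []).length;
}

const words = ['mn', 'man', 'maan', 'main', 'woman'];

const starCounts = words.map((w) => countMatches(/ma*n/, w));
const plusCounts = words.map((w) => countMatches(/ma+n/, w));
const questionCounts = words.map((w) => countMatches(/ma?n/, w));

console.log(starCounts);     // [ 1, 1, 1, 0, 1 ]
console.log(plusCounts);     // [ 0, 1, 1, 0, 1 ]
console.log(questionCounts); // [ 1, 1, 0, 0, 1 ]
```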
{} - Braces
Consider this code: {n,m}. This means at least n, and at most m repetitions of the pattern to its left.
Expression   String        Matched?
a{2,3}       abc dat       No match
             abc daat      1 match (at daat)
             aabc daaat    2 matches (at aabc and daaat)
             aabc daaaat   2 matches (at aabc and daaaat)
Let's try one more example. The RegEx [0-9]{2,4} matches at least 2 digits but not more than 4 digits. Note that
there must be no space inside the braces.
Expression   String          Matched?
[0-9]{2,4}   ab123csde       1 match (match at ab123csde)
             12 and 345673   3 matches (12, 3456, 73)
             1 and 2         No match
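The brace examples can be verified with a few calls to match(). This is a sketch; the variable names are illustrative.

```javascript
const twoOrThreeAs = /a{2,3}/g;
const noRun = 'abc dat'.match(twoOrThreeAs);
const twoRuns = 'aabc daaaat'.match(twoOrThreeAs);
console.log(noRun);   // null
console.log(twoRuns); // [ 'aa', 'aaa' ] (the fourth a is left over)

const twoToFourDigits = /[0-9]{2,4}/g;
const oneGroup = 'ab123csde'.match(twoToFourDigits);
const threeGroups = '12 and 345673'.match(twoToFourDigits);
const singleDigits = '1 and 2'.match(twoToFourDigits);
console.log(oneGroup);     // [ '123' ]
console.log(threeGroups);  // [ '12', '3456', '73' ]
console.log(singleDigits); // null
```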
| - Alternation
Vertical bar | is used for alternation (or operator).
Expression   String   Matched?
a|b          cde      No match
             ade      1 match (match at ade)
             acdbea   3 matches (at acdbea)
Here, a|b matches any string that contains either a or b.
() - Group
Parentheses () are used to group sub-patterns. For example, (a|b|c)xz matches any string that contains a, b, or
c followed by xz.
Expression   String      Matched?
(a|b|c)xz    ab xz       No match
             abxz        1 match (match at abxz)
             axz cabxz   2 matches (at axz cabxz)
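Alternation and grouping can be tried out together. Here is a minimal sketch with the same sample strings:

```javascript
// Alternation: either a or b
const eitherLetter = 'acdbea'.match(/a|b/g);
console.log(eitherLetter); // [ 'a', 'b', 'a' ]

// Grouping: one of a, b, c and then the literal xz
const grouped = /(a|b|c)xz/g;
const withSpace = 'ab xz'.match(grouped);
const single = 'abxz'.match(grouped);
const double = 'axz cabxz'.match(grouped);
console.log(withSpace); // null (the space breaks the sequence)
console.log(single);    // [ 'bxz' ]
console.log(double);    // [ 'axz', 'bxz' ]
```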
\ - Backslash
Backslash \ is used to escape various characters including all metacharacters. For example,
\$a matches if a string contains $ followed by a. Here, $ is not interpreted by a RegEx engine in a special way.
If you are unsure if a character has special meaning or not, you can put \ in front of it. This makes sure the
character is not treated in a special way.
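Escaping matters most when a pattern is built from arbitrary text. The sketch below first shows the \$a example, then a hypothetical helper named escapeRegExp (a common idiom, not a built-in function assumed here) that puts a backslash in front of every metacharacter.

```javascript
console.log(/\$a/.test('cost: $a')); // true, \$ is a literal dollar sign
console.log(/$a/.test('cost: $a'));  // false, $ means "end of string" here

// Hypothetical helper: escape every metacharacter in a piece of text
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& = the matched character
}

const rawText = '1+1=2';
const unescaped = new RegExp(rawText);             // + acts as a quantifier
const escaped = new RegExp(escapeRegExp(rawText)); // + is a literal plus sign

console.log(unescaped.test('1+1=2')); // false
console.log(escaped.test('1+1=2'));   // true
```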
Special Sequences
Special sequences make commonly used patterns easier to write. Here's a list of special sequences:
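\A - Matches if the specified characters are at the start of a string. Note that JavaScript does not support \A as
an anchor (without the u flag, /\Athe/ looks for a literal A followed by the); in JavaScript, use ^ for this check,
as in the table below.
Expression   String       Matched?
^the         the sun      Match
             In the sun   No match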
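In JavaScript, the start-of-string check is written with ^. The sketch below also shows what \A does in a JavaScript pattern without the u flag, where it is treated as an escaped literal A.

```javascript
const startsWithThe = /^the/;
console.log(startsWithThe.test('the sun'));    // true
console.log(startsWithThe.test('In the sun')); // false

// Without the u flag, \A is just an escaped literal "A" in JavaScript
const literalA = /\Athe/;
console.log(literalA.test('the sun')); // false
console.log(literalA.test('Athe'));    // true
```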
\b - Matches if the specified characters are at the beginning or end of a word.
Expression   String          Matched?
\bfoo        football        Match
             a football      Match
             afootball       No match
foo\b        the foo         Match
             the afoo test   Match
             the afootest    No match
\B - Opposite of \b. Matches if the specified characters are not at the beginning or end of a word.
Expression   String          Matched?
\Bfoo        football        No match
             a football      No match
             afootball       Match
foo\B        the foo         No match
             the afoo test   No match
             the afootest    Match
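Since \B is the mirror image of \b, running both on the same strings makes the difference easy to see. This is a small sketch; the variable names are illustrative.

```javascript
const wordStarts = ['football', 'a football', 'afootball'];
const atWordStart = wordStarts.map((s) => /\bfoo/.test(s));
const insideWordStart = wordStarts.map((s) => /\Bfoo/.test(s));
console.log(atWordStart);     // [ true, true, false ]
console.log(insideWordStart); // [ false, false, true ]

const wordEnds = ['the foo', 'the afoo test', 'the afootest'];
const atWordEnd = wordEnds.map((s) => /foo\b/.test(s));
const insideWordEnd = wordEnds.map((s) => /foo\B/.test(s));
console.log(atWordEnd);     // [ true, true, false ]
console.log(insideWordEnd); // [ false, false, true ]
```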
\d - Matches any decimal digit. Equivalent to [0-9].
Expression   String       Matched?
\d           12abc3       3 matches (at 12abc3)
             JavaScript   No match
\D - Matches any non-digit character. Equivalent to [^0-9].
Expression   String     Matched?
\D           1ab34"50   3 matches (at 1ab34"50)
             1345       No match
\s - Matches where a string contains any whitespace character. For ASCII input, equivalent to [ \t\n\r\f\v].
Expression   String             Matched?
\s           JavaScript RegEx   1 match
             JavaScriptRegEx    No match
\S - Matches where a string contains any non-whitespace character. For ASCII input, equivalent to [^ \t\n\r\f\v].
Expression   String          Matched?
\S           a b             2 matches (at a b)
             (spaces only)   No match
\w - Matches any alphanumeric character (digits and letters). Equivalent to [a-zA-Z0-9_]. By the way, underscore
_ is also considered an alphanumeric character.
Expression   String     Matched?
\w           12&": ;c   3 matches (at 12&": ;c)
             %"> !      No match
\W - Matches any non-alphanumeric character. Equivalent to [^a-zA-Z0-9_].
Expression   String       Matched?
\W           1a2%c        1 match (at 1a2%c)
             JavaScript   No match
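The character-class sequences above can be checked in one go. Here is a minimal sketch using the same sample strings as the tables:

```javascript
const digits = '12abc3'.match(/\d/g);             // digits only
const nonDigits = '1ab34"50'.match(/\D/g);        // everything except digits
const spaces = 'JavaScript RegEx'.match(/\s/g);   // whitespace characters
const nonSpaces = 'a b'.match(/\S/g);             // everything except whitespace
const wordChars = '12&": ;c'.match(/\w/g);        // letters, digits, underscore
const nonWordChars = '1a2%c'.match(/\W/g);        // everything else

console.log(digits);        // [ '1', '2', '3' ]
console.log(nonDigits);     // [ 'a', 'b', '"' ]
console.log(spaces.length); // 1
console.log(nonSpaces);     // [ 'a', 'b' ]
console.log(wordChars);     // [ '1', '2', 'c' ]
console.log(nonWordChars);  // [ '%' ]
```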
// take input
let number = prompt('Enter a number XXX-XXX-XXXX');