Slide14 RegularExpression

Regular expressions can be used to describe patterns in strings. They form their own mini-language with syntax like /abc/ to match a sequence of characters, [] for character sets, + and * for repetition, and | for alternatives. Regular expressions have methods like test() and exec() to search strings and match patterns. They can also be used with the replace() method to substitute patterns in strings.

Uploaded by

Pro Unipadder

Regular Expressions

• Some people, when confronted with a problem, think ‘I know, I’ll
use regular expressions.’ Now they have two problems.

- Jamie Zawinski
Regular expression

• Regular expressions are a way to describe patterns in string data.
They form a small, separate language that is part of JavaScript and
many other languages and systems.
Creating a regular expression

• A regular expression is a type of object. It can be either
constructed with the RegExp constructor or written as a literal value
by enclosing a pattern in forward slash (/) characters.
Creating a regular expression
• let re1 = new RegExp("abc");
• let re2 = /abc/;
• Both of those regular expression objects represent the same
pattern: an a character followed by a b followed by a c.
Creating a regular expression
• Since a forward slash ends the pattern, we need to put a backslash
before any forward slash that we want to be part of the pattern.
Some characters, such as question marks and plus signs, have special
meanings in regular expressions and must be preceded by a backslash
if they are meant to represent the character itself.
• let eighteenPlus = /eighteen\+/;
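The escaping rules above differ slightly between the two ways of creating a regular expression. The sketch below is an illustration only (the names literalPlus, constructedPlus, and slashPattern are made up for this example): in a string passed to the RegExp constructor the backslash must itself be doubled, while a forward slash needs escaping only in the literal form.

```javascript
// Literal notation: the backslash escapes the plus sign directly.
const literalPlus = /eighteen\+/;

// Constructor notation: the pattern is an ordinary string, so the
// backslash has to be written twice to survive string parsing.
const constructedPlus = new RegExp("eighteen\\+");

// A forward slash must be escaped inside a literal, because an
// unescaped slash would end the pattern.
const slashPattern = /\d+\/\d+/;

console.log(literalPlus.source === constructedPlus.source); // → true
console.log(constructedPlus.test("eighteen+")); // → true
console.log(slashPattern.test("3/4")); // → true
```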
Testing for matches

• console.log(/abc/.test("abcde"));

// → true

• console.log(/abc/.test("abxde"));

// → false
Testing for matches

• Regular expression objects have a number of methods. The simplest
one is test. If you pass it a string, it will return a Boolean telling
you whether the string contains a match of the pattern in the
expression.
Testing for matches

• A regular expression consisting of only nonspecial characters
simply represents that sequence of characters. If abc occurs anywhere
in the string we are testing against (not just at the start), test
will return true.
Sets of characters

• console.log(/[0123456789]/.test("in 1992"));

// → true

• console.log(/[0-9]/.test("in 1992"));

// → true
Sets of characters

• In a regular expression, putting a set of characters between square
brackets makes that part of the expression match any of the characters
between the brackets.
Sets of characters

• Within square brackets, a hyphen (-) between two characters can be
used to indicate a range of characters, where the ordering is
determined by the character’s Unicode number.
• Characters 0 to 9 sit right next to each other in this
ordering (codes 48 to 57), so [0-9] covers all of them and
matches any digit.
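As a small check of the code numbers mentioned above, the following sketch reads them with charCodeAt and then combines several ranges in one set. The name hexDigit is hypothetical and is used only for illustration.

```javascript
// Code numbers of the first and last digit characters.
const zeroCode = "0".charCodeAt(0);
const nineCode = "9".charCodeAt(0);
console.log(zeroCode, nineCode); // → 48 57

// Several ranges can share one pair of square brackets.
// hexDigit is an illustrative name, not something from the slides.
const hexDigit = /[0-9a-fA-F]/;
console.log(hexDigit.test("c")); // → true
console.log(hexDigit.test("g")); // → false
```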
Sets of characters
• A number of common character groups have their own built-in shortcuts. Digits are
one of them: \d means the same thing as [0-9].
• \d Any digit character
• \w An alphanumeric character or underscore (“word character”)
• \s Any whitespace character (space, tab, newline, and similar)
• \D A character that is not a digit
• \W A character that is not a word character
• \S A nonwhitespace character
• . Any character except for newline
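The shortcuts in the list above can be tried one character at a time. This is a minimal sketch; describeChar is a made-up helper name, and the sample characters are arbitrary.

```javascript
// Report which of the built-in character groups a single character belongs to.
function describeChar(ch) {
  return {
    digit: /\d/.test(ch),
    word: /\w/.test(ch),
    space: /\s/.test(ch),
    anyButNewline: /./.test(ch),
  };
}

console.log(describeChar("7").digit); // → true
console.log(describeChar("_").word); // → true (underscore counts as a word character)
console.log(describeChar("\n").space); // → true
console.log(describeChar("\n").anyButNewline); // → false
console.log(/\D/.test("42")); // → false (every character is a digit)
```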
Sets of characters

• So you could match a date and time format like 01-30-2003 15:20
with the following expression:
Sets of characters

• let dateTime = /\d\d-\d\d-\d\d\d\d \d\d:\d\d/;


• console.log(dateTime.test("01-30-2003 15:20"));
// → true
• console.log(dateTime.test("30-jan-2003 15:20"));
// → false
Sets of characters

• To invert a set of characters, that is, to express that you want to
match any character except the ones in the set, you can write a caret
(^) character after the opening bracket.
Sets of characters

• let notBinary = /[^01]/;


• console.log(notBinary.test("1100100010100110"));
// → false
• console.log(notBinary.test("1100100010200110"));
// → true
Repeating parts of a pattern

• We now know how to match a single digit. What if we want to match a
whole number, that is, a sequence of one or more digits?
Repeating parts of a pattern
• When you put a plus sign (+) after something in a regular expression,
it indicates that the element must occur at least once and may be
repeated any number of times. Thus, /\d+/ matches one or more digit
characters.
• The star (*) has a similar meaning but also allows the pattern to
match zero times.
• A question mark makes a part of a pattern optional, meaning it may
occur zero times or one time.
Repeating parts of a pattern
• console.log(/'\d+'/.test("'123'"));
// → true
• console.log(/'\d+'/.test("''"));
// → false
• console.log(/'\d*'/.test("'123'"));
// → true
• console.log(/'\d*'/.test("''"));
// → true
Repeating parts of a pattern

• let neighbor = /neighbou?r/;


• console.log(neighbor.test("neighbour"));
// → true
• console.log(neighbor.test("neighbor"));
// → true
Repeating parts of a pattern
• To indicate that a pattern should occur a precise number of times, use braces.

• Putting {4} after an element, for example, requires it to occur exactly four
times.

• It is also possible to specify a range this way: {2,4} means the element must
occur at least twice and at most four times.

• You can also specify open-ended ranges when using braces by omitting the
number after the comma. So, {5,} means five or more times.
Repeating parts of a pattern

• let dateTime = /\d{1,2}-\d{1,2}-\d{4} \d{1,2}:\d{2}/;

• console.log(dateTime.test("1-30-2003 8:45"));

// → true
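The three brace forms described earlier ({4}, {2,4}, and {5,}) can be compared side by side. The sketch below wraps each pattern in ^ and $ so that the whole string has to match; those markers are covered in the "Word and string boundaries" slides, and the variable names are illustrative only.

```javascript
// Each pattern must match the entire string because of ^ and $.
const exactlyFour = /^\d{4}$/;
const twoToFour = /^\d{2,4}$/;
const fiveOrMore = /^\d{5,}$/;

console.log(exactlyFour.test("2003")); // → true
console.log(exactlyFour.test("203")); // → false
console.log(twoToFour.test("12345")); // → false
console.log(fiveOrMore.test("12345")); // → true
```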
Grouping subexpressions

• To use an operator like * or + on more than one element at a time,
you have to use parentheses. A part of a regular expression that is
enclosed in parentheses counts as a single element as far as the
operators following it are concerned.
Grouping subexpressions

• let cartoonCrying = /boo+(hoo+)+/i;

• console.log(cartoonCrying.test("Boohoooohoohooo"));

// → true
Grouping subexpressions

• The first and second + characters apply only to the second o in boo and
hoo, respectively. The third + applies to the whole group (hoo+),
matching one or more sequences like that.

• The i at the end of the expression in the example makes this regular
expression case insensitive, allowing it to match the uppercase B in the
input string, even though the pattern is itself all lowercase.
Matches and groups

• The test method is the absolute simplest way to match a regular
expression. It tells you only whether it matched and nothing else.
• Regular expressions also have an exec (execute) method that
will return null if no match was found and return an object
with information about the match otherwise.
Matches and groups

• let match = /\d+/.exec("one two 100");

• console.log(match);

// → ["100"]

• console.log(match.index);

// → 8
Matches and groups

• String values have a match method that behaves similarly.

• console.log("one two 100".match(/\d+/));

// → ["100"]
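The examples above show only the whole match. When a pattern contains a parenthesized group, the array returned by exec also holds the text matched by that group, after the whole match. The following sketch assumes a pattern for single-quoted text; the names quotedText and result are illustrative.

```javascript
// The parentheses capture whatever sits between the two quote characters.
const quotedText = /'([^']*)'/;
const result = quotedText.exec("she said 'hello'");

console.log(result[0]); // → 'hello'   (the whole match, quotes included)
console.log(result[1]); // → hello     (the part matched by the group)
console.log(result.index); // → 9

// exec returns null when nothing matches.
console.log(quotedText.exec("no quotes here")); // → null
```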
Word and string boundaries

• If we want to enforce that the match must span the whole string,
we can add the markers ^ and $. The caret matches the start of
the input string, whereas the dollar sign matches the end.
• /^\d+$/
• /^!/
• /x^/
Word and string boundaries

• /^\d+$/ matches a string consisting entirely of one or more digits.

• /^!/ matches any string that starts with an exclamation mark.

• /x^/ does not match any string (there cannot be an x before the
start of the string).
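The three patterns above can be checked directly. In this sketch, isWholeNumber is a hypothetical helper name chosen for the example.

```javascript
// True only when the entire string is made of digits.
function isWholeNumber(text) {
  return /^\d+$/.test(text);
}

console.log(isWholeNumber("2024")); // → true
console.log(isWholeNumber("20a24")); // → false
console.log(isWholeNumber("")); // → false (+ needs at least one digit)

console.log(/^!/.test("!important")); // → true
console.log(/^!/.test("not !important")); // → false
console.log(/x^/.test("x")); // → false
```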
Word and string boundaries

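• If, on the other hand, we just want to make sure a match starts and
ends on a word boundary, we can use the marker \b. A word boundary is
the start or end of the string, or any point where a word character
(as in \w) sits next to a nonword character.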
Word and string boundaries

• console.log(/cat/.test("concatenate"));

// → true

• console.log(/\bcat\b/.test("concatenate"));

// → false
Choice patterns

• We could write one regular expression per alternative and test them
in turn, but there is a nicer way. The pipe character (|) denotes a
choice between the pattern to its left and the pattern to its right:
Choice patterns

• let animalCount = /\b\d+ (pig|cow|chicken)s?\b/;

• console.log(animalCount.test("15 pigs"));

// → true

• console.log(animalCount.test("15 pigchickens"));

// → false
Choice patterns

• Say we want to know whether a piece of text contains not only a
number but a number followed by one of the words pig, cow, or chicken,
or any of their plural forms.

• Parentheses can be used to limit the part of the pattern that the pipe
operator applies to, and you can put multiple such operators next to
each other to express a choice between more than two alternatives.
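To see why the parentheses matter, compare a pattern with and without them. Without a group, the pipe splits the entire pattern, including the ^ and $ markers. The names loose and strict are illustrative.

```javascript
// Without parentheses: "starts with cat" OR "ends with dog".
const loose = /^cat|dog$/;

// With parentheses: the whole string is exactly "cat" or exactly "dog".
const strict = /^(cat|dog)$/;

console.log(loose.test("catalog")); // → true
console.log(strict.test("catalog")); // → false
console.log(loose.test("hotdog")); // → true
console.log(strict.test("dog")); // → true
```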
The mechanics of matching
let animalCount = /\b\d+ (pig|cow|chicken)s?\b/;
The replace method

• String values have a replace method that can be used to replace
part of the string with another string.

• console.log("papa".replace("p", "m"));

// → mapa
The replace method

• The first argument can also be a regular expression, in which case the
first match of the regular expression is replaced. When a g option (for
global) is added to the regular expression, all matches in the string
will be replaced, not just the first.
The replace method

• console.log("Borobudur".replace(/[ou]/, "a"));

// → Barobudur

• console.log("Borobudur".replace(/[ou]/g, "a"));

// → Barabadar
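The same contrast can be shown on the earlier "papa" string: a string argument and a regular expression without g both replace only the first match, while the g option replaces every match. Variable names here are illustrative.

```javascript
// A plain string as the first argument replaces only the first occurrence.
const viaString = "papa".replace("p", "m");

// A regular expression without g behaves the same way.
const viaRegex = "papa".replace(/p/, "m");

// With the g option, every match is replaced.
const viaGlobal = "papa".replace(/p/g, "m");

console.log(viaString); // → mapa
console.log(viaRegex); // → mapa
console.log(viaGlobal); // → mama
```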
The search method

• The indexOf method on strings cannot be called with a regular
expression. But there is another method, search, that does expect a
regular expression. Like indexOf, it returns the first index on which
the expression was found, or -1 when it wasn’t found.
The search method

• console.log("  word".search(/\S/));

// → 2

• console.log(" ".search(/\S/));

// → -1
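As a further sketch of search next to indexOf, the hypothetical helper below finds the position of the first digit in a string. indexOf can only look for a fixed substring, while search accepts a pattern.

```javascript
// Returns the index of the first digit, or -1 if there is none.
function firstDigitIndex(text) {
  return text.search(/\d/);
}

console.log(firstDigitIndex("room 101")); // → 5
console.log(firstDigitIndex("no digits")); // → -1

// indexOf needs the exact text to look for.
console.log("room 101".indexOf("101")); // → 5
```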
Summary
• /abc/ A sequence of characters

• /[abc]/ Any character from a set of characters

• /[^abc]/ Any character not in a set of characters

• /[0-9]/ Any character in a range of characters

• /x+/ One or more occurrences of the pattern x

• /x+?/ One or more occurrences, nongreedy

• /x*/ Zero or more occurrences

• /x?/ Zero or one occurrence

• /x{2,4}/ Two to four occurrences

Summary
• /(abc)/ A group

• /a|b|c/ Any one of several patterns

• /\d/ Any digit character

• /\w/ An alphanumeric character or underscore (“word character”)

• /\s/ Any whitespace character

• /./ Any character except newlines

• /\b/ A word boundary

• /^/ Start of input

• /$/ End of input
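The summary lists /x+?/ as nongreedy. As a brief illustration, a greedy operator takes as many characters as it can, while the nongreedy form takes as few as it can. The sample strings and variable names are made up for this sketch.

```javascript
// Greedy: + consumes every x it can find.
const greedy = "xxxx".match(/x+/)[0];

// Nongreedy: +? stops as soon as the pattern is satisfied.
const lazy = "xxxx".match(/x+?/)[0];

console.log(greedy); // → xxxx
console.log(lazy); // → x

// The difference matters when something must follow the repeated part.
console.log("<a><b>".match(/<.+>/)[0]); // → <a><b>
console.log("<a><b>".match(/<.+?>/)[0]); // → <a>
```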


References

• https://eloquentjavascript.net/

• https://www.w3schools.com/
