regular expressions

There are occasions when you want to match parts of a string. Regular expressions let you do this by describing patterns to search for.

There are two ways to construct a regular expression:

Using a regular expression literal. This consists of a pattern enclosed between slashes:

const re = /ab+c/;

Regular expression literals are compiled when the script is loaded. If the regular expression remains constant, using a literal can improve performance.

Calling the constructor function of the RegExp object:

const re = new RegExp('ab+c');

Using the constructor function provides runtime compilation of the regular expression. Use the constructor function when you know the regular expression pattern will be changing or you don’t know the pattern and you’re getting it from another source.
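As a minimal sketch of the two approaches, the snippet below builds the same pattern both ways and then builds a pattern from a value that is only known at runtime. The names `literalRe`, `constructedRe`, `userInput` and `dynamicRe` are hypothetical and chosen only for illustration.

```javascript
// Both forms describe the same pattern: "a", one or more "b", then "c".
const literalRe = /ab+c/;
const constructedRe = new RegExp('ab+c');

console.log(literalRe.test('xabbbcx'));      // true
console.log(constructedRe.test('xabbbcx'));  // true

// The constructor helps when the pattern is only known at runtime.
// `userInput` is a hypothetical value standing in for data from another source.
const userInput = 'cat';
const dynamicRe = new RegExp(userInput + 's?');

console.log(dynamicRe.test('two cats'));  // true
console.log(dynamicRe.test('a dog'));     // false
```

Note that when the pattern is passed to the constructor as a string, any backslash has to be doubled (for example `'\\d'`), because the string literal consumes one level of escaping.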

writing a regular expression pattern

A regular expression pattern comprises simple characters or a combination of simple and special characters.

simple patterns

Simple patterns consist of characters for which you want to find a direct match:

/abc/

This pattern matches only when the exact sequence abc occurs in a string: all of the characters together AND in that order.

“The first three letters of the English alphabet are abc.” would be a successful match because the characters are together and in the correct order.

“Grab cake.” would not be a successful match because there is white space between the b and c.
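The two sentences above can be checked directly with `RegExp.prototype.test()`, which returns `true` when the pattern is found anywhere in the string. This is only a short sketch; the name `abcRe` is an assumption made for the example.

```javascript
const abcRe = /abc/;

// "abc" appears as one unbroken sequence at the end of the sentence.
console.log(abcRe.test('The first three letters of the English alphabet are abc.')); // true

// "ab c" has a space between the "b" and the "c", so the sequence is broken.
console.log(abcRe.test('Grab cake.')); // false
```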

special characters

When you need to perform a search for more than a direct match, you can use special characters in the pattern:

assertions

Assertions include boundaries, which indicate the beginnings and endings of lines and words, and other patterns that indicate in some way that a match is possible.

^

This is a boundary-type assertion that is used to match the beginning of input.

/^A/

“An apple” would be a successful match because there is an “A” at the beginning of the line.

“an Antelope” would not be a successful match because the “A” is NOT at the beginning of the line.
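A quick way to try this out is sketched below. The name `startsWithA` is hypothetical.

```javascript
const startsWithA = /^A/;

// The "A" is the very first character of the input.
console.log(startsWithA.test('An apple'));     // true

// There is an "A", but it is not at the beginning of the input.
console.log(startsWithA.test('an Antelope'));  // false
```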

$

This is a boundary-type assertion that is used to match the end of input.

/a$/

“tea” would be a successful match because there is an “a” at the end of the line.

“toast” would not be a successful match because the “a” is NOT at the end of the line.
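The same kind of check works for the end of input. The sketch below also combines `^` and `$` to require that the whole input matches; the names `endsWithA` and `exactlyTea` are assumptions made for the example.

```javascript
const endsWithA = /a$/;

console.log(endsWithA.test('tea'));    // true
console.log(endsWithA.test('toast'));  // false

// Anchoring both ends means the entire input must be "tea".
const exactlyTea = /^tea$/;

console.log(exactlyTea.test('tea'));     // true
console.log(exactlyTea.test('teapot'));  // false
```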

\b

This is a boundary-type assertion that is used to match a word boundary. This is a position where a word character is not followed or preceded by another word character, such as between a letter and a space. A matched word boundary is NOT included in the match.

/\bm/

“moon” would be a successful match because there is an “m” at the beginning of the word.

/oo\b/

“moon” would not be a successful match because the “oo” is followed by “n”, which is another word character, so there is no word boundary after it.

/oon\b/

“moon” would be a successful match because “oon” is at the end of the word.
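The three patterns can be run against “moon” as a sketch. The last part uses `String.prototype.match()` on a hypothetical sentence to show that the boundary itself takes up no characters in the result; the name `found` is an assumption made for the example.

```javascript
console.log(/\bm/.test('moon'));    // true:  "m" starts the word
console.log(/oo\b/.test('moon'));   // false: "oo" is followed by "n"
console.log(/oon\b/.test('moon'));  // true:  "oon" ends the word

// The boundaries on either side are not part of the matched text.
const found = 'the moon rises'.match(/\bmoon\b/);

console.log(found[0]);     // "moon"
console.log(found.index);  // 4
```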

\B

This is a boundary-type assertion that is used to match a non-word boundary. This is a position where the previous and next characters are of the same type: either both must be word characters, or both must be non-word characters.

/\Bon/

“at noon” would be a successful match because the “on” in “noon” is preceded by another word character, “o”.

/ye\B/

“possibly yesterday” would be a successful match because the “ye” in “yesterday” is followed by another word character, “s”.
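As a sketch, the two patterns can be tested against the strings above, along with two counter-examples that are not from the original text ('on time' and 'goodbye') and are included only to show when the assertion fails.

```javascript
// "on" in "noon" sits between two word characters, so \B holds before it.
console.log(/\Bon/.test('at noon'));  // true

// Here "on" starts the input, which is a word boundary, so \B fails.
console.log(/\Bon/.test('on time'));  // false

// "ye" in "yesterday" is followed by "s", a word character.
console.log(/ye\B/.test('possibly yesterday'));  // true

// Here "ye" ends the input, which is a word boundary, so \B fails.
console.log(/ye\B/.test('goodbye'));  // false
```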

Regular expressions can be used to look for very complex patterns. Here is a real-world example:

/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/

This is actually a regular expression for HTML5 email validation!
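To see the pattern in action, here is a minimal sketch that stores it in a hypothetical variable, `emailRe`, and tests a few made-up addresses. It assumes a plain ASCII apostrophe in the character class, and it writes the forward slash as `\/` so that it cannot be mistaken for the end of the literal; the pattern is otherwise unchanged.

```javascript
// `emailRe` is a hypothetical name; the addresses below are invented test data.
const emailRe = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/;

console.log(emailRe.test('user@example.com'));              // true
console.log(emailRe.test('user.name+tag@sub.example.com')); // true

console.log(emailRe.test('not-an-email'));       // false: no "@"
console.log(emailRe.test('user@'));              // false: nothing after "@"
console.log(emailRe.test('user@exa mple.com'));  // false: space is not allowed
```

The `^` and `$` assertions at either end are what force the entire input to be a valid address, rather than just some part of it.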