22.2 RegExp (Regular Expression) Objects
A RegExp object contains a regular expression and the associated flags.
The form and functionality of regular expressions is modelled after the regular expression facility in the Perl 5 programming language.
22.2.1 Patterns
The RegExp
Syntax
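Each \u HexTrailSurrogate for which the choice of associated u HexLeadSurrogate is ambiguous shall be associated with the nearest possible u HexLeadSurrogate that would otherwise have no corresponding \u HexTrailSurrogate.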
A number of productions in this section are given alternative definitions in section
22.2.1.1 Static Semantics: Early Errors
This section is amended in
It is a Syntax Error if NcapturingParens ≥ 2^{32} - 1.

It is a Syntax Error if Pattern contains multiple GroupSpecifiers whose enclosed RegExpIdentifierNames have the same CapturingGroupName.

It is a Syntax Error if the MV of the first DecimalDigits is larger than the MV of the second DecimalDigits.

It is a Syntax Error if the enclosing Pattern does not contain a GroupSpecifier with an enclosed RegExpIdentifierName whose CapturingGroupName equals the CapturingGroupName of the RegExpIdentifierName of this production's GroupName.

It is a Syntax Error if the CapturingGroupNumber of DecimalEscape is larger than NcapturingParens (22.2.2.1).

It is a Syntax Error if IsCharacterClass of the first ClassAtom is true or IsCharacterClass of the second ClassAtom is true.
It is a Syntax Error if IsCharacterClass of the first ClassAtom is false and IsCharacterClass of the second ClassAtom is false and the CharacterValue of the first ClassAtom is larger than the CharacterValue of the second ClassAtom.

It is a Syntax Error if IsCharacterClass of ClassAtomNoDash is true or IsCharacterClass of ClassAtom is true.
It is a Syntax Error if IsCharacterClass of ClassAtomNoDash is false and IsCharacterClass of ClassAtom is false and the CharacterValue of ClassAtomNoDash is larger than the CharacterValue of ClassAtom.

It is a Syntax Error if the CharacterValue of RegExpUnicodeEscapeSequence is not the code point value of "$", "_", or some code point matched by the UnicodeIDStart lexical grammar production.

It is a Syntax Error if the result of performing UTF16SurrogatePairToCodePoint on the two code points matched by UnicodeLeadSurrogate and UnicodeTrailSurrogate respectively is not matched by the UnicodeIDStart lexical grammar production.

It is a Syntax Error if the CharacterValue of RegExpUnicodeEscapeSequence is not the code point value of "$", "_", <ZWNJ>, <ZWJ>, or some code point matched by the UnicodeIDContinue lexical grammar production.

It is a Syntax Error if the result of performing UTF16SurrogatePairToCodePoint on the two code points matched by UnicodeLeadSurrogate and UnicodeTrailSurrogate respectively is not matched by the UnicodeIDContinue lexical grammar production.

It is a Syntax Error if the List of Unicode code points that is SourceText of UnicodePropertyName is not identical to a List of Unicode code points that is a Unicode property name or property alias listed in the “Property name and aliases” column of Table 56.
It is a Syntax Error if the List of Unicode code points that is SourceText of UnicodePropertyValue is not identical to a List of Unicode code points that is a value or value alias for the Unicode property or property alias given by SourceText of UnicodePropertyName listed in the “Property value and aliases” column of the corresponding tables Table 58 or Table 59.

It is a Syntax Error if the List of Unicode code points that is SourceText of LoneUnicodePropertyNameOrValue is not identical to a List of Unicode code points that is a Unicode general category or general category alias listed in the “Property value and aliases” column of Table 58, nor a binary property or binary property alias listed in the “Property name and aliases” column of Table 57.
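Several of these early errors can be observed from script code, because constructing a RegExp from an offending pattern throws a SyntaxError. The following is a minimal sketch; the helper name isEarlyError is introduced here purely for illustration and is not part of this specification.

```javascript
// Hypothetical helper: true if constructing the pattern throws a SyntaxError.
function isEarlyError(source, flags = "") {
  try {
    new RegExp(source, flags);
    return false;
  } catch (err) {
    return err instanceof SyntaxError;
  }
}

const earlyErrorChecks = {
  // MV of the first DecimalDigits is larger than the MV of the second.
  quantifierOutOfOrder: isEarlyError("a{2,1}"),
  // CharacterValue of the first ClassAtom is larger than that of the second.
  classRangeOutOfOrder: isEarlyError("[b-a]"),
  // A ClassAtom used in a range is itself a character class (Unicode pattern).
  classInsideRange: isEarlyError("[\\d-a]", "u"),
  // No GroupSpecifier with a matching CapturingGroupName.
  danglingGroupName: isEarlyError("\\k<name>", "u"),
  // CapturingGroupNumber is larger than NcapturingParens.
  danglingBackreference: isEarlyError("\\1", "u"),
  // Not a property name, general category, or binary property.
  unknownProperty: isEarlyError("\\p{NotAProperty}", "u"),
  // A well-formed pattern for comparison.
  validPattern: isEarlyError("(?<name>a)\\k<name>", "u"),
};
```

Every entry except validPattern is expected to be true.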
22.2.1.2 Static Semantics: CapturingGroupNumber
This section is amended in
 Return the MV of NonZeroDigit.
 Let n be the number of code points in DecimalDigits.
 Return (the MV of NonZeroDigit × 10^{n} plus the MV of DecimalDigits).
The definitions of “the MV of
22.2.1.3 Static Semantics: IsCharacterClass
This section is amended in
 Return false.
 Return true.
22.2.1.4 Static Semantics: CharacterValue
This section is amended in
 Return the code point value of U+002D (HYPHEN-MINUS).
 Let ch be the code point matched by SourceCharacter.
 Return the code point value of ch.
 Return the code point value of U+0008 (BACKSPACE).
 Return the code point value of U+002D (HYPHEN-MINUS).
 Return the code point value according to Table 55.
| ControlEscape | Code Point Value | Code Point | Unicode Name | Symbol |
| --- | --- | --- | --- | --- |
| t | 9 | U+0009 | CHARACTER TABULATION | <HT> |
| n | 10 | U+000A | LINE FEED (LF) | <LF> |
| v | 11 | U+000B | LINE TABULATION | <VT> |
| f | 12 | U+000C | FORM FEED (FF) | <FF> |
| r | 13 | U+000D | CARRIAGE RETURN (CR) | <CR> |
 Let ch be the code point matched by ControlLetter.
 Let i be ch's code point value.
 Return the remainder of dividing i by 32.
 Return the code point value of U+0000 (NULL).
\0 represents the <NUL> character and cannot be followed by a decimal digit.
 Return the MV of HexEscapeSequence.
 Let lead be the CharacterValue of HexLeadSurrogate.
 Let trail be the CharacterValue of HexTrailSurrogate.
 Let cp be UTF16SurrogatePairToCodePoint(lead, trail).
 Return the code point value of cp.
 Return the MV of Hex4Digits.
 Return the MV of CodePoint.
 Return the MV of HexDigits.
 Let ch be the code point matched by IdentityEscape.
 Return the code point value of ch.
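The CharacterValue rules above can be checked against a conforming engine by testing each escape against the single character it should denote. This is only an illustrative sketch; the array name characterValueChecks is an assumption of the example.

```javascript
// Each entry pairs a pattern with the one character its escape should denote.
const characterValueChecks = [
  [/^\t$/, "\u0009"],                // ControlEscape t
  [/^\n$/, "\u000A"],                // ControlEscape n
  [/^\v$/, "\u000B"],                // ControlEscape v
  [/^\f$/, "\u000C"],                // ControlEscape f
  [/^\r$/, "\u000D"],                // ControlEscape r
  [/^\cJ$/, "\u000A"],               // ControlLetter J is 74, and 74 modulo 32 is 10
  [/^\0$/, "\u0000"],                // NULL
  [/^\x41$/, "A"],                   // HexEscapeSequence
  [/^\u0041$/, "A"],                 // Hex4Digits
  [/^\u{1D11E}$/u, "\u{1D11E}"],     // CodePoint form (Unicode patterns only)
  [/^\uD834\uDD1E$/u, "\u{1D11E}"],  // lead and trail surrogate escapes combined
  [/^[\b]$/, "\u0008"],              // \b inside a class is BACKSPACE
];

const characterValueResults = characterValueChecks.map(([re, ch]) => re.test(ch));
```

Each element of characterValueResults is expected to be true.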
22.2.1.5 Static Semantics: SourceText
 Return the List, in source text order, of Unicode code points in the source text matched by this production.
22.2.1.6 Static Semantics: CapturingGroupName
 Let idText be the source text matched by RegExpIdentifierName.
 Let idTextUnescaped be the result of replacing any occurrences of \ RegExpUnicodeEscapeSequence in idText with the code point represented by the RegExpUnicodeEscapeSequence.
 Return ! CodePointsToString(idTextUnescaped).
22.2.2 Pattern Semantics
This section is amended in
A regular expression pattern is converted into an
A Pattern is either a BMP pattern or a Unicode pattern depending upon whether or not its associated flags contain a u. A BMP pattern matches against a String interpreted as consisting of a sequence of 16-bit values that are Unicode code points in the range of the Basic Multilingual Plane. A Unicode pattern matches against a String interpreted as consisting of Unicode code points encoded using UTF-16. In the context of describing the behaviour of a BMP pattern “character” means a single 16-bit Unicode BMP code point. In the context of describing the behaviour of a Unicode pattern “character” means a UTF-16 encoded code point.
The syntax and semantics of
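For example, consider a pattern expressed in source text as the single non-BMP character U+1D11E (MUSICAL SYMBOL G CLEF). Interpreted as a Unicode pattern, it would be a single element (character) List consisting of the single code point 0x1D11E. However, interpreted as a BMP pattern, it is first UTF-16 encoded to produce a two element List consisting of the code units 0xD834 and 0xDD1E.
Patterns are passed to the RegExp constructor as ECMAScript String values in which non-BMP characters are UTF-16 encoded.
An implementation may not actually perform such translations to or from UTF-16, but the semantics of this specification requires that the result of pattern matching be as if such translations were performed.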
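The difference between the two kinds of pattern can be seen by matching the G clef character with and without the u flag. A small sketch follows; the names gClef, bmpView and unicodeView are chosen for this example only.

```javascript
// MUSICAL SYMBOL G CLEF is stored in a String as two UTF-16 code units.
const gClef = "\u{1D11E}";

// BMP pattern: each code unit is a separate character.
const bmpView = {
  characters: gClef.length,
  oneDot: /^.$/.test(gClef),
  twoDots: /^..$/.test(gClef),
};

// Unicode pattern: the surrogate pair is a single character.
const unicodeView = {
  characters: Array.from(gClef).length,
  oneDot: /^.$/u.test(gClef),
  twoDots: /^..$/u.test(gClef),
};
```

Here bmpView sees two characters and unicodeView sees one, so only the Unicode pattern matches the string with a single dot.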
22.2.2.1 Notation
The descriptions below use the following aliases:

Input is a List whose elements are the characters of the String being matched by the regular expression pattern. Each character is either a code unit or a code point, depending upon the kind of pattern involved. The notation Input[n] means the n^{th} character of Input, where n can range between 0 (inclusive) and InputLength (exclusive).

InputLength is the number of characters in Input.

NcapturingParens is the total number of left-capturing parentheses (i.e. the total number of Atom :: ( GroupSpecifier Disjunction ) Parse Nodes) in the pattern. A left-capturing parenthesis is any ( pattern character that is matched by the ( terminal of the Atom :: ( GroupSpecifier Disjunction ) production.

DotAll is true if the RegExp object's [[OriginalFlags]] internal slot contains "s" and otherwise is false.

IgnoreCase is true if the RegExp object's [[OriginalFlags]] internal slot contains "i" and otherwise is false.

Multiline is true if the RegExp object's [[OriginalFlags]] internal slot contains "m" and otherwise is false.

Unicode is true if the RegExp object's [[OriginalFlags]] internal slot contains "u" and otherwise is false.

WordCharacters is the mathematical set that is the union of all sixty-three characters in "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_" (letters, numbers, and U+005F (LOW LINE) in the Unicode Basic Latin block) and all characters c for which c is not in that set but Canonicalize(c) is. WordCharacters cannot contain more than sixty-three characters unless Unicode and IgnoreCase are both true.
Furthermore, the descriptions below use the following internal data structures:

A CharSet is a mathematical set of characters. When the Unicode flag is true, “all characters” means the CharSet containing all code point values; otherwise “all characters” means the CharSet containing all code unit values.

A State is an ordered pair (endIndex, captures) where endIndex is an integer and captures is a List of NcapturingParens values. States are used to represent partial match states in the regular expression matching algorithms. The endIndex is one plus the index of the last input character matched so far by the pattern, while captures holds the results of capturing parentheses. The n^{th} element of captures is either a List of characters that represents the value obtained by the n^{th} set of capturing parentheses or undefined if the n^{th} set of capturing parentheses hasn't been reached yet. Due to backtracking, many States may be in use at any time during the matching process.

A MatchResult is either a State or the special token failure that indicates that the match failed.

A Continuation is an Abstract Closure that takes one State argument and returns a MatchResult result. The Continuation attempts to match the remaining portion (specified by the closure's captured values) of the pattern against Input, starting at the intermediate state given by its State argument. If the match succeeds, the Continuation returns the final State that it reached; if the match fails, the Continuation returns failure.

A Matcher is an Abstract Closure that takes two arguments (a State and a Continuation) and returns a MatchResult result. A Matcher attempts to match a middle subpattern (specified by the closure's captured values) of the pattern against Input, starting at the intermediate state given by its State argument. The Continuation argument should be a closure that matches the rest of the pattern. After matching the subpattern of a pattern to obtain a new State, the Matcher then calls Continuation on that new State to test if the rest of the pattern can match as well. If it can, the Matcher returns the State returned by Continuation; if not, the Matcher may try different choices at its choice points, repeatedly calling Continuation until it either succeeds or all possibilities have been exhausted.
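The relationship between States, Continuations and Matchers can be sketched with ordinary closures. The code below is an illustration under simplifying assumptions, not the specification's algorithm: it only matches forwards, ignores captures, represents failure as null, and the names charMatcher, sequence, alternation and run are hypothetical.

```javascript
// A State is modelled as { endIndex, captures }; failure is modelled as null.
const failure = null;

// Matcher for one literal character (forward direction only).
function charMatcher(input, ch) {
  return (x, c) => {
    if (x.endIndex >= input.length || input[x.endIndex] !== ch) return failure;
    return c({ endIndex: x.endIndex + 1, captures: x.captures });
  };
}

// Sequence: run m1, passing "m2 followed by c" as its continuation.
function sequence(m1, m2) {
  return (x, c) => m1(x, (y) => m2(y, c));
}

// Alternation: try m1 together with the rest of the pattern, then fall back to m2.
function alternation(m1, m2) {
  return (x, c) => {
    const r = m1(x, c);
    return r !== failure ? r : m2(x, c);
  };
}

// Hypothetical driver: the outermost continuation simply returns the final State.
function run(matcher, index) {
  return matcher({ endIndex: index, captures: [] }, (y) => y);
}

const sketchInput = Array.from("abc");
// Models /(?:a|ab)c/: the choice "a" fails in the sequel, so "ab" is tried next.
const sketchMatcher = sequence(
  alternation(
    charMatcher(sketchInput, "a"),
    sequence(charMatcher(sketchInput, "a"), charMatcher(sketchInput, "b"))
  ),
  charMatcher(sketchInput, "c")
);
const sketchResult = run(sketchMatcher, 0);
```

Because the continuation is called inside each Matcher, a failure in the rest of the pattern returns control to the most recent choice point, which is how backtracking arises without any explicit stack.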
22.2.2.2 Pattern
The production
 Evaluate Disjunction with 1 as its direction argument to obtain a Matcher m.
 Return a new Abstract Closure with parameters (str, index) that captures m and performs the following steps when called:
   Assert: Type(str) is String.
   Assert: index is a non-negative integer which is ≤ the length of str.
   If Unicode is true, let Input be ! StringToCodePoints(str). Otherwise, let Input be a List whose elements are the code units that are the elements of str. Input will be used throughout the algorithms in 22.2.2. Each element of Input is considered to be a character.
   Let InputLength be the number of characters contained in Input. This alias will be used throughout the algorithms in 22.2.2.
   Let listIndex be the index into Input of the character that was obtained from element index of str.
   Let c be a new Continuation with parameters (y) that captures nothing and performs the following steps when called:
     Assert: y is a State.
     Return y.
   Let cap be a List of NcapturingParens undefined values, indexed 1 through NcapturingParens.
   Let x be the State (listIndex, cap).
   Return m(x, c).
A Pattern evaluates (“compiles”) to an
22.2.2.3 Disjunction
With parameter direction.
The production
 Evaluate Alternative with argument direction to obtain a Matcher m.
 Return m.
The production
 Evaluate Alternative with argument direction to obtain a Matcher m1.
 Evaluate Disjunction with argument direction to obtain a Matcher m2.
 Return a new Matcher with parameters (x, c) that captures m1 and m2 and performs the following steps when called:
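The | regular expression operator separates two alternatives. The pattern first tries to match the left Alternative (followed by the sequel of the regular expression); if it fails, it tries to match the right Disjunction (followed by the sequel of the regular expression). If the left Alternative, the right Disjunction, and the sequel all have choice points, all choices in the sequel are tried before moving on to the next choice in the left Alternative. If choices in the left Alternative are exhausted, the right Disjunction is tried instead of the left Alternative. Any capturing parentheses inside a portion of the pattern skipped by | produce undefined values instead of Strings. Thus, for example,
/a|ab/.exec("abc")
returns the result "a" and not "ab". Moreover,
/((a)|(ab))((c)|(bc))/.exec("abc")
returns the array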
["abc", "a", "a", undefined, "bc", undefined, "bc"]
and not
["abc", "ab", undefined, "ab", "c", "c", undefined]
The order in which the two alternatives are tried is independent of the value of direction.
22.2.2.4 Alternative
With parameter direction.
The production
The production
 Evaluate Alternative with argument direction to obtain a Matcher m1.
 Evaluate Term with argument direction to obtain a Matcher m2.
 If direction = 1, then
   Return a new Matcher with parameters (x, c) that captures m1 and m2 and performs the following steps when called:
 Else,
   Assert: direction is -1.
   Return a new Matcher with parameters (x, c) that captures m1 and m2 and performs the following steps when called:
Consecutive
22.2.2.5 Term
With parameter direction.
The production
 Return the Matcher that is the result of evaluating Assertion.
The resulting Matcher is independent of direction.
The production
 Return the Matcher that is the result of evaluating Atom with argument direction.
The production
 Evaluate Atom with argument direction to obtain a Matcher m.
 Evaluate Quantifier to obtain the three results: a non-negative integer min, a non-negative integer (or +∞) max, and Boolean greedy.
 Assert: min ≤ max.
 Let parenIndex be the number of left-capturing parentheses in the entire regular expression that occur to the left of this Term. This is the total number of Atom :: ( GroupSpecifier Disjunction ) Parse Nodes prior to or enclosing this Term.
 Let parenCount be the number of left-capturing parentheses in Atom. This is the total number of Atom :: ( GroupSpecifier Disjunction ) Parse Nodes enclosed by Atom.
 Return a new Matcher with parameters (x, c) that captures m, min, max, greedy, parenIndex, and parenCount and performs the following steps when called:
   Assert: x is a State.
   Assert: c is a Continuation.
   Return ! RepeatMatcher(m, min, max, greedy, x, c, parenIndex, parenCount).
22.2.2.5.1 RepeatMatcher ( m, min, max, greedy, x, c, parenIndex, parenCount )
The abstract operation RepeatMatcher takes arguments m (a Matcher), min (a non-negative
 If max = 0, return c(x).
 Let d be a new Continuation with parameters (y) that captures m, min, max, greedy, x, c, parenIndex, and parenCount and performs the following steps when called:
   Assert: y is a State.
   If min = 0 and y's endIndex = x's endIndex, return failure.
   If min = 0, let min2 be 0; otherwise let min2 be min - 1.
   If max is +∞, let max2 be +∞; otherwise let max2 be max - 1.
   Return ! RepeatMatcher(m, min2, max2, greedy, y, c, parenIndex, parenCount).
 Let cap be a copy of x's captures List.
 For each integer k such that parenIndex < k and k ≤ parenIndex + parenCount, set cap[k] to undefined.
 Let e be x's endIndex.
 Let xr be the State (e, cap).
 If min ≠ 0, return m(xr, d).
 If greedy is false, then
   Let z be c(x).
   If z is not failure, return z.
   Return m(xr, d).
 Let z be m(xr, d).
 If z is not failure, return z.
 Return c(x).
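The RepeatMatcher steps translate almost line for line into a recursive function. The sketch below is illustrative only: failure is modelled as null (named FAIL), States are plain objects, and the atom matcher lowercaseLetter is a hypothetical stand-in for a compiled [a-z] class.

```javascript
const FAIL = null; // stands in for the failure token

// A State is { endIndex, captures }, with captures indexed from 1.
function repeatMatcher(m, min, max, greedy, x, c, parenIndex, parenCount) {
  if (max === 0) return c(x);
  const d = (y) => {
    // Reject an empty iteration once the minimum has been met.
    if (min === 0 && y.endIndex === x.endIndex) return FAIL;
    const min2 = min === 0 ? 0 : min - 1;
    const max2 = max === Infinity ? Infinity : max - 1;
    return repeatMatcher(m, min2, max2, greedy, y, c, parenIndex, parenCount);
  };
  // Clear the captures that belong to the quantified atom.
  const cap = x.captures.slice();
  for (let k = parenIndex + 1; k <= parenIndex + parenCount; k++) cap[k] = undefined;
  const xr = { endIndex: x.endIndex, captures: cap };
  if (min !== 0) return m(xr, d);
  if (!greedy) {
    const z = c(x);
    return z !== FAIL ? z : m(xr, d);
  }
  const z = m(xr, d);
  return z !== FAIL ? z : c(x);
}

// Hypothetical atom matcher for [a-z], forward direction only.
function lowercaseLetter(input) {
  return (x, c) => {
    const ch = input[x.endIndex];
    if (ch === undefined || ch < "a" || ch > "z") return FAIL;
    return c({ endIndex: x.endIndex + 1, captures: x.captures });
  };
}

// Models [a-z]{2,4} and [a-z]{2,4}? applied at index 1 of "abcdefghi".
const repeatInput = Array.from("abcdefghi");
const repeatStart = { endIndex: 1, captures: [undefined] };
const returnState = (y) => y;
const greedyEnd = repeatMatcher(
  lowercaseLetter(repeatInput), 2, 4, true, repeatStart, returnState, 0, 0).endIndex;
const lazyEnd = repeatMatcher(
  lowercaseLetter(repeatInput), 2, 4, false, repeatStart, returnState, 0, 0).endIndex;
```

With an outer continuation that always succeeds, the greedy form stops after four repetitions and the non-greedy form stops as soon as the minimum of two is reached.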
An
If the
Compare
/a[a-z]{2,4}/.exec("abcdefghi")
which returns "abcde" with
/a[a-z]{2,4}?/.exec("abcdefghi")
which returns "abc".
Consider also
/(aa|aabaac|ba|b|c)*/.exec("aabaac")
which, by the choice point ordering above, returns the array
["aaba", "ba"]
and not any of:
["aabaac", "aabaac"]
["aabaac", "c"]
The above ordering of choice points can be used to write a regular expression that calculates the greatest common divisor of two numbers (represented in unary notation). The following example calculates the gcd of 10 and 15:
"aaaaaaaaaa,aaaaaaaaaaaaaaa".replace(/^(a+)\1*,\1+$/, "$1")
which returns the gcd in unary notation "aaaaa".
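The step of RepeatMatcher that sets cap[k] to undefined clears Atom's captures each time Atom is repeated. We can see its behaviour in the regular expression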
/(z)((a+)?(b+)?(c))*/.exec("zaacbbbcac")
which returns the array
["zaacbbbcac", "z", "ac", "a", undefined, "c"]
and not
["zaacbbbcac", "z", "ac", "a", "bbb", "c"]
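because each iteration of the outermost * clears all captured Strings contained in the quantified Atom, which in this case includes capture Strings numbered 2, 3, 4, and 5.
The step of RepeatMatcher that returns failure when min = 0 and y's endIndex = x's endIndex states that once the minimum number of repetitions has been satisfied, any more expansions of Atom that match the empty character sequence are not considered for further repetitions. This prevents the regular expression engine from falling into an infinite loop on patterns such as: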
/(a*)*/.exec("b")
or the slightly more complicated:
/(a*)b\1+/.exec("baaaac")
which returns the array
["b", ""]
22.2.2.6 Assertion
The production
 Return a new Matcher with parameters (x, c) that captures nothing and performs the following steps when called:
   Assert: x is a State.
   Assert: c is a Continuation.
   Let e be x's endIndex.
   If e = 0, or if Multiline is true and the character Input[e - 1] is one of LineTerminator, then
     Return c(x).
   Return failure.
Even when the y flag is used with a pattern, ^ always matches only at the beginning of Input, or (if Multiline is true) at the beginning of a line.
The production
 Return a new Matcher with parameters (x, c) that captures nothing and performs the following steps when called:
   Assert: x is a State.
   Assert: c is a Continuation.
   Let e be x's endIndex.
   If e = InputLength, or if Multiline is true and the character Input[e] is one of LineTerminator, then
     Return c(x).
   Return failure.
The production
 Return a new Matcher with parameters (x, c) that captures nothing and performs the following steps when called:
   Assert: x is a State.
   Assert: c is a Continuation.
   Let e be x's endIndex.
   Let a be ! IsWordChar(e - 1).
   Let b be ! IsWordChar(e).
   If a is true and b is false, or if a is false and b is true, return c(x).
   Return failure.
The production
 Return a new Matcher with parameters (x, c) that captures nothing and performs the following steps when called:
   Assert: x is a State.
   Assert: c is a Continuation.
   Let e be x's endIndex.
   Let a be ! IsWordChar(e - 1).
   Let b be ! IsWordChar(e).
   If a is true and b is true, or if a is false and b is false, return c(x).
   Return failure.
The production
 Evaluate Disjunction with 1 as its direction argument to obtain a Matcher m.
 Return a new Matcher with parameters (x, c) that captures m and performs the following steps when called:
   Assert: x is a State.
   Assert: c is a Continuation.
   Let d be a new Continuation with parameters (y) that captures nothing and performs the following steps when called:
     Assert: y is a State.
     Return y.
   Let r be m(x, d).
   If r is failure, return failure.
   Let y be r's State.
   Let cap be y's captures List.
   Let xe be x's endIndex.
   Let z be the State (xe, cap).
   Return c(z).
The production
 Evaluate Disjunction with 1 as its direction argument to obtain a Matcher m.
 Return a new Matcher with parameters (x, c) that captures m and performs the following steps when called:
The production
 Evaluate Disjunction with -1 as its direction argument to obtain a Matcher m.
 Return a new Matcher with parameters (x, c) that captures m and performs the following steps when called:
   Assert: x is a State.
   Assert: c is a Continuation.
   Let d be a new Continuation with parameters (y) that captures nothing and performs the following steps when called:
     Assert: y is a State.
     Return y.
   Let r be m(x, d).
   If r is failure, return failure.
   Let y be r's State.
   Let cap be y's captures List.
   Let xe be x's endIndex.
   Let z be the State (xe, cap).
   Return c(z).
The production
 Evaluate Disjunction with -1 as its direction argument to obtain a Matcher m.
 Return a new Matcher with parameters (x, c) that captures m and performs the following steps when called:
22.2.2.6.1 IsWordChar ( e )
The abstract operation IsWordChar takes argument e (an
 If e = -1 or e is InputLength, return false.
 Let c be the character Input[e].
 If c is in WordCharacters, return true.
 Return false.
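IsWordChar and the word-boundary assertion that uses it can be sketched as follows. This assumes a pattern without the i and u flags, so WordCharacters is exactly the sixty-three Basic Latin characters; the function and variable names are illustrative.

```javascript
const WORD_CHARACTERS = new Set(
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_"
);

// Sketch of IsWordChar: positions just outside Input are never word characters.
function isWordChar(input, e) {
  if (e === -1 || e === input.length) return false;
  return WORD_CHARACTERS.has(input[e]);
}

// \b succeeds at position e when exactly one neighbour is a word character.
function isWordBoundary(input, e) {
  return isWordChar(input, e - 1) !== isWordChar(input, e);
}

const boundaryInput = Array.from("hi there");
const boundaries = [];
for (let e = 0; e <= boundaryInput.length; e++) {
  if (isWordBoundary(boundaryInput, e)) boundaries.push(e);
}
```

For the input "hi there" the boundary positions are 0, 2, 3 and 8.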
22.2.2.7 Quantifier
The production
 Evaluate QuantifierPrefix to obtain the two results: an integer min and an integer (or +∞) max.
 Return the three results min, max, and true.
The production
 Evaluate QuantifierPrefix to obtain the two results: an integer min and an integer (or +∞) max.
 Return the three results min, max, and false.
The production
 Return the two results 0 and +∞.
The production
 Return the two results 1 and +∞.
The production
 Return the two results 0 and 1.
The production
 Let i be the MV of DecimalDigits (see 12.8.3).
 Return the two results i and i.
The production
 Let i be the MV of DecimalDigits.
 Return the two results i and +∞.
The production
 Let i be the MV of the first DecimalDigits.
 Let j be the MV of the second DecimalDigits.
 Return the two results i and j.
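The Quantifier results can be summarised as a mapping from quantifier text to a (min, max, greedy) triple. The helper below, evaluateQuantifier, is a hypothetical illustration that works on source text rather than Parse Nodes; it uses Infinity for +∞.

```javascript
// Hypothetical helper: maps quantifier source text to { min, max, greedy }.
function evaluateQuantifier(text) {
  // A trailing ? after a QuantifierPrefix makes the quantifier non-greedy.
  const greedy = text === "?" || !text.endsWith("?");
  const prefix = greedy ? text : text.slice(0, -1);

  if (prefix === "*") return { min: 0, max: Infinity, greedy };
  if (prefix === "+") return { min: 1, max: Infinity, greedy };
  if (prefix === "?") return { min: 0, max: 1, greedy };

  const braces = /^\{(\d+)(,(\d*))?\}$/.exec(prefix);
  if (braces === null) throw new SyntaxError("not a quantifier: " + text);

  const min = Number(braces[1]);
  let max;
  if (braces[2] === undefined) max = min;       // {n}
  else if (braces[3] === "") max = Infinity;    // {n,}
  else max = Number(braces[3]);                 // {n,m}

  // Mirrors the early error for out-of-order DecimalDigits.
  if (min > max) throw new SyntaxError("numbers out of order in quantifier");
  return { min, max, greedy };
}

const quantifierExamples = ["*", "+?", "??", "{3}", "{2,}", "{2,4}?"].map(evaluateQuantifier);
```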
22.2.2.8 Atom
With parameter direction.
The production
 Let ch be the character matched by PatternCharacter.
 Let A be a one-element CharSet containing the character ch.
 Return ! CharacterSetMatcher(A, false, direction).
The production
 Let A be the CharSet of all characters.
 If DotAll is not true, then
   Remove from A all characters corresponding to a code point on the right-hand side of the LineTerminator production.
 Return ! CharacterSetMatcher(A, false, direction).
The production
 Return the Matcher that is the result of evaluating AtomEscape with argument direction.
The production
 Evaluate CharacterClass to obtain a CharSet A and a Boolean invert.
 Return ! CharacterSetMatcher(A, invert, direction).
The production
 Evaluate Disjunction with argument direction to obtain a Matcher m.
 Let parenIndex be the number of left-capturing parentheses in the entire regular expression that occur to the left of this Atom. This is the total number of Atom :: ( GroupSpecifier Disjunction ) Parse Nodes prior to or enclosing this Atom.
 Return a new Matcher with parameters (x, c) that captures direction, m, and parenIndex and performs the following steps when called:
   Assert: x is a State.
   Assert: c is a Continuation.
   Let d be a new Continuation with parameters (y) that captures x, c, direction, and parenIndex and performs the following steps when called:
   Return m(x, d).
The production
 Return the Matcher that is the result of evaluating Disjunction with argument direction.
22.2.2.8.1 CharacterSetMatcher ( A, invert, direction )
The abstract operation CharacterSetMatcher takes arguments A (a CharSet), invert (a Boolean), and direction (1 or -1). It performs the following steps when called:
 Return a new Matcher with parameters (x, c) that captures A, invert, and direction and performs the following steps when called:
   Assert: x is a State.
   Assert: c is a Continuation.
   Let e be x's endIndex.
   Let f be e + direction.
   If f < 0 or f > InputLength, return failure.
   Let index be min(e, f).
   Let ch be the character Input[index].
   Let cc be Canonicalize(ch).
   If there exists a member a of A such that Canonicalize(a) is cc, let found be true. Otherwise, let found be false.
   If invert is false and found is false, return failure.
   If invert is true and found is true, return failure.
   Let cap be x's captures List.
   Let y be the State (f, cap).
   Return c(y).
22.2.2.8.2 Canonicalize ( ch )
The abstract operation Canonicalize takes argument ch (a character). It performs the following steps when called:
 If Unicode is true and IgnoreCase is true, then
   If the file CaseFolding.txt of the Unicode Character Database provides a simple or common case folding mapping for ch, return the result of applying that mapping to ch.
   Return ch.
 If IgnoreCase is false, return ch.
 Assert: ch is a UTF-16 code unit.
 Let cp be the code point whose numeric value is that of ch.
 Let u be the result of toUppercase(« cp »), according to the Unicode Default Case Conversion algorithm.
 Let uStr be ! CodePointsToString(u).
 If uStr does not consist of a single code unit, return ch.
 Let cu be uStr's single code unit element.
 If the numeric value of ch ≥ 128 and the numeric value of cu < 128, return ch.
 Return cu.
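The non-Unicode branch of Canonicalize can be approximated in script code. The sketch below assumes that String.prototype.toUpperCase is an acceptable stand-in for toUppercase under the Unicode Default Case Conversion algorithm, and that ch is a one-code-unit String; the function name canonicalizeBmp is hypothetical.

```javascript
// Sketch of Canonicalize for a BMP pattern (Unicode is false).
// ch is assumed to be a String holding exactly one UTF-16 code unit.
function canonicalizeBmp(ch, ignoreCase) {
  if (!ignoreCase) return ch;
  const uStr = ch.toUpperCase();
  // Mappings that expand to several code units (such as ß to SS) are ignored.
  if (uStr.length !== 1) return ch;
  // A non-ASCII character is never mapped onto an ASCII one.
  if (ch.charCodeAt(0) >= 128 && uStr.charCodeAt(0) < 128) return ch;
  return uStr;
}

const canonicalizeExamples = {
  plainLetter: canonicalizeBmp("a", true),      // "A"
  caseSensitive: canonicalizeBmp("a", false),   // "a"
  sharpS: canonicalizeBmp("\u00DF", true),      // unchanged
  longS: canonicalizeBmp("\u017F", true),       // unchanged
};
```

Two characters match case-insensitively when their canonicalized forms are equal, which is why LATIN SMALL LETTER LONG S does not match an ASCII s in a BMP pattern.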
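Parentheses of the form ( Disjunction ) serve both to group the components of the Disjunction pattern together and to save the result of the match. The result can be used either in a backreference (\ followed by a non-zero decimal number), referenced in a replace String, or returned as part of an array from the regular expression matching Abstract Closure. To inhibit the capturing behaviour of parentheses, use the form (?: Disjunction ) instead.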
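The form (?= Disjunction ) specifies a zero-width positive lookahead. In order for it to succeed, the pattern inside Disjunction must match at the current position, but the current position is not advanced before matching the sequel. If Disjunction can match at the current position in several ways, only the first one is tried. Unlike other regular expression operators, there is no backtracking into a (?= form (this unusual behaviour is inherited from Perl). This only matters when the Disjunction contains capturing parentheses and the sequel of the pattern contains backreferences to those captures.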
For example,
/(?=(a+))/.exec("baaabac")
matches the empty String immediately after the first b
and therefore returns the array:
["", "aaa"]
To illustrate the lack of backtracking into the lookahead, consider:
/(?=(a+))a*b\1/.exec("baaabac")
This expression returns
["aba", "a"]
and not:
["aaaba", "a"]
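The form (?! Disjunction ) specifies a zero-width negative lookahead. In order for it to succeed, the pattern inside Disjunction must fail to match at the current position. The current position is not advanced before matching the sequel. Disjunction can contain capturing parentheses, but backreferences to them only make sense from within Disjunction itself. Backreferences to these capturing parentheses from elsewhere in the pattern always return undefined because the negative lookahead must fail for the pattern to succeed. For example,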
/(.*?)a(?!(a+)b\2c)\2(.*)/.exec("baaabaac")
looks for an a not immediately followed by some positive number n of a's, a b, another n a's (specified by the first \2) and a c. The second \2 is outside the negative lookahead, so it matches against undefined and therefore always succeeds. The whole expression returns the array:
["baaabaac", "ba", undefined, "abaac"]
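In case-insignificant matches when Unicode is true, all characters are implicitly case-folded using the simple mapping provided by the Unicode standard immediately before they are compared. The simple mapping always maps to a single code point, so it does not map, for example, ß (U+00DF) to SS. It may however map a code point outside the Basic Latin range to a character within, for example, ſ (U+017F) to s. Such characters are not mapped if Unicode is false. This prevents Unicode code points such as U+0131 and U+017F from matching regular expressions such as /[a-z]/i, but they will match /[a-z]/ui.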
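The effect of the u flag on case-insignificant matching can be observed directly. A brief sketch, with variable names chosen for this example:

```javascript
const longS = "\u017F";   // ſ LATIN SMALL LETTER LONG S
const sharpS = "\u00DF";  // ß LATIN SMALL LETTER SHARP S

const caseFoldingChecks = {
  // Without u, a non-ASCII character is never mapped into the Basic Latin range.
  longSWithoutUnicode: /[a-z]/i.test(longS),
  // With u, simple case folding maps ſ to s, which is inside [a-z].
  longSWithUnicode: /[a-z]/ui.test(longS),
  // Simple folding never expands one code point into two, so ß is not SS.
  sharpSAsDoubleS: /^ss$/ui.test(sharpS),
};
```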
22.2.2.8.3 UnicodeMatchProperty ( p )
The abstract operation UnicodeMatchProperty takes argument p (a
 Assert: p is a List of Unicode code points that is identical to a List of Unicode code points that is a Unicode property name or property alias listed in the “Property name and aliases” column of Table 56 or Table 57.
 Let c be the canonical property name of p as given in the “Canonical property name” column of the corresponding row.
 Return the List of Unicode code points of c.
Implementations must support the Unicode property names and aliases listed in
For example, Script_Extensions (property name) and scx (property alias) are valid, but script_extensions or Scx aren't.
The listed properties form a superset of what UTS18 RL1.2 requires.
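The case-sensitive treatment of property names can be checked by attempting to compile Unicode patterns. This is a minimal sketch; the helper compilesAsUnicodePattern is a name invented for the example.

```javascript
// Hypothetical helper: true if the source compiles as a Unicode pattern.
function compilesAsUnicodePattern(source) {
  try {
    new RegExp(source, "u");
    return true;
  } catch (err) {
    return false;
  }
}

const propertyNameChecks = {
  canonicalName: compilesAsUnicodePattern("\\p{Script_Extensions=Greek}"),
  alias: compilesAsUnicodePattern("\\p{scx=Greek}"),
  lowercasedName: compilesAsUnicodePattern("\\p{script_extensions=Greek}"),
  miscasedAlias: compilesAsUnicodePattern("\\p{Scx=Greek}"),
  binaryProperty: compilesAsUnicodePattern("\\p{Alphabetic}"),
  binaryAlias: compilesAsUnicodePattern("\\p{Alpha}"),
};
```

Only the exact names and aliases from the tables compile; the lowercased and miscased spellings are rejected.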
| Property name and aliases | Canonical property name |
| --- | --- |
| General_Category, gc | General_Category |
| Script, sc | Script |
| Script_Extensions, scx | Script_Extensions |
| Property name and aliases | Canonical property name |
| --- | --- |
| ASCII | ASCII |
| ASCII_Hex_Digit, AHex | ASCII_Hex_Digit |
| Alphabetic, Alpha | Alphabetic |
| Any | Any |
| Assigned | Assigned |
| Bidi_Control, Bidi_C | Bidi_Control |
| Bidi_Mirrored, Bidi_M | Bidi_Mirrored |
| Case_Ignorable, CI | Case_Ignorable |
| Cased | Cased |
| Changes_When_Casefolded, CWCF | Changes_When_Casefolded |
| Changes_When_Casemapped, CWCM | Changes_When_Casemapped |
| Changes_When_Lowercased, CWL | Changes_When_Lowercased |
| Changes_When_NFKC_Casefolded, CWKCF | Changes_When_NFKC_Casefolded |
| Changes_When_Titlecased, CWT | Changes_When_Titlecased |
| Changes_When_Uppercased, CWU | Changes_When_Uppercased |
| Dash | Dash |
| Default_Ignorable_Code_Point, DI | Default_Ignorable_Code_Point |
| Deprecated, Dep | Deprecated |
| Diacritic, Dia | Diacritic |
| Emoji | Emoji |
| Emoji_Component, EComp | Emoji_Component |
| Emoji_Modifier, EMod | Emoji_Modifier |
| Emoji_Modifier_Base, EBase | Emoji_Modifier_Base |
| Emoji_Presentation, EPres | Emoji_Presentation |
| Extended_Pictographic, ExtPict | Extended_Pictographic |
| Extender, Ext | Extender |
| Grapheme_Base, Gr_Base | Grapheme_Base |
| Grapheme_Extend, Gr_Ext | Grapheme_Extend |
| Hex_Digit, Hex | Hex_Digit |
| IDS_Binary_Operator, IDSB | IDS_Binary_Operator |
| IDS_Trinary_Operator, IDST | IDS_Trinary_Operator |
| ID_Continue, IDC | ID_Continue |
| ID_Start, IDS | ID_Start |
| Ideographic, Ideo | Ideographic |
| Join_Control, Join_C | Join_Control |
| Logical_Order_Exception, LOE | Logical_Order_Exception |
| Lowercase, Lower | Lowercase |
| Math | Math |
| Noncharacter_Code_Point, NChar | Noncharacter_Code_Point |
| Pattern_Syntax, Pat_Syn | Pattern_Syntax |
| Pattern_White_Space, Pat_WS | Pattern_White_Space |
| Quotation_Mark, QMark | Quotation_Mark |
| Radical | Radical |
| Regional_Indicator, RI | Regional_Indicator |
| Sentence_Terminal, STerm | Sentence_Terminal |
| Soft_Dotted, SD | Soft_Dotted |
| Terminal_Punctuation, Term | Terminal_Punctuation |
| Unified_Ideograph, UIdeo | Unified_Ideograph |
| Uppercase, Upper | Uppercase |
| Variation_Selector, VS | Variation_Selector |
| White_Space, space | White_Space |
| XID_Continue, XIDC | XID_Continue |
| XID_Start, XIDS | XID_Start |
22.2.2.8.4 UnicodeMatchPropertyValue ( p, v )
The abstract operation UnicodeMatchPropertyValue takes arguments p (a
 Assert: p is a List of Unicode code points that is identical to a List of Unicode code points that is a canonical, unaliased Unicode property name listed in the “Canonical property name” column of Table 56.
 Assert: v is a List of Unicode code points that is identical to a List of Unicode code points that is a property value or property value alias for Unicode property p listed in the “Property value and aliases” column of Table 58 or Table 59.
 Let value be the canonical property value of v as given in the “Canonical property value” column of the corresponding row.
 Return the List of Unicode code points of value.
Implementations must support the Unicode property value names and aliases listed in
For example, Xpeo and Old_Persian are valid Script_Extensions values, but xpeo and Old Persian aren't.
This algorithm differs from the matching rules for symbolic values listed in UAX44: case, white space, U+002D (HYPHEN-MINUS), and U+005F (LOW LINE) are not ignored, and the Is prefix is not supported.
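The strict matching of property values shows up as a SyntaxError for loosely spelled values. The following sketch uses a hypothetical helper, acceptsPropertyEscape, to compare a few spellings.

```javascript
// Hypothetical helper: true if \p{...} with the given body compiles under the u flag.
function acceptsPropertyEscape(body) {
  try {
    new RegExp("\\p{" + body + "}", "u");
    return true;
  } catch (err) {
    return false;
  }
}

const propertyValueChecks = {
  valueAlias: acceptsPropertyEscape("Script_Extensions=Xpeo"),
  canonicalValue: acceptsPropertyEscape("Script_Extensions=Old_Persian"),
  lowercasedAlias: acceptsPropertyEscape("Script_Extensions=xpeo"),
  spaceInsteadOfLowLine: acceptsPropertyEscape("Script_Extensions=Old Persian"),
  loneGeneralCategory: acceptsPropertyEscape("Lu"),
  isPrefix: acceptsPropertyEscape("IsLu"),
};

// Canonical values and their aliases select the same characters.
const aliasEquivalence =
  /^\p{Lu}$/u.test("A") && /^\p{General_Category=Uppercase_Letter}$/u.test("A");
```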
General_Category
| Property value and aliases | Canonical property value |
| --- | --- |
| Cased_Letter, LC | Cased_Letter |
| Close_Punctuation, Pe | Close_Punctuation |
| Connector_Punctuation, Pc | Connector_Punctuation |
| Control, Cc, cntrl | Control |
| Currency_Symbol, Sc | Currency_Symbol |
| Dash_Punctuation, Pd | Dash_Punctuation |
| Decimal_Number, Nd, digit | Decimal_Number |
| Enclosing_Mark, Me | Enclosing_Mark |
| Final_Punctuation, Pf | Final_Punctuation |
| Format, Cf | Format |
| Initial_Punctuation, Pi | Initial_Punctuation |
| Letter, L | Letter |
| Letter_Number, Nl | Letter_Number |
| Line_Separator, Zl | Line_Separator |
| Lowercase_Letter, Ll | Lowercase_Letter |
| Mark, M, Combining_Mark | Mark |
| Math_Symbol, Sm | Math_Symbol |
| Modifier_Letter, Lm | Modifier_Letter |
| Modifier_Symbol, Sk | Modifier_Symbol |
| Nonspacing_Mark, Mn | Nonspacing_Mark |
| Number, N | Number |
| Open_Punctuation, Ps | Open_Punctuation |
| Other, C | Other |
| Other_Letter, Lo | Other_Letter |
| Other_Number, No | Other_Number |
| Other_Punctuation, Po | Other_Punctuation |
| Other_Symbol, So | Other_Symbol |
| Paragraph_Separator, Zp | Paragraph_Separator |
| Private_Use, Co | Private_Use |
| Punctuation, P, punct | Punctuation |
| Separator, Z | Separator |
| Space_Separator, Zs | Space_Separator |
| Spacing_Mark, Mc | Spacing_Mark |
| Surrogate, Cs | Surrogate |
| Symbol, S | Symbol |
| Titlecase_Letter, Lt | Titlecase_Letter |
| Unassigned, Cn | Unassigned |
| Uppercase_Letter, Lu | Uppercase_Letter |
Script
and Script_Extensions
Property value and aliases  Canonical property value 

Adlam 
Adlam 
Adlm 

Ahom 
Ahom 
Anatolian_Hieroglyphs 
Anatolian_Hieroglyphs 
Hluw 

Arabic 
Arabic 
Arab 

Armenian 
Armenian 
Armn 

Avestan 
Avestan 
Avst 

Balinese 
Balinese 
Bali 

Bamum 
Bamum 
Bamu 

Bassa_Vah 
Bassa_Vah 
Bass 

Batak 
Batak 
Batk 

Bengali 
Bengali 
Beng 

Bhaiksuki 
Bhaiksuki 
Bhks 

Bopomofo 
Bopomofo 
Bopo 

Brahmi 
Brahmi 
Brah 

Braille 
Braille 
Brai 

Buginese 
Buginese 
Bugi 

Buhid 
Buhid 
Buhd 

Canadian_Aboriginal 
Canadian_Aboriginal 
Cans 

Carian 
Carian 
Cari 

Caucasian_Albanian 
Caucasian_Albanian 
Aghb 

Chakma 
Chakma 
Cakm 

Cham 
Cham 
Chorasmian 
Chorasmian 
Chrs 

Cherokee 
Cherokee 
Cher 

Common 
Common 
Zyyy 

Coptic 
Coptic 
Copt 

Qaac 

Cuneiform 
Cuneiform 
Xsux 

Cypriot 
Cypriot 
Cprt 

Cyrillic 
Cyrillic 
Cyrl 

Deseret 
Deseret 
Dsrt 

Devanagari 
Devanagari 
Deva 

Dives_Akuru 
Dives_Akuru 
Diak 

Dogra 
Dogra 
Dogr 

Duployan 
Duployan 
Dupl 

Egyptian_Hieroglyphs 
Egyptian_Hieroglyphs 
Egyp 

Elbasan 
Elbasan 
Elba 

Elymaic 
Elymaic 
Elym 

Ethiopic 
Ethiopic 
Ethi 

Georgian 
Georgian 
Geor 

Glagolitic 
Glagolitic 
Glag 

Gothic 
Gothic 
Goth 

Grantha 
Grantha 
Gran 

Greek 
Greek 
Grek 

Gujarati 
Gujarati 
Gujr 

Gunjala_Gondi 
Gunjala_Gondi 
Gong 

Gurmukhi 
Gurmukhi 
Guru 

Han 
Han 
Hani 

Hangul 
Hangul 
Hang 

Hanifi_Rohingya 
Hanifi_Rohingya 
Rohg 

Hanunoo 
Hanunoo 
Hano 

Hatran 
Hatran 
Hatr 

Hebrew 
Hebrew 
Hebr 

Hiragana 
Hiragana 
Hira 

Imperial_Aramaic 
Imperial_Aramaic 
Armi 

Inherited 
Inherited 
Zinh 
Qaai 

Inscriptional_Pahlavi 
Inscriptional_Pahlavi 
Phli 

Inscriptional_Parthian 
Inscriptional_Parthian 
Prti 

Javanese 
Javanese 
Java 

Kaithi 
Kaithi 
Kthi 

Kannada 
Kannada 
Knda 

Katakana 
Katakana 
Kana 

Kayah_Li 
Kayah_Li 
Kali 

Kharoshthi 
Kharoshthi 
Khar 

Khitan_Small_Script 
Khitan_Small_Script 
Kits 

Khmer 
Khmer 
Khmr 

Khojki 
Khojki 
Khoj 

Khudawadi 
Khudawadi 
Sind 

Lao 
Lao 
Laoo 

Latin 
Latin 
Latn 

Lepcha 
Lepcha 
Lepc 

Limbu 
Limbu 
Limb 

Linear_A 
Linear_A 
Lina 

Linear_B 
Linear_B 
Linb 

Lisu 
Lisu 

Lycian 
Lycian 
Lyci 

Lydian 
Lydian 
Lydi 

Mahajani 
Mahajani 
Mahj 

Makasar 
Makasar 
Maka 

Malayalam 
Malayalam 
Mlym 

Mandaic 
Mandaic 
Mand 

Manichaean 
Manichaean 
Mani 

Marchen 
Marchen 
Marc 

Medefaidrin 
Medefaidrin 
Medf 

Masaram_Gondi 
Masaram_Gondi 
Gonm 

Meetei_Mayek 
Meetei_Mayek 
Mtei 

Mende_Kikakui 
Mende_Kikakui 
Mend 

Meroitic_Cursive 
Meroitic_Cursive 
Merc 

Meroitic_Hieroglyphs 
Meroitic_Hieroglyphs 
Mero 

Miao 
Miao 
Plrd 

Modi 
Modi 

Mongolian 
Mongolian 
Mong 

Mro 
Mro 
Mroo 

Multani 
Multani 
Mult 

Myanmar 
Myanmar 
Mymr 

Nabataean 
Nabataean 
Nbat 

Nandinagari 
Nandinagari 
Nand 

New_Tai_Lue 
New_Tai_Lue 
Talu 

Newa 
Newa 

Nko 
Nko 
Nkoo 

Nushu 
Nushu 
Nshu 

Nyiakeng_Puachue_Hmong 
Nyiakeng_Puachue_Hmong 
Hmnp 

Ogham 
Ogham 
Ogam 

Ol_Chiki 
Ol_Chiki 
Olck 

Old_Hungarian 
Old_Hungarian 
Hung 

Old_Italic 
Old_Italic 
Ital 

Old_North_Arabian 
Old_North_Arabian 
Narb 

Old_Permic 
Old_Permic 
Perm 

Old_Persian 
Old_Persian 
Xpeo 

Old_Sogdian 
Old_Sogdian 
Sogo 

Old_South_Arabian 
Old_South_Arabian 
Sarb 

Old_Turkic 
Old_Turkic 
Orkh 

Oriya 
Oriya 
Orya 

Osage 
Osage 
Osge 

Osmanya 
Osmanya 
Osma 

Pahawh_Hmong 
Pahawh_Hmong 
Hmng 

Palmyrene 
Palmyrene 
Palm 

Pau_Cin_Hau 
Pau_Cin_Hau 
Pauc 

Phags_Pa 
Phags_Pa 
Phag 

Phoenician 
Phoenician 
Phnx 

Psalter_Pahlavi 
Psalter_Pahlavi 
Phlp 

Rejang 
Rejang 
Rjng 

Runic 
Runic 
Runr 

Samaritan 
Samaritan 
Samr 

Saurashtra 
Saurashtra 
Saur 

Sharada 
Sharada 
Shrd 

Shavian 
Shavian 
Shaw 

Siddham 
Siddham 
Sidd 

SignWriting 
SignWriting 
Sgnw 

Sinhala 
Sinhala 
Sinh 

Sogdian 
Sogdian 
Sogd 

Sora_Sompeng 
Sora_Sompeng 
Sora 

Soyombo 
Soyombo 
Soyo 

Sundanese 
Sundanese 
Sund 

Syloti_Nagri 
Syloti_Nagri 
Sylo 

Syriac 
Syriac 
Syrc 

Tagalog 
Tagalog 
Tglg 

Tagbanwa 
Tagbanwa 
Tagb 

Tai_Le 
Tai_Le 
Tale 

Tai_Tham 
Tai_Tham 
Lana 

Tai_Viet 
Tai_Viet 
Tavt 

Takri 
Takri 
Takr 

Tamil 
Tamil 
Taml 

Tangut 
Tangut 
Tang 

Telugu 
Telugu 
Telu 

Thaana 
Thaana 
Thaa 

Thai 
Thai 

Tibetan 
Tibetan 
Tibt 

Tifinagh 
Tifinagh 
Tfng 

Tirhuta 
Tirhuta 
Tirh 

Ugaritic 
Ugaritic 
Ugar 

Vai 
Vai 
Vaii 

Wancho 
Wancho 
Wcho 

Warang_Citi 
Warang_Citi 
Wara 

Yezidi 
Yezidi 
Yezi 

Yi 
Yi 
Yiii 

Zanabazar_Square 
Zanabazar_Square 
Zanb 
22.2.2.9 AtomEscape
With parameter direction.
The production AtomEscape :: DecimalEscape evaluates as follows:
  1. Evaluate DecimalEscape to obtain an integer n.
  2. Assert: n ≤ NcapturingParens.
  3. Return ! BackreferenceMatcher(n, direction).
The production AtomEscape :: CharacterEscape evaluates as follows:
  1. Evaluate CharacterEscape to obtain a character ch.
  2. Let A be a one-element CharSet containing the character ch.
  3. Return ! CharacterSetMatcher(A, false, direction).
The production AtomEscape :: CharacterClassEscape evaluates as follows:
  1. Evaluate CharacterClassEscape to obtain a CharSet A.
  2. Return ! CharacterSetMatcher(A, false, direction).
An escape sequence of the form \ followed by a nonzero decimal number n matches the result of the nth set of capturing parentheses (22.2.2.1).
The production AtomEscape :: k GroupName evaluates as follows:
  1. Search the enclosing Pattern for an instance of a GroupSpecifier containing a RegExpIdentifierName which has a CapturingGroupName equal to the CapturingGroupName of the RegExpIdentifierName contained in GroupName.
  2. Assert: A unique such GroupSpecifier is found.
  3. Let parenIndex be the number of left-capturing parentheses in the entire regular expression that occur to the left of the located GroupSpecifier. This is the total number of Atom :: ( GroupSpecifier Disjunction ) Parse Nodes prior to or enclosing the located GroupSpecifier, including its immediately enclosing Atom.
  4. Return ! BackreferenceMatcher(parenIndex, direction).
22.2.2.9.1 BackreferenceMatcher ( n, direction )
The abstract operation BackreferenceMatcher takes arguments n (a positive integer) and direction (1 or -1). It performs the following steps when called:
  1. Assert: n ≥ 1.
  2. Return a new Matcher with parameters (x, c) that captures n and direction and performs the following steps when called:
    a. Assert: x is a State.
    b. Assert: c is a Continuation.
    c. Let cap be x's captures List.
    d. Let s be cap[n].
    e. If s is undefined, return c(x).
    f. Let e be x's endIndex.
    g. Let len be the number of elements in s.
    h. Let f be e + direction × len.
    i. If f < 0 or f > InputLength, return failure.
    j. Let g be min(e, f).
    k. If there exists an integer i between 0 (inclusive) and len (exclusive) such that Canonicalize(s[i]) is not the same character value as Canonicalize(Input[g + i]), return failure.
    l. Let y be the State (f, cap).
    m. Return c(y).
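The observable effect of these steps can be sketched with a few literals. This is an illustration only; the variable names are made up for the example, and the pattern texts are not taken from the specification.

```javascript
// \1 must repeat exactly what group 1 captured.
const doubled = /(ab)\1/;
const doubledHit = doubled.test("abab");
const doubledMiss = doubled.test("abba");

// If the referenced group did not participate, its capture is undefined
// and the backreference succeeds without consuming input.
const skipped = /(?:(a)|b)\1/.exec("b");

// With the i flag the comparison goes through Canonicalize.
const caseFolded = /(a)\1/i.test("aA");

// \k<name> resolves to the same kind of matcher, looked up by group name.
const quoted = /(?<q>['"])(.*?)\k<q>/.exec('say "hi" now');
```

The `skipped` case corresponds to the step that returns c(x) when the capture is undefined.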
22.2.2.10 CharacterEscape
The CharacterEscape productions evaluate as follows:
  1. Let cv be the CharacterValue of this CharacterEscape.
  2. Return the character whose character value is cv.
22.2.2.11 DecimalEscape
The DecimalEscape production evaluates as follows:
  1. Return the CapturingGroupNumber of this DecimalEscape.
If \ is followed by a decimal number n whose first digit is not 0, then the escape sequence is considered to be a backreference. It is an error if n is greater than the total number of left-capturing parentheses in the entire regular expression.
22.2.2.12 CharacterClassEscape
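The production CharacterClassEscape :: d evaluates as follows:
  1. Return the ten-element CharSet containing the characters 0 through 9 inclusive.
The production CharacterClassEscape :: D evaluates as follows:
  1. Return the CharSet containing all characters not in the CharSet returned by CharacterClassEscape :: d.
The production CharacterClassEscape :: s evaluates as follows:
  1. Return the CharSet containing all characters corresponding to a code point on the right-hand side of the WhiteSpace or LineTerminator productions.
The production CharacterClassEscape :: S evaluates as follows:
  1. Return the CharSet containing all characters not in the CharSet returned by CharacterClassEscape :: s.
The production CharacterClassEscape :: w evaluates as follows:
  1. Return WordCharacters.
The production CharacterClassEscape :: W evaluates as follows:
  1. Return the CharSet containing all characters not in the CharSet returned by CharacterClassEscape :: w.
The production CharacterClassEscape :: p{ UnicodePropertyValueExpression } evaluates as follows:
  1. Return the CharSet containing all Unicode code points included in the CharSet returned by UnicodePropertyValueExpression.
The production CharacterClassEscape :: P{ UnicodePropertyValueExpression } evaluates as follows:
  1. Return the CharSet containing all Unicode code points not included in the CharSet returned by UnicodePropertyValueExpression.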
The production UnicodePropertyValueExpression :: UnicodePropertyName = UnicodePropertyValue evaluates as follows:
  1. Let ps be SourceText of UnicodePropertyName.
  2. Let p be ! UnicodeMatchProperty(ps).
  3. Assert: p is a Unicode property name or property alias listed in the “Property name and aliases” column of Table 56.
  4. Let vs be SourceText of UnicodePropertyValue.
  5. Let v be ! UnicodeMatchPropertyValue(p, vs).
  6. Return the CharSet containing all Unicode code points whose character database definition includes the property p with value v.
The production UnicodePropertyValueExpression :: LoneUnicodePropertyNameOrValue evaluates as follows:
  1. Let s be SourceText of LoneUnicodePropertyNameOrValue.
  2. If ! UnicodeMatchPropertyValue(General_Category, s) is identical to a List of Unicode code points that is the name of a Unicode general category or general category alias listed in the “Property value and aliases” column of Table 58, then
    a. Return the CharSet containing all Unicode code points whose character database definition includes the property “General_Category” with value s.
  3. Let p be ! UnicodeMatchProperty(s).
  4. Assert: p is a binary Unicode property or binary property alias listed in the “Property name and aliases” column of Table 57.
  5. Return the CharSet containing all Unicode code points whose character database definition includes the property p with value “True”.
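As a rough illustration of the two UnicodePropertyValueExpression forms, the following sketch uses property escapes with the u flag. The sample characters and variable names are chosen for the example only.

```javascript
// name=value form: Script by canonical value and by alias.
const greekLong = /\p{Script=Greek}/u.test("\u03B1");   // GREEK SMALL LETTER ALPHA
const greekShort = /\p{sc=Grek}/u.test("\u03B1");

// Lone form: tried as a General_Category value first...
const upper = /^\p{Lu}$/u.test("A");
// ...and otherwise looked up as a binary property.
const alpha = /^\p{Alphabetic}$/u.test("\u00E9");       // LATIN SMALL LETTER E WITH ACUTE

// \P{...} selects the complement.
const latinIsNotGreek = /^\P{Script=Greek}$/u.test("a");
const latinIsGreek = /\p{Script=Greek}/u.test("a");
```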
22.2.2.13 CharacterClass
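The production CharacterClass :: [ [lookahead ≠ ^] ClassRanges ] evaluates as follows:
  1. Evaluate ClassRanges to obtain a CharSet A.
  2. Return the two results A and false.
The production CharacterClass :: [^ ClassRanges ] evaluates as follows:
  1. Evaluate ClassRanges to obtain a CharSet A.
  2. Return the two results A and true.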
22.2.2.14 ClassRanges
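The production ClassRanges :: [empty] evaluates as follows:
  1. Return the empty CharSet.
The production ClassRanges :: NonemptyClassRanges evaluates as follows:
  1. Return the CharSet that is the result of evaluating NonemptyClassRanges.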
22.2.2.15 NonemptyClassRanges
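The production NonemptyClassRanges :: ClassAtom evaluates as follows:
  1. Return the CharSet that is the result of evaluating ClassAtom.
The production NonemptyClassRanges :: ClassAtom NonemptyClassRangesNoDash evaluates as follows:
  1. Evaluate ClassAtom to obtain a CharSet A.
  2. Evaluate NonemptyClassRangesNoDash to obtain a CharSet B.
  3. Return the union of CharSets A and B.
The production NonemptyClassRanges :: ClassAtom - ClassAtom ClassRanges evaluates as follows:
  1. Evaluate the first ClassAtom to obtain a CharSet A.
  2. Evaluate the second ClassAtom to obtain a CharSet B.
  3. Evaluate ClassRanges to obtain a CharSet C.
  4. Let D be ! CharacterRange(A, B).
  5. Return the union of D and C.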
22.2.2.15.1 CharacterRange ( A, B )
The abstract operation CharacterRange takes arguments A (a CharSet) and B (a CharSet). It performs the following steps when called:
  1. Assert: A and B each contain exactly one character.
  2. Let a be the one character in CharSet A.
  3. Let b be the one character in CharSet B.
  4. Let i be the character value of character a.
  5. Let j be the character value of character b.
  6. Assert: i ≤ j.
  7. Return the CharSet containing all characters with a character value greater than or equal to i and less than or equal to j.
22.2.2.16 NonemptyClassRangesNoDash
The production NonemptyClassRangesNoDash :: ClassAtom evaluates as follows:
  1. Return the CharSet that is the result of evaluating ClassAtom.
The production NonemptyClassRangesNoDash :: ClassAtomNoDash NonemptyClassRangesNoDash evaluates as follows:
  1. Evaluate ClassAtomNoDash to obtain a CharSet A.
  2. Evaluate NonemptyClassRangesNoDash to obtain a CharSet B.
  3. Return the union of CharSets A and B.
The production NonemptyClassRangesNoDash :: ClassAtomNoDash - ClassAtom ClassRanges evaluates as follows:
  1. Evaluate ClassAtomNoDash to obtain a CharSet A.
  2. Evaluate ClassAtom to obtain a CharSet B.
  3. Evaluate ClassRanges to obtain a CharSet C.
  4. Let D be ! CharacterRange(A, B).
  5. Return the union of D and C.
Even if the pattern ignores case, the case of the two ends of a range is significant in determining which characters belong to the range. Thus, for example, the pattern /[E-F]/i matches only the letters E, F, e, and f, while the pattern /[E-f]/i matches all upper- and lowercase letters in the Unicode Basic Latin block as well as the symbols [, \, ], ^, _, and `.
A - character can be treated literally or it can denote a range. It is treated literally if it is the first or last character of ClassRanges, the beginning or end limit of a range specification, or immediately follows a range specification.
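The range behaviour described in these notes can be checked directly. The sketch below is illustrative; the variable names are invented for the example.

```javascript
// The range is built from the literal end points, then case folding applies.
const narrow = /^[E-F]$/i;
const wide = /^[E-f]$/i;
const narrowLetter = narrow.test("e");
const narrowSymbol = narrow.test("_");
const wideSymbol = wide.test("_");
const wideLetter = wide.test("z");

// A dash at the start or end of the class is a literal HYPHEN-MINUS.
const dashFirst = /^[-a]$/.test("-");
const dashLast = /^[a-]$/.test("-");

// A range whose first end point is larger than its second is an early error.
let reversedError = null;
try {
  new RegExp("[f-E]");
} catch (e) {
  reversedError = e;
}
```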
22.2.2.17 ClassAtom
The production ClassAtom :: - evaluates as follows:
  1. Return the CharSet containing the single character - U+002D (HYPHEN-MINUS).
The production ClassAtom :: ClassAtomNoDash evaluates as follows:
  1. Return the CharSet that is the result of evaluating ClassAtomNoDash.
22.2.2.18 ClassAtomNoDash
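The production ClassAtomNoDash :: SourceCharacter but not one of \ or ] or - evaluates as follows:
  1. Return the CharSet containing the character matched by SourceCharacter.
The production ClassAtomNoDash :: \ ClassEscape evaluates as follows:
  1. Return the CharSet that is the result of evaluating ClassEscape.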
22.2.2.19 ClassEscape
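The ClassEscape productions that denote a single character (ClassEscape :: b, ClassEscape :: -, and ClassEscape :: CharacterEscape) evaluate as follows:
  1. Let cv be the CharacterValue of this ClassEscape.
  2. Let c be the character whose character value is cv.
  3. Return the CharSet containing the single character c.
The production ClassEscape :: CharacterClassEscape evaluates as follows:
  1. Return the CharSet that is the result of evaluating CharacterClassEscape.
A ClassAtom can use any of the escape sequences that are allowed in the rest of the regular expression except for \b, \B, and backreferences. Inside a CharacterClass, \b means the backspace character, while \B and backreferences raise errors. Using a backreference inside a ClassAtom causes an error.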
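A small sketch of the two meanings of \b follows. The u flag is used for the \B case on the assumption that an implementation's web-compatibility extensions may accept [\B] without it; the variable names are illustrative.

```javascript
// Inside a character class, \b is the backspace character U+0008.
const backspaceInClass = /[\b]/.test("\u0008");
const letterInClass = /[\b]/.test("b");

// Outside a class, \b is the word-boundary assertion.
const boundaryOutside = /\bcat\b/.test("a cat");

// \B inside a class is rejected when the pattern is parsed with the u flag.
let classBError = null;
try {
  new RegExp("[\\B]", "u");
} catch (e) {
  classBError = e;
}
```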
22.2.3 The RegExp Constructor
The RegExp constructor:
 is %RegExp%.
 is the initial value of the "RegExp" property of the global object.
 creates and initializes a new RegExp object when called as a function rather than as a constructor. Thus the function call RegExp(…) is equivalent to the object creation expression new RegExp(…) with the same arguments.
 is designed to be subclassable. It may be used as the value of an extends clause of a class definition. Subclass constructors that intend to inherit the specified RegExp behaviour must include a super call to the RegExp constructor to create and initialize subclass instances with the necessary internal slots.
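A minimal subclassing sketch, assuming a hypothetical class name LoggedRegExp and a hypothetical calls counter (neither appears in the specification):

```javascript
class LoggedRegExp extends RegExp {
  constructor(pattern, flags) {
    // The super call is what allocates the RegExp internal slots.
    super(pattern, flags);
    this.calls = 0;
  }

  exec(string) {
    // Count every match attempt, then defer to the built-in behaviour.
    this.calls += 1;
    return super.exec(string);
  }
}

const logged = new LoggedRegExp("b+", "g");
const found = logged.test("abbbc");
```

Because test looks up "exec" dynamically, the overriding method is invoked and the counter advances.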
22.2.3.1 RegExp ( pattern, flags )
The following steps are taken:
  1. Let patternIsRegExp be ? IsRegExp(pattern).
  2. If NewTarget is undefined, then
    a. Let newTarget be the active function object.
    b. If patternIsRegExp is true and flags is undefined, then
      i. Let patternConstructor be ? Get(pattern, "constructor").
      ii. If SameValue(newTarget, patternConstructor) is true, return pattern.
  3. Else, let newTarget be NewTarget.
  4. If Type(pattern) is Object and pattern has a [[RegExpMatcher]] internal slot, then
    a. Let P be pattern.[[OriginalSource]].
    b. If flags is undefined, let F be pattern.[[OriginalFlags]].
    c. Else, let F be flags.
  5. Else if patternIsRegExp is true, then
    a. Let P be ? Get(pattern, "source").
    b. If flags is undefined, let F be ? Get(pattern, "flags").
    c. Else, let F be flags.
  6. Else,
    a. Let P be pattern.
    b. Let F be flags.
  7. Let O be ? RegExpAlloc(newTarget).
  8. Return ? RegExpInitialize(O, P, F).
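If pattern is supplied using a StringLiteral, the usual escape sequence substitutions are performed before the String is processed by RegExp. If pattern must contain an escape sequence to be recognized by RegExp, any U+005C (REVERSE SOLIDUS) code points must be escaped within the StringLiteral to prevent them being removed when the contents of the StringLiteral are formed.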
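The branches of the constructor can be observed from script code. The following sketch is illustrative and uses made-up variable names.

```javascript
const original = /a+/g;

// Called as a function with a RegExp and no flags: the same object comes back.
const sameObject = RegExp(original) === original;

// Called with new: a fresh object with the same source and flags.
const copied = new RegExp(original);

// Explicit flags replace the flags of the pattern object.
const reflagged = new RegExp(original, "i");

// In a string literal the backslash itself has to be escaped.
const fromString = new RegExp("\\d+");
```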
22.2.3.2 Abstract Operations for the RegExp Constructor
22.2.3.2.1 RegExpAlloc ( newTarget )
The abstract operation RegExpAlloc takes argument newTarget. It performs the following steps when called:
  1. Let obj be ? OrdinaryCreateFromConstructor(newTarget, "%RegExp.prototype%", « [[RegExpMatcher]], [[OriginalSource]], [[OriginalFlags]] »).
  2. Perform ! DefinePropertyOrThrow(obj, "lastIndex", PropertyDescriptor { [[Writable]]: true, [[Enumerable]]: false, [[Configurable]]: false }).
  3. Return obj.
22.2.3.2.2 RegExpInitialize ( obj, pattern, flags )
The abstract operation RegExpInitialize takes arguments obj, pattern, and flags. It performs the following steps when called:
  1. If pattern is undefined, let P be the empty String.
  2. Else, let P be ? ToString(pattern).
  3. If flags is undefined, let F be the empty String.
  4. Else, let F be ? ToString(flags).
  5. If F contains any code unit other than "g", "i", "m", "s", "u", or "y" or if it contains the same code unit more than once, throw a SyntaxError exception.
  6. If F contains "u", let u be true; else let u be false.
  7. If u is true, then
    a. Let patternText be ! StringToCodePoints(P).
    b. Let patternCharacters be a List whose elements are the code points of patternText.
  8. Else,
    a. Let patternText be the result of interpreting each of P's 16-bit elements as a Unicode BMP code point. UTF-16 decoding is not applied to the elements.
    b. Let patternCharacters be a List whose elements are the code unit elements of P.
  9. Let parseResult be ParsePattern(patternText, u).
  10. If parseResult is a non-empty List of SyntaxError objects, throw a SyntaxError exception.
  11. Assert: parseResult is a Parse Node for Pattern.
  12. Set obj.[[OriginalSource]] to P.
  13. Set obj.[[OriginalFlags]] to F.
  14. Set obj.[[RegExpMatcher]] to the Abstract Closure that evaluates parseResult by applying the semantics provided in 22.2.2 using patternCharacters as the pattern's List of SourceCharacter values and F as the flag parameters.
  15. Perform ? Set(obj, "lastIndex", +0𝔽, true).
  16. Return obj.
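The flag validation and the defaults for undefined arguments can be sketched as follows. The helper name flagError is hypothetical and exists only for this example.

```javascript
// Returns the error thrown for a given flags string, or null if none is thrown.
function flagError(flags) {
  try {
    new RegExp("a", flags);
    return null;
  } catch (e) {
    return e;
  }
}

const duplicateFlag = flagError("gg"); // same code unit twice
const unknownFlag = flagError("x");    // not a recognised flag
const validFlags = flagError("gimsuy");

// Both arguments undefined: empty pattern, empty flags, lastIndex reset to 0.
const defaults = new RegExp();
```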
22.2.3.2.3 Static Semantics: ParsePattern ( patternText, u )
The abstract operation ParsePattern takes arguments patternText (a sequence of Unicode code points) and u (a Boolean). It performs the following steps when called:
22.2.3.2.4 RegExpCreate ( P, F )
The abstract operation RegExpCreate takes arguments P and F. It performs the following steps when called:
  1. Let obj be ? RegExpAlloc(%RegExp%).
  2. Return ? RegExpInitialize(obj, P, F).
22.2.3.2.5 EscapeRegExpPattern ( P, F )
The abstract operation EscapeRegExpPattern takes arguments P and F. It performs the following steps when called:
  1. Let S be a String in the form of a Pattern[~U] (Pattern[+U] if F contains "u") equivalent to P interpreted as UTF-16 encoded Unicode code points (6.1.4), in which certain code points are escaped as described below. S may or may not be identical to P; however, the Abstract Closure that would result from evaluating S as a Pattern[~U] (Pattern[+U] if F contains "u") must behave identically to the Abstract Closure given by the constructed object's [[RegExpMatcher]] internal slot. Multiple calls to this abstract operation using the same values for P and F must produce identical results.
  2. The code points / or any LineTerminator occurring in the pattern shall be escaped in S as necessary to ensure that the string-concatenation of "/", S, "/", and F can be parsed (in an appropriate lexical context) as a RegularExpressionLiteral that behaves identically to the constructed regular expression. For example, if P is "/", then S could be "\/" or "\u002F", among other possibilities, but not "/", because /// followed by F would be parsed as a SingleLineComment rather than a RegularExpressionLiteral. If P is the empty String, this specification can be met by letting S be "(?:)".
  3. Return S.
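Since the exact escaping is left to the implementation, the sketch below checks only properties that follow from the requirements above, plus the "(?:)" form for the empty pattern, which the text permits and which is assumed here to be what the host engine produces.

```javascript
const slash = new RegExp("/");
const empty = new RegExp("");

// The reported source can be fed back in and yields an equivalent pattern.
const roundTrip = new RegExp(slash.source, slash.flags);
const roundTripMatches = roundTrip.test("a/b");

// The bare "/" form is ruled out because "///" would start a comment.
const slashIsEscaped = slash.source !== "/";
```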
22.2.4 Properties of the RegExp Constructor
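The RegExp constructor:
 has a [[Prototype]] internal slot whose value is %Function.prototype%.
 has the following properties:
22.2.4.1 RegExp.prototype
The initial value of RegExp.prototype is the RegExp prototype object.
This property has the attributes { [[Writable]]: false, [[Enumerable]]: false, [[Configurable]]: false }.
22.2.4.2 get RegExp [ @@species ]
RegExp[@@species] is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps:
  1. Return the this value.
The value of the "name" property of this function is "get [Symbol.species]".
RegExp prototype methods normally use their this value's constructor to create a derived object. However, a subclass constructor may override that default behaviour by redefining its @@species property.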
22.2.5 Properties of the RegExp Prototype Object
The RegExp prototype object:
 is %RegExp.prototype%.
 is an ordinary object.
 is not a RegExp instance and does not have a [[RegExpMatcher]] internal slot or any of the other internal slots of RegExp instance objects.
 has a [[Prototype]] internal slot whose value is %Object.prototype%.
The RegExp prototype object does not have a "valueOf" property of its own; however, it inherits the "valueOf" property from the Object prototype object.
22.2.5.1 RegExp.prototype.constructor
The initial value of RegExp.prototype.constructor is %RegExp%.
22.2.5.2 RegExp.prototype.exec ( string )
Performs a regular expression match of string against the regular expression and returns an Array object containing the results of the match, or null if string did not match.
The String ToString(string) is searched for an occurrence of the regular expression pattern as follows:
  1. Let R be the this value.
  2. Perform ? RequireInternalSlot(R, [[RegExpMatcher]]).
  3. Let S be ? ToString(string).
  4. Return ? RegExpBuiltinExec(R, S).
22.2.5.2.1 RegExpExec ( R, S )
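The abstract operation RegExpExec takes arguments R and S. It performs the following steps when called:
  1. Assert: Type(R) is Object.
  2. Assert: Type(S) is String.
  3. Let exec be ? Get(R, "exec").
  4. If IsCallable(exec) is true, then
    a. Let result be ? Call(exec, R, « S »).
    b. If Type(result) is neither Object nor Null, throw a TypeError exception.
    c. Return result.
  5. Perform ? RequireInternalSlot(R, [[RegExpMatcher]]).
  6. Return ? RegExpBuiltinExec(R, S).
If a callable "exec" property is not found this algorithm falls back to attempting to use the built-in RegExp matching algorithm. This provides compatible behaviour for code written for prior editions where most built-in algorithms that use regular expressions did not perform a dynamic property lookup of "exec".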
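The dynamic lookup of "exec" can be observed through RegExp.prototype.test, which goes through RegExpExec. In this sketch the names seen, custom, and bad are illustrative.

```javascript
// An own "exec" property takes precedence over the built-in matcher.
const seen = [];
const custom = /a/;
custom.exec = function (s) {
  seen.push(s);
  return null; // report "no match" regardless of the input
};
const customResult = custom.test("aaa");

// A result that is neither an Object nor null is rejected.
const bad = /a/;
bad.exec = () => 42;
let badError = null;
try {
  bad.test("a");
} catch (e) {
  badError = e;
}
```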
22.2.5.2.2 RegExpBuiltinExec ( R, S )
The abstract operation RegExpBuiltinExec takes arguments R and S. It performs the following steps when called:
  1. Assert: R is an initialized RegExp instance.
  2. Assert: Type(S) is String.
  3. Let length be the number of code units in S.
  4. Let lastIndex be ℝ(? ToLength(? Get(R, "lastIndex"))).
  5. Let flags be R.[[OriginalFlags]].
  6. If flags contains "g", let global be true; else let global be false.
  7. If flags contains "y", let sticky be true; else let sticky be false.
  8. If global is false and sticky is false, set lastIndex to 0.
  9. Let matcher be R.[[RegExpMatcher]].
  10. If flags contains "u", let fullUnicode be true; else let fullUnicode be false.
  11. Let matchSucceeded be false.
  12. Repeat, while matchSucceeded is false,
    a. If lastIndex > length, then
      i. If global is true or sticky is true, then
        1. Perform ? Set(R, "lastIndex", +0𝔽, true).
      ii. Return null.
    b. Let r be matcher(S, lastIndex).
    c. If r is failure, then
      i. If sticky is true, then
        1. Perform ? Set(R, "lastIndex", +0𝔽, true).
        2. Return null.
      ii. Set lastIndex to AdvanceStringIndex(S, lastIndex, fullUnicode).
    d. Else,
      i. Assert: r is a State.
      ii. Set matchSucceeded to true.
  13. Let e be r's endIndex value.
  14. If fullUnicode is true, then
    a. e is an index into the Input character list, derived from S, matched by matcher. Let eUTF be the smallest index into S that corresponds to the character at element e of Input. If e is greater than or equal to the number of elements in Input, then eUTF is the number of code units in S.
    b. Set e to eUTF.
  15. If global is true or sticky is true, then
    a. Perform ? Set(R, "lastIndex", 𝔽(e), true).
  16. Let n be the number of elements in r's captures List. (This is the same value as 22.2.2.1's NcapturingParens.)
  17. Assert: n < 2^32 - 1.
  18. Let A be ! ArrayCreate(n + 1).
  19. Assert: The mathematical value of A's "length" property is n + 1.
  20. Perform ! CreateDataPropertyOrThrow(A, "index", 𝔽(lastIndex)).
  21. Perform ! CreateDataPropertyOrThrow(A, "input", S).
  22. Let matchedSubstr be the substring of S from lastIndex to e.
  23. Perform ! CreateDataPropertyOrThrow(A, "0", matchedSubstr).
  24. If R contains any GroupName, then
    a. Let groups be ! OrdinaryObjectCreate(null).
  25. Else,
    a. Let groups be undefined.
  26. Perform ! CreateDataPropertyOrThrow(A, "groups", groups).
  27. For each integer i such that i ≥ 1 and i ≤ n, do
    a. Let captureI be the ith element of r's captures List.
    b. If captureI is undefined, let capturedValue be undefined.
    c. Else if fullUnicode is true, then
      i. Assert: captureI is a List of code points.
      ii. Let capturedValue be ! CodePointsToString(captureI).
    d. Else,
      i. Assert: fullUnicode is false.
      ii. Assert: captureI is a List of code units.
      iii. Let capturedValue be the String value consisting of the code units of captureI.
    e. Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(i)), capturedValue).
    f. If the ith capture of R was defined with a GroupName, then
      i. Let s be the CapturingGroupName of the corresponding RegExpIdentifierName.
      ii. Perform ! CreateDataPropertyOrThrow(groups, s, capturedValue).
  28. Return A.
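The result array and the lastIndex bookkeeping produced by these steps can be seen in a short run. The pattern and input below are invented for the illustration.

```javascript
const dated = /(?<year>\d{4})-(?<month>\d{2})/g;
const text = "from 2020-01 to 2021-12";

const first = dated.exec(text);
const afterFirst = dated.lastIndex;   // end of the first match
const second = dated.exec(text);
const third = dated.exec(text);       // no further match
const afterFail = dated.lastIndex;    // reset to 0 on failure with g or y

// A sticky pattern does not advance past a failing position.
const sticky = /\d/y;
const stickyMiss = sticky.exec("a1");
```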
22.2.5.2.3 AdvanceStringIndex ( S, index, unicode )
The abstract operation AdvanceStringIndex takes arguments S (a String), index (a non-negative integer), and unicode (a Boolean). It performs the following steps when called:
  1. Assert: index ≤ 2^53 - 1.
  2. If unicode is false, return index + 1.
  3. Let length be the number of code units in S.
  4. If index + 1 ≥ length, return index + 1.
  5. Let cp be ! CodePointAt(S, index).
  6. Return index + cp.[[CodeUnitCount]].
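The steps above can be approximated in script code. This is a sketch, not the abstract operation itself: the function name advanceStringIndex is hypothetical, and String.prototype.codePointAt stands in for CodePointAt, with the code unit count derived from whether the code point lies above U+FFFF.

```javascript
function advanceStringIndex(s, index, unicode) {
  // Without the u flag, always step by one code unit.
  if (!unicode) return index + 1;
  // At or past the last code unit there is no pair to skip.
  if (index + 1 >= s.length) return index + 1;
  // A valid surrogate pair decodes to a code point above U+FFFF.
  const cp = s.codePointAt(index);
  return index + (cp > 0xffff ? 2 : 1);
}

const astral = "\uD83D\uDE00a"; // one astral code point followed by "a"
const stepUnicode = advanceStringIndex(astral, 0, true);
const stepCodeUnit = advanceStringIndex(astral, 0, false);
const stepLoneSurrogate = advanceStringIndex("\uD83Da", 0, true);
```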
22.2.5.3 get RegExp.prototype.dotAll
RegExp.prototype.dotAll is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps:
  1. Let R be the this value.
  2. If Type(R) is not Object, throw a TypeError exception.
  3. If R does not have an [[OriginalFlags]] internal slot, then
    a. If SameValue(R, %RegExp.prototype%) is true, return undefined.
    b. Otherwise, throw a TypeError exception.
  4. Let flags be R.[[OriginalFlags]].
  5. If flags contains the code unit 0x0073 (LATIN SMALL LETTER S), return true.
  6. Return false.
22.2.5.4 get RegExp.prototype.flags
RegExp.prototype.flags is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps:
  1. Let R be the this value.
  2. If Type(R) is not Object, throw a TypeError exception.
  3. Let result be the empty String.
  4. Let global be ! ToBoolean(? Get(R, "global")).
  5. If global is true, append the code unit 0x0067 (LATIN SMALL LETTER G) as the last code unit of result.
  6. Let ignoreCase be ! ToBoolean(? Get(R, "ignoreCase")).
  7. If ignoreCase is true, append the code unit 0x0069 (LATIN SMALL LETTER I) as the last code unit of result.
  8. Let multiline be ! ToBoolean(? Get(R, "multiline")).
  9. If multiline is true, append the code unit 0x006D (LATIN SMALL LETTER M) as the last code unit of result.
  10. Let dotAll be ! ToBoolean(? Get(R, "dotAll")).
  11. If dotAll is true, append the code unit 0x0073 (LATIN SMALL LETTER S) as the last code unit of result.
  12. Let unicode be ! ToBoolean(? Get(R, "unicode")).
  13. If unicode is true, append the code unit 0x0075 (LATIN SMALL LETTER U) as the last code unit of result.
  14. Let sticky be ! ToBoolean(? Get(R, "sticky")).
  15. If sticky is true, append the code unit 0x0079 (LATIN SMALL LETTER Y) as the last code unit of result.
  16. Return result.
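Because the getter only reads properties and coerces them to Booleans, it can be applied to any object. A small sketch, with illustrative variable names:

```javascript
const flagsGetter = Object.getOwnPropertyDescriptor(RegExp.prototype, "flags").get;

// Works on a plain object: truthy "global" and "sticky" yield "gy".
const fromPlainObject = flagsGetter.call({ global: true, sticky: 1 });

// The result follows the fixed order of the steps, not the order given to the constructor.
const canonicalOrder = new RegExp("a", "ymig").flags;

// A non-object this value is rejected.
let primitiveError = null;
try {
  flagsGetter.call("g");
} catch (e) {
  primitiveError = e;
}
```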
22.2.5.5 get RegExp.prototype.global
RegExp.prototype.global is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps:
  1. Let R be the this value.
  2. If Type(R) is not Object, throw a TypeError exception.
  3. If R does not have an [[OriginalFlags]] internal slot, then
    a. If SameValue(R, %RegExp.prototype%) is true, return undefined.
    b. Otherwise, throw a TypeError exception.
  4. Let flags be R.[[OriginalFlags]].
  5. If flags contains the code unit 0x0067 (LATIN SMALL LETTER G), return true.
  6. Return false.
22.2.5.6 get RegExp.prototype.ignoreCase
RegExp.prototype.ignoreCase is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps:
  1. Let R be the this value.
  2. If Type(R) is not Object, throw a TypeError exception.
  3. If R does not have an [[OriginalFlags]] internal slot, then
    a. If SameValue(R, %RegExp.prototype%) is true, return undefined.
    b. Otherwise, throw a TypeError exception.
  4. Let flags be R.[[OriginalFlags]].
  5. If flags contains the code unit 0x0069 (LATIN SMALL LETTER I), return true.
  6. Return false.
22.2.5.7 RegExp.prototype [ @@match ] ( string )
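When the @@match method is called with argument string, the following steps are taken:
  1. Let rx be the this value.
  2. If Type(rx) is not Object, throw a TypeError exception.
  3. Let S be ? ToString(string).
  4. Let global be ! ToBoolean(? Get(rx, "global")).
  5. If global is false, then
    a. Return ? RegExpExec(rx, S).
  6. Else,
    a. Assert: global is true.
    b. Let fullUnicode be ! ToBoolean(? Get(rx, "unicode")).
    c. Perform ? Set(rx, "lastIndex", +0𝔽, true).
    d. Let A be ! ArrayCreate(0).
    e. Let n be 0.
    f. Repeat,
      i. Let result be ? RegExpExec(rx, S).
      ii. If result is null, then
        1. If n = 0, return null.
        2. Return A.
      iii. Else,
        1. Let matchStr be ? ToString(? Get(result, "0")).
        2. Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(n)), matchStr).
        3. If matchStr is the empty String, then
          a. Let thisIndex be ℝ(? ToLength(? Get(rx, "lastIndex"))).
          b. Let nextIndex be AdvanceStringIndex(S, thisIndex, fullUnicode).
          c. Perform ? Set(rx, "lastIndex", 𝔽(nextIndex), true).
        4. Set n to n + 1.
The value of the "name" property of this function is "[Symbol.match]".
The @@match property is used by the IsRegExp abstract operation to identify objects that have the basic behaviour of regular expressions. The absence of a @@match property or the existence of such a property whose value does not Boolean coerce to true indicates that the object is not intended to be used as a regular expression object.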
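The difference between the non-global and global branches, and the advance past empty matches, can be sketched as follows (illustrative inputs and names):

```javascript
// Non-global: the result is whatever RegExpExec returns.
const firstOnly = "a1b22c".match(/\d+/);

// Global: an array of match strings, without index or groups.
const allDigits = "a1b22c".match(/\d+/g);

// Global with no match at all: null rather than an empty array.
const noMatch = "abc".match(/\d/g);

// Empty matches: lastIndex is advanced after each one, so the loop terminates.
const emptyMatches = "abc".match(/(?:)/g);
```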
22.2.5.8 RegExp.prototype [ @@matchAll ] ( string )
When the @@matchAll method is called with argument string, the following steps are taken:
  1. Let R be the this value.
  2. If Type(R) is not Object, throw a TypeError exception.
  3. Let S be ? ToString(string).
  4. Let C be ? SpeciesConstructor(R, %RegExp%).
  5. Let flags be ? ToString(? Get(R, "flags")).
  6. Let matcher be ? Construct(C, « R, flags »).
  7. Let lastIndex be ? ToLength(? Get(R, "lastIndex")).
  8. Perform ? Set(matcher, "lastIndex", lastIndex, true).
  9. If flags contains "g", let global be true.
  10. Else, let global be false.
  11. If flags contains "u", let fullUnicode be true.
  12. Else, let fullUnicode be false.
  13. Return ! CreateRegExpStringIterator(matcher, S, global, fullUnicode).
The value of the "name" property of this function is "[Symbol.matchAll]".
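Because the method iterates over a freshly constructed matcher, the receiver's own lastIndex is used as the starting point but is never written. A brief sketch with illustrative names:

```javascript
const source = /a/g;
source.lastIndex = 1;

// Iteration starts at the copied lastIndex, so the match at index 0 is skipped.
const hits = [...source[Symbol.matchAll]("aaa")].map((m) => m.index);

// The original object is left as it was.
const untouched = source.lastIndex;
```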
22.2.5.9 get RegExp.prototype.multiline
RegExp.prototype.multiline is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps:
  1. Let R be the this value.
  2. If Type(R) is not Object, throw a TypeError exception.
  3. If R does not have an [[OriginalFlags]] internal slot, then
    a. If SameValue(R, %RegExp.prototype%) is true, return undefined.
    b. Otherwise, throw a TypeError exception.
  4. Let flags be R.[[OriginalFlags]].
  5. If flags contains the code unit 0x006D (LATIN SMALL LETTER M), return true.
  6. Return false.
22.2.5.10 RegExp.prototype [ @@replace ] ( string, replaceValue )
When the @@replace method is called with arguments string and replaceValue, the following steps are taken:
  1. Let rx be the this value.
  2. If Type(rx) is not Object, throw a TypeError exception.
  3. Let S be ? ToString(string).
  4. Let lengthS be the number of code unit elements in S.
  5. Let functionalReplace be IsCallable(replaceValue).
  6. If functionalReplace is false, then
    a. Set replaceValue to ? ToString(replaceValue).
  7. Let global be ! ToBoolean(? Get(rx, "global")).
  8. If global is true, then
    a. Let fullUnicode be ! ToBoolean(? Get(rx, "unicode")).
    b. Perform ? Set(rx, "lastIndex", +0𝔽, true).
  9. Let results be a new empty List.
  10. Let done be false.
  11. Repeat, while done is false,
    a. Let result be ? RegExpExec(rx, S).
    b. If result is null, set done to true.
    c. Else,
      i. Append result to the end of results.
      ii. If global is false, set done to true.
      iii. Else,
        1. Let matchStr be ? ToString(? Get(result, "0")).
        2. If matchStr is the empty String, then
          a. Let thisIndex be ℝ(? ToLength(? Get(rx, "lastIndex"))).
          b. Let nextIndex be AdvanceStringIndex(S, thisIndex, fullUnicode).
          c. Perform ? Set(rx, "lastIndex", 𝔽(nextIndex), true).
  12. Let accumulatedResult be the empty String.
  13. Let nextSourcePosition be 0.
  14. For each element result of results, do
    a. Let resultLength be ? LengthOfArrayLike(result).
    b. Let nCaptures be max(resultLength - 1, 0).
    c. Let matched be ? ToString(? Get(result, "0")).
    d. Let matchLength be the number of code units in matched.
    e. Let position be ? ToIntegerOrInfinity(? Get(result, "index")).
    f. Set position to the result of clamping position between 0 and lengthS.
    g. Let n be 1.
    h. Let captures be a new empty List.
    i. Repeat, while n ≤ nCaptures,
      i. Let capN be ? Get(result, ! ToString(𝔽(n))).
      ii. If capN is not undefined, then
        1. Set capN to ? ToString(capN).
      iii. Append capN as the last element of captures.
      iv. Set n to n + 1.
    j. Let namedCaptures be ? Get(result, "groups").
    k. If functionalReplace is true, then
      i. Let replacerArgs be « matched ».
      ii. Append in List order the elements of captures to the end of the List replacerArgs.
      iii. Append 𝔽(position) and S to replacerArgs.
      iv. If namedCaptures is not undefined, then
        1. Append namedCaptures as the last element of replacerArgs.
      v. Let replValue be ? Call(replaceValue, undefined, replacerArgs).
      vi. Let replacement be ? ToString(replValue).
    l. Else,
      i. If namedCaptures is not undefined, then
        1. Set namedCaptures to ? ToObject(namedCaptures).
      ii. Let replacement be ? GetSubstitution(matched, S, position, captures, namedCaptures, replaceValue).
    m. If position ≥ nextSourcePosition, then
      i. NOTE: position should not normally move backwards. If it does, it is an indication of an ill-behaving RegExp subclass or use of an access triggered side-effect to change the global flag or other characteristics of rx. In such cases, the corresponding substitution is ignored.
      ii. Set accumulatedResult to the string-concatenation of accumulatedResult, the substring of S from nextSourcePosition to position, and replacement.
      iii. Set nextSourcePosition to position + matchLength.
  15. If nextSourcePosition ≥ lengthS, return accumulatedResult.
  16. Return the string-concatenation of accumulatedResult and the substring of S from nextSourcePosition.
The value of the "name" property of this function is "[Symbol.replace]".
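The argument list built for a replacer function (matched text, captures, position, input, and the groups object when present) can be inspected directly. The names calls, swapped, viaTemplate, and everyMatch are illustrative.

```javascript
const calls = [];
const swapped = "john smith".replace(/(?<first>\w+) (?<last>\w+)/, (...args) => {
  calls.push(args);
  // With named groups, the groups object is the final argument.
  const groups = args[args.length - 1];
  return `${groups.last}, ${groups.first}`;
});

// A string replaceValue goes through GetSubstitution instead.
const viaTemplate = "john smith".replace(/(\w+) (\w+)/, "$2, $1");

// With the g flag every match is collected before substitution starts.
const everyMatch = "a-b-c".replace(/-/g, "+");
```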
22.2.5.11 RegExp.prototype [ @@search ] ( string )
When the @@search method is called with argument string, the following steps are taken:
  1. Let rx be the this value.
  2. If Type(rx) is not Object, throw a TypeError exception.
  3. Let S be ? ToString(string).
  4. Let previousLastIndex be ? Get(rx, "lastIndex").
  5. If SameValue(previousLastIndex, +0𝔽) is false, then
    a. Perform ? Set(rx, "lastIndex", +0𝔽, true).
  6. Let result be ? RegExpExec(rx, S).
  7. Let currentLastIndex be ? Get(rx, "lastIndex").
  8. If SameValue(currentLastIndex, previousLastIndex) is false, then
    a. Perform ? Set(rx, "lastIndex", previousLastIndex, true).
  9. If result is null, return -1𝔽.
  10. Return ? Get(result, "index").
The value of the "name" property of this function is "[Symbol.search]".
The "lastIndex" and "global" properties of this RegExp object are ignored when performing the search. The "lastIndex" property is left unchanged.
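The save-and-restore of lastIndex in these steps means a search always starts at the beginning and leaves the receiver as it found it. A short sketch with illustrative names:

```javascript
const needle = /b/g;
needle.lastIndex = 3; // would normally make exec start past the end of "abc"

const position = "abc".search(needle);
const restored = needle.lastIndex;

const missing = "abc".search(/z/);
```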
22.2.5.12 get RegExp.prototype.source
RegExp.prototype.source is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps:
  1. Let R be the this value.
  2. If Type(R) is not Object, throw a TypeError exception.
  3. If R does not have an [[OriginalSource]] internal slot, then
    a. If SameValue(R, %RegExp.prototype%) is true, return "(?:)".
    b. Otherwise, throw a TypeError exception.
  4. Assert: R has an [[OriginalFlags]] internal slot.
  5. Let src be R.[[OriginalSource]].
  6. Let flags be R.[[OriginalFlags]].
  7. Return EscapeRegExpPattern(src, flags).
22.2.5.13 RegExp.prototype [ @@split ] ( string, limit )
Returns an Array object into which substrings of the result of converting string to a String have been stored. The substrings are determined by searching from left to right for matches of the
The /a*?/[Symbol.split]("ab")
evaluates to the array ["a", "b"]
, while /a*/[Symbol.split]("ab")
evaluates to the array ["","b"]
.)
If string is (or converts to) the empty String, the result depends on whether the regular expression can match the empty String. If it can, the result array contains no elements. Otherwise, the result array contains one element, which is the empty String.
If the regular expression contains capturing parentheses, then each time separator is matched the results (including any
/<(\/)?([^<>]+)>/[Symbol.split]("A<B>bold</B>and<CODE>coded</CODE>")
evaluates to the array
["A", undefined, "B", "bold", "/", "B", "and", undefined, "CODE", "coded", "/", "CODE", ""]
If limit is not
When the @@split
method is called, the following steps are taken:
 Let rx be the
this value.  If
Type (rx) is not Object, throw aTypeError exception.  Let S be ?
ToString (string).  Let C be ?
SpeciesConstructor (rx,%RegExp% ).  Let flags be ?
ToString (?Get (rx,"flags" )).  If flags contains
"u" , let unicodeMatching betrue .  Else, let unicodeMatching be
false .  If flags contains
"y" , let newFlags be flags.  Else, let newFlags be the
stringconcatenation of flags and"y" .  Let splitter be ?
Construct (C, « rx, newFlags »).  Let A be !
ArrayCreate (0).  Let lengthA be 0.
 If limit is
undefined , let lim be 2^{32}  1; else let lim beℝ (?ToUint32 (limit)).  If lim is 0, return A.
 Let size be the length of S.
 If size is 0, then
 Let z be ?
RegExpExec (splitter, S).  If z is not
null , return A.  Perform !
CreateDataPropertyOrThrow (A,"0" , S).  Return A.
 Let z be ?
 17. Let p be 0.
 18. Let q be p.
 19. Repeat, while q < size,
     a. Perform ? Set(splitter, "lastIndex", 𝔽(q), true).
     b. Let z be ? RegExpExec(splitter, S).
     c. If z is null, set q to AdvanceStringIndex(S, q, unicodeMatching).
     d. Else,
         i. Let e be ℝ(? ToLength(? Get(splitter, "lastIndex"))).
         ii. Set e to min(e, size).
         iii. If e = p, set q to AdvanceStringIndex(S, q, unicodeMatching).
         iv. Else,
             1. Let T be the substring of S from p to q.
             2. Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(lengthA)), T).
             3. Set lengthA to lengthA + 1.
             4. If lengthA = lim, return A.
             5. Set p to e.
             6. Let numberOfCaptures be ? LengthOfArrayLike(z).
             7. Set numberOfCaptures to max(numberOfCaptures - 1, 0).
             8. Let i be 1.
             9. Repeat, while i ≤ numberOfCaptures,
                 a. Let nextCapture be ? Get(z, ! ToString(𝔽(i))).
                 b. Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(lengthA)), nextCapture).
                 c. Set i to i + 1.
                 d. Set lengthA to lengthA + 1.
                 e. If lengthA = lim, return A.
             10. Set q to p.
 20. Let T be the substring of S from p to size.
 21. Perform ! CreateDataPropertyOrThrow(A, ! ToString(𝔽(lengthA)), T).
 22. Return A.
The value of the "name" property of this function is "[Symbol.split]".
The @@split method ignores the value of the "global" and "sticky" properties of this RegExp object.
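The splitting loop can be approximated in ordinary JavaScript. The following is a simplified sketch, not the normative algorithm: it assumes rx is a built-in RegExp, skips SpeciesConstructor, and uses plain array pushes in place of CreateDataPropertyOrThrow. The names `splitSketch` and `advanceStringIndex` are hypothetical helpers introduced for illustration.

```javascript
// Hypothetical helper approximating AdvanceStringIndex.
function advanceStringIndex(S, index, unicode) {
  if (!unicode || index + 1 >= S.length) return index + 1;
  return index + (S.codePointAt(index) > 0xffff ? 2 : 1);
}

// Simplified sketch of the split loop (assumes a built-in RegExp).
function splitSketch(rx, string, limit) {
  const S = String(string);
  const flags = String(rx.flags);
  const unicodeMatching = flags.includes("u");
  // A sticky copy only ever attempts a match exactly at lastIndex.
  const splitter = new RegExp(rx, flags.includes("y") ? flags : flags + "y");
  const A = [];
  const lim = limit === undefined ? 2 ** 32 - 1 : limit >>> 0;
  if (lim === 0) return A;
  const size = S.length;
  if (size === 0) {
    if (splitter.exec(S) !== null) return A;
    A.push(S);
    return A;
  }
  let p = 0; // start of the piece currently being accumulated
  let q = 0; // position at which the next match is attempted
  while (q < size) {
    splitter.lastIndex = q;
    const z = splitter.exec(S);
    const e = z === null ? p : Math.min(splitter.lastIndex, size);
    if (z === null || e === p) {
      // No match here, or an empty match at the piece start: move on.
      q = advanceStringIndex(S, q, unicodeMatching);
      continue;
    }
    A.push(S.slice(p, q));
    if (A.length === lim) return A;
    p = e;
    for (let i = 1; i < z.length; i++) {
      A.push(z[i]); // capture results are spliced into the output
      if (A.length === lim) return A;
    }
    q = p;
  }
  A.push(S.slice(p, size));
  return A;
}

const sketchResult = splitSketch(/a*/, "ab");
```

Forcing the "y" flag is what lets the loop probe one position at a time; a non-sticky matcher would scan ahead and hide the positions where no match begins.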
22.2.5.14 get RegExp.prototype.sticky
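RegExp.prototype.sticky is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps: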
 1. Let R be the this value.
 2. If Type(R) is not Object, throw a TypeError exception.
 3. If R does not have an [[OriginalFlags]] internal slot, then
     a. If SameValue(R, %RegExp.prototype%) is true, return undefined.
     b. Otherwise, throw a TypeError exception.
 4. Let flags be R.[[OriginalFlags]].
 5. If flags contains the code unit 0x0079 (LATIN SMALL LETTER Y), return true.
 6. Return false.
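Each branch of the getter steps can be exercised by calling the accessor function with different this values. A minimal sketch, with variable names chosen for illustration:

```javascript
const stickyDescriptor = Object.getOwnPropertyDescriptor(RegExp.prototype, "sticky");
const stickyGetter = stickyDescriptor.get;

const withFlag = stickyGetter.call(/foo/y);              // flags contain "y"
const withoutFlag = stickyGetter.call(/foo/g);           // flags do not contain "y"
const onPrototype = stickyGetter.call(RegExp.prototype); // no [[OriginalFlags]], special-cased

// Any other object without an [[OriginalFlags]] slot is rejected.
let threwTypeError = false;
try {
  stickyGetter.call({});
} catch (err) {
  threwTypeError = err instanceof TypeError;
}
```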
22.2.5.15 RegExp.prototype.test ( S )
The following steps are taken:
 1. Let R be the this value.
 2. If Type(R) is not Object, throw a TypeError exception.
 3. Let string be ? ToString(S).
 4. Let match be ? RegExpExec(R, string).
 5. If match is not null, return true; else return false.
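Because the argument passes through ToString and the match goes through RegExpExec, test coerces non-String arguments and, for a global RegExp, shares "lastIndex" state with exec. A short sketch of both effects (variable names are illustrative):

```javascript
// The argument is converted with ToString before matching.
const coercedNumber = /^\d+$/.test(2024);  // matches the String "2024"
const coercedNull = /^null$/.test(null);   // matches the String "null"

// With the g flag, each call resumes from lastIndex.
const globalRe = /a/g;
const firstCall = globalRe.test("a");   // match at index 0, lastIndex becomes 1
const secondCall = globalRe.test("a");  // no match from index 1, lastIndex resets to 0
const indexAfter = globalRe.lastIndex;
```

The second call failing on the same input is a common source of confusion when a global RegExp is reused across calls.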
22.2.5.16 RegExp.prototype.toString ( )
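The returned String has the form of a RegularExpressionLiteral that evaluates to another RegExp object with the same behaviour as this object.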
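A brief illustration of the literal-like form of the result. The escaped and empty-pattern results shown in the comments are what common engines such as V8 produce; the exact escaping of the source text is an assumption about the host rather than something stated in this section.

```javascript
const literalForm = /a+b/gi.toString();            // "/a+b/gi"

// A "/" in the pattern is escaped so the result still reads as a literal.
const escapedSlash = new RegExp("a/b").toString(); // typically "/a\/b/"

// An empty pattern is rendered so that the result is not read as a comment.
const emptyPattern = new RegExp("").toString();    // typically "/(?:)/"
```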
22.2.5.17 get RegExp.prototype.unicode
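RegExp.prototype.unicode is an accessor property whose set accessor function is undefined. Its get accessor function performs the following steps: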
 1. Let R be the this value.
 2. If Type(R) is not Object, throw a TypeError exception.
 3. If R does not have an [[OriginalFlags]] internal slot, then
     a. If SameValue(R, %RegExp.prototype%) is true, return undefined.
     b. Otherwise, throw a TypeError exception.
 4. Let flags be R.[[OriginalFlags]].
 5. If flags contains the code unit 0x0075 (LATIN SMALL LETTER U), return true.
 6. Return false.
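Since the getter reads [[OriginalFlags]], its result reflects the flags the object was constructed with. A small sketch, assuming the ES2015+ behaviour in which a flags argument passed alongside a RegExp pattern replaces the original flags:

```javascript
const original = /a/u;
const reflagged = new RegExp(original, "g"); // same pattern, flags replaced by "g"

const originalUnicode = original.unicode;    // "u" was supplied at construction
const reflaggedUnicode = reflagged.unicode;  // "u" was not carried over

// There is no setter, so the property cannot be changed by assignment.
const unicodeDescriptor = Object.getOwnPropertyDescriptor(RegExp.prototype, "unicode");
const hasSetter = unicodeDescriptor.set !== undefined;
```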
22.2.6 Properties of RegExp Instances
RegExp instances are ordinary objects that inherit properties from the RegExp prototype object.
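Prior to ECMAScript 2015, RegExp instances were specified as having the own data properties "source", "global", "ignoreCase", and "multiline". Those properties are now specified as accessor properties of RegExp.prototype.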
RegExp instances also have the following property:
22.2.6.1 lastIndex
The value of the "lastIndex" property specifies the String index at which to start the next match.
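The role of "lastIndex" is easiest to see with a global RegExp whose successive exec calls walk through a String. An illustrative sketch; the variable names are not part of the specification:

```javascript
const re = /a/g;
const positions = [];
let m;
while ((m = re.exec("banana")) !== null) {
  // Record where the match started and where the next search will begin.
  positions.push([m.index, re.lastIndex]);
}
const afterFailure = re.lastIndex; // a failed match resets lastIndex

// Unlike the flag accessors, lastIndex is an own data property of the instance.
const lastIndexDescriptor = Object.getOwnPropertyDescriptor(re, "lastIndex");
```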
22.2.7 RegExp String Iterator Objects
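A RegExp String Iterator is an object that represents a specific iteration over some specific String instance object, matching against some specific RegExp instance object. There is not a named constructor for RegExp String Iterator objects. Instead, RegExp String Iterator objects are created by calling certain methods of RegExp instance objects.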
22.2.7.1 CreateRegExpStringIterator ( R, S, global, fullUnicode )
The abstract operation CreateRegExpStringIterator takes arguments R, S, global, and fullUnicode. It performs the following steps when called:
 1. Assert: Type(S) is String.
 2. Assert: Type(global) is Boolean.
 3. Assert: Type(fullUnicode) is Boolean.
 4. Let closure be a new Abstract Closure with no parameters that captures R, S, global, and fullUnicode and performs the following steps when called:
     a. Repeat,
         i. Let match be ? RegExpExec(R, S).
         ii. If match is null, return undefined.
         iii. If global is false, then
             1. Perform ? Yield(match).
             2. Return undefined.
         iv. Let matchStr be ? ToString(? Get(match, "0")).
         v. If matchStr is the empty String, then
             1. Let thisIndex be ℝ(? ToLength(? Get(R, "lastIndex"))).
             2. Let nextIndex be ! AdvanceStringIndex(S, thisIndex, fullUnicode).
             3. Perform ? Set(R, "lastIndex", 𝔽(nextIndex), true).
         vi. Perform ? Yield(match).
 5. Return ! CreateIteratorFromClosure(closure, "%RegExpStringIteratorPrototype%", %RegExpStringIteratorPrototype%).
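The closure behaves much like a generator function, which gives a convenient way to sketch it. The version below is an approximation under stated assumptions: R is a built-in RegExp, so `R.exec` stands in for RegExpExec, and after an empty match "lastIndex" is moved forward so that iteration cannot stall (the behaviour observable through String.prototype.matchAll). The names `regExpStringIteratorSketch` and `advanceIndex` are hypothetical.

```javascript
// Hypothetical helper approximating AdvanceStringIndex.
function advanceIndex(S, index, fullUnicode) {
  if (!fullUnicode || index + 1 >= S.length) return index + 1;
  return index + (S.codePointAt(index) > 0xffff ? 2 : 1);
}

// Generator sketch of the closure; assumes R is a built-in RegExp.
function* regExpStringIteratorSketch(R, S, global, fullUnicode) {
  while (true) {
    const match = R.exec(S);
    if (match === null) return undefined;
    if (!global) {
      // A non-global iteration produces at most one match.
      yield match;
      return undefined;
    }
    if (String(match[0]) === "") {
      // Step past an empty match so the next attempt makes progress.
      R.lastIndex = advanceIndex(S, R.lastIndex, fullUnicode);
    }
    yield match;
  }
}

const sketchIndices = [...regExpStringIteratorSketch(/x*/g, "abc", true, false)]
  .map((match) => match.index);
const nativeIndices = [..."abc".matchAll(/x*/g)].map((match) => match.index);
```

Both arrays list one empty match per position, including the position at the end of the String.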
22.2.7.2 The %RegExpStringIteratorPrototype% Object
The %RegExpStringIteratorPrototype% object:
 has properties that are inherited by all RegExp String Iterator Objects.
 is an ordinary object.
 has a [[Prototype]] internal slot whose value is %IteratorPrototype%.
 has the following properties:
22.2.7.2.1 %RegExpStringIteratorPrototype%.next ( )
 1. Return ? GeneratorResume(this value, empty, "%RegExpStringIteratorPrototype%").
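The prototype object is not reachable by name, but it can be inspected through an iterator returned by String.prototype.matchAll. A minimal sketch, with illustrative variable names:

```javascript
const iterator = "a1b2".matchAll(/\d/g);
const iteratorProto = Object.getPrototypeOf(iterator);

const firstResult = iterator.next();  // value is the match for "1"
const secondResult = iterator.next(); // value is the match for "2"
const finalResult = iterator.next();  // iteration is complete

// The [[Prototype]] is the same %IteratorPrototype% that array iterators inherit from.
const arrayIteratorProto = Object.getPrototypeOf([][Symbol.iterator]());
const sharesIteratorPrototype =
  Object.getPrototypeOf(iteratorProto) === Object.getPrototypeOf(arrayIteratorProto);

// next validates its receiver, so a plain object is rejected.
let rejectedForeignReceiver = false;
try {
  iteratorProto.next.call({});
} catch (err) {
  rejectedForeignReceiver = err instanceof TypeError;
}
```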
22.2.7.2.2 %RegExpStringIteratorPrototype% [ @@toStringTag ]
The initial value of the @@toStringTag property is the String value "RegExp String Iterator".
This property has the attributes { [[Writable]]: