How to Use Backreferences in JavaScript Regular Expressions
Regular expressions become truly powerful when patterns can reference their own matches. Backreferences let you refer back to text that was already captured by a group earlier in the same pattern. This means you can match repeated words, paired delimiters like quotes, and symmetrical structures that would be impossible to express with basic regex syntax alone.
In this guide, you will learn how numbered and named backreferences work in JavaScript, how the regex engine resolves them, and where they solve real problems that no other regex feature can handle.
What Are Backreferences?
A backreference is a way to match the exact same text that was previously captured by a group in the same regular expression.
When you write a capturing group like (abc), the regex engine stores the matched text. A backreference such as \1 tells the engine: "match the same text that group 1 already captured."
This is fundamentally different from simply repeating the group pattern. Consider this distinction:
// This matches any digit followed by any digit (not necessarily the same)
let pattern1 = /(\d)\d/;
console.log(pattern1.test("12")); // true ("1" then "2")
console.log(pattern1.test("33")); // true ("3" then "3")
// This matches a digit followed by THE SAME digit
let pattern2 = /(\d)\1/;
console.log(pattern2.test("12")); // false ("1" then "2" (not the same))
console.log(pattern2.test("33")); // true ("3" then "3" (same digit))
The group (\d) captures a digit, and \1 insists that the next character must be exactly what was captured. Not just any digit, but the same one.
Numbered Backreferences: \1, \2, and Beyond
Every set of parentheses () in a regular expression creates a capturing group that is automatically assigned a number. The numbering starts at 1 and increases from left to right, based on the position of the opening parenthesis.
Basic Syntax
let pattern = /(group1)(group2)(group3)/;
// \1 refers to whatever (group1) matched
// \2 refers to whatever (group2) matched
// \3 refers to whatever (group3) matched
How Numbering Works
Groups are numbered by the position of their opening parenthesis, reading left to right:
let pattern = /((a)(b))(c)/;
// Group 1: ((a)(b))  (its opening parenthesis comes first)
// Group 2: (a)
// Group 3: (b)
// Group 4: (c)
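One way to confirm the numbering is to inspect the array that match() returns. The following is a minimal sketch; the variable names nestedPattern and nestedMatch are illustrative and not part of the original examples.

```javascript
// Sketch: index 0 of the match array is the full match, and
// indexes 1..4 hold the groups in order of their opening parentheses.
const nestedPattern = /((a)(b))(c)/;
const nestedMatch = "abc".match(nestedPattern);

console.log(nestedMatch[0]); // "abc" (full match)
console.log(nestedMatch[1]); // "ab"  (group 1, the outer group)
console.log(nestedMatch[2]); // "a"   (group 2)
console.log(nestedMatch[3]); // "b"   (group 3)
console.log(nestedMatch[4]); // "c"   (group 4)
```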
Matching Repeated Characters
A classic use of \1 is finding repeated characters:
let doubleChar = /(.)\1/;
console.log(doubleChar.test("book")); // true ("oo")
console.log(doubleChar.test("letter")); // true ("tt")
console.log(doubleChar.test("abc")); // false (no repeated chars)
Output:
true
true
false
With the g flag, match() returns every doubled pair in a string:
let doubleChar = /(.)\1/g;
let text = "The bookkeeper shuffled the Mississippi letters.";
console.log(text.match(doubleChar));
Output:
[ 'oo', 'kk', 'ee', 'ff', 'ss', 'ss', 'pp', 'tt' ]
Multiple Backreferences
You can use several backreferences in a single pattern. Each one refers back to its corresponding group:
// Match a pattern like "ABAB" where A and B are single characters
let pattern = /(.)(.)\1\2/;
console.log(pattern.test("abab")); // true (a, b, then a, b again)
console.log(pattern.test("xyxy")); // true (x, y, then x, y again)
console.log(pattern.test("abcd")); // false (no repetition)
console.log(pattern.test("abba")); // false (a,b then b,a (wrong order))
Output:
true
true
false
false
Notice how "abba" does not match because \1 demands the same text as group 1 (a), but position 3 is b. Backreferences are strict and order-sensitive.
Matching Palindrome-Like Patterns
You can match simple symmetrical structures:
// Match a 5-character palindrome pattern: ABCBA
let palindrome5 = /(.)(.)(.)\2\1/;
console.log(palindrome5.test("abcba")); // true
console.log(palindrome5.test("radar")); // true (r, a, d, a, r)
console.log(palindrome5.test("hello")); // false
Output:
true
true
false
Here, group 1 captures the first character, group 2 the second, group 3 the third (the center). Then \2 matches the second character again, and \1 matches the first.
Backreferences to Unmatched Groups
If a backreference refers to a group that has not participated in the match (the group exists but did not capture anything), the backreference matches the empty string:
// Group 2 is optional via ?
let pattern = /(a)(b)?\2/;
// When "b" is present: group 2 captures "b", \2 expects "b"
console.log(pattern.test("abb")); // true (a, b, b)
// When "b" is absent: group 2 captures nothing, \2 matches empty string
console.log(pattern.test("a")); // true (a, then empty, then empty)
Output:
true
true
Be careful with optional groups and backreferences. When a group does not participate in the match, a backreference to it matches the empty string rather than failing. This can produce unexpected matches.
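The same behavior can be used deliberately. The sketch below assumes a hypothetical pattern, optionalQuote, for a word that may be bare or wrapped in matching quotes. It is one possible illustration of the rule, not a recommended validator.

```javascript
// Hypothetical pattern: a word that may be wrapped in matching quotes.
// When the optional group (["'])? does not participate, \1 matches the
// empty string, so a bare word is accepted as well.
const optionalQuote = /^(["'])?\w+\1$/;

console.log(optionalQuote.test('hello'));    // true  (no quotes, \1 matches "")
console.log(optionalQuote.test('"hello"'));  // true  (double quotes on both sides)
console.log(optionalQuote.test("'hello'"));  // true  (single quotes on both sides)
console.log(optionalQuote.test('"hello\'')); // false (mismatched quotes)
console.log(optionalQuote.test('"hello'));   // false (quote never closed)
```

If an unquoted value should be rejected, drop the ? so that the group must always participate.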
Named Backreferences: \k<name>
JavaScript supports named capturing groups with the syntax (?<name>...). To reference these groups later in the same pattern, you use \k<name>.
Basic Syntax
let pattern = /(?<char>.)\k<char>/;
console.log(pattern.test("aa")); // true
console.log(pattern.test("bb")); // true
console.log(pattern.test("ab")); // false
Output:
true
true
false
This does exactly the same thing as (.)\1, but the intent is much clearer. You are explicitly saying "match the same text that the group named char captured."
Why Named Backreferences Are Better
Named backreferences improve readability and maintainability significantly, especially in complex patterns:
// Numbered
let numbered = /^(\w+):\/\/(\w+)\.(\w+)\.(\w+)\/\2/;
// Named (immediately clear)
let named = /^(?<protocol>\w+):\/\/(?<subdomain>\w+)\.(?<domain>\w+)\.(?<tld>\w+)\/\k<subdomain>/;
When you add or remove groups, numbered backreferences can break because the numbers shift. Named backreferences remain stable regardless of pattern changes.
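The following sketch shows how such a shift can go unnoticed. The patterns and the Mr/Ms title scenario are hypothetical and exist only to illustrate the point.

```javascript
// Version 1: \1 refers to the only group, the name.
const v1 = /^(\w+) \1$/;
console.log(v1.test("Smith Smith")); // true

// Version 2: a title group is added in front. The name is now group 2,
// but the backreference still says \1, so it now refers to the title.
const v2Numbered = /^(Mr|Ms) (\w+) \1$/;
console.log(v2Numbered.test("Mr Smith Smith")); // false (\1 expects "Mr")
console.log(v2Numbered.test("Mr Smith Mr"));    // true  (not what was intended)

// The named version keeps pointing at the name after the same change.
const v2Named = /^(?<title>Mr|Ms) (?<name>\w+) \k<name>$/;
console.log(v2Named.test("Mr Smith Smith")); // true
```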
Mixing Named and Numbered References
Named groups still receive a number. You can reference them either way, though it is best to stay consistent:
let pattern = /(?<word>\w+)\s+\1/; // \1 refers to the named group by number
let pattern2 = /(?<word>\w+)\s+\k<word>/; // Same, but explicit
let text = "hello hello world";
console.log(text.match(pattern));
console.log(text.match(pattern2));
Output:
[ 'hello hello', 'hello', index: 0, ... groups: { word: 'hello' } ]
[ 'hello hello', 'hello', index: 0, ... groups: { word: 'hello' } ]
Prefer \k<name> when using named groups. Mixing \1 with named groups works, but it reduces the clarity benefit that named groups provide.
Use Case: Matching Repeated Words
One of the most practical uses of backreferences is detecting accidentally repeated words in text, a common typo in writing.
Simple Repeated Word Detection
let repeatedWord = /\b(\w+)\s+\1\b/gi;
let text = "This is is a test test of the the system.";
let matches = text.match(repeatedWord);
console.log(matches);
Output:
[ 'is is', 'test test', 'the the' ]
Let's break down the pattern:
| Part | Meaning |
|---|---|
| \b | Word boundary (start of a word) |
| (\w+) | Capture one or more word characters (the first word) |
| \s+ | One or more whitespace characters between words |
| \1 | The exact same text captured by group 1 |
| \b | Word boundary (end of the second word) |
| gi | Global search, case-insensitive |
Why Word Boundaries Matter
Without \b, you would get false matches:
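// Without word boundaries, problematic
let bad = /(\w+)\s+\1/gi;
let text = "The therapist is here.";
console.log(text.match(bad));
Output:
[ 'The the' ]
The pattern reports "The the" because (\w+) captures "The" and, with the i flag, \1 accepts "the" at the start of "therapist". The same problem appears here: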
let bad = /(\w+)\s+\1/gi;
let text = "for fortune";
console.log(text.match(bad));
Output:
[ 'for for' ]
It matches "for for" inside "for fortune" because (\w+) captures "for" and \1 finds "for" at the start of "fortune". Adding \b prevents this:
let good = /\b(\w+)\s+\1\b/gi;
let text = "for fortune";
console.log(text.match(good));
Output:
null
Now the match is correctly rejected: the character after the second "for" is "t", so the trailing \b finds no word boundary.
Named Version for Clarity
let repeatedWord = /\b(?<word>\w+)\s+\k<word>\b/gi;
let text = "I went went to to the store store yesterday.";
let match;
while ((match = repeatedWord.exec(text)) !== null) {
  console.log(`Found "${match[0]}" - repeated word: "${match.groups.word}"`);
}
Output:
Found "went went" - repeated word: "went"
Found "to to" - repeated word: "to"
Found "store store" - repeated word: "store"
Removing Duplicate Words with replace()
You can fix repeated words by combining the pattern with replace(). Note that in the replacement string, you use $1 (not \1) to reference captured groups:
let repeatedWord = /\b(\w+)\s+\1\b/gi;
let text = "This is is a test test of the the system.";
let fixed = text.replace(repeatedWord, "$1");
console.log(fixed);
Output:
This is a test of the system.
Inside the regex pattern, you use \1 for backreferences. Inside the replacement string of replace(), you use $1. These are different syntaxes for different contexts.
Use Case: Matching Paired Delimiters
Backreferences are essential when you need to match opening and closing delimiters that must be the same character.
Matching Quoted Strings
A common challenge is matching strings enclosed in either single or double quotes, where the closing quote must match the opening one:
// Wrong approach - matches mismatched quotes
let bad = /["'][^"']*["']/g;
let text = `He said "hello' and she said 'world"`;
console.log(text.match(bad));
Output:
[ `"hello'`, `'world"` ]
This incorrectly matches "hello' where the opening double quote is closed by a single quote. Backreferences solve this:
// Correct approach - opening quote must match closing quote
let good = /(["'])[^"']*\1/g;
let text = `He said "hello" and she said 'world'`;
console.log(text.match(good));
Output:
[ '"hello"', "'world'" ]
The group (["']) captures whichever quote character opens the string, and \1 ensures the same quote character closes it.
Named Version for Quoted Strings
let quoted = /(?<quote>["'])(?<content>[^"']*)\k<quote>/g;
let text = `title="JavaScript" lang='en' broken="oops'`;
let match;
while ((match = quoted.exec(text)) !== null) {
  console.log(`Quote: ${match.groups.quote}, Content: "${match.groups.content}"`);
}
Output:
Quote: ", Content: "JavaScript"
Quote: ', Content: "en"
The broken="oops' part is correctly ignored because the quotes do not match.
Handling Escaped Quotes Inside Strings
A more realistic pattern allows escaped quotes within the string:
// Match quoted strings that may contain escaped quotes
let quoted = /(["'])(?:\\.|(?!\1).)*\1/g;
let text = `She said "He\\'s \\"great\\"" and 'it\\'s fine'`;
console.log(text.match(quoted));
Let's break this pattern down:
| Part | Meaning |
|---|---|
| (["']) | Capture the opening quote (group 1) |
| (?:...)* | Non-capturing group, repeated |
| \\. | Match any escaped character (backslash + anything) |
| \| | OR |
| (?!\1). | Any character that is NOT the opening quote (negative lookahead) |
| \1 | The matching closing quote |
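To see what the pattern returns, here is a minimal sketch that runs it on the same sentence. It uses String.raw so that the backslashes in the sample appear exactly as written. The names escapedAware, sample, and found are illustrative.

```javascript
// Same pattern as above: an escaped character, or any character that is
// not the opening quote, repeated until the matching closing quote.
const escapedAware = /(["'])(?:\\.|(?!\1).)*\1/g;

// String.raw keeps every backslash literal, so the text contains \' and \".
const sample = String.raw`She said "He\'s \"great\"" and 'it\'s fine'`;

const found = sample.match(escapedAware);
console.log(found.length); // 2
console.log(found[0]);     // "He\'s \"great\""
console.log(found[1]);     // 'it\'s fine'
```

Because . does not match line terminators without the s flag, this sketch assumes each quoted string sits on a single line.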
Matching Paired HTML Tags
Backreferences can match opening and closing HTML tags with the same tag name:
let pairedTag = /<(\w+)(?:\s[^>]*)?>[\s\S]*?<\/\1>/g;
let html = `
<div>Hello</div>
<span>World</span>
<p>Text</p>
<div>Mismatched</span>
`;
console.log(html.match(pairedTag));
Output:
[ '<div>Hello</div>', '<span>World</span>', '<p>Text</p>' ]
The <div>Mismatched</span> is correctly excluded because \1 captured "div" but the closing tag says "span".
While backreferences can match simple paired HTML tags, never use regex to parse full HTML documents. HTML is not a regular language, and nested or malformed tags will break any regex approach. Use a proper DOM parser instead.
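A short sketch makes the nesting problem concrete. It reuses the pattern from above under the illustrative name pairedTagDemo and feeds it a div nested inside another div.

```javascript
// Same pattern as the paired-tag example above.
const pairedTagDemo = /<(\w+)(?:\s[^>]*)?>[\s\S]*?<\/\1>/g;

// A <div> nested inside another <div>.
const nested = "<div>outer <div>inner</div> tail</div>";

const nestedResult = nested.match(pairedTagDemo);
console.log(nestedResult);
// [ '<div>outer <div>inner</div>' ]
// The lazy quantifier stops at the first </div>, so the outer element
// is cut short and its real closing tag is left unmatched.
```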
Matching Repeated Patterns in Data
Backreferences work well for detecting symmetry or repetition in structured data:
// Match coordinates where x equals y: (n, n)
let sameCoord = /\((\d+),\s*\1\)/g;
let data = "(3, 3) (4, 5) (7, 7) (10, 10) (2, 8)";
console.log(data.match(sameCoord));
Output:
[ '(3, 3)', '(7, 7)', '(10, 10)' ]
Backreferences with matchAll() and exec()
When you need full details about each match, matchAll() or exec() give you access to all captured groups and their backreference values:
let pattern = /\b(?<word>\w+)\s+\k<word>\b/gi;
let text = "The the quick brown fox fox jumped jumped high.";
for (let match of text.matchAll(pattern)) {
  console.log({
    fullMatch: match[0],
    capturedWord: match[1],
    namedGroup: match.groups.word,
    index: match.index
  });
}
Output:
{ fullMatch: 'The the', capturedWord: 'The', namedGroup: 'The', index: 0 }
{ fullMatch: 'fox fox', capturedWord: 'fox', namedGroup: 'fox', index: 20 }
{ fullMatch: 'jumped jumped', capturedWord: 'jumped', namedGroup: 'jumped', index: 28 }
Backreferences vs. Replacement References
It is important to understand that backreferences inside a pattern (\1, \k<name>) and references inside a replacement string ($1, $<name>) are related but different mechanisms:
| Context | Numbered | Named |
|---|---|---|
| Inside the regex pattern | \1, \2 | \k<name> |
| Inside replace() replacement string | $1, $2 | $<name> |
| Inside replace() callback function | args[1], args[2] | args[args.length-1].name |
let pattern = /(?<first>\w)(?<second>\w)/;
let text = "ab";
// In pattern: \k<first> or \1
// In replacement: $<first> or $1
console.log(text.replace(pattern, "$<second>$<first>"));
Output:
ba
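The callback row of the table can be sketched as follows. The names swapPattern and swapped are illustrative. The parameter list shown assumes a pattern with exactly two capturing groups, at least one of them named.

```javascript
const swapPattern = /(?<first>\w)(?<second>\w)/g;

// Callback parameters: the full match, one argument per numbered group,
// the match offset, the whole input string, and (because the pattern
// has named groups) a groups object as the last argument.
const swapped = "abcd".replace(
  swapPattern,
  (match, g1, g2, offset, input, groups) => groups.second + groups.first
);

console.log(swapped); // "badc"
```

Here g1 and groups.first hold the same text, so either style works inside the callback.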
Common Mistakes with Backreferences
Mistake 1: Using $1 Inside the Pattern
// WRONG: $1 is for replacement strings, not patterns
let bad = /(\w+)\s+$1/;
console.log(bad.test("hello hello"));
Output:
false
Inside a regex pattern, $1 is not a backreference. The $ is the end-of-input anchor and the 1 is a literal character that would have to follow it, so this pattern can never match. Use \1 instead:
// CORRECT
let good = /(\w+)\s+\1/;
console.log(good.test("hello hello"));
Output:
true
Mistake 2: Backreference Before the Group
A backreference should appear after the group it refers to. In JavaScript, a backreference to a group that has not captured anything yet matches the empty string, so the pattern still runs but does not do what you intended:
// The group comes AFTER the backreference (won't work as intended)
let pattern = /\1(\w+)/;
console.log(pattern.test("hello")); // true, but \1 matched empty string
Always place the capturing group before the backreference that refers to it.
Mistake 3: Forgetting That Backreferences Are Case-Sensitive
Backreferences match the exact captured text, including case:
let pattern = /\b(\w+)\s+\1\b/g; // No "i" flag
console.log("The the".match(pattern)); // null ("The" ≠ "the")
Output:
null
Add the i flag if you want case-insensitive matching:
let pattern = /\b(\w+)\s+\1\b/gi; // With "i" flag
console.log("The the".match(pattern));
Output:
[ 'The the' ]
Mistake 4: Non-Capturing Groups Do Not Create Backreferences
Non-capturing groups (?:...) are not assigned a number and cannot be backreferenced:
// (?:\w+) is non-capturing (\1 refers to the next actual capturing group)
let pattern = /(?:\w+)\s+(\w+)\s+\1/;
let text = "one two two";
console.log(pattern.test(text)); // true (\1 refers to group 1 which is (\w+) = "two")
Output:
true
If you expected \1 to reference the first set of parentheses (the non-capturing group), it does not. Only () creates numbered groups, not (?:).
Performance Considerations
Backreferences can cause the regex engine to do extra work because it must store and compare captured text. A few guidelines:
- Simple backreferences like (.)\1 for repeated characters are efficient
- Long captured text with backreferences works fine but uses more memory
- Nested quantifiers with backreferences can potentially trigger excessive backtracking in edge cases
- When the same logic can be expressed without a backreference, for example with a character class, prefer the simpler construct
// Using a backreference (the right tool for "the same character twice")
let doubled = /(.)\1/g;
// There is no equivalent without a backreference: a character class such as
// /[a-z]{2}/ matches any two letters, not the same letter twice
For repeated word detection or paired delimiter matching, backreferences are the correct and efficient tool. There is no simpler alternative.
Summary
| Feature | Syntax | Example |
|---|---|---|
| Numbered backreference | \1, \2, etc. | (.)\1 matches "aa", "bb" |
| Named backreference | \k<name> | (?<ch>.)\k<ch> matches "aa" |
| In replacement string | $1, $<name> | "aa".replace(/(.)(\1)/, "$1") gives "a" |
| Unmatched group backreference | \1 | Matches empty string if group did not participate |
Key takeaways:
- Backreferences match the exact same text previously captured, not the same pattern
- Groups are numbered by the position of their opening parenthesis from left to right
- Named backreferences (\k<name>) are more readable and resilient to pattern changes
- The most common real-world uses are repeated word detection and paired delimiter matching
- Inside patterns use \1, inside replacement strings use $1
- Non-capturing groups (?:...) do not receive numbers and cannot be backreferenced
Backreferences bridge the gap between what regular languages can express and the practical text-matching problems developers face daily. Combined with named groups, they make complex patterns both powerful and readable.