How to Use Backreferences in JavaScript Regular Expressions

Regular expressions become truly powerful when patterns can reference their own matches. Backreferences let you refer back to text that was already captured by a group earlier in the same pattern. This means you can match repeated words, paired delimiters like quotes, and symmetrical structures that would be impossible to express with basic regex syntax alone.

In this guide, you will learn how numbered and named backreferences work in JavaScript, how the regex engine resolves them, and where they solve real problems that no other regex feature can handle.

What Are Backreferences?

A backreference is a way to match the exact same text that was previously captured by a group in the same regular expression.

When you write a capturing group like (abc), the regex engine stores the matched text. A backreference such as \1 tells the engine: "match the same text that group 1 already captured."

This is fundamentally different from simply repeating the group pattern. Consider this distinction:

// This matches any digit followed by any digit (not necessarily the same)
let pattern1 = /(\d)\d/;
console.log(pattern1.test("12")); // true ("1" then "2")
console.log(pattern1.test("33")); // true ("3" then "3")

// This matches a digit followed by THE SAME digit
let pattern2 = /(\d)\1/;
console.log(pattern2.test("12")); // false ("1" then "2" (not the same))
console.log(pattern2.test("33")); // true ("3" then "3" (same digit))

The group (\d) captures a digit, and \1 insists that the next character must be exactly what was captured. Not just any digit, but the same one.

Numbered Backreferences: \1, \2, and Beyond

Every set of parentheses () in a regular expression creates a capturing group that is automatically assigned a number. The numbering starts at 1 and increases from left to right, based on the position of the opening parenthesis.

Basic Syntax

let pattern = /(group1)(group2)(group3)/;
// \1 refers to whatever (group1) matched
// \2 refers to whatever (group2) matched
// \3 refers to whatever (group3) matched

How Numbering Works

Groups are numbered by the position of their opening parenthesis, reading left to right:

let pattern = /((a)(b))(c)/;
// Group 1: ((a)(b))  (its opening parenthesis comes first)
// Group 2: (a)
// Group 3: (b)
// Group 4: (c)
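
To check the numbering in practice, the following sketch runs the nested pattern against a sample string and prints each captured group by index. The input "abc" and the variable names are only illustrative.

```javascript
// Each index in the match array corresponds to one group number.
const nested = /((a)(b))(c)/;
const numberedGroups = "abc".match(nested);

console.log(numberedGroups[0]); // "abc" (the full match)
console.log(numberedGroups[1]); // "ab"  (group 1: the outermost parentheses open first)
console.log(numberedGroups[2]); // "a"   (group 2)
console.log(numberedGroups[3]); // "b"   (group 3)
console.log(numberedGroups[4]); // "c"   (group 4)
```

The outer group gets number 1 even though it closes after groups 2 and 3, because only the position of the opening parenthesis counts.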

Matching Repeated Characters

A classic use of \1 is finding repeated characters:

let doubleChar = /(.)\1/;

console.log(doubleChar.test("book")); // true ("oo")
console.log(doubleChar.test("letter")); // true ("tt")
console.log(doubleChar.test("abc")); // false (no repeated chars)

Output:

true
true
false

Add the g flag to extract every doubled character in a string:

let doubleChar = /(.)\1/g;
let text = "The bookkeeper shuffled the Mississippi letters.";

console.log(text.match(doubleChar));

Output:

[ 'oo', 'kk', 'ee', 'ff', 'ss', 'ss', 'pp', 'tt' ]

Multiple Backreferences

You can use several backreferences in a single pattern. Each one refers back to its corresponding group:

// Match a pattern like "ABAB" where A and B are single characters
let pattern = /(.)(.)\1\2/;

console.log(pattern.test("abab")); // true (a, b, then a, b again)
console.log(pattern.test("xyxy")); // true (x, y, then x, y again)
console.log(pattern.test("abcd")); // false (no repetition)
console.log(pattern.test("abba")); // false (a,b then b,a (wrong order))

Output:

true
true
false
false

Notice how "abba" does not match because \1 demands the same text as group 1 (a), but position 3 is b. Backreferences are strict and order-sensitive.

Matching Palindrome-Like Patterns

You can match simple symmetrical structures:

// Match a 5-character palindrome pattern: ABCBA
let palindrome5 = /(.)(.)(.)\2\1/;

console.log(palindrome5.test("abcba")); // true
console.log(palindrome5.test("radar")); // true (r, a, d, a, r)
console.log(palindrome5.test("hello")); // false

Output:

true
true
false

Here, group 1 captures the first character, group 2 the second, group 3 the third (the center). Then \2 matches the second character again, and \1 matches the first.

Backreferences to Unmatched Groups

If a backreference refers to a group that has not participated in the match (the group exists but did not capture anything), the backreference matches the empty string:

// Group 2 is optional via ?
let pattern = /(a)(b)?\2/;

// When "b" is present: group 2 captures "b", \2 expects "b"
console.log(pattern.test("abb")); // true (a, b, b)

// When "b" is absent: group 2 captures nothing, \2 matches empty string
console.log(pattern.test("a")); // true (a, then empty, then empty)

Output:

true
true

Caution: Be careful with optional groups and backreferences. When a group does not participate in the match, its backreference matches the empty string rather than failing. This can produce unexpected matches.
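
The difference between a group that captured text and one that did not participate is visible in the match result. The sketch below uses a hypothetical anchored pattern (not from the examples above) to make both cases easy to compare.

```javascript
// Hypothetical pattern for illustration: an optional "b" group, then its backreference, then "c".
const optionalGroup = /^(a)(b)?\2c$/;

const withB = optionalGroup.exec("abbc");
const withoutB = optionalGroup.exec("ac");

console.log(withB[2]);    // "b" (group 2 participated, so \2 had to match "b")
console.log(withoutB[2]); // undefined (group 2 did not participate, so \2 matched "")

// "abc" fails: if group 2 captures "b", \2 needs a second "b" before "c";
// if group 2 is skipped, "c" cannot match the "b".
console.log(optionalGroup.test("abc")); // false
```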

Named Backreferences: \k<name>

JavaScript supports named capturing groups with the syntax (?<name>...). To reference these groups later in the same pattern, you use \k<name>.

Basic Syntax

let pattern = /(?<char>.)\k<char>/;

console.log(pattern.test("aa")); // true
console.log(pattern.test("bb")); // true
console.log(pattern.test("ab")); // false

Output:

true
true
false

This does exactly the same thing as (.)\1, but the intent is much clearer. You are explicitly saying "match the same text that the group named char captured."

Why Named Backreferences Are Better

Named backreferences improve readability and maintainability significantly, especially in complex patterns:

// Numbered
let numbered = /^(\w+):\/\/(\w+)\.(\w+)\.(\w+)\/\2/;

// Named (immediately clear)
let named = /^(?<protocol>\w+):\/\/(?<subdomain>\w+)\.(?<domain>\w+)\.(?<tld>\w+)\/\k<subdomain>/;

When you add or remove groups, numbered backreferences can break because the numbers shift. Named backreferences remain stable regardless of pattern changes.
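
A small sketch of how that breakage can happen. The optional-quote group added in front is a hypothetical change, chosen only to show the renumbering effect.

```javascript
// Original pattern: \1 points at the word group.
const original = /\b(\w+)\s+\1\b/;

// A new optional-quote group is added in front. The word group is now group 2,
// so \1 silently points at the quote group instead.
const shifted = /(["']?)\b(\w+)\s+\1\b/;

// The named version keeps pointing at the word group after the same change.
const named = /(?<quote>["']?)\b(?<word>\w+)\s+\k<word>\b/;

console.log(original.test("hello world")); // false
console.log(shifted.test("hello world"));  // true  (\1 matches the empty quote capture)
console.log(named.test("hello world"));    // false
console.log(named.test("hello hello"));    // true
```

The shifted pattern does not throw an error. It simply starts matching the wrong thing, which is why this kind of bug is easy to miss.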

Mixing Named and Numbered References

Named groups still receive a number. You can reference them either way, though it is best to stay consistent:

let pattern = /(?<word>\w+)\s+\1/;        // \1 refers to the named group by number
let pattern2 = /(?<word>\w+)\s+\k<word>/; // Same, but explicit

let text = "hello hello world";

console.log(text.match(pattern));
console.log(text.match(pattern2));

Output:

[ 'hello hello', 'hello', index: 0, ... groups: { word: 'hello' } ]
[ 'hello hello', 'hello', index: 0, ... groups: { word: 'hello' } ]

Tip: Prefer \k<name> when using named groups. Mixing \1 with named groups works, but it reduces the clarity benefit that named groups provide.

Use Case: Matching Repeated Words

One of the most practical uses of backreferences is detecting accidentally repeated words in text, a common typo in writing.

Simple Repeated Word Detection

let repeatedWord = /\b(\w+)\s+\1\b/gi;

let text = "This is is a test test of the the system.";
let matches = text.match(repeatedWord);

console.log(matches);

Output:

[ 'is is', 'test test', 'the the' ]

Let's break down the pattern:

Part     Meaning
\b       Word boundary (start of a word)
(\w+)    Capture one or more word characters (the first word)
\s+      One or more whitespace characters between words
\1       The exact same text captured by group 1
\b       Word boundary (end of the second word)
gi       Flags: global search, case-insensitive

Why Word Boundaries Matter

Without \b, you would get false matches:

// Without word boundaries, problematic
let bad = /(\w+)\s+\1/gi;
let text = "The therapist is here.";

console.log(text.match(bad));

Output:

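[ 'The the' ]

That is a false match: with the i flag, \1 matches "the" at the start of "therapist" even though no word is repeated. The same problem appears without any difference in case: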

let bad = /(\w+)\s+\1/gi;
let text = "for fortune";

console.log(text.match(bad));

Output:

[ 'for for' ]

It matches "for for" inside "for fortune" because (\w+) captures "for" and \1 finds "for" at the start of "fortune". Adding \b prevents this:

let good = /\b(\w+)\s+\1\b/gi;
let text = "for fortune";

console.log(text.match(good));

Output:

null

Now the match is correctly rejected: after \1 matches "for", the next character is "t", so there is no word boundary at that position and the trailing \b fails.

Named Version for Clarity

let repeatedWord = /\b(?<word>\w+)\s+\k<word>\b/gi;

let text = "I went went to to the store store yesterday.";
let match;

while ((match = repeatedWord.exec(text)) !== null) {
  console.log(`Found "${match[0]}" - repeated word: "${match.groups.word}"`);
}

Output:

Found "went went" - repeated word: "went"
Found "to to" - repeated word: "to"
Found "store store" - repeated word: "store"

Removing Duplicate Words with replace()

You can fix repeated words by combining the pattern with replace(). Note that in the replacement string, you use $1 (not \1) to reference captured groups:

let repeatedWord = /\b(\w+)\s+\1\b/gi;
let text = "This is is a test test of the the system.";

let fixed = text.replace(repeatedWord, "$1");
console.log(fixed);

Output:

This is a test of the system.

Note: Inside the regex pattern, you use \1 for backreferences. Inside the replacement string of replace(), you use $1. These are different syntaxes for different contexts.
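
One limitation worth knowing: the pattern above matches words in pairs, so a word repeated three times is only reduced to two. A possible variation, sketched here with an illustrative input, repeats the backreference itself so that an entire run collapses to a single word.

```javascript
// Matches exactly one repetition at a time.
const pairOnly = /\b(\w+)\s+\1\b/gi;

// Variation: the non-capturing group (?:\s+\1\b)+ consumes every further repetition.
const wholeRun = /\b(\w+)(?:\s+\1\b)+/gi;

const sample = "It was the the the best best day.";

console.log(sample.replace(pairOnly, "$1")); // "It was the the best day."
console.log(sample.replace(wholeRun, "$1")); // "It was the best day."
```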

Use Case: Matching Paired Delimiters

Backreferences are essential when you need to match opening and closing delimiters that must be the same character.

Matching Quoted Strings

A common challenge is matching strings enclosed in either single or double quotes, where the closing quote must match the opening one:

// Wrong approach - matches mismatched quotes
let bad = /["'][^"']*["']/g;
let text = `He said "hello' and she said 'world"`;

console.log(text.match(bad));

Output:

[ `"hello'`, `'world"` ]

This incorrectly matches "hello' where the opening double quote is closed by a single quote. Backreferences solve this:

// Correct approach - opening quote must match closing quote
let good = /(["'])[^"']*\1/g;
let text = `He said "hello" and she said 'world'`;

console.log(text.match(good));

Output:

[ '"hello"', "'world'" ]

The group (["']) captures whichever quote character opens the string, and \1 ensures the same quote character closes it.
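
As a quick check, here is a sketch that runs the backreference pattern against the mismatched input from the earlier example. It suggests that the mismatched pairs are rejected, but also that the pattern will pair up any two identical quote characters it can find, even when they belong to two different broken strings.

```javascript
const sameQuote = /(["'])[^"']*\1/g;
const mismatched = `He said "hello' and she said 'world"`;

// "hello' and 'world" are both rejected, but the two single quotes
// in the middle still form a valid pair from the regex's point of view.
const sameQuoteMatches = mismatched.match(sameQuote);
console.log(sameQuoteMatches); // [ "' and she said '" ]
```

The backreference guarantees that the delimiters agree. It does not know which quote was intended to open a string.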

Named Version for Quoted Strings

let quoted = /(?<quote>["'])(?<content>[^"']*)\k<quote>/g;
let text = `title="JavaScript" lang='en' broken="oops'`;

let match;
while ((match = quoted.exec(text)) !== null) {
  console.log(`Quote: ${match.groups.quote}, Content: "${match.groups.content}"`);
}

Output:

Quote: ", Content: "JavaScript"
Quote: ', Content: "en"

The broken="oops' part is correctly ignored because the quotes do not match.

Handling Escaped Quotes Inside Strings

A more realistic pattern allows escaped quotes within the string:

// Match quoted strings that may contain escaped quotes
let quoted = /(["'])(?:\\.|(?!\1).)*\1/g;

let text = `She said "He\\'s \\"great\\"" and 'it\\'s fine'`;
console.log(text.match(quoted));

Let's break this pattern down:

Part        Meaning
(["'])      Capture the opening quote (group 1)
(?:...)*    Non-capturing group, repeated zero or more times
\\.         Match any escaped character (backslash + anything)
|           OR
(?!\1).     Any character that is NOT the opening quote (negative lookahead)
\1          The matching closing quote
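
To see what the pattern actually returns, the sketch below runs it on the same sentence. It uses String.raw so the backslashes in the sample appear exactly as they would in the input text; the variable names are illustrative.

```javascript
// String.raw keeps the backslashes exactly as typed.
const escapedSample = String.raw`She said "He\'s \"great\"" and 'it\'s fine'`;
const quotedWithEscapes = /(["'])(?:\\.|(?!\1).)*\1/g;

const found = escapedSample.match(quotedWithEscapes);

console.log(found.length); // 2
console.log(found[0]);     // "He\'s \"great\""
console.log(found[1]);     // 'it\'s fine'
```

Each escaped quote is consumed by the \\. branch before the lookahead is tried, so it never ends the string early.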

Matching Paired HTML Tags

Backreferences can match opening and closing HTML tags with the same tag name:

let pairedTag = /<(\w+)(?:\s[^>]*)?>[\s\S]*?<\/\1>/g;

let html = `
<div>Hello</div>
<span>World</span>
<p>Text</p>
<div>Mismatched</span>
`;

console.log(html.match(pairedTag));

Output:

[ '<div>Hello</div>', '<span>World</span>', '<p>Text</p>' ]

The <div>Mismatched</span> is correctly excluded because \1 captured "div" but the closing tag says "span".

Warning: While backreferences can match simple paired HTML tags, never use regex to parse full HTML documents. HTML is not a regular language, and nested or malformed tags will break any regex approach. Use a proper DOM parser instead.
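
A minimal sketch of the nesting problem, using the same paired-tag pattern on a made-up nested snippet:

```javascript
const pairedTagPattern = /<(\w+)(?:\s[^>]*)?>[\s\S]*?<\/\1>/g;
const nestedHtml = "<div><div>inner</div></div>";

// The lazy [\s\S]*? stops at the FIRST </div>, so the outer element is cut short
// and the final </div> is left unmatched.
const nestedResult = nestedHtml.match(pairedTagPattern);
console.log(nestedResult); // [ '<div><div>inner</div>' ]
```

Making the quantifier greedy does not fix this in general. It only moves the failure to inputs with sibling elements.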

Matching Repeated Patterns in Data

Backreferences work well for detecting symmetry or repetition in structured data:

// Match coordinates where x equals y: (n, n)
let sameCoord = /\((\d+),\s*\1\)/g;

let data = "(3, 3) (4, 5) (7, 7) (10, 10) (2, 8)";
console.log(data.match(sameCoord));

Output:

[ '(3, 3)', '(7, 7)', '(10, 10)' ]

Backreferences with matchAll() and exec()

When you need full details about each match, matchAll() or exec() give you access to all captured groups and their backreference values:

let pattern = /\b(?<word>\w+)\s+\k<word>\b/gi;
let text = "The the quick brown fox fox jumped jumped high.";

for (let match of text.matchAll(pattern)) {
  console.log({
    fullMatch: match[0],
    capturedWord: match[1],
    namedGroup: match.groups.word,
    index: match.index
  });
}

Output:

{ fullMatch: 'The the', capturedWord: 'The', namedGroup: 'The', index: 0 }
{ fullMatch: 'fox fox', capturedWord: 'fox', namedGroup: 'fox', index: 20 }
{ fullMatch: 'jumped jumped', capturedWord: 'jumped', namedGroup: 'jumped', index: 28 }
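
If you need this in more than one place, the loop can be wrapped in a small function. The helper below is a hypothetical sketch (the name findRepeatedWords and the returned object shape are assumptions, not part of any library).

```javascript
// Hypothetical helper: returns one entry per repeated word found in the text.
function findRepeatedWords(text) {
  // matchAll() requires the g flag; it throws a TypeError on a non-global regex.
  // Creating the regex inside the function avoids sharing lastIndex state between calls.
  const pattern = /\b(?<word>\w+)\s+\k<word>\b/gi;
  return Array.from(text.matchAll(pattern), (match) => ({
    word: match.groups.word,
    index: match.index,
  }));
}

console.log(findRepeatedWords("The the quick brown fox fox jumped jumped high."));
// [ { word: 'The', index: 0 }, { word: 'fox', index: 20 }, { word: 'jumped', index: 28 } ]
```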

Backreferences vs. Replacement References

It is important to understand that backreferences inside a pattern (\1, \k<name>) and references inside a replacement string ($1, $<name>) are related but different mechanisms:

Context                                Numbered            Named
Inside the regex pattern               \1, \2              \k<name>
Inside replace() replacement string    $1, $2              $<name>
Inside replace() callback function     args[1], args[2]    args[args.length - 1].name

let pattern = /(?<first>\w)(?<second>\w)/;
let text = "ab";

// In pattern: \k<first> or \1
// In replacement: $<first> or $1
console.log(text.replace(pattern, "$<second>$<first>"));

Output:

ba
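
The callback form from the last row of the table can be sketched the same way. When the pattern contains named groups, the groups object is passed as the final argument, so collecting the arguments with a rest parameter (args is just an illustrative name) gives access to both styles.

```javascript
const pair = /(?<first>\w)(?<second>\w)/;

const swapped = "ab".replace(pair, (...args) => {
  const numberedFirst = args[1];               // same value as $1
  const namedGroups = args[args.length - 1];   // named groups object (last argument)
  return namedGroups.second + numberedFirst;
});

console.log(swapped); // "ba"
```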

Common Mistakes with Backreferences

Mistake 1: Using $1 Inside the Pattern

// WRONG: $1 is for replacement strings, not patterns
let bad = /(\w+)\s+$1/;
console.log(bad.test("hello hello"));

Output:

false

Inside a regex pattern, $ is the end-of-input anchor, so $1 means "end of input, followed by a literal 1", which can never match here. It is not a backreference. Use \1 instead:

// CORRECT
let good = /(\w+)\s+\1/;
console.log(good.test("hello hello"));

Output:

true

Mistake 2: Backreference Before the Group

A backreference must appear after the group it refers to. In JavaScript, a backreference to a group that has not captured anything yet does not fail; it silently matches the empty string:

// The group comes AFTER the backreference (won't work as intended)
let pattern = /\1(\w+)/;

console.log(pattern.test("hello")); // true, but \1 matched empty string

Always place the capturing group before the backreference that refers to it.

Mistake 3: Forgetting That Backreferences Are Case-Sensitive

Backreferences match the exact captured text, including case:

let pattern = /\b(\w+)\s+\1\b/g;        // No "i" flag

console.log("The the".match(pattern)); // null ("The" ≠ "the")

Output:

null

Add the i flag if you want case-insensitive matching:

let pattern = /\b(\w+)\s+\1\b/gi;  // With "i" flag

console.log("The the".match(pattern));

Output:

[ 'The the' ]

Mistake 4: Non-Capturing Groups Do Not Create Backreferences

Non-capturing groups (?:...) are not assigned a number and cannot be backreferenced:

// (?:\w+) is non-capturing  (\1 refers to the next actual capturing group)
let pattern = /(?:\w+)\s+(\w+)\s+\1/;

let text = "one two two";
console.log(pattern.test(text)); // true (\1 refers to group 1 which is (\w+) = "two")

Output:

true

If you expected \1 to reference the first set of parentheses (the non-capturing group), it does not. Only () creates numbered groups, not (?:).

Performance Considerations

Backreferences can cause the regex engine to do extra work because it must store and compare captured text. A few guidelines:

  • Simple backreferences like (.)\1 for repeated characters are efficient
  • Long captured text with backreferences works fine but uses more memory
  • Nested quantifiers with backreferences can potentially trigger excessive backtracking in edge cases
  • When the same logic can be expressed without a backreference, for example with a fixed literal or a character class, prefer the simpler form

// Using a backreference (works perfectly)
let doubled = /(.)\1/g;

// Without a backreference, you would have to list every pair explicitly (aa|bb|cc|...),
// so the backreference is the right tool here

For repeated word detection or paired delimiter matching, backreferences are the correct and efficient tool. There is no simpler alternative.

Summary

Feature                          Syntax          Example
Numbered backreference           \1, \2, etc.    (.)\1 matches "aa", "bb"
Named backreference              \k<name>        (?<ch>.)\k<ch> matches "aa"
In replacement string            $1, $<name>     "aa".replace(/(.)(\1)/, "$1") gives "a"
Unmatched group backreference    \1              Matches empty string if group did not participate

Key takeaways:

  • Backreferences match the exact same text previously captured, not the same pattern
  • Groups are numbered by the position of their opening parenthesis from left to right
  • Named backreferences (\k<name>) are more readable and resilient to pattern changes
  • The most common real-world uses are repeated word detection and paired delimiter matching
  • Inside patterns use \1, inside replacement strings use $1
  • Non-capturing groups (?:...) do not receive numbers and cannot be backreferenced

Backreferences bridge the gap between what regular languages can express and the practical text-matching problems developers face daily. Combined with named groups, they make complex patterns both powerful and readable.