How Greedy and Lazy Quantifiers Work in JavaScript Regular Expressions
The previous guide on quantifiers introduced the concept of greedy vs. lazy matching. This article goes much deeper. Understanding how the regex engine backtracks is the single most important skill for debugging regular expressions that return unexpected results, match too much, or run painfully slowly.
Most regex bugs are not caused by wrong syntax. They are caused by misunderstanding how the engine decides where to start and where to stop a match. This guide walks you through the engine's decision-making process step by step, explores lazy quantifiers as the primary tool for controlling match length, and presents negated character sets as a powerful and often superior alternative.
The Regex Engine: A Step-by-Step Mental Model
Before diving into greedy and lazy behavior, you need to understand the basic algorithm the JavaScript regex engine follows. It is a backtracking engine (technically, an NFA-based engine), and it works like this:
1. Start at position 0 of the string
2. Try to match the pattern from that position
3. If the pattern matches, report the match
4. If the pattern fails, move to position 1 and try again
5. Repeat until a match is found or the entire string is exhausted
Within step 2, the engine processes the pattern element by element, left to right. When a quantifier is involved, the engine must decide how many characters to consume, and this is where greedy and lazy behavior diverge.
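The outer loop described above can be approximated in JavaScript itself. The sketch below is illustrative only: findLeftmost is a hypothetical helper (not a built-in), and it relies on the sticky y flag, which forces each attempt to begin exactly at lastIndex, to imitate "try position 0, then position 1, and so on". Real engines apply optimizations on top of this basic loop.

```javascript
// Hypothetical helper: imitates the engine's outer loop.
// The sticky (y) flag makes exec() attempt a match ONLY at lastIndex,
// so each iteration is one "try the pattern from this position" step.
function findLeftmost(source, str) {
  const anchored = new RegExp(source, "y");
  for (let start = 0; start <= str.length; start++) {
    anchored.lastIndex = start;
    const m = anchored.exec(str);
    if (m) return { start, text: m[0] }; // first success wins
  }
  return null; // every start position failed
}

console.log(findLeftmost("\\d+", "abc 123 def"));
// { start: 4, text: '123' }
console.log(findLeftmost("\\d+", "no digits here"));
// null
```

The leftmost starting position that succeeds always wins, which is why a later, "better looking" match is never considered.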
How Greedy Quantifiers Work: The Backtracking Process
By default, every quantifier in JavaScript is greedy. A greedy quantifier tries to match as many characters as possible first, then gives characters back one at a time (backtracks) if the rest of the pattern cannot match.
Let's trace through a concrete example in precise detail.
Example: Matching a Quoted String (Greedy)
const pattern = /".+"/;
const str = 'She said "hello" quietly';
Step-by-step engine execution:
Step 1: The engine starts at position 0 (S). The first element in the pattern is a literal ". The character S is not ", so the match fails at position 0. The engine moves to position 1.
Steps 2 through 9: The engine tries positions 1 through 8 (h, e, , s, a, i, d, ). None of them are ". Each attempt fails immediately.
Step 10: Position 9 is ". The first pattern element matches. The engine advances to the next pattern element: .+
Step 11: .+ is greedy. The . matches any character (except newline by default), and + means one or more. The engine tries to match as many characters as possible.
Starting from position 10, .+ consumes:
- Position 10: h (matched)
- Position 11: e (matched)
- Position 12: l (matched)
- Position 13: l (matched)
- Position 14: o (matched)
- Position 15: " (matched, . matches " too!)
- Position 16: space (matched)
- Position 17: q (matched)
- Position 18: u (matched)
- Position 19: i (matched)
- Position 20: e (matched)
- Position 21: t (matched)
- Position 22: l (matched)
- Position 23: y (matched)
The engine reaches the end of the string. .+ has consumed positions 10 through 23 (14 characters).
Step 12: The engine now needs to match the final " in the pattern. But there are no characters left in the string. The match fails at this point.
Step 13 (Backtracking begins): The engine gives back one character from .+. Now .+ matches positions 10 through 22, and the engine tries to match " at position 23. Position 23 is y. Not a match.
Step 14: Give back another character. .+ matches positions 10 through 21. Position 22 is l. Not a match.
Steps 15 through 19: The engine continues backtracking: t, e, i, u, q. None are ".
Step 20: Give back the space. .+ now matches positions 10 through 15, and the engine tries to match " at position 16. Position 16 is a space. Not a match.
Step 21: Give back one more character. .+ now covers positions 10 through 14 (h, e, l, l, o), and the engine tries position 15. Position 15 is ". Match found!
console.log(str.match(pattern)[0]);
// Output: '"hello"'
In this case, we got the result we wanted. But what happens when there are multiple quoted sections?
The Greedy Problem: Multiple Delimiters
const pattern = /".+"/;
const str = 'She said "hello" and "goodbye" today';
console.log(str.match(pattern)[0]);
// Output: '"hello" and "goodbye"'
Here is the trace:
- The engine finds " at position 9
- .+ (greedy) consumes everything from position 10 to the end of the string (position 35)
- Backtracking begins. The engine gives back characters one at a time: y, a, d, o, t, ...
- At position 29, the engine finds " (the quote after "goodbye")
- The closing " in the pattern matches. Done.
The engine stops at the first successful match during backtracking, which is the last " in the string. It never considers stopping at the " after "hello" because greedy backtracking works from right to left: it gave back the minimum needed to make the pattern succeed.
This is the core of the greedy problem: the engine found a valid match, but it was not the match you intended.
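The two phases can be modeled by hand to make the cost of backtracking visible. The following is a minimal sketch, not how the engine is actually implemented: traceGreedy is a hypothetical function that mimics /".+"/ from a single start position, and it assumes single-line input (the real dot also refuses to match line terminators).

```javascript
// Hypothetical hand-written model of /".+"/ at one start position.
// Assumes single-line input (the real . also stops at line terminators).
function traceGreedy(str, open) {
  if (str[open] !== '"') return null; // opening quote must match first
  // Phase 1: .+ covers [open + 1, end) and expands to the end of the string
  let end = str.length;
  let givenBack = 0;
  // Phase 2: test the closing quote at index `end`; on failure, give back
  // one character. .+ must keep at least one, so end stays >= open + 2.
  while (end >= open + 2) {
    if (str[end] === '"') {
      return { match: str.slice(open, end + 1), givenBack };
    }
    end--;
    givenBack++;
  }
  return null;
}

const oneQuote = 'She said "hello" quietly';
const twoQuotes = 'She said "hello" and "goodbye" today';
console.log(traceGreedy(oneQuote, 9));
// { match: '"hello"', givenBack: 9 }
console.log(traceGreedy(twoQuotes, 9));
// { match: '"hello" and "goodbye"', givenBack: 7 }
```

In the second string the model gives back only 7 characters before it meets the last quote, so it never gets anywhere near the quote after hello.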
Visualizing Backtracking
Think of greedy matching as a two-phase process:
Phase 1 (Expansion): Eat everything possible
"hello" and "goodbye" today
^^^^^^^^^^^^^^^^^^^^^^^^^^ (.+ grabs all of this)
Phase 2 (Backtracking): Give back until closing " is found
"hello" and "goodbye" today ← position 35: no "
"hello" and "goodbye" toda ← position 34: no "
"hello" and "goodbye" tod ← position 33: no "
...
"hello" and "goodbye" ← position 29: found "! Stop here.
Result: "hello" and "goodbye"
How Lazy Quantifiers Work
A lazy quantifier reverses the strategy. Instead of starting with the maximum and giving back, it starts with the minimum and expands one character at a time until the rest of the pattern can match.
You make any quantifier lazy by appending ? to it:
| Greedy | Lazy |
|---|---|
| + | +? |
| * | *? |
| ? | ?? |
| {n,m} | {n,m}? |
| {n,} | {n,}? |
Example: Matching a Quoted String (Lazy)
const pattern = /".+?"/;
const str = 'She said "hello" and "goodbye" today';
Step-by-step engine execution:
Steps 1 through 9: Same as before. The engine finds " at position 9.
Step 10: .+? is lazy. The + requires at least one character, so the engine matches the minimum: just one character at position 10 (h).
Step 11: The engine tries to match the closing " at position 11. Position 11 is e. Not a match.
Step 12: The lazy quantifier expands by one. .+? now covers positions 10-11 (he). The engine tries " at position 12. It is l. Not a match.
Steps 13-14: Expand to hel, try " at 13 (l). Expand to hell, try " at 14 (o). No match either time.
Step 15: Expand to hello. Try " at position 15. It is "! Match found!
console.log(str.match(pattern)[0]);
// Output: '"hello"'
With the g flag, the engine continues and finds the second match:
const pattern = /".+?"/g;
const str = 'She said "hello" and "goodbye" today';
console.log(str.match(pattern));
// Output: ['"hello"', '"goodbye"']
Visualizing Lazy Matching
Phase 1 (Minimum): Start with the least possible
"h ← .+? matches just "h"
→ try closing ": position 11 is 'e'. Fail.
Phase 2 (Expansion): Grow one character at a time
"he ← .+? matches "he"
→ try closing ": position 12 is 'l'. Fail.
"hel ← .+? matches "hel"
→ try closing ": position 13 is 'l'. Fail.
"hell ← .+? matches "hell"
→ try closing ": position 14 is 'o'. Fail.
"hello ← .+? matches "hello"
→ try closing ": position 15 is '"'. Match!
Result: "hello"
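The same expansion can be written out by hand. This is a rough sketch under the same assumptions as the walkthrough: traceLazy is a hypothetical function that imitates /".+?"/ from one start position on single-line input, and it counts how many times the quantifier had to grow.

```javascript
// Hypothetical hand-written model of /".+?"/ at one start position.
function traceLazy(str, open) {
  if (str[open] !== '"') return null; // opening quote must match first
  // .+? starts with its minimum of one character: it covers [open + 1, end)
  let end = open + 2;
  let expansions = 0;
  while (end < str.length) {
    if (str[end] === '"') {
      return { match: str.slice(open, end + 1), expansions };
    }
    end++; // closing quote failed: let .+? take one more character
    expansions++;
  }
  return null; // ran out of string before a closing quote appeared
}

const quoted = 'She said "hello" and "goodbye" today';
console.log(traceLazy(quoted, 9));
// { match: '"hello"', expansions: 4 }
console.log(traceLazy(quoted, 21));
// { match: '"goodbye"', expansions: 6 }
```

Note that the lazy version still does work proportional to the length of the content: one failed check of the closing quote per expansion.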
All Lazy Quantifiers in Practice
+? : One or More (Prefer Fewer)
const greedy = /\d+/;
const lazy = /\d+?/;
const str = "12345";
console.log(str.match(greedy)[0]); // '12345' (takes all digits)
console.log(str.match(lazy)[0]); // '1' (takes one digit (minimum for +))
// With the g flag, lazy + produces individual matches
const greedy = /\d+/g;
const lazy = /\d+?/g;
const str = "123 4567 89";
console.log(str.match(greedy));
// Output: ['123', '4567', '89']
console.log(str.match(lazy));
// Output: ['1', '2', '3', '4', '5', '6', '7', '8', '9']
*? : Zero or More (Prefer Zero)
const greedy = /a.*/;
const lazy = /a.*?/;
const str = "abcabc";
console.log(str.match(greedy)[0]); // 'abcabc' (.* takes everything after 'a')
console.log(str.match(lazy)[0]); // 'a' (.*? takes nothing (zero is valid for *))
Notice that .*? alone at the end of a pattern always matches zero characters because zero satisfies *. It only expands when there is a pattern element after it that forces expansion:
// .*? expands only when forced by the rest of the pattern
const pattern = /a.*?c/;
const str = "abcabc";
console.log(str.match(pattern)[0]);
// Output: 'abc'
// .*? tries zero chars first: "a", then "c" at position 1? No, position 1 is 'b'.
// .*? expands to one char: "ab", then "c" at position 2? Yes, position 2 is 'c'.
// Result: 'abc'
?? : Zero or One (Prefer Zero)
const greedy = /colou?r/;
const lazy = /colou??r/;
// Both match "color" and "colour"
console.log("color".match(greedy)[0]); // 'color'
console.log("color".match(lazy)[0]); // 'color'
console.log("colour".match(greedy)[0]); // 'colour'
console.log("colour".match(lazy)[0]); // 'colour'
The difference between ? and ?? is visible when you look at what a capturing group captures:
const greedy = /(colou?)r/;
const lazy = /(colou??)r/;
const str = "colour";
console.log(str.match(greedy)[1]); // 'colou' (? prefers to include 'u')
console.log(str.match(lazy)[1]); // 'colou' (?? prefers to skip 'u', but must include it for 'r' to match)
In this specific case, both produce 'colou' because the engine is forced to include u either way. The lazy ?? first tries to skip it, but then r has to match at the position of u and fails, so the engine backtracks and includes u.
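Where ?? makes a visible difference: when the rest of the pattern can still succeed after the optional element is skipped.
const greedy = /_(\d?)/;
const lazy = /_(\d??)/;
console.log("_5".match(greedy)[1]); // '5' (? prefers to match the digit)
console.log("_5".match(lazy)[1]); // '' (?? prefers to skip the digit, and nothing after it forces a retry)
// Note: with the digit BEFORE a required character, as in /(\d??)_/ on "5_",
// both versions capture '5', because skipping the digit makes _ fail at position 0
// Another example where ?? matters
const greedy = /^(\d?)(\d+)$/;
const lazy = /^(\d??)(\d+)$/;
const str = "42";
console.log(str.match(greedy).slice(1)); // ['4', '2'] (\d? grabs the first digit)
console.log(str.match(lazy).slice(1)); // ['', '42'] (\d?? skips it, so \d+ takes both digits)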
{n,m}? : Range (Prefer Minimum)
const greedy = /\d{2,5}/g;
const lazy = /\d{2,5}?/g;
const str = "1234567890";
console.log(str.match(greedy));
// Output: ['12345', '67890'] (takes max (5) each time)
console.log(str.match(lazy));
// Output: ['12', '34', '56', '78', '90'] (takes min (2) each time)
// Practical example: matching words of varying length
const greedy = /\w{3,6}/g;
const lazy = /\w{3,6}?/g;
const str = "abcdefghij";
console.log(str.match(greedy));
// Output: ['abcdef', 'ghij'] (takes 6 then remaining 4)
console.log(str.match(lazy));
// Output: ['abc', 'def', 'ghi'] (takes 3 each time, 'j' alone is below minimum)
Backtracking Deeper: Complex Patterns
The greedy/lazy distinction becomes critical when patterns have multiple quantifiers or when the quantified element overlaps with what follows.
Two Greedy Quantifiers Competing
const pattern = /(\d+)(\d+)/;
const str = "12345";
console.log(str.match(pattern).slice(1));
// Output: ['1234', '5']
Here is why:
- The first \d+ (greedy) tries to take all 5 digits: 12345
- The second \d+ needs at least one digit, but there are none left
- The first \d+ backtracks, giving up 5. Now first has 1234, second gets 5
- Both quantifiers are satisfied. Match found.
The first greedy quantifier takes as much as possible while leaving just enough for the second.
// With three groups
const pattern = /(\d+)(\d+)(\d+)/;
const str = "12345";
console.log(str.match(pattern).slice(1));
// Output: ['123', '4', '5']
Each subsequent group gets the minimum (one digit), and the first group gets everything else.
Greedy and Lazy in the Same Pattern
const pattern = /(\d+?)(\d+)/;
const str = "12345";
console.log(str.match(pattern).slice(1));
// Output: ['1', '2345']
Now the first group is lazy (+?), so it takes the minimum (one digit: 1). The second group is greedy, so it takes everything remaining: 2345.
// Reversed laziness
const pattern = /(\d+)(\d+?)/;
const str = "12345";
console.log(str.match(pattern).slice(1));
// Output: ['1234', '5']
First group is greedy (takes 12345), but second needs at least one, so first backtracks to 1234, second gets 5 (minimum).
The Alternative: Negated Character Sets
In many practical scenarios, you do not actually need lazy quantifiers at all. A negated character set solves the problem more directly, more efficiently, and more readably.
The Core Idea
Instead of saying "match any character, but lazily" (.+?), you say "match any character that is not the delimiter" ([^delimiter]+).
// Task: match content between double quotes
// Approach 1: Lazy quantifier
const lazy = /".*?"/g;
// Approach 2: Negated set
const negated = /"[^"]*"/g;
const str = 'She said "hello" and "goodbye" today';
console.log(str.match(lazy));
// Output: ['"hello"', '"goodbye"']
console.log(str.match(negated));
// Output: ['"hello"', '"goodbye"']
Both produce the same result, but the negated set version works fundamentally differently.
Why Negated Sets Are Faster
The lazy approach:
- . matches one character
- Check if next char is ". No? Expand by one.
- . matches another character
- Check if next char is ". No? Expand by one.
- Repeat until " is found
At each position, the engine must check two things: can . match, and does the next character satisfy the rest of the pattern?
The negated set approach:
- [^"] matches one character (as long as it is not ")
- [^"] matches another (still not ")
- Continues until it hits ", where [^"] fails
- The * quantifier is satisfied, and the engine moves to the closing "
The negated set is greedy but never overshoots because it physically cannot match the delimiter. No backtracking is needed.
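Written as ordinary code, the negated set amounts to a single forward scan. The sketch below is a hand-written approximation, not the engine's implementation: scanQuoted is a hypothetical function that behaves like /"[^"]*"/ at one start position. It simplifies the failure case, where a real engine would still give characters back before reporting no match.

```javascript
// Hypothetical hand-written equivalent of /"[^"]*"/ at one start position.
function scanQuoted(str, open) {
  if (str[open] !== '"') return null; // opening quote must match first
  let i = open + 1;
  // [^"]* : advance while the current character is not the delimiter
  while (i < str.length && str[i] !== '"') i++;
  // The loop stops exactly at the closing quote, or at the end of input
  return i < str.length ? str.slice(open, i + 1) : null;
}

const sentence = 'She said "hello" and "goodbye" today';
console.log(scanQuoted(sentence, 9)); // '"hello"'
console.log(scanQuoted(sentence, 21)); // '"goodbye"'
```

There is one comparison per character and no second check of the closing quote along the way, which is the intuition behind the speed difference.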
// Benchmarking the difference (simplified)
const lazy = /".*?"/g;
const negated = /"[^"]*"/g;
// Build a long string with many quoted segments
const str = Array(10000).fill('"word"').join(' and ');
console.time("lazy");
str.match(lazy);
console.timeEnd("lazy");
console.time("negated");
str.match(negated);
console.timeEnd("negated");
// Negated set is typically faster, especially on longer strings
Common Delimiter Patterns with Negated Sets
Here are the most frequent use cases translated from lazy to negated set patterns:
// Double-quoted strings
const lazyQuote = /".*?"/g;
const negatedQuote = /"[^"]*"/g;
// Single-quoted strings
const lazySingle = /'.*?'/g;
const negatedSingle = /'[^']*'/g;
// Content between parentheses (non-nested)
const lazyParens = /\(.*?\)/g;
const negatedParens = /\([^)]*\)/g;
// Content between square brackets
const lazyBrackets = /\[.*?\]/g;
const negatedBrackets = /\[[^\]]*\]/g;
// HTML tags
const lazyTags = /<.*?>/g;
const negatedTags = /<[^>]*>/g;
// Verify that the two HTML tag patterns produce identical results
const html = '<div class="main"><p>Hello</p><br/></div>';
console.log(html.match(/<.*?>/g));
// Output: ['<div class="main">', '<p>', '</p>', '<br/>', '</div>']
console.log(html.match(/<[^>]*>/g));
// Output: ['<div class="main">', '<p>', '</p>', '<br/>', '</div>']
When Negated Sets Cannot Replace Lazy Quantifiers
Negated sets work perfectly when the delimiter is a single character (a quote, a bracket, an angle bracket). They become more complex or impossible when:
1. The delimiter is multi-character:
// Match content between "<!--" and "-->"
// Negated set approach is awkward for multi-char delimiters
const lazy = /<!--.*?-->/gs;
const str = "<!-- comment --> code <!-- another -->";
console.log(str.match(lazy));
// Output: ['<!-- comment -->', '<!-- another -->']
// A negated set would need to say "match anything that is not the start of -->"
// This is possible but complex:
const negated = /<!--(?:[^-]|-(?!->))*-->/g;
// Much harder to read and write
2. The dot needs to match newlines (only an apparent limitation):
// With the s flag, . matches newlines
const lazy = /".*?"/gs;
const str = '"line one\nline two"';
console.log(str.match(lazy));
// Output: ['"line one\nline two"']
// Negated set equivalent needs to exclude " but include \n
const negated = /"[^"]*"/g;
// This already works because [^"] means "not a quote", which includes newlines
console.log(str.match(negated));
// Output: ['"line one\nline two"']
In this case the negated set works without any flag, because [^"] already matches newlines. The s flag is a requirement of the dot-based version only.
3. The matching logic is more complex than a simple delimiter:
// Match the shortest sequence ending with a digit followed by a period
const lazy = /\w+?\d\./;
const str = "abc3.";
console.log(str.match(lazy)[0]);
// Output: 'abc3.'
// A negated set approach does not cleanly express this
Use negated sets when matching content between single-character delimiters. They are faster, clearer, and avoid backtracking entirely.
Use lazy quantifiers when dealing with multi-character delimiters, complex stop conditions, or when the negated set would be harder to read.
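For the multi-character case, the choice is about readability rather than results. As a quick check, the snippet below runs the lazy comment pattern and the lookahead-based alternative from this section against a made-up sample (the markup string is an assumption chosen to include a lone dash and a newline).

```javascript
// Both patterns are taken from the comment example in this section.
const lazyComment = /<!--.*?-->/gs;
const negatedComment = /<!--(?:[^-]|-(?!->))*-->/g;

// Sample input (illustrative): one comment contains a lone "-",
// the other spans two lines.
const markup = "<!-- a - b --> code <!-- multi\nline -->";

console.log(markup.match(lazyComment));
// ['<!-- a - b -->', '<!-- multi\nline -->']
console.log(markup.match(negatedComment));
// ['<!-- a - b -->', '<!-- multi\nline -->']
```

Both return the same two comments here, so the shorter lazy pattern is usually the easier one to maintain.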
Greedy and Lazy with the Dot and s Flag
The dot . matches any character except newline. With the s flag (dotall), it matches newlines too. This affects how greedy and lazy quantifiers interact with multi-line content.
const multiline = `first "hello
world" second "foo" end`;
// Without s flag: . does not match \n, so it stops at the newline
const withoutS = /".*?"/g;
console.log(multiline.match(withoutS));
// Output: ['" second "'] (the first pair spans a newline, so .*? can't cross it; the engine then pairs the quote after world with the quote before foo)
// With s flag: . matches \n too
const withS = /".*?"/gs;
console.log(multiline.match(withS));
// Output: ['"hello\nworld"', '"foo"']
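For comparison, a negated set handles the same input with no s flag at all, because [^"] matches any character except the quote, newlines included. A small sketch using the same two-line string (the variable names are illustrative):

```javascript
const twoLines = `first "hello
world" second "foo" end`;

// No s flag needed: [^"] already matches the newline
const negatedQuotes = /"[^"]*"/g;
console.log(twoLines.match(negatedQuotes));
// ['"hello\nworld"', '"foo"']
```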
Real-World Patterns: Greedy vs. Lazy vs. Negated
Extracting HTML Tag Attributes
const html = '<img src="photo.jpg" alt="A photo" class="thumbnail">';
// Greedy: grabs too much
const greedy = /\w+=".*"/g;
console.log(html.match(greedy));
// Output: ['src="photo.jpg" alt="A photo" class="thumbnail"']
// One giant match from the first attribute name to the last "
// Lazy: correct
const lazy = /\w+=".*?"/g;
console.log(html.match(lazy));
// Output: ['src="photo.jpg"', 'alt="A photo"', 'class="thumbnail"']
// Negated: also correct and faster
const negated = /\w+="[^"]*"/g;
console.log(html.match(negated));
// Output: ['src="photo.jpg"', 'alt="A photo"', 'class="thumbnail"']
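If the goal is the attribute names and values rather than the raw text, capture groups around the same negated-set pattern do the job. The sketch below is one possible approach, assuming double-quoted values without escaped quotes; attrPattern and attrs are illustrative names, and real-world HTML is better handled by an HTML parser.

```javascript
const tag = '<img src="photo.jpg" alt="A photo" class="thumbnail">';

// Group 1: attribute name, group 2: value between the quotes
const attrPattern = /(\w+)="([^"]*)"/g;

// matchAll yields one match object per attribute; map each to [name, value]
const attrs = Object.fromEntries(
  Array.from(tag.matchAll(attrPattern), (m) => [m[1], m[2]])
);

console.log(attrs);
// { src: 'photo.jpg', alt: 'A photo', class: 'thumbnail' }
```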
Parsing Template Literals (Custom Delimiters)
// Extract content between {{ and }}
const template = "Hello {{name}}, your order {{orderId}} is ready.";
// Lazy approach
const lazy = /\{\{.*?\}\}/g;
console.log(template.match(lazy));
// Output: ['{{name}}', '{{orderId}}']
// Negated set approach (matching anything that is not })
const negated = /\{\{[^}]*\}\}/g;
console.log(template.match(negated));
// Output: ['{{name}}', '{{orderId}}']
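To show the lazy pattern doing useful work, here is a minimal substitution sketch. The render function and the values object are hypothetical, invented for illustration; the only technique taken from the text is the lazy \{\{.*?\}\} match, with a capture group added around the content.

```javascript
// Hypothetical helper: replaces {{key}} placeholders with values[key].
function render(template, values) {
  return template.replace(/\{\{(.*?)\}\}/g, (whole, rawKey) => {
    const key = rawKey.trim();
    return Object.prototype.hasOwnProperty.call(values, key)
      ? String(values[key])
      : whole; // unknown placeholder: leave it untouched
  });
}

const message = "Hello {{name}}, your order {{orderId}} is ready.";
console.log(render(message, { name: "Ada", orderId: 42 }));
// 'Hello Ada, your order 42 is ready.'
console.log(render("Hi {{missing}}", {}));
// 'Hi {{missing}}'
```

A greedy .* in the same position would treat everything from the first {{ to the last }} as a single placeholder.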
Extracting CSS Property Values
const css = "color: red; background-color: #fff; font-size: 16px;";
// Match property: value pairs
const lazy = /[\w-]+:\s*.*?;/g;
console.log(css.match(lazy));
// Output: ['color: red;', 'background-color: #fff;', 'font-size: 16px;']
const negated = /[\w-]+:\s*[^;]*;/g;
console.log(css.match(negated));
// Output: ['color: red;', 'background-color: #fff;', 'font-size: 16px;']
Matching Balanced Delimiters (Simple, Non-Nested)
const str = "calc(1 + 2) and func(a, b) end";
// Lazy
const lazy = /\w+\(.*?\)/g;
console.log(str.match(lazy));
// Output: ['calc(1 + 2)', 'func(a, b)']
// Negated
const negated = /\w+\([^)]*\)/g;
console.log(str.match(negated));
// Output: ['calc(1 + 2)', 'func(a, b)']
Neither lazy quantifiers nor simple negated sets handle nested delimiters correctly. For example, func(a, (b + c)) contains nested parentheses. Both patterns above would match func(a, (b + c), stopping at the first ) and leaving the outer ) unmatched. JavaScript regular expressions do not support recursive patterns, so truly nested structures call for a proper parser.
const nested = "func(a, (b + c))";
const lazy = /\w+\(.*?\)/g;
console.log(nested.match(lazy));
// Output: ['func(a, (b + c)'] - stops at first ), misses outer )
const negated = /\w+\([^)]*\)/g;
console.log(nested.match(negated));
// Output: ['func(a, (b + c)'] - same issue
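A "proper parser" for this narrow case can be as small as a depth counter. The following is a sketch under simplifying assumptions: matchBalanced is a hypothetical function, and it ignores parentheses inside string literals or comments, which a real parser would need to handle.

```javascript
// Hypothetical helper: returns the balanced (...) group starting at `open`,
// or null if the parentheses never balance.
function matchBalanced(str, open) {
  if (str[open] !== "(") return null;
  let depth = 0;
  for (let i = open; i < str.length; i++) {
    if (str[i] === "(") {
      depth++; // entering a nested level
    } else if (str[i] === ")" && --depth === 0) {
      return str.slice(open, i + 1); // outermost level just closed
    }
  }
  return null; // unbalanced input
}

const nestedCall = "func(a, (b + c))";
console.log(matchBalanced(nestedCall, nestedCall.indexOf("(")));
// '(a, (b + c))'
console.log(matchBalanced("calc(1 + 2) and more", 4));
// '(1 + 2)'
```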
Common Mistakes
Mistake 1: Greedy .+ Crossing Multiple Delimiters
This is by far the most common regex mistake involving quantifiers:
// ❌ Greedy: matches from first " to LAST "
const greedy = /".+"/g;
const str = '"one" "two" "three"';
console.log(str.match(greedy));
// Output: ['"one" "two" "three"'] (one giant match!)
// ✅ Fix with lazy
const lazy = /".+?"/g;
console.log(str.match(lazy));
// Output: ['"one"', '"two"', '"three"']
// ✅ Better fix with negated set
const negated = /"[^"]+"/g;
console.log(str.match(negated));
// Output: ['"one"', '"two"', '"three"']
Mistake 2: Assuming Lazy Always Gives the "Shortest" Match
Lazy quantifiers give the shortest match from the current starting position. They do not find the globally shortest possible match.
const pattern = /a.*?b/;
const str = "aXXXbYYYaZb";
console.log(str.match(pattern)[0]);
// Output: 'aXXXb'
// NOT 'aZb', even though 'aZb' is shorter overall
The engine starts at position 0 (the first a) and finds the closest b from there. It does not jump ahead to position 8 to find a shorter match. The engine always tries from left to right and reports the first match it finds.
// To find ALL matches, use the g flag
const pattern = /a.*?b/g;
const str = "aXXXbYYYaZb";
console.log(str.match(pattern));
// Output: ['aXXXb', 'aZb']
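If the shortest candidate across all starting positions is what you actually want, you have to ask for it explicitly. One possible approach, sketched below, wraps the lazy pattern in a capturing lookahead so that every start position gets its own attempt; shortestMatch is a hypothetical helper, and it assumes a simple pattern like this one, where lazy matching does yield the shortest match from each position.

```javascript
// Hypothetical helper: tries the pattern at every start position and
// keeps the shortest result.
function shortestMatch(str, source) {
  // The lookahead consumes nothing, so the scan visits every position;
  // group 1 holds what the pattern would match starting there.
  const probe = new RegExp(`(?=(${source}))`, "g");
  let best = null;
  for (const m of str.matchAll(probe)) {
    if (best === null || m[1].length < best.length) best = m[1];
  }
  return best;
}

console.log(shortestMatch("aXXXbYYYaZb", "a.*?b")); // 'aZb'
console.log(shortestMatch("xyz", "a.*?b")); // null
```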
Mistake 3: Using .* at the Start or End Without Purpose
// ❌ Pointless: .* at the beginning matches everything then backtracks
const bad = /.*\d+/;
const str = "abc123";
// This is inefficient: the .* grabs "abc123", then backtracks one character
// at a time until \d+ can match. It also drags the leading "abc" into the match.
// ✅ Simpler and faster
const good = /\d+/;
console.log(str.match(bad)[0]); // 'abc123'
console.log(str.match(good)[0]); // '123'
// They don't even match the same thing! Be explicit about what you need.
Mistake 4: Confusing ? Roles
// The ? after + means lazy (not "optional +")
const lazy = /\d+?/; // One or more digits, prefer fewer
// The ? alone means optional (zero or one)
const optional = /\d?/; // Zero or one digit
console.log("12345".match(lazy)[0]); // '1' (lazy +, minimum one digit)
console.log("12345".match(optional)[0]); // '1' (optional, matched one digit)
console.log("abcde".match(optional)[0]); // '' (optional, matched zero digits)
console.log("abcde".match(lazy)); // null (lazy + still requires at least one digit)
Mistake 5: Trying to Match Empty Content with +?
// +? still requires AT LEAST one match
const pattern = /\(.+?\)/g;
const str = "empty () and full (content)";
console.log(str.match(pattern));
// Output: ['() and full (content)']
// "()" has nothing between the parens, so .+? consumes the ")" itself
// and keeps expanding until the next ")" at the end of "(content)"
// If empty content should also match, use *? or a negated set with *
const withStar = /\(.*?\)/g;
console.log(str.match(withStar));
// Output: ['()', '(content)']
const withNegated = /\([^)]*\)/g;
console.log(str.match(withNegated));
// Output: ['()', '(content)']
Decision Guide: Greedy, Lazy, or Negated Set?
Use this decision process when writing a quantifier:
Do you need to match content between two single-character delimiters?
Use a negated set: "[^"]*", \([^)]*\), <[^>]*>
Do you need to match content between multi-character delimiters?
Use a lazy quantifier: <!--.*?-->, \{\{.*?\}\}
Do you need to capture the maximum possible span?
Use a greedy quantifier (the default): .+, \d+, \w+
Do you need to match the minimum while a later pattern element defines the boundary?
Use a lazy quantifier: .+?boundary, \d+?[a-z]
Are you unsure? Start with a negated set if possible. It is the safest default because it can never run past the delimiter.
Quick Reference
| Approach | Pattern | Behavior | Backtracking |
|---|---|---|---|
| Greedy (default) | .+ | Match max, shrink if needed | Yes, from right |
| Lazy | .+? | Match min, expand if needed | Yes, from left |
| Negated set | [^x]+ | Match all non-x greedily | No backtracking |

| Greedy | Lazy | Minimum Matches |
|---|---|---|
| + | +? | 1 |
| * | *? | 0 |
| ? | ?? | 0 |
| {n,m} | {n,m}? | n |
| {n,} | {n,}? | n |
Understanding the backtracking process transforms regex from mysterious black-box behavior into predictable, debuggable logic. When your regex returns an unexpected match, trace through the greedy expansion and backtracking steps. Nine times out of ten, the issue is a greedy quantifier consuming too much. Switch to lazy or, better yet, use a negated character set to eliminate the problem entirely.