A single regex pattern brought down a production API. CPU hit 100% and stayed there. The fix came down to removing one redundant quantifier. This is the story of catastrophic backtracking and why you should never trust a regex you didn't test under load.
One user submitted a form with a weird email address. Our server CPU hit 100% and stayed there for 6 minutes.
The trigger
A user entered this in the email field:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@
That's 30 a characters followed by @ with no domain. Obviously not a valid email. But our regex validator didn't just reject it: it hung.
The regex
const emailRegex = /^([a-zA-Z0-9._-]+)+@([a-zA-Z0-9.-]+)\.[a-zA-Z]{2,}$/;
Spot the problem? It's the ([a-zA-Z0-9._-]+)+ part: a quantified group containing a quantifier. This creates catastrophic backtracking.
What is catastrophic backtracking?
When the regex engine tries to match aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@ against ([a-zA-Z0-9._-]+)+, it needs to figure out how to split the run of a characters between the inner + and the outer +.
Because nothing follows the @, the overall match fails, so the engine backtracks and tries the next split. For 30 characters, there are 2^29 (roughly 537 million) possible ways to split them. A backtracking regex engine tries every single one before concluding "this doesn't match."
When crafted or merely unlucky input can trigger this, it is called ReDoS: Regular Expression Denial of Service.
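To see the growth for yourself, the sketch below times the vulnerable pattern against inputs of increasing length. The helper name `timeMatch` and the lengths (16 to 22) are illustrative assumptions; the lengths are kept short so the script finishes quickly. On a backtracking engine such as the one in Node.js, each extra character should roughly double the time, though the exact numbers depend on the machine.

```javascript
// The vulnerable pattern from this post (nested quantifier in the local part).
const vulnerableRegex = /^([a-zA-Z0-9._-]+)+@([a-zA-Z0-9.-]+)\.[a-zA-Z]{2,}$/;

// Hypothetical helper: time one match attempt against n 'a's followed by '@'.
// `performance` is available as a global in current Node.js releases.
function timeMatch(regex, n) {
  const input = 'a'.repeat(n) + '@';
  const start = performance.now();
  const matched = regex.test(input);
  const elapsedMs = performance.now() - start;
  return { n, matched, elapsedMs };
}

// Collect one timing per input length.
const timings = [];
for (let n = 16; n <= 22; n++) {
  timings.push(timeMatch(vulnerableRegex, n));
}

for (const { n, matched, elapsedMs } of timings) {
  console.log(`n=${n} matched=${matched} time=${elapsedMs.toFixed(2)} ms`);
}
```

Extrapolating the doubling from 22 characters to 30 multiplies the cost by about 256, which is how a few milliseconds in a test turns into minutes in production.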
The timeline
11:23 AM: User submits the form. The API endpoint that validates the email starts processing.
11:23 AM: Node.js runs JavaScript on a single thread, so the regex evaluation blocks the event loop. No other requests can be processed.
11:24 AM: Health checks fail. Load balancer marks the instance as unhealthy.
11:25 AM: Auto-scaling spins up a new instance. The queued request from the original user gets retried on the new instance. That instance also hangs.
11:29 AM: We now have 4 hung instances. All from the same user's request being retried.
11:29 AM: We identify the problem, kill the hung processes, and block the request.
The fix
Immediate: Use a safe regex
// Safe: no nested quantifiers
const emailRegex = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
Removing the outer group ()+ eliminates the exponential backtracking. With only one way to split the local part, this regex runs in roughly linear time on inputs like the one that caused the outage.
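As a quick sanity check of that claim, the sketch below runs the fixed pattern against a much longer adversarial input than the one that caused the outage, alongside an ordinary address. The 50,000-character length and the sample address are arbitrary assumptions chosen for illustration.

```javascript
// The fixed pattern from this post: no quantified group around the local part.
const safeEmailRegex = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

// Same shape as the input that caused the outage, but far longer.
const adversarialInput = 'a'.repeat(50000) + '@';

const adversarialResult = safeEmailRegex.test(adversarialInput);
const ordinaryResult = safeEmailRegex.test('user@example.com');

console.log(adversarialResult); // false: nothing follows the '@'
console.log(ordinaryResult);    // true
```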
Better: Don't use regex for email validation
function isValidEmail(email) {
  // Basic structural check: exactly one '@', non-empty parts, a dot in the domain
  const parts = email.split('@');
  if (parts.length !== 2) return false;
  if (parts[0].length === 0 || parts[1].length === 0) return false;
  if (!parts[1].includes('.')) return false;
  return true;
}
Or even better: just send a confirmation email. That's the only real validation.
Best: Add a timeout to regex operations
// Node.js doesn't have built-in regex timeouts, but you can use:
// 1. Input length limits
if (input.length > 254) return false; // RFC 5321 max email length
// 2. The 're2' package (Google's regex engine, no backtracking)
const RE2 = require('re2');
const safeRegex = new RE2(/^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/);
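If swapping engines is not an option, one way to approximate a regex timeout is Node's built-in `vm` module, whose `timeout` option terminates code that runs too long. The sketch below is an illustration under assumptions, not something we ran during the incident: the helper name `testWithTimeout` and the 250 ms budget are made up for this example, and creating a context per call adds overhead you would want to measure.

```javascript
const vm = require('node:vm');

// Hypothetical helper: run regex.test(input) with a hard time budget.
function testWithTimeout(regex, input, timeoutMs) {
  const sandbox = { regex, input, result: false };
  try {
    // The script runs synchronously; vm terminates it once the budget is spent.
    vm.runInNewContext('result = regex.test(input)', sandbox, { timeout: timeoutMs });
    return { timedOut: false, matched: sandbox.result };
  } catch (err) {
    if (err.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT') {
      // Treat a timeout as a rejection of the input.
      return { timedOut: true, matched: false };
    }
    throw err;
  }
}

// Usage with the vulnerable pattern from this post.
const vulnerableRegex = /^([a-zA-Z0-9._-]+)+@([a-zA-Z0-9.-]+)\.[a-zA-Z]{2,}$/;
const hostile = testWithTimeout(vulnerableRegex, 'a'.repeat(40) + '@', 250);
const ordinary = testWithTimeout(vulnerableRegex, 'user@example.com', 250);
console.log(hostile);
console.log(ordinary);
```

A timeout only caps the damage. The hostile request still burns CPU for the full budget, so this complements a safe pattern and a length limit rather than replacing them.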
How to detect vulnerable regexes
# Install the safe-regex package
npm install safe-regex
# Or use the 'recheck' tool
npx recheck "^([a-zA-Z0-9._-]+)+@([a-zA-Z0-9.-]+)\.[a-zA-Z]{2,}$"
# Output: VULNERABLE - exponential backtracking
Common vulnerable patterns
(a+)+ # Nested quantifiers
(a|a)+ # Overlapping alternatives
(a+b?)+ # Optional in quantified group
([a-zA-Z]+)* # Star of plus
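Each of these shapes usually has an unambiguous rewrite that accepts the same strings. The sketch below pairs the four patterns with candidate rewrites (illustrative suggestions, not taken from the incident) and brute-force checks that each pair agrees on every string of up to six characters over a tiny alphabet. The helper names `allStrings` and `agree` are hypothetical, and agreement on short strings is evidence, not proof.

```javascript
// Pairs of [vulnerable, candidate rewrite], anchored to match the whole string.
const rewrites = [
  [/^(a+)+$/, /^a+$/],                // nested quantifiers
  [/^(a|a)+$/, /^a+$/],               // overlapping alternatives
  [/^(a+b?)+$/, /^a(?:a|b(?!b))*$/],  // optional in quantified group
  [/^([a-zA-Z]+)*$/, /^[a-zA-Z]*$/],  // star of plus
];

// Hypothetical helper: every string over `alphabet` with length 0..maxLen.
function allStrings(alphabet, maxLen) {
  let current = [''];
  const all = [''];
  for (let len = 1; len <= maxLen; len++) {
    current = current.flatMap((s) => alphabet.map((ch) => s + ch));
    all.push(...current);
  }
  return all;
}

// Hypothetical helper: true if both patterns accept exactly the same samples.
function agree(original, rewrite, samples) {
  return samples.every((s) => original.test(s) === rewrite.test(s));
}

// '1' is included so the samples contain characters no pattern accepts.
const samples = allStrings(['a', 'b', '1'], 6);
const results = rewrites.map(([original, rewrite]) => agree(original, rewrite, samples));
console.log(results); // [ true, true, true, true ]
```

The samples are kept short on purpose: the vulnerable originals are only safe to run here because six characters is far too few for the blowup to matter.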
The lesson
A single regex in a single endpoint took down our entire API. The fix was removing the redundant outer group and its quantifier: ( and )+.
Rules for production regexes:
- Never nest quantifiers: (x+)+ or (x*)* or (x+)*
- Limit input length before applying regex
- Test regexes with adversarial input, not just valid input
- Consider using RE2 for user-facing input validation
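- Add request timeouts so one bad request can't block the server forever (in Node.js the timeout has to be enforced outside the blocked event loop, for example by the load balancer or a separate thread)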