Regular Expression Denial of Service (ReDoS)

Regular Expression Denial of Service (ReDoS) is a type of denial of service attack that exploits the way a regular expression engine processes certain patterns of input. The goal is to cause the engine to consume excessive amounts of CPU time, thereby degrading the performance of the application or making it unavailable.

How ReDoS Works

In JavaScript and Node.js, regular expressions are commonly used for string matching and manipulation, both in application code and in packages from the npm ecosystem. A ReDoS vulnerability arises when an attacker provides specially crafted input that drives the regular expression engine into catastrophic backtracking.

Here’s a breakdown of the key components:

1. Regular Expressions:

Regular expressions are patterns used to match character combinations in strings. They can be simple, like matching all occurrences of the letter “a”, or complex, like matching a valid email address.

2. Backtracking:

When a regular expression engine attempts to match a string against a pattern, it may need to try multiple possibilities to find a match. If one attempt fails, the engine will “backtrack” and try a different possibility. Normally, this process is fast, but for certain patterns and inputs, the amount of backtracking required can grow exponentially with the length of the input.

3. Catastrophic Backtracking:

This occurs when a pattern allows the same text to be matched in many different ways and the input is crafted to exploit that ambiguity. For example, consider the regular expression /^(a+)+$/ and the input aaaaaaaaaaaaaaaaaaaaa!. Because of the trailing !, no match is possible, but a backtracking engine only discovers this after trying every way of dividing the run of a characters between the inner a+ and the repetitions of the outer group. Each additional a roughly doubles the number of attempts.
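To get a feel for the numbers, the sketch below counts how many ways a run of n identical characters can be divided into consecutive non-empty groups, which is the set of alternatives a pattern like (a+)+ offers for the same text. It is a simplified model of the search space, not a trace of what any particular engine does, and countSplits is a hypothetical helper introduced only for this illustration.

```javascript
// Count the ways to split a run of n identical characters into
// consecutive non-empty groups (a rough model of the choices (a+)+ has).
function countSplits(n) {
    // ways[len] = number of ways to split a run of length len
    const ways = new Array(n + 1).fill(0);
    ways[0] = 1; // an empty remainder can be "split" in exactly one way
    for (let len = 1; len <= n; len++) {
        // Choose the size of the first group, then split what is left.
        for (let first = 1; first <= len; first++) {
            ways[len] += ways[len - first];
        }
    }
    return ways[n];
}

for (const n of [5, 10, 20, 40]) {
    console.log(`n=${n}: ${countSplits(n)} ways`);
}
// n=5: 16 ways
// n=10: 512 ways
// n=20: 524288 ways
// n=40: 549755813888 ways
```

The count is 2^(n-1), so every extra character doubles the number of alternatives a failing match may have to rule out.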

4. Impact:

An attacker can exploit this by sending inputs that cause the regular expression to take a long time to evaluate. In a web application, this can tie up the server’s resources, leading to a denial of service. Because Node.js runs JavaScript on a single thread, one slow match blocks every other request handled by that process. Since regular expressions are often used for validation, filtering, and parsing, they are commonly found in web applications, making them a target for ReDoS attacks.

Example

Here’s a simple Node.js example that creates a vulnerable regular expression:

function vulnerableRegex(input) {
    const regex = /^([a-zA-Z]+)+$/;
    return regex.test(input);
}

const safeInput = "aaaaaaaaaaaaaaaa";
const maliciousInput = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!";

console.log("Testing safe input...");
console.time("Safe Input Time");
console.log(vulnerableRegex(safeInput)); // Should return true
console.timeEnd("Safe Input Time");

console.log("Testing malicious input...");
console.time("Malicious Input Time");
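// Returns false, but only after exponential backtracking. With an input this
// long the call may run for a very long time; shorten maliciousInput to see it finish.
console.log(vulnerableRegex(maliciousInput));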
console.timeEnd("Malicious Input Time");

Explanation

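Regex Structure: The regular expression /^([a-zA-Z]+)+$/ is vulnerable because of its nested quantifiers: the + inside the group and the + applied to the group itself. When given an input that almost matches but fails (for example, letters followed by a special character at the end), the engine backtracks through every way of splitting the letters between the two quantifiers.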
Timing: Running this code should show a significant difference in time taken to evaluate the safeInput versus the maliciousInput.
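To see the growth without waiting on a long input, the following sketch times the same pattern against failing inputs of increasing length. The helper name measure and the chosen lengths are assumptions made for this illustration; the lengths are deliberately small so the loop finishes quickly, and the absolute timings will vary by machine and Node.js version.

```javascript
// Same vulnerable pattern as in the example above.
const regex = /^([a-zA-Z]+)+$/;

// Time one failing match: n letters followed by "!" can never match.
function measure(n) {
    const input = "a".repeat(n) + "!";
    const start = process.hrtime.bigint();
    const matched = regex.test(input);
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    return { n, matched, elapsedMs };
}

// Step the length up by two characters at a time and print each timing.
for (let n = 14; n <= 22; n += 2) {
    const { matched, elapsedMs } = measure(n);
    console.log(`n=${n} matched=${matched} time=${elapsedMs.toFixed(2)}ms`);
}
```

If the growth is exponential, each step of two characters should cost roughly four times as much as the previous one once the inputs are long enough for the timings to rise above measurement noise.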

Example with Runtime Input

const readline = require('readline');

// Function to test the regular expression
function vulnerableRegex(input) {
    const regex = /^([a-zA-Z]+)+$/;
    return regex.test(input);
}

// Set up the readline interface for user input
const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout
});

// Prompt the user for input
rl.question("Enter a string to test for ReDoS: ", (userInput) => {
    console.time("Input Processing Time");
    const result = vulnerableRegex(userInput);
    console.timeEnd("Input Processing Time");

    console.log("Result:", result);
    
    rl.close();
});

Run the code above with Node.js.

Given a safe input, the script reports a normal runtime.

Now run the same code with a malicious input and note how long it takes to respond.

Explanation

Readline Module: The readline module allows you to capture input from the user via the command line.
vulnerableRegex Function: The same regular expression is used to test the user input.
Timing: The script will output the time taken to process the user’s input.

Expected Behavior

Safe Input: The script should return true and process quickly.
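Malicious Input: For letters followed by a non-letter character such as !, the script returns false, but only after a delay that grows rapidly with the length of the input, demonstrating the ReDoS effect.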

This script lets you explore the impact of ReDoS by entering different strings, which makes it useful for interactive demonstrations.

Preventing ReDoS

To prevent ReDoS attacks:

Use Safe Regular Expressions:
- Avoid complex nested quantifiers like (a+)+. Simpler and more specific regular expressions are less likely to suffer from catastrophic backtracking.
Limit Input Length:
- Validate input length before applying a regular expression. This caps the amount of backtracking an attacker can trigger.
Use Timeouts:
- Set a maximum time for regular expression evaluation and abort the match if it takes too long. JavaScript’s RegExp has no built-in timeout, so in Node.js this typically means running the match somewhere it can be interrupted, such as a worker thread.
Regular Expression Libraries:
- Use libraries that are designed to prevent ReDoS, such as engines in the style of RE2 that employ finite automata instead of backtracking.
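As a minimal sketch of the first two measures, the example below replaces the nested pattern from the earlier example with an equivalent flat one and rejects oversized input before the pattern runs. The names MAX_INPUT_LENGTH, LETTERS_ONLY, and isAlphabetic are illustrative, and the limit of 256 is an assumed value to adjust to what the field actually needs.

```javascript
// Assumed upper bound for this field; choose a value that fits your data.
const MAX_INPUT_LENGTH = 256;

// Accepts the same strings as /^([a-zA-Z]+)+$/ but has a single quantifier,
// so there is only one way to consume a run of letters.
const LETTERS_ONLY = /^[a-zA-Z]+$/;

function isAlphabetic(input) {
    // Reject non-strings and oversized input before the regex engine sees them.
    if (typeof input !== "string" || input.length > MAX_INPUT_LENGTH) {
        return false;
    }
    return LETTERS_ONLY.test(input);
}

console.log(isAlphabetic("aaaaaaaaaaaaaaaa"));   // true
console.log(isAlphabetic("a".repeat(40) + "!")); // false, without catastrophic backtracking
console.log(isAlphabetic("a".repeat(100000)));   // false, over the length limit
```

The length check alone does not make a vulnerable pattern safe, because exponential growth can become expensive well below a few hundred characters, so it works best combined with the rewritten pattern.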

By being aware of how regular expressions can be exploited and taking steps to write safer patterns, you can protect your Node.js applications from ReDoS attacks.
