Dangerous regex pattern detection #45

Closed
opened 2026-02-15 04:14:07 +03:00 by NiXTheDev · 0 comments
NiXTheDev commented 2026-02-15 04:14:07 +03:00 (Migrated from github.com)

Description

Detect and warn users about potentially dangerous regex patterns that could cause catastrophic backtracking or performance issues.

Background

Certain regex patterns can take exponential time to match in backtracking regex engines, which opens the door to ReDoS (Regular Expression Denial of Service). Examples:

  • (a+)+ - nested quantifiers
  • (a*)* - star inside star
  • (a|aa)+ - ambiguous alternation with quantifier
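
To make the "exponential" claim concrete, here is a small sketch that counts how many ways a pattern like (a+)+ can carve up a run of n identical characters. The function name `countSplits` is made up for illustration and is not part of the planned module. When the overall match fails (for example because the trailing b is missing), a backtracking engine may end up trying each of these splits before giving up.

```typescript
// Counts the ways a run of n identical characters can be split into
// one or more non-empty pieces. Each split is one way (a+)+ can consume the run.
// Hypothetical helper, for illustration only.
function countSplits(n: number): number {
  if (n <= 0) return 1; // the empty run has exactly one (empty) split
  let current = 1;      // number of splits for the run length handled so far
  let prefixSum = 1;    // sum of split counts for all shorter runs (starts with length 0)
  for (let k = 1; k <= n; k++) {
    // The last piece can have any length 1..k, so the count for length k
    // is the sum of the counts for every shorter length.
    current = prefixSum;
    prefixSum += current;
  }
  return current;
}

for (const n of [5, 10, 20, 30]) {
  console.log(`${n} chars -> ${countSplits(n)} possible splits`);
}
// 5 chars -> 16 possible splits
// 10 chars -> 512 possible splits
// 20 chars -> 524288 possible splits
// 30 chars -> 536870912 possible splits
```

The count doubles with every extra character (2^(n-1)), which is why a few dozen characters of input can be enough to stall a match.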

Implementation Plan

  1. Pattern analysis module

    • Create detectDangerousPattern(pattern) function
    • Detect known dangerous patterns
    • Calculate complexity score
  2. Detection rules

    • Nested quantifiers: (+)+, (*)+, (+)*, etc.
    • Quantified groups containing alternation: (a|b)+
    • Multiple wildcards: .*.*
    • Excessive backtracking potential
  3. Warning system

    • Before executing substitution, analyze pattern
    • If dangerous, show warning with explanation
    • Allow user to proceed or cancel
    • Include performance estimation
  4. Warning message format

    Warning: This pattern may cause performance issues
    
    Pattern: (a+)+b
    Issue: Nested quantifiers can cause exponential execution time
    
    Consider: Using atomic groups or possessive quantifiers
    Proceed anyway? (yes/no)
    
  5. Integration points

    • Hook into sed command processing
    • Check before sending to worker pool
    • Skip warning for simple patterns
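
A minimal sketch of what the detection step from the plan could look like. Only the name `detectDangerousPattern` comes from this issue; the `Finding` type, the rule identifiers, and the `isUnboundedQuantifier` helper are assumptions made for illustration. It is a single-pass heuristic over the pattern text, not a real regex parser, and it deliberately over-reports: (a|b)+ gets flagged by the alternation rule even though it is harmless in practice.

```typescript
// Hypothetical result shape; not defined anywhere in the issue.
type Finding = { rule: string; issue: string };

// True when the text at position i starts an unbounded quantifier: *, + or {n,}.
function isUnboundedQuantifier(pattern: string, i: number): boolean {
  const ch = pattern.charAt(i);
  if (ch === "*" || ch === "+") return true;
  return /^\{\d+,\}/.test(pattern.slice(i));
}

function detectDangerousPattern(pattern: string): Finding[] {
  const findings: Finding[] = [];
  // One entry per currently open group: what have we seen inside it so far?
  const stack: { quantified: boolean; alternation: boolean }[] = [];
  let wildcards = 0;
  let i = 0;
  while (i < pattern.length) {
    const ch = pattern.charAt(i);
    if (ch === "\\") { i += 2; continue; } // skip escaped character
    if (ch === "[") {
      // Skip character classes; quantifier characters are literals inside them.
      i++;
      while (i < pattern.length && pattern.charAt(i) !== "]") {
        i += pattern.charAt(i) === "\\" ? 2 : 1;
      }
      i++;
      continue;
    }
    if (ch === "(") {
      stack.push({ quantified: false, alternation: false });
    } else if (ch === "|") {
      const top = stack[stack.length - 1];
      if (top) top.alternation = true;
    } else if (ch === ")") {
      const group = stack.pop();
      if (group && isUnboundedQuantifier(pattern, i + 1)) {
        if (group.quantified) {
          findings.push({ rule: "nested-quantifier",
            issue: "Nested quantifiers can cause exponential execution time" });
        }
        if (group.alternation) {
          findings.push({ rule: "quantified-alternation",
            issue: "Quantified alternation may match the same input in several ways" });
        }
      }
    } else if (isUnboundedQuantifier(pattern, i)) {
      // Every group that is still open now contains a repetition.
      for (const g of stack) g.quantified = true;
    } else if (ch === "." && isUnboundedQuantifier(pattern, i + 1)) {
      wildcards++;
    }
    i++;
  }
  if (wildcards >= 2) {
    findings.push({ rule: "multiple-wildcards",
      issue: "Multiple unbounded wildcards can cause heavy backtracking" });
  }
  return findings;
}

console.log(detectDangerousPattern("(a+)+b").map((f) => f.rule));
console.log(detectDangerousPattern("\\d{3}-\\d{2}-\\d{4}").map((f) => f.rule));
```

An empty result would mean "skip the warning"; a complexity score, if wanted, could be derived from the number and kind of findings.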

Acceptance Criteria

  • Detects common dangerous patterns
  • Shows clear warning message
  • Explains why pattern is problematic
  • Suggests alternatives when possible
  • Does not block execution (warn only)
  • Tests for detection accuracy

Example Scenarios

User: s/(a+)+b/replacement/
Bot: Warning about nested quantifiers

User: s/\d{3}-\d{2}-\d{4}/replacement/
Bot: No warning (safe pattern)
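
For the scenarios above, the bot first has to pull the regex out of the sed-style command and, if it gets flagged, render the warning in the format given in the plan. The sketch below shows one way to do both. `extractSedPattern` and `formatWarning` are hypothetical names, and only `/` is handled as a delimiter, which is an assumption about how the bot parses commands.

```typescript
// Extracts the search pattern from a sed-style substitution such as
// s/find/replace/flags. Returns null when the input is not a substitution.
// Hypothetical helper; only "/" is supported as the delimiter here.
function extractSedPattern(command: string): string | null {
  if (command.charAt(0) !== "s" || command.charAt(1) !== "/") return null;
  let pattern = "";
  for (let i = 2; i < command.length; i++) {
    const ch = command.charAt(i);
    if (ch === "\\" && i + 1 < command.length) {
      pattern += ch + command.charAt(i + 1); // keep escape sequences intact
      i++;
    } else if (ch === "/") {
      return pattern; // the first unescaped delimiter ends the pattern
    } else {
      pattern += ch;
    }
  }
  return null; // no closing delimiter found
}

// Renders a warning in the message format proposed in the implementation plan.
function formatWarning(pattern: string, issue: string, suggestion: string): string {
  return [
    "Warning: This pattern may cause performance issues",
    "",
    `Pattern: ${pattern}`,
    `Issue: ${issue}`,
    "",
    `Consider: ${suggestion}`,
    "Proceed anyway? (yes/no)",
  ].join("\n");
}

const pattern = extractSedPattern("s/(a+)+b/replacement/");
if (pattern !== null) {
  console.log(formatWarning(
    pattern,
    "Nested quantifiers can cause exponential execution time",
    "Using atomic groups or possessive quantifiers",
  ));
}
```

Whether the final prompt line stays is a design question, since the acceptance criteria call for warn-only behavior.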

Related

Part of Epic #38
