Implement Memoization Cache in NFA Simulator to Prevent Exponential Blowup #4
Labels
No labels
Epic
GHA
Release
bug
dependencies
documentation
duplicate
enhancement
good first issue
help wanted
invalid
major
question
rust
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
NiXTheDev/Ogex#4
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Goal: Improve worst-case performance for patterns with heavy backtracking (e.g., nested quantifiers and backreferences).
The current NFA simulation in
engine.rsuses a straightforward backtracking approach for backreferences. This can lead to exponential runtime for certain patterns (ReDoS vulnerability).Proposed Solution:
Add a memoization cache (often called "thread factoring") to the
NfaSimulator. The cache stores results for(state_id, position)pairs, indicating whether that combination can lead to a match. When the same state and position are encountered again via a different path, the cached result can be reused, avoiding redundant work.Design Considerations:
HashMap<(StateId, usize), Option<...>>where the value indicates match success and possibly the best capture groups.Implementation Steps:
cache: HashMap<(StateId, usize), Option<MatchResult>>toNfaSimulator.run()andstep_with_backrefs, before processing a state, check the cache.Testing:
(a*)*to ensure runtime remains reasonable.