Cache Epsilon Closures to Reduce Runtime Computation #12
Labels
No labels
Epic
GHA
Release
bug
dependencies
documentation
duplicate
enhancement
good first issue
help wanted
invalid
major
question
rust
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
NiXTheDev/Ogex#12
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Goal: Reduce the overhead of repeatedly computing epsilon closures during matching.
In
engine.rs,epsilon_closureis called frequently (every step). It traverses the NFA graph to find all states reachable via epsilon transitions. This can be expensive for large NFAs.Proposed Solution:
During NFA construction (
nfa.rs), precompute the epsilon closure for each state and store it as aVec<StateId>. Then, at matching time, simply use the precomputed list instead of recomputing.Implementation Steps:
Nfa, add a fieldepsilon_closures: Vec<Vec<StateId>>whereepsilon_closures[state_id]is the list of states reachable via epsilon (including itself?).epsilon_closure, use&self.nfa.epsilon_closures[state_id]to get the closure, then filter/combine as needed (maybe you need to handle group start/end transitions separately? Those are epsilon transitions but have side effects; they must be processed individually, not cached).Alternative: Use the "PikeVM" approach where epsilon transitions are handled on-the-fly with a frontier set; caching may still help.
Benchmark: