Replace HashMap with Dense Vector for Capture Group Storage #5
Reference
NiXTheDev/Ogex#5
Goal: Reduce memory usage and improve performance by optimizing how capture groups are stored in `SimState`.

Currently, each `SimState` contains a `HashMap<u32, (usize, usize)>` for group captures. For patterns with many groups and many simulation states, this is memory-intensive and slows down cloning.

Proposed Alternatives:
- `Vec<Option<(usize, usize)>>` where index = group ID (1-based). This is cache-friendly and avoids hashing overhead. Group 0 is unused, so allocate length = `max_group_id + 1`.

Implementation Plan (Dense Vector):
1. In `nfa.rs`, compute the maximum group ID during NFA construction (including numbered and named groups).
2. In `engine.rs`, change `SimState.groups` to `Vec<Option<(usize, usize)>>`.
3. Update `GroupStart` and `GroupEnd` transitions to modify the vector at the appropriate index.
4. Adjust `Match` finalization accordingly.

Benchmark:
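As a starting point for the benchmark, the dense-vector layout from the implementation plan could look roughly like the sketch below. It is a sketch under stated assumptions, not the actual Ogex code. The `Transition` enum, the `apply` and `group` methods, and recording a group start as `(pos, pos)` before the end is known are all hypothetical choices. Only `SimState`, its `groups` field, `GroupStart`, `GroupEnd`, and `max_group_id` come from the issue text.

```rust
/// A capture span as (start, end) offsets into the input.
type Span = (usize, usize);

/// Hypothetical transition kinds; the real definitions in nfa.rs may differ.
#[derive(Debug, Clone, Copy)]
enum Transition {
    GroupStart(u32),
    GroupEnd(u32),
}

/// Simulation state with dense capture storage.
#[derive(Debug, Clone)]
struct SimState {
    /// Index = group ID (1-based); slot 0 is unused.
    groups: Vec<Option<Span>>,
}

impl SimState {
    /// Allocates `max_group_id + 1` slots so group IDs index directly.
    fn new(max_group_id: u32) -> Self {
        SimState {
            groups: vec![None; max_group_id as usize + 1],
        }
    }

    /// Applies a group transition at input offset `pos`.
    /// Assumption: a start is stored as (pos, pos) and the end is
    /// patched in when the matching GroupEnd is seen.
    fn apply(&mut self, t: Transition, pos: usize) {
        match t {
            Transition::GroupStart(id) => {
                self.groups[id as usize] = Some((pos, pos));
            }
            Transition::GroupEnd(id) => {
                if let Some(span) = self.groups[id as usize].as_mut() {
                    span.1 = pos;
                }
            }
        }
    }

    /// Returns the span for `id`, or None if unset or out of range.
    fn group(&self, id: u32) -> Option<Span> {
        self.groups.get(id as usize).copied().flatten()
    }
}

fn main() {
    // Pattern with two groups, so max_group_id = 2 and three slots.
    let mut state = SimState::new(2);
    state.apply(Transition::GroupStart(1), 2);
    state.apply(Transition::GroupEnd(1), 5);

    // Cloning copies one contiguous buffer; no rehashing is involved.
    let forked = state.clone();
    assert_eq!(forked.group(1), Some((2, 5)));
    assert_eq!(forked.group(2), None);
    println!("group 1 = {:?}", forked.group(1));
}
```

Because the vector has a fixed length per pattern, every `clone` costs the same regardless of how many groups have matched so far. For patterns with very sparse, high group IDs that may be more than the `HashMap` pays, which is worth checking in the benchmark.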