Replace Linear Scan of Character Class Items with O(1) Lookup #11
Labels
No labels
Epic
GHA
Release
bug
dependencies
documentation
duplicate
enhancement
good first issue
help wanted
invalid
major
question
rust
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
NiXTheDev/Ogex#11
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Goal: Speed up matching of character classes by precomputing a membership table.
Currently,
try_transitioninengine.rsiterates through all items in a character class to check if a character matches. For large classes (e.g.,[a-zA-Z0-9_]), this is inefficient.Proposed Solution:
During NFA construction (or even earlier), convert each character class into a bitset (for ASCII) or a boolean lookup table using a
[bool; 128]for ASCII, and for Unicode, use aRangeSetor a perfect hash. For simplicity, start with ASCII optimization.Implementation Plan:
nfa.rs, when processingExpr::CharacterClass, precompute aCharClassSetstruct: