feat: optimize unitig index and document evidence elimination

Replace the dense per-chunk offset index with a sparse block-sampled structure (64 chunks per block). This reduces the index file size by approximately 300× while preserving O(1) k-mer extraction.

Add a design document on eliminating the `evidence.bin` file, which accounts for ~66% of the lookup layer. The proposal replaces it with fingerprint-based approximate indexing and value-based MPHF lookups.

Add the new document to the MkDocs navigation, and add a file count tracker to the scatter step progress bar for better observability.
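To illustrate the block-sampling idea, here is a minimal sketch, not the code from this commit. It assumes that each chunk's byte length can be recovered cheaply (here from a plain slice) and that one absolute offset is stored per block of 64 chunks. The names `SparseIndex`, `build`, and `offset` are hypothetical.

```rust
/// Number of chunks covered by one sampled offset (64, as in the commit message).
const BLOCK: usize = 64;

/// Hypothetical sparse index: one absolute byte offset per block of chunks,
/// instead of one offset per chunk.
struct SparseIndex {
    /// Absolute byte offset of the first chunk of each block.
    block_offsets: Vec<u64>,
}

impl SparseIndex {
    /// Build the samples from per-chunk byte lengths (assumed available).
    fn build(chunk_lens: &[u8]) -> Self {
        let n_blocks = (chunk_lens.len() + BLOCK - 1) / BLOCK;
        let mut block_offsets = Vec::with_capacity(n_blocks);
        let mut offset = 0u64;
        for (i, &len) in chunk_lens.iter().enumerate() {
            if i % BLOCK == 0 {
                // Record the running offset at each block boundary.
                block_offsets.push(offset);
            }
            offset += u64::from(len);
        }
        SparseIndex { block_offsets }
    }

    /// Byte offset of chunk `i`: one sample lookup plus at most
    /// BLOCK - 1 length additions, so the cost is bounded by a constant.
    fn offset(&self, chunk_lens: &[u8], i: usize) -> u64 {
        let block = i / BLOCK;
        let start = block * BLOCK;
        let within: u64 = chunk_lens[start..i].iter().map(|&l| u64::from(l)).sum();
        self.block_offsets[block] + within
    }
}
```

In this sketch a lookup never scans more than 63 chunk lengths, which is what keeps extraction constant-time while storing 64 times fewer offsets. The roughly 300× reduction quoted above depends on encoding details of the real index that this sketch does not model.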
Eric Coissac
2026-05-23 07:51:59 +02:00
parent 9b700ff4a4
commit 4a5ab0b8c2
5 changed files with 488 additions and 151 deletions
@@ -44,6 +44,7 @@ nav:
- On-disk storage: implementation/storage.md
- MPHF selection: implementation/mphf.md
- Unitig evidence encoding: implementation/unitig_evidence.md
- Evidence elimination (discussion): implementation/evidence_elimination.md
- obilayeredmap crate: implementation/obilayeredmap.md
- PersistentCompactIntVec: implementation/persistent_compact_int_vec.md
- PersistentBitVec: implementation/persistent_bit_vec.md