feat: implement persistent layered index and chunked binary format

Introduce the `obilayeredmap` specification and persistent MPHF-based index architecture for incremental multi-dataset indexing. Implement chunked binary serialization with a fixed `u8` k-mer count limit (256) and overlapping super-kmer segments. Add memory-mapped I/O and a companion `.idx` index file for allocation-free, O(1) unitig access. Update MkDocs navigation, enhance the k-mer comparison script, and add comprehensive tests for serialization, partitioning, and file I/O pipelines.
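
To illustrate the companion-index idea, here is a minimal sketch of O(1), allocation-free record access through an offset table. It is not the commit's actual `.idx` layout: `build_index`, `get_record`, and the `u64` offset table are hypothetical names and choices made for this example. In the real pipeline the data slice would presumably come from a memory-mapped file rather than a `Vec`.

```rust
// Hypothetical sketch: names and layout are illustrative assumptions,
// not the on-disk format introduced by this commit.

/// Concatenate variable-length records into one blob and build an offset table.
/// The table has one more entry than there are records, so record `i`
/// spans `offsets[i]..offsets[i + 1]`.
fn build_index(records: &[&[u8]]) -> (Vec<u8>, Vec<u64>) {
    let mut data = Vec::new();
    let mut offsets = Vec::with_capacity(records.len() + 1);
    offsets.push(0u64);
    for rec in records {
        data.extend_from_slice(rec);
        offsets.push(data.len() as u64);
    }
    (data, offsets)
}

/// Borrow record `i` from the blob with two table lookups and no allocation.
/// Returns `None` when `i` is out of range.
fn get_record<'a>(data: &'a [u8], offsets: &[u64], i: usize) -> Option<&'a [u8]> {
    let start = *offsets.get(i)? as usize;
    let end = *offsets.get(i.checked_add(1)?)? as usize;
    data.get(start..end)
}

fn main() {
    // Stand-ins for unitig payloads; in practice `data` could be a mapped file.
    let records: [&[u8]; 3] = [b"ACGT", b"GG", b"TTACA"];
    let (data, offsets) = build_index(&records);
    assert_eq!(get_record(&data, &offsets, 1), Some(&b"GG"[..]));
    assert_eq!(get_record(&data, &offsets, 3), None);
}
```

Because `get_record` only returns a borrowed slice, the same function would work unchanged over any `&[u8]`, including a view of a mapped file.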
Eric Coissac
2026-05-09 17:20:08 +08:00
parent 8c17bf958b
commit 5169f65dc9
24 changed files with 1342 additions and 382 deletions
+5
@@ -1615,7 +1615,10 @@ version = "0.1.0"
dependencies = [
"memmap2",
"niffler 3.0.0",
"obikrope",
"obikseq",
"obiread",
"obiskbuilder",
"obiskio",
"ph",
"rayon",
@@ -1623,6 +1626,7 @@ dependencies = [
"serde",
"serde_json",
"sysinfo 0.33.1",
"tempfile",
"tracing",
]
@@ -1679,6 +1683,7 @@ name = "obiskio"
version = "0.1.0"
dependencies = [
"lru",
"memmap2",
"niffler 3.0.0",
"obikseq",
"rustix",