obikmer/docmd/implementation/storage.md
Eric Coissac f36b095ce2 docs: clarify MPHF indexing, storage layout, and distance traits
Formalize the two-phase MPHF indexing architecture and update Phase 6 to use `evidence.bin` for direct kmer extraction. Simplify the evidence and unitig storage layouts to flat packed formats enabling O(1) random access. Introduce aggregation traits (`ColumnWeights`, `CountPartials`, `BitPartials`) to support additive distance metric decomposition across partitions. Narrow the documented scope from metagenomic to individual genome datasets, and replace speculative open questions with concrete implementation specifications.
2026-05-17 15:59:10 +08:00

On-disk collection structure

See the obilayeredmap crate for the current on-disk layout.

The index root contains one part_XXXXX/ directory per partition, each holding one or more layer_N/ directories. Each layer directory contains mphf.bin, unitigs.bin, unitigs.bin.idx, evidence.bin, and optionally a counts/ or presence/ payload directory.
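The layout described above can be sketched as a pair of path helpers. This is only an illustration, not the obilayeredmap API: the names `layer_dir` and `layer_files` are hypothetical, and the sketch assumes that `XXXXX` is a zero-padded five-digit partition number and that `N` is an unpadded layer number. The optional counts/ and presence/ payload directories are left out.

```rust
use std::path::{Path, PathBuf};

/// File names that, per the description above, every layer directory contains.
const LAYER_FILES: [&str; 4] = [
    "mphf.bin",
    "unitigs.bin",
    "unitigs.bin.idx",
    "evidence.bin",
];

/// Hypothetical helper: path of one layer directory under the index root.
/// Assumption: partitions are zero-padded to five digits, layers are unpadded.
fn layer_dir(root: &Path, partition: u32, layer: u32) -> PathBuf {
    root.join(format!("part_{:05}", partition))
        .join(format!("layer_{}", layer))
}

/// Hypothetical helper: paths of the required files of one layer.
fn layer_files(root: &Path, partition: u32, layer: u32) -> Vec<PathBuf> {
    let dir = layer_dir(root, partition, layer);
    LAYER_FILES.iter().map(|name| dir.join(name)).collect()
}
```

Under these assumptions, `layer_dir(Path::new("index"), 3, 0)` yields `index/part_00003/layer_0`, and `layer_files` returns the four file paths inside that directory.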