f36b095ce2
Formalize the two-phase MPHF indexing architecture and update Phase 6 to use `evidence.bin` for direct kmer extraction. Simplify the evidence and unitig storage layouts to flat packed formats enabling O(1) random access. Introduce aggregation traits (`ColumnWeights`, `CountPartials`, `BitPartials`) to support additive distance metric decomposition across partitions. Narrow the documented scope from metagenomic to individual genome datasets, and replace speculative open questions with concrete implementation specifications.
378 B
On-disk collection structure
See the `obilayeredmap` crate for the current on-disk layout.
The index root contains one `part_XXXXX/` directory per partition, and each partition directory holds one or more `layer_N/` directories. Each layer directory contains `mphf.bin`, `unitigs.bin`, `unitigs.bin.idx`, and `evidence.bin`, plus an optional `counts/` or `presence/` payload directory.
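The layout above can be sketched as a pair of path helpers. This is a minimal illustration, not the `obilayeredmap` API: `layer_dir`, `layer_files`, `LAYER_FILES`, and `PAYLOAD_DIRS` are hypothetical names, and the sketch assumes that `XXXXX` is a zero-padded five-digit decimal partition number and that `N` is an unpadded layer number, neither of which is confirmed here.

```rust
use std::path::{Path, PathBuf};

/// File names listed for every layer directory in the layout description.
pub const LAYER_FILES: [&str; 4] =
    ["mphf.bin", "unitigs.bin", "unitigs.bin.idx", "evidence.bin"];

/// Optional payload directories; a layer may carry one of them or none.
pub const PAYLOAD_DIRS: [&str; 2] = ["counts", "presence"];

/// Hypothetical helper: directory of layer `layer` inside partition `part`.
/// ASSUMPTION: `part_XXXXX` is a zero-padded five-digit decimal number and
/// `layer_N` is unpadded; check the crate before relying on either.
pub fn layer_dir(root: &Path, part: u32, layer: u32) -> PathBuf {
    root.join(format!("part_{:05}", part))
        .join(format!("layer_{}", layer))
}

/// Hypothetical helper: paths of the four files expected in a layer.
pub fn layer_files(root: &Path, part: u32, layer: u32) -> Vec<PathBuf> {
    let dir = layer_dir(root, part, layer);
    LAYER_FILES.iter().map(|name| dir.join(name)).collect()
}
```

Under these assumptions, `layer_dir(Path::new("index"), 3, 0)` resolves to `index/part_00003/layer_0`, and `layer_files` returns the four file paths beneath it. The payload directory is kept separate because its presence depends on the index type.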