`obilayeredmap` implements a persistent, incrementally extensible kmer index. Each layer covers a disjoint kmer set and wraps a `ptr_hash` MPHF with associated per-slot data. Adding a new dataset never rebuilds existing layers.
Each layer carries one of two evidence bundles, recorded in `layer_meta.json` at build time:
```rust
pubenumEvidenceKind{
Exact,
Approx{b: u8,z: u8},
}
```
`EvidenceKind` is stored in `LayerMeta` (one per layer directory). `open()` reads it to decide which evidence files to load.
- **Exact**: writes `evidence.bin` + `unitigs.bin.idx`. Zero false positives. Requires random-access `.idx` at query time.
- **Approx**: writes `fingerprint.bin` only. False-positive rate per kmer query = 1/2^b. `z` is the Findere consecutive-kmer parameter: `z` consecutive kmers must all match, reducing the effective FP rate per read to approximately W / 2^(b·z) where W = L − k − z + 2 is the number of windows in a read of length L. No `.idx` written or required.
`open()` requires `unitigs.bin.idx` (random access into unitigs). `open_sequential()` on `UnitigFileReader` does not require the `.idx` and is used during build passes.
`MphfLayer::build` runs two sequential passes over `unitigs.bin`:
1.**Pass 1** (parallel via rayon): iterate all canonical kmers, construct and store `mphf.bin`. `new_from_par_iter` avoids materialising a full key `Vec`.
2.**Pass 2** (sequential): iterate again, fill `evidence.bin`, call `fill_slot(slot, kmer)` once per kmer for payload population. A compact `n/8`-byte seen-bitset verifies MPHF injectivity inline.
`build` always produces exact evidence. For approximate evidence, use `build_approx_evidence` after MPHF construction.
For empty layers (n = 0), all build variants return `Ok(0)` immediately after creating empty output files.
All build impls delegate MPHF + evidence construction to `MphfLayer::build` via a mode-specific `fill_slot` callback. Modes 2 and 3 pre-read `n_kmers` from `unitigs.bin` via `UnitigFileReader::open_sequential` to size the matrix builder before calling `MphfLayer::build`.
`matches(slot, hash)` extracts the b-bit fingerprint stored at `slot` and compares it to the low b bits of `hash`. It is the core operation of `find_approx`.
---
## LayeredMap\<D\> — collection of layers
`LayeredMap<D>` wraps `Vec<Layer<D>>` for a single partition directory.
```rust
pubstructLayeredMap<D: LayerData=()>{
root: PathBuf,
meta: PartitionMeta,
layers: Vec<Layer<D>>,
}
```
`PartitionMeta` (`meta.json` at the partition root) stores `n_layers`.
`LayeredStore<S>` is a generic aggregation wrapper over `Vec<S>`. It propagates three traits from `obicompactvec::traits` up the hierarchy via blanket impls:
```rust
pubstructLayeredStore<S>(pubVec<S>);
impl<S: ColumnWeights>ColumnWeightsforLayeredStore<S>{…}// Σ col_weights across inner stores
Because blanket impls compose, `LayeredStore<LayeredStore<S>>` automatically inherits all three traits when `S` does — providing the partitioned level without a separate type.
`unitigs.bin.idx` is required by `open()` (random access). `open_sequential()` on `UnitigFileReader` omits it and is used during build passes and approx-evidence construction.
`chunk_id = raw >> 7`, `rank = raw & 0x7F`. Reconstructing the kmer: read k nucleotides at position `rank` within unitig `chunk_id` (requires `unitigs.bin.idx` for random access).
`Xx64` is chosen over `FxHash` because canonical kmer raw values are left-aligned u64 with structural zeros in the low bits (42 zeros for k=11, 2 zeros for k=31), which single-multiply hashes distribute poorly.
Both delegate to the corresponding `PersistentBitMatrix::append_column` / `PersistentCompactIntMatrix::append_column`. They write a new column file (`col_NNNNNN.pbiv` / `col_NNNNNN.pciv`) and update `meta.json` to increment `n_cols`. `value_of` is called once per slot (0..n).
Called on the first merge of a Presence-mode index. Creates `presence/` with `meta.json {"n": n_kmers, "n_cols": 1}` and `col_000000.pbiv` set entirely to `true`. This retroactively records genome 0 (the original source) as present in every slot, satisfying the column-count invariant before any new-source column is appended.
The MPHF, evidence, and unitigs are built once from the kmer set of a layer and are immutable for the lifetime of that layer. Adding a genome column does not change the kmer set — it only appends a new data column indexed by the same slot numbers. The only disk writes are one new `.pciv`/`.pbiv` file and a single `meta.json` update.