Merge pull request 'Push tklvqnrqtzpo' (#10) from push-tklvqnrqtzpo into main

Reviewed-on: #10
2026-05-26 15:41:05 +00:00
parent 6f7abddeaf 694da5208e
commit 82ec6aa1cf
29 changed files with 793 additions and 533 deletions
@@ -26,7 +26,8 @@ for each batch of sequences:
        query_partition(p, superkmers_routed_to_p)
            → load QueryLayer(s) for p
            → for each kmer in each superkmer: MphfLayer::find(kmer)
-        broadcast results back to each (seq_idx, kmer_offset) that referenced the superkmer
+        apply_findere(sk_kmer_results, effective_z)   ← per superkmer
        broadcast confirmed results back to each (seq_idx, kmer_offset)
    emit annotated sequences
 ```
@@ -36,28 +37,46 @@ Parallelism is **not yet active** in the current implementation: batches are pro
 ---
-## Layer lookup: `MphfLayer::find`
+## Findere z-window filter
-`MphfLayer::open` reads `layer_meta.json` and loads either exact or approximate evidence. The caller (`QueryLayer::find`) never chooses the dispatch path — it is fixed at open time by `LayerEvidence`:
+For approximate and hybrid index modes, a raw k-mer hit is a positive from the fingerprint test with false-positive rate 1/2^b. The Findere scheme reduces the effective FP rate to 1/2^(b·z) by requiring **z consecutive k-mers** to all test positive before any of them is counted as a confirmed match.
 `apply_findere` implements this as a sliding-window confirmation, independently for each genome:
 ```rust
-pub fn find(&self, kmer: CanonicalKmer) -> Option<usize> {
+fn apply_findere(
-    match &self.ev {
+    results: &[Option<Box<[u32]>>],
-        LayerEvidence::Exact { .. }  => self.find_exact(kmer),
+    z: usize,
-        LayerEvidence::Approx { .. } => self.find_approx(kmer),
+    n_genomes: usize,
-    }
+) -> Vec<Option<Box<[u32]>>>
 }
 ```
-### Exact layers
+For each genome g, a position i is confirmed if there exists at least one window of z consecutive positions `[j, j+z)` that contains i and where all z positions are present for g (i.e. `results[pos]` is `Some(row)` and `row[g] > 0`). The implementation uses a single O(n) sliding-window scan per genome.
-`find_exact` maps the k-mer through the MPHF to a slot, then calls `UnitigFileReader::verify_canonical_kmer(chunk_id, rank, kmer)` to confirm the stored k-mer matches. Zero false positives. Requires `UnitigFileReader::open()` (random-access via `.idx`); `open_sequential()` cannot serve random-access verification.
+Unconfirmed positions are zeroed in the returned row. If all genomes are zeroed for a position, it is returned as `None`.
-### Approximate layers
+**Short superkmers**: when a superkmer contains fewer than z k-mers, no complete z-window can be formed. Rather than discarding all results, `apply_findere` returns them unchanged (no filter applied). This avoids penalising legitimate short sequences near read ends.
-`find_approx` maps the k-mer through the MPHF, then checks a stored `b`-bit fingerprint of the canonical hash. False-positive rate: **1/2^b per k-mer query**. No `.idx` file is needed; the layer carries only `fingerprint.bin`.
+**Exact indexes**: `z` is effectively 1 (every k-mer is its own window). `apply_findere` is a no-op.
-For a query window of z consecutive k-mers (Findere scheme), the false-positive rate per window is **1/2^(b·z)**. The `z` parameter is recorded in `layer_meta.json` at build time but is not enforced during querying — the caller is responsible for interpreting window-level results accordingly.
+### Effective z at query time
 `effective_z` is resolved at the start of `run()`:
 ```rust
 let effective_z = args.findere_z.unwrap_or_else(|| match idx.meta().config.evidence {
    IndexMode::Approx { z, .. } | IndexMode::Hybrid { z, .. } => z as usize,
    IndexMode::Exact => 1,
 });
 ```
 The `-z` CLI option overrides the index metadata value. A higher z increases stringency (lower FP, some true positives may be discarded at sequence ends); a lower z increases sensitivity.
 ---
 ## Layer lookup: `MphfLayer::find`
 `MphfLayer::open(dir, mode: &IndexMode)` receives the mode from `PartitionMeta` — no per-layer file is read. The caller (`QueryLayer`) never chooses the dispatch path: it is fixed at open time by `LayerEvidence`. See [obilayeredmap](../implementation/obilayeredmap.md) for the full `find` / `find_strict` API.
 ### `QueryLayer` variant selection
@@ -72,18 +91,6 @@ For a query window of z consecutive k-mers (Findere scheme), the false-positive
 ---
 ## `open()` vs `open_sequential()`
 `UnitigFileReader::open()` loads the `.idx` block-offset table, enabling random access to individual unitig chunks. It is required whenever `verify_canonical_kmer` is called (exact layers at query time).
 `UnitigFileReader::open_sequential()` skips the `.idx` and supports only forward iteration. It is sufficient for:
 - build passes that scan all unitigs sequentially (`build_exact_evidence`, `build_approx_evidence`);
 - the `unitig` subcommand, which iterates and prints unitigs without random access.
 `KmerIndex::open()` (called by `query::run`) triggers `MphfLayer::open` for each layer, which calls `UnitigFileReader::open()` for exact layers. Approximate layers do not open a unitig reader at all.
 ---
 ## Presence / count mode at query time
 The `--force-presence` flag and `--presence-threshold` control how per-genome values are accumulated, independently of what the index stores:
@@ -96,49 +103,83 @@ genome_totals[g] += if presence { u32::from(v >= threshold) } else { v }
 ---
 ## Coverage vectors (`--detail`)
 When `--detail` is requested, a 3-D accumulator `cov[seq_idx][genome][kmer_pos]` is allocated before the partition loop, with dimensions derived from `batch.n_kmers`:
 ```
 cov[seq_idx][g][abs_pos] += contribution
 where abs_pos = desc.kmer_offset + local_pos  (absolute kmer position in the sequence)
 ```
 Coverage reflects confirmed k-mers only (post-Findere). The vectors are emitted in the JSON annotation under the key `"coverage"`.
 ---
 ## `kmer_missing` semantics
 `kmer_missing` counts k-mers that returned `None` from the index before Findere filtering — i.e. k-mers truly absent from every layer. K-mers that were found in the index but rejected by the z-window filter are not counted as missing.
 ---
 ## Output format
 Output sequences are written in **OBITools4 format**: the original sequence with a JSON annotation map in the title line.
 ```
->read_id {"kmer_count":59,"kmer_strict_matches":{"genome_a":42,"genome_b":7,...}}
+>read_id {"kmer_count":59,"kmer_strict_matches":{"genome_a":42,"genome_b":7}}
 ATCGATCG...
 ```
-Genome keys in `kmer_strict_matches` are genome labels from `index.meta`. Key order follows iteration order of `meta.genomes`.
+With `--detail`:
 ```
 >read_id {"kmer_count":59,"kmer_strict_matches":{...},"coverage":{"genome_a":[0,1,2,...],...}}
 ATCGATCG...
 ```
 Genome keys follow the iteration order of `meta.genomes`.
 ---
-## Annotation schema (current implementation)
+## Annotation schema
 | Key | Type | Condition | Semantics |
 |---|---|---|---|
-| `kmer_count` | int | always | k-mers with at least one match |
+| `kmer_count` | int | always | k-mers confirmed (post-Findere) with at least one genome match |
-| `kmer_missing` | int | `--count-missing` | k-mers absent from every layer |
+| `kmer_missing` | int | `--count-missing` | k-mers absent from the index entirely (pre-Findere None) |
 | `kmer_strict_matches` | object | always | per-genome accumulated value (label → count or 0/1) |
 | `coverage` | object | `--detail` | per-genome array of per-position contributions (label → [u32]) |
-`kmer_count` counts matched k-mer positions (incremented once per `Some(row)` hit regardless of how many genomes are covered). `kmer_missing` counts `None` hits.
+`kmer_count + kmer_missing` ≤ total k-mers in the sequence. The gap (if any) corresponds to k-mers found in the index but rejected by the Findere z-window filter.
 **Note on doc/impl divergence:** the doc previously used keys `kmer_total`, `kmer_found`, and `kmer_match` (list). The implementation uses `kmer_count` (int, matched only) and `kmer_strict_matches` (object keyed by genome label). `--mismatch` and `--detail` are parsed but not yet implemented and emit a warning.
 ---
 ## CLI
 ```
-obikmer query -i <index> [--detail] [--mismatch] [--count-missing]
+obikmer query <index> [--detail] [--mismatch] [--count-missing]
              [--force-presence] [--presence-threshold <n>]
-              [-T <threads>] <query.fa> [<query2.fa> ...]
+              [-z <z>] [-T <threads>]
              <query.fa> [<query2.fa> ...]
 ```
-`--mismatch` and `--detail` are accepted but currently ignored with a warning on stderr.
+| Option | Default | Semantics |
 |---|---|---|
 | `-z` / `--findere-z` | from index metadata | Override Findere z parameter |
 | `--detail` | off | Emit per-position coverage vectors in JSON |
 | `--count-missing` | off | Add `kmer_missing` field to JSON |
 | `--force-presence` | off | Report 0/1 per genome regardless of index counts |
 | `--presence-threshold` | 1 | Minimum count to declare genome present |
 | `-T` / `--threads` | all CPUs | Worker threads (parallelism not yet active) |
 `--mismatch` is accepted but currently ignored with a warning on stderr.
 ---
 ## Future work
 - **`--mismatch`**: 1-mismatch approximate matching — generate `3·k` single-substitution variants per k-mer, look each up independently.
 - **`--detail`**: per-position coverage vectors (`cov_<i>`) per genome.
 - **Read classification** (`--classify`): assign each read to the genome with the highest match score.
 - **Parallelism**: activate per-partition or per-sequence worker threads using the already-parsed `--threads` value.
 - **Whitelist / blacklist filtering**: threshold-based accept/reject on per-genome match scores.
@@ -27,10 +27,10 @@ part_XXXXX/
 After filtering (applying a min-count threshold derived from the spectrum) and building the local De Bruijn graph + unitigs (see [Construction pipeline](pipeline.md)), the exact filtered kmer set is available via `unitigs.bin`.
-`MphfLayer::build(dir, block_bits, fill_slot)` is called on the unitig directory:
+`MphfLayer::build(dir, block_bits, mode: &IndexMode, fill_slot)` is called on the unitig directory:
-1. **Pass 1**: build `.idx` via `build_unitig_idx(unitig_path, block_bits)`, then iterate all canonical kmers in parallel over chunks using `(0..unitigs.len()).into_par_iter()` + `unitigs.unitig(ci).into_canonical_kmers()`. Constructs and stores `mphf.bin` (ptr_hash, `new_from_par_iter`).
+1. **Pass 1** (parallel): a `CanonicalKmerIter` — clonable via `Arc<Mmap>`, no file reopening — is passed directly to `new_from_par_iter` via `par_bridge()`. No `.idx` is read or created at this stage; parallelism is at partition/layer level, not within a single MPHF. Produces `mphf.bin`.
-2. **Pass 2**: iterate sequentially with `iter_indexed_canonical_kmers`; fill `evidence.bin`; call `fill_slot(slot, kmer)` callback once per kmer for DataStore population.
+2. **Pass 2** (sequential): iterate with `iter_indexed_canonical_kmers`; fill evidence files; call `fill_slot(slot, kmer)` callback per kmer. For Exact/Hybrid, `.idx` is written at the end of this pass — never earlier.
 `mphf1.bin` and `counts1.bin` are no longer needed after phase 2 and can be deleted.
@@ -110,7 +110,7 @@ layer_i/
  mphf.bin         — ptr_hash phase-2 MPHF
  evidence.bin     — n × (chunk_id: 25 bits | rank: 7 bits) per slot  [exact mode]
  fingerprint.bin  — n × b-bit fingerprints per slot                  [approx mode]
-  layer_meta.json  — evidence kind, recorded at build time
+  [no layer_meta.json — mode stored once in partition-level meta.json]
 ```
 Layers are **disjoint**: a canonical kmer belongs to exactly one layer. Layer 0 is built from dataset A. Adding dataset B:
@@ -121,32 +121,32 @@ Layers are **disjoint**: a canonical kmer belongs to exactly one layer. Layer 0
 ### Evidence modes
-Two evidence modes are supported, selected at build time via `EvidenceKind` and recorded in `layer_meta.json`.
+Three evidence modes are supported via `IndexMode`, stored once in `PartitionMeta` at partition root. There is no `layer_meta.json`.
-**Exact** (`EvidenceKind::Exact`): `evidence.bin` stores one `(chunk_id, rank)` pair per MPHF slot, encoding the position of the corresponding kmer in `unitigs.bin`. Membership verification reconstructs the kmer from `(chunk_id, rank)` and compares it to the query. Zero false positives. Requires `.idx` for random access.
+**Exact** (`IndexMode::Exact`): `evidence.bin` stores one `(chunk_id, rank)` pair per MPHF slot. Verification reconstructs the kmer and compares to the query. Zero false positives. `.idx` required at query time.
-**Approx** (`EvidenceKind::Approx { b, z }`): `fingerprint.bin` stores a `b`-bit hash of the kmer at each MPHF slot. Membership check compares `kmer.seq_hash()` against the stored fingerprint. False-positive rate: 1/2^b per query. No `.idx` is written or needed.
+**Approx** (`IndexMode::Approx { b, z }`): `fingerprint.bin` stores a b-bit hash per slot. False-positive rate 1/2^b per query; Findere z-parameter reduces window FP to ≈ 1/2^(b·z). No `.idx` written or needed.
 **Hybrid** (`IndexMode::Hybrid { b, z }`): both `fingerprint.bin` and `evidence.bin` + `.idx`. `find()` uses the fingerprint (O(1)); `find_strict()` uses exact evidence (O(1)).
 ### Build functions
 ```
-MphfLayer::build(dir, block_bits, fill_slot)
+MphfLayer::build(dir, block_bits, mode: &IndexMode, fill_slot)
-    Pass 1: par_iter over chunks via .idx → build mphf.bin
+    Pass 1: CanonicalKmerIter + par_bridge() → build mphf.bin  (no .idx used)
-    Pass 2: sequential iter → fill evidence.bin + call fill_slot
+    Pass 2: sequential iter → fill evidence files + call fill_slot
            .idx written last for Exact/Hybrid (query-time only)
 MphfLayer::build_exact_evidence(dir, block_bits)
-    Standalone post-hoc: builds evidence.bin + .idx from existing mphf.bin + unitigs.bin
+    Post-hoc: builds evidence.bin + .idx from existing mphf.bin + unitigs.bin
    Uses open_sequential(); no .idx required on entry
 MphfLayer::build_approx_evidence(dir, b, z)
-    Standalone post-hoc: builds fingerprint.bin from existing mphf.bin + unitigs.bin
+    Post-hoc: builds fingerprint.bin from existing mphf.bin + unitigs.bin
    Uses open_sequential(); never writes .idx
 MphfLayer::build_evidence(dir, kind, block_bits)
    Dispatch wrapper: routes to build_exact_evidence or build_approx_evidence
 ```
-`build` always produces exact evidence. If approximate evidence is needed (e.g. `EvidenceKind::Approx`), the caller invokes `build_approx_evidence` after `build` to replace the evidence bundle.
+There is no `build_evidence` dispatch wrapper. Callers choose the appropriate post-hoc build directly.
 In `obikpartitionner`, `build_index_layer` receives `block_bits: u8` from `IndexConfig::block_bits` and forwards it directly to `Layer::build` and `Layer::build_approx_evidence`.
@@ -170,7 +170,7 @@ fn query(kmer) → Option<(layer_index, slot)>:
    return None
 ```
-`MphfLayer::find` dispatches transparently to `find_exact` or `find_approx` based on the evidence loaded at `open` time. Expected probe depth: 1 for kmers in layer 0. Each probe is a ptr_hash lookup (~10 ns) plus one evidence check.
+`MphfLayer::find` dispatches on `LayerEvidence` at O(1) — no panicking `find_exact`/`find_approx` methods. `find_strict` always performs an exact check: O(1) for Exact/Hybrid, O(n) sequential scan for Approx. Expected probe depth: 1 for kmers in layer 0. Each probe is a ptr_hash lookup (~10 ns) plus one evidence check.
 ### Merging layers
@@ -20,21 +20,26 @@ Both `PersistentCompactIntMatrix` and `PersistentBitMatrix` come from the `obico
 ---
-## Evidence kinds
+## Index mode (homogeneity invariant)
-Each layer carries one of two evidence bundles, recorded in `layer_meta.json` at build time:
+A partitioned index is homogeneous: every layer within a partition shares the same mode. The mode is determined once at `LayeredMap::open()` from `PartitionMeta.mode` and passed to each `Layer::open()` — no per-layer file is read.
 ```rust
-pub enum EvidenceKind {
+#[derive(Serialize, Deserialize, Default)]
 #[serde(tag = "type", rename_all = "snake_case")]
 pub enum IndexMode {
    #[default]
    Exact,
    Approx { b: u8, z: u8 },
    Hybrid { b: u8, z: u8 },
 }
 ```
-`EvidenceKind` is stored in `LayerMeta` (one per layer directory). `open()` reads it to decide which evidence files to load.
+`IndexMode` is stored once in `PartitionMeta` (`meta.json` at partition root). There is no `layer_meta.json`.
- **Exact**: writes `evidence.bin` + `unitigs.bin.idx`. Zero false positives. Requires random-access `.idx` at query time.
+- **Exact**: writes `evidence.bin` + `unitigs.bin.idx`. Zero false positives.
- **Approx**: writes `fingerprint.bin` only. False-positive rate per kmer query = 1/2^b. `z` is the Findere consecutive-kmer parameter: `z` consecutive kmers must all match, reducing the effective FP rate per read to approximately W / 2^(b·z) where W = L − k − z + 2 is the number of windows in a read of length L. No `.idx` written or required.
+- **Approx**: writes `fingerprint.bin` only. FP rate per kmer = 1/2^b; with Findere z-parameter, z consecutive kmers must all match → effective window FP ≈ 1/2^(b·z). No `.idx` written or required.
 - **Hybrid**: writes both `fingerprint.bin` and `evidence.bin` + `.idx`. `find()` uses the fingerprint (fast, O(1)); `find_strict()` uses exact evidence.
 ---
@@ -55,44 +60,44 @@ pub struct MphfLayer {
 ```rust
 enum LayerEvidence {
    Exact  { evidence: Evidence, unitigs: UnitigFileReader },
-    Approx { fingerprint: FingerprintVec },
+    Approx { fingerprint: FingerprintVec, unitigs_path: PathBuf },
    Hybrid { evidence: Evidence, unitigs: UnitigFileReader, fingerprint: FingerprintVec },
 }
 ```
 `MphfLayer::open(dir, mode: &IndexMode)` receives the mode from `PartitionMeta` — no per-layer file is read.
 ### Query API
-Three public query methods, all returning `Option<usize>` (slot index):
+Two public query methods, both returning `Option<usize>` (slot index):
 ```rust
 pub fn find(&self, kmer: CanonicalKmer) -> Option<usize>
-pub fn find_exact(&self, kmer: CanonicalKmer) -> Option<usize>
+pub fn find_strict(&self, kmer: CanonicalKmer) -> Option<usize>
 pub fn find_approx(&self, kmer: CanonicalKmer) -> Option<usize>
 ```
- `find` dispatches transparently to `find_exact` or `find_approx` based on the evidence variant loaded at `open()`.
+- `find`: O(1) auto-dispatch. Exact/Hybrid → exact evidence check. Approx/Hybrid → fingerprint comparison.
- `find_exact` panics if the layer holds approximate evidence; zero false positives.
+- `find_strict`: always exact. Exact/Hybrid → O(1) evidence check. Approx → O(n) sequential scan (no `.idx`).
 - `find_approx` panics if the layer holds exact evidence; FP rate 1/2^b per kmer.
-`open()` requires `unitigs.bin.idx` (random access into unitigs). `open_sequential()` on `UnitigFileReader` does not require the `.idx` and is used during build passes.
+There are no `find_exact`/`find_approx` methods; panicking dispatch is eliminated.
 ### Build surface
 ```rust
-// Full MPHF + exact evidence build (two-pass, parallel)
+// Full MPHF + evidence build (two-pass)
-pub(crate) fn build(dir, block_bits, fill_slot) -> OLMResult<usize>
+pub(crate) fn build(dir, block_bits, mode: &IndexMode, fill_slot) -> OLMResult<usize>
-// Evidence-only builds (MPHF already present in dir)
+// Evidence-only post-hoc builds (MPHF already present)
 pub fn build_exact_evidence(dir, block_bits) -> OLMResult<usize>
 pub fn build_approx_evidence(dir, b, z)      -> OLMResult<usize>
 pub fn build_evidence(dir, kind, block_bits) -> OLMResult<usize>  // dispatch
 ```
-`MphfLayer::build` runs two sequential passes over `unitigs.bin`:
+`MphfLayer::build` runs two passes over `unitigs.bin`:
-1. **Pass 1** (parallel via rayon): iterate all canonical kmers, construct and store `mphf.bin`. `new_from_par_iter` avoids materialising a full key `Vec`.
+1. **Pass 1** (parallel via rayon): a `CanonicalKmerIter` (clonable, `Arc<Mmap>`, no file reopening) is passed to `new_from_par_iter` via `par_bridge()`. Produces `mphf.bin`. No `.idx` is read or created at this stage.
-2. **Pass 2** (sequential): iterate again, fill `evidence.bin`, call `fill_slot(slot, kmer)` once per kmer for payload population. A compact `n/8`-byte seen-bitset verifies MPHF injectivity inline.
+2. **Pass 2** (sequential): fill evidence files; call `fill_slot(slot, kmer)` per kmer. `.idx` is written last for Exact/Hybrid modes (query-time only).
-`build` always produces exact evidence. For approximate evidence, use `build_approx_evidence` after MPHF construction.
+There is no `build_evidence` dispatch wrapper — callers invoke `build_exact_evidence` or `build_approx_evidence` directly.
 For empty layers (n = 0), all build variants return `Ok(0)` immediately after creating empty output files.
@@ -133,38 +138,37 @@ pub struct Hit<T = ()> {
 ```rust
 // mode 1
 impl Layer<()> {
-    pub fn build(out_dir: &Path, block_bits: u8) -> OLMResult<usize>
+    pub fn build(out_dir: &Path, block_bits: u8, mode: &IndexMode) -> OLMResult<usize>
 }
 // mode 2
 impl Layer<PersistentCompactIntMatrix> {
-    pub fn build(out_dir: &Path, block_bits: u8,
+    pub fn build(out_dir: &Path, block_bits: u8, mode: &IndexMode,
                 count_of: impl Fn(CanonicalKmer) -> u32) -> OLMResult<usize>
-    pub fn build_from_map(out_dir: &Path, block_bits: u8,
+    pub fn build_from_map(out_dir: &Path, block_bits: u8, mode: &IndexMode,
                          counts: &HashMap<CanonicalKmer, u32>) -> OLMResult<usize>
 }
 // mode 3
 impl Layer<PersistentBitMatrix> {
-    pub fn build_presence(out_dir: &Path, block_bits: u8,
+    pub fn build_presence(out_dir: &Path, block_bits: u8, mode: &IndexMode,
                          n_genomes: usize,
                          present_in: impl Fn(CanonicalKmer, usize) -> bool) -> OLMResult<usize>
 }
 ```
-All build impls delegate MPHF + evidence construction to `MphfLayer::build` via a mode-specific `fill_slot` callback. Modes 2 and 3 pre-read `n_kmers` from `unitigs.bin` via `UnitigFileReader::open_sequential` to size the matrix builder before calling `MphfLayer::build`.
+All build impls delegate to `MphfLayer::build` via a mode-specific `fill_slot` callback. The `mode` parameter is forwarded directly — no `LayerMeta` is written.
-### Evidence build helpers on Layer
+Evidence-only post-hoc builds are accessible directly on `Layer<D>`:
 ```rust
 impl<D: LayerData> Layer<D> {
    pub fn build_exact_evidence(layer_dir: &Path, block_bits: u8) -> OLMResult<usize>
    pub fn build_approx_evidence(layer_dir: &Path, b: u8, z: u8)  -> OLMResult<usize>
    pub fn build_evidence(layer_dir: &Path, kind: &EvidenceKind, block_bits: u8) -> OLMResult<usize>
 }
 ```
-These delegate directly to the corresponding `MphfLayer` methods and are provided so call sites can remain typed at the `Layer<D>` level.
+There is no `build_evidence` dispatch wrapper.
 ---
@@ -213,13 +217,16 @@ pub struct LayeredMap<D: LayerData = ()> {
 ```rust
 pub fn open(root: &Path)              -> OLMResult<Self>
-pub fn create(root: &Path) -> OLMResult<Self>
+pub fn create(root: &Path, mode: IndexMode) -> OLMResult<Self>
 pub fn n_layers(&self)                -> usize
 pub fn layer(&self, i: usize)         -> &Layer<D>
 pub fn mode(&self)                    -> &IndexMode
 pub fn query(&self, kmer: CanonicalKmer) -> Option<(usize, Hit<D::Item>)>
 pub fn next_layer_writer(&self)       -> OLMResult<UnitigFileWriter>
 ```
 `open` reads `PartitionMeta` once, extracts `mode`, and passes it to every `Layer::open` — no per-layer file is read. `create` stores the given mode in `PartitionMeta`.
 `query` probes layers in order and returns `(layer_index, Hit)` on the first match. Expected probe depth: 1 for kmers in layer 0.
 ### push_layer
@@ -272,14 +279,13 @@ See [Kmer index architecture](../architecture/index_architecture.md) for the ful
 ```
 partition_root/                    ← LayeredMap (one partition)
-  meta.json                        — {"n_layers": N}
+  meta.json                        — {"n_layers": N, "mode": {"type": "exact"|"approx"|"hybrid", ...}}
  layer_0/                         ← Layer
    layer_meta.json                — {"type": "exact"} or {"type": "approx", "b": B, "z": Z}
    mphf.bin                       — ptr_hash MPHF (epserde format)
    unitigs.bin                    — packed 2-bit nucleotide sequences
-    unitigs.bin.idx                — UIDX index (exact evidence only)
+    unitigs.bin.idx                — UIDX index (Exact/Hybrid only; query-time, never built during MPHF construction)
-    evidence.bin                   — [u32; n], LE  (exact evidence only)
+    evidence.bin                   — [u32; n], LE  (Exact/Hybrid only)
-    fingerprint.bin                — packed b-bit array  (approx evidence only)
+    fingerprint.bin                — packed b-bit array  (Approx/Hybrid only)
    counts/                        [mode 2] PersistentCompactIntMatrix
      meta.json
      col_000000.pciv
@@ -290,7 +296,7 @@ partition_root/                    ← LayeredMap (one partition)
    …
 ```
-`unitigs.bin.idx` is required by `open()` (random access). `open_sequential()` on `UnitigFileReader` omits it and is used during build passes and approx-evidence construction.
+There is no `layer_meta.json`. The mode is stored once in `PartitionMeta` and is valid for all layers. `unitigs.bin.idx` is built at the end of `build_exact_evidence` — never during MPHF construction — and is consumed at query time only.
 ---
@@ -387,4 +393,4 @@ Each partition's new layer is built independently; the operation is fully parall
 | `obiskio` | unitig file writer/reader + `.idx` build |
 | `obicompactvec` | payload types + aggregation traits |
 | `rayon 1` | parallel MPHF construction pass |
-| `serde / serde_json` | `LayerMeta` + `PartitionMeta` serialisation |
+| `serde / serde_json` | `PartitionMeta` serialisation |
@@ -61,35 +61,24 @@ File size = `n_slots × 4` bytes. `chunk_id` is the 0-based index of the record
 Scans `unitigs.bin` sequentially: for each chunk at byte offset `offset`, if `chunk_count & mask == 0` (where `mask = (1 << block_bits) − 1`), appends `offset as u32` to `block_offsets`. After the scan, appends a sentinel (= total file size), then writes the `.idx` file. Called after the unitig file is fully written and closed.
-### `open()` vs `open_sequential()`
+### `open()`, `open_sequential()`, `open_direct_access()`
-`UnitigFileReader::open(path)` loads the `.idx` file into `block_offsets: Vec<u32>` and memory-maps `unitigs.bin`. Enables random access via `chunk_start(i)`, `unitig(i)`, `raw_kmer(i, j)`, and `verify_canonical_kmer(i, j, q)`.
+`UnitigFileReader` has three constructors:
-`UnitigFileReader::open_sequential(path)` does not read `.idx`. It scans `unitigs.bin` once to count chunks and kmers, then leaves `block_offsets` empty. Only sequential iterators work: `iter_unitigs`, `iter_kmers`, `iter_canonical_kmers`, `iter_indexed_canonical_kmers`. Any call to `chunk_start()` panics with a diagnostic message.
+- `open(path)` — smart default: if `unitigs.bin.idx` exists, delegates to `open_direct_access`; otherwise delegates to `open_sequential`. Prefer this in call sites that don't require one specific mode.
 - `open_sequential(path)` — never reads `.idx`. Sequential iterators only; `chunk_start(i)` falls back to an O(i) mmap scan rather than panicking.
 - `open_direct_access(path)` — requires `.idx` to be present. Enables O(1) or O(2^block_bits) `chunk_start(i)`, used by `verify_canonical_kmer` at query time.
-### `chunk_start(i)` — random access
+`CanonicalKmerIter` — a clonable sequential iterator returned by `UnitigFileReader::iter_canonical_kmers()`. It holds an `Arc<Mmap>` so cloning resets the cursor to the start without reopening the file. This makes it usable with `par_bridge()` for parallel MPHF construction without random access.
-```rust
+### `chunk_start(i)` — access modes
 fn chunk_start(&self, i: usize) -> usize {
    // block_bits=0: single table lookup, O(1) — hot path
    if self.block_bits == 0 {
        return self.block_offsets[i] as usize;
    }
    // block_bits>0: lookup block, then scan at most 2^block_bits − 1 records
    let block = i >> self.block_bits;
    let rem   = i &  self.mask;
    let mut offset = self.block_offsets[block] as usize;
    for _ in 0..rem {
        let seql_minus_k = self.mmap[offset] as usize;
        offset += 1 + (seql_minus_k + self.k + 3) / 4;
    }
    offset
 }
 ```
-With `block_bits = 0` (the default), every chunk has a direct entry in `block_offsets`: lookup is a single array index, O(1), with no sequential scan. The `if self.block_bits == 0` branch is explicit in the code and handles this hot path first.
+When `.idx` is loaded (`open_direct_access`):
-With `block_bits > 0`, one offset covers `2^block_bits` consecutive chunks; access cost is O(`2^block_bits`) sequential mmap reads.
+- `block_bits = 0`: single array lookup, O(1).
 - `block_bits > 0`: lookup block, then scan ≤ 2^block_bits records, O(2^block_bits).
 When `.idx` is absent (`open_sequential`): `chunk_start(i)` performs an O(i) sequential mmap scan from offset 0. No panic — the function degrades gracefully. This degraded path is used by `find_strict()` on Approx layers (sequential scan of all canonical kmers).
 ### Decoding a kmer from slot `s`
@@ -17,6 +17,7 @@
 | `unitig`    | Dump unitigs from a built index to stdout (debug) |
 | `estimate`  | Estimate approximate-index parameters (z, evidence bits, FP rates) before indexing |
 | `reindex`   | Convert an index's evidence in-place: exact ↔ approx |
 | `utils`     | Miscellaneous index utilities: `--new-label NEW=OLD` renames a genome label in-place (NEW gets OLD's identity) |
 ## Constraints
@@ -24,7 +25,21 @@
 - Maximum efficiency in computation, memory, and disk usage
 - k odd, k ∈ [11, 31], fixed at runtime; kmer fits in a u64 (2 bits/base)
 - Canonical form: `min(kmer, revcomp(kmer))` reduces strand-symmetric space by half
- Input formats: FASTA, FASTQ, gzip, streaming stdin
+- Input formats: FASTA, FASTQ, gzip, streaming stdin; `index` reads from stdin automatically when no input files are provided (`-` can also be passed explicitly among other paths)
 ## Genome label constraints
 Genome labels are arbitrary Unicode strings with the following restrictions:
 | Character | Forbidden | Reason |
 |-----------|-----------|--------|
 | `/` | yes | filesystem path separator |
 | `=` | yes | `--new-label` parser separator |
 | `\0` | yes | null byte |
 | `\n` `\r` `\t` | yes | break CSV output |
 | spaces | **allowed** | use shell quoting: `--new-label 'new label=old label'` |
 Empty labels are also rejected. Labels derived automatically from the index directory name (when `--label` is omitted) are not validated since they come from the filesystem and are already safe.
 ## Priority operations
@@ -12,5 +12,5 @@ pub use error::{OKIError, OKIResult};
 pub use distance::{DistanceMetric, DistanceOutput};
 pub use index::KmerIndex;
 pub use merge::MergeMode;
-pub use meta::{GenomeInfo, IndexConfig, IndexMeta, META_FILENAME};
+pub use meta::{validate_label, GenomeInfo, IndexConfig, IndexMeta, META_FILENAME};
 pub use state::{IndexState, SENTINEL_COUNTED, SENTINEL_INDEXED, SENTINEL_SCATTERED};
@@ -9,7 +9,7 @@ use obisys::{Reporter, Stage};
 use rayon::prelude::*;
 use tracing::info;
-use obilayeredmap::EvidenceKind;
+use obilayeredmap::IndexMode;
 use crate::error::{OKIError, OKIResult};
 use crate::index::KmerIndex;
@@ -271,14 +271,16 @@ fn partition_bar(n: u64) -> ProgressBar {
 /// - all `Exact`                        → OK, returns `Exact`
 /// - all `Approx { b, z }` same params  → OK, returns `Approx { b, z }`
 /// - mixed exact/approx or different approx params → `IncompatibleEvidence`
-fn validate_evidence_compat(sources: &[&KmerIndex]) -> OKIResult<EvidenceKind> {
+fn validate_evidence_compat(sources: &[&KmerIndex]) -> OKIResult<IndexMode> {
    let ref_ev = &sources[0].meta.config.evidence;
    for src in &sources[1..] {
        let ev = &src.meta.config.evidence;
        let compat = match (ref_ev, ev) {
-            (EvidenceKind::Exact, EvidenceKind::Exact) => true,
+            (IndexMode::Exact, IndexMode::Exact) => true,
-            (EvidenceKind::Approx { b: b1, z: z1 },
+            (IndexMode::Approx { b: b1, z: z1 },
-             EvidenceKind::Approx { b: b2, z: z2 }) => b1 == b2 && z1 == z2,
+             IndexMode::Approx { b: b2, z: z2 }) => b1 == b2 && z1 == z2,
            (IndexMode::Hybrid { b: b1, z: z1 },
             IndexMode::Hybrid { b: b2, z: z2 }) => b1 == b2 && z1 == z2,
            _ => false,
        };
        if !compat {
@@ -3,7 +3,7 @@ use std::fs;
 use std::io;
 use std::path::Path;
-use obilayeredmap::EvidenceKind;
+use obilayeredmap::IndexMode;
 use serde::{Deserialize, Serialize};
 pub const META_FILENAME: &str = "index.meta";
@@ -30,7 +30,7 @@ pub struct IndexConfig {
    pub n_bits: usize,
    pub with_counts: bool,
    #[serde(default)]
-    pub evidence: EvidenceKind,
+    pub evidence: IndexMode,
    /// Block size for the unitig index as a power-of-two exponent.
    /// The `.idx` block covers 2^block_bits consecutive unitigs.
    /// 0 = one entry per unitig (O(1) access, largest `.idx`).
@@ -70,3 +70,26 @@ impl IndexMeta {
        self.genomes.iter().map(|g| g.label.as_str())
    }
 }
 /// Validate a user-supplied genome label.
 ///
 /// Forbidden: `/` (filesystem separator), `=` (--new-label parser separator),
 /// `\0` (null byte), `\n`, `\r`, `\t` (break CSV output).
 /// Empty labels are also rejected.
 pub fn validate_label(label: &str) -> Result<(), String> {
    if label.is_empty() {
        return Err("genome label must not be empty".into());
    }
    const FORBIDDEN: &[char] = &['/', '=', '\0', '\n', '\r', '\t'];
    if let Some(c) = label.chars().find(|c| FORBIDDEN.contains(c)) {
        let display = match c {
            '\0' => "\\0 (null)".to_string(),
            '\n' => "\\n (newline)".to_string(),
            '\r' => "\\r (carriage return)".to_string(),
            '\t' => "\\t (tab)".to_string(),
            c    => format!("'{c}'"),
        };
        return Err(format!("genome label contains forbidden character {display}"));
    }
    Ok(())
 }
@@ -3,7 +3,7 @@ use std::path::Path;
 use std::time::Duration;
 use indicatif::{ProgressBar, ProgressStyle};
-use obilayeredmap::{EvidenceKind, layer::Layer};
+use obilayeredmap::{IndexMode, layer::Layer};
 use obilayeredmap::meta::PartitionMeta;
 use obisys::{Reporter, Stage};
 use rayon::prelude::*;
@@ -31,7 +31,7 @@ impl KmerIndex {
    /// `index.meta` is updated with the new evidence kind on success.
    pub fn reindex(
        &mut self,
-        target: EvidenceKind,
+        target: IndexMode,
        block_bits: u8,
        rep: &mut Reporter,
    ) -> OKIResult<()> {
@@ -75,7 +75,7 @@ impl KmerIndex {
        }
        self.meta.config.evidence = target;
-        if matches!(self.meta.config.evidence, EvidenceKind::Exact) {
+        if matches!(self.meta.config.evidence, IndexMode::Exact) {
            self.meta.config.block_bits = block_bits;
        }
        self.meta.write(&self.root_path)?;
@@ -85,7 +85,7 @@ impl KmerIndex {
 }
 /// Process all layers of one partition's index directory.
-fn reindex_partition(index_dir: &Path, target: &EvidenceKind, block_bits: u8) -> OKIResult<()> {
+fn reindex_partition(index_dir: &Path, target: &IndexMode, block_bits: u8) -> OKIResult<()> {
    if !index_dir.exists() {
        return Ok(());
    }
@@ -97,22 +97,30 @@ fn reindex_partition(index_dir: &Path, target: &EvidenceKind, block_bits: u8) ->
    Ok(())
 }
-fn reindex_layer(layer_dir: &Path, target: &EvidenceKind, block_bits: u8) -> OKIResult<()> {
+fn reindex_layer(layer_dir: &Path, target: &IndexMode, block_bits: u8) -> OKIResult<()> {
-    Layer::<()>::build_evidence(layer_dir, target, block_bits).map_err(olm_to_oki)?;
+    match target {
        IndexMode::Exact => {
            Layer::<()>::build_exact_evidence(layer_dir, block_bits).map_err(olm_to_oki)?;
        }
        IndexMode::Approx { b, z } | IndexMode::Hybrid { b, z } => {
            Layer::<()>::build_approx_evidence(layer_dir, *b, *z).map_err(olm_to_oki)?;
        }
    }
    remove_stale_evidence(layer_dir, target)
 }
-fn remove_stale_evidence(layer_dir: &Path, target: &EvidenceKind) -> OKIResult<()> {
+fn remove_stale_evidence(layer_dir: &Path, target: &IndexMode) -> OKIResult<()> {
    match target {
-        EvidenceKind::Exact => {
+        IndexMode::Exact => {
            // fingerprint.bin is no longer valid
            remove_if_exists(&layer_dir.join(FINGERPRINT_FILE));
        }
-        EvidenceKind::Approx { .. } => {
+        IndexMode::Approx { .. } => {
            // exact bundle is no longer valid
            remove_if_exists(&layer_dir.join(EVIDENCE_FILE));
            remove_if_exists(&layer_dir.join(UNITIG_IDX_FILE));
        }
        IndexMode::Hybrid { .. } => {
            // both bundles kept — nothing to remove
        }
    }
    Ok(())
 }
@@ -10,8 +10,9 @@ use obikseq::RoutableSuperKmer;
 #[derive(Args)]
 pub struct CommonArgs {
-    /// Input files or directories (FASTA/FASTQ, optionally gzip-compressed)
+    /// Input files or directories (FASTA/FASTQ, optionally gzip-compressed).
-    #[arg(num_args = 1..)]
+    /// If omitted, reads from stdin.
    #[arg(num_args = 0..)]
    pub inputs: Vec<String>,
    /// k-mer size
@@ -63,7 +64,11 @@ pub fn block_size_to_bits(n: usize) -> u8 {
 impl CommonArgs {
    pub fn seqfile_paths(&self) -> obiread::PathIter {
-        let paths = self.inputs.iter().map(PathBuf::from).collect();
+        let paths: Vec<PathBuf> = if self.inputs.is_empty() {
            vec![PathBuf::from("-")]
        } else {
            self.inputs.iter().map(PathBuf::from).collect()
        };
        obiread::PathIter::new(paths)
    }
 }
@@ -1,8 +1,8 @@
 use std::path::PathBuf;
 use clap::Args;
-use obikindex::{GenomeInfo, IndexConfig, IndexState, KmerIndex};
+use obikindex::{validate_label, GenomeInfo, IndexConfig, IndexState, KmerIndex};
-use obilayeredmap::EvidenceKind;
+use obilayeredmap::IndexMode;
 fn parse_key_value(s: &str) -> Result<(String, String), String> {
    let pos = s.find('=').ok_or_else(|| format!("invalid key=value: no '=' in '{s}'"))?;
@@ -159,9 +159,9 @@ pub fn run(args: IndexArgs) {
    let evidence = if args.approx {
        let (z, b, fp) = resolve_approx_params(args.findere_z, args.evidence_bits, args.fp);
        info!("approximate evidence: b={b}, z={z}, fp={fp:.2e}");
-        EvidenceKind::Approx { b, z }
+        IndexMode::Approx { b, z }
    } else {
-        EvidenceKind::Exact
+        IndexMode::Exact
    };
    // ── Open or create the index ─────────────────────────────────────────────
@@ -194,6 +194,10 @@ pub fn run(args: IndexArgs) {
            block_bits,
        };
        let genome_info = args.label.as_ref().map(|label| {
            validate_label(label).unwrap_or_else(|e| {
                eprintln!("error: --label: {e}");
                std::process::exit(1);
            });
            let mut info = GenomeInfo::new(label.clone());
            for (k, v) in &args.meta {
                info.meta.insert(k.clone(), v.clone());
@@ -1,4 +1,5 @@
 pub mod annotate;
 pub mod utils;
 pub mod distance;
 pub mod dump;
 pub mod estimate;
@@ -4,6 +4,7 @@ use std::path::PathBuf;
 use clap::Args;
 use obikindex::KmerIndex;
 use obilayeredmap::IndexMode;
 use obiread::record::{SeqRecord, parse_chunk};
 use obiread::chunk::read_sequence_chunks;
 use obikseq::{RoutableSuperKmer, set_k, set_m};
@@ -21,7 +22,7 @@ pub struct QueryArgs {
    #[arg(num_args = 1..)]
    pub inputs: Vec<String>,
-    /// Also report per-position coverage vectors (cov_<i> per genome)
+    /// Report per-position coverage vectors per genome (adds "coverage" to JSON)
    #[arg(long)]
    pub detail: bool,
@@ -41,6 +42,10 @@ pub struct QueryArgs {
    #[arg(long, default_value_t = 1)]
    pub presence_threshold: u32,
    /// Override the Findere z parameter from index metadata
    #[arg(short = 'z', long)]
    pub findere_z: Option<usize>,
    /// Number of worker threads
    #[arg(
        short = 'T',
@@ -59,24 +64,18 @@ pub struct SKDesc {
    /// Index of the source sequence within the batch.
    pub seq_idx: u32,
    /// Kmer offset of the first kmer of this superkmer within its sequence.
    /// Computed as the cumulative number of kmers emitted before this superkmer
    /// in the same sequence. Used for `--detail` coverage vectors.
    pub kmer_offset: u32,
 }
 // ── QueryBatch ────────────────────────────────────────────────────────────────
 /// A batch of query sequences with their superkmers deduplicated.
 ///
 /// Each unique `RoutableSuperKmer` maps to all the (seq_idx, kmer_offset)
 /// positions it occupies across the batch. The superkmer is queried once
 /// per partition; the result is broadcast to every SKDesc entry.
 pub struct QueryBatch {
    /// Sequence ids in batch order.
    pub ids: Vec<String>,
    /// Raw sequence bytes (for output), in batch order.
    pub seqs: Vec<Vec<u8>>,
-    /// Per-sequence total kmer count (kmer_count + kmer_missing).
+    /// Total kmer count per sequence (used for `--detail` coverage allocation).
    pub n_kmers: Vec<u32>,
    /// Deduplicated superkmer map.
    pub map: HashMap<RoutableSuperKmer, Vec<SKDesc>>,
@@ -115,9 +114,6 @@ impl QueryBatch {
    }
    /// Split the superkmer map by partition index.
    ///
    /// Returns a vec of length `n_partitions`; each slot holds the RSK refs
    /// whose minimizer routes to that partition.
    pub fn split_by_partition(&self, n_partitions: usize) -> Vec<Vec<&RoutableSuperKmer>> {
        let mask = (n_partitions as u64) - 1;
        let mut by_part: Vec<Vec<&RoutableSuperKmer>> = vec![Vec::new(); n_partitions];
@@ -134,7 +130,6 @@ impl QueryBatch {
 struct SeqAcc {
    kmer_count:    u32,
    kmer_missing:  u32,
    /// Per-genome accumulated count (count mode) or presence sum (presence mode).
    genome_totals: Vec<u32>,
 }
@@ -148,6 +143,68 @@ impl SeqAcc {
    }
 }
 // ── Findere z-window filter ───────────────────────────────────────────────────
 /// Apply the Findere z-window filter to per-kmer query results for one superkmer.
 ///
 /// A k-mer at position i for genome g is confirmed only if it belongs to at least
 /// one run of z consecutive positions where all k-mers are present for g.
 /// Unconfirmed positions are zeroed; positions whose entire row becomes zero are
 /// returned as `None`.
 ///
 /// When z <= 1 or the superkmer is shorter than z k-mers, results are returned
 /// unchanged (short superkmers cannot satisfy the z-window constraint).
 fn apply_findere(
    results: &[Option<Box<[u32]>>],
    z: usize,
    n_genomes: usize,
 ) -> Vec<Option<Box<[u32]>>> {
    let n = results.len();
    if z <= 1 || n < z {
        return results.iter().map(|r| r.as_ref().map(|row| row.clone())).collect();
    }
    let mut confirmed = vec![vec![false; n_genomes]; n];
    for g in 0..n_genomes {
        let present: Vec<bool> = results
            .iter()
            .map(|r| r.as_ref().map_or(false, |row| row[g] > 0))
            .collect();
        let mut window_count = present[..z].iter().filter(|&&p| p).count();
        if window_count == z {
            for c in confirmed[..z].iter_mut() {
                c[g] = true;
            }
        }
        for j in 1..=(n - z) {
            if present[j - 1]     { window_count -= 1; }
            if present[j + z - 1] { window_count += 1; }
            if window_count == z {
                for c in confirmed[j..j + z].iter_mut() {
                    c[g] = true;
                }
            }
        }
    }
    results.iter().enumerate().map(|(i, opt)| {
        let row = opt.as_ref()?;
        let mut new_row: Box<[u32]> = row.clone();
        let mut any = false;
        for g in 0..n_genomes {
            if !confirmed[i][g] {
                new_row[g] = 0;
            } else {
                any = true;
            }
        }
        if any { Some(new_row) } else { None }
    }).collect()
 }
 // ── Entry point ───────────────────────────────────────────────────────────────
 pub fn run(args: QueryArgs) {
@@ -164,17 +221,22 @@ pub fn run(args: QueryArgs) {
    let n_partitions = idx.n_partitions();
    let with_counts  = idx.meta().config.with_counts;
    let effective_z: usize = args.findere_z.unwrap_or_else(|| {
        match idx.meta().config.evidence {
            IndexMode::Approx { z, .. } | IndexMode::Hybrid { z, .. } => z as usize,
            IndexMode::Exact => 1,
        }
    });
    info!(
-        "query: k={k}, {} genome(s), with_counts={with_counts}, mismatch={}, detail={}",
+        "query: k={k}, {} genome(s), with_counts={with_counts}, z={effective_z}, \
         mismatch={}, detail={}",
        n_genomes, args.mismatch, args.detail
    );
    if args.mismatch {
        eprintln!("warning: --mismatch not yet implemented, ignored");
    }
    if args.detail {
        eprintln!("warning: --detail not yet implemented, ignored");
    }
    let paths: Vec<PathBuf> = args.inputs.iter().map(PathBuf::from).collect();
    let mut out = BufWriter::new(io::stdout());
@@ -200,7 +262,17 @@ pub fn run(args: QueryArgs) {
            let batch  = QueryBatch::from_records(records, k, 6, 0.7);
            let n_seqs = batch.ids.len();
-            let mut accs: Vec<SeqAcc> = (0..n_seqs).map(|_| SeqAcc::new(n_genomes)).collect();
+            let mut accs: Vec<SeqAcc> =
                (0..n_seqs).map(|_| SeqAcc::new(n_genomes)).collect();
            // [seq_idx][genome_idx][kmer_position] — allocated only with --detail
            let mut cov: Vec<Vec<Vec<u32>>> = if args.detail {
                batch.n_kmers.iter()
                    .map(|&n| vec![vec![0u32; n as usize]; n_genomes])
                    .collect()
            } else {
                Vec::new()
            };
            let by_part = batch.split_by_partition(n_partitions);
@@ -221,20 +293,36 @@ pub fn run(args: QueryArgs) {
                let threshold = args.presence_threshold;
                for (rsk, sk_kmer_results) in part_sks.iter().zip(kmer_results.iter()) {
                    let filtered = apply_findere(sk_kmer_results, effective_z, n_genomes);
                    let descs = batch.map.get(*rsk).expect("rsk must be in map");
                    for desc in descs {
                        let acc = &mut accs[desc.seq_idx as usize];
-                        for hit in sk_kmer_results.iter() {
+
                        for (local_pos, hit) in filtered.iter().enumerate() {
                            match hit {
-                                None => acc.kmer_missing += 1,
+                                None => {
                                    // Only truly missing if the index also had no entry.
                                    if sk_kmer_results[local_pos].is_none() {
                                        acc.kmer_missing += 1;
                                    }
                                }
                                Some(row) => {
                                    acc.kmer_count += 1;
                                    for (g, &v) in row.iter().enumerate() {
-                                        acc.genome_totals[g] += if presence {
+                                        if v == 0 {
                                            continue;
                                        }
                                        let contribution = if presence {
                                            u32::from(v >= threshold)
                                        } else {
                                            v
                                        };
                                        acc.genome_totals[g] += contribution;
                                        if args.detail {
                                            let abs_pos = desc.kmer_offset as usize + local_pos;
                                            cov[desc.seq_idx as usize][g][abs_pos] += contribution;
                                        }
                                    }
                                }
                            }
@@ -243,7 +331,11 @@ pub fn run(args: QueryArgs) {
                }
            }
-            emit_batch(&batch, &accs, idx.meta(), args.count_missing, &mut out);
+            emit_batch(
                &batch, &accs, idx.meta(),
                args.count_missing, args.detail, &cov,
                &mut out,
            );
        }
    }
 }
@@ -255,6 +347,8 @@ fn emit_batch(
    accs:          &[SeqAcc],
    meta:          &obikindex::meta::IndexMeta,
    count_missing: bool,
    detail:        bool,
    cov:           &[Vec<Vec<u32>>],
    out:           &mut impl Write,
 ) {
    for (seq_idx, (id, seq)) in batch.ids.iter().zip(batch.seqs.iter()).enumerate() {
@@ -265,12 +359,23 @@ fn emit_batch(
        if count_missing {
            ann.insert("kmer_missing".into(), acc.kmer_missing.into());
        }
        let mut match_map = serde_json::Map::new();
        for (g, genome) in meta.genomes.iter().enumerate() {
            match_map.insert(genome.label.clone(), acc.genome_totals[g].into());
        }
        ann.insert("kmer_strict_matches".into(), match_map.into());
        if detail && !cov.is_empty() {
            let mut cov_map = serde_json::Map::new();
            for (g, genome) in meta.genomes.iter().enumerate() {
                let v: Vec<serde_json::Value> =
                    cov[seq_idx][g].iter().map(|&x| x.into()).collect();
                cov_map.insert(genome.label.clone(), v.into());
            }
            ann.insert("coverage".into(), cov_map.into());
        }
        let ann_str = serde_json::to_string(&ann).unwrap_or_else(|_| "{}".to_string());
        // OBITools4 FASTA format: >id {"key":value,...}
@@ -2,7 +2,7 @@ use std::path::PathBuf;
 use clap::Args;
 use obikindex::KmerIndex;
-use obilayeredmap::EvidenceKind;
+use obilayeredmap::IndexMode;
 use obisys::Reporter;
 use tracing::info;
@@ -41,10 +41,10 @@ pub fn run(args: ReindexArgs) {
    let target = if args.approx {
        let (z, b, fp) = resolve_approx_params(args.findere_z, args.evidence_bits, args.fp);
        info!("target: approximate evidence — b={b}, z={z}, fp={fp:.2e}");
-        EvidenceKind::Approx { b, z }
+        IndexMode::Approx { b, z }
    } else {
        info!("target: exact evidence");
-        EvidenceKind::Exact
+        IndexMode::Exact
    };
    let mut idx = KmerIndex::open(&args.index).unwrap_or_else(|e| {
@@ -0,0 +1,91 @@
 use std::path::PathBuf;
 use clap::Args;
 use obikindex::{validate_label, KmerIndex};
 use tracing::info;
 #[derive(Args)]
 pub struct UtilsArgs {
    /// Index directory to operate on
    pub index: PathBuf,
    /// Set a new genome label: NEW_LABEL=OLD_LABEL
    #[arg(long, value_name = "NEW=OLD")]
    pub new_label: Option<String>,
 }
 pub fn run(args: UtilsArgs) {
    let mut any = false;
    if let Some(spec) = &args.new_label {
        any = true;
        run_rename(&args.index, spec);
    }
    if !any {
        eprintln!("utils: no operation specified. Available options: --new-label NEW=OLD");
        std::process::exit(1);
    }
 }
 fn run_rename(index_path: &PathBuf, spec: &str) {
    let (old_label, new_label) = parse_rename_spec(spec);
    let mut idx = KmerIndex::open(index_path).unwrap_or_else(|e| {
        eprintln!("error opening index: {e}");
        std::process::exit(1);
    });
    let pos = idx
        .meta()
        .genomes
        .iter()
        .position(|g| g.label == old_label)
        .unwrap_or_else(|| {
            eprintln!("error: genome '{old_label}' not found in index");
            std::process::exit(1);
        });
    validate_label(&new_label).unwrap_or_else(|e| {
        eprintln!("error: --new-label: {e}");
        std::process::exit(1);
    });
    // Check the new label is not already taken.
    if idx.meta().genomes.iter().any(|g| g.label == new_label) {
        eprintln!("error: label '{new_label}' already exists in index");
        std::process::exit(1);
    }
    idx.meta_mut().genomes[pos].label = new_label.clone();
    idx.meta_mut().write(index_path).unwrap_or_else(|e| {
        eprintln!("error writing index metadata: {e}");
        std::process::exit(1);
    });
    // Rename the spectrum file if it exists.
    let spectrums_dir = index_path.join("spectrums");
    let old_spectrum = spectrums_dir.join(format!("{old_label}.json"));
    let new_spectrum = spectrums_dir.join(format!("{new_label}.json"));
    if old_spectrum.exists() {
        std::fs::rename(&old_spectrum, &new_spectrum).unwrap_or_else(|e| {
            eprintln!("warning: could not rename spectrum file: {e}");
        });
    }
    info!("renamed genome '{old_label}' → '{new_label}'");
 }
 fn parse_rename_spec(spec: &str) -> (String, String) {
    let eq = spec.find('=').unwrap_or_else(|| {
        eprintln!("error: --new-label expects NEW_LABEL=OLD_LABEL, got '{spec}'");
        std::process::exit(1);
    });
    let new = spec[..eq].trim().to_string();
    let old = spec[eq + 1..].trim().to_string();
    if old.is_empty() || new.is_empty() {
        eprintln!("error: --new-label: both old and new labels must be non-empty");
        std::process::exit(1);
    }
    (old, new)
 }
@@ -36,6 +36,8 @@ enum Commands {
    Estimate(cmd::estimate::EstimateArgs),
    /// Convert an index's evidence in-place: exact ↔ approx
    Reindex(cmd::reindex::ReindexArgs),
    /// Miscellaneous index utilities (--rename, …)
    Utils(cmd::utils::UtilsArgs),
 }
 fn main() {
@@ -68,6 +70,7 @@ fn main() {
        Commands::Unitig(args)    => cmd::unitig::run(args),
        Commands::Estimate(args)  => cmd::estimate::run(args),
        Commands::Reindex(args)   => cmd::reindex::run(args),
        Commands::Utils(args)     => cmd::utils::run(args),
    }
    #[cfg(feature = "profiling")]
@@ -1,8 +1,8 @@
 use obicompactvec::{PersistentBitMatrix, PersistentCompactIntMatrix};
 use obikseq::CanonicalKmer;
 use obiskio::{SKError, SKResult, UnitigFileReader};
-use obilayeredmap::OLMError;
+use obilayeredmap::{IndexMode, MphfLayer, OLMError};
-use obilayeredmap::MphfLayer;
+use obilayeredmap::meta::PartitionMeta;
 use crate::partition::KmerPartition;
@@ -35,15 +35,16 @@ impl KmerPartition {
            return Ok(());
        }
-        // Discover layers by probing layer_0, layer_1, … until one is absent.
+        let index_mode = PartitionMeta::load(&index_dir)
-        // PartitionMeta (meta.json) is only created by the merge path, not by
+            .map(|m| m.mode)
-        // the initial single-genome build, so we cannot rely on it here.
+            .unwrap_or(IndexMode::Exact);
        let mut l = 0;
        loop {
            let layer_dir = index_dir.join(format!("layer_{l}"));
            if !layer_dir.exists() { break; }
            l += 1;
-            let mphf = MphfLayer::open(&layer_dir).map_err(olm_to_sk)?;
+            let mphf = MphfLayer::open(&layer_dir, &index_mode).map_err(olm_to_sk)?;
            let reader = UnitigFileReader::open_sequential(&layer_dir.join("unitigs.bin"))?;
            let counts_dir  = layer_dir.join("counts");
@@ -92,11 +93,15 @@ impl KmerPartition {
            return Ok(());
        }
        let index_mode = PartitionMeta::load(&index_dir)
            .map(|m| m.mode)
            .unwrap_or(IndexMode::Exact);
        let mut layer = 0;
        loop {
            let layer_dir = index_dir.join(format!("layer_{layer}"));
            if !layer_dir.exists() { break; }
-            let mphf = MphfLayer::open(&layer_dir).map_err(olm_to_sk)?;
+            let mphf = MphfLayer::open(&layer_dir, &index_mode).map_err(olm_to_sk)?;
            let reader = UnitigFileReader::open_sequential(&layer_dir.join("unitigs.bin"))?;
            let counts_dir   = layer_dir.join("counts");
@@ -5,7 +5,7 @@ use cacheline_ef::{CachelineEf, CachelineEfVec};
 use epserde::prelude::*;
 use obicompactvec::{PersistentCompactIntMatrix, PersistentCompactIntVec};
 use obidebruinj::GraphDeBruijn;
-use obilayeredmap::{EvidenceKind, OLMError, layer::Layer};
+use obilayeredmap::{IndexMode, OLMError, layer::Layer};
 use obilayeredmap::meta::PartitionMeta;
 use obiskio::{SKError, SKFileMeta, SKFileReader};
 use ptr_hash::{PtrHash, bucket_fn::CubicEps, hash::Xx64};
@@ -44,7 +44,7 @@ impl KmerPartition {
        min_ab: u32,
        max_ab: Option<u32>,
        with_counts: bool,
-        evidence: &EvidenceKind,
+        mode: &IndexMode,
        block_bits: u8,
    ) -> Result<usize, SKError> {
        let part_dir = self.part_dir(i);
@@ -110,7 +110,7 @@ impl KmerPartition {
        uw.close()?;
        if with_counts {
-            Layer::<PersistentCompactIntMatrix>::build(&layer_dir, block_bits, evidence, |kmer| {
+            Layer::<PersistentCompactIntMatrix>::build(&layer_dir, block_bits, mode, |kmer| {
                match (&mphf1_opt, &counts1_opt) {
                    (Some(mphf), Some(counts)) => counts.get(mphf.index(&kmer.raw())),
                    _ => 1,
@@ -118,13 +118,11 @@ impl KmerPartition {
            })
            .map_err(olm_to_sk)?;
        } else {
-            Layer::<()>::build(&layer_dir, block_bits, evidence).map_err(olm_to_sk)?;
+            Layer::<()>::build(&layer_dir, block_bits, mode).map_err(olm_to_sk)?;
        }
        // Write meta.json in the index/ directory so LayeredMap::open works
        // (e.g. for subsequent merge operations).
        let index_dir = layer_dir.parent().expect("layer_dir has a parent");
-        PartitionMeta { n_layers: 1 }.save(index_dir).map_err(olm_to_sk)?;
+        PartitionMeta { n_layers: 1, mode: mode.clone() }.save(index_dir).map_err(olm_to_sk)?;
        Ok(n_kmers)
    }
@@ -9,7 +9,7 @@ use obicompactvec::{PersistentBitMatrix, PersistentBitMatrixBuilder,
                    PersistentCompactIntVecBuilder};
 use obikseq::CanonicalKmer;
 use obiskio::{SKError, SKResult, UnitigFileReader};
-use obilayeredmap::{EvidenceKind, Layer, LayeredMap, MphfLayer, OLMError};
+use obilayeredmap::{IndexMode, Layer, LayeredMap, MphfLayer, OLMError};
 use obilayeredmap::meta::PartitionMeta;
 use crate::partition::KmerPartition;
@@ -52,18 +52,17 @@ pub(crate) enum SrcLayerData {
 }
 impl SrcLayerData {
-    pub(crate) fn open(layer_dir: &Path, mode: MergeMode) -> SKResult<Self> {
+    pub(crate) fn open(layer_dir: &Path, merge_mode: MergeMode, index_mode: &IndexMode) -> SKResult<Self> {
        let presence_dir = layer_dir.join("presence");
        let counts_dir   = layer_dir.join("counts");
-        match mode {
+        match merge_mode {
            MergeMode::Presence => {
                if presence_dir.exists() {
-                    let mphf = MphfLayer::open(layer_dir).map_err(olm_to_sk)?;
+                    let mphf = MphfLayer::open(layer_dir, index_mode).map_err(olm_to_sk)?;
                    let mat  = PersistentBitMatrix::open(&presence_dir).map_err(SKError::Io)?;
                    Ok(SrcLayerData::Presence(mphf, mat))
                } else if counts_dir.exists() {
-                    // Source is a count index; treat count > 0 as present via ColBuilder::Bit.
+                    let mphf = MphfLayer::open(layer_dir, index_mode).map_err(olm_to_sk)?;
                    let mphf = MphfLayer::open(layer_dir).map_err(olm_to_sk)?;
                    let mat  = PersistentCompactIntMatrix::open(&counts_dir).map_err(SKError::Io)?;
                    Ok(SrcLayerData::Count(mphf, mat))
                } else {
@@ -72,7 +71,7 @@ impl SrcLayerData {
            }
            MergeMode::Count => {
                if counts_dir.exists() {
-                    let mphf = MphfLayer::open(layer_dir).map_err(olm_to_sk)?;
+                    let mphf = MphfLayer::open(layer_dir, index_mode).map_err(olm_to_sk)?;
                    let mat  = PersistentCompactIntMatrix::open(&counts_dir).map_err(SKError::Io)?;
                    Ok(SrcLayerData::Count(mphf, mat))
                } else {
@@ -116,7 +115,7 @@ fn load_meta(dir: &Path) -> SKResult<PartitionMeta> {
        Err(e) if matches!(e, OLMError::Io(ref io_e) if io_e.kind() == std::io::ErrorKind::NotFound) => {
            let mut n = 0usize;
            while dir.join(format!("layer_{n}")).exists() { n += 1; }
-            let m = PartitionMeta { n_layers: n };
+            let m = PartitionMeta { n_layers: n, mode: IndexMode::default() };
            m.save(dir).map_err(olm_to_sk)?;
            Ok(m)
        }
@@ -217,12 +216,12 @@ impl KmerPartition {
                uw.write(&unitig)?;
            }
            uw.close()?;
-            Layer::<()>::build(&new_layer_dir, block_bits, &EvidenceKind::Exact).map_err(olm_to_sk)?;
+            Layer::<()>::build(&new_layer_dir, block_bits, &IndexMode::Exact).map_err(olm_to_sk)?;
        }
        drop(g);
        let new_mphf = if any_new {
-            Some(MphfLayer::open(&new_layer_dir).map_err(olm_to_sk)?)
+            Some(MphfLayer::open(&new_layer_dir, &IndexMode::Exact).map_err(olm_to_sk)?)
        } else {
            None
        };
@@ -304,7 +303,7 @@ impl KmerPartition {
            for l in 0..src_meta.n_layers {
                let src_layer_dir = src_index_dir.join(format!("layer_{l}"));
                let reader = UnitigFileReader::open_sequential(&src_layer_dir.join("unitigs.bin"))?;
-                let src_data = SrcLayerData::open(&src_layer_dir, mode)?;
+                let src_data = SrcLayerData::open(&src_layer_dir, mode, &src_meta.mode)?;
                for (kmer, _, _) in reader.iter_indexed_canonical_kmers() {
                    let values = src_data.lookup(kmer, *src_n);
@@ -3,7 +3,7 @@ use std::path::Path;
 use obicompactvec::{PersistentBitMatrix, PersistentCompactIntMatrix};
 use obikseq::{CanonicalKmer, RoutableSuperKmer};
 use obiskio::{SKError, SKResult};
-use obilayeredmap::{MphfLayer, OLMError};
+use obilayeredmap::{IndexMode, MphfLayer, OLMError};
 use obilayeredmap::meta::PartitionMeta;
 use crate::partition::KmerPartition;
@@ -27,25 +27,25 @@ enum QueryLayer {
 }
 impl QueryLayer {
-    fn open(layer_dir: &Path, with_counts: bool) -> SKResult<Self> {
+    fn open(layer_dir: &Path, with_counts: bool, mode: &IndexMode) -> SKResult<Self> {
        let presence_dir = layer_dir.join("presence");
        let counts_dir   = layer_dir.join("counts");
        if with_counts && counts_dir.exists() {
-            let mphf = MphfLayer::open(layer_dir).map_err(olm_to_sk)?;
+            let mphf = MphfLayer::open(layer_dir, mode).map_err(olm_to_sk)?;
            let mat  = PersistentCompactIntMatrix::open(&counts_dir).map_err(SKError::Io)?;
            Ok(QueryLayer::Count(mphf, mat))
        } else if presence_dir.exists() {
-            let mphf = MphfLayer::open(layer_dir).map_err(olm_to_sk)?;
+            let mphf = MphfLayer::open(layer_dir, mode).map_err(olm_to_sk)?;
            let mat  = PersistentBitMatrix::open(&presence_dir).map_err(SKError::Io)?;
            Ok(QueryLayer::Presence(mphf, mat))
        } else if counts_dir.exists() {
            // presence query on a count index — return counts as-is
-            let mphf = MphfLayer::open(layer_dir).map_err(olm_to_sk)?;
+            let mphf = MphfLayer::open(layer_dir, mode).map_err(olm_to_sk)?;
            let mat  = PersistentCompactIntMatrix::open(&counts_dir).map_err(SKError::Io)?;
            Ok(QueryLayer::Count(mphf, mat))
        } else {
-            let mphf = MphfLayer::open(layer_dir).map_err(olm_to_sk)?;
+            let mphf = MphfLayer::open(layer_dir, mode).map_err(olm_to_sk)?;
            Ok(QueryLayer::SetOnly(mphf))
        }
    }
@@ -102,7 +102,7 @@ impl KmerPartition {
        let meta = PartitionMeta::load(&index_dir).map_err(olm_to_sk)?;
        let layers: Vec<QueryLayer> = (0..meta.n_layers)
-            .map(|i| QueryLayer::open(&index_dir.join(format!("layer_{i}")), with_counts))
+            .map(|i| QueryLayer::open(&index_dir.join(format!("layer_{i}")), with_counts, &meta.mode))
            .collect::<SKResult<_>>()?;
        Ok(superkmers
@@ -8,7 +8,7 @@ use obicompactvec::{PersistentBitMatrixBuilder,
                    PersistentCompactIntVecBuilder};
 use obidebruinj::GraphDeBruijn;
 use obiskio::{SKError, SKResult, UnitigFileReader};
-use obilayeredmap::{EvidenceKind, Layer, MphfLayer, OLMError};
+use obilayeredmap::{IndexMode, Layer, MphfLayer, OLMError};
 use obilayeredmap::meta::PartitionMeta;
 use crate::filter::KmerFilter;
@@ -67,7 +67,7 @@ fn load_meta(dir: &Path) -> SKResult<PartitionMeta> {
        Err(e) if matches!(e, OLMError::Io(ref io_e) if io_e.kind() == std::io::ErrorKind::NotFound) => {
            let mut n = 0usize;
            while dir.join(format!("layer_{n}")).exists() { n += 1; }
-            let m = PartitionMeta { n_layers: n };
+            let m = PartitionMeta { n_layers: n, mode: IndexMode::default() };
            m.save(dir).map_err(olm_to_sk)?;
            Ok(m)
        }
@@ -117,7 +117,7 @@ impl KmerPartition {
            if !unitigs_path.exists() { continue; }
            let reader = UnitigFileReader::open_sequential(&unitigs_path)?;
-            let src_data = SrcLayerData::open(&src_layer_dir, mode)?;
+            let src_data = SrcLayerData::open(&src_layer_dir, mode, &src_meta.mode)?;
            for (kmer, _, _) in reader.iter_indexed_canonical_kmers() {
                let row = src_data.lookup(kmer, n_genomes);
@@ -146,8 +146,8 @@ impl KmerPartition {
        uw.close()?;
        drop(g);
-        Layer::<()>::build(&dst_layer_dir, block_bits, &EvidenceKind::Exact).map_err(olm_to_sk)?;
+        Layer::<()>::build(&dst_layer_dir, block_bits, &IndexMode::Exact).map_err(olm_to_sk)?;
-        let dst_mphf = MphfLayer::open(&dst_layer_dir).map_err(olm_to_sk)?;
+        let dst_mphf = MphfLayer::open(&dst_layer_dir, &IndexMode::Exact).map_err(olm_to_sk)?;
        // ── Prepare matrix builders (one column per genome) ───────────────────
        let data_dir = match mode {
@@ -182,7 +182,7 @@ impl KmerPartition {
            if !unitigs_path.exists() { continue; }
            let reader = UnitigFileReader::open_sequential(&unitigs_path)?;
-            let src_data = SrcLayerData::open(&src_layer_dir, mode)?;
+            let src_data = SrcLayerData::open(&src_layer_dir, mode, &src_meta.mode)?;
            for (kmer, _, _) in reader.iter_indexed_canonical_kmers() {
                let row = src_data.lookup(kmer, n_genomes);
@@ -199,7 +199,7 @@ impl KmerPartition {
        for b in builders { b.close()?; }
        write_matrix_meta(&data_dir, n_new, n_genomes).map_err(SKError::Io)?;
-        PartitionMeta { n_layers: 1 }.save(&dst_index_dir).map_err(olm_to_sk)?;
+        PartitionMeta { n_layers: 1, mode: IndexMode::Exact }.save(&dst_index_dir).map_err(olm_to_sk)?;
        Ok(())
    }
@@ -10,7 +10,7 @@ use obikseq::CanonicalKmer;
 use obiskio::{UnitigFileReader, UnitigFileWriter};
 use crate::error::{OLMError, OLMResult};
-use crate::meta::EvidenceKind;
+use crate::meta::IndexMode;
 use crate::mphf_layer::MphfLayer;
 pub(crate) use crate::mphf_layer::UNITIGS_FILE;
@@ -62,8 +62,8 @@ pub struct Hit<T = ()> {
 // ── Common read path ──────────────────────────────────────────────────────────
 impl<D: LayerData> Layer<D> {
-    pub fn open(path: &Path) -> OLMResult<Self> {
+    pub fn open(path: &Path, mode: &IndexMode) -> OLMResult<Self> {
-        let mphf = MphfLayer::open(path)?;
+        let mphf = MphfLayer::open(path, mode)?;
        let data = D::open(path)?;
        Ok(Self { mphf, data })
    }
@@ -92,18 +92,13 @@ impl<D: LayerData> Layer<D> {
        MphfLayer::build_approx_evidence(layer_dir, b, z)
    }
    /// Dispatch to `build_exact_evidence` or `build_approx_evidence`.
    /// `block_bits` is forwarded to exact evidence only.
    pub fn build_evidence(layer_dir: &Path, kind: &EvidenceKind, block_bits: u8) -> OLMResult<usize> {
        MphfLayer::build_evidence(layer_dir, kind, block_bits)
    }
 }
 // ── Mode 1 — set membership ───────────────────────────────────────────────────
 impl Layer<()> {
-    pub fn build(out_dir: &Path, block_bits: u8, evidence_kind: &EvidenceKind) -> OLMResult<usize> {
+    pub fn build(out_dir: &Path, block_bits: u8, mode: &IndexMode) -> OLMResult<usize> {
-        MphfLayer::build(out_dir, block_bits, evidence_kind, &mut |_, _| Ok(()))
+        MphfLayer::build(out_dir, block_bits, mode, &mut |_, _| Ok(()))
    }
    /// Create a presence matrix for a set-membership layer (first merge).
@@ -126,7 +121,7 @@ impl Layer<PersistentCompactIntMatrix> {
    pub fn build(
        out_dir: &Path,
        block_bits: u8,
-        evidence_kind: &EvidenceKind,
+        mode: &IndexMode,
        count_of: impl Fn(CanonicalKmer) -> u32,
    ) -> OLMResult<usize> {
        let n = UnitigFileReader::open_sequential(&out_dir.join(UNITIGS_FILE))?.n_kmers();
@@ -134,7 +129,7 @@ impl Layer<PersistentCompactIntMatrix> {
        let mut mb = PersistentCompactIntMatrixBuilder::new(n, &counts_dir)
            .map_err(OLMError::Io)?;
        let mut col = mb.add_col().map_err(OLMError::Io)?;
-        let n_built = MphfLayer::build(out_dir, block_bits, evidence_kind, &mut |slot, kmer| {
+        let n_built = MphfLayer::build(out_dir, block_bits, mode, &mut |slot, kmer| {
            col.set(slot, count_of(kmer));
            Ok(())
        })?;
@@ -146,10 +141,10 @@ impl Layer<PersistentCompactIntMatrix> {
    pub fn build_from_map(
        out_dir: &Path,
        block_bits: u8,
-        evidence_kind: &EvidenceKind,
+        mode: &IndexMode,
        counts: &HashMap<CanonicalKmer, u32>,
    ) -> OLMResult<usize> {
-        Self::build(out_dir, block_bits, evidence_kind, |kmer| counts.get(&kmer).copied().unwrap_or(0))
+        Self::build(out_dir, block_bits, mode, |kmer| counts.get(&kmer).copied().unwrap_or(0))
    }
 }
@@ -179,7 +174,7 @@ impl Layer<PersistentBitMatrix> {
    pub fn build_presence(
        out_dir: &Path,
        block_bits: u8,
-        evidence_kind: &EvidenceKind,
+        mode: &IndexMode,
        n_genomes: usize,
        present_in: impl Fn(CanonicalKmer, usize) -> bool,
    ) -> OLMResult<usize> {
@@ -189,7 +184,7 @@ impl Layer<PersistentBitMatrix> {
        let mut cols: Vec<_> = (0..n_genomes)
            .map(|_| mb.add_col().map_err(OLMError::Io))
            .collect::<OLMResult<_>>()?;
-        let n_built = MphfLayer::build(out_dir, block_bits, evidence_kind, &mut |slot, kmer| {
+        let n_built = MphfLayer::build(out_dir, block_bits, mode, &mut |slot, kmer| {
            for (g, col) in cols.iter_mut().enumerate() {
                col.set(slot, present_in(kmer, g));
            }
@@ -11,5 +11,5 @@ pub use error::{OLMError, OLMResult};
 pub use layer::{Hit, Layer, LayerData};
 pub use layered_store::LayeredStore;
 pub use map::LayeredMap;
-pub use meta::{EvidenceKind, LayerMeta};
+pub use meta::{IndexMode, PartitionMeta};
 pub use mphf_layer::MphfLayer;
@@ -5,11 +5,10 @@ use std::path::{Path, PathBuf};
 use obicompactvec::PersistentCompactIntMatrix;
 use obikseq::CanonicalKmer;
 use obiskio::{UnitigFileWriter, DEFAULT_BLOCK_BITS};
 use crate::meta::EvidenceKind;
 use crate::error::OLMResult;
 use crate::layer::{Hit, Layer, LayerData};
-use crate::meta::PartitionMeta;
+use crate::meta::{IndexMode, PartitionMeta};
 /// Layered kmer index for a single partition.
 ///
@@ -26,39 +25,26 @@ pub struct LayeredMap<D: LayerData = ()> {
 impl<D: LayerData> LayeredMap<D> {
    /// Open an existing layered index at `root`.
    /// The mode is read once from `PartitionMeta` and applied to all layers.
    pub fn open(root: &Path) -> OLMResult<Self> {
        let meta = PartitionMeta::load(root)?;
        let layers = (0..meta.n_layers)
-            .map(|i| Layer::<D>::open(&layer_dir(root, i)))
+            .map(|i| Layer::<D>::open(&layer_dir(root, i), &meta.mode))
            .collect::<OLMResult<Vec<_>>>()?;
-        Ok(Self {
+        Ok(Self { root: root.to_owned(), meta, layers })
            root: root.to_owned(),
            meta,
            layers,
        })
    }
-    /// Create a new, empty layered index at `root`.
+    /// Create a new, empty layered index at `root` with the given mode.
-    pub fn create(root: &Path) -> OLMResult<Self> {
+    pub fn create(root: &Path, mode: IndexMode) -> OLMResult<Self> {
        fs::create_dir_all(root)?;
-        let meta = PartitionMeta::new();
+        let meta = PartitionMeta::new(mode);
        meta.save(root)?;
-        Ok(Self {
+        Ok(Self { root: root.to_owned(), meta, layers: Vec::new() })
            root: root.to_owned(),
            meta,
            layers: Vec::new(),
        })
    }
-    /// Return the number of layers in this index.
+    pub fn n_layers(&self) -> usize { self.layers.len() }
-    pub fn n_layers(&self) -> usize {
+    pub fn layer(&self, i: usize) -> &Layer<D> { &self.layers[i] }
-        self.layers.len()
+    pub fn mode(&self) -> &IndexMode { &self.meta.mode }
    }
    /// Return a reference to the `i`-th layer.
    pub fn layer(&self, i: usize) -> &Layer<D> {
        &self.layers[i]
    }
    /// Query `kmer` across all layers. Returns `(layer_index, Hit)` on match.
    pub fn query(&self, kmer: CanonicalKmer) -> Option<(usize, Hit<D::Item>)> {
@@ -68,17 +54,15 @@ impl<D: LayerData> LayeredMap<D> {
            .find_map(|(i, layer)| layer.query(kmer).map(|hit| (i, hit)))
    }
    /// Return a `UnitigFileWriter` for the next layer to be built.
    pub fn next_layer_writer(&self) -> OLMResult<UnitigFileWriter> {
        let dir = layer_dir(&self.root, self.layers.len());
        Layer::<D>::unitig_writer(&dir)
    }
    /// Append a new layer to the index.
    fn append_layer(&mut self) -> OLMResult<()> {
        let i = self.layers.len();
        let dir = layer_dir(&self.root, i);
-        self.layers.push(Layer::<D>::open(&dir)?);
+        self.layers.push(Layer::<D>::open(&dir, &self.meta.mode)?);
        self.meta.n_layers = self.layers.len();
        self.meta.save(&self.root)?;
        Ok(())
@@ -91,7 +75,7 @@ impl LayeredMap<()> {
    pub fn push_layer(&mut self) -> OLMResult<usize> {
        let i = self.layers.len();
        let dir = layer_dir(&self.root, i);
-        Layer::<()>::build(&dir, DEFAULT_BLOCK_BITS, &EvidenceKind::Exact)?;
+        Layer::<()>::build(&dir, DEFAULT_BLOCK_BITS, &self.meta.mode)?;
        self.append_layer()?;
        Ok(i)
    }
@@ -103,15 +87,12 @@ impl LayeredMap<PersistentCompactIntMatrix> {
    pub fn push_layer(&mut self, count_of: impl Fn(CanonicalKmer) -> u32) -> OLMResult<usize> {
        let i = self.layers.len();
        let dir = layer_dir(&self.root, i);
-        Layer::<PersistentCompactIntMatrix>::build(&dir, DEFAULT_BLOCK_BITS, &EvidenceKind::Exact, count_of)?;
+        Layer::<PersistentCompactIntMatrix>::build(&dir, DEFAULT_BLOCK_BITS, &self.meta.mode, count_of)?;
        self.append_layer()?;
        Ok(i)
    }
-    pub fn push_layer_from_map(
+    pub fn push_layer_from_map(&mut self, counts: &HashMap<CanonicalKmer, u32>) -> OLMResult<usize> {
        &mut self,
        counts: &HashMap<CanonicalKmer, u32>,
    ) -> OLMResult<usize> {
        self.push_layer(|kmer| counts.get(&kmer).copied().unwrap_or(0))
    }
 }
@@ -6,64 +6,44 @@ use serde::{Deserialize, Serialize};
 use crate::error::OLMResult;
 const META_FILE: &str = "meta.json";
 const LAYER_META_FILE: &str = "layer_meta.json";
-// ── Layer-level metadata ──────────────────────────────────────────────────────
+// ── IndexMode ─────────────────────────────────────────────────────────────────
-/// Describes the evidence bundle stored alongside the MPHF for one layer.
+/// Evidence mode for an entire partitioned index — homogeneous across all layers.
 ///
 /// Determined once at build time; stored in `PartitionMeta` (`meta.json`).
 /// All layers within an index share the same mode.
 #[derive(Debug, Clone, Serialize, Deserialize)]
 #[serde(tag = "type", rename_all = "snake_case")]
-pub enum EvidenceKind {
+pub enum IndexMode {
    /// Exact evidence: `evidence.bin` + `unitigs.bin.idx`.  Zero false positives.
    Exact,
    /// Approximate evidence: `fingerprint.bin` only.
-    /// `b` — fingerprint bits; false-positive rate per k-mer query = 1/2^b.
+    /// `b` — fingerprint bits per slot; false-positive rate ≈ 1/2^b per query.
-    /// `z` — consecutive k-mers that must all match (Findere trick);
+    /// `z` — Findere consecutive-kmer parameter (build-time only; not used at query time).
    ///        effective FP rate per sequencing read ≈ W / 2^(b·z)
    ///        where W = L - k - z + 2 is the number of windows in a read of length L.
    Approx { b: u8, z: u8 },
    /// Hybrid: both `fingerprint.bin` and `evidence.bin` + `unitigs.bin.idx`.
    /// `find()` uses the fingerprint (O(1), approx); `find_strict()` uses exact evidence.
    Hybrid { b: u8, z: u8 },
 }
-#[derive(Debug, Clone, Serialize, Deserialize)]
+impl Default for IndexMode {
 pub struct LayerMeta {
    pub evidence: EvidenceKind,
 }
 impl Default for EvidenceKind {
    fn default() -> Self { Self::Exact }
 }
-impl LayerMeta {
+// ── PartitionMeta ─────────────────────────────────────────────────────────────
    pub fn exact() -> Self {
        Self { evidence: EvidenceKind::Exact }
    }
    pub fn approx(b: u8, z: u8) -> Self {
        Self { evidence: EvidenceKind::Approx { b, z } }
    }
    pub fn load(layer_dir: &Path) -> OLMResult<Self> {
        let f = File::open(layer_dir.join(LAYER_META_FILE))?;
        Ok(serde_json::from_reader(f)?)
    }
    pub fn save(&self, layer_dir: &Path) -> OLMResult<()> {
        let f = File::create(layer_dir.join(LAYER_META_FILE))?;
        serde_json::to_writer_pretty(f, self)?;
        Ok(())
    }
 }
 // ── Partition-level metadata ──────────────────────────────────────────────────
 /// Index-level metadata stored in `meta.json` at the root of a partition index.
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct PartitionMeta {
    pub n_layers: usize,
    #[serde(default)]
    pub mode: IndexMode,
 }
 impl PartitionMeta {
-    pub fn new() -> Self {
+    pub fn new(mode: IndexMode) -> Self {
-        Self { n_layers: 0 }
+        Self { n_layers: 0, mode }
    }
    pub fn load(dir: &Path) -> OLMResult<Self> {
@@ -79,5 +59,5 @@ impl PartitionMeta {
 }
 impl Default for PartitionMeta {
-    fn default() -> Self { Self::new() }
+    fn default() -> Self { Self::new(IndexMode::Exact) }
 }
@@ -1,5 +1,5 @@
 use std::fs;
-use std::path::Path;
+use std::path::{Path, PathBuf};
 use cacheline_ef::{CachelineEf, CachelineEfVec};
 use epserde::prelude::*;
@@ -10,7 +10,7 @@ use ptr_hash::{PtrHash, PtrHashParams, bucket_fn::CubicEps, hash::Xx64};
 use crate::error::{OLMError, OLMResult};
 use crate::evidence::{Evidence, EvidenceWriter};
 use crate::fingerprint::{FingerprintVec, FingerprintVecWriter};
-use crate::meta::{EvidenceKind, LayerMeta};
+use crate::meta::IndexMode;
 pub(crate) const MPHF_FILE:        &str = "mphf.bin";
 pub(crate) const UNITIGS_FILE:     &str = "unitigs.bin";
@@ -19,19 +19,22 @@ pub(crate) const FINGERPRINT_FILE: &str = "fingerprint.bin";
 pub(crate) type Mphf = PtrHash<u64, CubicEps, CachelineEfVec<Vec<CachelineEf>>, Xx64, Vec<u8>>;
-// ── Evidence store ────────────────────────────────────────────────────────────
+// ── LayerEvidence ─────────────────────────────────────────────────────────────
 enum LayerEvidence {
    Exact  { evidence: Evidence, unitigs: UnitigFileReader },
-    Approx { fingerprint: FingerprintVec },
+    Approx { fingerprint: FingerprintVec, unitigs_path: PathBuf },
    Hybrid { evidence: Evidence, unitigs: UnitigFileReader, fingerprint: FingerprintVec },
 }
 // ── MphfLayer ─────────────────────────────────────────────────────────────────
 /// Autonomous kmer → slot mapping for one layer.
 ///
-/// Dispatches queries to exact or approximate evidence transparently based on
+/// Two query methods:
-/// the `layer_meta.json` written at build time.
+/// - [`find`](Self::find) — O(1), uses fingerprint (Approx/Hybrid) or exact evidence (Exact).
 /// - [`find_strict`](Self::find_strict) — always exact; O(1) on Exact/Hybrid layers,
 ///   O(n) sequential scan on Approx layers.
 pub struct MphfLayer {
    mphf: Mphf,
    ev:   LayerEvidence,
@@ -39,21 +42,31 @@ pub struct MphfLayer {
 }
 impl MphfLayer {
-    pub fn open(dir: &Path) -> OLMResult<Self> {
+    /// Open a layer using the index-level `mode` determined at `LayeredMap` open time.
-        let meta = LayerMeta::load(dir)?;
+    /// No per-layer metadata file is read.
    pub fn open(dir: &Path, mode: &IndexMode) -> OLMResult<Self> {
        let mphf: Mphf = Mphf::load_full(&dir.join(MPHF_FILE))
            .map_err(|e| OLMError::InvalidLayer(e.to_string()))?;
-        let (ev, n) = match meta.evidence {
+        let (ev, n) = match mode {
-            EvidenceKind::Exact => {
+            IndexMode::Exact => {
                let evidence = Evidence::open(&dir.join(EVIDENCE_FILE))?;
                let n = evidence.len();
                // open() auto-detects: uses direct access since exact layers always have .idx
                let unitigs = UnitigFileReader::open(&dir.join(UNITIGS_FILE))?;
                (LayerEvidence::Exact { evidence, unitigs }, n)
            }
-            EvidenceKind::Approx { .. } => {
+            IndexMode::Approx { .. } => {
                let fingerprint = FingerprintVec::open(&dir.join(FINGERPRINT_FILE))?;
                let n = fingerprint.n();
-                (LayerEvidence::Approx { fingerprint }, n)
+                let unitigs_path = dir.join(UNITIGS_FILE);
                (LayerEvidence::Approx { fingerprint, unitigs_path }, n)
            }
            IndexMode::Hybrid { .. } => {
                let evidence    = Evidence::open(&dir.join(EVIDENCE_FILE))?;
                let fingerprint = FingerprintVec::open(&dir.join(FINGERPRINT_FILE))?;
                let n = evidence.len();
                let unitigs = UnitigFileReader::open(&dir.join(UNITIGS_FILE))?;
                (LayerEvidence::Hybrid { evidence, unitigs, fingerprint }, n)
            }
        };
        Ok(Self { mphf, ev, n })
@@ -61,25 +74,16 @@ impl MphfLayer {
    // ── Query API ─────────────────────────────────────────────────────────────
-    /// Transparent dispatch: routes to `find_exact` or `find_approx` based on
+    /// O(1) lookup — dispatches automatically:
-    /// the evidence loaded at `open` time.
+    /// - Exact:  evidence + `verify_canonical_kmer`, zero false positives.
    /// - Approx: fingerprint check, false-positive rate ≈ 1/2^b.
    /// - Hybrid: fingerprint check (fast path), zero false positives via `find_strict`.
    #[inline]
    pub fn find(&self, kmer: CanonicalKmer) -> Option<usize> {
        match &self.ev {
            LayerEvidence::Exact { .. }  => self.find_exact(kmer),
            LayerEvidence::Approx { .. } => self.find_approx(kmer),
        }
    }
    /// Exact lookup: zero false positives.  Panics if the layer was opened with
    /// approximate evidence.
    #[inline]
    pub fn find_exact(&self, kmer: CanonicalKmer) -> Option<usize> {
        let LayerEvidence::Exact { evidence, unitigs } = &self.ev else {
            panic!("find_exact called on an approximate layer");
        };
        let slot = self.mphf.index(&kmer.raw());
        if slot >= self.n { return None; }
        match &self.ev {
            LayerEvidence::Exact { evidence, unitigs } => {
                let (chunk_id, rank) = evidence.decode(slot);
                if unitigs.verify_canonical_kmer(chunk_id as usize, rank as usize, kmer) {
                    Some(slot)
@@ -87,17 +91,41 @@ impl MphfLayer {
                    None
                }
            }
            LayerEvidence::Approx { fingerprint, .. } |
            LayerEvidence::Hybrid { fingerprint, .. } => {
                if fingerprint.matches(slot, kmer.seq_hash()) { Some(slot) } else { None }
            }
        }
    }
-    /// Approximate lookup: false-positive rate 1/2^b per k-mer query.  Panics
+    /// Always-exact lookup — zero false positives regardless of mode.
-    /// if the layer was opened with exact evidence.
+    ///
-    #[inline]
+    /// - Exact/Hybrid: O(1) via evidence + `verify_canonical_kmer`.
-    pub fn find_approx(&self, kmer: CanonicalKmer) -> Option<usize> {
+    /// - Approx: O(n) sequential scan of `unitigs.bin` to confirm the kmer
-        let LayerEvidence::Approx { fingerprint } = &self.ev else {
+    ///   that owns the slot, then exact comparison.
-            panic!("find_approx called on an exact layer");
+    pub fn find_strict(&self, kmer: CanonicalKmer) -> Option<usize> {
        };
        let slot = self.mphf.index(&kmer.raw());
        if slot >= self.n { return None; }
-        if fingerprint.matches(slot, kmer.seq_hash()) { Some(slot) } else { None }
+        match &self.ev {
            LayerEvidence::Exact  { evidence, unitigs } |
            LayerEvidence::Hybrid { evidence, unitigs, .. } => {
                let (chunk_id, rank) = evidence.decode(slot);
                if unitigs.verify_canonical_kmer(chunk_id as usize, rank as usize, kmer) {
                    Some(slot)
                } else {
                    None
                }
            }
            LayerEvidence::Approx { unitigs_path, .. } => {
                let reader = UnitigFileReader::open_sequential(unitigs_path).ok()?;
                for stored in reader.iter_canonical_kmers() {
                    if self.mphf.index(&stored.raw()) == slot {
                        return if stored == kmer { Some(slot) } else { None };
                    }
                }
                None
            }
        }
    }
    pub fn n(&self) -> usize { self.n }
@@ -109,19 +137,7 @@ impl MphfLayer {
        Ok(UnitigFileWriter::create(&dir.join(UNITIGS_FILE))?)
    }
    /// Dispatch to `build_exact_evidence` or `build_approx_evidence` based on
    /// `kind`.  `block_bits` is forwarded to exact evidence only.
    pub fn build_evidence(dir: &Path, kind: &EvidenceKind, block_bits: u8) -> OLMResult<usize> {
        match kind {
            EvidenceKind::Exact           => Self::build_exact_evidence(dir, block_bits),
            EvidenceKind::Approx { b, z } => Self::build_approx_evidence(dir, *b, *z),
        }
    }
    /// Build `evidence.bin` + `unitigs.bin.idx` from `unitigs.bin` + `mphf.bin`.
    ///
    /// `block_bits` controls the `.idx` block size (2^block_bits chunks per block).
    /// Uses sequential iteration — no `.idx` required on entry.
    pub fn build_exact_evidence(dir: &Path, block_bits: u8) -> OLMResult<usize> {
        let unitig_path = dir.join(UNITIGS_FILE);
        let unitigs = UnitigFileReader::open_sequential(&unitig_path)?;
@@ -130,7 +146,6 @@ impl MphfLayer {
        if n == 0 {
            fs::File::create(dir.join(EVIDENCE_FILE))?;
            build_unitig_idx(&unitig_path, block_bits)?;
            LayerMeta::exact().save(dir)?;
            return Ok(0);
        }
@@ -156,13 +171,10 @@ impl MphfLayer {
        ev.write(&dir.join(EVIDENCE_FILE))?;
        build_unitig_idx(&unitig_path, block_bits)?;
        LayerMeta::exact().save(dir)?;
        Ok(n)
    }
    /// Build `fingerprint.bin` from `unitigs.bin` + `mphf.bin`.
    /// `b` — fingerprint bits (1..=64); `z` — Findere consecutive k-mer
    /// parameter (≥1).  No `.idx` is written.
    pub fn build_approx_evidence(dir: &Path, b: u8, z: u8) -> OLMResult<usize> {
        if b == 0 || b > 64 {
            return Err(OLMError::InvalidLayer("fingerprint width must be 1..=64".into()));
@@ -176,7 +188,6 @@ impl MphfLayer {
        if n == 0 {
            FingerprintVecWriter::new(0, b).write(&dir.join(FINGERPRINT_FILE))?;
            LayerMeta::approx(b, z).save(dir)?;
            return Ok(0);
        }
@@ -194,139 +205,113 @@ impl MphfLayer {
        }
        fw.write(&dir.join(FINGERPRINT_FILE))?;
        LayerMeta::approx(b, z).save(dir)?;
        Ok(n)
    }
-    /// Build MPHF then evidence from the unitigs file already present in `dir`.
+    /// Build MPHF + evidence from `unitigs.bin` already present in `dir`.
    ///
-    /// - Exact: `.idx` is built for pass-1 parallel construction and kept for
+    /// `fill_slot(slot, kmer)` is called once per kmer in all modes.
-    ///   query-time kmer verification.  `evidence.bin` is written.
+    /// No `layer_meta.json` is written — the mode is an index-level property
-    /// - Approx: pass-1 uses `open_sequential` + `par_bridge` — no `.idx` is
+    /// stored in `PartitionMeta`.
    ///   ever created.  `fingerprint.bin` is written.
    ///
    /// `fill_slot(slot, kmer)` is called once per kmer in both modes.
    pub(crate) fn build(
        dir: &Path,
        block_bits: u8,
-        evidence_kind: &EvidenceKind,
+        mode: &IndexMode,
        fill_slot: &mut impl FnMut(usize, CanonicalKmer) -> OLMResult<()>,
    ) -> OLMResult<usize> {
        use rayon::prelude::*;
        let unitig_path = dir.join(UNITIGS_FILE);
        match evidence_kind {
            // ── Exact path ────────────────────────────────────────────────────
            // .idx is built LAST, once evidence.bin is written, so it is never
            // present during construction — only at query time.
            EvidenceKind::Exact => {
        let n = UnitigFileReader::open_sequential(&unitig_path)?.n_kmers();
-                let keys = CanonicalKmerIter::new(&unitig_path)
+
-                    .map_err(|e| match e {
+        let sk_to_olm = |e: obiskio::SKError| match e {
            obiskio::SKError::Io(io) => OLMError::Io(io),
            e => OLMError::InvalidLayer(e.to_string()),
-                    })?;
+        };
        // ── Empty layer ───────────────────────────────────────────────────────
        if n == 0 {
                    fs::File::create(dir.join(EVIDENCE_FILE))?;
            let mphf: Mphf =
                Mphf::try_new(&[] as &[u64], PtrHashParams::<CubicEps>::default())
                    .ok_or_else(|| OLMError::Mphf("construction failed".into()))?;
            mphf.store(&dir.join(MPHF_FILE))
                .map_err(|e| OLMError::InvalidLayer(e.to_string()))?;
-                    LayerMeta::exact().save(dir)?;
+            match mode {
                IndexMode::Exact | IndexMode::Hybrid { .. } => {
                    fs::File::create(dir.join(EVIDENCE_FILE))?;
                    build_unitig_idx(&unitig_path, block_bits)?;
                }
                IndexMode::Approx { b, .. } => {
                    FingerprintVecWriter::new(0, *b).write(&dir.join(FINGERPRINT_FILE))?;
                }
            }
            if let IndexMode::Hybrid { b, .. } = mode {
                FingerprintVecWriter::new(0, *b).write(&dir.join(FINGERPRINT_FILE))?;
            }
            return Ok(0);
        }
-                // Pass 1 — MPHF construction via clonable mmap iterator
+        // ── Pass 1: MPHF via clonable mmap iterator ───────────────────────────
        let keys = CanonicalKmerIter::new(&unitig_path).map_err(sk_to_olm)?;
        let mphf: Mphf =
-                    Mphf::new_from_par_iter(n, keys.map(|k| k.raw()).par_bridge(), PtrHashParams::<CubicEps>::default());
+            Mphf::new_from_par_iter(n, keys.map(|k| k.raw()).par_bridge(),
                                    PtrHashParams::<CubicEps>::default());
        mphf.store(&dir.join(MPHF_FILE))
            .map_err(|e| OLMError::InvalidLayer(e.to_string()))?;
-                // Pass 2 — sequential: fill evidence.bin + callback
+        // ── Pass 2: fill evidence files + callback ────────────────────────────
        let unitigs2 = UnitigFileReader::open_sequential(&unitig_path)?;
                let mut ev   = EvidenceWriter::new(n);
        let mut seen = vec![0u8; (n + 7) / 8];
        match mode {
            IndexMode::Exact => {
                let mut ev = EvidenceWriter::new(n);
                for (kmer, chunk_id, rank) in unitigs2.iter_indexed_canonical_kmers() {
                    let slot = mphf.index(&kmer.raw());
-                    if slot >= n {
+                    if slot >= n { return Err(OLMError::Mphf("slot out of bounds".into())); }
-                        return Err(OLMError::Mphf("slot out of bounds".into()));
+                    let byte = slot / 8; let bit = 1u8 << (slot % 8);
-                    }
+                    if seen[byte] & bit != 0 { return Err(OLMError::Mphf("duplicate slot".into())); }
                    let byte = slot / 8;
                    let bit  = 1u8 << (slot % 8);
                    if seen[byte] & bit != 0 {
                        return Err(OLMError::Mphf("duplicate slot".into()));
                    }
                    seen[byte] |= bit;
                    ev.set(slot, chunk_id as u32, rank as u8);
                    fill_slot(slot, kmer)?;
                }
                ev.write(&dir.join(EVIDENCE_FILE))?;
                LayerMeta::exact().save(dir)?;
                // .idx built last: strictly for query-time kmer verification
                build_unitig_idx(&unitig_path, block_bits)?;
                Ok(n)
            }
-            // ── Approx path ───────────────────────────────────────────────────
+            IndexMode::Approx { b, .. } => {
            // No .idx is created at any point.
            EvidenceKind::Approx { b, z } => {
                let unitigs = UnitigFileReader::open_sequential(&unitig_path)?;
                let n = unitigs.n_kmers();
                if n == 0 {
                    FingerprintVecWriter::new(0, *b).write(&dir.join(FINGERPRINT_FILE))?;
                    let mphf: Mphf =
                        Mphf::try_new(&[] as &[u64], PtrHashParams::<CubicEps>::default())
                            .ok_or_else(|| OLMError::Mphf("construction failed".into()))?;
                    mphf.store(&dir.join(MPHF_FILE))
                        .map_err(|e| OLMError::InvalidLayer(e.to_string()))?;
                    LayerMeta::approx(*b, *z).save(dir)?;
                    return Ok(0);
                }
                // Pass 1 — MPHF construction via mmap-backed clonable iterator.
                // No .idx is created. par_bridge() parallelises the sequential scan;
                // Clone on CanonicalKmerRawIter shares the Arc<Mmap> and resets to pos 0.
                let keys = CanonicalKmerIter::new(&unitig_path)
                    .map_err(|e| match e {
                        obiskio::SKError::Io(io) => OLMError::Io(io),
                        e => OLMError::InvalidLayer(e.to_string()),
                    })?;
                let mphf: Mphf =
                    Mphf::new_from_par_iter(n, keys.map(|k| k.raw()).par_bridge(), PtrHashParams::<CubicEps>::default());
                mphf.store(&dir.join(MPHF_FILE))
                    .map_err(|e| OLMError::InvalidLayer(e.to_string()))?;
                // Pass 2 — sequential: fill fingerprint.bin + callback
                let unitigs2 = UnitigFileReader::open_sequential(&unitig_path)?;
                let mut fw = FingerprintVecWriter::new(n, *b);
                let mut seen = vec![0u8; (n + 7) / 8];
                for kmer in unitigs2.iter_canonical_kmers() {
                    let slot = mphf.index(&kmer.raw());
-                    if slot >= n {
+                    if slot >= n { return Err(OLMError::Mphf("slot out of bounds".into())); }
-                        return Err(OLMError::Mphf("slot out of bounds".into()));
+                    let byte = slot / 8; let bit = 1u8 << (slot % 8);
-                    }
+                    if seen[byte] & bit != 0 { return Err(OLMError::Mphf("duplicate slot".into())); }
                    let byte = slot / 8;
                    let bit  = 1u8 << (slot % 8);
                    if seen[byte] & bit != 0 {
                        return Err(OLMError::Mphf("duplicate slot".into()));
                    }
                    seen[byte] |= bit;
                    fw.set(slot, kmer.seq_hash());
                    fill_slot(slot, kmer)?;
                }
                fw.write(&dir.join(FINGERPRINT_FILE))?;
-                LayerMeta::approx(*b, *z).save(dir)?;
+            }
            IndexMode::Hybrid { b, .. } => {
                let mut ev = EvidenceWriter::new(n);
                let mut fw = FingerprintVecWriter::new(n, *b);
                for (kmer, chunk_id, rank) in unitigs2.iter_indexed_canonical_kmers() {
                    let slot = mphf.index(&kmer.raw());
                    if slot >= n { return Err(OLMError::Mphf("slot out of bounds".into())); }
                    let byte = slot / 8; let bit = 1u8 << (slot % 8);
                    if seen[byte] & bit != 0 { return Err(OLMError::Mphf("duplicate slot".into())); }
                    seen[byte] |= bit;
                    ev.set(slot, chunk_id as u32, rank as u8);
                    fw.set(slot, kmer.seq_hash());
                    fill_slot(slot, kmer)?;
                }
                ev.write(&dir.join(EVIDENCE_FILE))?;
                fw.write(&dir.join(FINGERPRINT_FILE))?;
                build_unitig_idx(&unitig_path, block_bits)?;
            }
        }
        Ok(n)
    }
 }
    }
 }
@@ -19,12 +19,13 @@ impl PathIter {
            file_buffer: Vec::new(),
        };
        for path in paths {
            // "-" is the stdin sentinel — pass it through without any extension
            // check or directory expansion.
            if path.as_os_str() == "-" {
                iter.file_buffer.push(path);
            } else if is_fasta_or_fastq(&path) {
                // Avoid stat() at construction time on network filesystems (Lustre, NFS)
                // where metadata operations can be 100s of milliseconds each.
            // Paths that look like sequence files are assumed to be files.
            // Anything else is treated as a potential directory and expanded lazily
            // in next(); read_dir errors are silently skipped.
            if is_fasta_or_fastq(&path) {
                iter.file_buffer.push(path);
            } else {
                iter.dir_stack.push(path);
@@ -24,10 +24,6 @@ impl<T: Copy + Default, const N: usize> Ring<T, N> {
        }
    }
    #[inline]
    fn is_empty(&self) -> bool {
        self.len == 0
    }
    #[inline]
    fn clear(&mut self) {
        self.len = 0;
        self.head = 0;
@@ -67,10 +63,6 @@ impl<T: Copy + Default, const N: usize> Ring<T, N> {
        }
    }
    /// Iterate over elements front-to-back (copies, since T: Copy).
    fn iter(&self) -> impl Iterator<Item = T> + '_ {
        (0..self.len).map(move |i| self.buf[(self.head + i) % N])
    }
 }
 // ── MmerItem ──────────────────────────────────────────────────────────────────
@@ -198,11 +198,16 @@ pub fn build_unitig_idx(unitigs_path: &Path, block_bits: u8) -> SKResult<()> {
 // ── Reader ────────────────────────────────────────────────────────────────────
-/// Read-only random-access view of a unitig file.
+/// Memory-mapped view of a unitig file, with optional direct-access index.
 ///
-/// The sequence file is memory-mapped; the block offset table is loaded into RAM
+/// Three constructors select the operating mode:
-/// on open.  Random access to chunk `i`: O(1 << block_bits) sequential mmap
+/// - [`open`](Self::open) — smart default: direct access if `.idx` exists, sequential otherwise.
-/// reads.  Sequential iteration: O(n) via a running-offset cursor.
+/// - [`open_sequential`](Self::open_sequential) — always sequential, ignores `.idx`.
 /// - [`open_direct_access`](Self::open_direct_access) — requires `.idx`, errors if absent.
 ///
 /// All positional methods (`chunk_start`, `verify_canonical_kmer`, …) work in
 /// both modes.  Without `.idx` they fall back to an O(i) sequential scan —
 /// correct but slower.
 pub struct UnitigFileReader {
    mmap:          Mmap,
    block_offsets: Vec<u32>,
@@ -214,28 +219,20 @@ pub struct UnitigFileReader {
 }
 impl UnitigFileReader {
-    /// Open with `.idx` — enables both sequential iteration and random access.
+    /// Smart default: opens with direct access if `.idx` is present, sequential otherwise.
    pub fn open(path: &Path) -> SKResult<Self> {
-        let file = File::open(path).map_err(SKError::Io)?;
+        if idx_path(path).exists() {
-        let mmap = unsafe { Mmap::map(&file).map_err(SKError::Io)? };
+            Self::open_direct_access(path)
-        let (n_unitigs, n_kmers, block_bits, block_offsets) = read_idx(&idx_path(path))?;
+        } else {
-        let k = obikseq::params::k();
+            Self::open_sequential(path)
-        Ok(Self {
+        }
            mmap,
            block_offsets,
            n_unitigs,
            n_kmers,
            k,
            block_bits,
            mask: (1usize << block_bits) - 1,
        })
    }
-    /// Open without `.idx` — sequential iteration only, no random access.
+    /// Always sequential — never reads `.idx` even if present.
    ///
-    /// Scans the binary file once to count chunks and k-mers.  Use when only
+    /// Scans the binary file once to count chunks and k-mers.
-    /// [`Self::iter_kmers`], [`Self::iter_canonical_kmers`], or
+    /// Positional access (`chunk_start`, `verify_canonical_kmer`) falls back to
-    /// [`Self::iter_indexed_canonical_kmers`] are needed.
+    /// O(i) sequential scan.
    pub fn open_sequential(path: &Path) -> SKResult<Self> {
        let file = File::open(path).map_err(SKError::Io)?;
        let mmap = unsafe { Mmap::map(&file).map_err(SKError::Io)? };
@@ -253,7 +250,7 @@ impl UnitigFileReader {
        Ok(Self {
            mmap,
-            block_offsets: Vec::new(), // empty → random access disabled
+            block_offsets: Vec::new(),
            n_unitigs,
            n_kmers,
            k,
@@ -262,16 +259,40 @@ impl UnitigFileReader {
        })
    }
    /// Requires `.idx` — errors if the companion index file is absent.
    ///
    /// Enables O(1 << block_bits) positional access to any chunk.
    /// Use only when direct access is architecturally required (query-time
    /// verification on an exact-evidence layer).
    pub fn open_direct_access(path: &Path) -> SKResult<Self> {
        let file = File::open(path).map_err(SKError::Io)?;
        let mmap = unsafe { Mmap::map(&file).map_err(SKError::Io)? };
        let (n_unitigs, n_kmers, block_bits, block_offsets) = read_idx(&idx_path(path))?;
        let k = obikseq::params::k();
        Ok(Self {
            mmap,
            block_offsets,
            n_unitigs,
            n_kmers,
            k,
            block_bits,
            mask: (1usize << block_bits) - 1,
        })
    }
    pub fn len(&self) -> usize { self.n_unitigs }
    pub fn is_empty(&self) -> bool { self.n_unitigs == 0 }
    pub fn n_kmers(&self) -> usize { self.n_kmers }
    pub fn block_bits(&self) -> u8 { self.block_bits }
    pub fn has_direct_access(&self) -> bool { !self.block_offsets.is_empty() }
-    /// Byte offset of the START of record `i` (the seql byte) in the mmap.
+    /// Byte offset of record `i` in the mmap.
    ///
    /// Fast path (O(1 << block_bits)) when `.idx` is loaded; degraded O(i)
    /// sequential scan otherwise.
    #[inline]
    fn chunk_start(&self, i: usize) -> usize {
-        assert!(!self.block_offsets.is_empty(),
+        if !self.block_offsets.is_empty() {
            "random access requires UnitigFileReader::open(); use open_sequential() for iteration only");
            if self.block_bits == 0 {
                return self.block_offsets[i] as usize;
            }
@@ -283,6 +304,14 @@ impl UnitigFileReader {
                offset += 1 + (seql_minus_k + self.k + 3) / 4;
            }
            offset
        } else {
            let mut offset = 0usize;
            for _ in 0..i {
                let seql_minus_k = self.mmap[offset] as usize;
                offset += 1 + (seql_minus_k + self.k + 3) / 4;
            }
            offset
        }
    }
    /// Nucleotide length of chunk `i`.
@@ -307,7 +336,9 @@ impl UnitigFileReader {
        extract_kmer_raw(&self.mmap[offset + 1..], j, self.k)
    }
-    /// `true` iff the k-mer at position `j` of chunk `i` equals `query` (canonical).
+    /// `true` iff the k-mer at position `j` of chunk `i` matches `query`.
    ///
    /// Works in both modes; O(i) scan when `.idx` is absent.
    #[inline]
    pub fn verify_canonical_kmer(&self, i: usize, j: usize, query: CanonicalKmer) -> bool {
        canonical_raw(self.raw_kmer(i, j), self.k) == query.raw()