docs: expand kmer indexing, filtering, and merging documentation
Expands MkDocs navigation and documentation for evidence elimination, the merge command, and kmer filtering. Refactors kmer representation to a generic `KmerOf<L>` type with a bitwise reverse complement algorithm. Unifies MPHF construction, introduces approximate fingerprint-based indexing, and updates the pipeline, chunkreader, and storage layouts. Adds code coverage reports and clarifies architectural invariants for improved maintainability.
@@ -9,7 +9,7 @@
|
||||
|
||||
|
||||
|
||||
<link rel="prev" href="../unitig_evidence/">
|
||||
<link rel="prev" href="../evidence_elimination/">
|
||||
|
||||
|
||||
<link rel="next" href="../persistent_compact_int_vec/">
|
||||
@@ -647,6 +647,34 @@
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="../evidence_elimination/" class="md-nav__link">
|
||||
|
||||
|
||||
|
||||
<span class="md-ellipsis">
|
||||
|
||||
|
||||
Evidence elimination (discussion)
|
||||
|
||||
|
||||
|
||||
</span>
|
||||
|
||||
|
||||
|
||||
</a>
|
||||
</li>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -729,6 +757,17 @@
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#index-mode-homogeneity-invariant" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Index mode (homogeneity invariant)
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
@@ -740,6 +779,34 @@
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="MphfLayer — autonomous kmer → slot mapping">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#query-api" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Query API
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#build-surface" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Build surface
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
@@ -751,6 +818,73 @@
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="Layer\<D: LayerData> — MPHF + payload">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#build-signatures" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Build signatures
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#fingerprintvec-and-fingerprintvecwriter" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
FingerprintVec and FingerprintVecWriter
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#layeredmapd-collection-of-layers" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
LayeredMap\<D> — collection of layers
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="LayeredMap\<D> — collection of layers">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#common-methods" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Common methods
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#push_layer" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
push_layer
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
@@ -776,10 +910,10 @@
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#evidence-encoding" class="md-nav__link">
|
||||
<a href="#evidence-encoding-exact" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Evidence encoding
|
||||
Evidence encoding (exact)
|
||||
|
||||
</span>
|
||||
</a>
|
||||
@@ -798,14 +932,53 @@
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#query-path" class="md-nav__link">
|
||||
<a href="#column-append-and-merge-support" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Query path
|
||||
Column append and merge support
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="Column append and merge support">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#layer-level-genome-column-append" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Layer-level genome column append
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#presence-matrix-initialisation" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Presence matrix initialisation
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#why-the-mphf-is-never-rebuilt" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Why the MPHF is never rebuilt
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
@@ -895,6 +1068,62 @@
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="../merge/" class="md-nav__link">
|
||||
|
||||
|
||||
|
||||
<span class="md-ellipsis">
|
||||
|
||||
|
||||
Merge command
|
||||
|
||||
|
||||
|
||||
</span>
|
||||
|
||||
|
||||
|
||||
</a>
|
||||
</li>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="../rebuild_filter/" class="md-nav__link">
|
||||
|
||||
|
||||
|
||||
<span class="md-ellipsis">
|
||||
|
||||
|
||||
Kmer filtering (rebuild/dump/unitig)
|
||||
|
||||
|
||||
|
||||
</span>
|
||||
|
||||
|
||||
|
||||
</a>
|
||||
</li>
|
||||
|
||||
|
||||
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
@@ -1058,6 +1287,17 @@
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#index-mode-homogeneity-invariant" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Index mode (homogeneity invariant)
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
@@ -1069,6 +1309,34 @@
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="MphfLayer — autonomous kmer → slot mapping">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#query-api" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Query API
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#build-surface" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Build surface
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
@@ -1080,6 +1348,73 @@
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="Layer\<D: LayerData> — MPHF + payload">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#build-signatures" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Build signatures
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#fingerprintvec-and-fingerprintvecwriter" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
FingerprintVec and FingerprintVecWriter
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#layeredmapd-collection-of-layers" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
LayeredMap\<D> — collection of layers
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="LayeredMap\<D> — collection of layers">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#common-methods" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Common methods
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#push_layer" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
push_layer
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
@@ -1105,10 +1440,10 @@
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#evidence-encoding" class="md-nav__link">
|
||||
<a href="#evidence-encoding-exact" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Evidence encoding
|
||||
Evidence encoding (exact)
|
||||
|
||||
</span>
|
||||
</a>
|
||||
@@ -1127,14 +1462,53 @@
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#query-path" class="md-nav__link">
|
||||
<a href="#column-append-and-merge-support" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Query path
|
||||
Column append and merge support
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
<nav class="md-nav" aria-label="Column append and merge support">
|
||||
<ul class="md-nav__list">
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#layer-level-genome-column-append" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Layer-level genome column append
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#presence-matrix-initialisation" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Presence matrix initialisation
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
<a href="#why-the-mphf-is-never-rebuilt" class="md-nav__link">
|
||||
<span class="md-ellipsis">
|
||||
|
||||
Why the MPHF is never rebuilt
|
||||
|
||||
</span>
|
||||
</a>
|
||||
|
||||
</li>
|
||||
|
||||
</ul>
|
||||
</nav>
|
||||
|
||||
</li>
|
||||
|
||||
<li class="md-nav__item">
|
||||
@@ -1178,7 +1552,7 @@
|
||||
|
||||
<h1 id="obilayeredmap-layered-kmer-index-crate">obilayeredmap — layered kmer index crate</h1>
|
||||
<h2 id="purpose">Purpose</h2>
|
||||
<p><code>obilayeredmap</code> implements a persistent, incrementally extensible kmer index. The index is organised in three levels: <strong>index root → partition → layer</strong>. Each layer covers a disjoint kmer set and wraps a <code>ptr_hash</code> MPHF with associated per-slot data. Adding a new dataset never rebuilds existing layers.</p>
|
||||
<p><code>obilayeredmap</code> implements a persistent, incrementally extensible kmer index. Each layer covers a disjoint kmer set and wraps a <code>ptr_hash</code> MPHF with associated per-slot data. Adding a new dataset never rebuilds existing layers.</p>
|
||||
<hr />
|
||||
<h2 id="three-usage-modes">Three usage modes</h2>
|
||||
<p>The MPHF + evidence infrastructure is the same for all modes. The <strong>payload</strong> varies.</p>
|
||||
@@ -1214,34 +1588,65 @@
|
||||
</table>
|
||||
<p>Both <code>PersistentCompactIntMatrix</code> and <code>PersistentBitMatrix</code> come from the <code>obicompactvec</code> crate.</p>
|
||||
<hr />
|
||||
<h2 id="index-mode-homogeneity-invariant">Index mode (homogeneity invariant)</h2>
|
||||
<p>A partitioned index is homogeneous: every layer within a partition shares the same mode. The mode is determined once at <code>LayeredMap::open()</code> from <code>PartitionMeta.mode</code> and passed to each <code>Layer::open()</code> — no per-layer file is read.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="cp">#[derive(Serialize, Deserialize, Default)]</span>
|
||||
<span class="cp">#[serde(tag = </span><span class="s">"type"</span><span class="cp">, rename_all = </span><span class="s">"snake_case"</span><span class="cp">)]</span>
|
||||
<span class="k">pub</span><span class="w"> </span><span class="k">enum</span><span class="w"> </span><span class="nc">IndexMode</span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="cp">#[default]</span>
|
||||
<span class="w"> </span><span class="n">Exact</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">Approx</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">b</span><span class="p">:</span><span class="w"> </span><span class="kt">u8</span><span class="p">,</span><span class="w"> </span><span class="n">z</span><span class="p">:</span><span class="w"> </span><span class="kt">u8</span><span class="w"> </span><span class="p">},</span>
|
||||
<span class="w"> </span><span class="n">Hybrid</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">b</span><span class="p">:</span><span class="w"> </span><span class="kt">u8</span><span class="p">,</span><span class="w"> </span><span class="n">z</span><span class="p">:</span><span class="w"> </span><span class="kt">u8</span><span class="w"> </span><span class="p">},</span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p><code>IndexMode</code> is stored once in <code>PartitionMeta</code> (<code>meta.json</code> at partition root). There is no <code>layer_meta.json</code>.</p>
|
||||
<ul>
|
||||
<li><strong>Exact</strong>: writes <code>evidence.bin</code> + <code>unitigs.bin.idx</code>. Zero false positives.</li>
|
||||
<li><strong>Approx</strong>: writes <code>fingerprint.bin</code> only. FP rate per kmer = 1/2^b; with Findere z-parameter, z consecutive kmers must all match → effective window FP ≈ 1/2^(b·z). No <code>.idx</code> written or required.</li>
|
||||
<li><strong>Hybrid</strong>: writes both <code>fingerprint.bin</code> and <code>evidence.bin</code> + <code>.idx</code>. <code>find()</code> uses the fingerprint (fast, O(1)); <code>find_strict()</code> uses exact evidence.</li>
|
||||
</ul>
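<p>The false-positive figures quoted for Approx mode can be checked with a few lines of arithmetic. The sketch below is illustrative only: <code>per_kmer_fp</code> and <code>window_fp</code> are hypothetical helper names, not part of <code>obilayeredmap</code>, and the window estimate assumes fingerprint collisions are independent across the z consecutive kmers.</p>

```rust
/// Per-kmer false-positive rate of a b-bit fingerprint: 1 / 2^b.
/// (Hypothetical helper, not part of the crate.)
fn per_kmer_fp(b: u8) -> f64 {
    0.5_f64.powi(i32::from(b))
}

/// Approximate false-positive rate when z consecutive kmers must all
/// match: 1 / 2^(b*z). Assumes collisions are independent per kmer.
fn window_fp(b: u8, z: u8) -> f64 {
    0.5_f64.powi(i32::from(b) * i32::from(z))
}

fn main() {
    // Example parameters chosen for illustration only.
    let (b, z) = (8u8, 3u8);
    println!("per-kmer FP = {:e}", per_kmer_fp(b));
    println!("window FP   = {:e}", window_fp(b, z));
}
```

<p>Under these assumptions, raising either <code>b</code> or <code>z</code> by one bit or one kmer lowers the window estimate by a further factor of 2^z or 2^b respectively.</p>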
|
||||
<hr />
|
||||
<h2 id="mphflayer-autonomous-kmer-slot-mapping">MphfLayer — autonomous kmer → slot mapping</h2>
|
||||
<p><code>MphfLayer</code> encapsulates the MPHF + evidence + unitig spine for one layer. It is independent of any payload data.</p>
|
||||
<p><code>MphfLayer</code> encapsulates the MPHF and evidence store for one layer. It is independent of any payload.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span><span class="w"> </span><span class="nc">MphfLayer</span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="n">mphf</span><span class="p">:</span><span class="w"> </span><span class="nc">Mphf</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">evidence</span><span class="p">:</span><span class="w"> </span><span class="nc">Evidence</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">unitigs</span><span class="p">:</span><span class="w"> </span><span class="nc">UnitigFileReader</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">n</span><span class="p">:</span><span class="w"> </span><span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="c1">// number of indexed kmers = number of MPHF slots</span>
|
||||
<span class="w"> </span><span class="n">mphf</span><span class="p">:</span><span class="w"> </span><span class="nc">Mphf</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">ev</span><span class="p">:</span><span class="w"> </span><span class="nc">LayerEvidence</span><span class="p">,</span><span class="w"> </span><span class="c1">// loaded at open() time</span>
|
||||
<span class="w"> </span><span class="n">n</span><span class="p">:</span><span class="w"> </span><span class="kt">usize</span><span class="p">,</span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p>Public API:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">MphfLayer</span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">open</span><span class="p">(</span><span class="n">dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="bp">Self</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">find</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">kmer</span><span class="p">:</span><span class="w"> </span><span class="nc">CanonicalKmer</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nb">Option</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span><span class="w"> </span><span class="c1">// Some(slot) or None</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">n</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">usize</span>
|
||||
<span class="w"> </span><span class="nc">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">unitig_writer</span><span class="p">(</span><span class="n">dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="n">UnitigFileWriter</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="p">(</span><span class="k">crate</span><span class="p">)</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build</span><span class="p">(</span>
|
||||
<span class="w"> </span><span class="n">dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">fill_slot</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">mut</span><span class="w"> </span><span class="k">impl</span><span class="w"> </span><span class="nb">FnMut</span><span class="p">(</span><span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="n">CanonicalKmer</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="p">()</span><span class="o">></span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<p><code>LayerEvidence</code> is an internal enum, not public:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">enum</span><span class="w"> </span><span class="nc">LayerEvidence</span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="n">Exact</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">evidence</span><span class="p">:</span><span class="w"> </span><span class="nc">Evidence</span><span class="p">,</span><span class="w"> </span><span class="n">unitigs</span><span class="p">:</span><span class="w"> </span><span class="nc">UnitigFileReader</span><span class="w"> </span><span class="p">},</span>
|
||||
<span class="w"> </span><span class="n">Approx</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">fingerprint</span><span class="p">:</span><span class="w"> </span><span class="nc">FingerprintVec</span><span class="p">,</span><span class="w"> </span><span class="n">unitigs_path</span><span class="p">:</span><span class="w"> </span><span class="nc">PathBuf</span><span class="w"> </span><span class="p">},</span>
|
||||
<span class="w"> </span><span class="n">Hybrid</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">evidence</span><span class="p">:</span><span class="w"> </span><span class="nc">Evidence</span><span class="p">,</span><span class="w"> </span><span class="n">unitigs</span><span class="p">:</span><span class="w"> </span><span class="nc">UnitigFileReader</span><span class="p">,</span><span class="w"> </span><span class="n">fingerprint</span><span class="p">:</span><span class="w"> </span><span class="nc">FingerprintVec</span><span class="w"> </span><span class="p">},</span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p><code>find</code> returns <code>Some(slot)</code> only after verifying via evidence that the kmer is actually indexed. It returns <code>None</code> for absent keys (ptr_hash maps any input to a valid slot; evidence verification is the only correct-membership test).</p>
|
||||
<p><code>build</code> runs two sequential passes over <code>unitigs.bin</code>:</p>
|
||||
<p><code>MphfLayer::open(dir, mode: &IndexMode)</code> receives the mode from <code>PartitionMeta</code> — no per-layer file is read.</p>
|
||||
<h3 id="query-api">Query API</h3>
|
||||
<p>Two public query methods, both returning <code>Option<usize></code> (slot index):</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">find</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">kmer</span><span class="p">:</span><span class="w"> </span><span class="nc">CanonicalKmer</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nb">Option</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">find_strict</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">kmer</span><span class="p">:</span><span class="w"> </span><span class="nc">CanonicalKmer</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nb">Option</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
</code></pre></div>
|
||||
<ul>
|
||||
<li><code>find</code>: O(1) auto-dispatch. Exact → exact evidence check. Approx/Hybrid → fingerprint comparison.</li>
|
||||
<li><code>find_strict</code>: always exact. Exact/Hybrid → O(1) evidence check. Approx → O(n) sequential scan (no <code>.idx</code>).</li>
|
||||
</ul>
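<p>The dispatch rules for the two query methods can be sketched with a toy model. Everything here is a stand-in and an assumption: <code>ToyLayer</code>, <code>Mode</code>, the modulo slot function and the multiplicative fingerprint are hypothetical, not the crate's types. The sketch follows the mode list, in which Hybrid <code>find()</code> uses the fingerprint, and it assumes a non-empty layer.</p>

```rust
#[derive(Clone, Copy)]
enum Mode {
    Exact,
    Approx,
    Hybrid,
}

/// Toy layer: keys[slot] holds the indexed kmer for that slot and stands
/// in for the evidence store (and for the unitig scan in Approx mode).
struct ToyLayer {
    mode: Mode,
    keys: Vec<u64>,
}

impl ToyLayer {
    /// Toy slot function. Like an MPHF, it maps any input to a valid slot.
    fn slot(&self, kmer: u64) -> usize {
        (kmer % self.keys.len() as u64) as usize
    }

    /// Toy 8-bit fingerprint (arbitrary multiplicative hash).
    fn fingerprint(kmer: u64) -> u8 {
        (kmer.wrapping_mul(0x9E37_79B9_7F4A_7C15) >> 56) as u8
    }

    /// O(1) exact membership test: compare against the stored key.
    fn exact_check(&self, kmer: u64) -> Option<usize> {
        let s = self.slot(kmer);
        (self.keys[s] == kmer).then_some(s)
    }

    /// O(1) approximate test: compare fingerprints only.
    fn fingerprint_check(&self, kmer: u64) -> Option<usize> {
        let s = self.slot(kmer);
        (Self::fingerprint(self.keys[s]) == Self::fingerprint(kmer)).then_some(s)
    }

    /// O(n) sequential scan, used when no exact evidence is available.
    fn scan(&self, kmer: u64) -> Option<usize> {
        self.keys.iter().position(|&k| k == kmer)
    }

    pub fn find(&self, kmer: u64) -> Option<usize> {
        match self.mode {
            Mode::Exact => self.exact_check(kmer),
            Mode::Approx | Mode::Hybrid => self.fingerprint_check(kmer),
        }
    }

    pub fn find_strict(&self, kmer: u64) -> Option<usize> {
        match self.mode {
            Mode::Exact | Mode::Hybrid => self.exact_check(kmer),
            Mode::Approx => self.scan(kmer),
        }
    }
}

fn main() {
    // Keys are placed so that keys[k % 4] == k, making the toy slot
    // function a perfect hash on this set.
    let layer = ToyLayer { mode: Mode::Hybrid, keys: vec![4, 5, 6, 7] };
    println!("find(5)        = {:?}", layer.find(5));
    println!("find_strict(9) = {:?}", layer.find_strict(9));
}
```

<p>In this model neither method can panic on a mode mismatch, since every mode has a defined path through both <code>match</code> expressions.</p>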
|
||||
<p>There are no <code>find_exact</code>/<code>find_approx</code> methods; panicking dispatch is eliminated.</p>
|
||||
<h3 id="build-surface">Build surface</h3>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1">// Full MPHF + evidence build (two-pass)</span>
|
||||
<span class="k">pub</span><span class="p">(</span><span class="k">crate</span><span class="p">)</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build</span><span class="p">(</span><span class="n">dir</span><span class="p">,</span><span class="w"> </span><span class="n">block_bits</span><span class="p">,</span><span class="w"> </span><span class="n">mode</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">IndexMode</span><span class="p">,</span><span class="w"> </span><span class="n">fill_slot</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
|
||||
<span class="c1">// Evidence-only post-hoc builds (MPHF already present)</span>
|
||||
<span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build_exact_evidence</span><span class="p">(</span><span class="n">dir</span><span class="p">,</span><span class="w"> </span><span class="n">block_bits</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build_approx_evidence</span><span class="p">(</span><span class="n">dir</span><span class="p">,</span><span class="w"> </span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="n">z</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
</code></pre></div>
|
||||
<p><code>MphfLayer::build</code> runs two passes over <code>unitigs.bin</code>:</p>
|
||||
<ol>
|
||||
<li><strong>Pass 1</strong>: iterate all canonical kmers in parallel via rayon, construct and store <code>mphf.bin</code>. <code>new_from_par_iter</code> avoids materialising a full key <code>Vec</code>.</li>
|
||||
<li><strong>Pass 2</strong>: iterate again sequentially, fill <code>evidence.bin</code>, call <code>fill_slot(slot, kmer)</code> once per kmer for payload population. A compact <code>n/8</code>-byte seen-bitset verifies MPHF injectivity inline.</li>
|
||||
<li><strong>Pass 1</strong> (parallel via rayon): a <code>CanonicalKmerIter</code> (clonable, <code>Arc<Mmap></code>, no file reopening) is passed to <code>new_from_par_iter</code> via <code>par_bridge()</code>. Produces <code>mphf.bin</code>. No <code>.idx</code> is read or created at this stage.</li>
|
||||
<li><strong>Pass 2</strong> (sequential): fill evidence files; call <code>fill_slot(slot, kmer)</code> per kmer. <code>.idx</code> is written last for Exact/Hybrid modes (query-time only).</li>
|
||||
</ol>
|
||||
<p>For empty layers (n = 0), <code>build</code> returns <code>Ok(0)</code> immediately after creating empty <code>mphf.bin</code> and <code>evidence.bin</code>.</p>
|
||||
<p>There is no <code>build_evidence</code> dispatch wrapper — callers invoke <code>build_exact_evidence</code> or <code>build_approx_evidence</code> directly.</p>
|
||||
<p>For empty layers (n = 0), all build variants return <code>Ok(0)</code> immediately after creating empty output files.</p>
|
||||
<hr />
|
||||
<h2 id="layerd-layerdata-mphf-payload">Layer\<D: LayerData> — MPHF + payload</h2>
|
||||
<p><code>Layer<D></code> pairs an <code>MphfLayer</code> with one payload store.</p>
|
||||
@@ -1261,7 +1666,7 @@
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="n">data</span><span class="p">:</span><span class="w"> </span><span class="nc">T</span><span class="p">,</span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p><code>LayerData</code> covers the <strong>read path only</strong> (<code>open</code> + <code>read</code>). Build signatures differ between modes and are not in the trait.</p>
|
||||
<p><code>LayerData</code> covers the <strong>read path only</strong> (<code>open</code> + <code>read</code>). Build signatures differ between modes and are not part of the trait.</p>
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
@@ -1288,28 +1693,89 @@
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<p><strong>Build signatures:</strong></p>
|
||||
<h3 id="build-signatures">Build signatures</h3>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1">// mode 1</span>
|
||||
<span class="k">impl</span><span class="w"> </span><span class="n">Layer</span><span class="o"><</span><span class="p">()</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build</span><span class="p">(</span><span class="n">out_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build</span><span class="p">(</span><span class="n">out_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">block_bits</span><span class="p">:</span><span class="w"> </span><span class="kt">u8</span><span class="p">,</span><span class="w"> </span><span class="n">mode</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">IndexMode</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="p">}</span>
|
||||
|
||||
<span class="c1">// mode 2</span>
|
||||
<span class="k">impl</span><span class="w"> </span><span class="n">Layer</span><span class="o"><</span><span class="n">PersistentCompactIntMatrix</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build</span><span class="p">(</span><span class="n">out_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">count_of</span><span class="p">:</span><span class="w"> </span><span class="nc">impl</span><span class="w"> </span><span class="nb">Fn</span><span class="p">(</span><span class="n">CanonicalKmer</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">u32</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build_from_map</span><span class="p">(</span><span class="n">out_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">counts</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">HashMap</span><span class="o"><</span><span class="n">CanonicalKmer</span><span class="p">,</span><span class="w"> </span><span class="kt">u32</span><span class="o">></span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build</span><span class="p">(</span><span class="n">out_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">block_bits</span><span class="p">:</span><span class="w"> </span><span class="kt">u8</span><span class="p">,</span><span class="w"> </span><span class="n">mode</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">IndexMode</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">count_of</span><span class="p">:</span><span class="w"> </span><span class="nc">impl</span><span class="w"> </span><span class="nb">Fn</span><span class="p">(</span><span class="n">CanonicalKmer</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">u32</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build_from_map</span><span class="p">(</span><span class="n">out_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">block_bits</span><span class="p">:</span><span class="w"> </span><span class="kt">u8</span><span class="p">,</span><span class="w"> </span><span class="n">mode</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">IndexMode</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">counts</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">HashMap</span><span class="o"><</span><span class="n">CanonicalKmer</span><span class="p">,</span><span class="w"> </span><span class="kt">u32</span><span class="o">></span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="p">}</span>
|
||||
|
||||
<span class="c1">// mode 3</span>
|
||||
<span class="k">impl</span><span class="w"> </span><span class="n">Layer</span><span class="o"><</span><span class="n">PersistentBitMatrix</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build_presence</span><span class="p">(</span>
|
||||
<span class="w"> </span><span class="n">out_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">n_genomes</span><span class="p">:</span><span class="w"> </span><span class="kt">usize</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">present_in</span><span class="p">:</span><span class="w"> </span><span class="nc">impl</span><span class="w"> </span><span class="nb">Fn</span><span class="p">(</span><span class="n">CanonicalKmer</span><span class="p">,</span><span class="w"> </span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">bool</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build_presence</span><span class="p">(</span><span class="n">out_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">block_bits</span><span class="p">:</span><span class="w"> </span><span class="kt">u8</span><span class="p">,</span><span class="w"> </span><span class="n">mode</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">IndexMode</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">n_genomes</span><span class="p">:</span><span class="w"> </span><span class="kt">usize</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">present_in</span><span class="p">:</span><span class="w"> </span><span class="nc">impl</span><span class="w"> </span><span class="nb">Fn</span><span class="p">(</span><span class="n">CanonicalKmer</span><span class="p">,</span><span class="w"> </span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">bool</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p>All build impls delegate MPHF + evidence construction to <code>MphfLayer::build</code> via a mode-specific <code>fill_slot</code> callback. Mode 2 pre-reads <code>n_kmers</code> from <code>unitigs.bin</code> to size the <code>PersistentCompactIntMatrixBuilder</code> before calling <code>MphfLayer::build</code>. Mode 3 does the same for <code>PersistentBitMatrixBuilder</code>.</p>
|
||||
<p>All build impls delegate to <code>MphfLayer::build</code> via a mode-specific <code>fill_slot</code> callback. The <code>mode</code> parameter is forwarded directly — no <code>LayerMeta</code> is written.</p>
|
||||
<p>Evidence-only post-hoc builds are accessible directly on <code>Layer<D></code>:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="o"><</span><span class="n">D</span><span class="p">:</span><span class="w"> </span><span class="nc">LayerData</span><span class="o">></span><span class="w"> </span><span class="n">Layer</span><span class="o"><</span><span class="n">D</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build_exact_evidence</span><span class="p">(</span><span class="n">layer_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">block_bits</span><span class="p">:</span><span class="w"> </span><span class="kt">u8</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">build_approx_evidence</span><span class="p">(</span><span class="n">layer_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">b</span><span class="p">:</span><span class="w"> </span><span class="kt">u8</span><span class="p">,</span><span class="w"> </span><span class="n">z</span><span class="p">:</span><span class="w"> </span><span class="kt">u8</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p>There is no <code>build_evidence</code> dispatch wrapper.</p>
|
||||
<hr />
|
||||
<h2 id="fingerprintvec-and-fingerprintvecwriter">FingerprintVec and FingerprintVecWriter</h2>
|
||||
<p>Approximate evidence is stored as a packed b-bit array, one fingerprint per MPHF slot.</p>
|
||||
<div class="highlight"><pre><span></span><code>fingerprint.bin format:
|
||||
magic: b"FPVF" (4 bytes)
|
||||
b: u8 (bits per fingerprint, 1..=64)
|
||||
padding: [0u8; 3]
|
||||
n: u64 LE (number of slots)
|
||||
data: packed bits, ceil(n*b/8) bytes, Lsb0 order
|
||||
</code></pre></div>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">FingerprintVec</span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">open</span><span class="p">(</span><span class="n">path</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="bp">Self</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">get</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">slot</span><span class="p">:</span><span class="w"> </span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">u64</span>
|
||||
<span class="w"> </span><span class="nc">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">matches</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">slot</span><span class="p">:</span><span class="w"> </span><span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="n">fingerprint</span><span class="p">:</span><span class="w"> </span><span class="kt">u64</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">bool</span>
|
||||
<span class="w"> </span><span class="nc">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">n</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">usize</span>
|
||||
<span class="w"> </span><span class="nc">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">b</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">u8</span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p><code>matches(slot, fingerprint)</code> extracts the b-bit fingerprint stored at <code>slot</code> and compares it to the low b bits of the supplied 64-bit hash (the <code>fingerprint</code> argument). It is the core operation of <code>find_approx</code>.</p>
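<p>The packing and comparison described above can be sketched as follows. This is a minimal in-memory illustration, not the actual <code>FingerprintVec</code> implementation: the type <code>PackedFingerprints</code>, the helper <code>low_bits</code>, and the constructor <code>from_hashes</code> are hypothetical names, and the file header (magic, padding, <code>n</code>) is left out. Only the Lsb0 bit order and the "low b bits" comparison are taken from the text.</p>

```rust
/// Hypothetical in-memory stand-in for the packed payload of `fingerprint.bin`.
/// The header described in the format listing is deliberately omitted.
pub struct PackedFingerprints {
    b: u8,         // bits per fingerprint, 1..=64
    n: usize,      // number of slots
    data: Vec<u8>, // ceil(n*b/8) bytes, Lsb0 bit order
}

/// Keep only the low `b` bits of `value` (assumed fingerprint derivation).
fn low_bits(value: u64, b: u8) -> u64 {
    if b >= 64 {
        value
    } else {
        value & ((1u64 << b) - 1)
    }
}

impl PackedFingerprints {
    /// Pack the low `b` bits of each hash, one fingerprint per slot.
    pub fn from_hashes(b: u8, hashes: &[u64]) -> Self {
        assert!((1..=64).contains(&b));
        let n = hashes.len();
        let mut data = vec![0u8; (n * b as usize + 7) / 8];
        for (slot, &h) in hashes.iter().enumerate() {
            let fp = low_bits(h, b);
            for i in 0..b as usize {
                if (fp >> i) & 1 == 1 {
                    // Lsb0: global bit 0 is the least significant bit of byte 0.
                    let bit = slot * b as usize + i;
                    data[bit / 8] |= 1 << (bit % 8);
                }
            }
        }
        Self { b, n, data }
    }

    /// Extract the b-bit fingerprint stored at `slot`.
    pub fn get(&self, slot: usize) -> u64 {
        assert!(slot < self.n);
        let mut fp = 0u64;
        for i in 0..self.b as usize {
            let bit = slot * self.b as usize + i;
            if (self.data[bit / 8] >> (bit % 8)) & 1 == 1 {
                fp |= 1u64 << i;
            }
        }
        fp
    }

    /// Compare the stored fingerprint with the low b bits of `hash`.
    pub fn matches(&self, slot: usize, hash: u64) -> bool {
        self.get(slot) == low_bits(hash, self.b)
    }
}
```

<p>A bit-by-bit loop keeps the sketch easy to check; a production reader would more plausibly extract each fingerprint with one or two word-sized loads.</p>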
|
||||
<hr />
|
||||
<h2 id="layeredmapd-collection-of-layers">LayeredMap\<D> — collection of layers</h2>
|
||||
<p><code>LayeredMap<D></code> wraps <code>Vec<Layer<D>></code> for a single partition directory.</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">struct</span><span class="w"> </span><span class="nc">LayeredMap</span><span class="o"><</span><span class="n">D</span><span class="p">:</span><span class="w"> </span><span class="nc">LayerData</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">()</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="n">root</span><span class="p">:</span><span class="w"> </span><span class="nc">PathBuf</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">meta</span><span class="p">:</span><span class="w"> </span><span class="nc">PartitionMeta</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="n">layers</span><span class="p">:</span><span class="w"> </span><span class="nb">Vec</span><span class="o"><</span><span class="n">Layer</span><span class="o"><</span><span class="n">D</span><span class="o">>></span><span class="p">,</span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p><code>PartitionMeta</code> (<code>meta.json</code> at the partition root) stores <code>n_layers</code> and the index <code>mode</code>.</p>
|
||||
<h3 id="common-methods">Common methods</h3>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">open</span><span class="p">(</span><span class="n">root</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="bp">Self</span><span class="o">></span>
|
||||
<span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">create</span><span class="p">(</span><span class="n">root</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">mode</span><span class="p">:</span><span class="w"> </span><span class="nc">IndexMode</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="bp">Self</span><span class="o">></span>
|
||||
<span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">n_layers</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">usize</span>
|
||||
<span class="nc">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">layer</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">:</span><span class="w"> </span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kp">&</span><span class="nc">Layer</span><span class="o"><</span><span class="n">D</span><span class="o">></span>
|
||||
<span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">mode</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kp">&</span><span class="nc">IndexMode</span>
|
||||
<span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">query</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">kmer</span><span class="p">:</span><span class="w"> </span><span class="nc">CanonicalKmer</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nb">Option</span><span class="o"><</span><span class="p">(</span><span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="n">Hit</span><span class="o"><</span><span class="n">D</span><span class="p">::</span><span class="n">Item</span><span class="o">></span><span class="p">)</span><span class="o">></span>
|
||||
<span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">next_layer_writer</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="n">UnitigFileWriter</span><span class="o">></span>
|
||||
</code></pre></div>
|
||||
<p><code>open</code> reads <code>PartitionMeta</code> once, extracts <code>mode</code>, and passes it to every <code>Layer::open</code> — no per-layer file is read. <code>create</code> stores the given mode in <code>PartitionMeta</code>.</p>
|
||||
<p><code>query</code> probes layers in order and returns <code>(layer_index, Hit)</code> on the first match. A kmer stored in layer i is found after i + 1 probes, so the expected probe depth is 1 for kmers in layer 0.</p>
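<p>The first-match probing order can be illustrated with a small self-contained sketch. The trait <code>ProbeLayer</code>, the function <code>query_layers</code>, and the <code>ToyLayer</code> type are hypothetical stand-ins introduced for illustration; kmers are reduced to plain <code>u64</code> keys and the MPHF lookup is replaced by a binary search.</p>

```rust
/// Hypothetical stand-in for a layer: maps a key to an optional hit.
pub trait ProbeLayer {
    type Hit;
    fn query(&self, kmer: u64) -> Option<Self::Hit>;
}

/// Probe layers in order and return `(layer_index, hit)` for the first match.
pub fn query_layers<L: ProbeLayer>(layers: &[L], kmer: u64) -> Option<(usize, L::Hit)> {
    layers
        .iter()
        .enumerate()
        .find_map(|(i, layer)| layer.query(kmer).map(|hit| (i, hit)))
}

/// Toy layer backed by a sorted key list; the hit is the key's position.
pub struct ToyLayer(pub Vec<u64>);

impl ProbeLayer for ToyLayer {
    type Hit = usize;
    fn query(&self, kmer: u64) -> Option<usize> {
        self.0.binary_search(&kmer).ok()
    }
}
```

<p>Because <code>find_map</code> stops at the first layer that answers, later layers are never consulted for a key already found, which matches the probe-depth statement above.</p>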
|
||||
<h3 id="push_layer">push_layer</h3>
|
||||
<p><code>push_layer</code> builds the next layer from a <code>unitigs.bin</code> already written via <code>next_layer_writer</code>, using <code>DEFAULT_BLOCK_BITS</code>:</p>
|
||||
<div class="highlight"><pre><span></span><code><span class="c1">// mode 1</span>
|
||||
<span class="k">impl</span><span class="w"> </span><span class="n">LayeredMap</span><span class="o"><</span><span class="p">()</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">push_layer</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="p">}</span>
|
||||
|
||||
<span class="c1">// mode 2</span>
|
||||
<span class="k">impl</span><span class="w"> </span><span class="n">LayeredMap</span><span class="o"><</span><span class="n">PersistentCompactIntMatrix</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">push_layer</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">count_of</span><span class="p">:</span><span class="w"> </span><span class="nc">impl</span><span class="w"> </span><span class="nb">Fn</span><span class="p">(</span><span class="n">CanonicalKmer</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">u32</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">push_layer_from_map</span><span class="p">(</span><span class="o">&</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">counts</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">HashMap</span><span class="o"><</span><span class="n">CanonicalKmer</span><span class="p">,</span><span class="w"> </span><span class="kt">u32</span><span class="o">></span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="kt">usize</span><span class="o">></span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p>Mode 3 (<code>PersistentBitMatrix</code>) has no <code>push_layer</code> on <code>LayeredMap</code>; callers build directly via <code>Layer<PersistentBitMatrix>::build_presence</code>.</p>
|
||||
<hr />
|
||||
<h2 id="layeredstores-and-aggregation-traits">LayeredStore\<S> and aggregation traits</h2>
|
||||
<p><code>LayeredStore<S></code> is a generic aggregation wrapper over <code>Vec<S></code>. It propagates three traits from <code>obicompactvec::traits</code> up the hierarchy via blanket impls:</p>
|
||||
@@ -1320,11 +1786,6 @@
|
||||
<span class="k">impl</span><span class="o"><</span><span class="n">S</span><span class="p">:</span><span class="w"> </span><span class="nc">BitPartials</span><span class="o">></span><span class="w"> </span><span class="n">BitPartials</span><span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="n">LayeredStore</span><span class="o"><</span><span class="n">S</span><span class="o">></span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="err">…</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="c1">// element-wise Σ partials</span>
|
||||
</code></pre></div>
|
||||
<p>Because blanket impls compose, <code>LayeredStore<LayeredStore<S>></code> automatically inherits all three traits when <code>S</code> does — providing the partitioned level without a separate type.</p>
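<p>How a single blanket impl yields both aggregation levels can be shown with a toy version. This is a sketch under assumptions: the method <code>partial_sums</code> and the leaf type <code>ToyMatrix</code> are invented for illustration, and the real traits in <code>obicompactvec::traits</code> may have a different shape. The <code>LayeredStore</code> defined here is a local re-definition, not the library type.</p>

```rust
/// Hypothetical aggregation trait; the real trait API may differ.
pub trait CountPartials {
    /// One partial sum per column.
    fn partial_sums(&self) -> Vec<u64>;
}

/// Local toy re-definition of the wrapper over `Vec<S>`.
pub struct LayeredStore<S>(pub Vec<S>);

/// Blanket impl: a store of aggregators is itself an aggregator
/// (element-wise sum of the partials of its members).
impl<S: CountPartials> CountPartials for LayeredStore<S> {
    fn partial_sums(&self) -> Vec<u64> {
        let mut total: Vec<u64> = Vec::new();
        for store in &self.0 {
            let part = store.partial_sums();
            if total.len() < part.len() {
                total.resize(part.len(), 0);
            }
            for (t, p) in total.iter_mut().zip(part) {
                *t += p;
            }
        }
        total
    }
}

/// Toy leaf: a matrix that already knows its per-column sums.
pub struct ToyMatrix(pub Vec<u64>);

impl CountPartials for ToyMatrix {
    fn partial_sums(&self) -> Vec<u64> {
        self.0.clone()
    }
}
```

<p>With these definitions, a nested store of stores of <code>ToyMatrix</code> implements the trait without any further code, because the blanket impl applies once per nesting level.</p>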
|
||||
<p><strong>Aggregation hierarchy:</strong></p>
|
||||
<div class="highlight"><pre><span></span><code>PersistentCompactIntMatrix implements CountPartials
|
||||
LayeredStore<PersistentCompactIntMatrix> via blanket impl (one partition)
|
||||
LayeredStore<LayeredStore<…>> via blanket impl (partitioned index)
|
||||
</code></pre></div>
|
||||
<p><strong>Leaf implementors</strong> (in <code>obicompactvec</code>):</p>
|
||||
<table>
|
||||
<thead>
|
||||
@@ -1344,69 +1805,77 @@ LayeredStore<LayeredStore<…>> via blanket impl
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<p><code>PersistentCompactIntVec</code> and <code>PersistentBitVec</code> do not implement these traits — they are single-column primitives, not matrix-level aggregators.</p>
|
||||
<p>See <a href="../../architecture/index_architecture/">Kmer index architecture</a> for the full trait API and the two-pass normalised-metric pattern.</p>
|
||||
<hr />
|
||||
<h2 id="on-disk-structure">On-disk structure</h2>
|
||||
<div class="highlight"><pre><span></span><code>index_root/ ← LayeredMap (collection)
|
||||
meta.json
|
||||
part_00000/ ← Partition
|
||||
layer_0/ ← Layer
|
||||
mphf.bin — ptr_hash MPHF (epserde format)
|
||||
unitigs.bin — packed 2-bit nucleotide sequences
|
||||
unitigs.bin.idx — UIDX index: n_unitigs, n_kmers, seqls[], packed_offsets[]
|
||||
evidence.bin — n × u32, each = (chunk_id: 25 bits | rank: 7 bits), LE
|
||||
counts/ [mode 2] PersistentCompactIntMatrix
|
||||
meta.json {"n": N, "n_cols": 1}
|
||||
col_000000.pciv
|
||||
presence/ [mode 3] PersistentBitMatrix
|
||||
meta.json {"n": N, "n_cols": G}
|
||||
col_000000.pbiv
|
||||
…
|
||||
layer_1/
|
||||
…
|
||||
part_00001/
|
||||
<div class="highlight"><pre><span></span><code>partition_root/ ← LayeredMap (one partition)
|
||||
meta.json — {"n_layers": N, "mode": {"type": "exact"|"approx"|"hybrid", ...}}
|
||||
layer_0/ ← Layer
|
||||
mphf.bin — ptr_hash MPHF (epserde format)
|
||||
unitigs.bin — packed 2-bit nucleotide sequences
|
||||
unitigs.bin.idx — UIDX index (Exact/Hybrid only; query-time, never built during MPHF construction)
|
||||
evidence.bin — [u32; n], LE (Exact/Hybrid only)
|
||||
fingerprint.bin — packed b-bit array (Approx/Hybrid only)
|
||||
counts/ [mode 2] PersistentCompactIntMatrix
|
||||
meta.json
|
||||
col_000000.pciv
|
||||
presence/ [mode 3] PersistentBitMatrix
|
||||
meta.json
|
||||
col_000000.pbiv …
|
||||
layer_1/
|
||||
…
|
||||
</code></pre></div>
|
||||
<p><strong>Partition</strong> (<code>part_XXXXX/</code>): all kmers whose canonical minimiser hashes to this bucket. Partitions are independent and can be processed in parallel.</p>
|
||||
<p><strong>Layer</strong> (<code>layer_N/</code>): one <code>MphfLayer</code> plus optional payload. Layer 0 covers dataset A; layer 1 covers kmers in B absent from A; etc. Layers within a partition are always disjoint.</p>
|
||||
<p>There is no <code>layer_meta.json</code>. The mode is stored once in <code>PartitionMeta</code> and is valid for all layers. <code>unitigs.bin.idx</code> is built at the end of <code>build_exact_evidence</code> — never during MPHF construction — and is consumed at query time only.</p>
|
||||
<hr />
|
||||
<h2 id="evidence-encoding">Evidence encoding</h2>
|
||||
<h2 id="evidence-encoding-exact">Evidence encoding (exact)</h2>
|
||||
<p><code>evidence.bin</code> is a flat <code>[u32; n]</code> array with no header. Each u32 encodes one slot:</p>
|
||||
<div class="highlight"><pre><span></span><code>bits [31:7] = chunk_id (25 bits) — index of the unitig chunk
|
||||
bits [6:0] = rank (7 bits) — kmer index within the chunk (0-based)
|
||||
</code></pre></div>
|
||||
<p>Decoding: <code>chunk_id = raw >> 7</code>, <code>rank = raw & 0x7F</code>. Reconstructing the kmer: read k nucleotides at position <code>rank</code> within unitig <code>chunk_id</code>.</p>
|
||||
<p>For k=31, m=11, the observed maximum is ~46 kmers per chunk, well within the 128 ranks (0 to 127) that the 7-bit <code>rank</code> field can address. The structural maximum from superkmer construction is k − m + 1 = 21 kmers/unitig; longer unitigs arise from paths spanning more than one superkmer.</p>
|
||||
<p><code>chunk_id = raw >> 7</code>, <code>rank = raw & 0x7F</code>. Reconstructing the kmer: read k nucleotides at position <code>rank</code> within unitig <code>chunk_id</code> (requires <code>unitigs.bin.idx</code> for random access).</p>
|
||||
<p>For k=31, m=11, the observed maximum is ~46 kmers per chunk, well within the 128 ranks (0 to 127) that the 7-bit <code>rank</code> field can address.</p>
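<p>The bit layout and decoding rule above translate directly into a few lines of code. The sketch below is illustrative only: the function names <code>encode_evidence</code>, <code>decode_evidence</code>, and <code>read_slot</code> are hypothetical, and <code>read_slot</code> works on an in-memory byte slice rather than on the actual <code>evidence.bin</code> reader.</p>

```rust
const RANK_BITS: u32 = 7;
const RANK_MASK: u32 = (1 << RANK_BITS) - 1; // 0x7F
const MAX_CHUNK_ID: u32 = (1 << (32 - RANK_BITS)) - 1; // 25-bit chunk id

/// Pack (chunk_id, rank) into one u32; `None` if either field overflows.
pub fn encode_evidence(chunk_id: u32, rank: u32) -> Option<u32> {
    if chunk_id > MAX_CHUNK_ID || rank > RANK_MASK {
        return None;
    }
    Some((chunk_id << RANK_BITS) | rank)
}

/// Inverse of `encode_evidence`: chunk_id = raw >> 7, rank = raw & 0x7F.
pub fn decode_evidence(raw: u32) -> (u32, u32) {
    (raw >> RANK_BITS, raw & RANK_MASK)
}

/// Read the u32 for `slot` from a headerless little-endian array.
pub fn read_slot(bytes: &[u8], slot: usize) -> Option<u32> {
    let chunk = bytes.get(slot * 4..slot * 4 + 4)?;
    Some(u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
}
```

<p>Returning <code>None</code> on overflow is a design choice of this sketch; it makes the 25-bit and 7-bit limits explicit instead of silently truncating.</p>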
|
||||
<hr />
|
||||
<h2 id="ptr_hash-configuration">ptr_hash configuration</h2>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">type</span><span class="w"> </span><span class="nc">Mphf</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">PtrHash</span><span class="o"><</span>
|
||||
<span class="w"> </span><span class="kt">u64</span><span class="p">,</span><span class="w"> </span><span class="c1">// key type: canonical kmer raw encoding</span>
|
||||
<span class="w"> </span><span class="n">CubicEps</span><span class="p">,</span><span class="w"> </span><span class="c1">// bucket fn: 2.4 bits/key, λ=3.5, α=0.99</span>
|
||||
<span class="w"> </span><span class="n">CachelineEfVec</span><span class="o"><</span><span class="nb">Vec</span><span class="o"><</span><span class="n">CachelineEf</span><span class="o">>></span><span class="p">,</span><span class="w"> </span><span class="c1">// remap: 11.6 bits/entry (Elias-Fano)</span>
|
||||
<span class="w"> </span><span class="n">CachelineEfVec</span><span class="o"><</span><span class="nb">Vec</span><span class="o"><</span><span class="n">CachelineEf</span><span class="o">>></span><span class="p">,</span><span class="w"> </span><span class="c1">// remap: Elias-Fano</span>
|
||||
<span class="w"> </span><span class="n">Xx64</span><span class="p">,</span><span class="w"> </span><span class="c1">// hasher: XXH3-64 with seed</span>
|
||||
<span class="w"> </span><span class="nb">Vec</span><span class="o"><</span><span class="kt">u8</span><span class="o">></span><span class="p">,</span><span class="w"> </span><span class="c1">// pilots</span>
|
||||
<span class="o">></span><span class="p">;</span>
|
||||
</code></pre></div>
|
||||
<p><code>Xx64</code> is chosen over <code>FxHash</code> because canonical kmer raw values are left-aligned u64 with structural zeros in the low bits (42 zeros for k=11, 2 zeros for k=31), which single-multiply hashes distribute poorly.</p>
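<p>The zero counts quoted above follow from 2 bits per nucleotide in a left-aligned 64-bit word: 64 − 2k low bits are never set. The sketch below checks that arithmetic. The helpers <code>left_align</code> and <code>structural_zero_bits</code> are hypothetical names, and the left-alignment convention is inferred from the paragraph rather than taken from the kmer type itself.</p>

```rust
/// Left-align a right-packed 2-bit-per-base kmer in a u64 (assumed layout).
pub fn left_align(packed: u64, k: u32) -> u64 {
    assert!((1..=32).contains(&k));
    let used = 2 * k;
    if used == 64 {
        packed
    } else {
        packed << (64 - used)
    }
}

/// Number of low bits that are always zero for a given k.
pub fn structural_zero_bits(k: u32) -> u32 {
    64 - 2 * k
}
```

<p>For example, a k=11 kmer with every payload bit set still has 42 trailing zeros after left alignment, so all the entropy sits in the high bits of the key.</p>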
|
||||
<p><code>CubicEps</code> with <code>PtrHashParams::<CubicEps>::default()</code> (λ=3.5) is a balanced tradeoff: 2× slower construction than <code>Linear/λ=3.0</code>, 20% less space.</p>
|
||||
<p><code>CubicEps</code> with <code>PtrHashParams::<CubicEps>::default()</code> (λ=3.5): 2× slower construction than <code>Linear/λ=3.0</code>, ~20% less space.</p>
|
||||
<hr />
|
||||
<h2 id="query-path">Query path</h2>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">query</span><span class="p">(</span><span class="o">&</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">kmer</span><span class="p">:</span><span class="w"> </span><span class="nc">CanonicalKmer</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nb">Option</span><span class="o"><</span><span class="n">Hit</span><span class="o"><</span><span class="n">D</span><span class="p">::</span><span class="n">Item</span><span class="o">>></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">mphf</span><span class="p">.</span><span class="n">find</span><span class="p">(</span><span class="n">kmer</span><span class="p">).</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">slot</span><span class="o">|</span><span class="w"> </span><span class="n">Hit</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">slot</span><span class="p">,</span><span class="w"> </span><span class="n">data</span><span class="p">:</span><span class="w"> </span><span class="nc">self</span><span class="p">.</span><span class="n">data</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="n">slot</span><span class="p">)</span><span class="w"> </span><span class="p">})</span>
|
||||
<h2 id="column-append-and-merge-support">Column append and merge support</h2>
|
||||
<p>These methods extend existing layers with new genome columns without touching the MPHF.</p>
|
||||
<h3 id="layer-level-genome-column-append">Layer-level genome column append</h3>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">Layer</span><span class="o"><</span><span class="n">PersistentBitMatrix</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">append_genome_column</span><span class="p">(</span><span class="n">layer_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">value_of</span><span class="p">:</span><span class="w"> </span><span class="nc">impl</span><span class="w"> </span><span class="nb">Fn</span><span class="p">(</span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">bool</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="p">()</span><span class="o">></span>
|
||||
<span class="p">}</span>
|
||||
|
||||
<span class="k">impl</span><span class="w"> </span><span class="n">Layer</span><span class="o"><</span><span class="n">PersistentCompactIntMatrix</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">append_genome_column</span><span class="p">(</span><span class="n">layer_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">value_of</span><span class="p">:</span><span class="w"> </span><span class="nc">impl</span><span class="w"> </span><span class="nb">Fn</span><span class="p">(</span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="kt">u32</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="p">()</span><span class="o">></span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p><code>MphfLayer::find</code> probes the MPHF, decodes evidence, and verifies the kmer — returning <code>Some(slot)</code> on match, <code>None</code> otherwise. <code>data.read(slot)</code> is called only on a confirmed hit.</p>
|
||||
<p>In <code>LayeredMap</code>, layers are probed in order; the first match wins. Expected probe depth: 1 for kmers in layer 0.</p>
|
||||
<p>Both delegate to the corresponding <code>PersistentBitMatrix::append_column</code> / <code>PersistentCompactIntMatrix::append_column</code>. Each call writes one new column file (<code>col_NNNNNN.pbiv</code> / <code>col_NNNNNN.pciv</code>) and increments <code>n_cols</code> in <code>meta.json</code>. <code>value_of</code> is called once per slot, for slot indices <code>0..n</code>.</p>
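<p>The column-append contract can be illustrated with a minimal in-memory sketch. <code>ColumnStore</code> and its fields are hypothetical stand-ins, not the crate's types: a map plays the role of the column files on disk, and a counter plays the role of <code>n_cols</code> in <code>meta.json</code>. The sketch assumes the new column's file index equals the value of <code>n_cols</code> before the append.</p>

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for a layer's bit-column store.
#[derive(Debug, Default)]
pub struct ColumnStore {
    /// Number of slots, fixed when the layer's MPHF was built.
    pub n: usize,
    /// Mirrors the `n_cols` field of `meta.json`.
    pub n_cols: usize,
    /// File name -> column values; stands in for files on disk.
    pub files: BTreeMap<String, Vec<bool>>,
}

impl ColumnStore {
    pub fn new(n: usize) -> Self {
        Self { n, n_cols: 0, files: BTreeMap::new() }
    }

    /// Appends one column: `value_of` is evaluated once per slot in 0..n,
    /// a new column "file" is recorded, and `n_cols` is incremented.
    /// Existing columns are left untouched.
    pub fn append_column(&mut self, value_of: impl Fn(usize) -> bool) -> String {
        // Assumption: the new file index is the current column count.
        let name = format!("col_{:06}.pbiv", self.n_cols);
        let column: Vec<bool> = (0..self.n).map(|slot| value_of(slot)).collect();
        self.files.insert(name.clone(), column);
        self.n_cols += 1;
        name
    }
}
```

<p>Nothing in this sketch rehashes a kmer: the slot indices handed to <code>value_of</code> are the ones the layer already has, so the append does not depend on the MPHF being touched.</p>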
|
||||
<h3 id="presence-matrix-initialisation">Presence matrix initialisation</h3>
|
||||
<div class="highlight"><pre><span></span><code><span class="k">impl</span><span class="w"> </span><span class="n">Layer</span><span class="o"><</span><span class="p">()</span><span class="o">></span><span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">init_presence_matrix</span><span class="p">(</span><span class="n">layer_dir</span><span class="p">:</span><span class="w"> </span><span class="kp">&</span><span class="nc">Path</span><span class="p">,</span><span class="w"> </span><span class="n">n_kmers</span><span class="p">:</span><span class="w"> </span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">-></span><span class="w"> </span><span class="nc">OLMResult</span><span class="o"><</span><span class="p">()</span><span class="o">></span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
|
||||
<p>Called on the first merge of a Presence-mode index. Creates <code>presence/</code> with <code>meta.json {"n": n_kmers, "n_cols": 1}</code> and <code>col_000000.pbiv</code> set entirely to <code>true</code>. This retroactively records genome 0 (the original source) as present in every slot, satisfying the column-count invariant before any new-source column is appended.</p>
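<p>As a rough illustration of the initial state described above, the sketch below builds the metadata string and the all-<code>true</code> column in memory. <code>PresenceInit</code> and <code>init_presence</code> are hypothetical names chosen for this example; the real method writes files under <code>presence/</code> instead of returning values.</p>

```rust
/// Hypothetical in-memory picture of a freshly initialised `presence/` directory.
pub struct PresenceInit {
    /// Contents that would go into `meta.json`.
    pub meta_json: String,
    /// Contents that would go into `col_000000.pbiv`: one bit per slot.
    pub col0: Vec<bool>,
}

/// Genome 0 is the original source, so it is present in every slot.
pub fn init_presence(n_kmers: usize) -> PresenceInit {
    PresenceInit {
        meta_json: format!("{{\"n\": {}, \"n_cols\": 1}}", n_kmers),
        col0: vec![true; n_kmers],
    }
}
```

<p>After this step the matrix has exactly one column for the one genome indexed so far, so a later column append brings it to two columns for two genomes.</p>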
|
||||
<h3 id="why-the-mphf-is-never-rebuilt">Why the MPHF is never rebuilt</h3>
|
||||
<p>The MPHF, evidence, and unitigs are built once from the kmer set of a layer and are immutable for the lifetime of that layer. Adding a genome column does not change the kmer set — it only appends a new data column indexed by the same slot numbers. The only disk writes are one new <code>.pciv</code>/<code>.pbiv</code> file and a single <code>meta.json</code> update.</p>
|
||||
<hr />
|
||||
<h2 id="add-layer-algorithm">Add-layer algorithm</h2>
|
||||
<p>When adding dataset B to an existing index:</p>
|
||||
<ol>
|
||||
<li>For each partition, probe existing layers for kmers of B routed to that partition.</li>
|
||||
<li>Collect kmers absent from all layers → <code>B \ index</code>.</li>
|
||||
<li>Write <code>B \ index</code> to a new <code>unitigs.bin</code> via <code>MphfLayer::unitig_writer</code>.</li>
|
||||
<li>Call <code>Layer<D>::build</code> on the new directory.</li>
|
||||
<li>Update <code>meta.json</code>.</li>
|
||||
<li>Write <code>B \ index</code> to a new <code>unitigs.bin</code> via <code>next_layer_writer()</code>.</li>
|
||||
<li>Call <code>Layer<D>::build</code> (or <code>build_presence</code>) on the new layer directory.</li>
|
||||
<li>Call <code>push_layer</code> (or <code>append_layer</code>) to register the new layer in <code>meta.json</code>.</li>
|
||||
</ol>
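<p>The set logic of the steps above can be sketched for a single partition as follows. This is a simplified model, not the crate's implementation: each layer is represented by a plain <code>HashSet</code> instead of an MPHF with evidence, kmers are <code>u64</code> values, and <code>new_layer_kmers</code> and <code>add_layer</code> are hypothetical helper names. Skipping the new layer when <code>B \ index</code> is empty is a choice made for this sketch only.</p>

```rust
use std::collections::HashSet;

/// Simplified kmer representation for this sketch.
pub type Kmer = u64;

/// Computes `B \ index`: kmers of `b` that no existing layer contains,
/// deduplicated and kept in first-seen order.
pub fn new_layer_kmers(layers: &[HashSet<Kmer>], b: &[Kmer]) -> Vec<Kmer> {
    let mut seen = HashSet::new();
    b.iter()
        .copied()
        // Probe every existing layer; keep only kmers absent from all of them.
        .filter(|k| !layers.iter().any(|layer| layer.contains(k)))
        // Drop repeats within B itself.
        .filter(|k| seen.insert(*k))
        .collect()
}

/// Builds a new layer from `B \ index` and registers it after the existing ones.
/// Returns the number of kmers in the new layer (0 means no layer was added).
pub fn add_layer(layers: &mut Vec<HashSet<Kmer>>, b: &[Kmer]) -> usize {
    let fresh = new_layer_kmers(layers.as_slice(), b);
    let n = fresh.len();
    if n > 0 {
        layers.push(fresh.into_iter().collect());
    }
    n
}
```

<p>Existing layers are only read, never modified, which is why each partition can run this procedure without coordinating with the others.</p>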
|
||||
<p>Each partition's new layer is built independently; the operation is fully parallel across partitions.</p>
|
||||
<hr />
|
||||
@@ -1433,11 +1902,15 @@ bits [6:0] = rank (7 bits) — kmer index within the chunk (0-based)
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>memmap2 0.9</code></td>
|
||||
<td>mmap of evidence and payload files</td>
|
||||
<td>mmap of evidence and fingerprint files</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>bitvec</code></td>
|
||||
<td>packed b-bit fingerprint storage</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>obiskio</code></td>
|
||||
<td>unitig file writer/reader</td>
|
||||
<td>unitig file writer/reader + <code>.idx</code> build</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>obicompactvec</code></td>
|
||||
@@ -1448,8 +1921,8 @@ bits [6:0] = rank (7 bits) — kmer index within the chunk (0-based)
|
||||
<td>parallel MPHF construction pass</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>ndarray 0.16</code></td>
|
||||
<td>aggregation output arrays</td>
|
||||
<td><code>serde / serde_json</code></td>
|
||||
<td><code>PartitionMeta</code> serialisation</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
|
||||