refactor: implement RoutableSuperKmer and update k-mer indexing pipeline

Replace raw SuperkMer routing with a new RoutableSuperKimer type that embeds canonical sequences and precomputed minimizers, enabling direct partition routing via hash. Update the build pipeline to yield RoutableSuperKmers throughout (builder, scatterer), refactor FASTA/unitig export commands to use the new type and compressed outputs (.fasta.gz, .unitigs.fasta.zst), revise SuperKmer header to store n_kmers instead of seql (avoiding 256-byte wrap), and update documentation to reflect minimizer-based theory, two evidence-encoding strategies for unitig-MPHF indexing (global offset vs. ID+rank), and the new obipipeline library architecture with parallel workers, biased scheduling, and error handling.
This commit is contained in:
Eric Coissac
2026-04-29 22:52:42 +02:00
parent 4e26e3bd40
commit 27f5e88a7b
72 changed files with 10093 additions and 1626 deletions
+85 -1
View File
@@ -221,7 +221,7 @@
<li class="md-nav__item">
<a href="/theory/kmers/" class="md-nav__link">
<a href="/kmers/" class="md-nav__link">
@@ -304,6 +304,34 @@
<li class="md-nav__item">
<a href="/theory/minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="/theory/indexing/" class="md-nav__link">
@@ -498,6 +526,34 @@
<li class="md-nav__item">
<a href="/implementation/obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="/implementation/storage/" class="md-nav__link">
@@ -548,6 +604,34 @@
<li class="md-nav__item">
<a href="/implementation/unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
@@ -9,7 +9,7 @@
<link rel="prev" href="../../../implementation/mphf/">
<link rel="prev" href="../../../implementation/unitig_evidence/">
@@ -228,7 +228,7 @@
<li class="md-nav__item">
<a href="../../../theory/kmers/" class="md-nav__link">
<a href="../../../kmers/" class="md-nav__link">
@@ -311,6 +311,34 @@
<li class="md-nav__item">
<a href="../../../theory/minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../../../theory/indexing/" class="md-nav__link">
@@ -505,6 +533,34 @@
<li class="md-nav__item">
<a href="../../../implementation/obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../../../implementation/storage/" class="md-nav__link">
@@ -555,6 +611,34 @@
<li class="md-nav__item">
<a href="../../../implementation/unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
File diff suppressed because it is too large Load Diff
+85 -1
View File
@@ -230,7 +230,7 @@
<li class="md-nav__item">
<a href="../../theory/kmers/" class="md-nav__link">
<a href="../../kmers/" class="md-nav__link">
@@ -313,6 +313,34 @@
<li class="md-nav__item">
<a href="../../theory/minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../../theory/indexing/" class="md-nav__link">
@@ -611,6 +639,34 @@
<li class="md-nav__item">
<a href="../obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../storage/" class="md-nav__link">
@@ -661,6 +717,34 @@
<li class="md-nav__item">
<a href="../unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
+240 -13
View File
@@ -12,7 +12,7 @@
<link rel="prev" href="../storage/">
<link rel="next" href="../../architecture/sequences/invariant/">
<link rel="next" href="../unitig_evidence/">
@@ -64,7 +64,7 @@
<div data-md-component="skip">
<a href="#mphf-selection-analysis-in-progress" class="md-skip">
<a href="#mphf-selection-two-phase-indexing-architecture" class="md-skip">
Skip to content
</a>
@@ -230,7 +230,7 @@
<li class="md-nav__item">
<a href="../../theory/kmers/" class="md-nav__link">
<a href="../../kmers/" class="md-nav__link">
@@ -313,6 +313,34 @@
<li class="md-nav__item">
<a href="../../theory/minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../../theory/indexing/" class="md-nav__link">
@@ -509,6 +537,34 @@
<li class="md-nav__item">
<a href="../obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../storage/" class="md-nav__link">
@@ -597,6 +653,56 @@
</label>
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
<li class="md-nav__item">
<a href="#indexing-architecture" class="md-nav__link">
<span class="md-ellipsis">
Indexing architecture
</span>
</a>
<nav class="md-nav" aria-label="Indexing architecture">
<ul class="md-nav__list">
<li class="md-nav__item">
<a href="#superkmer-vs-kmer-counts" class="md-nav__link">
<span class="md-ellipsis">
Superkmer vs kmer counts
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#phase-1-provisional-index-and-spectrum" class="md-nav__link">
<span class="md-ellipsis">
Phase 1 — provisional index and spectrum
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#phase-2-definitive-index" class="md-nav__link">
<span class="md-ellipsis">
Phase 2 — definitive index
</span>
</a>
</li>
</ul>
</nav>
</li>
<li class="md-nav__item">
<a href="#candidates" class="md-nav__link">
<span class="md-ellipsis">
@@ -606,6 +712,17 @@
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#mphf-choice-per-phase" class="md-nav__link">
<span class="md-ellipsis">
MPHF choice per phase
</span>
</a>
</li>
<li class="md-nav__item">
@@ -650,6 +767,34 @@
<li class="md-nav__item">
<a href="../unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
@@ -765,6 +910,56 @@
</label>
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
<li class="md-nav__item">
<a href="#indexing-architecture" class="md-nav__link">
<span class="md-ellipsis">
Indexing architecture
</span>
</a>
<nav class="md-nav" aria-label="Indexing architecture">
<ul class="md-nav__list">
<li class="md-nav__item">
<a href="#superkmer-vs-kmer-counts" class="md-nav__link">
<span class="md-ellipsis">
Superkmer vs kmer counts
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#phase-1-provisional-index-and-spectrum" class="md-nav__link">
<span class="md-ellipsis">
Phase 1 — provisional index and spectrum
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#phase-2-definitive-index" class="md-nav__link">
<span class="md-ellipsis">
Phase 2 — definitive index
</span>
</a>
</li>
</ul>
</nav>
</li>
<li class="md-nav__item">
<a href="#candidates" class="md-nav__link">
<span class="md-ellipsis">
@@ -774,6 +969,17 @@
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#mphf-choice-per-phase" class="md-nav__link">
<span class="md-ellipsis">
MPHF choice per phase
</span>
</a>
</li>
<li class="md-nav__item">
@@ -826,29 +1032,50 @@
<h1 id="mphf-selection-analysis-in-progress">MPHF selection — analysis in progress</h1>
<p>The choice of Minimal Perfect Hash Function for phase 6 is not yet settled. Three candidates were evaluated.</p>
<h1 id="mphf-selection-two-phase-indexing-architecture">MPHF selection — two-phase indexing architecture</h1>
<h2 id="indexing-architecture">Indexing architecture</h2>
<p>Kmer indexing per partition proceeds in two phases. The separation is necessary because the exact number of unique kmers in a partition is not known until after counting and filtering.</p>
<h3 id="superkmer-vs-kmer-counts">Superkmer vs kmer counts</h3>
<p>The <code>SKFileMeta</code> sidecar written by <code>SKFileWriter</code> records <code>instances</code> (unique superkmers) and <code>length_sum</code> (total nucleotides). A superkmer of length L contains L k + 1 kmers, so the kmer count per partition can be estimated as <code>length_sum instances × (k 1)</code>. This is an <strong>overestimate</strong> of unique kmers: two distinct superkmers (different flanking contexts, same minimizer) can share kmers. The exact count of unique kmers is only known after enumerating and deduplicating them.</p>
<p>Note: two superkmers sharing a kmer necessarily share the same minimizer and therefore always land in the same partition — no kmer can appear in two different partitions.</p>
<h3 id="phase-1-provisional-index-and-spectrum">Phase 1 — provisional index and spectrum</h3>
<ol>
<li>Enumerate all kmers from the dereplicated superkmers of the partition.</li>
<li>Build a provisional MPHF over this key set; capacity is pre-allocated from the sidecar estimate (slight overestimate, harmless).</li>
<li>Accumulate counts: for each kmer in each superkmer, <code>count[MPHF(kmer)] += sk.count()</code>.</li>
<li>Compute the kmer frequency spectrum (histogram: occurrences → number of kmers).</li>
<li>Apply count filter (e.g. discard singletons). After filtering, the exact number of surviving kmers is known.</li>
<li>Discard the provisional MPHF.</li>
</ol>
<h3 id="phase-2-definitive-index">Phase 2 — definitive index</h3>
<p>Build a new MPHF over the filtered kmer set only, with the exact key count available. This is the persistent per-partition index used for all downstream operations (queries, set operations).</p>
<hr />
<h2 id="candidates">Candidates</h2>
<p><strong>boomphf</strong> (BBHash algorithm, maintained by 10X Genomics):</p>
<ul>
<li>~3.7 bits/key; mature crate, used in production bioinformatics (Pufferfish, Piscem)</li>
<li>Parallel construction; well-tested with DNA kmer data at scale</li>
<li>Drawback: largest space footprint of the three</li>
<li>Drawback: largest space footprint; streaming construction (no exact count needed) was its main differentiator — irrelevant here since exact count is available at phase 2</li>
</ul>
<p><strong>ptr_hash</strong> (PtrHash algorithm, Groot Koerkamp, SEA 2025):</p>
<ul>
<li>~2.4 bits/key; fastest queries (≥2.1× over alternatives, 812 ns/key for u64 in tight loops) and fastest construction (≥3.1×)</li>
<li>Theoretical foundation solid; paper and Rust crate from the same author</li>
<li>Requires exact key count at construction — available at phase 2</li>
<li>Drawback: published February 2025 — very young, no production track record</li>
</ul>
<p><strong>FMPHGO</strong> (<code>ph</code> crate, Beling, ACM JEA 2023):</p>
<ul>
<li>~2.1 bits/key — most compact of the three; good query speed; parallelisable construction</li>
<li>More established than ptr_hash; actively maintained</li>
<li>Currently preferred candidate</li>
<li>Works well with overestimated capacity → natural fit for phase 1</li>
</ul>
<h2 id="mphf-choice-per-phase">MPHF choice per phase</h2>
<p><strong>Phase 1</strong> (provisional, discarded after spectrum computation): FMPHGO. Tolerates overestimated capacity, compact, no need to optimise for query speed on a temporary structure.</p>
<p><strong>Phase 2</strong> (persistent, queried repeatedly): open between FMPHGO and ptr_hash. Exact key count is available, so both operate optimally. ptr_hash's query speed advantage (2.13.3×) is meaningful for the persistent index but carries the risk of a very young crate. FMPHGO is the conservative default; ptr_hash is worth revisiting once it has broader production use.</p>
<p>boomphf is effectively eliminated: its space overhead is the largest and its streaming-construction advantage does not apply here.</p>
<hr />
<h2 id="space-at-scale">Space at scale</h2>
<p>For 1 024 partitions × 100 M kmers/partition:</p>
<p>For 1 024 partitions × 100 M kmers/partition (phase 2 index, after filtering):</p>
<table>
<thead>
<tr>
@@ -875,15 +1102,15 @@
</tr>
</tbody>
</table>
<p>In practice, partition sizes depend on the dataset. For a human genome at 30× coverage with p=10 (1 024 partitions), realistic partition sizes are 330 M kmers → 18 MB per MPHF, well within RAM.</p>
<p>For a human genome at 30× coverage with 1 024 partitions, realistic partition sizes are 330 M unique kmers → 18 MB per phase-2 MPHF, well within RAM.</p>
<h2 id="on-disk-and-mmap-considerations">On-disk and mmap considerations</h2>
<p>All three are in-memory structures. Their internal representation is flat bit arrays (no heap pointers), making them serialisable as contiguous byte blobs and mmappable per partition. True zero-copy access would require rkyv integration; the <code>ph</code> crate currently uses serde, so loading involves a copy. Given per-partition MPHF sizes of 18 MB, the OS page cache handles this transparently — strict zero-copy is a refinement, not a blocker.</p>
<p>No established Rust crate provides a natively on-disk MPHF. <strong>SSHash</strong> (Sparse and Skew Hash) is a complete kmer dictionary designed for disk access and is order-preserving (overlapping kmers receive consecutive indices → cache-friendly count access), but it is C++-only and covers more than just the MPHF layer.</p>
<h2 id="open-questions">Open questions</h2>
<ul>
<li>Confirm actual partition sizes on representative metagenomic datasets before fixing the choice.</li>
<li>Evaluate whether ptr_hash's query speed advantage (2.13.3×) justifies adopting a crate that is less than a year old.</li>
<li>Assess rkyv integration cost for FMPHGO if true zero-copy mmap becomes necessary.</li>
<li>Confirm actual partition sizes and overestimation factor on representative metagenomic datasets.</li>
<li>Revisit ptr_hash for phase 2 once the crate has broader production track record.</li>
<li>Assess rkyv integration cost for FMPHGO if true zero-copy mmap becomes necessary for the persistent index.</li>
<li>Keep SSHash in mind if the indexing architecture is reconsidered at a higher level.</li>
</ul>
File diff suppressed because it is too large Load Diff
+86 -2
View File
@@ -12,7 +12,7 @@
<link rel="prev" href="../chunkreader/">
<link rel="next" href="../storage/">
<link rel="next" href="../obipipeline/">
@@ -230,7 +230,7 @@
<li class="md-nav__item">
<a href="../../theory/kmers/" class="md-nav__link">
<a href="../../kmers/" class="md-nav__link">
@@ -313,6 +313,34 @@
<li class="md-nav__item">
<a href="../../theory/minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../../theory/indexing/" class="md-nav__link">
@@ -633,6 +661,34 @@
<li class="md-nav__item">
<a href="../obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../storage/" class="md-nav__link">
@@ -683,6 +739,34 @@
<li class="md-nav__item">
<a href="../unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
+86 -2
View File
@@ -9,7 +9,7 @@
<link rel="prev" href="../pipeline/">
<link rel="prev" href="../obipipeline/">
<link rel="next" href="../mphf/">
@@ -230,7 +230,7 @@
<li class="md-nav__item">
<a href="../../theory/kmers/" class="md-nav__link">
<a href="../../kmers/" class="md-nav__link">
@@ -313,6 +313,34 @@
<li class="md-nav__item">
<a href="../../theory/minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../../theory/indexing/" class="md-nav__link">
@@ -507,6 +535,34 @@
<li class="md-nav__item">
<a href="../obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
@@ -639,6 +695,34 @@
<li class="md-nav__item">
<a href="../unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
+198 -12
View File
@@ -230,7 +230,7 @@
<li class="md-nav__item">
<a href="../../theory/kmers/" class="md-nav__link">
<a href="../../kmers/" class="md-nav__link">
@@ -313,6 +313,34 @@
<li class="md-nav__item">
<a href="../../theory/minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../../theory/indexing/" class="md-nav__link">
@@ -488,6 +516,17 @@
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#minimizer-sliding-window" class="md-nav__link">
<span class="md-ellipsis">
Minimizer sliding window
</span>
</a>
</li>
<li class="md-nav__item">
@@ -600,6 +639,34 @@
<li class="md-nav__item">
<a href="../obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../storage/" class="md-nav__link">
@@ -650,6 +717,34 @@
<li class="md-nav__item">
<a href="../unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
@@ -796,6 +891,17 @@
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#minimizer-sliding-window" class="md-nav__link">
<span class="md-ellipsis">
Minimizer sliding window
</span>
</a>
</li>
<li class="md-nav__item">
@@ -828,7 +934,7 @@
<h1 id="superkmer-implementation">SuperKmer — implementation</h1>
<h2 id="memory-layout">Memory layout</h2>
<p>A super-kmer is stored as a <strong>32-bit header</strong> followed by a <strong>byte-aligned nucleotide sequence</strong> (2 bits/base, nucleotide 0 at the MSB of the first byte, max 256 nt):</p>
<p>A super-kmer is stored as a <strong>32-bit header</strong> followed by a <strong>byte-aligned nucleotide sequence</strong> (2 bits/base, nucleotide 0 at the MSB of the first byte):</p>
<table>
<thead>
<tr>
@@ -844,21 +950,44 @@
<td>Occurrence count (≤ 16 M)</td>
</tr>
<tr>
<td>SEQL</td>
<td>NKMERS</td>
<td>8</td>
<td>Sequence length in nucleotides (1256)</td>
<td>Number of kmers (= seq_length k + 1, range 1255)</td>
</tr>
</tbody>
</table>
<p>Bit layout (MSB to LSB): <code>[31:8] COUNT [7:0] SEQL</code></p>
<p>SEQL is stored as a raw <code>u8</code>: values 1255 represent lengths 1255; <strong>0 represents 256</strong> (wrapping convention). The public accessor returns a <code>usize</code> and performs the conversion:</p>
<div class="highlight"><pre><span></span><code><span class="k">fn</span><span class="w"> </span><span class="nf">seql</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-&gt;</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">s</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="mi">256</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="n">s</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="p">}</span>
<p>Bit layout (MSB to LSB): <code>[31:8] COUNT [7:0] NKMERS</code></p>
<p>NKMERS is stored as a raw <code>u8</code> in <strong>kmer units</strong>, not nucleotides. The nucleotide length is recovered as <code>NKMERS + k 1</code>. This avoids the awkward wrapping convention (<code>0 = 256</code>) that would be needed if nucleotide length were stored directly, and gains k1 = 30 units of headroom:</p>
<table>
<thead>
<tr>
<th>unit</th>
<th>u8 covers</th>
<th>max nucleotides</th>
</tr>
</thead>
<tbody>
<tr>
<td>nucleotides</td>
<td>255 nt</td>
<td>225 kmers</td>
</tr>
<tr>
<td><strong>kmers</strong></td>
<td><strong>255 kmers</strong></td>
<td><strong>285 nt</strong></td>
</tr>
</tbody>
</table>
<p>The public accessors:</p>
<div class="highlight"><pre><span></span><code><span class="k">fn</span><span class="w"> </span><span class="nf">n_kmers</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-&gt;</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="mi">0</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="mh">0xFF</span><span class="p">)</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="p">}</span>
<span class="k">fn</span><span class="w"> </span><span class="nf">seql</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-&gt;</span><span class="w"> </span><span class="kt">usize</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">n_kmers</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">K</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="p">}</span>
<span class="k">fn</span><span class="w"> </span><span class="nf">count</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">-&gt;</span><span class="w"> </span><span class="kt">u32</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="mi">0</span><span class="w"> </span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="mi">8</span><span class="w"> </span><span class="p">}</span>
<span class="k">fn</span><span class="w"> </span><span class="nf">increment</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="mi">0</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="mi">8</span><span class="p">;</span><span class="w"> </span><span class="p">}</span>
<span class="k">fn</span><span class="w"> </span><span class="nf">add</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">n</span><span class="p">:</span><span class="w"> </span><span class="kt">u32</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="mi">0</span><span class="w"> </span><span class="o">+=</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="mi">8</span><span class="p">;</span><span class="w"> </span><span class="p">}</span>
<span class="k">fn</span><span class="w"> </span><span class="nf">set_count</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span><span class="w"> </span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">n</span><span class="p">:</span><span class="w"> </span><span class="kt">u32</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="mi">0</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="bp">self</span><span class="p">.</span><span class="mi">0</span><span class="w"> </span><span class="o">&amp;</span><span class="w"> </span><span class="mh">0xFF</span><span class="p">)</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="p">(</span><span class="n">n</span><span class="w"> </span><span class="o">&lt;&lt;</span><span class="w"> </span><span class="mi">8</span><span class="p">);</span><span class="w"> </span><span class="p">}</span>
</code></pre></div>
<p>The SEQL field is 8 bits, capping the stored sequence at 256 nt. Given the expected length of ~40 nt, this cap is almost never reached; when it is, the super-kmer is split at 256 nt with a k1 overlap, preserving all kmers without duplication.</p>
<p>In practice, observed super-kmer lengths on metagenomic data (k=31) are below 55 nucleotides (≤ 25 kmers) — far from the 255-kmer cap. If a super-kmer ever exceeds 255 kmers, it is split with a k1 nucleotide overlap, preserving all kmers without duplication (identical mechanism to partition-boundary splits).</p>
<p>The sequence is always stored in canonical form (lexicographic minimum of forward and reverse complement), with nucleotide 0 at the MSB of the first byte. The byte array can be hashed directly without any adjustment.</p>
<h2 id="ascii-encoding-and-decoding">ASCII encoding and decoding</h2>
<p>Two lookup tables handle ASCII ↔ 2-bit conversion:</p>
@@ -883,8 +1012,9 @@
<span class="p">}</span>
</code></pre></div>
<p><code>REVCOMP4</code> is 256 bytes (fits in L1 cache), computed at compile time. No endianness dependency — all operations are pure arithmetic on byte values.</p>
<p><strong>Step 2 — realignment.</strong> After step 1, <code>padding = n × 8 SEQL × 2</code> spurious bits (complements of the original padding A's) appear at the start of the array. They are flushed left using <code>BitSlice&lt;u8, Msb0&gt;::rotate_left(padding)</code> from the <code>bitvec</code> crate, which is SIMD-accelerated. The trailing <code>padding</code> bits are then zeroed:</p>
<div class="highlight"><pre><span></span><code><span class="n">shift</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mi">8</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">SEQL</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mi">2</span><span class="w"> </span><span class="c1">// number of padding bits</span>
<p><strong>Step 2 — realignment.</strong> After step 1, <code>padding = n × 8 seql × 2</code> spurious bits (complements of the original padding A's) appear at the start of the array. They are flushed left using <code>BitSlice&lt;u8, Msb0&gt;::rotate_left(padding)</code> from the <code>bitvec</code> crate, which is SIMD-accelerated. The trailing <code>padding</code> bits are then zeroed:</p>
<div class="highlight"><pre><span></span><code><span class="kd">let</span><span class="w"> </span><span class="n">seql</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="bp">self</span><span class="p">.</span><span class="n">n_kmers</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span>
<span class="n">shift</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mi">8</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">seql</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mi">2</span><span class="w"> </span><span class="c1">// number of padding bits</span>
<span class="n">bits</span><span class="p">.</span><span class="n">rotate_left</span><span class="p">(</span><span class="n">shift</span><span class="p">)</span>
<span class="n">bits</span><span class="p">[</span><span class="n">len</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">shift</span><span class="o">..</span><span class="p">].</span><span class="n">fill</span><span class="p">(</span><span class="kc">false</span><span class="p">)</span>
</code></pre></div>
@@ -900,6 +1030,61 @@
return seq -- palindrome: either orientation valid
</code></pre></div>
</div>
<h2 id="minimizer-sliding-window">Minimizer sliding window</h2>
<p>Super-kmers are built by <code>SuperKmerIter</code> (crate <code>obiskbuilder</code>), which maintains the current minimizer with a <strong>monotonic deque</strong> over a sliding window of W = k m + 1 m-mer positions.</p>
<p>Each deque entry stores:</p>
<table>
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>position</code></td>
<td>usize</td>
<td>0-based start of this m-mer in the segment</td>
</tr>
<tr>
<td><code>canonical</code></td>
<td>u64</td>
<td>right-aligned canonical m-mer value (lex-min of fwd and rc); used as partition key</td>
</tr>
<tr>
<td><code>hash</code></td>
<td>u64</td>
<td><span class="arithmatex">\(H(\text{canonical})\)</span> — ordering key for random minimizer selection</td>
</tr>
</tbody>
</table>
<p>The hash <span class="arithmatex">\(H\)</span> is the seeded splitmix64 finalizer (see <a href="../../theory/minimizer/">Minimizer selection</a>):</p>
<div class="highlight"><pre><span></span><code><span class="k">fn</span><span class="w"> </span><span class="nf">hash_mmer</span><span class="p">(</span><span class="n">canonical</span><span class="p">:</span><span class="w"> </span><span class="kt">u64</span><span class="p">)</span><span class="w"> </span><span class="p">-&gt;</span><span class="w"> </span><span class="kt">u64</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">canonical</span><span class="w"> </span><span class="o">^</span><span class="w"> </span><span class="mh">0x9e3779b97f4a7c15</span><span class="p">;</span><span class="w"> </span><span class="c1">// seed: eliminates fixed point at 0</span>
<span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">^</span><span class="w"> </span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="mi">30</span><span class="p">);</span>
<span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">wrapping_mul</span><span class="p">(</span><span class="mh">0xbf58476d1ce4e5b9</span><span class="p">);</span>
<span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">^</span><span class="w"> </span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="mi">27</span><span class="p">);</span>
<span class="w"> </span><span class="kd">let</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">x</span><span class="p">.</span><span class="n">wrapping_mul</span><span class="p">(</span><span class="mh">0x94d049bb133111eb</span><span class="p">);</span>
<span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">^</span><span class="w"> </span><span class="p">(</span><span class="n">x</span><span class="w"> </span><span class="o">&gt;&gt;</span><span class="w"> </span><span class="mi">31</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div>
<p>On each new nucleotide, once the window is full, the deque is updated:</p>
<div class="admonition abstract">
<p class="admonition-title">Algorithm — minimizer deque update</p>
<div class="highlight"><pre><span></span><code>procedure UpdateMinimizer(deque, position, canonical, hash, k, received):
-- pop dominated entries from the back
while deque.back.hash ≥ hash:
deque.pop_back()
deque.push_back({position, canonical, hash})
-- evict expired entries from the front
while deque.front.position + k &lt; received:
deque.pop_front()
</code></pre></div>
</div>
<p>The front of the deque is always the current minimizer. Because the deque is maintained in strictly increasing hash order, each entry is popped at most once — O(1) amortized per nucleotide.</p>
<p>A super-kmer boundary is emitted when the minimizer changes: <code>deque.front.hash ≠ prev_hash</code>. The <code>canonical</code> field of the front entry is <strong>not</strong> used for boundary detection — that uses the hash alone. The canonical value is stored so that the partition key <span class="arithmatex">\(H(\text{canonical})\)</span> can be recomputed independently at routing time from the stored <code>minimizer_pos</code>, without inheriting the minimum-order-statistic bias (see <a href="../../theory/minimizer/#partition-key-independence">Minimizer selection — partition key independence</a>).</p>
<h2 id="kmer-extraction">Kmer extraction</h2>
<p>A k-mer is extracted from a super-kmer with <code>SuperKmer::kmer(i, k)</code>, which returns a <code>Kmer</code> — a left-aligned <code>u64</code> newtype (see <a href="../kmer/">Kmer implementation</a>):</p>
<div class="highlight"><pre><span></span><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span><span class="w"> </span><span class="nf">kmer</span><span class="p">(</span><span class="o">&amp;</span><span class="bp">self</span><span class="p">,</span><span class="w"> </span><span class="n">i</span><span class="p">:</span><span class="w"> </span><span class="kt">usize</span><span class="p">,</span><span class="w"> </span><span class="n">k</span><span class="p">:</span><span class="w"> </span><span class="kt">usize</span><span class="p">)</span><span class="w"> </span><span class="p">-&gt;</span><span class="w"> </span><span class="nb">Result</span><span class="o">&lt;</span><span class="n">Kmer</span><span class="p">,</span><span class="w"> </span><span class="n">KmerError</span><span class="o">&gt;</span>
@@ -909,8 +1094,9 @@
<div class="admonition abstract">
<p class="admonition-title">Algorithm — Super-kmer reverse complement</p>
<div class="highlight"><pre><span></span><code>procedure SuperKmerRevcomp(seq, SEQL):
n ← ⌈SEQL / 4⌉ -- number of bytes
shift ← n × 8 SEQL × 2 -- padding bits to flush
seql ← NKMERS + k 1 -- nucleotide length
n ← ⌈seql / 4⌉ -- number of bytes
shift ← n × 8 seql × 2 -- padding bits to flush
-- step 1: swap bytes outside-in, applying REVCOMP4 to each (256-byte L1 table)
lo ← 0 ; hi ← n 1
File diff suppressed because it is too large Load Diff
+86 -2
View File
@@ -10,7 +10,7 @@
<link rel="next" href="theory/kmers/">
<link rel="next" href="kmers/">
@@ -297,7 +297,7 @@
<li class="md-nav__item">
<a href="theory/kmers/" class="md-nav__link">
<a href="kmers/" class="md-nav__link">
@@ -380,6 +380,34 @@
<li class="md-nav__item">
<a href="theory/minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="theory/indexing/" class="md-nav__link">
@@ -574,6 +602,34 @@
<li class="md-nav__item">
<a href="implementation/obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="implementation/storage/" class="md-nav__link">
@@ -624,6 +680,34 @@
<li class="md-nav__item">
<a href="implementation/unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
@@ -9,16 +9,16 @@
<link rel="prev" href="../..">
<link rel="prev" href="..">
<link rel="next" href="../encoding/">
<link rel="next" href="../theory/encoding/">
<link rel="icon" href="../../assets/images/favicon.png">
<link rel="icon" href="../assets/images/favicon.png">
<meta name="generator" content="mkdocs-1.6.1, mkdocs-material-9.7.6">
@@ -27,7 +27,7 @@
<link rel="stylesheet" href="../../assets/stylesheets/main.484c7ddc.min.css">
<link rel="stylesheet" href="../assets/stylesheets/main.484c7ddc.min.css">
@@ -46,7 +46,7 @@
<script>__md_scope=new URL("../..",location),__md_hash=e=>[...e].reduce(((e,_)=>(e<<5)-e+_.charCodeAt(0)),0),__md_get=(e,_=localStorage,t=__md_scope)=>JSON.parse(_.getItem(t.pathname+"."+e)),__md_set=(e,_,t=localStorage,a=__md_scope)=>{try{t.setItem(a.pathname+"."+e,JSON.stringify(_))}catch(e){}}</script>
<script>__md_scope=new URL("..",location),__md_hash=e=>[...e].reduce(((e,_)=>(e<<5)-e+_.charCodeAt(0)),0),__md_get=(e,_=localStorage,t=__md_scope)=>JSON.parse(_.getItem(t.pathname+"."+e)),__md_set=(e,_,t=localStorage,a=__md_scope)=>{try{t.setItem(a.pathname+"."+e,JSON.stringify(_))}catch(e){}}</script>
@@ -80,7 +80,7 @@
<header class="md-header md-header--shadow" data-md-component="header">
<nav class="md-header__inner md-grid" aria-label="Header">
<a href="../.." title="obikmer" class="md-header__button md-logo" aria-label="obikmer" data-md-component="logo">
<a href=".." title="obikmer" class="md-header__button md-logo" aria-label="obikmer" data-md-component="logo">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M12 8a3 3 0 0 0 3-3 3 3 0 0 0-3-3 3 3 0 0 0-3 3 3 3 0 0 0 3 3m0 3.54C9.64 9.35 6.5 8 3 8v11c3.5 0 6.64 1.35 9 3.54 2.36-2.19 5.5-3.54 9-3.54V8c-3.5 0-6.64 1.35-9 3.54"/></svg>
@@ -138,7 +138,7 @@
<nav class="md-nav md-nav--primary" aria-label="Navigation" data-md-level="0">
<label class="md-nav__title" for="__drawer">
<a href="../.." title="obikmer" class="md-nav__button md-logo" aria-label="obikmer" data-md-component="logo">
<a href=".." title="obikmer" class="md-nav__button md-logo" aria-label="obikmer" data-md-component="logo">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M12 8a3 3 0 0 0 3-3 3 3 0 0 0-3-3 3 3 0 0 0-3 3 3 3 0 0 0 3 3m0 3.54C9.64 9.35 6.5 8 3 8v11c3.5 0 6.64 1.35 9 3.54 2.36-2.19 5.5-3.54 9-3.54V8c-3.5 0-6.64 1.35-9 3.54"/></svg>
@@ -156,7 +156,7 @@
<li class="md-nav__item">
<a href="../.." class="md-nav__link">
<a href=".." class="md-nav__link">
@@ -357,7 +357,7 @@
<li class="md-nav__item">
<a href="../encoding/" class="md-nav__link">
<a href="../theory/encoding/" class="md-nav__link">
@@ -385,7 +385,7 @@
<li class="md-nav__item">
<a href="../entropy/" class="md-nav__link">
<a href="../theory/entropy/" class="md-nav__link">
@@ -413,7 +413,35 @@
<li class="md-nav__item">
<a href="../indexing/" class="md-nav__link">
<a href="../theory/minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../theory/indexing/" class="md-nav__link">
@@ -495,7 +523,7 @@
<li class="md-nav__item">
<a href="../../implementation/superkmer/" class="md-nav__link">
<a href="../implementation/superkmer/" class="md-nav__link">
@@ -523,7 +551,7 @@
<li class="md-nav__item">
<a href="../../implementation/kmer/" class="md-nav__link">
<a href="../implementation/kmer/" class="md-nav__link">
@@ -551,7 +579,7 @@
<li class="md-nav__item">
<a href="../../implementation/chunkreader/" class="md-nav__link">
<a href="../implementation/chunkreader/" class="md-nav__link">
@@ -579,7 +607,7 @@
<li class="md-nav__item">
<a href="../../implementation/pipeline/" class="md-nav__link">
<a href="../implementation/pipeline/" class="md-nav__link">
@@ -607,7 +635,35 @@
<li class="md-nav__item">
<a href="../../implementation/storage/" class="md-nav__link">
<a href="../implementation/obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../implementation/storage/" class="md-nav__link">
@@ -635,7 +691,7 @@
<li class="md-nav__item">
<a href="../../implementation/mphf/" class="md-nav__link">
<a href="../implementation/mphf/" class="md-nav__link">
@@ -656,6 +712,34 @@
<li class="md-nav__item">
<a href="../implementation/unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
@@ -717,7 +801,7 @@
<li class="md-nav__item">
<a href="../../architecture/sequences/invariant/" class="md-nav__link">
<a href="../architecture/sequences/invariant/" class="md-nav__link">
@@ -846,7 +930,7 @@
<li><strong>k is odd</strong>: an odd-length sequence cannot equal its own reverse complement (no palindromes). This guarantees that the canonical form <code>min(kmer, revcomp(kmer))</code> is always strictly defined — the two orientations are always distinct — which is required for strand-independent counting.</li>
</ul>
<h2 id="super-kmers">Super-kmers</h2>
<p>A <strong>super-kmer</strong> is a maximal run of consecutive kmers from a DNA read, each overlapping the next by k1 nucleotides. Each kmer of the run carries the same <strong>canonical minimizer</strong>. The <strong>canonical minimizer</strong> of a kmer is the smallest value of <code>min(m-mer, revcomp(m-mer))</code> over all m-mers within the kmer (m &lt; k, m odd).</p>
<p>A <strong>super-kmer</strong> is a maximal run of consecutive kmers from a DNA read, each overlapping the next by k1 nucleotides. Each kmer of the run carries the same <strong>canonical minimizer</strong>. The <strong>canonical minimizer</strong> of a kmer is the smallest value of <code>min(m-mer, revcomp(m-mer))</code> over all m-mers within the kmer (m &lt; k, m odd), with the constraint that <strong>non-degenerate m-mers are always preferred</strong> over degenerate ones. A degenerate m-mer is one composed of a single repeated nucleotide (all-A, all-C, all-G, or all-T); such m-mers are selected only if no non-degenerate candidate exists in the window.</p>
<h3 id="canonical-super-kmers">Canonical super-kmers</h3>
<p>A <strong>canonical super-kmer</strong> is the lexicographic minimum of a super-kmer and its reverse complement:</p>
<div class="highlight"><pre><span></span><code>canonical(super-kmer) = min(super-kmer, revcomp(super-kmer))
@@ -919,10 +1003,10 @@
<script id="__config" type="application/json">{"annotate": null, "base": "../..", "features": [], "search": "../../assets/javascripts/workers/search.2c215733.min.js", "tags": null, "translations": {"clipboard.copied": "Copied to clipboard", "clipboard.copy": "Copy to clipboard", "search.result.more.one": "1 more on this page", "search.result.more.other": "# more on this page", "search.result.none": "No matching documents", "search.result.one": "1 matching document", "search.result.other": "# matching documents", "search.result.placeholder": "Type to start searching", "search.result.term.missing": "Missing", "select.version": "Select version"}, "version": null}</script>
<script id="__config" type="application/json">{"annotate": null, "base": "..", "features": [], "search": "../assets/javascripts/workers/search.2c215733.min.js", "tags": null, "translations": {"clipboard.copied": "Copied to clipboard", "clipboard.copy": "Copy to clipboard", "search.result.more.one": "1 more on this page", "search.result.more.other": "# more on this page", "search.result.none": "No matching documents", "search.result.one": "1 matching document", "search.result.other": "# matching documents", "search.result.placeholder": "Type to start searching", "search.result.term.missing": "Missing", "select.version": "Select version"}, "version": null}</script>
<script src="../../assets/javascripts/bundle.79ae519e.min.js"></script>
<script src="../assets/javascripts/bundle.79ae519e.min.js"></script>
<script src="https://unpkg.com/mathjax@3/es5/tex-mml-chtml.js"></script>
Binary file not shown.
+86 -2
View File
@@ -9,7 +9,7 @@
<link rel="prev" href="../kmers/">
<link rel="prev" href="../../kmers/">
<link rel="next" href="../entropy/">
@@ -232,7 +232,7 @@
<li class="md-nav__item">
<a href="../kmers/" class="md-nav__link">
<a href="../../kmers/" class="md-nav__link">
@@ -384,6 +384,34 @@
<li class="md-nav__item">
<a href="../minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../indexing/" class="md-nav__link">
@@ -578,6 +606,34 @@
<li class="md-nav__item">
<a href="../../implementation/obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../../implementation/storage/" class="md-nav__link">
@@ -628,6 +684,34 @@
<li class="md-nav__item">
<a href="../../implementation/unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
+86 -2
View File
@@ -12,7 +12,7 @@
<link rel="prev" href="../encoding/">
<link rel="next" href="../indexing/">
<link rel="next" href="../minimizer/">
@@ -232,7 +232,7 @@
<li class="md-nav__item">
<a href="../kmers/" class="md-nav__link">
<a href="../../kmers/" class="md-nav__link">
@@ -439,6 +439,34 @@
<li class="md-nav__item">
<a href="../minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../indexing/" class="md-nav__link">
@@ -633,6 +661,34 @@
<li class="md-nav__item">
<a href="../../implementation/obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../../implementation/storage/" class="md-nav__link">
@@ -683,6 +739,34 @@
<li class="md-nav__item">
<a href="../../implementation/unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
+86 -2
View File
@@ -9,7 +9,7 @@
<link rel="prev" href="../entropy/">
<link rel="prev" href="../minimizer/">
<link rel="next" href="../../implementation/superkmer/">
@@ -232,7 +232,7 @@
<li class="md-nav__item">
<a href="../kmers/" class="md-nav__link">
<a href="../../kmers/" class="md-nav__link">
@@ -313,6 +313,34 @@
<li class="md-nav__item">
<a href="../minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
@@ -578,6 +606,34 @@
<li class="md-nav__item">
<a href="../../implementation/obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../../implementation/storage/" class="md-nav__link">
@@ -628,6 +684,34 @@
<li class="md-nav__item">
<a href="../../implementation/unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>
File diff suppressed because it is too large Load Diff