refactor: implement RoutableSuperKmer and update k-mer indexing pipeline

Replace raw SuperKmer routing with a new RoutableSuperKmer type that embeds the canonical sequence and a precomputed minimizer, so a super-k-mer can be routed directly to its partition by hashing. The build pipeline now yields RoutableSuperKmers throughout (builder, scatterer). The FASTA/unitig export commands are refactored to use the new type and to write compressed outputs (.fasta.gz, .unitigs.fasta.zst). The SuperKmer header now stores n_kmers instead of seql (avoiding the 256-byte wrap). The documentation is updated to cover the minimizer-based theory, the two evidence-encoding strategies for unitig-MPHF indexing (global offset vs. ID+rank), and the new obipipeline library architecture with parallel workers, biased scheduling, and error handling.
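
The routing idea and the header change described above can be illustrated with a minimal Python sketch. It is not the code from this commit: the class layout, the helper names (`find_minimizer`, `one_byte_fields`), the CRC32 hash, the lexicographic minimizer ordering, and the assumption that the header length field is a single byte are all hypothetical, chosen only to show why a precomputed canonical minimizer gives strand-independent routing and why storing n_kmers leaves more headroom than storing seql.

```python
import zlib
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(seq: str) -> str:
    """Lexicographically smaller of a sequence and its reverse complement."""
    return min(seq, revcomp(seq))


def find_minimizer(seq: str, m: int) -> str:
    """Smallest canonical m-mer of seq (lexicographic order is an assumption)."""
    return min(canonical(seq[i:i + m]) for i in range(len(seq) - m + 1))


@dataclass(frozen=True)
class RoutableSuperKmer:
    """Hypothetical layout: canonical sequence plus its precomputed minimizer."""
    sequence: str
    minimizer: str

    @classmethod
    def from_sequence(cls, seq: str, m: int) -> "RoutableSuperKmer":
        canon = canonical(seq)
        return cls(canon, find_minimizer(canon, m))

    def partition(self, n_partitions: int) -> int:
        # CRC32 stands in for whatever hash the real pipeline uses.
        return zlib.crc32(self.minimizer.encode("ascii")) % n_partitions

    def n_kmers(self, k: int) -> int:
        """Number of k-mers contained in the super-k-mer."""
        return len(self.sequence) - k + 1


def one_byte_fields(seql: int, k: int) -> tuple[int, int]:
    """Values that seql and n_kmers would take in an assumed 8-bit field."""
    return seql & 0xFF, (seql - k + 1) & 0xFF
```

Because the minimizer is taken over canonical m-mers, a super-k-mer and its reverse complement produce the same `RoutableSuperKmer` and therefore land in the same partition. Under the one-byte assumption, a 260-base super-k-mer with k = 31 would wrap if seql were stored, while its n_kmers value (230) still fits.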
Eric Coissac
2026-04-29 22:52:42 +02:00
parent 4e26e3bd40
commit 27f5e88a7b
72 changed files with 10093 additions and 1626 deletions
+86 -2
@@ -10,7 +10,7 @@
<link rel="next" href="theory/kmers/">
<link rel="next" href="kmers/">
@@ -297,7 +297,7 @@
<li class="md-nav__item">
<a href="theory/kmers/" class="md-nav__link">
<a href="kmers/" class="md-nav__link">
@@ -380,6 +380,34 @@
<li class="md-nav__item">
<a href="theory/minimizer/" class="md-nav__link">
<span class="md-ellipsis">
Minimizer selection
</span>
</a>
</li>
<li class="md-nav__item">
<a href="theory/indexing/" class="md-nav__link">
@@ -574,6 +602,34 @@
<li class="md-nav__item">
<a href="implementation/obipipeline/" class="md-nav__link">
<span class="md-ellipsis">
obipipeline library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="implementation/storage/" class="md-nav__link">
@@ -624,6 +680,34 @@
<li class="md-nav__item">
<a href="implementation/unitig_evidence/" class="md-nav__link">
<span class="md-ellipsis">
Unitig evidence encoding
</span>
</a>
</li>
</ul>
</nav>