Fix super k-mer minimizer bijection and add validation test

This commit addresses a bug in the super k-mer implementation where the minimizer bijection property was not properly enforced. The fix ensures that:

1. All k-mers within a super k-mer share the same minimizer
2. Identical super k-mer sequences have the same minimizer

The changes include:

- Fixing the super k-mer iteration logic in IterSuperKmers so that the trailing super k-mer is emitted only when at least k bases remain from its start (len(seq[superKmerStart:]) >= k)
- Adding a comprehensive test suite (TestSuperKmerMinimizerBijection) that validates the intrinsic property of super k-mers
- Updating the .gitignore file to properly track relevant files

This resolves issues where the same sequence could be associated with different minimizers, violating the super k-mer definition.
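
The two properties listed above can be checked roughly as follows. This is a minimal sketch, not the code of TestSuperKmerMinimizerBijection: the superKmer struct, minimizerOf, checkSharedMinimizer and checkSameSequenceSameMinimizer are hypothetical names, and the minimizer is assumed to be the lexicographically smallest m-mer on the forward strand, whereas the actual implementation may use canonical k-mers or a different ordering.

```go
package main

import "fmt"

// superKmer is a hypothetical stand-in for the SuperKmer type in the diff.
type superKmer struct {
	Sequence  string
	Minimizer string
}

// minimizerOf returns the lexicographically smallest m-mer of kmer.
// Assumption: forward-strand lexicographic order; the real ordering may differ.
// The caller must ensure 0 < m <= len(kmer).
func minimizerOf(kmer string, m int) string {
	best := kmer[:m]
	for i := 1; i+m <= len(kmer); i++ {
		if cand := kmer[i : i+m]; cand < best {
			best = cand
		}
	}
	return best
}

// checkSharedMinimizer verifies property 1: every k-mer inside the
// super k-mer has the minimizer recorded for that super k-mer.
func checkSharedMinimizer(sk superKmer, k, m int) error {
	if m <= 0 || m > k {
		return fmt.Errorf("invalid parameters: k=%d, m=%d", k, m)
	}
	if len(sk.Sequence) < k {
		return fmt.Errorf("super k-mer %q is shorter than k=%d", sk.Sequence, k)
	}
	for i := 0; i+k <= len(sk.Sequence); i++ {
		kmer := sk.Sequence[i : i+k]
		if got := minimizerOf(kmer, m); got != sk.Minimizer {
			return fmt.Errorf("k-mer %q has minimizer %q, want %q", kmer, got, sk.Minimizer)
		}
	}
	return nil
}

// checkSameSequenceSameMinimizer verifies property 2: identical super k-mer
// sequences are never associated with two different minimizers.
func checkSameSequenceSameMinimizer(sks []superKmer) error {
	seen := make(map[string]string) // sequence -> first minimizer observed
	for _, sk := range sks {
		if prev, ok := seen[sk.Sequence]; ok && prev != sk.Minimizer {
			return fmt.Errorf("sequence %q has minimizers %q and %q", sk.Sequence, prev, sk.Minimizer)
		}
		seen[sk.Sequence] = sk.Minimizer
	}
	return nil
}
```

The second check does not depend on how the minimizer is ordered, so it stays valid even if the ordering assumption above does not match the real code.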
Eric Coissac
2026-02-08 13:44:23 +01:00
parent 7a979ba77f
commit db98ddb241
4 changed files with 219 additions and 3 deletions


@@ -95,7 +95,7 @@ func IterSuperKmers(seq []byte, k int, m int) iter.Seq[SuperKmer] {
 			}
 		}
-		if !firstKmer {
+		if !firstKmer && len(seq[superKmerStart:]) >= k {
 			superKmer := SuperKmer{
 				Minimizer: currentMinimizer,
 				Start: superKmerStart,