📖 Update super-kmer theory and implementation to prefer non-degenerate m-mers
- Update super-kmer definition in `kmERS.md` to specify that non-degenerate m-mers are preferred over degenerate ones (degeneracy = homopolymer).
- Refactor `superkmer.rs`: change `.canonical()` to mutate in place and return a bool.
- Add `m` field and canonical-aware minimizer position calculation to `SuperKmerIter` in obiskbuilder.
- Add helper function `is_degenerate` and minimizer comparison logic to `rolling_stat.rs` for consistent tie-breaking.
- Minor formatting cleanup in superkmer command and chunk processing.
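The preference rule described above can be illustrated with a minimal sketch. It is not the code from `rolling_stat.rs`: the signature of `is_degenerate`, the helper `minimizer_pos`, and the use of plain lexicographic order on ASCII bases are all assumptions made for illustration, and the canonical-aware part of the real calculation is left out.

```rust
/// True when every base of the m-mer is identical (a homopolymer such as `AAAA`).
/// Hypothetical signature; the real helper may work on a packed encoding.
fn is_degenerate(mmer: &[u8]) -> bool {
    match mmer.split_first() {
        Some((first, rest)) => rest.iter().all(|b| b == first),
        None => true, // assumption: an empty m-mer is treated as degenerate
    }
}

/// Position of the preferred m-mer in `seq` (hypothetical helper).
/// Non-degenerate m-mers always beat degenerate ones; within each group the
/// lexicographically smallest wins, and the leftmost one wins exact ties.
fn minimizer_pos(seq: &[u8], m: usize) -> Option<usize> {
    if m == 0 || seq.len() < m {
        return None;
    }
    seq.windows(m)
        .enumerate()
        .min_by(|(_, a), (_, b)| {
            is_degenerate(a)
                .cmp(&is_degenerate(b)) // false (non-degenerate) sorts first
                .then_with(|| a.cmp(b)) // then lexicographic order
        })
        .map(|(pos, _)| pos)
}

fn main() {
    // Plain lexicographic order would pick `AAA` at position 0;
    // with the preference rule the non-degenerate `AAC` at position 2 wins.
    println!("{:?}", minimizer_pos(b"AAAACGT", 3));
}
```

Comparing the degeneracy flag before the sequence itself makes the result deterministic: a homopolymer is chosen only when no other m-mer exists in the window.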
@@ -286,21 +286,23 @@ impl SuperKmer {
         Ok(self.kmer(i, k)?.canonical(k))
     }
 
-    /// Return this super-kmer in canonical form (lexicographic minimum of forward and revcomp).
-    pub fn canonical(mut self) -> Self {
+    /// Put this super-kmer in canonical form (lexicographic minimum of forward and revcomp).
+    ///
+    /// Returns `true` if already canonical (no change), `false` if revcomp was applied.
+    pub fn canonical(&mut self) -> bool {
         let seql = self.seql();
         for i in 0..seql {
             let fwd = self.nucleotide(i);
             let rev = complement(self.nucleotide(seql - 1 - i));
             if fwd < rev {
-                return self;
+                return true;
             }
             if fwd > rev {
                 self.revcomp();
-                return self;
+                return false;
             }
         }
-        self
+        true
     }
 }
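To show how a caller might use the new in-place signature, here is a self-contained sketch. `Seq` is a hypothetical stand-in for `SuperKmer` that stores plain ASCII bases, and this `complement` is a simplified local version; only the control flow of `canonical` follows the hunk above.

```rust
/// Toy stand-in for `SuperKmer` (hypothetical): an ASCII DNA sequence.
#[derive(Debug, Clone, PartialEq)]
struct Seq(Vec<u8>);

/// Watson-Crick complement of one ASCII base; other bytes pass through.
fn complement(b: u8) -> u8 {
    match b {
        b'A' => b'T',
        b'C' => b'G',
        b'G' => b'C',
        b'T' => b'A',
        other => other,
    }
}

impl Seq {
    /// Reverse-complement the sequence in place.
    fn revcomp(&mut self) {
        self.0.reverse();
        for b in self.0.iter_mut() {
            *b = complement(*b);
        }
    }

    /// Put the sequence in canonical form in place.
    /// Returns `true` if it was already canonical, `false` if revcomp was applied.
    fn canonical(&mut self) -> bool {
        let n = self.0.len();
        for i in 0..n {
            let fwd = self.0[i];
            let rev = complement(self.0[n - 1 - i]);
            if fwd < rev {
                return true;
            }
            if fwd > rev {
                self.revcomp();
                return false;
            }
        }
        true // forward and reverse complement are identical
    }
}

fn main() {
    let mut s = Seq(b"TTGA".to_vec());
    // The returned flag tells the caller whether the strand was flipped,
    // which matters for anything tracked relative to the original orientation.
    let was_canonical = s.canonical();
    println!("{} {}", was_canonical, String::from_utf8_lossy(&s.0));
}
```

Returning a bool instead of `Self` lets a caller learn the orientation without comparing the sequence before and after, for example to mirror a stored minimizer position when the strand was flipped.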