Files
obitools4/autodoc/docmd/pkg/obikmer/minimizer_utils.md
T
Eric Coissac 8c7017a99d ⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
2026-04-13 13:34:53 +02:00

1.3 KiB
Raw Blame History

Minimizer Size Utilities in obikmer

This Go package provides helper functions to compute and validate the minimizer size m in k-mer-based genomic algorithms (e.g., minimizer schemes for sequence comparison or indexing).

Core Functions

  • DefaultMinimizerSize(k)
    Returns a recommended minimizer size: ceil(k / 2.5), clamped to [1, k1].
    → Ensures m is reasonably large for uniqueness while keeping window size (k m + 1) manageable.

  • MinMinimizerSize(nworkers)
    Computes the minimum m such that there are ≥ nworkers distinct minimizers:
    solves 4^m ≥ n_workers, i.e., ceil(log₄(nworkers)).
    → Guarantees enough diversity for parallelization (e.g., hashing-based distribution across workers).

  • ValidateMinimizerSize(m, k, nworkers)
    Enforces constraints on m:

    • Lower bound: ≥ MinMinimizerSize(nworkers) (warns & adjusts if violated)
    • Hard bounds: 1 ≤ m < k
      → Prevents invalid or inefficient parameter choices.

Semantic Purpose

These functions ensure that minimizer-based workflows are:

  • Theoretically sound (sufficient entropy for parallelism),
  • Practically viable (avoiding degenerate cases like m = 0 or m ≥ k),
  • User-friendly (providing sensible defaults + clear warnings on adjustment).