# Semantic Description of `obikmer` Package Functionalities

The `obikmer` package provides tools for **super k-mer extraction and minimizer-based sequence analysis** in bioinformatics.

## Core Concepts

A **super k-mer** is a maximal contiguous subsequence of DNA where *all* embedded *k*-mers share the **same minimizer**—a compact representative (typically lexicographically minimal) of *m*-mers, considering both forward and reverse-complement strands.

## Key Functions & Features

- **`IterSuperKmers(seq, k, m)`**:  
  An iterator over all super *k*-mers in input sequence `seq`, parameterized by:
  - `k`: length of embedded *k*-mers,
  - `m`: size of minimizer window (`m ≤ k`).  
  Yields structured objects with:
  - `Sequence`: the super *k*-mer substring,
  - `Start`/`End`: genomic coordinates (0-based half-open),
  - `Minimizer`: canonical hash of the shared minimizer.

- **`ExtractSuperKmers(...)`**:  
  Synchronous counterpart returning a slice of all super *k*-mers.

## Verified Properties (via Tests)

1. **Boundary correctness**: Extracted subsequences match `seq[start:end]`.
2. **Consistency between iterator and slice versions**: Both APIs produce identical results.
3. **Bijection property**:
   - Each unique super *k*-mer sequence maps to exactly one minimizer.
   - All embedded *k*-mers within a super *k-mer* share the same minimizer.

## Implementation Notes

- Minimizers are computed canonically (min of forward and reverse-complement encodings).
- Uses base encoding via `__single_base_code__` (assumed helper mapping A/C/G/T → 0/1/2/3).
- Tests cover simple, homopolymer-rich, and complex genomic patterns.

## Design Rationale

Super *k*-mers enable efficient compression, indexing (e.g., in minimizer spaces), and alignment-free comparisons—crucial for scalable genomic analysis.