⬆️ version bump to v4.5

- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
Eric Coissac
2026-04-07 08:36:50 +02:00
parent 670edc1958
commit 8c7017a99d
392 changed files with 18875 additions and 141 deletions
@@ -0,0 +1,19 @@
## `obialign` Package: Semantic Overview
The `obialign` package provides a lightweight, high-performance utility for **detecting single-edit-distance relationships** between biological sequences (`obiseq.BioSequence`). Its core function, `D1Or0`, determines whether two sequences are either **identical** or differ by exactly **one substitution, insertion, or deletion (indel)**.
- `abs[k]`: A generic helper computing absolute values for integers or floats (via Go generics).
- `D1Or0(...)`: Returns a 4-tuple:
  - **`int` (first)**: `0` if identical, `1` if differing by one edit, `-1` otherwise.
  - **`int` (second)**: Position of the differing site (`-1` if identical).
  - **`byte`, `byte`**: Mismatched characters (or `'-'` for gaps indicating indels).

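
As a rough illustration of the two items above, the sketch below shows what a generic absolute-value helper and a consumer of the 4-tuple might look like. The names `Number`, `absSketch`, and `describe` are hypothetical and do not come from the package; the real constraint and signatures may differ.

```go
package main

import "fmt"

// Number is a hypothetical constraint covering signed integers and floats.
// The constraint actually used by the package is not shown in the source.
type Number interface {
	~int | ~int8 | ~int16 | ~int32 | ~int64 | ~float32 | ~float64
}

// absSketch returns the absolute value of x for any Number type.
func absSketch[K Number](x K) K {
	if x < 0 {
		return -x
	}
	return x
}

// describe renders a (status, position, byte, byte) result, laid out as in
// the list above, as a human-readable string.
func describe(status, pos int, a, b byte) string {
	switch status {
	case 0:
		return "identical"
	case 1:
		return fmt.Sprintf("one edit at %d: %c/%c", pos, a, b)
	default:
		return "more than one edit"
	}
}

func main() {
	fmt.Println(absSketch(-3), absSketch(2.5))
	fmt.Println(describe(1, 4, 'A', '-'))
}
```

A caller would typically branch on the first `int` only and read the position and bytes when it equals `1`.
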
**Algorithmic strategy:**
1. Early rejection if length difference exceeds 1.
2. Forward scan until first mismatch → identifies left boundary of divergence.
3. Backward scan from ends to find rightmost match boundary.
4. Validates whether the mismatch region allows exactly one edit:
   - Single substitution: equal lengths, single divergent position.
   - Insertion/deletion: length differs by 1 and only one non-overlapping character remains.

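
The four steps above can be sketched on plain byte slices as follows. This is a minimal re-implementation for illustration, not the package's code: `d1Or0Sketch` is a hypothetical name, it takes `[]byte` rather than `obiseq.BioSequence`, and the values returned in the `-1` case (position `-1`, zero bytes) are an assumption.

```go
package main

import "fmt"

// d1Or0Sketch reports 0 (identical), 1 (exactly one edit) or -1 (otherwise),
// plus the edit position and the two differing bytes ('-' marks a gap).
func d1Or0Sketch(a, b []byte) (int, int, byte, byte) {
	la, lb := len(a), len(b)

	// Step 1: reject early when the lengths differ by more than one.
	if d := la - lb; d > 1 || d < -1 {
		return -1, -1, 0, 0
	}

	// Step 2: forward scan to the first mismatch (left boundary).
	i := 0
	for i < la && i < lb && a[i] == b[i] {
		i++
	}
	if i == la && i == lb {
		return 0, -1, 0, 0 // identical
	}

	// Step 3: backward scan from both ends, never crossing the left boundary.
	ja, jb := la-1, lb-1
	for ja >= i && jb >= i && a[ja] == b[jb] {
		ja--
		jb--
	}

	// Step 4: exactly one unmatched character may remain.
	switch {
	case la == lb && ja == i && jb == i:
		return 1, i, a[i], b[i] // substitution
	case la == lb+1 && ja == i && jb == i-1:
		return 1, i, a[i], '-' // extra character in a
	case lb == la+1 && jb == i && ja == i-1:
		return 1, i, '-', b[i] // extra character in b
	}
	return -1, -1, 0, 0 // assumed return values for the "otherwise" case
}

func main() {
	for _, p := range [][2]string{{"ACGT", "ACGT"}, {"ACGT", "AGGT"}, {"ACGT", "AGT"}, {"ACGT", "AGCT"}} {
		s, pos, x, y := d1Or0Sketch([]byte(p[0]), []byte(p[1]))
		fmt.Println(p[0], p[1], s, pos, string(x), string(y))
	}
}
```

Both scans touch each position at most once, so the sketch runs in linear time and allocates nothing, which is in keeping with the package's stated goal of avoiding a full alignment.
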
The package is designed for speed in **OTU/ASV dereplication or error correction** pipelines (e.g., metabarcoding), where rapid filtering of near-identical sequences is critical. It does *not* compute full alignments; it is optimized for a binary decision under a strict one-edit constraint.