mirror of
https://github.com/metabarcoding/obitools4.git
synced 2026-04-30 03:50:39 +00:00
⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5" - Update version.txt from 4.29 → .30 (automated by Makefile)
This commit is contained in:
@@ -0,0 +1,19 @@
|
||||
## `obialign` Package: Semantic Overview (≤50 lines)
|
||||
|
||||
The `obialign` package provides a lightweight, high-performance utility for **detecting single-edit-distance relationships** between biological sequences (`obiseq.BioSequence`). Its core function, `D1Or0`, determines whether two sequences are either **identical** or differ by exactly **one substitution, insertion, or deletion (indel)**.
|
||||
|
||||
- `abs[k]`: A generic helper computing absolute values for integers or floats (via Go generics).
|
||||
- `D1Or0(...)`: Returns a 4-tuple:
|
||||
- **`int` (first)**: `0` if identical, `1` if differing by one edit, `-1` otherwise.
|
||||
- **`int` (second)**: Position of the differing site (`-1` if identical).
|
||||
- **`byte`, `byte`**: Mismatched characters (or `'-'` for gaps indicating indels).
|
||||
|
||||
**Algorithmic strategy:**
|
||||
1. Early rejection if length difference exceeds 1.
|
||||
2. Forward scan until first mismatch → identifies left boundary of divergence.
|
||||
3. Backward scan from ends to find rightmost match boundary.
|
||||
4. Validates whether the mismatch region allows exactly one edit:
|
||||
- Single substitution: equal lengths, single divergent position.
|
||||
- Insertion/deletion: length differs by 1 and only one non-overlapping character remains.
|
||||
|
||||
Designed for speed in **OTU/ASV dereplication or error correction** pipelines (e.g., metabarcoding), where rapid filtering of near-identical sequences is critical. Does *not* compute full alignments; optimized for binary decision-making under strict edit constraints.
|
||||
Reference in New Issue
Block a user