# `obidist`: Efficient Symmetric Distance and Similarity Matrix Management

The `obidist` Go package provides memory-efficient symmetric matrices for pairwise **distance** and **similarity** values. It suits clustering, phylogenetics, and other workloads that need fast pairwise lookups with a small memory footprint. It enforces structural guarantees (symmetry, fixed diagonal) and offers bounds-checked, label-aware operations.

## Core Types

| Type | Description |
|------|-------------|
| `DistMatrix` | Symmetric *n×n* matrix for **distances**; diagonal entries are always `0.0`. |
| `SimilarityMatrix` | Symmetric *n×n* matrix for **similarities**; diagonal entries are always `1.0`. |

Both types store only the strict upper triangle (`i < j`), which cuts storage from *n²* values to *n(n−1)/2*, roughly half. `Get` and `Set` map *(i, j)* and *(j, i)* to the same stored entry, so symmetry holds automatically.
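
The upper-triangle layout can be illustrated with a small index calculation. This is a minimal sketch that assumes the triangle is kept in a flat, row-major slice; `triIndex` is a hypothetical helper and is not taken from `obidist` itself, whose internal layout may differ.

```go
package main

import "fmt"

// triIndex maps an off-diagonal pair (i, j) of an n×n symmetric matrix to a
// position in a flat slice holding the strict upper triangle in row-major
// order. Hypothetical helper (assumption), not part of obidist's API.
// The caller must ensure i != j and 0 <= i, j < n.
func triIndex(n, i, j int) int {
	if i > j {
		i, j = j, i // (j, i) shares storage with (i, j)
	}
	// Rows 0..i-1 contribute (n-1) + (n-2) + ... + (n-i) entries,
	// which sums to i*(2n-i-1)/2; then step to column j within row i.
	return i*(2*n-i-1)/2 + (j - i - 1)
}

func main() {
	n := 4
	fmt.Println("stored values:", n*(n-1)/2)
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			fmt.Printf("(%d,%d) -> %d\n", i, j, triIndex(n, i, j))
		}
	}
}
```

Swapping the indices before computing the offset is what makes mirrored access free: no second copy of the value is ever stored.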

## Constructors

| Function | Description |
|---------|-------------|
| `NewDistMatrix(n)` / `WithLabels(labels []string)` | Creates a distance matrix of size *n×n* (diag = 0). Labels are optional. |
| `NewSimilarityMatrix(n)` / `WithLabels(labels []string)` | Creates a similarity matrix of size *n×n* (diag = 1). Labels are optional. |

> **Note**: Passing `labels` with length ≠ *n* panics; empty labels (`nil`) are allowed.
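
The label rule in the note can be sketched as a small validation step. The type `labeled` and the functions `newLabeled` and `panics` below are hypothetical names used only for illustration. Treating any zero-length slice as "no labels" and copying the caller's slice are assumptions, not documented `obidist` behavior.

```go
package main

import "fmt"

// labeled is a hypothetical stand-in for a matrix header: a size plus
// optional labels. It is not an obidist type.
type labeled struct {
	n      int
	labels []string
}

// newLabeled panics when labels are supplied but their count differs from n.
// Assumption: a nil or empty slice means "no labels".
func newLabeled(n int, labels []string) *labeled {
	if len(labels) != 0 && len(labels) != n {
		panic(fmt.Sprintf("got %d labels for a matrix of size %d", len(labels), n))
	}
	// Copy so later changes to the caller's slice do not leak in (assumption).
	return &labeled{n: n, labels: append([]string(nil), labels...)}
}

// panics reports whether calling f triggers a panic.
func panics(f func()) (p bool) {
	defer func() { p = recover() != nil }()
	f()
	return false
}

func main() {
	m := newLabeled(3, []string{"a", "b", "c"})
	fmt.Println(m.n, m.labels)
	fmt.Println("nil labels panic:", panics(func() { newLabeled(3, nil) }))
	fmt.Println("short labels panic:", panics(func() { newLabeled(3, []string{"a"}) }))
}
```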

## Core Operations

| Method | Description |
|--------|-------------|
| `Get(i, j) float64` | Returns the value at *(i, j)*; `Get(i, j)` and `Get(j, i)` read the same stored upper-triangle entry. |
| `Set(i, j, v float64)` | Sets the value at *(i, j)* and, by symmetry, at *(j, i)*; silently ignores diagonal assignments. |
| `Size() int` | Returns *n*, the matrix dimension. |
| `GetLabel(i) string`, `SetLabel(i int, label string)` | Read/write the *i*-th element's label. |
| `Labels() []string` | Returns a **copy** of all labels; mutating it does not affect the matrix. |
| `GetRow(i) []float64`, `GetColumn(j) []float64` | Returns the full row or column as a **new slice**, with mirrored entries filled in by symmetry. |

> Every indexed access panics when an index is out of bounds (`i < 0` or `i ≥ n`). Diagonal writes (e.g., `Set(i, i, v)`) are silently ignored.
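
The access rules above (mirrored reads and writes, a fixed diagonal, panics on bad indices) can be sketched with a stand-in type. `symMatrix` and its methods are hypothetical and written only to illustrate the described behavior. The order of checks, bounds first and diagonal second, is an assumption.

```go
package main

import "fmt"

// symMatrix is a hypothetical symmetric matrix with a fixed diagonal.
// It illustrates the documented behavior and is not obidist's own type.
type symMatrix struct {
	n    int
	diag float64   // 0.0 for distances, 1.0 for similarities
	data []float64 // strict upper triangle, row-major (assumed layout)
}

func newSymMatrix(n int, diag float64) *symMatrix {
	return &symMatrix{n: n, diag: diag, data: make([]float64, n*(n-1)/2)}
}

// check panics if either index falls outside [0, n).
func (m *symMatrix) check(i, j int) {
	if i < 0 || i >= m.n || j < 0 || j >= m.n {
		panic(fmt.Sprintf("index (%d, %d) out of bounds for size %d", i, j, m.n))
	}
}

// offset locates an off-diagonal pair in the flat upper triangle.
func (m *symMatrix) offset(i, j int) int {
	if i > j {
		i, j = j, i
	}
	return i*(2*m.n-i-1)/2 + (j - i - 1)
}

// Get returns the fixed diagonal value when i == j, else the stored entry.
func (m *symMatrix) Get(i, j int) float64 {
	m.check(i, j)
	if i == j {
		return m.diag
	}
	return m.data[m.offset(i, j)]
}

// Set stores v for both (i, j) and (j, i); diagonal writes are ignored.
func (m *symMatrix) Set(i, j int, v float64) {
	m.check(i, j)
	if i == j {
		return
	}
	m.data[m.offset(i, j)] = v
}

func main() {
	d := newSymMatrix(3, 0.0)
	d.Set(0, 2, 1.5)
	d.Set(1, 1, 9.0) // ignored: the diagonal stays fixed
	fmt.Println(d.Get(2, 0), d.Get(1, 1))
}
```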

## Analysis & Utility Methods

| Method | Description |
|--------|-------------|
| `MinDistance() (val float64, i, j int)` | Returns the smallest off-diagonal value and its indices. For *n ≤ 1*, returns `(0, -1, -1)`. |
| `MaxDistance() (val float64, i, j int)` | Returns the largest off-diagonal value and its indices. For *n ≤ 1*, returns `(0, -1, -1)`. |
| `Copy() *DistMatrix` | Deep copy (including labels). The copy shares no state with the original, so it can be modified or passed to another goroutine independently. |
| `ToFullMatrix() [][]float64` | Returns a dense *n×n* copy (both triangles plus the diagonal). This allocates *n²* values, so use it sparingly on large matrices. |
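
A minimum search in the style of `MinDistance` only needs to scan the stored triangle once. The sketch below assumes a flat, row-major upper triangle and uses a hypothetical function name, `minOffDiag`. Returning the first minimum found when there are ties is also an assumption.

```go
package main

import "fmt"

// minOffDiag scans a flat strict upper triangle (row-major, assumed layout)
// of an n×n symmetric matrix and returns the smallest value with its (i, j).
// Hypothetical helper, not obidist's implementation.
func minOffDiag(n int, tri []float64) (val float64, i, j int) {
	if n <= 1 {
		return 0, -1, -1 // no off-diagonal entries exist
	}
	val, i, j = tri[0], 0, 1 // first stored entry is (0, 1)
	k := 0
	for r := 0; r < n; r++ {
		for c := r + 1; c < n; c++ {
			if tri[k] < val {
				val, i, j = tri[k], r, c
			}
			k++
		}
	}
	return val, i, j
}

func main() {
	// Entries for n = 3, in order: (0,1), (0,2), (1,2).
	tri := []float64{0.4, 0.1, 0.7}
	fmt.Println(minOffDiag(3, tri))
	fmt.Println(minOffDiag(1, nil))
}
```

The same loop with the comparison reversed gives the maximum.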

## Edge Cases & Guarantees

- **Empty matrix** (*n = 0*): All methods behave safely (e.g., `Size()` → `0`, min/max → `(0, -1, -1)`).
- **Singleton matrix** (*n = 1*): Only the diagonal exists, so min/max return `(0, -1, -1)`.
- **Defensive copies**: `Labels()`, `GetRow`, and `GetColumn` return fresh slices, so callers cannot alter internal state through them.
- **No normalization enforced** on similarity values (anything in `[-∞, +∞]` is accepted), but the diagonal is *always* fixed.

The package is designed for correctness-first scientific workflows; its unit tests cover bounds checks and symmetry.