⬆️ version bump to v4.5

- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
This commit is contained in:
Eric Coissac
2026-04-07 08:36:50 +02:00
parent 670edc1958
commit 8c7017a99d
392 changed files with 18875 additions and 141 deletions
@@ -0,0 +1,41 @@
# Taxonomic Annotation Features in `obiseq` Package
This package provides semantic taxonomic annotation capabilities for biological sequences (`BioSequence`). It integrates with a taxonomy database to assign, retrieve, and manage taxonomic identifiers (taxids) and related metadata.
## Core Functions
- **`Taxid()`**: Retrieves the taxonomic ID as a string (e.g., `"12345"` or `"NA"`), supporting multiple internal representations (`string`, `int`, `float64`). Returns `"NA"` if no taxid is set.
- **`Taxon(taxonomy)`**: Returns the corresponding `*obitax.Taxon` object, or `nil` if taxid is `"NA"`.
- **`SetTaxid(taxid, rank...)`**: Assigns a taxonomic ID to the sequence. Validates against default taxonomy; handles aliases and errors based on configuration flags (`FailOnTaxonomy`, `UpdateTaxid`). Optionally stores taxid under a custom rank (e.g., `"genus_taxid"`).
- **`SetTaxon(taxon, rank...)`**: Assigns a `*obitax.Taxon` object directly; stores its string representation as taxid.
## Rank-Specific Annotation
- **`SetTaxonAtRank(taxonomy, rank)`**: Annotates the sequence with taxid and scientific name at a specified Linnaean rank (e.g., `"species"`, `"genus"`). Sets two attributes: `rank_taxid` and `rank_name`. Returns the taxon at that rank (or `nil`).
- **Convenience wrappers**:
- `SetSpecies(...)`
- `SetGenus(...)`
- `SetFamily(...)`
All delegate to `SetTaxonAtRank`.
## Taxonomic Path & Metadata
- **`SetPath(taxonomy)`**: Computes and stores the full taxonomic lineage (from root to species) as a string slice under attribute `"taxonomic_path"`.
- **`Path()`**: Retrieves the stored taxonomic path; recomputes it if missing and a default taxonomy exists.
- **`SetScientificName(taxonomy)`**: Stores the sequences species-level scientific name under `"scientific_name"`.
- **`SetTaxonomicRank(taxonomy)`**: Stores the taxons rank (e.g., `"species"`, `"genus"`) under `"taxonomic_rank"`.
## Error Handling & Configuration
- Uses `logrus` and custom logging (`obilog`) for warnings/errors.
- Behavior on taxonomy mismatches (e.g., unknown taxid, alias) is configurable via `obidefault` settings.
- Ensures type consistency: taxid must be string, int, or float; invalid types trigger fatal errors.
All methods are designed for seamless integration into bioinformatics pipelines, enabling robust taxonomic profiling of sequencing data.