# Taxonomic Annotation Features in `obiseq` Package

The `obiseq` package provides taxonomic annotation methods for biological sequences (`BioSequence`). These methods work against a taxonomy database to assign, retrieve, and manage taxonomic identifiers (taxids) and related metadata.

## Core Functions
- **`Taxid()`**: Returns the taxonomic ID as a string (e.g., `"12345"`), or `"NA"` if no taxid is set. The stored value may be a `string`, an `int`, or a `float64`.

- **`Taxon(taxonomy)`**: Returns the corresponding `*obitax.Taxon` object, or `nil` if the taxid is `"NA"`.

- **`SetTaxid(taxid, rank...)`**: Assigns a taxonomic ID to the sequence. The taxid is validated against the default taxonomy; aliases and errors are handled according to the configuration flags `FailOnTaxonomy` and `UpdateTaxid`. An optional rank stores the taxid under a rank-specific attribute (e.g., `"genus_taxid"`).

- **`SetTaxon(taxon, rank...)`**: Assigns a `*obitax.Taxon` object directly and stores its string representation as the taxid.
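The normalization that `Taxid()` is described as performing can be sketched in a few lines of Go. This is a minimal illustration, not the package's code: the `annotated` map type and the `taxidString` function are hypothetical names, and the sketch returns an error for unsupported types where the package is described as failing fatally.

```go
package main

import (
	"fmt"
	"strconv"
)

// annotated is a hypothetical stand-in for a sequence's annotation map;
// it is not the real obiseq.BioSequence type.
type annotated map[string]interface{}

// taxidString returns the "taxid" annotation as a string, accepting the
// three representations named in the text, and "NA" when nothing is set.
func taxidString(a annotated) (string, error) {
	v, ok := a["taxid"]
	if !ok {
		return "NA", nil
	}
	switch t := v.(type) {
	case string:
		return t, nil
	case int:
		return strconv.Itoa(t), nil
	case float64:
		// For example, encoding/json decodes JSON numbers into float64
		// when the target is an interface{} value.
		return strconv.FormatInt(int64(t), 10), nil
	default:
		return "", fmt.Errorf("unsupported taxid type %T", v)
	}
}

func main() {
	inputs := []annotated{
		{"taxid": "12345"},
		{"taxid": 12345},
		{"taxid": 12345.0},
		{},
	}
	for _, a := range inputs {
		s, err := taxidString(a)
		fmt.Println(s, err)
	}
}
```

Returning a plain string in every case lets callers compare taxids without caring how the annotation was originally parsed.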
## Rank-Specific Annotation
- **`SetTaxonAtRank(taxonomy, rank)`**: Annotates the sequence with the taxid and scientific name of its taxon at a given Linnaean rank (e.g., `"species"`, `"genus"`). Sets two attributes named after the rank, `<rank>_taxid` and `<rank>_name` (e.g., `genus_taxid` and `genus_name`). Returns the taxon at that rank, or `nil` if there is none.

- **Convenience wrappers**:
  - `SetSpecies(...)`
  - `SetGenus(...)`
  - `SetFamily(...)`

  All delegate to `SetTaxonAtRank`.
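The rank-specific annotation and its wrappers can be sketched as follows. The `taxon` struct, the attribute map, and all function names are hypothetical stand-ins rather than the `obitax` or `obiseq` API. The sketch assumes that attribute names are built from the rank (as the `genus_taxid` example suggests) and that `"NA"` is written when no ancestor has the requested rank.

```go
package main

import "fmt"

// taxon is a hypothetical, minimal taxonomy node (not obitax.Taxon).
type taxon struct {
	id, name, rank string
	parent         *taxon
}

// atRank walks towards the root and returns the first node (the taxon
// itself or an ancestor) carrying the requested rank, or nil if none does.
func (t *taxon) atRank(rank string) *taxon {
	for n := t; n != nil; n = n.parent {
		if n.rank == rank {
			return n
		}
	}
	return nil
}

// setTaxonAtRank stores "<rank>_taxid" and "<rank>_name" in attrs.
// Writing "NA" when the rank is absent is an assumption of this sketch.
func setTaxonAtRank(attrs map[string]string, t *taxon, rank string) *taxon {
	found := t.atRank(rank)
	id, name := "NA", "NA"
	if found != nil {
		id, name = found.id, found.name
	}
	attrs[rank+"_taxid"] = id
	attrs[rank+"_name"] = name
	return found
}

// Convenience wrappers: each one only fixes the rank argument.
func setSpecies(attrs map[string]string, t *taxon) *taxon {
	return setTaxonAtRank(attrs, t, "species")
}

func setGenus(attrs map[string]string, t *taxon) *taxon {
	return setTaxonAtRank(attrs, t, "genus")
}

func setFamily(attrs map[string]string, t *taxon) *taxon {
	return setTaxonAtRank(attrs, t, "family")
}

func main() {
	homo := &taxon{id: "9605", name: "Homo", rank: "genus"}
	sapiens := &taxon{id: "9606", name: "Homo sapiens", rank: "species", parent: homo}

	attrs := map[string]string{}
	setSpecies(attrs, sapiens)
	setGenus(attrs, sapiens)
	setFamily(attrs, sapiens) // no family in this toy lineage
	fmt.Println(attrs)
}
```

Keeping the wrappers as one-line delegations means any change to the attribute naming only has to be made in one place.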
## Taxonomic Path & Metadata
- **`SetPath(taxonomy)`**: Computes and stores the full taxonomic lineage (from root to species) as a string slice under attribute `"taxonomic_path"`.

- **`Path()`**: Retrieves the stored taxonomic path; recomputes it if missing and a default taxonomy exists.

- **`SetScientificName(taxonomy)`**: Stores the sequence’s species-level scientific name under `"scientific_name"`.

- **`SetTaxonomicRank(taxonomy)`**: Stores the taxon’s rank (e.g., `"species"`, `"genus"`) under `"taxonomic_rank"`.
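Computing a root-to-taxon lineage and caching it under `"taxonomic_path"` might look like the sketch below. The `node` and `record` types are hypothetical, and the path entries here are plain names; the format of the entries the package actually stores is not specified in this document.

```go
package main

import "fmt"

// node is a hypothetical taxonomy node with a link to its parent.
type node struct {
	name   string
	parent *node
}

// lineage returns the names ordered from the root down to n.
func lineage(n *node) []string {
	var path []string
	for cur := n; cur != nil; cur = cur.parent {
		path = append(path, cur.name)
	}
	// The walk collects names leaf-to-root, so reverse in place.
	for i, j := 0, len(path)-1; i < j; i, j = i+1, j-1 {
		path[i], path[j] = path[j], path[i]
	}
	return path
}

// record is a hypothetical sequence with slice-valued attributes.
type record struct {
	taxon *node
	attrs map[string][]string
}

// path returns the cached "taxonomic_path", computing and storing it
// on first use.
func (r *record) path() []string {
	if p, ok := r.attrs["taxonomic_path"]; ok {
		return p
	}
	p := lineage(r.taxon)
	r.attrs["taxonomic_path"] = p
	return p
}

func main() {
	root := &node{name: "root"}
	genus := &node{name: "Homo", parent: root}
	species := &node{name: "Homo sapiens", parent: genus}

	r := &record{taxon: species, attrs: map[string][]string{}}
	fmt.Println(r.path())
}
```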
## Error Handling & Configuration
- Uses `logrus` and the project's own logging package (`obilog`) for warnings and errors.
- Behavior on taxonomy mismatches (e.g., unknown taxid, alias) is configurable via `obidefault` settings.
- Enforces type consistency: a taxid must be a `string`, `int`, or `float64`; any other type triggers a fatal error.

Together, these methods let a bioinformatics pipeline attach taxonomic annotations to sequencing data and read them back in a consistent form.