mirror of
https://github.com/metabarcoding/obitools4.git
synced 2026-04-30 03:50:39 +00:00
⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30 (automated by Makefile)
@@ -0,0 +1,26 @@
# Taxonomic Classification via `TaxonomyClassifier`
The `obiseq` package provides a taxonomic classification mechanism through the `TaxonomyClassifier` function.

- **Purpose**: Constructs a reusable classifier for biological sequences based on taxonomic hierarchy.
- **Inputs**:
  - `taxonomicRank`: Target rank (e.g., `"species"`, `"genus"`).
  - `taxonomy`: Reference taxonomy (`*obitax.Taxonomy`), with fallback via `.OrDefault(true)`.
  - `abortOnMissing`: Boolean flag to enforce strict taxon resolution.
- **Core Logic**:
  - For each sequence, retrieves its `Taxon`, then resolves it to the requested rank using `.TaxonAtRank()`.
  - If `abortOnMissing` is true, aborts when the taxon or the requested rank cannot be resolved.
  - Internally maps `*TaxNode`s to integer codes for efficient storage and comparison.
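The core logic above can be illustrated with a minimal, self-contained sketch. It does not use the `obitax` or `obiseq` packages: the `taxon` and `interner` types, the `atRank` and `code` methods, the assumption that rank resolution walks from a taxon toward the root, and the `-1` sentinel for unresolved ranks are all hypothetical names and choices made for this example only.

```go
package main

import (
	"fmt"
	"log"
)

// taxon is a hypothetical stand-in for a taxonomy node (not the obitax type).
// parent is nil at the root of the lineage.
type taxon struct {
	name   string
	rank   string
	parent *taxon
}

// atRank walks from the taxon toward the root and returns the first node
// with the requested rank, or nil if the lineage has no such node.
func (t *taxon) atRank(rank string) *taxon {
	for n := t; n != nil; n = n.parent {
		if n.rank == rank {
			return n
		}
	}
	return nil
}

// interner maps taxon nodes to dense integer codes, assigned in order of
// first appearance.
type interner struct {
	codes map[*taxon]int
	nodes []*taxon
}

func newInterner() *interner {
	return &interner{codes: make(map[*taxon]int)}
}

// code returns the integer code of the taxon's ancestor at the given rank.
// If the rank cannot be resolved, it aborts when strict is true and
// otherwise returns -1 (a sentinel assumed for this sketch).
func (in *interner) code(t *taxon, rank string, strict bool) int {
	var target *taxon
	if t != nil {
		target = t.atRank(rank)
	}
	if target == nil {
		if strict {
			log.Fatalf("cannot resolve rank %q", rank)
		}
		return -1
	}
	if c, ok := in.codes[target]; ok {
		return c
	}
	c := len(in.nodes)
	in.codes[target] = c
	in.nodes = append(in.nodes, target)
	return c
}

func main() {
	genus := &taxon{name: "Homo", rank: "genus"}
	sapiens := &taxon{name: "Homo sapiens", rank: "species", parent: genus}
	neander := &taxon{name: "Homo neanderthalensis", rank: "species", parent: genus}

	in := newInterner()
	fmt.Println(in.code(sapiens, "genus", false))   // 0
	fmt.Println(in.code(neander, "genus", false))   // 0: same genus node
	fmt.Println(in.code(sapiens, "species", false)) // 1
	fmt.Println(in.code(genus, "family", false))    // -1: rank not in lineage
}
```

Keying the map on the node pointer rather than on a name means two sequences share a code exactly when they resolve to the same node at the requested rank.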
- **Returned Object (`BioSequenceClassifier`)**:
  - `Code(sequence) int`: Returns the integer code of the taxon assigned to a sequence; each distinct taxon gets its own code.
  - `Value(code) string`: Returns the scientific name corresponding to a code.
  - `Reset()`: Reinitializes internal mappings (useful for batch processing).
  - `Clone() *BioSequenceClassifier`: Creates a fresh, identical classifier instance.
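One way to picture an object with these four operations is a struct of closures that share a private lookup table. The sketch below is an illustration under assumptions, not the `obiseq` implementation: the `classifier` type and `newClassifier` function are hypothetical, it classifies plain strings rather than sequences, `Value` returns `"NA"` for unknown codes by choice, and `Clone` is assumed to return a classifier with the same behavior but empty mappings.

```go
package main

import "fmt"

// classifier is a hypothetical, simplified analogue of the returned object.
// It classifies plain string labels instead of biological sequences.
type classifier struct {
	Code  func(label string) int
	Value func(code int) string
	Reset func()
	Clone func() *classifier
}

// newClassifier builds a classifier whose closures share one private
// label-to-code table. It is not safe for concurrent use.
func newClassifier() *classifier {
	codes := make(map[string]int)
	labels := []string{}

	return &classifier{
		// Code returns the existing code for a label or assigns the next one.
		Code: func(label string) int {
			if c, ok := codes[label]; ok {
				return c
			}
			c := len(labels)
			codes[label] = c
			labels = append(labels, label)
			return c
		},
		// Value maps a code back to its label; "NA" marks an unknown code.
		Value: func(code int) string {
			if code < 0 || code >= len(labels) {
				return "NA"
			}
			return labels[code]
		},
		// Reset empties the shared table so codes restart at 0.
		Reset: func() {
			for k := range codes {
				delete(codes, k)
			}
			labels = labels[:0]
		},
		// Clone returns an independent classifier with empty mappings.
		Clone: newClassifier,
	}
}

func main() {
	a := newClassifier()
	fmt.Println(a.Code("Homo sapiens"), a.Code("Pan troglodytes"), a.Code("Homo sapiens")) // 0 1 0
	fmt.Println(a.Value(1)) // Pan troglodytes

	b := a.Clone()
	fmt.Println(b.Code("Pan troglodytes")) // 0: the clone has its own table

	a.Reset()
	fmt.Println(a.Code("Pan troglodytes")) // 0: codes restart after Reset
}
```

Because each clone owns its table, codes from different instances are not comparable; translate them back with the `Value` of the instance that produced them.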
- **Design Rationale**:
  - Uses integer codes to avoid repeated string operations and enable fast indexing (e.g., for counting).
  - Supports both strict (`abortOnMissing=true`) and lenient classification modes.

This design enables scalable, efficient taxonomic profiling of sequencing datasets.
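To make the counting use case concrete, here is a small sketch of why dense integer codes help: a slice indexed by code can replace a map keyed by name. The `countByCode` function is a hypothetical helper written for this example, and treating negative codes as "unresolved" is an assumption of the sketch, not documented behavior.

```go
package main

import "fmt"

// countByCode tallies how often each code occurs. Codes are assumed to be
// dense, non-negative integers, so the code itself serves as a slice index.
// Negative codes are skipped (assumed here to mean "unresolved").
func countByCode(codes []int) []int {
	counts := []int{}
	for _, c := range codes {
		if c < 0 {
			continue
		}
		// Grow the slice until index c exists.
		for c >= len(counts) {
			counts = append(counts, 0)
		}
		counts[c]++
	}
	return counts
}

func main() {
	// Codes as a classifier might emit them for five sequences.
	fmt.Println(countByCode([]int{0, 1, 0, 2, 0})) // [3 1 1]
}
```

Each increment is a single slice access with no hashing or string comparison, which is the kind of saving the design rationale refers to.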