mirror of
https://github.com/metabarcoding/obitools4.git
Taxonomic Classification via TaxonomyClassifier
The obiseq package provides a taxonomic classification mechanism through the TaxonomyClassifier function.
- Purpose: Constructs a reusable classifier for biological sequences based on taxonomic hierarchy.
- Inputs:
  - taxonomicRank: Target rank (e.g., "species", "genus").
  - taxonomy: Reference taxonomy (*obitax.Taxonomy), with fallback via .OrDefault(true).
  - abortOnMissing: Boolean flag to enforce strict taxon resolution.
- Core Logic:
  - For each sequence, retrieves its Taxon, then drills down to the requested rank using .TaxonAtRank().
  - If abortOnMissing is true, exits on failure to resolve the taxon or rank.
  - Internally maps *TaxNode values to integer codes for efficient storage/comparison.
- Returned Object (BioSequenceClassifier):
  - Code(sequence) int: Assigns a unique integer code to the taxonomic assignment of a sequence.
  - Value(code) string: Returns the scientific name corresponding to a code.
  - Reset(): Reinitializes internal mappings (useful for batch processing).
  - Clone() *BioSequenceClassifier: Creates a fresh, identical classifier instance.
- Design Rationale:
  - Uses integer codes to avoid repeated string operations and enable fast indexing (e.g., for counting).
  - Supports both strict (abortOnMissing=true) and lenient classification modes.
This design enables scalable, efficient taxonomic profiling of sequencing datasets.
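
The code-assignment idea described above can be illustrated with a small, self-contained Go sketch. It does not use the obiseq or obitax packages: the taxon, rankClassifier, and errUnresolved names are hypothetical stand-ins, and the atRank method only plays the role the text gives to .TaxonAtRank(). Three further assumptions are made for illustration: strict mode returns an error rather than exiting, lenient mode groups unresolved sequences under a single "NA" code, and clone returns an empty classifier with the same configuration.

```go
package main

import (
	"errors"
	"fmt"
)

// taxon is a hypothetical stand-in for a node of the reference taxonomy.
type taxon struct {
	name   string
	rank   string
	parent *taxon
}

// atRank walks up the lineage until it reaches the requested rank.
// It returns nil when no ancestor (or the receiver itself) has that rank.
func (t *taxon) atRank(rank string) *taxon {
	for n := t; n != nil; n = n.parent {
		if n.rank == rank {
			return n
		}
	}
	return nil
}

var errUnresolved = errors.New("taxon or rank could not be resolved")

// rankClassifier maps taxa at one rank to dense integer codes.
type rankClassifier struct {
	rank           string
	abortOnMissing bool
	codes          map[*taxon]int // node pointer -> code
	names          []string       // code -> scientific name
}

func newRankClassifier(rank string, abortOnMissing bool) *rankClassifier {
	return &rankClassifier{rank: rank, abortOnMissing: abortOnMissing, codes: map[*taxon]int{}}
}

// code returns the integer code of t's ancestor at the classifier's rank.
// Strict mode reports an error for unresolved taxa (assumption: the real
// function exits instead); lenient mode files them under a shared "NA" code.
func (c *rankClassifier) code(t *taxon) (int, error) {
	at := t.atRank(c.rank) // safe on a nil receiver
	if at == nil && c.abortOnMissing {
		return -1, errUnresolved
	}
	if k, ok := c.codes[at]; ok {
		return k, nil
	}
	k := len(c.names)
	c.codes[at] = k
	name := "NA"
	if at != nil {
		name = at.name
	}
	c.names = append(c.names, name)
	return k, nil
}

// value returns the name recorded for a code handed out by code.
func (c *rankClassifier) value(k int) string { return c.names[k] }

// reset drops all assigned codes, e.g. between batches.
func (c *rankClassifier) reset() {
	c.codes = map[*taxon]int{}
	c.names = nil
}

// clone returns an empty classifier with the same rank and strictness.
func (c *rankClassifier) clone() *rankClassifier {
	return newRankClassifier(c.rank, c.abortOnMissing)
}

func main() {
	genus := &taxon{name: "Homo", rank: "genus"}
	sapiens := &taxon{name: "Homo sapiens", rank: "species", parent: genus}
	neander := &taxon{name: "Homo neanderthalensis", rank: "species", parent: genus}

	clf := newRankClassifier("genus", false)
	var counts []int // indexed directly by code, no string hashing per sequence
	for _, t := range []*taxon{sapiens, neander, nil} {
		k, _ := clf.code(t) // lenient mode never returns an error
		for len(counts) <= k {
			counts = append(counts, 0)
		}
		counts[k]++
	}
	for k, n := range counts {
		fmt.Println(clf.value(k), n)
	}
}
```

Because codes are handed out densely from zero, per-taxon counts fit in a plain slice, and names are only looked up once at reporting time. This is the kind of fast indexing the design rationale refers to.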