mirror of
https://github.com/metabarcoding/obitools4.git
synced 2026-04-30 03:50:39 +00:00
⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5" - Update version.txt from 4.29 → .30 (automated by Makefile)
This commit is contained in:
@@ -0,0 +1,34 @@
|
||||
# Taxonomy Loading Module (`obiformats`)
|
||||
|
||||
This Go package provides semantic functionality to automatically detect and load taxonomic data from various file formats. It supports flexible, format-agnostic taxonomy ingestion via a unified interface.
|
||||
|
||||
## Core Features
|
||||
|
||||
1. **Format Detection**
|
||||
- `DetectTaxonomyFormat(path)` identifies the taxonomy source format by inspecting file type (directory, MIME-type), filename patterns, or structure.
|
||||
- Supports:
|
||||
• NCBI Taxdump (both directory and `.tar` archive)
|
||||
• CSV files (`text/csv`)
|
||||
• FASTA/FASTQ sequences (via `mimetype` detection)
|
||||
|
||||
2. **Modular Loaders**
|
||||
- Returns a typed `TaxonomyLoader` function, enabling deferred loading with configurable options (`onlysn`, `seqAsTaxa`).
|
||||
- Each loader abstracts format-specific parsing (e.g., NCBI `nodes.dmp`, FASTA header taxonomy extraction).
|
||||
|
||||
3. **Sequence-Based Taxonomy Extraction**
|
||||
- For sequence files (FASTA/FASTQ), taxonomy is inferred from headers or associated metadata, using `ExtractTaxonomy()`.
|
||||
|
||||
4. **Integration with OBITools Ecosystem**
|
||||
- Leverages `obitax.Taxonomy` as the canonical output structure.
|
||||
- Uses custom MIME-type registration (`obiutils.RegisterOBIMimeType()`) for robust detection of bioinformatics formats.
|
||||
|
||||
5. **Error Handling & Logging**
|
||||
- Graceful failure with descriptive errors; informative logging via `logrus`.
|
||||
|
||||
## Usage Flow
|
||||
|
||||
```go
|
||||
tax, err := LoadTaxonomy("path/to/data", onlysn=true, seqAsTaxa=false)
|
||||
```
|
||||
|
||||
The module enables interoperability across taxonomic data sources in metabarcoding workflows.
|
||||
Reference in New Issue
Block a user