mirror of
https://github.com/metabarcoding/obitools4.git
CSVTaxaIterator Function — Semantic Description
The function CSVTaxaIterator, part of the obiformats package, converts a taxon iterator (*obitax.ITaxon) into an iterator of CSV records (*obiitercsv.ICSVRecord). It streams taxonomic data to CSV in batches, with the output columns selected through options.
Core Functionality:
- Input: A pointer-based taxonomic iterator (*obitax.ITaxon) and optional configuration via WithOption.
- Output: An asynchronous CSV record iterator (*obiitercsv.ICSVRecord) that yields batches of records.
Configurable Output Fields (via options):
- query: Taxon-associated query identifier, if enabled (WithPattern).
- taxid: Either raw node ID (e.g., string pointer) or formatted taxon path (WithRawTaxid toggle).
- parent: Parent taxonomic ID or string representation, if enabled (WithParent).
- taxonomic_rank: Taxon rank (e.g., "species", "genus").
- scientific_name: Full scientific name of the taxon.
- Custom metadata fields: Specified via WithMetadata, extracted from taxon metadata store.
- path: Full lineage path (e.g., "k__Bacteria; p__; c__..."), if enabled (WithPath).
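The way optional fields shape the CSV header can be illustrated with a small functional-options sketch. This is not the obiformats implementation: the types and the lowercase helpers (options, withQuery, withParent, withPath, withMetadata, buildHeader) are hypothetical stand-ins, the column order simply follows the list above, and genome_size is an invented metadata key.

```go
package main

import (
	"fmt"
	"strings"
)

// options is a hypothetical stand-in for the settings that the real
// option functions (WithPattern, WithParent, ...) would toggle.
type options struct {
	withQuery  bool
	withParent bool
	withPath   bool
	metadata   []string
}

// Option mutates an options value (hypothetical, for illustration only).
type Option func(*options)

func withQuery() Option  { return func(o *options) { o.withQuery = true } }
func withParent() Option { return func(o *options) { o.withParent = true } }
func withPath() Option   { return func(o *options) { o.withPath = true } }

// withMetadata appends custom metadata column names.
func withMetadata(keys ...string) Option {
	return func(o *options) { o.metadata = append(o.metadata, keys...) }
}

// buildHeader applies the options, then assembles the column names.
// The column order is an assumption taken from the field list above.
func buildHeader(opts ...Option) []string {
	o := &options{}
	for _, apply := range opts {
		apply(o)
	}
	header := make([]string, 0, 8)
	if o.withQuery {
		header = append(header, "query")
	}
	header = append(header, "taxid")
	if o.withParent {
		header = append(header, "parent")
	}
	header = append(header, "taxonomic_rank", "scientific_name")
	header = append(header, o.metadata...)
	if o.withPath {
		header = append(header, "path")
	}
	return header
}

func main() {
	header := buildHeader(withParent(), withMetadata("genome_size"))
	// prints: taxid,parent,taxonomic_rank,scientific_name,genome_size
	fmt.Println(strings.Join(header, ","))
}
```

Fixing the header before any record is produced keeps every row aligned with the same columns, whichever options are active.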
Implementation Highlights:
- Uses goroutines for non-blocking push of batches and clean shutdown (WaitAndClose, Done).
- Supports batching (configurable via BatchSize) to optimize I/O.
- Dynamically builds CSV headers based on selected options before processing begins.
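The batching and shutdown pattern described above can be approximated with plain channels and a sync.WaitGroup. This is a minimal sketch under stated assumptions, not the obiitercsv API: taxon and streamRecords are hypothetical names, and the wait-then-close goroutine only mimics the role that WaitAndClose and Done play in the real iterator.

```go
package main

import (
	"fmt"
	"sync"
)

// taxon is a hypothetical, simplified stand-in for a taxonomy node.
type taxon struct {
	id, rank, name string
}

// streamRecords converts taxa into CSV rows and emits them in batches.
// A producer goroutine pushes batches onto the channel; a second
// goroutine closes the channel once the producer has finished.
func streamRecords(taxa []taxon, batchSize int) <-chan [][]string {
	if batchSize < 1 {
		batchSize = 1 // guard against a zero or negative batch size
	}
	out := make(chan [][]string)
	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		defer wg.Done()
		batch := make([][]string, 0, batchSize)
		for _, t := range taxa {
			batch = append(batch, []string{t.id, t.rank, t.name})
			if len(batch) == batchSize {
				out <- batch
				batch = make([][]string, 0, batchSize)
			}
		}
		if len(batch) > 0 {
			out <- batch // flush the final, partial batch
		}
	}()

	go func() {
		wg.Wait()
		close(out) // lets consumers leave their range loop
	}()
	return out
}

func main() {
	taxa := []taxon{
		{"t1", "species", "Example species"},
		{"t2", "genus", "Example genus"},
		{"t3", "family", "Example family"},
	}
	for batch := range streamRecords(taxa, 2) {
		fmt.Println(len(batch), "records")
	}
}
```

Because the function returns as soon as the goroutines are started, the caller can begin consuming batches while later records are still being built.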
Use Case:
Efficient, memory-light conversion of large taxonomic datasets (e.g., from classification pipelines) into structured CSV for downstream analysis or reporting.
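As an illustration of the downstream side, the sketch below drains a channel of record batches into CSV text with the standard encoding/csv package. The helper renderCSV and the sample rows are hypothetical; in real use the batches would come from the iterator returned by CSVTaxaIterator, whose consumption API is not shown here.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// renderCSV writes the header once, then every batch it receives,
// and returns the resulting CSV text. Only one batch is held in
// memory at a time, apart from the accumulated output string.
func renderCSV(header []string, batches <-chan [][]string) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	if err := w.Write(header); err != nil {
		return "", err
	}
	for batch := range batches {
		// WriteAll writes all rows of the batch and flushes the writer.
		if err := w.WriteAll(batch); err != nil {
			return "", err
		}
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	batches := make(chan [][]string, 2)
	batches <- [][]string{{"t1", "species", "Example species"}}
	batches <- [][]string{{"t2", "genus", "Example genus"}}
	close(batches)

	out, err := renderCSV([]string{"taxid", "taxonomic_rank", "scientific_name"}, batches)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

For large datasets the strings.Builder would typically be replaced by a file or other io.Writer, so that the output is not accumulated in memory.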