mirror of
https://github.com/metabarcoding/obitools4.git
Semantic Description of obichunk Package
The obichunk package provides a flexible and configurable options management system for data processing pipelines, particularly in the context of biological sequence analysis (e.g., metabarcoding). It defines a typed Options struct and associated builder-style configuration functions.
Core Concepts
- Immutable Configuration Builder: Options are constructed via MakeOptions([]WithOption), applying a list of functional setters (WithOption) to an internal __options__ struct.
- Encapsulation: The concrete options are hidden behind a pointer (pointer *__options__) to ensure safe sharing and mutation control.
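
The construction scheme described above can be sketched as follows. This is a minimal, standalone illustration and not the package's actual source: the field names (naValue, batchCount), the accessor methods, and the batch-count default of 100 are assumptions chosen for the example. Only the names MakeOptions, WithOption, Options, __options__, OptionNAValue, and OptionBatchCount, plus the "NA" default, come from the description.

```go
package main

import "fmt"

// __options__ holds the concrete settings.
// The field names are assumptions made for this sketch.
type __options__ struct {
	naValue    string
	batchCount int
}

// Options is the public handle. Copies of an Options value share the
// same underlying settings through the pointer.
type Options struct {
	pointer *__options__
}

// WithOption is a functional setter applied to an Options value.
type WithOption func(Options)

// MakeOptions starts from default values and applies each setter in order.
func MakeOptions(setters []WithOption) Options {
	o := __options__{
		naValue:    "NA", // default stated in the description
		batchCount: 100,  // placeholder default, assumed for this sketch
	}
	opt := Options{&o}
	for _, set := range setters {
		set(opt)
	}
	return opt
}

// OptionNAValue returns a setter that replaces the missing-value placeholder.
func OptionNAValue(na string) WithOption {
	return func(opt Options) { opt.pointer.naValue = na }
}

// OptionBatchCount returns a setter that replaces the number of batches.
func OptionBatchCount(number int) WithOption {
	return func(opt Options) { opt.pointer.batchCount = number }
}

// NAValue and BatchCount are hypothetical read accessors.
func (opt Options) NAValue() string { return opt.pointer.naValue }
func (opt Options) BatchCount() int { return opt.pointer.batchCount }

func main() {
	opts := MakeOptions([]WithOption{
		OptionNAValue("missing"),
		OptionBatchCount(10),
	})
	fmt.Println(opts.NAValue(), opts.BatchCount()) // prints "missing 10"
}
```

Because Options wraps a pointer, passing it by value is cheap and every copy observes the same settings. Callers outside the package cannot reach the unexported fields directly and must go through setters and accessors.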
Supported Functionalities
- Categorization: OptionSubCategory(keys...) appends category labels (e.g., sample or marker names) to an internal list; PopCategories() retrieves and removes the first category.
- Missing Value Handling: OptionNAValue(na string) customizes the placeholder for missing data (default: "NA").
- Statistical Tracking: OptionStatOn(keys...) registers statistical descriptions (via obiseq.StatsOnDescription) for per-field metrics collection.
- Batch Processing Control: OptionBatchCount(number) sets the number of batches. OptionsBatchSize(size) defines how many items each batch holds (default from obidefault).
- Parallelization: OptionsParallelWorkers(nworkers) configures the concurrency level (default from environment).
- Disk vs Memory Sorting: OptionSortOnDisk() enables disk-backed sorting; OptionSortOnMemory() disables it (default).
- Singleton Filtering: OptionsNoSingleton() excludes singleton sequences; OptionsWithSingleton() allows them (default).
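
The category queue and the paired on/off setters listed above might behave as in the sketch below. It is a self-contained approximation rather than the package's code: the field names, the accessors SortOnDisk() and NoSingleton(), and the choice to return an empty string when no category is left are all assumptions. The sketch also assumes setters are applied in order, so that the later setter of a pair overrides the earlier one.

```go
package main

import "fmt"

// Field names are assumptions made for this sketch.
type __options__ struct {
	categories  []string
	sortOnDisk  bool
	noSingleton bool
}

type Options struct{ pointer *__options__ }

type WithOption func(Options)

// MakeOptions applies setters in order. The zero values match the
// documented defaults: in-memory sorting, singletons kept.
func MakeOptions(setters []WithOption) Options {
	opt := Options{&__options__{}}
	for _, set := range setters {
		set(opt)
	}
	return opt
}

// OptionSubCategory appends labels to the internal category list.
func OptionSubCategory(keys ...string) WithOption {
	return func(opt Options) {
		opt.pointer.categories = append(opt.pointer.categories, keys...)
	}
}

func OptionSortOnDisk() WithOption {
	return func(opt Options) { opt.pointer.sortOnDisk = true }
}

func OptionSortOnMemory() WithOption {
	return func(opt Options) { opt.pointer.sortOnDisk = false }
}

func OptionsNoSingleton() WithOption {
	return func(opt Options) { opt.pointer.noSingleton = true }
}

func OptionsWithSingleton() WithOption {
	return func(opt Options) { opt.pointer.noSingleton = false }
}

// PopCategories returns the first category and removes it from the list.
// Returning "" on an empty list is an assumption of this sketch.
func (opt Options) PopCategories() string {
	if len(opt.pointer.categories) == 0 {
		return ""
	}
	first := opt.pointer.categories[0]
	opt.pointer.categories = opt.pointer.categories[1:]
	return first
}

// Hypothetical read accessors.
func (opt Options) SortOnDisk() bool  { return opt.pointer.sortOnDisk }
func (opt Options) NoSingleton() bool { return opt.pointer.noSingleton }

func main() {
	opts := MakeOptions([]WithOption{
		OptionSubCategory("sample", "marker"),
		OptionSortOnDisk(),
		OptionSortOnMemory(), // applied last, so it overrides OptionSortOnDisk
		OptionsNoSingleton(),
	})
	fmt.Println(opts.PopCategories())                  // prints "sample"
	fmt.Println(opts.PopCategories())                  // prints "marker"
	fmt.Println(opts.SortOnDisk(), opts.NoSingleton()) // prints "false true"
}
```

Note that PopCategories changes the shared state behind the pointer, so every copy of the same Options value sees the shortened list.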
Design Highlights
- Functional options pattern for extensibility and readability.
- Default values derived from obidefault where applicable (e.g., batch size, workers).
- Designed for integration with obiseq and obidefault, supporting scalable, reproducible NGS data workflows.
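
To illustrate how externally supplied defaults can feed the builder, here is a hedged sketch. The functions defaultBatchSize and defaultParallelWorkers are hypothetical stand-ins for whatever obidefault provides; the value 1000 and the use of runtime.NumCPU() are assumptions for the example, not the package's real defaults. Field and accessor names are likewise illustrative.

```go
package main

import (
	"fmt"
	"runtime"
)

// Hypothetical stand-ins for the defaults attributed to obidefault.
func defaultBatchSize() int       { return 1000 }
func defaultParallelWorkers() int { return runtime.NumCPU() }

// Field names are assumptions made for this sketch.
type __options__ struct {
	batchSize       int
	parallelWorkers int
}

type Options struct{ pointer *__options__ }

type WithOption func(Options)

// MakeOptions seeds the settings from the default providers, then lets
// explicit setters override them.
func MakeOptions(setters []WithOption) Options {
	opt := Options{&__options__{
		batchSize:       defaultBatchSize(),
		parallelWorkers: defaultParallelWorkers(),
	}}
	for _, set := range setters {
		set(opt)
	}
	return opt
}

func OptionsBatchSize(size int) WithOption {
	return func(opt Options) { opt.pointer.batchSize = size }
}

func OptionsParallelWorkers(nworkers int) WithOption {
	return func(opt Options) { opt.pointer.parallelWorkers = nworkers }
}

// Hypothetical read accessors.
func (opt Options) BatchSize() int       { return opt.pointer.batchSize }
func (opt Options) ParallelWorkers() int { return opt.pointer.parallelWorkers }

func main() {
	defaults := MakeOptions(nil)
	custom := MakeOptions([]WithOption{OptionsParallelWorkers(4)})

	// Unset options keep the provider's value; explicit setters win.
	fmt.Println(defaults.BatchSize(), defaults.ParallelWorkers())
	fmt.Println(custom.BatchSize(), custom.ParallelWorkers())
}
```

Reading defaults at construction time rather than hard-coding them keeps a pipeline's global configuration in one place, while callers can still override individual values per call.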