Files
obitools4/autodoc/docmd/pkg/obichunk/options.md
T
Eric Coissac 8c7017a99d ⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
2026-04-13 13:34:53 +02:00

2.0 KiB

Semantic Description of obichunk Package

The obichunk package provides a flexible and configurable options management system for data processing pipelines, particularly in the context of biological sequence analysis (e.g., metabarcoding). It defines a typed Options struct and associated builder-style configuration functions.

Core Concepts

  • Immutable Configuration Builder: Options are constructed via MakeOptions([]WithOption), applying a list of functional setters (WithOption) to an internal __options__ struct.
  • Encapsulation: The concrete options are hidden behind a pointer (pointer *__options__) to ensure safe sharing and mutation control.

Supported Functionalities

  • Categorization: OptionSubCategory(keys...) appends category labels (e.g., sample or marker names) to an internal list; PopCategories() retrieves and removes the first category.
  • Missing Value Handling: OptionNAValue(na string) customizes placeholder for missing data (default: "NA").
  • Statistical Tracking: OptionStatOn(keys...) registers statistical descriptions (via obiseq.StatsOnDescription) for per-field metrics collection.
  • Batch Processing Control:
    • OptionBatchCount(number) sets the number of batches.
    • OptionsBatchSize(size) defines how many items per batch (default from obidefault).
  • Parallelization: OptionsParallelWorkers(nworkers) configures concurrency level (default from environment).
  • Disk vs Memory Sorting: OptionSortOnDisk() enables disk-backed sorting; OptionSortOnMemory() disables it (default).
  • Singleton Filtering: OptionsNoSingleton() excludes singleton sequences; OptionsWithSingleton() allows them (default).

Design Highlights

  • Functional options pattern for extensibility and readability.
  • Default values derived from obidefault where applicable (e.g., batch size, workers).
  • Designed for integration with obiseq and obidefault, supporting scalable, reproducible NGS data workflows.