obitools4/autodoc/docmd/pkg/obiformats/fastseq_header.md
Eric Coissac 8c7017a99d ⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
2026-04-13 13:34:53 +02:00


# Semantic Description of `obiformats` Package
The `obiformats` package provides utilities for parsing sequence headers in the OBItools4 framework, supporting two distinct formats:
- **JSON-based format** (e.g., `{"id":"seq1", ...}`): Detected by a leading `{` character in the sequence definition.
- **Legacy OBI format** (plain text, e.g., `>seq1 description`): Used when no JSON prefix is present.
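
The detection rule above can be sketched in a few lines of Go. This is a minimal illustration, not the package's code: `HeaderFormat`, `FormatOBI`, `FormatJSON`, and `guessHeaderFormat` are hypothetical names, and the sketch assumes the definition is tested as-is, without trimming leading whitespace.

```go
package main

import (
	"fmt"
	"strings"
)

// HeaderFormat is a hypothetical label for the two header styles.
type HeaderFormat int

const (
	FormatOBI  HeaderFormat = iota // legacy plain-text header
	FormatJSON                     // JSON object header
)

// guessHeaderFormat returns FormatJSON when the definition starts with '{'
// and FormatOBI otherwise, including for an empty definition.
func guessHeaderFormat(definition string) HeaderFormat {
	if strings.HasPrefix(definition, "{") {
		return FormatJSON
	}
	return FormatOBI
}

func main() {
	fmt.Println(guessHeaderFormat(`{"count":3}`) == FormatJSON)        // prints true
	fmt.Println(guessHeaderFormat("a plain description") == FormatOBI) // prints true
}
```

Treating the empty definition as the legacy case is a choice made for this sketch; it keeps the check to a single prefix test.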
## Core Functions
- **`ParseGuessedFastSeqHeader(sequence *obiseq.BioSequence)`**
  Dynamically routes header parsing based on the first character of the sequence definition:
  - Calls `ParseFastSeqJsonHeader` if the definition starts with `{`.
  - Otherwise invokes `ParseFastSeqOBIHeader`.
- **`IParseFastSeqHeaderBatch(iterator, options...) obiiter.IBioSequence`**
  Applies header parsing to a *batch* of sequences:
  - Takes an iterator over `BioSequence`s.
  - Accepts optional configuration (e.g., parallelism, parsing behavior).
  - Wraps the parser in a worker pipeline via `MakeIWorker` and returns the resulting iterator, so sequences keep streaming downstream.
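
The worker-pipeline idea can be illustrated with plain goroutines and channels. The sketch below is an assumption-laden stand-in, not the `obiiter` implementation: `Seq`, `parseHeader`, and `parseBatches` are hypothetical, batches are modelled as slices sent over a channel, and output order across batches is not preserved.

```go
package main

import (
	"fmt"
	"sync"
)

// Seq is a hypothetical stand-in for a sequence record.
type Seq struct {
	ID         string
	Definition string
	Format     string // filled in by the parser
}

// parseHeader is a placeholder parser: it only records which branch
// would be taken for the definition.
func parseHeader(s *Seq) {
	if len(s.Definition) > 0 && s.Definition[0] == '{' {
		s.Format = "json"
	} else {
		s.Format = "obi"
	}
}

// parseBatches starts nworkers goroutines that apply parse to every
// sequence of each incoming batch and forward the batch downstream.
// Batch order on the output channel is not guaranteed.
func parseBatches(in <-chan []*Seq, parse func(*Seq), nworkers int) <-chan []*Seq {
	out := make(chan []*Seq)
	var wg sync.WaitGroup
	for i := 0; i < nworkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for batch := range in {
				for _, s := range batch {
					parse(s)
				}
				out <- batch
			}
		}()
	}
	// Close the output once all workers have drained the input.
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	in := make(chan []*Seq)
	go func() {
		defer close(in)
		in <- []*Seq{{ID: "seq1", Definition: `{"count":3}`}}
		in <- []*Seq{{ID: "seq2", Definition: "a plain description"}}
	}()
	for batch := range parseBatches(in, parseHeader, 2) {
		for _, s := range batch {
			fmt.Println(s.ID, s.Format)
		}
	}
}
```

Because each batch is parsed in place and forwarded as soon as it is done, memory use stays bounded by the number of batches in flight rather than by the size of the input.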
## Design Principles
- **Format agnosticism**: Automatically detects header type.
- **Iterator-based streaming**: Enables memory-efficient batch processing of large datasets (e.g., FASTQ/FASTA files).
- **Extensibility**: Options pattern (`WithOption`) supports runtime customization.
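
For readers unfamiliar with the options pattern, here is a generic Go sketch of how such runtime customization is commonly built. Only the type name `WithOption` comes from the text above; its definition here, the `options` struct, its fields and defaults, and the helpers `WithWorkers`, `WithBatchSize`, and `makeOptions` are all assumptions made for illustration.

```go
package main

import "fmt"

// options holds hypothetical settings; field names and defaults are illustrative.
type options struct {
	workers   int
	batchSize int
}

// WithOption is modelled here as a function that mutates an options value.
type WithOption func(*options)

// WithWorkers is a hypothetical setter for the number of parallel workers.
func WithWorkers(n int) WithOption {
	return func(o *options) { o.workers = n }
}

// WithBatchSize is a hypothetical setter for the batch size.
func WithBatchSize(n int) WithOption {
	return func(o *options) { o.batchSize = n }
}

// makeOptions starts from defaults and applies each setter in order,
// so later setters override earlier ones.
func makeOptions(setters []WithOption) options {
	o := options{workers: 1, batchSize: 100}
	for _, set := range setters {
		set(&o)
	}
	return o
}

// process shows how a variadic API can accept any number of options.
func process(setters ...WithOption) options {
	return makeOptions(setters)
}

func main() {
	o := process(WithWorkers(4))
	fmt.Println(o.workers, o.batchSize)
}
```

Callers only name the settings they want to change, and new settings can be added later without altering existing call sites.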

This package serves as a header-decoding layer for downstream analysis in metagenomic or metabarcoding workflows.