⬆️ version bump to v4.5

- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
Eric Coissac
2026-04-07 08:36:50 +02:00
parent 670edc1958
commit 8c7017a99d
392 changed files with 18875 additions and 141 deletions
@@ -0,0 +1,35 @@
# `obiformats` Package Overview
The `obiformats` package provides utilities for formatting and writing biological sequences (e.g., DNA, RNA) in standard formats, primarily **FASTA**. It is designed for high-performance batch processing and supports parallel I/O, compression-aware streaming, and flexible configuration.
## Core Formatting Functions
- **`FormatFasta(seq, formater)`**
  Converts a single `BioSequence` into a FASTA string: a header line (`>id description`) followed by sequence lines of up to 60 characters.
- **`FormatFastaBatch(batch, formater, skipEmpty)`**
  Formats a batch of sequences into FASTA using pre-allocated buffers and direct byte writes, avoiding intermediate strings. Depending on `skipEmpty`, empty sequences are either skipped with a warning or cause a fatal error.
## File Writing Functions
- **`WriteFasta(iterator, file, options...)`**
  Writes a stream of sequences to any `io.WriteCloser` and returns a new iterator mirroring the input for pipeline chaining. Supports:
  - Parallel workers (`ParallelWorkers`)
  - Chunked writing via `WriteFileChunk`
  - Optional compression (e.g., gzip)
- **`WriteFastaToStdout(iterator, options...)`**
  Convenience wrapper that writes FASTA directly to `stdout`, with configurable file-closing behavior.
- **`WriteFastaToFile(iterator, filename, options...)`**
  Writes to a named file with:
  - Truncation or append mode (`AppendFile`)
  - Automatic paired-end output if `HaveToSavePaired()` is enabled
    (writes reverse reads to a secondary file specified via `PairedFileName`)
## Key Design Highlights
- **Memory-efficient**: Pre-sizes output buffers with `bytes.Buffer.Grow()` and avoids unnecessary allocations.
- **Robust error handling**: Panics on nil sequences; logs warnings/errors via `logrus`.
- **Pipeline-friendly**: Integrates with the `obiiter` iterator abstraction for streaming workflows.