mirror of
https://github.com/metabarcoding/obitools4.git
synced 2026-04-30 03:50:39 +00:00
⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30 (automated by Makefile)
@@ -0,0 +1,43 @@
# Semantic Description of `ReadSequencesBatchFromFiles`

This function implements **concurrent, batched streaming** of biological sequences from multiple input files.

## Core Functionality

- **Input**: A slice of file paths (`[]string`), an optional batch reader interface, and a concurrency level.
- **Default behavior**: Uses `ReadSequencesFromFile` if no custom reader is provided.
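
The fallback to a default reader can be sketched as follows. This is a minimal illustration, not the obitools4 code: `batchReader`, `defaultReader`, and `readAll` are hypothetical stand-ins, the reader returns plain strings instead of sequence batches, and files are read sequentially to keep the focus on the nil check.

```go
package main

import "fmt"

// batchReader is a hypothetical stand-in for the reader interface;
// it is not the real IBatchReader signature.
type batchReader func(filename string) ([]string, error)

// defaultReader plays the role of the default reader in this sketch.
func defaultReader(filename string) ([]string, error) {
	return []string{"default:" + filename}, nil
}

// readAll uses defaultReader when the caller passes a nil reader,
// then reads every file in order and concatenates the records.
func readAll(filenames []string, reader batchReader) ([]string, error) {
	if reader == nil {
		reader = defaultReader
	}
	var all []string
	for _, name := range filenames {
		records, err := reader(name)
		if err != nil {
			return nil, fmt.Errorf("reading %s: %w", name, err)
		}
		all = append(all, records...)
	}
	return all, nil
}

func main() {
	records, err := readAll([]string{"a.fasta", "b.fasta"}, nil)
	fmt.Println(records, err)
}
```

Passing a non-nil function as the second argument swaps in a custom format without touching the loop.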

## Concurrency Model

- Launches `concurrent_readers` goroutines to process files in parallel.
- Files are distributed via a shared channel (`filenameChan`): each goroutine takes the next file as soon as it is free, so load is balanced dynamically across readers.
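
A shared-channel worker pool of this kind might look like the sketch below. Only the channel name `filenameChan` comes from the description; `countHandled` and its counting logic are assumptions made for illustration, and the "work" per file is reduced to incrementing a counter.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countHandled starts `workers` goroutines that all receive from one
// channel of file names and returns how many names were handled.
func countHandled(filenames []string, workers int) int {
	if workers < 1 {
		workers = 1 // at least one receiver, otherwise the sends below would block forever
	}
	filenameChan := make(chan string)
	var handled int64
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker takes the next available name; a worker that
			// finishes a small file early simply comes back for more.
			for range filenameChan {
				atomic.AddInt64(&handled, 1)
			}
		}()
	}

	for _, name := range filenames {
		filenameChan <- name
	}
	close(filenameChan) // lets the workers' range loops end
	wg.Wait()
	return int(handled)
}

func main() {
	fmt.Println(countHandled([]string{"a.fastq", "b.fastq", "c.fastq"}, 2))
}
```

Because no file is assigned to a worker in advance, one very large file occupies a single goroutine while the others drain the rest of the queue.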

## Streaming Interface

- Returns an `obiiter.IBioSequence`, a streaming iterator over batches of biological sequences.
- Internally uses an atomic counter (`nextCounter`) to assign unique, ordered IDs to sequence batches (via `Reorder`), preserving global order despite parallelism.
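
The idea of tagging batches from a shared atomic counter can be sketched as below. This is an assumption-laden illustration: the `batch` type and `collect` function are hypothetical, `sync/atomic` stands in for the counter described above, and a final `sort.Slice` stands in for whatever `Reorder` actually does.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"sync/atomic"
)

// batch is a hypothetical stand-in for a sequence batch.
type batch struct {
	order  int64
	source string
}

// collect lets several producer goroutines tag their batches from one
// shared counter, then returns the batches sorted by that tag.
func collect(producers, perProducer int) []batch {
	var next int64 // shared counter, playing the role of nextCounter
	out := make(chan batch, producers*perProducer)
	var wg sync.WaitGroup

	for p := 0; p < producers; p++ {
		wg.Add(1)
		go func(p int) {
			defer wg.Done()
			for i := 0; i < perProducer; i++ {
				// AddInt64 returns the new value, so subtract 1 to start at 0.
				id := atomic.AddInt64(&next, 1) - 1
				out <- batch{order: id, source: fmt.Sprintf("reader-%d", p)}
			}
		}(p)
	}
	wg.Wait()
	close(out)

	batches := make([]batch, 0, producers*perProducer)
	for b := range out {
		batches = append(batches, b)
	}
	// Arrival order is nondeterministic; the tags restore a single sequence.
	sort.Slice(batches, func(i, j int) bool { return batches[i].order < batches[j].order })
	return batches
}

func main() {
	batches := collect(3, 4)
	fmt.Println(len(batches), batches[0].order, batches[len(batches)-1].order)
}
```

The tags are unique and gap-free no matter how the goroutines interleave, which is what lets a downstream consumer rebuild one ordered stream.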

## Error Handling & Logging

- Panics on file-open failure (via `log.Panicf`).
- Logs the start and end of reading for each file (`log.Printf`, `log.Println`).
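
The panic-on-open behavior can be illustrated with the standard library `log` package, whose `Panicf` writes the message and then panics. Which logging package the project actually imports as `log` is not stated here, and `mustOpen` and `opens` are hypothetical helpers for this sketch.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// mustOpen returns the opened file or panics through log.Panicf.
func mustOpen(path string) *os.File {
	f, err := os.Open(path)
	if err != nil {
		log.Panicf("cannot open %s: %v", path, err)
	}
	return f
}

// opens reports whether mustOpen succeeds, recovering from its panic
// so that a caller can observe the failure as a boolean.
func opens(path string) (ok bool) {
	defer func() {
		if r := recover(); r != nil {
			ok = false
		}
	}()
	f := mustOpen(path)
	f.Close()
	return true
}

func main() {
	fmt.Println(opens("no-such-file.fasta"))
}
```

A panic raised inside a reader goroutine terminates the whole program unless that goroutine recovers, so this strategy treats an unreadable input file as fatal.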

## Resource Management

- Uses a barrier pattern: each reader goroutine calls `batchiter.Done()` upon completion.
- A finalizer goroutine waits for all readers (`WaitAndClose`) and logs termination.
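
The barrier-plus-finalizer pattern can be sketched with a `sync.WaitGroup` and a plain channel. This is an analogy under stated assumptions: `stream` is a hypothetical function, the `WaitGroup` stands in for the iterator's `Done()` bookkeeping, and `close(out)` stands in for `WaitAndClose`.

```go
package main

import (
	"fmt"
	"sync"
)

// stream starts one goroutine per item and returns a channel that a
// separate finalizer goroutine closes once every producer has finished.
func stream(items []string) <-chan string {
	out := make(chan string)
	var wg sync.WaitGroup
	wg.Add(len(items))

	for _, it := range items {
		go func(it string) {
			defer wg.Done() // signal completion, as each reader does
			out <- "read:" + it
		}(it)
	}

	// The finalizer waits off the caller's goroutine, so stream returns
	// immediately and the consumer can start draining right away.
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

func main() {
	received := 0
	for range stream([]string{"a", "b", "c"}) {
		received++
	}
	fmt.Println("received", received)
}
```

Closing the channel from a single place avoids the panic that would result if several producers each tried to close it.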

## Design Intent

- Enables scalable, memory-efficient ingestion of large NGS datasets.
- Decouples *reading logic* (via `IBatchReader`) from orchestration, which supports pluggable formats.
- Prioritizes throughput and deterministic ordering over strict FIFO per-file semantics.

## Key Abstractions

| Type/Interface | Role |
|----------------|------|
| `IBatchReader` | Reader factory: `(filename, options...) → SequenceIterator` |
| `obiiter.IBioSequence` | Thread-safe batch iterator (push model) |
| `AtomicCounter` | Ensures globally unique, sequential batch IDs across goroutines |