⬆️ version bump to v4.4.30

- Update obioptions.Version from "Release 4.4.29" to "Release 4.4.30"
- Update version.txt from 4.4.29 → 4.4.30
(automated by Makefile)
Eric Coissac
2026-04-07 08:36:50 +02:00
parent 670edc1958
commit 8c7017a99d
392 changed files with 18875 additions and 141 deletions
@@ -0,0 +1,34 @@
# ObiDefault Package: Batch Configuration Module
This Go module provides centralized configuration for sequence batching in Obitools, supporting both **count-based** and **memory-aware** batch processing.
## Core Features
- `_BatchSize` / `SetBatchSize()`
Defines and configures the *minimum* number of sequences per batch (default: `1`).
Used internally as `minSeqs` in `RebatchBySize`.
- `_BatchSizeMax` / `SetBatchSizeMax()`
Sets the *maximum* number of sequences per batch (default: `2000`). A batch is flushed on reaching this limit, regardless of memory.
- **CLI & Environment Integration**
Batch size is determined by the `--batch-size` CLI flag and/or the `OBIBATCHSIZE` environment variable. The parsing logic is not part of this file and is only implied by its comments.
- `_BatchMem` / `SetBatchMem(n int)`
Configures the *maximum memory per batch* (default: `128 MB`). A value of `0` disables memory-based batching, falling back to pure count-based logic.
- `_BatchMemStr`
Stores the *raw CLI string* passed to `--batch-mem` (e.g., `"256M"`, `"1G"`), so that the human-readable value can be parsed elsewhere.
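The raw `--batch-mem` string has to be turned into a byte count somewhere. The following is a minimal sketch of such a conversion, not the parser Obitools actually uses: `parseMemSize` is a hypothetical helper, and the binary multipliers (`K` = 1024, `M` = 1024², `G` = 1024³) are an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMemSize is a hypothetical helper (not part of obidefault) that
// converts strings such as "256M" or "1G" into a number of bytes.
// Assumption: suffixes are binary multiples (K=1024, M=1024^2, G=1024^3).
func parseMemSize(s string) (int, error) {
	s = strings.ToUpper(strings.TrimSpace(s))
	if s == "" {
		return 0, fmt.Errorf("empty memory size")
	}

	multiplier := 1
	switch s[len(s)-1] {
	case 'K':
		multiplier = 1 << 10
	case 'M':
		multiplier = 1 << 20
	case 'G':
		multiplier = 1 << 30
	}
	if multiplier != 1 {
		s = s[:len(s)-1] // drop the unit suffix before parsing the number
	}

	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("invalid memory size: %w", err)
	}
	if n < 0 {
		return 0, fmt.Errorf("negative memory size: %d", n)
	}
	return n * multiplier, nil
}

func main() {
	for _, raw := range []string{"256M", "1G", "0"} {
		bytes, err := parseMemSize(raw)
		fmt.Println(raw, bytes, err)
	}
}
```

A result of `0` would then map onto the documented "memory-based batching disabled" case when passed to `SetBatchMem`.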
## Utility Functions
- `BatchSizePtr()`, `BatchMemPtr()`
Expose pointers to the internal variables for direct modification by other packages in the same process (e.g., a CLI option parser).
- `BatchSizeMaxPtr()`, `BatchMemStrPtr()`
Provide read/write access to max-size and raw memory string values.
## Design Intent
- Separates **configuration** (defaults, CLI/env parsing) from **processing logic**, enabling modular and testable batch handling.
- Supports both scalable, large-scale processing (via count limits) and memory-constrained environments (via soft RAM caps).
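How the three limits might combine into a single "flush now?" decision can be sketched as below. This is an illustration under stated assumptions, not the logic of `RebatchBySize`: the `batchLimits` type and `shouldFlush` method are hypothetical, and the rule that the memory cap only triggers once the minimum count is reached is an assumption.

```go
package main

import "fmt"

// batchLimits is a hypothetical stand-in for the three obidefault settings.
type batchLimits struct {
	minSeqs int // minimum sequences per batch (documented default 1)
	maxSeqs int // maximum sequences per batch (documented default 2000)
	maxMem  int // maximum bytes per batch; 0 disables the memory check
}

// shouldFlush reports whether a batch holding nSeqs sequences and
// memBytes bytes should be emitted.
func (l batchLimits) shouldFlush(nSeqs, memBytes int) bool {
	// The count cap applies regardless of memory.
	if nSeqs >= l.maxSeqs {
		return true
	}
	// Assumption: the memory cap only applies once the batch holds
	// at least minSeqs sequences.
	if l.maxMem > 0 && memBytes >= l.maxMem && nSeqs >= l.minSeqs {
		return true
	}
	return false
}

func main() {
	limits := batchLimits{minSeqs: 1, maxSeqs: 2000, maxMem: 128 << 20}
	fmt.Println(limits.shouldFlush(2000, 0))
	fmt.Println(limits.shouldFlush(10, 128<<20))
	fmt.Println(limits.shouldFlush(10, 1024))
}
```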
@@ -0,0 +1,35 @@
# Output Compression Control Module
This Go package (`obidefault`) provides a simple, global configuration mechanism for toggling output compression behavior across an application.
## Core Features
- **Global Compression Flag**: A package-level boolean variable `__compress__` (default: `false`) controls whether output should be compressed.
- **Read Access**:
- `CompressOutput()` returns the current compression setting as a boolean.
- **Write Access**:
- `SetCompressOutput(b bool)` updates the compression flag to a new value.
- **Pointer Access**:
- `CompressOutputPtr()` returns a pointer to the internal flag, enabling indirect modification (e.g., for UI bindings or reflection-based updates).
## Design Intent
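- Minimal API: the accessors only read or write the flag and do nothing else.
- Thread-safety *not* guaranteed; intended for use in single-threaded initialization or controlled environments.
- Encapsulation via the unexported variable `__compress__`: access goes through the accessor functions, although `CompressOutputPtr()` deliberately exposes the variable for direct writes.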
## Typical Usage
```go
// Enable compression globally:
obidefault.SetCompressOutput(true)
if obidefault.CompressOutput() {
// Apply compression logic (e.g., gzip, brotli)
}
```
## Notes
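- The double underscore prefix (`__compress__`) is a naming choice that marks the variable as internal. The variable is unexported because its name does not start with an uppercase letter, and the Go compiler enforces that across packages.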
- Designed for runtime configurability without recompilation.
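The pointer accessor is what makes it possible to bind the flag to an option parser. The sketch below shows the idea with the standard library `flag` package and a local stand-in for the package-level variable; the option name `--compress` and the helper `parseArgs` are hypothetical, and the parser Obitools really uses may differ.

```go
package main

import (
	"flag"
	"fmt"
)

// compress is a local stand-in for obidefault's unexported __compress__.
var compress = false

func CompressOutput() bool     { return compress }
func SetCompressOutput(b bool) { compress = b }
func CompressOutputPtr() *bool { return &compress }

// parseArgs is a hypothetical helper: it binds a boolean option directly
// to the flag's storage, so parsing updates the global setting in place.
func parseArgs(args []string) error {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.BoolVar(CompressOutputPtr(), "compress", CompressOutput(), "compress output")
	return fs.Parse(args)
}

func main() {
	if err := parseArgs([]string{"--compress"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("compress output:", CompressOutput())
}
```

Binding through the pointer avoids a separate copy step after parsing, at the cost of letting the parser write to the variable without going through `SetCompressOutput`.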
@@ -0,0 +1,38 @@
# `obidefault` Package — Semantic Overview
This minimal Go package provides a centralized, mutable global flag for controlling warning verbosity across an application.
## Core Functionality
- **`__silent_warning__`**:
A package-level boolean variable (unexported) that determines whether warnings should be suppressed.
- **`SilentWarning() bool`**:
A read-only accessor returning the current state of `__silent_warning__`. Enables safe, non-mutating checks elsewhere in the codebase.
- **`SilentWarningPtr() *bool`**:
Returns a pointer to `__silent_warning__`, allowing external code (e.g., CLI parsers, config loaders) to directly mutate the flag — e.g., `*SilentWarningPtr() = true`.
## Design Intent
- **Simplicity & Centralization**:
Avoids scattering warning-control logic; provides a single source of truth.
- **Flexibility**:
Supports both *read-only* inspection (via `SilentWarning()`) and *global mutation* (via pointer), useful for early initialization phases.
- **Explicit Semantics**:
When `SilentWarning()` returns `true`, all warning-generating code *should* suppress output (implementation responsibility lies outside this package).
## Usage Example
```go
// Suppress warnings globally:
*obidefault.SilentWarningPtr() = true
if !obidefault.SilentWarning() {
log.Println("⚠️ Warning: something happened")
}
```
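> **Note**: The double underscore prefix on `__silent_warning__` marks the variable as internal. Because it is unexported, other packages cannot access it directly and must go through `SilentWarning()` or `SilentWarningPtr()`.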
@@ -0,0 +1,33 @@
# Progress Bar Control Module (`obidefault`)
This Go package provides a simple, global mechanism to enable or disable progress bar display across an application.
## Core Functionality
- **`ProgressBar()`**: Returns `true` if progress bars are *enabled* (i.e., when `__no_progress_bar__` is `false`).
- **`NoProgressBar()`**: Returns the current state of `__no_progress_bar__`, i.e., whether progress bars are *disabled*.
- **`SetNoProgressBar(b bool)`**: Sets the global flag `__no_progress_bar__`. Passing `true` disables progress bars; passing `false` enables them.
- **`NoProgressBarPtr()`**: Returns a pointer to the internal `__no_progress_bar__` variable, allowing direct read/write access (e.g., for reflection or UI binding).
## Design Intent
- Centralizes progress bar visibility control in one place.
- Supports both boolean query/set and pointer-based manipulation for flexibility (e.g., CLI flags, config binding).
- Uses a *negative* flag name (`__no_progress_bar__`) internally to default progress bars **on** (i.e., `false` → enabled).
## Usage Example
```go
// Disable progress bars globally:
obidefault.SetNoProgressBar(true)
// Check status:
if !obidefault.ProgressBar() {
log.Println("Progress bars are disabled.")
}
```
## Notes
- Thread-safety is *not* guaranteed; concurrent access should be externally synchronized.
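- The double underscore prefix (`__no_progress_bar__`) marks the variable as internal. This prefix is not a Go convention; the variable is unexported because its name does not start with an uppercase letter, which the compiler enforces.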
@@ -0,0 +1,26 @@
# Quality Shift and Read/Write Control Module
This Go package (`obidefault`) provides configurable controls over quality score handling in sequence data processing (e.g., FASTQ files). It defines three global variables and corresponding accessor/mutator functions:
- `_Quality_Shift_Input`: Input quality score offset (default: `33`, i.e., Phred+33/Sanger format).
- `_Quality_Shift_Output`: Output quality score offset (default: `33`), allowing format conversion.
- `_Read_Qualities`: Boolean flag indicating whether quality scores should be parsed/processed (`true` by default).
## Public API
| Function | Purpose |
|---------|--------|
| `SetReadQualitiesShift(shift byte)` | Sets the quality score offset for *input* data (e.g., when reading FASTQ). |
| `ReadQualitiesShift() byte` | Returns the current input quality offset. |
| `SetWriteQualitiesShift(shift byte)` | Sets the quality score offset for *output* data (e.g., when writing FASTQ). |
| `WriteQualitiesShift() byte` | Returns the current output quality offset. |
| `SetReadQualities(read bool)` | Enables/disables reading/processing of quality scores. |
| `ReadQualities() bool` | Returns whether qualities are currently being read/used. |
## Semantic Use Cases
- **Format Interoperability**: Allows seamless conversion between Phred+33 (Sanger), Phred+64, or other quality encodings.
- **Performance Optimization**: Disabling `ReadQualities` skips parsing of quality strings, useful when only sequences are needed.
- **Centralized Configuration**: Global state enables consistent behavior across modules without passing parameters.
All functions are thread-unsafe by design—intended for initialization before concurrent processing begins.
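The format-conversion use case comes down to simple arithmetic on each quality character: subtract the input offset to get the Phred score, then add the output offset. A minimal sketch, assuming a hypothetical helper `reencodeQualities` that takes the two shifts as parameters rather than reading them from the package:

```go
package main

import "fmt"

// reencodeQualities is a hypothetical helper (not part of obidefault).
// It converts an ASCII-encoded quality string from one offset to another:
// score = char - inShift, then char' = score + outShift.
func reencodeQualities(qual []byte, inShift, outShift byte) []byte {
	out := make([]byte, len(qual))
	for i, c := range qual {
		out[i] = c - inShift + outShift
	}
	return out
}

func main() {
	// 'h' is ASCII 104, i.e. Q40 in Phred+64; '@' is ASCII 64, i.e. Q0.
	phred64 := []byte("h@")
	phred33 := reencodeQualities(phred64, 64, 33)
	fmt.Println(string(phred33))
}
```

In the package itself, the two offsets would presumably come from `ReadQualitiesShift()` and `WriteQualitiesShift()`.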
@@ -0,0 +1,21 @@
# `obidefault` Package: Configuration State Management
This Go package provides a centralized configuration layer for taxonomy-related settings in the OBIDMS (Open Biological and Biomedical Data Management System) framework. It exposes simple getters, setters, and pointer accessors for five boolean/string settings that control how taxonomic identifiers (taxids) are handled during data processing. No synchronization is described, so the settings are best fixed before concurrent processing begins.
## Core Configuration Flags
- `__taxonomy__`: Stores the currently selected taxonomy (e.g., `"NCBI"`, `"UNIPROT"`).
- `__alternative_name__`: Enables/disables use of alternative taxonomic names (e.g., synonyms).
- `__fail_on_taxonomy__`: If true, processing halts on taxonomy mismatches/errors.
- `__update_taxid__`: If true, taxids are auto-updated to current NCBI/DB versions.
- `__raw_taxid__`: If true, raw (unprocessed) taxids are preserved instead of normalized.
## Public API
- **Getters**: `UseRawTaxids()`, `SelectedTaxonomy()`, `HasSelectedTaxonomy()`, etc., return current values.
- **Pointer Accessors**: e.g., `SelectedTaxonomyPtr()` returns a pointer for direct mutation (advanced use).
- **Setters**: `SetSelectedTaxonomy()`, `SetAlternativeNamesSelected()`, etc., update state.
## Use Case
Typically used at application startup to configure global behavior (e.g., `SetSelectedTaxonomy("NCBI")`, `SetUpdateTaxid(true)`), then referenced by downstream modules during data import, validation, or mapping. Minimalist and explicit—no external dependencies.
@@ -0,0 +1,35 @@
# Obidefault: Parallelism Configuration Module
This Go package (`obidefault`) provides a centralized, configurable interface for managing parallel execution parameters, which is particularly useful in I/O- and CPU-bound workloads.
## Core Concepts
- **CPU-aware defaults**: Automatically detects available cores via `runtime.NumCPU()`.
- **Configurable workers per core**:
- General: `_WorkerPerCore` (default `1.0`)
- Read-specific: `_ReadWorkerPerCore` (`0.25`, i.e., ~1 reader per 4 cores)
- Write-specific: `_WriteWorkerPerCore` (`0.25`)
- **Strict overrides**: Allow hardcoding worker counts via `SetStrictReadWorker()`/`Write...`, bypassing per-core scaling.
## Public API
| Function | Purpose |
|---------|--------|
| `ParallelWorkers()` | Total workers = `MaxCPU() × WorkerPerCore` |
| `Read/WriteParallelWorkers()` | Resolves to strict count if set, else per-core calculation (min 1) |
| `ParallelFilesRead()` | Files read in parallel: defaults to `ReadParallelWorkers()`, overridable |
| Getters (`MaxCPU`, `WorkerPerCore`, etc.) | Expose current settings safely |
| Setters (`Set*`) | Dynamically adjust behavior at runtime |
## Configuration Sources
- **Command-line flags**: e.g., `--max-cpu` or `-m`
- **Environment variable**: `OBIMAXCPU`
## Design Highlights
✅ Decouples resource discovery from policy
✅ Supports both *proportional* (per-core) and *absolute* (strict) worker definitions
✅ Ensures non-zero defaults for critical paths (`ReadParallelWorkers` ≥ 1)
⚠️ **Note**: `WriteParallelWorkers()` contains a likely bug: its else branch returns `_StrictReadWorker` instead of `_StrictWriteWorker`.
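The resolution rule described for read and write workers (strict count if set, otherwise cores × workers-per-core with a floor of 1) can be sketched as follows. This is an illustration, not the package's code: `resolveWorkers` is a hypothetical helper, a strict value of `0` is assumed to mean "not set", and truncation toward zero is an assumed rounding rule.

```go
package main

import (
	"fmt"
	"runtime"
)

// resolveWorkers is a hypothetical helper (not part of obidefault).
// strict > 0 is treated as an explicit override; otherwise the count is
// derived from the number of cores and the per-core ratio, never below 1.
func resolveWorkers(strict, cores int, perCore float64) int {
	if strict > 0 {
		return strict
	}
	n := int(float64(cores) * perCore) // assumption: truncate toward zero
	if n < 1 {
		return 1
	}
	return n
}

func main() {
	cores := runtime.NumCPU()
	// Read and write paths each pass their own strict value, which is the
	// behavior the note above expects from WriteParallelWorkers().
	fmt.Println("read workers:", resolveWorkers(0, cores, 0.25))
	fmt.Println("write workers:", resolveWorkers(0, cores, 0.25))
	fmt.Println("strict write workers:", resolveWorkers(3, cores, 0.25))
}
```

Sharing one helper between the read and write paths, with the strict value passed in as an argument, makes it harder to return the wrong strict variable by accident.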