mirror of
https://github.com/metabarcoding/obitools4.git
synced 2026-04-30 12:00:39 +00:00
8c7017a99d
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5" - Update version.txt from 4.29 → .30 (automated by Makefile)
4.5 KiB
4.5 KiB
obiutils — Semantic Feature Overview
The obiutils package is a collection of low- and mid-level utilities for numerical computation, string manipulation, file I/O, concurrency control, data conversion, and format detection—specifically designed for bioinformatics pipelines in the OBITools 4 ecosystem. All public APIs are type-safe, well-documented, and optimized for performance or correctness depending on use case.
Core Functional Categories
🔢 Numerical Utilities
Abs[T constraints.Signed](x T) T: Generic absolute value for signed integers and floats (viagolang.org/x/exp/constraints).Min/Max(...): Unified functions accepting scalars, slices, or maps—uses reflection for heterogeneous inputs; returns errors on empty/unsupported types.MinMaxSlice[T constraints.Ordered]([]T) (min, max T): Efficient min/max for ordered slices; panics on empty input.MinMultiset[T]: Lazy-delete min-priority multiset with O(log n) insertion, amortized O(1) minimum access.
📦 Data Structures
Set[E comparable]: Generic set usingmap[E]struct{}for O(1) membership; supports union, intersection, add/contains/members.Vector[T],Matrix[T][][]T: Row-major 2D structures with methods:.Column(i),.Rows(indices...),.Dim()(safe for nil/jagged matrices).Make2DArray[T],Make2DNumericArray[T](rows, cols int, zeroed bool)for allocation.
🧠 Type Conversion & Validation
InterfaceToString(i interface{}) string,
CastableToInt(...),
InterfaceToBool(...)/Int/Float64: Safe conversions with typed errors (NotAnInteger, etc.).- **
MapToMapInterface(...),InterfaceToIntMap(...)/StringMap: Converts generic maps to concrete types via reflection. InterfaceToStringSlice(...): Normalizes[]interface{}or string slices to[]string.
📄 File & Stream I/O
ReadLines(path string) ([]string, error): Buffered line-by-line file reading.Wfileabstraction (OpenWritingFile,CompressStream) with transparent gzip (viapgzip), buffering, and append support.Ropen/Wopen(...): Unified opener for files/stdin/HTTP/pipes, auto-detecting gzip/xz/zstd/bzip2 via magic bytes.DownloadFile(url, path string): Simple HTTP download with progress bar (no retries/timeouts).TarFileReader(r io.Reader, path string): Extracts a single file from TAR by exact name match.
🔤 String & ASCII Processing
InPlaceToLower([]byte) []byte: Zero-copy uppercase→lowercase conversion for ASCII using bitwise OR (| 32).UnsafeStringFromBytes([]byte) string,UnsafeBytes(string) []byte: Zero-copy conversions (⚠️ unsafe; no bounds checks).AsciiSet[256]bool: Predefined sets (Space,Digit,Alpha) + operations (union, intersect) and helpers:.FirstWord(...),TrimLeft(s string)(via method),RightSplitInTwo(...).
📏 Memory & Path Utilities
ParseMemSize(s string) (int, error): Parses"128K","5MB"→ bytes.FormatMemSize(n int) string: Formats byte counts as"1.5K","2M"(powers of 1024).RemoveAllExt(path string),Basename(path string): Strip all extensions from paths (e.g.,"file.tar.gz"→"file").
📡 Format Detection & MIME Handling
HasBOM([]byte) bool, BOMType: Detects UTF-8/16/32 byte order marks.DropLastLine([]byte) []byte: Trims final newline-delimited line (for truncated files).RegisterOBIMimeType(...): Extends MIME detection for bioformats (FASTA/FASTQ, CSV, ecoPCR2, GenBank) via regex/magic headers.
🔄 Concurrency & Synchronization
AtomicCounter(start int): Thread-safe counter withInc(),Dec(),Value()(mutex-protected).RegisterAPipe/UnregisterPipe(),WaitForLastPipe(): Lightweight pipeline sync viasync.WaitGroup(logs active goroutines).
📊 Ranking & Ordering
- **
IntOrder(data []int) []int,ReverseIntOrder(...): Returns index permutation for ascending/descending sort (original slice unchanged). - **
Order[T sort.Interface](data T) []int: Generic stable index-based sorting.
🧪 Testing & Reliability
- All functions include unit tests (table-driven,
reflect.DeepEqual, subtests). - Error handling is explicit and typed; logging via Logrus for debugging.
- No external dependencies beyond
golang.org/x/exp/constraints(for generics) and optional libraries (progressbar,pgzip). - Designed for portability across Unix/Windows (uses standard library paths).