mirror of
https://github.com/metabarcoding/obitools4.git
synced 2026-04-30 03:50:39 +00:00
⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30 (automated by Makefile)
@@ -0,0 +1,33 @@
# Obilua: Lua-Based Sequence Processing Framework

The `obilua` package provides a bridge between Go and the Lua scripting language for high-performance, parallelizable biological sequence processing. It enables users to write custom analysis logic in Lua while leveraging Go’s concurrency and I/O capabilities.

## Core Features

- **Lua Interpreter Initialization**: `NewInterpreter()` creates an isolated Lua state preloaded with Obi-specific types (`BioSequence`, etc.).
- **Compilation Support**: `Compile()` and `CompileScript()` parse and compile Lua code into efficient function prototypes.
- **Worker Conversion**: `LuaWorker(proto)` wraps a compiled Lua script as a Go-compatible `SeqWorker`, allowing seamless integration into sequence pipelines.
- **Pipeline Integration**:
  - `LuaProcessor()` executes a Lua script over an iterator of sequences using configurable parallelism.
  - It supports optional `begin()` and `finish()` hook functions in Lua for initialization/cleanup.
  - Errors can be handled either by halting (`breakOnError=true`) or by logging warnings.

- **Pipeable Interface**:
  - `LuaPipe()` and `LuaScriptPipe()` expose Lua scripts as reusable, chainable pipeline stages (`obiiter.Pipeable`), supporting both inline programs and external `.lua` files.

## Lua API Contract

Scripts must define a global `worker(sequence)` function that returns one of:
- A single `BioSequence`
- A list (`BioSequenceSlice`)
- Nothing (the sequence is then treated as filtered out)

Optionally, `begin()` and `finish()` functions may be defined for lifecycle management.

## Parallel Execution

The processor uses goroutines to run multiple workers concurrently, with batched input and output management. If `nworkers ≤ 0`, the worker count falls back to the system-wide parallelism setting.

## Logging & Error Handling

Uses Logrus for structured logging; fatal errors are logged during setup, while runtime issues respect the `breakOnError` flag.
@@ -0,0 +1,29 @@
# `obilua` Module: Lua-Accessible Shared Context with Thread Safety

This Go package exposes a thread-safe, shared key-value context to Lua scripts via the Gopher-Lua interpreter.

## Core Features

- **Global `obicontext` Table**: Registered in Lua with the following methods:
  - `obicontext.item(key [, value])`:
    Get or set a context variable. Supports types: `bool`, number, string, tables (converted via helper), and user data.
  - `obicontext.lock()`: Acquire an exclusive lock on the context (blocking).
  - `obicontext.unlock()`: Release the global lock.
  - `obicontext.trylock()`: Attempt to acquire the lock without blocking; returns boolean success.
  - `obicontext.inc(key)` / `obicontext.dec(key)`: Atomically increment/decrement numeric values (float64 only), with lock protection.

## Thread Safety

- Uses `sync.Mutex` for serializing write operations (e.g., inc/dec, lock/unlock).
- Uses `sync.Map` for concurrent-safe read/write of key-value pairs.
- Critical sections (e.g., increment/decrement) are explicitly wrapped with locks to ensure atomicity.

## Lua Integration

- Values stored in the context persist across script calls.
- Type coercion is handled explicitly: Lua types map directly to Go equivalents, with fallback logging on unsupported types.
- Errors (e.g., incrementing a non-number) trigger fatal logs, which suits controlled environments.

## Use Case

Ideal for embedding Lua logic in Go applications requiring shared state (e.g., config, counters), with explicit locking for race-free updates.
@@ -0,0 +1,31 @@
# Semantic Description of `obilua` Package

The `obilua` package provides utilities for **one-way data marshaling from Go to Lua**, converting native Go values into equivalent `lua.LValue` types for use in a Lua state (`*lua.LState`). This enables Go applications to expose structured data (e.g., maps, slices) or synchronization primitives (`*sync.Mutex`) directly to Lua scripts.

## Core Functionality

- **`pushInterfaceToLua(L, val)`**:
  Main dispatcher that inspects the type of a Go `interface{}` value and routes it to specialized conversion functions. Supported types include:
  - Basic scalar types: `string`, `bool`, `int`, `float64`
  - Collections:
    - Maps: `map[string]{string,int,bool,float64,interface{}}`
    - Slices/arrays: `[]{string,int,byte,float64,bool,interface{}}`
  - Special cases:
    - `nil` → Lua’s `LNil`
    - `*sync.Mutex` (via dedicated handler)

- **Type-Specific Pushers**:
  Each helper function (`pushMapStringIntToLua`, `pushSliceBoolToLua`, etc.) constructs a new Lua table and populates it with converted elements using appropriate `lua.LValue` constructors (`LString`, `LNumber`, `LBool`).
  - Maps are converted as associative tables (keyed by string).
  - Slices become indexed Lua arrays (`1..n`).

- **Generic Slice Support**:
  `pushSliceNumericToLua[T]()` uses Go generics to handle numeric slices (`int`, `float64`, `byte`) uniformly.

## Design Notes

- **No reverse conversion** (Lua → Go) is included; the package only *pushes* values to Lua.
- **Strict typing**: Unsupported types trigger a fatal log (`log.Fatalf`), enforcing explicit type handling.
- **Lua semantics respected**: Tables are 1-indexed, and numeric types map to `lua.LNumber`.

This package is ideal for embedding Lua in Go services where dynamic configuration, rule evaluation, or scripting requires safe and predictable data injection.
@@ -0,0 +1,28 @@
# Semantic Description of `obilua` Package

This Go package provides utilities for converting Lua tables—used in a Gopher-Lua environment—to native Go data structures.

- **`Table2Interface`**:
  Converts a Lua `*lua.LTable` into either:
  - A Go slice (`[]interface{}`) if the table is array-like (keys are numeric, starting at 1), preserving order and type coercion (`nil`, `bool`, `float64`, `string`).
  - A Go map (`map[string]interface{}`) if the table contains string keys (i.e., a hash/dictionary).

- **`Table2ByteSlice`**:
  Specifically converts an array-like Lua table into a `[]byte`, assuming all values are numeric and ≤ 255.
  - Fails with a fatal log if non-numeric or out-of-range values are encountered.
  - Also fails fatally for hash-like (non-array) tables.

- **Key Design Notes**:
  - Type coercion is explicit and safe: only `LTNil`, `LTBool`, `LTNumber`, `LTString` are supported.
  - Array detection relies on key type: if *all* keys are `LNumber`, the table is treated as an array.
  - Uses [`logrus`](https://github.com/sirupsen/logrus) for fatal error reporting.
  - No dependency on external serialization (e.g., JSON); conversions are direct and lightweight.

- **Use Cases**:
  - Bridging Lua scripting layers with Go backends (e.g., embedded config parsing, plugin systems).
  - Efficiently extracting structured data from Lua state into idiomatic Go types.

> ⚠️ **Limitations**:
> - No support for nested tables or custom types.
> - Array indexing assumes 1-based Lua semantics (converted to 0-indexed Go slices).
> - No error handling: misuse triggers `log.Fatalf`.
@@ -0,0 +1,30 @@
# `obilua.Mutex`: Thread-Safe Synchronization in Lua via Go's sync.Mutex

This package exposes **Go’s `sync.Mutex`** to the Lua environment using [gopher-lua](https://github.com/yuin/gopher-lua), enabling safe concurrent access from Lua scripts.

## Key Features

- **Custom userdata type**: Registers a new metatable `"Mutex"` in the Lua state.
- **Constructor function**:
  - `Mutex.new() → mutex userdata`
    Creates and returns a new Go-backed mutex instance.
- **Instance methods**:
  - `mutex:lock()`: Acquires the lock (blocks until available).
  - `mutex:unlock()`: Releases the lock.
- **Type safety**: Validates that only valid mutex userdata values are passed to `lock`/`unlock`.
- **Integration**: Designed for embedding Lua in Go applications requiring synchronization (e.g., multi-threaded scripting).

## Usage Example

```lua
local m = Mutex.new()
m:lock()   -- Acquire lock (safe across goroutines)
-- critical section
m:unlock()
```

## Implementation Notes

- Mutex state is stored in a Go `*sync.Mutex` inside Lua userdata.
- No reference counting or finalizers: the user must manually manage the lock/unlock lifecycle to avoid deadlocks.
- Thread-safe *from Go side only*; Lua calls must respect goroutine safety (e.g., avoid calling from multiple VMs concurrently).
@@ -0,0 +1,30 @@
# Obilib Module Overview

The `obilua` package provides Lua bindings for core OBIL (Ontology-Based Information Library) functionality, enabling scripting and extension of ontological data processing within a Lua environment.

## Core Components

- **`RegisterObilib(luaState *lua.LState)`**
  Main registration function; initializes and exposes OBIL modules to a given Lua state.

- **`RegisterObiSeq(luaState *lua.LState)`**
  Registers sequence-related operations (e.g., parsing, manipulation, and analysis of biological sequences like DNA/RNA/proteins).

- **`RegisterObiTaxonomy(luaState *lua.LState)`**
  Registers taxonomy utilities (e.g., classification, lineage lookup, and hierarchical navigation of taxonomic trees).

## Semantic Capabilities

- Enables *semantic querying* over structured biological data via Lua scripts.
- Supports integration of ontological reasoning (e.g., using GO, NCBI Taxonomy) in dynamic workflows.
- Provides extensibility: new modules can be added by implementing `Register*` functions.

## Design Principles

- Minimal, non-intrusive API: only exposes essential high-level operations.
- Leverages `gopher-lua` for seamless interoperability between Go and Lua.

## Use Cases

- Custom annotation pipelines in bioinformatics.
- Interactive exploration of ontologies and sequences (e.g., via REPL or embedded Lua engines).
@@ -0,0 +1,34 @@
# `obilua` Package: Biosequence Lua Bindings

The `obilua` Go package provides **Lua bindings** for biological sequence objects (`obiseq.BioSequence`) used in the OBITools4 ecosystem. It enables scripting and automation of sequence analysis directly from Lua.

## Core Functionality

- **Type Registration**: Registers a new userdata type `BioSequence` in the Lua state, exposing methods and constructors.
- **Constructor**:
  ```lua
  BioSequence.new(id, sequence[, definition]) → BioSequence
  ```
- **Accessors & Mutators**:
  - `id()`, `sequence()`, `definition()` – get/set identifiers and sequence data.
  - `qualities([table])` – handle Phred quality scores (as Lua table or string).
  - `count()`, `taxid()` – numeric abundance and taxonomic ID.
- **Taxonomy Integration**:
  - `taxon([Taxon])` – get/set taxonomic assignment via integrated taxonomy engine.
- **Attributes**:
  - `attribute(name[, value])` – arbitrary metadata storage (supports tables, strings, numbers).
- **Sequence Operations**:
  - `len()` – length of the sequence.
  - `has_sequence()`, `has_qualities()` – boolean checks for presence of data.
- **Computation & Transformation**:
  - `subsequence(start, end)` – extract a region.
  - `reverse_complement()` → BioSequence.
  - `md5()`, `md5_string()` – compute sequence checksums (raw bytes or hex string).
- **Serialization**:
  - `fasta([format])`, `fastq([format])` – output in FASTA/FASTQ, supporting `"json"` or `"obi"` header formats.
  - `string([format])` – smart formatting: FASTQ if qualities present, else FASTA.

## Implementation Notes

- Uses `gopher-lua` for interpreter integration.
- UserData wrapping ensures type safety and GC management of Go-backed objects.
- Error handling via Lua `ArgError` or `RaiseError`.
@@ -0,0 +1,31 @@
# `obilua` Package: Lua Bindings for BioSequence Slicing

This Go module provides **Lua scripting support** for biological sequence manipulation via the `obilua` package. It exposes a custom Lua type, `"BioSequenceSlice"`, wrapping Go’s `*obiseq.BioSequenceSlice` to enable high-level sequence operations in Lua.

## Core Features

- **Type Registration**: Registers `BioSequenceSlice` as a userdata type in Lua with metatable support.
- **Constructor**: `new([capacity])` creates a new slice (optionally pre-sized).
- **Indexing & Assignment**: `slice[i] = seq` or `seq = slice[i]`, with bounds checking.
- **Dynamic Operations**:
  - `push(seq)`: Append a sequence.
  - `pop()`: Remove and return the last sequence.
- **Length Query**: `len()` returns number of sequences in slice.

## Output Formatting

Provides multiple export methods to format all contained sequences:

- `fasta([format])`: Returns FASTA string (supports `"json"` or `"obi"` headers).
- `fastq([format])`: Returns FASTQ string (same format options as above).
- `string([format])`: Smart formatter:
  - Uses FASTQ if *all* sequences have quality scores.
  - Falls back to FASTA otherwise.

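The "FASTQ only if every sequence has qualities" rule can be sketched as below. The `record` type and both functions are hypothetical, and the headers carry only the identifier; the `"json"` and `"obi"` annotation formats mentioned above are left out of this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// record is a hypothetical minimal sequence record.
// qual is empty when no quality scores are available.
type record struct {
	id, seq, qual string
}

// allHaveQualities reports whether every record carries quality scores.
func allHaveQualities(recs []record) bool {
	for _, r := range recs {
		if r.qual == "" {
			return false
		}
	}
	return true
}

// formatAll writes all records as FASTQ when every record has qualities,
// and as FASTA otherwise.
func formatAll(recs []record) string {
	var b strings.Builder
	fastq := allHaveQualities(recs)
	for _, r := range recs {
		if fastq {
			fmt.Fprintf(&b, "@%s\n%s\n+\n%s\n", r.id, r.seq, r.qual)
		} else {
			fmt.Fprintf(&b, ">%s\n%s\n", r.id, r.seq)
		}
	}
	return b.String()
}

func main() {
	recs := []record{{"s1", "acgt", "IIII"}, {"s2", "ttga", ""}}
	// s2 lacks qualities, so both records are written as FASTA.
	fmt.Print(formatAll(recs))
}
```

Deciding the format once for the whole slice avoids emitting a file that mixes FASTA and FASTQ records.
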
## Design Notes

- All methods validate input types and indices.
- Format selection is optional; defaults to `"obi"` header style unless specified as `"json"`.
- Integrates with `obiseq.BioSequence` and formatting utilities from the OBITools4 ecosystem.

This enables Lua users to process NGS data (e.g., FASTA/FASTQ) interactively within pipelines, leveraging Go’s performance and Lua’s expressiveness.
@@ -0,0 +1,30 @@
# Lua Bindings for Taxonomic Operations in `obilua`

This Go package provides a set of **Lua-accessible functions** for manipulating taxonomic data through the `obitax` library. It exposes a custom Lua type, `"Taxon"`, enabling users to create and query hierarchical taxonomic entities directly from Lua scripts.

## Core Features

- **Taxon Type Registration**:
  A new userdata type `Taxon` is registered in the Lua state, with methods exposed via a metatable and `"__index"` delegation.

- **Taxon Creation**:
  The `Taxon.new(taxid, parent, sname, rank[, isroot])` constructor creates a new taxon node in the taxonomy. It supports an optional root flag and raises errors on failure.

- **Scientific Name Management**:
  `taxon:scientific_name([newname])` gets or sets the scientific name of a taxon.

- **Taxonomic Navigation**:
  Methods allow upward/downward traversal:
  - `taxon:parent()` → returns the parent taxon (or nil if root).
  - `taxon:species()`, `taxon:genus()`, `taxon:family()` → return the nearest taxon at that rank.
  - `taxon:taxon_at_rank(rank)` → returns the ancestor taxon at a given rank (e.g., `"order"`, `"class"`).

- **String Representation**:
  `taxon:string()` returns a human-readable string (typically the scientific name).

- **Integration with Taxonomy Context**:
  All operations assume an active taxonomy context (enforced via `checkTaxonomy`), and taxon instances are wrapped as Lua userdata with proper type checking.

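Rank-based navigation such as `taxon_at_rank` amounts to walking the parent chain until a node with the requested rank is found. The sketch below uses a hypothetical `taxon` struct rather than the `obitax` types. Whether the starting taxon itself counts when its own rank matches is an assumption made here.

```go
package main

import "fmt"

// taxon is a hypothetical node; parent is nil for the root.
type taxon struct {
	name, rank string
	parent     *taxon
}

// taxonAtRank returns the taxon itself or its nearest ancestor with the
// given rank, or nil when no node on the path to the root has that rank.
func (t *taxon) taxonAtRank(rank string) *taxon {
	for cur := t; cur != nil; cur = cur.parent {
		if cur.rank == rank {
			return cur
		}
	}
	return nil
}

func main() {
	root := &taxon{name: "root", rank: "no rank"}
	family := &taxon{name: "Hominidae", rank: "family", parent: root}
	genus := &taxon{name: "Homo", rank: "genus", parent: family}
	species := &taxon{name: "Homo sapiens", rank: "species", parent: genus}

	fmt.Println(species.taxonAtRank("family").name)  // prints Hominidae
	fmt.Println(species.taxonAtRank("order") == nil) // prints true
}
```

Under this model, shortcuts like `species()`, `genus()`, and `family()` would be the same walk with a fixed rank argument.
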
## Use Case

Ideal for scripting biodiversity pipelines (e.g., in OBITools), where users need to dynamically inspect or build taxonomies during sequence annotation, filtering, or reporting.
@@ -0,0 +1,29 @@
# ObiTax Lua Module Documentation

This Go package (`obilua`) provides **Lua bindings** for the `obitax` taxonomy management module of OBITools4, enabling scripting in Lua with rich taxonomic operations.

## Core Features

- **Type Registration**: Registers two main types in the Lua state: `Taxonomy` and `Taxon`.
- **Factory Functions**:
  - `obitax.Taxonomy.new(name, code [, charset])`: Creates a new taxonomy instance.
  - `obitax.Taxonomy.default()`: Returns the globally configured default taxonomy (raises an error if none exists).
  - `obitax.Taxonomy.has_default()`: Boolean check for existence of a default taxonomy.
  - `obitax.Taxonomy.nil`: Represents the nil taxon (used for missing data).

## Taxonomy Object Methods

- `name()`: Returns the taxonomy name (e.g., `"NCBI"`).
- `code()`: Returns the internal code used for taxonomic identifiers (e.g., `"txid"`).
- `taxon(id)`: Retrieves a taxonomic node by ID:
  - returns the corresponding *Taxon* object when the ID is found;
  - raises an error if the ID is not found, or on alias resolution when `FailOnTaxonomy()` is enabled.

## Taxon Object Support

- A dedicated `registerTaxonType` (not shown here) exposes a Lua-accessible *Taxon* type with methods like `rank`, `parent`, and string representation.

## Integration

- Built on top of standard OBITools4 types (`obitax.Taxonomy`, `obiutils.AsciiSetFromString`).
- Leverages GopherLua for seamless interoperability between Go and Lua.