Files
obitools4/autodoc/docmd/pkg_obitable.md
T
Eric Coissac 8c7017a99d ⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
2026-04-13 13:34:53 +02:00

40 lines
3.1 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# `obitable`: Row-Oriented Data Table for Biological Sequences
The `obitable` package provides a lightweight, row-oriented data table structure (`Table`) for managing biological sequence metadata in Go. It is designed to support heterogeneous, schema-flexible tabular representations of sequences while maintaining strong interoperability with `obiseq`, the core biological sequence module in OBITools4.
## Core Types
- **`Header`**: An ordered list of column names (alias for `stl4go.Ordered`). It defines the schemas *column order* but not types.
- **`Row`**: A flexible, map-like structure (`map[string]interface{}`) representing a single record. Values may be of any Go type.
- **`Table`**: Encapsulates both a `Header`, column-type metadata (`ColType map[string]reflect.Type`), and an ordered slice of `Row`s. Enforces type consistency *per column* across rows.
## Row Generators (Lazy/On-Demand Construction)
- **`RowFromMap(map[string]interface{}, interface{}) RowFunc`**
Returns a callable `func(string) interface{}` (i.e., a *row accessor*). For any column name, it retrieves the corresponding value from the input map; missing keys are replaced by a configurable default (`navalue`, typically `nil`). Enables efficient wrapping of generic maps as row-like functions.
- **`RowFromBioSeq(seq obiseq.BioSequence, navalue interface{}) RowFunc`**
Constructs a row accessor specialized for `obiseq.BioSequence`. Maps standard fields (`id`, `description`, `sequence`, etc.) and dynamically extracts all sequence annotations (e.g., `qualifiers` in FASTA/FASTQ) as column entries. Missing fields default to `navalue`.
## Semantic Capabilities
- **Heterogeneous column types**: Each column may hold values of any Go type (e.g., `string`, `int64`, `[]byte`), with runtime type tracking via `ColType`.
- **Uniform metadata access**: Enables seamless integration of sequence identifiers, raw sequences, and rich annotation sets (e.g., taxonomy IDs, quality scores).
- **Streaming-friendly**: Row generators avoid materializing full row maps until needed—ideal for large-scale pipeline processing.
- **Interoperability**: Built explicitly to work with `obiseq` and future extensions of OBITools4.
## Public API Summary
| Function / Type | Purpose |
|-----------------|---------|
| `NewTable(header Header, colType map[string]reflect.Type) *Table` | Instantiate a new table with schema. |
| `Append(t *Table, row Row)` | Append one fully materialized row to the table. |
| `AppendFunc(t *Table, f RowFunc)` | Append a row via lazy accessor (no intermediate map). |
| `RowFromMap(...), RowFromBioSeq(...)` | Create reusable row accessors for map-based or sequence-backed data. |
| `ToMap(row Row) map[string]interface{}` | Materialize a row as a plain Go map. |
| `ColType(t *Table) map[string]reflect.Type` | Expose column type metadata. |
| `Header(t *Table) Header` | Retrieve the tables ordered header (column names). |
| `Rows(t *Table) []Row` | Access all rows as a slice (for iteration/export). |
> **Note**: All public functions operate on `Table`, `Header`, and `Row` types. Internal helpers (e.g., type-checking utilities) are not exposed.