# `obidemerge` Package Documentation

The **`obidemerge`** package enables *demerging* of biological sequences—i.e., splitting aggregated or merged sequence records into discrete, count-annotated variants based on metadata statistics. It supports both programmatic and CLI workflows for downstream processing in metabarcoding or amplicon-based pipelines.

## Core Functionalities

### 1. `MakeDemergeWorker(key string) SeqWorker`
- **Purpose**: Constructs a sequence processor that splits sequences by statistical metadata.
- **Behavior**:
  - Scans the input sequence for a statistics map under attribute `key`.
    - *Example*: If `"sample"` → `{ "S1": 5, "S2": 3 }`, two new sequences are generated.
  - For each `(stat_key, count)` pair:
    - Copies the original sequence data,
    - Adds a new attribute: `key = stat_key`,
    - Sets `.Count` to the corresponding integer value.
  - Removes original statistics from the input sequence after splitting.
- **Fallback**: If no stats are found for `key`, returns a single-element slice containing the unchanged sequence.

### 2. `CLIDemergeSequences(iterator, slot string) SeqIterator`
- **Purpose**: CLI wrapper for batch demerging.
- **Behavior**:
  - Applies `MakeDemergeWorker(slot)` to each sequence in the input iterator.
  - Supports parallel processing (implementation-dependent).
- **Integration**:
  - Designed to be used with the `--demerge` CLI flag (see below).

### 3. CLI Integration via OptionSet
- **Flag**: `--demerge` (`-d`)
  - Specifies the metadata slot to demerge (default: `"sample"`).
- **APIs**:
  - `DemergeOptionSet(options *getoptions.Options)`: Registers the `-d/--demerge` flag.
  - `CLIDemergeSlot() string`: Returns the selected slot name (e.g., `"sample"`), used by downstream workers.
- **Inheritance**:
  - Extends `obiconvert.OptionSet`, inheriting standard conversion options (I/O formats, filters, etc.).

## Semantic Workflow

1. **Input**: Sequences with embedded statistical metadata (e.g., sample abundances, OTU counts).
2. **Demerge Operation**: Splits each sequence into multiple copies—each tagged with a unique metadata key and abundance.
3. **Output**: A new set of sequences where each variant is independently annotated, enabling:
   - Accurate abundance-aware filtering,
   - Per-variant downstream analysis (e.g., taxonomic assignment, diversity metrics).

## Key Concept: *Demerging*
- **Definition**: Reversal of prior merging steps (e.g., OTU clustering, read pairing).
- **Purpose**: Restores granularity for statistical or ecological interpretation while preserving original sequence data.

## Use Cases
- Post-clustering demerging of OTU/ASV tables.
- Splitting merged paired-end reads by sample or condition metadata.
- Preparing data for tools expecting discrete, count-labeled sequences.

> **Note**: Only *public* APIs are documented. Internal helpers (e.g., slot validation, worker state) remain unspecified.