Files
obitools4/autodoc/prompt_full.md
T
Eric Coissac 8c7017a99d ⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
2026-04-13 13:34:53 +02:00

31 KiB
Raw Blame History

Task

Produce complete documentation for the obi{xxx} CLI command:

  1. autodoc/cmd/obi{xxx}.md — the markdown documentation file
  2. autodoc/examples/obi{xxx}/ — synthetic input sequence files for testing examples
  3. obitools4-doc/content/docs/commands/<category>/obi{xxx}/_index.md — Hugo documentation page

Execute the three phases below in order. Do not skip phases. Do not merge phases.


TOOL CALL FORMAT — enforce before every call

A tool call is exactly:

<function=tool_name>
{"param": "value"}
</function>

Rules:

  • < is immediately followed by f — zero spaces, zero characters in between.
  • Parameters are a single JSON object — no XML tags, no <parameter=...>, no </parameter>.
  • No outer wrapper — never use <tool_call>, <tool_use>, or any other enclosing tag.
  • Tool name lowercase with double underscores — never ALL_CAPS, never single underscore between server and tool name.

CORRECT: <function=mcp__treesitter__treesitter_get_dependencies> WRONG: < function=mcp__treesitter__treesitter_get_dependencies > ← spaces WRONG: <function=mcp_treesitter_treesitter_get_dependencies> ← wrong separator


HALLUCINATION GUARD — enforce before writing anything

OBITools4 is a complete rewrite. Training data about OBITools v1/v2/v3 is wrong for this version.

Before writing any sentence, apply this check:

"Can I point to the exact line in $help or $docs that justifies this claim?" If NO → do not write it.

This applies to: option names, option flags, default values, file formats, behaviours, algorithms, output fields. Omit rather than guess. A shorter correct page is better than a longer hallucinated one.


PHASE 1 — Generate initial documentation file

(Equivalent to prompt_v2.md)

STATE 1 — Gather raw data (parallel)

Input: nothing. Action: emit all of the following tool calls in a single message.

Call 1 — dependencies of the main entry point:

<function=mcp__treesitter__treesitter_get_dependencies>
{"language": "go", "file_path": "cmd/obitools/obi{xxx}/main.go"}
</function>

Call 2 — dependencies of every file in the command package (one call per .go file in pkg/obitools/obi{xxx}/):

<function=mcp__treesitter__treesitter_get_dependencies>
{"language": "go", "file_path": "pkg/obitools/obi{xxx}/options.go"}
</function>
<function=mcp__treesitter__treesitter_get_dependencies>
{"language": "go", "file_path": "pkg/obitools/obi{xxx}/obi{xxx}.go"}
</function>

Call 3 — symbol outline of the command package:

<function=mcp__jcodemunch__get_file_outline>
{"repo": "git.metabarcoding.org/obitools/obitools4/obitools4", "file_paths": ["pkg/obitools/obi{xxx}/options.go", "pkg/obitools/obi{xxx}/obi{xxx}.go"]}
</function>

Call 4 — CLI help text:

<function=Bash>
{"command": "cd /Users/coissac/Sync/travail/__MOI__/GO/obitools4 && obi{xxx} --help 2>&1"}
</function>

Output: store results as $deps, $outline, $help. Stop. Do not interpret, summarise, or write anything. Proceed to STATE 2.


STATE 2 — Resolve documentation files

Input: $deps, $outline. Action (no tool calls):

  1. Collect every import path that starts with git.metabarcoding.org/obitools/obitools4/obitools4/pkg/.
  2. Remove these infrastructure packages:
    • pkg/obidefault, pkg/obiiter, pkg/obiformats, pkg/obioptions, pkg/obiutils, pkg/obiparams
    • pkg/obiseq — keep only if $outline shows non-trivial sequence manipulation (custom methods or transformations beyond simple access).
  3. Always add: pkg/obitools/obi{xxx} and pkg/obitools/obiconvert.
  4. Map each remaining package path to its doc file:
    • take all path segments from pkg/ onward (inclusive), replace / with _autodoc/docmd/<joined_segments>.md
    • Example: git.metabarcoding.org/.../pkg/obitools/obicsvautodoc/docmd/pkg_obitools_obicsv.md
    • Example: git.metabarcoding.org/.../pkg/obiseqautodoc/docmd/pkg_obiseq.md

Output: store file list as $docfiles. Stop. Proceed to STATE 3.


STATE 3 — Read documentation files (parallel)

Input: $docfiles. Action: emit one Read call per file in $docfiles, all in a single parallel message.

Do NOT read any .go source file. Do NOT produce any summary or analysis.

<function=Read>
{"file_path": "/Users/coissac/Sync/travail/__MOI__/GO/obitools4/DOCFILE"}
</function>

Output: store all file contents as $docs. Stop. Proceed to STATE 4.


STATE 4 — Write the documentation file

Input: $help, $docs, $outline. Action: fill the template below, then emit exactly one Write call.

Source discipline: every piece of information in the template MUST come from $help, $docs, or $outline. If the source does not contain the information, write _(not available)_ — never invent.

Output template

# NAME

obi{xxx} — [FILL: one-line description. Source: first line of $help]

---

# SYNOPSIS

[FILL: verbatim USAGE block from $help, inside a code block]

---

# DESCRIPTION

[FILL: 24 paragraphs explaining what the command does, why a biologist would use it,
and what it produces. Source: $help description section + $docs.
No jargon. No implementation details (goroutines, channels, GC, arena).
No options that belong in the OPTIONS section.]

---

# INPUT

[FILL: accepted input formats and how to provide them. Source: $help + $docs/obiconvert.]

---

# OUTPUT

[FILL: output format and what fields/attributes are added or changed. Source: $help + $docs.
If the default output is JSON, state it clearly. If YAML, state it clearly. Do not assume.]

---

# OPTIONS

[FILL: one subsection per thematic group found in $help.
For each flag:
- Flag name(s): long form + short form if it exists. Source: $help exactly.
- Default: state it. Source: $help exactly. If absent from $help: write "none".
- Meaning: explain the biological or practical purpose, not just the mechanical action.
- Do NOT include any flag not present in $help.]

---

# EXAMPLES

[FILL: at least four copy-pasteable examples.
Each example:
1. A one-line comment explaining the biological use case.
2. The command inside a code block.
3. MUST include output redirection (`> out.fasta`) or `-o` flag so the user can reproduce the output.
4. Use consistent file names across examples (same input file for different options).
5. COVERAGE RULE: every command-specific option documented in the OPTIONS section MUST
   appear in at least one example. If needed, add extra examples to achieve full coverage.
Source for flags and options: $help only.]

---

# SEE ALSO

[FILL: related obi commands mentioned in $docs or $help. If none: omit section.]

---

# NOTES

[FILL: caveats, performance notes, known limitations. Source: $docs only.
If none: omit section.]

Emit the Write call:

<function=Write>
{"file_path": "/Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/cmd/obi{xxx}.md", "content": "..."}
</function>

Stop. Proceed to PHASE 2.


PHASE 2 — Test examples and enrich documentation

(Equivalent to prompt_examples.md)

DOCUMENT PRESERVATION — critical

The output of STATE 5 is $doc with surgical edits only. The rules are:

  • Copy the ENTIRE content of $doc verbatim into the new file.
  • Apply ONLY the three modifications described in STATE 5 (EXAMPLES update, prose corrections, OUTPUT subsection addition).
  • Do NOT reformat, reorder, rewrite, or restructure any heading, paragraph, option list, or prose from $doc unless it is factually contradicted by actual execution results.
  • Do NOT add new top-level sections (no ENVIRONMENT VARIABLES, no duplicate OUTPUT, etc.).
  • Do NOT change section title casing, Markdown heading levels, or list syntax.
  • If in doubt, leave the section exactly as it appears in $doc.

Prerequisites — FASTQ FORMAT

A valid FASTQ record is exactly 4 lines in this order:

@<identifier> <optional description>
<nucleotide sequence>          ← MUST be non-empty (≥ 10 characters, A/T/G/C only)
+
<quality string>               ← MUST be the exact same length as the sequence line

Common mistakes forbidden:

  • Writing @header\n+\nquality with the sequence line missing.
  • Writing a quality string shorter or longer than the sequence.
  • Mixing > (FASTA) and @ (FASTQ) headers in the same file.
  • Writing ~-separated fields (e.g. @seq002~description) — use a space.
  • Writing a quality string containing characters outside printable ASCII (33126).

OUTPUT FORMAT GUARD

OBITools4 determines the output format from data content and explicit flags, NOT from filename extension.

  • If the example is meant to produce FASTA output from FASTQ input, the command MUST include --fasta-output.
  • If the example is meant to produce FASTQ output from FASTA input, the command MUST include --fastq-output.
  • Never assume an output format from the filename alone.
  • Verify the actual format of each output file by checking its first character: > = FASTA, @ = FASTQ, [/{ = JSON.
  • If the format is wrong, add the missing flag, update $cmds_doc and $cmds_run, and re-run.

OPTION VALIDATION GUARD

Before writing any example command in STATE 2, explicitly cross-check each option against the OPTIONS section of $doc:

  • Every flag used must appear in the OPTIONS section with the claimed semantics.
  • Input-format flags (--fasta, --fastq, --csv, --genbank, --embl, --ecopcr) tell the tool how to read input — they do NOT affect output format.
  • Output-format flags (--fasta-output, --fastq-output, --json-output) control write format.
  • If an option needed for a working example is absent from $doc, mark that example as SKIP rather than inventing a flag.

ANNOTATION RULES — CRITICAL

When creating FASTA/FASTQ files with annotations:

  • Use only valid annotation attribute names: taxid, scientific_name, rank, definition, sample, run_id, instrument
  • For taxonomy data: use taxid (NCBI Taxonomy ID) and scientific_name — never invent taxids
  • Examples of valid taxonomy annotations:
    • >seq001 {"taxid":2} — Bacteria (valid NCBI taxid)
    • >seq002 {"taxid":2157,"scientific_name":"Archaea"} — Archaea (valid NCBI taxid)
    • >seq003 {"taxid":2759,"scientific_name":"Eukaryota"} — Eukaryota (valid NCBI taxid)
  • NEVER use invented taxids
  • Map attributes (JSON maps) must have names ending with _merged (e.g., taxid_merged, sample_merged)

CSV FILES FOR JOINS

When creating CSV files for obijoin:

  • Do NOT include the ID column in the CSV (the join key is specified separately via --by)
  • The CSV format is auto-detected; do NOT use --csv flag
  • Example CSV structure for taxid join:
    taxid,scientific_name,phylum
    2,Bacteria,Proteobacteria
    2157,Archaea,Euryarchaeota
    2759,Eukaryota,Arthropoda
    
  • Example command: obijoin --join-with taxonomy.csv --by taxid sequences.fasta

STATE 1 — Read the documentation file

Input: nothing. Action: emit a single Read call.

<function=Read>
{"file_path": "/Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/cmd/obi{xxx}.md"}
</function>

Output: store content as $doc. Stop. Do not interpret or summarise. Proceed to STATE 2.


STATE 2 — Analyse examples and design input files

Input: $doc. Action (no tool calls):

  1. Extract every example command from the EXAMPLES section of $doc.

    • Identify every distinct input filename referenced.
    • Identify every option used and verify each against the OPTIONS section (OPTION VALIDATION GUARD).
    • Skip any example that requires an external resource (taxonomy database, remote URL, pre-existing output file). Mark it SKIP — it will be kept verbatim in the final doc without a **Expected output:** annotation.
    • --paired-with examples: --paired-with requires --out (standard output cannot be used). The command produces TWO output files <prefix>_R1.ext and <prefix>_R2.ext. Do NOT use > redirection for these. In STATE 4, read both _R1 and _R2 files.
  2. Coverage check — command-specific options: From the OPTIONS section of $doc, list all command-specific options (excluding those covered by standard option-sets: input, output, common — see Phase 3 STATE 2 item 8 for the full list). Verify that every such option appears in at least one non-skipped example. If any option is not covered, add an additional example that exercises it before proceeding.

  3. For each distinct input filename, design synthetic sequence content that:

    • Is minimal (≤ 20 sequences, each ≤ 300 bp).
    • Contains sequences that will produce output (positive cases) AND at least one that will not produce output (negative case), when the command filters sequences.
    • Exercises every option combination present in the non-skipped examples.
    • Uses realistic identifiers (seq001, seq002, …).
  4. File format rules (strictly enforced):

    • FASTA: >id description header, then sequence on one or more lines (60 bp per line), ≥ 10 bp, A/T/G/C only.
    • FASTQ: exactly 4 lines per record. Before finalising, mentally verify each record:
      • Line 1 starts with @, has an identifier, optionally a space and description.
      • Line 2 is the nucleotide sequence (≥ 10 characters, A/T/G/C only).
      • Line 3 is exactly +.
      • Line 4 has exactly the same number of characters as line 2. If any record fails this check, fix it before proceeding.
  5. Rewrite every non-skipped example command into two forms:

    • $cmds_doc: the bare command as it will appear in documentation — filenames only, no cd prefix.
    • $cmds_run: the same command prefixed with cd /Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/examples/obi{xxx} &&

Output: store file designs as $files, $cmds_doc, $cmds_run. Stop. Proceed to STATE 3.


STATE 3 — Write input files, validate, and run examples

Step 3a — create input files (parallel):

<function=Write>
{"file_path": "/Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/examples/obi{xxx}/FILENAME", "content": "..."}
</function>

Step 3b — validate input files:

<function=Bash>
{"command": "cd /Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/examples/obi{xxx} && python3 -c \"\nimport sys\nfor fname in $(echo FILENAMES):\n    lines = open(fname).readlines()\n    if fname.endswith('.fastq'):\n        assert len(lines) % 4 == 0, f'{fname}: line count not multiple of 4'\n        for i in range(0, len(lines), 4):\n            hdr, seq, plus, qual = lines[i:i+4]\n            assert hdr.startswith('@'), f'{fname} record {i//4+1}: header must start with @'\n            seq = seq.rstrip(); qual = qual.rstrip()\n            assert len(seq) >= 10, f'{fname} record {i//4+1}: sequence too short ({len(seq)})'\n            assert len(seq) == len(qual), f'{fname} record {i//4+1}: seq len {len(seq)} != qual len {len(qual)}'\n    elif fname.endswith('.fasta') or fname.endswith('.fa'):\n        assert lines[0].startswith('>'), f'{fname}: first line must start with >'\n        seq_len = sum(len(l.rstrip()) for l in lines[1:] if not l.startswith('>'))\n        assert seq_len >= 10, f'{fname}: total sequence length too short ({seq_len})'\nprint('All input files valid')\n\" 2>&1; echo EXIT:$?"}
</function>

If validation fails (EXIT non-zero): fix the offending file(s) and re-run. Do NOT proceed until validation passes.

Step 3c — run examples (sequential, one Bash call at a time):

For each non-skipped example, emit the run command, wait for the result, then immediately verify.

Run:

<function=Bash>
{"command": "cd /Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/examples/obi{xxx} && COMMAND 2>&1; echo EXIT:$?"}
</function>

After each EXIT:0, verify the output file exists and is non-empty:

<function=Bash>
{"command": "ls -la /Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/examples/obi{xxx}/OUTPUT_FILE && head -c 200 /Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/examples/obi{xxx}/OUTPUT_FILE"}
</function>

Check output format from the first character: > = FASTA, @ = FASTQ, [/{ = JSON. If the format does not match expectation, add the missing --fasta-output / --fastq-output / --json-output flag, update $cmds_doc and $cmds_run, and re-run.

If a command fails (EXIT non-zero): diagnose, fix, update $cmds_doc and $cmds_run, and re-run. Do NOT proceed to STATE 4 until all non-skipped commands have EXIT:0 and verified non-empty output files.

Output: store per-command results as $runs. Stop. Proceed to STATE 4.


STATE 4 — Read output files

Input: $runs (output file paths from STATE 3). Action: emit one Read call per output file successfully produced (EXIT:0), all in a single parallel message.

Do NOT re-run commands — read only files already generated in STATE 3.

<function=Read>
{"file_path": "/Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/examples/obi{xxx}/OUTPUT_FILE"}
</function>

Output: store contents as $outputs. Stop. Proceed to STATE 5.


STATE 5 — Update the documentation file

Input: $doc, $runs, $outputs, $cmds_doc.

Re-read DOCUMENT PRESERVATION at the top before writing. Apply ONLY these modifications:

Modification 1 — EXAMPLES section

For each non-skipped example:

  • Replace the original command with the $cmds_doc form (no cd prefix).
  • Keep the one-line biological use-case comment unchanged.
  • Add **Expected output:** on its own line after the closing triple-backtick of the code block:
    **Expected output:** N sequences written to `out_name.fasta`.
    
    where N = number of lines starting with > or @ in the corresponding $outputs entry.

For skipped examples: keep them exactly as they are in $doc, no annotation added.

Modification 2 — Prose corrections (if any factual contradiction)

Fix only specific sentences that are contradicted by actual execution results. Examples of things to correct:

  • An attribute name that differs from actual output.
  • An output format claimed that differs from the actual format observed.
  • A default value stated incorrectly.
  • A claim about which sequences are selected/discarded that contradicts observed results.

After each corrected passage, add: <!-- corrected: <brief reason> --> Do NOT "improve" text that is merely incomplete — only fix outright contradictions.

Modification 3 — OUTPUT section

Find the # OUTPUT section and append at its very end (before the next --- or #):

## Observed output example

<verbatim excerpt — first ≤ 10 sequences from the first successful $outputs entry>

Rules:

  • The excerpt is copied byte-for-byte from $outputs. No editing, no truncation within a record.
  • Do NOT duplicate the OUTPUT section. There must be exactly one # OUTPUT heading.
  • If no output was successfully produced, omit this subsection entirely.
<function=Write>
{"file_path": "/Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/cmd/obi{xxx}.md", "content": "..."}
</function>

Stop. Proceed to PHASE 3.


PHASE 3 — Generate Hugo documentation

(Equivalent to prompt_hugo.md)

HUGO SHORTCODE REFERENCE

Use only the shortcodes listed below. Never invent others.

Shortcode Syntax Effect
Command link {{< obi obi{xxx} >}} Renders command name as internal link
Format name {{% fasta %}} {{% fastq %}} {{% csv %}} {{% json %}} {{% yaml %}} Renders format name (use in prose)
Suite name {{% obitools4 %}} Renders "OBITools4" as styled text
Embed data file {{< code "FILENAME" FORMAT true >}} Embeds file content; FORMAT = fasta, fastq, txt, csv, json, yaml
Standard option set {{< option-sets/input >}} Renders shared input-format options
Standard option set {{< option-sets/output >}} Renders shared output-format options
Standard option set {{< option-sets/common >}} Renders shared performance/logging options
Standard option set {{< option-sets/selection >}} Renders shared sequence-selection options (obigrep only)
Single shared option {{< cmd-options/paired-with >}} Renders --paired-with option
Custom option block {{< cmd-option name="NAME" short="S" param="PARAM" >}} text {{< /cmd-option >}} Renders a command-specific option; short and param are optional
Workflow diagram {{< mermaid class="workflow" >}}{{< /mermaid >}} Renders Mermaid flowchart

SECTION STRUCTURE OF A HUGO COMMAND PAGE

---
archetype: "command"
title: "obi{xxx}"
date: YYYY-MM-DD
command: "obi{xxx}"
category: <category>
url: "/obitools/obi{xxx}"
weight: <weight>
---

# `obi{xxx}`: <one-line description>

> [!WARNING] Preliminary AI-generated documentation
> This page was automatically generated by an AI assistant and has **not yet been
> reviewed or validated** by the {{% obitools4 %}} development team. It may contain
> inaccuracies or incomplete information. Use with caution and refer to the command's
> `--help` output for authoritative option descriptions.

## Description

<narrative prose — 25 paragraphs, uses {{< obi >}} and {{% format %}} shortcodes>
<workflow diagram>
<data files with {{< code >}} shortcodes and paired command+output blocks>

## Synopsis

```bash
obi{xxx} [--option1] [--option2|-s PARAM] ... [<args>]

Options

{{< obi obi{xxx} >}} specific options

  • {{< cmd-option name="NAME" short="S" param="PARAM" >}} Description of option. {{< /cmd-option >}}

Taxonomic options ← include only if command uses taxonomy

  • {{< cmd-options/taxonomy/taxonomy >}}

{{< option-sets/input >}}

{{< option-sets/output >}} ← omit if command has no output (e.g. obicount)

{{< option-sets/common >}}

Examples

...

obi{xxx} --help

**YAML front matter fields:**
- `archetype`: always `"command"`.
- `title` and `command`: the command name.
- `date`: today's date in `YYYY-MM-DD` format.
- `category`: the subdirectory name under `commands/`.
- `url`: always `/obitools/obi{xxx}`.
- `weight`: copy from the existing `_index.md` if it exists; otherwise use `50`.

---

### STATE 1 — Read source material (parallel)

**Input:** nothing.
**Action:** emit all of the following calls in a single parallel message.

<function=Read> {"file_path": "/Users/coissac/Sync/travail/MOI/GO/obitools4/autodoc/cmd/obi{xxx}.md"} <function=Bash> {"command": "ls /Users/coissac/Sync/travail/MOI/GO/obitools4/autodoc/examples/obi{xxx}/ 2>/dev/null || echo NO_EXAMPLES"} <function=Read> {"file_path": "/Users/coissac/Sync/travail/MOI/GO/obitools4-doc/content/docs/commands/basics/obi{xxx}/_index.md"} <function=Bash> {"command": "find /Users/coissac/Sync/travail/MOI/GO/obitools4-doc/content/docs/commands -type d -name 'obi{xxx}'"}


**Output:** store as `$doc`, `$examples_list`, `$existing_hugo`, `$category_path`.
**Stop.** Do not interpret. Proceed to STATE 2.

---

### STATE 2 — Determine category and plan content

**Input:** `$doc`, `$examples_list`, `$existing_hugo`, `$category_path`.

1. **Category:**
   - If `$category_path` found a directory, extract the category name (segment between `commands/` and `obi{xxx}`).
   - If `$existing_hugo` contains `category:`, use that value.
   - Otherwise, default to `basics`.

2. **Weight:**
   - If `$existing_hugo` contains `weight:`, reuse that value.
   - Otherwise, use `50`.

3. **Example files to copy:** input files AND output files from the working examples. Skip compressed files (`.gz`).

4. **Naming convention for output files:** Use simple names like `out.json`, `out.yaml`, `out.fasta`, `out.fastq` — NOT `out_json.json` or similar.

5. **Command-output file consistency:** Each example command MUST produce the file shown below it. Verify that the flag in the command creates the displayed file.

6. **Plan replacements:**
   - `obi{xxx}` → `{{< obi obi{xxx} >}}`
   - Format names in prose → `{{% fasta %}}`, etc.
   - Input filenames → `{{< code "FILENAME" FORMAT true >}}`

7. **Workflow diagram consistency:** The Mermaid diagram MUST use the exact same files as the first working example.

8. **Options section plan — standard option-sets coverage:**
   The following options are covered by standard option-sets and must NOT be re-documented:
   - `{{< option-sets/input >}}`: `--fasta`, `--fastq`, `--embl`, `--genbank`, `--ecopcr`, `--csv`, `--input-OBI-header`, `--input-json-header`, `--u-to-t`, `--solexa`, `--skip-empty`, `--no-order`.
   - `{{< option-sets/output >}}`: `--fasta-output`, `--fastq-output`, `--json-output`, `--output-OBI-header`/`-O`, `--output-json-header`, `--out`/`-o`, `--compress`/`-Z`.
   - `{{< option-sets/common >}}`: `--max-cpu`, `--batch-size`, `--batch-size-max`, `--batch-mem`, `--no-progressbar`, `--debug`, `--silent-warning`, `--pprof`, `--pprof-goroutine`, `--pprof-mutex`, `--version`, `--help`.
   - `--paired-with` → `{{< cmd-options/paired-with >}}`.
   - Taxonomy options (`--taxonomy`, `--fail-on-taxonomy`, `--raw-taxid`, `--update-taxid`, `--with-leaves`) → grouped under `#### Taxonomic options` with `{{< cmd-options/taxonomy/taxonomy >}}` for `--taxonomy`; document the rest with `{{< cmd-option >}}` blocks.
   - All remaining command-specific options → `{{< cmd-option >}}` blocks under `#### {{< obi obi{xxx} >}} specific options`.

9. **Examples:** keep only examples whose input files exist in `$examples_list`. Add final `obi{xxx} --help`.

10. **Remove duplicates:** If the same files appear in both Description and Examples sections, keep only the first occurrence.

**Output:** store as `$plan`.
**Stop.** Proceed to STATE 3.

---

### STATE 3 — Read example input files (parallel)

Emit one Read call per file to be used in the Hugo page (both input and output files). Do not read compressed files (`.gz`).

<function=Read> {"file_path": "/Users/coissac/Sync/travail/MOI/GO/obitools4/autodoc/examples/obi{xxx}/FILENAME"}


**Output:** store file contents as `$input_files`.
**Stop.** Proceed to STATE 4.

---

### STATE 4 — Write Hugo files (parallel)

**Step 4a — write `_index.md`** following SECTION STRUCTURE and CONTENT RULES below.

**CRITICAL:** The Synopsis section MUST use the **full verbatim** synopsis from `$doc`, not a simplified version.

**Step 4b — copy data files (parallel with 4a):**

<function=Bash> {"command": "cp /Users/coissac/Sync/travail/MOI/GO/obitools4/autodoc/examples/obi{xxx}/.fasta /Users/coissac/Sync/travail/MOI/GO/obitools4-doc/content/docs/commands//obi{xxx}/ 2>/dev/null || true"} <function=Bash> {"command": "cp /Users/coissac/Sync/travail/MOI/GO/obitools4/autodoc/examples/obi{xxx}/.fastq /Users/coissac/Sync/travail/MOI/GO/obitools4-doc/content/docs/commands//obi{xxx}/ 2>/dev/null || true"} <function=Bash> {"command": "cp /Users/coissac/Sync/travail/MOI/GO/obitools4/autodoc/examples/obi{xxx}/.csv /Users/coissac/Sync/travail/MOI/GO/obitools4-doc/content/docs/commands//obi{xxx}/ 2>/dev/null || true"} <function=Bash> {"command": "cp /Users/coissac/Sync/travail/MOI/GO/obitools4/autodoc/examples/obi{xxx}/.json /Users/coissac/Sync/travail/MOI/GO/obitools4-doc/content/docs/commands//obi{xxx}/ 2>/dev/null || true"} <function=Bash> {"command": "cp /Users/coissac/Sync/travail/MOI/GO/obitools4/autodoc/examples/obi{xxx}/*.yaml /Users/coissac/Sync/travail/MOI/GO/obitools4-doc/content/docs/commands//obi{xxx}/ 2>/dev/null || true"}


Skip compressed files (`.gz`).

**Step 4c — delete unused files from Hugo directory:**

<function=Bash> {"command": "ls -la /Users/coissac/Sync/travail/MOI/GO/obitools4-doc/content/docs/commands//obi{xxx}/"}

Compare with `$examples_list` and remove any files not present in the working examples.

---

## CONTENT RULES (apply throughout Phase 3 STATE 4)

### Workflow diagram

Place in the Description section, after introductory prose and before the first `{{< code >}}` block.

{{< mermaid class="workflow" >}} graph TD A@{ shape: doc, label: "input_file.fastq" } C[obi{xxx}] D@{ shape: doc, label: "output_file.fasta" } A --> C:::obitools C --> D classDef obitools fill:#99d57c {{< /mermaid >}}


Rules:
- One `@{ shape: doc, label: "FILENAME" }` node per input file; use filenames from the first example.
- One output node for the output file of the same example.
- Apply `:::obitools` on the **last arrow pointing to the command node**, not on the node definition line.
- `classDef obitools fill:#99d57c` must always be the last line inside the block.
- If the command produces no file output (prints to stdout only), use a terminal node `D([stdout])`.

### Description section

Narrative prose teaching the reader how the command works, one concept at a time. These are NOT the same examples as in the Examples section — simpler, focused on a single behaviour, chosen to clarify specific options or edge cases.

- Write flowing paragraphs, not bullet lists of options.
- Explain **why** a biologist would use the command and **what** it does.
- Introduce data files with `{{< code "FILENAME" FORMAT true >}}` before the first command that uses them.
- Show example commands and their output in **paired** fenced blocks — no `**Expected output:**` label:
  ````markdown
  ```bash
  obi{xxx} [options] input_file
<actual output lines>
- Replace tool name in prose with `{{< obi obi{xxx} >}}`.
- Replace format names in prose with `{{% fasta %}}`, `{{% fastq %}}`, etc.
- **Do NOT reuse** examples from the Examples section verbatim. Description examples are simpler and pedagogical.

### Synopsis section

Use the synopsis from `$doc` verbatim. Wrap in a `bash` fenced code block.

### Options section

- Only document options **not** covered by `{{< option-sets/... >}}` (see STATE 2 item 8).
- Group under `#### {{< obi obi{xxx} >}} specific options`, then `#### Taxonomic options` if applicable, then the three `{{< option-sets/... >}}`.

### Examples section

Practical, real-world recipes. Each example addresses a distinct use case not already shown in Description.

- **Never duplicate** an example from the Description section.
- Every example that produces sequence or annotation file output uses this pattern:
1. Short intro paragraph (24 sentences) with a Markdown hyperlink to the input file, e.g. `The file [input.fasta](input.fasta) contains …`.
2. `{{< code "input_file" FORMAT true >}}`
3. `bash` fenced block with `-o out_name.ext` (never `>` redirection for non-paired examples).
4. `{{< code "out_name.ext" FORMAT true >}}`
- **`--paired-with` examples:** use `--out <prefix>.fastq` (not `>`), show both output files:
````markdown
```bash
obi{xxx} --paired-with reverse.fastq --out out_paired.fastq forward.fastq
```
{{< code "out_paired_R1.fastq" fastq true >}}
{{< code "out_paired_R2.fastq" fastq true >}}

Both _R1 and _R2 files must be copied to the Hugo command directory (Step 4b).

  • CSV output: pipe through csvlook, no file redirection, no {{< code >}}:
    ```bash
    obi{xxx} [options] input_file | csvlook
    ```
    ```
    | col1 | col2 |
    | ---- | ---- |
    | val1 | val2 |
    ```
    
  • Last example always: ```bash\nobi{xxx} --help\n``` (no output block).
  • Never inline file content as raw fenced blocks — always use {{< code >}}.
  • Output files must be copied to the Hugo command directory alongside input files (Step 4b).