2026-04-07 08:36:50 +02:00
# Task
Convert `autodoc/cmd/obi{xxx}.md` and `autodoc/examples/obi{xxx}/` into a Hugo
documentation page at
`/Users/coissac/Sync/travail/__MOI__/GO/obitools4-doc/content/docs/commands/<category>/obi{xxx}/`.
The Hugo site uses the **Book** theme with custom shortcodes specific to OBITools4.
Every rule below is derived from reading existing pages — do not invent shortcodes or
patterns not listed here.
---
## TOOL CALL FORMAT — enforce before every call
A tool call is exactly:
<function=tool_name>
{"param": "value"}
</function>
Rules:
- `<` immediately followed by `f` — zero spaces.
- Parameters are a **single JSON object** — no XML wrapper tags.
- No outer `<tool_call>` or `<tool_use>` wrapper.
- Tool name lowercase with double underscores.
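
The layout rules above can be checked mechanically. The following is a minimal sketch, assuming each call is available as a plain string; `parse_tool_call` is a hypothetical helper, not part of any tool runtime, and it deliberately does not check the spelling of the tool name.

```python
import json
import re

# Hypothetical checker for the layout rules above: opening tag, one JSON
# object, closing tag, and nothing wrapped around them.
_CALL_RE = re.compile(r"\A<function=([^\s<>]+)>\n(.*)\n</function>\Z", re.DOTALL)


def parse_tool_call(text):
    """Return (tool_name, params) or raise ValueError if the layout is wrong."""
    match = _CALL_RE.match(text.strip())
    if match is None:
        raise ValueError("expected <function=NAME>, a JSON object, then </function>")
    name, body = match.group(1), match.group(2)
    try:
        params = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"parameters are not valid JSON: {exc}") from exc
    if not isinstance(params, dict):
        raise ValueError("parameters must be a single JSON object")
    return name, params
```

A call wrapped in `<tool_call>`, or one with a space after `<`, fails the first check and raises `ValueError`.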
---
## HUGO SHORTCODE REFERENCE
Use only the shortcodes listed below. Never invent others.
| Shortcode | Syntax | Effect |
|-----------|--------|--------|
| Command link | `{{< obi obi{xxx} >}}` | Renders command name as internal link |
| Format name | `{{% fasta %}}` `{{% fastq %}}` `{{% csv %}}` `{{% json %}}` `{{% yaml %}}` | Renders format name (use in prose) |
| Suite name | `{{% obitools4 %}}` | Renders "OBITools4" as styled text |
| Embed data file | `{{< code "FILENAME" FORMAT true >}}` | Embeds file content in page; FORMAT = `fasta`, `fastq`, `txt`, `csv`, `json`, `yaml` |
| Standard option set | `{{< option-sets/input >}}` | Renders the shared input-format options block |
| Standard option set | `{{< option-sets/output >}}` | Renders the shared output-format options block |
| Standard option set | `{{< option-sets/common >}}` | Renders the shared performance/logging options block |
| Standard option set | `{{< option-sets/selection >}}` | Renders the shared sequence-selection options block (obigrep only) |
| Single shared option | `{{< cmd-options/paired-with >}}` | Renders the `--paired-with` option description |
| Custom option block | `{{< cmd-option name="NAME" short="S" param="PARAM" >}}` text `{{< /cmd-option >}}` | Renders a command-specific option; `short` and `param` are optional |
| Workflow diagram | `{{< mermaid class="workflow" >}}` ... `{{< /mermaid >}}` | Renders a Mermaid flowchart showing command inputs and outputs |
---
## SECTION STRUCTURE OF A HUGO COMMAND PAGE
````markdown
--- ← YAML front matter (see below)
...
---
# `obi{xxx}`: <one-line description>
> [!WARNING] Preliminary AI-generated documentation
> This page was automatically generated by an AI assistant and has **not yet been
> reviewed or validated** by the {{% obitools4 %}} development team. It may contain
> inaccuracies or incomplete information. Use with caution and refer to the command's
> `--help` output for authoritative option descriptions.
## Description
<narrative prose — 2 to 5 paragraphs, uses {{< obi >}} and {{% format %}} shortcodes>
<workflow diagram — see WORKFLOW DIAGRAM rule below>
<data files shown with {{< code >}} shortcodes>
<commands and output shown in paired fenced code blocks>
## Synopsis
```bash
obi{xxx} [--option1] [--option2|-s PARAM] ... [<args>]
```
## Options
#### {{< obi obi{xxx} >}} specific options
- {{< cmd-option name="NAME" short="S" param="PARAM" >}}
Description of option.
{{< /cmd-option >}}
#### Taxonomic options ← include only if command uses taxonomy
- {{< cmd-options/taxonomy/taxonomy >}}
{{< option-sets/input >}}
{{< option-sets/output >}} ← omit if command has no output (e.g. obicount)
{{< option-sets/common >}}
## Examples
```bash
obi{xxx} --help
```
````
---
## YAML FRONT MATTER TEMPLATE
```yaml
---
archetype: "command"
title: "obi{xxx}"
date: <YYYY-MM-DD>
command: "obi{xxx}"
category: <category>
url: "/obitools/obi{xxx}"
weight: <weight>
---
```
Fields:
- `archetype`: always `"command"`.
- `title`: the command name, e.g. `"obigrep"`.
- `date`: today's date in `YYYY-MM-DD` format.
- `command`: same as title.
- `category`: the subdirectory name under `commands/` (see STATE 1).
- `url`: always `/obitools/obi{xxx}`.
- `weight`: copy the value from the **existing** `_index.md` if the page already exists;
otherwise use `50`.
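
As an illustration of the field rules above, here is a minimal sketch that renders the front matter block. `front_matter` is a hypothetical helper introduced for this example; the `today` parameter exists only so the output can be made reproducible.

```python
from datetime import date
from typing import Optional


def front_matter(command: str, category: str, weight: int = 50,
                 today: Optional[date] = None) -> str:
    """Render the YAML front matter for one command page.

    `weight` defaults to 50, the value prescribed when no page exists yet.
    """
    day = (today or date.today()).isoformat()  # YYYY-MM-DD
    lines = [
        "---",
        'archetype: "command"',
        f'title: "{command}"',
        f"date: {day}",
        f'command: "{command}"',
        f"category: {category}",
        f'url: "/obitools/{command}"',
        f"weight: {weight}",
        "---",
    ]
    return "\n".join(lines) + "\n"
```

When the page already exists, pass the weight read from its `_index.md` instead of relying on the default.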
---
## PIPELINE
Execute the four states below in order. Do not skip states. Do not merge states.
---
### STATE 1 — Read source material (parallel)
**Input:** nothing.
**Action:** emit all of the following calls in a single parallel message.
Call 1 — read the autodoc file:
```
<function=Read>
{"file_path": "/Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/cmd/obi{xxx}.md"}
</function>
```
Call 2 — list example files:
```
<function=Bash>
{"command": "ls /Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/examples/obi{xxx}/ 2>/dev/null || echo NO_EXAMPLES"}
</function>
```
Call 3 — read the existing Hugo page (may not exist yet):
```
<function=Read>
{"file_path": "/Users/coissac/Sync/travail/__MOI__/GO/obitools4-doc/content/docs/commands/basics/obi{xxx}/_index.md"}
</function>
```
Call 4 — list the Hugo commands directory to find the right category:
```
<function=Bash>
{"command": "find /Users/coissac/Sync/travail/__MOI__/GO/obitools4-doc/content/docs/commands -type d -name 'obi{xxx}'"}
</function>
```
**Output:** store results as `$doc`, `$examples_list`, `$existing_hugo`, `$category_path`.
**Stop.** Do not interpret. Proceed to STATE 2.
---
### STATE 2 — Determine category and plan content (no tool calls)
**Input:** `$doc`, `$examples_list`, `$existing_hugo`, `$category_path`.
1. **Category:**
- If `$category_path` found a directory, extract the category name from the path
(segment between `commands/` and `obi{xxx}`).
- If `$existing_hugo` exists and contains `category:`, use that value.
- Otherwise, default to `basics`.
2. **Weight:**
- If `$existing_hugo` contains `weight:`, reuse that value.
- Otherwise, use `50`.
3. **Example files to copy:**
- From `$examples_list`, keep only files that are referenced in the EXAMPLES section
of `$doc` **as input files** (not output files that start with `out_`).
- Identify the format of each file from its extension:
`.fasta` / `.fa` → `fasta`, `.fastq` / `.fq` → `fastq`,
`.txt` → `txt`, `.csv` → `csv`, `.gz` → skip (do not embed compressed files).
4. **Description section plan:**
- Extract the DESCRIPTION content from `$doc`.
- Identify every occurrence of `obi{xxx}` and plan to replace with `{{< obi obi{xxx} >}}`.
- Identify format names (`FASTA`, `FASTQ`, `JSON`, `CSV`) and plan to replace with
`{{% fasta %}}`, `{{% fastq %}}`, etc. in flowing prose (not in code blocks).
- Identify every input filename used in examples and plan to show with
`{{< code "FILENAME" FORMAT true >}}` just before the first command that uses it.
5. **Options section plan:**
- List options that are command-specific (not covered by standard option-sets).
- The standard option-sets cover:
- `{{< option-sets/input >}}`: all `--fasta`, `--fastq`, `--embl`, `--genbank`,
`--ecopcr`, `--csv`, `--input-OBI-header`, `--input-json-header`, `--u-to-t`,
`--solexa`, `--skip-empty`, `--no-order` flags.
- `{{< option-sets/output >}}`: all `--fasta-output`, `--fastq-output`,
`--json-output`, `--output-OBI-header`, `--output-json-header`, `--out`/`-o`,
`--compress`/`-Z` flags.
- `{{< option-sets/common >}}`: all `--max-cpu`, `--batch-size`, `--batch-size-max`,
`--batch-mem`, `--no-progressbar`, `--debug`, `--verbose`, `--silent-warning`,
`--pprof`, `--pprof-goroutine`, `--pprof-mutex`, `--version`, `--help` flags.
- Do NOT re-document options already covered by a standard option-set.
- `--paired-with` → use `{{< cmd-options/paired-with >}}`.
- Taxonomy options (`--taxonomy`, `--restrict-to-taxon`, `--ignore-taxon`, etc.)
→ grouped under `#### Taxonomic options` with
`{{< cmd-options/taxonomy/taxonomy >}}` for `--taxonomy`;
document the rest with `{{< cmd-option >}}` blocks.
- All remaining command-specific options → `{{< cmd-option >}}` blocks.
6. **Examples section plan:**
- Keep only examples whose input files exist in `$examples_list`
(skip examples requiring external resources like taxonomy databases or URLs).
- For each kept example, identify the corresponding output file in `$examples_list`
(typically `out_<name>.fasta`, `out_<name>.fastq`, etc.).
- Plan to read every identified output file in STATE 3.
- Always add a final example: `` ```bash\nobi{xxx} --help\n``` ``
**Output:** store the plan as `$plan`.
**Stop.** Proceed to STATE 3.
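
The category and file-format decisions from steps 1 and 3 can be sketched as follows. This is an illustration only: the function names are hypothetical, the three category rules are applied in the order they are listed, and returning `None` for an unknown extension is an assumption, since the rules above do not cover that case.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extension-to-format mapping taken from step 3.
FORMAT_BY_EXTENSION = {
    ".fasta": "fasta", ".fa": "fasta",
    ".fastq": "fastq", ".fq": "fastq",
    ".txt": "txt", ".csv": "csv",
}


def choose_category(found_dir: str, existing_category: Optional[str],
                    command: str) -> str:
    """Apply the category rules in order: found path, front matter, default."""
    parts = PurePosixPath(found_dir.strip()).parts
    if "commands" in parts and parts[-1] == command:
        between = parts[parts.index("commands") + 1:-1]
        if between:
            return "/".join(between)
    if existing_category:
        return existing_category
    return "basics"


def is_input_example(filename: str) -> bool:
    """Output files are recognised by their `out_` prefix."""
    return not PurePosixPath(filename).name.startswith("out_")


def embed_format(filename: str) -> Optional[str]:
    """Return the FORMAT argument for the code shortcode, or None to skip."""
    if filename.endswith(".gz"):
        return None  # compressed files are never embedded
    return FORMAT_BY_EXTENSION.get(PurePosixPath(filename).suffix)
```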
---
### STATE 3 — Read example input and output files (parallel)
**Input:** `$plan` (list of input files to embed and output files to show in examples).
**Action:** emit one Read call per file to be used in the Hugo page — both input files
(to embed with `{{< code >}}`) and output files (to show as example results).
```
<function=Read>
{"file_path": "/Users/coissac/Sync/travail/__MOI__/GO/obitools4/autodoc/examples/obi{xxx}/FILENAME"}
</function>
```
Do **not** read compressed files (`.gz`).
**Output:** store input file contents as `$input_files` and output file contents as `$output_files`.
**Stop.** Proceed to STATE 4.
---
### STATE 4 — Write Hugo files (parallel)
**Input:** `$doc`, `$plan`, `$input_files`, `$output_files`.
**Step 4a — write the Hugo `_index.md`:**
Compose the Hugo page following the SECTION STRUCTURE and YAML FRONT MATTER templates
above, applying the plan from STATE 2. Then emit:
```
<function=Write>
{"file_path": "/Users/coissac/Sync/travail/__MOI__/GO/obitools4-doc/content/docs/commands/<category>/obi{xxx}/_index.md",
"content": "..."}
</function>
```
**Step 4b — copy data files (parallel with 4a):**
For each file in `$input_files` and `$output_files`, emit a Write call to place the
file in the Hugo command directory:
```
<function=Write>
{"file_path": "/Users/coissac/Sync/travail/__MOI__/GO/obitools4-doc/content/docs/commands/<category>/obi{xxx}/FILENAME",
"content": "<file content verbatim>"}
</function>
```
Emit the `_index.md` write and all data file writes in a **single parallel message**.
**Stop.** Do not emit any text after the Write calls.
---
## CONTENT RULES (apply throughout STATE 4)
### Workflow diagram
Place the diagram in the Description section, after the introductory prose and before
the first `{{< code >}}` block. It represents the main use case of the command.
```
{{< mermaid class="workflow" >}}
graph TD
A@{ shape: doc, label: "input_file.fastq" }
C[obi{xxx}]
D@{ shape: doc, label: "output_file.fasta" }
A --> C:::obitools
C --> D
classDef obitools fill:#99d57c
{{< /mermaid >}}
```
Rules:
- One `@{ shape: doc, label: "FILENAME" }` node per input file; use the actual
filenames from the main example (first example in the Examples section).
- One `@{ shape: doc, label: "FILENAME" }` node for the output file, using the
actual output filename from the same example.
- The command node uses `[obi{xxx}]` (square brackets = rectangle).
- Apply `:::obitools` on the **last arrow pointing to the command node**, not on
the node definition line itself.
- `classDef obitools fill:#99d57c` must always be the last line inside the block.
- If the command produces no file output (e.g. prints to stdout only), use a
terminal node `D([stdout])` instead of a doc node.
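
The diagram rules above can be expressed as a small generator. This is a sketch under stated assumptions: `workflow_diagram` is a hypothetical helper, and the input node identifiers `IN1`, `IN2`, ... are arbitrary (the template above uses `A`); any valid Mermaid identifier would do.

```python
from typing import List, Optional


def workflow_diagram(command: str, inputs: List[str],
                     output: Optional[str] = None) -> str:
    """Build the workflow shortcode block for one command.

    `output=None` means the command only prints to stdout.
    """
    if not inputs:
        raise ValueError("at least one input file is required")
    ids = [f"IN{i}" for i in range(1, len(inputs) + 1)]
    lines = ['{{< mermaid class="workflow" >}}', "graph TD"]
    for node_id, name in zip(ids, inputs):
        lines.append(f'  {node_id}@{{ shape: doc, label: "{name}" }}')
    lines.append(f"  C[{command}]")
    if output is None:
        lines.append("  D([stdout])")
    else:
        lines.append(f'  D@{{ shape: doc, label: "{output}" }}')
    for node_id in ids[:-1]:
        lines.append(f"  {node_id} --> C")
    # The class goes on the last arrow pointing to the command node.
    lines.append(f"  {ids[-1]} --> C:::obitools")
    lines.append("  C --> D")
    # classDef must be the last line inside the block.
    lines.append("  classDef obitools fill:#99d57c")
    lines.append("{{< /mermaid >}}")
    return "\n".join(lines)
```

For a paired-end example, passing two input filenames yields two document nodes, with `:::obitools` attached only to the second arrow.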
### Description section
The Description section serves an **explanatory** purpose: it teaches the reader how the
command works by walking through a few illustrative cases, one option or concept at a time.
These are NOT the same examples as in the Examples section — they are simpler, focused on
a single behaviour, and chosen to clarify specific options or edge cases.
- Write narrative prose, not a bullet list of options.
- Explain **why** a biologist would use the command and **what** it does to sequences.
- Introduce data files with `{{< code "FILENAME" FORMAT true >}}` before the first
command that uses them.
- Show example commands and their output in **paired** fenced blocks:
````markdown
```bash
obi{xxx} [options] input_file
```
```
<actual output lines>
```
````
- Do **not** use `**Expected output:**` labels — output goes directly in the second code block.
- Replace tool name occurrences in prose with `{{< obi obi{xxx} >}}`.
- Replace format names in prose with `{{% fasta %}}`, `{{% fastq %}}`, etc.
- **Do NOT reuse** examples from the Examples section verbatim. The Description examples
are simpler, pedagogical, focused on one concept; the Examples section examples are
richer, more realistic cookbook recipes.
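
The two replacement rules above (tool name and format names, prose only) can be sketched as a text transformation. `apply_shortcodes` is a hypothetical helper; it assumes fences are not nested, leaves inline code spans untouched, and is not idempotent, so it should run once on text that contains no shortcodes yet.

```python
import re

# Upper-case format names as they appear in prose.
FORMAT_RE = re.compile(r"\b(FASTA|FASTQ|CSV|JSON|YAML)\b")


def apply_shortcodes(markdown: str, command: str) -> str:
    """Replace format names and the command name outside code."""
    command_re = re.compile(rf"\b{re.escape(command)}\b")
    out, in_fence = [], False
    for line in markdown.split("\n"):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # simple toggle, no nested fences
            out.append(line)
            continue
        if in_fence:
            out.append(line)
            continue
        # Odd-indexed pieces are inline code spans and are left as they are.
        pieces = re.split(r"(`[^`]*`)", line)
        for i in range(0, len(pieces), 2):
            text = FORMAT_RE.sub(
                lambda m: "{{% " + m.group(1).lower() + " %}}", pieces[i])
            pieces[i] = command_re.sub("{{< obi " + command + " >}}", text)
        out.append("".join(pieces))
    return "\n".join(out)
```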
### Synopsis section
- Use the synopsis from `$doc` verbatim, or reconstruct from the OPTIONS section if
the synopsis in `$doc` is incomplete.
- Wrap in a `bash` fenced code block.
### Options section
- Only document options **not** covered by `{{< option-sets/... >}}`.
- Use `{{< cmd-option >}}` blocks for each command-specific option.
- Group under `#### {{< obi obi{xxx} >}} specific options`, then
`#### Taxonomic options` if applicable, then the three `{{< option-sets/... >}}`.
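
To decide where an option belongs, the flag lists from STATE 2 can be turned into a lookup. The sketch below is illustrative; `classify_option` is a hypothetical helper, and the returned strings are just the shortcode paths used in this document.

```python
# Flag lists copied from the option-set coverage described in STATE 2.
INPUT_FLAGS = {
    "--fasta", "--fastq", "--embl", "--genbank", "--ecopcr", "--csv",
    "--input-OBI-header", "--input-json-header", "--u-to-t", "--solexa",
    "--skip-empty", "--no-order",
}
OUTPUT_FLAGS = {
    "--fasta-output", "--fastq-output", "--json-output",
    "--output-OBI-header", "--output-json-header",
    "--out", "-o", "--compress", "-Z",
}
COMMON_FLAGS = {
    "--max-cpu", "--batch-size", "--batch-size-max", "--batch-mem",
    "--no-progressbar", "--debug", "--verbose", "--silent-warning",
    "--pprof", "--pprof-goroutine", "--pprof-mutex", "--version", "--help",
}


def classify_option(flag: str) -> str:
    """Return the shortcode that documents `flag`."""
    if flag in INPUT_FLAGS:
        return "option-sets/input"
    if flag in OUTPUT_FLAGS:
        return "option-sets/output"
    if flag in COMMON_FLAGS:
        return "option-sets/common"
    if flag == "--paired-with":
        return "cmd-options/paired-with"
    if flag == "--taxonomy":
        return "cmd-options/taxonomy/taxonomy"
    # Everything else needs its own cmd-option block.
    return "cmd-option"
```

Only flags that map to `cmd-option` are written out by hand; the others are rendered by the shared shortcodes.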
### Examples section
The Examples section serves a **cookbook** purpose: it shows practical, real-world
recipes that a biologist might want to run directly. Each example should address a
distinct use case that goes beyond the introductory illustrations already shown in the
Description section. Examples here may combine multiple options, use realistic file
names, and demonstrate more complex pipelines.
**Never duplicate** an example that already appears in the Description section — choose
different scenarios, different option combinations, or more complete workflows.
- When a command produces **CSV output**, pipe it through `csvlook` for readable
display. Do not redirect to a file in that case — show the result inline:
````markdown
```bash
obi{xxx} [options] input_file | csvlook
```
```
| col1 | col2 |
| ---- | ---- |
| val1 | val2 |
```
````
The output block contains the verbatim `csvlook` rendering (table with `|` borders).
No `{{< code >}}` shortcode is needed since there is no output file to download.
- For **`--paired-with`** examples: the command uses `--out <prefix>.fastq` (never `>`
redirection) and produces two files `<prefix>_R1.fastq` and `<prefix>_R2.fastq`.
Show both output files with two consecutive `{{< code >}}` shortcodes:
````markdown
```bash
obi{xxx} --paired-with reverse.fastq --out out_paired.fastq forward.fastq
```
{{< code "out_paired_R1.fastq" fastq true >}}
{{< code "out_paired_R2.fastq" fastq true >}}
````
Both `_R1` and `_R2` files must be copied to the Hugo command directory (Step 4b).
- Every example that produces **sequence or annotation file output** (non-paired) uses the following pattern:
1. A short introductory paragraph (2 to 4 sentences) that explains the biological
motivation for the example and includes a Markdown hyperlink to the input file,
e.g. `The file [input.fastq](input.fastq) contains …`.
2. `{{< code "input_file" FORMAT true >}}` — shows the input file content and makes
it downloadable.
3. A `bash` fenced block with the command writing to an output file
(use `-o out_name.fasta`, never `>` redirection for non-paired examples).
4. Immediately after: `{{< code "out_name.fasta" FORMAT true >}}` — so the result is
rendered AND downloadable.
- The output files must be copied to the Hugo command directory alongside input files
(Step 4b), so the shortcode can find them.
- If no output file exists for an example, omit the `{{< code >}}` line entirely.
- Last example always: `obi{xxx} --help` (no output).
- Never inline file content as raw fenced blocks — always use `{{< code >}}`.
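
For the `--paired-with` rule above, the two output filenames and their shortcodes follow directly from the `--out` argument. A minimal sketch, with hypothetical helper names, assuming the `_R1`/`_R2` suffix is inserted before the extension exactly as in the `out_paired.fastq` example:

```python
from pathlib import PurePosixPath
from typing import List, Tuple


def paired_output_names(out_arg: str) -> Tuple[str, str]:
    """Derive the two files produced from a single --out argument."""
    path = PurePosixPath(out_arg)
    stem, suffix = path.stem, path.suffix
    return (
        str(path.with_name(f"{stem}_R1{suffix}")),
        str(path.with_name(f"{stem}_R2{suffix}")),
    )


def paired_code_shortcodes(out_arg: str, fmt: str = "fastq") -> List[str]:
    """Return the two consecutive code shortcodes for a paired example."""
    return [
        f'{{{{< code "{name}" {fmt} true >}}}}'
        for name in paired_output_names(out_arg)
    ]
```

Both names returned by `paired_output_names` are also the files to copy in Step 4b.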