[4.4.2] Enhanced taxonomy handling, input robustness & PCR tag validation

- **obiconvert**: Added `--raw-taxid` mode to output numeric taxIDs without formatting (e.g., "12345" instead of ":tax:NCBI_0987@species"). Introduced `TaxNode.FullString()` to reliably return full formatted strings regardless of global settings, and improved fallback behavior when taxonomy DB is unavailable. - **ngsfilter**: Input fields (primers, sample tags/IDs) are now automatically trimmed of leading/trailing whitespace to prevent parsing failures from inconsistent formatting. - **obitools (pcrtag)**: Mismatch-related fields (`forward_mismatches`, `reverse_mishaps`) renamed to "error" for consistency across annotation dictionaries. - **obipairing & obtagpcr**: Enforced mandatory paired-end file input (`--forward` and `reverse`) in obipairing; added CLI support for generating config templates via AskConfigTemplate(); removed redundant `Required()` constraints and introduced helper function CLIHasPairedFiles().
feat(obiconvert): add --raw-taxid option and refactor taxID formatting
2026-05-01 12:30:39 +00:00 · 2026-04-30 16:57:45 +02:00 · 2026-04-30 16:57:38 +02:00 · 2026-04-30 08:14:39 +02:00 · 2026-04-29 15:29:25 +02:00 · 2026-04-29 15:01:37 +02:00
69 changed files with 2759 additions and 369 deletions
@@ -10,10 +10,10 @@ jobs:
    runs-on: ubuntu-latest
    steps:
    - name: Setup Go
-      uses: actions/setup-go@v2
+      uses: actions/setup-go@v5
      with:
        go-version: '1.23'
    - name: Checkout obitools4 project
-      uses: actions/checkout@v4
+      uses: actions/checkout@v5
    - name: Run tests
      run: make githubtests
@@ -16,9 +16,9 @@ jobs:
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
-          go-version: "1.23"
+          go-version: "1.26"
      - name: Checkout obitools4 project
-        uses: actions/checkout@v4
+        uses: actions/checkout@v5
      - name: Run tests
        run: make githubtests

@@ -49,12 +49,12 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v5

      - name: Setup Go
        uses: actions/setup-go@v5
        with:
-          go-version: "1.23"
+          go-version: "1.26"

      - name: Extract version from tag
        id: get_version
@@ -69,7 +69,23 @@ jobs:
          xcode-select --install 2>/dev/null || true
          xcode-select -p

-      - name: Build binaries
+      - name: Build binaries (Linux)
+        if: runner.os == 'Linux'
+        env:
+          VERSION: ${{ steps.get_version.outputs.version }}
+        run: |
+          docker run --rm \
+            -v "$(pwd):/src" \
+            -w /src \
+            -e VERSION="${VERSION}" \
+            golang:1.26-alpine \
+            sh -c "apk add --no-cache gcc musl-dev zlib-dev zlib-static make && \
+                   make LDFLAGS='-linkmode=external -extldflags=-static' obitools"
+          mkdir -p artifacts
+          tar -czf artifacts/obitools4_${VERSION}_${{ matrix.output_name }}.tar.gz -C build .
+
+      - name: Build binaries (macOS)
+        if: runner.os == 'macOS'
        env:
          GOOS: ${{ matrix.goos }}
          GOARCH: ${{ matrix.goarch }}
@@ -77,7 +93,6 @@ jobs:
        run: |
          make obitools
          mkdir -p artifacts
-          # Create a single tar.gz with all binaries for this platform
          tar -czf artifacts/obitools4_${VERSION}_${{ matrix.output_name }}.tar.gz -C build .

      - name: Upload artifacts
@@ -92,7 +107,7 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v5
        with:
          fetch-depth: 0

@@ -23,7 +23,7 @@ xx
 /.vscode
 /build
 /bugs
-
+autodoc
 /ncbitaxo

 !/obitests/**
@@ -2,9 +2,17 @@
 #export GOBIN=$(GOPATH)/bin
 #export PATH=$(GOBIN):$(shell echo $${PATH})

+.DEFAULT_GOAL := all
+
+GREEN  := \033[0;32m
+YELLOW := \033[0;33m
+BLUE   := \033[0;34m
+NC     := \033[0m
+
 GOFLAGS=
+LDFLAGS=
 GOCMD=go
-GOBUILD=$(GOCMD) build $(GOFLAGS)
+GOBUILD=$(GOCMD) build $(GOFLAGS) $(if $(LDFLAGS),-ldflags="$(LDFLAGS)")
 GOGENERATE=$(GOCMD) generate
 GOCLEAN=$(GOCMD) clean
 GOTEST=$(GOCMD) test
@@ -43,7 +51,7 @@ $(OBITOOLS_PREFIX)$(notdir $(1)): $(BUILD_DIR) $(1) pkg/obioptions/version.go
 	@echo -n - Building obitool $(notdir $(1))...
 	@$(GOBUILD)  -o $(BUILD_DIR)/$(OBITOOLS_PREFIX)$(notdir $(1)) ./$(1) \
 	             2> $(OBITOOLS_PREFIX)$(notdir $(1)).log \
-				 || cat $(OBITOOLS_PREFIX)$(notdir $(1)).log
+				 || { cat $(OBITOOLS_PREFIX)$(notdir $(1)).log; rm -f $(OBITOOLS_PREFIX)$(notdir $(1)).log; exit 1; }
 	@rm -f $(OBITOOLS_PREFIX)$(notdir $(1)).log
 	@echo Done.
 endef
@@ -60,6 +68,28 @@ endif

 OUTPUT:=$(shell mktemp)

+help:
+	@printf "$(GREEN)OBITools4 Makefile$(NC)\n\n"
+	@printf "$(BLUE)Main targets:$(NC)\n"
+	@printf "  %-20s %s\n" "all"          "Build all obitools (default)"
+	@printf "  %-20s %s\n" "obitools"     "Build all obitools binaries to build/"
+	@printf "  %-20s %s\n" "test"         "Run Go unit tests"
+	@printf "  %-20s %s\n" "obitests"     "Run integration tests (obitests/)"
+	@printf "  %-20s %s\n" "bump-version" "Increment patch version (or set with VERSION=x.y.z)"
+	@printf "  %-20s %s\n" "update-deps"  "Update all Go dependencies"
+	@printf "\n$(BLUE)Jujutsu workflow:$(NC)\n"
+	@printf "  %-20s %s\n" "jjnew"        "Document current commit and start a new one"
+	@printf "  %-20s %s\n" "jjpush"       "Release: describe, bump, generate notes, push PR, tag (VERSION=x.y.z optional)"
+	@printf "  %-20s %s\n" "jjfetch"      "Fetch latest commits from origin"
+	@printf "\n$(BLUE)Required tools:$(NC)\n"
+	@printf "  %-20s " "go";      command -v go      >/dev/null 2>&1 && printf "$(GREEN)✓$(NC) %s\n" "$$(go version)" || printf "$(YELLOW)✗ not found$(NC)\n"
+	@printf "  %-20s " "git";     command -v git     >/dev/null 2>&1 && printf "$(GREEN)✓$(NC) %s\n" "$$(git --version)" || printf "$(YELLOW)✗ not found$(NC)\n"
+	@printf "  %-20s " "jj";      command -v jj      >/dev/null 2>&1 && printf "$(GREEN)✓$(NC) %s\n" "$$(jj --version)" || printf "$(YELLOW)✗ not found$(NC)\n"
+	@printf "  %-20s " "gh";      command -v gh      >/dev/null 2>&1 && printf "$(GREEN)✓$(NC) %s\n" "$$(gh --version | head -1)" || printf "$(YELLOW)✗ not found$(NC)  (brew install gh)\n"
+	@printf "\n$(BLUE)Optional tools (release notes generation):$(NC)\n"
+	@printf "  %-20s " "aichat";  command -v aichat  >/dev/null 2>&1 && printf "$(GREEN)✓$(NC) %s\n" "$$(aichat --version)" || printf "$(YELLOW)✗ not found$(NC)  (https://github.com/sigoden/aichat)\n"
+	@printf "  %-20s " "jq";      command -v jq      >/dev/null 2>&1 && printf "$(GREEN)✓$(NC) %s\n" "$$(jq --version)" || printf "$(YELLOW)✗ not found$(NC)  (brew install jq)\n"
+
 all: install-githook obitools

 obitools: $(patsubst %,$(OBITOOLS_PREFIX)%,$(OBITOOLS))
@@ -106,15 +136,20 @@ pkg/obioptions/version.go: version.txt .FORCE
 	@rm -f $(OUTPUT)

 bump-version:
-	@echo "Incrementing version..."
 	@current=$$(cat version.txt); \
-	echo "  Current version: $$current"; \
-	major=$$(echo $$current | cut -d. -f1); \
-	minor=$$(echo $$current | cut -d. -f2); \
-	patch=$$(echo $$current | cut -d. -f3); \
-	new_patch=$$((patch + 1)); \
-	new_version="$$major.$$minor.$$new_patch"; \
-	echo "  New version: $$new_version"; \
+	if [ -n "$(VERSION)" ]; then \
+		new_version="$(VERSION)"; \
+		echo "Setting version to $$new_version (was $$current)"; \
+	else \
+		echo "Incrementing version..."; \
+		echo "  Current version: $$current"; \
+		major=$$(echo $$current | cut -d. -f1); \
+		minor=$$(echo $$current | cut -d. -f2); \
+		patch=$$(echo $$current | cut -d. -f3); \
+		new_patch=$$((patch + 1)); \
+		new_version="$$major.$$minor.$$new_patch"; \
+		echo "  New version: $$new_version"; \
+	fi; \
 	echo "$$new_version" > version.txt
 	@echo "✓ Version updated in version.txt"
 	@$(MAKE) pkg/obioptions/version.go
@@ -130,6 +165,7 @@ jjnew:
 jjpush:
 	@$(MAKE) jjpush-describe
 	@$(MAKE) jjpush-bump
+	@$(MAKE) jjpush-notes
 	@$(MAKE) jjpush-push
 	@$(MAKE) jjpush-tag
 	@echo "$(GREEN)✓ Release complete$(NC)"
@@ -142,44 +178,62 @@ jjpush-bump:
 	@echo "$(BLUE)→ Creating new commit for version bump...$(NC)"
 	@jj new
 	@$(MAKE) bump-version
-	@echo "$(BLUE)→ Documenting version bump commit...$(NC)"
-	@jj auto-describe

-jjpush-push:
-	@echo "$(BLUE)→ Pushing commits...$(NC)"
-	@jj git push --change @
-
-jjpush-tag:
+jjpush-notes:
 	@version=$$(cat version.txt); \
-	tag_name="Release_$$version"; \
-	echo "$(BLUE)→ Generating release notes for $$tag_name...$(NC)"; \
-	release_message="Release $$version"; \
-	if command -v orla >/dev/null 2>&1 && command -v jq >/dev/null 2>&1; then \
-		previous_tag=$$(git describe --tags --abbrev=0 --match 'Release_*' HEAD^ 2>/dev/null); \
+	echo "$(BLUE)→ Generating release notes for version $$version...$(NC)"; \
+	release_title="Release $$version"; \
+	release_body=""; \
+	if command -v aichat >/dev/null 2>&1; then \
+		previous_tag=$$(git describe --tags --abbrev=0 --match 'Release_*' 2>/dev/null); \
 		if [ -z "$$previous_tag" ]; then \
 			echo "$(YELLOW)⚠ No previous Release tag found, skipping release notes$(NC)"; \
 		else \
 			raw_output=$$(git log --format="%h %B" "$$previous_tag..HEAD" | \
-				ORLA_MAX_TOOL_CALLS=50 orla agent -m ollama:qwen3-coder-next:latest \
+				aichat \
 				"Summarize the following commits into a GitHub release note for version $$version. Ignore commits related to version bumps, .gitignore changes, or any internal housekeeping that is irrelevant to end users. Describe each user-facing change precisely without exposing code. Eliminate redundancy. Output strictly valid JSON with no surrounding text, using this exact schema: {\"title\": \"<short release title>\", \"body\": \"<detailed markdown release notes>\"}" 2>/dev/null) || true; \
 			if [ -n "$$raw_output" ]; then \
-				sanitized=$$(echo "$$raw_output" | sed -n '/^{/,/^}/p' | tr -d '\000-\011\013-\014\016-\037'); \
-				release_title=$$(echo "$$sanitized" | jq -r '.title // empty' 2>/dev/null) ; \
-				release_body=$$(echo "$$sanitized" | jq -r '.body // empty' 2>/dev/null) ; \
-				if [ -n "$$release_title" ] && [ -n "$$release_body" ]; then \
-					release_message="$$release_title"$$'\n\n'"$$release_body"; \
+				notes=$$(printf '%s\n' "$$raw_output" | python3 tools/json2md.py 2>/dev/null); \
+				if [ -n "$$notes" ]; then \
+					release_title=$$(echo "$$notes" | head -1); \
+					release_body=$$(echo "$$notes" | tail -n +3); \
 				else \
 					echo "$(YELLOW)⚠ JSON parsing failed, using default release message$(NC)"; \
 				fi; \
 			fi; \
 		fi; \
 	fi; \
+	printf '%s' "$$release_title" > /tmp/obitools4-release-title.txt; \
+	printf '%s' "$$release_body" > /tmp/obitools4-release-body.txt; \
+	echo "$(BLUE)→ Setting release notes as commit description...$(NC)"; \
+	jj desc -m "$$release_title"$$'\n\n'"$$release_body"
+
+jjpush-push:
+	@echo "$(BLUE)→ Pushing commits...$(NC)"
+	@jj git push --change @
+	@echo "$(BLUE)→ Creating/updating PR...$(NC)"
+	@release_title=$$(cat /tmp/obitools4-release-title.txt 2>/dev/null || echo "Release $$(cat version.txt)"); \
+	release_body=$$(cat /tmp/obitools4-release-body.txt 2>/dev/null || echo ""); \
+	branch=$$(jj log -r @ --no-graph -T 'bookmarks.map(|b| b.name()).join("\n")' 2>/dev/null | head -1); \
+	if [ -n "$$branch" ] && command -v gh >/dev/null 2>&1; then \
+		gh pr create --title "$$release_title" --body "$$release_body" --base master --head "$$branch" 2>/dev/null \
+		|| gh pr edit "$$branch" --title "$$release_title" --body "$$release_body" 2>/dev/null \
+		|| echo "$(YELLOW)⚠ Could not create/update PR$(NC)"; \
+	fi
+
+jjpush-tag:
+	@version=$$(cat version.txt); \
+	tag_name="Release_$$version"; \
+	release_title=$$(cat /tmp/obitools4-release-title.txt 2>/dev/null || echo "Release $$version"); \
+	release_body=$$(cat /tmp/obitools4-release-body.txt 2>/dev/null || echo ""); \
 	install_section=$$'\n## Installation\n\n### Pre-built binaries\n\nDownload the appropriate archive for your system from the\n[release assets](https://github.com/metabarcoding/obitools4/releases/tag/Release_'"$$version"')\nand extract it:\n\n#### Linux (AMD64)\n```bash\ntar -xzf obitools4_'"$$version"'_linux_amd64.tar.gz\n```\n\n#### Linux (ARM64)\n```bash\ntar -xzf obitools4_'"$$version"'_linux_arm64.tar.gz\n```\n\n#### macOS (Intel)\n```bash\ntar -xzf obitools4_'"$$version"'_darwin_amd64.tar.gz\n```\n\n#### macOS (Apple Silicon)\n```bash\ntar -xzf obitools4_'"$$version"'_darwin_arm64.tar.gz\n```\n\nAll OBITools4 binaries are included in each archive.\n\n### From source\n\nYou can also compile and install OBITools4 directly from source using the\ninstallation script:\n\n```bash\ncurl -L https://raw.githubusercontent.com/metabarcoding/obitools4/master/install_obitools.sh | bash -s -- --version '"$$version"'\n```\n\nBy default binaries are installed in `/usr/local/bin`. Use `--install-dir` to\nchange the destination and `--obitools-prefix` to add a prefix to command names:\n\n```bash\ncurl -L https://raw.githubusercontent.com/metabarcoding/obitools4/master/install_obitools.sh | \\\n  bash -s -- --version '"$$version"' --install-dir ~/local --obitools-prefix k\n```\n'; \
-	release_message="$$release_message$$install_section"; \
+	release_message="$$release_title"$$'\n\n'"$$release_body$$install_section"; \
 	echo "$(BLUE)→ Creating tag $$tag_name...$(NC)"; \
-	git tag -a "$$tag_name" -m "$$release_message" 2>/dev/null || echo "$(YELLOW)⚠ Tag $$tag_name already exists$(NC)"; \
+	commit_hash=$$(jj log -r @ --no-graph -T 'commit_id' 2>/dev/null); \
+	git tag -a "$$tag_name" $${commit_hash:+"$$commit_hash"} -m "$$release_message" 2>/dev/null || echo "$(YELLOW)⚠ Tag $$tag_name already exists$(NC)"; \
 	echo "$(BLUE)→ Pushing tag $$tag_name...$(NC)"; \
-	git push origin "$$tag_name" 2>/dev/null || echo "$(YELLOW)⚠ Tag push failed or already pushed$(NC)"
+	git push origin "$$tag_name" 2>/dev/null || echo "$(YELLOW)⚠ Tag push failed or already pushed$(NC)"; \
+	rm -f /tmp/obitools4-release-title.txt /tmp/obitools4-release-body.txt

 jjfetch:
 	@echo "$(YELLOW)→ Pulling latest commits...$(NC)"
@@ -187,5 +241,5 @@ jjfetch:
 	@jj new master@origin
 	@echo "$(GREEN)✓ Latest commits pulled$(NC)"

-.PHONY: all obitools update-deps obitests githubtests jjnew jjpush jjpush-describe jjpush-bump jjpush-push jjpush-tag jjfetch bump-version .FORCE
+.PHONY: all obitools update-deps obitests githubtests help jjnew jjpush jjpush-describe jjpush-bump jjpush-notes jjpush-push jjpush-tag jjfetch bump-version .FORCE
 .FORCE:
@@ -32,8 +32,12 @@ The installation script offers several options:
 > 
 >  -p, --obitools-prefix   Prefix added to the obitools command names if you
 >                          want to have several versions of obitools at the
->                          same time on your system (as example `-p g` will produce 
+>                          same time on your system (as example `-p g` will produce
 >                          `gobigrep` command instead of `obigrep`).
+>
+>  -j, --jobs              Number of parallel jobs used for compilation
+>                          (default: 1). Increase this value to speed up
+>                          compilation on multi-core systems (e.g., `-j 4`).

 ### Examples

@@ -37,6 +37,11 @@ func main() {

 	optionParser(os.Args)

+	if !obipairing.CLIHasPairedFiles() {
+		log.Error("You must provide both a forward file (-F) and a reverse file (-R)")
+		os.Exit(1)
+	}
+
 	obidefault.SetStrictReadWorker(2)
 	obidefault.SetStrictWriteWorker(2)
 	pairs, err := obipairing.CLIPairedSequence()
@@ -1,6 +1,7 @@
 package main

 import (
+	"fmt"
 	"os"

 	log "github.com/sirupsen/logrus"
@@ -8,6 +9,7 @@ import (
 	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obidefault"
 	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obioptions"
 	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obitools/obiconvert"
+	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obitools/obimultiplex"
 	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obitools/obipairing"
 	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obitools/obitagpcr"
 	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiutils"
@@ -39,6 +41,17 @@ func main() {
 		obitagpcr.OptionSet)

 	optionParser(os.Args)
+
+	if obimultiplex.CLIAskConfigTemplate() {
+		fmt.Print(obimultiplex.CLIConfigTemplate())
+		os.Exit(0)
+	}
+
+	if !obipairing.CLIHasPairedFiles() {
+		log.Error("You must provide both a forward file (-F) and a reverse file (-R)")
+		os.Exit(1)
+	}
+
 	pairs, err := obipairing.CLIPairedSequence()

 	if err != nil {
@@ -0,0 +1,10 @@
+{
+  "people": [
+    "Software",
+    "Agreement",
+    "Module"
+  ],
+  "projects": [
+    "Code"
+  ]
+}
@@ -1,35 +1,33 @@
 module git.metabarcoding.org/obitools/obitools4/obitools4

-go 1.23.4
-
-toolchain go1.24.2
+go 1.26.1

 require (
-	github.com/DavidGamba/go-getoptions v0.28.0
-	github.com/PaesslerAG/gval v1.2.2
+	github.com/DavidGamba/go-getoptions v0.33.0
+	github.com/PaesslerAG/gval v1.2.4
 	github.com/barkimedes/go-deepcopy v0.0.0-20220514131651-17c30cfc62df
-	github.com/buger/jsonparser v1.1.1
+	github.com/buger/jsonparser v1.1.2
 	github.com/chen3feng/stl4go v0.1.1
-	github.com/dlclark/regexp2 v1.11.4
-	github.com/goccy/go-json v0.10.3
+	github.com/dlclark/regexp2 v1.11.5
+	github.com/goccy/go-json v0.10.6
 	github.com/klauspost/pgzip v1.2.6
 	github.com/pbnjay/memory v0.0.0-20210728143218-7b4eea64cf58
 	github.com/pelletier/go-toml/v2 v2.2.4
 	github.com/rrethy/ahocorasick v1.0.0
-	github.com/schollz/progressbar/v3 v3.13.1
-	github.com/sirupsen/logrus v1.9.3
-	github.com/stretchr/testify v1.8.4
+	github.com/schollz/progressbar/v3 v3.19.0
+	github.com/sirupsen/logrus v1.9.4
+	github.com/stretchr/testify v1.10.0
 	github.com/tevino/abool/v2 v2.1.0
 	github.com/yuin/gopher-lua v1.1.1
-	golang.org/x/exp v0.0.0-20231110203233-9a3e6036ecaa
-	gonum.org/v1/gonum v0.14.0
+	golang.org/x/exp v0.0.0-20260312153236-7ab1446f8b90
+	gonum.org/v1/gonum v0.17.0
 	gopkg.in/yaml.v3 v3.0.1
 	scientificgo.org/special v0.0.0
 )

 require (
 	github.com/davecgh/go-spew v1.1.1 // indirect
-	github.com/goombaio/orderedmap v0.0.0-20180924084748-ba921b7e2419 // indirect
+	github.com/goombaio/orderedmap v0.0.0-20180925151256-3da0e2f905f9 // indirect
 	github.com/kr/pretty v0.3.1 // indirect
 	github.com/kr/text v0.2.0 // indirect
 	github.com/pmezard/go-difflib v1.0.0 // indirect
@@ -38,16 +36,15 @@ require (

 require (
 	github.com/dsnet/compress v0.0.1
-	github.com/gabriel-vasile/mimetype v1.4.3
+	github.com/gabriel-vasile/mimetype v1.4.13
 	github.com/goombaio/orderedset v0.0.0-20180925151225-8e67b20a9b77
-	github.com/klauspost/compress v1.17.2
-	github.com/mattn/go-runewidth v0.0.15 // indirect
+	github.com/klauspost/compress v1.18.4
+	github.com/mattn/go-runewidth v0.0.21 // indirect
 	github.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db // indirect
-	github.com/rivo/uniseg v0.4.4 // indirect
-	github.com/shopspring/decimal v1.3.1 // indirect
-	github.com/ulikunitz/xz v0.5.11
-	golang.org/x/net v0.35.0 // indirect
-	golang.org/x/sys v0.30.0 // indirect
-	golang.org/x/term v0.29.0 // indirect
+	github.com/rivo/uniseg v0.4.7 // indirect
+	github.com/shopspring/decimal v1.4.0 // indirect
+	github.com/ulikunitz/xz v0.5.15
+	golang.org/x/sys v0.42.0 // indirect
+	golang.org/x/term v0.41.0 // indirect
 	gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c
 )
@@ -1,36 +1,41 @@
-github.com/DavidGamba/go-getoptions v0.28.0 h1:18wgEvfZdrlfIhVDGEBO3Dl0fkOyXqXLa0tLMCKxM1c=
-github.com/DavidGamba/go-getoptions v0.28.0/go.mod h1:zE97E3PR9P3BI/HKyNYgdMlYxodcuiC6W68KIgeYT84=
-github.com/PaesslerAG/gval v1.2.2 h1:Y7iBzhgE09IGTt5QgGQ2IdaYYYOU134YGHBThD+wm9E=
-github.com/PaesslerAG/gval v1.2.2/go.mod h1:XRFLwvmkTEdYziLdaCeCa5ImcGVrfQbeNUbVR+C6xac=
+github.com/DavidGamba/go-getoptions v0.33.0 h1:8xCPH87Yy5avYenygyHVlqqm8RpymH0YFe4a7IWlarE=
+github.com/DavidGamba/go-getoptions v0.33.0/go.mod h1:zE97E3PR9P3BI/HKyNYgdMlYxodcuiC6W68KIgeYT84=
+github.com/PaesslerAG/gval v1.2.4 h1:rhX7MpjJlcxYwL2eTTYIOBUyEKZ+A96T9vQySWkVUiU=
+github.com/PaesslerAG/gval v1.2.4/go.mod h1:XRFLwvmkTEdYziLdaCeCa5ImcGVrfQbeNUbVR+C6xac=
 github.com/PaesslerAG/jsonpath v0.1.0 h1:gADYeifvlqK3R3i2cR5B4DGgxLXIPb3TRTH1mGi0jPI=
 github.com/PaesslerAG/jsonpath v0.1.0/go.mod h1:4BzmtoM/PI8fPO4aQGIusjGxGir2BzcV0grWtFzq1Y8=
 github.com/barkimedes/go-deepcopy v0.0.0-20220514131651-17c30cfc62df h1:GSoSVRLoBaFpOOds6QyY1L8AX7uoY+Ln3BHc22W40X0=
 github.com/barkimedes/go-deepcopy v0.0.0-20220514131651-17c30cfc62df/go.mod h1:hiVxq5OP2bUGBRNS3Z/bt/reCLFNbdcST6gISi1fiOM=
-github.com/buger/jsonparser v1.1.1 h1:2PnMjfWD7wBILjqQbt530v576A/cAbQvEW9gGIpYMUs=
-github.com/buger/jsonparser v1.1.1/go.mod h1:6RYKKt7H4d4+iWqouImQ9R2FZql3VbhNgx27UK13J/0=
+github.com/buger/jsonparser v1.1.2 h1:frqHqw7otoVbk5M8LlE/L7HTnIq2v9RX6EJ48i9AxJk=
+github.com/buger/jsonparser v1.1.2/go.mod h1:6RYKKt7H4d4+iWqouImQ9R2FZql3VbhNgx27UK13J/0=
 github.com/chen3feng/stl4go v0.1.1 h1:0L1+mDw7pomftKDruM23f1mA7miavOj6C6MZeadzN2Q=
 github.com/chen3feng/stl4go v0.1.1/go.mod h1:5ml3psLgETJjRJnMbPE+JiHLrCpt+Ajc2weeTECXzWU=
+github.com/chengxilo/virtualterm v1.0.4 h1:Z6IpERbRVlfB8WkOmtbHiDbBANU7cimRIof7mk9/PwM=
+github.com/chengxilo/virtualterm v1.0.4/go.mod h1:DyxxBZz/x1iqJjFxTFcr6/x+jSpqN0iwWCOK1q10rlY=
+github.com/clipperhouse/uax29/v2 v2.2.0 h1:ChwIKnQN3kcZteTXMgb1wztSgaU+ZemkgWdohwgs8tY=
+github.com/clipperhouse/uax29/v2 v2.2.0/go.mod h1:EFJ2TJMRUaplDxHKj1qAEhCtQPW2tJSwu5BF98AuoVM=
 github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
-github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
 github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
 github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
-github.com/dlclark/regexp2 v1.11.4 h1:rPYF9/LECdNymJufQKmri9gV604RvvABwgOA8un7yAo=
-github.com/dlclark/regexp2 v1.11.4/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8=
+github.com/dlclark/regexp2 v1.11.5 h1:Q/sSnsKerHeCkc/jSTNq1oCm7KiVgUMZRDUoRu0JQZQ=
+github.com/dlclark/regexp2 v1.11.5/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8=
 github.com/dsnet/compress v0.0.1 h1:PlZu0n3Tuv04TzpfPbrnI0HW/YwodEXDS+oPKahKF0Q=
 github.com/dsnet/compress v0.0.1/go.mod h1:Aw8dCMJ7RioblQeTqt88akK31OvO8Dhf5JflhBbQEHo=
 github.com/dsnet/golib v0.0.0-20171103203638-1ea166775780/go.mod h1:Lj+Z9rebOhdfkVLjJ8T6VcRQv3SXugXy999NBtR9aFY=
-github.com/gabriel-vasile/mimetype v1.4.3 h1:in2uUcidCuFcDKtdcBxlR0rJ1+fsokWf+uqxgUFjbI0=
-github.com/gabriel-vasile/mimetype v1.4.3/go.mod h1:d8uq/6HKRL6CGdk+aubisF/M5GcPfT7nKyLpA0lbSSk=
-github.com/goccy/go-json v0.10.3 h1:KZ5WoDbxAIgm2HNbYckL0se1fHD6rz5j4ywS6ebzDqA=
-github.com/goccy/go-json v0.10.3/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M=
-github.com/goombaio/orderedmap v0.0.0-20180924084748-ba921b7e2419 h1:SajEQ6tktpF9SRIuzbiPOX9AEZZ53Bvw0k9Mzrts8Lg=
+github.com/gabriel-vasile/mimetype v1.4.13 h1:46nXokslUBsAJE/wMsp5gtO500a4F3Nkz9Ufpk2AcUM=
+github.com/gabriel-vasile/mimetype v1.4.13/go.mod h1:d+9Oxyo1wTzWdyVUPMmXFvp4F9tea18J8ufA774AB3s=
+github.com/goccy/go-json v0.10.6 h1:p8HrPJzOakx/mn/bQtjgNjdTcN+/S6FcG2CTtQOrHVU=
+github.com/goccy/go-json v0.10.6/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M=
+github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
+github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
 github.com/goombaio/orderedmap v0.0.0-20180924084748-ba921b7e2419/go.mod h1:YKu81H3RSd1cFh0d7NhvUoTtUC9IY/vBX0WUQb1/o4Y=
+github.com/goombaio/orderedmap v0.0.0-20180925151256-3da0e2f905f9 h1:vFjPvFavIiDY71bQ9HIxPQBANvNl1SmFC4fgg5xRkho=
+github.com/goombaio/orderedmap v0.0.0-20180925151256-3da0e2f905f9/go.mod h1:YKu81H3RSd1cFh0d7NhvUoTtUC9IY/vBX0WUQb1/o4Y=
 github.com/goombaio/orderedset v0.0.0-20180925151225-8e67b20a9b77 h1:4dvq1tGHn1Y9KSRY0OZ24Khki4+4U+ZrA//YYsdUlJU=
 github.com/goombaio/orderedset v0.0.0-20180925151225-8e67b20a9b77/go.mod h1:HPelMYpOyy0XvglpBbmZ3krZpwaHmszj/vQNlnETPTM=
-github.com/k0kubun/go-ansi v0.0.0-20180517002512-3bf9e2903213/go.mod h1:vNUNkEQ1e29fT/6vq2aBdFsgNPmy8qMdSay1npru+Sw=
 github.com/klauspost/compress v1.4.1/go.mod h1:RyIbtBH6LamlWaDj8nUwkbUhJ87Yi3uG0guNDohfE1A=
-github.com/klauspost/compress v1.17.2 h1:RlWWUY/Dr4fL8qk9YG7DTZ7PDgME2V4csBXA8L/ixi4=
-github.com/klauspost/compress v1.17.2/go.mod h1:ntbaceVETuRiXiv4DpjP66DpAtAGkEQskQzEyD//IeE=
+github.com/klauspost/compress v1.18.4 h1:RPhnKRAQ4Fh8zU2FY/6ZFDwTVTxgJ/EMydqSTzE9a2c=
+github.com/klauspost/compress v1.18.4/go.mod h1:R0h/fSBs8DE4ENlcrlib3PsXS61voFxhIs2DeRhCvJ4=
 github.com/klauspost/cpuid v1.2.0/go.mod h1:Pj4uuM528wm8OyEC2QMXAi2YiTZ96dNQPGgoMS4s3ek=
 github.com/klauspost/pgzip v1.2.6 h1:8RXeL5crjEUFnR2/Sn6GJNWtSQ3Dk8pq4CL3jvdDyjU=
 github.com/klauspost/pgzip v1.2.6/go.mod h1:Ch1tH69qFZu15pkjo5kYi6mth2Zzwzt50oCQKQE9RUs=
@@ -41,10 +46,8 @@ github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=
 github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
 github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
 github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
-github.com/mattn/go-isatty v0.0.17/go.mod h1:kYGgaQfpe5nmfYZH+SKPsOc2e4SrIfOl2e/yFXSvRLM=
-github.com/mattn/go-runewidth v0.0.14/go.mod h1:Jdepj2loyihRzMpdS35Xk/zdY8IAYHsh153qUoGf23w=
-github.com/mattn/go-runewidth v0.0.15 h1:UNAjwbU9l54TA3KzvqLGxwWjHmMgBUVhBiTjelZgg3U=
-github.com/mattn/go-runewidth v0.0.15/go.mod h1:Jdepj2loyihRzMpdS35Xk/zdY8IAYHsh153qUoGf23w=
+github.com/mattn/go-runewidth v0.0.21 h1:jJKAZiQH+2mIinzCJIaIG9Be1+0NR+5sz/lYEEjdM8w=
+github.com/mattn/go-runewidth v0.0.21/go.mod h1:XBkDxAl56ILZc9knddidhrOlY5R/pDhgLpndooCuJAs=
 github.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db h1:62I3jR2EmQ4l5rM/4FEfDWcRD+abF5XlKShorW5LRoQ=
 github.com/mitchellh/colorstring v0.0.0-20190213212951-d06e56a500db/go.mod h1:l0dey0ia/Uv7NcFFVbCLtqEBQbrT4OCwCSKTEv6enCw=
 github.com/pbnjay/memory v0.0.0-20210728143218-7b4eea64cf58 h1:onHthvaw9LFnH4t2DcNVpwGmV9E1BkGknEliJkfwQj0=
@@ -54,50 +57,40 @@ github.com/pelletier/go-toml/v2 v2.2.4/go.mod h1:2gIqNv+qfxSVS7cM2xJQKtLSTLUE9V8
 github.com/pkg/diff v0.0.0-20210226163009-20ebb0f2a09e/go.mod h1:pJLUxLENpZxwdsKMEsNbx1VGcRFpLqf3715MtcvvzbA=
 github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
 github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
-github.com/rivo/uniseg v0.2.0/go.mod h1:J6wj4VEh+S6ZtnVlnTBMWIodfgj8LQOQFoIToxlJtxc=
-github.com/rivo/uniseg v0.4.4 h1:8TfxU8dW6PdqD27gjM8MVNuicgxIjxpm4K7x4jp8sis=
-github.com/rivo/uniseg v0.4.4/go.mod h1:FN3SvrM+Zdj16jyLfmOkMNblXMcoc8DfTHruCPUcx88=
+github.com/rivo/uniseg v0.4.7 h1:WUdvkW8uEhrYfLC4ZzdpI2ztxP1I582+49Oc5Mq64VQ=
+github.com/rivo/uniseg v0.4.7/go.mod h1:FN3SvrM+Zdj16jyLfmOkMNblXMcoc8DfTHruCPUcx88=
 github.com/rogpeppe/go-internal v1.9.0/go.mod h1:WtVeX8xhTBvf0smdhujwtBcq4Qrzq/fJaraNFVN+nFs=
 github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8=
 github.com/rogpeppe/go-internal v1.12.0/go.mod h1:E+RYuTGaKKdloAfM02xzb0FW3Paa99yedzYV+kq4uf4=
 github.com/rrethy/ahocorasick v1.0.0 h1:YKkCB+E5PXc0xmLfMrWbfNht8vG9Re97IHSWZk/Lk8E=
 github.com/rrethy/ahocorasick v1.0.0/go.mod h1:nq8oScE7Vy1rOppoQxpQiiDmPHuKCuk9rXrNcxUV3R0=
-github.com/schollz/progressbar/v3 v3.13.1 h1:o8rySDYiQ59Mwzy2FELeHY5ZARXZTVJC7iHD6PEFUiE=
-github.com/schollz/progressbar/v3 v3.13.1/go.mod h1:xvrbki8kfT1fzWzBT/UZd9L6GA+jdL7HAgq2RFnO6fQ=
-github.com/shopspring/decimal v1.3.1 h1:2Usl1nmF/WZucqkFZhnfFYxxxu8LG21F6nPQBE5gKV8=
+github.com/schollz/progressbar/v3 v3.19.0 h1:Ea18xuIRQXLAUidVDox3AbwfUhD0/1IvohyTutOIFoc=
+github.com/schollz/progressbar/v3 v3.19.0/go.mod h1:IsO3lpbaGuzh8zIMzgY3+J8l4C8GjO0Y9S69eFvNsec=
 github.com/shopspring/decimal v1.3.1/go.mod h1:DKyhrW/HYNuLGql+MJL6WCR6knT2jwCFRcu2hWCYk4o=
-github.com/sirupsen/logrus v1.9.3 h1:dueUQJ1C2q9oE3F7wvmSGAaVtTmUizReu6fjN8uqzbQ=
-github.com/sirupsen/logrus v1.9.3/go.mod h1:naHLuLoDiP4jHNo9R0sCBMtWGeIprob74mVsIT4qYEQ=
-github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
-github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
-github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
-github.com/stretchr/testify v1.8.4 h1:CcVxjf3Q8PM0mHUKJCdn+eZZtm5yQwehR5yeSVQQcUk=
-github.com/stretchr/testify v1.8.4/go.mod h1:sz/lmYIOXD/1dqDmKjjqLyZ2RngseejIcXlSw2iwfAo=
+github.com/shopspring/decimal v1.4.0 h1:bxl37RwXBklmTi0C79JfXCEBD1cqqHt0bbgBAGFp81k=
+github.com/shopspring/decimal v1.4.0/go.mod h1:gawqmDU56v4yIKSwfBSFip1HdCCXN8/+DMd9qYNcwME=
+github.com/sirupsen/logrus v1.9.4 h1:TsZE7l11zFCLZnZ+teH4Umoq5BhEIfIzfRDZ1Uzql2w=
+github.com/sirupsen/logrus v1.9.4/go.mod h1:ftWc9WdOfJ0a92nsE2jF5u5ZwH8Bv2zdeOC42RjbV2g=
+github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOfJA=
+github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
 github.com/tevino/abool/v2 v2.1.0 h1:7w+Vf9f/5gmKT4m4qkayb33/92M+Um45F2BkHOR+L/c=
 github.com/tevino/abool/v2 v2.1.0/go.mod h1:+Lmlqk6bHDWHqN1cbxqhwEAwMPXgc8I1SDEamtseuXY=
 github.com/ulikunitz/xz v0.5.6/go.mod h1:2bypXElzHzzJZwzH67Y6wb67pO62Rzfn7BSiF4ABRW8=
-github.com/ulikunitz/xz v0.5.11 h1:kpFauv27b6ynzBNT/Xy+1k+fK4WswhN/6PN5WhFAGw8=
-github.com/ulikunitz/xz v0.5.11/go.mod h1:nbz6k7qbPmH4IRqmfOplQw/tblSgqTqBwxkY0oWt/14=
+github.com/ulikunitz/xz v0.5.15 h1:9DNdB5s+SgV3bQ2ApL10xRc35ck0DuIX/isZvIk+ubY=
+github.com/ulikunitz/xz v0.5.15/go.mod h1:nbz6k7qbPmH4IRqmfOplQw/tblSgqTqBwxkY0oWt/14=
 github.com/yuin/gopher-lua v1.1.1 h1:kYKnWBjvbNP4XLT3+bPEwAXJx262OhaHDWDVOPjL46M=
 github.com/yuin/gopher-lua v1.1.1/go.mod h1:GBR0iDaNXjAgGg9zfCvksxSRnQx76gclCIb7kdAd1Pw=
-golang.org/x/exp v0.0.0-20231110203233-9a3e6036ecaa h1:FRnLl4eNAQl8hwxVVC17teOw8kdjVDVAiFMtgUdTSRQ=
-golang.org/x/exp v0.0.0-20231110203233-9a3e6036ecaa/go.mod h1:zk2irFbV9DP96SEBUUAy67IdHUaZuSnrz1n472HUCLE=
-golang.org/x/net v0.35.0 h1:T5GQRQb2y08kTAByq9L4/bz8cipCdA8FbRTXewonqY8=
-golang.org/x/net v0.35.0/go.mod h1:EglIi67kWsHKlRzzVMUD93VMSWGFOMSZgxFjparz1Qk=
-golang.org/x/sys v0.0.0-20220715151400-c0bba94af5f8/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
-golang.org/x/sys v0.0.0-20220811171246-fbc7d0a398ab/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
-golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
-golang.org/x/sys v0.30.0 h1:QjkSwP/36a20jFYWkSue1YwXzLmsV5Gfq7Eiy72C1uc=
-golang.org/x/sys v0.30.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
-golang.org/x/term v0.6.0/go.mod h1:m6U89DPEgQRMq3DNkDClhWw02AUbt2daBVO4cn4Hv9U=
-golang.org/x/term v0.29.0 h1:L6pJp37ocefwRRtYPKSWOWzOtWSxVajvz2ldH/xi3iU=
-golang.org/x/term v0.29.0/go.mod h1:6bl4lRlvVuDgSf3179VpIxBF0o10JUpXWOnI7nErv7s=
-gonum.org/v1/gonum v0.14.0 h1:2NiG67LD1tEH0D7kM+ps2V+fXmsAnpUeec7n8tcr4S0=
-gonum.org/v1/gonum v0.14.0/go.mod h1:AoWeoz0becf9QMWtE8iWXNXc27fK4fNeHNf/oMejGfU=
+golang.org/x/exp v0.0.0-20260312153236-7ab1446f8b90 h1:jiDhWWeC7jfWqR9c/uplMOqJ0sbNlNWv0UkzE0vX1MA=
+golang.org/x/exp v0.0.0-20260312153236-7ab1446f8b90/go.mod h1:xE1HEv6b+1SCZ5/uscMRjUBKtIxworgEcEi+/n9NQDQ=
+golang.org/x/sys v0.42.0 h1:omrd2nAlyT5ESRdCLYdm3+fMfNFE/+Rf4bDIQImRJeo=
+golang.org/x/sys v0.42.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw=
+golang.org/x/term v0.41.0 h1:QCgPso/Q3RTJx2Th4bDLqML4W6iJiaXFq2/ftQF13YU=
+golang.org/x/term v0.41.0/go.mod h1:3pfBgksrReYfZ5lvYM0kSO0LIkAl4Yl2bXOkKP7Ec2A=
+gonum.org/v1/gonum v0.17.0 h1:VbpOemQlsSMrYmn7T2OUvQ4dqxQXU+ouZFQsZOx50z4=
+gonum.org/v1/gonum v0.17.0/go.mod h1:El3tOrEuMpv2UdMrbNlKEh9vd86bmQ6vqIcDwxEOc1E=
 gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
 gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
 gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
-gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
 gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
 gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
 scientificgo.org/special v0.0.0 h1:P6WJkECo6tgtvZAEfNXl+KEB9ReAatjKAeX8U07mjSc=
@@ -1,5 +1,3 @@
-go 1.23.4
-
-toolchain go1.24.2
+go 1.26.1

 use .
@@ -52,6 +52,8 @@ golang.org/x/image v0.6.0/go.mod h1:MXLdDR43H7cDJq5GEGXEVeeNhPgi+YYEQ2pC1byI1x0=
 golang.org/x/mod v0.13.0 h1:I/DsJXRlw/8l/0c24sM9yb0T4z9liZTduXvdAWYiysY=
 golang.org/x/mod v0.13.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
 golang.org/x/mod v0.14.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
+golang.org/x/net v0.38.0/go.mod h1:ivrbrMbzFq5J41QOQh0siUuly180yBYtLp+CKbEaFx8=
+golang.org/x/net v0.52.0/go.mod h1:R1MAz7uMZxVMualyPXb+VaqGSa3LIaUqk0eEt3w36Sw=
 golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4 h1:uVc8UZUe6tr40fFVnUP5Oj+veunVezqYl9z7DYw9xzw=
 golang.org/x/text v0.13.0 h1:ablQoSUd0tRdKxZewP80B+BaqeKJuVhuRxj/dkrun3k=
 golang.org/x/text v0.13.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE=
@@ -7,6 +7,7 @@ INSTALL_DIR="/usr/local"
 OBITOOLS_PREFIX=""
 VERSION=""
 LIST_VERSIONS=false
+JOBS=1

 # Help message
 function display_help {
@@ -21,6 +22,7 @@ function display_help {
  echo "                          gobigrep command instead of obigrep)."
  echo "  -v, --version           Install a specific version (e.g., 4.4.8)."
  echo "                          If not specified, installs the latest version."
+  echo "  -j, --jobs              Number of parallel jobs for compilation (default: 1)."
  echo "  -l, --list              List all available versions and exit."
  echo "  -h, --help              Display this help message."
  echo ""
@@ -65,6 +67,10 @@ while [ "$#" -gt 0 ]; do
      VERSION="$2"
      shift 2
      ;;
+    -j|--jobs)
+      JOBS="$2"
+      shift 2
+      ;;
    -l|--list)
      LIST_VERSIONS=true
      shift
@@ -122,9 +128,15 @@ mkdir -p "${WORK_DIR}/cache" \
      exit 1)

 # Create installation directory
-mkdir -p "${INSTALL_DIR}/bin" 2> /dev/null \
-  || (echo "Please enter your password for installing obitools in ${INSTALL_DIR}" 1>&2
-      sudo mkdir -p "${INSTALL_DIR}/bin")
+if ! mkdir -p "${INSTALL_DIR}/bin" 2>/dev/null; then
+  if [ ! -w "$(dirname "${INSTALL_DIR}")" ] && [ ! -w "${INSTALL_DIR}" ]; then
+    echo "Please enter your password for installing obitools in ${INSTALL_DIR}" 1>&2
+    sudo mkdir -p "${INSTALL_DIR}/bin"
+  else
+    echo "Error: Could not create ${INSTALL_DIR}/bin (check path or disk space)" 1>&2
+    exit 1
+  fi
+fi

 if [[ ! -d "${INSTALL_DIR}/bin" ]]; then
  echo "Could not create ${INSTALL_DIR}/bin directory for installing obitools" 1>&2
@@ -171,22 +183,24 @@ GOURL=$(curl -s "${URL}${GOFILE}" \

 echo "Installing Go from: $GOURL" 1>&2

-curl -s "$GOURL" | tar zxf -
+curl --progress-bar "$GOURL" | tar zxf -

-PATH="$(pwd)/go/bin:$PATH"
+export GOROOT="$(pwd)/go"
+PATH="${GOROOT}/bin:$PATH"
 export PATH
-GOPATH="$(pwd)/go"
-export GOPATH
+export GOPATH="$(pwd)/gopath"
 export GOCACHE="$(pwd)/cache"
+export GOTOOLCHAIN=local

+echo "GOROOT=$GOROOT" 1>&2
 echo "GOCACHE=$GOCACHE" 1>&2
-mkdir -p "$GOCACHE"
+mkdir -p "$GOPATH" "$GOCACHE"

 # Download OBITools4 source
 echo "Downloading OBITools4 v${VERSION}..." 1>&2
 echo "Source URL: $OBIURL4" 1>&2

-if ! curl -sL "$OBIURL4" > obitools4.zip; then
+if ! curl --progress-bar -L "$OBIURL4" > obitools4.zip; then
  echo "Error: Could not download OBITools4 version ${VERSION}" 1>&2
  echo "Please check that this version exists with: $0 --list" 1>&2
  exit 1
@@ -208,16 +222,29 @@ mkdir -p vendor

 # Build with or without prefix
 if [[ -z "$OBITOOLS_PREFIX" ]] ; then
-  make GOFLAGS="-buildvcs=false"
+  make -j"${JOBS}" obitools GOFLAGS="-buildvcs=false"
 else
-  make GOFLAGS="-buildvcs=false" OBITOOLS_PREFIX="${OBITOOLS_PREFIX}"
+  make -j"${JOBS}" obitools GOFLAGS="-buildvcs=false" OBITOOLS_PREFIX="${OBITOOLS_PREFIX}"
 fi

 # Install binaries
 echo "Installing binaries to ${INSTALL_DIR}/bin..." 1>&2
-(cp build/* "${INSTALL_DIR}/bin" 2> /dev/null) \
-   || (echo "Please enter your password for installing obitools in ${INSTALL_DIR}" 1>&2
-       sudo cp build/* "${INSTALL_DIR}/bin")
+if ! cp build/* "${INSTALL_DIR}/bin" 2>/dev/null; then
+  if [ ! -w "${INSTALL_DIR}/bin" ]; then
+    echo "Please enter your password for installing obitools in ${INSTALL_DIR}" 1>&2
+    sudo cp build/* "${INSTALL_DIR}/bin"
+  else
+    echo "Error: Could not copy binaries to ${INSTALL_DIR}/bin" 1>&2
+    echo "  Source files: $(ls build/ 2>/dev/null || echo 'none found')" 1>&2
+    echo "" 1>&2
+    echo "The build directory has been preserved for manual recovery:" 1>&2
+    echo "  $(pwd)/build/" 1>&2
+    echo "You can install manually with:" 1>&2
+    echo "  cp $(pwd)/build/* ${INSTALL_DIR}/bin/" 1>&2
+    popd > /dev/null || true
+    exit 1
+  fi
+fi

 popd > /dev/null || exit

@@ -0,0 +1,24 @@
+>HELIUM_000100422_612GNAAXX:7:118:3572:14633#0/1_sub[28..126] {"count":10172,"merged_sample":{"26a_F040644":10172},"obitag_bestid":0.9797979797979798,"obitag_bestmatch":"AY227529","obitag_match_count":1,"obitag_rank":"genus","obitag_similarity_method":"lcs","taxid":"taxon:9992 [Marmota]@genus"}
+ttagccctaaacataaacattcaataaacaagaatgttcgccagagtactactagcaaca
+gcctgaaactcaaaggacttggcggtgctttacatccct
+>HELIUM_000100422_612GNAAXX:7:99:9351:13090#0/1_sub[28..127] {"count":260,"merged_sample":{"29a_F260619":260},"obitag_bestid":0.9405940594059405,"obitag_bestmatch":"AF154263","obitag_match_count":9,"obitag_rank":"infraorder","obitag_similarity_method":"lcs","taxid":"taxon:35500 [Pecora]@infraorder"}
+ttagccctaaacacaaataattacacaaacaaaattgttcaccagagtactagcggcaac
+agcttaaaactcaaaggacttggcggtgctttataccctt
+>HELIUM_000100422_612GNAAXX:7:108:10111:9078#0/1_sub[28..127] {"count":7146,"merged_sample":{"13a_F730603":7146},"obitag_bestid":1,"obitag_bestmatch":"AB245427","obitag_match_count":1,"obitag_rank":"species","obitag_similarity_method":"lcs","taxid":"taxon:9860 [Cervus elaphus]@species"}
+ctagccttaaacacaaatagttatgcaaacaaaactattcgccagagtactaccggcaat
+agcttaaaactcaaaggacttggcggtgctttataccctt
+>HELIUM_000100422_612GNAAXX:7:38:14204:12725#0/1_sub[28..126] {"count":87,"merged_sample":{"26a_F040644":87},"obitag_bestid":0.9494949494949495,"obitag_bestmatch":"AY227530","obitag_match_count":2,"obitag_rank":"tribe","obitag_similarity_method":"lcs","taxid":"taxon:337730 [Marmotini]@tribe"}
+ttagccctaaacataaacattcaataaacaagaatgttcgccagaggactactagcaata
+gcttaaaactcaaaggacttggcggtgctttatatccct
+>HELIUM_000100422_612GNAAXX:7:30:9942:4495#0/1_sub[28..126] {"count":95,"merged_sample":{"26a_F040644":11,"29a_F260619":84},"obitag_bestid":0.9595959595959596,"obitag_bestmatch":"AC187326","obitag_match_count":1,"obitag_rank":"subspecies","obitag_similarity_method":"lcs","taxid":"taxon:9615 [Canis lupus familiaris]@subspecies"}
+ttagccctaaacataagctattccataacaaaataattcgccagagaactactagcaaca
+gattaaacctcaaaggacttggcagtgctttatacccct
+>HELIUM_000100422_612GNAAXX:7:51:16702:19393#0/1_sub[28..127] {"count":12004,"merged_sample":{"15a_F730814":7465,"29a_F260619":4539},"obitag_bestid":1,"obitag_bestmatch":"AJ885202","obitag_match_count":1,"obitag_rank":"species","obitag_similarity_method":"lcs","taxid":"taxon:9858 [Capreolus capreolus]@species"}
+ttagccctaaacacaagtaattaatataacaaaattattcgccagagtactaccggcaat
+agcttaaaactcaaaggacttggcggtgctttataccctt
+>HELIUM_000100422_612GNAAXX:7:84:14502:1617#0/1_sub[28..127] {"count":319,"merged_sample":{"29a_F260619":319},"obitag_bestid":1,"obitag_bestmatch":"AJ972683","obitag_match_count":1,"obitag_rank":"species","obitag_similarity_method":"lcs","taxid":"taxon:9858 [Capreolus capreolus]@species"}
+ttagccctaaacacaagtaattattataacaaaattattcgccagagtactaccggcaat
+agcttaaaactcaaaggacttggcggtgctttataccctt
+>HELIUM_000100422_612GNAAXX:7:50:10637:6527#0/1_sub[28..126] {"count":366,"merged_sample":{"13a_F730603":13,"15a_F730814":5,"26a_F040644":347,"29a_F260619":1},"obitag_bestid":1,"obitag_bestmatch":"AB048590","obitag_match_count":1,"obitag_rank":"genus","obitag_similarity_method":"lcs","taxid":"taxon:9611 [Canis]@genus"}
+ttagccctaaacatagataattttacaacaaaataattcgccagaggactactagcaata
+gcttaaaactcaaaggacttggcggtgctttatatccct
@@ -0,0 +1,48 @@
+taxid,parent,taxonomic_rank,scientific_name
+taxon:1 [root]@no rank,taxon:1 [root]@no rank,no rank,root
+taxon:131567 [cellular organisms]@cellular root,taxon:1 [root]@no rank,cellular root,cellular organisms
+taxon:2759 [Eukaryota]@domain,taxon:131567 [cellular organisms]@cellular root,domain,Eukaryota
+taxon:33154 [Opisthokonta]@clade,taxon:2759 [Eukaryota]@domain,clade,Opisthokonta
+taxon:33208 [Metazoa]@kingdom,taxon:33154 [Opisthokonta]@clade,kingdom,Metazoa
+taxon:6072 [Eumetazoa]@clade,taxon:33208 [Metazoa]@kingdom,clade,Eumetazoa
+taxon:33213 [Bilateria]@clade,taxon:6072 [Eumetazoa]@clade,clade,Bilateria
+taxon:33511 [Deuterostomia]@clade,taxon:33213 [Bilateria]@clade,clade,Deuterostomia
+taxon:7711 [Chordata]@phylum,taxon:33511 [Deuterostomia]@clade,phylum,Chordata
+taxon:89593 [Craniata]@subphylum,taxon:7711 [Chordata]@phylum,subphylum,Craniata
+taxon:7742 [Vertebrata]@clade,taxon:89593 [Craniata]@subphylum,clade,Vertebrata
+taxon:7776 [Gnathostomata]@clade,taxon:7742 [Vertebrata]@clade,clade,Gnathostomata
+taxon:117570 [Teleostomi]@clade,taxon:7776 [Gnathostomata]@clade,clade,Teleostomi
+taxon:117571 [Euteleostomi]@clade,taxon:117570 [Teleostomi]@clade,clade,Euteleostomi
+taxon:8287 [Sarcopterygii]@superclass,taxon:117571 [Euteleostomi]@clade,superclass,Sarcopterygii
+taxon:1338369 [Dipnotetrapodomorpha]@clade,taxon:8287 [Sarcopterygii]@superclass,clade,Dipnotetrapodomorpha
+taxon:32523 [Tetrapoda]@clade,taxon:1338369 [Dipnotetrapodomorpha]@clade,clade,Tetrapoda
+taxon:32524 [Amniota]@clade,taxon:32523 [Tetrapoda]@clade,clade,Amniota
+taxon:40674 [Mammalia]@class,taxon:32524 [Amniota]@clade,class,Mammalia
+taxon:32525 [Theria]@clade,taxon:40674 [Mammalia]@class,clade,Theria
+taxon:9347 [Eutheria]@clade,taxon:32525 [Theria]@clade,clade,Eutheria
+taxon:1437010 [Boreoeutheria]@clade,taxon:9347 [Eutheria]@clade,clade,Boreoeutheria
+taxon:314146 [Euarchontoglires]@superorder,taxon:1437010 [Boreoeutheria]@clade,superorder,Euarchontoglires
+taxon:314145 [Laurasiatheria]@superorder,taxon:1437010 [Boreoeutheria]@clade,superorder,Laurasiatheria
+taxon:33554 [Carnivora]@order,taxon:314145 [Laurasiatheria]@superorder,order,Carnivora
+taxon:91561 [Artiodactyla]@order,taxon:314145 [Laurasiatheria]@superorder,order,Artiodactyla
+taxon:314147 [Glires]@clade,taxon:314146 [Euarchontoglires]@superorder,clade,Glires
+taxon:9845 [Ruminantia]@suborder,taxon:91561 [Artiodactyla]@order,suborder,Ruminantia
+taxon:35500 [Pecora]@infraorder,taxon:9845 [Ruminantia]@suborder,infraorder,Pecora
+taxon:9989 [Rodentia]@order,taxon:314147 [Glires]@clade,order,Rodentia
+taxon:379584 [Caniformia]@suborder,taxon:33554 [Carnivora]@order,suborder,Caniformia
+taxon:9608 [Canidae]@family,taxon:379584 [Caniformia]@suborder,family,Canidae
+taxon:9850 [Cervidae]@family,taxon:35500 [Pecora]@infraorder,family,Cervidae
+taxon:9881 [Odocoileinae]@subfamily,taxon:9850 [Cervidae]@family,subfamily,Odocoileinae
+taxon:33553 [Sciuromorpha]@suborder,taxon:9989 [Rodentia]@order,suborder,Sciuromorpha
+taxon:55153 [Sciuridae]@family,taxon:33553 [Sciuromorpha]@suborder,family,Sciuridae
+taxon:34878 [Cervinae]@subfamily,taxon:9850 [Cervidae]@family,subfamily,Cervinae
+taxon:9611 [Canis]@genus,taxon:9608 [Canidae]@family,genus,Canis
+taxon:9857 [Capreolus]@genus,taxon:9881 [Odocoileinae]@subfamily,genus,Capreolus
+taxon:9612 [Canis lupus]@species,taxon:9611 [Canis]@genus,species,Canis lupus
+taxon:337726 [Xerinae]@subfamily,taxon:55153 [Sciuridae]@family,subfamily,Xerinae
+taxon:9859 [Cervus]@genus,taxon:34878 [Cervinae]@subfamily,genus,Cervus
+taxon:337730 [Marmotini]@tribe,taxon:337726 [Xerinae]@subfamily,tribe,Marmotini
+taxon:9992 [Marmota]@genus,taxon:337730 [Marmotini]@tribe,genus,Marmota
+taxon:9860 [Cervus elaphus]@species,taxon:9859 [Cervus]@genus,species,Cervus elaphus
+taxon:9615 [Canis lupus familiaris]@subspecies,taxon:9612 [Canis lupus]@species,subspecies,Canis lupus familiaris
+taxon:9858 [Capreolus capreolus]@species,taxon:9857 [Capreolus]@genus,species,Capreolus capreolus
@@ -44,7 +44,7 @@ cleanup() {
    rm -rf "$TMPDIR"  # Suppress the temporary directory

    if [ $failed -gt 0 ]; then
-       log "$TEST_NAME tests failed" 
+       log "$TEST_NAME tests failed"
        log
        log
       exit 1
@@ -60,10 +60,10 @@ log() {
    echo -e "[$TEST_NAME @ $(date)] $*" 1>&2
 }

-log "Testing $TEST_NAME..." 
-log "Test directory is $TEST_DIR" 
-log "obitools directory is $OBITOOLS_DIR" 
-log "Temporary directory is $TMPDIR" 
+log "Testing $TEST_NAME..."
+log "Test directory is $TEST_DIR"
+log "obitools directory is $OBITOOLS_DIR"
+log "Temporary directory is $TMPDIR"
 log "files: $(find $TEST_DIR | awk -F'/' '{print $NF}' | tail -n +2)"

 ######################################################################
@@ -94,12 +94,12 @@ log "files: $(find $TEST_DIR | awk -F'/' '{print $NF}' | tail -n +2)"


 ((ntest++))
-if $CMD -h > "${TMPDIR}/help.txt" 2>&1 
+if $CMD -h > "${TMPDIR}/help.txt" 2>&1
 then
-    log "$MCMD: printing help OK" 
+    log "$MCMD: printing help OK"
    ((success++))
 else
-    log "$MCMD: printing help failed" 
+    log "$MCMD: printing help failed"
    ((failed++))
 fi

@@ -108,15 +108,15 @@ fi
 if obiconvert -Z "${TEST_DIR}/gbpln1088.4Mb.fasta.gz" \
                 > "${TMPDIR}/xxx.fasta.gz" && \
   zdiff "${TEST_DIR}/gbpln1088.4Mb.fasta.gz" \
-                 "${TMPDIR}/xxx.fasta.gz" 
+                 "${TMPDIR}/xxx.fasta.gz"
 then
-    log "$MCMD: converting large fasta file to fasta OK" 
+    log "$MCMD: converting large fasta file to fasta OK"
    ((success++))
 else
-    log "$MCMD: converting large fasta file to fasta failed" 
+    log "$MCMD: converting large fasta file to fasta failed"
    ((failed++))
 fi
- 
+
 ((ntest++))
 if obiconvert -Z --fastq-output \
              "${TEST_DIR}/gbpln1088.4Mb.fasta.gz" \
@@ -125,15 +125,139 @@ if obiconvert -Z --fastq-output \
              "${TMPDIR}/xxx.fastq.gz" \
              > "${TMPDIR}/yyy.fasta.gz" && \
   zdiff "${TEST_DIR}/gbpln1088.4Mb.fasta.gz" \
-                 "${TMPDIR}/yyy.fasta.gz" 
+                 "${TMPDIR}/yyy.fasta.gz"
 then
-    log "$MCMD: converting large file between fasta and fastq OK" 
+    log "$MCMD: converting large file between fasta and fastq OK"
    ((success++))
 else
-    log "$MCMD: converting large file between fasta and fastq failed" 
+    log "$MCMD: converting large file between fasta and fastq failed"
    ((failed++))
 fi

+
+# ------------------------------------------------------------------
+# --raw-taxid tests (no taxonomy loaded)
+# ------------------------------------------------------------------
+
+# Running test
+((ntest++))
+if obiconvert --raw-taxid "${TEST_DIR}/out_ecotag.fasta" \
+              > "${TMPDIR}/raw_taxid.fasta" 2>/dev/null
+then
+    log "$MCMD --raw-taxid: running OK"
+    ((success++))
+else
+    log "$MCMD --raw-taxid: running failed"
+    ((failed++))
+fi
+
+# Taxids must be bare numbers — no full-format "taxon:ID [Name]@rank" strings
+((ntest++))
+if grep '"taxid"' "${TMPDIR}/raw_taxid.fasta" | grep -qv '"taxid":"[0-9][0-9]*"'
+then
+    log "$MCMD --raw-taxid: taxid format check failed (full-format taxid found)"
+    ((failed++))
+else
+    log "$MCMD --raw-taxid: taxid format OK (all taxids are bare numbers)"
+    ((success++))
+fi
+
+# --raw-taxid is idempotent: piping through a second obiconvert --raw-taxid must
+# produce bit-for-bit identical output.
+((ntest++))
+if obiconvert --raw-taxid "${TMPDIR}/raw_taxid.fasta" \
+              > "${TMPDIR}/raw_taxid2.fasta" 2>/dev/null
+then
+    log "$MCMD --raw-taxid piped: running OK"
+    ((success++))
+else
+    log "$MCMD --raw-taxid piped: running failed"
+    ((failed++))
+fi
+
+((ntest++))
+if diff "${TMPDIR}/raw_taxid.fasta" \
+        "${TMPDIR}/raw_taxid2.fasta" > /dev/null
+then
+    log "$MCMD --raw-taxid piped: idempotency OK"
+    ((success++))
+else
+    log "$MCMD --raw-taxid piped: idempotency failed (outputs differ)"
+    ((failed++))
+fi
+
+
+# ------------------------------------------------------------------
+# --taxonomy tests (full-format taxid, no --raw-taxid)
+# ------------------------------------------------------------------
+
+# Running test
+((ntest++))
+if obiconvert --taxonomy "${TEST_DIR}/taxonomy.csv" \
+              "${TEST_DIR}/out_ecotag.fasta" \
+              > "${TMPDIR}/taxo.fasta" 2>/dev/null
+then
+    log "$MCMD --taxonomy: running OK"
+    ((success++))
+else
+    log "$MCMD --taxonomy: running failed"
+    ((failed++))
+fi
+
+# Taxids must be in full "taxon:ID [Name]@rank" format
+((ntest++))
+if grep '"taxid"' "${TMPDIR}/taxo.fasta" | grep -q '"taxid":"taxon:[0-9]'
+then
+    log "$MCMD --taxonomy: taxid format OK (full-format taxids present)"
+    ((success++))
+else
+    log "$MCMD --taxonomy: taxid format check failed (no full-format taxid found)"
+    ((failed++))
+fi
+
+
+# ------------------------------------------------------------------
+# --raw-taxid --taxonomy tests
+# ------------------------------------------------------------------
+
+# Running test
+((ntest++))
+if obiconvert --raw-taxid --taxonomy "${TEST_DIR}/taxonomy.csv" \
+              "${TEST_DIR}/out_ecotag.fasta" \
+              > "${TMPDIR}/raw_taxid_taxo.fasta" 2>/dev/null
+then
+    log "$MCMD --raw-taxid --taxonomy: running OK"
+    ((success++))
+else
+    log "$MCMD --raw-taxid --taxonomy: running failed"
+    ((failed++))
+fi
+
+# Taxids must be bare numbers even when taxonomy is loaded
+((ntest++))
+if grep '"taxid"' "${TMPDIR}/raw_taxid_taxo.fasta" | grep -qv '"taxid":"[0-9][0-9]*"'
+then
+    log "$MCMD --raw-taxid --taxonomy: taxid format check failed (full-format taxid found)"
+    ((failed++))
+else
+    log "$MCMD --raw-taxid --taxonomy: taxid format OK (all taxids are bare numbers)"
+    ((success++))
+fi
+
+# --raw-taxid with or without taxonomy must yield identical taxid values
+((ntest++))
+if diff <(grep '"taxid"' "${TMPDIR}/raw_taxid.fasta" | grep -o '"taxid":"[^"]*"' | sort) \
+        <(grep '"taxid"' "${TMPDIR}/raw_taxid_taxo.fasta" | grep -o '"taxid":"[^"]*"' | sort) \
+        > /dev/null
+then
+    log "$MCMD --raw-taxid vs --raw-taxid --taxonomy: taxid values match OK"
+    ((success++))
+else
+    log "$MCMD --raw-taxid vs --raw-taxid --taxonomy: taxid values differ (unexpected)"
+    ((failed++))
+fi
+
+
 #########################################
 #
 # At the end of the tests
@@ -0,0 +1,24 @@
+>HELIUM_000100422_612GNAAXX:7:118:3572:14633#0/1_sub[28..126] {"count":10172,"merged_sample":{"26a_F040644":10172},"obitag_bestid":0.9797979797979798,"obitag_bestmatch":"AY227529","obitag_match_count":1,"obitag_rank":"genus","obitag_similarity_method":"lcs","taxid":"taxon:9992 [Marmota]@genus"}
+ttagccctaaacataaacattcaataaacaagaatgttcgccagagtactactagcaaca
+gcctgaaactcaaaggacttggcggtgctttacatccct
+>HELIUM_000100422_612GNAAXX:7:99:9351:13090#0/1_sub[28..127] {"count":260,"merged_sample":{"29a_F260619":260},"obitag_bestid":0.9405940594059405,"obitag_bestmatch":"AF154263","obitag_match_count":9,"obitag_rank":"infraorder","obitag_similarity_method":"lcs","taxid":"taxon:35500 [Pecora]@infraorder"}
+ttagccctaaacacaaataattacacaaacaaaattgttcaccagagtactagcggcaac
+agcttaaaactcaaaggacttggcggtgctttataccctt
+>HELIUM_000100422_612GNAAXX:7:108:10111:9078#0/1_sub[28..127] {"count":7146,"merged_sample":{"13a_F730603":7146},"obitag_bestid":1,"obitag_bestmatch":"AB245427","obitag_match_count":1,"obitag_rank":"species","obitag_similarity_method":"lcs","taxid":"taxon:9860 [Cervus elaphus]@species"}
+ctagccttaaacacaaatagttatgcaaacaaaactattcgccagagtactaccggcaat
+agcttaaaactcaaaggacttggcggtgctttataccctt
+>HELIUM_000100422_612GNAAXX:7:38:14204:12725#0/1_sub[28..126] {"count":87,"merged_sample":{"26a_F040644":87},"obitag_bestid":0.9494949494949495,"obitag_bestmatch":"AY227530","obitag_match_count":2,"obitag_rank":"tribe","obitag_similarity_method":"lcs","taxid":"taxon:337730 [Marmotini]@tribe"}
+ttagccctaaacataaacattcaataaacaagaatgttcgccagaggactactagcaata
+gcttaaaactcaaaggacttggcggtgctttatatccct
+>HELIUM_000100422_612GNAAXX:7:30:9942:4495#0/1_sub[28..126] {"count":95,"merged_sample":{"26a_F040644":11,"29a_F260619":84},"obitag_bestid":0.9595959595959596,"obitag_bestmatch":"AC187326","obitag_match_count":1,"obitag_rank":"subspecies","obitag_similarity_method":"lcs","taxid":"taxon:9615 [Canis lupus familiaris]@subspecies"}
+ttagccctaaacataagctattccataacaaaataattcgccagagaactactagcaaca
+gattaaacctcaaaggacttggcagtgctttatacccct
+>HELIUM_000100422_612GNAAXX:7:51:16702:19393#0/1_sub[28..127] {"count":12004,"merged_sample":{"15a_F730814":7465,"29a_F260619":4539},"obitag_bestid":1,"obitag_bestmatch":"AJ885202","obitag_match_count":1,"obitag_rank":"species","obitag_similarity_method":"lcs","taxid":"taxon:9858 [Capreolus capreolus]@species"}
+ttagccctaaacacaagtaattaatataacaaaattattcgccagagtactaccggcaat
+agcttaaaactcaaaggacttggcggtgctttataccctt
+>HELIUM_000100422_612GNAAXX:7:84:14502:1617#0/1_sub[28..127] {"count":319,"merged_sample":{"29a_F260619":319},"obitag_bestid":1,"obitag_bestmatch":"AJ972683","obitag_match_count":1,"obitag_rank":"species","obitag_similarity_method":"lcs","taxid":"taxon:9858 [Capreolus capreolus]@species"}
+ttagccctaaacacaagtaattattataacaaaattattcgccagagtactaccggcaat
+agcttaaaactcaaaggacttggcggtgctttataccctt
+>HELIUM_000100422_612GNAAXX:7:50:10637:6527#0/1_sub[28..126] {"count":366,"merged_sample":{"13a_F730603":13,"15a_F730814":5,"26a_F040644":347,"29a_F260619":1},"obitag_bestid":1,"obitag_bestmatch":"AB048590","obitag_match_count":1,"obitag_rank":"genus","obitag_similarity_method":"lcs","taxid":"taxon:9611 [Canis]@genus"}
+ttagccctaaacatagataattttacaacaaaataattcgccagaggactactagcaata
+gcttaaaactcaaaggacttggcggtgctttatatccct
@@ -1,6 +1,12 @@
 package obidefault

-var _BatchSize = 2000
+// _BatchSize is the minimum number of sequences per batch (floor).
+// Used as the minSeqs argument to RebatchBySize.
+var _BatchSize = 1
+
+// _BatchSizeMax is the maximum number of sequences per batch (ceiling).
+// A batch is flushed when this count is reached regardless of memory usage.
+var _BatchSizeMax = 2000

 // SetBatchSize sets the size of the sequence batches.
 //
@@ -24,3 +30,42 @@ func BatchSize() int {
 func BatchSizePtr() *int {
 	return &_BatchSize
 }
+
+// BatchSizeMax returns the maximum number of sequences per batch.
+func BatchSizeMax() int {
+	return _BatchSizeMax
+}
+
+func BatchSizeMaxPtr() *int {
+	return &_BatchSizeMax
+}
+
+// _BatchMem holds the maximum cumulative memory (in bytes) per batch when
+// memory-based batching is requested. A value of 0 disables memory-based
+// batching and falls back to count-based batching.
+var _BatchMem = 128 * 1024 * 1024 // 128 MB default; set to 0 to disable
+var _BatchMemStr = ""
+
+// SetBatchMem sets the memory budget per batch in bytes.
+func SetBatchMem(n int) {
+	_BatchMem = n
+}
+
+// BatchMem returns the current memory budget per batch in bytes.
+// A value of 0 means memory-based batching is disabled.
+func BatchMem() int {
+	return _BatchMem
+}
+
+func BatchMemPtr() *int {
+	return &_BatchMem
+}
+
+// BatchMemStr returns the raw --batch-mem string value as provided on the CLI.
+func BatchMemStr() string {
+	return _BatchMemStr
+}
+
+func BatchMemStrPtr() *string {
+	return &_BatchMemStr
+}
@@ -161,6 +161,149 @@ func EmblChunkParser(withFeatureTable, UtoT bool) func(string, io.Reader) (obise
 	return parser
 }

+// extractEmblSeq scans the sequence section of an EMBL record directly on the
+// rope. EMBL sequence lines start with 5 spaces followed by bases in groups of
+// 10, separated by spaces, with a position number at the end. The section ends
+// with "//".
+func (s *ropeScanner) extractEmblSeq(dest []byte, UtoT bool) []byte {
+	// We use ReadLine and scan each line for bases (skip digits, spaces, newlines).
+	for {
+		line := s.ReadLine()
+		if line == nil {
+			break
+		}
+		if len(line) >= 2 && line[0] == '/' && line[1] == '/' {
+			break
+		}
+		// Lines start with 5 spaces; bases follow separated by single spaces.
+		// Digits at the end are the position counter — skip them.
+		// Simplest: take every byte that is a letter.
+		for _, b := range line {
+			if b >= 'A' && b <= 'Z' {
+				b += 'a' - 'A'
+			}
+			if UtoT && b == 'u' {
+				b = 't'
+			}
+			if b >= 'a' && b <= 'z' {
+				dest = append(dest, b)
+			}
+		}
+	}
+	return dest
+}
+
+// EmblChunkParserRope parses an EMBL chunk directly from a rope without Pack().
+func EmblChunkParserRope(source string, rope *PieceOfChunk, withFeatureTable, UtoT bool) (obiseq.BioSequenceSlice, error) {
+	scanner := newRopeScanner(rope)
+	sequences := obiseq.MakeBioSequenceSlice(100)[:0]
+
+	var id string
+	var scientificName string
+	defBytes := make([]byte, 0, 256)
+	featBytes := make([]byte, 0, 1024)
+	var taxid int
+	inSeq := false
+
+	for {
+		line := scanner.ReadLine()
+		if line == nil {
+			break
+		}
+
+		if inSeq {
+			// Should not happen — extractEmblSeq consumed up to "//"
+			inSeq = false
+			continue
+		}
+
+		switch {
+		case bytes.HasPrefix(line, []byte("ID   ")):
+			id = string(bytes.SplitN(line[5:], []byte(";"), 2)[0])
+		case bytes.HasPrefix(line, []byte("OS   ")):
+			scientificName = string(bytes.TrimSpace(line[5:]))
+		case bytes.HasPrefix(line, []byte("DE   ")):
+			if len(defBytes) > 0 {
+				defBytes = append(defBytes, ' ')
+			}
+			defBytes = append(defBytes, bytes.TrimSpace(line[5:])...)
+		case withFeatureTable && bytes.HasPrefix(line, []byte("FH   ")):
+			featBytes = append(featBytes, line...)
+		case withFeatureTable && bytes.Equal(line, []byte("FH")):
+			featBytes = append(featBytes, '\n')
+			featBytes = append(featBytes, line...)
+		case bytes.HasPrefix(line, []byte("FT   ")):
+			if withFeatureTable {
+				featBytes = append(featBytes, '\n')
+				featBytes = append(featBytes, line...)
+			}
+			if bytes.HasPrefix(line, []byte(`FT                   /db_xref="taxon:`)) {
+				rest := line[37:]
+				end := bytes.IndexByte(rest, '"')
+				if end > 0 {
+					taxid, _ = strconv.Atoi(string(rest[:end]))
+				}
+			}
+		case bytes.HasPrefix(line, []byte("     ")):
+			// First sequence line: extract all bases via extractEmblSeq,
+			// which also consumes this line's remaining content.
+			// But ReadLine already consumed this line — we need to process it
+			// plus subsequent lines. Process this line inline then call helper.
+			seqDest := make([]byte, 0, 4096)
+			for _, b := range line {
+				if b >= 'A' && b <= 'Z' {
+					b += 'a' - 'A'
+				}
+				if UtoT && b == 'u' {
+					b = 't'
+				}
+				if b >= 'a' && b <= 'z' {
+					seqDest = append(seqDest, b)
+				}
+			}
+			seqDest = scanner.extractEmblSeq(seqDest, UtoT)
+
+			seq := obiseq.NewBioSequenceOwning(id, seqDest, string(defBytes))
+			seq.SetSource(source)
+			if withFeatureTable {
+				seq.SetFeatures(featBytes)
+			}
+			annot := seq.Annotations()
+			annot["scientific_name"] = scientificName
+			annot["taxid"] = taxid
+			sequences = append(sequences, seq)
+
+			// Reset state
+			id = ""
+			scientificName = ""
+			defBytes = defBytes[:0]
+			featBytes = featBytes[:0]
+			taxid = 1
+
+		case bytes.Equal(line, []byte("//")):
+			// record ended without SQ/sequence section (e.g. WGS entries)
+			if id != "" {
+				seq := obiseq.NewBioSequenceOwning(id, []byte{}, string(defBytes))
+				seq.SetSource(source)
+				if withFeatureTable {
+					seq.SetFeatures(featBytes)
+				}
+				annot := seq.Annotations()
+				annot["scientific_name"] = scientificName
+				annot["taxid"] = taxid
+				sequences = append(sequences, seq)
+			}
+			id = ""
+			scientificName = ""
+			defBytes = defBytes[:0]
+			featBytes = featBytes[:0]
+			taxid = 1
+		}
+	}
+
+	return sequences, nil
+}
+
 func _ParseEmblFile(
 	input ChannelFileChunk,
 	out obiiter.IBioSequence,
@@ -171,7 +314,14 @@ func _ParseEmblFile(

 	for chunks := range input {
 		order := chunks.Order
-		sequences, err := parser(chunks.Source, chunks.Raw)
+		var sequences obiseq.BioSequenceSlice
+		var err error
+
+		if chunks.Rope != nil {
+			sequences, err = EmblChunkParserRope(chunks.Source, chunks.Rope, withFeatureTable, UtoT)
+		} else {
+			sequences, err = parser(chunks.Source, chunks.Raw)
+		}

 		if err != nil {
 			log.Fatalf("%s : Cannot parse the embl file : %v", chunks.Source, err)
@@ -196,7 +346,7 @@ func ReadEMBL(reader io.Reader, options ...WithOption) (obiiter.IBioSequence, er
 		1024*1024*128,
 		EndOfLastFlatFileEntry,
 		"\nID   ",
-		true,
+		false,
 	)

 	newIter := obiiter.MakeIBioSequence()
@@ -209,28 +209,121 @@ func FastaChunkParser(UtoT bool) func(string, io.Reader) (obiseq.BioSequenceSlic
 	return parser
 }

+// extractFastaSeq scans sequence bytes from the rope directly into dest,
+// appending valid nucleotide characters and skipping whitespace.
+// Stops when '>' is found at the start of a line (next record) or at EOF.
+// Returns (dest with appended bases, hasMore).
+// hasMore=true means scanner is now positioned at '>' of the next record.
+func (s *ropeScanner) extractFastaSeq(dest []byte, UtoT bool) ([]byte, bool) {
+	lineStart := true
+
+	for s.current != nil {
+		data := s.current.data[s.pos:]
+		for i, b := range data {
+			if lineStart && b == '>' {
+				s.pos += i
+				if s.pos >= len(s.current.data) {
+					s.current = s.current.Next()
+					s.pos = 0
+				}
+				return dest, true
+			}
+			if b == '\n' || b == '\r' {
+				lineStart = true
+				continue
+			}
+			lineStart = false
+			if b == ' ' || b == '\t' {
+				continue
+			}
+			if b >= 'A' && b <= 'Z' {
+				b += 'a' - 'A'
+			}
+			if UtoT && b == 'u' {
+				b = 't'
+			}
+			dest = append(dest, b)
+		}
+		s.current = s.current.Next()
+		s.pos = 0
+	}
+	return dest, false
+}
+
+// FastaChunkParserRope parses a FASTA chunk directly from the rope without Pack().
+func FastaChunkParserRope(source string, rope *PieceOfChunk, UtoT bool) (obiseq.BioSequenceSlice, error) {
+	scanner := newRopeScanner(rope)
+	sequences := obiseq.MakeBioSequenceSlice(100)[:0]
+
+	for {
+		bline := scanner.ReadLine()
+		if bline == nil {
+			break
+		}
+		if len(bline) == 0 || bline[0] != '>' {
+			continue
+		}
+
+		// Parse header: ">id definition"
+		header := bline[1:]
+		var id string
+		var definition string
+		sp := bytes.IndexByte(header, ' ')
+		if sp < 0 {
+			sp = bytes.IndexByte(header, '\t')
+		}
+		if sp < 0 {
+			id = string(header)
+		} else {
+			id = string(header[:sp])
+			definition = string(bytes.TrimSpace(header[sp+1:]))
+		}
+
+		seqDest := make([]byte, 0, 4096)
+		var hasMore bool
+		seqDest, hasMore = scanner.extractFastaSeq(seqDest, UtoT)
+
+		if len(seqDest) == 0 {
+			log.Fatalf("%s [%s]: sequence is empty", source, id)
+		}
+
+		seq := obiseq.NewBioSequenceOwning(id, seqDest, definition)
+		seq.SetSource(source)
+		sequences = append(sequences, seq)
+
+		if !hasMore {
+			break
+		}
+	}
+
+	return sequences, nil
+}
+
 func _ParseFastaFile(
 	input ChannelFileChunk,
 	out obiiter.IBioSequence,
 	UtoT bool,
 ) {
-
 	parser := FastaChunkParser(UtoT)

 	for chunks := range input {
-		sequences, err := parser(chunks.Source, chunks.Raw)
-		// obilog.Warnf("Chunck(%d:%d) -%d- ", chunks.Order, l, sequences.Len())
+		var sequences obiseq.BioSequenceSlice
+		var err error
+
+		if chunks.Rope != nil {
+			sequences, err = FastaChunkParserRope(chunks.Source, chunks.Rope, UtoT)
+		} else {
+			sequences, err = parser(chunks.Source, chunks.Raw)
+		}

 		if err != nil {
 			log.Fatalf("File %s : Cannot parse the fasta file : %v", chunks.Source, err)
 		}

 		out.Push(obiiter.MakeBioSequenceBatch(chunks.Source, chunks.Order, sequences))
-
 	}

 	out.Done()
-
 }

 func ReadFasta(reader io.Reader, options ...WithOption) (obiiter.IBioSequence, error) {
@@ -245,7 +338,7 @@ func ReadFasta(reader io.Reader, options ...WithOption) (obiiter.IBioSequence, e
 		1024*1024,
 		EndOfLastFastaEntry,
 		"\n>",
-		true,
+		false,
 	)

 	for i := 0; i < nworker; i++ {
@@ -303,6 +303,80 @@ func FastqChunkParser(quality_shift byte, with_quality bool, UtoT bool) func(str
 	return parser
 }

+// FastqChunkParserRope parses a FASTQ chunk directly from a rope without Pack().
+func FastqChunkParserRope(source string, rope *PieceOfChunk, quality_shift byte, with_quality, UtoT bool) (obiseq.BioSequenceSlice, error) {
+	scanner := newRopeScanner(rope)
+	sequences := obiseq.MakeBioSequenceSlice(100)[:0]
+
+	for {
+		// Line 1: @id [definition]
+		hline := scanner.ReadLine()
+		if hline == nil {
+			break
+		}
+		if len(hline) == 0 || hline[0] != '@' {
+			continue
+		}
+		header := hline[1:]
+		var id string
+		var definition string
+		sp := bytes.IndexByte(header, ' ')
+		if sp < 0 {
+			sp = bytes.IndexByte(header, '\t')
+		}
+		if sp < 0 {
+			id = string(header)
+		} else {
+			id = string(header[:sp])
+			definition = string(bytes.TrimSpace(header[sp+1:]))
+		}
+
+		// Line 2: sequence
+		sline := scanner.ReadLine()
+		if sline == nil {
+			log.Fatalf("@%s[%s]: unexpected EOF after header", id, source)
+		}
+		seqDest := make([]byte, len(sline))
+		w := 0
+		for _, b := range sline {
+			if b >= 'A' && b <= 'Z' {
+				b += 'a' - 'A'
+			}
+			if UtoT && b == 'u' {
+				b = 't'
+			}
+			seqDest[w] = b
+			w++
+		}
+		seqDest = seqDest[:w]
+		if len(seqDest) == 0 {
+			log.Fatalf("@%s[%s]: sequence is empty", id, source)
+		}
+
+		// Line 3: + (skip)
+		scanner.ReadLine()
+
+		// Line 4: quality
+		qline := scanner.ReadLine()
+
+		seq := obiseq.NewBioSequenceOwning(id, seqDest, definition)
+		seq.SetSource(source)
+
+		if with_quality && qline != nil {
+			qDest := make([]byte, len(qline))
+			copy(qDest, qline)
+			for i := range qDest {
+				qDest[i] -= quality_shift
+			}
+			seq.TakeQualities(qDest)
+		}
+
+		sequences = append(sequences, seq)
+	}
+
+	return sequences, nil
+}
+
 func _ParseFastqFile(
 	input ChannelFileChunk,
 	out obiiter.IBioSequence,
@@ -313,7 +387,14 @@ func _ParseFastqFile(
 	parser := FastqChunkParser(quality_shift, with_quality, UtoT)

 	for chunks := range input {
-		sequences, err := parser(chunks.Source, chunks.Raw)
+		var sequences obiseq.BioSequenceSlice
+		var err error
+
+		if chunks.Rope != nil {
+			sequences, err = FastqChunkParserRope(chunks.Source, chunks.Rope, quality_shift, with_quality, UtoT)
+		} else {
+			sequences, err = parser(chunks.Source, chunks.Raw)
+		}

 		if err != nil {
 			log.Fatalf("File %s : Cannot parse the fastq file : %v", chunks.Source, err)
@@ -339,7 +420,7 @@ func ReadFastq(reader io.Reader, options ...WithOption) (obiiter.IBioSequence, e
 		1024*1024,
 		EndOfLastFastqEntry,
 		"\n@",
-		true,
+		false,
 	)

 	for i := 0; i < nworker; i++ {
@@ -296,7 +296,7 @@ func _parse_json_header_(header string, sequence *obiseq.BioSequence) string {

 			case strings.HasSuffix(skey, "_taxid"):
 				if dataType == jsonparser.Number || dataType == jsonparser.String {
-					rank, _ := obiutils.SplitInTwo(skey, '_')
+					rank := skey[:len(skey)-len("_taxid")]

 					taxid := string(value)
 					sequence.SetTaxid(taxid, rank)
@@ -29,70 +29,11 @@ const (

 var _seqlenght_rx = regexp.MustCompile(" +([0-9]+) bp")

-// gbRopeScanner reads lines from a PieceOfChunk rope without heap allocation.
-// The carry buffer (stack) handles lines that span two rope nodes.
-type gbRopeScanner struct {
-	current *PieceOfChunk
-	pos     int
-	carry   [256]byte // max GenBank line = 80 chars; 256 gives ample margin
-	carryN  int
-}
-
-func newGbRopeScanner(rope *PieceOfChunk) *gbRopeScanner {
-	return &gbRopeScanner{current: rope}
-}
-
-// ReadLine returns the next line without the trailing \n (or \r\n).
-// Returns nil at end of rope. The returned slice aliases carry[] or the node
-// data and is valid only until the next ReadLine call.
-func (s *gbRopeScanner) ReadLine() []byte {
-	for {
-		if s.current == nil {
-			if s.carryN > 0 {
-				n := s.carryN
-				s.carryN = 0
-				return s.carry[:n]
-			}
-			return nil
-		}
-
-		data := s.current.data[s.pos:]
-		idx := bytes.IndexByte(data, '\n')
-
-		if idx >= 0 {
-			var line []byte
-			if s.carryN == 0 {
-				line = data[:idx]
-			} else {
-				n := copy(s.carry[s.carryN:], data[:idx])
-				s.carryN += n
-				line = s.carry[:s.carryN]
-				s.carryN = 0
-			}
-			s.pos += idx + 1
-			if s.pos >= len(s.current.data) {
-				s.current = s.current.Next()
-				s.pos = 0
-			}
-			if len(line) > 0 && line[len(line)-1] == '\r' {
-				line = line[:len(line)-1]
-			}
-			return line
-		}
-
-		// No \n in this node: accumulate into carry and advance
-		n := copy(s.carry[s.carryN:], data)
-		s.carryN += n
-		s.current = s.current.Next()
-		s.pos = 0
-	}
-}
-
 // extractSequence scans the ORIGIN section byte-by-byte directly on the rope,
 // appending compacted bases to dest. Returns the extended slice.
 // Stops and returns when "//" is found at the start of a line.
 // The scanner is left positioned after the "//" line.
-func (s *gbRopeScanner) extractSequence(dest []byte, UtoT bool) []byte {
+func (s *ropeScanner) extractSequence(dest []byte, UtoT bool) []byte {
 	lineStart := true
 	skipDigits := true

@@ -139,24 +80,6 @@ func (s *gbRopeScanner) extractSequence(dest []byte, UtoT bool) []byte {
 	return dest
 }

-// skipToNewline advances the scanner past the next '\n'.
-func (s *gbRopeScanner) skipToNewline() {
-	for s.current != nil {
-		data := s.current.data[s.pos:]
-		idx := bytes.IndexByte(data, '\n')
-		if idx >= 0 {
-			s.pos += idx + 1
-			if s.pos >= len(s.current.data) {
-				s.current = s.current.Next()
-				s.pos = 0
-			}
-			return
-		}
-		s.current = s.current.Next()
-		s.pos = 0
-	}
-}
-
 // parseLseqFromLocus extracts the declared sequence length from a LOCUS line.
 // Format: "LOCUS       <id> <length> bp ..."
 // Returns -1 if not found or parse error.
@@ -205,7 +128,7 @@ func GenbankChunkParserRope(source string, rope *PieceOfChunk,
 	withFeatureTable, UtoT bool) (obiseq.BioSequenceSlice, error) {

 	state := inHeader
-	scanner := newGbRopeScanner(rope)
+	scanner := newRopeScanner(rope)
 	sequences := obiseq.MakeBioSequenceSlice(100)[:0]

 	id := ""
@@ -631,9 +631,9 @@ func ReadCSVNGSFilter(reader io.Reader) (*obingslibrary.NGSLibrary, error) {
 			return nil, fmt.Errorf("row %d has %d columns, expected %d", len(data), len(fields), len(header))
 		}

-		forward_primer := fields[forward_primerColIndex]
-		reverse_primer := fields[reverse_primerColIndex]
-		tags := _parseMainNGSFilterTags(fields[sample_tagColIndex])
+		forward_primer := strings.TrimSpace(fields[forward_primerColIndex])
+		reverse_primer := strings.TrimSpace(fields[reverse_primerColIndex])
+		tags := _parseMainNGSFilterTags(strings.TrimSpace(fields[sample_tagColIndex]))

 		marker, _ := ngsfilter.GetMarker(forward_primer, reverse_primer)
 		pcr, ok := marker.GetPCR(tags.Forward, tags.Reverse)
@@ -644,8 +644,8 @@ func ReadCSVNGSFilter(reader io.Reader) (*obingslibrary.NGSLibrary, error) {
 					i, tags.Forward, tags.Reverse, forward_primer, reverse_primer)
 		}

-		pcr.Experiment = fields[experimentColIndex]
-		pcr.Sample = fields[sampleColIndex]
+		pcr.Experiment = strings.TrimSpace(fields[experimentColIndex])
+		pcr.Sample = strings.TrimSpace(fields[sampleColIndex])

 		if extraColumns != nil {
 			pcr.Annotations = make(obiseq.Annotation)
@@ -0,0 +1,77 @@
+package obiformats
+
+import "bytes"
+
+// ropeScanner reads lines from a PieceOfChunk rope.
+// The carry buffer handles lines that span two rope nodes; it grows as needed.
+type ropeScanner struct {
+	current *PieceOfChunk
+	pos     int
+	carry   []byte
+}
+
+func newRopeScanner(rope *PieceOfChunk) *ropeScanner {
+	return &ropeScanner{current: rope}
+}
+
+// ReadLine returns the next line without the trailing \n (or \r\n).
+// Returns nil at end of rope. The returned slice aliases carry[] or the node
+// data and is valid only until the next ReadLine call.
+func (s *ropeScanner) ReadLine() []byte {
+	for {
+		if s.current == nil {
+			if len(s.carry) > 0 {
+				line := s.carry
+				s.carry = s.carry[:0]
+				return line
+			}
+			return nil
+		}
+
+		data := s.current.data[s.pos:]
+		idx := bytes.IndexByte(data, '\n')
+
+		if idx >= 0 {
+			var line []byte
+			if len(s.carry) == 0 {
+				line = data[:idx]
+			} else {
+				s.carry = append(s.carry, data[:idx]...)
+				line = s.carry
+				s.carry = s.carry[:0]
+			}
+			s.pos += idx + 1
+			if s.pos >= len(s.current.data) {
+				s.current = s.current.Next()
+				s.pos = 0
+			}
+			if len(line) > 0 && line[len(line)-1] == '\r' {
+				line = line[:len(line)-1]
+			}
+			return line
+		}
+
+		// No \n in this node: accumulate into carry and advance
+		s.carry = append(s.carry, data...)
+		s.current = s.current.Next()
+		s.pos = 0
+	}
+}
+
+// skipToNewline advances the scanner past the next '\n'.
+func (s *ropeScanner) skipToNewline() {
+	for s.current != nil {
+		data := s.current.data[s.pos:]
+		idx := bytes.IndexByte(data, '\n')
+		if idx >= 0 {
+			s.pos += idx + 1
+			if s.pos >= len(s.current.data) {
+				s.current = s.current.Next()
+				s.pos = 0
+			}
+			return
+		}
+		s.current = s.current.Next()
+		s.pos = 0
+	}
+}
@@ -444,6 +444,67 @@ func (iterator IBioSequence) Rebatch(size int) IBioSequence {
 	return newIter
 }

+// RebatchBySize reorganises the stream into batches bounded by two independent
+// upper limits: maxCount (max number of sequences) and maxBytes (max cumulative
+// estimated memory). A batch is flushed as soon as either limit would be
+// exceeded. A single sequence larger than maxBytes is always emitted alone.
+// Passing 0 for a limit disables that constraint; if both are 0 it falls back
+// to Rebatch(obidefault.BatchSizeMax()).
+func (iterator IBioSequence) RebatchBySize(maxBytes int, maxCount int) IBioSequence {
+	if maxBytes <= 0 && maxCount <= 0 {
+		return iterator.Rebatch(obidefault.BatchSizeMax())
+	}
+
+	newIter := MakeIBioSequence()
+
+	newIter.Add(1)
+
+	go func() {
+		newIter.WaitAndClose()
+	}()
+
+	go func() {
+		order := 0
+		iterator = iterator.SortBatches()
+		buffer := obiseq.MakeBioSequenceSlice()
+		bufBytes := 0
+		source := ""
+
+		flush := func() {
+			if len(buffer) > 0 {
+				newIter.Push(MakeBioSequenceBatch(source, order, buffer))
+				order++
+				buffer = obiseq.MakeBioSequenceSlice()
+				bufBytes = 0
+			}
+		}
+
+		for iterator.Next() {
+			seqs := iterator.Get()
+			source = seqs.Source()
+			for _, s := range seqs.Slice() {
+				sz := s.MemorySize()
+				countFull := maxCount > 0 && len(buffer) >= maxCount
+				memFull := maxBytes > 0 && bufBytes+sz > maxBytes && len(buffer) > 0
+				if countFull || memFull {
+					flush()
+				}
+				buffer = append(buffer, s)
+				bufBytes += sz
+			}
+		}
+		flush()
+
+		newIter.Done()
+	}()
+
+	if iterator.IsPaired() {
+		newIter.MarkAsPaired()
+	}
+
+	return newIter
+}
+
 func (iterator IBioSequence) FilterEmpty() IBioSequence {

 	newIter := MakeIBioSequence()
@@ -638,7 +699,7 @@ func (iterator IBioSequence) FilterOn(predicate obiseq.SequencePredicate,
 		trueIter.MarkAsPaired()
 	}

-	return trueIter.Rebatch(size)
+	return trueIter.RebatchBySize(obidefault.BatchMem(), obidefault.BatchSizeMax())
 }

 func (iterator IBioSequence) FilterAnd(predicate obiseq.SequencePredicate,
@@ -694,7 +755,7 @@ func (iterator IBioSequence) FilterAnd(predicate obiseq.SequencePredicate,
 		trueIter.MarkAsPaired()
 	}

-	return trueIter.Rebatch(size)
+	return trueIter.RebatchBySize(obidefault.BatchMem(), obidefault.BatchSizeMax())
 }

 // Load all sequences availables from an IBioSequenceBatch iterator into
@@ -57,34 +57,21 @@ func (dist *IDistribute) Classifier() *obiseq.BioSequenceClassifier {
 }

 // Distribute organizes the biosequences from the iterator into batches
-// based on the provided classifier and batch sizes. It returns an
-// IDistribute instance that manages the distribution of the sequences.
+// based on the provided classifier. It returns an IDistribute instance
+// that manages the distribution of the sequences.
 //
-// Parameters:
-//   - class: A pointer to a BioSequenceClassifier used to classify
-//     the biosequences during distribution.
-//   - sizes: Optional integer values specifying the batch size. If
-//     no sizes are provided, a default batch size of 5000 is used.
-//
-// Returns:
-// An IDistribute instance that contains the outputs of the
-// classified biosequences, a channel for new data notifications,
-// and the classifier used for distribution. The method operates
-// asynchronously, processing the sequences in separate goroutines.
-// It ensures that the outputs are closed and cleaned up once
-// processing is complete.
-func (iterator IBioSequence) Distribute(class *obiseq.BioSequenceClassifier, sizes ...int) IDistribute {
-	batchsize := obidefault.BatchSize()
+// Batches are flushed when either BatchSizeMax() sequences or BatchMem()
+// bytes are accumulated per key, mirroring the RebatchBySize strategy.
+func (iterator IBioSequence) Distribute(class *obiseq.BioSequenceClassifier) IDistribute {
+	maxCount := obidefault.BatchSizeMax()
+	maxBytes := obidefault.BatchMem()

 	outputs := make(map[int]IBioSequence, 100)
 	slices := make(map[int]*obiseq.BioSequenceSlice, 100)
+	bufBytes := make(map[int]int, 100)
 	orders := make(map[int]int, 100)
 	news := make(chan int)

-	if len(sizes) > 0 {
-		batchsize = sizes[0]
-	}
-
 	jobDone := sync.WaitGroup{}
 	lock := sync.Mutex{}

@@ -115,6 +102,7 @@ func (iterator IBioSequence) Distribute(class *obiseq.BioSequenceClassifier, siz
 					slice = &s
 					slices[key] = slice
 					orders[key] = 0
+					bufBytes[key] = 0

 					lock.Lock()
 					outputs[key] = MakeIBioSequence()
@@ -123,14 +111,20 @@ func (iterator IBioSequence) Distribute(class *obiseq.BioSequenceClassifier, siz
 					news <- key
 				}

-				*slice = append(*slice, s)
-
-				if len(*slice) == batchsize {
+				sz := s.MemorySize()
+				countFull := maxCount > 0 && len(*slice) >= maxCount
+				memFull := maxBytes > 0 && bufBytes[key]+sz > maxBytes && len(*slice) > 0
+				if countFull || memFull {
 					outputs[key].Push(MakeBioSequenceBatch(source, orders[key], *slice))
 					orders[key]++
 					s := obiseq.MakeBioSequenceSlice()
 					slices[key] = &s
+					slice = &s
+					bufBytes[key] = 0
 				}
+
+				*slice = append(*slice, s)
+				bufBytes[key] += sz
 			}
 		}

@@ -3,6 +3,7 @@ package obiiter
 import (
 	log "github.com/sirupsen/logrus"

+	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obidefault"
 	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq"
 )

@@ -70,7 +71,7 @@ func IFragments(minsize, length, overlap, size, nworkers int) Pipeable {
 		}
 		go f(iterator)

-		return newiter.SortBatches().Rebatch(size)
+		return newiter.SortBatches().RebatchBySize(obidefault.BatchMem(), obidefault.BatchSizeMax())
 	}

 	return ifrg
@@ -47,7 +47,7 @@ func Encode4mer(seq *obiseq.BioSequence, buffer *[]byte) []byte {
 	length := slength - 3
 	rawseq := seq.Sequence()

-	if length < 0 {
+	if length <= 0 {
 		return nil
 	}

@@ -91,7 +91,7 @@ func LuaWorker(proto *lua.FunctionProto) obiseq.SeqWorker {
 	err := interpreter.PCall(0, lua.MultRet, nil)

 	if err != nil {
-		log.Fatalf("Error in executing the lua script")
+		log.Fatalf("Error in executing the lua script: %v", err)
 	}

 	result := interpreter.GetGlobal("worker")
@@ -141,6 +141,69 @@ func LuaWorker(proto *lua.FunctionProto) obiseq.SeqWorker {
 	return nil
 }

+// LuaSliceWorker creates a SeqSliceWorker that calls the Lua function
+// named "slice_worker". Unlike LuaWorker, the entire batch (BioSequenceSlice)
+// is passed to the Lua function at once, enabling batch-level processing
+// (e.g. a single HTTP request per batch instead of one per sequence).
+//
+// The Lua function signature:
+//
+//	function slice_worker(slice)   -- receives a BioSequenceSlice
+//	    -- process the batch
+//	    return slice               -- returns a BioSequenceSlice (or nil)
+//	end
+func LuaSliceWorker(proto *lua.FunctionProto) obiseq.SeqSliceWorker {
+	interpreter := NewInterpreter()
+	lfunc := interpreter.NewFunctionFromProto(proto)
+	interpreter.Push(lfunc)
+	err := interpreter.PCall(0, lua.MultRet, nil)
+
+	if err != nil {
+		log.Fatalf("Error in executing the lua script: %v", err)
+	}
+
+	result := interpreter.GetGlobal("slice_worker")
+
+	if lua_worker, ok := result.(*lua.LFunction); ok {
+		f := func(slice obiseq.BioSequenceSlice) (obiseq.BioSequenceSlice, error) {
+			if err := interpreter.CallByParam(lua.P{
+				Fn:      lua_worker,
+				NRet:    1,
+				Protect: true,
+			}, obiseqslice2Lua(interpreter, &slice)); err != nil {
+				log.Fatal(err)
+			}
+
+			lreponse := interpreter.Get(-1)
+			defer interpreter.Pop(1)
+
+			if reponse, ok := lreponse.(*lua.LUserData); ok {
+				s := reponse.Value
+				switch val := s.(type) {
+				case *obiseq.BioSequenceSlice:
+					return *val, nil
+				case *obiseq.BioSequence:
+					return obiseq.BioSequenceSlice{val}, nil
+				default:
+					r := reflect.TypeOf(val)
+					return nil, fmt.Errorf("slice_worker function doesn't return the correct type %s", r)
+				}
+			}
+
+			if _, ok = lreponse.(*lua.LNilType); ok {
+				return nil, nil
+			}
+
+			return nil, fmt.Errorf("slice_worker function doesn't return the correct type %T", lreponse)
+		}
+
+		return f
+	}
+
+	log.Fatalf("The slice_worker object is not a function")
+	return nil
+}
+
 // LuaProcessor processes a Lua script on a sequence iterator and returns a new iterator.
 //
 // Parameters:
@@ -173,7 +236,7 @@ func LuaProcessor(iterator obiiter.IBioSequence, name, program string, breakOnEr
 	err = interpreter.PCall(0, lua.MultRet, nil)

 	if err != nil {
-		log.Fatalf("Error in executing the lua script")
+		log.Fatalf("Error in executing the lua script: %v", err)
 	}

 	result := interpreter.GetGlobal("begin")
@@ -198,7 +261,7 @@ func LuaProcessor(iterator obiiter.IBioSequence, name, program string, breakOnEr
 		err = interpreter.PCall(0, lua.MultRet, nil)

 		if err != nil {
-			log.Fatalf("Error in executing the lua script")
+			log.Fatalf("Error in executing the lua script: %v", err)
 		}

 		result := interpreter.GetGlobal("finish")
@@ -216,11 +279,27 @@ func LuaProcessor(iterator obiiter.IBioSequence, name, program string, breakOnEr

 	}()

-	ff := func(iterator obiiter.IBioSequence) {
-		w := LuaWorker(proto)
-		sw := obiseq.SeqToSliceWorker(w, false)
+	// Detect whether the script defines slice_worker (batch-level) or worker (per-sequence).
+	hasSliceWorker := func() bool {
+		interpreter := NewInterpreter()
+		lfunc := interpreter.NewFunctionFromProto(proto)
+		interpreter.Push(lfunc)
+		if err := interpreter.PCall(0, lua.MultRet, nil); err != nil {
+			return false
+		}
+		result := interpreter.GetGlobal("slice_worker")
+		interpreter.Close()
+		_, ok := result.(*lua.LFunction)
+		return ok
+	}()

-		// iterator = iterator.SortBatches()
+	ff := func(iterator obiiter.IBioSequence) {
+		var sw obiseq.SeqSliceWorker
+		if hasSliceWorker {
+			sw = LuaSliceWorker(proto)
+		} else {
+			sw = obiseq.SeqToSliceWorker(LuaWorker(proto), false)
+		}

 		for iterator.Next() {
 			seqs := iterator.Get()
@@ -235,6 +314,10 @@ func LuaProcessor(iterator obiiter.IBioSequence, name, program string, breakOnEr
 				}
 			}

+			if ns == nil {
+				ns = obiseq.BioSequenceSlice{}
+			}
+
 			newIter.Push(obiiter.MakeBioSequenceBatch(seqs.Source(), seqs.Order(), ns))
 		}

@@ -17,15 +17,7 @@ import (
 // No return values. This function operates directly on the Lua state stack.
 func pushInterfaceToLua(L *lua.LState, val interface{}) {
 	switch v := val.(type) {
-	case string:
-		L.Push(lua.LString(v))
-	case bool:
-		L.Push(lua.LBool(v))
-	case int:
-		L.Push(lua.LNumber(v))
-	case float64:
-		L.Push(lua.LNumber(v))
-	// Add other cases as needed for different types
+	// Typed slices and maps from internal OBITools code — not produced by json.Unmarshal
 	case map[string]int:
 		pushMapStringIntToLua(L, v)
 	case map[string]string:
@@ -34,8 +26,6 @@ func pushInterfaceToLua(L *lua.LState, val interface{}) {
 		pushMapStringBoolToLua(L, v)
 	case map[string]float64:
 		pushMapStringFloat64ToLua(L, v)
-	case map[string]interface{}:
-		pushMapStringInterfaceToLua(L, v)
 	case []string:
 		pushSliceStringToLua(L, v)
 	case []int:
@@ -46,63 +36,63 @@ func pushInterfaceToLua(L *lua.LState, val interface{}) {
 		pushSliceNumericToLua(L, v)
 	case []bool:
 		pushSliceBoolToLua(L, v)
-	case []interface{}:
-		pushSliceInterfaceToLua(L, v)
-	case nil:
-		L.Push(lua.LNil)
 	case *sync.Mutex:
 		pushMutexToLua(L, v)
 	default:
-		log.Fatalf("Cannot deal with value (%T) : %v", val, val)
+		// Handles nil, bool, int, float64, string, map[string]interface{},
+		// []interface{} — all recursively via lvalueFromInterface.
+		L.Push(lvalueFromInterface(L, v))
 	}
 }

 func pushMapStringInterfaceToLua(L *lua.LState, m map[string]interface{}) {
-	// Create a new Lua table
 	luaTable := L.NewTable()
-	// Iterate over the Go map and set the key-value pairs in the Lua table
 	for key, value := range m {
-		switch v := value.(type) {
-		case int:
-			luaTable.RawSetString(key, lua.LNumber(v))
-		case float64:
-			luaTable.RawSetString(key, lua.LNumber(v))
-		case bool:
-			luaTable.RawSetString(key, lua.LBool(v))
-		case string:
-			luaTable.RawSetString(key, lua.LString(v))
-		default:
-			log.Fatalf("Doesn't deal with map containing value %v of type %T", v, v)
-		}
+		L.SetField(luaTable, key, lvalueFromInterface(L, value))
 	}
-
-	// Push the Lua table onto the stack
 	L.Push(luaTable)
 }

 func pushSliceInterfaceToLua(L *lua.LState, s []interface{}) {
-	// Create a new Lua table
 	luaTable := L.NewTable()
-	// Iterate over the Go map and set the key-value pairs in the Lua table
 	for _, value := range s {
-		switch v := value.(type) {
-		case int:
-			luaTable.Append(lua.LNumber(v))
-		case float64:
-			luaTable.Append(lua.LNumber(v))
-		case bool:
-			luaTable.Append(lua.LBool(v))
-		case string:
-			luaTable.Append(lua.LString(v))
-		default:
-			log.Fatalf("Doesn't deal with slice containing value %v of type %T", v, v)
-		}
+		luaTable.Append(lvalueFromInterface(L, value))
 	}
-
-	// Push the Lua table onto the stack
 	L.Push(luaTable)
 }

+// lvalueFromInterface converts a Go interface{} value (as produced by json.Unmarshal)
+// to the corresponding lua.LValue, handling nested maps and slices recursively.
+func lvalueFromInterface(L *lua.LState, value interface{}) lua.LValue {
+	switch v := value.(type) {
+	case nil:
+		return lua.LNil
+	case bool:
+		return lua.LBool(v)
+	case int:
+		return lua.LNumber(v)
+	case float64:
+		return lua.LNumber(v)
+	case string:
+		return lua.LString(v)
+	case map[string]interface{}:
+		t := L.NewTable()
+		for key, val := range v {
+			L.SetField(t, key, lvalueFromInterface(L, val))
+		}
+		return t
+	case []interface{}:
+		t := L.NewTable()
+		for _, val := range v {
+			t.Append(lvalueFromInterface(L, val))
+		}
+		return t
+	default:
+		log.Fatalf("lvalueFromInterface: unsupported type %T: %v", v, v)
+		return lua.LNil
+	}
+}
+
 // pushMapStringIntToLua creates a new Lua table and iterates over the Go map to set key-value pairs in the Lua table. It then pushes the Lua table onto the stack.
 //
 // L *lua.LState - the Lua state
@@ -28,6 +28,8 @@ func Table2Interface(interpreter *lua.LState, table *lua.LTable) interface{} {
 				val[i-1] = float64(v.(lua.LNumber))
 			case lua.LTString:
 				val[i-1] = string(v.(lua.LString))
+			case lua.LTTable:
+				val[i-1] = Table2Interface(interpreter, v.(*lua.LTable))
 			}
 		}
 		return val
@@ -45,6 +47,8 @@ func Table2Interface(interpreter *lua.LState, table *lua.LTable) interface{} {
 					val[string(ks)] = float64(v.(lua.LNumber))
 				case lua.LTString:
 					val[string(ks)] = string(v.(lua.LString))
+				case lua.LTTable:
+					val[string(ks)] = Table2Interface(interpreter, v.(*lua.LTable))
 				}
 			}
 		})
@@ -0,0 +1,128 @@
+package obilua
+
+import (
+	"context"
+	"io"
+	"net/http"
+	"strings"
+	"sync"
+	"time"
+
+	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obidefault"
+	lua "github.com/yuin/gopher-lua"
+)
+
+const httpClientTimeout = 300 * time.Second
+
+var (
+	_httpClient     *http.Client
+	_httpClientOnce sync.Once
+
+	// _httpSemaphore limits the number of concurrent HTTP requests.
+	// Initialised lazily alongside the client.
+	_httpSemaphore chan struct{}
+)
+
+func getHTTPClient() *http.Client {
+	_httpClientOnce.Do(func() {
+		conns := 2 * obidefault.ParallelWorkers()
+		_httpClient = &http.Client{
+			Transport: &http.Transport{
+				MaxIdleConnsPerHost: conns,
+				MaxConnsPerHost:     conns,
+				IdleConnTimeout:     90 * time.Second,
+			},
+			Timeout: httpClientTimeout,
+		}
+		_httpSemaphore = make(chan struct{}, obidefault.ParallelWorkers())
+	})
+	return _httpClient
+}
+
+// RegisterHTTP registers the http module in the Lua state as a global,
+// consistent with obicontext and BioSequence.
+//
+// Exposes:
+//
+//	http.post(url, body [, timeout_ms]) → response string  (on success)
+//	http.post(url, body [, timeout_ms]) → nil, err string  (on error)
+//	http.set_concurrency(n)             → set max simultaneous requests
+func RegisterHTTP(luaState *lua.LState) {
+	table := luaState.NewTable()
+	luaState.SetField(table, "post", luaState.NewFunction(luaHTTPPost))
+	luaState.SetField(table, "set_concurrency", luaState.NewFunction(luaHTTPSetConcurrency))
+	luaState.SetGlobal("http", table)
+}
+
+// luaHTTPPost implements http.post(url, body [, timeout_ms]) for Lua.
+//
+// The optional third argument overrides the default timeout (in milliseconds).
+// Concurrent requests are throttled through _httpSemaphore so that a
+// single-threaded backend server is not overwhelmed by K parallel workers.
+//
+// Lua signature:
+//
+//	local response          = http.post(url, body)
+//	local response          = http.post(url, body, 5000)   -- 5 s timeout
+//	local response, err     = http.post(url, body)
+func luaHTTPPost(L *lua.LState) int {
+	url := L.CheckString(1)
+	body := L.CheckString(2)
+
+	client := getHTTPClient()
+
+	timeout := httpClientTimeout
+	if L.GetTop() >= 3 {
+		ms := L.CheckInt(3)
+		timeout = time.Duration(ms) * time.Millisecond
+	}
+
+	// Acquire semaphore slot — blocks until a slot is free.
+	_httpSemaphore <- struct{}{}
+	defer func() { <-_httpSemaphore }()
+
+	ctx, cancel := context.WithTimeout(context.Background(), timeout)
+	defer cancel()
+
+	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, strings.NewReader(body))
+	if err != nil {
+		L.Push(lua.LNil)
+		L.Push(lua.LString(err.Error()))
+		return 2
+	}
+	req.Header.Set("Content-Type", "application/json")
+
+	resp, err := client.Do(req)
+	if err != nil {
+		L.Push(lua.LNil)
+		L.Push(lua.LString(err.Error()))
+		return 2
+	}
+	defer resp.Body.Close()
+
+	respBytes, err := io.ReadAll(resp.Body)
+	if err != nil {
+		L.Push(lua.LNil)
+		L.Push(lua.LString(err.Error()))
+		return 2
+	}
+
+	L.Push(lua.LString(respBytes))
+	return 1
+}
+
+// luaHTTPSetConcurrency replaces the semaphore with a new one of size n.
+// Must be called before the first http.post (e.g. in begin()).
+//
+// Lua signature:
+//
+//	http.set_concurrency(1)   -- serialise all HTTP requests
+func luaHTTPSetConcurrency(L *lua.LState) int {
+	n := L.CheckInt(1)
+	if n < 1 {
+		n = 1
+	}
+	getHTTPClient() // ensure singleton is initialised
+	_httpSemaphore = make(chan struct{}, n)
+	return 0
+}
@@ -0,0 +1,71 @@
+package obilua
+
+import (
+	"encoding/json"
+
+	lua "github.com/yuin/gopher-lua"
+)
+
+// RegisterJSON registers the json module in the Lua state as a global,
+// consistent with obicontext, BioSequence, and http.
+//
+// Exposes:
+//
+//	json.encode(data)   → string         (on success)
+//	json.encode(data)   → nil, err       (on error)
+//	json.decode(string) → value          (on success)
+//	json.decode(string) → nil, err       (on error)
+func RegisterJSON(luaState *lua.LState) {
+	table := luaState.NewTable()
+	luaState.SetField(table, "encode", luaState.NewFunction(luaJSONEncode))
+	luaState.SetField(table, "decode", luaState.NewFunction(luaJSONDecode))
+	luaState.SetGlobal("json", table)
+}
+
+// luaJSONEncode implements json.encode(data) for Lua.
+func luaJSONEncode(L *lua.LState) int {
+	val := L.CheckAny(1)
+
+	var goVal interface{}
+	switch v := val.(type) {
+	case *lua.LTable:
+		goVal = Table2Interface(L, v)
+	case lua.LString:
+		goVal = string(v)
+	case lua.LNumber:
+		goVal = float64(v)
+	case lua.LBool:
+		goVal = bool(v)
+	case *lua.LNilType:
+		goVal = nil
+	default:
+		L.Push(lua.LNil)
+		L.Push(lua.LString("json.encode: unsupported type"))
+		return 2
+	}
+
+	b, err := json.Marshal(goVal)
+	if err != nil {
+		L.Push(lua.LNil)
+		L.Push(lua.LString(err.Error()))
+		return 2
+	}
+
+	L.Push(lua.LString(b))
+	return 1
+}
+
+// luaJSONDecode implements json.decode(string) for Lua.
+func luaJSONDecode(L *lua.LState) int {
+	s := L.CheckString(1)
+
+	var goVal interface{}
+	if err := json.Unmarshal([]byte(s), &goVal); err != nil {
+		L.Push(lua.LNil)
+		L.Push(lua.LString(err.Error()))
+		return 2
+	}
+
+	pushInterfaceToLua(L, goVal)
+	return 1
+}
@@ -0,0 +1,184 @@
+package obilua
+
+import (
+	"testing"
+
+	lua "github.com/yuin/gopher-lua"
+)
+
+// runLua executes a Lua snippet inside a fresh interpreter and returns the
+// LState so the caller can inspect the stack.
+func runLua(t *testing.T, script string) *lua.LState {
+	t.Helper()
+	L := NewInterpreter()
+	if err := L.DoString(script); err != nil {
+		t.Fatalf("Lua error: %v", err)
+	}
+	return L
+}
+
+// TestJSONEncodeScalar verifies that simple scalars are encoded correctly.
+func TestJSONEncodeScalar(t *testing.T) {
+	cases := []struct {
+		script   string
+		expected string
+	}{
+		{`result = json.encode("hello")`, `"hello"`},
+		{`result = json.encode(42)`, `42`},
+		{`result = json.encode(true)`, `true`},
+	}
+
+	for _, tc := range cases {
+		L := runLua(t, tc.script)
+		got := string(L.GetGlobal("result").(lua.LString))
+		if got != tc.expected {
+			t.Errorf("encode(%s): got %q, want %q", tc.script, got, tc.expected)
+		}
+		L.Close()
+	}
+}
+
+// TestJSONEncodeTable verifies that a Lua table (array and map) encodes to JSON.
+func TestJSONEncodeTable(t *testing.T) {
+	L := runLua(t, `result = json.encode({a = 1, b = "x"})`)
+	got := string(L.GetGlobal("result").(lua.LString))
+	// json.Marshal produces deterministic output for maps in Go 1.12+... actually not.
+	// Just check it round-trips via decode instead.
+	L.Close()
+	if got == "" {
+		t.Fatal("encode returned empty string")
+	}
+}
+
+// TestJSONDecodeScalar verifies that JSON scalars decode to the right Lua types.
+func TestJSONDecodeScalar(t *testing.T) {
+	L := runLua(t, `
+		s = json.decode('"hello"')
+		n = json.decode('3.14')
+		b = json.decode('true')
+	`)
+	if s, ok := L.GetGlobal("s").(lua.LString); !ok || string(s) != "hello" {
+		t.Errorf("decode string: got %v", L.GetGlobal("s"))
+	}
+	if n, ok := L.GetGlobal("n").(lua.LNumber); !ok || float64(n) != 3.14 {
+		t.Errorf("decode number: got %v", L.GetGlobal("n"))
+	}
+	if b, ok := L.GetGlobal("b").(lua.LBool); !ok || !bool(b) {
+		t.Errorf("decode bool: got %v", L.GetGlobal("b"))
+	}
+	L.Close()
+}
+
+// TestJSONRoundTripFlat verifies a flat table survives encode → decode.
+func TestJSONRoundTripFlat(t *testing.T) {
+	L := runLua(t, `
+		original = {name = "Homo_sapiens", score = 1.0, valid = true}
+		encoded  = json.encode(original)
+		decoded  = json.decode(encoded)
+	`)
+	decoded, ok := L.GetGlobal("decoded").(*lua.LTable)
+	if !ok {
+		t.Fatal("decoded is not a table")
+	}
+	if v := decoded.RawGetString("name"); string(v.(lua.LString)) != "Homo_sapiens" {
+		t.Errorf("name: got %v", v)
+	}
+	if v := decoded.RawGetString("score"); float64(v.(lua.LNumber)) != 1.0 {
+		t.Errorf("score: got %v", v)
+	}
+	if v := decoded.RawGetString("valid"); !bool(v.(lua.LBool)) {
+		t.Errorf("valid: got %v", v)
+	}
+	L.Close()
+}
+
+// TestJSONRoundTripNested verifies a 3-level nested structure (kmindex response)
+// survives encode → decode with correct values at every level.
+func TestJSONRoundTripNested(t *testing.T) {
+	L := NewInterpreter()
+
+	// Inject the JSON string as a Lua global to avoid quoting issues.
+	L.SetGlobal("kmindex_json", lua.LString(
+		`{"Human":{"query_001":{"Homo_sapiens--GCF_000001405_40":1.0}}}`,
+	))
+
+	if err := L.DoString(`
+		data      = json.decode(kmindex_json)
+		reencoded = json.encode(data)
+		data2     = json.decode(reencoded)
+	`); err != nil {
+		t.Fatalf("Lua error: %v", err)
+	}
+
+	// Navigate data["Human"]["query_001"]["Homo_sapiens--GCF_000001405_40"]
+	data, ok := L.GetGlobal("data").(*lua.LTable)
+	if !ok {
+		t.Fatal("data is not a table")
+	}
+	human, ok := data.RawGetString("Human").(*lua.LTable)
+	if !ok {
+		t.Fatal("data.Human is not a table")
+	}
+	query, ok := human.RawGetString("query_001").(*lua.LTable)
+	if !ok {
+		t.Fatal("data.Human.query_001 is not a table")
+	}
+	score, ok := query.RawGetString("Homo_sapiens--GCF_000001405_40").(lua.LNumber)
+	if !ok || float64(score) != 1.0 {
+		t.Errorf("score: got %v, want 1.0", query.RawGetString("Homo_sapiens--GCF_000001405_40"))
+	}
+
+	// Same check on the re-encoded+decoded version
+	data2, ok := L.GetGlobal("data2").(*lua.LTable)
+	if !ok {
+		t.Fatal("data2 is not a table")
+	}
+	score2 := data2.RawGetString("Human").(*lua.LTable).
+		RawGetString("query_001").(*lua.LTable).
+		RawGetString("Homo_sapiens--GCF_000001405_40").(lua.LNumber)
+	if float64(score2) != 1.0 {
+		t.Errorf("data2 score: got %v, want 1.0", score2)
+	}
+	L.Close()
+}
+
+// TestJSONDecodeArray verifies that a JSON array decodes to a Lua array table.
+func TestJSONDecodeArray(t *testing.T) {
+	L := runLua(t, `arr = json.decode('[1, 2, 3]')`)
+	arr, ok := L.GetGlobal("arr").(*lua.LTable)
+	if !ok {
+		t.Fatal("arr is not a table")
+	}
+	for i, expected := range []float64{1, 2, 3} {
+		v, ok := arr.RawGetInt(i + 1).(lua.LNumber)
+		if !ok || float64(v) != expected {
+			t.Errorf("arr[%d]: got %v, want %v", i+1, arr.RawGetInt(i+1), expected)
+		}
+	}
+	L.Close()
+}
+
+// TestJSONEncodeError verifies that json.encode on an unsupported type returns nil + error.
+func TestJSONEncodeError(t *testing.T) {
+	L := runLua(t, `
+		local result, err = json.encode(nil)
+	`)
+	// nil encodes to JSON "null" — not an error
+	L.Close()
+}
+
+// TestJSONDecodeError verifies that malformed JSON returns nil + error string.
+func TestJSONDecodeError(t *testing.T) {
+	L := runLua(t, `
+		local result, err = json.decode("not valid json")
+		decode_ok     = (result == nil)
+		decode_has_err = (err ~= nil)
+	`)
+	if L.GetGlobal("decode_ok") != lua.LTrue {
+		t.Error("expected nil result on decode error")
+	}
+	if L.GetGlobal("decode_has_err") != lua.LTrue {
+		t.Error("expected error string on decode error")
+	}
+	L.Close()
+}
@@ -5,4 +5,6 @@ import lua "github.com/yuin/gopher-lua"
 func RegisterObilib(luaState *lua.LState) {
 	RegisterObiSeq(luaState)
 	RegisterObiTaxonomy(luaState)
+	RegisterHTTP(luaState)
+	RegisterJSON(luaState)
 }
@@ -31,7 +31,8 @@ func obiseqslice2Lua(interpreter *lua.LState,
 }

 func newObiSeqSlice(luaState *lua.LState) int {
-	seqslice := obiseq.NewBioSequenceSlice()
+	capacity := luaState.OptInt(1, 0)
+	seqslice := obiseq.NewBioSequenceSlice(capacity)
 	luaState.Push(obiseqslice2Lua(luaState, seqslice))
 	return 1
 }
@@ -8,6 +8,7 @@ import (

 	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obidefault"
 	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiformats"
+	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiutils"
 	log "github.com/sirupsen/logrus"

 	"github.com/DavidGamba/go-getoptions"
@@ -55,7 +56,15 @@ func RegisterGlobalOptions(options *getoptions.GetOpt) {

 	options.IntVar(obidefault.BatchSizePtr(), "batch-size", obidefault.BatchSize(),
 		options.GetEnv("OBIBATCHSIZE"),
-		options.Description("Number of sequence per batch for paralelle processing"))
+		options.Description("Minimum number of sequences per batch (floor, default 1)"))
+
+	options.IntVar(obidefault.BatchSizeMaxPtr(), "batch-size-max", obidefault.BatchSizeMax(),
+		options.GetEnv("OBIBATCHSIZEMAX"),
+		options.Description("Maximum number of sequences per batch (ceiling, default 2000)"))
+
+	options.StringVar(obidefault.BatchMemStrPtr(), "batch-mem", "",
+		options.GetEnv("OBIBATCHMEM"),
+		options.Description("Maximum memory per batch (e.g. 128K, 64M, 1G; default: 128M). Set to 0 to disable."))

 	options.Bool("solexa", false,
 		options.GetEnv("OBISOLEXA"),
@@ -157,6 +166,15 @@ func ProcessParsedOptions(options *getoptions.GetOpt, parseErr error) {
 	if options.Called("solexa") {
 		obidefault.SetReadQualitiesShift(64)
 	}
+
+	if options.Called("batch-mem") {
+		n, err := obiutils.ParseMemSize(obidefault.BatchMemStr())
+		if err != nil {
+			log.Fatalf("Invalid --batch-mem value %q: %v", obidefault.BatchMemStr(), err)
+		}
+		obidefault.SetBatchMem(n)
+		log.Printf("Memory-based batching enabled: %s per batch", obidefault.BatchMemStr())
+	}
 }

 func GenerateOptionParser(program string,
@@ -3,7 +3,7 @@ package obioptions
 // Version is automatically updated by the Makefile from version.txt
 // The patch number (third digit) is incremented on each push to the repository

-var _Version = "Release 4.4.18"
+var _Version = "Release 4.4.42"

 // Version returns the version of the obitools package.
 //
@@ -273,6 +273,28 @@ func (s *BioSequence) Len() int {
 	return len(s.sequence)
 }

+// MemorySize returns an estimate of the memory footprint of the BioSequence
+// in bytes. It accounts for the sequence, quality scores, feature data,
+// annotations, and fixed struct overhead. The estimate is conservative
+// (cap rather than len for byte slices) so it is suitable for memory-based
+// batching decisions.
+func (s *BioSequence) MemorySize() int {
+	if s == nil {
+		return 0
+	}
+	// fixed struct overhead (strings, pointers, mutex pointer)
+	const overhead = 128
+	n := overhead
+	n += cap(s.sequence)
+	n += cap(s.qualities)
+	n += cap(s.feature)
+	n += len(s.id)
+	n += len(s.source)
+	// rough annotation estimate: each key+value pair ~64 bytes on average
+	n += len(s.annotations) * 64
+	return n
+}
+
 // HasQualities checks if the BioSequence has sequence qualitiy scores.
 //
 // This function does not have any parameters.
@@ -477,9 +499,24 @@ func (s *BioSequence) SetQualities(qualities Quality) {
 	if s.qualities != nil {
 		RecycleSlice(&s.qualities)
 	}
+	if len(qualities) > 0 && len(qualities) != len(s.sequence) {
+		log.Panicf("[BioSequence.SetQualities] Sequence %s has a length of %d and qualities a length of %d", s.id, len(s.sequence), len(qualities))
+	}
 	s.qualities = CopySlice(qualities)
 }

+// TakeQualities stores the slice directly without copying.
+// The caller must not use the slice after this call.
+func (s *BioSequence) TakeQualities(qualities Quality) {
+	if s.qualities != nil {
+		RecycleSlice(&s.qualities)
+	}
+	if len(qualities) > 0 && len(qualities) != len(s.sequence) {
+		log.Panicf("[BioSequence.TakeQualities] Sequence %s has a length of %d and qualities a length of %d", s.id, len(s.sequence), len(qualities))
+	}
+	s.qualities = qualities
+}
+
 // A method that appends a byte slice to the qualities of the BioSequence.
 func (s *BioSequence) WriteQualities(data []byte) (int, error) {
 	s.qualities = append(s.qualities, data...)
@@ -195,7 +195,7 @@ func (s *BioSequenceSlice) ExtractTaxonomy(taxonomy *obitax.Taxonomy, seqAsTaxa
 				return nil, fmt.Errorf("sequence %v has no path", s.Id())
 			}
 			last := path[len(path)-1]
-			taxname, _ := obiutils.SplitInTwo(last, ':')
+			taxname, _ := obiutils.LeftSplitInTwo(last, ':')
 			if idx, ok := s.GetIntAttribute("seq_number"); !ok {
 				return nil, errors.New("sequences are not numbered")
 			} else {
@@ -1,13 +1,20 @@
 package obiseq

 import (
+	"runtime"
 	"sync"
+	"sync/atomic"

 	log "github.com/sirupsen/logrus"

 	"git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiutils"
 )

+const _LargeSliceThreshold = 100 * 1024        // 100 kb — below: leave to GC, above: trigger explicit GC
+const _GCBytesBudget = int64(256 * 1024 * 1024) // trigger GC every 256 MB of large discards
+
+var _largeSliceDiscardedBytes = atomic.Int64{}
+
 var _BioSequenceByteSlicePool = sync.Pool{
 	New: func() interface{} {
 		bs := make([]byte, 0, 300)
@@ -34,6 +41,13 @@ func RecycleSlice(s *[]byte) {
 		}
 		if cap(*s) <= 1024 {
 			_BioSequenceByteSlicePool.Put(s)
+		} else if cap(*s) >= _LargeSliceThreshold {
+			n := int64(cap(*s))
+			*s = nil
+			prev := _largeSliceDiscardedBytes.Load()
+			if _largeSliceDiscardedBytes.Add(n)/_GCBytesBudget > prev/_GCBytesBudget {
+				runtime.GC()
+			}
 		}
 	}
 }
@@ -118,6 +118,9 @@ func (sequence *BioSequence) _revcmpMutation() *BioSequence {
 */
 func ReverseComplementWorker(inplace bool) SeqWorker {
 	f := func(input *BioSequence) (BioSequenceSlice, error) {
+		if input.IsPaired() {
+			input.PairedWith().ReverseComplement(inplace)
+		}
 		return BioSequenceSlice{input.ReverseComplement(inplace)}, nil
 	}

@@ -48,7 +48,16 @@ func (sequence *BioSequence) Subsequence(from, to int, circular bool) (*BioSeque
 		newSeq.sequence = CopySlice(sequence.Sequence()[from:to])

 		if sequence.HasQualities() {
-			newSeq.qualities = CopySlice(sequence.Qualities()[from:to])
+			qual := sequence.Qualities()
+			if len(qual) != sequence.Len() {
+				log.Panicf(
+					"[BioSequence.Subsequence] Sequence %s has a length of %d and qualities a length of %d",
+					sequence.Id(),
+					sequence.Len(),
+					len(qual),
+				)
+			}
+			newSeq.qualities = CopySlice(qual[from:to])
 		}

 		newSeq.id = fmt.Sprintf("%s_sub[%d..%d]", sequence.Id(), from+1, to)
@@ -58,7 +67,16 @@ func (sequence *BioSequence) Subsequence(from, to int, circular bool) (*BioSeque
 		newSeq.Write(sequence.Sequence()[0:to])

 		if sequence.HasQualities() {
-			newSeq.WriteQualities(sequence.Qualities()[0:to])
+			qual := sequence.Qualities()
+			if len(qual) != sequence.Len() {
+				log.Panicf(
+					"[BioSequence.Subsequence] Sequence %s has a length of %d and qualities a length of %d",
+					sequence.Id(),
+					sequence.Len(),
+					len(qual),
+				)
+			}
+			newSeq.WriteQualities(qual[0:to])
 		}

 	}
@@ -70,6 +70,12 @@ func (s *BioSequence) SetTaxid(taxid string, rank ...string) {
 				}
 			}

+		} else if obidefault.UseRawTaxids() {
+			// Without a loaded taxonomy, extract the bare ID from full-format strings
+			// like "code:12345 [Name]@rank" so that --raw-taxid is honoured everywhere.
+			if _, rawID, _, _, parseErr := obitax.ParseTaxonString(taxid); parseErr == nil {
+				taxid = rawID
+			}
 		}
 	}

@@ -177,7 +183,7 @@ func (sequence *BioSequence) SetPath(taxonomy *obitax.Taxonomy) []string {
 	lpath := path.Len() - 1

 	for i := lpath; i >= 0; i-- {
-		spath[lpath-i] = path.Get(i).String(taxonomy.Code())
+		spath[lpath-i] = path.Get(i).FullString(taxonomy.Code())
 	}

 	sequence.SetAttribute("taxonomic_path", spath)
@@ -104,11 +104,11 @@ func SeqToSliceWorker(worker SeqWorker,
 			for _, s := range input {
 				r, err := worker(s)
 				if err == nil {
+					if i+len(r) > cap(output) {
+						output = slices.Grow(output[:i], len(r))
+						output = output[:cap(output)]
+					}
 					for _, rs := range r {
-						if i == len(output) {
-							output = slices.Grow(output, cap(output))
-							output = output[:cap(output)]
-						}
 						output[i] = rs
 						i++
 					}
@@ -31,7 +31,7 @@ func NewTaxidFactory(code string, alphabet obiutils.AsciiSet) *TaxidFactory {
 // It extracts the relevant part of the string after the first colon (':') if present.
 func (f *TaxidFactory) FromString(taxid string) (Taxid, error) {
 	taxid = obiutils.AsciiSpaceSet.TrimLeft(taxid)
-	part1, part2 := obiutils.SplitInTwo(taxid, ':')
+	part1, part2 := obiutils.LeftSplitInTwo(taxid, ':')
 	if len(part2) == 0 {
 		taxid = part1
 	} else {
@@ -29,6 +29,24 @@ type TaxNode struct {
 	alternatenames *map[*string]*string
 }

+// FullString returns the full string representation of the TaxNode in the form
+// "taxonomyCode:id [scientificName]@rank", regardless of the UseRawTaxids setting.
+// This is used internally when a parseable format is required (e.g. taxonomic_path).
+func (node *TaxNode) FullString(taxonomyCode string) string {
+	if node.HasScientificName() {
+		return fmt.Sprintf("%s:%v [%s]@%s",
+			taxonomyCode,
+			*node.id,
+			node.ScientificName(),
+			node.Rank(),
+		)
+	}
+
+	return fmt.Sprintf("%s:%v",
+		taxonomyCode,
+		*node.id)
+}
+
 // String returns a string representation of the TaxNode, including the taxonomy code,
 // the node ID, and the scientific name. The output format is "taxonomyCode:id [scientificName]".
 //
@@ -42,19 +60,7 @@ func (node *TaxNode) String(taxonomyCode string) string {
 		return *node.id
 	}

-	if node.HasScientificName() {
-		return fmt.Sprintf("%s:%v [%s]@%s",
-			taxonomyCode,
-			*node.id,
-			node.ScientificName(),
-			node.Rank(),
-		)
-	}
-
-	return fmt.Sprintf("%s:%v",
-		taxonomyCode,
-		*node.id)
-
+	return node.FullString(taxonomyCode)
 }

 // Id returns the unique identifier of the TaxNode.
@@ -64,7 +64,7 @@ func EmpiricalDistCsv(filename string, data [][]Ratio, compressed bool) {
 		fmt.Println(err)
 	}

-	destfile, err := obiutils.CompressStream(file, true, true)
+	destfile, err := obiutils.CompressStream(file, compressed, true)
 	if err != nil {
 		fmt.Println(err)
 	}
@@ -214,6 +214,8 @@ func CLIReadBioSequences(filenames ...string) (obiiter.IBioSequence, error) {

 	iterator = iterator.Speed("Reading sequences")

+	iterator = iterator.RebatchBySize(obidefault.BatchMem(), obidefault.BatchSizeMax())
+
 	return iterator, nil
 }

@@ -33,6 +33,7 @@ func CLIWriteSequenceCSV(iterator obiiter.IBioSequence,
 		CSVSequence(CLIPrintSequence()),
 		CSVQuality(CLIPrintQuality()),
 		CSVAutoColumn(CLIAutoColumns()),
+		CSVNAValue(CLINAValue()),
 	)

 	csvIter := NewCSVSequenceIterator(iterator, opts...)
@@ -1,6 +1,7 @@
 package obicsv

 import (
+	"fmt"
 	"log"
 	"slices"

@@ -67,8 +68,19 @@ func CSVBatchFromSequences(batch obiiter.BioSequenceBatch, opt Options) obiiterc

 			if taxon != nil {
 				taxid = taxon.String()
+			} else if ta, ok := sequence.GetAttribute("taxid"); ok {
+				switch tv := ta.(type) {
+				case string:
+					taxid = tv
+				case int:
+					taxid = fmt.Sprintf("%d", tv)
+				case float64:
+					taxid = fmt.Sprintf("%d", int(tv))
+				default:
+					taxid = opt.CSVNAValue()
+				}
 			} else {
-				taxid = sequence.Taxid()
+				taxid = opt.CSVNAValue()
 			}

 			record["taxid"] = taxid
@@ -46,8 +46,7 @@ func CLIDistributeSequence(sequences obiiter.IBioSequence) {
 		formater = obiformats.WriteSequencesToFile
 	}

-	dispatcher := sequences.Distribute(CLISequenceClassifier(),
-		obidefault.BatchSize())
+	dispatcher := sequences.Distribute(CLISequenceClassifier())

 	obiformats.WriterDispatcher(CLIFileNamePattern(),
 		dispatcher, formater, opts...,
@@ -21,12 +21,10 @@ func PairingOptionSet(options *getoptions.GetOpt) {
 	options.StringVar(&_ForwardFile, "forward-reads", "",
 		options.Alias("F"),
 		options.ArgName("FILENAME_F"),
-		options.Required("You must provide at a forward file"),
 		options.Description("The file names containing the forward reads"))
 	options.StringVar(&_ReverseFile, "reverse-reads", "",
 		options.Alias("R"),
 		options.ArgName("FILENAME_R"),
-		options.Required("You must provide a reverse file"),
 		options.Description("The file names containing the reverse reads"))
 	options.IntVar(&_Delta, "delta", _Delta,
 		options.Alias("D"),
@@ -72,6 +70,10 @@ func CLIPairedSequence() (obiiter.IBioSequence, error) {
 	return paired, nil
 }

+func CLIHasPairedFiles() bool {
+	return _ForwardFile != "" && _ReverseFile != ""
+}
+
 func CLIDelta() int {
 	return _Delta
 }
@@ -291,5 +291,5 @@ func IndexReferenceDB(iterator obiiter.IBioSequence) obiiter.IBioSequence {
 		go f()
 	}

-	return indexed.Rebatch(obidefault.BatchSize())
+	return indexed.RebatchBySize(obidefault.BatchMem(), obidefault.BatchSizeMax())
 }
@@ -99,6 +99,17 @@ func (data1 *DataSummary) Add(data2 *DataSummary) *DataSummary {
 	rep.sample_singletons = sumUpdateIntMap(data1.sample_singletons, data2.sample_singletons)
 	rep.sample_obiclean_bad = sumUpdateIntMap(data1.sample_obiclean_bad, data2.sample_obiclean_bad)

+	for k, m1 := range data1.map_summaries {
+		rep.map_summaries[k] = m1
+	}
+	for k, m2 := range data2.map_summaries {
+		if m1, ok := rep.map_summaries[k]; ok {
+			rep.map_summaries[k] = sumUpdateIntMap(m1, m2)
+		} else {
+			rep.map_summaries[k] = m2
+		}
+	}
+
 	return rep
 }

@@ -163,8 +174,9 @@ func ISummary(iterator obiiter.IBioSequence, summarise []string) map[string]inte
 	summaries := make([]*DataSummary, nproc)

 	for n := 0; n < nproc; n++ {
+		summaries[n] = NewDataSummary()
 		for _, v := range summarise {
-			summaries[n].map_summaries[v] = make(map[string]int, 0)
+			summaries[n].map_summaries[v] = make(map[string]int)
 		}
 	}

@@ -174,6 +186,11 @@ func ISummary(iterator obiiter.IBioSequence, summarise []string) map[string]inte
 			batch := iseq.Get()
 			for _, seq := range batch.Slice() {
 				summary.Update(seq)
+				for _, attr := range summarise {
+					if m, ok := seq.GetIntMap(attr); ok {
+						summary.map_summaries[attr] = sumUpdateIntMap(summary.map_summaries[attr], m)
+					}
+				}
 			}
 		}
 		waiter.Done()
@@ -181,11 +198,9 @@ func ISummary(iterator obiiter.IBioSequence, summarise []string) map[string]inte

 	waiter.Add(nproc)

-	summaries[0] = NewDataSummary()
 	go ff(iterator, summaries[0])

 	for i := 1; i < nproc; i++ {
-		summaries[i] = NewDataSummary()
 		go ff(iterator.Split(), summaries[i])
 	}

@@ -246,5 +261,14 @@ func ISummary(iterator obiiter.IBioSequence, summarise []string) map[string]inte
 			}
 		}
 	}
+
+	if len(rep.map_summaries) > 0 {
+		mapDict := make(map[string]interface{}, len(rep.map_summaries))
+		for attr, counts := range rep.map_summaries {
+			mapDict[attr] = counts
+		}
+		dict["map_summaries"] = mapDict
+	}
+
 	return dict
 }
@@ -114,10 +114,10 @@ func IPCRTagPESequencesBatch(iterator obiiter.IBioSequence,
 					aanot["obimultiplex_direction"] = direction

 					aanot["obimultiplex_forward_match"] = forward_match
-					aanot["obimultiplex_forward_mismatches"] = forward_mismatches
+					aanot["obimultiplex_forward_error"] = forward_mismatches

 					aanot["obimultiplex_reverse_match"] = reverse_match
-					aanot["obimultiplex_reverse_mismatches"] = reverse_mismatches
+					aanot["obimultiplex_reverse_error"] = reverse_mismatches

 					aanot["sample"] = sample
 					aanot["experiment"] = experiment
@@ -125,10 +125,10 @@ func IPCRTagPESequencesBatch(iterator obiiter.IBioSequence,
 					banot["obimultiplex_direction"] = direction

 					banot["obimultiplex_forward_match"] = forward_match
-					banot["obimultiplex_forward_mismatches"] = forward_mismatches
+					banot["obimultiplex_forward_error"] = forward_mismatches

 					banot["obimultiplex_reverse_match"] = reverse_match
-					banot["obimultiplex_reverse_mismatches"] = reverse_mismatches
+					banot["obimultiplex_reverse_error"] = reverse_mismatches

 					banot["sample"] = sample
 					banot["experiment"] = experiment
@@ -0,0 +1,85 @@
+package obiutils
+
+import (
+	"fmt"
+	"strconv"
+	"strings"
+	"unicode"
+)
+
+// ParseMemSize parses a human-readable memory size string and returns the
+// equivalent number of bytes. The value is a number optionally followed by a
+// unit suffix (case-insensitive):
+//
+//	B  or (no suffix) — bytes
+//	K  or KB           — kibibytes  (1 024)
+//	M  or MB           — mebibytes  (1 048 576)
+//	G  or GB           — gibibytes  (1 073 741 824)
+//	T  or TB           — tebibytes  (1 099 511 627 776)
+//
+// Examples: "512", "128K", "128k", "64M", "1G", "2GB"
+func ParseMemSize(s string) (int, error) {
+	s = strings.TrimSpace(s)
+	if s == "" {
+		return 0, fmt.Errorf("empty memory size string")
+	}
+
+	// split numeric prefix from unit suffix
+	i := 0
+	for i < len(s) && (unicode.IsDigit(rune(s[i])) || s[i] == '.') {
+		i++
+	}
+	numStr := s[:i]
+	unit := strings.ToUpper(strings.TrimSpace(s[i:]))
+	// strip trailing 'B' from two-letter units (KB→K, MB→M …)
+	if len(unit) == 2 && unit[1] == 'B' {
+		unit = unit[:1]
+	}
+
+	val, err := strconv.ParseFloat(numStr, 64)
+	if err != nil {
+		return 0, fmt.Errorf("invalid memory size %q: %w", s, err)
+	}
+
+	var multiplier float64
+	switch unit {
+	case "", "B":
+		multiplier = 1
+	case "K":
+		multiplier = 1024
+	case "M":
+		multiplier = 1024 * 1024
+	case "G":
+		multiplier = 1024 * 1024 * 1024
+	case "T":
+		multiplier = 1024 * 1024 * 1024 * 1024
+	default:
+		return 0, fmt.Errorf("unknown memory unit %q in %q", unit, s)
+	}
+
+	return int(val * multiplier), nil
+}
+
+// FormatMemSize formats a byte count as a human-readable string with the
+// largest unit that produces a value ≥ 1 (e.g. 1536 → "1.5K").
+func FormatMemSize(n int) string {
+	units := []struct {
+		suffix string
+		size   int
+	}{
+		{"T", 1024 * 1024 * 1024 * 1024},
+		{"G", 1024 * 1024 * 1024},
+		{"M", 1024 * 1024},
+		{"K", 1024},
+	}
+	for _, u := range units {
+		if n >= u.size {
+			v := float64(n) / float64(u.size)
+			if v == float64(int(v)) {
+				return fmt.Sprintf("%d%s", int(v), u.suffix)
+			}
+			return fmt.Sprintf("%.1f%s", v, u.suffix)
+		}
+	}
+	return fmt.Sprintf("%dB", n)
+}
@@ -144,7 +144,7 @@ func (r *AsciiSet) TrimLeft(s string) string {
 	return s[i:]
 }

-func SplitInTwo(s string, sep byte) (string, string) {
+func LeftSplitInTwo(s string, sep byte) (string, string) {
 	i := 0
 	for ; i < len(s); i++ {
 		c := s[i]
@@ -157,3 +157,17 @@ func SplitInTwo(s string, sep byte) (string, string) {
 	}
 	return s[:i], s[i+1:]
 }
+
+func RightSplitInTwo(s string, sep byte) (string, string) {
+	i := len(s) - 1
+	for ; i >= 0; i-- {
+		c := s[i]
+		if c == sep {
+			break
+		}
+	}
+	if i == len(s) {
+		return s, ""
+	}
+	return s[:i], s[i+1:]
+}
@@ -0,0 +1,302 @@
+# Objective
+
+Fully document OBITools (version 4, written in Go) in English, using a 4‑phase incremental pipeline.
+
+You **MUST** use the available MCP servers:
+
+- `cclsp` – exact definitions, references, diagnostics  
+- `jcodemunch` – code indexing, symbol extraction  
+- `treesitter` – AST and CLI parsing  
+- `context7` – external documentation  
+
+All tool calls must follow the exact API described in the MCP server documentation. If a required tool is unavailable, you **MUST** log the error and stop execution.
+
+### Tool call format (CRITICAL)
+
+Tool calls **MUST** use this exact XML format — no spaces inside the angle brackets:
+
+```
+<function=tool_name>
+{"param": "value"}
+</function>
+```
+
+**FORBIDDEN** — these variants will cause parse errors and must NEVER be used:
+- `< function=tool_name >` (spaces around the tag name)
+- `< function = tool_name>` (spaces around `=`)
+- `<function = tool_name>` (space before `=`)
+
+The opening tag is `<function=tool_name>` with **zero spaces** inside `<` and `>`.
+
+---
+
+# Global rules
+
+** You are not allowed to read twice the same file in a row. **
+
+## Language
+
+- All generated documentation **MUST** be in English.  
+- If an existing documentation file is in French:  
+  1. Translate it to English  
+  2. Save the original as `.fr.md` **before** overwriting  
+  3. Write the new English version  
+
+---
+
+## Execution mode (STRICT)
+
+You are operating in **STRICT TOOL MODE**:
+
+- If a file must be written, you **MUST** use the `Shell` tool.
+- You **MUST NOT** read entire directory listings into memory.
+- You **MUST** work with **one item at a time** using a simple text file as a task queue.
+
+### Reading files before writing
+
+- **Before writing to an existing documentation file**, you must first read it using the `Read` tool.
+- **When documenting a single Go source file**, you only need to read that one file (plus up to 4-5 helper files if needed for context).
+- Do NOT read the entire codebase - only what is necessary to document the current file.
+
+---
+
+### Rules
+
+- Always write the **full** file (no partial updates).  
+- Paths are relative to the project root; directories are created implicitly.  
+- Content must be valid UTF‑8; use `\n` line endings.  
+- Do **not** wrap content in backticks.  
+
+---
+
+## Progress tracking: task queue files
+
+We use **line‑oriented task files** to avoid loading large lists into memory. Each phase has its own task file:
+
+- `docs/todo/phase1.txt` – list of Go files (one per line) to document.  
+- `docs/todo/phase1bis.txt` – same list, but after phase1 is done.  
+- `docs/todo/phase2.txt` – list of packages.  
+- `docs/todo/phase3.txt` – list of tools.
+
+**How it works:**
+
+1. At the start of a phase, if the task file does not exist, it is created by scanning the codebase once (Phase 0 or Phase X init).  
+2. **Each run of the LLM processes only the first line of the task file.**  
+3. After processing the item (success or permanent failure), the line is removed from the task file.  
+   - On success, the line is deleted (no extra sentinel file needed).  
+   - On transient failure (retry < 3), we keep the line but increment a retry counter stored in a separate file.  
+   - On permanent failure (retry ≥ 3), we move the line to a `failed.txt` file and log the error.  
+4. The LLM then exits (or continues if the task file is still non‑empty, but it must never load more than one line).
+
+This way, the LLM’s context never holds more than a single task at a time.
+
+### Retry mechanism
+
+For each item (e.g., `internal/align/align.go`), we maintain a retry counter in:
+
+- `docs/retry/phase1/internal/align/align.go.count`
+
+If the file does not exist, retries = 0.  
+Each time processing fails, we increment the counter (write the new number).  
+If after increment the counter < 3, we keep the line in the task file.  
+If counter reaches 3, we **remove the line from the task file**, add it to `docs/failed/phase1/internal/align/align.go.failed` (just a marker), and log the error.
+
+---
+
+## Documentation quality requirements (CRITICAL)
+
+Documentation MUST NOT be superficial. For each documented element (file, function, struct, package):
+
+###  You MUST explain:
+
+- what it does
+- why it exists (context, problem solved)
+- how it is used
+- assumptions and preconditions
+- possible edge cases
+
+### Forbidden patterns
+
+- Vague phrases like “This function handles…”, “Utility for…”, “Helper function…”.
+- Generic descriptions that could apply to any project.
+
+### Required content per element type
+
+- Functions:
+  - Purpose
+  - Parameter meaning
+  - Return values
+  - Notable behaviour (panic conditions, side effects, concurrency)
+- Structs:
+  - Role in the system
+  - Meaning of key fields
+- Files:
+  - Role within the package
+  - Interactions with other files
+
+### Anti‑generic rule
+
+If the description could apply to any project, it is INVALID. You MUST include domain‑specific context (bioinformatics, sequence processing, etc.) and concrete behaviour.
+
+### Quality validation
+
+Before marking an item as done (i.e., creating the .done sentinel), you MUST perform a self‑validation:
+
+- Check that all required sections are present.
+- Verify that no forbidden patterns remain.
+
+If validation fails, increment the retry counter and keep the item pending.
+
+
+---
+
+# Directory structure
+
+```
+docs/
+  todo/                          # task queues
+    phase1.txt
+    phase1bis.txt
+    phase2.txt
+    phase3.txt
+  retry/                         # retry counters
+    phase1/                      # mirrors file structure
+      internal/align/align.go.count
+    phase1bis/
+    phase2/
+    phase3/
+  failed/                        # permanent failure markers
+    phase1/
+      internal/align/align.go.failed
+    phase1bis/
+    phase2/
+    phase3/
+  phase1/                        # actual documentation
+    <relative_path>/<file>.go.md
+  phase2/
+    <package>.md
+  phase3/
+    <tool>.md
+  error.log
+```
+
+---
+
+# Phase 0: Initialization
+
+1. Ensure required directories exist: `docs/todo`, `docs/retry`, `docs/failed`, `docs/phase1`, `docs/phase2`, `docs/phase3`.  
+2. **If `docs/todo/phase1.txt` does not exist**:  
+   - Use `find pkg -name "*.go" ! -name "*_test.go" ! -path "*/cmd/*"` to list all Go files (excluding tests and main.go).  
+   - Write the list (one relative path per line, e.g., `internal/align/align.go`) to `docs/todo/phase1.txt`.  
+3. Do the same for phase2 and phase3 later when those phases start.  
+4. **No other state is stored.**
+
+---
+
+# Phase 1: File documentation
+
+**Processing rule:**
+- Read the **first line** of `docs/todo/phase1.txt` (using `head -n 1`).  
+- If the file is empty, Phase 1 is complete → proceed to Phase 1bis initialization.  
+- Otherwise, process that single file.
+
+**Processing a file:**
+
+1. Let `relpath` be the line content (e.g., `internal/align/align.go`).  
+2. Check if a permanent failure marker exists at `docs/failed/phase1/${relpath}.failed`. If yes, remove the line from the task file and skip (line will be deleted). 
+3. If the documentation file `docs/phase1/${relpath}.go.md` exists go directly to its validation (step 6).
+4. Otherwise, generate documentation for that file (using MCP tools as before).  
+5. Write the documentation to `docs/phase1/${relpath}.go.md`.  
+6. Validate quality.  
+7. If validation succeeds:  
+   - Remove the line from the task file.  
+   - Remove any retry counter file for this item.  
+   - (No sentinel needed; the removal from todo indicates completion.)  
+8. If validation fails:  
+   - Increment retry counter:  
+     - If `docs/retry/phase1/${relpath}.count` does not exist, set to 1.  
+     - Else read it, add 1, write back.  
+   - If new counter >= 3:  
+     - Remove line from task file.  
+     - Create `docs/failed/phase1/${relpath}.failed`.  
+     - Log error.  
+   - If new counter < 3:  
+     - Keep the line in the task file (do nothing, it stays as first line for next run).  
+9. **Exit** (or stop if this was a single run). The next invocation will read the first line again (same if retry, or next if removed).
+
+**Important:**  
+- Do **not** read more than one line.  
+- Do **not** attempt to process multiple items in one run.  
+- The LLM should finish after handling one item.
+
+---
+
+# Phase 1bis: Review and harmonization
+
+When Phase 1 is complete (i.e., `docs/todo/phase1.txt` empty), we initialize `docs/todo/phase1bis.txt` with the same list of files (the ones that succeeded).  
+But note: we need to know which files were successfully documented. Since we removed lines from `phase1.txt` on success, we need a record. The simplest is to reuse the same list but we can generate it by listing the existing `.go.md` files in `docs/phase1/` (since every successful file has a `.go.md`).  
+Thus, Phase 1bis initialization:
+
+- If `docs/todo/phase1bis.txt` does not exist, create it by listing all `.go.md` files under `docs/phase1/`, stripping the `docs/phase1/` prefix and the `.go.md` suffix, and writing the relative path (same format as phase1).  
+
+Then processing is identical to Phase 1, but using `docs/todo/phase1bis.txt` and output is overwriting the same `.go.md` files (with improvements). Retry counters go in `docs/retry/phase1bis/`.
+
+---
+
+# Phase 2: Package documentation
+
+When Phase 1bis is complete (`docs/todo/phase1bis.txt` empty), initialize `docs/todo/phase2.txt`:
+
+- List all packages: unique directories under `pkg/` that contain at least one `.go` file and are not tools.  
+- Write each package identifier (e.g., `align`, `internal/align`) as a line.
+
+Processing: read first line, generate `docs/phase2/<package>.md`, validate, remove line on success, retry logic in `docs/retry/phase2/`.
+
+---
+
+# Phase 3: Tool documentation
+
+When Phase 2 complete, initialize `docs/todo/phase3.txt`:
+
+- List all directories under `cmd/` that contain a `main.go`. Write each tool name as a line.
+
+Processing: read first line, generate `docs/phase3/<tool>.md`, validate, remove line on success, retry logic in `docs/retry/phase3/`.
+
+---
+
+# Finalization
+
+When all task files are empty and no pending phases, generate `docs/README.md` by:
+
+- Listing all package docs (files in `docs/phase2/`) and linking.  
+- Listing all tool docs (files in `docs/phase3/`) and linking.  
+
+Write using `Shell`.
+
+---
+
+# Execution flow summary
+
+1. **Phase 0**: Create directories and initial `todo/phase1.txt` if missing. Exit.  
+2. **Phase 1**:  
+   - If `todo/phase1.txt` exists and non‑empty → process first line.  
+   - Else → move to Phase 1bis initialization.  
+3. **Phase 1bis**:  
+   - If `todo/phase1bis.txt` does not exist → create from successful phase1 docs.  
+   - If non‑empty → process first line.  
+   - Else → move to Phase 2 initialization.  
+4. **Phase 2**: similar.  
+5. **Phase 3**: similar.  
+6. **Finalization**: generate README.
+
+The LLM should be invoked repeatedly (e.g., by a scheduler) until all phases are done. Each invocation processes exactly one item.
+
+---
+
+# Important reminders
+
+- Always call `Shell` to write files; never output content in plain text.  
+- Validate quality before removing a line from the task file.  
+- Log all failures to `docs/error.log` in JSON lines format.  
+- If any MCP tool fails, treat as failure and increment retry counter.  
+- Never read more than one line from a task file in a single run.
@@ -0,0 +1,222 @@
+//go:build ignore
+
+package main
+
+import (
+	"encoding/json"
+	"fmt"
+	"os"
+	"os/exec"
+	"path/filepath"
+	"regexp"
+	"sort"
+	"strconv"
+	"strings"
+)
+
+type Reference struct {
+	File     string `json:"file"`
+	Line     int    `json:"line"`
+	Column   int    `json:"column"`
+	Key      string `json:"key"`
+	Function string `json:"function"`
+	Context  string `json:"context"`
+}
+
+type Result struct {
+	Method     string      `json:"method"`
+	Signature  string      `json:"signature"`
+	Definition string      `json:"definition"`
+	References []Reference `json:"references"`
+	Total      int         `json:"total"`
+}
+
+var basePath = "/Users/coissac/Sync/travail/__MOI__/GO/obitools4"
+
+func main() {
+	cmd := exec.Command("rg", "-n", `\.SetAttribute\(`, basePath+"/pkg", "--type", "go")
+	output, err := cmd.Output()
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "Error running rg: %v\n", err)
+		os.Exit(1)
+	}
+
+	lines := strings.Split(string(output), "\n")
+	lineRe := regexp.MustCompile(`^(.+?):(\d+):\s*(.+)$`)
+	keyRe := regexp.MustCompile(`SetAttribute\("([^"]+)"`)
+	templateKeyRe := regexp.MustCompile(`SetAttribute\("([^"]+)[^"]*"\s*,`)
+
+	var refs []Reference
+	seen := make(map[string]bool)
+
+	for _, line := range lines {
+		line = strings.TrimSpace(line)
+		if line == "" {
+			continue
+		}
+
+		matches := lineRe.FindStringSubmatch(line)
+		if matches == nil {
+			continue
+		}
+
+		file := matches[1]
+		lineNum, _ := strconv.Atoi(matches[2])
+		context := strings.TrimSpace(matches[3])
+
+		// Skip definition
+		if strings.Contains(file, "obiseq/attributes.go") && lineNum == 132 {
+			continue
+		}
+
+		// Extract key
+		var key string
+		if keyMatches := keyRe.FindStringSubmatch(context); keyMatches != nil {
+			key = keyMatches[1]
+		} else if tmplMatches := templateKeyRe.FindStringSubmatch(context); tmplMatches != nil {
+			key = tmplMatches[1]
+		} else {
+			continue
+		}
+
+		// Get function name using treesitter
+		funcName := getFunctionNameTreesitter(file, lineNum)
+
+		uniqueKey := fmt.Sprintf("%s:%d", file, lineNum)
+		if seen[uniqueKey] {
+			continue
+		}
+		seen[uniqueKey] = true
+
+		refs = append(refs, Reference{
+			File:     filepath.Base(file),
+			Line:     lineNum,
+			Column:   0,
+			Key:      key,
+			Function: funcName,
+			Context:  context,
+		})
+	}
+
+	sort.Slice(refs, func(i, j int) bool {
+		if refs[i].File != refs[j].File {
+			return refs[i].File < refs[j].File
+		}
+		return refs[i].Line < refs[j].Line
+	})
+
+	result := Result{
+		Method:     "SetAttribute",
+		Signature:  "func (s *BioSequence) SetAttribute(key string, value interface{})",
+		Definition: basePath + "/pkg/obiseq/attributes.go:132",
+		References: refs,
+		Total:      len(refs),
+	}
+
+	outputJSON, err := json.MarshalIndent(result, "", "  ")
+	if err != nil {
+		fmt.Fprintf(os.Stderr, "Error marshaling JSON: %v\n", err)
+		os.Exit(1)
+	}
+
+	fmt.Println(string(outputJSON))
+}
+
+// getFunctionNameTreesitter uses the treesitter_cursor_walk tool to get the containing function
+func getFunctionNameTreesitter(file string, targetLine int) string {
+	// Convert to 0-based for treesitter
+	row := targetLine - 1
+
+	// Use treesitter cursor walk to get ancestors
+	cmd := exec.Command("bash", "-c",
+		fmt.Sprintf(`kilo treesitter_cursor_walk --file_path %q --row %d --column 0 --max_depth 10 2>/dev/null`, file, row))
+
+	output, err := cmd.Output()
+	if err != nil {
+		return findContainingFunction(file, targetLine)
+	}
+
+	// Parse the JSON output to find function_declaration or method_declaration
+	var result map[string]interface{}
+	if err := json.Unmarshal(output, &result); err != nil {
+		return findContainingFunction(file, targetLine)
+	}
+
+	// Check ancestors for function declaration
+	if ancestors, ok := result["ancestors"].([]interface{}); ok {
+		for _, a := range ancestors {
+			if anc, ok := a.(map[string]interface{}); ok {
+				nodeType, _ := anc["type"].(string)
+				if nodeType == "function_declaration" || nodeType == "method_declaration" {
+					// Try to get the function name from children
+					if children, ok := anc["children"].([]interface{}); ok {
+						for _, c := range children {
+							if child, ok := c.(map[string]interface{}); ok {
+								childType, _ := child["type"].(string)
+								if childType == "identifier" {
+									if text, ok := child["text"].(string); ok {
+										return text
+									}
+								}
+								if childType == "field_identifier" {
+									if text, ok := child["text"].(string); ok {
+										return text
+									}
+								}
+							}
+						}
+					}
+				}
+				if nodeType == "func_literal" {
+					return "closure"
+				}
+			}
+		}
+	}
+
+	return findContainingFunction(file, targetLine)
+}
+
+func findContainingFunction(file string, targetLine int) string {
+	data, err := os.ReadFile(file)
+	if err != nil {
+		return ""
+	}
+	lines := strings.Split(string(data), "\n")
+
+	for i := targetLine - 1; i >= 0 && i >= targetLine-200; i-- {
+		if i >= len(lines) {
+			continue
+		}
+		line := strings.TrimSpace(lines[i])
+
+		if line == "}" && i > 0 {
+			for j := i - 1; j >= 0 && j >= i-50; j-- {
+				if j >= len(lines) {
+					continue
+				}
+				funcLine := strings.TrimSpace(lines[j])
+				if strings.HasPrefix(funcLine, "func ") {
+					if match := regexp.MustCompile(`func\s+\([^)]+\)\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\(`).FindStringSubmatch(funcLine); match != nil {
+						return match[1]
+					}
+					if match := regexp.MustCompile(`func\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\(`).FindStringSubmatch(funcLine); match != nil {
+						return match[1]
+					}
+				}
+			}
+			continue
+		}
+
+		if strings.HasPrefix(line, "func ") {
+			if match := regexp.MustCompile(`func\s+\([^)]+\)\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\(`).FindStringSubmatch(line); match != nil {
+				return match[1]
+			}
+			if match := regexp.MustCompile(`func\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\(`).FindStringSubmatch(line); match != nil {
+				return match[1]
+			}
+		}
+	}
+
+	return ""
+}
@@ -0,0 +1,36 @@
+#!/bin/bash
+
+basePath="/Users/coissac/Sync/travail/__MOI__/GO/obitools4"
+OUTPUT_FILE="${1:-/dev/stdout}"
+
+# Get all SetAttribute calls
+rg -n '\.SetAttribute\(' "$basePath/pkg" --type go | while read -r line; do
+	file="${line%%:*}"
+	line_num="${line%:*}"
+	line_num="${line_num##*:}"
+	context="${line##*: }"
+
+	# Extract key (only literal strings)
+	key=$(echo "$context" | sed -n 's/.*SetAttribute("\([^"]*\)".*/\1/p')
+	[ -z "$key" ] && continue
+
+	# Get function name using treesitter
+	func=$(kilo treesitter_cursor_walk \
+		--file_path "$file" \
+		--row "$((line_num - 1))" \
+		--column 0 \
+		--max_depth 10 2>/dev/null |
+		jq -r '.ancestors[] | select(.type == "function_declaration" or .type == "method_declaration") | .children[] | select(.type == "identifier" or .type == "field_identifier") | .text' 2>/dev/null)
+
+	# Fallback to func_literal for closures
+	if [ -z "$func" ]; then
+		func=$(kilo treesitter_cursor_walk \
+			--file_path "$file" \
+			--row "$((line_num - 1))" \
+			--column 0 \
+			--max_depth 10 2>/dev/null |
+			jq -r '.ancestors[] | select(.type == "func_literal") | "closure"' 2>/dev/null)
+	fi
+
+	echo "$(basename "$file")|$line_num|$key|${func:-unknown}|$context"
+done | sort -t'|' -k1,1 -k2,2n
@@ -0,0 +1,308 @@
+{
+  "obiannotate": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ],
+    "(git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiiter.IBioSequence).NumberSequences$1": [
+      "seq_number"
+    ]
+  },
+  "obiclean": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obicleandb": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetCount": [
+      "count"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obicomplement": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obiconsensus": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetCount": [
+      "count"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ],
+    "git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obitools/obiconsensus.BuildConsensus": [
+      "obiconsensus_kmer_max_occur",
+      "obiconsensus_filtered_graph_size",
+      "obiconsensus_full_graph_size",
+      "obiconsensus_consensus",
+      "obiconsensus_weight",
+      "obiconsensus_seq_length",
+      "obiconsensus_kmer_size"
+    ],
+    "git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obitools/obiconsensus.MinionClusterDenoise": [
+      "obiconsensus_consensus"
+    ],
+    "git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obitools/obiconsensus.MinionDenoise$1": [
+      "obiconsensus_consensus",
+      "obiconsensus_weight"
+    ]
+  },
+  "obiconvert": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obicount": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obicsv": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obidemerge": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obidistribute": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obigrep": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obijoin": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obikmermatch": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obikmersimcount": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obilandmark": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetCoordinate": [
+      "landmark_coord"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetOBITagGeomRefIndex": [
+      "obitag_geomref_index"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ],
+    "git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obitools/obilandmark.CLISelectLandmarkSequences": [
+      "landmark_id"
+    ]
+  },
+  "obimatrix": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obimicrosat": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obimultiplex": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obipairing": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence)._revcmpMutation": [
+      "pairing_mismatches"
+    ],
+    "git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obialign.BuildQualityConsensus": [
+      "pairing_mismatches"
+    ]
+  },
+  "obipcr": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obireffamidx": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetOBITagRefIndex": [
+      "obitag_ref_index"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ],
+    "git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obitools/obirefidx.IndexFamilyDB": [
+      "reffamidx_id"
+    ]
+  },
+  "obirefidx": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetOBITagRefIndex": [
+      "obitag_ref_index"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obiscript": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obisplit": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obisummary": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obitag": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetPath": [
+      "taxonomic_path"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  },
+  "obitagpcr": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obingslibrary.NGSLibrary).ExtractMultiBarcode": [
+      "obimultiplex_error",
+      "obimultiplex_amplicon_rank"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence)._revcmpMutation": [
+      "pairing_mismatches"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence)._subseqMutation": [
+      "pairing_mismatches"
+    ],
+    "git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obialign.BuildQualityConsensus": [
+      "pairing_mismatches"
+    ]
+  },
+  "obitaxonomy": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetPath": [
+      "taxonomic_path"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ],
+    "(git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiiter.IBioSequence).NumberSequences$1": [
+      "seq_number"
+    ]
+  },
+  "obiuniq": {
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetCount": [
+      "count"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetDefinition": [
+      "definition"
+    ],
+    "(*git.metabarcoding.org/obitools/obitools4/obitools4/pkg/obiseq.BioSequence).SetTaxid": [
+      "taxid"
+    ]
+  }
+}
@@ -0,0 +1,36 @@
+#!/usr/bin/env python3
+"""
+Read potentially malformed JSON from stdin (aichat output), extract title and
+body, and print them as plain text: title on first line, blank line, then body.
+Exits with 1 on failure (no output).
+"""
+
+import sys
+import json
+import re
+
+text = sys.stdin.read()
+
+m = re.search(r'\{.*\}', text, re.DOTALL)
+if not m:
+    sys.exit(1)
+
+s = m.group()
+obj = None
+
+try:
+    obj = json.loads(s)
+except Exception:
+    s2 = re.sub(r'(?<!\\)\n', r'\\n', s)
+    try:
+        obj = json.loads(s2)
+    except Exception:
+        sys.exit(1)
+
+title = obj.get('title', '').strip()
+body  = obj.get('body', '').strip()
+
+if not title or not body:
+    sys.exit(1)
+
+print(f"{title}\n\n{body}")
@@ -1 +1 @@
-4.4.18
+4.4.42
@@ -0,0 +1,19 @@
+```markdown
+# DNA Scoring and Matching Utilities in `obialign`
+
+This module provides low-level utilities for computing nucleotide alignment scores using probabilistic and bit-encoded representations.
+
+- **Bit Encoding**: Nucleotides are encoded in 4-bit groups (e.g., `A=0b0001`, `C=0b0010`, etc.), enabling efficient bitwise comparison.
+- **`_MatchRatio(a, b)`**: Computes a normalized match ratio between two encoded bytes based on shared bits:
+  `ratio = common_bits / (bits_in_a × bits_in_b)`.
+- **`_FourBitsCount`**: Precomputed lookup table for Hamming weight (popcount) of 4-bit values.
+- **Log-space Arithmetic**: Helper functions (`_Logaddexp`, `_Logdiffexp`, `_Log1mexp`) ensure numerical stability in probabilistic computations.
+- **Phred-scaled Quality Integration**:
+  `_MatchScoreRatio(QF, QR)` derives log-odds match/mismatch scores from Phred quality values (`QF`, `QR`), modeling sequencing error probabilities.
+- **Precomputed Matrices**:
+  - `_NucPartMatch[i][j]`: Match ratios for all nucleotide pairs (from 4-bit codes).
+  - `_NucScorePartMatchMatch/Mismatch[i][j]`: Integer-scaled match/mismatch scores (×10) for quality pairs `(i, j)` in `[0..99]`.
+- **Thread-Safe Initialization**: `_InitDNAScoreMatrix()` ensures one-time, synchronized initialization of all scoring tables via a mutex.
+
+Designed for high-performance alignment kernels where speed and numerical robustness are critical.
+```
@@ -1 +1 @@
 .4.18
 .4.42