# BioSequence Reverse Complement Functionality
This Go package (`obiseq`) provides utilities for computing the reverse complement of biological sequences (e.g., DNA), including support for quality scores and structured metadata.
## Core Functions
- **`nucComplement(n byte) byte`**
  Returns the nucleotide complement using a lookup table (`_revcmpDNA`). Handles special cases:
  - `.` / `-` → unchanged (gaps)
  - `[` / `]` → swapped (`[` ↔ `]`)
  - A-Z letters → complemented (case-insensitive via bitwise masking)
  - Unknown characters → `'n'`
- **`BioSequence.ReverseComplement(inplace bool) *BioSequence`**
  Performs reverse complement on the sequence and (if present) its quality string:
  - If `inplace` is `false`, a copy is made and the original is preserved.
  - Reverses the base order and complements each base using `nucComplement`.
  - Reverses the quality array in the same way, so each score stays attached to its base.
  - Caches the result in `sequence.revcomp` for reuse.
- **`BioSequence._revcmpMutation() *BioSequence`**
  Adjusts mutation metadata (e.g., `"pairing_mismatches"`) to reflect the reverse-complemented orientation:
  - Reverses and complements symbolic mutation strings (e.g., `"A>T"` → `"T>A"`).
  - Updates positional indices to match the reversed sequence coordinates.
- **`ReverseComplementWorker(inplace bool) SeqWorker`**
  Returns a reusable `SeqWorker` function for batch processing: applies reverse complement to each sequence in a stream.
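
As a rough illustration of the behaviour listed above, the following Go sketch reverse-complements a sequence together with its quality string and wraps the operation in a worker closure. `Record`, `Worker`, `complementBase`, `reverseComplement`, `reverseComplementWorker`, and `mirrorPos` are hypothetical stand-ins, not the actual `obiseq` types or functions. The simplified complement covers only A, C, G, T and gaps, caching is omitted, and the 0-based coordinate mirroring is an assumption about how positional metadata could be remapped.

```go
package main

import "fmt"

// Record is a hypothetical stand-in for a sequence record; it is not the
// actual obiseq.BioSequence type.
type Record struct {
	Seq  []byte
	Qual []byte // optional per-base quality scores; may be nil
}

// Worker is a hypothetical stand-in for SeqWorker; the real signature may differ.
type Worker func(*Record) *Record

// complementBase is a simplified complement: A, C, G, T and gaps only.
// Anything else maps to 'n'. Output is lowercase.
func complementBase(b byte) byte {
	if b == '.' || b == '-' {
		return b // gaps are left unchanged
	}
	switch b | 0x20 { // fold ASCII letters to lowercase
	case 'a':
		return 't'
	case 'c':
		return 'g'
	case 'g':
		return 'c'
	case 't':
		return 'a'
	}
	return 'n'
}

// reverseComplement reverses and complements the bases and reverses the
// quality scores. With inplace == false it works on a copy.
func reverseComplement(r *Record, inplace bool) *Record {
	out := r
	if !inplace {
		out = &Record{
			Seq:  append([]byte(nil), r.Seq...),
			Qual: append([]byte(nil), r.Qual...),
		}
	}
	s := out.Seq
	// Swap from both ends; i <= j so the middle base is complemented too.
	for i, j := 0, len(s)-1; i <= j; i, j = i+1, j-1 {
		s[i], s[j] = complementBase(s[j]), complementBase(s[i])
	}
	q := out.Qual
	for i, j := 0, len(q)-1; i < j; i, j = i+1, j-1 {
		q[i], q[j] = q[j], q[i]
	}
	return out
}

// reverseComplementWorker returns a closure that can be applied to each
// record of a stream.
func reverseComplementWorker(inplace bool) Worker {
	return func(r *Record) *Record { return reverseComplement(r, inplace) }
}

// mirrorPos maps a 0-based position onto the reversed sequence (assumption).
func mirrorPos(pos, length int) int { return length - 1 - pos }

func main() {
	r := &Record{Seq: []byte("ACGTn"), Qual: []byte("IHGFE")}
	rc := reverseComplementWorker(false)(r)
	fmt.Printf("%s %s\n", rc.Seq, rc.Qual) // nacgt EFGHI
	fmt.Printf("%s\n", r.Seq)              // ACGTn (original preserved)
	fmt.Println(mirrorPos(0, len(r.Seq)))  // 4
}
```

Calling the worker with `inplace` set to `true` would instead modify and return the record it was given, which avoids an allocation when the original orientation is no longer needed.
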
## Design Notes
- Uses ASCII bitwise tricks (`&31`, `|0x20`) for case-insensitive indexing and lowercase output.
- Supports symbols beyond A, C, G, T (e.g., IUPAC ambiguity codes) via the lookup table.
- Keeps quality scores and structured attributes consistent with the reverse-complemented sequence.
> Ideal for NGS preprocessing pipelines where orientation matters (e.g., paired-end alignment, variant calling).