Files
obitools4/autodoc/docmd/pkg/obiapat/predicat.md
T

24 lines
1.4 KiB
Markdown
Raw Normal View History

2026-04-07 08:36:50 +02:00
## Semantic Description of `IsPatternMatchSequence`
The function `IsPatternMatchSequence` defines a **sequence predicate** for pattern-based matching in biological sequences (e.g., DNA/RNA), supporting fuzzy and strand-aware search.
### Core Functionality:
- **Input Parameters**
- `pattern`: A regular expression-like string describing the target pattern.
- `errormax`: Maximum allowed mismatches (substitutions only by default).
- `bothStrand`: If true, also search on the reverse-complement strand.
- `allowIndels`: Enables insertion/deletion errors (beyond mismatches) when set to true.
- **Internal Workflow**
- Parses the pattern into an automaton (`apat`) via `MakeApatPattern`.
- Computes its reverse complement for dual-strand matching.
- Returns a closure (`SequencePredicate`) that tests whether a given `BioSequence` matches the pattern (or its RC), within error tolerance.
- **Matching Logic**
- Converts input sequence to `apat` format.
- Checks match on forward strand first; if failed and `bothStrand=true`, tries reverse complement.
- Uses automaton-based matching (`IsMatching`) for efficient fuzzy search.
### Semantic Use Case:
Enables flexible, error-tolerant detection of sequence motifs (e.g., primers, barcodes) in high-throughput sequencing data—supporting both *in silico* primer design validation and read filtering in metagenomic pipelines.