mirror of
https://github.com/metabarcoding/obitools4.git
synced 2026-04-30 12:00:39 +00:00
8c7017a99d
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5" - Update version.txt from 4.29 → .30 (automated by Makefile)
24 lines
1.4 KiB
Markdown
## Semantic Description of `IsPatternMatchSequence`

The function `IsPatternMatchSequence` builds a **sequence predicate** for pattern-based matching on biological sequences (e.g., DNA/RNA), with support for fuzzy and strand-aware search.

### Core Functionality:
- **Input Parameters**
  - `pattern`: A regular-expression-like string describing the target pattern.
  - `errormax`: Maximum number of errors allowed; only substitutions (mismatches) count unless `allowIndels` is set.
  - `bothStrand`: If true, the reverse-complement strand is searched as well.
  - `allowIndels`: If true, insertions and deletions are tolerated in addition to mismatches.

- **Internal Workflow**
  - Parses the pattern into an automaton (`apat`) via `MakeApatPattern`.
  - Computes the reverse complement of the pattern for dual-strand matching.
  - Returns a closure (`SequencePredicate`) that tests whether a given `BioSequence` matches the pattern (or its reverse complement) within the error tolerance.

- **Matching Logic**
  - Converts the input sequence to `apat` format.
  - Checks the forward strand first; if that fails and `bothStrand` is true, tries the reverse complement.
  - Uses automaton-based matching (`IsMatching`) for efficient fuzzy search.

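The behaviour described in the list above can be illustrated with a small, self-contained Go sketch. It is not the OBITools4 implementation: it assumes the pattern is a literal DNA string rather than an `apat` pattern, uses plain strings instead of `BioSequence`, and replaces the automaton with a sliding-window mismatch count (substitutions only) or a Sellers-style dynamic programme (when indels are allowed). The names `patternPredicate`, `fewestSubstitutions`, `fewestEdits`, `minInt` and `reverseComplement` are hypothetical helpers introduced for this example.

```go
package main

import "fmt"

// SequencePredicate stands in for the predicate type named in the text.
// Assumption: it takes a plain string here, not a BioSequence.
type SequencePredicate func(seq string) bool

// reverseComplement reverses an upper-case DNA string and complements each
// base; any symbol other than A, C, G, T becomes N.
func reverseComplement(s string) string {
	comp := map[byte]byte{'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
	out := make([]byte, len(s))
	for i := 0; i < len(s); i++ {
		c, ok := comp[s[i]]
		if !ok {
			c = 'N'
		}
		out[len(s)-1-i] = c
	}
	return string(out)
}

// fewestSubstitutions returns the smallest mismatch count between pattern and
// any equal-length window of seq, or len(pattern)+1 if seq is too short.
func fewestSubstitutions(pattern, seq string) int {
	best := len(pattern) + 1
	for start := 0; start+len(pattern) <= len(seq); start++ {
		mism := 0
		for j := 0; j < len(pattern); j++ {
			if seq[start+j] != pattern[j] {
				mism++
			}
		}
		if mism < best {
			best = mism
		}
	}
	return best
}

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// fewestEdits returns the smallest number of substitutions, insertions and
// deletions needed to align pattern with any substring of seq.
func fewestEdits(pattern, seq string) int {
	m := len(pattern)
	prev, cur := make([]int, m+1), make([]int, m+1)
	for i := range prev {
		prev[i] = i // aligning i pattern symbols against nothing costs i
	}
	best := prev[m]
	for t := 0; t < len(seq); t++ {
		cur[0] = 0 // an occurrence may start at any position of seq
		for i := 1; i <= m; i++ {
			sub := prev[i-1]
			if pattern[i-1] != seq[t] {
				sub++
			}
			cur[i] = minInt(sub, minInt(prev[i]+1, cur[i-1]+1))
		}
		best = minInt(best, cur[m])
		prev, cur = cur, prev
	}
	return best
}

// patternPredicate is a hypothetical, simplified analogue of
// IsPatternMatchSequence: it prepares the pattern once and returns a closure.
func patternPredicate(pattern string, errormax int, bothStrand, allowIndels bool) SequencePredicate {
	rc := reverseComplement(pattern) // computed once, reused on every call
	errors := fewestSubstitutions
	if allowIndels {
		errors = fewestEdits
	}
	return func(seq string) bool {
		if errors(pattern, seq) <= errormax {
			return true // forward strand is tried first
		}
		return bothStrand && errors(rc, seq) <= errormax
	}
}

func main() {
	pred := patternPredicate("ACCGTTAG", 1, true, true)
	fmt.Println(pred("TTACCGTTAGGG")) // pattern present on the forward strand
	fmt.Println(pred("TTCTAACGGTGG")) // reverse complement of the pattern present
	fmt.Println(pred("TTACCGTAGGG"))  // pattern present with one base deleted
}
```

In this sketch, all preparation (here only the reverse complement) happens once in the factory, so the returned closure can be applied to many sequences without repeating that work. This mirrors the workflow described above, in which the pattern is compiled before the predicate is returned.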
### Semantic Use Case:
Enables flexible, error-tolerant detection of sequence motifs (e.g., primers, barcodes) in high-throughput sequencing data. It supports both *in silico* validation of primer designs and read filtering in metagenomic pipelines.
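As a rough illustration of the read-filtering use case, the sketch below applies a predicate to a small batch of reads and keeps only those that pass. Everything in it is an assumption made for the example: the `Read` type, `filterReads`, and the stand-in predicate `containsWithMismatches` (a plain sliding-window mismatch count on one strand) are hypothetical and are not part of OBITools4.

```go
package main

import (
	"fmt"
	"strings"
)

// Read is a hypothetical minimal record; it is not the BioSequence type.
type Read struct {
	ID  string
	Seq string
}

// filterReads keeps the reads whose sequence satisfies keep, preserving order.
func filterReads(reads []Read, keep func(seq string) bool) []Read {
	var kept []Read
	for _, r := range reads {
		if keep(r.Seq) {
			kept = append(kept, r)
		}
	}
	return kept
}

// containsWithMismatches is a stand-in predicate factory: the returned closure
// reports whether motif occurs in seq with at most maxMismatch substitutions.
func containsWithMismatches(motif string, maxMismatch int) func(string) bool {
	return func(seq string) bool {
		for start := 0; start+len(motif) <= len(seq); start++ {
			mism := 0
			for j := 0; j < len(motif); j++ {
				if seq[start+j] != motif[j] {
					mism++
				}
			}
			if mism <= maxMismatch {
				return true
			}
		}
		return false
	}
}

func main() {
	reads := []Read{
		{ID: "r1", Seq: "TTACCGTTAGGG"}, // carries the motif exactly
		{ID: "r2", Seq: "TTACCGATAGGG"}, // carries the motif with one substitution
		{ID: "r3", Seq: "GGGGGGGGGGGG"}, // does not carry the motif
	}
	kept := filterReads(reads, containsWithMismatches("ACCGTTAG", 1))
	ids := make([]string, 0, len(kept))
	for _, r := range kept {
		ids = append(ids, r.ID)
	}
	fmt.Println(strings.Join(ids, ","))
}
```

Because the filter only depends on a function from sequence to boolean, any predicate with that shape could be plugged in without changing the filtering loop.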