mirror of
https://github.com/metabarcoding/obitools4.git
synced 2026-04-30 12:00:39 +00:00
Semantic Description of IsPatternMatchSequence
The function IsPatternMatchSequence defines a sequence predicate for pattern-based matching in biological sequences (e.g., DNA/RNA), supporting fuzzy and strand-aware search.
Core Functionality:
- Input Parameters
  - pattern: A regular expression-like string describing the target pattern.
  - errormax: Maximum allowed mismatches (substitutions only by default).
  - bothStrand: If true, also search on the reverse-complement strand.
  - allowIndels: Enables insertion/deletion errors (beyond mismatches) when set to true.
- Internal Workflow
  - Parses the pattern into an automaton (apat) via MakeApatPattern.
  - Computes its reverse complement for dual-strand matching.
  - Returns a closure (SequencePredicate) that tests whether a given BioSequence matches the pattern (or its RC), within error tolerance.
- Matching Logic
  - Converts input sequence to apat format.
  - Checks match on forward strand first; if failed and bothStrand=true, tries reverse complement.
  - Uses automaton-based matching (IsMatching) for efficient fuzzy search.
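The workflow described above (build the pattern once, precompute its reverse complement, return a closure that tries the forward strand first) can be illustrated with a minimal, self-contained sketch. This is not the obitools4 implementation: makePatternPredicate, matchesWithin and reverseComplement are hypothetical names, sequences are plain strings rather than BioSequence values, a naive sliding window stands in for the apat automaton, and only substitutions are counted (the allowIndels behaviour is left out).

```go
package main

import (
	"fmt"
	"strings"
)

// SequencePredicate is a simplified stand-in: it tests a plain string,
// whereas the function described above works on BioSequence values.
type SequencePredicate func(seq string) bool

// iupac lists, for each IUPAC code, the bases it accepts.
var iupac = map[byte]string{
	'A': "A", 'C': "C", 'G': "G", 'T': "T",
	'R': "AG", 'Y': "CT", 'S': "CG", 'W': "AT",
	'K': "GT", 'M': "AC", 'B': "CGT", 'D': "AGT",
	'H': "ACT", 'V': "ACG", 'N': "ACGT",
}

// complement maps each IUPAC code to its complementary code.
var complement = map[byte]byte{
	'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A',
	'R': 'Y', 'Y': 'R', 'S': 'S', 'W': 'W',
	'K': 'M', 'M': 'K', 'B': 'V', 'V': 'B',
	'D': 'H', 'H': 'D', 'N': 'N',
}

// reverseComplement returns the reverse complement of an uppercase
// IUPAC string; unknown symbols become 'N'.
func reverseComplement(s string) string {
	out := make([]byte, len(s))
	for i := 0; i < len(s); i++ {
		c, ok := complement[s[i]]
		if !ok {
			c = 'N'
		}
		out[len(s)-1-i] = c
	}
	return string(out)
}

// matchesWithin reports whether pattern occurs somewhere in seq with at
// most errormax substitutions (hypothetical helper, no indels).
func matchesWithin(pattern, seq string, errormax int) bool {
	for start := 0; start+len(pattern) <= len(seq); start++ {
		errors := 0
		for i := 0; i < len(pattern) && errors <= errormax; i++ {
			if !strings.ContainsRune(iupac[pattern[i]], rune(seq[start+i])) {
				errors++
			}
		}
		if errors <= errormax {
			return true
		}
	}
	return false
}

// makePatternPredicate prepares both pattern orientations once and
// returns a closure that checks the forward pattern first, then the
// reverse-complemented pattern when bothStrand is true.
func makePatternPredicate(pattern string, errormax int, bothStrand bool) SequencePredicate {
	forward := strings.ToUpper(pattern)
	reverse := reverseComplement(forward)
	return func(seq string) bool {
		seq = strings.ToUpper(seq)
		if matchesWithin(forward, seq, errormax) {
			return true
		}
		return bothStrand && matchesWithin(reverse, seq, errormax)
	}
}

func main() {
	both := makePatternPredicate("ACGTTC", 1, true)
	forwardOnly := makePatternPredicate("ACGTTC", 1, false)

	fmt.Println(both("TTACGATCGG"))        // one substitution, forward strand
	fmt.Println(both("TTGAACGTGG"))        // exact match on the reverse complement
	fmt.Println(forwardOnly("TTGAACGTGG")) // same read, forward strand only
}
```

Precomputing the reverse-complemented pattern inside the factory means the per-sequence closure never has to reverse-complement the read itself, which matters when the predicate is applied to millions of reads.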
Semantic Use Case:
Enables flexible, error-tolerant detection of sequence motifs (e.g., primers, barcodes) in high-throughput sequencing data. Typical uses are in silico validation of primer designs and read filtering in metagenomic pipelines.
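For the read-filtering use case, a predicate of this kind is typically passed to a generic filter. The sketch below shows that composition only; filterReads and containsPrimer are hypothetical names, reads are plain strings, and the predicate is a deliberately simple exact, forward-strand substring test standing in for a fuzzy, strand-aware one.

```go
package main

import (
	"fmt"
	"strings"
)

// SequencePredicate is a hypothetical stand-in type: any test over a read.
type SequencePredicate func(seq string) bool

// containsPrimer is a simple stand-in predicate (exact, case-insensitive,
// forward strand only); an error-tolerant predicate would plug into
// filterReads in exactly the same way.
func containsPrimer(primer string) SequencePredicate {
	upper := strings.ToUpper(primer)
	return func(seq string) bool {
		return strings.Contains(strings.ToUpper(seq), upper)
	}
}

// filterReads keeps only the reads accepted by the predicate,
// preserving their original order.
func filterReads(reads []string, keep SequencePredicate) []string {
	kept := make([]string, 0, len(reads))
	for _, r := range reads {
		if keep(r) {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	reads := []string{"TTACGTTCGG", "GGGGGGGGGG", "acgttcAAAA"}
	fmt.Println(filterReads(reads, containsPrimer("ACGTTC")))
	// prints: [TTACGTTCGG acgttcAAAA]
}
```

Because the filter depends only on the predicate type, the matching strategy can be swapped without touching the filtering code.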