mirror of https://github.com/metabarcoding/obitools4.git synced 2026-04-30 12:00:39 +00:00


Eric Coissac 8c7017a99d ⬆️ version bump to v4.5

- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)

2026-04-13 13:34:53 +02:00


Semantic Description of `obikmer` Package

This Go package provides utilities for k-mer (specifically 4-mer) counting and comparison of biological sequences.

Core Functionalities

Count4Mer(seq, buffer, counts)
Counts occurrences of all possible 4-mers (4-nucleotide subsequences) in a BioSequence.
- Encodes each 4-mer into an integer (0–255) using Encode4mer.
- Populates a fixed-size [256]uint16 table (Table4mer) with counts.
- Reuses or allocates the counts buffer as needed.
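The counting step can be sketched as follows. This is a minimal illustration, not the package's code: `encodeBase` and `count4mer` are hypothetical names, the 2-bit encoding (a=0, c=1, g=2, t=3) and the handling of unknown symbols are assumptions, and the sketch works on a plain byte slice with no intermediate buffer instead of a BioSequence.

```go
package main

import "fmt"

// encodeBase maps a nucleotide to a 2-bit code; ok is false for any
// other symbol. Hypothetical helper, the real encoding may differ.
func encodeBase(b byte) (code uint8, ok bool) {
	switch b {
	case 'a', 'A':
		return 0, true
	case 'c', 'C':
		return 1, true
	case 'g', 'G':
		return 2, true
	case 't', 'T':
		return 3, true
	}
	return 0, false
}

// count4mer fills a 256-entry table with the number of occurrences of
// each 4-mer in seq. A nil counts allocates a new table; otherwise the
// given table is zeroed and reused.
func count4mer(seq []byte, counts *[256]uint16) *[256]uint16 {
	if counts == nil {
		counts = new([256]uint16)
	} else {
		*counts = [256]uint16{}
	}

	var word uint8 // rolling 8-bit code of the last four bases
	valid := 0     // consecutive valid bases seen so far
	for _, b := range seq {
		code, ok := encodeBase(b)
		if !ok {
			valid = 0 // assumption: windows spanning an unknown base are skipped
			continue
		}
		word = word<<2 | code // the oldest base is shifted out of the byte
		valid++
		if valid >= 4 {
			counts[word]++
		}
	}
	return counts
}

func main() {
	counts := count4mer([]byte("acgtacgt"), nil)
	// Under the assumed encoding, "acgt" is 0b00011011 = 27.
	fmt.Println(counts[27])
}
```

Because four 2-bit codes fill exactly one byte, the rolling word is always a valid index into the 256-entry table and needs no masking.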
Common4Mer(count1, count2)
Computes the intersection of two 4-mer frequency profiles: sum over all k-mers of min(count1[k], count2[k]).
Used to measure shared content between sequences.
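The intersection described above can be sketched as below. `common4mer` is a hypothetical stand-in that assumes the profiles are passed as pointers to `[256]uint16` tables; the actual signature may differ.

```go
package main

import "fmt"

// common4mer returns the sum over all 256 possible 4-mers of the
// smaller of the two counts.
func common4mer(count1, count2 *[256]uint16) int {
	shared := 0
	for i, c1 := range count1 {
		if c2 := count2[i]; c2 < c1 {
			shared += int(c2)
		} else {
			shared += int(c1)
		}
	}
	return shared
}

func main() {
	var p1, p2 [256]uint16
	p1[27], p1[0] = 2, 1 // toy profile: 4-mer 27 twice, 4-mer 0 once
	p2[27], p2[5] = 1, 3 // toy profile: 4-mer 27 once, 4-mer 5 three times
	fmt.Println(common4mer(&p1, &p2)) // only 4-mer 27 is shared, min(2, 1) = 1
}
```

Comparing a profile with itself returns its own total, so the result never exceeds the smaller of the two totals.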
Sum4Mer(count)
Returns the total number of 4-mers in a profile (i.e., sum over all entries).
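A sketch of the summation, with `sum4mer` as a hypothetical name and the same assumed table type. When every window is counted, a sequence of n nucleotides contributes n − 3 4-mers, so the total is a cheap proxy for sequence length.

```go
package main

import "fmt"

// sum4mer returns the total number of 4-mers recorded in a profile.
func sum4mer(count *[256]uint16) int {
	total := 0
	for _, c := range count {
		total += int(c)
	}
	return total
}

func main() {
	var p [256]uint16
	p[27], p[108], p[177], p[198] = 2, 1, 1, 1 // toy profile holding five 4-mers
	fmt.Println(sum4mer(&p))
}
```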

Distance & Similarity Bounds

LCS4MerBounds(count1, count2)
Estimates bounds for the Longest Common Subsequence (LCS) length between two sequences based on 4-mer profiles:
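- Lower bound: common + 3 if common > 0, otherwise 0, where common = Common4Mer(count1, count2)
- Upper bound: min_total + 3 − ceil((min_total − common) / 4), where min_total = min(total1, total2) and total1, total2 are the Sum4Mer values of the two profiles
  Leverages the fact that overlapping k-mers constrain possible alignments. The + 3 terms convert a 4-mer count back to a length in nucleotides, since a sequence of n nucleotides contains n − 3 overlapping 4-mers.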
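The two LCS formulas can be evaluated with integer arithmetic only. The sketch below is an illustration under assumptions: `lcs4merBounds`, its helpers, and the (lower, upper) return order are hypothetical, and the profiles are toy tables filled by hand.

```go
package main

import "fmt"

// sum4mer returns the total number of 4-mers in a profile.
func sum4mer(count *[256]uint16) int {
	total := 0
	for _, c := range count {
		total += int(c)
	}
	return total
}

// common4mer returns the sum of element-wise minima of two profiles.
func common4mer(count1, count2 *[256]uint16) int {
	shared := 0
	for i, c1 := range count1 {
		if c2 := count2[i]; c2 < c1 {
			shared += int(c2)
		} else {
			shared += int(c1)
		}
	}
	return shared
}

// lcs4merBounds applies the lower and upper bound formulas from the text.
func lcs4merBounds(count1, count2 *[256]uint16) (lower, upper int) {
	minTotal := sum4mer(count1)
	if t := sum4mer(count2); t < minTotal {
		minTotal = t
	}
	common := common4mer(count1, count2)

	lower = common
	if common > 0 {
		lower += 3
	}
	// (x + 3) / 4 equals ceil(x / 4) for non-negative integers,
	// and minTotal - common is never negative.
	upper = minTotal + 3 - (minTotal-common+3)/4
	return lower, upper
}

func main() {
	var p1, p2 [256]uint16
	p1[27], p1[108], p1[177], p1[198] = 2, 1, 1, 1 // five 4-mers
	p2[27], p2[0] = 1, 4                           // five 4-mers, one shared with p1
	lower, upper := lcs4merBounds(&p1, &p2)
	fmt.Println(lower, upper) // common = 1: lower = 1 + 3, upper = 5 + 3 - ceil(4/4)
}
```

With two identical profiles of five 4-mers both bounds evaluate to 8, the length of an 8-nucleotide sequence.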
Error4MerBounds(count1, count2)
Estimates bounds for alignment errors (e.g., mismatches + indels):
- Upper bound: max_total − common + 2 * floor((common + 5) / 8), where max_total = max(total1, total2) and common = Common4Mer(count1, count2)
- Lower bound: ceil(upper_bound / 4)
  Provides fast, approximate error estimates without full alignment.
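A matching sketch for the error bounds, again only an illustration: `error4merBounds`, its helpers, and the (lower, upper) return order are assumptions, and the formulas are taken as stated in the text.

```go
package main

import "fmt"

// sum4mer returns the total number of 4-mers in a profile.
func sum4mer(count *[256]uint16) int {
	total := 0
	for _, c := range count {
		total += int(c)
	}
	return total
}

// common4mer returns the sum of element-wise minima of two profiles.
func common4mer(count1, count2 *[256]uint16) int {
	shared := 0
	for i, c1 := range count1 {
		if c2 := count2[i]; c2 < c1 {
			shared += int(c2)
		} else {
			shared += int(c1)
		}
	}
	return shared
}

// error4merBounds applies the upper bound formula first, then derives
// the lower bound from it, as described in the text.
func error4merBounds(count1, count2 *[256]uint16) (lower, upper int) {
	maxTotal := sum4mer(count1)
	if t := sum4mer(count2); t > maxTotal {
		maxTotal = t
	}
	common := common4mer(count1, count2)

	// Integer division already floors for non-negative operands.
	upper = maxTotal - common + 2*((common+5)/8)
	// (x + 3) / 4 equals ceil(x / 4) for non-negative integers.
	lower = (upper + 3) / 4
	return lower, upper
}

func main() {
	var p1, p2 [256]uint16
	p1[27], p1[108], p1[177], p1[198] = 2, 1, 1, 1 // five 4-mers
	p2[27], p2[0] = 1, 4                           // five 4-mers, one shared with p1
	lower, upper := error4merBounds(&p1, &p2)
	fmt.Println(lower, upper) // common = 1: upper = 5 - 1 + 0, lower = ceil(4/4)
}
```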

Use Case

Designed for high-performance comparison of NGS reads (e.g., in metabarcoding), where exact alignment is too costly, and k-mer-based heuristics enable scalable similarity estimation.
