<!DOCTYPE html>
<html lang="en-us" dir="ltr">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="description" content="
obipairing: align forward and reverse paired reads
#
Description
#
When DNA metabarcoding sequences are generated as paired reads on the Illumina platform,
obipairing
aims to align forward and reverse reads to generate full length amplicon sequences.
Input data
#
The
obipairing
command requires two input files:
One file contains the forward reads.
The second file contains the reverse reads.
Both files must contain the same number of sequences, and the sequences must be in the same order. This means that the first sequence of the forward reads file must correspond to the first sequence of the reverse reads file.
obipairing
will take this order into account and will only align sequences that are in the same rank.">
<meta name="theme-color" media="(prefers-color-scheme: light)" content="#ffffff">
<meta name="theme-color" media="(prefers-color-scheme: dark)" content="#343a40">
<meta name="color-scheme" content="light dark"><meta property="og:url" content="http://metabar:8888/obidoc/obitools/obipairing/">
<meta property="og:site_name" content="OBITools4 documentation">
<meta property="og:title" content="obipairing">
<meta property="og:description" content="obipairing: align forward and reverse paired reads # Description # When DNA metabarcoding sequences are generated as paired reads on the Illumina platform, obipairing aims to align forward and reverse reads to generate full length amplicon sequences.
Input data # The obipairing command requires two input files:
One file contains the forward reads. The second file contains the reverse reads. Both files must contain the same number of sequences, and the sequences must be in the same order. This means that the first sequence of the forward reads file must correspond to the first sequence of the reverse reads file. obipairing will take this order into account and will only align sequences that are in the same rank.">
<meta property="og:locale" content="en_us">
<meta property="og:type" content="website">
<title>obipairing | OBITools4 documentation</title>
<link rel="icon" href="/obidoc/favicon.png" >
<link rel="manifest" href="/obidoc/manifest.json">
<link rel="canonical" href="http://metabar:8888/obidoc/obitools/obipairing/">
<link rel="stylesheet" href="/obidoc/book.min.5fd7b8e2d1c0ae15da279c52ff32731130386f71b58f011468f20d0056fe6b78.css" integrity="sha256-X9e44tHArhXaJ5xS/zJzETA4b3G1jwEUaPINAFb+a3g=" crossorigin="anonymous">
<script defer src="/obidoc/fuse.min.js"></script>
<script defer src="/obidoc/en.search.min.4da51bdd2d833922fdbc0e19df517221387fc625ffb68ee140d605b3c5b68058.js" integrity="sha256-TaUb3S2DOSL9vA4Z31FyITh/xiX/to7hQNYFs8W2gFg=" crossorigin="anonymous"></script>
<script defer src="/obidoc/sw.min.32af8eafce4180aa1c5dea66d99fb26ba9043ea7c7a4c706138c91d9051b285e.js" integrity="sha256-Mq+Or85BgKocXepm2Z+ya6kEPqfHpMcGE4yR2QUbKF4=" crossorigin="anonymous"></script>
<link rel="alternate" type="application/rss+xml" href="http://metabar:8888/obidoc/obitools/obipairing/index.xml" title="OBITools4 documentation" />
<!--
Made with Book Theme
https://github.com/alex-shpak/hugo-book
-->
<link rel="stylesheet" type="text/css" href="http://metabar:8888/obidoc/hugo-cite.css" />
</head>
<body dir="ltr">
<input type="checkbox" class="hidden toggle" id="menu-control" />
<input type="checkbox" class="hidden toggle" id="toc-control" />
<main class="container flex">
<aside class="book-menu">
<div class="book-menu-content">
<nav>
<h2 class="book-brand">
<a class="flex align-center" href="/obidoc/"><img src="/obidoc/obitools_logo.jpg" alt="Logo" class="book-icon" /><span>OBITools4 documentation</span>
</a>
</h2>
<div class="book-search hidden">
<input type="text" id="book-search-input" placeholder="Search" aria-label="Search" maxlength="64" data-hotkeys="s/" />
<div class="book-search-spinner hidden"></div>
<ul id="book-search-results"></ul>
</div>
<script>document.querySelector(".book-search").classList.remove("hidden")</script>
<ul>
<li>
<span>Docs</span>
<ul>
<li>
<a href="/obidoc/docs/about/" class="">About</a>
</li>
<li>
<a href="/obidoc/docs/installation/" class="">Installation</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/principles/" class="">General operating principles</a>
<ul>
</ul>
</li>
<li>
<input type="checkbox" id="section-08756b4c1f14be6ee584ece005b9f621" class="toggle" />
<label for="section-08756b4c1f14be6ee584ece005b9f621" class="flex justify-between">
<a role="button" class="">File formats</a>
</label>
<ul>
<li>
<input type="checkbox" id="section-933c2e64b905b84e22aa5273cea2d0bd" class="toggle" />
<label for="section-933c2e64b905b84e22aa5273cea2d0bd" class="flex justify-between">
<a role="button" class="">Sequence file formats</a>
</label>
<ul>
<li>
<a href="/obidoc/formats/fasta/" class="">FASTA file format</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/formats/fastq/" class="">FASTQ file format</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/formats/genbank/" class="">GenBank Flat File format</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/formats/embl/" class="">EMBL Flat File format</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/file_format/sequence_files/csv/" class="">CSV format</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/formats/json/" class="">JSON format</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/file_format/sequence_files/annotations/" class="">Annotation of sequences</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<input type="checkbox" id="section-0258ae1c222f9a38cc1b75254c93b0f4" class="toggle" />
<label for="section-0258ae1c222f9a38cc1b75254c93b0f4" class="flex justify-between">
<a role="button" class="">Taxonomy file formats</a>
</label>
<ul>
<li>
<a href="/obidoc/docs/file_format/taxonomy_file/csv_taxdump/" class="">CSV formatted taxdump</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/file_format/taxonomy_file/ncbi_taxdump/" class="">NCBI taxdump</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<a href="/obidoc/formats/csv/" class="">The CSV format</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<input type="checkbox" id="section-70b1e6e5ec7f3ccab643155fa50659b6" class="toggle" />
<label for="section-70b1e6e5ec7f3ccab643155fa50659b6" class="flex justify-between">
<a role="button" class="">Patterns</a>
</label>
<ul>
<li>
<a href="/obidoc/docs/patterns/regular/" class="">Regular Expressions</a>
</li>
<li>
<a href="/obidoc/docs/patterns/dnagrep/" class="">DNA Patterns</a>
</li>
</ul>
</li>
<li>
<input type="checkbox" id="section-8223f464911a1fe6c655972143684e93" class="toggle" checked />
<label for="section-8223f464911a1fe6c655972143684e93" class="flex justify-between">
<a role="button" class="">The OBITools4 commands</a>
</label>
<ul>
<li>
<a href="/obidoc/docs/commands/options/" class="">Shared command options</a>
<ul>
</ul>
</li>
<li>
<input type="checkbox" id="section-8921ea65523c266b128dd4263232b0fc" class="toggle" />
<label for="section-8921ea65523c266b128dd4263232b0fc" class="flex justify-between">
<a role="button" class="">Basics</a>
</label>
<ul>
<li>
<a href="/obidoc/obitools/obiannotate/" class="">obiannotate</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obicomplement/" class="">obicomplement</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obiconvert/" class="">obiconvert</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obicount/" class="">obicount</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obicsv/" class="">obicsv</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obidemerge/" class="">obidemerge</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obidistribute/" class="">obidistribute</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obigrep/" class="">obigrep</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obijoin/" class="">obijoin</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obimatrix/" class="">obimatrix</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obisplit/" class="">obisplit</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obisummary/" class="">obisummary</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obiuniq/" class="">obiuniq</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<input type="checkbox" id="section-dbdf1bb5377572439394e60e08c30f50" class="toggle" />
<label for="section-dbdf1bb5377572439394e60e08c30f50" class="flex justify-between">
<a role="button" class="">Demultiplexing samples</a>
</label>
<ul>
<li>
<a href="/obidoc/obitools/obimultiplex/" class="">obimultiplex</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obitagpcr/" class="">obitagpcr</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<input type="checkbox" id="section-aa98fedd067b51150db59691a8ea8edd" class="toggle" checked />
<label for="section-aa98fedd067b51150db59691a8ea8edd" class="flex justify-between">
<a role="button" class="">Sequence alignments</a>
</label>
<ul>
<li>
<a href="/obidoc/obitools/obiclean/" class="">obiclean</a>
<ul>
</ul>
</li>
<li>
<input type="checkbox" id="section-7433746525d8c2b29b033f765c869acd" class="toggle" checked />
<label for="section-7433746525d8c2b29b033f765c869acd" class="flex justify-between">
<a href="/obidoc/obitools/obipairing/" class="active">obipairing</a>
</label>
<ul>
<li>
<a href="/obidoc/docs/commands/alignments/obipairing/fasta-like/" class="">The FASTA-like alignment</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/commands/alignments/obipairing/exact-alignment/" class="">Exact alignment</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obipcr/" class="">obipcr</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obirefidx/" class="">obirefidx</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obitag/" class="">obitag</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<input type="checkbox" id="section-5746f699d10490780dec8e30ab2dd3ce" class="toggle" />
<label for="section-5746f699d10490780dec8e30ab2dd3ce" class="flex justify-between">
<a role="button" class="">Taxonomy</a>
</label>
<ul>
<li>
<a href="/obidoc/obitools/obitaxonomy/" class="">obitaxonomy</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<input type="checkbox" id="section-3f50c4fe7ab436a56ae92897d5444956" class="toggle" />
<label for="section-3f50c4fe7ab436a56ae92897d5444956" class="flex justify-between">
<a role="button" class="">Advanced tools</a>
</label>
<ul>
<li>
<a href="/obidoc/obitools/obiscript/" class="">obiscript</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<input type="checkbox" id="section-549be3934679fcb82a232f6bd5435563" class="toggle" />
<label for="section-549be3934679fcb82a232f6bd5435563" class="flex justify-between">
<a role="button" class="">Others</a>
</label>
<ul>
<li>
<a href="/obidoc/obitools/obimicrosat/" class="">obimicrosat</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<input type="checkbox" id="section-ceca4455173761e30cbc0a6dc2327167" class="toggle" />
<label for="section-ceca4455173761e30cbc0a6dc2327167" class="flex justify-between">
<a role="button" class="">Experimentals</a>
</label>
<ul>
<li>
<a href="/obidoc/obitools/obicleandb/" class="">obicleandb</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obiconsensus/" class="">obiconsensus</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/obitools/obilandmark/" class="">obilandmark</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<a href="/obidoc/docs/commands/tags/" class="">Glossary of tags</a>
</li>
</ul>
</li>
<li>
<input type="checkbox" id="section-9b1bcd52530c59dc4819b1f61c128f54" class="toggle" />
<label for="section-9b1bcd52530c59dc4819b1f61c128f54" class="flex justify-between">
<a role="button" class="">Cookbook</a>
</label>
<ul>
<li>
<a href="/obidoc/docs/cookbook/illumina/" class="">Analysing an Illumina data set</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/cookbook/ecoprimers/" class="">Designing new barcodes</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/cookbook/local_genbank/" class="">Prepare a local copy of Genbank</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/cookbook/reference_db/" class="">Build a reference database</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/cookbook/minion/" class="">Oxford Nanopore data analysis</a>
<ul>
</ul>
</li>
</ul>
</li>
<li>
<span>Programming OBITools</span>
<ul>
<li>
<a href="/obidoc/docs/programming/expression/" class="">Expression language</a>
<ul>
</ul>
</li>
<li>
<input type="checkbox" id="section-6d580829a667b5cca790b286d99a10fe" class="toggle" />
<label for="section-6d580829a667b5cca790b286d99a10fe" class="flex justify-between">
<a href="/obidoc/docs/programming/lua/" class="">Lua: for scripting OBITools</a>
</label>
<ul>
<li>
<input type="checkbox" id="section-2fb081dac812d624eea5f4268fca9e26" class="toggle" />
<label for="section-2fb081dac812d624eea5f4268fca9e26" class="flex justify-between">
<a role="button" class="">Obitools Classes</a>
</label>
<ul>
<li>
<a href="/obidoc/docs/programming/lua/obitools_classes/biosequence/" class="">BioSequence</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/programming/lua/obitools_classes/biosequenceslice/" class="">BioSequenceSlice</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/programming/lua/obitools_classes/taxonomy/" class="">Taxonomy</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/programming/lua/obitools_classes/taxon/" class="">Taxon</a>
<ul>
</ul>
</li>
<li>
<a href="/obidoc/docs/programming/lua/obitools_classes/mutex/" class="">Mutex</a>
<ul>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</nav>
<script>(function(){var e=document.querySelector("aside .book-menu-content");addEventListener("beforeunload",function(){localStorage.setItem("menu.scrollTop",e.scrollTop)}),e.scrollTop=localStorage.getItem("menu.scrollTop")})()</script>
</div>
</aside>
<div class="book-page">
<header class="book-header">
<div class="flex align-center justify-between">
<label for="menu-control">
<img src="/obidoc/svg/menu.svg" class="book-icon" alt="Menu" />
</label>
<h3>obipairing</h3>
<label for="toc-control">
<img src="/obidoc/svg/toc.svg" class="book-icon" alt="Table of Contents" />
</label>
</div>
<aside class="hidden clearfix">
<nav id="TableOfContents">
<ul>
<li><a href="#obipairing-align-forward-and-reverse-paired-reads"><code>obipairing</code>: align forward and reverse paired reads</a>
<ul>
<li><a href="#description">Description</a>
<ul>
<li><a href="#input-data">Input data</a></li>
<li><a href="#the-simplest-obipairing-command">The simplest <em>obipairing</em> command</a></li>
<li><a href="#the-alignment-process">The alignment process</a></li>
<li><a href="#building-the-consensus-sequence">Building the consensus sequence</a></li>
</ul>
</li>
<li><a href="#synopsis">Synopsis</a></li>
<li><a href="#options">Options</a>
<ul>
<li></li>
</ul>
</li>
<li><a href="#examples">Examples</a>
<ul>
<li><a href="#basic-example">Basic example</a></li>
<li><a href="#pairing-the-reads-in-exact-mode">Pairing the reads in exact mode</a></li>
</ul>
</li>
</ul>
</li>
</ul>
</nav>
</aside>
</header>
<article class="markdown book-article"><h1 id="obipairing-align-forward-and-reverse-paired-reads">
<code>obipairing</code>: align forward and reverse paired reads
<a class="anchor" href="#obipairing-align-forward-and-reverse-paired-reads">#</a>
</h1>
<h2 id="description">
Description
<a class="anchor" href="#description">#</a>
</h2>
<p>When DNA metabarcoding sequences are generated as paired reads on the Illumina platform, <a href="http://metabar:8888/obidoc/obitools/obipairing/">
<abbr title="obipairing: align the forward and reverse paired reads"><code>obipairing</code></abbr>
</a> aims to align the forward and reverse reads to generate full-length amplicon sequences.</p>
<h3 id="input-data">
Input data
<a class="anchor" href="#input-data">#</a>
</h3>
<p>The <a href="http://metabar:8888/obidoc/obitools/obipairing/">
<abbr title="obipairing: align the forward and reverse paired reads"><code>obipairing</code></abbr>
</a> command requires two input files:</p>
<ul>
<li>One file contains the forward reads.</li>
<li>The second file contains the reverse reads.</li>
</ul>
<p>Both files must contain the same number of sequences, and the sequences must be in the same order: the first sequence of the forward reads file must correspond to the first sequence of the reverse reads file, and so on. <a href="http://metabar:8888/obidoc/obitools/obipairing/">
<abbr title="obipairing: align the forward and reverse paired reads"><code>obipairing</code></abbr>
</a> relies on this order and only aligns sequences that have the same rank in their respective files.</p>
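<p>Because pairing relies on rank rather than on identifiers, it can be worth checking the two files before running the command. The following is a minimal sketch, not part of OBITools4: the function names <code>read_ids</code> and <code>check_pairing</code> are hypothetical, and the code assumes strict four-line FASTQ records (no wrapped sequences). It reports every rank at which the identifiers differ or at which one file has run out of records.</p>

```python
from itertools import zip_longest


def read_ids(lines):
    """Yield the identifier of each record, assuming strict 4-line FASTQ records."""
    for i, line in enumerate(lines):
        if i % 4 == 0 and line.strip():
            # Header line: '@<id> <optional description>'
            yield line[1:].split()[0]


def check_pairing(forward_lines, reverse_lines):
    """Return (rank, forward_id, reverse_id) for every rank whose ids differ.

    A missing record (files of different lengths) is reported with None.
    """
    problems = []
    pairs = zip_longest(read_ids(forward_lines), read_ids(reverse_lines))
    for rank, (fwd_id, rev_id) in enumerate(pairs, start=1):
        if fwd_id != rev_id:
            problems.append((rank, fwd_id, rev_id))
    return problems


# Toy data (hypothetical reads, not taken from the example files)
forward = ["@read1 1:N:0", "ACGT", "+", "IIII", "@read2 1:N:0", "ACGT", "+", "IIII"]
reverse = ["@read1 2:N:0", "TTGA", "+", "IIII", "@read3 2:N:0", "TTGA", "+", "IIII"]
print(check_pairing(forward, reverse))  # [(2, 'read2', 'read3')]
```

<p>With real data, the two arguments can be open file handles, for instance <code>check_pairing(open("forward.fastq"), open("reverse.fastq"))</code>. An empty list means that both files have the same number of records and matching identifiers at every rank.</p>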
<p>Consider the following example, where the forward reads file is
<a href="forward.fastq"><code>forward.fastq</code></a> and the reverse reads file is
<a href="reverse.fastq"><code>reverse.fastq</code></a>. Each file contains four sequences:</p>
<a style="padding: 10px 20px; background-color: #cacaca; border: 1px solid #8e8080; border-bottom: none; border-radius: 5px 5px 0 0; box-shadow: 0 2px 5px rgba(0, 0, 0, 0.1)"
href="forward.fastq" download="forward.fastq">📄 forward.fastq</a>
<DIV style="border: 2px solid #8e8080; border-radius: 0 0 5px 5px; padding: 20px; background-color: white; ">
<pre tabindex="0"><code class="language-fastq" data-lang="fastq">@M01334:147:000000000-LBRVD:1:1101:14968:1570 1:N:0:CTCACCAA+CTAGGCAA
TGTTCCACGGGCAATCCTGAGCCAAATCTTTCATTTTGAAAAAATGAGAGATATAATGTATCTCTTATTTATTATAAGAAATAAAATATTTCTTATCTAATATTAAAGTTAGGTGCAGAGACTCAATGGGTGGAACTAGATCGGATGTGCA
+
11>A>@3@A11>ACFFEG110BFB00BAFGHE2DFGG201110/B11111/D1D2222D2FDFDFGDGHHBGG2F222110D11@1D1FGHFHGFF@GE1F2FG22112B220F1@111/0>BF11B210B>//11B1<1BB<///<1122
@M01334:147:000000000-LBRVD:1:1101:15946:1586 1:N:0:CTCACCAA+CTAGGCAA
TCCTAACCCCATTGAGTCTCTGCACCTATCTTTAATATTAGATAAGAAATATTTTATTTCTTATAATAAATAAGAGATATTTTATATCTCTCATTTTTTCAAAATGAAAGATTTGGCTCAGGATTGCCCACGTAACGGAGATCGGAAGAGC
+
1>>A111>>>AFGGB1FFGFGFF3BBF1GGHHH33D2GH2B1D211110D1DGHHBFGGGGG2FA2F221F21A1F0D1DGHH2FAFFGFHFFGHHHHGG22@1BD111@0FFHE11GC1001BGF1B1B/EF00??////BF////<000
@M01334:147:000000000-LBRVD:1:1101:15399:1590 1:N:0:CTCACCAA+CTAGGCAA
TGTTCCACCCATTGAGTCTCTGCACCTATCTTTAATATTAGATAAGAAATATTTTACTTCTTATAATAAATAAGAGTTATTTTATATCTCTCATTTTTTCAAAATGAAAGATTTGGCTCAGGATTGCCCGTGGAACTAGATCGGAAGAGCA
+
11>A>@3B>>1CF111BBFAG3A3AAF1FFGHHF3FBGH221F211110D1DGHH2BBGBFF2F22D221D211111A2DDGG2F2FFFEGD1FFHHHGFD221B111110BFGD11F@1001BF0@@1/EA//1>F1B1FD/////00<1
@M01334:147:000000000-LBRVD:1:1101:13773:1687 1:N:0:CTCACCAA+CTAGGCAA
CTCGGATCACCATTGAGTCTCTGCACCTATCTTTAATATTAGATAAGAAAAAATATTATTTCTTATCTGAAATAAGAAATATTTTATATATTTCTTTTTCTCAAAATGAAAGATTTGGCTCAGGATTGCCCTGATCCGAGGGATAGCACCA
+
3AAAAAADFFFFGGGGFGGGGGHHHHHHFHHHHHHHHGHHHHGHGGHFFHHHCGFHHHHHHHHHHHHHGHHGGFHFFHHHGHHHHBHHHGHHHHHHHHHHHHHFFHHFBDFBCGHHF4BGHFGFFHHBDGFHHEHHFAAEECEGF3FDGFC
</code></pre></td>
</DIV>
<a style="padding: 10px 20px; background-color: #cacaca; border: 1px solid #8e8080; border-bottom: none; border-radius: 5px 5px 0 0; box-shadow: 0 2px 5px rgba(0, 0, 0, 0.1)"
href="reverse.fastq" download="reverse.fastq">📄 reverse.fastq</a>
<DIV style="border: 2px solid #8e8080; border-radius: 0 0 5px 5px; padding: 20px; background-color: white; ">
<pre tabindex="0"><code class="language-fastq" data-lang="fastq">@M01334:147:000000000-LBRVD:1:1101:14968:1570 2:N:0:CTCACCAA+CTAGGCAA
TTTTCCTCCCTTTTTTTCTCTGCACCTTTCTTTTTTATTAGTTTTTTATTATTTTTTTTCTTTTTTTATTTTATTGATACTTTATATCTCTCTTTTTTTCTTTTTTATTGATTTTTCTCTGGTTTTCCCTTGTTACTTGTTCTTTTTTGCT
+
11>>1131111BB111A0B3B313A0B1BAFGG11E/DG222B22///1D2DDGG1AE>>FG1D1/>/12B221212@21BFD2B2B2B2F11BFGHEEC1111B//1212BBF110@22111@@/2111?01111@111?111111--11
@M01334:147:000000000-LBRVD:1:1101:15946:1586 2:N:0:CTCACCAA+CTAGGCAA
CCGTTACGTGGGCAATCCTGAGCCAATTCTTTCTTTTTGAAAAAATGAGAGATATAAAATATCTCTTATTTATTATAAGAAATAAAATATTTCTTATCTAATATTAATGATAGGTGCAGTGACTCTATGGGGTTAGGTAGTTCGGATGAGC
+
111>>111B111111BA0B1101B001BAGGH22DGGH?01110/B11111/D1D2221D1DBEDGH1GHH2GG2F222110D@111D1DFGEGFBG@GB1B2FG22222B220B11111111B@11B210/?E/00B211B2/////111
@M01334:147:000000000-LBRVD:1:1101:15399:1590 2:N:0:CTCACCAA+CTAGGCAA
TTTTCCTCGGGCTATCCTGAGCCAAATCTTTCCTTTTGAAAAATTTAGAGATATAAAATATCTCTTATTTATTTTATGTAGTATTATATTTCTTATCTAATATTAAATTTAGTTGCTTTTTCTCATTTTGTTTTACTTTTTCTTTTTTGCT
+
11>>1131111111B11B1101A000B1DFF21DDFG1011100B122111D1D2221D1DADAFG1DGH2FG2D212222D2222D2DAF2FG2D@F21B2DE22122B221@11111110B222B222B00021B221B011111//11
@M01334:147:000000000-LBRVD:1:1101:13773:1687 2:N:0:CTCACCAA+CTAGGCAA
TGATAGCAGGGCTATCCTGAGCCAAATCCGTGTTTTGAGAAAACAAGGGGGTTCTCGAACTAGAATACAAAAGAAAAGGATAGGTGCAGAGACTCAATGGTGCTATCCCTCGGATCAGGGCAATCCTTAGCCAAATCTTTCATTTTTTGAA
+
111>13@1111>11B1AF11BABC00B110BAFGGH0000DFAB//0///EEECGFA10AG1111D@@11100/0000/0F110B11@11/0>FC@1B>1B11FEFEC>E>///?<0110/?/FF<G22111@00@<GHHB>FHHH1///1
</code></pre></td>
</DIV>
<p>The first sequence of the
<a href="forward.fastq"><code>forward.fastq</code></a> file, with the id <code>M01334:147:000000000-LBRVD:1:1101:14968:1570</code>, is paired with the first sequence of the
<a href="reverse.fastq"><code>reverse.fastq</code></a> file, which has the same id <code>M01334:147:000000000-LBRVD:1:1101:14968:1570</code>. They are paired not because their identifiers match, but because each is the first sequence of its file.</p>
<h3 id="the-simplest-obipairing-command">
The simplest <em>obipairing</em> command
<a class="anchor" href="#the-simplest-obipairing-command">#</a>
</h3>
<p>The minimal <a href="http://metabar:8888/obidoc/obitools/obipairing/">
<abbr title="obipairing: align the forward and reverse paired reads"><code>obipairing</code></abbr>
</a> command to align the
<a href="forward.fastq"><code>forward.fastq</code></a> and
<a href="reverse.fastq"><code>reverse.fastq</code></a> files is:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>obipairing -F forward.fastq -R reverse.fastq > paired.fastq
</span></span></code></pre></div>
<script src="/obidoc/mermaid.min.js"></script>
<script>mermaid.initialize({
"flowchart": {
"useMaxWidth":true
},
"theme": "default"
}
)</script>
<pre class="mermaid workflow">
graph TD
A@{ shape: doc, label: "forward.fastq" }
B@{ shape: doc, label: "reverse.fastq" }
C[obipairing]
D@{ shape: doc, label: "paired.fastq" }
A --> C
B --> C:::obitools
C --> D
classDef obitools fill:#99d57c
</pre>
<p>This command produces a file named
<a href="paired.fastq"><code>paired.fastq</code></a> with the following content:</p>
<a style="padding: 10px 20px; background-color: #cacaca; border: 1px solid #8e8080; border-bottom: none; border-radius: 5px 5px 0 0; box-shadow: 0 2px 5px rgba(0, 0, 0, 0.1)"
href="paired.fastq" download="paired.fastq">📄 paired.fastq</a>
<DIV style="border: 2px solid #8e8080; border-radius: 0 0 5px 5px; padding: 20px; background-color: white; ">
<pre tabindex="0"><code class="language-fastq" data-lang="fastq">@M01334:147:000000000-LBRVD:1:1101:14968:1570 {"ali_length":137,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"join","score":1687,"score_norm":0.679,"seq_ab_match":93}
tgttccacgggcaatcctgagccaaatctttcattttgaaaaaatgagagatataatgtatctcttatttattataagaaataaaatatttcttatctaatattaaagttaggtgcagagactcaatgggtggaactagatcggatgtgca..........agcaaaaaagaacaagtaacaagggaaaaccagagaaaaatcaataaaaaagaaaaaaagagagatataaagtatcaataaaataaaaaaagaaaaaaaataataaaaaactaataaaaaagaaaggtgcagagaaaaaaagggaggaaaa
+
11>A>@3@A11>ACFFEG110BFB00BAFGHE2DFGG201110/B11111/D1D2222D2FDFDFGDGHHBGG2F222110D11@1D1FGHFHGFF@GE1F2FG22112B220F1@111/0>BF11B210B>//11B1<1BB<///<1122!!!!!!!!!!11--111111?111@11110?1112/@@11122@011FBB2121//B1111CEEHGFB11F2B2B2B2DFB12@212122B21/>/1D1GF>>EA1GGDD2D1///22B222GD/E11GGFAB1B0A313B3B0A111BB1111311>>11
@M01334:147:000000000-LBRVD:1:1101:15946:1586 {"ali_dir":"right","ali_length":138,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"alignment","pairing_mismatches":{"(T:16)->(A:33)":14,"(T:33)->(A:17)":118,"(T:37)->(A:16)":125,"(T:38)->(A:16)":32,"(T:39)->(A:17)":44},"paring_fast_count":114,"paring_fast_overlap":138,"paring_fast_score":0.844,"score":5446,"score_norm":0.957,"seq_a_single":13,"seq_ab_match":132,"seq_b_single":13}
gctcatccgaactacctaaccccattgagtctctgcacctatctttaatattagataagaaatattttatttcttataataaataagagatattttatatctctcattttttcaaaatgaaagatttggctcaggattgcccacgtaacggagatcggaagagc
+
111/////2B112CMMOUO?MNObVHfcAVVHVWVVTQSWRXXIYYYXUSWiXaWeWWUWVSTTTWXgeUWWXXXWWgXWYYWVYWdUgSTTTXYYUVdTVWVXVgUWXXXVeYXfTCUXWW`QGUWfA@WSR?PRRWVARAc?UVMMOO?///BF////<000
@M01334:147:000000000-LBRVD:1:1101:15399:1590 {"ali_length":4,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"join","score":126,"score_norm":1,"seq_ab_match":4}
tgttccacccattgagtctctgcacctatctttaatattagataagaaatattttacttcttataataaataagagttattttatatctctcattttttcaaaatgaaagatttggctcaggattgcccgtggaactagatcggaagagca..........agcaaaaaagaaaaagtaaaacaaaatgagaaaaagcaactaaatttaatattagataagaaatataatactacataaaataaataagagatattttatatctctaaatttttcaaaaggaaagatttggctcaggatagcccgaggaaaa
+
11>A>@3B>>1CF111BBFAG3A3AAF1FFGHHF3FBGH221F211110D1DGHH2BBGBFF2F22D221D211111A2DDGG2F2FFFEGD1FFHHHGFD221B111110BFGD11F@1001BF0@@1/EA//1>F1B1FD/////00<1!!!!!!!!!!11//111110B122B12000B222B222B01111111@122B22122ED2B12F@D2GF2FAD2D2222D222212D2GF2HGD1GFADAD1D1222D1D111221B0011101GFDD12FFD1B000A1011B11B1111111311>>11
@M01334:147:000000000-LBRVD:1:1101:13773:1687 {"ali_dir":"left","ali_length":54,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"alignment","pairing_mismatches":{"(C:39)->(A:16)":102,"(C:39)->(A:17)":121,"(T:39)->(A:14)":101},"paring_fast_count":42,"paring_fast_overlap":54,"paring_fast_score":0.824,"score":2888,"score_norm":0.944,"seq_a_single":97,"seq_ab_match":51,"seq_b_single":97}
ctcggatcaccattgagtctctgcacctatctttaatattagataagaaaaaatattatttcttatctgaaataagaaatattttatatatttctttttctcaaaatgaaagatttggctcaggattgccctgatccgagggatagcaccattgagtctctgcacctatccttttcttttgtattctagttcgagaacccccttgttttctcaaaacacggatttggctcaggatagccctgctatca
+
3AAAAAADFFFFGGGGFGGGGGHHHHHHFHHHHHHHHGHHHHGHGGHFFHHHCGFHHHHHHHHHHHHHGHHGGFHFFHHHGHHHHBHHHGHHHHHHHXVVJIommmegikl]bVWgVDRXIlbkkVfPSWVWccVVT^ebggjkkCVeWcd1@CF>0/11@11B011F0/0000/00111@@D1111GA01AFGCEEE///0//BAFD0000HGGFAB011B00CBAB11FA1B11>1111@31>111
</code></pre></td>
</DIV>
|
|
|
|
<h3 id="the-alignment-process">
|
|
The alignment process
|
|
<a class="anchor" href="#the-alignment-process">#</a>
|
|
</h3>
|
|
<p><a href="http://metabar:8888/obidoc/obitools/obipairing/">
|
|
<abbr title="obipairing: align the forward and reverse paired reads"><code>obipairing</code></abbr>
|
|
</a> aligns the reads in two steps to reduce computation time.</p>
|
|
<h4 id="a-fast-alignment-to-determine-quickly-the-overlap">
|
|
  A fast alignment to quickly determine the overlap
|
|
<a class="anchor" href="#a-fast-alignment-to-determine-quickly-the-overlap">#</a>
|
|
</h4>
|
|
<p>The first step aligns the reads using a
|
|
<a href="/obidoc/docs/commands/alignments/obipairing/fasta-like/">FASTA-derived algorithm</a>.
|
|
Based on the results of this first step, a second alignment is performed on the overlapping region only, using an exact dynamic programming algorithm that takes into account the sequence quality scores present in the
|
|
<a href="http://metabar:8888/obidoc/formats/fastq/">fastq</a>
|
|
files. It is possible to disable this first alignment step at the cost of an increase in the computation time by using the <code>--exact-mode</code> option.</p>
|
|
<p>The first fast alignment step adds three tags to the FASTQ header of each sequence record to report its results. In the example output above, these tags are written with the prefix <code>paring_fast_</code>.</p>
|
|
<ul>
|
|
<li><code>paring_fast_count</code> : Number of 4-mers shared on the main diagonal of the
|
|
<a href="/obidoc/docs/commands/alignments/obipairing/fasta-like/#dotplot">fasta dot plot</a>.</li>
|
|
<li><code>pairing_fast_overlap</code> : Length in nucleotides of the overlap as detected by this algorithm.</li>
|
|
<li><code>pairing_fast_score</code> : The number of 4-mers shared on the main diagonal of the
|
|
<a href="/obidoc/docs/commands/alignments/obipairing/fasta-like/#dotplot">fasta dot plot</a> (<code>pairing_fast_count</code>) divided by the number of 4-mers contained in the overlapping region of the forward and reverse reads (
|
|
<link rel="stylesheet" href="/obidoc/katex/katex.min.css" />
|
|
<script defer src="/obidoc/katex/katex.min.js"></script>
|
|
<script defer src="/obidoc/katex/auto-render.min.js" onload="renderMathInElement(document.body);"></script><span>
|
|
\(pairing\_fast\_overlap - 3\)
|
|
</span>
|
|
)
|
|
<span>
|
|
\[
|
|
pairing\_fast\_score = \frac{pairing\_fast\_count}{pairing\_fast\_overlap - 3}
|
|
\]
|
|
</span>
|
|
</li>
|
|
</ul>
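<p>As a quick check of the formula above, the following minimal sketch recomputes the fast score from the count and overlap values reported for the two aligned records of the example output. The function name <code>fast_score</code> and its parameter <code>k</code> are hypothetical and are not part of <em>OBITools4</em>. The sketch only assumes that an overlap of <em>n</em> nucleotides contains <em>n</em> - 3 4-mers.</p>

```python
def fast_score(fast_count: int, fast_overlap: int, k: int = 4) -> float:
    """Fraction of the k-mers of the overlap that are shared on the main diagonal."""
    # An overlap of n nucleotides contains n - (k - 1) k-mers, i.e. n - 3 for 4-mers.
    n_kmers = fast_overlap - (k - 1)
    if n_kmers <= 0:
        raise ValueError("the overlap is shorter than k nucleotides")
    return fast_count / n_kmers


# (count, overlap) values copied from the two records tagged mode=alignment
# in the example output.
print(round(fast_score(114, 138), 3))  # 0.844
print(round(fast_score(42, 54), 3))    # 0.824
```

<p>Both values agree with the <code>paring_fast_score</code> tags reported in the example output (0.844 and 0.824).</p>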
|
|
<p>There are two options for controlling this first step.</p>
|
|
<ul>
|
|
<li>
|
|
<p>The <code>--fast-absolute</code> option allows changing the
|
|
<a href="/obidoc/docs/commands/alignments/obipairing/fasta-like/#fasta-scores">best alignment selection</a> from the one with the highest <code>pairing_fast_score</code> (the default behavior) to the one with the highest <code>pairing_fast_count</code>.</p>
|
|
</li>
|
|
<li>
|
|
<p>The <code>--exact-mode</code> option tells <a href="http://metabar:8888/obidoc/obitools/obipairing/">
|
|
<abbr title="obipairing: align the forward and reverse paired reads"><code>obipairing</code></abbr>
|
|
</a> to bypass this first alignment step and proceed directly to exact alignment, at the cost of a longer computation time.</p>
|
|
</li>
|
|
</ul>
|
|
<h4 id="the-exact-alignment-of-the-overlapping-regions">
|
|
The exact alignment of the overlapping regions
|
|
<a class="anchor" href="#the-exact-alignment-of-the-overlapping-regions">#</a>
|
|
</h4>
|
|
<p>Once the overlap has been quickly identified using the
|
|
<a href="/obidoc/docs/commands/alignments/obipairing/fasta-like/">FASTA-derived algorithm</a>, the overlapping region as detected in this first step is extended by <span>
|
|
\(\Delta\)
|
|
</span>
|
|
nucleotides at each end (<span>
|
|
\(\Delta = 5\)
|
|
</span>
|
|
by default and can be defined with the <code>--delta</code> option) to be exactly aligned using a
|
|
<a href="/obidoc/docs/commands/alignments/obipairing/exact-alignment/">semi-global alignment algorithm</a> taking into account the sequence quality scores present in the
|
|
<a href="http://metabar:8888/obidoc/formats/fastq/">fastq</a>
|
|
files. There are two versions of this algorithm,
|
|
<a href="/obidoc/docs/commands/alignments/obipairing/exact-alignment/#left-and-right-alignment">the <em>left-align</em> and the <em>right-align</em> version</a>. The version used, left or right, depends on the length of the amplicon. Amplicons longer than the read length will be aligned with the left version. The shorter ones are aligned with the right version.</p>
|
|
<p>When the <code>--exact-mode</code> option is used, full length reads are aligned twice, once with the left version and once with the right version. The alignment with the highest score is used. This consequently increases computation time.</p>
|
|
<p>The exact alignment step adds the following tags to the FASTQ header for each read to report the quality of the alignment.</p>
|
|
<ul>
|
|
<li><code>ali_dir</code>: indicates which version of the exact alignment was used, <em>left</em> or <em>right</em>.</li>
|
|
<li><code>ali_length</code>: the length of the aligned overlapping region (including gaps).</li>
|
|
<li><code>seq_a_single</code>: the length of the unaligned region on the forward read.</li>
|
|
<li><code>seq_ab_match</code>: the number of matches in the aligned overlapping region.</li>
|
|
<li><code>seq_b_single</code>: the length of the unaligned region on the reverse read.</li>
|
|
<li><code>score</code>: the raw score of the alignment (the sum of the
|
|
<a href="/obidoc/docs/commands/alignments/obipairing/exact-alignment/#scoring-system">elementary scores for each aligned position</a>).</li>
|
|
<li><code>score_norm</code>: <code>seq_ab_match</code> divided by <code>ali_length</code>.</li>
|
|
<li><code>pairing_mismatches</code>: a description of the mismatches between the reads (this tag is not added if the <code>--without-stat</code> option is set). It is expressed as a JSON map whose keys describe each mismatch and whose values give the position of the mismatch in the reconstructed full-length amplicon.
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{<span style="color:#f92672">&#34;(C:39)-&gt;(A:16)&#34;</span>:<span style="color:#ae81ff">102</span>,<span style="color:#f92672">&#34;(C:39)-&gt;(A:17)&#34;</span>:<span style="color:#ae81ff">121</span>,<span style="color:#f92672">&#34;(T:39)-&gt;(A:14)&#34;</span>:<span style="color:#ae81ff">101</span>}
</span></span></code></pre></div>This example describes the three mismatches found in the overlapping region of the fourth sequence pair:
<ul>
<li>A <em>C</em> with a quality score of 39 on the forward read is aligned to an <em>A</em> with a quality score of 16 on the reverse read at position 102.</li>
<li>A <em>C</em> with a quality score of 39 on the forward read is aligned to an <em>A</em> with a quality score of 17 on the reverse read at position 121.</li>
<li>A <em>T</em> with a quality score of 39 on the forward read is aligned to an <em>A</em> with a quality score of 14 on the reverse read at position 101.</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
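<p>The <code>pairing_mismatches</code> map described above can be decoded with a few lines of Python. This is only a sketch: the helper name <code>parse_mismatches</code> and the regular expression are assumptions derived from the key layout visible in the example output, <code>(base:quality)-&gt;(base:quality)</code>, and they are not part of <em>OBITools4</em>.</p>

```python
import json
import re

# Assumed key layout, as seen in the example output: "(C:39)->(A:16)"
MISMATCH_KEY = re.compile(r"\((.):(\d+)\)->\((.):(\d+)\)")


def parse_mismatches(tag_value: str):
    """Return (position, fwd_base, fwd_qual, rev_base, rev_qual) tuples sorted by position."""
    mismatches = []
    for key, position in json.loads(tag_value).items():
        match = MISMATCH_KEY.fullmatch(key)
        if match is None:
            raise ValueError(f"unexpected mismatch key: {key!r}")
        fwd_base, fwd_qual, rev_base, rev_qual = match.groups()
        mismatches.append((position, fwd_base, int(fwd_qual), rev_base, int(rev_qual)))
    return sorted(mismatches)


# Tag value copied from the fourth record of the example output.
tag = '{"(C:39)->(A:16)":102,"(C:39)->(A:17)":121,"(T:39)->(A:14)":101}'
for pos, fb, fq, rb, rq in parse_mismatches(tag):
    print(f"position {pos}: forward {fb} (Q{fq}) vs reverse {rb} (Q{rq})")
```

<p>Sorting by position makes it easier to locate each mismatch along the reconstructed amplicon.</p>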
|
|
<h3 id="building-the-consensus-sequence">
|
|
Building the consensus sequence
|
|
<a class="anchor" href="#building-the-consensus-sequence">#</a>
|
|
</h3>
|
|
<p>If the overlap length is below a threshold (20 by default, and can be set with the <code>--min-overlap</code> option), or the <code>score_norm</code> is below an identity threshold (0.9 by default, and can be set with the <code>--min-identity</code> option), no consensus is computed for the read pair. Instead, the forward read and the reverse complement of the reverse read are simply pasted together, separated by a run of <code>.</code> characters. In this case, the sequence is tagged with a <code>mode</code> attribute set to <code>join</code>.</p>
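<p>The decision rule described above can be sketched as follows. The function <code>pairing_mode</code> is a hypothetical illustration, not the <em>OBITools4</em> implementation. It assumes that the overlap length is the <code>ali_length</code> tag and that <code>score_norm</code> is <code>seq_ab_match</code> divided by <code>ali_length</code>, as stated in the tag descriptions.</p>

```python
def pairing_mode(ali_length: int, seq_ab_match: int,
                 min_overlap: int = 20, min_identity: float = 0.9) -> str:
    """Return "join" or "alignment" following the two documented thresholds."""
    score_norm = seq_ab_match / ali_length if ali_length > 0 else 0.0
    if ali_length < min_overlap or score_norm < min_identity:
        return "join"
    return "alignment"


# (ali_length, seq_ab_match) copied from the four records of the example output.
for ali_length, seq_ab_match in [(137, 93), (138, 132), (4, 4), (54, 51)]:
    print(ali_length, seq_ab_match, pairing_mode(ali_length, seq_ab_match))
```

<p>With the default thresholds, the first and third pairs are joined (identity of 0.679 and overlap of 4, respectively) and the other two are aligned, which matches the <code>mode</code> tags of the example output.</p>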
|
|
<p>If the overlap is long enough and the identity is sufficient, a consensus sequence is built to maximize the global sequencing quality of the reconstructed amplicon. The non-aligned regions are reported as is. The overlapping regions are transcribed as follows:</p>
|
|
<ul>
|
|
<li>For each match, the nucleotide observed on both reads is retained, and the quality score is increased to reflect the congruence of the two reads.
|
|
<span>
|
|
\[Q_{consensus} = Q_F + Q_R\]
|
|
</span>
|
|
</li>
|
|
<li>If there is a mismatch, the nucleotide with the highest quality score is retained and its quality score is decreased to reflect the discrepancy between the two reads (with <span>
|
|
\(Q_{max} = max(Q_F, Q_R)\)
|
|
</span>
|
|
and <span>
|
|
\(Q_{min} = min(Q_F, Q_R)\)
|
|
</span>
|
|
).
|
|
<span>
|
|
  \[Q_{consensus} = \log_{10} \left(10^{-\frac{Q_{max}}{10}} \cdot \frac{1 - 10^{-\frac{Q_{min}}{10}}}{4} \right)\]
|
|
</span>
|
|
</li>
|
|
<li>In case of an insertion or deletion, the gap is assigned a quality of 0 and the mismatch rule is applied. This means that insertions and deletions are always resolved as insertions in the consensus sequence.</li>
|
|
</ul>
|
|
<p>A <code>mode</code> attribute set to <code>alignment</code> will be added to the consensus sequence annotations.</p>
|
|
<h2 id="synopsis">
|
|
Synopsis
|
|
<a class="anchor" href="#synopsis">#</a>
|
|
</h2>
|
|
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>obipairing --forward-reads|-F <FILENAME_F> --reverse-reads|-R <FILENAME_R>
|
|
</span></span><span style="display:flex;"><span> <span style="color:#f92672">[</span>--batch-size <int><span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--compress|-Z<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--debug<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--delta|-D <int><span style="color:#f92672">]</span>
|
|
</span></span><span style="display:flex;"><span> <span style="color:#f92672">[</span>--ecopcr<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--embl<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--exact-mode<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--fast-absolute<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--fasta<span style="color:#f92672">]</span>
|
|
</span></span><span style="display:flex;"><span> <span style="color:#f92672">[</span>--fasta-output<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--fastq<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--fastq-output<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--force-one-cpu<span style="color:#f92672">]</span>
|
|
</span></span><span style="display:flex;"><span> <span style="color:#f92672">[</span>--gap-penality|-G <float64><span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--genbank<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--help|-h|-?<span style="color:#f92672">]</span>
|
|
</span></span><span style="display:flex;"><span> <span style="color:#f92672">[</span>--input-OBI-header<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--input-json-header<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--json-output<span style="color:#f92672">]</span>
|
|
</span></span><span style="display:flex;"><span> <span style="color:#f92672">[</span>--max-cpu <int><span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--min-identity|-X <float64><span style="color:#f92672">]</span>
|
|
</span></span><span style="display:flex;"><span> <span style="color:#f92672">[</span>--min-overlap <int><span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--no-order<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--no-progressbar<span style="color:#f92672">]</span>
|
|
</span></span><span style="display:flex;"><span> <span style="color:#f92672">[</span>--out|-o <FILENAME><span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--output-OBI-header|-O<span style="color:#f92672">]</span>
|
|
</span></span><span style="display:flex;"><span> <span style="color:#f92672">[</span>--output-json-header<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--penality-scale <float64><span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--pprof<span style="color:#f92672">]</span>
|
|
</span></span><span style="display:flex;"><span> <span style="color:#f92672">[</span>--pprof-goroutine <int><span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--pprof-mutex <int><span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--skip-empty<span style="color:#f92672">]</span>
|
|
</span></span><span style="display:flex;"><span> <span style="color:#f92672">[</span>--solexa<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--version<span style="color:#f92672">]</span> <span style="color:#f92672">[</span>--without-stat|-S<span style="color:#f92672">]</span> <span style="color:#f92672">[</span><args><span style="color:#f92672">]</span>
|
|
</span></span></code></pre></div><h2 id="options">
|
|
Options
|
|
<a class="anchor" href="#options">#</a>
|
|
</h2>
|
|
<h4 id="obipairing-mandatory-options">
|
|
<em>obipairing</em> mandatory options
|
|
<a class="anchor" href="#obipairing-mandatory-options">#</a>
|
|
</h4>
|
|
<ul>
|
|
<li>
|
|
<b><code class="language-bash">--forward-reads</code></b>
|
|
| <b><code class="language-bash">-F</code></b>
|
|
<FILENAME>: The name of the file containing the forward reads.
|
|
|
|
</li>
|
|
<li>
|
|
<b><code class="language-bash">--reverse-reads</code></b>
|
|
| <b><code class="language-bash">-R</code></b>
|
|
<FILENAME>: The name of the file containing the reverse reads.
|
|
|
|
</li>
|
|
</ul>
|
|
<h4 id="other-hahahugoshortcode10s26hbhb-specific-options">
|
|
Other <a href="http://metabar:8888/obidoc/obitools/obipairing/">
|
|
<abbr title="obipairing: align the forward and reverse paired reads"><code>obipairing</code></abbr>
|
|
</a> specific options
|
|
<a class="anchor" href="#other-hahahugoshortcode10s26hbhb-specific-options">#</a>
|
|
</h4>
|
|
<ul>
|
|
<li>
|
|
<b><code class="language-bash">--delta</code></b>
|
|
| <b><code class="language-bash">-D</code></b>
|
|
<INTEGER>: length added to the overlap detected by the fast algorithm before being forwarded to the exact alignment algorithm (default: 5 nucleotides).
|
|
|
|
</li>
|
|
<li>
|
|
<b><code class="language-bash">--exact-mode</code></b>: do not run the fast alignment heuristic (default: a fast algorithm is run first to accelerate the final exact alignment).
|
|
|
|
</li>
|
|
<li>
|
|
<b><code class="language-bash">--fast-absolute</code></b>: compute the absolute fast score; this option has no effect in exact mode (default: false).
|
|
|
|
</li>
|
|
<li>
|
|
<b><code class="language-bash">--gap-penalty</code></b>
|
|
| <b><code class="language-bash">-G</code></b>
|
|
  &lt;FLOAT64&gt;: gap penalty expressed as a multiplicative factor applied to the mismatch score between two nucleotides with a quality of 40 (default: 2.000000).
|
|
|
|
</li>
|
|
<li>
|
|
<b><code class="language-bash">--min-identity</code></b>
|
|
| <b><code class="language-bash">-X</code></b>
|
|
<FLOAT64>: minimum identity between overlapped regions of the reads to consider the alignment (default: 0.900000).
|
|
|
|
</li>
|
|
<li>
|
|
<b><code class="language-bash">--min-overlap</code></b> &lt;INTEGER&gt;: minimum overlap between the two reads to consider the alignment (default: 20).
|
|
|
|
</li>
|
|
<li>
|
|
<b><code class="language-bash">--penalty-scale</code></b> <FLOAT64>: scale factor applied to the mismatch score and the gap penalty (default 1).
|
|
|
|
</li>
|
|
<li>
|
|
<b><code class="language-bash">--without-stat</code></b>
|
|
| <b><code class="language-bash">-S</code></b>
|
|
: remove alignment statistics from the produced consensus sequences (default: false).
|
|
|
|
</li>
|
|
</ul>
|
|
<h4 id="controlling-the-input-data">
|
|
Controlling the input data
|
|
<a class="anchor" href="#controlling-the-input-data">#</a>
|
|
</h4>
|
|
|
|
|
|
<I>OBITools4</I> generally recognizes the input file format. It also recognizes
|
|
whether the input file is compressed using GZIP. But some rare files can be
|
|
misidentified, so the following options allow the user to force the format, thus
|
|
bypassing the format identification step.
|
|
|
|
<h5 id="the-file-format-options">
|
|
The file format options
|
|
<a class="anchor" href="#the-file-format-options">#</a>
|
|
</h5>
|
|
|
|
|
|
<ul>
|
|
<li>
|
|
<b><code class="language-bash">--fasta</code></b>: indicates that sequence data is in <a href="http://metabar:8888/obidoc/formats/fasta/">fasta</a> format.</li>
|
|
<li>
|
|
<b><code class="language-bash">--fastq</code></b>: indicates that sequence data is in <a href="http://metabar:8888/obidoc/formats/fastq/">fastq</a> format.</li>
|
|
<li>
|
|
<b><code class="language-bash">--embl</code></b>: indicates that sequence data is in <a href="http://metabar:8888/obidoc/formats/embl/">EMBL-ENA flatfile</a> format.</li>
|
|
<li>
|
|
<b><code class="language-bash">--csv</code></b>: indicates that sequence data is in <a href="http://metabar:8888/obidoc/docs/file_format/sequence_files/csv/">CSV</a> format.</li>
|
|
<li>
|
|
<b><code class="language-bash">--genbank</code></b>: indicates that sequence data is in <a href="http://metabar:8888/obidoc/formats/genbank/">GenBank flatfile</a> format.</li>
|
|
<li><b><code class="language-bash">--ecopcr</code></b>: indicates that sequence data is in the old ecoPCR tabulated format.</li>
|
|
</ul>
|
|
|
|
<h5 id="controlling-the-way-obitools4-are-formatting-annotations">
|
|
  Controlling the way <em>OBITools4</em> formats annotations
|
|
<a class="anchor" href="#controlling-the-way-obitools4-are-formatting-annotations">#</a>
|
|
</h5>
|
|
|
|
|
|
These options only apply to the <a href="http://metabar:8888/obidoc/formats/fasta/">FASTA</a> and <a href="http://metabar:8888/obidoc/formats/fastq/">FASTQ</a> formats.
|
|
|
|
<ul>
|
|
<li><b><code class="language-bash">--input-OBI-header</code></b>: FASTA/FASTQ title line annotations follow the old OBI format.</li>
|
|
<li><b><code class="language-bash">--input-json-header</code></b>: FASTA/FASTQ title line annotations follow the JSON format.</li>
|
|
</ul>
|
|
|
|
<h5 id="controlling-quality-score-decoding">
|
|
Controlling quality score decoding
|
|
<a class="anchor" href="#controlling-quality-score-decoding">#</a>
|
|
</h5>
|
|
|
|
|
|
This option only applies to the <a href="http://metabar:8888/obidoc/formats/fastq/">FASTQ</a> format.
|
|
|
|
<ul>
|
|
<li><b><code class="language-bash">--solexa</code></b>: decodes quality string according to the old Solexa specification. (default: the standard Sanger encoding is used, env: <strong>OBISSOLEXA</strong>)</li>
|
|
</ul>
|
|
|
|
<h4 id="controlling-the-output-data">
|
|
Controlling the output data
|
|
<a class="anchor" href="#controlling-the-output-data">#</a>
|
|
</h4>
|
|
|
|
|
|
<ul>
|
|
<li><b><code class="language-bash">--compress</code></b>
|
|
| <b><code class="language-bash">-Z</code></b>
|
|
: output is compressed using gzip. (default: false)</li>
|
|
<li><b><code class="language-bash">--no-order</code></b>: the <em>OBITools</em> ensure that the order between the input file and
|
|
the output file does not change. When multiple files are processed,
|
|
they are processed one at a time.
|
|
If the <strong>--no-order</strong> option is added to a command, multiple input
|
|
files can be opened at the same time and their contents processed
|
|
in parallel. This usually increases processing speed, but does not
|
|
guarantee the order of the sequences in the output file.
|
|
Also, processing multiple files in parallel may require more memory
|
|
to perform the computation.</li>
|
|
<li>
|
|
<b><code class="language-bash">--fasta-output</code></b>: writes sequence data in <a href="http://metabar:8888/obidoc/formats/fasta/">fasta</a> format (default if quality data is not available).</li>
|
|
<li>
|
|
<b><code class="language-bash">--fastq-output</code></b>: writes sequence data in <a href="http://metabar:8888/obidoc/formats/fastq/">fastq</a> format (default if quality data is available).</li>
|
|
<li><b><code class="language-bash">--json-output</code></b>: writes sequence data in JSON format.</li>
|
|
<li><b><code class="language-bash">--out</code></b>
|
|
| <b><code class="language-bash">-o</code></b>
|
|
<FILENAME>: filename used for saving the output (default: “-”, the standard output)</li>
|
|
<li><b><code class="language-bash">--output-OBI-header</code></b>
|
|
| <b><code class="language-bash">-O</code></b>
|
|
: writes output FASTA/FASTQ title line annotations in OBI format (default: JSON).</li>
|
|
<li><b><code class="language-bash">--output-json-header</code></b>: writes output FASTA/FASTQ title line annotations in JSON format (the default format).</li>
|
|
<li><b><code class="language-bash">--skip-empty</code></b>: sequences of length equal to zero are removed from the output (default: false).</li>
|
|
<li><b><code class="language-bash">--no-progressbar</code></b>: deactivates progress bar display (default: false).</li>
|
|
</ul>
|
|
<h4 id="general-options">
|
|
General options
|
|
<a class="anchor" href="#general-options">#</a>
|
|
</h4>
|
|
|
|
|
|
<ul>
|
|
<li><b><code class="language-bash">--help</code></b>
|
|
| <b><code class="language-bash">-h|-?</code></b>
|
|
: shows this help.</li>
|
|
<li><b><code class="language-bash">--version</code></b>: prints the version and exits.</li>
|
|
<li><b><code class="language-bash">--silent-warning</code></b>: This option tells obitools to stop displaying warnings.
|
|
This behaviour can be controlled by setting the <strong>OBIWARNINGS</strong> environment variable.</li>
|
|
</ul>
|
|
|
|
|
|
|
|
<h4 id="computation-related-options">
|
|
Computation related options
|
|
<a class="anchor" href="#computation-related-options">#</a>
|
|
</h4>
|
|
|
|
|
|
<ul>
|
|
<li><b><code class="language-bash">--max-cpu</code></b> <INTEGER>: <em>OBITools</em> can take advantage of your computer’s multi-core
|
|
architecture by parallelizing the computation across all available CPUs.
|
|
Computing on more CPUs usually requires more memory to perform the
|
|
computation. Reducing the number of CPUs used to perform a calculation
|
|
is also a way to indirectly control the amount of memory used by the
|
|
process. The number of CPUs used by <em>OBITools</em> can also be controlled
|
|
by setting the <strong>OBIMAXCPU</strong> environment variable.</li>
|
|
<li><b><code class="language-bash">--force-one-cpu</code></b>: forces the use of a single CPU core for parallel processing (default: false).</li>
|
|
<li><b><code class="language-bash">--batch-size</code></b> &lt;INTEGER&gt;: number of sequences per batch for parallel processing (default: 1000, env: <strong>OBIBATCHSIZE</strong>)</li>
|
|
</ul>
|
|
|
|
<h4 id="debug-related-options">
|
|
Debug related options
|
|
<a class="anchor" href="#debug-related-options">#</a>
|
|
</h4>
|
|
|
|
|
|
<ul>
|
|
<li><b><code class="language-bash">--debug</code></b>: enables debug mode, by setting log level to debug (default: false, env: <strong>OBIDEBUG</strong>)</li>
|
|
<li><b><code class="language-bash">--pprof</code></b>: enables pprof server. Look at the log for details. (default: false).</li>
|
|
<li><b><code class="language-bash">--pprof-mutex</code></b> <INTEGER>: enables profiling of mutex lock. (default: 10, env: <strong>OBIPPROFMUTEX</strong>)</li>
|
|
<li><b><code class="language-bash">--pprof-goroutine</code></b> <INTEGER>: enables profiling of goroutine blocking profile. (default: 6060, env: <strong>OBIPPROFGOROUTINE</strong>)</li>
|
|
</ul>
|
|
<h2 id="examples">
|
|
Examples
|
|
<a class="anchor" href="#examples">#</a>
|
|
</h2>
|
|
<h3 id="basic-example">
|
|
Basic example
|
|
<a class="anchor" href="#basic-example">#</a>
|
|
</h3>
|
|
<p>Consider the two small fastq files presented above, each containing four sequences and named <code>forward.fastq</code> and <code>reverse.fastq</code>. The following command will align them and create a file named <code>paired.fastq</code> containing the full-length amplicon sequences:</p>
|
|
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>obipairing -F forward.fastq -R reverse.fastq > paired.fastq
|
|
</span></span></code></pre></div><p>A bar graph showing the frequencies of the aligned and joined read pairs can be generated by combining the output of the <a href="http://metabar:8888/obidoc/obitools/obicsv/">
|
|
<abbr title="obicsv: convert a sequence file to a CSV file"><code>obicsv</code></abbr>
|
|
</a> command with the <code>uplot</code> command:</p>
|
|
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>obicsv -k mode paired.fastq | uplot -H count
|
|
</span></span></code></pre></div><pre tabindex="0"><code> mode
|
|
┌ ┐
|
|
alignment ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 2.0
|
|
join ┤■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 2.0
|
|
└ ┘
|
|
</code></pre><p>It is possible to use the <a href="http://metabar:8888/obidoc/obitools/obidistribute/">
|
|
<abbr title="obidistribute: split a sequence file into multiple files"><code>obidistribute</code></abbr>
|
|
</a> tool to separate the reads according to their <code>mode</code> attribute, which is set to <code>join</code> or <code>alignment</code>:</p>
|
|
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>obidistribute -p <span style="color:#e6db74">"paired_%s.fastq"</span> <span style="color:#ae81ff">\
|
|
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> -c mode <span style="color:#ae81ff">\
|
|
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> paired.fastq
|
|
</span></span></code></pre></div><p>This command will produce two files named <code>paired_join.fastq</code> and <code>paired_alignment.fastq</code> containing the sequences with <code>mode</code> set to <code>join</code> and <code>alignment</code> respectively.</p>
|
|
<p>Looking at the content of the <code>paired_join.fastq</code> file, we can see that the first pair of reads was not aligned because the <code>score_norm</code> tag is less than the default identity threshold of 0.9, while the second pair of reads was not aligned because the length of the overlap (<code>ali_length</code> tag) is less than the default minimum overlap of 20.</p>
|
|
|
|
<a style="padding: 10px 20px; background-color: #cacaca; border: 1px solid #8e8080; border-bottom: none; border-radius: 5px 5px 0 0; box-shadow: 0 2px 5px rgba(0, 0, 0, 0.1)"
|
|
href="paired_join.fastq" download="paired_join.fastq">📄 paired_join.fastq</a>
|
|
<DIV style="border: 2px solid #8e8080; border-radius: 0 0 5px 5px; padding: 20px; background-color: white; ">
|
|
|
|
<pre tabindex="0"><code class="language-fastq" data-lang="fastq">@M01334:147:000000000-LBRVD:1:1101:14968:1570 {"ali_length":137,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"join","score":1687,"score_norm":0.679,"seq_ab_match":93}
|
|
tgttccacgggcaatcctgagccaaatctttcattttgaaaaaatgagagatataatgtatctcttatttattataagaaataaaatatttcttatctaatattaaagttaggtgcagagactcaatgggtggaactagatcggatgtgca..........agcaaaaaagaacaagtaacaagggaaaaccagagaaaaatcaataaaaaagaaaaaaagagagatataaagtatcaataaaataaaaaaagaaaaaaaataataaaaaactaataaaaaagaaaggtgcagagaaaaaaagggaggaaaa
|
|
+
|
|
11>A>@3@A11>ACFFEG110BFB00BAFGHE2DFGG201110/B11111/D1D2222D2FDFDFGDGHHBGG2F222110D11@1D1FGHFHGFF@GE1F2FG22112B220F1@111/0>BF11B210B>//11B1<1BB<///<1122!!!!!!!!!!11--111111?111@11110?1112/@@11122@011FBB2121//B1111CEEHGFB11F2B2B2B2DFB12@212122B21/>/1D1GF>>EA1GGDD2D1///22B222GD/E11GGFAB1B0A313B3B0A111BB1111311>>11
|
|
@M01334:147:000000000-LBRVD:1:1101:15399:1590 {"ali_length":4,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"join","score":126,"score_norm":1,"seq_ab_match":4}
|
|
tgttccacccattgagtctctgcacctatctttaatattagataagaaatattttacttcttataataaataagagttattttatatctctcattttttcaaaatgaaagatttggctcaggattgcccgtggaactagatcggaagagca..........agcaaaaaagaaaaagtaaaacaaaatgagaaaaagcaactaaatttaatattagataagaaatataatactacataaaataaataagagatattttatatctctaaatttttcaaaaggaaagatttggctcaggatagcccgaggaaaa
|
|
+
|
|
11>A>@3B>>1CF111BBFAG3A3AAF1FFGHHF3FBGH221F211110D1DGHH2BBGBFF2F22D221D211111A2DDGG2F2FFFEGD1FFHHHGFD221B111110BFGD11F@1001BF0@@1/EA//1>F1B1FD/////00<1!!!!!!!!!!11//111110B122B12000B222B222B01111111@122B22122ED2B12F@D2GF2FAD2D2222D222212D2GF2HGD1GFADAD1D1222D1D111221B0011101GFDD12FFD1B000A1011B11B1111111311>>11
|
|
</code></pre></td>
|
|
|
|
</DIV>
|
|
|
|
<p>Looking at the contents of the <code>paired_alignment.fastq</code> file, we can see (<code>ali_dir</code> tag) that the first pair of reads was aligned using the <em>right</em> version of the exact alignment algorithm, while the second pair of reads was aligned using the <em>left</em> version.</p>
|
|
|
|
<a style="padding: 10px 20px; background-color: #cacaca; border: 1px solid #8e8080; border-bottom: none; border-radius: 5px 5px 0 0; box-shadow: 0 2px 5px rgba(0, 0, 0, 0.1)"
|
|
href="paired_alignment.fastq" download="paired_alignment.fastq">📄 paired_alignment.fastq</a>
|
|
<DIV style="border: 2px solid #8e8080; border-radius: 0 0 5px 5px; padding: 20px; background-color: white; ">
|
|
|
|
<pre tabindex="0"><code class="language-fastq" data-lang="fastq">@M01334:147:000000000-LBRVD:1:1101:15946:1586 {"ali_dir":"right","ali_length":138,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"alignment","pairing_mismatches":{"(T:16)->(A:33)":14,"(T:33)->(A:17)":118,"(T:37)->(A:16)":125,"(T:38)->(A:16)":32,"(T:39)->(A:17)":44},"paring_fast_count":114,"paring_fast_overlap":138,"paring_fast_score":0.844,"score":5446,"score_norm":0.957,"seq_a_single":13,"seq_ab_match":132,"seq_b_single":13}
|
|
gctcatccgaactacctaaccccattgagtctctgcacctatctttaatattagataagaaatattttatttcttataataaataagagatattttatatctctcattttttcaaaatgaaagatttggctcaggattgcccacgtaacggagatcggaagagc
|
|
+
|
|
111/////2B112CMMOUO?MNObVHfcAVVHVWVVTQSWRXXIYYYXUSWiXaWeWWUWVSTTTWXgeUWWXXXWWgXWYYWVYWdUgSTTTXYYUVdTVWVXVgUWXXXVeYXfTCUXWW`QGUWfA@WSR?PRRWVARAc?UVMMOO?///BF////<000
|
|
@M01334:147:000000000-LBRVD:1:1101:13773:1687 {"ali_dir":"left","ali_length":54,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"alignment","pairing_mismatches":{"(C:39)->(A:16)":102,"(C:39)->(A:17)":121,"(T:39)->(A:14)":101},"paring_fast_count":42,"paring_fast_overlap":54,"paring_fast_score":0.824,"score":2888,"score_norm":0.944,"seq_a_single":97,"seq_ab_match":51,"seq_b_single":97}
|
|
ctcggatcaccattgagtctctgcacctatctttaatattagataagaaaaaatattatttcttatctgaaataagaaatattttatatatttctttttctcaaaatgaaagatttggctcaggattgccctgatccgagggatagcaccattgagtctctgcacctatccttttcttttgtattctagttcgagaacccccttgttttctcaaaacacggatttggctcaggatagccctgctatca
|
|
+
|
|
3AAAAAADFFFFGGGGFGGGGGHHHHHHFHHHHHHHHGHHHHGHGGHFFHHHCGFHHHHHHHHHHHHHGHHGGFHFFHHHGHHHHBHHHGHHHHHHHXVVJIommmegikl]bVWgVDRXIlbkkVfPSWVWccVVT^ebggjkkCVeWcd1@CF>0/11@11B011F0/0000/00111@@D1111GA01AFGCEEE///0//BAFD0000HGGFAB011B00CBAB11FA1B11>1111@31>111
|
|
</code></pre></td>
|
|
|
|
</DIV>
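<p>The headers shown above consist of the read identifier followed by a JSON object holding the annotations. As a minimal sketch, assuming that layout holds for every record, the <code>ali_dir</code> tag can be read programmatically. The helper name <code>parse_header</code> is hypothetical and not part of OBITools4, and the header below is abridged from the second record.</p>

```python
import json


def parse_header(header_line):
    """Split a FASTQ header into (identifier, annotations dict).

    Assumes the layout seen in the example output: '@identifier {json}'.
    """
    text = header_line.strip()
    if text.startswith("@"):
        text = text[1:]
    # Everything before the first space is the identifier, the rest is JSON.
    identifier, _, annotation_text = text.partition(" ")
    annotations = json.loads(annotation_text) if annotation_text.strip() else {}
    return identifier, annotations


# Header abridged from the second record of paired_alignment.fastq.
header = (
    '@M01334:147:000000000-LBRVD:1:1101:13773:1687 '
    '{"ali_dir":"left","ali_length":54,"mode":"alignment",'
    '"score":2888,"score_norm":0.944,"seq_ab_match":51}'
)

identifier, annotations = parse_header(header)
print(identifier, annotations["ali_dir"], annotations["ali_length"])
```

<p>The same function returns an empty dictionary for a header that carries no annotations, so it can be applied to unprocessed reads as well.</p>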
|
|
|
|
<h3 id="pairing-the-reads-in-exact-mode">
|
|
Pairing the reads in exact mode
|
|
<a class="anchor" href="#pairing-the-reads-in-exact-mode">#</a>
|
|
</h3>
|
|
<p>The <code>--exact-mode</code> option aligns the reads in exact mode: it bypasses the initial fast alignment step and aligns the overlapping region of the reads directly with the exact alignment algorithm, at the cost of a longer computation time.</p>
|
|
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>obipairing -F forward.fastq -R reverse.fastq <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span><span style="color:#ae81ff"></span> --exact-mode > paired_exact.fastq
</span></span></code></pre></div>
|
|
<a style="padding: 10px 20px; background-color: #cacaca; border: 1px solid #8e8080; border-bottom: none; border-radius: 5px 5px 0 0; box-shadow: 0 2px 5px rgba(0, 0, 0, 0.1)"
|
|
href="paired_exact.fastq" download="paired_exact.fastq">📄 paired_exact.fastq</a>
|
|
<DIV style="border: 2px solid #8e8080; border-radius: 0 0 5px 5px; padding: 20px; background-color: white; ">
|
|
|
|
<pre tabindex="0"><code class="language-fastq" data-lang="fastq">@M01334:147:000000000-LBRVD:1:1101:14968:1570 {"ali_length":137,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"join","score":1687,"score_norm":0.679,"seq_ab_match":93}
|
|
tgttccacgggcaatcctgagccaaatctttcattttgaaaaaatgagagatataatgtatctcttatttattataagaaataaaatatttcttatctaatattaaagttaggtgcagagactcaatgggtggaactagatcggatgtgca..........agcaaaaaagaacaagtaacaagggaaaaccagagaaaaatcaataaaaaagaaaaaaagagagatataaagtatcaataaaataaaaaaagaaaaaaaataataaaaaactaataaaaaagaaaggtgcagagaaaaaaagggaggaaaa
|
|
+
|
|
11>A>@3@A11>ACFFEG110BFB00BAFGHE2DFGG201110/B11111/D1D2222D2FDFDFGDGHHBGG2F222110D11@1D1FGHFHGFF@GE1F2FG22112B220F1@111/0>BF11B210B>//11B1<1BB<///<1122!!!!!!!!!!11--111111?111@11110?1112/@@11122@011FBB2121//B1111CEEHGFB11F2B2B2B2DFB12@212122B21/>/1D1GF>>EA1GGDD2D1///22B222GD/E11GGFAB1B0A313B3B0A111BB1111311>>11
|
|
@M01334:147:000000000-LBRVD:1:1101:15946:1586 {"ali_dir":"right","ali_length":138,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"alignment","pairing_mismatches":{"(T:16)->(A:33)":14,"(T:33)->(A:17)":118,"(T:37)->(A:16)":125,"(T:38)->(A:16)":32,"(T:39)->(A:17)":44},"score":5446,"score_norm":0.957,"seq_a_single":13,"seq_ab_match":132,"seq_b_single":13}
|
|
gctcatccgaactacctaaccccattgagtctctgcacctatctttaatattagataagaaatattttatttcttataataaataagagatattttatatctctcattttttcaaaatgaaagatttggctcaggattgcccacgtaacggagatcggaagagc
|
|
+
|
|
111/////2B112CMMOUO?MNObVHfcAVVHVWVVTQSWRXXIYYYXUSWiXaWeWWUWVSTTTWXgeUWWXXXWWgXWYYWVYWdUgSTTTXYYUVdTVWVXVgUWXXXVeYXfTCUXWW`QGUWfA@WSR?PRRWVARAc?UVMMOO?///BF////<000
|
|
@M01334:147:000000000-LBRVD:1:1101:15399:1590 {"ali_length":137,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"join","score":3033,"score_norm":0.796,"seq_ab_match":109}
|
|
tgttccacccattgagtctctgcacctatctttaatattagataagaaatattttacttcttataataaataagagttattttatatctctcattttttcaaaatgaaagatttggctcaggattgcccgtggaactagatcggaagagca..........agcaaaaaagaaaaagtaaaacaaaatgagaaaaagcaactaaatttaatattagataagaaatataatactacataaaataaataagagatattttatatctctaaatttttcaaaaggaaagatttggctcaggatagcccgaggaaaa
|
|
+
|
|
11>A>@3B>>1CF111BBFAG3A3AAF1FFGHHF3FBGH221F211110D1DGHH2BBGBFF2F22D221D211111A2DDGG2F2FFFEGD1FFHHHGFD221B111110BFGD11F@1001BF0@@1/EA//1>F1B1FD/////00<1!!!!!!!!!!11//111110B122B12000B222B222B01111111@122B22122ED2B12F@D2GF2FAD2D2222D222212D2GF2HGD1GFADAD1D1222D1D111221B0011101GFDD12FFD1B000A1011B11B1111111311>>11
|
|
@M01334:147:000000000-LBRVD:1:1101:13773:1687 {"ali_dir":"left","ali_length":54,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"alignment","pairing_mismatches":{"(C:39)->(A:16)":102,"(C:39)->(A:17)":121,"(T:39)->(A:14)":101},"score":2888,"score_norm":0.944,"seq_a_single":97,"seq_ab_match":51,"seq_b_single":97}
|
|
ctcggatcaccattgagtctctgcacctatctttaatattagataagaaaaaatattatttcttatctgaaataagaaatattttatatatttctttttctcaaaatgaaagatttggctcaggattgccctgatccgagggatagcaccattgagtctctgcacctatccttttcttttgtattctagttcgagaacccccttgttttctcaaaacacggatttggctcaggatagccctgctatca
|
|
+
|
|
3AAAAAADFFFFGGGGFGGGGGHHHHHHFHHHHHHHHGHHHHGHGGHFFHHHCGFHHHHHHHHHHHHHGHHGGFHFFHHHGHHHHBHHHGHHHHHHHXVVJIommmegikl]bVWgVDRXIlbkkVfPSWVWccVVT^ebggjkkCVeWcd1@CF>0/11@11B011F0/0000/00111@@D1111GA01AFGCEEE///0//BAFD0000HGGFAB011B00CBAB11FA1B11>1111@31>111
|
|
</code></pre></td>
|
|
|
|
</DIV>
|
|
|
|
<p>For this small data set, the two results, <code>paired.fastq</code> and <code>paired_exact.fastq</code>, contain identical consensus sequences; only the annotations differ. The UNIX <code>diff</code> command can be used to compare the two files:</p>
|
|
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>diff -u paired.fastq paired_exact.fastq
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-diff" data-lang="diff"><span style="display:flex;"><span><span style="color:#f92672">--- paired.fastq 2025-02-23 16:50:12
|
|
</span></span></span><span style="display:flex;"><span><span style="color:#f92672"></span><span style="color:#a6e22e">+++ paired_exact.fastq 2025-02-23 17:24:37
|
|
</span></span></span><span style="display:flex;"><span><span style="color:#a6e22e"></span><span style="color:#75715e">@@ -2,15 +2,15 @@
|
|
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> tgttccacgggcaatcctgagccaaatctttcattttgaaaaaatgagagatataatgtatctcttatttattataagaaataaaatatttcttatctaatattaaagttaggtgcagagactcaatgggtggaactagatcggatgtgca..........agcaaaaaagaacaagtaacaagggaaaaccagagaaaaatcaataaaaaagaaaaaaagagagatataaagtatcaataaaataaaaaaagaaaaaaaataataaaaaactaataaaaaagaaaggtgcagagaaaaaaagggaggaaaa
|
|
</span></span><span style="display:flex;"><span> +
|
|
</span></span><span style="display:flex;"><span> 11>A>@3@A11>ACFFEG110BFB00BAFGHE2DFGG201110/B11111/D1D2222D2FDFDFGDGHHBGG2F222110D11@1D1FGHFHGFF@GE1F2FG22112B220F1@111/0>BF11B210B>//11B1<1BB<///<1122!!!!!!!!!!11--111111?111@11110?1112/@@11122@011FBB2121//B1111CEEHGFB11F2B2B2B2DFB12@212122B21/>/1D1GF>>EA1GGDD2D1///22B222GD/E11GGFAB1B0A313B3B0A111BB1111311>>11
|
|
</span></span><span style="display:flex;"><span><span style="color:#f92672">-@M01334:147:000000000-LBRVD:1:1101:15946:1586 {"ali_dir":"right","ali_length":138,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"alignment","pairing_mismatches":{"(T:16)->(A:33)":14,"(T:33)->(A:17)":118,"(T:37)->(A:16)":125,"(T:38)->(A:16)":32,"(T:39)->(A:17)":44},"paring_fast_count":114,"paring_fast_overlap":138,"paring_fast_score":0.844,"score":5446,"score_norm":0.957,"seq_a_single":13,"seq_ab_match":132,"seq_b_single":13}
|
|
</span></span></span><span style="display:flex;"><span><span style="color:#f92672"></span><span style="color:#a6e22e">+@M01334:147:000000000-LBRVD:1:1101:15946:1586 {"ali_dir":"right","ali_length":138,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"alignment","pairing_mismatches":{"(T:16)->(A:33)":14,"(T:33)->(A:17)":118,"(T:37)->(A:16)":125,"(T:38)->(A:16)":32,"(T:39)->(A:17)":44},"score":5446,"score_norm":0.957,"seq_a_single":13,"seq_ab_match":132,"seq_b_single":13}
|
|
</span></span></span><span style="display:flex;"><span><span style="color:#a6e22e"></span> gctcatccgaactacctaaccccattgagtctctgcacctatctttaatattagataagaaatattttatttcttataataaataagagatattttatatctctcattttttcaaaatgaaagatttggctcaggattgcccacgtaacggagatcggaagagc
|
|
</span></span><span style="display:flex;"><span> +
|
|
</span></span><span style="display:flex;"><span> 111/////2B112CMMOUO?MNObVHfcAVVHVWVVTQSWRXXIYYYXUSWiXaWeWWUWVSTTTWXgeUWWXXXWWgXWYYWVYWdUgSTTTXYYUVdTVWVXVgUWXXXVeYXfTCUXWW`QGUWfA@WSR?PRRWVARAc?UVMMOO?///BF////<000
|
|
</span></span><span style="display:flex;"><span><span style="color:#f92672">-@M01334:147:000000000-LBRVD:1:1101:15399:1590 {"ali_length":4,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"join","score":126,"score_norm":1,"seq_ab_match":4}
|
|
</span></span></span><span style="display:flex;"><span><span style="color:#f92672"></span><span style="color:#a6e22e">+@M01334:147:000000000-LBRVD:1:1101:15399:1590 {"ali_length":137,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"join","score":3033,"score_norm":0.796,"seq_ab_match":109}
|
|
</span></span></span><span style="display:flex;"><span><span style="color:#a6e22e"></span> tgttccacccattgagtctctgcacctatctttaatattagataagaaatattttacttcttataataaataagagttattttatatctctcattttttcaaaatgaaagatttggctcaggattgcccgtggaactagatcggaagagca..........agcaaaaaagaaaaagtaaaacaaaatgagaaaaagcaactaaatttaatattagataagaaatataatactacataaaataaataagagatattttatatctctaaatttttcaaaaggaaagatttggctcaggatagcccgaggaaaa
|
|
</span></span><span style="display:flex;"><span> +
|
|
</span></span><span style="display:flex;"><span> 11>A>@3B>>1CF111BBFAG3A3AAF1FFGHHF3FBGH221F211110D1DGHH2BBGBFF2F22D221D211111A2DDGG2F2FFFEGD1FFHHHGFD221B111110BFGD11F@1001BF0@@1/EA//1>F1B1FD/////00<1!!!!!!!!!!11//111110B122B12000B222B222B01111111@122B22122ED2B12F@D2GF2FAD2D2222D222212D2GF2HGD1GFADAD1D1222D1D111221B0011101GFDD12FFD1B000A1011B11B1111111311>>11
|
|
</span></span><span style="display:flex;"><span><span style="color:#f92672">-@M01334:147:000000000-LBRVD:1:1101:13773:1687 {"ali_dir":"left","ali_length":54,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"alignment","pairing_mismatches":{"(C:39)->(A:16)":102,"(C:39)->(A:17)":121,"(T:39)->(A:14)":101},"paring_fast_count":42,"paring_fast_overlap":54,"paring_fast_score":0.824,"score":2888,"score_norm":0.944,"seq_a_single":97,"seq_ab_match":51,"seq_b_single":97}
|
|
</span></span></span><span style="display:flex;"><span><span style="color:#f92672"></span><span style="color:#a6e22e">+@M01334:147:000000000-LBRVD:1:1101:13773:1687 {"ali_dir":"left","ali_length":54,"definition":"1:N:0:CTCACCAA+CTAGGCAA","mode":"alignment","pairing_mismatches":{"(C:39)->(A:16)":102,"(C:39)->(A:17)":121,"(T:39)->(A:14)":101},"score":2888,"score_norm":0.944,"seq_a_single":97,"seq_ab_match":51,"seq_b_single":97}
|
|
</span></span></span><span style="display:flex;"><span><span style="color:#a6e22e"></span> ctcggatcaccattgagtctctgcacctatctttaatattagataagaaaaaatattatttcttatctgaaataagaaatattttatatatttctttttctcaaaatgaaagatttggctcaggattgccctgatccgagggatagcaccattgagtctctgcacctatccttttcttttgtattctagttcgagaacccccttgttttctcaaaacacggatttggctcaggatagccctgctatca
|
|
</span></span><span style="display:flex;"><span> +
|
|
</span></span><span style="display:flex;"><span> 3AAAAAADFFFFGGGGFGGGGGHHHHHHFHHHHHHHHGHHHHGHGGHFFHHHCGFHHHHHHHHHHHHHGHHGGFHFFHHHGHHHHBHHHGHHHHHHHXVVJIommmegikl]bVWgVDRXIlbkkVfPSWVWccVVT^ebggjkkCVeWcd1@CF>0/11@11B011F0/0000/00111@@D1111GA01AFGCEEE///0//BAFD0000HGGFAB011B00CBAB11FA1B11>1111@31>111
|
|
</span></span></code></pre></div><p>Only the header lines of the sequence records have changed: they are the only record lines that start with a <code>+</code> or a <code>-</code> in the first column. The lines starting with <code>-</code> come from the <code>paired.fastq</code> file, and the lines starting with <code>+</code> come from the <code>paired_exact.fastq</code> file. Lines starting with <code> </code> are identical in both files.</p>
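<p>The claim that both files carry the same consensus sequences can also be checked without reading the diff by eye. The following is a minimal sketch, assuming each FASTQ record spans exactly four lines as in the listings above. The function name <code>read_sequences</code> and the two toy records are hypothetical and stand in for the real files.</p>

```python
import io


def read_sequences(handle):
    """Return {identifier: sequence} from a four-line-per-record FASTQ stream."""
    lines = [line.rstrip("\n") for line in handle]
    sequences = {}
    for start in range(0, len(lines) - 3, 4):
        header, sequence = lines[start], lines[start + 1]
        # The identifier is the header text up to the first space, minus '@'.
        identifier = header[1:].split(" ", 1)[0]
        sequences[identifier] = sequence
    return sequences


# Toy stand-ins for paired.fastq and paired_exact.fastq: same sequence,
# different annotations in the header.
fast_run = (
    '@read1 {"mode":"alignment","paring_fast_score":0.844,"score_norm":0.957}\n'
    "acgtacgt\n+\nIIIIIIII\n"
)
exact_run = (
    '@read1 {"mode":"alignment","score_norm":0.957}\n'
    "acgtacgt\n+\nIIIIIIII\n"
)

same = read_sequences(io.StringIO(fast_run)) == read_sequences(io.StringIO(exact_run))
print("identical consensus sequences:", same)
```

<p>With real data, the two <code>io.StringIO</code> objects would be replaced by <code>open("paired.fastq")</code> and <code>open("paired_exact.fastq")</code>.</p>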
|
|
<p>For the two aligned sequences, the tags describing the initial fast alignment (<code>paring_fast_count</code>, <code>paring_fast_overlap</code> and <code>paring_fast_score</code>) are missing from the <code>paired_exact.fastq</code> file because the <a href="/obidoc/docs/commands/alignments/obipairing/fasta-like/">FASTA-derived algorithm</a> is not run when the <code>--exact-mode</code> option is used.</p>
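<p>The missing tags can be listed by comparing the annotation keys of the same record in both outputs. This sketch uses annotations abridged from the header of read <code>15946:1586</code> shown above. The variable names are illustrative only.</p>

```python
import json

# Annotations of the same read pair, abridged from the two output files.
default_run = json.loads(
    '{"ali_dir":"right","ali_length":138,"mode":"alignment",'
    '"paring_fast_count":114,"paring_fast_overlap":138,'
    '"paring_fast_score":0.844,"score":5446,"score_norm":0.957}'
)
exact_run = json.loads(
    '{"ali_dir":"right","ali_length":138,"mode":"alignment",'
    '"score":5446,"score_norm":0.957}'
)

# Keys present in the default run but absent with --exact-mode.
missing_tags = sorted(set(default_run) - set(exact_run))
print(missing_tags)
# ['paring_fast_count', 'paring_fast_overlap', 'paring_fast_score']

# Tags shared by both runs keep the same values for this record.
unchanged = all(default_run[key] == exact_run[key] for key in exact_run)
print(unchanged)
```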
|
|
<p>With the <code>--exact-mode</code> option, the second joined read pair (identifier ending in <code>15399:1590</code>) now has a long overlap of 137 bases, compared with 4 bases in the previous command. Its <code>score_norm</code> value is only 0.796, however, which is well below the threshold of 0.9, so the alignment is rejected and the reads are joined instead.</p>
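<p>The numbers in the headers above can be cross-checked with a few lines of code. In the records of this example, <code>score_norm</code> coincides with <code>seq_ab_match / ali_length</code> rounded to three decimals; this is an observation on the example data, not a statement about how <code>obipairing</code> computes the value. The comparison against 0.9 is likewise a simplified sketch of the decision described in the text, and the constant name <code>MIN_IDENTITY</code> is hypothetical.</p>

```python
MIN_IDENTITY = 0.9  # threshold quoted in the text; the name is illustrative

# Tag values copied from the headers of paired_exact.fastq.
records = {
    "15946:1586": {"ali_length": 138, "seq_ab_match": 132,
                   "score_norm": 0.957, "mode": "alignment"},
    "15399:1590": {"ali_length": 137, "seq_ab_match": 109,
                   "score_norm": 0.796, "mode": "join"},
    "13773:1687": {"ali_length": 54, "seq_ab_match": 51,
                   "score_norm": 0.944, "mode": "alignment"},
}

results = {}
for name, tags in records.items():
    # Fraction of matching positions in the overlap.
    ratio = round(tags["seq_ab_match"] / tags["ali_length"], 3)
    # Simplified rule: keep the alignment only above the identity threshold.
    predicted_mode = "alignment" if tags["score_norm"] >= MIN_IDENTITY else "join"
    results[name] = (ratio, predicted_mode)
    print(name, ratio, tags["score_norm"], predicted_mode, tags["mode"])
```

<p>Identity is not the only criterion: in the previous run the same read pair had a 4-base overlap with a <code>score_norm</code> of 1 and was still reported in <code>join</code> mode.</p>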
|
|
</article>
|
|
|
|
|
|
|
|
<footer class="book-footer">
|
|
|
|
<div class="flex flex-wrap justify-between">
|
|
|
|
|
|
|
|
|
|
|
|
</div>
|
|
|
|
|
|
|
|
<script>(function(){function e(e){const t=window.getSelection(),n=document.createRange();n.selectNodeContents(e),t.removeAllRanges(),t.addRange(n)}document.querySelectorAll("pre code").forEach(t=>{t.addEventListener("click",function(){if(window.getSelection().toString())return;e(t.parentElement),navigator.clipboard&&navigator.clipboard.writeText(t.parentElement.textContent)})})})()</script>
|
|
|
|
|
|
|
|
|
|
</footer>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<label for="menu-control" class="hidden book-menu-overlay"></label>
|
|
</div>
|
|
|
|
|
|
<aside class="book-toc">
|
|
<div class="book-toc-content">
|
|
|
|
|
|
<nav id="TableOfContents">
|
|
<ul>
|
|
<li><a href="#obipairing-align-forward-and-reverse-paired-reads"><code>obipairing</code>: align forward and reverse paired reads</a>
|
|
<ul>
|
|
<li><a href="#description">Description</a>
|
|
<ul>
|
|
<li><a href="#input-data">Input data</a></li>
|
|
<li><a href="#the-simplest-obipairing-command">The simplest <em>obipairing</em> command</a></li>
|
|
<li><a href="#the-alignment-process">The alignment process</a></li>
|
|
<li><a href="#building-the-consensus-sequence">Building the consensus sequence</a></li>
|
|
</ul>
|
|
</li>
|
|
<li><a href="#synopsis">Synopsis</a></li>
|
|
<li><a href="#options">Options</a></li>
|
|
<li><a href="#examples">Examples</a>
|
|
<ul>
|
|
<li><a href="#basic-example">Basic example</a></li>
|
|
<li><a href="#pairing-the-reads-in-exact-mode">Pairing the reads in exact mode</a></li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
</nav>
|
|
|
|
|
|
|
|
</div>
|
|
</aside>
|
|
|
|
</main>
|
|
|
|
|
|
</body>
|
|
</html>