Adds the new version of the doc as a quarto book

2023-01-17 19:06:14 +01:00
parent f873645e8e
commit 4592855095
36 changed files with 7238 additions and 795 deletions

BIN
doc/.DS_Store vendored Normal file

Binary file not shown.

1
doc/.gitignore vendored Normal file

@ -0,0 +1 @@
/.quarto/

BIN
doc/_book/OBITools-V4.pdf Normal file

Binary file not shown.


@ -1,376 +1,389 @@
<!DOCTYPE html>
<html lang="" xml:lang="">
<head>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<title>4 Annexes | The GO OBITools</title>
<meta name="description" content="Description of the principles used into the GO implementation of OBITools." />
<meta name="generator" content="bookdown 0.29 and GitBook 2.6.7" />
<meta charset="utf-8">
<meta name="generator" content="quarto-1.2.256">
<meta property="og:title" content="4 Annexes | The GO OBITools" />
<meta property="og:type" content="book" />
<meta property="og:description" content="Description of the principles used into the GO implementation of OBITools." />
<meta name="github-repo" content="seankross/bookdown-start" />
<meta name="twitter:card" content="summary" />
<meta name="twitter:title" content="4 Annexes | The GO OBITools" />
<meta name="twitter:description" content="Description of the principles used into the GO implementation of OBITools." />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<meta name="author" content="SEric Coissac" />
<meta name="date" content="2022-08-25" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
<link rel="prev" href="reference-documentation-for-the-go-obitools-library.html"/>
<script src="book_assets/jquery-3.6.0/jquery-3.6.0.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/fuse.js@6.4.6/dist/fuse.min.js"></script>
<link href="book_assets/gitbook-2.6.7/css/style.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-table.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-bookdown.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-highlight.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-search.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-fontsettings.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-clipboard.css" rel="stylesheet" />
<link href="book_assets/anchor-sections-1.1.0/anchor-sections.css" rel="stylesheet" />
<link href="book_assets/anchor-sections-1.1.0/anchor-sections-hash.css" rel="stylesheet" />
<script src="book_assets/anchor-sections-1.1.0/anchor-sections.js"></script>
<style type="text/css">
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
<title>OBITools V4 - 4&nbsp; Annexes</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1.6em;
vertical-align: middle;
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<script src="site_libs/quarto-nav/quarto-nav.js"></script>
<script src="site_libs/quarto-nav/headroom.min.js"></script>
<script src="site_libs/clipboard/clipboard.min.js"></script>
<script src="site_libs/quarto-search/autocomplete.umd.js"></script>
<script src="site_libs/quarto-search/fuse.min.js"></script>
<script src="site_libs/quarto-search/quarto-search.js"></script>
<meta name="quarto:offset" content="./">
<link href="./references.html" rel="next">
<link href="./library.html" rel="prev">
<script src="site_libs/quarto-html/quarto.js"></script>
<script src="site_libs/quarto-html/popper.min.js"></script>
<script src="site_libs/quarto-html/tippy.umd.min.js"></script>
<script src="site_libs/quarto-html/anchor.min.js"></script>
<link href="site_libs/quarto-html/tippy.css" rel="stylesheet">
<link href="site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="site_libs/bootstrap/bootstrap.min.js"></script>
<link href="site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="site_libs/bootstrap/bootstrap.min.css" rel="stylesheet" id="quarto-bootstrap" data-mode="light">
<script id="quarto-search-options" type="application/json">{
"location": "sidebar",
"copy-button": false,
"collapse-after": 3,
"panel-placement": "start",
"type": "textbox",
"limit": 20,
"language": {
"search-no-results-text": "No results",
"search-matching-documents-text": "matching documents",
"search-copy-link-title": "Copy link to search",
"search-hide-matches-text": "Hide additional matches",
"search-more-match-text": "more match in this document",
"search-more-matches-text": "more matches in this document",
"search-clear-button-title": "Clear",
"search-detached-cancel-button-title": "Cancel",
"search-submit-button-title": "Submit"
}
}</script>
</head>
<body>
<div class="book without-animation with-summary font-size-2 font-family-1" data-basepath=".">
<div class="book-summary">
<nav role="navigation">
<ul class="summary">
<li class="chapter" data-level="1" data-path="the-obitools.html"><a href="the-obitools.html"><i class="fa fa-check"></i><b>1</b> The OBITools</a>
<ul>
<li class="chapter" data-level="1.1" data-path="the-obitools.html"><a href="the-obitools.html#aims-of-obitools"><i class="fa fa-check"></i><b>1.1</b> Aims of <em>OBITools</em></a></li>
<li class="chapter" data-level="1.2" data-path="the-obitools.html"><a href="the-obitools.html#file-formats-usable-with-obitools"><i class="fa fa-check"></i><b>1.2</b> File formats usable with <em>OBITools</em></a>
<ul>
<li class="chapter" data-level="1.2.1" data-path="the-obitools.html"><a href="the-obitools.html#the-sequence-files"><i class="fa fa-check"></i><b>1.2.1</b> The sequence files</a></li>
<li class="chapter" data-level="1.2.2" data-path="the-obitools.html"><a href="the-obitools.html#the-iupac-code"><i class="fa fa-check"></i><b>1.2.2</b> The IUPAC Code</a></li>
<li class="chapter" data-level="1.2.3" data-path="the-obitools.html"><a href="the-obitools.html#classical-fasta"><i class="fa fa-check"></i><b>1.2.3</b> The <em>fasta</em> format</a></li>
<li class="chapter" data-level="1.2.4" data-path="the-obitools.html"><a href="the-obitools.html#classical-fastq"><i class="fa fa-check"></i><b>1.2.4</b> The <em>fastq</em> sequence format</a></li>
</ul></li>
<li class="chapter" data-level="1.3" data-path="the-obitools.html"><a href="the-obitools.html#file-extension"><i class="fa fa-check"></i><b>1.3</b> File extension</a></li>
<li class="chapter" data-level="1.4" data-path="the-obitools.html"><a href="the-obitools.html#see-also"><i class="fa fa-check"></i><b>1.4</b> See also</a></li>
<li class="chapter" data-level="1.5" data-path="the-obitools.html"><a href="the-obitools.html#references"><i class="fa fa-check"></i><b>1.5</b> References</a></li>
</ul></li>
<li class="chapter" data-level="2" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html"><i class="fa fa-check"></i><b>2</b> The <em>OBITools</em> commands</a>
<ul>
<li class="chapter" data-level="2.1" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#specifying-the-input-files-to-obitools-commands"><i class="fa fa-check"></i><b>2.1</b> Specifying the input files to <em>OBITools</em> commands</a></li>
<li class="chapter" data-level="2.2" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#options-common-to-most-of-the-obitools-commands"><i class="fa fa-check"></i><b>2.2</b> Options common to most of the <em>OBITools</em> commands</a>
<ul>
<li class="chapter" data-level="2.2.1" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#specifying-input-format"><i class="fa fa-check"></i><b>2.2.1</b> Specifying input format</a></li>
<li class="chapter" data-level="2.2.2" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#specifying-output-format"><i class="fa fa-check"></i><b>2.2.2</b> Specifying output format</a></li>
<li class="chapter" data-level="2.2.3" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#format-of-the-annotations-in-fasta-and-fastq-files"><i class="fa fa-check"></i><b>2.2.3</b> Format of the annotations in Fasta and Fastq files</a></li>
</ul></li>
<li class="chapter" data-level="2.3" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#metabarcode-design-and-quality-assessment"><i class="fa fa-check"></i><b>2.3</b> Metabarcode design and quality assessment</a></li>
<li class="chapter" data-level="2.4" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#file-format-conversions"><i class="fa fa-check"></i><b>2.4</b> File format conversions</a></li>
<li class="chapter" data-level="2.5" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#sequence-annotations"><i class="fa fa-check"></i><b>2.5</b> Sequence annotations</a></li>
<li class="chapter" data-level="2.6" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#computations-on-sequences"><i class="fa fa-check"></i><b>2.6</b> Computations on sequences</a>
<ul>
<li class="chapter" data-level="2.6.1" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#obipairing"><i class="fa fa-check"></i><b>2.6.1</b> <code>obipairing</code></a></li>
</ul></li>
<li class="chapter" data-level="2.7" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#sequence-sampling-and-filtering"><i class="fa fa-check"></i><b>2.7</b> Sequence sampling and filtering</a>
<ul>
<li class="chapter" data-level="2.7.1" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#utilities"><i class="fa fa-check"></i><b>2.7.1</b> Utilities</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="3" data-path="reference-documentation-for-the-go-obitools-library.html"><a href="reference-documentation-for-the-go-obitools-library.html"><i class="fa fa-check"></i><b>3</b> Reference documentation for the GO <em>OBITools</em> library</a>
<ul>
<li class="chapter" data-level="3.1" data-path="reference-documentation-for-the-go-obitools-library.html"><a href="reference-documentation-for-the-go-obitools-library.html#biosequence"><i class="fa fa-check"></i><b>3.1</b> BioSequence</a>
<ul>
<li class="chapter" data-level="3.1.1" data-path="reference-documentation-for-the-go-obitools-library.html"><a href="reference-documentation-for-the-go-obitools-library.html#creating-new-instances"><i class="fa fa-check"></i><b>3.1.1</b> Creating new instances</a></li>
<li class="chapter" data-level="3.1.2" data-path="reference-documentation-for-the-go-obitools-library.html"><a href="reference-documentation-for-the-go-obitools-library.html#end-of-life-of-a-biosequence-instance"><i class="fa fa-check"></i><b>3.1.2</b> End of life of a <code>BioSequence</code> instance</a></li>
<li class="chapter" data-level="3.1.3" data-path="reference-documentation-for-the-go-obitools-library.html"><a href="reference-documentation-for-the-go-obitools-library.html#accessing-to-the-elements-of-a-sequence"><i class="fa fa-check"></i><b>3.1.3</b> Accessing to the elements of a sequence</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="4" data-path="annexes.html"><a href="annexes.html"><i class="fa fa-check"></i><b>4</b> Annexes</a>
<ul>
<li class="chapter" data-level="4.0.1" data-path="annexes.html"><a href="annexes.html#sequence-attributes"><i class="fa fa-check"></i><b>4.0.1</b> Sequence attributes</a></li>
</ul></li>
</ul>
<body class="nav-sidebar floating">
<div id="quarto-search-results"></div>
<header id="quarto-header" class="headroom fixed-top">
<nav class="quarto-secondary-nav" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<div class="container-fluid d-flex justify-content-between">
<h1 class="quarto-secondary-nav-title"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Annexes</span></h1>
<button type="button" class="quarto-btn-toggle btn" aria-label="Show secondary navigation">
<i class="bi bi-chevron-right"></i>
</button>
</div>
</nav>
</header>
<!-- content -->
<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article">
<!-- sidebar -->
<nav id="quarto-sidebar" class="sidebar collapse sidebar-navigation floating overflow-auto">
<div class="pt-lg-2 mt-2 text-left sidebar-header">
<div class="sidebar-title mb-0 py-0">
<a href="./">OBITools V4</a>
</div>
</div>
<div class="mt-2 flex-shrink-0 align-items-center">
<div class="sidebar-search">
<div id="quarto-search" class="" title="Search"></div>
</div>
</div>
<div class="sidebar-menu-container">
<ul class="list-unstyled mt-1">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./index.html" class="sidebar-item-text sidebar-link">Preface</a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./intro.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">The OBITools</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./commands.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">The <em>OBITools</em> commands</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./library.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">3</span>&nbsp; <span class="chapter-title">The GO <em>OBITools</em> library</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./annexes.html" class="sidebar-item-text sidebar-link active"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Annexes</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./references.html" class="sidebar-item-text sidebar-link">References</a>
</div>
</li>
</ul>
</div>
</nav>
<!-- margin-sidebar -->
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
<nav id="TOC" role="doc-toc" class="toc-active">
<h2 id="toc-title">Table of contents</h2>
<ul>
<li><a href="#sequence-attributes" id="toc-sequence-attributes" class="nav-link active" data-scroll-target="#sequence-attributes"><span class="toc-section-number">4.0.1</span> Sequence attributes</a></li>
</ul>
</nav>
</div>
<!-- main -->
<main class="content" id="quarto-document-content">
<header id="title-block-header" class="quarto-title-block default">
<div class="quarto-title">
<h1 class="title d-none d-lg-block"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Annexes</span></h1>
</div>
<div class="quarto-title-meta">
</div>
<div class="book-body">
<div class="body-inner">
<div class="book-header" role="navigation">
<h1>
<i class="fa fa-circle-o-notch fa-spin"></i><a href="./">The GO <em>OBITools</em></a>
</h1>
</div>
<div class="page-wrapper" tabindex="-1" role="main">
<div class="page-inner">
</header>
<section class="normal" id="section-">
<div id="annexes" class="section level1 hasAnchor" number="4">
<h1><span class="header-section-number">4</span> Annexes<a href="annexes.html#annexes" class="anchor-section" aria-label="Anchor link to header"></a></h1>
<div id="sequence-attributes" class="section level3 hasAnchor" number="4.0.1">
<h3><span class="header-section-number">4.0.1</span> Sequence attributes<a href="annexes.html#sequence-attributes" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<div id="reserved-sequence-attributes" class="section level4 hasAnchor" number="4.0.1.1">
<h4><span class="header-section-number">4.0.1.1</span> Reserved sequence attributes<a href="annexes.html#reserved-sequence-attributes" class="anchor-section" aria-label="Anchor link to header"></a></h4>
<div id="ali_dir" class="section level5 hasAnchor" number="4.0.1.1.1">
<h5><span class="header-section-number">4.0.1.1.1</span> <code>ali_dir</code><a href="annexes.html#ali_dir" class="anchor-section" aria-label="Anchor link to header"></a></h5>
<div id="type-string" class="section level6 hasAnchor" number="4.0.1.1.1.1">
<h6><span class="header-section-number">4.0.1.1.1.1</span> Type : <code>string</code><a href="annexes.html#type-string" class="anchor-section" aria-label="Anchor link to header"></a></h6>
<section id="sequence-attributes" class="level3" data-number="4.0.1">
<h3 data-number="4.0.1" class="anchored" data-anchor-id="sequence-attributes"><span class="header-section-number">4.0.1</span> Sequence attributes</h3>
<section id="reserved-sequence-attributes" class="level4" data-number="4.0.1.1">
<h4 data-number="4.0.1.1" class="anchored" data-anchor-id="reserved-sequence-attributes"><span class="header-section-number">4.0.1.1</span> Reserved sequence attributes</h4>
<section id="ali_dir" class="level5" data-number="4.0.1.1.1">
<h5 data-number="4.0.1.1.1" class="anchored" data-anchor-id="ali_dir"><span class="header-section-number">4.0.1.1.1</span> <code>ali_dir</code></h5>
<section id="type-string" class="level6" data-number="4.0.1.1.1.1">
<h6 data-number="4.0.1.1.1.1" class="anchored" data-anchor-id="type-string"><span class="header-section-number">4.0.1.1.1.1</span> Type : <code>string</code></h6>
<p>The attribute takes one of two string values: <code>"left"</code> or <code>"right"</code>.</p>
</div>
<div id="set-by-the-obipairing-tool" class="section level6 hasAnchor" number="4.0.1.1.1.2">
<h6><span class="header-section-number">4.0.1.1.1.2</span> Set by the <em>obipairing</em> tool<a href="annexes.html#set-by-the-obipairing-tool" class="anchor-section" aria-label="Anchor link to header"></a></h6>
<p><em>obipairing</em> aligns the reads with a 3'-end gap-free algorithm.
Two cases can occur when aligning the forward and reverse reads. If the
barcode is long enough, the reads overlap only at their 3' ends. In
that case, the alignment direction <code>ali_dir</code> is set to <em>left</em>. If the
barcode is shorter than the read length, the paired reads overlap at
their 5' ends, and the complete barcode is sequenced by both reads.
In that latter case, <code>ali_dir</code> is set to <em>right</em>.</p>
</div>
</div>
<div id="ali_length" class="section level5 hasAnchor" number="4.0.1.1.2">
<h5><span class="header-section-number">4.0.1.1.2</span> <code>ali_length</code><a href="annexes.html#ali_length" class="anchor-section" aria-label="Anchor link to header"></a></h5>
<div id="set-by-the-obipairing-tool-1" class="section level6 hasAnchor" number="4.0.1.1.2.1">
<h6><span class="header-section-number">4.0.1.1.2.1</span> Set by the <em>obipairing</em> tool<a href="annexes.html#set-by-the-obipairing-tool-1" class="anchor-section" aria-label="Anchor link to header"></a></h6>
</section>
<section id="set-by-the-obipairing-tool" class="level6" data-number="4.0.1.1.1.2">
<h6 data-number="4.0.1.1.1.2" class="anchored" data-anchor-id="set-by-the-obipairing-tool"><span class="header-section-number">4.0.1.1.1.2</span> Set by the <em>obipairing</em> tool</h6>
<p><em>obipairing</em> aligns the reads with a 3'-end gap-free algorithm. Two cases can occur when aligning the forward and reverse reads. If the barcode is long enough, the reads overlap only at their 3' ends. In that case, the alignment direction <code>ali_dir</code> is set to <em>left</em>. If the barcode is shorter than the read length, the paired reads overlap at their 5' ends, and the complete barcode is sequenced by both reads. In that latter case, <code>ali_dir</code> is set to <em>right</em>.</p>
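<p>The two cases described above can be illustrated with a minimal Go sketch. The function name <code>aliDir</code> and its parameters are hypothetical and are not part of the <em>OBITools</em> library; the sketch only compares the barcode length with the read length, whereas <em>obipairing</em> derives the direction from the alignment itself.</p>

```go
package main

import "fmt"

// aliDir is a hypothetical helper (not an OBITools function).
// It returns the value that the text associates with each case:
//   - "right" when the barcode is shorter than the read length,
//     so the reads overlap at their 5' ends;
//   - "left" otherwise, when the reads overlap only at their 3' ends.
func aliDir(barcodeLen, readLen int) string {
	if barcodeLen < readLen {
		return "right"
	}
	return "left"
}

func main() {
	// Barcode longer than a single read: overlap at the 3' ends.
	fmt.Println(aliDir(250, 150))
	// Barcode shorter than a single read: overlap at the 5' ends.
	fmt.Println(aliDir(80, 150))
}
```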
</section>
</section>
<section id="ali_length" class="level5" data-number="4.0.1.1.2">
<h5 data-number="4.0.1.1.2" class="anchored" data-anchor-id="ali_length"><span class="header-section-number">4.0.1.1.2</span> <code>ali_length</code></h5>
<section id="set-by-the-obipairing-tool-1" class="level6" data-number="4.0.1.1.2.1">
<h6 data-number="4.0.1.1.2.1" class="anchored" data-anchor-id="set-by-the-obipairing-tool-1"><span class="header-section-number">4.0.1.1.2.1</span> Set by the <em>obipairing</em> tool</h6>
<p>Length of the aligned parts when merging the forward and reverse reads.</p>
</div>
</div>
<div id="count-the-number-of-sequence-occurrences" class="section level5 hasAnchor" number="4.0.1.1.3">
<h5><span class="header-section-number">4.0.1.1.3</span> <code>count</code> : the number of sequence occurrences<a href="annexes.html#count-the-number-of-sequence-occurrences" class="anchor-section" aria-label="Anchor link to header"></a></h5>
<div id="set-by-the-obiuniq-tool" class="section level6 hasAnchor" number="4.0.1.1.3.1">
<h6><span class="header-section-number">4.0.1.1.3.1</span> Set by the <em>obiuniq</em> tool<a href="annexes.html#set-by-the-obiuniq-tool" class="anchor-section" aria-label="Anchor link to header"></a></h6>
<p>The <code>count</code> attribute indicates how many strictly identical sequences
have been merged into a single record. It contains an integer value. If it
is absent, the sequence record represents a single
occurrence of the sequence.</p>
</div>
<div id="getter-method-count" class="section level6 hasAnchor" number="4.0.1.1.3.2">
<h6><span class="header-section-number">4.0.1.1.3.2</span> Getter : method <code>Count()</code><a href="annexes.html#getter-method-count" class="anchor-section" aria-label="Anchor link to header"></a></h6>
<p>The <code>Count()</code> method returns the <code>count</code> attribute as an
integer value. If the <code>count</code> attribute is not defined for the given
sequence, the value <em>1</em> is returned.</p>
</div>
</div>
<div id="merged_" class="section level5 hasAnchor" number="4.0.1.1.4">
<h5><span class="header-section-number">4.0.1.1.4</span> <code>merged_*</code><a href="annexes.html#merged_" class="anchor-section" aria-label="Anchor link to header"></a></h5>
<div id="type-mapstringint" class="section level6 hasAnchor" number="4.0.1.1.4.1">
<h6><span class="header-section-number">4.0.1.1.4.1</span> Type : <code>map[string]int</code><a href="annexes.html#type-mapstringint" class="anchor-section" aria-label="Anchor link to header"></a></h6>
</div>
<div id="set-by-the-obiuniq-tool-1" class="section level6 hasAnchor" number="4.0.1.1.4.2">
<h6><span class="header-section-number">4.0.1.1.4.2</span> Set by the <em>obiuniq</em> tool<a href="annexes.html#set-by-the-obiuniq-tool-1" class="anchor-section" aria-label="Anchor link to header"></a></h6>
<p>The <code>-m</code> option of the <em>obiuniq</em> tool keeps track of the
distribution of the values stored in a given attribute of interest. This
option is often used to summarise the distribution of a sequence variant
across samples when <em>obiuniq</em> is run after <em>obimultiplex</em>. The
actual name of the attribute depends on the name of the monitored
attribute. If the <code>-m</code> option is used with the attribute <em>sample</em>, then the
resulting attribute is named <em>merged_sample</em>.</p>
</div>
</div>
<div id="mode" class="section level5 hasAnchor" number="4.0.1.1.5">
<h5><span class="header-section-number">4.0.1.1.5</span> <code>mode</code><a href="annexes.html#mode" class="anchor-section" aria-label="Anchor link to header"></a></h5>
<div id="set-by-the-obipairing-tool-2" class="section level6 hasAnchor" number="4.0.1.1.5.1">
<h6><span class="header-section-number">4.0.1.1.5.1</span> Set by the <em>obipairing</em> tool<a href="annexes.html#set-by-the-obipairing-tool-2" class="anchor-section" aria-label="Anchor link to header"></a></h6>
</section>
</section>
<section id="count-the-number-of-sequence-occurrences" class="level5" data-number="4.0.1.1.3">
<h5 data-number="4.0.1.1.3" class="anchored" data-anchor-id="count-the-number-of-sequence-occurrences"><span class="header-section-number">4.0.1.1.3</span> <code>count</code> : the number of sequence occurrences</h5>
<section id="set-by-the-obiuniq-tool" class="level6" data-number="4.0.1.1.3.1">
<h6 data-number="4.0.1.1.3.1" class="anchored" data-anchor-id="set-by-the-obiuniq-tool"><span class="header-section-number">4.0.1.1.3.1</span> Set by the <em>obiuniq</em> tool</h6>
<p>The <code>count</code> attribute indicates how many strictly identical sequences have been merged into a single record. It contains an integer value. If it is absent, the sequence record represents a single occurrence of the sequence.</p>
</section>
<section id="getter-method-count" class="level6" data-number="4.0.1.1.3.2">
<h6 data-number="4.0.1.1.3.2" class="anchored" data-anchor-id="getter-method-count"><span class="header-section-number">4.0.1.1.3.2</span> Getter : method <code>Count()</code></h6>
<p>The <code>Count()</code> method returns the <code>count</code> attribute as an integer value. If the <code>count</code> attribute is not defined for the given sequence, the value <em>1</em> is returned.</p>
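<p>The default-to-one behaviour of the getter can be sketched as follows. The <code>Record</code> type and its <code>Annotations</code> field are assumptions made for illustration and do not reproduce the actual <code>BioSequence</code> implementation; falling back to 1 for a non-integer value is also an assumption.</p>

```go
package main

import "fmt"

// Record is a hypothetical stand-in for a sequence record.
// It only stores the annotations needed for this example.
type Record struct {
	Annotations map[string]interface{}
}

// Count returns the "count" annotation as an int.
// When the annotation is missing, the record is taken to represent
// a single occurrence, so 1 is returned. Returning 1 for a value
// that is not an int is an assumption of this sketch.
func (r *Record) Count() int {
	if v, ok := r.Annotations["count"]; ok {
		if n, isInt := v.(int); isInt {
			return n
		}
	}
	return 1
}

func main() {
	merged := &Record{Annotations: map[string]interface{}{"count": 12}}
	single := &Record{Annotations: map[string]interface{}{}}

	fmt.Println(merged.Count()) // annotation present
	fmt.Println(single.Count()) // annotation absent, defaults to 1
}
```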
</section>
</section>
<section id="merged_" class="level5" data-number="4.0.1.1.4">
<h5 data-number="4.0.1.1.4" class="anchored" data-anchor-id="merged_"><span class="header-section-number">4.0.1.1.4</span> <code>merged_*</code></h5>
<section id="type-mapstringint" class="level6" data-number="4.0.1.1.4.1">
<h6 data-number="4.0.1.1.4.1" class="anchored" data-anchor-id="type-mapstringint"><span class="header-section-number">4.0.1.1.4.1</span> Type : <code>map[string]int</code></h6>
</section>
<section id="set-by-the-obiuniq-tool-1" class="level6" data-number="4.0.1.1.4.2">
<h6 data-number="4.0.1.1.4.2" class="anchored" data-anchor-id="set-by-the-obiuniq-tool-1"><span class="header-section-number">4.0.1.1.4.2</span> Set by the <em>obiuniq</em> tool</h6>
<p>The <code>-m</code> option of the <em>obiuniq</em> tool keeps track of the distribution of the values stored in a given attribute of interest. This option is often used to summarise the distribution of a sequence variant across samples when <em>obiuniq</em> is run after <em>obimultiplex</em>. The actual name of the attribute depends on the name of the monitored attribute. If the <code>-m</code> option is used with the attribute <em>sample</em>, then the resulting attribute is named <em>merged_sample</em>.</p>
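<p>As an illustration of what a <code>merged_sample</code> value of type <code>map[string]int</code> holds, the sketch below groups identical sequences and counts their occurrences per sample. The types <code>Read</code> and <code>Unique</code> and the function <code>dereplicate</code> are hypothetical names chosen for this example; they do not describe how <em>obiuniq</em> is implemented.</p>

```go
package main

import "fmt"

// Read is a hypothetical input record: a sequence and the sample
// it was assigned to (for instance by a demultiplexing step).
type Read struct {
	Sequence string
	Sample   string
}

// Unique is a hypothetical dereplicated record. Count mirrors the
// "count" attribute and MergedSample mirrors "merged_sample".
type Unique struct {
	Count        int
	MergedSample map[string]int
}

// dereplicate groups strictly identical sequences and records, for
// each of them, how many occurrences come from each sample.
func dereplicate(reads []Read) map[string]*Unique {
	out := make(map[string]*Unique)
	for _, r := range reads {
		u, ok := out[r.Sequence]
		if !ok {
			u = &Unique{MergedSample: make(map[string]int)}
			out[r.Sequence] = u
		}
		u.Count++
		u.MergedSample[r.Sample]++
	}
	return out
}

func main() {
	reads := []Read{
		{Sequence: "acgt", Sample: "A"},
		{Sequence: "acgt", Sample: "A"},
		{Sequence: "acgt", Sample: "B"},
		{Sequence: "ttga", Sample: "B"},
	}
	for seq, u := range dereplicate(reads) {
		fmt.Println(seq, u.Count, u.MergedSample)
	}
}
```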
</section>
</section>
<section id="mode" class="level5" data-number="4.0.1.1.5">
<h5 data-number="4.0.1.1.5" class="anchored" data-anchor-id="mode"><span class="header-section-number">4.0.1.1.5</span> <code>mode</code></h5>
<section id="set-by-the-obipairing-tool-2" class="level6" data-number="4.0.1.1.5.1">
<h6 data-number="4.0.1.1.5.1" class="anchored" data-anchor-id="set-by-the-obipairing-tool-2"><span class="header-section-number">4.0.1.1.5.1</span> Set by the <em>obipairing</em> tool</h6>
<p><strong><code>obitag_ref_index</code></strong></p>
</div>
<div id="set-by-the-obirefidx-tool." class="section level6 hasAnchor" number="4.0.1.1.5.2">
<h6><span class="header-section-number">4.0.1.1.5.2</span> Set by the <em>obirefidx</em> tool.<a href="annexes.html#set-by-the-obirefidx-tool." class="anchor-section" aria-label="Anchor link to header"></a></h6>
<p>It summarises which taxonomic annotation a match to that sequence must
lead to, according to the number of differences between the query
sequence and the reference sequence carrying that tag.</p>
</div>
<div id="getter-method-count-1" class="section level6 hasAnchor" number="4.0.1.1.5.3">
<h6><span class="header-section-number">4.0.1.1.5.3</span> Getter : method <code>Count()</code><a href="annexes.html#getter-method-count-1" class="anchor-section" aria-label="Anchor link to header"></a></h6>
</div>
</div>
<div id="pairing_mismatches" class="section level5 hasAnchor" number="4.0.1.1.6">
<h5><span class="header-section-number">4.0.1.1.6</span> <code>pairing_mismatches</code><a href="annexes.html#pairing_mismatches" class="anchor-section" aria-label="Anchor link to header"></a></h5>
<div id="set-by-the-obipairing-tool-3" class="section level6 hasAnchor" number="4.0.1.1.6.1">
<h6><span class="header-section-number">4.0.1.1.6.1</span> Set by the <em>obipairing</em> tool<a href="annexes.html#set-by-the-obipairing-tool-3" class="anchor-section" aria-label="Anchor link to header"></a></h6>
</div>
</div>
<div id="score" class="section level5 hasAnchor" number="4.0.1.1.7">
<h5><span class="header-section-number">4.0.1.1.7</span> <code>score</code><a href="annexes.html#score" class="anchor-section" aria-label="Anchor link to header"></a></h5>
<div id="set-by-the-obipairing-tool-4" class="section level6 hasAnchor" number="4.0.1.1.7.1">
<h6><span class="header-section-number">4.0.1.1.7.1</span> Set by the <em>obipairing</em> tool<a href="annexes.html#set-by-the-obipairing-tool-4" class="anchor-section" aria-label="Anchor link to header"></a></h6>
</div>
</div>
<div id="score_norm" class="section level5 hasAnchor" number="4.0.1.1.8">
<h5><span class="header-section-number">4.0.1.1.8</span> <code>score_norm</code><a href="annexes.html#score_norm" class="anchor-section" aria-label="Anchor link to header"></a></h5>
<div id="set-by-the-obipairing-tool-5" class="section level6 hasAnchor" number="4.0.1.1.8.1">
<h6><span class="header-section-number">4.0.1.1.8.1</span> Set by the <em>obipairing</em> tool<a href="annexes.html#set-by-the-obipairing-tool-5" class="anchor-section" aria-label="Anchor link to header"></a></h6>
</section>
<section id="set-by-the-obirefidx-tool." class="level6" data-number="4.0.1.1.5.2">
<h6 data-number="4.0.1.1.5.2" class="anchored" data-anchor-id="set-by-the-obirefidx-tool."><span class="header-section-number">4.0.1.1.5.2</span> Set by the <em>obirefidx</em> tool.</h6>
<p>It summarizes which taxonomic annotation a match to that sequence must lead to, depending on the number of differences between the query sequence and the reference sequence carrying that tag.</p>
</section>
<section id="getter-method-count-1" class="level6" data-number="4.0.1.1.5.3">
<h6 data-number="4.0.1.1.5.3" class="anchored" data-anchor-id="getter-method-count-1"><span class="header-section-number">4.0.1.1.5.3</span> Getter : method <code>Count()</code></h6>
</section>
</section>
<section id="pairing_mismatches" class="level5" data-number="4.0.1.1.6">
<h5 data-number="4.0.1.1.6" class="anchored" data-anchor-id="pairing_mismatches"><span class="header-section-number">4.0.1.1.6</span> <code>pairing_mismatches</code></h5>
<section id="set-by-the-obipairing-tool-3" class="level6" data-number="4.0.1.1.6.1">
<h6 data-number="4.0.1.1.6.1" class="anchored" data-anchor-id="set-by-the-obipairing-tool-3"><span class="header-section-number">4.0.1.1.6.1</span> Set by the <em>obipairing</em> tool</h6>
</section>
</section>
<section id="score" class="level5" data-number="4.0.1.1.7">
<h5 data-number="4.0.1.1.7" class="anchored" data-anchor-id="score"><span class="header-section-number">4.0.1.1.7</span> <code>score</code></h5>
<section id="set-by-the-obipairing-tool-4" class="level6" data-number="4.0.1.1.7.1">
<h6 data-number="4.0.1.1.7.1" class="anchored" data-anchor-id="set-by-the-obipairing-tool-4"><span class="header-section-number">4.0.1.1.7.1</span> Set by the <em>obipairing</em> tool</h6>
</section>
</section>
<section id="score_norm" class="level5" data-number="4.0.1.1.8">
<h5 data-number="4.0.1.1.8" class="anchored" data-anchor-id="score_norm"><span class="header-section-number">4.0.1.1.8</span> <code>score_norm</code></h5>
<section id="set-by-the-obipairing-tool-5" class="level6" data-number="4.0.1.1.8.1">
<h6 data-number="4.0.1.1.8.1" class="anchored" data-anchor-id="set-by-the-obipairing-tool-5"><span class="header-section-number">4.0.1.1.8.1</span> Set by the <em>obipairing</em> tool</h6>
</div>
</div>
</div>
</div>
</div>
</section>
</div>
</div>
</div>
<a href="reference-documentation-for-the-go-obitools-library.html" class="navigation navigation-prev navigation-unique" aria-label="Previous page"><i class="fa fa-angle-left"></i></a>
</section>
</section>
</section>
</section>
</div>
</div>
<script src="book_assets/gitbook-2.6.7/js/app.min.js"></script>
<script src="book_assets/gitbook-2.6.7/js/clipboard.min.js"></script>
<script src="book_assets/gitbook-2.6.7/js/plugin-search.js"></script>
<script src="book_assets/gitbook-2.6.7/js/plugin-sharing.js"></script>
<script src="book_assets/gitbook-2.6.7/js/plugin-fontsettings.js"></script>
<script src="book_assets/gitbook-2.6.7/js/plugin-bookdown.js"></script>
<script src="book_assets/gitbook-2.6.7/js/jquery.highlight.js"></script>
<script src="book_assets/gitbook-2.6.7/js/plugin-clipboard.js"></script>
<script>
gitbook.require(["gitbook"], function(gitbook) {
gitbook.start({
"sharing": {
"github": false,
"facebook": true,
"twitter": true,
"linkedin": false,
"weibo": false,
"instapaper": false,
"vk": false,
"whatsapp": false,
"all": ["facebook", "twitter", "linkedin", "weibo", "instapaper"]
},
"fontsettings": {
"theme": "white",
"family": "sans",
"size": 2
},
"edit": {
"link": null,
"text": null
},
"history": {
"link": null,
"text": null
},
"view": {
"link": null,
"text": null
},
"download": ["_main.pdf"],
"search": {
"engine": "fuse",
"options": null
},
"toc": {
"collapse": "subsection"
}
});
</main> <!-- /main -->
<script id="quarto-html-after-body" type="application/javascript">
window.document.addEventListener("DOMContentLoaded", function (event) {
const toggleBodyColorMode = (bsSheetEl) => {
const mode = bsSheetEl.getAttribute("data-mode");
const bodyEl = window.document.querySelector("body");
if (mode === "dark") {
bodyEl.classList.add("quarto-dark");
bodyEl.classList.remove("quarto-light");
} else {
bodyEl.classList.add("quarto-light");
bodyEl.classList.remove("quarto-dark");
}
}
const toggleBodyColorPrimary = () => {
const bsSheetEl = window.document.querySelector("link#quarto-bootstrap");
if (bsSheetEl) {
toggleBodyColorMode(bsSheetEl);
}
}
toggleBodyColorPrimary();
const icon = "";
const anchorJS = new window.AnchorJS();
anchorJS.options = {
placement: 'right',
icon: icon
};
anchorJS.add('.anchored');
const clipboard = new window.ClipboardJS('.code-copy-button', {
target: function(trigger) {
return trigger.previousElementSibling;
}
});
clipboard.on('success', function(e) {
// button target
const button = e.trigger;
// don't keep focus
button.blur();
// flash "checked"
button.classList.add('code-copy-button-checked');
var currentTitle = button.getAttribute("title");
button.setAttribute("title", "Copied!");
let tooltip;
if (window.bootstrap) {
button.setAttribute("data-bs-toggle", "tooltip");
button.setAttribute("data-bs-placement", "left");
button.setAttribute("data-bs-title", "Copied!");
tooltip = new bootstrap.Tooltip(button,
{ trigger: "manual",
customClass: "code-copy-button-tooltip",
offset: [0, -8]});
tooltip.show();
}
setTimeout(function() {
if (tooltip) {
tooltip.hide();
button.removeAttribute("data-bs-title");
button.removeAttribute("data-bs-toggle");
button.removeAttribute("data-bs-placement");
}
button.setAttribute("title", currentTitle);
button.classList.remove('code-copy-button-checked');
}, 1000);
// clear code selection
e.clearSelection();
});
function tippyHover(el, contentFn) {
const config = {
allowHTML: true,
content: contentFn,
maxWidth: 500,
delay: 100,
arrow: false,
appendTo: function(el) {
return el.parentElement;
},
interactive: true,
interactiveBorder: 10,
theme: 'quarto',
placement: 'bottom-start'
};
window.tippy(el, config);
}
const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
for (var i=0; i<noterefs.length; i++) {
const ref = noterefs[i];
tippyHover(ref, function() {
// use id or data attribute instead here
let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
try { href = new URL(href).hash; } catch {}
const id = href.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
return note.innerHTML;
});
}
const findCites = (el) => {
const parentEl = el.parentElement;
if (parentEl) {
const cites = parentEl.dataset.cites;
if (cites) {
return {
el,
cites: cites.split(' ')
};
} else {
return findCites(el.parentElement)
}
} else {
return undefined;
}
};
var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
for (var i=0; i<bibliorefs.length; i++) {
const ref = bibliorefs[i];
const citeInfo = findCites(ref);
if (citeInfo) {
tippyHover(citeInfo.el, function() {
var popup = window.document.createElement('div');
citeInfo.cites.forEach(function(cite) {
var citeDiv = window.document.createElement('div');
citeDiv.classList.add('hanging-indent');
citeDiv.classList.add('csl-entry');
var biblioDiv = window.document.getElementById('ref-' + cite);
if (biblioDiv) {
citeDiv.innerHTML = biblioDiv.innerHTML;
}
popup.appendChild(citeDiv);
});
return popup.innerHTML;
});
}
}
});
</script>
<nav class="page-navigation">
<div class="nav-page nav-page-previous">
<a href="./library.html" class="pagination-link">
<i class="bi bi-arrow-left-short"></i> <span class="nav-page-text"><span class="chapter-number">3</span>&nbsp; <span class="chapter-title">The GO <em>OBITools</em> library</span></span>
</a>
</div>
<div class="nav-page nav-page-next">
<a href="./references.html" class="pagination-link">
<span class="nav-page-text">References</span> <i class="bi bi-arrow-right-short"></i>
</a>
</div>
</nav>
</div> <!-- /content -->
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
var src = "true";
if (src === "" || src === "true") src = "https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.9/latest.js?config=TeX-MML-AM_CHTML";
if (location.protocol !== "file:")
if (/^https?:/.test(src))
src = src.replace(/^https?:/, '');
script.src = src;
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>
</body></html>

490
doc/_book/commands.html Normal file
View File

@ -0,0 +1,490 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
<meta charset="utf-8">
<meta name="generator" content="quarto-1.2.256">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<title>OBITools V4 - 2&nbsp; The OBITools commands</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1.6em;
vertical-align: middle;
}
</style>
<script src="site_libs/quarto-nav/quarto-nav.js"></script>
<script src="site_libs/quarto-nav/headroom.min.js"></script>
<script src="site_libs/clipboard/clipboard.min.js"></script>
<script src="site_libs/quarto-search/autocomplete.umd.js"></script>
<script src="site_libs/quarto-search/fuse.min.js"></script>
<script src="site_libs/quarto-search/quarto-search.js"></script>
<meta name="quarto:offset" content="./">
<link href="./library.html" rel="next">
<link href="./intro.html" rel="prev">
<script src="site_libs/quarto-html/quarto.js"></script>
<script src="site_libs/quarto-html/popper.min.js"></script>
<script src="site_libs/quarto-html/tippy.umd.min.js"></script>
<script src="site_libs/quarto-html/anchor.min.js"></script>
<link href="site_libs/quarto-html/tippy.css" rel="stylesheet">
<link href="site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="site_libs/bootstrap/bootstrap.min.js"></script>
<link href="site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="site_libs/bootstrap/bootstrap.min.css" rel="stylesheet" id="quarto-bootstrap" data-mode="light">
<script id="quarto-search-options" type="application/json">{
"location": "sidebar",
"copy-button": false,
"collapse-after": 3,
"panel-placement": "start",
"type": "textbox",
"limit": 20,
"language": {
"search-no-results-text": "No results",
"search-matching-documents-text": "matching documents",
"search-copy-link-title": "Copy link to search",
"search-hide-matches-text": "Hide additional matches",
"search-more-match-text": "more match in this document",
"search-more-matches-text": "more matches in this document",
"search-clear-button-title": "Clear",
"search-detached-cancel-button-title": "Cancel",
"search-submit-button-title": "Submit"
}
}</script>
</head>
<body class="nav-sidebar floating">
<div id="quarto-search-results"></div>
<header id="quarto-header" class="headroom fixed-top">
<nav class="quarto-secondary-nav" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<div class="container-fluid d-flex justify-content-between">
<h1 class="quarto-secondary-nav-title"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">The <em>OBITools</em> commands</span></h1>
<button type="button" class="quarto-btn-toggle btn" aria-label="Show secondary navigation">
<i class="bi bi-chevron-right"></i>
</button>
</div>
</nav>
</header>
<!-- content -->
<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article">
<!-- sidebar -->
<nav id="quarto-sidebar" class="sidebar collapse sidebar-navigation floating overflow-auto">
<div class="pt-lg-2 mt-2 text-left sidebar-header">
<div class="sidebar-title mb-0 py-0">
<a href="./">OBITools V4</a>
</div>
</div>
<div class="mt-2 flex-shrink-0 align-items-center">
<div class="sidebar-search">
<div id="quarto-search" class="" title="Search"></div>
</div>
</div>
<div class="sidebar-menu-container">
<ul class="list-unstyled mt-1">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./index.html" class="sidebar-item-text sidebar-link">Preface</a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./intro.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">The OBITools</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./commands.html" class="sidebar-item-text sidebar-link active"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">The <em>OBITools</em> commands</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./library.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">3</span>&nbsp; <span class="chapter-title">The GO <em>OBITools</em> library</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./annexes.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Annexes</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./references.html" class="sidebar-item-text sidebar-link">References</a>
</div>
</li>
</ul>
</div>
</nav>
<!-- margin-sidebar -->
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
<nav id="TOC" role="doc-toc" class="toc-active">
<h2 id="toc-title">Table of contents</h2>
<ul>
<li><a href="#specifying-the-input-files-to-obitools-commands" id="toc-specifying-the-input-files-to-obitools-commands" class="nav-link active" data-scroll-target="#specifying-the-input-files-to-obitools-commands"><span class="toc-section-number">2.1</span> Specifying the input files to <em>OBITools</em> commands</a></li>
<li><a href="#options-common-to-most-of-the-obitools-commands" id="toc-options-common-to-most-of-the-obitools-commands" class="nav-link" data-scroll-target="#options-common-to-most-of-the-obitools-commands"><span class="toc-section-number">2.2</span> Options common to most of the <em>OBITools</em> commands</a>
<ul class="collapse">
<li><a href="#specifying-input-format" id="toc-specifying-input-format" class="nav-link" data-scroll-target="#specifying-input-format"><span class="toc-section-number">2.2.1</span> Specifying input format</a></li>
<li><a href="#specifying-output-format" id="toc-specifying-output-format" class="nav-link" data-scroll-target="#specifying-output-format"><span class="toc-section-number">2.2.2</span> Specifying output format</a></li>
<li><a href="#format-of-the-annotations-in-fasta-and-fastq-files" id="toc-format-of-the-annotations-in-fasta-and-fastq-files" class="nav-link" data-scroll-target="#format-of-the-annotations-in-fasta-and-fastq-files"><span class="toc-section-number">2.2.3</span> Format of the annotations in Fasta and Fastq files</a></li>
</ul></li>
<li><a href="#obitools-expression-language" id="toc-obitools-expression-language" class="nav-link" data-scroll-target="#obitools-expression-language"><span class="toc-section-number">2.3</span> OBITools expression language</a>
<ul class="collapse">
<li><a href="#variables-usable-in-the-expression" id="toc-variables-usable-in-the-expression" class="nav-link" data-scroll-target="#variables-usable-in-the-expression"><span class="toc-section-number">2.3.1</span> Variables usable in the expression</a></li>
<li><a href="#function-defined-in-the-language" id="toc-function-defined-in-the-language" class="nav-link" data-scroll-target="#function-defined-in-the-language"><span class="toc-section-number">2.3.2</span> Functions defined in the language</a></li>
<li><a href="#accessing-to-the-sequence-annotations" id="toc-accessing-to-the-sequence-annotations" class="nav-link" data-scroll-target="#accessing-to-the-sequence-annotations"><span class="toc-section-number">2.3.3</span> Accessing the sequence annotations</a></li>
</ul></li>
<li><a href="#metabarcode-design-and-quality-assessment" id="toc-metabarcode-design-and-quality-assessment" class="nav-link" data-scroll-target="#metabarcode-design-and-quality-assessment"><span class="toc-section-number">2.4</span> Metabarcode design and quality assessment</a></li>
<li><a href="#file-format-conversions" id="toc-file-format-conversions" class="nav-link" data-scroll-target="#file-format-conversions"><span class="toc-section-number">2.5</span> File format conversions</a></li>
<li><a href="#sequence-annotations" id="toc-sequence-annotations" class="nav-link" data-scroll-target="#sequence-annotations"><span class="toc-section-number">2.6</span> Sequence annotations</a></li>
<li><a href="#computations-on-sequences" id="toc-computations-on-sequences" class="nav-link" data-scroll-target="#computations-on-sequences"><span class="toc-section-number">2.7</span> Computations on sequences</a>
<ul class="collapse">
<li><a href="#obipairing" id="toc-obipairing" class="nav-link" data-scroll-target="#obipairing"><span class="toc-section-number">2.7.1</span> <code>obipairing</code></a></li>
</ul></li>
<li><a href="#sequence-sampling-and-filtering" id="toc-sequence-sampling-and-filtering" class="nav-link" data-scroll-target="#sequence-sampling-and-filtering"><span class="toc-section-number">2.8</span> Sequence sampling and filtering</a>
<ul class="collapse">
<li><a href="#utilities" id="toc-utilities" class="nav-link" data-scroll-target="#utilities"><span class="toc-section-number">2.8.1</span> Utilities</a></li>
</ul></li>
</ul>
</nav>
</div>
<!-- main -->
<main class="content" id="quarto-document-content">
<header id="title-block-header" class="quarto-title-block default">
<div class="quarto-title">
<h1 class="title d-none d-lg-block"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">The <em>OBITools</em> commands</span></h1>
</div>
<div class="quarto-title-meta">
</div>
</header>
<section id="specifying-the-input-files-to-obitools-commands" class="level2" data-number="2.1">
<h2 data-number="2.1" class="anchored" data-anchor-id="specifying-the-input-files-to-obitools-commands"><span class="header-section-number">2.1</span> Specifying the input files to <em>OBITools</em> commands</h2>
</section>
<section id="options-common-to-most-of-the-obitools-commands" class="level2" data-number="2.2">
<h2 data-number="2.2" class="anchored" data-anchor-id="options-common-to-most-of-the-obitools-commands"><span class="header-section-number">2.2</span> Options common to most of the <em>OBITools</em> commands</h2>
<section id="specifying-input-format" class="level3" data-number="2.2.1">
<h3 data-number="2.2.1" class="anchored" data-anchor-id="specifying-input-format"><span class="header-section-number">2.2.1</span> Specifying input format</h3>
<p>Five sequence formats are accepted for input files. <a href="#fasta-classical" title="Fasta format description">Fasta</a> and <a href="#fastq-classical" title="Fastq format description">Fastq</a> are the main ones. EMBL and Genbank allow the use of flat files produced by these two international databases. The last one, ecoPCR, is maintained for compatibility with previous <em>OBITools</em> and allows <em>ecoPCR</em> outputs to be read as sequence files.</p>
<ul>
<li><code>--ecopcr</code> Read data following the <em>ecoPCR</em> output format.</li>
<li><code>--embl</code> Read data following the <em>EMBL</em> flatfile format.</li>
<li><code>--genbank</code> Read data following the <em>Genbank</em> flatfile format.</li>
</ul>
<p>Several encoding schemes have been proposed for quality scores in <a href="#fastq-classical" title="Fastq format description">Fastq</a> format. Currently, <em>OBITools</em> considers Sanger encoding as the standard. For reasons of compatibility with older datasets produced with <em>Solexa</em> sequencers, it is possible, by using the following option, to force the use of the corresponding quality encoding scheme when reading these older files.</p>
<ul>
<li><code>--solexa</code> Decodes quality string according to the Solexa specification. (default: false)</li>
</ul>
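<p>The difference between the two quality encodings can be sketched as follows. This is a minimal illustration of the generally documented Sanger (ASCII offset 33) and Solexa (ASCII offset 64) conventions, not a description of how <em>OBITools</em> implements <code>--solexa</code> internally; the function names are hypothetical.</p>

```python
import math

SANGER_OFFSET = 33  # ASCII offset of the Sanger encoding (assumed standard)
SOLEXA_OFFSET = 64  # ASCII offset of the historical Solexa encoding


def decode_sanger(quality):
    """Decode a Sanger-encoded quality string into Phred scores."""
    return [ord(c) - SANGER_OFFSET for c in quality]


def decode_solexa(quality):
    """Decode a Solexa-encoded quality string into raw Solexa scores."""
    return [ord(c) - SOLEXA_OFFSET for c in quality]


def solexa_to_phred(score):
    """Convert one Solexa score to the equivalent Phred score.

    Solexa scores are based on the odds p/(1-p), Phred scores on p,
    hence the conversion Q = 10 * log10(10^(Qs/10) + 1).
    """
    return 10 * math.log10(10 ** (score / 10) + 1)
```

<p>For high scores the two scales are nearly identical; they diverge mainly for low-quality bases, which is why reading an old Solexa file with the wrong scheme mostly distorts the poorest positions.</p>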
</section>
<section id="specifying-output-format" class="level3" data-number="2.2.2">
<h3 data-number="2.2.2" class="anchored" data-anchor-id="specifying-output-format"><span class="header-section-number">2.2.2</span> Specifying output format</h3>
<p>Only two output sequence formats are supported by OBITools, Fasta and Fastq. Fastq is used when output sequences are associated with quality information. Otherwise, Fasta is the default format. However, it is possible to force the output format by using one of the following two options. Forcing the use of Fasta results in the loss of quality information. Conversely, when the Fastq format is forced with sequences that have no quality data, dummy qualities set to 40 for each nucleotide are added.</p>
<ul>
<li><code>--fasta-output</code> Write sequences in Fasta format.</li>
<li><code>--fastq-output</code> Write sequences in Fastq format.</li>
</ul>
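<p>As an illustration of the dummy qualities mentioned above, the sketch below writes a Fastq record and fills in a constant quality of 40 when none is available. It assumes the Sanger encoding (offset 33), under which a score of 40 is the character <code>I</code>; the helper name <code>fastq_record</code> is hypothetical and is not part of <em>OBITools</em>.</p>

```python
SANGER_OFFSET = 33  # assumed ASCII offset for quality characters


def fastq_record(seq_id, sequence, quality=None, dummy_score=40):
    """Format one Fastq record, adding dummy qualities if none are given."""
    if quality is None:
        # One identical quality character per nucleotide.
        quality = chr(dummy_score + SANGER_OFFSET) * len(sequence)
    return f"@{seq_id}\n{sequence}\n+\n{quality}\n"
```

<p>Calling <code>fastq_record("s1", "ACGT")</code> yields a record whose quality line is <code>IIII</code>.</p>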
<p>OBITools allows multiple input files to be specified for a single command.</p>
<ul>
<li><code>--no-order</code> When several input files are provided, indicates that there is no order among them. (default: false)</li>
</ul>
</section>
<section id="format-of-the-annotations-in-fasta-and-fastq-files" class="level3" data-number="2.2.3">
<h3 data-number="2.2.3" class="anchored" data-anchor-id="format-of-the-annotations-in-fasta-and-fastq-files"><span class="header-section-number">2.2.3</span> Format of the annotations in Fasta and Fastq files</h3>
<p>OBITools extends the <a href="#fasta-classical" title="Fasta format description">Fasta</a> and <a href="#fastq-classical" title="Fastq format description">Fastq</a> formats by defining a format for their title lines that allows every sequence to be annotated. While the previous version of OBITools used an <em>ad-hoc</em> format for these annotations, this new version introduces the standard JSON format to store them.</p>
<p>On input, OBITools automatically recognizes the format of the annotations, but two options allow the parser to be forced to one of them. You should normally not need to use these options.</p>
<ul>
<li><p><code>--input-OBI-header</code> FASTA/FASTQ title line annotations follow OBI format. (default: false)</p></li>
<li><p><code>--input-json-header</code> FASTA/FASTQ title line annotations follow JSON format. (default: false)</p></li>
</ul>
<p>On output, annotations are formatted by default using the new JSON format. For compatibility with the previous version of OBITools and with external scripts and software, it is possible to force the use of the previous OBITools format.</p>
<ul>
<li><p><code>--output-OBI-header|-O</code> Output FASTA/FASTQ title line annotations follow OBI format. (default: false)</p></li>
<li><p><code>--output-json-header</code> Output FASTA/FASTQ title line annotations follow JSON format. (default: false)</p></li>
</ul>
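<p>To give an idea of what a JSON-annotated title line can look like to a downstream script, here is a minimal parsing sketch. The exact layout is an assumption made for illustration (identifier, then a JSON object, then an optional free-text definition); the sample keys <code>count</code> and <code>sample</code> and the function name <code>parse_title</code> are hypothetical.</p>

```python
import json


def parse_title(line):
    """Split a Fasta/Fastq title line into (id, annotations, definition).

    Assumed layout: '>id {json object} free text definition'.
    """
    body = line[1:].strip()  # drop the leading '>' or '@'
    seq_id, _, rest = body.partition(" ")
    rest = rest.strip()
    annotations = {}
    if rest.startswith("{"):
        # raw_decode parses one JSON value and reports where it ends,
        # so any trailing definition text is preserved.
        annotations, end = json.JSONDecoder().raw_decode(rest)
        rest = rest[end:].strip()
    return seq_id, annotations, rest
```

<p>Relying on a standard JSON parser is the main practical benefit of the new header format: external scripts no longer need a dedicated parser for the annotations.</p>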
<section id="system-related-options" class="level4" data-number="2.2.3.1">
<h4 data-number="2.2.3.1" class="anchored" data-anchor-id="system-related-options"><span class="header-section-number">2.2.3.1</span> System related options</h4>
<ul>
<li><code>--debug</code> (default: false)</li>
<li><code>--help|-h|-?</code> (default: false)</li>
<li><code>--max-cpu &lt;int&gt;</code> Number of parallel threads computing the result (default: 10)</li>
<li><code>--workers|-w &lt;int&gt;</code> Number of parallel threads computing the result (default: 9)</li>
</ul>
</section>
</section>
</section>
<section id="obitools-expression-language" class="level2" data-number="2.3">
<h2 data-number="2.3" class="anchored" data-anchor-id="obitools-expression-language"><span class="header-section-number">2.3</span> OBITools expression language</h2>
<p>Several OBITools (<em>e.g.</em> obigrep, obiannotate) allow the user to specify simple expressions to compute values or define predicates. These expressions are parsed and evaluated using the <a href="https://pkg.go.dev/github.com/PaesslerAG/gval" title="Gval (Go eVALuate) for evaluating arbitrary expressions Go-like expressions.">gval</a> Go package, which evaluates Go-like expressions.</p>
<section id="variables-usable-in-the-expression" class="level3" data-number="2.3.1">
<h3 data-number="2.3.1" class="anchored" data-anchor-id="variables-usable-in-the-expression"><span class="header-section-number">2.3.1</span> Variables usable in the expression</h3>
<section id="sequence" class="level4" data-number="2.3.1.1">
<h4 data-number="2.3.1.1" class="anchored" data-anchor-id="sequence"><span class="header-section-number">2.3.1.1</span> sequence</h4>
<p><code>sequence</code> is the sequence object on which the expression is evaluated.</p>
</section>
<section id="annotation" class="level4" data-number="2.3.1.2">
<h4 data-number="2.3.1.2" class="anchored" data-anchor-id="annotation"><span class="header-section-number">2.3.1.2</span> annotation</h4>
</section>
</section>
<section id="function-defined-in-the-language" class="level3" data-number="2.3.2">
<h3 data-number="2.3.2" class="anchored" data-anchor-id="function-defined-in-the-language"><span class="header-section-number">2.3.2</span> Functions defined in the language</h3>
<section id="len" class="level4" data-number="2.3.2.1">
<h4 data-number="2.3.2.1" class="anchored" data-anchor-id="len"><span class="header-section-number">2.3.2.1</span> len</h4>
</section>
<section id="ismap" class="level4" data-number="2.3.2.2">
<h4 data-number="2.3.2.2" class="anchored" data-anchor-id="ismap"><span class="header-section-number">2.3.2.2</span> ismap</h4>
</section>
<section id="hasattribute" class="level4" data-number="2.3.2.3">
<h4 data-number="2.3.2.3" class="anchored" data-anchor-id="hasattribute"><span class="header-section-number">2.3.2.3</span> hasattribute</h4>
</section>
<section id="min" class="level4" data-number="2.3.2.4">
<h4 data-number="2.3.2.4" class="anchored" data-anchor-id="min"><span class="header-section-number">2.3.2.4</span> min</h4>
</section>
<section id="max" class="level4" data-number="2.3.2.5">
<h4 data-number="2.3.2.5" class="anchored" data-anchor-id="max"><span class="header-section-number">2.3.2.5</span> max</h4>
</section>
</section>
<section id="accessing-to-the-sequence-annotations" class="level3" data-number="2.3.3">
<h3 data-number="2.3.3" class="anchored" data-anchor-id="accessing-to-the-sequence-annotations"><span class="header-section-number">2.3.3</span> Accessing the sequence annotations</h3>
</section>
</section>
<section id="metabarcode-design-and-quality-assessment" class="level2" data-number="2.4">
<h2 data-number="2.4" class="anchored" data-anchor-id="metabarcode-design-and-quality-assessment"><span class="header-section-number">2.4</span> Metabarcode design and quality assessment</h2>
<section id="obipcr" class="level4" data-number="2.4.0.1">
<h4 data-number="2.4.0.1" class="anchored" data-anchor-id="obipcr"><span class="header-section-number">2.4.0.1</span> <code>obipcr</code></h4>
<blockquote class="blockquote">
<p>Replaces <code>ecoPCR</code> from the original <em>OBITools</em>.</p>
</blockquote>
</section>
</section>
<section id="file-format-conversions" class="level2" data-number="2.5">
<h2 data-number="2.5" class="anchored" data-anchor-id="file-format-conversions"><span class="header-section-number">2.5</span> File format conversions</h2>
<section id="obiconvert" class="level4" data-number="2.5.0.1">
<h4 data-number="2.5.0.1" class="anchored" data-anchor-id="obiconvert"><span class="header-section-number">2.5.0.1</span> <code>obiconvert</code></h4>
</section>
</section>
<section id="sequence-annotations" class="level2" data-number="2.6">
<h2 data-number="2.6" class="anchored" data-anchor-id="sequence-annotations"><span class="header-section-number">2.6</span> Sequence annotations</h2>
<section id="obitag" class="level4" data-number="2.6.0.1">
<h4 data-number="2.6.0.1" class="anchored" data-anchor-id="obitag"><span class="header-section-number">2.6.0.1</span> <code>obitag</code></h4>
</section>
</section>
<section id="computations-on-sequences" class="level2" data-number="2.7">
<h2 data-number="2.7" class="anchored" data-anchor-id="computations-on-sequences"><span class="header-section-number">2.7</span> Computations on sequences</h2>
<section id="obipairing" class="level3" data-number="2.7.1">
<h3 data-number="2.7.1" class="anchored" data-anchor-id="obipairing"><span class="header-section-number">2.7.1</span> <code>obipairing</code></h3>
<blockquote class="blockquote">
<p>Replaces <code>illuminapairedends</code> from the original <em>OBITools</em>.</p>
</blockquote>
<section id="obimultiplex" class="level4" data-number="2.7.1.1">
<h4 data-number="2.7.1.1" class="anchored" data-anchor-id="obimultiplex"><span class="header-section-number">2.7.1.1</span> <code>obimultiplex</code></h4>
<blockquote class="blockquote">
<p>Replaces <code>ngsfilter</code> from the original <em>OBITools</em>.</p>
</blockquote>
</section>
<section id="obicomplement" class="level4" data-number="2.7.1.2">
<h4 data-number="2.7.1.2" class="anchored" data-anchor-id="obicomplement"><span class="header-section-number">2.7.1.2</span> <code>obicomplement</code></h4>
</section>
<section id="obiclean" class="level4" data-number="2.7.1.3">
<h4 data-number="2.7.1.3" class="anchored" data-anchor-id="obiclean"><span class="header-section-number">2.7.1.3</span> <code>obiclean</code></h4>
</section>
<section id="obiuniq" class="level4" data-number="2.7.1.4">
<h4 data-number="2.7.1.4" class="anchored" data-anchor-id="obiuniq"><span class="header-section-number">2.7.1.4</span> <code>obiuniq</code></h4>
</section>
</section>
</section>
<section id="sequence-sampling-and-filtering" class="level2" data-number="2.8">
<h2 data-number="2.8" class="anchored" data-anchor-id="sequence-sampling-and-filtering"><span class="header-section-number">2.8</span> Sequence sampling and filtering</h2>
<section id="obigrep" class="level4" data-number="2.8.0.1">
<h4 data-number="2.8.0.1" class="anchored" data-anchor-id="obigrep"><span class="header-section-number">2.8.0.1</span> <code>obigrep</code></h4>
</section>
<section id="utilities" class="level3" data-number="2.8.1">
<h3 data-number="2.8.1" class="anchored" data-anchor-id="utilities"><span class="header-section-number">2.8.1</span> Utilities</h3>
<section id="obicount" class="level4" data-number="2.8.1.1">
<h4 data-number="2.8.1.1" class="anchored" data-anchor-id="obicount"><span class="header-section-number">2.8.1.1</span> <code>obicount</code></h4>
</section>
<section id="obidistribute" class="level4" data-number="2.8.1.2">
<h4 data-number="2.8.1.2" class="anchored" data-anchor-id="obidistribute"><span class="header-section-number">2.8.1.2</span> <code>obidistribute</code></h4>
</section>
<section id="obifind" class="level4" data-number="2.8.1.3">
<h4 data-number="2.8.1.3" class="anchored" data-anchor-id="obifind"><span class="header-section-number">2.8.1.3</span> <code>obifind</code></h4>
<blockquote class="blockquote">
<p>Replaces <code>ecofind</code> from the original <em>OBITools</em>.</p>
</blockquote>
</section>
</section>
</section>
</main> <!-- /main -->
<script id="quarto-html-after-body" type="application/javascript">
window.document.addEventListener("DOMContentLoaded", function (event) {
const toggleBodyColorMode = (bsSheetEl) => {
const mode = bsSheetEl.getAttribute("data-mode");
const bodyEl = window.document.querySelector("body");
if (mode === "dark") {
bodyEl.classList.add("quarto-dark");
bodyEl.classList.remove("quarto-light");
} else {
bodyEl.classList.add("quarto-light");
bodyEl.classList.remove("quarto-dark");
}
}
const toggleBodyColorPrimary = () => {
const bsSheetEl = window.document.querySelector("link#quarto-bootstrap");
if (bsSheetEl) {
toggleBodyColorMode(bsSheetEl);
}
}
toggleBodyColorPrimary();
const icon = "";
const anchorJS = new window.AnchorJS();
anchorJS.options = {
placement: 'right',
icon: icon
};
anchorJS.add('.anchored');
const clipboard = new window.ClipboardJS('.code-copy-button', {
target: function(trigger) {
return trigger.previousElementSibling;
}
});
clipboard.on('success', function(e) {
// button target
const button = e.trigger;
// don't keep focus
button.blur();
// flash "checked"
button.classList.add('code-copy-button-checked');
var currentTitle = button.getAttribute("title");
button.setAttribute("title", "Copied!");
let tooltip;
if (window.bootstrap) {
button.setAttribute("data-bs-toggle", "tooltip");
button.setAttribute("data-bs-placement", "left");
button.setAttribute("data-bs-title", "Copied!");
tooltip = new bootstrap.Tooltip(button,
{ trigger: "manual",
customClass: "code-copy-button-tooltip",
offset: [0, -8]});
tooltip.show();
}
setTimeout(function() {
if (tooltip) {
tooltip.hide();
button.removeAttribute("data-bs-title");
button.removeAttribute("data-bs-toggle");
button.removeAttribute("data-bs-placement");
}
button.setAttribute("title", currentTitle);
button.classList.remove('code-copy-button-checked');
}, 1000);
// clear code selection
e.clearSelection();
});
function tippyHover(el, contentFn) {
const config = {
allowHTML: true,
content: contentFn,
maxWidth: 500,
delay: 100,
arrow: false,
appendTo: function(el) {
return el.parentElement;
},
interactive: true,
interactiveBorder: 10,
theme: 'quarto',
placement: 'bottom-start'
};
window.tippy(el, config);
}
const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
for (var i=0; i<noterefs.length; i++) {
const ref = noterefs[i];
tippyHover(ref, function() {
// use id or data attribute instead here
let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
try { href = new URL(href).hash; } catch {}
const id = href.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
return note.innerHTML;
});
}
const findCites = (el) => {
const parentEl = el.parentElement;
if (parentEl) {
const cites = parentEl.dataset.cites;
if (cites) {
return {
el,
cites: cites.split(' ')
};
} else {
return findCites(el.parentElement)
}
} else {
return undefined;
}
};
var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
for (var i=0; i<bibliorefs.length; i++) {
const ref = bibliorefs[i];
const citeInfo = findCites(ref);
if (citeInfo) {
tippyHover(citeInfo.el, function() {
var popup = window.document.createElement('div');
citeInfo.cites.forEach(function(cite) {
var citeDiv = window.document.createElement('div');
citeDiv.classList.add('hanging-indent');
citeDiv.classList.add('csl-entry');
var biblioDiv = window.document.getElementById('ref-' + cite);
if (biblioDiv) {
citeDiv.innerHTML = biblioDiv.innerHTML;
}
popup.appendChild(citeDiv);
});
return popup.innerHTML;
});
}
}
});
</script>
<nav class="page-navigation">
<div class="nav-page nav-page-previous">
<a href="./intro.html" class="pagination-link">
<i class="bi bi-arrow-left-short"></i> <span class="nav-page-text"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">The OBITools</span></span>
</a>
</div>
<div class="nav-page nav-page-next">
<a href="./library.html" class="pagination-link">
<span class="nav-page-text"><span class="chapter-number">3</span>&nbsp; <span class="chapter-title">The GO <em>OBITools</em> library</span></span> <i class="bi bi-arrow-right-short"></i>
</a>
</div>
</nav>
</div> <!-- /content -->
</body></html>


@ -1,461 +1,326 @@
<!DOCTYPE html>
<html lang="" xml:lang="">
<head>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<title>The GO OBITools</title>
<meta name="description" content="Description of the principles used into the GO implementation of OBITools." />
<meta name="generator" content="bookdown 0.29 and GitBook 2.6.7" />
<meta charset="utf-8">
<meta name="generator" content="quarto-1.2.256">
<meta property="og:title" content="The GO OBITools" />
<meta property="og:type" content="book" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<meta property="og:description" content="Description of the principles used into the GO implementation of OBITools." />
<meta name="github-repo" content="seankross/bookdown-start" />
<meta name="author" content="Eric Coissac">
<meta name="dcterms.date" content="2023-01-17">
<meta name="twitter:card" content="summary" />
<meta name="twitter:title" content="The GO OBITools" />
<meta name="twitter:description" content="Description of the principles used into the GO implementation of OBITools." />
<meta name="author" content="SEric Coissac" />
<meta name="date" content="2022-08-25" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
<link rel="next" href="the-obitools-commands.html"/>
<script src="book_assets/jquery-3.6.0/jquery-3.6.0.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/fuse.js@6.4.6/dist/fuse.min.js"></script>
<link href="book_assets/gitbook-2.6.7/css/style.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-table.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-bookdown.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-highlight.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-search.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-fontsettings.css" rel="stylesheet" />
<link href="book_assets/gitbook-2.6.7/css/plugin-clipboard.css" rel="stylesheet" />
<link href="book_assets/anchor-sections-1.1.0/anchor-sections.css" rel="stylesheet" />
<link href="book_assets/anchor-sections-1.1.0/anchor-sections-hash.css" rel="stylesheet" />
<script src="book_assets/anchor-sections-1.1.0/anchor-sections.js"></script>
<style type="text/css">
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
<title>OBITools V4</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1.6em;
vertical-align: middle;
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<script src="site_libs/quarto-nav/quarto-nav.js"></script>
<script src="site_libs/quarto-nav/headroom.min.js"></script>
<script src="site_libs/clipboard/clipboard.min.js"></script>
<script src="site_libs/quarto-search/autocomplete.umd.js"></script>
<script src="site_libs/quarto-search/fuse.min.js"></script>
<script src="site_libs/quarto-search/quarto-search.js"></script>
<meta name="quarto:offset" content="./">
<link href="./intro.html" rel="next">
<script src="site_libs/quarto-html/quarto.js"></script>
<script src="site_libs/quarto-html/popper.min.js"></script>
<script src="site_libs/quarto-html/tippy.umd.min.js"></script>
<script src="site_libs/quarto-html/anchor.min.js"></script>
<link href="site_libs/quarto-html/tippy.css" rel="stylesheet">
<link href="site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="site_libs/bootstrap/bootstrap.min.js"></script>
<link href="site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="site_libs/bootstrap/bootstrap.min.css" rel="stylesheet" id="quarto-bootstrap" data-mode="light">
<script id="quarto-search-options" type="application/json">{
"location": "sidebar",
"copy-button": false,
"collapse-after": 3,
"panel-placement": "start",
"type": "textbox",
"limit": 20,
"language": {
"search-no-results-text": "No results",
"search-matching-documents-text": "matching documents",
"search-copy-link-title": "Copy link to search",
"search-hide-matches-text": "Hide additional matches",
"search-more-match-text": "more match in this document",
"search-more-matches-text": "more matches in this document",
"search-clear-button-title": "Clear",
"search-detached-cancel-button-title": "Cancel",
"search-submit-button-title": "Submit"
}
}</script>
</head>
<body>
<div class="book without-animation with-summary font-size-2 font-family-1" data-basepath=".">
<div class="book-summary">
<nav role="navigation">
<ul class="summary">
<li class="chapter" data-level="1" data-path="index.html"><a href="index.html"><i class="fa fa-check"></i><b>1</b> The OBITools</a>
<ul>
<li class="chapter" data-level="1.1" data-path="index.html"><a href="index.html#aims-of-obitools"><i class="fa fa-check"></i><b>1.1</b> Aims of <em>OBITools</em></a></li>
<li class="chapter" data-level="1.2" data-path="index.html"><a href="index.html#file-formats-usable-with-obitools"><i class="fa fa-check"></i><b>1.2</b> File formats usable with <em>OBITools</em></a>
<ul>
<li class="chapter" data-level="1.2.1" data-path="index.html"><a href="index.html#the-sequence-files"><i class="fa fa-check"></i><b>1.2.1</b> The sequence files</a></li>
<li class="chapter" data-level="1.2.2" data-path="index.html"><a href="index.html#classical-fasta"><i class="fa fa-check"></i><b>1.2.2</b> The <em>fasta</em> format</a></li>
<li class="chapter" data-level="1.2.3" data-path="index.html"><a href="index.html#the-fastq-sequence-format"><i class="fa fa-check"></i><b>1.2.3</b> The <em>fastq</em> sequence format</a></li>
</ul></li>
<li class="chapter" data-level="1.3" data-path="index.html"><a href="index.html#file-extension"><i class="fa fa-check"></i><b>1.3</b> File extension</a></li>
<li class="chapter" data-level="1.4" data-path="index.html"><a href="index.html#see-also"><i class="fa fa-check"></i><b>1.4</b> See also</a></li>
<li class="chapter" data-level="1.5" data-path="index.html"><a href="index.html#references"><i class="fa fa-check"></i><b>1.5</b> References</a></li>
</ul></li>
<li class="chapter" data-level="2" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html"><i class="fa fa-check"></i><b>2</b> The OBITools commands</a>
<ul>
<li class="chapter" data-level="2.1" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#metabarcode-design-and-quality-assessment"><i class="fa fa-check"></i><b>2.1</b> Metabarcode design and quality assessment</a></li>
<li class="chapter" data-level="2.2" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#file-format-conversions"><i class="fa fa-check"></i><b>2.2</b> File format conversions</a></li>
<li class="chapter" data-level="2.3" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#sequence-annotations"><i class="fa fa-check"></i><b>2.3</b> Sequence annotations</a></li>
<li class="chapter" data-level="2.4" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#computations-on-sequences"><i class="fa fa-check"></i><b>2.4</b> Computations on sequences</a>
<ul>
<li class="chapter" data-level="2.4.1" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#obipairing"><i class="fa fa-check"></i><b>2.4.1</b> <code>obipairing</code></a></li>
</ul></li>
<li class="chapter" data-level="2.5" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#sequence-sampling-and-filtering"><i class="fa fa-check"></i><b>2.5</b> Sequence sampling and filtering</a>
<ul>
<li class="chapter" data-level="2.5.1" data-path="the-obitools-commands.html"><a href="the-obitools-commands.html#utilities"><i class="fa fa-check"></i><b>2.5.1</b> Utilities</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="3" data-path="reference-documentation-for-the-go-obitools-library.html"><a href="reference-documentation-for-the-go-obitools-library.html"><i class="fa fa-check"></i><b>3</b> Reference documentation for the GO <em>OBITools</em> library</a>
<ul>
<li class="chapter" data-level="3.1" data-path="reference-documentation-for-the-go-obitools-library.html"><a href="reference-documentation-for-the-go-obitools-library.html#biosequence"><i class="fa fa-check"></i><b>3.1</b> BioSequence</a>
<ul>
<li class="chapter" data-level="3.1.1" data-path="reference-documentation-for-the-go-obitools-library.html"><a href="reference-documentation-for-the-go-obitools-library.html#creating-new-instances"><i class="fa fa-check"></i><b>3.1.1</b> Creating new instances</a></li>
<li class="chapter" data-level="3.1.2" data-path="reference-documentation-for-the-go-obitools-library.html"><a href="reference-documentation-for-the-go-obitools-library.html#end-of-life-of-a-biosequence-instance"><i class="fa fa-check"></i><b>3.1.2</b> End of life of a <code>BioSequence</code> instance</a></li>
<li class="chapter" data-level="3.1.3" data-path="reference-documentation-for-the-go-obitools-library.html"><a href="reference-documentation-for-the-go-obitools-library.html#accessing-to-the-elements-of-a-sequence"><i class="fa fa-check"></i><b>3.1.3</b> Accessing to the elements of a sequence</a></li>
<li class="chapter" data-level="3.1.4" data-path="reference-documentation-for-the-go-obitools-library.html"><a href="reference-documentation-for-the-go-obitools-library.html#sequence-attributes"><i class="fa fa-check"></i><b>3.1.4</b> Sequence attributes</a></li>
</ul></li>
</ul></li>
</ul>
<body class="nav-sidebar floating">
<div id="quarto-search-results"></div>
<header id="quarto-header" class="headroom fixed-top">
<nav class="quarto-secondary-nav" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<div class="container-fluid d-flex justify-content-between">
<h1 class="quarto-secondary-nav-title">OBITools V4</h1>
<button type="button" class="quarto-btn-toggle btn" aria-label="Show secondary navigation">
<i class="bi bi-chevron-right"></i>
</button>
</div>
</nav>
</header>
<!-- content -->
<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article">
<!-- sidebar -->
<nav id="quarto-sidebar" class="sidebar collapse sidebar-navigation floating overflow-auto">
<div class="pt-lg-2 mt-2 text-left sidebar-header">
<div class="sidebar-title mb-0 py-0">
<a href="./">OBITools V4</a>
</div>
<div class="book-body">
<div class="body-inner">
<div class="book-header" role="navigation">
<h1>
<i class="fa fa-circle-o-notch fa-spin"></i><a href="./">The GO <em>OBITools</em></a>
</h1>
</div>
<div class="page-wrapper" tabindex="-1" role="main">
<div class="page-inner">
<section class="normal" id="section-">
<div id="header">
<h1 class="title">The GO <em>OBITools</em></h1>
<p class="author"><em>Eric Coissac</em></p>
<p class="date"><em>2022-08-25</em></p>
</div>
<div id="the-obitools" class="section level1 hasAnchor" number="1">
<h1><span class="header-section-number">1</span> The OBITools<a href="index.html#the-obitools" class="anchor-section" aria-label="Anchor link to header"></a></h1>
<div id="aims-of-obitools" class="section level2 hasAnchor" number="1.1">
<h2><span class="header-section-number">1.1</span> Aims of <em>OBITools</em><a href="index.html#aims-of-obitools" class="anchor-section" aria-label="Anchor link to header"></a></h2>
</div>
<div id="file-formats-usable-with-obitools" class="section level2 hasAnchor" number="1.2">
<h2><span class="header-section-number">1.2</span> File formats usable with <em>OBITools</em><a href="index.html#file-formats-usable-with-obitools" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<div id="the-sequence-files" class="section level3 hasAnchor" number="1.2.1">
<h3><span class="header-section-number">1.2.1</span> The sequence files<a href="index.html#the-sequence-files" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>Sequences can be stored in various formats, some of which OBITools
can read. The central formats for sequence files manipulated by OBITools
scripts are the <code>fasta</code> and <code>fastq</code> formats. OBITools extends both of
these formats by specifying a syntax for including, in the definition line,
data qualifying the sequence. All file formats use the <code>IUPAC</code> code for
encoding nucleotides.</p>
</div>
<div id="classical-fasta" class="section level3 hasAnchor" number="1.2.2">
<h3><span class="header-section-number">1.2.2</span> The <em>fasta</em> format<a href="index.html#classical-fasta" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>The <strong>fasta format</strong> is certainly the most widely used sequence file
format, probably because of its great simplicity. It was originally
created for the Lipman and Pearson <a href="http://www.ncbi.nlm.nih.gov/pubmed/3162770?dopt=Citation">FASTA
program</a>.
In addition to the classical <code>fasta</code> format, OBITools use an
<code>extended version</code> of this format, in which structured data are
included in the title line.</p>
<p>In <em>fasta</em> format a sequence is represented by a title line beginning
with a <strong>&gt;</strong> character, followed by the sequence itself written in the
<code>iupac</code> code. The sequence is usually split over several lines of
the same length (except for the last one):</p>
<pre><code>&gt;my_sequence this is my pretty sequence
ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
AACGACGTTGCAGTACGTTGCAGT</code></pre>
<p>There is no special format for the title line, except that this line
should be unique. Usually the first word following the <strong>&gt;</strong> character
is considered as the sequence identifier. The rest of the title line
corresponds to a description of the sequence. Several sequences can be
concatenated in the same file. The title line of the next sequence
simply follows the end of the record of the previous one:</p>
<pre><code>&gt;sequence_A this is my first pretty sequence
ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
AACGACGTTGCAGTACGTTGCAGT
&gt;sequence_B this is my second pretty sequence
ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
AACGACGTTGCAGTACGTTGCAGT
&gt;sequence_C this is my third pretty sequence
ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
AACGACGTTGCAGTACGTTGCAGT</code></pre>
</div>
<div id="the-fastq-sequence-format" class="section level3 hasAnchor" number="1.2.3">
<h3><span class="header-section-number">1.2.3</span> The <em>fastq</em> sequence format<a href="index.html#the-fastq-sequence-format" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>Note: this article uses material from the Wikipedia article
FASTQ format, which is released under the
Creative Commons Attribution-Share-Alike License 3.0.</p>
<p>The <strong>fastq format</strong> is a text-based format for storing both a biological
sequence (usually a nucleotide sequence) and its corresponding quality
scores. Both the sequence letter and the quality score are encoded with a
single ASCII character for brevity. It was originally developed at the
<code>Wellcome Trust Sanger Institute</code> to bundle a <a href="#genuine-fasta">fasta</a>
sequence and its quality data, but has become the <em>de facto</em>
standard for storing the output of high throughput sequencing
instruments such as the Illumina Genome Analyzer [1].</p>
<div id="format" class="section level4 hasAnchor" number="1.2.3.1">
<h4><span class="header-section-number">1.2.3.1</span> Format<a href="index.html#format" class="anchor-section" aria-label="Anchor link to header"></a></h4>
<p>A fastq file normally uses four lines per sequence.</p>
<ul>
<li>Line 1 begins with a @ character and is followed by a sequence
identifier and an <em>optional</em> description (like a <code>fasta</code> title
line).</li>
<li>Line 2 is the raw sequence letters.</li>
<li>Line 3 begins with a + character and is <em>optionally</em> followed by
the same sequence identifier (and any description) again.</li>
<li>Line 4 encodes the quality values for the sequence in Line 2, and
must contain the same number of symbols as letters in the sequence.</li>
</ul>
<p>A fastq file containing a single sequence might look like this:</p>
<pre><code>@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!&#39;&#39;*((((***+))%%%++)(%%%%).1***-+*&#39;&#39;))**55CCF&gt;&gt;&gt;&gt;&gt;&gt;CCCCCCC65</code></pre>
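<p>As an illustration only, the four-line layout described above can be
read with a few lines of Python. This sketch is not the OBITools
implementation: the names <code>FastqRecord</code> and <code>parse_fastq</code> are
hypothetical, the short record used at the end is invented for the
example, and the code assumes unwrapped records (exactly four lines each).</p>
<pre><code>from typing import NamedTuple


class FastqRecord(NamedTuple):
    identifier: str
    description: str
    sequence: str
    quality: str


def parse_fastq(text):
    """Yield one FastqRecord per group of four lines (unwrapped fastq only)."""
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per record")
    for start in range(0, len(lines), 4):
        title, sequence, separator, quality = lines[start:start + 4]
        if not title.startswith("@"):
            raise ValueError(f"title line must begin with '@': {title!r}")
        if not separator.startswith("+"):
            raise ValueError(f"third line must begin with '+': {separator!r}")
        # Line 4 must hold exactly one quality symbol per sequence letter.
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        # The first word after '@' is the identifier, the rest the description.
        identifier, _, description = title[1:].partition(" ")
        yield FastqRecord(identifier, description, sequence, quality)


# Invented ten-base record, used only to exercise the function.
EXAMPLE = "@SEQ_ID an optional description\nGATTTGGGGT\n+\n!''*((((**\n"
for record in parse_fastq(EXAMPLE):
    print(record.identifier, len(record.sequence), len(record.quality))</code></pre>
<p>Checking the line count and the two marker characters up front makes
the sketch fail loudly on wrapped or truncated files instead of silently
shifting records.</p>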
<p>The character ! represents the lowest quality while ~ is the
highest. Here are the quality value characters in left-to-right
increasing order of quality (<code>ASCII</code>):</p>
<pre><code>!&quot;#$%&amp;&#39;()*+,-./0123456789:;&lt;=&gt;?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~</code></pre>
<p>The original Sanger FASTQ files also allowed the sequence and quality
strings to be wrapped (split over multiple lines), but this is generally
discouraged as it can make parsing complicated due to the unfortunate
choice of “@” and “+” as markers (these characters can also occur in
the quality string).</p>
</div>
<div id="variations" class="section level4 hasAnchor" number="1.2.3.2">
<h4><span class="header-section-number">1.2.3.2</span> Variations<a href="index.html#variations" class="anchor-section" aria-label="Anchor link to header"></a></h4>
<div id="quality" class="section level5 hasAnchor" number="1.2.3.2.1">
<h5><span class="header-section-number">1.2.3.2.1</span> Quality<a href="index.html#quality" class="anchor-section" aria-label="Anchor link to header"></a></h5>
<p>A quality value <em>Q</em> is an integer mapping of <em>p</em> (i.e., the probability
that the corresponding base call is incorrect). Two different equations
have been in use. The first is the standard Sanger variant to assess
reliability of a base call, otherwise known as Phred quality score:</p>
<p><span class="math display">\[
Q_\text{sanger} = -10 \, \log_{10} p
\]</span></p>
<p>The Solexa pipeline (i.e., the software delivered with the Illumina
Genome Analyzer) earlier used a different mapping, encoding the odds
<span class="math inline">\(\mathbf{p}/(1-\mathbf{p})\)</span> instead of the probability <span class="math inline">\(\mathbf{p}\)</span>:</p>
<p><span class="math display">\[
Q_\text{solexa-prior to v.1.3} = -10 \, \log_{10} \frac{p}{1-p}
\]</span></p>
<p>Although both mappings are asymptotically identical at higher quality
values, they differ at lower quality levels (i.e., approximately
<span class="math inline">\(\mathbf{p} &gt; 0.05\)</span>, or equivalently, <span class="math inline">\(\mathbf{Q} &lt; 13\)</span>).</p>
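<p>The two mappings can be compared numerically with a short Python
sketch. The function names <code>sanger_quality</code> and <code>solexa_quality</code>
are hypothetical, chosen for this example; the formulas are the two
equations given above.</p>
<pre><code>import math


def sanger_quality(p):
    """Phred score: Q = -10 * log10(p)."""
    return -10 * math.log10(p)


def solexa_quality(p):
    """Solexa score (prior to v.1.3): Q = -10 * log10(p / (1 - p))."""
    return -10 * math.log10(p / (1 - p))


# The gap between the two scores shrinks as the error probability decreases.
for p in (0.2, 0.05, 0.001):
    gap = sanger_quality(p) - solexa_quality(p)
    print(f"p={p}: sanger={sanger_quality(p):.2f} "
          f"solexa={solexa_quality(p):.2f} gap={gap:.2f}")</code></pre>
<p>The printed gap equals -10 log10(1 - p), which tends to zero as
<em>p</em> becomes small; this is why the two scales agree at high quality
values and diverge at low ones.</p>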
<p>[Figure: relationship between <em>Q</em> and <em>p</em> using the Sanger (red) and Solexa
(black) equations (described above). The vertical dotted line indicates
<span class="math inline">\(\mathbf{p}= 0.05\)</span>, or equivalently, <span class="math inline">\(Q \approx 13\)</span>.]</p>
</div>
</div>
<div id="encoding" class="section level4 hasAnchor" number="1.2.3.3">
<h4><span class="header-section-number">1.2.3.3</span> Encoding<a href="index.html#encoding" class="anchor-section" aria-label="Anchor link to header"></a></h4>
<ul>
<li>Sanger format can encode a Phred quality score from 0 to 93 using
ASCII 33 to 126 (although in raw read data the Phred quality score
rarely exceeds 60, higher scores are possible in assemblies or read
maps).</li>
<li>Solexa/Illumina 1.0 format can encode a Solexa/Illumina quality
score from -5 to 62 using ASCII 59 to 126 (although in raw read data
Solexa scores from -5 to 40 only are expected)</li>
<li>Starting with Illumina 1.3 and before Illumina 1.8, the format
encoded a Phred quality score from 0 to 62 using ASCII 64 to 126
(although in raw read data Phred scores from 0 to 40 only are
expected).</li>
<li>Starting in Illumina 1.5 and before Illumina 1.8, the Phred scores 0
to 2 have a slightly different meaning. The values 0 and 1 are no
longer used, and the value 2, encoded by ASCII 66 “B”, is used at the
end of reads as a read segment quality control indicator.</li>
</ul>
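<p>The ASCII offsets listed above can be turned into a small decoding
helper. This is a minimal Python sketch, not part of OBITools: the
function <code>decode_quality</code> and the variant labels in <code>OFFSETS</code> are
names assumed for the example.</p>
<pre><code># ASCII offsets taken from the list above; the variant names are labels
# chosen for this sketch.
OFFSETS = {
    "sanger": 33,        # Phred 0..93 (ASCII 33..126)
    "solexa": 64,        # Solexa scores -5..62 (ASCII 59..126)
    "illumina-1.3": 64,  # Phred 0..62 (ASCII 64..126)
}


def decode_quality(quality, variant="sanger"):
    """Return the integer score encoded by each character of a quality string."""
    offset = OFFSETS[variant]
    return [ord(char) - offset for char in quality]


print(decode_quality("!5I~"))                 # [0, 20, 40, 93]
print(decode_quality("@Bh", "illumina-1.3"))  # [0, 2, 40]</code></pre>
<p>The same character therefore decodes to different scores depending on
the variant, so the encoding of a file has to be known (or guessed from
the range of characters it contains) before its qualities are used.</p>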
<p>The manual <a href="http://watson.nci.nih.gov/solexa/Using_SCSv2.6_15009921_A.pdf" class="uri">Sequencing Control Software, Version 2.6, Catalog # SY-960-2601, Part
# 15009921 Rev. A, November 2009</a>
(page 30) states the following: <em>If a read ends with a segment of mostly
low quality (Q15 or below), then all of the quality values in the
segment are replaced with a value of 2 (encoded as the letter B in
Illumina’s text-based encoding of quality scores)… This Q2 indicator
does not predict a specific error rate, but rather indicates that a
specific final portion of the read should not be used in further
analyses.</em> Also, the quality score encoded as the letter “B” may occur
internally within reads at least as late as pipeline version 1.6, as
shown in the following example:</p>
<pre><code>@HWI-EAS209_0006_FC706VJ:5:58:5894:21141#ATCACG/1
TTAATTGGTAAATAAATCTCCTAATAGCTTAGATNTTACCTTNNNNNNNNNNTAGTTTCTTGAGATTTGTTGGGGGAGACATTTTTGTGATTGCCTTGAT
+HWI-EAS209_0006_FC706VJ:5:58:5894:21141#ATCACG/1
efcfffffcfeefffcffffffddf`feed]`]_Ba_^__[YBBBBBBBBBBRTT\]][]dddd`ddd^dddadd^BBBBBBBBBBBBBBBBBBBBBBBB</code></pre>
<p>An alternative interpretation of this ASCII encoding has been proposed.
Also, in Illumina runs using PhiX controls, the character B was
observed to represent an “unknown quality score”. The error rate of B
reads was roughly 3 Phred scores lower than the mean observed score of a
given run.</p>
<ul>
<li>Starting in Illumina 1.8, the quality scores have basically returned
to the use of the Sanger format (Phred+33).</li>
</ul>
</div>
</div>
</div>
<div id="file-extension" class="section level2 hasAnchor" number="1.3">
<h2><span class="header-section-number">1.3</span> File extension<a href="index.html#file-extension" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p>There is no standard file extension for a FASTQ file, but .fq and
.fastq are commonly used.</p>
</div>
<div id="see-also" class="section level2 hasAnchor" number="1.4">
<h2><span class="header-section-number">1.4</span> See also<a href="index.html#see-also" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<ul>
<li><code>fasta</code></li>
</ul>
</div>
<div id="references" class="section level2 hasAnchor" number="1.5">
<h2><span class="header-section-number">1.5</span> References<a href="index.html#references" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p>[1] Cock et al. (2009) The Sanger FASTQ file format for sequences with
quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids
Research,</p>
<p>[2] Illumina Quality Scores, Tobias Mann, Bioinformatics, San Diego,
Illumina</p>
<p>See <a href="http://en.wikipedia.org/wiki/FASTQ_format" class="uri">http://en.wikipedia.org/wiki/FASTQ_format</a></p>
</div>
</div>
</section>
<div class="mt-2 flex-shrink-0 align-items-center">
<div class="sidebar-search">
<div id="quarto-search" class="" title="Search"></div>
</div>
</div>
<div class="sidebar-menu-container">
<ul class="list-unstyled mt-1">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./index.html" class="sidebar-item-text sidebar-link active">Preface</a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./intro.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">The OBITools</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./commands.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">The <em>OBITools</em> commands</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./library.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">3</span>&nbsp; <span class="chapter-title">The GO <em>OBITools</em> library</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./annexes.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Annexes</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./references.html" class="sidebar-item-text sidebar-link">References</a>
</div>
</li>
</ul>
</div>
</nav>
<!-- margin-sidebar -->
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
<nav id="TOC" role="doc-toc" class="toc-active">
<h2 id="toc-title">Table of contents</h2>
<ul>
<li><a href="#preface" id="toc-preface" class="nav-link active" data-scroll-target="#preface">Preface</a></li>
</ul>
</nav>
</div>
<!-- main -->
<main class="content" id="quarto-document-content">
<header id="title-block-header" class="quarto-title-block default">
<div class="quarto-title">
<h1 class="title d-none d-lg-block">OBITools V4</h1>
</div>
<div class="quarto-title-meta">
<div>
<div class="quarto-title-meta-heading">Author</div>
<div class="quarto-title-meta-contents">
<p>Eric Coissac </p>
</div>
</div>
<a href="the-obitools-commands.html" class="navigation navigation-next navigation-unique" aria-label="Next page"><i class="fa fa-angle-right"></i></a>
<div>
<div class="quarto-title-meta-heading">Published</div>
<div class="quarto-title-meta-contents">
<p class="date">January 17, 2023</p>
</div>
</div>
<script src="book_assets/gitbook-2.6.7/js/app.min.js"></script>
<script src="book_assets/gitbook-2.6.7/js/clipboard.min.js"></script>
<script src="book_assets/gitbook-2.6.7/js/plugin-search.js"></script>
<script src="book_assets/gitbook-2.6.7/js/plugin-sharing.js"></script>
<script src="book_assets/gitbook-2.6.7/js/plugin-fontsettings.js"></script>
<script src="book_assets/gitbook-2.6.7/js/plugin-bookdown.js"></script>
<script src="book_assets/gitbook-2.6.7/js/jquery.highlight.js"></script>
<script src="book_assets/gitbook-2.6.7/js/plugin-clipboard.js"></script>
<script>
gitbook.require(["gitbook"], function(gitbook) {
gitbook.start({
"sharing": {
"github": false,
"facebook": true,
"twitter": true,
"linkedin": false,
"weibo": false,
"instapaper": false,
"vk": false,
"whatsapp": false,
"all": ["facebook", "twitter", "linkedin", "weibo", "instapaper"]
},
"fontsettings": {
"theme": "white",
"family": "sans",
"size": 2
},
"edit": {
"link": null,
"text": null
},
"history": {
"link": null,
"text": null
},
"view": {
"link": null,
"text": null
},
"download": ["_main.pdf"],
"search": {
"engine": "fuse",
"options": null
},
"toc": {
"collapse": "subsection"
}
});
</div>
</header>
<section id="preface" class="level1 unnumbered">
<h1 class="unnumbered">Preface</h1>
<p>The first version of <em>OBITools</em> started to be developed in 2005, at the beginning of the DNA metabarcoding story at the Laboratoire d'Ecologie Alpine (LECA) in Grenoble. At that time, with Pierre Taberlet and François Pompanon, we were thinking about the potential of this new methodology under development. Pierre and François focused more on the laboratory methods, while I was thinking more about the tools for analysing the sequences produced.</p>
<p>Two ideas were behind this development: I wanted something modular, and something easy to extend. To achieve the first goal, I decided to implement <em>OBITools</em> as a suite of Unix commands mimicking the classic Unix commands but dedicated to sequence files. The basic Unix commands are very useful for automatically manipulating, parsing and editing text files. They work as a stream, line by line on the input text, and the result is a new text file that can be used as input for the next command. Such a design makes it possible to quickly develop a text processing pipeline by chaining simple elementary operations. The <em>OBITools</em> are the exact counterpart of these basic Unix commands, but the basic unit of information they process is a sequence (potentially spanning several lines of text), not a single line of text. Most <em>OBITools</em> consume sequence files and produce sequence files. Thus, the principles of chaining and modularity are respected.</p>
<p>To make it easy to extend the <em>OBITools</em> and keep up with our evolving ideas about processing DNA metabarcoding data, it was decided to develop them using an interpreted language: Python. Python 2, the version available at the time, allowed us to develop the <em>OBITools</em> efficiently. When parts of the algorithms were computationally demanding, they were implemented in C and linked to the Python code. Even though Python is not the most efficient language available, and even though computers were not as powerful as they are today, the data sets we could produce using 454 sequencers or early Solexa machines were small enough to be processed in a reasonable time.</p>
</section>
</main> <!-- /main -->
<script id="quarto-html-after-body" type="application/javascript">
window.document.addEventListener("DOMContentLoaded", function (event) {
const toggleBodyColorMode = (bsSheetEl) => {
const mode = bsSheetEl.getAttribute("data-mode");
const bodyEl = window.document.querySelector("body");
if (mode === "dark") {
bodyEl.classList.add("quarto-dark");
bodyEl.classList.remove("quarto-light");
} else {
bodyEl.classList.add("quarto-light");
bodyEl.classList.remove("quarto-dark");
}
}
const toggleBodyColorPrimary = () => {
const bsSheetEl = window.document.querySelector("link#quarto-bootstrap");
if (bsSheetEl) {
toggleBodyColorMode(bsSheetEl);
}
}
toggleBodyColorPrimary();
const icon = "";
const anchorJS = new window.AnchorJS();
anchorJS.options = {
placement: 'right',
icon: icon
};
anchorJS.add('.anchored');
const clipboard = new window.ClipboardJS('.code-copy-button', {
target: function(trigger) {
return trigger.previousElementSibling;
}
});
clipboard.on('success', function(e) {
// button target
const button = e.trigger;
// don't keep focus
button.blur();
// flash "checked"
button.classList.add('code-copy-button-checked');
var currentTitle = button.getAttribute("title");
button.setAttribute("title", "Copied!");
let tooltip;
if (window.bootstrap) {
button.setAttribute("data-bs-toggle", "tooltip");
button.setAttribute("data-bs-placement", "left");
button.setAttribute("data-bs-title", "Copied!");
tooltip = new bootstrap.Tooltip(button,
{ trigger: "manual",
customClass: "code-copy-button-tooltip",
offset: [0, -8]});
tooltip.show();
}
setTimeout(function() {
if (tooltip) {
tooltip.hide();
button.removeAttribute("data-bs-title");
button.removeAttribute("data-bs-toggle");
button.removeAttribute("data-bs-placement");
}
button.setAttribute("title", currentTitle);
button.classList.remove('code-copy-button-checked');
}, 1000);
// clear code selection
e.clearSelection();
});
function tippyHover(el, contentFn) {
const config = {
allowHTML: true,
content: contentFn,
maxWidth: 500,
delay: 100,
arrow: false,
appendTo: function(el) {
return el.parentElement;
},
interactive: true,
interactiveBorder: 10,
theme: 'quarto',
placement: 'bottom-start'
};
window.tippy(el, config);
}
const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
for (var i=0; i<noterefs.length; i++) {
const ref = noterefs[i];
tippyHover(ref, function() {
// use id or data attribute instead here
let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
try { href = new URL(href).hash; } catch {}
const id = href.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
return note.innerHTML;
});
}
const findCites = (el) => {
const parentEl = el.parentElement;
if (parentEl) {
const cites = parentEl.dataset.cites;
if (cites) {
return {
el,
cites: cites.split(' ')
};
} else {
return findCites(el.parentElement)
}
} else {
return undefined;
}
};
var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
for (var i=0; i<bibliorefs.length; i++) {
const ref = bibliorefs[i];
const citeInfo = findCites(ref);
if (citeInfo) {
tippyHover(citeInfo.el, function() {
var popup = window.document.createElement('div');
citeInfo.cites.forEach(function(cite) {
var citeDiv = window.document.createElement('div');
citeDiv.classList.add('hanging-indent');
citeDiv.classList.add('csl-entry');
var biblioDiv = window.document.getElementById('ref-' + cite);
if (biblioDiv) {
citeDiv.innerHTML = biblioDiv.innerHTML;
}
popup.appendChild(citeDiv);
});
return popup.innerHTML;
});
}
}
});
</script>
<nav class="page-navigation">
<div class="nav-page nav-page-previous">
</div>
<div class="nav-page nav-page-next">
<a href="./intro.html" class="pagination-link">
<span class="nav-page-text"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">The OBITools</span></span> <i class="bi bi-arrow-right-short"></i>
</a>
</div>
</nav>
</div> <!-- /content -->
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
var src = "true";
if (src === "" || src === "true") src = "https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.9/latest.js?config=TeX-MML-AM_CHTML";
if (location.protocol !== "file:")
if (/^https?:/.test(src))
src = src.replace(/^https?:/, '');
script.src = src;
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>
</body></html>

536
doc/_book/intro.html Normal file
View File

@ -0,0 +1,536 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
<meta charset="utf-8">
<meta name="generator" content="quarto-1.2.256">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<title>OBITools V4 - 1&nbsp; The OBITools</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1.6em;
vertical-align: middle;
}
div.csl-bib-body { }
div.csl-entry {
clear: both;
}
.hanging div.csl-entry {
margin-left:2em;
text-indent:-2em;
}
div.csl-left-margin {
min-width:2em;
float:left;
}
div.csl-right-inline {
margin-left:2em;
padding-left:1em;
}
div.csl-indent {
margin-left: 2em;
}
</style>
<script src="site_libs/quarto-nav/quarto-nav.js"></script>
<script src="site_libs/quarto-nav/headroom.min.js"></script>
<script src="site_libs/clipboard/clipboard.min.js"></script>
<script src="site_libs/quarto-search/autocomplete.umd.js"></script>
<script src="site_libs/quarto-search/fuse.min.js"></script>
<script src="site_libs/quarto-search/quarto-search.js"></script>
<meta name="quarto:offset" content="./">
<link href="./commands.html" rel="next">
<link href="./index.html" rel="prev">
<script src="site_libs/quarto-html/quarto.js"></script>
<script src="site_libs/quarto-html/popper.min.js"></script>
<script src="site_libs/quarto-html/tippy.umd.min.js"></script>
<script src="site_libs/quarto-html/anchor.min.js"></script>
<link href="site_libs/quarto-html/tippy.css" rel="stylesheet">
<link href="site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="site_libs/bootstrap/bootstrap.min.js"></script>
<link href="site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="site_libs/bootstrap/bootstrap.min.css" rel="stylesheet" id="quarto-bootstrap" data-mode="light">
<script id="quarto-search-options" type="application/json">{
"location": "sidebar",
"copy-button": false,
"collapse-after": 3,
"panel-placement": "start",
"type": "textbox",
"limit": 20,
"language": {
"search-no-results-text": "No results",
"search-matching-documents-text": "matching documents",
"search-copy-link-title": "Copy link to search",
"search-hide-matches-text": "Hide additional matches",
"search-more-match-text": "more match in this document",
"search-more-matches-text": "more matches in this document",
"search-clear-button-title": "Clear",
"search-detached-cancel-button-title": "Cancel",
"search-submit-button-title": "Submit"
}
}</script>
<script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml-full.js" type="text/javascript"></script>
</head>
<body class="nav-sidebar floating">
<div id="quarto-search-results"></div>
<header id="quarto-header" class="headroom fixed-top">
<nav class="quarto-secondary-nav" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<div class="container-fluid d-flex justify-content-between">
<h1 class="quarto-secondary-nav-title"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">The OBITools</span></h1>
<button type="button" class="quarto-btn-toggle btn" aria-label="Show secondary navigation">
<i class="bi bi-chevron-right"></i>
</button>
</div>
</nav>
</header>
<!-- content -->
<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article">
<!-- sidebar -->
<nav id="quarto-sidebar" class="sidebar collapse sidebar-navigation floating overflow-auto">
<div class="pt-lg-2 mt-2 text-left sidebar-header">
<div class="sidebar-title mb-0 py-0">
<a href="./">OBITools V4</a>
</div>
</div>
<div class="mt-2 flex-shrink-0 align-items-center">
<div class="sidebar-search">
<div id="quarto-search" class="" title="Search"></div>
</div>
</div>
<div class="sidebar-menu-container">
<ul class="list-unstyled mt-1">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./index.html" class="sidebar-item-text sidebar-link">Preface</a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./intro.html" class="sidebar-item-text sidebar-link active"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">The OBITools</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./commands.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">The <em>OBITools</em> commands</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./library.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">3</span>&nbsp; <span class="chapter-title">The GO <em>OBITools</em> library</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./annexes.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Annexes</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./references.html" class="sidebar-item-text sidebar-link">References</a>
</div>
</li>
</ul>
</div>
</nav>
<!-- margin-sidebar -->
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
<nav id="TOC" role="doc-toc" class="toc-active">
<h2 id="toc-title">Table of contents</h2>
<ul>
<li><a href="#aims-of-obitools" id="toc-aims-of-obitools" class="nav-link active" data-scroll-target="#aims-of-obitools"><span class="toc-section-number">1.1</span> Aims of <em>OBITools</em></a></li>
<li><a href="#file-formats-usable-with-obitools" id="toc-file-formats-usable-with-obitools" class="nav-link" data-scroll-target="#file-formats-usable-with-obitools"><span class="toc-section-number">1.2</span> File formats usable with <em>OBITools</em></a>
<ul class="collapse">
<li><a href="#the-sequence-files" id="toc-the-sequence-files" class="nav-link" data-scroll-target="#the-sequence-files"><span class="toc-section-number">1.2.1</span> The sequence files</a></li>
<li><a href="#the-iupac-code" id="toc-the-iupac-code" class="nav-link" data-scroll-target="#the-iupac-code"><span class="toc-section-number">1.2.2</span> The IUPAC Code</a></li>
<li><a href="#classical-fasta" id="toc-classical-fasta" class="nav-link" data-scroll-target="#classical-fasta"><span class="toc-section-number">1.2.3</span> The <em>fasta</em> format</a></li>
<li><a href="#classical-fastq" id="toc-classical-fastq" class="nav-link" data-scroll-target="#classical-fastq"><span class="toc-section-number">1.2.4</span> The <em>fastq</em> sequence format</a></li>
</ul></li>
<li><a href="#file-extension" id="toc-file-extension" class="nav-link" data-scroll-target="#file-extension"><span class="toc-section-number">1.3</span> File extension</a></li>
<li><a href="#see-also" id="toc-see-also" class="nav-link" data-scroll-target="#see-also"><span class="toc-section-number">1.4</span> See also</a></li>
<li><a href="#references" id="toc-references" class="nav-link" data-scroll-target="#references"><span class="toc-section-number">1.5</span> References</a></li>
</ul>
</nav>
</div>
<!-- main -->
<main class="content" id="quarto-document-content">
<header id="title-block-header" class="quarto-title-block default">
<div class="quarto-title">
<h1 class="title d-none d-lg-block"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">The OBITools</span></h1>
</div>
<div class="quarto-title-meta">
</div>
</header>
<section id="aims-of-obitools" class="level2" data-number="1.1">
<h2 data-number="1.1" class="anchored" data-anchor-id="aims-of-obitools"><span class="header-section-number">1.1</span> Aims of <em>OBITools</em></h2>
</section>
<section id="file-formats-usable-with-obitools" class="level2" data-number="1.2">
<h2 data-number="1.2" class="anchored" data-anchor-id="file-formats-usable-with-obitools"><span class="header-section-number">1.2</span> File formats usable with <em>OBITools</em></h2>
<section id="the-sequence-files" class="level3" data-number="1.2.1">
<h3 data-number="1.2.1" class="anchored" data-anchor-id="the-sequence-files"><span class="header-section-number">1.2.1</span> The sequence files</h3>
<p>Sequences can be stored in various formats, and <em>OBITools</em> knows some of them. The central formats for sequence files manipulated by the <em>OBITools</em> commands are the <code>fasta</code> and <code>fastq</code> formats. <em>OBITools</em> extends both of these formats by specifying a syntax for including, in the definition line, data qualifying the sequence. All file formats use the <code>IUPAC</code> code for encoding nucleotides.</p>
</section>
<section id="the-iupac-code" class="level3" data-number="1.2.2">
<h3 data-number="1.2.2" class="anchored" data-anchor-id="the-iupac-code"><span class="header-section-number">1.2.2</span> The IUPAC Code</h3>
<p>The International Union of Pure and Applied Chemistry (IUPAC) defined the standard code for representing protein or DNA sequences.</p>
<section id="DNA-IUPAC" class="level4" data-number="1.2.2.1">
<h4 data-number="1.2.2.1" class="anchored" data-anchor-id="DNA-IUPAC"><span class="header-section-number">1.2.2.1</span> Nucleic IUPAC Code</h4>
<table class="table">
<thead>
<tr class="header">
<th><strong>Code</strong></th>
<th><strong>Nucleotide</strong></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>A</td>
<td>Adenine</td>
</tr>
<tr class="even">
<td>C</td>
<td>Cytosine</td>
</tr>
<tr class="odd">
<td>G</td>
<td>Guanine</td>
</tr>
<tr class="even">
<td>T</td>
<td>Thymine</td>
</tr>
<tr class="odd">
<td>U</td>
<td>Uracil</td>
</tr>
<tr class="even">
<td>R</td>
<td>Purine (A or G)</td>
</tr>
<tr class="odd">
<td>Y</td>
<td>Pyrimidine (C, T, or U)</td>
</tr>
<tr class="even">
<td>M</td>
<td>C or A</td>
</tr>
<tr class="odd">
<td>K</td>
<td>T, U, or G</td>
</tr>
<tr class="even">
<td>W</td>
<td>T, U, or A</td>
</tr>
<tr class="odd">
<td>S</td>
<td>C or G</td>
</tr>
<tr class="even">
<td>B</td>
<td>C, T, U, or G (not A)</td>
</tr>
<tr class="odd">
<td>D</td>
<td>A, T, U, or G (not C)</td>
</tr>
<tr class="even">
<td>H</td>
<td>A, T, U, or C (not G)</td>
</tr>
<tr class="odd">
<td>V</td>
<td>A, C, or G (not T, not U)</td>
</tr>
<tr class="even">
<td>N</td>
<td>Any base (A, C, G, T, or U)</td>
</tr>
</tbody>
</table>
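<p>The table above can be sketched as a lookup table. The following Python fragment is only an illustration: the names <code>IUPAC_DNA</code> and <code>iupac_compatible</code> are hypothetical and are not part of <em>OBITools</em>, and uracil is folded into thymine as a simplifying assumption.</p>

```python
# Nucleic IUPAC codes mapped to the set of bases each code stands for.
# Assumption made for this sketch: U is treated as T.
IUPAC_DNA = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"}, "U": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "M": {"A", "C"}, "K": {"G", "T"},
    "W": {"A", "T"}, "S": {"C", "G"},
    "B": {"C", "G", "T"}, "D": {"A", "G", "T"},
    "H": {"A", "C", "T"}, "V": {"A", "C", "G"},
    "N": {"A", "C", "G", "T"},
}


def iupac_compatible(code_a: str, code_b: str) -> bool:
    """Return True when the two codes share at least one possible base."""
    return bool(IUPAC_DNA[code_a.upper()] & IUPAC_DNA[code_b.upper()])
```

<p>With this representation, two ambiguous symbols are considered compatible as soon as their base sets intersect, for instance <code>R</code> (A or G) and <code>A</code>.</p>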
</section>
</section>
<section id="classical-fasta" class="level3" data-number="1.2.3">
<h3 data-number="1.2.3" class="anchored" data-anchor-id="classical-fasta"><span class="header-section-number">1.2.3</span> The <em>fasta</em> format</h3>
<p>The <strong>fasta format</strong> is certainly the most widely used sequence file format, most likely because of its great simplicity. It was originally created for the Lipman and Pearson <a href="http://www.ncbi.nlm.nih.gov/pubmed/3162770?dopt=Citation">FASTA program</a>. In addition to the classical <code>fasta</code> format, <em>OBITools</em> use an extended version of this format in which structured data are included in the title line.</p>
<p>In <em>fasta</em> format, a sequence is represented by a title line beginning with a <strong><code>&gt;</code></strong> character, followed by the sequence itself written using the <a href="#the-iupac-code">IUPAC</a> code. The sequence is usually split over several lines of the same length (except for the last one).</p>
<pre><code>&gt;my_sequence this is my pretty sequence
ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
AACGACGTTGCAGTACGTTGCAGT</code></pre>
<p>There is no special format for the title line, except that this line should be unique. Usually the first word following the <strong>&gt;</strong> character is considered as the sequence identifier. The rest of the title line is a description of the sequence. Several sequences can be concatenated in the same file: the title line of the next sequence simply follows the last sequence line of the previous record.</p>
<pre><code>&gt;sequence_A this is my first pretty sequence
ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
AACGACGTTGCAGTACGTTGCAGT
&gt;sequence_B this is my second pretty sequence
ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
AACGACGTTGCAGTACGTTGCAGT
&gt;sequence_C this is my third pretty sequence
ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
AACGACGTTGCAGTACGTTGCAGT</code></pre>
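<p>The layout described above can be read with a few lines of code. The following Python sketch is only an illustration and not the <em>OBITools</em> reader: the function names <code>read_fasta</code> and <code>_make_record</code> are hypothetical, and the sketch assumes that the first word of the title line is the identifier and that blank lines can be ignored.</p>

```python
def _make_record(title, chunks):
    # The first word of the title line is the identifier,
    # the remainder of the line is the description.
    identifier, _, description = title.partition(" ")
    return identifier, description.strip(), "".join(chunks)


def read_fasta(lines):
    """Yield (identifier, description, sequence) for each fasta record."""
    title = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no information
        if line.startswith(">"):
            # A new title line closes the previous record, if any.
            if title is not None:
                yield _make_record(title, chunks)
            title = line[1:]
            chunks = []
        elif title is not None:
            chunks.append(line)
    if title is not None:
        yield _make_record(title, chunks)
```

<p>Because the function accepts any iterable of lines, it can be fed an open file as well as a list of strings, and the sequence lines of a record are joined back into a single string.</p>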
</section>
<section id="classical-fastq" class="level3" data-number="1.2.4">
<h3 data-number="1.2.4" class="anchored" data-anchor-id="classical-fastq"><span class="header-section-number">1.2.4</span> The <em>fastq</em> sequence format<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a></h3>
<p><strong>fastq format</strong> is a text-based format for storing both a biological sequence (usually a nucleotide sequence) and its corresponding quality scores. Each sequence letter and each quality score is encoded with a single ASCII character for brevity. It was originally developed at the <code>Wellcome Trust Sanger Institute</code> to bundle a <a href="#classical-fasta">fasta</a> sequence and its quality data, but has recently become the <em>de facto</em> standard for storing the output of high throughput sequencing instruments such as the Illumina Genome Analyzer <span class="citation" data-cites="cock2010sanger">(<a href="references.html#ref-cock2010sanger" role="doc-biblioref">Cock et al. 2010</a>)</span>.</p>
<p>A fastq file normally uses four lines per sequence.</p>
<ul>
<li>Line 1 begins with a @ character and is followed by a sequence identifier and an <em>optional</em> description (like a <a href="#classical-fasta">fasta</a> title line).</li>
<li>Line 2 is the raw sequence letters.</li>
<li>Line 3 begins with a + character and is <em>optionally</em> followed by the same sequence identifier (and any description) again.</li>
<li>Line 4 encodes the quality values for the sequence in Line 2, and must contain the same number of symbols as letters in the sequence.</li>
</ul>
<p>A fastq file containing a single sequence might look like this:</p>
<pre><code>@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF&gt;&gt;&gt;&gt;&gt;&gt;CCCCCCC65</code></pre>
<p>The character ! represents the lowest quality while ~ is the highest. Here are the quality value characters in left-to-right increasing order of quality (<code>ASCII</code>):</p>
<pre><code>!"#$%&amp;'()*+,-./0123456789:;&lt;=&gt;?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~</code></pre>
<p>The original Sanger FASTQ files also allowed the sequence and quality strings to be wrapped (split over multiple lines), but this is generally discouraged as it can make parsing complicated due to the unfortunate choice of “@” and “+” as markers (these characters can also occur in the quality string).</p>
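<p>The four-line layout described above can be sketched as follows. This Python fragment is an illustration only: the name <code>read_fastq</code> is hypothetical, and the sketch assumes strictly unwrapped records (exactly four lines each) and ignores blank lines.</p>

```python
def read_fastq(lines):
    """Yield (identifier, description, sequence, quality) for each record.

    Assumes the unwrapped four-line layout; wrapped records are rejected.
    """
    stripped = [line.rstrip("\r\n") for line in lines]
    stripped = [line for line in stripped if line]  # drop blank lines
    if len(stripped) % 4 != 0:
        raise ValueError("a fastq record must span exactly four lines")
    for start in range(0, len(stripped), 4):
        title, sequence, plus, quality = stripped[start:start + 4]
        if not title.startswith("@") or not plus.startswith("+"):
            raise ValueError("unexpected line markers in fastq record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        identifier, _, description = title[1:].partition(" ")
        yield identifier, description, sequence, quality
```

<p>Relying on the line position rather than on the leading character avoids the ambiguity mentioned above, since @ and + are also valid quality symbols.</p>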
<section id="variations" class="level4" data-number="1.2.4.1">
<h4 data-number="1.2.4.1" class="anchored" data-anchor-id="variations"><span class="header-section-number">1.2.4.1</span> Variations</h4>
<section id="quality" class="level5" data-number="1.2.4.1.1">
<h5 data-number="1.2.4.1.1" class="anchored" data-anchor-id="quality"><span class="header-section-number">1.2.4.1.1</span> Quality</h5>
<p>A quality value <em>Q</em> is an integer mapping of <em>p</em> (i.e., the probability that the corresponding base call is incorrect). Two different equations have been in use. The first is the standard Sanger variant to assess reliability of a base call, otherwise known as Phred quality score:</p>
<p><span class="math display">\[
Q_\text{sanger} = -10 \, \log_{10} p
\]</span></p>
<p>The Solexa pipeline (i.e., the software delivered with the Illumina Genome Analyzer) earlier used a different mapping, encoding the odds <span class="math inline">\(\mathbf{p}/(1-\mathbf{p})\)</span> instead of the probability <span class="math inline">\(\mathbf{p}\)</span>:</p>
<p><span class="math display">\[
Q_\text{solexa-prior to v.1.3} = -10 \, \log_{10} \frac{p}{1-p}
\]</span></p>
<p>Although both mappings are asymptotically identical at higher quality values, they differ at lower quality levels (i.e., approximately <span class="math inline">\(\mathbf{p} &gt; 0.05\)</span>, or equivalently, <span class="math inline">\(\mathbf{Q} &lt; 13\)</span>).</p>
<p>Figure (Probability metrics.png): relationship between <em>Q</em> and <em>p</em> using the Sanger (red) and Solexa (black) equations (described above). The vertical dotted line indicates <span class="math inline">\(\mathbf{p}= 0.05\)</span>, or equivalently, <span class="math inline">\(Q \approx 13\)</span>.</p>
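<p>The two mappings can be compared numerically. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, and the conversion from a Solexa score to a Sanger score is obtained here by eliminating <em>p</em> between the two equations above, which gives <span class="math inline">\(Q_\text{sanger} = 10 \, \log_{10}(10^{Q_\text{solexa}/10} + 1)\)</span>.</p>

```python
import math


def sanger_quality(p: float) -> float:
    """Phred score: Q = -10 log10(p)."""
    return -10.0 * math.log10(p)


def solexa_quality(p: float) -> float:
    """Solexa score (prior to v.1.3): Q = -10 log10(p / (1 - p))."""
    return -10.0 * math.log10(p / (1.0 - p))


def solexa_to_sanger(q_solexa: float) -> float:
    """Convert a Solexa score to the equivalent Phred score.

    Derived by solving the Solexa equation for p and substituting
    the result into the Sanger equation.
    """
    return 10.0 * math.log10(10.0 ** (q_solexa / 10.0) + 1.0)
```

<p>At <em>p</em> = 0.05 the two scores already differ by only a fraction of a unit, and the difference vanishes as <em>p</em> decreases, which is the asymptotic behaviour stated above.</p>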
</section>
</section>
<section id="encoding" class="level4" data-number="1.2.4.2">
<h4 data-number="1.2.4.2" class="anchored" data-anchor-id="encoding"><span class="header-section-number">1.2.4.2</span> Encoding</h4>
<ul>
<li>Sanger format can encode a Phred quality score from 0 to 93 using ASCII 33 to 126 (although in raw read data the Phred quality score rarely exceeds 60, higher scores are possible in assemblies or read maps).</li>
<li>Solexa/Illumina 1.0 format can encode a Solexa/Illumina quality score from -5 to 62 using ASCII 59 to 126 (although in raw read data only Solexa scores from -5 to 40 are expected).</li>
<li>Starting with Illumina 1.3 and before Illumina 1.8, the format encoded a Phred quality score from 0 to 62 using ASCII 64 to 126 (although in raw read data only Phred scores from 0 to 40 are expected).</li>
<li>Starting in Illumina 1.5 and before Illumina 1.8, the Phred scores 0 to 2 have a slightly different meaning. The values 0 and 1 are no longer used, and the value 2, encoded by ASCII 66 “B”, is used as a quality control indicator for the final segment of a read, as described below.</li>
</ul>
<p>The manual for Sequencing Control Software, Version 2.6 (Catalog # SY-960-2601, Part # 15009921 Rev.&nbsp;A, November 2009; <a href="http://watson.nci.nih.gov/solexa/Using_SCSv2.6_15009921_A.pdf" class="uri">http://watson.nci.nih.gov/solexa/Using_SCSv2.6_15009921_A.pdf</a>, page 30) states the following: <em>If a read ends with a segment of mostly low quality (Q15 or below), then all of the quality values in the segment are replaced with a value of 2 (encoded as the letter B in Illumina's text-based encoding of quality scores)… This Q2 indicator does not predict a specific error rate, but rather indicates that a specific final portion of the read should not be used in further analyses.</em> Also, the quality score encoded as the letter “B” may occur internally within reads at least as late as pipeline version 1.6, as shown in the following example:</p>
<pre><code>@HWI-EAS209_0006_FC706VJ:5:58:5894:21141#ATCACG/1
TTAATTGGTAAATAAATCTCCTAATAGCTTAGATNTTACCTTNNNNNNNNNNTAGTTTCTTGAGATTTGTTGGGGGAGACATTTTTGTGATTGCCTTGAT
+HWI-EAS209_0006_FC706VJ:5:58:5894:21141#ATCACG/1
efcfffffcfeefffcffffffddf`feed]`]_Ba_^__[YBBBBBBBBBBRTT\]][]dddd`ddd^dddadd^BBBBBBBBBBBBBBBBBBBBBBBB</code></pre>
<p>An alternative interpretation of this ASCII encoding has been proposed. Also, in Illumina runs using PhiX controls, the character B was observed to represent an “unknown quality score”. The error rate of B reads was roughly 3 Phred scores lower than the mean observed score of a given run.</p>
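<p>As an illustration of how the trailing “B” segment could be handled, here is a minimal Python sketch. The function names are hypothetical and this is not the <em>OBITools</em> behaviour. It only makes sense for files written by Illumina pipelines 1.5 to 1.7 (Phred+64), because in a Phred+33 file the letter B is an ordinary quality value.</p>

```python
def trailing_b_length(quality: str) -> int:
    """Return the length of the run of 'B' symbols ending the quality string."""
    return len(quality) - len(quality.rstrip("B"))


def trim_b_tail(sequence: str, quality: str):
    """Drop the final read segment flagged with the Q2 indicator 'B'.

    Assumption: the quality string uses the Illumina 1.5-1.7 encoding.
    Internal 'B' symbols are left untouched.
    """
    keep = len(quality) - trailing_b_length(quality)
    return sequence[:keep], quality[:keep]
```

<p>Only the final run of B symbols is removed, in line with the quoted statement that the indicator flags a final portion of the read.</p>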
<ul>
<li>Starting in Illumina 1.8, the quality scores have basically returned to the use of the Sanger format (Phred+33).</li>
</ul>
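<p>The encodings listed above differ only by the ASCII offset subtracted from each character. The following Python sketch illustrates this; the constant and function names are hypothetical, and the caller is assumed to know which variant the file uses, since this sketch makes no attempt to detect it.</p>

```python
SANGER_OFFSET = 33    # Sanger and Illumina 1.8+ (Phred+33)
ILLUMINA_OFFSET = 64  # Solexa/Illumina 1.0 and Illumina 1.3 to 1.7 (+64)


def decode_quality(quality: str, offset: int = SANGER_OFFSET) -> list:
    """Turn a quality string into a list of integer scores."""
    return [ord(symbol) - offset for symbol in quality]


def error_probability(phred: float) -> float:
    """Invert the Phred equation: p = 10 ** (-Q / 10)."""
    return 10.0 ** (-phred / 10.0)
```

<p>For example, the character <code>!</code> (ASCII 33) decodes to 0 with the Sanger offset, while <code>;</code> (ASCII 59) decodes to -5 with the offset of 64 used by the Solexa/Illumina 1.0 format.</p>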
</section>
</section>
</section>
<section id="file-extension" class="level2" data-number="1.3">
<h2 data-number="1.3" class="anchored" data-anchor-id="file-extension"><span class="header-section-number">1.3</span> File extension</h2>
<p>There is no standard file extension for a FASTQ file, but .fq and .fastq are commonly used.</p>
</section>
<section id="see-also" class="level2" data-number="1.4">
<h2 data-number="1.4" class="anchored" data-anchor-id="see-also"><span class="header-section-number">1.4</span> See also</h2>
<ul>
<li><a href="#classical-fasta">The <em>fasta</em> format</a></li>
</ul>
</section>
<section id="references" class="level2" data-number="1.5">
<h2 data-number="1.5" class="anchored" data-anchor-id="references"><span class="header-section-number">1.5</span> References</h2>
<p>[1] Cock et al. (2010) The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research 38 (6): 1767–71.</p>
<p>[2] Illumina Quality Scores, Tobias Mann, Bioinformatics, San Diego, Illumina.</p>
<p>See <a href="http://en.wikipedia.org/wiki/FASTQ_format" class="uri">http://en.wikipedia.org/wiki/FASTQ_format</a></p>
<div id="refs" class="references csl-bib-body hanging-indent" role="doc-bibliography" style="display: none">
<div id="ref-cock2010sanger" class="csl-entry" role="doc-biblioentry">
Cock, Peter JA, Christopher J Fields, Naohisa Goto, Michael L Heuer, and Peter M Rice. 2010. <span>“The Sanger FASTQ File Format for Sequences with Quality Scores, and the Solexa/Illumina FASTQ Variants.”</span> <em>Nucleic Acids Research</em> 38 (6): 1767–71.
</div>
</div>
</section>
<section id="footnotes" class="footnotes footnotes-end-of-document" role="doc-endnotes">
<hr>
<ol>
<li id="fn1"><p>This article uses material from the Wikipedia article <a href="http://en.wikipedia.org/wiki/FASTQ_format"><code>FASTQ format</code></a> which is released under the <code>Creative Commons Attribution-Share-Alike License 3.0</code><a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>
</main> <!-- /main -->
<script id="quarto-html-after-body" type="application/javascript">
window.document.addEventListener("DOMContentLoaded", function (event) {
const toggleBodyColorMode = (bsSheetEl) => {
const mode = bsSheetEl.getAttribute("data-mode");
const bodyEl = window.document.querySelector("body");
if (mode === "dark") {
bodyEl.classList.add("quarto-dark");
bodyEl.classList.remove("quarto-light");
} else {
bodyEl.classList.add("quarto-light");
bodyEl.classList.remove("quarto-dark");
}
}
const toggleBodyColorPrimary = () => {
const bsSheetEl = window.document.querySelector("link#quarto-bootstrap");
if (bsSheetEl) {
toggleBodyColorMode(bsSheetEl);
}
}
toggleBodyColorPrimary();
const icon = "";
const anchorJS = new window.AnchorJS();
anchorJS.options = {
placement: 'right',
icon: icon
};
anchorJS.add('.anchored');
const clipboard = new window.ClipboardJS('.code-copy-button', {
target: function(trigger) {
return trigger.previousElementSibling;
}
});
clipboard.on('success', function(e) {
// button target
const button = e.trigger;
// don't keep focus
button.blur();
// flash "checked"
button.classList.add('code-copy-button-checked');
var currentTitle = button.getAttribute("title");
button.setAttribute("title", "Copied!");
let tooltip;
if (window.bootstrap) {
button.setAttribute("data-bs-toggle", "tooltip");
button.setAttribute("data-bs-placement", "left");
button.setAttribute("data-bs-title", "Copied!");
tooltip = new bootstrap.Tooltip(button,
{ trigger: "manual",
customClass: "code-copy-button-tooltip",
offset: [0, -8]});
tooltip.show();
}
setTimeout(function() {
if (tooltip) {
tooltip.hide();
button.removeAttribute("data-bs-title");
button.removeAttribute("data-bs-toggle");
button.removeAttribute("data-bs-placement");
}
button.setAttribute("title", currentTitle);
button.classList.remove('code-copy-button-checked');
}, 1000);
// clear code selection
e.clearSelection();
});
function tippyHover(el, contentFn) {
const config = {
allowHTML: true,
content: contentFn,
maxWidth: 500,
delay: 100,
arrow: false,
appendTo: function(el) {
return el.parentElement;
},
interactive: true,
interactiveBorder: 10,
theme: 'quarto',
placement: 'bottom-start'
};
window.tippy(el, config);
}
const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
for (var i=0; i<noterefs.length; i++) {
const ref = noterefs[i];
tippyHover(ref, function() {
// use id or data attribute instead here
let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
try { href = new URL(href).hash; } catch {}
const id = href.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
return note.innerHTML;
});
}
const findCites = (el) => {
const parentEl = el.parentElement;
if (parentEl) {
const cites = parentEl.dataset.cites;
if (cites) {
return {
el,
cites: cites.split(' ')
};
} else {
return findCites(el.parentElement)
}
} else {
return undefined;
}
};
var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
for (var i=0; i<bibliorefs.length; i++) {
const ref = bibliorefs[i];
const citeInfo = findCites(ref);
if (citeInfo) {
tippyHover(citeInfo.el, function() {
var popup = window.document.createElement('div');
citeInfo.cites.forEach(function(cite) {
var citeDiv = window.document.createElement('div');
citeDiv.classList.add('hanging-indent');
citeDiv.classList.add('csl-entry');
var biblioDiv = window.document.getElementById('ref-' + cite);
if (biblioDiv) {
citeDiv.innerHTML = biblioDiv.innerHTML;
}
popup.appendChild(citeDiv);
});
return popup.innerHTML;
});
}
}
});
</script>
<nav class="page-navigation">
<div class="nav-page nav-page-previous">
<a href="./index.html" class="pagination-link">
<i class="bi bi-arrow-left-short"></i> <span class="nav-page-text">Preface</span>
</a>
</div>
<div class="nav-page nav-page-next">
<a href="./commands.html" class="pagination-link">
<span class="nav-page-text"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">The <em>OBITools</em> commands</span></span> <i class="bi bi-arrow-right-short"></i>
</a>
</div>
</nav>
</div> <!-- /content -->
</body></html>

489
doc/_book/library.html Normal file

@ -0,0 +1,489 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
<meta charset="utf-8">
<meta name="generator" content="quarto-1.2.256">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<title>OBITools V4 - 3&nbsp; The GO OBITools library</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1.6em;
vertical-align: middle;
}
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { color: #008000; } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { color: #008000; font-weight: bold; } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<script src="site_libs/quarto-nav/quarto-nav.js"></script>
<script src="site_libs/quarto-nav/headroom.min.js"></script>
<script src="site_libs/clipboard/clipboard.min.js"></script>
<script src="site_libs/quarto-search/autocomplete.umd.js"></script>
<script src="site_libs/quarto-search/fuse.min.js"></script>
<script src="site_libs/quarto-search/quarto-search.js"></script>
<meta name="quarto:offset" content="./">
<link href="./annexes.html" rel="next">
<link href="./commands.html" rel="prev">
<script src="site_libs/quarto-html/quarto.js"></script>
<script src="site_libs/quarto-html/popper.min.js"></script>
<script src="site_libs/quarto-html/tippy.umd.min.js"></script>
<script src="site_libs/quarto-html/anchor.min.js"></script>
<link href="site_libs/quarto-html/tippy.css" rel="stylesheet">
<link href="site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="site_libs/bootstrap/bootstrap.min.js"></script>
<link href="site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="site_libs/bootstrap/bootstrap.min.css" rel="stylesheet" id="quarto-bootstrap" data-mode="light">
<script id="quarto-search-options" type="application/json">{
"location": "sidebar",
"copy-button": false,
"collapse-after": 3,
"panel-placement": "start",
"type": "textbox",
"limit": 20,
"language": {
"search-no-results-text": "No results",
"search-matching-documents-text": "matching documents",
"search-copy-link-title": "Copy link to search",
"search-hide-matches-text": "Hide additional matches",
"search-more-match-text": "more match in this document",
"search-more-matches-text": "more matches in this document",
"search-clear-button-title": "Clear",
"search-detached-cancel-button-title": "Cancel",
"search-submit-button-title": "Submit"
}
}</script>
</head>
<body class="nav-sidebar floating">
<div id="quarto-search-results"></div>
<header id="quarto-header" class="headroom fixed-top">
<nav class="quarto-secondary-nav" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<div class="container-fluid d-flex justify-content-between">
<h1 class="quarto-secondary-nav-title"><span class="chapter-number">3</span>&nbsp; <span class="chapter-title">The GO <em>OBITools</em> library</span></h1>
<button type="button" class="quarto-btn-toggle btn" aria-label="Show secondary navigation">
<i class="bi bi-chevron-right"></i>
</button>
</div>
</nav>
</header>
<!-- content -->
<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article">
<!-- sidebar -->
<nav id="quarto-sidebar" class="sidebar collapse sidebar-navigation floating overflow-auto">
<div class="pt-lg-2 mt-2 text-left sidebar-header">
<div class="sidebar-title mb-0 py-0">
<a href="./">OBITools V4</a>
</div>
</div>
<div class="mt-2 flex-shrink-0 align-items-center">
<div class="sidebar-search">
<div id="quarto-search" class="" title="Search"></div>
</div>
</div>
<div class="sidebar-menu-container">
<ul class="list-unstyled mt-1">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./index.html" class="sidebar-item-text sidebar-link">Preface</a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./intro.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">The OBITools</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./commands.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">The <em>OBITools</em> commands</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./library.html" class="sidebar-item-text sidebar-link active"><span class="chapter-number">3</span>&nbsp; <span class="chapter-title">The GO <em>OBITools</em> library</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./annexes.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Annexes</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./references.html" class="sidebar-item-text sidebar-link">References</a>
</div>
</li>
</ul>
</div>
</nav>
<!-- margin-sidebar -->
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
<nav id="TOC" role="doc-toc" class="toc-active">
<h2 id="toc-title">Table of contents</h2>
<ul>
<li><a href="#biosequence" id="toc-biosequence" class="nav-link active" data-scroll-target="#biosequence"><span class="toc-section-number">3.1</span> BioSequence</a>
<ul class="collapse">
<li><a href="#creating-new-instances" id="toc-creating-new-instances" class="nav-link" data-scroll-target="#creating-new-instances"><span class="toc-section-number">3.1.1</span> Creating new instances</a></li>
<li><a href="#end-of-life-of-a-biosequence-instance" id="toc-end-of-life-of-a-biosequence-instance" class="nav-link" data-scroll-target="#end-of-life-of-a-biosequence-instance"><span class="toc-section-number">3.1.2</span> End of life of a <code>BioSequence</code> instance</a></li>
<li><a href="#accessing-to-the-elements-of-a-sequence" id="toc-accessing-to-the-elements-of-a-sequence" class="nav-link" data-scroll-target="#accessing-to-the-elements-of-a-sequence"><span class="toc-section-number">3.1.3</span> Accessing the elements of a sequence</a></li>
</ul></li>
</ul>
</nav>
</div>
<!-- main -->
<main class="content" id="quarto-document-content">
<header id="title-block-header" class="quarto-title-block default">
<div class="quarto-title">
<h1 class="title d-none d-lg-block"><span class="chapter-number">3</span>&nbsp; <span class="chapter-title">The GO <em>OBITools</em> library</span></h1>
</div>
<div class="quarto-title-meta">
</div>
</header>
<section id="biosequence" class="level2" data-number="3.1">
<h2 data-number="3.1" class="anchored" data-anchor-id="biosequence"><span class="header-section-number">3.1</span> BioSequence</h2>
<p>The <code>BioSequence</code> type is used to represent biological sequences. It stores:</p>
<ul>
<li>the sequence itself as a <code>[]byte</code></li>
<li>the sequencing quality scores as a <code>[]byte</code>, if needed</li>
<li>an identifier as a <code>string</code></li>
<li>a definition as a <code>string</code></li>
<li>a set of <em>(key, value)</em> pairs in a <code>map[string]interface{}</code></li>
</ul>
<p><code>BioSequence</code> is defined in the <code>obiseq</code> package, which is imported with:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode go code-with-copy"><code class="sourceCode go"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="op">(</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="st">"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="op">)</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<section id="creating-new-instances" class="level3" data-number="3.1.1">
<h3 data-number="3.1.1" class="anchored" data-anchor-id="creating-new-instances"><span class="header-section-number">3.1.1</span> Creating new instances</h3>
<p>To create a new instance, use one of:</p>
<ul>
<li><code>MakeBioSequence(id string, sequence []byte, definition string) obiseq.BioSequence</code></li>
<li><code>NewBioSequence(id string, sequence []byte, definition string) *obiseq.BioSequence</code></li>
</ul>
<p>Both create a <code>BioSequence</code> instance; the first returns the instance itself, whereas the second returns a pointer to the new instance. Two other functions, <code>MakeEmptyBioSequence</code> and <code>NewEmptyBioSequence</code>, do the same job but return an uninitialized object.</p>
<ul>
<li>The <code>id</code> parameter is the unique identifier of the sequence. It must be a string consisting of a single word (containing no spaces).</li>
<li><code>sequence</code> is the DNA sequence itself, provided as a <code>byte</code> slice (<code>[]byte</code>).</li>
<li><code>definition</code> is a <code>string</code>, potentially empty, but usually containing a sentence describing the sequence.</li>
</ul>
<div class="sourceCode" id="cb2"><pre class="sourceCode go code-with-copy"><code class="sourceCode go"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="op">(</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> <span class="st">"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="op">)</span></span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a><span class="kw">func</span> main<span class="op">()</span> <span class="op">{</span></span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> myseq <span class="op">:=</span> obiseq<span class="op">.</span>NewBioSequence<span class="op">(</span></span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> <span class="st">"seq_GH0001"</span><span class="op">,</span></span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> <span class="op">[]</span><span class="dt">byte</span><span class="op">(</span><span class="st">"ACGTGTCAGTCG"</span><span class="op">),</span></span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> <span class="st">"A short test sequence"</span><span class="op">,</span></span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> <span class="op">)</span></span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>When formatted as FASTA, the parameters correspond to the following schema:</p>
<pre><code>&gt;id definition containing potentially several words
sequence</code></pre>
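<p>The mapping between the three constructor parameters and a FASTA record can be sketched with plain Go. This is a minimal illustration only: <code>formatFasta</code> is a hypothetical helper written for this example and is not part of the <code>obiseq</code> package.</p>

```go
package main

import (
	"fmt"
	"strings"
)

// formatFasta is a hypothetical helper (not part of obiseq). It lays out an
// identifier, a definition and a sequence as a single FASTA record.
func formatFasta(id string, definition string, sequence []byte) string {
	var b strings.Builder
	b.WriteByte('>')
	b.WriteString(id)
	// The definition is optional; it is separated from the id by one space.
	if definition != "" {
		b.WriteByte(' ')
		b.WriteString(definition)
	}
	b.WriteByte('\n')
	b.Write(sequence)
	b.WriteByte('\n')
	return b.String()
}

func main() {
	fmt.Print(formatFasta(
		"seq_GH0001",
		"A short test sequence",
		[]byte("ACGTGTCAGTCG"),
	))
}
```

<p>Because the identifier ends at the first space of the title line, an <code>id</code> containing a space could not be told apart from the start of the definition, which is why it must be a single word.</p>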
</section>
<section id="end-of-life-of-a-biosequence-instance" class="level3" data-number="3.1.2">
<h3 data-number="3.1.2" class="anchored" data-anchor-id="end-of-life-of-a-biosequence-instance"><span class="header-section-number">3.1.2</span> End of life of a <code>BioSequence</code> instance</h3>
<p>When a <code>BioSequence</code> instance is no longer used, it is normally reclaimed by the Go garbage collector. If you wish, you can call the <code>Recycle</code> method on the instance to store its allocated memory in a <code>pool</code>, which limits the allocation effort when many sequences are manipulated.</p>
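<p>The general idea of recycling memory through a pool can be sketched with the standard <code>sync.Pool</code> type. This is an illustration under assumptions: <code>bufferPool</code>, <code>getBuffer</code> and <code>recycleBuffer</code> are hypothetical names, and nothing here describes how <code>Recycle</code> is actually implemented in <code>obiseq</code>.</p>

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufferPool is a hypothetical pool of reusable buffers; it only illustrates
// the pooling technique and is not the pool used by obiseq.
var bufferPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// getBuffer returns an empty buffer, reusing pooled storage when available.
func getBuffer() *bytes.Buffer {
	buf := bufferPool.Get().(*bytes.Buffer)
	buf.Reset()
	return buf
}

// recycleBuffer hands the buffer back to the pool.
// The caller must not use buf after this call.
func recycleBuffer(buf *bytes.Buffer) {
	buf.Reset()
	bufferPool.Put(buf)
}

func main() {
	buf := getBuffer()
	buf.WriteString("ACGTGTCAGTCG")
	fmt.Println(buf.Len())

	// Return the storage so that a later getBuffer call may reuse it.
	recycleBuffer(buf)

	again := getBuffer()
	fmt.Println(again.Len())
}
```

<p>A pool is only worthwhile when many short-lived objects of similar size are created; the key rule is that an object must never be touched again once it has been recycled.</p>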
</section>
<section id="accessing-to-the-elements-of-a-sequence" class="level3" data-number="3.1.3">
<h3 data-number="3.1.3" class="anchored" data-anchor-id="accessing-to-the-elements-of-a-sequence"><span class="header-section-number">3.1.3</span> Accessing the elements of a sequence</h3>
<p>The different elements of an <code>obiseq.BioSequence</code> must be accessed through a set of methods. For the three main elements provided when creating a new instance, the methods are:</p>
<ul>
<li><code>Id() string</code></li>
<li><code>Sequence() []byte</code></li>
<li><code>Definition() string</code></li>
</ul>
<p>Corresponding methods exist to change the values of these elements:</p>
<ul>
<li><code>SetId(id string)</code></li>
<li><code>SetSequence(sequence []byte)</code></li>
<li><code>SetDefinition(definition string)</code></li>
</ul>
<div class="sourceCode" id="cb4"><pre class="sourceCode go code-with-copy"><code class="sourceCode go"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="op">(</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> <span class="st">"fmt"</span></span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> <span class="st">"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"</span></span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a><span class="op">)</span></span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a><span class="kw">func</span> main<span class="op">()</span> <span class="op">{</span></span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a> myseq <span class="op">:=</span> obiseq<span class="op">.</span>NewBioSequence<span class="op">(</span></span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> <span class="st">"seq_GH0001"</span><span class="op">,</span></span>
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> <span class="op">[]</span><span class="dt">byte</span><span class="op">(</span><span class="st">"ACGTGTCAGTCG"</span><span class="op">),</span></span>
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a> <span class="st">"A short test sequence"</span><span class="op">,</span></span>
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a> <span class="op">)</span></span>
<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a> fmt<span class="op">.</span>Println<span class="op">(</span>myseq<span class="op">.</span>Id<span class="op">())</span></span>
<span id="cb4-14"><a href="#cb4-14" aria-hidden="true" tabindex="-1"></a> myseq<span class="op">.</span>SetId<span class="op">(</span><span class="st">"SPE01_0001"</span><span class="op">)</span></span>
<span id="cb4-15"><a href="#cb4-15" aria-hidden="true" tabindex="-1"></a> fmt<span class="op">.</span>Println<span class="op">(</span>myseq<span class="op">.</span>Id<span class="op">())</span></span>
<span id="cb4-16"><a href="#cb4-16" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<section id="different-ways-for-accessing-an-editing-the-sequence" class="level4" data-number="3.1.3.1">
<h4 data-number="3.1.3.1" class="anchored" data-anchor-id="different-ways-for-accessing-an-editing-the-sequence"><span class="header-section-number">3.1.3.1</span> Different ways of accessing and editing the sequence</h4>
<p>While <code>Sequence()</code> and <code>SetSequence(sequence []byte)</code> are the basic methods, several others exist.</p>
<ul>
<li><code>String() string</code> returns the sequence directly converted to a <code>string</code> instance.</li>
<li>The <code>Write</code> method family allows for extending an existing sequence following the buffer protocol.
<ul>
<li><code>Write(data []byte) (int, error)</code> allows for appending a byte slice to the 3′ end of the sequence.</li>
<li><code>WriteString(data string) (int, error)</code> allows for appending a <code>string</code>.</li>
<li><code>WriteByte(data byte) error</code> allows for appending a single <code>byte</code>.</li>
</ul></li>
</ul>
<p>The <code>Clear</code> method empties the sequence buffer.</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode go code-with-copy"><code class="sourceCode go"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="op">(</span></span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a> <span class="st">"fmt"</span></span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a> <span class="st">"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"</span></span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a><span class="op">)</span></span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a><span class="kw">func</span> main<span class="op">()</span> <span class="op">{</span></span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a> myseq <span class="op">:=</span> obiseq<span class="op">.</span>NewEmptyBioSequence<span class="op">()</span></span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a> myseq<span class="op">.</span>WriteString<span class="op">(</span><span class="st">"accc"</span><span class="op">)</span></span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a> myseq<span class="op">.</span>WriteByte<span class="op">(</span><span class="dt">byte</span><span class="op">(</span><span class="ch">'c'</span><span class="op">))</span></span>
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a> fmt<span class="op">.</span>Println<span class="op">(</span>myseq<span class="op">.</span>String<span class="op">())</span></span>
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</section>
<section id="sequence-quality-scores" class="level4" data-number="3.1.3.2">
<h4 data-number="3.1.3.2" class="anchored" data-anchor-id="sequence-quality-scores"><span class="header-section-number">3.1.3.2</span> Sequence quality scores</h4>
<p>Sequence quality scores cannot be initialized when the instance is created. You must use dedicated methods to add quality scores to a sequence.</p>
<p>To be coherent, the DNA sequence and the quality score sequence must have the same length. However, this constraint is not checked by the library: it is the programmer's responsibility to ensure that the invariant holds.</p>
<p>Quality scores are read with the method <code>Quality() []byte</code>; setting them requires calling one of the following methods, which behave like their sequence counterparts.</p>
<ul>
<li><code>SetQualities(qualities Quality)</code></li>
<li><code>WriteQualities(data []byte) (int, error)</code></li>
<li><code>WriteByteQualities(data byte) error</code></li>
</ul>
<p>In a way analogous to the <code>Clear</code> method, <code>ClearQualities()</code> empties the sequence of quality scores.</p>
</section>
</section>
</section>
</main> <!-- /main -->
<script id="quarto-html-after-body" type="application/javascript">
window.document.addEventListener("DOMContentLoaded", function (event) {
const toggleBodyColorMode = (bsSheetEl) => {
const mode = bsSheetEl.getAttribute("data-mode");
const bodyEl = window.document.querySelector("body");
if (mode === "dark") {
bodyEl.classList.add("quarto-dark");
bodyEl.classList.remove("quarto-light");
} else {
bodyEl.classList.add("quarto-light");
bodyEl.classList.remove("quarto-dark");
}
}
const toggleBodyColorPrimary = () => {
const bsSheetEl = window.document.querySelector("link#quarto-bootstrap");
if (bsSheetEl) {
toggleBodyColorMode(bsSheetEl);
}
}
toggleBodyColorPrimary();
const icon = "";
const anchorJS = new window.AnchorJS();
anchorJS.options = {
placement: 'right',
icon: icon
};
anchorJS.add('.anchored');
const clipboard = new window.ClipboardJS('.code-copy-button', {
target: function(trigger) {
return trigger.previousElementSibling;
}
});
clipboard.on('success', function(e) {
// button target
const button = e.trigger;
// don't keep focus
button.blur();
// flash "checked"
button.classList.add('code-copy-button-checked');
var currentTitle = button.getAttribute("title");
button.setAttribute("title", "Copied!");
let tooltip;
if (window.bootstrap) {
button.setAttribute("data-bs-toggle", "tooltip");
button.setAttribute("data-bs-placement", "left");
button.setAttribute("data-bs-title", "Copied!");
tooltip = new bootstrap.Tooltip(button,
{ trigger: "manual",
customClass: "code-copy-button-tooltip",
offset: [0, -8]});
tooltip.show();
}
setTimeout(function() {
if (tooltip) {
tooltip.hide();
button.removeAttribute("data-bs-title");
button.removeAttribute("data-bs-toggle");
button.removeAttribute("data-bs-placement");
}
button.setAttribute("title", currentTitle);
button.classList.remove('code-copy-button-checked');
}, 1000);
// clear code selection
e.clearSelection();
});
function tippyHover(el, contentFn) {
const config = {
allowHTML: true,
content: contentFn,
maxWidth: 500,
delay: 100,
arrow: false,
appendTo: function(el) {
return el.parentElement;
},
interactive: true,
interactiveBorder: 10,
theme: 'quarto',
placement: 'bottom-start'
};
window.tippy(el, config);
}
const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
for (var i=0; i<noterefs.length; i++) {
const ref = noterefs[i];
tippyHover(ref, function() {
// use id or data attribute instead here
let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
try { href = new URL(href).hash; } catch {}
const id = href.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
return note.innerHTML;
});
}
const findCites = (el) => {
const parentEl = el.parentElement;
if (parentEl) {
const cites = parentEl.dataset.cites;
if (cites) {
return {
el,
cites: cites.split(' ')
};
} else {
return findCites(el.parentElement)
}
} else {
return undefined;
}
};
var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
for (var i=0; i<bibliorefs.length; i++) {
const ref = bibliorefs[i];
const citeInfo = findCites(ref);
if (citeInfo) {
tippyHover(citeInfo.el, function() {
var popup = window.document.createElement('div');
citeInfo.cites.forEach(function(cite) {
var citeDiv = window.document.createElement('div');
citeDiv.classList.add('hanging-indent');
citeDiv.classList.add('csl-entry');
var biblioDiv = window.document.getElementById('ref-' + cite);
if (biblioDiv) {
citeDiv.innerHTML = biblioDiv.innerHTML;
}
popup.appendChild(citeDiv);
});
return popup.innerHTML;
});
}
}
});
</script>
<nav class="page-navigation">
<div class="nav-page nav-page-previous">
<a href="./commands.html" class="pagination-link">
<i class="bi bi-arrow-left-short"></i> <span class="nav-page-text"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">The <em>OBITools</em> commands</span></span>
</a>
</div>
<div class="nav-page nav-page-next">
<a href="./annexes.html" class="pagination-link">
<span class="nav-page-text"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Annexes</span></span> <i class="bi bi-arrow-right-short"></i>
</a>
</div>
</nav>
</div> <!-- /content -->
</body></html>

329
doc/_book/references.html Normal file

@ -0,0 +1,329 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
<meta charset="utf-8">
<meta name="generator" content="quarto-1.2.256">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<title>OBITools V4 - References</title>
<style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
ul.task-list{list-style: none;}
ul.task-list li input[type="checkbox"] {
width: 0.8em;
margin: 0 0.8em 0.2em -1.6em;
vertical-align: middle;
}
div.csl-bib-body { }
div.csl-entry {
clear: both;
}
.hanging div.csl-entry {
margin-left:2em;
text-indent:-2em;
}
div.csl-left-margin {
min-width:2em;
float:left;
}
div.csl-right-inline {
margin-left:2em;
padding-left:1em;
}
div.csl-indent {
margin-left: 2em;
}
</style>
<script src="site_libs/quarto-nav/quarto-nav.js"></script>
<script src="site_libs/quarto-nav/headroom.min.js"></script>
<script src="site_libs/clipboard/clipboard.min.js"></script>
<script src="site_libs/quarto-search/autocomplete.umd.js"></script>
<script src="site_libs/quarto-search/fuse.min.js"></script>
<script src="site_libs/quarto-search/quarto-search.js"></script>
<meta name="quarto:offset" content="./">
<link href="./annexes.html" rel="prev">
<script src="site_libs/quarto-html/quarto.js"></script>
<script src="site_libs/quarto-html/popper.min.js"></script>
<script src="site_libs/quarto-html/tippy.umd.min.js"></script>
<script src="site_libs/quarto-html/anchor.min.js"></script>
<link href="site_libs/quarto-html/tippy.css" rel="stylesheet">
<link href="site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles">
<script src="site_libs/bootstrap/bootstrap.min.js"></script>
<link href="site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
<link href="site_libs/bootstrap/bootstrap.min.css" rel="stylesheet" id="quarto-bootstrap" data-mode="light">
<script id="quarto-search-options" type="application/json">{
"location": "sidebar",
"copy-button": false,
"collapse-after": 3,
"panel-placement": "start",
"type": "textbox",
"limit": 20,
"language": {
"search-no-results-text": "No results",
"search-matching-documents-text": "matching documents",
"search-copy-link-title": "Copy link to search",
"search-hide-matches-text": "Hide additional matches",
"search-more-match-text": "more match in this document",
"search-more-matches-text": "more matches in this document",
"search-clear-button-title": "Clear",
"search-detached-cancel-button-title": "Cancel",
"search-submit-button-title": "Submit"
}
}</script>
</head>
<body class="nav-sidebar floating">
<div id="quarto-search-results"></div>
<header id="quarto-header" class="headroom fixed-top">
<nav class="quarto-secondary-nav" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
<div class="container-fluid d-flex justify-content-between">
<h1 class="quarto-secondary-nav-title">References</h1>
<button type="button" class="quarto-btn-toggle btn" aria-label="Show secondary navigation">
<i class="bi bi-chevron-right"></i>
</button>
</div>
</nav>
</header>
<!-- content -->
<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article">
<!-- sidebar -->
<nav id="quarto-sidebar" class="sidebar collapse sidebar-navigation floating overflow-auto">
<div class="pt-lg-2 mt-2 text-left sidebar-header">
<div class="sidebar-title mb-0 py-0">
<a href="./">OBITools V4</a>
</div>
</div>
<div class="mt-2 flex-shrink-0 align-items-center">
<div class="sidebar-search">
<div id="quarto-search" class="" title="Search"></div>
</div>
</div>
<div class="sidebar-menu-container">
<ul class="list-unstyled mt-1">
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./index.html" class="sidebar-item-text sidebar-link">Preface</a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./intro.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">The OBITools</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./commands.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">The <em>OBITools</em> commands</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./library.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">3</span>&nbsp; <span class="chapter-title">The GO <em>OBITools</em> library</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./annexes.html" class="sidebar-item-text sidebar-link"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Annexes</span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./references.html" class="sidebar-item-text sidebar-link active">References</a>
</div>
</li>
</ul>
</div>
</nav>
<!-- margin-sidebar -->
<div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
</div>
<!-- main -->
<main class="content" id="quarto-document-content">
<header id="title-block-header" class="quarto-title-block default">
<div class="quarto-title">
<h1 class="title d-none d-lg-block">References</h1>
</div>
<div class="quarto-title-meta">
</div>
</header>
<div id="refs" class="references csl-bib-body hanging-indent" role="doc-bibliography">
<div id="ref-cock2010sanger" class="csl-entry" role="doc-biblioentry">
Cock, Peter JA, Christopher J Fields, Naohisa Goto, Michael L Heuer, and
Peter M Rice. 2010. <span>“The Sanger FASTQ File Format for Sequences
with Quality Scores, and the Solexa/Illumina FASTQ Variants.”</span>
<em>Nucleic Acids Research</em> 38 (6): 1767–71.
</div>
</div>
</main> <!-- /main -->
<script id="quarto-html-after-body" type="application/javascript">
window.document.addEventListener("DOMContentLoaded", function (event) {
const toggleBodyColorMode = (bsSheetEl) => {
const mode = bsSheetEl.getAttribute("data-mode");
const bodyEl = window.document.querySelector("body");
if (mode === "dark") {
bodyEl.classList.add("quarto-dark");
bodyEl.classList.remove("quarto-light");
} else {
bodyEl.classList.add("quarto-light");
bodyEl.classList.remove("quarto-dark");
}
}
const toggleBodyColorPrimary = () => {
const bsSheetEl = window.document.querySelector("link#quarto-bootstrap");
if (bsSheetEl) {
toggleBodyColorMode(bsSheetEl);
}
}
toggleBodyColorPrimary();
const icon = "";
const anchorJS = new window.AnchorJS();
anchorJS.options = {
placement: 'right',
icon: icon
};
anchorJS.add('.anchored');
const clipboard = new window.ClipboardJS('.code-copy-button', {
target: function(trigger) {
return trigger.previousElementSibling;
}
});
clipboard.on('success', function(e) {
// button target
const button = e.trigger;
// don't keep focus
button.blur();
// flash "checked"
button.classList.add('code-copy-button-checked');
var currentTitle = button.getAttribute("title");
button.setAttribute("title", "Copied!");
let tooltip;
if (window.bootstrap) {
button.setAttribute("data-bs-toggle", "tooltip");
button.setAttribute("data-bs-placement", "left");
button.setAttribute("data-bs-title", "Copied!");
tooltip = new bootstrap.Tooltip(button,
{ trigger: "manual",
customClass: "code-copy-button-tooltip",
offset: [0, -8]});
tooltip.show();
}
setTimeout(function() {
if (tooltip) {
tooltip.hide();
button.removeAttribute("data-bs-title");
button.removeAttribute("data-bs-toggle");
button.removeAttribute("data-bs-placement");
}
button.setAttribute("title", currentTitle);
button.classList.remove('code-copy-button-checked');
}, 1000);
// clear code selection
e.clearSelection();
});
function tippyHover(el, contentFn) {
const config = {
allowHTML: true,
content: contentFn,
maxWidth: 500,
delay: 100,
arrow: false,
appendTo: function(el) {
return el.parentElement;
},
interactive: true,
interactiveBorder: 10,
theme: 'quarto',
placement: 'bottom-start'
};
window.tippy(el, config);
}
const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
for (var i=0; i<noterefs.length; i++) {
const ref = noterefs[i];
tippyHover(ref, function() {
// use id or data attribute instead here
let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
try { href = new URL(href).hash; } catch {}
const id = href.replace(/^#\/?/, "");
const note = window.document.getElementById(id);
return note.innerHTML;
});
}
const findCites = (el) => {
const parentEl = el.parentElement;
if (parentEl) {
const cites = parentEl.dataset.cites;
if (cites) {
return {
el,
cites: cites.split(' ')
};
} else {
return findCites(el.parentElement)
}
} else {
return undefined;
}
};
var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
for (var i=0; i<bibliorefs.length; i++) {
const ref = bibliorefs[i];
const citeInfo = findCites(ref);
if (citeInfo) {
tippyHover(citeInfo.el, function() {
var popup = window.document.createElement('div');
citeInfo.cites.forEach(function(cite) {
var citeDiv = window.document.createElement('div');
citeDiv.classList.add('hanging-indent');
citeDiv.classList.add('csl-entry');
var biblioDiv = window.document.getElementById('ref-' + cite);
if (biblioDiv) {
citeDiv.innerHTML = biblioDiv.innerHTML;
}
popup.appendChild(citeDiv);
});
return popup.innerHTML;
});
}
}
});
</script>
<nav class="page-navigation">
<div class="nav-page nav-page-previous">
<a href="./annexes.html" class="pagination-link">
<i class="bi bi-arrow-left-short"></i> <span class="nav-page-text"><span class="chapter-number">4</span>&nbsp; <span class="chapter-title">Annexes</span></span>
</a>
</div>
<div class="nav-page nav-page-next">
</div>
</nav>
</div> <!-- /content -->
</body></html>

121
doc/_book/search.json Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large

Binary file not shown.

@ -0,0 +1,171 @@
/* quarto syntax highlight colors */
:root {
--quarto-hl-ot-color: #003B4F;
--quarto-hl-at-color: #657422;
--quarto-hl-ss-color: #20794D;
--quarto-hl-an-color: #5E5E5E;
--quarto-hl-fu-color: #4758AB;
--quarto-hl-st-color: #20794D;
--quarto-hl-cf-color: #003B4F;
--quarto-hl-op-color: #5E5E5E;
--quarto-hl-er-color: #AD0000;
--quarto-hl-bn-color: #AD0000;
--quarto-hl-al-color: #AD0000;
--quarto-hl-va-color: #111111;
--quarto-hl-bu-color: inherit;
--quarto-hl-ex-color: inherit;
--quarto-hl-pp-color: #AD0000;
--quarto-hl-in-color: #5E5E5E;
--quarto-hl-vs-color: #20794D;
--quarto-hl-wa-color: #5E5E5E;
--quarto-hl-do-color: #5E5E5E;
--quarto-hl-im-color: #00769E;
--quarto-hl-ch-color: #20794D;
--quarto-hl-dt-color: #AD0000;
--quarto-hl-fl-color: #AD0000;
--quarto-hl-co-color: #5E5E5E;
--quarto-hl-cv-color: #5E5E5E;
--quarto-hl-cn-color: #8f5902;
--quarto-hl-sc-color: #5E5E5E;
--quarto-hl-dv-color: #AD0000;
--quarto-hl-kw-color: #003B4F;
}
/* other quarto variables */
:root {
--quarto-font-monospace: SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace;
}
pre > code.sourceCode > span {
color: #003B4F;
}
code span {
color: #003B4F;
}
code.sourceCode > span {
color: #003B4F;
}
div.sourceCode,
div.sourceCode pre.sourceCode {
color: #003B4F;
}
code span.ot {
color: #003B4F;
}
code span.at {
color: #657422;
}
code span.ss {
color: #20794D;
}
code span.an {
color: #5E5E5E;
}
code span.fu {
color: #4758AB;
}
code span.st {
color: #20794D;
}
code span.cf {
color: #003B4F;
}
code span.op {
color: #5E5E5E;
}
code span.er {
color: #AD0000;
}
code span.bn {
color: #AD0000;
}
code span.al {
color: #AD0000;
}
code span.va {
color: #111111;
}
code span.pp {
color: #AD0000;
}
code span.in {
color: #5E5E5E;
}
code span.vs {
color: #20794D;
}
code span.wa {
color: #5E5E5E;
font-style: italic;
}
code span.do {
color: #5E5E5E;
font-style: italic;
}
code span.im {
color: #00769E;
}
code span.ch {
color: #20794D;
}
code span.dt {
color: #AD0000;
}
code span.fl {
color: #AD0000;
}
code span.co {
color: #5E5E5E;
}
code span.cv {
color: #5E5E5E;
font-style: italic;
}
code span.cn {
color: #8f5902;
}
code span.sc {
color: #5E5E5E;
}
code span.dv {
color: #AD0000;
}
code span.kw {
color: #003B4F;
}
.prevent-inlining {
content: "</";
}
/*# sourceMappingURL=debc5d5d77c3f9108843748ff7464032.css.map */

@ -0,0 +1,770 @@
const sectionChanged = new CustomEvent("quarto-sectionChanged", {
detail: {},
bubbles: true,
cancelable: false,
composed: false,
});
window.document.addEventListener("DOMContentLoaded", function (_event) {
const tocEl = window.document.querySelector('nav.toc-active[role="doc-toc"]');
const sidebarEl = window.document.getElementById("quarto-sidebar");
const leftTocEl = window.document.getElementById("quarto-sidebar-toc-left");
const marginSidebarEl = window.document.getElementById(
"quarto-margin-sidebar"
);
// function to determine whether the element has a previous sibling that is active
const prevSiblingIsActiveLink = (el) => {
const sibling = el.previousElementSibling;
if (sibling && sibling.tagName === "A") {
return sibling.classList.contains("active");
} else {
return false;
}
};
// fire slideEnter for bootstrap tab activations (for htmlwidget resize behavior)
function fireSlideEnter(e) {
const event = window.document.createEvent("Event");
event.initEvent("slideenter", true, true);
window.document.dispatchEvent(event);
}
const tabs = window.document.querySelectorAll('a[data-bs-toggle="tab"]');
tabs.forEach((tab) => {
tab.addEventListener("shown.bs.tab", fireSlideEnter);
});
// fire slideEnter for tabby tab activations (for htmlwidget resize behavior)
document.addEventListener("tabby", fireSlideEnter, false);
// Track scrolling and mark TOC links as active
// get table of contents and sidebar (bail if we don't have at least one)
const tocLinks = tocEl
? [...tocEl.querySelectorAll("a[data-scroll-target]")]
: [];
const makeActive = (link) => tocLinks[link].classList.add("active");
const removeActive = (link) => tocLinks[link].classList.remove("active");
const removeAllActive = () =>
[...Array(tocLinks.length).keys()].forEach((link) => removeActive(link));
// activate the anchor for a section associated with this TOC entry
tocLinks.forEach((link) => {
link.addEventListener("click", () => {
if (link.href.indexOf("#") !== -1) {
const anchor = link.href.split("#")[1];
const heading = window.document.querySelector(
`[data-anchor-id=${anchor}]`
);
if (heading) {
// Add the class
heading.classList.add("reveal-anchorjs-link");
// function to hide the anchor again once the pointer leaves the heading
const handleMouseout = () => {
heading.classList.remove("reveal-anchorjs-link");
heading.removeEventListener("mouseout", handleMouseout);
};
// add a function to clear the anchor when the user mouses out of it
heading.addEventListener("mouseout", handleMouseout);
}
}
});
});
const sections = tocLinks.map((link) => {
const target = link.getAttribute("data-scroll-target");
if (target.startsWith("#")) {
return window.document.getElementById(decodeURI(`${target.slice(1)}`));
} else {
return window.document.querySelector(decodeURI(`${target}`));
}
});
const sectionMargin = 200;
let currentActive = 0;
// track whether we've initialized state the first time
let init = false;
const updateActiveLink = () => {
// The index from bottom to top (e.g. reversed list)
let sectionIndex = -1;
if (
window.innerHeight + window.pageYOffset >=
window.document.body.offsetHeight
) {
sectionIndex = 0;
} else {
sectionIndex = [...sections].reverse().findIndex((section) => {
if (section) {
return window.pageYOffset >= section.offsetTop - sectionMargin;
} else {
return false;
}
});
}
if (sectionIndex > -1) {
const current = sections.length - sectionIndex - 1;
if (current !== currentActive) {
removeAllActive();
currentActive = current;
makeActive(current);
if (init) {
window.dispatchEvent(sectionChanged);
}
init = true;
}
}
};
const inHiddenRegion = (top, bottom, hiddenRegions) => {
for (const region of hiddenRegions) {
if (top <= region.bottom && bottom >= region.top) {
return true;
}
}
return false;
};
const categorySelector = "header.quarto-title-block .quarto-category";
const activateCategories = (href) => {
// Find any categories
// Surround them with a link pointing back to:
// #category=Authoring
try {
const categoryEls = window.document.querySelectorAll(categorySelector);
for (const categoryEl of categoryEls) {
const categoryText = categoryEl.textContent;
if (categoryText) {
const link = `${href}#category=${encodeURIComponent(categoryText)}`;
const linkEl = window.document.createElement("a");
linkEl.setAttribute("href", link);
for (const child of categoryEl.childNodes) {
linkEl.append(child);
}
categoryEl.appendChild(linkEl);
}
}
} catch {
// Ignore errors
}
};
function hasTitleCategories() {
return window.document.querySelector(categorySelector) !== null;
}
function offsetRelativeUrl(url) {
const offset = getMeta("quarto:offset");
return offset ? offset + url : url;
}
function offsetAbsoluteUrl(url) {
const offset = getMeta("quarto:offset");
const baseUrl = new URL(offset, window.location);
const projRelativeUrl = url.replace(baseUrl, "");
if (projRelativeUrl.startsWith("/")) {
return projRelativeUrl;
} else {
return "/" + projRelativeUrl;
}
}
// read a meta tag value
function getMeta(metaName) {
const metas = window.document.getElementsByTagName("meta");
for (let i = 0; i < metas.length; i++) {
if (metas[i].getAttribute("name") === metaName) {
return metas[i].getAttribute("content");
}
}
return "";
}
async function findAndActivateCategories() {
const currentPagePath = offsetAbsoluteUrl(window.location.href);
const response = await fetch(offsetRelativeUrl("listings.json"));
if (response.status == 200) {
return response.json().then(function (listingPaths) {
const listingHrefs = [];
for (const listingPath of listingPaths) {
const pathWithoutLeadingSlash = listingPath.listing.substring(1);
for (const item of listingPath.items) {
if (
item === currentPagePath ||
item === currentPagePath + "index.html"
) {
// Resolve this path against the offset to be sure
// we already are using the correct path to the listing
// (this adjusts the listing urls to be rooted against
// whatever root the page is actually running against)
const relative = offsetRelativeUrl(pathWithoutLeadingSlash);
const baseUrl = window.location;
const resolvedPath = new URL(relative, baseUrl);
listingHrefs.push(resolvedPath.pathname);
break;
}
}
}
// Look up the tree for a nearby listing and use that if we find one
const nearestListing = findNearestParentListing(
offsetAbsoluteUrl(window.location.pathname),
listingHrefs
);
if (nearestListing) {
activateCategories(nearestListing);
} else {
// See if the referrer is a listing page for this item
const referredRelativePath = offsetAbsoluteUrl(document.referrer);
const referrerListing = listingHrefs.find((listingHref) => {
const isListingReferrer =
listingHref === referredRelativePath ||
listingHref === referredRelativePath + "index.html";
return isListingReferrer;
});
if (referrerListing) {
// Try to use the referrer if possible
activateCategories(referrerListing);
} else if (listingHrefs.length > 0) {
// Otherwise, just fall back to the first listing
activateCategories(listingHrefs[0]);
}
}
});
}
}
if (hasTitleCategories()) {
findAndActivateCategories();
}
const findNearestParentListing = (href, listingHrefs) => {
if (!href || !listingHrefs) {
return undefined;
}
// Look up the tree for a nearby listing and use that if we find one
const relativeParts = href.substring(1).split("/");
while (relativeParts.length > 0) {
const path = relativeParts.join("/");
for (const listingHref of listingHrefs) {
if (listingHref.startsWith(path)) {
return listingHref;
}
}
relativeParts.pop();
}
return undefined;
};
const manageSidebarVisiblity = (el, placeholderDescriptor) => {
let isVisible = true;
return (hiddenRegions) => {
if (el === null) {
return;
}
// Find the last element of the TOC
const lastChildEl = el.lastElementChild;
if (lastChildEl) {
// Find the top and bottom of the element that is being managed
const elTop = el.offsetTop;
const elBottom =
elTop + lastChildEl.offsetTop + lastChildEl.offsetHeight;
// Converts the sidebar to a menu
const convertToMenu = () => {
for (const child of el.children) {
child.style.opacity = 0;
child.style.overflow = "hidden";
}
const toggleContainer = window.document.createElement("div");
toggleContainer.style.width = "100%";
toggleContainer.classList.add("zindex-over-content");
toggleContainer.classList.add("quarto-sidebar-toggle");
toggleContainer.classList.add("headroom-target"); // Marks this to be managed by headroom
toggleContainer.id = placeholderDescriptor.id;
toggleContainer.style.position = "fixed";
const toggleIcon = window.document.createElement("i");
toggleIcon.classList.add("quarto-sidebar-toggle-icon");
toggleIcon.classList.add("bi");
toggleIcon.classList.add("bi-caret-down-fill");
const toggleTitle = window.document.createElement("div");
const titleEl = window.document.body.querySelector(
placeholderDescriptor.titleSelector
);
if (titleEl) {
toggleTitle.append(titleEl.innerText, toggleIcon);
}
toggleTitle.classList.add("zindex-over-content");
toggleTitle.classList.add("quarto-sidebar-toggle-title");
toggleContainer.append(toggleTitle);
const toggleContents = window.document.createElement("div");
toggleContents.classList = el.classList;
toggleContents.classList.add("zindex-over-content");
toggleContents.classList.add("quarto-sidebar-toggle-contents");
for (const child of el.children) {
if (child.id === "toc-title") {
continue;
}
const clone = child.cloneNode(true);
clone.style.opacity = 1;
clone.style.display = null;
toggleContents.append(clone);
}
toggleContents.style.height = "0px";
toggleContainer.append(toggleContents);
el.parentElement.prepend(toggleContainer);
// Process clicks
let tocShowing = false;
// Allow the caller to control whether this is dismissed
// when it is clicked (e.g. sidebar navigation supports
// opening and closing the nav tree, so don't dismiss on click)
const clickEl = placeholderDescriptor.dismissOnClick
? toggleContainer
: toggleTitle;
const closeToggle = () => {
if (tocShowing) {
toggleContainer.classList.remove("expanded");
toggleContents.style.height = "0px";
tocShowing = false;
}
};
const positionToggle = () => {
// position the element (top left of parent, same width as parent)
const elRect = el.getBoundingClientRect();
toggleContainer.style.left = `${elRect.left}px`;
toggleContainer.style.top = `${elRect.top}px`;
toggleContainer.style.width = `${elRect.width}px`;
};
// Get rid of any expanded toggle if the user scrolls
window.document.addEventListener(
"scroll",
throttle(() => {
closeToggle();
}, 50)
);
// Handle positioning of the toggle
window.addEventListener(
"resize",
throttle(() => {
positionToggle();
}, 50)
);
positionToggle();
// Process the click
clickEl.onclick = () => {
if (!tocShowing) {
toggleContainer.classList.add("expanded");
toggleContents.style.height = null;
tocShowing = true;
} else {
closeToggle();
}
};
};
// Converts a sidebar from a menu back to a sidebar
const convertToSidebar = () => {
for (const child of el.children) {
child.style.opacity = 1;
child.style.overflow = null;
}
const placeholderEl = window.document.getElementById(
placeholderDescriptor.id
);
if (placeholderEl) {
placeholderEl.remove();
}
el.classList.remove("rollup");
};
if (isReaderMode()) {
convertToMenu();
isVisible = false;
} else {
if (!isVisible) {
// If the element is currently not visible, reveal it if there are
// no conflicts with overlay regions
if (!inHiddenRegion(elTop, elBottom, hiddenRegions)) {
convertToSidebar();
isVisible = true;
}
} else {
// If the element is visible, hide it if it conflicts with overlay regions
// and insert a placeholder toggle (or if we're in reader mode)
if (inHiddenRegion(elTop, elBottom, hiddenRegions)) {
convertToMenu();
isVisible = false;
}
}
}
}
};
};
// Find any conflicting margin elements and add margins to the
// top to prevent overlap
const marginChildren = window.document.querySelectorAll(
".column-margin.column-container > * "
);
nexttick(() => {
let lastBottom = 0;
for (const marginChild of marginChildren) {
const top = marginChild.getBoundingClientRect().top + window.scrollY;
if (top < lastBottom) {
const margin = lastBottom - top;
marginChild.style.marginTop = `${margin}px`;
}
const styles = window.getComputedStyle(marginChild);
const marginTop = parseFloat(styles["marginTop"]);
lastBottom = top + marginChild.getBoundingClientRect().height + marginTop;
}
});
// Manage the visibility of the toc and the sidebar
const marginScrollVisibility = manageSidebarVisiblity(marginSidebarEl, {
id: "quarto-toc-toggle",
titleSelector: "#toc-title",
dismissOnClick: true,
});
const sidebarScrollVisiblity = manageSidebarVisiblity(sidebarEl, {
id: "quarto-sidebarnav-toggle",
titleSelector: ".title",
dismissOnClick: false,
});
let tocLeftScrollVisibility;
if (leftTocEl) {
tocLeftScrollVisibility = manageSidebarVisiblity(leftTocEl, {
id: "quarto-lefttoc-toggle",
titleSelector: "#toc-title",
dismissOnClick: true,
});
}
// Find the first element that uses formatting in special columns
const conflictingEls = window.document.body.querySelectorAll(
'[class^="column-"], [class*=" column-"], aside, [class*="margin-caption"], [class*=" margin-caption"], [class*="margin-ref"], [class*=" margin-ref"]'
);
// Filter all the possibly conflicting elements into the ones
// that do conflict on the left or right side
const arrConflictingEls = Array.from(conflictingEls);
const leftSideConflictEls = arrConflictingEls.filter((el) => {
if (el.tagName === "ASIDE") {
return false;
}
return Array.from(el.classList).find((className) => {
return (
className !== "column-body" &&
className.startsWith("column-") &&
!className.endsWith("right") &&
!className.endsWith("container") &&
className !== "column-margin"
);
});
});
const rightSideConflictEls = arrConflictingEls.filter((el) => {
if (el.tagName === "ASIDE") {
return true;
}
const hasMarginCaption = Array.from(el.classList).find((className) => {
return className == "margin-caption";
});
if (hasMarginCaption) {
return true;
}
return Array.from(el.classList).find((className) => {
return (
className !== "column-body" &&
!className.endsWith("container") &&
className.startsWith("column-") &&
!className.endsWith("left")
);
});
});
const kOverlapPaddingSize = 10;
function toRegions(els) {
return els.map((el) => {
const top =
el.getBoundingClientRect().top +
document.documentElement.scrollTop -
kOverlapPaddingSize;
return {
top,
bottom: top + el.scrollHeight + 2 * kOverlapPaddingSize,
};
});
}
const hideOverlappedSidebars = () => {
marginScrollVisibility(toRegions(rightSideConflictEls));
sidebarScrollVisiblity(toRegions(leftSideConflictEls));
if (tocLeftScrollVisibility) {
tocLeftScrollVisibility(toRegions(leftSideConflictEls));
}
};
window.quartoToggleReader = () => {
// Applies a slow class (or removes it)
// to update the transition speed
const slowTransition = (slow) => {
const manageTransition = (id, slow) => {
const el = document.getElementById(id);
if (el) {
if (slow) {
el.classList.add("slow");
} else {
el.classList.remove("slow");
}
}
};
manageTransition("TOC", slow);
manageTransition("quarto-sidebar", slow);
};
const readerMode = !isReaderMode();
setReaderModeValue(readerMode);
// If we're entering reader mode, slow the transition
if (readerMode) {
slowTransition(readerMode);
}
highlightReaderToggle(readerMode);
hideOverlappedSidebars();
// If we're exiting reader mode, restore the non-slow transition
if (!readerMode) {
slowTransition(!readerMode);
}
};
const highlightReaderToggle = (readerMode) => {
const els = document.querySelectorAll(".quarto-reader-toggle");
if (els) {
els.forEach((el) => {
if (readerMode) {
el.classList.add("reader");
} else {
el.classList.remove("reader");
}
});
}
};
const setReaderModeValue = (val) => {
if (window.location.protocol !== "file:") {
window.localStorage.setItem("quarto-reader-mode", val);
} else {
localReaderMode = val;
}
};
const isReaderMode = () => {
if (window.location.protocol !== "file:") {
return window.localStorage.getItem("quarto-reader-mode") === "true";
} else {
return localReaderMode;
}
};
let localReaderMode = null;
// Walk the TOC and collapse/expand nodes
// Nodes are expanded if:
// - they are top level
// - they have children that are 'active' links
// - they are directly below a link that is 'active'
const walk = (el, depth) => {
// Tick depth when we enter a UL
if (el.tagName === "UL") {
depth = depth + 1;
}
// Is this an active link?
let isActiveNode = false;
if (el.tagName === "A" && el.classList.contains("active")) {
isActiveNode = true;
}
// See if there is an active child to this element
let hasActiveChild = false;
for (const child of el.children) {
hasActiveChild = walk(child, depth) || hasActiveChild;
}
// Process the collapse state if this is an UL
if (el.tagName === "UL") {
if (depth === 1 || hasActiveChild || prevSiblingIsActiveLink(el)) {
el.classList.remove("collapse");
} else {
el.classList.add("collapse");
}
// untick depth when we leave a UL
depth = depth - 1;
}
return hasActiveChild || isActiveNode;
};
// walk the TOC and expand / collapse any items that should be shown
if (tocEl) {
walk(tocEl, 0);
updateActiveLink();
}
// Throttle the scroll event and walk periodically
window.document.addEventListener(
"scroll",
throttle(() => {
if (tocEl) {
updateActiveLink();
walk(tocEl, 0);
}
if (!isReaderMode()) {
hideOverlappedSidebars();
}
}, 5)
);
window.addEventListener(
"resize",
throttle(() => {
if (!isReaderMode()) {
hideOverlappedSidebars();
}
}, 10)
);
hideOverlappedSidebars();
highlightReaderToggle(isReaderMode());
});
// grouped tabsets
window.addEventListener("pageshow", (_event) => {
function getTabSettings() {
const data = localStorage.getItem("quarto-persistent-tabsets-data");
if (!data) {
localStorage.setItem("quarto-persistent-tabsets-data", "{}");
return {};
}
if (data) {
return JSON.parse(data);
}
}
function setTabSettings(data) {
localStorage.setItem(
"quarto-persistent-tabsets-data",
JSON.stringify(data)
);
}
function setTabState(groupName, groupValue) {
const data = getTabSettings();
data[groupName] = groupValue;
setTabSettings(data);
}
function toggleTab(tab, active) {
const tabPanelId = tab.getAttribute("aria-controls");
const tabPanel = document.getElementById(tabPanelId);
if (active) {
tab.classList.add("active");
tabPanel.classList.add("active");
} else {
tab.classList.remove("active");
tabPanel.classList.remove("active");
}
}
function toggleAll(selectedGroup, selectorsToSync) {
for (const [thisGroup, tabs] of Object.entries(selectorsToSync)) {
const active = selectedGroup === thisGroup;
for (const tab of tabs) {
toggleTab(tab, active);
}
}
}
function findSelectorsToSyncByLanguage() {
const result = {};
const tabs = Array.from(
document.querySelectorAll(`div[data-group] a[id^='tabset-']`)
);
for (const item of tabs) {
const div = item.parentElement.parentElement.parentElement;
const group = div.getAttribute("data-group");
if (!result[group]) {
result[group] = {};
}
const selectorsToSync = result[group];
const value = item.innerHTML;
if (!selectorsToSync[value]) {
selectorsToSync[value] = [];
}
selectorsToSync[value].push(item);
}
return result;
}
function setupSelectorSync() {
const selectorsToSync = findSelectorsToSyncByLanguage();
Object.entries(selectorsToSync).forEach(([group, tabSetsByValue]) => {
Object.entries(tabSetsByValue).forEach(([value, items]) => {
items.forEach((item) => {
item.addEventListener("click", (_event) => {
setTabState(group, value);
toggleAll(value, selectorsToSync[group]);
});
});
});
});
return selectorsToSync;
}
const selectorsToSync = setupSelectorSync();
for (const [group, selectedName] of Object.entries(getTabSettings())) {
const selectors = selectorsToSync[group];
// it's possible that stale state gives us empty selections, so we explicitly check here.
if (selectors) {
toggleAll(selectedName, selectors);
}
}
});
function throttle(func, wait) {
let waiting = false;
return function () {
if (!waiting) {
func.apply(this, arguments);
waiting = true;
setTimeout(function () {
waiting = false;
}, wait);
}
};
}
function nexttick(func) {
return setTimeout(func, 0);
}

@ -0,0 +1 @@
.tippy-box[data-animation=fade][data-state=hidden]{opacity:0}[data-tippy-root]{max-width:calc(100vw - 10px)}.tippy-box{position:relative;background-color:#333;color:#fff;border-radius:4px;font-size:14px;line-height:1.4;white-space:normal;outline:0;transition-property:transform,visibility,opacity}.tippy-box[data-placement^=top]>.tippy-arrow{bottom:0}.tippy-box[data-placement^=top]>.tippy-arrow:before{bottom:-7px;left:0;border-width:8px 8px 0;border-top-color:initial;transform-origin:center top}.tippy-box[data-placement^=bottom]>.tippy-arrow{top:0}.tippy-box[data-placement^=bottom]>.tippy-arrow:before{top:-7px;left:0;border-width:0 8px 8px;border-bottom-color:initial;transform-origin:center bottom}.tippy-box[data-placement^=left]>.tippy-arrow{right:0}.tippy-box[data-placement^=left]>.tippy-arrow:before{border-width:8px 0 8px 8px;border-left-color:initial;right:-7px;transform-origin:center left}.tippy-box[data-placement^=right]>.tippy-arrow{left:0}.tippy-box[data-placement^=right]>.tippy-arrow:before{left:-7px;border-width:8px 8px 8px 0;border-right-color:initial;transform-origin:center right}.tippy-box[data-inertia][data-state=visible]{transition-timing-function:cubic-bezier(.54,1.5,.38,1.11)}.tippy-arrow{width:16px;height:16px;color:#333}.tippy-arrow:before{content:"";position:absolute;border-color:transparent;border-style:solid}.tippy-content{position:relative;padding:5px 9px;z-index:1}

File diff suppressed because one or more lines are too long

View File

@ -0,0 +1,7 @@
/*!
* headroom.js v0.12.0 - Give your page some headroom. Hide your header until you need it
* Copyright (c) 2020 Nick Williams - http://wicky.nillia.ms/headroom.js
* License: MIT
*/
!function(t,n){"object"==typeof exports&&"undefined"!=typeof module?module.exports=n():"function"==typeof define&&define.amd?define(n):(t=t||self).Headroom=n()}(this,function(){"use strict";function t(){return"undefined"!=typeof window}function d(t){return function(t){return t&&t.document&&function(t){return 9===t.nodeType}(t.document)}(t)?function(t){var n=t.document,o=n.body,s=n.documentElement;return{scrollHeight:function(){return Math.max(o.scrollHeight,s.scrollHeight,o.offsetHeight,s.offsetHeight,o.clientHeight,s.clientHeight)},height:function(){return t.innerHeight||s.clientHeight||o.clientHeight},scrollY:function(){return void 0!==t.pageYOffset?t.pageYOffset:(s||o.parentNode||o).scrollTop}}}(t):function(t){return{scrollHeight:function(){return Math.max(t.scrollHeight,t.offsetHeight,t.clientHeight)},height:function(){return Math.max(t.offsetHeight,t.clientHeight)},scrollY:function(){return t.scrollTop}}}(t)}function n(t,s,e){var n,o=function(){var n=!1;try{var t={get passive(){n=!0}};window.addEventListener("test",t,t),window.removeEventListener("test",t,t)}catch(t){n=!1}return n}(),i=!1,r=d(t),l=r.scrollY(),a={};function c(){var t=Math.round(r.scrollY()),n=r.height(),o=r.scrollHeight();a.scrollY=t,a.lastScrollY=l,a.direction=l<t?"down":"up",a.distance=Math.abs(t-l),a.isOutOfBounds=t<0||o<t+n,a.top=t<=s.offset[a.direction],a.bottom=o<=t+n,a.toleranceExceeded=a.distance>s.tolerance[a.direction],e(a),l=t,i=!1}function h(){i||(i=!0,n=requestAnimationFrame(c))}var u=!!o&&{passive:!0,capture:!1};return t.addEventListener("scroll",h,u),c(),{destroy:function(){cancelAnimationFrame(n),t.removeEventListener("scroll",h,u)}}}function o(t){return t===Object(t)?t:{down:t,up:t}}function s(t,n){n=n||{},Object.assign(this,s.options,n),this.classes=Object.assign({},s.options.classes,n.classes),this.elem=t,this.tolerance=o(this.tolerance),this.offset=o(this.offset),this.initialised=!1,this.frozen=!1}return s.prototype={constructor:s,init:function(){return 
s.cutsTheMustard&&!this.initialised&&(this.addClass("initial"),this.initialised=!0,setTimeout(function(t){t.scrollTracker=n(t.scroller,{offset:t.offset,tolerance:t.tolerance},t.update.bind(t))},100,this)),this},destroy:function(){this.initialised=!1,Object.keys(this.classes).forEach(this.removeClass,this),this.scrollTracker.destroy()},unpin:function(){!this.hasClass("pinned")&&this.hasClass("unpinned")||(this.addClass("unpinned"),this.removeClass("pinned"),this.onUnpin&&this.onUnpin.call(this))},pin:function(){this.hasClass("unpinned")&&(this.addClass("pinned"),this.removeClass("unpinned"),this.onPin&&this.onPin.call(this))},freeze:function(){this.frozen=!0,this.addClass("frozen")},unfreeze:function(){this.frozen=!1,this.removeClass("frozen")},top:function(){this.hasClass("top")||(this.addClass("top"),this.removeClass("notTop"),this.onTop&&this.onTop.call(this))},notTop:function(){this.hasClass("notTop")||(this.addClass("notTop"),this.removeClass("top"),this.onNotTop&&this.onNotTop.call(this))},bottom:function(){this.hasClass("bottom")||(this.addClass("bottom"),this.removeClass("notBottom"),this.onBottom&&this.onBottom.call(this))},notBottom:function(){this.hasClass("notBottom")||(this.addClass("notBottom"),this.removeClass("bottom"),this.onNotBottom&&this.onNotBottom.call(this))},shouldUnpin:function(t){return"down"===t.direction&&!t.top&&t.toleranceExceeded},shouldPin:function(t){return"up"===t.direction&&t.toleranceExceeded||t.top},addClass:function(t){this.elem.classList.add.apply(this.elem.classList,this.classes[t].split(" "))},removeClass:function(t){this.elem.classList.remove.apply(this.elem.classList,this.classes[t].split(" "))},hasClass:function(t){return this.classes[t].split(" ").every(function(t){return 
this.classList.contains(t)},this.elem)},update:function(t){t.isOutOfBounds||!0!==this.frozen&&(t.top?this.top():this.notTop(),t.bottom?this.bottom():this.notBottom(),this.shouldUnpin(t)?this.unpin():this.shouldPin(t)&&this.pin())}},s.options={tolerance:{up:0,down:0},offset:0,scroller:t()?window:null,classes:{frozen:"headroom--frozen",pinned:"headroom--pinned",unpinned:"headroom--unpinned",top:"headroom--top",notTop:"headroom--not-top",bottom:"headroom--bottom",notBottom:"headroom--not-bottom",initial:"headroom"}},s.cutsTheMustard=!!(t()&&function(){}.bind&&"classList"in document.documentElement&&Object.assign&&Object.keys&&requestAnimationFrame),s});

View File

@ -0,0 +1,221 @@
const headroomChanged = new CustomEvent("quarto-hrChanged", {
detail: {},
bubbles: true,
cancelable: false,
composed: false,
});
window.document.addEventListener("DOMContentLoaded", function () {
let init = false;
function throttle(func, wait) {
var timeout;
return function () {
const context = this;
const args = arguments;
const later = function () {
clearTimeout(timeout);
timeout = null;
func.apply(context, args);
};
if (!timeout) {
timeout = setTimeout(later, wait);
}
};
}
function headerOffset() {
// Set an offset if there is a fixed top navbar
const headerEl = window.document.querySelector("header.fixed-top");
if (headerEl) {
return headerEl.clientHeight;
} else {
return 0;
}
}
function footerOffset() {
const footerEl = window.document.querySelector("footer.footer");
if (footerEl) {
return footerEl.clientHeight;
} else {
return 0;
}
}
function updateDocumentOffsetWithoutAnimation() {
updateDocumentOffset(false);
}
function updateDocumentOffset(animated) {
// set body offset
const topOffset = headerOffset();
const bodyOffset = topOffset + footerOffset();
const bodyEl = window.document.body;
bodyEl.setAttribute("data-bs-offset", topOffset);
bodyEl.style.paddingTop = topOffset + "px";
// deal with sidebar offsets
const sidebars = window.document.querySelectorAll(
".sidebar, .headroom-target"
);
sidebars.forEach((sidebar) => {
if (!animated) {
sidebar.classList.add("notransition");
// Remove the no transition class after the animation has time to complete
setTimeout(function () {
sidebar.classList.remove("notransition");
}, 201);
}
if (window.Headroom && sidebar.classList.contains("sidebar-unpinned")) {
sidebar.style.top = "0";
sidebar.style.maxHeight = "100vh";
} else {
sidebar.style.top = topOffset + "px";
sidebar.style.maxHeight = "calc(100vh - " + topOffset + "px)";
}
});
// allow space for footer
const mainContainer = window.document.querySelector(".quarto-container");
if (mainContainer) {
mainContainer.style.minHeight = "calc(100vh - " + bodyOffset + "px)";
}
// link offset
let linkStyle = window.document.querySelector("#quarto-target-style");
if (!linkStyle) {
linkStyle = window.document.createElement("style");
window.document.head.appendChild(linkStyle);
}
while (linkStyle.firstChild) {
linkStyle.removeChild(linkStyle.firstChild);
}
if (topOffset > 0) {
linkStyle.appendChild(
window.document.createTextNode(`
section:target::before {
content: "";
display: block;
height: ${topOffset}px;
margin: -${topOffset}px 0 0;
}`)
);
}
if (init) {
window.dispatchEvent(headroomChanged);
}
init = true;
}
// initialize headroom
var header = window.document.querySelector("#quarto-header");
if (header && window.Headroom) {
const headroom = new window.Headroom(header, {
tolerance: 5,
onPin: function () {
const sidebars = window.document.querySelectorAll(
".sidebar, .headroom-target"
);
sidebars.forEach((sidebar) => {
sidebar.classList.remove("sidebar-unpinned");
});
updateDocumentOffset();
},
onUnpin: function () {
const sidebars = window.document.querySelectorAll(
".sidebar, .headroom-target"
);
sidebars.forEach((sidebar) => {
sidebar.classList.add("sidebar-unpinned");
});
updateDocumentOffset();
},
});
headroom.init();
let frozen = false;
window.quartoToggleHeadroom = function () {
if (frozen) {
headroom.unfreeze();
frozen = false;
} else {
headroom.freeze();
frozen = true;
}
};
}
// Observe size changed for the header
const headerEl = window.document.querySelector("header.fixed-top");
if (headerEl && window.ResizeObserver) {
const observer = new window.ResizeObserver(
updateDocumentOffsetWithoutAnimation
);
observer.observe(headerEl, {
attributes: true,
childList: true,
characterData: true,
});
} else {
window.addEventListener(
"resize",
throttle(updateDocumentOffsetWithoutAnimation, 50)
);
}
setTimeout(updateDocumentOffsetWithoutAnimation, 250);
// fixup index.html links if we aren't on the filesystem
if (window.location.protocol !== "file:") {
const links = window.document.querySelectorAll("a");
for (let i = 0; i < links.length; i++) {
links[i].href = links[i].href.replace(/\/index\.html/, "/");
}
// Fixup any sharing links that require urls
// Append url to any sharing urls
const sharingLinks = window.document.querySelectorAll(
"a.sidebar-tools-main-item"
);
for (let i = 0; i < sharingLinks.length; i++) {
const sharingLink = sharingLinks[i];
const href = sharingLink.getAttribute("href");
if (href) {
sharingLink.setAttribute(
"href",
href.replace("|url|", window.location.href)
);
}
}
// Scroll the active navigation item into view, if necessary
const navSidebar = window.document.querySelector("nav#quarto-sidebar");
if (navSidebar) {
// Find the active item
const activeItem = navSidebar.querySelector("li.sidebar-item a.active");
if (activeItem) {
// Wait for the scroll height and height to resolve by observing size changes on the
// nav element that is scrollable
const resizeObserver = new ResizeObserver((_entries) => {
// The bottom of the element
const elBottom = activeItem.offsetTop;
const viewBottom = navSidebar.scrollTop + navSidebar.clientHeight;
// The element height and scroll height are the same, then we are still loading
if (viewBottom !== navSidebar.scrollHeight) {
// Determine if the item isn't visible and scroll to it
if (elBottom >= viewBottom) {
navSidebar.scrollTop = elBottom;
}
// stop observing now since we've completed the scroll
resizeObserver.unobserve(navSidebar);
}
});
resizeObserver.observe(navSidebar);
}
}
}
});

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large Load Diff

25
doc/_quarto.yml Normal file
View File

@ -0,0 +1,25 @@
project:
  type: book
book:
  title: "OBITools V4"
  author: "Eric Coissac"
  date: "1/17/2023"
  chapters:
    - index.qmd
    - intro.qmd
    - commands.qmd
    - library.qmd
    - annexes.qmd
    - references.qmd
bibliography: book.bib
format:
  html:
    theme: cosmo
  pdf:
    documentclass: scrreprt

82
doc/annexes.qmd Normal file
View File

@ -0,0 +1,82 @@
# Annexes
### Sequence attributes
#### Reserved sequence attributes
##### `ali_dir`
###### Type : `string`
The attribute can contain one of two string values: `"left"` or `"right"`.
###### Set by the *obipairing* tool
The alignment algorithm used by *obipairing* is 3'-end gap free.
Two cases can occur when aligning the forward and reverse reads. If the
barcode is long enough, the reads overlap only at their 3' ends. In
that case, the alignment direction `ali_dir` is set to *left*. If the
barcode is shorter than the read length, the paired reads overlap at
their 5' ends, and the complete barcode is sequenced by both reads.
In that latter case, `ali_dir` is set to *right*.
##### `ali_length`
###### Set by the *obipairing* tool
Length of the aligned parts when merging forward and reverse reads.
##### `count` : the number of sequence occurrences
###### Set by the *obiuniq* tool
The `count` attribute indicates how many strictly identical sequences
have been merged into a single record. It contains an integer value. If it
is absent, the sequence record represents a single
occurrence of the sequence.
###### Getter : method `Count()`
The `Count()` method gives access to the `count` attribute as an
integer value. If the `count` attribute is not defined for the given
sequence, the value *1* is returned.
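The default-to-one behaviour described for `Count()` can be sketched as follows. This is only an illustration: the `record` type and its `annotations` field are hypothetical stand-ins, not the actual `obiseq` implementation, and falling back to 1 for a non-integer value is an assumption.

```go
package main

import "fmt"

// record is a hypothetical stand-in for a sequence record; it is not
// the obiseq.BioSequence type.
type record struct {
	annotations map[string]interface{}
}

// Count returns the value of the "count" attribute, or 1 when the
// attribute is absent (or, by assumption, not stored as an int).
func (r record) Count() int {
	if v, ok := r.annotations["count"]; ok {
		if n, ok := v.(int); ok {
			return n
		}
	}
	return 1
}

func main() {
	merged := record{annotations: map[string]interface{}{"count": 12}}
	single := record{annotations: map[string]interface{}{}}
	fmt.Println(merged.Count()) // 12
	fmt.Println(single.Count()) // 1
}
```

Returning 1 by default lets code that sums occurrences treat annotated and non-annotated records in the same way.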
##### `merged_*`
###### Type : `map[string]int`
###### Set by the *obiuniq* tool
The `-m` option of the *obiuniq* tool keeps track of the
distribution of the values stored in a given attribute of interest. Often
this option is used to summarise the distribution of a sequence variant
across samples when *obiuniq* is run after running *obimultiplex*. The
actual name of the attribute depends on the name of the monitored
attribute. If the `-m` option is used with the attribute *sample*, then this
attribute is named *merged_sample*.
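As a rough illustration of what a `merged_sample` value of type `map[string]int` represents, the sketch below combines the per-sample counts of two records holding the same sequence. The `mergeCounts` helper and the sample names are hypothetical; the way *obiuniq* actually builds the map is not shown here.

```go
package main

import "fmt"

// mergeCounts adds every (value, count) pair of src into dst and
// returns dst. A nil dst is replaced by a new map.
func mergeCounts(dst, src map[string]int) map[string]int {
	if dst == nil {
		dst = make(map[string]int, len(src))
	}
	for value, n := range src {
		dst[value] += n
	}
	return dst
}

func main() {
	// Hypothetical distributions of one sequence variant across samples.
	a := map[string]int{"sample_A": 3, "sample_B": 1}
	b := map[string]int{"sample_B": 4, "sample_C": 2}

	merged := mergeCounts(a, b)
	fmt.Println(merged["sample_A"], merged["sample_B"], merged["sample_C"]) // 3 5 2
}
```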
##### `mode`
###### Set by the *obipairing* tool
##### `obitag_ref_index`
###### Set by the *obirefidx* tool.
It summarises which taxonomic annotation a match to that sequence must
lead to, according to the number of differences between the query
sequence and the reference sequence carrying that tag.
###### Getter : method `Count()`
##### `pairing_mismatches`
###### Set by the *obipairing* tool
##### `score`
###### Set by the *obipairing* tool
##### `score_norm`
###### Set by the *obipairing* tool

View File

@ -0,0 +1,10 @@
@article{cock2010sanger,
title={The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants},
author={Cock, Peter JA and Fields, Christopher J and Goto, Naohisa and Heuer, Michael L and Rice, Peter M},
journal={Nucleic acids research},
volume={38},
number={6},
pages={1767--1771},
year={2010},
publisher={Oxford University Press}
}

121
doc/commands.qmd Normal file
View File

@ -0,0 +1,121 @@
# The *OBITools* commands
## Specifying the input files to *OBITools* commands
## Options common to most of the *OBITools* commands
### Specifying input format
Five sequence formats are accepted for input files. [Fasta](#classical-fasta "Fasta format description") and [Fastq](#classical-fastq "Fastq format description") are the main ones. EMBL and Genbank allow the use of flat files produced by these two international databases. The last one, ecoPCR, is maintained for compatibility with previous *OBITools* and allows *ecoPCR* outputs to be read as sequence files.
- `--ecopcr` Read data following the *ecoPCR* output format.
- `--embl` Read data following the *EMBL* flatfile format.
- `--genbank` Read data following the *Genbank* flatfile format.
Several encoding schemes have been proposed for quality scores in the [Fastq](#classical-fastq "Fastq format description") format. Currently, *OBITools* considers Sanger encoding as the standard. For compatibility with older datasets produced with *Solexa* sequencers, the following option forces the use of the corresponding quality encoding scheme when reading these older files.
- `--solexa` Decodes quality string according to the Solexa specification. (default: false)
### Specifying output format
Only two output sequence formats are supported by OBITools, Fasta and Fastq. Fastq is used when output sequences are associated with quality information. Otherwise, Fasta is the default format. However, it is possible to force the output format by using one of the following two options. Forcing the use of Fasta results in the loss of quality information. Conversely, when the Fastq format is forced with sequences that have no quality data, dummy qualities set to 40 for each nucleotide are added.
- `--fasta-output` Write sequences in Fasta format.
- `--fastq-output` Write sequences in Fastq format.
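The dummy qualities added when Fastq output is forced can be pictured with the short sketch below. It assumes the Sanger (Phred+33) encoding, in which a score of 40 is written as the character `I`; the helper name `dummyQualityLine` is hypothetical and is not part of OBITools.

```go
package main

import (
	"fmt"
	"strings"
)

// sangerOffset is the ASCII offset of the Sanger (Phred+33) encoding.
const sangerOffset = 33

// dummyQualityLine returns a Fastq quality line giving the same Phred
// score to each of the n nucleotides of a sequence.
func dummyQualityLine(n, phred int) string {
	return strings.Repeat(string(rune(phred+sangerOffset)), n)
}

func main() {
	seq := "ACGTGTCAGTCG"
	// Print a four-line Fastq record with a quality of 40 everywhere.
	fmt.Printf("@seq_GH0001\n%s\n+\n%s\n", seq, dummyQualityLine(len(seq), 40))
}
```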
OBITools allows multiple input files to be specified for a single command.
- `--no-order` When several input files are provided, indicates that there is no order among them. (default: false)
### Format of the annotations in Fasta and Fastq files
OBITools extend the [Fasta](#classical-fasta "Fasta format description") and [Fastq](#classical-fastq "Fastq format description") formats by introducing a format for their title lines that allows every sequence to be annotated. While the previous version of OBITools used an *ad-hoc* format for these annotations, this new version introduces the use of the standard JSON format to store them.
On input, OBITools automatically recognize the format of the annotations, but two options allow forcing the parser to follow one of them. You should normally not need to use these options.
- `--input-OBI-header` FASTA/FASTQ title line annotations follow OBI format. (default: false)
- `--input-json-header` FASTA/FASTQ title line annotations follow json format. (default: false)
On output, annotations are formatted by default using the new JSON format. For compatibility with previous versions of OBITools and with external scripts and software, it is possible to force the use of the previous OBITools format.
- `--output-OBI-header|-O` output FASTA/FASTQ title line annotations follow OBI format. (default: false)
- `--output-json-header` output FASTA/FASTQ title line annotations follow json format. (default: false)
#### System related options
- `--debug` (default: false)
- `--help|-h|-?` (default: false)
- `--max-cpu <int>` Number of parallel threads computing the result (default: 10)
- `--workers|-w <int>` Number of parallel threads computing the result (default: 9)
## OBITools expression language
Several OBITools (*e.g.* obigrep, obiannotate) allow the user to specify simple expressions to compute values or define predicates. These expressions are parsed and evaluated using the [gval](https://pkg.go.dev/github.com/PaesslerAG/gval "Gval (Go eVALuate) for evaluating arbitrary expressions Go-like expressions.") Go package, which allows Go-like expressions to be evaluated.
### Variables usable in the expression
#### sequence
`sequence` is the sequence object on which the expression is evaluated.
#### annotation
### Function defined in the language
#### len
#### ismap
#### hasattribute
#### min
#### max
### Accessing the sequence annotations
## Metabarcode design and quality assessment
#### `obipcr`
> Replaces `ecoPCR` from the original *OBITools*
## File format conversions
#### `obiconvert`
## Sequence annotations
#### `obitag`
## Computations on sequences
### `obipairing`
> Replaces `illuminapairedends` from the original *OBITools*
#### `obimultiplex`
> Replaces `ngsfilter` from the original *OBITools*
#### `obicomplement`
#### `obiclean`
#### `obiuniq`
## Sequence sampling and filtering
#### `obigrep`
### Utilities
#### `obicount`
#### `obidistribute`
#### `obifind`
> Replaces `ecofind` from the original *OBITools*

BIN
doc/cover.png Normal file

Binary file not shown.


5
doc/index.qmd Normal file
View File

@ -0,0 +1,5 @@
# Preface {.unnumbered}
The first version of *OBITools* started to be developed in 2005. This was at the beginning of the DNA metabarcoding story at the Laboratoire d'Ecologie Alpine (LECA) in Grenoble. At that time, with Pierre Taberlet and François Pompanon, we were thinking about the potential of this new methodology under development. Pierre and François focused more on the laboratory methods, while I was thinking more about the tools for analysing the sequences produced. Two ideas were behind this development. I wanted something modular, and something easy to extend. To achieve the first goal, I decided to implement *OBITools* as a suite of Unix commands mimicking the classic Unix commands but dedicated to sequence files. The basic Unix commands are very useful for automatically manipulating, parsing and editing text files. They work as a flow, line by line on the input text. The result is a new text file that can be used as input for the next command. Such a design makes it possible to quickly develop a text processing pipeline by chaining simple elementary operations. The *OBITools* are the exact counterpart of these basic Unix commands, but the basic information they process is a sequence (potentially spanning several lines of text), not a single line of text. Most *OBITools* consume sequence files and produce sequence files. Thus, the principles of chaining and modularity are respected. In order to be able to easily extend the *OBITools* to keep up with our evolving ideas about processing DNA metabarcoding data, it was decided to develop them using an interpreted language: Python. Python 2, the version available at the time, allowed us to develop the *OBITools* efficiently. When parts of the algorithms were computationally demanding, they were implemented in C and linked to the Python code.
Even though Python is not the most efficient language available, and even though computers were not as powerful as they are today, the size of the data we could produce using 454 sequencers or early Solexa machines was small enough to be processed in a reasonable time.

142
doc/intro.qmd Normal file
View File

@ -0,0 +1,142 @@
# The OBITools
## Aims of *OBITools*
## File formats usable with *OBITools*
### The sequence files
Sequences can be stored in various formats, and OBITools can read several of them. The central formats for sequence files manipulated by OBITools are the `fasta` and `fastq` formats. OBITools extend both of these formats by specifying a syntax for including, in the definition line, data qualifying the sequence. All file formats use the `IUPAC` code for encoding nucleotides.
### The IUPAC Code
The International Union of Pure and Applied Chemistry (IUPAC) defined the standard code for representing protein or DNA sequences.
#### Nucleic IUPAC Code {#DNA-IUPAC}
| **Code** | **Nucleotide** |
|----------|-----------------------------|
| A | Adenine |
| C | Cytosine |
| G | Guanine |
| T | Thymine |
| U | Uracil |
| R | Purine (A or G) |
| Y | Pyrimidine (C, T, or U) |
| M | C or A |
| K | T, U, or G |
| W | T, U, or A |
| S | C or G |
| B | C, T, U, or G (not A) |
| D | A, T, U, or G (not C) |
| H | A, T, U, or C (not G) |
| V | A, C, or G (not T, not U) |
| N | Any base (A, C, G, T, or U) |
### The *fasta* format {#classical-fasta}
The **fasta format** is certainly the most widely used sequence file format, most likely because of its great simplicity. It was originally created for the Lipman and Pearson [FASTA program](http://www.ncbi.nlm.nih.gov/pubmed/3162770?dopt=Citation). In addition to the classical fasta format, OBITools use an extended version of this format in which structured data are included in the title line.
In *fasta* format, a sequence is represented by a title line beginning with a **`>`** character, followed by the sequence itself written using the [IUPAC code](#DNA-IUPAC). The sequence is usually split over several lines of the same length (except for the last one):

    >my_sequence this is my pretty sequence
    ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
    GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
    AACGACGTTGCAGTACGTTGCAGT

There is no special format for the title line, except that this line should be unique. Usually the first word following the **\>** character is considered as the sequence identifier. The rest of the title line corresponds to a description of the sequence. Several sequences can be concatenated in the same file. The title line of the next sequence simply follows the end of the previous record:
>sequence_A this is my first pretty sequence
ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
AACGACGTTGCAGTACGTTGCAGT
>sequence_B this is my second pretty sequence
ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
AACGACGTTGCAGTACGTTGCAGT
>sequence_C this is my third pretty sequence
ACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGT
GTGCTGACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTACGTTGCAGTGTTT
AACGACGTTGCAGTACGTTGCAGT
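The structure of a classical fasta file (title line, identifier, description, sequence split over several lines) can be sketched with a small reader. This is a minimal illustration using only the Go standard library; the `fastaRecord` type and the `parseFasta` function are hypothetical and are not the parser used by OBITools.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// fastaRecord holds the three parts of a classical fasta entry.
type fastaRecord struct {
	ID, Definition, Sequence string
}

// parseFasta reads fasta-formatted text. The first word of a title
// line is the identifier, the rest of the line is the description,
// and the following lines are concatenated to rebuild the sequence.
func parseFasta(text string) []fastaRecord {
	var records []fastaRecord
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		switch {
		case line == "":
			continue
		case strings.HasPrefix(line, ">"):
			parts := strings.SplitN(strings.TrimSpace(line[1:]), " ", 2)
			rec := fastaRecord{ID: parts[0]}
			if len(parts) == 2 {
				rec.Definition = strings.TrimSpace(parts[1])
			}
			records = append(records, rec)
		case len(records) > 0:
			// Sequence line: append it to the current record.
			records[len(records)-1].Sequence += line
		}
	}
	return records
}

func main() {
	data := ">sequence_A this is my first pretty sequence\nACGTTGCAGT\nACGT\n" +
		">sequence_B this is my second pretty sequence\nGTGCTGACGT\n"
	for _, rec := range parseFasta(data) {
		fmt.Printf("%s (%d bp): %s\n", rec.ID, len(rec.Sequence), rec.Definition)
	}
}
```

The sketch relies on the default `bufio.Scanner` buffer, so it is only suitable for lines shorter than 64 KiB.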
### The *fastq* sequence format[^01_obitools_doc-1] {#classical-fastq}
**fastq format** is a text-based format for storing both a biological sequence (usually a nucleotide sequence) and its corresponding quality scores. Both the sequence letter and the quality score are encoded with a single ASCII character for brevity. It was originally developed at the `Wellcome Trust Sanger Institute` to bundle a [fasta](#classical-fasta) sequence and its quality data, but has become the *de facto* standard for storing the output of high throughput sequencing instruments such as the Illumina Genome Analyzer [@cock2010sanger].
[^01_obitools_doc-1]: This article uses material from the Wikipedia article [`FASTQ format`](http://en.wikipedia.org/wiki/FASTQ_format) which is released under the `Creative Commons Attribution-Share-Alike License 3.0`
A fastq file normally uses four lines per sequence.
- Line 1 begins with a '\@' character and is followed by a sequence identifier and an *optional* description (like a [fasta](#classical-fasta) title line).
- Line 2 is the raw sequence letters.
- Line 3 begins with a '+' character and is *optionally* followed by the same sequence identifier (and any description) again.
- Line 4 encodes the quality values for the sequence in Line 2, and must contain the same number of symbols as letters in the sequence.
A fastq file containing a single sequence might look like this:

    @SEQ_ID
    GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
    +
    !''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

The character '!' represents the lowest quality while '\~' is the highest. Here are the quality value characters in left-to-right increasing order of quality (`ASCII`):
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
The original Sanger FASTQ files also allowed the sequence and quality strings to be wrapped (split over multiple lines), but this is generally discouraged as it can make parsing complicated due to the unfortunate choice of "\@" and "+" as markers (these characters can also occur in the quality string).
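The four-line layout described above can be checked with a short reader. The sketch below assumes unwrapped records (exactly four lines each) and uses hypothetical names (`fastqRecord`, `parseFastq`); it is not the OBITools parser.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// fastqRecord holds the parts of one four-line fastq entry.
type fastqRecord struct {
	ID, Definition, Sequence, Quality string
}

// parseFastq reads unwrapped fastq text, four lines per record.
func parseFastq(text string) ([]fastqRecord, error) {
	var lines []string
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		if l := strings.TrimRight(scanner.Text(), "\r"); l != "" {
			lines = append(lines, l)
		}
	}
	if len(lines)%4 != 0 {
		return nil, errors.New("fastq: number of lines is not a multiple of four")
	}
	var records []fastqRecord
	for i := 0; i < len(lines); i += 4 {
		title, seq, sep, qual := lines[i], lines[i+1], lines[i+2], lines[i+3]
		if !strings.HasPrefix(title, "@") || !strings.HasPrefix(sep, "+") {
			return nil, fmt.Errorf("fastq: malformed record %d", i/4+1)
		}
		// Line 4 must contain as many symbols as line 2.
		if len(seq) != len(qual) {
			return nil, fmt.Errorf("fastq: record %d has %d bases but %d quality symbols",
				i/4+1, len(seq), len(qual))
		}
		parts := strings.SplitN(title[1:], " ", 2)
		rec := fastqRecord{ID: parts[0], Sequence: seq, Quality: qual}
		if len(parts) == 2 {
			rec.Definition = parts[1]
		}
		records = append(records, rec)
	}
	return records, nil
}

func main() {
	records, err := parseFastq("@SEQ_ID optional description\nGATTTGGGG\n+\n!''*((((*\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, rec := range records {
		fmt.Println(rec.ID, rec.Sequence, rec.Quality)
	}
}
```

Reading by groups of four lines avoids having to guess whether a leading '@' or '+' belongs to a quality string, which is the difficulty raised by wrapped files.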
#### Variations
##### Quality
A quality value *Q* is an integer mapping of *p* (i.e., the probability that the corresponding base call is incorrect). Two different equations have been in use. The first is the standard Sanger variant to assess reliability of a base call, otherwise known as Phred quality score:
$$
Q_\text{sanger} = -10 \, \log_{10} p
$$
The Solexa pipeline (i.e., the software delivered with the Illumina Genome Analyzer) earlier used a different mapping, encoding the odds $\mathbf{p}/(1-\mathbf{p})$ instead of the probability $\mathbf{p}$:
$$
Q_\text{solexa-prior to v.1.3} = -10 \, \log_{10} \frac{p}{1-p}
$$
Although both mappings are asymptotically identical at higher quality values, they differ at lower quality levels (i.e., approximately $\mathbf{p} > 0.05$, or equivalently, $\mathbf{Q} < 13$).
Figure: Relationship between *Q* and *p* using the Sanger (red) and Solexa (black) equations (described above). The vertical dotted line indicates $p = 0.05$, or equivalently, $Q \approx 13$.
##### Encoding
- Sanger format can encode a Phred quality score from 0 to 93 using ASCII 33 to 126 (although in raw read data the Phred quality score rarely exceeds 60, higher scores are possible in assemblies or read maps).
- Solexa/Illumina 1.0 format can encode a Solexa/Illumina quality score from -5 to 62 using ASCII 59 to 126 (although in raw read data Solexa scores from -5 to 40 only are expected).
- Starting with Illumina 1.3 and before Illumina 1.8, the format encoded a Phred quality score from 0 to 62 using ASCII 64 to 126 (although in raw read data Phred scores from 0 to 40 only are expected).
- Starting in Illumina 1.5 and before Illumina 1.8, the Phred scores 0 to 2 have a slightly different meaning. The values 0 and 1 are no longer used and the value 2, encoded by ASCII 66 "B", is used as an indicator of a low-quality read end.
The Illumina manual (Sequencing Control Software, Version 2.6, Catalog \# SY-960-2601, Part \# 15009921 Rev. A, November 2009, <http://watson.nci.nih.gov/solexa/Using_SCSv2.6_15009921_A.pdf>, page 30) states the following: *If a read ends with a segment of mostly low quality (Q15 or below), then all of the quality values in the segment are replaced with a value of 2 (encoded as the letter B in Illumina's text-based encoding of quality scores)... This Q2 indicator does not predict a specific error rate, but rather indicates that a specific final portion of the read should not be used in further analyses.* Also, the quality score encoded as the letter "B" may occur internally within reads at least as late as pipeline version 1.6, as shown in the following example:

    @HWI-EAS209_0006_FC706VJ:5:58:5894:21141#ATCACG/1
    TTAATTGGTAAATAAATCTCCTAATAGCTTAGATNTTACCTTNNNNNNNNNNTAGTTTCTTGAGATTTGTTGGGGGAGACATTTTTGTGATTGCCTTGAT
    +HWI-EAS209_0006_FC706VJ:5:58:5894:21141#ATCACG/1
    efcfffffcfeefffcffffffddf`feed]`]_Ba_^__[YBBBBBBBBBBRTT\]][]dddd`ddd^dddadd^BBBBBBBBBBBBBBBBBBBBBBBB

An alternative interpretation of this ASCII encoding has been proposed. Also, in Illumina runs using PhiX controls, the character 'B' was observed to represent an "unknown quality score". The error rate of 'B' reads was roughly 3 phred scores lower than the mean observed score of a given run.
- Starting in Illumina 1.8, the quality scores have basically returned to the use of the Sanger format (Phred+33).
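The encodings listed above differ mainly by the ASCII offset subtracted from each character. A minimal decoding sketch, with hypothetical constant and function names, could look like this:

```go
package main

import "fmt"

const (
	offsetSanger   = 33 // Sanger and Illumina 1.8+ (Phred+33)
	offsetIllumina = 64 // Solexa/Illumina 1.0 and Illumina 1.3 to 1.7
)

// decodeQualities converts a fastq quality line into integer scores by
// subtracting the ASCII offset of the chosen encoding. With the
// Solexa/Illumina 1.0 format the result is a Solexa score, not a
// Phred score.
func decodeQualities(line string, offset int) []int {
	scores := make([]int, len(line))
	for i := 0; i < len(line); i++ {
		scores[i] = int(line[i]) - offset
	}
	return scores
}

func main() {
	fmt.Println(decodeQualities("!5I", offsetSanger))   // [0 20 40]
	fmt.Println(decodeQualities("@Th", offsetIllumina)) // [0 20 40]
}
```

The same quality line therefore gives scores that differ by 31 depending on the offset, which is why the encoding of older files has to be stated explicitly.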
## File extension
There is no standard file extension for a FASTQ file, but .fq and .fastq are commonly used.
## See also
- [The *fasta* format](#classical-fasta)
See <http://en.wikipedia.org/wiki/FASTQ_format>

157
doc/library.qmd Normal file
View File

@ -0,0 +1,157 @@
# The GO *OBITools* library
## BioSequence
The `BioSequence` class is used to represent biological sequences. It
allows for storing:

- the sequence itself as a `[]byte`
- the sequencing quality score as a `[]byte` if needed
- an identifier as a `string`
- a definition as a `string`
- a set of *(key, value)* pairs in a `map[string]interface{}`

`BioSequence` is defined in the `obiseq` package, which is imported using the
following code:
``` go
import (
"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"
)
```
### Creating new instances
To create a new instance, use:
- `MakeBioSequence(id string, sequence []byte, definition string) obiseq.BioSequence`
- `NewBioSequence(id string, sequence []byte, definition string) *obiseq.BioSequence`
Both create a `BioSequence` instance, but whereas the first one returns the
instance itself, the second returns a pointer to the new instance. Two other
functions, `MakeEmptyBioSequence` and `NewEmptyBioSequence`, do the same
job but provide an uninitialized object.
- The `id` parameter corresponds to the unique identifier of the
  sequence. It must be a string consisting of a single word (not
  containing any space).
- `sequence` is the DNA sequence itself, provided as a `byte` array
  (`[]byte`).
- `definition` is a `string`, potentially empty, but usually containing
  a sentence explaining what the sequence is.
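``` go
package main

import (
	"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"
)

func main() {
	// The sequence is passed as a byte slice, hence the []byte conversion.
	myseq := obiseq.NewBioSequence(
		"seq_GH0001",
		[]byte("ACGTGTCAGTCG"),
		"A short test sequence",
	)
	_ = myseq // avoids the "declared and not used" compile error
}
```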
When formatted as fasta, the parameters correspond to the following schema:

    >id definition containing potentially several words
    sequence

### End of life of a `BioSequence` instance
When a `BioSequence` instance is no longer used, it is normally handled
by the Go garbage collector. If you want, you can call the
`Recycle` method on the instance to store the allocated memory elements
in a `pool`, which limits the allocation effort when many sequences are
manipulated.
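The idea of recycling allocated memory through a pool can be illustrated with the standard `sync.Pool` type. This is only a sketch of the general technique: the names `bufferPool`, `getBuffer` and `recycleBuffer` are hypothetical, and nothing here describes how `Recycle` is actually implemented in `obiseq`.

```go
package main

import (
	"fmt"
	"sync"
)

// bufferPool keeps released byte slices so that they can be reused
// instead of being allocated again.
var bufferPool = sync.Pool{
	New: func() interface{} {
		buf := make([]byte, 0, 256)
		return &buf
	},
}

// getBuffer returns an empty buffer, either recycled or newly allocated.
func getBuffer() *[]byte {
	return bufferPool.Get().(*[]byte)
}

// recycleBuffer empties the buffer, keeps its capacity, and hands it
// back to the pool.
func recycleBuffer(buf *[]byte) {
	*buf = (*buf)[:0]
	bufferPool.Put(buf)
}

func main() {
	buf := getBuffer()
	*buf = append(*buf, "ACGTGTCAGTCG"...)
	fmt.Println(string(*buf))
	recycleBuffer(buf) // buf must not be used after this call

	again := getBuffer()
	fmt.Println(len(*again)) // 0: the buffer is always handed out empty
}
```

As with any pool, an object must not be used once it has been recycled, since its memory may already belong to another user.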
### Accessing the elements of a sequence
The different elements of an `obiseq.BioSequence` must be accessed using
a set of methods. For the three main elements provided when creating a
new instance, the methods are:
- `Id() string`
- `Sequence() []byte`
- `Definition() string`
Matching methods exist to change the value of these elements:
- `SetId(id string)`
- `SetSequence(sequence []byte)`
- `SetDefinition(definition string)`
``` go
package main

import (
	"fmt"

	"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"
)

func main() {
	myseq := obiseq.NewBioSequence(
		"seq_GH0001",
		[]byte("ACGTGTCAGTCG"),
		"A short test sequence",
	)

	fmt.Println(myseq.Id()) // identifier given at creation
	myseq.SetId("SPE01_0001")
	fmt.Println(myseq.Id()) // identifier after the update
}
```
#### Different ways of accessing and editing the sequence
While `Sequence()` and `SetSequence(sequence []byte)` are the basic
methods, several others exist.
- `String() string` returns the sequence directly converted to a
  `string` instance.
- The `Write` method family allows for extending an existing sequence
  following the buffer protocol.
    - `Write(data []byte) (int, error)` allows for appending a byte
      slice to the 3' end of the sequence.
    - `WriteString(data string) (int, error)` allows for appending a
      `string`.
    - `WriteByte(data byte) error` allows for appending a single
      `byte`.
The `Clear` method empties the sequence buffer.
``` go
package main

import (
	"fmt"

	"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"
)

func main() {
	myseq := obiseq.NewEmptyBioSequence()

	myseq.WriteString("accc")  // append several nucleotides at once
	myseq.WriteByte(byte('c')) // append a single nucleotide

	fmt.Println(myseq.String())
}
```
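The signatures of the `Write` family are the same as those of the standard `io.Writer`, `io.StringWriter` and `io.ByteWriter` interfaces, which is presumably what "buffer protocol" refers to. The sketch below shows what this buys with a hypothetical `seqBuffer` type built on `bytes.Buffer`. Whether `obiseq.BioSequence` actually satisfies these interfaces is an assumption, not something stated here.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// seqBuffer is a hypothetical stand-in for the sequence part of a
// BioSequence. Embedding bytes.Buffer provides Write, WriteString,
// WriteByte and String with the signatures listed above.
type seqBuffer struct {
	bytes.Buffer
}

// Clear empties the buffer, mirroring the Clear method described above.
func (s *seqBuffer) Clear() { s.Reset() }

// Compile-time checks that the type follows the standard writer interfaces.
var (
	_ io.Writer       = (*seqBuffer)(nil)
	_ io.StringWriter = (*seqBuffer)(nil)
	_ io.ByteWriter   = (*seqBuffer)(nil)
)

func main() {
	s := &seqBuffer{}

	s.Write([]byte("acgt"))
	s.WriteString("accc")
	s.WriteByte('c')
	fmt.Println(s.String())

	// Any function accepting an io.Writer can append to the sequence.
	fmt.Fprintf(s, "%s", "ggg")
	fmt.Println(s.String())

	s.Clear()
	fmt.Println(s.Len())
}
```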
#### Sequence quality scores
Sequence quality scores cannot be initialized at the time of instance
creation. You must use dedicated methods to add quality scores to a
sequence.
To be coherent, the DNA sequence and the quality score sequence must
have the same length. However, no check of this constraint is performed.
It is the programmer's responsibility to ensure that invariant.
While accessing the quality scores relies on the `Quality() []byte`
method, setting them requires calling one of the following methods. They
work similarly to their sequence-dedicated counterparts.
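Since the length constraint is left to the programmer, a small check can be run before a sequence is used. `checkQualityLength` below is a hypothetical helper, not a function of the `obiseq` package, and treating an empty quality slice as "no qualities attached" is an assumption of this sketch.

```go
package main

import "fmt"

// checkQualityLength is a hypothetical helper verifying that a sequence
// and its quality scores have the same length.
func checkQualityLength(sequence, qualities []byte) error {
	if len(qualities) == 0 {
		// Assumption: an empty slice means no quality scores are attached.
		return nil
	}
	if len(sequence) != len(qualities) {
		return fmt.Errorf(
			"sequence length (%d) differs from quality length (%d)",
			len(sequence), len(qualities),
		)
	}
	return nil
}

func main() {
	sequence := []byte("ACGT")

	// Four scores for four nucleotides: the invariant holds.
	fmt.Println(checkQualityLength(sequence, []byte{30, 32, 35, 20}))

	// Three scores for four nucleotides: the invariant is broken.
	fmt.Println(checkQualityLength(sequence, []byte{30, 32, 35}))
}
```

With a real instance, the two arguments would come from the `Sequence()` and `Quality()` accessors.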
- `SetQualities(qualities Quality)`
- `WriteQualities(data []byte) (int, error)`
- `WriteByteQualities(data byte) error`
In a way analogous to the `Clear` method, `ClearQualities()` empties the
sequence of quality scores.

4
doc/references.qmd Normal file

@ -0,0 +1,4 @@
# References {.unnumbered}
::: {#refs}
:::

3
doc/summary.qmd Normal file

@ -0,0 +1,3 @@
# Summary
In summary, this book has no content whatsoever.