</div>
</li>
<liclass="sidebar-item">
<divclass="sidebar-item-container">
<li><a href="#creating-new-instances" id="toc-creating-new-instances" class="nav-link" data-scroll-target="#creating-new-instances">Creating new instances</a></li>
<li><a href="#end-of-life-of-a-biosequence-instance" id="toc-end-of-life-of-a-biosequence-instance" class="nav-link" data-scroll-target="#end-of-life-of-a-biosequence-instance">End of life of a <code>BioSequence</code> instance</a></li>
<li><a href="#accessing-to-the-elements-of-a-sequence" id="toc-accessing-to-the-elements-of-a-sequence" class="nav-link" data-scroll-target="#accessing-to-the-elements-of-a-sequence">Accessing to the elements of a sequence</a></li>
<li><a href="#the-annotations-of-a-sequence" id="toc-the-annotations-of-a-sequence" class="nav-link" data-scroll-target="#the-annotations-of-a-sequence">The annotations of a sequence</a></li>
<li><a href="#basic-usage-of-a-sequence-iterator" id="toc-basic-usage-of-a-sequence-iterator" class="nav-link" data-scroll-target="#basic-usage-of-a-sequence-iterator">Basic usage of a sequence iterator</a></li>
<p>The <code>BioSequence</code> class is used to represent biological sequences. It stores:</p>
<ul>
<li>the sequence itself as a <code>[]byte</code></li>
<li>the sequencing quality scores as a <code>[]byte</code>, if needed</li>
<li>an identifier as a <code>string</code></li>
<li>a definition as a <code>string</code></li>
<li>a set of <em>(key, value)</em> pairs in a <code>map[string]interface{}</code></li>
</ul>
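<p>To make the layout concrete, the following minimal sketch declares a stand-in type holding the same five elements. The type <code>bioSequence</code> and its field names are hypothetical and chosen for illustration only; they are not the actual definition found in <code>obiseq</code>.</p>

```go
package main

import "fmt"

// bioSequence is a hypothetical stand-in mirroring the elements a
// BioSequence stores; it is NOT the real obiseq.BioSequence definition.
type bioSequence struct {
	id          string                 // unique identifier
	definition  string                 // free-text description, may be empty
	sequence    []byte                 // the sequence itself
	qualities   []byte                 // nil when no quality scores are attached
	annotations map[string]interface{} // (key, value) pairs
}

func main() {
	s := bioSequence{
		id:          "seq1",
		definition:  "an example sequence",
		sequence:    []byte("acgtacgt"),
		annotations: map[string]interface{}{"count": 3},
	}

	// Quality scores are optional, so the field stays nil here.
	fmt.Println(s.id, len(s.sequence), s.qualities == nil, s.annotations["count"])
}
```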
<p><code>BioSequence</code> is defined in the <code>obiseq</code> module and is imported with the following code:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode go code-with-copy"><code class="sourceCode go"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="op">(</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="op">)</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>Both create a <code>BioSequence</code> instance, but whereas the first returns the instance itself, the second returns a pointer to the new instance. Two other functions, <code>MakeEmptyBioSequence</code> and <code>NewEmptyBioSequence</code>, do the same job but provide uninitialized objects.</p>
<ul>
<li>The <code>id</code> parameter corresponds to the unique identifier of the sequence. It must be a string consisting of a single word (containing no spaces).</li>
<li><code>sequence</code> is the DNA sequence itself, provided as a <code>byte</code> array (<code>[]byte</code>).</li>
<li><code>definition</code> is a <code>string</code>, potentially empty, but usually containing a sentence describing the sequence.</li>
</ul>
<div class="sourceCode" id="cb2"><pre class="sourceCode go code-with-copy"><code class="sourceCode go"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="op">(</span></span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
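<p>The practical difference between a constructor returning an instance and one returning a pointer can be sketched without the library. The type <code>seq</code> and the functions <code>makeSeq</code> and <code>newSeq</code> below are hypothetical stand-ins, assumed only to take the same three parameters (<code>id</code>, <code>sequence</code>, <code>definition</code>); they are not the <code>obiseq</code> functions themselves.</p>

```go
package main

import "fmt"

// seq is a hypothetical stand-in type, not obiseq.BioSequence.
type seq struct {
	id, definition string
	sequence       []byte
}

// makeSeq returns the new instance by value.
func makeSeq(id string, sequence []byte, definition string) seq {
	return seq{id: id, sequence: sequence, definition: definition}
}

// newSeq returns a pointer to the new instance.
func newSeq(id string, sequence []byte, definition string) *seq {
	s := makeSeq(id, sequence, definition)
	return &s
}

func main() {
	v := makeSeq("seq1", []byte("acgt"), "returned by value")
	p := newSeq("seq2", []byte("ttga"), "returned by pointer")

	c := v // assigning a value copies the struct
	c.id = "copy"

	q := p // assigning a pointer shares the same instance
	q.id = "alias"

	fmt.Println(v.id, p.id) // prints "seq1 alias"
}
```

<p>Note that copying the value still shares the underlying <code>[]byte</code> array, since a Go slice is only a view on its backing storage.</p>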
<p>When formatted as FASTA, the parameters correspond to the following schema:</p>
<pre><code>>id definition containing potentially several words
<p>When an instance of <code>BioSequence</code> is no longer in use, it is normally reclaimed by the Go garbage collector. If you know that an instance will never be used again, you can, if you wish, call the <code>Recycle</code> method on it to store its allocated memory elements in a <code>pool</code>, which limits the allocation effort when many sequences are being handled. Once the <code>Recycle</code> method has been called on an instance, you must ensure that no other method is called on it.</p>
<p>The different elements of an <code>obiseq.BioSequence</code> must be accessed through a set of methods. For the three main elements provided when a new instance is created, the methods are:</p>
<ul>
<li><code>Id() string</code></li>
<li><code>Sequence() []byte</code></li>
<li><code>Definition() string</code></li>
</ul>
<p>Corresponding methods exist to change the value of these elements.</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode go code-with-copy"><code class="sourceCode go"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="op">(</span></span>
<span id="cb4-16"><a href="#cb4-16" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>While <code>Sequence()</code> and <code>SetSequence(sequence []byte)</code> are the basic methods, several others exist.</p>
<ul>
<li><code>String() string</code> returns the sequence directly converted to a <code>string</code> instance.</li>
<li>The <code>Write</code> method family allows for extending an existing sequence following the buffer protocol.
<ul>
<li><code>Write(data []byte) (int, error)</code> allows for appending a byte array to the 3’ end of the sequence.</li>
<li><code>WriteString(data string) (int, error)</code> allows for appending a <code>string</code>.</li>
<li><code>WriteByte(data byte) error</code> allows for appending a single <code>byte</code>.</li>
</ul></li>
</ul>
<p>The <code>Clear</code> method empties the sequence buffer.</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode go code-with-copy"><code class="sourceCode go"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="op">(</span></span>
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
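<p>The signatures of the <code>Write</code> family match the standard interfaces <code>io.Writer</code>, <code>io.StringWriter</code> and <code>io.ByteWriter</code>. The sketch below shows what following this buffer protocol means, using a hypothetical stand-in type <code>seq</code> rather than <code>obiseq.BioSequence</code>; the method bodies are assumptions made for illustration.</p>

```go
package main

import (
	"fmt"
	"io"
)

// seq is a hypothetical stand-in type, not obiseq.BioSequence.
type seq struct{ sequence []byte }

// Write appends a byte array to the 3' end of the sequence.
func (s *seq) Write(data []byte) (int, error) {
	s.sequence = append(s.sequence, data...)
	return len(data), nil
}

// WriteString appends a string to the 3' end of the sequence.
func (s *seq) WriteString(data string) (int, error) {
	s.sequence = append(s.sequence, data...)
	return len(data), nil
}

// WriteByte appends a single byte to the 3' end of the sequence.
func (s *seq) WriteByte(data byte) error {
	s.sequence = append(s.sequence, data)
	return nil
}

// Clear empties the sequence buffer while keeping its capacity.
func (s *seq) Clear() { s.sequence = s.sequence[:0] }

// String returns the sequence converted to a string.
func (s *seq) String() string { return string(s.sequence) }

// Compile-time checks that the stand-in satisfies the standard interfaces.
var (
	_ io.Writer       = (*seq)(nil)
	_ io.StringWriter = (*seq)(nil)
	_ io.ByteWriter   = (*seq)(nil)
)

func main() {
	s := &seq{}
	s.Write([]byte("acg"))
	s.WriteString("tt")
	s.WriteByte('a')
	fmt.Println(s.String()) // prints "acgtta"

	s.Clear()
	fmt.Fprint(s, "ggcc") // any function accepting an io.Writer can extend the sequence
	fmt.Println(s.String()) // prints "ggcc"
}
```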
<p>Sequence quality scores cannot be initialized at the time of instance creation. You must use dedicated methods to add quality scores to a sequence.</p>
<p>To be coherent, the DNA sequence and the quality score sequence must have the same length. However, this constraint is not checked: it is the programmer's responsibility to maintain that invariant.</p>
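<p>A caller can enforce the length constraint with a small helper before attaching quality scores. The function <code>checkQuality</code> below is a hypothetical helper, not part of <code>obiseq</code>; it assumes that an empty quality slice means that no quality scores are attached, which is treated as valid.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// errLengthMismatch is returned when both slices have different lengths.
var errLengthMismatch = errors.New("sequence and quality lengths differ")

// checkQuality is a hypothetical helper verifying that quality scores,
// when present, cover every position of the sequence.
func checkQuality(sequence, quality []byte) error {
	if len(quality) != 0 && len(quality) != len(sequence) {
		return fmt.Errorf("%w: %d bases, %d scores",
			errLengthMismatch, len(sequence), len(quality))
	}
	return nil
}

func main() {
	sequence := []byte("acgt")

	fmt.Println(checkQuality(sequence, []byte{30, 32, 35, 20})) // same length: no error
	fmt.Println(checkQuality(sequence, []byte{30, 32}))         // too short: error
	fmt.Println(checkQuality(sequence, nil))                    // no quality attached: no error
}
```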
<p>While the quality scores are read through the <code>Quality() []byte</code> method, setting them requires calling one of the following methods. They behave like their sequence counterparts.</p>
<p>A sequence can be annotated with attributes. Each attribute is associated with a value. An attribute is identified by its name. The name of an attribute consists of a character string containing no spaces or blank characters. Values can be of several types.</p>
<ul>
<li>Scalar types:
<ul>
<li>integer</li>
<li>numeric</li>
<li>character</li>
<li>boolean</li>
</ul></li>
<li>Container types:
<ul>
<li>vector</li>
<li>map</li>
</ul></li>
</ul>
<p>Vectors can contain any type of scalar. Maps are always indexed by strings and can contain any scalar type. Container types cannot be nested.</p>
<p>Annotations are stored in an object of type <code>obiseq.Annotation</code>, which is an alias of <code>map[string]interface{}</code>. This map can be retrieved using the <code>Annotations() Annotation</code> method. If no annotation has been defined for this sequence, the method returns an empty map. It is possible to test an instance of <code>BioSequence</code> using its <code>HasAnnotation() bool</code> method to see if it has any annotations associated with it.</p>
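<p>Because the values are stored as <code>interface{}</code>, reading an annotation usually involves a type switch. The sketch below works on a locally defined <code>annotation</code> type with the same underlying map type; the type name and the helper <code>describe</code> are hypothetical, and the mapping of the scalar categories to <code>int</code>, <code>float64</code>, <code>string</code> and <code>bool</code> is an assumption made for illustration.</p>

```go
package main

import "fmt"

// annotation is a hypothetical local type with the same underlying
// type as the annotation map described above.
type annotation map[string]interface{}

// describe reports which category of value an attribute holds.
func describe(value interface{}) string {
	switch v := value.(type) {
	case int:
		return fmt.Sprintf("integer %d", v)
	case float64:
		return fmt.Sprintf("numeric %g", v)
	case string:
		return fmt.Sprintf("character %q", v)
	case bool:
		return fmt.Sprintf("boolean %t", v)
	case []interface{}:
		return fmt.Sprintf("vector of %d values", len(v))
	case map[string]interface{}:
		return fmt.Sprintf("map with %d keys", len(v))
	default:
		return "unsupported type"
	}
}

func main() {
	a := annotation{
		"count":     12,
		"score":     0.5,
		"sample":    "A01",
		"chimera":   false,
		"positions": []interface{}{3, 17},
	}

	// Attributes are looked up by name; a missing name yields ok == false.
	for _, key := range []string{"count", "score", "sample", "chimera", "positions"} {
		if value, ok := a[key]; ok {
			fmt.Printf("%s: %s\n", key, describe(value))
		}
	}

	fmt.Println(len(a) > 0) // a simple way to test whether any annotation exists
}
```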
<p>The package <em>obiiter</em> provides an iterator mechanism for manipulating sequences. The main class provided by this package is <code>obiiter.IBioSequence</code>. An <code>IBioSequence</code> iterator provides batches of sequences.</p>
<p>Many functions, among them the functions reading sequences from a text file, return an <code>IBioSequence</code> iterator. The iterator class provides two main methods:</p>
<p>The <code>Next</code> method moves the iterator to the next value, while the <code>Get</code> method returns the value currently pointed to. Using them, it is possible to loop over the data as in the following code chunk.</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode go code-with-copy"><code class="sourceCode go"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="kw">import</span> <span class="op">(</span></span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a> data <span class="op">:=</span> mydata<span class="op">.</span>Get<span class="op">()</span></span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<p>An <code>obiseq.BioSequenceBatch</code> instance is a set of sequences stored in an <code>obiseq.BioSequenceSlice</code> and a sequence number. The number of sequences in a batch is not defined. A batch can even contain zero sequences, if for example all sequences initially included in the batch have been filtered out at some stage of their processing.</p>
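<p>Since a batch may be empty, code consuming an iterator should not assume that each step yields at least one sequence. The following self-contained sketch mimics the <code>Next</code>/<code>Get</code> loop on hypothetical stand-in types (<code>batch</code>, <code>iterator</code>, <code>newIterator</code>); it does not use or reproduce the <code>obiiter</code> implementation.</p>

```go
package main

import "fmt"

// batch is a hypothetical stand-in: a set of sequences plus a number.
type batch struct {
	order int
	seqs  []string
}

// iterator is a hypothetical stand-in delivering batches one at a time.
type iterator struct {
	channel chan batch
	current batch
}

// newIterator builds an iterator that will deliver the given batches.
func newIterator(batches ...batch) *iterator {
	ch := make(chan batch, len(batches))
	for _, b := range batches {
		ch <- b
	}
	close(ch)
	return &iterator{channel: ch}
}

// Next moves to the next batch and reports whether one was available.
func (it *iterator) Next() bool {
	b, ok := <-it.channel
	if ok {
		it.current = b
	}
	return ok
}

// Get returns the batch the iterator currently points to.
func (it *iterator) Get() batch { return it.current }

// countSequences loops over all batches, including empty ones.
func countSequences(it *iterator) int {
	total := 0
	for it.Next() {
		total += len(it.Get().seqs)
	}
	return total
}

func main() {
	it := newIterator(
		batch{order: 0, seqs: []string{"acgt", "ttga"}},
		batch{order: 1}, // every sequence of this batch was filtered out
		batch{order: 2, seqs: []string{"ggcc"}},
	)
	fmt.Println(countSequences(it)) // prints 3
}
```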
<p>A function consuming an <code>obiiter.IBioSequence</code> and returning an <code>obiiter.IBioSequence</code> is of class <code>obiiter.Pipable</code>.</p>
<p>A function consuming an <code>obiiter.IBioSequence</code> and returning two <code>obiiter.IBioSequence</code> instances is of class <code>obiiter.Teeable</code>.</p>
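<p>The shape of these two function classes can be sketched with plain Go function types. Everything below is hypothetical: <code>iter</code> stands in for an iterator (a channel of batches), and <code>pipable</code> and <code>teeable</code> only mirror the one-in/one-out and one-in/two-out signatures described above, not the actual <code>obiiter</code> declarations. For simplicity the tee drains its input eagerly, which a streaming implementation would avoid.</p>

```go
package main

import "fmt"

// iter is a hypothetical stand-in for an iterator: each value is one batch.
type iter <-chan []string

// pipable consumes one iterator and returns one iterator.
type pipable func(iter) iter

// teeable consumes one iterator and returns two iterators.
type teeable func(iter) (iter, iter)

// fromBatches builds an iterator delivering the given batches.
func fromBatches(batches ...[]string) iter {
	ch := make(chan []string, len(batches))
	for _, b := range batches {
		ch <- b
	}
	close(ch)
	return ch
}

// collect flattens every batch of an iterator into a single slice.
func collect(in iter) []string {
	var all []string
	for b := range in {
		all = append(all, b...)
	}
	return all
}

// filterMinLength returns a pipable keeping sequences of at least minLen bases.
func filterMinLength(minLen int) pipable {
	return func(in iter) iter {
		out := make(chan []string)
		go func() {
			defer close(out)
			for b := range in {
				kept := make([]string, 0, len(b))
				for _, s := range b {
					if len(s) >= minLen {
						kept = append(kept, s)
					}
				}
				out <- kept // possibly an empty batch
			}
		}()
		return out
	}
}

// splitByLength returns a teeable separating long from short sequences.
func splitByLength(minLen int) teeable {
	return func(in iter) (iter, iter) {
		var long, short [][]string
		for b := range in { // eager: reads the whole input first
			var l, s []string
			for _, x := range b {
				if len(x) >= minLen {
					l = append(l, x)
				} else {
					s = append(s, x)
				}
			}
			long = append(long, l)
			short = append(short, s)
		}
		return fromBatches(long...), fromBatches(short...)
	}
}

func main() {
	data := [][]string{{"acgt", "ac"}, {"t"}, {"ggccaa"}}

	fmt.Println(collect(filterMinLength(4)(fromBatches(data...))))

	long, short := splitByLength(4)(fromBatches(data...))
	fmt.Println(collect(long), collect(short))
}
```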