3 Reference documentation for the GO OBITools library
3.1 BioSequence
The BioSequence type is used to represent biological sequences. It allows for storing:
- the sequence itself as a []byte
- the sequencing quality scores as a []byte, if needed
- an identifier as a string
- a definition as a string
- a set of (key, value) pairs in a map[string]interface{}
BioSequence is defined in the obiseq package, which is imported with the following code:
import (
"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"
)
3.1.1 Creating new instances
To create a new instance, use one of the following functions:
MakeBioSequence(id string, sequence []byte, definition string) obiseq.BioSequence
NewBioSequence(id string, sequence []byte, definition string) *obiseq.BioSequence
Both create a BioSequence instance, but while the first one returns the instance itself, the second returns a pointer to the new instance. Two other functions, MakeEmptyBioSequence and NewEmptyBioSequence, do the same job but provide an uninitialized object.
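The difference between returning an instance and returning a pointer can be sketched with a small stand-in type. The record type and the makeRecord, newRecord and renameBoth functions below are hypothetical and are not part of obiseq; they only illustrate Go value and pointer semantics. How a real BioSequence behaves when copied also depends on its internal fields, since slices and maps share their backing storage between copies.

```go
package main

import "fmt"

// record is a hypothetical stand-in type, not obiseq.BioSequence.
type record struct {
	id string
}

// makeRecord returns the value itself, in the style of a Make... constructor.
func makeRecord(id string) record {
	return record{id: id}
}

// newRecord returns a pointer to a new value, in the style of a New... constructor.
func newRecord(id string) *record {
	return &record{id: id}
}

// renameBoth renames a copy of a value and an alias of a pointer,
// then reports what the two originals hold afterwards.
func renameBoth() (string, string) {
	v := makeRecord("seq_A")
	vCopy := v // copies the whole struct
	vCopy.id = "renamed"

	p := newRecord("seq_B")
	pAlias := p // copies only the pointer
	pAlias.id = "renamed"

	return v.id, p.id
}

func main() {
	fromValue, fromPointer := renameBoth()
	fmt.Println(fromValue)   // seq_A: the original value is untouched
	fmt.Println(fromPointer) // renamed: both pointers share one instance
}
```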
The parameters are:
- id is the unique identifier of the sequence. It must be a string made of a single word (not containing any space).
- sequence is the DNA sequence itself, provided as a byte array ([]byte).
- definition is a string, potentially empty, but usually containing a sentence describing the sequence.
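package main

import (
	"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"
)

func main() {
	// Create a new sequence from its identifier, nucleotides and definition.
	myseq := obiseq.NewBioSequence(
		"seq_GH0001",
		[]byte("ACGTGTCAGTCG"),
		"A short test sequence",
	)
	_ = myseq // keeps the example compiling while myseq is not used further
}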
When formatted as FASTA, the parameters correspond to the following schema:
>id definition containing potentially several words
sequence
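As a minimal sketch of that schema, the three parameters can be assembled into a FASTA record with plain string formatting. The formatFasta helper below is hypothetical and is not part of obiseq; it writes the sequence on a single line and leaves out the separating space when the definition is empty, which is an assumption rather than documented behaviour.

```go
package main

import "fmt"

// formatFasta is a hypothetical helper, not an obiseq function.
// It builds ">id definition" followed by the sequence on one line.
func formatFasta(id, definition string, sequence []byte) string {
	if definition == "" {
		// Assumption: no trailing space when there is no definition.
		return fmt.Sprintf(">%s\n%s\n", id, sequence)
	}
	return fmt.Sprintf(">%s %s\n%s\n", id, definition, sequence)
}

func main() {
	fmt.Print(formatFasta(
		"seq_GH0001",
		"A short test sequence",
		[]byte("ACGTGTCAGTCG"),
	))
}
```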
3.1.2 End of life of a BioSequence instance
When a BioSequence instance is no longer used, it is normally reclaimed by the Go garbage collector. If you want, you can call the Recycle method on the instance to store its allocated memory in a pool, which limits the allocation effort when many sequences are manipulated.
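The idea behind such a pool can be illustrated with the standard sync.Pool type. This is only a sketch of the general technique: bufferPool, getBuffer and recycleBuffer are hypothetical names, and the way Recycle is actually implemented in obiseq is not described here.

```go
package main

import (
	"fmt"
	"sync"
)

// bufferPool is a hypothetical pool of byte slices; the pool used by
// obiseq may be organised differently.
var bufferPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 0, 64)
		return &b
	},
}

// getBuffer takes an empty buffer from the pool; a new one is allocated
// only when the pool has none to offer.
func getBuffer() *[]byte {
	return bufferPool.Get().(*[]byte)
}

// recycleBuffer empties the buffer, keeps its capacity, and hands it
// back to the pool for later reuse.
func recycleBuffer(b *[]byte) {
	*b = (*b)[:0]
	bufferPool.Put(b)
}

func main() {
	buf := getBuffer()
	*buf = append(*buf, "ACGTGTCAGTCG"...)
	fmt.Println(string(*buf), len(*buf))
	recycleBuffer(buf)

	// The next buffer is empty whether it is a recycled one or a new one.
	again := getBuffer()
	fmt.Println(len(*again))
	recycleBuffer(again)
}
```

A buffer must not be used after it has been recycled, since the pool may hand it to another caller.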
3.1.3 Accessing the elements of a sequence
The different elements of an obiseq.BioSequence must be accessed using a set of methods. For the three main elements provided during the creation of a new instance, the methods are:
Id() string
Sequence() []byte
Definition() string
Corresponding methods exist to change the value of these elements:
SetId(id string)
SetSequence(sequence []byte)
SetDefinition(definition string)
package main

import (
	"fmt"

	"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"
)

func main() {
	myseq := obiseq.NewBioSequence(
		"seq_GH0001",
		[]byte("ACGTGTCAGTCG"),
		"A short test sequence",
	)

	// Read the identifier, change it, then read it again.
	fmt.Println(myseq.Id())
	myseq.SetId("SPE01_0001")
	fmt.Println(myseq.Id())
}
3.1.3.1 Different ways of accessing and editing the sequence
While Sequence() and SetSequence(sequence []byte) are the basic methods, several others exist.
- String() string returns the sequence directly converted to a string instance.
- The Write method family allows for extending an existing sequence following the buffer protocol.
  - Write(data []byte) (int, error) allows for appending a byte array to the 3' end of the sequence.
  - WriteString(data string) (int, error) allows for appending a string.
  - WriteByte(data byte) error allows for appending a single byte.
- The Clear method empties the sequence buffer.
package main

import (
	"fmt"

	"git.metabarcoding.org/lecasofts/go/obitools/pkg/obiseq"
)

func main() {
	myseq := obiseq.NewEmptyBioSequence()

	// Extend the sequence at its 3' end, then print it.
	myseq.WriteString("accc")
	myseq.WriteByte(byte('c'))
	fmt.Println(myseq.String())
}
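The method signatures listed above match those of the standard bytes.Buffer type, so the buffer protocol can be tried out without the library. The sketch below uses bytes.Buffer as a stand-in for a sequence; buildSequence is a hypothetical helper, and any behaviour of BioSequence beyond the listed signatures is not implied.

```go
package main

import (
	"bytes"
	"fmt"
)

// buildSequence appends data in the three ways described in the text,
// using bytes.Buffer as a stand-in for a sequence buffer.
func buildSequence() string {
	var buf bytes.Buffer
	buf.Write([]byte("acgt")) // append a byte array
	buf.WriteString("accc")   // append a string
	buf.WriteByte('c')        // append a single byte
	return buf.String()
}

func main() {
	fmt.Println(buildSequence()) // acgtacccc
}
```

On bytes.Buffer, the Reset method plays a role similar to the one described for Clear.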
3.1.3.2 Sequence quality scores
Sequence quality scores cannot be initialized at the time of instance creation. You must use dedicated methods to add quality scores to a sequence.
To be coherent, the DNA sequence and the quality score sequence must have the same length. However, no check of this constraint is performed: it is the programmer's responsibility to maintain that invariant.
Quality scores are read with the method Quality() []byte. Setting them requires calling one of the following methods, which behave like their sequence counterparts:
SetQualities(qualities Quality)
WriteQualities(data []byte) (int, error)
WriteByteQualities(data byte) error
In a way analogous to the Clear method, ClearQualities() empties the sequence of quality scores.
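Since the length invariant between a sequence and its quality scores is left to the programmer, a small check can be run after the qualities have been set. The checkQualityLength helper below is hypothetical and is not part of obiseq; it works on plain byte slices, such as those returned by Sequence() and Quality(). Accepting an empty quality slice is an assumption, based on quality scores being optional.

```go
package main

import "fmt"

// checkQualityLength is a hypothetical helper, not an obiseq function.
// It reports an error when quality scores are present but their number
// differs from the sequence length.
func checkQualityLength(sequence, qualities []byte) error {
	if len(qualities) == 0 {
		// Assumption: a sequence without quality scores is valid.
		return nil
	}
	if len(sequence) != len(qualities) {
		return fmt.Errorf("sequence has %d symbols but %d quality scores",
			len(sequence), len(qualities))
	}
	return nil
}

func main() {
	seq := []byte("ACGT")
	fmt.Println(checkQualityLength(seq, []byte{30, 32, 35, 40})) // <nil>
	fmt.Println(checkQualityLength(seq, []byte{30, 32}))         // reports the mismatch
}
```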