mirror of
https://github.com/metabarcoding/obitools4.git
synced 2025-06-29 16:20:46 +00:00
45 lines
3.3 KiB
Plaintext
45 lines
3.3 KiB
Plaintext
# The OBITools
|
|
|
|
The *OBITools4* are programs specifically designed for analyzing NGS data in a DNA metabarcoding context, taking into account taxonomic information. It is distributed as an open source software available on the following website: http://metabarcoding.org/obitools4.
|
|
|
|
|
|
## Aims of *OBITools*
|
|
|
|
DNA metabarcoding is an efficient approach for biodiversity studies [@Taberlet2012-pf]. Originally mainly developed by microbiologists [*e.g.* @Sogin2006-ab], it is now widely used for plants [*e.g.* @Sonstebo2010-vv;@Yoccoz2012-ix;@Parducci2012-rn] and animals from meiofauna [*e.g.* @Chariton2010-cz;@Baldwin2013-yc] to larger organisms [*e.g.* @Andersen2012-gj;@Thomsen2012-au]. Interestingly, this method is not limited to *sensu
|
|
stricto* biodiversity surveys, but it can also be implemented in other
|
|
ecological contexts such as for herbivore [e.g. @Valentini2009-ay;@Kowalczyk2011-kg] or carnivore [e.g. @Deagle2009-yh;@Shehzad2012-pn] diet
|
|
analyses.
|
|
|
|
Whatever the biological question under consideration, the DNA metabarcoding
|
|
methodology relies heavily on next-generation sequencing (NGS), and generates
|
|
considerable numbers of DNA sequence reads (typically million of reads).
|
|
Manipulation of such large datasets requires dedicated programs usually running
|
|
on a Unix system. Unix is an operating system, whose first version was created
|
|
during the sixties. Since its early stages, it is dedicated to scientific
|
|
computing and includes a large set of simple tools to efficiently process text
|
|
files. Most of those programs can be viewed as filters extracting information
|
|
from a text file to create a new text file. These programs process text files as
|
|
streams, line per line, therefore allowing computation on a huge dataset without
|
|
requiring a large memory. Unix programs usually print their results to their
|
|
standard output (*stdout*), which by default is the terminal, so the results can
|
|
be examined on screen. The main philosophy of the Unix environment is to allow
|
|
easy redirection of the *stdout* either to a file, for saving the results, or to
|
|
the standard input (*stdin*) of a second program thus allowing to easily create
|
|
complex processing from simple base commands. Access to Unix computers is
|
|
increasingly easier for scientists nowadays. Indeed, the Linux operating system,
|
|
an open source version of Unix, can be freely installed on every PC machine and
|
|
the MacOS operating system, running on Apple computers, is also a Unix system.
|
|
The *OBITools* programs imitate Unix standard programs because they usually act as
|
|
filters, reading their data from text files or the *stdin* and writing their
|
|
results to the *stdout*. The main difference with classical Unix programs is that
|
|
text files are not analyzed line per line but sequence record per sequence
|
|
record (see below for a detailed description of a sequence record).
|
|
Compared to packages for similar purposes like mothur [@Schloss2009-qy] or
|
|
QIIME [@Caporaso2010-ii], the *OBITools* mainly rely on filtering and sorting
|
|
algorithms. This allows users to set up versatile data analysis pipelines
|
|
(Figure 1), adjustable to the broad range of DNA metabarcoding applications.
|
|
The innovation of the *OBITools* is their ability to take into account the
|
|
taxonomic annotations, ultimately allowing sorting and filtering of sequence
|
|
records based on the taxonomy.
|
|
|