4 Specifying the data input to OBITools commands
4.1 Specifying input format
Five sequence formats are accepted for input files. Fasta (Section 3.2) and Fastq (Section 3.3) are the main ones, EMBL and Genbank allow the use of flat files produced by these two international databases. The last one, ecoPCR, is maintained for compatibility with previous OBITools and allows to read ecoPCR outputs as sequence files.
--ecopcr
: Read data following the ecoPCR output format.--embl
Read data following the EMBL flatfile format.--genbank
Read data following the Genbank flatfile format.
Several encoding schemes have been proposed for quality scores in Fastq format. Currently, OBITools considers Sanger encoding as the standard. For reasons of compatibility with older datasets produced with Solexa sequencers, it is possible, by using the following option, to force the use of the corresponding quality encoding scheme when reading these older files.
--solexa
Decodes quality string according to the Solexa specification. (default: false)