5  Controling OBITools outputs

5.1 Specifying output format

Only two output sequence formats are supported by OBITools, Fasta and Fastq. Fastq is used when output sequences are associated with quality information. Otherwise, Fasta is the default format. However, it is possible to force the output format by using one of the following two options. Forcing the use of Fasta results in the loss of quality information. Conversely, when the Fastq format is forced with sequences that have no quality data, dummy qualities set to 40 for each nucleotide are added.

  • --fasta-output Read data following the ecoPCR output format.
  • --fastq-output Read data following the EMBL flatfile format.

OBITools allows multiple input files to be specified for a single command.

  • --no-order When several input files are provided, indicates that there is no order among them. (default: false). Using such option can increase a lot the processing of the data.

5.2 The Fasta and Fastq annotations format

OBITools extend the Fasta and Fastq formats by introducing a format for the title lines of these formats allowing to annotate every sequence. While the previous version of OBITools used an ad-hoc format for these annotation, this new version introduce the usage of the standard JSON format to store them.

On input, OBITools automatically recognize the format of the annotations, but two options allows to force the parsing following one of them. You should normally not need to use these options.

  • --input-OBI-header FASTA/FASTQ title line annotations follow OBI format. (default: false)

  • --input-json-header FASTA/FASTQ title line annotations follow json format. (default: false)

On output, by default annotation are formatted using the new JSON format. For compatibility with previous version of OBITools and with external scripts and software, it is possible to force the usage of the previous OBITools format.

  • --output-OBI-header|-O output FASTA/FASTQ title line annotations follow OBI format. (default: false)

  • --output-json-header output FASTA/FASTQ title line annotations follow json format. (default: false)