---
title: "obigrep"
section: 1
author: Eric Coissac
format:
  html: default
  man: default
---

# NAME

obigrep -- filters sequence files according to numerous conditions

# SYNOPSIS

**obigrep** \[**\--attribute** | **-a** _KEY=VALUE_]...
\[**\--compress** | **-Z**]
\[**\--debug**]
\[**\--definition** | **-D** _PATTERN_]...
\[**\--ecopcr**]
\[**\--embl**]
\[**\--fasta-output**]
\[**\--fastq-output**]
\[**\--genbank**]
\[**\--has-attribute** | **-A** _KEY_]...
\[**\--help** | **-h** | **-?**]
\[**\--id-list** _FILENAME_]
\[**\--identifier** | **-I** _PATTERN_]...
\[**\--ignore-taxon** | **-i** _TAXID_]...
\[**\--input-OBI-header**]
\[**\--input-json-header**]
\[**\--inverse-match** | **-v**]
\[**\--max-count** | **-C** _COUNT_]
\[**\--max-cpu** _INT_]
\[**\--max-length** | **-L** _LENGTH_]
\[**\--min-count** | **-c** _COUNT_]
\[**\--min-length** | **-l** _LENGTH_]
\[**\--no-order**]
\[**\--no-progressbar**]
\[**\--out** | **-o** _FILENAME_]
\[**\--output-OBI-header** | **-O**]
\[**\--output-json-header**]
\[**\--paired-mode** _forward|reverse|and|or|andnot|xor_]
\[**\--paired-with** _FILENAME_]
\[**\--predicate** | **-p** _EXPRESSION_]...
\[**\--require-rank** _RANK_NAME_]...
\[**\--restrict-to-taxon** | **-r** _TAXID_]...
\[**\--save-discarded** _FILENAME_]
\[**\--sequence** | **-s** _PATTERN_]...
\[**\--solexa**]
\[**\--taxdump** | **-t** _DIRECTORY_]
\[**\--workers** | **-w** _INT_]
[_FILENAMES_]

# DESCRIPTION

{{< include ../lib/descriptions/_obigrep.qmd >}}

# OPTIONS

## General options

{{< include ../lib/options/_system.qmd >}}

## Input format options

The OBITools are centered around the [FASTA](https://en.wikipedia.org/wiki/FASTA_format)
and [FASTQ](https://en.wikipedia.org/wiki/FASTQ_format) formats.
These formats are automatically recognized when data are read
from files or from standard input (`stdin`).
Other formats (GenBank, EMBL, ecoPCR) are also automatically identified when data
are read from files, but when data are read from `stdin`, the input format must be
indicated using the corresponding option (**\--genbank**, **\--embl** or **\--ecopcr**).

## Output format options

{{< include ../lib/options/_output.qmd >}}

## Paired reads options

**\--paired-with** _FILENAME_

**\--paired-mode** _forward|reverse|and|or|andnot|xor_

## Taxonomy related options

**\--taxdump** | **-t** _DIRECTORY_

**\--ignore-taxon** | **-i** _TAXID_

**\--require-rank** _RANK_NAME_

**\--restrict-to-taxon** | **-r** _TAXID_

## Filtering options

**\--has-attribute** | **-A** _KEY_...

**\--id-list** _FILENAME_

**\--identifier** | **-I** _PATTERN_

{{< include ../lib/options/selection/_max-count.qmd >}}

{{< include ../lib/options/selection/_min-count.qmd >}}

{{< include ../lib/options/selection/_max-length.qmd >}}

{{< include ../lib/options/selection/_min-length.qmd >}}

**\--predicate** | **-p** _EXPRESSION_

{{< include ../lib/options/selection/_sequence.qmd >}}

**\--inverse-match** | **-v**

**\--save-discarded** _FILENAME_

# ENVIRONMENT

**OBICPUMAX**

# EXAMPLES

- Filtering a sequence file to keep only barcodes between 8 and 130 bp long.

  ```bash
  obigrep -l 8 -L 130 data_SPER01.fasta > data_goodLength_SPER01.fasta
  ```

- Keeping only reads whose sequence contains no ambiguous base codes.

  ```bash
  obigrep -s '^[acgt]+$' data_SPER01.fasta > data_onlyACGT_SPER01.fasta
  ```

- Filtering paired files to keep only read pairs in which neither read contains an ambiguous base.

  ```bash
  obigrep -s '^[acgt]+$' \
          --paired-mode and --paired-with wolf_R.fastq.gz \
          --out wolf_good.fastq \
          wolf_F.fastq.gz
  ```

  That command produces two files, `wolf_good_R1.fastq` and `wolf_good_R2.fastq`,
  containing the filtered forward and reverse reads, respectively.

# SEE ALSO

`obiannotate`

# HISTORY

# BUGS

Submit bug reports online at: https://git.metabarcoding.org/obitools/obitools4/obitools4/-/issues