Update wolf_tutorial
@@ -1,6 +1,6 @@
# Wolf tutorial with the OBITools3

A (cooler) remake of the infamous [wolf tutorial](https://pythonhosted.org/OBITools/wolves.html). And a work in progress.
The OBITools3 version of the [wolf tutorial](https://pythonhosted.org/OBITools/wolves.html) made for the first OBITools.

### 0.1 Before starting: the OBITools3 data structure

@@ -44,28 +44,44 @@ The new database system used by the OBITools3 (called **DMS** for Data Managemen

### 0.3 Before starting: installing the OBITools3

Not working yet...
This is going to change, but for now:

Requirements: **python3, python3-venv, git, CMake**

Then you can do:

git clone https://git.metabarcoding.org/obitools/obitools3.git
cd obitools3
python3 -m venv obi3-env
. obi3-env/bin/activate
pip install cython
python setup.py install

And test the installation with:

obi test

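The requirements listed above can be checked before cloning. This is a minimal sketch, assuming the tools are installed under their usual command names (`python3`, `git`, `cmake`) and that `python3-venv` is what makes Python's standard `venv` and `ensurepip` modules importable. `missing_tools` is a hypothetical helper, not an OBITools3 command.

```shell
# Hypothetical helper: print the name of every command that is not on PATH.
# Prints nothing when all of them are found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Command names are assumptions; adjust them if your system uses other names.
missing_tools python3 git cmake

# Assumption: on distributions that package venv separately, creating a
# virtual environment needs both modules below. This also reports a problem
# when python3 itself is missing.
python3 -c 'import venv, ensurepip' 2>/dev/null || echo "python3-venv"
```

If the script prints nothing, the installation commands should at least find the tools they rely on.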
### 1. Import the sequencing data in a DMS

Download the reads and the ngsfilter file:
Download this archive containing the reads and the ngsfilter file:

[wolf_F.fastq.gz](/uploads/09dada3587189c3b3a7af7024981c074/wolf_F.fastq.gz)
[wolf_tutorial.tar.gz](/uploads/9b86f67ad05815ddee14526640d81137/wolf_tutorial.tar.gz)

[wolf_R.fastq.gz](/uploads/a95dbad14b75474c8307cab56fa083ca/wolf_R.fastq.gz)
And extract it:

tar -zxvf wolf_tutorial.tar.gz

[wolf_diet_ngsfilter.txt](/uploads/379d01fabbe9adf21d33c1fd8f5ee43c/wolf_diet_ngsfilter.txt)

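Before importing, it can help to confirm that the archive extracted where the import commands expect it. This is a minimal sketch, assuming the archive unpacks into a `wolf_tutorial/` directory containing `wolf_F.fastq` and `wolf_R.fastq` (the paths used by the import commands). `missing_files` is a hypothetical helper, not part of the OBITools3.

```shell
# Hypothetical helper: report every expected file that does not exist.
# Prints nothing when all files are present.
missing_files() {
    for f in "$@"; do
        [ -f "$f" ] || printf 'missing: %s\n' "$f"
    done
}

# Paths taken from the import commands of this tutorial.
missing_files wolf_tutorial/wolf_F.fastq wolf_tutorial/wolf_R.fastq
```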
1. Import the first set of reads with:

obi import --quality-solexa wolf_tutorial/wolf_F.fastq.gz wolf/reads1
obi import --quality-solexa wolf_tutorial/wolf_F.fastq wolf/reads1

`--quality-solexa` is the appropriate FASTQ quality option because this is an old dataset; `wolf_tutorial/wolf_F.fastq` is the path to the file to import; `wolf` is the path to the DMS, which is created automatically; and `reads1` is the name of the view into which the file is imported.
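
To make the `wolf/reads1` argument concrete: the explanation above reads it as a DMS path followed by a view name. The sketch below only illustrates that reading with plain shell parameter expansion. It is not how `obi` parses its arguments, and `split_view_path` is a hypothetical helper.

```shell
# Hypothetical helper: split "DMS/view" at the last slash, following the
# reading given in the text (DMS path first, view name last).
split_view_path() {
    uri=$1
    dms=${uri%/*}     # everything before the last slash: the DMS path
    view=${uri##*/}   # everything after the last slash: the view name
    printf 'DMS=%s view=%s\n' "$dms" "$view"
}

split_view_path wolf/reads1   # prints: DMS=wolf view=reads1
```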

2. Import the second set of reads:

obi import --quality-solexa wolf_tutorial/wolf_R.fastq.gz wolf/reads2
obi import --quality-solexa wolf_tutorial/wolf_R.fastq wolf/reads2

3. Import the [ngsfilter file](https://pythonhosted.org/OBITools/scripts/ngsfilter.html) describing the primers and tags used for each sample:

@@ -99,7 +115,7 @@ Unlike the OBITools1, the OBITools3 make it possible to run ngsfilter before ali

### 4. Remove unaligned sequence records

obi grep -p "mode!=b'joined'" wolf/aligned_reads wolf/good_sequences
obi grep -a mode:alignment wolf/aligned_reads wolf/good_sequences

### 5. Dereplicate reads into unique sequences

@@ -109,7 +125,7 @@ Unlike the OBITools1, the OBITools3 make it possible to run ngsfilter before ali

1. First, let's remove the unneeded metadata, keeping only the `COUNT` and `merged_sample` (count by sample) tags:

obi annotate -k COUNT -k merged_sample wolf/dereplicated_sequences wolf/cleaned_metadata_sequences
obi annotate -k COUNT -k MERGED_sample wolf/dereplicated_sequences wolf/cleaned_metadata_sequences

2. Keep only the sequences having a count greater than or equal to 10 and a length shorter than 80 bp:

@@ -117,7 +133,7 @@ Unlike the OBITools1, the OBITools3 make it possible to run ngsfilter before ali

3. Remove PCR/sequencing errors (sequence variants) from the sequences:

obi clean -s merged_sample -r 0.05 -H wolf/denoised_sequences wolf/cleaned_sequences
obi clean -s MERGED_sample -r 0.05 -H wolf/denoised_sequences wolf/cleaned_sequences

### 7. Taxonomic assignment of the sequences
