Update wolf_tutorial

2019-09-01 18:19:32 +02:00
parent 0ba6b57157
commit c616d9dada
1 changed file with 27 additions and 11 deletions
--- a/wolf_tutorial.md
+++ b/wolf_tutorial.md
@@ -1,6 +1,6 @@
 # Wolf tutorial with the OBITools3

-A (cooler) remake of the infamous [wolf tutorial](https://pythonhosted.org/OBITools/wolves.html). And a work in progress.
+The OBITools3 version of the [wolf tutorial](https://pythonhosted.org/OBITools/wolves.html) made for the first OBITools.

 ### 0.1 Before starting: the OBITools3 data structure

@@ -44,28 +44,44 @@ The new database system used by the OBITools3 (called **DMS** for Data Managemen

 ### 0.3 Before starting: installing the OBITools3

-Not working yet...
+This is going to change, but for now:
+
+Requirements: **python3, python3-venv, git, CMake**
+
+Then you can do:
+
+		git clone https://git.metabarcoding.org/obitools/obitools3.git
+		cd obitools3
+		python3 -m venv obi3-env
+		. obi3-env/bin/activate
+		pip install cython
+		python setup.py install
+		
+And test the installation with:
+
+		obi test
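
Before running the installation steps above, it can help to confirm that the listed requirements are actually available. The following is a minimal sketch, not part of the OBITools3: `missing_tools` is a hypothetical helper name, and the sketch assumes that CMake is exposed as the `cmake` command and that the `python3-venv` requirement can be approximated by checking that the `venv` and `ensurepip` modules import.

```shell
# Hypothetical helper (not provided by the OBITools3):
# print the name of every given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Assumption: CMake is installed under the command name "cmake".
missing=$(missing_tools python3 git cmake)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing required tools:" $missing
else
    echo "All required command-line tools were found."
fi

# python3-venv is a package rather than a command, so test the modules instead.
# Assumption: a usable venv setup can import both venv and ensurepip.
if command -v python3 >/dev/null 2>&1 && ! python3 -c 'import venv, ensurepip' >/dev/null 2>&1; then
    echo "The Python venv support is missing (python3-venv on Debian/Ubuntu)."
fi
```

If the script reports nothing missing, the `git clone` and `python3 -m venv` steps should at least be able to start; it does not check tool versions.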


 ### 1. Import the sequencing data in a DMS

-Download the reads and the ngs file:
+Download this archive containing the reads and the ngs file:

-[wolf_F.fastq.gz](/uploads/09dada3587189c3b3a7af7024981c074/wolf_F.fastq.gz)
+[wolf_tutorial.tar.gz](/uploads/9b86f67ad05815ddee14526640d81137/wolf_tutorial.tar.gz)

-[wolf_R.fastq.gz](/uploads/a95dbad14b75474c8307cab56fa083ca/wolf_R.fastq.gz)
+And extract it:
+
+		tar -zxvf wolf_tutorial.tar.gz

-[wolf_diet_ngsfilter.txt](/uploads/379d01fabbe9adf21d33c1fd8f5ee43c/wolf_diet_ngsfilter.txt)

 1. Import the first set of reads with:

-		obi import --quality-solexa wolf_tutorial/wolf_F.fastq.gz wolf/reads1
+		obi import --quality-solexa wolf_tutorial/wolf_F.fastq wolf/reads1

 	`--quality-solexa` is the appropriate fastq quality option because it's an old dataset, `wolf_tutorial/wolf_F.fastq` is the path to the file to import, `wolf` is the path to the DMS that will be automatically created, and `reads1` is the name of the view into which the file will be imported.

 2. Import the second set of reads:

-		obi import --quality-solexa wolf_tutorial/wolf_R.fastq.gz wolf/reads2
+		obi import --quality-solexa wolf_tutorial/wolf_R.fastq wolf/reads2

 3. Import the [ngsfilter file](https://pythonhosted.org/OBITools/scripts/ngsfilter.html) describing the primers and tags used for each sample:

@@ -99,7 +115,7 @@ Unlike the OBITools1, the OBITools3 make it possible to run ngsfilter before ali
 	
 ### 4. Remove unaligned sequence records

-	obi grep -p "mode!=b'joined'" wolf/aligned_reads wolf/good_sequences
+	obi grep -a mode:alignment wolf/aligned_reads wolf/good_sequences
 	
 ### 5. Dereplicate reads into unique sequences

@@ -109,7 +125,7 @@ Unlike the OBITools1, the OBITools3 make it possible to run ngsfilter before ali

 1. First let's clean the useless metadata and keep only the `COUNT` and `merged_sample` (count by sample) tags:

-		obi annotate -k COUNT -k merged_sample wolf/dereplicated_sequences wolf/cleaned_metadata_sequences
+		obi annotate -k COUNT -k MERGED_sample wolf/dereplicated_sequences wolf/cleaned_metadata_sequences

 2. Keep only the sequences having a count greater than or equal to 10 and a length shorter than 80 bp:

@@ -117,7 +133,7 @@ Unlike the OBITools1, the OBITools3 make it possible to run ngsfilter before ali

 3. Clean the sequences from PCR/sequencing errors (sequence variants):
 	
-		obi clean -s merged_sample -r 0.05 -H wolf/denoised_sequences wolf/cleaned_sequences
+		obi clean -s MERGED_sample -r 0.05 -H wolf/denoised_sequences wolf/cleaned_sequences

 ### 7. Taxonomic assignment of the sequences