@ -16,30 +16,30 @@
|
|||||||
.. code-block:: bash
|
.. code-block:: bash
|
||||||
|
|
||||||
> obitaxonomy -d my_ecopcr_database \
|
> obitaxonomy -d my_ecopcr_database \
|
||||||
-a 'Gentiana alpina':'species':21496
|
-a 'Gentiana alpina':'species':49934
|
||||||
|
|
||||||
Adds a taxon with the scientific name *Gentiana alpina* and the rank *species* under
|
Adds a taxon with the scientific name *Gentiana alpina* and the rank *species* under
|
||||||
the taxon whose taxid is 21496.
|
the taxon whose taxid is 49934.
|
||||||
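The value given to ``-a`` in the example above packs three fields separated by colons (the shell removes the single quotes before the program sees them). As a rough illustration only, and not :py:mod:`obitaxonomy`'s own parsing code, such a triple could be split as follows; the function name ``parse_new_taxon`` is hypothetical.

```python
def parse_new_taxon(spec):
    """Split a 'name:rank:parent_taxid' string into its three fields.

    Hypothetical helper for illustration; not part of obitaxonomy.
    """
    # Split from the right so the scientific name is kept in one piece.
    name, rank, parent_taxid = spec.rsplit(":", 2)
    return name.strip(), rank.strip(), int(parent_taxid)


# What the program receives once the shell has stripped the quotes.
print(parse_new_taxon("Gentiana alpina:species:49934"))
# -> ('Gentiana alpina', 'species', 49934)
```

A malformed value (fewer than three fields, or a non-integer taxid) raises ``ValueError`` in this sketch.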
|
|
||||||
|
|
||||||
.. cmdoption:: -m <####>, --min-taxid=<####>
|
.. cmdoption:: -m <####>, --min-taxid=<####>
|
||||||
|
|
||||||
Minimum taxid for the newly added taxid(s).
|
Minimum *taxid* for the newly added *taxid(s)*.
|
||||||
|
|
||||||
*Example:*
|
*Example:*
|
||||||
|
|
||||||
.. code-block:: bash
|
.. code-block:: bash
|
||||||
|
|
||||||
> obitaxonomy -d my_ecopcr_database -m 1000000000 \
|
> obitaxonomy -d my_ecopcr_database -m 1000000000 \
|
||||||
-a 'Gentiana alpina':'species':21496
|
-a 'Gentiana alpina':'species':49934
|
||||||
|
|
||||||
Adds a taxon with the scientific name *Gentiana alpina* and the rank *species* under
|
Adds a taxon with the scientific name *Gentiana alpina* and the rank *species* under
|
||||||
the taxon whose taxid is 21496, with a taxid greater than or equal to 1000000000.
|
the taxon whose *taxid* is 49934, with a *taxid* greater than or equal to 1000000000.
|
||||||
|
|
||||||
|
|
||||||
.. cmdoption:: -D <TAXID>, --delete-local-taxon=<TAXID>
|
.. cmdoption:: -D <TAXID>, --delete-local-taxon=<TAXID>
|
||||||
|
|
||||||
Deletes the local taxon with the taxid <TAXID> from the
|
Deletes the local taxon with the *taxid* <TAXID> from the
|
||||||
taxonomic database.
|
taxonomic database.
|
||||||
|
|
||||||
*Example:*
|
*Example:*
|
||||||
@ -71,22 +71,22 @@
|
|||||||
|
|
||||||
Adds a new favorite scientific name to the taxonomy.
|
Adds a new favorite scientific name to the taxonomy.
|
||||||
The new name is described by two values separated by
|
The new name is described by two values separated by
|
||||||
a colon: the new favorite name and the taxid of the taxon.
|
a colon: the new favorite name and the *taxid* of the taxon.
|
||||||
|
|
||||||
*Example:*
|
*Example:*
|
||||||
|
|
||||||
.. code-block:: bash
|
.. code-block:: bash
|
||||||
|
|
||||||
> obitaxonomy -d my_ecopcr_database \
|
> obitaxonomy -d my_ecopcr_database \
|
||||||
-f 'Gentiana algida':10000832
|
-f 'Gentiana algida':50748
|
||||||
|
|
||||||
Adds the favorite scientific name *Gentiana algida* for the taxid 10000832 in the taxonomic database.
|
Adds the favorite scientific name *Gentiana algida* for the *taxid* 50748 in the taxonomic database.
|
||||||
|
|
||||||
|
|
||||||
.. cmdoption:: -F <FILE_NAME>, --file-name=<FILE_NAME>
|
.. cmdoption:: -F <FILE_NAME>, --file-name=<FILE_NAME>
|
||||||
|
|
||||||
Adds all the taxa from a sequence file in OBITools extended
|
Adds all the taxa from a sequence file in ``OBITools`` extended
|
||||||
fasta format, and eventually their ancestors to the database
|
:doc:`fasta <../fasta>` format, and eventually their ancestors to the database
|
||||||
(see documentation). Each sequence record must contain the
|
(see documentation). Each sequence record must contain the
|
||||||
attribute specified by the ``-k`` option.
|
attribute specified by the ``-k`` option.
|
||||||
|
|
||||||
@ -110,12 +110,12 @@
|
|||||||
|
|
||||||
.. cmdoption:: -A <ANCESTOR>, --restricting_ancestor=<ANCESTOR>
|
.. cmdoption:: -A <ANCESTOR>, --restricting_ancestor=<ANCESTOR>
|
||||||
|
|
||||||
Works with the ``-F`` option. Can be a taxid (integer) or
|
Works with the ``-F`` option. Can be a *taxid* (integer) or
|
||||||
a key (string). If it is a taxid, this taxid is the
|
a key (string). If it is a *taxid*, this *taxid* is the
|
||||||
default taxid under which the new taxon is added if
|
default *taxid* under which the new taxon is added if
|
||||||
none of its ancestors are specified or can be found.
|
none of its ancestors are specified or can be found.
|
||||||
If it is a key, :py:mod:`obitaxonomy`: looks for the
|
If it is a key, :py:mod:`obitaxonomy` looks for the
|
||||||
ancestor taxid in the corresponding attribute, and the
|
ancestor *taxid* in the corresponding attribute, and the
|
||||||
new taxon is *systematically* added under this ancestor.
|
new taxon is *systematically* added under this ancestor.
|
||||||
By default, the restricting ancestor is the root of the
|
By default, the restricting ancestor is the root of the
|
||||||
taxonomic tree for all the new taxa.
|
taxonomic tree for all the new taxa.
|
||||||
@ -130,7 +130,7 @@
|
|||||||
Adds the taxon of each sequence record from the file ``my_sequences.fasta`` in the taxonomic
|
Adds the taxon of each sequence record from the file ``my_sequences.fasta`` in the taxonomic
|
||||||
database, based on the scientific name contained in the ``my_taxon_name_key`` attribute. If
|
database, based on the scientific name contained in the ``my_taxon_name_key`` attribute. If
|
||||||
the genus of the new taxon cannot be found, the new taxon is added under the taxon whose
|
the genus of the new taxon cannot be found, the new taxon is added under the taxon whose
|
||||||
taxid is 33090.
|
*taxid* is 33090.
|
||||||
|
|
||||||
|
|
||||||
.. cmdoption:: -p <PATH>, --path=<PATH>
|
.. cmdoption:: -p <PATH>, --path=<PATH>
|
||||||
@ -151,7 +151,7 @@
|
|||||||
|
|
||||||
Adds the taxon of each sequence record from the file ``my_sequences.fasta`` in the taxonomic
|
Adds the taxon of each sequence record from the file ``my_sequences.fasta`` in the taxonomic
|
||||||
database, based on the scientific name contained in the ``my_taxon_name_key`` attribute.
|
database, based on the scientific name contained in the ``my_taxon_name_key`` attribute.
|
||||||
Each ancestor contained in the ``my_taxonomic_path_key`` attribute is added if it doesn't
|
Each ancestor contained in the ``my_taxonomic_path_key`` attribute is added if it does not
|
||||||
already exist, and the new taxon is added under the latest ancestor of the path.
|
already exist, and the new taxon is added under the latest ancestor of the path.
|
||||||
|
|
||||||
|
|
||||||
|
@ -1,11 +1,11 @@
|
|||||||
#!/usr/local/bin/python
|
#!/usr/local/bin/python
|
||||||
'''
|
'''
|
||||||
:py:mod:`obitaxonomy`: Manages taxonomic databases
|
:py:mod:`obitaxonomy`: manages taxonomic databases
|
||||||
==================================================
|
==================================================
|
||||||
|
|
||||||
.. codeauthor:: Eric Coissac <eric.coissac@metabarcoding.org> and Celine Mercier <celine.mercier@metabarcoding.org>
|
.. codeauthor:: Eric Coissac <eric.coissac@metabarcoding.org> and Celine Mercier <celine.mercier@metabarcoding.org>
|
||||||
|
|
||||||
The :py:mod:`obitaxonomy` command can generate an ecopcr database from a NCBI taxdump
|
The :py:mod:`obitaxonomy` command can generate an ecoPCR database from a NCBI taxdump
|
||||||
(see NCBI ftp site) and allows managing the taxonomic data contained in both types of
|
(see NCBI ftp site) and allows managing the taxonomic data contained in both types of
|
||||||
database.
|
database.
|
||||||
|
|
||||||
@ -14,12 +14,12 @@ Several types of editing are possible:
|
|||||||
**Adding a taxon to the database**
|
**Adding a taxon to the database**
|
||||||
|
|
||||||
The new taxon is described by three values:
|
The new taxon is described by three values:
|
||||||
its scientific name, its taxonomic rank, and the taxid of its first ancestor.
|
its scientific name, its taxonomic rank, and the *taxid* of its first ancestor.
|
||||||
Done by using the ``-a`` option.
|
Done by using the ``-a`` option.
|
||||||
|
|
||||||
**Deleting a taxon from the database /MARCHE PAS/**
|
**Deleting a taxon from the database**
|
||||||
|
|
||||||
Erases a local taxon. Done by using the ``-D`` option and specifying a taxid.
|
Erases a local taxon. Done by using the ``-D`` option and specifying a *taxid*.
|
||||||
|
|
||||||
**Adding a species to the database**
|
**Adding a species to the database**
|
||||||
|
|
||||||
@ -27,15 +27,15 @@ Several types of editing are possible:
|
|||||||
added under its genus. Done by using the ``-s`` option and specifying a species
|
added under its genus. Done by using the ``-s`` option and specifying a species
|
||||||
scientific name.
|
scientific name.
|
||||||
|
|
||||||
**Adding a preferred scientific name for a taxon in the database (????)**
|
**Adding a preferred scientific name for a taxon in the database**
|
||||||
|
|
||||||
Adds a preferred name for a taxon in the taxonomy, by specifying the new favorite
|
Adds a preferred name for a taxon in the taxonomy, by specifying the new favorite
|
||||||
name and the taxid of the taxon whose preferred name should be changed (???).
|
name and the *taxid* of the taxon whose preferred name should be changed.
|
||||||
Done by using the ``-f`` option.
|
Done by using the ``-f`` option.
|
||||||
|
|
||||||
**Adding all the taxa from a sequence file in OBITools extended fasta format to the database**
|
**Adding all the taxa from a sequence file in the ``OBITools`` extended :doc:`fasta <../fasta>` format to the database**
|
||||||
|
|
||||||
All the taxon from a file in :doc:`OBITools extended fasta format <../fasta>`, and eventually their ancestors, are added to the
|
All the taxon from a file in the ``OBITools`` extended :doc:`fasta <../fasta>` format, and eventually their ancestors, are added to the
|
||||||
taxonomy database.
|
taxonomy database.
|
||||||
|
|
||||||
The header of each sequence record must contain the attribute defined by the
|
The header of each sequence record must contain the attribute defined by the
|
||||||
@ -45,10 +45,10 @@ Several types of editing are possible:
|
|||||||
A taxonomic path for each sequence record can be specified with the ``-p`` option,
|
A taxonomic path for each sequence record can be specified with the ``-p`` option,
|
||||||
as the attribute key that contains the taxonomic path of the taxon to be added.
|
as the attribute key that contains the taxonomic path of the taxon to be added.
|
||||||
|
|
||||||
A restricting ancestor can be specified with the ``-A`` option, either as a taxid
|
A restricting ancestor can be specified with the ``-A`` option, either as a *taxid*
|
||||||
(integer) or a key (string). If it is a taxid, this taxid is the default taxid
|
(integer) or a key (string). If it is a *taxid*, this *taxid* is the default *taxid*
|
||||||
under which the new taxon is added if none of its ancestors are specified or can
|
under which the new taxon is added if none of its ancestors are specified or can
|
||||||
be found. If it is a key, :py:mod:`obitaxonomy`: looks for the ancestor taxid in
|
be found. If it is a key, :py:mod:`obitaxonomy` looks for the ancestor *taxid* in
|
||||||
the corresponding attribute, and the new taxon is systematically added under this
|
the corresponding attribute, and the new taxon is systematically added under this
|
||||||
ancestor. By default, the restricting ancestor is the root of the taxonomic tree for
|
ancestor. By default, the restricting ancestor is the root of the taxonomic tree for
|
||||||
all the new taxa.
|
all the new taxa.
|
||||||
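The ``-A`` value described above is read either as a *taxid* (integer) or as an attribute key (string). One plausible way to tell the two apart is sketched below; this is an assumption made for illustration, not the actual :py:mod:`obitaxonomy` code, and the function name ``interpret_restricting_ancestor`` is hypothetical.

```python
def interpret_restricting_ancestor(value):
    """Classify a -A value as a taxid or as an attribute key.

    Hypothetical helper: anything that parses as an integer is
    treated as a taxid, everything else as an attribute key.
    """
    try:
        return ("taxid", int(value))
    except ValueError:
        return ("key", value)


print(interpret_restricting_ancestor("33090"))
# -> ('taxid', 33090)
print(interpret_restricting_ancestor("my_ancestor_key"))
# -> ('key', 'my_ancestor_key')
```

Here ``my_ancestor_key`` is a made-up attribute name used only for the example.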
@ -58,15 +58,14 @@ Several types of editing are possible:
|
|||||||
genus in the taxonomic database. If the genus is found, the new taxon is added under it.
|
genus in the taxonomic database. If the genus is found, the new taxon is added under it.
|
||||||
If not, it is added under the restricting ancestor.
|
If not, it is added under the restricting ancestor.
|
||||||
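The fallback rule just described (genus if found, restricting ancestor otherwise) can be sketched in a few lines. This is a simplified illustration under stated assumptions: the genus is taken to be the first word of the scientific name, and ``genus_taxids`` is a hypothetical dictionary standing in for the taxonomic database. The taxids are reused from the examples in this documentation and serve only as sample values.

```python
def choose_parent(scientific_name, genus_taxids, restricting_ancestor):
    """Return the taxid under which a new taxon would be attached.

    Hypothetical sketch: `genus_taxids` maps genus names to taxids.
    """
    # Assumption: the genus is the first word of the scientific name.
    genus = scientific_name.split()[0]
    # Use the genus when it is known, else fall back to the restricting ancestor.
    return genus_taxids.get(genus, restricting_ancestor)


known_genera = {"Gentiana": 49934}  # sample value for illustration
print(choose_parent("Gentiana alpina", known_genera, 33090))
# -> 49934
print(choose_parent("Unknownus exemplaris", known_genera, 33090))
# -> 33090
```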
|
|
||||||
It is highly recommended to check what was exactly done by reading the output,
|
It is highly recommended checking what was exactly done by reading the output,
|
||||||
since :py:mod:`obitaxonomy` tries to be clever about what is done but can make
|
since :py:mod:`obitaxonomy` uses *ad hoc* parsing and decision rules.
|
||||||
mistakes.
|
|
||||||
|
|
||||||
Done by using the ``-F`` option.
|
Done by using the ``-F`` option.
|
||||||
|
|
||||||
**Notes:**
|
**Notes:**
|
||||||
|
|
||||||
- When a taxon is added, a new taxid is assigned to it. The minimum for the new taxids
|
- When a taxon is added, a new *taxid* is assigned to it. The minimum for the new *taxids*
|
||||||
can be specified by the ``-m`` option and is equal to 10000000 by default.
|
can be specified by the ``-m`` option and is equal to 10000000 by default.
|
||||||
|
|
||||||
- For each modification, a line is printed with details on what was done.
|
- For each modification, a line is printed with details on what was done.
|
||||||
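The first note only states that a new *taxid* is never smaller than the minimum given by ``-m`` (10000000 by default). How :py:mod:`obitaxonomy` actually picks the value is not described here; the sketch below shows one simple policy that satisfies the constraint, with the hypothetical name ``next_local_taxid``.

```python
DEFAULT_MIN_TAXID = 10000000  # default minimum stated in the note above


def next_local_taxid(existing_taxids, min_taxid=DEFAULT_MIN_TAXID):
    """Return the smallest unused taxid that is >= min_taxid.

    Hypothetical allocation policy, for illustration only.
    """
    used = set(existing_taxids)
    candidate = min_taxid
    # Skip over taxids that are already taken.
    while candidate in used:
        candidate += 1
    return candidate


print(next_local_taxid([1, 33090, 49934]))
# -> 10000000
print(next_local_taxid([10000000, 10000001]))
# -> 10000002
print(next_local_taxid([], min_taxid=1000000000))
# -> 1000000000
```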
|