Aurelie Bonin
2014-01-31 14:47:07 +00:00
parent 977ebda88f
commit 927157c735
2 changed files with 34 additions and 35 deletions

@ -16,30 +16,30 @@
.. code-block:: bash
> obitaxonomy -d my_ecopcr_database \
-a 'Gentiana alpina':'species':21496
-a 'Gentiana alpina':'species':49934
Adds a taxon with the scientific name *Gentiana alpina* and the rank *species* under
the taxon whose taxid is 21496.
the taxon whose taxid is 49934.
.. cmdoption:: -m <####>, --min-taxid=<####>
Minimum taxid for the newly added taxid(s).
Minimum *taxid* for the newly added *taxid(s)*.
*Example:*
.. code-block:: bash
> obitaxonomy -d my_ecopcr_database -m 1000000000 \
-a 'Gentiana alpina':'species':21496
-a 'Gentiana alpina':'species':49934
Adds a taxon with the scientific name *Gentiana alpina* and the rank *species* under
the taxon whose taxid is 21496, with a taxid greater than or equal to 1000000000.
the taxon whose *taxid* is 49934, with a *taxid* greater than or equal to 1000000000.
.. cmdoption:: -D <TAXID>, --delete-local-taxon=<TAXID>
Deletes the local taxon with the taxid <TAXID> from the
Deletes the local taxon with the *taxid* <TAXID> from the
taxonomic database.
*Example:*
@ -71,22 +71,22 @@
Adds a new favorite scientific name to the taxonomy.
The new name is described by two values separated by
a colon: the new favorite name and the taxid of the taxon.
a colon: the new favorite name and the *taxid* of the taxon.
*Example:*
.. code-block:: bash
> obitaxonomy -d my_ecopcr_database \
-f 'Gentiana algida':10000832
-f 'Gentiana algida':50748
Adds the favorite scientific name *Gentiana algida* for the taxid 10000832 in the taxonomic database.
Adds the favorite scientific name *Gentiana algida* for the *taxid* 50748 in the taxonomic database.
.. cmdoption:: -F <FILE_NAME>, --file-name=<FILE_NAME>
Adds all the taxa from a sequence file in OBITools extended
fasta format, and eventually their ancestors to the database
Adds all the taxa from a sequence file in ``OBITools`` extended
:doc:`fasta <../fasta>` format, and possibly their ancestors, to the database
(see documentation). Each sequence record must contain the
attribute specified by the ``-k`` option.
@ -110,12 +110,12 @@
.. cmdoption:: -A <ANCESTOR>, --restricting_ancestor=<ANCESTOR>
Works with the ``-F`` option. Can be a taxid (integer) or
a key (string). If it is a taxid, this taxid is the
default taxid under which the new taxon is added if
Works with the ``-F`` option. Can be a *taxid* (integer) or
a key (string). If it is a *taxid*, this *taxid* is the
default *taxid* under which the new taxon is added if
none of its ancestors are specified or can be found.
If it is a key, :py:mod:`obitaxonomy`: looks for the
ancestor taxid in the corresponding attribute, and the
If it is a key, :py:mod:`obitaxonomy` looks for the
ancestor *taxid* in the corresponding attribute, and the
new taxon is *systematically* added under this ancestor.
By default, the restricting ancestor is the root of the
taxonomic tree for all the new taxa.
@ -129,8 +129,8 @@
Adds the taxon of each sequence record from the file ``my_sequences.fasta`` in the taxonomic
database, based on the scientific name contained in the ``my_taxon_name_key`` attribute. If
the genus of the new taxon can not be found, the new taxon is added under the taxon whose
taxid is 33090.
the genus of the new taxon cannot be found, the new taxon is added under the taxon whose
*taxid* is 33090.
.. cmdoption:: -p <PATH>, --path=<PATH>
@ -151,7 +151,7 @@
Adds the taxon of each sequence record from the file ``my_sequences.fasta`` in the taxonomic
database, based on the scientific name contained in the ``my_taxon_name_key`` attribute.
Each ancestor contained in the ``my_taxonomic_path_key`` attribute is added if it doesn't
Each ancestor contained in the ``my_taxonomic_path_key`` attribute is added if it does not
already exist, and the new taxon is added under the last ancestor in the path.

@ -1,11 +1,11 @@
#!/usr/local/bin/python
'''
:py:mod:`obitaxonomy`: Manages taxonomic databases
:py:mod:`obitaxonomy`: manages taxonomic databases
==================================================
.. codeauthor:: Eric Coissac <eric.coissac@metabarcoding.org> and Celine Mercier <celine.mercier@metabarcoding.org>
The :py:mod:`obitaxonomy` command can generate an ecopcr database from a NCBI taxdump
The :py:mod:`obitaxonomy` command can generate an ecoPCR database from an NCBI taxdump
(see NCBI ftp site) and allows managing the taxonomic data contained in both types of
database.
@ -14,12 +14,12 @@ Several types of editing are possible:
**Adding a taxon to the database**
The new taxon is described by three values:
its scientific name, its taxonomic rank, and the taxid of its first ancestor.
its scientific name, its taxonomic rank, and the *taxid* of its first ancestor.
Done by using the ``-a`` option.
**Deleting a taxon from the database /MARCHE PAS/**
**Deleting a taxon from the database**
Erases a local taxon. Done by using the ``-D`` option and specifying a taxid.
Erases a local taxon. Done by using the ``-D`` option and specifying a *taxid*.
**Adding a species to the database**
@ -27,15 +27,15 @@ Several types of editing are possible:
added under its genus. Done by using the ``-s`` option and specifying a species
scientific name.
**Adding a preferred scientific name for a taxon in the database (????)**
**Adding a preferred scientific name for a taxon in the database**
Adds a preferred name for a taxon in the taxonomy, by specifying the new favorite
name and the taxid of the taxon whose preferred name should be changed (???).
name and the *taxid* of the taxon whose preferred name should be changed.
Done by using the ``-f`` option.
**Adding all the taxa from a sequence file in OBITools extended fasta format to the database**
**Adding all the taxa from a sequence file in the ``OBITools`` extended :doc:`fasta <../fasta>` format to the database**
All the taxon from a file in :doc:`OBITools extended fasta format <../fasta>`, and eventually their ancestors, are added to the
All the taxa from a file in the ``OBITools`` extended :doc:`fasta <../fasta>` format, and possibly their ancestors, are added to the
taxonomy database.
The header of each sequence record must contain the attribute defined by the
@ -45,10 +45,10 @@ Several types of editing are possible:
A taxonomic path for each sequence record can be specified with the ``-p`` option,
as the attribute key that contains the taxonomic path of the taxon to be added.
A restricting ancestor can be specified with the ``-A`` option, either as a taxid
(integer) or a key (string). If it is a taxid, this taxid is the default taxid
A restricting ancestor can be specified with the ``-A`` option, either as a *taxid*
(integer) or a key (string). If it is a *taxid*, this *taxid* is the default *taxid*
under which the new taxon is added if none of its ancestors are specified or can
be found. If it is a key, :py:mod:`obitaxonomy`: looks for the ancestor taxid in
be found. If it is a key, :py:mod:`obitaxonomy` looks for the ancestor *taxid* in
the corresponding attribute, and the new taxon is systematically added under this
ancestor. By default, the restricting ancestor is the root of the taxonomic tree for
all the new taxa.
@ -58,15 +58,14 @@ Several types of editing are possible:
genus in the taxonomic database. If the genus is found, the new taxon is added under it.
If not, it is added under the restricting ancestor.
It is highly recommended to check what was exactly done by reading the output,
since :py:mod:`obitaxonomy` tries to be clever about what is done but can make
mistakes.
It is highly recommended to check exactly what was done by reading the output,
since :py:mod:`obitaxonomy` uses *ad hoc* parsing and decision rules.
Done by using the ``-F`` option.
**Notes:**
- When a taxon is added, a new taxid is assigned to it. The minimum for the new taxids
- When a taxon is added, a new *taxid* is assigned to it. The minimum for the new *taxids*
can be specified by the ``-m`` option and is equal to 10000000 by default.
- For each modification, a line is printed with details on what was done.
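
The genus lookup described above for the ``-F`` option can be pictured with a minimal sketch. This is not the actual :py:mod:`obitaxonomy` code: the function name ``choose_parent_taxid`` and the ``genus_taxids`` dictionary are hypothetical stand-ins for the taxonomic database, and the genus taxid used in the usage lines is made up for illustration. Only the decision rule itself (use the genus if it is found, otherwise fall back to the restricting ancestor) comes from the text.

```python
def choose_parent_taxid(scientific_name, genus_taxids, restricting_ancestor):
    """Return the taxid under which a new taxon would be attached.

    scientific_name      -- e.g. 'Gentiana alpina'; the first word is taken as the genus
    genus_taxids         -- hypothetical dict {genus name: taxid} standing in for the database
    restricting_ancestor -- default taxid used when the genus is not found
    """
    words = scientific_name.split()
    if not words:
        raise ValueError("empty scientific name")
    genus = words[0]
    # Genus found: attach the new taxon under it.
    # Genus missing: fall back to the restricting ancestor.
    return genus_taxids.get(genus, restricting_ancestor)


# Usage with illustrative values (1001 is a made-up taxid; 33090 is the
# restricting ancestor used in the -A example of the documentation).
known_genera = {"Gentiana": 1001}
parent_found = choose_parent_taxid("Gentiana alpina", known_genera, 33090)
parent_fallback = choose_parent_taxid("Unknownus exampli", known_genera, 33090)
```

Whatever the real implementation does, the printed output line for each modification remains the reliable record of where a taxon was actually placed.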