*********************************************
The OBItools3 Data Management System (OBIDMS)
*********************************************

A complete DNA metabarcoding experiment relies on several kinds of data:

- the sequence data resulting from the sequencing of the PCR products,
- the description of the samples, including all their metadata,
- one or several reference databases used for the taxonomic annotation,
- one or several taxonomy databases.

Up to now, each of these categories of data was stored in separate
files, and nothing made it mandatory to keep them together.

The `Data Management System` (DMS) of OBITools3 can be viewed as a basic
database system.

OBIDMS UML
==========

.. image:: ./UML/OBIDMS_UML.png

:download:`html version of the OBIDMS UML file <UML/ObiDMS_UML.class.violet.html>`

An OBIDMS directory contains:

* one `OBIDMS history file <#obidms-history-file>`_
* OBIDMS column directories

OBIDMS column directories
=========================

Each OBIDMS column directory contains:

* all the different versions of one OBIDMS column, each stored as a separate file (`OBIDMS column files <#obidms-column-files>`_)
* one `OBIDMS version file <#obidms-version-files>`_

The directory is named after the column's attribute, with the extension ``.obicol``.

Example: ``count.obicol``

OBIDMS column files
===================

Each OBIDMS column file contains:

* a header containing metadata, whose size is a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
  on most systems)
* lines of data that all have the same `OBIType <types.html#obitypes>`_
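
The PAGESIZE constraint on the header can be illustrated with a minimal Python sketch that rounds a metadata size up to the next multiple of the page size. The function name ``header_size_for`` is hypothetical and is not part of OBITools3; ``mmap.PAGESIZE`` is the page size reported by the Python standard library.

```python
import mmap


def header_size_for(metadata_bytes: int, page_size: int = mmap.PAGESIZE) -> int:
    """Return the smallest positive multiple of page_size that holds metadata_bytes.

    Hypothetical helper, for illustration only.
    """
    if metadata_bytes < 0:
        raise ValueError("metadata_bytes must not be negative")
    # Ceiling division; a header always occupies at least one page.
    pages = max(1, -(-metadata_bytes // page_size))
    return pages * page_size


if __name__ == "__main__":
    # With 4096-byte pages, 5000 bytes of metadata need two pages.
    print(header_size_for(5000, page_size=4096))
```

Rounding up to whole pages keeps the start of the data aligned on a page boundary, which is presumably the reason for the constraint.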

Header
------

The header of an OBIDMS column file contains:

* Byte order (endianness)
* Header size (PAGESIZE multiple)
* Number of lines of data
* Number of lines of data used
* `OBIType <types.html#obitypes>`_ (type of the data)
* Date of creation of the file
* Version of the OBIDMS column
* The column name
* Optional comments


Data
----

A line of data corresponds to a vector of elements. Each element is associated with an element name.
Element names are stored in the header. An element is matched to its name by position: the first
element of a line goes with the first element name, the second with the second, and so on.
This structure allows the storage of dictionary-like data.

Example: in the header, the attribute ``elements_names`` is associated with the value
``"sample_1;sample_2;sample_3"``, and a line of data with the type ``OBInt_t`` is stored as an ``OBInt_t`` vector
of size three, e.g. ``5|8|4``.
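
The positional matching described above can be sketched in Python as follows. This is only an illustration of the idea, not the OBITools3 implementation: the function name ``line_as_dict`` is hypothetical, and the line of data is modelled as a plain list of integers.

```python
from typing import Dict, Sequence


def line_as_dict(elements_names: str, line: Sequence[int]) -> Dict[str, int]:
    """Pair each element of a line with its name, using their shared order."""
    names = elements_names.split(";")
    if len(names) != len(line):
        raise ValueError("a line must hold exactly one element per element name")
    return dict(zip(names, line))


if __name__ == "__main__":
    # Header value and data line taken from the example above.
    print(line_as_dict("sample_1;sample_2;sample_3", [5, 8, 4]))
    # prints {'sample_1': 5, 'sample_2': 8, 'sample_3': 4}
```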

Mandatory columns
-----------------

Some columns must exist in an OBIDMS directory:

* sequence identifiers column (type ``OBIStr_t``)


File name
---------

Each file is named after the attribute associated with the data it contains, followed by its
version number, separated by an ``@``, and with the extension ``.odc``.

Example: ``count@3.odc``
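
As an illustration of this naming scheme, the following Python sketch builds and parses column file names. Both function names are hypothetical, and the sketch assumes that the version is written as a plain decimal number, as in the example above.

```python
import re
from typing import Tuple

# <attribute>@<version>.odc, with the version written in decimal (assumption).
_COLUMN_FILE_RE = re.compile(r"(?P<attribute>.+)@(?P<version>\d+)\.odc")


def column_file_name(attribute: str, version: int) -> str:
    """Build the file name of one version of a column."""
    return f"{attribute}@{version}.odc"


def parse_column_file_name(file_name: str) -> Tuple[str, int]:
    """Split a column file name into its attribute and version number."""
    match = _COLUMN_FILE_RE.fullmatch(file_name)
    if match is None:
        raise ValueError(f"not an OBIDMS column file name: {file_name!r}")
    return match.group("attribute"), int(match.group("version"))


if __name__ == "__main__":
    print(column_file_name("count", 3))           # prints count@3.odc
    print(parse_column_file_name("count@3.odc"))  # prints ('count', 3)
```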

Modifications
-------------

An OBIDMS column file can only be modified by the process that created it, and only while its status is set to Open.

When a process wants to modify an OBIDMS column file that is closed, it must first clone it. Cloning creates a new version of the
file that belongs to the process, i.e., only that process can modify that file, as long as its status is set to Open. Once the process
has finished writing the new version of the column file, it sets the column file's status to Closed, and the file can never be modified
again.

This means that a column is stored in one file per version: a single file if there is only one version,
several files if there are several versions.
All the versions of one column are stored in one directory.
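
The Open/Closed life cycle can be summarised by a small in-memory Python model. It is a sketch under simplifying assumptions, not the OBITools3 implementation: the class ``ColumnVersion`` and its methods are hypothetical, processes are represented by a bare integer identifier, and the clone is assumed to be made from the latest version so that its number is simply the previous one plus 1.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColumnVersion:
    """One version of a column (hypothetical in-memory model)."""
    version: int
    owner_pid: int                 # process that created this version
    data: List[int] = field(default_factory=list)
    is_open: bool = True           # Open until the owner closes it

    def write(self, values: List[int], pid: int) -> None:
        """Replace the data; allowed only for the owner while Open."""
        if not self.is_open:
            raise PermissionError("a Closed column file can never be modified")
        if pid != self.owner_pid:
            raise PermissionError("only the creating process can modify the file")
        self.data = list(values)

    def close(self) -> None:
        """Set the status to Closed; this is irreversible."""
        self.is_open = False

    def clone(self, pid: int) -> "ColumnVersion":
        """Create a new Open version, owned by pid, holding a copy of the data."""
        return ColumnVersion(self.version + 1, pid, list(self.data))


if __name__ == "__main__":
    v0 = ColumnVersion(version=0, owner_pid=100)
    v0.write([5, 8, 4], pid=100)
    v0.close()
    v1 = v0.clone(pid=200)          # process 200 wants to modify the column
    v1.write([5, 8, 4, 7], pid=200)
    v1.close()
    print(v0.version, v0.data)
    print(v1.version, v1.data)
```

Because a closed version is never rewritten, any earlier state of a column stays readable for as long as its file is kept.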

Versioning
----------

The first version of a column file is numbered 0, and each new version increments that
number by 1.

The number of the latest version of an OBIDMS column is stored in the `OBIDMS version file <#obidms-version-files>`_ of its directory.


OBIDMS version files
====================

Each OBIDMS column is associated with an OBIDMS version file in its directory, which contains the number of the latest
version of the column.

File name
---------

OBIDMS version files are named after the attribute associated with the data contained in the column, and
have the extension ``.odv``.

Example: ``count.odv``


OBIDMS views
============

An OBIDMS view consists of a list of OBIDMS columns and lines. A view includes exactly one version
of each mandatory column, and at most one version of any other column. All the columns of
one view contain the same number of lines in the same order.
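
These constraints can be expressed as a short consistency check. The Python sketch below is an illustration only: ``ViewColumn`` and ``view_problems`` are hypothetical names, a column is reduced to its attribute, version and number of lines, and the attribute name ``id`` used for the mandatory sequence identifiers column is an assumption.

```python
from typing import List, NamedTuple, Sequence


class ViewColumn(NamedTuple):
    """One column version included in a view (hypothetical model)."""
    attribute: str
    version: int
    line_count: int


def view_problems(columns: Sequence[ViewColumn], mandatory: Sequence[str]) -> List[str]:
    """Return the list of rules that a candidate view breaks (empty if valid)."""
    problems = []
    attributes = [column.attribute for column in columns]
    # Only one version of each column may be included.
    for attribute in sorted(set(attributes)):
        if attributes.count(attribute) > 1:
            problems.append(f"several versions of column {attribute!r}")
    # Every mandatory column must be present.
    for attribute in mandatory:
        if attribute not in attributes:
            problems.append(f"missing mandatory column {attribute!r}")
    # All columns must contain the same number of lines.
    if len({column.line_count for column in columns}) > 1:
        problems.append("columns do not all have the same number of lines")
    return problems


if __name__ == "__main__":
    view = [ViewColumn("id", 0, 10), ViewColumn("count", 3, 10)]
    print(view_problems(view, mandatory=["id"]))  # prints []
```

The check does not cover line order, which cannot be verified from line counts alone.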

OBIDMS history file
===================

An OBIDMS history file consists of an ordered list of views and commands, each command leading
from one view to the next one.

This history can be represented in the form of a ?? showing all the
operations ever done in the OBIDMS directory and the views in between them:

.. image:: ./images/history.png
   :width: 150 px
   :align: center
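
The alternation of views and commands can be modelled with two parallel lists, as in the following Python sketch. The class ``History``, its methods, and the view and command names are all hypothetical; the actual format of the history file is not described here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class History:
    """Ordered views, with the commands leading from each view to the next."""
    views: List[str]                                   # starts with the initial view
    commands: List[str] = field(default_factory=list)  # always one shorter than views

    def apply(self, command: str, resulting_view: str) -> None:
        """Record a command together with the view it produces."""
        self.commands.append(command)
        self.views.append(resulting_view)

    def steps(self) -> List[Tuple[str, str, str]]:
        """Return (view before, command, view after) for every recorded command."""
        return [
            (self.views[i], command, self.views[i + 1])
            for i, command in enumerate(self.commands)
        ]


if __name__ == "__main__":
    history = History(views=["view_0"])
    history.apply("command_a", "view_1")   # placeholder command names
    history.apply("command_b", "view_2")
    for before, command, after in history.steps():
        print(before, "--", command, "->", after)
```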