*********************************************
The OBItools3 Data Management System (OBIDMS)
*********************************************
A complete DNA metabarcoding experiment relies on several kinds of data:

- the sequence data resulting from the sequencing of the PCR products,
- the description of the samples, including all their metadata,
- one or several reference databases used for the taxonomic annotation,
- one or several taxonomies.

Until now, each of these categories of data was stored in separate
files, and nothing required them to be kept together.

The `Data Management System` (DMS) of OBITools3 can be considered
as a basic database system.
OBIDMS UML
==========

.. image:: ./images/OBIDMS_UML.png

:download:`html version of the OBIDMS UML file <UML/ObiDMS_UML.class.violet.html>`

An OBIDMS directory consists of:

* OBIDMS column files
* OBIDMS release files
* OBIDMS dictionary files
* one OBIDMS history file

OBIDMS column files
===================
Each OBIDMS column file contains:

* a header whose size is a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
  on most systems) and which contains metadata
* one column of data, all of the same OBIType

Header
------
The header of an OBIDMS column contains :
* Endian byte order
* Header size (PAGESIZE multiple)
*
* File status : Open/Closed
* Owner : PID of the process that created the file, which is the only process allowed to modify it while it is open
* Number of lines (total or without the header?)
* OBIType
* Date of creation
* Version of the file
* Optional comments
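
As an illustration of the page-aligned header size mentioned above, the following Python
sketch rounds a raw header size up to the next multiple of the page size. The function name
``aligned_header_size`` is hypothetical and is not part of the OBITools3 API; the system
page size is taken from ``mmap.PAGESIZE``, and reserving at least one page is an assumption.

.. code-block:: python

    import mmap


    def aligned_header_size(raw_size, page_size=mmap.PAGESIZE):
        """Round raw_size up to a multiple of page_size (hypothetical helper).

        Assumption: a header always occupies at least one full page.
        """
        if raw_size < 0:
            raise ValueError("raw_size must not be negative")
        # Ceiling division without floating point arithmetic.
        pages = max(1, -(-raw_size // page_size))
        return pages * page_size

With a page size of 4096 bytes, a 100-byte header would occupy one page, and a 5000-byte
header would occupy two.
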
Data
----
A column of data with the same OBIType.
Mandatory columns
-----------------
Some columns must exist in an OBIDMS directory :
* sequence identifiers column (type *OBIStr_t*)
File name
---------
Each file is named after the attribute associated with the data it contains, followed by
its version number, the two being separated by an ``@``, and carries the extension ``.odc``.
Example : ``count@3.odc``
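
The naming rule above can be sketched in Python as follows. Both function names are
hypothetical, and the sketch assumes that the version is written as a plain decimal integer
starting at 1 and that the attribute name is everything before the last ``@``.

.. code-block:: python

    import re

    # Assumed pattern: <attribute>@<version>.odc, with a decimal version >= 1.
    COLUMN_FILE_RE = re.compile(r"^(?P<attribute>.+)@(?P<version>[1-9][0-9]*)\.odc$")


    def column_file_name(attribute, version):
        """Build the file name of one version of a column (hypothetical helper)."""
        if version < 1:
            raise ValueError("versions are numbered from 1")
        return f"{attribute}@{version}.odc"


    def parse_column_file_name(file_name):
        """Split a column file name into (attribute, version)."""
        match = COLUMN_FILE_RE.match(file_name)
        if match is None:
            raise ValueError(f"not an OBIDMS column file name: {file_name!r}")
        return match.group("attribute"), int(match.group("version"))
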
Modifications
-------------
An OBIDMS column file can only be modified by the process that created it, and only while its status is set to Open. This information is
stored in the `header <#header>`_.
When a process wants to modify an OBIDMS column file that is closed, it must first clone it. Cloning creates a new version of the
file that belongs to the process, i.e., only that process can modify that file, as long as its status is set to Open. Once the process
has finished writing the new version of the column file, it sets the column file's status to Closed, and the file can never be modified
again.
This means that a column is stored in one file if it has only one version,
or in several files if it has several versions: there is one file per version.
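
The modification rules above can be modelled in memory as a small Python sketch. The
``ColumnVersion`` class, its fields and its methods are hypothetical and only mirror the
rules stated in this section; process identifiers are passed explicitly instead of being
read from the operating system, and the new version number is supplied by the caller.

.. code-block:: python

    from dataclasses import dataclass, field


    @dataclass
    class ColumnVersion:
        """In-memory stand-in for one version of a column file (illustrative only)."""

        attribute: str
        version: int
        owner_pid: int
        is_open: bool = True
        values: list = field(default_factory=list)

        def append(self, value, pid):
            # A closed version can never be modified again.
            if not self.is_open:
                raise PermissionError("column version is closed")
            # Only the process that created the version may write to it.
            if pid != self.owner_pid:
                raise PermissionError("column version belongs to another process")
            self.values.append(value)

        def close(self, pid):
            if pid != self.owner_pid:
                raise PermissionError("column version belongs to another process")
            self.is_open = False

        def clone(self, pid, new_version):
            # Cloning yields a new open version owned by the calling process.
            return ColumnVersion(self.attribute, new_version, pid, True, list(self.values))

A process that wants to change a closed version clones it, writes to the clone, then closes
the clone; the original version is left untouched.
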
Versioning
----------
The first version of a column file is numbered 1, and each new version increments that
number by 1.
The number of the latest version of an OBIDMS column is stored in an `OBIDMS release file <formats.html#obidms-release-files>`_.
OBIDMS release files
====================
Each OBIDMS column is associated with an OBIDMS release file that contains the number of the latest
version of the column.
File name
---------
OBIDMS release files are named after the attribute associated with the data contained in the column, and
have the extension ``.odr``.
Example : ``count.odr``
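
The relation between release files and version numbers can be sketched as follows. The
on-disk layout of a release file is not described here, so this Python sketch uses a plain
dictionary as a stand-in for the set of release files; both function names are hypothetical.

.. code-block:: python

    def release_file_name(attribute):
        """Name of the release file of a column (hypothetical helper)."""
        return f"{attribute}.odr"


    def next_version(latest_versions, attribute):
        """Return the number of the next version of a column and record it.

        latest_versions maps an attribute name to its latest version number
        and stands in for the release files. A column without any recorded
        version starts at version 1.
        """
        new_version = latest_versions.get(attribute, 0) + 1
        latest_versions[attribute] = new_version
        return new_version
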
OBIDMS views
============
An OBIDMS view consists of a list of OBIDMS columns and lines. A view includes one version
of each mandatory column. Only one version of each column is included. All the columns of
one view contain the same number of lines in the same order.
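
The constraints on a view can be checked with a short Python sketch. The function
``validate_view`` is hypothetical, as is the column name ``id`` used for the mandatory
sequence identifiers column. A view is represented as a dictionary keyed by attribute name,
which by construction holds a single version of each column.

.. code-block:: python

    def validate_view(columns, mandatory=("id",)):
        """Check a view and return its number of lines.

        columns maps an attribute name to a (version, line_count) pair.
        The mandatory column names are an assumption made for this sketch.
        """
        missing = [name for name in mandatory if name not in columns]
        if missing:
            raise ValueError(f"missing mandatory columns: {missing}")
        # All the columns of a view must contain the same number of lines.
        line_counts = {line_count for _version, line_count in columns.values()}
        if len(line_counts) > 1:
            raise ValueError("columns of a view must have the same number of lines")
        return line_counts.pop() if line_counts else 0
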
OBIDMS history file
===================
An OBIDMS history file consists of an ordered list of views and commands, those commands leading
from one view to the next one.
This history can be represented in the form of a --- showing all the
operations ever done in the OBIDMS directory and the views in between them :

.. image:: ./images/history.png
   :width: 150 px
   :align: center

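
The alternation of views and commands described above can be sketched in Python as follows.
The ``History`` class is hypothetical and keeps everything in memory; the view and command
names used with it are placeholders, not real OBITools3 commands.

.. code-block:: python

    class History:
        """Ordered list of views linked by the commands that produced them."""

        def __init__(self, initial_view):
            self.views = [initial_view]
            self.commands = []

        def record(self, command, resulting_view):
            # Each command leads from the latest view to a new one.
            self.commands.append(command)
            self.views.append(resulting_view)

        def steps(self):
            """Return (view_before, command, view_after) triples in order."""
            return [
                (self.views[i], command, self.views[i + 1])
                for i, command in enumerate(self.commands)
            ]
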
OBIIntColumn header file
========================
.. doxygenfile:: obiintcolumn.h
OBIColumn header file
=====================
.. doxygenfile:: obicolumn.h