*********************************************
The OBItools3 Data Management System (OBIDMS)
*********************************************

A complete DNA metabarcoding experiment relies on several kinds of data:

- the sequence data resulting from the sequencing of the PCR products,
- the description of the samples, including all their metadata,
- one or several reference databases used for the taxonomic annotation,
- one or several taxonomies.

Until now, each of these categories of data was stored in separate
files, and nothing required them to be kept together.

The `Data Management System` (DMS) of OBITools3 can be considered
as a basic database system.


OBIDMS UML
==========

.. image:: ./UML/OBIDMS_UML.png

:download:`html version of the OBIDMS UML file <UML/ObiDMS_UML.class.violet.html>`

An OBIDMS directory consists of:

* OBIDMS column files
* OBIDMS release files
* OBIDMS dictionary files
* one OBIDMS history file


OBIDMS column files
===================

Each OBIDMS column file contains:

* a header whose size is a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
  on most systems), containing metadata
* one column of data, all of the same OBIType

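The size constraint on the header can be illustrated with a short calculation. This is only a sketch: the function name ``header_size`` is hypothetical, and the assumption that a header always occupies at least one page is ours, not something stated above.

```python
import mmap

# Page size reported by the operating system (4096 bytes on most systems).
PAGESIZE = mmap.PAGESIZE


def header_size(metadata_bytes: int, pagesize: int = PAGESIZE) -> int:
    """Return the smallest positive multiple of pagesize able to hold metadata_bytes.

    Hypothetical helper: the name and the "at least one page" rule are assumptions.
    """
    if metadata_bytes < 0:
        raise ValueError("metadata_bytes must be non-negative")
    pages = max(1, -(-metadata_bytes // pagesize))  # ceiling division, at least one page
    return pages * pagesize
```

With a page-sized header, the data section starts on a page boundary, which is convenient if the file is memory-mapped (again an assumption about how the files are accessed).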

Header
------

The header of an OBIDMS column contains:

* Endian byte order
* Header size (PAGESIZE multiple)
*
* File status: Open/Closed
* Owner: PID of the process that created the file, which is the only one allowed to modify it while it is open
* Number of lines (total or without the header?)
* OBIType
* Date of creation
* Version of the file
* Optional comments

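To make the list of header fields more concrete, here is a minimal sketch of how such a header could be serialised. The field order, field widths, status codes and the 64-byte comment area are all assumptions chosen for illustration; the actual layout is defined by the C headers documented at the end of this page.

```python
import struct

PAGESIZE = 4096  # assumed page size

# Hypothetical layout, little-endian, no padding:
#   B   byte-order marker (1 = little-endian)
#   I   header size in bytes
#   B   status (1 = open, 0 = closed)
#   i   owner PID
#   Q   number of lines
#   B   OBIType code
#   q   creation date (seconds since the Unix epoch)
#   I   version
#   64s comments, NUL-padded
HEADER_FORMAT = "<BIBiQBqI64s"
FIELD_NAMES = ("byte_order", "header_size", "status", "owner_pid",
               "n_lines", "obitype", "created", "version", "comments")


def pack_header(status, owner_pid, n_lines, obitype, created, version, comments=b""):
    """Serialise the fields and pad the result to one full page."""
    fixed = struct.pack(HEADER_FORMAT, 1, PAGESIZE, status, owner_pid,
                        n_lines, obitype, created, version, comments)
    return fixed.ljust(PAGESIZE, b"\0")


def unpack_header(raw):
    """Read the fields back into a dictionary."""
    header = dict(zip(FIELD_NAMES, struct.unpack_from(HEADER_FORMAT, raw)))
    header["comments"] = header["comments"].rstrip(b"\0")
    return header
```

A round trip such as ``unpack_header(pack_header(1, 1234, 10, 2, 1700000000, 3, b"test"))`` returns the same values that were packed, and the packed header is exactly one page long.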

Data
----

A column of data in which every value has the same OBIType.


Mandatory columns
-----------------

Some columns must exist in an OBIDMS directory:

* sequence identifiers column (type *OBIStr_t*)


File name
---------

Each file is named after the attribute associated with the data it contains, followed by its
version number, separated by an ``@``, and with the extension ``.odc``.

Example: ``count@3.odc``

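The naming rule can be expressed as a pair of small helpers. This is a sketch under the naming convention described above; the function names are hypothetical and are not part of OBITools3.

```python
import re

# <attribute>@<version>.odc, where <version> is a non-negative integer.
_COLUMN_FILE = re.compile(r"^(?P<attribute>.+)@(?P<version>\d+)\.odc$")


def column_file_name(attribute: str, version: int) -> str:
    """Build the file name of one version of a column (hypothetical helper)."""
    return f"{attribute}@{version}.odc"


def parse_column_file_name(name: str):
    """Split a column file name into (attribute, version); raise ValueError otherwise."""
    match = _COLUMN_FILE.match(name)
    if match is None:
        raise ValueError(f"not an OBIDMS column file name: {name!r}")
    return match.group("attribute"), int(match.group("version"))
```

For instance, ``column_file_name("count", 3)`` gives ``count@3.odc``, and parsing that name gives back ``("count", 3)``.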

Modifications
-------------

An OBIDMS column file can only be modified by the process that created it, and only while its status is set to Open. This information is
stored in the `header <#header>`_.

When a process wants to modify an OBIDMS column file that is closed, it must first clone it. Cloning creates a new version of the
file that belongs to the process, i.e., only that process can modify that file, as long as its status is set to Open. Once the process
has finished writing the new version of the column file, it sets the column file's status to Closed, and the file can never be modified
again.

This means that a column is stored in one file if it has only one version,
or in several files if it has several versions: there is one file per version.

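The open/clone/close rules can be modelled in memory as follows. This is a simplified sketch, not the OBITools3 implementation: the class ``ColumnFile`` and its methods are hypothetical, PIDs are passed explicitly (a real process would use ``os.getpid()``), and the sketch assumes that only closed files are cloned and that the cloned file is the latest version.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnFile:
    """In-memory stand-in for one version of an OBIDMS column file (hypothetical)."""
    attribute: str
    version: int
    owner_pid: int
    is_open: bool = True
    values: list = field(default_factory=list)

    def _check_writable(self, pid: int) -> None:
        if not self.is_open:
            raise PermissionError("a closed column file can never be modified")
        if pid != self.owner_pid:
            raise PermissionError("only the creating process may modify an open column file")

    def append(self, value, pid: int) -> None:
        self._check_writable(pid)
        self.values.append(value)

    def close(self, pid: int) -> None:
        self._check_writable(pid)
        self.is_open = False

    def clone(self, pid: int) -> "ColumnFile":
        """Create the next version, open and owned by pid (assumes self is closed)."""
        if self.is_open:
            raise PermissionError("this sketch only clones closed column files")
        return ColumnFile(self.attribute, self.version + 1, pid, True, list(self.values))
```

Because the clone copies the values, writing to the new version leaves the closed version untouched, which matches the rule of one file per version.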

Versioning
----------

The first version of a column file is numbered 0, and each new version increments that
number by 1.

The number of the latest version of an OBIDMS column is stored in an `OBIDMS release file <formats.html#obidms-release-files>`_.

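As an illustration of the numbering scheme, the sketch below recomputes the latest version of each column from a list of file names and derives the number of the next version. It is a cross-check only: OBITools3 reads this number from the release file, and the function names here are hypothetical.

```python
import re

# Column file names follow <attribute>@<version>.odc.
_COLUMN_FILE = re.compile(r"^(.+)@(\d+)\.odc$")


def latest_versions(file_names):
    """Map each attribute to the highest version found among the given file names."""
    latest = {}
    for name in file_names:
        match = _COLUMN_FILE.match(name)
        if match is None:
            continue  # not a column file (e.g. a release file)
        attribute, version = match.group(1), int(match.group(2))
        latest[attribute] = max(version, latest.get(attribute, -1))
    return latest


def next_version(latest, attribute):
    """Version number of the next file to create: 0 for a column with no version yet."""
    return latest.get(attribute, -1) + 1
```

Given ``count@0.odc``, ``count@1.odc`` and ``count@3.odc``, the latest version of ``count`` is 3 and the next one to create is 4.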

OBIDMS release files
====================

Each OBIDMS column is associated with an OBIDMS release file that contains the number of the latest
version of the column.

File name
---------

OBIDMS release files are named after the attribute associated with the data contained in the column, and
have the extension ``.odr``.

Example: ``count.odr``


OBIDMS views
============

An OBIDMS view consists of a list of OBIDMS columns and lines. A view includes exactly one version
of each mandatory column, and at most one version of any other column. All the columns of
one view contain the same number of lines in the same order.

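The constraints on a view can be checked with a few lines of code. In this sketch a view is represented as a dictionary mapping a column name to a ``(version, values)`` pair, so that a column can appear with only one version; this representation, the function name and the column name ``id`` for the sequence identifiers are assumptions made for the example.

```python
def validate_view(columns, mandatory=("id",)):
    """Check a view and return its number of lines.

    columns maps a column name to a (version, values) pair (hypothetical representation).
    """
    missing = [name for name in mandatory if name not in columns]
    if missing:
        raise ValueError(f"missing mandatory column(s): {missing}")
    lengths = {name: len(values) for name, (_version, values) in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns differ in number of lines: {lengths}")
    return next(iter(lengths.values()), 0)
```

A view holding version 0 of ``id`` and version 3 of ``count``, each with two lines, passes the check; a view in which the two columns have different lengths is rejected.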

OBIDMS history file
===================

An OBIDMS history file consists of an ordered list of views and commands, each command leading
from one view to the next.

This history can be represented in the form of a ?? showing all the
operations ever done in the OBIDMS directory and the views in between them:

.. image:: ./images/history.png
   :width: 150 px
   :align: center

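The alternation of views and commands can be sketched as a small data structure. The class ``History``, its methods and the view and command labels used below are hypothetical; the on-disk format of the history file is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class History:
    """Ordered list of views and of the commands leading from one view to the next."""
    views: list                                   # views[0] is the initial view
    commands: list = field(default_factory=list)  # commands[i] leads from views[i] to views[i + 1]

    def record(self, command, new_view):
        """Append a command together with the view it produced."""
        self.commands.append(command)
        self.views.append(new_view)

    def steps(self):
        """Yield (view_before, command, view_after) triples in chronological order."""
        for i, command in enumerate(self.commands):
            yield self.views[i], command, self.views[i + 1]
```

Starting from a view ``v0`` and recording ``filter`` then ``annotate``, ``steps()`` yields ``(v0, filter, v1)`` followed by ``(v1, annotate, v2)``.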

OBIType header file
===================

.. doxygenfile:: obitypes.h


OBIIntColumn header file
========================

.. doxygenfile:: obiintcolumn.h


OBIColumn header file
=====================

.. doxygenfile:: obicolumn.h