updated the documentation with the special values, and the idea of
column directories and column group directories.
@ -2,19 +2,19 @@
|
|||||||
The OBItools3 Data Management System (OBIDMS)
|
The OBItools3 Data Management System (OBIDMS)
|
||||||
*********************************************
|
*********************************************
|
||||||
|
|
||||||
A complete DNA Metabarcoding experiment rely on several kinds of data.
|
A complete DNA metabarcoding experiment relies on several kinds of data.
|
||||||
|
|
||||||
- The sequence data resulting of the PCR products sequencing,
|
- The sequence data resulting from the sequencing of the PCR products,
|
||||||
- The description of the samples including all their metadata,
|
- The description of the samples including all their metadata,
|
||||||
- One or several refence database used for the taxonomical annotation
|
- One or several reference databases used for the taxonomic annotation,
|
||||||
- One or several taxonomies.
|
- One or several taxonomy databases.
|
||||||
|
|
||||||
Up to now each of these categories of data were stored in separate
|
Up to now, each of these categories of data was stored in separate
|
||||||
files an nothing obliged to keep them together.
|
files, and nothing made it mandatory to keep them together.
|
||||||
|
|
||||||
|
|
||||||
The `Data Management System` (DMS) of OBITools3 can be considered
|
The `Data Management System` (DMS) of OBITools3 can be regarded as a basic
|
||||||
as a basic database system.
|
database system.
|
||||||
|
|
||||||
|
|
||||||
OBIDMS UML
|
OBIDMS UML
|
||||||
@ -25,11 +25,30 @@ OBIDMS UML
|
|||||||
:download:`html version of the OBIDMS UML file <UML/ObiDMS_UML.class.violet.html>`
|
:download:`html version of the OBIDMS UML file <UML/ObiDMS_UML.class.violet.html>`
|
||||||
|
|
||||||
|
|
||||||
An OBIDMS directory consists of :
|
An OBIDMS directory contains :
|
||||||
* OBIDMS column files
|
* one `OBIDMS history file <#obidms-history-files>`_
|
||||||
* OBIDMS release files
|
* Two different kinds of directories :
|
||||||
* OBIDMS dictionary files
|
* OBIDMS column directories
|
||||||
* one OBIDMS history file
|
* OBIDMS column group directories containing OBIDMS column directories
|
||||||
|
|
||||||
|
|
||||||
|
OBIDMS column directories
|
||||||
|
=========================
|
||||||
|
|
||||||
|
OBIDMS column directories contain :
|
||||||
|
* all the different versions of one OBIDMS column, in the form of different files (`OBIDMS column files <#obidms-column-files>`_)
|
||||||
|
* one `OBIDMS release file <#obidms-release-files>`_
|
||||||
|
|
||||||
|
The directory name is the column attribute, or sub-attribute if the column directory is in a column group directory.
|
||||||
|
|
||||||
|
|
||||||
|
OBIDMS column group directories
|
||||||
|
===============================
|
||||||
|
|
||||||
|
OBIDMS column group directories contain OBIDMS column directories. They are used to store dictionary-like data, where
|
||||||
|
each key corresponds to an OBIDMS column.
|
||||||
|
|
||||||
|
The directory name is the dictionary attribute. Each key is considered a sub-attribute and is associated with its column.
|
||||||
|
|
||||||
|
|
||||||
OBIDMS column files
|
OBIDMS column files
|
||||||
@ -38,7 +57,7 @@ OBIDMS column files
|
|||||||
Each OBIDMS column file contains :
|
Each OBIDMS column file contains :
|
||||||
* a header of a size equal to a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
|
* a header of a size equal to a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
|
||||||
on most systems) containing metadata
|
on most systems) containing metadata
|
||||||
* one column of data with the same OBIType
|
* one column of data with the same `OBIType <types.html#obitypes>`_
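The page-size constraint on the header can be sketched as follows. This is only an illustration, not the OBIDMS implementation: the function names ``obi_round_to_pagesize`` and ``obi_system_pagesize`` are hypothetical, and the page size is queried with the POSIX ``sysconf`` call rather than hard-coded, since 4096 bytes is only the most common value.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of OBIDMS): round a raw header size
 * up to the next multiple of the page size. */
static size_t obi_round_to_pagesize(size_t raw_size, size_t page_size)
{
    assert(page_size > 0);

    if (raw_size == 0)
        return page_size;   /* assumption: a header uses at least one page */

    return ((raw_size + page_size - 1) / page_size) * page_size;
}

/* Query the page size at run time; fall back to 4096 if the call fails. */
static size_t obi_system_pagesize(void)
{
    long page_size = sysconf(_SC_PAGESIZE);

    return page_size > 0 ? (size_t) page_size : (size_t) 4096;
}
```

With a 4096-byte page, a 100-byte header would occupy 4096 bytes and a 5000-byte header would occupy 8192 bytes.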
|
||||||
|
|
||||||
|
|
||||||
Header
|
Header
|
||||||
@ -48,27 +67,26 @@ The header of an OBIDMS column contains :
|
|||||||
|
|
||||||
* Endian byte order
|
* Endian byte order
|
||||||
* Header size (PAGESIZE multiple)
|
* Header size (PAGESIZE multiple)
|
||||||
*
|
* Number of lines of data
|
||||||
* File status : Open/Closed
|
* Number of lines of data used
|
||||||
* Owner : PID of the process that created the file and is the only one allowed to modify it if it is open
|
* `OBIType <types.html#obitypes>`_ (type of the data)
|
||||||
* Number of lines (total or without the header?)
|
* Date of creation of the file
|
||||||
* OBIType
|
* Version of the OBIDMS column
|
||||||
* Date of creation
|
* The column name
|
||||||
* Version of the file
|
|
||||||
* Eventual comments
|
* Eventual comments
|
||||||
|
|
||||||
|
|
||||||
Data
|
Data
|
||||||
----
|
----
|
||||||
|
|
||||||
A column of data with the same OBIType.
|
A column of data with the same `OBIType <types.html#obitypes>`_.
|
||||||
|
|
||||||
|
|
||||||
Mandatory columns
|
Mandatory columns
|
||||||
-----------------
|
-----------------
|
||||||
|
|
||||||
Some columns must exist in an OBIDMS directory :
|
Some columns must exist in an OBIDMS directory :
|
||||||
* sequence identifiers column (type *OBIStr_t*)
|
* sequence identifiers column (type ``OBIStr_t``)
|
||||||
|
|
||||||
|
|
||||||
File name
|
File name
|
||||||
@ -83,8 +101,7 @@ Example : ``count@3.odc``
|
|||||||
Modifications
|
Modifications
|
||||||
-------------
|
-------------
|
||||||
|
|
||||||
An OBIDMS column file can only be modified by the process that created it, if its status is set to Open. Those informations are
|
An OBIDMS column file can only be modified by the process that created it, and while its status is set to Open.
|
||||||
contained in the `header <#header>`_.
|
|
||||||
|
|
||||||
When a process wants to modify an OBIDMS column file that is closed, it must first clone it. Cloning creates a new version of the
|
When a process wants to modify an OBIDMS column file that is closed, it must first clone it. Cloning creates a new version of the
|
||||||
file that belongs to the process, i.e., only that process can modify that file, as long as its status is set to Open. Once the process
|
file that belongs to the process, i.e., only that process can modify that file, as long as its status is set to Open. Once the process
|
||||||
@ -94,6 +111,8 @@ again.
|
|||||||
That means that one column is stored in one file (if there is only one version)
|
That means that one column is stored in one file (if there is only one version)
|
||||||
or more (if there are several versions), and that there is one file per version.
|
or more (if there are several versions), and that there is one file per version.
|
||||||
|
|
||||||
|
All the versions of one column are stored in one directory.
|
||||||
|
|
||||||
|
|
||||||
Versioning
|
Versioning
|
||||||
----------
|
----------
|
||||||
@ -101,13 +120,13 @@ Versioning
|
|||||||
The first version of a column file is numbered 0, and each new version increments that
|
The first version of a column file is numbered 0, and each new version increments that
|
||||||
number by 1.
|
number by 1.
|
||||||
|
|
||||||
The number of the latest version of an OBIDMS column is stored in an `OBIDMS release file <formats.html#obidms-release-files>`_.
|
The number of the latest version of an OBIDMS column is stored in the `OBIDMS release file <formats.html#obidms-release-files>`_ of its directory.
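As an illustration of the versioning scheme, the sketch below builds the file name of a given column version. It assumes that the naming pattern is ``<column name>@<version>.odc``, which is inferred only from the example ``count@3.odc``; the function names are hypothetical and are not taken from the OBIDMS sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "<column_name>@<version>.odc" into buffer.
 * Returns 0 on success, -1 on invalid input or if the buffer is too small. */
static int obi_column_file_name(char *buffer, size_t size,
                                const char *column_name, int version)
{
    int written;

    assert(buffer != NULL && column_name != NULL);

    /* Versions start at 0, and a column needs a non-empty name. */
    if (version < 0 || strlen(column_name) == 0)
        return -1;

    written = snprintf(buffer, size, "%s@%d.odc", column_name, version);
    if (written < 0 || (size_t) written >= size)
        return -1;          /* output error or truncated name */

    return 0;
}

/* Each new version increments the latest version number by 1. */
static int obi_next_version(int latest_version)
{
    return latest_version + 1;
}
```

For a column named ``count`` whose release file records version 3, the latest file would be ``count@3.odc`` and a clone would be written as ``count@4.odc``.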
|
||||||
|
|
||||||
|
|
||||||
OBIDMS release files
|
OBIDMS release files
|
||||||
====================
|
====================
|
||||||
|
|
||||||
Each OBIDMS column is associated with an OBIDMS release file that contains the number of the latest
|
Each OBIDMS column is associated with an OBIDMS release file in its directory, which contains the number of the latest
|
||||||
version of the column.
|
version of the column.
|
||||||
|
|
||||||
File name
|
File name
|
||||||
@ -139,20 +158,3 @@ operations ever done in the OBIDMS directory and the views in between them :
|
|||||||
.. image:: ./images/history.png
|
.. image:: ./images/history.png
|
||||||
:width: 150 px
|
:width: 150 px
|
||||||
:align: center
|
:align: center
|
||||||
|
|
||||||
OBIType header file
|
|
||||||
========================
|
|
||||||
|
|
||||||
.. doxygenfile:: obitypes.h
|
|
||||||
|
|
||||||
|
|
||||||
OBIIntColumn header file
|
|
||||||
========================
|
|
||||||
|
|
||||||
.. doxygenfile:: obiintcolumn.h
|
|
||||||
|
|
||||||
|
|
||||||
OBIColumn header file
|
|
||||||
=====================
|
|
||||||
|
|
||||||
.. doxygenfile:: obicolumn.h
|
|
||||||
|
Binary file not shown.
Before Width: | Height: | Size: 63 KiB After Width: | Height: | Size: 66 KiB |
File diff suppressed because it is too large
52
doc/source/specialvalues.rst
Normal file
@ -0,0 +1,52 @@
|
|||||||
|
==============
|
||||||
|
Special values
|
||||||
|
==============
|
||||||
|
|
||||||
|
|
||||||
|
NA values
|
||||||
|
=========
|
||||||
|
|
||||||
|
All OBITypes have an associated NA (Not Available) value.
|
||||||
|
NA values are implemented by specifying an explicit NA value for each type, following the conventions used by R:
|
||||||
|
|
||||||
|
* For the types ``OBIInt_t``, ``OBIBool_t``, ``OBIIdx_t`` and ``OBITaxid_t``, the NA value is ``INT_MIN``.
|
||||||
|
|
||||||
|
* For the type ``OBIChar_t``: the NA value is ``\0`` (?).
|
||||||
|
|
||||||
|
* For the type ``OBIStr_t`` : the NA value is a tab followed by a space.
|
||||||
|
|
||||||
|
* For the type ``OBIFloat_t``::
|
||||||
|
|
||||||
|
typedef union
|
||||||
|
{
|
||||||
|
double value;
|
||||||
|
unsigned int word[2];
|
||||||
|
} ieee_double;
|
||||||
|
|
||||||
|
static double NA_value(void)
|
||||||
|
{
|
||||||
|
volatile ieee_double x;
|
||||||
|
x.word[hw] = 0x7ff00000;
|
||||||
|
x.word[lw] = 1954;
|
||||||
|
return x.value;
|
||||||
|
}
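The ``NA_value`` function above relies on two indices, ``hw`` and ``lw``, that are not defined in the excerpt. In R, where this code comes from, they select the high and the low 32-bit word of the double according to the byte order of the machine. A possible endian-independent variant is sketched below. It assumes a 64-bit IEEE 754 ``double`` stored with the same byte order as 64-bit integers; the names ``OBI_INT_NA``, ``obi_float_na`` and ``obi_float_is_na`` are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* NA for the integer-based types, as listed above (macro name is hypothetical). */
#define OBI_INT_NA INT_MIN

/* Build the NA double: high word 0x7ff00000, low word 1954.
 * The 64-bit pattern is copied into the double, so no hw/lw index is needed. */
static double obi_float_na(void)
{
    const uint64_t bits = ((uint64_t) 0x7ff00000u << 32) | (uint64_t) 1954u;
    double value;

    assert(sizeof value == sizeof bits);
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* NA is a NaN whose low word is 1954; any other NaN is not NA. */
static int obi_float_is_na(double value)
{
    uint64_t bits;

    memcpy(&bits, &value, sizeof bits);
    return isnan(value) && (uint32_t) (bits & 0xffffffffu) == 1954u;
}
```

Only the low word is tested in ``obi_float_is_na``, because some hardware may alter the high word of a NaN when the value is moved through floating-point registers.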
|
||||||
|
|
||||||
|
|
||||||
|
Minimum and maximum values for ``OBIInt_t``
|
||||||
|
===========================================
|
||||||
|
|
||||||
|
* Maximum value : ``INT_MAX``
|
||||||
|
* Minimum value : ``INT_MIN(-1?)``
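Since ``INT_MIN`` is already used as the NA value for ``OBIInt_t``, one way to settle the open question on the minimum is to reserve ``INT_MIN`` and to start the usable range at ``INT_MIN + 1``. The sketch below makes that assumption explicit; the macro and function names are hypothetical.

```c
#include <limits.h>

#define OBI_INT_NA   INT_MIN          /* reserved for NA */
#define OBI_INT_MAX  INT_MAX
#define OBI_INT_MIN  (INT_MIN + 1)    /* assumption: INT_MIN itself is not storable */

/* Return 1 if the value can be stored in an OBIInt_t without colliding
 * with the NA value, 0 otherwise. */
static int obi_int_is_storable(long long value)
{
    return value >= OBI_INT_MIN && value <= OBI_INT_MAX;
}
```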
|
||||||
|
|
||||||
|
|
||||||
|
Infinity values for the type ``OBIFloat_t``
|
||||||
|
===========================================
|
||||||
|
|
||||||
|
* Positive infinity : ``INFINITY`` (should be defined in ``<math.h>``)
|
||||||
|
* Negative infinity : ``-INFINITY``
|
||||||
|
|
||||||
|
|
||||||
|
NaN value for the type ``OBIFloat_t``
|
||||||
|
=====================================
|
||||||
|
|
||||||
|
* NaN (Not a Number) value : ``NAN`` (should be defined in ``<math.h>`` but probably needs to be tested)
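The test mentioned above could look like the following sketch, which checks that the ``INFINITY`` and ``NAN`` macros of ``<math.h>`` are available and behave as expected. The function name is hypothetical.

```c
#include <math.h>

/* Return 1 if INFINITY and NAN are defined and behave as IEEE 754
 * special values, 0 otherwise. */
static int obi_check_float_special_values(void)
{
#if defined(INFINITY) && defined(NAN)
    /* volatile keeps the compiler from folding the checks away */
    volatile double pos_inf = INFINITY;
    volatile double neg_inf = -INFINITY;
    volatile double nan_value = NAN;

    return isinf(pos_inf) && pos_inf > 0
        && isinf(neg_inf) && neg_inf < 0
        && isnan(nan_value)
        && nan_value != nan_value;   /* a NaN never compares equal to itself */
#else
    return 0;
#endif
}
```

The check returns 0 on a platform whose ``<math.h>`` does not define both macros, in which case the special values would have to be built another way.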
|
@ -6,20 +6,12 @@ OBITypes
|
|||||||
.. image:: ./UML/OBITypes_UML.png
|
.. image:: ./UML/OBITypes_UML.png
|
||||||
:download:`html version of the OBITypes UML file <UML/OBITypes_UML.class.violet.html>`
|
:download:`html version of the OBITypes UML file <UML/OBITypes_UML.class.violet.html>`
|
||||||
|
|
||||||
.. note::
|
|
||||||
All OBITypes have an associated NA (Not Available) value.
|
|
||||||
We have currently two ideas for implementing NA values:
|
|
||||||
|
|
||||||
- By specifying an explicit NA value for each type
|
|
||||||
- By adding to each column of an OBIDMS a bit vector
|
|
||||||
indicating if the value is defined or not.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
.. toctree::
|
.. toctree::
|
||||||
:maxdepth: 2
|
:maxdepth: 2
|
||||||
|
|
||||||
The elementary types <elementary>
|
The elementary types <elementary>
|
||||||
The containers <containers>
|
The containers <containers>
|
||||||
|
Special values <specialvalues>
|
||||||
|
|
||||||
|
|
||||||
|