updated the documentation with the special values, and the idea of
column directories and column group directories.
@ -2,19 +2,19 @@
|
||||
The OBItools3 Data Management System (OBIDMS)
|
||||
*********************************************
|
||||
|
||||
A complete DNA Metabarcoding experiment rely on several kinds of data.
|
||||
A complete DNA metabarcoding experiment relies on several kinds of data.
|
||||
|
||||
- The sequence data resulting of the PCR products sequencing,
|
||||
- The sequence data resulting from the sequencing of the PCR products,
|
||||
- The description of the samples including all their metadata,
|
||||
- One or several refence database used for the taxonomical annotation
|
||||
- One or several taxonomies.
|
||||
- One or several reference databases used for the taxonomic annotation,
|
||||
- One or several taxonomy databases.
|
||||
|
||||
Up to now each of these categories of data were stored in separate
|
||||
files an nothing obliged to keep them together.
|
||||
Up to now, each of these categories of data was stored in separate
|
||||
files, and nothing made it mandatory to keep them together.
|
||||
|
||||
|
||||
The `Data Management System` (DMS) of OBITools3 can be considered
|
||||
as a basic database system.
|
||||
The `Data Management System` (DMS) of OBITools3 can be regarded as a basic
|
||||
database system.
|
||||
|
||||
|
||||
OBIDMS UML
|
||||
@ -25,11 +25,30 @@ OBIDMS UML
|
||||
:download:`html version of the OBIDMS UML file <UML/ObiDMS_UML.class.violet.html>`
|
||||
|
||||
|
||||
An OBIDMS directory consists of :
|
||||
* OBIDMS column files
|
||||
* OBIDMS release files
|
||||
* OBIDMS dictionary files
|
||||
* one OBIDMS history file
|
||||
An OBIDMS directory contains :
|
||||
* one `OBIDMS history file <#obidms-history-files>`_
|
||||
* Two different kinds of directories :
|
||||
* OBIDMS column directories
|
||||
* OBIDMS column group directories containing OBIDMS column directories
|
||||
|
||||
|
||||
OBIDMS column directories
|
||||
=========================
|
||||
|
||||
OBIDMS column directories contain :
|
||||
* all the different versions of one OBIDMS column, in the form of different files (`OBIDMS column files <#obidms-column-files>`_)
|
||||
* one `OBIDMS release file <#obidms-release-files>`_
|
||||
|
||||
The directory name is the column attribute, or sub-attribute if the column directory is in a column group directory.
|
||||
|
||||
|
||||
OBIDMS column group directories
|
||||
===============================
|
||||
|
||||
OBIDMS column group directories contain OBIDMS column directories. They are used to store dictionary-like data, where
|
||||
each key corresponds to an OBIDMS column.
|
||||
|
||||
The directory name is the dictionary attribute. Each key is considered a sub-attribute and is associated with its column.
|
||||
|
||||
|
||||
OBIDMS column files
|
||||
@ -38,7 +57,7 @@ OBIDMS column files
|
||||
Each OBIDMS column file contains :
|
||||
* a header of a size equal to a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
|
||||
on most systems) containing metadata
|
||||
* one column of data with the same OBIType
|
||||
* one column of data with the same `OBIType <types.html#obitypes>`_
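
The header sizing rule above can be sketched in C as follows. This is only an illustration: the names ``obi_page_size`` and ``obi_header_size`` are hypothetical and are not taken from the OBITools3 sources. The page size is queried with the POSIX call ``sysconf(_SC_PAGESIZE)``, and 4096 is used as a fallback if the query fails.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: page size of the running system,
   falling back to 4096 bytes if sysconf() cannot report it. */
size_t obi_page_size(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    return (page_size > 0) ? (size_t) page_size : 4096;
}

/* Hypothetical helper: smallest multiple of page_size that can hold
   metadata_size bytes. At least one page is always reserved. */
size_t obi_header_size(size_t metadata_size, size_t page_size)
{
    size_t pages = (metadata_size + page_size - 1) / page_size;
    if (pages == 0)
        pages = 1;
    return pages * page_size;
}
```

With a 4096-byte page, a 100-byte header would occupy one page and a 4097-byte header two pages, so the data part of the file always starts on a page boundary.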
|
||||
|
||||
|
||||
Header
|
||||
@ -48,27 +67,26 @@ The header of an OBIDMS column contains :
|
||||
|
||||
* Endian byte order
|
||||
* Header size (PAGESIZE multiple)
|
||||
*
|
||||
* File status : Open/Closed
|
||||
* Owner : PID of the process that created the file and is the only one allowed to modify it if it is open
|
||||
* Number of lines (total or without the header?)
|
||||
* OBIType
|
||||
* Date of creation
|
||||
* Version of the file
|
||||
* Number of lines of data
|
||||
* Number of lines of data used
|
||||
* `OBIType <types.html#obitypes>`_ (type of the data)
|
||||
* Date of creation of the file
|
||||
* Version of the OBIDMS column
|
||||
* The column name
|
||||
* Optional comments
|
||||
|
||||
|
||||
Data
|
||||
----
|
||||
|
||||
A column of data with the same OBIType.
|
||||
A column of data with the same `OBIType <types.html#obitypes>`_.
|
||||
|
||||
|
||||
Mandatory columns
|
||||
-----------------
|
||||
|
||||
Some columns must exist in an OBIDMS directory :
|
||||
* sequence identifiers column (type *OBIStr_t*)
|
||||
* sequence identifiers column (type ``OBIStr_t``)
|
||||
|
||||
|
||||
File name
|
||||
@ -83,8 +101,7 @@ Example : ``count@3.odc``
|
||||
Modifications
|
||||
-------------
|
||||
|
||||
An OBIDMS column file can only be modified by the process that created it, if its status is set to Open. Those informations are
|
||||
contained in the `header <#header>`_.
|
||||
An OBIDMS column file can only be modified by the process that created it, and while its status is set to Open.
|
||||
|
||||
When a process wants to modify an OBIDMS column file that is closed, it must first clone it. Cloning creates a new version of the
|
||||
file that belongs to the process, i.e., only that process can modify that file, as long as its status is set to Open. Once the process
|
||||
@ -94,6 +111,8 @@ again.
|
||||
That means that one column is stored in one file (if there is only one version)
|
||||
or more (if there are several versions), and that there is one file per version.
|
||||
|
||||
All the versions of one column are stored in one directory.
|
||||
|
||||
|
||||
Versioning
|
||||
----------
|
||||
@ -101,13 +120,13 @@ Versioning
|
||||
The first version of a column file is numbered 0, and each new version increments that
|
||||
number by 1.
|
||||
|
||||
The number of the latest version of an OBIDMS column is stored in an `OBIDMS release file <formats.html#obidms-release-files>`_.
|
||||
The number of the latest version of an OBIDMS column is stored in the `OBIDMS release file <formats.html#obidms-release-files>`_ of its directory.
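
A minimal sketch of the versioning scheme, assuming that file names follow the pattern suggested by the example ``count@3.odc`` (column name, ``@``, version number, ``.odc`` extension). The function names are hypothetical and do not come from the OBITools3 code base.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "<column_name>@<version>.odc" into buffer.
   Returns 0 on success, -1 if the buffer is too small or on error. */
int obi_column_file_name(char *buffer, size_t size,
                         const char *column_name, int version)
{
    int written = snprintf(buffer, size, "%s@%d.odc", column_name, version);
    return (written >= 0 && (size_t) written < size) ? 0 : -1;
}

/* Hypothetical helper: version number given to a clone, i.e. the latest
   version recorded in the release file incremented by 1. */
int obi_next_version(int latest_version)
{
    return latest_version + 1;
}
```

Under these assumptions, cloning version 3 of the ``count`` column would produce a new file named ``count@4.odc`` in the same column directory.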
|
||||
|
||||
|
||||
OBIDMS release files
|
||||
====================
|
||||
|
||||
Each OBIDMS column is associated with an OBIDMS release file that contains the number of the latest
|
||||
Each OBIDMS column is associated with an OBIDMS release file in its directory, which contains the number of the latest
|
||||
version of the column.
|
||||
|
||||
File name
|
||||
@ -139,20 +158,3 @@ operations ever done in the OBIDMS directory and the views in between them :
|
||||
.. image:: ./images/history.png
|
||||
:width: 150 px
|
||||
:align: center
|
||||
|
||||
OBIType header file
|
||||
========================
|
||||
|
||||
.. doxygenfile:: obitypes.h
|
||||
|
||||
|
||||
OBIIntColumn header file
|
||||
========================
|
||||
|
||||
.. doxygenfile:: obiintcolumn.h
|
||||
|
||||
|
||||
OBIColumn header file
|
||||
=====================
|
||||
|
||||
.. doxygenfile:: obicolumn.h
|
||||
|
Binary file not shown.
Before Width: | Height: | Size: 63 KiB After Width: | Height: | Size: 66 KiB |
File diff suppressed because it is too large
52
doc/source/specialvalues.rst
Normal file
@ -0,0 +1,52 @@
|
||||
==============
|
||||
Special values
|
||||
==============
|
||||
|
||||
|
||||
NA values
|
||||
=========
|
||||
|
||||
All OBITypes have an associated NA (Not Available) value.
|
||||
NA values are implemented by specifying an explicit NA value for each type, following the conventions used by R:
|
||||
|
||||
* For the types ``OBIInt_t``, ``OBIBool_t``, ``OBIIdx_t`` and ``OBITaxid_t``, the NA value is ``INT_MIN``.
|
||||
|
||||
* For the type ``OBIChar_t``: the NA value is ``\0`` (?).
|
||||
|
||||
* For the type ``OBIStr_t`` : the NA value is a tab followed by a space.
|
||||
|
||||
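* For the type ``OBIFloat_t``::

      /* Indices of the high and low 32-bit words of a double.
         Little-endian layout is assumed here; swap the two values
         on a big-endian machine. */
      static const int hw = 1;
      static const int lw = 0;

      typedef union
      {
          double value;
          unsigned int word[2];
      } ieee_double;

      /* Builds a NaN whose low word carries the payload 1954,
         which is how R encodes its NA for doubles. */
      static double NA_value(void)
      {
          volatile ieee_double x;
          x.word[hw] = 0x7ff00000;
          x.word[lw] = 1954;
          return x.value;
      }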
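
Because the NA value for ``OBIFloat_t`` is itself a NaN, ``isnan()`` alone cannot tell NA apart from an ordinary NaN. The sketch below shows one possible way to build and recognise it. It is not the OBITools3 implementation: the function names are hypothetical, ``memcpy`` on a 64-bit integer replaces the union used above so that no word index is needed, and it assumes that ``double`` is an IEEE 754 binary64 value stored with the same byte order as integers.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: NaN with high word 0x7ff00000 and low word 1954. */
double obi_float_na(void)
{
    uint64_t bits = ((uint64_t) 0x7ff00000u << 32) | 1954u;
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Hypothetical helper: true only for a NaN whose low 32 bits are 1954. */
int obi_float_is_na(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return isnan(value) && (uint32_t) (bits & 0xffffffffu) == 1954u;
}
```

Only the low word is compared, so the test still holds if the hardware sets the quiet bit of the NaN in the high word while the value is passed around.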
|
||||
|
||||
|
||||
Minimum and maximum values for ``OBIInt_t``
|
||||
===========================================
|
||||
|
||||
* Maximum value : ``INT_MAX``
|
||||
* Minimum value : ``INT_MIN + 1`` (``INT_MIN`` itself being reserved for the NA value)
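
The integer limits can be summarised in a short C sketch. The macro and function names are hypothetical. The sketch assumes that ``INT_MIN`` is reserved for NA, as stated in the NA values section, which would leave ``INT_MIN + 1`` as the smallest usable value.

```c
#include <limits.h>

/* Hypothetical names, not taken from the OBITools3 headers. */
#define OBI_INT_NA   INT_MIN        /* reserved: Not Available */
#define OBI_INT_MIN  (INT_MIN + 1)  /* smallest usable value (assumption) */
#define OBI_INT_MAX  INT_MAX        /* largest usable value */

/* True if value is the NA marker. */
int obi_int_is_na(int value)
{
    return value == OBI_INT_NA;
}

/* True if value is a regular (non-NA) integer in the usable range. */
int obi_int_is_valid(int value)
{
    return value >= OBI_INT_MIN && value <= OBI_INT_MAX;
}
```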
|
||||
|
||||
|
||||
Infinity values for the type ``OBIFloat_t``
|
||||
===========================================
|
||||
|
||||
* Positive infinity : ``INFINITY`` (defined in ``<math.h>`` since C99)
|
||||
* Negative infinity : ``-INFINITY``
|
||||
|
||||
|
||||
NaN value for the type ``OBIFloat_t``
|
||||
=====================================
|
||||
|
||||
* NaN (Not a Number) value : ``NAN`` (defined in ``<math.h>`` since C99 when the implementation supports quiet NaNs; availability should still be tested)
|
@ -6,20 +6,12 @@ OBITypes
|
||||
.. image:: ./UML/OBITypes_UML.png
|
||||
:download:`html version of the OBITypes UML file <UML/OBITypes_UML.class.violet.html>`
|
||||
|
||||
.. note::
|
||||
All OBITypes have an associated NA (Not Available) value.
|
||||
We have currently two ideas for implementing NA values:
|
||||
|
||||
- By specifying an explicit NA value for each type
|
||||
- By adding to each column of an OBIDMS a bit vector
|
||||
indicating if the value is defined or not.
|
||||
|
||||
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
The elementary types <elementary>
|
||||
The containers <containers>
|
||||
Special values <specialvalues>
|
||||
|
||||
|
||||
|