#######
Formats
#######


********
OBITypes
********

.. note::
    All OBITypes have an associated NA (Not Available) value.


Atomic types
============

========= ========= ============ ==============================
Type      C type    OBIType      Definition
========= ========= ============ ==============================
integer   int32_t   OBIInt_t     a signed integer value
float     double    OBIFloat_t   a floating-point value
boolean   ?         OBIBool_t    a boolean true/false value
char      char      OBIChar_t    a character
index     size_t    OBIIdx_t     an index in a data structure
========= ========= ============ ==============================

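
The table above can be read as a set of C type aliases. The following is a minimal sketch, not the content of ``obitypes.h``: the ``SKETCH_*`` NA sentinels and the ``sketch_*`` helpers are hypothetical names and values chosen for illustration, and the boolean type is left out because the table does not give its C type.

```c
#include <math.h>     /* NAN, isnan */
#include <stdbool.h>
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* int32_t, INT32_MIN, SIZE_MAX */

/* Aliases taken from the table above. */
typedef int32_t OBIInt_t;
typedef double  OBIFloat_t;
typedef char    OBIChar_t;
typedef size_t  OBIIdx_t;

/* Hypothetical NA sentinels (assumptions, not the library's values). */
#define SKETCH_INT_NA   INT32_MIN
#define SKETCH_FLOAT_NA NAN
#define SKETCH_CHAR_NA  '\0'
#define SKETCH_IDX_NA   SIZE_MAX

/* An integer is NA when it equals the reserved sentinel. */
static bool sketch_int_is_na(OBIInt_t value)
{
    return value == SKETCH_INT_NA;
}

/* NaN never compares equal to itself, so isnan() is used instead of ==. */
static bool sketch_float_is_na(OBIFloat_t value)
{
    return isnan(value);
}
```

Reserving one value per type keeps NA in-band, so a column needs no separate presence flag; the cost is that the sentinel itself can no longer be stored as data.
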

Character string type
=====================

================ ====== ======== ==================
Type             C type OBIType  Definition
================ ====== ======== ==================
Character string char*  OBIStr_t a character string
================ ====== ======== ==================


Container types
===============

Lists
-----

Lists of values with an atomic OBIType.


Ensembles
---------

Ensembles of values with an atomic OBIType.


Dictionaries
------------

* Dictionaries of *OBIIdx_t* values indexed by *OBIStr_t* values, typically used for the storage of DNA sequences.
* Bit arrays for data presence/absence information in the above dictionaries.

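
The presence/absence bit arrays mentioned above can be pictured with a short C sketch. Every name here (``sketch_bitarray_t`` and its functions) is hypothetical, and the one-bit-per-entry layout is an assumption rather than the OBIDMS on-disk format.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>   /* calloc, free */

/* Hypothetical bit array: bit i is 1 when dictionary entry i is present. */
typedef struct {
    unsigned char *bits;
    size_t         nbits;
} sketch_bitarray_t;

/* Allocate nbits bits, all cleared (every entry absent). Returns 0 on success. */
static int sketch_bitarray_init(sketch_bitarray_t *array, size_t nbits)
{
    array->bits  = calloc((nbits + CHAR_BIT - 1) / CHAR_BIT, 1);
    array->nbits = nbits;
    return array->bits != NULL ? 0 : -1;
}

/* Mark entry i as present or absent. The caller guarantees i < nbits. */
static void sketch_bitarray_set(sketch_bitarray_t *array, size_t i, bool present)
{
    unsigned char mask = (unsigned char)(1u << (i % CHAR_BIT));

    if (present)
        array->bits[i / CHAR_BIT] |= mask;
    else
        array->bits[i / CHAR_BIT] &= (unsigned char)~mask;
}

/* Report whether entry i is present. The caller guarantees i < nbits. */
static bool sketch_bitarray_get(const sketch_bitarray_t *array, size_t i)
{
    return ((array->bits[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1u) != 0;
}

static void sketch_bitarray_free(sketch_bitarray_t *array)
{
    free(array->bits);
    array->bits  = NULL;
    array->nbits = 0;
}
```
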

The taxid type
--------------

A pair (*OBIInt_t* value, *OBIStr_t* value) corresponding to the taxid and a reference to a taxonomic database.


*********************************************
The OBItools3 Data Management System (OBIDMS)
*********************************************

An OBIDMS directory consists of:

* OBIDMS column files
* OBIDMS release files
* OBIDMS dictionary files
* one OBIDMS history file


OBIDMS column files
===================

Each OBIDMS column file contains:

* a header of a size equal to a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
  on most systems) containing metadata
* one column of data with the same OBIType


Header
------

The header of an OBIDMS column contains:

* Endian byte order
* Header size (PAGESIZE multiple)
*
* File status: Open/Closed
* Owner: PID of the process that created the file; only this process may modify the file while it is open
* Number of lines (total or without the header?)
* OBIType
* Date of creation
* Version of the file
* Optional comments

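
The constraint that the header size is a multiple of PAGESIZE can be sketched in C as follows. The ``sketch_header_t`` layout is purely illustrative: field names, types and order are assumptions and do not describe the actual on-disk header. The page size is queried with the POSIX call ``sysconf(_SC_PAGESIZE)``, with 4096 as a fallback.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>  /* pid_t */
#include <time.h>       /* time_t */
#include <unistd.h>     /* sysconf, _SC_PAGESIZE */

/* Hypothetical in-memory view of the metadata listed above. */
typedef struct {
    uint8_t  little_endian;   /* endian byte order */
    size_t   header_size;     /* multiple of the page size */
    uint8_t  is_open;         /* file status: 1 = Open, 0 = Closed */
    pid_t    owner;           /* PID of the creating process */
    size_t   line_count;      /* number of lines */
    int      obitype;         /* code identifying the OBIType */
    time_t   creation_date;   /* date of creation */
    unsigned version;         /* version of the file */
    char     comments[256];   /* optional comments (size is an assumption) */
} sketch_header_t;

/* Page size reported by the system, or 4096 if it cannot be determined. */
static size_t sketch_page_size(void)
{
    long size = sysconf(_SC_PAGESIZE);
    return size > 0 ? (size_t)size : 4096;
}

/* Smallest non-zero multiple of page_size that holds metadata_bytes. */
static size_t sketch_header_size(size_t metadata_bytes, size_t page_size)
{
    size_t pages = (metadata_bytes + page_size - 1) / page_size;

    if (pages == 0)
        pages = 1;
    return pages * page_size;
}
```

Rounding the header up to whole pages means the data that follows starts on a page boundary, which is convenient if the file is memory-mapped.
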

Data
----

A column of data with the same OBIType.


Mandatory columns
-----------------

Some columns must exist in an OBIDMS directory:

* sequence identifiers column (type *OBIStr_t*)


File name
---------

Each file is named after the attribute associated with the data it contains, followed by
its version number, the two separated by ``@``, with the extension ``.odc``.

Example: ``count@3.odc``

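
The naming rule above can be written as a small C helper. ``sketch_column_file_name`` is a hypothetical function used only for illustration, not part of the OBIDMS API.

```c
#include <stddef.h>
#include <stdio.h>   /* snprintf */

/*
 * Write "<attribute>@<version>.odc" into buffer.
 * Returns 0 on success, -1 if the name does not fit in size bytes.
 */
static int sketch_column_file_name(char *buffer, size_t size,
                                   const char *attribute, unsigned version)
{
    int written = snprintf(buffer, size, "%s@%u.odc", attribute, version);

    return (written >= 0 && (size_t)written < size) ? 0 : -1;
}
```

Calling it with the attribute ``count`` and version ``3`` fills the buffer with ``count@3.odc``, matching the example above.
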

Modifications
-------------

An OBIDMS column file can only be modified by the process that created it, and only while its status is set to Open. This information is
contained in the `header <#header>`_.

When a process wants to modify an OBIDMS column file that is closed, it must first clone it. Cloning creates a new version of the
file that belongs to the process, i.e., only that process can modify that file, as long as its status is set to Open. Once the process
has finished writing the new version of the column file, it sets the column file's status to Closed, and the file can never be modified
again.

This means that a column is stored in one file (if there is only one version)
or more (if there are several versions), with exactly one file per version.


Versioning
----------

The first version of a column file is numbered 0, and each new version increments that
number by 1.

The number of the latest version of an OBIDMS column is stored in an `OBIDMS release file <formats.html#obidms-release-files>`_.

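
The modification and versioning rules above can be summarised in a small in-memory C sketch. All names (``sketch_column_t``, ``sketch_create``, ``sketch_clone``, ...) are hypothetical, the owner is stored as a plain ``long`` for simplicity, and copying the column data during cloning is left out.

```c
#include <stdbool.h>

typedef enum { SKETCH_CLOSED = 0, SKETCH_OPEN = 1 } sketch_status_t;

/* Hypothetical subset of the header fields involved in the rules above. */
typedef struct {
    unsigned        version;  /* 0 for the first version */
    sketch_status_t status;   /* Open or Closed */
    long            owner;    /* PID of the process that created the file */
} sketch_column_t;

/* A new column starts at version 0, open and owned by its creator. */
static sketch_column_t sketch_create(long pid)
{
    sketch_column_t column = { 0, SKETCH_OPEN, pid };
    return column;
}

/* Only the owner may modify a column, and only while it is open. */
static bool sketch_can_modify(const sketch_column_t *column, long pid)
{
    return column->status == SKETCH_OPEN && column->owner == pid;
}

/*
 * Cloning yields a new open version owned by the cloning process.
 * latest_version is the number read from the release file.
 */
static sketch_column_t sketch_clone(unsigned latest_version, long pid)
{
    sketch_column_t column = { latest_version + 1, SKETCH_OPEN, pid };
    return column;
}

/* Closing is final: nothing in this sketch sets the status back to Open. */
static void sketch_close(sketch_column_t *column)
{
    column->status = SKETCH_CLOSED;
}
```

Because a closed version is never reopened, readers can rely on its content without locking; writers always work on their own new version.
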

OBIDMS release files
====================

Each OBIDMS column is associated with an OBIDMS release file that contains the number of the latest
version of the column.

File name
---------

OBIDMS release files are named after the attribute associated with the data contained in the column, and
have the extension ``.odr``.

Example: ``count.odr``


OBIDMS views
============

An OBIDMS view consists of a list of OBIDMS columns and lines. A view includes one version
of each mandatory column. Only one version of each column is included. All the columns of
one view contain the same number of lines in the same order.

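
Two of the constraints stated above (a single version per column, and the same number of lines in every column) can be checked with a short C sketch. ``sketch_view_column_t`` and ``sketch_view_is_consistent`` are hypothetical names; how a view is actually stored is not described here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>   /* strcmp */

/* Hypothetical description of one column included in a view. */
typedef struct {
    const char *attribute;   /* e.g. "count" */
    unsigned    version;     /* version of the column file used by the view */
    size_t      line_count;  /* number of lines in that column */
} sketch_view_column_t;

/*
 * A view is consistent when no attribute appears twice (only one version
 * of each column) and all columns have the same number of lines.
 */
static bool sketch_view_is_consistent(const sketch_view_column_t *columns,
                                      size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (columns[i].line_count != columns[0].line_count)
            return false;
        for (size_t j = 0; j < i; j++) {
            if (strcmp(columns[i].attribute, columns[j].attribute) == 0)
                return false;
        }
    }
    return true;
}
```
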

OBIDMS history file
===================

An OBIDMS history file consists of an ordered list of views and commands, each command leading
from one view to the next.

This history can be represented in the form of a ?? showing all the
operations ever done in the OBIDMS directory and the views in between them:

.. image:: ./images/history.png
   :width: 150 px
   :align: center


OBIDMS UML
==========

.. image:: ./images/OBIDMS_UML.png

:download:`html version of the OBIDMS UML file </ObiDMS_UML.class.violet.html>`


OBIType header file
===================

.. doxygenfile:: obitypes.h


OBIIntColumn header file
========================

.. doxygenfile:: obiintcolumn.h


OBIColumn header file
=====================

.. doxygenfile:: obicolumn.h