updated documentation
@ -3,6 +3,66 @@ Formats
#######


********
OBITypes
********

.. note::
    All OBITypes have an associated NA (Not Available) value.


Atomic types
============

========= ========= ============ ==============================
Type      C type    OBIType      Definition
========= ========= ============ ==============================
integer   int32_t   OBIInt_t     a signed integer value
float     double    OBIFloat_t   a floating-point value
boolean   ?         OBIBool_t    a boolean true/false value
char      char      OBIChar_t    a character
index     size_t    OBIIdx_t     an index in a data structure
========= ========= ============ ==============================
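
The table above could map onto C declarations along the following lines. This is only a sketch, not the project's actual header. The NA sentinel values, the use of ``uint8_t`` for ``OBIBool_t`` (the table leaves its C type open) and the ``obi_*_is_na`` helper names are all assumptions made for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Typedefs taken from the table above. */
typedef int32_t OBIInt_t;
typedef double  OBIFloat_t;
typedef uint8_t OBIBool_t;   /* assumption: the table lists "?" as the C type */
typedef char    OBIChar_t;
typedef size_t  OBIIdx_t;

static_assert(sizeof(OBIInt_t) == 4, "OBIInt_t must be 32 bits wide");

/* Hypothetical NA sentinels: one reserved value per atomic type. */
#define OBIInt_NA   (INT32_MIN)
#define OBIFloat_NA (NAN)               /* NaN doubles as NA in this sketch */
#define OBIBool_NA  ((OBIBool_t) 2)     /* 0 = false, 1 = true, 2 = NA */
#define OBIChar_NA  ('\0')
#define OBIIdx_NA   (SIZE_MAX)

/* NA tests; NaN never compares equal to itself, hence isnan(). */
static inline bool obi_int_is_na(OBIInt_t v)     { return v == OBIInt_NA; }
static inline bool obi_float_is_na(OBIFloat_t v) { return isnan(v); }
static inline bool obi_bool_is_na(OBIBool_t v)   { return v == OBIBool_NA; }
static inline bool obi_char_is_na(OBIChar_t v)   { return v == OBIChar_NA; }
static inline bool obi_idx_is_na(OBIIdx_t v)     { return v == OBIIdx_NA; }
```

Reserving one value per type keeps each column a plain array of fixed-size elements. The price is that the reserved value can no longer be stored as ordinary data.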


Character string type
=====================

================ ====== ======== ==================
Type             C type OBIType  Definition
================ ====== ======== ==================
Character string char*  OBIStr_t a character string
================ ====== ======== ==================


Container types
===============

Lists
-----

Lists of values with an atomic OBIType.


Ensembles
---------

Ensembles (sets) of values with an atomic OBIType.


Dictionaries
------------

* Dictionaries of *OBIIdx_t* values indexed by *OBIStr_t* values, typically used for the storage of DNA sequences.
* Bit arrays for data presence/absence information in the above dictionaries.
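
A presence/absence bit array of the kind mentioned above could look like the following sketch. The type and function names are hypothetical, and the layout (one bit per entry, least significant bit first within each byte) is an assumption, not something the format specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bit array: bit i is 1 when entry i is present. */
typedef struct {
    size_t   n_bits;   /* number of entries tracked */
    uint8_t *bytes;    /* zero-initialised storage: every entry starts absent */
} obi_bitarray_t;

/* Allocate a bit array for n_bits entries; returns NULL on allocation failure. */
obi_bitarray_t *obi_bitarray_new(size_t n_bits)
{
    obi_bitarray_t *ba = malloc(sizeof *ba);
    if (ba == NULL)
        return NULL;
    ba->n_bits = n_bits;
    ba->bytes = calloc(n_bits / 8 + 1, 1);   /* always at least one byte */
    if (ba->bytes == NULL) {
        free(ba);
        return NULL;
    }
    return ba;
}

/* Mark entry idx as present or absent. */
void obi_bitarray_set(obi_bitarray_t *ba, size_t idx, bool present)
{
    assert(ba != NULL && idx < ba->n_bits);
    const uint8_t mask = (uint8_t) (1u << (idx % 8));
    if (present)
        ba->bytes[idx / 8] |= mask;
    else
        ba->bytes[idx / 8] &= (uint8_t) ~mask;
}

/* Return true when entry idx is marked present. */
bool obi_bitarray_get(const obi_bitarray_t *ba, size_t idx)
{
    assert(ba != NULL && idx < ba->n_bits);
    return (ba->bytes[idx / 8] >> (idx % 8)) & 1u;
}

/* Release the bit array; accepts NULL. */
void obi_bitarray_free(obi_bitarray_t *ba)
{
    if (ba != NULL) {
        free(ba->bytes);
        free(ba);
    }
}
```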


The taxid type
--------------

A pair (*OBIInt_t* value, *OBIStr_t* value) corresponding to the taxid and a reference to a taxonomic database.


*********************************************
The OBItools3 Data Management System (OBIDMS)
*********************************************
@ -10,7 +70,8 @@ The OBItools3 Data Management System (OBIDMS)
An OBIDMS directory consists of:
* OBIDMS column files
* OBIDMS release files
* an OBIDMS history file
* OBIDMS dictionary files
* one OBIDMS history file


OBIDMS column files
@ -19,9 +80,37 @@ OBIDMS column files
Each OBIDMS column file contains:
* a header of a size equal to a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
  on most systems) containing metadata
* one column of data of the same type
* one column of data with the same OBIType
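
The "multiple of PAGESIZE" rule for the header amounts to rounding the metadata size up to the next page boundary. A minimal sketch follows. The function name is hypothetical, and the page size is passed in as a parameter (on POSIX systems it can be queried with ``sysconf(_SC_PAGESIZE)``).

```c
#include <assert.h>
#include <stddef.h>

/* Round the metadata size up to the next multiple of the page size.
   An empty metadata block still occupies one full page (an assumption). */
size_t obi_header_size(size_t metadata_bytes, size_t page_size)
{
    assert(page_size > 0);
    if (metadata_bytes == 0)
        return page_size;
    return ((metadata_bytes + page_size - 1) / page_size) * page_size;
}
```

With a page size of 4096 bytes, 100 bytes of metadata give a 4096-byte header and 4097 bytes give an 8192-byte header, so the data column always starts on a page boundary.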


Header
------

The header of an OBIDMS column contains:

OBIDMS column files are read-only.
* Byte order (endianness)
* Header size (a multiple of PAGESIZE)
*
* File status: Open/Closed
* Owner: PID of the process that created the file, which is the only process allowed to modify it while it is open
* Number of lines (total or without the header?)
* OBIType
* Date of creation
* Version of the file
* Optional comments
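
The fields listed above could be grouped in a C structure such as the one sketched below. Field names, field types, the enum encodings and the fixed comment size are all assumptions. The list does not define an on-disk layout, and the empty bullet is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical enumerations; the actual encodings are not specified. */
typedef enum { OBI_FILE_CLOSED = 0, OBI_FILE_OPEN = 1 } obi_file_status_t;
typedef enum {
    OBI_T_INT, OBI_T_FLOAT, OBI_T_BOOL, OBI_T_CHAR, OBI_T_IDX, OBI_T_STR
} obi_type_tag_t;

/* One field per bullet of the header description. */
typedef struct {
    bool              little_endian;  /* byte order of the stored data */
    size_t            header_size;    /* in bytes, a multiple of PAGESIZE */
    obi_file_status_t status;         /* Open or Closed */
    int64_t           owner_pid;      /* PID of the creating process */
    size_t            line_count;     /* number of lines */
    obi_type_tag_t    data_type;      /* OBIType of the column */
    time_t            creation_date;  /* date of creation */
    int32_t           version;        /* version of the file */
    char              comments[256];  /* optional, NUL-terminated; size is arbitrary */
} obi_column_header_t;

/* Detect the byte order of the machine writing the file. */
bool obi_host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *) &probe == 1;
}

/* Basic sanity checks on a header: large enough to hold the structure,
   page-aligned in size, and carrying a version of at least 1. */
bool obi_header_is_valid(const obi_column_header_t *h, size_t page_size)
{
    assert(h != NULL && page_size > 0);
    return h->header_size >= sizeof *h
        && h->header_size % page_size == 0
        && h->version >= 1;
}
```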


Data
----

A column of data with the same OBIType.


Mandatory columns
-----------------

Some columns must exist in an OBIDMS directory:
* sequence identifiers column (type *OBIStr_t*)


File name
@ -33,47 +122,30 @@ its version, separated by an ``@``, and with the extension ``.odc``.
Example: ``count@3.odc``
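
A sketch of how such a file name could be built and parsed is given below. It assumes the name has the form ``<column name>@<version>.odc``, as the example suggests, and that versions start at 1. Both function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write "<name>@<version>.odc" into buf; returns false if it does not fit. */
bool obi_column_file_name(char *buf, size_t buf_size,
                          const char *name, int32_t version)
{
    assert(buf != NULL && name != NULL && version >= 1);
    /* snprintf itself returns an int: the length it needed, or a negative value. */
    const int written = snprintf(buf, buf_size, "%s@%" PRId32 ".odc", name, version);
    return written >= 0 && (size_t) written < buf_size;
}

/* Extract the version from a column file name; returns false if the
   name does not look like "<name>@<version>.odc". */
bool obi_column_file_version(const char *file_name, int32_t *version)
{
    assert(file_name != NULL && version != NULL);
    const char *at = strrchr(file_name, '@');
    if (at == NULL || at == file_name || !isdigit((unsigned char) at[1]))
        return false;
    char *end = NULL;
    errno = 0;
    const long v = strtol(at + 1, &end, 10);
    if (errno != 0 || v < 1 || v > INT32_MAX)
        return false;
    if (strcmp(end, ".odc") != 0)   /* the extension must follow the digits */
        return false;
    *version = (int32_t) v;
    return true;
}
```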


Header
------
Modifications
-------------

The header of an OBIDMS column contains:

* Byte order (endianness)
* PAGESIZE value / Size of the header

* Number of lines (total or without the header?)
* Data type (int, str...)
* Date of creation
* Version of the file
* Optional comments
An OBIDMS column file can only be modified by the process that created it, and only while its status is set to Open. This information is
contained in the `header <#header>`_.

When a process wants to modify an OBIDMS column file that is closed, it must first clone it. Cloning creates a new version of the
file that belongs to the process, i.e., only that process can modify that file, as long as its status is set to Open. Once the process
has finished writing the new version of the column file, it sets the column file's status to Closed, and the file can never be modified
again.
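
The rules above (only the owner may write, only while the file is open, and a closed file must be cloned) can be sketched as a small state machine. The structure and function names are hypothetical. The sketch only tracks metadata and does not copy any data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { OBI_COL_CLOSED, OBI_COL_OPEN } obi_col_state_t;

/* Hypothetical subset of the header used by the modification rules. */
typedef struct {
    obi_col_state_t state;
    int64_t         owner_pid;
    int32_t         version;
} obi_col_meta_t;

/* A column file is writable only by its owner, and only while open. */
bool obi_col_can_modify(const obi_col_meta_t *col, int64_t pid)
{
    assert(col != NULL);
    return col->state == OBI_COL_OPEN && col->owner_pid == pid;
}

/* Clone a closed column file: the clone is open, owned by pid, and gets
   the version number following the latest existing one. */
bool obi_col_clone(const obi_col_meta_t *src, int32_t latest_version,
                   int64_t pid, obi_col_meta_t *dst)
{
    assert(src != NULL && dst != NULL);
    if (src->state != OBI_COL_CLOSED)
        return false;
    dst->state = OBI_COL_OPEN;
    dst->owner_pid = pid;
    dst->version = latest_version + 1;
    return true;
}

/* Close a column file; afterwards nobody can modify it again. */
bool obi_col_close(obi_col_meta_t *col, int64_t pid)
{
    if (!obi_col_can_modify(col, pid))
        return false;
    col->state = OBI_COL_CLOSED;
    return true;
}
```

There is deliberately no transition from Closed back to Open: the only way to change a closed column file is to clone it into a new version.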

Data
----

A column of data of the same type.
That means that one column is stored in one file (if there is only one version)
or more (if there are several versions), and that there is one file per version.


Versioning
----------

OBIDMS column files are read-only, and any modification leads to the creation of a new version
of the column file. That means that one column is stored in one file (if there is only one version)
or more (if there are several versions), and that there is one file per version.

The first version of a column file is numbered 1, and each new version increments that
number by 1.

The number of the latest version of an OBIDMS column is stored in an `OBIDMS release file <formats.html#obidms-release-files>`_.


Mandatory columns
-----------------

Some columns must exist in an OBIDMS directory:
* sequence identifiers column


OBIDMS release files
====================

@ -89,21 +161,28 @@ have the extension ``.odr``.
Example: ``count.odr``


OBIDMS history file
===================

An OBIDMS history file consists of data that can be represented in the form of a directed acyclic
graph presenting the history of all the operations ever done in the OBIDMS directory.


OBIDMS views
============

An OBIDMS view corresponds to a list of OBIDMS columns and lines. A view includes one version
An OBIDMS view consists of a list of OBIDMS columns and lines. A view includes one version
of each mandatory column. Only one version of each column is included. All the columns of
one view contain the same number of lines in the same order.
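
Two of the constraints above can be checked mechanically: each column appears in only one version, and all columns have the same number of lines. The sketch below uses a hypothetical in-memory description of a view. The column names in the usage are made up, and the "same order" constraint is not something this check can verify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one column included in a view. */
typedef struct {
    const char *name;        /* column name */
    int32_t     version;     /* version of the column included in the view */
    size_t      line_count;  /* number of lines in that column */
} obi_view_column_t;

/* Return true when no column name appears twice and all columns
   have the same number of lines. An empty view is consistent. */
bool obi_view_is_consistent(const obi_view_column_t *cols, size_t n_cols)
{
    assert(cols != NULL || n_cols == 0);
    for (size_t i = 0; i < n_cols; i++) {
        if (cols[i].line_count != cols[0].line_count)
            return false;
        for (size_t j = 0; j < i; j++) {
            if (strcmp(cols[i].name, cols[j].name) == 0)
                return false;   /* two versions of the same column */
        }
    }
    return true;
}
```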


OBIDMS history file
===================

An OBIDMS history file consists of an ordered list of views and commands, those commands leading
from one view to the next one.

This history can be represented in the form of a --- showing all the
operations ever done in the OBIDMS directory and the views in between them:

.. image:: ./images/history.png
   :width: 150 px
   :align: center


OBIDMS UML
==========

@ -100,12 +100,12 @@ Naming conventions
******************

.. todo::
    Look for usual naming conventions
    Look for common naming conventions


*****************
Programming rules
*****************

* The *int* type should never be used
*
Binary file not shown.
Before Width: | Height: | Size: 62 KiB After Width: | Height: | Size: 63 KiB |
BIN
doc/source/images/history.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 17 KiB |
@ -11,6 +11,7 @@ OBITools3 documentation

Programming guidelines <guidelines>
Formats <formats>
Ideas for consideration <pistes>


Indices and tables
20
doc/source/pistes.rst
Normal file
@ -0,0 +1,20 @@
#######################
Ideas for consideration
#######################


*****************************
What we want to be able to do
*****************************

* Handle missing values
* Modify a column while it is being written (mmap)
* Append values to the end of a column while it is being written (mmap)


*************
Miscellaneous
*************

* If the order of a column is changed, the column is rewritten (no index).
* Some mechanism to lock read access to one program at a time...