Compare commits

...

36 Commits

Author SHA1 Message Date
d1d26b9028 Simplify the code 2016-08-04 08:00:54 +02:00
8f0462c407 Merge branch 'master' into Eric_version_for_sequence
Conflicts:
	python/obitools3/obidms/_obidmscolumn_seq.pyx
2016-08-03 10:09:20 +02:00
000b9999ad Merge branch 'master' of git@git.metabarcoding.org:obitools/obitools3.git 2016-07-03 09:22:22 +02:00
aff9831c13 Substitute fprintf call by fputs call to conform with the new ubuntu
compilation rules
2016-07-03 09:21:56 +02:00
448fa8d325 first trial for a fasta formater 2016-07-03 09:18:52 +02:00
6af62d8124 Change a fprintf without argument to a fputs to comply with the new
default parameter on ubuntu
2016-07-03 08:25:06 +02:00
0869b9ba3f Closes issue #47 by storing each view in a separate file named with the
view's name and created upon view creation.
2016-06-30 11:41:30 +02:00
ad2af0b512 Some comments updated 2016-06-16 11:26:54 +02:00
38e603ed57 Deleted some redundant cython code 2016-06-10 10:34:47 +02:00
f438c3d913 OBIQUAL columns can now handle multiple elements per line 2016-06-09 15:54:36 +02:00
2a1ea3ba3f Setting NA values is now handled properly for OBI_SEQ, OBI_STR and
OBI_QUAL columns
2016-06-09 14:22:36 +02:00
fc3641d7ff Read-only AVLs are now hard-linked instead of copied when cloning an AVL
group to make it writable. Also fixed several bugs when handling AVL
groups.
2016-06-03 19:02:46 +02:00
799b942017 Deleted old debugging print 2016-06-03 18:57:32 +02:00
6e3f5b230e Fixed typo in doc 2016-06-03 18:56:45 +02:00
2f57f80c63 Fixed a bug where an unmapped variable would be read 2016-06-03 18:55:58 +02:00
2962c4d250 Goes with previous commit 2016-06-03 18:54:25 +02:00
69bf7ec2e7 NA value for OBI_STR and OBI_SEQ columns is now NULL 2016-06-03 18:53:22 +02:00
bac7ce7184 Start of the implementation of the export methods 2016-06-02 19:10:33 +02:00
f186395661 Trap potential exception generated by char* to bytes casts 2016-05-29 21:18:20 +02:00
85395dfc1a value returned for sequence is now bytes and no more str 2016-05-29 13:53:32 +02:00
f830389974 Add some comment on the location of the align method. 2016-05-29 12:58:31 +02:00
2e35229357 Add conversion checking on the value of a seq column 2016-05-29 12:54:13 +02:00
a8ed57dc6e few small changes 2016-05-21 12:29:55 +02:00
c3274d419c remove an extra debug log 2016-05-21 12:29:08 +02:00
cca0dbb46b Close issue #54 by adding a read1 method to the MagicKeyFile class 2016-05-21 12:24:48 +02:00
5a78157112 increase parsing speed of the header 2016-05-21 10:29:11 +02:00
0b9a41d952 Patch a bug about the reading of the last sequence 2016-05-21 10:28:03 +02:00
e681ca646d Fixed a problem with some columns being shorter in views and triggering
errors when trying to get values. Temporary fix that needs discussion
2016-05-20 18:45:29 +02:00
3b59043ea8 Major update: New column type to store sequence qualities. Closes #41 2016-05-20 16:45:22 +02:00
ffff91e76c Fixed variable name that had been accidentally changed for better
clarity
2016-05-18 13:27:41 +02:00
6a8df069ad Indexers are now cloned if needed to modify them after they've been
closed. Obligatory indexers' names now follow the same pattern as other
indexers (columnname_version). Closes #46 and #49.
2016-05-18 13:23:48 +02:00
8ae7644945 First version of quality handling (not working yet) and now it is
checked that a column is writable before enlarging it
2016-05-11 16:38:14 +02:00
b3c47809da First version of alignment functions (imported from suma* programs) 2016-05-11 16:36:23 +02:00
3567681339 Now when a column is added to a view, if there is a line selection, all
columns in the view are cloned first
2016-05-11 16:34:20 +02:00
757ef8509a Deleting CeCILL license duplicates 2016-05-09 11:17:45 +02:00
f961621f5d Minor improvements in _obidms Cython layer 2016-05-04 13:43:26 +02:00
75 changed files with 4656 additions and 2038 deletions

View File

@ -1,506 +0,0 @@
CeCILL FREE SOFTWARE LICENSE AGREEMENT
Notice
This Agreement is a Free Software license agreement that is the result
of discussions between its authors in order to ensure compliance with
the two main principles guiding its drafting:
* firstly, compliance with the principles governing the distribution
of Free Software: access to source code, broad rights granted to
users,
* secondly, the election of a governing law, French law, with which
it is conformant, both as regards the law of torts and
intellectual property law, and the protection that it offers to
both authors and holders of the economic rights over software.
The authors of the CeCILL (for Ce[a] C[nrs] I[nria] L[ogiciel] L[ibre])
license are:
Commissariat <20> l'Energie Atomique - CEA, a public scientific, technical
and industrial research establishment, having its principal place of
business at 25 rue Leblanc, immeuble Le Ponant D, 75015 Paris, France.
Centre National de la Recherche Scientifique - CNRS, a public scientific
and technological establishment, having its principal place of business
at 3 rue Michel-Ange, 75794 Paris cedex 16, France.
Institut National de Recherche en Informatique et en Automatique -
INRIA, a public scientific and technological establishment, having its
principal place of business at Domaine de Voluceau, Rocquencourt, BP
105, 78153 Le Chesnay cedex, France.
Preamble
The purpose of this Free Software license agreement is to grant users
the right to modify and redistribute the software governed by this
license within the framework of an open source distribution model.
The exercising of these rights is conditional upon certain obligations
for users so as to preserve this status for all subsequent redistributions.
In consideration of access to the source code and the rights to copy,
modify and redistribute granted by the license, users are provided only
with a limited warranty and the software's author, the holder of the
economic rights, and the successive licensors only have limited liability.
In this respect, the risks associated with loading, using, modifying
and/or developing or reproducing the software by the user are brought to
the user's attention, given its Free Software status, which may make it
complicated to use, with the result that its use is reserved for
developers and experienced professionals having in-depth computer
knowledge. Users are therefore encouraged to load and test the
suitability of the software as regards their requirements in conditions
enabling the security of their systems and/or data to be ensured and,
more generally, to use and operate it in the same conditions of
security. This Agreement may be freely reproduced and published,
provided it is not altered, and that no provisions are either added or
removed herefrom.
This Agreement may apply to any or all software for which the holder of
the economic rights decides to submit the use thereof to its provisions.
Article 1 - DEFINITIONS
For the purpose of this Agreement, when the following expressions
commence with a capital letter, they shall have the following meaning:
Agreement: means this license agreement, and its possible subsequent
versions and annexes.
Software: means the software in its Object Code and/or Source Code form
and, where applicable, its documentation, "as is" when the Licensee
accepts the Agreement.
Initial Software: means the Software in its Source Code and possibly its
Object Code form and, where applicable, its documentation, "as is" when
it is first distributed under the terms and conditions of the Agreement.
Modified Software: means the Software modified by at least one
Contribution.
Source Code: means all the Software's instructions and program lines to
which access is required so as to modify the Software.
Object Code: means the binary files originating from the compilation of
the Source Code.
Holder: means the holder(s) of the economic rights over the Initial
Software.
Licensee: means the Software user(s) having accepted the Agreement.
Contributor: means a Licensee having made at least one Contribution.
Licensor: means the Holder, or any other individual or legal entity, who
distributes the Software under the Agreement.
Contribution: means any or all modifications, corrections, translations,
adaptations and/or new functions integrated into the Software by any or
all Contributors, as well as any or all Internal Modules.
Module: means a set of sources files including their documentation that
enables supplementary functions or services in addition to those offered
by the Software.
External Module: means any or all Modules, not derived from the
Software, so that this Module and the Software run in separate address
spaces, with one calling the other when they are run.
Internal Module: means any or all Module, connected to the Software so
that they both execute in the same address space.
GNU GPL: means the GNU General Public License version 2 or any
subsequent version, as published by the Free Software Foundation Inc.
Parties: mean both the Licensee and the Licensor.
These expressions may be used both in singular and plural form.
Article 2 - PURPOSE
The purpose of the Agreement is the grant by the Licensor to the
Licensee of a non-exclusive, transferable and worldwide license for the
Software as set forth in Article 5 hereinafter for the whole term of the
protection granted by the rights over said Software.
Article 3 - ACCEPTANCE
3.1 The Licensee shall be deemed as having accepted the terms and
conditions of this Agreement upon the occurrence of the first of the
following events:
* (i) loading the Software by any or all means, notably, by
downloading from a remote server, or by loading from a physical
medium;
* (ii) the first time the Licensee exercises any of the rights
granted hereunder.
3.2 One copy of the Agreement, containing a notice relating to the
characteristics of the Software, to the limited warranty, and to the
fact that its use is restricted to experienced users has been provided
to the Licensee prior to its acceptance as set forth in Article 3.1
hereinabove, and the Licensee hereby acknowledges that it has read and
understood it.
Article 4 - EFFECTIVE DATE AND TERM
4.1 EFFECTIVE DATE
The Agreement shall become effective on the date when it is accepted by
the Licensee as set forth in Article 3.1.
4.2 TERM
The Agreement shall remain in force for the entire legal term of
protection of the economic rights over the Software.
Article 5 - SCOPE OF RIGHTS GRANTED
The Licensor hereby grants to the Licensee, who accepts, the following
rights over the Software for any or all use, and for the term of the
Agreement, on the basis of the terms and conditions set forth hereinafter.
Besides, if the Licensor owns or comes to own one or more patents
protecting all or part of the functions of the Software or of its
components, the Licensor undertakes not to enforce the rights granted by
these patents against successive Licensees using, exploiting or
modifying the Software. If these patents are transferred, the Licensor
undertakes to have the transferees subscribe to the obligations set
forth in this paragraph.
5.1 RIGHT OF USE
The Licensee is authorized to use the Software, without any limitation
as to its fields of application, with it being hereinafter specified
that this comprises:
1. permanent or temporary reproduction of all or part of the Software
by any or all means and in any or all form.
2. loading, displaying, running, or storing the Software on any or
all medium.
3. entitlement to observe, study or test its operation so as to
determine the ideas and principles behind any or all constituent
elements of said Software. This shall apply when the Licensee
carries out any or all loading, displaying, running, transmission
or storage operation as regards the Software, that it is entitled
to carry out hereunder.
5.2 ENTITLEMENT TO MAKE CONTRIBUTIONS
The right to make Contributions includes the right to translate, adapt,
arrange, or make any or all modifications to the Software, and the right
to reproduce the resulting software.
The Licensee is authorized to make any or all Contributions to the
Software provided that it includes an explicit notice that it is the
author of said Contribution and indicates the date of the creation thereof.
5.3 RIGHT OF DISTRIBUTION
In particular, the right of distribution includes the right to publish,
transmit and communicate the Software to the general public on any or
all medium, and by any or all means, and the right to market, either in
consideration of a fee, or free of charge, one or more copies of the
Software by any means.
The Licensee is further authorized to distribute copies of the modified
or unmodified Software to third parties according to the terms and
conditions set forth hereinafter.
5.3.1 DISTRIBUTION OF SOFTWARE WITHOUT MODIFICATION
The Licensee is authorized to distribute true copies of the Software in
Source Code or Object Code form, provided that said distribution
complies with all the provisions of the Agreement and is accompanied by:
1. a copy of the Agreement,
2. a notice relating to the limitation of both the Licensor's
warranty and liability as set forth in Articles 8 and 9,
and that, in the event that only the Object Code of the Software is
redistributed, the Licensee allows future Licensees unhindered access to
the full Source Code of the Software by indicating how to access it, it
being understood that the additional cost of acquiring the Source Code
shall not exceed the cost of transferring the data.
5.3.2 DISTRIBUTION OF MODIFIED SOFTWARE
When the Licensee makes a Contribution to the Software, the terms and
conditions for the distribution of the resulting Modified Software
become subject to all the provisions of this Agreement.
The Licensee is authorized to distribute the Modified Software, in
source code or object code form, provided that said distribution
complies with all the provisions of the Agreement and is accompanied by:
1. a copy of the Agreement,
2. a notice relating to the limitation of both the Licensor's
warranty and liability as set forth in Articles 8 and 9,
and that, in the event that only the object code of the Modified
Software is redistributed, the Licensee allows future Licensees
unhindered access to the full source code of the Modified Software by
indicating how to access it, it being understood that the additional
cost of acquiring the source code shall not exceed the cost of
transferring the data.
5.3.3 DISTRIBUTION OF EXTERNAL MODULES
When the Licensee has developed an External Module, the terms and
conditions of this Agreement do not apply to said External Module, that
may be distributed under a separate license agreement.
5.3.4 COMPATIBILITY WITH THE GNU GPL
The Licensee can include a code that is subject to the provisions of one
of the versions of the GNU GPL in the Modified or unmodified Software,
and distribute that entire code under the terms of the same version of
the GNU GPL.
The Licensee can include the Modified or unmodified Software in a code
that is subject to the provisions of one of the versions of the GNU GPL,
and distribute that entire code under the terms of the same version of
the GNU GPL.
Article 6 - INTELLECTUAL PROPERTY
6.1 OVER THE INITIAL SOFTWARE
The Holder owns the economic rights over the Initial Software. Any or
all use of the Initial Software is subject to compliance with the terms
and conditions under which the Holder has elected to distribute its work
and no one shall be entitled to modify the terms and conditions for the
distribution of said Initial Software.
The Holder undertakes that the Initial Software will remain ruled at
least by this Agreement, for the duration set forth in Article 4.2.
6.2 OVER THE CONTRIBUTIONS
The Licensee who develops a Contribution is the owner of the
intellectual property rights over this Contribution as defined by
applicable law.
6.3 OVER THE EXTERNAL MODULES
The Licensee who develops an External Module is the owner of the
intellectual property rights over this External Module as defined by
applicable law and is free to choose the type of agreement that shall
govern its distribution.
6.4 JOINT PROVISIONS
The Licensee expressly undertakes:
1. not to remove, or modify, in any manner, the intellectual property
notices attached to the Software;
2. to reproduce said notices, in an identical manner, in the copies
of the Software modified or not.
The Licensee undertakes not to directly or indirectly infringe the
intellectual property rights of the Holder and/or Contributors on the
Software and to take, where applicable, vis-<2D>-vis its staff, any and all
measures required to ensure respect of said intellectual property rights
of the Holder and/or Contributors.
Article 7 - RELATED SERVICES
7.1 Under no circumstances shall the Agreement oblige the Licensor to
provide technical assistance or maintenance services for the Software.
However, the Licensor is entitled to offer this type of services. The
terms and conditions of such technical assistance, and/or such
maintenance, shall be set forth in a separate instrument. Only the
Licensor offering said maintenance and/or technical assistance services
shall incur liability therefor.
7.2 Similarly, any Licensor is entitled to offer to its licensees, under
its sole responsibility, a warranty, that shall only be binding upon
itself, for the redistribution of the Software and/or the Modified
Software, under terms and conditions that it is free to decide. Said
warranty, and the financial terms and conditions of its application,
shall be subject of a separate instrument executed between the Licensor
and the Licensee.
Article 8 - LIABILITY
8.1 Subject to the provisions of Article 8.2, the Licensee shall be
entitled to claim compensation for any direct loss it may have suffered
from the Software as a result of a fault on the part of the relevant
Licensor, subject to providing evidence thereof.
8.2 The Licensor's liability is limited to the commitments made under
this Agreement and shall not be incurred as a result of in particular:
(i) loss due the Licensee's total or partial failure to fulfill its
obligations, (ii) direct or consequential loss that is suffered by the
Licensee due to the use or performance of the Software, and (iii) more
generally, any consequential loss. In particular the Parties expressly
agree that any or all pecuniary or business loss (i.e. loss of data,
loss of profits, operating loss, loss of customers or orders,
opportunity cost, any disturbance to business activities) or any or all
legal proceedings instituted against the Licensee by a third party,
shall constitute consequential loss and shall not provide entitlement to
any or all compensation from the Licensor.
Article 9 - WARRANTY
9.1 The Licensee acknowledges that the scientific and technical
state-of-the-art when the Software was distributed did not enable all
possible uses to be tested and verified, nor for the presence of
possible defects to be detected. In this respect, the Licensee's
attention has been drawn to the risks associated with loading, using,
modifying and/or developing and reproducing the Software which are
reserved for experienced users.
The Licensee shall be responsible for verifying, by any or all means,
the suitability of the product for its requirements, its good working
order, and for ensuring that it shall not cause damage to either persons
or properties.
9.2 The Licensor hereby represents, in good faith, that it is entitled
to grant all the rights over the Software (including in particular the
rights set forth in Article 5).
9.3 The Licensee acknowledges that the Software is supplied "as is" by
the Licensor without any other express or tacit warranty, other than
that provided for in Article 9.2 and, in particular, without any warranty
as to its commercial value, its secured, safe, innovative or relevant
nature.
Specifically, the Licensor does not warrant that the Software is free
from any error, that it will operate without interruption, that it will
be compatible with the Licensee's own equipment and software
configuration, nor that it will meet the Licensee's requirements.
9.4 The Licensor does not either expressly or tacitly warrant that the
Software does not infringe any third party intellectual property right
relating to a patent, software or any other property right. Therefore,
the Licensor disclaims any and all liability towards the Licensee
arising out of any or all proceedings for infringement that may be
instituted in respect of the use, modification and redistribution of the
Software. Nevertheless, should such proceedings be instituted against
the Licensee, the Licensor shall provide it with technical and legal
assistance for its defense. Such technical and legal assistance shall be
decided on a case-by-case basis between the relevant Licensor and the
Licensee pursuant to a memorandum of understanding. The Licensor
disclaims any and all liability as regards the Licensee's use of the
name of the Software. No warranty is given as regards the existence of
prior rights over the name of the Software or as regards the existence
of a trademark.
Article 10 - TERMINATION
10.1 In the event of a breach by the Licensee of its obligations
hereunder, the Licensor may automatically terminate this Agreement
thirty (30) days after notice has been sent to the Licensee and has
remained ineffective.
10.2 A Licensee whose Agreement is terminated shall no longer be
authorized to use, modify or distribute the Software. However, any
licenses that it may have granted prior to termination of the Agreement
shall remain valid subject to their having been granted in compliance
with the terms and conditions hereof.
Article 11 - MISCELLANEOUS
11.1 EXCUSABLE EVENTS
Neither Party shall be liable for any or all delay, or failure to
perform the Agreement, that may be attributable to an event of force
majeure, an act of God or an outside cause, such as defective
functioning or interruptions of the electricity or telecommunications
networks, network paralysis following a virus attack, intervention by
government authorities, natural disasters, water damage, earthquakes,
fire, explosions, strikes and labor unrest, war, etc.
11.2 Any failure by either Party, on one or more occasions, to invoke
one or more of the provisions hereof, shall under no circumstances be
interpreted as being a waiver by the interested Party of its right to
invoke said provision(s) subsequently.
11.3 The Agreement cancels and replaces any or all previous agreements,
whether written or oral, between the Parties and having the same
purpose, and constitutes the entirety of the agreement between said
Parties concerning said purpose. No supplement or modification to the
terms and conditions hereof shall be effective as between the Parties
unless it is made in writing and signed by their duly authorized
representatives.
11.4 In the event that one or more of the provisions hereof were to
conflict with a current or future applicable act or legislative text,
said act or legislative text shall prevail, and the Parties shall make
the necessary amendments so as to comply with said act or legislative
text. All other provisions shall remain effective. Similarly, invalidity
of a provision of the Agreement, for any reason whatsoever, shall not
cause the Agreement as a whole to be invalid.
11.5 LANGUAGE
The Agreement is drafted in both French and English and both versions
are deemed authentic.
Article 12 - NEW VERSIONS OF THE AGREEMENT
12.1 Any person is authorized to duplicate and distribute copies of this
Agreement.
12.2 So as to ensure coherence, the wording of this Agreement is
protected and may only be modified by the authors of the License, who
reserve the right to periodically publish updates or new versions of the
Agreement, each with a separate number. These subsequent versions may
address new issues encountered by Free Software.
12.3 Any Software distributed under a given version of the Agreement may
only be subsequently distributed under the same version of the Agreement
or a subsequent version, subject to the provisions of Article 5.3.4.
Article 13 - GOVERNING LAW AND JURISDICTION
13.1 The Agreement is governed by French law. The Parties agree to
endeavor to seek an amicable solution to any disagreements or disputes
that may arise during the performance of the Agreement.
13.2 Failing an amicable solution within two (2) months as from their
occurrence, and unless emergency proceedings are necessary, the
disagreements or disputes shall be referred to the Paris Courts having
jurisdiction, by the more diligent Party.
Version 2.0 dated 2006-09-05.

View File

@ -1,512 +0,0 @@
CONTRAT DE LICENCE DE LOGICIEL LIBRE CeCILL
Avertissement
Ce contrat est une licence de logiciel libre issue d'une concertation
entre ses auteurs afin que le respect de deux grands principes pr<70>side <20>
sa r<>daction:
* d'une part, le respect des principes de diffusion des logiciels
libres: acc<63>s au code source, droits <20>tendus conf<6E>r<EFBFBD>s aux
utilisateurs,
* d'autre part, la d<>signation d'un droit applicable, le droit
fran<61>ais, auquel elle est conforme, tant au regard du droit de la
responsabilit<69> civile que du droit de la propri<72>t<EFBFBD> intellectuelle
et de la protection qu'il offre aux auteurs et titulaires des
droits patrimoniaux sur un logiciel.
Les auteurs de la licence CeCILL (pour Ce[a] C[nrs] I[nria] L[ogiciel]
L[ibre]) sont:
Commissariat <20> l'Energie Atomique - CEA, <20>tablissement public de
recherche <20> caract<63>re scientifique, technique et industriel, dont le
si<EFBFBD>ge est situ<74> 25 rue Leblanc, immeuble Le Ponant D, 75015 Paris.
Centre National de la Recherche Scientifique - CNRS, <20>tablissement
public <20> caract<63>re scientifique et technologique, dont le si<73>ge est
situ<EFBFBD> 3 rue Michel-Ange, 75794 Paris cedex 16.
Institut National de Recherche en Informatique et en Automatique -
INRIA, <20>tablissement public <20> caract<63>re scientifique et technologique,
dont le si<73>ge est situ<74> Domaine de Voluceau, Rocquencourt, BP 105, 78153
Le Chesnay cedex.
Pr<50>ambule
Ce contrat est une licence de logiciel libre dont l'objectif est de
conf<EFBFBD>rer aux utilisateurs la libert<72> de modification et de
redistribution du logiciel r<>gi par cette licence dans le cadre d'un
mod<EFBFBD>le de diffusion en logiciel libre.
L'exercice de ces libert<72>s est assorti de certains devoirs <20> la charge
des utilisateurs afin de pr<70>server ce statut au cours des
redistributions ult<6C>rieures.
L'accessibilit<69> au code source et les droits de copie, de modification
et de redistribution qui en d<>coulent ont pour contrepartie de n'offrir
aux utilisateurs qu'une garantie limit<69>e et de ne faire peser sur
l'auteur du logiciel, le titulaire des droits patrimoniaux et les
conc<EFBFBD>dants successifs qu'une responsabilit<69> restreinte.
A cet <20>gard l'attention de l'utilisateur est attir<69>e sur les risques
associ<EFBFBD>s au chargement, <20> l'utilisation, <20> la modification et/ou au
d<EFBFBD>veloppement et <20> la reproduction du logiciel par l'utilisateur <20>tant
donn<EFBFBD> sa sp<73>cificit<69> de logiciel libre, qui peut le rendre complexe <20>
manipuler et qui le r<>serve donc <20> des d<>veloppeurs ou des
professionnels avertis poss<73>dant des connaissances informatiques
approfondies. Les utilisateurs sont donc invit<69>s <20> charger et tester
l'ad<61>quation du logiciel <20> leurs besoins dans des conditions permettant
d'assurer la s<>curit<69> de leurs syst<73>mes et/ou de leurs donn<6E>es et, plus
g<EFBFBD>n<EFBFBD>ralement, <20> l'utiliser et l'exploiter dans les m<>mes conditions de
s<EFBFBD>curit<EFBFBD>. Ce contrat peut <20>tre reproduit et diffus<75> librement, sous
r<EFBFBD>serve de le conserver en l'<27>tat, sans ajout ni suppression de clauses.
Ce contrat est susceptible de s'appliquer <20> tout logiciel dont le
titulaire des droits patrimoniaux d<>cide de soumettre l'exploitation aux
dispositions qu'il contient.
Article 1 - DEFINITIONS
Dans ce contrat, les termes suivants, lorsqu'ils seront <20>crits avec une
lettre capitale, auront la signification suivante:
Contrat: d<>signe le pr<70>sent contrat de licence, ses <20>ventuelles versions
post<EFBFBD>rieures et annexes.
Logiciel: d<>signe le logiciel sous sa forme de Code Objet et/ou de Code
Source et le cas <20>ch<63>ant sa documentation, dans leur <20>tat au moment de
l'acceptation du Contrat par le Licenci<63>.
Logiciel Initial: d<>signe le Logiciel sous sa forme de Code Source et
<EFBFBD>ventuellement de Code Objet et le cas <20>ch<63>ant sa documentation, dans
leur <20>tat au moment de leur premi<6D>re diffusion sous les termes du Contrat.
Logiciel Modifi<66>: d<>signe le Logiciel modifi<66> par au moins une
Contribution.
Code Source: d<>signe l'ensemble des instructions et des lignes de
programme du Logiciel et auquel l'acc<63>s est n<>cessaire en vue de
modifier le Logiciel.
Code Objet: d<>signe les fichiers binaires issus de la compilation du
Code Source.
Titulaire: d<>signe le ou les d<>tenteurs des droits patrimoniaux d'auteur
sur le Logiciel Initial.
Licenci<EFBFBD>: d<>signe le ou les utilisateurs du Logiciel ayant accept<70> le
Contrat.
Contributeur: d<>signe le Licenci<63> auteur d'au moins une Contribution.
Conc<EFBFBD>dant: d<>signe le Titulaire ou toute personne physique ou morale
distribuant le Logiciel sous le Contrat.
Contribution: d<>signe l'ensemble des modifications, corrections,
traductions, adaptations et/ou nouvelles fonctionnalit<69>s int<6E>gr<67>es dans
le Logiciel par tout Contributeur, ainsi que tout Module Interne.
Module: d<>signe un ensemble de fichiers sources y compris leur
documentation qui permet de r<>aliser des fonctionnalit<69>s ou services
suppl<EFBFBD>mentaires <20> ceux fournis par le Logiciel.
Module Externe: d<>signe tout Module, non d<>riv<69> du Logiciel, tel que ce
Module et le Logiciel s'ex<65>cutent dans des espaces d'adressage
diff<EFBFBD>rents, l'un appelant l'autre au moment de leur ex<65>cution.
Module Interne: d<>signe tout Module li<6C> au Logiciel de telle sorte
qu'ils s'ex<65>cutent dans le m<>me espace d'adressage.
GNU GPL: d<>signe la GNU General Public License dans sa version 2 ou
toute version ult<6C>rieure, telle que publi<6C>e par Free Software Foundation
Inc.
Parties: d<>signe collectivement le Licenci<63> et le Conc<6E>dant.
Ces termes s'entendent au singulier comme au pluriel.
Article 2 - OBJET
Le Contrat a pour objet la concession par le Conc<6E>dant au Licenci<63> d'une
licence non exclusive, cessible et mondiale du Logiciel telle que
d<EFBFBD>finie ci-apr<70>s <20> l'article 5 pour toute la dur<75>e de protection des droits
portant sur ce Logiciel.
Article 3 - ACCEPTATION
3.1 L'acceptation par le Licenci<63> des termes du Contrat est r<>put<75>e
acquise du fait du premier des faits suivants:
* (i) le chargement du Logiciel par tout moyen notamment par
t<>l<EFBFBD>chargement <20> partir d'un serveur distant ou par chargement <20>
partir d'un support physique;
* (ii) le premier exercice par le Licenci<63> de l'un quelconque des
droits conc<6E>d<EFBFBD>s par le Contrat.
3.2 Un exemplaire du Contrat, contenant notamment un avertissement
relatif aux sp<73>cificit<69>s du Logiciel, <20> la restriction de garantie et <20>
la limitation <20> un usage par des utilisateurs exp<78>riment<6E>s a <20>t<EFBFBD> mis <20>
disposition du Licenci<63> pr<70>alablement <20> son acceptation telle que
d<EFBFBD>finie <20> l'article 3.1 ci dessus et le Licenci<63> reconna<6E>t en avoir pris
connaissance.
Article 4 - ENTREE EN VIGUEUR ET DUREE
4.1 ENTREE EN VIGUEUR
Le Contrat entre en vigueur <20> la date de son acceptation par le Licenci<63>
telle que d<>finie en 3.1.
4.2 DUREE
Le Contrat produira ses effets pendant toute la dur<75>e l<>gale de
protection des droits patrimoniaux portant sur le Logiciel.
Article 5 - ETENDUE DES DROITS CONCEDES
Le Conc<6E>dant conc<6E>de au Licenci<63>, qui accepte, les droits suivants sur
le Logiciel pour toutes destinations et pour la dur<75>e du Contrat dans
les conditions ci-apr<70>s d<>taill<6C>es.
Par ailleurs, si le Conc<6E>dant d<>tient ou venait <20> d<>tenir un ou
plusieurs brevets d'invention prot<6F>geant tout ou partie des
fonctionnalit<EFBFBD>s du Logiciel ou de ses composants, il s'engage <20> ne pas
opposer les <20>ventuels droits conf<6E>r<EFBFBD>s par ces brevets aux Licenci<63>s
successifs qui utiliseraient, exploiteraient ou modifieraient le
Logiciel. En cas de cession de ces brevets, le Conc<6E>dant s'engage <20>
faire reprendre les obligations du pr<70>sent alin<69>a aux cessionnaires.
5.1 DROIT D'UTILISATION
Le Licenci<63> est autoris<69> <20> utiliser le Logiciel, sans restriction quant
aux domaines d'application, <20>tant ci-apr<70>s pr<70>cis<69> que cela comporte:
1. la reproduction permanente ou provisoire du Logiciel en tout ou
partie par tout moyen et sous toute forme.
2. le chargement, l'affichage, l'ex<65>cution, ou le stockage du
Logiciel sur tout support.
3. la possibilit<69> d'en observer, d'en <20>tudier, ou d'en tester le
fonctionnement afin de d<>terminer les id<69>es et principes qui sont
<20> la base de n'importe quel <20>l<EFBFBD>ment de ce Logiciel; et ceci,
lorsque le Licenci<63> effectue toute op<6F>ration de chargement,
d'affichage, d'ex<65>cution, de transmission ou de stockage du
Logiciel qu'il est en droit d'effectuer en vertu du Contrat.
5.2 DROIT D'APPORTER DES CONTRIBUTIONS
Le droit d'apporter des Contributions comporte le droit de traduire,
d'adapter, d'arranger ou d'apporter toute autre modification au Logiciel
et le droit de reproduire le logiciel en r<>sultant.
Le Licenci<63> est autoris<69> <20> apporter toute Contribution au Logiciel sous
r<EFBFBD>serve de mentionner, de fa<66>on explicite, son nom en tant qu'auteur de
cette Contribution et la date de cr<63>ation de celle-ci.
5.3 DROIT DE DISTRIBUTION
Le droit de distribution comporte notamment le droit de diffuser, de
transmettre et de communiquer le Logiciel au public sur tout support et
par tout moyen ainsi que le droit de mettre sur le march<63> <20> titre
on<EFBFBD>reux ou gratuit, un ou des exemplaires du Logiciel par tout proc<6F>d<EFBFBD>.
Le Licenci<63> est autoris<69> <20> distribuer des copies du Logiciel, modifi<66> ou
non, <20> des tiers dans les conditions ci-apr<70>s d<>taill<6C>es.
5.3.1 DISTRIBUTION DU LOGICIEL SANS MODIFICATION
Le Licenci<63> est autoris<69> <20> distribuer des copies conformes du Logiciel,
sous forme de Code Source ou de Code Objet, <20> condition que cette
distribution respecte les dispositions du Contrat dans leur totalit<69> et
soit accompagn<67>e:
1. d'un exemplaire du Contrat,
2. d'un avertissement relatif <20> la restriction de garantie et de
responsabilit<69> du Conc<6E>dant telle que pr<70>vue aux articles 8
et 9,
et que, dans le cas o<> seul le Code Objet du Logiciel est redistribu<62>,
le Licenci<63> permette aux futurs Licenci<63>s d'acc<63>der facilement au Code
Source complet du Logiciel en indiquant les modalit<69>s d'acc<63>s, <20>tant
entendu que le co<63>t additionnel d'acquisition du Code Source ne devra
pas exc<78>der le simple co<63>t de transfert des donn<6E>es.
5.3.2 DISTRIBUTION DU LOGICIEL MODIFIE
Lorsque le Licenci<63> apporte une Contribution au Logiciel, les conditions
de distribution du Logiciel Modifi<66> en r<>sultant sont alors soumises <20>
l'int<6E>gralit<69> des dispositions du Contrat.
Le Licenci<63> est autoris<69> <20> distribuer le Logiciel Modifi<66>, sous forme de
code source ou de code objet, <20> condition que cette distribution
respecte les dispositions du Contrat dans leur totalit<69> et soit
accompagn<EFBFBD>e:
1. d'un exemplaire du Contrat,
2. d'un avertissement relatif <20> la restriction de garantie et de
responsabilit<69> du Conc<6E>dant telle que pr<70>vue aux articles 8
et 9,
et que, dans le cas o<> seul le code objet du Logiciel Modifi<66> est
redistribu<EFBFBD>, le Licenci<63> permette aux futurs Licenci<63>s d'acc<63>der
facilement au code source complet du Logiciel Modifi<66> en indiquant les
modalit<EFBFBD>s d'acc<63>s, <20>tant entendu que le co<63>t additionnel d'acquisition
du code source ne devra pas exc<78>der le simple co<63>t de transfert des donn<6E>es.
5.3.3 DISTRIBUTION DES MODULES EXTERNES
Lorsque le Licenci<63> a d<>velopp<70> un Module Externe les conditions du
Contrat ne s'appliquent pas <20> ce Module Externe, qui peut <20>tre distribu<62>
sous un contrat de licence diff<66>rent.
5.3.4 COMPATIBILITE AVEC LA LICENCE GNU GPL
Le Licenci<63> peut inclure un code soumis aux dispositions d'une des
versions de la licence GNU GPL dans le Logiciel modifi<66> ou non et
distribuer l'ensemble sous les conditions de la m<>me version de la
licence GNU GPL.
Le Licenci<63> peut inclure le Logiciel modifi<66> ou non dans un code soumis
aux dispositions d'une des versions de la licence GNU GPL et distribuer
l'ensemble sous les conditions de la m<>me version de la licence GNU GPL.
Article 6 - PROPRIETE INTELLECTUELLE
6.1 SUR LE LOGICIEL INITIAL
Le Titulaire est d<>tenteur des droits patrimoniaux sur le Logiciel
Initial. Toute utilisation du Logiciel Initial est soumise au respect
des conditions dans lesquelles le Titulaire a choisi de diffuser son
oeuvre et nul autre n'a la facult<6C> de modifier les conditions de
diffusion de ce Logiciel Initial.
Le Titulaire s'engage <20> ce que le Logiciel Initial reste au moins r<>gi
par le Contrat et ce, pour la dur<75>e vis<69>e <20> l'article 4.2.
6.2 SUR LES CONTRIBUTIONS
Le Licenci<63> qui a d<>velopp<70> une Contribution est titulaire sur celle-ci
des droits de propri<72>t<EFBFBD> intellectuelle dans les conditions d<>finies par
la l<>gislation applicable.
6.3 SUR LES MODULES EXTERNES
Le Licenci<63> qui a d<>velopp<70> un Module Externe est titulaire sur celui-ci
des droits de propri<72>t<EFBFBD> intellectuelle dans les conditions d<>finies par
la l<>gislation applicable et reste libre du choix du contrat r<>gissant
sa diffusion.
6.4 DISPOSITIONS COMMUNES
Le Licenci<63> s'engage express<73>ment:
1. <20> ne pas supprimer ou modifier de quelque mani<6E>re que ce soit les
mentions de propri<72>t<EFBFBD> intellectuelle appos<6F>es sur le Logiciel;
2. <20> reproduire <20> l'identique lesdites mentions de propri<72>t<EFBFBD>
intellectuelle sur les copies du Logiciel modifi<66> ou non.
Le Licenci<63> s'engage <20> ne pas porter atteinte, directement ou
indirectement, aux droits de propri<72>t<EFBFBD> intellectuelle du Titulaire et/ou
des Contributeurs sur le Logiciel et <20> prendre, le cas <20>ch<63>ant, <20>
l'<27>gard de son personnel toutes les mesures n<>cessaires pour assurer le
respect des dits droits de propri<72>t<EFBFBD> intellectuelle du Titulaire et/ou
des Contributeurs.
Article 7 - SERVICES ASSOCIES
7.1 Le Contrat n'oblige en aucun cas le Conc<6E>dant <20> la r<>alisation de
prestations d'assistance technique ou de maintenance du Logiciel.
Cependant le Conc<6E>dant reste libre de proposer ce type de services. Les
termes et conditions d'une telle assistance technique et/ou d'une telle
maintenance seront alors d<>termin<69>s dans un acte s<>par<61>. Ces actes de
maintenance et/ou assistance technique n'engageront que la seule
responsabilit<EFBFBD> du Conc<6E>dant qui les propose.
7.2 De m<>me, tout Conc<6E>dant est libre de proposer, sous sa seule
responsabilit<EFBFBD>, <20> ses licenci<63>s une garantie, qui n'engagera que lui,
lors de la redistribution du Logiciel et/ou du Logiciel Modifi<66> et ce,
dans les conditions qu'il souhaite. Cette garantie et les modalit<69>s
financi<EFBFBD>res de son application feront l'objet d'un acte s<>par<61> entre le
Conc<EFBFBD>dant et le Licenci<63>.
Article 8 - RESPONSABILITE
8.1 Sous r<>serve des dispositions de l'article 8.2, le Licenci<63> a la
facult<EFBFBD>, sous r<>serve de prouver la faute du Conc<6E>dant concern<72>, de
solliciter la r<>paration du pr<70>judice direct qu'il subirait du fait du
Logiciel et dont il apportera la preuve.
8.2 La responsabilit<69> du Conc<6E>dant est limit<69>e aux engagements pris en
application du Contrat et ne saurait <20>tre engag<61>e en raison notamment:
(i) des dommages dus <20> l'inex<65>cution, totale ou partielle, de ses
obligations par le Licenci<63>, (ii) des dommages directs ou indirects
d<EFBFBD>coulant de l'utilisation ou des performances du Logiciel subis par le
Licenci<EFBFBD> et (iii) plus g<>n<EFBFBD>ralement d'un quelconque dommage indirect. En
particulier, les Parties conviennent express<73>ment que tout pr<70>judice
financier ou commercial (par exemple perte de donn<6E>es, perte de
b<EFBFBD>n<EFBFBD>fices, perte d'exploitation, perte de client<6E>le ou de commandes,
manque <20> gagner, trouble commercial quelconque) ou toute action dirig<69>e
contre le Licenci<63> par un tiers, constitue un dommage indirect et
n'ouvre pas droit <20> r<>paration par le Conc<6E>dant.
Article 9 - GARANTIE
9.1 Le Licenci<63> reconna<6E>t que l'<27>tat actuel des connaissances
scientifiques et techniques au moment de la mise en circulation du
Logiciel ne permet pas d'en tester et d'en v<>rifier toutes les
utilisations ni de d<>tecter l'existence d'<27>ventuels d<>fauts. L'attention
du Licenci<63> a <20>t<EFBFBD> attir<69>e sur ce point sur les risques associ<63>s au
chargement, <20> l'utilisation, la modification et/ou au d<>veloppement et <20>
la reproduction du Logiciel qui sont r<>serv<72>s <20> des utilisateurs avertis.
Il rel<65>ve de la responsabilit<69> du Licenci<63> de contr<74>ler, par tous
moyens, l'ad<61>quation du produit <20> ses besoins, son bon fonctionnement et
de s'assurer qu'il ne causera pas de dommages aux personnes et aux biens.
9.2 Le Conc<6E>dant d<>clare de bonne foi <20>tre en droit de conc<6E>der
l'ensemble des droits attach<63>s au Logiciel (comprenant notamment les
droits vis<69>s <20> l'article 5).
9.3 Le Licenci<63> reconna<6E>t que le Logiciel est fourni "en l'<27>tat" par le
Conc<EFBFBD>dant sans autre garantie, expresse ou tacite, que celle pr<70>vue <20>
l'article 9.2 et notamment sans aucune garantie sur sa valeur commerciale,
son caract<63>re s<>curis<69>, innovant ou pertinent.
En particulier, le Conc<6E>dant ne garantit pas que le Logiciel est exempt
d'erreur, qu'il fonctionnera sans interruption, qu'il sera compatible
avec l'<27>quipement du Licenci<63> et sa configuration logicielle ni qu'il
remplira les besoins du Licenci<63>.
9.4 Le Conc<6E>dant ne garantit pas, de mani<6E>re expresse ou tacite, que le
Logiciel ne porte pas atteinte <20> un quelconque droit de propri<72>t<EFBFBD>
intellectuelle d'un tiers portant sur un brevet, un logiciel ou sur tout
autre droit de propri<72>t<EFBFBD>. Ainsi, le Conc<6E>dant exclut toute garantie au
profit du Licenci<63> contre les actions en contrefa<66>on qui pourraient <20>tre
diligent<EFBFBD>es au titre de l'utilisation, de la modification, et de la
redistribution du Logiciel. N<>anmoins, si de telles actions sont
exerc<EFBFBD>es contre le Licenci<63>, le Conc<6E>dant lui apportera son aide
technique et juridique pour sa d<>fense. Cette aide technique et
juridique est d<>termin<69>e au cas par cas entre le Conc<6E>dant concern<72> et
le Licenci<63> dans le cadre d'un protocole d'accord. Le Conc<6E>dant d<>gage
toute responsabilit<69> quant <20> l'utilisation de la d<>nomination du
Logiciel par le Licenci<63>. Aucune garantie n'est apport<72>e quant <20>
l'existence de droits ant<6E>rieurs sur le nom du Logiciel et sur
l'existence d'une marque.
Article 10 - RESILIATION
10.1 En cas de manquement par le Licenci<63> aux obligations mises <20> sa
charge par le Contrat, le Conc<6E>dant pourra r<>silier de plein droit le
Contrat trente (30) jours apr<70>s notification adress<73>e au Licenci<63> et
rest<EFBFBD>e sans effet.
10.2 Le Licenci<63> dont le Contrat est r<>sili<6C> n'est plus autoris<69> <20>
utiliser, modifier ou distribuer le Logiciel. Cependant, toutes les
licences qu'il aura conc<6E>d<EFBFBD>es ant<6E>rieurement <20> la r<>siliation du Contrat
resteront valides sous r<>serve qu'elles aient <20>t<EFBFBD> effectu<74>es en
conformit<EFBFBD> avec le Contrat.
Article 11 - DISPOSITIONS DIVERSES
11.1 CAUSE EXTERIEURE
Aucune des Parties ne sera responsable d'un retard ou d'une d<>faillance
d'ex<65>cution du Contrat qui serait d<> <20> un cas de force majeure, un cas
fortuit ou une cause ext<78>rieure, telle que, notamment, le mauvais
fonctionnement ou les interruptions du r<>seau <20>lectrique ou de
t<EFBFBD>l<EFBFBD>communication, la paralysie du r<>seau li<6C>e <20> une attaque
informatique, l'intervention des autorit<69>s gouvernementales, les
catastrophes naturelles, les d<>g<EFBFBD>ts des eaux, les tremblements de terre,
le feu, les explosions, les gr<67>ves et les conflits sociaux, l'<27>tat de
guerre...
11.2 Le fait, par l'une ou l'autre des Parties, d'omettre en une ou
plusieurs occasions de se pr<70>valoir d'une ou plusieurs dispositions du
Contrat, ne pourra en aucun cas impliquer renonciation par la Partie
int<EFBFBD>ress<EFBFBD>e <20> s'en pr<70>valoir ult<6C>rieurement.
11.3 Le Contrat annule et remplace toute convention ant<6E>rieure, <20>crite
ou orale, entre les Parties sur le m<>me objet et constitue l'accord
entier entre les Parties sur cet objet. Aucune addition ou modification
aux termes du Contrat n'aura d'effet <20> l'<27>gard des Parties <20> moins
d'<27>tre faite par <20>crit et sign<67>e par leurs repr<70>sentants d<>ment habilit<69>s.
11.4 Dans l'hypoth<74>se o<> une ou plusieurs des dispositions du Contrat
s'av<61>rerait contraire <20> une loi ou <20> un texte applicable, existants ou
futurs, cette loi ou ce texte pr<70>vaudrait, et les Parties feraient les
amendements n<>cessaires pour se conformer <20> cette loi ou <20> ce texte.
Toutes les autres dispositions resteront en vigueur. De m<>me, la
nullit<EFBFBD>, pour quelque raison que ce soit, d'une des dispositions du
Contrat ne saurait entra<72>ner la nullit<69> de l'ensemble du Contrat.
11.5 LANGUE
Le Contrat est r<>dig<69> en langue fran<61>aise et en langue anglaise, ces
deux versions faisant <20>galement foi.
Article 12 - NOUVELLES VERSIONS DU CONTRAT
12.1 Toute personne est autoris<69>e <20> copier et distribuer des copies de
ce Contrat.
12.2 Afin d'en pr<70>server la coh<6F>rence, le texte du Contrat est prot<6F>g<EFBFBD>
et ne peut <20>tre modifi<66> que par les auteurs de la licence, lesquels se
r<EFBFBD>servent le droit de publier p<>riodiquement des mises <20> jour ou de
nouvelles versions du Contrat, qui poss<73>deront chacune un num<75>ro
distinct. Ces versions ult<6C>rieures seront susceptibles de prendre en
compte de nouvelles probl<62>matiques rencontr<74>es par les logiciels libres.
12.3 Tout Logiciel diffus<75> sous une version donn<6E>e du Contrat ne pourra
faire l'objet d'une diffusion ult<6C>rieure que sous la m<>me version du
Contrat ou une version post<73>rieure, sous r<>serve des dispositions de
l'article 5.3.4.
Article 13 - LOI APPLICABLE ET COMPETENCE TERRITORIALE
13.1 Le Contrat est r<>gi par la loi fran<61>aise. Les Parties conviennent
de tenter de r<>gler <20> l'amiable les diff<66>rends ou litiges qui
viendraient <20> se produire par suite ou <20> l'occasion du Contrat.
13.2 A d<>faut d'accord amiable dans un d<>lai de deux (2) mois <20> compter
de leur survenance et sauf situation relevant d'une proc<6F>dure d'urgence,
les diff<66>rends ou litiges seront port<72>s par la Partie la plus diligente
devant les Tribunaux comp<6D>tents de Paris.
Version 2.0 du 2006-09-05.

View File

@ -5,6 +5,7 @@ from ..utils cimport str2bytes
cdef extern from "stdio.h":
struct FILE
int fprintf(FILE *stream, char *format, ...)
int fputs(char *string, FILE *stream)
FILE* stderr
ctypedef unsigned int off_t "unsigned long long"

View File

@ -126,7 +126,7 @@ cdef class ProgressBar:
if twentyth != self.lastlog:
if self.ontty:
<void>fprintf(stderr,b'\n')
<void>fputs(b'\n',stderr)
self.logger.info('%s %5.1f %% remain : %02d:%02d:%02d' % (
bytes2str(self.head),

View File

@ -35,7 +35,6 @@ def addOptions(parser):
type=str,
help="Name of the default DMS for reading and writing data")
group.add_argument('--destination-view','-v',
action="store", dest="import:destview",
metavar='<VIEW NAME>',
@ -96,12 +95,14 @@ def run(config):
inputs = uopen(config['import']['filename'])
get_quality = False
if config['import']['seqinformat']=='fasta':
iseq = fastaIterator(inputs)
view_type="NUC_SEQS_VIEW"
elif config['import']['seqinformat']=='fastq':
iseq = fastqIterator(inputs)
view_type="NUC_SEQS_VIEW"
get_quality = True
else:
raise RuntimeError('No file format specified')
@ -120,13 +121,15 @@ def run(config):
view[i].set_id(seq['id'])
view[i].set_definition(seq['definition'])
view[i].set_sequence(seq['sequence'])
if get_quality :
view[i].set_quality(seq['quality'])
for tag in seq['tags'] :
#print(tag, seq['tags'][tag])
#if seq['tags'][tag] not in NA_list :
view[i][tag] = seq['tags'][tag]
i+=1
#print(view)
#print(i)
print(view.__repr__())
view.save_and_close()

View File

@ -9,6 +9,7 @@ cdef class MagicKeyFile:
cdef int pos
cpdef bytes read(self,int size=?)
cpdef bytes read1(self,int size=?)
cpdef int tell(self)

View File

@ -54,14 +54,17 @@ cdef class MagicKeyFile:
r = self.binary.read(size)
return r
cpdef bytes read1(self,int size=-1):
return self.read(size)
cpdef int tell(self):
cdef int p
if self.pos < self.keylength:
p = self.pos
else:
p = self.tell()
p = self.binary.tell()
return p

View File

View File

@ -0,0 +1,10 @@
from ..utils cimport bytes2str
from .header cimport HeaderFormat
from cython.view cimport array as cvarray
cdef class FastaFormat:
cdef HeaderFormat headerFormater
cdef size_t sequenceBufferLength
cdef char* sequenceBuffer

View File

@ -0,0 +1,32 @@
cimport cython
from libc.stdlib cimport malloc, free, realloc
from libc.string cimport strncpy
cdef class FastaFormat:
def __init__(self, list tags=[], bint printNAKeys=False):
self.headerFormater = HeaderFormat(True,
tags,
printNAKeys)
@cython.boundscheck(False)
def __call__(self, dict data):
cdef bytes brawseq = data['sequence']
cdef size_t lseq = len(brawseq)
cdef size_t k=0
cdef list lines = []
for k in range(0,lseq,60):
lines.append(brawseq[k:(k+60)])
brawseq = b'\n'.join(lines)
return "%s\n%s" % (self.headerFormater(data),bytes2str(brawseq))

View File

@ -0,0 +1,7 @@
cdef class HeaderFormat:
cdef str start
cdef set tags
cdef bint printNaKeys
cdef size_t headerBufferLength

View File

@ -0,0 +1,60 @@
cdef class HeaderFormat:
def __init__(self, bint fastaHeader=True, list tags=[], bint printNAKeys=False):
'''
@param fastaHeader:
@type fastaHeader: `bool`
@param tags:
@type tags: `list` of `bytes`
@param printNAKeys:
@type printNAKeys: `bool`
'''
self.tags = set(tags)
self.printNaKeys = printNAKeys
if fastaHeader:
self.start=">"
else:
self.start="@"
self.headerBufferLength = 1000
#self.headerBuffer = []
def __call__(self, dict data):
cdef str header
cdef dict tags = data['tags']
cdef set ktags
cdef list lines = [""]
cdef str tagline
if self.tags is not None and self.tags:
ktags = self.tags
else:
ktags = set(tags.keys())
for k in ktags:
if k in tags:
value = tags[k]
if value is not None or self.printNaKeys:
lines.append("%s=%s;" % (k,tags[k]))
if len(lines) > 1:
tagline=" ".join(lines)
else:
tagline=""
if data['definition'] is not None:
header = "%s%s%s %s" % (self.start,data['id'],
tagline,
data['definition'])
else:
header = "%s%s%s" % (self.start,data['id'],
tagline)
return header

View File

@ -10,6 +10,8 @@
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
@ -21,8 +23,22 @@
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
@ -35,17 +51,9 @@
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h

View File

@ -6,11 +6,12 @@ from .capi.obiview cimport Obiview_p
from .capi.obitypes cimport obiversion_t, OBIType_t, index_t
from ._obitaxo cimport OBI_Taxonomy
cdef class OBIDMS_column:
cdef OBIDMS_column_p* pointer
cdef OBIDMS dms
cdef Obiview_p view
cdef OBIView view
cdef str data_type
cdef str dms_name
cdef str column_name
@ -69,6 +70,7 @@ cdef class OBIView_NUC_SEQS(OBIView):
cdef OBIDMS_column ids
cdef OBIDMS_column sequences
cdef OBIDMS_column definitions
cdef OBIDMS_column qualities
cpdef delete_column(self, str column_name)
@ -89,5 +91,5 @@ cdef class OBIDMS:
cpdef OBIView open_view(self, str view_name)
cpdef OBIView new_view(self, str view_name, object view_to_clone=*, list line_selection=*, str view_type=*, str comments=*)
cpdef dict read_view_infos(self, str view_name)
cpdef dict read_views(self)
# cpdef dict read_views(self) TODO

View File

@ -17,6 +17,7 @@ from .capi.obitypes cimport const_char_p, \
OBI_FLOAT, \
OBI_BOOL, \
OBI_CHAR, \
OBI_QUAL, \
OBI_STR, \
OBI_SEQ, \
name_data_type, \
@ -43,6 +44,9 @@ from ._obidmscolumn_bool cimport OBIDMS_column_bool, \
from ._obidmscolumn_char cimport OBIDMS_column_char, \
OBIDMS_column_multi_elts_char
from ._obidmscolumn_qual cimport OBIDMS_column_qual, \
OBIDMS_column_multi_elts_qual
from ._obidmscolumn_str cimport OBIDMS_column_str, \
OBIDMS_column_multi_elts_str
@ -50,19 +54,19 @@ from ._obidmscolumn_seq cimport OBIDMS_column_seq, \
OBIDMS_column_multi_elts_seq
from .capi.obiview cimport Obiview_p, \
Obiviews_infos_all_p, \
Obiview_infos_p, \
Column_reference_p, \
obi_new_view_nuc_seqs, \
obi_new_view, \
obi_new_view_cloned_from_name, \
obi_new_view_nuc_seqs_cloned_from_name, \
obi_view_map_file, \
obi_view_unmap_file, \
obi_open_view, \
obi_read_view_infos, \
obi_close_view_infos, \
obi_view_delete_column, \
obi_view_add_column, \
obi_view_get_column, \
obi_view_get_column, \
obi_view_get_pointer_on_column_in_view, \
obi_select_line, \
obi_select_lines, \
@ -70,7 +74,8 @@ from .capi.obiview cimport Obiview_p, \
VIEW_TYPE_NUC_SEQS, \
NUC_SEQUENCE_COLUMN, \
ID_COLUMN, \
DEFINITION_COLUMN
DEFINITION_COLUMN, \
QUALITY_COLUMN
from libc.stdlib cimport malloc
from cpython.pycapsule cimport PyCapsule_New, PyCapsule_GetPointer
@ -90,7 +95,7 @@ cdef class OBIDMS_column :
# Fill structure
self.pointer = column_pp
self.dms = view.dms
self.view = view.pointer
self.view = view
self.data_type = bytes2str(name_data_type((column_p.header).returned_data_type))
self.column_name = bytes2str((column_p.header).name)
self.nb_elements_per_line = (column_p.header).nb_elements_per_line
@ -120,7 +125,7 @@ cdef class OBIDMS_column :
yield self.get_line(line_nb)
cpdef update_pointer(self):
self.pointer = <OBIDMS_column_p*> obi_view_get_pointer_on_column_in_view(self.view, str2bytes(self.column_name))
self.pointer = <OBIDMS_column_p*> obi_view_get_pointer_on_column_in_view(self.view.pointer, str2bytes(self.column_name))
cpdef list get_elements_names(self):
return self.elements_names
@ -186,6 +191,11 @@ cdef class OBIDMS_column :
subclass = OBIDMS_column_char
else :
subclass = OBIDMS_column_multi_elts_char
elif col_type == OBI_QUAL :
if col_one_element_per_line :
subclass = OBIDMS_column_qual
else :
subclass = OBIDMS_column_multi_elts_qual
elif col_type == OBI_STR :
if col_one_element_per_line :
subclass = OBIDMS_column_str
@ -239,15 +249,15 @@ cdef class OBIDMS_column_line :
cdef class OBIView :
def __init__(self, OBIDMS dms, str view_name, bint new=False, object view_to_clone=None, list line_selection=None, str comments=""):
cdef Obiview_p view = NULL
cdef int i
cdef list col_list
cdef str col_name
cdef OBIDMS_column column
cdef OBIDMS_column_p column_p
cdef Obiview_p view = NULL
cdef int i
cdef list col_list
cdef str col_name
cdef OBIDMS_column column
cdef OBIDMS_column_p column_p
cdef OBIDMS_column_header_p header
cdef index_t* line_selection_p
cdef index_t* line_selection_p
self.dms = dms
@ -280,11 +290,11 @@ cdef class OBIView :
raise Exception("Error creating/opening a view")
self.pointer = view
self.name = bytes2str(view.name)
self.name = bytes2str(view.infos.name)
# Go through columns to build list of corresponding python instances
self.columns = {}
for i in range(view.column_count) :
for i in range(view.infos.column_count) :
column_p = <OBIDMS_column_p> (view.columns)[i]
header = (column_p).header
col_name = bytes2str(header.name)
@ -293,13 +303,8 @@ cdef class OBIView :
def __repr__(self) :
cdef str s
cdef OBIDMS_column column
cdef OBIDMS_column_p column_p
s = self.name
s = s + ", " + self.comments + ", " + str(self.pointer.line_count) + " lines\n"
s = str(self.name) + ", " + str(self.comments) + ", " + str(self.pointer.infos.line_count) + " lines\n"
for column_name in self.columns :
s = s + self.columns[column_name].__repr__() + '\n'
return s
@ -308,7 +313,7 @@ cdef class OBIView :
cpdef delete_column(self, str column_name) :
cdef int i
cdef Obiview_p view
cdef Obiview_p view_p
cdef OBIDMS_column column
cdef OBIDMS_column_p column_p
cdef OBIDMS_column_header_p header
@ -316,7 +321,7 @@ cdef class OBIView :
view = self.pointer
if obi_view_delete_column(view, str2bytes(column_name)) < 0 :
if obi_view_delete_column(view_p, str2bytes(column_name)) < 0 :
raise Exception("Problem deleting a column from a view")
# Update the dictionaries of column pointers and column objects, and update pointers in column objects (make function?):
@ -324,7 +329,7 @@ cdef class OBIView :
for column_n in self.columns :
(self.columns[column_n]).update_pointer()
cpdef add_column(self,
str column_name,
@ -341,7 +346,6 @@ cdef class OBIView :
cdef bytes column_name_b
cdef bytes elements_names_b
cdef object subclass
cdef OBIDMS_column_p* column_pp
cdef OBIDMS_column_p column_p
column_name_b = str2bytes(column_name)
@ -360,6 +364,8 @@ cdef class OBIView :
data_type = OBI_BOOL
elif type == 'OBI_CHAR' :
data_type = OBI_CHAR
elif type == 'OBI_QUAL' :
data_type = OBI_QUAL
elif type == 'OBI_STR' :
data_type = OBI_STR
elif type == 'OBI_SEQ' :
@ -374,10 +380,9 @@ cdef class OBIView :
raise Exception("Problem adding a column in a view")
# Get the column pointer
column_pp = obi_view_get_pointer_on_column_in_view(self.pointer, column_name_b)
column_p = obi_view_get_column(self.pointer, column_name_b)
# Open and store the subclass
column_p = column_pp[0] # TODO ugly cython dereferencing
subclass = OBIDMS_column.get_subclass_type(column_p)
(self.columns)[column_name] = subclass(self, column_name)
@ -396,7 +401,7 @@ cdef class OBIView :
cdef OBIView_line line # TODO Check that this works for NUC SEQ views
# Yield each line
lines_used = (self.pointer).line_count
lines_used = self.pointer.infos.line_count
for line_nb in range(lines_used) :
line = self[line_nb]
@ -480,12 +485,12 @@ cdef class OBIView_NUC_SEQS(OBIView):
raise Exception("Error creating/opening view")
self.pointer = view
self.name = bytes2str(view.name)
self.comments = bytes2str(view.comments)
self.name = bytes2str(view.infos.name)
self.comments = bytes2str(view.infos.comments)
# Go through columns to build list of corresponding python instances
self.columns = {}
for i in range(view.column_count) :
for i in range(view.infos.column_count) :
column_p = <OBIDMS_column_p> (view.columns)[i]
header = (column_p).header
col_name = bytes2str(header.name)
@ -495,31 +500,7 @@ cdef class OBIView_NUC_SEQS(OBIView):
self.ids = self.columns[bytes2str(ID_COLUMN)]
self.sequences = self.columns[bytes2str(NUC_SEQUENCE_COLUMN)]
self.definitions = self.columns[bytes2str(DEFINITION_COLUMN)]
cpdef delete_column(self, str column_name) :
cdef int i
cdef Obiview_p view
cdef OBIDMS_column column
cdef OBIDMS_column_p column_p
cdef OBIDMS_column_header_p header
cdef str column_n
if ((column_name == bytes2str(ID_COLUMN)) or (column_name == bytes2str(NUC_SEQUENCE_COLUMN)) or (column_name == bytes2str(DEFINITION_COLUMN))) :
raise Exception("Can't delete an obligatory column from a NUC_SEQS view")
view = self.pointer
if obi_view_delete_column(view, str2bytes(column_name)) < 0 :
raise Exception("Problem deleting a column from a view")
# Remove instance from the dictionary
(self.columns).pop(column_name)
for column_n in self.columns :
(self.columns[column_n]).update_pointer()
self.qualities = self.columns[bytes2str(QUALITY_COLUMN)]
def __getitem__(self, object item) :
if type(item) == str :
@ -527,7 +508,6 @@ cdef class OBIView_NUC_SEQS(OBIView):
elif type(item) == int :
return OBI_Nuc_Seq_Stored(self, item)
def __setitem__(self, index_t line_idx, OBI_Nuc_Seq sequence_obj) :
for key in sequence_obj :
self[line_idx][key] = sequence_obj[key]
@ -546,6 +526,7 @@ cdef class OBIView_line :
def __setitem__(self, str column_name, object value):
# TODO detect multiple elements (dict type)? put somewhere else? but more risky (in get)
# TODO OBI_QUAL ?
cdef type value_type
cdef str value_obitype
if column_name not in self.view :
@ -621,7 +602,7 @@ cdef class OBIDMS :
view_class = OBIView_NUC_SEQS
else :
view_class = OBIView
return view_class(self, view_name)
@ -639,54 +620,84 @@ cdef class OBIDMS :
cpdef dict read_view_infos(self, str view_name) :
all_views = self.read_views()
return all_views[view_name]
cdef Obiview_infos_p view_infos_p
cdef dict view_infos_d
cdef Column_reference_p column_refs
cdef int i, j
cdef str column_name
view_infos_p = obi_view_map_file(self.pointer, str2bytes(view_name))
view_infos_d = {}
view_infos_d["name"] = bytes2str(view_infos_p.name)
view_infos_d["comments"] = bytes2str(view_infos_p.comments)
view_infos_d["view_type"] = bytes2str(view_infos_p.view_type)
view_infos_d["column_count"] = <int> view_infos_p.column_count
view_infos_d["line_count"] = <int> view_infos_p.line_count
view_infos_d["created_from"] = bytes2str(view_infos_p.created_from)
view_infos_d["creation_date"] = bytes2str(obi_format_date(view_infos_p.creation_date))
if (view_infos_p.all_lines) :
view_infos_d["line_selection"] = None
else :
view_infos_d["line_selection"] = {}
view_infos_d["line_selection"]["column_name"] = bytes2str((view_infos_p.line_selection).column_name)
view_infos_d["line_selection"]["version"] = <int> (view_infos_p.line_selection).version
view_infos_d["column_references"] = {}
column_refs = view_infos_p.column_references
for j in range(view_infos_d["column_count"]) :
column_name = bytes2str((column_refs[j]).column_name)
view_infos_d["column_references"][column_name] = {}
view_infos_d["column_references"][column_name]["version"] = column_refs[j].version
obi_view_unmap_file(self.pointer, view_infos_p)
return view_infos_d
cpdef dict read_views(self) : # TODO function that prints the dic nicely and function that prints 1 view nicely. Add column type in col ref
cdef Obiviews_infos_all_p all_views_p
cdef Obiview_infos_p view_p
cdef Column_reference_p column_refs
cdef int nb_views
cdef int i, j
cdef str view_name
cdef str column_name
cdef dict views
cdef bytes name_b
views = {}
all_views_p = obi_read_view_infos(self.pointer)
if all_views_p == NULL :
raise Exception("No views to read")
nb_views = <int> (all_views_p.header).view_count
for i in range(nb_views) :
view_p = (<Obiview_infos_p> (all_views_p.view_infos)) + i
view_name = bytes2str(view_p.name)
views[view_name] = {}
views[view_name]["comments"] = bytes2str(view_p.comments)
views[view_name]["view_type"] = bytes2str(view_p.view_type)
views[view_name]["column_count"] = <int> view_p.column_count
views[view_name]["line_count"] = <int> view_p.line_count
views[view_name]["view_number"] = <int> view_p.view_number
views[view_name]["created_from"] = bytes2str(view_p.created_from)
views[view_name]["creation_date"] = bytes2str(obi_format_date(view_p.creation_date))
if (view_p.all_lines) :
views[view_name]["line_selection"] = None
else :
views[view_name]["line_selection"] = {}
views[view_name]["line_selection"]["column_name"] = bytes2str((view_p.line_selection).column_name)
views[view_name]["line_selection"]["version"] = <int> (view_p.line_selection).version
views[view_name]["column_references"] = {}
column_refs = view_p.column_references
for j in range(views[view_name]["column_count"]) :
column_name = bytes2str((column_refs[j]).column_name)
views[view_name]["column_references"][column_name] = {}
views[view_name]["column_references"][column_name]["version"] = column_refs[j].version
obi_close_view_infos(all_views_p);
return views
# cpdef dict read_views(self) : # TODO function that prints the dic nicely and function that prints 1 view nicely. Add column type in col ref
#
# cdef Obiviews_infos_all_p all_views_p
# cdef Obiview_infos_p view_p
# cdef Column_reference_p column_refs
# cdef int nb_views
# cdef int i, j
# cdef str view_name
# cdef str column_name
# cdef dict views
# cdef bytes name_b
#
# views = {}
# all_views_p = obi_read_view_infos(self.pointer)
# if all_views_p == NULL :
# raise Exception("No views to read")
# nb_views = <int> (all_views_p.header).view_count
# for i in range(nb_views) :
# view_p = (<Obiview_infos_p> (all_views_p.view_infos)) + i
# view_name = bytes2str(view_p.name)
# views[view_name] = {}
# views[view_name]["comments"] = bytes2str(view_p.comments)
# views[view_name]["view_type"] = bytes2str(view_p.view_type)
# views[view_name]["column_count"] = <int> view_p.column_count
# views[view_name]["line_count"] = <int> view_p.line_count
# views[view_name]["view_number"] = <int> view_p.view_number
# views[view_name]["created_from"] = bytes2str(view_p.created_from)
# views[view_name]["creation_date"] = bytes2str(obi_format_date(view_p.creation_date))
# if (view_p.all_lines) :
# views[view_name]["line_selection"] = None
# else :
# views[view_name]["line_selection"] = {}
# views[view_name]["line_selection"]["column_name"] = bytes2str((view_p.line_selection).column_name)
# views[view_name]["line_selection"]["version"] = <int> (view_p.line_selection).version
# views[view_name]["column_references"] = {}
# column_refs = view_p.column_references
# for j in range(views[view_name]["column_count"]) :
# column_name = bytes2str((column_refs[j]).column_name)
# views[view_name]["column_references"][column_name] = {}
# views[view_name]["column_references"][column_name]["version"] = column_refs[j].version
#
# obi_close_view_infos(all_views_p);
#
# return views

View File

@ -10,6 +10,8 @@
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
@ -21,8 +23,22 @@
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
@ -35,17 +51,9 @@
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h

View File

@ -17,7 +17,7 @@ cdef class OBIDMS_column_bool(OBIDMS_column):
cpdef object get_line(self, index_t line_nb):
cdef obibool_t value
cdef object result
value = obi_column_get_obibool_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0)
value = obi_column_get_obibool_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIBool_NA :
@ -29,7 +29,7 @@ cdef class OBIDMS_column_bool(OBIDMS_column):
cpdef set_line(self, index_t line_nb, object value):
if value is None :
value = OBIBool_NA
if obi_column_set_obibool_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0, <obibool_t> value) < 0:
if obi_column_set_obibool_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, <obibool_t> value) < 0:
raise Exception("Problem setting a value in a column")
@ -38,7 +38,7 @@ cdef class OBIDMS_column_multi_elts_bool(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name):
cdef obibool_t value
cdef object result
value = obi_column_get_obibool_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name))
value = obi_column_get_obibool_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
if obi_errno > 0 :
raise IndexError(line_nb, element_name)
if value == OBIBool_NA :
@ -56,7 +56,7 @@ cdef class OBIDMS_column_multi_elts_bool(OBIDMS_column_multi_elts):
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_column_get_obibool_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, i)
value = obi_column_get_obibool_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIBool_NA :
@ -73,5 +73,5 @@ cdef class OBIDMS_column_multi_elts_bool(OBIDMS_column_multi_elts):
cpdef set_item(self, index_t line_nb, str element_name, object value):
if value is None :
value = OBIBool_NA
if obi_column_set_obibool_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name), <obibool_t> value) < 0:
if obi_column_set_obibool_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), <obibool_t> value) < 0:
raise Exception("Problem setting a value in a column")

View File

@ -10,6 +10,8 @@
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
@ -21,8 +23,22 @@
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
@ -35,17 +51,9 @@
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h

View File

@ -15,7 +15,7 @@ cdef class OBIDMS_column_char(OBIDMS_column):
cpdef object get_line(self, index_t line_nb):
cdef obichar_t value
cdef object result
value = obi_column_get_obichar_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0)
value = obi_column_get_obichar_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIChar_NA :
@ -27,7 +27,7 @@ cdef class OBIDMS_column_char(OBIDMS_column):
cpdef set_line(self, index_t line_nb, object value):
if value is None :
value = OBIChar_NA
if obi_column_set_obichar_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0, str2bytes(value)[0]) < 0:
if obi_column_set_obichar_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, str2bytes(value)[0]) < 0:
raise Exception("Problem setting a value in a column")
@ -36,7 +36,7 @@ cdef class OBIDMS_column_multi_elts_char(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name):
cdef obichar_t value
cdef object result
value = obi_column_get_obichar_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name))
value = obi_column_get_obichar_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
if obi_errno > 0 :
raise IndexError(line_nb, element_name)
if value == OBIChar_NA :
@ -54,7 +54,7 @@ cdef class OBIDMS_column_multi_elts_char(OBIDMS_column_multi_elts):
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_column_get_obichar_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, i)
value = obi_column_get_obichar_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIChar_NA :
@ -71,6 +71,6 @@ cdef class OBIDMS_column_multi_elts_char(OBIDMS_column_multi_elts):
cpdef set_item(self, index_t line_nb, str element_name, object value):
if value is None :
value = OBIChar_NA
if obi_column_set_obichar_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name), str2bytes(value)[0]) < 0:
if obi_column_set_obichar_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), str2bytes(value)[0]) < 0:
raise Exception("Problem setting a value in a column")

View File

@ -10,6 +10,8 @@
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
@ -21,8 +23,22 @@
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
@ -35,17 +51,9 @@
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h

View File

@ -15,7 +15,7 @@ cdef class OBIDMS_column_float(OBIDMS_column):
cpdef object get_line(self, index_t line_nb):
cdef obifloat_t value
cdef object result
value = obi_column_get_obifloat_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0)
value = obi_column_get_obifloat_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIFloat_NA :
@ -27,7 +27,7 @@ cdef class OBIDMS_column_float(OBIDMS_column):
cpdef set_line(self, index_t line_nb, object value):
if value is None :
value = OBIFloat_NA
if obi_column_set_obifloat_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0, <obifloat_t> value) < 0:
if obi_column_set_obifloat_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, <obifloat_t> value) < 0:
raise Exception("Problem setting a value in a column")
@ -36,7 +36,7 @@ cdef class OBIDMS_column_multi_elts_float(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name):
cdef obifloat_t value
cdef object result
value = obi_column_get_obifloat_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name))
value = obi_column_get_obifloat_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
if obi_errno > 0 :
raise IndexError(line_nb, element_name)
if value == OBIFloat_NA :
@ -54,7 +54,7 @@ cdef class OBIDMS_column_multi_elts_float(OBIDMS_column_multi_elts):
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_column_get_obifloat_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, i)
value = obi_column_get_obifloat_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIFloat_NA :
@ -71,6 +71,6 @@ cdef class OBIDMS_column_multi_elts_float(OBIDMS_column_multi_elts):
cpdef set_item(self, index_t line_nb, str element_name, object value):
if value is None :
value = OBIFloat_NA
if obi_column_set_obifloat_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name), <obifloat_t> value) < 0:
if obi_column_set_obifloat_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), <obifloat_t> value) < 0:
raise Exception("Problem setting a value in a column")

View File

@ -10,6 +10,8 @@
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
@ -21,8 +23,22 @@
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
@ -35,17 +51,9 @@
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h

View File

@ -17,7 +17,7 @@ cdef class OBIDMS_column_int(OBIDMS_column):
cpdef object get_line(self, index_t line_nb):
cdef obiint_t value
cdef object result
value = obi_column_get_obiint_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0)
value = obi_column_get_obiint_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIInt_NA :
@ -29,7 +29,7 @@ cdef class OBIDMS_column_int(OBIDMS_column):
cpdef set_line(self, index_t line_nb, object value):
if value is None :
value = OBIInt_NA
if obi_column_set_obiint_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0, <obiint_t> value) < 0:
if obi_column_set_obiint_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, <obiint_t> value) < 0:
raise Exception("Problem setting a value in a column")
@ -38,7 +38,7 @@ cdef class OBIDMS_column_multi_elts_int(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name):
cdef obiint_t value
cdef object result
value = obi_column_get_obiint_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name))
value = obi_column_get_obiint_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
if obi_errno > 0 :
raise IndexError(line_nb, element_name)
if value == OBIInt_NA :
@ -56,7 +56,7 @@ cdef class OBIDMS_column_multi_elts_int(OBIDMS_column_multi_elts):
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_column_get_obiint_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, i)
value = obi_column_get_obiint_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIInt_NA :
@ -73,6 +73,6 @@ cdef class OBIDMS_column_multi_elts_int(OBIDMS_column_multi_elts):
cpdef set_item(self, index_t line_nb, str element_name, object value):
if value is None :
value = OBIInt_NA
if obi_column_set_obiint_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name), <obiint_t> value) < 0:
if obi_column_set_obiint_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), <obiint_t> value) < 0:
raise Exception("Problem setting a value in a column")

View File

@ -0,0 +1,59 @@
../../../src/bloom.h
../../../src/bloom.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.h
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/obidebug.h
../../../src/obidms_taxonomy.h
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/obilittlebigman.h
../../../src/obilittlebigman.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c

View File

@ -0,0 +1,20 @@
#cython: language_level=3
from .capi.obitypes cimport index_t
from ._obidms cimport OBIDMS_column , OBIDMS_column_multi_elts
cdef class OBIDMS_column_qual(OBIDMS_column):
cpdef object get_line(self, index_t line_nb)
cpdef object get_str_line(self, index_t line_nb)
cpdef set_line(self, index_t line_nb, object value)
cpdef set_str_line(self, index_t line_nb, object value)
cdef class OBIDMS_column_multi_elts_qual(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name)
cpdef object get_str_item(self, index_t line_nb, str element_name)
cpdef object get_line(self, index_t line_nb)
cpdef object get_str_line(self, index_t line_nb)
cpdef set_item(self, index_t line_nb, str element_name, object value)
cpdef set_str_item(self, index_t line_nb, str element_name, object value)

View File

@ -0,0 +1,184 @@
#cython: language_level=3
from .capi.obiview cimport obi_column_get_obiqual_char_with_elt_name_in_view, \
obi_column_get_obiqual_char_with_elt_idx_in_view, \
obi_column_set_obiqual_char_with_elt_name_in_view, \
obi_column_set_obiqual_char_with_elt_idx_in_view, \
obi_column_get_obiqual_int_with_elt_name_in_view, \
obi_column_get_obiqual_int_with_elt_idx_in_view, \
obi_column_set_obiqual_int_with_elt_name_in_view, \
obi_column_set_obiqual_int_with_elt_idx_in_view
from .capi.obierrno cimport obi_errno
from .capi.obitypes cimport OBIQual_char_NA, OBIQual_int_NA, const_char_p
from ._obidms cimport OBIView
from obitools3.utils cimport str2bytes, bytes2str
from libc.stdlib cimport free
from libc.string cimport strcmp
from libc.stdint cimport uint8_t
from libc.stdlib cimport malloc
cdef class OBIDMS_column_qual(OBIDMS_column):
cpdef object get_line(self, index_t line_nb):
cdef const uint8_t* value
cdef int value_length
cdef object result
cdef int i
value = obi_column_get_obiqual_int_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, &value_length)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIQual_int_NA :
result = None
else :
result = []
for i in range(value_length) :
result.append(<int>value[i])
return result
cpdef object get_str_line(self, index_t line_nb):
cdef char* value
cdef object result
cdef int i
value = obi_column_get_obiqual_char_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIQual_char_NA :
result = None
else :
result = bytes2str(value)
free(value)
return result
cpdef set_line(self, index_t line_nb, object value):
cdef uint8_t* value_b
cdef int value_length
if value is None :
if obi_column_set_obiqual_int_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, OBIQual_int_NA, 0) < 0:
raise Exception("Problem setting a value in a column")
else :
value_length = len(value)
value_b = <uint8_t*> malloc(value_length * sizeof(uint8_t))
for i in range(value_length) :
value_b[i] = <uint8_t>value[i]
if obi_column_set_obiqual_int_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, value_b, value_length) < 0:
raise Exception("Problem setting a value in a column")
free(value_b)
cpdef set_str_line(self, index_t line_nb, object value):
if value is None :
if obi_column_set_obiqual_char_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, OBIQual_char_NA) < 0:
raise Exception("Problem setting a value in a column")
else :
if obi_column_set_obiqual_char_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, str2bytes(value)) < 0:
raise Exception("Problem setting a value in a column")
cdef class OBIDMS_column_multi_elts_qual(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name):
cdef const uint8_t* value
cdef int value_length
cdef object result
cdef int i
value = obi_column_get_obiqual_int_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), &value_length)
if obi_errno > 0 :
raise IndexError(line_nb, element_name)
if value == OBIQual_int_NA :
result = None
else :
result = []
for i in range(value_length) :
result.append(<int>value[i])
return result
cpdef object get_str_item(self, index_t line_nb, str element_name):
cdef char* value
cdef object result
value = obi_column_get_obiqual_char_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
if obi_errno > 0 :
raise IndexError(line_nb, element_name)
if value == OBIQual_char_NA :
result = None
else :
result = bytes2str(value)
free(value)
return result
cpdef object get_line(self, index_t line_nb) :
cdef const uint8_t* value
cdef int value_length
cdef object value_in_result
cdef dict result
cdef index_t i
cdef int j
cdef bint all_NA
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_column_get_obiqual_int_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i, &value_length)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIQual_int_NA :
value_in_result = None
else :
value_in_result = []
for j in range(value_length) :
value_in_result.append(<int>value[j])
result[self.elements_names[i]] = value_in_result
if all_NA and (value_in_result is not None) :
all_NA = False
if all_NA :
result = None
return result
cpdef object get_str_line(self, index_t line_nb) :
cdef char* value
cdef object value_in_result
cdef dict result
cdef index_t i
cdef bint all_NA
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_column_get_obiqual_char_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIQual_char_NA :
value_in_result = None
else :
value_in_result = bytes2str(value)
free(value)
result[self.elements_names[i]] = value_in_result
if all_NA and (value_in_result is not None) :
all_NA = False
if all_NA :
result = None
return result
cpdef set_item(self, index_t line_nb, str element_name, object value):
cdef uint8_t* value_b
cdef int value_length
if value is None :
if obi_column_set_obiqual_int_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), OBIQual_int_NA, 0) < 0:
raise Exception("Problem setting a value in a column")
else :
value_length = len(value)
value_b = <uint8_t*> malloc(value_length * sizeof(uint8_t))
for i in range(value_length) :
value_b[i] = <uint8_t>value[i]
if obi_column_set_obiqual_int_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), value_b, value_length) < 0:
raise Exception("Problem setting a value in a column")
free(value_b)
cpdef set_str_item(self, index_t line_nb, str element_name, object value):
if value is None :
if obi_column_set_obiqual_char_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), OBIQual_char_NA) < 0:
raise Exception("Problem setting a value in a column")
else :
if obi_column_set_obiqual_char_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), str2bytes(value)) < 0:
raise Exception("Problem setting a value in a column")

View File

@ -10,6 +10,8 @@
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
@ -21,8 +23,22 @@
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
@ -35,17 +51,9 @@
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h

View File

@ -1,12 +1,24 @@
#cython: language_level=3
from .capi.obitypes cimport index_t
from ._obidms cimport OBIDMS_column, OBIDMS_column_multi_elts
from ._obidms cimport OBIView, OBIDMS_column, OBIDMS_column_multi_elts
cdef class OBIDMS_column_seq(OBIDMS_column):
cpdef object get_line(self, index_t line_nb)
cpdef set_line(self, index_t line_nb, object value)
# TO DISCUSS :
# I'am not sure that this method has to be declared here
# Alignment must be declared outside of the sequence object
cpdef align(self,
OBIView score_view,
OBIDMS_column score_column,
double threshold = *,
bint normalize = *,
int reference = *,
bint similarity_mode = *)
cdef class OBIDMS_column_multi_elts_seq(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name)

View File

@ -4,13 +4,15 @@ from .capi.obiview cimport obi_column_get_obiseq_with_elt_name_in_view, \
obi_column_get_obiseq_with_elt_idx_in_view, \
obi_column_set_obiseq_with_elt_name_in_view, \
obi_column_set_obiseq_with_elt_idx_in_view
from .capi.obialign cimport obi_align_one_column
from .capi.obierrno cimport obi_errno
from .capi.obitypes cimport OBISeq_NA, const_char_p
from ._obidms cimport OBIView
from obitools3.utils cimport str2bytes, bytes2str
from libc.stdlib cimport free
from libc.string cimport strcmp
cdef class OBIDMS_column_seq(OBIDMS_column):
@ -18,41 +20,67 @@ cdef class OBIDMS_column_seq(OBIDMS_column):
cpdef object get_line(self, index_t line_nb):
cdef char* value
cdef object result
value = obi_column_get_obiseq_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0)
value = obi_column_get_obiseq_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
if obi_errno > 0 :
raise IndexError(line_nb)
if strcmp(value, OBISeq_NA) == 0 :
if value == OBISeq_NA :
result = None
else :
result = bytes2str(value)
free(value)
try:
result = <bytes> value
finally:
free(value)
return result
cpdef set_line(self, index_t line_nb, object value):
cdef bytes value_b
if value is None :
value_b = OBISeq_NA
else :
elif isinstance(value, bytes) :
value_b = value
elif isinstance(value, str) :
value_b = str2bytes(value)
if obi_column_set_obiseq_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0, value_b) < 0:
raise Exception("Problem setting a value in a column")
else:
raise TypeError('Sequence value must be of type Bytes, Str or None')
if obi_column_set_obiseq_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, value_b) < 0:
raise Exception("Problem setting a value in a column")
else :
if obi_column_set_obiseq_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, str2bytes(value)) < 0:
raise Exception("Problem setting a value in a column")
# TODO choose alignment type (lcs or other) with supplementary argument
cpdef align(self,
OBIView score_view,
OBIDMS_column score_column,
double threshold = 0.0,
bint normalize = True,
int reference = 0, # TODO
bint similarity_mode = True):
if (obi_align_one_column(self.view.pointer, (self.pointer)[0], score_view.pointer, (score_column.pointer)[0], threshold, normalize, reference, similarity_mode) < 0) :
raise Exception("An error occurred while aligning sequences")
cdef class OBIDMS_column_multi_elts_seq(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name):
cdef char* value
cdef object result
value = obi_column_get_obiseq_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name))
value = obi_column_get_obiseq_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
if obi_errno > 0 :
raise IndexError(line_nb, element_name)
if strcmp(value, OBISeq_NA) == 0 :
if value == OBISeq_NA :
result = None
else :
result = bytes2str(value)
free(value)
try:
result = <bytes> value
finally:
free(value)
return result
cpdef object get_line(self, index_t line_nb) :
cdef char* value
cdef object value_in_result
@ -62,14 +90,16 @@ cdef class OBIDMS_column_multi_elts_seq(OBIDMS_column_multi_elts):
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_column_get_obiseq_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, i)
value = obi_column_get_obiseq_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
if obi_errno > 0 :
raise IndexError(line_nb)
if strcmp(value, OBISeq_NA) == 0 :
if value == OBISeq_NA :
value_in_result = None
else :
value_in_result = bytes2str(value)
free(value)
try:
value_in_result = <bytes> value
finally:
free(value)
result[self.elements_names[i]] = value_in_result
if all_NA and (value_in_result is not None) :
all_NA = False
@ -79,10 +109,19 @@ cdef class OBIDMS_column_multi_elts_seq(OBIDMS_column_multi_elts):
cpdef set_item(self, index_t line_nb, str element_name, object value):
cdef bytes value_b
if value is None :
value_b = OBISeq_NA
else :
elif isinstance(value, bytes) :
value_b = value
elif isinstance(value, str) :
value_b = str2bytes(value)
if obi_column_set_obiseq_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name), value_b) < 0:
else:
raise TypeError('Sequence value must be of type Bytes, Str or None')
if obi_column_set_obiseq_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), value_b) < 0:
raise Exception("Problem setting a value in a column")
# cpdef align(self, ): # TODO
# raise Exception("Columns with multiple sequences per line can't be aligned") # TODO discuss

View File

@ -10,6 +10,8 @@
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
@ -21,8 +23,22 @@
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
@ -35,17 +51,9 @@
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h

View File

@ -9,18 +9,16 @@ from .capi.obitypes cimport OBIStr_NA, const_char_p
from obitools3.utils cimport str2bytes, bytes2str
from libc.string cimport strcmp
cdef class OBIDMS_column_str(OBIDMS_column):
cpdef object get_line(self, index_t line_nb):
cdef const_char_p value
cdef object result
value = obi_column_get_obistr_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0)
value = obi_column_get_obistr_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
if obi_errno > 0 :
raise IndexError(line_nb)
if strcmp(value, OBIStr_NA) == 0 :
if value == OBIStr_NA :
result = None
else :
result = bytes2str(value)
@ -28,13 +26,12 @@ cdef class OBIDMS_column_str(OBIDMS_column):
return result
cpdef set_line(self, index_t line_nb, object value):
cdef bytes value_b
if value is None :
value_b = OBIStr_NA
if obi_column_set_obistr_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, OBIStr_NA) < 0:
raise Exception("Problem setting a value in a column")
else :
value_b = str2bytes(value)
if obi_column_set_obistr_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, 0, value_b) < 0:
raise Exception("Problem setting a value in a column")
if obi_column_set_obistr_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, str2bytes(value)) < 0:
raise Exception("Problem setting a value in a column")
cdef class OBIDMS_column_multi_elts_str(OBIDMS_column_multi_elts):
@ -42,10 +39,10 @@ cdef class OBIDMS_column_multi_elts_str(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name):
cdef const_char_p value
cdef object result
value = obi_column_get_obistr_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name))
value = obi_column_get_obistr_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
if obi_errno > 0 :
raise IndexError(line_nb, element_name)
if strcmp(value, OBIStr_NA) == 0 :
if value == OBIStr_NA :
result = None
else :
result = bytes2str(value)
@ -61,10 +58,10 @@ cdef class OBIDMS_column_multi_elts_str(OBIDMS_column_multi_elts):
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_column_get_obistr_with_elt_idx_in_view(self.view, (self.pointer)[0], line_nb, i)
value = obi_column_get_obistr_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
if obi_errno > 0 :
raise IndexError(line_nb)
if strcmp(value, OBIStr_NA) == 0 :
if value == OBIStr_NA :
value_in_result = None
else :
value_in_result = bytes2str(value)
@ -82,6 +79,6 @@ cdef class OBIDMS_column_multi_elts_str(OBIDMS_column_multi_elts):
value_b = OBIStr_NA
else :
value_b = str2bytes(value)
if obi_column_set_obistr_with_elt_name_in_view(self.view, (self.pointer)[0], line_nb, str2bytes(element_name), value_b) < 0:
if obi_column_set_obistr_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), value_b) < 0:
raise Exception("Problem setting a value in a column")

View File

@ -10,6 +10,8 @@
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
@ -21,8 +23,22 @@
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
@ -35,17 +51,9 @@
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h

View File

@ -4,27 +4,35 @@ from ._obidms cimport OBIView_line
cdef class OBI_Seq(dict) :
cdef str id
cdef str definition
cdef str sequence
cdef object id
cdef object definition
cdef object sequence
cpdef set_id(self, str id)
cpdef get_id(self)
cpdef set_definition(self, str definition)
cpdef get_definition(self)
cpdef get_sequence(self)
cpdef set_id(self, object id)
cpdef object get_id(self)
cpdef set_definition(self, object definition)
cpdef object get_definition(self)
cpdef object get_sequence(self)
cdef class OBI_Nuc_Seq(OBI_Seq) :
#cpdef str reverse_complement(self)
cpdef set_sequence(self, str sequence)
cdef object quality
#cpdef object reverse_complement(self)
cpdef set_sequence(self, object sequence)
cpdef set_quality(self, object quality)
cpdef object get_quality(self)
cdef class OBI_Nuc_Seq_Stored(OBIView_line) :
cpdef set_id(self, str id)
cpdef get_id(self)
cpdef set_definition(self, str definition)
cpdef get_definition(self)
cpdef set_sequence(self, str sequence)
cpdef get_sequence(self)
# cpdef str reverse_complement(self)
cpdef set_id(self, object id)
cpdef object get_id(self)
cpdef set_definition(self, object definition)
cpdef object get_definition(self)
cpdef set_sequence(self, object sequence)
cpdef object get_sequence(self)
cpdef set_quality(self, object quality)
cpdef object get_quality(self)
cpdef object get_str_quality(self)
# cpdef object reverse_complement(self)

View File

@ -4,30 +4,31 @@ from obitools3.utils cimport bytes2str, str2bytes
from .capi.obiview cimport NUC_SEQUENCE_COLUMN, \
ID_COLUMN, \
DEFINITION_COLUMN
DEFINITION_COLUMN, \
QUALITY_COLUMN
cdef class OBI_Seq(dict) :
def __init__(self, str id, str seq, str definition=None) :
def __init__(self, object id, object seq, object definition=None) :
self.set_id(id)
self.set_sequence(seq)
if definition is not None :
self.set_definition(definition)
cpdef set_id(self, str id) :
cpdef set_id(self, object id) :
self.id = id
self[bytes2str(ID_COLUMN)] = id
cpdef get_id(self) :
return self.id
cpdef set_definition(self, str definition) :
cpdef set_definition(self, object definition) :
self.definition = definition
self[bytes2str(DEFINITION_COLUMN)] = definition
cpdef get_definition(self) :
return self.definition
cpdef get_sequence(self) :
return self.sequence
@ -37,34 +38,55 @@ cdef class OBI_Seq(dict) :
cdef class OBI_Nuc_Seq(OBI_Seq) :
cpdef set_sequence(self, str sequence) :
cpdef set_sequence(self, object sequence) :
self.sequence = sequence
self[bytes2str(NUC_SEQUENCE_COLUMN)] = sequence
cpdef set_quality(self, object quality) :
self.quality = quality
self[bytes2str(QUALITY_COLUMN)] = quality
cpdef get_quality(self) :
return self.quality
# cpdef str reverse_complement(self) : TODO in C ?
# pass
cdef class OBI_Nuc_Seq_Stored(OBIView_line) :
cpdef set_id(self, str id) :
# TODO store the str version of column name macros
cpdef set_id(self, object id) :
self[bytes2str(ID_COLUMN)] = id
cpdef get_id(self) :
cpdef object get_id(self) :
return self[bytes2str(ID_COLUMN)]
cpdef set_definition(self, str definition) :
cpdef set_definition(self, object definition) :
self[bytes2str(DEFINITION_COLUMN)] = definition
cpdef get_definition(self) :
cpdef object get_definition(self) :
return self[bytes2str(DEFINITION_COLUMN)]
cpdef set_sequence(self, str sequence) :
cpdef set_sequence(self, object sequence) :
self[bytes2str(NUC_SEQUENCE_COLUMN)] = sequence
cpdef get_sequence(self) :
cpdef object get_sequence(self) :
return self[bytes2str(NUC_SEQUENCE_COLUMN)]
cpdef set_quality(self, object quality) :
if (type(quality) == list) or (quality is None) :
self[bytes2str(QUALITY_COLUMN)] = quality
else : # Quality is in str form
(((self.view).columns)[bytes2str(QUALITY_COLUMN)]).set_str_line(self.index, quality)
cpdef object get_quality(self) :
return self[bytes2str(QUALITY_COLUMN)]
cpdef object get_str_quality(self) :
return ((self.view).columns)[bytes2str(QUALITY_COLUMN)].get_str_line(self.index)
# def __str__(self) :
# return self[bytes2str(NUC_SEQUENCE_COLUMN)] # or not

View File

@ -10,6 +10,8 @@
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
@ -21,8 +23,22 @@
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
@ -35,17 +51,9 @@
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h

View File

@ -0,0 +1,10 @@
#cython: language_level=3
from ..capi.obiview cimport Obiview_p
from ..capi.obidmscolumn cimport OBIDMS_column_p
cdef extern from "obi_align.h" nogil:
int obi_align_one_column(Obiview_p seq_view, OBIDMS_column_p seq_column, Obiview_p score_view, OBIDMS_column_p score_column, double threshold, bint normalize, int reference, bint similarity_mode)

View File

@ -11,6 +11,8 @@ from ..capi.obitypes cimport const_char_p, \
index_t, \
time_t
from libc.stdint cimport uint8_t
cdef extern from "obidmscolumn.h" nogil:
@ -194,3 +196,46 @@ cdef extern from "obidmscolumn_seq.h" nogil:
index_t line_nb,
index_t element_idx)
cdef extern from "obidmscolumn_qual.h" nogil:
int obi_column_set_obiqual_char_with_elt_name(OBIDMS_column_p column,
index_t line_nb,
const_char_p element_name,
const_char_p value)
int obi_column_set_obiqual_char_with_elt_idx(OBIDMS_column_p column,
index_t line_nb,
index_t element_idx,
const_char_p value)
int obi_column_set_obiqual_int_with_elt_name(OBIDMS_column_p column,
index_t line_nb,
const_char_p element_name,
const uint8_t* value,
int value_length)
int obi_column_set_obiqual_int_with_elt_idx(OBIDMS_column_p column,
index_t line_nb,
index_t element_idx,
const uint8_t* value,
int value_length)
char* obi_column_get_obiqual_char_with_elt_name(OBIDMS_column_p column,
index_t line_nb,
const_char_p element_name)
char* obi_column_get_obiqual_char_with_elt_idx(OBIDMS_column_p column,
index_t line_nb,
index_t element_idx)
const uint8_t* obi_column_get_obiqual_int_with_elt_name(OBIDMS_column_p column,
index_t line_nb,
const_char_p element_name,
int* value_length)
const uint8_t* obi_column_get_obiqual_int_with_elt_idx(OBIDMS_column_p column,
index_t line_nb,
index_t element_idx,
int* value_length)

View File

@ -1,7 +1,8 @@
#cython: language_level=3
from libc.stdint cimport int32_t, int64_t
from libc.stdint cimport int32_t, int64_t, uint8_t
from posix.types cimport time_t
@ -21,6 +22,7 @@ cdef extern from "obitypes.h" nogil:
OBI_FLOAT,
OBI_BOOL,
OBI_CHAR,
OBI_QUAL,
OBI_STR,
OBI_SEQ,
OBI_IDX
@ -46,5 +48,8 @@ cdef extern from "obitypes.h" nogil:
extern obibool_t OBIBool_NA
extern const_char_p OBISeq_NA
extern const_char_p OBIStr_NA
extern const_char_p OBIQual_char_NA
extern uint8_t* OBIQual_int_NA
const_char_p name_data_type(int data_type)

View File

@ -12,6 +12,8 @@ from .obitypes cimport const_char_p, \
from ..capi.obidms cimport OBIDMS_p
from ..capi.obidmscolumn cimport OBIDMS_column_p
from libc.stdint cimport uint8_t
cdef extern from "obiview.h" nogil:
@ -19,21 +21,7 @@ cdef extern from "obiview.h" nogil:
extern const_char_p NUC_SEQUENCE_COLUMN
extern const_char_p ID_COLUMN
extern const_char_p DEFINITION_COLUMN
struct Obiview_t :
OBIDMS_p dms
const_char_p name
const_char_p created_from
const_char_p view_type
bint read_only
OBIDMS_column_p line_selection
OBIDMS_column_p new_line_selection
index_t line_count
int column_count
OBIDMS_column_p columns
const_char_p comments
ctypedef Obiview_t* Obiview_p
extern const_char_p QUALITY_COLUMN
struct Column_reference_t :
@ -44,7 +32,6 @@ cdef extern from "obiview.h" nogil:
struct Obiview_infos_t :
int view_number
time_t creation_date
const_char_p name
const_char_p created_from
@ -59,19 +46,15 @@ cdef extern from "obiview.h" nogil:
ctypedef Obiview_infos_t* Obiview_infos_p
struct Obiviews_header_t :
size_t header_size
size_t views_size
int view_count
ctypedef Obiviews_header_t* Obiviews_header_p
struct Obiview_t :
Obiview_infos_p infos
OBIDMS_p dms
bint read_only
OBIDMS_column_p line_selection
OBIDMS_column_p new_line_selection
OBIDMS_column_p columns
struct Obiviews_infos_all_t :
Obiviews_header_p header
Obiview_infos_p view_infos
ctypedef Obiviews_infos_all_t* Obiviews_infos_all_p
ctypedef Obiview_t* Obiview_p
Obiview_p obi_new_view_nuc_seqs(OBIDMS_p dms, const_char_p view_name, Obiview_p view_to_clone, index_t* line_selection, const_char_p comments)
@ -82,6 +65,10 @@ cdef extern from "obiview.h" nogil:
Obiview_p obi_new_view_nuc_seqs_cloned_from_name(OBIDMS_p dms, const_char_p view_name, const_char_p view_to_clone_name, index_t* line_selection, const_char_p comments)
Obiview_infos_p obi_view_map_file(OBIDMS_p dms, const char* view_name)
int obi_view_unmap_file(OBIDMS_p dms, Obiview_infos_p view_infos)
Obiview_p obi_open_view(OBIDMS_p dms, const_char_p view_name)
int obi_view_add_column(Obiview_p view,
@ -110,10 +97,6 @@ cdef extern from "obiview.h" nogil:
int obi_close_view(Obiview_p view)
int obi_save_and_close_view(Obiview_p view)
Obiviews_infos_all_p obi_read_view_infos(OBIDMS_p dms)
int obi_close_view_infos(Obiviews_infos_all_p views)
int obi_column_set_obiint_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
@ -203,6 +186,54 @@ cdef extern from "obiview.h" nogil:
index_t line_nb,
index_t element_idx)
int obi_column_set_obiqual_char_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
index_t line_nb,
index_t element_idx,
const char* value)
int obi_column_set_obiqual_int_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
index_t line_nb,
index_t element_idx,
const uint8_t* value,
int value_length)
char* obi_column_get_obiqual_char_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
index_t line_nb,
index_t element_idx)
const uint8_t* obi_column_get_obiqual_int_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
index_t line_nb,
index_t element_idx,
int* value_length)
int obi_column_set_obiqual_char_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
index_t line_nb,
const char* element_name,
const char* value)
int obi_column_set_obiqual_int_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
index_t line_nb,
const char* element_name,
const uint8_t* value,
int value_length)
char* obi_column_get_obiqual_char_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
index_t line_nb,
const char* element_name)
const uint8_t* obi_column_get_obiqual_int_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
index_t line_nb,
const char* element_name,
int* value_length)
int obi_column_set_obistr_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
index_t line_nb,

View File

@ -36,7 +36,7 @@ if __name__ == '__main__':
if l['score'] > 350 :
line_selec.append(i)
i+=1
new_v = d.new_view(args.new_view, view_to_clone=v, line_selection=line_selec, view_type="NUC_SEQS_VIEW", comments="obigrep "+args.view+" to "+args.new_view) #args.key+" "+str(args.comparison)+" "+str(args.value)+" "+)
print("\n")

View File

@ -1,5 +1,6 @@
#cython: language_level=3
from ..utils cimport str2bytes
from .header cimport parseHeader
from ..files.universalopener cimport uopen
from ..files.linebuffer cimport LineBuffer

View File

@ -6,12 +6,15 @@ Created on 30 mars 2016
@author: coissac
'''
def fastaIterator(lineiterator, int buffersize=100000000):
cdef LineBuffer lb
cdef str ident
cdef str definition
cdef dict tags
cdef list s
cdef bytes sequence
cdef bytes quality
if isinstance(lineiterator,(str,bytes)):
lineiterator=uopen(lineiterator)
@ -28,10 +31,15 @@ def fastaIterator(lineiterator, int buffersize=100000000):
ident,tags,definition = parseHeader(line)
s = []
line = next(i)
while line[0]!='>':
s.append(line[0:-1])
line = next(i)
sequence = "".join(s)
try:
while line[0]!='>':
s.append(str2bytes(line)[0:-1])
line = next(i)
except StopIteration:
pass
sequence = b"".join(s)
quality = None
yield { "id" : ident,
@ -42,5 +50,6 @@ def fastaIterator(lineiterator, int buffersize=100000000):
"annotation" : {}
}

View File

@ -1,5 +1,7 @@
#cython: language_level=3
from ..utils cimport str2bytes
from .header cimport parseHeader
from ..files.universalopener cimport uopen
from ..files.linebuffer cimport LineBuffer

View File

@ -6,13 +6,13 @@ Created on 30 mars 2016
@author: coissac
'''
def fastqIterator(lineiterator, int buffersize=100000000):
cdef LineBuffer lb
cdef str ident
cdef str definition
cdef dict tags
cdef bytes sequence
cdef bytes quality
if isinstance(lineiterator,(str,bytes)):
lineiterator=uopen(lineiterator)
@ -25,9 +25,9 @@ def fastqIterator(lineiterator, int buffersize=100000000):
i = iter(lb)
for line in i:
ident,tags,definition = parseHeader(line)
sequence = next(i)[0:-1]
sequence = str2bytes(next(i)[0:-1])
next(i)
quality = next(i)[0:-1]
quality = str2bytes(next(i)[0:-1])
yield { "id" : ident,
"definition" : definition,
@ -38,4 +38,3 @@ def fastqIterator(lineiterator, int buffersize=100000000):
}

View File

@ -30,20 +30,27 @@ __re_dict__ = re.compile("""^\{\ *
)
)*\ *\}$""", re.VERBOSE)
__re_val__ = re.compile("""(("[^"]*"|'[^']*') *: *([^,}]+|"[^"]*"|'[^']*') *[,}] *)""")
cdef object __etag__(str x):
cdef list elements
cdef tuple i
if __re_int__.match(x):
v=int(x)
elif __re_float__.match(x):
v=float(x)
elif __re_str__.match(x):
v=x[1:-1]
elif x=='None':
v=None
elif x=='False':
v=False
elif x=='True':
v=True
elif __re_dict__.match(x):
v=eval(x)
elements=__re_val__.findall(x)
v=dict([(i[1][1:-1],__etag__(i[2])) for i in elements])
else:
v=x
return v
@ -56,7 +63,7 @@ cpdef tuple parseHeader(str header):
cdef str second
m=header[1:-1].split(maxsplit=1)
ident=m[0]
if len(m)==1:
@ -75,4 +82,4 @@ cpdef tuple parseHeader(str header):
return ident,tags,definition

View File

@ -1,7 +1,5 @@
#cython: language_level=3
import sys
import io
cdef bytes str2bytes(str string):
"""

961
src/_sse.h Normal file
View File

@ -0,0 +1,961 @@
#ifndef _SSE_H_
#define _SSE_H_
#include <string.h>
#include <inttypes.h>
#ifdef __SSE2__
#include <xmmintrin.h>
#else
typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__));
#endif /* __SSE2__ */
#ifndef MAX
#define MAX(x,y) (((x)>(y)) ? (x):(y))
#define MIN(x,y) (((x)<(y)) ? (x):(y))
#endif
#define ALIGN __attribute__((aligned(16)))
typedef __m128i vUInt8;
typedef __m128i vInt8;
typedef __m128i vUInt16;
typedef __m128i vInt16;
typedef __m128i vUInt64;
typedef union
{
__m128i i;
int64_t s64[ 2];
int16_t s16[ 8];
int8_t s8 [16];
uint8_t u8 [16];
uint16_t u16[8 ];
uint32_t u32[4 ];
uint64_t u64[2 ];
} um128;
typedef union
{
vUInt8 m;
uint8_t c[16];
} uchar_v;
typedef union
{
vUInt16 m;
uint16_t c[8];
} ushort_v;
typedef union
{
vUInt64 m;
uint64_t c[2];
} uint64_v;
#ifdef __SSE2__
static inline int8_t _s2_extract_epi8(__m128i r, const int p)
{
#define ACTIONP(r,x) return _mm_extract_epi16(r,x) & 0xFF
#define ACTIONI(r,x) return _mm_extract_epi16(r,x) >> 8
switch (p) {
case 0: ACTIONP(r,0);
case 1: ACTIONI(r,0);
case 2: ACTIONP(r,1);
case 3: ACTIONI(r,1);
case 4: ACTIONP(r,2);
case 5: ACTIONI(r,2);
case 6: ACTIONP(r,3);
case 7: ACTIONI(r,3);
case 8: ACTIONP(r,4);
case 9: ACTIONI(r,4);
case 10: ACTIONP(r,5);
case 11: ACTIONI(r,5);
case 12: ACTIONP(r,6);
case 13: ACTIONI(r,6);
case 14: ACTIONP(r,7);
case 15: ACTIONI(r,7);
}
#undef ACTIONP
#undef ACTIONI
return 0;
}
static inline __m128i _s2_max_epi8(__m128i a, __m128i b)
{
__m128i mask = _mm_cmpgt_epi8( a, b );
a = _mm_and_si128 (a,mask );
b = _mm_andnot_si128(mask,b);
return _mm_or_si128(a,b);
}
static inline __m128i _s2_min_epi8(__m128i a, __m128i b)
{
__m128i mask = _mm_cmplt_epi8( a, b );
a = _mm_and_si128 (a,mask );
b = _mm_andnot_si128(mask,b);
return _mm_or_si128(a,b);
}
static inline __m128i _s2_insert_epi8(__m128i r, int b, const int p)
{
#define ACTIONP(r,x) return _mm_insert_epi16(r,(_mm_extract_epi16(r,x) & 0xFF00) | (b & 0x00FF),x)
#define ACTIONI(r,x) return _mm_insert_epi16(r,(_mm_extract_epi16(r,x) & 0x00FF) | ((b << 8)& 0xFF00),x)
switch (p) {
case 0: ACTIONP(r,0);
case 1: ACTIONI(r,0);
case 2: ACTIONP(r,1);
case 3: ACTIONI(r,1);
case 4: ACTIONP(r,2);
case 5: ACTIONI(r,2);
case 6: ACTIONP(r,3);
case 7: ACTIONI(r,3);
case 8: ACTIONP(r,4);
case 9: ACTIONI(r,4);
case 10: ACTIONP(r,5);
case 11: ACTIONI(r,5);
case 12: ACTIONP(r,6);
case 13: ACTIONI(r,6);
case 14: ACTIONP(r,7);
case 15: ACTIONI(r,7);
}
#undef ACTIONP
#undef ACTIONI
return _mm_setzero_si128();
}
// Fill a SSE Register with 16 time the same 8bits integer value
#define _MM_SET1_EPI8(x) _mm_set1_epi8(x)
#define _MM_INSERT_EPI8(r,x,i) _s2_insert_epi8((r),(x),(i))
#define _MM_CMPEQ_EPI8(x,y) _mm_cmpeq_epi8((x),(y))
#define _MM_CMPGT_EPI8(x,y) _mm_cmpgt_epi8((x),(y))
#define _MM_CMPLT_EPI8(x,y) _mm_cmplt_epi8((x),(y))
#define _MM_MAX_EPI8(x,y) _s2_max_epi8((x),(y))
#define _MM_MIN_EPI8(x,y) _s2_min_epi8((x),(y))
#define _MM_ADD_EPI8(x,y) _mm_add_epi8((x),(y))
#define _MM_SUB_EPI8(x,y) _mm_sub_epi8((x),(y))
#define _MM_EXTRACT_EPI8(r,p) _s2_extract_epi8((r),(p))
#define _MM_MIN_EPU8(x,y) _mm_min_epu8((x),(y))
// Fill a SSE Register with 8 time the same 16bits integer value
#define _MM_SET1_EPI16(x) _mm_set1_epi16(x)
#define _MM_INSERT_EPI16(r,x,i) _mm_insert_epi16((r),(x),(i))
#define _MM_CMPEQ_EPI16(x,y) _mm_cmpeq_epi16((x),(y))
#define _MM_CMPGT_EPI16(x,y) _mm_cmpgt_epi16((x),(y))
#define _MM_CMPGT_EPU16(x,y) _mm_cmpgt_epu16((x),(y)) // n'existe pas ??
#define _MM_CMPLT_EPI16(x,y) _mm_cmplt_epi16((x),(y))
#define _MM_MAX_EPI16(x,y) _mm_max_epi16((x),(y))
#define _MM_MIN_EPI16(x,y) _mm_min_epi16((x),(y))
#define _MM_ADD_EPI16(x,y) _mm_add_epi16((x),(y))
#define _MM_SUB_EPI16(x,y) _mm_sub_epi16((x),(y))
#define _MM_EXTRACT_EPI16(r,p) _mm_extract_epi16((r),(p))
#define _MM_UNPACKLO_EPI8(a,b) _mm_unpacklo_epi8((a),(b))
#define _MM_UNPACKHI_EPI8(a,b) _mm_unpackhi_epi8((a),(b))
#define _MM_ADDS_EPU16(x,y) _mm_adds_epu16((x),(y))
// Multiplication
#define _MM_MULLO_EPI16(x,y) _mm_mullo_epi16((x), (y))
#define _MM_SRLI_EPI64(r,x) _mm_srli_epi64((r),(x))
#define _MM_SLLI_EPI64(r,x) _mm_slli_epi64((r),(x))
// Set a SSE Register to 0
#define _MM_SETZERO_SI128() _mm_setzero_si128()
#define _MM_AND_SI128(x,y) _mm_and_si128((x),(y))
#define _MM_ANDNOT_SI128(x,y) _mm_andnot_si128((x),(y))
#define _MM_OR_SI128(x,y) _mm_or_si128((x),(y))
#define _MM_XOR_SI128(x,y) _mm_xor_si128((x),(y))
#define _MM_SLLI_SI128(r,s) _mm_slli_si128((r),(s))
#define _MM_SRLI_SI128(r,s) _mm_srli_si128((r),(s))
// Load a SSE register from an unaligned address
#define _MM_LOADU_SI128(x) _mm_loadu_si128(x)
// Load a SSE register from an aligned address (/!\ not defined when SSE not available)
#define _MM_LOAD_SI128(x) _mm_load_si128(x)
// #define _MM_UNPACKLO_EPI8(x,y) _mm_unpacklo_epi8((x),(y))
#else /* __SSE2__ Not defined */
static inline __m128i _em_set1_epi8(int x)
{
um128 a;
x&=0xFF;
a.s8[0]=x;
a.s8[1]=x;
a.u16[1]=a.u16[0];
a.u32[1]=a.u32[0];
a.u64[1]=a.u64[0];
return a.i;
}
static inline __m128i _em_insert_epi8(__m128i r, int x, const int i)
{
um128 a;
a.i=r;
a.s8[i]=x & 0xFF;
return a.i;
}
static inline __m128i _em_cmpeq_epi8(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s8[z]=(x.s8[z]==y.s8[z]) ? 0xFF:0
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
R(8);
R(9);
R(10);
R(11);
R(12);
R(13);
R(14);
R(15);
#undef R
return r.i;
}
static inline __m128i _em_cmpgt_epi8(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s8[z]=(x.s8[z]>y.s8[z]) ? 0xFF:0
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
R(8);
R(9);
R(10);
R(11);
R(12);
R(13);
R(14);
R(15);
#undef R
return r.i;
}
static inline __m128i _em_cmplt_epi8(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s8[z]=(x.s8[z]<y.s8[z]) ? 0xFF:0
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
R(8);
R(9);
R(10);
R(11);
R(12);
R(13);
R(14);
R(15);
#undef R
return r.i;
}
static inline __m128i _em_max_epi8(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s8[z]=MAX(x.s8[z],y.s8[z])
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
R(8);
R(9);
R(10);
R(11);
R(12);
R(13);
R(14);
R(15);
#undef R
return r.i;
}
static inline __m128i _em_min_epi8(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s8[z]=MIN(x.s8[z],y.s8[z])
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
R(8);
R(9);
R(10);
R(11);
R(12);
R(13);
R(14);
R(15);
#undef R
return r.i;
}
static inline __m128i _em_add_epi8(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s8[z]=x.s8[z]+y.s8[z]
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
R(8);
R(9);
R(10);
R(11);
R(12);
R(13);
R(14);
R(15);
#undef R
return r.i;
}
static inline __m128i _em_sub_epi8(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s8[z]=x.s8[z]+y.s8[z]
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
R(8);
R(9);
R(10);
R(11);
R(12);
R(13);
R(14);
R(15);
#undef R
return r.i;
}
static inline int _em_extract_epi8(__m128i r, const int i)
{
um128 a;
a.i=r;
return a.s8[i] & 0xFF;
}
static inline __m128i _em_min_epu8(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.u8[z]=MIN(x.u8[z],y.u8[z])
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
R(8);
R(9);
R(10);
R(11);
R(12);
R(13);
R(14);
R(15);
#undef R
return r.i;
}
static inline __m128i _em_set1_epi16(int x)
{
um128 a;
x&=0xFFFF;
a.s16[0]=x;
a.s16[1]=x;
a.u32[1]=a.u32[0];
a.u64[1]=a.u64[0];
return a.i;
}
static inline __m128i _em_insert_epi16(__m128i r, int x, const int i)
{
um128 a;
a.i=r;
a.s16[i]=x & 0xFFFF;
return a.i;
}
static inline __m128i _em_cmpeq_epi16(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s16[z]=(x.s16[z]==y.s16[z]) ? 0xFFFF:0
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
#undef R
return r.i;
}
static inline __m128i _em_cmpgt_epi16(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s16[z]=(x.s16[z]>y.s16[z]) ? 0xFFFF:0
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
#undef R
return r.i;
}
static inline __m128i _em_cmplt_epi16(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s16[z]=(x.s16[z]<y.s16[z]) ? 0xFFFF:0
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
#undef R
return r.i;
}
static inline __m128i _em_max_epi16(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s16[z]=MAX(x.s16[z],y.s16[z])
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
#undef R
return r.i;
}
static inline __m128i _em_min_epi16(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s16[z]=MIN(x.s16[z],y.s16[z])
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
#undef R
return r.i;
}
static inline __m128i _em_add_epi16(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s16[z]=x.s16[z]+y.s16[z]
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
#undef R
return r.i;
}
static inline __m128i _em_sub_epi16(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s16[z]=x.s16[z]+y.s16[z]
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
#undef R
return r.i;
}
static inline int _em_extract_epi16(__m128i r, const int i)
{
um128 a;
a.i=r;
return a.s16[i] & 0xFFFF;
}
static inline __m128i _em_unpacklo_epi8(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s16[z]=(((int16_t)(y.s8[z])) << 8) | (int16_t)(x.s8[z])
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
#undef R
return r.i;
}
static inline __m128i _em_unpackhi_epi8(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.s16[z]=(((int16_t)(y.s8[z+8])) << 8) | (int16_t)(x.s8[z+8])
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
#undef R
return r.i;
}
static inline __m128i _em_adds_epu16(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.u16[z]=x.u16[z]+y.u16[z]
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
#undef R
return r.i;
}
static inline __m128i _em_srli_epi64(__m128i a, int b)
{
um128 x;
x.i=a;
x.s64[0]>>=b;
x.s64[1]>>=b;
return x.i;
}
static inline __m128i _em_slli_epi64(__m128i a, int b)
{
um128 x;
x.i=a;
x.s64[0]<<=b;
x.s64[1]<<=b;
return x.i;
}
static inline __m128i _em_setzero_si128()
{
um128 x;
x.s64[0]=x.s64[1]=0;
return x.i;
}
static inline __m128i _em_and_si128(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.u64[z]=x.u64[z] & y.u64[z]
R(0);
R(1);
#undef R
return r.i;
}
static inline __m128i _em_andnot_si128(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.u64[z]=(~x.u64[z]) & y.u64[z]
R(0);
R(1);
#undef R
return r.i;
}
static inline __m128i _em_or_si128(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.u64[z]=x.u64[z] | y.u64[z]
R(0);
R(1);
#undef R
return r.i;
}
static inline __m128i _em_xor_si128(__m128i a, __m128i b)
{
um128 x;
um128 y;
um128 r;
x.i=a;
y.i=b;
#define R(z) r.u64[z]=x.u64[z] ^ y.u64[z]
R(0);
R(1);
#undef R
return r.i;
}
static inline __m128i _em_slli_si128(__m128i a, int b)
{
um128 x;
x.i=a;
#define R(z) x.u8[z]=(z>=b) ? x.u8[z-b]:0
R(15);
R(14);
R(13);
R(12);
R(11);
R(10);
R(9);
R(8);
R(7);
R(6);
R(5);
R(4);
R(3);
R(2);
R(1);
R(0);
#undef R
return x.i;
}
static inline __m128i _em_srli_si128(__m128i a, int b)
{
um128 x;
x.i=a;
#define R(z) x.u8[z]=((b+z) > 15) ? 0:x.u8[z+b]
R(0);
R(1);
R(2);
R(3);
R(4);
R(5);
R(6);
R(7);
R(8);
R(9);
R(10);
R(11);
R(12);
R(13);
R(14);
R(15);
#undef R
return x.i;
}
inline static __m128i _em_loadu_si128(__m128i const *P)
{
um128 tmp;
um128 *pp=(um128*)P;
tmp.u8[0]=(*pp).u8[0];
tmp.u8[1]=(*pp).u8[1];
tmp.u8[2]=(*pp).u8[2];
tmp.u8[3]=(*pp).u8[3];
tmp.u8[4]=(*pp).u8[4];
tmp.u8[5]=(*pp).u8[5];
tmp.u8[6]=(*pp).u8[6];
tmp.u8[7]=(*pp).u8[7];
tmp.u8[8]=(*pp).u8[8];
tmp.u8[9]=(*pp).u8[9];
tmp.u8[10]=(*pp).u8[10];
tmp.u8[11]=(*pp).u8[11];
tmp.u8[12]=(*pp).u8[12];
tmp.u8[13]=(*pp).u8[13];
tmp.u8[14]=(*pp).u8[14];
tmp.u8[15]=(*pp).u8[15];
return tmp.i;
}
#define _MM_SET1_EPI8(x) _em_set1_epi8(x)
#define _MM_INSERT_EPI8(r,x,i) _em_insert_epi8((r),(x),(i))
#define _MM_CMPEQ_EPI8(x,y) _em_cmpeq_epi8((x),(y))
#define _MM_CMPGT_EPI8(x,y) _em_cmpgt_epi8((x),(y))
#define _MM_CMPLT_EPI8(x,y) _em_cmplt_epi8((x),(y))
#define _MM_MAX_EPI8(x,y) _em_max_epi8((x),(y))
#define _MM_MIN_EPI8(x,y) _em_min_epi8((x),(y))
#define _MM_ADD_EPI8(x,y) _em_add_epi8((x),(y))
#define _MM_SUB_EPI8(x,y) _em_sub_epi8((x),(y))
#define _MM_EXTRACT_EPI8(r,p) _em_extract_epi8((r),(p))
#define _MM_MIN_EPU8(x,y) _em_min_epu8((x),(y))
#define _MM_SET1_EPI16(x) _em_set1_epi16(x)
#define _MM_INSERT_EPI16(r,x,i) _em_insert_epi16((r),(x),(i))
#define _MM_CMPEQ_EPI16(x,y) _em_cmpeq_epi16((x),(y))
#define _MM_CMPGT_EPI16(x,y) _em_cmpgt_epi16((x),(y))
#define _MM_CMPLT_EPI16(x,y) _em_cmplt_epi16((x),(y))
#define _MM_MAX_EPI16(x,y) _em_max_epi16((x),(y))
#define _MM_MIN_EPI16(x,y) _em_min_epi16((x),(y))
#define _MM_ADD_EPI16(x,y) _em_add_epi16((x),(y))
#define _MM_SUB_EPI16(x,y) _em_sub_epi16((x),(y))
#define _MM_EXTRACT_EPI16(r,p) _em_extract_epi16((r),(p))
#define _MM_UNPACKLO_EPI8(a,b) _em_unpacklo_epi8((a),(b))
#define _MM_UNPACKHI_EPI8(a,b) _em_unpackhi_epi8((a),(b))
#define _MM_ADDS_EPU16(x,y) _em_adds_epu16((x),(y))
#define _MM_SRLI_EPI64(r,x) _em_srli_epi64((r),(x))
#define _MM_SLLI_EPI64(r,x) _em_slli_epi64((r),(x))
#define _MM_SETZERO_SI128() _em_setzero_si128()
#define _MM_AND_SI128(x,y) _em_and_si128((x),(y))
#define _MM_ANDNOT_SI128(x,y) _em_andnot_si128((x),(y))
#define _MM_OR_SI128(x,y) _em_or_si128((x),(y))
#define _MM_XOR_SI128(x,y) _em_xor_si128((x),(y))
#define _MM_SLLI_SI128(r,s) _em_slli_si128((r),(s))
#define _MM_SRLI_SI128(r,s) _em_srli_si128((r),(s))
#define _MM_LOADU_SI128(x) _em_loadu_si128(x)
#define _MM_LOAD_SI128(x) _em_loadu_si128(x)
#endif /* __SSE2__ */
#define _MM_NOT_SI128(x) _MM_XOR_SI128((x),(_MM_SET1_EPI8(0xFFFF)))
#endif

View File

@ -14,6 +14,7 @@
#include <stdio.h>
#include <math.h>
#include "char_str_indexer.h"
#include "obiblob.h"
#include "obiblob_indexer.h"
#include "obidebug.h"
@ -25,24 +26,16 @@
Obi_blob_p obi_str_to_blob(const char* value)
{
Obi_blob_p value_b;
int32_t length;
int32_t length;
// Compute the number of bytes on which the value will be encoded
length = strlen(value) + 1; // +1 to store \0 at the end (makes retrieving faster)
value_b = obi_blob((byte_t*)value, ELEMENT_SIZE_STR, length, length);
if (value_b == NULL)
{
obidebug(1, "\nError encoding a character string in a blob");
return NULL;
}
return value_b;
return obi_blob((byte_t*)value, ELEMENT_SIZE_STR, length, length);
}
char* obi_blob_to_str(Obi_blob_p value_b)
const char* obi_blob_to_str(Obi_blob_p value_b)
{
return value_b->value;
}
@ -67,7 +60,7 @@ index_t obi_index_char_str(Obi_indexer_p indexer, const char* value)
}
char* obi_retrieve_char_str(Obi_indexer_p indexer, index_t idx)
const char* obi_retrieve_char_str(Obi_indexer_p indexer, index_t idx)
{
Obi_blob_p value_b;

View File

@ -35,7 +35,7 @@
* @since October 2015
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
Obi_blob_p obi_str_to_blob(char* value);
Obi_blob_p obi_str_to_blob(const char* value);
/**
@ -80,7 +80,7 @@ index_t obi_index_char_str(Obi_indexer_p indexer, const char* value);
* @since April 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
char* obi_retrieve_char_str(Obi_indexer_p indexer, index_t idx);
const char* obi_retrieve_char_str(Obi_indexer_p indexer, index_t idx);
#endif /* CHAR_STR_INDEXER_H_ */

View File

@ -14,6 +14,7 @@
#include <stdio.h>
#include <math.h>
#include "dna_seq_indexer.h"
#include "obiblob.h"
#include "obiblob_indexer.h"
#include "obidebug.h"

109
src/obi_align.c Normal file
View File

@ -0,0 +1,109 @@
/****************************************************************************
* Sequence alignment functions *
****************************************************************************/
/**
* @file obi_align.c
* @author Celine Mercier
* @date May 4th 2016
* @brief Functions handling sequence alignments.
*/
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
#include "obidebug.h"
#include "obierrno.h"
#include "obitypes.h"
#include "obiview.h"
#include "sse_banded_LCS_alignment.h"
#define DEBUG_LEVEL 0 // TODO has to be defined somewhere else (cython compil flag?)
// TODO
// use openMP pragmas : garder scores en memoire et ecrire a la fin? (normalement c bon avec index)
// tout ecrire en stdint?
// check NUC_SEQS and score type (int or float if normalize)
// what's with multiple sequence/line columns?
// make function that put blobs in int16
int obi_align_one_column(Obiview_p seq_view, OBIDMS_column_p seq_column, Obiview_p score_view, OBIDMS_column_p score_column, double threshold, bool normalize, int reference, bool similarity_mode)
{
index_t i, j, k;
index_t seq_count;
char* seq1;
char* seq2;
double score;
k = 0;
if ((seq_column->header)->returned_data_type != OBI_SEQ)
{
obi_set_errno(OBI_ALIGN_ERROR);
obidebug(1, "\nTrying to align a column of a different type than OBI_SEQ");
return -1;
}
if ((normalize && ((score_column->header)->returned_data_type != OBI_FLOAT)) ||
(!normalize && ((score_column->header)->returned_data_type != OBI_INT)))
{
obi_set_errno(OBI_ALIGN_ERROR);
obidebug(1, "\nTrying to store alignment scores in a column of an inappropriate type");
return -1;
}
seq_count = (seq_column->header)->lines_used;
for (i=0; i < (seq_count - 1); i++)
{
for (j=i+1; j < seq_count; j++)
{
//fprintf(stderr, "\ni=%lld, j=%lld, k=%lld", i, j, k);
seq1 = obi_column_get_obiseq_with_elt_idx_in_view(seq_view, seq_column, i, 0);
seq2 = obi_column_get_obiseq_with_elt_idx_in_view(seq_view, seq_column, j, 0);
if ((seq1 == NULL) || (seq2 == NULL))
{
obidebug(1, "\nError retrieving sequences to align");
return -1;
}
// TODO filter
score = generic_sse_banded_lcs_align(seq1, seq2, threshold, normalize, reference, similarity_mode);
if (normalize)
{
if (obi_column_set_obifloat_with_elt_idx_in_view(score_view, score_column, k, 0, (obifloat_t) score) < 0)
{
obidebug(1, "\nError writing alignment score in a column");
return -1;
}
}
else
{
if (obi_column_set_obiint_with_elt_idx_in_view(score_view, score_column, k, 0, (obiint_t) score) < 0)
{
obidebug(1, "\nError writing alignment score in a column");
return -1;
}
}
k++;
}
}
return 0;
}
int obi_align_two_columns(OBIDMS_column_p seq_column_1, OBIDMS_column_p seq_column_2, OBIDMS_column_p score_column, double threshold, bool normalize, int reference, bool similarity_mode)
{
// TODO
return 0;
}

37
src/obi_align.h Normal file
View File

@ -0,0 +1,37 @@
/****************************************************************************
* Sequence alignment functions header file *
****************************************************************************/
/**
* @file obi_align.h
* @author Celine Mercier
* @date May 11th 2016
* @brief Header file for the functions handling the alignment of DNA sequences.
*/
#ifndef OBI_ALIGN_H_
#define OBI_ALIGN_H_
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
#include "obidms.h"
#include "obidmscolumn.h"
#include "obitypes.h"
/**
* @brief
*
* TODO
*
*/
int obi_align_one_column(Obiview_p seq_view, OBIDMS_column_p seq_column, Obiview_p score_view, OBIDMS_column_p score_column, double threshold, bool normalize, int reference, bool similarity_mode);
#endif /* OBI_ALIGN_H_ */

View File

@ -107,6 +107,42 @@ static char* build_avl_file_name(const char* avl_name);
static char* build_avl_data_file_name(const char* avl_name);
/**
* @brief Internal function building the full path of an AVL tree file.
*
* @warning The returned pointer has to be freed by the caller.
*
* @param dms A pointer to the OBIDMS to which the AVL tree belongs.
* @param avl_name The name of the AVL tree.
* @param avl_idx The index of the AVL if it's part of an AVL group, or -1 if not.
*
* @returns A pointer to the full path of the file where the AVL tree is stored.
* @retval NULL if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
static char* get_full_path_of_avl_file_name(OBIDMS_p dms, const char* avl_name, int avl_idx);
/**
* @brief Internal function building the file name for an AVL data file.
*
* @warning The returned pointer has to be freed by the caller.
*
* @param dms A pointer to the OBIDMS to which the AVL tree belongs.
* @param avl_name The name of the AVL tree.
* @param avl_idx The index of the AVL if it's part of an AVL group, or -1 if not.
*
* @returns A pointer to the full path of the file where the data referred to by the AVL tree is stored.
* @retval NULL if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
static char* get_full_path_of_avl_data_file_name(OBIDMS_p dms, const char* avl_name, int avl_idx);
/**
* @brief Internal function returning the size of an AVL tree header on this platform,
* including the size of the bloom filter associated with the AVL tree.
@ -253,9 +289,12 @@ int remap_an_avl(OBIDMS_avl_p avl);
/**
* @brief Internal function (re)mapping the tree and data parts of an AVL tree structure.
* @brief Internal function creating and adding a new AVL in an AVL group.
*
* @param avl A pointer to the AVL tree group structure.
* @warning The previous AVL in the list of the group is unmapped,
* if it's not the 1st AVL being added.
*
* @param avl A pointer on the AVL tree group structure.
*
* @retval 0 if the operation was successfully completed.
* @retval -1 if an error occurred.
@ -547,6 +586,102 @@ static char* build_avl_data_file_name(const char* avl_name)
}
static char* get_full_path_of_avl_file_name(OBIDMS_p dms, const char* avl_name, int avl_idx)
{
char* complete_avl_name;
char* full_path;
char* avl_file_name;
if (avl_idx >= 0)
{
complete_avl_name = build_avl_name_with_idx(avl_name, avl_idx);
if (complete_avl_name == NULL)
return NULL;
}
else
{
complete_avl_name = (char*) malloc((strlen(avl_name)+1)*sizeof(char));
if (complete_avl_name == NULL)
{
obi_set_errno(OBI_MALLOC_ERROR);
obidebug(1, "\nError allocating memory for an AVL name");
return NULL;
}
strcpy(complete_avl_name, avl_name);
}
avl_file_name = build_avl_file_name(complete_avl_name);
if (avl_file_name == NULL)
{
free(complete_avl_name);
return NULL;
}
full_path = get_full_path_of_avl_dir(dms, avl_name);
if (full_path == NULL)
{
free(complete_avl_name);
free(avl_file_name);
return NULL;
}
strcat(full_path, "/");
strcat(full_path, avl_file_name);
free(complete_avl_name);
return full_path;
}
static char* get_full_path_of_avl_data_file_name(OBIDMS_p dms, const char* avl_name, int avl_idx)
{
char* complete_avl_name;
char* full_path;
char* avl_data_file_name;
if (avl_idx >= 0)
{
complete_avl_name = build_avl_name_with_idx(avl_name, avl_idx);
if (complete_avl_name == NULL)
return NULL;
}
else
{
complete_avl_name = (char*) malloc((strlen(avl_name)+1)*sizeof(char));
if (complete_avl_name == NULL)
{
obi_set_errno(OBI_MALLOC_ERROR);
obidebug(1, "\nError allocating memory for an AVL name");
return NULL;
}
strcpy(complete_avl_name, avl_name);
}
avl_data_file_name = build_avl_data_file_name(complete_avl_name);
if (avl_data_file_name == NULL)
{
free(complete_avl_name);
return NULL;
}
full_path = get_full_path_of_avl_dir(dms, avl_name);
if (full_path == NULL)
{
free(complete_avl_name);
free(avl_data_file_name);
return NULL;
}
strcat(full_path, "/");
strcat(full_path, avl_data_file_name);
free(complete_avl_name);
return full_path;
}
size_t get_avl_header_size()
{
size_t header_size;
@ -646,7 +781,6 @@ int truncate_avl_to_size_used(OBIDMS_avl_p avl) // TODO is it necessary to unmap
file_descriptor,
(avl->header)->header_size
);
if (avl->tree == MAP_FAILED)
{
obi_set_errno(OBI_AVL_ERROR);
@ -923,23 +1057,24 @@ int remap_an_avl(OBIDMS_avl_p avl)
int add_new_avl_in_group(OBIDMS_avl_group_p avl_group)
{
// Check that maximum number of AVLs in a group was not reached
if (avl_group->current_avl_idx == (MAX_NB_OF_AVLS_IN_GROUP-1))
if (avl_group->last_avl_idx == (MAX_NB_OF_AVLS_IN_GROUP - 1))
{
obi_set_errno(OBI_AVL_ERROR);
obidebug(1, "\nError: Trying to add new AVL in AVL group but maximum number of AVLs in a group reached");
return -1;
}
// Unmap the previous AVL
if (unmap_an_avl((avl_group->sub_avls)[avl_group->current_avl_idx]) < 0)
return -1;
// Unmap the previous AVL if it's not the 1st
if (avl_group->last_avl_idx > 0)
if (unmap_an_avl((avl_group->sub_avls)[avl_group->last_avl_idx]) < 0)
return -1;
// Increment current AVL index
(avl_group->current_avl_idx)++;
(avl_group->last_avl_idx)++;
// Create the new AVL
(avl_group->sub_avls)[avl_group->current_avl_idx] = obi_create_avl(avl_group->dms, avl_group->name, avl_group->current_avl_idx);
if ((avl_group->sub_avls)[avl_group->current_avl_idx] == NULL)
(avl_group->sub_avls)[avl_group->last_avl_idx] = obi_create_avl(avl_group->dms, avl_group->name, avl_group->last_avl_idx);
if ((avl_group->sub_avls)[avl_group->last_avl_idx] == NULL)
{
obidebug(1, "\nError creating a new AVL tree in a group");
return -1;
@ -949,6 +1084,36 @@ int add_new_avl_in_group(OBIDMS_avl_group_p avl_group)
}
// TODO doc
int add_existing_avl_in_group(OBIDMS_avl_group_p avl_group_dest, OBIDMS_avl_group_p avl_group_source, int avl_idx)
{
if (link(get_full_path_of_avl_file_name(avl_group_source->dms, avl_group_source->name, avl_idx), get_full_path_of_avl_file_name(avl_group_dest->dms, avl_group_dest->name, avl_idx)) < 0)
{
obi_set_errno(OBI_AVL_ERROR);
obidebug(1, "\nError creating a hard link to an existing AVL tree file");
return -1;
}
if (link(get_full_path_of_avl_data_file_name(avl_group_source->dms, avl_group_source->name, avl_idx), get_full_path_of_avl_data_file_name(avl_group_dest->dms, avl_group_dest->name, avl_idx)) < 0)
{
obi_set_errno(OBI_AVL_ERROR);
obidebug(1, "\nError creating a hard link to an existing AVL data file");
return -1;
}
// Increment current AVL index
(avl_group_dest->last_avl_idx)++;
// Open AVL for that group TODO ideally not needed, but needed for now
avl_group_dest->sub_avls[avl_group_dest->last_avl_idx] = obi_open_avl(avl_group_source->dms, avl_group_source->name, avl_idx);
if ((avl_group_dest->sub_avls)[avl_group_dest->last_avl_idx] == NULL)
{
obidebug(1, "\nError opening an AVL to add in an AVL group");
return -1;
}
return 0;
}
int maybe_in_avl(OBIDMS_avl_p avl, Obi_blob_p value)
{
return (bloom_check(&((avl->header)->bloom_filter), value, obi_blob_sizeof(value)));
@ -1529,8 +1694,7 @@ OBIDMS_avl_p obi_create_avl(OBIDMS_p dms, const char* avl_name, int avl_idx)
// Bloom filter
bloom_init(&((avl->header)->bloom_filter), MAX_NODE_COUNT_PER_AVL);
if (avl_idx >= 0)
free(complete_avl_name);
free(complete_avl_name);
return avl;
}
@ -1777,8 +1941,7 @@ OBIDMS_avl_p obi_open_avl(OBIDMS_p dms, const char* avl_name, int avl_idx)
avl->dir_fd = avl_dir_file_descriptor;
avl->avl_fd = avl_file_descriptor;
if (avl_idx >= 0)
free(complete_avl_name);
free(complete_avl_name);
return avl;
}
@ -1806,6 +1969,7 @@ OBIDMS_avl_group_p obi_avl_group(OBIDMS_p dms, const char* avl_name)
OBIDMS_avl_group_p obi_create_avl_group(OBIDMS_p dms, const char* avl_name)
{
OBIDMS_avl_group_p avl_group;
char* avl_dir_name;
avl_group = (OBIDMS_avl_group_p) malloc(sizeof(OBIDMS_avl_group_t));
if (avl_group == NULL)
@ -1815,18 +1979,22 @@ OBIDMS_avl_group_p obi_create_avl_group(OBIDMS_p dms, const char* avl_name)
return NULL;
}
// Create 1st avl
(avl_group->sub_avls)[0] = obi_create_avl(dms, avl_name, 0);
if ((avl_group->sub_avls)[0] == NULL)
{
obidebug(1, "\nError creating the first AVL of an AVL group");
return NULL;
}
avl_group->current_avl_idx = 0;
avl_group->last_avl_idx = -1;
avl_group->dms = dms;
strcpy(avl_group->name, avl_name);
avl_group->dms = dms;
// Create the directory for that AVL group
avl_dir_name = get_full_path_of_avl_dir(dms, avl_name);
if (avl_dir_name == NULL)
return NULL;
if (mkdirat(dms->indexer_dir_fd, avl_dir_name, 00777) < 0)
{
obi_set_errno(OBI_AVL_ERROR);
obidebug(1, "\nError creating an AVL directory");
free(avl_dir_name);
return NULL;
}
// Add in the list of open indexers
obi_dms_list_indexer(dms, avl_group);
@ -1883,7 +2051,7 @@ OBIDMS_avl_group_p obi_open_avl_group(OBIDMS_p dms, const char* avl_name)
if ((avl_group->sub_avls)[i] == NULL)
return NULL;
}
avl_group->current_avl_idx = avl_count-1; // TODO latest. discuss
avl_group->last_avl_idx = avl_count-1; // TODO latest. discuss
strcpy(avl_group->name, avl_name);
avl_group->dms = dms;
@ -1901,6 +2069,59 @@ OBIDMS_avl_group_p obi_open_avl_group(OBIDMS_p dms, const char* avl_name)
}
void obi_clone_avl(OBIDMS_avl_p avl, OBIDMS_avl_p new_avl)
{
// Clone AVL tree
memcpy(new_avl->tree, avl->tree, (avl->header)->avl_size);
(new_avl->header)->avl_size = (avl->header)->avl_size;
(new_avl->header)->nb_items = (avl->header)->nb_items;
(new_avl->header)->root_idx = (avl->header)->root_idx;
(new_avl->header)->bloom_filter = (avl->header)->bloom_filter;
// Clone AVL data
memcpy((new_avl->data)->data, (avl->data)->data, ((avl->data)->header)->data_size_used);
((new_avl->data)->header)->data_size_used = ((avl->data)->header)->data_size_used;
((new_avl->data)->header)->data_size_max = ((avl->data)->header)->data_size_max;
((new_avl->data)->header)->nb_items = ((avl->data)->header)->nb_items;
}
OBIDMS_avl_group_p obi_clone_avl_group(OBIDMS_avl_group_p avl_group, const char* new_avl_name)
{
OBIDMS_avl_group_p new_avl_group;
int i;
// Create the new AVL group
new_avl_group = obi_create_avl_group(avl_group->dms, new_avl_name);
// Create hard links to all the full AVLs that won't be modified: all but the last one
for (i=0; i<(avl_group->last_avl_idx); i++)
{
if (add_existing_avl_in_group(new_avl_group, avl_group, i) < 0)
return NULL;
}
// Create the last AVL to copy data in it
if (add_new_avl_in_group(new_avl_group) < 0)
{
obi_close_avl_group(new_avl_group);
return NULL;
}
// Copy the data from the last AVL to a new one that can be modified
obi_clone_avl((avl_group->sub_avls)[avl_group->last_avl_idx], (new_avl_group->sub_avls)[new_avl_group->last_avl_idx]);
// Close old AVL group
if (obi_close_avl_group(avl_group) < 0)
{
obi_close_avl_group(new_avl_group);
return NULL;
}
return new_avl_group;
}
int obi_close_avl(OBIDMS_avl_p avl)
{
int ret_val = 0;
@ -1909,7 +2130,7 @@ int obi_close_avl(OBIDMS_avl_p avl)
ret_val = truncate_avl_to_size_used(avl);
if (munmap(avl->tree, (((avl->header)->nb_items_max) * sizeof(AVL_node_t))) < 0)
if (munmap(avl->tree, (avl->header)->avl_size) < 0)
{
obi_set_errno(OBI_AVL_ERROR);
obidebug(1, "\nError munmapping the tree of an AVL tree file");
@ -1946,9 +2167,17 @@ int obi_close_avl_group(OBIDMS_avl_group_p avl_group)
ret_val = obi_dms_unlist_indexer(avl_group->dms, avl_group);
// Close each AVL of the group
for (i=0; i < (avl_group->current_avl_idx); i++)
for (i=0; i <= (avl_group->last_avl_idx); i++)
{
// Remap all but the last AVL (already mapped) before closing to truncate and close properly
if (i < (avl_group->last_avl_idx))
{
if (remap_an_avl((avl_group->sub_avls)[i]) < 0)
ret_val = -1;
}
if (obi_close_avl((avl_group->sub_avls)[i]) < 0)
ret_val = -1;
}
free(avl_group);
}
@ -2157,18 +2386,29 @@ index_t obi_avl_group_add(OBIDMS_avl_group_p avl_group, Obi_blob_p value)
index_t index_with_avl;
int i;
if (maybe_in_avl((avl_group->sub_avls)[avl_group->current_avl_idx], value))
// Create 1st AVL if group is empty
if (avl_group->last_avl_idx == -1)
{
index_in_avl = (int32_t) obi_avl_find((avl_group->sub_avls)[avl_group->current_avl_idx], value);
if (add_new_avl_in_group(avl_group) < 0)
{
obidebug(1, "\nError creating the first AVL of an AVL group");
return -1;
}
}
if (maybe_in_avl((avl_group->sub_avls)[avl_group->last_avl_idx], value))
{
index_in_avl = (int32_t) obi_avl_find((avl_group->sub_avls)[avl_group->last_avl_idx], value);
if (index_in_avl >= 0)
{
index_with_avl = avl_group->current_avl_idx;
index_with_avl = avl_group->last_avl_idx;
index_with_avl = index_with_avl << 32;
index_with_avl = index_with_avl + index_in_avl;
return index_with_avl;
}
}
for (i=0; i < (avl_group->current_avl_idx); i++)
for (i=0; i < (avl_group->last_avl_idx); i++)
{
if (maybe_in_avl((avl_group->sub_avls)[i], value))
{
@ -2192,24 +2432,23 @@ index_t obi_avl_group_add(OBIDMS_avl_group_p avl_group, Obi_blob_p value)
// Check if the AVL group is writable
if (!(avl_group->writable))
{
obi_set_errno(OBI_AVL_ERROR);
obidebug(1, "\nTrying to add a value in an AVL group that is read-only.");
obi_set_errno(OBI_READ_ONLY_INDEXER_ERROR);
return -1;
}
// Check if need to make new AVL
if (((((avl_group->sub_avls)[avl_group->current_avl_idx])->header)->nb_items == MAX_NODE_COUNT_PER_AVL) || (((((avl_group->sub_avls)[avl_group->current_avl_idx])->data)->header)->data_size_used >= MAX_DATA_SIZE_PER_AVL))
if (((((avl_group->sub_avls)[avl_group->last_avl_idx])->header)->nb_items == MAX_NODE_COUNT_PER_AVL) || (((((avl_group->sub_avls)[avl_group->last_avl_idx])->data)->header)->data_size_used >= MAX_DATA_SIZE_PER_AVL))
{
if (add_new_avl_in_group(avl_group) < 0)
return -1;
}
// Add in the current AVL
bloom_add(&((((avl_group->sub_avls)[avl_group->current_avl_idx])->header)->bloom_filter), value, obi_blob_sizeof(value));
index_in_avl = (int32_t) obi_avl_add((avl_group->sub_avls)[avl_group->current_avl_idx], value);
bloom_add(&((((avl_group->sub_avls)[avl_group->last_avl_idx])->header)->bloom_filter), value, obi_blob_sizeof(value));
index_in_avl = (int32_t) obi_avl_add((avl_group->sub_avls)[avl_group->last_avl_idx], value);
// Build the index containing the AVL index
index_with_avl = avl_group->current_avl_idx;
index_with_avl = avl_group->last_avl_idx;
index_with_avl = index_with_avl << 32;
index_with_avl = index_with_avl + index_in_avl;

View File

@ -73,7 +73,7 @@ typedef struct AVL_node {
* @brief OBIDMS AVL tree data header structure.
*/
typedef struct OBIDMS_avl_data_header {
int header_size; /**< Size of the header in bytes.
size_t header_size; /**< Size of the header in bytes.
*/
index_t data_size_used; /**< Size of the data used in bytes.
*/
@ -105,7 +105,7 @@ typedef struct OBIDMS_avl_data {
* @brief OBIDMS AVL tree header structure.
*/
typedef struct OBIDMS_avl_header {
int header_size; /**< Size of the header in bytes.
size_t header_size; /**< Size of the header in bytes.
*/
size_t avl_size; /**< Size of the AVL tree in bytes.
*/
@ -160,7 +160,7 @@ typedef struct OBIDMS_avl {
typedef struct OBIDMS_avl_group {
OBIDMS_avl_p sub_avls[MAX_NB_OF_AVLS_IN_GROUP]; /**< Array containing the pointers to the AVL trees of the group.
*/
int current_avl_idx; /**< Index in the sub_avls array of the AVL tree currently being filled.
int last_avl_idx; /**< Index in the sub_avls array of the AVL tree currently being filled.
*/
char name[AVL_MAX_NAME+1]; /**< Base name of the AVL group. The AVL trees in it have names of the form basename_idx.
*/
@ -290,6 +290,37 @@ OBIDMS_avl_group_p obi_create_avl_group(OBIDMS_p dms, const char* avl_name);
OBIDMS_avl_group_p obi_open_avl_group(OBIDMS_p dms, const char* avl_name);
/**
* @brief Clones an AVL.
*
* The tree and the data are both cloned into the new AVL.
*
* @param avl A pointer on the AVL to clone.
* @param new_avl A pointer on the new AVL to fill.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
void obi_clone_avl(OBIDMS_avl_p avl, OBIDMS_avl_p new_avl);
/**
* @brief Clones an AVL group.
*
* @warning The AVL group that has be cloned is closed by the function.
*
* @param avl_group A pointer on the AVL group to clone.
* @param new_avl_name The name of the new AVL group.
*
* @returns A pointer on the new AVL group structure.
* @retval NULL if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
OBIDMS_avl_group_p obi_clone_avl_group(OBIDMS_avl_group_p avl_group, const char* new_avl_name);
/**
* @brief Closes an AVL tree.
*

View File

@ -23,6 +23,8 @@
#define ELEMENT_SIZE_STR (8) /**< The size of an element from a value of type character string.
*/
#define ELEMENT_SIZE_UINT8 (8) /**< The size of an element from a value of type uint8_t.
*/
#define ELEMENT_SIZE_SEQ_2 (2) /**< The size of an element from a value of type DNA sequence encoded on 2 bits.
*/
#define ELEMENT_SIZE_SEQ_4 (4) /**< The size of an element from a value of type DNA sequence encoded on 4 bits.

View File

@ -30,6 +30,8 @@ inline Obi_indexer_p obi_create_indexer(OBIDMS_p dms, const char* name);
inline Obi_indexer_p obi_open_indexer(OBIDMS_p dms, const char* name);
inline Obi_indexer_p obi_clone_indexer(Obi_indexer_p indexer, const char* name);
inline int obi_close_indexer(Obi_indexer_p indexer);
inline index_t obi_indexer_add(Obi_indexer_p indexer, Obi_blob_p value);

View File

@ -107,6 +107,24 @@ inline Obi_indexer_p obi_open_indexer(OBIDMS_p dms, const char* name)
}
/**
* @brief Clones an indexer.
*
* @param indexer The indexer to clone.
* @param name The name of the new indexer.
*
* @returns A pointer on the new indexer structure.
* @retval NULL if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
inline Obi_indexer_p obi_clone_indexer(Obi_indexer_p indexer, const char* name)
{
return obi_clone_avl_group(indexer, name);
}
/**
* @brief Closes an indexer.
*

View File

@ -252,7 +252,7 @@ OBIDMS_p obi_create_dms(const char* dms_path)
return NULL;
}
// Get file descriptor of DMS directory to create the indexer directory
// Get file descriptor of DMS directory to create other directories
dms_dir = opendir(directory_name);
if (dms_dir == NULL)
{
@ -280,6 +280,14 @@ OBIDMS_p obi_create_dms(const char* dms_path)
return NULL;
}
// Create the view directory
if (mkdirat(dms_file_descriptor, VIEW_DIR_NAME, 00777) < 0)
{
obi_set_errno(OBIVIEW_ERROR);
obidebug(1, "\nProblem creating a view directory");
return NULL;
}
// Isolate the dms name
j = 0;
for (i=0; i<strlen(dms_path); i++)
@ -447,6 +455,30 @@ OBIDMS_p obi_open_dms(const char* dms_path)
return NULL;
}
// Open the view directory
dms->view_directory = opendir_in_dms(dms, VIEW_DIR_NAME);
if (dms->view_directory == NULL)
{
obi_set_errno(OBIDMS_UNKNOWN_ERROR);
obidebug(1, "\nError opening the view directory");
closedir(dms->indexer_directory);
closedir(dms->directory);
free(dms);
return NULL;
}
// Store the view directory's file descriptor
dms->view_dir_fd = dirfd(dms->view_directory);
if (dms->view_dir_fd < 0)
{
obi_set_errno(OBIDMS_UNKNOWN_ERROR);
obidebug(1, "\nError getting the file descriptor of the view directory");
closedir(dms->view_directory);
closedir(dms->directory);
free(dms);
return NULL;
}
// Initialize the list of opened columns
dms->opened_columns = (Opened_columns_list_p) malloc(sizeof(Opened_columns_list_t));
(dms->opened_columns)->nb_opened_columns = 0;
@ -486,7 +518,7 @@ int obi_close_dms(OBIDMS_p dms)
while ((dms->opened_columns)->nb_opened_columns > 0)
obi_close_column(*((dms->opened_columns)->columns));
// Close dms and indexer directories
// Close dms, and view and indexer directories
if (closedir(dms->directory) < 0)
{
obi_set_errno(OBIDMS_MEMORY_ERROR);
@ -501,6 +533,13 @@ int obi_close_dms(OBIDMS_p dms)
free(dms);
return -1;
}
if (closedir(dms->view_directory) < 0)
{
obi_set_errno(OBIVIEW_ERROR);
obidebug(1, "\nError closing a view directory");
free(dms);
return -1;
}
free(dms);
}

View File

@ -30,6 +30,8 @@
*/
#define INDEXER_DIR_NAME "OBIBLOB_INDEXERS" /**< The name of the Obiblob indexer directory.
*/
#define VIEW_DIR_NAME "VIEWS" /**< The name of the view directory.
*/
#define TAXONOMY_DIR_NAME "TAXONOMY" /**< The name of the taxonomy directory.
*/
#define MAX_NB_OPENED_COLUMNS (100) /**< The maximum number of columns open at the same time.
@ -98,6 +100,12 @@ typedef struct OBIDMS {
int indexer_dir_fd; /**< The file descriptor of the directory entry
* usable to refer and scan the indexer directory.
*/
DIR* view_directory; /**< A directory entry usable to
* refer and scan the view directory.
*/
int view_dir_fd; /**< The file descriptor of the directory entry
* usable to refer and scan the view directory.
*/
bool little_endian; /**< Endianness of the database.
*/
Opened_columns_list_p opened_columns; /**< List of opened columns.

View File

@ -587,7 +587,7 @@ OBIDMS_column_p obi_create_column(OBIDMS_p dms,
}
// Build the indexer name if needed
if ((data_type == OBI_STR) || (data_type == OBI_SEQ))
if ((data_type == OBI_STR) || (data_type == OBI_SEQ) || (data_type == OBI_QUAL))
{
if (strcmp(indexer_name, "") == 0)
{
@ -603,7 +603,7 @@ OBIDMS_column_p obi_create_column(OBIDMS_p dms,
}
returned_data_type = data_type;
if ((data_type == OBI_STR) || (data_type == OBI_SEQ))
if ((data_type == OBI_STR) || (data_type == OBI_SEQ) || (data_type == OBI_QUAL))
// stored data is indices referring to data stored elsewhere
stored_data_type = OBI_IDX;
else
@ -750,8 +750,8 @@ OBIDMS_column_p obi_create_column(OBIDMS_p dms,
if (comments != NULL)
strncpy(header->comments, comments, COMMENTS_MAX_LENGTH);
// If the data type is OBI_STR or OBI_SEQ, the associated obi_indexer is opened or created
if ((returned_data_type == OBI_STR) || (returned_data_type == OBI_SEQ))
// If the data type is OBI_STR, OBI_SEQ or OBI_QUAL, the associated obi_indexer is opened or created
if ((returned_data_type == OBI_STR) || (returned_data_type == OBI_SEQ) || (returned_data_type == OBI_QUAL))
{
new_column->indexer = obi_indexer(dms, final_indexer_name);
if (new_column->indexer == NULL)
@ -900,8 +900,8 @@ OBIDMS_column_p obi_open_column(OBIDMS_p dms,
column->writable = false;
// If the data type is OBI_STR or OBI_SEQ, the associated indexer is opened
if (((column->header)->returned_data_type == OBI_STR) || ((column->header)->returned_data_type == OBI_SEQ))
// If the data type is OBI_STR, OBI_SEQ or OBI_QUAL, the associated indexer is opened
if (((column->header)->returned_data_type == OBI_STR) || ((column->header)->returned_data_type == OBI_SEQ) || ((column->header)->returned_data_type == OBI_QUAL))
{
column->indexer = obi_open_indexer(dms, (column->header)->indexer_name);
if (column->indexer == NULL)
@ -940,8 +940,6 @@ OBIDMS_column_p obi_clone_column(OBIDMS_p dms,
size_t line_size;
index_t i, index;
obidebug(1, "\nline_selection == NULL = %d\n", (line_selection == NULL));
column_to_clone = obi_open_column(dms, column_name, version_number);
if (column_to_clone == NULL)
{
@ -1028,8 +1026,8 @@ int obi_close_column(OBIDMS_column_p column)
// Close column directory if it was the last column opened from that directory
close_dir = obi_dms_is_column_name_in_list(column->dms, (column->header)->name);
// If the data type is OBI_STR or OBI_SEQ, the associated indexer is closed
if (((column->header)->returned_data_type == OBI_STR) || ((column->header)->returned_data_type == OBI_SEQ))
// If the data type is OBI_STR, OBI_SEQ or OBI_QUAL, the associated indexer is closed
if (((column->header)->returned_data_type == OBI_STR) || ((column->header)->returned_data_type == OBI_SEQ) || ((column->header)->returned_data_type == OBI_QUAL))
if (obi_close_indexer(column->indexer) < 0)
ret_val = -1;
@ -1155,6 +1153,14 @@ int obi_enlarge_column(OBIDMS_column_p column)
int column_file_descriptor;
char* column_file_name;
// Check if the column is read-only
if (!(column->writable))
{
obi_set_errno(OBICOL_UNKNOWN_ERROR);
obidebug(1, "\nError trying to enlarge a read-only column");
return -1;
}
// Get the column file name
column_file_name = build_column_file_name((column->header)->name, (column->header)->version);
if (column_file_name == NULL)
@ -1209,12 +1215,12 @@ int obi_enlarge_column(OBIDMS_column_p column)
}
column->data = mmap(NULL,
new_data_size,
PROT_READ | PROT_WRITE,
MAP_SHARED,
column_file_descriptor,
header_size
);
new_data_size,
PROT_READ | PROT_WRITE,
MAP_SHARED,
column_file_descriptor,
header_size
);
if (column->data == MAP_FAILED)
{
obi_set_errno(OBICOL_UNKNOWN_ERROR);
@ -1274,23 +1280,29 @@ void obi_ini_to_NA_values(OBIDMS_column_p column,
}
break;
case OBI_STR: for (i=start;i<end;i++)
{
*(((index_t*) (column->data)) + i) = OBIIdx_NA;
}
break;
case OBI_SEQ: for (i=start;i<end;i++)
{
*(((index_t*) (column->data)) + i) = OBIIdx_NA;
}
break;
case OBI_IDX: for (i=start;i<end;i++)
{
*(((index_t*) (column->data)) + i) = OBIIdx_NA;
}
break;
case OBI_QUAL: for (i=start;i<end;i++) // case not used since OBI_QUAL is only a returned_data_type
{
*(((index_t*) (column->data)) + i) = OBIIdx_NA;
}
break;
case OBI_STR: for (i=start;i<end;i++) // case not used since OBI_QUAL is only a returned_data_type
{
*(((index_t*) (column->data)) + i) = OBIIdx_NA;
}
break;
case OBI_SEQ: for (i=start;i<end;i++) // case not used since OBI_QUAL is only a returned_data_type
{
*(((index_t*) (column->data)) + i) = OBIIdx_NA;
}
break;
}
}
@ -1460,12 +1472,12 @@ int obi_column_prepare_to_set_value(OBIDMS_column_p column, index_t line_nb)
}
int obi_column_prepare_to_get_value(OBIDMS_column_p column, index_t line_nb)
int obi_column_prepare_to_get_value(OBIDMS_column_p column, index_t line_nb) // TODO problem with some columns in a view being empty or shorter and triggering an error because they've been truncated when the view was closed. Fixed with obiview.c in update_lines() for now
{
if ((line_nb+1) > ((column->header)->line_count))
{
obi_set_errno(OBICOL_UNKNOWN_ERROR);
obidebug(1, "\nError trying to get a value that is beyond the current number of lines used in the column");
obidebug(1, "\nError trying to get a value that is beyond the current number of lines of the column");
return -1;
}
return 0;

184
src/obidmscolumn_qual.c Normal file
View File

@ -0,0 +1,184 @@
/****************************************************************************
* OBIDMS_column_qual functions *
****************************************************************************/
/**
* @file obidsmcolumn_qual.c
* @author Celine Mercier
* @date May 4th 2016
* @brief Functions handling OBIColumns containing data in the form of indices referring to sequence qualities.
*/
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include "obidmscolumn_qual.h"
#include "obidmscolumn.h"
#include "obitypes.h"
#include "uint8_indexer.h"
/**********************************************************************
*
* D E F I N I T I O N O F T H E P U B L I C F U N C T I O N S
*
**********************************************************************/
int obi_column_set_obiqual_char_with_elt_idx(OBIDMS_column_p column, index_t line_nb, index_t element_idx, const char* value)
{
uint8_t* int_value;
int int_value_length;
int i;
int ret_value;
// Check NA value
if (value == OBIQual_char_NA)
{
ret_value = obi_column_set_obiqual_int_with_elt_idx(column, line_nb, element_idx, OBIQual_int_NA, 0);
}
else
{
int_value_length = strlen(value);
int_value = (uint8_t*) malloc(int_value_length * sizeof(uint8_t));
// Convert in uint8_t array to index in that format
for (i=0; i<int_value_length; i++)
int_value[i] = ((uint8_t)(value[i])) - QUALITY_ASCII_BASE;
ret_value = obi_column_set_obiqual_int_with_elt_idx(column, line_nb, element_idx, int_value, int_value_length);
free(int_value);
}
return ret_value;
}
int obi_column_set_obiqual_int_with_elt_idx(OBIDMS_column_p column, index_t line_nb, index_t element_idx, const uint8_t* value, int value_length)
{
index_t idx;
char* new_indexer_name;
if (obi_column_prepare_to_set_value(column, line_nb) < 0)
return -1;
if (value == OBIQual_int_NA)
{
idx = OBIIdx_NA;
}
else
{
// Add the value in the indexer
idx = obi_index_uint8(column->indexer, value, value_length);
if (idx == -1) // An error occurred
{
if (obi_errno == OBI_READ_ONLY_INDEXER_ERROR)
{
// TODO PUT IN A COLUMN FUNCTION
// If the error is that the indexer is read-only, clone it
new_indexer_name = obi_build_indexer_name((column->header)->name, (column->header)->version);
if (new_indexer_name == NULL)
return -1;
column->indexer = obi_clone_indexer(column->indexer, new_indexer_name); // TODO Need to lock this somehow?
strcpy((column->header)->indexer_name, new_indexer_name);
free(new_indexer_name);
obi_set_errno(0);
// Add the value in the new indexer
idx = obi_index_uint8(column->indexer, value, value_length);
if (idx == -1)
return -1;
}
else
return -1;
}
}
// Add the value's index in the column
*(((index_t*) (column->data)) + (line_nb * ((column->header)->nb_elements_per_line)) + element_idx) = idx;
return 0;
}
char* obi_column_get_obiqual_char_with_elt_idx(OBIDMS_column_p column, index_t line_nb, index_t element_idx)
{
char* value;
const uint8_t* int_value;
int int_value_length;
int i;
int_value = obi_column_get_obiqual_int_with_elt_idx(column, line_nb, element_idx, &int_value_length);
// Check NA
if (int_value == OBIQual_int_NA)
return OBIQual_char_NA;
value = (char*) malloc((int_value_length + 1) * sizeof(char));
// Encode int quality to char quality
for (i=0; i<int_value_length; i++)
value[i] = (char)(int_value[i] + QUALITY_ASCII_BASE);
value[i] = '\0';
return value;
}
const uint8_t* obi_column_get_obiqual_int_with_elt_idx(OBIDMS_column_p column, index_t line_nb, index_t element_idx, int* value_length)
{
index_t idx;
if (obi_column_prepare_to_get_value(column, line_nb) < 0)
return OBIQual_int_NA;
idx = *(((index_t*) (column->data)) + (line_nb * ((column->header)->nb_elements_per_line)) + element_idx);
// Check NA
if (idx == OBIIdx_NA)
return OBIQual_int_NA;
return obi_retrieve_uint8(column->indexer, idx, value_length);
}
int obi_column_set_obiqual_char_with_elt_name(OBIDMS_column_p column, index_t line_nb, const char* element_name, const char* value)
{
index_t element_idx = obi_column_get_element_index_from_name(column, element_name);
if (element_idx == OBIIdx_NA)
return -1;
return obi_column_set_obiqual_char_with_elt_idx(column, line_nb, element_idx, value);
}
int obi_column_set_obiqual_int_with_elt_name(OBIDMS_column_p column, index_t line_nb, const char* element_name, const uint8_t* value, int value_length)
{
index_t element_idx = obi_column_get_element_index_from_name(column, element_name);
if (element_idx == OBIIdx_NA)
return -1;
return obi_column_set_obiqual_int_with_elt_idx(column, line_nb, element_idx, value, value_length);
}
char* obi_column_get_obiqual_char_with_elt_name(OBIDMS_column_p column, index_t line_nb, const char* element_name)
{
index_t element_idx = obi_column_get_element_index_from_name(column, element_name);
if (element_idx == OBIIdx_NA)
return OBIQual_char_NA;
return obi_column_get_obiqual_char_with_elt_idx(column, line_nb, element_idx);
}
const uint8_t* obi_column_get_obiqual_int_with_elt_name(OBIDMS_column_p column, index_t line_nb, const char* element_name, int* value_length)
{
index_t element_idx = obi_column_get_element_index_from_name(column, element_name);
if (element_idx == OBIIdx_NA)
return OBIQual_int_NA;
return obi_column_get_obiqual_int_with_elt_idx(column, line_nb, element_idx, value_length);
}

204
src/obidmscolumn_qual.h Normal file
View File

@ -0,0 +1,204 @@
/****************************************************************************
* OBIDMS_column_qual header file *
****************************************************************************/
/**
* @file obidsmcolumn_qual.h
* @author Celine Mercier
* @date May 4th 2016
* @brief Header file for the functions handling OBIColumns containing data in the form of indices referring to sequence qualities.
*/
#ifndef OBIDMSCOLUMN_QUAL_H_
#define OBIDMSCOLUMN_QUAL_H_
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include "obidmscolumn.h"
#include "obitypes.h"
#define QUALITY_ASCII_BASE (33) /**< The ASCII base of sequence quality.
* Used to convert sequence qualities from characters to integers
* and the other way around.
*/
/**
* @brief Sets a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line.
*
* This function is for qualities in the character string format.
*
* @warning Pointers returned by obi_open_column() don't allow writing.
*
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be set.
* @param element_idx The index of the element that should be set in the line.
* @param value The value that should be set, in the character string format.
*
* @returns An integer value indicating the success of the operation.
* @retval 0 on success.
* @retval -1 if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
int obi_column_set_obiqual_char_with_elt_idx(OBIDMS_column_p column, index_t line_nb, index_t element_idx, const char* value);
/**
* @brief Sets a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line.
*
* This function is for qualities in the integer format.
*
* @warning Pointers returned by obi_open_column() don't allow writing.
*
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be set.
* @param element_idx The index of the element that should be set in the line.
* @param value The value that should be set, in the integer array format.
* @param value_length The length of the integer array.
*
* @returns An integer value indicating the success of the operation.
* @retval 0 on success.
* @retval -1 if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
int obi_column_set_obiqual_int_with_elt_idx(OBIDMS_column_p column, index_t line_nb, index_t element_idx, const uint8_t* value, int value_length);
/**
* @brief Recovers a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line.
*
* This function returns quality scores in the character string format.
*
* @param column A pointer as returned by obi_create_column().
* @param line_nb The number of the line where the value should be recovered.
* @param element_idx The index of the element that should be recovered in the line.
*
* @returns The recovered value, in the character string format.
* @retval OBIQual_char_NA the NA value of the type if an error occurred and obi_errno is set.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
char* obi_column_get_obiqual_char_with_elt_idx(OBIDMS_column_p column, index_t line_nb, index_t element_idx);
/**
* @brief Recovers a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line.
*
* This function returns quality scores in the integer format.
*
* @param column A pointer as returned by obi_create_column().
* @param line_nb The number of the line where the value should be recovered.
* @param element_idx The index of the element that should be recovered in the line.
* @param value_length A pointer on an integer to store the length of the integer array recovered.
*
* @returns The recovered value, in the integer array format.
* @retval OBIQual_int_NA the NA value of the type if an error occurred and obi_errno is set.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
const uint8_t* obi_column_get_obiqual_int_with_elt_idx(OBIDMS_column_p column, index_t line_nb, index_t element_idx, int* value_length);
/**
* @brief Sets a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line.
*
* This function is for quality scores in the character string format.
*
* @warning Pointers returned by obi_open_column() don't allow writing.
*
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be set.
* @param element_name The name of the element that should be set in the line.
* @param value The value that should be set, in the character string format.
*
* @returns An integer value indicating the success of the operation.
* @retval 0 on success.
* @retval -1 if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
int obi_column_set_obiqual_char_with_elt_name(OBIDMS_column_p column, index_t line_nb, const char* element_name, const char* value);
/**
* @brief Sets a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line.
*
* This function is for quality scores in the integer array format.
*
* @warning Pointers returned by obi_open_column() don't allow writing.
*
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be set.
* @param element_name The name of the element that should be set in the line.
* @param value The value that should be set, in the integer format.
* @param value_length The length of the integer array.
*
* @returns An integer value indicating the success of the operation.
* @retval 0 on success.
* @retval -1 if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
int obi_column_set_obiqual_int_with_elt_name(OBIDMS_column_p column, index_t line_nb, const char* element_name, const uint8_t* value, int value_length);
/**
* @brief Recovers a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line.
*
* This function returns quality scores in the character string format.
*
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be recovered.
* @param element_name The name of the element that should be recovered in the line.
*
* @returns The recovered value, in the character string format.
* @retval OBIQual_char_NA the NA value of the type if an error occurred and obi_errno is set.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
char* obi_column_get_obiqual_char_with_elt_name(OBIDMS_column_p column, index_t line_nb, const char* element_name);
/**
* @brief Recovers a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line.
*
* This function returns quality scores in the integer array format.
*
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be recovered.
* @param element_name The name of the element that should be recovered in the line.
* @param value_length A pointer on an integer to store the length of the integer array recovered.
*
* @returns The recovered value, in the integer format.
* @retval OBIQual_int_NA the NA value of the type if an error occurred and obi_errno is set.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
const uint8_t* obi_column_get_obiqual_int_with_elt_name(OBIDMS_column_p column, index_t line_nb, const char* element_name, int* value_length);
#endif /* OBIDMSCOLUMN_QUAL_H_ */

View File

@ -27,14 +27,42 @@
int obi_column_set_obiseq_with_elt_idx(OBIDMS_column_p column, index_t line_nb, index_t element_idx, const char* value)
{
index_t idx;
char* new_indexer_name;
if (obi_column_prepare_to_set_value(column, line_nb) < 0)
return -1;
// Add the value in the indexer
idx = obi_index_dna_seq(column->indexer, value);
if (idx == -1)
return -1;
if (value == OBISeq_NA)
{
idx = OBIIdx_NA;
}
else
{
// Add the value in the indexer
idx = obi_index_dna_seq(column->indexer, value);
if (idx == -1) // An error occurred
{
if (obi_errno == OBI_READ_ONLY_INDEXER_ERROR)
{
// TODO PUT IN A COLUMN FUNCTION
// If the error is that the indexer is read-only, clone it
new_indexer_name = obi_build_indexer_name((column->header)->name, (column->header)->version);
if (new_indexer_name == NULL)
return -1;
column->indexer = obi_clone_indexer(column->indexer, new_indexer_name); // TODO Need to lock this somehow?
strcpy((column->header)->indexer_name, new_indexer_name);
free(new_indexer_name);
obi_set_errno(0);
// Add the value in the new indexer
idx = obi_index_dna_seq(column->indexer, value);
if (idx == -1)
return -1;
}
else
return -1;
}
}
// Add the value's index in the column
*(((index_t*) (column->data)) + (line_nb * ((column->header)->nb_elements_per_line)) + element_idx) = idx;

View File

@ -27,14 +27,42 @@
int obi_column_set_obistr_with_elt_idx(OBIDMS_column_p column, index_t line_nb, index_t element_idx, const char* value)
{
index_t idx;
char* new_indexer_name;
if (obi_column_prepare_to_set_value(column, line_nb) < 0)
return -1;
// Add the value in the indexer
idx = obi_index_char_str(column->indexer, value);
if (idx == -1)
return -1;
if (value == OBIStr_NA)
{
idx = OBIIdx_NA;
}
else
{
// Add the value in the indexer
idx = obi_index_char_str(column->indexer, value);
if (idx == -1) // An error occurred
{
if (obi_errno == OBI_READ_ONLY_INDEXER_ERROR)
{
// TODO PUT IN A COLUMN FUNCTION
// If the error is that the indexer is read-only, clone it
new_indexer_name = obi_build_indexer_name((column->header)->name, (column->header)->version);
if (new_indexer_name == NULL)
return -1;
column->indexer = obi_clone_indexer(column->indexer, new_indexer_name); // TODO Need to lock this somehow?
strcpy((column->header)->indexer_name, new_indexer_name);
free(new_indexer_name);
obi_set_errno(0);
// Add the value in the new indexer
idx = obi_index_char_str(column->indexer, value);
if (idx == -1)
return -1;
}
else
return -1;
}
}
// Add the value's index in the column
*(((index_t*) (column->data)) + (line_nb * ((column->header)->nb_elements_per_line)) + element_idx) = idx;
@ -48,7 +76,7 @@ const char* obi_column_get_obistr_with_elt_idx(OBIDMS_column_p column, index_t l
index_t idx;
if (obi_column_prepare_to_get_value(column, line_nb) < 0)
return OBISeq_NA;
return OBIStr_NA;
idx = *(((index_t*) (column->data)) + (line_nb * ((column->header)->nb_elements_per_line)) + element_idx);

View File

@ -114,6 +114,10 @@ extern int obi_errno;
*/
#define OBI_INDEXER_ERROR (27) /** Error handling a blob indexer
*/
#define OBI_READ_ONLY_INDEXER_ERROR (28) /** Error trying to modify a read-only a blob indexer
*/
#define OBI_ALIGN_ERROR (29) /** Error while aligning sequences
*/
/**@}*/
#endif /* OBIERRNO_H_ */

View File

@ -40,6 +40,9 @@ size_t obi_sizeof(OBIType_t type)
case OBI_CHAR: size = sizeof(obichar_t);
break;
case OBI_QUAL: size = sizeof(index_t);
break;
case OBI_STR: size = sizeof(index_t);
break;
@ -96,6 +99,9 @@ char* name_data_type(int data_type)
case OBI_CHAR: name = strdup("OBI_CHAR");
break;
case OBI_QUAL: name = strdup("OBI_QUAL");
break;
case OBI_STR: name = strdup("OBI_STR");
break;

View File

@ -22,8 +22,10 @@
#define OBIIdx_NA (INT64_MIN) /**< NA value for indices */
#define OBIFloat_NA (float_NA()) /**< NA value for the type OBI_FLOAT */
#define OBIChar_NA (0) /**< NA value for the type OBI_CHAR */ // TODO not sure about this one as it can be impossible to distinguish from uninitialized values
#define OBISeq_NA ("\0") /**< NA value for the type OBI_SEQ */ // TODO discuss
#define OBIStr_NA ("\0") /**< NA value for the type OBI_STR */ // TODO discuss
#define OBISeq_NA (NULL) /**< NA value for the type OBI_SEQ */ // TODO discuss
#define OBIStr_NA (NULL) /**< NA value for the type OBI_STR */ // TODO discuss
#define OBIQual_char_NA (NULL) /**< NA value for the type OBI_QUAL if the quality is in character string format */
#define OBIQual_int_NA (NULL) /**< NA value for the type OBI_QUAL if the quality is in integer format */
/**
@ -45,6 +47,7 @@ typedef enum OBIType {
OBI_FLOAT, /**< a floating value (C type : double) */
OBI_BOOL, /**< a boolean true/false value, see obibool_t enum */
OBI_CHAR, /**< a character (C type : char) */
OBI_QUAL, /**< an index in a data structure (C type : int64_t) referring to a quality score array */
OBI_STR, /**< an index in a data structure (C type : int64_t) referring to a character string */
OBI_SEQ, /**< an index in a data structure (C type : int64_t) referring to a DNA sequence */
OBI_IDX /**< an index referring to a line in another column (C type : int64_t) */

File diff suppressed because it is too large Load Diff

View File

@ -17,6 +17,7 @@
#include <stdlib.h>
#include <errno.h>
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <time.h>
#include <math.h>
@ -33,8 +34,6 @@
*/
#define VIEW_TYPE_MAX_LENGTH (1024) /**< The maximum length of the type name of a view.
*/
#define OBIVIEW_FILE_NAME "obiviews" /**< The default name of a view file.
*/
#define LINES_COLUMN_NAME "LINES" /**< The name of the column containing the line selections
* in all views.
*/
@ -44,62 +43,17 @@
#define NUC_SEQUENCE_COLUMN "NUC_SEQ" /**< The name of the column containing the nucleotide sequences
* in NUC_SEQS_VIEW views.
*/
#define NUC_SEQUENCE_INDEXER "NUC_SEQ_INDEXER" /**< The name of the indexer containing the nucleotide sequences
* in NUC_SEQS_VIEW views.
*/
#define ID_COLUMN "ID" /**< The name of the column containing the sequence identifiers
* in NUC_SEQS_VIEW views.
*/
#define ID_INDEXER "ID_INDEXER" /**< The name of the indexer containing the sequence identifiers
* in NUC_SEQS_VIEW views.
*/
#define DEFINITION_COLUMN "DEFINITION" /**< The name of the column containing the sequence definitions
* in NUC_SEQS_VIEW views.
*/
#define DEFINITION_INDEXER "DEFINITION_INDEXER" /**< The name of the indexer containing the sequence definitions
#define QUALITY_COLUMN "QUALITY" /**< The name of the column containing the sequence qualities
* in NUC_SEQS_VIEW views.
*/
/**
* @brief Structure for an opened view.
*/
typedef struct Obiview {
OBIDMS_p dms; /**< A pointer on the DMS to which the view belongs.
*/
char name[OBIVIEW_NAME_MAX_LENGTH+1]; /**< Name of the view.
*/
char created_from[OBIVIEW_NAME_MAX_LENGTH+1]; /**< Name of the view from which that view was cloned if the view was cloned.
*/
char view_type[VIEW_TYPE_MAX_LENGTH+1]; /**< Type of the view if there is one.
* Types existing: NUC_SEQS_VIEW.
*/
bool read_only; /**< Whether the view is read-only or can be modified.
*/
OBIDMS_column_p line_selection; /**< A pointer on the column containing the line selection
* associated with the view if there is one.
* This line selection is read-only, and when a line from the view is read,
* it is this line selection that is used.
*/
OBIDMS_column_p new_line_selection; /**< A pointer on the column containing the new line selection being built
* to associate with the view, if there is one.
* When a line is selected with obi_select_line() or obi_select_lines(),
* it is recorded in this line selection.
*/
index_t line_count; /**< The number of lines in the view. Refers to the number of lines in each
* column of the view if line_selection is NULL, or to the line count of
* line_selection if it is not NULL.
*/
int column_count; /**< The number of columns in the view.
*/
OBIDMS_column_p columns[MAX_NB_OPENED_COLUMNS]; /**< Array of pointers on all the columns of the view.
*/
char comments[OBIVIEW_COMMENTS_MAX_LENGTH+1]; /**< Comments, additional informations on the view.
*/
} Obiview_t, *Obiview_p;
/**
* @brief Structure referencing a column by its name and its version.
*/
@ -117,8 +71,6 @@ typedef struct Column_reference {
* Once a view has been written in the view file, it can not be modified and can only be read.
*/
typedef struct Obiview_infos {
int view_number; /**< Number of the view in the view file.
*/
time_t creation_date; /**< Time at which the view was written in the view file.
*/
char name[OBIVIEW_NAME_MAX_LENGTH+1]; /**< Name of the view, used to identify it.
@ -143,7 +95,29 @@ typedef struct Obiview_infos {
} Obiview_infos_t, *Obiview_infos_p;
// TODO : Combine the common elements of the Obiview_infos and Obiview structures in one structure used by both?
/**
* @brief Structure for an opened view.
*/
typedef struct Obiview {
Obiview_infos_p infos; /**< A pointer on the mapped view informations.
*/
OBIDMS_p dms; /**< A pointer on the DMS to which the view belongs.
*/
bool read_only; /**< Whether the view is read-only or can be modified.
*/
OBIDMS_column_p line_selection; /**< A pointer on the column containing the line selection
* associated with the view if there is one.
* This line selection is read-only, and when a line from the view is read,
* it is this line selection that is used.
*/
OBIDMS_column_p new_line_selection; /**< A pointer on the column containing the new line selection being built
* to associate with the view, if there is one.
* When a line is selected with obi_select_line() or obi_select_lines(),
* it is recorded in this line selection.
*/
OBIDMS_column_p columns[MAX_NB_OPENED_COLUMNS]; /**< Array of pointers on all the columns of the view.
*/
} Obiview_t, *Obiview_p;
/**
@ -224,6 +198,7 @@ Obiview_p obi_new_view_cloned_from_name(OBIDMS_p dms, const char* view_name, con
* - NUC_SEQUENCE_COLUMN where nucleotide sequences are stored
* - ID_COLUMN where sequence identifiers are stored
* - DEFINITION_COLUMN where sequence definitions are stored
* - QUALITY_COLUMN where sequence qualities are stored
*
* @param dms A pointer on the OBIDMS.
* @param view_name The unique name of the view.
@ -255,6 +230,7 @@ Obiview_p obi_new_view_nuc_seqs(OBIDMS_p dms, const char* view_name, Obiview_p v
* - NUC_SEQUENCE_COLUMN where nucleotide sequences are stored
* - ID_COLUMN where sequence identifiers are stored
* - DEFINITION_COLUMN where sequence definitions are stored
* - QUALITY_COLUMN where sequence qualities are stored
*
* @param dms A pointer on the OBIDMS.
* @param view_name The unique name of the new view.
@ -272,6 +248,37 @@ Obiview_p obi_new_view_nuc_seqs(OBIDMS_p dms, const char* view_name, Obiview_p v
Obiview_p obi_new_view_nuc_seqs_cloned_from_name(OBIDMS_p dms, const char* view_name, const char* view_to_clone_name, index_t* line_selection, const char* comments);
/**
* @brief Maps a view file and returns the mapped structure stored in it.
*
* @param dms A pointer on the OBIDMS.
* @param view_name The unique name identifying the view.
*
* @returns A pointer on the mapped view infos structure.
* @retval NULL if an error occurred.
*
* @since June 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
Obiview_infos_p obi_view_map_file(OBIDMS_p dms, const char* view_name);
/**
* @brief Unmaps a view file.
*
* @param dms A pointer on the OBIDMS.
* @param view_infos A pointer on the mapped view infos structure.
*
* @returns A value indicating the success of the operation.
* @retval 0 if the operation was successfully completed.
* @retval -1 if an error occurred.
*
* @since June 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
int obi_view_unmap_file(OBIDMS_p dms, Obiview_infos_p view_infos);
/**
* @brief Opens a view identified by its name stored in the view file.
*
@ -812,6 +819,194 @@ int obi_column_set_obiint_with_elt_name_in_view(Obiview_p view, OBIDMS_column_p
obiint_t obi_column_get_obiint_with_elt_name_in_view(Obiview_p view, OBIDMS_column_p column, index_t line_nb, const char* element_name);
/**
* @brief Sets a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line,
* in the context of a view.
*
* This function is for qualities in the character string format.
*
* @warning Pointers returned by obi_open_column() don't allow writing.
*
* @param view A pointer on the opened view.
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be set.
* @param element_idx The index of the element that should be set in the line.
* @param value The value that should be set, in the character string format.
*
* @returns An integer value indicating the success of the operation.
* @retval 0 on success.
* @retval -1 if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
int obi_column_set_obiqual_char_with_elt_idx_in_view(Obiview_p view, OBIDMS_column_p column, index_t line_nb, index_t element_idx, const char* value);
/**
* @brief Sets a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line,
* in the context of a view.
*
* This function is for qualities in the integer format.
*
* @warning Pointers returned by obi_open_column() don't allow writing.
*
* @param view A pointer on the opened view.
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be set.
* @param element_idx The index of the element that should be set in the line.
* @param value The value that should be set, in the integer array format.
* @param value_length The length of the integer array.
*
* @returns An integer value indicating the success of the operation.
* @retval 0 on success.
* @retval -1 if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
int obi_column_set_obiqual_int_with_elt_idx_in_view(Obiview_p view, OBIDMS_column_p column, index_t line_nb, index_t element_idx, const uint8_t* value, int value_length);
/**
* @brief Recovers a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line,
* in the context of a view.
*
* This function returns quality scores in the character string format.
*
* @param view A pointer on the opened view.
* @param column A pointer as returned by obi_create_column().
* @param line_nb The number of the line where the value should be recovered.
* @param element_idx The index of the element that should be recovered in the line.
*
* @returns The recovered value, in the character string format.
* @retval OBIQual_char_NA the NA value of the type if an error occurred and obi_errno is set.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
char* obi_column_get_obiqual_char_with_elt_idx_in_view(Obiview_p view, OBIDMS_column_p column, index_t line_nb, index_t element_idx);
/**
* @brief Recovers a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line,
* in the context of a view.
*
* This function returns quality scores in the integer format.
*
* @param view A pointer on the opened view.
* @param column A pointer as returned by obi_create_column().
* @param line_nb The number of the line where the value should be recovered.
* @param element_idx The index of the element that should be recovered in the line.
* @param value_length A pointer on an integer to store the length of the integer array recovered.
*
* @returns The recovered value, in the integer array format.
* @retval OBIQual_int_NA the NA value of the type if an error occurred and obi_errno is set.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
const uint8_t* obi_column_get_obiqual_int_with_elt_idx_in_view(Obiview_p view, OBIDMS_column_p column, index_t line_nb, index_t element_idx, int* value_length);
/**
* @brief Sets a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line,
* in the context of a view.
*
* This function is for quality scores in the character string format.
*
* @warning Pointers returned by obi_open_column() don't allow writing.
*
* @param view A pointer on the opened view.
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be set.
* @param element_name The name of the element that should be set in the line.
* @param value The value that should be set, in the character string format.
*
* @returns An integer value indicating the success of the operation.
* @retval 0 on success.
* @retval -1 if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
int obi_column_set_obiqual_char_with_elt_name_in_view(Obiview_p view, OBIDMS_column_p column, index_t line_nb, const char* element_name, const char* value);
/**
* @brief Sets a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line,
* in the context of a view.
*
* This function is for quality scores in the integer array format.
*
* @warning Pointers returned by obi_open_column() don't allow writing.
*
* @param view A pointer on the opened view.
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be set.
* @param element_name The name of the element that should be set in the line.
* @param value The value that should be set, in the integer format.
* @param value_length The length of the integer array.
*
* @returns An integer value indicating the success of the operation.
* @retval 0 on success.
* @retval -1 if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
int obi_column_set_obiqual_int_with_elt_name_in_view(Obiview_p view, OBIDMS_column_p column, index_t line_nb, const char* element_name, const uint8_t* value, int value_length);
/**
* @brief Recovers a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line,
* in the context of a view.
*
* This function returns quality scores in the character string format.
*
* @param view A pointer on the opened view.
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be recovered.
* @param element_name The name of the element that should be recovered in the line.
*
* @returns The recovered value, in the character string format.
* @retval OBIQual_char_NA the NA value of the type if an error occurred and obi_errno is set.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
char* obi_column_get_obiqual_char_with_elt_name_in_view(Obiview_p view, OBIDMS_column_p column, index_t line_nb, const char* element_name);
/**
* @brief Recovers a value in an OBIDMS column containing data in the form of indices referring
* to sequence qualities handled by an indexer, and using the index of the element in the column's line,
* in the context of a view.
*
* This function returns quality scores in the integer array format.
*
* @param view A pointer on the opened view.
* @param column A pointer as returned by obi_create_column() or obi_clone_column().
* @param line_nb The number of the line where the value should be recovered.
* @param element_name The name of the element that should be recovered in the line.
* @param value_length A pointer on an integer to store the length of the integer array recovered.
*
* @returns The recovered value, in the integer format.
* @retval OBIQual_int_NA the NA value of the type if an error occurred and obi_errno is set.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
const uint8_t* obi_column_get_obiqual_int_with_elt_name_in_view(Obiview_p view, OBIDMS_column_p column, index_t line_nb, const char* element_name, int* value_length);
/**
* @brief Sets a value in an OBIDMS column containing data with the type OBI_SEQ, using the index of the element in the line,
* in the context of a view.

View File

@ -0,0 +1,696 @@
/*
* sse_banded_LCS_alignment.c
*
* Created on: 7 nov. 2012
* Author: celine mercier
*/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <stdint.h>
#include <stdbool.h>
#include "obierrno.h"
#include "obidebug.h"
#include "utils.h"
#include "_sse.h"
#include "sse_banded_LCS_alignment.h"
#define DEBUG_LEVEL 0 // TODO has to be defined somewhere else (cython compil flag?)
static void printreg(__m128i r)
{
int16_t a0,a1,a2,a3,a4,a5,a6,a7;
a0= _MM_EXTRACT_EPI16(r,0);
a1= _MM_EXTRACT_EPI16(r,1);
a2= _MM_EXTRACT_EPI16(r,2);
a3= _MM_EXTRACT_EPI16(r,3);
a4= _MM_EXTRACT_EPI16(r,4);
a5= _MM_EXTRACT_EPI16(r,5);
a6= _MM_EXTRACT_EPI16(r,6);
a7= _MM_EXTRACT_EPI16(r,7);
fprintf(stderr, "a00 :-> %7d %7d %7d %7d "
" %7d %7d %7d %7d "
"\n"
, a0,a1,a2,a3,a4,a5,a6,a7
);
}
static inline int extract_reg(__m128i r, int p)
{
switch (p) {
case 0: return(_MM_EXTRACT_EPI16(r,0));
case 1: return(_MM_EXTRACT_EPI16(r,1));
case 2: return(_MM_EXTRACT_EPI16(r,2));
case 3: return(_MM_EXTRACT_EPI16(r,3));
case 4: return(_MM_EXTRACT_EPI16(r,4));
case 5: return(_MM_EXTRACT_EPI16(r,5));
case 6: return(_MM_EXTRACT_EPI16(r,6));
case 7: return(_MM_EXTRACT_EPI16(r,7));
}
return(0);
}
// TODO warning on length order
void sse_banded_align_lcs_and_ali_len(int16_t* seq1, int16_t* seq2, int l1, int l2, int bandLengthLeft, int bandLengthTotal, int16_t* address, double* lcs_length, int* ali_length)
{
register int j;
int k1, k2;
int max, diff;
int l_reg, l_loc;
int line;
int numberOfRegistersPerLine;
int numberOfRegistersFor3Lines;
bool even_line;
bool odd_line;
bool even_BLL;
bool odd_BLL;
um128* SSEregisters;
um128* p_diag;
um128* p_gap1;
um128* p_gap2;
um128* p_diag_j;
um128* p_gap1_j;
um128* p_gap2_j;
um128 current;
um128* l_ali_SSEregisters;
um128* p_l_ali_diag;
um128* p_l_ali_gap1;
um128* p_l_ali_gap2;
um128* p_l_ali_diag_j;
um128* p_l_ali_gap1_j;
um128* p_l_ali_gap2_j;
um128 l_ali_current;
um128 nucs1;
um128 nucs2;
um128 scores;
um128 boolean_reg;
// Initialisations
odd_BLL = bandLengthLeft & 1;
even_BLL = !odd_BLL;
max = INT16_MAX - l1;
numberOfRegistersPerLine = bandLengthTotal / 8;
numberOfRegistersFor3Lines = 3 * numberOfRegistersPerLine;
SSEregisters = (um128*) calloc(numberOfRegistersFor3Lines * 2, sizeof(um128));
l_ali_SSEregisters = SSEregisters + numberOfRegistersFor3Lines;
// preparer registres SSE
for (j=0; j<numberOfRegistersFor3Lines; j++)
l_ali_SSEregisters[j].i = _MM_LOAD_SI128(address+j*8);
p_diag = SSEregisters;
p_gap1 = SSEregisters+numberOfRegistersPerLine;
p_gap2 = SSEregisters+2*numberOfRegistersPerLine;
p_l_ali_diag = l_ali_SSEregisters;
p_l_ali_gap1 = l_ali_SSEregisters+numberOfRegistersPerLine;
p_l_ali_gap2 = l_ali_SSEregisters+2*numberOfRegistersPerLine;
// Loop on diagonals = 'lines' :
for (line=2; line <= l1+l2; line++)
{
odd_line = line & 1;
even_line = !odd_line;
// loop on the registers of a line :
for (j=0; j < numberOfRegistersPerLine; j++)
{
p_diag_j = p_diag+j;
p_gap1_j = p_gap1+j;
p_gap2_j = p_gap2+j;
p_l_ali_diag_j = p_l_ali_diag+j;
p_l_ali_gap1_j = p_l_ali_gap1+j;
p_l_ali_gap2_j = p_l_ali_gap2+j;
// comparing nucleotides for diagonal scores :
// k1 = position of the 1st nucleotide to align for seq1 and k2 = position of the 1st nucleotide to align for seq2
if (odd_line && odd_BLL)
k1 = (line / 2) + ((bandLengthLeft+1) / 2) - j*8;
else
k1 = (line / 2) + (bandLengthLeft/2) - j*8;
k2 = line - k1 - 1;
nucs1.i = _MM_LOADU_SI128(seq1+l1-k1);
nucs2.i = _MM_LOADU_SI128(seq2+k2);
/* fprintf(stderr, "\nnucs, r %d, k1 = %d, k2 = %d\n", j, k1, k2);
printreg(nucs1.i);
printreg(nucs2.i);
*/
// computing diagonal score :
scores.i = _MM_AND_SI128(_MM_CMPEQ_EPI16(nucs1.i, nucs2.i), _MM_SET1_EPI16(1));
current.i = _MM_ADDS_EPU16(p_diag_j->i, scores.i);
// Computing alignment length
l_ali_current.i = p_l_ali_diag_j->i;
boolean_reg.i = _MM_CMPGT_EPI16(p_gap1_j->i, current.i);
l_ali_current.i = _MM_OR_SI128(
_MM_AND_SI128(p_l_ali_gap1_j->i, boolean_reg.i),
_MM_ANDNOT_SI128(boolean_reg.i, l_ali_current.i));
current.i = _MM_OR_SI128(
_MM_AND_SI128(p_gap1_j->i, boolean_reg.i),
_MM_ANDNOT_SI128(boolean_reg.i, current.i));
boolean_reg.i = _MM_AND_SI128(
_MM_CMPEQ_EPI16(p_gap1_j->i, current.i),
_MM_CMPLT_EPI16(p_l_ali_gap1_j->i, l_ali_current.i));
l_ali_current.i = _MM_OR_SI128(
_MM_AND_SI128(p_l_ali_gap1_j->i, boolean_reg.i),
_MM_ANDNOT_SI128(boolean_reg.i, l_ali_current.i));
current.i = _MM_OR_SI128(
_MM_AND_SI128(p_gap1_j->i, boolean_reg.i),
_MM_ANDNOT_SI128(boolean_reg.i, current.i));
boolean_reg.i = _MM_CMPGT_EPI16(p_gap2_j->i, current.i);
l_ali_current.i = _MM_OR_SI128(
_MM_AND_SI128(p_l_ali_gap2_j->i, boolean_reg.i),
_MM_ANDNOT_SI128(boolean_reg.i, l_ali_current.i));
current.i = _MM_OR_SI128(
_MM_AND_SI128(p_gap2_j->i, boolean_reg.i),
_MM_ANDNOT_SI128(boolean_reg.i, current.i));
boolean_reg.i = _MM_AND_SI128(
_MM_CMPEQ_EPI16(p_gap2_j->i, current.i),
_MM_CMPLT_EPI16(p_l_ali_gap2_j->i, l_ali_current.i));
l_ali_current.i = _MM_OR_SI128(
_MM_AND_SI128(p_l_ali_gap2_j->i, boolean_reg.i),
_MM_ANDNOT_SI128(boolean_reg.i, l_ali_current.i));
current.i = _MM_OR_SI128(
_MM_AND_SI128(p_gap2_j->i, boolean_reg.i),
_MM_ANDNOT_SI128(boolean_reg.i, current.i));
/*
fprintf(stderr, "\nline = %d", line);
fprintf(stderr, "\nDiag, r %d : ", j);
printreg((*(p_diag_j)).i);
fprintf(stderr, "Gap1 : ");
printreg((*(p_gap1_j)).i);
fprintf(stderr, "Gap2 : ");
printreg((*(p_gap2_j)).i);
fprintf(stderr, "current : ");
printreg(current.i);
fprintf(stderr, "L ALI\nDiag r %d : ", j);
printreg((*(p_l_ali_diag_j)).i);
fprintf(stderr, "Gap1 : ");
printreg((*(p_l_ali_gap1_j)).i);
fprintf(stderr, "Gap2 : ");
printreg((*(p_l_ali_gap2_j)).i);
fprintf(stderr, "current : ");
printreg(l_ali_current.i);
*/
// diag = gap1 and gap1 = current
p_diag_j->i = p_gap1_j->i;
p_gap1_j->i = current.i;
// l_ali_diag = l_ali_gap1 and l_ali_gap1 = l_ali_current+1
p_l_ali_diag_j->i = p_l_ali_gap1_j->i;
p_l_ali_gap1_j->i = _MM_ADD_EPI16(l_ali_current.i, _MM_SET1_EPI16(1));
}
// shifts for gap2, to do only once all the registers of a line have been computed Copier gap2 puis le charger depuis la copie?
for (j=0; j < numberOfRegistersPerLine; j++)
{
if ((odd_line && even_BLL) || (even_line && odd_BLL))
{
p_gap2[j].i = _MM_LOADU_SI128((p_gap1[j].s16)-1);
p_l_ali_gap2[j].i = _MM_LOADU_SI128((p_l_ali_gap1[j].s16)-1);
if (j == 0)
{
p_gap2[j].i = _MM_INSERT_EPI16(p_gap2[j].i, 0, 0);
p_l_ali_gap2[j].i = _MM_INSERT_EPI16(p_l_ali_gap2[j].i, max, 0);
}
}
else
{
p_gap2[j].i = _MM_LOADU_SI128(p_gap1[j].s16+1);
p_l_ali_gap2[j].i = _MM_LOADU_SI128(p_l_ali_gap1[j].s16+1);
if (j == numberOfRegistersPerLine - 1)
{
p_gap2[j].i = _MM_INSERT_EPI16(p_gap2[j].i, 0, 7);
p_l_ali_gap2[j].i = _MM_INSERT_EPI16(p_l_ali_gap2[j].i, max, 7);
}
}
}
// end shifts for gap2
}
/* /// Recovering LCS and alignment lengths \\\ */
// finding the location of the results in the registers :
diff = l1-l2;
if ((diff & 1) && odd_BLL)
l_loc = (int) floor((double)(bandLengthLeft) / (double)2) - floor((double)(diff) / (double)2);
else
l_loc = (int) floor((double)(bandLengthLeft) / (double)2) - ceil((double)(diff) / (double)2);
l_reg = (int)floor((double)l_loc/(double)8.0);
//fprintf(stderr, "\nl_reg = %d, l_loc = %d\n", l_reg, l_loc);
l_loc = l_loc - l_reg*8;
// extracting the results from the registers :
*lcs_length = extract_reg(p_gap1[l_reg].i, l_loc);
*ali_length = extract_reg(p_l_ali_gap1[l_reg].i, l_loc) - 1;
// freeing the registers
free(SSEregisters);
}
// TODO warning on length order
double sse_banded_align_just_lcs(int16_t* seq1, int16_t* seq2, int l1, int l2, int bandLengthLeft, int bandLengthTotal)
{
register int j;
int k1, k2;
int diff;
int l_reg, l_loc;
int16_t l_lcs;
int line;
int numberOfRegistersPerLine;
int numberOfRegistersFor3Lines;
bool even_line;
bool odd_line;
bool even_BLL;
bool odd_BLL;
um128* SSEregisters;
um128* p_diag;
um128* p_gap1;
um128* p_gap2;
um128* p_diag_j;
um128* p_gap1_j;
um128* p_gap2_j;
um128 current;
um128 nucs1;
um128 nucs2;
um128 scores;
// Initialisations
odd_BLL = bandLengthLeft & 1;
even_BLL = !odd_BLL;
numberOfRegistersPerLine = bandLengthTotal / 8;
numberOfRegistersFor3Lines = 3 * numberOfRegistersPerLine;
SSEregisters = malloc(numberOfRegistersFor3Lines * sizeof(um128));
if (SSEregisters == NULL)
{
obi_set_errno(OBI_MALLOC_ERROR);
obidebug(1, "\nError allocating memory for SSE registers for LCS alignment");
return 0; // TODO DOUBLE_MIN?
}
// preparer registres SSE
for (j=0; j<numberOfRegistersFor3Lines; j++)
(*(SSEregisters+j)).i = _MM_SETZERO_SI128();
p_diag = SSEregisters;
p_gap1 = SSEregisters+numberOfRegistersPerLine;
p_gap2 = SSEregisters+2*numberOfRegistersPerLine;
// Loop on diagonals = 'lines' :
for (line=2; line <= l1+l2; line++)
{
odd_line = line & 1;
even_line = !odd_line;
// loop on the registers of a line :
for (j=0; j < numberOfRegistersPerLine; j++)
{
p_diag_j = p_diag+j;
p_gap1_j = p_gap1+j;
p_gap2_j = p_gap2+j;
// comparing nucleotides for diagonal scores :
// k1 = position of the 1st nucleotide to align for seq1 and k2 = position of the 1st nucleotide to align for seq2
if (odd_line && odd_BLL)
k1 = (line / 2) + ((bandLengthLeft+1) / 2) - j*8;
else
k1 = (line / 2) + (bandLengthLeft/2) - j*8;
k2 = line - k1 - 1;
nucs1.i = _MM_LOADU_SI128(seq1+l1-k1);
nucs2.i = _MM_LOADU_SI128(seq2+k2);
// computing diagonal score :
scores.i = _MM_AND_SI128(_MM_CMPEQ_EPI16(nucs1.i, nucs2.i), _MM_SET1_EPI16(1));
current.i = _MM_ADDS_EPU16((*(p_diag_j)).i, scores.i);
// current = max(gap1, current)
current.i = _MM_MAX_EPI16((*(p_gap1_j)).i, current.i);
// current = max(gap2, current)
current.i = _MM_MAX_EPI16((*(p_gap2_j)).i, current.i);
// diag = gap1 and gap1 = current
(*(p_diag_j)).i = (*(p_gap1_j)).i;
(*(p_gap1_j)).i = current.i;
}
// shifts for gap2, to do only once all the registers of a line have been computed
for (j=0; j < numberOfRegistersPerLine; j++)
{
if ((odd_line && even_BLL) || (even_line && odd_BLL))
{
(*(p_gap2+j)).i = _MM_LOADU_SI128(((*(p_gap1+j)).s16)-1);
if (j == 0)
{
(*(p_gap2+j)).i = _MM_INSERT_EPI16((*(p_gap2+j)).i, 0, 0);
}
}
else
{
(*(p_gap2+j)).i = _MM_LOADU_SI128(((*(p_gap1+j)).s16)+1);
if (j == numberOfRegistersPerLine - 1)
{
(*(p_gap2+j)).i = _MM_INSERT_EPI16((*(p_gap2+j)).i, 0, 7);
}
}
}
// end shifts for gap2
}
/* /// Recovering LCS and alignment lengths \\\ */
// finding the location of the results in the registers :
diff = l1-l2;
if ((diff & 1) && odd_BLL)
l_loc = (int) floor((double)(bandLengthLeft) / (double)2) - floor((double)(diff) / (double)2);
else
l_loc = (int) floor((double)(bandLengthLeft) / (double)2) - ceil((double)(diff) / (double)2);
l_reg = (int)floor((double)l_loc/(double)8.0);
//fprintf(stderr, "\nl_reg = %d, l_loc = %d\n", l_reg, l_loc);
l_loc = l_loc - l_reg*8;
// extracting LCS from the registers :
l_lcs = extract_reg((*(p_gap1+l_reg)).i, l_loc);
// freeing the registers
free(SSEregisters);
return((double) l_lcs);
}
int calculateLeftBandLength(int lmax, int LCSmin)
{
return (lmax - LCSmin);
}
int calculateRightBandLength(int lmin, int LCSmin)
{
return (lmin - LCSmin);
}
int calculateSSEBandLength(int bandLengthRight, int bandLengthLeft)
{
int bandLengthTotal= (double)(bandLengthRight + bandLengthLeft) / 2.0 + 1.0;
return (bandLengthTotal & (~ (int)7)) + (( bandLengthTotal & (int)7) ? 8:0); // Calcule le multiple de 8 superieur
}
// TODO that's gonna be fun to doc
int calculateSizeToAllocate(int maxLen, int minLen, int LCSmin)
{
int size;
size = calculateLeftBandLength(maxLen, LCSmin);
size *= 2;
size = (size & (~ (int)7)) + (( size & (int)7) ? 8:0); // Closest greater 8 multiple
size *= 3;
size += 16;
return(size*sizeof(int16_t));
}
void iniSeq(int16_t* seq, int size, int16_t iniValue)
{
int16_t* target = seq;
int16_t* end = target + (size_t)size;
for (; target < end; target++)
*target = iniValue;
}
void putSeqInSeq(int16_t* seq, char* s, int l, bool reverse)
{
int16_t *target=seq;
int16_t *end = target + (size_t)l;
char *source=s;
if (reverse)
for (source=s + (size_t)l-1; target < end; target++, source--)
*target=*source;
else
for (; target < end; source++,target++)
*target=*source;
}
void initializeAddressWithGaps(int16_t* address, int bandLengthTotal, int bandLengthLeft, int l1)
{
int i;
int address_00, x_address_10, address_01, address_01_shifted;
int numberOfRegistersPerLine;
int bm;
int value=INT16_MAX-l1;
numberOfRegistersPerLine = bandLengthTotal / 8;
bm = bandLengthLeft%2;
for (i=0; i < (3*numberOfRegistersPerLine*8); i++)
address[i] = value;
// 0,0 set to 1 and 0,1 and 1,0 set to 2
address_00 = bandLengthLeft / 2;
x_address_10 = address_00 + bm - 1;
address_01 = numberOfRegistersPerLine*8 + x_address_10;
address_01_shifted = numberOfRegistersPerLine*16 + address_00 - bm;
// fill address_00, address_01,+1, address_01_shifted,+1
address[address_00] = 1;
address[address_01] = 2;
address[address_01+1] = 2;
address[address_01_shifted] = 2;
address[address_01_shifted+1] = 2;
}
// TODO warning on length order
double sse_banded_lcs_align(int16_t* seq1, int16_t* seq2, int l1, int l2, bool normalize, int reference, bool similarity_mode, int16_t* address, int LCSmin)
{
double id;
int bandLengthRight, bandLengthLeft, bandLengthTotal;
int ali_length;
bandLengthLeft = calculateLeftBandLength(l1, LCSmin);
bandLengthRight = calculateRightBandLength(l2, LCSmin);
// fprintf(stderr, "\nBLL = %d, BLR = %d, LCSmin = %d\n", bandLengthLeft, bandLengthRight, LCSmin);
bandLengthTotal = calculateSSEBandLength(bandLengthRight, bandLengthLeft);
// fprintf(stderr, "\nBLT = %d\n", bandLengthTotal);
if ((reference == ALILEN) && (normalize || !similarity_mode))
{
initializeAddressWithGaps(address, bandLengthTotal, bandLengthLeft, l1);
sse_banded_align_lcs_and_ali_len(seq1, seq2, l1, l2, bandLengthLeft, bandLengthTotal, address, &id, &ali_length);
}
else
id = sse_banded_align_just_lcs(seq1, seq2, l1, l2, bandLengthLeft, bandLengthTotal);
// fprintf(stderr, "\nid before normalizations = %f", id);
// fprintf(stderr, "\nlcs = %f, ali = %d\n", id, ali_length);
if (!similarity_mode && !normalize)
switch(reference) {
case ALILEN: id = ali_length - id;
break;
case MAXLEN: id = l1 - id;
break;
case MINLEN: id = l2 - id;
}
// fprintf(stderr, "\n2>>> %f, %d\n", id, ali_length);
if (normalize)
switch(reference) {
case ALILEN: id = id / (double) ali_length;
break;
case MAXLEN: id = id / (double) l1;
break;
case MINLEN: id = id / (double) l2;
}
// fprintf(stderr, "\nid = %f\n", id);
return(id);
}
// PUBLIC FUNCTIONS
int calculateLCSmin(int l1, int l2, double threshold, bool normalize, int reference, bool similarity_mode)
{
int LCSmin;
if (threshold > 0)
{
if (normalize)
{
if (reference == MINLEN)
LCSmin = threshold*l2;
else // ref = maxlen or alilen
LCSmin = threshold*l1;
}
else if (similarity_mode)
LCSmin = threshold;
else if (reference == MINLEN) // not similarity_mode
LCSmin = l2 - threshold;
else // not similarity_mode and ref = maxlen or alilen
LCSmin = l1 - threshold;
}
else
LCSmin = 0;
return(LCSmin);
}
double generic_sse_banded_lcs_align(char* seq1, char* seq2, double threshold, bool normalize, int reference, bool similarity_mode)
{
double id;
int l1, l2;
int lmax, lmin;
int sizeToAllocateForBand, sizeToAllocateForSeqs;
int maxBLL;
int LCSmin;
int shift;
int16_t* address;
int16_t* iseq1;
int16_t* iseq2;
address = NULL;
l1 = strlen(seq1);
l2 = strlen(seq2);
if (l1 > l2)
{
lmax = l1;
lmin = l2;
}
else
{
lmax = l2;
lmin = l1;
}
// If the score is expressed as a normalized distance, get the corresponding identity
if (!similarity_mode && normalize)
threshold = 1.0 - threshold;
// Calculate the minimum LCS length corresponding to the threshold
LCSmin = calculateLCSmin(lmax, lmin, threshold, normalize, reference, similarity_mode);
// Allocate space for matrix band if the alignment length must be computed
if ((reference == ALILEN) && (normalize || !similarity_mode)) // cases in which alignment length must be computed
{
sizeToAllocateForBand = calculateSizeToAllocate(lmax, lmin, LCSmin);
address = obi_get_memory_aligned_on_16(sizeToAllocateForBand, &shift);
if (address == NULL)
{
obi_set_errno(OBI_MALLOC_ERROR);
obidebug(1, "\nError getting a memory address aligned on 16 bytes boundary");
return 0; // TODO DOUBLE_MIN
}
}
// Allocate space for the int16_t arrays representing the sequences
maxBLL = calculateLeftBandLength(lmax, LCSmin);
sizeToAllocateForSeqs = 2*maxBLL+lmax;
iseq1 = (int16_t*) malloc(sizeToAllocateForSeqs*sizeof(int16_t));
iseq2 = (int16_t*) malloc(sizeToAllocateForSeqs*sizeof(int16_t));
if ((iseq1 == NULL) || (iseq2 == NULL))
{
obi_set_errno(OBI_MALLOC_ERROR);
obidebug(1, "\nError allocating memory for integer arrays to use in LCS alignment");
return 0; // TODO DOUBLE_MIN
}
// Initialize the int arrays
iniSeq(iseq1, (2*maxBLL)+lmax, 0);
iniSeq(iseq2, (2*maxBLL)+lmax, 255);
// Shift addresses to where the sequences have to be put
iseq1 = iseq1+maxBLL;
iseq2 = iseq2+maxBLL;
// Put the DNA sequences in the int arrays. Longest sequence must be first argument of sse_align function
if (l2 > l1)
{
putSeqInSeq(iseq1, seq2, l2, TRUE);
putSeqInSeq(iseq2, seq1, l1, FALSE);
// Compute alignment
id = sse_banded_lcs_align(iseq1, iseq2, l2, l1, normalize, reference, similarity_mode, address, LCSmin);
}
else
{
putSeqInSeq(iseq1, seq1, l1, TRUE);
putSeqInSeq(iseq2, seq2, l2, FALSE);
// Compute alignment
id = sse_banded_lcs_align(iseq1, iseq2, l1, l2, normalize, reference, similarity_mode, address, LCSmin);
}
// Free allocated elements
if (address != NULL)
free(address-shift);
free(iseq1-maxBLL);
free(iseq2-maxBLL);
return(id);
}

View File

@ -0,0 +1,23 @@
/*
* sse_banded_LCS_alignment.h
*
* Created on: november 29, 2012
* Author: mercier
*/
#ifndef SSE_BANDED_LCS_ALIGNMENT_H_
#define SSE_BANDED_LCS_ALIGNMENT_H_
#include <stdint.h>
#include <stdbool.h>
#define ALILEN (0) // TODO enum
#define MAXLEN (1)
#define MINLEN (2)
// TODO doc
int calculateLCSmin(int l1, int l2, double threshold, bool normalize, int reference, bool lcsmode);
double generic_sse_banded_lcs_align(char* seq1, char* seq2, double threshold, bool normalize, int reference, bool similarity_mode);
#endif

70
src/uint8_indexer.c Normal file
View File

@ -0,0 +1,70 @@
/****************************************************************************
* Uint8 indexing functions *
****************************************************************************/
/**
* @file uint8_indexer.c
* @author Celine Mercier
* @date May 4th 2016
* @brief Functions handling the indexing and retrieval of uint8 arrays.
*/
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <math.h>
#include "uint8_indexer.h"
#include "obiblob.h"
#include "obiblob_indexer.h"
#include "obidebug.h"
#include "obitypes.h"
#define DEBUG_LEVEL 0 // TODO has to be defined somewhere else (cython compil flag?)
Obi_blob_p obi_uint8_to_blob(const uint8_t* value, int value_length)
{
return obi_blob((byte_t*)value, ELEMENT_SIZE_UINT8, value_length, value_length);
}
const uint8_t* obi_blob_to_uint8(Obi_blob_p value_b)
{
return ((uint8_t*) (value_b->value));
}
index_t obi_index_uint8(Obi_indexer_p indexer, const uint8_t* value, int value_length)
{
Obi_blob_p value_b;
index_t idx;
// Encode value
value_b = obi_uint8_to_blob(value, value_length);
if (value_b == NULL)
return -1;
// Add in the indexer
idx = obi_indexer_add(indexer, value_b);
free(value_b);
return idx;
}
const uint8_t* obi_retrieve_uint8(Obi_indexer_p indexer, index_t idx, int* value_length)
{
Obi_blob_p value_b;
// Get encoded value
value_b = obi_indexer_get(indexer, idx);
// Return decoded sequence
*value_length = value_b->length_decoded_value;
return obi_blob_to_uint8(value_b);
}

92
src/uint8_indexer.h Normal file
View File

@ -0,0 +1,92 @@
/****************************************************************************
* uint8 indexer header file *
****************************************************************************/
/**
* @file uint8_indexer.h
* @author Celine Mercier
* @date May 4th 2016
* @brief Header file for the functions handling the indexing of uint8 arrays.
*/
#ifndef UINT8_INDEXER_H_
#define UINT8_INDEXER_H_
#include <stdlib.h>
#include <stdio.h>
#include "obidms.h"
#include "obitypes.h"
#include "obiblob.h"
#include "obiblob_indexer.h"
/**
* @brief Converts an uint8 array to a blob.
*
* @warning The blob must be freed by the caller.
*
* @param value The uint8 array to convert.
* @param value_length The length of the uint8 array to convert.
*
* @returns A pointer on the blob created.
* @retval NULL if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
Obi_blob_p obi_uint8_to_blob(const uint8_t* value, int value_length);
/**
* @brief Converts a blob to an uint8 array.
*
* @warning The array returned is mapped.
*
* @param value_b The blob to convert.
*
* @returns A pointer on the uint8 array contained in the blob.
* @retval NULL if an error occurred.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
const uint8_t* obi_blob_to_uint8(Obi_blob_p value_b);
/**
* @brief Stores an uint8 array in an indexer and returns the index.
*
* @param indexer The indexer structure.
* @param value The uint8 array to index.
* @param value_length The length of the uint8 array to index.
*
* @returns The index referring to the stored uint8 array in the indexer.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
index_t obi_index_uint8(Obi_indexer_p indexer, const uint8_t* value, int value_length);
/**
* @brief Retrieves an uint8 array from an indexer.
*
* @warning The array returned is mapped.
*
* @param indexer The indexer structure.
* @param idx The index referring to the uint8 array to retrieve in the indexer.
* @param value_length A pointer on an integer to store the length of the array retrieved.
*
* @returns A pointer on the uint8 array.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
const uint8_t* obi_retrieve_uint8(Obi_indexer_p indexer, index_t idx, int* value_length);
#endif /* UINT8_INDEXER_H_ */

View File

@ -65,6 +65,12 @@ char* obi_format_date(time_t date)
struct tm* tmp;
formatted_time = (char*) malloc(FORMATTED_TIME_LENGTH*sizeof(char));
if (formatted_time == NULL)
{
obi_set_errno(OBI_MALLOC_ERROR);
obidebug(1, "\nError allocating memory to format a date");
return NULL;
}
tmp = localtime(&date);
if (tmp == NULL)
@ -84,3 +90,27 @@ char* obi_format_date(time_t date)
return formatted_time;
}
void* obi_get_memory_aligned_on_16(int size, int* shift)
{
void* memory;
*shift = 0;
memory = (void*) malloc(size);
if (memory == NULL)
{
obi_set_errno(OBI_MALLOC_ERROR);
obidebug(1, "\nError allocating memory");
return NULL;
}
while ((((long long unsigned int) (memory))%16) != 0)
{
memory++;
(*shift)++;
}
return (memory);
}

View File

@ -53,4 +53,25 @@ int count_dir(char* dir_path);
char* obi_format_date(time_t date);
/**
* @brief Allocates a chunk of memory aligned on 16 bytes boundary.
*
* @warning The pointer returned must be freed by the caller.
* @warning The memory chunk pointed at by the returned pointer can be
* smaller than the size asked for, since the address is shifted
* to be aligned.
*
* @param size The size in bytes of the memory chunk to be allocated.
* Ideally the closest multiple of 16 greater than the wanted size.
* @param shift A pointer on an integer corresponding to the shift made to align
* the address. Used to free the memory chunk.
*
* @returns The pointer on the aligned memory.
*
* @since May 2016
* @author Celine Mercier (celine.mercier@metabarcoding.org)
*/
void* obi_get_memory_aligned_on_16(int size, int* shift);
#endif /* UTILS_H_ */