Compare commits

204 Commits

Author SHA1 Message Date
Celine Mercier
717ee46f08 Commented a loose print 2017-07-05 18:02:18 +02:00
Celine Mercier
313508cc94 Better *Seq* classes but still need work 2017-07-05 17:53:46 +02:00
Celine Mercier
535fc2af83 Column rewriter and optimized View getter 2017-07-05 17:49:05 +02:00
Celine Mercier
3bbc2ae469 More optimized Column item getter 2017-07-05 17:37:19 +02:00
Celine Mercier
5ee0b3989a Cython API: set_line of Column_multi_elts now accepts as values argument
any class where values are referenced by keys with an iterator
2017-07-05 17:32:32 +02:00
Celine Mercier
d10192ab0e C functions to detect IUPAC sequences 2017-07-05 17:26:03 +02:00
Celine Mercier
101f764cce New obi import with rewriting of columns when column type or line
elements (keys) change
2017-07-05 17:15:23 +02:00
Celine Mercier
cb5ad2ed2d Added functions to try to open a DMS if it exists 2017-07-05 15:38:22 +02:00
Celine Mercier
f5e992abbf Added a check on the element when setting a value in a column 2017-07-05 14:49:20 +02:00
Celine Mercier
1d2996c6c0 Better handling and tracing of Index Errors between C and Cython 2017-07-05 14:45:43 +02:00
Celine Mercier
f6631f3857 Removed deprecated declarations 2017-07-05 14:42:21 +02:00
Celine Mercier
3f5fef10b9 obi test: minor changes 2017-07-05 14:37:27 +02:00
Celine Mercier
20c72af697 Basic obi check command to check DMS and view information 2017-07-05 13:54:19 +02:00
Celine Mercier
d252131950 Basic obi less command 2017-07-05 13:44:12 +02:00
Celine Mercier
ca16ce0bb0 Basic obi grep with new Cython API 2017-07-05 11:58:10 +02:00
Celine Mercier
ac94b35336 Removed unused import 2017-07-05 11:52:31 +02:00
Celine Mercier
2d65db4ebc Goes with c2af955b : forgotten files for NUC_SEQS views 2017-04-21 15:15:12 +02:00
Celine Mercier
4b037ae236 Updated obi test to test NUC_SEQS views and the taxonomy API 2017-04-21 12:09:04 +02:00
Celine Mercier
c2af955b78 Cython view API: added NUC_SEQS views and sequence classes + changed
cloning API
2017-04-21 12:08:14 +02:00
Celine Mercier
71b1a43df8 Added functions to clone views with a simpler API 2017-04-21 11:58:15 +02:00
Celine Mercier
1725b8b80c Reworked taxonomy Cython API to be a subclass of OBIWrapper 2017-04-21 11:54:05 +02:00
Celine Mercier
ab0d08293e Cython API: removed unnecessary imports 2017-04-21 11:51:05 +02:00
Celine Mercier
2f0c4b90d7 Fixed a problem where a view would have a wrong line count after adding
a first column to it if there was already a Line selection associated
(happening when cloning), and fixed a bad error check.
2017-04-14 16:25:55 +02:00
Celine Mercier
537b9847da Minor C doc clarification 2017-04-14 16:23:17 +02:00
Celine Mercier
b998373be5 Cython API: updated the test command for the new API and deactivated the
other commands for now
2017-04-14 16:21:33 +02:00
Celine Mercier
6f780148e2 Cython API: added taxonomy API 2017-04-14 16:20:30 +02:00
Celine Mercier
0e08fc486a Cython API: fixed bug when deleting a column from a view where the
Cython wrapper wasn't closed, and fixed the Line selection
materialization
2017-04-14 16:19:18 +02:00
Celine Mercier
2bbee64e57 Cython API: fixed problems with Column class 2017-04-14 16:14:41 +02:00
Celine Mercier
693859eec2 Cython API: fixed conversion bugs when setting and getting values
(especially NA values) in OBI_CHAR, OBI_STR and OBI_SEQ columns
2017-04-14 16:07:23 +02:00
Celine Mercier
a3fad27190 Cython API: automatic importing of column classes now works 2017-04-06 15:45:02 +02:00
Celine Mercier
f351540b0b Merge branch 'Eric_new_Python_API' of git@git.metabarcoding.org:obitools/obitools3.git into Eric_new_Python_API 2017-04-06 15:39:52 +02:00
6dccaa0213 Patch the registering function : register_all_column_classes 2017-04-06 15:37:51 +02:00
Celine Mercier
5de9e0de51 Cython API: now using const char* instead of char* for the type of
values read from OBI_STR columns
2017-04-06 15:15:20 +02:00
Celine Mercier
ad8de80353 Views: better checks when adding an existing column to a view 2017-04-06 14:44:07 +02:00
Celine Mercier
8cd3e3604f Cython Column API 2017-04-06 14:42:11 +02:00
Celine Mercier
255f3c92ae Cython View API 2017-04-06 14:41:58 +02:00
Celine Mercier
08be4e231d Cython Object API 2017-04-06 14:41:43 +02:00
Celine Mercier
b5b7995411 new Cython DMS API 2017-04-06 14:41:26 +02:00
Celine Mercier
0dfb1eb3e6 Cython typed columns 2017-04-06 14:40:44 +02:00
Celine Mercier
381194194c Cython API: compiling but not working 2017-03-06 16:07:02 +01:00
Celine Mercier
778acc48cd Added linked lists to handle lists of column pointers in views (not
tested)
2017-03-06 16:06:17 +01:00
Celine Mercier
3319ede837 Views: Column dictionaries now store and return pointers on column
pointers instead of column pointers.
2017-02-22 13:49:50 +01:00
Celine Mercier
fc20b83ad1 Merging 2017-02-20 14:56:04 +01:00
Celine Mercier
431c1c8c6a Merge branch 'Eric_new_Python_API' of
git@git.metabarcoding.org:obitools/obitools3.git into
Eric_new_Python_API

Conflicts:
	python/obitools3/obidms/_obidms.pxd
	python/obitools3/obidms/_obidms.pyx
	python/obitools3/obidms/_obidmscolumn_bool.pyx
	python/obitools3/obidms/_obidmscolumn_str.pyx
	python/obitools3/obidms/_obiseq.pxd
	python/obitools3/obidms/_obiseq.pyx
	python/obitools3/obidms/_obitaxo.pxd
	python/obitools3/obidms/_obitaxo.pyx
	python/obitools3/obidms/_obiview.pxd
	python/obitools3/obidms/_obiview.pyx
	python/obitools3/obidms/_obiview_nuc_seq.pxd
	python/obitools3/obidms/_obiview_nuc_seq.pyx
	python/obitools3/obidms/_obiview_nuc_seq_qual.pxd
	python/obitools3/obidms/_obiview_nuc_seq_qual.pyx
	python/obitools3/obidms/capi/obialign.pxd
	python/obitools3/obidms/capi/obidmscolumn.pxd
	python/obitools3/obidms/capi/obitaxonomy.pxd
	python/obitools3/obidms/capi/obiview.pxd
2017-02-20 14:55:36 +01:00
Celine Mercier
f23315e26f New Cython API: compiles but doesn't work 2017-02-17 15:14:06 +01:00
Celine Mercier
071a3b61ab Merged master fixed conflict. 2017-02-14 10:58:43 +01:00
Celine Mercier
e524041013 Views: Files for unfinished views now have the extension
'.obiview_unfinished', renamed to '.obiview' when the view is finished.
2017-02-07 17:16:09 +01:00
Celine Mercier
a9102620f5 Fixed missing email address 2017-02-07 17:14:10 +01:00
Celine Mercier
7e9932f488 Fixed a C function declaration 2017-02-07 17:12:56 +01:00
Celine Mercier
e50da64ea1 The elements names when a column contains several elements per line are
now formatted with '\0' as separator and handled in a more optimized way
2017-01-31 16:48:06 +01:00
Celine Mercier
651c1d7845 utilities: bsearch and qsort with additional user_data pointer argument 2017-01-31 16:45:47 +01:00
Celine Mercier
c0bcdce724 Taxonomy: documentation for all the functions, and fixed bugs when
closing the taxonomy (overwriting of .pdx files, missing freeing, and
re-placed a misplaced condition)
2017-01-18 18:22:49 +01:00
Celine Mercier
c065c1914a Taxonomy: adding, writing and reading preferred names, changed some
function names, and fixed a bug with taxa indices not being properly
initialized
2017-01-16 17:28:20 +01:00
Celine Mercier
0385a92e02 Taxonomy: Refactored the taxdump reading, and little fixes 2017-01-11 16:36:08 +01:00
cf7f2de016 Modify __init__ and close method to deal with registration process 2017-01-10 14:26:16 +01:00
5122ad52a7 Merge branch 'Eric_new_Python_API' of git@git.metabarcoding.org:obitools/obitools3.git into Eric_new_Python_API 2017-01-10 14:07:50 +01:00
4b02ba73ac Add the OBIObject concept 2017-01-10 14:07:10 +01:00
Celine Mercier
41ad3deec0 Taxonomy: information about deleted taxids is now read from
delnodes.dmp file and added to *.adx file
2017-01-09 17:28:49 +01:00
Celine Mercier
d68374018b Taxonomy: functions to read the *.adx file (containing the deprecated
and current taxids and their corresponding indices in the taxa
structure) and to find the taxa using the merged index.
2017-01-06 15:52:21 +01:00
Celine Mercier
f396625f98 Taxonomy: function to write *.adx files 2017-01-05 15:37:13 +01:00
Celine Mercier
897032387f Taxonomy: reading merged.dmp file in taxdump 2017-01-05 14:28:36 +01:00
4a1d3167a7 Last change on my branch 2017-01-02 16:46:52 +01:00
153c22257f Last change on my branch 2017-01-02 16:46:17 +01:00
2139bfc748 refactoring... 2017-01-02 13:05:22 +01:00
65f3b16e6d Refactoring ... 2016-12-29 18:22:05 +01:00
0526386337 first working DMS class 2016-12-27 06:17:45 +01:00
62caf1346e temporarily remove some files 2016-12-26 15:03:24 +01:00
3ac6e85fb3 Big refactoring 4 2016-12-26 14:58:03 +01:00
5156f6bb9e Big refactoring 3 2016-12-26 14:18:01 +01:00
e6db2086d5 Big refactoring 2 2016-12-26 13:56:31 +01:00
daacd0df76 Strong refactoring 1 2016-12-26 13:35:31 +01:00
Celine Mercier
8e92bf6dac LCS alignment: it is now checked that sequences are not longer than what
a 16 bits integer can code for (as the LCS and alignment lengths are
kept in 16 bits registers)
2016-12-22 17:06:23 +01:00
Celine Mercier
30e4359c85 LCS alignment: documentation for all the lowest level functions 2016-12-22 17:03:51 +01:00
Celine Mercier
5c50e5b378 Embryo of code for openMP parallelization of LCS alignment but
deactivated for now because can't make it compile with cython/clang
2016-12-20 11:46:58 +01:00
3cedd00d7f Add register function for column type 2016-12-20 11:13:57 +01:00
82fbe43980 transfer method to obiviews 2016-12-20 08:18:47 +01:00
d1a972dfcb patch import 2016-12-20 08:15:42 +01:00
f43dc3e3ab separate the obicolumn classes in new files 2016-12-20 08:15:08 +01:00
Celine Mercier
9c71b06117 Removed deprecated TODOs 2016-12-19 14:36:40 +01:00
Celine Mercier
3bf5260174 Merge branch 'master' of git@git.metabarcoding.org:obitools/obitools3.git 2016-12-19 10:31:18 +01:00
Celine Mercier
857a5198e4 Updated `obi lcs` for the LCS alignment of two columns 2016-12-16 19:40:36 +01:00
Celine Mercier
d99447c12b C function for LCS alignment of two columns, and optimized and fixed
line count bug in function to align one column
2016-12-16 19:39:02 +01:00
Celine Mercier
303bd6f445 Added function to build kmer table for 2 columns, and fixed bug (with
line count) when building kmer table of one column
2016-12-16 19:10:18 +01:00
Celine Mercier
490f5fe6b9 Updated deprecated code in cython API for columns (using line count of
view instead of column)
2016-12-16 19:04:21 +01:00
Celine Mercier
191c83aafc Added missing *.cfiles 2016-12-15 15:28:34 +01:00
04d39c62ab Try for a new API 2016-12-14 08:44:44 +01:00
Celine Mercier
9b24818fe2 Refactored alignment code for minimum redundancy between the function
that aligns 1 column and the function that aligns 2 columns
2016-12-13 17:18:12 +01:00
06cb7a9a58 Some change in the way to manage access to special items of the
dictionary like sequence or quality
2016-12-13 12:49:34 +01:00
fc55fc117d Some cosmetic on the code 2016-12-13 12:48:13 +01:00
4ef5cb0d87 Move the OBIView_NUC_SEQS class to files _obiview_nuc_seq.pxd and
_obiview_nuc_seq.pyx to avoid circular inclusion
2016-12-13 12:46:49 +01:00
fc805e5443 Remove some warnings in the editor 2016-12-13 08:29:22 +01:00
8d7ef7d3d1 patch the distutils to add the C source directory in the include path.
This should solve most of the compilation problems related to .h files
located in this directory
2016-12-13 08:02:09 +01:00
Celine Mercier
8afb1644e9 Alignment: API rework. 'obi align' is now 'obi lcs', and the results are
now written to columns automatically created in the output view, all
optimally handled at the C level.
2016-12-12 11:58:59 +01:00
Celine Mercier
fa4e4ffaff Changed the cython API to create new views so as to have different
functions for the different cases
2016-12-07 14:17:57 +01:00
Celine Mercier
936be64c34 Goes with 5e0c9f87 (missing ';' and fixed compilation warnings) 2016-12-05 11:18:29 +01:00
Celine Mercier
5e0c9f878b Added the doc for the function building the element names, and a missing
free
2016-12-05 10:46:21 +01:00
Celine Mercier
852e5488c8 The default element names for columns with multiple elements per line
are now "0;1;2;...;n"
2016-12-02 17:54:51 +01:00
Celine Mercier
e60497651c Updated the documentation for the functions to set and get in the
context of a view
2016-11-30 12:22:47 +01:00
Celine Mercier
4ad8c16a73 Finished adding all the functions to directly set and get indices in
columns containing indices referring to any type of data.
2016-11-30 11:08:11 +01:00
Celine Mercier
6f6099687d Sequence alignment: if no sequence column is given and the view has the
type NUC_SEQS_VIEW, the default sequence column is aligned
2016-11-29 16:52:41 +01:00
Celine Mercier
98d0849653 Sequence alignment: added the possibility to specify the index of the
sequences to align in a column containing multiple sequences per line (C
level for now)
2016-11-29 16:15:02 +01:00
Celine Mercier
5fb025f310 When aligning, it is now quickly checked whether the sequences are
identical using their indexes
2016-11-28 11:39:29 +01:00
Celine Mercier
8ce6f6c80b Added an argument to specify whether the two sequences can be identical
when applying filters before aligning
2016-11-28 11:38:02 +01:00
Celine Mercier
3e53f9418b Added functions to recover the indexes themselves from any column
referring to indexed values
2016-11-28 11:35:19 +01:00
Celine Mercier
d40d2d0c76 Fixed error in documentation 2016-11-28 10:55:23 +01:00
Celine Mercier
f897e87600 When closing a view, it is now automatically checked that all OBI_QUAL
columns correspond to their associated OBI_SEQ column
2016-11-25 12:04:57 +01:00
Celine Mercier
70e056a2aa It is now impossible to open or clone a view that is not finished (= has
been closed at least once)
2016-11-24 11:19:07 +01:00
Celine Mercier
8abbfa203a Good file for commit 6fa9a8bd: When a view is cloned, a comment is added
to the new view specifying the name of the cloned view
2016-11-23 11:32:39 +01:00
Celine Mercier
6fa9a8bd76 When a view is cloned, a comment is added to the new view specifying the
name of the cloned view
2016-11-23 11:29:21 +01:00
Celine Mercier
76a4c6b14e Fixed a bug when cloning a view and checking its type 2016-11-23 11:28:17 +01:00
Celine Mercier
0ab9e6c05a When adding an existing column to a view, it is checked that the
column's line count is at least the view's line count. This can't be
more stringent for reasons that need to be rediscussed
2016-11-23 11:04:53 +01:00
Celine Mercier
70c49e214a Added the kmer filter to LCS alignments, and now obiblobs containing
encoded sequences are directly put in int16_t arrays for the alignment
2016-11-18 16:29:28 +01:00
Celine Mercier
08e67a090f Changed the inline functions syntax, which should make it compatible
with more compilers
2016-11-18 16:21:26 +01:00
Celine Mercier
621b4972db Functions to get obiblobs through views 2016-11-18 15:59:50 +01:00
Celine Mercier
7d022c1a52 If the indexer name is NULL when creating a column, it now becomes the
column name
2016-11-18 15:56:51 +01:00
Celine Mercier
1c71c195fc Goes with a0ebc2d8 2016-11-10 15:01:29 +01:00
Celine Mercier
54cfeffd85 Goes with 8f724f4f, forgotten file 2016-11-10 14:48:31 +01:00
Celine Mercier
a0ebc2d871 Functions to directly retrieve Obiblobs from indexers 2016-11-10 14:45:28 +01:00
Celine Mercier
8f724f4f8e Some code refactoring 2016-11-09 16:48:00 +01:00
Celine Mercier
359578814b Added view type property to OBIView cython class and updated obi export
to use it
2016-11-08 17:49:59 +01:00
Celine Mercier
51b23915ca Added properties for Nuc_Seq cython classes (and updated commands using
them)
2016-11-08 16:59:32 +01:00
Celine Mercier
b5b889c4a2 Fixed the OBI_Nuc_Seq_Stored cython class not being up to date with the
new properties of its parent class
2016-11-08 11:26:37 +01:00
Celine Mercier
36ac315125 Fixed bugs with python view type when creating a new view, and a bug
when trying to guess the obi type of a nucleotide sequence when its type
was bytes
2016-11-08 11:23:54 +01:00
Celine Mercier
8291693309 obi grep: updated to work with the new line selection class and within
the local sequence environment, and progress bar functioning
2016-11-08 11:19:12 +01:00
Celine Mercier
4bc19c3e49 obi export: view type is now checked and progress bar functioning 2016-11-08 11:17:20 +01:00
Celine Mercier
2d2fe5279d Added functions to add new taxa to a taxonomy with handling of
associated *.ldx files
2016-11-03 17:59:21 +01:00
Celine Mercier
2504bf0fa9 Added an iterator to the OBI_Taxonomy cython class 2016-11-02 11:08:18 +01:00
Celine Mercier
d8a257e711 Taxonomy handling functions in C. Features: read taxdump, read binary
files, write binary files. Not fully handled yet: *.adx, *.pdx, *.ldx,
merged.dmp and delnodes.dmp files.
2016-10-27 18:56:11 +02:00
Celine Mercier
b63d0fb9fb Added C functions to write .rdx, .tdx, .ndx binary taxonomy files from a
taxonomy C structure
2016-10-14 17:03:10 +02:00
Celine Mercier
0dfd67ec89 The endianness of binary taxonomy files is now correctly checked 2016-10-10 17:04:29 +02:00
Celine Mercier
0faaac49cf The taxonomy directory of the DMS is now automatically created with the
DMS
2016-10-10 17:02:51 +02:00
Celine Mercier
1b07109e51 Removed deprecated code 2016-10-10 17:01:51 +02:00
Celine Mercier
60ab503a14 Added properties in the OBI_Taxonomy class 2016-10-10 17:01:17 +02:00
Celine Mercier
2dcfdc59fc When a new view is created with a line selection, the view to clone is
automatically found + compacted redundant code + fixed potential bug
when cloning a NUC_SEQS view by name
2016-10-06 17:55:18 +02:00
Celine Mercier
399fc2c051 Removed deprecated source files previously used for tests 2016-09-30 17:49:37 +02:00
Celine Mercier
9cd57deca9 Added OBIView_line_selection class to make new line selections
associated with the view to clone, and improved and renamed method
closing a view
2016-09-30 17:48:53 +02:00
Celine Mercier
d88811ed7d Added a seed option to the obi test command for reproducible tests 2016-09-29 17:34:48 +02:00
Celine Mercier
8c402101e4 Renamed private attributes as _* and removed some deprecated code 2016-09-28 16:56:44 +02:00
Celine Mercier
1a7b42018e Added some error checking when opening or creating a view 2016-09-28 14:28:34 +02:00
Celine Mercier
b717e8bb8b Added properties for the OBIView class and cleaned up deprecated code 2016-09-28 14:26:23 +02:00
Celine Mercier
03a2c8ef7c Finished restructuring the OBIDMS_column class properties 2016-09-27 14:16:30 +02:00
Celine Mercier
a7f891d1c9 Added a lines_used property to the OBIDMS_column class 2016-09-26 18:04:28 +02:00
Celine Mercier
bd50b3f972 Added version property to OBIDMS_column class 2016-09-26 17:45:10 +02:00
Celine Mercier
81380363b7 Added original_name property to OBIDMS_column class 2016-09-26 17:31:32 +02:00
Celine Mercier
a4b8349274 Added data_type property to OBIDMS_column class 2016-09-26 17:12:20 +02:00
Celine Mercier
a474391b27 Added nb_elements_per_line property to OBIDMS_column class 2016-09-26 17:01:13 +02:00
Celine Mercier
a0bc45cc92 Added elements_names property to OBIDMS_column class 2016-09-26 16:53:16 +02:00
Celine Mercier
76f89717fe Added alias property to OBIDMS_column cython class 2016-09-26 16:12:48 +02:00
Celine Mercier
b408a4f6eb Changed file name limits to adapt to system limits + minor changes 2016-09-22 18:05:07 +02:00
Celine Mercier
b083745f56 Deleted the "new line selection while editing a view" system 2016-09-22 11:19:29 +02:00
Celine Mercier
43f3c69a40 Fixed bug when cloning column with line selection 2016-09-21 17:50:21 +02:00
Celine Mercier
e79507b629 Fixed bugs in the process ensuring that all the columns of a view have
the same line count, fixed a bug when trying to set a value in a view
when a line selection exists, fixed a bug when adding a new column to a
view where line counts would be wrong
2016-09-21 17:42:17 +02:00
Celine Mercier
bb25723d99 Improved documentation of a function 2016-09-21 17:30:39 +02:00
Celine Mercier
a0da984003 Fixed bug where columns would not get truncated to the right size, and
fixed bug where column directories would be open and not closed in some
instances
2016-09-21 17:28:52 +02:00
Celine Mercier
802bae110b Removed deprecated function 2016-09-21 17:09:59 +02:00
Celine Mercier
dd55aef3e5 Added column class method to get the unique references (name and
version) of a column
2016-09-21 17:08:44 +02:00
Celine Mercier
9ac522fde1 Better obi test command 2016-09-21 17:06:35 +02:00
6adb9eb623 Should solve issue #56 2016-09-19 21:40:40 +02:00
Celine Mercier
8f49553d5a First version of the obi test command, testing that the OBITools3 work
correctly
2016-09-15 12:26:07 +02:00
Celine Mercier
986f90c59e Fixed bug where column directories weren't closed correctly, leading to
too many file descriptors open, and added error checking when closing
file descriptors
2016-09-15 12:18:40 +02:00
Celine Mercier
a240ec0169 Added error checking when closing file descriptors 2016-09-15 11:58:56 +02:00
Celine Mercier
0a3c23d9d0 Added a missing closedir 2016-09-15 11:58:34 +02:00
Celine Mercier
8724445fa1 Added error checking when closing files 2016-09-15 11:50:30 +02:00
Celine Mercier
de189fd7e0 Fixed major bug when cloning an AVL where the bloom filter was not
copied properly (because the structure copy via assignment does not
work for structures with a variable size)
2016-09-15 11:47:02 +02:00
Celine Mercier
9a97f1f633 View predicates are now carried over when cloning a view 2016-09-06 16:22:24 +02:00
Celine Mercier
00014eb023 View files now have the *.obiview extension 2016-09-06 14:19:13 +02:00
Celine Mercier
acc0da2d0b Readjusted some limits for file names and file numbers to be under OS
limits
2016-09-05 12:39:04 +02:00
Celine Mercier
668696fc5a Fixed major bug: when setting all the columns of a view to the same
number of lines, columns are now cloned before being enlarged if needed
+ predicate functions now print error messages if the predicates are not
respected
2016-09-05 12:37:36 +02:00
Celine Mercier
ba84ef4847 Fixed typo 2016-09-05 12:31:06 +02:00
Celine Mercier
c9dce03295 Fixed major bug when cloning an AVL group (last AVL of new group was not
correctly enlarged before copying the data) + minor improvements
2016-09-05 12:29:52 +02:00
Celine Mercier
eb82d088cb Added some view class methods 2016-09-05 12:20:00 +02:00
Celine Mercier
f46ea0b988 Finished fixing issues with DMS paths 2016-08-30 11:09:45 +02:00
Celine Mercier
5b2e370ffb Fixed a bug when using an absolute path for a DMS 2016-08-29 17:30:31 +02:00
Celine Mercier
8d360b0fac Minor improvements to obi export command 2016-08-19 17:49:22 +02:00
Celine Mercier
b34769b27c Minor improvements to obi export command 2016-08-19 17:46:55 +02:00
Celine Mercier
2d0a714e37 Basic obi export command exporting from view to fasta or fastq format,
for testing purposes
2016-08-19 17:40:58 +02:00
Celine Mercier
7b780ffb28 View files now have a dynamic size to allow unlimited comments size 2016-08-18 17:57:03 +02:00
Celine Mercier
e4129610cf Quality columns are now optional in NUC_SEQS views + minor fixes 2016-08-16 15:17:26 +02:00
Celine Mercier
cf839522e7 Minor update and fix to obi grep command 2016-08-12 17:45:44 +02:00
Celine Mercier
10b22f79da The cython subclass is now correctly chosen when cloning a view 2016-08-12 17:39:19 +02:00
Celine Mercier
ad8e10f2d1 Reworked a bit alignment API 2016-08-12 15:56:07 +02:00
Celine Mercier
92cad61417 Fixed bug when closing views with no associated predicate 2016-08-12 15:52:38 +02:00
Celine Mercier
64a745ce0b First very basic version of obi grep command 2016-08-11 17:32:08 +02:00
Celine Mercier
2d8ac2b035 Fixed bug when creating an OBI_IDX column 2016-08-11 17:30:32 +02:00
Celine Mercier
5b7917bb5a Fixed bug when writing predicates in view file 2016-08-11 17:30:09 +02:00
Celine Mercier
d3c58780a0 Added __len__ function to OBIViews that returns the line count 2016-08-10 17:20:23 +02:00
Celine Mercier
029d395da1 Added __iter__ function to OBIView lines 2016-08-10 17:08:22 +02:00
Celine Mercier
bea02cc7a5 Added (temporary?) check for the type of quality strings because the
import now seems to return them with bytes type
2016-08-10 16:25:45 +02:00
Celine Mercier
4ba01617af Fixed obscure compilation bug 2016-08-10 15:26:40 +02:00
Celine Mercier
bec684d5e2 Fixed merge conflict 2016-08-10 15:05:37 +02:00
Celine Mercier
2aaa87edcc 1st version of obi align command and reworked functions that handle
column alignment
2016-08-10 14:51:02 +02:00
400a3f9f3d Merge branch 'Eric_version_for_sequence'
Conflicts:
	python/obitools3/obidms/_obidmscolumn_seq.pyx
2016-08-04 09:42:42 +02:00
465ea81c77 Merge branch 'master' of git@git.metabarcoding.org:obitools/obitools3.git 2016-08-03 10:13:47 +02:00
1e6d6e32e0 Switch to Cython version >= 0.24 2016-08-03 10:13:10 +02:00
ccc877764e Patch a bug in the printing of the progress bar leading to a bus error
when compiled with some C compilers and Cython >= 0.24
2016-08-03 10:12:23 +02:00
Celine Mercier
26b8e1f215 Modified C API to set and get in columns: added functions to set and get
using column names instead of pointers, and changed function names
2016-08-02 16:33:19 +02:00
Celine Mercier
312f50ff0f Major update: Column aliases. Columns are now identified in the context
of a view by an alias that can be modified.
2016-08-01 18:25:30 +02:00
Celine Mercier
3843485a04 Deleted deprecated function declaration that would make compilation
impossible and fixed error in documentation
2016-07-22 16:21:02 +02:00
Celine Mercier
20425a5d2b Deleted deprecated structure declarations 2016-07-19 15:48:56 +02:00
Celine Mercier
56e4848ebd The predicates associated with a view are now described in its comments
field
2016-07-19 15:31:21 +02:00
Celine Mercier
8850e40b6e Minor changes for better presentation 2016-07-19 15:30:17 +02:00
Celine Mercier
b89af38109 Goes with 38718320 2016-07-18 13:57:49 +02:00
Celine Mercier
38718320f9 First version for the association of one column to another. Closes #55 2016-07-15 15:38:49 +02:00
Celine Mercier
8ee85c3005 A first version of predicate functions that are checked when a new view
is saved and closed
2016-07-12 14:54:11 +02:00
146 changed files with 16034 additions and 4999 deletions
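
Commit e50da64ea1 in the list above says that the element names of a column with several elements per line are now formatted with '\0' as separator. The sketch below illustrates that kind of packing in plain Python. It is only an illustration under stated assumptions: the function names `pack_element_names` and `unpack_element_names` are hypothetical, and the actual C implementation is not shown on this page.

```python
def pack_element_names(names):
    """Store every name followed by a NUL byte in a single bytes buffer.

    Hypothetical helper: it mirrors the '\\0'-separated layout described
    in the commit message, not the real OBITools3 C code.
    """
    for name in names:
        if "\0" in name:
            raise ValueError("element names cannot contain a NUL character")
    return b"".join(name.encode("ascii") + b"\0" for name in names)


def unpack_element_names(buffer):
    """Recover the list of names from a NUL-terminated buffer."""
    # Every name is NUL-terminated, so the last item of the split is empty.
    return [part.decode("ascii") for part in buffer.split(b"\0")[:-1]]


packed = pack_element_names(["0", "1", "2"])
names = unpack_element_names(packed)
```

A NUL byte cannot appear inside a C string, so it is a safe separator whatever characters the names contain, which a ';' separator cannot guarantee.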

@@ -40,6 +40,7 @@ def findPackage(root,base=None):
def findCython(root,base=None,pyrexs=None):
setupdir = os.path.dirname(sys.argv[0])
csourcedir = os.path.join(setupdir,"src")
pyrexs=[]
if base is None:
@@ -53,6 +54,7 @@ def findCython(root,base=None,pyrexs=None):
[pyrex]
)
)
pyrexs[-1].include_dirs.append(csourcedir)
try:
cfiles = os.path.splitext(pyrex)[0]+".cfiles"
cfilesdir = os.path.dirname(cfiles)

@@ -36,7 +36,7 @@ extensions = [
'sphinx.ext.pngmath',
'sphinx.ext.ifconfig',
'sphinx.ext.viewcode',
# 'breathe',
'breathe',
]
# Add any paths that contain templates here, relative to this directory.
@@ -295,4 +295,6 @@ texinfo_documents = [
sys.path.append( "breathe/" )
breathe_projects = { "OBITools3": "doxygen/xml/" }
breathe_default_project = "OBITools3"
#breathe_projects_source = {
# "auto" : ( "../src", ["obidms.h", "obiavl.h"] )
# }

@@ -11,7 +11,7 @@ OBITools3 documentation
Programming guidelines <guidelines>
Data structures <data>
Code documentation <code_doc/codedoc>
Indices and tables
------------------

@@ -4,6 +4,7 @@ OBITypes
.. image:: ./UML/OBITypes_UML.png
:download:`html version of the OBITypes UML file <UML/OBITypes_UML.class.violet.html>`

@@ -1 +1 @@
build/lib.macosx-10.6-intel-3.5
build/lib.macosx-10.6-intel-3.4

@@ -21,7 +21,9 @@ default_config = { 'software' : "The OBITools",
'log' : False,
'loglevel' : 'INFO',
'progress' : True,
'defaultdms' : None
'defaultdms' : None,
'inputview' : None,
'outputview' : None
}
root_config_name='obi'

Binary file not shown.

@@ -1,7 +1,5 @@
#cython: language_level=3
from ..utils cimport str2bytes
cdef extern from "stdio.h":
struct FILE
int fprintf(FILE *stream, char *format, ...)
@@ -54,7 +52,7 @@ cdef class ProgressBar:
cdef bint ontty
cdef int fd
cdef bytes head
cdef bytes _head
cdef char *chead
cdef object logger

@@ -6,8 +6,9 @@ Created on 27 mars 2016
@author: coissac
'''
from ..utils cimport str2bytes, bytes2str
from .config cimport getConfiguration
import sys
from ..utils cimport bytes2str
cdef class ProgressBar:
cdef clock_t clock(self):
@@ -23,7 +24,7 @@ cdef class ProgressBar:
def __init__(self,
off_t maxi,
dict config,
dict config={},
str head="",
double seconde=0.1):
self.starttime = self.clock()
@@ -34,14 +35,18 @@ cdef class ProgressBar:
self.arrow = 0
self.lastlog = 0
if not config:
config=getConfiguration()
self.ontty = sys.stderr.isatty()
if (maxi<=0):
maxi=1
self.maxi = maxi
self.head = str2bytes(head)
self.chead= self.head
self.head = head
self.chead= self._head
self.logger=config[config["__root_config__"]]["logger"]
@@ -104,35 +109,43 @@ cdef class ProgressBar:
if self.ontty:
fraction=<int>(percent * 50.)
self.arrow=(self.arrow+1) % 4
self.diese[fraction]=0
self.spaces[50 - fraction]=0
if days:
<void>fprintf(stderr,b'\r%s %5.1f %% |%s%c%s] remain : %d days %02d:%02d:%02d',
<void>fprintf(stderr,b'\r%s %5.1f %% |%.*s%c%.*s] remain : %d days %02d:%02d:%02d',
self.chead,
percent*100,
self.diese,self.wheel[self.arrow],self.spaces,
fraction,self.diese,
self.wheel[self.arrow],
50-fraction,self.spaces,
days,hour,minu,sec)
else:
<void>fprintf(stderr,b'\r%s %5.1f %% |%s%c%s] remain : %02d:%02d:%02d',
<void>fprintf(stderr,b'\r%s %5.1f %% |%.*s%c%.*s] remain : %02d:%02d:%02d',
self.chead,
percent*100.,
self.diese,self.wheel[self.arrow],self.spaces,
fraction,self.diese,
self.wheel[self.arrow],
50-fraction,self.spaces,
hour,minu,sec)
self.diese[fraction]=b'#'
self.spaces[50 - fraction]=b' '
twentyth = int(percent * 20)
if twentyth != self.lastlog:
tenth = int(percent * 10)
if tenth != self.lastlog:
if self.ontty:
<void>fputs(b'\n',stderr)
self.logger.info('%s %5.1f %% remain : %02d:%02d:%02d' % (
bytes2str(self.head),
bytes2str(self._head),
percent*100.,
hour,minu,sec))
self.lastlog=twentyth
self.lastlog=tenth
else:
self.cycle+=1
property head:
def __get__(self):
return self._head
def __set__(self,str value):
self._head=str2bytes(value)
self.chead=self._head
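
The fprintf changes in this diff switch the bar fields from `%s` to `%.*s`, which takes the number of characters to print as an extra argument, so the `diese` and `spaces` buffers no longer have to be truncated before each call. Python's printf-style formatting accepts the same `%.*s` syntax, so the idea can be sketched in pure Python. This is a minimal illustration, not the Cython code: the name `render_bar` and the content of `WHEEL` are assumptions.

```python
BAR_WIDTH = 50
DIESE = "#" * BAR_WIDTH    # fixed buffer of '#', never modified
SPACES = " " * BAR_WIDTH   # fixed buffer of spaces, never modified
WHEEL = "|/-\\"            # assumed spinner characters, not shown in the diff


def render_bar(head, percent, arrow, hour, minu, sec):
    """Build one progress line; percent is a fraction between 0 and 1."""
    fraction = int(percent * BAR_WIDTH)
    # '%.*s' consumes two arguments: a precision, then the string to cut.
    return "\r%s %5.1f %% |%.*s%c%.*s] remain : %02d:%02d:%02d" % (
        head,
        percent * 100.0,
        fraction, DIESE,
        WHEEL[arrow % 4],
        BAR_WIDTH - fraction, SPACES,
        hour, minu, sec,
    )


half = render_bar("test", 0.5, 0, 0, 1, 2)
full = render_bar("test", 1.0, 1, 0, 0, 0)
```

Passing the length as a precision keeps the buffers read-only, so the sketch never writes into them.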

@@ -0,0 +1,68 @@
#cython: language_level=3
from obitools3.dms.dms import DMS # TODO cimport doesn't work
from obitools3.dms.view.view import View # TODO cimport doesn't work
__title__="Print a preview of a DMS, view, column...."
default_config = { 'inputview' : None,
}
# TODO make it work with URIs
def addOptions(parser):
# TODO put this common group somewhere else but I don't know where
group=parser.add_argument_group('DMS and view options')
group.add_argument('--default-dms','-d',
action="store", dest="obi:defaultdms",
metavar='<DMS NAME>',
default=None,
type=str,
help="Name of the default DMS for reading and writing data.")
group.add_argument('--view','-v',
action="store", dest="obi:view",
metavar='<VIEW NAME>',
default=None,
type=str,
help="Name of the view.")
# group=parser.add_argument_group('obi check specific options')
# group.add_argument('--print',
# action="store", dest="less:print",
# metavar='<N>',
# default=None,
# type=int,
# help="Print N sequences (default: 10)")
def run(config):
# Open DMS
d = DMS.open(config['obi']['defaultdms'])
# Open input view if there is one
if config['obi']['inputview'] is not None :
iview = View.open(d, config['obi']['inputview'])
print(repr(iview))
else :
for v in d :
print(repr(v))
d.close()
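
The options in this file use `dest` values of the form `obi:defaultdms`, while `run()` reads `config['obi']['defaultdms']`. The code that turns one into the other is not part of this diff. The sketch below shows one plausible way to do it with the standard argparse module; `namespace_to_config` is a hypothetical helper, not an OBITools3 function. Note that the diff declares `dest="obi:view"` while `run()` reads `inputview`; the sketch uses `obi:inputview` so that the two agree.

```python
import argparse


def build_parser():
    """Declare the two options of the diff with 'section:key' destinations."""
    parser = argparse.ArgumentParser(prog="obi check")
    group = parser.add_argument_group("DMS and view options")
    group.add_argument("--default-dms", "-d",
                       action="store", dest="obi:defaultdms",
                       metavar="<DMS NAME>", default=None, type=str,
                       help="Name of the default DMS for reading and writing data.")
    group.add_argument("--view", "-v",
                       action="store", dest="obi:inputview",  # assumed key, see above
                       metavar="<VIEW NAME>", default=None, type=str,
                       help="Name of the view.")
    return parser


def namespace_to_config(namespace):
    """Hypothetical helper: split each 'section:key' dest into nested dicts."""
    config = {}
    for dest, value in vars(namespace).items():
        section, _, key = dest.partition(":")
        config.setdefault(section, {})[key] = value
    return config


args = build_parser().parse_args(["-d", "mydms", "-v", "reads"])
config = namespace_to_config(args)
```

With such a mapping, an option left unset stays `None` in the nested dictionary, which is what the `is not None` test in `run()` relies on.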

@@ -0,0 +1,109 @@
# from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
# from obitools3.dms.dms import OBIDMS # TODO cimport doesn't work
# from obitools3.utils cimport bytes2str
#
# import time
# import re
def run(config):
pass
# __title__="Export a NUC_SEQS view to a fasta or fastq file"
#
#
# default_config = { 'inputview' : None,
# }
#
# def addOptions(parser):
#
# # TODO put this common group somewhere else but I don't know where
# group=parser.add_argument_group('DMS and view options')
#
# group.add_argument('--default-dms','-d',
# action="store", dest="obi:defaultdms",
# metavar='<DMS NAME>',
# default=None,
# type=str,
# help="Name of the default DMS for reading and writing data.")
#
# group.add_argument('--input-view','-i',
# action="store", dest="obi:inputview",
# metavar='<INPUT VIEW NAME>',
# default=None,
# type=str,
# help="Name of the input view, either raw if the view is in the default DMS,"
# " or in the form 'dms:view' if it is in another DMS.")
#
# group=parser.add_argument_group('obi export specific options')
#
# group.add_argument('--format','-f',
# action="store", dest="export:format",
# metavar='<FORMAT>',
# default="fasta",
# type=str,
# help="Export in the format <FORMAT>, 'fasta' or 'fastq'. Default: 'fasta'.") # TODO export in csv
#
# def run(config):
#
# # TODO import doesn't work
# NUC_SEQUENCE_COLUMN = "NUC_SEQ"
# ID_COLUMN = "ID"
# DEFINITION_COLUMN = "DEFINITION"
# QUALITY_COLUMN = "QUALITY"
#
# special_columns = [NUC_SEQUENCE_COLUMN, ID_COLUMN, DEFINITION_COLUMN, QUALITY_COLUMN]
#
# # Open DMS
# d = OBIDMS(config['obi']['defaultdms'])
#
# # Open input view
# iview = d.open_view(config['obi']['inputview'])
#
# print(iview.type)
#
# # TODO check that the view has the type NUC_SEQS
# if ((config['export']['format'] == "fasta") or (config['export']['format'] == "fastq")) and (iview.type != "NUC_SEQS_VIEW") : # TODO find a way to import those macros
# raise Exception("Error: the view to export in fasta or fastq format is not a NUC_SEQS view")
#
# # Initialize the progress bar
# pb = ProgressBar(len(iview), config, seconde=5)
#
# i=0
# for seq in iview :
# pb(i)
#
# toprint = ">"+seq.id+" "
#
# for col_name in seq :
# if col_name not in special_columns :
# toprint = toprint + col_name + "=" + str(seq[col_name]) + "; "
#
# if DEFINITION_COLUMN in seq :
# toprint = toprint + seq.definition
#
# nucseq = bytes2str(seq.nuc_seq)
#
# if config['export']['format'] == "fasta" :
# nucseq = re.sub("(.{60})", "\\1\n", nucseq, 0, re.DOTALL)
#
# toprint = toprint + "\n" + nucseq
#
# if config['export']['format'] == "fastq" :
# toprint = toprint + "\n" + "+" + "\n" + seq.get_str_quality()
#
# print(toprint)
# i+=1
#
# iview.close()
# d.close()
#
# print("Done.")
#
#
#
#
#
#
#
#

@@ -0,0 +1,96 @@
#cython: language_level=3
from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms.dms import DMS # TODO cimport doesn't work
from obitools3.dms.view.view import View, Line_selection # TODO cimport doesn't work
from functools import reduce
import time
__title__="Grep view lines that match the given predicates"
default_config = { 'inputview' : None,
'outputview' : None
}
def addOptions(parser):
# TODO put this common group somewhere else but I don't know where
group=parser.add_argument_group('DMS and view options')
group.add_argument('--default-dms','-d',
action="store", dest="obi:defaultdms",
metavar='<DMS NAME>',
default=None,
type=str,
help="Name of the default DMS for reading and writing data.")
group.add_argument('--input-view','-i',
action="store", dest="obi:inputview",
metavar='<INPUT VIEW NAME>',
default=None,
type=str,
help="Name of the input view, either raw if the view is in the default DMS,"
" or in the form 'dms:view' if it is in another DMS.")
group.add_argument('--output-view','-o',
action="store", dest="obi:outputview",
metavar='<OUTPUT VIEW NAME>',
default=None,
type=str,
help="Name of the output view, either raw if the view is in the default DMS,"
" or in the form 'dms:view' if it is in another DMS.")
group=parser.add_argument_group('obi grep specific options')
group.add_argument('--predicate','-p',
action="append", dest="grep:predicates",
metavar='<PREDICATE>',
default=None,
type=str,
help="Grep lines that match the given python expression on <line> or <sequence>.")
def run(config):
    # Open DMS
    d = DMS.open(config['obi']['defaultdms'])
    # Open input view
    iview = View.open(d, config['obi']['inputview'])
    # Initialize the progress bar
    pb = ProgressBar(len(iview), config, seconde=5)
    # Apply filter: keep the lines for which every predicate evaluates to true
    selection = Line_selection(iview)
    for i in range(len(iview)) :
        pb(i)
        line = iview[i]
        loc_env = {'sequence': line, 'line': line} # TODO add taxonomy
        good = (reduce(lambda bint x, bint y: x and y,
                       (bool(eval(p, loc_env, line))
                        for p in config['grep']['predicates']), True))
        if good :
            selection.append(i)
    # Create output view with the line selection
    oview = selection.materialize(config['obi']['outputview'], comments="obi grep: "+str(config['grep']['predicates'])+"\n")
    print("\n")
    print(repr(oview))
    d.close()
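
The filter in `obi grep` evaluates each predicate with `eval`, passing the current line as the local namespace, so column names can be used directly as variables. Here is a minimal pure-Python sketch of that idea, assuming lines behave like mappings from column name to value; `select_lines` and the sample data are hypothetical, and plain dicts stand in for view lines.

```python
def select_lines(lines, predicates):
    """Return the indices of the lines for which all predicates hold.

    Hypothetical stand-in for the Line_selection loop: `lines` is any
    sequence of mappings (plain dicts here, not obitools3 view lines).
    """
    selected = []
    for i, line in enumerate(lines):
        # 'line' and 'sequence' are reachable as globals of the expression,
        # and the columns of the line are reachable as its local names.
        env = {'sequence': line, 'line': line}
        if all(bool(eval(p, env, line)) for p in predicates):
            selected.append(i)
    return selected


lines = [{'ID': 'seq1', 'COUNT': 3},
         {'ID': 'seq2', 'COUNT': 12}]
kept = select_lines(lines, ["COUNT > 5", "line['ID'].startswith('seq')"])
print(kept)
```

`all()` stops at the first false predicate, whereas the `reduce` form above evaluates every predicate; both accept every line when the predicate list is empty. As with any use of `eval`, the predicates run as arbitrary Python code and should only come from a trusted user.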

@@ -1,15 +1,33 @@
#cython: language_level=3
# TODO cimport generates errors with argument numbers, but without them some variables can't be declared
from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.files.universalopener cimport uopen
from obitools3.parsers.fasta import fastaIterator
from obitools3.parsers.fastq import fastqIterator
from obitools3.obidms._obidms import OBIDMS
from obitools3.dms.dms import DMS # TODO cimport doesn't work
from obitools3.dms.view.view cimport View
from obitools3.dms.view.typed_view.view_NUC_SEQS import View_NUC_SEQS # TODO cimport doesn't work
from obitools3.dms.column.column cimport Column
from obitools3.utils cimport tobytes, \
get_obitype, \
update_obitype
from obitools3.dms.capi.obitypes cimport obitype_t, \
OBI_VOID
from obitools3.dms.capi.obierrno cimport obi_errno
import time
__title__="Counts sequences in a sequence set"
import pickle
__title__="Imports sequences from different formats into a DMS"
default_config = { 'destview' : None,
'skip' : 0,
'only' : None,
@@ -24,7 +42,7 @@ def addOptions(parser):
metavar='<FILENAME>',
nargs='?',
default=None,
help='sequence file name to be imported' )
help='Name of the sequence file to import' )
group=parser.add_argument_group('obi import specific options')
@@ -34,7 +52,7 @@ def addOptions(parser):
default=None,
type=str,
help="Name of the default DMS for reading and writing data")
group.add_argument('--destination-view','-v',
action="store", dest="import:destview",
metavar='<VIEW NAME>',
@@ -43,27 +61,25 @@ def addOptions(parser):
required=True,
help="Name of the destination view")
group=parser.add_argument_group('obi import specific options')
group.add_argument('--skip',
action="store", dest="import:skip",
metavar='<N>',
default=None,
default=0,
type=int,
help="skip the N first sequences")
help="Skip the N first sequences")
group.add_argument('--only',
action="store", dest="import:only",
metavar='<N>',
default=None,
type=int,
help="treat only N sequences")
help="Treat only N sequences")
group.add_argument('--skip-on-error',
action="store_true", dest="import:skiperror",
default=None,
help="Skip sequence entries with parse error")
group.add_argument('--fasta',
action="store_const", dest="import:seqinformat",
default=None,
@@ -81,59 +97,188 @@ def addOptions(parser):
default=None,
const='nuc',
help="Input file contains nucleic sequences")
group.add_argument('--prot',
action="store_const", dest="import:moltype",
default=None,
const='pep',
help="Input file contains protein sequences")
# TODO: Handling of NA values. Check None. Specify in doc? None or NA? Possibility to specify in option?
# look in R read.table option to specify NA value
def run(config):
pb = ProgressBar(35000000,config,seconde=5)
cdef int i
cdef type value_type
cdef obitype_t value_obitype
cdef obitype_t old_type
cdef obitype_t new_type
cdef bint get_quality
cdef bint NUC_SEQS_view
cdef int nb_elts
cdef object d
cdef View view
cdef object iseq
cdef object seq
cdef object inputs
cdef Column id_col
cdef Column def_col
cdef Column seq_col
cdef Column qual_col
cdef Column old_column
cdef bint rewrite
cdef dict dcols
cdef int skipping
cdef str tag
cdef object value
cdef list elt_names
cdef int old_nb_elements_per_line
cdef int new_nb_elements_per_line
cdef list old_elements_names
cdef list new_elements_names
cdef ProgressBar pb
global obi_errno
pb = ProgressBar(1000000, config, seconde=5) # TODO should be number of records in file
inputs = uopen(config['import']['filename'])
# Create or open DMS
try:
d = DMS.test_open(config['obi']['defaultdms'])
except :
d = DMS.new(config['obi']['defaultdms'])
get_quality = False
NUC_SEQS_view = False
if config['import']['seqinformat']=='fasta':
get_quality = False
NUC_SEQS_view = True
iseq = fastaIterator(inputs)
view_type="NUC_SEQS_VIEW"
view = View_NUC_SEQS.new(d, config['import']['destview'], quality=get_quality)
elif config['import']['seqinformat']=='fastq':
iseq = fastqIterator(inputs)
view_type="NUC_SEQS_VIEW"
get_quality = True
NUC_SEQS_view = True
iseq = fastqIterator(inputs)
view = View_NUC_SEQS.new(d, config['import']['destview'], quality=get_quality)
else:
raise RuntimeError('No file format specified')
# Temporary way to handle NA values
#NA_list = ["nan"]
# Create DMS
d = OBIDMS(config['obi']['defaultdms'])
# Create view
view = d.new_view(config['import']['destview'], view_type=view_type)
i = 0
for seq in iseq:
pb(i)
view[i].set_id(seq['id'])
view[i].set_definition(seq['definition'])
view[i].set_sequence(seq['sequence'])
raise RuntimeError('File format not handled')
# Save basic columns in variables for optimization
if NUC_SEQS_view :
id_col = view["ID"]
def_col = view["DEFINITION"]
seq_col = view["NUC_SEQ"]
if get_quality :
view[i].set_quality(seq['quality'])
for tag in seq['tags'] :
#print(tag, seq['tags'][tag])
#if seq['tags'][tag] not in NA_list :
view[i][tag] = seq['tags'][tag]
i+=1
#print(i)
print(view.__repr__())
view.save_and_close()
d.close()
qual_col = view["QUALITY"]
print("Done.")
dcols = {}
skipping = 0
i = 0
for seq in iseq :
if skipping < config['import']['skip'] : # TODO not efficient because sequences are parsed
skipping+=1
elif i == config['import']['only'] :
break
else :
pb(i)
if NUC_SEQS_view :
id_col[i] = seq['id']
def_col[i] = seq['definition']
seq_col[i] = seq['sequence']
if get_quality :
qual_col[i] = seq['quality']
for tag in seq['tags'] :
value = seq['tags'][tag]
if tag not in dcols :
value_type = type(value)
nb_elts = 1
value_obitype = OBI_VOID
if value_type == dict or value_type == list :
nb_elts = len(value)
elt_names = list(value)
else :
nb_elts = 1
elt_names = None
value_obitype = get_obitype(value)
if value_obitype != OBI_VOID :
dcols[tag] = (Column.new_column(view, tag, value_obitype, nb_elements_per_line=nb_elts, elements_names=elt_names), value_obitype)
# Fill value
dcols[tag][0][i] = value
# TODO else log error?
else :
rewrite = False
# Check that the column type still fits the new value
old_type = dcols[tag][1]
new_type = OBI_VOID
new_type = update_obitype(old_type, value)
if old_type != new_type :
rewrite = True
try:
# Fill value
dcols[tag][0][i] = value
except IndexError :
value_type = type(value)
old_column = dcols[tag][0]
old_nb_elements_per_line = old_column.nb_elements_per_line
new_nb_elements_per_line = 0
old_elements_names = old_column.elements_names
new_elements_names = None
#####################################################################
# Check the length and keys of column lines if needed
if value_type == dict : # Check dictionary keys
for k in value :
if k not in old_elements_names :
new_elements_names = list(value)
rewrite = True
break
elif value_type == list or value_type == tuple : # Check vector length
if old_nb_elements_per_line < len(value) :
new_nb_elements_per_line = len(value)
rewrite = True
#####################################################################
if rewrite :
if new_nb_elements_per_line == 0 and new_elements_names is not None :
new_nb_elements_per_line = len(new_elements_names)
dcols[tag] = (view.rewrite_column_with_diff_attributes(old_column.name,
new_data_type=new_type,
new_nb_elements_per_line=new_nb_elements_per_line,
new_elements_names=new_elements_names),
new_type)
# Reset obierrno
obi_errno = 0
# Fill value
dcols[tag][0][i] = value
i+=1
print("\n")
print(repr(view))
d.close()
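
The new import loop rewrites a column when a value no longer fits it: either a dictionary brings a key the column does not know, or a list is longer than the column's number of elements per line. That decision can be sketched in isolation as below; `plan_rewrite` is a hypothetical pure-Python helper that mirrors the checks in the `except IndexError` branch, and it does not call any obitools3 API.

```python
def plan_rewrite(old_elements_names, old_nb_elements_per_line, value):
    """Decide whether a column must be rewritten to hold `value`.

    Returns (rewrite, new_nb_elements_per_line, new_elements_names).
    Hypothetical helper mirroring the checks in the import loop.
    """
    rewrite = False
    new_nb_elements_per_line = 0
    new_elements_names = None

    if isinstance(value, dict):
        # A key unknown to the column forces new element names
        if any(k not in old_elements_names for k in value):
            new_elements_names = list(value)
            rewrite = True
    elif isinstance(value, (list, tuple)):
        # A longer vector forces more elements per line
        if old_nb_elements_per_line < len(value):
            new_nb_elements_per_line = len(value)
            rewrite = True

    if rewrite and new_nb_elements_per_line == 0 and new_elements_names is not None:
        new_nb_elements_per_line = len(new_elements_names)

    return rewrite, new_nb_elements_per_line, new_elements_names


print(plan_rewrite(['a', 'b'], 2, {'a': 1, 'c': 2}))
print(plan_rewrite(['0', '1'], 2, [1, 2, 3]))
print(plan_rewrite(['a', 'b'], 2, {'a': 1}))
```

Like the loop it mirrors, this sketch takes the new element names from the incoming value only; whether keys already present in the column are preserved depends on `rewrite_column_with_diff_attributes`, which is not shown in this diff.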

@@ -8,6 +8,8 @@
../../../src/dna_seq_indexer.c
../../../src/encode.h
../../../src/encode.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
@@ -23,6 +25,8 @@
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
@@ -55,5 +59,7 @@
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/utils.h
../../../src/utils.c

@@ -0,0 +1,236 @@
#cython: language_level=3
#
# from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
# from obitools3.dms.dms import OBIDMS # TODO cimport doesn't work
# from obitools3.utils cimport str2bytes
#
# from obitools3.dms.capi.obialign cimport obi_lcs_align_one_column, \
# obi_lcs_align_two_columns
#
#
# import time
#
# __title__="Aligns one sequence column with itself or two sequence columns"
#
#
# default_config = { 'inputview' : None,
# }
#
# def addOptions(parser):
#
# # TODO put this common group somewhere else but I don't know where.
# # Also some options should probably be in another group
# group=parser.add_argument_group('DMS and view options')
#
# group.add_argument('--default-dms', '-d',
# action="store", dest="obi:defaultdms",
# metavar='<DMS NAME>',
# default=None,
# type=str,
# help="Name of the default DMS for reading and writing data.")
#
# group.add_argument('--input-view-1', '-i',
# action="store", dest="obi:inputview1",
# metavar='<INPUT VIEW NAME>',
# default=None,
# type=str,
# help="Name of the (first) input view.")
#
# group.add_argument('--input-view-2', '-I',
# action="store", dest="obi:inputview2",
# metavar='<INPUT VIEW NAME>',
# default="",
# type=str,
# help="Optionally, the name of the second input view.")
#
# group.add_argument('--input-column-1', '-c',
# action="store", dest="obi:inputcolumn1",
# metavar='<INPUT COLUMN NAME>',
# default="",
# type=str,
# help="Name of the (first) input column. "
# " Default: the default nucleotide sequence column of the view if there is one.")
#
# group.add_argument('--input-column-2', '-C',
# action="store", dest="obi:inputcolumn2",
# metavar='<INPUT COLUMN NAME>',
# default="",
# type=str,
# help="Optionally, the name of the second input column.")
#
# group.add_argument('--input-elt-1', '-e',
# action="store", dest="obi:inputelement1",
# metavar='<INPUT ELEMENT NAME>',
# default="",
# type=str,
# help="If the first input column has multiple elements per line, name of the element referring to the sequence to align. "
# " Default: the first element of the line.")
#
# group.add_argument('--input-elt-2', '-E',
# action="store", dest="obi:inputelement2",
# metavar='<INPUT ELEMENT NAME>',
# default="",
# type=str,
# help="If the second input column has multiple elements per line, name of the element referring to the sequence to align. "
# " Default: the first element of the line.")
#
# group.add_argument('--id-column-1', '-f',
# action="store", dest="obi:idcolumn1",
# metavar='<ID COLUMN NAME>',
# default="",
# type=str,
# help="Name of the (first) column containing the identifiers of the sequences to align. "
# " Default: the default ID column of the view if there is one.")
#
# group.add_argument('--id-column-2', '-F',
# action="store", dest="obi:idcolumn2",
# metavar='<ID COLUMN NAME>',
# default="",
# type=str,
# help="Optionally, the name of the second ID column.")
#
# group.add_argument('--output-view', '-o',
# action="store", dest="obi:outputview",
# metavar='<OUTPUT VIEW NAME>',
# default=None,
# type=str,
# help="Name of the output view.")
#
#
# group=parser.add_argument_group('obi lcs specific options')
#
# group.add_argument('--threshold','-t',
# action="store", dest="align:threshold",
# metavar='<THRESHOLD>',
# default=0.0,
# type=float,
# help="Score threshold. If the score is normalized and expressed in similarity (default),"
# " it is an identity, e.g. 0.95 for an identity of 95%%. If the score is normalized"
# " and expressed in distance, it is (1.0 - identity), e.g. 0.05 for an identity of 95%%."
# " If the score is not normalized and expressed in similarity, it is the length of the"
# " Longest Common Subsequence. If the score is not normalized and expressed in distance,"
# " it is (reference length - LCS length)."
# " Only sequence pairs with a similarity above <THRESHOLD> are printed. Default: 0.00"
# " (no threshold).")
#
# group.add_argument('--longest-length','-L',
# action="store_const", dest="align:reflength",
# default=0,
# const=1,
# help="The reference length is the length of the longest sequence."
# " Default: the reference length is the length of the alignment.")
#
# group.add_argument('--shortest-length','-l',
# action="store_const", dest="align:reflength",
# default=0,
# const=2,
# help="The reference length is the length of the shortest sequence."
# " Default: the reference length is the length of the alignment.")
#
# group.add_argument('--raw','-r',
# action="store_false", dest="align:normalize",
# default=True,
# help="Raw score, not normalized. Default: score is normalized with the reference sequence length.")
#
# group.add_argument('--distance','-D',
# action="store_false", dest="align:similarity",
# default=True,
# help="Score is expressed in distance. Default: score is expressed in similarity.")
#
# group.add_argument('--print-seq','-s',
# action="store_true", dest="align:printseq",
# default=False,
# help="The nucleotide sequences are written in the output view. Default: they are not written.")
#
# group.add_argument('--print-count','-n',
# action="store_true", dest="align:printcount",
# default=False,
# help="Sequence counts are written in the output view. Default: they are not written.")
#
# group.add_argument('--thread-count','-p', # TODO should probably be in a specific option group
# action="store", dest="align:threadcount",
# metavar='<THREAD COUNT>',
# default=1,
# type=int,
# help="Number of threads to use for the computation. Default: one.")
#
#
# # cpdef align(str dms_n,
# # str input_view_1_n, str output_view_n,
# # str input_view_2_n="",
# # str input_column_1_n="", str input_column_2_n="",
# # str input_elt_1_n="", str input_elt_2_n="",
# # str id_column_1_n="", str id_column_2_n="",
# # double threshold=0.0, bint normalize=True,
# # int reference=0, bint similarity_mode=True,
# # bint print_seq=False, bint print_count=False,
# # comments="",
# # int thread_count=1) :
# #
# # cdef OBIDMS d
# # d = OBIDMS(dms_n)
# #
# # if input_view_2_n == "" and input_column_2_n == "" :
# # if obi_lcs_align_one_column(d._pointer, \
# # str2bytes(input_view_1_n), \
# # str2bytes(input_column_1_n), \
# # str2bytes(input_elt_1_n), \
# # str2bytes(id_column_1_n), \
# # str2bytes(output_view_n), \
# # str2bytes(comments), \
# # print_seq, \
# # print_count, \
# # threshold, normalize, reference, similarity_mode,
# # thread_count) < 0 :
# # raise Exception("Error aligning sequences")
# # else :
# # if obi_lcs_align_two_columns(d._pointer, \
# # str2bytes(input_view_1_n), \
# # str2bytes(input_view_2_n), \
# # str2bytes(input_column_1_n), \
# # str2bytes(input_column_2_n), \
# # str2bytes(input_elt_1_n), \
# # str2bytes(input_elt_2_n), \
# # str2bytes(id_column_1_n), \
# # str2bytes(id_column_2_n), \
# # str2bytes(output_view_n), \
# # str2bytes(comments), \
# # print_seq, \
# # print_count, \
# # threshold, normalize, reference, similarity_mode) < 0 :
# # raise Exception("Error aligning sequences")
# #
# # d.close()
# #
# #
def run(config):
pass
# TODO: Build formatted comments with all parameters etc
# comments = "Obi align"
#
# # Call cython alignment function
# align(config['obi']['defaultdms'], \
# config['obi']['inputview1'], \
# config['obi']['outputview'], \
# input_view_2_n = config['obi']['inputview2'], \
# input_column_1_n = config['obi']['inputcolumn1'], \
# input_column_2_n = config['obi']['inputcolumn2'], \
# input_elt_1_n = config['obi']['inputelement1'], \
# input_elt_2_n = config['obi']['inputelement2'], \
# id_column_1_n = config['obi']['idcolumn1'], \
# id_column_2_n = config['obi']['idcolumn2'], \
# threshold = config['align']['threshold'], \
# normalize = config['align']['normalize'], \
# reference = config['align']['reflength'], \
# similarity_mode = config['align']['similarity'], \
# print_seq = config['align']['printseq'], \
# print_count = config['align']['printcount'], \
# comments = comments, \
# thread_count = config['align']['threadcount'])
#
# print("Done.")
# #
# #
# #
# #
# #

@@ -0,0 +1,57 @@
#cython: language_level=3
from obitools3.dms.dms import DMS # TODO cimport doesn't work
from obitools3.dms.view.view import View # TODO cimport doesn't work
# TODO with URIs
__title__="Print the first lines of a view (equivalent of less)"
default_config = { 'inputview' : None,
}
def addOptions(parser):
# TODO put this common group somewhere else but I don't know where
group=parser.add_argument_group('DMS and view options')
group.add_argument('--default-dms','-d',
action="store", dest="obi:defaultdms",
metavar='<DMS NAME>',
default=None,
type=str,
help="Name of the default DMS for reading and writing data.")
group.add_argument('--view','-v',
action="store", dest="obi:inputview",
metavar='<VIEW NAME>',
default=None,
type=str,
help="Name of the view to print.")
group=parser.add_argument_group('obi less specific options')
group.add_argument('--print', '-n',
action="store", dest="less:print",
metavar='<N>',
default=10,
type=int,
help="Print N sequences (default: 10)")
def run(config):
    # Open DMS
    d = DMS.open(config['obi']['defaultdms'])
    # Open input view
    iview = View.open(d, config['obi']['inputview'])
    # Print at most N lines, without reading past the end of the view
    for i in range(min(config['less']['print'], len(iview))) :
        print(repr(iview[i]))
    d.close()
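
Printing the first N lines of a view is safe only if the view has at least N lines. One way to avoid the bound check altogether is to slice the iteration, as in this minimal sketch; it assumes the view can be iterated line by line, and `preview` and the sample rows are hypothetical stand-ins rather than obitools3 objects.

```python
from itertools import islice


def preview(view, n=10):
    """Return the repr of at most the first `n` lines of `view`.

    Hypothetical helper: `view` is any iterable of lines (a list of dicts
    here); islice simply stops early when the iterable is shorter than n.
    """
    return [repr(line) for line in islice(view, n)]


rows = [{'ID': 'a'}, {'ID': 'b'}, {'ID': 'c'}]
for text in preview(rows, 10):
    print(text)
```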

@@ -0,0 +1,437 @@
#cython: language_level=3
from obitools3.apps.progress cimport ProgressBar # TODO I absolutely don't understand why it doesn't work without that line
from obitools3.dms.view.view import View, Line_selection
from obitools3.dms.view.typed_view.view_NUC_SEQS import View_NUC_SEQS
from obitools3.dms.dms import DMS
from obitools3.dms.column import Column
from obitools3.dms.taxo.taxo import OBI_Taxonomy
from obitools3.utils cimport str2bytes
from obitools3.dms.capi.obitypes cimport OBI_INT, \
OBI_FLOAT, \
OBI_BOOL, \
OBI_CHAR, \
OBI_STR, \
OBI_SEQ
import shutil
import string
import random
VIEW_TYPES = ["", "NUC_SEQS_VIEW"]
COL_TYPES = [OBI_INT, OBI_FLOAT, OBI_BOOL, OBI_CHAR, OBI_STR, OBI_SEQ]
NUC_SEQUENCE_COLUMN = "NUC_SEQ"
ID_COLUMN = "ID"
DEFINITION_COLUMN = "DEFINITION"
QUALITY_COLUMN = "QUALITY"
SPECIAL_COLUMNS = [NUC_SEQUENCE_COLUMN, ID_COLUMN, DEFINITION_COLUMN, QUALITY_COLUMN]
#TAXDUMP = "" TODO path=?
TAXTEST = "taxtest"
NAME_MAX_LEN = 200
COL_COMMENTS_MAX_LEN = 2048
MAX_INT = 2147483647 # used to generate random float values
__title__="Tests if the obitools are working properly"
default_config = {
}
def test_taxo(config, infos):
tax1 = OBI_Taxonomy.open(infos['dms'], config['obi']['taxo'], taxdump=True)
tax1.write(TAXTEST)
tax2 = OBI_Taxonomy.open(infos['dms'], TAXTEST, taxdump=False)
assert len(tax1) == len(tax2), "Length of written taxonomy != length of read taxdump : "+str(len(tax2))+" != "+str(len(tax1))
i = 0
for x in range(config['test']['nbtests']):
idx = random.randint(0, len(tax1)-1)
t1 = tax1.get_taxon_by_idx(idx)
t2 = tax2.get_taxon_by_idx(idx)
assert t1 == t2, "Taxon gotten from written taxonomy != taxon read from taxdump : "+str(t2)+" != "+str(t1)
i+=1
if (i%(config['test']['nbtests']/10)) == 0 :
print("Testing taxonomy functions......"+str(i*100/config['test']['nbtests'])+"%")
tax1.close()
tax2.close()
def random_length(max_len):
return random.randint(1, max_len)
def random_bool(config):
return random.choice([True, False])
def random_char(config):
return str2bytes(random.choice(string.ascii_lowercase))
def random_float(config):
return random.randint(0, MAX_INT) + random.random()
def random_int(config):
return random.randint(0, config['test']['maxlinenb'])
def random_seq(config):
return str2bytes(''.join(random.choice(['a','t','g','c']) for i in range(random_length(config['test']['seqmaxlen']))))
def random_bytes(config):
return random_bytes_with_max_len(config['test']['strmaxlen'])
def random_str_with_max_len(max_len):
return ''.join(random.choice(string.ascii_lowercase) for i in range(random_length(max_len)))
def random_bytes_with_max_len(max_len):
return str2bytes(''.join(random.choice(string.ascii_lowercase) for i in range(random_length(max_len))))
def random_column(infos):
return random.choice(sorted(list(infos['view'].keys())))
def random_unique_name(infos):
name = ""
while name == "" or name in infos['unique_names'] :
name = random_str_with_max_len(NAME_MAX_LEN)
infos['unique_names'].append(name)
return name
def random_unique_element_name(config, infos):
name = ""
while name == "" or name in infos['unique_names'] :
name = random_str_with_max_len(config['test']['elt_name_max_len'])
infos['unique_names'].append(name)
return name
def print_test(config, sentence):
if config['test']['verbose'] :
print(sentence)
def test_set_and_get(config, infos):
print_test(config, ">>> Set and get test")
col_name = random_column(infos)
col = infos['view'][col_name]
element_names = col.elements_names
data_type = col.data_type
if data_type == b"OBI_QUAL" :
print_test(config, "-")
return
idx = random_int(config)
value = random.choice([None, infos['random_generator'][data_type](config)])
if col.nb_elements_per_line > 1 :
elt = random.choice(element_names)
col[idx][elt] = value
assert col[idx][elt] == value, "Column: "+repr(col)+"\nSet value != gotten value "+str(value)+" != "+str(col[idx][elt])
else:
col[idx] = value
assert col[idx] == value, "Column: "+repr(col)+"\nSet value != gotten value "+str(value)+" != "+str(col[idx])
print_test(config, ">>> Set and get test OK")
def test_add_col(config, infos):
print_test(config, ">>> Add column test")
#existing_col = random_bool(config) # TODO doesn't work because of line count problem. See obiview.c line 1737
#if existing_col and infos["view_names"] != [] :
# random_view = infos['dms'].open_view(random.choice(infos["view_names"]))
# random_column = random_view[random.choice(sorted(list(random_view.columns)))]
# random_column_refs = random_column.refs
# if random_column_refs['name'] in infos['view'] :
# alias = random_unique_name(infos)
# else :
# alias = ''
# infos['view'].add_column(random_column_refs['name'], version_number=random_column_refs['version'], alias=alias, create=False)
# random_view.close()
#else :
create_random_column(config, infos)
print_test(config, ">>> Add column test OK")
def test_delete_col(config, infos):
print_test(config, ">>> Delete column test")
if len(list(infos['view'].keys())) <= 1 :
print_test(config, "-")
return
col_name = random_column(infos)
if col_name in SPECIAL_COLUMNS :
print_test(config, "-")
return
infos['view'].delete_column(col_name)
print_test(config, ">>> Delete column test OK")
def test_col_alias(config, infos):
print_test(config, ">>> Changing column alias test")
col_name = random_column(infos)
if col_name in SPECIAL_COLUMNS :
print_test(config, "-")
return
infos['view'][col_name].name = random_unique_name(infos)
print_test(config, ">>> Changing column alias test OK")
def test_new_view(config, infos):
print_test(config, ">>> New view test")
random_new_view(config, infos)
print_test(config, ">>> New view test OK")
def random_test(config, infos):
return random.choice(infos['tests'])(config, infos)
def random_view_type():
return random.choice(VIEW_TYPES)
def random_col_type():
return random.choice(COL_TYPES)
def fill_column(config, infos, col) :
data_type = col.data_type
element_names = col.elements_names
if len(element_names) > 1 :
for i in range(random_int(config)) :
for j in range(len(element_names)) :
col[i][element_names[j]] = random.choice([None, infos['random_generator'][data_type](config)])
else :
for i in range(random_int(config)) :
col[i] = random.choice([None, infos['random_generator'][data_type](config)])
def create_random_column(config, infos) :
alias = random.choice(['', random_unique_name(infos)])
nb_elements_per_line=random.randint(1, config['test']['maxelts'])
elements_names = []
for i in range(nb_elements_per_line) :
elements_names.append(random_unique_element_name(config, infos))
elements_names = random.choice([None, elements_names])
name = random_unique_name(infos)
data_type = random_col_type()
column = Column.new_column(infos['view'],
name,
data_type,
nb_elements_per_line=nb_elements_per_line,
elements_names=elements_names,
comments=random_str_with_max_len(COL_COMMENTS_MAX_LEN),
alias=alias
)
if alias != '' :
assert infos['view'][alias] == column
else :
assert infos['view'][name] == column
return column
def fill_view(config, infos):
for i in range(random.randint(1, config['test']['maxinicolcount'])) :
col = create_random_column(config, infos)
fill_column(config, infos, col)
def random_new_view(config, infos, first=False):
v_to_clone = None
line_selection = None
quality_col = False # TODO
if not first:
infos['view_names'].append(infos['view'].name)
infos['view'].close()
v_to_clone = View.open(infos['dms'], random.choice(infos["view_names"]))
v_type = ""
print_test(config, "View to clone: ")
print_test(config, repr(v_to_clone))
create_line_selection = random_bool(config)
if create_line_selection and v_to_clone.line_count > 0:
print_test(config, "New view with new line selection.")
line_selection = Line_selection(v_to_clone)
for i in range(random.randint(1, v_to_clone.line_count)) :
line_selection.append(random.randint(0, v_to_clone.line_count-1))
#print_test(config, "New line selection: "+str(line_selection))
else :
v_type = random_view_type()
if line_selection is not None :
infos['view'] = line_selection.materialize(random_unique_name(infos), comments=random_str_with_max_len(config['test']['commentsmaxlen']))
elif v_to_clone is not None :
infos['view'] = v_to_clone.clone(random_unique_name(infos), comments=random_str_with_max_len(config['test']['commentsmaxlen']))
else :
if v_type == "NUC_SEQS_VIEW" :
infos['view'] = View_NUC_SEQS.new(infos['dms'], random_unique_name(infos), comments=random_str_with_max_len(config['test']['commentsmaxlen'])) # TODO quality column
else :
infos['view'] = View.new(infos['dms'], random_unique_name(infos), comments=random_str_with_max_len(config['test']['commentsmaxlen'])) # TODO quality column
print_test(config, repr(infos['view']))
if v_to_clone is not None :
if line_selection is None:
assert v_to_clone.line_count == infos['view'].line_count, "New view and cloned view don't have the same line count : "+str(v_to_clone.line_count)+" (view to clone line count) != "+str(infos['view'].line_count)+" (new view line count)"
else :
assert len(line_selection) == infos['view'].line_count, "New view with new line selection does not have the right line count : "+str(len(line_selection))+" (line selection length) != "+str(infos['view'].line_count)+" (new view line count)"
v_to_clone.close()
if first :
fill_view(config, infos)
def create_test_obidms(config, infos):
infos['dms'] = DMS.new(config['obi']['defaultdms'])
def ini_dms_and_first_view(config, infos):
create_test_obidms(config, infos)
random_new_view(config, infos, first=True)
infos['view_names'] = []
def addOptions(parser):
# TODO put this common group somewhere else but I don't know where
group=parser.add_argument_group('DMS and view options')
group.add_argument('--default-dms','-d',
action="store", dest="obi:defaultdms",
metavar='<DMS NAME>',
default="/tmp/test_dms",
type=str,
help="Name of the default DMS for reading and writing data. "
"Default: /tmp/test_dms")
group.add_argument('--taxo','-t', # TODO I don't understand why the option is not registered if it is not set
action="store", dest="obi:taxo",
metavar='<TAXDUMP PATH>',
default='', # TODO not None because if it's None, the option is not entered in the option dictionary.
type=str,
help="Path to a taxdump to test the taxonomy.")
group=parser.add_argument_group('obi test specific options')
group.add_argument('--nb_tests','-n',
action="store", dest="test:nbtests",
metavar='<NB_TESTS>',
default=1000,
type=int,
help="Number of tests to carry out. "
"Default: 1000")
group.add_argument('--seq_max_len','-s',
action="store", dest="test:seqmaxlen",
metavar='<SEQ_MAX_LEN>',
default=200,
type=int,
help="Maximum length of DNA sequences. "
"Default: 200")
group.add_argument('--str_max_len','-r',
action="store", dest="test:strmaxlen",
metavar='<STR_MAX_LEN>',
default=200,
type=int,
help="Maximum length of character strings. "
"Default: 200")
group.add_argument('--comments_max_len','-c',
action="store", dest="test:commentsmaxlen",
metavar='<COMMENTS_MAX_LEN>',
default=10000,
type=int,
help="Maximum length of view comments. "
"Default: 10000")
group.add_argument('--max_ini_col_count','-o',
action="store", dest="test:maxinicolcount",
metavar='<MAX_INI_COL_COUNT>',
default=10,
type=int,
help="Maximum number of columns in the initial view. "
"Default: 10")
group.add_argument('--max_line_nb','-l',
action="store", dest="test:maxlinenb",
metavar='<MAX_LINE_NB>',
default=10000,
type=int,
help="Maximum number of lines in a column. "
"Default: 10000")
group.add_argument('--max_elts_per_line','-e',
action="store", dest="test:maxelts",
metavar='<MAX_ELTS_PER_LINE>',
default=20,
type=int,
help="Maximum number of elements per line in a column. "
"Default: 20")
group.add_argument('--verbose','-v',
action="store_true", dest="test:verbose",
default=False,
help="Print the tests. "
"Default: Don't print the tests")
group.add_argument('--seed','-g',
action="store", dest="test:seed",
metavar='<SEED>',
default=None,
help="Seed (use for reproducible tests). "
"Default: Seed is determined by Python")
def run(config):
if 'seed' in config['test'] :
random.seed(config['test']['seed'])
infos = {'dms': None,
'view': None,
'view_names': None,
'unique_names': [],
'random_generator': {b"OBI_BOOL": random_bool, b"OBI_CHAR": random_char, b"OBI_FLOAT": random_float, b"OBI_INT": random_int, b"OBI_SEQ": random_seq, b"OBI_STR": random_bytes},
'tests': [test_set_and_get, test_add_col, test_delete_col, test_col_alias, test_new_view]
}
config['test']['elt_name_max_len'] = int((COL_COMMENTS_MAX_LEN - config['test']['maxelts']) / config['test']['maxelts'])
print("Initializing the DMS and the first view...")
shutil.rmtree(config['obi']['defaultdms']+'.obidms', ignore_errors=True)
ini_dms_and_first_view(config, infos)
print_test(config, repr(infos['view']))
i = 0
for t in range(config['test']['nbtests']):
random_test(config, infos)
print_test(config, repr(infos['view']))
i+=1
if (i%(config['test']['nbtests']/10)) == 0 :
print("Testing......"+str(i*100//config['test']['nbtests'])+"%")
#print(infos)
if config['obi']['taxo'] != '' :
test_taxo(config, infos)
infos['view'].close()
infos['dms'].close()
shutil.rmtree(config['obi']['defaultdms']+'.obidms', ignore_errors=True)
print("Done.")
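The options above are declared with `section:key` destinations (for example `dest="test:nbtests"`), while `run()` reads them back as `config['test']['nbtests']`. The code that performs this folding is not part of this diff, so the following is only a minimal sketch of how it could work; `namespace_to_config` is a hypothetical helper, not the obitools3 implementation.

```python
import argparse


def namespace_to_config(namespace):
    """Fold 'section:key' destinations into a nested dict (hypothetical helper)."""
    config = {}
    for dest, value in vars(namespace).items():
        # "test:nbtests" -> section "test", key "nbtests"
        section, _, key = dest.partition(":")
        config.setdefault(section, {})[key] = value
    return config


parser = argparse.ArgumentParser()
parser.add_argument("--default-dms", "-d", dest="obi:defaultdms",
                    type=str, default="/tmp/test_dms")
group = parser.add_argument_group("obi test specific options")
group.add_argument("--nb_tests", "-n", dest="test:nbtests", type=int, default=1000)
group.add_argument("--seed", "-g", dest="test:seed", default=None)

# Parse an explicit argument list so the sketch does not depend on sys.argv.
config = namespace_to_config(parser.parse_args(["-n", "50"]))
print(config["test"]["nbtests"], config["obi"]["defaultdms"])
```

In this sketch an option left at its `None` default still appears in the dictionary; the TODO on `--taxo` suggests the real configuration builder behaves differently.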


@@ -0,0 +1,2 @@
from .dms import DMS # @UnresolvedImport


@@ -0,0 +1,41 @@
#cython: language_level=3
from obitools3.dms.capi.obidms cimport OBIDMS_p
from obitools3.dms.capi.obitypes cimport const_char_p
cdef extern from "obi_align.h" nogil:
int obi_lcs_align_one_column(OBIDMS_p dms,
const_char_p seq_view_name,
const_char_p seq_column_name,
const_char_p seq_elt_name,
const_char_p id_column_name,
const_char_p output_view_name,
const_char_p output_view_comments,
bint print_seq,
bint print_count,
double threshold,
bint normalize,
int reference,
bint similarity_mode,
int thread_count)
int obi_lcs_align_two_columns(OBIDMS_p dms,
const_char_p seq1_view_name,
const_char_p seq2_view_name,
const_char_p seq1_column_name,
const_char_p seq2_column_name,
const_char_p seq1_elt_name,
const_char_p seq2_elt_name,
const_char_p id1_column_name,
const_char_p id2_column_name,
const_char_p output_view_name,
const_char_p output_view_comments,
bint print_seq,
bint print_count,
double threshold,
bint normalize,
int reference,
bint similarity_mode)


@@ -0,0 +1,19 @@
#cython: language_level=3
from .obitypes cimport const_char_p
cdef extern from "obidms.h" nogil:
struct OBIDMS_t:
const_char_p dms_name
ctypedef OBIDMS_t* OBIDMS_p
OBIDMS_p obi_dms(const_char_p dms_name)
OBIDMS_p obi_open_dms(const_char_p dms_path)
OBIDMS_p obi_test_open_dms(const_char_p dms_path)
OBIDMS_p obi_create_dms(const_char_p dms_path)
int obi_close_dms(OBIDMS_p dms)
char* obi_dms_get_dms_path(OBIDMS_p dms)
char* obi_dms_get_full_path(OBIDMS_p dms, const_char_p path_name)


@@ -0,0 +1,62 @@
#cython: language_level=3
from ..capi.obidms cimport OBIDMS_p
from ..capi.obitypes cimport const_char_p, \
OBIType_t, \
obiversion_t, \
obiint_t, \
obibool_t, \
obichar_t, \
obifloat_t, \
index_t, \
time_t
from libc.stdint cimport uint8_t
cdef extern from "obidmscolumn.h" nogil:
struct Column_reference_t :
const_char_p column_name
obiversion_t version
ctypedef Column_reference_t* Column_reference_p
struct OBIDMS_column_header_t:
size_t header_size
size_t data_size
index_t line_count
index_t lines_used
index_t nb_elements_per_line
const_char_p elements_names
OBIType_t returned_data_type
OBIType_t stored_data_type
time_t creation_date
obiversion_t version
obiversion_t cloned_from
const_char_p name
const_char_p indexer_name
Column_reference_t associated_column
const_char_p comments
ctypedef OBIDMS_column_header_t* OBIDMS_column_header_p
struct OBIDMS_column_t:
OBIDMS_p dms
OBIDMS_column_header_p header
bint writable
ctypedef OBIDMS_column_t* OBIDMS_column_p
int obi_close_column(OBIDMS_column_p column)
obiversion_t obi_column_get_latest_version_from_name(OBIDMS_p dms,
const_char_p column_name)
OBIDMS_column_header_p obi_column_get_header_from_name(OBIDMS_p dms,
const_char_p column_name,
obiversion_t version_number)
int obi_close_header(OBIDMS_column_header_p header)
char* obi_get_elements_names(OBIDMS_column_p column)


@@ -0,0 +1,8 @@
#cython: language_level=3
cdef extern from "obierrno.h":
int obi_errno
extern int OBI_LINE_IDX_ERROR
extern int OBI_ELT_IDX_ERROR


@@ -7,21 +7,38 @@ from libc.stdint cimport int32_t
cdef extern from "obidms_taxonomy.h" nogil:
struct OBIDMS_taxonomy_t
ctypedef OBIDMS_taxonomy_t* OBIDMS_taxonomy_p
struct ecotxnode :
int32_t taxid
int32_t rank
int32_t farest
ecotxnode* parent
char* name
char* preferred_name
ctypedef ecotxnode ecotx_t
OBIDMS_taxonomy_p obi_read_taxonomy(OBIDMS_p dms, const_char_p taxonomy_name, bint read_alternative_names)
struct ecotxidx_t :
int32_t count
int32_t max_taxid
int32_t buffer_size
ecotx_t* taxon
struct OBIDMS_taxonomy_t :
# ecorankidx_t* ranks
# econameidx_t* names
ecotxidx_t* taxa
ctypedef OBIDMS_taxonomy_t* OBIDMS_taxonomy_p
OBIDMS_taxonomy_p obi_read_taxonomy(OBIDMS_p dms, const_char_p taxonomy_name, bint read_alternative_names)
OBIDMS_taxonomy_p obi_read_taxdump(const_char_p taxdump)
int obi_write_taxonomy(OBIDMS_p dms, OBIDMS_taxonomy_p tax, const_char_p tax_name)
int obi_close_taxonomy(OBIDMS_taxonomy_p taxonomy)
ecotx_t* obi_taxo_get_parent_at_rank(ecotx_t* taxon, int32_t rankidx)
@@ -40,3 +57,9 @@ cdef extern from "obidms_taxonomy.h" nogil:
ecotx_t* obi_taxo_get_superkingdom(ecotx_t* taxon, OBIDMS_taxonomy_p taxonomy)
int obi_taxo_add_local_taxon(OBIDMS_taxonomy_p tax, const char* name, const char* rank_name, int32_t parent_taxid, int32_t min_taxid)
int obi_taxo_add_preferred_name_with_taxid(OBIDMS_taxonomy_p tax, int32_t taxid, const char* preferred_name)
int obi_taxo_add_preferred_name_with_taxon(OBIDMS_taxonomy_p tax, ecotx_t* taxon, const char* preferred_name)


@@ -12,6 +12,8 @@ cdef extern from *:
cdef extern from "encode.h" nogil:
bint only_ATGC(const_char_p seq)
bint only_IUPAC_DNA(const_char_p seq)
bint is_a_DNA_seq(const_char_p seq)
cdef extern from "obitypes.h" nogil:
@@ -53,3 +55,4 @@ cdef extern from "obitypes.h" nogil:
const_char_p name_data_type(int data_type)
ctypedef OBIType_t obitype_t


@@ -10,7 +10,9 @@ from .obitypes cimport const_char_p, \
index_t, \
time_t
from ..capi.obidms cimport OBIDMS_p
from ..capi.obidmscolumn cimport OBIDMS_column_p
from ..capi.obidmscolumn cimport OBIDMS_column_p, \
Column_reference_t, \
Column_reference_p
from libc.stdint cimport uint8_t
@@ -24,11 +26,11 @@ cdef extern from "obiview.h" nogil:
extern const_char_p QUALITY_COLUMN
struct Column_reference_t :
const_char_p column_name
obiversion_t version
struct Alias_column_pair_t :
Column_reference_t column_refs
const_char_p alias
ctypedef Column_reference_t* Column_reference_p
ctypedef Alias_column_pair_t* Alias_column_pair_p
struct Obiview_infos_t :
@@ -40,7 +42,7 @@ cdef extern from "obiview.h" nogil:
Column_reference_t line_selection
index_t line_count
int column_count
Column_reference_p column_references
Alias_column_pair_p column_references
const_char_p comments
ctypedef Obiview_infos_t* Obiview_infos_p
@@ -51,21 +53,26 @@ cdef extern from "obiview.h" nogil:
OBIDMS_p dms
bint read_only
OBIDMS_column_p line_selection
OBIDMS_column_p new_line_selection
OBIDMS_column_p columns
int nb_predicates
# TODO declarations for column dictionary and predicate function array?
ctypedef Obiview_t* Obiview_p
Obiview_p obi_new_view_nuc_seqs(OBIDMS_p dms, const_char_p view_name, Obiview_p view_to_clone, index_t* line_selection, const_char_p comments)
Obiview_p obi_new_view_nuc_seqs(OBIDMS_p dms, const_char_p view_name, Obiview_p view_to_clone, index_t* line_selection, const_char_p comments, bint quality_column)
Obiview_p obi_new_view(OBIDMS_p dms, const_char_p view_name, Obiview_p view_to_clone, index_t* line_selection, const_char_p comments)
Obiview_p obi_new_view_cloned_from_name(OBIDMS_p dms, const_char_p view_name, const_char_p view_to_clone_name, index_t* line_selection, const_char_p comments)
Obiview_p obi_new_view_nuc_seqs_cloned_from_name(OBIDMS_p dms, const_char_p view_name, const_char_p view_to_clone_name, index_t* line_selection, const_char_p comments)
Obiview_infos_p obi_view_map_file(OBIDMS_p dms, const char* view_name)
Obiview_p obi_clone_view(OBIDMS_p dms, Obiview_p view_to_clone, const char* view_name, index_t* line_selection, const char* comments)
Obiview_p obi_clone_view_from_name(OBIDMS_p dms, const char* view_to_clone_name, const char* view_name, index_t* line_selection, const char* comments)
Obiview_infos_p obi_view_map_file(OBIDMS_p dms, const char* view_name, bint finished)
int obi_view_unmap_file(OBIDMS_p dms, Obiview_infos_p view_infos)
@@ -74,207 +81,218 @@ cdef extern from "obiview.h" nogil:
int obi_view_add_column(Obiview_p view,
const_char_p column_name,
obiversion_t version_number,
const_char_p alias,
OBIType_t data_type,
index_t nb_lines,
index_t nb_elements_per_line,
const_char_p elements_names,
char* elements_names,
const_char_p indexer_name,
const_char_p associated_column_name,
obiversion_t associated_column_version,
const_char_p comments,
bint create)
int obi_view_delete_column(Obiview_p view, const_char_p column_name)
int obi_select_line(Obiview_p view, index_t line_nb)
int obi_select_lines(Obiview_p view, index_t* line_nbs)
OBIDMS_column_p obi_view_get_column(Obiview_p view, const_char_p column_name)
OBIDMS_column_p* obi_view_get_pointer_on_column_in_view(Obiview_p view, const_char_p column_name)
int obi_save_view(Obiview_p view)
int obi_close_view(Obiview_p view)
int obi_view_create_column_alias(Obiview_p view, const_char_p current_name, const_char_p alias)
int obi_save_and_close_view(Obiview_p view)
int obi_column_set_obiint_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
# OBI_INT
int obi_set_int_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name,
obiint_t value)
int obi_column_set_obiint_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
int obi_set_int_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx,
obiint_t value)
obiint_t obi_column_get_obiint_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
obiint_t obi_get_int_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name)
obiint_t obi_column_get_obiint_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
obiint_t obi_get_int_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx)
int obi_column_set_obibool_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
# OBI_BOOL
int obi_set_bool_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name,
obibool_t value)
int obi_column_set_obibool_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
int obi_set_bool_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx,
obibool_t value)
obibool_t obi_column_get_obibool_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
obibool_t obi_get_bool_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name)
obibool_t obi_column_get_obibool_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
obibool_t obi_get_bool_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx)
int obi_column_set_obichar_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
# OBI_CHAR
int obi_set_char_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name,
obichar_t value)
int obi_column_set_obichar_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
int obi_set_char_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx,
obichar_t value)
obichar_t obi_column_get_obichar_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
obichar_t obi_get_char_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name)
obichar_t obi_column_get_obichar_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
obichar_t obi_get_char_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx)
int obi_column_set_obifloat_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
# OBI_FLOAT
int obi_set_float_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name,
obifloat_t value)
int obi_column_set_obifloat_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
int obi_set_float_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx,
obifloat_t value)
obifloat_t obi_column_get_obifloat_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
obifloat_t obi_get_float_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name)
obifloat_t obi_column_get_obifloat_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
obifloat_t obi_get_float_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx)
int obi_column_set_obiqual_char_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
# OBI_QUAL
int obi_set_qual_char_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx,
const char* value)
int obi_column_set_obiqual_int_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
int obi_set_qual_int_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx,
const uint8_t* value,
int value_length)
char* obi_column_get_obiqual_char_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
char* obi_get_qual_char_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx)
const uint8_t* obi_column_get_obiqual_int_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
const uint8_t* obi_get_qual_int_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx,
int* value_length)
int obi_column_set_obiqual_char_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
int obi_set_qual_char_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const char* element_name,
const char* value)
int obi_column_set_obiqual_int_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
int obi_set_qual_int_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const char* element_name,
const uint8_t* value,
int value_length)
char* obi_column_get_obiqual_char_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
char* obi_get_qual_char_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const char* element_name)
const uint8_t* obi_column_get_obiqual_int_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
const uint8_t* obi_get_qual_int_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const char* element_name,
int* value_length)
int obi_column_set_obistr_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
# OBI_STR
int obi_set_str_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name,
const_char_p value)
int obi_column_set_obistr_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
int obi_set_str_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx,
const_char_p value)
const_char_p obi_column_get_obistr_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
const_char_p obi_get_str_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name)
const_char_p obi_column_get_obistr_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
const_char_p obi_get_str_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx)
int obi_column_set_obiseq_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
# OBI_SEQ
int obi_set_seq_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name,
const_char_p value)
int obi_column_set_obiseq_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
int obi_set_seq_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx,
const_char_p value)
char* obi_column_get_obiseq_with_elt_name_in_view(Obiview_p view,
OBIDMS_column_p column,
char* obi_get_seq_with_elt_name_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
const_char_p element_name)
char* obi_column_get_obiseq_with_elt_idx_in_view(Obiview_p view,
OBIDMS_column_p column,
char* obi_get_seq_with_elt_idx_and_col_p_in_view(Obiview_p view,
OBIDMS_column_p column_p,
index_t line_nb,
index_t element_idx)


@@ -0,0 +1 @@
from .column import Column # @UnresolvedImport


@@ -0,0 +1,52 @@
#cython: language_level=3
from ..capi.obitypes cimport index_t, \
obitype_t
from ..capi.obidmscolumn cimport OBIDMS_column_p
from ..view.view cimport View
from ..object cimport OBIWrapper
cdef dict __OBIDMS_COLUMN_CLASS__
cdef class Column(OBIWrapper) :
cdef View _view
cdef bytes _alias
cdef inline OBIDMS_column_p pointer(self)
cpdef close(self)
@staticmethod
cdef type get_column_class(obitype_t obitype, bint multi_elts)
@staticmethod
cdef type get_python_type(obitype_t obitype, bint multi_elts)
cdef class Column_multi_elts(Column) :
# The type of [values] can be dict, Column_line, or any other class with values referenced by keys with an iterator [for key in values]
cpdef set_line(self, index_t line_nb, object values)
cdef class Column_line:
cdef Column _column
cdef index_t _index
cpdef update(self, data)
cdef register_column_class(obitype_t obitype,
bint multi_elts,
type obiclass,
type python)
cdef register_all_column_classes()


@@ -0,0 +1,366 @@
#cython: language_level=3
from obitools3.dms.column import typed_column
__OBIDMS_COLUMN_CLASS__ = {}
from ..capi.obitypes cimport name_data_type, \
obitype_t, \
OBI_BOOL
from ..capi.obidmscolumn cimport OBIDMS_column_header_p, \
obi_close_column, \
obi_get_elements_names
from ..capi.obiutils cimport obi_format_date
from ..capi.obiview cimport obi_view_add_column, \
obi_view_get_pointer_on_column_in_view, \
Obiview_p
from ..object cimport OBIObjectClosedInstance
from obitools3.utils cimport tobytes, \
bytes2str, \
str2bytes
from obitools3.dms.column import typed_column
import importlib
import inspect
import pkgutil
cdef class Column(OBIWrapper) :
'''
The obitools3.dms.column.Column class wraps a C instance of a column in the context of a View
'''
cdef inline OBIDMS_column_p pointer(self) :
return <OBIDMS_column_p>(<OBIDMS_column_p*>(self._pointer))[0]
@staticmethod
cdef type get_column_class(obitype_t obitype, bint multi_elts):
'''
Internal function returning the python class representing
a column for a given obitype.
'''
return __OBIDMS_COLUMN_CLASS__[(obitype, multi_elts)][0]
@staticmethod
cdef type get_python_type(obitype_t obitype, bint multi_elts): # TODO
'''
Internal function returning the python type representing
an instance for a given obitype.
'''
return __OBIDMS_COLUMN_CLASS__[(obitype, multi_elts)][1]
@staticmethod
def new_column(View view,
object column_name,
obitype_t data_type,
index_t nb_elements_per_line=1,
list elements_names=None,
object comments=b"",
object alias=b""):
# TODO indexer_name?
cdef bytes column_name_b = tobytes(column_name)
cdef bytes alias_b = tobytes(alias)
cdef bytes comments_b = tobytes(comments)
cdef bytes elements_names_b
cdef char* elements_names_p
if not view.active() :
raise OBIObjectClosedInstance()
if alias_b == b"" :
alias_b = column_name_b
if elements_names is not None:
elements_names_b = b';'.join([tobytes(x) for x in elements_names])
elements_names_p = elements_names_b
else:
elements_names_p = NULL
if (obi_view_add_column(view = view.pointer(),
column_name = column_name_b,
version_number = -1,
alias = alias_b,
data_type = <obitype_t>data_type,
nb_lines = len(view),
nb_elements_per_line = nb_elements_per_line,
elements_names = elements_names_p,
indexer_name = NULL,
associated_column_name = NULL,
associated_column_version = -1,
comments = comments_b,
create = True)<0):
raise RuntimeError("Cannot create column %s in view %s" % (bytes2str(column_name_b),
bytes2str(view.name)))
return Column.open(view, alias_b)
@staticmethod
def open(View view,
object column_name):
cdef bytes column_name_b = tobytes(column_name)
cdef OBIDMS_column_p* column_pp
cdef OBIDMS_column_p column_p
cdef Column column
cdef obitype_t column_type
cdef type column_class
if not view.active() :
raise OBIObjectClosedInstance()
column_pp = obi_view_get_pointer_on_column_in_view(view.pointer(),
column_name_b)
if column_pp == NULL:
raise KeyError("Cannot access column %s in view %s" % (
bytes2str(column_name_b),
bytes2str(view.name)
))
column_p = column_pp[0]
column_type = column_p.header.returned_data_type
column_class = Column.get_column_class(column_type, (column_p.header.nb_elements_per_line > 1))
column = OBIWrapper.new(column_class, column_pp)
column._view = view
column._alias = column_name_b
view.register(column)
return column
def add_to_view(self,
View view,
object column_name=None) :
cdef bytes alias
cdef OBIDMS_column_p column_p = self.pointer()
if not view.active() :
raise OBIObjectClosedInstance()
if (column_name is None):
alias = self._alias
else:
alias = tobytes(column_name)
if (obi_view_add_column(view = view.pointer(),
column_name = column_p.header.name,
version_number = column_p.header.version,
alias = alias,
data_type = <obitype_t>0,
nb_lines = -1,
nb_elements_per_line = -1,
elements_names = NULL,
indexer_name = NULL,
associated_column_name = NULL,
associated_column_version = -1,
comments = NULL,
create = False) < 0):
raise RuntimeError("Cannot insert column %s (%s@%d) into view %s" %
( bytes2str(alias),
bytes2str(column_p.header.name),
column_p.header.version,
bytes2str(view.name)
))
view.register(self)
def __len__(self):
'''
implements the len() function for the Column class
@rtype: `int`
'''
return self.lines_used
def __sizeof__(self):
'''
returns the size of the C object wrapped by the Column instance
'''
cdef OBIDMS_column_header_p header = self.pointer().header
return header.header_size + header.data_size
def __iter__(self):
cdef index_t line_nb
for line_nb in range(self.lines_used):
yield self[line_nb]
def __setitem__(self, index_t line_nb, object value):
self.set_line(line_nb, value)
def __getitem__(self, index_t line_nb):
return self.get_line(line_nb)
def __str__(self) :
cdef str to_print
cdef Column_line line
to_print = ''
for line in self :
to_print = to_print + str(line) + "\n"
return to_print
def __repr__(self) :
cdef bytes s
s = self._alias + b", original name: " + self.original_name + b", version " + str2bytes(str(self.version)) + b", data type: " + self.data_type
return bytes2str(s) # TODO can't return bytes
cpdef close(self): # TODO discuss, can't be called bc then bug when closing view that tries to close it in C
cdef OBIDMS_column_p pointer
if self.active() :
pointer = self.pointer()
self._view.unregister(self)
OBIWrapper.close(self)
#if obi_close_column(pointer) < 0 :
# raise Exception("Problem closing column %s" % bytes2str(self.name))
# Column alias property getter and setter
@property
def name(self):
return self._alias
@name.setter
def name(self, new_alias): # @DuplicatedSignature
self._view.rename_column(self._alias, new_alias)
# elements_names property getter
@property
def elements_names(self):
return obi_get_elements_names(self.pointer()).split(b';')
# nb_elements_per_line property getter
@property
def nb_elements_per_line(self):
return self.pointer().header.nb_elements_per_line
# data_type property getter
@property
def data_type(self):
return name_data_type(self.pointer().header.returned_data_type)
# original_name property getter
@property
def original_name(self):
return self.pointer().header.name
# version property getter
@property
def version(self):
return self.pointer().header.version
# lines_used property getter
@property
def lines_used(self):
return self.pointer().header.lines_used
# comments property getter
@property
def comments(self):
return self.pointer().header.comments
# creation_date property getter
@property
def creation_date(self):
return obi_format_date(self.pointer().header.creation_date)
######################################################################################################
cdef class Column_multi_elts(Column) :
def __getitem__(self, index_t line_nb):
return Column_line(self, line_nb)
cpdef set_line(self, index_t line_nb, object values):
for element_name in values :
self.set_item(line_nb, element_name, values[element_name])
######################################################################################################
cdef class Column_line :
def __init__(self, Column column, index_t line_nb) :
self._index = line_nb
self._column = column
def __getitem__(self, object elt_id) :
return self._column.get_item(self._index, elt_id)
def __setitem__(self, object elt_id, object value):
self._column.set_item(self._index, elt_id, value)
def __contains__(self, object element_name):
# elements_names holds bytes, so the key must be converted before the membership test
return (tobytes(element_name) in self._column.elements_names)
def __repr__(self) :
return str(self._column.get_line(self._index))
cpdef update(self, data): # TODO ?????
if isinstance(data, dict):
data=data.items()
for key,value in data:
if key in self:
self[key]=value
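`Column_line` above is a thin proxy: it keeps only a column and a line index and forwards item access to the column, while `Column_multi_elts.set_line` accepts any object that iterates over keys and supports `values[key]`. A minimal pure-Python sketch of that pattern follows. `ToyColumn`, `ToyLine` and `to_bytes` are hypothetical in-memory stand-ins, and the `__iter__` on the proxy is an assumption added so that one line can be copied into another.

```python
def to_bytes(name):
    """Hypothetical stand-in for obitools3.utils.tobytes."""
    return name if isinstance(name, bytes) else str(name).encode("utf-8")


class ToyColumn:
    """In-memory stand-in for a multi-element column (not the C-backed class)."""

    def __init__(self, elements_names, nb_lines):
        self.elements_names = [to_bytes(n) for n in elements_names]
        self._data = [dict.fromkeys(self.elements_names) for _ in range(nb_lines)]

    def get_item(self, line_nb, elt_id):
        return self._data[line_nb][to_bytes(elt_id)]

    def set_item(self, line_nb, elt_id, value):
        self._data[line_nb][to_bytes(elt_id)] = value

    def set_line(self, line_nb, values):
        # Works for any object that iterates over keys and supports values[key].
        for element_name in values:
            self.set_item(line_nb, element_name, values[element_name])

    def __getitem__(self, line_nb):
        return ToyLine(self, line_nb)


class ToyLine:
    """Lightweight proxy: stores only the column and the line index."""

    def __init__(self, column, line_nb):
        self._column = column
        self._index = line_nb

    def __getitem__(self, elt_id):
        return self._column.get_item(self._index, elt_id)

    def __setitem__(self, elt_id, value):
        self._column.set_item(self._index, elt_id, value)

    def __contains__(self, element_name):
        # Element names are stored as bytes, so convert before comparing.
        return to_bytes(element_name) in self._column.elements_names

    def __iter__(self):
        # Assumption: iterating a line yields its element names.
        return iter(self._column.elements_names)


col = ToyColumn(["a", "b"], nb_lines=2)
col.set_line(0, {"a": True, "b": False})  # values given as a dict
col.set_line(1, col[0])                   # values given as another line proxy
col[1]["b"] = True                        # write through the proxy
```

Because the proxy holds no data of its own, writing through `col[1]` changes the column directly and leaves line 0 untouched.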
######################################################################################################
cdef register_column_class(obitype_t obitype,
bint multi_elts,
type obiclass,
type python_type):
'''
Each subclass of `OBIDMS_column` needs to be registered after its declaration
to declare its relationship with an `OBIType_t`
'''
global __OBIDMS_COLUMN_CLASS__
assert issubclass(obiclass, Column)
__OBIDMS_COLUMN_CLASS__[(obitype, multi_elts)] = (obiclass, python_type)
cdef register_all_column_classes() :
x = list(pkgutil.walk_packages(typed_column.__path__, prefix="obitools3.dms.column.typed_column."))
all_modules = [importlib.import_module(a[1]) for a in x]
for mod in all_modules :
getattr(mod, 'register_class')()
register_all_column_classes()


@@ -0,0 +1,3 @@
#from .bool import Column_bool
#from .int import Column_int
# TODO why is this needed?


@@ -0,0 +1,44 @@
#cython: language_level=3
from ...capi.obitypes cimport index_t
from ..column cimport Column, \
Column_multi_elts
cdef class Column_bool(Column) :
cpdef object get_line(self, index_t line_nb)
cpdef set_line(self, index_t line_nb, object value)
cdef class Column_multi_elts_bool(Column_multi_elts) :
cpdef object get_item(self, index_t line_nb, object elt_id)
cpdef object get_line(self, index_t line_nb)
cpdef set_item(self, index_t line_nb, object elt_id, object value)
# cdef class Column_line_bool(Column_line) :
#
# @staticmethod
# cdef bool obibool_t2bool(obibool_t value)
#
# @staticmethod
# cdef bool2obibool_t(bool value)
#
# cpdef bool get_bool_item_by_name(self,bytes element_name)
# cpdef bool get_bool_item_by_idx(self,index_t index)
# cpdef set_bool_item_by_name(self,bytes element_name,bool value)
# cpdef set_bool_item_by_idx(self,index_t index,bool value)
#
#
# # cdef obibool_t [:] _data_view
#


@@ -0,0 +1,313 @@
#cython: language_level=3
from ..column cimport register_column_class
from ...view.view cimport View
from obitools3.utils cimport tobytes, \
obi_errno_to_exception
from ...capi.obiview cimport obi_get_bool_with_elt_name_and_col_p_in_view, \
obi_get_bool_with_elt_idx_and_col_p_in_view, \
obi_set_bool_with_elt_name_and_col_p_in_view, \
obi_set_bool_with_elt_idx_and_col_p_in_view
from ...capi.obitypes cimport OBI_BOOL, OBIBool_NA, obibool_t
from cpython.bool cimport PyBool_FromLong
cdef class Column_bool(Column):
@staticmethod
def new(View view,
object column_name,
index_t nb_elements_per_line=1,
object elements_names=None,
object comments=b""):
return Column.new_column(view, column_name, OBI_BOOL,
nb_elements_per_line=nb_elements_per_line,
elements_names=elements_names,
comments=comments)
cpdef object get_line(self, index_t line_nb):
cdef obibool_t value
cdef object result
value = obi_get_bool_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0)
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem getting a value from a column")
if value == OBIBool_NA :
result = None
else :
result = PyBool_FromLong(value)
return result
cpdef set_line(self, index_t line_nb, object value):
if value is None :
value = OBIBool_NA
if obi_set_bool_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0, <obibool_t> value) < 0 :
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem setting a value in a column")
cdef class Column_multi_elts_bool(Column_multi_elts):
cpdef object get_item(self, index_t line_nb, object elt_id) :
cdef obibool_t value
cdef object result
cdef bytes elt_name
if type(elt_id) == int :
value = obi_get_bool_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id)
else :
elt_name = tobytes(elt_id)
value = obi_get_bool_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name)
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem getting a value from a column")
if value == OBIBool_NA :
result = None
else :
result = PyBool_FromLong(value)
return result
cpdef object get_line(self, index_t line_nb) :
cdef obibool_t value
cdef object value_in_result
cdef dict result
cdef index_t i
cdef bint all_NA
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_get_bool_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, i)
obi_errno_to_exception(line_nb=line_nb, elt_id=i, error_message="Problem getting a value from a column")
if value == OBIBool_NA :
value_in_result = None
else :
value_in_result = PyBool_FromLong(value)
result[self.elements_names[i]] = value_in_result
if all_NA and (value_in_result is not None) :
all_NA = False
if all_NA :
result = None
return result
cpdef set_item(self, index_t line_nb, object elt_id, object value) :
cdef bytes elt_name
if value is None :
value = OBIBool_NA
if type(elt_id) == int :
if obi_set_bool_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id, <obibool_t> value) < 0 :
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
else :
elt_name = tobytes(elt_id)
if obi_set_bool_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name, <obibool_t> value) < 0 :
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
def register_class() :
register_column_class(OBI_BOOL, False, Column_bool, bool)
register_column_class(OBI_BOOL, True, Column_multi_elts_bool, bool)
# cdef class Column_line_bool(Column_line) :
#
# cdef update_pointer(self):
# """
# Checks whether the obicolumn address changed since the last call and,
# if needed, updates the `_column_p` and `_data_view` data structure fields.
# """
# cdef OBIDMS_column_p* column_pp
# column_pp = <OBIDMS_column_p*>self._pointer
# cdef OBIDMS_column_p column_p = column_pp[0]
#
# if column_p != self._column_p:
# self._column_p = column_p
# self._data_view = (<obibool_t*> (column_p.data)) + \
# self._index * column_p.header.nb_elements_per_line
#
# @staticmethod
# cdef bool obibool_t2bool(obibool_t value):
# cdef bool result
#
# if value == OBIBool_NA :
# result = None
# else :
# result = PyBool_FromLong(value)
#
# return result
#
# @staticmethod
# cdef bool2obibool_t(bool value):
# cdef obibool_t result
#
# if value is None:
# result=OBIBool_NA
# else:
# result= <obibool_t> <int> value
#
# return result
#
#
# def __init__(self, Column column, index_t line_nb) :
# """
# Creates a new `OBIDMS_column_line_bool`
#
# @param column: an OBIDMS_column instance
# @param line_nb: the line in the column
# """
#
# Column_line.__init__(self, column, line_nb)
# self.update_pointer()
#
#
#
# cpdef bool get_bool_item_by_name(self, bytes element_name) :
# """
# Returns the value associated to the name `element_name` of the current line
#
# @param element_name: a `bytes` instance containing the name of the element
#
# @return: the `bool` value corresponding to the name
# """
# cdef char* cname = element_name
# cdef obibool_t value
# global obi_errno
#
# self.update_pointer()
#
# cdef OBIDMS_column_p* column_pp
# column_pp = <OBIDMS_column_p*>self._pointer
# cdef OBIDMS_column_p column_p = column_pp[0]
#
# value = obi_column_get_obibool_with_elt_name(column_p,
# self._index,
# cname)
#
# if obi_errno > 0 :
# obi_errno = 0
# raise KeyError("Cannot access to key %s" % bytes2str(element_name))
#
# return Column_line_bool.obibool_t2bool(value)
#
#
# cpdef bool get_bool_item_by_idx(self,index_t index):
# """
# Returns the value at index `index` of the current line
#
# @param index: an `int` instance containing the index of the element
#
# @return: the `bool` value corresponding to the index
# """
# cdef obibool_t value # @DuplicatedSignature
# global obi_errno
#
# cdef OBIDMS_column_p* column_pp
# column_pp = <OBIDMS_column_p*>self._pointer
# cdef OBIDMS_column_p column_p = column_pp[0]
#
# self.update_pointer()
#
# value = obi_column_get_obibool_with_elt_idx(column_p,
# self._index,
# index)
#
# if obi_errno > 0 :
# obi_errno = 0
# raise IndexError("Cannot access to element %d" % index)
#
# return Column_line_bool.obibool_t2bool(value)
#
#
# def __getitem__(self, object element_name) :
# cdef bytes name
# cdef int cindex
# cdef obibool_t value
# cdef type typearg = type(element_name)
# cdef bool result
#
#
# if typearg == int:
# cindex=element_name
# if cindex < 0:
# cindex = self._len + cindex
# result=self.get_bool_item_by_idx(cindex)
# elif typearg == bytes:
# result=self.get_bool_item_by_name(element_name)
# elif typearg == str:
# name = str2bytes(element_name)
# result=self.get_bool_item_by_name(name)
#
# return result
#
# cpdef set_bool_item_by_name(self,bytes element_name,bool value):
# """
# Sets the value associated to the name `element_name` of the current line
#
# @param element_name: a `bytes` instance containing the name of the element
# @param value: a `bool` instance of the new value
#
# @raise KeyError: if the element name cannot be accessed
# """
# cdef char* cname = element_name
# cdef obibool_t cvalue
#
# self.update_pointer()
# cvalue = OBIDMS_column_line_bool.bool2obibool_t(value)
#
# if ( obi_column_set_obibool_with_elt_name((<OBIDMS_column_p*>self._pointer)[0],
# self._index,
# cname,
# cvalue) < 0 ):
# raise KeyError("Cannot access to key %s" % bytes2str(element_name))
#
# cpdef set_bool_item_by_idx(self,index_t index,bool value):
# """
# Sets the value at index `index` of the current line
#
# @param index: an `int` instance containing the index of the element
# @param value: a `bool` instance of the new value
#
# @raise IndexError: if the element index cannot be accessed
# """
# cdef obibool_t cvalue # @DuplicatedSignature
#
# self.update_pointer()
# cvalue = OBIDMS_column_line_bool.bool2obibool_t(value)
#
# if ( obi_column_set_obibool_with_elt_idx((<OBIDMS_column_p*>self._pointer)[0],
# self._index,
# index,
# cvalue) < 0 ):
# raise IndexError("Cannot access to item index %d" % index)
#
#
#
# def __setitem__(self, object element_name, object value):
# cdef bytes name
# cdef int cindex
# cdef type typearg = type(element_name)
# cdef bool result
#
#
# if typearg == int:
# cindex=element_name
# if cindex < 0:
# cindex = self._len + cindex
# self.set_bool_item_by_idx(cindex,value)
# elif typearg == bytes:
# self.set_bool_item_by_name(element_name,value)
# elif typearg == str:
# name = str2bytes(element_name)
# self.set_bool_item_by_name(name,value)
#
# def __repr__(self) :
# return str(self._column.get_line(self._index))
#
# def __len__(self):
# return self._len

@@ -0,0 +1,20 @@
# #cython: language_level=3
from ...capi.obitypes cimport index_t
from ..column cimport Column, \
Column_multi_elts
cdef class Column_char(Column) :
cpdef object get_line(self, index_t line_nb)
cpdef set_line(self, index_t line_nb, object value)
cdef class Column_multi_elts_char(Column_multi_elts) :
cpdef object get_item(self, index_t line_nb, object elt_id)
cpdef object get_line(self, index_t line_nb)
cpdef set_item(self, index_t line_nb, object elt_id, object value)

@@ -0,0 +1,114 @@
#cython: language_level=3
from ..column cimport register_column_class
from ...view.view cimport View
from obitools3.utils cimport tobytes, \
obi_errno_to_exception
from ...capi.obiview cimport obi_get_char_with_elt_name_and_col_p_in_view, \
obi_get_char_with_elt_idx_and_col_p_in_view, \
obi_set_char_with_elt_name_and_col_p_in_view, \
obi_set_char_with_elt_idx_and_col_p_in_view
from ...capi.obitypes cimport OBI_CHAR, OBIChar_NA, obichar_t
cdef class Column_char(Column):
@staticmethod
def new(View view,
object column_name,
index_t nb_elements_per_line=1,
object elements_names=None,
object comments=b""):
return Column.new_column(view, column_name, OBI_CHAR,
nb_elements_per_line=nb_elements_per_line,
elements_names=elements_names,
comments=comments)
cpdef object get_line(self, index_t line_nb):
cdef obichar_t value
cdef object result
value = obi_get_char_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0)
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem getting a value from a column")
if value == OBIChar_NA :
result = None
else :
result = <bytes>value # TODO return bytes or str?
return result
cpdef set_line(self, index_t line_nb, object value):
cdef obichar_t value_b
if value is None :
value_b = OBIChar_NA
else :
value_b = <obichar_t> tobytes(value)[0]
if obi_set_char_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0, value_b) < 0:
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem setting a value in a column")
cdef class Column_multi_elts_char(Column_multi_elts):
cpdef object get_item(self, index_t line_nb, object elt_id) :
cdef obichar_t value
cdef object result
cdef bytes elt_name
if type(elt_id) == int :
value = obi_get_char_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id)
else :
elt_name = tobytes(elt_id)
value = obi_get_char_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name)
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem getting a value from a column")
if value == OBIChar_NA :
result = None
else :
result = <bytes>value # TODO return bytes or str?
return result
cpdef object get_line(self, index_t line_nb) :
cdef obichar_t value
cdef object value_in_result
cdef dict result
cdef index_t i
cdef bint all_NA
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_get_char_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, i)
obi_errno_to_exception(line_nb=line_nb, elt_id=i, error_message="Problem getting a value from a column")
if value == OBIChar_NA :
value_in_result = None
else :
value_in_result = <bytes>value # TODO return bytes or str?
result[self.elements_names[i]] = value_in_result
if all_NA and (value_in_result is not None) :
all_NA = False
if all_NA :
result = None
return result
cpdef set_item(self, index_t line_nb, object elt_id, object value) :
cdef bytes elt_name
cdef obichar_t value_b
if value is None :
value_b = OBIChar_NA
else :
value_b = <obichar_t> tobytes(value)[0]
if type(elt_id) == int :
if obi_set_char_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id, value_b) < 0 :
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
else :
elt_name = tobytes(elt_id)
if obi_set_char_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name, value_b) < 0 :
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
def register_class():
register_column_class(OBI_CHAR, False, Column_char, bytes) # TODO bytes or str?
register_column_class(OBI_CHAR, True, Column_multi_elts_char, bytes) # TODO bytes or str?
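
The multi-element `get_line` methods in these files share one convention: an element equal to the type's NA value becomes `None`, and a line whose elements are all NA is returned as `None` rather than as a dict. A minimal pure-Python sketch of that convention follows. `NA`, `line_to_python` and the sample element names are hypothetical stand-ins for illustration, not part of the obitools3 API.

```python
# Hypothetical sentinel standing in for a C-level NA value such as OBIChar_NA.
NA = object()


def line_to_python(elements_names, raw_values, na=NA):
    """Map raw element values to a dict, turning NA into None.

    Returns None when every element of the line is NA, mirroring the
    all_NA flag used in the Column_multi_elts_* get_line methods.
    """
    result = {}
    all_na = True
    for name, value in zip(elements_names, raw_values):
        converted = None if value is na else value
        result[name] = converted
        if converted is not None:
            all_na = False
    return None if all_na else result


names = [b"a", b"b"]
mixed = line_to_python(names, [b"x", NA])   # one real value, one NA
empty = line_to_python(names, [NA, NA])     # every element is NA
```

A caller therefore has to handle two shapes of "missing": a whole line that is `None`, and individual `None` values inside a returned dict.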

@@ -0,0 +1,21 @@
# #cython: language_level=3
from ...capi.obitypes cimport index_t
from ..column cimport Column, \
Column_multi_elts
cdef class Column_float(Column) :
cpdef object get_line(self, index_t line_nb)
cpdef set_line(self, index_t line_nb, object value)
cdef class Column_multi_elts_float(Column_multi_elts) :
cpdef object get_item(self, index_t line_nb, object elt_id)
cpdef object get_line(self, index_t line_nb)
cpdef set_item(self, index_t line_nb, object elt_id, object value)

@@ -0,0 +1,110 @@
#cython: language_level=3
from ..column cimport register_column_class
from ...view.view cimport View
from obitools3.utils cimport tobytes, \
obi_errno_to_exception
from ...capi.obiview cimport obi_get_float_with_elt_name_and_col_p_in_view, \
obi_get_float_with_elt_idx_and_col_p_in_view, \
obi_set_float_with_elt_name_and_col_p_in_view, \
obi_set_float_with_elt_idx_and_col_p_in_view
from ...capi.obitypes cimport OBI_FLOAT, OBIFloat_NA, obifloat_t
cdef class Column_float(Column):
@staticmethod
def new(View view,
object column_name,
index_t nb_elements_per_line=1,
object elements_names=None,
object comments=b""):
return Column.new_column(view, column_name, OBI_FLOAT,
nb_elements_per_line=nb_elements_per_line,
elements_names=elements_names,
comments=comments)
cpdef object get_line(self, index_t line_nb):
cdef obifloat_t value
cdef object result
value = obi_get_float_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0)
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem getting a value from a column")
if value == OBIFloat_NA :
result = None
else :
result = <double> value
return result
cpdef set_line(self, index_t line_nb, object value):
if value is None :
value = OBIFloat_NA
if obi_set_float_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0, <obifloat_t> value) < 0:
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem setting a value in a column")
cdef class Column_multi_elts_float(Column_multi_elts):
cpdef object get_item(self, index_t line_nb, object elt_id) :
cdef obifloat_t value
cdef object result
cdef bytes elt_name
if type(elt_id) == int :
value = obi_get_float_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id)
else :
elt_name = tobytes(elt_id)
value = obi_get_float_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name)
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem getting a value from a column")
if value == OBIFloat_NA :
result = None
else :
result = <double> value
return result
cpdef object get_line(self, index_t line_nb) :
cdef obifloat_t value
cdef object value_in_result
cdef dict result
cdef index_t i
cdef bint all_NA
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_get_float_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, i)
obi_errno_to_exception(line_nb=line_nb, elt_id=i, error_message="Problem getting a value from a column")
if value == OBIFloat_NA :
value_in_result = None
else :
value_in_result = <double> value
result[self.elements_names[i]] = value_in_result
if all_NA and (value_in_result is not None) :
all_NA = False
if all_NA :
result = None
return result
cpdef set_item(self, index_t line_nb, object elt_id, object value) :
cdef bytes elt_name
if value is None :
value = OBIFloat_NA
if type(elt_id) == int :
if obi_set_float_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id, <obifloat_t> value) < 0 :
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
else :
elt_name = tobytes(elt_id)
if obi_set_float_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name, <obifloat_t> value) < 0 :
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
def register_class():
register_column_class(OBI_FLOAT, False, Column_float, float) # TODO why not double?
register_column_class(OBI_FLOAT, True, Column_multi_elts_float, float) # TODO why not double?

@@ -0,0 +1,21 @@
# #cython: language_level=3
from ...capi.obitypes cimport index_t
from ..column cimport Column, \
Column_multi_elts
cdef class Column_int(Column) :
cpdef object get_line(self, index_t line_nb)
cpdef set_line(self, index_t line_nb, object value)
cdef class Column_multi_elts_int(Column_multi_elts) :
cpdef object get_item(self, index_t line_nb, object elt_id)
cpdef object get_line(self, index_t line_nb)
cpdef set_item(self, index_t line_nb, object elt_id, object value)

@@ -0,0 +1,111 @@
#cython: language_level=3
from ..column cimport register_column_class
from ...view.view cimport View
from obitools3.utils cimport tobytes, \
obi_errno_to_exception
from ...capi.obiview cimport obi_get_int_with_elt_name_and_col_p_in_view, \
obi_get_int_with_elt_idx_and_col_p_in_view, \
obi_set_int_with_elt_name_and_col_p_in_view, \
obi_set_int_with_elt_idx_and_col_p_in_view
from ...capi.obitypes cimport OBI_INT, OBIInt_NA, obiint_t
from cpython.int cimport PyInt_FromLong
cdef class Column_int(Column):
@staticmethod
def new(View view,
object column_name,
index_t nb_elements_per_line=1,
object elements_names=None,
object comments=b""):
return Column.new_column(view, column_name, OBI_INT,
nb_elements_per_line=nb_elements_per_line,
elements_names=elements_names,
comments=comments)
cpdef object get_line(self, index_t line_nb):
cdef obiint_t value
cdef object result
value = obi_get_int_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0)
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem getting a value from a column")
if value == OBIInt_NA :
result = None
else :
result = PyInt_FromLong(value)
return result
cpdef set_line(self, index_t line_nb, object value):
if value is None :
value = OBIInt_NA
if obi_set_int_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0, <obiint_t> value) < 0:
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem setting a value in a column")
cdef class Column_multi_elts_int(Column_multi_elts):
cpdef object get_item(self, index_t line_nb, object elt_id) :
cdef obiint_t value
cdef object result
cdef bytes elt_name
if type(elt_id) == int :
value = obi_get_int_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id)
else :
elt_name = tobytes(elt_id)
value = obi_get_int_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name)
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem getting a value from a column")
if value == OBIInt_NA :
result = None
else :
result = PyInt_FromLong(value)
return result
cpdef object get_line(self, index_t line_nb) :
cdef obiint_t value
cdef object value_in_result
cdef dict result
cdef index_t i
cdef bint all_NA
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_get_int_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, i)
obi_errno_to_exception(line_nb=line_nb, elt_id=i, error_message="Problem getting a value from a column")
if value == OBIInt_NA :
value_in_result = None
else :
value_in_result = PyInt_FromLong(value)
result[self.elements_names[i]] = value_in_result
if all_NA and (value_in_result is not None) :
all_NA = False
if all_NA :
result = None
return result
cpdef set_item(self, index_t line_nb, object elt_id, object value) :
cdef bytes elt_name
if value is None :
value = OBIInt_NA
if type(elt_id) == int :
if obi_set_int_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id, <obiint_t> value) < 0 :
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
else :
elt_name = tobytes(elt_id)
if obi_set_int_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name, <obiint_t> value) < 0 :
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
def register_class():
register_column_class(OBI_INT, False, Column_int, int)
register_column_class(OBI_INT, True, Column_multi_elts_int, int)
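
Each module ends by registering two classes per storage type, one for single-element columns and one for multi-element columns, together with the Python type of the values. How `register_column_class` stores them is not shown in this diff; the sketch below assumes a plain dict keyed by `(storage type, multi-element flag)`, which is only one plausible reading. All names prefixed with `demo`/`Demo`/`DEMO` are hypothetical.

```python
# Hypothetical registry: (storage type, multi-element flag) -> (column class, Python type).
_registry = {}


def register_demo_class(obitype, multi_elts, column_class, python_type):
    """Record which class handles a given storage type and layout."""
    _registry[(obitype, multi_elts)] = (column_class, python_type)


def lookup_demo_class(obitype, multi_elts):
    """Return the (column class, Python type) pair registered for the key."""
    return _registry[(obitype, multi_elts)]


class DemoColumnInt:            # stand-in for Column_int
    pass


class DemoColumnMultiEltsInt:   # stand-in for Column_multi_elts_int
    pass


DEMO_OBI_INT = 1  # placeholder constant, not the real OBI_INT value

register_demo_class(DEMO_OBI_INT, False, DemoColumnInt, int)
register_demo_class(DEMO_OBI_INT, True, DemoColumnMultiEltsInt, int)

single = lookup_demo_class(DEMO_OBI_INT, False)
multi = lookup_demo_class(DEMO_OBI_INT, True)
```

With such a table, opening a column only needs its storage type and its number of elements per line to pick the right wrapper class.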

@@ -0,0 +1,26 @@
# #cython: language_level=3
from ...capi.obitypes cimport index_t
from ..column cimport Column, \
Column_multi_elts
cdef class Column_qual(Column) :
cpdef object get_line(self, index_t line_nb)
cpdef object get_str_line(self, index_t line_nb)
cpdef set_line(self, index_t line_nb, object value)
cpdef set_str_line(self, index_t line_nb, object value)
cdef class Column_multi_elts_qual(Column_multi_elts) :
cpdef object get_item(self, index_t line_nb, object elt_id)
cpdef object get_str_item(self, index_t line_nb, object elt_id)
cpdef object get_line(self, index_t line_nb)
cpdef object get_str_line(self, index_t line_nb)
cpdef set_item(self, index_t line_nb, object elt_id, object value)
cpdef set_str_item(self, index_t line_nb, object elt_id, object value)

@@ -0,0 +1,236 @@
#cython: language_level=3
from ..column cimport register_column_class
from ...view.view cimport View
from obitools3.utils cimport tobytes, bytes2str, \
obi_errno_to_exception
from ...capi.obiview cimport obi_get_qual_char_with_elt_name_and_col_p_in_view, \
obi_get_qual_char_with_elt_idx_and_col_p_in_view, \
obi_set_qual_char_with_elt_name_and_col_p_in_view, \
obi_set_qual_char_with_elt_idx_and_col_p_in_view, \
obi_get_qual_int_with_elt_name_and_col_p_in_view, \
obi_get_qual_int_with_elt_idx_and_col_p_in_view, \
obi_set_qual_int_with_elt_name_and_col_p_in_view, \
obi_set_qual_int_with_elt_idx_and_col_p_in_view
from ...capi.obitypes cimport OBI_QUAL, OBIQual_char_NA, OBIQual_int_NA, const_char_p
from libc.stdlib cimport free
from libc.stdint cimport uint8_t
from libc.stdlib cimport malloc
cdef class Column_qual(Column):
@staticmethod
def new(View view,
object column_name,
index_t nb_elements_per_line=1,
object elements_names=None,
object comments=b""):
return Column.new_column(view, column_name, OBI_QUAL,
nb_elements_per_line=nb_elements_per_line,
elements_names=elements_names,
comments=comments)
cpdef object get_line(self, index_t line_nb):
cdef const uint8_t* value
cdef int value_length
cdef object result
value = obi_get_qual_int_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0, &value_length)
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem getting a value from a column")
if value == OBIQual_int_NA :
result = None
else :
result = []
for i in range(value_length) :
result.append(<int>value[i])
return result
    cpdef object get_str_line(self, index_t line_nb):
        cdef char* value
        cdef object result
        value = obi_get_qual_char_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0)
        obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem getting a value from a column")
        if value == OBIQual_char_NA :
            result = None
        else : # TODO discuss
            try :
                result = bytes2str(value)
            finally :
                # Free the C string even if the conversion raises
                free(value)
        return result
    cpdef set_line(self, index_t line_nb, object value):
        cdef uint8_t* value_b
        cdef int value_length
        cdef int i
        if value is None :
            if obi_set_qual_int_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0, OBIQual_int_NA, 0) < 0:
                obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem setting a value in a column")
        else :
            value_length = len(value)
            value_b = <uint8_t*> malloc(value_length * sizeof(uint8_t))
            if value_b == NULL and value_length > 0 :
                raise MemoryError()
            try :
                for i in range(value_length) :
                    value_b[i] = <uint8_t>value[i]
                if obi_set_qual_int_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0, value_b, value_length) < 0:
                    obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem setting a value in a column")
            finally :
                # Release the temporary buffer even if an exception is raised
                free(value_b)
cpdef set_str_line(self, index_t line_nb, object value):
cdef bytes value_b
if value is None :
if obi_set_qual_char_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0, OBIQual_char_NA) < 0:
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem setting a value in a column")
else :
value_b = tobytes(value)
if obi_set_qual_char_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0, value_b) < 0:
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem setting a value in a column")
cdef class Column_multi_elts_qual(Column_multi_elts):
cpdef object get_item(self, index_t line_nb, object elt_id):
cdef const uint8_t* value
cdef int value_length
cdef object result
cdef int i
if type(elt_id) == int :
value = obi_get_qual_int_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id, &value_length)
else :
elt_name = tobytes(elt_id)
value = obi_get_qual_int_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name, &value_length)
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem getting a value from a column")
if value == OBIQual_int_NA :
result = None
else :
result = []
for i in range(value_length) :
result.append(<int>value[i])
return result
    cpdef object get_str_item(self, index_t line_nb, object elt_id):
        cdef char* value
        cdef object result
        cdef bytes elt_name
        if type(elt_id) == int :
            value = obi_get_qual_char_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id)
        else :
            elt_name = tobytes(elt_id)
            value = obi_get_qual_char_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name)
        obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem getting a value from a column")
        if value == OBIQual_char_NA :
            result = None
        else :
            try :
                result = bytes2str(value) # TODO return bytes?
            finally :
                # Free the C string even if the conversion raises
                free(value)
        return result
cpdef object get_line(self, index_t line_nb) :
cdef const uint8_t* value
cdef int value_length
cdef object value_in_result
cdef dict result
cdef index_t i
cdef int j
cdef bint all_NA
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_get_qual_int_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, i, &value_length)
obi_errno_to_exception(line_nb=line_nb, elt_id=i, error_message="Problem getting a value from a column")
if value == OBIQual_int_NA :
value_in_result = None
else :
value_in_result = []
for j in range(value_length) :
value_in_result.append(<int>value[j])
result[self.elements_names[i]] = value_in_result
if all_NA and (value_in_result is not None) :
all_NA = False
if all_NA :
result = None
return result
    cpdef object get_str_line(self, index_t line_nb) :
        cdef char* value
        cdef object value_in_result
        cdef dict result
        cdef index_t i
        cdef bint all_NA
        result = {}
        all_NA = True
        for i in range(self.nb_elements_per_line) :
            value = obi_get_qual_char_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, i)
            obi_errno_to_exception(line_nb=line_nb, elt_id=i, error_message="Problem getting a value from a column")
            if value == OBIQual_char_NA :
                value_in_result = None
            else :
                try :
                    value_in_result = bytes2str(value)
                finally :
                    # Free the C string even if the conversion raises
                    free(value)
            result[self.elements_names[i]] = value_in_result
            if all_NA and (value_in_result is not None) :
                all_NA = False
        if all_NA :
            result = None
        return result
    cpdef set_item(self, index_t line_nb, object elt_id, object value):
        cdef uint8_t* value_b
        cdef int value_length
        cdef int i
        cdef bytes elt_name
        if value is None :
            value_b = OBIQual_int_NA
            value_length = 0
        else :
            value_length = len(value)
            value_b = <uint8_t*> malloc(value_length * sizeof(uint8_t))
            if value_b == NULL and value_length > 0 :
                raise MemoryError()
        try :
            if value is not None :
                for i in range(value_length) :
                    value_b[i] = <uint8_t>value[i]
            if type(elt_id) == int :
                if obi_set_qual_int_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id, value_b, value_length) < 0 :
                    obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
            else :
                elt_name = tobytes(elt_id)
                if obi_set_qual_int_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name, value_b, value_length) < 0:
                    obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
        finally :
            # Release the temporary buffer even if an exception is raised
            if value is not None :
                free(value_b)
    cpdef set_str_item(self, index_t line_nb, object elt_id, object value):
        cdef char* value_b
        cdef bytes value_bytes
        cdef bytes elt_name
        if value is None :
            # The NA value is a C pointer, so it cannot be stored in a bytes variable
            value_b = <char*>OBIQual_char_NA
        else :
            # value_bytes keeps the buffer alive while value_b points into it
            value_bytes = tobytes(value)
            value_b = <char*>value_bytes
        if type(elt_id) == int :
            if obi_set_qual_char_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id, value_b) < 0 :
                obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
        else :
            elt_name = tobytes(elt_id)
            if obi_set_qual_char_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name, value_b) < 0:
                obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
def register_class():
register_column_class(OBI_QUAL, False, Column_qual, bytes) # TODO str? int?
register_column_class(OBI_QUAL, True, Column_multi_elts_qual, bytes) # TODO str? int?

@@ -0,0 +1,21 @@
# #cython: language_level=3
from ...capi.obitypes cimport index_t
from ..column cimport Column, \
Column_multi_elts
cdef class Column_seq(Column) :
cpdef object get_line(self, index_t line_nb)
cpdef set_line(self, index_t line_nb, object value)
cdef class Column_multi_elts_seq(Column_multi_elts) :
cpdef object get_item(self, index_t line_nb, object elt_id)
cpdef object get_line(self, index_t line_nb)
cpdef set_item(self, index_t line_nb, object elt_id, object value)

@@ -0,0 +1,137 @@
#cython: language_level=3
from ..column cimport register_column_class
from ...view.view cimport View
from obitools3.utils cimport tobytes, \
obi_errno_to_exception
from ...capi.obiview cimport obi_get_seq_with_elt_name_and_col_p_in_view, \
obi_get_seq_with_elt_idx_and_col_p_in_view, \
obi_set_seq_with_elt_name_and_col_p_in_view, \
obi_set_seq_with_elt_idx_and_col_p_in_view
from ...capi.obitypes cimport OBI_SEQ, OBISeq_NA
from libc.stdlib cimport free
cdef class Column_seq(Column):
@staticmethod
def new(View view,
object column_name,
index_t nb_elements_per_line=1,
object elements_names=None,
object comments=b""):
return Column.new_column(view, column_name, OBI_SEQ,
nb_elements_per_line=nb_elements_per_line,
elements_names=elements_names,
comments=comments)
cpdef object get_line(self, index_t line_nb):
cdef char* value
cdef object result
value = obi_get_seq_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0)
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem getting a value from a column")
if value == OBISeq_NA :
result = None
else : # TODO
try:
result = <bytes> value
finally:
free(value)
return result
cpdef set_line(self, index_t line_nb, object value):
cdef char* value_b
cdef bytes value_bytes
if value is None :
value_b = <char*>OBISeq_NA
else :
value_bytes = tobytes(value)
value_b = <char*>value_bytes
if obi_set_seq_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0, value_b) < 0:
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem setting a value in a column")
cdef class Column_multi_elts_seq(Column_multi_elts):
cpdef object get_item(self, index_t line_nb, object elt_id) :
cdef char* value
cdef object result
if type(elt_id) == int :
value = obi_get_seq_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id)
else :
elt_name = tobytes(elt_id)
value = obi_get_seq_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name)
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem getting a value from a column")
if value == OBISeq_NA :
result = None
else :
try:
result = <bytes> value
finally:
free(value)
return result
cpdef object get_line(self, index_t line_nb) :
cdef char* value
cdef object value_in_result
cdef dict result
cdef index_t i
cdef bint all_NA
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_get_seq_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, i)
obi_errno_to_exception(line_nb=line_nb, elt_id=i, error_message="Problem getting a value from a column")
if value == OBISeq_NA :
value_in_result = None
else :
try:
value_in_result = <bytes> value
finally:
free(value)
result[self.elements_names[i]] = value_in_result
if all_NA and (value_in_result is not None) :
all_NA = False
if all_NA :
result = None
return result
cpdef set_item(self, index_t line_nb, object elt_id, object value):
cdef bytes elt_name
cdef char* value_b
cdef bytes value_bytes
if value is None :
value_b = <char*>OBISeq_NA
else :
value_bytes = tobytes(value)
value_b = <char*>value_bytes
if type(elt_id) == int :
if obi_set_seq_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id, value_b) < 0:
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
else :
elt_name = tobytes(elt_id)
if obi_set_seq_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name, value_b) < 0:
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
def register_class():
register_column_class(OBI_SEQ, False, Column_seq, bytes) # TODO str?
register_column_class(OBI_SEQ, True, Column_multi_elts_seq, bytes) # TODO str?

@@ -0,0 +1,21 @@
# #cython: language_level=3
from ...capi.obitypes cimport index_t
from ..column cimport Column, \
Column_multi_elts
cdef class Column_str(Column) :
cpdef object get_line(self, index_t line_nb)
cpdef set_line(self, index_t line_nb, object value)
cdef class Column_multi_elts_str(Column_multi_elts) :
cpdef object get_item(self, index_t line_nb, object elt_id)
cpdef object get_line(self, index_t line_nb)
cpdef set_item(self, index_t line_nb, object elt_id, object value)

View File

@@ -0,0 +1,124 @@
#cython: language_level=3
from ..column cimport register_column_class
from ...view.view cimport View
from obitools3.utils cimport tobytes, \
obi_errno_to_exception
from ...capi.obiview cimport obi_get_str_with_elt_name_and_col_p_in_view, \
obi_get_str_with_elt_idx_and_col_p_in_view, \
obi_set_str_with_elt_name_and_col_p_in_view, \
obi_set_str_with_elt_idx_and_col_p_in_view
from ...capi.obitypes cimport OBI_STR, OBIStr_NA, const_char_p
cdef class Column_str(Column):
@staticmethod
def new(View view,
object column_name,
index_t nb_elements_per_line=1,
object elements_names=None,
object comments=b""):
return Column.new_column(view, column_name, OBI_STR,
nb_elements_per_line=nb_elements_per_line,
elements_names=elements_names,
comments=comments)
cpdef object get_line(self, index_t line_nb):
cdef const_char_p value
cdef object result
value = obi_get_str_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0)
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem getting a value from a column")
if value == OBIStr_NA :
result = None
else :
result = <bytes> value # NOTE: value is not freed because the pointer points to a mmapped region in an AVL data file.
return result
cpdef set_line(self, index_t line_nb, object value):
cdef char* value_b
cdef bytes value_bytes
if value is None :
value_b = <char*>OBIStr_NA
else :
value_bytes = tobytes(value)
value_b = <char*>value_bytes
if obi_set_str_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, 0, value_b) < 0:
obi_errno_to_exception(line_nb=line_nb, elt_id=None, error_message="Problem setting a value in a column")
cdef class Column_multi_elts_str(Column_multi_elts):
cpdef object get_item(self, index_t line_nb, object elt_id) :
cdef const_char_p value
cdef object result
if type(elt_id) == int :
value = obi_get_str_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id)
else :
elt_name = tobytes(elt_id)
value = obi_get_str_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name)
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem getting a value from a column")
if value == OBIStr_NA :
result = None
else :
result = <bytes> value # NOTE: value is not freed because the pointer points to a mmapped region in an AVL data file.
return result
cpdef object get_line(self, index_t line_nb) :
cdef const_char_p value
cdef object value_in_result
cdef dict result
cdef index_t i
cdef bint all_NA
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_get_str_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, i)
obi_errno_to_exception(line_nb=line_nb, elt_id=i, error_message="Problem getting a value from a column")
if value == OBIStr_NA :
value_in_result = None
else :
value_in_result = <bytes> value
result[self.elements_names[i]] = value_in_result
if all_NA and (value_in_result is not None) :
all_NA = False
if all_NA :
result = None
return result
cpdef set_item(self, index_t line_nb, object elt_id, object value):
cdef bytes elt_name
cdef char* value_b
cdef bytes value_bytes
if value is None :
value_b = <char*>OBIStr_NA
else :
value_bytes = tobytes(value)
value_b = <char*>value_bytes
if type(elt_id) == int :
if obi_set_str_with_elt_idx_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_id, <char*>value_b) < 0:
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
else :
elt_name = tobytes(elt_id)
if obi_set_str_with_elt_name_and_col_p_in_view(self._view.pointer(), self.pointer(), line_nb, elt_name, <char*>value_b) < 0:
obi_errno_to_exception(line_nb=line_nb, elt_id=elt_id, error_message="Problem setting a value in a column")
def register_class():
register_column_class(OBI_STR, False, Column_str, bytes) # TODO str?
register_column_class(OBI_STR, True, Column_multi_elts_str, bytes) # TODO str?
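
Note on the multi-element `get_line` methods above: each builds a dict keyed by element name, maps NA values to `None`, and returns `None` for the whole line when every element is NA. The rule can be sketched in plain Python as below. `NA` and `collapse_line` are hypothetical names used only for illustration; they are not part of obitools3.

```python
NA = object()  # hypothetical sentinel standing in for OBIStr_NA / OBISeq_NA


def collapse_line(elements_names, raw_values):
    """Build {name: value} for one line; NA becomes None, all-NA gives None."""
    result = {}
    all_na = True
    for name, raw in zip(elements_names, raw_values):
        value = None if raw is NA else raw
        result[name] = value
        if value is not None:
            all_na = False
    return None if all_na else result


mixed = collapse_line([b"a", b"b"], [b"x", NA])
empty = collapse_line([b"a", b"b"], [NA, NA])
```

With these inputs, `mixed` keeps both keys (one of them mapped to `None`), while `empty` collapses to `None`, so callers can tell "no data on this line" apart from "some elements missing".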

View File

@@ -0,0 +1,33 @@
../../../src/bloom.c
../../../src/char_str_indexer.c
../../../src/crc64.c
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/hashtable.c
../../../src/linked_list.c
../../../src/murmurhash2.c
../../../src/obi_align.c
../../../src/obiavl.c
../../../src/obiblob_indexer.c
../../../src/obiblob.c
../../../src/obidms_taxonomy.c
../../../src/obidms.c
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.c
../../../src/obierrno.c
../../../src/obilittlebigman.c
../../../src/obitypes.c
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.c
../../../src/upperband.c
../../../src/utils.c

View File

@@ -0,0 +1,11 @@
#cython: language_level=3
from .object cimport OBIWrapper
from .capi.obidms cimport OBIDMS_p
cdef class DMS(OBIWrapper):
cdef inline OBIDMS_p pointer(self)
cpdef int view_count(self)

View File

@@ -0,0 +1,152 @@
#cython: language_level=3
from libc.stdlib cimport free
from cpython.list cimport PyList_Size
from .capi.obidms cimport obi_open_dms, \
obi_test_open_dms, \
obi_create_dms, \
obi_close_dms, \
obi_dms_get_full_path
from .capi.obitypes cimport const_char_p
from obitools3.utils cimport bytes2str, \
str2bytes, \
tobytes, \
tostr
from .object cimport OBIObjectClosedInstance
from pathlib import Path
from .view import view
cdef class DMS(OBIWrapper):
cdef inline OBIDMS_p pointer(self):
return <OBIDMS_p>(self._pointer)
@staticmethod
def new(object dms_name) :
cdef OBIDMS_p pointer
cdef DMS dms
cdef bytes dms_name_b = tobytes(dms_name)
pointer = obi_create_dms(<const_char_p> dms_name_b)
if pointer == NULL :
raise Exception("Failed creating an OBIDMS")
dms = OBIWrapper.new(DMS, pointer)
return dms
@staticmethod
def open(object dms_name) :
cdef OBIDMS_p pointer
cdef DMS dms
cdef bytes dms_name_b = tobytes(dms_name)
pointer = obi_open_dms(<const_char_p> dms_name_b)
if pointer == NULL :
raise Exception("Failed opening an OBIDMS")
dms = OBIWrapper.new(DMS, pointer)
return dms
@staticmethod
def test_open(object dms_name) :
cdef OBIDMS_p pointer
cdef DMS dms
cdef bytes dms_name_b = tobytes(dms_name)
pointer = obi_test_open_dms(<const_char_p> dms_name_b)
if pointer == NULL :
raise Exception("Failed opening an OBIDMS")
dms = OBIWrapper.new(DMS, pointer)
return dms
def close(self) :
'''
Closes the DMS instance and frees the associated memory.
The `close` method is automatically called by the object destructor.
'''
cdef OBIDMS_p pointer = self.pointer()
if self.active() :
OBIWrapper.close(self)
if (obi_close_dms(pointer)) < 0 :
raise Exception("Problem closing an OBIDMS")
# name property getter
@property
def name(self) :
'''
Returns the name of the DMS instance
@rtype: bytes
'''
return <bytes> self.pointer().dms_name
def keys(self) :
cdef const_char_p path = obi_dms_get_full_path(self.pointer(), b"VIEWS")
if path == NULL:
raise RuntimeError("Cannot retrieve the view database path")
p = Path(bytes2str(path))
free(path)
for v in p.glob("*.obiview") :
yield str2bytes(v.stem)
def values(self) :
cdef bytes view_name
for view_name in self.keys():
yield self.get_view(view_name)
def items(self) :
cdef bytes view_name
for view_name in self.keys():
yield (view_name, self.get_view(view_name))
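    def __contains__(self, key) :
        cdef str key_s = tostr(key)
        cdef const_char_p path = obi_dms_get_full_path(self.pointer(), b"VIEWS")
        if path == NULL:
            raise RuntimeError("Cannot retrieve the view database path")
        # Append the extension instead of using with_suffix(), which would
        # replace the last dotted part of a view name such as "a.b"
        p = Path(bytes2str(path), key_s + ".obiview")
        free(path)
        return p.is_file()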
cpdef int view_count(self) :
return PyList_Size(list(self.keys()))
def __len__(self) :
return self.view_count()
def __getitem__(self, object view_name):
return self.get_view(view_name)
def __iter__(self) :
return self.keys()
def get_view(self, object view_name) :
return view.View.open(self, view_name)
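
Note on the mapping interface above: `DMS` exposes its views like a read-only dict by globbing `*.obiview` files in the views directory. A minimal pure-Python sketch of the same idea follows. The class name `ViewDirectory` and the temporary directory layout are assumptions made for the example and do not reflect the real DMS layout.

```python
import tempfile
from pathlib import Path


class ViewDirectory:
    """Hypothetical stand-in for the DMS mapping over *.obiview files."""

    def __init__(self, views_dir):
        self._dir = Path(views_dir)

    def keys(self):
        # One key per view file, returned as bytes like DMS.keys()
        for path in self._dir.glob("*.obiview"):
            yield path.stem.encode("utf-8")

    def __contains__(self, key):
        name = key.decode("utf-8") if isinstance(key, bytes) else key
        return (self._dir / (name + ".obiview")).is_file()

    def __len__(self):
        return sum(1 for _ in self.keys())

    def __iter__(self):
        return self.keys()


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "reads.obiview").touch()
    (Path(tmp) / "notes.txt").touch()
    views = ViewDirectory(tmp)
    names = sorted(views)
    has_reads = b"reads" in views
    has_notes = "notes" in views
    count = len(views)
```

The sketch appends the extension to the name rather than calling `Path.with_suffix`, because `with_suffix` replaces an existing suffix and would misresolve a name that already contains a dot.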

View File

@@ -0,0 +1,22 @@
#cython: language_level=3
from .view.view cimport Line
cdef class Seq(dict) :
cdef str _id
cdef object _seq
cdef str _definition
cdef class Nuc_Seq(Seq) :
cdef object _quality
#cpdef object reverse_complement(self)
cdef class Nuc_Seq_Stored(Line) :
cpdef object get_str_quality(self)
#cpdef object reverse_complement(self)

View File

@@ -0,0 +1,130 @@
#cython: language_level=3
from obitools3.utils cimport bytes2str, str2bytes
from .capi.obiview cimport NUC_SEQUENCE_COLUMN, \
ID_COLUMN, \
DEFINITION_COLUMN, \
QUALITY_COLUMN
NUC_SEQUENCE_COLUMN_str = bytes2str(NUC_SEQUENCE_COLUMN)
ID_COLUMN_str = bytes2str(ID_COLUMN)
DEFINITION_COLUMN_str = bytes2str(DEFINITION_COLUMN)
QUALITY_COLUMN_str = bytes2str(QUALITY_COLUMN)
cdef class Seq(dict) :
def __init__(self, str id, object seq, object definition=None) :
self.id = id
self.seq = seq
if definition is not None :
self.definition = definition
# sequence id property getter and setter
@property
def id(self): # @ReservedAssignment
return self._id
@id.setter
def id(self, str new_id): # @ReservedAssignment @DuplicatedSignature
self._id = new_id
self[ID_COLUMN_str] = new_id
# sequence property getter and setter
@property
def seq(self):
return self._seq
@seq.setter
def seq(self, object new_seq): # @DuplicatedSignature
self._seq = new_seq
self["SEQ"] = new_seq # TODO discuss
# sequence definition property getter and setter
@property
def definition(self):
return self._definition
@definition.setter
def definition(self, object new_definition): # @DuplicatedSignature
self._definition = new_definition
self[DEFINITION_COLUMN_str] = new_definition
cdef class Nuc_Seq(Seq) :
# nuc sequence property getter and setter
@property
def seq(self):
return self._seq
@seq.setter
def seq(self, object new_seq): # @DuplicatedSignature
self._seq = new_seq
self[NUC_SEQUENCE_COLUMN_str] = new_seq
# sequence quality property getter and setter
@property
def quality(self):
return self._quality
@quality.setter
def quality(self, object new_quality): # @DuplicatedSignature
self._quality = new_quality
self[QUALITY_COLUMN_str] = new_quality
# cpdef str reverse_complement(self) : TODO in C ?
# pass
cdef class Nuc_Seq_Stored(Line) :
# sequence id property getter and setter
@property
def id(self): # @ReservedAssignment @DuplicatedSignature
return self[ID_COLUMN_str]
@id.setter
def id(self, str new_id): # @ReservedAssignment @DuplicatedSignature
self[ID_COLUMN_str] = new_id
# sequence definition property getter and setter
@property
def definition(self):
return self[DEFINITION_COLUMN_str]
@definition.setter
def definition(self, str new_def): # @DuplicatedSignature
self[DEFINITION_COLUMN_str] = new_def
# nuc_seq property getter and setter
@property
def nuc_seq(self):
return self[NUC_SEQUENCE_COLUMN_str]
@nuc_seq.setter
def nuc_seq(self, object new_seq): # @DuplicatedSignature
self[NUC_SEQUENCE_COLUMN_str] = new_seq
# quality property getter and setter
@property
def quality(self):
return self[QUALITY_COLUMN_str]
@quality.setter
def quality(self, object new_qual): # @DuplicatedSignature
if (type(new_qual) == list) or (new_qual is None) : # TODO check that quality column exists
self[QUALITY_COLUMN_str] = new_qual
else : # Quality is in str form
self._view.get_column(QUALITY_COLUMN_str).set_str_line(self._index, new_qual)
cpdef object get_str_quality(self) : # TODO not ideal. Make quality_int and quality_str properties
return self._view.get_column(QUALITY_COLUMN_str).get_str_line(self._index)
# cpdef str reverse_complement(self) : TODO in C ?
# pass
# TODO static method to import OBI_Nuc_Seq to OBI_Nuc_Seq_Stored ?
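
Note on the `Seq` classes above: they subclass `dict` and use property setters so that assigning `seq.id` or `seq.definition` also updates the matching dict key. A plain-Python sketch of that pattern is given below. The class name `Record` and the key strings are placeholders, since the real column names come from the C header and are not shown in this diff.

```python
# Placeholder column names, assumed for illustration only.
ID_KEY, SEQ_KEY, DEFINITION_KEY = "ID", "SEQ", "DEFINITION"


class Record(dict):
    """Hypothetical dict subclass whose properties mirror into dict keys."""

    def __init__(self, id, seq, definition=None):
        super().__init__()
        self.id = id
        self.seq = seq
        if definition is not None:
            self.definition = definition

    @property
    def id(self):
        return self._id

    @id.setter
    def id(self, new_id):
        self._id = new_id
        self[ID_KEY] = new_id

    @property
    def seq(self):
        return self._seq

    @seq.setter
    def seq(self, new_seq):
        self._seq = new_seq
        self[SEQ_KEY] = new_seq

    @property
    def definition(self):
        # None until a definition has been assigned
        return getattr(self, "_definition", None)

    @definition.setter
    def definition(self, new_definition):
        self._definition = new_definition
        self[DEFINITION_KEY] = new_definition


bare = Record("seq0", "A")
record = Record("seq1", "ACGT")
record.id = "seq2"
record.definition = "example"
```

Keeping the attribute and the dict entry in the same setter means the two cannot drift apart, at the cost of storing each value twice.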

View File

@@ -0,0 +1,28 @@
#cython: language_level=3
cdef dict __c_cython_mapping__
cdef class OBIObject:
cdef dict _dependent_objects
cdef register(self, OBIObject object)
cdef unregister(self, OBIObject object)
cpdef close(self)
cdef class OBIWrapper(OBIObject):
cdef void* _pointer
cdef inline size_t cid(self)
cdef inline bint active(self)
@staticmethod
cdef object new(type constructor, void* pointer)
cdef class OBIObjectClosedInstance(Exception):
pass

View File

@@ -0,0 +1,89 @@
#cython: language_level=3
__c_cython_mapping__ = {}
cdef class OBIObject:
def __init__(self, __internalCall__) :
if __internalCall__ != 987654 or \
type(self) == OBIObject or \
type(self) == OBIWrapper or \
not isinstance(self, OBIObject) :
raise RuntimeError('OBIObject constructor can not be called directly')
self._dependent_objects = {}
cdef register(self, OBIObject object):
self._dependent_objects[id(object)] = object
cdef unregister(self, OBIObject object):
del self._dependent_objects[id(object)]
cpdef close(self):
cdef OBIObject object
cdef list to_close = list((self._dependent_objects).values())
for object in to_close:
object.close()
assert len(self._dependent_objects.values()) == 0
cdef class OBIWrapper(OBIObject) :
'''
The OBIWrapper class wraps a C object representing a DMS or an element of a DMS.
'''
cdef inline size_t cid(self) :
return <size_t>(self._pointer)
cdef inline bint active(self) :
return self._pointer != NULL
cpdef close(self):
if (self._pointer != NULL):
OBIObject.close(self)
del __c_cython_mapping__[<size_t>self._pointer]
self._pointer = NULL
assert len(self._dependent_objects.values()) == 0
def __dealloc__(self):
'''
Destructor of any OBI instance.
The destructor automatically calls the `close` method and
therefore closes and frees all associated objects and memory.
'''
self.close()
@staticmethod
cdef object new(type constructor, void* pointer) :
cdef OBIWrapper o
if (<size_t>pointer in __c_cython_mapping__):
#print("Pointer already in cython dict")
return __c_cython_mapping__[<size_t>pointer]
else:
o = constructor(987654)
o._pointer = pointer
__c_cython_mapping__[<size_t>pointer] = o
return o
cdef class OBIObjectClosedInstance(Exception):
pass

View File

@@ -0,0 +1,23 @@
#cython: language_level=3
from ..capi.obitaxonomy cimport ecotx_t, OBIDMS_taxonomy_p
from ..dms cimport DMS
from ..object cimport OBIWrapper
cdef class OBI_Taxonomy(OBIWrapper) :
cdef str _name # TODO keep as bytes?
cdef DMS _dms
cdef inline OBIDMS_taxonomy_p pointer(self)
cpdef get_taxon_by_idx(self, int idx)
cpdef write(self, str prefix)
cpdef int add_taxon(self, str name, str rank_name, int parent_taxid, int min_taxid=*)
cpdef close(self)
cdef class OBI_Taxon :
cdef ecotx_t* _pointer
cdef OBI_Taxonomy _tax

View File

@@ -0,0 +1,196 @@
#cython: language_level=3
from obitools3.utils cimport str2bytes, bytes2str, tobytes
from ..capi.obitaxonomy cimport obi_read_taxonomy, \
obi_read_taxdump, \
obi_write_taxonomy, \
obi_close_taxonomy, \
obi_taxo_get_taxon_with_taxid, \
obi_taxo_add_local_taxon, \
obi_taxo_add_preferred_name_with_taxon, \
ecotx_t
from cpython.pycapsule cimport PyCapsule_New, PyCapsule_GetPointer
cdef class OBI_Taxonomy(OBIWrapper) :
# TODO function to import taxonomy?
cdef inline OBIDMS_taxonomy_p pointer(self) :
return <OBIDMS_taxonomy_p>(self._pointer)
@staticmethod
def open(DMS dms, str name, bint taxdump=False) :
cdef void* pointer
cdef OBI_Taxonomy taxo
if taxdump :
pointer = <void*>obi_read_taxdump(tobytes(name))
else :
pointer = <void*>obi_read_taxonomy(dms.pointer(), tobytes(name), True) # TODO discuss
# TODO if not found in DMS, try to import?
if pointer == NULL :
raise RuntimeError("Error : Cannot read taxonomy %s"
% name)
taxo = OBIWrapper.new(OBI_Taxonomy, pointer)
dms.register(taxo)
taxo._dms = dms
taxo._name = name
return taxo
def __getitem__(self, object ref):
cdef ecotx_t* taxon_p
cdef object taxon_capsule
if type(ref) == int :
taxon_p = obi_taxo_get_taxon_with_taxid(self.pointer(), ref)
if taxon_p == NULL :
raise Exception("Taxon not found")
taxon_capsule = PyCapsule_New(taxon_p, NULL, NULL)
return OBI_Taxon(taxon_capsule, self)
else :
raise Exception("Not implemented")
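    cpdef get_taxon_by_idx(self, int idx):
        cdef ecotx_t* taxa
        cdef ecotx_t* taxon_p
        cdef object taxon_capsule
        # Guard against reading outside the taxa array
        if idx < 0 or idx >= self.pointer().taxa.count :
            raise IndexError("Taxon index %d out of range" % idx)
        taxa = self.pointer().taxa.taxon
        taxon_p = <ecotx_t*> (taxa+idx)
        taxon_capsule = PyCapsule_New(taxon_p, NULL, NULL)
        return OBI_Taxon(taxon_capsule, self)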
def __len__(self):
return self.pointer().taxa.count
def __iter__(self):
cdef ecotx_t* taxa
cdef ecotx_t* taxon_p
cdef object taxon_capsule
cdef int t
taxa = self.pointer().taxa.taxon
# Yield each taxid
for t in range(self.pointer().taxa.count):
taxon_p = <ecotx_t*> (taxa+t)
taxon_capsule = PyCapsule_New(taxon_p, NULL, NULL)
yield OBI_Taxon(taxon_capsule, self)
cpdef write(self, str prefix) :
if obi_write_taxonomy(self._dms.pointer(), self.pointer(), tobytes(prefix)) < 0 :
raise Exception("Error writing the taxonomy to binary files")
cpdef int add_taxon(self, str name, str rank_name, int parent_taxid, int min_taxid=10000000) :
cdef int taxid
taxid = obi_taxo_add_local_taxon(self.pointer(), tobytes(name), tobytes(rank_name), parent_taxid, min_taxid)
if taxid < 0 :
raise Exception("Error adding a new taxon to the taxonomy")
else :
return taxid
cpdef close(self) :
cdef OBIDMS_taxonomy_p pointer = self.pointer()
if self.active() :
self._dms.unregister(self)
OBIWrapper.close(self)
if (obi_close_taxonomy(pointer) < 0) : # use the pointer saved before OBIWrapper.close() reset it to NULL
raise Exception("Problem closing the taxonomy %s" %
self._name)
# name property getter
@property
def name(self):
return self._name
cdef class OBI_Taxon : # TODO dict subclass?
def __init__(self, object taxon_capsule, OBI_Taxonomy tax) :
self._pointer = <ecotx_t*> PyCapsule_GetPointer(taxon_capsule, NULL)
if self._pointer == NULL :
raise Exception("Error reading a taxon (NULL pointer)")
self._tax = tax
# To test equality
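    def __richcmp__(self, OBI_Taxon taxon2, int op):
        cdef bint equal
        equal = (self.name == taxon2.name) and \
                (self.taxid == taxon2.taxid) and \
                (self.rank == taxon2.rank) and \
                (self.farest == taxon2.farest) and \
                (self.parent.taxid == taxon2.parent.taxid) and \
                (self.preferred_name == taxon2.preferred_name)
        if op == 2 :    # Py_EQ
            return equal
        elif op == 3 :  # Py_NE
            return not equal
        else :          # ordering comparisons are not defined for taxa
            return NotImplemented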
# name property getter
@property
def name(self):
return bytes2str(self._pointer.name)
# taxid property getter
@property
def taxid(self):
return self._pointer.taxid
# rank property getter
@property
def rank(self):
return self._pointer.rank
# farest property getter
@property
def farest(self):
return self._pointer.farest
# parent property getter
@property
def parent(self):
cdef object parent_capsule
parent_capsule = PyCapsule_New(self._pointer.parent, NULL, NULL)
return OBI_Taxon(parent_capsule, self._tax)
# preferred name property getter and setter
@property
def preferred_name(self):
if self._pointer.preferred_name != NULL :
return bytes2str(self._pointer.preferred_name)
@preferred_name.setter
def preferred_name(self, str new_preferred_name) : # @DuplicatedSignature
if (obi_taxo_add_preferred_name_with_taxon(self._tax.pointer(), self._pointer, tobytes(new_preferred_name)) < 0) :
raise Exception("Error adding a new preferred name to a taxon")
def __repr__(self):
d = {}
d['taxid'] = self.taxid
d['name'] = self.name
d['rank'] = self.rank
d['preferred name'] = self.preferred_name
d['parent'] = self.parent.taxid
d['farest'] = self.farest
return str(d)
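
Note on comparisons: Cython passes an integer `op` to `__richcmp__` that identifies the requested comparison (in CPython, `Py_EQ` is 2 and `Py_NE` is 3). A pure-Python sketch of dispatching on that code follows. The `Taxon` class and its `richcmp` method are hypothetical and compare only three fields for brevity.

```python
PY_EQ, PY_NE = 2, 3  # values of CPython's Py_EQ and Py_NE constants


class Taxon:
    """Hypothetical simplified taxon used only to illustrate op dispatch."""

    def __init__(self, taxid, name, rank):
        self.taxid = taxid
        self.name = name
        self.rank = rank

    def _key(self):
        return (self.taxid, self.name, self.rank)

    def richcmp(self, other, op):
        if op == PY_EQ:
            return self._key() == other._key()
        if op == PY_NE:
            return self._key() != other._key()
        # Ordering (<, <=, >, >=) is left undefined
        return NotImplemented


a = Taxon(9606, "Homo sapiens", 1)
b = Taxon(9606, "Homo sapiens", 1)
c = Taxon(10090, "Mus musculus", 1)
```

Returning `NotImplemented` for the other codes lets Python raise `TypeError` for ordering comparisons instead of silently answering them with an equality result.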

View File

@@ -0,0 +1,3 @@
#from .view import View # @UnresolvedImport
#from .view import Line_selection # @UnresolvedImport
#from .view import Line # @UnresolvedImport

View File

@@ -0,0 +1,8 @@
#cython: language_level=3
from ..view cimport View
cdef class View_NUC_SEQS(View):
pass

View File

@@ -0,0 +1,79 @@
#cython: language_level=3
from obitools3.dms.capi.obiview cimport obi_new_view_nuc_seqs, \
obi_new_view_nuc_seqs_cloned_from_name, \
VIEW_TYPE_NUC_SEQS
from ..view cimport register_view_class
from obitools3.dms.obiseq cimport Nuc_Seq, Nuc_Seq_Stored
from obitools3.dms.dms cimport DMS
from obitools3.dms.capi.obitypes cimport index_t
from obitools3.utils cimport tobytes, bytes2str
from obitools3.dms.capi.obidms cimport OBIDMS_p
from obitools3.dms.object cimport OBIWrapper
cdef class View_NUC_SEQS(View):
@staticmethod
def new(DMS dms,
object view_name,
object comments=None,
bint quality=False):
cdef bytes view_name_b = tobytes(view_name)
cdef bytes comments_b
cdef str message
cdef void* pointer
cdef View_NUC_SEQS view
if comments is not None:
comments_b = tobytes(comments)
else:
comments_b = b''
pointer = <void*>obi_new_view_nuc_seqs(<OBIDMS_p>dms._pointer,
view_name_b,
NULL,
NULL,
comments_b,
quality)
if pointer == NULL :
message = "Error : Cannot create view %s" % bytes2str(view_name_b)
raise RuntimeError(message)
view = OBIWrapper.new(View_NUC_SEQS, pointer)
view._dms = dms
dms.register(view)
return view
# TODO
def __getitem__(self, object item) :
if type(item) == int :
return Nuc_Seq_Stored(self, item)
else : # TODO assume str or bytes for optimization?
return self.get_column(item) # TODO very slow in practice
def __setitem__(self, index_t line_idx, Nuc_Seq sequence_obj) :
for key in sequence_obj :
self[line_idx][key] = sequence_obj[key]
# TODO make properties for id, seq, def columns etc
def register_class() :
register_view_class(VIEW_TYPE_NUC_SEQS, View_NUC_SEQS)

View File

@@ -0,0 +1,65 @@
#cython: language_level=3
from ..capi.obiview cimport Obiview_p
from ..capi.obitypes cimport index_t, obitype_t
from ..object cimport OBIWrapper
from ..dms cimport DMS
from ..column.column cimport Column
cdef dict __OBIDMS_VIEW_CLASS__
cdef class View(OBIWrapper):
cdef DMS _dms
cdef inline Obiview_p pointer(self)
cpdef delete_column(self,
object column_name)
cpdef rename_column(self,
object current_name,
object new_name)
cpdef Column rewrite_column_with_diff_attributes(self,
object column_name,
obitype_t new_data_type=*,
index_t new_nb_elements_per_line=*,
list new_elements_names=*)
cpdef Line_selection new_selection(self,
list lines=*)
@staticmethod
cdef type get_view_class(bytes view_type)
cdef class Line_selection(list):
cdef View _view
cdef bytes _view_name
cdef index_t* __build_binary_list__(self)
cpdef View materialize(self,
object view_name,
object comments=*)
cdef class Line :
cdef index_t _index
cdef View _view
cdef register_view_class(bytes view_type_name,
type view_class)
cdef register_all_view_classes()

View File

@@ -0,0 +1,556 @@
#cython: language_level=3
cdef dict __VIEW_CLASS__= {}
from libc.stdlib cimport malloc
from ..capi.obiview cimport Alias_column_pair_p, \
obi_new_view, \
obi_open_view, \
obi_clone_view, \
obi_save_and_close_view, \
obi_view_get_pointer_on_column_in_view, \
obi_view_delete_column, \
obi_view_create_column_alias
from ..capi.obidmscolumn cimport OBIDMS_column_p
from ..capi.obidms cimport OBIDMS_p
from obitools3.utils cimport tobytes, \
str2bytes, \
bytes2str
from ..object cimport OBIObjectClosedInstance
from obitools3.dms.view import typed_view
from ..capi.obitypes cimport is_a_DNA_seq, \
OBI_VOID, \
OBI_BOOL, \
OBI_CHAR, \
OBI_FLOAT, \
OBI_INT, \
OBI_QUAL, \
OBI_SEQ, \
OBI_STR
import importlib
import inspect
import pkgutil
cdef class View(OBIWrapper) :
cdef inline Obiview_p pointer(self) :
return <Obiview_p>(self._pointer)
@staticmethod
cdef type get_view_class(bytes view_type):
global __VIEW_CLASS__
return __VIEW_CLASS__.get(view_type, View)
@staticmethod
def new(DMS dms,
object view_name,
object comments=None):
cdef bytes view_name_b = tobytes(view_name)
cdef bytes comments_b
cdef str message
cdef void* pointer
cdef View view # @DuplicatedSignature
if comments is not None:
comments_b = tobytes(comments)
else:
comments_b = b''
pointer = <void*>obi_new_view(<OBIDMS_p>dms._pointer,
view_name_b,
NULL,
NULL,
comments_b)
if pointer == NULL :
message = "Error : Cannot create view %s" % bytes2str(view_name_b)
raise RuntimeError(message)
view = OBIWrapper.new(View, pointer)
view._dms = dms
dms.register(view)
return view
def clone(self,
object view_name,
object comments=None):
cdef bytes view_name_b = tobytes(view_name)
cdef bytes comments_b
cdef void* pointer
cdef View view
if not self.active() :
raise OBIObjectClosedInstance()
if comments is not None:
comments_b = tobytes(comments)
else:
comments_b = b''
pointer = <void*> obi_clone_view(self._dms.pointer(),
self.pointer(),
view_name_b,
NULL,
comments_b)
if pointer == NULL :
raise RuntimeError("Error : Cannot clone view %s into view %s"
% (bytes2str(self.name),
bytes2str(view_name_b))
)
view = OBIWrapper.new(type(self), pointer)
view._dms = self._dms
self._dms.register(view)
return view
@staticmethod
def open(DMS dms, # @ReservedAssignment
object view_name):
cdef bytes view_name_b = tobytes(view_name)
cdef void* pointer
cdef View view
cdef type view_class
pointer = <void*> obi_open_view(dms.pointer(),
view_name_b)
if pointer == NULL :
raise RuntimeError("Error : Cannot open view %s" % bytes2str(view_name_b))
view_class = View.get_view_class((<Obiview_p>pointer).infos.view_type)
view = OBIWrapper.new(view_class, pointer)
view._dms = dms
dms.register(view)
return view
cpdef close(self):
cdef Obiview_p pointer = self.pointer()
if self.active() :
self._dms.unregister(self)
OBIWrapper.close(self)
if obi_save_and_close_view(pointer) < 0 :
raise Exception("Problem closing view %s" %
bytes2str(self.name))
def __repr__(self) :
# TODO check everywhere
if not self.active() :
raise OBIObjectClosedInstance()
cdef str s = "{name:s}\n{comments:s}\n{line_count:d} lines\n".format(name = bytes2str(self.name),
comments = bytes2str(self.comments),
line_count = self.line_count)
for column_name in self.keys() :
s = s + repr(self[column_name]) + '\n'
return s
    def keys(self):
        cdef int i
        cdef Obiview_p pointer
        cdef int nb_column
        # Check that the view is still open before dereferencing its pointer
        if not self.active() :
            raise OBIObjectClosedInstance()
        pointer = self.pointer()
        nb_column = pointer.infos.column_count
        for i in range(nb_column) :
            yield bytes2str(pointer.infos.column_references[i].alias)
def get_column(self,
object column_name):
if not self.active() :
raise OBIObjectClosedInstance()
return Column.open(self, column_name)
cpdef delete_column(self,
object column_name) :
cdef bytes column_name_b = tobytes(column_name)
if not self.active() :
raise OBIObjectClosedInstance()
# Close the cython instance first
col = self[column_name]
col.close()
# Remove the column from the view which closes the C structure
if obi_view_delete_column(self.pointer(), column_name_b) < 0 :
raise Exception("Problem deleting column %s from a view" %
bytes2str(column_name_b))
cpdef rename_column(self,
object current_name,
object new_name):
cdef Column column
cdef bytes current_name_b = tobytes(current_name)
cdef bytes new_name_b = tobytes(new_name)
if not self.active() :
raise OBIObjectClosedInstance()
if (obi_view_create_column_alias(self.pointer(),
tobytes(current_name_b),
tobytes(new_name_b)) < 0) :
raise Exception("Problem in renaming column %s to %s" % (
bytes2str(current_name_b),
bytes2str(new_name_b)))
# TODO warning, not multithreading compliant
cpdef Column rewrite_column_with_diff_attributes(self,
object column_name,
obitype_t new_data_type=<obitype_t>OBI_VOID,
index_t new_nb_elements_per_line=0,
list new_elements_names=None) :
cdef Column old_column
cdef Column new_column
cdef index_t length = len(self)
old_column = self.get_column(column_name)
if new_data_type == <obitype_t>OBI_VOID :
new_data_type = old_column.data_type
if new_nb_elements_per_line == 0 :
new_nb_elements_per_line = old_column.nb_elements_per_line
if new_elements_names is None :
new_elements_names = old_column.elements_names
new_column = Column.new_column(self, old_column.pointer().header.name, new_data_type,
nb_elements_per_line=new_nb_elements_per_line, elements_names=new_elements_names,
comments=old_column.comments, alias=tobytes(column_name)+tobytes('___new___'))
for i in range(length) :
new_column[i] = old_column[i]
# Remove old column from view
self.delete_column(column_name)
# Rename new
new_column.name = column_name
return new_column
cpdef Line_selection new_selection(self,list lines=None):
return Line_selection(self, lines)
def __iter__(self):
# Iteration on each line of all columns
# Declarations
cdef index_t line_nb
cdef Line line
# Yield each line
for line_nb in range(self.line_count) :
line = self[line_nb]
yield line
def __getitem__(self, object item) :
if type(item) == int :
return Line(self, item)
else : # TODO assume str or bytes for optimization?
return self.get_column(item) # TODO very slow in practice
def __contains__(self, str column_name):
return (column_name in self.keys())
def __len__(self):
return(self.line_count)
def __str__(self) :
cdef Line line
cdef str to_print
to_print = ""
for line in self :
to_print = to_print + str(line) + "\n"
return to_print
@property
def dms(self):
return self._dms
# line_count property getter
@property
def line_count(self):
return self.pointer().infos.line_count
# name property getter
@property
def name(self):
return <bytes> self.pointer().infos.name
# view type property getter
@property
def type(self): # @ReservedAssignment
return <bytes> self.pointer().infos.view_type
# comments property getter
@property
def comments(self):
return <bytes> self.pointer().infos.comments
# TODO setter that concatenates new comments?
cdef class Line :
def __init__(self, View view, index_t line_nb) :
self._index = line_nb
self._view = view
def __getitem__(self, str column_name) :
return (self._view)[column_name][self._index]
def __setitem__(self, str column_name, object value): # TODO discuss
# TODO detect multiple elements (dict type)? put somewhere else? but more risky (in get)
# TODO OBI_QUAL ?
cdef type value_type
cdef obitype_t value_obitype
cdef bytes value_b
if column_name not in self._view :
if value is None :
raise Exception("Trying to create a column from a None value (can't guess type)")
value_type = type(value)
if value_type == int :
value_obitype = OBI_INT
elif value_type == float :
value_obitype = OBI_FLOAT
elif value_type == bool :
value_obitype = OBI_BOOL
elif value_type == str or value_type == bytes :
if value_type == str :
value_b = str2bytes(value)
else :
value_b = value
if is_a_DNA_seq(value_b) :
value_obitype = OBI_SEQ
elif len(value) == 1 :
value_obitype = OBI_CHAR
elif (len(value) > 1) :
value_obitype = OBI_STR
else :
raise Exception("Could not guess the type of a value to create a new column")
Column.new_column(self._view, column_name, value_obitype)
(self._view)[column_name][self._index] = value
def __iter__(self):
for column_name in (self._view).keys() :
yield self[column_name]
def __contains__(self, str column_name):
return (column_name in self._view.keys())
def __repr__(self):
cdef dict line
cdef str column_name
line = {}
for column_name in self._view.keys() :
line[column_name] = self[column_name]
return str(line)
# cpdef dict get_view_infos(self, str view_name) :
#
# cdef Obiview_infos_p view_infos_p
# cdef dict view_infos_d
# cdef Alias_column_pair_p column_refs
# cdef int i, j
# cdef str column_name
#
# view_infos_p = obi_view_map_file(self._pointer,
# tobytes(view_name))
# view_infos_d = {}
# view_infos_d["name"] = bytes2str(view_infos_p.name)
# view_infos_d["comments"] = bytes2str(view_infos_p.comments)
# view_infos_d["view_type"] = bytes2str(view_infos_p.view_type)
# view_infos_d["column_count"] = <int> view_infos_p.column_count
# view_infos_d["line_count"] = <int> view_infos_p.line_count
# view_infos_d["created_from"] = bytes2str(view_infos_p.created_from)
# view_infos_d["creation_date"] = bytes2str(obi_format_date(view_infos_p.creation_date))
# if (view_infos_p.all_lines) :
# view_infos_d["line_selection"] = None
# else :
# view_infos_d["line_selection"] = {}
# view_infos_d["line_selection"]["column_name"] = bytes2str((view_infos_p.line_selection).column_name)
# view_infos_d["line_selection"]["version"] = <int> (view_infos_p.line_selection).version
# view_infos_d["column_references"] = {}
# column_references = view_infos_p.column_references
# for j in range(view_infos_d["column_count"]) :
# column_name = bytes2str((column_references[j]).alias)
# view_infos_d["column_references"][column_name] = {}
# view_infos_d["column_references"][column_name]["original_name"] = bytes2str((column_references[j]).column_refs.column_name)
# view_infos_d["column_references"][column_name]["version"] = (column_references[j]).column_refs.version
#
# obi_view_unmap_file(self._pointer, view_infos_p)
#
# return view_infos_d
cdef class Line_selection(list):
def __init__(self, View view, lines=None) :
if view._pointer == NULL:
raise Exception("Error: trying to create a line selection with an invalidated view")
self._view = view
self._view_name = view.name
if lines is not None:
self.extend(lines)
def extend(self, iterable):
cdef index_t i
cdef index_t max_i = self._view.line_count
for i in iterable: # TODO this is already checked in C
if i >= max_i:
raise RuntimeError("Error: trying to select line %d beyond the line count %d of view %s" %
(i,
max_i,
self._view_name)
)
list.append(self,i)
def append(self, index_t idx) :
if idx >= self._view.line_count :
raise IndexError("Error: trying to select line %d beyond the line count %d of view %s" %
(idx,
self._view.line_count,
self._view_name)
)
list.append(self,idx)
cdef index_t* __build_binary_list__(self):
cdef index_t* line_selection_p = NULL
cdef int i
cdef size_t l_selection = len(self)
line_selection_p = <index_t*> malloc((l_selection + 1) * sizeof(index_t)) # +1 for the -1 flagging the end of the array
for i in range(l_selection) :
line_selection_p[i] = self[i]
line_selection_p[l_selection] = -1 # flagging the end of the array
return line_selection_p
cpdef View materialize(self,
object view_name,
object comments=""):
cdef bytes view_name_b = tobytes(view_name)
cdef bytes comments_b
cdef Obiview_p pointer
cdef View view
if not self._view.active() :
raise OBIObjectClosedInstance()
if comments is not None:
comments_b = tobytes(comments)
else:
comments_b = b''
pointer = obi_clone_view(self._view._dms.pointer(),
self._view.pointer(),
view_name_b,
self.__build_binary_list__(),
comments_b)
if pointer == NULL :
raise RuntimeError("Error : Cannot clone view %s into view %s with new line selection"
% (str(self._view.name),
bytes2str(view_name_b))
)
view = OBIWrapper.new(type(self._view), pointer)
view._dms = self._view._dms
view._dms.register(view)
return view
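The bounds check in `extend`/`append` and the `-1`-terminated array built by `__build_binary_list__` can be sketched in plain Python as below. This is only an illustration under stated assumptions: `build_selection` and `END_OF_SELECTION` are hypothetical names, and a signed 64-bit `array` stands in for a C array of `index_t`, whose real width is not shown in this diff.

```python
from array import array

END_OF_SELECTION = -1  # sentinel flagging the end of the array, as in the Cython code


def build_selection(indices, line_count):
    """Validate indices against line_count and return a sentinel-terminated array.

    Hypothetical helper: it mirrors the checks of Line_selection in pure Python.
    """
    selection = array("q")  # signed 64-bit integers, assumed stand-in for index_t
    for i in indices:
        # Valid line indices are 0 .. line_count - 1.
        if i < 0 or i >= line_count:
            raise IndexError(
                "line %d is outside the %d lines of the view" % (i, line_count)
            )
        selection.append(i)
    selection.append(END_OF_SELECTION)
    return selection
```

Because `-1` marks the end of the array, a negative line index cannot be stored in the selection, which is why the sketch rejects negative values as well.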
#############################################################
cdef register_view_class(bytes view_type_name,
type view_class):
'''
Each subclass of `dms.view` needs to be registered after its declaration
'''
global __VIEW_CLASS__
assert issubclass(view_class, View)
__VIEW_CLASS__[view_type_name] = view_class
cdef register_all_view_classes() :
x = list(pkgutil.walk_packages(typed_view.__path__, prefix="obitools3.dms.view.typed_view."))
all_modules = [importlib.import_module(a[1]) for a in x]
for mod in all_modules :
getattr(mod, 'register_class')()
register_all_view_classes()
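The registration mechanism above amounts to a dictionary from view type names to classes. A minimal pure-Python sketch follows, assuming hypothetical stand-in classes (`View`, `NucSeqsView`), a hypothetical type name `b"NUC_SEQS_VIEW"`, and a fallback to the generic class that the diff does not confirm.

```python
VIEW_CLASSES = {}  # maps a view type name (bytes) to its Python class


class View:
    """Stand-in for the generic view class."""


class NucSeqsView(View):
    """Stand-in for a typed view subclass."""


def register_view_class(view_type_name, view_class):
    # Only subclasses of View may be registered.
    if not issubclass(view_class, View):
        raise TypeError("%r is not a View subclass" % (view_class,))
    VIEW_CLASSES[view_type_name] = view_class


def view_class_for(view_type_name):
    # Assumption: unknown type names fall back to the generic View.
    return VIEW_CLASSES.get(view_type_name, View)


register_view_class(b"NUC_SEQS_VIEW", NucSeqsView)
```

Keeping the lookup in one function means callers never need an `if`/`elif` chain over view types when a new typed view is added.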

@@ -1,6 +1,4 @@
cimport cython
from libc.stdlib cimport malloc, free, realloc
from libc.string cimport strncpy
cdef class FastaFormat:

@@ -1,95 +0,0 @@
#cython: language_level=3
from .capi.obidms cimport OBIDMS_p
from .capi.obidmscolumn cimport OBIDMS_column_p
from .capi.obiview cimport Obiview_p
from .capi.obitypes cimport obiversion_t, OBIType_t, index_t
from ._obitaxo cimport OBI_Taxonomy
cdef class OBIDMS_column:
cdef OBIDMS_column_p* pointer
cdef OBIDMS dms
cdef OBIView view
cdef str data_type
cdef str dms_name
cdef str column_name
cdef index_t nb_elements_per_line
cdef list elements_names
cpdef update_pointer(self)
cpdef list get_elements_names(self)
cpdef str get_data_type(self)
cpdef index_t get_nb_lines_used(self)
cpdef str get_creation_date(self)
cpdef str get_comments(self)
cpdef close(self)
@staticmethod
cdef object get_subclass_type(OBIDMS_column_p column_p)
cdef class OBIDMS_column_multi_elts(OBIDMS_column):
cpdef set_line(self, index_t line_nb, dict values)
cdef class OBIDMS_column_line:
cdef OBIDMS_column column
cdef index_t index
cdef class OBIView:
cdef Obiview_p pointer
cdef str name
cdef str comments
cdef dict columns
cdef OBIDMS dms
cpdef delete_column(self, str column_name)
cpdef add_column(self,
str column_name,
obiversion_t version_number=*,
str type=*,
index_t nb_lines=*,
index_t nb_elements_per_line=*,
list elements_names=*,
str indexer_name=*,
str comments=*,
bint create=*
)
cpdef select_line(self, index_t line_nb)
cpdef select_lines(self, list line_selection)
cpdef save_and_close(self)
cdef class OBIView_NUC_SEQS(OBIView):
cdef OBIDMS_column ids
cdef OBIDMS_column sequences
cdef OBIDMS_column definitions
cdef OBIDMS_column qualities
cpdef delete_column(self, str column_name)
cdef class OBIView_line :
cdef index_t index
cdef OBIView view
cdef class OBIDMS:
cdef OBIDMS_p pointer
cdef str dms_name
cpdef close(self)
cpdef OBI_Taxonomy open_taxonomy(self, str taxo_name)
cpdef OBIView open_view(self, str view_name)
cpdef OBIView new_view(self, str view_name, object view_to_clone=*, list line_selection=*, str view_type=*, str comments=*)
cpdef dict read_view_infos(self, str view_name)
# cpdef dict read_views(self) TODO

@@ -1,704 +0,0 @@
#cython: language_level=3
from obitools3.utils cimport bytes2str, str2bytes
from .capi.obidms cimport obi_dms, \
obi_close_dms
from .capi.obidmscolumn cimport obi_close_column, \
OBIDMS_column_p, \
OBIDMS_column_header_p
from .capi.obiutils cimport obi_format_date
from .capi.obitypes cimport const_char_p, \
OBIType_t, \
OBI_INT, \
OBI_FLOAT, \
OBI_BOOL, \
OBI_CHAR, \
OBI_QUAL, \
OBI_STR, \
OBI_SEQ, \
name_data_type, \
only_ATGC # discuss
from ._obidms cimport OBIDMS, \
OBIDMS_column, \
OBIView, \
OBIView_line
from ._obitaxo cimport OBI_Taxonomy
from ._obiseq cimport OBI_Nuc_Seq, OBI_Nuc_Seq_Stored
from ._obidmscolumn_int cimport OBIDMS_column_int, \
OBIDMS_column_multi_elts_int
from ._obidmscolumn_float cimport OBIDMS_column_float, \
OBIDMS_column_multi_elts_float
from ._obidmscolumn_bool cimport OBIDMS_column_bool, \
OBIDMS_column_multi_elts_bool
from ._obidmscolumn_char cimport OBIDMS_column_char, \
OBIDMS_column_multi_elts_char
from ._obidmscolumn_qual cimport OBIDMS_column_qual, \
OBIDMS_column_multi_elts_qual
from ._obidmscolumn_str cimport OBIDMS_column_str, \
OBIDMS_column_multi_elts_str
from ._obidmscolumn_seq cimport OBIDMS_column_seq, \
OBIDMS_column_multi_elts_seq
from .capi.obiview cimport Obiview_p, \
Obiview_infos_p, \
Column_reference_p, \
obi_new_view_nuc_seqs, \
obi_new_view, \
obi_new_view_cloned_from_name, \
obi_new_view_nuc_seqs_cloned_from_name, \
obi_view_map_file, \
obi_view_unmap_file, \
obi_open_view, \
obi_view_delete_column, \
obi_view_add_column, \
obi_view_get_column, \
obi_view_get_pointer_on_column_in_view, \
obi_select_line, \
obi_select_lines, \
obi_save_and_close_view, \
VIEW_TYPE_NUC_SEQS, \
NUC_SEQUENCE_COLUMN, \
ID_COLUMN, \
DEFINITION_COLUMN, \
QUALITY_COLUMN
from libc.stdlib cimport malloc
from cpython.pycapsule cimport PyCapsule_New, PyCapsule_GetPointer
cdef class OBIDMS_column :
# Should only be initialized through a subclass
def __init__(self, OBIView view, str column_name):
cdef OBIDMS_column_p column_p
cdef OBIDMS_column_p* column_pp
column_pp = obi_view_get_pointer_on_column_in_view(view.pointer, str2bytes(column_name))
column_p = column_pp[0] # TODO ugly cython dereferencing but can't find better
# Fill structure
self.pointer = column_pp
self.dms = view.dms
self.view = view
self.data_type = bytes2str(name_data_type((column_p.header).returned_data_type))
self.column_name = bytes2str((column_p.header).name)
self.nb_elements_per_line = (column_p.header).nb_elements_per_line
self.elements_names = (bytes2str((column_p.header).elements_names)).split(';')
def __setitem__(self, index_t line_nb, object value):
self.set_line(line_nb, value)
def __getitem__(self, index_t line_nb):
return self.get_line(line_nb)
def __len__(self):
return (self.pointer)[0].header.lines_used
def __sizeof__(self):
return ((self.pointer)[0].header.header_size + (self.pointer)[0].header.data_size)
def __iter__(self):
# Declarations
cdef index_t lines_used
cdef index_t line_nb
# Yield each line
lines_used = (self.pointer)[0].header.lines_used
for line_nb in range(lines_used):
yield self.get_line(line_nb)
cpdef update_pointer(self):
self.pointer = <OBIDMS_column_p*> obi_view_get_pointer_on_column_in_view(self.view.pointer, str2bytes(self.column_name))
cpdef list get_elements_names(self):
return self.elements_names
cpdef str get_data_type(self):
return self.data_type
cpdef index_t get_nb_lines_used(self):
return (self.pointer)[0].header.lines_used
cpdef str get_creation_date(self):
return bytes2str(obi_format_date((self.pointer)[0].header.creation_date))
cpdef str get_comments(self):
return bytes2str((self.pointer)[0].header.comments)
def __str__(self) :
cdef str to_print
to_print = ''
for line in self :
to_print = to_print + str(line) + "\n"
return to_print
def __repr__(self) :
return (self.column_name + ", version " + str((self.pointer)[0].header.version) + ", data type: " + self.data_type)
cpdef close(self):
if obi_close_column((self.pointer)[0]) < 0 :
raise Exception("Problem closing a column")
@staticmethod
cdef object get_subclass_type(OBIDMS_column_p column_p) :
cdef object subclass
cdef OBIDMS_column_header_p header
cdef OBIType_t col_type
cdef bint col_writable
cdef bint col_one_element_per_line
header = column_p.header
col_type = header.returned_data_type
col_writable = column_p.writable
col_one_element_per_line = ((header.nb_elements_per_line) == 1)
if col_type == OBI_INT :
if col_one_element_per_line :
subclass = OBIDMS_column_int
else :
subclass = OBIDMS_column_multi_elts_int
elif col_type == OBI_FLOAT :
if col_one_element_per_line :
subclass = OBIDMS_column_float
else :
subclass = OBIDMS_column_multi_elts_float
elif col_type == OBI_BOOL :
if col_one_element_per_line :
subclass = OBIDMS_column_bool
else :
subclass = OBIDMS_column_multi_elts_bool
elif col_type == OBI_CHAR :
if col_one_element_per_line :
subclass = OBIDMS_column_char
else :
subclass = OBIDMS_column_multi_elts_char
elif col_type == OBI_QUAL :
if col_one_element_per_line :
subclass = OBIDMS_column_qual
else :
subclass = OBIDMS_column_multi_elts_qual
elif col_type == OBI_STR :
if col_one_element_per_line :
subclass = OBIDMS_column_str
else :
subclass = OBIDMS_column_multi_elts_str
elif col_type == OBI_SEQ :
if col_one_element_per_line :
subclass = OBIDMS_column_seq
else :
subclass = OBIDMS_column_multi_elts_seq
else :
raise Exception("Problem with the data type")
return subclass
######################################################################################################
cdef class OBIDMS_column_multi_elts(OBIDMS_column) :
def __getitem__(self, index_t line_nb):
return OBIDMS_column_line(self, line_nb)
cpdef set_line(self, index_t line_nb, dict values):
for element_name in values :
self.set_item(line_nb, element_name, values[element_name])
######################################################################################################
cdef class OBIDMS_column_line :
def __init__(self, OBIDMS_column column, index_t line_nb) :
self.index = line_nb
self.column = column
def __getitem__(self, str element_name) :
return self.column.get_item(self.index, element_name)
def __setitem__(self, str element_name, object value):
self.column.set_item(self.index, element_name, value)
def __contains__(self, str element_name):
return (element_name in self.column.elements_names)
def __repr__(self) :
return str(self.column.get_line(self.index))
##########################################
cdef class OBIView :
def __init__(self, OBIDMS dms, str view_name, bint new=False, object view_to_clone=None, list line_selection=None, str comments=""):
cdef Obiview_p view = NULL
cdef int i
cdef list col_list
cdef str col_name
cdef OBIDMS_column column
cdef OBIDMS_column_p column_p
cdef OBIDMS_column_header_p header
cdef index_t* line_selection_p
self.dms = dms
# Create the C array for the line selection if needed
if line_selection is not None :
line_selection_p = <index_t*> malloc((len(line_selection) + 1) * sizeof(index_t))
for i in range(len(line_selection)) :
line_selection_p[i] = line_selection[i]
line_selection_p[len(line_selection)] = -1
else :
line_selection_p = NULL
# Create the view if needed
if new :
if view_to_clone is not None :
if type(view_to_clone) == str :
view = obi_new_view_cloned_from_name(dms.pointer, str2bytes(view_name), str2bytes(view_to_clone), line_selection_p, str2bytes(comments))
else :
view = obi_new_view(dms.pointer, str2bytes(view_name), (<OBIView> view_to_clone).pointer, line_selection_p, str2bytes(comments))
elif view_to_clone is None :
view = obi_new_view(dms.pointer, str2bytes(view_name), NULL, line_selection_p, str2bytes(comments))
# Else, open the existing view
elif not new :
if view_name is not None :
view = obi_open_view(dms.pointer, str2bytes(view_name))
elif view_name is None :
view = obi_open_view(dms.pointer, NULL) # TODO discuss
if view == NULL :
raise Exception("Error creating/opening a view")
self.pointer = view
self.name = bytes2str(view.infos.name)
self.comments = bytes2str(view.infos.comments)
# Go through columns to build list of corresponding python instances
self.columns = {}
for i in range(view.infos.column_count) :
column_p = <OBIDMS_column_p> (view.columns)[i]
header = (column_p).header
col_name = bytes2str(header.name)
subclass = OBIDMS_column.get_subclass_type(column_p)
self.columns[col_name] = subclass(self, col_name)
def __repr__(self) :
cdef str s
s = str(self.name) + ", " + str(self.comments) + ", " + str(self.pointer.infos.line_count) + " lines\n"
for column_name in self.columns :
s = s + self.columns[column_name].__repr__() + '\n'
return s
cpdef delete_column(self, str column_name) :
cdef int i
cdef Obiview_p view_p
cdef OBIDMS_column column
cdef OBIDMS_column_p column_p
cdef OBIDMS_column_header_p header
cdef str column_n
view_p = self.pointer
if obi_view_delete_column(view_p, str2bytes(column_name)) < 0 :
raise Exception("Problem deleting a column from a view")
# Update the dictionaries of column pointers and column objects, and update pointers in column objects (make function?):
(self.columns).pop(column_name)
for column_n in self.columns :
(self.columns[column_n]).update_pointer()
cpdef add_column(self,
str column_name,
obiversion_t version_number=-1,
str type='',
index_t nb_lines=0,
index_t nb_elements_per_line=1,
list elements_names=None,
str indexer_name="",
str comments="",
bint create=True
) :
cdef bytes column_name_b
cdef bytes elements_names_b
cdef object subclass
cdef OBIDMS_column_p column_p
column_name_b = str2bytes(column_name)
if nb_elements_per_line > 1 :
elements_names_b = str2bytes(';'.join(elements_names))
elif nb_elements_per_line == 1 :
elements_names_b = column_name_b
if type :
if type == 'OBI_INT' :
data_type = OBI_INT
elif type == 'OBI_FLOAT' :
data_type = OBI_FLOAT
elif type == 'OBI_BOOL' :
data_type = OBI_BOOL
elif type == 'OBI_CHAR' :
data_type = OBI_CHAR
elif type == 'OBI_QUAL' :
data_type = OBI_QUAL
elif type == 'OBI_STR' :
data_type = OBI_STR
elif type == 'OBI_SEQ' :
data_type = OBI_SEQ
else :
raise Exception("Invalid provided data type")
if (obi_view_add_column(self.pointer, column_name_b, version_number, # TODO should return pointer on column?
data_type, nb_lines, nb_elements_per_line,
elements_names_b, str2bytes(indexer_name),
str2bytes(comments), create) < 0) :
raise Exception("Problem adding a column in a view")
# Get the column pointer
column_p = obi_view_get_column(self.pointer, column_name_b)
# Open and store the subclass
subclass = OBIDMS_column.get_subclass_type(column_p)
(self.columns)[column_name] = subclass(self, column_name)
cpdef save_and_close(self) :
if (obi_save_and_close_view(self.pointer) < 0) :
raise Exception("Problem closing a view")
def __iter__(self):
# iter on each line of all columns
# Declarations
cdef index_t lines_used
cdef index_t line_nb
cdef OBIView_line line # TODO Check that this works for NUC SEQ views
# Yield each line
lines_used = self.pointer.infos.line_count
for line_nb in range(lines_used) :
line = self[line_nb]
yield line
def __getitem__(self, object item) :
if type(item) == str :
return (self.columns)[item]
elif type(item) == int :
return OBIView_line(self, item)
cpdef select_line(self, index_t line_nb) :
if obi_select_line(self.pointer, line_nb) < 0 :
raise Exception("Problem selecting a line")
cpdef select_lines(self, list line_selection) :
cdef index_t* line_selection_p
line_selection_p = <index_t*> malloc((len(line_selection) + 1) * sizeof(index_t))
for i in range(len(line_selection)) :
line_selection_p[i] = line_selection[i]
line_selection_p[len(line_selection)] = -1
if obi_select_lines(self.pointer, line_selection_p) < 0 :
raise Exception("Problem selecting a list of lines")
def __contains__(self, str column_name):
return (column_name in self.columns)
def __str__(self) :
cdef OBIView_line line
cdef str to_print
to_print = ""
for line in self.__iter__() :
to_print = to_print + str(line) + "\n"
return to_print
#############################################
cdef class OBIView_NUC_SEQS(OBIView):
def __init__(self, OBIDMS dms, str view_name, bint new=False, object view_to_clone=None, list line_selection=None, str comments=""):
cdef Obiview_p view = NULL
cdef int i
cdef list col_list
cdef str col_name
cdef OBIDMS_column column
cdef OBIDMS_column_p column_p
cdef OBIDMS_column_header_p header
cdef index_t* line_selection_p
self.dms = dms
if line_selection is not None :
line_selection_p = <index_t*> malloc((len(line_selection) + 1) * sizeof(index_t))
for i in range(len(line_selection)) :
line_selection_p[i] = line_selection[i]
line_selection_p[len(line_selection)] = -1
else :
line_selection_p = NULL
if new :
if view_to_clone is not None :
if type(view_to_clone) == str :
view = obi_new_view_nuc_seqs_cloned_from_name(dms.pointer, str2bytes(view_name), str2bytes(view_to_clone), line_selection_p, str2bytes(comments))
else :
view = obi_new_view_nuc_seqs(dms.pointer, str2bytes(view_name), (<OBIView> view_to_clone).pointer, line_selection_p, str2bytes(comments))
elif view_to_clone is None :
view = obi_new_view_nuc_seqs(dms.pointer, str2bytes(view_name), NULL, line_selection_p, str2bytes(comments))
elif not new :
if view_name is not None :
view = obi_open_view(dms.pointer, str2bytes(view_name))
elif view_name is None :
view = obi_open_view(dms.pointer, NULL)
if view == NULL :
raise Exception("Error creating/opening view")
self.pointer = view
self.name = bytes2str(view.infos.name)
self.comments = bytes2str(view.infos.comments)
# Go through columns to build list of corresponding python instances
self.columns = {}
for i in range(view.infos.column_count) :
column_p = <OBIDMS_column_p> (view.columns)[i]
header = (column_p).header
col_name = bytes2str(header.name)
subclass = OBIDMS_column.get_subclass_type(column_p)
self.columns[col_name] = subclass(self, col_name)
self.ids = self.columns[bytes2str(ID_COLUMN)]
self.sequences = self.columns[bytes2str(NUC_SEQUENCE_COLUMN)]
self.definitions = self.columns[bytes2str(DEFINITION_COLUMN)]
self.qualities = self.columns[bytes2str(QUALITY_COLUMN)]
def __getitem__(self, object item) :
if type(item) == str :
return (self.columns)[item]
elif type(item) == int :
return OBI_Nuc_Seq_Stored(self, item)
def __setitem__(self, index_t line_idx, OBI_Nuc_Seq sequence_obj) :
for key in sequence_obj :
self[line_idx][key] = sequence_obj[key]
#############################################
cdef class OBIView_line :
def __init__(self, OBIView view, index_t line_nb) :
self.index = line_nb
self.view = view
def __getitem__(self, str column_name) :
return ((self.view).columns)[column_name][self.index]
def __setitem__(self, str column_name, object value):
# TODO detect multiple elements (dict type)? put somewhere else? but more risky (in get)
# TODO OBI_QUAL ?
cdef type value_type
cdef str value_obitype
if column_name not in self.view :
if value is None :
raise Exception("Trying to create a column from a None value (can't guess type)")
value_type = type(value)
if value_type == int :
value_obitype = 'OBI_INT'
elif value_type == float :
value_obitype = 'OBI_FLOAT'
elif value_type == bool :
value_obitype = 'OBI_BOOL'
elif value_type == str :
if only_ATGC(str2bytes(value)) : # TODO detect IUPAC?
value_obitype = 'OBI_SEQ'
elif len(value) == 1 :
value_obitype = 'OBI_CHAR'
elif (len(value) > 1) :
value_obitype = 'OBI_STR'
else :
raise Exception("Could not guess the type of a value to create a new column")
self.view.add_column(column_name, type=value_obitype)
(((self.view).columns)[column_name]).set_line(self.index, value)
def __contains__(self, str column_name):
return (column_name in self.view)
def __repr__(self):
cdef dict line
cdef str column_name
line = {}
for column_name in self.view.columns :
line[column_name] = self[column_name]
return str(line)
##########################################
cdef class OBIDMS :
def __init__(self, str dms_name) :
# Declarations
cdef bytes dms_name_b
# Format the character string to send to C function
dms_name_b = str2bytes(dms_name)
# Fill structure and create or open the DMS
self.dms_name = dms_name
self.pointer = obi_dms(<const_char_p> dms_name_b)
if self.pointer == NULL :
raise Exception("Failed opening or creating an OBIDMS")
cpdef close(self) :
if (obi_close_dms(self.pointer)) < 0 :
raise Exception("Problem closing an OBIDMS")
cpdef OBI_Taxonomy open_taxonomy(self, str taxo_name) :
return OBI_Taxonomy(self, taxo_name)
cpdef OBIView open_view(self, str view_name) :
cdef object view_class
cdef dict view_infos
view_infos = self.read_view_infos(view_name)
if view_infos["view_type"] == bytes2str(VIEW_TYPE_NUC_SEQS) :
view_class = OBIView_NUC_SEQS
else :
view_class = OBIView
return view_class(self, view_name)
cpdef OBIView new_view(self, str view_name, object view_to_clone=None, list line_selection=None, str view_type=None, str comments="") :
cdef object view_class
if view_type is not None and view_type == bytes2str(VIEW_TYPE_NUC_SEQS) :
view_class = OBIView_NUC_SEQS
else :
view_class = OBIView
return view_class(self, view_name, new=True, view_to_clone=view_to_clone, line_selection=line_selection, comments=comments)
cpdef dict read_view_infos(self, str view_name) :
cdef Obiview_infos_p view_infos_p
cdef dict view_infos_d
cdef Column_reference_p column_refs
cdef int i, j
cdef str column_name
view_infos_p = obi_view_map_file(self.pointer, str2bytes(view_name))
view_infos_d = {}
view_infos_d["name"] = bytes2str(view_infos_p.name)
view_infos_d["comments"] = bytes2str(view_infos_p.comments)
view_infos_d["view_type"] = bytes2str(view_infos_p.view_type)
view_infos_d["column_count"] = <int> view_infos_p.column_count
view_infos_d["line_count"] = <int> view_infos_p.line_count
view_infos_d["created_from"] = bytes2str(view_infos_p.created_from)
view_infos_d["creation_date"] = bytes2str(obi_format_date(view_infos_p.creation_date))
if (view_infos_p.all_lines) :
view_infos_d["line_selection"] = None
else :
view_infos_d["line_selection"] = {}
view_infos_d["line_selection"]["column_name"] = bytes2str((view_infos_p.line_selection).column_name)
view_infos_d["line_selection"]["version"] = <int> (view_infos_p.line_selection).version
view_infos_d["column_references"] = {}
column_refs = view_infos_p.column_references
for j in range(view_infos_d["column_count"]) :
column_name = bytes2str((column_refs[j]).column_name)
view_infos_d["column_references"][column_name] = {}
view_infos_d["column_references"][column_name]["version"] = column_refs[j].version
obi_view_unmap_file(self.pointer, view_infos_p)
return view_infos_d
# cpdef dict read_views(self) : # TODO function that prints the dic nicely and function that prints 1 view nicely. Add column type in col ref
#
# cdef Obiviews_infos_all_p all_views_p
# cdef Obiview_infos_p view_p
# cdef Column_reference_p column_refs
# cdef int nb_views
# cdef int i, j
# cdef str view_name
# cdef str column_name
# cdef dict views
# cdef bytes name_b
#
# views = {}
# all_views_p = obi_read_view_infos(self.pointer)
# if all_views_p == NULL :
# raise Exception("No views to read")
# nb_views = <int> (all_views_p.header).view_count
# for i in range(nb_views) :
# view_p = (<Obiview_infos_p> (all_views_p.view_infos)) + i
# view_name = bytes2str(view_p.name)
# views[view_name] = {}
# views[view_name]["comments"] = bytes2str(view_p.comments)
# views[view_name]["view_type"] = bytes2str(view_p.view_type)
# views[view_name]["column_count"] = <int> view_p.column_count
# views[view_name]["line_count"] = <int> view_p.line_count
# views[view_name]["view_number"] = <int> view_p.view_number
# views[view_name]["created_from"] = bytes2str(view_p.created_from)
# views[view_name]["creation_date"] = bytes2str(obi_format_date(view_p.creation_date))
# if (view_p.all_lines) :
# views[view_name]["line_selection"] = None
# else :
# views[view_name]["line_selection"] = {}
# views[view_name]["line_selection"]["column_name"] = bytes2str((view_p.line_selection).column_name)
# views[view_name]["line_selection"]["version"] = <int> (view_p.line_selection).version
# views[view_name]["column_references"] = {}
# column_refs = view_p.column_references
# for j in range(views[view_name]["column_count"]) :
# column_name = bytes2str((column_refs[j]).column_name)
# views[view_name]["column_references"][column_name] = {}
# views[view_name]["column_references"][column_name]["version"] = column_refs[j].version
#
# obi_close_view_infos(all_views_p);
#
# return views
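The type guessing done in `OBIView_line.__setitem__` when a column does not exist yet can be sketched in plain Python as follows. This is a hedged illustration, not the module's code: `guess_obitype` and `only_atgc` are hypothetical names, `only_atgc` is only a stand-in for the C helper `only_ATGC` (whose handling of case and of empty strings is not shown here), and mapping the empty string to `OBI_STR` is an assumption.

```python
def only_atgc(value):
    """Hypothetical stand-in for the C helper only_ATGC."""
    return len(value) > 0 and all(c in "ATGCatgc" for c in value)


def guess_obitype(value):
    """Return the name of the OBI type that would hold value."""
    if value is None:
        raise ValueError("cannot guess a column type from None")
    # bool is tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "OBI_BOOL"
    if isinstance(value, int):
        return "OBI_INT"
    if isinstance(value, float):
        return "OBI_FLOAT"
    if isinstance(value, str):
        if only_atgc(value):
            return "OBI_SEQ"
        if len(value) == 1:
            return "OBI_CHAR"
        return "OBI_STR"  # assumption: also used for the empty string
    raise TypeError("could not guess the type of %r" % (value,))
```

As in the Cython code, the sequence test runs before the length test, so a one-letter string such as `"A"` is treated as a sequence rather than a character.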

@@ -1,59 +0,0 @@
../../../src/bloom.h
../../../src/bloom.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.h
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/obidebug.h
../../../src/obidms_taxonomy.h
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/obilittlebigman.h
../../../src/obilittlebigman.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c

@@ -1,14 +0,0 @@
#cython: language_level=3
from .capi.obitypes cimport index_t
from ._obidms cimport OBIDMS_column, OBIDMS_column_multi_elts
cdef class OBIDMS_column_bool(OBIDMS_column):
cpdef object get_line(self, index_t line_nb)
cpdef set_line(self, index_t line_nb, object value)
cdef class OBIDMS_column_multi_elts_bool(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name)
cpdef object get_line(self, index_t line_nb)
cpdef set_item(self, index_t line_nb, str element_name, object value)

@@ -1,77 +0,0 @@
#cython: language_level=3
from .capi.obiview cimport obi_column_get_obibool_with_elt_name_in_view, \
obi_column_get_obibool_with_elt_idx_in_view, \
obi_column_set_obibool_with_elt_name_in_view, \
obi_column_set_obibool_with_elt_idx_in_view
from .capi.obierrno cimport obi_errno
from .capi.obitypes cimport OBIBool_NA, obibool_t
from obitools3.utils cimport str2bytes
from cpython.bool cimport PyBool_FromLong
cdef class OBIDMS_column_bool(OBIDMS_column):
cpdef object get_line(self, index_t line_nb):
cdef obibool_t value
cdef object result
value = obi_column_get_obibool_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIBool_NA :
result = None
else :
result = PyBool_FromLong(value)
return result
cpdef set_line(self, index_t line_nb, object value):
if value is None :
value = OBIBool_NA
if obi_column_set_obibool_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, <obibool_t> value) < 0:
raise Exception("Problem setting a value in a column")
cdef class OBIDMS_column_multi_elts_bool(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name):
cdef obibool_t value
cdef object result
value = obi_column_get_obibool_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
if obi_errno > 0 :
raise IndexError(line_nb, element_name)
if value == OBIBool_NA :
result = None
else :
result = PyBool_FromLong(value)
return result
cpdef object get_line(self, index_t line_nb) :
cdef obibool_t value
cdef object value_in_result
cdef dict result
cdef index_t i
cdef bint all_NA
result = {}
all_NA = True
for i in range(self.nb_elements_per_line) :
value = obi_column_get_obibool_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
if obi_errno > 0 :
raise IndexError(line_nb)
if value == OBIBool_NA :
value_in_result = None
else :
value_in_result = PyBool_FromLong(value)
result[self.elements_names[i]] = value_in_result
if all_NA and (value_in_result is not None) :
all_NA = False
if all_NA :
result = None
return result
cpdef set_item(self, index_t line_nb, str element_name, object value):
if value is None :
value = OBIBool_NA
if obi_column_set_obibool_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), <obibool_t> value) < 0:
raise Exception("Problem setting a value in a column")
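The NA handling in `OBIDMS_column_multi_elts_bool.get_line` follows one rule: each NA element becomes `None`, and a line whose elements are all NA is returned as `None` instead of a dictionary. A minimal pure-Python sketch, assuming a hypothetical `NA` object in place of the C constant `OBIBool_NA` and a hypothetical helper name `line_as_dict`:

```python
NA = object()  # hypothetical stand-in for a C-level NA sentinel such as OBIBool_NA


def line_as_dict(element_names, raw_values):
    """Map element names to Python booleans, with None for NA values.

    Returns None when every element of the line is NA.
    """
    result = {}
    for name, raw in zip(element_names, raw_values):
        result[name] = None if raw is NA else bool(raw)
    if all(value is None for value in result.values()):
        return None
    return result
```

Collapsing an all-NA line to `None` lets callers test a whole line with `is None` without inspecting each element.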

@@ -1,59 +0,0 @@
../../../src/bloom.h
../../../src/bloom.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.h
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/obidebug.h
../../../src/obidms_taxonomy.h
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/obilittlebigman.h
../../../src/obilittlebigman.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c

@@ -1,14 +0,0 @@
#cython: language_level=3
from .capi.obitypes cimport index_t
from ._obidms cimport OBIDMS_column, OBIDMS_column_multi_elts
cdef class OBIDMS_column_char(OBIDMS_column):
cpdef object get_line(self, index_t line_nb)
cpdef set_line(self, index_t line_nb, object value)
cdef class OBIDMS_column_multi_elts_char(OBIDMS_column_multi_elts):
cpdef object get_item(self, index_t line_nb, str element_name)
cpdef object get_line(self, index_t line_nb)
cpdef set_item(self, index_t line_nb, str element_name, object value)

@@ -1,76 +0,0 @@
#cython: language_level=3
from .capi.obiview cimport obi_column_get_obichar_with_elt_name_in_view, \
obi_column_get_obichar_with_elt_idx_in_view, \
obi_column_set_obichar_with_elt_name_in_view, \
obi_column_set_obichar_with_elt_idx_in_view
from .capi.obierrno cimport obi_errno
from .capi.obitypes cimport OBIChar_NA, obichar_t
from obitools3.utils cimport str2bytes, bytes2str
cdef class OBIDMS_column_char(OBIDMS_column):
    cpdef object get_line(self, index_t line_nb):
        cdef obichar_t value
        cdef object result
        value = obi_column_get_obichar_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
        if obi_errno > 0 :
            raise IndexError(line_nb)
        if value == OBIChar_NA :
            result = None
        else :
            result = bytes2str(value)
        return result
    cpdef set_line(self, index_t line_nb, object value):
        if value is None :
            value = OBIChar_NA
        if obi_column_set_obichar_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, str2bytes(value)[0]) < 0:
            raise Exception("Problem setting a value in a column")
cdef class OBIDMS_column_multi_elts_char(OBIDMS_column_multi_elts):
    cpdef object get_item(self, index_t line_nb, str element_name):
        cdef obichar_t value
        cdef object result
        value = obi_column_get_obichar_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
        if obi_errno > 0 :
            raise IndexError(line_nb, element_name)
        if value == OBIChar_NA :
            result = None
        else :
            result = bytes2str(value)
        return result
    cpdef object get_line(self, index_t line_nb) :
        cdef obichar_t value
        cdef object value_in_result
        cdef dict result
        cdef index_t i
        cdef bint all_NA
        result = {}
        all_NA = True
        for i in range(self.nb_elements_per_line) :
            value = obi_column_get_obichar_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
            if obi_errno > 0 :
                raise IndexError(line_nb)
            if value == OBIChar_NA :
                value_in_result = None
            else :
                value_in_result = bytes2str(value)
            result[self.elements_names[i]] = value_in_result
            if all_NA and (value_in_result is not None) :
                all_NA = False
        if all_NA :
            result = None
        return result
    cpdef set_item(self, index_t line_nb, str element_name, object value):
        if value is None :
            value = OBIChar_NA
        if obi_column_set_obichar_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), str2bytes(value)[0]) < 0:
            raise Exception("Problem setting a value in a column")

@@ -1,59 +0,0 @@
../../../src/bloom.h
../../../src/bloom.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.h
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/obidebug.h
../../../src/obidms_taxonomy.h
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/obilittlebigman.h
../../../src/obilittlebigman.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c

@@ -1,14 +0,0 @@
#cython: language_level=3
from .capi.obitypes cimport index_t
from ._obidms cimport OBIDMS_column, OBIDMS_column_multi_elts
cdef class OBIDMS_column_float(OBIDMS_column):
    cpdef object get_line(self, index_t line_nb)
    cpdef set_line(self, index_t line_nb, object value)
cdef class OBIDMS_column_multi_elts_float(OBIDMS_column_multi_elts):
    cpdef object get_item(self, index_t line_nb, str element_name)
    cpdef object get_line(self, index_t line_nb)
    cpdef set_item(self, index_t line_nb, str element_name, object value)

@@ -1,76 +0,0 @@
#cython: language_level=3
from .capi.obiview cimport obi_column_get_obifloat_with_elt_name_in_view, \
obi_column_get_obifloat_with_elt_idx_in_view, \
obi_column_set_obifloat_with_elt_name_in_view, \
obi_column_set_obifloat_with_elt_idx_in_view
from .capi.obierrno cimport obi_errno
from .capi.obitypes cimport OBIFloat_NA, obifloat_t
from obitools3.utils cimport str2bytes
cdef class OBIDMS_column_float(OBIDMS_column):
    cpdef object get_line(self, index_t line_nb):
        cdef obifloat_t value
        cdef object result
        value = obi_column_get_obifloat_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
        if obi_errno > 0 :
            raise IndexError(line_nb)
        if value == OBIFloat_NA :
            result = None
        else :
            result = <double> value
        return result
    cpdef set_line(self, index_t line_nb, object value):
        if value is None :
            value = OBIFloat_NA
        if obi_column_set_obifloat_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, <obifloat_t> value) < 0:
            raise Exception("Problem setting a value in a column")
cdef class OBIDMS_column_multi_elts_float(OBIDMS_column_multi_elts):
    cpdef object get_item(self, index_t line_nb, str element_name):
        cdef obifloat_t value
        cdef object result
        value = obi_column_get_obifloat_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
        if obi_errno > 0 :
            raise IndexError(line_nb, element_name)
        if value == OBIFloat_NA :
            result = None
        else :
            result = <double> value
        return result
    cpdef object get_line(self, index_t line_nb) :
        cdef obifloat_t value
        cdef object value_in_result
        cdef dict result
        cdef index_t i
        cdef bint all_NA
        result = {}
        all_NA = True
        for i in range(self.nb_elements_per_line) :
            value = obi_column_get_obifloat_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
            if obi_errno > 0 :
                raise IndexError(line_nb)
            if value == OBIFloat_NA :
                value_in_result = None
            else :
                value_in_result = <double> value
            result[self.elements_names[i]] = value_in_result
            if all_NA and (value_in_result is not None) :
                all_NA = False
        if all_NA :
            result = None
        return result
    cpdef set_item(self, index_t line_nb, str element_name, object value):
        if value is None :
            value = OBIFloat_NA
        if obi_column_set_obifloat_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), <obifloat_t> value) < 0:
            raise Exception("Problem setting a value in a column")

@@ -1,59 +0,0 @@
../../../src/bloom.h
../../../src/bloom.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.h
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/obidebug.h
../../../src/obidms_taxonomy.h
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/obilittlebigman.h
../../../src/obilittlebigman.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c

@@ -1,14 +0,0 @@
#cython: language_level=3
from .capi.obitypes cimport index_t
from ._obidms cimport OBIDMS_column, OBIDMS_column_multi_elts
cdef class OBIDMS_column_int(OBIDMS_column):
    cpdef object get_line(self, index_t line_nb)
    cpdef set_line(self, index_t line_nb, object value)
cdef class OBIDMS_column_multi_elts_int(OBIDMS_column_multi_elts):
    cpdef object get_item(self, index_t line_nb, str element_name)
    cpdef object get_line(self, index_t line_nb)
    cpdef set_item(self, index_t line_nb, str element_name, object value)

@@ -1,78 +0,0 @@
#cython: language_level=3
from .capi.obiview cimport obi_column_get_obiint_with_elt_name_in_view, \
obi_column_get_obiint_with_elt_idx_in_view, \
obi_column_set_obiint_with_elt_name_in_view, \
obi_column_set_obiint_with_elt_idx_in_view
from .capi.obierrno cimport obi_errno
from .capi.obitypes cimport OBIInt_NA, obiint_t
from obitools3.utils cimport str2bytes
from cpython.int cimport PyInt_FromLong
cdef class OBIDMS_column_int(OBIDMS_column):
    cpdef object get_line(self, index_t line_nb):
        cdef obiint_t value
        cdef object result
        value = obi_column_get_obiint_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
        if obi_errno > 0 :
            raise IndexError(line_nb)
        if value == OBIInt_NA :
            result = None
        else :
            result = PyInt_FromLong(value)
        return result
    cpdef set_line(self, index_t line_nb, object value):
        if value is None :
            value = OBIInt_NA
        if obi_column_set_obiint_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, <obiint_t> value) < 0:
            raise Exception("Problem setting a value in a column")
cdef class OBIDMS_column_multi_elts_int(OBIDMS_column_multi_elts):
    cpdef object get_item(self, index_t line_nb, str element_name):
        cdef obiint_t value
        cdef object result
        value = obi_column_get_obiint_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
        if obi_errno > 0 :
            raise IndexError(line_nb, element_name)
        if value == OBIInt_NA :
            result = None
        else :
            result = PyInt_FromLong(value)
        return result
    cpdef object get_line(self, index_t line_nb) :
        cdef obiint_t value
        cdef object value_in_result
        cdef dict result
        cdef index_t i
        cdef bint all_NA
        result = {}
        all_NA = True
        for i in range(self.nb_elements_per_line) :
            value = obi_column_get_obiint_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
            if obi_errno > 0 :
                raise IndexError(line_nb)
            if value == OBIInt_NA :
                value_in_result = None
            else :
                value_in_result = PyInt_FromLong(value)
            result[self.elements_names[i]] = value_in_result
            if all_NA and (value_in_result is not None) :
                all_NA = False
        if all_NA :
            result = None # TODO discuss
        return result
    cpdef set_item(self, index_t line_nb, str element_name, object value):
        if value is None :
            value = OBIInt_NA
        if obi_column_set_obiint_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), <obiint_t> value) < 0:
            raise Exception("Problem setting a value in a column")

@@ -1,59 +0,0 @@
../../../src/bloom.h
../../../src/bloom.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.h
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/obidebug.h
../../../src/obidms_taxonomy.h
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/obilittlebigman.h
../../../src/obilittlebigman.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c

@@ -1,20 +0,0 @@
#cython: language_level=3
from .capi.obitypes cimport index_t
from ._obidms cimport OBIDMS_column , OBIDMS_column_multi_elts
cdef class OBIDMS_column_qual(OBIDMS_column):
    cpdef object get_line(self, index_t line_nb)
    cpdef object get_str_line(self, index_t line_nb)
    cpdef set_line(self, index_t line_nb, object value)
    cpdef set_str_line(self, index_t line_nb, object value)
cdef class OBIDMS_column_multi_elts_qual(OBIDMS_column_multi_elts):
    cpdef object get_item(self, index_t line_nb, str element_name)
    cpdef object get_str_item(self, index_t line_nb, str element_name)
    cpdef object get_line(self, index_t line_nb)
    cpdef object get_str_line(self, index_t line_nb)
    cpdef set_item(self, index_t line_nb, str element_name, object value)
    cpdef set_str_item(self, index_t line_nb, str element_name, object value)

@@ -1,184 +0,0 @@
#cython: language_level=3
from .capi.obiview cimport obi_column_get_obiqual_char_with_elt_name_in_view, \
obi_column_get_obiqual_char_with_elt_idx_in_view, \
obi_column_set_obiqual_char_with_elt_name_in_view, \
obi_column_set_obiqual_char_with_elt_idx_in_view, \
obi_column_get_obiqual_int_with_elt_name_in_view, \
obi_column_get_obiqual_int_with_elt_idx_in_view, \
obi_column_set_obiqual_int_with_elt_name_in_view, \
obi_column_set_obiqual_int_with_elt_idx_in_view
from .capi.obierrno cimport obi_errno
from .capi.obitypes cimport OBIQual_char_NA, OBIQual_int_NA, const_char_p
from ._obidms cimport OBIView
from obitools3.utils cimport str2bytes, bytes2str
from libc.stdlib cimport free
from libc.string cimport strcmp
from libc.stdint cimport uint8_t
from libc.stdlib cimport malloc
cdef class OBIDMS_column_qual(OBIDMS_column):
    cpdef object get_line(self, index_t line_nb):
        cdef const uint8_t* value
        cdef int value_length
        cdef object result
        cdef int i
        value = obi_column_get_obiqual_int_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, &value_length)
        if obi_errno > 0 :
            raise IndexError(line_nb)
        if value == OBIQual_int_NA :
            result = None
        else :
            result = []
            for i in range(value_length) :
                result.append(<int>value[i])
        return result
    cpdef object get_str_line(self, index_t line_nb):
        cdef char* value
        cdef object result
        cdef int i
        value = obi_column_get_obiqual_char_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
        if obi_errno > 0 :
            raise IndexError(line_nb)
        if value == OBIQual_char_NA :
            result = None
        else :
            result = bytes2str(value)
            free(value)
        return result
    cpdef set_line(self, index_t line_nb, object value):
        cdef uint8_t* value_b
        cdef int value_length
        if value is None :
            if obi_column_set_obiqual_int_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, OBIQual_int_NA, 0) < 0:
                raise Exception("Problem setting a value in a column")
        else :
            value_length = len(value)
            value_b = <uint8_t*> malloc(value_length * sizeof(uint8_t))
            for i in range(value_length) :
                value_b[i] = <uint8_t>value[i]
            if obi_column_set_obiqual_int_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, value_b, value_length) < 0:
                raise Exception("Problem setting a value in a column")
            free(value_b)
    cpdef set_str_line(self, index_t line_nb, object value):
        if value is None :
            if obi_column_set_obiqual_char_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, OBIQual_char_NA) < 0:
                raise Exception("Problem setting a value in a column")
        else :
            if obi_column_set_obiqual_char_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, str2bytes(value)) < 0:
                raise Exception("Problem setting a value in a column")
cdef class OBIDMS_column_multi_elts_qual(OBIDMS_column_multi_elts):
    cpdef object get_item(self, index_t line_nb, str element_name):
        cdef const uint8_t* value
        cdef int value_length
        cdef object result
        cdef int i
        value = obi_column_get_obiqual_int_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), &value_length)
        if obi_errno > 0 :
            raise IndexError(line_nb, element_name)
        if value == OBIQual_int_NA :
            result = None
        else :
            result = []
            for i in range(value_length) :
                result.append(<int>value[i])
        return result
    cpdef object get_str_item(self, index_t line_nb, str element_name):
        cdef char* value
        cdef object result
        value = obi_column_get_obiqual_char_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
        if obi_errno > 0 :
            raise IndexError(line_nb, element_name)
        if value == OBIQual_char_NA :
            result = None
        else :
            result = bytes2str(value)
            free(value)
        return result
    cpdef object get_line(self, index_t line_nb) :
        cdef const uint8_t* value
        cdef int value_length
        cdef object value_in_result
        cdef dict result
        cdef index_t i
        cdef int j
        cdef bint all_NA
        result = {}
        all_NA = True
        for i in range(self.nb_elements_per_line) :
            value = obi_column_get_obiqual_int_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i, &value_length)
            if obi_errno > 0 :
                raise IndexError(line_nb)
            if value == OBIQual_int_NA :
                value_in_result = None
            else :
                value_in_result = []
                for j in range(value_length) :
                    value_in_result.append(<int>value[j])
            result[self.elements_names[i]] = value_in_result
            if all_NA and (value_in_result is not None) :
                all_NA = False
        if all_NA :
            result = None
        return result
    cpdef object get_str_line(self, index_t line_nb) :
        cdef char* value
        cdef object value_in_result
        cdef dict result
        cdef index_t i
        cdef bint all_NA
        result = {}
        all_NA = True
        for i in range(self.nb_elements_per_line) :
            value = obi_column_get_obiqual_char_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
            if obi_errno > 0 :
                raise IndexError(line_nb)
            if value == OBIQual_char_NA :
                value_in_result = None
            else :
                value_in_result = bytes2str(value)
                free(value)
            result[self.elements_names[i]] = value_in_result
            if all_NA and (value_in_result is not None) :
                all_NA = False
        if all_NA :
            result = None
        return result
    cpdef set_item(self, index_t line_nb, str element_name, object value):
        cdef uint8_t* value_b
        cdef int value_length
        if value is None :
            if obi_column_set_obiqual_int_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), OBIQual_int_NA, 0) < 0:
                raise Exception("Problem setting a value in a column")
        else :
            value_length = len(value)
            value_b = <uint8_t*> malloc(value_length * sizeof(uint8_t))
            for i in range(value_length) :
                value_b[i] = <uint8_t>value[i]
            if obi_column_set_obiqual_int_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), value_b, value_length) < 0:
                raise Exception("Problem setting a value in a column")
            free(value_b)
    cpdef set_str_item(self, index_t line_nb, str element_name, object value):
        if value is None :
            if obi_column_set_obiqual_char_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), OBIQual_char_NA) < 0:
                raise Exception("Problem setting a value in a column")
        else :
            if obi_column_set_obiqual_char_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), str2bytes(value)) < 0:
                raise Exception("Problem setting a value in a column")

@@ -1,59 +0,0 @@
../../../src/bloom.h
../../../src/bloom.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.h
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/obidebug.h
../../../src/obidms_taxonomy.h
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/obilittlebigman.h
../../../src/obilittlebigman.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c

@@ -1,26 +0,0 @@
#cython: language_level=3
from .capi.obitypes cimport index_t
from ._obidms cimport OBIView, OBIDMS_column, OBIDMS_column_multi_elts
cdef class OBIDMS_column_seq(OBIDMS_column):
    cpdef object get_line(self, index_t line_nb)
    cpdef set_line(self, index_t line_nb, object value)
    # TO DISCUSS :
    # I am not sure that this method has to be declared here
    # Alignment must be declared outside of the sequence object
    cpdef align(self,
                OBIView score_view,
                OBIDMS_column score_column,
                double threshold = *,
                bint normalize = *,
                int reference = *,
                bint similarity_mode = *)
cdef class OBIDMS_column_multi_elts_seq(OBIDMS_column_multi_elts):
    cpdef object get_item(self, index_t line_nb, str element_name)
    cpdef object get_line(self, index_t line_nb)
    cpdef set_item(self, index_t line_nb, str element_name, object value)

@@ -1,127 +0,0 @@
#cython: language_level=3
from .capi.obiview cimport obi_column_get_obiseq_with_elt_name_in_view, \
obi_column_get_obiseq_with_elt_idx_in_view, \
obi_column_set_obiseq_with_elt_name_in_view, \
obi_column_set_obiseq_with_elt_idx_in_view
from .capi.obialign cimport obi_align_one_column
from .capi.obierrno cimport obi_errno
from .capi.obitypes cimport OBISeq_NA, const_char_p
from ._obidms cimport OBIView
from obitools3.utils cimport str2bytes, bytes2str
from libc.stdlib cimport free
cdef class OBIDMS_column_seq(OBIDMS_column):
    cpdef object get_line(self, index_t line_nb):
        cdef char* value
        cdef object result
        value = obi_column_get_obiseq_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
        if obi_errno > 0 :
            raise IndexError(line_nb)
        if value == OBISeq_NA :
            result = None
        else :
            try:
                result = <bytes> value
            finally:
                free(value)
        return result
    cpdef set_line(self, index_t line_nb, object value):
        cdef bytes value_b
        if value is None :
            value_b = OBISeq_NA
        elif isinstance(value, bytes) :
            value_b = value
        elif isinstance(value, str) :
            value_b = str2bytes(value)
        else:
            raise TypeError('Sequence value must be of type Bytes, Str or None')
        if obi_column_set_obiseq_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, value_b) < 0:
            raise Exception("Problem setting a value in a column")
        else :
            if obi_column_set_obiseq_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, str2bytes(value)) < 0:
                raise Exception("Problem setting a value in a column")
    # TODO choose alignment type (lcs or other) with supplementary argument
    cpdef align(self,
                OBIView score_view,
                OBIDMS_column score_column,
                double threshold = 0.0,
                bint normalize = True,
                int reference = 0, # TODO
                bint similarity_mode = True):
        if (obi_align_one_column(self.view.pointer, (self.pointer)[0], score_view.pointer, (score_column.pointer)[0], threshold, normalize, reference, similarity_mode) < 0) :
            raise Exception("An error occurred while aligning sequences")
cdef class OBIDMS_column_multi_elts_seq(OBIDMS_column_multi_elts):
    cpdef object get_item(self, index_t line_nb, str element_name):
        cdef char* value
        cdef object result
        value = obi_column_get_obiseq_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
        if obi_errno > 0 :
            raise IndexError(line_nb, element_name)
        if value == OBISeq_NA :
            result = None
        else :
            try:
                result = <bytes> value
            finally:
                free(value)
        return result
    cpdef object get_line(self, index_t line_nb) :
        cdef char* value
        cdef object value_in_result
        cdef dict result
        cdef index_t i
        cdef bint all_NA
        result = {}
        all_NA = True
        for i in range(self.nb_elements_per_line) :
            value = obi_column_get_obiseq_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
            if obi_errno > 0 :
                raise IndexError(line_nb)
            if value == OBISeq_NA :
                value_in_result = None
            else :
                try:
                    value_in_result = <bytes> value
                finally:
                    free(value)
            result[self.elements_names[i]] = value_in_result
            if all_NA and (value_in_result is not None) :
                all_NA = False
        if all_NA :
            result = None
        return result
    cpdef set_item(self, index_t line_nb, str element_name, object value):
        cdef bytes value_b
        if value is None :
            value_b = OBISeq_NA
        elif isinstance(value, bytes) :
            value_b = value
        elif isinstance(value, str) :
            value_b = str2bytes(value)
        else:
            raise TypeError('Sequence value must be of type Bytes, Str or None')
        if obi_column_set_obiseq_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), value_b) < 0:
            raise Exception("Problem setting a value in a column")
    # cpdef align(self, ): # TODO
    #     raise Exception("Columns with multiple sequences per line can't be aligned") # TODO discuss

@@ -1,59 +0,0 @@
../../../src/bloom.h
../../../src/bloom.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.h
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/obidebug.h
../../../src/obidms_taxonomy.h
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/obilittlebigman.h
../../../src/obilittlebigman.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c

@@ -1,14 +0,0 @@
#cython: language_level=3
from .capi.obitypes cimport index_t
from ._obidms cimport OBIDMS_column, OBIDMS_column_multi_elts
cdef class OBIDMS_column_str(OBIDMS_column):
    cpdef object get_line(self, index_t line_nb)
    cpdef set_line(self, index_t line_nb, object value)
cdef class OBIDMS_column_multi_elts_str(OBIDMS_column_multi_elts):
    cpdef object get_item(self, index_t line_nb, str element_name)
    cpdef object get_line(self, index_t line_nb)
    cpdef set_item(self, index_t line_nb, str element_name, object value)

@@ -1,84 +0,0 @@
#cython: language_level=3
from .capi.obiview cimport obi_column_get_obistr_with_elt_name_in_view, \
obi_column_get_obistr_with_elt_idx_in_view, \
obi_column_set_obistr_with_elt_name_in_view, \
obi_column_set_obistr_with_elt_idx_in_view
from .capi.obierrno cimport obi_errno
from .capi.obitypes cimport OBIStr_NA, const_char_p
from obitools3.utils cimport str2bytes, bytes2str
cdef class OBIDMS_column_str(OBIDMS_column):
    cpdef object get_line(self, index_t line_nb):
        cdef const_char_p value
        cdef object result
        value = obi_column_get_obistr_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0)
        if obi_errno > 0 :
            raise IndexError(line_nb)
        if value == OBIStr_NA :
            result = None
        else :
            result = bytes2str(value)
            # NOTE: value is not freed because the pointer points to a mmapped region in an AVL data file. (TODO discuss)
        return result
    cpdef set_line(self, index_t line_nb, object value):
        if value is None :
            if obi_column_set_obistr_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, OBIStr_NA) < 0:
                raise Exception("Problem setting a value in a column")
        else :
            if obi_column_set_obistr_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, 0, str2bytes(value)) < 0:
                raise Exception("Problem setting a value in a column")
cdef class OBIDMS_column_multi_elts_str(OBIDMS_column_multi_elts):
    cpdef object get_item(self, index_t line_nb, str element_name):
        cdef const_char_p value
        cdef object result
        value = obi_column_get_obistr_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name))
        if obi_errno > 0 :
            raise IndexError(line_nb, element_name)
        if value == OBIStr_NA :
            result = None
        else :
            result = bytes2str(value)
            # NOTE: value is not freed because the pointer points to a mmapped region in an AVL data file. (TODO discuss)
        return result
    cpdef object get_line(self, index_t line_nb) :
        cdef const_char_p value
        cdef object value_in_result
        cdef dict result
        cdef index_t i
        cdef bint all_NA
        result = {}
        all_NA = True
        for i in range(self.nb_elements_per_line) :
            value = obi_column_get_obistr_with_elt_idx_in_view(self.view.pointer, (self.pointer)[0], line_nb, i)
            if obi_errno > 0 :
                raise IndexError(line_nb)
            if value == OBIStr_NA :
                value_in_result = None
            else :
                value_in_result = bytes2str(value)
                # NOTE: value is not freed because the pointer points to a mmapped region in an AVL data file. (TODO discuss)
            result[self.elements_names[i]] = value_in_result
            if all_NA and (value_in_result is not None) :
                all_NA = False
        if all_NA :
            result = None
        return result
    cpdef set_item(self, index_t line_nb, str element_name, object value):
        cdef bytes value_b
        if value is None :
            value_b = OBIStr_NA
        else :
            value_b = str2bytes(value)
        if obi_column_set_obistr_with_elt_name_in_view(self.view.pointer, (self.pointer)[0], line_nb, str2bytes(element_name), value_b) < 0:
            raise Exception("Problem setting a value in a column")

@@ -1,59 +0,0 @@
../../../src/bloom.h
../../../src/bloom.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.h
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/obidebug.h
../../../src/obidms_taxonomy.h
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/obilittlebigman.h
../../../src/obilittlebigman.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c


@@ -1,38 +0,0 @@
#cython: language_level=3

from ._obidms cimport OBIView_line


cdef class OBI_Seq(dict) :
    cdef object id
    cdef object definition
    cdef object sequence

    cpdef set_id(self, object id)
    cpdef object get_id(self)
    cpdef set_definition(self, object definition)
    cpdef object get_definition(self)
    cpdef object get_sequence(self)


cdef class OBI_Nuc_Seq(OBI_Seq) :
    cdef object quality

    #cpdef object reverse_complement(self)
    cpdef set_sequence(self, object sequence)
    cpdef set_quality(self, object quality)
    cpdef object get_quality(self)


cdef class OBI_Nuc_Seq_Stored(OBIView_line) :
    cpdef set_id(self, object id)
    cpdef object get_id(self)
    cpdef set_definition(self, object definition)
    cpdef object get_definition(self)
    cpdef set_sequence(self, object sequence)
    cpdef object get_sequence(self)
    cpdef set_quality(self, object quality)
    cpdef object get_quality(self)
    cpdef object get_str_quality(self)
    # cpdef object reverse_complement(self)


@@ -1,97 +0,0 @@
#cython: language_level=3

from obitools3.utils cimport bytes2str, str2bytes

from .capi.obiview cimport NUC_SEQUENCE_COLUMN, \
                           ID_COLUMN, \
                           DEFINITION_COLUMN, \
                           QUALITY_COLUMN


cdef class OBI_Seq(dict) :

    def __init__(self, object id, object seq, object definition=None) :
        self.set_id(id)
        self.set_sequence(seq)
        if definition is not None :
            self.set_definition(definition)

    cpdef set_id(self, object id) :
        self.id = id
        self[bytes2str(ID_COLUMN)] = id

    cpdef get_id(self) :
        return self.id

    cpdef set_definition(self, object definition) :
        self.definition = definition
        self[bytes2str(DEFINITION_COLUMN)] = definition

    cpdef get_definition(self) :
        return self.definition

    cpdef get_sequence(self) :
        return self.sequence

    def __str__(self) :
        return self.sequence     # or not


cdef class OBI_Nuc_Seq(OBI_Seq) :

    cpdef set_sequence(self, object sequence) :
        self.sequence = sequence
        self[bytes2str(NUC_SEQUENCE_COLUMN)] = sequence

    cpdef set_quality(self, object quality) :
        self.quality = quality
        self[bytes2str(QUALITY_COLUMN)] = quality

    cpdef get_quality(self) :
        return self.quality

#    cpdef str reverse_complement(self) : TODO in C ?
#        pass


cdef class OBI_Nuc_Seq_Stored(OBIView_line) :

    # TODO store the str version of column name macros

    cpdef set_id(self, object id) :
        self[bytes2str(ID_COLUMN)] = id

    cpdef object get_id(self) :
        return self[bytes2str(ID_COLUMN)]

    cpdef set_definition(self, object definition) :
        self[bytes2str(DEFINITION_COLUMN)] = definition

    cpdef object get_definition(self) :
        return self[bytes2str(DEFINITION_COLUMN)]

    cpdef set_sequence(self, object sequence) :
        self[bytes2str(NUC_SEQUENCE_COLUMN)] = sequence

    cpdef object get_sequence(self) :
        return self[bytes2str(NUC_SEQUENCE_COLUMN)]

    cpdef set_quality(self, object quality) :
        if (type(quality) == list) or (quality is None) :
            self[bytes2str(QUALITY_COLUMN)] = quality
        else :   # Quality is in str form
            (((self.view).columns)[bytes2str(QUALITY_COLUMN)]).set_str_line(self.index, quality)

    cpdef object get_quality(self) :
        return self[bytes2str(QUALITY_COLUMN)]

    cpdef object get_str_quality(self) :
        return ((self.view).columns)[bytes2str(QUALITY_COLUMN)].get_str_line(self.index)

#    def __str__(self) :
#        return self[bytes2str(NUC_SEQUENCE_COLUMN)]     # or not

#    cpdef str reverse_complement(self) : TODO in C ?
#        pass

    # TODO static method to import?
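
The in-memory sequence classes above keep each field twice, once as an attribute and once as a dict entry keyed by the column name. A plain-Python sketch of that pattern follows. It is a hypothetical analogue, not the removed Cython code: the `Seq` class name is made up, and the column-name strings are assumptions standing in for the `ID_COLUMN`, `NUC_SEQUENCE_COLUMN` and `DEFINITION_COLUMN` macros.

```python
# Assumed string values for the C column-name macros (illustrative only).
ID_COLUMN = "ID"
NUC_SEQUENCE_COLUMN = "NUC_SEQ"
DEFINITION_COLUMN = "DEFINITION"


class Seq(dict):
    """Hypothetical plain-Python analogue of OBI_Nuc_Seq."""

    def __init__(self, id, sequence, definition=None):
        super().__init__()
        self.set_id(id)
        self.set_sequence(sequence)
        if definition is not None:
            self.set_definition(definition)

    def set_id(self, id):
        # Keep the attribute and the dict entry in sync.
        self.id = id
        self[ID_COLUMN] = id

    def set_sequence(self, sequence):
        self.sequence = sequence
        self[NUC_SEQUENCE_COLUMN] = sequence

    def set_definition(self, definition):
        self.definition = definition
        self[DEFINITION_COLUMN] = definition


seq = Seq("read1", "acgt", definition="example read")
print(seq.id, dict(seq))
```

One thing the sketch makes visible: as far as this diff shows, `OBI_Seq.__init__` calls `set_sequence`, which only the `OBI_Nuc_Seq` subclass defines, so the base class appears to be usable only through that subclass.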


@@ -1,59 +0,0 @@
../../../src/bloom.h
../../../src/bloom.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.h
../../../src/encode.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obi_align.h
../../../src/obi_align.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/obidebug.h
../../../src/obidms_taxonomy.h
../../../src/obidms_taxonomy.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_str.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/obilittlebigman.h
../../../src/obilittlebigman.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/uint8_indexer.h
../../../src/uint8_indexer.c
../../../src/utils.h
../../../src/utils.c


@@ -1,31 +0,0 @@
#cython: language_level=3

from .capi.obitaxonomy cimport ecotx_t, OBIDMS_taxonomy_p
from libc.stdint cimport int32_t


cdef class OBI_Taxonomy :
    cdef str name
    cdef OBIDMS_taxonomy_p pointer

    cpdef close(self)


cdef class OBI_Taxon :
    cdef ecotx_t* pointer
    cdef int32_t taxid
    cdef int32_t rank
    cdef int32_t farest
    cdef ecotx_t* parent
    cdef str name

    cpdef int32_t taxid(self)
    cpdef int32_t rank(self)
    cpdef int32_t farest(self)
    cpdef OBI_Taxon parent(self)


@@ -1,65 +0,0 @@
#cython: language_level=3

from obitools3.utils cimport bytes2str, str2bytes

from .capi.obitaxonomy cimport obi_read_taxonomy, \
                               obi_close_taxonomy, \
                               obi_taxo_get_taxon_with_taxid

from ._obidms cimport OBIDMS

from cpython.pycapsule cimport PyCapsule_New, PyCapsule_GetPointer


cdef class OBI_Taxonomy :

    def __init__(self, OBIDMS dms, str name) :
        self.name = name
        self.pointer = obi_read_taxonomy(dms.pointer, str2bytes(name), True)     # TODO discuss

    def __getitem__(self, object ref):
        cdef ecotx_t* taxon_p
        cdef object taxon_capsule
        if type(ref) == int :
            taxon_p = obi_taxo_get_taxon_with_taxid(self.pointer, ref)
            taxon_capsule = PyCapsule_New(taxon_p, NULL, NULL)
            return OBI_Taxon(taxon_capsule)

    cpdef close(self) :
        if (obi_close_taxonomy(self.pointer) < 0) :
            raise Exception("Error closing the taxonomy")


cdef class OBI_Taxon :     # dict subclass?

    def __init__(self, object taxon_capsule) :
        cdef ecotx_t* taxon
        taxon = <ecotx_t*> PyCapsule_GetPointer(taxon_capsule, NULL)
        self.pointer = taxon
        self.taxid = taxon.taxid
        self.rank = taxon.rank
        self.farest = taxon.farest
        self.parent = taxon.parent
        self.name = bytes2str(taxon.name)

    cpdef int32_t taxid(self):
        return self.taxid

    cpdef int32_t rank(self):
        return self.rank

    cpdef int32_t farest(self):
        return self.farest

    cpdef OBI_Taxon parent(self):
        cdef object parent_capsule
        parent_capsule = PyCapsule_New(self.parent, NULL, NULL)
        return OBI_Taxon(parent_capsule)


@@ -1,10 +0,0 @@
#cython: language_level=3

from ..capi.obiview cimport Obiview_p
from ..capi.obidmscolumn cimport OBIDMS_column_p


cdef extern from "obi_align.h" nogil:

    int obi_align_one_column(Obiview_p seq_view, OBIDMS_column_p seq_column, Obiview_p score_view, OBIDMS_column_p score_column, double threshold, bint normalize, int reference, bint similarity_mode)


@@ -1,12 +0,0 @@
#cython: language_level=3

from .obitypes cimport const_char_p


cdef extern from "obidms.h" nogil:

    struct OBIDMS_t:
        pass

    ctypedef OBIDMS_t* OBIDMS_p

    OBIDMS_p obi_dms(const_char_p dms_name)

    int obi_close_dms(OBIDMS_p dms)


@@ -1,241 +0,0 @@
#cython: language_level=3

from ..capi.obidms cimport OBIDMS_p
from ..capi.obitypes cimport const_char_p, \
                             OBIType_t, \
                             obiversion_t, \
                             obiint_t, \
                             obibool_t, \
                             obichar_t, \
                             obifloat_t, \
                             index_t, \
                             time_t

from libc.stdint cimport uint8_t


cdef extern from "obidmscolumn.h" nogil:

    struct OBIDMS_column_header_t:
        size_t header_size
        size_t data_size
        index_t line_count
        index_t lines_used
        index_t nb_elements_per_line
        const_char_p elements_names
        OBIType_t returned_data_type
        OBIType_t stored_data_type
        time_t creation_date
        obiversion_t version
        obiversion_t cloned_from
        const_char_p name
        const_char_p indexer_name
        const_char_p comments

    ctypedef OBIDMS_column_header_t* OBIDMS_column_header_p

    struct OBIDMS_column_t:
        OBIDMS_p dms
        OBIDMS_column_header_p header
        bint writable

    ctypedef OBIDMS_column_t* OBIDMS_column_p

    OBIDMS_column_p obi_create_column(OBIDMS_p dms,
                                      const_char_p column_name,
                                      OBIType_t type,
                                      index_t nb_lines,
                                      index_t nb_elements_per_line,
                                      const_char_p elements_names,
                                      const_char_p indexer_name,
                                      const_char_p comments)

    OBIDMS_column_p obi_open_column(OBIDMS_p dms,
                                    const_char_p column_name,
                                    obiversion_t version_number)

    int obi_close_column(OBIDMS_column_p column)

    OBIDMS_column_p obi_clone_column(OBIDMS_p dms,
                                     OBIDMS_column_p line_selection,
                                     const_char_p column_name,
                                     obiversion_t version_number,
                                     bint clone_data)

    int obi_close_column(OBIDMS_column_p column)

    obiversion_t obi_column_get_latest_version_from_name(OBIDMS_p dms,
                                                         const_char_p column_name)

    OBIDMS_column_header_p obi_column_get_header_from_name(OBIDMS_p dms,
                                                           const_char_p column_name,
                                                           obiversion_t version_number)

    int obi_close_header(OBIDMS_column_header_p header)

    int obi_select(OBIDMS_column_p line_selection_column, index_t line_to_grep)


cdef extern from "obidmscolumn_int.h" nogil:

    int obi_column_set_obiint_with_elt_name(OBIDMS_column_p column,
                                            index_t line_nb,
                                            const_char_p element_name,
                                            obiint_t value)

    int obi_column_set_obiint_with_elt_idx(OBIDMS_column_p column,
                                           index_t line_nb,
                                           index_t element_idx,
                                           obiint_t value)

    obiint_t obi_column_get_obiint_with_elt_name(OBIDMS_column_p column,
                                                 index_t line_nb,
                                                 const_char_p element_name)

    obiint_t obi_column_get_obiint_with_elt_idx(OBIDMS_column_p column,
                                                index_t line_nb,
                                                index_t element_idx)


cdef extern from "obidmscolumn_bool.h" nogil:

    int obi_column_set_obibool_with_elt_name(OBIDMS_column_p column,
                                             index_t line_nb,
                                             const_char_p element_name,
                                             obibool_t value)

    int obi_column_set_obibool_with_elt_idx(OBIDMS_column_p column,
                                            index_t line_nb,
                                            index_t element_idx,
                                            obibool_t value)

    obibool_t obi_column_get_obibool_with_elt_name(OBIDMS_column_p column,
                                                   index_t line_nb,
                                                   const_char_p element_name)

    obibool_t obi_column_get_obibool_with_elt_idx(OBIDMS_column_p column,
                                                  index_t line_nb,
                                                  index_t element_idx)


cdef extern from "obidmscolumn_char.h" nogil:

    int obi_column_set_obichar_with_elt_name(OBIDMS_column_p column,
                                             index_t line_nb,
                                             const_char_p element_name,
                                             obichar_t value)

    int obi_column_set_obichar_with_elt_idx(OBIDMS_column_p column,
                                            index_t line_nb,
                                            index_t element_idx,
                                            obichar_t value)

    obichar_t obi_column_get_obichar_with_elt_name(OBIDMS_column_p column,
                                                   index_t line_nb,
                                                   const_char_p element_name)

    obichar_t obi_column_get_obichar_with_elt_idx(OBIDMS_column_p column,
                                                  index_t line_nb,
                                                  index_t element_idx)


cdef extern from "obidmscolumn_float.h" nogil:

    int obi_column_set_obifloat_with_elt_name(OBIDMS_column_p column,
                                              index_t line_nb,
                                              const_char_p element_name,
                                              obifloat_t value)

    int obi_column_set_obifloat_with_elt_idx(OBIDMS_column_p column,
                                             index_t line_nb,
                                             index_t element_idx,
                                             obifloat_t value)

    obifloat_t obi_column_get_obifloat_with_elt_name(OBIDMS_column_p column,
                                                     index_t line_nb,
                                                     const_char_p element_name)

    obifloat_t obi_column_get_obifloat_with_elt_idx(OBIDMS_column_p column,
                                                    index_t line_nb,
                                                    index_t element_idx)


cdef extern from "obidmscolumn_str.h" nogil:

    int obi_column_set_obistr_with_elt_name(OBIDMS_column_p column,
                                            index_t line_nb,
                                            const_char_p element_name,
                                            const_char_p value)

    int obi_column_set_obistr_with_elt_idx(OBIDMS_column_p column,
                                           index_t line_nb,
                                           index_t element_idx,
                                           const_char_p value)

    const_char_p obi_column_get_obistr_with_elt_name(OBIDMS_column_p column,
                                                     index_t line_nb,
                                                     const_char_p element_name)

    const_char_p obi_column_get_obistr_with_elt_idx(OBIDMS_column_p column,
                                                    index_t line_nb,
                                                    index_t element_idx)


cdef extern from "obidmscolumn_seq.h" nogil:

    int obi_column_set_obiseq_with_elt_name(OBIDMS_column_p column,
                                            index_t line_nb,
                                            const_char_p element_name,
                                            const_char_p value)

    int obi_column_set_obiseq_with_elt_idx(OBIDMS_column_p column,
                                           index_t line_nb,
                                           index_t element_idx,
                                           const_char_p value)

    char* obi_column_get_obiseq_with_elt_name(OBIDMS_column_p column,
                                              index_t line_nb,
                                              const_char_p element_name)

    char* obi_column_get_obiseq_with_elt_idx(OBIDMS_column_p column,
                                             index_t line_nb,
                                             index_t element_idx)


cdef extern from "obidmscolumn_qual.h" nogil:

    int obi_column_set_obiqual_char_with_elt_name(OBIDMS_column_p column,
                                                  index_t line_nb,
                                                  const_char_p element_name,
                                                  const_char_p value)

    int obi_column_set_obiqual_char_with_elt_idx(OBIDMS_column_p column,
                                                 index_t line_nb,
                                                 index_t element_idx,
                                                 const_char_p value)

    int obi_column_set_obiqual_int_with_elt_name(OBIDMS_column_p column,
                                                 index_t line_nb,
                                                 const_char_p element_name,
                                                 const uint8_t* value,
                                                 int value_length)

    int obi_column_set_obiqual_int_with_elt_idx(OBIDMS_column_p column,
                                                index_t line_nb,
                                                index_t element_idx,
                                                const uint8_t* value,
                                                int value_length)

    char* obi_column_get_obiqual_char_with_elt_name(OBIDMS_column_p column,
                                                    index_t line_nb,
                                                    const_char_p element_name)

    char* obi_column_get_obiqual_char_with_elt_idx(OBIDMS_column_p column,
                                                   index_t line_nb,
                                                   index_t element_idx)

    const uint8_t* obi_column_get_obiqual_int_with_elt_name(OBIDMS_column_p column,
                                                            index_t line_nb,
                                                            const_char_p element_name,
                                                            int* value_length)

    const uint8_t* obi_column_get_obiqual_int_with_elt_idx(OBIDMS_column_p column,
                                                           index_t line_nb,
                                                           index_t element_idx,
                                                           int* value_length)


@@ -1,5 +0,0 @@
#cython: language_level=3

cdef extern from "obierrno.h" nogil:
    extern int obi_errno


@@ -1,51 +0,0 @@
import sys
import argparse

from obitools3.obidms._obidms import OBIDMS


if __name__ == '__main__':

    parser = argparse.ArgumentParser(description='Pseudo obigrep.')

    parser.add_argument('-V', '--view', dest='view', type=str,
                        help='Name of the view that should be considered')

    parser.add_argument('-N', '--new_view', dest='new_view', type=str,
                        help='Name of the new view that should be created')

#    parser.add_argument('-k', '--key', dest='key', type=str,
#                        help='Name of the key that should be considered')
#
#    parser.add_argument('-c', '--comp', dest='comparison', type=int,
#                        help='Comparison to be made: -1:< ; 0:== ; 1:>')
#
#    parser.add_argument('-v', '--value', dest='value', type=object,
#                        help='Value to be compared')

    args = parser.parse_args()

    d = OBIDMS('tdms')

    #condition = 1
    line_selec = []

    v = d.open_view(args.view)

    i = 0
    for l in v :
        if l['score'] > 350 :
            line_selec.append(i)
        i+=1

    new_v = d.new_view(args.new_view, view_to_clone=v, line_selection=line_selec, view_type="NUC_SEQS_VIEW", comments="obigrep "+args.view+" to "+args.new_view) #args.key+" "+str(args.comparison)+" "+str(args.value)+" "+)

    print("\n")
    print(new_v.__repr__())

    v.save_and_close()
    new_v.save_and_close()
    d.close()

    print("\nDone.")
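
The selection loop in this script reduces to collecting the indices of the lines that pass a test. A minimal sketch of that step follows, using plain dicts in place of view lines. `select_lines` and its `key` and `threshold` parameters are hypothetical names; the removed script hard-codes the `score > 350` condition.

```python
def select_lines(lines, key="score", threshold=350):
    """Return the indices of the lines whose `key` value exceeds `threshold`."""
    return [i for i, line in enumerate(lines) if line[key] > threshold]


# Plain dicts stand in for the lines of an OBIDMS view.
view = [{"score": 400}, {"score": 120}, {"score": 351}]
print(select_lines(view))
```

Using `enumerate` removes the hand-maintained counter, which is the part of the original loop most likely to drift if a `continue` is ever added.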


@@ -1,43 +0,0 @@
import sys
import argparse

from obitools3.obidms._obidms import OBIDMS


if __name__ == '__main__':

    parser = argparse.ArgumentParser(description='Pseudo obihead.')

    parser.add_argument('-V', '--view', dest='view', type=str,
                        help='Name of the view that should be considered')

    parser.add_argument('-N', '--new_view', dest='new_view', type=str,
                        help='Name of the new view that should be created')

    parser.add_argument('-n', '--nb', dest='nb_lines', type=int,
                        help='Number of lines that should be taken')

    args = parser.parse_args()

    d = OBIDMS('tdms')

    #condition = 1
    line_selec = []

    v = d.open_view(args.view)

    for i in range(0, args.nb_lines) :
        line_selec.append(i)

    new_v = d.new_view(args.new_view, view_to_clone=v, line_selection=line_selec, view_type="NUC_SEQS_VIEW", comments="obihead "+str(args.nb_lines)+", "+args.view+" to "+args.new_view) #args.key+" "+str(args.comparison)+" "+str(args.value)+" "+)

    print("\n")
    print(new_v.__repr__())

    v.save_and_close()
    new_v.save_and_close()
    d.close()

    print("\nDone.")


@@ -1,199 +0,0 @@
import sys
import argparse
import time

from obitools3.obidms._obidms import OBIDMS


def bufferedRead(fileobj,size=209715200): ## 200 MB
    buffer = fileobj.readlines(size)
    while buffer:
        for l in buffer:
            yield l
        buffer = fileobj.readlines(size)


if __name__ == '__main__':

    parser = argparse.ArgumentParser(description='Convert a fasta file in an OBIDMS.')

    parser.add_argument('-i', '--input', dest='input_file', type=str,
                        help='Name of the file containing the sequences')

    args = parser.parse_args()

    d = OBIDMS('tdms')

    view = d.new_view('uniq view', view_type="NUC_SEQS_VIEW")

    # for i in range(35000000) :
    # if (not (i%500000)) :
    # print(str(time.time())+'\t'+str(i))
    # id = "@HWI-D00405:142:C71BAANXX:4:1101:1234:2234_CONS_SUB_SUB_"+str(i)
    # view[i].set_id(id)
    # if id != view[i]["ID"] :
    # print("nope", id, view[i]["ID"])

    input_file = open(args.input_file, 'r')
    input_file_buffered = bufferedRead(input_file)
#
# if args.input_file[-1:] == "a" :
#
# i = 0
# next = False
# first = True
#
# for line in input_file :
#
# if line[0] == ">" :
#
# if not first :
# # save seq
# #print(i, id, seq)
# view[i].set_sequence(seq)
# i+=1
#
# first = False
#
# #id = line.split(" ", 1)[0][1:]
# #rest = (line[:-1].split(" ", 1)[1]).split(";")
# #view[i].set_id(id)
#
# # description = ""
# # for j in range(len(rest)) :
# # if "=" in rest[j] :
# # rest[j] = rest[j].strip()
# # rest[j] = rest[j].split("=", 1)
# # column_name = rest[j][0]
# # v = rest[j][1]
# # if ((not v.isalpha()) and (v.isalnum())) :
# # conv_v = int(v)
# # elif (v == "True") or (v == "False") :
# # conv_v = bool(v)
# # else :
# # f = True
# # for letter in v :
# # if ((not letter.isalnum()) or (letter != ".")) :
# # f = False
# # if f :
# # conv_v = float(v)
# # else :
# # conv_v = v
# # view[i][column_name] = conv_v
# # else :
# # description+=rest[j]
# #
# # if description != "" :
# # description = description.strip()
# # view[i].set_description(description)
#
# #print(id)
# #print(rest)
# #print(description)
#
# next = True
#
# elif next == True :
#
# # if not (i % 1E5) :
# # print(i)
#
# seq = line[:-1]
# next = False
#
# elif not next :
#
# seq += line[:-1]
#
#
# elif args.input_file[-1:] == "q" :
#
# i = 0
# l = 0
# next = False
#
    l=0
    i=0
    # while (True):
    # l+=1
    # line = input_file.readline()
    # if line=="":
    # break
    for line in input_file_buffered :
        #
        #if i > 1E7 :
        # # print('hmm?')
        #
        # if i == 6000000 :
        # break
        #
        if l%4 == 0 :
            #
            if (not (i%500000)) :
                print(str(time.time())+'\t'+str(i))
            # #
            # # #print("header", line)
            # #
            id = line.split(" ", 1)[0][1:]
            # print(id)
            # # #rest = (line[:-1].split(" ", 1)[1]).split(";")
            view[i].set_id(id)
            # print(view[i]["ID"])
            #
            # i+=1
            # l+=1
            #
            # # description = ""
            # # for j in range(len(rest)) :
            # # if "=" in rest[j] :
            # # rest[j] = rest[j].strip()
            # # rest[j] = rest[j].split("=", 1)
            # # column_name = rest[j][0]
            # # #print("COLUMN", column_name)
            # # v = rest[j][1]
            # # if (v == "") and (column_name in view) and (view[column_name].get_data_type() == "OBI_SEQ") :
            # # #print(">>>>>>YUP")
            # # conv_v = "aa"
            # # else :
            # # if ((not v.isalpha()) and (v.isalnum())) :
            # # conv_v = int(v)
            # # elif (v == "True") or (v == "False") :
            # # conv_v = bool(v)
            # # else :
            # # f = True
            # # for letter in v :
            # # if ((not letter.isalnum()) or (letter != ".")) :
            # # f = False
            # # if f :
            # # conv_v = float(v)
            # # else :
            # # conv_v = v
            # # view[i][column_name] = conv_v
            # # else :
            # # description+=rest[j]
            # #
            # # if description != "" :
            # # description = description.strip()
            # # view[i].set_description(description)
            #
        elif l%4 == 1 :
            # #
            seq = line[:-1]
            # #print("seq", seq)
            view[i].set_sequence(seq)
            i+=1
        #
        l+=1
    #
    #
    input_file.close()
    #print(view)
    print(view.__repr__())
    view.save_and_close()
    d.close()
    print("Done.")
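
The active part of this script reads the input in large `readlines` chunks and treats every group of four lines as one record, taking the identifier from the first line and the sequence from the second. A self-contained sketch of that reading loop follows. It assumes strict 4-line FASTQ records, uses `io.StringIO` in place of a real file, and the names `buffered_lines` and `fastq_records` are hypothetical. Unlike the removed script, it strips the trailing newline before splitting the header, so identifiers without a description do not keep the line break.

```python
import io


def buffered_lines(fileobj, size=1 << 20):
    """Yield lines from fileobj, reading roughly `size` bytes of lines at a time."""
    buffer = fileobj.readlines(size)
    while buffer:
        yield from buffer
        buffer = fileobj.readlines(size)


def fastq_records(fileobj):
    """Yield (id, sequence) pairs, assuming strict 4-line FASTQ records."""
    for n, line in enumerate(buffered_lines(fileobj)):
        if n % 4 == 0:
            # Header line: drop the leading '@' and anything after the first space.
            seq_id = line.rstrip("\n").split(" ", 1)[0][1:]
        elif n % 4 == 1:
            # Sequence line: emit the record; the '+' and quality lines are skipped.
            yield seq_id, line.rstrip("\n")


sample = io.StringIO("@read1 extra\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n")
print(list(fastq_records(sample)))
```

Multi-line FASTQ sequences would break the modulo-four assumption, which is also true of the removed script.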


@@ -6,8 +6,12 @@ Created on 30 mars 2016
@author: coissac
'''
#from obitools3.dms._obiseq cimport OBI_Seq
def fastaIterator(lineiterator, int buffersize=100000000):
def fastaIterator(lineiterator,
                  int buffersize=100000000
                  ):
    cdef LineBuffer lb
    cdef str ident
    cdef str definition
@@ -15,6 +19,7 @@ def fastaIterator(lineiterator, int buffersize=100000000):
    cdef list s
    cdef bytes sequence
    cdef bytes quality
    # cdef OBI_Seq seq
    if isinstance(lineiterator,(str,bytes)):
        lineiterator=uopen(lineiterator)
@@ -41,7 +46,60 @@ def fastaIterator(lineiterator, int buffersize=100000000):
        sequence = b"".join(s)
        quality = None
        # seq = OBI_Seq(id,
        # sequence,
        # definition,
        # tags=tags,
        # )
        yield { "id" : ident,
                "definition" : definition,
                "sequence" : sequence,
                "quality" : quality,
                "tags" : tags,
                "annotation" : {}
              }


def fastaNucIterator(lineiterator, int buffersize=100000000):
    cdef LineBuffer lb
    cdef str ident
    cdef str definition
    cdef dict tags
    cdef list s
    cdef bytes sequence
    cdef bytes quality
    # cdef OBI_Seq seq
    if isinstance(lineiterator,(str,bytes)):
        lineiterator=uopen(lineiterator)
    if isinstance(lineiterator, LineBuffer):
        lb=lineiterator
    else:
        lb=LineBuffer(lineiterator,buffersize)
    i = iter(lb)
    line = next(i)
    while True:
        ident,tags,definition = parseHeader(line)
        s = []
        line = next(i)
        try:
            while line[0]!='>':
                s.append(str2bytes(line)[0:-1])
                line = next(i)
        except StopIteration:
            pass
        sequence = b"".join(s)
        quality = None
        # seq =
        yield { "id" : ident,
                "definition" : definition,
                "sequence" : sequence,


@@ -1,311 +0,0 @@
import os
import sys
import shutil
import unittest
from random import randint, uniform, choice
import string

from obitools3.obidms._obidms import OBIDMS


LINE_COUNT_FOR_TEST_COLUMN = 10000 # TODO randomize?
SMALLER_LINE_COUNT_FOR_TEST_COLUMN = 1000 # TODO randomize?
NB_ELEMENTS_PER_LINE = 10 # TODO randomize?
DMS_NAME = "unit_test_dms"


def create_test_obidms():
    dms_name = DMS_NAME
    dms_dir_name = dms_name+'.obidms'
    dms = OBIDMS(dms_name)
    return (dms, dms_name, dms_dir_name)


def create_test_column(dms, data_type, multiple_elements_per_line=False):
    col_name = "unit_test_"+data_type
    if multiple_elements_per_line :
        elts_names = elements_names()
        col = dms.open_column(col_name,
                              create=True,
                              type=data_type,
                              nb_elements_per_line=NB_ELEMENTS_PER_LINE,
                              elements_names=elts_names)
        return (col, col_name, elts_names)
    else :
        col = dms.open_column(col_name,
                              create=True,
                              type=data_type)
        return (col, col_name)


def elements_names():
    names = [str(i) for i in range(NB_ELEMENTS_PER_LINE)]
    return names


def random_obivalue(data_type):
    r = 1000000
    if data_type == "OBI_INT" :
        return randint(-r,r)
    elif data_type == "OBI_FLOAT" :
        return uniform(-r,r)
    elif data_type == "OBI_BOOL" :
        return randint(0,1)
    elif data_type == "OBI_CHAR" :
        return choice(string.ascii_lowercase)
    elif data_type == "OBI_STR" :
        length = randint(1,200)
        randoms = ''.join(choice(string.ascii_lowercase) for i in range(length))
        return randoms
    elif data_type == "OBI_SEQ" :
        length = randint(1,200)
        randoms = ''.join(choice("atgcryswkmdbhvn") for i in range(length))
        return randoms


class OBIDMS_Column_TestCase(unittest.TestCase):

    def tearDown(self):
        self.col.close()
        self.dms.close()
        shutil.rmtree(self.dms_dir_name, ignore_errors=True)

    def test_OBIDMS_column_type(self):
        assert self.col.get_data_type() == self.data_type, 'Wrong data type associated with column'

    def test_OBIDMS_column_cloning(self):
        for i in range(LINE_COUNT_FOR_TEST_COLUMN):
            self.col[i]= random_obivalue(self.data_type)
        self.col.close()
        clone = self.dms.open_column(self.col_name, clone=True)
        self.col = self.dms.open_column(self.col_name)
        assert clone.get_nb_lines_used() == self.col.get_nb_lines_used(), "Cloned column doesn't have the same number of lines used"
        i=0
        for i in range(clone.get_nb_lines_used()) :
            assert clone[i] == self.col[i], "Different value in original column and cloned column"
            assert clone[i] is not None, "None value"
        clone.close()

    def test_OBIDMS_column_set_and_get(self):
        for i in range(LINE_COUNT_FOR_TEST_COLUMN):
            v = random_obivalue(self.data_type)
            self.col[i] = v
            assert self.col[i] == v, "Different value than the set value"
            assert self.col[i] is not None, "None value"

    def test_OBIDMS_referring_column(self):
        for i in range(LINE_COUNT_FOR_TEST_COLUMN):
            self.col[i] = random_obivalue(self.data_type)
        ref_col = self.dms.open_column(self.col_name, referring=True)
        j = 0
        for i in range(LINE_COUNT_FOR_TEST_COLUMN):
            if i%2 : # TODO randomize
                ref_col.grep_line(i)
                assert ref_col[j] == self.col[i], "Different value in original column and returned by referring column"
                assert ref_col[j] is not None, "None value"
                j+=1


class OBIDMS_Column_multiple_elements_TestCase(OBIDMS_Column_TestCase):

    def test_OBIDMS_column_cloning(self):
        pass
        for i in range(SMALLER_LINE_COUNT_FOR_TEST_COLUMN):
            v = {}
            for e in self.elts_names :
                v[e] = random_obivalue(self.data_type)
            self.col[i] = v
        self.col.close()
        clone = self.dms.open_column(self.col_name, clone=True)
        self.col = self.dms.open_column(self.col_name)
        assert clone.get_nb_lines_used() == self.col.get_nb_lines_used(), "Cloned column doesn't have the same number of lines used"
        i=0
        for i in range(SMALLER_LINE_COUNT_FOR_TEST_COLUMN):
            assert self.col[i] == clone[i], "Different value in original column and cloned column"
            assert self.col[i] is not None, "None value"
        clone.close()

    def test_OBIDMS_column_set_and_get_with_elements_names(self):
        for i in range(SMALLER_LINE_COUNT_FOR_TEST_COLUMN):
            for e in range(NB_ELEMENTS_PER_LINE) :
                v = random_obivalue(self.data_type)
                self.col.set_item(i, self.elts_names[e], v)
                assert self.col.get_item(i, self.elts_names[e]) == v, "Different value than the set value"
                assert self.col.get_item(i, self.elts_names[e]) is not None, "None value"

    def test_OBIDMS_column_set_and_get(self):
        for i in range(SMALLER_LINE_COUNT_FOR_TEST_COLUMN):
            v = {}
            for e in self.elts_names :
                v[e] = random_obivalue(self.data_type)
            self.col[i] = v
            assert self.col[i] == v, "Different value than the set value"
            assert self.col[i] is not None, "None value"

    def test_OBIDMS_referring_column(self):
        for i in range(SMALLER_LINE_COUNT_FOR_TEST_COLUMN):
            v = {}
            for e in self.elts_names :
                v[e] = random_obivalue(self.data_type)
            self.col[i] = v
        ref_col = self.dms.open_column(self.col_name, referring=True)
        j = 0
        for i in range(SMALLER_LINE_COUNT_FOR_TEST_COLUMN):
            if i%2 : # TODO randomize
                ref_col.grep_line(i)
                assert ref_col[j] == self.col[i], "Different value in original column and returned by referring column"
                assert ref_col[j] is not None, "None value"
                j+=1
        ref_col.close()


class OBIDMS_Column_OBI_INT_TestCase(OBIDMS_Column_TestCase):
    def setUp(self):
        self.data_type = 'OBI_INT'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name = create_test_column(self.dms,
                                           self.data_type)


class OBIDMS_Column_OBI_INT_multiple_elements_TestCase(OBIDMS_Column_multiple_elements_TestCase):
    def setUp(self):
        self.data_type = 'OBI_INT'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name, \
        self.elts_names = create_test_column(self.dms,
                                             self.data_type,
                                             multiple_elements_per_line=True)


class OBIDMS_Column_OBI_FLOAT_TestCase(OBIDMS_Column_TestCase):
    def setUp(self):
        self.data_type = 'OBI_FLOAT'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name = create_test_column(self.dms,
                                           self.data_type)


class OBIDMS_Column_OBI_FLOAT_multiple_elements_TestCase(OBIDMS_Column_multiple_elements_TestCase):
    def setUp(self):
        self.data_type = 'OBI_FLOAT'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name, \
        self.elts_names = create_test_column(self.dms,
                                             self.data_type,
                                             multiple_elements_per_line=True)


class OBIDMS_Column_OBI_BOOL_TestCase(OBIDMS_Column_TestCase):
    def setUp(self):
        self.data_type = 'OBI_BOOL'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name = create_test_column(self.dms,
                                           self.data_type)


class OBIDMS_Column_OBI_BOOL_multiple_elements_TestCase(OBIDMS_Column_multiple_elements_TestCase):
    def setUp(self):
        self.data_type = 'OBI_BOOL'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name, \
        self.elts_names = create_test_column(self.dms,
                                             self.data_type,
                                             multiple_elements_per_line=True)


class OBIDMS_Column_OBI_CHAR_TestCase(OBIDMS_Column_TestCase):
    def setUp(self):
        self.data_type = 'OBI_CHAR'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name = create_test_column(self.dms,
                                           self.data_type)


class OBIDMS_Column_OBI_CHAR_multiple_elements_TestCase(OBIDMS_Column_multiple_elements_TestCase):
    def setUp(self):
        self.data_type = 'OBI_CHAR'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name, \
        self.elts_names = create_test_column(self.dms,
                                             self.data_type,
                                             multiple_elements_per_line=True)


class OBIDMS_Column_OBI_STR_TestCase(OBIDMS_Column_TestCase):
    def setUp(self):
        self.data_type = 'OBI_STR'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name = create_test_column(self.dms,
                                           self.data_type)


class OBIDMS_Column_OBI_STR_multiple_elements_TestCase(OBIDMS_Column_multiple_elements_TestCase):
    def setUp(self):
        self.data_type = 'OBI_STR'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name, \
        self.elts_names = create_test_column(self.dms,
                                             self.data_type,
                                             multiple_elements_per_line=True)


class OBIDMS_Column_OBI_SEQ_TestCase(OBIDMS_Column_TestCase):
    def setUp(self):
        self.data_type = 'OBI_SEQ'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name = create_test_column(self.dms,
                                           self.data_type)


class OBIDMS_Column_OBI_SEQ_multiple_elements_TestCase(OBIDMS_Column_multiple_elements_TestCase):
    def setUp(self):
        self.data_type = 'OBI_SEQ'
        self.dms, \
        self.dms_name, \
        self.dms_dir_name = create_test_obidms()
        self.col, \
        self.col_name, \
        self.elts_names = create_test_column(self.dms,
                                             self.data_type,
                                             multiple_elements_per_line=True)


if __name__ == '__main__':
    unittest.main(verbosity=2, defaultTest=["OBIDMS_Column_OBI_INT_TestCase",
                                            "OBIDMS_Column_OBI_INT_multiple_elements_TestCase",
                                            "OBIDMS_Column_OBI_FLOAT_TestCase",
                                            "OBIDMS_Column_OBI_FLOAT_multiple_elements_TestCase",
                                            "OBIDMS_Column_OBI_BOOL_TestCase",
                                            "OBIDMS_Column_OBI_BOOL_multiple_elements_TestCase",
                                            "OBIDMS_Column_OBI_CHAR_TestCase",
                                            "OBIDMS_Column_OBI_CHAR_multiple_elements_TestCase",
                                            "OBIDMS_Column_OBI_STR_TestCase",
                                            "OBIDMS_Column_OBI_STR_multiple_elements_TestCase",
                                            "OBIDMS_Column_OBI_SEQ_TestCase",
                                            "OBIDMS_Column_OBI_SEQ_multiple_elements_TestCase"])

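The removed test module shares its test methods through base classes that have no `setUp` of their own, and names every concrete case in `defaultTest` so the bases are never run directly. One alternative way to get the same effect, shown here only as a sketch and not as what the removed file does, is to keep the shared tests in a mixin that is not a `TestCase`. All class names below are hypothetical, and a plain dict stands in for an OBIDMS column.

```python
import string
import unittest
from random import choice, randint


class ColumnTestMixin:
    """Shared tests. Not a TestCase itself, so unittest never collects it alone."""

    def test_set_and_get(self):
        for i in range(100):
            v = self.random_value()
            self.col[i] = v
            self.assertEqual(self.col[i], v)
            self.assertIsNotNone(self.col[i])


class IntColumnTest(ColumnTestMixin, unittest.TestCase):
    def setUp(self):
        self.col = {}  # a dict stands in for an OBIDMS column

    def random_value(self):
        return randint(-1000000, 1000000)


class StrColumnTest(ColumnTestMixin, unittest.TestCase):
    def setUp(self):
        self.col = {}

    def random_value(self):
        length = randint(1, 200)
        return ''.join(choice(string.ascii_lowercase) for _ in range(length))


# Run both concrete cases explicitly, without calling unittest.main().
loader = unittest.defaultTestLoader
suite = unittest.TestSuite()
for case in (IntColumnTest, StrColumnTest):
    suite.addTests(loader.loadTestsFromTestCase(case))
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

With this layout test discovery picks up only the concrete classes, so the explicit `defaultTest` list is no longer needed to keep the base out of the run.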
Some files were not shown because too many files have changed in this diff.