Compare commits

...

365 Commits

Author SHA1 Message Date
3db93ee9c4 Fixed stdout output 2024-01-12 16:13:30 +13:00
4844b20770 Merge branch 'master' of https://git.metabarcoding.org/obitools/obitools3 2024-01-12 15:36:58 +13:00
0d98a4f717 Switch to version 3.0.1b26 2024-01-10 16:40:08 +13:00
837ff1a1ba Taxonomy: fixed an issue related to StopIteration behaviour in new
versions of python
2024-01-10 15:53:15 +13:00
aeed42456a export: columns are now in alphabetical order when exporting to tab
format
2024-01-10 15:52:28 +13:00
fb6e27bb5d Revert "Testing RAM instead of mmap for blob alignment"
This reverts commit 6d94cdcc0d
2023-11-29 04:22:31 +01:00
6d94cdcc0d Testing RAM instead of mmap for blob alignment 2023-11-29 16:19:34 +13:00
8a1f844645 obi import: fixed bug caused by new behaviour of StopIteration
exceptions in Python>=3.7
2023-09-21 17:47:40 +12:00
791ccfb92e Fixed include bug in previous version and switch to version 3.0.1b24 2023-05-15 11:35:42 +12:00
1c9a906f5b ngsfilter and ecopcr: now check for primers too long for apat library to
handle (31bp max) and switch to version 3.0.1b23
2023-05-12 17:04:21 +12:00
55b2679b23 New command obi rm and switch to version 3.0.1b22 2023-05-08 17:48:50 +12:00
9ea2124adc Switch to version 3.0.1b21 2023-02-13 11:00:01 +13:00
2130a949c7 New command: obi taxonomy to add local taxa (closes #64) 2023-02-13 10:59:20 +13:00
eeb93afa7d import: now automatically renames scientific_name tag to
`SCIENTIFIC_NAME`, and suggests using `--input-na-string` when a
sequence import fails
2023-02-13 10:40:38 +13:00
755ce179ad head: added output format options 2023-02-13 10:31:26 +13:00
7e492578b3 Switch to version 3.0.1b20 2022-09-21 11:33:03 +12:00
02e9df3ad1 alignpairedend and ngsfilter: ids of original sequences are now kept 2022-09-21 11:32:19 +12:00
55ada80500 import: made ngsfilter file parsing more resilient and switching to
version 3.0.1b19
2022-07-15 16:02:21 +12:00
ef9d9674b0 obi import: added SINTAX format import and switch to version 3.0.1b18 2022-05-17 09:36:33 +12:00
4f39bb2418 switch to version 3.0.1b17 2022-05-03 10:55:36 +12:00
0a2b8adb50 import: added import of UNITE fasta format 2022-05-03 10:54:41 +12:00
f9b99a9397 annotate: fixed a bug where a column type could be wrongly guessed and
switch to version 3.0.1b16
2022-03-30 16:32:07 +13:00
ce2833c04b switch to version v3.0.1b15 2022-02-25 17:48:44 +13:00
f64b3da30b split command 2022-02-25 17:44:18 +13:00
388b3e0410 removed a trace 2021-11-11 15:53:27 +13:00
c9db990b83 switch to version 3.0.1b14 2021-11-11 15:28:00 +13:00
3f253feb5e Cython: View: fixed keys method to get list of view keys 2021-11-11 15:27:32 +13:00
85d2bab607 small fix 2021-11-11 15:26:48 +13:00
53b3d81137 small fixes and improvements 2021-11-11 15:26:09 +13:00
f6353fbf28 obi export: added options to export to metabaR compatible format 2021-11-11 15:24:12 +13:00
5a8b9dca5d goes with previous commit 2021-11-11 15:12:04 +13:00
8bd6d6c8e9 Python: URI decoding: now properly checking that paths can be encoded in
ASCII (issue #89)
2021-11-02 11:17:59 +13:00
405e6ef420 Python: URI decoding: added metabaR output 2021-11-02 11:16:29 +13:00
fedacfafe7 switch to version 3.0.1b13 2021-09-13 11:46:17 +12:00
2d66e0e965 python: genbank parser: better handling of white spaces 2021-09-13 11:44:38 +12:00
f43856b712 switch to version 3.0.1b12 2021-09-08 10:56:55 +12:00
9e0c319806 Cython: fixed rewriting of column when rewriting a 1 element dict column 2021-09-08 10:54:23 +12:00
58b42cd977 C: views: now correctly parses view names containing '.' when cleaning
unfinished views. Closes #115
2021-09-08 10:52:42 +12:00
34de90bce6 ngsfilter: checks better if there is an associated sequencing quality 2021-09-08 10:30:11 +12:00
4be9f36f99 stats: fixed the computation of variance when it is equal to 0 2021-08-05 11:32:16 +12:00
f10e78ba3c C: fixed the printing of view informations from a DMS (fixes #114) 2021-08-05 11:31:24 +12:00
88c8463ed7 Cython: taxonomy: improved logging 2021-08-05 11:29:20 +12:00
89168271ef ecopcr: now accepting taxonomy from a different DMS than the reference
sequences
2021-08-05 11:28:57 +12:00
82d2642000 Switch to version 3.0.1b11 2021-07-22 09:25:39 +12:00
99c1cd60d6 export: now exports header for tabular files by default and added option
to only export specific columns
2021-07-22 09:23:18 +12:00
ce7ae4ac55 export: fixed 'only' option printing one too many if printing header 2021-07-21 15:23:04 +12:00
0b4283bb58 cat: improved error handling 2021-07-21 15:22:08 +12:00
747f3efbb2 Improved taxonomy reading information display 2021-07-21 15:20:44 +12:00
6c1a3aff47 Fixed the handling of sample names that are numbers (forcing conversion) 2021-07-21 15:19:24 +12:00
e2932b05f2 Implements #108 export integer missing values as 0 for tables by default 2021-07-21 14:41:54 +12:00
32345b9ec4 Addresses #111 2021-07-19 15:55:25 +12:00
9334cf6cc6 import: improved genbank parser and switch to version 3.0.1.b10 2021-06-17 08:42:01 +12:00
8ec13a294c Switch to version 3.0.1b9 2021-06-01 09:21:43 +12:00
3e45c34491 import: now imports and adds taxids for SILVA and RDP files, added
import of lists, fixed skipping of errors (was not overwriting), and
fixed --no-progress-bar option
2021-06-01 09:21:07 +12:00
c2f3d90dc1 build_ref_db: set default threshold to 0.99 2021-06-01 09:11:17 +12:00
6b732d11d3 align: fixed column URI parsing 2021-06-01 09:10:21 +12:00
9eb833a0af typo fix 2021-06-01 09:09:16 +12:00
6b7b0e3bd1 cat: fixed the handling of dictionary columns 2021-06-01 09:06:13 +12:00
47691a8f58 count: added option to specify the count column 2021-06-01 09:05:14 +12:00
b908b581c8 clean: hid not implemented option 2021-06-01 09:04:22 +12:00
03c174fd7a grep: added taxonomy check 2021-05-31 17:03:39 +12:00
2156588ff6 added TODO comment 2021-05-31 17:01:57 +12:00
6ff29c6a6a Increased maximum line count to 10E12 2021-05-31 17:00:55 +12:00
51a3c68fb5 C: build_reference_db: fixed gcc warning/error 2021-05-31 16:59:17 +12:00
da91ffc2c7 URI decoding: fixed reading a taxonomy before any view 2021-05-31 16:57:20 +12:00
c884615522 obi stats: various fixes and improvements 2021-05-31 16:51:06 +12:00
cb53381863 ecotag: BEST_MATCH_TAXIDS now dereplicated (no repeated taxids in the
list) and switch to version 3.0.1b8
2021-05-10 16:02:06 +12:00
72b3e5d872 switch to version 3.0.1b7 2021-04-07 10:31:54 +12:00
238e9f70f3 alignpairedend: fixed bug that would cut out sequence ends when it
should not have
2021-04-07 10:31:12 +12:00
e099a16624 small fixes 2021-04-07 10:28:02 +12:00
847c9c816d import: fixed count estimation for tabular files with header 2021-03-30 09:07:14 +13:00
6026129ca8 Fixes 101 2021-03-30 09:06:08 +13:00
169b6514b4 small doc fixes 2021-03-29 13:07:48 +13:00
89b0c48141 switch to version 3.0.1b6 2021-03-29 11:18:44 +13:00
7c02782e3c import/export: workaround for issue where flake8(?) reads '\t' as
'\'+'t' when parsing an option value
2021-03-29 11:18:19 +13:00
ecc4c2c78b stats: improved the tabular display 2021-03-29 09:03:32 +13:00
f5413381fd C: taxonomy: fixed a bug where some taxa would not be stored in the
merged index
2021-03-29 09:02:18 +13:00
3e93cfff7b import: Columns are now rewritten in OBI_FLOAT if a value is > INT32_MAX 2021-03-29 09:00:52 +13:00
6d445fe3ad switch to version 3.0.1b5 2021-03-22 09:41:01 +13:00
824deb7e21 new command: obi rm: deletes any view (for now the user deleting a view
accepts that there will be missing information when running obi history
if other views came from the deleted view)
2021-03-18 09:17:06 +13:00
d579bb2749 switch to version 3.0.1b4 2021-03-16 17:40:58 +13:00
10e5ebdbc0 ngsfilter: fixed critical bug where barcodes shorter than the forward
primer would be missed
2021-03-16 15:09:28 +13:00
8833110490 import: fixed the import of tabular files with no header 2021-03-16 09:15:48 +13:00
bd38449f2d switch to version 3.0.1b3 2021-03-15 16:50:17 +13:00
904823c827 uniq: now OK to use -m option even if only one unique key in information
to merge (e.g. one sample)
2021-03-15 16:48:22 +13:00
af68a1024c Switch to version 3.0.1b2 2021-03-15 16:26:43 +13:00
425fe25bd2 Made the OBITools3 more 'empty file friendly' 2021-03-15 16:25:41 +13:00
d48aed38d4 switch to version 3.0.1b1 2021-03-11 17:11:23 +13:00
5e32f8523e Merge branch 'wsl_version' 2021-03-11 16:47:59 +13:00
8f1d94fd24 obi test: fixed bug introduced in ad1fd3c3 2021-03-11 16:31:31 +13:00
38f42cb0fb C: Made maximum file path length 2048 instead of 1024 2021-03-11 15:23:22 +13:00
7f0f63cf26 C: now completely unmapping files before truncating them to a smaller
size (#68)
2021-03-11 15:12:40 +13:00
cba78111c9 obi test: fixed bug introduced in previous version 2021-03-11 11:36:52 +13:00
41fbae7b6c Switch to version 3.0.0b43 2021-03-10 16:52:03 +13:00
ad1fd3c341 Now handling dictionaries with one key 2021-03-10 16:50:30 +13:00
fbf0f7dfb6 import: improved genbank parser and switch to version 3.0.0b42 2021-02-17 15:26:35 +13:00
fda0edd0d8 Switch to version 3.0.0b41 2021-02-10 17:29:08 +13:00
382e37a6ae Fixes #88 2021-02-10 17:28:49 +13:00
5cc3e29f75 obi test: made less heavy by default 2021-02-10 17:28:15 +13:00
a8e2aee281 Switch to version 3.0.0b40 2021-02-06 14:45:07 +13:00
13adb479d3 Adds an extern qualifier to the keep_running declaration. 2021-02-05 15:59:43 +01:00
8ba7acdfe1 export: fixed a bug where exporting to tab format with a header would
not export the first line of data and switch to version 3.0.0b39
2021-01-13 16:09:04 +01:00
38051b1e4f Removed spurious commentaries 2021-01-13 16:07:42 +01:00
52a2e21b38 grep: fixed --id-list option
and switch to version 3.0.0b38
2020-11-06 16:36:37 +01:00
d27a5b9115 Switch to version 3.0.0b37 2020-10-30 10:47:13 +01:00
20bd3350b4 New command: obi addtaxids to add NCBI taxids to sequences from their
taxon name.
2020-10-30 10:46:55 +01:00
2e191372d7 Now handling sequences with Uracil (U) nucleotides by converting to
Thymine (T)
2020-10-30 10:46:17 +01:00
112e12cab0 Taxonomy: new functions to find taxa by name 2020-10-30 10:45:20 +01:00
b9b4cec5b5 import: now can import SILVA fasta files 2020-10-30 10:43:04 +01:00
199f3772e8 Small fixes (potential compilation problems) 2020-10-30 10:41:58 +01:00
422a6450fa ecotag: clarified similarity circle documentation 2020-09-29 17:57:29 +02:00
137c109f86 obi ls: now done in C (preparing things for R packages to read DMS) and
switch to version 3.0.0b36
2020-09-29 17:51:39 +02:00
b6648ae81e Revert "Fixed version numbering mistake (should be b34 not b35)"
This reverts commit f6dffbecfe
2020-09-25 16:25:39 +02:00
f6dffbecfe Fixed version numbering mistake (should be b34 not b35) 2020-09-25 16:24:23 +02:00
c4696ac865 ecotag: added separate threshold for minimum circle identity (and switch
to version 3.0.0b35
2020-09-25 16:22:09 +02:00
11a0945a9b obi cat: fixed open file descriptor leak and switch to version 3.0.0b34 2020-08-28 10:41:22 +02:00
f23c40c905 obi cat: fixed a bug introduced in 3.0.0b28 and switch to version
3.0.0b33
2020-08-27 18:38:16 +02:00
f99fc13b75 switch to version 3.0.0b32 2020-08-13 18:17:09 +02:00
1da6aac1b8 C: patch for failed creation of AVL with errno EEXIST 2020-08-12 17:55:08 +02:00
159803b40a export: now automatically sorts dictionary keys alphabetically for
tab/csv output
2020-07-31 16:43:35 +02:00
7dcbc34017 import: fixed entry count estimation when importing fastq files 2020-07-30 16:56:36 +02:00
db2202c8b4 uniq: added a check to make sure that there is more than one element for
one tag when merging its information
2020-07-30 16:14:37 +02:00
d33ff97846 switch to version 3.0.0b31 2020-07-28 09:31:19 +02:00
1dcdf69f1f export: fixed a bug introduced in version 3.0.0b28 2020-07-28 09:31:05 +02:00
dec114eed6 Python: added "date created" information in view representation 2020-07-27 17:38:45 +02:00
f36691053b Python: added the OBITools3 version that generated the view in view
comments
2020-07-27 16:50:00 +02:00
f2aa5fcf8b alignpairedend: fixed division by 0 bug and switch to version 3.0.0b30 2020-07-27 10:15:59 +02:00
bccb3e6874 switch to version 3.0.0b29 2020-07-26 17:40:26 +02:00
f5a17bea68 C: added a missing error check 2020-07-26 17:39:55 +02:00
e28507639a C and Cython: fixed and improved the associated columns system 2020-07-26 17:39:29 +02:00
e6feac93fe obi test: made less heavy to be faster 2020-07-26 17:37:21 +02:00
50b292b489 obi import: added --space-priority option to import a view line by line 2020-07-26 17:36:52 +02:00
24a737aa55 switch to version 3.0.0b28 2020-07-24 16:10:10 +02:00
8aa455ad8a Python: made all commands handle output to buffer object (e.g. stdout) 2020-07-24 16:09:48 +02:00
46ca693ca9 Cython: View: new method to print a view to a buffer (e.g. stdout) 2020-07-24 16:03:23 +02:00
9a9afde113 Cython: progress bar: set default refresh rate to 5 seconds 2020-07-24 15:29:12 +02:00
8dd403a118 grep: now prints the number of entries grepped 2020-07-13 17:08:13 +02:00
9672f01c6a alignpairedend: improved/fixed the output tags for the alignment score
and lengths. Removed minimum score option
2020-07-13 15:59:50 +02:00
ed9549acfb ngsfilter: unidentified sequences are now stored untrimmed 2020-07-13 15:56:40 +02:00
9ace9989c4 Switch to version 3.0.0b27 2020-07-07 16:47:21 +02:00
a3ebe5f118 C: AVL trees: fixed a bug where storing the difference between 2 crc64
values in an int64 would mess trees up resulting in failed data
dereplication
2020-07-07 16:47:00 +02:00
9100e14899 obi uniq: quick fix for bug where some sequences are not correctly
dereplicated
2020-07-03 17:36:57 +02:00
ccda0661ce small help documentation improvement 2020-07-01 18:20:38 +02:00
aab59f2214 obi clean: fixed a memory bug, fixed the behaviour when no sample info,
and added checks warnings and error handling when sample info not
dereplicated
2020-07-01 18:17:47 +02:00
ade1107b42 switch to version 3.0.0b26 2020-06-17 18:56:07 +02:00
9c7d24406f export: dictionaries are now formatted like in the original OBITools
when exporting in tabular format and tuple formatting is cleaner
2020-06-17 18:55:46 +02:00
03bc9915f2 Cython: utils: added handling of tuples to bytes2str_object function 2020-06-17 18:54:14 +02:00
24b1dab573 Cython: Columns: added a keys() method that returns all element names 2020-06-17 18:53:41 +02:00
7593673f3f ngsfilter: now setting 'reversed' tag to False instead of None when
false
2020-06-17 18:52:35 +02:00
aa01236cae switch to version 3.0.0b25 2020-06-13 21:48:49 +02:00
49b8810a76 C: made indexer opening/closing cleaner 2020-06-13 21:47:03 +02:00
7a39df54c0 ls: fixed an issue where big DMS couldn't be read by ls 2020-06-13 21:45:22 +02:00
09e483b0d6 switch to temporary version 3.0.0b24a 2020-06-10 17:47:56 +02:00
14a2579173 uniq: now outputs an empty view if input view is empty instead of
displaying an error
2020-06-10 17:47:26 +02:00
36a8aaa92e grep: now creating empty views instead of displaying an error when
selecting on an unexisting column/tag
2020-06-10 16:57:42 +02:00
a17eb445c2 ngsfilter: made one of the tag error messages more accurate 2020-06-10 16:27:36 +02:00
e4a32788c2 Switch to version 3.0.0b24 2020-06-09 14:36:58 +02:00
2442cc80bf Cython: View: fixed bash history display 2020-06-09 14:36:37 +02:00
aa836b2ace uniq: improved progress bar of second browsing 2020-06-09 14:36:02 +02:00
8776ce22e6 C: fixed a bug where indexers referring to tuples of certain types were
not properly closed and imported
2020-06-09 14:34:43 +02:00
4aa772c405 ecotag: Added list of taxids for all best matches (closes #80) 2020-06-09 14:33:14 +02:00
b0b96ac37a version 3.0.0b23a 2020-06-05 16:10:24 +02:00
687e42ad22 C: kmer alignment: fixed a bug where scores of 0 were at
(0+kmer_length-1) (and now setting alignment direction to None if score
is 0
2020-06-05 16:09:33 +02:00
5fbbb6d304 alignpairedend: fixed a bug when rebuilding joined (unaligned) sequences
where only the forward sequence was kept
2020-06-05 16:06:43 +02:00
359a9fe237 Switch to version 3.0.0b23 2020-06-04 15:35:03 +02:00
f9b6851f75 Python: correctly flagged some mandatory options as required 2020-06-04 15:34:24 +02:00
29a2652bbf Fixed installation on Ubuntu without pip 2020-06-04 15:06:35 +02:00
2a2c233936 obi import: fixed a bug when skipping an entry 2020-05-29 21:19:42 +02:00
faf8ea9d86 Switch to version 3.0.0b21 2020-05-28 20:42:09 +02:00
ffe2485e94 Genbank parser: now reading ORIGIN lines with comments without
triggering error
2020-05-28 20:41:34 +02:00
6094ce2bbc obi import: skip on error more robust 2020-05-28 20:40:36 +02:00
a7dcf16c06 Minor changes for pip release 2020-05-20 15:59:04 +02:00
f13f8f6165 obi import: minor doc/display improvements 2020-05-20 11:46:29 +02:00
b5a29ac413 Switch to version 3.0.0b19 2020-05-20 10:29:36 +02:00
efd2b9d338 Cleaner installation 2020-05-20 10:29:12 +02:00
ca6e3e7aad obi import: fixed to work with seq genbank extension 2020-05-20 10:28:14 +02:00
76ed8e18e5 Switch to version 3.0.0b18 with version formatting that fits setuptools 2020-05-18 17:08:55 +02:00
1d17f28aec setup: now using setuptools instead of distutils to work with pip 2020-05-18 17:08:09 +02:00
fa834e4b8b obi import: small bug fix 2020-05-18 17:06:58 +02:00
a72fea3cc9 Python: fasta parser: fixed a bug stopping the program when the last
line contained a single nucleotide
2020-05-12 11:24:12 +02:00
e9a37d8a6e Switch to version 3.0.0-beta16 2020-05-07 17:09:26 +02:00
ef074f8455 typo 2020-05-07 17:08:59 +02:00
aec5e69f2c C, views: no more automatic COUNT column if MERGED_sample column exists 2020-05-07 17:08:07 +02:00
170ef3f1ba Views: added obi prefix to commands in bash history 2020-05-07 17:07:01 +02:00
f999946582 obi uniq: fixed the remerging of already merged informations, and
efficiency improvements
2020-05-07 17:05:54 +02:00
773b36ec37 obi import: fixed the import of old obitools files with premerged
informations, and other minor improvements
2020-05-07 17:03:04 +02:00
69cb434a6c version 3.0.0-beta15c 2020-04-29 14:25:33 +02:00
55d4f98d60 obi annotate: fixed annotation at ranks 2020-04-29 14:24:40 +02:00
0bec2631e8 ecotag: fixed a bug where all the full DMS path weren't properly sent to
the C layer
2020-04-29 10:35:55 +02:00
e6b6c6fa84 AVLs: Made an error message more informative 2020-04-29 10:14:04 +02:00
974528b2e6 build_ref_db: fixed bug erasing some of the higher LCAs (i.e. lowest
similarities)
2020-04-28 15:56:06 +02:00
1b346b54f9 ecotag: better specificity by now correctly looking for similarities
within refs above best score instead of ecotag threshold
2020-04-28 15:10:07 +02:00
058f2ad8b3 ecopcr: fixed a bug where sequences were considered circular (generating
false positives)
2020-04-27 14:44:35 +02:00
60bfd3ae8d obi annotate: now defaults to setting str if expression is not valid 2020-04-24 11:35:20 +02:00
67bdee105a C: build_ref_db: added progress display for each step 2020-04-18 14:24:08 +02:00
0f745e0113 C: Columns: optimizing column file growth 2020-04-18 13:55:47 +02:00
da8de52ba4 export: fixed progress bar bug 2020-04-17 15:09:10 +02:00
4d36538c6e C: SSE lcs alignment: band-aid for memory bug I don't understand
(triggered on specific db on ubuntu)
2020-04-17 15:07:52 +02:00
8d0b17d87d Switch to version 3.0.0-beta14 2020-04-15 17:47:26 +02:00
343999a627 Taxonomy: fixed a critical memory bug when building the list of merged
taxids
2020-04-15 17:46:13 +02:00
e9a40630e9 C: Columns: rounding column growth to ceil to avoid looping on small
values
2020-04-13 19:02:10 +02:00
8dbcd3025a C: Columns: reduced column growth factor from 2 to 1.3 to avoid errno28 2020-04-13 14:47:56 +02:00
4cf635d001 Switch to version 3.0.0-beta13 2020-04-12 17:42:58 +02:00
b7e7cc232a Made completion script cleaner 2020-04-12 17:41:59 +02:00
b6ab792ceb C: made error message more detailed when checking that sequences and
qualities match
2020-04-12 17:40:24 +02:00
ddea5a2964 obi import: fixed inconsequential error when precomputing number of
entries in some formats
2020-04-12 17:38:42 +02:00
30852ab7d5 View bash history: removed useless shebang 2020-04-12 17:36:04 +02:00
4d0299904e all commands (almost): cleaner DMS closing at the end 2020-04-12 17:31:58 +02:00
eef5156d95 obi stats: fixed error when printing bool keys 2020-04-12 17:12:04 +02:00
e62c991bbc goes with previous commit 2020-04-10 11:22:26 +02:00
1218eed7fd ecopcr: now printing a warning instead of interrupting with an error
when a taxid is not found
2020-04-10 11:22:04 +02:00
cd9cea8c97 obi import: fixed critical bug where the last entry of embl and genbank
files was not imported
2020-04-09 19:26:27 +02:00
98cfb70d73 ecopcr: made some errors more informative 2020-04-09 09:15:28 +02:00
b9f68c76c8 ecopcr: added warnings and check of primer length (related to #75) 2020-04-05 18:40:56 +02:00
0b98371688 ngsfilter: added warning about primer length in -h (#75) 2020-04-05 18:39:20 +02:00
f0d152fcbd ngsfilter: now checking primer length (fixes #75) 2020-04-05 18:29:10 +02:00
8019dee68e ecotag: now closing all DMS properly 2020-04-05 13:20:49 +02:00
0b4a234671 Swich to version 3.0.0-beta11 2020-02-12 14:23:42 +01:00
d32cfdcce5 ecotag: fixed the generated column comments formatting that would
generate errors
2020-02-12 14:23:17 +01:00
219c0d6fdc obi cat: Fixed the handling when concatenating views with dictionaries
having different key sets
2020-02-12 14:21:39 +01:00
dc9f897917 switch to version 3.0.0-beta10 2020-02-02 21:15:27 +01:00
bb72682f7d obi import: new option --preread to do a first readthrough of the
dataset if it contains huge dictionaries for a much faster import.
2020-02-02 21:12:34 +01:00
52920c3c71 URI decoding: dirty temp fix for bug where default dms makes a mess when
should guess file
2020-02-02 21:11:05 +01:00
18c22cecf9 switch to version 3.0.0-beta9 2020-02-01 15:48:55 +01:00
1bfb96023c obi import: rewriting a column now deletes the old one to save disk
space
2020-02-01 15:31:14 +01:00
c67d668989 obi import: fixed a bug when the first entry would contain a dictionary
with one key. Switch to beta8
2020-01-29 20:23:39 +01:00
db0ac37d41 switch to version 3.0.0-beta7 2020-01-29 16:18:53 +01:00
d0c21ecd39 Removed an OpenMP clause that was not obligatory and triggered a known
gcc bug involving macros
2020-01-24 16:00:53 +01:00
53212168a2 History: added 'obi' in bash history for practical reasons 2020-01-23 16:51:49 +01:00
b4b2e62195 Cleaner handling of reverse quality columns 2020-01-18 19:28:12 +01:00
ced82c4242 Switching to version 3.0-beta6 2020-01-18 17:29:23 +01:00
a524f8829e New command: obi cat to concatenate views (not optimized yet) 2020-01-18 17:28:31 +01:00
5c9091e9eb C: closing DMS after cleaning it instead of counting on upper layer 2020-01-18 17:27:35 +01:00
822000cb70 Fixes in documentation 2020-01-18 17:26:18 +01:00
b9cd9bee9a C: Changed obibool definitions because of conflict with R 2020-01-06 15:11:31 +01:00
b1f3e082f9 ngsfilter: fixed a bug when there is only one tag introduced in latest
edit
2020-01-06 13:53:38 +01:00
6c018b403c ecopcr: fixed and improved the options to keep nuclotides around the
amplicon
2019-12-26 20:45:54 +01:00
694d1934a8 Tagging version beta3 2019-12-12 17:03:13 +01:00
fc3ac03630 clean_dms: now works with extension 2019-12-12 17:02:50 +01:00
d75e54a078 uniq: added forced deletion of reverse sequence quality 2019-12-12 17:02:36 +01:00
6bfd7441f3 ngsfilter: fixed sequence cutting when dealing with unaligned sequences.
Could use optimization
2019-12-12 17:01:31 +01:00
81a179239c ngsfilter: fixed sequence cut bug on aligned sequences. Still exists for
unaligned sequences
2019-12-10 18:13:27 +01:00
35ce37c0f7 ngsfilter: fixed a bug with unaligned chimeras (unpaired primers) and
made error annotations more explicit
2019-12-10 13:43:32 +01:00
53f18316b0 ngsfilter: made more robust and practical to use with empty tags 2019-11-29 15:21:08 +01:00
8bc249b2f4 Version 3.0.0-beta1 2019-09-27 14:52:05 +02:00
e308c2e822 versioning 1.0.beta 2019-09-26 21:05:05 +02:00
3b3cf9359d CMake: unset gcc for nix 2019-09-26 21:04:42 +02:00
be85c55c9e Python: URIs: fixed bug on linux systems 2019-09-25 14:41:52 +02:00
6d5b904888 Cleaning 2019-09-25 11:58:00 +02:00
50e8374f6f Added website URL in readme file 2019-09-25 11:40:00 +02:00
6282242a04 C: Views: fixed a bug when trying to add a comment after changing the
file name of a finished view
2019-09-25 11:39:32 +02:00
44517db51f Fixed gcc warnings 2019-09-25 11:38:00 +02:00
c3b9e46291 more cleaning 2019-09-24 13:58:53 +02:00
28b7fce59a Cython API: simpler column repr display 2019-09-22 20:23:31 +02:00
fa9555deb9 obi stats: fixed bug with None values 2019-09-22 20:21:53 +02:00
d30f7e7317 more cleaning 2019-09-22 18:52:05 +02:00
4fa38d9886 cleaning 2019-09-22 17:38:28 +02:00
71276537a6 obi import: fixed bug when importing a taxdump 2019-09-22 16:45:30 +02:00
ba9ba7aa60 obi grep: now able to convert str to bytes in predicate expressions 2019-09-22 16:44:45 +02:00
7b4046c288 Bash completion script for commands, dms and views 2019-09-21 23:46:08 +02:00
e2ba76002a Cleaned setup script and put to my name ;) 2019-09-21 23:44:24 +02:00
336100f716 obi less: now actually behaves like less 2019-09-21 18:29:12 +02:00
d83398c0e0 Cython: View: lines from simple View instances are now displayed in tab
instead of dict format
2019-09-21 18:28:56 +02:00
974d25b815 Cython: Fixed bug in tab formatter with header option always being set
to true
2019-09-21 18:27:47 +02:00
ec0737a600 Added signal catching and handling in C and Cython 2019-09-21 16:47:22 +02:00
06f9d6da60 obi import: importing a view to a DMS now uses the C API (more efficient
and imports all metadata)
2019-09-21 12:49:29 +02:00
f0f7edf152 Python API: small option improvements 2019-09-21 12:08:36 +02:00
9e72c8d16a obi ls: improved taxonomy list 2019-09-20 20:46:33 +02:00
7c3fa14789 obi import: fixed bug when reading output URI 2019-09-20 20:43:48 +02:00
ec874c095b new command: clean_dms to clean and unlock a DMS after a bad exit. 2019-09-20 20:38:25 +02:00
783a1343c4 DMS are now locked when used by a command. Added checks and changed
cleaning mechanisms.
2019-09-20 20:37:19 +02:00
eb6c59dc1e obi import: proper check for taxonomy name already existing in DMS when
importing a taxdump
2019-09-17 13:41:49 +02:00
ad46056179 obi export: if export format is not specified, it is guessed from the
view type
2019-09-17 13:22:41 +02:00
9063e9159d Export options: output option is now only non-positional for obi export 2019-09-17 13:19:17 +02:00
0159385943 URI decoding: fixed bug with dms-only URI 2019-09-17 12:50:37 +02:00
a0c8deb806 obi export: made output to stdout and pipe in less possible 2019-09-17 12:31:03 +02:00
f566618be6 Added option for no progress bar and made output URI option non
positional (for stdout output)
2019-09-17 12:29:33 +02:00
88451116e8 URIs: added stdout output (empty URI) 2019-09-17 12:28:10 +02:00
eb913b2742 ecotag: trying to use a threshold lower than the ref db threshold now
returns an error instead of a warning
2019-09-15 19:27:47 +02:00
f8d1fa678a obi stats: improved display with str instead of bytes 2019-09-10 16:20:36 +02:00
bc55c5ef8c obi clean: fixed an openmp bug where the share size would be 0 blocking
the program
2019-09-10 15:37:33 +02:00
f3b0e10c7f fixed a comment 2019-09-10 14:42:12 +02:00
8f9f2a2d10 obi ls: various improvements 2019-09-10 14:41:43 +02:00
045a751b0f Merge branch 'master' of git@git.metabarcoding.org:obitools/obitools3.git 2019-09-05 17:20:13 +02:00
ad3a72597f Little fixes for linux compilation 2019-09-05 17:19:29 +02:00
8899478237 Update README.md 2019-09-04 17:29:45 +02:00
ec614e5d15 Update README.md 2019-09-04 17:23:30 +02:00
f8cccebe19 Update README.md 2019-09-04 17:11:46 +02:00
5e3c41b058 C: Fixed opened DIR leak 2019-09-04 16:48:13 +02:00
b3a1011d36 C: fixed a bug when opening or creating a new column directory where the
DMS was not saved in the struct
2019-09-04 13:16:28 +02:00
a7fabff1c7 C: made it so column DIR* are not kept open to handle very large DMS 2019-09-04 12:55:21 +02:00
f296517716 Various display improvements 2019-09-03 21:46:39 +02:00
d491480af2 C: fixed remaining memory bug in array indexer 2019-09-01 17:24:57 +02:00
073d98db08 C: ecotag: now prints a warning if the demanded threshold is lower than
the db threshold
2019-08-31 18:30:06 +02:00
0ee728c4d0 C: build_ref_db: now adds a comment with the threshold used to build the
DB
2019-08-31 18:29:40 +02:00
7423bacac0 C: Json comments: added an obi_read_comment function to read one value
from comments
2019-08-31 18:28:51 +02:00
53dcbc8ea3 Fixed log to be in str instead of bytes 2019-08-29 18:26:51 +02:00
4e75514bad obi import: fixed entry count 2019-08-29 18:26:09 +02:00
1ed2d45ac4 obi grep: made an error message clearer (error could be eventually be
handled by program, looking for str in bytes returned by a column)
2019-08-29 17:17:52 +02:00
e43e49d6f1 C: optimized dir opening 2019-08-29 16:35:10 +02:00
187053026f Better detection of missing taxonomy 2019-08-29 16:10:09 +02:00
dcf8cf1d64 Improved obi stats 2019-08-29 15:18:26 +02:00
3cfe3a9b00 Improved progress display when importing files in a DMS 2019-08-29 10:12:06 +02:00
728af51cb2 Python: better display of tuple values in fasta format 2019-08-28 15:55:36 +02:00
99a397b842 obi uniq: various improvements and fixes #66 2019-08-27 20:27:36 +02:00
f5c472ffd1 C: fixed a memory bug in the array indexer 2019-08-27 20:26:46 +02:00
580db2f710 minor comment 2019-08-27 20:25:54 +02:00
dbe09f83a2 Increased the threshold of elements per line in a column before they are
stored as a character string
2019-08-27 20:25:14 +02:00
3d1b2e8ed9 Better handling of column lines with all values at NA 2019-08-27 20:20:26 +02:00
ae5f42c260 fixes #61 : now reading merged taxids information when building a
reference database
2019-08-19 12:30:56 +02:00
af7cecf59f Fixed a bug where a directory was not closed properly resulting in errno
24 sometimes
2019-08-18 19:46:52 +02:00
5f20be44b2 Minor fixes 2019-08-18 19:45:53 +02:00
66441e0aef Fixed a bug when sending a DMS path to a C function from Cython 2019-08-18 19:43:51 +02:00
13952358b3 Fixed a bug where some commands wouldn't work if the input DMS was not
in the current directory
2019-07-25 11:59:19 +02:00
9f38cd8cf6 updated a comment 2019-07-23 19:03:24 +02:00
946f9723b8 ecotag: fixed a bug where the wrong taxid for the best match was
retrieved
2019-07-23 19:02:17 +02:00
9752ff8494 embl parser: information display about progress when parsing multiple
files
2019-07-23 18:59:07 +02:00
d99702f56f ngsfilter and alignpairedend: paired-end reads are now correctly
reversed and labeled to be aligned correctly by alignpairedend
2019-07-23 18:56:51 +02:00
1759302829 C: ecotag: fixed 2 memory bugs 2019-07-06 16:31:19 +02:00
86bfa96fbe C: kmer similarity: small improvements 2019-07-06 16:30:32 +02:00
f765c6f41e obi alignpairedend: fixed a bug where first seq was kept in result view
instead of consensus seq
2019-07-06 16:29:32 +02:00
a83bf43ab9 obi stats: result display is now sorted 2019-07-06 16:27:51 +02:00
3d9f0352ff obiclean parallelized 2019-06-20 19:44:04 +02:00
9b4c3537f9 multithreaded obiclean working but not cleaned 2019-06-19 17:29:58 +02:00
fd0b7a9177 j loop with critical (untested) 2019-06-04 17:14:36 +02:00
debf59b266 i loop parallelized: bad 2019-05-25 18:37:56 +02:00
a04588da31 openmp on j loop (i loop probably better) 2019-05-24 16:51:04 +02:00
ed5bb70c80 CMake: setting compiler higher to avoid conflicts, and linking libopenmp 2019-05-22 16:26:30 +02:00
22a5ae72d1 obi clean: not using tsearch library anymore, a simple byte array
instead. A lot more time and memory efficient. Closes #67
2019-05-19 17:39:53 +02:00
dc88181eeb Add a --cobitools3 options to setup.py 2019-04-12 14:55:05 +02:00
2f60e91d93 Comment the install of the packages 2019-04-12 13:03:53 +02:00
7ba27b6a99 Ask for python 3.7 2019-04-01 09:08:27 +02:00
d3937e1051 Add the cmakefile to the manifest 2019-04-01 09:01:45 +02:00
35eeb07f08 Build the C src in build/cobject 2019-04-01 08:52:38 +02:00
3afbbeb7e5 CMake: made required version 3.10 for ubuntu 2019-03-31 16:54:05 +02:00
d6056a8e50 dirty temporary fix for install 2019-03-31 16:19:05 +02:00
ac47bdce5d history: fixed DMS history when multiple inputs 2019-03-31 15:44:20 +02:00
7f8d1e7196 C: obi lcs: cleaner progress print 2019-03-31 15:42:58 +02:00
80068a3c19 ngsfilter: fixed parsing error 2019-03-31 15:42:30 +02:00
a3e6b7d913 obi import: fixed import of View_NUC_SEQS to another DMS 2019-03-31 15:42:07 +02:00
416c2d7ba0 Cython: made fasta formatter cleaner 2019-03-31 15:41:32 +02:00
26fb149efb C: made build_ref_db cleaner 2019-03-31 15:40:13 +02:00
2b8c066f8e Cython: added possibility to output in tabular format 2019-03-31 15:39:38 +02:00
e39c1a7fbf Cython: added tab formatter and parser (for obi export) 2019-03-31 15:38:34 +02:00
6841d879aa obi history: fixed a bug when displaying ascii history 2019-03-31 10:51:52 +02:00
f0ff585455 Removing trace 2019-03-30 20:52:54 +01:00
601a2cfd7d obi uniq: various fixes... 2019-03-30 20:34:53 +01:00
7c518300a0 C: Views: fixed a bug when creating automatic columns with unformatted
comments
2019-03-30 20:33:14 +01:00
f16bbca8e2 obi grep: fixed a bug where -p option didn't work 2019-03-30 19:10:42 +01:00
173483448a Merge commit '3d842ff7' 2019-03-30 15:29:52 +01:00
52b3a9fc39 C: taxonomy: fixed a segfault on linux when trying to fclose an unopened
file
2019-03-30 15:19:12 +01:00
ce686e9569 obi import: progress bar fixed when using --only option 2019-03-30 15:16:57 +01:00
c293cfabbb Python: embl parser: fixed a bug preventing taxids from being parsed 2019-03-30 15:15:49 +01:00
0847d618d6 fixed typo 2019-03-30 15:14:30 +01:00
9fcebd7643 C: build_reference_db: made some errors more explicit 2019-03-30 15:11:49 +01:00
5d842ff7e7 Clean the manifest of old files 2019-03-29 16:58:45 +01:00
3445579251 remove all the no more needed .cfiles 2019-03-29 16:56:58 +01:00
995a66b488 Add the new script emplacement 2019-03-29 16:55:23 +01:00
5007b02cbc cleaning stage 2 2019-03-29 16:46:17 +01:00
cdd5975e8b Cleaning first stage 2019-03-29 16:40:36 +01:00
0c466046f4 Merge branch 'pip-standard-orig-python' into 'master'
The new install version based on classical setup.py

See merge request obitools/obitools3!1
2019-03-29 16:25:01 +01:00
ee9947217c alignpairedend: fixed the worst memory leak and the handling of the case
where 0 common kmers are found
2019-03-29 11:16:25 +01:00
ceaafca427 ngsfilter: fixed a bug (maybe 2) in the algo for the choice of the
reverse primer when running on unaligned sequences
2019-03-29 10:56:17 +01:00
8e70bf1ee1 obi import: fixed bug when rewriting a column (keeping wrong type in
import module)
2019-03-26 14:56:18 +01:00
d8a7bd42bd Cython API, taxonomy: fixed parental tree iterator (skipped second to
last taxon, in OBI1 too)
2019-03-26 14:08:54 +01:00
226 changed files with 6828 additions and 15985 deletions

13
.gitignore vendored
View File

@ -1,13 +0,0 @@
.DS_Store
/obitools.test/
/OBITools3- 1.01.16/
/OBITools3-0.00.0/
/build/
/testpip/
/toto.obidms/
/unittests/
/setup.py.old
/3_Hornung_testseq.fastq
/sample/
/save.temp/
/obitmp/

View File

@ -1,17 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<projectDescription>
<name>obitools3</name>
<comment></comment>
<projects>
</projects>
<buildSpec>
<buildCommand>
<name>org.python.pydev.PyDevBuilder</name>
<arguments>
</arguments>
</buildCommand>
</buildSpec>
<natures>
<nature>org.python.pydev.pythonNature</nature>
</natures>
</projectDescription>

View File

@ -1,20 +0,0 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?eclipse-pydev version="1.0"?><pydev_project>
<pydev_property name="org.python.pydev.PYTHON_PROJECT_INTERPRETER">python3.6-obitool3</pydev_property>
<pydev_property name="org.python.pydev.PYTHON_PROJECT_VERSION">python interpreter</pydev_property>
<pydev_pathproperty name="org.python.pydev.PROJECT_SOURCE_PATH">
<path>/${PROJECT_DIR_NAME}/python</path>
</pydev_pathproperty>
</pydev_project>

View File

@ -1,11 +1,12 @@
include setup.py
recursive-include distutils.ext *.py *.c *.pem
recursive-include python *.pyx *.pxd *.c *.h *.cfiles
recursive-include python *.pyx *.pxd *.c *.h
recursive-include src *.c *.h
include src/CMakeLists.txt
recursive-include doc/sphinx/source *.txt *.rst *.py
recursive-include doc/sphinx/sphinxext *.py
include doc/sphinx/Makefile
include doc/sphinx/Doxyfile
include README.txt
include README.md
include requirements.txt
include scripts/obi

View File

@ -1,9 +1,9 @@
The `OBITools3`: A package for the management of analyses and data in DNA metabarcoding
---------------------------------------------
**Website: <https://metabarcoding.org/obitools3>**
DNA metabarcoding offers new perspectives for biodiversity research [1]. This approach of ecosystem studies relies heavily on the use of Next-Generation Sequencing (NGS), and consequently requires the ability to to treat large volumes of data. The `OBITools` package satisfies this requirement thanks to a set of programs specifically designed for analyzing NGS data in a DNA metabarcoding context [2] - <http://metabarcoding.org/obitools>. Their capacity to filter and edit sequences while taking into account taxonomic annotation helps to setup tailored-made analysis pipelines for a broad range of DNA metabarcoding applications, including biodiversity surveys or diet analyses.
DNA metabarcoding offers new perspectives for biodiversity research [1]. This approach of ecosystem studies relies heavily on the use of Next-Generation Sequencing (NGS), and consequently requires the ability to to treat large volumes of data. The `OBITools` package satisfies this requirement thanks to a set of programs specifically designed for analyzing NGS data in a DNA metabarcoding context [2] - <https://metabarcoding.org/obitools>. Their capacity to filter and edit sequences while taking into account taxonomic annotation helps to setup tailored-made analysis pipelines for a broad range of DNA metabarcoding applications, including biodiversity surveys or diet analyses.
@ -23,18 +23,19 @@ DNA metabarcoding offers new perspectives for biodiversity research [1]. This ap
**Tools.** The `OBITools3` offer the same tools as the original `OBITools`. Eventually, new versions of `ecoPrimers` (PCR primer design) [3], `ecoPCR` (*in silico* PCR) [4], as well as `Sumatra` (sequence alignment) and `Sumaclust` (sequence alignment and clustering) [5] will be added, taking advantage of the database structure developed for the `OBITools3`.
**Tools.** The `OBITools3` offer the same tools as the original `OBITools`, plus `ecoPCR` (*in silico* PCR) [4] and `Sumatra` (sequence alignment, not multithreaded yet) [5].
Eventually, new versions of `ecoPrimers` (PCR primer design) [3], as well as `Sumaclust` (sequence alignment and clustering) [5] will be added, taking advantage of the database structure developed for the `OBITools3`.
**Implementation and disponibility.** The lower layers managing the DMS as well as all the compute-intensive functions are coded in `C99` for efficiency reasons. A `Cython` (<http://www.cython.org>) object layer allows for a simple but efficient implementation of the `OBITools3` commands in `Python 3.5`. The `OBITools3` are still in development, and the first functional versions are expected for autumn 2016.
**Implementation and disponibility.** The lower layers managing the DMS as well as all the compute-intensive functions are coded in `C99` for efficiency reasons. A `Cython` (<http://www.cython.org>) object layer allows for a simple but efficient implementation of the `OBITools3` commands in `Python 3`. The `OBITools3` are now being released, check the wiki for more information.
**References.**
1. Taberlet P, Coissac E, Hajibabaei M, Rieseberg LH: Environmental DNA. Mol Ecol 2012:17891793.
2. Boyer F, Mercier C, Bonin A, Le Bras Y, Taberlet P, Coissac E: OBITools: a Unix-inspired software package for DNA metabarcoding. Mol Ecol Resour 2015:n/an/a.
2. Boyer F, Mercier C, Bonin A, Le Bras Y, Taberlet P, Coissac E: OBITools: a Unix-inspired software package for DNA metabarcoding. Mol Ecol Resour, 2016: 176-182.
3. Riaz T, Shehzad W, Viari A, Pompanon F, Taberlet P, Coissac E: ecoPrimers: inference of new DNA barcode markers from whole genome sequence analysis. Nucleic Acids Res 2011, 39:e145.
4. Ficetola GF, Coissac E, Zundel S, Riaz T, Shehzad W, Bessière J, Taberlet P, Pompanon F: An in silico approach for the evaluation of DNA barcodes. BMC Genomics 2010, 11:434.
5. Mercier C, Boyer F, Bonin A, Coissac E (2013) SUMATRA and SUMACLUST: fast and exact comparison and clustering of sequences. Available: <http://metabarcoding.org/sumatra> and <http://metabarcoding.org/sumaclust>

View File

@ -1,18 +0,0 @@
CC = gcc
CFLAGS = -c -Wall
LDFLAGS =
SOURCES = obicount.c ../obidmscolumn.c
OBJECTS = $(SOURCES:.c=.o)
EXECUTABLE = obicount
all: $(SOURCES) $(EXECUTABLE)
$(EXECUTABLE): $(OBJECTS)
$(CC) $(LDFLAGS) $(OBJECTS) -o $@
.c.o:
$(CC) $(CFLAGS) $< -o $@
clean:
rm *o
rm $(EXECUTABLE)

View File

@ -1,87 +0,0 @@
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h> /* mmap() is defined in this header */
#include <stdint.h>
#include <fcntl.h>
#include <sys/stat.h>
#include "../obitypes.h"
#include "../obidmscolumn.h"
/**
* @brief Computes the size to map.
*
* * @param OBIDMSColumn_file The file to map.
* @return The size to map.
*/
int get_size_to_map(int OBIDMSColumn_file)
// compute size to map : file size minus size of the header
{
int size;
struct stat s;
fstat(OBIDMSColumn_file, &s);
size = (s.st_size) - HEADER_SIZE;
return(size);
}
/**
* @brief Computes and prints the total number of sequences by summing their counts.
*
* * @param The count file.
*/
int main(int argc, char const *argv[])
{
char* map;
int size;
int OBIDMSColumn_file;
int count;
char c;
char num_str[10] = "";
int num_int;
int i,j;
// initialize variables
OBIDMSColumn_file = open(argv[1], O_RDONLY); //read only
count = 0;
j = 0;
// compute size to map
size = get_size_to_map(OBIDMSColumn_file);
// map the data
map = obi_map_read_only(OBIDMSColumn_file, HEADER_SIZE, size);
// sum the counts
for (i=0; i<size; i++)
{
c = map[i];
if (c != SEPARATOR) // reading lines
{
num_str[j] = c;
j++;
}
else if (c == SEPARATOR) // end of a line
{
num_int = atoi(num_str); // turn number from character string to int
count = count + num_int; // add the number to the sum
j = 0;
num_str[j] = '\0';
}
}
// print the final count of sequences
fprintf(stderr, "Sequence count = %d\n", count);
// unmap
obi_unmap(size);
// close file
close(OBIDMSColumn_file);
return(0);
}

5
doc/.gitignore vendored
View File

@ -1,5 +0,0 @@
/build/
/doxygen/
/build_dir.txt
/.DS_Store
/.gitignore

File diff suppressed because it is too large Load Diff

View File

@ -1,203 +0,0 @@
# Makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
PAPER =
BUILDDIR = build
DOXYGENDIR = doxygen
# User-friendly check for sphinx-build
ifeq ($(shell which $(SPHINXBUILD) >/dev/null 2>&1; echo $$?), 1)
$(error The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '$(SPHINXBUILD)' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/)
endif
# Internal variables.
PAPEROPT_a4 = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source
# the i18n builder cannot share the environment and doctrees with the others
I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source
.PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest coverage gettext
help:
@echo "Please use \`make <target>' where <target> is one of"
@echo " html to make standalone HTML files"
@echo " dirhtml to make HTML files named index.html in directories"
@echo " singlehtml to make a single large HTML file"
@echo " pickle to make pickle files"
@echo " json to make JSON files"
@echo " htmlhelp to make HTML files and a HTML help project"
@echo " qthelp to make HTML files and a qthelp project"
@echo " applehelp to make an Apple Help Book"
@echo " devhelp to make HTML files and a Devhelp project"
@echo " epub to make an epub"
@echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
@echo " latexpdf to make LaTeX files and run them through pdflatex"
@echo " latexpdfja to make LaTeX files and run them through platex/dvipdfmx"
@echo " text to make text files"
@echo " man to make manual pages"
@echo " texinfo to make Texinfo files"
@echo " info to make Texinfo files and run them through makeinfo"
@echo " gettext to make PO message catalogs"
@echo " changes to make an overview of all changed/added/deprecated items"
@echo " xml to make Docutils-native XML files"
@echo " pseudoxml to make pseudoxml-XML files for display purposes"
@echo " linkcheck to check all external links for integrity"
@echo " doctest to run all doctests embedded in the documentation (if enabled)"
@echo " coverage to run coverage check of the documentation (if enabled)"
clean:
rm -rf $(BUILDDIR)/*
rm -rf $(DOXYGENDIR)/*
html:
@echo "Generating Doxygen documentation..."
doxygen Doxyfile
@echo "Doxygen documentation generated. \n"
$(SPHINXBUILD) -b html -c ./ $(ALLSPHINXOPTS) $(BUILDDIR)/html
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
dirhtml:
$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."
singlehtml:
$(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
@echo
@echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."
pickle:
$(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
@echo
@echo "Build finished; now you can process the pickle files."
json:
$(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
@echo
@echo "Build finished; now you can process the JSON files."
htmlhelp:
$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
@echo
@echo "Build finished; now you can run HTML Help Workshop with the" \
".hhp project file in $(BUILDDIR)/htmlhelp."
qthelp:
$(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp
@echo
@echo "Build finished; now you can run "qcollectiongenerator" with the" \
".qhcp project file in $(BUILDDIR)/qthelp, like this:"
@echo "# qcollectiongenerator $(BUILDDIR)/qthelp/OBITools-3.qhcp"
@echo "To view the help file:"
@echo "# assistant -collectionFile $(BUILDDIR)/qthelp/OBITools-3.qhc"
applehelp:
$(SPHINXBUILD) -b applehelp $(ALLSPHINXOPTS) $(BUILDDIR)/applehelp
@echo
@echo "Build finished. The help book is in $(BUILDDIR)/applehelp."
@echo "N.B. You won't be able to view it unless you put it in" \
"~/Library/Documentation/Help or install it in your application" \
"bundle."
devhelp:
$(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
@echo
@echo "Build finished."
@echo "To view the help file:"
@echo "# mkdir -p $$HOME/.local/share/devhelp/OBITools-3"
@echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/OBITools-3"
@echo "# devhelp"
epub:
$(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub
@echo
@echo "Build finished. The epub file is in $(BUILDDIR)/epub."
latex:
@echo "Generating Doxygen documentation..."
doxygen Doxyfile
@echo "Doxygen documentation generated. \n"
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo
@echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex."
@echo "Run \`make' in that directory to run these through (pdf)latex" \
"(use \`make latexpdf' here to do that automatically)."
latexpdf:
@echo "Generating Doxygen documentation..."
doxygen Doxyfile
@echo "Doxygen documentation generated. \n"
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo "Running LaTeX files through pdflatex..."
$(MAKE) -C $(BUILDDIR)/latex all-pdf
@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
latexpdfja:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo "Running LaTeX files through platex and dvipdfmx..."
$(MAKE) -C $(BUILDDIR)/latex all-pdf-ja
@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
text:
$(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text
@echo
@echo "Build finished. The text files are in $(BUILDDIR)/text."
man:
$(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man
@echo
@echo "Build finished. The manual pages are in $(BUILDDIR)/man."
texinfo:
$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
@echo
@echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo."
@echo "Run \`make' in that directory to run these through makeinfo" \
"(use \`make info' here to do that automatically)."
info:
$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
@echo "Running Texinfo files through makeinfo..."
make -C $(BUILDDIR)/texinfo info
@echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo."
gettext:
$(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale
@echo
@echo "Build finished. The message catalogs are in $(BUILDDIR)/locale."
changes:
$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes
@echo
@echo "The overview file is in $(BUILDDIR)/changes."
linkcheck:
$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
@echo
@echo "Link check complete; look for any errors in the above output " \
"or in $(BUILDDIR)/linkcheck/output.txt."
doctest:
$(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest
@echo "Testing of doctests in the sources finished, look at the " \
"results in $(BUILDDIR)/doctest/output.txt."
coverage:
$(SPHINXBUILD) -b coverage $(ALLSPHINXOPTS) $(BUILDDIR)/coverage
@echo "Testing of coverage in the sources finished, look at the " \
"results in $(BUILDDIR)/coverage/python.txt."
xml:
$(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml
@echo
@echo "Build finished. The XML files are in $(BUILDDIR)/xml."
pseudoxml:
$(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml
@echo
@echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml."

View File

@ -1,300 +0,0 @@
# -*- coding: utf-8 -*-
#
# OBITools3 documentation build configuration file, created by
# sphinx-quickstart on Mon May 4 14:36:57 2015.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.
import sys
import os
import shlex
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#sys.path.insert(0, os.path.abspath('.'))
# -- General configuration ------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
#needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.todo',
'sphinx.ext.coverage',
'sphinx.ext.imgmath',
'sphinx.ext.ifconfig',
'sphinx.ext.viewcode',
'breathe',
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
# source_suffix = ['.rst', '.md']
source_suffix = '.rst'
# The encoding of source files.
#source_encoding = 'utf-8-sig'
# The master toctree document.
master_doc = 'source/index'
# General information about the project.
project = u'OBITools3'
copyright = u'2015, Céline Mercier, Eric Coissac, Frédéric Boyer'
author = u'Céline Mercier, Eric Coissac, Frédéric Boyer'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = '0.0'
# The full version, including alpha/beta/rc tags.
release = '0.0.0'
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
#today = ''
# Else, today_fmt is used as the format for a strftime call.
#today_fmt = '%B %d, %Y'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = []
# The reST default role (used for this markup: `text`) to use for all
# documents.
#default_role = None
# If true, '()' will be appended to :func: etc. cross-reference text.
#add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
#add_module_names = True
# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
#show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# A list of ignored prefixes for module index sorting.
#modindex_common_prefix = []
# If true, keep warnings as "system message" paragraphs in the built documents.
#keep_warnings = False
# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = True
# -- Options for HTML output ----------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = 'bizstyle'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#html_theme_options = {}
# Add any paths that contain custom themes here, relative to this directory.
#html_theme_path = []
# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
#html_title = None
# A shorter title for the navigation bar. Default is the same as html_title.
#html_short_title = None
# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
#html_logo = None
# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
#html_favicon = None
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# Add any extra paths that contain custom files (such as robots.txt or
# .htaccess) here, relative to this directory. These files are copied
# directly to the root of the documentation.
#html_extra_path = []
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
#html_last_updated_fmt = '%b %d, %Y'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
#html_use_smartypants = True
# Custom sidebar templates, maps document names to template names.
#html_sidebars = {}
# Additional templates that should be rendered to pages, maps page names to
# template names.
#html_additional_pages = {}
# If false, no module index is generated.
#html_domain_indices = True
# If false, no index is generated.
#html_use_index = True
# If true, the index is split into individual pages for each letter.
#html_split_index = False
# If true, links to the reST sources are added to the pages.
#html_show_sourcelink = True
# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
#html_show_sphinx = True
# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
#html_show_copyright = True
# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
#html_use_opensearch = ''
# This is the file name suffix for HTML files (e.g. ".xhtml").
#html_file_suffix = None
# Language to be used for generating the HTML full-text search index.
# Sphinx supports the following languages:
# 'da', 'de', 'en', 'es', 'fi', 'fr', 'hu', 'it', 'ja'
# 'nl', 'no', 'pt', 'ro', 'ru', 'sv', 'tr'
#html_search_language = 'en'
# A dictionary with options for the search language support, empty by default.
# Now only 'ja' uses this config value
#html_search_options = {'type': 'default'}
# The name of a javascript file (relative to the configuration directory) that
# implements a search results scorer. If empty, the default will be used.
#html_search_scorer = 'scorer.js'
# Output file base name for HTML help builder.
htmlhelp_basename = 'OBITools3doc'
# -- Options for LaTeX output ---------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#'preamble': '',
# Latex figure (float) alignment
#'figure_align': 'htbp',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(master_doc, 'OBITools3.tex', u'OBITools3 Documentation',
u'Céline Mercier, Eric Coissac, Frédéric Boyer', 'manual'),
]
# The name of an image file (relative to this directory) to place at the top of
# the title page.
#latex_logo = None
# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
#latex_use_parts = False
# If true, show page references after internal links.
#latex_show_pagerefs = False
# If true, show URL addresses after external links.
#latex_show_urls = False
# Documents to append as an appendix to all manuals.
#latex_appendices = []
# If false, no module index is generated.
#latex_domain_indices = True
# -- Options for manual page output ---------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
(master_doc, 'obitools3', u'OBITools3 Documentation',
[author], 1)
]
# If true, show URL addresses after external links.
#man_show_urls = False
# -- Options for Texinfo output -------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(master_doc, 'OBITools3', u'OBITools3 Documentation',
author, 'OBITools3', 'One line description of project.',
'Miscellaneous'),
]
# Documents to append as an appendix to all manuals.
#texinfo_appendices = []
# If false, no module index is generated.
#texinfo_domain_indices = True
# How to display URL addresses: 'footnote', 'no', or 'inline'.
#texinfo_show_urls = 'footnote'
# If true, do not generate a @detailmenu in the "Top" node's menu.
#texinfo_no_detailmenu = False
#Breathe configuration
sys.path.append( "breathe/" )
breathe_projects = { "OBITools3": "doxygen/xml/" }
breathe_default_project = "OBITools3"
#breathe_projects_source = {
# "auto" : ( "../src", ["obidms.h", "obiavl.h"] )
# }

View File

@ -1,160 +0,0 @@
*********************************************
The OBItools3 Data Management System (OBIDMS)
*********************************************
A complete DNA metabarcoding experiment relies on several kinds of data.
- The sequence data resulting from the sequencing of the PCR products,
- The description of the samples including all their metadata,
- One or several reference databases used for the taxonomic annotation,
- One or several taxonomy databases.
Up to now, each of these categories of data were stored in separate
files, and nothing made it mandatory to keep them together.
The `Data Management System` (DMS) of OBITools3 can be viewed like a basic
database system.
OBIDMS UML
==========
.. image:: ./UML/OBIDMS_UML.png
:download:`html version of the OBIDMS UML file <UML/ObiDMS_UML.class.violet.html>`
An OBIDMS directory contains :
* one `OBIDMS history file <#obidms-history-files>`_
* OBIDMS column directories
OBIDMS column directories
=========================
OBIDMS column directories contain :
* all the different versions of one OBIDMS column, under the form of different files (`OBIDMS column files <#obidms-column-files>`_)
* one `OBIDMS version file <#obidms-version-files>`_
The directory name is the column attribute with the extension ``.obicol``.
Example: ``count.obicol``
OBIDMS column files
===================
Each OBIDMS column file contains :
* a header of a size equal to a multiple of PAGESIZE (PAGESIZE being equal to 4096 bytes
on most systems) containing metadata
* Lines of data with the same `OBIType <types.html#obitypes>`_
Header
------
The header of an OBIDMS column contains :
* Endian byte order
* Header size (PAGESIZE multiple)
* Number of lines of data
* Number of lines of data used
* `OBIType <types.html#obitypes>`_ (type of the data)
* Date of creation of the file
* Version of the OBIDMS column
* The column name
* Eventual comments
Data
----
A line of data corresponds to a vector of elements. Each element is associated with an element name.
Elements names are stored in the header. The correspondance between an element and its name is done
using their order in the lists of elements and elements names. This structure allows the storage of
dictionary-like data.
Example: In the header, the attribute ``elements_names`` will be associated with the value ``"sample_1;
sample_2;sample_3"``, and a line of data with the type ``OBInt_t`` will be stored as an ``OBInt_t`` vector
of size three e.g. ``5|8|4``.
Mandatory columns
-----------------
Some columns must exist in an OBIDMS directory :
* sequence identifiers column (type ``OBIStr_t``)
File name
---------
Each file is named with the attribute associated to the data it contains, and the number of
its version, separated by an ``@``, and with the extension ``.odc``.
Example : ``count@3.odc``
Modifications
-------------
An OBIDMS column file can only be modified by the process that created it, and while its status is set to Open.
When a process wants to modify an OBIDMS column file that is closed, it must first clone it. Cloning creates a new version of the
file that belongs to the process, i.e., only that process can modify that file, as long as its status is set to Open. Once the process
has finished writing the new version of the column file, it sets the column file's status to Closed, and the file can never be modified
again.
That means that one column is stored in one file (if there is only one version)
or more (if there are several versions), and that there is one file per version.
All the versions of one column are stored in one directory.
Versioning
----------
The first version of a column file is numbered 0, and each new version increments that
number by 1.
The number of the latest version of an OBIDMS column is stored in the `OBIDMS version file <#obidms-version-files>`_ of its directory.
OBIDMS version files
====================
Each OBIDMS column is associated with an OBIDMS version file in its directory, that contains the number of the latest
version of the column.
File name
---------
OBIDMS version files are named with the attribute associated to the data contained in the column, and
have the extension ``.odv``.
Example : ``count.odv``
OBIDMS views
============
An OBIDMS view consists of a list of OBIDMS columns and lines. A view includes one version
of each mandatory column. Only one version of each column is included. All the columns of
one view contain the same number of lines in the same order.
OBIDMS history file
===================
An OBIDMS history file consists of an ordered list of views and commands, those commands leading
from one view to the next one.
This history can be represented in the form of a ?? showing all the
operations ever done in the OBIDMS directory and the views in between them :
.. image:: ./images/history.png
:width: 150 px
:align: center

Binary file not shown.

Before

Width:  |  Height:  |  Size: 67 KiB

View File

@ -1,832 +0,0 @@
<HTML>
<HEAD>
<META name="description"
content="Violet UML Editor cross format document" />
<META name="keywords" content="Violet, UML" />
<META charset="UTF-8" />
<SCRIPT type="text/javascript">
function switchVisibility() {
var obj = document.getElementById("content");
obj.style.display = (obj.style.display == "block") ? "none" : "block";
}
</SCRIPT>
</HEAD>
<BODY>
This file was generated with Violet UML Editor 2.1.0.
&nbsp;&nbsp;(&nbsp;<A href=# onclick="switchVisibility()">View Source</A>&nbsp;/&nbsp;<A href="http://sourceforge.net/projects/violet/files/violetumleditor/" target="_blank">Download Violet</A>&nbsp;)
<BR />
<BR />
<SCRIPT id="content" type="text/xml"><![CDATA[<ClassDiagramGraph id="1">
<nodes id="2">
<ClassNode id="3">
<children id="4"/>
<location class="Point2D.Double" id="5" x="520.0" y="30.0"/>
<id id="6" value="a6688f6e-9346-46c6-9cf5-4fa6148f613f"/>
<revision>1</revision>
<backgroundColor id="7">
<red>255</red>
<green>255</green>
<blue>255</blue>
<alpha>255</alpha>
</backgroundColor>
<borderColor id="8">
<red>0</red>
<green>0</green>
<blue>0</blue>
<alpha>255</alpha>
</borderColor>
<textColor reference="8"/>
<name id="9" justification="1" size="3" underlined="false">
<text>OBIType_t
</text>
</name>
<attributes id="10" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="11" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="12">
<children id="13"/>
<location class="Point2D.Double" id="14" x="780.0" y="100.0"/>
<id id="15" value="7edd4f08-c5e5-4e41-bc05-8b357cb5e629"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="16" justification="1" size="3" underlined="false">
<text>OBIContainer_t</text>
</name>
<attributes id="17" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="18" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="19">
<children id="20"/>
<location class="Point2D.Double" id="21" x="330.0" y="110.0"/>
<id id="22" value="dbb15831-2f0b-4e97-83e7-5ecdda6d6075"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="23" justification="1" size="3" underlined="false">
<text>OBIElementary_t</text>
</name>
<attributes id="24" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="25" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="26">
<children id="27"/>
<location class="Point2D.Double" id="28" x="670.0" y="240.0"/>
<id id="29" value="9693da23-1b47-4bf3-9544-86390a533713"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="30" justification="1" size="3" underlined="false">
<text>OBIList_t</text>
</name>
<attributes id="31" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="32" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="33">
<children id="34"/>
<location class="Point2D.Double" id="35" x="780.0" y="240.0"/>
<id id="36" value="b2f4d561-0c10-4443-b8f6-d3628ab9bcfe"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="37" justification="1" size="3" underlined="false">
<text>OBISet_t</text>
</name>
<attributes id="38" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="39" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="40">
<children id="41"/>
<location class="Point2D.Double" id="42" x="890.0" y="240.0"/>
<id id="43" value="8cc209c6-18c7-4a90-a5d4-ab7246638b2f"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="44" justification="1" size="3" underlined="false">
<text>OBIDictionnary_t</text>
</name>
<attributes id="45" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="46" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="47">
<children id="48"/>
<location class="Point2D.Double" id="49" x="170.0" y="220.0"/>
<id id="50" value="cb77086b-7535-49dc-ab33-b58d16eec496"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="51" justification="1" size="3" underlined="false">
<text>OBIAtomic_t</text>
</name>
<attributes id="52" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="53" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="54">
<children id="55"/>
<location class="Point2D.Double" id="56" x="500.0" y="240.0"/>
<id id="57" value="5a32037d-eaf1-4bbc-977e-06589f1d2ca5"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="58" justification="1" size="3" underlined="false">
<text>OBIComposite_t
</text>
</name>
<attributes id="59" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="60" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="61">
<children id="62"/>
<location class="Point2D.Double" id="63" x="560.0" y="400.0"/>
<id id="64" value="84fa636d-0c5a-4df8-bac7-79df8e546c12"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="65" justification="1" size="3" underlined="false">
<text>OBIString_t</text>
</name>
<attributes id="66" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="67" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="68">
<children id="69"/>
<location class="Point2D.Double" id="70" x="450.0" y="400.0"/>
<id id="71" value="752b5f9b-0ece-4902-8de5-e2c465d281dd"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="72" justification="1" size="3" underlined="false">
<text>OBITaxid_t
</text>
</name>
<attributes id="73" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="74" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="75">
<children id="76"/>
<location class="Point2D.Double" id="77" x="220.0" y="400.0"/>
<id id="78" value="9b89f530-cedc-4b33-a36f-371ccfb2ffae"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="79" justification="1" size="3" underlined="false">
<text>OBIInteger_t
</text>
</name>
<attributes id="80" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="81" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="82">
<children id="83"/>
<location class="Point2D.Double" id="84" x="0.0" y="400.0"/>
<id id="85" value="01da8ca2-da98-4fde-9ffe-9753a3f202bf"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="86" justification="1" size="3" underlined="false">
<text>OBIFloat_t</text>
</name>
<attributes id="87" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="88" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="89">
<children id="90"/>
<location class="Point2D.Double" id="91" x="110.0" y="400.0"/>
<id id="92" value="1134dbf0-087e-4c9b-be7a-6157bb10ebf0"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="93" justification="1" size="3" underlined="false">
<text>OBIBool_t
</text>
</name>
<attributes id="94" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="95" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
<ClassNode id="96">
<children id="97"/>
<location class="Point2D.Double" id="98" x="330.0" y="400.0"/>
<id id="99" value="a78dbab0-8879-4149-b5c9-aed41e7da0bb"/>
<revision>1</revision>
<backgroundColor reference="7"/>
<borderColor reference="8"/>
<textColor reference="8"/>
<name id="100" justification="1" size="3" underlined="false">
<text>OBIChar_t</text>
</name>
<attributes id="101" justification="0" size="4" underlined="false">
<text></text>
</attributes>
<methods id="102" justification="0" size="4" underlined="false">
<text></text>
</methods>
</ClassNode>
</nodes>
<edges id="103">
<InheritanceEdge id="104">
<start class="ClassNode" reference="12"/>
<end class="ClassNode" reference="3"/>
<startLocation class="Point2D.Double" id="105" x="50.0" y="10.0"/>
<endLocation class="Point2D.Double" id="106" x="70.0" y="40.0"/>
<transitionPoints id="107"/>
<id id="108" value="debdd86e-d072-4413-9d4c-dcabf70e44f9"/>
<revision>1</revision>
<bentStyle name="AUTO"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="109">
<start class="ClassNode" reference="19"/>
<end class="ClassNode" reference="3"/>
<startLocation class="Point2D.Double" id="110" x="90.0" y="10.0"/>
<endLocation class="Point2D.Double" id="111" x="20.0" y="50.0"/>
<transitionPoints id="112"/>
<id id="113" value="1491704a-dd29-47dc-92e4-2b53c62fd634"/>
<revision>1</revision>
<bentStyle name="AUTO"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="114">
<start class="ClassNode" reference="47"/>
<end class="ClassNode" reference="19"/>
<startLocation class="Point2D.Double" id="115" x="70.0" y="10.0"/>
<endLocation class="Point2D.Double" id="116" x="40.0" y="40.0"/>
<transitionPoints id="117"/>
<id id="118" value="8475a565-b6dd-404b-8dd3-07f89b3a2853"/>
<revision>1</revision>
<bentStyle name="AUTO"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="119">
<start class="ClassNode" reference="54"/>
<end class="ClassNode" reference="19"/>
<startLocation class="Point2D.Double" id="120" x="60.0" y="20.0"/>
<endLocation class="Point2D.Double" id="121" x="80.0" y="40.0"/>
<transitionPoints id="122"/>
<id id="123" value="a8696d02-b718-4800-bacb-d076aa2ed3ce"/>
<revision>1</revision>
<bentStyle name="AUTO"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="124">
<start class="ClassNode" reference="68"/>
<end class="ClassNode" reference="54"/>
<startLocation class="Point2D.Double" id="125" x="50.0" y="20.0"/>
<endLocation class="Point2D.Double" id="126" x="40.0" y="30.0"/>
<transitionPoints id="127"/>
<id id="128" value="4c4010cd-e981-4051-8cf6-a28ebc154bc7"/>
<revision>1</revision>
<bentStyle name="AUTO"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="129">
<start class="ClassNode" reference="61"/>
<end class="ClassNode" reference="54"/>
<startLocation class="Point2D.Double" id="130" x="50.0" y="10.0"/>
<endLocation class="Point2D.Double" id="131" x="80.0" y="50.0"/>
<transitionPoints id="132"/>
<id id="133" value="b99845fa-28f7-4625-9228-313aa4a9cd17"/>
<revision>1</revision>
<bentStyle name="AUTO"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="134">
<start class="ClassNode" reference="26"/>
<end class="ClassNode" reference="12"/>
<startLocation class="Point2D.Double" id="135" x="50.0" y="10.0"/>
<endLocation class="Point2D.Double" id="136" x="50.0" y="50.0"/>
<transitionPoints id="137"/>
<id id="138" value="3410cc22-8aee-4667-9dcf-8f24edfe9ce2"/>
<revision>1</revision>
<bentStyle name="AUTO"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="139">
<start class="ClassNode" reference="33"/>
<end class="ClassNode" reference="12"/>
<startLocation class="Point2D.Double" id="140" x="60.0" y="10.0"/>
<endLocation class="Point2D.Double" id="141" x="70.0" y="50.0"/>
<transitionPoints id="142"/>
<id id="143" value="0316447c-b7b7-480a-80cd-87d66ab6452b"/>
<revision>1</revision>
<bentStyle name="AUTO"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="144">
<start class="ClassNode" reference="40"/>
<end class="ClassNode" reference="12"/>
<startLocation class="Point2D.Double" id="145" x="50.0" y="10.0"/>
<endLocation class="Point2D.Double" id="146" x="90.0" y="50.0"/>
<transitionPoints id="147"/>
<id id="148" value="d995fa9f-7a1d-4340-b519-d29256c972ce"/>
<revision>1</revision>
<bentStyle name="AUTO"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="149">
<start class="ClassNode" reference="82"/>
<end class="ClassNode" reference="47"/>
<startLocation class="Point2D.Double" id="150" x="60.0" y="20.0"/>
<endLocation class="Point2D.Double" id="151" x="30.0" y="20.0"/>
<transitionPoints id="152"/>
<id id="153" value="cdf54d62-5b64-46ab-8fdc-ee0184ea9a43"/>
<revision>1</revision>
<bentStyle name="STRAIGHT"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="154">
<start class="ClassNode" reference="89"/>
<end class="ClassNode" reference="47"/>
<startLocation class="Point2D.Double" id="155" x="80.0" y="20.0"/>
<endLocation class="Point2D.Double" id="156" x="20.0" y="30.0"/>
<transitionPoints id="157"/>
<id id="158" value="9d2d9603-b709-499b-8357-4d844bfa227d"/>
<revision>1</revision>
<bentStyle name="STRAIGHT"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="159">
<start class="ClassNode" reference="75"/>
<end class="ClassNode" reference="47"/>
<startLocation class="Point2D.Double" id="160" x="50.0" y="10.0"/>
<endLocation class="Point2D.Double" id="161" x="70.0" y="50.0"/>
<transitionPoints id="162"/>
<id id="163" value="4c3e06c4-c978-410f-8925-32a64eea92f8"/>
<revision>1</revision>
<bentStyle name="STRAIGHT"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<InheritanceEdge id="164">
<start class="ClassNode" reference="96"/>
<end class="ClassNode" reference="47"/>
<startLocation class="Point2D.Double" id="165" x="20.0" y="10.0"/>
<endLocation class="Point2D.Double" id="166" x="90.0" y="50.0"/>
<transitionPoints id="167"/>
<id id="168" value="37c12090-8ace-4fbe-b7f9-e86eb9bf2805"/>
<revision>1</revision>
<bentStyle name="STRAIGHT"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</InheritanceEdge>
<CompositionEdge id="169">
<start class="ClassNode" reference="12"/>
<end class="ClassNode" reference="19"/>
<startLocation class="Point2D.Double" id="170" x="10.0" y="50.0"/>
<endLocation class="Point2D.Double" id="171" x="70.0" y="20.0"/>
<transitionPoints id="172"/>
<id id="173" value="7df9ed88-49f1-4828-9f45-b9d693d877aa"/>
<revision>1</revision>
<bentStyle name="AUTO"/>
<startLabel></startLabel>
<middleLabel></middleLabel>
<endLabel></endLabel>
</CompositionEdge>
</edges>
</ClassDiagramGraph>]]></SCRIPT>
<BR />
<BR />
<IMG alt="embedded diagram image" src="
JyNFIJt0TFI0LNab7tDo3HTfbk1F5+akJpvUsh02qWilo7u2sa4VZ6PiVjY5yWlpmy1W2RZbtjwt
3WFZsbtkeKvBETbxNCIp50gL6Yh+Xuv7eH3nO78YmBnm1+PxB2fm4vr5vF7X+/24Zq65rssuAAAA
2HAZDAcKBgCCtDEnAgAAsLN8QiArAMDyAQAAcyUrAAAsHwAAMFeyAgDA8gEAAHMNzqy+uARpAACW
DwAAWD6WDwCA5QMAAJaP5QMAYPkAAIDlY/kAgOUDAABg+Vg+AGD5AACA5WP5WD4AYPkAAIDlY/kA
AFg+AABg+Vg+AACWDwAAWD6WDwA0UEQAAABYPpYPAFg+AABg+YDlAwCWDwAAWD6WDwCA5QMAAJaP
5QMAYPkAAIDlY/kAAFg+AABg+Vg+AGD5AACA5QOWDwBYPgAAYPlYPgAAlg8AAFg+lg8AgOUDAACW
j+UDAGD5AACA5WP5AIDlAwBA2NHV1eUnyz9//nxHR4eb+UeC5VssFmoMALB8AAAIgJ6WlZUNDAz4
0PLPnDmzadOmmJiYqKgomUlcXNzLL788ODio/nvllVdedpFZs2atXr36m2++0YYnJCTIi2uuueYy
Z3z55Zd+CuHs2bP+sPyamhq+DwEALB8AAAJj+UVFRQaDwWw2W61Wn1j+I488ItOWlJScPn365MmT
a9askbfPPvusZvMLFy5sbm5+7LHHZPiGDRvsLL+jo+PwRRYvXiwjtLS0qLfeu7hTli5dOmPGDN9a
fnt7e1ZWlslkwvIBAMsHAIDAWL6yUlHS9PT0pqYmLy3/yJEjUVFRkydPPnfunBrS09MzZsyYK664
oq+vT9n8zTffLC/eeecdWcRzzz1nZ/kaGRkZMsK3336r3q5bt85oNB48eFC9ffLJJ+X8RGn6XXfd
tWPHjqSkpOTk5D/84Q9qhAMHDtx6662xsbELFiz4+OOPna6tnHuMHz8+Ojpa5rx3716fWH5paamc
NdXV1V3gtw0AgOUDAEAALV8hip+Wlia6L9I/YkPdtWuXTHjnnXfaDly0aJEMbG5uVjY/e/bszZs3
y9+cnJxTp055aPlvvPGGvP3pT38qr3t7e+W0Yc+ePWrCKVOmyClKfn7+uHHjrr76ahkoU82YMeP6
66/fvXt3ampqYmKidsmQLa2trbNmzZo6daqsdnd3tzcxyqxkJvPnz5dzD1k9b86UAACwfAAA8Jnl
K2pqagwGg6jqyAx169at6nId24H333+/DKyvr1dSPmHChJiYGL1ebzabz58/76Hlywsx++uuu05e
v/TSS+LT2oQzZ85Uo6kFHThw4M0335QXzzzzTE9Pz4YNG+T1/v37na6wnAmoEwNvkFXNy8uTMw11
JuMmXgAALB8AAJx7uW9xXIQI8eTJk+VfnZ2dw109dR3OsmXLbAfOnj1bfdp94dIVOwMDAy+88IIM
XLdunYeWL+Tm5qrvBIxG4yuvvKJNuGDBAvW6tLRURvjd7373/PPPy4tx48ZNuIT64N8fli8pybLm
zJnj+CPmy8IXjkQALB8AAHxs+X6dW0NDQ0pKyqpVq0a2IIvFoqxaU96vvvpKZjVt2jQ1RLsuf3Bw
cMqUKXFxcerjfE8s/+2335YhMrlM1d/f72j5q1evlhGOHj2qLu8pLi4ecoXF8qdPn+59jMuXL7/p
ppsOHDjgv50VrkUIAFg+AAD40fJbWlqys7OzsrLkhTcLeuqpp2TalStXdnV1tbW13X777fJ227Zt
mpQbjca33npr3bp1So614UNa/pkzZ6ZOnSoD169frw2UCSdNmlRdXf3hhx/K67lz58rAkydPynmF
nAy8/vrrHR0ddXV1siZO11bWU2bY1NSknTaMLMYvvvjixRdfTExM3Lhxo3aGg+UDAJYPAAABs3zR
8fz8/NTU1Pr6eu1+miNe0Llz5zZv3hwTE6Mu7RDbrqqqspVyNVx8/d577/3ss888t3xB1jM6Ovro
0aO2M5w5c+aSJUtkZJ1Op67+F0T65XRCLWv69Onvv/++07Xdu3evnBjIONu3b/fS8oVPPvmkqKhI
lqsuEMLyAQDLBwCAwFj+xo0bDQZDZWWlr+6XrxgcHGxra/P5s2+XLVt233332Q7Rrtjp7u7Wbt+p
cerUKe2ON26QabXfAXtj+ep++c3NzWlpaSO+6gnLBwAsHwAAy/d2biUlJU49OAhNbseOHbJWdje2
t70u3z1zHfBhjHbPvpVTpvLyciwfALB8AAAIgGC5+aw9CE3ugw8+2L17t93AP/3pT3/5y188mdzi
gP8sX1siRQgAWD4AAASRYGFy3ls+RQgAWD4AAGD5WD5FCABYPgAAlo/JYflYPgBg+QAACBYmh+VT
GwBYPgAAYPmA5VMbAFg+AABg+Vg+RQgAWD4AAGD5WD5FCABYPgAAlo/JYflYPgBg+QAACBYmh+VT
GwBYPgAAIFiYHJZPbQBg+QAAgOVj+Vg+AGD5AACA5WP5FCEAYPkAAIDlY/lYPgBg+QAACBYmh+VT
GwBYPgAAIFiYHJZPbQBg+QAAgOUDlg8AWD4AAGD5WD5FCABYPgAAYPlYPpYPAFg+AACChclh+dQG
AGD5AAAIFiaH5VMbAFg+AABg+YDlAwCWDwAAWD6WTxECAJYPAABYPpaP5QMAlg8AgGD5dEHgOVg+
AGD5AAAQYoL1BXgMRQgAWD4AAGD5WD5FCABYPgAA7TWWj+Vj+QCA5QMAIFgAFCEAlg8AAAgWAEUI
gOUDAACCBRQhAGD5AACAYAFFCABYPgAAIFhAEQIAlg8AgGABUIQAHLBEAACAYAFQhE63FzyHAwTL
BwAABAsoQraXrADLBwAAOlGgCNlesgIsHwCAThSAImR7yQrLBwAAOlEAipCDbuRZfXEJ0sDyAQAA
4QCKkO3F8gHLBwBAsAAoQrYXywcsHwAA4QCgCNleLB/LBwAAhAOAIuSgw/KxfAAAQDiAImR7sXws
H8sHgFFrcyGcnviIcADWy/Zi+YDlAwD9U7jlzA4FWhW2F8sHLB8A6J+wfACKkO3F8rF8AKB/Aiwf
gCLkoMPysXwAoH+C0ezb2KFAq8L20hIClg8A9E9YPgBFyPZi+Vg+ANA/AZYPQBFy0GH5WD4A0D8B
lg+0KmwvYPlYPgDQP9G3sUOBVoXtpSUELB8A6J/o29ihQKvC9tISApYPAPRPWD6A/4qQJ1sDlo/l
AwCWT9/GDgWgFaUlBCwfAOif6NvYoQC0orSEgOUDAP0Tlg8AHHRYPmD5APRPgOUDwAgPuvPnz3d0
dHR1dY14njKHY8eOHTx48PTp035d+cbGxo8//hjLx/IBAMsHLB8AXB50Z86c2bRpU0xMTFRUlAyP
i4t7+eWXBwcH1X+vvPJK9WvdWbNmrV69+ptvvtGGJyQkqNf9/f0/+9nPJk2aJKONHTtW/ur1+hGv
29mzZ92PIItesGCBn5JxXDqWj+UDAJaP5bNDAUKvFX3kkUfkbUlJyenTp0+ePLlmzRp5++yzz2pK
vXDhwubm5scee0yGb9iwwdHy165dK/+S+Rw/ftxqte7fv/+1114b2YotXbp0xowZ7sc5duzY119/
7Y9YnC4dy8fyAQDLx/LZoQAh1ooeOXIkKipq8uTJ586dU0N6enrGjBlzxRVX9PX1KZu/+eab5cU7
77wjUz333HN2lt/W1ibjz58///z5844Lkv9mZ2dPmTIlMTGxuLh4YGBA8+m77rpr165dc+bMWbJk
yWeffSYD5dRi/Pjx0dHRRqNx7969MuSHP/zhDTfcMHXq1LvvvltriNS0buYjHDhw4NZbb42NjV2w
YIF2eY8a+dVXX501a9bnn39ut6qOS8fysXwAwPKxfHYoQEi2ouLH8vrOO++0/e+iRYtkYHNzs7L5
2bNnb968Wf7m5OScOnXKzvJ37twpIz/66KOOS+nv7xe5j4uLq66uLiwslNHkrza5TqdLSUkpKCiQ
4bm5uTKwtbVV/FucXtaqu7v7wsXvGeT1u+++e/nll99///3atNoVO07n8+23386YMeP666/fvXt3
amqqrIO6AElGlvMN2ZC1a9d2dHTYra3j0rF8LB8AsHwsnx0KEJKt6NatW9XlOrb/FZ+WgfX19cqM
J0yYEBMTo9frzWaz9oG9ZvlO56Cora2Vf4l/X7j421zR/fHjx6svDWRyeasuvLnqqquuvfZaNYmo
+dVXX203H3FukXXtWho7y3ecz5tvvinLfeaZZ3p6ejZs2CCv9+/fr0aWrbBYLK5icbp0LB/LBwAs
Pwxz5jGcAOHdiqrrcJYtW2b739mzZ8vA1tbWC5eu2BkYGHjhhRdk4Lp16+wsf/fu3TJ88eLFjkt5
/vnn5V+VlZXqbWZmprw9evSonanPvoijZw8ODq5fv37evHmXX375uHHjtOF2lu84H7VcmWTCJfbs
2XPBg5/tYvlYPgBg+ZGSM30bQHi3ohaLRXmwdsX8V199Jf+dNm2aGqJdly/OPWXKlLi4OPVxvmb5
x44dE5+WSRzvwllTUyPDn3jiCfU2KSkpKirqu+++c2/506dPV69/+9vfql8CWK3WjIwMzy3/jTfe
kAmLi4vt1scTy9eWTkuI5QMAlo/lA0AIt6JPPfWUvF25cqVoeltb2+233y5vt23bppmx0Wh86623
1q1bJ8OXL1+uDdfusSMeL/+68cYb33//fTlJkJEffvjhb7/9tru7W10HL22I2WyWcbRfzbqyfFkN
Ga2pqam/v/+VV16R11u3bt25c6fId0xMzD//+U9PLP/kyZNyliInJK+//npHR0ddXZ1slyeWb7t0
WkIsHwCwfCwfAEK4FT137tzmzZvFodXlc+LHVVVV2n+1++VPnTr13nvv1W5iY2v5Z8+e/cUvfhEb
G6vGnDx58i233KKM/L333ktMTJSBY8aMueOOO7Q7YLqy/L17986dO1fG3759+6lTpxYuXCivlyxZ
8tBDD8mL1atXe2L5wocffignJ2p95AxBTj88sXzbpdMSYvkA4C/cPILRE8v3/jmOdjz99NO/+tWv
sHzBarVSnwBh9lnJ4OBgW1ubl8++7ezsPHLkiOMtNY8fP2776fiQdHd3azPRnsN14sSJM2fODGuV
5Dyht7d3uBtiu3QsH8sHAL90QmVlZdqlop5bvvfPcdTGUcjcLvj5aYuuGPIZkH6ds9O+7cc//rEM
/4//+I8R9J0AELSWH2nMdWC4n3cAlg8AI++EioqKDAaD2Wy2+/DYff/k/XMc5XV8fPzhS6gzgdG3
fE+eAenXOdv1be+88860adO0k5/o6Ggxfj7XB8DyQxGLA1g+lg8Ao9oJtbe3m0ym9PT0pqYmT/on
75/jaPdaw9byXT1V8YEHHti+fXtSUpKcSLS2tv76179OTEyUMbXHr7h5HKM3z4BUj3JcuXKljHzw
4EH1ryeffFJOkxwjcvV8Rzd924cffvjv//7vEqPjHTAnTZq0Y8cOyhUAyw/jrLB8LB8A/NUJieKn
paWJ7ov0u++fvH+Oo3otMl19kZ07d9pZvpunKk6bNk3WUz1QxmAwyMnJPffco51LuJnQm2dAao9y
fOmll2Tyn/70pzK8t7dXTmzULaLtcPV8R6e7oKWl5cEHHxw7dqz7u93LtnzwwQcULQCWj+UDlg8A
w+6EampqRJ2Liorc9E/eP8dRvRavVbdrsL1tnLJ8N09VTEpKOn36dH9/f1RU1Lx58+TFiRMnZIQ1
a9a4n9CbZ0Bqj3KUswgx++uuu05ei/HPnz/fVUpO5+x0F8jJgIePtZJNXrp06bFjxyhdACwfywcs
HyDMW0afPw9VtHjy5Mnyr87OTqcL9f45jheGumLHzVMVMzMz1cjyX5PJJC9kQZrle/I4Rm+eASnk
5uaqby2MRuMrr7zijeVLwpcBQPhCJ4XlY/kAEJjPihwnb2hoSElJWbVqlZs5e/8cxyEt381TFd1b
viePY/TmGZDC22+/LaPJBsp2ublvndPnOzrdBbJFM2fO1G6k7QpZ4nXXXSe7Rl1SBQDh3T5j+YDl
A9CL+GbylpaW7OzsrKwseTHknL1/jqO6DObdS3z66ae2Pu3mqYruLd+TxzF68wzICxfvIqqusVm/
fr2biJw+39FV3/biiy+K6C9atCg6OtrR7w0Gw0033aTtGgDA8rF8wPIB6EWGnlxMPT8/PzU1tb6+
Xrtpo/s5e/8cR7v75aenp9v5tKunKrq3/AsePI7Rm2dAKiQu0fGjR4+6icjp8x3d9G2ffPJJUVFR
UlLSvHnztFjkdOXf/u3fZK1sdw0AYPlYPmD5APQiQ0y+ceNGg8FQWVk5rPvlK7x/juOQjOypisOd
cLjPgFy2bNl999033Dl70rc1NzenpaV973vfk9MhOfORsxHHXQMAWD6WD1g+AL3IEJOXlJQ4tWH6
J1fs2LFDwrG9C77nz3f0pG8Tpy8vL3ezawAAy8fyAcsHoBdxh5uP4emfXPHBBx/s3r3bdojnz3f0
vG/j83sALB/LBywfgF6E/om+DQCwfFpCwPIB6EXon+jbAADLpyUELB8Aywf6NgCgFaUlxPIBAMsH
+jYA2megJcTyAQDLp28DANpnWkLA8gEAy6dvAwAsn5YQsHwAehH6J/o2AMDyaQkBywfA8iEQfduw
broPAFg+lg9YPgC9CP1TsPdtNTU15A9A+4zlA5YPQC9C/xQmfVt7e3tWVpbJZCJ/ANpnLB+wfAB6
EfqncOjbSktLDQZDXV0d+QPQPmP5gOUD0IvQP4X2Hmxtbd21a9f8+fOLiop6e3vJH4D2GcsHLB+A
lhHLD20yMjLy8vLS09Obm5vJH4D22RM6OzsDMq3/5ozlY/kAgOWHFdIpSs5z5swZGBhwzH8EEClA
ULXP/iAjIyMg0/p7zlg+lg8AWH647cHly5ffdNNNBw4c8DJ/dhlA0PKFj2htbc3Ly7vhhhtuu+22
0Zx2NOdMtWD5AIDlh8kelF7txRdfTExM3Lhxo/ahPpYPgOU7smvXrvT0dKvVunTpUmk3Rm3a0Zwz
1YLlAwCWHz6WL3zyySdFRUVGo3HPnj1YPgCW75T58+er3/B8+eWXiYmJH3300ehMO5pzplqwfADA
8sPK8lXfJj1lWlraqlWrsHwAsKO0tLSoqMjVW/9NG6g5A5YPgOVj+eFj+YLVai0vL8fyAcCW9vZ2
g8Gg3WxXtRUpKSl2t+fy+bSBmjNg+QA4IpYfbpavsFgs7DIA0MjOzlaPzLOlqakpLS1NxNp/0wZq
zoDlA+CIWH54Wj67DAA0ampqTCaT03/l5+eXl5f7aVr/rRVg+QCA5WP57DKAiMZisQx5s/muri6f
T+u/tQIsHwCwfCyfXQYAvjzY6TUAywfA8mGUwPIBAMsHLB8AAtCqfgGjAl0sAGD5gOUDAJaP5dPF
AtDyY/mA5QPQ1mP5WD4AYPlYPmD5AFg+UAwAgOXTBGH5AIDYAcUAAFg+YPkAQKsKFAMAYPmA5QMA
rSpQDAAc7Fg+YPkAtPVAMQAABzuWD1g+AGIHFAMAYPk0QVg+ACB2QDEAAJYPWD4A0KoCxQAAWD5g
+QBAqwoUAwBg+YDlA9DWA8UAABzsWD5g+QCIHVAMAIDl0wQBlg+A2AHFAABYPk0Qlg8AtKpAMQAA
lg9YPgDQqgLFAABYPmD5ALT1QDEAAAc7lg9YPgBiBxQDAGD5NEGA5QMgdkAxAACWTxOE5QMArSpQ
DACA5QOWDwC0qkAxAACWD1g+AG09UAwAwMGO5QOWD4DYAcUAAFg+TRBg+QCIHVAMAIDl0wRh+QCA
2AHFAABYPmD5AECrChQDAGD5gOUD0NYDUAwAHOxYPmD5ALT1QDEAAJZPEwRYPgBiBxQDQEDLG2wJ
TssHn+wjLB8AEDugGIDyBpIkZywfgHYEKAYAypskgZyxfADaEaAYAChvkgQsHwBoR4BiAKC8SZKc
sXwAoL0GigGA8iZJcsbyAYD2GigGoLyBJMkZywegHQGKAYDyBpIkZywfgHYEKAYAypskAcsHANoR
oBgAKG+SBCwfAGhHgGIAoLxJkpyxfACgvQaKAShvIElyxvIBaEeAYgCgvIEkyRnLB6AdAYoBgPIm
SSBnLB+AdgQoBgDKmyQBywcA2hGgGAAob5IkZywfAGivgWIAypvyJklyxvIBaEeAYgCgvIEkyRnL
B6AdAYoBgPImSSBnLB+AdgQoBgDKmyQBy4cwrWDwHNoRoFMBoLxJkpyxfKCCyYq9ABQDAOVNkuSM
5QMVjOUDZQZAeQNJkjOWD1Qwlg+UGQDlTZJAzlg+0FKMPKsvLsFeAA5JAMqbJMkZywcqGMtnLwDF
AJQ35U2S5IzlAxWM5QOHJADlDSRJzlg+0FJg+cAhCUB5kySQM5YPtBRYPnBIAlDeJAlYPlDBWD6W
DxySAJQ3SZIzlg9UMJbPXgCKAYDyJklyxvKBCsbygUMSgPIGkiRnLB9oKbB84JAEoLxJEsgZywda
CiwfOCQBKG+SBCwfqGDA8oFDEoDyJklyxvKBCsby2QtAMQBQ3iRJzlg+UMFYPgDFAJQ3kCQ5Y/lA
BWP5wCEJQHmTJJAzlg+0FFg+cEgCUN4kCVg+UMGA5QOHJADlTZLkjOUDFYzlsxcgWIrhMhgtgrOQ
wCd7kHBIMqT3AnqBUgCWD+Fp+eQZsa0oe99XWZEkSYb0XiB0qgGwfMDyActnD5IkSWL5QP+E5bMX
AMsHLB83BZLE8iEM+qfz5893dHR0dXUF7bY0NjZ+/PHHWD6gjNRVeJz/s/eDYQ+SJEmG9F4gdJRi
CM6cObNp06aYmJioqCgZPy4u7uWXXx4cHFT/vfLKK9WvPWbNmrV69epvvvlGG56QkGA7n3vuuUdG
k1nZzf/s2bM+2RZZ4oIFC4Y1ybAWjeUDlg9YPm4KJInlQ/goxSOPPCKjlZSUnD59+uTJk2vWrJG3
zz77rObWCxcubG5ufuyxx2T4hg0bnFq+2P/48eOvuOKK5OTk8+fPa8OXLl06Y8YMn2zLsWPHvv76
a8/HH+6isXzA8gHLx02BJLF8CBOlOHLkSFRU1OTJk8+dO6eG9PT0jBkzRny9r69P2fzNN98sL955
5x2Z23PPPefU8l9++eV58+Y9/vjjMk5jY6MaKKcKov7R0dFGo3Hv3r0ypK2tLTs7e8qUKYmJicXF
xQMDA5qRP/DAA9u3b09KSpKTitbW1l//+tcyzq233trR0aGNc9ddd2mrfeedd06bNk2v169du9Zx
uxwXjeUDlg9YPnuQJEkSy4dIUYpdu3bJOGLMtgMXLVokA5ubm5XNz549e/PmzfI3Jyfn1KlTTi1/
8eLFW7Zs+fzzz2XCNWvWqIEi67NmzZo6daospbu7u7+/X8Q9Li6uurq6sLBQxpS/2txE2dPS0u6/
/34ZbjAY0tPT1SVAtucV6oodmU9ycvLEiRPXr1//5ptvVlZWOm6X3aKxfMDyActnD5IkSWL5EEFK
sXXrVnW5ju1Apdr19fXKrSdMmBATE6PX681ms3Y1jq3lS6lFRUWpX+4uXLhQRlbfAwjXX3/91Vdf
rV7X1tbKbAsKCi5c/LGv6P748ePVdwgyt6SkpNOnT4vBy6zmzZsnL06cOGF7zqBZ/u9//3sZLivp
ftNsF43lA5YPWD57kCRJEsuHCFIKdR3OsmXLbAfOnj1bBra2tl64dMXOwMDACy+8IAPXrVvnaPmP
P/54bGxszkUMBoOMVl1d7ajazz//vPxL++g9MzNT3h49elTNTd6q4ePGjTOZTPJCFurU8svKymS4
nJ/43PIj6tmZgOUDls8eJEmSxPIhbJXCYrFMuIh2ifxXX30lU02bNk0N0a7LHxwcnDJlSlxcnPo4
X7N8q9U6ffr0u+++e91FHn300bFjx956662aast/1euamhqZ8xNPPKHeJiUlRUVFfffdd8O1fDWf
Bx98cEjL1xYdun05YPl4HpbP3sdNSRKwfBjJUffUU0/JaCtXruzq6mpra7v99tvl7bZt2zS3NhqN
b731lhi8DF++fLk2XFn+H//4R9tLdITs7OwxY8aoX83KbGWqpqam/v7+7u5uOU+YPXu2lKbZbJbh
2q9ph2X5aj46na66ulrW+YMPPnC6XbaLxvIBywcsnz1IkiSJ5UNkKcW5c+c2b94spq6uLZk2bVpV
VZX2X+1++VOnTr333ns/++wzO8u/5557HnjgAdsZKoNXN87fu3fv3Llz5e327dvl7XvvvZeYmChv
5TTgjjvu0O6MOSzLV/NJSkpSK5aRkeF0u+wWjeUDlg9YPnuQJEkSy4eIU4rBwcG2tjY/Pfu2u7vb
9ib6x48f9/Dz9SFne/r06WEtGssHLB+wfPYgSZIklg8oRWgw14Gw6cuBQxLPw/LZ+7gpSQKWDxF6
1FkcwPIBywcsH3BTksTyAaUALB+wfMDycVMgSSwfsHz6cgAsn5aBth03JUmSxPKBnoC+HADLp2Vg
7+OmJAlYPtAT0JcDhyRHNy0Dex83JUnA8oGjDssHLB+wfMBNSRLLB5QCsHzA8gHLx02BJLF8wPLp
ywGwfFoG2nbclCRJEssHegL6coAAWf758+c7Ojq8eaa1zOHYsWMHDx4c8jnTo0ljY+PHH3+M5fsE
74skgHsnCN3Uyzxl8sOHD3/zzTdhZvk+L7OR1VuwFSeWD1g+lg8cksOe1ZkzZzZt2hQTExMVFSVj
xsXFvfzyy4ODg+q/V1555WUXmTVr1urVqzWlkOEJCQnqdX9//89+9rNJkybJaGPHjpW/er0+SGKU
9VywYIH29uzZsyOYiYdThbHle18ktq9d7Z2RhT+yfRpYy/c+z5///Ofx8fFq8ltuuSV4MvQmSZ+U
2WWXuPrqq7Ozsz/77DPP6802B0+KM7AMudewfMDysXyIdMt/5JFHZISSkpLTp0+fPHlyzZo18vbZ
Z5/VurqFCxc2Nzc/9thjMnzDhg2OPevatWvlXzKf48ePW63W/fv3v/baa0ES47Fjx77++mv1eunS
pTNmzBjuHDyfKowt3/sicWr5tntnZOGPbJ8G3PK9zPPw4cPiwfn5+QMDAx0dHe+//37wZOhNkj4p
Mzn5kXz27du3bdu2SZMmjR079q9//asn9WaXw5DFGVg82WtYPmD5WD5EtOUfOXJEdGHy5Mnnzp1T
Q3p6esaMGXPFFVf09fWpXvPmm2+WF++8847M57nnnrPrWdva2mT8+fPnnz9/3nH+8t/s7OwpU6Yk
JiYWFxeLlGhd1AMPPLB9+/akpCTpuVtbW3/961/LOLfeeqtYizbOXXfdtWPHDhknOTn5D3/4g/t5
fvTRR9///vfV8JqaGtuZyAtxhfHjx0dHRxuNxr1798qQAwcOyOJiY2MXLFjg6qt5x6ki0PK9LxJX
lq/tHae7b8jwh7V3gsfyvc/zz3/+swxfsWLFd999Zztnx5Ie/QxHnKQ/yqyyslLG/K//+i/HepPF
3XnnndOmTdPr9WvXrnXMwXZkN+2YjLNr1645c+YsWbJE+97A1fAf/vCHN9xww9SpU++++24tGTXy
q6++OmvWrJUrV8oKHDx4UP3rySefLCoqGvFew/IBy8fyIaItX/oh+a/0drYDFy1aJAObm5tVrzl7
9uzNmzfL35ycnFOnTtn1pjt37pSRH330UceZ9/f3S6cYFxdXXV1dWFgoo8lfbXLpX9PS0u6//34Z
bjAY0tPT77nnHrvOW7pVGZ6fnz9u3Lirr77a/TxvvPFG6VNFFt966y35q81Efe0uJxLSiUr/Kpvc
3d397bffzpgx4/rrr9+9e3dqaqrMU7swwBa7qSLT8r0vkgseXLHjuPuGDH9Yeyd4LN/7PM+ePSuH
jIwv6f3tb39T/3Va0qOf4YiT9EeZiZ3L5ElJSXb1Js1IcnLyxIkT169f/+abb8rJgGMOtiO7acd0
Ol1KSkpBQYEMz83NdT/8kUcekfm/++67l19+uTR9tg2dbJScbLz00ksy/k9/+lMZ3tvbK2c4e/bs
GfFew/IBy8fyIaItf+vWreorctuByrzr6+tVDzRhwoSYmBi9Xm82m7UP7LXe1OkcFLW1tfIv6ecu
XPxFnXST48ePVx/UyeTS9Z4+fVp60KioqHnz5smLEydOyPhr1qzRFjFz5kxxF22VDhw44Gqe8loU
R16/9tprtr5u65EiQOpUQZCuXebzzDPP9PT0bNiwQV7v37/faUS2U0Wm5XtfJENavqvdN2T4nu+d
4LF8n+Qpjnv77bfLJKKqv/nNb9yU9ChnOOIk/VRm48aNk1bCrt5+//vfy2w1z3aagzay+3ZM3qoL
e6666qprr71Wm9bpcIV4uZyGadfbyMiyRRaLRZ2qidlfd9118lqMf/78+d7sNSwfsHwsH4K3zHyI
q6Wo776XLVtmO3D27NkysLW19cKlb8kHBgZeeOEFGbhu3Tq73nT37t0yfPHixY4zf/755+VflZWV
6m1mZqa8PXr0qJpc3mrdsMlkkheyFDvL1wS9tLRU/vW73/3OzTz/+Mc/Tp8+Xd4uWbLk73//u3vL
V/ORRU+4hNPPzLB8nxTJBQ8+y3e6+8LS8n2SpzJOEUGpYTlPFnF0VdKhYvn+KDP1wUFaWppdvZWV
lclwOa/wxPLdt2NaAc++iGNha8Pl9HX9+vXz5s27/PLLtS8nLzj8zDc3N1d9fWE0Gl955RUsH7B8
LB9ghEe3xWJRQqBdafrVV1/J+NOmTVNDtGthpYuaMmVKXFyc+ghN602PHTsmPZZM4njnu5qaGhn+
xBNPqLdJSUmiI+pK4uFa/urVq1XP6mae6gPOjRs3yghqnR0tXzxSvX7jjTdktOLi4iHTs50qMi3f
+yK54Nk9dhx335Dhe753gsfyfZKnxoMPPijT1tbWuirpUc5wxEn6o8yUoP/kJz+xqzfVjEh0bnKw
G9lVO+a55f/2t79VVyRardaMjAxXlv/222+rQ0A2sL+/35u9huUDlo/lQ6Qf3U899ZSMsHLlStH0
trY2dRnAtm3btB7IaDS+9dZb69atk+HLly937E2l/5N/3Xjjje+//750zDLyww8//O2333Z3d6vr
TeVIMZvNMo72azYPLX/SpEnV1dUffvihvJ47d+6Fi192O52nLO7VV1/9xz/+cfLkyalTp952222O
Pahso4zf1NQkfaeMJvYg/ejrr7/e0dFRV1cn2+40H9upItPyfVIk6rKEdy+xb98+273javcNGb7n
eyd4LN/7PPfu3bt9+/YjR44cOnRIxhwzZkxra6urkh7lDL1J0odl9vvf/15GGzt2rMh0b2+vXWug
mhGdTifNiyzrgw8+cMzBbmRX7Zjnlv/KK6+oLxB27twpgh4TE/PPf/7T0fLPnDkjh4CMuX79ejcJ
e7LXsHzA8rF8iPSj+9y5c5s3b5YuR13bI6JQVVVl+4GWGi4dz7333qvdLMK2Zz179uwvfvGL2NhY
NebkyZNvueUW1YG99957iYmJMlBE5I477tDuTOeh5c+cOXPJkiUyUPpjdW2uq3meOnVKBsoQ6ddv
uummTz/91LG7FTeSUwWZUAxJ3srJg0iDWmfpdF3djtBuqsi0fO+LxPZG5tpFFNrecbX7hgzf870T
VJbvZZ47duxQI0hiixYt0i4mcVrSo5yhN0n6sMyioqIMBsMPfvCDEydO2M5Baw2kGUlKSlIjZ2Rk
OOZgN7Krdsxzy5ciX7hwobom7aGHHpIXq1evvuDsxvz5+fnR0dHqoiBXeLLXsHzA8rF84Oj+F4OD
g21tbV4++7azs/PIkSOOt9Q8fvz4CD4mtP0sTbu5nvt5yhDtzhuukLnZrqGMr33U5/lUkWb5vioS
97jafUOG78neCSrL9z5Pyerw4cNO76/itKRHLUPvk/R3mdlttd2zut3kMLJ2zA7tYV5y+nHmzBmn
4yxbtuy+++7zcP3d7DUsH7B8LB84uoOU0X/25FwHwqBlCJu23cu9E4SWH5YZRkiS/kN9UWN7F/wR
7zUsHzjqsHzg6A5S/vSnP/3lL38ZzSVaHMDygwcv9w5uOjoZYvle8sEHH+zevdsnew3LB446LB84
uiGcWwb2Pm5KkuwFLJ9qoACwfODoBiyfPYibkiSWD3jAJXp6eqxW68im/cc//jHi5R4+fHjE0x4/
fhzLB45uwPIBNyVJLB/Cvxo8JyYmJjExcd68efPnz584ceLIHsYpExYVFY34WZ7Tpk2Tv2PHjh3x
HK666irZhJkzZ45gJlg+YPmA5eOmQJJYPoQYX/z/+fDDD5XaXnfddbfccst///d/FxcXl5eXm83m
xsbG+Rf5YpgcPHhQFD89PV1mu3v37i+Gz7Zt26Kjo/V6/Y9+9KMRTH7ttdeWlpaWlZXl5OTEx8dP
mTJl+fLljz76aFVV1SeffOL5fKgWwPIBy8dNgSSxfAhJyxeqq6vT0tK0505r9PX1xcbGvvrqq8My
7E8//VT8vrCwUGa4ePHi7du3j0DTxc4rKio6OzvnzJnz/e9/v6WlZViT5+XlbdmyRdsQmY+ctMjZ
izqf2bVrF5YPWD5g+exBkiRJLB/C3PKFH/3oR+LldmOWl5cP94P8119/PTk5uba2Vs3h/vvvf/rp
p4er+J988sm0adPUUzPkTGP58uU33njjRx995PkcfvnLX/7nf/6n3ebs2bNnypQplZWVfJYPWD5g
+exBkiRJLB8iAqvVmp6eXldXZzvQYDDU1NR4Pofi4mKZSXt7uzZw48aNpaWlw10ZOUlQj6TW5pyf
n280Gm3n7J7Ozk69Xm/7o+GGhgadTldfX8++BiwfsHz2IEmSJJYPEURXV1diYqJm0mL8CQkJHt5d
R6bVrtKxHV5ZWVlQUDDcNcnNzS0vL7cbuGXLFhH3pqYmD2cSHx+vbcuePXsmTZr09ttvs5cBywcs
nz1IkiSJ5UNkIVo/ffr01NRUZeqZmZm2l7a7obm5WV3vXlNT09fXZ/uv+vr6FStWDGs1ZOmxsbF2
j3+Ts4jq6uro6GhZSktLiyfzycnJkUkuXPoUf+nSpVlZWY6/PQDA8gHLZw+SJEli+RC2lJaWxsfH
79mzp7i4uLCwUMRdVLu3t9fDyXt6eqqqqrKzs2Uqk8kkei1DZLgYuZw2DPdkQ7tcp7W1VVZs8eLF
er1eZi6LULP1BDlFKSgokC0SxRfRt1qteXl56enpns8BAMsHLB83BZLE8iFUEf0tKirSrnpXF+iL
rI/gShtBTgxWrlx5zTXXiFuvWLGivLx84sSJw5qDnCSI0MuZhsFgSE5Ovueee2688Ua7rwg8obGx
UY4Bu2vxZaNktnZfFABg+YDl46ZERJJYPoQVYs/i4pmZmbafcHd1dUn1eP5TVztSU1PVZ/nyV3xd
ZuX5dTKyPjJ+fHz8pk2b9u3bJ6ccMkRM3fNvFexm1dDQYDdcTmnk5GHEWweA5dOnsvdxU5IkSSwf
gprOzk4x8ry8PEcLH7EEt7S0iKPb/mZXBH1Ylu94RY3JZKqoqBjByhw6dMjp8PLycr1ej+gDlg9Y
PnuQJEkSy4dwo6mpSX1k7uFddDwkJyenpKTEt6uqvhPw+TxF9BsbG6kEwPIBy2cPkiRJYvkQJqjb
zmjPrvIVPT090dHRrj5BHzG9vb3D+imwh9TU1Mh5DqIPWD5g+exBkiRJLB/CgYqKiuTk5D179vh8
zqWlpbm5uf5Y5+zsbMc76HuPKH5CQoLnT/4CwPLpU7F83JQkSRLLh6DDarUWFhYajcbOzk5/zFyM
2fHXrj7BbDYP9777HtLa2mowGPxxCgGA52H57H3clCQBywe/09PTk5GRkZWV5fNLXxT19fWiy769
yt925XU6nZ9udd/e3h4fH19UVESRAJYPWD5uCiSJ5UMo0dXVJQqel5fnJwu/cPGimrKyMv9tQk5O
jv8+ce/s7ExJSSksLPRfPgB4HpbP3sdNSRKwfPAlzc3NCQkJpaWl/lPY9vb2kd3V3nNqamqysrL8
N39Z+czMTJPJRMEAlg9YPm4KJInlQ7BjNpvj4+P9/QPTgoKCvLw8vy6ir68vNja2q6vLf4sYGBjI
yMhYsWLFCB61C4DnYfnsfdyUJEkSy4dRoqKiQhS/ubnZr0sROdbpdD6/gaYjJpOpsrLSr4uwWq2y
lPT0dEQfsHz3+ONH/MG8XCwfNwWSxPIhWCgsLExJSRmFHlHOJUSLR2GLqqurMzMzR2FBRUVFqamp
FouFKoKQaOtHn4yMjIhariI4LR98sgcJhyRDei9g+RFEb29vTk5OVlbW6HwgLUIs/j062+Xvi3Y0
SktLk5OT29vbKScIIb4YFVpbW/Py8m644Ybbbrvti1EkUMt1JJL3fnhAkiQZZnsBy48UDh06ZDQa
8/PzBwYGRmFxLS0tCQkJo3Zrmuzs7IqKitFZlhJ92UCKCrB8W3bt2pWeni5H/dKlS1988cVR68kC
tVwsHzcFksTyIfA0NzfHx8ePmgdfuPi725KSklFbnNlsHp2rgxQ1NTWj8MMGgNDyvPnz56uD4ssv
v0xMTPzoo4/Ce7lYPm4KJInlQ4Cpra0VJa2vrx+1Jfb09ERHR4/mj+FkibGxsX56PJZTGhoaJFU/
PdMXIOQoLS21fYSc3dvwWy4AQPCD5Yc5W7ZsGf3ryMvLy0f/BvN+fTyWU/bs2aPX60fz9AkgOJEW
xmAw2D4Zw2q1pqSk+Pv7rkAtFwAAy4cAU1xcnJqaOso3mJNeVs4rxIBHeWNramqys7NHeaHq5wdV
VVUUG0QycujV1dXZDWxqakpLS/Prj3MCtVwAACwfAkZfX5/JZJIucHR+a2tLQ0ODwWAY/S5WNnni
xImjc6cdW2SJsr3+vmE/QNAiJ9iuvrvLz8/33zdsgVouAACWDwGjvb3daDQWFxcH5NOsnJycQCmv
dPkB6do7OztTUlIKCwv5+BAiDYvFMuTNm/1x7h2o5QIAYPkQMBobG+Pj4wPl2XKCodPpbC+THU2q
qqqysrICsmjZ5LS0tFWrViH6EOmdSoCecMmTNQEAsPxwpq6uThS/qakpUCtQUlJSUFAQqKWP5uOx
nC49MzMzIFdJAWD5WD4AAJYftmzZsiUhISGAN5cQu9XpdIcOHQpgCCLZAbxE3mq15ufnp6enj87T
hQGwfCwfAADLD3MKCwtTU1NH84bxjlRXVwfqghnbdcjMzAzsOhQUFAR8XwBg+QAAWD6ENgMDAyaT
KScnJ+CfH4tei2QHdh0Ce9GORmlpqdFoHOXHFABg+QAAgOWHCSK1GRkZJSUlAf/RZ2tra3JycjD8
9jSwF+1oBOR5ZABYPgAAYPkhT1NTk3hkaWlpMKxMfn6+eG0wrEkwXLSjMJvNer2+sbGRWgUsH8sH
AMDywSPq6up0Ol19fX0wrExPT4+sTJDcnTpILtrRdpOIfpDsJgAsHwAAy4egpqKiIrC307GjsrIy
JycnePIJkot2FPv27ZOdVVNTQ90Clo/lAwBg+eCSkpKS1NRUi8USJOtjtVoNBkMAb9LvSPBctKNo
aWnR6/XBc+IBgOUDAGD5EET09fWZTKZVq1YF1e3Y6+vrjUZjUAUVVBftKDo7O5OTk8vKyihjwPKx
fAAALB/+P8RZ09LSioqKguE+NrbIiUcQymtQXbSj7UE5HZI9SDEDlo/lAwBg+fAvWltbDQZDRUVF
EJ57xMfH9/b2BtuKBdtFOwoJStYqPz9/YGCAqgYsH8sHAMDyI5r6+nox6eC8T8umTZuKi4uDcMWC
8KIdRV9fn4h+VlYWog9YPpYPAIDlRy7l5eV6vT54bqdji3iqrFtra2twRheEF+1oueXm5qanpwfV
7ysAsHwAACwfRomioiJxwSD8QFphNpszMjKCNr3gvGhHYbVa8/LyUlJSOjs7qXPA8rF8AAAsP1JQ
t9PJysoK5o97xaHr6uqCdvWC9qIdjbKyMqPR2N7eTsEDlo/lAwBg+eGPup1OcXFxMF+63dTUlJyc
HGw3/LEjaC/a0aioqJAYW1paKHvA8rF8AAAsP5xpbW0V7Qv+JygVFBRs3LgxyFcymC/a0airqwva
n14AYPkAAFg++ID6+noRvtra2iBfz56eHp1OF8wXwyhkDWNjY4PwRp+O+z0+Pt5sNnMIAJaP5QMA
YPnhRnl5ucFgCNpb1titqslkColUMzMzq6urg389W1paRPRramo4EADLx/IBALD88KG4uDiEfogp
ZyONjY0hsaqVlZXZ2dkhsapK9OUMisMBsHwsHwAAyw95BgYG8vLy0tPTe3p6QmKF6+rq5IQkyH93
qxEqF+0oOjs7U1JSNm3axHEBWD6WDwCA5YcwYp+ZmZmFhYUh9CRUk8lUUVERQiGHykU7CovFkpqa
WlRUxNEBWD6WDwCA5Yck7e3tBoMhtD64lXXW6XSh8tG4IoQu2tHO/bKysvLy8kLo3A8AywcAwPLh
XzQ2NiYnJwf/7XTsKCkpKSwsDK11Dq2LdhR9fX1yZmIymRB9wPKxfAAALD9kMJvNCQkJIXeLdDFO
vV4fio9wCq2LdhRWqzU/Pz+EfrABgOUDAGD5EU1paanRaAyJO2baUVlZmZGREYqZh9xFOxoFBQUp
KSmdnZ0cOIDlY/kAAFh+kDIwMLBq1SrRzb6+vlBcf1H8EH1yUyhetGN3WojoA5aP5QMAYPnBiLqd
Tm5ubohead3Y2JiQkBAqN9B0JBQv2tHYsmVLcnJyKH7/A1g+lg8AgOWHM+p2OhUVFaFryXJ+EtL3
cQ/di3YUDQ0Ncpa1b98+jibA8rF8AAAsPyhobGyMj48P3Q+SL1y8ibtOp5O/obsJIX3RjqK5uVmv
14vuc0wBlo/lAwBg+QHGbDYnJyc3NTWF9FaUlpbm5OSE+r4I6Yt2FPv27ZMzxpqaGo4swPKxfAAA
LD+QcpySktLe3h7SW2G1Wg0GQ2NjY6jvjlC/aEdx6NAhOW8MrccPA5aP5QMAYPlhgpixyWQKj5ud
m81m2ZAw2CldXV0TJ04M6Yt2FBaLJTU1tbCwMHR/5gFYPpYPAIDlh6SEiRbn5eWFh4SFwYUuGitW
rKisrAyDDZFzFamxVatWIfqA5WP5AABY/migLqgoKSkJD/2SzdHpdCF6909HRPGzsrLCY1tkp5hM
puzs7LDZO4DlY/kAAFh+kGI2m8Psx5EFBQVyxhI2m2OxWMLjoh2NnJycjIyMcNoiwPKxfAAALD+4
2LJlS0JCQnNzc9hskbijXq8P9V8P22EymcLmAqQLF38BkpeXl5qaGtL3OQUsH8sHAMDyg5SCggKj
0djZ2RlOGxUeN6Wxo6KiImwu2tHYuHFjcnJymJUfYPlYPgAAlh9IBgYGVqxYkZmZGQa307FDzlvC
7xlMspvC7KIdRVlZWXx8fEtLC4ckYPkAAECz6C2dnZ0pKSl5eXnh9wvIffv2GQyGsLyFi8lkCo87
7dhRW1sroh/qj2ADX1lvoAi/7aWcAADLjzhaWlpEqrZs2RKWWyenLuG6aVVVVRkZGWG5aQ0NDXq9
Pvy+gYERWC8hkCQAYPkwcp0Kp9vp2NLb26vT6cL1zi0WiyU2NjZct665uTnMbvQEuClJAgBg+aNE
+N1Ox3ED8/LywngPhtOjvhxpaWmRU9CKigoOVdwUSBIAsHzwlLC8nY4tVqvVYDDs27cvjHdiOD0e
yynq6WylpaUcsLgpkCQAYPkwBGF8Ox1bGhoa0tPTw3tXWiyWML4kSSEnonK2VlhYGJY/oQbclCQB
ALB8n3lhWlpaQUFB2DtTdnZ2GF/NoiEnbGF5px1b5HRUTthWrVqF6OOmQJIAgOWDE8L7djq2tLe3
6/X68LsxqCOi+OF6px1bZFdmXyQS9ingpiQJAIDlD4Pwvp2OHSUXiYQtDdfHYzlitVpXrVolpzSR
sLGAm5IkAACW7xGVlZXhfTsdWwYGBuR8pr29PUJ2rslkioRrkxT5+fmpqakWi4WDGjcFkgQALD/S
Cfvb6dghypubmxtRp3DhfacdOwoLC5OTkyOnnnFTIEkAwPLBngi5nY4d6enpjY2NkbO9fX19Yfx4
LKeUlZXFx8e3tLRwjOOmQJIAgOVHHJFzOx1b9u3bZzQaI21fm0wms9kcUZtcXV2t1+ubmpo40nFT
IEkAwPIjCPU4oUi4nY4deXl5YX9nSUeqqqoyMzMjbavr6+tF9BsaGjjecVMgSQDA8iMCdTsdcaBI
2/De3t74+PgIvAeLbHKkXbSjaG5ulj0eIXeOwk2BJAEAy49oIup2OnZs2bKloKAgMvd7hDwFzJGW
lhY5p62oqODYx02BJAEAyw9bSkpKIup2OnbIth86dCgyt91sNkfUnXZskZ0uZ7alpaW0ALgpkCQA
YPnhhrqdTnZ2dl9fX2Qm0NjYGIHXpmv09PTodLqIupmSLXJmK+d4EfIoNNwUSBIAsPxIITJvp2NH
bm5upN1nxg45zYvkK9TlDEdO8/Ly8iL5KMBNgSQBAMsPHyL2djq2dHV1SQgRrneReacdW/r6+rKz
s00m08DAAC0jbgokCQBYfgjT2NiYkJAQgbfTsWPTRSI8hN7e3okTJ0bsRTsKdelaVlZWBN5xCDcF
kgQALD9MiOTb6dhitVqTk5O7uro4GEwmUwQ+LsCxHvLz81NTUyP8hAc3BZIEACw/JCkpKRGPsVgs
VIDZbM7NzSWHCxefCJudnU0OwsaNGzlAcFMgSQDA8kOJgYGBVatWRfLtdOzIzMxsbGwkhwsXL0yP
jY3lM2zFli1bDAZDxN5YFjcFkgQALD+UsFgsGRkZEX47HVsOHTpkNBrJQcNkMlVVVZGDoqKigqva
cFMgSQDA8oOdlpYWUZbIfMSpK/Lz85FaW2pra3NycshBQ44XnU7X0NBAFLgpSQIAYPnByJ49e+Lj
4yP8lvB2DAwMSCbcTcWWvr4+vV7PrSRtqampEdGX8x+iwE1JEgAAyw8uKioqRGdbW1vZ33axFBQU
kIMdOTk5nA3a0dLSIic/iD5uSpIAAFh+sGC1WgsLC9PS0trb29nZdhiNRs58HGloaMjKyiIHR9GX
U2U5MyQK3JQkAQCw/AAzMDCQm5ubnp7ORSmONDY2yskPOTg9MxSd5U47jsg5ocFg2LhxI1HgpiQJ
AIDlBwyLxSJ+X1hYyDXWTsnOzua6FFesWrWKHyU7RU6Y5eQwPz+fu1ThpiQJAIDlBwB1dcGmTZtw
EVenQJIPTwxwRW1trclkIgc3oi8nQhxcuClJAgBg+aNKY2NjcnLynj172LuuKC4ullMgcnCFnP/o
dDou9HIj+llZWdnZ2XxRhpuSJAAAlj9KlJeXi+LzMFc3WK1WvV7f1dVFFG5YsWJFXV0dObipory8
vIyMDM6FcFOSBADA8v1OQUGBKL7FYmG/uqG6ujo7O5sc3FNfX8+ddoYkPz8/NTWVIw43JUkAACzf
X/T19eXm5q5YsYJbowxJSkoKjzIdEqvVqtPp8NchKSkpSUhI4E61uClJAgBg+b6nq6srLS0tJyeH
q4SHpLm5WZyM3016gpw3cqcdT5CUDAbDoUOHiAI3JUkAACzfZ7S3t4u2lpWVYa6eUFhYyP3OPcRs
NnOnHQ+prq7W6/VyDkkUuClJAgBg+T6gtrY2Pj5eDIMd6Qk9PT06nY7f3XpIb2+vxMX9Rj2krq5O
RJ+LwXBTkgQAwPK9ZdOmTQkJCa2trexFDykrK8vNzSUHz1mxYgXPDvOcxsZGOesmMdyUJAEAsPwR
YrVa8/PzU1JSuBR4WBgMBj5qHRY1NTUi+uTgOS0tLXq9nt8z4KYkCQCA5Q+bvr4+Ea/09HQupRgW
9fX1qamp/HphuOeTXOM0XNrb25OTk8vKyogCNyVJAAAsfxgCYTAYCgoKuJ3OcDGZTIjXCOBOOyOg
s7NTRL+kpISzStyUJAEAsPyhaW1t1ev1xcXFqMNwOXTokE6n4zGlI6C2tpaHiI0Ai8WSlpYm50ic
kOOmJAkAgOW7w2w2x8fH19fXs89GgJwaFRQUkMMI6Ovri42N5QRpBIjfywlSTk4Op+W4KUkCAGD5
zlG302lsbGSHjQB1R0juZT5iRFW5b8zIEL9fsWKFBMgn+rgpSQIAYPn2llBUVGQwGCwWC3trZJSX
l2dmZpLDiKmqquJOO94cwvn5+RkZGT09PaSBm5IkAACW/y96e3uzL4Lie0NaWhoPDvOGgYEBnU5H
EXrDpk2bjEZjZ2cnUeCmJAkAEOmWL0JgMBhMJhMX9XpDQ0NDfHw810t4SW5ubkVFBTl4Q2lpqZQi
z7jATUkSACCiLb+lpUWEQLQAxfeSVatWlZSUkIOXmM3mrKwscvCS6upqOa737dtHFLgpSQIAjELz
BcOAKEmSJCMwSZpRKjacKhYAIsjyg3C1gvPGhUP2T0G4zsF5rQ5JRnKSoXh081lvJFdsKCYJABC8
lk//RJJAklg+FQtkBQBYPv0TSZIkSWL5VCxJAgBg+SMK64tL0OaSJElGTpL4FhUbThULAFg+0D+R
JEmSJJZPxWL5AIDl0z8BSZIklg9ULJYPAFg+/RNJkiRJYvnkTJJYPgBg+fRPJAkkieVTsSSJ5QMA
lk//RJIkCVg+FUvFAgCWD/RPJEmSJInlU7FYPgBg+bSq9E8kSZJYPlCxWD4AYPn0TyRJkiSJ5ZMz
SWL5AIDl0z+RJJAklk/FApYPAFg+/RNJkiRJYvlULBULAFg+0D+RJEmSJJZPxWL5AIDl06rSP5Ek
SeJM5EzFYvkAgOXTPwFJkiSWT85ULJYPAFg+/RNJAkli+VQsYPkAgOXTP5EkSZIklk/FUrEAQENB
q0r/RJIkSZJYPhWL5QMAlk+rSv9EkiSJM5EzFYvlAwCWT/8EJEmSWD5QsVg+AGD59E8kCSSJ5VOx
gOUDQIAs//z58x0dHV1dXb5aoaeffvpXv/pVZPZPPg8zaPP3VZLBkBg1OeQcjh07dvDgwdOnTwfz
YR4Qyw9sATc2Nn788ceOw7ds2fLLX/4yhCrWyxhl8sOHD3/zzTde5oblA0D4WP6ZM2c2bdoUExMT
FRUlM4mLi3v55ZcHBwfVf6+88srLLjJr1qzVq1drDagMT0hIsBtH8ZOf/EQNXLBgwchW6ezZs16O
ECij8mGYduMMlyHz9zJDXyXpk8S01/7eXt+WcajUZH9//89+9rNJkybJaGPHjpW/er3ewzLza2jB
YPlexnvNNddc5owvv/zS+4N99uzZycnJfkrbtxXrfZX+/Oc/j4+PV5PfcsstnmxjkFQvlg8AfrT8
Rx55RKYtKSk5ffr0yZMn16xZI2+fffZZrR1cuHBhc3PzY489JsM3bNjgVLOkeT18iRMnTnjTgC5d
unTGjBnejBBAo/JJmE7H8a3le5+hr5L0SWJDWr6vtteHZRxCNbl27Vr5l8zn+PHjVqt1//79r732
mj88ySe7aZQt38t4Ozo6VLO5ePFiGaGlpUW9HZYvHjt27Ouvvx6u5XuZtm8r1ssYJTHx+/z8/IGB
AYn0/fff92QbXeU2ytWL5QOAvyz/yJEj0jhOnjz53LlzakhPT8+YMWOuuOKKvr4+1YzefPPN8uKd
d96RRTz33HMeapZt99/W1padnT1lypTExMTi4mJpiNXwH/7whzfccMPUqVPvvvtu1cBJsz5+/Pjo
6Gij0bh3717HFR5yhAAala/CdDqOqwxdDXejXz7J0CdJ+rz8pKO96667du3aNWfOnCVLlnz22WdO
t/fAgQO33nprbGysRKR9Zf/pp5+mpqbqdLqCgoI777zztttuU8OdjqwW9Oqrr86aNevzzz/3PuGg
rUkpMBl//vz558+fd3UyaRe400PbH6EF3PJ9UsCKjIwMGeHbb7/VhjhmKPFKOD/4wQ/ktcVikeTf
ffddLVg11V//+ldVxjL5zJkzXVm+92n7sGK9j/HPf/6zDF+xYsV3333nZhvtKtA2N6dNh5tmwYd5
YvkA4C/Ll0ZNJpTGy3bgokWLZGBzc7NqRmfPnr1582b5m5OTc+rUKaeaFR8f336Rzs5OO8vs7+8X
AY2Li6uuri4sLJQ5y1/t8xtZAemoLr/88vvvv1+GtLa2SvsrHZsM7+7udlzhIUcIoFH5KkzHcVxl
6CZbN5bvkwx9kqSvErN9Lf1xSkqKdMkyk9zcXMftFZGaMWPG9ddfv3v3bum/JUB1YcDcuXPlZKmq
qurHP/7x2LFjlR65GlkWJCPLWq1du7ajo8P7hIO2Jnfu3CkjP/roo07n7zRwp4e2P0ILuOX7pIBd
Wb7TDO+55x4ZTYavXLnye9/7njr1sj3YpVbljPR//ud/Hn74YVFnV5bvfdo+rFjvYzx79qzBYJDx
JYe//e1vrrbRrgJtc3NVyU6bBd/mieUDgL8sf+vWrep7UtuB0qPIwPr6etX2TZgwISYmRq/Xm81m
7fM8V9fl2w5UDWhtba0Ml6bzwsVfR4mSjh8/XvvMRpA2UfxJ+65Teqmrr77azToPOUKgjMpXYTqO
4ypDN9m6v5TC+wx9kqSvErN9LSGob+Gvuuqqa6+91nF733zzTZn/M88809PTs2HDBnm9f/9+6fLl
RV5enhpHu9TB6chqQbJKFovFVwkHbU06nYOt5TsN3Omh7fPQAm75PilgV5bvNMOuri6ReLFJMc6/
//3vdo3tsWPHZCarV6+2K2N/pO3DivVJjKL+t99+u0wyceLE3/zmN0630a4C7SzfsZJdNQu+zRPL
BwB/Wb76AnTZsmW2A6Utk4Gtra0XLn1VOjAw8MILL8jAdevWOVWr+Pj4Ixf56quv7BrQ559/Xias
rKxUwzMzM+Xt0aNHBwcH169fP2/evMsvv3zcuHFa+xi6lu+rMB3HcZWhq+GhYvm+Sszx3FLNR3Dc
XhWalNyES+zZs2fHjh0yUJZi1507HfmCB9ejB4nle5/w7t27ZfjixYtdWb5j4K4ObZ+HFnDL90kB
O7V8VxkKjz/+uIyZnZ3tuBfEbuVf2l2PQsXyfRWj2P9LL70kcUVFRakP1B0t37YC7SzfsZJdNQtY
PgCEhuVbLBblLtr13KLpMqtp06apIdoFkdLrTJkyJS4uTvuO2MPr8mtqamSGTzzxhBqelJQkTfB3
333329/+Vl1habVapYeztfzp06e7VwH3IwTKqHwVpuM4rjJ0NdwTy/cyQ58k6fPyc2P52va+8cYb
soji4mLbNdm3b5/6gam87u3tlQWp7tzpyB4K67ASDtqaPHbsmGiTTOL0/oZOA3d1aPs8tIBbvk8K
2Knlu8rw5MmTEpGof3R0tPpayTZYOQXVJPjcuXMylXsr9SZtH1asD2MUHnzwQZm2trbWcRuHa/mu
mgXf5onlA4C/LF946qmnZNqVK1dKL97W1qa+9Ny2bZvW9hmNxrfeekt6Dhm+fPly9x+mOjag3d3d
6lJIacLMZrPMRP3g6ZVXXpHXW7du3blzpzSOMTEx//znP2W4rIkMb2pq6u/vd7rCQ44QKKPyVZiO
47jK0NXwIY3K+wx9laRvy8+V5dtur3iS2IN02K+//npHR0ddXZ0sV5RIp9NdccUVP/7xj5csWRIb
G6umdTqyJ8I63ISDuSblNFL+deONN77//vuiXzLyww8/rHzUaeCuDm2fhxZwy/dJvE4t31WGDzzw
wKJFi/73f/9XzudFfLVfiahgxYknTpx41VVXbd++/b777pPTMzdW6mXavq1YL2Pcu3evbPKRI0cO
HTokY44ZM0Z9CWC3jcO1fFfNgm/zxPIBwI+WLw3Z5s2bpRdRF9aL01RVVdm2g2r41KlT7733Xu3O
A8O6x857772XmJgoM5HG94477lDXPp46dWrhwoUyUFrPhx56SLucVNrruXPnyltptZ2u8JAjBNCo
fBKm03GcZuhmuHuj8j5DXyXp2/JzZfl22/vhhx+KCqg5i0Kp++698cYbs2bNuuaaa0ToxZPmz5+v
pnU68pDCOtyEg7kmz549+4tf/EIUR405efLkW265xVHctcBdHdo+Dy0YLN/7eJ1avtMM//znP4u4
q9sTiXeq0wC7YJ9++mk5AZgwYUJpaan7K0y8TNu3FetljOrSGtUMylmQdhGj3TYO1/LdNAs+zBPL
BwA/Wr5icHCwra3Nr89uPH78uOOHHNrzTU6cOHHmzBlteHd3t9M793k+QkCMyt9hOs3QzXD3eJOh
b5MchfJz3F4Rqd7eXu1tR0eH1WpVWi9rnpOTYzut3cg+Tzj4a1I2pLOz88iRIx5ukatD299lGZBn
3/qpgEeWYV9fnzoH82va/qhYb2KUBvDw4cNO72/jTUW5bxZ8siwsHwD8bvnByVwHfBWWX40qEjIM
vyQXLlwYGxs7Z84cWW2DwSDGMJoJR1RN+rUsA2L5EZh2hFSs02bBt3li+QAQoZZvcQA3DZIMwy/J
U6dO/eUvf9m+ffv//d//jX7CkWb5/itLLH900o6QinXaLPg2TywfACLU8v0XVkQZFUmSJElGsuVT
seFasQCA5QP9E0mSJEli+VQslg8AWD79E5AkSWL5QMVi+QCA5dM/kSRJkiSWT84kieUDAJZP/0SS
QJJYPhVLklg+AGD59E8kSZKA5VOxVCwAYPlA/0SSJEmSWD4Vi+UDAJZPq0r/RJIkieUDFYvlAwCW
T/9EkiRJklg+OZMklg8AWD79E0kCSWL5VCxg+QCA5dM/kSRJkiSWT8VSsQCA5QP9E0mSJEli+VQs
lg8AWD6tKv0TSZIkzkTOVCyWDwBYPv0TkCRJYvnkTMVi+QCA5dM/kSSQJJZPxQKWDwBYPv0TSZIk
SWL5VCwVCwA0FLSq9E8kSZIkieVTsVg+AGD5tKr0TyRJkjgTOVOxWD4AYPn0T0CSJInlAxWL5QMA
lk//RJJAklg+FQtYPgBg+fRPJEmSJInlU7EkCQCA5dM/kSRJApZPxWL5AIDlExb9E0mSJM5EzlQs
lg8AIWL54Dnu+ycgSZIMyyRpRqnYcKpYAIgUy7d98wV4jPtYyYckSTJckyRnKjacKhYAsHygfyJJ
kiRJLJ+KxfIBAMunfwKSJEksH6hYLB8AgsHyAQAAAAAgDPh/BFQraMwlB3wAAAAASUVORK5C" />
</BODY>
</HTML>

Binary file not shown.

Before

Width:  |  Height:  |  Size: 20 KiB

File diff suppressed because it is too large Load Diff

View File

@ -1,29 +0,0 @@
===============
Container types
===============
Containers allow to manage collections of values of homogeneous type.
Three container types exist.
A container is a non-mutable structure once it has been locked.
Consequently, only insertion procedures are needed.
Lists
-----
Correspond to an ordered collection of values belonging to an elementary type.
At its creation, ...
Sets
----
Correspond to an unordered collection of values belonging to an elementary type.
Dictionaries
------------
Dictionaries allow to associate a `key` to a `value`. Values can be retrieved through their associated key.
Values must belong to an elementary type and keys must be *OBIStr_t*.

View File

@ -1,16 +0,0 @@
#################
Data in OBITools3
#################
The OBITools3 introduce a new way to manage DNA metabarcoding data.
They rely on a `Data management System` (DMS) that can be viewed like
a simplified database system.
.. toctree::
:maxdepth: 2
The data management system <DMS>
The data types <types>

View File

@ -1,40 +0,0 @@
================
Elementary types
================
They correspond to simple values.
Atomic types
------------
========= ========= ============ ==============================
Type C type OBIType Definition
========= ========= ============ ==============================
integer int32_t OBIInt_t a signed integer value
float double OBIFloat_t a floating value
boolean bool OBIBool_t a boolean true/false value
char char OBIChar_t a character
index size_t OBIIdx_t an index in a data structure
========= ========= ============ ==============================
The composite types
-------------------
Character string type
.....................
================ ====== ======== ==================
Type C type OBIType Definition
================ ====== ======== ==================
Character string ? OBIStr_t a character string
================ ====== ======== ==================
The taxid type
..............
==================== ====== ========== ======================
Type C type OBIType Definition
==================== ====== ========== ======================
Taxonomic identifier size_t OBITaxid_t a taxonomic identifier
==================== ====== ========== ======================

View File

@ -1,131 +0,0 @@
######################
Programming guidelines
######################
***************
Version control
***************
Version control is managed with `Git <http://git-scm.com/>`_.
Issue tracking and repository management are done using `GitLab <https://about.gitlab.com/>`_
at http://git.metabarcoding.org/.
Branching strategy
==================
Master branch
-------------
The master branch should only contain functional scripts.
Topic branches
--------------
Topic branches should correspond to development branches revolving around a topic corresponding
to the branch's name.
Release branches
----------------
Release branches should start with duplicates of tags and be used to patch them.
Tags
----
Tags should never be committed to.
Rebasing
--------
Rebasing should be avoided on the distant server.
Merging
-------
Merging should never overwrite on a release branch or on a tag.
Branching strategy diagram
--------------------------
.. image:: ./images/version_control.png
Issue tracking
==============
Issue tracking is done using `GitLab <https://about.gitlab.com/>`_ at http://git.metabarcoding.org/.
Tickets should always be labeled with the branches for which they are relevant.
*************
Documentation
*************
C functions are documented in the header files for public functions, and in the source file for private functions.
**************
OBITools3 wiki
**************
The OBITools3 wiki is managed with GitLab.
*********************
Programming languages
*********************
C99 :
* All the low-level input/output functions (e.g. all the `OBIDMS <formats.html#the-obitools3-data-management-system-obidms>`_ functions)
* Computing-intensive code (e.g. alignment or pattern matching)
`Cython <cython.org>`_ :
* Object layer
* OBITools3 library
`Python 3.5 <https://www.python.org/>`_ :
* Top layer code (scripts)
For the documentation, `Sphinx <http://sphinx-doc.org/>`_ should be used for both the original
documentation and for the generation of documentation from the python code. `Doxygen <http://www.stack.nl/~dimitri/doxygen/>`_
should be used for the generation of documentation from the C code, which should be then integrated
in the Sphinx documentation using `Breathe <https://breathe.readthedocs.org/en/latest/>`_.
******************
Naming conventions
******************
Struct, Enum: ``Title_case``
Enum members, macros, constants: ``ALL_CAPS``
Functions, local variables: ``lower_case``
Public functions: ``obi_lower_case``
Functions that shouldn't be called directly: ``_lower_case`` (``_`` prefix)
Global variables: ``g_lower_case`` (``g_`` prefix)
Pointers: ``pointer_ptr`` (``_ptr`` suffix)
.. note::
Underscores are used to delimit 'words'.
*****************
Programming rules
*****************
*

Binary file not shown.

Before

Width:  |  Height:  |  Size: 17 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 48 KiB

View File

@ -1,21 +0,0 @@
.. OBITools3 documentation master file, created by
sphinx-quickstart on Mon May 4 14:36:57 2015.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
OBITools3 documentation
==========================
.. toctree::
:maxdepth: 2
Programming guidelines <guidelines>
Data structures <data>
Code documentation <code_doc/codedoc>
Indices and tables
------------------
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

View File

@ -1,55 +0,0 @@
==============
Special values
==============
NA values
=========
All OBITypes have an associated NA (Not Available) value.
NA values are implemented by specifying an explicit NA value for each type,
corresponding to the R standards as much as possible:
* For the type ``OBIInt_t``, the NA value is ``INT_MIN``.
* For the type ``OBIBool_t``, the NA value is ``2``.
* For the type ``OBIIdx_t`` and ``OBITaxid_t``, the NA value is ``SIZE_MAX``.
* For the type ``OBIChar_t``: the NA value is ``\0``.
* For the type ``OBIFloat_t``::
typedef union
{
double value;
unsigned int word[2];
} ieee_double;
static double NA_value(void)
{
volatile ieee_double x;
x.word[hw] = 0x7ff00000;
x.word[lw] = 1954;
return x.value;
}
Minimum and maximum values for ``OBIInt_t``
===========================================
* Maximum value : ``INT_MAX``
* Minimum value : ``INT_MIN(-1?)``
Infinity values for the type ``OBIFloat_t``
===========================================
* Positive infinity : ``INFINITY`` (should be defined in ``<math.h>``)
* Negative infinity : ``-INFINITY``
NaN value for the type ``OBIFloat_t``
=====================================
* NaN (Not a Number) value : ``NAN`` (should be defined in ``<math.h>`` but probably needs to be tested)

View File

@ -1,21 +0,0 @@
********
OBITypes
********
.. image:: ./UML/OBITypes_UML.png
:download:`html version of the OBITypes UML file <UML/OBITypes_UML.class.violet.html>`
.. image:: ./UML/Obicolumn_classes_UML.png
:download:`html version of the OBIDMS classes UML file <UML/Obicolumn_classes_UML.class.violet.html>`
.. toctree::
:maxdepth: 2
The elementary types <elementary>
The containers <containers>
Special values <specialvalues>

View File

@ -1,2 +0,0 @@
.DS_Store
/build_dir.txt

25
obi_completion_script.bash Executable file
View File

@ -0,0 +1,25 @@
_obi_comp ()
{
local cur prev
cur="${COMP_WORDS[COMP_CWORD]}"
prev="${COMP_WORDS[COMP_CWORD-1]}"
if [ "${#COMP_WORDS[@]}" = "2" ]; then
COMPREPLY=($(compgen -W "align alignpairedend annotate build_ref_db clean_dms clean count ecopcr ecotag export grep head history import less ls ngsfilter sort stats tail test uniq" "${COMP_WORDS[1]}"))
else
if [[ "$cur" == *VIEWS* ]]; then
COMPREPLY=($(compgen -o plusdirs -f -X '!*.obiview' -- "${COMP_WORDS[COMP_CWORD]}"))
elif [[ -d $cur.obidms ]]; then
COMPREPLY=($(compgen -o plusdirs -f $cur.obidms/VIEWS/ -- "${COMP_WORDS[COMP_CWORD]}"), $(compgen -o plusdirs -f -X '!*.obidms/' -- "${COMP_WORDS[COMP_CWORD]}"))
elif [[ "$cur" == *obidms* ]]; then
COMPREPLY=($(compgen -o plusdirs -f $cur/VIEWS/ -- "${COMP_WORDS[COMP_CWORD]}"))
else
COMPREPLY=($(compgen -o plusdirs -f -X '!*.obidms/' -- "${COMP_WORDS[COMP_CWORD]}"))
fi
if [[ "$prev" == import ]]; then
COMPREPLY+=($(compgen -f -- "${COMP_WORDS[COMP_CWORD]}"))
fi
fi
}
complete -o nospace -F _obi_comp obi

1
python/.gitignore vendored
View File

@ -1 +1,2 @@
/.DS_Store
/OBITools3.egg-info/

1
python/obitools3/.gitignore vendored Normal file
View File

@ -0,0 +1 @@
/.DS_Store

View File

@ -1,110 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -33,10 +33,6 @@ cpdef buildArgumentParser(str configname,
default=None,
help='Create a logfile')
parser.add_argument('--no-progress', dest='%s:progress' % configname,
action='store_false',
default=None,
help='Do not print the progress bar during analyzes')
subparsers = parser.add_subparsers(title='subcommands',
description='valid subcommands',

View File

@ -1,110 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -1,110 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -13,7 +13,7 @@ from .logging cimport getLogger
from .arguments cimport buildArgumentParser
from ..version import version
from _curses import version
cdef dict __default_config__ = {}

View File

@ -1,110 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -1,3 +1,9 @@
import codecs
def unescaped_str(arg_str):
return arg_str.encode('latin-1', 'backslashreplace').decode('unicode-escape')
def __addInputOption(optionManager):
optionManager.add_argument(
@ -39,6 +45,30 @@ def __addImportInputOption(optionManager):
const=b'fastq',
help="Input file is in fastq format")
group.add_argument('--silva-input',
action="store_const", dest="obi:inputformat",
default=None,
const=b'silva',
help="Input file is in SILVA fasta format. If NCBI taxonomy provided with --taxonomy, taxid and scientific name will be added for each sequence.")
group.add_argument('--rdp-input',
action="store_const", dest="obi:inputformat",
default=None,
const=b'rdp',
help="Input file is in RDP training set fasta format. If NCBI taxonomy provided with --taxonomy, taxid and scientific name will be added for each sequence.")
group.add_argument('--unite-input',
action="store_const", dest="obi:inputformat",
default=None,
const=b'unite',
help="Input file is in UNITE fasta format. If NCBI taxonomy provided with --taxonomy, taxid and scientific name will be added for each sequence.")
group.add_argument('--sintax-input',
action="store_const", dest="obi:inputformat",
default=None,
const=b'sintax',
help="Input file is in SINTAX fasta format. If NCBI taxonomy provided with --taxonomy, taxid and scientific name will be added for each sequence.")
group.add_argument('--embl-input',
action="store_const", dest="obi:inputformat",
default=None,
@ -55,7 +85,7 @@ def __addImportInputOption(optionManager):
action="store_const", dest="obi:inputformat",
default=None,
const=b'ngsfilter',
help="Input file is an ngsfilter file")
help="Input file is an ngsfilter file. If not using tags, use ':' or 'None:None' or '-:-' or any combination")
group.add_argument('--ecopcr-result-input',
action="store_const", dest="obi:inputformat",
@ -115,20 +145,26 @@ def __addImportInputOption(optionManager):
type=str,
help="String associated with Non Available (NA) values in the input")
def __addTabularInputOption(optionManager):
group = optionManager.add_argument_group("Input format options for tabular files")
group.add_argument('--header',
action="store_true", dest="obi:header",
default=False,
help="First line of tabular file contains column names")
def __addTabularOption(optionManager):
group = optionManager.add_argument_group("Input and output format options for tabular files")
group.add_argument('--no-header',
action="store_false", dest="obi:header",
default=True,
help="Don't print the header (first line with column names")
group.add_argument('--sep',
action="store", dest="obi:sep",
default=None,
type=str,
default="\t",
type=unescaped_str,
help="Column separator")
def __addTabularInputOption(optionManager):
group = optionManager.add_argument_group("Input format options for tabular files")
__addTabularOption(optionManager)
group.add_argument('--dec',
action="store", dest="obi:dec",
@ -153,6 +189,16 @@ def __addTabularInputOption(optionManager):
help="Lines starting by this char are considered as comment")
def __addTabularOutputOption(optionManager):
group = optionManager.add_argument_group("Output format options for tabular files")
__addTabularOption(optionManager)
group.add_argument('--na-int-stay-na',
action="store_false", dest="obi:na_int_to_0",
help="NA (Non available) integer values should be exported as NA in tabular output (default: they are converted to 0 for tabular output).") # TODO
def __addTaxdumpInputOption(optionManager): # TODO maybe not the best way to do it
group = optionManager.add_argument_group("Input format options for taxdump")
@ -186,6 +232,10 @@ def addTabularInputOption(optionManager):
__addTabularInputOption(optionManager)
def addTabularOutputOption(optionManager):
__addTabularOutputOption(optionManager)
def addTaxonomyOption(optionManager):
__addTaxonomyOption(optionManager)
@ -198,6 +248,7 @@ def addAllInputOption(optionManager):
__addInputOption(optionManager)
__addImportInputOption(optionManager)
__addTabularInputOption(optionManager)
__addTabularOutputOption(optionManager)
__addTaxonomyOption(optionManager)
__addTaxdumpInputOption(optionManager)
@ -205,7 +256,7 @@ def addAllInputOption(optionManager):
def __addOutputOption(optionManager):
optionManager.add_argument(
dest='obi:outputURI',
dest='obi:outputURI',
metavar='OUTPUT',
help='Data destination URI')
@ -216,12 +267,15 @@ def __addDMSOutputOption(optionManager):
group.add_argument('--no-create-dms',
action="store_true", dest="obi:nocreatedms",
default=False,
help="Don't create an output DMS is it is not existing")
help="Don't create an output DMS if it does not already exist")
def __addEltLimitOption(optionManager):
group = optionManager.add_argument_group("Option to limit the number of elements per line in columns")
group.add_argument('--max-elts',
action="store", dest="obi:maxelts",
metavar='<N>',
default=1000,
default=1000000,
type=int,
help="Maximum number of elements per line in a column "
"(e.g. the number of different keys in a dictionary-type "
@ -232,6 +286,11 @@ def __addDMSOutputOption(optionManager):
def __addExportOutputOption(optionManager):
group = optionManager.add_argument_group("Output format options for exported files")
group.add_argument('-o',
dest='obi:outputURI',
metavar='OUTPUT',
help='Data destination URI')
group.add_argument('--fasta-output',
action="store_const", dest="obi:outputformat",
default=None,
@ -244,6 +303,41 @@ def __addExportOutputOption(optionManager):
const=b'fastq',
help="Output file is in fastq format")
group.add_argument('--tab-output',
action="store_const", dest="obi:outputformat",
default=None,
const=b'tabular',
help="Output file is in tabular format")
group.add_argument('--metabaR-output',
action="store_const", dest="obi:outputformat",
default=None,
const=b'metabaR',
help="Export the files needed by the obifiles_to_metabarlist function of the metabaR package")
group.add_argument('--metabaR-prefix',
action="store", dest="obi:metabarprefix",
type=str,
help="Prefix for the files when using --metabaR-output option")
group.add_argument('--metabaR-ngsfilter',
action="store", dest="obi:metabarngsfilter",
type=str,
default=None,
help="URI to the ngsfilter view when using --metabaR-output option (if not provided, it is not exported)")
group.add_argument('--metabaR-samples',
action="store", dest="obi:metabarsamples",
type=str,
default=None,
help="URI to the sample metadata view when using --metabaR-output option (if not provided, it is built as just a list of the sample names)")
group.add_argument('--only-keys',
action="append", dest="obi:only_keys",
type=str,
default=[],
help="Only export the given keys (columns).")
group.add_argument('--print-na',
action="store_true", dest="obi:printna",
default=False,
@ -256,17 +350,40 @@ def __addExportOutputOption(optionManager):
help="String associated with Non Available (NA) values in the output")
def __addNoProgressBarOption(optionManager):
group = optionManager.add_argument_group("Option to deactivate the display of the progress bar")
group.add_argument('--no-progress-bar',
action="store_true", dest="obi:noprogressbar",
default=False,
help="Do not display progress bar")
def addMinimalOutputOption(optionManager):
__addOutputOption(optionManager)
__addDMSOutputOption(optionManager)
def addTabularOutputOption(optionManager):
__addTabularOption(optionManager)
def addExportOutputOption(optionManager):
__addOutputOption(optionManager)
__addExportOutputOption(optionManager)
__addTabularOutputOption(optionManager)
def addAllOutputOption(optionManager):
__addOutputOption(optionManager)
__addDMSOutputOption(optionManager)
__addExportOutputOption(optionManager)
__addTabularOutputOption(optionManager)
def addNoProgressBarOption(optionManager):
__addNoProgressBarOption(optionManager)
def addEltLimitOption(optionManager):
__addEltLimitOption(optionManager)

View File

@ -1,110 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -30,12 +30,12 @@ cdef class ProgressBar:
off_t maxi,
dict config={},
str head="",
double seconde=0.1,
double seconds=5,
cut=False):
self.starttime = self.clock()
self.lasttime = self.starttime
self.tickcount = <clock_t> (seconde * CLOCKS_PER_SEC)
self.tickcount = <clock_t> (seconds * CLOCKS_PER_SEC)
self.freq = 1
self.cycle = 0
self.arrow = 0

View File

@ -1,110 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

1
python/obitools3/commands/.gitignore vendored Normal file
View File

@ -0,0 +1 @@
/.DS_Store

View File

@ -0,0 +1,231 @@
#cython: language_level=3
from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view.view cimport View, Line_selection
from obitools3.uri.decode import open_uri
from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.dms.view import RollbackException
from obitools3.dms.column.column cimport Column
from functools import reduce
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes, str2bytes, tostr
from io import BufferedWriter
from obitools3.dms.capi.obiview cimport NUC_SEQUENCE_COLUMN, \
ID_COLUMN, \
DEFINITION_COLUMN, \
QUALITY_COLUMN, \
COUNT_COLUMN, \
TAXID_COLUMN
from obitools3.dms.capi.obitypes cimport OBI_INT
from obitools3.dms.capi.obitaxonomy cimport MIN_LOCAL_TAXID
import time
import math
import sys
from cpython.exc cimport PyErr_CheckSignals
__title__="Annotate sequences with their corresponding NCBI taxid found from the taxon scientific name"
def addOptions(parser):
addMinimalInputOption(parser)
addTaxonomyOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group=parser.add_argument_group('obi addtaxids specific options')
group.add_argument('-t', '--taxid-tag',
action="store",
dest="addtaxids:taxid_tag",
metavar="<TAXID_TAG>",
default=b"TAXID",
help="Name of the tag to store the found taxid "
"(default: 'TAXID').")
group.add_argument('-n', '--taxon-name-tag',
action="store",
dest="addtaxids:taxon_name_tag",
metavar="<SCIENTIFIC_NAME_TAG>",
default=b"SCIENTIFIC_NAME",
help="Name of the tag giving the scientific name of the taxon "
"(default: 'SCIENTIFIC_NAME').")
group.add_argument('-g', '--try-genus-match',
action="store_true", dest="addtaxids:try_genus_match",
default=False,
help="Try matching the first word of <SCIENTIFIC_NAME_TAG> when can't find corresponding taxid for a taxon. "
"If there is a match it is added in the 'parent_taxid' tag. (Can be used by 'obi taxonomy' to add the taxon under that taxid).")
group.add_argument('-a', '--restricting-ancestor',
action="store",
dest="addtaxids:restricting_ancestor",
metavar="<RESTRICTING_ANCESTOR>",
default=None,
help="Enables to restrict the search of taxids under an ancestor specified by its taxid.")
group.add_argument('-l', '--log-file',
action="store",
dest="addtaxids:log_file",
metavar="<LOG_FILE>",
default='',
help="Path to a log file to write informations about not found taxids.")
def run(config):
DMS.obi_atexit()
logger("info", "obi addtaxids")
# Open the input
input = open_uri(config['obi']['inputURI'])
if input is None:
raise Exception("Could not read input view")
i_dms = input[0]
i_view = input[1]
i_view_name = input[1].name
# Open the output: only the DMS, as the output view is going to be created by cloning the input view
# (could eventually be done via an open_uri() argument)
output = open_uri(config['obi']['outputURI'],
input=False,
dms_only=True)
if output is None:
raise Exception("Could not create output view")
o_dms = output[0]
output_0 = output[0]
o_view_name = output[1]
# stdout output: create temporary view
if type(output_0)==BufferedWriter:
o_dms = i_dms
i=0
o_view_name = b"temp"
while o_view_name in i_dms: # Making sure view name is unique in output DMS
o_view_name = o_view_name+b"_"+str2bytes(str(i))
i+=1
imported_view_name = o_view_name
# If the input and output DMS are not the same, import the input view in the output DMS before cloning it to modify it
# (could be the other way around: clone and modify in the input DMS then import the new view in the output DMS)
if i_dms != o_dms:
imported_view_name = i_view_name
i=0
while imported_view_name in o_dms: # Making sure view name is unique in output DMS
imported_view_name = i_view_name+b"_"+str2bytes(str(i))
i+=1
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], i_view_name, imported_view_name)
i_view = o_dms[imported_view_name]
# Clone output view from input view
o_view = i_view.clone(o_view_name)
if o_view is None:
raise Exception("Couldn't create output view")
i_view.close()
# Open taxonomy
taxo_uri = open_uri(config['obi']['taxoURI'])
if taxo_uri is None or taxo_uri[2] == bytes:
raise Exception("Couldn't open taxonomy")
taxo = taxo_uri[1]
# Initialize the progress bar
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(len(o_view), config)
else:
pb = None
try:
if config['addtaxids']['log_file']:
logfile = open(config['addtaxids']['log_file'], 'w')
else:
logfile = None
if config['addtaxids']['try_genus_match']:
try_genus = True
else:
try_genus = False
if 'restricting_ancestor' in config['addtaxids']:
res_anc = int(config['addtaxids']['restricting_ancestor'])
else:
res_anc = None
taxid_column_name = config['addtaxids']['taxid_tag']
parent_taxid_column_name = "PARENT_TAXID" # TODO macro
taxon_name_column_name = config['addtaxids']['taxon_name_tag']
taxid_column = Column.new_column(o_view, taxid_column_name, OBI_INT)
parent_taxid_column = Column.new_column(o_view, parent_taxid_column_name, OBI_INT)
taxon_name_column = o_view[taxon_name_column_name]
found_count = 0
not_found_count = 0
parent_found_count = 0
for i in range(len(o_view)):
PyErr_CheckSignals()
if pb is not None:
pb(i)
taxon_name = taxon_name_column[i]
taxon = taxo.get_taxon_by_name(taxon_name, res_anc)
if taxon is not None:
taxid_column[i] = taxon.taxid
found_count+=1
elif try_genus: # try finding genus or other parent taxon from the first word
#print(i, o_view[i].id)
taxon_name_sp = taxon_name.split(b" ")
taxon = taxo.get_taxon_by_name(taxon_name_sp[0], res_anc)
if taxon is not None:
parent_taxid_column[i] = taxon.taxid
parent_found_count+=1
if logfile:
print("Found parent taxon for", tostr(taxon_name), file=logfile)
else:
not_found_count+=1
if logfile:
print("No taxid found for", tostr(taxon_name), file=logfile)
else:
not_found_count+=1
if logfile:
print("No taxid found for", tostr(taxon_name), file=logfile)
except Exception, e:
raise RollbackException("obi addtaxids error, rollbacking view: "+str(e), o_view)
if pb is not None:
pb(i, force=True)
print("", file=sys.stderr)
logger("info", "\nTaxids found: "+str(found_count)+"/"+str(len(o_view))+" ("+str(round(found_count*100.0/len(o_view), 2))+"%)")
if config['addtaxids']['try_genus_match']:
logger("info", "\nParent taxids found: "+str(parent_found_count)+"/"+str(len(o_view))+" ("+str(round(parent_found_count*100.0/len(o_view), 2))+"%)")
logger("info", "\nTaxids not found: "+str(not_found_count)+"/"+str(len(o_view))+" ("+str(round(not_found_count*100.0/len(o_view), 2))+"%)")
# Save command config in View and DMS comments
command_line = " ".join(sys.argv[1:])
input_dms_name=[input[0].name]
input_view_name=[i_view_name]
if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
o_view.write_config(config, "addtaxids", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
o_dms.record_command_line(command_line)
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_view), file=sys.stderr)
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
# If the input and the output DMS are different or if stdout output, delete the temporary imported view used to create the final view
if i_dms != o_dms or type(output_0)==BufferedWriter:
View.delete_view(o_dms, imported_view_name)
o_dms.close(force=True)
i_dms.close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -4,7 +4,7 @@ from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view.view cimport View
from obitools3.uri.decode import open_uri
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.dms.view import RollbackException
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes, str2bytes
@ -12,17 +12,21 @@ from obitools3.utils cimport tobytes, str2bytes
from obitools3.dms.capi.obilcsalign cimport obi_lcs_align_one_column, \
obi_lcs_align_two_columns
from io import BufferedWriter
from cpython.exc cimport PyErr_CheckSignals
import time
import sys
__title__="Aligns one sequence column with itself or two sequence columns"
__title__="Align one sequence column with itself or two sequence columns"
def addOptions(parser):
addMinimalInputOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group=parser.add_argument_group('obi align specific options')
@ -154,7 +158,7 @@ def run(config):
i_view_name = i_uri.split(b"/")[0]
i_column_name = b""
i_element_name = b""
if len(i_uri.split(b"/")) == 2:
if len(i_uri.split(b"/")) >= 2:
i_column_name = i_uri.split(b"/")[1]
if len(i_uri.split(b"/")) == 3:
i_element_name = i_uri.split(b"/")[2]
@ -177,7 +181,7 @@ def run(config):
i_dms_name_2 = i_dms_2.name
i_uri_2 = input_2[1]
original_i_view_name_2 = i_uri_2.split(b"/")[0]
if len(i_uri_2.split(b"/")) == 2:
if len(i_uri_2.split(b"/")) >= 2:
i_column_name_2 = i_uri_2.split(b"/")[1]
if len(i_uri_2.split(b"/")) == 3:
i_element_name_2 = i_uri_2.split(b"/")[2]
@ -201,20 +205,20 @@ def run(config):
if output is None:
raise Exception("Could not create output")
o_dms = output[0]
output_0 = output[0]
o_dms_name = o_dms.name
final_o_view_name = output[1]
o_view_name = final_o_view_name
# If the input and output DMS are not the same, align creating a temporary view in the input dms that will be exported to
# If stdout output or the input and output DMS are not the same, align creating a temporary view in the input dms that will be exported to
# the right DMS and deleted in the other afterwards.
if i_dms != o_dms:
temporary_view_name = final_o_view_name
i=0
while temporary_view_name in i_dms: # Making sure view name is unique in input DMS
temporary_view_name = final_o_view_name+b"_"+str2bytes(str(i))
if i_dms != o_dms or type(output_0)==BufferedWriter:
if type(output_0)==BufferedWriter:
o_dms = i_dms
o_view_name = b"temp"
while o_view_name in i_dms: # Making sure view name is unique in input DMS
o_view_name = o_view_name+b"_"+str2bytes(str(i))
i+=1
o_view_name = temporary_view_name
else:
o_view_name = final_o_view_name
# Save command config in View comments
command_line = " ".join(sys.argv[1:])
@ -229,7 +233,7 @@ def run(config):
# Call cython alignment function
# Using default ID columns of the view. TODO discuss adding option
align_columns(i_dms_name, \
align_columns(i_dms.name_with_full_path, \
i_view_name, \
o_view_name, \
input_view_2_n = i_view_name_2, \
@ -263,12 +267,19 @@ def run(config):
View.delete_view(i_dms, i_view_name_2)
i_dms_2.close()
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view = o_dms[o_view_name]
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
# If the input and the output DMS are different, delete the temporary result view in the input DMS
if i_dms != o_dms:
if i_dms != o_dms or type(output_0)==BufferedWriter:
View.delete_view(i_dms, o_view_name)
o_dms.close()
o_dms.close(force=True)
i_dms.close()
i_dms.close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -2,12 +2,11 @@
from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view import RollbackException
from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
from obitools3.dms.column.column cimport Column
from obitools3.dms.capi.obiview cimport QUALITY_COLUMN
from obitools3.dms.capi.obitypes cimport OBI_QUAL
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.uri.decode import open_uri
from obitools3.apps.config import logger
from obitools3.libalign._qsassemble import QSolexaReverseAssemble
@ -15,13 +14,16 @@ from obitools3.libalign._qsrassemble import QSolexaRightReverseAssemble
from obitools3.libalign._solexapairend import buildConsensus, buildJoinedSequence
from obitools3.dms.obiseq cimport Nuc_Seq
from obitools3.libalign.shifted_ali cimport Kmer_similarity, Ali_shifted
from obitools3.commands.ngsfilter import REVERSE_SEQ_COLUMN_NAME, REVERSE_QUALITY_COLUMN_NAME
from obitools3.dms.capi.obiview cimport REVERSE_SEQUENCE_COLUMN, REVERSE_QUALITY_COLUMN
from obitools3.utils cimport str2bytes
from io import BufferedWriter
import sys
import os
from cpython.exc cimport PyErr_CheckSignals
__title__="Aligns paired-ended reads"
__title__="Align paired-ended reads"
@ -29,6 +31,7 @@ def addOptions(parser):
addMinimalInputOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group = parser.add_argument_group('obi alignpairedend specific options')
@ -39,24 +42,25 @@ def addOptions(parser):
type=str,
help="URI to the reverse reads if they are in a different view than the forward reads")
group.add_argument('--score-min',
action="store", dest="alignpairedend:smin",
metavar="#.###",
default=None,
type=float,
help="Minimum score for keeping alignments")
# group.add_argument('--score-min',
# action="store", dest="alignpairedend:smin",
# metavar="#.###",
# default=None,
# type=float,
# help="Minimum score for keeping alignments. "
# "(for kmer alignment) The score is an approximation of the number of nucleotides matching in the overlap of the alignment.")
group.add_argument('-A', '--true-ali',
action="store_true", dest="alignpairedend:trueali",
default=False,
help="Performs gap free end alignment of sequences instead of using kmers to compute alignments (slower).")
# group.add_argument('-A', '--true-ali',
# action="store_true", dest="alignpairedend:trueali",
# default=False,
# help="Performs gap free end alignment of sequences instead of using kmers to compute alignments (slower).")
group.add_argument('-k', '--kmer-size',
action="store", dest="alignpairedend:kmersize",
metavar="#",
default=3,
type=int,
help="K-mer size for kmer comparisons, between 1 and 4 (not when using -A option; default: 3)")
help="K-mer size for kmer comparisons, between 1 and 4 (default: 3)")
la = QSolexaReverseAssemble()
@ -96,12 +100,13 @@ def alignmentIterator(entries, aligner):
entries_len = len(entries)
for i in range(entries_len):
if two_views:
seqF = forward[i]
seqR = reverse[i]
else:
seqF = Nuc_Seq.new_from_stored(entries[i])
seqR = Nuc_Seq(seqF.id, seqF[REVERSE_SEQ_COLUMN_NAME], quality=seqF[REVERSE_QUALITY_COLUMN_NAME])
seqR = Nuc_Seq(seqF.id, seqF[REVERSE_SEQUENCE_COLUMN], quality=seqF[REVERSE_QUALITY_COLUMN])
seqR.index = i
ali = aligner(seqF, seqR)
@ -169,81 +174,120 @@ def run(config):
if output is None:
raise Exception("Could not create output view")
view = output[1]
output_0 = output[0]
o_dms = output[0]
Column.new_column(view, QUALITY_COLUMN, OBI_QUAL) #TODO output URI quality option?
if 'smin' in config['alignpairedend']:
smin = config['alignpairedend']['smin']
# stdout output: create temporary view
if type(output_0)==BufferedWriter:
i_dms = forward.dms # using any dms
o_dms = i_dms
i=0
o_view_name = b"temp"
while o_view_name in i_dms: # Making sure view name is unique in input DMS
o_view_name = o_view_name+b"_"+str2bytes(str(i))
i+=1
o_view = View_NUC_SEQS.new(o_dms, o_view_name, quality=True)
else:
smin = 0
o_view = output[1]
Column.new_column(o_view, QUALITY_COLUMN, OBI_QUAL)
# Initialize the progress bar
pb = ProgressBar(entries_len, config, seconde=5)
if config['alignpairedend']['trueali']:
kmer_ali = False
aligner = buildAlignment
else :
kmer_ali = True
if type(entries) == list:
forward = entries[0]
reverse = entries[1]
aligner = Kmer_similarity(forward, view2=reverse, kmer_size=config['alignpairedend']['kmersize'])
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(entries_len, config)
else:
pb = None
#if config['alignpairedend']['trueali']:
# kmer_ali = False
# aligner = buildAlignment
#else :
kmer_ali = True
if type(entries) == list:
forward = entries[0]
reverse = entries[1]
if len(forward) == 0 or len(reverse) == 0:
aligner = None
else:
aligner = Kmer_similarity(entries, column2=entries[REVERSE_SEQ_COLUMN_NAME], qual_column2=entries[REVERSE_QUALITY_COLUMN_NAME], kmer_size=config['alignpairedend']['kmersize'])
aligner = Kmer_similarity(forward, \
view2=reverse, \
kmer_size=config['alignpairedend']['kmersize'], \
reversed_column=None)
else:
if len(entries) == 0:
aligner = None
else:
aligner = Kmer_similarity(entries, \
column2=entries[REVERSE_SEQUENCE_COLUMN], \
qual_column2=entries[REVERSE_QUALITY_COLUMN], \
kmer_size=config['alignpairedend']['kmersize'], \
reversed_column=entries[b'reversed']) # column created by the ngsfilter tool
ba = alignmentIterator(entries, aligner)
i = 0
for ali in ba:
pb(i)
if pb is not None:
pb(i)
consensus = view[i]
PyErr_CheckSignals()
consensus = o_view[i]
if two_views:
consensus[b"R1_parent"] = forward[i].id
consensus[b"R2_parent"] = reverse[i].id
if not two_views:
seqF = entries[i]
else:
seqF = forward[i]
if smin > 0:
if (ali.score > smin) :
buildConsensus(ali, consensus, seqF)
else:
if not two_views:
seqR = Nuc_Seq(seqF.id, seqF[REVERSE_SEQ_COLUMN_NAME], quality = seqF[REVERSE_QUALITY_COLUMN_NAME])
else:
seqR = reverse[i]
buildJoinedSequence(ali, seqR, consensus, forward=seqF)
consensus[b"smin"] = smin
else:
if ali.overlap_len > 0 :
buildConsensus(ali, consensus, seqF)
else:
if not two_views:
seqR = Nuc_Seq(seqF.id, seqF[REVERSE_SEQUENCE_COLUMN], quality = seqF[REVERSE_QUALITY_COLUMN])
else:
seqR = reverse[i]
buildJoinedSequence(ali, seqR, consensus, forward=seqF)
if kmer_ali :
ali.free()
i+=1
pb(i, force=True)
print("", file=sys.stderr)
if pb is not None:
pb(i, force=True)
print("", file=sys.stderr)
if kmer_ali :
if kmer_ali and aligner is not None:
aligner.free()
# Save command config in View and DMS comments
command_line = " ".join(sys.argv[1:])
view.write_config(config, "alignpairedend", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
output[0].record_command_line(command_line)
o_view.write_config(config, "alignpairedend", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
o_dms.record_command_line(command_line)
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(view), file=sys.stderr)
input[0].close()
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
# If stdout output, delete the temporary imported view used to create the final file
if type(output_0)==BufferedWriter:
View_NUC_SEQS.delete_view(o_dms, o_view_name)
output_0.close()
# Close all DMS
input[0].close(force=True)
if two_views:
rinput[0].close()
output[0].close()
rinput[0].close(force=True)
o_dms.close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -4,21 +4,27 @@ from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view.view cimport View, Line_selection
from obitools3.uri.decode import open_uri
from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption
from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.dms.view import RollbackException
from functools import reduce
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes, str2bytes
from io import BufferedWriter
from obitools3.dms.capi.obiview cimport NUC_SEQUENCE_COLUMN, \
ID_COLUMN, \
DEFINITION_COLUMN, \
QUALITY_COLUMN, \
COUNT_COLUMN
COUNT_COLUMN, \
TAXID_COLUMN
from obitools3.dms.capi.obitypes cimport OBI_STR
from obitools3.dms.column.column cimport Column
import time
import math
import sys
from cpython.exc cimport PyErr_CheckSignals
__title__="Annotate views with new tags and edit existing annotations"
@ -31,6 +37,7 @@ def addOptions(parser):
addMinimalInputOption(parser)
addTaxonomyOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group=parser.add_argument_group('obi annotate specific options')
@ -173,8 +180,8 @@ def sequenceTaggerGenerator(config, taxo=None):
counter[0]+=1
for rank in annoteRank:
if 'taxid' in seq:
taxid = seq['taxid']
if TAXID_COLUMN in seq:
taxid = seq[TAXID_COLUMN]
if taxid is not None:
rtaxid = taxo.get_taxon_at_rank(taxid, rank)
if rtaxid is not None:
@ -182,64 +189,58 @@ def sequenceTaggerGenerator(config, taxo=None):
else:
scn=None
seq[rank]=rtaxid
if "%s_name"%rank not in seq.view:
Column.new_column(seq.view, "%s_name"%rank, OBI_STR)
seq["%s_name"%rank]=scn
if add_rank:
seq['seq_rank']=counter[0]
for i,v in toSet:
#try:
if taxo is not None:
environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
else:
environ = {'sequence':seq, 'counter':counter[0], 'math':math}
val = eval(v, environ, seq)
#except Exception,e: # TODO discuss usefulness of this
# if options.onlyValid:
# raise e
# val = v
try:
if taxo is not None:
environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
else:
environ = {'sequence':seq, 'counter':counter[0], 'math':math}
val = eval(v, environ, seq)
except Exception: # set string if not a valid expression
val = v
seq[i]=val
if length:
seq['seq_length']=len(seq)
if newId is not None:
# try:
if taxo is not None:
environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
else:
environ = {'sequence':seq, 'counter':counter[0], 'math':math}
val = eval(newId, environ, seq)
# except Exception,e:
# if options.onlyValid:
# raise e
# val = newId
try:
if taxo is not None:
environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
else:
environ = {'sequence':seq, 'counter':counter[0], 'math':math}
val = eval(newId, environ, seq)
except Exception: # set string if not a valid expression
val = newId
seq.id=val
if newDef is not None:
# try:
if taxo is not None:
environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
else:
environ = {'sequence':seq, 'counter':counter[0], 'math':math}
val = eval(newDef, environ, seq)
# except Exception,e:
# if options.onlyValid:
# raise e
# val = newDef
try:
if taxo is not None:
environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
else:
environ = {'sequence':seq, 'counter':counter[0], 'math':math}
val = eval(newDef, environ, seq)
except Exception: # set string if not a valid expression
val = newDef
seq.definition=val
#
if newSeq is not None:
# try:
if taxo is not None:
environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
else:
environ = {'sequence':seq, 'counter':counter[0], 'math':math}
val = eval(newSeq, environ, seq)
# except Exception,e:
# if options.onlyValid:
# raise e
# val = newSeq
try:
if taxo is not None:
environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
else:
environ = {'sequence':seq, 'counter':counter[0], 'math':math}
val = eval(newSeq, environ, seq)
except Exception: # set string if not a valid expression
val = newSeq
seq.seq=val
if 'seq_length' in seq:
seq['seq_length']=len(seq)
@ -249,15 +250,14 @@ def sequenceTaggerGenerator(config, taxo=None):
seq.view.delete_column(QUALITY_COLUMN)
if run is not None:
# try:
if taxo is not None:
environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
else:
environ = {'sequence':seq, 'counter':counter[0], 'math':math}
eval(run, environ, seq)
# except Exception,e:
# if options.onlyValid:
# raise e
try:
if taxo is not None:
environ = {'taxonomy' : taxo, 'sequence':seq, 'counter':counter[0], 'math':math}
else:
environ = {'sequence':seq, 'counter':counter[0], 'math':math}
eval(run, environ, seq)
except Exception,e:
raise e
return sequenceTagger
@ -284,8 +284,19 @@ def run(config):
if output is None:
raise Exception("Could not create output view")
o_dms = output[0]
output_0 = output[0]
o_view_name = output[1]
# stdout output: create temporary view
if type(output_0)==BufferedWriter:
o_dms = i_dms
i=0
o_view_name = b"temp"
while o_view_name in i_dms: # Making sure view name is unique in output DMS
o_view_name = o_view_name+b"_"+str2bytes(str(i))
i+=1
imported_view_name = o_view_name
# If the input and output DMS are not the same, import the input view in the output DMS before cloning it to modify it
# (could be the other way around: clone and modify in the input DMS then import the new view in the output DMS)
if i_dms != o_dms:
@ -296,7 +307,7 @@ def run(config):
i+=1
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], i_view_name, imported_view_name)
i_view = o_dms[imported_view_name]
# Clone output view from input view
o_view = i_view.clone(o_view_name)
if o_view is None:
@ -306,14 +317,17 @@ def run(config):
# Open taxonomy if there is one
if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
taxo_uri = open_uri(config['obi']['taxoURI'])
if taxo_uri is None:
if taxo_uri is None or taxo_uri[2] == bytes:
raise Exception("Couldn't open taxonomy")
taxo = taxo_uri[1]
else :
taxo = None
# Initialize the progress bar
pb = ProgressBar(len(o_view), config, seconde=5)
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(len(o_view), config)
else:
pb = None
try:
@ -351,14 +365,17 @@ def run(config):
# Editions at line level
sequenceTagger = sequenceTaggerGenerator(config, taxo=taxo)
for i in range(len(o_view)):
pb(i)
PyErr_CheckSignals()
if pb is not None:
pb(i)
sequenceTagger(o_view[i])
except Exception, e:
raise RollbackException("obi annotate error, rollbacking view: "+str(e), o_view)
pb(i, force=True)
print("", file=sys.stderr)
if pb is not None:
pb(i, force=True)
print("", file=sys.stderr)
# Save command config in View and DMS comments
command_line = " ".join(sys.argv[1:])
@ -368,15 +385,21 @@ def run(config):
input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
o_view.write_config(config, "annotate", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
output[0].record_command_line(command_line)
o_dms.record_command_line(command_line)
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_view), file=sys.stderr)
# If the input and the output DMS are different, delete the temporary imported view used to create the final view
if i_dms != o_dms:
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
# If the input and the output DMS are different or if stdout output, delete the temporary imported view used to create the final view
if i_dms != o_dms or type(output_0)==BufferedWriter:
View.delete_view(o_dms, imported_view_name)
o_dms.close()
i_dms.close()
o_dms.close(force=True)
i_dms.close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -4,17 +4,19 @@ from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms.dms cimport DMS
from obitools3.dms.view import RollbackException
from obitools3.dms.capi.build_reference_db cimport build_reference_db
from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption
from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.uri.decode import open_uri
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes, str2bytes
from obitools3.dms.view.view cimport View
from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
from io import BufferedWriter
import sys
from cpython.exc cimport PyErr_CheckSignals
__title__="Tag a set of sequences for PCR and sequencing errors identification"
__title__="Build a reference database for ecotag"
def addOptions(parser):
@ -22,16 +24,16 @@ def addOptions(parser):
addMinimalInputOption(parser)
addTaxonomyOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group = parser.add_argument_group('obi build_ref_db specific options')
group.add_argument('--threshold','-t',
action="store", dest="build_ref_db:threshold",
metavar='<THRESHOLD>',
default=0.0,
default=0.99,
type=float,
help="Score threshold as a normalized identity, e.g. 0.95 for an identity of 95%%. Default: 0.00"
" (no threshold).")
help="Score threshold as a normalized identity, e.g. 0.95 for an identity of 95%%. Default: 0.99.")
def run(config):
@ -56,17 +58,20 @@ def run(config):
if output is None:
raise Exception("Could not create output")
o_dms = output[0]
output_0 = output[0]
final_o_view_name = output[1]
# If the input and output DMS are not the same, build the database creating a temporary view that will be exported to
# If stdout output or the input and output DMS are not the same, build the database creating a temporary view that will be exported to
# the right DMS and deleted in the other afterwards.
if i_dms != o_dms:
temporary_view_name = final_o_view_name
if i_dms != o_dms or type(output_0)==BufferedWriter:
temporary_view_name = b"temp"
i=0
while temporary_view_name in i_dms: # Making sure view name is unique in input DMS
temporary_view_name = final_o_view_name+b"_"+str2bytes(str(i))
temporary_view_name = temporary_view_name+b"_"+str2bytes(str(i))
i+=1
o_view_name = temporary_view_name
if type(output_0)==BufferedWriter:
o_dms = i_dms
else:
o_view_name = final_o_view_name
@ -80,26 +85,33 @@ def run(config):
input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
comments = View.print_config(config, "build_ref_db", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
if build_reference_db(tobytes(i_dms_name), tobytes(i_view_name), tobytes(taxonomy_name), tobytes(o_view_name), comments, config['build_ref_db']['threshold']) < 0:
if build_reference_db(i_dms.name_with_full_path, tobytes(i_view_name), tobytes(taxonomy_name), tobytes(o_view_name), comments, config['build_ref_db']['threshold']) < 0:
raise Exception("Error building a reference database")
# If the input and output DMS are not the same, export result view to output DMS
if i_dms != o_dms:
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, final_o_view_name)
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view = o_dms[o_view_name]
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
# Save command config in DMS comments
o_dms.record_command_line(command_line)
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_dms[final_o_view_name]), file=sys.stderr)
# If the input and the output DMS are different, delete the temporary result view in the input DMS
if i_dms != o_dms:
# If the input and the output DMS are different or if stdout output, delete the temporary imported view used to create the final view
if i_dms != o_dms or type(output_0)==BufferedWriter:
View.delete_view(i_dms, o_view_name)
o_dms.close()
o_dms.close(force=True)
i_dms.close()
i_dms.close(force=True)
logger("info", "Done.")

168
python/obitools3/commands/cat.pyx Executable file
View File

@ -0,0 +1,168 @@
#cython: language_level=3
from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view.view cimport View
from obitools3.uri.decode import open_uri
from obitools3.apps.optiongroups import addMinimalOutputOption, addNoProgressBarOption
from obitools3.dms.view import RollbackException
from obitools3.apps.config import logger
from obitools3.utils cimport str2bytes
from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
from obitools3.dms.view.view cimport View
from obitools3.dms.capi.obiview cimport NUC_SEQUENCE_COLUMN, REVERSE_SEQUENCE_COLUMN, \
QUALITY_COLUMN, REVERSE_QUALITY_COLUMN
from obitools3.dms.capi.obitypes cimport OBI_SEQ, OBI_QUAL
from obitools3.dms.column.column cimport Column
from io import BufferedWriter
import time
import sys
from cpython.exc cimport PyErr_CheckSignals
__title__="Concatenate views"
def addOptions(parser):
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group=parser.add_argument_group('obi cat specific options')
group.add_argument("-c",
action="append", dest="cat:views_to_cat",
metavar="<VIEW_NAME>",
default=[],
type=str,
help="URI of a view to concatenate. (e.g. 'my_dms/my_view'). "
"Several -c options can be used on the same "
"command line.")
def run(config):
DMS.obi_atexit()
logger("info", "obi cat")
# Check the views to concatenate
idms_list = []
iview_list = []
total_len = 0
remove_qual = False
remove_rev_qual = False
v_type = View_NUC_SEQS
for v_uri in config["cat"]["views_to_cat"]:
input = open_uri(v_uri)
if input is None:
raise Exception("Could not read input view")
i_dms = input[0]
i_view = input[1]
if input[2] != View_NUC_SEQS: # Check view type (output view is nuc_seqs view if all input view are nuc_seqs view)
v_type = View
if QUALITY_COLUMN not in i_view: # Check if keep quality column in output view (if all input views have it)
remove_qual = True
if REVERSE_QUALITY_COLUMN not in i_view: # same as above for reverse quality
remove_rev_qual = True
total_len += len(i_view)
idms_list.append(i_dms)
iview_list.append(i_view.name)
i_view.close()
# Open the output: only the DMS
output = open_uri(config['obi']['outputURI'],
input=False,
newviewtype=v_type)
if output is None:
raise Exception("Could not create output view")
o_dms = output[0]
output_0 = output[0]
o_view = output[1]
# stdout output
if type(output_0)==BufferedWriter:
o_dms = i_dms
# Initialize quality columns and their associated sequence columns if needed
if type(output_0) != BufferedWriter:
if not remove_qual:
if NUC_SEQUENCE_COLUMN not in o_view:
Column.new_column(o_view, NUC_SEQUENCE_COLUMN, OBI_SEQ)
Column.new_column(o_view, QUALITY_COLUMN, OBI_QUAL, associated_column_name=NUC_SEQUENCE_COLUMN, associated_column_version=o_view[NUC_SEQUENCE_COLUMN].version)
if not remove_rev_qual:
Column.new_column(o_view, REVERSE_SEQUENCE_COLUMN, OBI_SEQ)
Column.new_column(o_view, REVERSE_QUALITY_COLUMN, OBI_QUAL, associated_column_name=REVERSE_SEQUENCE_COLUMN, associated_column_version=o_view[REVERSE_SEQUENCE_COLUMN].version)
# Initialize multiple elements columns
if type(output_0)!=BufferedWriter:
dict_cols = {}
for v_uri in config["cat"]["views_to_cat"]:
v = open_uri(v_uri)[1]
for coln in v.keys():
col = v[coln]
if v[coln].nb_elements_per_line > 1:
if coln not in dict_cols:
dict_cols[coln] = {}
dict_cols[coln]['eltnames'] = set(v[coln].elements_names)
dict_cols[coln]['nbelts'] = v[coln].nb_elements_per_line
dict_cols[coln]['obitype'] = v[coln].data_type_int
else:
dict_cols[coln]['eltnames'] = set(v[coln].elements_names + list(dict_cols[coln]['eltnames']))
dict_cols[coln]['nbelts'] = len(dict_cols[coln]['eltnames'])
v.close()
for coln in dict_cols:
Column.new_column(o_view, coln, dict_cols[coln]['obitype'],
nb_elements_per_line=dict_cols[coln]['nbelts'], elements_names=list(dict_cols[coln]['eltnames']), dict_column=True)
# Initialize the progress bar
if not config['obi']['noprogressbar']:
pb = ProgressBar(total_len, config)
else:
pb = None
i = 0
for v_uri in config["cat"]["views_to_cat"]:
v = open_uri(v_uri)[1]
for entry in v:
PyErr_CheckSignals()
if pb is not None:
pb(i)
if type(output_0)==BufferedWriter:
rep = repr(entry)
output_0.write(str2bytes(rep)+b"\n")
else:
try:
o_view[i] = entry
except:
print("\nError with entry:", repr(entry))
print(repr(o_view))
i+=1
v.close()
# Deletes quality columns if needed
if type(output_0)!=BufferedWriter:
if QUALITY_COLUMN in o_view and remove_qual :
o_view.delete_column(QUALITY_COLUMN)
if REVERSE_QUALITY_COLUMN in o_view and remove_rev_qual :
o_view.delete_column(REVERSE_QUALITY_COLUMN)
if pb is not None:
pb(i, force=True)
print("", file=sys.stderr)
# Save command config in DMS comments
command_line = " ".join(sys.argv[1:])
o_view.write_config(config, "cat", command_line, input_dms_name=[d.name for d in idms_list], input_view_name=[vname for vname in iview_list])
o_dms.record_command_line(command_line)
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(view), file=sys.stderr)
for d in idms_list:
d.close(force=True)
o_dms.close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -4,13 +4,14 @@ from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms.dms cimport DMS
from obitools3.dms.view import RollbackException
from obitools3.dms.capi.obiclean cimport obi_clean
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.uri.decode import open_uri
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes, str2bytes
from obitools3.dms.view.view cimport View
from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
from io import BufferedWriter
import sys
@ -21,7 +22,8 @@ def addOptions(parser):
addMinimalInputOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group = parser.add_argument_group('obi clean specific options')
group.add_argument('--distance', '-d',
@ -36,8 +38,7 @@ def addOptions(parser):
dest="clean:sample-tag-name",
metavar="<SAMPLE TAG NAME>",
type=str,
default="merged_sample",
help="Name of the tag where sample counts are kept.")
help="Name of the tag where merged sample count informations are kept (typically generated by obi uniq, usually MERGED_sample, default: None).")
group.add_argument('--ratio', '-r',
action="store", dest="clean:ratio",
@ -53,11 +54,18 @@ def addOptions(parser):
default=False,
help="Only sequences labeled as heads are kept in the output. Default: False")
group.add_argument('--cluster-tags', '-C',
action="store_true",
dest="clean:cluster-tags",
default=False,
help="Adds tags for each sequence giving its cluster's head and weight for each sample.")
# group.add_argument('--cluster-tags', '-C',
# action="store_true",
# dest="clean:cluster-tags",
# default=False,
# help="Adds tags for each sequence giving its cluster's head and weight for each sample.")
group.add_argument('--thread-count','-p', # TODO should probably be in a specific option group
action="store", dest="clean:thread-count",
metavar='<THREAD COUNT>',
default=-1,
type=int,
help="Number of threads to use for the computation. Default: the maximum available.")
def run(config):
@ -82,17 +90,20 @@ def run(config):
if output is None:
raise Exception("Could not create output")
o_dms = output[0]
output_0 = output[0]
final_o_view_name = output[1]
# If the input and output DMS are not the same, run obiclean creating a temporary view that will be exported to
# If stdout output or the input and output DMS are not the same, create a temporary view that will be exported to
# the right DMS and deleted in the other afterwards.
if i_dms != o_dms:
temporary_view_name = final_o_view_name
if i_dms != o_dms or type(output_0)==BufferedWriter:
temporary_view_name = b"temp"
i=0
while temporary_view_name in i_dms: # Making sure view name is unique in input DMS
temporary_view_name = final_o_view_name+b"_"+str2bytes(str(i))
temporary_view_name = temporary_view_name+b"_"+str2bytes(str(i))
i+=1
o_view_name = temporary_view_name
if type(output_0)==BufferedWriter:
o_dms = i_dms
else:
o_view_name = final_o_view_name
@ -100,25 +111,36 @@ def run(config):
command_line = " ".join(sys.argv[1:])
comments = View.print_config(config, "clean", command_line, input_dms_name=[i_dms_name], input_view_name=[i_view_name])
if obi_clean(tobytes(i_dms_name), tobytes(i_view_name), tobytes(config['clean']['sample-tag-name']), tobytes(o_view_name), comments, \
config['clean']['distance'], config['clean']['ratio'], config['clean']['heads-only'], 1) < 0:
if 'sample-tag-name' not in config['clean']:
config['clean']['sample-tag-name'] = ""
if obi_clean(i_dms.name_with_full_path, tobytes(i_view_name), tobytes(config['clean']['sample-tag-name']), tobytes(o_view_name), comments, \
config['clean']['distance'], config['clean']['ratio'], config['clean']['heads-only'], config['clean']['thread-count']) < 0:
raise Exception("Error running obiclean")
# If the input and output DMS are not the same, export result view to output DMS
if i_dms != o_dms:
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, final_o_view_name)
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view = o_dms[o_view_name]
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
# Save command config in DMS comments
o_dms.record_command_line(command_line)
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_dms[final_o_view_name]), file=sys.stderr)
# If the input and the output DMS are different, delete the temporary result view in the input DMS
if i_dms != o_dms:
# If the input and the output DMS are different or if stdout output, delete the temporary imported view used to create the final view
if i_dms != o_dms or type(output_0)==BufferedWriter:
View.delete_view(i_dms, o_view_name)
o_dms.close()
o_dms.close(force=True)
i_dms.close()
i_dms.close(force=True)
logger("info", "Done.")
logger("info", "Done.")

View File

@ -0,0 +1,30 @@
#cython: language_level=3
from obitools3.apps.optiongroups import addMinimalInputOption
from obitools3.uri.decode import open_uri
from obitools3.dms import DMS
from obitools3.dms.capi.obidms cimport obi_clean_dms
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes
__title__="Clean a DMS from unfinished views and columns"
def addOptions(parser):
addMinimalInputOption(parser)
def run(config):
DMS.obi_atexit()
logger("info", "obi clean_dms")
dms_path = tobytes(config['obi']['inputURI'])
if b'.obidms' in dms_path:
dms_path = dms_path.split(b'.obidms')[0]
if obi_clean_dms(dms_path) < 0 :
raise Exception("Error cleaning DMS", config['obi']['inputURI'])
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -7,8 +7,10 @@ from obitools3.dms import DMS
from obitools3.apps.optiongroups import addMinimalInputOption
from obitools3.dms.capi.obiview cimport COUNT_COLUMN
from cpython.exc cimport PyErr_CheckSignals
__title__="Counts sequence records"
__title__="Count sequence records"
def addOptions(parser):
@ -20,13 +22,19 @@ def addOptions(parser):
group.add_argument('-s','--sequence',
action="store_true", dest="count:sequence",
default=False,
help="Prints only the number of sequence records.")
help="Prints only the number of sequence records (much faster, default: False).")
group.add_argument('-a','--all',
action="store_true", dest="count:all",
default=False,
help="Prints only the total count of sequence records (if a sequence has no `count` attribute, its default count is 1) (default: False).")
group.add_argument('-c','--count-tag',
action="store", dest="count:countcol",
default='COUNT',
type=str,
help="Name of the tag/column associated with the count information (default: COUNT).")
def run(config):
@ -39,17 +47,22 @@ def run(config):
if input is None:
raise Exception("Could not read input")
entries = input[1]
countcol = config['count']['countcol']
count1 = len(entries)
count2 = 0
if COUNT_COLUMN in entries and ((config['count']['sequence'] == config['count']['all']) or (config['count']['all'])) :
if countcol in entries and ((config['count']['sequence'] == config['count']['all']) or (config['count']['all'])) :
for e in entries:
count2+=e[COUNT_COLUMN]
PyErr_CheckSignals()
count2+=e[countcol]
if COUNT_COLUMN in entries and (config['count']['sequence'] == config['count']['all']):
if countcol in entries and (config['count']['sequence'] == config['count']['all']):
print(count1,count2)
elif COUNT_COLUMN in entries and config['count']['all']:
elif countcol in entries and config['count']['all']:
print(count2)
else:
print(count1)
input[0].close(force=True)

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -5,10 +5,10 @@ from obitools3.dms.dms cimport DMS
from obitools3.dms.capi.obidms cimport OBIDMS_p
from obitools3.dms.view import RollbackException
from obitools3.dms.capi.obiecopcr cimport obi_ecopcr
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addTaxonomyOption
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addTaxonomyOption, addNoProgressBarOption
from obitools3.uri.decode import open_uri
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes
from obitools3.utils cimport tobytes, str2bytes
from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
from obitools3.dms.view import View
@ -16,6 +16,7 @@ from libc.stdlib cimport malloc, free
from libc.stdint cimport int32_t
import sys
from io import BufferedWriter
__title__="in silico PCR"
@ -27,6 +28,7 @@ def addOptions(parser):
addMinimalInputOption(parser)
addTaxonomyOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group = parser.add_argument_group('obi ecopcr specific options')
@ -35,13 +37,15 @@ def addOptions(parser):
action="store", dest="ecopcr:primer1",
metavar='<PRIMER>',
type=str,
help="Forward primer.")
required=True,
help="Forward primer, length must be less than or equal to 32")
group.add_argument('--primer2', '-R',
action="store", dest="ecopcr:primer2",
metavar='<PRIMER>',
type=str,
help="Reverse primer.")
required=True,
help="Reverse primer, length must be less than or equal to 32")
group.add_argument('--error', '-e',
action="store", dest="ecopcr:error",
@ -107,14 +111,20 @@ def addOptions(parser):
help="Defines the method used for estimating the Tm (melting temperature) between the primers and their corresponding "
"target sequences. SANTALUCIA: 1, or OWCZARZY: 2. Default: 1.")
group.add_argument('--keep-primers', '-p',
action="store_true",
dest="ecopcr:keep-primers",
default=False,
help="Whether to keep the primers attached to the output sequences (default: the primers are cut out).")
group.add_argument('--keep-nucs', '-D',
action="store",
dest="ecopcr:keep-nucs",
metavar="<INTEGER>",
metavar="<N>",
type=int,
default=0,
help="Keeps the specified number of nucleotides on each side of the in silico amplified sequences, "
"(already including the amplified DNA fragment plus the two target sequences of the primers).")
help="Keeps N nucleotides on each side of the in silico amplified sequences, "
"not including the primers (implying that primers are automatically kept if N > 0).")
group.add_argument('--kingdom-mode', '-k',
action="store_true",
@ -161,11 +171,28 @@ def run(config):
if output is None:
raise Exception("Could not create output")
o_dms = output[0]
output_0 = output[0]
o_dms_name = output[0].name
o_view_name = output[1]
# Open the taxonomy DMS
taxdms = open_uri(config['obi']['taxoURI'],
dms_only=True)
if taxdms is None:
raise Exception("Could not open taxonomy DMS")
tax_dms = taxdms[0]
tax_dms_name = taxdms[0].name
# Read taxonomy name
taxonomy_name = config['obi']['taxoURI'].split("/")[-1] # Robust in theory
# If stdout output create a temporary view in the input dms that will be deleted afterwards.
if type(output_0)==BufferedWriter:
o_dms = i_dms
o_view_name = b"temp"
while o_view_name in i_dms: # Making sure view name is unique in input DMS
o_view_name = o_view_name+b"_"+str2bytes(str(i))
i+=1
# Save command config in View comments
command_line = " ".join(sys.argv[1:])
@ -178,14 +205,15 @@ def run(config):
# TODO: primers in comments?
if obi_ecopcr(tobytes(i_dms_name), tobytes(i_view_name), tobytes(taxonomy_name), \
tobytes(o_dms_name), tobytes(o_view_name), comments, \
if obi_ecopcr(i_dms.name_with_full_path, tobytes(i_view_name),
tax_dms.name_with_full_path, tobytes(taxonomy_name), \
o_dms.name_with_full_path, tobytes(o_view_name), comments, \
tobytes(config['ecopcr']['primer1']), tobytes(config['ecopcr']['primer2']), \
config['ecopcr']['error'], \
config['ecopcr']['min-length'], config['ecopcr']['max-length'], \
restrict_to_taxids_p, ignore_taxids_p, \
config['ecopcr']['circular'], config['ecopcr']['salt-concentration'], config['ecopcr']['salt-correction-method'], \
config['ecopcr']['keep-nucs'], config['ecopcr']['kingdom-mode']) < 0:
config['ecopcr']['keep-nucs'], config['ecopcr']['keep-primers'], config['ecopcr']['kingdom-mode']) < 0:
raise Exception("Error running ecopcr")
# Save command config in DMS comments
@ -193,10 +221,22 @@ def run(config):
free(restrict_to_taxids_p)
free(ignore_taxids_p)
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view = o_dms[o_view_name]
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_dms[o_view_name]), file=sys.stderr)
o_dms.close()
# If stdout output, delete the temporary result view in the input DMS
if type(output_0)==BufferedWriter:
View.delete_view(i_dms, o_view_name)
i_dms.close(force=True)
o_dms.close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -4,7 +4,7 @@ from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms.dms cimport DMS
from obitools3.dms.view import RollbackException
from obitools3.dms.capi.obiecotag cimport obi_ecotag
from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption
from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.uri.decode import open_uri
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes, str2bytes
@ -12,6 +12,7 @@ from obitools3.dms.view.view cimport View
from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
import sys
from io import BufferedWriter
__title__="Taxonomic assignment of sequences"
@ -22,6 +23,7 @@ def addOptions(parser):
addMinimalInputOption(parser)
addTaxonomyOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group = parser.add_argument_group('obi ecotag specific options')
@ -39,6 +41,17 @@ def addOptions(parser):
help="Minimum identity to consider for assignment, as a normalized identity, e.g. 0.95 for an identity of 95%%. "
"Default: 0.00 (no threshold).")
group.add_argument('--minimum-circle','-c',
action="store", dest="ecotag:bubble_threshold",
metavar='<CIRCLE_THRESHOLD>',
default=0.99,
type=float,
help="Minimum identity considered for the assignment circle "
"(sequence is assigned to the LCA of all sequences within a similarity circle of the best matches; "
"the threshold for this circle is the highest value between <CIRCLE_THRESHOLD> and the best assignment score found for the query sequence). "
"Give value as a normalized identity, e.g. 0.95 for an identity of 95%%. "
"Default: 0.99.")
def run(config):
DMS.obi_atexit()
@ -63,6 +76,10 @@ def run(config):
ref_dms_name = ref[0].name
ref_view_name = ref[1]
# Check that the threshold demanded is greater than or equal to the threshold used to build the reference database
if config['ecotag']['bubble_threshold'] < eval(ref_dms[ref_view_name].comments["ref_db_threshold"]) :
raise Exception(f"Error: The threshold demanded ({config['ecotag']['bubble_threshold']}) is lower than the threshold used to build the reference database ({float(ref_dms[ref_view_name].comments['ref_db_threshold'])}).")
# Open the output: only the DMS
output = open_uri(config['obi']['outputURI'],
input=False,
@ -70,17 +87,19 @@ def run(config):
if output is None:
raise Exception("Could not create output")
o_dms = output[0]
output_0 = output[0]
final_o_view_name = output[1]
# If the input and output DMS are not the same, run ecotag creating a temporary view that will be exported to
# the right DMS and deleted in the other afterwards.
if i_dms != o_dms:
temporary_view_name = final_o_view_name
# If stdout output or the input and output DMS are not the same, create a temporary view that will be exported and deleted.
if i_dms != o_dms or type(output_0)==BufferedWriter:
temporary_view_name = b"temp"
i=0
while temporary_view_name in i_dms: # Making sure view name is unique in input DMS
temporary_view_name = final_o_view_name+b"_"+str2bytes(str(i))
temporary_view_name = temporary_view_name+b"_"+str2bytes(str(i))
i+=1
o_view_name = temporary_view_name
if type(output_0)==BufferedWriter:
o_dms = i_dms
else:
o_view_name = final_o_view_name
@ -101,11 +120,12 @@ def run(config):
input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
comments = View.print_config(config, "ecotag", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
if obi_ecotag(tobytes(i_dms_name), tobytes(i_view_name), \
tobytes(ref_dms_name), tobytes(ref_view_name), \
tobytes(taxo_dms_name), tobytes(taxonomy_name), \
tobytes(o_view_name), comments,
config['ecotag']['threshold']) < 0:
if obi_ecotag(i_dms.name_with_full_path, tobytes(i_view_name), \
ref_dms.name_with_full_path, tobytes(ref_view_name), \
taxo_dms.name_with_full_path, tobytes(taxonomy_name), \
tobytes(o_view_name), comments, \
config['ecotag']['threshold'], \
config['ecotag']['bubble_threshold']) < 0:
raise Exception("Error running ecotag")
# If the input and output DMS are not the same, export result view to output DMS
@ -115,15 +135,24 @@ def run(config):
# Save command config in DMS comments
o_dms.record_command_line(command_line)
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view = o_dms[o_view_name]
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_dms[final_o_view_name]), file=sys.stderr)
# If the input and the output DMS are different, delete the temporary result view in the input DMS
if i_dms != o_dms:
# If the input and the output DMS are different or if stdout output, delete the temporary imported view used to create the final view
if i_dms != o_dms or type(output_0)==BufferedWriter:
View.delete_view(i_dms, o_view_name)
o_dms.close()
o_dms.close(force=True)
i_dms.close()
taxo_dms.close(force=True)
ref_dms.close(force=True)
i_dms.close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -5,11 +5,19 @@ from obitools3.uri.decode import open_uri
from obitools3.apps.config import logger
from obitools3.dms import DMS
from obitools3.dms.obiseq import Nuc_Seq
from obitools3.dms.capi.obiview cimport QUALITY_COLUMN
from obitools3.writers.tab import TabWriter
from obitools3.format.tab import TabFormat
from obitools3.utils cimport tobytes, tostr
from obitools3.apps.optiongroups import addMinimalInputOption, \
addExportOutputOption
addExportOutputOption, \
addNoProgressBarOption
import sys
import io
from cpython.exc cimport PyErr_CheckSignals
__title__="Export a view to a different file format"
@ -18,6 +26,7 @@ def addOptions(parser):
addMinimalInputOption(parser)
addExportOutputOption(parser)
addNoProgressBarOption(parser)
def run(config):
@ -31,7 +40,16 @@ def run(config):
if input is None:
raise Exception("Could not read input")
iview = input[1]
if 'outputformat' not in config['obi']:
if iview.type == b"NUC_SEQS_VIEW":
if QUALITY_COLUMN in iview:
config['obi']['outputformat'] = b'fastq'
else:
config['obi']['outputformat'] = b'fasta'
else:
config['obi']['outputformat'] = b'tabular'
# Open the output
output = open_uri(config['obi']['outputURI'],
input=False)
@ -44,26 +62,128 @@ def run(config):
# Check that the input view has the type NUC_SEQS if needed # TODO discuss, maybe bool property
if (output[2] == Nuc_Seq) and (iview.type != b"NUC_SEQS_VIEW") : # Nuc_Seq_Stored? TODO
raise Exception("Error: the view to export in fasta or fastq format is not a NUC_SEQS view")
# Initialize the progress bar
pb = ProgressBar(len(iview), config, seconde=5)
if config['obi']['only'] is not None:
withoutskip = min(input[4], config['obi']['only'])
else:
withoutskip = input[4]
if config['obi']['skip'] is not None:
skip = min(input[4], config['obi']['skip'])
else:
skip = 0
# Initialize the progress bar
if config['obi']['noprogressbar']:
pb = None
else:
pb = ProgressBar(withoutskip - skip, config)
if config['obi']['outputformat'] == b'metabaR':
# Check prefix
if "metabarprefix" not in config["obi"]:
raise Exception("Prefix needed when exporting for metabaR (--metabaR-prefix option)")
else:
metabaRprefix = config["obi"]["metabarprefix"]
i=0
for seq in iview :
pb(i)
PyErr_CheckSignals()
if pb is not None:
pb(i)
try:
writer(seq)
except StopIteration:
except (StopIteration, BrokenPipeError, IOError):
break
i+=1
pb(i, force=True)
print("", file=sys.stderr)
if pb is not None:
pb(i, force=True)
print("", file=sys.stderr)
if config['obi']['outputformat'] == b'metabaR':
# Export ngsfilter file if view provided
if 'metabarngsfilter' in config['obi']:
ngsfilter_input = open_uri(config['obi']['metabarngsfilter'])
if ngsfilter_input is None:
raise Exception("Could not read ngsfilter view for metabaR output")
ngsfilter_view = ngsfilter_input[1]
ngsfilter_output = open(config['obi']['metabarprefix']+'.ngsfilter', 'w')
for line in ngsfilter_view:
line_to_print = b""
line_to_print += line[b'experiment']
line_to_print += b"\t"
line_to_print += line[b'sample']
line_to_print += b"\t"
line_to_print += line[b'forward_tag']
line_to_print += b":"
line_to_print += line[b'reverse_tag']
line_to_print += b"\t"
line_to_print += line[b'forward_primer']
line_to_print += b"\t"
line_to_print += line[b'reverse_primer']
line_to_print += b"\t"
line_to_print += line[b'additional_info']
print(tostr(line_to_print), file=ngsfilter_output)
if ngsfilter_input[0] != input[0]:
ngsfilter_input[0].close()
ngsfilter_output.close()
# Export sample metadata
samples_output = open(config['obi']['metabarprefix']+'_samples.csv', 'w')
# Export sample metadata file if view provided
if 'metabarsamples' in config['obi']:
samples_input = open_uri(config['obi']['metabarsamples'])
if samples_input is None:
raise Exception("Could not read sample view for metabaR output")
samples_view = samples_input[1]
# Export with tab formatter
TabWriter(TabFormat(header=True, sep='\t',),
samples_output,
header=True)
if samples_input[0] != input[0]:
samples_input[0].close()
# Else export just sample names from main view
else:
sample_list = []
if 'MERGED_sample' in iview:
sample_list = iview['MERGED_sample'].keys()
elif 'sample' not in iview:
for seq in iview:
sample = seq['sample']
if sample not in sample_list:
sample_list.append(sample)
else:
logger("warning", "Can not read sample list from main view for metabaR sample list export")
print("sample_id", file=samples_output)
for sample in sample_list:
line_to_print = b""
line_to_print += sample
line_to_print += b"\t"
print(tostr(line_to_print), file=samples_output)
samples_output.close()
# TODO save command in input dms?
output_object.close()
if not BrokenPipeError and not IOError:
output_object.close()
iview.close()
input[0].close()
input[0].close(force=True)
logger("info", "Done.")
if BrokenPipeError or IOError:
sys.stderr.close()

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

158
python/obitools3/commands/grep.pyx Executable file → Normal file
View File

@ -4,7 +4,7 @@ from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view.view cimport View, Line_selection
from obitools3.uri.decode import open_uri
from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption
from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.dms.view import RollbackException
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes, str2bytes
@ -13,6 +13,9 @@ from functools import reduce
import time
import re
import sys
import ast
from io import BufferedWriter
from cpython.exc cimport PyErr_CheckSignals
__title__="Grep view lines that match the given predicates"
@ -26,21 +29,22 @@ def addOptions(parser):
addMinimalInputOption(parser)
addTaxonomyOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group=parser.add_argument_group("obi grep specific options")
group.add_argument("--predicate", "-p",
action="append", dest="grep:grep_predicates",
metavar="<PREDICATE>",
default=None,
default=[],
type=str,
help="Python boolean expression to be evaluated in the "
"sequence/line context. The attribute name can be "
"used in the expression as a variable name."
"An extra variable named 'sequence' or 'line' refers"
"used in the expression as a variable name. "
"An extra variable named 'sequence' or 'line' refers "
"to the sequence or line object itself. "
"Several -p options can be used on the same "
"commande line.")
"command line.")
group.add_argument("-S", "--sequence",
action="store", dest="grep:seq_pattern",
@ -87,7 +91,7 @@ def addOptions(parser):
metavar="<ATTRIBUTE_NAME>",
help="Select records with the attribute <ATTRIBUTE_NAME> "
"defined (not set to NA value). "
"Several -a options can be used on the same "
"Several -A options can be used on the same "
"command line.")
group.add_argument("-L", "--lmax",
@ -138,16 +142,39 @@ def addOptions(parser):
"the sequences having at least one of them are ignored.")
def Filter_generator(options, tax_filter):
#taxfilter = taxonomyFilterGenerator(options)
def obi_compile_eval(str expr):
class MyVisitor(ast.NodeTransformer):
def visit_Str(self, node: ast.Str):
result = ast.Bytes(s = node.s.encode('utf-8'))
return ast.copy_location(result, node)
expr = "obi_eval_result="+expr
tree = ast.parse(expr)
optimizer = MyVisitor()
tree = optimizer.visit(tree)
return compile(tree, filename="<ast>", mode="exec")
def obi_eval(compiled_expr, loc_env, line):
exec(compiled_expr, {}, loc_env)
obi_eval_result = loc_env["obi_eval_result"]
return obi_eval_result
def Filter_generator(options, tax_filter, i_view):
# Initialize conditions
predicates = None
if "predicates" in options:
predicates = options["predicates"]
if "grep_predicates" in options:
predicates = [obi_compile_eval(p) for p in options["grep_predicates"]]
attributes = None
if "attributes" in options:
if "attributes" in options and len(options["attributes"]) > 0:
attributes = options["attributes"]
for attribute in attributes:
if attribute not in i_view:
return None
lmax = None
if "lmax" in options:
lmax = options["lmax"]
@ -157,7 +184,7 @@ def Filter_generator(options, tax_filter):
invert_selection = options["invert_selection"]
id_set = None
if "id_list" in options:
id_set = set(x.strip() for x in open(options["id_list"]))
id_set = set(x.strip() for x in open(options["id_list"], 'rb'))
# Initialize the regular expression patterns
seq_pattern = None
@ -170,9 +197,11 @@ def Filter_generator(options, tax_filter):
if "def_pattern" in options:
def_pattern = re.compile(tobytes(options["def_pattern"]))
attribute_patterns={}
if "attribute_patterns" in options:
if "attribute_patterns" in options and len(options["attribute_patterns"]) > 0:
for p in options["attribute_patterns"]:
attribute, pattern = p.split(":", 1)
if attribute not in i_view:
return None
attribute_patterns[tobytes(attribute)] = re.compile(tobytes(pattern))
def filter(line, loc_env):
@ -197,7 +226,7 @@ def Filter_generator(options, tax_filter):
if good and attribute_patterns:
good = (reduce(lambda bint x, bint y : x and y,
(line[attribute] is not None for attribute in attributes),
(line[attribute] is not None for attribute in attribute_patterns),
True)
and
reduce(lambda bint x, bint y: x and y,
@ -205,10 +234,10 @@ def Filter_generator(options, tax_filter):
for attribute in attribute_patterns),
True)
)
if good and predicates:
good = (reduce(lambda bint x, bint y: x and y,
(bool(eval(p, loc_env, line))
(bool(obi_eval(p, loc_env, line))
for p in predicates), True))
if good and lmin:
@ -229,6 +258,13 @@ def Filter_generator(options, tax_filter):
def Taxonomy_filter_generator(taxo, options):
if (("required_ranks" in options and options["required_ranks"]) or \
("required_taxids" in options and options["required_taxids"]) or \
("ignored_taxids" in options and options["ignored_taxids"])) and \
(taxo is None):
raise RollbackException("obi grep error: can't use taxonomy options without providing a taxonomy. Rollbacking view")
if taxo is not None:
def tax_filter(seq):
good = True
@ -277,52 +313,74 @@ def run(config):
if output is None:
raise Exception("Could not create output view")
o_dms = output[0]
o_view_name_final = output[1]
o_view_name = o_view_name_final
# If the input and output DMS are not the same, create output view in input DMS first, then export it
# to output DMS, making sure the temporary view name is unique in the input DMS
if i_dms != o_dms:
output_0 = output[0]
final_o_view_name = output[1]
# If stdout output or the input and output DMS are not the same, create a temporary view that will be exported and deleted afterwards.
if i_dms != o_dms or type(output_0)==BufferedWriter:
temporary_view_name = b"temp"
i=0
while o_view_name in i_dms:
o_view_name = o_view_name_final+b"_"+str2bytes(str(i))
i+=1
while temporary_view_name in i_dms: # Making sure view name is unique in input DMS
temporary_view_name = temporary_view_name+b"_"+str2bytes(str(i))
i+=1
o_view_name = temporary_view_name
if type(output_0)==BufferedWriter:
o_dms = i_dms
else:
o_view_name = final_o_view_name
if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
taxo_uri = open_uri(config["obi"]["taxoURI"])
if taxo_uri is None:
if taxo_uri is None or taxo_uri[2] == bytes:
raise Exception("Couldn't open taxonomy")
taxo = taxo_uri[1]
else :
taxo = None
# Initialize the progress bar
pb = ProgressBar(len(i_view), config, seconde=5)
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(len(i_view), config)
else:
pb = None
# Apply filter
tax_filter = Taxonomy_filter_generator(taxo, config["grep"])
filter = Filter_generator(config["grep"], tax_filter)
filter = Filter_generator(config["grep"], tax_filter, i_view)
selection = Line_selection(i_view)
for i in range(len(i_view)):
pb(i)
line = i_view[i]
loc_env = {"sequence": line, "line": line, "taxonomy": taxo}
good = filter(line, loc_env)
if good :
if filter is None and config["grep"]["invert_selection"]: # all sequences are selected: filter is None if no line will be selected because some columns don't exist
for i in range(len(i_view)):
PyErr_CheckSignals()
if pb is not None:
pb(i)
selection.append(i)
pb(i, force=True)
print("", file=sys.stderr)
elif filter is not None : # filter is None if no line will be selected because some columns don't exist
for i in range(len(i_view)):
PyErr_CheckSignals()
if pb is not None:
pb(i)
line = i_view[i]
loc_env = {"sequence": line, "line": line, "taxonomy": taxo, "obi_eval_result": False}
good = filter(line, loc_env)
if good :
selection.append(i)
if pb is not None:
pb(len(i_view), force=True)
print("", file=sys.stderr)
# Create output view with the line selection
try:
o_view = selection.materialize(o_view_name)
except Exception, e:
raise RollbackException("obi grep error, rollbacking view: "+str(e), o_view)
logger("info", "Grepped %d entries" % len(o_view))
# Save command config in View and DMS comments
command_line = " ".join(sys.argv[1:])
input_dms_name=[input[0].name]
@ -337,16 +395,22 @@ def run(config):
# and delete the temporary view in the input DMS
if i_dms != o_dms:
o_view.close()
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, o_view_name_final)
o_view = o_dms[o_view_name_final]
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, final_o_view_name)
o_view = o_dms[final_o_view_name]
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_view), file=sys.stderr)
# If the input and the output DMS are different, delete the temporary imported view used to create the final view
if i_dms != o_dms:
# If the input and the output DMS are different or if stdout output, delete the temporary imported view used to create the final view
if i_dms != o_dms or type(output_0)==BufferedWriter:
View.delete_view(i_dms, o_view_name)
o_dms.close()
i_dms.close()
o_dms.close(force=True)
i_dms.close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -4,22 +4,28 @@ from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view.view cimport View, Line_selection
from obitools3.uri.decode import open_uri
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.dms.view import RollbackException
from obitools3.apps.config import logger
from obitools3.utils cimport str2bytes
from obitools3.apps.optiongroups import addExportOutputOption
import time
import sys
from io import BufferedWriter
__title__="Keep the N first lines of a view."
from cpython.exc cimport PyErr_CheckSignals
__title__="Keep the N first lines of a view"
def addOptions(parser):
addMinimalInputOption(parser)
addMinimalOutputOption(parser)
addExportOutputOption(parser)
addNoProgressBarOption(parser)
group=parser.add_argument_group('obi head specific options')
@ -51,30 +57,41 @@ def run(config):
if output is None:
raise Exception("Could not create output view")
o_dms = output[0]
o_view_name_final = output[1]
o_view_name = o_view_name_final
output_0 = output[0]
final_o_view_name = output[1]
# If the input and output DMS are not the same, create output view in input DMS first, then export it
# to output DMS, making sure the temporary view name is unique in the input DMS
if i_dms != o_dms:
# If stdout output or the input and output DMS are not the same, create a temporary view that will be exported and deleted.
if i_dms != o_dms or type(output_0)==BufferedWriter:
temporary_view_name = b"temp"
i=0
while o_view_name in i_dms:
o_view_name = o_view_name_final+b"_"+str2bytes(str(i))
i+=1
while temporary_view_name in i_dms: # Making sure view name is unique in input DMS
temporary_view_name = temporary_view_name+b"_"+str2bytes(str(i))
i+=1
o_view_name = temporary_view_name
if type(output_0)==BufferedWriter:
o_dms = i_dms
else:
o_view_name = final_o_view_name
n = min(config['head']['count'], len(i_view))
# Initialize the progress bar
pb = ProgressBar(n, config, seconde=5)
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(n, config)
else:
pb = None
selection = Line_selection(i_view)
for i in range(n):
pb(i)
PyErr_CheckSignals()
if pb is not None:
pb(i)
selection.append(i)
pb(i, force=True)
print("", file=sys.stderr)
if pb is not None:
pb(i, force=True)
print("", file=sys.stderr)
# Create output view with the line selection
try:
@ -91,16 +108,22 @@ def run(config):
# and delete the temporary view in the input DMS
if i_dms != o_dms:
o_view.close()
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, o_view_name_final)
o_view = o_dms[o_view_name_final]
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, final_o_view_name)
o_view = o_dms[final_o_view_name]
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(view), file=sys.stderr)
# If the input and the output DMS are different, delete the temporary imported view used to create the final view
if i_dms != o_dms:
# If the input and the output DMS are different or if stdout output, delete the temporary imported view used to create the final view
if i_dms != o_dms or type(output_0)==BufferedWriter:
View.delete_view(i_dms, o_view_name)
o_dms.close()
i_dms.close()
o_dms.close(force=True)
i_dms.close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -51,7 +51,8 @@ def run(config):
print(bytes2str(entries.dot_history_graph))
elif config['history']['format'] == "ascii" :
if isinstance(entries, View):
print(bytes2str(entries.ascii_history_graph))
print(bytes2str(entries.ascii_history))
else:
raise Exception("ASCII history only available for views")
input[0].close(force=True)

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -2,6 +2,7 @@
import sys
import os
import re
from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms.view.view cimport View
@ -11,9 +12,11 @@ from obitools3.dms.column.column cimport Column
from obitools3.dms.obiseq cimport Nuc_Seq
from obitools3.dms import DMS
from obitools3.dms.taxo.taxo cimport Taxonomy
from obitools3.files.uncompress cimport CompressedFile
from obitools3.utils cimport tobytes, \
tostr, \
get_obitype, \
update_obitype
@ -23,24 +26,38 @@ from obitools3.dms.capi.obiview cimport VIEW_TYPE_NUC_SEQS, \
DEFINITION_COLUMN, \
QUALITY_COLUMN, \
COUNT_COLUMN, \
TAXID_COLUMN
TAXID_COLUMN, \
MERGED_PREFIX, \
SCIENTIFIC_NAME_COLUMN
from obitools3.dms.capi.obidms cimport obi_import_view
from obitools3.dms.capi.obitypes cimport obitype_t, \
OBI_VOID, \
OBI_QUAL
OBI_QUAL, \
OBI_STR, \
OBI_INT
from obitools3.dms.capi.obierrno cimport obi_errno
from obitools3.apps.optiongroups import addImportInputOption, \
addTabularInputOption, \
addTaxdumpInputOption, \
addMinimalOutputOption
addMinimalOutputOption, \
addNoProgressBarOption, \
addTaxonomyOption
from obitools3.uri.decode import open_uri
from obitools3.apps.config import logger
__title__="Imports sequences from different formats into a DMS"
from cpython.exc cimport PyErr_CheckSignals
from io import BufferedWriter
import ast
__title__="Import sequences from different formats into a DMS"
default_config = { 'destview' : None,
@ -57,8 +74,27 @@ def addOptions(parser):
addImportInputOption(parser)
addTabularInputOption(parser)
addTaxdumpInputOption(parser)
addTaxonomyOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group = parser.add_argument_group('obi import specific options')
group.add_argument('--preread',
action="store_true", dest="import:preread",
default=False,
help="Do a first readthrough of the dataset if it contains huge dictionaries (more than 100 keys) for "
"a much faster import. This option is not recommended and will slow down the import in any other case.")
group.add_argument('--space-priority',
action="store_true", dest="import:space_priority",
default=False,
help="If importing a view into another DMS, do it by importing each line, saving disk space if the original view "
"has a line selection associated.")
# group.add_argument('--only-id',
# action="store", dest="import:onlyid",
# help="only id")
def run(config):
@ -71,6 +107,10 @@ def run(config):
cdef obitype_t new_type
cdef bint get_quality
cdef bint NUC_SEQS_view
cdef bint silva
cdef bint rdp
cdef bint unite
cdef bint sintax
cdef int nb_elts
cdef object d
cdef View view
@ -81,6 +121,8 @@ def run(config):
cdef Column seq_col
cdef Column qual_col
cdef Column old_column
cdef Column sci_name_col
cdef bytes sci_name
cdef bint rewrite
cdef dict dcols
cdef int skipping
@ -99,44 +141,86 @@ def run(config):
logger("info", "obi import: imports an object (file(s), obiview, taxonomy...) into a DMS")
entry_count = -1
pb = None
if not config['obi']['taxdump']:
input = open_uri(config['obi']['inputURI'])
if input is None: # TODO check for bytes instead now?
raise Exception("Could not open input URI")
entry_count = input[4]
logger("info", "Importing %d entries", entry_count)
if config['obi']['only'] is not None:
entry_count = min(input[4], config['obi']['only'])
else:
entry_count = input[4]
if entry_count > 0:
logger("info", "Importing %d entries", entry_count)
else:
logger("info", "Importing an unknown number of entries")
# TODO a bit dirty?
if input[2]==Nuc_Seq:
if input[2]==Nuc_Seq or input[2]==View_NUC_SEQS:
v = View_NUC_SEQS
else:
v = View
else:
v = None
if config['obi']['taxdump'] or (isinstance(input[1], View) and not config['import']['space_priority']):
dms_only=True
else:
dms_only=False
output = open_uri(config['obi']['outputURI'],
input=False,
newviewtype=v)
newviewtype=v,
dms_only=dms_only)
if output is None:
raise Exception("Could not create output view")
raise Exception("Could not open output")
o_dms = output[0]
# Read taxdump
if config['obi']['taxdump']: # The input is a taxdump to import in a DMS
taxo = Taxonomy.open_taxdump(output[0], config['obi']['inputURI'])
taxo.write(output[1])
# Check if taxonomy name isn't already taken
taxo_name = output[1].split(b'/')[1]
if Taxonomy.exists(o_dms, taxo_name):
raise Exception("Taxonomy name already exists in this DMS")
taxo = Taxonomy.open_taxdump(o_dms, config['obi']['inputURI'])
taxo.write(taxo_name)
taxo.close()
output[0].record_command_line(" ".join(sys.argv[1:]))
output[0].close()
o_dms.record_command_line(" ".join(sys.argv[1:]))
o_dms.close(force=True)
logger("info", "Done.")
return
if entry_count >= 0:
pb = ProgressBar(entry_count, config, seconde=5)
# Open taxonomy if there is one
if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
taxo_uri = open_uri(config['obi']['taxoURI'])
if taxo_uri is None or taxo_uri[2] == bytes:
raise Exception("Couldn't open taxonomy")
taxo = taxo_uri[1]
else :
taxo = None
# If importing a view between two DMS and not wanting to save space if line selection in original view, use C API
if isinstance(input[1], View) and not config['import']['space_priority']:
if obi_import_view(input[0].name_with_full_path, o_dms.name_with_full_path, input[1].name, tobytes((config['obi']['outputURI'].split('/'))[-1])) < 0 :
input[0].close(force=True)
output[0].close(force=True)
raise Exception("Error importing a view in a DMS")
o_dms.record_command_line(" ".join(sys.argv[1:]))
input[0].close(force=True)
output[0].close(force=True)
logger("info", "Done.")
return
# Reinitialize the progress bar
if entry_count >= 0 and config['obi']['noprogressbar'] == False:
pb = ProgressBar(entry_count, config)
else:
pb = None
entries = input[1]
NUC_SEQS_view = False
if isinstance(output[1], View) :
@ -145,141 +229,308 @@ def run(config):
NUC_SEQS_view = True
else:
raise NotImplementedError()
# Save basic columns in variables for optimization
if NUC_SEQS_view :
id_col = view[ID_COLUMN]
def_col = view[DEFINITION_COLUMN]
seq_col = view[NUC_SEQUENCE_COLUMN]
# Prepare taxon scientific name and taxid refs if RDP/SILVA/UNITE/SINTAX formats
silva = False
rdp = False
unite = False
sintax=False
if 'inputformat' in config['obi'] and (config['obi']['inputformat'] == b"silva" or \
config['obi']['inputformat'] == b"rdp" or \
config['obi']['inputformat'] == b"unite" or \
config['obi']['inputformat'] == b"sintax"):
#if taxo is None:
# raise Exception("A taxonomy (as built by 'obi import --taxdump') must be provided for SILVA and RDP files")
if config['obi']['inputformat'] == b"silva":
silva = True
elif config['obi']['inputformat'] == b"rdp":
rdp = True
elif config['obi']['inputformat'] == b"unite":
unite = True
elif config['obi']['inputformat'] == b"sintax":
sintax = True
sci_name_col = Column.new_column(view, SCIENTIFIC_NAME_COLUMN, OBI_STR)
if taxo is not None:
taxid_col = Column.new_column(view, TAXID_COLUMN, OBI_INT)
dcols = {}
# First read through the entries to prepare columns with dictionaries as they are very time-expensive to rewrite
if config['import']['preread']:
logger("info", "First readthrough...")
entries = input[1]
i = 0
dict_dict = {}
for entry in entries:
PyErr_CheckSignals()
if entry is None: # error or exception handled at lower level, not raised because Python generators can't resume after any exception is raised
if config['obi']['skiperror']:
i-=1
continue
else:
raise Exception("obi import error in first readthrough")
if pb is not None:
pb(i)
elif not i%50000:
logger("info", "Read %d entries", i)
for tag in entry :
newtag = tag
if tag[:7] == b"merged_":
newtag = MERGED_PREFIX+tag[7:]
if type(entry[tag]) == dict :
if tag in dict_dict:
dict_dict[newtag][0].update(entry[tag].keys())
else:
dict_dict[newtag] = [set(entry[tag].keys()), get_obitype(entry[tag])]
i+=1
if pb is not None:
pb(i, force=True)
print("", file=sys.stderr)
for tag in dict_dict:
dcols[tag] = (Column.new_column(view, tag, dict_dict[tag][1], \
nb_elements_per_line=len(dict_dict[tag][0]), \
elements_names=list(dict_dict[tag][0]), \
dict_column=True), \
dict_dict[tag][1])
# Reinitialize the input
if isinstance(input[0], CompressedFile):
input_is_file = True
# Reinitialize the progress bar
if entry_count >= 0 and config['obi']['noprogressbar'] == False:
pb = ProgressBar(entry_count, config)
else:
pb = None
try:
input[0].close()
except AttributeError:
pass
input = open_uri(config['obi']['inputURI'], force_file=input_is_file)
if input is None:
raise Exception("Could not open input URI")
# if 'onlyid' in config['import']:
# onlyid = tobytes(config['import']['onlyid'])
# else:
# onlyid = None
entries = input[1]
i = 0
for entry in entries :
PyErr_CheckSignals()
if entry is None: # error or exception handled at lower level, not raised because Python generators can't resume after any exception is raised
if config['obi']['skiperror']:
i-=1
continue
else:
raise RollbackException("obi import error, rollbacking view", view)
if pb is not None:
pb(i)
if NUC_SEQS_view:
id_col[i] = entry.id
def_col[i] = entry.definition
seq_col[i] = entry.seq
# Check if there is a sequencing quality associated by checking the first entry # TODO haven't found a more robust solution yet
if i == 0:
get_quality = QUALITY_COLUMN in entry
elif not i%50000:
logger("info", "Imported %d entries", i)
# if onlyid is not None and entry.id != onlyid:
# continue
try:
if NUC_SEQS_view:
id_col[i] = entry.id
def_col[i] = entry.definition
seq_col[i] = entry.seq
# Check if there is a sequencing quality associated by checking the first entry # TODO haven't found a more robust solution yet
if i == 0:
get_quality = QUALITY_COLUMN in entry
if get_quality:
Column.new_column(view, QUALITY_COLUMN, OBI_QUAL)
qual_col = view[QUALITY_COLUMN]
if get_quality:
Column.new_column(view, QUALITY_COLUMN, OBI_QUAL)
qual_col = view[QUALITY_COLUMN]
if get_quality:
qual_col[i] = entry.quality
for tag in entry :
if tag != ID_COLUMN and tag != DEFINITION_COLUMN and tag != NUC_SEQUENCE_COLUMN and tag != QUALITY_COLUMN : # TODO dirty
value = entry[tag]
if tag == b"taxid":
tag = TAXID_COLUMN
if tag == b"count":
tag = COUNT_COLUMN
qual_col[i] = entry.quality
if tag not in dcols :
value_type = type(value)
nb_elts = 1
value_obitype = OBI_VOID
if value_type == dict or value_type == list :
nb_elts = len(value)
elt_names = list(value)
else :
nb_elts = 1
elt_names = None
value_obitype = get_obitype(value)
if value_obitype != OBI_VOID :
dcols[tag] = (Column.new_column(view, tag, value_obitype, nb_elements_per_line=nb_elts, elements_names=elt_names), value_obitype)
# Fill value
dcols[tag][0][i] = value
# TODO else log error?
else :
rewrite = False
# Check type adequation
old_type = dcols[tag][1]
new_type = OBI_VOID
new_type = update_obitype(old_type, value)
if old_type != new_type :
rewrite = True
try:
# Fill value
dcols[tag][0][i] = value
except IndexError :
value_type = type(value)
old_column = dcols[tag][0]
old_nb_elements_per_line = old_column.nb_elements_per_line
new_nb_elements_per_line = 0
old_elements_names = old_column.elements_names
new_elements_names = None
#####################################################################
# Check the length and keys of column lines if needed
if value_type == dict : # Check dictionary keys
for k in value :
if k not in old_elements_names :
new_elements_names = list(set(old_elements_names+[tobytes(k) for k in value]))
rewrite = True
# Parse taxon scientific name if RDP or Silva or Unite file
if (rdp or silva or unite or sintax):
if rdp or silva:
sci_names = entry.definition.split(b";")
sci_name_col[i] = sci_names[-1]
elif unite:
sci_names = entry.id.split(b'|')[-1].split(b';')
sci_name_col[i] = re.sub(b'[a-zA-Z]__', b'', sci_names[-1])
elif sintax:
reconstructed_line = entry.id+b' '+entry.definition[:-1]
splitted_reconstructed_line = reconstructed_line.split(b';')
taxa = splitted_reconstructed_line[1].split(b'=')[1]
taxa = splitted_reconstructed_line[1].split(b',')
sci_names = []
for t in taxa:
tf = t.split(b':')[1]
sci_names.append(tf)
sci_name_col[i] = sci_names[-1]
id_col[i] = reconstructed_line.split(b';')[0]
def_col[i] = reconstructed_line
# Fond taxid if taxonomy provided
if taxo is not None :
for sci_name in reversed(sci_names):
if unite:
sci_name = re.sub(b'[a-zA-Z]__', b'', sci_name)
if sci_name.split()[0] != b'unidentified' and sci_name.split()[0] != b'uncultured' and sci_name.split()[0] != b'metagenome':
taxon = taxo.get_taxon_by_name(sci_name)
if taxon is not None:
sci_name_col[i] = taxon.name
taxid_col[i] = taxon.taxid
#print(taxid_col[i], sci_name_col[i])
break
elif value_type == list or value_type == tuple : # Check vector length
if old_nb_elements_per_line < len(value) :
new_nb_elements_per_line = len(value)
rewrite = True
#####################################################################
if rewrite :
if new_nb_elements_per_line == 0 and new_elements_names is not None :
new_nb_elements_per_line = len(new_elements_names)
# Reset obierrno
obi_errno = 0
dcols[tag] = (view.rewrite_column_with_diff_attributes(old_column.name,
new_data_type=new_type,
new_nb_elements_per_line=new_nb_elements_per_line,
new_elements_names=new_elements_names,
rewrite_last_line=False),
value_obitype)
# Update the dictionary:
for t in dcols :
dcols[t] = (view[t], dcols[t][1])
for tag in entry :
if tag != ID_COLUMN and tag != DEFINITION_COLUMN and tag != NUC_SEQUENCE_COLUMN and tag != QUALITY_COLUMN : # TODO dirty
value = entry[tag]
if tag == b"taxid":
tag = TAXID_COLUMN
if tag == b"count":
tag = COUNT_COLUMN
if tag == b"scientific_name":
tag = SCIENTIFIC_NAME_COLUMN
if tag[:7] == b"merged_":
tag = MERGED_PREFIX+tag[7:]
if type(value) == bytes and value[:1]==b"[" :
try:
if type(eval(value)) == list:
value = eval(value)
#print(value)
except:
pass
if tag not in dcols :
value_type = type(value)
nb_elts = 1
value_obitype = OBI_VOID
dict_col = False
if value_type == dict :
nb_elts = len(value)
elt_names = list(value)
dict_col = True
else :
nb_elts = 1
elt_names = None
if value_type == list :
tuples = True
else:
tuples = False
value_obitype = get_obitype(value)
if value_obitype != OBI_VOID :
dcols[tag] = (Column.new_column(view, tag, value_obitype, nb_elements_per_line=nb_elts, elements_names=elt_names, dict_column=dict_col, tuples=tuples), value_obitype)
# Fill value
dcols[tag][0][i] = value
# TODO else log error?
else :
rewrite = False
# Check type adequation
old_type = dcols[tag][1]
new_type = OBI_VOID
new_type = update_obitype(old_type, value)
if old_type != new_type :
rewrite = True
try:
# Check that it's not the case where the first entry contained a dict of length 1 and now there is a new key
if type(value) == dict and \
dcols[tag][0].nb_elements_per_line == 1 \
and set(dcols[tag][0].elements_names) != set(value.keys()) :
raise IndexError # trigger column rewrite
# Fill value
dcols[tag][0][i] = value
except (IndexError, OverflowError):
value_type = type(value)
old_column = dcols[tag][0]
old_nb_elements_per_line = old_column.nb_elements_per_line
new_nb_elements_per_line = 0
old_elements_names = old_column.elements_names
new_elements_names = None
#####################################################################
# Check the length and keys of column lines if needed
if value_type == dict : # Check dictionary keys
for k in value :
if k not in old_elements_names :
new_elements_names = list(set(old_elements_names+[tobytes(k) for k in value]))
rewrite = True
break
elif value_type == list or value_type == tuple : # Check vector length
if old_nb_elements_per_line < len(value) :
new_nb_elements_per_line = len(value)
rewrite = True
#####################################################################
if rewrite :
if new_nb_elements_per_line == 0 and new_elements_names is not None :
new_nb_elements_per_line = len(new_elements_names)
# Reset obierrno
obi_errno = 0
dcols[tag] = (view.rewrite_column_with_diff_attributes(old_column.name,
new_data_type=new_type,
new_nb_elements_per_line=new_nb_elements_per_line,
new_elements_names=new_elements_names,
rewrite_last_line=False),
new_type)
# Update the dictionary:
for t in dcols :
dcols[t] = (view[t], dcols[t][1])
# Fill value
dcols[tag][0][i] = value
except Exception as e:
print("\nCould not import sequence:\n", repr(entry), "\nError raised:", e, "\n/!\ Check if '--input-na-string' option needs to be set")
if 'skiperror' in config['obi'] and not config['obi']['skiperror']:
raise e
else:
pass
i-=1 # overwrite problematic entry
i+=1
if pb is not None:
pb(i, force=True)
print("", file=sys.stderr)
logger("info", "Imported %d entries", len(view))
# Save command config in View and DMS comments
command_line = " ".join(sys.argv[1:])
@ -294,7 +545,7 @@ def run(config):
except AttributeError:
pass
try:
output[0].close()
output[0].close(force=True)
except AttributeError:
pass

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -1,44 +1,50 @@
#cython: language_level=3
from obitools3.apps.optiongroups import addMinimalInputOption
from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.uri.decode import open_uri
from obitools3.dms import DMS
from obitools3.utils cimport tobytes
from obitools3.dms.capi.obiview cimport QUALITY_COLUMN
from obitools3.apps.optiongroups import addMinimalInputOption
import sys
import io
from subprocess import Popen, PIPE
from cpython.exc cimport PyErr_CheckSignals
__title__="Less equivalent"
def addOptions(parser):
addMinimalInputOption(parser)
group=parser.add_argument_group('obi less specific options')
group.add_argument('--print', '-n',
action="store", dest="less:print",
metavar='<N>',
default=10,
type=int,
help="Print N entries (default: 10)")
def run(config):
cdef object entries
cdef int n
DMS.obi_atexit()
input = open_uri(config['obi']['inputURI'])
entries = input[1]
if config['less']['print'] > len(entries) :
n = len(entries)
else :
n = config['less']['print']
# Print
for i in range(n) :
print(repr(entries[i]))
DMS.obi_atexit()
# Open the input
input = open_uri(config['obi']['inputURI'])
if input is None:
raise Exception("Could not read input")
iview = input[1]
process = Popen(["less"], stdin=PIPE)
for seq in iview :
PyErr_CheckSignals()
try:
process.stdin.write(tobytes(repr(seq)))
process.stdin.write(b"\n")
except (StopIteration, BrokenPipeError, IOError):
break
sys.stderr.close()
process.stdin.close()
process.wait()
iview.close()
input[0].close(force=True)

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -3,7 +3,9 @@
from obitools3.uri.decode import open_uri
from obitools3.apps.config import logger
from obitools3.dms import DMS
from obitools3.dms.taxo.taxo cimport Taxonomy
from obitools3.apps.optiongroups import addMinimalInputOption
from obitools3.utils cimport tostr, bytes2str_object
__title__="Print a preview of a DMS, view, column...."
@ -11,6 +13,12 @@ __title__="Print a preview of a DMS, view, column...."
def addOptions(parser):
addMinimalInputOption(parser)
group = parser.add_argument_group('obi ls specific options')
group.add_argument('-l',
action="store_true", dest="ls:longformat",
default=False,
help="Detailed list in long format with all metadata.")
def run(config):
@ -24,5 +32,10 @@ def run(config):
if input is None:
raise Exception("Could not read input")
print(repr(input[1]))
# Print representation
if config['ls']['longformat']:
print(input[1].repr_longformat())
else:
print(repr(input[1]))
input[0].close(force=True)

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -2,10 +2,10 @@
from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view import RollbackException
from obitools3.dms.view import RollbackException, View
from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
from obitools3.dms.column.column cimport Column, Column_line
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.uri.decode import open_uri
from obitools3.apps.config import logger
from obitools3.libalign._freeendgapfm import FreeEndGapFullMatch
@ -13,26 +13,29 @@ from obitools3.libalign.apat_pattern import Primer_search
from obitools3.dms.obiseq cimport Nuc_Seq
from obitools3.dms.capi.obitypes cimport OBI_SEQ, OBI_QUAL
from obitools3.dms.capi.apat cimport MAX_PATTERN
from obitools3.utils cimport tobytes
from obitools3.dms.capi.obiview cimport REVERSE_SEQUENCE_COLUMN, REVERSE_QUALITY_COLUMN
from obitools3.utils cimport tobytes, str2bytes
from libc.stdint cimport INT32_MAX
from functools import reduce
import math
import sys
from cpython.exc cimport PyErr_CheckSignals
from io import BufferedWriter
REVERSE_SEQ_COLUMN_NAME = b"REVERSE_SEQUENCE" # used by alignpairedend tool
REVERSE_QUALITY_COLUMN_NAME = b"REVERSE_QUALITY" # used by alignpairedend tool
MAX_PAT_LEN = 31
__title__="Assigns sequence records to the corresponding experiment/sample based on DNA tags and primers"
__title__="Assign sequence records to the corresponding experiment/sample based on DNA tags and primers"
def addOptions(parser):
addMinimalInputOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group = parser.add_argument_group('obi ngsfilter specific options')
group.add_argument('-t','--info-view',
@ -40,7 +43,9 @@ def addOptions(parser):
metavar="<URI>",
type=str,
default=None,
help="URI to the view containing the samples definition (with tags, primers, sample names,...)")
required=True,
help="URI to the view containing the samples definition (with tags, primers, sample names,...).\n"
"\nWarning: primer lengths must be less than or equal to 32")
group.add_argument('-R', '--reverse-reads',
action="store", dest="ngsfilter:reverse",
@ -54,7 +59,12 @@ def addOptions(parser):
metavar="<URI>",
type=str,
default=None,
help="URI to the view used to store the sequences unassigned to any sample")
help="URI to the view used to store the sequences unassigned to any sample. Those sequences are untrimmed.")
group.add_argument('--no-tags',
action="store_true", dest="ngsfilter:notags",
default=False,
help="Use this option if your experiment does not use tags to identify samples")
group.add_argument('-e','--error',
action="store", dest="ngsfilter:error",
@ -77,6 +87,8 @@ class Primer:
@type direct:
'''
assert len(sequence) <= MAX_PAT_LEN, "Primer %s is too long, 31 bp max" % sequence
assert sequence not in Primer.collection \
or Primer.collection[sequence]==taglength, \
"Primer %s must always be used with tags of the same length" % sequence
@ -165,8 +177,15 @@ cdef read_info_view(info_view, max_errors=2, verbose=False, not_aligned=False):
primer_list = []
i=0
for p in info_view:
# Check primer length: should not be longer than 32, the max allowed by the apat lib
if len(p[b'forward_primer']) > 32:
raise RollbackException("Error: primers can not be longer than 32bp, rollbacking views")
if len(p[b'reverse_primer']) > 32:
raise RollbackException("Error: primers can not be longer than 32bp, rollbacking views")
forward=Primer(p[b'forward_primer'],
len(p[b'forward_tag']) if p[b'forward_tag']!=b'-' else None,
len(p[b'forward_tag']) if (b'forward_tag' in p and p[b'forward_tag']!=None) else None,
True,
max_errors=max_errors,
verbose=verbose,
@ -177,7 +196,7 @@ cdef read_info_view(info_view, max_errors=2, verbose=False, not_aligned=False):
infos[forward]=fp
reverse=Primer(p[b'reverse_primer'],
len(p[b'reverse_tag']) if p[b'reverse_tag']!=b'-' else None,
len(p[b'reverse_tag']) if (b'reverse_tag' in p and p[b'reverse_tag']!=None) else None,
False,
max_errors=max_errors,
verbose=verbose,
@ -212,10 +231,11 @@ cdef read_info_view(info_view, max_errors=2, verbose=False, not_aligned=False):
rpp=rp.get(cf,{})
rp[cf]=rpp
tags = (p[b'forward_tag'] if p[b'forward_tag']!=b'-' else None,
p[b'reverse_tag'] if p[b'reverse_tag']!=b'-' else None)
tags = (p[b'forward_tag'] if (b'forward_tag' in p and p[b'forward_tag']!=None) else None,
p[b'reverse_tag'] if (b'reverse_tag' in p and p[b'reverse_tag']!=None) else None)
assert tags not in dpp, \
if tags != (None, None):
assert tags not in dpp, \
"Tag pair %s is already used with primer pairs: (%s,%s)" % (str(tags),forward,reverse)
# Save additional data
@ -233,7 +253,7 @@ cdef read_info_view(info_view, max_errors=2, verbose=False, not_aligned=False):
return infos, primer_list
cdef tuple annotate(sequences, infos, verbose=False):
cdef tuple annotate(sequences, infos, no_tags, verbose=False):
def sortMatch(match):
if match[1] is None:
@ -248,20 +268,19 @@ cdef tuple annotate(sequences, infos, verbose=False):
return match[1][1]
not_aligned = len(sequences) > 1
sequenceF = sequences[0]
sequenceR = None
if not not_aligned:
final_sequence = sequenceF
else:
final_sequence = sequenceF.clone() # TODO maybe not cloning and then deleting quality tags is more efficient
sequences[0] = sequences[0].clone()
if not_aligned:
sequences[0][b"R1_parent"] = sequences[0].id
sequences[0][b"R2_parent"] = sequences[1].id
if not_aligned:
sequenceR = sequences[1]
final_sequence[REVERSE_SEQ_COLUMN_NAME] = sequenceR.seq # used by alignpairedend tool
final_sequence[REVERSE_QUALITY_COLUMN_NAME] = sequenceR.quality # used by alignpairedend tool
sequences[1] = sequences[1].clone()
sequences[0][REVERSE_SEQUENCE_COLUMN] = sequences[1].seq # used by alignpairedend tool
sequences[0][REVERSE_QUALITY_COLUMN] = sequences[1].quality # used by alignpairedend tool
for seq in sequences:
if hasattr(seq, "quality_array"):
if hasattr(seq, "quality_array") and seq.quality_array is not None:
q = -reduce(lambda x,y:x+y,(math.log10(z) for z in seq.quality_array),0)/len(seq.quality_array)*10
seq[b'avg_quality']=q
q = -reduce(lambda x,y:x+y,(math.log10(z) for z in seq.quality_array[0:10]),0)
@ -274,8 +293,6 @@ cdef tuple annotate(sequences, infos, verbose=False):
# Try direct matching:
directmatch = []
first_matched_seq = None
second_matched_seq = None
for seq in sequences:
new_seq = True
pattern = 0
@ -283,72 +300,110 @@ cdef tuple annotate(sequences, infos, verbose=False):
if pattern == MAX_PATTERN:
new_seq = True
pattern = 0
directmatch.append((p, p(seq, same_sequence=not new_seq, pattern=pattern), seq))
# Saving original primer as 4th member of the tuple to serve as correct key in infos dict even if it might be reversed complemented (not here)
directmatch.append((p, p(seq, same_sequence=not new_seq, pattern=pattern), seq, p))
new_seq = False
pattern+=1
# Choose match closer to the start of (one of the) sequence(s)
directmatch = sorted(directmatch, key=sortMatch)
all_direct_matches = directmatch
directmatch = directmatch[0] if directmatch[0][1] is not None else None
if directmatch is None:
final_sequence[b'error']=b'No primer match'
return False, final_sequence
if not_aligned:
sequences[0][REVERSE_SEQUENCE_COLUMN] = sequences[1].seq # used by alignpairedend tool
sequences[0][REVERSE_QUALITY_COLUMN] = sequences[1].quality # used by alignpairedend tool
sequences[0][b'error']=b'No primer match'
return False, sequences[0]
first_matched_seq = directmatch[2]
if id(first_matched_seq) == id(sequenceF) and not_aligned:
second_matched_seq = sequenceR
if id(directmatch[2]) == id(sequences[0]):
first_match_first_seq = True
else:
second_matched_seq = sequenceF
match = first_matched_seq[directmatch[1][1]:directmatch[1][2]]
first_match_first_seq = False
match = directmatch[2][directmatch[1][1]:directmatch[1][2]]
if not not_aligned:
final_sequence[b'seq_length_ori']=len(final_sequence)
sequences[0][b'seq_length_ori']=len(sequences[0])
if not not_aligned or id(first_matched_seq) == id(sequenceF):
final_sequence = final_sequence[directmatch[1][2]:]
if not not_aligned or first_match_first_seq:
sequences[0] = sequences[0][directmatch[1][2]:]
else:
cut_seq = sequenceR[directmatch[1][2]:]
final_sequence[REVERSE_SEQ_COLUMN_NAME] = cut_seq.seq # used by alignpairedend tool
final_sequence[REVERSE_QUALITY_COLUMN_NAME] = cut_seq.quality # used by alignpairedend tool
sequences[1] = sequences[1][directmatch[1][2]:]
sequences[0][REVERSE_SEQUENCE_COLUMN] = sequences[1].seq # used by alignpairedend tool
sequences[0][REVERSE_QUALITY_COLUMN] = sequences[1].quality # used by alignpairedend tool
if directmatch[0].forward:
final_sequence[b'direction']=b'forward'
final_sequence[b'forward_errors']=directmatch[1][0]
final_sequence[b'forward_primer']=directmatch[0].raw
final_sequence[b'forward_match']=match.seq
sequences[0][b'direction']=b'forward'
sequences[0][b'forward_errors']=directmatch[1][0]
sequences[0][b'forward_primer']=directmatch[0].raw
sequences[0][b'forward_match']=match.seq
else:
final_sequence[b'direction']=b'reverse'
final_sequence[b'reverse_errors']=directmatch[1][0]
final_sequence[b'reverse_primer']=directmatch[0].raw
final_sequence[b'reverse_match']=match.seq
sequences[0][b'direction']=b'reverse'
sequences[0][b'reverse_errors']=directmatch[1][0]
sequences[0][b'reverse_primer']=directmatch[0].raw
sequences[0][b'reverse_match']=match.seq
# Keep only paired reverse primer
infos = infos[directmatch[0]]
# If not aligned, look for other match in already computed match (choose the one that makes the biggest amplicon)
infos = infos[directmatch[0]]
reverse_primer = list(infos.keys())[0]
direct_primer = directmatch[0]
# If not aligned, look for other match in already computed matches (choose the one that makes the biggest amplicon)
if not_aligned:
i=1
while all_direct_matches[i][1] is None and all_direct_matches[i][0].forward and i<len(all_direct_matches):
# TODO comment
while i<len(all_direct_matches) and \
(all_direct_matches[i][1] is None or \
all_direct_matches[i][0].forward == directmatch[0].forward or \
all_direct_matches[i][0] == directmatch[0] or \
reverse_primer != all_direct_matches[i][0]) :
i+=1
if i < len(all_direct_matches):
reversematch = all_direct_matches[i]
else:
reversematch = None
# Cut reverse primer out of 1st matched seq if it contains it, because if it's also in the other sequence, the next step will "choose" only the one on the other sequence
if not_aligned:
# do it on same seq
if first_match_first_seq:
r = reverse_primer.revcomp(sequences[0])
else:
r = reverse_primer.revcomp(sequences[1])
if r is not None: # found
if first_match_first_seq :
sequences[0] = sequences[0][:r[1]]
else:
sequences[1] = sequences[1][:r[1]]
sequences[0][REVERSE_SEQUENCE_COLUMN] = sequences[1].seq # used by alignpairedend tool
sequences[0][REVERSE_QUALITY_COLUMN] = sequences[1].quality # used by alignpairedend tool
# do the same on the other seq
if first_match_first_seq:
r = direct_primer.revcomp(sequences[1])
else:
r = direct_primer.revcomp(sequences[0])
if r is not None: # found
if first_match_first_seq:
sequences[1] = sequences[1][:r[1]]
else:
sequences[0] = sequences[0][:r[1]]
sequences[0][REVERSE_SEQUENCE_COLUMN] = sequences[1].seq
sequences[0][REVERSE_QUALITY_COLUMN] = sequences[1].quality
# Look for other primer in the other direction on the sequence, or
# If sequences are not already aligned and reverse primer not found in most likely sequence (the one without the forward primer), try matching on the same sequence than the first match (primer in the other direction)
if not not_aligned or (not_aligned and reversematch[1] is None):
if not not_aligned:
sequence_to_match = second_matched_seq
if not not_aligned or (not_aligned and (reversematch is None or reversematch[1] is None)):
if not_aligned and first_match_first_seq:
seq_to_match = sequences[1]
else:
sequence_to_match = first_matched_seq
seq_to_match = sequences[0]
reversematch = []
# Compute begin
begin=directmatch[1][2]+1 # end of match + 1 on the same sequence
#begin=directmatch[1][2]+1 # end of match + 1 on the same sequence -- No, already cut out forward primer
# Try reverse matching on the other sequence:
new_seq = True
pattern = 0
@ -360,7 +415,9 @@ cdef tuple annotate(sequences, infos, verbose=False):
primer=p.revcomp
else:
primer=p
reversematch.append((primer, primer(sequence_to_match, same_sequence=not new_seq, pattern=pattern, begin=begin)))
# Saving original primer as 4th member of the tuple to serve as correct key in infos dict even if it might have been reversed complemented
# (3rd member already used by directmatch)
reversematch.append((primer, primer(seq_to_match, same_sequence=not new_seq, pattern=pattern, begin=0), None, p))
new_seq = False
pattern+=1
# Choose match closer to the end of the sequence
@ -373,11 +430,11 @@ cdef tuple annotate(sequences, infos, verbose=False):
message = b'No reverse primer match'
else:
message = b'No direct primer match'
final_sequence[b'error']=message
return False, final_sequence
sequences[0][b'error']=message
return False, sequences[0]
if reversematch is None:
final_sequence[b'status']=b'partial'
sequences[0][b'status']=b'partial'
if directmatch[0].forward:
tags=(directmatch[1][3],None)
@ -387,75 +444,86 @@ cdef tuple annotate(sequences, infos, verbose=False):
samples = infos[None]
else:
final_sequence[b'status']=b'full'
sequences[0][b'status']=b'full'
if not not_aligned or first_match_first_seq:
match = sequences[0][reversematch[1][1]:reversematch[1][2]]
else:
match = sequences[1][reversematch[1][1]:reversematch[1][2]]
match = second_matched_seq[reversematch[1][1]:reversematch[1][2]]
match = match.reverse_complement
if not not_aligned or id(second_matched_seq) == id(sequenceF):
final_sequence = final_sequence[0:reversematch[1][1]]
if not not_aligned:
sequences[0] = sequences[0][0:reversematch[1][1]]
elif first_match_first_seq:
sequences[1] = sequences[1][reversematch[1][2]:]
if not directmatch[0].forward:
sequences[1] = sequences[1].reverse_complement
sequences[0][REVERSE_SEQUENCE_COLUMN] = sequences[1].seq # used by alignpairedend tool
sequences[0][REVERSE_QUALITY_COLUMN] = sequences[1].quality # used by alignpairedend tool
else:
cut_seq = sequenceR[reversematch[1][2]:]
final_sequence[REVERSE_SEQ_COLUMN_NAME] = cut_seq.seq # used by alignpairedend tool
final_sequence[REVERSE_QUALITY_COLUMN_NAME] = cut_seq.quality # used by alignpairedend tool
sequences[0] = sequences[0][reversematch[1][2]:]
if directmatch[0].forward:
tags=(directmatch[1][3], reversematch[1][3])
final_sequence[b'reverse_errors'] = reversematch[1][0]
final_sequence[b'reverse_primer'] = reversematch[0].raw
final_sequence[b'reverse_match'] = match.seq
sequences[0][b'reverse_errors'] = reversematch[1][0]
sequences[0][b'reverse_primer'] = reversematch[0].raw
sequences[0][b'reverse_match'] = match.seq
else:
tags=(reversematch[1][3], directmatch[1][3])
final_sequence[b'forward_errors'] = reversematch[1][0]
final_sequence[b'forward_primer'] = reversematch[0].raw
final_sequence[b'forward_match'] = match.seq
sequences[0][b'forward_errors'] = reversematch[1][0]
sequences[0][b'forward_primer'] = reversematch[0].raw
sequences[0][b'forward_match'] = match.seq
if tags[0] is not None:
final_sequence[b'forward_tag'] = tags[0]
sequences[0][b'forward_tag'] = tags[0]
if tags[1] is not None:
final_sequence[b'reverse_tag'] = tags[1]
sequences[0][b'reverse_tag'] = tags[1]
samples = infos[reversematch[0]]
samples = infos[reversematch[3]]
if not directmatch[0].forward and not not_aligned: # don't reverse complement if not_aligned
final_sequence = final_sequence.reverse_complement
if not directmatch[0].forward:
sequences[0] = sequences[0].reverse_complement
sequences[0][b'reversed'] = True # used by the alignpairedend tool (in kmer_similarity.c)
else:
sequences[0][b'reversed'] = False # used by the alignpairedend tool (in kmer_similarity.c)
sample=None
if tags[0] is not None: # Direct tag known
if tags[1] is not None: # Reverse tag known
sample = samples.get(tags, None)
else: # Only direct tag known
s=[samples[x] for x in samples if x[0]==tags[0]]
if len(s)==1:
sample=s[0]
elif len(s)>1:
final_sequence[b'error']=b'multiple samples match tags'
return False, final_sequence
else:
sample=None
else:
if tags[1] is not None: # Only reverse tag known
s=[samples[x] for x in samples if x[1]==tags[1]]
if len(s)==1:
sample=s[0]
elif len(s)>1:
final_sequence[b'error']=b'multiple samples match tags'
return False, final_sequence
else:
sample=None
if not no_tags:
if tags[0] is not None: # Direct tag known
if tags[1] is not None: # Reverse tag known
sample = samples.get(tags, None)
else: # Only direct tag known
s=[samples[x] for x in samples if x[0]==tags[0]]
if len(s)==1:
sample=s[0]
elif len(s)>1:
sequences[0][b'error']=b'Did not found reverse tag'
return False, sequences[0]
else:
sample=None
else:
if tags[1] is not None: # Only reverse tag known
s=[samples[x] for x in samples if x[1]==tags[1]]
if len(s)==1:
sample=s[0]
elif len(s)>1:
sequences[0][b'error']=b'Did not found forward tag'
return False, sequences[0]
else:
sample=None
if sample is None:
sequences[0][b'error']=b"No sample with that tag combination"
return False, sequences[0]
if sample is None:
final_sequence[b'error']=b"Cannot assign sequence to a sample"
return False, final_sequence
final_sequence.update(sample)
sequences[0].update(sample)
if not not_aligned:
final_sequence[b'seq_length']=len(final_sequence)
sequences[0][b'seq_length']=len(sequences[0])
return True, final_sequence
return True, sequences[0]
def run(config):
@ -478,7 +546,8 @@ def run(config):
raise Exception("Could not open input reads")
if input[2] != View_NUC_SEQS:
raise NotImplementedError('obi ngsfilter only works on NUC_SEQS views')
i_dms = input[0]
if "reverse" in config["ngsfilter"]:
forward = input[1]
@ -519,8 +588,19 @@ def run(config):
if output is None:
raise Exception("Could not create output view")
o_dms = output[0]
output_0 = output[0]
o_view = output[1]
# If stdout output, create a temporary view in the input dms that will be deleted afterwards.
if type(output_0)==BufferedWriter:
o_dms = i_dms
o_view_name = b"temp"
while o_view_name in i_dms: # Making sure view name is unique in input DMS
o_view_name = o_view_name+b"_"+str2bytes(str(i))
i+=1
o_view = View_NUC_SEQS.new(i_dms, o_view_name)
# Open the view containing the informations about the tags and the primers
info_input = open_uri(config['ngsfilter']['info_view'])
if info_input is None:
@ -541,10 +621,19 @@ def run(config):
unidentified = None
# Initialize the progress bar
pb = ProgressBar(entries_len, config, seconde=5)
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(entries_len, config)
else:
pb = None
# Check and store primers and tags
infos, primer_list = read_info_view(info_view, max_errors=config['ngsfilter']['error'], verbose=False, not_aligned=not_aligned) # TODO obi verbose option
try:
infos, primer_list = read_info_view(info_view, max_errors=config['ngsfilter']['error'], verbose=False, not_aligned=not_aligned) # TODO obi verbose option
except RollbackException, e:
if unidentified is not None:
raise RollbackException("obi ngsfilter error, rollbacking views: "+str(e), o_view, unidentified)
else:
raise RollbackException("obi ngsfilter error, rollbacking view: "+str(e), o_view)
aligner = Primer_search(primer_list, config['ngsfilter']['error'])
@ -556,49 +645,76 @@ def run(config):
paired_p.revcomp.aligner = aligner
if not_aligned: # create columns used by alignpairedend tool
Column.new_column(o_view, REVERSE_SEQ_COLUMN_NAME, OBI_SEQ)
Column.new_column(o_view, REVERSE_QUALITY_COLUMN_NAME, OBI_QUAL, associated_column_name=REVERSE_SEQ_COLUMN_NAME, associated_column_version=o_view[REVERSE_SEQ_COLUMN_NAME].version)
Column.new_column(o_view, REVERSE_SEQUENCE_COLUMN, OBI_SEQ)
Column.new_column(o_view, REVERSE_QUALITY_COLUMN, OBI_QUAL, associated_column_name=REVERSE_SEQUENCE_COLUMN, associated_column_version=o_view[REVERSE_SEQUENCE_COLUMN].version)
Column.new_column(unidentified, REVERSE_SEQ_COLUMN_NAME, OBI_SEQ)
Column.new_column(unidentified, REVERSE_QUALITY_COLUMN_NAME, OBI_QUAL, associated_column_name=REVERSE_SEQ_COLUMN_NAME, associated_column_version=unidentified[REVERSE_SEQ_COLUMN_NAME].version)
if unidentified is not None:
Column.new_column(unidentified, REVERSE_SEQUENCE_COLUMN, OBI_SEQ)
Column.new_column(unidentified, REVERSE_QUALITY_COLUMN, OBI_QUAL, associated_column_name=REVERSE_SEQUENCE_COLUMN, associated_column_version=unidentified[REVERSE_SEQUENCE_COLUMN].version)
g = 0
u = 0
i = 0
no_tags = config['ngsfilter']['notags']
try:
for i in range(entries_len):
pb(i)
PyErr_CheckSignals()
if pb is not None:
pb(i)
if not_aligned:
modseq = [Nuc_Seq.new_from_stored(forward[i]), Nuc_Seq.new_from_stored(reverse[i])]
else:
modseq = [Nuc_Seq.new_from_stored(entries[i])]
good, oseq = annotate(modseq, infos)
good, oseq = annotate(modseq, infos, no_tags)
if good:
o_view[g].set(oseq.id, oseq.seq, definition=oseq.definition, quality=oseq.quality, tags=oseq)
g+=1
elif unidentified is not None:
unidentified[u].set(oseq.id, oseq.seq, definition=oseq.definition, quality=oseq.quality, tags=oseq)
# Untrim sequences (put original back)
if len(modseq) > 1:
oseq[REVERSE_SEQUENCE_COLUMN] = reverse[i].seq
oseq[REVERSE_QUALITY_COLUMN] = reverse[i].quality
unidentified[u].set(oseq.id, forward[i].seq, definition=oseq.definition, quality=forward[i].quality, tags=oseq)
else:
unidentified[u].set(oseq.id, entries[i].seq, definition=oseq.definition, quality=entries[i].quality, tags=oseq)
u+=1
except Exception, e:
raise RollbackException("obi ngsfilter error, rollbacking views: "+str(e), o_view, unidentified)
if unidentified is not None:
raise RollbackException("obi ngsfilter error, rollbacking views: "+str(e), o_view, unidentified)
else:
raise RollbackException("obi ngsfilter error, rollbacking view: "+str(e), o_view)
pb(i, force=True)
print("", file=sys.stderr)
if pb is not None:
pb(i, force=True)
print("", file=sys.stderr)
# Save command config in View and DMS comments
command_line = " ".join(sys.argv[1:])
o_view.write_config(config, "ngsfilter", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
unidentified.write_config(config, "ngsfilter", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
# Add comment about unidentified seqs
unidentified.comments["info"] = "View containing sequences categorized as unidentified by the ngsfilter command"
output[0].record_command_line(command_line)
if unidentified is not None:
unidentified.write_config(config, "ngsfilter", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
# Add comment about unidentified seqs
unidentified.comments["info"] = "View containing sequences categorized as unidentified by the ngsfilter command"
o_dms.record_command_line(command_line)
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_view), file=sys.stderr)
input[0].close()
output[0].close()
info_input[0].close()
unidentified_input[0].close()
# If stdout output, delete the temporary result view in the input DMS
if type(output_0)==BufferedWriter:
View.delete_view(i_dms, o_view_name)
i_dms.close(force=True)
o_dms.close(force=True)
info_input[0].close(force=True)
if unidentified is not None:
unidentified_input[0].close(force=True)
aligner.free()
logger("info", "Done.")

View File

@ -0,0 +1,87 @@
#cython: language_level=3
from obitools3.uri.decode import open_uri
from obitools3.apps.config import logger
from obitools3.dms import DMS
from obitools3.apps.optiongroups import addMinimalInputOption
from obitools3.dms.view.view cimport View
from obitools3.utils cimport tostr
import os
import shutil
__title__="Delete a view"
def addOptions(parser):
addMinimalInputOption(parser)
def run(config):
DMS.obi_atexit()
logger("info", "obi rm")
# Open the input
input = open_uri(config['obi']['inputURI'])
if input is None:
raise Exception("Could not read input")
# Check that it's a view
if isinstance(input[1], View) :
view = input[1]
else:
raise NotImplementedError()
dms = input[0]
# Get the path to the view file to remove
path = dms.full_path # dms path
view_path=path+b"/VIEWS/"
view_path+=view.name
view_path+=b".obiview"
to_remove = {}
# For each column:
for col_alias in view.keys():
col = view[col_alias]
col_name = col.original_name
col_version = col.version
col_type = col.data_type
col_ref = (col_name, col_version)
# build file name and AVL file names
col_file_name = f"{tostr(path)}/{tostr(col.original_name)}.obicol/{tostr(col.original_name)}@{col.version}.odc"
if col_type in [b'OBI_CHAR', b'OBI_QUAL', b'OBI_STR', b'OBI_SEQ']:
avl_file_name = f"{tostr(path)}/OBIBLOB_INDEXERS/{tostr(col.original_name)}_{col.version}_indexer"
else:
avl_file_name = None
to_remove[col_ref] = [col_file_name, avl_file_name]
# For each view:
do_not_remove = []
for vn in dms:
v = dms[vn]
# ignore the one being deleted
if v.name != view.name:
# check that none of the column is referenced, if referenced, remove from list to remove
cols = [(v[c].original_name, v[c].version) for c in v.keys()]
for col_ref in to_remove:
if col_ref in cols:
do_not_remove.append(col_ref)
for nr in do_not_remove:
to_remove.pop(nr)
# Close the view and the DMS
view.close()
input[0].close(force=True)
#print(to_remove)
# rm AFTER view and DMS close
os.remove(view_path)
for col in to_remove:
os.remove(to_remove[col][0])
if to_remove[col][1] is not None:
shutil.rmtree(to_remove[col][1])

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -4,7 +4,7 @@ from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view.view cimport View, Line_selection
from obitools3.uri.decode import open_uri
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.dms.view import RollbackException
from obitools3.apps.config import logger
from obitools3.utils cimport str2bytes
@ -23,6 +23,8 @@ from obitools3.dms.capi.obitypes cimport OBI_BOOL, \
import time
import sys
from cpython.exc cimport PyErr_CheckSignals
from io import BufferedWriter
NULL_VALUE = {OBI_BOOL: OBIBool_NA,
@ -34,13 +36,14 @@ NULL_VALUE = {OBI_BOOL: OBIBool_NA,
OBI_STR: b""}
__title__="Sort view lines according to the value of a given attribute."
__title__="Sort view lines according to the value of a given attribute"
def addOptions(parser):
addMinimalInputOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group=parser.add_argument_group('obi sort specific options')
@ -58,7 +61,7 @@ def addOptions(parser):
def line_cmp(line, key, pb):
pb
pb
if line[key] is None:
return NULL_VALUE[line.view[key].data_type_int]
else:
@ -85,32 +88,46 @@ def run(config):
if output is None:
raise Exception("Could not create output view")
o_dms = output[0]
o_view_name_final = output[1]
o_view_name = o_view_name_final
output_0 = output[0]
final_o_view_name = output[1]
# If the input and output DMS are not the same, create output view in input DMS first, then export it
# to output DMS, making sure the temporary view name is unique in the input DMS
if i_dms != o_dms:
# If stdout output or the input and output DMS are not the same, create a temporary view that will be exported and deleted.
if i_dms != o_dms or type(output_0)==BufferedWriter:
temporary_view_name = b"temp"
i=0
while o_view_name in i_dms:
o_view_name = o_view_name_final+b"_"+str2bytes(str(i))
i+=1
while temporary_view_name in i_dms: # Making sure view name is unique in input DMS
temporary_view_name = temporary_view_name+b"_"+str2bytes(str(i))
i+=1
o_view_name = temporary_view_name
if type(output_0)==BufferedWriter:
o_dms = i_dms
else:
o_view_name = final_o_view_name
# Initialize the progress bar
pb = ProgressBar(len(i_view), config, seconde=5)
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(len(i_view), config)
else:
pb = None
keys = config['sort']['keys']
selection = Line_selection(i_view)
for i in range(len(i_view)): # TODO special function?
PyErr_CheckSignals()
selection.append(i)
for k in keys: # TODO order?
selection.sort(key=lambda line_idx: line_cmp(i_view[line_idx], k, pb(line_idx)), reverse=config['sort']['reverse'])
pb(len(i_view), force=True)
print("", file=sys.stderr)
PyErr_CheckSignals()
if pb is not None:
selection.sort(key=lambda line_idx: line_cmp(i_view[line_idx], k, pb(line_idx)), reverse=config['sort']['reverse'])
else:
selection.sort(key=lambda line_idx: line_cmp(i_view[line_idx], k, None), reverse=config['sort']['reverse'])
if pb is not None:
pb(len(i_view), force=True)
print("", file=sys.stderr)
# Create output view with the sorted line selection
try:
@ -129,16 +146,23 @@ def run(config):
# and delete the temporary view in the input DMS
if i_dms != o_dms:
o_view.close()
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, o_view_name_final)
o_view = o_dms[o_view_name_final]
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, final_o_view_name)
o_view = o_dms[final_o_view_name]
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_view), file=sys.stderr)
# If the input and the output DMS are different, delete the temporary imported view used to create the final view
if i_dms != o_dms:
# If the input and the output DMS are different or if stdout output, delete the temporary imported view used to create the final view
if i_dms != o_dms or type(output_0)==BufferedWriter:
View.delete_view(i_dms, o_view_name)
o_dms.close()
i_dms.close()
o_dms.close(force=True)
i_dms.close(force=True)
logger("info", "Done.")

View File

@ -0,0 +1,105 @@
#cython: language_level=3
from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view.view cimport View, Line_selection
from obitools3.uri.decode import open_uri
from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.dms.view import RollbackException
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes
import sys
from cpython.exc cimport PyErr_CheckSignals
__title__="Split"
def addOptions(parser):
addMinimalInputOption(parser)
addNoProgressBarOption(parser)
group=parser.add_argument_group("obi split specific options")
group.add_argument('-p','--prefix',
action="store", dest="split:prefix",
metavar="<PREFIX>",
help="Prefix added to each subview name (included undefined)")
group.add_argument('-t','--tag-name',
action="store", dest="split:tagname",
metavar="<TAG_NAME>",
help="Attribute/tag used to split the input")
group.add_argument('-u','--undefined',
action="store", dest="split:undefined",
default=b'UNDEFINED',
metavar="<VIEW_NAME>",
help="Name of the view where undefined sequenced are stored (will be PREFIX_VIEW_NAME)")
def run(config):
DMS.obi_atexit()
logger("info", "obi split")
# Open the input
input = open_uri(config["obi"]["inputURI"])
if input is None:
raise Exception("Could not read input view")
i_dms = input[0]
i_view = input[1]
# Initialize the progress bar
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(len(i_view), config)
else:
pb = None
tag_to_split = config["split"]["tagname"]
undefined = tobytes(config["split"]["undefined"])
selections = {}
# Go through input view and split
for i in range(len(i_view)):
PyErr_CheckSignals()
if pb is not None:
pb(i)
line = i_view[i]
if tag_to_split not in line or line[tag_to_split] is None or len(line[tag_to_split])==0:
value = undefined
else:
value = line[tag_to_split]
if value not in selections:
selections[value] = Line_selection(i_view)
selections[value].append(i)
if pb is not None:
pb(len(i_view), force=True)
print("", file=sys.stderr)
# Create output views with the line selection
try:
for cat in selections:
o_view_name = config["split"]["prefix"].encode()+cat
o_view = selections[cat].materialize(o_view_name)
# Save command config in View and DMS comments
command_line = " ".join(sys.argv[1:])
input_dms_name=[input[0].name]
input_view_name=[input[1].name]
if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
o_view.write_config(config, "split", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
o_view.close()
except Exception, e:
raise RollbackException("obi split error, rollbacking view: "+str(e), o_view)
i_dms.record_command_line(command_line)
i_dms.close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -7,14 +7,16 @@ from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption
from obitools3.dms.view import RollbackException
from obitools3.apps.config import logger
from obitools3.dms.capi.obiview cimport COUNT_COLUMN
from obitools3.utils cimport tostr
from functools import reduce
import math
import time
import sys
from cpython.exc cimport PyErr_CheckSignals
__title__="Compute basic statistics for attribute values."
__title__="Compute basic statistics for attribute values"
'''
`obi stats` computes basic statistics for attribute values of sequence records.
@ -117,9 +119,12 @@ def mean(values, options):
def variance(v):
if len(v)==1:
return 0
return 0
s = reduce(lambda x,y:(x[0]+y,x[1]+y**2),v,(0.,0.))
return s[1]/(len(v)-1) - s[0]**2/len(v)/(len(v)-1)
var = round(s[1]/(len(v)-1) - s[0]**2/len(v)/(len(v)-1), 5) # round to go around shady python rounding stuff when var is actually 0
if var == -0.0: # then fix -0 to +0 if was rounded to -0
var = 0.0
return var
def varpop(values, options):
@ -146,13 +151,13 @@ def run(config):
if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
taxo_uri = open_uri(config['obi']['taxoURI'])
if taxo_uri is None:
if taxo_uri is None or taxo_uri[2] == bytes:
raise Exception("Couldn't open taxonomy")
taxo = taxo_uri[1]
else :
taxo = None
statistics = set(config['stats']['minimum']) | set(config['stats']['maximum']) | set(config['stats']['mean'])
statistics = set(config['stats']['minimum']) | set(config['stats']['maximum']) | set(config['stats']['mean']) | set(config['stats']['var']) | set(config['stats']['sd'])
total = 0
catcount={}
totcount={}
@ -160,9 +165,10 @@ def run(config):
lcat=0
# Initialize the progress bar
pb = ProgressBar(len(i_view), config, seconde=5)
pb = ProgressBar(len(i_view), config)
for i in range(len(i_view)):
PyErr_CheckSignals()
pb(i)
line = i_view[i]
@ -192,7 +198,7 @@ def run(config):
except KeyError:
totcount[category]=totcount.get(category,0)+1
for var in statistics:
if var in line:
if var in line and line[var] is not None:
v = line[var]
if var not in values:
values[var]={}
@ -235,18 +241,43 @@ def run(config):
else:
sdvar= "%s"
hcat = "\t".join([pcat % x for x in config['stats']['categories']]) + "\t" +\
"\t".join([minvar % x for x in config['stats']['minimum']]) + "\t" +\
"\t".join([maxvar % x for x in config['stats']['maximum']]) + "\t" +\
"\t".join([meanvar % x for x in config['stats']['mean']]) + "\t" +\
"\t".join([varvar % x for x in config['stats']['var']]) + "\t" +\
"\t".join([sdvar % x for x in config['stats']['sd']]) + \
"\t count" + \
"\t total"
hcat = ""
for x in config['stats']['categories']:
hcat += pcat % x
hcat += "\t"
for x in config['stats']['minimum']:
hcat += minvar % x
hcat += "\t"
for x in config['stats']['maximum']:
hcat += maxvar % x
hcat += "\t"
for x in config['stats']['mean']:
hcat += meanvar % x
hcat += "\t"
for x in config['stats']['var']:
hcat += varvar % x
hcat += "\t"
for x in config['stats']['sd']:
hcat += sdvar % x
hcat += "\t"
hcat += "count\ttotal"
print(hcat)
for c in catcount:
sorted_stats = sorted(catcount.items(), key = lambda kv:(totcount[kv[0]]), reverse=True)
for i in range(len(sorted_stats)):
c = sorted_stats[i][0]
for v in c:
print(pcat % str(v)+"\t", end="")
if type(v) == bytes:
print(pcat % tostr(v)+"\t", end="")
else:
print(pcat % str(v)+"\t", end="")
for m in config['stats']['minimum']:
print((("%%%dd" % lmini[m]) % mini[m][c])+"\t", end="")
for m in config['stats']['maximum']:
@ -257,9 +288,9 @@ def run(config):
print((("%%%df" % lvarp[m]) % varp[m][c])+"\t", end="")
for m in config['stats']['sd']:
print((("%%%df" % lsigma[m]) % sigma[m][c])+"\t", end="")
print("%7d" %catcount[c], end="")
print("%9d" %totcount[c])
print("%d" %catcount[c]+"\t", end="")
print("%d" %totcount[c]+"\t")
input[0].close()
input[0].close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -4,22 +4,25 @@ from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view.view cimport View, Line_selection
from obitools3.uri.decode import open_uri
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.dms.view import RollbackException
from obitools3.apps.config import logger
from obitools3.utils cimport str2bytes
import time
import sys
from cpython.exc cimport PyErr_CheckSignals
from io import BufferedWriter
__title__="Keep the N last lines of a view."
__title__="Keep the N last lines of a view"
def addOptions(parser):
addMinimalInputOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group=parser.add_argument_group('obi tail specific options')
@ -51,30 +54,41 @@ def run(config):
if output is None:
raise Exception("Could not create output view")
o_dms = output[0]
o_view_name_final = output[1]
o_view_name = o_view_name_final
output_0 = output[0]
final_o_view_name = output[1]
# If the input and output DMS are not the same, create output view in input DMS first, then export it
# to output DMS, making sure the temporary view name is unique in the input DMS
if i_dms != o_dms:
# If stdout output or the input and output DMS are not the same, create a temporary view that will be exported and deleted.
if i_dms != o_dms or type(output_0)==BufferedWriter:
temporary_view_name = b"temp"
i=0
while o_view_name in i_dms:
o_view_name = o_view_name_final+b"_"+str2bytes(str(i))
i+=1
while temporary_view_name in i_dms: # Making sure view name is unique in input DMS
temporary_view_name = temporary_view_name+b"_"+str2bytes(str(i))
i+=1
o_view_name = temporary_view_name
if type(output_0)==BufferedWriter:
o_dms = i_dms
else:
o_view_name = final_o_view_name
start = max(len(i_view) - config['tail']['count'], 0)
# Initialize the progress bar
pb = ProgressBar(len(i_view) - start, config, seconde=5)
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(len(i_view) - start, config)
else:
pb = None
selection = Line_selection(i_view)
for i in range(start, len(i_view)):
pb(i)
PyErr_CheckSignals()
if pb is not None:
pb(i)
selection.append(i)
pb(i, force=True)
print("", file=sys.stderr)
if pb is not None:
pb(i, force=True)
print("", file=sys.stderr)
# Save command config in View comments
command_line = " ".join(sys.argv[1:])
@ -95,16 +109,22 @@ def run(config):
# and delete the temporary view in the input DMS
if i_dms != o_dms:
o_view.close()
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, o_view_name_final)
o_view = o_dms[o_view_name_final]
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], o_view_name, final_o_view_name)
o_view = o_dms[final_o_view_name]
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_view), file=sys.stderr)
# If the input and the output DMS are different, delete the temporary imported view used to create the final view
if i_dms != o_dms:
# If the input and the output DMS are different or if stdout output, delete the temporary imported view used to create the final view
if i_dms != o_dms or type(output_0)==BufferedWriter:
View.delete_view(i_dms, o_view_name)
o_dms.close()
i_dms.close()
o_dms.close(force=True)
i_dms.close(force=True)
logger("info", "Done.")

View File

@ -0,0 +1,230 @@
#cython: language_level=3
from obitools3.apps.progress cimport ProgressBar # @UnresolvedImport
from obitools3.dms import DMS
from obitools3.dms.view.view cimport View, Line_selection
from obitools3.uri.decode import open_uri
from obitools3.apps.optiongroups import addMinimalInputOption, addTaxonomyOption, addMinimalOutputOption, addNoProgressBarOption
from obitools3.dms.view import RollbackException
from obitools3.dms.column.column cimport Column
from functools import reduce
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes, str2bytes, tostr
from io import BufferedWriter
from obitools3.dms.capi.obiview cimport NUC_SEQUENCE_COLUMN, \
ID_COLUMN, \
DEFINITION_COLUMN, \
QUALITY_COLUMN, \
COUNT_COLUMN, \
TAXID_COLUMN
from obitools3.dms.capi.obitypes cimport OBI_INT
from obitools3.dms.capi.obitaxonomy cimport MIN_LOCAL_TAXID
import time
import math
import sys
from cpython.exc cimport PyErr_CheckSignals
__title__="Add taxa with a new generated taxid to an NCBI taxonomy database"
def addOptions(parser):
addMinimalInputOption(parser)
addTaxonomyOption(parser)
addMinimalOutputOption(parser)
addNoProgressBarOption(parser)
group=parser.add_argument_group('obi taxonomy specific options')
group.add_argument('-n', '--taxon-name-tag',
action="store",
dest="taxonomy:taxon_name_tag",
metavar="<SCIENTIFIC_NAME_TAG>",
default=b"SCIENTIFIC_NAME",
help="Name of the tag giving the scientific name of the taxon "
"(default: 'SCIENTIFIC_NAME').")
# group.add_argument('-g', '--try-genus-match',
# action="store_true", dest="taxonomy:try_genus_match",
# default=False,
# help="Try matching the first word of <SCIENTIFIC_NAME_TAG> when can't find corresponding taxid for a taxon. "
# "If there is a match it is added in the 'parent_taxid' tag. (Can be used by 'obi taxonomy' to add the taxon under that taxid).")
group.add_argument('-a', '--restricting-ancestor',
action="store",
dest="taxonomy:restricting_ancestor",
metavar="<RESTRICTING_ANCESTOR>",
default=None,
help="Enables to restrict the addition of taxids under an ancestor specified by its taxid.")
group.add_argument('-t', '--taxid-tag',
action="store",
dest="taxonomy:taxid_tag",
metavar="<TAXID_TAG>",
default=b"TAXID",
help="Name of the tag to store the new taxid "
"(default: 'TAXID').")
group.add_argument('-l', '--log-file',
action="store",
dest="taxonomy:log_file",
metavar="<LOG_FILE>",
default='',
help="Path to a log file to write informations about added taxids.")
def run(config):
DMS.obi_atexit()
logger("info", "obi taxonomy")
# Open the input
input = open_uri(config['obi']['inputURI'])
if input is None:
raise Exception("Could not read input view")
i_dms = input[0]
i_view = input[1]
i_view_name = input[1].name
# Open the output: only the DMS, as the output view is going to be created by cloning the input view
# (could eventually be done via an open_uri() argument)
output = open_uri(config['obi']['outputURI'],
input=False,
dms_only=True)
if output is None:
raise Exception("Could not create output view")
o_dms = output[0]
output_0 = output[0]
o_view_name = output[1]
# stdout output: create temporary view
if type(output_0)==BufferedWriter:
o_dms = i_dms
i=0
o_view_name = b"temp"
while o_view_name in i_dms: # Making sure view name is unique in output DMS
o_view_name = o_view_name+b"_"+str2bytes(str(i))
i+=1
imported_view_name = o_view_name
# If the input and output DMS are not the same, import the input view in the output DMS before cloning it to modify it
# (could be the other way around: clone and modify in the input DMS then import the new view in the output DMS)
if i_dms != o_dms:
imported_view_name = i_view_name
i=0
while imported_view_name in o_dms: # Making sure view name is unique in output DMS
imported_view_name = i_view_name+b"_"+str2bytes(str(i))
i+=1
View.import_view(i_dms.full_path[:-7], o_dms.full_path[:-7], i_view_name, imported_view_name)
i_view = o_dms[imported_view_name]
# Clone output view from input view
o_view = i_view.clone(o_view_name)
if o_view is None:
raise Exception("Couldn't create output view")
i_view.close()
# Open taxonomy
taxo_uri = open_uri(config['obi']['taxoURI'])
if taxo_uri is None or taxo_uri[2] == bytes:
raise Exception("Couldn't open taxonomy")
taxo = taxo_uri[1]
# Initialize the progress bar
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(len(o_view), config)
else:
pb = None
try:
if config['taxonomy']['log_file']:
logfile = open(config['taxonomy']['log_file'], 'w')
else:
logfile = sys.stdout
if 'restricting_ancestor' in config['taxonomy']:
res_anc = int(config['taxonomy']['restricting_ancestor'])
else:
res_anc = None
taxid_column_name = config['taxonomy']['taxid_tag']
parent_taxid_column_name = "PARENT_TAXID" # TODO macro
taxon_name_column_name = config['taxonomy']['taxon_name_tag']
taxid_column = Column.new_column(o_view, taxid_column_name, OBI_INT)
if parent_taxid_column_name in o_view:
parent_taxid_column = o_view[parent_taxid_column_name]
else:
parent_taxid_column = None
#parent_taxid_column = Column.new_column(o_view, parent_taxid_column_name, OBI_INT)
taxon_name_column = o_view[taxon_name_column_name]
for i in range(len(o_view)):
PyErr_CheckSignals()
#if pb is not None:
# pb(i)
taxon_name = taxon_name_column[i]
taxon = taxo.get_taxon_by_name(taxon_name, res_anc)
if taxon is not None:
taxid_column[i] = taxon.taxid
if logfile:
print(f"Found taxon '{tostr(taxon_name)}' already existing with taxid {taxid_column[i]}", file=logfile)
else: # try finding genus or other parent taxon from the first word
#print(i, o_view[i].id)
if parent_taxid_column is not None and parent_taxid_column[i] is not None:
taxid_column[i] = taxo.add_taxon(taxon_name, 'species', parent_taxid_column[i])
if logfile:
print(f"Adding taxon '{tostr(taxon_name)}' under provided parent {parent_taxid_column[i]} with taxid {taxid_column[i]}", file=logfile)
else:
taxon_name_sp = taxon_name.split(b" ")
taxon = taxo.get_taxon_by_name(taxon_name_sp[0], res_anc)
if taxon is not None:
parent_taxid_column[i] = taxon.taxid
taxid_column[i] = taxo.add_taxon(taxon_name, 'species', taxon.taxid)
if logfile:
print(f"Adding taxon '{tostr(taxon_name)}' under '{tostr(taxon.name)}' ({taxon.taxid}) with taxid {taxid_column[i]}", file=logfile)
else:
taxid_column[i] = taxo.add_taxon(taxon_name, 'species', res_anc)
if logfile:
print(f"Adding taxon '{tostr(taxon_name)}' under provided restricting ancestor {res_anc} with taxid {taxid_column[i]}", file=logfile)
taxo.write(taxo.name, update=True)
except Exception, e:
raise RollbackException("obi taxonomy error, rollbacking view: "+str(e), o_view)
#if pb is not None:
# pb(i, force=True)
# print("", file=sys.stderr)
#logger("info", "\nTaxa already in the taxonomy: "+str(found_count)+"/"+str(len(o_view))+" ("+str(round(found_count*100.0/len(o_view), 2))+"%)")
#logger("info", "\nParent taxids found: "+str(parent_found_count)+"/"+str(len(o_view))+" ("+str(round(parent_found_count*100.0/len(o_view), 2))+"%)")
#logger("info", "\nTaxids not found: "+str(not_found_count)+"/"+str(len(o_view))+" ("+str(round(not_found_count*100.0/len(o_view), 2))+"%)")
# Save command config in View and DMS comments
command_line = " ".join(sys.argv[1:])
input_dms_name=[input[0].name]
input_view_name=[i_view_name]
if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
o_view.write_config(config, "taxonomy", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
o_dms.record_command_line(command_line)
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_view), file=sys.stderr)
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
# If the input and the output DMS are different or if stdout output, delete the temporary imported view used to create the final view
if i_dms != o_dms or type(output_0)==BufferedWriter:
View.delete_view(o_dms, imported_view_name)
o_dms.close(force=True)
i_dms.close(force=True)
logger("info", "Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -23,6 +23,8 @@ from obitools3.dms.capi.obiview cimport NUC_SEQUENCE_COLUMN, \
import shutil
import string
import random
import sys
from cpython.exc cimport PyErr_CheckSignals
VIEW_TYPES = [b"", b"NUC_SEQS_VIEW"]
@ -37,7 +39,7 @@ COL_COMMENTS_MAX_LEN = 2048
MAX_INT = 2147483647 # used to generate random float values
__title__="Tests if the obitools are working properly"
__title__="Test if the obitools are working properly"
default_config = {
@ -299,8 +301,11 @@ def fill_column(config, infos, col) :
def create_random_column(config, infos) :
alias = random.choice([b'', random_unique_name(infos)])
tuples = random.choice([True, False])
dict_column = False
if not tuples :
nb_elements_per_line=random.randint(1, config['test']['maxelts'])
if nb_elements_per_line > 1:
dict_column = True
elements_names = []
for i in range(nb_elements_per_line) :
elements_names.append(random_unique_element_name(config, infos))
@ -316,6 +321,7 @@ def create_random_column(config, infos) :
data_type,
nb_elements_per_line=nb_elements_per_line,
elements_names=elements_names,
dict_column=dict_column,
tuples=tuples,
comments=random_comments(config),
alias=alias
@ -365,7 +371,7 @@ def random_new_view(config, infos, first=False):
infos['view'] = View_NUC_SEQS.new(infos['dms'], random_unique_name(infos), comments=random_comments(config)) # TODO quality column
else :
infos['view'] = View.new(infos['dms'], random_unique_name(infos), comments=random_comments(config)) # TODO quality column
infos['view'].write_config(config, "test", infos["command_line"], input_dms_name=[infos['dms'].name], input_view_name=["random"])
print_test(config, repr(infos['view']))
if v_to_clone is not None :
if line_selection is None:
@ -440,7 +446,7 @@ def addOptions(parser):
default=20,
type=int,
help="Maximum length of tuples. "
"Default: 200")
"Default: 20")
group.add_argument('--max_ini_col_count','-o',
action="store", dest="test:maxinicolcount",
@ -453,10 +459,10 @@ def addOptions(parser):
group.add_argument('--max_line_nb','-l',
action="store", dest="test:maxlinenb",
metavar='<MAX_LINE_NB>',
default=10000,
default=1000,
type=int,
help="Maximum number of lines in a column. "
"Default: 10000")
"Default: 1000")
group.add_argument('--max_elts_per_line','-e',
action="store", dest="test:maxelts",
@ -496,7 +502,8 @@ def run(config):
(b"OBI_SEQ", False): random_seq, (b"OBI_SEQ", True): random_seq_tuples,
(b"OBI_STR", False): random_bytes, (b"OBI_STR", True): random_bytes_tuples
},
'tests': [test_set_and_get, test_add_col, test_delete_col, test_col_alias, test_new_view]
'tests': [test_set_and_get, test_add_col, test_delete_col, test_col_alias, test_new_view],
'command_line': " ".join(sys.argv[1:])
}
# TODO ???
@ -511,11 +518,16 @@ def run(config):
i = 0
for t in range(config['test']['nbtests']):
PyErr_CheckSignals()
random_test(config, infos)
print_test(config, repr(infos['view']))
i+=1
if (i%(config['test']['nbtests']/10)) == 0 :
print("Testing......"+str(i*100/config['test']['nbtests'])+"%")
#lsof = subprocess.Popen("lsof | grep '/private/tmp/test_dms.obidms/'", stdin=subprocess.PIPE, shell=True )
#lsof.communicate( b"LSOF\n" )
#lsof = subprocess.Popen("lsof | wc -l", stdin=subprocess.PIPE, shell=True )
#lsof.communicate( b"LSOF total\n" )
#print(infos)
@ -523,7 +535,7 @@ def run(config):
test_taxo(config, infos)
infos['view'].close()
infos['dms'].close()
infos['dms'].close(force=True)
shutil.rmtree(config['obi']['defaultdms']+'.obidms', ignore_errors=True)
print("Done.")

View File

@ -1,103 +0,0 @@
../../../src/obi_lcs.h
../../../src/obi_lcs.c
../../../src/obierrno.h
../../../src/obierrno.c
../../../src/upperband.h
../../../src/upperband.c
../../../src/sse_banded_LCS_alignment.h
../../../src/sse_banded_LCS_alignment.c
../../../src/obiblob.h
../../../src/obiblob.c
../../../src/utils.h
../../../src/utils.c
../../../src/obidms.h
../../../src/obidms.c
../../../src/libjson/json_utils.h
../../../src/libjson/json_utils.c
../../../src/libjson/cJSON.h
../../../src/libjson/cJSON.c
../../../src/obiavl.h
../../../src/obiavl.c
../../../src/bloom.h
../../../src/bloom.c
../../../src/crc64.h
../../../src/crc64.c
../../../src/murmurhash2.h
../../../src/murmurhash2.c
../../../src/obidmscolumn.h
../../../src/obidmscolumn.c
../../../src/obitypes.h
../../../src/obitypes.c
../../../src/obidmscolumndir.h
../../../src/obidmscolumndir.c
../../../src/obiblob_indexer.h
../../../src/obiblob_indexer.c
../../../src/obiview.h
../../../src/obiview.c
../../../src/hashtable.h
../../../src/hashtable.c
../../../src/linked_list.h
../../../src/linked_list.c
../../../src/obidmscolumn_array.h
../../../src/obidmscolumn_array.c
../../../src/obidmscolumn_blob.h
../../../src/obidmscolumn_blob.c
../../../src/obidmscolumn_idx.h
../../../src/obidmscolumn_idx.c
../../../src/obidmscolumn_bool.h
../../../src/obidmscolumn_bool.c
../../../src/obidmscolumn_char.h
../../../src/obidmscolumn_char.c
../../../src/obidmscolumn_float.h
../../../src/obidmscolumn_float.c
../../../src/obidmscolumn_int.h
../../../src/obidmscolumn_int.c
../../../src/obidmscolumn_qual.h
../../../src/obidmscolumn_qual.c
../../../src/obidmscolumn_seq.h
../../../src/obidmscolumn_seq.c
../../../src/obidmscolumn_str.h
../../../src/obidmscolumn_str.c
../../../src/array_indexer.h
../../../src/array_indexer.c
../../../src/char_str_indexer.h
../../../src/char_str_indexer.c
../../../src/dna_seq_indexer.h
../../../src/dna_seq_indexer.c
../../../src/encode.c
../../../src/encode.h
../../../src/uint8_indexer.c
../../../src/uint8_indexer.h
../../../src/build_reference_db.c
../../../src/build_reference_db.h
../../../src/kmer_similarity.c
../../../src/kmer_similarity.h
../../../src/obi_clean.c
../../../src/obi_clean.h
../../../src/obi_ecopcr.c
../../../src/obi_ecopcr.h
../../../src/obi_ecotag.c
../../../src/obi_ecotag.h
../../../src/obidms_taxonomy.c
../../../src/obidms_taxonomy.h
../../../src/obilittlebigman.c
../../../src/obilittlebigman.h
../../../src/_sse.h
../../../src/obidebug.h
../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../src/libecoPCR/libapat/apat_parse.c
../../../src/libecoPCR/libapat/apat_search.c
../../../src/libecoPCR/libapat/apat.h
../../../src/libecoPCR/libapat/Gmach.h
../../../src/libecoPCR/libapat/Gtypes.h
../../../src/libecoPCR/libapat/libstki.c
../../../src/libecoPCR/libapat/libstki.h
../../../src/libecoPCR/libthermo/nnparams.h
../../../src/libecoPCR/libthermo/nnparams.c
../../../src/libecoPCR/ecoapat.c
../../../src/libecoPCR/ecodna.c
../../../src/libecoPCR/ecoError.c
../../../src/libecoPCR/ecoMalloc.c
../../../src/libecoPCR/ecoPCR.h

View File

@ -5,5 +5,5 @@ from obitools3.dms.taxo.taxo cimport Taxonomy
from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy)
cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, list mergedKeys_list=*, Taxonomy taxonomy=*, bint mergeIds=*, list categories=*, int max_elts=*)
cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy, dict config)
cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, dict config, list mergedKeys_list=*, Taxonomy taxonomy=*, bint mergeIds=*, list categories=*, int max_elts=*)

370
python/obitools3/commands/uniq.pyx Executable file → Normal file
View File

@ -7,18 +7,25 @@ from obitools3.dms.obiseq cimport Nuc_Seq_Stored
from obitools3.dms.view import RollbackException
from obitools3.dms.view.typed_view.view_NUC_SEQS cimport View_NUC_SEQS
from obitools3.dms.column.column cimport Column, Column_line
from obitools3.dms.capi.obiview cimport QUALITY_COLUMN, COUNT_COLUMN, NUC_SEQUENCE_COLUMN, ID_COLUMN
from obitools3.dms.capi.obiview cimport QUALITY_COLUMN, COUNT_COLUMN, NUC_SEQUENCE_COLUMN, ID_COLUMN, TAXID_COLUMN, \
TAXID_DIST_COLUMN, MERGED_TAXID_COLUMN, MERGED_COLUMN, MERGED_PREFIX, \
REVERSE_QUALITY_COLUMN
from obitools3.dms.capi.obitypes cimport OBI_INT, OBI_STR, index_t
from obitools3.apps.optiongroups import addMinimalInputOption, addMinimalOutputOption, addTaxonomyOption
from obitools3.apps.optiongroups import addMinimalInputOption, \
addMinimalOutputOption, \
addTaxonomyOption, \
addEltLimitOption, \
addNoProgressBarOption
from obitools3.uri.decode import open_uri
from obitools3.apps.config import logger
from obitools3.utils cimport tobytes
from obitools3.utils cimport tobytes, tostr, str2bytes
import sys
from cpython.exc cimport PyErr_CheckSignals
from io import BufferedWriter
__title__="Group sequence records together"
def addOptions(parser):
@ -26,6 +33,8 @@ def addOptions(parser):
addMinimalInputOption(parser)
addTaxonomyOption(parser)
addMinimalOutputOption(parser)
addEltLimitOption(parser)
addNoProgressBarOption(parser)
group = parser.add_argument_group('obi uniq specific options')
@ -34,12 +43,12 @@ def addOptions(parser):
metavar="<TAG NAME>",
default=[],
type=str,
help="Attributes to merge.") # TODO must be a 1 elt/line column
help="Attributes to merge.") # note: must be a 1 elt/line column, but columns containing already merged information (name MERGED_*) are automatically remerged
group.add_argument('--merge-ids', '-e',
action="store_true", dest="uniq:mergeids",
default=False,
help="ONLY WORKING ON SMALL SETS FOR NOW Add the merged key with all ids of merged sequences.") # TODO ?
help="Add the merged key with all ids of merged sequences.")
group.add_argument('--category-attribute', '-c',
action="append", dest="uniq:categories",
@ -50,7 +59,7 @@ def addOptions(parser):
"(option can be used several times).")
cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy, dict config) :
cdef int taxid
cdef Nuc_Seq_Stored seq
@ -63,7 +72,7 @@ cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
cdef object gn_sn
cdef object fa_sn
# Create columns
# Create columns and save them for efficiency
if b"species" in o_view and o_view[b"species"].data_type_int != OBI_INT :
o_view.delete_column(b"species")
if b"species" not in o_view:
@ -71,6 +80,7 @@ cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
b"species",
OBI_INT
)
species_column = o_view[b"species"]
if b"genus" in o_view and o_view[b"genus"].data_type_int != OBI_INT :
o_view.delete_column(b"genus")
@ -79,6 +89,7 @@ cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
b"genus",
OBI_INT
)
genus_column = o_view[b"genus"]
if b"family" in o_view and o_view[b"family"].data_type_int != OBI_INT :
o_view.delete_column(b"family")
@ -87,6 +98,7 @@ cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
b"family",
OBI_INT
)
family_column = o_view[b"family"]
if b"species_name" in o_view and o_view[b"species_name"].data_type_int != OBI_STR :
o_view.delete_column(b"species_name")
@ -95,6 +107,7 @@ cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
b"species_name",
OBI_STR
)
species_name_column = o_view[b"species_name"]
if b"genus_name" in o_view and o_view[b"genus_name"].data_type_int != OBI_STR :
o_view.delete_column(b"genus_name")
@ -103,6 +116,7 @@ cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
b"genus_name",
OBI_STR
)
genus_name_column = o_view[b"genus_name"]
if b"family_name" in o_view and o_view[b"family_name"].data_type_int != OBI_STR :
o_view.delete_column(b"family_name")
@ -111,6 +125,7 @@ cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
b"family_name",
OBI_STR
)
family_name_column = o_view[b"family_name"]
if b"rank" in o_view and o_view[b"rank"].data_type_int != OBI_STR :
o_view.delete_column(b"rank")
@ -119,6 +134,7 @@ cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
b"rank",
OBI_STR
)
rank_column = o_view[b"rank"]
if b"scientific_name" in o_view and o_view[b"scientific_name"].data_type_int != OBI_STR :
o_view.delete_column(b"scientific_name")
@ -127,16 +143,27 @@ cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
b"scientific_name",
OBI_STR
)
for seq in o_view:
if b"merged_taxid" in seq :
scientific_name_column = o_view[b"scientific_name"]
# Initialize the progress bar
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(len(o_view), config)
else:
pb = None
i=0
for seq in o_view:
PyErr_CheckSignals()
if pb is not None:
pb(i)
if MERGED_TAXID_COLUMN in seq :
m_taxids = []
m_taxids_dict = seq[b"merged_taxid"]
m_taxids_dict = seq[MERGED_TAXID_COLUMN]
for k in m_taxids_dict.keys() :
if m_taxids_dict[k] is not None:
m_taxids.append(int(k))
taxid = taxonomy.last_common_taxon(*m_taxids)
seq[b"taxid"] = taxid
seq[TAXID_COLUMN] = taxid
tsp = taxonomy.get_species(taxid)
tgn = taxonomy.get_genus(taxid)
tfa = taxonomy.get_family(taxid)
@ -158,20 +185,24 @@ cdef merge_taxonomy_classification(View_NUC_SEQS o_view, Taxonomy taxonomy) :
else:
fa_sn = None
tfa = None
seq[b"species"] = tsp
seq[b"genus"] = tgn
seq[b"family"] = tfa
seq[b"species_name"] = sp_sn
seq[b"genus_name"] = gn_sn
seq[b"family_name"] = fa_sn
seq[b"rank"] = taxonomy.get_rank(taxid)
seq[b"scientific_name"] = taxonomy.get_scientific_name(taxid)
species_column[i] = tsp
genus_column[i] = tgn
family_column[i] = tfa
species_name_column[i] = sp_sn
genus_name_column[i] = gn_sn
family_name_column[i] = fa_sn
rank_column[i] = taxonomy.get_rank(taxid)
scientific_name_column[i] = taxonomy.get_scientific_name(taxid)
i+=1
if pb is not None:
pb(len(o_view), force=True)
cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, list mergedKeys_list=None, Taxonomy taxonomy=None, bint mergeIds=False, list categories=None, int max_elts=10000) :
cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, dict config, list mergedKeys_list=None, Taxonomy taxonomy=None, bint mergeIds=False, list categories=None, int max_elts=1000000) :
cdef int i
cdef int k
@ -180,6 +211,7 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
cdef int u_idx
cdef int i_idx
cdef int i_count
cdef int o_count
cdef str key_str
cdef bytes key
cdef bytes mkey
@ -202,7 +234,6 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
cdef Nuc_Seq_Stored i_seq
cdef Nuc_Seq_Stored o_seq
cdef Nuc_Seq_Stored u_seq
cdef Column i_col
cdef Column i_seq_col
cdef Column i_id_col
cdef Column i_taxid_col
@ -210,14 +241,25 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
cdef Column o_id_col
cdef Column o_taxid_dist_col
cdef Column o_merged_col
cdef Column o_count_col
cdef Column i_count_col
cdef Column_line i_mcol
cdef object taxid_dist_dict
cdef object iter_view
cdef object mcol
cdef object to_merge
cdef list merged_list
uniques = {}
for column_name in view.keys():
if column_name[:7] == b"MERGED_":
info_to_merge = column_name[7:]
if mergedKeys_list is not None:
mergedKeys_list.append(tostr(info_to_merge))
else:
mergedKeys_list = [tostr(info_to_merge)]
mergedKeys_list_b = []
if mergedKeys_list is not None:
for key_str in mergedKeys_list:
@ -227,26 +269,31 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
mergedKeys_set=set()
if taxonomy is not None:
mergedKeys_set.add(b"taxid")
mergedKeys_set.add(TAXID_COLUMN)
mergedKeys = list(mergedKeys_set)
k_count = len(mergedKeys)
mergedKeys_m = []
for k in range(k_count):
mergedKeys_m.append(b"merged_" + mergedKeys[k])
mergedKeys_m.append(MERGED_PREFIX + mergedKeys[k])
# Check that not trying to remerge without total count information
for key in mergedKeys_m:
if key in view and COUNT_COLUMN not in view:
raise Exception("\n>>>>\nError: trying to re-merge tags without total count tag. Run obi annotate to add the count tag from the relevant merged tag, i.e.: \nobi annotate --set-tag COUNT:'sum([value for key,value in sequence['MERGED_sample'].items()])' dms/input dms/output\n")
if categories is None:
categories = []
# Keep columns that are going to be used a lot in variables
i_seq_col = view[NUC_SEQUENCE_COLUMN]
i_id_col = view[ID_COLUMN]
if b"taxid" in view:
i_taxid_col = view[b"taxid"]
if b"taxid_dist" in view:
i_taxid_dist_col = view[b"taxid_dist"]
if TAXID_COLUMN in view:
i_taxid_col = view[TAXID_COLUMN]
if TAXID_DIST_COLUMN in view:
i_taxid_dist_col = view[TAXID_DIST_COLUMN]
# First browsing
@ -257,7 +304,9 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
merged_infos = {}
iter_view = iter(view)
for i_seq in iter_view :
pb(i)
PyErr_CheckSignals()
if pb is not None:
pb(i)
# This can't be done in the same line as the unique_id tuple creation because it generates a bug
# where Cython (version 0.25.2) does not detect the reference to the categs_list variable and deallocates
@ -267,6 +316,7 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
for x in categories :
catl.append(i_seq[x])
#unique_id = tuple(catl) + (i_seq_col[i],)
unique_id = tuple(catl) + (i_seq_col.get_line_idx(i),)
#unique_id = tuple(i_seq[x] for x in categories) + (seq_col.get_line_idx(i),) # The line that cython can't read properly
@ -278,7 +328,13 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
for k in range(k_count):
key = mergedKeys[k]
mkey = mergedKeys_m[k]
if key in i_seq: # TODO what if mkey already in i_seq? --> should update
if mkey in i_seq:
if mkey not in merged_infos:
merged_infos[mkey] = {}
mkey_infos = merged_infos[mkey]
mkey_infos['nb_elts'] = view[mkey].nb_elements_per_line
mkey_infos['elt_names'] = view[mkey].elements_names
if key in i_seq:
if mkey not in merged_infos:
merged_infos[mkey] = {}
mkey_infos = merged_infos[mkey]
@ -297,7 +353,15 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
for k in range(k_count):
key = mergedKeys[k]
merged_col_name = mergedKeys_m[k]
i_col = view[key]
# if merged_infos[merged_col_name]['nb_elts'] == 1:
# raise Exception("Can't merge information from a tag with only one element (e.g. one sample ; don't use -m option)")
if merged_col_name in view:
i_col = view[merged_col_name]
else:
i_col = view[key]
if merged_infos[merged_col_name]['nb_elts'] > max_elts:
str_merged_cols.append(merged_col_name)
Column.new_column(o_view,
@ -305,7 +369,7 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
OBI_STR,
to_eval=True,
comments=i_col.comments,
alias=merged_col_name # TODO what if it already exists
alias=merged_col_name
)
else:
@ -314,62 +378,73 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
OBI_INT,
nb_elements_per_line=merged_infos[merged_col_name]['nb_elts'],
elements_names=list(merged_infos[merged_col_name]['elt_names']),
dict_column=True,
comments=i_col.comments,
alias=merged_col_name # TODO what if it already exists
alias=merged_col_name
)
mkey_cols[merged_col_name] = o_view[merged_col_name]
# taxid_dist column
if mergeIds and b"taxid" in mergedKeys:
if mergeIds and TAXID_COLUMN in mergedKeys:
if len(view) > max_elts: #The number of different IDs corresponds to the number of sequences in the view
str_merged_cols.append(b"taxid_dist")
str_merged_cols.append(TAXID_DIST_COLUMN)
Column.new_column(o_view,
b"taxid_dist",
TAXID_DIST_COLUMN,
OBI_STR,
to_eval=True,
comments=b"obi uniq taxid dist, stored as character strings to be read as dict",
alias=b"taxid_dist" # TODO what if it already exists
alias=TAXID_DIST_COLUMN
)
else:
Column.new_column(o_view,
b"taxid_dist",
TAXID_DIST_COLUMN,
OBI_INT,
nb_elements_per_line=len(view),
elements_names=[id for id in i_id_col],
comments=b"obi uniq taxid dist",
alias=b"taxid_dist" # TODO what if it already exists
dict_column=True,
alias=TAXID_DIST_COLUMN
)
del(merged_infos)
# Merged ids column
if mergeIds :
Column.new_column(o_view,
b"merged",
MERGED_COLUMN,
OBI_STR,
tuples=True,
comments=b"obi uniq merged ids",
alias=b"merged" # TODO what if it already exists
alias=MERGED_COLUMN
)
# Keep columns that are going to be used a lot in variables
# Keep columns in variables for efficiency
o_id_col = o_view[ID_COLUMN]
if b"taxid_dist" in o_view:
o_taxid_dist_col = o_view[b"taxid_dist"]
if b"merged" in o_view:
o_merged_col = o_view[b"merged"]
if TAXID_DIST_COLUMN in o_view:
o_taxid_dist_col = o_view[TAXID_DIST_COLUMN]
if MERGED_COLUMN in o_view:
o_merged_col = o_view[MERGED_COLUMN]
if COUNT_COLUMN not in o_view:
Column.new_column(o_view,
COUNT_COLUMN,
OBI_INT)
o_count_col = o_view[COUNT_COLUMN]
if COUNT_COLUMN in view:
i_count_col = view[COUNT_COLUMN]
if pb is not None:
pb(len(view), force=True)
print("")
print("\n") # TODO because in the middle of progress bar. Better solution?
logger("info", "Second browsing through the input")
# Initialize the progress bar
pb = ProgressBar(len(uniques), seconde=5)
if pb is not None:
pb = ProgressBar(len(view))
o_idx = 0
total_treated = 0
for unique_id in uniques :
pb(o_idx)
PyErr_CheckSignals()
merged_sequences = uniques[unique_id]
@ -380,11 +455,15 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
o_id = o_seq.id
if mergeIds:
o_merged_col[o_idx] = [view[idx].id for idx in merged_sequences]
merged_list = [view[idx].id for idx in merged_sequences]
if MERGED_COLUMN in view: # merge all ids if there's already some merged ids info
merged_list.extend(view[MERGED_COLUMN][idx] for idx in merged_sequences)
merged_list = list(set(merged_list)) # deduplicate the list
o_merged_col[o_idx] = merged_list
o_seq[COUNT_COLUMN] = 0
o_count = 0
if b"taxid_dist" in u_seq and i_taxid_dist_col[u_idx] is not None:
if TAXID_DIST_COLUMN in u_seq and i_taxid_dist_col[u_idx] is not None:
taxid_dist_dict = i_taxid_dist_col[u_idx]
else:
taxid_dist_dict = {}
@ -394,29 +473,33 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
merged_dict[mkey] = {}
for i_idx in merged_sequences:
PyErr_CheckSignals()
if pb is not None:
pb(total_treated)
i_id = i_id_col[i_idx]
i_seq = view[i_idx]
if COUNT_COLUMN not in i_seq or i_seq[COUNT_COLUMN] is None:
if COUNT_COLUMN not in i_seq or i_count_col[i_idx] is None:
i_count = 1
else:
i_count = i_seq[COUNT_COLUMN]
i_count = i_count_col[i_idx]
o_seq[COUNT_COLUMN] += i_count
o_count += i_count
for k in range(k_count):
key = mergedKeys[k]
mkey = mergedKeys_m[k]
if key==b"taxid" and mergeIds:
if b"taxid_dist" in i_seq:
if key==TAXID_COLUMN and mergeIds:
if TAXID_DIST_COLUMN in i_seq:
taxid_dist_dict.update(i_taxid_dist_col[i_idx])
if b"taxid" in i_seq:
if TAXID_COLUMN in i_seq:
taxid_dist_dict[i_id] = i_taxid_col[i_idx]
#cas ou on met a jour les merged_keys mais il n'y a pas de merged_keys dans la sequence qui arrive
# merge relevant keys
if key in i_seq:
to_merge = i_seq[key]
if to_merge is not None:
@ -428,52 +511,65 @@ cdef uniq_sequences(View_NUC_SEQS view, View_NUC_SEQS o_view, ProgressBar pb, li
else:
mcol[to_merge] = mcol[to_merge] + i_count
o_seq[key] = None
#cas ou merged_keys existe deja
else: # TODO is this a good else
if mkey in i_seq:
mcol = merged_dict[mkey]
i_mcol = i_seq[mkey]
if i_mcol is not None:
for key2 in i_mcol:
if mcol[key2] is None:
mcol[key2] = i_mcol[key2]
else:
mcol[key2] = mcol[key2] + i_mcol[key2]
# Write taxid_dist
if mergeIds and b"taxid" in mergedKeys:
if b"taxid_dist" in str_merged_cols:
o_taxid_dist_col[o_idx] = str(taxid_dist_dict)
else:
o_taxid_dist_col[o_idx] = taxid_dist_dict
# Write merged dicts
for mkey in merged_dict:
if mkey in str_merged_cols:
mkey_cols[mkey][o_idx] = str(merged_dict[mkey])
else:
mkey_cols[mkey][o_idx] = merged_dict[mkey]
# Sets NA values to 0 # TODO discuss, maybe keep as None and test for None instead of testing for 0 in tools
#for key in mkey_cols[mkey][o_idx]:
# if mkey_cols[mkey][o_idx][key] is None:
# mkey_cols[mkey][o_idx][key] = 0
# merged infos already in seq: merge the merged infos
if mkey in i_seq:
mcol = merged_dict[mkey] # dict
i_mcol = i_seq[mkey] # column line
if i_mcol.is_NA() == False:
for key2 in i_mcol:
if key2 not in mcol:
mcol[key2] = i_mcol[key2]
else:
mcol[key2] = mcol[key2] + i_mcol[key2]
for key in i_seq.keys():
# Delete informations that differ between the merged sequences
# TODO make special columns list?
if key != COUNT_COLUMN and key != ID_COLUMN and key != NUC_SEQUENCE_COLUMN and key in o_seq and o_seq[key] != i_seq[key] :
# TODO make special columns list? // could be more efficient
if key != COUNT_COLUMN and key != ID_COLUMN and key != NUC_SEQUENCE_COLUMN and key in o_seq and o_seq[key] != i_seq[key] \
and key not in merged_dict :
o_seq[key] = None
total_treated += 1
# Write merged dicts
for mkey in merged_dict:
if mkey in str_merged_cols:
mkey_cols[mkey][o_idx] = str(merged_dict[mkey])
else:
mkey_cols[mkey][o_idx] = merged_dict[mkey]
# Sets NA values to 0 # TODO discuss, for now keep as None and test for None instead of testing for 0 in tools
#for key in mkey_cols[mkey][o_idx]:
# if mkey_cols[mkey][o_idx][key] is None:
# mkey_cols[mkey][o_idx][key] = 0
# Write taxid_dist
if mergeIds and TAXID_COLUMN in mergedKeys:
if TAXID_DIST_COLUMN in str_merged_cols:
o_taxid_dist_col[o_idx] = str(taxid_dist_dict)
else:
o_taxid_dist_col[o_idx] = taxid_dist_dict
o_count_col[o_idx] = o_count
o_idx += 1
# Deletes quality column if there is one because the matching between sequence and quality will be broken (quality set to NA when sequence not)
if pb is not None:
pb(len(view), force=True)
# Deletes quality columns if there is one because the matching between sequence and quality will be broken (quality set to NA when sequence not)
if QUALITY_COLUMN in view:
o_view.delete_column(QUALITY_COLUMN)
if REVERSE_QUALITY_COLUMN in view:
o_view.delete_column(REVERSE_QUALITY_COLUMN)
# Delete old columns that are now merged
for k in range(k_count):
if mergedKeys[k] in o_view:
o_view.delete_column(mergedKeys[k])
if taxonomy is not None:
print("\n") # TODO because in the middle of progress bar. Better solution?
print("") # TODO because in the middle of progress bar. Better solution?
logger("info", "Merging taxonomy classification")
merge_taxonomy_classification(o_view, taxonomy)
merge_taxonomy_classification(o_view, taxonomy, config)
@ -505,25 +601,47 @@ def run(config):
if output is None:
raise Exception("Could not create output view")
i_dms = input[0]
entries = input[1]
o_view = output[1]
o_dms = output[0]
output_0 = output[0]
# If stdout output create a temporary view that will be exported and deleted.
if type(output_0)==BufferedWriter:
temporary_view_name = b"temp"
i=0
while temporary_view_name in i_dms: # Making sure view name is unique in input DMS
temporary_view_name = temporary_view_name+b"_"+str2bytes(str(i))
i+=1
o_view_name = temporary_view_name
o_dms = i_dms
o_view = View_NUC_SEQS.new(i_dms, o_view_name)
else:
o_view = output[1]
if 'taxoURI' in config['obi'] and config['obi']['taxoURI'] is not None:
taxo_uri = open_uri(config['obi']['taxoURI'])
if taxo_uri is None:
if taxo_uri is None or taxo_uri[2] == bytes:
raise RollbackException("Couldn't open taxonomy, rollbacking view", o_view)
taxo = taxo_uri[1]
else :
taxo = None
# Initialize the progress bar
pb = ProgressBar(len(entries), config, seconde=5)
if config['obi']['noprogressbar'] == False:
pb = ProgressBar(len(entries), config)
else:
pb = None
try:
uniq_sequences(entries, o_view, pb, mergedKeys_list=config['uniq']['merge'], taxonomy=taxo, mergeIds=config['uniq']['mergeids'], categories=config['uniq']['categories'], max_elts=config['obi']['maxelts'])
except Exception, e:
raise RollbackException("obi uniq error, rollbacking view: "+str(e), o_view)
if len(entries) > 0:
try:
uniq_sequences(entries, o_view, pb, config, mergedKeys_list=config['uniq']['merge'], taxonomy=taxo, mergeIds=config['uniq']['mergeids'], categories=config['uniq']['categories'], max_elts=config['obi']['maxelts'])
except Exception, e:
raise RollbackException("obi uniq error, rollbacking view: "+str(e), o_view)
if pb is not None:
print("", file=sys.stderr)
# Save command config in View and DMS comments
command_line = " ".join(sys.argv[1:])
input_dms_name=[input[0].name]
@ -532,11 +650,23 @@ def run(config):
input_dms_name.append(config['obi']['taxoURI'].split("/")[-3])
input_view_name.append("taxonomy/"+config['obi']['taxoURI'].split("/")[-1])
o_view.write_config(config, "uniq", command_line, input_dms_name=input_dms_name, input_view_name=input_view_name)
output[0].record_command_line(command_line)
o_dms.record_command_line(command_line)
print("\n")
print(repr(o_view))
input[0].close()
output[0].close()
# stdout output: write to buffer
if type(output_0)==BufferedWriter:
logger("info", "Printing to output...")
o_view.print_to_output(output_0, noprogressbar=config['obi']['noprogressbar'])
o_view.close()
#print("\n\nOutput view:\n````````````", file=sys.stderr)
#print(repr(o_view), file=sys.stderr)
# If stdout output, delete the temporary result view in the input DMS
if type(output_0)==BufferedWriter:
View.delete_view(i_dms, o_view_name)
i_dms.close(force=True)
o_dms.close(force=True)
logger("info", "Done.")

1
python/obitools3/dms/.gitignore vendored Normal file
View File

@ -0,0 +1 @@
/.DS_Store

View File

@ -34,11 +34,12 @@ cdef extern from "kmer_similarity.h" nogil:
index_t elt_idx2,
OBIDMS_column_p qual_col1,
OBIDMS_column_p qual_col2,
OBIDMS_column_p reversed_col,
uint8_t kmer_size,
int32_t* kmer_pos_array,
int32_t** kmer_pos_array,
int32_t* kmer_pos_array_height_p,
int32_t* shift_array,
int32_t** shift_array,
int32_t* shift_array_height_p,
int32_t* shift_count_array,
int32_t** shift_count_array,
int32_t* shift_count_array_height_p,
bint build_consensus)

View File

@ -9,6 +9,7 @@ cdef extern from "obidms.h" nogil:
bint little_endian
size_t file_size
size_t used_size
bint working
const_char_p comments
ctypedef OBIDMS_infos_t* OBIDMS_infos_p
@ -21,9 +22,10 @@ cdef extern from "obidms.h" nogil:
ctypedef OBIDMS_t* OBIDMS_p
int obi_dms_is_clean(OBIDMS_p dms)
int obi_clean_dms(const_char_p dms_path)
OBIDMS_p obi_dms(const_char_p dms_name)
OBIDMS_p obi_open_dms(const_char_p dms_path)
OBIDMS_p obi_open_dms(const_char_p dms_path, bint cleaning)
OBIDMS_p obi_test_open_dms(const_char_p dms_path)
OBIDMS_p obi_create_dms(const_char_p dms_path)
int obi_dms_exists(const char* dms_path)
@ -32,6 +34,7 @@ cdef extern from "obidms.h" nogil:
int obi_close_dms(OBIDMS_p dms, bint force)
char* obi_dms_get_dms_path(OBIDMS_p dms)
char* obi_dms_get_full_path(OBIDMS_p dms, const_char_p path_name)
char* obi_dms_formatted_infos(OBIDMS_p dms, bint detailed)
void obi_close_atexit()
obiversion_t obi_import_column(const char* dms_path_1, const char* dms_path_2, const char* column_name, obiversion_t version_number)

View File

@ -31,6 +31,7 @@ cdef extern from "obidmscolumn.h" nogil:
const_char_p elements_names
OBIType_t returned_data_type
OBIType_t stored_data_type
bint dict_column
bint tuples
bint to_eval
time_t creation_date
@ -68,3 +69,6 @@ cdef extern from "obidmscolumn.h" nogil:
int obi_column_write_comments(OBIDMS_column_p column, const char* comments)
int obi_column_add_comment(OBIDMS_column_p column, const char* key, const char* value)
char* obi_column_formatted_infos(OBIDMS_column_p column, bint detailed)

View File

@ -8,6 +8,7 @@ cdef extern from "obi_ecopcr.h" nogil:
int obi_ecopcr(const char* input_dms_name,
const char* i_view_name,
const char* tax_dms_name,
const char* taxonomy_name,
const char* output_dms_name,
const char* o_view_name,
@ -23,6 +24,7 @@ cdef extern from "obi_ecopcr.h" nogil:
double salt_concentration,
int salt_correction_method,
int keep_nucleotides,
bint keep_primers,
bint kingdom_mode)

View File

@ -11,4 +11,5 @@ cdef extern from "obi_ecotag.h" nogil:
const char* taxonomy_name,
const char* output_view_name,
const char* output_view_comments,
double ecotag_threshold)
double ecotag_threshold,
double bubble_threshold)

View File

@ -7,3 +7,5 @@ cdef extern from "obierrno.h":
extern int OBI_LINE_IDX_ERROR
extern int OBI_ELT_IDX_ERROR
extern int OBIVIEW_ALREADY_EXISTS_ERROR
extern int OBIDMS_NOT_CLEAN
extern int OBIDMS_WORKING

View File

@ -7,6 +7,8 @@ from libc.stdint cimport int32_t
cdef extern from "obidms_taxonomy.h" nogil:
extern int MIN_LOCAL_TAXID
struct ecotxnode :
int32_t taxid
int32_t rank
@ -18,6 +20,13 @@ cdef extern from "obidms_taxonomy.h" nogil:
ctypedef ecotxnode ecotx_t
struct econame_t : # can't get this struct to be accepted by Cython ('unknown size')
char* name
char* class_name
int32_t is_scientific_name
ecotxnode* taxon
struct ecotxidx_t :
int32_t count
int32_t max_taxid
@ -30,9 +39,14 @@ cdef extern from "obidms_taxonomy.h" nogil:
char** label
struct econameidx_t :
int32_t count
econame_t* names
struct OBIDMS_taxonomy_t :
ecorankidx_t* ranks
# econameidx_t* names
econameidx_t* names
ecotxidx_t* taxa
ctypedef OBIDMS_taxonomy_t* OBIDMS_taxonomy_p
@ -44,14 +58,18 @@ cdef extern from "obidms_taxonomy.h" nogil:
OBIDMS_taxonomy_p obi_read_taxdump(const_char_p taxdump)
int obi_write_taxonomy(OBIDMS_p dms, OBIDMS_taxonomy_p tax, const_char_p tax_name)
int obi_write_taxonomy(OBIDMS_p dms, OBIDMS_taxonomy_p tax, const_char_p tax_name, bint update)
int obi_close_taxonomy(OBIDMS_taxonomy_p taxonomy)
ecotx_t* obi_taxo_get_parent_at_rank(ecotx_t* taxon, int32_t rankidx)
ecotx_t* obi_taxo_get_taxon_with_taxid(OBIDMS_taxonomy_p taxonomy, int32_t taxid)
char* obi_taxo_get_name_from_name_idx(OBIDMS_taxonomy_p taxonomy, int32_t idx)
ecotx_t* obi_taxo_get_taxon_from_name_idx(OBIDMS_taxonomy_p taxonomy, int32_t idx)
bint obi_taxo_is_taxon_under_taxid(ecotx_t* taxon, int32_t other_taxid)
ecotx_t* obi_taxo_get_species(ecotx_t* taxon, OBIDMS_taxonomy_p taxonomy)
@ -71,4 +89,4 @@ cdef extern from "obidms_taxonomy.h" nogil:
int obi_taxo_add_preferred_name_with_taxon(OBIDMS_taxonomy_p tax, ecotx_t* taxon, const char* preferred_name)
const char* obi_taxo_rank_index_to_label(int32_t rank_idx, ecorankidx_t* ranks)

View File

@ -53,6 +53,8 @@ cdef extern from "obitypes.h" nogil:
extern const_char_p OBIQual_char_NA
extern uint8_t* OBIQual_int_NA
extern void* OBITuple_NA
extern obiint_t OBI_INT_MAX
const_char_p name_data_type(int data_type)

View File

@ -24,8 +24,15 @@ cdef extern from "obiview.h" nogil:
extern const_char_p ID_COLUMN
extern const_char_p DEFINITION_COLUMN
extern const_char_p QUALITY_COLUMN
extern const_char_p REVERSE_QUALITY_COLUMN
extern const_char_p REVERSE_SEQUENCE_COLUMN
extern const_char_p COUNT_COLUMN
extern const_char_p SCIENTIFIC_NAME_COLUMN
extern const_char_p TAXID_COLUMN
extern const_char_p MERGED_TAXID_COLUMN
extern const_char_p MERGED_PREFIX
extern const_char_p TAXID_DIST_COLUMN
extern const_char_p MERGED_COLUMN
struct Alias_column_pair_t :
Column_reference_t column_refs
@ -80,7 +87,7 @@ cdef extern from "obiview.h" nogil:
Obiview_p obi_open_view(OBIDMS_p dms, const_char_p view_name)
int obi_view_add_column(Obiview_p view,
const_char_p column_name,
char* column_name,
obiversion_t version_number,
const_char_p alias,
OBIType_t data_type,
@ -88,6 +95,7 @@ cdef extern from "obiview.h" nogil:
index_t nb_elements_per_line,
char* elements_names,
bint elt_names_formatted,
bint dict_column,
bint tuples,
bint to_eval,
const_char_p indexer_name,
@ -96,14 +104,18 @@ cdef extern from "obiview.h" nogil:
const_char_p comments,
bint create)
int obi_view_delete_column(Obiview_p view, const_char_p column_name)
int obi_view_delete_column(Obiview_p view, const_char_p column_name, bint delete_file)
OBIDMS_column_p obi_view_get_column(Obiview_p view, const_char_p column_name)
OBIDMS_column_p* obi_view_get_pointer_on_column_in_view(Obiview_p view, const_char_p column_name)
int obi_view_create_column_alias(Obiview_p view, const_char_p current_name, const_char_p alias)
char* obi_view_formatted_infos(Obiview_p view, bint detailed)
char* obi_view_formatted_infos_one_line(Obiview_p view)
int obi_view_write_comments(Obiview_p view, const_char_p comments)
int obi_view_add_comment(Obiview_p view, const_char_p key, const_char_p value)

View File

@ -1,110 +0,0 @@
../../../../src/obi_lcs.h
../../../../src/obi_lcs.c
../../../../src/obierrno.h
../../../../src/obierrno.c
../../../../src/upperband.h
../../../../src/upperband.c
../../../../src/sse_banded_LCS_alignment.h
../../../../src/sse_banded_LCS_alignment.c
../../../../src/obiblob.h
../../../../src/obiblob.c
../../../../src/utils.h
../../../../src/utils.c
../../../../src/obidms.h
../../../../src/obidms.c
../../../../src/libjson/json_utils.h
../../../../src/libjson/json_utils.c
../../../../src/libjson/cJSON.h
../../../../src/libjson/cJSON.c
../../../../src/obiavl.h
../../../../src/obiavl.c
../../../../src/bloom.h
../../../../src/bloom.c
../../../../src/crc64.h
../../../../src/crc64.c
../../../../src/murmurhash2.h
../../../../src/murmurhash2.c
../../../../src/obidmscolumn.h
../../../../src/obidmscolumn.c
../../../../src/obitypes.h
../../../../src/obitypes.c
../../../../src/obidmscolumndir.h
../../../../src/obidmscolumndir.c
../../../../src/obiblob_indexer.h
../../../../src/obiblob_indexer.c
../../../../src/obiview.h
../../../../src/obiview.c
../../../../src/hashtable.h
../../../../src/hashtable.c
../../../../src/linked_list.h
../../../../src/linked_list.c
../../../../src/obidmscolumn_array.h
../../../../src/obidmscolumn_array.c
../../../../src/obidmscolumn_blob.h
../../../../src/obidmscolumn_blob.c
../../../../src/obidmscolumn_idx.h
../../../../src/obidmscolumn_idx.c
../../../../src/obidmscolumn_bool.h
../../../../src/obidmscolumn_bool.c
../../../../src/obidmscolumn_char.h
../../../../src/obidmscolumn_char.c
../../../../src/obidmscolumn_float.h
../../../../src/obidmscolumn_float.c
../../../../src/obidmscolumn_int.h
../../../../src/obidmscolumn_int.c
../../../../src/obidmscolumn_qual.h
../../../../src/obidmscolumn_qual.c
../../../../src/obidmscolumn_seq.h
../../../../src/obidmscolumn_seq.c
../../../../src/obidmscolumn_str.h
../../../../src/obidmscolumn_str.c
../../../../src/array_indexer.h
../../../../src/array_indexer.c
../../../../src/char_str_indexer.h
../../../../src/char_str_indexer.c
../../../../src/dna_seq_indexer.h
../../../../src/dna_seq_indexer.c
../../../../src/encode.c
../../../../src/encode.h
../../../../src/uint8_indexer.c
../../../../src/uint8_indexer.h
../../../../src/build_reference_db.c
../../../../src/build_reference_db.h
../../../../src/kmer_similarity.c
../../../../src/kmer_similarity.h
../../../../src/obi_clean.c
../../../../src/obi_clean.h
../../../../src/obi_ecopcr.c
../../../../src/obi_ecopcr.h
../../../../src/obi_ecotag.c
../../../../src/obi_ecotag.h
../../../../src/obidms_taxonomy.c
../../../../src/obidms_taxonomy.h
../../../../src/obilittlebigman.c
../../../../src/obilittlebigman.h
../../../../src/_sse.h
../../../../src/obidebug.h
../../../../src/libecoPCR/libapat/CODES/dft_code.h
../../../../src/libecoPCR/libapat/CODES/dna_code.h
../../../../src/libecoPCR/libapat/CODES/prot_code.h
../../../../src/libecoPCR/libapat/apat_parse.c
../../../../src/libecoPCR/libapat/apat_search.c
../../../../src/libecoPCR/libapat/apat.h
../../../../src/libecoPCR/libapat/Gmach.h
../../../../src/libecoPCR/libapat/Gtypes.h
../../../../src/libecoPCR/libapat/libstki.c
../../../../src/libecoPCR/libapat/libstki.h
../../../../src/libecoPCR/libthermo/nnparams.h
../../../../src/libecoPCR/libthermo/nnparams.c
../../../../src/libecoPCR/ecoapat.c
../../../../src/libecoPCR/ecodna.c
../../../../src/libecoPCR/ecoError.c
../../../../src/libecoPCR/ecoMalloc.c
../../../../src/libecoPCR/ecoPCR.h

Some files were not shown because too many files have changed in this diff Show More