70 Commits

Author SHA1 Message Date
28df0c35c1 correction of the IR detection 2025-05-25 19:38:01 +02:00
478a6bdca7 Patch RPS12 detection 2025-05-25 13:43:43 +02:00
17908e0df2 Patch RPS12 detection 2025-05-25 13:41:47 +02:00
9205fd1ed1 Patch RPS12 detection 2025-05-25 10:31:30 +02:00
c5b92799b1 Detect number of cores 2025-05-24 08:46:13 +02:00
534c5c74a8 Patch two bugs in the best cluster selection 2025-05-22 05:36:04 +02:00
3589bf03eb Update Swissprot database 2025-05-22 02:23:26 +02:00
4b71fe8c4c Changes to be committed:
	modified:   .gitignore
	new file:   data/cds/sp_chlorodb/parameters.sh
	deleted:    data/ir/LSC_RefDB.fasta
	deleted:    data/ir/SSC_RefDB.fasta
	modified:   detectors/cds/bin/go_cds.sh
	modified:   detectors/normalize/lib/lookforIR.lib.sh
	modified:   detectors/normalize/lib/selectIR.py
	modified:   organnot/Dockerfile
	new file:   organnot/README.md
	new file:   organnot/dorgannot
	deleted:    ports/.DS_Store
	deleted:    src/ncbiblast/binaries/.gitignore
	deleted:    src/prokov/lxpack/tests/S.fasta
	deleted:    src/prokov/lxpack/tests/St.fasta
	deleted:    src/prokov/lxpack/tests/Stt.fasta
	deleted:    src/prokov/lxpack/tests/aS.fasta
	deleted:    src/prokov/lxpack/tests/aSt.fasta
	deleted:    src/prokov/lxpack/tests/aStt.fasta
	deleted:    src/prokov/lxpack/tests/aaS.fasta
	deleted:    src/prokov/lxpack/tests/aaSt.fasta
	deleted:    src/prokov/lxpack/tests/aaStt.fasta
	new file:   src/repseek/repseek-2014.09.tgz
2025-03-05 21:56:39 +01:00
2c012eec8e first batch
Former-commit-id: 1eecb206a17c4aff21d1170b48db134ce3c4f14e
2025-03-01 16:15:28 +01:00
bf27de1528 Correct go_rps12 so that it no longer passes the sequence as a variable
Former-commit-id: 0f9bb9472a53aa16a91a9cab5106ee66ee781c34
Former-commit-id: 016607c59e62105850d1d25f29bfe214943abc5c
2023-05-16 13:39:01 +02:00
785e0a6226 Small patches
Former-commit-id: 7f32ef237be64d3f81353241462f0b6c8f68d3c5
Former-commit-id: 8eb0147cc85f241e89399c4d3a9c7b5b2f52e215
2023-05-15 20:48:44 +02:00
ed5b28b14f Patch regular patterns
Former-commit-id: 4c05238859cbb95c68902dbfb0b8f5d91f9f82d0
Former-commit-id: 15ae6fd0b11548a0701c99c9305232d5a238d39d
2023-05-15 14:57:16 +02:00
06f36ccdd3 Add the transsplicing qualifier
Former-commit-id: 1b155125047cbee1cccd12ee6865502f36172566
Former-commit-id: bf4174556214216eeb4e1720c5e9e3cb482bae2b
2023-04-29 07:08:26 +02:00
031e18a8bd Change translate function to deal with start codons
Former-commit-id: 8d15cb5175de1774a1cb366f7a92ef99f8517af5
Former-commit-id: 58421d7b8dd6855efe9770499e48a4cca6d9e1fd
2023-04-29 07:07:03 +02:00
5a7b869170 Add better management of initiation codons and create a translation exception when required
Former-commit-id: 878d919fdaad16e6e2645b62b3a53ef5d5e1ef2b
Former-commit-id: 3c3647cf114438a1ea9c3ff8c44e67e367929776
2023-04-29 07:04:09 +02:00
3b43762ced some blast tricks
Former-commit-id: 9633c56d33c52ecf97fbc2c40751fd00b2acd09b
Former-commit-id: 15a6398f751070645cd2b14766abaf209b1222ce
2022-02-17 18:43:15 +01:00
9d93a68b3a Change setup for the blast filtering before exonerate
Former-commit-id: 139685eca58c1fb2272854dee31de3821c54af80
Former-commit-id: dc5c345ce72e9895cbdcc3321499b869040a24da
2022-02-17 18:41:27 +01:00
831669433e Switch to a Swissprot-based reference database for CDS annotation
Former-commit-id: 3da31ce8a135394ecac041291134d61f11f06d8f
Former-commit-id: 406f41a7cb2db14ea832480b86f72a11d3b0ab4a
2022-02-16 22:50:17 +01:00
90b3ee9b04 Do correct renaming of RPS12 genes if several
Former-commit-id: 8ddbfaea302c440aa0992f3443632cf026b0d3a9
Former-commit-id: 2559779ab79d1b52d5193e1a60b443f6290dda48
2022-02-14 15:29:02 +01:00
616fd2bb44 A script to help cluster the reference database for CDS annotation
Former-commit-id: 7babc60d47f433efd1301fbbe2a5714bfe7f7658
Former-commit-id: cf45c79769c6204598dd456573846496e4e834c0
2022-02-14 15:10:47 +01:00
d56aeaf698 Remove extra feature for CDS
Former-commit-id: 19b149eb57227e4ff3e7dda97f0328207fbc6373
Former-commit-id: ef94884d026004aa80d0fed85121c525cf5610b4
2022-02-14 15:09:17 +01:00
59fcad1c42 Adds detection of RPS12 and management of locus tags
Former-commit-id: b9b17708eaaa27580f1e99bd3c375d4b6aba4d79
Former-commit-id: 369361ffa58e65b19ab1005bdf7960924f24ca08
2022-02-14 14:21:50 +01:00
616d5d084b change tRNA and CDS annotations
Former-commit-id: 12b6c5605f57940e215643b80c93ffbb48d5406e
Former-commit-id: 18663d59e90e6d35b029d9087b66723487b8db1d
2021-11-05 09:29:57 +01:00
59a53bf482 Patch the detection algorithm (the overlap detection)
Former-commit-id: 7aca679a3425b6f5505f6122f2a58d1c5cd14663
Former-commit-id: 85fa6c3f1934391e952feb71f46300662034eaef
2021-11-04 13:42:35 +01:00
e4627ced6e Switch the go_cds script from tcsh to bash
Former-commit-id: 36041f96b5bb1411a4ac6fecccfbc24b9b90baff
Former-commit-id: 6e63fdff4022a2bb895a44eb6009f41d049ba4ae
2021-11-04 13:36:28 +01:00
2c1d15c227 Adds the detection of the RPS12 gene (Gene with trans-splicing)
Former-commit-id: 2396df183a925fbc1a8b398ee8dd4e12ca3c255f
Former-commit-id: 309796fcdac8cf4b6379eae6418dcf1d6db21bb3
2021-11-03 13:19:01 +01:00
40feaadd43 Move the script used for clusterizing protein DBs
Former-commit-id: c27edd09d88f05618e33ac55deb6af0a9f69329c
Former-commit-id: 933bb60387f3903f4a5ffd8ff3ad20b16aff23bb
2021-06-01 09:53:10 +02:00
15f033332c Patch a bug leading to a double pseudogene tagging
Former-commit-id: 35e27b66dc2f350b72544626da12a758b40da071
Former-commit-id: d01e79b8e7450e4aa734a8d04e81573602a58fec
2018-11-20 17:39:38 +01:00
2ff6ff3308 If proteins are searched for without stop codons, add an extra option
PASS1_LOOK_FOR_PSEUDO that allows a second search with stop codons permitted
(pseudogene search).

The PASS1_ALLOW_STOP is set back to 0 and the new PASS1_LOOK_FOR_PSEUDO
is set to 1

Former-commit-id: 318327af6bdc3fbdfbe7f438ff7cbea22863a0ab
Former-commit-id: a130baf2b1c3bf1158d367d3633b02600f04674a
2018-11-20 16:02:23 +01:00
a040adb132 Check the translation for stop codons and add a pseudogene qualifier if
any are present.

Former-commit-id: 11b612fcdfa1fdd2a2614148b5b1772954e62e70
Former-commit-id: 02c87c99e5ece530640e521a577867e74ed1541e
2018-11-20 15:59:57 +01:00
4f18ef51d0 Redirect output of pushd and popd to /dev/null
Former-commit-id: e6ce2c7387b5abd0ef3be9b58c23bbfe596a5aff
Former-commit-id: 85e9495c91660380d531efb63a8f81aa393805cf
2018-05-11 16:20:39 +02:00
fc821d6be8 Final small changes to patch the bug related to complex filenames
Former-commit-id: c59d7b5e7f8c8f37e955e44b354521c312cfc2c4
Former-commit-id: e9d8bc55d4542b276e91104672ba7dddb53c0c6a
2018-01-25 08:53:27 +01:00
640294b47e Yet another attempt to solve the bug...
Former-commit-id: 0a5ece1e927034a7001e2e1bcd2743d9b9e3ec6d
Former-commit-id: 0aafb797b73c8beb4d8662784c8537e6f0c13c5d
2018-01-24 16:41:35 +01:00
44a75f6fd7 Comment out phase 2 CDS searching
Former-commit-id: ca048f8c762475a2ca02735a20b90576b0222462
Former-commit-id: 455ffc2945c49f701f7406930fbe2e4e166d172d
2018-01-24 16:12:49 +01:00
238b500e1a Add missing file...
Former-commit-id: f71b0396212bb8cd2df1ca1a4e5847f30c613a48
Former-commit-id: 17cc9616d8835548e996712545d4cc0e1833f90f
2018-01-24 15:13:31 +01:00
8d2ec19fe8 Patch a bug when launching exonerate on complex filenames
Former-commit-id: e8357a639a22cb123985a0ed487dfd4018c9bb0a
Former-commit-id: a2e1c2ce75c0eac9574b7a68506f6f209e54ea89
2018-01-24 15:07:04 +01:00
f74bb0d973 Patch a bug blocking the exonerate execution when the genome filename is
too long or complex

Former-commit-id: a9da8eab920f422609b41be2e16d65e0569f953c
Former-commit-id: 6829ae3081bea4a1d16ec8d3bad10e51f01f51d7
2018-01-23 07:32:12 +01:00
08d7c940a4 Patch a bug in the final sequence formatting occurring when the input
sequence does not have 60 characters per line

Former-commit-id: 213735f5b9f3cd817053e284d7844cfdd69726c6
Former-commit-id: 074b4aaac0eac00de9b3b48e75804417ce780a2d
2018-01-18 21:58:50 +01:00
04ea0f110d Allows for reporting
Former-commit-id: af7999b0f3c69be9c796799813950adbdb0fb0e8
Former-commit-id: f8a6f2a26c58a02aa6d076bd3005a02f906de82a
2016-10-20 09:31:54 -03:00
1ac0af03c2 Patch the new ycf1 specific parameters
Former-commit-id: 66f848b351a6b8186ff03a7059aa167f39ed29a1
Former-commit-id: fd4260434739725ff967138089eaeeb013812784
2016-10-09 07:19:35 -03:00
8156d5dd2f Add specific exonerate parameters for ycf1
Former-commit-id: c956dde7ad2183b72fe5221333876747db97b361
Former-commit-id: 5ddf35ea93eadadecb063277afd513e8ae73e559
2016-10-09 07:11:20 -03:00
001c1dcac1 For a given protein, consider only clusters with a score of at least 95% of
the best score

Former-commit-id: cfdc6fcd37a4036d8bcca27bc7e120e60a94998d
Former-commit-id: f45bb7922f28165fd3baa1bc67bf815a759d1590
2016-10-09 04:24:08 -03:00
54413e7420 Change awk to $AwkCmd
Signed-off-by: Eric Coissac <eric.coissac@metabarcoding.org>
Former-commit-id: 79d7c6cc4333c8f72cef71f9c5323c151bb0e6b7
Former-commit-id: 869cf28bb894c95297fc0f80e424a55d347f2a65
2016-10-09 01:25:57 -03:00
87453701b7 Change some parameters in program calls
Former-commit-id: 3ed8760844007def1d8c5a9cf4eaee01d571fe0b
Former-commit-id: b15127c8f8a601b33e09daccc645cbb8a1f23a2e
2016-10-06 12:37:57 -03:00
4992483b80 Change some blastx parameters to get better matches by taking into
account intron size and the correct genetic code

Former-commit-id: 6600123fbdce2070058074e82c791c7fc260c39b
Former-commit-id: ac413cc4a49844d4fa4087107aa84680d36f3df1
2016-10-06 12:36:43 -03:00
e4f3081fa8 Switch to the speedup mode because of the slowdown imposed by the new
exonerate parameters

Former-commit-id: 30f2caea735460bcc4dfa61adde72d7da2fb6f2e
Former-commit-id: 0537c77f5bc16d766b3cbd668dcd1e1711140937
2016-10-06 12:35:32 -03:00
16b5e2927d Make changes to better detect frameshifted pseudogenes and annotate them
correctly

Former-commit-id: d827908d63149941538e686b48f60a132173cb80
Former-commit-id: 2841c75b415c6c8fa35a6a90e23cf82c3c84408b
2016-10-06 10:06:37 -03:00
860cd217d4 Add the management of pseudogenes
Former-commit-id: 26d91366e483cf17c440b251ab1e8ac5390699fe
Former-commit-id: 0d3d69ba351bd174fe08387a474fd1137559e38a
2016-10-06 08:56:45 -03:00
d4da1d01fd A new cleaned set of proteins for the CDS detector, prepared using the
clusterizecore.sh script from the detectors/cds/lib folder.

The CDS detector is now modified to use the clean.fst files.


Former-commit-id: e30a53b5b6b658388af4b2640b30e6765c729894
Former-commit-id: 3015ad50d25248fb117ab00e816b00fde1f9ba1d
2016-10-05 09:31:24 -03:00
20d0bcfbf8 First attempt to automatically clean up the core CDS database
Former-commit-id: dc61a61816084f385f1aa89324b08f81602b4353
Former-commit-id: ee8bf1a08e4af4f4d8d12a1e2a83c5f688e5f7e8
2016-04-25 23:41:18 +02:00