annotate

Author	SHA1	Message	Date
Eric Coissac	15f033332c	Patch a bug leading to a double pseudogene tagging Former-commit-id: 35e27b66dc2f350b72544626da12a758b40da071 Former-commit-id: d01e79b8e7450e4aa734a8d04e81573602a58fec	2018-11-20 17:39:38 +01:00
Eric Coissac	2ff6ff3308	If proteins are searched for without stop codons, add an extra option PASS1_LOOK_FOR_PSEUDO that allows a second search with stop codons allowed (pseudogene search). PASS1_ALLOW_STOP is set back to 0 and the new PASS1_LOOK_FOR_PSEUDO is set to 1 Former-commit-id: 318327af6bdc3fbdfbe7f438ff7cbea22863a0ab Former-commit-id: a130baf2b1c3bf1158d367d3633b02600f04674a	2018-11-20 16:02:23 +01:00
Eric Coissac	a040adb132	Check the translation for stop codon and add a pseudogene qualifier if present. Former-commit-id: 11b612fcdfa1fdd2a2614148b5b1772954e62e70 Former-commit-id: 02c87c99e5ece530640e521a577867e74ed1541e	2018-11-20 15:59:57 +01:00
Eric Coissac	4f18ef51d0	Redirect output of pushd and popd to /dev/null Former-commit-id: e6ce2c7387b5abd0ef3be9b58c23bbfe596a5aff Former-commit-id: 85e9495c91660380d531efb63a8f81aa393805cf	2018-05-11 16:20:39 +02:00
Eric Coissac	c691818059	Changes in .gitignore Former-commit-id: 7e9fcd4ed6487e52562b274d87345e0e46f1458d Former-commit-id: bf7a58dcf96db9f8c132977aa7d3f6af97cce0f5	2018-04-05 18:15:40 +02:00
Eric Coissac	0a5a65ab26	Change the notation algorithm to take advantage of the new CAU tRNA reference library Former-commit-id: 32650f41c4a7f95ce5da78c1f520438b35c1d4d1 Former-commit-id: 7ee31aaed2aca437b689fc7930095279fce0051b	2018-04-05 17:59:12 +02:00
Eric Coissac	81657a288a	Modify script to accept compressed genome files Former-commit-id: f816e3ce8b10e2ca3f1aa9ae969c24e699368e25 Former-commit-id: 16fb412552debdfd2172926e8a8b63be05257bdf	2018-04-05 17:58:19 +02:00
Eric Coissac	ee634cc779	Simplify CAU tRNA reference database building to keep only CAU tRNA from plastomes where the three categories of CAU tRNA (Met/Ile/fMet) are annotated Former-commit-id: 67dc445698e22fe8a503c6700977c79e4817d302 Former-commit-id: 6e84303543b0752a7946bdde6e5114cfe6eef8da	2018-04-05 17:55:31 +02:00
Eric Coissac	fc821d6be8	Final small changes to patch the bug related to complex filenames Former-commit-id: c59d7b5e7f8c8f37e955e44b354521c312cfc2c4 Former-commit-id: e9d8bc55d4542b276e91104672ba7dddb53c0c6a	2018-01-25 08:53:27 +01:00
Eric Coissac	640294b47e	Yet another attempt to solve the bug... Former-commit-id: 0a5ece1e927034a7001e2e1bcd2743d9b9e3ec6d Former-commit-id: 0aafb797b73c8beb4d8662784c8537e6f0c13c5d	2018-01-24 16:41:35 +01:00
Eric Coissac	44a75f6fd7	Comment out phase 2 CDS searching Former-commit-id: ca048f8c762475a2ca02735a20b90576b0222462 Former-commit-id: 455ffc2945c49f701f7406930fbe2e4e166d172d	2018-01-24 16:12:49 +01:00
Eric Coissac	238b500e1a	Add missing file... Former-commit-id: f71b0396212bb8cd2df1ca1a4e5847f30c613a48 Former-commit-id: 17cc9616d8835548e996712545d4cc0e1833f90f	2018-01-24 15:13:31 +01:00
Eric Coissac	8d2ec19fe8	Patch a bug when launching exonerate on a complex filename Former-commit-id: e8357a639a22cb123985a0ed487dfd4018c9bb0a Former-commit-id: a2e1c2ce75c0eac9574b7a68506f6f209e54ea89	2018-01-24 15:07:04 +01:00
Eric Coissac	f74bb0d973	Patch a bug blocking the exonerate execution when the genome filename is too long or complex Former-commit-id: a9da8eab920f422609b41be2e16d65e0569f953c Former-commit-id: 6829ae3081bea4a1d16ec8d3bad10e51f01f51d7	2018-01-23 07:32:12 +01:00
Eric Coissac	a25ab81b38	Add logs to print the sequence length and if the sequence is reverse complemented Former-commit-id: ba55f354ea7a51119fe44bcb36aa5927194293e2 Former-commit-id: dd7715be54ac92c9625f0a2c30e572b7aee76dc7	2018-01-18 22:00:07 +01:00
Eric Coissac	08d7c940a4	Patch a bug in the final sequence formatting occurring when the input sequence does not have 60 characters per line Former-commit-id: 213735f5b9f3cd817053e284d7844cfdd69726c6 Former-commit-id: 074b4aaac0eac00de9b3b48e75804417ce780a2d	2018-01-18 21:58:50 +01:00
Eric Coissac	04ea0f110d	Allows for reporting Former-commit-id: af7999b0f3c69be9c796799813950adbdb0fb0e8 Former-commit-id: f8a6f2a26c58a02aa6d076bd3005a02f906de82a	2016-10-20 09:31:54 -03:00
Eric Coissac	1ac0af03c2	Patch the new ycf1 specific parameters Former-commit-id: 66f848b351a6b8186ff03a7059aa167f39ed29a1 Former-commit-id: fd4260434739725ff967138089eaeeb013812784	2016-10-09 07:19:35 -03:00
Eric Coissac	8156d5dd2f	Add specific exonerate parameters for ycf1 Former-commit-id: c956dde7ad2183b72fe5221333876747db97b361 Former-commit-id: 5ddf35ea93eadadecb063277afd513e8ae73e559	2016-10-09 07:11:20 -03:00
Eric Coissac	001c1dcac1	For a given protein, consider only clusters with a score of at least 95% of the best score Former-commit-id: cfdc6fcd37a4036d8bcca27bc7e120e60a94998d Former-commit-id: f45bb7922f28165fd3baa1bc67bf815a759d1590	2016-10-09 04:24:08 -03:00
Eric Coissac	54413e7420	Change awk to $AwkCmd Signed-off-by: Eric Coissac <eric.coissac@metabarcoding.org> Former-commit-id: 79d7c6cc4333c8f72cef71f9c5323c151bb0e6b7 Former-commit-id: 869cf28bb894c95297fc0f80e424a55d347f2a65	2016-10-09 01:25:57 -03:00
Eric Coissac	87453701b7	Change some parameters in program calls Former-commit-id: 3ed8760844007def1d8c5a9cf4eaee01d571fe0b Former-commit-id: b15127c8f8a601b33e09daccc645cbb8a1f23a2e	2016-10-06 12:37:57 -03:00
Eric Coissac	4992483b80	Change some blastx parameters to get better matches by taking into account intron size and the correct genetic code Former-commit-id: 6600123fbdce2070058074e82c791c7fc260c39b Former-commit-id: ac413cc4a49844d4fa4087107aa84680d36f3df1	2016-10-06 12:36:43 -03:00
Eric Coissac	e4f3081fa8	Switch to the speedup mode because of the slowdown imposed by the new exonerate parameters Former-commit-id: 30f2caea735460bcc4dfa61adde72d7da2fb6f2e Former-commit-id: 0537c77f5bc16d766b3cbd668dcd1e1711140937	2016-10-06 12:35:32 -03:00
Eric Coissac	16b5e2927d	Make changes to better detect frameshifted pseudogenes and annotate them correctly Former-commit-id: d827908d63149941538e686b48f60a132173cb80 Former-commit-id: 2841c75b415c6c8fa35a6a90e23cf82c3c84408b	2016-10-06 10:06:37 -03:00
Eric Coissac	860cd217d4	Add the management of pseudogenes Former-commit-id: 26d91366e483cf17c440b251ab1e8ac5390699fe Former-commit-id: 0d3d69ba351bd174fe08387a474fd1137559e38a	2016-10-06 08:56:45 -03:00
Eric Coissac	d4da1d01fd	A new set of cleaned proteins for the CDS detector, prepared using the clusterizecore.sh script from the detectors/cds/lib folder. The CDS detector is now modified to use the clean.fst files. Former-commit-id: e30a53b5b6b658388af4b2640b30e6765c729894 Former-commit-id: 3015ad50d25248fb117ab00e816b00fde1f9ba1d	2016-10-05 09:31:24 -03:00
Eric Coissac	466308267e	Add a patch for chloroplast annotation when no inverted repeats are detected Former-commit-id: 7e3ddd41cf0d0788223382fedbf45b183974233e Former-commit-id: e5a8ceb825f78d243e37d22cd6b2e91f403c0ee8	2016-05-02 15:32:28 +02:00
Eric Coissac	8113b80d47	Add annotation of nuclear rDNA cistron Former-commit-id: ee54019ddddbea4d17956622968f6ce673b609e1 Former-commit-id: 5e5381cf59409ca3dc01098b0e3f330efe0a6a32	2016-05-02 10:56:40 +02:00
Eric Coissac	20d0bcfbf8	First attempt to automatically clean up the core CDS database Former-commit-id: dc61a61816084f385f1aa89324b08f81602b4353 Former-commit-id: ee8bf1a08e4af4f4d8d12a1e2a83c5f688e5f7e8	2016-04-25 23:41:18 +02:00
Eric Coissac	536a451510	Call tcsh explicitly to work around a path bug Former-commit-id: e6c05a695a6872dd5fb8acd96ee031844dd21fa0 Former-commit-id: 7740135e0861b796e85fce0c9c62a4793f836c2b	2016-04-13 17:32:10 +02:00
Eric Coissac	f466f5505a	Change the shebang line of the csh shell scripts Former-commit-id: 115a1955c5883ffd0909cb05e887f70fa561b6e6 Former-commit-id: 5e6be182d5a3ec910f5deed27014227f34bd4745	2016-04-13 16:51:58 +02:00
Eric Coissac	69434c5b86	Add the latest tcsh able to deal with large PATH (at least 4096) Former-commit-id: 32011d9b239e2c5ed93646a8173b285f377693a3 Former-commit-id: 6e804387bfacfc4e9242ef3f7014642044f3aa2c	2016-04-13 16:21:50 +02:00
Eric Coissac	ab37af3b03	Add the name of the org.annot pipeline in the CDS inference Former-commit-id: 497194fafc15da0d80ee7dcb4cf11551d21061bd Former-commit-id: ea502a0d75d7ff638258a5a15b8ff759cd6e28fa	2015-12-18 08:56:55 +01:00
Eric Coissac	a4e053989b	Specify the genetic code during the aragorn call. Former-commit-id: 6f18008c34dcb33059accc02edef681a26848416 Former-commit-id: a7313f06a23a307a0384b88e3bc8a1d7b9292e07	2015-12-18 08:39:48 +01:00
Eric Coissac	cf54e7dcb1	Close #15. The bug in intron location was actually related to a misinterpretation of the aragorn output format. tRNA and intron locations are now consistent with most of the locations extracted from GenBank files, with one or two base pairs of difference. Former-commit-id: dac4fb731e0edaeaebde9edc5350fce38ad99601 Former-commit-id: f8a0590342aec2db1fe5deb4475b8a9380891a48	2015-12-18 08:39:04 +01:00
Eric Coissac	89c4f17fc4	Patch a bug in generating the tRNA location for genes on the reverse-complement strand with an intron. Former-commit-id: 729905450d60c9b2e76ac73567b3efb09cb1bb86 Former-commit-id: 722dc77682ef3da8a746879c52072c46adb9de71	2015-11-28 16:11:14 +01:00
alain viari	b7282fb30d	minor addition in cds/compare Former-commit-id: e865ea931fb2fc76f49b72d823eda712138647e3 Former-commit-id: 3d8d2bd249907fa4fbb7fae2ee06cf6090f62d5e	2015-11-15 13:13:36 +01:00
alain viari	2d404b5b24	removed need of R igraph from chlorodb/subdb Former-commit-id: 574aace9be5804d728a877110f5f475d61644f75 Former-commit-id: 2e7ea63447643830a62f18a364327d7b396ec140	2015-11-14 22:13:55 +01:00
alain viari	d83201fd2f	minor bug in chlorodb Former-commit-id: 7017655ac86e7b7837c7b581bf8a1abb86c08b30 Former-commit-id: dcedd4e32e3c7ce302eed94abd2b975a4506df97	2015-11-14 15:16:16 +01:00
alain viari	6f43ede11e	cds test on core and shell Former-commit-id: 9be1f2c23d00a2678489090c4f6d04ffc0124061 Former-commit-id: 823ca0890900bf6f81b158cafc46c78049fcf080	2015-11-13 22:41:34 +01:00
alain viari	405631f527	cds go_test bug fixed Former-commit-id: f73133dca83d02a0c223e98a3ac82fdb0d03c5ae Former-commit-id: 3db7c0037f7c109f4479490480d4323a55206c6a	2015-11-13 22:37:22 +01:00
alain viari	42707c281c	added test for chlorodb Former-commit-id: 639cbbdc91a6c7f11544dbbe1fa0c47e1e28eaad Former-commit-id: 59f6ff3f727d01f3ed4d553a554b322b24119b06	2015-11-13 18:53:47 +01:00
alain viari	13b03062d5	add doc for chlorodb Former-commit-id: 55fd288275b46bd02170029b9bc683ad34c3c611 Former-commit-id: 329bedc41c4355741a5bbd3f2056ac1025f3da7f	2015-11-13 17:55:24 +01:00
alain viari	e4d6a8484d	cds/tools/chlorodb added Former-commit-id: 0579e878a69b7c285ca71870e9ca5730649a2fda Former-commit-id: 7cced5b488441d87bf070a9a444317db0e048880	2015-11-13 17:41:18 +01:00
alain viari	0d5f0c1f20	summary added in comparison Former-commit-id: 9a267727234dc9026ce3a54f543e62d8f609945a Former-commit-id: 556a34a214aea8824f34d4a2c117f09527d88146	2015-11-11 18:51:15 +01:00
alain viari	dd9b23bc77	Merge branch 'master' of git.metabarcoding.org:org-asm/org-annotate Former-commit-id: 14936719198c993d2e38b2c4d8f78dfa5c46c0b4 Former-commit-id: 8e795225e073e077fbc6835b9d9746b6c6ed95cf	2015-11-10 22:15:29 +01:00
alain viari	9108ce75f1	fixed too many partial CDS bug Former-commit-id: d733a46f4e92f755f38e452f03a28062de6739f1 Former-commit-id: 36bdc324d2b9a0491d07d40a7e68a4cf7ea73984	2015-11-10 22:15:01 +01:00
alain viari	262995a486	added cds/tools compare and chlorodb Former-commit-id: 31633e6bc503eb08ddb507e58e3ab1a6d2ba6027 Former-commit-id: 830c771b5453f3482f283ee069234a34127bf08f	2015-11-10 09:29:05 +01:00
Eric Coissac	813e3958ba	Change minimum length for considering a match from 1000 to 100. Former-commit-id: 6bd827ca011ee71d83e98710edc837f56a089875 Former-commit-id: 454f080c0b163f238951541eec23b5946f914f28	2015-11-09 17:35:59 +01:00


72 Commits