Add genomic distance benchmarking suite and test data

Introduces scripts to compute and validate pairwise genomic distance matrices across multiple metrics. Updates the Makefile with build and comparison targets, adds .gitignore rules for generated outputs, and includes test CSV matrices and a Newick phylogenetic tree for validating the distance computation pipeline.
Eric Coissac
2026-06-22 17:28:48 +02:00
parent 9f1df96ea7
commit 469e53b6f5
7 changed files with 541 additions and 2 deletions
@@ -0,0 +1,21 @@
genome,Candidozyma_auris--GCF_003013715.1_ASM301371v2,Acidobacterium_capsulatum--ATCC_51196,Bacillus_subtilis--168,Escherichia_coli--CFT073,Escherichia_coli--EDL933,Escherichia_coli--K-12_MG1655,Escherichia_coli--K-12_W3110,Klebsiella_pneumoniae--ATCC_13883,Klebsiella_pneumoniae--HS11286,Klebsiella_pneumoniae--MGH_78578,Opitutus_terrae--PB90-1,Proteus_mirabilis--HI4320,Saccharolobus_islandicus--M.16.4,Salmonella_enterica--AKU_12601,Salmonella_enterica--CT18,Salmonella_enterica--LT2,Salmonella_enterica--P125109,Shouchella_clausii--KSM-K16,Wolbachia_endosymbiont--GCF_000306885.1_ASM30688v1,Yersinia_ruckeri--YRB
Candidozyma_auris--GCF_003013715.1_ASM301371v2,0,0,0,0,0,0,0,0,0,0,0,0,8,0,1,0,0,0,0,3
Acidobacterium_capsulatum--ATCC_51196,0,0,203,119,128,141,140,116,109,111,78,112,0,136,109,147,134,117,55,129
Bacillus_subtilis--168,0,203,0,124,132,128,123,133,109,130,66,158,6,131,112,124,135,2393,46,124
Escherichia_coli--CFT073,0,119,124,0,1966777,1998059,1999094,117743,32029,22312,63,4225,0,74946,31918,73311,76585,113,128,7854
Escherichia_coli--EDL933,0,128,132,1966777,0,2627885,2628700,52488,20134,22064,48,4202,0,74655,28602,71244,74665,112,108,7963
Escherichia_coli--K-12_MG1655,0,141,128,1998059,2627885,0,4452541,48302,21382,24602,47,4277,0,75729,30449,73622,76778,119,111,8566
Escherichia_coli--K-12_W3110,0,140,123,1999094,2628700,4452541,0,47894,21226,24470,68,4278,0,75658,30207,73614,76583,112,108,8660
Klebsiella_pneumoniae--ATCC_13883,0,116,133,117743,52488,48302,47894,0,1416091,1477759,42,4172,0,48296,18988,48144,50416,120,106,7712
Klebsiella_pneumoniae--HS11286,0,109,109,32029,20134,21382,21226,1416091,0,644063,42,2738,0,21498,29758,21606,21376,99,102,4417
Klebsiella_pneumoniae--MGH_78578,0,111,130,22312,22064,24602,24470,1477759,644063,0,42,2614,0,19948,35067,21330,20813,97,102,4374
Opitutus_terrae--PB90-1,0,78,66,63,48,47,68,42,42,42,0,43,18,57,42,53,66,39,58,43
Proteus_mirabilis--HI4320,0,112,158,4225,4202,4277,4278,4172,2738,2614,43,0,0,4254,2481,4166,4215,131,103,4704
Saccharolobus_islandicus--M.16.4,8,0,6,0,0,0,0,0,0,0,18,0,0,0,0,0,0,0,0,0
Salmonella_enterica--AKU_12601,0,136,131,74946,74655,75729,75658,48296,21498,19948,57,4254,0,0,1047731,2857146,2951421,117,108,7643
Salmonella_enterica--CT18,1,109,112,31918,28602,30449,30207,18988,29758,35067,42,2481,0,1047731,0,917948,940297,106,106,3716
Salmonella_enterica--LT2,0,147,124,73311,71244,73622,73614,48144,21606,21330,53,4166,0,2857146,917948,0,3284800,122,108,7460
Salmonella_enterica--P125109,0,134,135,76585,74665,76778,76583,50416,21376,20813,66,4215,0,2951421,940297,3284800,0,134,124,7645
Shouchella_clausii--KSM-K16,0,117,2393,113,112,119,112,120,99,97,39,131,0,117,106,122,134,0,58,124
Wolbachia_endosymbiont--GCF_000306885.1_ASM30688v1,0,55,46,128,108,111,108,106,102,102,58,103,0,108,106,108,124,58,0,96
Yersinia_ruckeri--YRB,3,129,124,7854,7963,8566,8660,7712,4417,4374,43,4704,0,7643,3716,7460,7645,124,96,0
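A matrix in this layout can be sanity-checked before it is used as a validation fixture. The sketch below is not part of the commit. It assumes the CSV is meant to be square, symmetric, and zero on the diagonal, as the rows above appear to be. The helper names `load_matrix` and `check_matrix` and the labels `A`, `B`, `C` are hypothetical, and `SAMPLE` is a three-genome stand-in rather than the real file.

```python
import csv
import io


def load_matrix(text):
    """Parse a labelled square CSV matrix into (labels, rows of ints).

    The first header cell ("genome") is skipped; rows are reordered to
    follow the column order given in the header.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    labels = header[1:]
    rows = {}
    for record in reader:
        if not record:  # tolerate blank trailing lines
            continue
        rows[record[0]] = [int(value) for value in record[1:]]
    missing = [name for name in labels if name not in rows]
    if missing:
        raise ValueError(f"no row for column label(s): {missing}")
    return labels, [rows[name] for name in labels]


def check_matrix(labels, matrix):
    """Return a list of problems found (empty if the matrix looks sane)."""
    problems = []
    n = len(labels)
    for i, row in enumerate(matrix):
        if len(row) != n:
            problems.append(f"{labels[i]}: {len(row)} values, expected {n}")
    if problems:
        return problems  # shape is wrong, skip the cell-level checks
    for i in range(n):
        if matrix[i][i] != 0:
            problems.append(f"{labels[i]}: non-zero diagonal {matrix[i][i]}")
        for j in range(i + 1, n):
            if matrix[i][j] != matrix[j][i]:
                problems.append(
                    f"{labels[i]} / {labels[j]}: "
                    f"{matrix[i][j]} != {matrix[j][i]}"
                )
    return problems


# Illustrative stand-in with hypothetical labels, not the committed file.
SAMPLE = """genome,A,B,C
A,0,203,119
B,203,0,124
C,119,124,0
"""
```

Calling `check_matrix(*load_matrix(SAMPLE))` returns an empty list when every check passes. To check the committed file, read its contents from disk and pass the text to `load_matrix` in the same way.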