Commit Graph

773 Commits

Author SHA1 Message Date
Eric Coissac 500144051a Add jj Makefile targets and k-mer encoding utilities
Add new Makefile targets for jj operations (jjnew, jjpush, jjfetch) to streamline commit workflow.

Introduce k-mer encoding utilities in pkg/obikmer:
- EncodeKmers: converts DNA sequences to encoded k-mers
- ReverseComplement: computes reverse complement of k-mers
- NormalizeKmer: returns canonical form of k-mers
- EncodeNormalizedKmers: encodes sequences with normalized k-mers

Add comprehensive tests for k-mer encoding functions including edge cases, buffer reuse, and performance benchmarks.

Document k-mer index design for large genomes, covering:
- Use cases and objectives
- Volume estimations
- Distance metrics (Jaccard, Sørensen-Dice, Bray-Curtis)
- Indexing options (Bloom filters, sorted sets, MPHF)
- Optimization techniques (k-2-mer indexing)
- MinHash for distance acceleration
- Recommended architecture for presence/absence and counting queries
2026-02-04 14:27:10 +01:00
coissac 740f66b4c7 Merge pull request #71 from metabarcoding/push-onwzsyuooozn
Implémentation du filtrage unique basé sur séquence et catégories
2026-01-14 19:19:27 +01:00
Eric Coissac b49aba9c09 Implémentation du filtrage unique basé sur séquence et catégories
Ajout d'une fonctionnalité pour le filtrage unique qui prend en compte à la fois la séquence et les catégories.

- Modification de la fonction ISequenceChunk pour accepter un classifieur unique optionnel
- Implémentation du traitement unique sur disque en utilisant un classifieur composite
- Mise à jour du classifieur utilisé pour le tri sur disque
- Correction de la gestion des clés de unicité en utilisant le code et la valeur du classifieur
- Mise à jour du numéro de commit
2026-01-14 19:18:17 +01:00
coissac 52244cdb64 Merge pull request #70 from metabarcoding/push-kuwnszsxmxpn
Refactor chunk processing and update version commit
2026-01-14 18:47:17 +01:00
Eric Coissac 0678181023 Refactor chunk processing and update version commit
Optimize chunk processing by moving variable declarations inside the loop and update the commit hash in version.go to reflect the latest changes.
2026-01-14 18:46:04 +01:00
coissac f55dd553c7 Merge pull request #68 from metabarcoding/push-rrulynolpprl
Push rrulynolpprl
2026-01-14 17:44:36 +01:00
coissac 4a383ac6c9 Merge branch 'master' into push-rrulynolpprl 2025-12-18 14:12:56 +01:00
Eric Coissac 371e702423 obiannotate --cut bug 2025-12-18 14:11:11 +01:00
Eric Coissac ac0d3f3fe4 Update obiuniq for very large dataset 2025-12-18 14:11:11 +01:00
Eric Coissac 547135c747 End of obilowmask 2025-12-03 11:49:07 +01:00
coissac f4a919732e Merge pull request #65 from metabarcoding/push-yurwulsmpxkq
End of obilowmask
2025-11-26 12:13:08 +01:00
Eric Coissac e681666aaa End of obilowmask 2025-11-26 11:14:56 +01:00
coissac adf2486295 Merge pull request #64 from metabarcoding/push-yurwulsmpxkq
End of obilowmask
2025-11-24 15:36:20 +01:00
Eric Coissac 272f5c9c35 End of obilowmask 2025-11-24 15:27:38 +01:00
coissac c1b9503ca6 Merge pull request #63 from metabarcoding/push-vypwrurrsxuk
obicsv bug with stat on value map fields
2025-11-21 14:04:34 +01:00
Eric Coissac 86e60aedd0 obicsv bug with stat on value map fields 2025-11-21 14:03:31 +01:00
coissac 961abcea7b Merge pull request #61 from metabarcoding/push-mvxssvnysyxn
Push mvxssvnysyxn
2025-11-21 13:25:19 +01:00
Eric Coissac 57c65f9d50 obimatrix bug 2025-11-21 13:24:24 +01:00
Eric Coissac e65b2a5efe obimatrix bugs 2025-11-21 13:24:06 +01:00
coissac 3e5f3f76b0 Merge pull request #60 from metabarcoding/push-qpnzxskwpoxo
Push qpnzxskwpoxo
2025-11-18 15:35:41 +01:00
Eric Coissac ccc827afd3 finalise obilowmask 2025-11-18 15:33:08 +01:00
Eric Coissac cef29005a5 debug url reading 2025-11-18 15:30:20 +01:00
Eric Coissac 4603d7973e implementation de obilowmask 2025-11-18 15:30:20 +01:00
coissac 8bc47c13d3 Merge pull request #58 from metabarcoding/push-vxkqkkrokwuz
debug obimultiplex
2025-11-06 15:44:31 +01:00
Eric Coissac 07cdd6f758 debug obimultiplex
bug option obimultiplex
2025-11-06 15:43:13 +01:00
coissac 432da366e2 Merge pull request #57 from metabarcoding/push-ywktmvpvtvmv
debug taxonomy core dump
2025-11-05 19:07:41 +01:00
Eric Coissac 2d7dc7d09d debug taxonomy core dump 2025-11-05 19:01:15 +01:00
coissac 5e12ed5400 Merge pull request #56 from metabarcoding/push-tnrvpwvqtzyo
update install script
2025-11-04 18:11:21 +01:00
Eric Coissac 7500ee1d15 update install script 2025-11-04 18:09:15 +01:00
coissac 5a1d66bf06 Merge pull request #53 from metabarcoding/push-skmxzrzulvtq
Push skmxzrzulvtq
2025-10-28 14:27:19 +01:00
Eric Coissac 0844dcc607 bug obimatrix 2025-10-28 13:57:31 +01:00
Eric Coissac 7f4ebe757e Bug obiuniq - don't clean the chunks 2025-10-28 13:50:22 +01:00
coissac 5150947e23 Merge pull request #51 from metabarcoding/push-urtwmwktsrru
Push urtwmwktsrru
2025-10-20 17:41:33 +02:00
Eric Coissac d17a9520b9 work on obiclean chimera detection 2025-10-20 17:29:47 +02:00
Eric Coissac 29bf4ce871 add a feature to obimatrix adding obicsv option to obimatrix 2025-10-20 16:34:58 +02:00
coissac d7ed9d343e Update install_obitools.sh for missing directory 2025-10-15 08:32:06 +02:00
Eric Coissac 82b6bb1ab6 correct a bug in func (worker SeqWorker) ChainWorkers(next SeqWorker) SeqWorker 2025-08-11 15:09:49 +02:00
Eric Coissac 6d204f6281 Patch the fastq detector 2025-08-08 10:23:03 -04:00
Eric Coissac 7a6d552450 Changes to be committed:
modified:   pkg/obioptions/version.go
2025-08-07 17:01:48 -04:00
Eric Coissac 412b54822c Patch a bug in obliclean for d>1 leading to some instability in the result 2025-08-07 17:01:38 -04:00
Eric Coissac 730d448fc3 Allows for only one cpu and it should work 2025-08-06 16:09:25 -04:00
Eric Coissac 04f3af3e60 some renaming of functions 2025-08-06 15:54:50 -04:00
Eric Coissac 997b6e8c01 correct the fastq detector for distinguish with a csv ngsfilter 2025-08-06 15:52:54 -04:00
Eric Coissac f239e8da92 Rename ISequenceChunk 2025-08-05 08:49:45 -04:00
Eric Coissac ed28d3fb5b Adds a --u-to-t option 2025-07-07 15:35:26 +02:00
Eric Coissac 43b285587e Debug on taxonomy extraction and CSV conversion 2025-07-07 15:29:40 +02:00
Eric Coissac 8d53d253d4 Add a reading option on readers to convet U to T 2025-07-07 15:29:07 +02:00
Eric Coissac 8c26fc9884 Add a new test on obisummary 2025-07-07 15:28:29 +02:00
Eric Coissac 235a7e202a Update obisummary to account new obiseq.StatsOnValues type 2025-06-19 17:21:30 +02:00
Eric Coissac 27fa984a63 Patch obimatrix accoring to the new type obiseq.StatsOnValues 2025-06-19 16:51:53 +02:00