obitools4

mirror of https://github.com/metabarcoding/obitools4.git synced 2026-06-24 09:41:00 +00:00

Author	SHA1	Message	Date
Eric Coissac	00c8be6b48	docs: add architecture documentation for OBITools commands Add detailed documentation of the architecture of OBITools commands, covering the modular structure, architectural patterns, and best practices for creating new commands.	2026-02-07 12:26:35 +01:00
Eric Coissac	4ae331db36	Refactor SuperKmer extraction to use iterator pattern This commit refactors the SuperKmer extraction functionality to use Go's new iterator pattern. The ExtractSuperKmers function is now implemented as a wrapper around a new IterSuperKmers iterator function, which yields results one at a time instead of building a complete slice. This change provides better memory efficiency and more flexible consumption of super k-mers. The functionality remains the same, but the interface is now more idiomatic and efficient for large datasets.	2026-02-07 12:23:12 +01:00
Eric Coissac	f1e2846d2d	Improve the release process with automatic release-note generation Update the Makefile to improve the version bump and tag creation process. - Use variables to store the previous and current versions - Add automatic generation of release notes from the commits between tags - Add fallback logic when orla is not available - Improve the documentation of the release process steps - Update the tag creation command to use the generated message	2026-02-07 11:48:26 +01:00
coissac	cd5562fb30	Merge pull request #81 from metabarcoding/push-nrylumyxtxnr Push nrylumyxtxnr	2026-02-06 10:10:22 +01:00
Eric Coissac	f79b018430	Bump version to 4.4.11 Update version from 4.4.10 to 4.4.11 in version.txt and pkg/obioptions/version.go	2026-02-06 10:09:56 +01:00
Eric Coissac	aa819618c2	Enhance OBITools4 installation script with version control and documentation Update installation script to support specific version installation, list available versions, and improve documentation. - Add support for installing specific versions with -v/--version flag - Add -l/--list flag to list all available versions - Improve help message with examples - Update README.md to reflect new installation options and examples - Add note on version compatibility between OBITools2 and OBITools4 - Remove ecoprimers directory - Improve error handling and user feedback during installation - Add version detection and download logic from GitHub releases - Update installation process to use tagged releases instead of master branch Release_4.4.11	2026-02-06 10:09:54 +01:00
coissac	da8d851d4d	Merge pull request #80 from metabarcoding/push-vvonlpwlnwxy Remove ecoprimers submodule	2026-02-06 09:53:29 +01:00
Eric Coissac	9823bcb41b	Remove ecoprimers submodule	2026-02-06 09:52:54 +01:00
coissac	9c162459b0	Merge pull request #79 from metabarcoding/push-tpytwyyyostt Remove ecoprimers submodule	2026-02-06 09:51:42 +01:00
Eric Coissac	25b494e562	Remove ecoprimers submodule	2026-02-06 09:50:45 +01:00
coissac	0b5cadd104	Merge pull request #78 from metabarcoding/push-pwvvkzxzmlux Push pwvvkzxzmlux	2026-02-06 09:48:47 +01:00
Eric Coissac	a2106e4e82	Bump version to 4.4.10 Update version from 4.4.9 to 4.4.10 in version.txt and pkg/obioptions/version.go	2026-02-06 09:48:27 +01:00
Eric Coissac	a8a00ba0f7	Simplify artifact packaging and update release notes This commit simplifies the artifact packaging process by creating a single tar.gz file containing all binaries for each platform, instead of individual files. It also updates the release notes to reflect the new packaging approach and corrects the documentation to use the new naming convention 'obitools4' instead of '<tool>'. Release_4.4.10	2026-02-06 09:48:25 +01:00
coissac	1595a74ada	Merge pull request #77 from metabarcoding/push-lwtnswxmorrq Push lwtnswxmorrq	2026-02-06 09:35:05 +01:00
Eric Coissac	68d723ecba	Bump version to 4.4.9 Update version from 4.4.8 to 4.4.9 in version.txt and corresponding Go file.	2026-02-06 09:34:43 +01:00
Eric Coissac	250d616129	Update release workflows for new OS versions Update the release workflow to use ubuntu-24.04-arm instead of ubuntu-latest for ARM64, and macos-15-intel instead of macos-latest for macOS. Remove cross-compilation for ARM64 and adjust the installation of build tools for macOS. Release_4.4.9	2026-02-06 09:34:41 +01:00
coissac	fbf816d219	Merge pull request #76 from metabarcoding/push-tzpmmnnxkvxx Push tzpmmnnxkvxx	2026-02-06 09:09:05 +01:00
Eric Coissac	7f0133a196	Bump version to 4.4.8 Update version from 4.4.7 to 4.4.8 in version.txt and _Version variable.	2026-02-06 09:08:35 +01:00
Eric Coissac	f798f22434	Add cross-platform binary builds and release workflow improvements This commit introduces a new build job that compiles binaries for multiple platforms (Linux, macOS) and architectures (amd64, arm64). It also refactors the release process to download pre-built artifacts and simplify the release directory preparation. The workflow now uses matrix strategy for building binaries and downloads all artifacts for the final release, removing the previous manual build steps for each platform. Release_4.4.8	2026-02-06 09:08:33 +01:00
coissac	248bc9f672	Merge pull request #75 from metabarcoding/push-mxxuykppzlpw Push mxxuykppzlpw	2026-02-05 18:11:12 +01:00
Eric Coissac	7a7db703f1	Bump version to 4.4.7 Update version from 4.4.6 to 4.4.7 in version.txt and pkg/obioptions/version.go	2026-02-05 18:10:45 +01:00
Eric Coissac	da195ac5cb	Optimize binary builds Modify the release workflow file to compile only the obitools tools when building binaries for each platform (Linux AMD64, Linux ARM64, macOS AMD64, macOS ARM64, Windows AMD64). This streamlines the build by generating only the required binaries. Release_4.4.7	2026-02-05 18:10:43 +01:00
coissac	20a0a09f5f	Merge pull request #74 from metabarcoding/push-yqrwnpmoqllk Push yqrwnpmoqllk	2026-02-05 18:03:28 +01:00
coissac	7d8c578c57	Merge branch 'master' into push-yqrwnpmoqllk	2026-02-05 18:03:18 +01:00
Eric Coissac	d7f615108f	Bump version to 4.4.6 Update version from 4.4.5 to 4.4.6 in version.txt and pkg/obioptions/version.go	2026-02-05 18:02:30 +01:00
Eric Coissac	71574f240b	Update version and add CI tests Update version to 4.4.5 and add a test job in the release workflow to ensure tests pass before creating a release. Release_4.4.6	2026-02-05 18:02:28 +01:00
coissac	c98501a898	Merge pull request #73 from metabarcoding/push-pklkwsssrkuv Push pklkwsssrkuv	2026-02-05 17:54:39 +01:00
Eric Coissac	23f145a4c2	Bump version to 4.4.5 Update version number from 4.4.4 to 4.4.5 in both version.go and version.txt files.	2026-02-05 17:53:53 +01:00
Eric Coissac	fe6d74efbf	Add automated release workflow and update tag creation This commit introduces a new GitHub Actions workflow to automatically create releases when tags matching the pattern 'Release_*' are pushed. It also updates the Makefile to use the new tag format 'Release_<version>' for tagging commits, ensuring consistency with the new release automation. Release_4.4.5	2026-02-05 17:53:52 +01:00
coissac	cff8135468	Merge pull request #72 from metabarcoding/push-zsprzlqxurrp Push zsprzlqxurrp	2026-02-05 17:42:48 +01:00
Eric Coissac	02ab683fa0	Bump version to 4.4.4 Update version from 4.4.3 to 4.4.4 in version.txt and pkg/obioptions/version.go	2026-02-05 17:42:01 +01:00
Eric Coissac	de88e7eecd	Fix typo in variable name Corrected a typo in the variable name 'usreId' to 'userId' to ensure proper functionality.	2026-02-05 17:41:59 +01:00
Eric Coissac	e3c41fc11b	Add Jaccard distance and similarity computations for KmerSet and KmerSetGroup This commit introduces Jaccard distance and similarity methods for KmerSet and KmerSetGroup. For KmerSet: - Added JaccardDistance method to compute the Jaccard distance between two KmerSets - Added JaccardSimilarity method to compute the Jaccard similarity between two KmerSets For KmerSetGroup: - Added JaccardDistanceMatrix method to compute a pairwise Jaccard distance matrix - Added JaccardSimilarityMatrix method to compute a pairwise Jaccard similarity matrix Also includes: - New DistMatrix implementation in pkg/obidist for storing and computing distance/similarity matrices - Updated version handling with bump-version target in Makefile - Added tests for all new methods	2026-02-05 17:39:23 +01:00
Eric Coissac	aa2e94dd6f	Refactor k-mer normalization functions and add quorum operations This commit refactors the k-mer normalization functions, renaming them from 'NormalizeKmer' to 'CanonicalKmer' to better reflect their purpose of returning canonical k-mers. It also introduces new quorum operations (AtLeast, AtMost, Exactly) for k-mer set groups, along with comprehensive tests and benchmarks. The version commit hash has also been updated.	2026-02-05 17:11:34 +01:00
Eric Coissac	a43e6258be	docs: translate comments to English This commit translates all French comments in the kmer filtering and set management code to English, improving code readability and maintainability for international collaborators.	2026-02-05 16:35:55 +01:00
Eric Coissac	12ca62b06a	Full implementation of persistence for FrequencyFilter Add save and load functionality for FrequencyFilter using the underlying KmerSetGroup. - New Save() method to write the filter to a directory with metadata formatting - New LoadFrequencyFilter() method to load a filter from a directory - Initialize metadata when the filter is created - Optimize the Union() and Intersect() methods of KmerSetGroup - Update the commit hash	2026-02-05 16:26:10 +01:00
Eric Coissac	09ac15a76b	Refactor k-mer encoding functions to use 'canonical' terminology This commit refactors all k-mer encoding and normalization functions to consistently use 'canonical' instead of 'normalized' terminology. This includes renaming functions like EncodeNormalizedKmer to EncodeCanonicalKmer, IterNormalizedKmers to IterCanonicalKmers, and NormalizeKmer to CanonicalKmer. The change aligns the API with biological conventions where 'canonical' refers to the lexicographically smallest representation of a k-mer and its reverse complement. All related documentation and examples have been updated accordingly. The commit also updates the version file with a new commit hash.	2026-02-05 16:14:35 +01:00
Eric Coissac	16f72e6305	refactoring of obikmer	2026-02-05 16:05:48 +01:00
Eric Coissac	6c6c369ee2	Add k-mer encoding and decoding functions with normalized k-mer support This commit introduces new functions for encoding and decoding k-mers, including support for normalized k-mers. It also updates the frequency filter and k-mer set implementations to use the new encoding functions, providing zero-allocation encoding for better performance. The commit hash has been updated to reflect the latest changes.	2026-02-05 15:51:52 +01:00
Eric Coissac	c5dd477675	Refactor KmerSet and FrequencyFilter to use immutable K parameter and consistent Copy/Clone methods This commit refactors the KmerSet and related structures to use an immutable K parameter and introduces consistent Copy methods instead of Clone. It also adds attribute API support for KmerSet and KmerSetGroup, and updates persistence logic to handle IDs and metadata correctly.	2026-02-05 15:32:36 +01:00
Eric Coissac	afcb43b352	Add user metadata handling to KmerSet and KmerSetGroup This change adds the ability to store and persist user metadata in the KmerSet and KmerSetGroup structures. The changes include adding a Metadata field to KmerSet and KmerSetGroup, and updating the cloning and persistence methods to handle this metadata. This allows additional information tied to k-mer sets to be retained while maintaining compatibility with existing operations.	2026-02-05 15:02:36 +01:00
Eric Coissac	b26b76cbf8	Add TOML persistence support for KmerSet and KmerSetGroup This commit adds support for saving and loading KmerSet and KmerSetGroup structures using TOML, YAML, and JSON formats for metadata. It includes: - Added github.com/pelletier/go-toml/v2 dependency - Implemented Save and Load methods for KmerSet and KmerSetGroup - Added metadata persistence with support for multiple formats (TOML, YAML, JSON) - Added helper functions for format detection and metadata handling - Updated version commit hash	2026-02-05 14:57:22 +01:00
Eric Coissac	aa468ec462	Refactor FrequencyFilter to use KmerSetGroup Refactor FrequencyFilter to inherit from KmerSetGroup for better code organization and maintainability. This change replaces the direct bitmap management with a group-based approach, simplifying the implementation and improving readability.	2026-02-05 14:46:57 +01:00
Eric Coissac	00dcd78e84	Refactor k-mer encoding and frequency filtering with KmerSet This commit refactors the k-mer encoding logic to handle ambiguous bases more consistently and introduces a KmerSet type for better management of k-mer collections. The frequency filter now works with KmerSet instead of roaring bitmaps directly, and the API has been updated to support level-based frequency queries. Additionally, the commit updates the version and commit hash.	2026-02-05 14:41:59 +01:00
Eric Coissac	60f27c1dc8	Add error handling for ambiguous bases in k-mer encoding This commit introduces error handling for ambiguous DNA bases (N, R, Y, W, S, K, M, B, D, H, V) in k-mer encoding. It adds new functions IterNormalizedKmersWithErrors and EncodeNormalizedKmersWithErrors that track and encode the number of ambiguous bases in each k-mer using error markers in the top 2 bits. The commit also updates the version string to reflect the latest changes.	2026-02-04 21:45:08 +01:00
Eric Coissac	28162ac36f	Add frequency filter with v levels of Roaring Bitmaps Full implementation of the frequency filter using v levels of Roaring Bitmaps to efficiently eliminate sequencing errors. - Add frequency filtering logic with v levels - Integrate the RoaringBitmap and bitset libraries - Add usage examples and documentation - Implement the k-mer iterator for memory-efficient use - Optimize for the skewed distributions typical of sequencing This change makes it possible to filter k-mers by minimum frequency with optimal memory use and a single pass over the data.	2026-02-04 21:21:10 +01:00
Eric Coissac	1a1adb83ac	Add error marker support for k-mers with enhanced documentation This commit introduces error marker functionality for k-mers with odd lengths up to 31. The top 2 bits of each k-mer are now reserved for error coding (0-3), allowing for error detection and correction capabilities. Key changes include: - Added constants KmerErrorMask and KmerSequenceMask for bit manipulation - Implemented SetKmerError, GetKmerError, and ClearKmerError functions - Updated EncodeKmers, ExtractSuperKmers, EncodeNormalizedKmers functions to enforce k ≤ 31 - Enhanced ReverseComplement to preserve error bits during reverse complement operations - Added comprehensive tests for error marker functionality including edge cases and integration tests The maximum k-mer size is now capped at 31 to accommodate the error bits, ensuring that k-mers with odd lengths ≤ 31 utilize only 62 bits of the 64-bit uint64, leaving the top 2 bits available for error coding.	2026-02-04 16:21:47 +01:00
Eric Coissac	05de9ca58e	Add SuperKmer extraction functionality This commit introduces the ExtractSuperKmers function which identifies maximal subsequences where all consecutive k-mers share the same minimizer. It includes: - SuperKmer struct to represent the maximal subsequences - dequeItem struct for tracking minimizers in a sliding window - Efficient algorithm using monotone deque for O(1) amortized minimizer tracking - Comprehensive parameter validation - Support for buffer reuse for performance optimization - Extensive test cases covering basic functionality, edge cases, and performance benchmarks The implementation uses simultaneous forward/reverse m-mer encoding for O(1) canonical m-mer computation and maintains a monotone deque to track minimizers efficiently.	2026-02-04 16:04:06 +01:00
Eric Coissac	500144051a	Add jj Makefile targets and k-mer encoding utilities Add new Makefile targets for jj operations (jjnew, jjpush, jjfetch) to streamline commit workflow. Introduce k-mer encoding utilities in pkg/obikmer: - EncodeKmers: converts DNA sequences to encoded k-mers - ReverseComplement: computes reverse complement of k-mers - NormalizeKmer: returns canonical form of k-mers - EncodeNormalizedKmers: encodes sequences with normalized k-mers Add comprehensive tests for k-mer encoding functions including edge cases, buffer reuse, and performance benchmarks. Document k-mer index design for large genomes, covering: - Use cases and objectives - Volume estimations - Distance metrics (Jaccard, Sørensen-Dice, Bray-Curtis) - Indexing options (Bloom filters, sorted sets, MPHF) - Optimization techniques (k-2-mer indexing) - MinHash for distance acceleration - Recommended architecture for presence/absence and counting queries	2026-02-04 14:27:10 +01:00
coissac	740f66b4c7	Merge pull request #71 from metabarcoding/push-onwzsyuooozn Implement unique filtering based on sequence and categories	2026-01-14 19:19:27 +01:00
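
Several of the commits above describe canonical k-mers (the lexicographically smaller of a k-mer and its reverse complement) packed into a uint64 with the top 2 bits reserved for an error code, which caps k at 31. The following is a minimal sketch of that idea, not the code from pkg/obikmer: every name here (encode, reverseComplement, canonical, setError, getError) is hypothetical, and the 2-bit base mapping A=0, C=1, G=2, T=3 is an assumption chosen so that numeric order matches lexicographic order.

```go
package main

import "fmt"

const (
	errorShift   = 62                       // error code lives in the top 2 bits
	errorMask    = uint64(3) << errorShift  // selects the error bits
	sequenceMask = ^errorMask               // selects the 62 sequence bits
)

// encode packs an unambiguous DNA string of length 1..31 into 2 bits per base
// (assumed mapping: A=0, C=1, G=2, T=3). It reports false on invalid input.
func encode(seq string) (uint64, bool) {
	if len(seq) == 0 || len(seq) > 31 {
		return 0, false
	}
	var code uint64
	for i := 0; i < len(seq); i++ {
		var b uint64
		switch seq[i] {
		case 'A', 'a':
			b = 0
		case 'C', 'c':
			b = 1
		case 'G', 'g':
			b = 2
		case 'T', 't':
			b = 3
		default:
			return 0, false // ambiguous bases are not handled in this sketch
		}
		code = code<<2 | b
	}
	return code, true
}

// reverseComplement reverses the k bases and complements each one (3 - b),
// carrying the error bits over unchanged.
func reverseComplement(kmer uint64, k int) uint64 {
	errBits := kmer & errorMask
	seq := kmer & sequenceMask
	var rc uint64
	for i := 0; i < k; i++ {
		rc = rc<<2 | (3 - seq&3)
		seq >>= 2
	}
	return rc | errBits
}

// canonical returns whichever of the k-mer and its reverse complement has the
// smaller sequence bits; the error bits are identical in both candidates.
func canonical(kmer uint64, k int) uint64 {
	rc := reverseComplement(kmer, k)
	if rc&sequenceMask < kmer&sequenceMask {
		return rc
	}
	return kmer
}

// setError stores an error code (0-3) in the top 2 bits.
func setError(kmer uint64, code int) uint64 {
	return kmer&sequenceMask | uint64(code&3)<<errorShift
}

// getError reads the error code back from the top 2 bits.
func getError(kmer uint64) int {
	return int(kmer >> errorShift)
}

func main() {
	kmer, _ := encode("TTT")
	marked := setError(kmer, 2)
	c := canonical(marked, 3)
	fmt.Println(c&sequenceMask, getError(c))
}
```

With a fixed k and this base mapping, comparing the packed integers is equivalent to comparing the strings, which is why the canonical form can be chosen without decoding.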


671 Commits
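
For readers unfamiliar with the Jaccard measures mentioned in commit e3c41fc11b, here is a minimal sketch of the definitions: similarity is |A∩B| / |A∪B| and distance is 1 minus similarity. It is not the repository's implementation. The kmerSet type below is a plain Go map standing in for the Roaring-Bitmap-backed KmerSet, all function names are hypothetical, and treating two empty sets as identical (similarity 1) is an assumed convention.

```go
package main

import "fmt"

// kmerSet is a hypothetical stand-in for a set of encoded k-mers.
type kmerSet map[uint64]struct{}

// newKmerSet builds a set from a list of encoded k-mers.
func newKmerSet(kmers ...uint64) kmerSet {
	s := make(kmerSet, len(kmers))
	for _, k := range kmers {
		s[k] = struct{}{}
	}
	return s
}

// jaccardSimilarity returns |A∩B| / |A∪B|.
// Assumption: two empty sets are considered identical (similarity 1).
func jaccardSimilarity(a, b kmerSet) float64 {
	if len(a) == 0 && len(b) == 0 {
		return 1
	}
	if len(a) > len(b) {
		a, b = b, a // iterate over the smaller set
	}
	inter := 0
	for k := range a {
		if _, ok := b[k]; ok {
			inter++
		}
	}
	union := len(a) + len(b) - inter
	return float64(inter) / float64(union)
}

// jaccardDistance returns 1 - similarity.
func jaccardDistance(a, b kmerSet) float64 {
	return 1 - jaccardSimilarity(a, b)
}

// jaccardDistanceMatrix fills a symmetric pairwise distance matrix,
// computing each unordered pair once.
func jaccardDistanceMatrix(sets []kmerSet) [][]float64 {
	n := len(sets)
	m := make([][]float64, n)
	for i := range m {
		m[i] = make([]float64, n)
	}
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			d := jaccardDistance(sets[i], sets[j])
			m[i][j] = d
			m[j][i] = d
		}
	}
	return m
}

func main() {
	a := newKmerSet(1, 2, 3)
	b := newKmerSet(2, 3, 4)
	fmt.Println(jaccardSimilarity(a, b), jaccardDistance(a, b))
}
```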