38 Commits

Author SHA1 Message Date
Eric Coissac 8c7017a99d ⬆️ version bump to v4.5
- Update obioptions.Version from "Release 4.4.29" to "/v/ Release v5"
- Update version.txt from 4.29 → .30
(automated by Makefile)
2026-04-13 13:34:53 +02:00
Eric Coissac 761e0dbed3 Implement a rope-based GenBank parser to reduce memory usage
Add a rope-based GenBank parser to reduce memory usage (RSS) and heap allocations.

- Add `gbRopeScanner` to read lines without heap allocation
- Implement `GenbankChunkParserRope`, which uses the rope instead of `Pack()`
- Modify `_ParseGenbankFile` and `ReadGenbank` to use the new parser
- Expected RSS reduction from 57 GB to ~128 MB × workers
- Keep the old parser for compatibility and tests

Significant reduction in allocations (~50M) and sys time, with comparable or better user time.
2026-03-10 15:35:36 +01:00
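The idea behind commit 761e0dbed3 (scanning lines directly over a rope of chunks instead of first packing them into one contiguous buffer) can be illustrated with a minimal sketch. This is not the `gbRopeScanner` from the repository: the type `ropeScanner`, its fields, and the method `next` are hypothetical names, and the rope is assumed to be a plain slice of byte chunks. A line that lies inside one chunk is returned as a sub-slice with no copy; only a line that straddles a chunk boundary is copied into a reusable buffer.

```go
package main

import (
	"bytes"
	"fmt"
)

// ropeScanner is a hypothetical line scanner over a rope, modelled here
// as a slice of byte chunks. It is an illustration, not the obitools4 code.
type ropeScanner struct {
	chunks [][]byte
	ci     int    // index of the current chunk
	off    int    // read offset inside the current chunk
	carry  []byte // reusable buffer for lines that span several chunks
}

// next returns the next line without its trailing '\n'.
// The returned slice is only valid until the following call to next.
func (s *ropeScanner) next() (line []byte, ok bool) {
	s.carry = s.carry[:0]
	spanning := false
	for s.ci < len(s.chunks) {
		rest := s.chunks[s.ci][s.off:]
		if i := bytes.IndexByte(rest, '\n'); i >= 0 {
			s.off += i + 1
			if !spanning {
				// Common case: the line sits inside one chunk, no copy.
				return rest[:i], true
			}
			s.carry = append(s.carry, rest[:i]...)
			return s.carry, true
		}
		// No newline in the rest of this chunk: keep it and move on.
		if len(rest) > 0 {
			s.carry = append(s.carry, rest...)
			spanning = true
		}
		s.ci++
		s.off = 0
	}
	if spanning {
		// Last line of the input, not terminated by '\n'.
		return s.carry, true
	}
	return nil, false
}

func main() {
	s := &ropeScanner{chunks: [][]byte{
		[]byte("LOCUS  AB1\nDEFINI"),
		[]byte("TION  x\nORIGIN\n"),
	}}
	for {
		line, ok := s.next()
		if !ok {
			break
		}
		fmt.Printf("%q\n", line)
	}
}
```

Because `carry` is reused across calls, the copy for boundary-spanning lines is amortised; a caller that needs to keep a line must copy it before calling `next` again.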
Eric Coissac f78543ee75 Refactor k-mer index building to use disk-based KmerSetGroupBuilder
Refactor k-mer index building to use the new disk-based KmerSetGroupBuilder instead of the old KmerSet and FrequencyFilter approaches. This change introduces a more efficient and scalable approach to building k-mer indices by using partitioned disk storage with streaming operations.

- Replace BuildKmerIndex and BuildFrequencyFilterIndex with KmerSetGroupBuilder
- Add support for frequency filtering via WithMinFrequency option
- Remove deprecated k-mer set persistence methods
- Update CLI to use new builder approach
- Add new disk-based k-mer operations (union, intersect, difference, quorum)
- Introduce KDI (K-mer Delta Index) file format for efficient storage
- Add K-way merge operations for combining sorted k-mer streams
- Update documentation and examples to reflect new API

This refactoring provides better memory usage, faster operations on large datasets, and more flexible k-mer set operations.
2026-02-10 06:49:31 +01:00
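As an illustration of the K-way merge and set operations listed in commit f78543ee75, the sketch below merges several sorted k-mer streams with a min-heap and keeps the values seen in at least `q` streams. It is a generic example, not the repository's implementation: `mergeQuorum`, `item`, and `minHeap` are hypothetical names, k-mers are assumed to be encoded as `uint64`, streams are in-memory slices rather than disk partitions, and each stream is assumed to be strictly increasing.

```go
package main

import (
	"container/heap"
	"fmt"
)

// item is the current head of one input stream.
type item struct {
	value  uint64 // encoded k-mer (assumption: fits in a uint64)
	source int    // index of the stream the value came from
}

// minHeap orders stream heads by value, smallest first.
type minHeap []item

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].value < h[j].value }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(item)) }
func (h *minHeap) Pop() any {
	old := *h
	n := len(old)
	it := old[n-1]
	*h = old[:n-1]
	return it
}

// mergeQuorum (hypothetical name) returns, in sorted order, the values
// present in at least q of the strictly increasing input streams.
// q = 1 gives the union, q = len(streams) the intersection.
func mergeQuorum(streams [][]uint64, q int) []uint64 {
	pos := make([]int, len(streams)) // next unread index per stream
	h := &minHeap{}
	for i, s := range streams {
		if len(s) > 0 {
			heap.Push(h, item{value: s[0], source: i})
			pos[i] = 1
		}
	}
	var out []uint64
	for h.Len() > 0 {
		current := (*h)[0].value
		count := 0
		// Pop every stream head equal to the current minimum.
		for h.Len() > 0 && (*h)[0].value == current {
			it := heap.Pop(h).(item)
			count++
			s := streams[it.source]
			if pos[it.source] < len(s) {
				heap.Push(h, item{value: s[pos[it.source]], source: it.source})
				pos[it.source]++
			}
		}
		if count >= q {
			out = append(out, current)
		}
	}
	return out
}

func main() {
	streams := [][]uint64{{1, 4, 9}, {2, 4, 7}, {4, 8, 9}}
	fmt.Println(mergeQuorum(streams, 1)) // union:        [1 2 4 7 8 9]
	fmt.Println(mergeQuorum(streams, 2)) // quorum of 2:  [4 9]
	fmt.Println(mergeQuorum(streams, 3)) // intersection: [4]
}
```

Only one head per stream is held in the heap at any time, which is what makes this pattern suitable for streaming from sorted files.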
Eric Coissac c0ae49ef92 Add obilowmask_ref to the .gitignore file
Add the obilowmask_ref file to .gitignore so that Git does not track it.
2026-02-08 19:31:12 +01:00
Eric Coissac db98ddb241 Fix super k-mer minimizer bijection and add validation test
This commit addresses a bug in the super k-mer implementation where the minimizer bijection property was not properly enforced. The fix ensures that:

1. All k-mers within a super k-mer share the same minimizer
2. Identical super k-mer sequences have the same minimizer

The changes include:

- Fixing the super k-mer iteration logic to properly validate the minimizer bijection property
- Adding a comprehensive test suite (TestSuperKmerMinimizerBijection) that validates the intrinsic property of super k-mers
- Updating the .gitignore file to properly track relevant files

This resolves issues where the same sequence could be associated with different minimizers, violating the super k-mer definition.
2026-02-08 13:47:33 +01:00
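The property that commit db98ddb241 enforces can be checked with a small self-contained sketch. It is not the repository's `TestSuperKmerMinimizerBijection`: the functions `minimizer`, `superKmers`, and `checkSuperKmer` are hypothetical, the minimizer is taken as the lexicographically smallest m-mer of each k-mer, and reverse-complement (canonical) handling is deliberately left out. It also assumes m ≤ k.

```go
package main

import "fmt"

// minimizer returns the lexicographically smallest m-mer of kmer.
// Simplification: no reverse-complement (canonical form) handling.
func minimizer(kmer string, m int) string {
	best := kmer[:m]
	for i := 1; i+m <= len(kmer); i++ {
		if c := kmer[i : i+m]; c < best {
			best = c
		}
	}
	return best
}

// superKmer is a maximal run of consecutive k-mers sharing one minimizer.
type superKmer struct {
	seq       string
	minimizer string
}

// superKmers (hypothetical name) splits seq into super k-mers.
func superKmers(seq string, k, m int) []superKmer {
	var out []superKmer
	if len(seq) < k {
		return out
	}
	start := 0
	cur := minimizer(seq[:k], m)
	for i := 1; i+k <= len(seq); i++ {
		mz := minimizer(seq[i:i+k], m)
		if mz != cur {
			// k-mers start..i-1 share cur; together they span seq[start:i+k-1].
			out = append(out, superKmer{seq: seq[start : i+k-1], minimizer: cur})
			start, cur = i, mz
		}
	}
	return append(out, superKmer{seq: seq[start:], minimizer: cur})
}

// checkSuperKmer reports whether every k-mer of sk has sk's minimizer.
func checkSuperKmer(sk superKmer, k, m int) bool {
	for i := 0; i+k <= len(sk.seq); i++ {
		if minimizer(sk.seq[i:i+k], m) != sk.minimizer {
			return false
		}
	}
	return true
}

func main() {
	const k, m = 7, 3
	seen := map[string]string{} // super k-mer sequence -> minimizer
	for _, sk := range superKmers("ACGTTGCAACGGTACACGTTGCAACG", k, m) {
		if !checkSuperKmer(sk, k, m) {
			fmt.Println("property 1 violated:", sk.seq)
		}
		if prev, ok := seen[sk.seq]; ok && prev != sk.minimizer {
			fmt.Println("property 2 violated:", sk.seq)
		}
		seen[sk.seq] = sk.minimizer
		fmt.Println(sk.minimizer, sk.seq)
	}
}
```

The loop in `main` checks both properties from the commit message: every k-mer inside a super k-mer yields the same minimizer, and a given super k-mer sequence never appears with two different minimizers.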
Eric Coissac 4603d7973e implementation of obilowmask 2025-11-18 15:30:20 +01:00
Eric Coissac 04f3af3e60 some renaming of functions 2025-08-06 15:54:50 -04:00
Eric Coissac 286e27d6ba patch the scienctific_name tag name to "scientific_name" 2025-03-05 14:22:12 +01:00
Eric Coissac 6245d7f684 Changes to be committed:
modified:   .gitignore
2025-02-24 15:47:45 +01:00
Eric Coissac 15a058cf63 with all the sample files for tests 2025-02-19 15:27:38 +01:00
Eric Coissac f2e81adf95 Changes to be committed:
modified:   .gitignore
deleted:    xxx.csv
2025-02-05 19:28:19 +01:00
Eric Coissac 0a567f621c small changes 2025-01-24 18:12:37 +01:00
Eric Coissac d066bb6878 Changes to be committed:
modified:   .gitignore
modified:   cmd/test/main.go
modified:   pkg/obioptions/version.go
2025-01-09 07:24:41 +01:00
Eric Coissac ccd3b06532 Merge branch 'master' into taxonomy 2024-12-20 20:06:57 +01:00
Eric Coissac 7884a74f9c Patch a bug in obitagpcr 2024-11-18 21:10:47 +01:00
Eric Coissac 36327c79c8 Changes to be committed:
modified:   .gitignore
new file:   pkg/obitax/default_taxonomy.go
modified:   pkg/obitax/taxon.go
modified:   pkg/obitax/taxonnode.go
modified:   pkg/obitax/taxonomy.go
modified:   pkg/obitax/taxonset.go
modified:   pkg/obitax/taxonslice.go
modified:   pkg/obitools/obifind/iterator.go
modified:   pkg/obitools/obifind/options.go
2024-11-16 10:01:49 +01:00
coissac 4127ddb26f .gitignore
Former-commit-id: e1dcb41970f7a5405005cda8a1bbd90798e8020d
2024-02-27 07:29:14 +01:00
coissac f2f7b4574e update the geometric obitag
Former-commit-id: acd8fe1c8c1cf443098432d818397b0b5d02df33
2024-01-17 23:38:51 +01:00
coissac 6fca03227a Archive cleaning
Former-commit-id: ded6d9cb43e3ecdf6eb6965e73580ce30ab986c5
2024-01-04 14:27:33 +01:00
coissac 5b57139450 Reduce doc size
Former-commit-id: 6f92f375e9cf92159e769ce562071bb56a871819
2024-01-04 14:17:44 +01:00
coissac c2533667b2 Tag a Fatal bug release 4.0.5
Former-commit-id: 10b27c6d3867756d3159ef22eefd75db3fab84d0
2023-08-29 18:32:00 +02:00
coissac 446ba06c63 edited .gitignore
Former-commit-id: 7090da08a2acd8d73d3c9e3aced387862bcc9822
2023-03-28 21:31:31 +07:00
coissac d4b185b716 Adds some new dependencies
Former-commit-id: 2f31ea6f852651e1ffca1d9ce78b17bddd26f2bb
2023-03-07 11:12:39 +07:00
coissac 6c5fc8f65b Save change in various files
Former-commit-id: 428f8ee77c584b79cc2ef45eef2902c3e0754c77
2023-02-23 23:45:41 +01:00
coissac f56363a100 Patch an embl/genbank parser error 2023-02-16 13:30:42 +01:00
coissac 3a151fc0a0 updated gitignore 2023-01-31 23:14:33 +01:00
coissac f4daa7f97f Modify the gitignore 2022-11-17 12:05:11 +01:00
coissac a71e65963b Modify the .gitignore 2022-08-21 17:53:51 +02:00
coissac a18745a34d Modify the gitignore 2022-02-24 12:15:09 +01:00
coissac f5b278f5ec edit .gitignore 2022-02-24 07:09:35 +01:00
coissac 3586ecc483 second version of obidistribute and a first buggy version of obiuniq 2022-02-15 00:47:02 +01:00
coissac b931321ba1 Adds the hash option to obidistribute 2022-02-14 09:12:57 +01:00
coissac b193c3edfe adds the obifind command to gitignore 2022-02-07 11:52:56 +01:00
coissac e9cdfd7e03 Make subseq method dealing with qualities 2022-02-01 18:49:32 +01:00
coissac 703eb62819 Adds elements to .gitignore 2022-01-18 13:10:56 +01:00
coissac a0ba77792a Adds fasta and fastq file to the main gitignore file 2022-01-14 15:20:11 +01:00
coissac bfa724dca3 adds the bin directory to the gitignore file 2022-01-14 15:18:36 +01:00
coissac b9b9c0f179 Patch module name from oa2 to obitools 2022-01-13 23:43:01 +01:00