obitools4

mirror of https://github.com/metabarcoding/obitools4.git synced 2026-06-24 17:51:00 +00:00

Author	SHA1	Message	Date
Eric Coissac	60b3753673	feat(obiconvert): add --raw-taxid option and refactor taxID formatting - Add new `--tax-id` mode (`obiconvert --raw-taxid`) to output bare numeric taxIDs instead of full-format strings. - Introduce `TaxNode.FullString()` to always return the complete "code:id [name]@rank" format, regardless of global `UseRawTaxids()` setting. - Update `.String(taxonomyCode)` to respect the global flag, returning bare ID when `--raw-taxid` is active. - Extract raw taxID from full-format strings in taxonomy methods when needed (e.g., fallback without loaded DB). - Add comprehensive test suite covering: a) `--raw-taxid` execution and idempotency b) full-format taxID output with `--taxonomy` c) interaction of both flags d) format validation - Add test data: new reference files `out_ecotag.fasta`, taxonomy.csv, and updated shell script.	2026-04-30 16:57:38 +02:00
Eric Coissac	7cb02ded69	Refactor: Extract utility function for string reversal - Introduce `inverser_chaine()` helper to centralize logic - Replace inline reverse implementations across modules	2026-04-16 13:42:51 +02:00
Eric Coissac	6d469bd711	[obiseq] Add length validation for qualities in SetQualities, TakeQualities and Subsequence - Panic if sequence/qualities length mismatch when setting or taking qualities in BioSequence. - Add same check before slicing Qualities() for Subsequence to ensure consistency.	2026-04-15 18:20:53 +02:00
Eric Coissac	a2b26712b2	refactor: replace fixed batch size with dynamic flushing based on count and memory Replace the old fixed batch-size mechanism in Distribute with a dynamic strategy that flushes batches when either BatchSizeMax() sequences or BatchMem() bytes are reached per key. This aligns with the RebatchBySize strategy and removes the optional sizes parameter. Also update related code: simplify Lua wrapper to accept optional capacity, and fix buffer growth logic in worker.go using slices.Grow correctly. Remove unused BatchSize() usage from obidistribute.	2026-03-16 22:06:44 +01:00
Eric Coissac	40769bf827	Add memory-based batching support Implement memory-aware batch sizing with --batch-mem CLI option, enabling adaptive batching based on estimated sequence memory footprint. Key changes: - Added _BatchMem and related getters/setters in pkg/obidefault - Implemented RebatchBySize() in pkg/obiter for memory-constrained batching - Added BioSequence.MemorySize() for conservative memory estimation - Integrated batch-mem option in pkg/obioptions with human-readable size parsing (e.g., 128K, 64M, 1G) - Added obiutils.ParseMemSize/FormatMemSize for unit conversion - Enhanced pool GC in pkg/obiseq/pool.go to trigger explicit GC for large slice discards - Updated sequence_reader.go to apply memory-based rebatching when enabled	2026-03-13 14:54:21 +01:00
Eric Coissac	6ee8750635	Replace SplitInTwo with LeftSplitInTwo/RightSplitInTwo for precise splitting Replace SplitInTwo calls with LeftSplitInTwo or RightSplitInTwo depending on the intended split direction. In fastseq_json_header.go, extract rank from suffix without splitting; in biosequenceslice.go and taxid.go, use LeftSplitInTwo to split from the left; add RightSplitInTwo utility function for splitting from the right.	2026-03-12 18:41:28 +01:00
Eric Coissac	3d2e205722	Refactor rope scanner and add FASTQ rope parser This commit refactors the rope scanner implementation by renaming gbRopeScanner to ropeScanner and extracting the common functionality into a new file. It also introduces a new FastqChunkParserRope function that parses FASTQ chunks directly from a rope without Pack(), enabling more efficient memory usage. The existing parsers are updated to use the new rope-based parser when available. The BioSequence type is enhanced with a TakeQualities method for more efficient quality data handling.	2026-03-10 16:47:03 +01:00
Eric Coissac	1342c83db6	Use NewBioSequenceOwning to avoid unnecessary sequence copying Replace NewBioSequence with NewBioSequenceOwning in genbank_read.go to take ownership of sequence slices without copying, improving performance. Update biosequence.go to add the new TakeSequence method and NewBioSequenceOwning constructor.	2026-03-10 15:51:35 +01:00
Eric Coissac	ac0d3f3fe4	Update obiuniq for very large datasets	2025-12-18 14:11:11 +01:00
Eric Coissac	86e60aedd0	obicsv bug with stat on value map fields	2025-11-21 14:03:31 +01:00
Eric Coissac	4603d7973e	implementation of obilowmask	2025-11-18 15:30:20 +01:00
Eric Coissac	d17a9520b9	work on obiclean chimera detection	2025-10-20 17:29:47 +02:00
Eric Coissac	add9d89ccc	Patch the Min and Max values of the expression language	2025-06-19 16:43:26 +02:00
Eric Coissac	9965370d85	Manage a lock on StatsOnValues	2025-06-17 16:46:11 +02:00
Eric Coissac	8a2bb1fe82	Changes to be committed: modified: pkg/obioptions/version.go modified: pkg/obiseq/merge.go	2025-06-17 12:11:35 +02:00
Eric Coissac	efc3f3af29	Patch a concurrent access problem	2025-06-17 12:05:42 +02:00
Eric Coissac	6cb7a5a352	Changes to be committed: modified: cmd/obitools/obitag/main.go modified: cmd/obitools/obitaxonomy/main.go modified: pkg/obiformats/csvtaxdump_read.go modified: pkg/obiformats/ecopcr_read.go modified: pkg/obiformats/ncbitaxdump_read.go modified: pkg/obiformats/ncbitaxdump_readtar.go modified: pkg/obiformats/newick_write.go modified: pkg/obiformats/options.go modified: pkg/obiformats/taxonomy_read.go modified: pkg/obiformats/universal_read.go modified: pkg/obiiter/extract_taxonomy.go modified: pkg/obioptions/options.go modified: pkg/obioptions/version.go new file: pkg/obiphylo/tree.go modified: pkg/obiseq/biosequenceslice.go modified: pkg/obiseq/taxonomy_methods.go modified: pkg/obitax/taxonomy.go modified: pkg/obitax/taxonset.go modified: pkg/obitools/obiconvert/sequence_reader.go modified: pkg/obitools/obitag/obitag.go modified: pkg/obitools/obitaxonomy/obitaxonomy.go modified: pkg/obitools/obitaxonomy/options.go deleted: sample/.DS_Store	2025-06-04 09:48:10 +02:00
Eric Coissac	f9324dd8f4	add min and max to the obitools expression language	2025-05-13 16:03:03 +02:00
Eric Coissac	f1b9ac4a13	Update the expression language	2025-05-07 20:45:05 +02:00
Eric Coissac	c0ecaf90ab	Add the --number option to obiannotate	2025-04-22 18:35:51 +02:00
Eric Coissac	a57cfda675	Make the replace function of the eval language accept regex	2025-04-10 15:17:15 +02:00
Eric Coissac	5a3705b6bb	Adds the --silent-warning option to the obitools commands and removes the --paired-with option from some of the obitools commands.	2025-03-25 16:44:46 +01:00
Eric Coissac	f21f51ae62	Correct the logic of --update-taxid and --fail-on-taxonomy	2025-03-11 16:56:02 +01:00
Eric Coissac	3b5d4ba455	patch a bug in obiannotate	2025-03-11 16:35:38 +01:00
Eric Coissac	286e27d6ba	patch the scienctific_name tag name to "scientific_name"	2025-03-05 14:22:12 +01:00
Eric Coissac	51b3e83d32	some cleaning	2025-02-24 11:31:49 +01:00
Eric Coissac	8671285d02	add the --min-sample-count option to obiclean.	2025-02-24 08:48:31 +01:00
Eric Coissac	4774438644	Changes to be committed: modified: pkg/obiformats/universal_read.go modified: pkg/obioptions/version.go modified: pkg/obiseq/taxonomy_methods.go	2025-02-12 08:40:38 +01:00
Eric Coissac	6a8061cc4f	Add management of the taxonomy alias policy	2025-02-10 14:05:47 +01:00
Eric Coissac	0df082da06	Adds the possibility to extract a taxonomy from taxonomic paths included in sequence files	2025-01-30 11:18:21 +01:00
Eric Coissac	9acb4a85a8	Refactoring of the default values	2025-01-24 18:09:59 +01:00
Eric Coissac	ccd3b06532	Merge branch 'master' into taxonomy	2024-12-20 20:06:57 +01:00
Eric Coissac	5d0f996625	Patch a small bug on json write	2024-12-20 19:42:03 +01:00
Eric Coissac	795df34d1a	Changes to be committed: modified: cmd/obitools/obitag/main.go modified: cmd/obitools/obitag2/main.go modified: go.mod modified: go.sum modified: pkg/obiformats/ncbitaxdump/read.go modified: pkg/obioptions/version.go modified: pkg/obiseq/attributes.go modified: pkg/obiseq/taxonomy_lca.go modified: pkg/obiseq/taxonomy_methods.go modified: pkg/obiseq/taxonomy_predicate.go modified: pkg/obitax/inner.go modified: pkg/obitax/lca.go new file: pkg/obitax/taxid.go modified: pkg/obitax/taxon.go modified: pkg/obitax/taxonomy.go modified: pkg/obitax/taxonslice.go modified: pkg/obitools/obicleandb/obicleandb.go modified: pkg/obitools/obigrep/options.go modified: pkg/obitools/obilandmark/obilandmark.go modified: pkg/obitools/obilandmark/options.go modified: pkg/obitools/obirefidx/famlilyindexing.go modified: pkg/obitools/obirefidx/geomindexing.go modified: pkg/obitools/obirefidx/obirefidx.go modified: pkg/obitools/obirefidx/options.go modified: pkg/obitools/obitag/obigeomtag.go modified: pkg/obitools/obitag/obitag.go modified: pkg/obitools/obitag/options.go modified: pkg/obiutils/strings.go	2024-12-19 13:36:59 +01:00
Eric Coissac	00b0edc15a	refactoring of the file chunk writing	2024-11-29 18:15:03 +01:00
Eric Coissac	d29a56dcbf	Changes to be committed: modified: Release-notes.md modified: pkg/obialign/pairedendalign.go modified: pkg/obilua/obiseq.go modified: pkg/obioptions/version.go modified: pkg/obiseq/biosequence.go modified: pkg/obitools/obipairing/pairing.go	2024-11-27 09:56:22 +01:00
Eric Coissac	7884a74f9c	Patch a bug in obitagpcr	2024-11-18 21:10:47 +01:00
Eric Coissac	03f4e88a17	First functional version	2024-11-14 19:10:23 +01:00
Eric Coissac	241f2286f2	remove the slice pool management	2024-09-24 16:31:30 +02:00
Eric Coissac	05bf2bfd6c	Add option related to agrep match on obigrep and obiannotate	2024-09-09 16:52:13 +02:00
Eric Coissac	65ae82622e	correction of several small bugs	2024-09-03 06:08:07 -03:00
Eric Coissac	bdb96dda94	Adds the obimicrosat command	2024-08-05 15:31:20 +02:00
Eric Coissac	67665a6b40	Xprize update Former-commit-id: d38919a897961e4d40da3b844057c3fb94fdb6d7	2024-07-25 18:09:03 -04:00
Eric Coissac	4e4fac491f	First version of the two-level indexing Former-commit-id: 4d86483bc120e27cb6f5d2c216596d410274fc69	2024-07-12 15:17:48 +02:00
Eric Coissac	c7ed47e110	first version of obidemerge, obijoin and a new filter for obicleandb, but still to be finished Former-commit-id: 8a1ed26e5548c30db75644c294d478ec4d753f19	2024-07-10 15:21:42 +02:00
Eric Coissac	bd855c4965	Adds CSV as an input format Former-commit-id: a365bb6947064adc2709d66df05fa54c6fe47fad	2024-07-03 21:04:27 +02:00
Eric Coissac	fd663357b5	First version of obicleandb... Former-commit-id: e60b61d015abbf029a555b51de99b4252c50ab59	2024-07-01 17:12:42 +02:00
Eric Coissac	93f9dcb95f	Reducing memory allocation events Former-commit-id: c94e79ba116464504580fc397270ead154063971	2024-06-22 22:32:31 +02:00
Eric Coissac	e6b87ecd02	Reduce memory allocation events Former-commit-id: fbdb2afc857b02adc2593e2278d3bd838e99b0b2	2024-06-22 21:01:53 +02:00
Eric Coissac	54a138196c	Patch a bug in fasta and fastq reading Former-commit-id: bcaa264b4c4a7c67617eb909b199176bf09913db	2024-06-21 14:28:57 +02:00


138 Commits