39 Commits

Author SHA1 Message Date
Eric Coissac c188580aac Replace Rebatch with RebatchBySize using default batch parameters
Replace calls to Rebatch(size) with RebatchBySize(obidefault.BatchMem(), obidefault.BatchSizeMax()) in batchiterator.go, fragment.go, and obirefidx.go to ensure consistent use of default memory and size limits for batch rebatching.
2026-03-13 15:16:33 +01:00
Eric Coissac 1e1f575d1c refactor: replace single batch size with min/max bounds and memory limits
Introduce separate _BatchSize (min) and _BatchSizeMax (max) constants to replace the single _BatchSize variable. Update RebatchBySize to accept both maxBytes and maxCount parameters, flushing when either limit is exceeded. Set default batch size min to 1, max to 2000, and memory limit to 128 MB. Update CLI options and sequence_reader.go accordingly.
2026-03-13 15:07:35 +01:00
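The two-limit flushing rule described in this commit can be sketched as follows. This is only an illustration, not the code in pkg/obiter: the function name `rebatchBySize`, the use of plain strings in place of sequences, and the use of `len(s)` as the memory estimate are all assumptions. The sketch also assumes that a batch is closed just before an item would push it past either limit, and that a single oversized item still gets a batch of its own.

```go
package main

import "fmt"

// rebatchBySize is a hypothetical stand-in for the RebatchBySize behaviour
// described in the commit message. It regroups items into batches and closes
// the current batch when adding the next item would exceed maxBytes, or when
// the batch already holds maxCount items.
func rebatchBySize(seqs []string, maxBytes, maxCount int) [][]string {
	var batches [][]string
	var current []string
	currentBytes := 0

	for _, s := range seqs {
		size := len(s) // assumption: byte length stands in for a memory estimate

		// Flush only a non-empty batch, so an oversized item is never dropped.
		if len(current) > 0 && (currentBytes+size > maxBytes || len(current) >= maxCount) {
			batches = append(batches, current)
			current = nil
			currentBytes = 0
		}

		current = append(current, s)
		currentBytes += size
	}

	// Emit the trailing partial batch, if any.
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches
}

func main() {
	seqs := []string{"AAAA", "CC", "GGGGGG", "T", "T", "T"}
	for i, b := range rebatchBySize(seqs, 6, 2) {
		fmt.Println(i, b)
	}
}
```

With a count limit of 1 every item becomes its own batch, while a very large count limit leaves the byte limit as the only effective bound.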
Eric Coissac 40769bf827 Add memory-based batching support
Implement memory-aware batch sizing with --batch-mem CLI option, enabling adaptive batching based on estimated sequence memory footprint. Key changes:
- Added _BatchMem and related getters/setters in pkg/obidefault
- Implemented RebatchBySize() in pkg/obiter for memory-constrained batching
- Added BioSequence.MemorySize() for conservative memory estimation
- Integrated batch-mem option in pkg/obioptions with human-readable size parsing (e.g., 128K, 64M, 1G)
- Added obiutils.ParseMemSize/FormatMemSize for unit conversion
- Enhanced pool GC in pkg/obiseq/pool.go to trigger explicit GC for large slice discards
- Updated sequence_reader.go to apply memory-based rebatching when enabled
2026-03-13 14:54:21 +01:00
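The human-readable size parsing mentioned in this commit (values such as 128K, 64M, 1G) could look roughly like the sketch below. It is not the obiutils.ParseMemSize implementation: the function name `parseMemSize`, its signature, the case-insensitive suffixes, and the choice of binary multiples (1K = 1024 bytes) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMemSize is a hypothetical parser for sizes such as "128K", "64M" or
// "1G". It assumes binary multiples and accepts a bare number as bytes.
func parseMemSize(s string) (int64, error) {
	s = strings.ToUpper(strings.TrimSpace(s))
	if s == "" {
		return 0, fmt.Errorf("empty size")
	}

	// Pick the multiplier from an optional one-letter suffix.
	multiplier := int64(1)
	switch s[len(s)-1] {
	case 'K':
		multiplier = 1 << 10
	case 'M':
		multiplier = 1 << 20
	case 'G':
		multiplier = 1 << 30
	}
	if multiplier != 1 {
		s = s[:len(s)-1]
	}

	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %w", s, err)
	}
	if n < 0 {
		return 0, fmt.Errorf("negative size %d", n)
	}
	return n * multiplier, nil
}

func main() {
	for _, arg := range []string{"128K", "64M", "1G", "4096"} {
		n, err := parseMemSize(arg)
		fmt.Println(arg, n, err)
	}
}
```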
Eric Coissac 7c4042df6b introduce obidefault 2025-01-27 17:12:45 +01:00
Eric Coissac 9acb4a85a8 Refactoring of the default values 2025-01-24 18:09:59 +01:00
Eric Coissac 241f2286f2 remove the slice pool management 2024-09-24 16:31:30 +02:00
Eric Coissac 31bfc88eb9 Patch a bug on writing to stdout, and add a clearer error on opening data files 2024-08-13 09:45:28 +02:00
Eric Coissac 886b5d9a96 Optimize memory for readers and writers 2024-08-05 10:48:28 +02:00
Eric Coissac 1b1cd41fd3 Add some code refactoring from the blackboard branch 2024-08-02 12:35:46 +02:00
Eric Coissac e40d0bfbe7 Debug fasta and fastq writer when the first sequence is huge
Former-commit-id: d208ff838abb7e19e117067f6243298492d60f14
2024-06-26 18:39:42 +02:00
Eric Coissac 65f5109957 Plenty of small bugs
Former-commit-id: 42c7fab7d65906c80ab4cd32da6867ff21842ea8
2024-06-04 16:49:12 +02:00
coissac 23758b00f6 Patch a bug in the embl reader and add some doc
Former-commit-id: 9b5f75fb14bcc3043da1647055279987a295d271
2024-01-31 15:43:02 +01:00
coissac eb351a7530 patch bug in worker
Former-commit-id: f83cc62fc7a85f732e871f8866f80f738f494f9e
2023-12-03 22:44:13 +01:00
coissac 8d77cc4133 Change path of the obitools pkg
Former-commit-id: 311cbf8df3b990b393c6f4885d62e74564423b65
2023-11-29 12:14:37 +01:00
coissac 2e0c1bd801 Correct the number of workers
Former-commit-id: febbccfb853263e0761ecfccb0f09c8c1bf88475
2023-11-22 09:46:30 +01:00
coissac 62b57f4ede A go implementation of the fasta reader
Former-commit-id: 603592c4761fb0722e9e0501d78de1bd3ba238fa
2023-09-01 09:30:12 +02:00
coissac 988ae79989 Optimize memory allocation of the apat algorithms
Former-commit-id: 5010c5a666b322715b3b81c1078d325e1f647ede
2023-03-28 19:37:05 +07:00
coissac a33e471b39 First attempt for obiconsensus... The graph traversing algorithm is too simple
Former-commit-id: 0456e6c7fd55d6d0fcf9856c40386b976b912cba
2023-03-27 19:51:10 +07:00
coissac d5e84ec676 rename goutils to obiutils
Former-commit-id: 2147f53db972bba571dfdae30c51b62d3e69cec5
2023-03-24 10:25:12 +07:00
coissac 5fbe52368c Patch the empty batch bug
Former-commit-id: fcee04b58f2c4a0bf2c27792f991391c0b6ce78e
2023-03-07 20:16:06 +07:00
coissac d88de15cdc Refactoring code to remove buffer size options. And some other changes...
Former-commit-id: 10b57cc1a27446ade3c444217341e9651e89cdce
2023-03-07 11:12:13 +07:00
coissac 072b85e155 change the model for representing paired reads and extend its usage to other commands 2023-02-23 23:35:58 +01:00
coissac 526bf79c7f Patch for some loss of data during sequence writing 2023-02-08 13:14:26 +01:00
coissac 2d375df94f move the worker class to the obiseq package 2023-01-22 22:39:13 +01:00
coissac f97f92df72 rename the iterator class 2023-01-22 22:04:17 +01:00
coissac 29563aa94e Rename the Length methods to Len to follow the Go standard 2022-11-17 11:09:58 +01:00
coissac 09fc426b67 Refactoring related to iterators 2022-11-16 17:13:03 +01:00
coissac 6f853da9df Remove single sequence iterators. Only batch iterators persist 2022-11-16 10:58:59 +01:00
coissac 8aa323dad5 Add a first version of obitag the successor of ecotag 2022-10-26 13:16:56 +02:00
coissac 7873b90902 Patch a bug in obigrep... 2022-10-02 20:52:26 +02:00
coissac d04161a0fb small code refactoring 2022-08-23 15:08:35 +02:00
coissac eca1af9957 Patch a bug on the FilterOn method 2022-08-21 15:05:14 +02:00
coissac f00456fcf3 Add a second way to merge several batch iterators using the pool method. 2022-08-21 13:41:15 +02:00
coissac 5dd835d3e7 A first functional version of obiclean 2022-08-20 18:01:07 +02:00
coissac 6b13729eba Adds a test for not pushing empty batch on the output 2022-06-14 09:53:35 +02:00
coissac f14860a486 Patch header parsing and formatting 2022-05-27 11:53:29 +03:00
coissac 011898bd9d A first version of obigrep. Normally fully functional, but not fully tested 2022-02-25 07:29:52 +01:00
coissac abcf02e488 Start to use leveled log 2022-02-24 12:14:52 +01:00
coissac eaf65fbcce Some code refactoring, a new, more memory-efficient version of obiuniq, and a first Makefile for building obitools 2022-02-24 07:08:40 +01:00