refactor: restructure k-mer partitioning pipeline for memory efficiency
Replace in-memory hashing with a disk-backed external merge sort and `PersistentCompactIntVec` to drastically reduce peak RAM. Unify both phases using a custom `PtrHash` MPHF, eliminating `GOFunction` and `boomphf`. Introduce a concrete three-step `count_partition()` pipeline with adaptive chunk sizing based on available system memory. Update dependencies to `memmap2`, `ptr_hash`, and `obicompactvec`. Additionally, document strict genomics-only memory constraints and enforce an architectural feedback workflow requiring explicit user authorization before structural changes.
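The chunked sort-and-merge idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `chunk_len`, `sorted_runs`, and `merge_count` are hypothetical names, the available-memory figure is passed in as a plain number rather than queried from the system, and the sorted runs are kept in memory where a disk-backed version would write each run to a temporary file.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical helper: how many u64-encoded k-mers fit in one chunk if a
/// chunk may use at most `fraction` of `available_bytes`.
/// The result is clamped to [min_len, max_len]; requires 1 <= min_len <= max_len.
fn chunk_len(available_bytes: u64, fraction: f64, min_len: usize, max_len: usize) -> usize {
    let budget = (available_bytes as f64 * fraction) as u64;
    let len = (budget / std::mem::size_of::<u64>() as u64) as usize;
    len.clamp(min_len, max_len)
}

/// Split the input into sorted runs of at most `len` elements (`len` > 0).
/// Assumption: runs stay in memory here; on disk each run would be a file.
fn sorted_runs(kmers: &[u64], len: usize) -> Vec<Vec<u64>> {
    kmers
        .chunks(len)
        .map(|chunk| {
            let mut run = chunk.to_vec();
            run.sort_unstable();
            run
        })
        .collect()
}

/// K-way merge of the sorted runs, collapsing equal k-mers into (kmer, count).
fn merge_count(runs: &[Vec<u64>]) -> Vec<(u64, u32)> {
    // Min-heap of (value, run index, position inside that run).
    let mut heap = BinaryHeap::new();
    for (r, run) in runs.iter().enumerate() {
        if let Some(&v) = run.first() {
            heap.push(Reverse((v, r, 0usize)));
        }
    }
    let mut out: Vec<(u64, u32)> = Vec::new();
    while let Some(Reverse((v, r, i))) = heap.pop() {
        match out.last_mut() {
            // Same k-mer as the previous output entry: bump its count.
            Some(last) if last.0 == v => last.1 += 1,
            _ => out.push((v, 1)),
        }
        // Refill the heap with the next element of the run we just consumed.
        if let Some(&next) = runs[r].get(i + 1) {
            heap.push(Reverse((next, r, i + 1)));
        }
    }
    out
}
```

Only one chunk is sorted at a time and the merge holds one heap entry per run, so peak memory is bounded by the chunk size rather than by the total number of k-mers.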
@@ -20,5 +20,8 @@ sysinfo = "0.33"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tracing = "0.1.44"
ph = "0.11"
cacheline-ef = "1.1"
epserde = "0.8"
memmap2 = "0.9.10"
obicompactvec = { path = "../obicompactvec" }
ptr_hash = "1.1"