4736a7b6de
Replace in-memory hashing with a disk-backed external merge sort and `PersistentCompactIntVec` to drastically reduce peak RAM. Unify both phases using a custom `PtrHash` MPHF, eliminating `GOFunction` and `boomphf`. Introduce a concrete three-step `count_partition()` pipeline with adaptive chunk sizing based on available system memory. Update dependencies to `memmap2`, `ptr_hash`, and `obicompactvec`. Additionally, document strict genomics-only memory constraints and enforce an architectural feedback workflow requiring explicit user authorization before structural changes.
18 lines
1.1 KiB
Markdown
---
name: No architectural decisions without explicit authorization
description: Never make architectural or design decisions without explicit user approval; code decisions are the user's alone
type: feedback
---

Never make architectural decisions unilaterally. This includes:
- Memory layout or footprint changes
- Algorithm or data structure choices (HashSet vs streaming, etc.)
- Dependency additions or substitutions
- Structural refactors that go beyond the exact task requested

If you observe a bug or inefficiency, **report it and propose alternatives**. Do not fix it without explicit authorization.

**Why:** The user optimizes for minimal memory footprint at all times. Introducing a `HashSet` in `count_kmer()` (replacing the intended streaming `GOFunction` construction from the sidecar estimate) caused a serious memory regression that went unreported. This is unacceptable on a project where memory efficiency is a core constraint.

**How to apply:** When you notice an architectural issue while editing code (even when the fix looks like a clear improvement), stop, describe the problem and the options, and wait for an explicit go-ahead before changing anything.