feat: implement NUMA-aware worker pools for merge command
Replaces the global Rayon pool with per-NUMA-node thread pools that pin worker threads to their respective nodes, leveraging Linux first-touch allocation to reduce cross-NUMA memory contention and improve cache locality. Integrates the `hwlocality` crate with a vendored build, includes graceful fallbacks for single-socket or non-Linux systems, and updates dependency constraints. Also adds installation and architecture documentation, and corrects parallelism detection in the partitioner.
This commit is contained in:
@@ -0,0 +1,68 @@
|
||||
# Installation
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### Rust toolchain
|
||||
|
||||
`obikmer` requires **Rust 1.85 or later** (edition 2024). Install or update via [rustup](https://rustup.rs):
|
||||
|
||||
```bash
|
||||
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
|
||||
rustup update stable
|
||||
```
|
||||
|
||||
### C build environment (required for hwloc)
|
||||
|
||||
`obikmer` embeds [hwloc](https://www.open-mpi.org/projects/hwloc/) (Hardware Locality) for NUMA-aware thread placement on multi-socket machines. hwloc is built from source at compile time via the `vendored` feature of the `hwlocality` crate. This requires a standard C build environment.
|
||||
|
||||
#### Linux (Debian/Ubuntu)
|
||||
|
||||
```bash
|
||||
apt install build-essential automake libtool autoconf pkg-config
|
||||
```
|
||||
|
||||
#### Linux (RHEL/Rocky/AlmaLinux)
|
||||
|
||||
```bash
|
||||
dnf install gcc make automake libtool autoconf pkgconfig
|
||||
```
|
||||
|
||||
#### HPC clusters
|
||||
|
||||
Most HPC clusters provide these tools via the module system:
|
||||
|
||||
```bash
|
||||
module load gcc automake libtool autoconf
|
||||
```
|
||||
|
||||
If in doubt, check whether `autoreconf --version` and `libtool --version` return successfully.
|
||||
|
||||
#### macOS
|
||||
|
||||
```bash
|
||||
brew install automake libtool autoconf pkg-config
|
||||
```
|
||||
|
||||
## Building
|
||||
|
||||
```bash
|
||||
git clone <repository-url>
|
||||
cd obikmer/src
|
||||
cargo build --release
|
||||
```
|
||||
|
||||
The compiled binary is at `target/release/obikmer`.
|
||||
|
||||
## NUMA support
|
||||
|
||||
NUMA-aware thread placement is active automatically on multi-socket Linux machines (detected at runtime via hwloc). No special build flag is required — the detection is built in and falls back gracefully to the single-pool adaptive strategy on:
|
||||
|
||||
- macOS (Apple Silicon, unified memory)
|
||||
- single-socket Linux machines
|
||||
- any system where hwloc reports only one NUMA node
|
||||
|
||||
## Verifying the installation
|
||||
|
||||
```bash
|
||||
obikmer --help
|
||||
```
|
||||
Reference in New Issue
Block a user