Root cause: header-based class indexing (HEADER_CLASSIDX=1) wrote a 1-byte header during allocation, but linear carve/refill and initial slab capacity still used bare class block sizes. This mismatch could overrun slab usable space and corrupt freelists, causing reproducible SEGV at ~100k iters.

Changes
- Superslab: compute capacity with effective stride (block_size + header for classes 0..6; class7 remains headerless) in superslab_init_slab(). Add a debug-only bound check in superslab_alloc_from_slab() to fail fast if carve would exceed usable bytes.
- Refill (non-P0 and P0): use header-aware stride for all linear carving and TLS window bump operations. Ensure alignment/validation in tiny_refill_opt.h also uses stride, not raw class size.
- Drain: keep existing defense-in-depth for remote sentinel and sanitize nodes before splicing into freelist (already present).

Notes
- This unifies the memory layout across alloc/linear-carve/refill with a single stride definition and keeps class7 (1024B) headerless as designed.
- Debug builds add fail-fast checks; release builds remain lean.

Next
- Re-run Tiny benches (256/1024B) in debug to confirm stability, then in release. If any remaining crash persists, bisect with HAKMEM_TINY_P0_BATCH_REFILL=0 to isolate P0 batch carve, and continue reducing branch-miss as planned.
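To make the stride mismatch concrete, here is a minimal arithmetic sketch in shell. The 1-byte header and the headerless 1024B class 7 come from the commit message. The 64 KiB usable size, the choice of class 5 for 256B blocks, and the function names are assumptions picked only for illustration, not values from the allocator.

```bash
#!/usr/bin/env bash
# Sketch only: function names and sample numbers are hypothetical.

HEADER_BYTES=1   # 1-byte class-index header (from the commit message)

# Effective stride: block size plus header for classes 0..6; class 7 stays headerless.
tiny_stride() {
  local class_idx=$1 block_size=$2
  if (( class_idx == 7 )); then
    echo "$block_size"
  else
    echo $(( block_size + HEADER_BYTES ))
  fi
}

# Number of blocks that fit when carving linearly with the given stride.
slab_capacity() {
  local usable_bytes=$1 stride=$2
  echo $(( usable_bytes / stride ))
}

usable=65536                    # assumed usable bytes per slab
stride=$(tiny_stride 5 256)     # assumed: class 5 holds 256B blocks
echo "header-aware capacity: $(slab_capacity "$usable" "$stride")"
echo "bare-size capacity:    $(slab_capacity "$usable" 256)"
# Carving the bare-size count at the header-aware stride would need
# 256 * 257 = 65792 bytes, which is more than the 65536 available.
```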
# HAKMEM Phase 7 + Pool TLS Phase 1.5b: Build & Run Cheatsheet

This document captures the stable build/run recipe used for recent benches.

## One‑liner Build (recommended)

```
./build.sh <target>

# examples
./build.sh bench_mid_large_mt_hakmem
./build.sh bench_random_mixed_hakmem
./build.sh larson_hakmem
```
`./build.sh` enables these switches at build time:
- `POOL_TLS_PHASE1=1` (Pool TLS Phase 1.5b)
- `HEADER_CLASSIDX=1` (Phase 7 header)
- `AGGRESSIVE_INLINE=1`
- `PREWARM_TLS=1`

Verify switches:

```
make print-flags
```
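The check can also be scripted. The sketch below assumes the flag report contains whitespace-delimited `NAME=VALUE` tokens; the actual output format of `make print-flags` is not shown in this document, so treat the matching rule and the `check_flags` name as hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: assumes the report contains NAME=VALUE tokens; check_flags is a hypothetical name.

# Reads a flag report on stdin; returns non-zero if any expected switch is missing.
check_flags() {
  local report flag missing=0
  report=$(cat)
  for flag in POOL_TLS_PHASE1=1 HEADER_CLASSIDX=1 AGGRESSIVE_INLINE=1 PREWARM_TLS=1; do
    # -F: fixed string, -w: whole-word match so "=1" does not match "=10"
    if ! grep -qwF -- "$flag" <<<"$report"; then
      echo "missing: $flag" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (assumed): make print-flags | check_flags
```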
Optional safety/verbosity toggles:
- `HAKMEM_TINY_SAFE_FREE=1`: strict free validation (`mincore` on all frees). Slower but safest.
- `HAKMEM_DEBUG_VERBOSE=1`: enable verbose logs for Tiny header/free, etc.

Example:

```
make clean && make HAKMEM_TINY_SAFE_FREE=1 POOL_TLS_PHASE1=1 HEADER_CLASSIDX=1 AGGRESSIVE_INLINE=1 PREWARM_TLS=1 bench_mid_large_mt_hakmem
```
## Bench Recipes (used in reports)

Larson (Mixed)

```
./build.sh larson_hakmem
make larson_system
./larson_hakmem 2 8 128 1024 1 12345 1
./larson_hakmem 2 8 128 1024 1 12345 4
./larson_system 2 8 128 1024 1 12345 1
./larson_system 2 8 128 1024 1 12345 4
```

Pool TLS (8–52KB)

```
./build.sh bench_pool_tls_hakmem
make bench_pool_tls_system
./bench_pool_tls_hakmem 1 100000 256 42
./bench_pool_tls_hakmem 4 50000 256 42
./bench_pool_tls_system 1 100000 256 42
./bench_pool_tls_system 4 50000 256 42
```

Random Mixed (Tiny 128–1024B)

```
./build.sh bench_random_mixed_hakmem
make bench_random_mixed_system
for s in 128 256 512 1024; do \
./bench_random_mixed_hakmem 100000 $s 42; \
./bench_random_mixed_system 100000 $s 42; \
done
```

Mid‑Large MT (8–32KB)

```
./build.sh bench_mid_large_mt_hakmem
make bench_mid_large_mt_system
./bench_mid_large_mt_hakmem 1 100000 256 42
./bench_mid_large_mt_hakmem 4 50000 256 42
./bench_mid_large_mt_system 1 100000 256 42
./bench_mid_large_mt_system 4 50000 256 42
```
## Mimalloc note (when comparing)

Direct‑link mimalloc benches need the library directory on the runtime search path:

```
export LD_LIBRARY_PATH=$PWD/mimalloc-bench/extern/mi/out/release
```
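The export above replaces any existing `LD_LIBRARY_PATH`. If other directories are already on that path, a prepend variant may be preferable. This is a small sketch; the helper name `prepend_ld_path` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: prepend_ld_path is a hypothetical helper name.

# Prepend a directory to LD_LIBRARY_PATH, keeping existing entries and
# avoiding a trailing ':' when the variable was empty or unset.
prepend_ld_path() {
  local dir=$1
  export LD_LIBRARY_PATH="${dir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
}

prepend_ld_path "$PWD/mimalloc-bench/extern/mi/out/release"
echo "$LD_LIBRARY_PATH"
```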
## Build hygiene

- Always prefer `./build.sh` over ad‑hoc `make` (prevents flag drift)
- Check switches: `make print-flags`
- Verify freshness: `./verify_build.sh <binary>`

## Arena (Pool TLS) ENV

You can tune the Pool TLS Arena growth via ENV vars:

```
# Initial chunk size in MB (default: 1)
export HAKMEM_POOL_TLS_ARENA_MB_INIT=2

# Maximum chunk size in MB (default: 8)
export HAKMEM_POOL_TLS_ARENA_MB_MAX=16

# Number of growth levels (default: 3, i.e. 1→2→4→8MB)
export HAKMEM_POOL_TLS_ARENA_GROWTH_LEVELS=4
```
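As a rough illustration of how the three knobs interact, the sketch below prints the chunk-size sequence, assuming each growth level doubles the previous chunk and the result is clamped to the maximum. The documented default (1→2→4→8MB) is consistent with that rule, but the doubling-and-clamp behavior and the function name `arena_chunk_sizes` are assumptions, not taken from the allocator source.

```bash
#!/usr/bin/env bash
# Sketch only: doubling-with-clamp is an assumed growth rule; the function name is hypothetical.

# Print the chunk sizes (in MB) for: initial size, maximum size, number of growth levels.
arena_chunk_sizes() {
  local size=$1 max=$2 levels=$3 out i
  if (( size > max )); then size=$max; fi
  out=$size
  for (( i = 0; i < levels; i++ )); do
    size=$(( size * 2 ))
    if (( size > max )); then size=$max; fi
    out+=" $size"
  done
  echo "$out"
}

# Uses the ENV vars when set, otherwise the documented defaults (1, 8, 3).
arena_chunk_sizes \
  "${HAKMEM_POOL_TLS_ARENA_MB_INIT:-1}" \
  "${HAKMEM_POOL_TLS_ARENA_MB_MAX:-8}" \
  "${HAKMEM_POOL_TLS_ARENA_GROWTH_LEVELS:-3}"
```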