hakmem/docs/BUILD_PHASE7_POOL_TLS.md
Moe Charm (CI) 1010a961fb Tiny: fix header/stride mismatch and harden refill paths
- Root cause: header-based class indexing (HEADER_CLASSIDX=1) wrote a 1-byte
  header during allocation, but linear carve/refill and initial slab capacity
  still used bare class block sizes. This mismatch could overrun slab usable
  space and corrupt freelists, causing reproducible SEGV at ~100k iters.

Changes
- Superslab: compute capacity with effective stride (block_size + header for
  classes 0..6; class7 remains headerless) in superslab_init_slab(). Add a
  debug-only bound check in superslab_alloc_from_slab() to fail fast if carve
  would exceed usable bytes.
- Refill (non-P0 and P0): use header-aware stride for all linear carving and
  TLS window bump operations. Ensure alignment/validation in tiny_refill_opt.h
  also uses stride, not raw class size.
- Drain: keep existing defense-in-depth for remote sentinel and sanitize nodes
  before splicing into freelist (already present).

Notes
- This unifies the memory layout across alloc/linear-carve/refill with a single
  stride definition and keeps class7 (1024B) headerless as designed.
- Debug builds add fail-fast checks; release builds remain lean.

Next
- Re-run Tiny benches (256/1024B) in debug to confirm stability, then in
  release. If any remaining crash persists, bisect with HAKMEM_TINY_P0_BATCH_REFILL=0
  to isolate P0 batch carve, and continue reducing branch-miss as planned.
2025-11-09 18:55:50 +09:00

# HAKMEM Phase 7 + Pool TLS Phase 1.5b — Build & Run Cheatsheet
This document captures the stable build/run recipe used for recent benches.
## One-liner Build (recommended)
```
./build.sh <target>
# examples
./build.sh bench_mid_large_mt_hakmem
./build.sh bench_random_mixed_hakmem
./build.sh larson_hakmem
```
`build.sh` enables the following switches at build time:
- POOL_TLS_PHASE1=1 (Pool TLS Phase 1.5b)
- HEADER_CLASSIDX=1 (Phase 7 header)
- AGGRESSIVE_INLINE=1
- PREWARM_TLS=1
Verify switches:
```
make print-flags
```
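To make the check above scriptable, here is a minimal sketch of a helper that reads text on stdin and reports which of the four switches listed earlier are not set to 1. The function name `missing_flags` is hypothetical, and the sketch assumes that `make print-flags` prints each switch in a `NAME=1` form (possibly with a prefix such as `-D`); adjust the pattern if the real output differs.

```shell
# Hypothetical helper: list expected build switches that are not set to 1
# in the text supplied on stdin. Assumes each switch appears as NAME=1.
missing_flags() {
    input=$(cat)
    missing=""
    for flag in POOL_TLS_PHASE1 HEADER_CLASSIDX AGGRESSIVE_INLINE PREWARM_TLS; do
        # Match NAME=1 (optional spaces around '='), but not NAME=10 etc.
        if ! printf '%s\n' "$input" | grep -Eq "${flag}[[:space:]]*=[[:space:]]*1([^0-9]|\$)"; then
            missing="$missing $flag"
        fi
    done
    # Print the missing names (if any) and return non-zero in that case.
    [ -z "$missing" ] || { echo "missing:$missing"; return 1; }
}
```

Intended use would be `make print-flags | missing_flags`, which stays silent and returns 0 when all four switches are found.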
Optional safety/verbosity toggles:
- `HAKMEM_TINY_SAFE_FREE=1` — strict free validation (mincore on all frees). Slower but safest.
- `HAKMEM_DEBUG_VERBOSE=1` — enable verbose logs for Tiny header/free, etc.
Example:
```
make clean && make HAKMEM_TINY_SAFE_FREE=1 POOL_TLS_PHASE1=1 HEADER_CLASSIDX=1 AGGRESSIVE_INLINE=1 PREWARM_TLS=1 bench_mid_large_mt_hakmem
```
## Bench Recipes (used in reports)
Larson (Mixed)
```
./build.sh larson_hakmem
make larson_system
./larson_hakmem 2 8 128 1024 1 12345 1
./larson_hakmem 2 8 128 1024 1 12345 4
./larson_system 2 8 128 1024 1 12345 1
./larson_system 2 8 128 1024 1 12345 4
```
Pool TLS (8-52KB)
```
./build.sh bench_pool_tls_hakmem
make bench_pool_tls_system
./bench_pool_tls_hakmem 1 100000 256 42
./bench_pool_tls_hakmem 4 50000 256 42
./bench_pool_tls_system 1 100000 256 42
./bench_pool_tls_system 4 50000 256 42
```
Random Mixed (Tiny 128-1024B)
```
./build.sh bench_random_mixed_hakmem
make bench_random_mixed_system
for s in 128 256 512 1024; do \
./bench_random_mixed_hakmem 100000 $s 42; \
./bench_random_mixed_system 100000 $s 42; \
done
```
Mid-Large MT (8-32KB)
```
./build.sh bench_mid_large_mt_hakmem
make bench_mid_large_mt_system
./bench_mid_large_mt_hakmem 1 100000 256 42
./bench_mid_large_mt_hakmem 4 50000 256 42
./bench_mid_large_mt_system 1 100000 256 42
./bench_mid_large_mt_system 4 50000 256 42
```
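The recipes above all run a `<bench>_hakmem` binary and its `<bench>_system` counterpart with identical arguments. As a convenience, that pairing could be wrapped roughly as follows. This is a sketch only: `run_pair` and the `DRY_RUN` variable are hypothetical names, not part of the repository, and the binaries are assumed to be already built in the current directory.

```shell
# Hypothetical helper: run the hakmem and system variants of one benchmark
# with the same arguments. With DRY_RUN=1 the commands are printed instead.
run_pair() {
    bench=$1
    shift
    for variant in hakmem system; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "./${bench}_${variant} $*"
        else
            "./${bench}_${variant}" "$@"
        fi
    done
}
```

For example, `run_pair bench_mid_large_mt 4 50000 256 42` would run the two 4-thread commands from the Mid-Large MT recipe back to back.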
## Mimalloc note (when comparing)
Direct-link mimalloc benches need the mimalloc library directory on the runtime library path:
```
export LD_LIBRARY_PATH=$PWD/mimalloc-bench/extern/mi/out/release
```
## Build hygiene
- Always prefer `./build.sh` over ad-hoc `make` (prevents flag drift)
- Check switches: `make print-flags`
- Verify freshness: `./verify_build.sh <binary>`
## Arena (Pool TLS) ENV
You can tune the Pool TLS Arena growth via environment variables:
```
# Initial chunk size in MB (default: 1)
export HAKMEM_POOL_TLS_ARENA_MB_INIT=2
# Maximum chunk size in MB (default: 8)
export HAKMEM_POOL_TLS_ARENA_MB_MAX=16
# Number of growth levels (default: 3 → 1→2→4→8MB)
export HAKMEM_POOL_TLS_ARENA_GROWTH_LEVELS=4
```
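To preview what a given combination of these variables implies, the sketch below prints a chunk-size sequence in MB. It assumes that each growth level doubles the previous chunk size and that sizes are capped at the maximum; the doubling rule is inferred from the `1→2→4→8MB` default noted above and is not confirmed against the allocator source. The function name `arena_growth_schedule` is hypothetical.

```shell
# Hypothetical helper: print the chunk-size sequence (in MB) implied by the
# Pool TLS Arena variables. Assumes doubling per level, capped at the maximum.
arena_growth_schedule() {
    size=${HAKMEM_POOL_TLS_ARENA_MB_INIT:-1}
    max=${HAKMEM_POOL_TLS_ARENA_MB_MAX:-8}
    levels=${HAKMEM_POOL_TLS_ARENA_GROWTH_LEVELS:-3}
    schedule=$size
    i=0
    while [ "$i" -lt "$levels" ]; do
        size=$((size * 2))
        # Never exceed the configured maximum chunk size.
        if [ "$size" -gt "$max" ]; then
            size=$max
        fi
        schedule="$schedule $size"
        i=$((i + 1))
    done
    echo "$schedule"
}
```

With all three variables unset, the function prints `1 2 4 8`, matching the documented default progression.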