TINY 1KB (class7) SEGV Triage Plan
==================================

Scope
- Reproducible SEGV on fixed-size 1KB bench: `./bench_fixed_size_hakmem 200000 1024 128`
- Persists with Direct-FC OFF. Likely in non-direct P0 or legacy refill path.
- Goal: isolate failing path, capture backtrace, prove root cause, and patch with minimal deltas.

Quick Repro Matrix
- Release (baseline):
  - `./build.sh release bench_fixed_size_hakmem`
  - `./bench_fixed_size_hakmem 200000 1024 128` → SEGV
- Disable P0 (all classes):
  - `HAKMEM_TINY_P0_DISABLE=1 ./bench_fixed_size_hakmem 200000 1024 128` → Check (SEGV persists?)
- Disable remote drain:
  - `HAKMEM_TINY_P0_NO_DRAIN=1 ./bench_fixed_size_hakmem 200000 1024 128` → Check
- Assume 1T (disable remote side-table):
  - `HAKMEM_TINY_ASSUME_1T=1 ./bench_fixed_size_hakmem 200000 1024 128` → Check

Debug Build + Guards
1) Build debug flavor
   - `./build.sh debug bench_fixed_size_hakmem`
2) Strong safety/guards
   - `export HAKMEM_TINY_SAFE_FREE_STRICT=1`
   - `export HAKMEM_TINY_DEBUG_REMOTE_GUARD=1`
   - `export HAKMEM_INVALID_FREE_LOG=1`
   - `export HAKMEM_TINY_RF_FORCE_NOTIFY=1`
3) Run under gdb
   - `gdb --args ./bench_fixed_size_hakmem 200000 1024 128`
   - `(gdb) run`
   - On crash: `(gdb) bt`, `(gdb) frame 0`, `(gdb) p/x *meta`, `(gdb) p tls->slab_idx`, `(gdb) p tls->ss`, `(gdb) p meta->used`, `(gdb) p meta->carved`, `(gdb) p meta->capacity`

Hypotheses (ranked)
1) Capacity/stride mismatch in class7 carve
   - class7 uses stride=1024 (no 1B header). Any code calculating with `bs = class_size + 1` will overstep.
   - Check: `superslab_init_slab()` capacity, and any linear carve helper uses the same stride consistently.
2) TLS slab switch with stale pointers (already fixed for P0 direct path; check legacy/P0-general)
   - After `superslab_refill()`, ensure `tls = &g_tls_slabs[c]; meta = tls->meta;` reloaded before counters/linear carve.
3) Remote drain corrupts freelist
   - Verify sentinel cleared; ensure drain happens before freelist pop; check class7 path uses same ordering.

Files to Inspect
- `core/tiny_superslab_alloc.inc.h` (superslab_refill, adopt_bind_if_safe, stride/capacity)
- `core/hakmem_tiny_refill.inc.h` (legacy SLL refill, carve/pop ordering, bounds checks)
- `core/hakmem_tiny_refill_p0.inc.h` (P0 general path; C7 is currently guarded OFF for direct-FC; confirm P0 batch not entering for C7)
- `core/superslab/superslab_inline.h` (remote drain, sentinel guard)

Instrumentation to Add (debug-only)
- In `superslab_init_slab(ss, idx, class_size, tid)`:
  - Compute `stride = class_size + (class_idx != 7 ? 1 : 0)`; assert `meta->capacity == usable/stride`.
- In linear carve path (legacy + P0-general):
  - Before write: assert `meta->carved < meta->capacity`; compute base and assert `ptr < slab_base+usable`.
- After `superslab_refill()` in any loop: rebind `tls/meta` unconditionally.

Bisect Switches
- Kill P0 entirely: `HAKMEM_TINY_P0_DISABLE=1`
- Skip remote drain: `HAKMEM_TINY_P0_NO_DRAIN=1`
- Assume ST mode: `HAKMEM_TINY_ASSUME_1T=1`
- Disable simplified refills (if applicable): `HAKMEM_TINY_SIMPLE_REFILL=0` (add if not present)

Patch Strategy (expected minimal fix)
1) Make class7 stride consistently 1024 in all carve paths (no +1 header). Audit bs computations.
2) Ensure tls/meta rebind after every `superslab_refill()` in non-direct paths.
3) Enforce drain-before-pop ordering and sentinel clear.

Acceptance Criteria
- `./bench_fixed_size_hakmem 200000 1024 128` passes 3/3 without SEGV.
- Debug counters show `active_delta == taken` (no mismatch).
- No invalid-free logs under STRICT mode.

Notes
- We already defaulted C7 Direct-FC to OFF and guarded P0 entry for C7 unless explicitly enabled (`HAKMEM_TINY_P0_C7_ENABLE=1`).
- Focus on legacy/P0-general carve paths for C7.
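
Sketch: stride and carve bounds checks
- The debug-only checks listed under Instrumentation to Add might look roughly like the C sketch below. It is a minimal illustration, not the actual hakmem code: every `demo_*` name is hypothetical, and the 64KiB usable size mentioned afterwards is an assumed value chosen only to make the arithmetic concrete.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers; names and signatures are NOT from the hakmem sources. */

/* Stride rule from the plan: class7 carries no 1B header, other classes do. */
static inline size_t demo_stride(int class_idx, size_t class_size) {
    return class_size + (class_idx != 7 ? 1u : 0u);
}

/* Capacity that superslab_init_slab() is expected to record for a slab. */
static inline size_t demo_capacity(size_t usable, size_t stride) {
    return usable / stride;
}

/* Returns 1 if carving block number `carved` with `stride` stays inside the
 * slab, 0 otherwise. The end of the block is checked, not just its start. */
static inline int demo_carve_in_bounds(size_t carved, size_t capacity,
                                       size_t stride, size_t usable) {
    if (carved >= capacity) {
        return 0;                     /* slab already fully carved */
    }
    return (carved + 1) * stride <= usable;
}

/* Debug-only wrapper: aborts at the first out-of-bounds carve. */
static inline void demo_assert_carve(size_t carved, size_t capacity,
                                     size_t stride, size_t usable) {
    assert(demo_carve_in_bounds(carved, capacity, stride, usable));
}
```

- With an assumed usable size of 65536 bytes, stride 1024 gives capacity 64. Carving the last block (index 63) with a mistaken `bs = 1025` ends at byte 65600, past the slab, so `demo_carve_in_bounds(63, 64, 1025, 65536)` returns 0. This is the overstep described in hypothesis 1.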