TINY 1KB (class7) SEGV Triage Plan
Scope
- Reproducible SEGV on fixed-size 1KB bench:
  ./bench_fixed_size_hakmem 200000 1024 128
- Persists with Direct-FC OFF. Likely in non-direct P0 or legacy refill path.
- Goal: isolate failing path, capture backtrace, prove root cause, and patch with minimal deltas.
Quick Repro Matrix
- Release (baseline):
  ./build.sh release bench_fixed_size_hakmem
  ./bench_fixed_size_hakmem 200000 1024 128
  → SEGV
- Disable P0 (all classes):
  HAKMEM_TINY_P0_DISABLE=1 ./bench_fixed_size_hakmem 200000 1024 128
  → Check (SEGV persists?)
- Disable remote drain:
  HAKMEM_TINY_P0_NO_DRAIN=1 ./bench_fixed_size_hakmem 200000 1024 128
  → Check
- Assume 1T (disable remote side-table):
  HAKMEM_TINY_ASSUME_1T=1 ./bench_fixed_size_hakmem 200000 1024 128
  → Check
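The matrix above can also be driven from a small harness that records whether each variant dies with SIGSEGV. The following is a minimal sketch, assuming a POSIX system; `sketch_run_with_env` is a hypothetical helper name and is not part of hakmem. It sets one switch to `1` in the child only, runs the command, and reports how the child ended.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of hakmem).
 * Runs argv in a child process, optionally with env_name=1 set in the child.
 * Returns: the signal number if the child was killed by a signal,
 *          0 if it exited with status 0,
 *          -1 for anything else (non-zero exit, exec/fork/wait failure). */
static int sketch_run_with_env(const char *env_name, char *const argv[]) {
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        /* Child: apply the bisect switch, then replace the process image. */
        if (env_name != NULL && setenv(env_name, "1", 1) != 0) _exit(126);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    if (WIFSIGNALED(status)) return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) return 0;
    return -1;
}
```

Called with `argv = {"./bench_fixed_size_hakmem", "200000", "1024", "128", NULL}` and each switch name in turn, a return value equal to `SIGSEGV` means the crash persists for that variant.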
Debug Build + Guards
- Build debug flavor:
  ./build.sh debug bench_fixed_size_hakmem
- Strong safety/guards:
  export HAKMEM_TINY_SAFE_FREE_STRICT=1
  export HAKMEM_TINY_DEBUG_REMOTE_GUARD=1
  export HAKMEM_INVALID_FREE_LOG=1
  export HAKMEM_TINY_RF_FORCE_NOTIFY=1
- Run under gdb:
  gdb --args ./bench_fixed_size_hakmem 200000 1024 128
  (gdb) run
- On crash:
  (gdb) bt
  (gdb) frame 0
  (gdb) p/x *meta
  (gdb) p tls->slab_idx
  (gdb) p tls->ss
  (gdb) p meta->used
  (gdb) p meta->carved
  (gdb) p meta->capacity
Hypotheses (ranked)
- Capacity/stride mismatch in class7 carve
  - class7 uses stride=1024 (no 1B header). Any code calculating with bs = class_size + 1 will overstep.
  - Check: superslab_init_slab() capacity, and any linear carve helper uses the same stride consistently.
- TLS slab switch with stale pointers (already fixed for P0 direct path; check legacy/P0-general)
  - After superslab_refill(), ensure tls = &g_tls_slabs[c]; meta = tls->meta; are reloaded before counters/linear carve.
- Remote drain corrupts freelist
  - Verify sentinel cleared; ensure drain happens before freelist pop; check class7 path uses same ordering.
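To make the first hypothesis concrete, the overstep can be worked through numerically. This is a sketch under stated assumptions: the 64 KiB usable size is an illustrative value, not taken from the hakmem sources, and every `sketch_*` name is hypothetical. It shows what happens when capacity is derived with stride 1024 but a carve loop advances by `class_size + 1`.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed usable payload bytes per slab (illustrative only). */
enum { SKETCH_USABLE_BYTES = 64 * 1024 };

/* Stride rule as stated in the plan: class 7 carries no 1-byte header. */
static size_t sketch_stride(int class_idx, size_t class_size) {
    return class_size + (class_idx != 7 ? 1u : 0u);
}

/* Number of blocks that fit when capacity is derived from `stride`. */
static size_t sketch_capacity(size_t usable, size_t stride) {
    assert(stride != 0);
    return usable / stride;
}

/* Bytes by which a carve of `capacity` blocks runs past the slab when
 * capacity was computed with cap_stride but the carve uses carve_stride.
 * Returns 0 if the carve stays in bounds. */
static size_t sketch_overstep(size_t usable, size_t cap_stride,
                              size_t carve_stride) {
    size_t cap = sketch_capacity(usable, cap_stride);
    size_t end = cap * carve_stride;
    return end > usable ? end - usable : 0;
}
```

With these assumed numbers, capacity is 64 blocks, and a carve using 1025-byte steps ends 64 bytes past the usable region. The last block is therefore written partly outside the slab.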
Files to Inspect
- core/tiny_superslab_alloc.inc.h (superslab_refill, adopt_bind_if_safe, stride/capacity)
- core/hakmem_tiny_refill.inc.h (legacy SLL refill, carve/pop ordering, bounds checks)
- core/hakmem_tiny_refill_p0.inc.h (P0 general path; C7 is currently guarded OFF for direct-FC; confirm P0 batch not entering for C7)
- core/superslab/superslab_inline.h (remote drain, sentinel guard)
Instrumentation to Add (debug-only)
- In superslab_init_slab(ss, idx, class_size, tid):
  - Compute stride = class_size + (class_idx != 7 ? 1 : 0); assert meta->capacity == usable/stride.
- In linear carve path (legacy + P0-general):
  - Before write: assert meta->carved < meta->capacity; compute base and assert ptr < slab_base+usable.
- After superslab_refill() in any loop: rebind tls/meta unconditionally.
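The carve-path checks above could take roughly the following shape. This is a sketch, not the hakmem implementation: `sketch_meta_t` models only the three counters named in this plan, and the function names are hypothetical. The bounds check here is slightly stricter than `ptr < slab_base+usable`: it requires the whole block (`offset + stride`) to fit, which also catches a block that starts in bounds but ends past the slab.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-slab metadata (used/carved/capacity only). */
typedef struct {
    uint32_t used;
    uint32_t carved;
    uint32_t capacity;
} sketch_meta_t;

/* Returns 1 if carving one more block of `stride` bytes stays in bounds. */
static int sketch_carve_ok(const sketch_meta_t *meta, size_t usable,
                           size_t stride) {
    if (meta->carved >= meta->capacity) return 0;
    size_t off = (size_t)meta->carved * stride;
    return off + stride <= usable;
}

/* Carves one block linearly. In debug builds a violation trips the assert;
 * with NDEBUG defined it returns NULL instead of writing out of bounds. */
static void *sketch_carve_one(sketch_meta_t *meta, uint8_t *slab_base,
                              size_t usable, size_t stride) {
    if (!sketch_carve_ok(meta, usable, stride)) {
        assert(!"linear carve would overrun the slab");
        return NULL;
    }
    uint8_t *ptr = slab_base + (size_t)meta->carved * stride;
    meta->carved++;
    meta->used++;
    return ptr;
}
```

Keeping the predicate separate from the assert lets the same check feed a debug log line or a unit test without aborting the process.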
Bisect Switches
- Kill P0 entirely: HAKMEM_TINY_P0_DISABLE=1
- Skip remote drain: HAKMEM_TINY_P0_NO_DRAIN=1
- Assume ST mode: HAKMEM_TINY_ASSUME_1T=1
- Disable simplified refills (if applicable): HAKMEM_TINY_SIMPLE_REFILL=0 (add if not present)
Patch Strategy (expected minimal fix)
- Make class7 stride consistently 1024 in all carve paths (no +1 header). Audit bs computations.
- Ensure tls/meta rebind after every superslab_refill() in non-direct paths.
- Enforce drain-before-pop ordering and sentinel clear.
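The rebind rule can be illustrated with a small mock. Everything below is a sketch under assumptions: the types, the two-slab pool, and `sketch_refill` are hypothetical stand-ins, and the real `superslab_refill()` signature and `g_tls_slabs` layout may differ. The point is only the loop shape, in which `tls` and `meta` are reloaded after every refill before any counter is touched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t used, carved, capacity; } sketch_meta_t;
typedef struct { sketch_meta_t *meta; int slab_idx; } sketch_tls_slab_t;

enum { SKETCH_NUM_CLASSES = 8, SKETCH_NUM_SLABS = 2 };

/* Mock state: one TLS entry per class and a tiny pool of two-block slabs. */
static sketch_tls_slab_t sketch_tls_slabs[SKETCH_NUM_CLASSES];
static sketch_meta_t sketch_metas[SKETCH_NUM_SLABS] = { {0, 0, 2}, {0, 0, 2} };

/* Mock refill: binds class c to the next slab; returns 0 when none is left. */
static int sketch_refill(int c) {
    sketch_tls_slab_t *tls = &sketch_tls_slabs[c];
    int next = (tls->meta == NULL) ? 0 : tls->slab_idx + 1;
    if (next >= SKETCH_NUM_SLABS) return 0;
    tls->slab_idx = next;
    tls->meta = &sketch_metas[next];
    return 1;
}

/* Carves up to `want` blocks for class c and returns how many were taken. */
static uint32_t sketch_take(int c, uint32_t want) {
    assert(c >= 0 && c < SKETCH_NUM_CLASSES);
    uint32_t taken = 0;
    sketch_tls_slab_t *tls = &sketch_tls_slabs[c];
    sketch_meta_t *meta = tls->meta;
    while (taken < want) {
        if (meta == NULL || meta->carved >= meta->capacity) {
            if (!sketch_refill(c)) break;
            /* Rebind unconditionally: the refill may have switched slabs,
             * so the cached meta pointer is stale from here on. */
            tls = &sketch_tls_slabs[c];
            meta = tls->meta;
            continue;
        }
        meta->carved++;
        meta->used++;
        taken++;
    }
    return taken;
}
```

Without the two rebind lines, the loop would keep incrementing the counters of the exhausted slab, which is the stale-pointer failure described in the hypotheses. The sum of `used` across slabs should always equal the total returned by `sketch_take`.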
Acceptance Criteria
- ./bench_fixed_size_hakmem 200000 1024 128 passes 3/3 without SEGV.
- Debug counters show active_delta == taken (no mismatch).
- No invalid-free logs under STRICT mode.
Notes
- We already defaulted C7 Direct-FC to OFF and guarded P0 entry for C7 unless explicitly enabled (HAKMEM_TINY_P0_C7_ENABLE=1).
- Focus on legacy/P0-general carve paths for C7.