TINY 1KB (class7) SEGV Triage Plan
==================================

Scope
- Reproducible SEGV on the fixed-size 1KB bench: `./bench_fixed_size_hakmem 200000 1024 128`
- The crash persists with Direct-FC OFF, so the fault is likely in the non-direct P0 path or the legacy refill path.
- Goal: isolate the failing path, capture a backtrace, prove the root cause, and patch with minimal deltas.

Quick Repro Matrix
- Release (baseline):
  - `./build.sh release bench_fixed_size_hakmem`
  - `./bench_fixed_size_hakmem 200000 1024 128` → SEGV
- Disable P0 (all classes):
  - `HAKMEM_TINY_P0_DISABLE=1 ./bench_fixed_size_hakmem 200000 1024 128` → Check (SEGV persists?)
- Disable remote drain:
  - `HAKMEM_TINY_P0_NO_DRAIN=1 ./bench_fixed_size_hakmem 200000 1024 128` → Check
- Assume 1T (disable remote side-table):
  - `HAKMEM_TINY_ASSUME_1T=1 ./bench_fixed_size_hakmem 200000 1024 128` → Check

Debug Build + Guards
1) Build debug flavor
   - `./build.sh debug bench_fixed_size_hakmem`
2) Strong safety/guards
   - `export HAKMEM_TINY_SAFE_FREE_STRICT=1`
   - `export HAKMEM_TINY_DEBUG_REMOTE_GUARD=1`
   - `export HAKMEM_INVALID_FREE_LOG=1`
   - `export HAKMEM_TINY_RF_FORCE_NOTIFY=1`
3) Run under gdb
   - `gdb --args ./bench_fixed_size_hakmem 200000 1024 128`
   - `(gdb) run`
   - On crash: `(gdb) bt`, `(gdb) frame 0`, `(gdb) p/x *meta`, `(gdb) p tls->slab_idx`, `(gdb) p tls->ss`, `(gdb) p meta->used`, `(gdb) p meta->carved`, `(gdb) p meta->capacity`

Hypotheses (ranked)
1) Capacity/stride mismatch in class7 carve
   - class7 uses stride=1024 (no 1B header). Any code that computes with `bs = class_size + 1` will overstep.
   - Check: `superslab_init_slab()` capacity, and that every linear carve helper uses the same stride consistently.
2) TLS slab switch with stale pointers (already fixed for the P0 direct path; check legacy/P0-general)
   - After `superslab_refill()`, ensure `tls = &g_tls_slabs[c]; meta = tls->meta;` are reloaded before touching counters or doing a linear carve.
3) Remote drain corrupts freelist
   - Verify the sentinel is cleared; ensure the drain happens before the freelist pop; check that the class7 path uses the same ordering.
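
The arithmetic behind hypothesis 1 can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the 64 KiB usable size used in the note below is an assumption, not a value taken from the superslab layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: class 7 (1 KiB) carries no 1-byte header; every
 * other tiny class is assumed to use class_size + 1 as its stride. */
static size_t tiny_stride(int class_idx, size_t class_size) {
    return class_size + (class_idx != 7 ? 1u : 0u);
}

/* Number of blocks that fit in `usable` bytes at the given stride. */
static size_t slab_capacity(size_t usable, size_t stride) {
    return usable / stride;
}

/* Bytes touched past the end of the slab when `capacity` blocks are carved
 * with `carve_stride`; 0 means the carve stays in bounds. */
static size_t carve_overstep(size_t usable, size_t capacity, size_t carve_stride) {
    size_t end = capacity * carve_stride;
    return end > usable ? end - usable : 0;
}

/* Debug check: aborts if carving `capacity` blocks would leave the slab. */
static void assert_carve_fits(size_t usable, size_t capacity, size_t carve_stride) {
    assert(carve_overstep(usable, capacity, carve_stride) == 0);
}
```

With an assumed usable size of 65536 bytes, capacity at stride 1024 is 64 blocks. Carving those 64 blocks with a stride of 1025 ends at byte 65600, which is 64 bytes past the slab, so a single `+ 1` in one carve path is enough to write out of bounds.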

Files to Inspect
- `core/tiny_superslab_alloc.inc.h` (superslab_refill, adopt_bind_if_safe, stride/capacity)
- `core/hakmem_tiny_refill.inc.h` (legacy SLL refill, carve/pop ordering, bounds checks)
- `core/hakmem_tiny_refill_p0.inc.h` (P0 general path – C7 is currently guarded OFF for direct-FC; confirm the P0 batch path is not entered for C7)
- `core/superslab/superslab_inline.h` (remote drain, sentinel guard)

Instrumentation to Add (debug-only)
- In `superslab_init_slab(ss, idx, class_size, tid)`:
  - Compute `stride = class_size + (class_idx != 7 ? 1 : 0)`; assert `meta->capacity == usable/stride`.
- In linear carve path (legacy + P0-general):
  - Before write: assert `meta->carved < meta->capacity`; compute base and assert `ptr < slab_base+usable`.
- After `superslab_refill()` in any loop: rebind `tls/meta` unconditionally.
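
A minimal sketch of the two carve-side guards above, assuming a simplified metadata struct. `SlabMetaSketch` and all function names are hypothetical stand-ins, not the real hakmem types. The bounds check here is slightly stricter than `ptr < slab_base+usable`: it requires the whole block, not just its first byte, to fit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the slab metadata; only the fields named in
 * this plan are modeled. */
typedef struct {
    uint32_t used;
    uint32_t carved;
    uint32_t capacity;
} SlabMetaSketch;

/* Returns 1 if the recorded capacity matches usable / stride. */
static int capacity_is_consistent(const SlabMetaSketch *meta,
                                  size_t usable, size_t stride) {
    return stride != 0 && meta->capacity == usable / stride;
}

/* Returns 1 if the next linear carve stays entirely inside the slab.
 * Works on offsets so no out-of-range pointer is ever formed. */
static int carve_in_bounds(const SlabMetaSketch *meta,
                           size_t usable, size_t stride) {
    if (meta->carved >= meta->capacity) return 0;
    size_t off = (size_t)meta->carved * stride;
    return off + stride <= usable;
}

/* Guarded carve: checks both invariants (active unless NDEBUG is set),
 * then bumps the carve cursor and returns the block address. */
static void *carve_one_checked(SlabMetaSketch *meta, uint8_t *slab_base,
                               size_t usable, size_t stride) {
    assert(capacity_is_consistent(meta, usable, stride));
    assert(carve_in_bounds(meta, usable, stride));
    void *ptr = slab_base + (size_t)meta->carved * stride;
    meta->carved++;
    return ptr;
}
```

Because the checks go through `assert`, they compile out under `NDEBUG`, which keeps the release flavor unchanged while the debug flavor fails at the first inconsistent carve instead of at a later, unrelated access.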

Bisect Switches
- Kill P0 entirely: `HAKMEM_TINY_P0_DISABLE=1`
- Skip remote drain: `HAKMEM_TINY_P0_NO_DRAIN=1`
- Assume single-threaded (ST) mode: `HAKMEM_TINY_ASSUME_1T=1`
- Disable simplified refills (if applicable): `HAKMEM_TINY_SIMPLE_REFILL=0` (add if not present)

Patch Strategy (expected minimal fix)
1) Make the class7 stride consistently 1024 in all carve paths (no +1 header). Audit all `bs` computations.
2) Ensure `tls`/`meta` are rebound after every `superslab_refill()` in non-direct paths.
3) Enforce drain-before-pop ordering and clear the sentinel.
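
Item 2 can be illustrated with a toy refill loop. Everything here is a hypothetical model: `refill_sketch()` stands in for `superslab_refill()`, `g_tls_sketch` for `g_tls_slabs`, and the two-slab pool exists only so the slab switch is observable. It does not cover the drain ordering from item 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t used, carved, capacity; } MetaSketch;
typedef struct { MetaSketch *meta; int slab_idx; } TlsSlabSketch;

#define SKETCH_CLASSES 8
#define SKETCH_SLABS   2

static TlsSlabSketch g_tls_sketch[SKETCH_CLASSES];
static MetaSketch    g_meta_pool[SKETCH_SLABS];

/* Reset the toy state: no slab bound, each slab holds two blocks. */
static void sketch_reset(void) {
    for (int c = 0; c < SKETCH_CLASSES; c++) {
        g_tls_sketch[c].meta = NULL;
        g_tls_sketch[c].slab_idx = -1;
    }
    for (int s = 0; s < SKETCH_SLABS; s++) {
        g_meta_pool[s] = (MetaSketch){0, 0, 2};
    }
}

/* Stand-in for superslab_refill(): binds class c to the next slab.
 * Returns 0 when no slab is left. */
static int refill_sketch(int c) {
    int next = g_tls_sketch[c].slab_idx + 1;
    if (next >= SKETCH_SLABS) return 0;
    g_tls_sketch[c].slab_idx = next;
    g_tls_sketch[c].meta = &g_meta_pool[next];
    return 1;
}

/* Take up to `want` blocks, reloading tls/meta after every refill so no
 * counter update lands on the previous slab. Returns the number taken. */
static uint32_t take_blocks_sketch(int c, uint32_t want) {
    assert(c >= 0 && c < SKETCH_CLASSES);
    TlsSlabSketch *tls = &g_tls_sketch[c];
    MetaSketch *meta = tls->meta;
    uint32_t taken = 0;
    while (taken < want) {
        if (meta == NULL || meta->carved >= meta->capacity) {
            if (!refill_sketch(c)) break;
            tls = &g_tls_sketch[c];   /* rebind unconditionally */
            meta = tls->meta;
            continue;
        }
        meta->carved++;
        meta->used++;
        taken++;
    }
    return taken;
}
```

In this model the sum of `used` across slabs always equals the number of blocks returned, which mirrors the `active_delta == taken` check in the acceptance criteria. Dropping the two rebind lines would leave `meta` pointing at the exhausted slab after the refill.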

Acceptance Criteria
- `./bench_fixed_size_hakmem 200000 1024 128` passes 3/3 without SEGV.
- Debug counters show `active_delta == taken` (no mismatch).
- No invalid-free logs under STRICT mode.

Notes
- C7 Direct-FC already defaults to OFF, and P0 entry for C7 is guarded unless explicitly enabled (`HAKMEM_TINY_P0_C7_ENABLE=1`).
- Focus on the legacy/P0-general carve paths for C7.