- Class7 (1KB): P0 direct-to-FastCache now default ON (HAKMEM_TINY_P0_DIRECT_FC_C7 unset or not '0').
- Keep A/B gates: HAKMEM_TINY_P0_ENABLE, HAKMEM_TINY_P0_DIRECT_FC (class5), HAKMEM_TINY_P0_DIRECT_FC_C7 (class7),
HAKMEM_TINY_P0_DRAIN_THRESH (default 32), HAKMEM_TINY_P0_NO_DRAIN, HAKMEM_TINY_P0_LOG.
- P0 batch now supports class7 direct fill in addition to class5: gather (drain thresholded → freelist pop → linear carve)
without writing into objects, then bulk-push into FC, update meta/active counters once.
- Docs: Update direct-FC defaults (class5+class7 ON) in docs/TINY_P0_BATCH_REFILL.md.
Notes
- Use tools/bench_rs_from_files.sh for RS(hakmem/system) to compare runs across CPUs.
- Next: parameter sweep for class7 (FC cap/batch limit/drain threshold) and perf counters A/B.
- Root cause: header-based class indexing (HEADER_CLASSIDX=1) wrote a 1-byte
header during allocation, but linear carve/refill and initial slab capacity
still used bare class block sizes. This mismatch could overrun slab usable
space and corrupt freelists, causing reproducible SEGV at ~100k iters.
Changes
- Superslab: compute capacity with effective stride (block_size + header for
classes 0..6; class7 remains headerless) in superslab_init_slab(). Add a
debug-only bound check in superslab_alloc_from_slab() to fail fast if carve
would exceed usable bytes.
- Refill (non-P0 and P0): use header-aware stride for all linear carving and
TLS window bump operations. Ensure alignment/validation in tiny_refill_opt.h
also uses stride, not raw class size.
- Drain: keep existing defense-in-depth for remote sentinel and sanitize nodes
before splicing into freelist (already present).
Notes
- This unifies the memory layout across alloc/linear-carve/refill with a single
stride definition and keeps class7 (1024B) headerless as designed.
- Debug builds add fail-fast checks; release builds remain lean.
Next
- Re-run Tiny benches (256/1024B) in debug to confirm stability, then in
release. If any remaining crash persists, bisect with HAKMEM_TINY_P0_BATCH_REFILL=0
to isolate P0 batch carve, and continue reducing branch-miss as planned.