Phase 6.16 Update (2025‑10‑23)

What changed (summary)
- Tiny fast path simplified and sped up:
  - Removed owner pointer from TLS magazine items (smaller push/pop).
  - Added sampled counter updates for alloc/free (ENV: HAKMEM_TINY_COUNT_SAMPLE, default 1/256).
- Mid overhead trimmed:
  - hits/misses/frees now updated with sampling (ENV: HAKMEM_POOL_COUNT_SAMPLE, default 1/256).
  - Extra metrics exposed for trylock attempts/success and ring underflow; learner writes CSV lines `M,,,,,` to HAKMEM_LOG_FILE.
- Mid page‑descriptor registry (baseline):
  - 64KiB page → {class_idx, owner_tid} mini hash.
  - TLS‑owned pages registered with owner_tid; refill pages registered with owner_tid=0.
  - Free path consults descriptor when header is light (owner not set).
- Header modes (Mid): HAKMEM_HDR_LIGHT now supports 0 (full), 1 (minimal), 2 (skip writes/validation; experimental).

Key ENV additions
- HAKMEM_TINY_COUNT_SAMPLE, HAKMEM_POOL_COUNT_SAMPLE: 2^N sampling for hot‑path counters.
- HAKMEM_HDR_LIGHT=0|1|2: header write policy for Mid.
- Mid TLS two‑tier knobs: HAKMEM_TRYLOCK_PROBES, HAKMEM_RING_RETURN_DIV, HAKMEM_TLS_LO_MAX, HAKMEM_POOL_TLS_RING, HAKMEM_SHARD_MIX.
- Tiny wrapper toggles: HAKMEM_WRAP_TINY=1, HAKMEM_WRAP_TINY_REFILL=1.

Quick A/B results (10s, larson, BURST)
- Tiny 8–64B
  - 1T: hakmem 17.93M (prev ~14.6M) / mimalloc ~34.3M / system ~37.4M
  - 4T: hakmem 56.10M (prev ~30.9M) / mimalloc ~52.0M / system ~82.3M
  - Note: hakmem > mimalloc in 4T here due to TLS/mag simplifications.
- Mid 2–32KiB
  - 1T: hakmem ~4.5–4.8M vs mimalloc ~11–14.8M (needs the Transfer Cache (TC) and a headerless hot path)
  - 4T: hakmem ~13.5–15.0M vs mimalloc ~29–30M

Logs
- Tiny: docs/benchmarks/20251023_035642_TINY_AB/summary.txt
- Mid A/B (fast): docs/benchmarks/20251023_033008_AB_FAST_MID/summary.txt
- Head‑to‑head suites: docs/benchmarks/20251023_033358_HEAD2HEAD/summary.txt, ..._HEAD2HEAD_POST, ..._HDR2

Next steps (recommended)
1) Mid Transfer Cache (minimal): a (class,thread) MPSC queue plus a small‑budget drain in alloc, to remove central freelist round trips on cross‑thread frees.
2) Page‑descriptor routing: hold class/owner in the descriptor and make Mid headerless on the hot path (keep assertions in debug builds).
3) Tune TLS parameters (ring cap 8/16, LIFO max 256/512) via scripts/ab_fast_mid.sh.
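
Sketch for next step 1 (not the hakmem implementation): one way a per-(class,thread) MPSC inbox with a budgeted drain could look, using C11 atomics. Every name here (`tc_node`, `tc_inbox`, `tc_push`, `tc_drain`) is hypothetical, and the sketch assumes a freed block can reuse its first word as the link pointer. Remote threads push with a CAS loop; only the owner thread drains, so a single exchange detaches the whole chain without ABA concerns.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical layout: a freed block reuses its first word as the link. */
typedef struct tc_node { struct tc_node *next; } tc_node;

/* One inbox per (class, owner thread). Illustrative only. */
typedef struct { _Atomic(tc_node *) head; } tc_inbox;

/* Producer side (any thread): lock-free push of a remotely freed block. */
static void tc_push(tc_inbox *q, void *blk) {
    tc_node *n = (tc_node *)blk;
    tc_node *old = atomic_load_explicit(&q->head, memory_order_relaxed);
    do {
        n->next = old;  /* link to the head we expect to replace */
    } while (!atomic_compare_exchange_weak_explicit(
                 &q->head, &old, n,
                 memory_order_release, memory_order_relaxed));
}

/* Consumer side (owner thread only): detach the whole chain with one
 * exchange, hand at most `budget` blocks to the caller, and push any
 * leftovers back so a later drain can pick them up. */
static size_t tc_drain(tc_inbox *q, void **out, size_t budget) {
    tc_node *n = atomic_exchange_explicit(&q->head, NULL,
                                          memory_order_acquire);
    size_t got = 0;
    while (n && got < budget) {
        tc_node *next = n->next;
        out[got++] = n;
        n = next;
    }
    while (n) {  /* over budget: return the remainder to the inbox */
        tc_node *next = n->next;
        tc_push(q, n);
        n = next;
    }
    return got;
}
```

In this sketch the alloc path would call `tc_drain` with a small budget before falling back to the central freelist, so cross-thread frees are reclaimed by the owner without taking a lock. Re-pushing leftovers keeps the drain cost bounded per call; a real implementation might keep the leftover chain in TLS instead.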