- Added docs/analysis/PHASE70_REFILL_OBSERVABILITY_PREREQS_SSOT.md to clarify that refill/warmpool optimizations require confirmed cache misses to be measurable.
- Updated CURRENT_TASK.md to point to this prerequisite.
Phase 70: Refill Tuning Prerequisites (Observability SSOT)
Status: 🟡 ACTIVE
Context:
- Current baseline (Mixed WS=400 + prefault) generates almost zero cache misses (refills) in Unified Cache.
- Optimizing unified_cache_refill() or Warm Pool logic will yield zero throughput gain if this path is not hot.
- Phase 70 must be gated on observing significant refill activity.
1. Observability Protocol (Step 0)
Before implementing any refill/WarmPool changes, execute this sequence:
- Build with Stats:
  make bench_random_mixed_hakmem_observe EXTRA_CFLAGS='-DHAKMEM_UNIFIED_CACHE_STATS_COMPILED=1'
  (Note: Phase 70-0 fixed release-mode stats blocking)
- Run with Stats:
  HAKMEM_WARM_POOL_STATS=1 ./bench_random_mixed_hakmem_observe 20000000 400 1
- Check Output:
  - Look for Unified-STATS:miss=...
  - Look for WarmPool-STATS:hits=...
Decision Gate:
- If miss counts are < 1000 (approx <0.01% miss rate):
  - STOP: Optimization has no ROI on this workload.
  - ACTION: Either accept current state (refill is not a bottleneck) or switch to a research workload (below).
- If miss counts are significant:
  - GO: Proceed with Phase 70 logic changes.
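The gate above can be written down as a small helper. This is a minimal sketch in C, assuming the miss count has already been read off the stats output by hand or by a script. The names refill_gate_decide and refill_miss_rate_pct are hypothetical and not part of hakmem, and treating any count of 1000 or more as GO is an assumption, since the gate only says "significant".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper types: none of these identifiers exist in hakmem. */
typedef enum { REFILL_GATE_STOP = 0, REFILL_GATE_GO = 1 } refill_gate_t;

/* Threshold taken from the decision gate: fewer than 1000 misses means STOP. */
#define REFILL_GATE_MIN_MISSES 1000u

/* Miss rate in percent. `lookups` is the total number of cache lookups
 * (hits + misses); how that total is obtained is left to the caller. */
double refill_miss_rate_pct(uint64_t misses, uint64_t lookups) {
    assert(misses <= lookups);
    if (lookups == 0) {
        return 0.0; /* no traffic at all: report 0% rather than divide by zero */
    }
    return 100.0 * (double)misses / (double)lookups;
}

/* Assumption: "significant" is read as "at least the STOP threshold". */
refill_gate_t refill_gate_decide(uint64_t misses) {
    return (misses < REFILL_GATE_MIN_MISSES) ? REFILL_GATE_STOP : REFILL_GATE_GO;
}
```

If the first benchmark argument (20000000) is taken as a rough stand-in for the number of lookups, 1000 misses works out to 0.005%, which is consistent with the approximate <0.01% bound quoted in the gate.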
2. Research Workload (If Refill Optimization is Mandatory)
If you must measure refill performance improvements (e.g., for architectural validation), modify the workload to force cache pressure:
Option A: Disable Prefault (Cold Path Stress)
HAKMEM_BENCH_PREFAULT=0 ./bench_random_mixed_hakmem_observe ...
- Pros: Forces refills during benchmark loop.
- Cons: Measures startup behavior, not steady-state throughput.
Option B: Increase Working Set (Steady State Stress)
./bench_random_mixed_hakmem_observe 20000000 8192 1 # WS=8192 > Cache(2048)
- Pros: Forces steady-state evictions/refills.
- Cons: Different workload profile than standard Mixed SSOT (WS=400).
WARNING: Do NOT change HAKMEM_TINY_STATIC_ROUTE or ULTRA flags unless specifically testing routing changes. The default LEGACY route does use Unified Cache for alloc/free.
3. Reference: Why "LEGACY" Route is OK
- LEGACY route means "not ULTRA/MID/V7 specialized".
- Alloc path: malloc_tiny_fast → tiny_hot_alloc_fast → Unified Cache (TLS array).
- Free path: free_tiny_fast → tiny_hot_free_fast → Unified Cache (TLS array).
- Refill path: Cache Miss → unified_cache_refill → Warm Pool → Registry.
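The three paths can be illustrated with a toy array cache. This is only a sketch of the general pattern (array-backed fast path, batch refill on miss), not hakmem's code: every identifier below (toy_cache_t, toy_alloc, toy_free, toy_cache_refill, the capacity and batch size) is hypothetical, the cache is passed explicitly instead of living in TLS, and a static arena stands in for the Warm Pool and Registry tiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_CAP    8   /* toy capacity, far smaller than a real cache */
#define TOY_REFILL_BATCH 4   /* blocks pulled in per refill (assumed value) */
#define TOY_ARENA_BLOCKS 64

typedef struct {
    void    *slots[TOY_CACHE_CAP]; /* LIFO array of cached free blocks */
    size_t   count;                /* number of valid entries in slots[] */
    uint64_t hits;                 /* allocations served from the array */
    uint64_t misses;               /* allocations that needed a refill */
} toy_cache_t;

/* Stand-in for the slower tiers: a bump allocator over a static arena. */
static unsigned char toy_arena[TOY_ARENA_BLOCKS][16];
static size_t toy_arena_next;

/* Slow path: move up to one batch of blocks from the arena into the cache.
 * Returns the number of blocks added (0 when the arena is exhausted). */
size_t toy_cache_refill(toy_cache_t *c) {
    size_t added = 0;
    while (added < TOY_REFILL_BATCH && c->count < TOY_CACHE_CAP &&
           toy_arena_next < TOY_ARENA_BLOCKS) {
        c->slots[c->count++] = toy_arena[toy_arena_next++];
        added++;
    }
    return added;
}

/* Alloc path: pop from the array; only an empty array reaches the refill. */
void *toy_alloc(toy_cache_t *c) {
    if (c->count == 0) {
        c->misses++;
        if (toy_cache_refill(c) == 0) {
            return NULL; /* backing store exhausted */
        }
    } else {
        c->hits++;
    }
    return c->slots[--c->count];
}

/* Free path: push back onto the array. Returns 1 if the block was cached,
 * 0 if the array was full (a real allocator would hand the block to a
 * slower tier; the toy simply drops it). */
int toy_free(toy_cache_t *c, void *p) {
    assert(p != NULL);
    if (c->count == TOY_CACHE_CAP) {
        return 0;
    }
    c->slots[c->count++] = p;
    return 1;
}
```

In this toy, a warmed-up alloc/free loop that stays within the array's capacity never enters toy_cache_refill again, so its miss counter stops moving. That is the same reason a small working set leaves the refill path cold and makes refill tuning unmeasurable.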
Previous confusion ("LEGACY unused") was due to:
- Stats counting was gated by #if !HAKMEM_BUILD_RELEASE.
- Low miss rate in WS=400 made it look unused.
- Phase 70-0 fixed the stats visibility.
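To see how a compile-time gate can hide a live code path, consider the sketch below. It shows one plausible shape for decoupling stats from build mode with a dedicated flag. How Phase 70-0 actually implemented the fix is not described here, so the combined condition, the TOY_* macros, and the counter names are all assumptions for illustration; only the two HAKMEM_* macro names come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Demo defaults (assumed): a release build with stats explicitly compiled in,
 * mirroring EXTRA_CFLAGS='-DHAKMEM_UNIFIED_CACHE_STATS_COMPILED=1'. */
#ifndef HAKMEM_BUILD_RELEASE
#define HAKMEM_BUILD_RELEASE 1
#endif
#ifndef HAKMEM_UNIFIED_CACHE_STATS_COMPILED
#define HAKMEM_UNIFIED_CACHE_STATS_COMPILED 1
#endif

/* With only `#if !HAKMEM_BUILD_RELEASE`, a release build compiles the
 * increment away and every counter reads zero even when the path runs.
 * Adding a dedicated flag keeps counters available in release builds. */
#if HAKMEM_UNIFIED_CACHE_STATS_COMPILED || !HAKMEM_BUILD_RELEASE
#define TOY_STATS_ENABLED 1
#define TOY_STAT_INC(counter) ((void)((counter)++))
#else
#define TOY_STATS_ENABLED 0
#define TOY_STAT_INC(counter) ((void)0)
#endif

/* Hypothetical miss counter, standing in for the real stats. */
uint64_t toy_miss_count;

/* Called wherever the (toy) cache misses; a no-op when stats are off. */
void toy_record_miss(void) {
    TOY_STAT_INC(toy_miss_count);
}

/* Read the counter; in a stats-off build it can never have moved. */
uint64_t toy_read_miss_count(void) {
    assert(TOY_STATS_ENABLED || toy_miss_count == 0);
    return toy_miss_count;
}
```

Compiling the same file with -DHAKMEM_UNIFIED_CACHE_STATS_COMPILED=0 and the release default makes toy_record_miss a no-op, so a reader of the counters would conclude the path is unused even though it executes.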