# Phase 70: Refill Tuning Prerequisites (Observability SSOT)

**Status**: ✅ COMPLETE (SSOT established)

**Context**:
- The current baseline (Mixed WS=400 + prefault) generates almost **zero cache misses** (refills) in Unified Cache.
- Optimizing `unified_cache_refill()` or Warm Pool logic will yield **zero throughput gain** if this path is not hot.
- Phase 70 must be gated on *observing significant refill activity*.
- Current conclusion (WS=400 SSOT): Refill/WarmPool-pop is **not hot** (misses are extremely low), so refill micro-optimizations are **frozen** for SSOT workloads; only research workloads should touch refill behavior.

## 1. Observability Protocol (Step 0)

Before implementing any refill/WarmPool changes, execute this sequence:

0. **Route Banner (optional but recommended)**:
   ```bash
   HAKMEM_ROUTE_BANNER=1 ./bench_random_mixed_hakmem_observe ...
   ```
   - Prints the route assignments (backend route kind) and the cache config (`unified_cache_enabled` / `warm_pool_max_per_class`) exactly once.
   - Prevents misreadings such as "Route=LEGACY means Unified Cache is unused" (even under LEGACY, Unified Cache is used at the front of alloc/free).

1. **Build with Stats**:
   ```bash
   make bench_random_mixed_hakmem_observe EXTRA_CFLAGS='-DHAKMEM_UNIFIED_CACHE_STATS_COMPILED=1'
   ```
   *(Note: Phase 70-0 fixed stats being blocked in release-mode builds.)*

2. **Run with Stats**:
   ```bash
   HAKMEM_ROUTE_BANNER=1 HAKMEM_WARM_POOL_STATS=1 ./bench_random_mixed_hakmem_observe 20000000 400 1
   ```

3. **Check Output**:
   - Look for `Unified-STATS`: `miss=...`
   - Look for `WarmPool-STATS`: `hits=...`

**Decision Gate**:
- If the `miss` count is < 1000 (approx. <0.01% miss rate):
  - **STOP**: Optimization has no ROI on this workload.
  - **ACTION**: Either accept the current state (refill is not a bottleneck) or switch to a research workload (below).
- If the `miss` count is significant (at or above that threshold):
  - **GO**: Proceed with Phase 70 logic changes.

## 2. Research Workload (If Refill Optimization is Mandatory)

If you must measure refill performance improvements (e.g., for architectural validation), modify the workload to force cache pressure:

**Option A: Disable Prefault (Cold Path Stress)**
```bash
HAKMEM_BENCH_PREFAULT=0 ./bench_random_mixed_hakmem_observe ...
```
- Pros: Forces refills during the benchmark loop.
- Cons: Measures startup behavior, not steady-state throughput.

**Option B: Increase Working Set (Steady State Stress)**
```bash
./bench_random_mixed_hakmem_observe 20000000 8192 1  # WS=8192 > Cache(2048)
```
- Pros: Forces steady-state evictions/refills.
- Cons: Different workload profile than the standard Mixed SSOT (WS=400).

**WARNING**: Do NOT change `HAKMEM_TINY_STATIC_ROUTE` or `ULTRA` flags unless you are specifically testing routing changes. The default `LEGACY` route *does* use Unified Cache for alloc/free.

**NOTE (Warm Pool sizing semantics)**:
- `HAKMEM_WARM_POOL_SIZE` primarily controls the **registry-scan prefill cap** (`warm_pool_max_per_class()`).
- The steady-state push-back cap inside `unified_cache_refill()` is `TinyClassPolicy.warm_cap` (typically 4/8), so `HAKMEM_WARM_POOL_SIZE` only matters when registry scans/refills happen often enough to benefit from prefilled slabs.

## 3. Reference: Why "LEGACY" Route is OK

- **LEGACY** route means "not ULTRA/MID/V7 specialized".
- Alloc path: `malloc_tiny_fast` → `tiny_hot_alloc_fast` → **Unified Cache (TLS array)**.
- Free path: `free_tiny_fast` → `tiny_hot_free_fast` → **Unified Cache (TLS array)**.
- Refill path: Cache Miss → `unified_cache_refill` → **Warm Pool** → Registry.

The previous confusion ("LEGACY unused") had two causes:
1. Stats counting was gated by `#if !HAKMEM_BUILD_RELEASE`, so release builds showed no counters.
2. The low miss rate at WS=400 made the cache look unused.

Phase 70-0 fixed the stats visibility.
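
The Step 0 decision gate from Section 1 can be applied mechanically rather than by eye. The helper below is a minimal sketch, not part of the repository: the function names `sum_unified_misses` and `refill_gate` are hypothetical, and it assumes that every stats line carries the tag `Unified-STATS` plus one or more `miss=<integer>` tokens. The exact output format is not specified in this note, so adjust the patterns to match what the binary really prints.

```bash
#!/usr/bin/env bash
# Sketch of the Phase 70 decision gate (hypothetical helper names).
# ASSUMPTION: stats lines contain "Unified-STATS" and "miss=<integer>" tokens.

# Read benchmark output on stdin and print the sum of all miss=<n> values
# found on Unified-STATS lines (prints 0 when no such line exists).
sum_unified_misses() {
  awk '/Unified-STATS/ {
         line = $0
         while (match(line, /miss=[0-9]+/)) {
           # Skip the 5 characters of "miss=" and add the digits.
           total += substr(line, RSTART + 5, RLENGTH - 5)
           line = substr(line, RSTART + RLENGTH)
         }
       }
       END { print total + 0 }'
}

# Print STOP when the miss count is below the threshold, GO otherwise.
# The threshold defaults to 1000, the value used by the decision gate.
refill_gate() {
  local misses=$1 threshold=${2:-1000}
  if [ "$misses" -lt "$threshold" ]; then
    echo "STOP"
  else
    echo "GO"
  fi
}

# Example usage (assumes stats may be written to stderr, hence 2>&1):
#   misses=$(HAKMEM_ROUTE_BANNER=1 HAKMEM_WARM_POOL_STATS=1 \
#     ./bench_random_mixed_hakmem_observe 20000000 400 1 2>&1 | sum_unified_misses)
#   refill_gate "$misses"
```

Summing across lines keeps the sketch usable whether the counters are printed once in total or once per size class. A `STOP` result means the refill path is not hot for that workload, so the choice is between accepting the current state and switching to one of the research workloads in Section 2.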