hakmem/docs/analysis/PHASE70_REFILL_OBSERVABILITY_PREREQS_SSOT.md

# Phase 70: Refill Tuning Prerequisites (Observability SSOT)

**Status**: 🟡 ACTIVE

**Context**: 
- Current baseline (Mixed WS=400 + prefault) generates almost **zero cache misses** (refills) in Unified Cache.
- Optimizing `unified_cache_refill()` or Warm Pool logic will yield **zero throughput gain** if this path is not hot.
- Phase 70 must be gated on *observing significant refill activity*.

## 1. Observability Protocol (Step 0)

Before implementing any refill/WarmPool changes, execute this sequence:

1.  **Build with Stats**:
    ```bash
    make bench_random_mixed_hakmem_observe EXTRA_CFLAGS='-DHAKMEM_UNIFIED_CACHE_STATS_COMPILED=1'
    ```
    *(Note: Phase 70-0 removed the release-mode gate that previously blocked stats output.)*

2.  **Run with Stats**:
    ```bash
    HAKMEM_WARM_POOL_STATS=1 ./bench_random_mixed_hakmem_observe 20000000 400 1
    ```

3.  **Check Output**:
    - Look for `Unified-STATS`: `miss=...`
    - Look for `WarmPool-STATS`: `hits=...`

**Decision Gate**:
- If `miss` counts are < 1000 (approx <0.01% miss rate):
  - **STOP**: Optimization has no ROI on this workload.
  - **ACTION**: Either accept current state (refill is not a bottleneck) or switch to a research workload (below).
- If `miss` counts are 1000 or more:
  - **GO**: Proceed with Phase 70 logic changes.
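
The gate can be applied mechanically to the benchmark output. The sketch below reads a log on stdin, sums every `miss=<n>` field found on lines tagged `Unified-STATS`, and prints a verdict. The function name `refill_gate` is hypothetical, and the exact layout of the stats lines is an assumption: the script relies only on the `Unified-STATS` tag and the `miss=` field named in this document. Treating any count at or above the threshold as GO is likewise an assumption of this sketch.

```bash
#!/usr/bin/env bash
# Sketch of the decision gate: reads benchmark output on stdin, prints STOP or GO.
# ASSUMPTION: stats lines contain the tag "Unified-STATS" and one or more
# "miss=<integer>" fields; the real log layout may differ.
refill_gate() {
    local threshold="${1:-1000}"   # default threshold taken from the gate above
    awk -v threshold="$threshold" '
        /Unified-STATS/ {
            # Sum every miss=<n> field on the line (there may be one per size class).
            line = $0
            while (match(line, /miss=[0-9]+/)) {
                total += substr(line, RSTART + 5, RLENGTH - 5)
                line = substr(line, RSTART + RLENGTH)
            }
        }
        END {
            verdict = (total + 0 < threshold + 0) ? "STOP" : "GO"
            printf "%s miss=%d\n", verdict, total
        }'
}
```

One possible invocation is `HAKMEM_WARM_POOL_STATS=1 ./bench_random_mixed_hakmem_observe 20000000 400 1 2>&1 | refill_gate`; the `2>&1` is there because the stats may be written to either stream.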

## 2. Research Workload (If Refill Optimization is Mandatory)

If you must measure refill performance improvements (e.g., for architectural validation), modify the workload to force cache pressure:

**Option A: Disable Prefault (Cold Path Stress)**
```bash
HAKMEM_BENCH_PREFAULT=0 ./bench_random_mixed_hakmem_observe ...
```
- Pros: Forces refills during benchmark loop.
- Cons: Measures startup behavior, not steady-state throughput.

**Option B: Increase Working Set (Steady State Stress)**
```bash
./bench_random_mixed_hakmem_observe 20000000 8192 1  # WS=8192 > Cache(2048)
```
- Pros: Forces steady-state evictions/refills.
- Cons: Different workload profile than standard Mixed SSOT (WS=400).
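
To see where refill activity starts to appear, Option B can be run as a sweep over working-set sizes. The following dry-run sketch only prints the commands, one per size, so they can be reviewed before anything is executed. The helper name `ws_sweep_commands` is hypothetical, it assumes the second positional argument is the working-set size (as in the commands above), and the intermediate size 2048 is an illustrative choice matching the cache size quoted above.

```bash
#!/usr/bin/env bash
# Dry-run sketch: print one stats-enabled benchmark command per working-set size.
# ASSUMPTION: argument order is <iterations> <working set> <third argument>,
# mirroring the commands in this document; the third argument is kept at 1.
ws_sweep_commands() {
    local iters="$1"; shift
    local ws
    for ws in "$@"; do
        printf 'HAKMEM_WARM_POOL_STATS=1 ./bench_random_mixed_hakmem_observe %s %s 1\n' \
            "$iters" "$ws"
    done
}

# Baseline (400), the quoted cache size (2048), and the stress size (8192).
ws_sweep_commands 20000000 400 2048 8192
```

Comparing the `miss` counts across the sizes shows whether refills only become frequent once the working set exceeds the cache, which is the premise behind Option B.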

**WARNING**: Do NOT change `HAKMEM_TINY_STATIC_ROUTE` or `ULTRA` flags unless specifically testing routing changes. The default `LEGACY` route *does* use Unified Cache for alloc/free.

## 3. Reference: Why "LEGACY" Route is OK

- **LEGACY** route means "not ULTRA/MID/V7 specialized".
- Alloc path: `malloc_tiny_fast` → `tiny_hot_alloc_fast` → **Unified Cache (TLS array)**.
- Free path: `free_tiny_fast` → `tiny_hot_free_fast` → **Unified Cache (TLS array)**.
- Refill path: Cache Miss → `unified_cache_refill` → **Warm Pool** → Registry.

The earlier impression that LEGACY was unused had two causes:
1. Stats counting was gated by `#if !HAKMEM_BUILD_RELEASE`, so release builds reported nothing.
2. The low miss rate at WS=400 made the refill path look unused.

Phase 70-0 fixed the stats visibility.

**Commit** (2025-12-18 03:44:51 +09:00): Phase 70: Defined observability prerequisites SSOT
- Added `docs/analysis/PHASE70_REFILL_OBSERVABILITY_PREREQS_SSOT.md` to clarify that refill/WarmPool optimizations require confirmed cache misses to be measurable.
- Updated `CURRENT_TASK.md` to point to this prerequisite.
