# Phase 19: Frontend Layer Metrics Analysis

## Phase 19-1: Box FrontMetrics Implementation ✅

**Status**: COMPLETE (2025-11-16)

**Implementation**:
- Created `core/box/front_metrics_box.h` - Per-class hit/miss counters
- Created `core/box/front_metrics_box.c` - CSV reporting with percentage analysis
- Added instrumentation to all frontend layers in `tiny_alloc_fast.inc.h`
- ENV controls: `HAKMEM_TINY_FRONT_METRICS=1`, `HAKMEM_TINY_FRONT_DUMP=1`

**Build fix**: Added missing `hakmem_smallmid_superslab.o` to Makefile

---

## Phase 19-2: Benchmark Results and Analysis ✅

**Benchmark**: `bench_random_mixed_hakmem 500000 4096 42`
**Workload**: Random allocations 16-1040 bytes, 500K iterations

### Layer Hit Rates (Classes C2/C3)

```
Class      UH_hit    HV2_hit     C5_hit     FC_hit    SFC_hit    SLL_hit         Total
------|----------|----------|----------|----------|----------|----------|-------------
C2            455      3,450          0          0          0          0         3,905
C3             13      7,585          0          0          0          0         7,598

Percentages:
C2: UltraHot=11.7%, HeapV2=88.3%
C3: UltraHot=0.2%, HeapV2=99.8%
```

### Key Findings

1. **HeapV2 Dominates (>80% hit rate)**
   - C2: 88.3% hit rate (3,450 / 3,905 allocations)
   - C3: 99.8% hit rate (7,585 / 7,598 allocations)
   - **Recommendation**: ✅ Keep and optimize (hot path)

2. **UltraHot Marginal (<12% hit rate)**
   - C2: 11.7% hit rate (455 / 3,905 allocations)
   - C3: 0.2% hit rate (13 / 7,598 allocations)
   - **Recommendation**: ⚠️ Consider pruning (low value, adds branch overhead)

3. **FastCache DISABLED**
   - Gated by `g_fastcache_enable=0` (default)
   - 0% hit rate across all classes
   - **Status**: Not in use (OFF by default)

4. **SFC DISABLED**
   - Gated by `g_sfc_enabled=0` (default)
   - 0% hit rate across all classes
   - **Status**: Not in use (OFF by default)

5. **Class5 Dedicated Path DISABLED**
   - `g_front_class5_hit[]=0` for all classes
   - **Status**: Not in use (OFF by default, or C5 not hit in this workload)

6. **TLS SLL Not Reached**
   - 0% hit rate because the earlier layers (UltraHot + HeapV2) catch 100% of the C2/C3 allocations
   - **Status**: Enabled but bypassed (earlier layers are effective)

### Layer Execution Order

```
FastCache (C0-C3)      [DISABLED]
    ↓
SFC (all classes)      [DISABLED]
    ↓
UltraHot (C2-C5)       [ENABLED] → 0.2-11.7% hit rate
    ↓
HeapV2 (C0-C3)         [ENABLED] → 88-99% hit rate ✅
    ↓
Class5 (C5 only)       [DISABLED or N/A]
    ↓
TLS SLL (all classes)  [ENABLED but not reached]
    ↓
SuperSlab (fallback)
```

---

## Analysis Recommendations (from Box FrontMetrics)

1. **Layers with >80% hit rate**: ✅ Keep and optimize (hot path)
   - **HeapV2**: 88-99% hit rate → Primary workhorse for C2/C3

2. **Layers with <5% hit rate**: ⚠️ Consider pruning (dead weight)
   - **FastCache**: 0% (disabled)
   - **SFC**: 0% (disabled)
   - **Class5**: 0% (disabled or N/A)
   - **TLS SLL**: 0% (not reached)

3. **Multiple layers 5-20%**: ⚠️ Potential redundancy, test pruning
   - **UltraHot**: 0.2-11.7% → Adds branch overhead for minimal benefit

---

## Phase 19-3: Next Steps (Box FrontPrune)

**Goal**: Add ENV switches to selectively disable layers for A/B testing

**Proposed ENV Controls**:
```bash
HAKMEM_TINY_FRONT_DISABLE_ULTRAHOT=1  # Disable UltraHot magazine
HAKMEM_TINY_FRONT_DISABLE_HEAPV2=1    # Disable HeapV2 magazine
HAKMEM_TINY_FRONT_DISABLE_CLASS5=1    # Disable Class5 dedicated path
HAKMEM_TINY_FRONT_ENABLE_FC=1         # Enable FastCache (currently OFF)
HAKMEM_TINY_FRONT_ENABLE_SFC=1        # Enable SFC (currently OFF)
```

**A/B Test Scenarios**:
1. **Baseline**: Current state (UltraHot + HeapV2)
2. **Test 1**: HeapV2 only (disable UltraHot) → Expected: minimal perf loss (UltraHot serves <12% of C2/C3 hits)
3. **Test 2**: UltraHot only (disable HeapV2) → Expected: major perf loss (HeapV2 serves 88-99% of C2/C3 hits)
4. **Test 3**: Enable FC + SFC, disable UltraHot/HeapV2 → Test classic TLS cache layers
5. **Test 4**: HeapV2 + FC + SFC (disable UltraHot) → Test hybrid approach

**Expected Outcome**: Identify the minimal effective layer set (maximize hit rate, minimize overhead)

---

## Performance Impact

**Benchmark Throughput**: 10.8M ops/s (500K iterations)

**Layer Overhead Estimate**:
- Each layer check: ~2-4 instructions (branch + state access)
- Current active layers: UltraHot (2-4 inst) + HeapV2 (2-4 inst) = 4-8 inst overhead
- If UltraHot is removed: 2-4 fewer instructions per allocation, an estimated +5-10% perf improvement

**Risk Assessment**:
- Removing HeapV2: HIGH RISK (loses the layer that serves 88-99% of hits)
- Removing UltraHot: LOW RISK (loses 0.2-11.7% of hits, likely <5% perf impact)

---

## Files Modified (Phase 19-1)

1. `core/box/front_metrics_box.h` - NEW (metrics API + inline helpers)
2. `core/box/front_metrics_box.c` - NEW (CSV reporting)
3. `core/tiny_alloc_fast.inc.h` - Added metrics collection calls
4. `Makefile` - Added `front_metrics_box.o` + `hakmem_smallmid_superslab.o`

**Build Command**:
```bash
make clean && make HAKMEM_DEBUG_COUNTERS=1 bench_random_mixed_hakmem
```

**Test Command**:
```bash
HAKMEM_TINY_FRONT_METRICS=1 HAKMEM_TINY_FRONT_DUMP=1 \
  ./bench_random_mixed_hakmem 500000 4096 42
```

---

## Conclusion

**Phase 19-2 successfully identified**:
- HeapV2 as the dominant effective layer (>80% hit rate)
- UltraHot as a low-value layer (<12% hit rate)
- FC/SFC as currently unused (disabled by default)

**Next Phase**: Implement Box FrontPrune ENV switches for A/B testing layer removal.
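
**Sketch of the FrontPrune switches**: The following is a minimal sketch of how the proposed Phase 19-3 ENV controls could be read once at startup, so that each layer check only tests a cached `int`. Only the environment variable names come from this document. The helper names (`front_prune_parse_flag`, `front_prune_env_flag`, `front_prune_load`), the `front_prune_cfg_t` struct, and the rule that an unset, empty, or `0` value means "off" are assumptions for illustration, not the actual hakmem implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: interpret an ENV value as a boolean.
 * Assumed rule: NULL (unset), "" and "0" mean off; anything else means on. */
static int front_prune_parse_flag(const char *value) {
    if (value == NULL || value[0] == '\0') return 0;
    return strcmp(value, "0") != 0;
}

/* Hypothetical helper: look up one switch by name via getenv(). */
static int front_prune_env_flag(const char *name) {
    assert(name != NULL);
    return front_prune_parse_flag(getenv(name));
}

/* Hypothetical config struct; one field per proposed ENV control. */
typedef struct {
    int disable_ultrahot;
    int disable_heapv2;
    int disable_class5;
    int enable_fc;
    int enable_sfc;
} front_prune_cfg_t;

/* Read all switches once (e.g. at allocator init) so the allocation
 * fast path never calls getenv(). */
static front_prune_cfg_t front_prune_load(void) {
    front_prune_cfg_t c;
    c.disable_ultrahot = front_prune_env_flag("HAKMEM_TINY_FRONT_DISABLE_ULTRAHOT");
    c.disable_heapv2   = front_prune_env_flag("HAKMEM_TINY_FRONT_DISABLE_HEAPV2");
    c.disable_class5   = front_prune_env_flag("HAKMEM_TINY_FRONT_DISABLE_CLASS5");
    c.enable_fc        = front_prune_env_flag("HAKMEM_TINY_FRONT_ENABLE_FC");
    c.enable_sfc       = front_prune_env_flag("HAKMEM_TINY_FRONT_ENABLE_SFC");
    return c;
}
```

Under these assumptions, Test 1 (HeapV2 only) would correspond to running the benchmark with `HAKMEM_TINY_FRONT_DISABLE_ULTRAHOT=1`, and a layer check would be guarded by something like `if (!cfg.disable_ultrahot) { ... }`. Caching the result in a struct keeps the added cost of the switch itself to a single load and branch per layer.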