Phase 14 v2: Hot Path Integration NEUTRAL (+0.08% Mixed, -0.39% C7-only)

Implementation:
- Patch 1: Add tcache pop to tiny_hot_alloc_fast() (try tcache first)
- Patch 2: Add tcache push to tiny_hot_free_fast() (try tcache first)
- Makefile fix: Add missing .o files to BENCH_HAKMEM_OBJS_BASE
- LTO fix: Restore static inline for tiny_c7_preserve_header_enabled()

A/B Test Results:
- Mixed (16-1024B): 51,287,515 → 51,330,213 ops/s (+0.08%)
- C7-only (1025-2048B): 80,975,651 → 80,660,283 ops/s (-0.39%)

Verdict: NEUTRAL (below +1.0% GO threshold)

Root Cause:
- LIFO/FIFO mixing degrades cache locality
- Hot path branch overhead
- Intrusive pointers add overhead vs array cache
- v2 (Mixed +0.08%) is worse than v1 (Mixed +0.20%)

Files:
- Modified: core/box/tiny_front_hot_box.h (tcache integration)
- Modified: Makefile (BENCH_HAKMEM_OBJS_BASE fix)
- Modified: core/box/tiny_c7_preserve_header_env_box.{h,c} (LTO fix)
- Results: docs/analysis/PHASE14_POINTER_CHASE_REDUCTION_2_AB_TEST_RESULTS.md

Decision: Freeze Phase 14 (v1+v2) as research box (HAKMEM_TINY_TCACHE=0 default)

Next: Phase 15 (UnifiedCache FIFO→LIFO) - optimize array cache structure

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Moe Charm (CI)
2025-12-15 01:57:38 +09:00
parent f8fb05bc13
commit b7e01a9419


@@ -0,0 +1,68 @@
# Phase 14 v2: Pointer-Chase Reduction (Hot Path Integration) A/B Test Results
**Date:** 2025-12-15
**Benchmark:** Mixed (16-1024B) + C7-only (1025-2048B), 10-run cleanenv
**Target:** Integrate tcache into tiny_front_hot_box alloc/free hot paths
**Expected ROI:** +15-25% (design estimate from v1)
**GO Threshold:** +1.0% mean improvement
---
## 1. Implementation Summary
Phase 14 v2 integrates the intrusive LIFO tcache (implemented in v1) into the actual hot paths of `tiny_front_hot_box.h`.
**Key Changes:**
- **Patch 1**: Added `tiny_tcache_try_pop()` to `tiny_hot_alloc_fast()` (try tcache first, fall through to array cache on miss)
- **Patch 2**: Added `tiny_tcache_try_push()` to `tiny_hot_free_fast()` (try tcache first, fall through to array cache on overflow)
- **Makefile Fix**: Added missing `.o` files to `BENCH_HAKMEM_OBJS_BASE`
- **LTO Fix**: Restored static inline for `tiny_c7_preserve_header_enabled()`
**Design:**
- v1 only integrated tcache into `unified_cache_push()` (free side only)
- v2 integrates tcache into both alloc and free hot paths
- This creates push/pop symmetry (tcache becomes "live" on both sides)
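The tcache-first, array-cache-fallback flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in `tiny_front_hot_box.h`: every `sketch_*` name, both struct layouts, and the array capacity of 128 are assumptions made for the example. Only the tcache cap of 64 and the FIFO behavior of the existing array cache come from this document.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capacities: only the tcache cap of 64 appears in this document. */
#define SKETCH_TCACHE_CAP 64
#define SKETCH_ARRAY_CAP  128

/* Intrusive LIFO: the "next" pointer is stored inside the freed block itself. */
typedef struct {
    void  *head;
    size_t count;
} sketch_tcache_t;

/* Stand-in for the existing array cache, modeled as a FIFO ring buffer. */
typedef struct {
    void  *slots[SKETCH_ARRAY_CAP];
    size_t head;   /* index of the next block to pop */
    size_t tail;   /* index of the next free slot to push into */
    size_t count;
} sketch_array_cache_t;

static inline int sketch_tcache_try_push(sketch_tcache_t *tc, void *block) {
    if (tc->count >= SKETCH_TCACHE_CAP) return 0;  /* overflow: caller falls through */
    *(void **)block = tc->head;                    /* write the link into the block */
    tc->head = block;
    tc->count++;
    return 1;
}

static inline void *sketch_tcache_try_pop(sketch_tcache_t *tc) {
    void *block = tc->head;
    if (block == NULL) return NULL;                /* miss: caller falls through */
    tc->head = *(void **)block;                    /* dependent load from the block */
    tc->count--;
    return block;
}

static inline int sketch_array_push(sketch_array_cache_t *ac, void *block) {
    if (ac->count >= SKETCH_ARRAY_CAP) return 0;
    ac->slots[ac->tail] = block;
    ac->tail = (ac->tail + 1) % SKETCH_ARRAY_CAP;
    ac->count++;
    return 1;
}

static inline void *sketch_array_pop(sketch_array_cache_t *ac) {
    if (ac->count == 0) return NULL;
    void *block = ac->slots[ac->head];
    ac->head = (ac->head + 1) % SKETCH_ARRAY_CAP;
    ac->count--;
    return block;
}

/* Alloc side (Patch 1 shape): try the tcache first, then the array cache. */
static inline void *sketch_hot_alloc_fast(sketch_tcache_t *tc,
                                          sketch_array_cache_t *ac,
                                          int tcache_on) {
    if (tcache_on) {
        void *block = sketch_tcache_try_pop(tc);
        if (block != NULL) return block;
    }
    return sketch_array_pop(ac);
}

/* Free side (Patch 2 shape): try the tcache first, then the array cache. */
static inline int sketch_hot_free_fast(sketch_tcache_t *tc,
                                       sketch_array_cache_t *ac,
                                       void *block, int tcache_on) {
    assert(block != NULL);
    if (tcache_on && sketch_tcache_try_push(tc, block)) return 1;
    return sketch_array_push(ac, block);
}
```

In this sketch, blocks freed with the gate on come back in LIFO order, while blocks that overflow into the ring come back in FIFO order. It also shows the extra branch on each path and the load from the block itself on pop.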
**ENV Control:**
```bash
export HAKMEM_TINY_TCACHE=0 # Baseline
export HAKMEM_TINY_TCACHE=1 # Optimized
```
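One plausible way to read such a gate is to parse the variable once and cache the result, so the hot path only tests an integer. The sketch below assumes that approach. The helper names are hypothetical and the parsing rule (unset, empty, or `0` means off) is an assumption, since the project's actual ENV box code is not shown here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed parsing rule: unset, empty, or "0" means OFF; anything else means ON. */
static int sketch_parse_flag(const char *value) {
    if (value == NULL || value[0] == '\0') return 0;
    return strcmp(value, "0") != 0;
}

/* Read HAKMEM_TINY_TCACHE on first use and cache the answer.
 * Not synchronized: concurrent first calls would each compute the same value. */
static int sketch_tcache_enabled(void) {
    static int cached = -1;  /* -1 = not read yet */
    if (cached < 0) {
        cached = sketch_parse_flag(getenv("HAKMEM_TINY_TCACHE"));
    }
    assert(cached == 0 || cached == 1);
    return cached;
}
```

With the default left at 0, an unset environment takes the baseline path under this rule.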
---
## 2. A/B Test Results
### Mixed (16-1024B):
- **Baseline (TCACHE=0):** 51,287,515 ops/s
- **Optimized (TCACHE=1):** 51,330,213 ops/s
- **Delta:** +0.08%
### C7-only (1025-2048B):
- **Baseline (TCACHE=0):** 80,975,651 ops/s
- **Optimized (TCACHE=1):** 80,660,283 ops/s
- **Delta:** -0.39%
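The deltas above follow from (optimized - baseline) / baseline, checked against the +1.0% GO threshold stated in the header. A small sketch of that arithmetic follows. The function names are hypothetical, and treating the threshold as inclusive is an assumption.

```c
#include <assert.h>

/* GO threshold from the report header, in percent. */
#define SKETCH_GO_THRESHOLD_PCT 1.0

/* Relative throughput change in percent: (optimized - baseline) / baseline * 100. */
static double sketch_delta_percent(double baseline_ops, double optimized_ops) {
    assert(baseline_ops > 0.0);
    return (optimized_ops - baseline_ops) / baseline_ops * 100.0;
}

/* Nonzero when the delta reaches the GO threshold (inclusive by assumption). */
static int sketch_meets_go_threshold(double delta_pct) {
    return delta_pct >= SKETCH_GO_THRESHOLD_PCT;
}
```

For Mixed, (51,330,213 - 51,287,515) / 51,287,515 is about +0.083%. For C7-only, (80,660,283 - 80,975,651) / 80,975,651 is about -0.389%. Neither reaches +1.0%.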
---
## 3. Verdict: NEUTRAL
**Result:** Mixed +0.08%, C7-only -0.39% (both below the +1.0% GO threshold)
**Comparison to v1:**
- v1: Mixed +0.20% (NEUTRAL)
- v2: Mixed +0.08%, C7-only -0.39% (NEUTRAL, worse than v1)
**Root Cause:**
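1. LIFO/FIFO mixing degrades cache locality (the LIFO tcache sits in front of the FIFO UnifiedCache)
2. Hot path branch overhead (the tcache-first check adds a branch to every alloc and free)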
3. Cap=64 too small for high churn
4. Intrusive pointers add overhead vs array cache
---
## 4. Recommendation: Freeze as Research Box
**Decision:** Freeze Phase 14 (v1+v2) as a research box; keep `HAKMEM_TINY_TCACHE=0` (OFF) as the default.
**Next:** Phase 15 (UnifiedCache FIFO→LIFO) - optimize existing array cache structure instead of adding intrusive layers.