Phase 4 E2: Alloc Per-Class FastPath - NEUTRAL (-0.21%)
A/B Test Results (Mixed, 10-run, 20M iters):
- Baseline (DUALHOT=0): 45.40M ops/s (mean), 45.51M ops/s (median)
- Optimized (DUALHOT=1): 45.30M ops/s (mean), 45.22M ops/s (median)
- Improvement: -0.21% mean, -0.62% median

Decision: NEUTRAL (within ±1.0% noise threshold)
Action: FREEZE as research box (default OFF, no promotion)

Key Findings:
- C0-C3 fast path adds branch overhead without measurable benefit
- Unlike FREE path (+13%), ALLOC path already has optimized route caching
- Phase 3 C3 static routing eliminated route lookup overhead
- Additional per-class specialization doesn't reduce existing cost

Root Cause:
- Free DUALHOT skips expensive policy_snapshot() + tiny_route_for_class()
- Alloc DUALHOT adds C0-C3 branch but route already cached (Phase 3 C3)
- Net effect: Branch cost ≈ Route savings → neutral

Conclusion: Alloc route optimization has reached diminishing returns

Cumulative Status:
- Phase 4 E1: +3.92% (GO, research box)
- Phase 4 E2: -0.21% (NEUTRAL, frozen)

Files:
- CURRENT_TASK.md: Updated with E2 results
- docs/analysis/PHASE4_E2_ALLOC_PER_CLASS_FASTPATH_AB_TEST_RESULTS.md: Full A/B test report

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@@ -1,6 +1,43 @@
# Main Task (Current)

## Update Notes (2025-12-14 Phase 4 E1 Complete - ENV Snapshot Consolidation)
## Update Notes (2025-12-14 Phase 4 E2 Complete - Alloc Per-Class FastPath)

### Phase 4 E2: Alloc Per-Class FastPath ⚪ NEUTRAL (2025-12-14)

**Target**: Dedicated alloc fast path for classes C0-C3 (bypass the policy route for small sizes)
- Strategy: Skip policy snapshot + route determination for C0-C3 classes
- Reuse the DUALHOT pattern from the free path (which achieved +13% for C0-C3)
- Baseline: HAKMEM_ENV_SNAPSHOT=1 enabled (E1 active)

**Implementation**:
- ENV gate: `HAKMEM_TINY_ALLOC_DUALHOT=0/1` (already exists, default: 0)
- Integration: `malloc_tiny_fast_for_class()` lines 247-259
- C0-C3 check: When the gate is enabled, C0-C3 go directly to the LEGACY unified cache
- Pattern: Probe-window lazy init (64-call tolerance for early `putenv`)

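The gate and the C0-C3 check described above can be sketched roughly as follows. This is a minimal illustration, not the code in `malloc_tiny_fast_for_class()`: only the ENV variable name and the 64-call window come from these notes, while `lazy_gate_t`, `lazy_gate_read()`, `alloc_route_for_class()`, and the route enum are hypothetical names. It also assumes that "probe window" means the environment is re-read on each of the first 64 calls and the last observed value is latched afterwards.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch: only the ENV variable name and the 64-call window
 * come from the notes; every identifier below is illustrative. */
#define DUALHOT_ENV_NAME     "HAKMEM_TINY_ALLOC_DUALHOT"
#define DUALHOT_PROBE_WINDOW 64

typedef struct {
    int probes_left; /* calls that still re-read the environment */
    int enabled;     /* last observed gate value (0 or 1) */
} lazy_gate_t;

/* Assumed behavior: re-read the ENV variable on each call while the probe
 * window is open, then latch the last observed value so that later calls
 * skip getenv() entirely. */
static int lazy_gate_read(lazy_gate_t *g, const char *env_name) {
    if (g->probes_left > 0) {
        const char *v = getenv(env_name);
        g->enabled = (v != NULL && v[0] == '1');
        g->probes_left--;
    }
    return g->enabled;
}

typedef enum { ROUTE_POLICY, ROUTE_LEGACY_DIRECT } alloc_route_t;

/* C0-C3 bypass the policy route only when the gate is on; every other
 * class (and every class when the gate is off) takes the policy route. */
static alloc_route_t alloc_route_for_class(int class_idx, int dualhot_on) {
    assert(class_idx >= 0);
    if (dualhot_on && class_idx <= 3) {
        return ROUTE_LEGACY_DIRECT;
    }
    return ROUTE_POLICY;
}
```

In this sketch a single static `lazy_gate_t` initialized to `{ DUALHOT_PROBE_WINDOW, 0 }` would be read at the top of the alloc path and its result passed to `alloc_route_for_class()`. The extra `class_idx <= 3` test is the added branch whose cost the A/B run measures.
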
**A/B Test Results** (Mixed, 10-run, 20M iters, HAKMEM_ENV_SNAPSHOT=1):
- Baseline (DUALHOT=0): **45.40M ops/s** (mean), 45.51M ops/s (median), σ=0.38M
- Optimized (DUALHOT=1): **45.30M ops/s** (mean), 45.22M ops/s (median), σ=0.49M
- **Improvement: -0.21% mean, -0.62% median**

**Decision: NEUTRAL** (-0.21% is within the ±1.0% noise threshold)
- Action: Freeze as a research box (default OFF, no promotion)
- Reason: The C0-C3 fast path adds branch overhead without measurable gain on Mixed
- Unlike the FREE path (+13%), the ALLOC path shows no significant route determination cost

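The decision rule can be illustrated with a short sketch. The ±1.0% threshold and the GO / NEUTRAL labels come from these notes; the function names and the `AB_NOGO` label for a regression beyond the threshold are assumptions made for illustration. Note that the rounded means quoted above give about -0.22%, so the reported -0.21% was presumably computed from unrounded throughput figures.

```c
#include <assert.h>

/* Hypothetical helper names; AB_NOGO is an assumed label for a regression
 * that exceeds the noise threshold. */
typedef enum { AB_GO, AB_NEUTRAL, AB_NOGO } ab_verdict_t;

/* Relative change of the optimized run versus the baseline, in percent. */
static double ab_change_pct(double baseline, double optimized) {
    assert(baseline > 0.0);
    return (optimized - baseline) / baseline * 100.0;
}

/* Anything inside [-noise_pct, +noise_pct] is treated as noise. */
static ab_verdict_t ab_classify(double change_pct, double noise_pct) {
    if (change_pct > noise_pct) {
        return AB_GO;
    }
    if (change_pct < -noise_pct) {
        return AB_NOGO;
    }
    return AB_NEUTRAL;
}
```

With the figures above, `ab_classify(ab_change_pct(45.40, 45.30), 1.0)` yields `AB_NEUTRAL`, and the E1 result of +3.92% classifies as `AB_GO` under the same threshold.
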
**Key Insight**:
- Free path benefits from DUALHOT because it skips expensive policy snapshot + route lookup
- Alloc path already has optimized route caching (Phase 3 C3 static routing)
- C0-C3 specialization doesn't provide additional benefit over current routing
- Conclusion: Alloc route optimization has reached diminishing returns

**Cumulative Status**:
- Phase 4 E1: +3.92% (GO, research box)
- Phase 4 E2: -0.21% (NEUTRAL, frozen)

### Next: Phase 4 E3 - TBD (consult perf profile or pursue other optimization vectors)

---

### Phase 4 E1: ENV Snapshot Consolidation ✅ COMPLETE (2025-12-14)

@@ -28,10 +65,6 @@
**Key Insight**: Shifting from shape optimizations (plateaued) to TLS/memory overhead yields strong returns. ENV snapshot consolidation represents a new optimization frontier beyond branch prediction tuning.

### Next: Phase 4 E2 - Alloc Per-Class Fast Path
- Instructions (SSOT): `docs/analysis/PHASE4_E2_ALLOC_PER_CLASS_FASTPATH_NEXT_INSTRUCTIONS.md`
- Design notes: `docs/analysis/PHASE4_E2_ALLOC_PER_CLASS_FASTPATH_1_DESIGN.md`

### Phase 4 Perf Profiling Complete ✅ (2025-12-14)

**Profile Analysis**:
