hakmem/docs/analysis/PHASE4_EXECUTIVE_SUMMARY.md
Moe Charm (CI) 4a070d8a14 Phase 5 E4-1: Free Wrapper ENV Snapshot (+3.51% GO, ADOPTED)
Target: Consolidate free wrapper TLS reads (2→1)
- free() is 25.26% self% (top hot spot)
- Strategy: Apply E1 success pattern (ENV snapshot) to free path

Implementation:
- ENV gate: HAKMEM_FREE_WRAPPER_ENV_SNAPSHOT=0/1 (default 0)
- core/box/free_wrapper_env_snapshot_box.{h,c}: New box
  - Consolidates 2 TLS reads → 1 TLS read (50% reduction)
  - Reduces 4 branches → 3 branches (25% reduction)
  - Lazy init with probe window (bench_profile putenv sync)
- core/box/hak_wrappers.inc.h: Integration in free() wrapper
- Makefile: Add free_wrapper_env_snapshot_box.o to all targets
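
The consolidation described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `free_wrapper_env_snapshot_box.{h,c}`: the struct layout, the `HAKMEM_DEMO_FLAG_*` variable names, and all function names are assumptions made for the example, and the probe window used for `bench_profile` putenv sync is omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical snapshot layout: every flag the free() wrapper consults sits in
 * one thread-local struct, so the hot path touches TLS once instead of twice. */
typedef struct {
    unsigned char ready;   /* 0 until this thread has initialized the snapshot */
    unsigned char flag_a;  /* illustrative flag, not a real hakmem setting */
    unsigned char flag_b;  /* illustrative flag, not a real hakmem setting */
} free_env_snapshot_t;

static _Thread_local free_env_snapshot_t g_free_env_snapshot;

/* Returns 1 only when the variable is set to exactly "1". */
static int env_flag(const char *name) {
    const char *value = getenv(name);
    return value != NULL && strcmp(value, "1") == 0;
}

/* Single TLS access on the hot path; getenv() runs only on first use per thread. */
static const free_env_snapshot_t *free_env_snapshot_get(void) {
    free_env_snapshot_t *snap = &g_free_env_snapshot;
    if (!snap->ready) {
        snap->flag_a = (unsigned char)env_flag("HAKMEM_DEMO_FLAG_A");
        snap->flag_b = (unsigned char)env_flag("HAKMEM_DEMO_FLAG_B");
        snap->ready = 1;
    }
    return snap;
}

/* Hypothetical wrapper decision: one snapshot fetch, then plain field tests. */
static int free_wrapper_takes_fast_path(void) {
    const free_env_snapshot_t *snap = free_env_snapshot_get();
    assert(snap->ready);
    return snap->flag_a && !snap->flag_b;
}
```

After the first call on a thread, every later call reads the cached fields through one pointer, which is the shape of saving the commit describes (2 TLS reads → 1).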

A/B Test Results (Mixed, 10-run, 20M iters):
- Baseline (SNAPSHOT=0): 45.35M ops/s (mean), 45.31M ops/s (median)
- Optimized (SNAPSHOT=1): 46.94M ops/s (mean), 47.15M ops/s (median)
- Improvement: +3.51% mean, +4.07% median

Decision: GO (+3.51% >= +1.0% threshold)
- Exceeded conservative estimate (+1.5% → +3.51%)
- Similar efficiency to E1 (+3.92%)
- Health check: PASS (all profiles)
- Action: PROMOTED to MIXED_TINYV3_C7_SAFE preset
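
The gain and the GO decision above follow from simple arithmetic on the two means. The sketch below reproduces it; the function names are assumptions for illustration, and only the +1.0% threshold and the throughput figures come from the results above.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of per-run throughput samples (e.g. M ops/s over 10 runs). */
static double mean_ops(const double *samples, size_t n) {
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }
    return sum / (double)n;
}

/* Relative gain of the optimized build over the baseline, in percent. */
static double gain_percent(double baseline, double optimized) {
    assert(baseline > 0.0);
    return (optimized / baseline - 1.0) * 100.0;
}

/* GO when the measured gain meets or exceeds the threshold (both in percent). */
static int is_go(double gain_pct, double threshold_pct) {
    return gain_pct >= threshold_pct;
}
```

With the reported means, `gain_percent(45.35, 46.94)` evaluates to roughly 3.51, so `is_go(..., 1.0)` holds.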

Phase 5 Cumulative:
- E1 (ENV Snapshot): +3.92%
- E4-1 (Free Wrapper Snapshot): +3.51%
- Total Phase 4-5: ~+7.5%
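
As a rough cross-check of the cumulative figure, the two gains can be compounded multiplicatively. This assumes the E4-1 gain was measured on top of a baseline that already includes E1, which the A/B setup suggests but does not state outright; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Compounds independent percentage gains: (1 + g1)(1 + g2)... - 1, in percent. */
static double compound_gain_percent(const double *gains_pct, size_t n) {
    double factor = 1.0;
    for (size_t i = 0; i < n; i++) {
        assert(gains_pct[i] > -100.0);  /* a -100% "gain" would mean zero throughput */
        factor *= 1.0 + gains_pct[i] / 100.0;
    }
    return (factor - 1.0) * 100.0;
}
```

For `{3.92, 3.51}` this gives about 7.57, while a plain sum gives 7.43; both are consistent with the "~+7.5%" quoted above.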

E3-4 Correction:
- Phase 4 E3-4 (ENV Constructor Init): NO-GO / FROZEN
- Initial A/B showed +4.75%, but investigation revealed:
  - Branch prediction hint mismatch (UNLIKELY with always-true)
  - Retest confirmed -1.78% regression
  - Root cause: __builtin_expect(..., 0) with ctor_mode==1
- Decision: Freeze as research box (default OFF)
- Learning: Branch hints need careful tuning, TLS consolidation safer
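
The hint mismatch named as the root cause can be illustrated with a small sketch. The macro, variable, and function names below are assumptions for the example and are not taken from the hakmem sources; `__builtin_expect` is the GCC/Clang builtin mentioned above.

```c
#include <assert.h>

#define HINT_LIKELY(x)   __builtin_expect(!!(x), 1)
#define HINT_UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Hypothetical mode flag: 1 once a constructor has filled the snapshot. */
static int g_ctor_mode = 1;

static int read_snapshot_fast(void) { return 1; }  /* stand-in for the ctor path */
static int read_snapshot_lazy(void) { return 2; }  /* stand-in for the lazy path */

/* Mismatched: the condition is always true at runtime, yet the hint tells the
 * compiler to lay out the taken branch as the cold path. */
static int snapshot_read_mismatched(void) {
    assert(g_ctor_mode == 0 || g_ctor_mode == 1);
    if (HINT_UNLIKELY(g_ctor_mode == 1)) {
        return read_snapshot_fast();
    }
    return read_snapshot_lazy();
}

/* Matched: same logic, with the hint agreeing with the runtime behavior. */
static int snapshot_read_matched(void) {
    assert(g_ctor_mode == 0 || g_ctor_mode == 1);
    if (HINT_LIKELY(g_ctor_mode == 1)) {
        return read_snapshot_fast();
    }
    return read_snapshot_lazy();
}
```

Both functions return the same value; only the code layout the compiler is nudged toward differs, which is why such a mismatch can surface as a throughput regression rather than a functional bug.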

Deliverables:
- docs/analysis/PHASE5_E4_FREE_GATE_OPTIMIZATION_1_DESIGN.md
- docs/analysis/PHASE5_E4_1_FREE_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md
- docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md (next)
- docs/analysis/PHASE5_POST_E1_NEXT_INSTRUCTIONS.md
- docs/analysis/ENV_PROFILE_PRESETS.md (E4-1 added, E3-4 corrected)
- CURRENT_TASK.md (E4-1 complete, E3-4 frozen)
- core/bench_profile.h (E4-1 promoted to default)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-14 04:24:34 +09:00


# Phase 4 Status - Executive Summary
**Date**: 2025-12-14
**Status**: E1 ✅ GO (promoted to preset), E2 🔬 FROZEN, E3-4 ❌ NO-GO
**Baseline**: Mixed 20M/ws=400 (assuming E1=1)
---
## Quick Status
### E2 Decision: FREEZE ✅ (NEUTRAL)
**Result**: -0.21% mean, -0.62% median (NEUTRAL)
**Why Freeze?**
- Alloc route optimization saturated by Phase 3 C3 (static routing)
- Free DUALHOT worked (+13%) because it skipped expensive ops
- Alloc DUALHOT doesn't work (-0.21%) because route already cached
- **Lesson**: Per-class specialization only helps when bypassing uncached overhead
**Action**: Keep as research box (default OFF), no further investigation
---
## E1/E3-4 Results (Mixed A/B)
### E1: ENV Snapshot Consolidation ✅ GO (opt-in)
**Result**: +3.92% avg, +4.01% median
**ENV**: `HAKMEM_ENV_SNAPSHOT=1` (enabled by default in `MIXED_TINYV3_C7_SAFE`; opt-out is possible)
### E3-4: ENV Constructor Init ❌ NO-GO (FROZEN)
**Result (re-validation)**: -1.44% mean, -1.03% median (assuming E1=1)
**ENV**: `HAKMEM_ENV_SNAPSHOT=1 HAKMEM_ENV_SNAPSHOT_CTOR=1` (default OFF / frozen)
---
## Phase 4 Cumulative Status
**Active**:
- E1 (ENV Snapshot): +3.92% ✅ GO (opt-in)
**Frozen**:
- D3 (Alloc Gate Shape): +0.56% ⚪
- E2 (Alloc Per-Class FastPath): -0.21% ⚪
- E3-4 (ENV CTOR): ❌ NO-GO
## Next Actions
1. Keep E3-4 frozen (default OFF)
2. Re-run perf with E1 on the main line, then pick the core targets with "self% ≥ 5%"
3. For the next box, prioritize "real data structures / hot loops" over "TLS / branches" (alloc gate, unified_cache, pool, etc.)
---
## Key Lessons
1. **Route optimization saturated**: C3 already cached routes, E2 no benefit
2. **Shape optimization plateaued**: D3 +0.56% neutral, branch prediction saturated
3. **ENV consolidation successful**: E1 +3.92%; the constructor-init follow-up (E3-4) regressed on re-validation and is frozen
4. **Different optimization vectors needed**: Move beyond route/shape to init/dispatch overhead
---
**Full Analysis**: `/mnt/workdisk/public_share/hakmem/docs/analysis/PHASE4_COMPREHENSIVE_STATUS_ANALYSIS.md`