Phase 5 E5-3: Candidate Analysis (All DEFERRED) + E5-4 Instructions
E5-3 Analysis Results:
- free_tiny_fast_cold (7.14%): DEFER - cold path, low ROI
- unified_cache_push (3.39%): DEFER - already optimized
- hakmem_env_snapshot_enabled (2.97%): DEFER - low headroom
Key Insight: perf self% is time-weighted, not frequency-weighted.
Cold paths appear hot but have low total impact.
Next: E5-4 (Malloc Tiny Direct Path)
- Apply E5-1 winning pattern to malloc side
- Target: tiny_alloc_gate_fast() gate tax elimination
- ENV gate: HAKMEM_MALLOC_TINY_DIRECT=0/1
Files added:
- docs/analysis/PHASE5_E5_3_ANALYSIS_AND_RECOMMENDATIONS.md
- docs/analysis/PHASE5_E5_4_MALLOC_TINY_DIRECT_NEXT_INSTRUCTIONS.md
- core/box/free_cold_shape_env_box.{h,c} (research box, not tested)
- core/box/free_cold_shape_stats_box.{h,c} (research box, not tested)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@ -1,5 +1,68 @@
# Main-line tasks (current)

## Update memo (2025-12-14 Phase 5 E5-3 Analysis - Strategic Pivot)

### Phase 5 E5-3: Candidate Analysis & Strategic Recommendations ⚠️ DEFER (2025-12-14)

**Decision**: **DEFER all E5-3 candidates** (E5-3a/b/c). Pivot to E5-4 (Malloc Direct Path, E5-1 pattern replication).

**Analysis**:
- **E5-3a (free_tiny_fast_cold 7.14%)**: NO-GO (cold path, low frequency despite high self%)
- **E5-3b (unified_cache_push 3.39%)**: MAYBE (already optimized, marginal ROI ~+1.0%)
- **E5-3c (hakmem_env_snapshot_enabled 2.97%)**: NO-GO (E3-4 precedent shows -1.44% regression)

**Key Insight**: **Profiler self% ≠ optimization opportunity**
- Self% is time-weighted (samples during execution), not frequency-weighted
- Cold paths appear hot because each hit is expensive, not because they run often
- E5-2 lesson: 3.35% self% → +0.45% NEUTRAL (branch overhead ≈ savings)

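The self% argument above can be made concrete with a little arithmetic. The following is a minimal sketch, not code from the repository: both helper names are hypothetical, and the call counts, per-call costs, and the "removable fraction" are illustrative assumptions rather than measured values. It shows that a rarely hit, expensive path and a frequently hit, cheap path can report the same self%, and that the achievable gain is bounded by how much of that time an optimization can actually remove (an Amdahl-style bound).

```c
#include <assert.h>

/* Share of all samples attributed to one function, in percent.
 * Hypothetical helper: self% is driven by calls x cost-per-call, so very
 * different call frequencies can produce the same number. */
double sketch_self_pct(double calls, double cycles_per_call, double total_cycles) {
    assert(total_cycles > 0.0);
    return 100.0 * calls * cycles_per_call / total_cycles;
}

/* Amdahl-style upper bound on whole-program throughput gain, in percent,
 * when `removable_frac` (0..1) of a function's own time is eliminated.
 * `removable_frac` is an assumption supplied by the caller, not something
 * the profiler reports. */
double sketch_max_gain_pct(double self_pct, double removable_frac) {
    assert(removable_frac >= 0.0 && removable_frac <= 1.0);
    double removed = (self_pct / 100.0) * removable_frac;
    assert(removed < 1.0);
    return 100.0 * (1.0 / (1.0 - removed) - 1.0);
}
```

Under these assumptions, 1,000 calls at 7,140 cycles each and 1,000,000 calls at 7.14 cycles each both show up as 7.14% of a 100,000,000-cycle run. For the 7.14% self% reported for E5-3a, removing a hypothetical 10% of the function's own time bounds the gain at roughly +0.7%, before any added branch overhead is counted.
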
**ROI Assessment**:
| Candidate | Self% | Frequency | Expected Gain | Risk | Decision |
|-----------|-------|-----------|---------------|------|----------|
| E5-3a (cold path) | 7.14% | LOW | +0.5% | HIGH | NO-GO |
| E5-3b (push) | 3.39% | HIGH | +1.0% | MEDIUM | DEFER |
| E5-3c (env snapshot) | 2.97% | HIGH | -1.0% | HIGH | NO-GO |

**Strategic Pivot**: Focus on **E5-1 Success Pattern** (wrapper-level deduplication)
- E5-1 (Free Tiny Direct): +3.35% (GO) ✅
- **Next**: E5-4 (Malloc Tiny Direct) - Apply E5-1 pattern to alloc side
- **Expected**: +2-4% (similar to E5-1, based on malloc wrapper overhead)

**Cumulative Status (Phase 5)**:
- E4-1 (Free Wrapper Snapshot): +3.51% standalone
- E4-2 (Malloc Wrapper Snapshot): +21.83% standalone
- E4 Combined: +6.43% (from baseline with both OFF)
- E5-1 (Free Tiny Direct): +3.35% (from E4 baseline)
- E5-2 (Header Write-Once): +0.45% NEUTRAL (frozen)
- **E5-3**: **DEFER** (analysis complete, no implementation/test)
- **Total Phase 5**: ~+9-10% cumulative (E4+E5-1 promoted, E5-2 frozen, E5-3 deferred)

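As a rough consistency check on the cumulative figure, assuming the two promoted gains compound multiplicatively (E5-1 was measured from the E4 baseline) and excluding the frozen E5-2:

$$
(1 + 0.0643)\,(1 + 0.0335) - 1 \approx 0.09995 \approx +10.0\%
$$

This lands at the upper end of the quoted ~+9-10% range. The multiplicative model is an assumption; the entries do not say how the total was derived.
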
**Implementation** (E5-3a research box, NOT TESTED):
- Files created:
  - `core/box/free_cold_shape_env_box.{h,c}` (ENV gate, default OFF)
  - `core/box/free_cold_shape_stats_box.{h,c}` (stats counters)
  - `docs/analysis/PHASE5_E5_3_ANALYSIS_AND_RECOMMENDATIONS.md` (analysis)
- Files modified:
  - `core/front/malloc_tiny_fast.h` (lines 418-437, cold path shape optimization)
- Pattern: Early exit for LEGACY path (skip LARSON check when !use_tiny_heap)
- **Status**: FROZEN (default OFF, pre-analysis shows NO-GO, not worth A/B testing)

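The "early exit" shape named in the Pattern bullet can be illustrated with a small stand-alone sketch. This is not the code in `core/front/malloc_tiny_fast.h`: every identifier below is hypothetical, and the real meaning of the LARSON check and of `use_tiny_heap` is assumed only from the one-line description. The sketch shows the general technique: decide the LEGACY route first so the second check is never evaluated on that path, while the routing result stays the same.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SKETCH_ROUTE_LEGACY,
    SKETCH_ROUTE_TINY_HEAP,
    SKETCH_ROUTE_LARSON
} sketch_route_t;

/* Counts evaluations of the (hypothetical) LARSON check. */
static unsigned sketch_larson_checks = 0;

static bool sketch_larson_check(bool larson_mode) {
    sketch_larson_checks++;
    return larson_mode;
}

/* Baseline shape: the check runs before the LEGACY decision is made. */
sketch_route_t sketch_route_baseline(bool use_tiny_heap, bool larson_mode) {
    bool larson = sketch_larson_check(larson_mode);
    if (!use_tiny_heap) return SKETCH_ROUTE_LEGACY;
    return larson ? SKETCH_ROUTE_LARSON : SKETCH_ROUTE_TINY_HEAP;
}

/* Early-exit shape: LEGACY returns before the check is evaluated. */
sketch_route_t sketch_route_early_exit(bool use_tiny_heap, bool larson_mode) {
    if (!use_tiny_heap) return SKETCH_ROUTE_LEGACY;
    return sketch_larson_check(larson_mode) ? SKETCH_ROUTE_LARSON
                                            : SKETCH_ROUTE_TINY_HEAP;
}

/* Verifies that both shapes pick the same route for every input. */
void sketch_check_equivalence(void) {
    for (int h = 0; h < 2; h++) {
        for (int l = 0; l < 2; l++) {
            assert(sketch_route_baseline(h != 0, l != 0) ==
                   sketch_route_early_exit(h != 0, l != 0));
        }
    }
}
```

The reshaping saves one check per LEGACY hit, so its value depends entirely on how often the cold path is entered, which is the reason the entry above records it as NO-GO.
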
**Key Lessons**:
1. **Profiler self% misleads** when frequency is low (cold path)
2. **Micro-optimizations plateau** in already-optimized code (E5-2, E5-3b)
3. **Branch hints are profile-dependent** (E3-4 failure, E5-3c risk)
4. **Wrapper-level deduplication wins** (E4-1, E4-2, E5-1 pattern)

**Next Steps**:
- **E5-4 Design**: Malloc Tiny Direct Path (E5-1 pattern for alloc)
  - Target: malloc() wrapper overhead (~12.95% self% in E4 profile)
  - Method: Single size check → direct call to malloc_tiny_fast_for_class()
  - Expected: +2-4% (based on E5-1 precedent +3.35%)
- Design doc: `docs/analysis/PHASE5_E5_4_MALLOC_TINY_DIRECT_DESIGN.md`
- Next instructions: `docs/analysis/PHASE5_E5_4_MALLOC_TINY_DIRECT_NEXT_INSTRUCTIONS.md`

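The E5-4 method (one size check, then a direct call into the tiny allocator, behind an ENV gate) might look roughly like the sketch below. It is illustrative only: the real signature of `malloc_tiny_fast_for_class()` is not shown in these notes, so a stand-in `sketch_tiny_alloc_for_class()` backed by the system `malloc` is used. The 128-byte threshold, the 16-byte class spacing, and all `sketch_*` names are assumptions. Only the variable name `HAKMEM_MALLOC_TINY_DIRECT` comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_TINY_MAX_SIZE 128u /* hypothetical tiny threshold */
#define SKETCH_CLASS_SHIFT   4u   /* hypothetical 16-byte size classes */

static int sketch_gate = -1;            /* -1 = not read yet, else 0 or 1 */
static unsigned sketch_direct_hits = 0; /* direct-path counter for A/B checks */

/* Reads the ENV gate once and caches the result (default OFF). */
static int sketch_tiny_direct_enabled(void) {
    if (sketch_gate < 0) {
        const char *v = getenv("HAKMEM_MALLOC_TINY_DIRECT");
        sketch_gate = (v != NULL && v[0] == '1') ? 1 : 0;
    }
    return sketch_gate;
}

/* Stand-in for the real per-class tiny allocator (assumption). */
static void *sketch_tiny_alloc_for_class(size_t class_idx) {
    assert(class_idx < (SKETCH_TINY_MAX_SIZE >> SKETCH_CLASS_SHIFT));
    sketch_direct_hits++;
    return malloc((class_idx + 1u) << SKETCH_CLASS_SHIFT);
}

/* Stand-in for the existing, fully gated allocation path (assumption). */
static void *sketch_general_alloc(size_t size) {
    return malloc(size);
}

/* Wrapper: a single unsigned compare covers both size == 0 (wraps to
 * SIZE_MAX) and size > threshold, so tiny requests skip the general gate. */
void *sketch_malloc(size_t size) {
    if (sketch_tiny_direct_enabled() && (size - 1u) < SKETCH_TINY_MAX_SIZE) {
        return sketch_tiny_alloc_for_class((size - 1u) >> SKETCH_CLASS_SHIFT);
    }
    return sketch_general_alloc(size);
}
```

Keeping the gate default OFF mirrors how the other Phase 5 experiments are described, so an A/B run can compare `HAKMEM_MALLOC_TINY_DIRECT=0` against `=1` without rebuilding.
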
---

## Update memo (2025-12-14 Phase 5 E5-2 Complete - Header Write-Once)

### Phase 5 E5-2: Header Write-Once Optimization ⚪ NEUTRAL (2025-12-14)

@ -120,12 +183,15 @@

**Next Steps**:
- ✅ Promote: `HAKMEM_FREE_TINY_DIRECT=1` to `MIXED_TINYV3_C7_SAFE` preset
- Next: E5-2 (Header Prefill at Refill, 2.59% target) or E5-3 (ENV Snapshot Shape, 2.57% target)
- ✅ E5-2: NEUTRAL → FREEZE
- ✅ E5-3: DEFER (low ROI)
- Next: **E5-4 (Malloc Tiny Direct)** (replicates the E5-1 pattern on the alloc side)
- Design docs:
  - `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_1_DESIGN.md`
  - `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_1_AB_TEST_RESULTS.md`
  - `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
  - `docs/analysis/PHASE5_E5_COMPREHENSIVE_ANALYSIS.md`
  - `docs/analysis/PHASE5_E5_4_MALLOC_TINY_DIRECT_NEXT_INSTRUCTIONS.md`

---