|
|
9123a8f12b
|
Phase 75-5: PGO Regeneration + Forensics - CRITICAL FINDING (NEUTRAL)
Regenerated PGO profile with C5=1, C6=1, WarmPool=16 training config.
Results:
- Baseline (10-run): 55.04 M ops/s (target: ≥60, Phase 69: 62.63)
- Recovery: +0.3% vs Phase 75-4 (minimal improvement)
- 4-point matrix D vs A: +2.35% (down from +3.16%)
Decision: NEUTRAL - Profile regeneration did NOT fix regression
ROOT CAUSE DISCOVERY (Forensics):
Original hypothesis: PGO profile mismatch
ACTUAL FINDING: Hypothesis REJECTED - regression is caused by a code bloat layout tax
Forensics Analysis (Phase 69 → Phase 75-5):
1. Code Bloat Tax: +13KB text (+3.1% binary growth)
- Phase 69: 447KB → Phase 75-5: 460KB
- C5/C6 inline slots + structural additions
2. IPC Collapse: -7.22% (CRITICAL)
- Phase 69: 1.80 IPC → Phase 75-5: 1.67 IPC
- Instruction fetch/decode pipeline degraded
3. Branch Predictor Disruption: +19.4% relative increase in branch-miss rate (SIGNIFICANT)
- Branch-miss rate: 3.81% → 4.56%
- Control flow patterns worsened
4. Net Effect: -12.12% regression
- Code bloat impact: ~-5.0 M ops/s
- IPC degradation: ~-2.0 M ops/s
- C5+C6 benefit: +1.3 M ops/s
- Total: -7.59 M ops/s vs Phase 69 (62.63 → 55.04); the three estimates above account for only ~-5.7 M ops/s
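
The percentages above can be rechecked from the figures quoted in this message. The following is a minimal sketch, not the project's forensics tooling: the helper `rel_change` and the dictionary keys are hypothetical names introduced for illustration, and the inputs are the rounded values from this message, so recomputed percentages may differ slightly from the quoted ones (for example the branch-miss delta and the text-size growth).

```python
def rel_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Rounded figures quoted in this message (Phase 69 -> Phase 75-5).
metrics = {
    "throughput_mops": (62.63, 55.04),
    "ipc": (1.80, 1.67),
    "branch_miss_rate_pct": (3.81, 4.56),
    "text_size_kb": (447.0, 460.0),
}

deltas = {name: round(rel_change(b, a), 2) for name, (b, a) in metrics.items()}

# Attribution estimates from item 4, in M ops/s.
components = {"code_bloat": -5.0, "ipc_degradation": -2.0, "c5_c6_benefit": 1.3}
attributed = round(sum(components.values()), 2)
observed = round(55.04 - 62.63, 2)
unattributed = round(observed - attributed, 2)

for name, pct in deltas.items():
    print(f"{name}: {pct:+.2f}%")
print(f"observed {observed:+.2f}, attributed {attributed:+.2f}, "
      f"unattributed {unattributed:+.2f} M ops/s")
```

With these rounded inputs the throughput and IPC deltas reproduce the quoted -12.12% and -7.22%. The attribution estimates sum to less than the observed throughput drop, so part of the regression remains unexplained by the three listed components.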
The Paradox:
- C5+C6 optimization is algorithmically correct (+2.35%)
- But code bloat introduces a larger layout tax (net -12% vs Phase 69)
- PGO profile was correctly trained - issue is structural
Recommendation: DEMOTE FAST PGO as SSOT → Promote Standard build
- PGO is too sensitive to layout changes (~3% text growth → ~12% throughput loss)
- Standard showed +5.41% in Phase 75-3 with better stability
Next: Phase 75-6 (Standard baseline update) + Phase 76 (code size audit)
Artifacts: docs/analysis/PHASE75_5_PGO_REGENERATION_RESULTS.md
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
2025-12-18 09:48:31 +09:00 |
|