hakmem

tomoaki/hakmem


Commit Graph


Author

SHA1

Message

Date

Moe Charm (CI)

9123a8f12b

Phase 75-5: PGO Regeneration + Forensics - CRITICAL FINDING (NEUTRAL)

Regenerated PGO profile with C5=1, C6=1, WarmPool=16 training config.

Results:
- Baseline (10-run): 55.04 M ops/s (target: ≥60, Phase 69: 62.63)
- Recovery: +0.3% vs Phase 75-4 (minimal improvement)
- 4-point matrix D vs A: +2.35% (down from +3.16%)
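A multi-run baseline like the one above is usually reduced to a mean, a spread, and a gap to the target. The following is a minimal sketch of that reduction, not the project's actual benchmark harness; `summarize_runs` and its field names are hypothetical, and the per-run samples are not given in the commit message, so the caller has to supply them.

```python
import statistics


def summarize_runs(samples_mops, target_mops):
    """Summarize repeated throughput measurements (in M ops/s).

    Hypothetical helper: the name and return fields are illustrative only.
    """
    mean = statistics.mean(samples_mops)
    # Sample standard deviation; requires at least two runs.
    stdev = statistics.stdev(samples_mops)
    return {
        "mean": mean,
        # Coefficient of variation: run-to-run noise relative to the mean.
        "cv_pct": stdev / mean * 100.0,
        # Negative when the mean falls short of the target.
        "gap_to_target_pct": (mean - target_mops) / target_mops * 100.0,
        "meets_target": mean >= target_mops,
    }
```

Comparing `cv_pct` against the size of a claimed improvement is a quick way to tell whether a small delta such as +0.3% is distinguishable from noise.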

Decision: NEUTRAL - Profile regeneration did NOT fix regression

ROOT CAUSE DISCOVERY (Forensics):
Original hypothesis: PGO profile mismatch
ACTUAL FINDING: Hypothesis REJECTED - the regression comes from a code bloat layout tax

Forensics Analysis (Phase 69 → Phase 75-5):
1. Code Bloat Tax: +13KB text (+3.1% binary growth)
   - Phase 69: 447KB → Phase 75-5: 460KB
   - C5/C6 inline slots + structural additions

2. IPC Collapse: -7.22% (CRITICAL)
   - Phase 69: 1.80 IPC → Phase 75-5: 1.67 IPC
   - Instruction fetch/decode pipeline degraded

3. Branch Predictor Disruption: +19.4% (SIGNIFICANT)
   - Branch-miss rate: 3.81% → 4.56%
   - Control flow patterns worsened

4. Net Effect: -12.12% regression
   - Code bloat impact: ~-5.0 M ops/s
   - IPC degradation: ~-2.0 M ops/s
   - C5+C6 benefit: +1.3 M ops/s
   - Total: -7.59 M ops/s vs Phase 69 (62.63 → 55.04)
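The relative changes in the forensics list can be re-derived from the before/after figures quoted in this message. The sketch below does only that arithmetic; the helper names are illustrative and no new measurements are introduced. With the rounded figures as printed, IPC and throughput reproduce -7.22% and -12.12%, while text size and branch-miss rate come out near +2.9% and +19.7%, so the reported +3.1% and +19.4% were presumably computed from unrounded values.

```python
# Before/after pairs copied from the commit message (Phase 69, Phase 75-5).
REPORTED = {
    "throughput_mops": (62.63, 55.04),
    "ipc": (1.80, 1.67),
    "branch_miss_rate_pct": (3.81, 4.56),
    "text_size_kb": (447.0, 460.0),
}


def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def summarize(figures):
    """Map each metric name to its relative change, rounded to 2 decimals."""
    return {
        name: round(pct_change(before, after), 2)
        for name, (before, after) in figures.items()
    }


if __name__ == "__main__":
    for name, delta in summarize(REPORTED).items():
        print(f"{name}: {delta:+.2f}%")
```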

The Paradox:
- C5+C6 optimization is algorithmically correct (+2.35%)
- But code bloat introduces a larger layout tax (-12%)
- PGO profile was correctly trained - issue is structural

Recommendation: DEMOTE FAST PGO as SSOT → Promote Standard build
- PGO is too sensitive to layout changes (+3% code size → -12% throughput)
- Standard showed +5.41% in Phase 75-3 with better stability

Next: Phase 75-6 (Standard baseline update) + Phase 76 (code size audit)

Artifacts: docs/analysis/PHASE75_5_PGO_REGENERATION_RESULTS.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2025-12-18 09:48:31 +09:00

1 Commit