Update scorecard: Phase 75-4 FAST PGO rebase (+3.16%) + critical PGO staleness finding
Phase 75-4 validates C5+C6 inline slots on the FAST PGO baseline:
- Point A (baseline, C5=0, C6=0): 53.81 M ops/s
- Point D (C5=1, C6=1): 55.51 M ops/s (+3.16%)

CRITICAL FINDING: 14% regression vs the Phase 69 baseline (53.81 vs 62.63 M ops/s)
Suspected root cause: stale PGO profile (likely trained pre-Phase 69, missing Phase 75 benefits)

Recommended next: Phase 75-5 (PGO Profile Regeneration) to recover lost performance

Scorecard updated with Phase 75-4 results and high-priority action items.
@@ -33,6 +33,7 @@ Note:
| **FAST v3 + PGO (Phase 66)** | **60.89** | **61.35** | **50.32%** | **GO: +3.0% mean (verified 3 times, stable <±1%)**. Phase 66 PGO initial baseline |
| **FAST v3 + PGO (Phase 68)** | **61.614** | **61.924** | **50.93%** | **GO: +1.19% vs Phase 66** ✓ (seed/WS diversification) |
| **FAST v3 + PGO (Phase 69)** | **62.63** | **63.38** | **51.77%** | **Strong GO: +3.26% vs Phase 68** ✓✓✓ (Warm Pool Size=16, ENV-only) → **promoted to new FAST baseline** ✓ |
| FAST v3 + PGO + Phase 75 (C5+C6 ON) [Point D] | **55.51** | - | **45.70%** | Phase 75-4 FAST PGO rebase (C5+C6 inline slots): +3.16% vs Point A ✓ **[REBASE URGENT]** |
| Standard | 53.50 | - | 44.21% | Safety/compatibility baseline (measured before Phase 48; needs rebase) |
| OBSERVE | TBD | - | - | Diagnostic counters ON |

@@ -118,6 +119,50 @@ Notes:
- Rollback: Set `HAKMEM_WARM_POOL_SIZE=12` or remove the ENV variable
- Results: `docs/analysis/PHASE69_REFILL_TUNING_1_RESULTS.md`

**Phase 75-4: FAST PGO Rebase (C5+C6 Inline Slots Validation): CRITICAL FINDING**

Phase 75-3 validated the C5+C6 inline slots optimization on the Standard binary (+5.41%). Phase 75-4 rebases that result onto the FAST PGO baseline to update the SSOT:

**4-Point Matrix (FAST PGO, Mixed SSOT):**

| Point | Config | Throughput | Delta vs A |
|-------|--------|-----------|-----------|
| A | C5=0, C6=0 | 53.81 M ops/s | baseline |
| B | C5=1, C6=0 | 53.03 M ops/s | -1.45% |
| C | C5=0, C6=1 | 54.17 M ops/s | +0.67% |
| **D** | **C5=1, C6=1** | **55.51 M ops/s** | **+3.16%** |

**Decision**: ✅ **GO** (Point D exceeds the +3.0% ideal threshold by 0.16 pp)

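The deltas in the 4-point matrix can be reproduced from the raw throughput figures. The sketch below is only an arithmetic check, not part of the benchmark harness; the `interaction_pct` metric (how far Point D exceeds the sum of the two single-flag effects) is an illustrative addition and does not appear in the original scorecard.

```python
# Throughput in M ops/s, copied from the 4-point matrix.
points = {
    "A": 53.81,  # C5=0, C6=0 (baseline)
    "B": 53.03,  # C5=1, C6=0
    "C": 54.17,  # C5=0, C6=1
    "D": 55.51,  # C5=1, C6=1
}

GO_THRESHOLD_PCT = 3.0  # the "+3.0% ideal threshold" named in the Decision line


def delta_pct(value, baseline):
    """Relative change of value vs baseline, in percent."""
    return (value / baseline - 1.0) * 100.0


# Delta of every point against Point A.
deltas = {name: delta_pct(v, points["A"]) for name, v in points.items()}

# Illustrative metric (an assumption, not from the scorecard):
# how much D exceeds the sum of the single-flag effects of B and C.
interaction_pct = deltas["D"] - (deltas["B"] + deltas["C"])

for name in "ABCD":
    print(f"Point {name}: {points[name]:.2f} M ops/s ({deltas[name]:+.2f}%)")
print(f"Margin over GO threshold: {deltas['D'] - GO_THRESHOLD_PCT:+.2f} pp")
print(f"C5 x C6 interaction: {interaction_pct:+.2f} pp")
```

With these inputs the single-flag effects sum to a net loss, so nearly all of Point D's gain comes from enabling both flags together. This reading rests on single measurements per point and should be treated with the same caution as the underlying runs.
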
**⚠️ CRITICAL FINDING: PGO Profile Staleness**

- **Phase 69 FAST baseline**: 62.63 M ops/s
- **Phase 75-4 Point A (FAST PGO baseline)**: 53.81 M ops/s
- **Regression**: -14.08% (not explained by Phase 75 additions)
- **Root cause hypothesis**: the PGO profile was trained pre-Phase 69 (likely Phase 68 or earlier) with the C5=0, C6=0 configuration
- **Impact**: FAST PGO captures only 58.4% of Standard's +5.41% gain (3.16% vs 5.41%)

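The two headline numbers in this finding follow directly from the figures listed above. A minimal sketch of the arithmetic, using only values quoted in this section (variable names are illustrative):

```python
# Inputs quoted in the scorecard (M ops/s and percent).
phase69_fast = 62.63       # Phase 69 FAST baseline
point_a = 53.81            # Phase 75-4 Point A (FAST PGO, C5=0, C6=0)
fast_gain_pct = 3.16       # Point D vs Point A on FAST PGO
standard_gain_pct = 5.41   # Phase 75-3 gain on the Standard binary

# Relative drop of Point A against the Phase 69 baseline.
regression_pct = (point_a / phase69_fast - 1.0) * 100.0

# Share of the Standard-binary gain that survives on FAST PGO.
capture_ratio_pct = fast_gain_pct / standard_gain_pct * 100.0

print(f"Regression vs Phase 69: {regression_pct:+.2f}%")
print(f"Gain captured on FAST PGO: {capture_ratio_pct:.1f}%")
```

The regression works out to roughly -14.1% and the capture ratio to about 58.4%. The capture ratio compares two relative gains measured on different binaries, so it is a rough indicator and not a precise efficiency figure.
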
**Recommended Actions (Priority Order):**

1. **IMMEDIATE - UPDATE SSOT**: Phase 75 C5+C6 inline slots are confirmed working (+3.16% on FAST PGO)
   - Promote to `core/bench_profile.h` (already done for Standard; now validated on FAST PGO)
   - Update this scorecard: Phase 75 baseline = 55.51 M ops/s (Point D, with C5+C6 ON)

2. **HIGH PRIORITY - PHASE 75-5 (PGO Profile Regeneration)**
   - Regenerate the PGO profile with the C5=1, C6=1 training configuration
   - Expected gain: +5-8% (if the regenerated profile matches the current code)
   - Estimated recovery: 55.51 M ops/s → ~58-59 M ops/s
   - Root cause analysis: investigate the 14% gap vs Phase 69 (layout, code bloat, or profile mismatch)

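The recovery estimate can be sanity-checked by applying the projected +5-8% to the Point D figure and comparing the result with the Phase 69 baseline. This is a sketch of the arithmetic only; the +5-8% range is the scorecard's projection, not a measurement.

```python
# Figures quoted in the scorecard (M ops/s).
point_d = 55.51        # Phase 75-4 Point D (C5=1, C6=1)
phase69_fast = 62.63   # Phase 69 FAST baseline

# Projected gain range from PGO regeneration (the scorecard's estimate).
gain_low_pct, gain_high_pct = 5.0, 8.0

projected_low = point_d * (1.0 + gain_low_pct / 100.0)
projected_high = point_d * (1.0 + gain_high_pct / 100.0)

# Remaining gap to Phase 69 even at the optimistic end of the projection.
residual_gap_pct = (projected_high / phase69_fast - 1.0) * 100.0

print(f"Projected range: {projected_low:.2f} to {projected_high:.2f} M ops/s")
print(f"Gap to Phase 69 at the high end: {residual_gap_pct:+.2f}%")
```

Even at the upper end of the projection the result stays below the Phase 69 baseline, which is consistent with listing the root cause analysis as a separate item and not assuming that profile regeneration alone closes the gap.
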
**Documentation:**

- Phase 75-4 results: `docs/analysis/PHASE75_4_FAST_PGO_REBASE_RESULTS.md`
- Next: Phase 75-5 (PGO regeneration) required before the next optimization phase

**Impact on M2 Milestone:**

- Phase 69 FAST baseline: 62.63 M ops/s (51.77% of mimalloc, 3.23 pp short of M2)
- Phase 75-4 Point A (baseline): 53.81 M ops/s (44.35% of mimalloc, 10.65 pp short of M2)
- Phase 75-4 Point D (C5+C6): 55.51 M ops/s (45.70% of mimalloc, 9.30 pp short of M2)
- **Status**: Phase 75 optimization is proven, but the PGO profile regression masks the true progress

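The distances to M2 listed above are simple differences against the milestone target. The sketch below assumes an M2 target of 55% of mimalloc; that value is not stated in this section and is inferred from the first bullet (51.77% plus 3.23 pp).

```python
# Assumption: M2 target inferred from 51.77% + 3.23 pp = 55.00%.
M2_TARGET_PCT = 55.0

# Percent-of-mimalloc figures quoted in the list.
pct_of_mimalloc = {
    "Phase 69 FAST baseline": 51.77,
    "Phase 75-4 Point A": 44.35,
    "Phase 75-4 Point D": 45.70,
}

# Remaining distance to M2 in percentage points.
gap_pp = {name: M2_TARGET_PCT - pct for name, pct in pct_of_mimalloc.items()}

for name, gap in gap_pp.items():
    print(f"{name}: {gap:.2f} pp short of M2")
```

All three gaps match the listed values under the inferred 55% target, so the list is internally consistent; the percent-of-mimalloc inputs themselves depend on the mimalloc reference run and are taken as given here.
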
Note: the reference values for `mimalloc/system/jemalloc` shift with environment drift, so re-baseline them periodically.
- Phase 48 complete: `docs/analysis/PHASE48_REBASE_ALLOCATORS_AND_STABILITY_SUITE_RESULTS.md`
- Phase 59 complete: `docs/analysis/PHASE59_50PERCENT_RECOVERY_BASELINE_REBASE_RESULTS.md`
