Phase 89 SSOT Measurement Capture
Timestamp: 2025-12-18 23:06:01
Git SHA: e4c5f0535
Branch: master
Step 1: OBSERVE Binary (Telemetry Verification)
Binary: ./bench_random_mixed_hakmem_observe
Profile: MIXED_TINYV3_C7_SAFE
Iterations: 20,000,000
Working Set: 400
Inline Slots Overflow Stats (Preflight Verification):
- PUSH TOTAL: 4,812,031 ops (C4+C5+C6 verified active)
- POP TOTAL: 4,812,031 ops
- PUSH FULL: 0 (0.00%)
- POP EMPTY: 168 (0.003%)
- LEGACY FALLBACK CALLS: 5,327,294
- Judgment: ✓ [C] LEGACY used AND C4/C5/C6 INLINE SLOTS ACTIVE
- Throughput (with telemetry): 51.52M ops/s
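The overflow percentages above follow directly from the raw counters. The sketch below recomputes them; the function name and the pass rule (LEGACY used and inline slots active) are illustrative assumptions, not the actual check performed by the OBSERVE binary.

```python
def overflow_rates(push_total, push_full, pop_total, pop_empty):
    """Return overflow/underflow events as a percentage of total ops."""
    return {
        "push_full_pct": 100.0 * push_full / push_total,
        "pop_empty_pct": 100.0 * pop_empty / pop_total,
    }

# Counters copied from the preflight report above.
PUSH_TOTAL = 4_812_031
POP_TOTAL = 4_812_031
PUSH_FULL = 0
POP_EMPTY = 168
LEGACY_FALLBACK_CALLS = 5_327_294

rates = overflow_rates(PUSH_TOTAL, PUSH_FULL, POP_TOTAL, POP_EMPTY)

# Hypothetical pass rule mirroring the judgment line: the legacy path is
# exercised AND the inline slots see traffic. The real criterion is not
# shown in this report.
preflight_ok = LEGACY_FALLBACK_CALLS > 0 and PUSH_TOTAL > 0

print(f"PUSH FULL: {rates['push_full_pct']:.3f}%")
print(f"POP EMPTY: {rates['pop_empty_pct']:.4f}%")
print("preflight ok:", preflight_ok)
```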
Step 2: Standard Build (Clean Performance Baseline)
Binary: ./bench_random_mixed_hakmem
Build Flags: RELEASE, no telemetry, standard optimization
Profile: MIXED_TINYV3_C7_SAFE
Iterations: 20,000,000
Working Set: 400
Runs: 10
10-Run Results:
| Run | Throughput | Status |
|---|---|---|
| 1 | 51.15M | OK |
| 2 | 51.44M | OK |
| 3 | 51.61M | OK |
| 4 | 51.73M | Peak |
| 5 | 50.74M | Low |
| 6 | 51.34M | OK |
| 7 | 50.74M | Low |
| 8 | 51.37M | OK |
| 9 | 51.39M | OK |
| 10 | 51.31M | OK |
Statistics:
- Mean: 51.36M ops/s (as reported; the ten rounded run values above average 51.28M, and their median is 51.36M)
- Min: 50.74M ops/s
- Max: 51.73M ops/s
- Range: 0.99M ops/s
- CV: ~0.7%
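As a cross-check, the summary statistics can be recomputed from the ten table values. This is a minimal sketch using only the standard library; it works on the rounded values shown in the table, so results may differ slightly from figures produced from unrounded data. The use of the sample standard deviation for CV is an assumption, since the report does not say which estimator was used.

```python
import statistics

# Throughput per run in M ops/s, copied from the Standard build table.
standard_runs = [51.15, 51.44, 51.61, 51.73, 50.74,
                 51.34, 50.74, 51.37, 51.39, 51.31]

def summarize(runs):
    """Return basic descriptive statistics for a list of throughputs."""
    mean = statistics.fmean(runs)
    stdev = statistics.stdev(runs)  # sample standard deviation (assumed)
    return {
        "mean": mean,
        "median": statistics.median(runs),
        "min": min(runs),
        "max": max(runs),
        "range": max(runs) - min(runs),
        "cv_pct": 100.0 * stdev / mean,
    }

summary = summarize(standard_runs)
for name, value in summary.items():
    print(f"{name:>7}: {value:.3f}")
```

Passing the FAST PGO run list to the same `summarize` helper gives the corresponding figures for that build.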
Step 3: FAST PGO Build (Optimized Performance Tracking)
Binary: ./bench_random_mixed_hakmem_minimal_pgo
Build Flags: RELEASE, PGO optimized, BENCH_MINIMAL=1
Profile: MIXED_TINYV3_C7_SAFE
Iterations: 20,000,000
Working Set: 400
Runs: 10
10-Run Results:
| Run | Throughput | Status |
|---|---|---|
| 1 | 55.13M | Peak |
| 2 | 54.73M | High |
| 3 | 53.81M | OK |
| 4 | 54.60M | High |
| 5 | 55.02M | Peak |
| 6 | 52.89M | Low |
| 7 | 53.61M | OK |
| 8 | 53.53M | OK |
| 9 | 55.08M | Peak |
| 10 | 53.51M | OK |
Statistics:
- Mean: 54.16M ops/s (as reported; the ten rounded run values above average 54.19M)
- Min: 52.89M ops/s
- Max: 55.13M ops/s
- Range: 2.24M ops/s
- CV: ~1.5%
Performance Delta Analysis
Standard vs FAST PGO:
- Delta: 54.16M - 51.36M = 2.80M ops/s
- Percentage Gain: (2.80M / 51.36M) × 100 = 5.45%
Interpretation:
- FAST PGO is 5.45% faster than the Standard build
- This gap is treated as the optimization ceiling for the current profile-guided configuration (RELEASE, PGO, BENCH_MINIMAL=1)
- SSOT baseline for bottleneck analysis: Standard 51.36M ops/s
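The delta arithmetic can be reproduced with a few lines. The sketch below uses the reported means and adds a simple separation check on the reported min/max values; the helper name `percent_gain` is illustrative.

```python
def percent_gain(baseline, candidate):
    """Relative gain of candidate over baseline, in percent."""
    return 100.0 * (candidate - baseline) / baseline

# Reported figures, in M ops/s.
standard_mean, standard_max = 51.36, 51.73
pgo_mean, pgo_min = 54.16, 52.89

delta = pgo_mean - standard_mean
gain = percent_gain(standard_mean, pgo_mean)
print(f"delta = {delta:.2f}M ops/s, gain = {gain:.2f}%")
# prints: delta = 2.80M ops/s, gain = 5.45%

# If the slowest PGO run still beats the fastest Standard run, the two
# run sets do not overlap, so the gain is larger than run-to-run spread.
ranges_overlap = pgo_min <= standard_max
print("ranges overlap:", ranges_overlap)
# prints: ranges overlap: False
```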
Environment Configuration (SSOT Locked)
Key ENV variables (forced in scripts/run_mixed_10_cleanenv.sh):
- HAKMEM_BENCH_MIN_SIZE=16 - SSOT: prevent size drift
- HAKMEM_BENCH_MAX_SIZE=1040 - SSOT: prevent class filtering
- HAKMEM_BENCH_C5_ONLY=0 - SSOT: no single-class mode
- HAKMEM_BENCH_C6_ONLY=0 - SSOT: no single-class mode
- HAKMEM_BENCH_C7_ONLY=0 - SSOT: no single-class mode
- HAKMEM_WARM_POOL_SIZE=16 - Phase 69 winner
- HAKMEM_TINY_C4_INLINE_SLOTS=1 - Phase 76-1 promoted
- HAKMEM_TINY_C5_INLINE_SLOTS=1 - Phase 75-2 promoted
- HAKMEM_TINY_C6_INLINE_SLOTS=1 - Phase 75-1 promoted
- HAKMEM_TINY_INLINE_SLOTS_FIXED=1 - Phase 78-1 promoted
- HAKMEM_TINY_INLINE_SLOTS_SWITCHDISPATCH=1 - Phase 80-1 promoted
- HAKMEM_TINY_INLINE_SLOTS_SWITCHDISPATCH_FIXED=0 - Phase 83-1 NO-GO
- HAKMEM_FASTLANE_DIRECT=1 - Phase 19-1b promoted
- HAKMEM_FREE_TINY_FAST_MONO_DUALHOT=1 - Phase 9/10 promoted
- HAKMEM_FREE_TINY_FAST_MONO_LEGACY_DIRECT=1 - Phase 10 promoted
- HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE - default route
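To illustrate what forcing these variables might look like, here is a minimal sketch that builds a locked environment and passes it to a benchmark process. The contents of scripts/run_mixed_10_cleanenv.sh are not shown in this report, so dropping inherited HAKMEM_* variables, the helper names, and the way the binary is invoked are all assumptions.

```python
import os
import subprocess

# Values copied from the SSOT list above.
SSOT_ENV = {
    "HAKMEM_BENCH_MIN_SIZE": "16",
    "HAKMEM_BENCH_MAX_SIZE": "1040",
    "HAKMEM_BENCH_C5_ONLY": "0",
    "HAKMEM_BENCH_C6_ONLY": "0",
    "HAKMEM_BENCH_C7_ONLY": "0",
    "HAKMEM_WARM_POOL_SIZE": "16",
    "HAKMEM_TINY_C4_INLINE_SLOTS": "1",
    "HAKMEM_TINY_C5_INLINE_SLOTS": "1",
    "HAKMEM_TINY_C6_INLINE_SLOTS": "1",
    "HAKMEM_TINY_INLINE_SLOTS_FIXED": "1",
    "HAKMEM_TINY_INLINE_SLOTS_SWITCHDISPATCH": "1",
    "HAKMEM_TINY_INLINE_SLOTS_SWITCHDISPATCH_FIXED": "0",
    "HAKMEM_FASTLANE_DIRECT": "1",
    "HAKMEM_FREE_TINY_FAST_MONO_DUALHOT": "1",
    "HAKMEM_FREE_TINY_FAST_MONO_LEGACY_DIRECT": "1",
    "HAKMEM_PROFILE": "MIXED_TINYV3_C7_SAFE",
}

def build_clean_env(base=None):
    """Copy the base environment, drop stray HAKMEM_* settings,
    then force the SSOT values (assumed meaning of 'cleanenv')."""
    base = dict(os.environ if base is None else base)
    clean = {k: v for k, v in base.items() if not k.startswith("HAKMEM_")}
    clean.update(SSOT_ENV)
    return clean

def run_once(binary, args=()):
    """Run one benchmark pass under the locked environment.
    The argument list is hypothetical; the report does not show it."""
    result = subprocess.run([binary, *args], env=build_clean_env(),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Starting from a filtered copy of the caller's environment keeps PATH and similar variables intact while ensuring a leftover setting such as HAKMEM_BENCH_C7_ONLY=1 from an earlier experiment cannot leak into the measurement.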
System Configuration
- CPU: AMD Ryzen 7 5825U with Radeon Graphics
- Logical CPUs: 16 (8 cores, 16 threads)
- Memory: 13166508 kB MemTotal (about 12.6 GiB)
- Kernel: 6.8.0-87-generic
Next Steps (Phase 89 Step 5)
Objective: Identify top 3 bottleneck candidates using perf measurement
- Run perf top during Mixed SSOT execution
- Analyze top 50 functions by CPU time
- Filter to high-frequency code paths (avoid 0.001% optimizations)
- Prepare recommendations for Phase 90+
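The filtering step in the list above can be sketched as a small helper that takes (symbol, CPU percent) pairs, drops negligible entries, and keeps the top three. The symbol names, percentages, and the 1.0% threshold below are placeholders for illustration only; they are not measurements from this benchmark, and extracting the pairs from perf output is left out.

```python
def top_candidates(samples, min_pct=1.0, k=3):
    """Return the k hottest (symbol, pct) pairs at or above min_pct."""
    hot = [(sym, pct) for sym, pct in samples if pct >= min_pct]
    hot.sort(key=lambda item: item[1], reverse=True)
    return hot[:k]

# Placeholder data, NOT real perf results.
samples = [
    ("func_a", 12.5),
    ("func_b", 0.001),  # negligible share, filtered out
    ("func_c", 7.25),
    ("func_d", 3.0),
    ("func_e", 9.0),
]

candidates = top_candidates(samples)
print(candidates)
# prints: [('func_a', 12.5), ('func_e', 9.0), ('func_c', 7.25)]
```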