142 lines
3.9 KiB
Markdown
142 lines
3.9 KiB
Markdown
|
|
# Phase 89 SSOT Measurement Capture
|
|||
|
|
|
|||
|
|
**Timestamp**: 2025-12-18 23:06:01
|
|||
|
|
**Git SHA**: e4c5f0535
|
|||
|
|
**Branch**: master
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Step 1: OBSERVE Binary (Telemetry Verification)
|
|||
|
|
|
|||
|
|
**Binary**: `./bench_random_mixed_hakmem_observe`
|
|||
|
|
**Profile**: `MIXED_TINYV3_C7_SAFE`
|
|||
|
|
**Iterations**: 20,000,000
|
|||
|
|
**Working Set**: 400
|
|||
|
|
|
|||
|
|
**Inline Slots Overflow Stats (Preflight Verification)**:
|
|||
|
|
- PUSH TOTAL: 4,812,031 ops (C4+C5+C6 verified active)
|
|||
|
|
- POP TOTAL: 4,812,031 ops
|
|||
|
|
- PUSH FULL: 0 (0.00%)
|
|||
|
|
- POP EMPTY: 168 (0.003%)
|
|||
|
|
- LEGACY FALLBACK CALLS: 5,327,294
|
|||
|
|
- Judgment: ✓ \[C\] LEGACY used AND C4/C5/C6 INLINE SLOTS ACTIVE
|
|||
|
|
- Throughput (with telemetry): **51.52M ops/s**
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Step 2: Standard Build (Clean Performance Baseline)
|
|||
|
|
|
|||
|
|
**Binary**: `./bench_random_mixed_hakmem`
|
|||
|
|
**Build Flags**: RELEASE, no telemetry, standard optimization
|
|||
|
|
**Profile**: `MIXED_TINYV3_C7_SAFE`
|
|||
|
|
**Iterations**: 20,000,000
|
|||
|
|
**Working Set**: 400
|
|||
|
|
**Runs**: 10
|
|||
|
|
|
|||
|
|
**10-Run Results**:
|
|||
|
|
| Run | Throughput | Status |
|
|||
|
|
|-----|-----------|--------|
|
|||
|
|
| 1 | 51.15M | OK |
|
|||
|
|
| 2 | 51.44M | OK |
|
|||
|
|
| 3 | 51.61M | OK |
|
|||
|
|
| 4 | 51.73M | Peak |
|
|||
|
|
| 5 | 50.74M | Low |
|
|||
|
|
| 6 | 51.34M | OK |
|
|||
|
|
| 7 | 50.74M | Low |
|
|||
|
|
| 8 | 51.37M | OK |
|
|||
|
|
| 9 | 51.39M | OK |
|
|||
|
|
| 10 | 51.31M | OK |
|
|||
|
|
|
|||
|
|
**Statistics**:
|
|||
|
|
- **Mean**: 51.36M ops/s
|
|||
|
|
- **Min**: 50.74M ops/s
|
|||
|
|
- **Max**: 51.73M ops/s
|
|||
|
|
- **Range**: 0.99M ops/s
|
|||
|
|
- **CV**: ~0.7%
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Step 3: FAST PGO Build (Optimized Performance Tracking)
|
|||
|
|
|
|||
|
|
**Binary**: `./bench_random_mixed_hakmem_minimal_pgo`
|
|||
|
|
**Build Flags**: RELEASE, PGO optimized, BENCH_MINIMAL=1
|
|||
|
|
**Profile**: `MIXED_TINYV3_C7_SAFE`
|
|||
|
|
**Iterations**: 20,000,000
|
|||
|
|
**Working Set**: 400
|
|||
|
|
**Runs**: 10
|
|||
|
|
|
|||
|
|
**10-Run Results**:
|
|||
|
|
| Run | Throughput | Status |
|
|||
|
|
|-----|-----------|--------|
|
|||
|
|
| 1 | 55.13M | Peak |
|
|||
|
|
| 2 | 54.73M | High |
|
|||
|
|
| 3 | 53.81M | OK |
|
|||
|
|
| 4 | 54.60M | High |
|
|||
|
|
| 5 | 55.02M | Peak |
|
|||
|
|
| 6 | 52.89M | Low |
|
|||
|
|
| 7 | 53.61M | OK |
|
|||
|
|
| 8 | 53.53M | OK |
|
|||
|
|
| 9 | 55.08M | Peak |
|
|||
|
|
| 10 | 53.51M | OK |
|
|||
|
|
|
|||
|
|
**Statistics**:
|
|||
|
|
- **Mean**: 54.16M ops/s
|
|||
|
|
- **Min**: 52.89M ops/s
|
|||
|
|
- **Max**: 55.13M ops/s
|
|||
|
|
- **Range**: 2.24M ops/s
|
|||
|
|
- **CV**: ~1.5%
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Performance Delta Analysis
|
|||
|
|
|
|||
|
|
**Standard vs FAST PGO**:
|
|||
|
|
- Delta: 54.16M - 51.36M = **2.80M ops/s**
|
|||
|
|
- Percentage Gain: (2.80M / 51.36M) × 100 = **5.45%**
|
|||
|
|
|
|||
|
|
**Interpretation**:
|
|||
|
|
- FAST PGO is 5.45% faster than Standard build
|
|||
|
|
- This represents the optimization ceiling with current profile-guided configuration
|
|||
|
|
- SSOT baseline for bottleneck analysis: **Standard 51.36M ops/s**
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Environment Configuration (SSOT Locked)
|
|||
|
|
|
|||
|
|
**Key ENV variables** (forced in `scripts/run_mixed_10_cleanenv.sh`):
|
|||
|
|
- `HAKMEM_BENCH_MIN_SIZE=16` - SSOT: prevent size drift
|
|||
|
|
- `HAKMEM_BENCH_MAX_SIZE=1040` - SSOT: prevent class filtering
|
|||
|
|
- `HAKMEM_BENCH_C5_ONLY=0` - SSOT: no single-class mode
|
|||
|
|
- `HAKMEM_BENCH_C6_ONLY=0` - SSOT: no single-class mode
|
|||
|
|
- `HAKMEM_BENCH_C7_ONLY=0` - SSOT: no single-class mode
|
|||
|
|
- `HAKMEM_WARM_POOL_SIZE=16` - Phase 69 winner
|
|||
|
|
- `HAKMEM_TINY_C4_INLINE_SLOTS=1` - Phase 76-1 promoted
|
|||
|
|
- `HAKMEM_TINY_C5_INLINE_SLOTS=1` - Phase 75-2 promoted
|
|||
|
|
- `HAKMEM_TINY_C6_INLINE_SLOTS=1` - Phase 75-1 promoted
|
|||
|
|
- `HAKMEM_TINY_INLINE_SLOTS_FIXED=1` - Phase 78-1 promoted
|
|||
|
|
- `HAKMEM_TINY_INLINE_SLOTS_SWITCHDISPATCH=1` - Phase 80-1 promoted
|
|||
|
|
- `HAKMEM_TINY_INLINE_SLOTS_SWITCHDISPATCH_FIXED=0` - Phase 83-1 NO-GO
|
|||
|
|
- `HAKMEM_FASTLANE_DIRECT=1` - Phase 19-1b promoted
|
|||
|
|
- `HAKMEM_FREE_TINY_FAST_MONO_DUALHOT=1` - Phase 9/10 promoted
|
|||
|
|
- `HAKMEM_FREE_TINY_FAST_MONO_LEGACY_DIRECT=1` - Phase 10 promoted
|
|||
|
|
- `HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE` - default route
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## System Configuration
|
|||
|
|
|
|||
|
|
- **CPU**: AMD Ryzen 7 5825U with Radeon Graphics
|
|||
|
|
- **Cores**: 16
|
|||
|
|
- **Memory**: MemTotal: 13166508 kB
|
|||
|
|
- **Kernel**: 6.8.0-87-generic
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Next Steps (Phase 89 Step 5)
|
|||
|
|
|
|||
|
|
**Objective**: Identify top 3 bottleneck candidates using perf measurement
|
|||
|
|
- Run `perf top` during Mixed SSOT execution
|
|||
|
|
- Analyze top 50 functions by CPU time
|
|||
|
|
- Filter to high-frequency code paths (avoid 0.001% optimizations)
|
|||
|
|
- Prepare recommendations for Phase 90+
|