Phase 89 SSOT Measurement Capture

Timestamp: 2025-12-18 23:06:01
Git SHA: e4c5f0535
Branch: master


Step 1: OBSERVE Binary (Telemetry Verification)

Binary: ./bench_random_mixed_hakmem_observe
Profile: MIXED_TINYV3_C7_SAFE
Iterations: 20,000,000
Working Set: 400

Inline Slots Overflow Stats (Preflight Verification):

  • PUSH TOTAL: 4,812,031 ops (C4+C5+C6 verified active)
  • POP TOTAL: 4,812,031 ops
  • PUSH FULL: 0 (0.00%)
  • POP EMPTY: 168 (0.003%)
  • LEGACY FALLBACK CALLS: 5,327,294
  • Judgment: ✓ [C] LEGACY used AND C4/C5/C6 INLINE SLOTS ACTIVE
  • Throughput (with telemetry): 51.52M ops/s
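
The overflow percentages above are simple ratios over the push and pop totals. The following is a minimal sketch of that arithmetic, not the OBSERVE binary's actual logic: the class and field names are hypothetical, and the pass rule (inline slots saw traffic and the legacy path was also used) is an assumption inferred from the wording of the judgment line.

```python
from dataclasses import dataclass


@dataclass
class InlineSlotStats:
    """Counters copied from the OBSERVE run above (field names are hypothetical)."""
    push_total: int
    pop_total: int
    push_full: int
    pop_empty: int
    legacy_fallback_calls: int


def rate_percent(events: int, total: int) -> float:
    """Return events as a percentage of total, or 0.0 when total is zero."""
    return 100.0 * events / total if total else 0.0


def preflight_ok(s: InlineSlotStats) -> bool:
    """Assumed rule: inline slots saw traffic and the legacy path was also used."""
    return s.push_total > 0 and s.pop_total > 0 and s.legacy_fallback_calls > 0


stats = InlineSlotStats(4_812_031, 4_812_031, 0, 168, 5_327_294)
push_full_pct = rate_percent(stats.push_full, stats.push_total)
pop_empty_pct = rate_percent(stats.pop_empty, stats.pop_total)
print(f"PUSH FULL: {push_full_pct:.3f}%  POP EMPTY: {pop_empty_pct:.4f}%")
print("preflight:", "pass" if preflight_ok(stats) else "fail")
```

Matching push and pop totals with a zero PUSH FULL count indicate the inline slots never overflowed during this run.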

Step 2: Standard Build (Clean Performance Baseline)

Binary: ./bench_random_mixed_hakmem
Build Flags: RELEASE, no telemetry, standard optimization
Profile: MIXED_TINYV3_C7_SAFE
Iterations: 20,000,000
Working Set: 400
Runs: 10

10-Run Results:

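| Run | Throughput (ops/s) | Status |
|-----|--------------------|--------|
| 1 | 51.15M | OK |
| 2 | 51.44M | OK |
| 3 | 51.61M | OK |
| 4 | 51.73M | Peak |
| 5 | 50.74M | Low |
| 6 | 51.34M | OK |
| 7 | 50.74M | Low |
| 8 | 51.37M | OK |
| 9 | 51.39M | OK |
| 10 | 51.31M | OK |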

Statistics:

  • Mean: 51.36M ops/s
  • Min: 50.74M ops/s
  • Max: 51.73M ops/s
  • Range: 0.99M ops/s
  • CV: ~0.7%
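
To make the summary reproducible, the sketch below recomputes the same statistics from the ten Standard throughput values in the table. It assumes the sample standard deviation (n - 1) for the CV, which this report does not specify. The function name `summarize` is illustrative and not part of the repository.

```python
import statistics

# Throughput per run in M ops/s, copied from the 10-run Standard table above.
standard_runs = [51.15, 51.44, 51.61, 51.73, 50.74,
                 51.34, 50.74, 51.37, 51.39, 51.31]


def summarize(runs):
    """Return summary statistics for a list of throughput samples."""
    mean = statistics.mean(runs)
    stdev = statistics.stdev(runs)  # sample standard deviation (n - 1)
    return {
        "mean": mean,
        "median": statistics.median(runs),
        "min": min(runs),
        "max": max(runs),
        "range": max(runs) - min(runs),
        "cv_percent": 100.0 * stdev / mean,
    }


summary = summarize(standard_runs)
for name, value in summary.items():
    print(f"{name:>10}: {value:.2f}")
```

Values recomputed from the two-decimal table need not match the reported statistics. On these inputs the arithmetic mean comes out near 51.28M and the median near 51.36M. This report does not state which samples or estimator the reporting script uses.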

Step 3: FAST PGO Build (Optimized Performance Tracking)

Binary: ./bench_random_mixed_hakmem_minimal_pgo
Build Flags: RELEASE, PGO optimized, BENCH_MINIMAL=1
Profile: MIXED_TINYV3_C7_SAFE
Iterations: 20,000,000
Working Set: 400
Runs: 10

10-Run Results:

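| Run | Throughput (ops/s) | Status |
|-----|--------------------|--------|
| 1 | 55.13M | Peak |
| 2 | 54.73M | High |
| 3 | 53.81M | OK |
| 4 | 54.60M | High |
| 5 | 55.02M | Peak |
| 6 | 52.89M | Low |
| 7 | 53.61M | OK |
| 8 | 53.53M | OK |
| 9 | 55.08M | Peak |
| 10 | 53.51M | OK |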

Statistics:

  • Mean: 54.16M ops/s
  • Min: 52.89M ops/s
  • Max: 55.13M ops/s
  • Range: 2.24M ops/s
  • CV: ~1.5%

Performance Delta Analysis

Standard vs FAST PGO:

  • Delta: 54.16M - 51.36M = 2.80M ops/s
  • Percentage Gain: (2.80M / 51.36M) × 100 = 5.45%
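
The delta arithmetic can be checked with a few lines. This is a small sketch using the two reported figures as inputs; `percent_gain` is an illustrative helper, not a function from the repository.

```python
def percent_gain(baseline: float, candidate: float) -> float:
    """Relative gain of candidate over baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (candidate - baseline) / baseline


standard_mops = 51.36   # reported Standard figure (M ops/s)
fast_pgo_mops = 54.16   # reported FAST PGO figure (M ops/s)

delta = fast_pgo_mops - standard_mops
gain = percent_gain(standard_mops, fast_pgo_mops)
print(f"delta = {delta:.2f}M ops/s, gain = {gain:.2f}%")
# prints "delta = 2.80M ops/s, gain = 5.45%"
```

The gain is expressed relative to the Standard build because that build is the SSOT baseline.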

Interpretation:

  • The FAST PGO build is 5.45% faster than the Standard build (54.16M vs 51.36M ops/s)
  • This is the practical ceiling achievable with the current profile-guided configuration on this workload
  • SSOT baseline for bottleneck analysis: Standard build, 51.36M ops/s

Environment Configuration (SSOT Locked)

Key ENV variables (forced in scripts/run_mixed_10_cleanenv.sh):

  • HAKMEM_BENCH_MIN_SIZE=16 - SSOT: prevent size drift
  • HAKMEM_BENCH_MAX_SIZE=1040 - SSOT: prevent class filtering
  • HAKMEM_BENCH_C5_ONLY=0 - SSOT: no single-class mode
  • HAKMEM_BENCH_C6_ONLY=0 - SSOT: no single-class mode
  • HAKMEM_BENCH_C7_ONLY=0 - SSOT: no single-class mode
  • HAKMEM_WARM_POOL_SIZE=16 - Phase 69 winner
  • HAKMEM_TINY_C4_INLINE_SLOTS=1 - Phase 76-1 promoted
  • HAKMEM_TINY_C5_INLINE_SLOTS=1 - Phase 75-2 promoted
  • HAKMEM_TINY_C6_INLINE_SLOTS=1 - Phase 75-1 promoted
  • HAKMEM_TINY_INLINE_SLOTS_FIXED=1 - Phase 78-1 promoted
  • HAKMEM_TINY_INLINE_SLOTS_SWITCHDISPATCH=1 - Phase 80-1 promoted
  • HAKMEM_TINY_INLINE_SLOTS_SWITCHDISPATCH_FIXED=0 - Phase 83-1 NO-GO
  • HAKMEM_FASTLANE_DIRECT=1 - Phase 19-1b promoted
  • HAKMEM_FREE_TINY_FAST_MONO_DUALHOT=1 - Phase 9/10 promoted
  • HAKMEM_FREE_TINY_FAST_MONO_LEGACY_DIRECT=1 - Phase 10 promoted
  • HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE - default route
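
The idea of a locked environment can be sketched as follows. This is not the contents of scripts/run_mixed_10_cleanenv.sh, which are not shown here. It assumes "clean" means discarding any inherited `HAKMEM_*` variable before applying the values listed above. The helper names `build_clean_env` and `run_benchmark` are hypothetical.

```python
import os
import subprocess

# Values copied from the list above; the real script may set more variables.
SSOT_ENV = {
    "HAKMEM_BENCH_MIN_SIZE": "16",
    "HAKMEM_BENCH_MAX_SIZE": "1040",
    "HAKMEM_BENCH_C5_ONLY": "0",
    "HAKMEM_BENCH_C6_ONLY": "0",
    "HAKMEM_BENCH_C7_ONLY": "0",
    "HAKMEM_WARM_POOL_SIZE": "16",
    "HAKMEM_TINY_C4_INLINE_SLOTS": "1",
    "HAKMEM_TINY_C5_INLINE_SLOTS": "1",
    "HAKMEM_TINY_C6_INLINE_SLOTS": "1",
    "HAKMEM_TINY_INLINE_SLOTS_FIXED": "1",
    "HAKMEM_TINY_INLINE_SLOTS_SWITCHDISPATCH": "1",
    "HAKMEM_TINY_INLINE_SLOTS_SWITCHDISPATCH_FIXED": "0",
    "HAKMEM_FASTLANE_DIRECT": "1",
    "HAKMEM_FREE_TINY_FAST_MONO_DUALHOT": "1",
    "HAKMEM_FREE_TINY_FAST_MONO_LEGACY_DIRECT": "1",
    "HAKMEM_PROFILE": "MIXED_TINYV3_C7_SAFE",
}


def build_clean_env(base=None):
    """Drop every inherited HAKMEM_* variable, then apply the SSOT values."""
    base = dict(os.environ if base is None else base)
    env = {k: v for k, v in base.items() if not k.startswith("HAKMEM_")}
    env.update(SSOT_ENV)
    return env


def run_benchmark(binary, args=()):
    """Run one benchmark process under the locked environment.

    `args` is left to the caller: the benchmark's command-line arguments
    are not documented in this report.
    """
    return subprocess.run([binary, *args], env=build_clean_env(),
                          capture_output=True, text=True, check=True)
```

Forcing every variable, including those set to 0, means a stray export in the calling shell cannot silently switch the run into a single-class mode.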

System Configuration

  • CPU: AMD Ryzen 7 5825U with Radeon Graphics
  • Logical CPUs: 16 (8 cores, 16 threads)
  • Memory: 13166508 kB MemTotal (about 12.6 GiB)
  • Kernel: 6.8.0-87-generic

Next Steps (Phase 89 Step 5)

Objective: Identify top 3 bottleneck candidates using perf measurement

  • Run perf top during Mixed SSOT execution
  • Analyze top 50 functions by CPU time
  • Filter to high-frequency code paths; skip optimizations whose best-case gain is negligible (on the order of 0.001%)
  • Prepare recommendations for Phase 90+
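
The filtering step could be scripted roughly as below. The sketch assumes a non-interactive capture (`perf record` followed by `perf report --stdio`) rather than interactive `perf top`, a report without a call-graph "Children" column, and rows of the form `percent  command  object  [.] symbol`. The 1% cutoff and the name `top_candidates` are illustrative choices, not values from this report.

```python
import re

# Matches rows such as "    12.34%  cmd  object  [.] symbol_name".
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+.*?\[[^\]]\]\s+(\S.*?)\s*$")


def top_candidates(report_text, limit=3, min_percent=1.0, scan=50):
    """Return up to `limit` (symbol, percent) pairs at or above `min_percent`.

    Only the first `scan` symbol rows are considered, mirroring the
    "top 50 functions" step.
    """
    rows = []
    for line in report_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip perf's comment/header lines
        m = ROW.match(line)
        if m:
            rows.append((m.group(2), float(m.group(1))))
        if len(rows) >= scan:
            break
    rows.sort(key=lambda r: r[1], reverse=True)
    return [r for r in rows if r[1] >= min_percent][:limit]
```

Calling `top_candidates(open("report.txt").read())` on a saved report (file name hypothetical) would yield the three hottest symbols above the cutoff, which can then be reviewed by hand before Phase 90 planning.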