Phase 75-4 validates C5+C6 inline slots on FAST PGO baseline:
- Point A (baseline, C5=0, C6=0): 53.81 M ops/s
- Point D (C5=1, C6=1): 55.51 M ops/s (+3.16%)
CRITICAL FINDING: 14% regression vs Phase 69 baseline (53.81 vs 62.63 M ops/s)
Suspected root cause: stale PGO profile (likely trained before Phase 69, so it does not reflect the Phase 75 changes)
Recommended next step: Phase 75-5 (PGO Profile Regeneration) to recover the lost performance
Scorecard updated with Phase 75-4 results and high-priority action items.
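
The percentages quoted for Phase 75-4 can be re-derived from the raw throughput figures. The following is a minimal sketch of that arithmetic, not part of the benchmark tooling; the helper name `pct_change` is hypothetical.

```python
def pct_change(new: float, old: float) -> float:
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0

# Throughput figures taken from the Phase 75-4 notes (M ops/s).
point_a = 53.81   # baseline, C5=0, C6=0
point_d = 55.51   # C5=1, C6=1
phase69 = 62.63   # Phase 69 baseline

gain = pct_change(point_d, point_a)        # effect of enabling C5+C6
regression = pct_change(point_a, phase69)  # Point A versus Phase 69

print(f"C5+C6 gain:  {gain:+.2f}%")
print(f"vs Phase 69: {regression:+.2f}%")
```

Both deltas are expressed relative to the older measurement, which is why the regression is computed against 62.63 rather than 53.81.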
- Promoted Warm Pool Size=16 as the new baseline (+3.26% gain).
- Updated PERFORMANCE_TARGETS_SCORECARD.md with Phase 69 results.
- Updated scripts/run_mixed_10_cleanenv.sh and core/bench_profile.h to use HAKMEM_WARM_POOL_SIZE=16 by default.
- Clarified that TINY_REFILL_BATCH_SIZE is not currently connected.
Phase 59b: Speed-first Mode Baseline Rebase
- Rebase on MIXED_TINYV3_C7_SAFE profile (Speed-first, no prewarm suppression)
- hakmem: 58.478 M ops/s (CV 2.52%)
- mimalloc: 120.979 M ops/s (CV 0.90%)
- Ratio: 48.34% of mimalloc (down from 49.13% Balanced mode in Phase 59)
- Reason for difference: profile selection (Speed-first vs Balanced) plus environment variance in the mimalloc measurement
- Status: COMPLETE (measurement-only, zero code changes)
Phase 61: C7 ULTRA Header-Light Optimization Attempt
- Objective: Skip header write on C7 ULTRA alloc hit (write only on refill)
- Implementation: ENV gate HAKMEM_TINY_C7_ULTRA_HEADER_LIGHT (default OFF)
- Result: +0.31% (NEUTRAL, below +1.0% GO threshold)
- Baseline: 59.543 M ops/s (CV 1.53%)
- Treatment: 59.729 M ops/s (CV 2.66%)
- Root cause analysis:
  - tiny_region_id_write_header accounts for only 2.32% of time (lower than the Phase 42 estimate of 4.56%)
  - Header-light mode adds a branch to the hot path, negating the write savings
  - Mixed workload dilutes the effect of a C7-specific optimization
  - Variance increased (CV 1.53% -> 2.66%), attributed to branch prediction variability
- Decision: Kept as research box with ENV gate (default OFF)
- Lesson: Class-specific optimizations (here C7) must be verified against the full mixed workload before promotion
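
The Phase 61 verdict follows from comparing the measured delta with the +1.0% GO threshold. A minimal sketch of that decision rule is shown below; the function name `classify` is hypothetical, and the symmetric NO-GO bound for regressions is an assumption that the notes above do not state.

```python
GO_THRESHOLD_PCT = 1.0  # +1.0% GO threshold, as quoted in the Phase 61 notes

def classify(baseline: float, treatment: float,
             threshold_pct: float = GO_THRESHOLD_PCT) -> tuple[float, str]:
    """Return (delta in percent, verdict) for an A/B throughput pair."""
    delta = (treatment - baseline) / baseline * 100.0
    if delta >= threshold_pct:
        verdict = "GO"
    elif delta <= -threshold_pct:
        verdict = "NO-GO"   # assumption: symmetric bound, not from the source
    else:
        verdict = "NEUTRAL"
    return delta, verdict

# Phase 61 figures (M ops/s): header-light OFF vs ON.
delta, verdict = classify(baseline=59.543, treatment=59.729)
print(f"{delta:+.2f}% -> {verdict}")
```

The rule looks only at the mean delta; it does not account for the CV of either run, so a result close to the threshold would still need repeated measurements.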
Updated Documentation:
- PHASE59B_SPEED_FIRST_REBASE_RESULTS.md: Full measurement results and analysis
- PHASE61_C7_ULTRA_HEADER_LIGHT_RESULTS.md: A/B test results and root cause analysis
- PHASE61_C7_ULTRA_HEADER_LIGHT_IMPLEMENTATION.md: Implementation details and design
- CURRENT_TASK.md: Updated status and next phase planning (Phase 62)
- PERFORMANCE_TARGETS_SCORECARD.md: Updated baseline and M1 milestone status
M1 (50%) Milestone Status:
- Current: 48.34% (Speed-first profile)
- Gap: -1.66 percentage points (within measurement noise)
- Profile recommendation: Speed-first as canonical default for throughput focus
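
The M1 numbers can be reproduced from the Phase 59b throughput figures. The sketch below is illustrative only; it derives the ratio, the gap in percentage points, and the hakmem throughput that would meet the 50% target if the mimalloc figure stayed unchanged (that fixed-mimalloc condition is an assumption).

```python
# Phase 59b measurements (M ops/s).
hakmem = 58.478
mimalloc = 120.979
m1_target_pct = 50.0  # M1 milestone: 50% of mimalloc

ratio_pct = hakmem / mimalloc * 100.0
gap_pp = ratio_pct - m1_target_pct  # gap in percentage points

# Throughput hakmem would need to reach M1, assuming mimalloc is unchanged.
required = mimalloc * m1_target_pct / 100.0
required_gain_pct = (required - hakmem) / hakmem * 100.0

print(f"ratio:    {ratio_pct:.2f}% of mimalloc")
print(f"gap:      {gap_pp:+.2f} pp")
print(f"required: {required:.3f} M ops/s ({required_gain_pct:+.2f}%)")
```

The gap is a difference of two percentages, so it is reported in percentage points; the relative throughput gain needed to close it is a separate, larger number.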
🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>