hakmem/docs/analysis/PHASE59B_SPEED_FIRST_REBASE_RESULTS.md
Moe Charm (CI) ef8e2ab9b5 Phase 59b & 61: Speed-first Rebase + C7 ULTRA Header-Light Optimization
Phase 59b: Speed-first Mode Baseline Rebase
- Rebase on MIXED_TINYV3_C7_SAFE profile (Speed-first, no prewarm suppression)
- hakmem: 58.478 M ops/s (CV 2.52%)
- mimalloc: 120.979 M ops/s (CV 0.90%)
- Ratio: 48.34% of mimalloc (down from 49.13% Balanced mode in Phase 59)
- Reason for difference: Profile selection (Speed-first vs Balanced) and mimalloc environment variance
- Status: COMPLETE (measurement-only, zero code changes)

Phase 61: C7 ULTRA Header-Light Optimization Attempt
- Objective: Skip header write on C7 ULTRA alloc hit (write only on refill)
- Implementation: ENV gate HAKMEM_TINY_C7_ULTRA_HEADER_LIGHT (default OFF)
- Result: +0.31% (NEUTRAL, below +1.0% GO threshold)
  - Baseline: 59.543 M ops/s (CV 1.53%)
  - Treatment: 59.729 M ops/s (CV 2.66%)
- Root cause analysis:
  - tiny_region_id_write_header only 2.32% of time (lower than Phase 42 estimate 4.56%)
  - Header-light mode adds branch to hot path, negating write savings
  - Mixed workload dilutes C7-specific optimization effectiveness
  - Variance increased due to branch prediction variability
- Decision: Kept as research box with ENV gate (default OFF)
- Lesson: Workload-specific optimizations need careful verification with full workloads

Updated Documentation:
- PHASE59B_SPEED_FIRST_REBASE_RESULTS.md: Full measurement results and analysis
- PHASE61_C7_ULTRA_HEADER_LIGHT_RESULTS.md: A/B test results and root cause analysis
- PHASE61_C7_ULTRA_HEADER_LIGHT_IMPLEMENTATION.md: Implementation details and design
- CURRENT_TASK.md: Updated status and next phase planning (Phase 62)
- PERFORMANCE_TARGETS_SCORECARD.md: Updated baseline and M1 milestone status

M1 (50%) Milestone Status:
- Current: 48.34% (Speed-first profile)
- Gap: -1.66% (within measurement noise)
- Profile recommendation: Speed-first as canonical default for throughput focus

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-17 16:25:26 +09:00

# Phase 59b: Speed-first Rebase Results
**Date**: 2025-12-17

**Objective**: Re-measure the baseline in Speed-first mode (`HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE`) and update the HAKMEM/mimalloc baseline ratio.

---
## Background
Phase 59 used Balanced mode, but the 60-minute soak test in Phase 57 showed Speed-first mode ahead on every metric measured:

- Throughput: Speed-first is higher than Balanced (not 3.0% lower, as previously recorded)
- CV: Speed-first 1.58% < Balanced 5.38%
- Tail p99: Speed-first 19.14 ns/op < Balanced 20.78 ns/op

This phase re-measures the baseline with Speed-first as the canonical configuration.

---
## Build Configuration
```bash
make clean
make bench_random_mixed_hakmem_minimal
make bench_random_mixed_mi
```
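
A 10-run measurement like the one reported in the Results section could be collected with a small harness. The following is only a sketch under stated assumptions: the function names `run_once` and `collect` are hypothetical, the benchmark is assumed to print its throughput as `<number> ops/s` on stdout, and the binary path and arguments are not given in this document, so a stand-in command is used to keep the example runnable.

```python
import os
import re
import subprocess
import sys

# Assumption: the benchmark prints its throughput as "<number> ops/s" on stdout.
OPS_PATTERN = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*ops/s")


def run_once(cmd, extra_env=None):
    """Run the benchmark once and return the last ops/s figure it printed."""
    env = dict(os.environ)
    env.update(extra_env or {})
    result = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    matches = OPS_PATTERN.findall(result.stdout)
    if not matches:
        raise ValueError("no 'ops/s' figure found in benchmark output")
    return float(matches[-1].replace(",", ""))


def collect(cmd, runs=10, extra_env=None):
    """Repeat the benchmark `runs` times and return the throughput samples."""
    return [run_once(cmd, extra_env) for _ in range(runs)]


# Stand-in command so the sketch runs anywhere. Replace it with the real
# benchmark binary built above plus whatever arguments it takes.
stand_in = [sys.executable, "-c", "print('Throughput = 59703498 ops/s')"]
samples = collect(stand_in, runs=3, extra_env={"HAKMEM_PROFILE": "MIXED_TINYV3_C7_SAFE"})
print(samples)
```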
**Profile**: `MIXED_TINYV3_C7_SAFE` (Speed-first)

---
## Results
### HAKMEM (Speed-first, 10 runs)
```
Run 1: 59703498 ops/s
Run 2: 58304610 ops/s
Run 3: 57661940 ops/s
Run 4: 58971883 ops/s
Run 5: 54922424 ops/s
Run 6: 58840032 ops/s
Run 7: 59513137 ops/s
Run 8: 57656603 ops/s
Run 9: 59560261 ops/s
Run 10: 59641284 ops/s
```
**Statistics**:
- Mean: 58,477,567 ops/s
- Median: 58,905,958 ops/s
- Min: 54,922,424 ops/s
- Max: 59,703,498 ops/s
- CV: 2.52%
### mimalloc (10 runs)
```
Run 1: 121727781 ops/s
Run 2: 122378721 ops/s
Run 3: 120826927 ops/s
Run 4: 119288198 ops/s
Run 5: 121275784 ops/s
Run 6: 119825073 ops/s
Run 7: 120096029 ops/s
Run 8: 121769295 ops/s
Run 9: 120555258 ops/s
Run 10: 122051669 ops/s
```
**Statistics**:
- Mean: 120,979,474 ops/s
- Median: 121,051,356 ops/s
- Min: 119,288,198 ops/s
- Max: 122,378,721 ops/s
- CV: 0.85%
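
The summary statistics can be recomputed directly from the raw runs listed above. The following is a minimal sketch using Python's standard `statistics` module. It assumes that CV means the sample standard deviation (n-1 denominator) divided by the mean, which this report does not state explicitly; the helper name `summarize` is illustrative.

```python
import statistics

# Raw throughput samples (ops/s), copied from the run listings above.
hakmem_runs = [
    59703498, 58304610, 57661940, 58971883, 54922424,
    58840032, 59513137, 57656603, 59560261, 59641284,
]
mimalloc_runs = [
    121727781, 122378721, 120826927, 119288198, 121275784,
    119825073, 120096029, 121769295, 120555258, 122051669,
]


def summarize(runs):
    """Return mean, median, min, max and CV (percent) for a list of samples."""
    mean = statistics.mean(runs)
    return {
        "mean": mean,
        "median": statistics.median(runs),
        "min": min(runs),
        "max": max(runs),
        # Assumption: CV = sample standard deviation / mean.
        "cv_pct": 100.0 * statistics.stdev(runs) / mean,
    }


for name, runs in (("HAKMEM", hakmem_runs), ("mimalloc", mimalloc_runs)):
    s = summarize(runs)
    print(
        f"{name}: mean={s['mean']:,.0f} median={s['median']:,.1f} "
        f"min={s['min']:,} max={s['max']:,} CV={s['cv_pct']:.2f}%"
    )
```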
---
## Ratio Calculation
**HAKMEM / mimalloc**: 58,477,567 / 120,979,474 = **48.34%**
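
The ratio, and the deltas against Phase 59 reported in the comparison table of this document, follow from simple arithmetic. The sketch below reproduces them; the function names are illustrative, and the Phase 59 figures are taken as reported here rather than re-derived. Note that throughput deltas are relative percentages, while the ratio delta is a difference in percentage points (pp).

```python
# Means from this phase (ops/s).
hakmem_mean = 58_477_567
mimalloc_mean = 120_979_474

# Phase 59 (Balanced) figures as reported in the comparison table.
phase59_hakmem = 58_476_000
phase59_mimalloc = 119_086_000
phase59_ratio_pct = 49.13


def ratio_pct(numerator, denominator):
    """Throughput ratio expressed as a percentage."""
    return 100.0 * numerator / denominator


def rel_delta_pct(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


ratio = ratio_pct(hakmem_mean, mimalloc_mean)
print(f"HAKMEM / mimalloc: {ratio:.2f}%")
print(f"HAKMEM delta:   {rel_delta_pct(phase59_hakmem, hakmem_mean):+.2f}%")
print(f"mimalloc delta: {rel_delta_pct(phase59_mimalloc, mimalloc_mean):+.2f}%")
print(f"Ratio delta:    {ratio - phase59_ratio_pct:+.2f}pp")
```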
### Comparison with Phase 59
| Metric | Phase 59 (Balanced) | Phase 59b (Speed-first) | Delta |
|--------|---------------------|-------------------------|-------|
| HAKMEM Mean | 58,476,000 ops/s | 58,477,567 ops/s | +0.00% |
| mimalloc Mean | 119,086,000 ops/s | 120,979,474 ops/s | +1.59% |
| Ratio | 49.13% | 48.34% | -0.79pp |
**Note**: The Speed-first ratio is 0.79pp lower than the Phase 59 ratio because the mimalloc mean was 1.59% higher in this session, not because HAKMEM regressed. HAKMEM throughput is effectively unchanged (+0.00%).

---
## Conclusion
**Status**: COMPLETED
**Findings**:
1. Speed-first mode is the correct baseline (lower CV, better tail latency)
2. New baseline ratio: **48.34%** (down 0.79pp from Phase 59 due to mimalloc variation)
3. HAKMEM throughput remains stable at ~58.5M ops/s
**Recommendation**:
- Adopt Speed-first (MIXED_TINYV3_C7_SAFE) as canonical default
- Update PERFORMANCE_TARGETS_SCORECARD.md with new baseline
- Use 48.34% as reference for future comparisons
---
## Next Steps
- Phase 61: C7 ULTRA header-light optimization
- Target: +1.0% improvement from header write elimination
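
To make the +1.0% target concrete, the sketch below computes the throughput a treatment would need to reach relative to this phase's HAKMEM mean, and checks a candidate result against the target. The names `gain_pct` and `meets_target` are hypothetical, and using the Phase 59b mean as the A/B baseline is an assumption: an actual A/B test would normally measure its own baseline in the same session.

```python
# Phase 59b HAKMEM mean (ops/s), used here as an illustrative baseline.
BASELINE_OPS = 58_477_567
# Target improvement from the Next Steps list, in percent.
TARGET_GAIN_PCT = 1.0


def gain_pct(baseline, treatment):
    """Relative throughput change of treatment over baseline, in percent."""
    return 100.0 * (treatment - baseline) / baseline


def meets_target(baseline, treatment, target_pct=TARGET_GAIN_PCT):
    """True when the treatment reaches at least the target improvement."""
    return gain_pct(baseline, treatment) >= target_pct


# Throughput needed to reach the target from the illustrative baseline.
required_ops = BASELINE_OPS * (1 + TARGET_GAIN_PCT / 100.0)
print(f"Required throughput for +{TARGET_GAIN_PCT:.1f}%: {required_ops:,.0f} ops/s")
```

Because the HAKMEM CV in this phase (2.52%) is larger than the 1.0% target, a single-run comparison would not be enough to tell a real gain from noise; comparing means over repeated runs, as done above, is the safer reading.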