# Phase 8 Benchmark - Quick Reference Card

## TL;DR - The Numbers

```
Working Set 256 (Hot Cache):
  HAKMEM:    79.2 M ops/s
  System:    86.7 M ops/s  (1.09x faster)
  mimalloc: 114.9 M ops/s  (1.45x faster)

Working Set 8192 (Realistic):
  HAKMEM:    16.5 M ops/s  ⚠️ CRITICAL
  System:    57.1 M ops/s  (3.46x faster) ⚠️ CRITICAL
  mimalloc:  96.5 M ops/s  (5.85x faster) ⚠️ CRITICAL

Scalability (WS256 → WS8192):
  HAKMEM:   4.80x degradation 🔴 BROKEN
  System:   1.52x degradation ✅ Good
  mimalloc: 1.19x degradation ✅ Excellent
```
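
The ratios in the summary can be re-derived from the six throughput figures alone. The sketch below does that arithmetic; the dictionary layout and the function names `speedup_over` and `degradation` are illustrative and not part of the benchmark tooling.

```python
# Throughput in M ops/s, copied from the summary above.
THROUGHPUT = {
    "HAKMEM": {256: 79.2, 8192: 16.5},
    "System": {256: 86.7, 8192: 57.1},
    "mimalloc": {256: 114.9, 8192: 96.5},
}


def speedup_over(baseline, other, working_set):
    """How many times faster `other` is than `baseline` at a working set."""
    return THROUGHPUT[other][working_set] / THROUGHPUT[baseline][working_set]


def degradation(name):
    """Throughput at WS256 divided by throughput at WS8192."""
    return THROUGHPUT[name][256] / THROUGHPUT[name][8192]


if __name__ == "__main__":
    for ws in (256, 8192):
        for other in ("System", "mimalloc"):
            ratio = speedup_over("HAKMEM", other, ws)
            print(f"WS{ws}: {other} is {ratio:.2f}x faster than HAKMEM")
    for name in THROUGHPUT:
        print(f"{name}: {degradation(name):.2f}x degradation")
```

Rounded to two decimals, these reproduce the "x faster" and "x degradation" values listed above.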

## Critical Issues Found

### 1. SuperSlab Scaling Failure (SEVERITY: CRITICAL)
- **Impact**: System malloc is 3.46x faster (246% higher throughput) at WS8192; HAKMEM is 71% slower
- **Evidence**: "shared_fail→legacy" logs show slab exhaustion
- **Root cause**: SuperSlab architecture doesn't scale beyond hot cache

### 2. Fast Path Overhead (SEVERITY: MEDIUM)
- **Impact**: 8.7% slower than System malloc at WS256 (System is 1.09x faster)
- **Evidence**: HAKMEM lags even when the whole working set fits in cache
- **Root cause**: TLS drain overhead, SuperSlab lookup costs

### 3. Fragmentation Issues (SEVERITY: HIGH)
- **Impact**: 4.80x throughput degradation from WS256 to WS8192, vs 1.52x for System malloc
- **Evidence**: Linear performance collapse with working set size
- **Root cause**: SuperSlab list becomes inefficient as it grows
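
A throughput gap can be quoted in two directions, and the two percentages differ. The helper names below are illustrative; the inputs are the System and HAKMEM figures from the summary. The sketch shows that a 246% gap at WS8192 describes how much faster System malloc is, while the same gap seen from HAKMEM's side is about 71%.

```python
def percent_faster(fast, slow):
    """How much higher `fast` is than `slow`, as a percentage of `slow`."""
    return (fast / slow - 1.0) * 100.0


def percent_slower(slow, fast):
    """How much lower `slow` is than `fast`, as a percentage of `fast`."""
    return (1.0 - slow / fast) * 100.0


if __name__ == "__main__":
    # (System, HAKMEM) throughput in M ops/s at each working set.
    for label, system, hakmem in (("WS256", 86.7, 79.2), ("WS8192", 57.1, 16.5)):
        print(
            f"{label}: System is {percent_faster(system, hakmem):.1f}% faster, "
            f"HAKMEM is {percent_slower(hakmem, system):.1f}% slower"
        )
```

A "percent slower" figure in throughput terms can never exceed 100%, which is a quick sanity check when reading numbers like these.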

## Phase 9 Priorities

### Week 1: Investigation
1. Profile SuperSlab lookup latency
2. Measure cache/TLB miss rates
3. Analyze "shared_fail→legacy" root cause
4. Measure fragmentation at different working set sizes

### Week 2: Targeted Fixes
1. Implement hash table for SuperSlab lookup
2. Experiment with 1MB/2MB SuperSlab sizes
3. Fix shared slab capacity issues
4. Optimize fast path (inline more, reduce branches)
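
To make the first Week 2 item concrete, here is a minimal model of a hash-based SuperSlab lookup next to a list scan. It is a sketch in Python, not HAKMEM code: `SuperSlab`, `SlabRegistry`, `lookup_linear`, and the assumption that every SuperSlab is 2 MB and aligned to its own size are all hypothetical choices made for illustration.

```python
SUPERSLAB_SIZE = 2 * 1024 * 1024  # assumed: power of two, slabs aligned to it


class SuperSlab:
    """Hypothetical stand-in for a SuperSlab: only its base address matters."""

    def __init__(self, base):
        if base % SUPERSLAB_SIZE != 0:
            raise ValueError("base must be aligned to SUPERSLAB_SIZE")
        self.base = base


def lookup_linear(slabs, addr):
    """List scan: cost grows with the number of SuperSlabs."""
    for slab in slabs:
        if slab.base <= addr < slab.base + SUPERSLAB_SIZE:
            return slab
    return None


class SlabRegistry:
    """Hash lookup keyed by the aligned base address of each SuperSlab."""

    def __init__(self):
        self._by_base = {}

    def register(self, slab):
        self._by_base[slab.base] = slab

    def lookup(self, addr):
        # Mask off the low bits to get the base of the enclosing slab.
        base = addr & ~(SUPERSLAB_SIZE - 1)
        return self._by_base.get(base)


if __name__ == "__main__":
    slabs = [SuperSlab(i * SUPERSLAB_SIZE) for i in range(1, 5)]
    registry = SlabRegistry()
    for slab in slabs:
        registry.register(slab)
    addr = 3 * SUPERSLAB_SIZE + 4096
    print(registry.lookup(addr) is lookup_linear(slabs, addr))
```

The point of the design is that the hashed lookup does the same amount of work whether 4 or 4,000 SuperSlabs are registered, whereas the scan does not. Whether this matches the actual bottleneck is what the Week 1 profiling is meant to establish.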

## Success Criteria

### Minimum (Required)
- WS256: 79.2 → 85 M ops/s (+7%)
- WS8192: 16.5 → 35 M ops/s (+112%)
- Degradation: 4.80x → 2.50x or better

### Stretch Goal
- WS256: 90+ M ops/s (at or above System malloc's 86.7 M ops/s)
- WS8192: 45+ M ops/s (about 80% of System malloc's 57.1 M ops/s)
- Degradation: 2.00x or better
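
The criteria can be checked mechanically once new numbers are in. The sketch below assumes that all three targets in a tier must hold at the same time and that degradation is computed as WS256 throughput divided by WS8192 throughput; the names `MINIMUM`, `STRETCH`, `meets`, and `classify` are illustrative.

```python
# Targets from the lists above: throughput in M ops/s, degradation as a ratio.
MINIMUM = {"ws256": 85.0, "ws8192": 35.0, "degradation": 2.50}
STRETCH = {"ws256": 90.0, "ws8192": 45.0, "degradation": 2.00}


def meets(targets, ws256, ws8192):
    """True if both throughput floors and the degradation ceiling hold."""
    degradation = ws256 / ws8192
    return (
        ws256 >= targets["ws256"]
        and ws8192 >= targets["ws8192"]
        and degradation <= targets["degradation"]
    )


def classify(ws256, ws8192):
    """Return the highest tier satisfied by a pair of measurements."""
    if meets(STRETCH, ws256, ws8192):
        return "stretch"
    if meets(MINIMUM, ws256, ws8192):
        return "minimum"
    return "not met"


if __name__ == "__main__":
    print(classify(79.2, 16.5))  # the Phase 8 baseline
```

Under these assumptions the degradation target is not redundant: a result of 90 M ops/s at WS256 with exactly 35 M ops/s at WS8192 clears both minimum throughput floors but has a ratio of about 2.57x, so it would not pass.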

## If Phase 9 Fails (<30 M ops/s at WS8192)

Switch to **Hybrid Architecture**:
- Keep: TLS fast path layer
- Replace: SuperSlab backend → jemalloc-style arenas
- Timeline: +3 weeks
- Success probability: 75%

## Benchmark Reproducibility

Raw data and reproduction commands:
- `/mnt/workdisk/public_share/hakmem/phase8_comprehensive_benchmark_results.txt` (raw data)
- `./bench_random_mixed_hakmem 10000000 8192` (reproduce HAKMEM)
- `./bench_random_mixed_system 10000000 8192` (reproduce System)
- `./bench_random_mixed_mi 10000000 8192` (reproduce mimalloc)

Each benchmark was run 5 times with StdDev < 2.5%, so run-to-run variance is low.
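
When repeating the runs, the variance check can be applied to whatever throughput values come out. The sketch below assumes that "StdDev < 2.5%" means the sample standard deviation as a percentage of the mean; that reading, and the names `summarize` and `is_stable`, are assumptions rather than something the raw data file confirms.

```python
import statistics


def summarize(samples):
    """Return (mean, sample stdev, stdev as a percentage of the mean)."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation (n - 1)
    return mean, stdev, stdev / mean * 100.0


def is_stable(samples, max_rel_stdev_pct=2.5):
    """True if the relative standard deviation is under the threshold."""
    _, _, rel_stdev_pct = summarize(samples)
    return rel_stdev_pct < max_rel_stdev_pct
```

Pass in the per-run throughput values (in M ops/s) from five repetitions of one of the commands above; `statistics.stdev` needs at least two samples.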

## Reports Generated

1. **PHASE8_COMPREHENSIVE_BENCHMARK_REPORT.md** - Full statistical analysis
2. **PHASE8_TECHNICAL_ANALYSIS.md** - Deep dive into root causes
3. **PHASE8_VISUAL_SUMMARY.md** - Visual charts and decision matrix
4. **PHASE8_QUICK_REFERENCE.md** - This file (quick lookup)

## Next Steps

1. Read PHASE8_VISUAL_SUMMARY.md for decision matrix
2. Read PHASE8_TECHNICAL_ANALYSIS.md for root cause details
3. Begin Phase 9 investigation (Week 1)
4. Re-evaluate after 2 weeks

---

**Date**: 2025-11-30
**Status**: Phase 8 COMPLETE, Phase 9 READY
**Critical Path**: Fix SuperSlab scaling or switch to Hybrid architecture