================================================================================
Phase 8 Comprehensive Allocator Comparison - Analysis
================================================================================

## Working Set 256 (Hot cache, Phase 7 comparison)

| Allocator      | Avg (M ops/s) | StdDev (%) | Min - Max (M ops/s) | Throughput vs HAKMEM |
|----------------|---------------|------------|---------------------|----------------------|
| HAKMEM Phase 8 | 79.2          | ± 2.4%     | 77.0 - 81.2         | 1.00x                |
| System malloc  | 86.7          | ± 1.0%     | 85.3 - 87.5         | 1.09x                |
| mimalloc       | 114.9         | ± 1.2%     | 112.5 - 116.2       | 1.45x                |

## Working Set 8192 (Realistic workload)

| Allocator      | Avg (M ops/s) | StdDev (%) | Min - Max (M ops/s) | Throughput vs HAKMEM |
|----------------|---------------|------------|---------------------|----------------------|
| HAKMEM Phase 8 | 16.5          | ± 2.5%     | 15.8 - 16.9         | 1.00x                |
| System malloc  | 57.1          | ± 1.3%     | 56.1 - 57.8         | 3.46x                |
| mimalloc       | 96.5          | ± 0.9%     | 95.5 - 97.7         | 5.85x                |

================================================================================
Performance Analysis
================================================================================

### 1. Working Set 256 (Hot Cache) Results

- HAKMEM Phase 8: 79.2 M ops/s
- System malloc: 86.7 M ops/s (1.09x faster)
- mimalloc: 114.9 M ops/s (1.45x faster)

System malloc is **9.4% faster** than HAKMEM and mimalloc is **45.2% faster**.
Measured from the other side, HAKMEM is **8.6% slower** than System malloc and
**31.1% slower** than mimalloc.

### 2. Working Set 8192 (Realistic Workload) Results

- HAKMEM Phase 8: 16.5 M ops/s
- System malloc: 57.1 M ops/s (3.46x faster)
- mimalloc: 96.5 M ops/s (5.85x faster)

System malloc is **246.0% faster** than HAKMEM and mimalloc is **484.9% faster**.
Measured from the other side, HAKMEM is **71.1% slower** than System malloc and
**82.9% slower** than mimalloc.

================================================================================
Critical Observations
================================================================================

### HAKMEM Performance Gap Analysis

Performance degradation from WS256 to WS8192:

- HAKMEM: 4.80x slowdown (79.2 → 16.5 M ops/s)
- System: 1.52x slowdown (86.7 → 57.1 M ops/s)
- mimalloc: 1.19x slowdown (114.9 → 96.5 M ops/s)

HAKMEM degrades **3.16x MORE** than System malloc (4.80 / 1.52)
HAKMEM degrades **4.03x MORE** than mimalloc (4.80 / 1.19)

### Key Issues Identified

1. **Hot Cache Performance (WS256)**:
   - HAKMEM: 79.2 M ops/s
   - Gap: -8.6% vs System, -31.1% vs mimalloc
   - Issue: Fast-path overhead (TLS drain, SuperSlab lookup)

2. **Realistic Workload Performance (WS8192)**:
   - HAKMEM: 16.5 M ops/s
   - Gap: -71.1% vs System, -82.9% vs mimalloc
   - Issue: SEVERE - SuperSlab scaling, fragmentation, TLB pressure

3. **Scalability Problem**:
   - HAKMEM loses 4.8x performance with larger working sets
   - System loses only 1.5x
   - mimalloc loses only 1.2x
   - Suspected root cause: SuperSlab architecture does not scale with working set size

================================================================================
Recommendations for Phase 9+
================================================================================

### CRITICAL PRIORITY: Fix WS8192 Performance Gap

The 71-83% performance gap at realistic working sets is UNACCEPTABLE.

**Immediate Actions Required:**

1. **Investigate SuperSlab Scaling (Phase 9)**
   - Profile: Why does performance collapse with larger working sets?
   - Hypothesis: SuperSlab lookup overhead, fragmentation, or TLB misses
   - Debug logs show 'shared_fail→legacy' messages, which point to shared slab exhaustion

2. **Optimize Fast Path (Phase 10)**
   - Even WS256 shows a 9-31% gap vs competitors
   - Profile TLS drain overhead
   - Consider reducing drain frequency or lazy draining

3. **Consider Alternative Architectures (Phase 11)**
   - Current SuperSlab model may be fundamentally flawed
   - Benchmark shows 4.8x degradation vs 1.5x for System malloc
   - May need hybrid approach: TLS fast path + different backend

4. **Specific Debug Actions**
   - Analyze '[SS_BACKEND] shared_fail→legacy' logs
   - Measure SuperSlab hit rate at different working set sizes
   - Profile cache misses and TLB misses

================================================================================
Raw Data (for reproducibility)
================================================================================

Throughput in ops/s, five runs per configuration.

hakmem_256    : [78480676, 78099247, 77034450, 81120430, 81206714]
system_256    : [87329938, 86497843, 87514376, 85308713, 86630819]
mimalloc_256  : [115842807, 115180313, 116209200, 112542094, 114950573]
hakmem_8192   : [16504443, 15799180, 16916987, 16687009, 16582555]
system_8192   : [56095157, 57843156, 56999206, 57717254, 56720055]
mimalloc_8192 : [96824532, 96117137, 95521242, 97733856, 96327554]

================================================================================
Analysis Complete
================================================================================
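
The summary tables and gap percentages can be rechecked from the raw data with a short script. The following is a minimal sketch, not the tooling that produced this report: the names `RAW`, `summarize`, and `compare` are illustrative, and it assumes the StdDev column is the sample standard deviation (n-1) expressed as a percentage of the mean, which appears to match the published figures. It reports both "X is N times faster" (ratio of means) and "HAKMEM is N% slower" (one minus the inverse ratio), since the two are easy to confuse.

```python
from statistics import mean, stdev

# Raw throughput samples in ops/s, copied from the Raw Data section.
RAW = {
    ("hakmem", 256): [78480676, 78099247, 77034450, 81120430, 81206714],
    ("system", 256): [87329938, 86497843, 87514376, 85308713, 86630819],
    ("mimalloc", 256): [115842807, 115180313, 116209200, 112542094, 114950573],
    ("hakmem", 8192): [16504443, 15799180, 16916987, 16687009, 16582555],
    ("system", 8192): [56095157, 57843156, 56999206, 57717254, 56720055],
    ("mimalloc", 8192): [96824532, 96117137, 95521242, 97733856, 96327554],
}


def summarize(samples):
    """Return mean, relative sample stddev, and range in M ops/s."""
    avg = mean(samples)
    return {
        "avg_mops": avg / 1e6,
        # Assumption: sample standard deviation (n-1), as a percent of the mean.
        "stddev_pct": 100 * stdev(samples) / avg,
        "min_mops": min(samples) / 1e6,
        "max_mops": max(samples) / 1e6,
    }


def compare(raw, working_set, baseline="hakmem"):
    """Compare every other allocator against the baseline at one working set."""
    base = mean(raw[(baseline, working_set)])
    result = {}
    for (name, ws), samples in raw.items():
        if ws != working_set or name == baseline:
            continue
        other = mean(samples)
        result[name] = {
            # How many times faster the other allocator is than the baseline.
            "speedup": other / base,
            # How much slower the baseline is, relative to the other allocator.
            "hakmem_slower_pct": 100 * (1 - base / other),
        }
    return result


if __name__ == "__main__":
    for (name, ws), samples in RAW.items():
        s = summarize(samples)
        print(f"{name}_{ws}: {s['avg_mops']:.1f} M ops/s "
              f"± {s['stddev_pct']:.1f}% "
              f"({s['min_mops']:.1f} - {s['max_mops']:.1f})")
    for ws in (256, 8192):
        for name, c in compare(RAW, ws).items():
            print(f"WS{ws} {name}: {c['speedup']:.2f}x faster, "
                  f"HAKMEM {c['hakmem_slower_pct']:.1f}% slower")
```

Keeping the two ratios as separate fields makes the direction of each comparison explicit: a 3.46x speedup corresponds to HAKMEM being about 71% slower, not 246% slower.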