This commit introduces a comprehensive tracing mechanism for allocation failures within the Adaptive Cache Engine (ACE) component, enabling precise identification of the root cause of Out-Of-Memory (OOM) issues related to ACE allocations. Key changes:
- **ACE Tracing Implementation**:
  - Added an environment variable to enable/disable detailed logging of allocation failures.
  - Instrumented the allocation paths to distinguish between "Threshold" (size class mismatch), "Exhaustion" (pool depletion), and "MapFail" (OS memory allocation failure).
- **Build System Fixes**:
  - Corrected the build configuration so the required object is properly linked into the target binary, resolving the resulting error.
- **LD_PRELOAD Wrapper Adjustments**:
  - Investigated and documented the wrapper's behavior under LD_PRELOAD, particularly its interaction with the relevant checks.
  - Enabled debugging flags for the test environment to prevent unintended fallbacks to the system allocator for non-tiny allocations, allowing comprehensive testing of the allocator.
- **Debugging & Verification**:
  - Introduced temporary verbose logging to pinpoint execution-flow issues within interception and routing; these temporary logs have since been removed.
  - Added a test helper to facilitate testing of the tracing features.

This feature will significantly aid in diagnosing and resolving allocation-related OOM issues by providing clear insight into the failure pathways.
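The exact identifiers were not preserved in the summary above, so the following is only a minimal sketch of the described mechanism: an environment-variable gate plus a single helper that classifies a failed allocation as Threshold, Exhaustion, or MapFail. All names here (ACE_TRACE_ALLOC_FAIL, ace_trace_alloc_fail, ace_fail_reason) are hypothetical stand-ins, not the actual symbols from the commit.

```c
/* Hedged sketch only; every identifier below is hypothetical. */
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum {
    ACE_FAIL_THRESHOLD,   /* size class mismatch          */
    ACE_FAIL_EXHAUSTION,  /* pool depletion               */
    ACE_FAIL_MAPFAIL      /* OS memory allocation failure */
} ace_fail_reason;

/* Check the (hypothetical) env var once and cache the result. */
static int ace_trace_enabled(void) {
    static int cached = -1;
    if (cached < 0) {
        const char *v = getenv("ACE_TRACE_ALLOC_FAIL");
        cached = (v != NULL && strcmp(v, "0") != 0);
    }
    return cached;
}

/* One helper called from each of the three failure points. */
static void ace_trace_alloc_fail(ace_fail_reason why, size_t size) {
    static const char *names[] = { "Threshold", "Exhaustion", "MapFail" };
    if (ace_trace_enabled())
        fprintf(stderr, "[ACE] alloc fail: reason=%s size=%zu\n",
                names[why], size);
}
```

A pool-depletion path would then call something like `ace_trace_alloc_fail(ACE_FAIL_EXHAUSTION, size)` immediately before falling back or returning NULL.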
Phase 8 Benchmark - Quick Reference Card
TL;DR - The Numbers
Working Set 256 (Hot Cache):
HAKMEM: 79.2 M ops/s
System: 86.7 M ops/s (1.09x faster)
mimalloc: 114.9 M ops/s (1.45x faster)
Working Set 8192 (Realistic):
HAKMEM: 16.5 M ops/s ⚠️ CRITICAL
System: 57.1 M ops/s (3.46x faster) ⚠️ CRITICAL
mimalloc: 96.5 M ops/s (5.85x faster) ⚠️ CRITICAL
Scalability (WS256 → WS8192):
HAKMEM: 4.80x degradation 🔴 BROKEN
System: 1.52x degradation ✅ Good
mimalloc: 1.19x degradation ✅ Excellent
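The factors above are plain throughput ratios of the TL;DR numbers: HAKMEM's degradation is 79.2 / 16.5 ≈ 4.80x, and mimalloc's advantage at WS8192 is 96.5 / 16.5 ≈ 5.85x.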
Critical Issues Found
1. SuperSlab Scaling Failure (SEVERITY: CRITICAL)
- Impact: 3.46x slower than System malloc at WS8192 (16.5 vs 57.1 M ops/s)
- Evidence: "shared_fail→legacy" logs show slab exhaustion
- Root cause: SuperSlab architecture doesn't scale beyond hot cache
2. Fast Path Overhead (SEVERITY: MEDIUM)
- Impact: 9.4% slower than System malloc at WS256
- Evidence: Even with everything in cache, HAKMEM lags
- Root cause: TLS drain overhead, SuperSlab lookup costs (illustrative sketch after this list)
3. Fragmentation Issues (SEVERITY: HIGH)
- Impact: 4.8x performance degradation vs 1.5x for System
- Evidence: Linear performance collapse with working set size
- Root cause: SuperSlab list becomes inefficient
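To make issue 2 concrete, the sketch below is a generic single-size-class TLS free list with a periodic drain, not HAKMEM's actual code: the drain counter is touched on every free and every cache miss pays a backend call, which is where a fast path of this shape typically loses ground to System malloc. All names (tls_cache, BLOCK_SIZE, DRAIN_INTERVAL) are illustrative assumptions.

```c
/* Generic illustration of TLS fast-path overhead; not HAKMEM code. */
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE     64    /* one small size class, for simplicity      */
#define DRAIN_INTERVAL 256   /* frees between drains back to the backend  */

typedef struct free_node { struct free_node *next; } free_node;

static __thread free_node *tls_cache = NULL;  /* per-thread free list   */
static __thread unsigned   tls_frees = 0;     /* frees since last drain */

void *fast_alloc(size_t size) {
    if (size <= BLOCK_SIZE) {
        if (tls_cache) {                  /* hit: pop the head, no locks */
            free_node *n = tls_cache;
            tls_cache = n->next;
            return n;
        }
        return malloc(BLOCK_SIZE);        /* miss: backend lookup cost   */
    }
    return malloc(size);                  /* large requests bypass cache */
}

/* Caller passes the size to keep the sketch simple. */
void fast_free(void *p, size_t size) {
    if (size > BLOCK_SIZE) { free(p); return; }
    free_node *n = (free_node *)p;
    n->next = tls_cache;
    tls_cache = n;
    if (++tls_frees >= DRAIN_INTERVAL) {  /* drain check taxes every free */
        tls_frees = 0;
        while (tls_cache) {               /* hand cached blocks back      */
            free_node *next = tls_cache->next;
            free(tls_cache);
            tls_cache = next;
        }
    }
}
```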
Phase 9 Priorities
Week 1: Investigation
- Profile SuperSlab lookup latency
- Measure cache/TLB miss rates
- Analyze "shared_fail→legacy" root cause
- Measure fragmentation at different working set sizes
Week 2: Targeted Fixes
- Implement hash table for SuperSlab lookup (see the sketch after this list)
- Experiment with 1MB/2MB SuperSlab sizes
- Fix shared slab capacity issues
- Optimize fast path (inline more, reduce branches)
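For the SuperSlab lookup item above, a minimal sketch of one plausible fix, assuming SuperSlabs are aligned to a fixed 2 MB size so the owning slab's base can be derived by masking: an open-addressing hash table keyed by that base replaces a linear list walk. The names (slab_table, SUPERSLAB_SIZE, slab_table_lookup) are hypothetical, and a real version would also need synchronization for concurrent inserts and removals.

```c
/* Hypothetical pointer-to-SuperSlab lookup via a small hash table. */
#include <stddef.h>
#include <stdint.h>

#define SUPERSLAB_SIZE (2u * 1024 * 1024)   /* assumed 2 MB, aligned slabs   */
#define TABLE_SLOTS    4096                 /* power of two, open addressing */

typedef struct SuperSlab SuperSlab;         /* opaque in this sketch */

typedef struct {
    uintptr_t  base;                        /* 0 means empty slot */
    SuperSlab *slab;
} slab_entry;

static slab_entry slab_table[TABLE_SLOTS];

static size_t slab_hash(uintptr_t base) {
    /* Fibonacci hashing of the slab index (base >> 21 for 2 MB slabs). */
    return (size_t)(((base >> 21) * 11400714819323198485ull) & (TABLE_SLOTS - 1));
}

void slab_table_insert(SuperSlab *slab, uintptr_t base) {
    size_t i = slab_hash(base);
    while (slab_table[i].base != 0)         /* linear probe; sketch assumes */
        i = (i + 1) & (TABLE_SLOTS - 1);    /* the table is never full      */
    slab_table[i].base = base;
    slab_table[i].slab = slab;
}

SuperSlab *slab_table_lookup(const void *ptr) {
    uintptr_t base = (uintptr_t)ptr & ~((uintptr_t)SUPERSLAB_SIZE - 1);
    size_t i = slab_hash(base);
    while (slab_table[i].base != 0) {       /* probe until hit or empty */
        if (slab_table[i].base == base)
            return slab_table[i].slab;
        i = (i + 1) & (TABLE_SLOTS - 1);
    }
    return NULL;                            /* pointer not owned by any slab */
}
```

With this shape, a lookup costs one mask, one hash, and usually one probe, independent of how many SuperSlabs exist.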
Success Criteria
Minimum (Required)
- WS256: 79.2 → 85 M ops/s (+7%)
- WS8192: 16.5 → 35 M ops/s (+112%)
- Degradation: 4.80x → 2.50x or better
Stretch Goal
- WS256: 90+ M ops/s (match System malloc)
- WS8192: 45+ M ops/s (80% of System malloc)
- Degradation: 2.00x or better
If Phase 9 Fails (<30 M ops/s at WS8192)
Switch to Hybrid Architecture:
- Keep: TLS fast path layer
- Replace: SuperSlab backend → jemalloc-style arenas
- Timeline: +3 weeks
- Success probability: 75%
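As a structural sketch of that hybrid, with hypothetical names throughout (nothing below is HAKMEM's or jemalloc's actual API): the TLS fast path stays in front unchanged, while misses refill from a per-thread-assigned arena shard; malloc() and the tls_cache_pop() stub merely stand in for the real internals.

```c
/* Structural sketch of the hybrid layering; all names are hypothetical. */
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_ARENAS 8                          /* assumed shard count */

typedef struct {
    pthread_mutex_t lock;                     /* contention bounded per shard */
} arena_t;

static arena_t arenas[NUM_ARENAS] = {
    { PTHREAD_MUTEX_INITIALIZER }, { PTHREAD_MUTEX_INITIALIZER },
    { PTHREAD_MUTEX_INITIALIZER }, { PTHREAD_MUTEX_INITIALIZER },
    { PTHREAD_MUTEX_INITIALIZER }, { PTHREAD_MUTEX_INITIALIZER },
    { PTHREAD_MUTEX_INITIALIZER }, { PTHREAD_MUTEX_INITIALIZER },
};
static _Atomic unsigned next_arena = 0;
static __thread arena_t *tls_arena = NULL;

static arena_t *my_arena(void) {              /* pin each thread to one shard */
    if (!tls_arena)
        tls_arena = &arenas[atomic_fetch_add(&next_arena, 1) % NUM_ARENAS];
    return tls_arena;
}

static void *tls_cache_pop(size_t size) {     /* stub for the existing TLS cache */
    (void)size;
    return NULL;                              /* pretend every lookup misses */
}

static void *arena_alloc(arena_t *a, size_t size) {
    pthread_mutex_lock(&a->lock);             /* real code: run/extent management */
    void *p = malloc(size);                   /* malloc() is only a placeholder   */
    pthread_mutex_unlock(&a->lock);
    return p;
}

void *hybrid_alloc(size_t size) {
    void *p = tls_cache_pop(size);            /* unchanged Phase 8 fast path      */
    if (p) return p;
    return arena_alloc(my_arena(), size);     /* new backend replaces SuperSlab   */
}
```

The design point is that lock traffic is bounded per shard while the hot path keeps its lock-free per-thread cache.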
Benchmark Reproducibility
All benchmarks available at:
- /mnt/workdisk/public_share/hakmem/phase8_comprehensive_benchmark_results.txt (raw data)
- ./bench_random_mixed_hakmem 10000000 8192 (reproduce HAKMEM)
- ./bench_random_mixed_system 10000000 8192 (reproduce System)
- ./bench_random_mixed_mi 10000000 8192 (reproduce mimalloc)
5 runs per benchmark, StdDev < 2.5% (statistically robust).
Reports Generated
- PHASE8_COMPREHENSIVE_BENCHMARK_REPORT.md - Full statistical analysis
- PHASE8_TECHNICAL_ANALYSIS.md - Deep dive into root causes
- PHASE8_VISUAL_SUMMARY.md - Visual charts and decision matrix
- PHASE8_QUICK_REFERENCE.md - This file (quick lookup)
Next Steps
- Read PHASE8_VISUAL_SUMMARY.md for decision matrix
- Read PHASE8_TECHNICAL_ANALYSIS.md for root cause details
- Begin Phase 9 investigation (Week 1)
- Re-evaluate after 2 weeks
Date: 2025-11-30
Status: Phase 8 COMPLETE, Phase 9 READY
Critical Path: Fix SuperSlab scaling or switch to Hybrid architecture