hakmem/PHASE8_QUICK_REFERENCE.md

Phase 8 Benchmark - Quick Reference Card

TL;DR - The Numbers

Working Set 256 (Hot Cache):
  HAKMEM:    79.2 M ops/s
  System:    86.7 M ops/s  (1.09x faster)
  mimalloc: 114.9 M ops/s  (1.45x faster)

Working Set 8192 (Realistic):
  HAKMEM:    16.5 M ops/s  ⚠️ CRITICAL
  System:    57.1 M ops/s  (3.46x faster) ⚠️ CRITICAL
  mimalloc:  96.5 M ops/s  (5.85x faster) ⚠️ CRITICAL

Scalability (WS256 → WS8192):
  HAKMEM:   4.80x degradation  🔴 BROKEN
  System:   1.52x degradation  ✅ Good
  mimalloc: 1.19x degradation  ✅ Excellent
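The ratios above follow directly from the throughput figures. As a quick arithmetic check, the sketch below recomputes them; the dictionary layout and function names are illustrative choices, not part of the benchmark tooling.

```python
# Throughput in M ops/s, copied from the table above.
THROUGHPUT = {
    "HAKMEM":   {"ws256": 79.2,  "ws8192": 16.5},
    "System":   {"ws256": 86.7,  "ws8192": 57.1},
    "mimalloc": {"ws256": 114.9, "ws8192": 96.5},
}


def speedup_vs_hakmem(name, ws):
    """Return how many times faster `name` is than HAKMEM at working set `ws`."""
    return THROUGHPUT[name][ws] / THROUGHPUT["HAKMEM"][ws]


def degradation(name):
    """Return the WS256 / WS8192 throughput ratio for one allocator."""
    row = THROUGHPUT[name]
    return row["ws256"] / row["ws8192"]


for name in THROUGHPUT:
    print(
        f"{name:9s} "
        f"WS256 x{speedup_vs_hakmem(name, 'ws256'):.2f}  "
        f"WS8192 x{speedup_vs_hakmem(name, 'ws8192'):.2f}  "
        f"degradation {degradation(name):.2f}x"
    )
```

The "x faster" columns are relative to HAKMEM at the same working set, while the degradation column compares each allocator against itself.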

Critical Issues Found

1. SuperSlab Scaling Failure (SEVERITY: CRITICAL)

  • Impact: System malloc is 246% faster (3.46x) at WS8192; HAKMEM reaches only about 29% of System throughput
  • Evidence: "shared_fail→legacy" logs show slab exhaustion
  • Root cause: SuperSlab architecture doesn't scale beyond hot cache

2. Fast Path Overhead (SEVERITY: MEDIUM)

  • Impact: 8.7% slower than System malloc at WS256 (System is 9.5% faster)
  • Evidence: Even with everything in cache, HAKMEM lags
  • Root cause: TLS drain overhead, SuperSlab lookup costs

3. Fragmentation Issues (SEVERITY: HIGH)

  • Impact: 4.8x performance degradation vs 1.5x for System
  • Evidence: Linear performance collapse with working set size
  • Root cause: SuperSlab list becomes inefficient

Phase 9 Priorities

Week 1: Investigation

  1. Profile SuperSlab lookup latency
  2. Measure cache/TLB miss rates
  3. Analyze "shared_fail→legacy" root cause
  4. Measure fragmentation at different working set sizes

Week 2: Targeted Fixes

  1. Implement hash table for SuperSlab lookup
  2. Experiment with 1MB/2MB SuperSlab sizes
  3. Fix shared slab capacity issues
  4. Optimize fast path (inline more, reduce branches)
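Item 1 of the Week 2 list proposes a hash table for SuperSlab lookup. The following is a minimal model of that idea, written in Python for brevity rather than in the allocator's own language. It assumes each SuperSlab is aligned to its size (2 MB here, one of the sizes Week 2 plans to try), so masking off the low bits of any pointer yields the SuperSlab base, which then serves as the hash key. The class name, method names, and the alignment assumption are hypothetical and are not taken from the HAKMEM source.

```python
SUPERSLAB_SIZE = 2 * 1024 * 1024  # assumed: 2 MB, aligned to its own size


class SuperSlabRegistry:
    """Hypothetical registry mapping SuperSlab base addresses to owners."""

    def __init__(self, size=SUPERSLAB_SIZE):
        if size <= 0 or size & (size - 1):
            raise ValueError("SuperSlab size must be a power of two")
        self._offset_mask = size - 1   # low bits: offset inside a SuperSlab
        self._by_base = {}             # base address -> owner metadata

    def register(self, base, owner):
        """Record a SuperSlab; its base must be size-aligned."""
        if base & self._offset_mask:
            raise ValueError("SuperSlab base is not size-aligned")
        self._by_base[base] = owner

    def unregister(self, base):
        """Forget a SuperSlab; unknown bases are ignored."""
        self._by_base.pop(base, None)

    def lookup(self, ptr):
        """Return the owner of the SuperSlab containing ptr, or None."""
        base = ptr & ~self._offset_mask  # round down to the SuperSlab base
        return self._by_base.get(base)


registry = SuperSlabRegistry()
registry.register(0x7F0000200000, "slab-A")
print(registry.lookup(0x7F0000200000 + 0x1234))  # pointer inside slab-A
print(registry.lookup(0x7F0000400000))           # unregistered region
```

Lookup cost in this model is independent of how many SuperSlabs are registered, which is the property a list walk lacks. Whether that is the dominant cost at WS8192 is what the Week 1 profiling is meant to establish.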

Success Criteria

Minimum (Required)

  • WS256: 79.2 → 85 M ops/s (+7%)
  • WS8192: 16.5 → 35 M ops/s (+112%)
  • Degradation: 4.80x → 2.50x or better

Stretch Goal

  • WS256: 90+ M ops/s (above System malloc's 86.7)
  • WS8192: 45+ M ops/s (about 79% of System malloc's 57.1)
  • Degradation: 2.00x or better
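The two tiers above can be turned into a mechanical check on a Phase 9 result. The sketch below is one possible encoding; the function name is hypothetical, and reading "90+" and "45+" as "greater than or equal to" is an assumption.

```python
def classify_phase9(ws256, ws8192):
    """Classify a Phase 9 result (both values in M ops/s) against the targets.

    Degradation is the WS256 / WS8192 throughput ratio, as in the
    scalability table.
    """
    degradation = ws256 / ws8192
    if ws256 >= 90.0 and ws8192 >= 45.0 and degradation <= 2.00:
        return "stretch"
    if ws256 >= 85.0 and ws8192 >= 35.0 and degradation <= 2.50:
        return "minimum"
    return "below minimum"


# The Phase 8 baseline fails both tiers.
print(classify_phase9(79.2, 16.5))
```

Hitting exactly the minimum throughput targets (85 and 35 M ops/s) gives a degradation of about 2.43x, so the three minimum criteria are mutually consistent.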

If Phase 9 Fails (<30 M ops/s at WS8192)

Switch to Hybrid Architecture:

  • Keep: TLS fast path layer
  • Replace: SuperSlab backend → jemalloc-style arenas
  • Timeline: +3 weeks
  • Success probability: 75%

Benchmark Reproducibility

Raw data and reproduction commands:

  • /mnt/workdisk/public_share/hakmem/phase8_comprehensive_benchmark_results.txt (raw data)
  • ./bench_random_mixed_hakmem 10000000 8192 (reproduce HAKMEM)
  • ./bench_random_mixed_system 10000000 8192 (reproduce System)
  • ./bench_random_mixed_mi 10000000 8192 (reproduce mimalloc)

5 runs per benchmark, StdDev < 2.5% (statistically robust).
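To apply the same stability check to a fresh set of runs, the relative standard deviation can be computed as below. This is a sketch under two assumptions: that "StdDev < 2.5%" means the standard deviation divided by the mean, and that the sample (n - 1) form is used. The sample values in the usage lines are made up for illustration and are not measured data.

```python
import statistics


def relative_stddev_percent(samples):
    """Sample standard deviation as a percentage of the mean."""
    mean = statistics.mean(samples)
    return 100.0 * statistics.stdev(samples) / mean


def is_stable(samples, limit_percent=2.5):
    """True if run-to-run variation stays under the given limit."""
    return relative_stddev_percent(samples) < limit_percent


# Hypothetical throughput samples in M ops/s (NOT measured data).
example_runs = [16.3, 16.6, 16.5, 16.4, 16.7]
print(f"{relative_stddev_percent(example_runs):.2f}%", is_stable(example_runs))
```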

Reports Generated

  1. PHASE8_COMPREHENSIVE_BENCHMARK_REPORT.md - Full statistical analysis
  2. PHASE8_TECHNICAL_ANALYSIS.md - Deep dive into root causes
  3. PHASE8_VISUAL_SUMMARY.md - Visual charts and decision matrix
  4. PHASE8_QUICK_REFERENCE.md - This file (quick lookup)

Next Steps

  1. Read PHASE8_VISUAL_SUMMARY.md for decision matrix
  2. Read PHASE8_TECHNICAL_ANALYSIS.md for root cause details
  3. Begin Phase 9 investigation (Week 1)
  4. Re-evaluate after 2 weeks

Date: 2025-11-30
Status: Phase 8 COMPLETE, Phase 9 READY
Critical Path: Fix SuperSlab scaling or switch to Hybrid architecture