# hakmem Benchmark Design - jemalloc/mimalloc Comparison

**Purpose**: Compare hakmem against industry-standard allocators for paper evaluation
**Date**: 2025-10-21
**Status**: Phase 5 Implementation

---

## 🎯 Benchmark Goals

Per Gemini's S+ review:

> "Without a jemalloc/mimalloc comparison → no Best Paper Award"

**Key Requirements**:
1. Fair comparison (same workload, same environment)
2. Multiple allocators: hakmem (baseline/evolving), jemalloc, mimalloc, system malloc
3. KPI measurement: P99 latency, page faults, RSS, throughput
4. Statistical significance: multiple runs, warm-up, median/percentiles
5. Paper-ready output: CSV format for graphs/tables

---

## 📊 Workload Scenarios

Using existing test scenarios from `test_hakmem.c`:

### Scenario 1: JSON Parsing (small, frequent)
- Size: 64KB
- Iterations: 1000
- Pattern: Allocate → Use → Free (tight loop)

### Scenario 2: MIR Build (medium, moderate)
- Size: 256KB
- Iterations: 100
- Pattern: Allocate → Use → Free (moderate)

### Scenario 3: VM Execution (large, infrequent)
- Size: 2MB
- Iterations: 10
- Pattern: Allocate → Use → Free (infrequent)

### Scenario 4: Mixed (realistic)
- All three patterns mixed
- Simulates a real compiler workload

---

## 🔬 Allocator Configurations

### 1. hakmem-baseline
- `HAKMEM_MODE` not set
- Fixed policy (256KB threshold)
- Baseline for comparison

### 2. hakmem-evolving
- `HAKMEM_MODE=evolving`
- UCB1 enabled
- Adaptive learning

### 3. jemalloc
- `LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2`
- Industry standard (Firefox, Redis)

### 4. mimalloc
- `LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libmimalloc.so.2`
- Microsoft's allocator

### 5. system malloc (glibc)
- No `LD_PRELOAD`
- Default libc allocator
- Control baseline

---

## 📈 KPI Metrics

### Primary Metrics (for paper)
1. **P99 Latency**: 99th percentile allocation latency (ns)
2. **Page Faults**: Hard page faults (I/O required)
3. **RSS Peak**: Maximum resident set size (MB)

### Secondary Metrics
4.
**Throughput**: Allocations per second
5. **P50/P95 Latency**: Additional percentiles
6. **Soft Page Faults**: Minor faults (no I/O)

---

## 🏗️ Implementation Plan

### Phase 5.1: Benchmark Infrastructure (current phase)
- [x] Design document (this file)
- [ ] `bench_allocators.c` - Main benchmark program
- [ ] `bench_runner.sh` - Shell script to run all allocators
- [ ] CSV output format

### Phase 5.2: Statistical Analysis
- [ ] Multiple runs (10-50 iterations)
- [ ] Warm-up phase (discard first 3 runs)
- [ ] Median/percentile calculation
- [ ] Confidence intervals

### Phase 5.3: Visualization
- [ ] Python/gnuplot scripts for graphs
- [ ] LaTeX tables for paper

---

## 🔧 Benchmark Program Design

### `bench_allocators.c` Structure

```c
#include <stdint.h>

// Allocator abstraction layer
typedef void* (*alloc_fn_t)(size_t);
typedef void (*free_fn_t)(void*, size_t);

// Per-run KPI results (fields mirror the CSV columns below)
typedef struct {
    uint64_t p50_ns, p95_ns, p99_ns;
    long soft_pf, hard_pf;
    double rss_mb;
    double throughput;
} bench_result_t;

// Benchmark a single scenario
typedef struct {
    const char* name;
    void (*run)(alloc_fn_t, free_fn_t, int iterations);
} benchmark_t;

// Scenarios
void bench_json(alloc_fn_t alloc, free_fn_t free, int iters);
void bench_mir(alloc_fn_t alloc, free_fn_t free, int iters);
void bench_vm(alloc_fn_t alloc, free_fn_t free, int iters);
void bench_mixed(alloc_fn_t alloc, free_fn_t free, int iters);

// KPI measurement
void measure_start(void);
void measure_end(bench_result_t* out);
```

### Output Format (CSV)

```csv
allocator,scenario,iterations,p50_ns,p95_ns,p99_ns,soft_pf,hard_pf,rss_mb,throughput
hakmem-baseline,json,1000,42,68,89,1234,0,5.2,23809
hakmem-evolving,json,1000,38,62,81,1150,0,4.8,26315
jemalloc,json,1000,45,72,95,1400,0,6.1,22222
mimalloc,json,1000,40,65,85,1280,0,5.5,25000
system,json,1000,55,90,120,1800,2,7.8,18181
```

---

## 🧪 Execution Plan

### Step 1: Build

```bash
# hakmem versions
make clean && make          # hakmem with UCB1

# System allocators (already installed)
apt-get install libjemalloc2 libmimalloc2.0
```

### Step 2: Run Benchmarks

```bash
# Run all allocators, all scenarios
bash bench_runner.sh --warmup 3 --runs 10 --output results.csv
```
```bash
# Individual runs (for debugging)
./bench_allocators --allocator hakmem-baseline --scenario json
LD_PRELOAD=libjemalloc.so.2 ./bench_allocators --allocator jemalloc --scenario json
```

### Step 3: Analyze

```bash
# Generate graphs
python3 analyze_results.py results.csv --output graphs/

# Generate LaTeX table
python3 generate_latex_table.py results.csv --output paper_table.tex
```

---

## 📋 Gemini's Critical Requirements

### ✅ Must Have (for Best Paper)
1. **jemalloc comparison** - Industry standard
2. **mimalloc comparison** - State-of-the-art
3. **Fair benchmarking** - Same workload, multiple runs
4. **Statistical significance** - Warm-up, median, confidence intervals

### 🎯 Should Have (for generality)
5. **Redis/Nginx benchmarks** - Real-world workloads
6. **Confusion Matrix** - Auto-inference accuracy

### 💡 Nice to Have
7. **Multi-threaded benchmarks** - Scalability
8. **Memory fragmentation** - Long-running tests

---

## 🚀 Next Steps

1. Implement `bench_allocators.c` ⬅️ **next**
2. Implement `bench_runner.sh`
3. Run initial benchmarks (10 runs)
4. Analyze results
5. Create graphs for paper

---

## 📝 Notes

- **Fair comparison**: Use the same scenarios for all allocators
- **Statistical rigor**: Multiple runs, discard outliers
- **Paper-ready**: CSV → graphs/tables directly
- **Reproducible**: Document exact versions and environment

**Related**: Gemini's S+ review in `chatgpt-advanced-proposals.md`
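The median/percentile calculation planned for Phase 5.2 can be sketched with the nearest-rank method over raw latency samples. This is a minimal sketch; `percentile_ns` and `cmp_u64` are illustrative names, not part of the existing tooling, and a paper-grade analysis would add interpolation and confidence intervals on top.

```c
// Nearest-rank percentile over raw latency samples (illustrative sketch).
#include <stdint.h>
#include <stdlib.h>

// qsort comparator for uint64_t latencies, avoiding overflow from subtraction
static int cmp_u64(const void* a, const void* b) {
    uint64_t x = *(const uint64_t*)a, y = *(const uint64_t*)b;
    return (x > y) - (x < y);
}

// p in [0, 100]; samples are sorted in place, n must be > 0.
uint64_t percentile_ns(uint64_t* samples, size_t n, double p) {
    qsort(samples, n, sizeof(uint64_t), cmp_u64);
    size_t rank = (size_t)((p / 100.0) * (double)n);
    if (rank >= n) rank = n - 1;  // clamp p == 100 to the max sample
    return samples[rank];
}
```

With this, P50/P95/P99 are just three calls over the same sorted buffer, which keeps the per-run analysis cost at one `qsort`.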