`hakmem/docs/design/BENCHMARK_DESIGN.md`
# hakmem Benchmark Design - jemalloc/mimalloc Comparison
**Purpose**: Compare hakmem against industry-standard allocators for paper evaluation
**Date**: 2025-10-21
**Status**: Phase 5 Implementation
---
## 🎯 Benchmark Goals
Per Gemini's S+ review:
> "Without a jemalloc/mimalloc comparison, a Best Paper Award is out of reach."
**Key Requirements**:
1. Fair comparison (same workload, same environment)
2. Multiple allocators: hakmem (baseline/evolving), jemalloc, mimalloc, system malloc
3. KPI measurement: P99 latency, page faults, RSS, throughput
4. Statistical significance: multiple runs, warm-up, median/percentiles
5. Paper-ready output: CSV format for graphs/tables
---
## 📊 Workload Scenarios
Using existing test scenarios from `test_hakmem.c`:
### Scenario 1: JSON Parsing (small, frequent)
- Size: 64KB
- Iterations: 1000
- Pattern: Allocate → Use → Free (tight loop)
### Scenario 2: MIR Build (medium, moderate)
- Size: 256KB
- Iterations: 100
- Pattern: Allocate → Use → Free (moderate)
### Scenario 3: VM Execution (large, infrequent)
- Size: 2MB
- Iterations: 10
- Pattern: Allocate → Use → Free (infrequent)
### Scenario 4: Mixed (realistic)
- All three patterns mixed
- Simulates real compiler workload
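The three fixed-size scenarios share the same inner loop (allocate → touch → free). A minimal sketch of that loop against the allocator abstraction, assuming hypothetical wrapper names (`run_fixed_pattern`, `sys_alloc`, `sys_free` are illustrative, not the actual `test_hakmem.c` code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*alloc_fn_t)(size_t);
typedef void  (*free_fn_t)(void *, size_t);

/* Allocate -> touch every byte -> free, repeated `iters` times.
 * Touching the buffer forces pages to actually be faulted in,
 * so the page-fault KPIs below measure real work.
 * Returns the number of iterations that completed. */
static int run_fixed_pattern(alloc_fn_t alloc, free_fn_t dealloc,
                             size_t size, int iters) {
    int done = 0;
    for (int i = 0; i < iters; i++) {
        unsigned char *p = alloc(size);
        if (!p)
            break;
        memset(p, 0xAB, size);  /* touch the memory */
        dealloc(p, size);
        done++;
    }
    return done;
}

/* Wrappers so the system allocator fits the sized-free abstraction. */
static void *sys_alloc(size_t n)         { return malloc(n); }
static void  sys_free(void *p, size_t n) { (void)n; free(p); }
```

With `size = 64 * 1024, iters = 1000` this is Scenario 1; the other fixed scenarios only change the two parameters.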
---
## 🔬 Allocator Configurations
### 1. hakmem-baseline
- `HAKMEM_MODE` not set
- Fixed policy (256KB threshold)
- Baseline for comparison
### 2. hakmem-evolving
- `HAKMEM_MODE=evolving`
- UCB1 enabled
- Adaptive learning
### 3. jemalloc
- `LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2`
- Industry standard (Firefox, Redis)
### 4. mimalloc
- `LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libmimalloc.so.2`
- Microsoft allocator
### 5. system malloc (glibc)
- No LD_PRELOAD
- Default libc allocator
- Control baseline
---
## 📈 KPI Metrics
### Primary Metrics (for paper)
1. **P99 Latency**: 99th percentile allocation latency (ns)
2. **Page Faults**: Hard page faults (I/O required)
3. **RSS Peak**: Maximum resident set size (MB)
### Secondary Metrics
4. **Throughput**: Allocations per second
5. **P50/P95 Latency**: Additional percentiles
6. **Soft Page Faults**: Minor faults (no I/O)
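The fault and RSS counters can be read per-process with `getrusage(2)`. A hedged sketch (the struct and function names are ours, not the actual hakmem measurement code; note that `ru_maxrss` is KiB on Linux but bytes on macOS):

```c
#include <assert.h>
#include <sys/resource.h>

typedef struct {
    long soft_pf;  /* minor faults: satisfied without I/O   */
    long hard_pf;  /* major faults: required I/O            */
    long rss_kb;   /* peak resident set size (KiB on Linux) */
} bench_counters_t;

/* Snapshot fault and peak-RSS counters for the current process.
 * Call once before and once after a scenario; the KPI is the delta
 * for faults and the final value for peak RSS. */
static int read_counters(bench_counters_t *out) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    out->soft_pf = ru.ru_minflt;
    out->hard_pf = ru.ru_majflt;
    out->rss_kb  = ru.ru_maxrss;
    return 0;
}
```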
---
## 🏗️ Implementation Plan
### Phase 5.1: Benchmark Infrastructure (this phase)
- [x] Design document (this file)
- [ ] `bench_allocators.c` - Main benchmark program
- [ ] `bench_runner.sh` - Shell script to run all allocators
- [ ] CSV output format
### Phase 5.2: Statistical Analysis
- [ ] Multiple runs (10-50 iterations)
- [ ] Warm-up phase (discard first 3 runs)
- [ ] Median/percentile calculation
- [ ] Confidence intervals
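A percentile over the collected latency samples can be computed by sorting and indexing. A minimal nearest-index sketch (the real analysis scripts may prefer interpolated percentiles; `percentile_ns` is an illustrative name):

```c
#include <assert.h>
#include <stdlib.h>

static int cmp_u64(const void *a, const void *b) {
    const unsigned long long x = *(const unsigned long long *)a;
    const unsigned long long y = *(const unsigned long long *)b;
    return (x > y) - (x < y);
}

/* Nearest-index percentile of n latency samples (nanoseconds).
 * Sorts v in place; p is in [0, 100]. */
static unsigned long long percentile_ns(unsigned long long *v, size_t n,
                                        double p) {
    qsort(v, n, sizeof *v, cmp_u64);
    size_t idx = (size_t)((p / 100.0) * (double)(n - 1) + 0.5);
    return v[idx];
}
```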
### Phase 5.3: Visualization
- [ ] Python/gnuplot scripts for graphs
- [ ] LaTeX tables for paper
---
## 🔧 Benchmark Program Design
### `bench_allocators.c` Structure
```c
// Allocator abstraction layer
typedef void* (*alloc_fn_t)(size_t);
typedef void (*free_fn_t)(void*, size_t);

// KPI results for one run (fields mirror the CSV columns below)
typedef struct {
    uint64_t p50_ns, p95_ns, p99_ns;  // latency percentiles
    long soft_pf, hard_pf;            // minor/major page faults
    double rss_mb;                    // peak resident set size
    double throughput;                // allocations per second
} bench_result_t;

// Benchmark a single scenario
typedef struct {
    const char* name;
    void (*run)(alloc_fn_t, free_fn_t, int iterations);
} benchmark_t;

// Scenarios
void bench_json(alloc_fn_t alloc, free_fn_t free, int iters);
void bench_mir(alloc_fn_t alloc, free_fn_t free, int iters);
void bench_vm(alloc_fn_t alloc, free_fn_t free, int iters);
void bench_mixed(alloc_fn_t alloc, free_fn_t free, int iters);

// KPI measurement
void measure_start(void);
void measure_end(bench_result_t* out);
```
### Output Format (CSV)
```csv
allocator,scenario,iterations,p50_ns,p95_ns,p99_ns,soft_pf,hard_pf,rss_mb,throughput
hakmem-baseline,json,1000,42,68,89,1234,0,5.2,23809
hakmem-evolving,json,1000,38,62,81,1150,0,4.8,26315
jemalloc,json,1000,45,72,95,1400,0,6.1,22222
mimalloc,json,1000,40,65,85,1280,0,5.5,25000
system,json,1000,55,90,120,1800,2,7.8,18181
```
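Emitting a row in this schema is a single formatted write. A sketch (`write_csv_row` is an illustrative helper, not a committed API):

```c
#include <assert.h>
#include <stdio.h>

/* Emit one CSV row in the schema above.
 * Returns the number of characters written, or negative on error. */
static int write_csv_row(FILE *f, const char *allocator, const char *scenario,
                         int iters, long p50, long p95, long p99,
                         long soft_pf, long hard_pf, double rss_mb,
                         long throughput) {
    return fprintf(f, "%s,%s,%d,%ld,%ld,%ld,%ld,%ld,%.1f,%ld\n",
                   allocator, scenario, iters, p50, p95, p99,
                   soft_pf, hard_pf, rss_mb, throughput);
}
```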
---
## 🧪 Execution Plan
### Step 1: Build
```bash
# hakmem versions
make clean && make # hakmem with UCB1
# System allocators (skip if already installed)
apt-get install libjemalloc2 libmimalloc2.0
```
### Step 2: Run Benchmarks
```bash
# Run all allocators, all scenarios
bash bench_runner.sh --warmup 3 --runs 10 --output results.csv
# Individual runs (for debugging)
./bench_allocators --allocator hakmem-baseline --scenario json
LD_PRELOAD=libjemalloc.so.2 ./bench_allocators --allocator jemalloc --scenario json
```
### Step 3: Analyze
```bash
# Generate graphs
python3 analyze_results.py results.csv --output graphs/
# Generate LaTeX table
python3 generate_latex_table.py results.csv --output paper_table.tex
```
---
## 📋 Gemini's Critical Requirements
### ✅ Must Have (for Best Paper)
1. **jemalloc comparison** - Industry standard
2. **mimalloc comparison** - State-of-the-art
3. **Fair benchmarking** - Same workload, multiple runs
4. **Statistical significance** - Warm-up, median, confidence intervals
### 🎯 Should Have (for generality)
5. **Redis/Nginx benchmarks** - Real-world workloads
6. **Confusion Matrix** - Auto-inference accuracy
### 💡 Nice to Have
7. **Multi-threaded benchmarks** - Scalability
8. **Memory fragmentation** - Long-running tests
---
## 🚀 Next Steps
1. Implement `bench_allocators.c` ⬅️ **next**
2. Implement `bench_runner.sh`
3. Run initial benchmarks (10 runs)
4. Analyze results
5. Create graphs for paper
---
## 📝 Notes
- **Fair comparison**: Use same scenarios for all allocators
- **Statistical rigor**: Multiple runs, discard outliers
- **Paper-ready**: CSV → graphs/tables directly
- **Reproducible**: Document exact versions, environment
**Related**: Gemini's S+ review in `chatgpt-advanced-proposals.md`