# ChatGPT Pro Consultation: mmap vs malloc Strategy

**Date**: 2025-10-21
**Context**: hakmem allocator optimization (Phase 6.2 + 6.3 implementation)
**Time Limit**: 10 minutes
**Question Type**: Architecture decision

---

## 🎯 Core Question

**Should we switch from malloc to mmap for large allocations (POLICY_LARGE_INFREQUENT) to enable Phase 6.3 madvise batching?**

---

## 📊 Current Situation

### What We Built (Phases 6.2 + 6.3)

1. **Phase 6.2: ELO Strategy Selection** ✅
   - 12 candidate strategies (512KB-32MB thresholds)
   - Epsilon-greedy selection (10% exploration)
   - Expected: +10-20% on VM scenario

2. **Phase 6.3: madvise Batching** ✅
   - Batch MADV_DONTNEED calls (4MB threshold)
   - Reduces TLB flush overhead
   - Expected: +20-30% on VM scenario

### Critical Problem Discovered

**Phase 6.3 doesn't work because all allocations use malloc!**

```c
// hakmem.c:357
static void* allocate_with_policy(size_t size, Policy policy) {
    switch (policy) {
        case POLICY_LARGE_INFREQUENT:
            // ALL ALLOCATIONS USE MALLOC
            return alloc_malloc(size);  // ← was alloc_mmap(size) before
            // …
```

**Why this is a problem**:
- madvise() only works on mmap blocks (not malloc!)
- Current code: 100% malloc → 0% madvise batching
- The Phase 6.3 implementation is correct, but it never triggers

---

## 📜 Key Code Snippets

### 1. Current Allocation Strategy (ALL MALLOC)

```c
// hakmem.c:349-357
static void* allocate_with_policy(size_t size, Policy policy) {
    switch (policy) {
        case POLICY_LARGE_INFREQUENT:
            // CHANGED: Use malloc for all sizes to leverage system allocator's
            // built-in free-list and mmap optimization. Direct mmap() without
            // free-list causes excessive page faults (1538 vs 2 for 10×2MB).
            //
            // Future: Implement per-site mmap cache for true zero-copy large allocs.
            return alloc_malloc(size);  // was: alloc_mmap(size)

        case POLICY_SMALL_FREQUENT:
        case POLICY_MEDIUM:
        case POLICY_DEFAULT:
        default:
            return alloc_malloc(size);
    }
}
```

### 2. BigCache (Implemented for malloc blocks)

```c
// hakmem.c:430-437
// NEW: Try BigCache first (for large allocations)
if (size >= 1048576) {  // 1MB threshold
    void* cached_ptr = NULL;
    if (hak_bigcache_try_get(size, site_id, &cached_ptr)) {
        // Cache hit: return immediately
        return cached_ptr;
    }
}
```

**Stats from FINAL_RESULTS.md**:
- BigCache hit rate: 90%
- Page faults reduced by 50% (513 vs 1026)
- BigCache currently caches malloc blocks (not mmap)

### 3. madvise Batching (Only works on mmap!)

```c
// hakmem.c:543-548
case ALLOC_METHOD_MMAP:
    // Phase 6.3: Batch madvise for mmap blocks ONLY
    if (hdr->size >= BATCH_MIN_SIZE) {
        hak_batch_add(raw, hdr->size);  // ← never called!
    }
    munmap(raw, hdr->size);
    break;
```

**Problem**: no blocks ever carry ALLOC_METHOD_MMAP, so batching never triggers.

### 4. Historical Context (Why malloc was chosen)

```c
// Comment in hakmem.c:352-356
// CHANGED: Use malloc for all sizes to leverage system allocator's
// built-in free-list and mmap optimization. Direct mmap() without
// free-list causes excessive page faults (1538 vs 2 for 10×2MB).
//
// Future: Implement per-site mmap cache for true zero-copy large allocs.
```

**Before BigCache**:
- Direct mmap: 1538 page faults (10 allocations × 2MB)
- malloc: 2 page faults (the system allocator caches its internal mmaps)

**After BigCache** (current):
- BigCache hit rate: 90% → only 10% of allocations reach the actual allocator
- Expected page faults with mmap: 1538 × 10% ≈ 150 faults

---

## 🤔 Decision Options

### Option A: Switch to mmap (Enable Phase 6.3)

**Change**:
```c
case POLICY_LARGE_INFREQUENT:
    return alloc_mmap(size);  // 1-line change
```

**Pros**:
- ✅ Phase 6.3 madvise batching works immediately
- ✅ BigCache (90% hit rate) should prevent a page fault explosion
- ✅ Combined effect: BigCache + madvise batching
- ✅ Expected: 150 faults → ~3 batched flushes, assuming ~50 blocks per batch (vs 150 individual madvise calls)

**Cons**:
- ❌ Risk of page fault regression if BigCache doesn't work as expected
- ❌ Need to verify BigCache works with mmap blocks (not just malloc)

**Expected Performance**:
- Page faults: 1538 → ~150 (BigCache: 90% hit rate)
- TLB flushes: 150 → 3-5 (madvise batching: ~50× reduction)
- Net speedup: +30-50% on the VM scenario

### Option B: Keep malloc (Status quo)

**Pros**:
- ✅ Known-good performance (system allocator optimizations)
- ✅ No risk of page fault regression

**Cons**:
- ❌ Phase 6.3 is completely wasted (no madvise batching)
- ❌ No TLB optimization
- ❌ Can't close the 2× gap with mimalloc on the VM scenario (its madvise handling is the suspected main factor; see Question 4)

### Option C: ELO-based dynamic selection

**Change**:
```c
// ELO selects between malloc and mmap strategies
if (strategy_id < 6) {
    return alloc_malloc(size);
} else {
    return alloc_mmap(size);  // test mmap with strategies 6-11
}
```

**Pros**:
- ✅ Lets ELO learning decide based on measured performance
- ✅ Safe fallback to malloc if mmap performs worse

**Cons**:
- ❌ More complex
- ❌ Slower convergence (needs data from both paths)

---

## 📊 Benchmark Data (Current Silver Medal Results)

**From FINAL_RESULTS.md**:

| Allocator | JSON (ns) | MIR (ns) | VM (ns) | MIXED (ns) |
|-----------|-----------|----------|---------|------------|
| mimalloc | 278.5 | 1234.0 | **17725.0** | 512.0 |
| **hakmem-evolving** | 272.0 | 1578.0 | **36647.5** | 739.5 |
| hakmem-baseline | 261.0 | 1690.0 | 36910.5 | 781.5 |
| jemalloc | 489.0 | 1493.0 | 27039.0 | 800.5 |
| system | 253.5 | 1724.0 | 62772.5 | 931.5 |

**Current gap (VM scenario)**:
- hakmem vs mimalloc: **2.07× slower** (36647.5 / 17725.0)
- Target with Phase 6.3: **1.3-1.4× slower** (close the gap by 30-50%)

**Page faults (VM scenario)**:
- hakmem: 513 (with BigCache)
- system: 1026 (without BigCache)
- BigCache reduces faults by 50%

---

## 🎯 Specific Questions for ChatGPT Pro

1. **Risk Assessment**: Is switching to mmap safe given BigCache's 90% hit rate?
   - Will ~150 page faults (10% miss rate) cause acceptable overhead?
   - Is madvise batching (150 → 3-5 TLB flushes) worth the risk?

2. **BigCache + mmap Compatibility**: Any concerns with caching mmap blocks?
   - Current: BigCache caches malloc blocks
   - Proposed: BigCache caches mmap blocks (same size class)
   - Any hidden issues?

3. **Alternative Approach**: Should we implement Option C (ELO-based selection)?
   - Let ELO choose between malloc and mmap strategies
   - Trade-off: complexity vs. safety

4. **mimalloc Analysis**: Does mimalloc use mmap for large allocations?
   - How does it achieve a 2× speedup on the VM scenario?
   - Is madvise batching the main factor?

5. **Performance Prediction**: Expected performance with Option A?
   - Current: 36,647 ns (malloc, no batching)
   - Predicted: ??? ns (mmap + BigCache + madvise batching)
   - Is a +30-50% gain realistic?

---

## 🧪 Test Plan (If Option A is chosen)

1. **Switch to mmap** (1-line change)
2. **Run the VM scenario benchmark** (10 runs, quick test)
3. **Measure**:
   - Page faults (expect ~150, vs 513 with malloc)
   - TLB flushes (expect 3-5, vs 150 without batching)
   - Latency (expect 25,000-28,000 ns, vs 36,647 ns currently)
4. **Rollback if**:
   - Page faults > 500 (BigCache not working)
   - Latency regresses (slower than current)

---

## 📚 Context Files

**Implementation**:
- `hakmem.c`: Main allocator (`allocate_with_policy`, L349)
- `hakmem_bigcache.c`: Per-site cache (90% hit rate)
- `hakmem_batch.c`: madvise batching (Phase 6.3)
- `hakmem_elo.c`: ELO strategy selection (Phase 6.2)

**Documentation**:
- `FINAL_RESULTS.md`: Silver medal results (2nd place of 5 allocators)
- `CHATGPT_FEEDBACK.md`: Your previous recommendations (ACE + ELO + madvise)
- `PHASE_6.2_ELO_IMPLEMENTATION.md`: ELO implementation details
- `PHASE_6.3_MADVISE_BATCHING.md`: madvise batching implementation

---

## 🎯 Recommendation Request

**Please provide**:
1. **Go/No-Go**: Should we switch to mmap (Option A)?
2. **Risk mitigation**: How do we test safely without breaking current performance?
3. **Alternative**: If not Option A, what's the best path to the gold medal?
4. **Expected gain**: A realistic performance prediction with mmap + batching?

**Time limit**: 10 minutes
**Priority**: HIGH (blocks Phase 6.3 effectiveness)

---

**Generated**: 2025-10-21
**Status**: Awaiting ChatGPT Pro consultation
**Next**: Implement the recommended approach