Implement and Test Lazy Zeroing Optimization: Phase 2 Complete

## Implementation
- Call madvise(MADV_DONTNEED) on a SuperSlab when it enters the LRU cache
- Environment variable: HAKMEM_SS_LAZY_ZERO (default: 1, i.e. enabled)
- Low-risk; when disabled, the only cost is one predictable branch in superslab_free()
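
The gating rule used for HAKMEM_SS_LAZY_ZERO can be sketched as a standalone function. The helper name `lazy_zero_from_env_value` is hypothetical and not part of the commit; only the boolean expression is taken from the diff. Because only the first character is inspected, some values may behave unexpectedly.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not in the commit): returns 1 if lazy zeroing
 * would be enabled for a given HAKMEM_SS_LAZY_ZERO value, else 0.
 * The expression is the one used in superslab_free().
 */
static int lazy_zero_from_env_value(const char *e) {
    /* Unset or empty -> enabled (default). Otherwise only the first
     * character is checked, so "10" enables and "true" disables. */
    return (!e || !*e || *e == '1') ? 1 : 0;
}
```

In the allocator this would be called as `lazy_zero_from_env_value(getenv("HAKMEM_SS_LAZY_ZERO"))`, with the result cached in a static as the commit does.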

## Results: NO MEASURABLE IMPROVEMENT
- Cycles: 70.4M (baseline) vs 70.8M (optimized) = about 0.6% more cycles (slightly worse)
- Page faults: 7,674 (no change)
- L1 misses: 717K vs 714K (negligible)

## Key Discovery
The 11.65% clear_page_erms overhead is **kernel-level**, not allocator-level:
- Happens during page faults, not during free
- Can't be selectively deferred for SuperSlab pages
- MADV_DONTNEED syscall overhead cancels benefit
- Result: Zero improvement despite profiling showing 11.65%
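
A minimal sketch of why the zeroing cost moves rather than disappears, assuming Linux and a private anonymous mapping: after MADV_DONTNEED the pages read back as zero, but every page has to be faulted in again, and the kernel zeroes it during that fault. The helper names below are hypothetical and not part of HAKMEM.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose madvise() and MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor page faults taken by this process so far. */
static long minor_faults(void) {
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt;
}

/*
 * Hypothetical helper: map `len` bytes, dirty them, drop them with
 * MADV_DONTNEED, then touch every page again.
 * Returns the minor faults taken by the second touch, or -1 on error.
 * *all_zero is set to 1 if every page read back as zero.
 */
static long refault_after_dontneed(size_t len, int *all_zero) {
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    memset(p, 0xAB, len);                 /* first touch: fault in and dirty */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    long before = minor_faults();
    *all_zero = 1;
    for (size_t off = 0; off < len; off += (size_t)page) {
        if (p[off] != 0) *all_zero = 0;   /* contents were discarded */
        p[off] = 1;                       /* write forces a fresh page */
    }
    long after = minor_faults();

    munmap(p, len);
    return after - before;
}
```

Calling `refault_after_dontneed(64 * 4096, &z)` would typically report a fault count on the order of the number of pages touched, which is the work that shows up as clear_page_erms in the profile.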

## Why Profiling Was Misleading
- Page zeroing shown in profile but not controllable
- Happens globally across all allocators
- Can't isolate which faults are from our code
- Not all profile % are equally optimizable

## Conclusion
Random Mixed 1.06M ops/s appears to be near the practical limit:
- THP: no effect (already tested)
- PREFAULT: +2.6% (within measurement noise)
- Lazy zeroing: 0% (syscall overhead cancels benefit)
- Realistic cap: ~1.10-1.15M ops/s (10-15% max possible)

Tiny Hot (89M ops/s) is not comparable - it's an architectural difference.

🐱 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Moe Charm (CI)
2025-12-04 20:49:21 +09:00
parent 1755257f60
commit 4cad395e10
2 changed files with 324 additions and 0 deletions

@@ -345,6 +345,20 @@ void superslab_free(SuperSlab* ss) {
     }
     if (lru_cached) {
         // Successfully cached in LRU - defer munmap
+        // OPTIMIZATION: Lazy zeroing via MADV_DONTNEED
+        // When SuperSlab enters LRU cache, mark pages as DONTNEED to defer
+        // page zeroing until they are actually touched by next allocation.
+        // Kernel will zero them on-fault (zero-on-fault), reducing clear_page_erms overhead.
+        static int lazy_zero_enabled = -1;
+        if (__builtin_expect(lazy_zero_enabled == -1, 0)) {
+            const char* e = getenv("HAKMEM_SS_LAZY_ZERO");
+            lazy_zero_enabled = (!e || !*e || *e == '1') ? 1 : 0;
+        }
+        if (lazy_zero_enabled) {
+#ifdef MADV_DONTNEED
+            (void)madvise((void*)ss, ss_size, MADV_DONTNEED);
+#endif
+        }
         return;
     }