Phase 11: SuperSlab Prewarm implementation (startup pre-allocation)

## Summary
Pre-allocate SuperSlabs at startup to eliminate runtime mmap overhead.
Result: +6.4% improvement (8.82M → 9.38M ops/s) but still 9x slower than System malloc.

## Key Findings (Lesson Learned)
- Syscall reduction strategy targeted WRONG bottleneck
- Real bottleneck: SuperSlab allocation churn (877 SuperSlabs needed)
- Prewarm reduces mmap frequency but doesn't solve fundamental architecture issue

## Implementation
- Two-phase allocation with atomic bypass flag
- Environment variable: HAKMEM_PREWARM_SUPERSLABS (default: 0)
- Best result: Prewarm=8 → 9.38M ops/s (+6.4%)
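
The two-phase idea can be illustrated with a minimal sketch. It is not the code from this commit. It assumes "two-phase" means (1) allocating fresh slabs while an atomic flag bypasses the cache, then (2) publishing them into the cache. Every `sketch_*` name, the pool layout, and the slab size are hypothetical. `malloc` stands in for `mmap`, and the pool is assumed to be touched only during single-threaded startup.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical; none are taken from the hakmem sources. */
#define SKETCH_SLAB_BYTES (64u * 1024u) /* small stand-in size for the demo */
#define SKETCH_POOL_CAP   1024

static void *sketch_pool[SKETCH_POOL_CAP]; /* prewarmed slabs, used LIFO */
static int sketch_pool_count;
static atomic_int sketch_bypass;           /* nonzero while prewarm is running */

/* Parse a prewarm count. NULL, empty, negative, or malformed input
 * yields 0 (prewarm disabled); large values are clamped to the pool size. */
static int sketch_parse_count(const char *s) {
    if (s == NULL || *s == '\0') return 0;
    char *end = NULL;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v < 0) return 0;
    return v > SKETCH_POOL_CAP ? SKETCH_POOL_CAP : (int)v;
}

/* Serve from the pool unless the bypass flag is set.
 * malloc() is a stand-in for the real mmap-backed path. */
static void *sketch_slab_alloc(void) {
    if (!atomic_load(&sketch_bypass) && sketch_pool_count > 0)
        return sketch_pool[--sketch_pool_count];
    return malloc(SKETCH_SLAB_BYTES);
}

/* Phase 1: allocate with the pool bypassed so every slab is fresh.
 * Phase 2: publish the fresh slabs into the pool.
 * Returns the number of slabs actually added. */
static int sketch_prewarm(int want) {
    void *fresh[SKETCH_POOL_CAP];
    int room = SKETCH_POOL_CAP - sketch_pool_count;
    int got = 0;
    if (want > room) want = room;

    atomic_store(&sketch_bypass, 1);
    while (got < want) {
        void *p = sketch_slab_alloc();
        if (p == NULL) break; /* stop early on allocation failure */
        fresh[got++] = p;
    }
    for (int i = 0; i < got; i++)
        sketch_pool[sketch_pool_count++] = fresh[i];
    assert(sketch_pool_count <= SKETCH_POOL_CAP);
    atomic_store(&sketch_bypass, 0);
    return got;
}

/* Startup entry point: read the variable named in this commit. */
static int sketch_prewarm_from_env(void) {
    return sketch_prewarm(sketch_parse_count(getenv("HAKMEM_PREWARM_SUPERSLABS")));
}
```

In this sketch an unset variable parses to 0, which matches the documented default of no prewarming. The bypass flag exists so that phase 1 cannot hand back slabs that are already pooled.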

## Next Step
Pivot to Phase 12: Shared SuperSlab Pool (mimalloc-style)
- Expected: 877 → 100-200 SuperSlabs (roughly -77% to -89%)
- This addresses the ROOT CAUSE (allocation churn), not the symptoms

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Moe Charm (CI)
2025-11-13 14:45:43 +09:00
parent 030132f911
commit 2be754853f
4 changed files with 420 additions and 0 deletions


@@ -633,6 +633,20 @@ void hak_tiny_init(void) {
}
}
// Phase 11: Initialize SuperSlab Registry and LRU Cache
if (g_use_superslab) {
extern void hak_super_registry_init(void);
extern void hak_ss_lru_init(void);
extern void hak_ss_prewarm_init(void);
hak_super_registry_init();
hak_ss_lru_init();
// Phase 11: Prewarm SuperSlabs to eliminate mmap/munmap churn
// ENV: HAKMEM_PREWARM_SUPERSLABS=<count> (e.g., 32, 128)
hak_ss_prewarm_init();
}
if (__builtin_expect(route_enabled_runtime(), 0)) {
tiny_debug_ring_record(TINY_RING_EVENT_ROUTE, (uint16_t)0xFFFFu, NULL, (uintptr_t)0x494E4954u);
}