Phase 11: SuperSlab Prewarm implementation (startup pre-allocation)

## Summary

Pre-allocate SuperSlabs at startup to eliminate runtime mmap overhead.
Result: +6.4% improvement (8.82M → 9.38M ops/s) but still 9x slower than System malloc.

## Key Findings (Lesson Learned)

- Syscall reduction strategy targeted WRONG bottleneck
- Real bottleneck: SuperSlab allocation churn (877 SuperSlabs needed)
- Prewarm reduces mmap frequency but doesn't solve fundamental architecture issue

## Implementation

- Two-phase allocation with atomic bypass flag
- Environment variable: HAKMEM_PREWARM_SUPERSLABS (default: 0)
- Best result: Prewarm=8 → 9.38M ops/s (+6.4%)

## Next Step

Pivot to Phase 12: Shared SuperSlab Pool (mimalloc-style)

- Expected: 877 → 100-200 SuperSlabs (-70-80%)
- This addresses ROOT CAUSE (allocation churn) not symptoms

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 14:45:43 +09:00
parent 030132f911
commit 2be754853f
4 changed files with 420 additions and 0 deletions
--- a/core/hakmem_tiny_init.inc
+++ b/core/hakmem_tiny_init.inc
@@ -633,6 +633,20 @@ void hak_tiny_init(void) {
        }
    }

+    // Phase 11: Initialize SuperSlab Registry and LRU Cache
+    if (g_use_superslab) {
+        extern void hak_super_registry_init(void);
+        extern void hak_ss_lru_init(void);
+        extern void hak_ss_prewarm_init(void);
+
+        hak_super_registry_init();
+        hak_ss_lru_init();
+
+        // Phase 11: Prewarm SuperSlabs to eliminate mmap/munmap churn
+        // ENV: HAKMEM_PREWARM_SUPERSLABS=<count> (e.g., 32, 128)
+        hak_ss_prewarm_init();
+    }
+
    if (__builtin_expect(route_enabled_runtime(), 0)) {
        tiny_debug_ring_record(TINY_RING_EVENT_ROUTE, (uint16_t)0xFFFFu, NULL, (uintptr_t)0x494E4954u);
    }
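
The hunk above only calls `hak_ss_prewarm_init()`; its body is not shown here. As a rough illustration of the "two-phase allocation with atomic bypass flag" named in the commit message, the following is a minimal sketch under stated assumptions, not the actual hakmem code. Every `sketch_`/`SKETCH_` name is hypothetical, the 2 MiB slab size and the 512-entry cap are assumed, `malloc` stands in for the real mmap-backed SuperSlab allocation, and the cache is single-threaded for brevity. The reading of "bypass" used here is that cache lookups are skipped while the prewarm batch is being built.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_SS_SIZE     ((size_t)2 << 20) /* assumed SuperSlab size: 2 MiB */
#define SKETCH_MAX_PREWARM 512               /* assumed upper bound on the count */

static atomic_int g_sketch_prewarm_bypass;   /* 1 while the prewarm batch is built */
static void *g_sketch_cache[SKETCH_MAX_PREWARM];
static size_t g_sketch_cached;               /* not thread-safe in this sketch */

/* Parse a HAKMEM_PREWARM_SUPERSLABS-style value. Unset, empty, or malformed
 * input yields 0 (prewarm disabled); large values are clamped to the cap. */
static size_t sketch_prewarm_count(const char *value) {
    if (value == NULL || *value < '0' || *value > '9') return 0;
    char *end = NULL;
    unsigned long n = strtoul(value, &end, 10);
    if (*end != '\0') return 0;              /* trailing junk such as "8x" */
    return n > SKETCH_MAX_PREWARM ? SKETCH_MAX_PREWARM : (size_t)n;
}

/* Hand out a SuperSlab: reuse a cached one unless the bypass flag is set,
 * otherwise allocate a fresh one (malloc stands in for mmap). */
static void *sketch_ss_acquire(void) {
    int bypass = atomic_load_explicit(&g_sketch_prewarm_bypass,
                                      memory_order_acquire);
    if (!bypass && g_sketch_cached > 0) {
        return g_sketch_cache[--g_sketch_cached];
    }
    return malloc(SKETCH_SS_SIZE);
}

/* Phase 1: with the bypass flag set, acquire `count` fresh SuperSlabs.
 * Phase 2: publish them to the cache and clear the flag.
 * Returns the number of SuperSlabs actually cached. */
static size_t sketch_prewarm(size_t count) {
    void *batch[SKETCH_MAX_PREWARM];
    size_t got = 0;

    if (count > SKETCH_MAX_PREWARM - g_sketch_cached) {
        count = SKETCH_MAX_PREWARM - g_sketch_cached;
    }
    atomic_store_explicit(&g_sketch_prewarm_bypass, 1, memory_order_release);
    while (got < count) {
        void *ss = sketch_ss_acquire();      /* always fresh: bypass is set */
        if (ss == NULL) break;               /* stop early on allocation failure */
        batch[got++] = ss;
    }
    for (size_t i = 0; i < got; i++) {
        g_sketch_cache[g_sketch_cached++] = batch[i];
    }
    assert(g_sketch_cached <= SKETCH_MAX_PREWARM);
    atomic_store_explicit(&g_sketch_prewarm_bypass, 0, memory_order_release);
    return got;
}
```

At startup such a routine would be driven by something like `sketch_prewarm(sketch_prewarm_count(getenv("HAKMEM_PREWARM_SUPERSLABS")))`, so an unset variable leaves prewarming off, matching the default of 0 stated in the commit message. Without the flag, phase 1 could pop slabs that an earlier prewarm had already cached and end up adding nothing new.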