Phase 26: Front Gate Unification - Tiny allocator fast path (+12.9%)

Implementation:
- New single-layer malloc/free path for Tiny (≤1024B) allocations
- Bypasses 3-layer overhead: malloc → hak_alloc_at (236 lines) → wrapper → tiny_alloc_fast
- Leverages Phase 23 Unified Cache (tcache-style, 2-3 cache misses)
- Safe fallback to normal path on Unified Cache miss

Performance (Random Mixed 256B, 100K iterations):
- Baseline (Phase 26 OFF): 11.33M ops/s
- Phase 26 ON: 12.79M ops/s (+12.9%)
- Prediction (ChatGPT): +10-15% → Actual: +12.9% (perfect match!)

Bug fixes:
- Initialization bug: Added hak_init() call before fast path
- Page boundary SEGV: Added guard for offset_in_page == 0

Also includes Phase 23 debug log fixes:
- Guard C2_CARVE logs with #if !HAKMEM_BUILD_RELEASE
- Guard prewarm logs with #if !HAKMEM_BUILD_RELEASE
- Set Hot_2048 as default capacity (C2/C3=2048, others=64)

Files:
- core/front/malloc_tiny_fast.h: Phase 26 implementation (145 lines)
- core/box/hak_wrappers.inc.h: Fast path integration (+28 lines)
- core/front/tiny_unified_cache.h: Hot_2048 default
- core/tiny_refill_opt.h: C2_CARVE log guard
- core/box/ss_hot_prewarm_box.c: Prewarm log guard
- CURRENT_TASK.md: Phase 26 completion documentation

ENV variables:
- HAKMEM_FRONT_GATE_UNIFIED=1 (enable Phase 26, default: OFF)
- HAKMEM_TINY_UNIFIED_CACHE=1 (Phase 23, required)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
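
The ENV variables listed in the commit message are opt-in switches. A minimal sketch of how such a switch could be read once and cached is shown below. The names `parse_env_flag` and `front_gate_unified_enabled` are hypothetical and do not come from the commit, and the rule that a leading '0' means "off" is an assumption; only the variable name `HAKMEM_FRONT_GATE_UNIFIED` and its default of OFF come from the commit message.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the commit): interpret an environment
 * value as a boolean. NULL, an empty string, or a leading '0' count as
 * disabled, which matches the "default: OFF" behaviour described above. */
static int parse_env_flag(const char *value) {
    return (value && *value && *value != '0') ? 1 : 0;
}

/* Hypothetical gate: reads HAKMEM_FRONT_GATE_UNIFIED on the first call
 * and caches the result so later calls avoid getenv().
 * Note: the cache is not synchronized; this sketch assumes the first
 * call happens before other threads are started. */
static int front_gate_unified_enabled(void) {
    static int g_enable = -1;   /* -1 means "not read yet" */
    if (g_enable < 0) {
        g_enable = parse_env_flag(getenv("HAKMEM_FRONT_GATE_UNIFIED"));
    }
    return g_enable;
}
```

Because Phase 26 is described as requiring the Phase 23 Unified Cache, a real gate would presumably also check `HAKMEM_TINY_UNIFIED_CACHE` in the same way before taking the fast path.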
@@ -77,18 +77,27 @@ static inline int unified_cache_enabled(void) {
     return g_enable;
 }
 
-// Per-class capacity (default: 128 for all classes)
+// Per-class capacity (default: Hot_2048 strategy - optimized for 256B workload)
+// Phase 23 Capacity Optimization Result: Hot_2048 = 14.63M ops/s (+43% vs baseline)
+// Hot classes (C2/C3: 128B/256B) get 2048 slots, others get 64 slots
 static inline size_t unified_capacity(int class_idx) {
     static size_t g_cap[TINY_NUM_CLASSES] = {0};
     if (__builtin_expect(g_cap[class_idx] == 0, 0)) {
         char env_name[64];
         snprintf(env_name, sizeof(env_name), "HAKMEM_TINY_UNIFIED_C%d", class_idx);
         const char* e = getenv(env_name);
-        g_cap[class_idx] = (e && *e) ? (size_t)atoi(e) : 128;  // Default: 128
+
+        // Default: Hot_2048 strategy (C2/C3=2048, others=64)
+        size_t default_cap = 64;  // Cold classes
+        if (class_idx == 2 || class_idx == 3) {
+            default_cap = 2048;  // Hot classes (128B, 256B)
+        }
+
+        g_cap[class_idx] = (e && *e) ? (size_t)atoi(e) : default_cap;
 
         // Round up to power of 2 (for fast modulo)
         if (g_cap[class_idx] < 32) g_cap[class_idx] = 32;
-        if (g_cap[class_idx] > 512) g_cap[class_idx] = 512;
+        if (g_cap[class_idx] > 4096) g_cap[class_idx] = 4096;  // Increased limit for Hot_2048
 
         // Ensure power of 2
         size_t pow2 = 32;
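
The hunk is cut off right after `size_t pow2 = 32;`, so the rounding step itself is not visible here. As an illustration only, the following self-contained sketch shows one way the default selection, the clamp to [32, 4096], and a round-up to the next power of two could fit together. The function names and the `CAP_MIN`/`CAP_MAX` constants are hypothetical, and the rounding loop is an assumption rather than the commit's actual code. The sketch also deliberately differs from the diff in one respect: it parses with `strtol` and falls back to the default for non-positive values, whereas the diff casts the result of `atoi` directly.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bounds mirroring the clamp values visible in the diff. */
static const size_t CAP_MIN = 32;
static const size_t CAP_MAX = 4096;

/* Default per-class capacity as described in the commit:
 * classes 2 and 3 (128B/256B) are hot and get 2048 slots, others get 64. */
static size_t default_capacity(int class_idx) {
    return (class_idx == 2 || class_idx == 3) ? 2048 : 64;
}

/* Clamp to [CAP_MIN, CAP_MAX], then round up to the next power of two.
 * CAP_MAX is itself a power of two, so the loop always terminates
 * at or below CAP_MAX. */
static size_t normalize_capacity(size_t cap) {
    if (cap < CAP_MIN) cap = CAP_MIN;
    if (cap > CAP_MAX) cap = CAP_MAX;
    size_t pow2 = CAP_MIN;
    while (pow2 < cap) pow2 <<= 1;
    return pow2;
}

/* Resolve the capacity from an optional override string, such as the
 * value of HAKMEM_TINY_UNIFIED_C<n>. NULL, empty, or non-positive input
 * falls back to the class default (a choice made for this sketch). */
static size_t capacity_from_env_value(const char *value, int class_idx) {
    size_t cap = default_capacity(class_idx);
    if (value && *value) {
        long parsed = strtol(value, NULL, 10);
        if (parsed > 0) cap = (size_t)parsed;
    }
    return normalize_capacity(cap);
}
```

Keeping the capacity a power of two is what allows the "fast modulo" mentioned in the code comment: a ring index can be wrapped with `index & (cap - 1)` instead of `index % cap`.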