## Phase 2 Optimization Research Complete

### B1 (Header tax reduction v2) - NO-GO
- HAKMEM_TINY_HEADER_MODE=LIGHT: -2.54% regression on Mixed
- Decision: FREEZE as research box (ENV opt-in only)

### B3 (Routing branch shape optimization) - ADOPT
- Mixed: +2.89% (48.41M → 49.80M ops/s)
- C6-heavy: +9.13% (8.97M → 9.79M ops/s)
- Strategy: LIKELY on LEGACY (hot), noinline,cold helper for rare routes
- Implementation: Already in malloc_tiny_fast.h:252-267
- Profile updates: HAKMEM_TINY_ALLOC_ROUTE_SHAPE=1 now default

### B4 (Wrapper Layer Hot/Cold Split) - Preparation
- Design memo: docs/analysis/PHASE2_B4_WRAPPER_SHAPE_1_DESIGN.md
- Goal: Split malloc/free into hot/cold paths, reduce I-cache pressure
- ENV gate: HAKMEM_WRAP_SHAPE=0/1 (added to wrapper_env_box)
- Expected gain: +2-5% Mixed, +1-3% C6-heavy

## Analysis Summary
- Background is visible: FREE DUALHOT + B3 routing optimizations work
- Code layering is clean: winning boxes promoted to presets, losing boxes frozen with ENV guards
- Remaining gap to mimalloc is wrapper layer + safety checks + policy snapshot
- Further +5-10% still realistically achievable

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
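
The B3 strategy described in the commit message (a LIKELY hint on the hot LEGACY route, with rare routes pushed into a `noinline,cold` helper) might look roughly like the sketch below. This is only an illustration of the branch shape, not the code in malloc_tiny_fast.h: every name here (`SKETCH_LIKELY`, `sketch_route_t`, `sketch_alloc`, `sketch_alloc_cold`, `g_cold_calls`) is hypothetical, plain `malloc` stands in for the real fast path, and the attributes assume GCC or Clang.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; not taken from malloc_tiny_fast.h. */
#define SKETCH_LIKELY(x) __builtin_expect(!!(x), 1)

typedef enum { ROUTE_LEGACY = 0, ROUTE_RARE_A, ROUTE_RARE_B } sketch_route_t;

static int g_cold_calls = 0; /* counts cold-helper entries, for illustration only */

/* Rare routes live out of line so the hot caller stays small. */
__attribute__((noinline, cold))
static void* sketch_alloc_cold(sketch_route_t route, size_t size) {
    g_cold_calls++;
    (void)route; /* a real allocator would dispatch per route here */
    return malloc(size);
}

static inline void* sketch_alloc(sketch_route_t route, size_t size) {
    /* Hint that the LEGACY route is the common case. */
    if (SKETCH_LIKELY(route == ROUTE_LEGACY)) {
        return malloc(size); /* stand-in for the hot fast path */
    }
    return sketch_alloc_cold(route, size);
}
```

Calling `sketch_alloc(ROUTE_LEGACY, 16)` never enters the cold helper, while any other route does. Because the helper is `noinline`, its body stays out of the hot caller, and `cold` marks it as unlikely to run and lets the compiler place it apart from hot code.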
57 lines
1.9 KiB
C
#define _GNU_SOURCE
#include "wrapper_env_box.h"
#include <stdlib.h>
#include <string.h>

wrapper_env_cfg_t g_wrapper_env = {.inited = 0, .step_trace = 0, .ld_safe_mode = 1, .free_wrap_trace = 0, .wrap_diag = 0, .wrap_shape = 0};

static inline int env_flag(const char* name, int def) {
    const char* e = getenv(name);
    if (!e || *e == '\0') return def;
    return (*e != '0');
}

static inline int env_int(const char* name, int def) {
    const char* e = getenv(name);
    if (!e || *e == '\0') return def;
    char* end;
    long val = strtol(e, &end, 10);
    return (end != e) ? (int)val : def;
}

void wrapper_env_init_once(void) {
    // Atomic CAS to ensure exactly-once initialization
    static _Atomic int init_started = 0;
    int expected = 0;

    if (!atomic_compare_exchange_strong_explicit(&init_started, &expected, 1,
                                                 memory_order_acq_rel,
                                                 memory_order_relaxed)) {
        // Someone else is initializing or already initialized
        // Spin until they're done
        while (!__builtin_expect(g_wrapper_env.inited, 1)) {
            __builtin_ia32_pause();
        }
        return;
    }

    // We own the initialization
    g_wrapper_env.step_trace = env_flag("HAKMEM_STEP_TRACE", 0);
    g_wrapper_env.ld_safe_mode = env_int("HAKMEM_LD_SAFE", 1);
    g_wrapper_env.free_wrap_trace = env_flag("HAKMEM_FREE_WRAP_TRACE", 0);
    g_wrapper_env.wrap_diag = env_flag("HAKMEM_WRAP_DIAG", 0);
    g_wrapper_env.wrap_shape = env_flag("HAKMEM_WRAP_SHAPE", 0);

    // Publish the inited flag last, with release ordering
    atomic_store_explicit(&g_wrapper_env.inited, 1, memory_order_release);
}

__attribute__((constructor(101)))
static void wrapper_env_ctor(void) {
    // Priority 101 is the earliest available to user code (0-100 are reserved for the implementation)
    // This runs initialization before default-priority constructors and main()
    if (!g_wrapper_env.inited) {
        wrapper_env_init_once();
    }
}
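
The parsing rules used by `env_flag()` and `env_int()` above have a few edge cases that are easy to miss, so the sketch below restates them as pure functions that take the raw string instead of calling `getenv`. The names `parse_flag` and `parse_int` are hypothetical and are not part of wrapper_env_box; the logic is copied from the two helpers in this file.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pure-function restatement of env_flag(): only the first
 * character is inspected, and anything other than '0' counts as enabled. */
static int parse_flag(const char* value, int def) {
    if (!value || *value == '\0') return def; /* unset or empty: keep default */
    return (*value != '0');
}

/* Hypothetical pure-function restatement of env_int(): strtol() accepts a
 * leading number and ignores trailing text; no digits at all keeps the default. */
static int parse_int(const char* value, int def) {
    if (!value || *value == '\0') return def;
    char* end;
    long val = strtol(value, &end, 10);
    return (end != value) ? (int)val : def;
}
```

Under these rules `HAKMEM_WRAP_SHAPE=false` enables the flag (its first character is not `'0'`), while `HAKMEM_WRAP_SHAPE=01` disables it. Likewise `parse_int("12abc", 7)` yields 12, whereas `parse_int("abc", 7)` falls back to 7.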