## Summary

Eliminated all getenv() calls from malloc/free wrappers and allocator hot paths by implementing constructor-based environment variable caching. This achieves a 39-42% performance improvement (36s → 22s on sh8bench single-thread).

## Performance Impact

- sh8bench 1 thread: 35-36s → 21-22s (+39-42% improvement) 🚀
- sh8bench 8 threads: ~15s (maintained)
- getenv overhead: 36.32% → 0% (completely eliminated)

## Changes

### New Files

- **core/box/tiny_env_box.{c,h}**: Centralized environment variable cache for Tiny allocator
  - Caches 43 environment variables (HAKMEM_TINY_*, HAKMEM_SLL_*, HAKMEM_SS_*, etc.)
  - Constructor-based initialization with atomic CAS for thread safety
  - Inline accessor tiny_env_cfg() for hot path access
- **core/box/wrapper_env_box.{c,h}**: Environment cache for malloc/free wrappers
  - Caches 3 wrapper variables (HAKMEM_STEP_TRACE, HAKMEM_LD_SAFE, HAKMEM_FREE_WRAP_TRACE)
  - Constructor priority 101 ensures early initialization
  - Replaces all lazy-init patterns in wrapper code

### Modified Files

- **Makefile**: Added tiny_env_box.o and wrapper_env_box.o to OBJS_BASE and SHARED_OBJS
- **core/box/hak_wrappers.inc.h**:
  - Removed static lazy-init variables (g_step_trace, ld_safe_mode cache)
  - Replaced with wrapper_env_cfg() lookups (wcfg->step_trace, wcfg->ld_safe_mode)
  - All getenv() calls eliminated from malloc/free hot paths
- **core/hakmem.c**:
  - Added hak_ld_env_init() with constructor for LD_PRELOAD caching
  - Added hak_force_libc_ctor() for HAKMEM_FORCE_LIBC_ALLOC* caching
  - Simplified hak_ld_env_mode() to return cached value only
  - Simplified hak_force_libc_alloc() to use cached values
  - Eliminated all getenv/atoi calls from hot paths

## Technical Details

### Constructor Initialization Pattern

All environment variables are now read once at library load time using __attribute__((constructor)):

```c
__attribute__((constructor(101)))
static void wrapper_env_ctor(void) {
    wrapper_env_init_once();  // Atomic CAS ensures exactly-once init
}
```
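To illustrate why priority 101 gives early initialization, here is a minimal sketch assuming GCC or Clang on an ELF platform, where constructors with an explicit priority run before constructors without one (priorities 0-100 are reserved for the implementation). All names in it (`g_order`, `early_ctor`, `default_ctor`, `early_ctor_ran_first`) are hypothetical and do not come from the hakmem source.

```c
/* Hypothetical demo of constructor ordering; not hakmem code. */
static int g_order[2];   /* records the order in which constructors ran */
static int g_count = 0;  /* statically initialized before any constructor */

/* Priority 101 is the lowest value available to user code. */
__attribute__((constructor(101)))
static void early_ctor(void) {
    g_order[g_count++] = 101;
}

/* No explicit priority: runs after all prioritized constructors. */
__attribute__((constructor))
static void default_ctor(void) {
    g_order[g_count++] = 0;
}

/* Returns 1 if the priority-101 constructor ran before the default one. */
int early_ctor_ran_first(void) {
    return g_count == 2 && g_order[0] == 101 && g_order[1] == 0;
}
```

Calling `early_ctor_ran_first()` from `main()` should report that the prioritized constructor ran first, which is the property the wrapper cache appears to rely on so that its values are ready before other constructors call malloc.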
### Thread Safety

- Atomic compare-and-swap (CAS) ensures single initialization
- Spin-wait for initialization completion in multi-threaded scenarios
- Memory barriers (memory_order_acq_rel) ensure visibility

### Hot Path Impact

- Before: Every malloc/free → getenv("LD_PRELOAD") + getenv("HAKMEM_STEP_TRACE") + ...
- After: Every malloc/free → Single pointer dereference (wcfg->field)

## Next Optimization Target (Phase 1-2)

Perf analysis reveals libc fallback accounts for ~51% of cycles:

- _int_malloc: 15.04%
- malloc: 9.81%
- _int_free: 10.07%
- malloc_consolidate: 9.27%
- unlink_chunk: 6.82%

Reducing libc fallback from 51% → 10% could yield additional +25-30% improvement.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: ChatGPT <chatgpt@openai.com>
26 lines
806 B
C
// wrapper_env_box.h - Environment variable cache for malloc/free wrappers
// Eliminates getenv() calls from malloc/free hot paths
#pragma once

#include <stdatomic.h>

typedef struct {
    int inited;
    int step_trace;       // HAKMEM_STEP_TRACE (default: 0)
    int ld_safe_mode;     // HAKMEM_LD_SAFE (default: 1)
    int free_wrap_trace;  // HAKMEM_FREE_WRAP_TRACE (default: 0)
} wrapper_env_cfg_t;

extern wrapper_env_cfg_t g_wrapper_env;

void wrapper_env_init_once(void);

static inline const wrapper_env_cfg_t* wrapper_env_cfg(void) {
    // Constructor ensures init at library load time
    // This check prevents repeated initialization in multi-threaded context
    if (__builtin_expect(!g_wrapper_env.inited, 0)) {
        wrapper_env_init_once();
    }
    return &g_wrapper_env;
}
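The header only declares `wrapper_env_init_once()`; its body lives in wrapper_env_box.c, which is not shown here. The following is a minimal sketch of how an exactly-once initializer with CAS and a spin-wait might look, based only on the description in the commit message. The `*_sketch` names, the three-state flag, the `env_int` helper, and the `g_init_runs` counter are assumptions made for illustration, not the actual implementation.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical mirror of wrapper_env_cfg_t; field defaults follow the header. */
typedef struct {
    int step_trace;       /* HAKMEM_STEP_TRACE (default: 0) */
    int ld_safe_mode;     /* HAKMEM_LD_SAFE (default: 1) */
    int free_wrap_trace;  /* HAKMEM_FREE_WRAP_TRACE (default: 0) */
} env_cfg_sketch_t;

static env_cfg_sketch_t g_cfg;
static atomic_int g_state;      /* assumed states: 0 = uninit, 1 = in progress, 2 = ready */
static atomic_int g_init_runs;  /* illustration only: counts runs of the init body */

/* Parse an integer environment variable, falling back to a default. */
static int env_int(const char *name, int dflt) {
    const char *v = getenv(name);
    return (v && *v) ? atoi(v) : dflt;
}

void env_cfg_sketch_init_once(void) {
    int expected = 0;
    /* Only the thread that wins the 0 -> 1 transition reads the environment. */
    if (atomic_compare_exchange_strong_explicit(&g_state, &expected, 1,
                                                memory_order_acq_rel,
                                                memory_order_acquire)) {
        g_cfg.step_trace      = env_int("HAKMEM_STEP_TRACE", 0);
        g_cfg.ld_safe_mode    = env_int("HAKMEM_LD_SAFE", 1);
        g_cfg.free_wrap_trace = env_int("HAKMEM_FREE_WRAP_TRACE", 0);
        atomic_fetch_add(&g_init_runs, 1);
        /* Release store publishes the filled-in fields to other threads. */
        atomic_store_explicit(&g_state, 2, memory_order_release);
        return;
    }
    /* Losers spin until the winner has published the ready state. */
    while (atomic_load_explicit(&g_state, memory_order_acquire) != 2) {
        /* busy-wait */
    }
}

/* Hot-path accessor: one acquire load, then a plain pointer return. */
static inline const env_cfg_sketch_t *env_cfg_sketch(void) {
    if (__builtin_expect(atomic_load_explicit(&g_state, memory_order_acquire) != 2, 0)) {
        env_cfg_sketch_init_once();
    }
    return &g_cfg;
}
```

In this sketch the accessor reads the flag with an atomic acquire load rather than a plain `int`, so a thread that sees the ready state also sees the cached values. Whether the real code needs that depends on how `inited` is written in wrapper_env_box.c.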