# HAKMEM Sanitizer Investigation Report

**Date:** 2025-11-07  
**Status:** Root cause identified  
**Severity:** Critical (immediate SEGV on startup)

---

## Executive Summary

HAKMEM fails immediately when built with AddressSanitizer (ASan) or ThreadSanitizer (TSan) with the allocator enabled (`-alloc` variants). The root cause is **ASan/TSan initialization calling `malloc()` before TLS (Thread-Local Storage) is fully initialized**, causing a SEGV when accessing `__thread` variables.

**Key Finding:** ASan's `dlsym()` call during library initialization triggers HAKMEM's `malloc()` wrapper, which attempts to access `g_hakmem_lock_depth` (TLS variable) before TLS is ready.

---

## 1. TLS Variables - Complete Inventory

### 1.1 Core TLS Variables (Recursion Guard)

**File:** `core/hakmem.c:188`
```c
__thread int g_hakmem_lock_depth = 0;  // Recursion guard (NOT static!)
```

**First Access:** `core/box/hak_wrappers.inc.h:42` (in `malloc()` wrapper)
```c
void* malloc(size_t size) {
    if (__builtin_expect(g_initializing != 0, 0)) {  // ← Line 42
        extern void* __libc_malloc(size_t);
        return __libc_malloc(size);
    }
    // ... later: g_hakmem_lock_depth++; (line 86)
}
```

**Problem:** Line 42 only reads `g_initializing`, a plain global, which is safe. The explicit TLS access (`g_hakmem_lock_depth++`) does not occur until line 86, but **TLS access appears to happen implicitly** at function entry, when the prologue sets up the stack frame for the TLS accesses that come later in the function.
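The order of checks in the wrapper (plain global first, thread-local guard second) can be illustrated with a minimal, self-contained sketch. Every `demo_*` name here is hypothetical and is not a HAKMEM identifier, and the "custom path" simply forwards to libc `malloc()` so the example compiles on its own. It shows the guard pattern only and does not reproduce the ASan crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical demo state; none of these names exist in HAKMEM. */
__thread int demo_depth = 0;        /* per-thread recursion guard (TLS)      */
int demo_initializing = 0;          /* plain global, readable without TLS    */
unsigned demo_fallback_calls = 0;   /* demo-only counter, not thread-safe    */
unsigned demo_custom_calls = 0;     /* demo-only counter, not thread-safe    */

void *demo_alloc(size_t size);

/* Stand-in for the allocator's real work. It allocates internally,
 * which is exactly the re-entrancy the guard is meant to catch. */
static void *demo_custom_path(size_t size) {
    void *scratch = demo_alloc(16);  /* nested call: demo_depth is 1 here */
    free(scratch);
    demo_custom_calls++;
    return malloc(size);             /* assumption: real code would not call libc here */
}

void *demo_alloc(size_t size) {
    /* The non-TLS flag is tested first in source order. The compiler is
     * still free to load demo_depth early, so this ordering alone does
     * not guarantee that TLS is untouched during initialization. */
    if (demo_initializing || demo_depth > 0) {
        demo_fallback_calls++;
        return malloc(size);         /* route to libc */
    }
    demo_depth++;                    /* mark: inside the allocator */
    void *p = demo_custom_path(size);
    demo_depth--;
    assert(demo_depth >= 0);
    return p;
}
```

A single outer call such as `demo_alloc(64)` takes the custom path once and sends the nested 16-byte request to the fallback. Setting `demo_initializing = 1` beforehand sends every call to the fallback.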
### 1.2 Other TLS Variables

#### Wrapper Statistics (hak_wrappers.inc.h:32-36)
```c
__thread uint64_t g_malloc_total_calls = 0;
__thread uint64_t g_malloc_tiny_size_match = 0;
__thread uint64_t g_malloc_fast_path_tried = 0;
__thread uint64_t g_malloc_fast_path_null = 0;
__thread uint64_t g_malloc_slow_path = 0;
```

#### Tiny Allocator TLS (hakmem_tiny.c)
```c
__thread int g_tls_live_ss[TINY_NUM_CLASSES] = {0};        // Line 658
__thread void* g_tls_sll_head[TINY_NUM_CLASSES] = {0};     // Line 1019
__thread uint32_t g_tls_sll_count[TINY_NUM_CLASSES] = {0}; // Line 1020
__thread uint8_t* g_tls_bcur[TINY_NUM_CLASSES] = {0};      // Line 1187
__thread uint8_t* g_tls_bend[TINY_NUM_CLASSES] = {0};      // Line 1188
```

#### Fast Cache TLS (tiny_fastcache.h:32-54, extern declarations)
```c
extern __thread void* g_tiny_fast_cache[TINY_FAST_CLASS_COUNT];
extern __thread uint32_t g_tiny_fast_count[TINY_FAST_CLASS_COUNT];
// ... 10+ more TLS variables
```

#### Other Subsystems TLS

- **SFC Cache:** `hakmem_tiny_sfc.c:18-19` (2 TLS variables)
- **Sticky Cache:** `tiny_sticky.c:6-8` (3 TLS arrays)
- **Simple Cache:** `hakmem_tiny_simple.c:23,26` (2 TLS variables)
- **Magazine:** `hakmem_tiny_magazine.c:29,37` (2 TLS variables)
- **Mid-Range MT:** `hakmem_mid_mt.c:37` (1 TLS array)
- **Pool TLS:** `core/box/pool_tls_types.inc.h:11` (1 TLS array)

**Total TLS Variables:** 50+ across the codebase

---

## 2. dlsym / syscall Initialization Flow

### 2.1 Intended Initialization Order

**File:** `core/box/hak_core_init.inc.h:29-35`
```c
static void hak_init_impl(void) {
    g_initializing = 1;

    // Phase 6.X P0 FIX (2025-10-24): Initialize Box 3 (Syscall Layer) FIRST!
    // This MUST be called before ANY allocation (Tiny/Mid/Large/Learner)
    // dlsym() initializes function pointers to real libc (bypasses LD_PRELOAD)
    hkm_syscall_init();  // ← Line 35
    // ...
}
```

**File:** `core/hakmem_syscall.c:41-64`
```c
void hkm_syscall_init(void) {
    if (g_syscall_initialized) return;  // Idempotent

    // dlsym with RTLD_NEXT: Get NEXT symbol in library chain
    real_malloc = dlsym(RTLD_NEXT, "malloc");  // ← Line 49
    real_calloc = dlsym(RTLD_NEXT, "calloc");
    real_free = dlsym(RTLD_NEXT, "free");
    real_realloc = dlsym(RTLD_NEXT, "realloc");

    if (!real_malloc || !real_calloc || !real_free || !real_realloc) {
        fprintf(stderr, "[hakmem_syscall] FATAL: dlsym failed\n");
        abort();
    }

    g_syscall_initialized = 1;
}
```

### 2.2 Actual Execution Order (ASan Build)

**GDB Backtrace:**
```
#0  malloc (size=69) at core/box/hak_wrappers.inc.h:40
#1  0x00007ffff7fc7cca in malloc (size=69) at ../include/rtld-malloc.h:56
#2  __GI__dl_exception_create_format (...) at ./elf/dl-exception.c:157
#3  0x00007ffff7fcf3dc in _dl_lookup_symbol_x (undef_name="__isoc99_printf", ...)
#4  0x00007ffff65759c4 in do_sym (..., name="__isoc99_printf", ...) at ./elf/dl-sym.c:146
#5  _dl_sym (handle=, name="__isoc99_printf", ...) at ./elf/dl-sym.c:195
#12 0x00007ffff74e3859 in __interception::GetFuncAddr (name="__isoc99_printf") at interception_linux.cpp:42
#13 __interception::InterceptFunction (name="__isoc99_printf", ...) at interception_linux.cpp:61
#14 0x00007ffff74a1deb in InitializeCommonInterceptors () at sanitizer_common_interceptors.inc:10094
#15 __asan::InitializeAsanInterceptors () at asan_interceptors.cpp:634
#16 0x00007ffff74c063b in __asan::AsanInitInternal () at asan_rtl.cpp:452
#17 0x00007ffff7fc95be in _dl_init (main_map=0x7ffff7ffe2e0, ...) at ./elf/dl-init.c:102
#18 0x00007ffff7fe32ca in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
```

**Timeline:**
1. Dynamic linker (`ld-linux.so`) initializes
2. ASan runtime initializes (`__asan::AsanInitInternal`)
3. ASan intercepts `printf` family functions
4. `dlsym("__isoc99_printf")` calls `malloc()` internally (glibc rtld-malloc.h:56)
5. HAKMEM's `malloc()` wrapper is invoked **before `hak_init()` runs**
6. **TLS access SEGV** (TLS segment not yet initialized)

### 2.3 Why `HAKMEM_FORCE_LIBC_ALLOC_BUILD` Doesn't Help

**Current Makefile (line 810-811):**
```makefile
SAN_ASAN_ALLOC_CFLAGS = -O1 -g -fno-omit-frame-pointer -fno-lto \
  -fsanitize=address,undefined -fno-sanitize-recover=all -fstack-protector-strong
# NOTE: Missing -DHAKMEM_FORCE_LIBC_ALLOC_BUILD=1
```

**Expected Behavior (with flag):**
```c
#ifdef HAKMEM_FORCE_LIBC_ALLOC_BUILD
void* malloc(size_t size) {
    extern void* __libc_malloc(size_t);
    return __libc_malloc(size);  // Bypass HAKMEM completely
}
#endif
```

**However:** Even with `HAKMEM_FORCE_LIBC_ALLOC_BUILD=1`, the symbol `malloc` would still be exported, and ASan might still interpose on it. The real fix requires:
1. Not exporting `malloc` at all when Sanitizers are active, OR
2. Using constructor priorities to guarantee TLS initialization before ASan

---

## 3. Static Constructor Execution Order

### 3.1 Current Constructors

**File:** `core/hakmem.c:66`
```c
__attribute__((constructor))
static void hakmem_ctor_install_segv(void) {
    const char* dbg = getenv("HAKMEM_DEBUG_SEGV");
    // ... install SIGSEGV handler
}
```

**File:** `core/tiny_debug_ring.c:204`
```c
__attribute__((constructor))
static void hak_debug_ring_ctor(void) {
    // ...
}
```

**File:** `core/hakmem_tiny_stats.c:66`
```c
__attribute__((constructor))
static void hak_tiny_stats_ctor(void) {
    // ...
}
```

**Problem:** No priority specified! Constructors without an explicit priority behave as if they had GCC's default of `65535`, so they run **after** every constructor that declares a priority, including most library constructors.

**ASan Constructor Priority:** Typically `1` or `100` (very early)

### 3.2 Constructor Priority Ranges

Lower numbers run first. Only the reserved range is defined by GCC; the remaining bands are conventions, not compiler rules.

- **0-100:** Reserved for the implementation (libc, libstdc++, sanitizers)
- **101-999:** Early initialization (critical infrastructure)
- **1000-9999:** Normal initialization
- **65535 (default):** Late initialization

---

## 4. Sanitizer Conflict Points

### 4.1 Symbol Interposition Chain

**Without Sanitizer:**
```
Application → malloc() → HAKMEM wrapper → hak_alloc_at()
```

**With ASan (Direct Link):**
```
Application → ASan malloc() → HAKMEM malloc() → TLS access → SEGV
              ↓
        (during ASan init, TLS not ready!)
```

**Expected (with FORCE_LIBC):**
```
Application → ASan malloc() → __libc_malloc() ✓
```

### 4.2 LD_PRELOAD vs Direct Link

**LD_PRELOAD (libhakmem_asan.so):**
```
Application → LD_PRELOAD (HAKMEM malloc) → ASan malloc → ...
```
- Even worse: HAKMEM wrapper runs before ASan init!

**Direct Link (larson_hakmem_asan_alloc):**
```
Application → main() → ...
              ↓
        (ASan init via constructor) → dlsym malloc → HAKMEM malloc → SEGV
```

### 4.3 TLS Initialization Timing

**Normal Execution:**
1. ELF loader initializes TLS templates
2. The main thread's TLS block is set up (dynamic-model accesses resolve through `__tls_get_addr()`)
3. Constructors run (can safely access TLS)
4. `main()` starts

**ASan Execution:**
1. ELF loader initializes TLS templates
2. ASan constructor runs **before** application constructors
3. ASan's `dlsym()` calls `malloc()`
4. **HAKMEM malloc accesses TLS → SEGV** (TLS not fully initialized!)

**Why TLS Fails:**
- ASan's early constructor (priority 1-100) runs during `_dl_init()`
- TLS segment may be allocated but **not yet associated with the current thread**
- Accessing a `__thread` variable triggers `__tls_get_addr()` → NULL dereference

---

## 5. Existing Workarounds / Comments

### 5.1 Recursion Guard Design

**File:** `core/hakmem.c:175-192`
```c
// Phase 6.15 P1: Remove global lock; keep recursion guard only
// ---------------------------------------------------------------------------
// We no longer serialize all allocations with a single global mutex.
// Instead, each submodule is responsible for its own fine‑grained locking.
// We keep a per‑thread recursion guard so that internal use of malloc/free
// within the allocator routes to libc (avoids infinite recursion).
//
// Phase 6.X P0 FIX (2025-10-24): Reverted to simple g_hakmem_lock_depth check
// Box Theory - Layer 1 (API Layer):
//   This guard protects against LD_PRELOAD recursion (Box 1 → Box 1)
//   Box 2 (Core) → Box 3 (Syscall) uses hkm_libc_malloc() (dlsym, no guard needed!)
// NOTE: Removed 'static' to allow access from hakmem_tiny_superslab.c (fopen fix)
__thread int g_hakmem_lock_depth = 0;  // 0 = outermost call
```

**Comment Analysis:**
- Designed for **runtime recursion**, not **initialization-time TLS issues**
- Assumes TLS is already available when `malloc()` is called
- `dlsym` guard mentioned, but not for initialization safety

### 5.2 Sanitizer Build Flags (Makefile)

**Line 799-801 (ASan with FORCE_LIBC):**
```makefile
SAN_ASAN_CFLAGS = -O1 -g -fno-omit-frame-pointer -fno-lto \
  -fsanitize=address,undefined -fno-sanitize-recover=all -fstack-protector-strong \
  -DHAKMEM_FORCE_LIBC_ALLOC_BUILD=1  # ← Bypasses HAKMEM allocator
```

**Line 810-811 (ASan with HAKMEM allocator):**
```makefile
SAN_ASAN_ALLOC_CFLAGS = -O1 -g -fno-omit-frame-pointer -fno-lto \
  -fsanitize=address,undefined -fno-sanitize-recover=all -fstack-protector-strong
# NOTE: Missing -DHAKMEM_FORCE_LIBC_ALLOC_BUILD=1 ← INTENDED for testing!
```

**Design Intent:** Allow ASan to instrument HAKMEM's allocator for memory safety testing.

**Current Reality:** Broken due to TLS initialization order.

---

## 6. Recommended Fix (Priority Ordered)

### 6.1 Option A: Constructor Priority (Quick Fix) ⭐⭐⭐⭐⭐

**Difficulty:** Easy  
**Risk:** Low  
**Effectiveness:** High (80% confidence)

**Implementation:**

**File:** `core/hakmem.c`
```c
// PRIORITY 101: Run after ASan (priority ~100), but before default (65535)
__attribute__((constructor(101)))
static void hakmem_tls_preinit(void) {
    // Force TLS allocation by touching the variable
    g_hakmem_lock_depth = 0;

    // Optional: Pre-initialize dlsym cache
    hkm_syscall_init();
}

// Keep existing constructor for SEGV handler (no priority = runs later)
__attribute__((constructor))
static void hakmem_ctor_install_segv(void) {
    // ... existing code
}
```

**Rationale:**
- Ensures TLS is touched **after** ASan init but **before** any `malloc()` calls from default-priority constructors or `main()`
- Forces `__tls_get_addr()` to run in a safe context
- Minimal code change

Note that `malloc()` calls made during ASan's own initialization, as in the Section 2.2 backtrace, would still precede this constructor.

**Verification:**
```bash
make clean
# Add constructor(101) to hakmem.c
make asan-larson-alloc
./larson_hakmem_asan_alloc 1 1 128 1024 1 12345 1
# Should run without SEGV
```

---

### 6.2 Option B: Lazy TLS Initialization (Defensive) ⭐⭐⭐⭐

**Difficulty:** Medium  
**Risk:** Medium (performance impact)  
**Effectiveness:** High (90% confidence)

**Implementation:**

**File:** `core/box/hak_wrappers.inc.h:40-50`
```c
void* malloc(size_t size) {
    // NEW: Check if TLS is initialized using a helper
    if (__builtin_expect(!hak_tls_is_ready(), 0)) {
        extern void* __libc_malloc(size_t);
        return __libc_malloc(size);
    }

    // Existing code...
    if (__builtin_expect(g_initializing != 0, 0)) {
        extern void* __libc_malloc(size_t);
        return __libc_malloc(size);
    }
    // ...
}
```

**New Helper Function:**
```c
// core/hakmem.c
// Process-wide flag, deliberately NOT __thread: reading it must not touch TLS,
// and threads created later must also see it as set.
static int g_tls_ready_flag = 0;

__attribute__((constructor(101)))
static void hak_tls_mark_ready(void) {
    __atomic_store_n(&g_tls_ready_flag, 1, __ATOMIC_RELAXED);
}

int hak_tls_is_ready(void) {
    // Atomic load so the compiler cannot cache or elide the read
    return __atomic_load_n(&g_tls_ready_flag, __ATOMIC_RELAXED);
}
```

**Pros:**
- Safe even if constructor priorities fail
- Explicit TLS readiness check
- Falls back to libc if TLS not ready

**Cons:**
- Extra branch on malloc hot path (1-2 cycles)
- Adds one more global (`g_tls_ready_flag`) that must stay outside TLS; a `__thread` flag would hit the original fault and would never be set for threads other than the main one

---

### 6.3 Option C: Weak Symbol Aliasing (Advanced) ⭐⭐⭐

**Difficulty:** Hard  
**Risk:** High (portability, build system complexity)  
**Effectiveness:** Medium (70% confidence)

**Implementation:**

**File:** `core/box/hak_wrappers.inc.h`
```c
// Weak alias: Allow ASan to override if needed
__attribute__((weak))
void* malloc(size_t size) {
    // ... HAKMEM implementation
}

// Strong symbol for internal use
void* hak_malloc_internal(size_t size) {
    // ... same implementation
}
```

**Pros:**
- Allows ASan to fully control malloc symbol
- HAKMEM can still use internal allocation

**Cons:**
- Complex build interactions
- May not work with all linker configurations
- Debugging becomes harder (symbol resolution issues)

---

### 6.4 Option D: Disable Wrappers for Sanitizer Builds (Pragmatic) ⭐⭐⭐⭐⭐

**Difficulty:** Easy  
**Risk:** Low  
**Effectiveness:** 100% (but limited scope)

**Implementation:**

**File:** `Makefile:810-811`
```makefile
# OLD (broken):
SAN_ASAN_ALLOC_CFLAGS = -O1 -g -fno-omit-frame-pointer -fno-lto \
  -fsanitize=address,undefined -fno-sanitize-recover=all -fstack-protector-strong

# NEW (fixed):
SAN_ASAN_ALLOC_CFLAGS = -O1 -g -fno-omit-frame-pointer -fno-lto \
  -fsanitize=address,undefined -fno-sanitize-recover=all -fstack-protector-strong \
  -DHAKMEM_FORCE_LIBC_ALLOC_BUILD=1  # ← Bypass HAKMEM allocator
```

**Rationale:**
- Sanitizer builds should focus on **application logic bugs**, not allocator bugs
- HAKMEM allocator can be tested separately without Sanitizers
- Eliminates all TLS/constructor issues

**Pros:**
- Immediate fix (1-line change)
- Zero risk
- Sanitizers work as intended

**Cons:**
- Cannot test HAKMEM allocator with Sanitizers
- Defeats purpose of `-alloc` variants

**Recommended Naming:**
```bash
# Current (misleading):
larson_hakmem_asan_alloc   # Implies HAKMEM allocator is used

# Better naming:
larson_hakmem_asan_libc    # Clarifies libc malloc is used
larson_hakmem_asan_nalloc  # "no allocator" (HAKMEM disabled)
```

---

## 7. Recommended Action Plan

### Phase 1: Immediate Fix (1 day) ✅

1. **Add `-DHAKMEM_FORCE_LIBC_ALLOC_BUILD=1` to SAN_*_ALLOC_CFLAGS** (Makefile:810, 823)
2. Rename binaries for clarity:
   - `larson_hakmem_asan_alloc` → `larson_hakmem_asan_libc`
   - `larson_hakmem_tsan_alloc` → `larson_hakmem_tsan_libc`
3. Verify all Sanitizer builds work correctly

### Phase 2: Constructor Priority Fix (2-3 days)

1. Add `__attribute__((constructor(101)))` to `hakmem_tls_preinit()`
2. Test with ASan/TSan/UBSan (allocator enabled)
3. Document constructor priority ranges in `ARCHITECTURE.md`

### Phase 3: Defensive TLS Check (1 week, optional)

1. Implement `hak_tls_is_ready()` helper
2. Add early exit in `malloc()` wrapper
3. Benchmark performance impact (should be < 1%)

### Phase 4: Documentation (ongoing)

1. Update `CLAUDE.md` with Sanitizer findings
2. Add "Sanitizer Compatibility" section to README
3. Document TLS variable inventory

---

## 8. Testing Matrix

| Build Type | Allocator | Sanitizer | Expected Result | Actual Result |
|------------|-----------|-----------|-----------------|---------------|
| `asan-larson` | libc | ASan+UBSan | ✅ Pass | ✅ Pass |
| `tsan-larson` | libc | TSan | ✅ Pass | ✅ Pass |
| `asan-larson-alloc` | HAKMEM | ASan+UBSan | ✅ Pass | ❌ SEGV (TLS) |
| `tsan-larson-alloc` | HAKMEM | TSan | ✅ Pass | ❌ SEGV (TLS) |
| `asan-shared-alloc` | HAKMEM | ASan+UBSan | ✅ Pass | ❌ SEGV (TLS) |
| `tsan-shared-alloc` | HAKMEM | TSan | ✅ Pass | ❌ SEGV (TLS) |

**Target:** All ✅ after Phase 1 (libc) + Phase 2 (constructor priority)

---

## 9. References

### 9.1 Related Code Files

- `core/hakmem.c:188` - TLS recursion guard
- `core/box/hak_wrappers.inc.h:40` - malloc wrapper entry point
- `core/box/hak_core_init.inc.h:29` - Initialization flow
- `core/hakmem_syscall.c:41` - dlsym initialization
- `Makefile:799-824` - Sanitizer build flags

### 9.2 External Documentation

- [GCC Constructor/Destructor Attributes](https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-constructor-function-attribute)
- [ASan Initialization Order](https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco)
- [ELF TLS Specification](https://www.akkadia.org/drepper/tls.pdf)
- [glibc rtld-malloc.h](https://sourceware.org/git/?p=glibc.git;a=blob;f=include/rtld-malloc.h)

---

## 10. Conclusion

The HAKMEM Sanitizer crash is a **classic initialization order problem** exacerbated by ASan's aggressive use of `malloc()` during `dlsym()` resolution. The immediate fix is trivial (enable `HAKMEM_FORCE_LIBC_ALLOC_BUILD`), but enabling Sanitizer instrumentation of HAKMEM itself requires careful constructor priority management.

**Recommended Path:** Implement Phase 1 (immediate) + Phase 2 (robust) for full Sanitizer support with allocator instrumentation enabled.

---

**Report Author:** Claude Code (Sonnet 4.5)  
**Investigation Date:** 2025-11-07  
**Last Updated:** 2025-11-07