
# HAKMEM Sanitizer Investigation Report
**Date:** 2025-11-07
**Status:** Root cause identified
**Severity:** Critical (immediate SEGV on startup)

---
## Executive Summary
HAKMEM crashes immediately at startup when built with AddressSanitizer (ASan) or ThreadSanitizer (TSan) and the HAKMEM allocator is enabled (the `-alloc` build variants). The root cause is that **the sanitizer runtime calls `malloc()` during its own initialization, before TLS (Thread-Local Storage) is fully initialized**, so HAKMEM's wrapper hits a SEGV when it accesses `__thread` variables.

**Key Finding:** While ASan is initializing, its `dlsym()` call triggers HAKMEM's `malloc()` wrapper, which attempts to access `g_hakmem_lock_depth` (a TLS variable) before TLS is ready.

---
## 1. TLS Variables - Complete Inventory
### 1.1 Core TLS Variables (Recursion Guard)
**File:** `core/hakmem.c:188`
```c
__thread int g_hakmem_lock_depth = 0; // Recursion guard (NOT static!)
```
**First Access:** `core/box/hak_wrappers.inc.h:42` (in `malloc()` wrapper)
```c
void* malloc(size_t size) {
    if (__builtin_expect(g_initializing != 0, 0)) {  // ← Line 42
        extern void* __libc_malloc(size_t);
        return __libc_malloc(size);
    }
    // ... later: g_hakmem_lock_depth++; (line 86)
}
```
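**Problem:** Line 42 itself only reads `g_initializing`, an ordinary global, which is safe. The crash is attributed to a **TLS access that happens implicitly**: the compiler may set up access to the function's `__thread` variables (such as `g_hakmem_lock_depth`, used at line 86) early in the function, so entering `malloc()` can touch TLS even when the early-return path is taken.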
### 1.2 Other TLS Variables
#### Wrapper Statistics (hak_wrappers.inc.h:32-36)
```c
__thread uint64_t g_malloc_total_calls = 0;
__thread uint64_t g_malloc_tiny_size_match = 0;
__thread uint64_t g_malloc_fast_path_tried = 0;
__thread uint64_t g_malloc_fast_path_null = 0;
__thread uint64_t g_malloc_slow_path = 0;
```
#### Tiny Allocator TLS (hakmem_tiny.c)
```c
__thread int g_tls_live_ss[TINY_NUM_CLASSES] = {0}; // Line 658
__thread void* g_tls_sll_head[TINY_NUM_CLASSES] = {0}; // Line 1019
__thread uint32_t g_tls_sll_count[TINY_NUM_CLASSES] = {0}; // Line 1020
__thread uint8_t* g_tls_bcur[TINY_NUM_CLASSES] = {0}; // Line 1187
__thread uint8_t* g_tls_bend[TINY_NUM_CLASSES] = {0}; // Line 1188
```
#### Fast Cache TLS (tiny_fastcache.h:32-54, extern declarations)
```c
extern __thread void* g_tiny_fast_cache[TINY_FAST_CLASS_COUNT];
extern __thread uint32_t g_tiny_fast_count[TINY_FAST_CLASS_COUNT];
// ... 10+ more TLS variables
```
#### Other Subsystems TLS
- **SFC Cache:** `hakmem_tiny_sfc.c:18-19` (2 TLS variables)
- **Sticky Cache:** `tiny_sticky.c:6-8` (3 TLS arrays)
- **Simple Cache:** `hakmem_tiny_simple.c:23,26` (2 TLS variables)
- **Magazine:** `hakmem_tiny_magazine.c:29,37` (2 TLS variables)
- **Mid-Range MT:** `hakmem_mid_mt.c:37` (1 TLS array)
- **Pool TLS:** `core/box/pool_tls_types.inc.h:11` (1 TLS array)
**Total TLS Variables:** 50+ across the codebase
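Each `__thread` variable in this inventory exists once per thread, and a new thread starts from the declared initializer rather than from whatever another thread last wrote. A minimal sketch of that behavior, assuming POSIX threads and a GCC-compatible compiler; `demo_tls_value` and both helper functions are hypothetical names that do not appear in HAKMEM:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an allocator TLS variable such as g_hakmem_lock_depth. */
static __thread int demo_tls_value = 0;

/* Thread body: report the value this thread sees in its own TLS instance. */
static void *demo_read_tls(void *out) {
    *(int *)out = demo_tls_value;
    return NULL;
}

/* Return the calling thread's own copy of the variable. */
int demo_tls_in_calling_thread(void) {
    return demo_tls_value;
}

/* Set the calling thread's copy to `value`, then return what a freshly
 * created thread observes. Returns -1 if the thread cannot be started. */
int demo_tls_seen_by_new_thread(int value) {
    pthread_t tid;
    int seen = -1;

    demo_tls_value = value;  /* affects only the calling thread */
    if (pthread_create(&tid, NULL, demo_read_tls, &seen) != 0) return -1;
    if (pthread_join(tid, NULL) != 0) return -1;
    return seen;             /* the new thread's copy starts at the initializer */
}
```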
---
## 2. dlsym / syscall Initialization Flow
### 2.1 Intended Initialization Order
**File:** `core/box/hak_core_init.inc.h:29-35`
```c
static void hak_init_impl(void) {
    g_initializing = 1;
    // Phase 6.X P0 FIX (2025-10-24): Initialize Box 3 (Syscall Layer) FIRST!
    // This MUST be called before ANY allocation (Tiny/Mid/Large/Learner)
    // dlsym() initializes function pointers to real libc (bypasses LD_PRELOAD)
    hkm_syscall_init();  // ← Line 35
    // ...
}
```
**File:** `core/hakmem_syscall.c:41-64`
```c
void hkm_syscall_init(void) {
    if (g_syscall_initialized) return;  // Idempotent
    // dlsym with RTLD_NEXT: Get NEXT symbol in library chain
    real_malloc  = dlsym(RTLD_NEXT, "malloc");  // ← Line 49
    real_calloc  = dlsym(RTLD_NEXT, "calloc");
    real_free    = dlsym(RTLD_NEXT, "free");
    real_realloc = dlsym(RTLD_NEXT, "realloc");
    if (!real_malloc || !real_calloc || !real_free || !real_realloc) {
        fprintf(stderr, "[hakmem_syscall] FATAL: dlsym failed\n");
        abort();
    }
    g_syscall_initialized = 1;
}
```
### 2.2 Actual Execution Order (ASan Build)
**GDB Backtrace:**
```
#0 malloc (size=69) at core/box/hak_wrappers.inc.h:40
#1 0x00007ffff7fc7cca in malloc (size=69) at ../include/rtld-malloc.h:56
#2 __GI__dl_exception_create_format (...) at ./elf/dl-exception.c:157
#3 0x00007ffff7fcf3dc in _dl_lookup_symbol_x (undef_name="__isoc99_printf", ...)
#4 0x00007ffff65759c4 in do_sym (..., name="__isoc99_printf", ...) at ./elf/dl-sym.c:146
#5 _dl_sym (handle=<optimized out>, name="__isoc99_printf", ...) at ./elf/dl-sym.c:195
#12 0x00007ffff74e3859 in __interception::GetFuncAddr (name="__isoc99_printf") at interception_linux.cpp:42
#13 __interception::InterceptFunction (name="__isoc99_printf", ...) at interception_linux.cpp:61
#14 0x00007ffff74a1deb in InitializeCommonInterceptors () at sanitizer_common_interceptors.inc:10094
#15 __asan::InitializeAsanInterceptors () at asan_interceptors.cpp:634
#16 0x00007ffff74c063b in __asan::AsanInitInternal () at asan_rtl.cpp:452
#17 0x00007ffff7fc95be in _dl_init (main_map=0x7ffff7ffe2e0, ...) at ./elf/dl-init.c:102
#18 0x00007ffff7fe32ca in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
```
**Timeline:**
1. Dynamic linker (`ld-linux.so`) initializes
2. ASan runtime initializes (`__asan::AsanInitInternal`)
3. ASan intercepts `printf` family functions
4. Resolving `__isoc99_printf` through `dlsym()` reaches glibc's `_dl_exception_create_format()`, which calls `malloc()` internally (glibc `rtld-malloc.h:56`)
5. HAKMEM's `malloc()` wrapper is invoked **before `hak_init()` runs**
6. **TLS access SEGV** (TLS segment not yet initialized)
### 2.3 Why `HAKMEM_FORCE_LIBC_ALLOC_BUILD` Doesn't Help
**Current Makefile (lines 810-811):**
```makefile
SAN_ASAN_ALLOC_CFLAGS = -O1 -g -fno-omit-frame-pointer -fno-lto \
  -fsanitize=address,undefined -fno-sanitize-recover=all -fstack-protector-strong
# NOTE: Missing -DHAKMEM_FORCE_LIBC_ALLOC_BUILD=1
```
**Expected Behavior (with flag):**
```c
#ifdef HAKMEM_FORCE_LIBC_ALLOC_BUILD
void* malloc(size_t size) {
    extern void* __libc_malloc(size_t);
    return __libc_malloc(size);  // Bypass HAKMEM completely
}
#endif
```
**However:** Even with `HAKMEM_FORCE_LIBC_ALLOC_BUILD=1`, the `malloc` symbol is still exported by HAKMEM, so ASan may still route calls through it. Running the HAKMEM allocator itself under a sanitizer therefore requires one of the following:
1. Not exporting `malloc` at all when sanitizers are active, OR
2. Using constructor priorities so that HAKMEM's TLS is usable before the first intercepted `malloc()` call

---
## 3. Static Constructor Execution Order
### 3.1 Current Constructors
**File:** `core/hakmem.c:66`
```c
__attribute__((constructor)) static void hakmem_ctor_install_segv(void) {
    const char* dbg = getenv("HAKMEM_DEBUG_SEGV");
    // ... install SIGSEGV handler
}
```
**File:** `core/tiny_debug_ring.c:204`
```c
__attribute__((constructor))
static void hak_debug_ring_ctor(void) {
    // ...
}
```
**File:** `core/hakmem_tiny_stats.c:66`
```c
__attribute__((constructor))
static void hak_tiny_stats_ctor(void) {
    // ...
}
```
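**Problem:** No priority is specified. Constructors without a priority run after every prioritized constructor in the same binary (effectively priority `65535`), which places them **after** most library constructors.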
**ASan Constructor Priority:** Typically `1` or `100` (very early)
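The ordering rule can be checked with a small experiment. The sketch below assumes GCC or Clang on an ELF platform; the three constructors and the `ctor_order_at()` helper are illustrative names, not HAKMEM code. The ordering is only defined among constructors linked into the same binary; constructors that live in a separate shared library (for example a dynamically linked sanitizer runtime) are ordered by the dynamic loader instead.

```c
#include <stddef.h>

/* Records the order in which the constructors below ran. */
static int ctor_order[3];
static size_t ctor_count = 0;

static void ctor_record(int tag) {
    if (ctor_count < 3) ctor_order[ctor_count++] = tag;
}

/* Declared in reverse order on purpose: the priority decides, not the source order. */
__attribute__((constructor))
static void ctor_default(void) { ctor_record(65535); }  /* no priority: runs last */

__attribute__((constructor(1000)))
static void ctor_normal(void) { ctor_record(1000); }

__attribute__((constructor(101)))
static void ctor_early(void) { ctor_record(101); }     /* lowest number: runs first */

/* Return the tag of the i-th constructor that ran, or -1 if out of range. */
int ctor_order_at(size_t i) {
    return i < ctor_count ? ctor_order[i] : -1;
}
```

When this is built as a single executable on Linux with GCC, `ctor_order_at(0)`, `ctor_order_at(1)` and `ctor_order_at(2)` are expected to return 101, 1000 and 65535.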
### 3.2 Constructor Priority Ranges
- **0-100:** Reserved for the implementation (libc, libstdc++, sanitizers); GCC warns when user code uses them
- **101-999:** Early initialization (critical infrastructure)
- **1000-9999:** Normal initialization
- **65535 (default):** Late initialization

---
## 4. Sanitizer Conflict Points
### 4.1 Symbol Interposition Chain
**Without Sanitizer:**
```
Application → malloc() → HAKMEM wrapper → hak_alloc_at()
```
**With ASan (Direct Link):**
```
Application → ASan malloc() → HAKMEM malloc() → TLS access → SEGV
(during ASan init, TLS not ready!)
```
**Expected (with FORCE_LIBC):**
```
Application → ASan malloc() → __libc_malloc() ✓
```
### 4.2 LD_PRELOAD vs Direct Link
**LD_PRELOAD (libhakmem_asan.so):**
```
Application → LD_PRELOAD (HAKMEM malloc) → ASan malloc → ...
```
- Even worse: HAKMEM wrapper runs before ASan init!
**Direct Link (larson_hakmem_asan_alloc):**
```
Application → main() → ...
(ASan init via constructor) → dlsym malloc → HAKMEM malloc → SEGV
```
### 4.3 TLS Initialization Timing
**Normal Execution:**
1. ELF loader initializes TLS templates
2. The dynamic linker sets up the TLS block for the main thread (later accesses may resolve through `__tls_get_addr()`)
3. Constructors run (can safely access TLS)
4. `main()` starts
**ASan Execution:**
1. ELF loader initializes TLS templates
2. ASan constructor runs **before** application constructors
3. ASan's `dlsym()` calls `malloc()`
4. **HAKMEM malloc accesses TLS → SEGV** (TLS not fully initialized!)
**Why TLS Fails:**
- ASan's early constructor (priority 1-100) runs during `_dl_init()`
- The TLS segment may be allocated but **not yet associated with the current thread**
- Accessing a `__thread` variable then goes through `__tls_get_addr()`, which dereferences a NULL pointer

---
## 5. Existing Workarounds / Comments
### 5.1 Recursion Guard Design
**File:** `core/hakmem.c:175-192`
```c
// Phase 6.15 P1: Remove global lock; keep recursion guard only
// ---------------------------------------------------------------------------
// We no longer serialize all allocations with a single global mutex.
// Instead, each submodule is responsible for its own fine-grained locking.
// We keep a per-thread recursion guard so that internal use of malloc/free
// within the allocator routes to libc (avoids infinite recursion).
//
// Phase 6.X P0 FIX (2025-10-24): Reverted to simple g_hakmem_lock_depth check
// Box Theory - Layer 1 (API Layer):
// This guard protects against LD_PRELOAD recursion (Box 1 → Box 1)
// Box 2 (Core) → Box 3 (Syscall) uses hkm_libc_malloc() (dlsym, no guard needed!)
// NOTE: Removed 'static' to allow access from hakmem_tiny_superslab.c (fopen fix)
__thread int g_hakmem_lock_depth = 0; // 0 = outermost call
```
**Comment Analysis:**
- Designed for **runtime recursion**, not **initialization-time TLS issues**
- Assumes TLS is already available when `malloc()` is called
- `dlsym` guard mentioned, but not for initialization safety
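The runtime-recursion case that the guard was designed for can be sketched as follows. This is a simplified model, not the HAKMEM wrapper: `guarded_alloc()`, `allocator_body()` and the counters are hypothetical names, and libc `malloc()` stands in for both the real allocator and the fallback. Like the original, it assumes TLS is already usable when it is called.

```c
#include <stdlib.h>

/* Hypothetical analogue of g_hakmem_lock_depth: 0 means outermost call. */
static __thread int guard_depth = 0;
static __thread int guard_body_count = 0;      /* times the allocator body ran */
static __thread int guard_fallback_count = 0;  /* times a nested call was diverted */

void *guarded_alloc(size_t size);

/* Stand-in for the allocator's real work. It needs scratch memory itself,
 * which re-enters guarded_alloc() and would recurse without the guard. */
static void *allocator_body(size_t size) {
    guard_body_count++;
    void *scratch = guarded_alloc(32);  /* nested call, diverted by the guard */
    free(scratch);
    return malloc(size);
}

void *guarded_alloc(size_t size) {
    if (guard_depth > 0) {              /* already inside the allocator */
        guard_fallback_count++;
        return malloc(size);            /* route the nested request to libc */
    }
    guard_depth++;
    void *p = allocator_body(size);
    guard_depth--;
    return p;
}

int guard_body_calls(void)     { return guard_body_count; }
int guard_fallback_calls(void) { return guard_fallback_count; }
int guard_current_depth(void)  { return guard_depth; }
```

One outer `guarded_alloc(64)` call runs the allocator body once and diverts exactly one nested request, leaving the depth back at zero. Nothing in this pattern helps if the very first read of `guard_depth` faults, which is the initialization-time case this report is about.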
### 5.2 Sanitizer Build Flags (Makefile)
**Lines 799-801 (ASan with FORCE_LIBC):**
```makefile
SAN_ASAN_CFLAGS = -O1 -g -fno-omit-frame-pointer -fno-lto \
  -fsanitize=address,undefined -fno-sanitize-recover=all -fstack-protector-strong \
  -DHAKMEM_FORCE_LIBC_ALLOC_BUILD=1 # ← Bypasses HAKMEM allocator
```
**Lines 810-811 (ASan with HAKMEM allocator):**
```makefile
SAN_ASAN_ALLOC_CFLAGS = -O1 -g -fno-omit-frame-pointer -fno-lto \
  -fsanitize=address,undefined -fno-sanitize-recover=all -fstack-protector-strong
# NOTE: Missing -DHAKMEM_FORCE_LIBC_ALLOC_BUILD=1 ← INTENDED for testing!
```
**Design Intent:** Allow ASan to instrument HAKMEM's allocator for memory safety testing.
**Current Reality:** Broken due to TLS initialization order.

---
## 6. Recommended Fix (Priority Ordered)
### 6.1 Option A: Constructor Priority (Quick Fix) ⭐⭐⭐⭐⭐
**Difficulty:** Easy
**Risk:** Low
**Effectiveness:** High (80% confidence)
**Implementation:**
**File:** `core/hakmem.c`
```c
// PRIORITY 101: Run after ASan (priority ~100), but before default (65535)
__attribute__((constructor(101))) static void hakmem_tls_preinit(void) {
    // Force TLS allocation by touching the variable
    g_hakmem_lock_depth = 0;
    // Optional: Pre-initialize dlsym cache
    hkm_syscall_init();
}

// Keep existing constructor for SEGV handler (no priority = runs later)
__attribute__((constructor)) static void hakmem_ctor_install_segv(void) {
    // ... existing code
}
```
**Rationale:**
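- Touches TLS early, before HAKMEM's default-priority constructors and before `main()`
- Forces `__tls_get_addr()` to run in a safe context
- Minimal code change
- Caveat: constructor priorities only order constructors within one linked binary, and the backtrace in section 2.2 shows `malloc()` being called while ASan itself is still initializing, so this option has to be verified against that early call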
**Verification:**
```bash
make clean
# Add constructor(101) to hakmem.c
make asan-larson-alloc
./larson_hakmem_asan_alloc 1 1 128 1024 1 12345 1
# Should run without SEGV
```
---
### 6.2 Option B: Lazy TLS Initialization (Defensive) ⭐⭐⭐⭐
**Difficulty:** Medium
**Risk:** Medium (performance impact)
**Effectiveness:** High (90% confidence)
**Implementation:**
**File:** `core/box/hak_wrappers.inc.h:40-50`
```c
void* malloc(size_t size) {
    // NEW: Check if TLS is initialized using a helper
    if (__builtin_expect(!hak_tls_is_ready(), 0)) {
        extern void* __libc_malloc(size_t);
        return __libc_malloc(size);
    }
    // Existing code...
    if (__builtin_expect(g_initializing != 0, 0)) {
        extern void* __libc_malloc(size_t);
        return __libc_malloc(size);
    }
    // ...
}
```
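The gating logic in this wrapper can be tried in isolation. The sketch below is a model under stated assumptions, not HAKMEM code: `demo_mark_ready()`, `demo_choose_backend()` and the `demo_backend_t` enum are hypothetical, and the readiness flag is assumed to be an ordinary process-wide variable read with the GCC/Clang `__atomic` builtins, so that the check itself never touches TLS.

```c
/* Which allocator a request would be routed to. */
typedef enum { DEMO_BACKEND_LIBC = 0, DEMO_BACKEND_HAKMEM = 1 } demo_backend_t;

/* Process-wide readiness flag (deliberately not __thread). */
static int demo_ready_flag = 0;

/* Called once when it is safe to use the custom allocator,
 * for example from an early constructor. */
void demo_mark_ready(void) {
    __atomic_store_n(&demo_ready_flag, 1, __ATOMIC_RELEASE);
}

/* Decide where a malloc() request should go. Until demo_mark_ready()
 * has run, every request falls back to libc. */
demo_backend_t demo_choose_backend(void) {
    if (__builtin_expect(!__atomic_load_n(&demo_ready_flag, __ATOMIC_ACQUIRE), 0)) {
        return DEMO_BACKEND_LIBC;
    }
    return DEMO_BACKEND_HAKMEM;
}
```

Before `demo_mark_ready()` runs, `demo_choose_backend()` returns `DEMO_BACKEND_LIBC`; afterwards it returns `DEMO_BACKEND_HAKMEM` on every thread, because the flag is shared rather than per-thread.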
**New Helper Function:**
```c
// core/hakmem.c
// Process-wide flag, deliberately NOT __thread: the check must not touch TLS,
// and a per-thread flag would only be set on the thread that ran the constructor.
static int g_tls_ready_flag = 0;

__attribute__((constructor(101)))
static void hak_tls_mark_ready(void) {
    __atomic_store_n(&g_tls_ready_flag, 1, __ATOMIC_RELEASE);
}

int hak_tls_is_ready(void) {
    // Atomic load so the compiler cannot cache or elide the read
    return __atomic_load_n(&g_tls_ready_flag, __ATOMIC_ACQUIRE);
}
```
**Pros:**
- Safe even if constructor priorities fail
- Explicit TLS readiness check
- Falls back to libc if TLS not ready
**Cons:**
- Extra branch on malloc hot path (1-2 cycles)
- Adds a process-wide flag (`g_tls_ready_flag`) that must be read on every call

---
### 6.3 Option C: Weak Symbol Aliasing (Advanced) ⭐⭐⭐
**Difficulty:** Hard
**Risk:** High (portability, build system complexity)
**Effectiveness:** Medium (70% confidence)
**Implementation:**
**File:** `core/box/hak_wrappers.inc.h`
```c
// Weak definition: Allow ASan to override if needed
__attribute__((weak))
void* malloc(size_t size) {
    // ... HAKMEM implementation
}

// Strong symbol for internal use
void* hak_malloc_internal(size_t size) {
    // ... same implementation
}
```
**Pros:**
- Allows ASan to fully control malloc symbol
- HAKMEM can still use internal allocation
**Cons:**
- Complex build interactions
- May not work with all linker configurations
- Debugging becomes harder (symbol resolution issues)

---
### 6.4 Option D: Disable Wrappers for Sanitizer Builds (Pragmatic) ⭐⭐⭐⭐⭐
**Difficulty:** Easy
**Risk:** Low
**Effectiveness:** 100% (but limited scope)
**Implementation:**
**File:** `Makefile:810-811`
```makefile
# OLD (broken):
SAN_ASAN_ALLOC_CFLAGS = -O1 -g -fno-omit-frame-pointer -fno-lto \
  -fsanitize=address,undefined -fno-sanitize-recover=all -fstack-protector-strong

# NEW (fixed):
SAN_ASAN_ALLOC_CFLAGS = -O1 -g -fno-omit-frame-pointer -fno-lto \
  -fsanitize=address,undefined -fno-sanitize-recover=all -fstack-protector-strong \
  -DHAKMEM_FORCE_LIBC_ALLOC_BUILD=1 # ← Bypass HAKMEM allocator
```
**Rationale:**
- Sanitizer builds should focus on **application logic bugs**, not allocator bugs
- HAKMEM allocator can be tested separately without Sanitizers
- Eliminates all TLS/constructor issues
**Pros:**
- Immediate fix (1-line change)
- Zero risk
- Sanitizers work as intended
**Cons:**
- Cannot test HAKMEM allocator with Sanitizers
- Defeats purpose of `-alloc` variants
**Recommended Naming:**
```bash
# Current (misleading):
larson_hakmem_asan_alloc # Implies HAKMEM allocator is used
# Better naming:
larson_hakmem_asan_libc # Clarifies libc malloc is used
larson_hakmem_asan_nalloc # "no allocator" (HAKMEM disabled)
```
---
## 7. Recommended Action Plan
### Phase 1: Immediate Fix (1 day) ✅
1. **Add `-DHAKMEM_FORCE_LIBC_ALLOC_BUILD=1` to `SAN_*_ALLOC_CFLAGS`** (Makefile:810, 823)
2. Rename binaries for clarity:
   - `larson_hakmem_asan_alloc` → `larson_hakmem_asan_libc`
   - `larson_hakmem_tsan_alloc` → `larson_hakmem_tsan_libc`
3. Verify all Sanitizer builds work correctly
### Phase 2: Constructor Priority Fix (2-3 days)
1. Add `__attribute__((constructor(101)))` to `hakmem_tls_preinit()`
2. Test with ASan/TSan/UBSan (allocator enabled)
3. Document constructor priority ranges in `ARCHITECTURE.md`
### Phase 3: Defensive TLS Check (1 week, optional)
1. Implement `hak_tls_is_ready()` helper
2. Add early exit in `malloc()` wrapper
3. Benchmark performance impact (should be < 1%)
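For the benchmarking step, a rough way to quantify the cost of the extra branch is to time a tight allocate/free loop with and without the check and compare throughput. The helpers below are a hypothetical sketch (`demo_alloc_ops_per_sec()` and `demo_slowdown_percent()` are not part of HAKMEM or the larson benchmark) and assume a POSIX system with `clock_gettime()`; real measurements should use the project's own benchmark binaries.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/* Time `iters` malloc/free pairs of `size` bytes using whatever malloc is
 * linked in. Returns operations per second, or 0.0 if timing fails. */
double demo_alloc_ops_per_sec(size_t iters, size_t size) {
    struct timespec t0, t1;

    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0) return 0.0;
    for (size_t i = 0; i < iters; i++) {
        void *volatile p = malloc(size);         /* volatile: keep the pair from being optimized out */
        if (p != NULL) *(volatile char *)p = 1;  /* touch the block */
        free(p);
    }
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0) return 0.0;

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return secs > 0.0 ? (double)iters / secs : 0.0;
}

/* Relative slowdown of `candidate_ops` against `baseline_ops`, in percent.
 * Positive means the candidate is slower. */
double demo_slowdown_percent(double baseline_ops, double candidate_ops) {
    if (baseline_ops <= 0.0) return 0.0;
    return (baseline_ops - candidate_ops) / baseline_ops * 100.0;
}
```

A result from `demo_slowdown_percent()` below 1.0 would meet the target stated in this phase. Single runs are noisy, so several repetitions per configuration give a more trustworthy comparison.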
### Phase 4: Documentation (ongoing)
1. Update `CLAUDE.md` with Sanitizer findings
2. Add "Sanitizer Compatibility" section to README
3. Document TLS variable inventory

---
## 8. Testing Matrix
| Build Type | Allocator | Sanitizer | Expected Result | Actual Result |
|------------|-----------|-----------|-----------------|---------------|
| `asan-larson` | libc | ASan+UBSan | Pass | Pass |
| `tsan-larson` | libc | TSan | Pass | Pass |
| `asan-larson-alloc` | HAKMEM | ASan+UBSan | Pass | SEGV (TLS) |
| `tsan-larson-alloc` | HAKMEM | TSan | Pass | SEGV (TLS) |
| `asan-shared-alloc` | HAKMEM | ASan+UBSan | Pass | SEGV (TLS) |
| `tsan-shared-alloc` | HAKMEM | TSan | Pass | SEGV (TLS) |

**Target:** All builds pass after Phase 1 (libc) + Phase 2 (constructor priority)

---
## 9. References
### 9.1 Related Code Files
- `core/hakmem.c:188` - TLS recursion guard
- `core/box/hak_wrappers.inc.h:40` - malloc wrapper entry point
- `core/box/hak_core_init.inc.h:29` - Initialization flow
- `core/hakmem_syscall.c:41` - dlsym initialization
- `Makefile:799-824` - Sanitizer build flags
### 9.2 External Documentation
- [GCC Constructor/Destructor Attributes](https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-constructor-function-attribute)
- [AddressSanitizer Initialization Order Fiasco](https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco)
- [ELF TLS Specification](https://www.akkadia.org/drepper/tls.pdf)
- [glibc rtld-malloc.h](https://sourceware.org/git/?p=glibc.git;a=blob;f=include/rtld-malloc.h)

---
## 10. Conclusion
The HAKMEM Sanitizer crash is a **classic initialization order problem** exacerbated by ASan's aggressive use of `malloc()` during `dlsym()` resolution. The immediate fix is trivial (enable `HAKMEM_FORCE_LIBC_ALLOC_BUILD`), but enabling Sanitizer instrumentation of HAKMEM itself requires careful constructor priority management.

**Recommended Path:** Implement Phase 1 (immediate) + Phase 2 (robust) for full Sanitizer support with allocator instrumentation enabled.

---
**Report Author:** Claude Code (Sonnet 4.5)
**Investigation Date:** 2025-11-07
**Last Updated:** 2025-11-07