# Phase 15 Bug Analysis - ExternalGuard Crash Investigation

**Date**: 2025-11-15
**Status**: ROOT CAUSE IDENTIFIED

## Summary

ExternalGuard is being called with a page-aligned pointer (`0x7fd8f8202000`) that neither allocator accepts:
- `hak_super_lookup()` returns NULL for it (not in registry)
- `__libc_free()` rejects it as "invalid pointer"

## Evidence

### Crash Log

```
[ExternalGuard] ptr=0x7fd8f8202000 offset_in_page=0x0 (call #1)
[ExternalGuard] >>> Use: addr2line -e 0x58b613548275
[ExternalGuard] hak_super_lookup(ptr) = (nil)
[ExternalGuard] ptr=0x7fd8f8202000 delegated to __libc_free
free(): invalid pointer
```

### Caller Identification

Using objdump analysis, caller address `0x...8275` maps to:
- **Function**: `free()` wrapper (line 0xb270 in binary)
- **Source**: `free(slots)` from bench_random_mixed.c line 85

### Allocation Analysis

```c
// bench_random_mixed.c line 34:
void** slots = (void**)calloc(256, sizeof(void*));  // = 2048 bytes
```

**calloc(2048) routing** (core/box/hak_wrappers.inc.h:282-285):

```c
if (ld_safe_mode_calloc >= 2 || total > TINY_MAX_SIZE) {  // TINY_MAX_SIZE = 1023
    extern void* __libc_calloc(size_t, size_t);
    return __libc_calloc(nmemb, size);  // ← Delegates to libc!
}
```

**Expected**: `calloc(2048)` → `__libc_calloc()` (delegated to libc)

## Root Cause Analysis

### Free Path Bug (core/box/hak_wrappers.inc.h)

**Lines 147-166**: Early classification

```c
ptr_classification_t c = classify_ptr(ptr);
if (is_hakmem_owned) {
    hak_free_at(ptr, ...);  // Path A: HAKMEM allocations
    return;
}
```

**Lines 226-228**: **FINAL FALLBACK** - unconditional routing

```c
g_hakmem_lock_depth++;
hak_free_at(ptr, 0, HAK_CALLSITE());  // ← BUG: Routes ALL pointers!
g_hakmem_lock_depth--;
```

**The Bug**: Non-HAKMEM pointers that fall through every early-exit check (lines 171-225) are unconditionally routed to `hak_free_at()`, even though `classify_ptr()` returned `PTR_KIND_EXTERNAL` (not HAKMEM-owned).
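The fall-through described above can be modeled in isolation. The following is a minimal sketch, not the real wrapper: `mock_kind_t`, `mock_route_t`, and `buggy_free_route()` are hypothetical names introduced only for illustration, and the early-exit checks are assumed not to fire.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not HAKMEM's real types. */
typedef enum { MOCK_KIND_HAKMEM, MOCK_KIND_EXTERNAL } mock_kind_t;
typedef enum { ROUTE_HAK_FREE_AT, ROUTE_LIBC_FREE } mock_route_t;

/* Models the buggy control flow: ownership only guards the early return,
 * so the final fallback sends every remaining pointer to the HAKMEM path. */
mock_route_t buggy_free_route(mock_kind_t kind)
{
    assert(kind == MOCK_KIND_HAKMEM || kind == MOCK_KIND_EXTERNAL);

    if (kind == MOCK_KIND_HAKMEM) {
        return ROUTE_HAK_FREE_AT;   /* Path A: early classification */
    }

    /* ... early-exit checks elided; assume none of them fire ... */

    return ROUTE_HAK_FREE_AT;       /* final fallback ignores `kind` */
}
```

In this model `ROUTE_LIBC_FREE` is never returned, which is the shape of the problem: an external pointer has no route to libc from the fallback itself.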
### Why __libc_free() Rejects the Pointer

**Two Hypotheses**:

**Hypothesis A**: Pointer is from `__libc_calloc()` (expected), but something corrupts it before it reaches `__libc_free()`
- Test: `calloc(256, 8)` returned a pointer at page offset 0x2a0 (not page-aligned)
- **Contradiction**: Crash log shows page-aligned pointer (0x...000)
- **Conclusion**: Pointer is NOT from `calloc(slots)`

**Hypothesis B**: Pointer is a HAKMEM allocation that `classify_ptr()` failed to recognize
- Pool TLS allocations CAN be page-aligned (mmap'd chunks)
- `hak_super_lookup()` returns NULL → not in Tiny registry
- **Likely**: This is a Pool TLS allocation (2KB = Pool range 8-52KB)

## Verification Tests

### Test 1: Pool TLS Allocation Check

```bash
# Check if 2KB allocations use Pool TLS
./test/pool_tls_allocation_test 2048
```

### Test 2: classify_ptr() Behavior

```c
void* ptr = calloc(256, sizeof(void*));  // 2048 bytes
ptr_classification_t c = classify_ptr(ptr);
printf("kind=%d (POOL_TLS=%d, EXTERNAL=%d)\n",
       c.kind, PTR_KIND_POOL_TLS, PTR_KIND_EXTERNAL);
```

## Next Steps

### Option 1: Fix free() Wrapper Logic (Recommended)

Change line 227 to check HAKMEM ownership first:

```c
// Before (BUG):
hak_free_at(ptr, 0, HAK_CALLSITE());  // Routes ALL pointers

// After (FIX):
if (is_hakmem_owned) {
    hak_free_at(ptr, 0, HAK_CALLSITE());
} else {
    extern void __libc_free(void*);
    __libc_free(ptr);  // Proper fallback for libc allocations
}
```

**Problem**: `is_hakmem_owned` is out of scope at line 227 (it is declared inside the block at lines 149-159)
**Solution**: Hoist `is_hakmem_owned` to function scope or re-classify at line 226

### Option 2: Fix classify_ptr() to Recognize Pool TLS

If the pointer is actually Pool TLS but misclassified:
- Add Pool TLS registry lookup to `classify_ptr()`
- Ensure Pool allocations are properly registered

### Option 3: Defer Phase 15 (Current)

Revert to Phase 14-C until the free() wrapper logic is fixed

## User's Insight

> "うん? mincore のセグフォはむしろ 違う層から呼ばれているという バグ発見じゃにゃいの?"

**Translation**: "Wait, isn't the mincore SEGV actually detecting a bug - that it's being called from the wrong layer?"

**Interpretation**: ExternalGuard being called is CORRECT behavior - it's detecting that a HAKMEM pointer (Pool TLS?) is not being recognized by the classification layer!

## Conclusion

**Primary Bug**: The `free()` wrapper unconditionally routes all pointers to `hak_free_at()` at line 227, regardless of HAKMEM ownership.

**Secondary Bug (suspected)**: `classify_ptr()` may fail to recognize Pool TLS allocations, causing them to be misclassified as `PTR_KIND_EXTERNAL`.

**Recommendation**: Apply Option 1 (free() wrapper logic) first, then investigate Pool TLS classification if the issue persists.
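
As a rough illustration of the Option 1 recommendation, the sketch below hoists the ownership flag to function scope so the final fallback can branch on it. All names are hypothetical stand-ins rather than the wrapper's real code, and `early_path_enabled` is an assumed parameter added only so that both fallback branches are reachable in the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not HAKMEM's real types. */
typedef enum { MOCK_KIND_HAKMEM, MOCK_KIND_EXTERNAL } mock_kind_t;
typedef enum { ROUTE_HAK_FREE_AT, ROUTE_LIBC_FREE } mock_route_t;

/* `early_path_enabled` is an assumed knob that lets an owned pointer skip
 * the early return, so both fallback branches are reachable in this model. */
mock_route_t fixed_free_route(mock_kind_t kind, bool early_path_enabled)
{
    assert(kind == MOCK_KIND_HAKMEM || kind == MOCK_KIND_EXTERNAL);

    /* Hoisted to function scope so the final fallback can still read it. */
    const bool is_hakmem_owned = (kind == MOCK_KIND_HAKMEM);

    if (early_path_enabled && is_hakmem_owned) {
        return ROUTE_HAK_FREE_AT;   /* early classification path */
    }

    /* ... early-exit checks elided; assume none of them fire ... */

    /* Final fallback branches on ownership instead of routing everything. */
    return is_hakmem_owned ? ROUTE_HAK_FREE_AT : ROUTE_LIBC_FREE;
}
```

This only models the routing decision. If `classify_ptr()` really does misclassify Pool TLS pointers as external, the fallback would hand them to libc and the crash would persist, which is why the Pool TLS investigation remains a follow-up.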