# Phase 15 Bug Analysis - ExternalGuard Crash Investigation

**Date**: 2025-11-15
**Status**: ROOT CAUSE IDENTIFIED

## Summary

ExternalGuard is being called with a page-aligned pointer (`0x7fd8f8202000`) for which:
- `hak_super_lookup()` returns NULL (the pointer is not in the registry)
- `__libc_free()` rejects the pointer with `free(): invalid pointer`

## Evidence

### Crash Log
```
[ExternalGuard] ptr=0x7fd8f8202000 offset_in_page=0x0 (call #1)
[ExternalGuard] >>> Use: addr2line -e <binary> 0x58b613548275
[ExternalGuard] hak_super_lookup(ptr) = (nil)
[ExternalGuard] ptr=0x7fd8f8202000 delegated to __libc_free
free(): invalid pointer
```

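The `offset_in_page=0x0` field in the log can be reproduced by masking the pointer with the page size. The sketch below is illustrative only: `offset_in_page()` is a hypothetical helper, not a HAKMEM function, and the 4096-byte page size used in the note after the code is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of HAKMEM): returns the offset of addr
 * within its page. page_size must be a nonzero power of two; real code
 * would query it with sysconf(_SC_PAGESIZE) instead of hard-coding it. */
static uintptr_t offset_in_page(uintptr_t addr, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & (page_size - 1);
}
```

With a 4096-byte page, `offset_in_page(0x7fd8f8202000, 4096)` evaluates to `0`, which matches the logged value and confirms the pointer is page-aligned.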
### Caller Identification
Using objdump analysis, the caller address `0x...8275` maps to:
- **Function**: the `free()` wrapper (at `0xb270` in the binary)
- **Source**: `free(slots)` at bench_random_mixed.c line 85

### Allocation Analysis
```c
// bench_random_mixed.c line 34:
void** slots = (void**)calloc(256, sizeof(void*));  // 256 * 8 = 2048 bytes with 8-byte pointers
```

**calloc routing for 2048 bytes** (core/box/hak_wrappers.inc.h:282-285):
```c
if (ld_safe_mode_calloc >= 2 || total > TINY_MAX_SIZE) {  // TINY_MAX_SIZE = 1023
    extern void* __libc_calloc(size_t, size_t);
    return __libc_calloc(nmemb, size);  // ← Delegates to libc!
}
```

**Expected**: `calloc(256, 8)` requests 2048 bytes, which exceeds `TINY_MAX_SIZE`, so the call is delegated to `__libc_calloc()`

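The size test in the quoted condition can be checked in isolation. The following is a minimal sketch, not the wrapper itself: `calloc_delegates_to_libc()` and its `ld_safe_mode` parameter are hypothetical stand-ins, the overflow guard is an addition for safety, and only the `TINY_MAX_SIZE = 1023` value comes from the excerpt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TINY_MAX_SIZE 1023  /* value quoted in the wrapper excerpt */

/* Hypothetical helper: mirrors only the routing condition from the excerpt.
 * ld_safe_mode stands in for ld_safe_mode_calloc. Returns true when the
 * request would be handed to libc rather than served by the Tiny path. */
static bool calloc_delegates_to_libc(size_t nmemb, size_t size, int ld_safe_mode)
{
    /* Treat a multiplication overflow as "too large for Tiny". */
    if (size != 0 && nmemb > SIZE_MAX / size) {
        return true;
    }
    size_t total = nmemb * size;
    return ld_safe_mode >= 2 || total > TINY_MAX_SIZE;
}
```

For the `slots` allocation, `calloc_delegates_to_libc(256, 8, 0)` is true because 2048 > 1023, while a 128-byte request such as `calloc_delegates_to_libc(16, 8, 0)` stays on the Tiny path unless the safe mode is 2 or higher.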
## Root Cause Analysis

### Free Path Bug (core/box/hak_wrappers.inc.h)

**Lines 147-166**: Early classification
```c
ptr_classification_t c = classify_ptr(ptr);
if (is_hakmem_owned) {
    hak_free_at(ptr, ...);  // Path A: HAKMEM allocations
    return;
}
```

**Lines 226-228**: **FINAL FALLBACK** - unconditional routing
```c
g_hakmem_lock_depth++;
hak_free_at(ptr, 0, HAK_CALLSITE());  // ← BUG: Routes ALL pointers!
g_hakmem_lock_depth--;
```

**The Bug**: Non-HAKMEM pointers that pass all early-exit checks (lines 171-225) get unconditionally routed to `hak_free_at()`, even though `classify_ptr()` returned `PTR_KIND_EXTERNAL` (not HAKMEM-owned).

### Why __libc_free() Rejects the Pointer

**Two Hypotheses**:

**Hypothesis A**: Pointer is from `__libc_calloc()` (expected), but something corrupts it before it reaches `__libc_free()`
- Test: `calloc(256, 8)` returned a pointer at in-page offset 0x2a0 (not page-aligned)
- **Contradiction**: Crash log shows a page-aligned pointer (0x...000)
- **Conclusion**: Pointer is NOT from the `calloc()` that allocated `slots`

**Hypothesis B**: Pointer is a HAKMEM allocation that `classify_ptr()` failed to recognize
- Pool TLS allocations CAN be page-aligned (mmap'd chunks)
- `hak_super_lookup()` returns NULL → not in Tiny registry
- **Likely**: This is a Pool TLS allocation, although the 2 KB request is below the stated Pool range of 8-52 KB, so the routing into Pool TLS still needs to be confirmed

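The alignment check used for Hypothesis A can be repeated with a few lines of C. This is a sketch under assumptions: `slots_page_offset()` is a hypothetical helper, it assumes a POSIX system for `sysconf()`, and the offset it reports depends on which allocator is linked in, so no particular value is implied.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: allocates the same shape as `slots` in the benchmark
 * and returns the pointer's offset within its page, or -1 if the
 * allocation failed. */
static long slots_page_offset(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    void **slots = calloc(256, sizeof(void *));
    if (slots == NULL) {
        return -1;
    }
    long off = (long)((uintptr_t)slots % (uintptr_t)page);
    free(slots);
    return off;
}
```

A result of `0` means the allocation was page-aligned, as in the crash log. A nonzero result such as the `0x2a0` observed in the test means the pointer sits inside a page, which is the pattern of a small heap allocation.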
## Verification Tests

### Test 1: Pool TLS Allocation Check
```bash
# Check if 2KB allocations use Pool TLS
./test/pool_tls_allocation_test 2048
```

### Test 2: classify_ptr() Behavior
```c
void* ptr = calloc(256, sizeof(void*));  // 2048 bytes
ptr_classification_t c = classify_ptr(ptr);
printf("kind=%d (POOL_TLS=%d, EXTERNAL=%d)\n",
       c.kind, PTR_KIND_POOL_TLS, PTR_KIND_EXTERNAL);
```

## Next Steps

### Option 1: Fix free() Wrapper Logic (Recommended)
Change line 227 to check HAKMEM ownership first:
```c
// Before (BUG):
hak_free_at(ptr, 0, HAK_CALLSITE());  // Routes ALL pointers

// After (FIX):
if (is_hakmem_owned) {
    hak_free_at(ptr, 0, HAK_CALLSITE());
} else {
    extern void __libc_free(void*);
    __libc_free(ptr);  // Proper fallback for libc allocations
}
```

**Problem**: `is_hakmem_owned` is out of scope at line 227 (it is declared inside the block at lines 149-159)

**Solution**: Hoist `is_hakmem_owned` to function scope or re-classify at line 226

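The hoisting approach can be illustrated with a self-contained mock. Every name in this sketch (`sketch_free`, `sketch_classify`, the `ROUTE_*` values, the `take_early_path` flag) is a hypothetical stand-in. The real `classify_ptr()`, `hak_free_at()` and `__libc_free()` are replaced by a recorded route so that the control flow can be checked without HAKMEM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for illustration only; the real types and functions differ. */
typedef enum { KIND_HAKMEM, KIND_EXTERNAL } sketch_kind_t;
typedef enum { ROUTE_NONE, ROUTE_HAKMEM, ROUTE_LIBC } sketch_route_t;

static sketch_route_t g_last_route = ROUTE_NONE;  /* records the chosen path */

/* Mock classifier: exactly one pointer (`owned`) counts as HAKMEM-owned. */
static sketch_kind_t sketch_classify(const void *ptr, const void *owned)
{
    assert(owned != NULL);
    return ptr == owned ? KIND_HAKMEM : KIND_EXTERNAL;
}

static void sketch_free(void *ptr, const void *owned, bool take_early_path)
{
    if (ptr == NULL) {
        return;
    }

    /* Hoisted to function scope so the final fallback can still see it. */
    bool is_hakmem_owned = (sketch_classify(ptr, owned) == KIND_HAKMEM);

    if (is_hakmem_owned && take_early_path) {  /* early classification path */
        g_last_route = ROUTE_HAKMEM;
        return;
    }

    /* ... early-exit checks elided ... */

    /* Final fallback: routed by ownership instead of unconditionally. */
    if (is_hakmem_owned) {
        g_last_route = ROUTE_HAKMEM;  /* would call hak_free_at() */
    } else {
        g_last_route = ROUTE_LIBC;    /* would call __libc_free() */
    }
}
```

Hoisting the flag avoids a second classification call at the fallback. Re-classifying at line 226 would work as well, at the cost of repeating the lookup.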
### Option 2: Fix classify_ptr() to Recognize Pool TLS
If the pointer is actually a Pool TLS allocation but is misclassified:
- Add a Pool TLS registry lookup to `classify_ptr()`
- Ensure Pool allocations are properly registered

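One possible shape for such a registry lookup is a table of chunk address ranges that is consulted during classification. The sketch below is purely hypothetical: the `sketch_pool_*` names, the fixed-size table and the linear scan are assumptions for illustration, and nothing here reflects how HAKMEM's Pool TLS actually tracks its chunks. It is also not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CHUNKS 64  /* arbitrary capacity for the sketch */

typedef struct {
    uintptr_t base;  /* first byte of the chunk */
    size_t    len;   /* chunk length in bytes */
} sketch_chunk_t;

static sketch_chunk_t g_chunks[SKETCH_MAX_CHUNKS];
static size_t g_chunk_count = 0;

/* Records a chunk so later lookups can recognize pointers inside it.
 * Returns false on invalid input or when the table is full. */
static bool sketch_pool_register(void *base, size_t len)
{
    if (base == NULL || len == 0 || g_chunk_count == SKETCH_MAX_CHUNKS) {
        return false;
    }
    g_chunks[g_chunk_count].base = (uintptr_t)base;
    g_chunks[g_chunk_count].len = len;
    g_chunk_count++;
    return true;
}

/* Returns true if ptr falls inside any registered chunk. */
static bool sketch_pool_owns(const void *ptr)
{
    uintptr_t p = (uintptr_t)ptr;
    for (size_t i = 0; i < g_chunk_count; i++) {
        if (p >= g_chunks[i].base && p - g_chunks[i].base < g_chunks[i].len) {
            return true;
        }
    }
    return false;
}
```

A classifier built this way would call something like `sketch_pool_owns(ptr)` before falling back to the external kind, so a page-aligned pointer at the start of a registered chunk would no longer be treated as foreign.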
### Option 3: Defer Phase 15 (Current)
Revert to Phase 14-C until the free() wrapper logic is fixed.

## User's Insight

> "Wait, isn't the mincore SEGV actually detecting a bug - that it's being called from the wrong layer?" (translated from Japanese)

**Interpretation**: ExternalGuard being called is CORRECT behavior - it's detecting that a HAKMEM pointer (Pool TLS?) is not being recognized by the classification layer!

## Conclusion

**Primary Bug**: `free()` wrapper unconditionally routes all pointers to `hak_free_at()` at line 227, regardless of HAKMEM ownership.

**Secondary Bug (suspected)**: `classify_ptr()` may fail to recognize Pool TLS allocations, causing them to be misclassified as `PTR_KIND_EXTERNAL`.

**Recommendation**: Apply Option 1 (fix the free() wrapper logic) first, then investigate Pool TLS classification if the issue persists.