# Phase 15 Registry Lookup Investigation

**Date**: 2025-11-15
**Status**: 🔍 ROOT CAUSE IDENTIFIED

## Summary

Page-aligned Tiny allocations reach ExternalGuard → SuperSlab registry lookup FAILS → delegated to `__libc_free()` → crash.

## Critical Findings

### 1. Registry Only Stores ONE SuperSlab

**Evidence**:
```
[SUPER_REG] register base=0x7d3893c00000 lg=21 slot=115870 magic=5353504c
```

**Only 1 registration** in entire test run (10K iterations, 100K operations).

### 2. 4MB Address Gap

**Pattern** (consistent across multiple runs):
- **Registry stores**: `0x7d3893c00000` (SuperSlab structure address)
- **Lookup searches**: `0x7d3893800000` (user pointer, 4MB **lower**)
- **Difference**: `0x400000 = 4MB = 2 × SuperSlab size (lg=21, 2MB)`

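The gap arithmetic can be checked mechanically. The sketch below is a standalone helper, not code from hakmem; `gap_in_superslabs` is a hypothetical name introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of hakmem): distance between a registered
 * base and a looked-up pointer, in whole SuperSlabs of size (1 << lg). */
static inline uint64_t gap_in_superslabs(uint64_t registered_base,
                                         uint64_t lookup_ptr,
                                         unsigned lg)
{
    assert(lg < 64);
    uint64_t ss_size = UINT64_C(1) << lg;
    uint64_t gap = registered_base > lookup_ptr
                       ? registered_base - lookup_ptr
                       : lookup_ptr - registered_base;
    return gap / ss_size;
}
```

For the addresses logged above and `lg=21`, the helper returns 2, matching the "2 × SuperSlab size" observation.
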
### 3. User Data Layout

**From code analysis** (`superslab_inline.h:30-35`):

```c
size_t off = SUPERSLAB_SLAB0_DATA_OFFSET + (size_t)slab_idx * SLAB_SIZE;
return (uint8_t*)ss + off;
```

**User data is placed AFTER SuperSlab structure**, NOT before!

**Implication**: User pointer `0x7d3893800000` **cannot** belong to SuperSlab `0x7d3893c00000` (4MB higher).

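The implication can be written as a containment test: because user data sits at a non-negative offset from the SuperSlab base, a pointer belongs to a SuperSlab only if it falls in `[base, base + size)`. This is a minimal sketch under that assumption; `ptr_in_superslab` is a hypothetical name, not a hakmem function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: true when ptr lies in [ss_base, ss_base + (1 << lg)). */
static inline bool ptr_in_superslab(uint64_t ss_base, unsigned lg, uint64_t ptr)
{
    assert(lg < 64);
    uint64_t size = UINT64_C(1) << lg;
    /* ptr below the base can never match: data offsets are non-negative. */
    return ptr >= ss_base && (ptr - ss_base) < size;
}
```

With the logged values, `ptr_in_superslab(0x7d3893c00000, 21, 0x7d3893800000)` is false, since the pointer lies below the registered base.
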
### 4. mmap Alignment Mechanism

**Code** (`hakmem_tiny_superslab.c:280-308`):

```c
size_t alloc_size = ss_size * 2;  // Allocate 4MB for 2MB SuperSlab
void* raw = mmap(NULL, alloc_size, ...);
uintptr_t aligned_addr = (raw_addr + ss_mask) & ~ss_mask;  // 2MB align
```

**Scenario**:
- mmap returns `0x7d3893800000` (already 2MB-aligned)
- `aligned_addr = 0x7d3893800000` (no change)
- Prefix size = 0, Suffix = 2MB (munmapped)
- **SuperSlab registered at**: `0x7d3893800000`

**Contradiction**: Registry shows `0x7d3893c00000`, not `0x7d3893800000`!

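The over-allocate-and-trim scheme in the scenario above can be sketched as follows. This is an illustration, not the code in `hakmem_tiny_superslab.c`: the `mmap` protection and flag arguments are assumptions (the excerpt elides them), and `trim_plan`, `plan_trim`, and `map_aligned` are hypothetical names.

```c
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Result of the alignment arithmetic for one raw mapping. */
typedef struct {
    uintptr_t aligned;  /* first ss_size-aligned address at or above raw */
    size_t    prefix;   /* bytes trimmed before the aligned base */
    size_t    suffix;   /* bytes trimmed after the aligned SuperSlab */
} trim_plan;

/* Pure arithmetic: raw mapping is 2 * ss_size bytes long. */
static inline trim_plan plan_trim(uintptr_t raw, size_t ss_size)
{
    assert(ss_size != 0 && (ss_size & (ss_size - 1)) == 0); /* power of two */
    uintptr_t mask = (uintptr_t)ss_size - 1;
    trim_plan p;
    p.aligned = (raw + mask) & ~mask;
    p.prefix  = (size_t)(p.aligned - raw);
    p.suffix  = ss_size - p.prefix;  /* 2*ss_size - prefix - ss_size */
    return p;
}

/* Maps ss_size bytes aligned to ss_size; returns NULL on failure.
 * ASSUMPTION: anonymous private read/write mapping. */
static inline void *map_aligned(size_t ss_size)
{
    void *raw = mmap(NULL, ss_size * 2, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) return NULL;
    trim_plan p = plan_trim((uintptr_t)raw, ss_size);
    if (p.prefix) munmap(raw, p.prefix);
    if (p.suffix) munmap((void *)(p.aligned + ss_size), p.suffix);
    return (void *)p.aligned;
}
```

For a raw address of `0x7d3893800000` and a 2MB SuperSlab, `plan_trim` yields prefix 0 and suffix 2MB, which is the scenario listed above. Logging `raw`, `prefix`, `suffix`, and `aligned` at this point would show whether the registered base ever differs from what this arithmetic predicts.
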
### 5. Hash Slot Mismatch

**Lookup**:
```
[SUPER_LOOKUP] ptr=0x7d3893800000 lg=21 aligned_base=0x7d3893800000 hash=115868
```

**Registry**:
```
[SUPER_REG] register base=0x7d3893c00000 lg=21 slot=115870
```

**Hash difference**: 115868 vs 115870 (2 slots apart)
**Suspected reason** (unverified): linear probing placed the entry in a different slot after a collision.

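One arithmetic observation about the two logged slots, offered as a hypothesis: if the registry hashes the aligned base as `(base >> lg)` masked to a 2^17-entry table, the home slots of `0x7d3893c00000` and `0x7d3893800000` come out as exactly 115870 and 115868. The hash function and the table size are assumptions inferred from the logged numbers, not read from the hakmem source; `sketch_aligned_base` and `sketch_home_slot` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: 2^17 registry slots, hash = (base >> lg) & (slots - 1).
 * Inferred from the log values only; the real hakmem hash may differ. */
#define SKETCH_REG_BITS 17u

/* Rounds a pointer down to the start of its (1 << lg)-sized region. */
static inline uint64_t sketch_aligned_base(uint64_t ptr, unsigned lg)
{
    assert(lg < 64);
    return ptr & ~((UINT64_C(1) << lg) - 1);
}

/* Home slot of a base under the assumed hash. */
static inline uint32_t sketch_home_slot(uint64_t base, unsigned lg)
{
    assert(lg < 64);
    return (uint32_t)((base >> lg) & ((UINT64_C(1) << SKETCH_REG_BITS) - 1));
}
```

If that assumption holds, the 2-slot difference follows directly from the 4MB (2 × SuperSlab) difference between the two bases, and no collision is needed to explain it. The lookup would then be probing for a base that was never registered, rather than missing an entry that exists.
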
## Root Cause Hypothesis

### Option A: Multiple SuperSlabs, Only One Registered

**Theory**: Multiple SuperSlabs allocated, but only the **last one** is logged.

**Problem**: Debug logging should show ALL registrations after fix (ENV check on every call).

### Option B: LRU Cache Reuse

**Theory**: Most SuperSlabs come from LRU cache (already registered), only new allocations are logged.

**Problem**: First few iterations should still show multiple registrations.

### Option C: Pointer is NOT from hakmem

**Theory**: `0x7d3893800000` is allocated by **`__libc_malloc()`**, NOT hakmem.

**Evidence**:
- Box BenchMeta uses `__libc_calloc` for `slots[]` array
- `free(slots[idx])` uses hakmem wrapper
- **But**: `slots[]` array itself is freed with `__libc_free(slots)` (Line 99)

**Contradiction**: `slots[]` should NOT reach hakmem `free()` wrapper.

### Option D: Registry Lookup Bug

**Theory**: SuperSlab **is** registered at `0x7d3893800000`, but lookup fails due to:
1. Hash collision (different slot used during registration vs lookup)
2. Linear probing limit exceeded (SUPER_MAX_PROBE = 8)
3. Alignment mismatch (looking for wrong base address)

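To make the three failure modes in Option D concrete, here is a minimal open-addressing registry with a bounded linear probe. It is a sketch, not the hakmem registry: the table is deliberately tiny, the hash is an assumed `(base >> lg)` mask, and every `sk_` name is hypothetical. Only the probe limit of 8 is taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_SLOTS     16u  /* tiny table for illustration; power of two */
#define SK_MAX_PROBE 8u   /* mirrors SUPER_MAX_PROBE = 8 from the text */

typedef struct { uint64_t base; } sk_entry;            /* base == 0: empty */
typedef struct { sk_entry slot[SK_SLOTS]; } sk_registry;

/* ASSUMED hash: region index masked to the table size. */
static inline uint32_t sk_hash(uint64_t base, unsigned lg)
{
    return (uint32_t)((base >> lg) & (SK_SLOTS - 1));
}

/* Returns the slot used, or -1 when all SK_MAX_PROBE candidates are taken. */
static inline int sk_register(sk_registry *r, uint64_t base, unsigned lg)
{
    assert(r != NULL && base != 0);
    uint32_t h = sk_hash(base, lg);
    for (uint32_t i = 0; i < SK_MAX_PROBE; i++) {
        uint32_t s = (h + i) & (SK_SLOTS - 1);
        if (r->slot[s].base == 0 || r->slot[s].base == base) {
            r->slot[s].base = base;
            return (int)s;
        }
    }
    return -1;  /* probe limit exceeded */
}

/* Returns the slot holding the region that contains ptr, or -1. */
static inline int sk_lookup(const sk_registry *r, uint64_t ptr, unsigned lg)
{
    assert(r != NULL);
    uint64_t base = ptr & ~((UINT64_C(1) << lg) - 1);  /* alignment step */
    uint32_t h = sk_hash(base, lg);
    for (uint32_t i = 0; i < SK_MAX_PROBE; i++) {
        uint32_t s = (h + i) & (SK_SLOTS - 1);
        if (r->slot[s].base == base) return (int)s;
        if (r->slot[s].base == 0) return -1;  /* empty slot ends the chain */
    }
    return -1;
}
```

In this model, a lookup for a pointer whose aligned base was never registered fails at the first empty slot (cause 3), while a ninth base colliding on the same home slot cannot be registered at all (cause 2). A collision by itself (cause 1) does not break lookup as long as registration and lookup walk the same probe sequence.
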
## Test Results Comparison

| Phase | Test Result | Behavior |
|-------|-------------|----------|
| Phase 14 | ✅ PASS (5.69M ops/s) | No crash with same test |
| Phase 15 | ❌ CRASH | ExternalGuard → `__libc_free()` failure |

**Conclusion**: Phase 15 Box Separation introduced regression.

## Next Steps

### Investigation Needed

1. **Add more detailed logging**:
   - Log ALL mmap calls with returned address
   - Log prefix/suffix munmap with exact ranges
   - Log final SuperSlab address vs mmap address
   - Track which pointers are allocated from which SuperSlab

2. **Verify registry integrity**:
   - Dump entire registry before crash
   - Check for hash collisions
   - Verify linear probing behavior

3. **Test with reduced SuperSlab size**:
   - Try lg=20 (1MB) instead of lg=21 (2MB)
   - See whether the address gap still occurs and whether it scales with SuperSlab size (it is 4MB at lg=21)

### Fix Options

#### **Option 1: Fix SuperSlab Registry Lookup** ✅ **RECOMMENDED**

**Issue**: Registry lookup fails for valid hakmem allocations.

**Potential fixes**:
- Increase SUPER_MAX_PROBE from 8 to 16/32
- Use better hash function to reduce collisions
- Store address **range** instead of single base
- Support lookup by any address within SuperSlab region

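The last two fixes (store a range, look up by any interior address) could take the following shape. This is only a sketch of the idea using a sorted array and binary search, which is a swapped-in structure and not how the hakmem registry is organized; `rg_entry` and `rg_find` are hypothetical names, and synchronization is ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One registered region: [base, base + size). */
typedef struct {
    uint64_t base;
    uint64_t size;
} rg_entry;

/* Entries must be sorted by base and non-overlapping.
 * Returns the index of the region containing ptr, or -1. */
static inline int rg_find(const rg_entry *e, size_t n, uint64_t ptr)
{
    assert(e != NULL || n == 0);
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (ptr < e[mid].base)
            hi = mid;                       /* ptr is below this region */
        else if (ptr - e[mid].base >= e[mid].size)
            lo = mid + 1;                   /* ptr is past this region */
        else
            return (int)mid;                /* ptr is inside this region */
    }
    return -1;
}
```

A range lookup like this does not depend on the pointer being rounded to the right base first, so it would sidestep an alignment mismatch. The trade-off is an O(log n) search and the need to keep the array ordered on every registration.
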
#### **Option 2: Improve ExternalGuard Safety** ⚠️ **WORKAROUND**

**Current behavior** (DANGEROUS):
```c
if (!is_mapped) return 0;  // Delegate to __libc_free → CRASH!
```

**Safer behavior**:
```c
if (!is_mapped) {
    fprintf(stderr, "[ExternalGuard] WARNING: Unknown pointer %p (ignored)\n", ptr);
    return 1;  // Claim handled (leak vs crash tradeoff)
}
```

**Pros**: Prevents crash
**Cons**: Memory leak for genuinely external pointers

#### **Option 3: Fix Box FrontGate Classification** ❌ NOT RECOMMENDED

**Idea**: Add special path for page-aligned Tiny pointers.

**Problems**:
- Can't read header at `ptr-1` (page boundary violation)
- Violates 1-byte header design
- Requires alternative classification

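The page-boundary problem comes down to one bit test: when `ptr` is page-aligned, the byte at `ptr-1` is the last byte of the previous page, which may not be mapped. A minimal sketch follows, assuming a power-of-two page size passed in by the caller (4096 is a common value, not something stated in this report); `header_on_previous_page` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the 1-byte header at ptr-1 lives on a different page than ptr,
 * i.e. when ptr itself is page-aligned. */
static inline bool header_on_previous_page(uint64_t ptr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (ptr & (page_size - 1)) == 0;
}
```

For `0x7d3893800000` with 4096-byte pages the check is true, so a classifier that reads `ptr-1` would touch the preceding page for exactly the pointers this investigation is about.
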
## Conclusion

**Primary Issue**: SuperSlab registry lookup fails for page-aligned user pointers.

**Secondary Issue**: ExternalGuard unconditionally delegates unknown pointers to `__libc_free()`.

**Recommended Action**:
1. Fix registry lookup (Option 1)
2. Add ExternalGuard safety (Option 2 as backup)
3. Comprehensive logging to confirm root cause