# Phase 15 Registry Lookup Investigation

**Date**: 2025-11-15
**Status**: 🔍 ROOT CAUSE IDENTIFIED

## Summary

Page-aligned Tiny allocations reach ExternalGuard → SuperSlab registry lookup FAILS → delegated to `__libc_free()` → crash.

## Critical Findings

### 1. Registry Only Stores ONE SuperSlab

**Evidence**:
```
[SUPER_REG] register base=0x7d3893c00000 lg=21 slot=115870 magic=5353504c
```

**Only 1 registration** in the entire test run (10K iterations, 100K operations).

### 2. 4MB Address Gap

**Pattern** (consistent across multiple runs):
- **Registry stores**: `0x7d3893c00000` (SuperSlab structure address)
- **Lookup searches**: `0x7d3893800000` (user pointer, 4MB **lower**)
- **Difference**: `0x400000 = 4MB = 2 × SuperSlab size (lg=21, 2MB)`

### 3. User Data Layout

**From code analysis** (`superslab_inline.h:30-35`):
```c
size_t off = SUPERSLAB_SLAB0_DATA_OFFSET + (size_t)slab_idx * SLAB_SIZE;
return (uint8_t*)ss + off;
```

**User data is placed AFTER the SuperSlab structure**, NOT before!

**Implication**: User pointer `0x7d3893800000` **cannot** belong to SuperSlab `0x7d3893c00000` (4MB higher).

### 4. mmap Alignment Mechanism

**Code** (`hakmem_tiny_superslab.c:280-308`):
```c
size_t alloc_size = ss_size * 2;  // Allocate 4MB for 2MB SuperSlab
void* raw = mmap(NULL, alloc_size, ...);
uintptr_t aligned_addr = (raw_addr + ss_mask) & ~ss_mask;  // 2MB align
```

**Scenario**:
- mmap returns `0x7d3893800000` (already 2MB-aligned)
- `aligned_addr = 0x7d3893800000` (no change)
- Prefix size = 0, Suffix = 2MB (munmapped)
- **SuperSlab registered at**: `0x7d3893800000`

**Contradiction**: Registry shows `0x7d3893c00000`, not `0x7d3893800000`!

### 5. Hash Slot Mismatch

**Lookup**:
```
[SUPER_LOOKUP] ptr=0x7d3893800000 lg=21 aligned_base=0x7d3893800000 hash=115868
```

**Registry**:
```
[SUPER_REG] register base=0x7d3893c00000 lg=21 slot=115870
```

**Hash difference**: 115868 vs 115870 (2 slots apart).

**Reason**: The lookup and the registration use different bases. `0x7d3893800000` and `0x7d3893c00000` are 4MB apart, which is two 2MB units, and both logged slots are consistent with `base >> 21` reduced modulo a power-of-two table size. The 2-slot offset therefore follows from the address gap; it does not by itself show that linear probing moved the entry after a collision.
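The slot numbers in the two log lines can be checked against the addresses directly. The sketch below is not the hakmem hash function, which this report does not show: `assumed_slot` and `ASSUMED_REGISTRY_SLOTS` are hypothetical, chosen only because a shift-and-mask hash reproduces the logged values (any power-of-two table size from 2^17 to 2^19 gives the same result for these addresses).

```c
#include <stdint.h>
#include <stdio.h>

/* ASSUMPTION: table size and hash are not taken from hakmem; they are
 * picked because they reproduce the slot numbers in the logs above. */
#define ASSUMED_REGISTRY_SLOTS (1u << 17)

/* Round an address down to the start of its 2^lg-byte region. */
static uint64_t align_down(uint64_t addr, unsigned lg) {
    return addr & ~(((uint64_t)1 << lg) - 1);
}

/* Assumed hash: region number masked to the table size. */
static unsigned assumed_slot(uint64_t base, unsigned lg) {
    return (unsigned)((base >> lg) & (ASSUMED_REGISTRY_SLOTS - 1u));
}

/* Print the address gap and both slots for the values seen in the logs. */
void print_gap_report(void) {
    const unsigned lg = 21;                        /* 2MB SuperSlab */
    const uint64_t registered = 0x7d3893c00000ULL; /* from [SUPER_REG] */
    const uint64_t user_ptr   = 0x7d3893800000ULL; /* from [SUPER_LOOKUP] */
    const uint64_t base = align_down(user_ptr, lg);

    printf("gap=0x%llx lookup_slot=%u registered_slot=%u\n",
           (unsigned long long)(registered - base),
           assumed_slot(base, lg),
           assumed_slot(registered, lg));
}
```

Calling `print_gap_report()` from `main` prints `gap=0x400000 lookup_slot=115868 registered_slot=115870`, matching the logs. Under this assumed hash, the slot difference is fully explained by the 4MB gap between the two bases.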
## Root Cause Hypothesis

### Option A: Multiple SuperSlabs, Only One Registered

**Theory**: Multiple SuperSlabs are allocated, but only the **last one** is logged.

**Problem**: Debug logging should show ALL registrations after the fix (ENV check on every call).

### Option B: LRU Cache Reuse

**Theory**: Most SuperSlabs come from the LRU cache (already registered); only new allocations are logged.

**Problem**: The first few iterations should still show multiple registrations.

### Option C: Pointer is NOT from hakmem

**Theory**: `0x7d3893800000` is allocated by **`__libc_malloc()`**, NOT hakmem.

**Evidence**:
- Box BenchMeta uses `__libc_calloc` for the `slots[]` array
- `free(slots[idx])` uses the hakmem wrapper
- **But**: the `slots[]` array itself is freed with `__libc_free(slots)` (Line 99)

**Contradiction**: `slots[]` should NOT reach the hakmem `free()` wrapper.

### Option D: Registry Lookup Bug

**Theory**: The SuperSlab **is** registered at `0x7d3893800000`, but lookup fails due to:
1. Hash collision (different slot used during registration vs lookup)
2. Linear probing limit exceeded (SUPER_MAX_PROBE = 8)
3. Alignment mismatch (looking for the wrong base address)

## Test Results Comparison

| Phase | Test Result | Behavior |
|-------|-------------|----------|
| Phase 14 | ✅ PASS (5.69M ops/s) | No crash with same test |
| Phase 15 | ❌ CRASH | ExternalGuard → `__libc_free()` failure |

**Conclusion**: Phase 15 Box Separation introduced a regression.

## Next Steps

### Investigation Needed

1. **Add more detailed logging**:
   - Log ALL mmap calls with the returned address
   - Log prefix/suffix munmap with exact ranges
   - Log the final SuperSlab address vs the mmap address
   - Track which pointers are allocated from which SuperSlab

2. **Verify registry integrity**:
   - Dump the entire registry before the crash
   - Check for hash collisions
   - Verify linear probing behavior

3. **Test with reduced SuperSlab size**:
   - Try lg=20 (1MB) instead of lg=21 (2MB)
   - See whether the address gap still occurs and how it scales (4MB was observed at lg=21)

### Fix Options

#### **Option 1: Fix SuperSlab Registry Lookup** ✅ **RECOMMENDED**

**Issue**: Registry lookup fails for valid hakmem allocations.

**Potential fixes**:
- Increase SUPER_MAX_PROBE from 8 to 16/32
- Use a better hash function to reduce collisions
- Store an address **range** instead of a single base
- Support lookup by any address within the SuperSlab region

#### **Option 2: Improve ExternalGuard Safety** ⚠️ **WORKAROUND**

**Current behavior** (DANGEROUS):
```c
if (!is_mapped) return 0;  // Delegate to __libc_free → CRASH!
```

**Safer behavior**:
```c
if (!is_mapped) {
    fprintf(stderr, "[ExternalGuard] WARNING: Unknown pointer %p (ignored)\n", ptr);
    return 1;  // Claim handled (leak vs crash tradeoff)
}
```

**Pros**: Prevents the crash

**Cons**: Leaks memory for genuinely external pointers

#### **Option 3: Fix Box FrontGate Classification** ❌ NOT RECOMMENDED

**Idea**: Add a special path for page-aligned Tiny pointers.

**Problems**:
- Can't read the header at `ptr-1` (page boundary violation)
- Violates the 1-byte header design
- Requires an alternative classification

## Conclusion

**Primary Issue**: SuperSlab registry lookup fails for page-aligned user pointers.

**Secondary Issue**: ExternalGuard unconditionally delegates unknown pointers to `__libc_free()`.

**Recommended Action**:
1. Fix registry lookup (Option 1)
2. Add ExternalGuard safety (Option 2 as backup)
3. Add comprehensive logging to confirm the root cause
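
The range-based lookup suggested under Fix Option 1 could look roughly like the following. This is a minimal sketch, not hakmem's registry: `demo_entry`, `demo_register`, `demo_lookup`, and the 64-slot table are hypothetical names and sizes, and only the probe limit of 8 mirrors `SUPER_MAX_PROBE` from this report. It assumes bases are aligned to `2^lg` bytes and sizes are multiples of that unit, and it registers a range under every unit it covers so that an interior address hashes to a slot that can reach the entry.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_SLOTS     64u  /* hypothetical table size, power of two */
#define DEMO_MAX_PROBE 8u   /* mirrors SUPER_MAX_PROBE = 8 in this report */

typedef struct {
    uint64_t base;  /* first byte of the range; 0 marks an empty slot */
    uint64_t size;  /* length of the range in bytes */
} demo_entry;

static demo_entry demo_table[DEMO_SLOTS];

/* Hash the 2^lg-byte unit that contains addr. */
static unsigned demo_hash(uint64_t addr, unsigned lg) {
    return (unsigned)((addr >> lg) & (DEMO_SLOTS - 1u));
}

/* Store {base, size} in the first free slot reachable from unit's hash. */
static int demo_insert(uint64_t unit, uint64_t base, uint64_t size, unsigned lg) {
    unsigned h = demo_hash(unit, lg);
    for (unsigned i = 0; i < DEMO_MAX_PROBE; i++) {
        demo_entry *e = &demo_table[(h + i) & (DEMO_SLOTS - 1u)];
        if (e->base == 0) {
            e->base = base;
            e->size = size;
            return 1;
        }
    }
    return 0;  /* probe limit exceeded */
}

/* Register a range under every unit it spans. Returns 0 on failure;
 * a failed call may leave earlier units registered (sketch only). */
static int demo_register(uint64_t base, uint64_t size, unsigned lg) {
    const uint64_t unit_size = (uint64_t)1 << lg;
    for (uint64_t off = 0; off < size; off += unit_size) {
        if (!demo_insert(base + off, base, size, lg)) return 0;
    }
    return 1;
}

/* Find the range containing addr, or NULL. No deletion is supported,
 * so an empty slot terminates the probe sequence. */
static const demo_entry *demo_lookup(uint64_t addr, unsigned lg) {
    unsigned h = demo_hash(addr, lg);
    for (unsigned i = 0; i < DEMO_MAX_PROBE; i++) {
        const demo_entry *e = &demo_table[(h + i) & (DEMO_SLOTS - 1u)];
        if (e->base == 0) return NULL;
        /* Unsigned subtraction wraps when addr < base, so this single
         * comparison checks base <= addr < base + size. */
        if (addr - e->base < e->size) return e;
    }
    return NULL;
}
```

With the logged SuperSlab registered as `demo_register(0x7d3893c00000ULL, 1ULL << 21, 21)`, a lookup of `0x7d3893800000` still returns `NULL`, because that address lies below the registered range. A range check alone would therefore not resolve the failing pointer unless the SuperSlab that owns it is also registered.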