# FAST_CAP=0 SEGV Root Cause Analysis
## Executive Summary
**Status:** Fix #1 and Fix #2 are implemented correctly BUT are **NOT BEING EXECUTED** in the crash scenario.
**Root Cause Discovered:** When `FAST_CAP=0` and `g_tls_list_enable=1` (TLS List mode), the free path **BYPASSES the freelist entirely** and stores freed blocks in TLS List cache. These blocks are **NEVER merged into the SuperSlab freelist** until TLS List spills. Meanwhile, the allocation path tries to allocate from the freelist, which contains **stale pointers** from cross-thread frees that were never drained.
**Critical Flow Bug:**
```
Thread A:
  1. free(ptr) → g_fast_cap[cls]=0 → skip fast tier
  2. g_tls_list_enable=1 → TLS List push (L75-79 in free.inc)
  3. RETURNS WITHOUT TOUCHING FREELIST (meta->freelist unchanged)
  4. Remote frees accumulate in remote_heads[] but NEVER get drained
Thread B:
  1. alloc() → hak_tiny_alloc_superslab(cls)
  2. meta->freelist EXISTS (has stale/remote pointers)
  3. FIX #2 SHOULD drain here (L740-743) BUT...
  4. has_remote = (remote_heads[idx] != 0) → FALSE (TLS List frees never reach remote_heads[])
  5. Dereferences stale freelist → **SEGV**
```
---
## Why Fix #1 and Fix #2 Are Not Executed
### Fix #1 (superslab_refill L615-620): NOT REACHED
```c
// Fix #1: In superslab_refill() loop
for (int i = 0; i < tls_cap; i++) {
    int has_remote = (atomic_load_explicit(&tls->ss->remote_heads[i], memory_order_acquire) != 0);
    if (has_remote) {
        ss_remote_drain_to_freelist(tls->ss, i); // ← This line NEVER executes
    }
    if (tls->ss->slabs[i].freelist) { ... }
}
```
**Why it doesn't execute:**
1. **Larson immediately crashes on first allocation miss**
- The allocation path is: `hak_tiny_alloc_superslab()` (L720) → checks existing `meta->freelist` (L737) → SEGV
- It **NEVER reaches** `superslab_refill()` (L755) because it crashes first!
2. **Even if it did reach refill:**
- Loop checks ALL slabs `i=0..tls_cap-1`, but the current TLS slab is `tls->slab_idx` (e.g., 7)
- When checking slab `i=0..6`, those slabs don't have `remote_heads[i]` set
- When checking slab `i=7`, it finds `freelist` exists and **RETURNS IMMEDIATELY** (L624) without draining!
### Fix #2 (hak_tiny_alloc_superslab L737-743): CONDITION ALWAYS FALSE
```c
if (meta && meta->freelist) {
    int has_remote = (atomic_load_explicit(&tls->ss->remote_heads[tls->slab_idx], memory_order_acquire) != 0);
    if (has_remote) { // ← ALWAYS FALSE!
        ss_remote_drain_to_freelist(tls->ss, tls->slab_idx);
    }
    void* block = meta->freelist; // ← SEGV HERE
    meta->freelist = *(void**)block;
}
```
**Why `has_remote` is always false:**
1. **Wrong understanding of remote queue semantics:**
- `remote_heads[idx]` is **NOT a flag** indicating "has remote frees"
- It's the **HEAD POINTER** of the remote queue linked list
- When TLS List mode is active, frees go to TLS List, **NOT to remote_heads[]**!
2. **Actual remote free flow in TLS List mode:**
```
hak_tiny_free() → class_idx detected → g_fast_cap=0 → skip fast
→ g_tls_list_enable=1 → TLS List push (L75-79)
→ RETURNS (L80) WITHOUT calling ss_remote_push()!
```
3. **Therefore:**
- `remote_heads[idx]` remains `NULL` (never used in TLS List mode)
- `has_remote` check is always false
- Drain never happens
- Freelist contains stale pointers from old allocations
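
The "head pointer, not a flag" semantics described above can be illustrated with a minimal sketch of a lock-free remote-free stack. This is not HAKMEM's implementation: `DemoSlab`, `demo_remote_push`, and `demo_remote_drain` are hypothetical names, and a single `remote_head` stands in for one entry of `remote_heads[]`. It only shows why a non-zero head means "there is a chain to drain" and why a path that never pushes leaves the head at zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified slab metadata (names are illustrative only). */
typedef struct {
    void *freelist;                 /* owner-thread intrusive free list */
    _Atomic uintptr_t remote_head;  /* head of the cross-thread stack, 0 = empty */
} DemoSlab;

/* Any thread: push a freed block onto the remote stack (Treiber-style push). */
static void demo_remote_push(DemoSlab *s, void *block) {
    assert(block != NULL);
    uintptr_t old = atomic_load_explicit(&s->remote_head, memory_order_relaxed);
    do {
        /* The link is stored in the first word of the freed block itself. */
        *(void **)block = (void *)old;
    } while (!atomic_compare_exchange_weak_explicit(
        &s->remote_head, &old, (uintptr_t)block,
        memory_order_release, memory_order_relaxed));
}

/* Owner thread only: detach the whole remote chain in one exchange and
 * splice each block onto the local freelist. Returns the number drained. */
static size_t demo_remote_drain(DemoSlab *s) {
    void *p = (void *)atomic_exchange_explicit(&s->remote_head, 0,
                                               memory_order_acquire);
    size_t n = 0;
    while (p != NULL) {
        void *next = *(void **)p;   /* read the link before overwriting it */
        *(void **)p = s->freelist;
        s->freelist = p;
        p = next;
        n++;
    }
    return n;
}
```

Draining an empty queue costs one atomic exchange and returns 0, which is the reason an unconditional drain is cheap when nothing was pushed.
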
---
## The Missing Link: TLS List Spill Path
When TLS List is enabled, freed blocks flow like this:
```
free() → TLS List cache → [eventually] tls_list_spill_excess()
→ WHERE DO THEY GO? → Need to check tls_list_spill implementation!
```
**Hypothesis:** TLS List spill probably returns blocks to Magazine/Registry, **NOT to SuperSlab freelist**. This creates a **disconnect** where:
1. Blocks are allocated from SuperSlab freelist
2. Blocks are freed into TLS List
3. TLS List spills to Magazine/Registry (NOT back to freelist)
4. SuperSlab freelist becomes stale (contains pointers to freed memory)
5. Cross-thread frees accumulate in remote_heads[] but never merge
6. Next allocation from freelist → SEGV
---
## Evidence from Debug Ring Output
**Key observation:** `remote_drain` events are **NEVER** recorded in debug output.
**Why?**
- `TINY_RING_EVENT_REMOTE_DRAIN` is only recorded in `ss_remote_drain_to_freelist()` (superslab.h:341-344)
- But this function is never called because:
- Fix #1 not reached (crash before refill)
- Fix #2 condition always false (remote_heads[] unused in TLS List mode)
**What IS recorded:**
- `remote_push` events: Yes (cross-thread frees call ss_remote_push in some path)
- `remote_drain` events: No (never called)
- This confirms the diagnosis: **remote queues fill up but never drain**
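
The "pushes recorded, drains absent" check can be expressed as a count over a trace ring. The sketch below is a hypothetical single-threaded ring (`DemoRing`, `DemoEventKind`, and the `demo_ring_*` functions are illustrative names, not the HAKMEM Debug Ring API). Because a fixed-size ring overwrites its oldest entries, a count of zero only proves absence within the retained window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds, loosely mirroring the ones named in the text. */
typedef enum {
    DEMO_EV_ALLOC,
    DEMO_EV_REMOTE_PUSH,
    DEMO_EV_REMOTE_DRAIN
} DemoEventKind;

typedef struct {
    DemoEventKind kind;
    uintptr_t     ptr;
} DemoEvent;

enum { DEMO_RING_CAP = 8 };

typedef struct {
    DemoEvent ev[DEMO_RING_CAP];
    size_t    next;  /* total number of events ever recorded */
} DemoRing;

/* Record one event, overwriting the oldest entry once the ring is full. */
static void demo_ring_record(DemoRing *r, DemoEventKind kind, const void *p) {
    assert(r != NULL);
    DemoEvent *e = &r->ev[r->next % DEMO_RING_CAP];
    e->kind = kind;
    e->ptr  = (uintptr_t)p;
    r->next++;
}

/* Count the retained events of one kind (at most DEMO_RING_CAP entries). */
static size_t demo_ring_count(const DemoRing *r, DemoEventKind kind) {
    size_t live = r->next < DEMO_RING_CAP ? r->next : DEMO_RING_CAP;
    size_t n = 0;
    for (size_t i = 0; i < live; i++) {
        if (r->ev[i].kind == kind) n++;
    }
    return n;
}
```

A non-zero `DEMO_EV_REMOTE_PUSH` count together with a zero `DEMO_EV_REMOTE_DRAIN` count corresponds to the pattern reported above.
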
---
## Code Paths Verified
### Free Path (FAST_CAP=0, TLS List mode)
```
hak_tiny_free(ptr)
  hak_tiny_free_with_slab(ptr, NULL) // NULL = SuperSlab mode
    [L14-36] Cross-thread check → if different thread → hak_tiny_free_superslab() → ss_remote_push()
    [L38-51] g_debug_fast0 check → NO (not set)
    [L53-59] g_fast_cap[cls]=0 → SKIP fast tier
    [L61-92] g_tls_list_enable=1 → TLS List push → RETURN ✓
    NEVER REACHES Magazine/freelist code (L94+)
```
**Problem:** Same-thread frees go to TLS List, **never update SuperSlab freelist**.
### Alloc Path (FAST_CAP=0)
```
hak_tiny_alloc(size)
  [Benchmark path disabled for FAST_CAP=0]
  hak_tiny_alloc_slow(size, cls)
    hak_tiny_alloc_superslab(cls)
      [L727-735] meta->freelist == NULL && used < cap → linear alloc (virgin slab)
      [L737-752] meta->freelist EXISTS → CHECK remote_heads[] (Fix #2)
        has_remote = (remote_heads[idx] != 0) → FALSE (TLS List mode doesn't use it)
        block = meta->freelist → *(void**)block → SEGV 💥
```
**Problem:** Freelist contains pointers to blocks that were:
1. Freed by same thread → went to TLS List
2. Freed by other threads → went to remote_heads[] but never drained
3. Never merged back to freelist
---
## Additional Problems Found
### 1. Ultra-Simple Free Path Incompatibility
When `g_tiny_ultra=1` (HAKMEM_TINY_ULTRA=1), the free path is:
```c
// hakmem_tiny_free.inc:886-908
if (g_tiny_ultra) {
    // Detect class_idx from SuperSlab
    // Push to TLS SLL (not TLS List!)
    if (g_tls_sll_count[cls] < sll_cap) {
        *(void**)ptr = g_tls_sll_head[cls];
        g_tls_sll_head[cls] = ptr;
        return; // BYPASSES remote queue entirely!
    }
}
```
**Problem:** Ultra mode also bypasses remote queues for same-thread frees!
### 2. Linear Allocation Mode Confusion
```c
// L727-735: Linear allocation (freelist == NULL)
if (meta->freelist == NULL && meta->used < meta->capacity) {
    void* block = slab_base + (meta->used * block_size);
    meta->used++;
    return block; // ✓ Safe (virgin memory)
}
```
**This is safe!** Linear allocation doesn't touch freelist at all.
**But next allocation:**
```c
// L737-752: Freelist allocation
if (meta->freelist) { // ← Freelist exists from OLD allocations
    // Fix #2 check (always false in TLS List mode)
    void* block = meta->freelist; // ← STALE POINTER
    meta->freelist = *(void**)block; // ← SEGV 💥
}
```
---
## Root Cause Summary
**The fundamental issue:** HAKMEM has **TWO SEPARATE FREE PATHS**:
1. **SuperSlab freelist path** (original design)
- Frees update `meta->freelist` directly
- Cross-thread frees go to `remote_heads[]`
- Drain merges remote_heads[] → freelist
- Alloc pops from freelist
2. **TLS List/Magazine path** (optimization layer)
- Frees go to TLS cache (never touch freelist!)
- Spills go to Magazine → Registry
- **DISCONNECTED from SuperSlab freelist!**
**When FAST_CAP=0:**
- TLS List path is activated (no fast tier to bypass)
- ALL same-thread frees go to TLS List
- SuperSlab freelist is **NEVER UPDATED**
- Cross-thread frees accumulate in remote_heads[]
- remote_heads[] is **NEVER DRAINED** (Fix #2 check fails)
- Next alloc from stale freelist → **SEGV**
---
## Why Debug Ring Produces No Output
**Expected:** SIGSEGV handler dumps Debug Ring before crash
**Actual:** Immediate crash with no output
**Possible reasons:**
1. **Stack corruption before handler runs**
- Freelist corruption may have corrupted stack
- Signal handler can't execute safely
2. **Handler not installed (HAKMEM_TINY_TRACE_RING=1 not set)**
- Check: `g_tiny_ring_enabled` must be 1
- Verify env var is exported BEFORE running Larson
3. **Fast crash (no time to record events)**
- Unlikely (should have at least ALLOC_ENTER events)
4. **Crash in signal handler itself**
- Handler may call async-signal-unsafe functions such as `fprintf` (`write` itself is async-signal-safe, so a `write`-only handler avoids this)
- May fail if heap is corrupted
**Recommendation:** Echo the variable in the same environment BEFORE running Larson to confirm it is actually exported:
```bash
HAKMEM_TINY_TRACE_RING=1 LD_PRELOAD=./libhakmem.so \
bash -c 'echo "Ring enabled: $HAKMEM_TINY_TRACE_RING"; ./larson_hakmem ...'
```
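
For the fourth possibility above (a crash inside the handler itself), a handler restricted to async-signal-safe calls is one way to rule the handler out as the failure point. The sketch below is an assumption-laden illustration, not the existing HAKMEM handler: all `demo_*` names are hypothetical. It formats the fault address by hand, emits it with `write(2)` only, and runs on an alternate stack so it can still execute if the faulting thread's stack is unusable.

```c
#define _XOPEN_SOURCE 700  /* sigaction, siginfo_t, sigaltstack */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Format v as "0x<hex>" without stdio.
 * Returns the number of bytes written, or 0 if buf is too small. */
static size_t demo_fmt_hex(uintptr_t v, char *buf, size_t cap) {
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof(uintptr_t)];
    size_t n = 0;
    do { tmp[n++] = digits[v & 0xf]; v >>= 4; } while (v != 0);
    if (cap < n + 2) return 0;
    buf[0] = '0';
    buf[1] = 'x';
    for (size_t i = 0; i < n; i++) buf[2 + i] = tmp[n - 1 - i];
    return n + 2;
}

/* Handler body uses only write(2); no fprintf, no heap allocation. */
static void demo_crash_handler(int sig, siginfo_t *info, void *uctx) {
    (void)sig;
    (void)uctx;
    static const char msg[] = "[demo] fatal signal, fault address ";
    char hex[2 + 2 * sizeof(uintptr_t)];
    size_t n = demo_fmt_hex((uintptr_t)info->si_addr, hex, sizeof hex);
    ssize_t r = write(STDERR_FILENO, msg, sizeof msg - 1);
    r = write(STDERR_FILENO, hex, n);
    r = write(STDERR_FILENO, "\n", 1);
    (void)r;
    /* SA_RESETHAND already restored the default action, so returning
     * re-executes the faulting instruction and the process dies normally. */
}

static unsigned char demo_alt_stack[64 * 1024];

/* Install the handler for SIGSEGV on an alternate stack. Returns 0 on success. */
static int demo_install_crash_handler(void) {
    stack_t ss = {0};
    ss.ss_sp = demo_alt_stack;
    ss.ss_size = sizeof demo_alt_stack;
    ss.ss_flags = 0;
    int rc = sigaltstack(&ss, NULL);
    assert(rc == 0);
    (void)rc;

    struct sigaction sa = {0};
    sa.sa_sigaction = demo_crash_handler;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK | SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

A ring dump added to such a handler would need to follow the same rule: preformatted or hand-formatted bytes passed to `write`, nothing that can take a lock or allocate.
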
---
## Recommended Fixes
### Option A: Unconditional Drain in Alloc Path (SAFE, SIMPLE) ⭐⭐⭐⭐⭐
**Location:** `hak_tiny_alloc_superslab()` L737-752
**Change:**
```c
if (meta && meta->freelist) {
    // UNCONDITIONAL drain: always merge remote frees before using freelist
    // Cost: ~50-100ns (only when freelist exists, amortized by batch drain)
    ss_remote_drain_to_freelist(tls->ss, tls->slab_idx);
    // Now safe to use freelist
    void* block = meta->freelist;
    meta->freelist = *(void**)block;
    meta->used++;
    ss_active_inc(tls->ss);
    return block;
}
```
**Pros:**
- Guarantees correctness (no stale pointers)
- Simple, easy to verify
- Estimated overhead of only ~50-100ns per allocation miss (not yet measured)
**Cons:**
- May drain empty queues (wasted atomic load)
- Doesn't fix the root issue (TLS List disconnect)
### Option B: Force TLS List Spill to SuperSlab Freelist (CORRECT FIX) ⭐⭐⭐⭐
**Location:** `tls_list_spill_excess()` (need to find this function)
**Change:** Modify spill path to return blocks to **SuperSlab freelist** instead of Magazine:
```c
void tls_list_spill_excess(int class_idx, TinyTLSList* tls) {
    SuperSlab* ss = g_tls_slabs[class_idx].ss;
    if (!ss) { /* fallback to Magazine */ return; }
    int slab_idx = g_tls_slabs[class_idx].slab_idx;
    TinySlabMeta* meta = &ss->slabs[slab_idx];
    // Spill half to SuperSlab freelist (under lock)
    // NOTE: assumes every spilled block belongs to the current TLS slab (see Cons)
    int spill_count = tls->count / 2;
    for (int i = 0; i < spill_count; i++) {
        void* ptr = tls_list_pop(tls);
        if (!ptr) break; // list emptied early
        // Push to freelist
        *(void**)ptr = meta->freelist;
        meta->freelist = ptr;
        meta->used--;
    }
}
```
**Pros:**
- Fixes root cause (reconnects TLS List → SuperSlab)
- No allocation path overhead
- Maintains cache efficiency
**Cons:**
- Requires lock (spill is already under lock)
- Need to identify correct slab for each block (may be from different slabs)
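
The last point, finding the owning slab for each spilled block, is commonly solved by address arithmetic when superslabs are aligned to their own size. The sketch below assumes that layout and uses made-up geometry (`DEMO_SS_SIZE`, `DEMO_SLAB_SIZE`) and hypothetical types, since the real HAKMEM constants and lookup helpers are not given here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry: a 16 KiB superslab, aligned to its own size,
 * split into 16 slabs of 1 KiB each. */
#define DEMO_SS_SIZE   ((uintptr_t)1 << 14)
#define DEMO_SLAB_SIZE ((uintptr_t)1 << 10)
enum { DEMO_SLABS = 16 };

typedef struct { void *freelist; unsigned used; } DemoSlabMeta;
typedef struct { DemoSlabMeta slabs[DEMO_SLABS]; } DemoSuperSlabMeta;

/* Mask off the low bits to recover the superslab base from any block. */
static uintptr_t demo_ss_base(const void *p) {
    return (uintptr_t)p & ~(DEMO_SS_SIZE - 1);
}

/* Offset within the superslab divided by the slab size gives the owner. */
static size_t demo_slab_index(const void *p) {
    return (size_t)(((uintptr_t)p - demo_ss_base(p)) / DEMO_SLAB_SIZE);
}

/* Return each block to the freelist of the slab it actually came from,
 * rather than to whichever slab the thread is currently allocating from. */
static void demo_spill_to_owner(DemoSuperSlabMeta *m, uintptr_t ss_base,
                                void **blocks, size_t n) {
    for (size_t i = 0; i < n; i++) {
        assert(demo_ss_base(blocks[i]) == ss_base);  /* same superslab only */
        DemoSlabMeta *meta = &m->slabs[demo_slab_index(blocks[i])];
        *(void **)blocks[i] = meta->freelist;
        meta->freelist = blocks[i];
        meta->used--;
    }
}
```

Blocks that belong to a different superslab would need a separate lookup (for example through a registry); the sketch asserts instead of handling that case.
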
### Option C: Disable TLS List Mode for FAST_CAP=0 (WORKAROUND) ⭐⭐⭐
**Location:** `hak_tiny_init()` or free path
**Change:**
```c
// In init:
if (g_fast_cap_all_zero) {
    g_tls_list_enable = 0; // Force Magazine path
}
// Or in free path:
if (g_tls_list_enable && g_fast_cap[class_idx] == 0) {
    // Force Magazine path for this class
    goto use_magazine_path;
}
```
**Pros:**
- Minimal code change
- Forces consistent path (Magazine → freelist)
**Cons:**
- Doesn't fix the bug (just avoids it)
- Performance may suffer (Magazine has overhead)
### Option D: Track Freelist Validity (DEFENSIVE) ⭐⭐
**Add flag:** `meta->freelist_valid` (1 bit in meta)
**Set valid:** When updating freelist (free, spill)
**Clear valid:** When allocating from virgin slab
**Check valid:** Before dereferencing freelist
**Pros:**
- Catches corruption early
- Good for debugging
**Cons:**
- Adds overhead (1 extra check per alloc)
- Doesn't fix the bug (just detects it)
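
A related defensive check, different from the `freelist_valid` bit proposed above, is to validate the freelist head and its stored link against the slab's address range before dereferencing. The sketch below is a hypothetical illustration (`DemoSlabMeta` and the `demo_*` functions are not HAKMEM names). It detects pointers that fall outside the slab or off a block boundary; it cannot detect a block that is in range but already handed out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slab metadata carrying enough geometry for a range check. */
typedef struct {
    void     *freelist;
    uintptr_t base;        /* address of the first block */
    size_t    block_size;  /* bytes per block */
    size_t    capacity;    /* number of blocks in the slab */
} DemoSlabMeta;

/* True if p lies inside the slab and on a block boundary. */
static int demo_ptr_in_slab(const DemoSlabMeta *m, const void *p) {
    uintptr_t a = (uintptr_t)p;
    if (a < m->base) return 0;
    uintptr_t off = a - m->base;
    return off < m->block_size * m->capacity && off % m->block_size == 0;
}

/* Pop with validation. A bad head empties the list and returns NULL;
 * a bad link truncates the list after the returned block. */
static void *demo_checked_pop(DemoSlabMeta *m) {
    assert(m->block_size >= sizeof(void *));
    void *block = m->freelist;
    if (block == NULL) return NULL;
    if (!demo_ptr_in_slab(m, block)) {
        m->freelist = NULL;          /* refuse to dereference a stale head */
        return NULL;
    }
    void *next = *(void **)block;
    if (next != NULL && !demo_ptr_in_slab(m, next)) {
        next = NULL;                 /* link was overwritten: drop the tail */
    }
    m->freelist = next;
    return block;
}
```

In a debug build the two rejection branches are natural places to log or abort, which turns a later SEGV into an immediate, attributable failure.
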
---
## Recommended Action Plan
### Immediate (1 hour): Confirm Diagnosis
1. **Add printf at crash site:**
```c
// hakmem_tiny_free.inc L745
fprintf(stderr, "[ALLOC] freelist=%p remote_heads=%p tls_list_en=%d\n",
        meta->freelist,
        (void*)atomic_load_explicit(&tls->ss->remote_heads[tls->slab_idx], memory_order_acquire),
        g_tls_list_enable);
```
2. **Run Larson with FAST_CAP=0:**
```bash
HAKMEM_TINY_FAST_CAP=0 HAKMEM_LARSON_TINY_ONLY=1 \
HAKMEM_TINY_TRACE_RING=1 ./larson_hakmem 2 8 128 1024 1 12345 4 2>&1 | tee crash.log
```
3. **Verify output shows:**
- `freelist != NULL` (stale freelist exists)
- `remote_heads == NULL` (never used in TLS List mode)
- `tls_list_en = 1` (TLS List mode active)
### Short-term (2 hours): Implement Option A
**Safest, fastest fix:**
1. Edit `core/hakmem_tiny_free.inc` L737-743
2. Change conditional drain to **unconditional**
3. `make clean && make`
4. Test with Larson FAST_CAP=0
5. Verify no SEGV, measure performance impact
### Medium-term (1 day): Implement Option B
**Proper fix:**
1. Find `tls_list_spill_excess()` implementation
2. Add path to return blocks to SuperSlab freelist
3. Test with all configurations (FAST_CAP=0/64, TLS_LIST=0/1)
4. Measure performance vs. current
### Long-term (1 week): Unified Free Path
**Ultimate solution:**
1. Audit all free paths (TLS List, Magazine, Fast, Ultra, SuperSlab)
2. Ensure consistency: freed blocks ALWAYS return to owner slab
3. Remote frees ALWAYS go through remote queue (or mailbox)
4. Drain happens at predictable points (refill, alloc miss, periodic)
---
## Testing Strategy
### Minimal Repro Test (30 seconds)
```bash
# Single-thread (should work)
HAKMEM_TINY_FAST_CAP=0 HAKMEM_LARSON_TINY_ONLY=1 \
./larson_hakmem 2 8 128 1024 1 12345 1
# Multi-thread (crashes)
HAKMEM_TINY_FAST_CAP=0 HAKMEM_LARSON_TINY_ONLY=1 \
./larson_hakmem 2 8 128 1024 1 12345 4
```
### Comprehensive Test Matrix
| FAST_CAP | TLS_LIST | THREADS | Expected | Notes |
|----------|----------|---------|----------|-------|
| 0 | 0 | 1 | ✓ | Magazine path, single-thread |
| 0 | 0 | 4 | ? | Magazine path, may crash |
| 0 | 1 | 1 | ✓ | TLS List, no cross-thread |
| 0 | 1 | 4 | ✗ | **CURRENT BUG** |
| 64 | 0 | 4 | ✓ | Fast tier absorbs cross-thread |
| 64 | 1 | 4 | ✓ | Fast tier + TLS List |
### Validation After Fix
```bash
# All these should pass:
for CAP in 0 64; do
    for TLS in 0 1; do
        for T in 1 2 4 8; do
            echo "Testing FAST_CAP=$CAP TLS_LIST=$TLS THREADS=$T"
            HAKMEM_TINY_FAST_CAP=$CAP HAKMEM_TINY_TLS_LIST=$TLS \
            HAKMEM_LARSON_TINY_ONLY=1 \
            timeout 10 ./larson_hakmem 2 8 128 1024 1 12345 $T || echo "FAIL"
        done
    done
done
```
---
## Files to Investigate Further
1. **TLS List spill implementation:**
```bash
grep -rn "tls_list_spill" core/
```
2. **Magazine spill path:**
```bash
grep -rn "mag.*spill" core/hakmem_tiny_free.inc
```
3. **Remote drain call sites:**
```bash
grep -rn "ss_remote_drain" core/
```
---
## Summary
**Root Cause:** TLS List mode (active when FAST_CAP=0) bypasses SuperSlab freelist for same-thread frees. Freed blocks go to TLS cache → Magazine → Registry, never returning to SuperSlab freelist. Meanwhile, freelist contains stale pointers from old allocations. Cross-thread frees accumulate in remote_heads[] but Fix #2's drain check always fails because TLS List mode doesn't use remote_heads[].
**Why Fixes Don't Work:**
- Fix #1: Never reached (crash before refill)
- Fix #2: Condition always false (remote_heads[] unused)
**Recommended Fix:** Option A (unconditional drain) for immediate safety, Option B (fix spill path) for proper solution.
**Next Steps:**
1. Confirm diagnosis with printf
2. Implement Option A
3. Test thoroughly
4. Plan Option B implementation