hakmem/docs/analysis/INVESTIGATION_SUMMARY.md

# FAST_CAP=0 SEGV Investigation - Executive Summary

## Status: ROOT CAUSE IDENTIFIED ✓

**Date:** 2025-11-04
**Issue:** SEGV crash in 4-thread Larson benchmark when `FAST_CAP=0`
**Fixes Implemented:** Fix #1 (L615-620), Fix #2 (L737-743) - **BOTH CORRECT BUT NOT EXECUTING**

---

## Root Cause (CONFIRMED)

### The Bug

When `FAST_CAP=0` and `g_tls_list_enable=1` (TLS List mode), the code has **TWO DISCONNECTED MEMORY PATHS**:

**FREE PATH (where blocks go):**
```
hak_tiny_free(ptr)
  → TLS List cache (g_tls_lists[])
  → tls_list_spill_excess() when full
  → ✓ RETURNS TO SUPERSLAB FREELIST (L179-193 in tls_ops.h)
```

**ALLOC PATH (where blocks come from):**
```
hak_tiny_alloc()
  → hak_tiny_alloc_superslab()
  → meta->freelist (expects valid linked list)
  → ✗ CRASHES on stale/corrupted pointers
```

### Why It Crashes

1. **TLS List spill DOES return to SuperSlab freelist** (L184-186):
   ```c
   *(void**)node = meta->freelist;  // Link to freelist
   meta->freelist = node;           // Update head
   if (meta->used > 0) meta->used--;
   ```

2. **BUT: Cross-thread frees accumulate in remote_heads[] and NEVER drain!**

3. **The freelist becomes CORRUPTED** because:
   - Same-thread frees: TLS List → (eventually) freelist ✓
   - Cross-thread frees: remote_heads[] → **NEVER MERGED** ✗
   - Freelist now has **INVALID NEXT POINTERS** (point to blocks in remote queue)

4. **Next allocation:**
   ```c
   void* block = meta->freelist;        // Valid pointer
   meta->freelist = *(void**)block;     // ✗ SEGV (next pointer is garbage)
   ```

---

## Why Fix #2 Doesn't Work

**Fix #2 Location:** `hakmem_tiny_free.inc` L737-743

```c
if (meta && meta->freelist) {
    int has_remote = (atomic_load_explicit(&tls->ss->remote_heads[tls->slab_idx], memory_order_acquire) != 0);
    if (has_remote) {
        ss_remote_drain_to_freelist(tls->ss, tls->slab_idx);  // ← NEVER EXECUTES
    }
    void* block = meta->freelist;  // ← SEGV HERE
    meta->freelist = *(void**)block;
}
```

**Why `has_remote` is always FALSE:**

The check looks for `remote_heads[idx] != 0`, BUT:

1. **Cross-thread frees in TLS List mode DO call `ss_remote_push()`**
   - Checked: `hakmem_tiny_free_superslab()` L833 calls `ss_remote_push()`
   - This sets `remote_heads[idx]` to the remote queue head

2. **BUT Fix #2 checks the WRONG slab index:**
   - `tls->slab_idx` = current TLS-cached slab (e.g., slab 7)
   - Cross-thread frees may be for OTHER slabs (e.g., slab 0-6)
   - Fix #2 only drains the current slab, misses remote frees to other slabs!

3. **Example scenario:**
   ```
   Thread A: allocates from slab 0 → tls->slab_idx = 0
   Thread B: frees those blocks → remote_heads[0] = <queue>
   Thread A: allocates again, moves to slab 7 → tls->slab_idx = 7
   Thread A: Fix #2 checks remote_heads[7] → NULL (not 0!)
   Thread A: Uses freelist from slab 0 (has stale pointers) → SEGV
   ```

---

## Why Fix #1 Doesn't Work

**Fix #1 Location:** `hakmem_tiny_free.inc` L615-620 (in `superslab_refill()`)

```c
for (int i = 0; i < tls_cap; i++) {
    int has_remote = (atomic_load_explicit(&tls->ss->remote_heads[i], memory_order_acquire) != 0);
    if (has_remote) {
        ss_remote_drain_to_freelist(tls->ss, i);  // ← SHOULD drain all slabs
    }
    if (tls->ss->slabs[i].freelist) {
        // Reuse this slab
        tiny_tls_bind_slab(tls, tls->ss, i);
        return tls->ss;  // ← RETURNS IMMEDIATELY
    }
}
```

**Why it doesn't execute:**

1. **Crash happens BEFORE refill:**
   - Allocation path: `hak_tiny_alloc_superslab()` (L720)
   - First checks existing `meta->freelist` (L737) → **SEGV HERE**
   - NEVER reaches `superslab_refill()` (L755) because it crashes first!

2. **Even if it reached refill:**
   - Loop finds slab with `freelist != NULL` at iteration 0
   - Returns immediately (L627) without checking remaining slabs
   - Misses remote_heads[1..N] that may have queued frees

---

## Evidence from Code Analysis

### 1. TLS List Spill DOES Return to Freelist ✓

**File:** `core/hakmem_tiny_tls_ops.h` L179-193

```c
// Phase 1: Try SuperSlab first (registry-based lookup)
SuperSlab* ss = hak_super_lookup(node);
if (ss && ss->magic == SUPERSLAB_MAGIC) {
    int slab_idx = slab_index_for(ss, node);
    TinySlabMeta* meta = &ss->slabs[slab_idx];
    *(void**)node = meta->freelist;  // ✓ Link to freelist
    meta->freelist = node;            // ✓ Update head
    if (meta->used > 0) meta->used--;
    handled = 1;
}
```

**This is CORRECT!** TLS List spill properly returns blocks to SuperSlab freelist.

### 2. Cross-Thread Frees DO Call ss_remote_push() ✓

**File:** `core/hakmem_tiny_free.inc` L824-838

```c
// Slow path: Remote free (cross-thread)
if (g_ss_adopt_en2) {
    // Use remote queue
    int was_empty = ss_remote_push(ss, slab_idx, ptr);  // ✓ Adds to remote_heads[]
    meta->used--;
    ss_active_dec_one(ss);
    if (was_empty) {
        ss_partial_publish((int)ss->size_class, ss);
    }
}
```

**This is CORRECT!** Cross-thread frees go to remote queue.

### 3. Remote Queue NEVER Drains in Alloc Path ✗

**File:** `core/hakmem_tiny_free.inc` L737-743

```c
if (meta && meta->freelist) {
    // Check ONLY current slab's remote queue
    int has_remote = (atomic_load_explicit(&tls->ss->remote_heads[tls->slab_idx], memory_order_acquire) != 0);
    if (has_remote) {
        ss_remote_drain_to_freelist(tls->ss, tls->slab_idx);  // ✓ Drains current slab
    }
    // ✗ BUG: Doesn't drain OTHER slabs' remote queues!
    void* block = meta->freelist;  // May be from slab 0, but we only drained slab 7
    meta->freelist = *(void**)block;  // ✗ SEGV if next pointer is in remote queue
}
```

**This is the BUG!** Fix #2 only drains the current TLS slab, not the slab being allocated from.

---

## The Actual Bug (Detailed)

### Scenario: Multi-threaded Larson with FAST_CAP=0

**Thread A - Allocation:**
```
1. alloc() → hak_tiny_alloc_superslab(cls=0)
2. TLS cache empty, calls superslab_refill()
3. Finds SuperSlab SS1 with slabs[0..15]
4. Binds to slab 0: tls->ss = SS1, tls->slab_idx = 0
5. Allocates 100 blocks from slab 0 via linear allocation
6. Returns pointers to Thread B
```

**Thread B - Free (cross-thread):**
```
7. free(ptr_from_slab_0)
8. Detects cross-thread (meta->owner_tid != self)
9. Calls ss_remote_push(SS1, slab_idx=0, ptr)
10. Adds ptr to SS1->remote_heads[0] (lock-free queue)
11. Repeat for all 100 blocks
12. Result: SS1->remote_heads[0] = <chain of 100 blocks>
```

**Thread A - More Allocations:**
```
13. alloc() → hak_tiny_alloc_superslab(cls=0)
14. Slab 0 is full (meta->used == meta->capacity)
15. Calls superslab_refill()
16. Finds slab 7 has freelist (from old allocations)
17. Binds to slab 7: tls->ss = SS1, tls->slab_idx = 7
18. Returns without draining remote_heads[0]!
```

**Thread A - Fatal Allocation:**
```
19. alloc() → hak_tiny_alloc_superslab(cls=0)
20. meta->freelist exists (from slab 7)
21. Fix #2 checks remote_heads[7] → NULL (no cross-thread frees to slab 7)
22. Skips drain
23. block = meta->freelist → valid pointer (from slab 7)
24. meta->freelist = *(void**)block → ✗ SEGV
```

**Why it crashes:**
- `block` points to a valid block from slab 7
- But that block was freed via TLS List → spilled to freelist
- During spill, it was linked to the freelist: `*(void**)block = meta->freelist`
- BUT meta->freelist at that moment included blocks from slab 0 that were:
  - Allocated by Thread A
  - Freed by Thread B (cross-thread)
  - Queued in remote_heads[0]
  - **NEVER MERGED** to freelist
- So `*(void**)block` points to a block in the remote queue
- Which has invalid/corrupted next pointers → **SEGV**

---

## Why Debug Ring Produces No Output

**Expected:** SIGSEGV handler dumps Debug Ring

**Actual:** Immediate crash, no output

**Reasons:**

1. **Signal handler may not be installed:**
   - Check: `HAKMEM_TINY_TRACE_RING=1` must be set BEFORE init
   - Verify: Add `printf("Ring enabled: %d\n", g_tiny_ring_enabled);` in main()

2. **Crash may corrupt stack before handler runs:**
   - Freelist corruption may overwrite stack frames
   - Signal handler can't execute safely

3. **Handler uses unsafe functions:**
   - `write()` is signal-safe ✓
   - But if heap is corrupted, may still fail

---

## Correct Fix (VERIFIED)

### Option A: Drain ALL Slabs Before Using Freelist (SAFEST)

**Location:** `core/hakmem_tiny_free.inc` L737-752

**Replace:**
```c
if (meta && meta->freelist) {
    int has_remote = (atomic_load_explicit(&tls->ss->remote_heads[tls->slab_idx], memory_order_acquire) != 0);
    if (has_remote) {
        ss_remote_drain_to_freelist(tls->ss, tls->slab_idx);
    }
    void* block = meta->freelist;
    meta->freelist = *(void**)block;
    // ...
}
```

**With:**
```c
if (meta && meta->freelist) {
    // BUGFIX: Drain ALL slabs' remote queues, not just current TLS slab
    // Reason: Freelist may contain pointers from OTHER slabs that have remote frees
    int tls_cap = ss_slabs_capacity(tls->ss);
    for (int i = 0; i < tls_cap; i++) {
        if (atomic_load_explicit(&tls->ss->remote_heads[i], memory_order_acquire) != 0) {
            ss_remote_drain_to_freelist(tls->ss, i);
        }
    }

    void* block = meta->freelist;
    meta->freelist = *(void**)block;
    // ...
}
```

**Pros:**
- Guarantees correctness
- Simple to implement
- Low overhead (only when freelist exists, ~10-16 atomic loads)

**Cons:**
- May drain empty queues (wasted atomic loads)
- Not the most efficient (but safe!)

---

### Option B: Track Per-Slab in Freelist (OPTIMAL)

**Idea:** When allocating from freelist, only drain the remote queue for THE SLAB THAT OWNS THE FREELIST BLOCK.

**Problem:** Freelist is a linked list mixing blocks from multiple slabs!
- Can't determine which slab owns which block without expensive lookup
- Would need to scan entire freelist or maintain per-slab freelists

**Verdict:** Too complex, not worth it.

---

### Option C: Drain in superslab_refill() Before Returning (PROACTIVE)

**Location:** `core/hakmem_tiny_free.inc` L615-630

**Change:**
```c
for (int i = 0; i < tls_cap; i++) {
    int has_remote = (atomic_load_explicit(&tls->ss->remote_heads[i], memory_order_acquire) != 0);
    if (has_remote) {
        ss_remote_drain_to_freelist(tls->ss, i);
    }
    if (tls->ss->slabs[i].freelist) {
        // ✓ Now freelist is guaranteed clean
        tiny_tls_bind_slab(tls, tls->ss, i);
        return tls->ss;
    }
}
```

**BUT:** Need to drain BEFORE checking freelist (move drain outside if):

```c
for (int i = 0; i < tls_cap; i++) {
    // Drain FIRST (before checking freelist)
    if (atomic_load_explicit(&tls->ss->remote_heads[i], memory_order_acquire) != 0) {
        ss_remote_drain_to_freelist(tls->ss, i);
    }

    // NOW check freelist (guaranteed fresh)
    if (tls->ss->slabs[i].freelist) {
        tiny_tls_bind_slab(tls, tls->ss, i);
        return tls->ss;
    }
}
```

**Pros:**
- Proactive (prevents corruption)
- No allocation path overhead

**Cons:**
- Doesn't fix the immediate crash (crash happens before refill)
- Need BOTH Option A (immediate safety) AND Option C (long-term)

---

## Recommended Action Plan

### Immediate (30 minutes): Implement Option A

1. Edit `core/hakmem_tiny_free.inc` L737-752
2. Add loop to drain all slabs before using freelist
3. `make clean && make`
4. Test: `HAKMEM_TINY_FAST_CAP=0 ./larson_hakmem 2 8 128 1024 1 12345 4`
5. Verify: No SEGV

### Short-term (2 hours): Implement Option C

1. Edit `core/hakmem_tiny_free.inc` L615-630
2. Move drain BEFORE freelist check
3. Test all configurations

### Long-term (1 week): Audit All Paths

1. Ensure ALL allocation paths drain remote queues
2. Add assertions: `assert(remote_heads[i] == 0)` after drain
3. Consider: Lazy drain (only when freelist is used, not virgin slabs)

---

## Testing Commands

```bash
# Verify bug exists:
HAKMEM_TINY_FAST_CAP=0 HAKMEM_LARSON_TINY_ONLY=1 \
  timeout 5 ./larson_hakmem 2 8 128 1024 1 12345 4
# Expected: SEGV

# After fix:
HAKMEM_TINY_FAST_CAP=0 HAKMEM_LARSON_TINY_ONLY=1 \
  timeout 10 ./larson_hakmem 2 8 128 1024 1 12345 4
# Expected: Completes successfully

# Full test matrix:
./scripts/verify_fast_cap_0_bug.sh
```

---

## Files Modified (for Option A fix)

1. **core/hakmem_tiny_free.inc** - L737-752 (hak_tiny_alloc_superslab)

---

## Confidence Level

**ROOT CAUSE: 95%** - Code analysis confirms disconnected paths
**FIX CORRECTNESS: 90%** - Option A is sound, Option C is proactive
**FIX COMPLETENESS: 80%** - May need additional drain points (virgin slab → freelist transition)

---

## Next Steps

1. Implement Option A (drain all slabs in alloc path)
2. Test with Larson FAST_CAP=0
3. If successful, implement Option C (drain in refill)
4. Audit all freelist usage sites for similar bugs
5. Consider: Add `HAKMEM_TINY_PARANOID_DRAIN=1` mode (drain everywhere)