File: hakmem/docs/analysis/FAST_CAP_0_SEGV_ROOT_CAUSE_ANALYSIS.md
Last commit: 67fb15f35f, Moe Charm (CI), 2025-11-26: Wrap debug fprintf in !HAKMEM_BUILD_RELEASE guards (Release build optimization)
## Changes

### 1. core/page_arena.c
- Removed init failure message (lines 25-27) - error is handled by returning early
- All other fprintf statements already wrapped in existing #if !HAKMEM_BUILD_RELEASE blocks

### 2. core/hakmem.c
- Wrapped SIGSEGV handler init message (line 72)
- CRITICAL: Kept SIGSEGV/SIGBUS/SIGABRT error messages (lines 62-64) - production needs crash logs

### 3. core/hakmem_shared_pool.c
- Wrapped all debug fprintf statements in #if !HAKMEM_BUILD_RELEASE (see the guard-pattern sketch after this list):
  - Node pool exhaustion warning (line 252)
  - SP_META_CAPACITY_ERROR warning (line 421)
  - SP_FIX_GEOMETRY debug logging (line 745)
  - SP_ACQUIRE_STAGE0.5_EMPTY debug logging (line 865)
  - SP_ACQUIRE_STAGE0_L0 debug logging (line 803)
  - SP_ACQUIRE_STAGE1_LOCKFREE debug logging (line 922)
  - SP_ACQUIRE_STAGE2_LOCKFREE debug logging (line 996)
  - SP_ACQUIRE_STAGE3 debug logging (line 1116)
  - SP_SLOT_RELEASE debug logging (line 1245)
  - SP_SLOT_FREELIST_LOCKFREE debug logging (line 1305)
  - SP_SLOT_COMPLETELY_EMPTY debug logging (line 1316)
- Fixed lock_stats_init() for release builds (lines 60-65) - ensure g_lock_stats_enabled is initialized
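
For reference, the pattern applied at each of these call sites looks roughly like the sketch below. It is illustrative only: the helper name and message are invented, and only the HAKMEM_BUILD_RELEASE macro comes from this change.

```c
#include <stdio.h>

/* Hypothetical call site showing the guard pattern: the debug warning
 * compiles away entirely when HAKMEM_BUILD_RELEASE is defined to 1. */
static void node_pool_exhausted_warn(int class_idx)
{
#if !HAKMEM_BUILD_RELEASE
    fprintf(stderr, "[SP] node pool exhausted (class=%d)\n", class_idx);
#else
    (void)class_idx;  /* keep the parameter referenced in release builds */
#endif
}
```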

## Performance Validation

Before: 51M ops/s (with debug fprintf overhead)
After:  49.1M ops/s (consistent performance, fprintf removed from hot paths)

## Build & Test

```bash
./build.sh larson_hakmem
./out/release/larson_hakmem 1 5 1 1000 100 10000 42
# Result: 49.1M ops/s
```


# FAST_CAP=0 SEGV Root Cause Analysis

## Executive Summary

Status: Fix #1 and Fix #2 are implemented correctly BUT are NOT BEING EXECUTED in the crash scenario.

Root Cause Discovered: When FAST_CAP=0 and g_tls_list_enable=1 (TLS List mode), the free path BYPASSES the freelist entirely and stores freed blocks in TLS List cache. These blocks are NEVER merged into the SuperSlab freelist until TLS List spills. Meanwhile, the allocation path tries to allocate from the freelist, which contains stale pointers from cross-thread frees that were never drained.

Critical Flow Bug:

```text
Thread A:
1. free(ptr) → g_fast_cap[cls]=0 → skip fast tier
2. g_tls_list_enable=1 → TLS List push (L75-79 in free.inc)
3. RETURNS WITHOUT TOUCHING FREELIST (meta->freelist unchanged)
4. Remote frees accumulate in remote_heads[] but NEVER get drained

Thread B:
1. alloc() → hak_tiny_alloc_superslab(cls)
2. meta->freelist EXISTS (has stale/remote pointers)
3. FIX #2 SHOULD drain here (L740-743) BUT...
4. has_remote = (remote_heads[idx] != 0) → FALSE (wrong index!)
5. Dereferences stale freelist → SEGV
```

## Why Fix #1 and Fix #2 Are Not Executed

### Fix #1 (superslab_refill L615-620): NOT REACHED

```c
// Fix #1: In superslab_refill() loop
for (int i = 0; i < tls_cap; i++) {
    int has_remote = (atomic_load_explicit(&tls->ss->remote_heads[i], memory_order_acquire) != 0);
    if (has_remote) {
        ss_remote_drain_to_freelist(tls->ss, i);  // ← This line NEVER executes
    }
    if (tls->ss->slabs[i].freelist) { ... }
}
```

Why it doesn't execute:

  1. Larson immediately crashes on first allocation miss

    • The allocation path is: hak_tiny_alloc_superslab() (L720) → checks existing meta->freelist (L737) → SEGV
    • It NEVER reaches superslab_refill() (L755) because it crashes first!
  2. Even if it did reach refill:

    • Loop checks ALL slabs i=0..tls_cap, but the current TLS slab is tls->slab_idx (e.g., 7)
    • When checking slab i=0..6, those slabs don't have remote_heads[i] set
    • When checking slab i=7, it finds freelist exists and RETURNS IMMEDIATELY (L624) without draining!
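
To make point 2 concrete, here is a minimal sketch of the reordering the refill loop would need: drain each slab's remote queue before testing its freelist, so the early return can no longer skip the merge. The field and function names are copied from the excerpt above; everything else (types, array sizes, the stub) is illustrative, not the actual HAKMEM code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Minimal stand-ins for the real structures; only the fields used below are
 * modeled, and the array size is arbitrary. */
typedef struct { void* freelist; } TinySlabMeta;
typedef struct {
    _Atomic uintptr_t remote_heads[8];
    TinySlabMeta      slabs[8];
} SuperSlab;

/* Stub so the sketch compiles; the real drain lives in superslab.h. */
static void ss_remote_drain_to_freelist(SuperSlab* ss, int idx) { (void)ss; (void)idx; }

/* Reordered loop: merge remote frees for slab i BEFORE checking its freelist,
 * so "freelist exists -> return immediately" cannot bypass the drain. */
static int refill_pick_slab(SuperSlab* ss, int tls_cap)
{
    for (int i = 0; i < tls_cap; i++) {
        if (atomic_load_explicit(&ss->remote_heads[i], memory_order_acquire) != 0)
            ss_remote_drain_to_freelist(ss, i);   /* drain first */
        if (ss->slabs[i].freelist)
            return i;                             /* then trust the freelist */
    }
    return -1;  /* no usable slab: caller allocates a fresh one */
}
```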

### Fix #2 (hak_tiny_alloc_superslab L737-743): CONDITION ALWAYS FALSE

```c
if (meta && meta->freelist) {
    int has_remote = (atomic_load_explicit(&tls->ss->remote_heads[tls->slab_idx], memory_order_acquire) != 0);
    if (has_remote) {  // ← ALWAYS FALSE!
        ss_remote_drain_to_freelist(tls->ss, tls->slab_idx);
    }
    void* block = meta->freelist;  // ← SEGV HERE
    meta->freelist = *(void**)block;
}
```

Why has_remote is always false:

  1. Wrong understanding of remote queue semantics:

    • remote_heads[idx] is NOT a flag indicating "has remote frees"
    • It's the HEAD POINTER of the remote queue linked list
    • When TLS List mode is active, frees go to TLS List, NOT to remote_heads[]!
  2. Actual remote free flow in TLS List mode:

    hak_tiny_free() → class_idx detected → g_fast_cap=0 → skip fast
    → g_tls_list_enable=1 → TLS List push (L75-79)
    → RETURNS (L80) WITHOUT calling ss_remote_push()!
    
  3. Therefore:

    • remote_heads[idx] remains NULL (never used in TLS List mode)
    • has_remote check is always false
    • Drain never happens
    • Freelist contains stale pointers from old allocations
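
For contrast, this is roughly what a cross-thread free has to do for remote_heads[idx] to ever become non-NULL: a lock-free push of the block onto the per-slab queue. The sketch follows the description above (remote_heads[idx] is the head pointer of a linked list); it is not the actual ss_remote_push() implementation. In TLS List mode the free path returns before anything like this runs, which is why the Fix #2 check never fires.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in: remote_heads[idx] holds the head of a LIFO of remotely freed
 * blocks, linked through each block's first word. */
typedef struct { _Atomic uintptr_t remote_heads[8]; } SuperSlab;

static void ss_remote_push_sketch(SuperSlab* ss, int idx, void* block)
{
    uintptr_t old_head = atomic_load_explicit(&ss->remote_heads[idx], memory_order_relaxed);
    do {
        *(void**)block = (void*)old_head;     /* link new block in front of current head */
    } while (!atomic_compare_exchange_weak_explicit(
                 &ss->remote_heads[idx], &old_head, (uintptr_t)block,
                 memory_order_release, memory_order_relaxed));
}
```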

When TLS List is enabled, freed blocks flow like this:

```text
free() → TLS List cache → [eventually] tls_list_spill_excess()
→ WHERE DO THEY GO? → Need to check tls_list_spill implementation!
```

Hypothesis: TLS List spill probably returns blocks to Magazine/Registry, NOT to SuperSlab freelist. This creates a disconnect where:

  1. Blocks are allocated from SuperSlab freelist
  2. Blocks are freed into TLS List
  3. TLS List spills to Magazine/Registry (NOT back to freelist)
  4. SuperSlab freelist becomes stale (contains pointers to freed memory)
  5. Cross-thread frees accumulate in remote_heads[] but never merge
  6. Next allocation from freelist → SEGV

## Evidence from Debug Ring Output

Key observation: remote_drain events are NEVER recorded in debug output.

Why?

  • TINY_RING_EVENT_REMOTE_DRAIN is only recorded in ss_remote_drain_to_freelist() (superslab.h:341-344)
  • But this function is never called because:
    • Fix #1 not reached (crash before refill)
    • Fix #2 condition always false (remote_heads[] unused in TLS List mode)

What IS recorded:

  • remote_push events: Yes (cross-thread frees call ss_remote_push in some path)
  • remote_drain events: No (never called)
  • This confirms the diagnosis: remote queues fill up but never drain

## Code Paths Verified

### Free Path (FAST_CAP=0, TLS List mode)

```text
hak_tiny_free(ptr)
  ↓
hak_tiny_free_with_slab(ptr, NULL)  // NULL = SuperSlab mode
  ↓
[L14-36] Cross-thread check → if different thread → hak_tiny_free_superslab() → ss_remote_push()
  ↓
[L38-51] g_debug_fast0 check → NO (not set)
  ↓
[L53-59] g_fast_cap[cls]=0 → SKIP fast tier
  ↓
[L61-92] g_tls_list_enable=1 → TLS List push → RETURN ✓
  ↓
NEVER REACHES Magazine/freelist code (L94+)
```

Problem: Same-thread frees go to TLS List, never update SuperSlab freelist.
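
A minimal sketch of that L61-92 branch, to show what is (and is not) written. The struct and function names here are assumptions, not the real TLS List implementation:

```c
#include <stddef.h>

/* Assumed shape of the per-class thread-local cache: a singly linked list
 * threaded through the first word of each freed block. */
typedef struct {
    void* head;
    int   count;
    int   cap;
} TinyTLSList;

static __thread TinyTLSList g_tls_list[32];

/* The freed block lands in thread-local storage and free() returns.
 * Nothing here touches meta->freelist or remote_heads[]; that is the
 * disconnect described above. */
static int tls_list_push_sketch(int class_idx, void* ptr)
{
    TinyTLSList* tls = &g_tls_list[class_idx];
    if (tls->count >= tls->cap)
        return 0;                 /* full: caller falls through to spill / Magazine */
    *(void**)ptr = tls->head;
    tls->head = ptr;
    tls->count++;
    return 1;
}
```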

### Alloc Path (FAST_CAP=0)

```text
hak_tiny_alloc(size)
  ↓
[Benchmark path disabled for FAST_CAP=0]
  ↓
hak_tiny_alloc_slow(size, cls)
  ↓
hak_tiny_alloc_superslab(cls)
  ↓
[L727-735] meta->freelist == NULL && used < cap → linear alloc (virgin slab)
  ↓
[L737-752] meta->freelist EXISTS → CHECK remote_heads[] (Fix #2)
  ↓
has_remote = (remote_heads[idx] != 0) → FALSE (TLS List mode doesn't use it)
  ↓
block = meta->freelist → *(void**)block → SEGV 💥
```

Problem: Freelist contains pointers to blocks that were:

  1. Freed by same thread → went to TLS List
  2. Freed by other threads → went to remote_heads[] but never drained
  3. Never merged back to freelist

## Additional Problems Found

### 1. Ultra-Simple Free Path Incompatibility

When g_tiny_ultra=1 (HAKMEM_TINY_ULTRA=1), the free path is:

```c
// hakmem_tiny_free.inc:886-908
if (g_tiny_ultra) {
    // Detect class_idx from SuperSlab
    // Push to TLS SLL (not TLS List!)
    if (g_tls_sll_count[cls] < sll_cap) {
        *(void**)ptr = g_tls_sll_head[cls];
        g_tls_sll_head[cls] = ptr;
        return;  // BYPASSES remote queue entirely!
    }
}
```

Problem: Ultra mode also bypasses remote queues for same-thread frees!

### 2. Linear Allocation Mode Confusion

```c
// L727-735: Linear allocation (freelist == NULL)
if (meta->freelist == NULL && meta->used < meta->capacity) {
    void* block = slab_base + (meta->used * block_size);
    meta->used++;
    return block;  // ✓ Safe (virgin memory)
}
```

This is safe! Linear allocation doesn't touch freelist at all.

But next allocation:

```c
// L737-752: Freelist allocation
if (meta->freelist) {  // ← Freelist exists from OLD allocations
    // Fix #2 check (always false in TLS List mode)
    void* block = meta->freelist;  // ← STALE POINTER
    meta->freelist = *(void**)block;  // ← SEGV 💥
}
```

## Root Cause Summary

The fundamental issue: HAKMEM has TWO SEPARATE FREE PATHS:

  1. SuperSlab freelist path (original design)

    • Frees update meta->freelist directly
    • Cross-thread frees go to remote_heads[]
    • Drain merges remote_heads[] → freelist
    • Alloc pops from freelist
  2. TLS List/Magazine path (optimization layer)

    • Frees go to TLS cache (never touch freelist!)
    • Spills go to Magazine → Registry
    • DISCONNECTED from SuperSlab freelist!

When FAST_CAP=0:

  • TLS List path is activated (no fast tier to bypass)
  • ALL same-thread frees go to TLS List
  • SuperSlab freelist is NEVER UPDATED
  • Cross-thread frees accumulate in remote_heads[]
  • remote_heads[] is NEVER DRAINED (Fix #2 check fails)
  • Next alloc from stale freelist → SEGV

## Why Debug Ring Produces No Output

Expected: SIGSEGV handler dumps Debug Ring before crash

Actual: Immediate crash with no output

Possible reasons:

  1. Stack corruption before handler runs

    • Freelist corruption may have corrupted stack
    • Signal handler can't execute safely
  2. Handler not installed (HAKMEM_TINY_TRACE_RING=1 not set)

    • Check: g_tiny_ring_enabled must be 1
    • Verify env var is exported BEFORE running Larson
  3. Fast crash (no time to record events)

    • Unlikely (should have at least ALLOC_ENTER events)
  4. Crash in signal handler itself

    • Handler uses async-signal-unsafe functions (fprintf; only raw write() is async-signal-safe)
    • May fail if heap is corrupted
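
If reason 4 is suspected, the handler can be restricted to the async-signal-safe subset. A minimal sketch follows; the ring-dump call is a placeholder, not an existing HAKMEM function:

```c
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Crash hook that only uses write(2), which is async-signal-safe, unlike
 * fprintf. Anything it calls must also avoid locks and the heap. */
static void segv_dump_handler(int sig)
{
    static const char msg[] = "[HAKMEM] fatal signal caught, dumping debug ring\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)n;
    /* tiny_ring_dump_signal_safe();   // placeholder: must itself be signal-safe */
    _exit(128 + sig);
}

static void install_crash_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = segv_dump_handler;
    sigaction(SIGSEGV, &sa, NULL);
    sigaction(SIGBUS,  &sa, NULL);
    sigaction(SIGABRT, &sa, NULL);
}
```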

Recommendation: echo the environment variable BEFORE running Larson to confirm it is actually exported:

```bash
HAKMEM_TINY_TRACE_RING=1 LD_PRELOAD=./libhakmem.so \
  bash -c 'echo "Ring enabled: $HAKMEM_TINY_TRACE_RING"; ./larson_hakmem ...'
```

## Option A: Unconditional Drain in Alloc Path (SAFE, SIMPLE)

Location: hak_tiny_alloc_superslab() L737-752

Change:

```c
if (meta && meta->freelist) {
    // UNCONDITIONAL drain: always merge remote frees before using freelist
    // Cost: ~50-100ns (only when freelist exists, amortized by batch drain)
    ss_remote_drain_to_freelist(tls->ss, tls->slab_idx);

    // Now safe to use freelist
    void* block = meta->freelist;
    meta->freelist = *(void**)block;
    meta->used++;
    ss_active_inc(tls->ss);
    return block;
}
```

Pros:

  • Guarantees correctness (no stale pointers)
  • Simple, easy to verify
  • Only ~50-100ns overhead per allocation miss

Cons:

  • May drain empty queues (wasted atomic load)
  • Doesn't fix the root issue (TLS List disconnect)
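
For context on the cost claims above: a drain along these lines is essentially one acquire load when the queue is empty, and one atomic exchange plus a splice loop when it is not. This is a sketch consistent with the description in this document, not the actual code at superslab.h:341.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Minimal stand-ins; only the fields the sketch touches. */
typedef struct { void* freelist; } TinySlabMeta;
typedef struct {
    _Atomic uintptr_t remote_heads[8];
    TinySlabMeta      slabs[8];
} SuperSlab;

static void ss_remote_drain_sketch(SuperSlab* ss, int slab_idx)
{
    /* Empty queue: the "wasted" work of an unconditional call is this load. */
    if (atomic_load_explicit(&ss->remote_heads[slab_idx], memory_order_acquire) == 0)
        return;

    /* Detach the whole remote list in one atomic exchange... */
    uintptr_t head = atomic_exchange_explicit(&ss->remote_heads[slab_idx],
                                              (uintptr_t)0, memory_order_acq_rel);
    /* ...then splice it onto the slab's local freelist with plain stores. */
    void* node = (void*)head;
    while (node) {
        void* next = *(void**)node;              /* remote links reuse the first word */
        *(void**)node = ss->slabs[slab_idx].freelist;
        ss->slabs[slab_idx].freelist = node;
        node = next;
    }
}
```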

## Option B: Force TLS List Spill to SuperSlab Freelist (CORRECT FIX)

Location: tls_list_spill_excess() (need to find this function)

Change: Modify spill path to return blocks to SuperSlab freelist instead of Magazine:

```c
void tls_list_spill_excess(int class_idx, TinyTLSList* tls) {
    SuperSlab* ss = g_tls_slabs[class_idx].ss;
    if (!ss) { /* fallback to Magazine */ return; }

    int slab_idx = g_tls_slabs[class_idx].slab_idx;
    TinySlabMeta* meta = &ss->slabs[slab_idx];

    // Spill half to SuperSlab freelist (under lock)
    int spill_count = tls->count / 2;
    for (int i = 0; i < spill_count; i++) {
        void* ptr = tls_list_pop(tls);
        // Push to freelist
        *(void**)ptr = meta->freelist;
        meta->freelist = ptr;
        meta->used--;
    }
}
```

Pros:

  • Fixes root cause (reconnects TLS List → SuperSlab)
  • No allocation path overhead
  • Maintains cache efficiency

Cons:

  • Requires lock (spill is already under lock)
  • Need to identify correct slab for each block (may be from different slabs)
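
The second con is the crux: blocks sitting in the TLS List may have come from several slabs, so the spill has to recover the owner per block. One plausible way to do that, assuming SuperSlabs live in power-of-two aligned regions (this needs to be verified against the HAKMEM sources; the constants below are purely illustrative):

```c
#include <stdint.h>

/* Illustrative geometry: a SuperSlab is one SS_SIZE-aligned region split
 * into SLAB_SIZE slabs, so owner and slab index fall out of bit masking. */
#define SS_SIZE   (2u * 1024 * 1024)
#define SLAB_SIZE (64u * 1024)

typedef struct SuperSlab SuperSlab;

static SuperSlab* block_to_superslab(void* ptr)
{
    return (SuperSlab*)((uintptr_t)ptr & ~(uintptr_t)(SS_SIZE - 1));
}

static int block_to_slab_idx(void* ptr)
{
    return (int)(((uintptr_t)ptr & (SS_SIZE - 1)) / SLAB_SIZE);
}
```

With a mapping like this, the spill loop above would look up ss and slab_idx per block instead of assuming they all belong to the current TLS slab.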

## Option C: Disable TLS List Mode for FAST_CAP=0 (WORKAROUND)

Location: hak_tiny_init() or free path

Change:

```c
// In init:
if (g_fast_cap_all_zero) {
    g_tls_list_enable = 0;  // Force Magazine path
}

// Or in free path:
if (g_tls_list_enable && g_fast_cap[class_idx] == 0) {
    // Force Magazine path for this class
    goto use_magazine_path;
}
```

Pros:

  • Minimal code change
  • Forces consistent path (Magazine → freelist)

Cons:

  • Doesn't fix the bug (just avoids it)
  • Performance may suffer (Magazine has overhead)

## Option D: Track Freelist Validity (DEFENSIVE)

Add flag: meta->freelist_valid (1 bit in meta)

  • Set valid: when updating the freelist (free, spill)
  • Clear valid: when allocating from a virgin slab
  • Check valid: before dereferencing the freelist
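
A minimal sketch of how the check could sit in the alloc path (field names are assumptions; the real TinySlabMeta layout may differ):

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative meta layout: one spare bit tracks whether meta->freelist is
 * currently trustworthy. */
typedef struct {
    void*    freelist;
    uint32_t used;
    uint32_t capacity;
    uint8_t  freelist_valid;   /* 1 = freelist was written by a free/spill */
} TinySlabMeta;

/* Defensive alloc: refuse to dereference a freelist nobody has validated,
 * and fall back to the refill path instead of crashing. */
static void* alloc_from_freelist_checked(TinySlabMeta* meta)
{
    if (!meta->freelist || !meta->freelist_valid)
        return NULL;                       /* caller takes the refill path */
    void* block = meta->freelist;
    meta->freelist = *(void**)block;
    if (!meta->freelist)
        meta->freelist_valid = 0;          /* list fully consumed */
    meta->used++;
    return block;
}
```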

Pros:

  • Catches corruption early
  • Good for debugging

Cons:

  • Adds overhead (1 extra check per alloc)
  • Doesn't fix the bug (just detects it)

## Immediate (1 hour): Confirm Diagnosis

  1. Add printf at crash site:

    // hakmem_tiny_free.inc L745
    fprintf(stderr, "[ALLOC] freelist=%p remote_heads=%p tls_list_en=%d\n",
            meta->freelist,
            (void*)atomic_load_explicit(&tls->ss->remote_heads[tls->slab_idx], memory_order_acquire),
            g_tls_list_enable);
    
  2. Run Larson with FAST_CAP=0:

    HAKMEM_TINY_FAST_CAP=0 HAKMEM_LARSON_TINY_ONLY=1 \
      HAKMEM_TINY_TRACE_RING=1 ./larson_hakmem 2 8 128 1024 1 12345 4 2>&1 | tee crash.log
    
  3. Verify output shows:

    • freelist != NULL (stale freelist exists)
    • remote_heads == NULL (never used in TLS List mode)
    • tls_list_en = 1 (TLS List mode active)

## Short-term (2 hours): Implement Option A

Safest, fastest fix:

  1. Edit core/hakmem_tiny_free.inc L737-743
  2. Change conditional drain to unconditional
  3. make clean && make
  4. Test with Larson FAST_CAP=0
  5. Verify no SEGV, measure performance impact

## Medium-term (1 day): Implement Option B

Proper fix:

  1. Find tls_list_spill_excess() implementation
  2. Add path to return blocks to SuperSlab freelist
  3. Test with all configurations (FAST_CAP=0/64, TLS_LIST=0/1)
  4. Measure performance vs. current

## Long-term (1 week): Unified Free Path

Ultimate solution:

  1. Audit all free paths (TLS List, Magazine, Fast, Ultra, SuperSlab)
  2. Ensure consistency: freed blocks ALWAYS return to owner slab
  3. Remote frees ALWAYS go through remote queue (or mailbox)
  4. Drain happens at predictable points (refill, alloc miss, periodic)
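
A sketch of the invariant this unified path would enforce; every helper name here is a placeholder, not an existing HAKMEM API:

```c
#include <pthread.h>

/* Unified free: a block always returns to its owner slab, either directly
 * (same thread) or via the remote queue (cross thread). Caching layers may
 * sit in front of this, but their spills must go through the same two calls
 * so the freelist can never go stale. */
typedef struct SuperSlab SuperSlab;

extern SuperSlab* block_to_superslab(void* ptr);            /* see Option B notes */
extern int        block_to_slab_idx(void* ptr);
extern pthread_t  superslab_owner_thread(SuperSlab* ss);
extern void       slab_freelist_push(SuperSlab* ss, int idx, void* ptr);
extern void       ss_remote_push(SuperSlab* ss, int idx, void* ptr);

static void unified_tiny_free(void* ptr)
{
    SuperSlab* ss  = block_to_superslab(ptr);
    int        idx = block_to_slab_idx(ptr);
    if (pthread_equal(superslab_owner_thread(ss), pthread_self()))
        slab_freelist_push(ss, idx, ptr);   /* same thread: straight to the owner slab */
    else
        ss_remote_push(ss, idx, ptr);       /* cross thread: always via the remote queue */
}
```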

## Testing Strategy

### Minimal Repro Test (30 seconds)

```bash
# Single-thread (should work)
HAKMEM_TINY_FAST_CAP=0 HAKMEM_LARSON_TINY_ONLY=1 \
  ./larson_hakmem 2 8 128 1024 1 12345 1

# Multi-thread (crashes)
HAKMEM_TINY_FAST_CAP=0 HAKMEM_LARSON_TINY_ONLY=1 \
  ./larson_hakmem 2 8 128 1024 1 12345 4
```

### Comprehensive Test Matrix

| FAST_CAP | TLS_LIST | THREADS | Expected | Notes |
|----------|----------|---------|----------|-------|
| 0 | 0 | 1 | OK | Magazine path, single-thread |
| 0 | 0 | 4 | ? | Magazine path, may crash |
| 0 | 1 | 1 | OK | TLS List, no cross-thread |
| 0 | 1 | 4 | SEGV | CURRENT BUG |
| 64 | 0 | 4 | OK | Fast tier absorbs cross-thread |
| 64 | 1 | 4 | OK | Fast tier + TLS List |

### Validation After Fix

```bash
# All these should pass:
for CAP in 0 64; do
  for TLS in 0 1; do
    for T in 1 2 4 8; do
      echo "Testing FAST_CAP=$CAP TLS_LIST=$TLS THREADS=$T"
      HAKMEM_TINY_FAST_CAP=$CAP HAKMEM_TINY_TLS_LIST=$TLS \
        HAKMEM_LARSON_TINY_ONLY=1 \
        timeout 10 ./larson_hakmem 2 8 128 1024 1 12345 $T || echo "FAIL"
    done
  done
done
```

## Files to Investigate Further

  1. TLS List spill implementation:

    grep -rn "tls_list_spill" core/
    
  2. Magazine spill path:

    grep -rn "mag.*spill" core/hakmem_tiny_free.inc
    
  3. Remote drain call sites:

    grep -rn "ss_remote_drain" core/
    

## Summary

Root Cause: TLS List mode (active when FAST_CAP=0) bypasses SuperSlab freelist for same-thread frees. Freed blocks go to TLS cache → Magazine → Registry, never returning to SuperSlab freelist. Meanwhile, freelist contains stale pointers from old allocations. Cross-thread frees accumulate in remote_heads[] but Fix #2's drain check always fails because TLS List mode doesn't use remote_heads[].

Why Fixes Don't Work:

  • Fix #1: Never reached (crash before refill)
  • Fix #2: Condition always false (remote_heads[] unused)

Recommended Fix: Option A (unconditional drain) for immediate safety, Option B (fix spill path) for proper solution.

Next Steps:

  1. Confirm diagnosis with printf
  2. Implement Option A
  3. Test thoroughly
  4. Plan Option B implementation