Files
hakmem/docs/analysis/FREE_PATH_INVESTIGATION.md
# Free Path Freelist Push Investigation
## Executive Summary
An investigation of the freelist push in the same-thread free path identified **ONE CRITICAL BUG** and **MULTIPLE DESIGN ISSUES** that explain the freelist reuse rate problem.
**Critical Finding:** The freelist push is being performed, but it is **only visible when blocks are accessed from the refill path**, not when they're accessed from normal allocation paths. This creates a **visibility gap** in the publish/fetch mechanism.
---
## Investigation Flow: free() → alloc()
### Phase 1: Same-Thread Free (freelist push)
**File:** `core/hakmem_tiny_free.inc` (lines 1-608)
**Main Function:** `hak_tiny_free_superslab(void* ptr, SuperSlab* ss)` (lines ~150-300)
#### Fast Path Decision (Line 121):
```c
if (!g_tiny_force_remote && meta->owner_tid != 0 && meta->owner_tid == my_tid) {
    // Same-thread free
    // ...
    tiny_free_local_box(ss, slab_idx, meta, ptr, my_tid);
    // ...
}
```
**Status:** ✓ CORRECT - ownership check is present
#### Freelist Push Implementation
**File:** `core/box/free_local_box.c` (lines 5-36)
```c
void tiny_free_local_box(SuperSlab* ss, int slab_idx, TinySlabMeta* meta, void* ptr, uint32_t my_tid) {
    void* prev = meta->freelist;
    *(void**)ptr = prev;
    meta->freelist = ptr; // <-- FREELIST PUSH HAPPENS HERE (Line 12)
    // ...
    meta->used--;
    ss_active_dec_one(ss);
    if (prev == NULL) {
        // First-free → publish
        tiny_free_publish_first_free((int)ss->size_class, ss, slab_idx); // Line 34
    }
}
```
**Status:** ✓ CORRECT - freelist push happens unconditionally before publish
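The push above is an intrusive LIFO: the freed block's first word stores the previous head, so no separate node is allocated. A minimal standalone sketch of that pattern follows. `SketchMeta` and the `sketch_*` functions are hypothetical stand-ins, not hakmem's types, and the sketch leaves out `ss_active_dec_one()` and the publish call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TinySlabMeta; only the fields used here. */
typedef struct {
    void    *freelist;  /* head of the intrusive LIFO list */
    unsigned used;      /* blocks currently handed out */
} SketchMeta;

/* Push a freed block. Returns true on the empty -> non-empty transition,
 * which is the point where the code above publishes the slab. */
static bool sketch_freelist_push(SketchMeta *meta, void *block) {
    assert(meta->used > 0);
    void *prev = meta->freelist;
    *(void **)block = prev;   /* the next pointer lives inside the freed block */
    meta->freelist = block;
    meta->used--;
    return prev == NULL;
}

/* Pop the most recently freed block, or return NULL if the list is empty. */
static void *sketch_freelist_pop(SketchMeta *meta) {
    void *block = meta->freelist;
    if (block != NULL) {
        meta->freelist = *(void **)block;
        meta->used++;
    }
    return block;
}
```

Only the first push into an empty list reports a first-free; every later push returns `false`, which is the behavior examined under BUG #2.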
#### Publish Mechanism
**File:** `core/box/free_publish_box.c` (lines 23-28)
```c
void tiny_free_publish_first_free(int class_idx, SuperSlab* ss, int slab_idx) {
    tiny_ready_push(class_idx, ss, slab_idx);
    ss_partial_publish(class_idx, ss);
    mailbox_box_publish(class_idx, ss, slab_idx); // Line 28
}
```
**File:** `core/box/mailbox_box.c` (lines 112-122)
```c
void mailbox_box_publish(int class_idx, SuperSlab* ss, int slab_idx) {
    mailbox_box_register(class_idx);
    uintptr_t ent = ((uintptr_t)ss) | ((uintptr_t)slab_idx & 0x3Fu);
    uint32_t slot = g_tls_mailbox_slot[class_idx];
    atomic_store_explicit(&g_pub_mailbox_entries[class_idx][slot], ent, memory_order_release);
    g_pub_mail_hits[class_idx]++; // Line 122 - COUNTER INCREMENTED
}
```
**Status:** ✓ CORRECT - publish happens on first-free
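The entry written by `mailbox_box_publish()` packs the SuperSlab pointer and the slab index into one word, with the index in the low 6 bits (`0x3F`). The sketch below illustrates that packing, assuming the SuperSlab base is aligned to at least 64 bytes so those bits are free. The `sketch_*` helpers are hypothetical; they are not the actual `slab_entry_ss()` / `slab_entry_idx()` definitions, which this document does not show.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_IDX_MASK ((uintptr_t)0x3Fu)  /* low 6 bits carry the slab index */

/* Combine a 64-byte-aligned base pointer and a slab index into one word. */
static uintptr_t sketch_entry_pack(const void *ss, int slab_idx) {
    /* Assumption: the low 6 bits of the base pointer are zero. */
    assert(((uintptr_t)ss & SKETCH_IDX_MASK) == 0);
    return (uintptr_t)ss | ((uintptr_t)slab_idx & SKETCH_IDX_MASK);
}

/* Recover the base pointer by masking off the index bits. */
static void *sketch_entry_ss(uintptr_t ent) {
    return (void *)(ent & ~SKETCH_IDX_MASK);
}

/* Recover the slab index from the low 6 bits. */
static int sketch_entry_idx(uintptr_t ent) {
    return (int)(ent & SKETCH_IDX_MASK);
}
```

Because a valid base pointer is never NULL, a packed entry is never 0, which lets 0 serve as the "empty slot" value that the fetch path swaps in.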
---
### Phase 2: Refill/Adoption Path (mailbox fetch)
**File:** `core/tiny_refill.h` (lines 136-157)
```c
// For hot tiny classes (0..3), try mailbox first
if (class_idx <= 3) {
    uint32_t self_tid = tiny_self_u32();
    ROUTE_MARK(3);
    uintptr_t mail = mailbox_box_fetch(class_idx); // Line 139
    if (mail) {
        SuperSlab* mss = slab_entry_ss(mail);
        int midx = slab_entry_idx(mail);
        SlabHandle h = slab_try_acquire(mss, midx, self_tid);
        if (slab_is_valid(&h)) {
            if (slab_remote_pending(&h)) {
                slab_drain_remote_full(&h);
            } else if (slab_freelist(&h)) {
                tiny_tls_bind_slab(tls, h.ss, h.slab_idx);
                ROUTE_MARK(4);
                return h.ss; // Success!
            }
        }
    }
}
```
**Status:** ✓ CORRECT - mailbox fetch is called for refill
#### Mailbox Fetch Implementation
**File:** `core/box/mailbox_box.c` (lines 160-207)
```c
uintptr_t mailbox_box_fetch(int class_idx) {
    uint32_t used = atomic_load_explicit(&g_pub_mailbox_used[class_idx], memory_order_acquire);
    // Destructive fetch of first available entry (0..used-1)
    for (uint32_t i = 0; i < used; i++) {
        uintptr_t ent = atomic_exchange_explicit(&g_pub_mailbox_entries[class_idx][i],
                                                 (uintptr_t)0,
                                                 memory_order_acq_rel);
        if (ent) {
            g_rf_hit_mail[class_idx]++; // Line 200 - COUNTER INCREMENTED
            return ent;
        }
    }
    return (uintptr_t)0;
}
```
**Status:** ✓ CORRECT - fetch clears the mailbox entry
---
## Fix Log (2025-11-06)
- P0: Do not clear `nonempty_mask`
  - Change: in `core/slab_handle.h`, `slab_freelist_pop()` no longer clears `nonempty_mask` when the slab transitions to empty.
  - Reason: a slab that has been non-empty at least once stays rediscoverable, which prevents the leak where reuse after free becomes invisible.
- P0: Make adopt_gate TOCTOU-safe
  - Change: every check immediately before a bind now goes through `slab_is_safe_to_bind()`. The mailbox/hot/ready/BG-aggregation branches in `core/tiny_refill.h` were updated.
  - Change: the adopt_gate implementation (`core/hakmem_tiny.c`) always performs a final `slab_is_safe_to_bind()` check after `slab_drain_remote_full()`.
- P1: Add refill item breakdown counters
  - Change: added `g_rf_freelist_items[]` / `g_rf_carve_items[]` to `core/hakmem_tiny.c`.
  - Change: `core/hakmem_tiny_refill_p0.inc.h` counts the items obtained from freelist and from carve.
  - Change: added [Refill Item Sources] to the dump in `core/hakmem_tiny_stats.c`.
- Unify the mailbox implementation
  - Change: removed `core/tiny_mailbox.c/.h`. The only implementation is now `core/box/mailbox_box.*` (the comprehensive Box).
- Makefile fix
  - Change: fixed the typo `>/devnull` → `>/dev/null`.
### Verification Guide (SIGUSR1 / exit-time dump)
- [Refill Stage]: check that mail/reg/ready do not stay at 0
- [Refill Item Sources]: check the freelist/carve balance (if freelist goes up, reuse is flowing)
- [Publish Hits] / [Publish Pipeline]: if these keep reporting 0, temporarily enable `HAKMEM_TINY_FREE_TO_SS=1` and `HAKMEM_TINY_FREELIST_MASK=1`
---
## Critical Bug Found
### BUG #1: Freelist Access Without Publish
**Location:** `core/hakmem_tiny_free.inc` (lines 687-695)
**Function:** `superslab_alloc_from_slab()` - Direct freelist pop during allocation
```c
// Freelist mode (after first free())
if (meta->freelist) {
    void* block = meta->freelist;
    meta->freelist = *(void**)block; // Pop from freelist
    meta->used++;
    tiny_remote_track_on_alloc(ss, slab_idx, block, "freelist_alloc", 0);
    tiny_remote_assert_not_remote(ss, slab_idx, block, "freelist_alloc_ret", 0);
    return block; // Direct pop - NO mailbox tracking!
}
```
**Problem:** When allocation directly pops from `meta->freelist`, it completely **bypasses the mailbox layer**. This means:
1. Block is pushed to freelist via `tiny_free_local_box()`
2. Mailbox is published on first-free ✓
3. But if the block is accessed during direct freelist pop, the mailbox entry is never fetched or cleared
4. The mailbox entry remains stale, wasting a slot permanently
**Impact:**
- **Permanent mailbox slot leakage** - Published blocks that are directly popped are never cleared
- **False positive in `g_pub_mail_hits[]`** - count includes blocks that bypassed the fetch path
- **Freelist reuse becomes invisible** to refill metrics because it doesn't go through mailbox_box_fetch()
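The divergence described above can be reproduced in a few lines. The following is a single-threaded sketch with one hypothetical mailbox slot and a hypothetical `SketchSlab` type; it models only the publish-on-first-free, direct-pop, and destructive-fetch steps and is not hakmem's actual code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { void *freelist; } SketchSlab;   /* hypothetical slab */

static _Atomic uintptr_t sketch_mail_slot;       /* one slot; 0 means empty */

/* Free path: push the block, publish the slab on the first free only. */
static void sketch_free(SketchSlab *s, void *block) {
    void *prev = s->freelist;
    *(void **)block = prev;
    s->freelist = block;
    if (prev == NULL)
        atomic_store_explicit(&sketch_mail_slot, (uintptr_t)s, memory_order_release);
}

/* Allocation path: pop directly; the mailbox slot is never touched. */
static void *sketch_alloc_direct(SketchSlab *s) {
    void *block = s->freelist;
    if (block != NULL)
        s->freelist = *(void **)block;
    return block;
}

/* Refill path: destructive fetch, swapping 0 into the slot. */
static SketchSlab *sketch_fetch(void) {
    return (SketchSlab *)atomic_exchange_explicit(&sketch_mail_slot, (uintptr_t)0,
                                                  memory_order_acq_rel);
}

/* Fetches one entry and reports whether it points at a slab with no free blocks. */
static int sketch_entry_is_stale(void) {
    SketchSlab *s = sketch_fetch();
    return s != NULL && s->freelist == NULL;
}
```

After one free followed by one direct pop, the slot still names the slab even though its freelist is empty, so the next fetch returns an entry with nothing to adopt.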
### BUG #2: Premature Publish Before Freelist Formation
**Location:** `core/box/free_local_box.c` (lines 32-34)
**Issue:** Publish happens only on first-free (prev==NULL)
```c
if (prev == NULL) {
    tiny_free_publish_first_free((int)ss->size_class, ss, slab_idx);
}
```
**Problem:** Once first-free publishes, subsequent pushes (prev!=NULL) are **silent**:
- Block 1 freed: freelist=[1], mailbox published ✓
- Block 2 freed: freelist=[2→1], mailbox NOT updated ⚠️
- Block 3 freed: freelist=[3→2→1], mailbox NOT updated ⚠️
The mailbox entry is written only when the slab's freelist goes from empty to non-empty. Later frees into an already non-empty freelist do not refresh it, so once the entry has been fetched, blocks freed afterwards are not advertised again.
**Impact:**
- Freelist state changes after first-free are not advertised
- Refill can't discover newly available blocks without full registry scan
- Forces slower adoption path (registry scan) instead of mailbox hit
---
## Design Issues
### Issue #1: Missing Freelist State Visibility
The core problem: **`meta->freelist` is not synchronized with publish state**.
**Current Flow:**
```
free()
  → tiny_free_local_box()
    → meta->freelist = ptr (direct write, no sync)
    → if (prev==NULL) mailbox_publish() (one-time)
refill()
  → Try mailbox_box_fetch() (gets only first-free block)
  → If miss, scan registry (slow path, O(n))
  → If found, adopt & pop freelist
alloc()
  → superslab_alloc_from_slab()
    → if (meta->freelist) pop (direct access, bypasses mailbox!)
```
**Missing:** Mailbox consistency check when freelist is accessed
### Issue #2: Adoption vs. Direct Access Race
**Location:** `core/hakmem_tiny_free.inc` (line 687-695)
| Step | Thread A | Thread B |
|------|----------|----------|
| 1 | Allocate from SS | |
| 2 | Free block → freelist=[1] | |
| 3 | Publish mailbox ✓ | |
| 4 | | Refill: try adopt |
| 5 | | Mailbox fetch gets [1] ✓ |
| 6 | | Ownership acquire → success |
| 7 | | But direct alloc bypasses this path! |
| 8 | Alloc again (same thread) | |
| 9 | Pop from freelist directly → mailbox entry stale now | |
**Result:** Mailbox state diverges from actual freelist state
### Issue #3: Ownership Transition Not Tracked
When `meta->owner_tid` changes (cross-thread ownership transfer), freelist is not re-published:
**Location:** `core/hakmem_tiny_free.inc` (lines 120-135)
```c
if (!g_tiny_force_remote && meta->owner_tid != 0 && meta->owner_tid == my_tid) {
    // Same-thread path
} else {
    // Cross-thread path - but NO REPUBLISH if ownership changes
}
```
**Missing:** When ownership transitions to a new thread, the existing freelist should be advertised to that thread
---
## Metrics Analysis
The counters reveal the issue:
**In `core/box/mailbox_box.c` (Line 122):**
```c
void mailbox_box_publish(int class_idx, SuperSlab* ss, int slab_idx) {
    // ...
    g_pub_mail_hits[class_idx]++; // Published count
}
```
**In `core/box/mailbox_box.c` (Line 200):**
```c
uintptr_t mailbox_box_fetch(int class_idx) {
    // ... (inside the scan loop over the mailbox entries)
    if (ent) {
        g_rf_hit_mail[class_idx]++; // Fetched count
        return ent;
    }
    // ...
    return (uintptr_t)0;
}
```
**Expected Relationship:** `g_rf_hit_mail[class_idx]` should be ~1.0x of `g_pub_mail_hits[class_idx]`
**Actual Relationship:** Probably 0.1x - 0.5x (many published entries never fetched)
**Explanation:**
- Blocks are published (g_pub_mail_hits++)
- But they're accessed via direct freelist pop (no fetch)
- So g_rf_hit_mail stays low
- Mailbox entries accumulate as garbage
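To turn the two counters into a single health number, the fetched-to-published ratio and the count of unfetched entries can be computed as below. This is only an arithmetic sketch: the function names are hypothetical, and the inputs stand for per-class readings of `g_pub_mail_hits[]` and `g_rf_hit_mail[]`.

```c
#include <assert.h>
#include <stdint.h>

/* Ratio of fetched to published entries; -1.0 when nothing was published. */
static double sketch_fetch_ratio(uint64_t published, uint64_t fetched) {
    if (published == 0)
        return -1.0;
    return (double)fetched / (double)published;
}

/* Entries that were published but never fetched (saturates at 0). */
static uint64_t sketch_unfetched(uint64_t published, uint64_t fetched) {
    return fetched >= published ? 0 : published - fetched;
}
```

For example, 1000 publishes against 250 fetches gives a ratio of 0.25 and 750 entries that were never consumed through the refill path.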
---
## Root Cause Summary
**Root Cause:** The freelist push is functional, but the **visibility mechanism (mailbox) is decoupled** from the **actual freelist access pattern**.
The system assumes refill always goes through mailbox_fetch(), but direct freelist pops bypass this entirely, creating:
1. **Stale mailbox entries** - Published but never fetched
2. **Invisible reuse** - Freed blocks are reused directly without fetch visibility
3. **Metric misalignment** - g_pub_mail_hits >> g_rf_hit_mail
---
## Recommended Fixes
### Fix #1: Clear Stale Mailbox Entry on Direct Pop
**File:** `core/hakmem_tiny_free.inc` (lines 687-695)
**In:** `superslab_alloc_from_slab()`
```c
if (meta->freelist) {
    void* block = meta->freelist;
    meta->freelist = *(void**)block;
    meta->used++;
    // NEW: If this is a mailbox-published slab, clear the entry
    if (slab_idx == 0) { // Only first slab publishes
        // Signal to refill: this slab's mailbox entry may now be stale
        // Option A: Mark as dirty (requires new field)
        // Option B: Clear mailbox on first pop (requires sync)
    }
    return block;
}
```
### Fix #2: Republish After Each Free (Aggressive)
**File:** `core/box/free_local_box.c` (lines 32-34)
**Problem:** Only first-free publishes
**Change:**
```c
// Always publish if freelist is non-empty
if (meta->freelist != NULL) {
    tiny_free_publish_first_free((int)ss->size_class, ss, slab_idx);
}
```
**Cost:** More atomic operations, but ensures mailbox is always up-to-date
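The trade-off between the two policies can be sketched with a toggle. The code below is a hypothetical single-threaded model (one slot per slab, a plain publish counter) and not the hakmem implementation; it only shows how many slot writes each policy performs and whether the slab is still advertised after a fetch consumed the entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    void             *freelist;
    _Atomic uintptr_t slot;       /* hypothetical single mailbox slot */
    unsigned          publishes;  /* number of atomic stores to the slot */
} SketchSlab;

/* Push a block; publish on first free, or on every free if always_publish. */
static void sketch_free(SketchSlab *s, void *block, bool always_publish) {
    void *prev = s->freelist;
    *(void **)block = prev;
    s->freelist = block;
    if (always_publish || prev == NULL) {
        atomic_store_explicit(&s->slot, (uintptr_t)s, memory_order_release);
        s->publishes++;
    }
}

/* True while the slot still advertises the slab. */
static bool sketch_slot_occupied(SketchSlab *s) {
    return atomic_load_explicit(&s->slot, memory_order_acquire) != 0;
}

/* Models a destructive fetch by the refill path. */
static void sketch_consume(SketchSlab *s) {
    atomic_exchange_explicit(&s->slot, (uintptr_t)0, memory_order_acq_rel);
}
```

With first-free-only publishing, a free that arrives after the entry was consumed leaves the slot empty; with always-publish, the same sequence costs one extra store and the slab is advertised again.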
### Fix #3: Track Freelist Modifications via Atomic
**New Approach:** Use atomic freelist_mask as published state
**File:** `core/box/free_local_box.c` (current lines 15-25)
```c
// The first-free case (prev == NULL) is already implemented.
// Proposal: set the bit on every free, not only on the first one.
uint32_t bit = (1u << slab_idx);
atomic_fetch_or_explicit(&ss->freelist_mask, bit, memory_order_release);
```
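A mask is useful as published state because the refill side can find a candidate slab with a single atomic load. The sketch below shows both sides under stated assumptions: `SketchSuperSlab` is a hypothetical stand-in holding only a 32-bit `freelist_mask`, slab indices are assumed to be below 32, and the scan helper is illustrative and not an existing hakmem function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef struct { _Atomic uint32_t freelist_mask; } SketchSuperSlab;

/* Free side: mark the slab as having free blocks. Idempotent. */
static void sketch_mark_nonempty(SketchSuperSlab *ss, int slab_idx) {
    assert(slab_idx >= 0 && slab_idx < 32);  /* assumption: mask covers 32 slabs */
    atomic_fetch_or_explicit(&ss->freelist_mask, UINT32_C(1) << slab_idx,
                             memory_order_release);
}

/* Refill side: lowest slab index whose bit is set, or -1 if none. */
static int sketch_first_nonempty(SketchSuperSlab *ss) {
    uint32_t mask = atomic_load_explicit(&ss->freelist_mask, memory_order_acquire);
    for (int i = 0; i < 32; i++) {
        if (mask & (UINT32_C(1) << i))
            return i;
    }
    return -1;
}
```

Setting a bit that is already set changes nothing, so marking on every free keeps the mask correct at the cost of one atomic OR per free.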
### Fix #4: Add Freelist Consistency Check in Refill
**File:** `core/tiny_refill.h` (lines ~140-156)
**New Logic:**
```c
uintptr_t mail = mailbox_box_fetch(class_idx);
if (mail) {
    SuperSlab* mss = slab_entry_ss(mail);
    int midx = slab_entry_idx(mail);
    SlabHandle h = slab_try_acquire(mss, midx, self_tid);
    if (slab_is_valid(&h)) {
        if (slab_freelist(&h)) {
            // NEW: Verify mailbox entry matches actual freelist
            if (h.ss->slabs[h.slab_idx].freelist != NULL) {
                tiny_tls_bind_slab(tls, h.ss, h.slab_idx);
                return h.ss;
            }
            // Stale entry - was already popped directly.
            // Re-publish if more blocks were freed since, then fall
            // through to the next candidate source (this excerpt is
            // not inside a loop, so `continue` cannot be used here).
        }
    }
}
```
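If "try next candidate" is meant literally, the validation needs a loop over the mailbox slots. The following single-threaded sketch shows one way that could look; the slot array, `SketchSlab`, and `sketch_fetch_valid()` are hypothetical, and the real code would also have to acquire ownership with `slab_try_acquire()` before reading the freelist.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SLOTS 4

typedef struct { void *freelist; } SketchSlab;        /* hypothetical slab */

static _Atomic uintptr_t sketch_slots[SKETCH_SLOTS];  /* 0 means empty */

/* Destructively take entries until one names a slab that still has free
 * blocks. Stale entries are dropped and counted in *stale_out. */
static SketchSlab *sketch_fetch_valid(unsigned *stale_out) {
    for (int i = 0; i < SKETCH_SLOTS; i++) {
        uintptr_t ent = atomic_exchange_explicit(&sketch_slots[i], (uintptr_t)0,
                                                 memory_order_acq_rel);
        if (ent == 0)
            continue;                 /* empty slot */
        SketchSlab *s = (SketchSlab *)ent;
        if (s->freelist == NULL) {
            (*stale_out)++;           /* already drained by direct pops */
            continue;                 /* try the next candidate */
        }
        return s;
    }
    return NULL;
}
```

Counting the dropped entries gives a direct measurement of how often published slabs were emptied before the refill path reached them.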
---
## Testing Recommendations
### Test 1: Mailbox vs. Direct Pop Ratio
Instrument the code to measure:
- `mailbox_fetch_calls` vs `direct_freelist_pops`
- Expected ratio after warmup: Should be ~1:1 if refill path is being used
- Actual ratio: Probably 1:10 or worse (direct pops dominating)
### Test 2: Mailbox Entry Staleness
Enable debug mode and check:
```
HAKMEM_TINY_MAILBOX_TRACE=1 HAKMEM_TINY_RF_TRACE=1 ./larson
```
Examine MBTRACE output:
- Count "publish" events vs "fetch" events
- Any publish without matching fetch = wasted slot
### Test 3: Freelist Reuse Path
Add instrumentation to `superslab_alloc_from_slab()`:
```c
if (meta->freelist) {
    g_direct_freelist_pops[class_idx]++; // New counter
}
```
Compare with refill path:
```c
g_refill_calls[class_idx]++;
```
Verify that most allocations come from direct freelist pops (expected) rather than from refill; a low refill count indicates that the freelist is working.
---
## Code Quality Issues Found
### Issue #1: Unused Function Parameter
**File:** `core/box/free_local_box.c` (line 8)
```c
void tiny_free_local_box(SuperSlab* ss, int slab_idx, TinySlabMeta* meta, void* ptr, uint32_t my_tid) {
    // ...
    (void)my_tid; // Explicitly ignored
}
```
**Why:** Parameter passed but not used - suggests design change where ownership was computed earlier
### Issue #2: Magic Number for First Slab
**File:** `core/hakmem_tiny_free.inc` (line 676)
```c
if (slab_idx == 0) {
    slab_start = (char*)slab_start + 1024; // Magic number!
}
```
Should be:
```c
if (slab_idx == 0) {
    slab_start = (char*)slab_start + sizeof(SuperSlab); // or named constant
}
```
### Issue #3: Duplicate Freelist Scan Logic
**Locations:**
- `core/hakmem_tiny_free.inc` (line ~45-62): `tiny_remote_queue_contains_guard()`
- `core/hakmem_tiny_free.inc` (line ~50-64): Duplicate in safe_free path
These should be unified into a helper function.
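A shared helper for the two scans could look roughly like this. It is a sketch only: the name `sketch_list_contains()` and the `max_steps` bound are assumptions, and it treats the list as an intrusive singly linked list whose next pointer is stored in the first word of each block, as the freelist above does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Walk an intrusive singly linked list looking for target.
 * max_steps bounds the walk so a corrupted or cyclic list cannot hang it. */
static bool sketch_list_contains(void *head, const void *target, size_t max_steps) {
    void *cur = head;
    for (size_t i = 0; cur != NULL && i < max_steps; i++) {
        if (cur == target)
            return true;
        cur = *(void **)cur;   /* next pointer is the first word of the block */
    }
    return false;
}
```

Both call sites would then pass their own list head and step limit, so a change to the scan logic only has to be made once.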
---
## Performance Impact
**Current Situation:**
- Freelist is functional and pushed correctly
- But publish/fetch visibility is weak
- Forces all allocations to use direct freelist pop (bypassing the refill path)
- This is actually **good** for performance (fewer lock/sync operations)
- But creates **hidden fragmentation** (freelist not reorganized by adopt path)
**After Fix:**
- Expect +5-10% refill path usage (from ~0% to ~5-10%)
- Refill path can reorganize and rebalance
- Better memory locality for hot allocations
- Slightly more atomic operations during free (acceptable trade-off)
---
## Conclusion
**The freelist push IS happening.** The bug is not in the push logic itself, but in:
1. **Visibility Gap:** Pushed blocks are not tracked by mailbox when accessed via direct pop
2. **Incomplete Publish:** Only first-free publishes; later frees are silent
3. **Lack of Republish:** Freelist state changes not advertised to refill path
The fixes are straightforward:
- Re-publish on every free (not just first-free)
- Validate mailbox entries during fetch
- Track direct vs. refill access to find optimal balance
This explains why Larson shows low refill metrics despite high freelist push rate.