# Free Path Freelist Push Investigation

## Executive Summary

Investigation of the freelist push implementation in the same-thread free path has identified **ONE CRITICAL BUG** and **MULTIPLE DESIGN ISSUES** that explain the freelist reuse rate problem.

**Critical Finding:** The freelist push is being performed, but it is **only visible when blocks are accessed from the refill path**, not when they are accessed from normal allocation paths. This creates a **visibility gap** in the publish/fetch mechanism.

---

## Investigation Flow: free() → alloc()

### Phase 1: Same-Thread Free (freelist push)

**File:** `core/hakmem_tiny_free.inc` (lines 1-608)
**Main Function:** `hak_tiny_free_superslab(void* ptr, SuperSlab* ss)` (lines ~150-300)

#### Fast Path Decision (Line 121):

```c
if (!g_tiny_force_remote && meta->owner_tid != 0 && meta->owner_tid == my_tid) {
    // Same-thread free
    // ...
    tiny_free_local_box(ss, slab_idx, meta, ptr, my_tid);
    // ...
}
```

**Status:** ✓ CORRECT - ownership check is present

#### Freelist Push Implementation

**File:** `core/box/free_local_box.c` (lines 5-36)

```c
void tiny_free_local_box(SuperSlab* ss, int slab_idx, TinySlabMeta* meta, void* ptr, uint32_t my_tid) {
    void* prev = meta->freelist;
    *(void**)ptr = prev;
    meta->freelist = ptr;  // <-- FREELIST PUSH HAPPENS HERE (Line 12)

    // ...
    meta->used--;
    ss_active_dec_one(ss);

    if (prev == NULL) {
        // First-free → publish
        tiny_free_publish_first_free((int)ss->size_class, ss, slab_idx);  // Line 34
    }
}
```

**Status:** ✓ CORRECT - freelist push happens unconditionally before publish

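The push above is an intrusive LIFO list: the freed block's first word stores the previous head, and the "first-free" condition is simply that the previous head was `NULL`. The sketch below isolates that mechanism. `DemoSlabMeta`, `demo_freelist_push()` and `demo_freelist_pop()` are hypothetical stand-ins for illustration, not the real `TinySlabMeta` layout or hakmem functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for TinySlabMeta (not the real layout). */
typedef struct {
    void*    freelist;  /* head of the intrusive LIFO list of free blocks */
    unsigned used;      /* number of blocks currently allocated */
} DemoSlabMeta;

/* Push a block; returns 1 when this was the first free (list was empty),
 * which is the case in which the real code publishes the slab. */
static int demo_freelist_push(DemoSlabMeta* meta, void* block) {
    assert(meta != NULL && block != NULL && meta->used > 0);
    void* prev = meta->freelist;
    *(void**)block = prev;   /* store the old head inside the freed block */
    meta->freelist = block;  /* the freed block becomes the new head */
    meta->used--;
    return prev == NULL;
}

/* Pop the most recently freed block, or NULL when the list is empty. */
static void* demo_freelist_pop(DemoSlabMeta* meta) {
    void* block = meta->freelist;
    if (block != NULL) {
        meta->freelist = *(void**)block;  /* next pointer lives in the block */
        meta->used++;
    }
    return block;
}
```

Only the first of several consecutive pushes reports a first-free, which is why later frees into the same slab produce no further publish.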
#### Publish Mechanism

**File:** `core/box/free_publish_box.c` (lines 23-28)

```c
void tiny_free_publish_first_free(int class_idx, SuperSlab* ss, int slab_idx) {
    tiny_ready_push(class_idx, ss, slab_idx);
    ss_partial_publish(class_idx, ss);
    mailbox_box_publish(class_idx, ss, slab_idx);  // Line 28
}
```

**File:** `core/box/mailbox_box.c` (lines 112-122)

```c
void mailbox_box_publish(int class_idx, SuperSlab* ss, int slab_idx) {
    mailbox_box_register(class_idx);
    uintptr_t ent = ((uintptr_t)ss) | ((uintptr_t)slab_idx & 0x3Fu);
    uint32_t slot = g_tls_mailbox_slot[class_idx];
    atomic_store_explicit(&g_pub_mailbox_entries[class_idx][slot], ent, memory_order_release);
    g_pub_mail_hits[class_idx]++;  // Line 122 - COUNTER INCREMENTED
}
```

**Status:** ✓ CORRECT - publish happens on first-free

---

### Phase 2: Refill/Adoption Path (mailbox fetch)

**File:** `core/tiny_refill.h` (lines 136-157)

```c
// For hot tiny classes (0..3), try mailbox first
if (class_idx <= 3) {
    uint32_t self_tid = tiny_self_u32();
    ROUTE_MARK(3);
    uintptr_t mail = mailbox_box_fetch(class_idx);  // Line 139
    if (mail) {
        SuperSlab* mss = slab_entry_ss(mail);
        int midx = slab_entry_idx(mail);
        SlabHandle h = slab_try_acquire(mss, midx, self_tid);
        if (slab_is_valid(&h)) {
            if (slab_remote_pending(&h)) {
                slab_drain_remote_full(&h);
            } else if (slab_freelist(&h)) {
                tiny_tls_bind_slab(tls, h.ss, h.slab_idx);
                ROUTE_MARK(4);
                return h.ss;  // Success!
            }
        }
    }
}
```

**Status:** ✓ CORRECT - mailbox fetch is called for refill

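The refill path decodes a mailbox entry with `slab_entry_ss()` and `slab_entry_idx()`, whose definitions are not shown in this report. Given the encoding used in `mailbox_box_publish()` (`ss | (slab_idx & 0x3F)`), a decoder could look like the sketch below. It assumes SuperSlab addresses are aligned to at least 64 bytes so the low 6 bits are free; the `demo_*` names are hypothetical and are not the real helpers.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_IDX_MASK ((uintptr_t)0x3Fu)  /* low 6 bits carry the slab index */

/* Pack a (superslab, slab index) pair the same way mailbox_box_publish() does. */
static uintptr_t demo_entry_pack(const void* ss, int slab_idx) {
    /* Assumption: ss is aligned so that its low 6 bits are zero. */
    assert(((uintptr_t)ss & DEMO_IDX_MASK) == 0);
    return (uintptr_t)ss | ((uintptr_t)slab_idx & DEMO_IDX_MASK);
}

/* Recover the superslab address by masking off the index bits. */
static void* demo_entry_ss(uintptr_t ent) {
    return (void*)(ent & ~DEMO_IDX_MASK);
}

/* Recover the slab index from the low 6 bits. */
static int demo_entry_idx(uintptr_t ent) {
    return (int)(ent & DEMO_IDX_MASK);
}
```

Because a valid entry always contains a non-NULL pointer, the value 0 stays available as the "empty slot" marker that the fetch loop tests for.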
#### Mailbox Fetch Implementation

**File:** `core/box/mailbox_box.c` (lines 160-207)

```c
uintptr_t mailbox_box_fetch(int class_idx) {
    uint32_t used = atomic_load_explicit(&g_pub_mailbox_used[class_idx], memory_order_acquire);

    // Destructive fetch of first available entry (0..used-1)
    for (uint32_t i = 0; i < used; i++) {
        uintptr_t ent = atomic_exchange_explicit(&g_pub_mailbox_entries[class_idx][i],
                                                 (uintptr_t)0,
                                                 memory_order_acq_rel);
        if (ent) {
            g_rf_hit_mail[class_idx]++;  // Line 200 - COUNTER INCREMENTED
            return ent;
        }
    }
    return (uintptr_t)0;
}
```

**Status:** ✓ CORRECT - fetch clears the mailbox entry

---

## Fix Log (2025-11-06)

- P0: Do not clear `nonempty_mask`
  - Change: In `slab_freelist_pop()` in `core/slab_handle.h`, removed the code that cleared `nonempty_mask` when the freelist transitions to empty.
  - Reason: A slab that has been non-empty at least once can be rediscovered, which prevents the leak where reuse after free becomes invisible.

- P0: Make adopt_gate TOCTOU-safe
  - Change: Unified every check made immediately before bind into `slab_is_safe_to_bind()`. Updated the mailbox/hot/ready/BG aggregation branches in `core/tiny_refill.h`.
  - Change: The adopt_gate implementation (`core/hakmem_tiny.c`) now always performs a final `slab_is_safe_to_bind()` check after `slab_drain_remote_full()`.

- P1: Add refill item breakdown counters
  - Change: Added `g_rf_freelist_items[]` / `g_rf_carve_items[]` to `core/hakmem_tiny.c`.
  - Change: `core/hakmem_tiny_refill_p0.inc.h` now counts the items obtained from freelist and from carve.
  - Change: Added [Refill Item Sources] to the dump in `core/hakmem_tiny_stats.c`.

- Consolidate the mailbox implementation
  - Change: Removed the old `core/tiny_mailbox.c/.h`. The implementation now lives only in `core/box/mailbox_box.*` (the comprehensive Box).

- Makefile fix
  - Change: Typo fix `>/devnull` → `>/dev/null`.

### Verification Guidelines (SIGUSR1 / exit-time dump)

- Check that mail/reg/ready in [Refill Stage] are not stuck at 0
- Check the freelist/carve balance in [Refill Item Sources] (a rising freelist count means reuse is flowing)
- If [Publish Hits] / [Publish Pipeline] show repeated zeros, temporarily enable `HAKMEM_TINY_FREE_TO_SS=1` or `HAKMEM_TINY_FREELIST_MASK=1`

---

## Critical Bug Found

### BUG #1: Freelist Access Without Publish

**Location:** `core/hakmem_tiny_free.inc` (lines 687-695)
**Function:** `superslab_alloc_from_slab()` - Direct freelist pop during allocation

```c
// Freelist mode (after first free())
if (meta->freelist) {
    void* block = meta->freelist;
    meta->freelist = *(void**)block;  // Pop from freelist
    meta->used++;
    tiny_remote_track_on_alloc(ss, slab_idx, block, "freelist_alloc", 0);
    tiny_remote_assert_not_remote(ss, slab_idx, block, "freelist_alloc_ret", 0);
    return block;  // Direct pop - NO mailbox tracking!
}
```

**Problem:** When allocation pops directly from `meta->freelist`, it completely **bypasses the mailbox layer**. This means:

1. Block is pushed to freelist via `tiny_free_local_box()` ✓
2. Mailbox is published on first-free ✓
3. But if the block is then taken by a direct freelist pop, the mailbox entry is never fetched or cleared
4. The mailbox entry remains stale, wasting a slot permanently

**Impact:**

- **Permanent mailbox slot leakage** - Entries published for blocks that are later popped directly are never cleared
- **False positive in `g_pub_mail_hits[]`** - count includes blocks that bypassed the fetch path
- **Freelist reuse becomes invisible** to refill metrics because it doesn't go through `mailbox_box_fetch()`

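The sequence described in BUG #1 can be reproduced in a small single-threaded model. The sketch below uses one mailbox slot and two counters that play the roles of `g_pub_mail_hits[]` and `g_rf_hit_mail[]`. Every name here (`DemoState`, `demo_free()`, `demo_alloc_direct()`, `demo_fetch()`, `demo_mailbox_is_stale()`) is hypothetical, and the model ignores threads and ownership entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-slot mailbox plus counters; not the real hakmem layout. */
typedef struct {
    _Atomic uintptr_t slot;  /* 0 means "no entry" */
    unsigned published;      /* plays the role of g_pub_mail_hits[] */
    unsigned fetched;        /* plays the role of g_rf_hit_mail[] */
    void*    freelist;       /* slab freelist head */
} DemoState;

/* Free path: push the block, publish only on first-free. */
static void demo_free(DemoState* s, void* block) {
    assert(s != NULL && block != NULL);
    void* prev = s->freelist;
    *(void**)block = prev;
    s->freelist = block;
    if (prev == NULL) {
        atomic_store_explicit(&s->slot, (uintptr_t)s, memory_order_release);
        s->published++;
    }
}

/* Allocation path: direct pop that never touches the mailbox. */
static void* demo_alloc_direct(DemoState* s) {
    void* block = s->freelist;
    if (block != NULL) s->freelist = *(void**)block;
    return block;
}

/* Refill path: destructive fetch of the mailbox slot. */
static uintptr_t demo_fetch(DemoState* s) {
    uintptr_t ent = atomic_exchange_explicit(&s->slot, (uintptr_t)0, memory_order_acq_rel);
    if (ent != 0) s->fetched++;
    return ent;
}

/* Returns 1 if the mailbox still advertises a slab whose freelist is empty. */
static int demo_mailbox_is_stale(DemoState* s) {
    return atomic_load_explicit(&s->slot, memory_order_acquire) != 0
        && s->freelist == NULL;
}
```

Calling `demo_free()` once and then `demo_alloc_direct()` leaves `published == 1`, `fetched == 0`, and `demo_mailbox_is_stale()` returning 1, which mirrors steps 1 to 4 above.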
### BUG #2: Premature Publish Before Freelist Formation

**Location:** `core/box/free_local_box.c` (lines 32-34)
**Issue:** Publish happens only on first-free (prev==NULL)

```c
if (prev == NULL) {
    tiny_free_publish_first_free((int)ss->size_class, ss, slab_idx);
}
```

**Problem:** Once first-free publishes, subsequent pushes (prev!=NULL) are **silent**:

- Block 1 freed: freelist=[1], mailbox published ✓
- Block 2 freed: freelist=[2→1], mailbox NOT updated ⚠️
- Block 3 freed: freelist=[3→2→1], mailbox NOT updated ⚠️

The mailbox entry is written only when the first block in the slab is freed. If that entry has since been consumed and another block is freed while the freelist is still non-empty, the mailbox entry is not refreshed.

**Impact:**

- Freelist state changes after first-free are not advertised
- Refill can't discover newly available blocks without a full registry scan
- Forces the slower adoption path (registry scan) instead of a mailbox hit

---

## Design Issues

### Issue #1: Missing Freelist State Visibility

The core problem: **`meta->freelist` is not synchronized with publish state**.

**Current Flow:**

```
free()
  → tiny_free_local_box()
    → meta->freelist = ptr (direct write, no sync)
    → if (prev==NULL) mailbox_publish() (one-time)

refill()
  → Try mailbox_box_fetch() (gets only first-free block)
  → If miss, scan registry (slow path, O(n))
  → If found, adopt & pop freelist

alloc()
  → superslab_alloc_from_slab()
    → if (meta->freelist) pop (direct access, bypasses mailbox!)
```

**Missing:** Mailbox consistency check when freelist is accessed

### Issue #2: Adoption vs. Direct Access Race

**Location:** `core/hakmem_tiny_free.inc` (lines 687-695)

    Thread A:                          Thread B:
    1. Allocate from SS
    2. Free block → freelist=[1]
    3. Publish mailbox ✓
                                       4. Refill: Try adopt
                                       5. Mailbox fetch gets [1] ✓
                                       6. Ownership acquire → success
                                       7. But direct alloc bypasses this path!
    8. Alloc again (same thread)
    9. Pop from freelist directly
       → mailbox entry stale now

**Result:** Mailbox state diverges from actual freelist state

### Issue #3: Ownership Transition Not Tracked

When `meta->owner_tid` changes (cross-thread ownership transfer), the freelist is not re-published:

**Location:** `core/hakmem_tiny_free.inc` (lines 120-135)

```c
if (!g_tiny_force_remote && meta->owner_tid != 0 && meta->owner_tid == my_tid) {
    // Same-thread path
} else {
    // Cross-thread path - but NO REPUBLISH if ownership changes
}
```

**Missing:** When ownership transitions to a new thread, the existing freelist should be advertised to that thread

---

## Metrics Analysis

The counters reveal the issue:

**In `core/box/mailbox_box.c` (Line 122):**

```c
void mailbox_box_publish(int class_idx, SuperSlab* ss, int slab_idx) {
    // ...
    g_pub_mail_hits[class_idx]++;  // Published count
}
```

**In `core/box/mailbox_box.c` (Line 200):**

```c
uintptr_t mailbox_box_fetch(int class_idx) {
    // ... (inside the slot scan loop, see Phase 2)
    if (ent) {
        g_rf_hit_mail[class_idx]++;  // Fetched count
        return ent;
    }
    // ...
    return (uintptr_t)0;
}
```

**Expected Relationship:** `g_rf_hit_mail[class_idx]` should be ~1.0x of `g_pub_mail_hits[class_idx]`
**Actual Relationship:** Probably 0.1x - 0.5x (many published entries never fetched)

**Explanation:**

- Blocks are published (`g_pub_mail_hits++`)
- But they're accessed via direct freelist pop (no fetch)
- So `g_rf_hit_mail` stays low
- Mailbox entries accumulate as garbage

---

## Root Cause Summary

**Root Cause:** The freelist push is functional, but the **visibility mechanism (mailbox) is decoupled** from the **actual freelist access pattern**.

The system assumes refill always goes through `mailbox_box_fetch()`, but direct freelist pops bypass this entirely, creating:

1. **Stale mailbox entries** - Published but never fetched
2. **Invisible reuse** - Freed blocks are reused directly without fetch visibility
3. **Metric misalignment** - `g_pub_mail_hits` >> `g_rf_hit_mail`

---

## Recommended Fixes

### Fix #1: Clear Stale Mailbox Entry on Direct Pop

**File:** `core/hakmem_tiny_free.inc` (lines 687-695)
**In:** `superslab_alloc_from_slab()`

```c
if (meta->freelist) {
    void* block = meta->freelist;
    meta->freelist = *(void**)block;
    meta->used++;

    // NEW: If this is a mailbox-published slab, clear the entry
    if (slab_idx == 0) {  // Only first slab publishes
        // Signal to refill: this slab's mailbox entry may now be stale
        // Option A: Mark as dirty (requires new field)
        // Option B: Clear mailbox on first pop (requires sync)
    }

    return block;
}
```

### Fix #2: Republish After Each Free (Aggressive)

**File:** `core/box/free_local_box.c` (lines 32-34)
**Problem:** Only first-free publishes

**Change:**

```c
// Always publish if freelist is non-empty
if (meta->freelist != NULL) {
    tiny_free_publish_first_free((int)ss->size_class, ss, slab_idx);
}
```

**Cost:** More atomic operations, but ensures mailbox is always up-to-date

### Fix #3: Track Freelist Modifications via Atomic

**New Approach:** Use atomic freelist_mask as published state

**File:** `core/box/free_local_box.c` (current lines 15-25)

```c
// The prev == NULL case is already implemented - use it more aggressively.
// Marking on later frees as well means the bit is set unconditionally
// (both branches of the original if/else did the same thing).
uint32_t bit = (1u << slab_idx);
atomic_fetch_or_explicit(&ss->freelist_mask, bit, memory_order_release);
```

### Fix #4: Add Freelist Consistency Check in Refill

**File:** `core/tiny_refill.h` (lines ~140-156)
**New Logic:**

```c
uintptr_t mail = mailbox_box_fetch(class_idx);
if (mail) {
    SuperSlab* mss = slab_entry_ss(mail);
    int midx = slab_entry_idx(mail);
    SlabHandle h = slab_try_acquire(mss, midx, self_tid);
    if (slab_is_valid(&h)) {
        // NEW: Verify mailbox entry matches actual freelist
        if (slab_freelist(&h)) {
            tiny_tls_bind_slab(tls, h.ss, h.slab_idx);
            return h.ss;
        }
        // Stale entry - freelist was already popped directly.
        // Do not bind; fall through and try the next candidate.
        // (The original sketch used `continue` here, which is only valid
        // if this code sits inside a loop over candidates.)
        // Re-publish if more blocks have been freed since.
    }
}
```

---

## Testing Recommendations

### Test 1: Mailbox vs. Direct Pop Ratio

Instrument the code to measure:

- `mailbox_fetch_calls` vs `direct_freelist_pops`
- Expected ratio after warmup: Should be ~1:1 if refill path is being used
- Actual ratio: Probably 1:10 or worse (direct pops dominating)

### Test 2: Mailbox Entry Staleness

Enable debug mode and check:

```
HAKMEM_TINY_MAILBOX_TRACE=1 HAKMEM_TINY_RF_TRACE=1 ./larson
```

Examine MBTRACE output:

- Count "publish" events vs "fetch" events
- Any publish without matching fetch = wasted slot

### Test 3: Freelist Reuse Path

Add instrumentation to `superslab_alloc_from_slab()`:

```c
if (meta->freelist) {
    g_direct_freelist_pops[class_idx]++;  // New counter
}
```

Compare with refill path:

```c
g_refill_calls[class_idx]++;
```

Verify that most allocations come from direct freelist pops (expected) rather than from refill; a low refill count indicates that the freelist is working.

---

## Code Quality Issues Found

### Issue #1: Unused Function Parameter

**File:** `core/box/free_local_box.c` (line 8)

```c
void tiny_free_local_box(SuperSlab* ss, int slab_idx, TinySlabMeta* meta, void* ptr, uint32_t my_tid) {
    // ...
    (void)my_tid;  // Explicitly ignored
}
```

**Why:** Parameter passed but not used - suggests design change where ownership was computed earlier

### Issue #2: Magic Number for First Slab

**File:** `core/hakmem_tiny_free.inc` (line 676)

```c
if (slab_idx == 0) {
    slab_start = (char*)slab_start + 1024;  // Magic number!
}
```

Should be:

```c
if (slab_idx == 0) {
    slab_start = (char*)slab_start + sizeof(SuperSlab);  // or named constant
}
```

### Issue #3: Duplicate Freelist Scan Logic

**Locations:**

- `core/hakmem_tiny_free.inc` (line ~45-62): `tiny_remote_queue_contains_guard()`
- `core/hakmem_tiny_free.inc` (line ~50-64): Duplicate in safe_free path

These should be unified into a helper function.

---

## Performance Impact

**Current Situation:**

- Freelist is functional and pushed correctly
- But publish/fetch visibility is weak
- Forces all allocations to use direct freelist pop (bypassing refill path)
- This is actually **good** for performance (fewer lock/sync operations)
- But creates **hidden fragmentation** (freelist not reorganized by adopt path)

**After Fix:**

- Expect +5-10% refill path usage (from ~0% to ~5-10%)
- Refill path can reorganize and rebalance
- Better memory locality for hot allocations
- Slightly more atomic operations during free (acceptable trade-off)

---

## Conclusion

**The freelist push IS happening.** The bug is not in the push logic itself, but in:

1. **Visibility Gap:** Pushed blocks are not tracked by mailbox when accessed via direct pop
2. **Incomplete Publish:** Only first-free publishes; later frees are silent
3. **Lack of Republish:** Freelist state changes not advertised to refill path

The fixes are straightforward:

- Re-publish on every free (not just first-free)
- Validate mailbox entries during fetch
- Track direct vs. refill access to find optimal balance

This explains why Larson shows low refill metrics despite high freelist push rate.