hakmem/fix_active_counter_double_decrement.patch
Moe Charm (CI) f6b06a0311 Fix: Active counter double-decrement in P0 batch refill (4T crash → stable)
## Problem
HAKMEM 4T crashed with "free(): invalid pointer" on startup:
- System/mimalloc: 3.3M ops/s 
- HAKMEM 1T: 838K ops/s (-75%) ⚠️
- HAKMEM 4T: Crash (Exit 134) 

Error: superslab_refill returned NULL (OOM), active=0, bitmap=0x00000000

## Root Cause (Ultrathink Task Agent Investigation)
Active counter double-decrement when re-allocating from freelist:

1. Free → counter--
2. Remote drain → block added to freelist (no counter change)
3. P0 batch refill → block moved to TLS cache, but counter++ was missing (BUG)
4. Next free → counter-- again (double decrement)

Result: Counter underflow → SuperSlab appears "full" → OOM → crash
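
The four steps can be replayed in a toy model. This is a minimal sketch, not HAKMEM code: `ToySlab`, `toy_free_and_drain`, `toy_refill`, and `toy_run_cycles` are hypothetical names, and the counter is assumed to be an unsigned 32-bit integer that wraps on underflow (the real counter type and its underflow handling are not shown in this commit).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for SuperSlab bookkeeping; NOT the real HAKMEM struct. */
typedef struct {
    uint32_t active;    /* blocks currently counted as allocated */
    uint32_t freelist;  /* blocks parked on the slab freelist */
    uint32_t tls_cache; /* blocks sitting in the thread-local cache */
} ToySlab;

/* Steps 1 and 2: free decrements the counter, drain parks the block. */
static void toy_free_and_drain(ToySlab *s) {
    s->active--;   /* assumption: unsigned counter, wraps below zero */
    s->freelist++;
}

/* Step 3: batch refill moves blocks from the freelist to the TLS cache.
 * apply_fix selects whether the counter is re-incremented. */
static uint32_t toy_refill(ToySlab *s, uint32_t want, bool apply_fix) {
    uint32_t take = want < s->freelist ? want : s->freelist;
    s->freelist -= take;
    s->tls_cache += take;
    if (apply_fix) {
        s->active += take; /* the increment the patch adds */
    }
    return take;
}

/* Recycle one block `cycles` times and return the final active count. */
static uint32_t toy_run_cycles(uint32_t cycles, bool apply_fix) {
    ToySlab s = { .active = 1, .freelist = 0, .tls_cache = 0 };
    for (uint32_t i = 0; i < cycles; i++) {
        toy_free_and_drain(&s);
        toy_refill(&s, 1, apply_fix);
        assert(s.tls_cache > 0);
        s.tls_cache--; /* block handed back to the caller (step 4 frees it) */
    }
    return s.active;
}
```

In this model, `toy_run_cycles(2, true)` leaves the counter at 1, while `toy_run_cycles(2, false)` wraps it to `UINT32_MAX` on the second free.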

## Fix (1 line)
File: core/hakmem_tiny_refill_p0.inc.h:103

+ss_active_add(tls->ss, from_freelist);

Reason: Freelist re-allocation moves block from "free" to "allocated" state,
so active counter MUST increment.
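
One way to surface this class of accounting bug earlier is a decrement that refuses to go below zero. The following is a sketch only, under the assumption that the active counter is a C11 atomic `uint32_t`; `toy_active_dec_checked` and `checked_dec_demo` are hypothetical helpers and are not part of HAKMEM or of this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical checked decrement: returns false instead of wrapping
 * when fewer than n blocks are counted as active. */
static bool toy_active_dec_checked(_Atomic uint32_t *active, uint32_t n) {
    uint32_t cur = atomic_load_explicit(active, memory_order_relaxed);
    do {
        if (cur < n) {
            return false; /* would underflow: an increment was missed upstream */
        }
        /* On failure, cur is reloaded with the current value and we retry. */
    } while (!atomic_compare_exchange_weak_explicit(
                 active, &cur, cur - n,
                 memory_order_relaxed, memory_order_relaxed));
    return true;
}

/* Demo wrapper: returns the remaining count, or -1 if the decrement
 * was refused. */
static int64_t checked_dec_demo(uint32_t start, uint32_t n) {
    _Atomic uint32_t active = start;
    if (!toy_active_dec_checked(&active, n)) {
        return -1;
    }
    return (int64_t)atomic_load_explicit(&active, memory_order_relaxed);
}
```

A debug build could log or abort when the helper returns false, which would point at the missing increment rather than at a later OOM.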

## Verification
| Setting        | Before | After         | Result       |
|----------------|--------|---------------|--------------|
| 4T default     | Crash  | 838,445 ops/s | 🎉 Stable    |
| Stability (2x) | -      | Same score    | Reproducible |

## Remaining Issue
HAKMEM_TINY_REFILL_COUNT_HOT=64 still triggers a crash (class=4 OOM):
- Suspected: TLS cache over-accumulation or a memory leak
- Next: Investigate the HAKMEM_TINY_FAST_CAP interaction

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-07 12:37:23 +09:00


--- a/core/hakmem_tiny_refill_p0.inc.h
+++ b/core/hakmem_tiny_refill_p0.inc.h
@@ -99,9 +99,10 @@ static inline int sll_refill_batch_from_ss(int class_idx, int max_take) {
     uint32_t from_freelist = trc_pop_from_freelist(meta, want, &chain);
     if (from_freelist > 0) {
         trc_splice_to_sll(class_idx, &chain, &g_tls_sll_head[class_idx], &g_tls_sll_count[class_idx]);
-        // NOTE: from_freelist blocks are recirculated blocks already counted in used/active, so no active add or
-        // nonempty_mask clear is needed (if cleared, it is not set again by a subsequent free)
+        // FIX (2025-11-07): Blocks from freelist were decremented when freed (remote or local).
+        // Must increment counter when moving back to allocation pool (TLS SLL).
+        // Bug: Without this, counter underflows → false OOM → crash.
+        ss_active_add(tls->ss, from_freelist);
         extern unsigned long long g_rf_freelist_items[];
         g_rf_freelist_items[class_idx] += from_freelist;
         total_taken += from_freelist;