hakmem/core/box/free_local_box.c
Moe Charm (CI) c9053a43ac Phase 6-2.3~6-2.5: Critical bug fixes + SuperSlab optimization (WIP)
## Phase 6-2.3: Fix 4T Larson crash (active counter bug) 
**Problem:** 4T Larson crashed with "free(): invalid pointer", OOM errors
**Root cause:** core/hakmem_tiny_refill_p0.inc.h:103
  - P0 batch refill moved freelist blocks to TLS cache
  - Active counter NOT incremented → double-decrement on free
  - Counter underflows → SuperSlab appears full → OOM → crash
**Fix:** Added ss_active_add(tls->ss, from_freelist);
**Result:** 4T stable at 838K ops/s 
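The underflow can be reproduced in isolation. The sketch below is not the hakmem code: `DemoSlab`, `demo_active_add`, `demo_active_dec_one`, and `demo_refill` are hypothetical stand-ins, and the counter is assumed to be an unsigned 32-bit value. It starts with four blocks already on the freelist (counter at 0), refills them into a thread cache, then frees each one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a SuperSlab's active-block counter. */
typedef struct {
    uint32_t total_active;
} DemoSlab;

static void demo_active_add(DemoSlab* ss, uint32_t n) { ss->total_active += n; }
static void demo_active_dec_one(DemoSlab* ss)         { ss->total_active -= 1; }

/* Hand n freelist blocks to a thread cache.
 * account != 0 mirrors the fix: the blocks are counted as active again. */
static void demo_refill(DemoSlab* ss, uint32_t n, int account) {
    if (account) demo_active_add(ss, n);
}

/* Refill four blocks that were already freed once, then free them again. */
static uint32_t demo_counter_after_frees(int account) {
    DemoSlab ss = { .total_active = 0 };
    const uint32_t n = 4;
    demo_refill(&ss, n, account);
    for (uint32_t i = 0; i < n; i++) demo_active_dec_one(&ss);
    return ss.total_active;
}
```

With accounting enabled the counter returns to 0. Without it, the unsigned counter wraps to `UINT32_MAX - 3`, which is how a nearly empty slab could come to look full.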

## Phase 6-2.4: Fix SEGV in random_mixed/mid_large_mt benchmarks 
**Problem:** bench_random_mixed_hakmem, bench_mid_large_mt_hakmem → immediate SEGV
**Root cause #1:** core/box/hak_free_api.inc.h:92-95
  - "Guess loop" dereferenced unmapped memory when registry lookup failed
**Root cause #2:** core/box/hak_free_api.inc.h:115
  - Header magic check dereferenced unmapped memory
**Fix:**
  1. Removed dangerous guess loop (lines 92-95)
  2. Added hak_is_memory_readable() check before dereferencing header
     (core/hakmem_internal.h:277-294 - uses mincore() syscall)
**Result:**
  - random_mixed (2KB): SEGV → 2.22M ops/s 
  - random_mixed (4KB): SEGV → 2.58M ops/s 
  - Larson 4T: no regression (838K ops/s) 
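For illustration, a `mincore()`-based probe might look like the sketch below. This is an assumption about the general technique on Linux, not a copy of `hak_is_memory_readable()`; `demo_is_mapped` is a hypothetical name. Note that `mincore()` reports whether a page is mapped, which is weaker than readable (a `PROT_NONE` mapping would still pass).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes mincore() and MAP_ANONYMOUS in glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the page containing addr is mapped
 * in this process, 0 otherwise. Costs one syscall per call. */
static int demo_is_mapped(const void* addr) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return 0;
    /* mincore() requires a page-aligned start address. */
    uintptr_t base = (uintptr_t)addr & ~((uintptr_t)page - 1);
    unsigned char vec;
    /* Fails with ENOMEM when the range contains unmapped memory. */
    return mincore((void*)base, (size_t)page, &vec) == 0;
}
```

A caller would check `demo_is_mapped(header)` before reading a header's magic value and fall back to another path when it returns 0.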

## Phase 6-2.5: Performance investigation + SuperSlab fix (WIP) ⚠️
**Problem:** Severe performance gaps (19-26x slower than system malloc)
**Investigation:** Task agent identified root cause
  - hak_is_memory_readable() syscall overhead (100-300 cycles per free)
  - ALL frees hit unmapped_header_fallback path
  - SuperSlab lookup NEVER called
  - Why? g_use_superslab = 0 (disabled by diet mode)

**Root cause:** core/hakmem_tiny_init.inc:104-105
  - Diet mode (default ON) disables SuperSlab
  - SuperSlab defaults to 1 (hakmem_config.c:334)
  - BUT diet mode overrides it to 0 during init

**Fix:** Separate SuperSlab from diet mode
  - SuperSlab: Performance-critical (fast alloc/free)
  - Diet mode: Memory efficiency (magazine capacity limits only)
  - Both are independent features, should not interfere
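The intended separation can be sketched as follows. `DemoConfig`, `demo_init`, and the capacity values 128 and 32 are hypothetical and chosen only for illustration; the real init code in core/hakmem_tiny_init.inc is not reproduced here.

```c
#include <assert.h>

/* Hypothetical configuration holding the two independent features. */
typedef struct {
    int use_superslab; /* performance: fast alloc/free path */
    int magazine_cap;  /* memory efficiency: per-thread cache size */
} DemoConfig;

/* Diet mode shrinks the magazine capacity and leaves SuperSlab alone. */
static DemoConfig demo_init(int diet_mode) {
    DemoConfig cfg = { .use_superslab = 1, .magazine_cap = 128 };
    if (diet_mode) {
        cfg.magazine_cap = 32; /* illustrative limit, not the real value */
    }
    return cfg;
}
```

In this shape, turning diet mode on or off never changes `use_superslab`, so the SuperSlab lookup path stays reachable in both configurations.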

**Status:** ⚠️ INCOMPLETE - New SEGV discovered after fix
  - SuperSlab lookup now works (confirmed via debug output)
  - But benchmark crashes (Exit 139) after ~20 lookups
  - Needs further investigation

**Files modified:**
- core/hakmem_tiny_init.inc:99-109 - Removed diet mode override
- PERFORMANCE_INVESTIGATION_REPORT.md - Task agent analysis (303x instruction gap)

**Next steps:**
- Investigate new SEGV (likely SuperSlab free path bug)
- OR: Revert Phase 6-2.5 changes if blocking progress

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-07 20:31:01 +09:00


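// Standard headers used directly in this file (the project headers may
// already pull these in; listing them keeps the file self-describing).
#include <stdatomic.h>  // atomic_fetch_add_explicit, atomic_thread_fence
#include <stddef.h>     // size_t, NULL
#include <stdint.h>     // uint8_t, uint32_t, uintptr_t
#include <stdlib.h>     // getenv

#include "free_local_box.h"
#include "free_publish_box.h"
#include "hakmem_tiny.h"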

// Same-thread free: push ptr onto the slab's freelist and update counters.
void tiny_free_local_box(SuperSlab* ss, int slab_idx, TinySlabMeta* meta, void* ptr, uint32_t my_tid) {
    extern _Atomic uint64_t g_free_local_box_calls;
    atomic_fetch_add_explicit(&g_free_local_box_calls, 1, memory_order_relaxed);

    // Reject invalid input before touching any slab state.
    if (!(ss && ss->magic == SUPERSLAB_MAGIC)) return;
    if (slab_idx < 0 || slab_idx >= ss_slabs_capacity(ss)) return;
    if (!meta) return;  // meta is dereferenced unconditionally below
    (void)my_tid;

    // Fail-fast diagnostics (level >= 2): verify that ptr belongs to this
    // slab, sits on a block boundary, and lies within the slab's capacity.
    if (__builtin_expect(tiny_refill_failfast_level() >= 2, 0)) {
        int actual_idx = slab_index_for(ss, ptr);
        if (actual_idx != slab_idx) {
            tiny_failfast_abort_ptr("free_local_box_idx", ss, slab_idx, ptr, "slab_idx_mismatch");
        } else {
            size_t blk = g_tiny_class_sizes[ss->size_class];
            uint8_t* slab_base = tiny_slab_base_for(ss, slab_idx);
            uintptr_t delta = (uintptr_t)ptr - (uintptr_t)slab_base;
            if (blk == 0 || (delta % blk) != 0) {
                tiny_failfast_abort_ptr("free_local_box_align", ss, slab_idx, ptr, "misaligned");
            } else if (delta / blk >= meta->capacity) {
                tiny_failfast_abort_ptr("free_local_box_range", ss, slab_idx, ptr, "out_of_capacity");
            }
        }
    }

    // Intrusive freelist push: the freed block stores the old head.
    void* prev = meta->freelist;
    *(void**)ptr = prev;
    meta->freelist = ptr;
    tiny_failfast_log("free_local_box", ss->size_class, ss, meta, ptr, prev);

    // BUGFIX: Memory barrier to ensure freelist visibility before used decrement
    // Without this, other threads can see new freelist but old used count (race)
    atomic_thread_fence(memory_order_release);

    // Optional freelist mask update on first push
    do {
        // Lazily read HAKMEM_TINY_FREELIST_MASK once (-1 = not read yet).
        static int g_mask_en = -1;
        if (__builtin_expect(g_mask_en == -1, 0)) {
            const char* e = getenv("HAKMEM_TINY_FREELIST_MASK");
            g_mask_en = (e && *e && *e != '0') ? 1 : 0;
        }
        if (__builtin_expect(g_mask_en, 0) && prev == NULL) {
            uint32_t bit = (1u << slab_idx);
            atomic_fetch_or_explicit(&ss->freelist_mask, bit, memory_order_release);
        }
    } while (0);

    // Track local free (debug helpers may be no-op)
    tiny_remote_track_on_local_free(ss, slab_idx, ptr, "local_free", my_tid);

    // One fewer block in use in this slab and in the SuperSlab as a whole.
    meta->used--;
    ss_active_dec_one(ss);

    if (prev == NULL) {
        // First-free → advertise slab to adopters
        tiny_free_publish_first_free((int)ss->size_class, ss, slab_idx);
    }
}