Phase 6-2.3~6-2.5: Critical bug fixes + SuperSlab optimization (WIP)

## Phase 6-2.3: Fix 4T Larson crash (active counter bug) 
**Problem:** 4T Larson crashed with "free(): invalid pointer" and OOM errors
**Root cause:** core/hakmem_tiny_refill_p0.inc.h:103
  - P0 batch refill moved freelist blocks to the TLS cache
  - Active counter was NOT incremented for those blocks → each block was decremented a second time when freed again
  - Counter underflows → SuperSlab appears full → OOM → crash
**Fix:** Added ss_active_add(tls->ss, from_freelist);
**Result:** 4T stable at 838K ops/s 
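
The accounting problem can be illustrated with a toy model. This is only a sketch under assumptions: `ModelSlab`, `ModelTls`, and all `model_*` functions are hypothetical and do not reflect the real hakmem structures or `ss_active_add()`. It shows why a refill that moves blocks from the freelist to a thread-local cache has to count them as active again.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model; NOT the real hakmem data structures. */
typedef struct {
    uint32_t active;    /* blocks currently handed out or cached per thread */
    uint32_t freelist;  /* blocks sitting on the slab freelist */
} ModelSlab;

typedef struct {
    uint32_t cached;    /* blocks held in the thread-local cache */
} ModelTls;

/* Move up to `want` blocks from the freelist into the TLS cache.
 * count_active = 1 models the fixed refill, 0 models the buggy one. */
static uint32_t model_refill(ModelSlab* s, ModelTls* t, uint32_t want, int count_active) {
    uint32_t take = want < s->freelist ? want : s->freelist;
    s->freelist -= take;
    t->cached += take;
    if (count_active) s->active += take;  /* the step the fix adds */
    return take;
}

/* Hand one block out of the TLS cache; returns 0 when the cache is empty. */
static int model_alloc(ModelTls* t) {
    if (t->cached == 0) return 0;
    t->cached--;
    return 1;
}

/* Return a block to the slab; the free path always decrements `active`. */
static void model_free(ModelSlab* s) {
    s->active--;
    s->freelist++;
}

/* Refill four blocks, allocate and free each one, report the final counter. */
static uint32_t model_run(int count_active) {
    ModelSlab s = { .active = 0, .freelist = 4 };
    ModelTls t = { .cached = 0 };
    model_refill(&s, &t, 4, count_active);
    while (model_alloc(&t)) model_free(&s);
    return s.active;
}
```

In this model `model_run(1)` ends with a balanced counter of 0, while `model_run(0)` wraps the unsigned counter around to a huge value, which corresponds to the "SuperSlab appears full" symptom described above.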

## Phase 6-2.4: Fix SEGV in random_mixed/mid_large_mt benchmarks 
**Problem:** bench_random_mixed_hakmem, bench_mid_large_mt_hakmem → immediate SEGV
**Root cause #1:** core/box/hak_free_api.inc.h:92-95
  - "Guess loop" dereferenced unmapped memory when registry lookup failed
**Root cause #2:** core/box/hak_free_api.inc.h:115
  - Header magic check dereferenced unmapped memory
**Fix:**
  1. Removed dangerous guess loop (lines 92-95)
  2. Added hak_is_memory_readable() check before dereferencing header
     (core/hakmem_internal.h:277-294 - uses mincore() syscall)
**Result:**
  - random_mixed (2KB): SEGV → 2.22M ops/s 
  - random_mixed (4KB): SEGV → 2.58M ops/s 
  - Larson 4T: no regression (838K ops/s) 
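
For readers unfamiliar with the technique, a mincore()-based guard might look like the sketch below. It is an assumption-laden illustration, not the code at core/hakmem_internal.h:277-294: `sketch_is_mapped` and `sketch_demo` are hypothetical names, the code is Linux-specific, and mincore() only reports whether a page is mapped, not whether its protection bits allow reads.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes mincore() and MAP_ANONYMOUS in glibc headers */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for hak_is_memory_readable().
 * Returns 1 if the page containing `ptr` is mapped, 0 otherwise. */
static int sketch_is_mapped(const void* ptr) {
    if (ptr == NULL) return 0;
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return 0;
    /* mincore() requires a page-aligned start address. */
    uintptr_t addr = (uintptr_t)ptr & ~((uintptr_t)page - 1);
    unsigned char vec;
    /* The call fails with ENOMEM when the range contains unmapped pages. */
    return mincore((void*)addr, (size_t)page, &vec) == 0;
}

/* Map one page, probe it, unmap it, probe again.
 * Returns 1 if the probe said "mapped" before and "unmapped" after,
 * 0 if the probe disagreed, -1 if mmap itself failed. */
static int sketch_demo(void) {
    long page = sysconf(_SC_PAGESIZE);
    void* p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    int before = sketch_is_mapped((char*)p + 16);
    munmap(p, (size_t)page);
    int after = sketch_is_mapped((char*)p + 16);
    return (before == 1 && after == 0) ? 1 : 0;
}
```

A guard like this costs one system call per probe, so it suits a rare fallback path better than a path taken on every free.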

## Phase 6-2.5: Performance investigation + SuperSlab fix (WIP) ⚠️
**Problem:** Severe performance gaps (19-26x slower than system malloc)
**Investigation:** Task agent identified root cause
  - hak_is_memory_readable() syscall overhead (100-300 cycles per free)
  - ALL frees hit unmapped_header_fallback path
  - SuperSlab lookup NEVER called
  - Why? g_use_superslab = 0 (disabled by diet mode)

**Root cause:** core/hakmem_tiny_init.inc:104-105
  - Diet mode (default ON) disables SuperSlab
  - SuperSlab defaults to 1 (hakmem_config.c:334)
  - BUT diet mode overrides it to 0 during init

**Fix:** Separate SuperSlab from diet mode
  - SuperSlab: Performance-critical (fast alloc/free)
  - Diet mode: Memory efficiency (magazine capacity limits only)
  - Both are independent features, should not interfere
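
The separation can be sketched as two independent settings. Everything here is hypothetical except the idea that `use_superslab` mirrors `g_use_superslab`: the struct, the function names, and the capacity values 128 and 32 are placeholders chosen for illustration, not values from the hakmem sources.

```c
#include <assert.h>

/* Hypothetical configuration model; field and function names are illustrative. */
typedef struct {
    int use_superslab;  /* mirrors the role of g_use_superslab */
    int diet_mode;      /* memory-efficiency switch, on by default */
    int magazine_cap;   /* placeholder magazine capacity */
} SketchConfig;

static SketchConfig sketch_defaults(void) {
    return (SketchConfig){ .use_superslab = 1, .diet_mode = 1, .magazine_cap = 128 };
}

/* Before the change: diet mode also switched SuperSlab off during init. */
static void sketch_init_coupled(SketchConfig* c) {
    if (c->diet_mode) {
        c->magazine_cap = 32;
        c->use_superslab = 0;  /* the override this phase removes */
    }
}

/* After the change: diet mode only limits magazine capacity. */
static void sketch_init_independent(SketchConfig* c) {
    if (c->diet_mode) {
        c->magazine_cap = 32;
    }
}
```

With the independent variant, the SuperSlab default survives initialization even when diet mode is on, which is the behavior this phase is aiming for.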

**Status:** ⚠️ INCOMPLETE - New SEGV discovered after fix
  - SuperSlab lookup now works (confirmed via debug output)
  - But the benchmark crashes (exit code 139, i.e. SIGSEGV) after ~20 lookups
  - Needs further investigation

**Files modified:**
- core/hakmem_tiny_init.inc:99-109 - Removed diet mode override
- PERFORMANCE_INVESTIGATION_REPORT.md - Task agent analysis (303x instruction gap)

**Next steps:**
- Investigate new SEGV (likely SuperSlab free path bug)
- OR: Revert Phase 6-2.5 changes if blocking progress

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Moe Charm (CI)
2025-11-07 20:31:01 +09:00
parent 382980d450
commit c9053a43ac
11 changed files with 857 additions and 14 deletions


@@ -104,6 +104,78 @@ typedef struct SuperSlab {
} __attribute__((aligned(64))) SuperSlab;
static inline int ss_slabs_capacity(const SuperSlab* ss);
// Fail-fast verbosity, read once from HAKMEM_TINY_REFILL_FAILFAST (1 when unset or empty).
static inline int tiny_refill_failfast_level(void) {
    static int g_failfast_level = -1;  // -1 = not initialized yet
    if (__builtin_expect(g_failfast_level == -1, 0)) {
        const char* env = getenv("HAKMEM_TINY_REFILL_FAILFAST");
        if (env && *env) {
            g_failfast_level = atoi(env);
        } else {
            g_failfast_level = 1;
        }
    }
    return g_failfast_level;
}
// Trace a freelist update to stderr; emitted only when the fail-fast level is >= 2.
static inline void tiny_failfast_log(const char* stage,
                                     int class_idx,
                                     SuperSlab* ss,
                                     TinySlabMeta* meta,
                                     const void* node,
                                     const void* next) {
    if (__builtin_expect(tiny_refill_failfast_level() < 2, 1)) return;
    // [base, limit) is the address range covered by the SuperSlab.
    uintptr_t base = ss ? (uintptr_t)ss : 0;
    size_t size = ss ? ((size_t)1ULL << ss->lg_size) : 0;
    uintptr_t limit = base + size;
    fprintf(stderr,
            "[TRC_FREELIST_LOG] stage=%s cls=%d node=%p next=%p head=%p base=%p limit=%p\n",
            stage ? stage : "(null)",
            class_idx,
            node,
            next,
            meta ? meta->freelist : NULL,
            (void*)base,
            (void*)limit);
    fflush(stderr);
}
// Dump slab state for a suspicious pointer and abort.
// Note: returns to the caller without aborting when the fail-fast level is < 2.
static inline void tiny_failfast_abort_ptr(const char* stage,
                                           SuperSlab* ss,
                                           int slab_idx,
                                           const void* ptr,
                                           const char* reason) {
    if (__builtin_expect(tiny_refill_failfast_level() < 2, 1)) return;
    uintptr_t base = ss ? (uintptr_t)ss : 0;
    size_t size = ss ? ((size_t)1ULL << ss->lg_size) : 0;
    uintptr_t limit = base + size;
    size_t cap = 0;
    uint32_t used = 0;
    // Read slab metadata only when slab_idx is within range.
    if (ss && slab_idx >= 0 && slab_idx < ss_slabs_capacity(ss)) {
        cap = ss->slabs[slab_idx].capacity;
        used = ss->slabs[slab_idx].used;
    }
    // Offset of ptr from the SuperSlab base (0 if it lies below the base).
    size_t offset = 0;
    if (ptr && base && ptr >= (void*)base) {
        offset = (size_t)((uintptr_t)ptr - base);
    }
    fprintf(stderr,
            "[TRC_FAILFAST_PTR] stage=%s cls=%d slab_idx=%d ptr=%p reason=%s base=%p limit=%p cap=%zu used=%u offset=%zu\n",
            stage ? stage : "(null)",
            ss ? (int)ss->size_class : -1,
            slab_idx,
            ptr,
            reason ? reason : "(null)",
            (void*)base,
            (void*)limit,
            cap,
            used,
            offset);
    fflush(stderr);
    abort();
}
// Compile-time assertions
_Static_assert(sizeof(TinySlabMeta) == 16, "TinySlabMeta must be 16 bytes");
// Phase 8.3: Variable-size SuperSlab assertions (1MB=16 slabs, 2MB=32 slabs)
@@ -162,6 +234,12 @@ static inline void* slab_data_start(SuperSlab* ss, int slab_idx) {
    return (char*)ss + (slab_idx * SLAB_SIZE);
}
// Usable base of a slab: slab 0 skips its first 1024 bytes.
static inline uint8_t* tiny_slab_base_for(SuperSlab* ss, int slab_idx) {
    uint8_t* base = (uint8_t*)slab_data_start(ss, slab_idx);
    if (slab_idx == 0) base += 1024;
    return base;
}
// DEPRECATED (Phase 1): Uses unsafe ptr_to_superslab() internally (false positives!)
// Use: SuperSlab* ss = hak_super_lookup(p); if (ss && ss->magic == SUPERSLAB_MAGIC) { ... }
#if 0 // DISABLED - uses unsafe ptr_to_superslab(), causes crashes on L2.5 boundaries
@@ -506,7 +584,9 @@ static inline void _ss_remote_drain_to_freelist_unsafe(SuperSlab* ss, int slab_i
    if (chain_tail != NULL) {
        *(void**)chain_tail = meta->freelist;
    }
    void* prev = meta->freelist;
    meta->freelist = chain_head;
    tiny_failfast_log("remote_drain", ss->size_class, ss, meta, chain_head, prev);
    // Optional: set freelist bit when transitioning from empty
    do {
        static int g_mask_en = -1;