## Changes

### 1. core/page_arena.c
- Removed init failure message (lines 25-27); the error is handled by returning early
- All other fprintf statements already wrapped in existing #if !HAKMEM_BUILD_RELEASE blocks

### 2. core/hakmem.c
- Wrapped SIGSEGV handler init message (line 72)
- CRITICAL: Kept SIGSEGV/SIGBUS/SIGABRT error messages (lines 62-64); production needs crash logs

### 3. core/hakmem_shared_pool.c
- Wrapped all debug fprintf statements in #if !HAKMEM_BUILD_RELEASE:
  - Node pool exhaustion warning (line 252)
  - SP_META_CAPACITY_ERROR warning (line 421)
  - SP_FIX_GEOMETRY debug logging (line 745)
  - SP_ACQUIRE_STAGE0.5_EMPTY debug logging (line 865)
  - SP_ACQUIRE_STAGE0_L0 debug logging (line 803)
  - SP_ACQUIRE_STAGE1_LOCKFREE debug logging (line 922)
  - SP_ACQUIRE_STAGE2_LOCKFREE debug logging (line 996)
  - SP_ACQUIRE_STAGE3 debug logging (line 1116)
  - SP_SLOT_RELEASE debug logging (line 1245)
  - SP_SLOT_FREELIST_LOCKFREE debug logging (line 1305)
  - SP_SLOT_COMPLETELY_EMPTY debug logging (line 1316)
- Fixed lock_stats_init() for release builds (lines 60-65); ensure g_lock_stats_enabled is initialized

## Performance Validation

Before: 51M ops/s (with debug fprintf overhead)
After: 49.1M ops/s (consistent performance, fprintf removed from hot paths)

## Build & Test

```bash
./build.sh larson_hakmem
./out/release/larson_hakmem 1 5 1 1000 100 10000 42
# Result: 49.1M ops/s
```

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

# Task: Remove malloc Fallback (Root Cause Fix for 4T Crash)

**Date**: 2025-11-08
**Priority**: CRITICAL - BLOCKING
**Status**: Ready for Task Agent

---

## Executive Summary

**Problem**: The malloc fallback is the root cause of the 4T crash.

**Root Cause**:
```
SuperSlab OOM → __libc_malloc() fallback → Mixed HAKMEM/libc allocations
→ free() confusion → free(): invalid pointer crash
```

**The double-management problem**:
- libc malloc: maintains its own metadata (8-16B)
- HAKMEM: adds an AllocHeader on top of that
- Result: worse memory efficiency, unclear ownership, and a breeding ground for bugs

**Mission**: Remove the malloc fallback completely so that 100% of allocations are served by HAKMEM.

---

## Why malloc Fallback is Fundamentally Wrong

### 1. **It negates HAKMEM's reason for existing**
- Goal: be faster and more efficient than system malloc
- Reality: on OOM, the request is handed off wholesale to system malloc
- Contradiction: if HAKMEM is out of memory, system malloc should be out of memory too

### 2. **Double overhead**
```
libc malloc allocation:
[libc metadata (8-16B)] [user data]

HAKMEM adds its header:
[libc metadata] [HAKMEM header] [user data]

Total overhead: 16-32B per allocation!
```

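To put the 16-32B figure in proportion, here is a minimal sketch of the arithmetic. The constants are the upper bounds from the diagram above, not measured values, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Upper bounds taken from the diagram above (assumed, not measured). */
enum { LIBC_META_MAX = 16, HAK_HEADER_MAX = 16 };

/* Bytes of bookkeeping carried by one fallback allocation. */
static size_t fallback_overhead_bytes(size_t libc_meta, size_t hak_header) {
    return libc_meta + hak_header;
}

/* Overhead as a percentage of the user payload (integer, rounded down). */
static size_t overhead_percent(size_t user_size, size_t overhead) {
    assert(user_size > 0);
    return (overhead * 100) / user_size;
}
```

Under these assumptions a 64-byte request carries up to 32 bytes of metadata, i.e. 50% overhead, while a 1024-byte request carries about 3%. The cost falls hardest on the small sizes a tiny allocator is meant to serve.
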
### 3. **Mixed Allocation Bug**
```
Thread 1: SuperSlab alloc → ptr1 (HAKMEM)
Thread 2: SuperSlab OOM → libc malloc → ptr2 (libc + HAKMEM header)
Thread 3: free(ptr1) → HAKMEM free ✓
Thread 4: free(ptr2) → HAKMEM free tries to touch libc memory → 💥 CRASH
```

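The crash in the last step comes down to `free()` not being able to tell who owns `ptr2`. As a rough illustration only, an ownership test over a single contiguous range could look like the sketch below. `ArenaRange` and `arena_owns` are hypothetical names, not HAKMEM APIs, and the real allocator may track ownership differently (for example through a SuperSlab registry).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an address range mapped by the allocator itself. */
typedef struct {
    uintptr_t base;
    size_t    size;
} ArenaRange;

/* True only if ptr lies inside [base, base + size). */
static bool arena_owns(const ArenaRange* r, const void* ptr) {
    assert(r != NULL);
    uintptr_t p = (uintptr_t)ptr;
    return p >= r->base && (p - r->base) < r->size;
}
```

Once the fallback is gone, every pointer handed out by HAKMEM would pass a check of this kind, so the free path no longer has to guess between two allocators.
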
### 4. **Unstable performance**
- Normal operation: HAKMEM fast path
- Under load: slow libc malloc path
- Benchmark results swing widely depending on load

---

## Task 1: Identify All malloc Fallback Paths (CRITICAL)

### Search Commands

```bash
# Find all hak_alloc_malloc_impl() calls
grep -rn "hak_alloc_malloc_impl" core/

# Find all __libc_malloc() calls
grep -rn "__libc_malloc" core/

# Find fallback comments
grep -rn "fallback.*malloc\|malloc.*fallback" core/
```

### Expected Locations

**Already identified**:
1. `core/hakmem_internal.h:200-222` - `hak_alloc_malloc_impl()` implementation
2. `core/box/hak_alloc_api.inc.h:36-46` - Tiny failure fallback
3. `core/box/hak_alloc_api.inc.h:128` - General fallback

**Potentially more**:
- `core/hakmem.c` - Top-level malloc wrapper
- `core/hakmem_tiny.c` - Tiny allocator
- Other allocation paths

---

## Task 2: Remove malloc Fallback (Phase 1 - Immediate Fix)

### Goal: Make HAKMEM fail explicitly on OOM instead of falling back

### Change 1: Disable `hak_alloc_malloc_impl()` (core/hakmem_internal.h:200-222)

**Before (BROKEN)**:
```c
static inline void* hak_alloc_malloc_impl(size_t size) {
    if (!HAK_ENABLED_ALLOC(HAKMEM_FEATURE_MALLOC)) {
        return NULL;  // malloc disabled
    }

    extern void* __libc_malloc(size_t);
    void* raw = __libc_malloc(HEADER_SIZE + size);  // ← BAD!
    if (!raw) return NULL;

    AllocHeader* hdr = (AllocHeader*)raw;
    hdr->magic = HAKMEM_MAGIC;
    hdr->method = ALLOC_METHOD_MALLOC;
    // ...
    return (char*)raw + HEADER_SIZE;
}
```

**After (SAFE)**:
```c
static inline void* hak_alloc_malloc_impl(size_t size) {
    // Phase 7 CRITICAL FIX: malloc fallback removed (causes mixed allocation bug)
    // Return NULL explicitly to force OOM handling
    // (needs <stdio.h> for fprintf and <errno.h> for errno/ENOMEM)
    fprintf(stderr, "[HAKMEM] CRITICAL: malloc fallback disabled (size=%zu), returning NULL\n", size);
    errno = ENOMEM;
    return NULL;  // ✅ Explicit OOM
}
```

**Alternative (environment-variable gate)**:
```c
static inline void* hak_alloc_malloc_impl(size_t size) {
    // Allow malloc fallback ONLY if explicitly enabled (for debugging)
    static int allow_fallback = -1;
    if (allow_fallback < 0) {
        char* env = getenv("HAKMEM_ALLOW_MALLOC_FALLBACK");
        allow_fallback = (env && atoi(env) != 0) ? 1 : 0;
    }

    if (!allow_fallback) {
        fprintf(stderr, "[HAKMEM] malloc fallback disabled (size=%zu), returning NULL\n", size);
        errno = ENOMEM;
        return NULL;
    }

    // Fallback path (only if HAKMEM_ALLOW_MALLOC_FALLBACK=1)
    extern void* __libc_malloc(size_t);
    // ... rest of original code
}
```

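The gate above caches the environment lookup in a plain `static int`, which several threads may read and write at the same time on first use. One way to make that lazy read well-defined is a C11 atomic, as in the sketch below. The helper names `parse_fallback_flag` and `malloc_fallback_allowed` are hypothetical. Whether the extra care is needed depends on when the first call happens in HAKMEM, which this document does not say.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Pure helper: interpret the variable's value (NULL means unset). */
static int parse_fallback_flag(const char* env) {
    return (env && atoi(env) != 0) ? 1 : 0;
}

/* -1 = not read yet, 0 = disabled, 1 = enabled. */
static atomic_int g_allow_fallback = -1;

/* Racing threads may each call getenv(), but they all store the same value. */
static int malloc_fallback_allowed(void) {
    int v = atomic_load_explicit(&g_allow_fallback, memory_order_relaxed);
    if (v < 0) {
        v = parse_fallback_flag(getenv("HAKMEM_ALLOW_MALLOC_FALLBACK"));
        atomic_store_explicit(&g_allow_fallback, v, memory_order_relaxed);
    }
    return v;
}
```

Relaxed ordering is enough here because the flag guards no other data. The value is read once and stays fixed for the life of the process.
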
### Change 2: Remove Tiny failure fallback (core/box/hak_alloc_api.inc.h:36-46)

**Before (BROKEN)**:
```c
if (tiny_ptr) { hkm_ace_track_alloc(); return tiny_ptr; }

// Phase 7: If Tiny rejects size <= TINY_MAX_SIZE
#if HAKMEM_TINY_HEADER_CLASSIDX
if (size <= TINY_MAX_SIZE) {
    // Tiny rejected this size (likely 1024B), use malloc directly
    static int log_count = 0;
    if (log_count < 3) {
        fprintf(stderr, "[DEBUG] Phase 7: tiny_alloc(%zu) rejected, using malloc fallback\n", size);
        log_count++;
    }
    void* fallback_ptr = hak_alloc_malloc_impl(size);  // ← BAD!
    if (fallback_ptr) return fallback_ptr;
}
#endif
```

**After (SAFE)**:
```c
if (tiny_ptr) { hkm_ace_track_alloc(); return tiny_ptr; }

// Phase 7 CRITICAL FIX: No malloc fallback, let allocation flow to Mid/ACE layers
// If all layers fail, NULL will be returned (explicit OOM)
#if HAKMEM_TINY_HEADER_CLASSIDX
if (!tiny_ptr && size <= TINY_MAX_SIZE) {
    // Tiny failed for size <= TINY_MAX_SIZE
    // Log and continue to Mid/ACE layers (don't fallback to malloc!)
    static int log_count = 0;
    if (log_count < 3) {
        fprintf(stderr, "[DEBUG] Phase 7: tiny_alloc(%zu) failed, trying Mid/ACE layers\n", size);
        log_count++;
    }
    // Continue to Mid allocation below (no early return!)
}
#endif
```

### Change 3: Remove general fallback (core/box/hak_alloc_api.inc.h:124-132)

**Before (BROKEN)**:
```c
void* ptr;
if (size >= threshold) {
    ptr = hak_alloc_mmap_impl(size);
} else {
    ptr = hak_alloc_malloc_impl(size);  // ← BAD!
}
if (!ptr) return NULL;
```

**After (SAFE)**:
```c
void* ptr;
if (size >= threshold) {
    ptr = hak_alloc_mmap_impl(size);
} else {
    // Phase 7 CRITICAL FIX: No malloc fallback
    // If we reach here, all allocation layers (Tiny/Mid/ACE) have failed
    // Return NULL explicitly (OOM)
    fprintf(stderr, "[HAKMEM] OOM: All layers failed for size=%zu, returning NULL\n", size);
    errno = ENOMEM;
    return NULL;  // ✅ Explicit OOM
}
if (!ptr) return NULL;
```

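Taken together, Changes 2 and 3 mean a request falls through the HAKMEM layers in order and ends in an explicit `NULL` with `errno = ENOMEM`, never in libc. The sketch below shows that control flow in isolation. The `alloc_layer_fn` type, `alloc_through_layers`, and the two stub layers are hypothetical illustrations, not the actual Tiny/Mid/ACE entry points.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical layer signature: a block, or NULL if the layer cannot serve it. */
typedef void* (*alloc_layer_fn)(size_t size);

/* Try each layer in order; never hand the request to libc. */
static void* alloc_through_layers(const alloc_layer_fn* layers, size_t n, size_t size) {
    for (size_t i = 0; i < n; i++) {
        void* p = layers[i](size);
        if (p) return p;
    }
    errno = ENOMEM;  /* explicit OOM: the caller sees NULL, as with a failed malloc */
    return NULL;
}

/* Stub layers for illustration only. */
static char g_mid_block[256];
static void* tiny_stub(size_t size) { (void)size; return NULL; }
static void* mid_stub(size_t size) {
    return size <= sizeof g_mid_block ? (void*)g_mid_block : NULL;
}
```

Callers then handle failure exactly as they would for standard `malloc`: check for `NULL` and, if needed, inspect `errno`.
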
---

## Task 3: Implement SuperSlab Dynamic Scaling (Phase 2 - Proper Fix)

### Goal: Never run out of SuperSlabs

### Change 1: Detect SuperSlab exhaustion (core/tiny_superslab_alloc.inc.h or similar)

**Location**: Find where `bitmap == 0x00000000` check would go

```c
// In superslab_refill() or equivalent
if (bitmap == 0x00000000) {
    // All 32 slabs exhausted for this class
    fprintf(stderr, "[HAKMEM] SuperSlab class %d exhausted (bitmap=0x00000000), allocating new SuperSlab\n", class_idx);

    // Allocate new SuperSlab via mmap
    SuperSlab* new_ss = mmap_new_superslab(class_idx);
    if (!new_ss) {
        fprintf(stderr, "[HAKMEM] CRITICAL: Failed to allocate new SuperSlab for class %d\n", class_idx);
        return NULL;  // True OOM (system out of memory)
    }

    // Register new SuperSlab in registry
    if (!register_superslab(new_ss, class_idx)) {
        fprintf(stderr, "[HAKMEM] CRITICAL: Failed to register new SuperSlab for class %d\n", class_idx);
        munmap(new_ss, SUPERSLAB_SIZE);
        return NULL;
    }

    // Retry refill from new SuperSlab
    return refill_from_superslab(new_ss, class_idx, count);
}
```

### Change 2: Increase initial capacity for hot classes

**File**: SuperSlab initialization code

```c
// In hak_tiny_init() or similar
void initialize_superslabs(void) {
    for (int class_idx = 0; class_idx < TINY_NUM_CLASSES; class_idx++) {
        int initial_slabs;

        // Hot classes in multi-threaded workloads: 1, 4, 6
        if (class_idx == 1 || class_idx == 4 || class_idx == 6) {
            initial_slabs = 64;  // Double capacity for hot classes
        } else {
            initial_slabs = 32;  // Default
        }

        allocate_superslabs_for_class(class_idx, initial_slabs);
    }
}
```

### Change 3: Implement `mmap_new_superslab()` helper

```c
// Allocate a new SuperSlab via mmap
static SuperSlab* mmap_new_superslab(int class_idx) {
    size_t ss_size = SUPERSLAB_SIZE;  // e.g., 2MB

    void* raw = mmap(NULL, ss_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) {
        return NULL;
    }

    // Initialize SuperSlab structure
    SuperSlab* ss = (SuperSlab*)raw;
    ss->class_idx = class_idx;
    ss->total_active_blocks = 0;
    ss->bitmap = 0xFFFFFFFF;  // All slabs available

    // Initialize slabs
    size_t block_size = class_to_size(class_idx);
    initialize_slabs(ss, block_size);

    return ss;
}
```

---

## Task 4: Testing Requirements (CRITICAL)

### Test 1: Build and verify no malloc fallback

```bash
# Rebuild with Phase 7 flags
make clean
make HEADER_CLASSIDX=1 AGGRESSIVE_INLINE=1 PREWARM_TLS=1 larson_hakmem

# Verify malloc fallback is disabled
strings libhakmem.so | grep "malloc fallback disabled"
# Should see: "[HAKMEM] malloc fallback disabled"
```

### Test 2: 4T stability (CRITICAL - must achieve 100%)

```bash
# Run 20 times, count successes
success=0
for i in {1..20}; do
    echo "Run $i:"
    env HAKMEM_TINY_USE_SUPERSLAB=1 HAKMEM_TINY_MEM_DIET=0 \
        ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | tee "run_$i.log"

    if grep -q "Throughput" "run_$i.log"; then
        # Unlike ((success++)), this never returns a failing status when success is 0
        success=$((success + 1))
        echo "✓ Success ($success/20)"
    else
        echo "✗ Failed"
    fi
done

echo "Final: $success/20 success rate"
# TARGET: 20/20 (100%)
```

### Test 3: Performance regression check

```bash
# Single-thread (should be ~2.68M ops/s)
./larson_hakmem 1 1 128 1024 1 12345 1

# Random mixed (should be 59-70M ops/s)
./bench_random_mixed_hakmem 100000 128 1234567
./bench_random_mixed_hakmem 100000 256 1234567
./bench_random_mixed_hakmem 100000 1024 1234567

# Should maintain Phase 7 performance (no regression)
```

---

## Success Criteria

✅ **malloc fallback completely removed**
- `hak_alloc_malloc_impl()` returns NULL
- Zero `__libc_malloc()` calls

✅ **100% 4T stability**
- 20/20 runs succeed
- Zero `free(): invalid pointer` crashes

✅ **Performance maintained**
- Single-thread: 2.68M ops/s (unchanged)
- Random mixed: 59-70M ops/s (unchanged)

✅ **SuperSlab dynamic scaling works** (Phase 2)
- A new SuperSlab is allocated when `bitmap == 0x00000000`
- Initial capacity is increased for hot classes
- OOM does not occur

---

## Expected Deliverable

**Report file**: `/mnt/workdisk/public_share/hakmem/MALLOC_FALLBACK_REMOVAL_REPORT.md`

**Required sections**:
1. **Removed malloc fallback paths** (list of all changes)
2. **Code diffs** (before/after)
3. **Why this fixes the bug** (explanation)
4. **Test results** (20/20 stability, performance)
5. **SuperSlab dynamic scaling** (implementation details, if done)
6. **Production readiness** (YES/NO verdict)

---

## Context Documents

- `TASK_FOR_OTHER_AI.md` - Original task document (superseded by this one)
- `PHASE7_4T_STABILITY_VERIFICATION.md` - 30% success rate baseline
- `PHASE7_TASK3_RESULTS.md` - Phase 7 performance results
- `CLAUDE.md` - Project history

---

## Debug Commands

```bash
# Trace malloc fallback
HAKMEM_LOG=1 ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | grep "malloc fallback"

# Trace SuperSlab exhaustion
HAKMEM_LOG=1 ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | grep "bitmap=0x00000000"

# Check for libc malloc calls
ltrace -e malloc ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | grep -v HAKMEM
```

---

**Good luck! Let's make HAKMEM 100% self-sufficient! 🚀**