hakmem/docs/archive/REMOVE_MALLOC_FALLBACK_TASK.md

# Task: Remove malloc Fallback (Root Cause Fix for 4T Crash)
**Date**: 2025-11-08
**Priority**: CRITICAL - BLOCKING
**Status**: Ready for Task Agent
---
## Executive Summary
**Problem**: The malloc fallback is the root cause of the 4T crash
**Root Cause**:
```
SuperSlab OOM → __libc_malloc() fallback → Mixed HAKMEM/libc allocations
→ free() confusion → free(): invalid pointer crash
```
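**The double-management problem**:
- libc malloc: manages its own metadata (8-16B)
- HAKMEM: adds an AllocHeader on top of that
- Result: worse memory efficiency, unclear ownership, and a breeding ground for bugs

**Mission**: Remove the malloc fallback completely so that HAKMEM serves 100% of allocations

---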
## Why malloc Fallback is Fundamentally Wrong
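### 1. **It negates HAKMEM's reason to exist**
- Goal: be faster and more efficient than the system malloc
- Reality: on OOM, everything is handed off to the system malloc
- Contradiction: if HAKMEM is out of memory, the system malloc should be out of memory too
### 2. **Double overhead**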
```
libc malloc allocation:
[libc metadata (8-16B)] [user data]
HAKMEM adds its header:
[libc metadata] [HAKMEM header] [user data]
Total overhead: 16-32B per allocation!
```
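To make the arithmetic concrete, here is a minimal sketch that adds the two layers of bookkeeping and expresses them relative to the payload. The helper names are hypothetical, and the 16B figures used in the usage note are just the upper end of the ranges quoted above, not measured values.

```c
#include <stddef.h>

/* Hypothetical helper: total bookkeeping bytes in front of the user data
 * when a libc-backed block also carries an allocator header. */
static size_t stacked_overhead(size_t libc_meta, size_t hak_header) {
    return libc_meta + hak_header;
}

/* Overhead as an integer percentage of the user payload (rounded down).
 * Returns 0 for an empty payload to avoid dividing by zero. */
static size_t overhead_percent(size_t payload, size_t overhead) {
    return payload ? (overhead * 100) / payload : 0;
}
```

For example, `overhead_percent(64, stacked_overhead(16, 16))` evaluates to 50, so under these assumed sizes a 64B request spends half as much again on metadata.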
### 3. **Mixed Allocation Bug**
```
Thread 1: SuperSlab alloc → ptr1 (HAKMEM)
Thread 2: SuperSlab OOM → libc malloc → ptr2 (libc + HAKMEM header)
Thread 3: free(ptr1) → HAKMEM free ✓
Thread 4: free(ptr2) → HAKMEM free tries to touch libc memory → 💥 CRASH
```
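The crash in this sequence comes down to `free()` not knowing which allocator owns a pointer. One way to make ownership explicit is an address-range check, sketched here under assumptions: the `Region` type, the fixed-size table, and the function names are all hypothetical and are not taken from the HAKMEM sources. The sketch is also single-threaded; a real registry would need synchronization.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registry of address ranges handed out by the allocator. */
typedef struct {
    uintptr_t base;
    size_t    len;
} Region;

#define MAX_REGIONS 8
static Region g_regions[MAX_REGIONS];
static size_t g_region_count;

/* Record a range as allocator-owned; fails when the table is full. */
static bool region_register(void *base, size_t len) {
    if (g_region_count == MAX_REGIONS) return false;
    g_regions[g_region_count++] = (Region){ (uintptr_t)base, len };
    return true;
}

/* True only if p lies inside a registered range [base, base + len). */
static bool region_owns(const void *p) {
    uintptr_t a = (uintptr_t)p;
    for (size_t i = 0; i < g_region_count; i++) {
        if (a >= g_regions[i].base && a - g_regions[i].base < g_regions[i].len)
            return true;
    }
    return false;
}
```

With a check like this, a free path can refuse to touch memory it does not own. Once the fallback is removed, every pointer should pass the check, which is the simpler invariant this task aims for.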
### 4. **Unstable performance**
- Normal operation: HAKMEM fast path
- Under load: slow libc malloc path
- Benchmark results swing widely depending on load
---
## Task 1: Identify All malloc Fallback Paths (CRITICAL)
### Search Commands
```bash
# Find all hak_alloc_malloc_impl() calls
grep -rn "hak_alloc_malloc_impl" core/
# Find all __libc_malloc() calls
grep -rn "__libc_malloc" core/
# Find fallback comments
grep -rn "fallback.*malloc\|malloc.*fallback" core/
```
### Expected Locations

**Already identified**:
1. `core/hakmem_internal.h:200-222` - `hak_alloc_malloc_impl()` implementation
2. `core/box/hak_alloc_api.inc.h:36-46` - Tiny failure fallback
3. `core/box/hak_alloc_api.inc.h:128` - General fallback

**Potentially more**:
- `core/hakmem.c` - Top-level malloc wrapper
- `core/hakmem_tiny.c` - Tiny allocator
- Other allocation paths
---
## Task 2: Remove malloc Fallback (Phase 1 - Immediate Fix)
### Goal: Make HAKMEM fail explicitly on OOM instead of falling back
### Change 1: Disable `hak_alloc_malloc_impl()` (core/hakmem_internal.h:200-222)
**Before (BROKEN)**:
```c
static inline void* hak_alloc_malloc_impl(size_t size) {
    if (!HAK_ENABLED_ALLOC(HAKMEM_FEATURE_MALLOC)) {
        return NULL; // malloc disabled
    }
    extern void* __libc_malloc(size_t);
    void* raw = __libc_malloc(HEADER_SIZE + size); // ← BAD!
    if (!raw) return NULL;
    AllocHeader* hdr = (AllocHeader*)raw;
    hdr->magic = HAKMEM_MAGIC;
    hdr->method = ALLOC_METHOD_MALLOC;
    // ...
    return (char*)raw + HEADER_SIZE;
}
```
**After (SAFE)**:
```c
static inline void* hak_alloc_malloc_impl(size_t size) {
    // Phase 7 CRITICAL FIX: malloc fallback removed (causes mixed allocation bug)
    // Return NULL explicitly to force OOM handling
    fprintf(stderr, "[HAKMEM] CRITICAL: malloc fallback disabled (size=%zu), returning NULL\n", size);
    errno = ENOMEM;
    return NULL; // ✅ Explicit OOM
}
```
**Alternative (environment-variable gate)**:
```c
static inline void* hak_alloc_malloc_impl(size_t size) {
    // Allow malloc fallback ONLY if explicitly enabled (for debugging)
    static int allow_fallback = -1;
    if (allow_fallback < 0) {
        char* env = getenv("HAKMEM_ALLOW_MALLOC_FALLBACK");
        allow_fallback = (env && atoi(env) != 0) ? 1 : 0;
    }
    if (!allow_fallback) {
        fprintf(stderr, "[HAKMEM] malloc fallback disabled (size=%zu), returning NULL\n", size);
        errno = ENOMEM;
        return NULL;
    }
    // Fallback path (only if HAKMEM_ALLOW_MALLOC_FALLBACK=1)
    extern void* __libc_malloc(size_t);
    // ... rest of original code
}
```
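The gate above caches its result in a plain `static int`, which several threads may read and write at once during the first calls. A minimal sketch of the same idea with C11 atomics follows; the names `hak_parse_gate` and `hak_fallback_allowed` are hypothetical, and only the environment variable name comes from the snippet above.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Pure parsing step: unset, empty, or "0" all mean "disabled". */
static int hak_parse_gate(const char *env) {
    return (env && atoi(env) != 0) ? 1 : 0;
}

/* Cached gate: -1 = not read yet, 0 = disabled, 1 = enabled. */
static atomic_int g_allow_fallback = -1;

static int hak_fallback_allowed(void) {
    int v = atomic_load_explicit(&g_allow_fallback, memory_order_relaxed);
    if (v < 0) {
        v = hak_parse_gate(getenv("HAKMEM_ALLOW_MALLOC_FALLBACK"));
        /* Racing threads normally compute the same value, so a relaxed
         * store is enough; no other data is published through this flag. */
        atomic_store_explicit(&g_allow_fallback, v, memory_order_relaxed);
    }
    return v;
}
```

Splitting the parsing from the caching keeps the parsing rule testable without touching the process environment.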
### Change 2: Remove Tiny failure fallback (core/box/hak_alloc_api.inc.h:36-46)
**Before (BROKEN)**:
```c
if (tiny_ptr) { hkm_ace_track_alloc(); return tiny_ptr; }
// Phase 7: If Tiny rejects size <= TINY_MAX_SIZE
#if HAKMEM_TINY_HEADER_CLASSIDX
if (size <= TINY_MAX_SIZE) {
    // Tiny rejected this size (likely 1024B), use malloc directly
    static int log_count = 0;
    if (log_count < 3) {
        fprintf(stderr, "[DEBUG] Phase 7: tiny_alloc(%zu) rejected, using malloc fallback\n", size);
        log_count++;
    }
    void* fallback_ptr = hak_alloc_malloc_impl(size); // ← BAD!
    if (fallback_ptr) return fallback_ptr;
}
#endif
```
**After (SAFE)**:
```c
if (tiny_ptr) { hkm_ace_track_alloc(); return tiny_ptr; }
// Phase 7 CRITICAL FIX: No malloc fallback, let allocation flow to Mid/ACE layers
// If all layers fail, NULL will be returned (explicit OOM)
#if HAKMEM_TINY_HEADER_CLASSIDX
if (!tiny_ptr && size <= TINY_MAX_SIZE) {
    // Tiny failed for size <= TINY_MAX_SIZE
    // Log and continue to Mid/ACE layers (don't fallback to malloc!)
    static int log_count = 0;
    if (log_count < 3) {
        fprintf(stderr, "[DEBUG] Phase 7: tiny_alloc(%zu) failed, trying Mid/ACE layers\n", size);
        log_count++;
    }
    // Continue to Mid allocation below (no early return!)
}
#endif
```
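The fall-through behavior described in this change can be pictured as an ordered list of layers that is walked until one succeeds. The following is a sketch under assumptions, not the actual HAKMEM dispatch code: `alloc_layer_fn`, `alloc_through_layers`, and the two stub layers are hypothetical stand-ins for the Tiny and Mid allocators.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical layer signature: returns NULL when the layer cannot serve size. */
typedef void *(*alloc_layer_fn)(size_t size);

/* Try each layer in order. If every layer fails, report OOM explicitly
 * instead of handing the request to libc. */
static void *alloc_through_layers(const alloc_layer_fn *layers, size_t n, size_t size) {
    for (size_t i = 0; i < n; i++) {
        void *p = layers[i](size);
        if (p) return p;
    }
    errno = ENOMEM;
    return NULL;
}

/* Stub layers for demonstration only. */
static char g_mid_pool[2048];

static void *stub_tiny(size_t size) {
    (void)size;
    return NULL;                       /* always rejects, like the 1024B case */
}

static void *stub_mid(size_t size) {
    return size <= sizeof g_mid_pool ? g_mid_pool : NULL;
}
```

With `layers = { stub_tiny, stub_mid }`, a 1024B request is rejected by the first stub and served by the second, while a 4096B request falls off the end and returns NULL with `errno` set to `ENOMEM`.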
### Change 3: Remove general fallback (core/box/hak_alloc_api.inc.h:124-132)
**Before (BROKEN)**:
```c
void* ptr;
if (size >= threshold) {
    ptr = hak_alloc_mmap_impl(size);
} else {
    ptr = hak_alloc_malloc_impl(size); // ← BAD!
}
if (!ptr) return NULL;
```
**After (SAFE)**:
```c
void* ptr;
if (size >= threshold) {
    ptr = hak_alloc_mmap_impl(size);
} else {
    // Phase 7 CRITICAL FIX: No malloc fallback
    // If we reach here, all allocation layers (Tiny/Mid/ACE) have failed
    // Return NULL explicitly (OOM)
    fprintf(stderr, "[HAKMEM] OOM: All layers failed for size=%zu, returning NULL\n", size);
    errno = ENOMEM;
    return NULL; // ✅ Explicit OOM
}
if (!ptr) return NULL;
```
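The OOM branch logs through `fprintf` from inside the allocator. Whether stdio allocates on that path is implementation-dependent, so a cautious variant formats into a stack buffer and calls `write(2)` directly. This is a sketch under that assumption; `hak_format_oom` and `hak_log_oom` are hypothetical names and the message text is shortened for illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Format the OOM message into a caller-supplied buffer.
 * Returns the number of bytes stored, excluding the terminating NUL;
 * output that does not fit is truncated to cap - 1 bytes. */
static size_t hak_format_oom(char *buf, size_t cap, size_t size) {
    if (cap == 0) return 0;
    int n = snprintf(buf, cap, "[HAKMEM] OOM: size=%zu\n", size);
    if (n < 0) return 0;
    return (size_t)n < cap ? (size_t)n : cap - 1;
}

/* Emit the message with write(2), bypassing the stdio stream layer. */
static void hak_log_oom(size_t size) {
    char buf[64];
    size_t len = hak_format_oom(buf, sizeof buf, size);
    ssize_t rc = write(STDERR_FILENO, buf, len);
    (void)rc; /* nothing sensible to do if logging itself fails */
}
```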
---
## Task 3: Implement SuperSlab Dynamic Scaling (Phase 2 - Proper Fix)
### Goal: Never run out of SuperSlabs
### Change 1: Detect SuperSlab exhaustion (core/tiny_superslab_alloc.inc.h or similar)
**Location**: Find where the `bitmap == 0x00000000` check would go
```c
// In superslab_refill() or equivalent
if (bitmap == 0x00000000) {
    // All 32 slabs exhausted for this class
    fprintf(stderr, "[HAKMEM] SuperSlab class %d exhausted (bitmap=0x00000000), allocating new SuperSlab\n", class_idx);
    // Allocate new SuperSlab via mmap
    SuperSlab* new_ss = mmap_new_superslab(class_idx);
    if (!new_ss) {
        fprintf(stderr, "[HAKMEM] CRITICAL: Failed to allocate new SuperSlab for class %d\n", class_idx);
        return NULL; // True OOM (system out of memory)
    }
    // Register new SuperSlab in registry
    if (!register_superslab(new_ss, class_idx)) {
        fprintf(stderr, "[HAKMEM] CRITICAL: Failed to register new SuperSlab for class %d\n", class_idx);
        munmap(new_ss, SUPERSLAB_SIZE);
        return NULL;
    }
    // Retry refill from new SuperSlab
    return refill_from_superslab(new_ss, class_idx, count);
}
```
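The exhaustion check relies on a 32-bit bitmap in which a set bit means the slab is still available, as the `0xFFFFFFFF` initializer later in this task suggests. A minimal sketch of claiming and releasing slabs under that convention follows. The function names are hypothetical, `__builtin_ctz` is a GCC/Clang builtin, and the code is not atomic, so a real implementation would need a lock or compare-and-swap.

```c
#include <stdint.h>

#define SLAB_NONE (-1)

/* Claim the lowest-numbered free slab and return its index,
 * or SLAB_NONE when the bitmap is 0 (all 32 slabs in use). */
static int bitmap_claim_slab(uint32_t *bitmap) {
    if (*bitmap == 0) return SLAB_NONE;   /* exhausted: caller must grow */
    int idx = __builtin_ctz(*bitmap);     /* index of the lowest set bit */
    *bitmap &= *bitmap - 1;               /* clear that bit */
    return idx;
}

/* Mark slab idx (0..31) as free again. */
static void bitmap_release_slab(uint32_t *bitmap, int idx) {
    *bitmap |= (uint32_t)1 << idx;
}
```

Starting from `0xFFFFFFFF`, thirty-two successive claims return indices 0 through 31 and leave the bitmap at 0, which is exactly the state the check above treats as the signal to map a new SuperSlab.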
### Change 2: Increase initial capacity for hot classes
**File**: SuperSlab initialization code
```c
// In hak_tiny_init() or similar
void initialize_superslabs(void) {
    for (int class_idx = 0; class_idx < TINY_NUM_CLASSES; class_idx++) {
        int initial_slabs;
        // Hot classes in multi-threaded workloads: 1, 4, 6
        if (class_idx == 1 || class_idx == 4 || class_idx == 6) {
            initial_slabs = 64; // Double capacity for hot classes
        } else {
            initial_slabs = 32; // Default
        }
        allocate_superslabs_for_class(class_idx, initial_slabs);
    }
}
```
### Change 3: Implement `mmap_new_superslab()` helper
```c
// Allocate a new SuperSlab via mmap
static SuperSlab* mmap_new_superslab(int class_idx) {
    size_t ss_size = SUPERSLAB_SIZE; // e.g., 2MB
    void* raw = mmap(NULL, ss_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) {
        return NULL;
    }
    // Initialize SuperSlab structure
    SuperSlab* ss = (SuperSlab*)raw;
    ss->class_idx = class_idx;
    ss->total_active_blocks = 0;
    ss->bitmap = 0xFFFFFFFF; // All slabs available
    // Initialize slabs
    size_t block_size = class_to_size(class_idx);
    initialize_slabs(ss, block_size);
    return ss;
}
```
---
## Task 4: Testing Requirements (CRITICAL)
### Test 1: Build and verify no malloc fallback
```bash
# Rebuild with Phase 7 flags
make clean
make HEADER_CLASSIDX=1 AGGRESSIVE_INLINE=1 PREWARM_TLS=1 larson_hakmem
# Verify malloc fallback is disabled
strings libhakmem.so | grep "malloc fallback disabled"
# Should see: "[HAKMEM] malloc fallback disabled"
```
### Test 2: 4T stability (CRITICAL - must achieve 100%)
```bash
# Run 20 times, count successes
success=0
for i in {1..20}; do
    echo "Run $i:"
    env HAKMEM_TINY_USE_SUPERSLAB=1 HAKMEM_TINY_MEM_DIET=0 \
        ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | tee "run_$i.log"
    if grep -q "Throughput" "run_$i.log"; then
        # Arithmetic expansion instead of ((success++)), which returns a
        # non-zero status when the old value is 0 and would trip `set -e`
        success=$((success + 1))
        echo "✓ Success ($success/20)"
    else
        echo "✗ Failed"
    fi
done
echo "Final: $success/20 success rate"
# TARGET: 20/20 (100%)
```
### Test 3: Performance regression check
```bash
# Single-thread (should be ~2.68M ops/s)
./larson_hakmem 1 1 128 1024 1 12345 1
# Random mixed (should be 59-70M ops/s)
./bench_random_mixed_hakmem 100000 128 1234567
./bench_random_mixed_hakmem 100000 256 1234567
./bench_random_mixed_hakmem 100000 1024 1234567
# Should maintain Phase 7 performance (no regression)
```
---
## Success Criteria
**malloc fallback fully removed**
- `hak_alloc_malloc_impl()` returns NULL
- Zero `__libc_malloc()` calls

**4T stability at 100%**
- 20/20 runs succeed
- Zero `free(): invalid pointer` crashes

**Performance maintained**
- Single-thread: 2.68M ops/s (unchanged)
- Random mixed: 59-70M ops/s (unchanged)

**SuperSlab dynamic scaling works** (Phase 2)
- A new SuperSlab is allocated when `bitmap == 0x00000000`
- Initial capacity is increased for hot classes
- OOM does not occur
---
## Expected Deliverable
**Report file**: `/mnt/workdisk/public_share/hakmem/MALLOC_FALLBACK_REMOVAL_REPORT.md`
**Required sections**:
1. **Removed malloc fallback paths** (list of all changes)
2. **Code diffs** (before/after)
3. **Why this fixes the bug** (explanation)
4. **Test results** (20/20 stability, performance)
5. **SuperSlab dynamic scaling** (implementation details, if done)
6. **Production readiness** (YES/NO verdict)
---
## Context Documents
- `TASK_FOR_OTHER_AI.md` - Original task document (superseded by this one)
- `PHASE7_4T_STABILITY_VERIFICATION.md` - 30% success rate baseline
- `PHASE7_TASK3_RESULTS.md` - Phase 7 performance results
- `CLAUDE.md` - Project history
---
## Debug Commands
```bash
# Trace malloc fallback
HAKMEM_LOG=1 ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | grep "malloc fallback"
# Trace SuperSlab exhaustion
HAKMEM_LOG=1 ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | grep "bitmap=0x00000000"
# Check for libc malloc calls
ltrace -e malloc ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | grep -v HAKMEM
```
---
**Good luck! Let's make HAKMEM 100% self-sufficient! 🚀**