# Phase 2a: SuperSlab Dynamic Expansion Implementation
**Date**: 2025-11-08
**Priority**: 🔴 CRITICAL - BLOCKING 100% stability
**Estimated Effort**: 7-10 days
**Status**: Ready for implementation
---
## Executive Summary
**Problem**: SuperSlab uses fixed 32-slab array → OOM under 4T high-contention
**Solution**: Implement mimalloc-style chunk linking → unlimited slab expansion
**Expected Result**: 50% → 100% stability (20/20 success rate)
---
## Current Architecture (BROKEN)
### File: `core/superslab/superslab_types.h:82`
```c
typedef struct SuperSlab {
    Slab slabs[SLABS_PER_SUPERSLAB_MAX];  // ← FIXED 32 slabs! Cannot grow!
    uint32_t bitmap;                      // ← 32 bits = 32 slabs max
    size_t total_active_blocks;
    int class_idx;
    // ...
} SuperSlab;
```
### Why This Fails
**4T high-contention scenario**:
```
Thread 1: allocates from slabs[0-7] → bitmap bits 0-7 = 0
Thread 2: allocates from slabs[8-15] → bitmap bits 8-15 = 0
Thread 3: allocates from slabs[16-23] → bitmap bits 16-23 = 0
Thread 4: allocates from slabs[24-31] → bitmap bits 24-31 = 0
→ bitmap = 0x00000000 (all slabs busy)
→ superslab_refill() returns NULL
→ OOM → malloc fallback (now disabled) → CRASH
```
**Evidence from logs**:
```
[DEBUG] superslab_refill returned NULL (OOM) detail:
class=4 prev_ss=(nil) active=0 bitmap=0x00000000
prev_meta=(nil) used=0 cap=0 slab_idx=0
reused_freelist=0 free_idx=-2 errno=12
```
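The exhaustion can be reproduced in isolation. The sketch below is a minimal model, not hakmem code: `FixedSuperSlab`, `claim_slab`, and `simulate_claims` are hypothetical names, and a set bit is assumed to mean "slab free", matching the bitmap convention in the scenario above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fixed-size SuperSlab: one bit per slab,
 * 1 = free, 0 = busy. */
typedef struct {
    uint32_t bitmap;
} FixedSuperSlab;

/* Claim the lowest free slab. Returns its index, or -1 when all 32 are busy. */
static int claim_slab(FixedSuperSlab *ss) {
    assert(ss != 0);
    if (ss->bitmap == 0) {
        return -1;                          /* nothing left: caller sees OOM */
    }
    int idx = __builtin_ctz(ss->bitmap);    /* index of the lowest set bit */
    ss->bitmap &= ~(UINT32_C(1) << idx);    /* mark that slab busy */
    return idx;
}

/* Model `threads` threads each claiming `per_thread` slabs, one after another.
 * Returns how many claims succeeded. */
static int simulate_claims(FixedSuperSlab *ss, int threads, int per_thread) {
    int ok = 0;
    for (int t = 0; t < threads; t++) {
        for (int i = 0; i < per_thread; i++) {
            if (claim_slab(ss) >= 0) {
                ok++;
            }
        }
    }
    return ok;
}
```

With four threads taking eight slabs each, all 32 bits are cleared and the next `claim_slab()` call fails, which is the state the log excerpt reports as `bitmap=0x00000000`.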
---
## Proposed Architecture (mimalloc-style)
### Design Pattern: Linked Chunks
**Inspiration**: mimalloc uses linked segments, jemalloc uses linked chunks
```c
typedef struct SuperSlabChunk {
    Slab slabs[32];                  // Initial 32 slabs per chunk
    struct SuperSlabChunk* next;     // ← Link to next chunk
    uint32_t bitmap;                 // 32 bits for this chunk's slabs
    size_t total_active_blocks;      // Active blocks in this chunk
    int class_idx;
} SuperSlabChunk;

typedef struct SuperSlabHead {
    SuperSlabChunk* first_chunk;     // Head of chunk list
    SuperSlabChunk* current_chunk;   // Current chunk for allocation
    size_t total_chunks;             // Total chunks allocated
    int class_idx;
    pthread_mutex_t expansion_lock;  // Protect chunk list (same name as in Task 1)
} SuperSlabHead;
```
### Allocation Flow
```
1. superslab_refill() called
2. Try current_chunk
3. bitmap == 0x00000000? (all slabs busy)
↓ YES
4. Try current_chunk->next
↓ NULL (no next chunk)
5. Allocate new chunk via mmap
6. current_chunk->next = new_chunk
7. current_chunk = new_chunk
8. Refill from new_chunk
↓ SUCCESS
9. Return blocks to caller
```
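The flow above can be sketched as a single function. This is an illustrative model under stated assumptions: `Chunk`, `Head`, `new_chunk`, and `chunk_for_refill` are hypothetical names, `calloc` stands in for `mmap`, and locking is omitted so the control flow stays visible.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Chunk {
    struct Chunk *next;
    uint32_t bitmap;            /* 1 = free slab, 0 = busy */
} Chunk;

typedef struct {
    Chunk *first_chunk;
    Chunk *current_chunk;
    size_t total_chunks;
} Head;

/* Step 5: allocate a fresh chunk (calloc stands in for mmap here). */
static Chunk *new_chunk(void) {
    Chunk *c = calloc(1, sizeof *c);
    if (c != NULL) {
        c->bitmap = 0xFFFFFFFFu;    /* all 32 slabs free */
    }
    return c;
}

/* Steps 2-8: return a chunk with at least one free slab, growing the
 * list when every chunk from the current one onward is busy.
 * Returns NULL only if the fresh allocation fails. */
static Chunk *chunk_for_refill(Head *h) {
    assert(h != NULL);
    Chunk *tail = NULL;
    Chunk *c = h->current_chunk ? h->current_chunk : h->first_chunk;

    /* Steps 2-4: try the current chunk, then each chunk after it. */
    for (; c != NULL; c = c->next) {
        if (c->bitmap != 0) {
            h->current_chunk = c;
            return c;
        }
        tail = c;
    }

    /* Steps 5-7: nothing free, so link a new chunk at the tail. */
    Chunk *fresh = new_chunk();
    if (fresh == NULL) {
        return NULL;
    }
    if (tail != NULL) {
        tail->next = fresh;
    } else {
        h->first_chunk = fresh;
    }
    h->current_chunk = fresh;
    h->total_chunks++;
    return fresh;
}
```

Calling `chunk_for_refill()` on an empty `Head` creates the first chunk; calling it again while that chunk still has free slabs returns the same chunk without growing the list.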
### Visual Representation
```
Before (BROKEN):
┌─────────────────────────────────┐
│ SuperSlab (2MB)                 │
│ slabs[32] ← FIXED!              │
│ [0][1][2]...[31]                │
│ bitmap = 0x00000000 → OOM 💥    │
└─────────────────────────────────┘

After (DYNAMIC):
┌─────────────────────────────────┐
│ SuperSlabHead                   │
│ ├─ first_chunk   → Chunk 1      │
│ └─ current_chunk → Chunk 2      │
└─────────────────────────────────┘

┌────────────────────┐      ┌────────────────────┐
│ Chunk 1 (2MB)      │ ───► │ Chunk 2 (2MB)      │ ───► ...
│ slabs[32]          │ next │ slabs[32]          │ next
│ bitmap=0x00000000  │      │ bitmap=0xFFFFFFFF  │
└────────────────────┘      └────────────────────┘
     (all busy)               (has free slabs!)
```
---
## Implementation Tasks
### Task 1: Define New Data Structures (2-3 hours)
**File**: `core/superslab/superslab_types.h`
**Changes**:
1. **Rename existing `SuperSlab` → `SuperSlabChunk`**:
```c
typedef struct SuperSlabChunk {
    Slab slabs[32];                  // Keep 32 slabs per chunk
    struct SuperSlabChunk* next;     // NEW: Link to next chunk
    uint32_t bitmap;
    size_t total_active_blocks;
    int class_idx;
    // Existing fields...
} SuperSlabChunk;
```
2. **Add new `SuperSlabHead`**:
```c
typedef struct SuperSlabHead {
    SuperSlabChunk* first_chunk;     // Head of chunk list
    SuperSlabChunk* current_chunk;   // Current chunk for fast allocation
    size_t total_chunks;             // Total chunks in list
    int class_idx;
    // Thread safety
    pthread_mutex_t expansion_lock;  // Protect chunk list expansion
} SuperSlabHead;
```
3. **Update global registry**:
```c
// Before:
extern SuperSlab* g_superslab_registry[MAX_SUPERSLABS];
// After:
extern SuperSlabHead* g_superslab_heads[TINY_NUM_CLASSES];
```
---
### Task 2: Implement Chunk Allocation (3-4 hours)
**File**: `core/superslab/superslab_alloc.c` (new file or add to existing)
**Function 1: Allocate new chunk**:
```c
// Allocate a new SuperSlabChunk via mmap
static SuperSlabChunk* alloc_new_chunk(int class_idx) {
    size_t chunk_size = SUPERSLAB_SIZE;  // 2MB

    // mmap new chunk
    void* raw = mmap(NULL, chunk_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) {
        fprintf(stderr, "[HAKMEM] CRITICAL: Failed to mmap new SuperSlabChunk for class %d (errno=%d)\n",
                class_idx, errno);
        return NULL;
    }

    // Initialize chunk structure
    SuperSlabChunk* chunk = (SuperSlabChunk*)raw;
    chunk->next = NULL;
    chunk->bitmap = 0xFFFFFFFF;  // All 32 slabs available
    chunk->total_active_blocks = 0;
    chunk->class_idx = class_idx;

    // Initialize slabs
    size_t block_size = class_to_size(class_idx);
    init_slabs_in_chunk(chunk, block_size);

    return chunk;
}
```
**Function 2: Link new chunk to head**:
```c
// Expand SuperSlabHead by linking new chunk
static int expand_superslab_head(SuperSlabHead* head) {
    if (!head) return -1;

    // Allocate new chunk (outside the lock, so mmap does not serialize callers)
    SuperSlabChunk* new_chunk = alloc_new_chunk(head->class_idx);
    if (!new_chunk) {
        return -1;  // True OOM (system out of memory)
    }

    // Thread-safe linking
    pthread_mutex_lock(&head->expansion_lock);
    if (head->current_chunk) {
        // Link at end of list
        SuperSlabChunk* tail = head->current_chunk;
        while (tail->next) {
            tail = tail->next;
        }
        tail->next = new_chunk;
    } else {
        // First chunk
        head->first_chunk = new_chunk;
    }

    // Update current chunk to new chunk
    head->current_chunk = new_chunk;
    head->total_chunks++;
    size_t total = head->total_chunks;  // Snapshot while the lock is still held
    pthread_mutex_unlock(&head->expansion_lock);

    fprintf(stderr, "[HAKMEM] Expanded SuperSlabHead for class %d: %zu chunks now\n",
            head->class_idx, total);
    return 0;
}
```
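One thing to watch in the linking step: if two threads both observe an exhausted chunk, each will map and link its own new chunk. A common way to avoid the redundant growth is to re-check the condition after taking the lock. The sketch below illustrates that pattern under assumptions: `Chunk`, `Head`, and `grow_if_exhausted` are hypothetical names, `calloc` stands in for `mmap`, and the bitmap is read as a plain field (real code would likely need an atomic load, since other threads update it outside this lock).

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Chunk {
    struct Chunk *next;
    uint32_t bitmap;                    /* 1 = free slab, 0 = busy */
} Chunk;

typedef struct {
    Chunk *first_chunk;
    Chunk *current_chunk;               /* always the tail in this sketch */
    size_t total_chunks;
    pthread_mutex_t expansion_lock;
} Head;

/* Grow the list only if the current chunk is still exhausted once the
 * lock is held. Returns 0 when a usable current chunk exists afterwards,
 * -1 if a needed allocation failed. */
static int grow_if_exhausted(Head *h) {
    assert(h != NULL);
    int rc = 0;

    pthread_mutex_lock(&h->expansion_lock);
    /* Re-check under the lock: another thread may have expanded already. */
    if (h->current_chunk == NULL || h->current_chunk->bitmap == 0) {
        Chunk *fresh = calloc(1, sizeof *fresh);   /* stand-in for mmap */
        if (fresh == NULL) {
            rc = -1;
        } else {
            fresh->bitmap = 0xFFFFFFFFu;
            if (h->current_chunk != NULL) {
                h->current_chunk->next = fresh;
            } else {
                h->first_chunk = fresh;
            }
            h->current_chunk = fresh;
            h->total_chunks++;
        }
    }
    pthread_mutex_unlock(&h->expansion_lock);
    return rc;
}
```

The trade-off is that the allocation now happens while the lock is held, which serializes expanders but keeps the chunk count bounded by actual demand.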
---
### Task 3: Update Refill Logic (4-5 hours)
**File**: `core/tiny_superslab_alloc.inc.h` or wherever `superslab_refill()` is
**Modify `superslab_refill()` to try all chunks**:
```c
// Before (BROKEN):
void* superslab_refill(int class_idx, int count) {
    SuperSlab* ss = get_superslab_for_class(class_idx);
    if (!ss) return NULL;

    if (ss->bitmap == 0x00000000) {
        // All slabs busy → OOM!
        return NULL;  // ← CRASH HERE
    }

    // Try to refill from this SuperSlab
    return refill_from_superslab(ss, count);
}

// After (DYNAMIC):
void* superslab_refill(int class_idx, int count) {
    SuperSlabHead* head = g_superslab_heads[class_idx];
    if (!head) {
        // Initialize head for this class (first time)
        head = init_superslab_head(class_idx);
        if (!head) return NULL;
        g_superslab_heads[class_idx] = head;
    }

    SuperSlabChunk* chunk = head->current_chunk;

    // Try current chunk first (fast path)
    if (chunk && chunk->bitmap != 0x00000000) {
        return refill_from_chunk(chunk, count);
    }

    // Current chunk exhausted: try later chunks before expanding (flow step 4)
    pthread_mutex_lock(&head->expansion_lock);
    for (SuperSlabChunk* c = chunk ? chunk->next : NULL; c; c = c->next) {
        if (c->bitmap != 0x00000000) {
            head->current_chunk = c;
            break;
        }
    }
    chunk = head->current_chunk;
    pthread_mutex_unlock(&head->expansion_lock);
    if (chunk && chunk->bitmap != 0x00000000) {
        return refill_from_chunk(chunk, count);
    }

    // No linked chunk has free slabs, try to expand
    fprintf(stderr, "[DEBUG] SuperSlabChunk exhausted for class %d (bitmap=0x00000000), expanding...\n", class_idx);
    if (expand_superslab_head(head) < 0) {
        fprintf(stderr, "[HAKMEM] CRITICAL: Failed to expand SuperSlabHead for class %d\n", class_idx);
        return NULL;  // True system OOM
    }

    // Retry refill from new chunk
    chunk = head->current_chunk;
    if (!chunk || chunk->bitmap == 0x00000000) {
        fprintf(stderr, "[HAKMEM] CRITICAL: New chunk still has no free slabs for class %d\n", class_idx);
        return NULL;
    }
    return refill_from_chunk(chunk, count);
}
```
**Helper function**:
```c
// Refill from a specific chunk
static void* refill_from_chunk(SuperSlabChunk* chunk, int count) {
    if (!chunk || chunk->bitmap == 0x00000000) return NULL;

    // Use existing P0 optimization (ctz-based slab selection)
    uint32_t mask = chunk->bitmap;
    while (mask && count > 0) {
        int slab_idx = __builtin_ctz(mask);
        mask &= ~(1u << slab_idx);
        Slab* slab = &chunk->slabs[slab_idx];
        // Try to acquire slab and refill
        // ... existing refill logic
    }
    return /* refilled blocks */;
}
```
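The helper above elides how a slab is actually acquired. Since several threads may race for the same bit, one plausible approach is a compare-and-swap loop over the bitmap. The sketch below shows that technique with C11 atomics; it is not taken from hakmem, `claim_lowest_slab` and `release_slab` are hypothetical names, and it assumes the bitmap can be declared `_Atomic`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Atomically claim the lowest free slab in a 32-bit bitmap (1 = free).
 * Returns the slab index, or -1 if no slab is free. */
static int claim_lowest_slab(_Atomic uint32_t *bitmap) {
    assert(bitmap != 0);
    uint32_t old = atomic_load_explicit(bitmap, memory_order_relaxed);
    while (old != 0) {
        int idx = __builtin_ctz(old);                   /* lowest set bit */
        uint32_t desired = old & ~(UINT32_C(1) << idx); /* clear it */
        /* On failure, `old` is refreshed with the current value and the
         * loop retries with a newly computed index. */
        if (atomic_compare_exchange_weak_explicit(
                bitmap, &old, desired,
                memory_order_acquire, memory_order_relaxed)) {
            return idx;
        }
    }
    return -1;
}

/* Mark a slab free again by setting its bit. */
static void release_slab(_Atomic uint32_t *bitmap, int idx) {
    assert(idx >= 0 && idx < 32);
    atomic_fetch_or_explicit(bitmap, UINT32_C(1) << idx, memory_order_release);
}
```

A thread that loses the race simply sees the updated bitmap and picks the next free slab, so no slab is handed out twice.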
---
### Task 4: Update Initialization (2-3 hours)
**File**: `core/hakmem_tiny.c` or initialization code
**Modify `hak_tiny_init()`**:
```c
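// Forward declaration: defined below, used by hak_tiny_init()
static SuperSlabHead* init_superslab_head(int class_idx);

void hak_tiny_init(void) {
    // Initialize SuperSlabHead for each class
    for (int class_idx = 0; class_idx < TINY_NUM_CLASSES; class_idx++) {
        SuperSlabHead* head = init_superslab_head(class_idx);
        if (!head) {
            fprintf(stderr, "[HAKMEM] CRITICAL: Failed to initialize SuperSlabHead for class %d\n", class_idx);
            abort();
        }
        g_superslab_heads[class_idx] = head;
    }
}

// Initialize SuperSlabHead with initial chunk(s)
static SuperSlabHead* init_superslab_head(int class_idx) {
    // NOTE: if hakmem interposes malloc, this calloc must not route back into the tiny allocator
    SuperSlabHead* head = calloc(1, sizeof(SuperSlabHead));
    if (!head) return NULL;

    head->class_idx = class_idx;
    head->total_chunks = 0;
    pthread_mutex_init(&head->expansion_lock, NULL);

    // Allocate initial chunk(s)
    int initial_chunks = 1;
    // Hot classes (1, 4, 6) get 2 initial chunks
    if (class_idx == 1 || class_idx == 4 || class_idx == 6) {
        initial_chunks = 2;
    }

    for (int i = 0; i < initial_chunks; i++) {
        if (expand_superslab_head(head) < 0) {
            fprintf(stderr, "[HAKMEM] CRITICAL: Failed to allocate initial chunk %d for class %d\n", i, class_idx);
            // Unmap any chunks linked so far before dropping the head
            for (SuperSlabChunk* c = head->first_chunk; c; ) {
                SuperSlabChunk* next = c->next;
                munmap(c, SUPERSLAB_SIZE);
                c = next;
            }
            pthread_mutex_destroy(&head->expansion_lock);
            free(head);
            return NULL;
        }
    }

    // Start allocation at the head of the list so the first chunk is not skipped
    head->current_chunk = head->first_chunk;
    return head;
}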
```
---
### Task 5: Update Free Path (2-3 hours)
**File**: `core/hakmem_tiny_free.inc` or free path code
**Modify free to find correct chunk**:
```c
void hak_tiny_free(void* ptr) {
    if (!ptr) return;

    // Determine class_idx from header or registry
    int class_idx = get_class_idx_for_ptr(ptr);
    if (class_idx < 0) {
        fprintf(stderr, "[HAKMEM] Invalid free: ptr=%p not in any SuperSlab\n", ptr);
        return;
    }

    // Find which chunk this ptr belongs to
    SuperSlabHead* head = g_superslab_heads[class_idx];
    if (!head) {
        fprintf(stderr, "[HAKMEM] Invalid free: no SuperSlabHead for class %d\n", class_idx);
        return;
    }

    SuperSlabChunk* chunk = head->first_chunk;
    while (chunk) {
        // Check if ptr is within this chunk's memory range
        uintptr_t chunk_start = (uintptr_t)chunk;
        uintptr_t chunk_end = chunk_start + SUPERSLAB_SIZE;
        uintptr_t ptr_addr = (uintptr_t)ptr;
        if (ptr_addr >= chunk_start && ptr_addr < chunk_end) {
            // Found the chunk, free to it
            free_to_chunk(chunk, ptr);
            return;
        }
        chunk = chunk->next;
    }

    fprintf(stderr, "[HAKMEM] Invalid free: ptr=%p not found in any chunk for class %d\n", ptr, class_idx);
}
```
---
### Task 6: Update Registry (3-4 hours)
**File**: Registry code (wherever SuperSlab registry is managed)
**Replace flat registry with per-class heads**:
```c
// Before:
SuperSlab* g_superslab_registry[MAX_SUPERSLABS];
// After:
SuperSlabHead* g_superslab_heads[TINY_NUM_CLASSES];
```
**Update registry lookup**:
```c
// Before:
SuperSlab* find_superslab_for_ptr(void* ptr) {
    for (int i = 0; i < MAX_SUPERSLABS; i++) {
        SuperSlab* ss = g_superslab_registry[i];
        if (ptr_in_range(ptr, ss)) return ss;
    }
    return NULL;
}

// After:
SuperSlabChunk* find_chunk_for_ptr(void* ptr, int* out_class_idx) {
    for (int class_idx = 0; class_idx < TINY_NUM_CLASSES; class_idx++) {
        SuperSlabHead* head = g_superslab_heads[class_idx];
        if (!head) continue;

        SuperSlabChunk* chunk = head->first_chunk;
        while (chunk) {
            if (ptr_in_chunk_range(ptr, chunk)) {
                if (out_class_idx) *out_class_idx = class_idx;
                return chunk;
            }
            chunk = chunk->next;
        }
    }
    return NULL;
}
```
---
## Testing Strategy
### Test 1: Build Verification
```bash
# Rebuild with new architecture
make clean
make HEADER_CLASSIDX=1 AGGRESSIVE_INLINE=1 PREWARM_TLS=1 larson_hakmem
# Check for compilation errors
echo $? # Should be 0
```
### Test 2: Single-Thread Stability
```bash
# Should work perfectly (no change in behavior)
./larson_hakmem 1 1 128 1024 1 12345 1
# Expected: 2.68-2.71M ops/s (no regression)
```
### Test 3: 4T High-Contention (CRITICAL)
```bash
# Run 20 times, count successes
success=0
for i in {1..20}; do
    echo "=== Run $i ==="
    env HAKMEM_TINY_USE_SUPERSLAB=1 HAKMEM_TINY_MEM_DIET=0 \
        ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | tee phase2a_run_$i.log
    if grep -q "Throughput" phase2a_run_$i.log; then
        ((success++))
        echo "✓ Success ($success/20)"
    else
        echo "✗ Failed"
    fi
done
echo "Final: $success/20 success rate"
# TARGET: 20/20 (100%)
# Current baseline: 10/20 (50%)
```
### Test 4: Chunk Expansion Verification
```bash
# Enable debug logging
HAKMEM_LOG=1 ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | grep "Expanded SuperSlabHead"
# Should see:
# [HAKMEM] Expanded SuperSlabHead for class 4: 2 chunks now
# [HAKMEM] Expanded SuperSlabHead for class 4: 3 chunks now
# ...
```
### Test 5: Memory Leak Check
```bash
# Valgrind test (may be slow)
valgrind --leak-check=full --show-leak-kinds=all \
./larson_hakmem 1 1 128 1024 1 12345 1 2>&1 | tee valgrind_phase2a.log
# Check for leaks
grep "definitely lost" valgrind_phase2a.log
# Should be 0 bytes
```
---
## Success Criteria
- **Compilation**: No errors, no warnings
- **Single-thread**: 2.68-2.71M ops/s (no regression)
- **4T stability**: **20/20 (100%)** ← KEY METRIC
- **Chunk expansion**: Logs show multiple chunks allocated
- **No memory leaks**: Valgrind clean
- **Performance**: 4T throughput ≥981K ops/s on runs that complete
---
## Deliverable
**Report file**: `/mnt/workdisk/public_share/hakmem/PHASE2A_IMPLEMENTATION_REPORT.md`
**Required sections**:
1. **Architecture changes** (SuperSlab → SuperSlabChunk + SuperSlabHead)
2. **Code diffs** (all modified files)
3. **Test results** (20/20 stability test)
4. **Performance comparison** (before/after)
5. **Chunk expansion behavior** (how many chunks allocated under load)
6. **Memory usage** (overhead per chunk, total memory)
7. **Production readiness** (YES/NO verdict)
---
## Files to Create/Modify
**New files**:
1. `core/superslab/superslab_alloc.c` - Chunk allocation functions
**Modified files**:
1. `core/superslab/superslab_types.h` - SuperSlabChunk + SuperSlabHead
2. `core/tiny_superslab_alloc.inc.h` - Refill logic with expansion
3. `core/hakmem_tiny_free.inc` - Free path with chunk lookup
4. `core/hakmem_tiny.c` - Initialization with SuperSlabHead
5. Registry code - Update to per-class heads
**Estimated LOC**: 500-800 lines (new code + modifications)
---
## Risk Mitigation
**Risk 1: Performance regression**
- Mitigation: Keep fast path (current_chunk) unchanged
- Single-chunk case should be identical to before
**Risk 2: Thread safety issues**
- Mitigation: Use expansion_lock only for chunk linking
- Slab-level atomics unchanged
**Risk 3: Memory overhead**
- Each chunk: 2MB (same as before)
- SuperSlabHead: ~64 bytes per class
- Total overhead: negligible
**Risk 4: Complexity**
- Mitigation: Follow mimalloc pattern (proven design)
- Keep chunk size fixed (2MB) for simplicity
---
**Let's implement Phase 2a and achieve 100% stability! 🚀**