# Phase 2a: SuperSlab Dynamic Expansion Implementation

**Date**: 2025-11-08
**Priority**: 🔴 CRITICAL - BLOCKING 100% stability
**Estimated Effort**: 7-10 days
**Status**: Ready for implementation

---

## Executive Summary

**Problem**: SuperSlab uses fixed 32-slab array → OOM under 4T high-contention
**Solution**: Implement mimalloc-style chunk linking → unlimited slab expansion
**Expected Result**: 50% → 100% stability (20/20 success rate)

---

## Current Architecture (BROKEN)

### File: `core/superslab/superslab_types.h:82`

```c
typedef struct SuperSlab {
    Slab slabs[SLABS_PER_SUPERSLAB_MAX];  // ← FIXED 32 slabs! Cannot grow!
    uint32_t bitmap;                      // ← 32 bits = 32 slabs max
    size_t total_active_blocks;
    int class_idx;
    // ...
} SuperSlab;
```

### Why This Fails

**4T high-contention scenario**:
```
Thread 1: allocates from slabs[0-7]   → bitmap bits 0-7 = 0
Thread 2: allocates from slabs[8-15]  → bitmap bits 8-15 = 0
Thread 3: allocates from slabs[16-23] → bitmap bits 16-23 = 0
Thread 4: allocates from slabs[24-31] → bitmap bits 24-31 = 0

→ bitmap = 0x00000000 (all slabs busy)
→ superslab_refill() returns NULL
→ OOM → malloc fallback (now disabled) → CRASH
```

**Evidence from logs**:
```
[DEBUG] superslab_refill returned NULL (OOM) detail:
  class=4 prev_ss=(nil) active=0 bitmap=0x00000000
  prev_meta=(nil) used=0 cap=0 slab_idx=0
  reused_freelist=0 free_idx=-2 errno=12
```

---

## Proposed Architecture (mimalloc-style)

### Design Pattern: Linked Chunks

**Inspiration**: mimalloc uses linked segments, jemalloc uses linked chunks

```c
typedef struct SuperSlabChunk {
    Slab slabs[32];                // Initial 32 slabs per chunk
    struct SuperSlabChunk* next;   // ← Link to next chunk
    uint32_t bitmap;               // 32 bits for this chunk's slabs
    size_t total_active_blocks;    // Active blocks in this chunk
    int class_idx;
} SuperSlabChunk;

typedef struct SuperSlabHead {
    SuperSlabChunk* first_chunk;   // Head of chunk list
    SuperSlabChunk* current_chunk; // Current chunk for allocation
    size_t total_chunks;           // Total chunks
                                   // allocated
    int class_idx;
    pthread_mutex_t expansion_lock; // Protect chunk list
} SuperSlabHead;
```

### Allocation Flow

```
1. superslab_refill() called
   ↓
2. Try current_chunk
   ↓
3. bitmap == 0x00000000? (all slabs busy)
   ↓ YES
4. Try current_chunk->next
   ↓ NULL (no next chunk)
5. Allocate new chunk via mmap
   ↓
6. current_chunk->next = new_chunk
   ↓
7. current_chunk = new_chunk
   ↓
8. Refill from new_chunk
   ↓ SUCCESS
9. Return blocks to caller
```

### Visual Representation

```
Before (BROKEN):
┌─────────────────────────────────┐
│ SuperSlab (2MB)                 │
│ slabs[32]  ← FIXED!             │
│ [0][1][2]...[31]                │
│ bitmap = 0x00000000 → OOM 💥    │
└─────────────────────────────────┘

After (DYNAMIC):
┌─────────────────────────────────┐
│ SuperSlabHead                   │
│ ├─ first_chunk ──────────────┐  │
│ └─ current_chunk ────────┐   │  │
└──────────────────────────│───│──┘
                           │   │
                           ▼   ▼
┌────────────────┐      ┌────────────────┐
│ Chunk 1 (2MB)  │ ───► │ Chunk 2 (2MB)  │ ───► ...
│ slabs[32]      │ next │ slabs[32]      │ next
│ bitmap=0x0000  │      │ bitmap=0xFFFF  │
└────────────────┘      └────────────────┘
(all busy)              (has free slabs!)
```

---

## Implementation Tasks

### Task 1: Define New Data Structures (2-3 hours)

**File**: `core/superslab/superslab_types.h`

**Changes**:

1. **Rename existing `SuperSlab` → `SuperSlabChunk`**:
   ```c
   typedef struct SuperSlabChunk {
       Slab slabs[32];                // Keep 32 slabs per chunk
       struct SuperSlabChunk* next;   // NEW: Link to next chunk
       uint32_t bitmap;
       size_t total_active_blocks;
       int class_idx;
       // Existing fields...
   } SuperSlabChunk;
   ```

2. **Add new `SuperSlabHead`**:
   ```c
   typedef struct SuperSlabHead {
       SuperSlabChunk* first_chunk;    // Head of chunk list
       SuperSlabChunk* current_chunk;  // Current chunk for fast allocation
       size_t total_chunks;            // Total chunks in list
       int class_idx;

       // Thread safety
       pthread_mutex_t expansion_lock; // Protect chunk list expansion
   } SuperSlabHead;
   ```

3. **Update global registry**:
   ```c
   // Before:
   extern SuperSlab* g_superslab_registry[MAX_SUPERSLABS];

   // After:
   extern SuperSlabHead* g_superslab_heads[TINY_NUM_CLASSES];
   ```

---

### Task 2: Implement Chunk Allocation (3-4 hours)

**File**: `core/superslab/superslab_alloc.c` (new file or add to existing)

**Function 1: Allocate new chunk**:
```c
#include <errno.h>     // errno
#include <stdio.h>     // fprintf
#include <sys/mman.h>  // mmap, MAP_FAILED

// Allocate a new SuperSlabChunk via mmap
static SuperSlabChunk* alloc_new_chunk(int class_idx) {
    size_t chunk_size = SUPERSLAB_SIZE;  // 2MB

    // mmap new chunk
    void* raw = mmap(NULL, chunk_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) {
        fprintf(stderr, "[HAKMEM] CRITICAL: Failed to mmap new SuperSlabChunk for class %d (errno=%d)\n",
                class_idx, errno);
        return NULL;
    }

    // Initialize chunk structure
    SuperSlabChunk* chunk = (SuperSlabChunk*)raw;
    chunk->next = NULL;
    chunk->bitmap = 0xFFFFFFFF;  // All 32 slabs available
    chunk->total_active_blocks = 0;
    chunk->class_idx = class_idx;

    // Initialize slabs
    size_t block_size = class_to_size(class_idx);
    init_slabs_in_chunk(chunk, block_size);

    return chunk;
}
```

**Function 2: Link new chunk to head**:
```c
// Expand SuperSlabHead by linking new chunk
static int expand_superslab_head(SuperSlabHead* head) {
    if (!head) return -1;

    // Allocate new chunk
    SuperSlabChunk* new_chunk = alloc_new_chunk(head->class_idx);
    if (!new_chunk) {
        return -1;  // True OOM (system out of memory)
    }

    // Thread-safe linking
    pthread_mutex_lock(&head->expansion_lock);

    if (head->current_chunk) {
        // Link at end of list
        SuperSlabChunk* tail = head->current_chunk;
        while (tail->next) {
            tail = tail->next;
        }
        tail->next = new_chunk;
    } else {
        // First chunk
        head->first_chunk = new_chunk;
    }

    // Update current chunk to new chunk
    head->current_chunk = new_chunk;
    head->total_chunks++;
    size_t chunks_now = head->total_chunks;  // Snapshot under the lock for logging

    pthread_mutex_unlock(&head->expansion_lock);

    fprintf(stderr, "[HAKMEM] Expanded SuperSlabHead for class %d: %zu chunks now\n",
            head->class_idx, chunks_now);

    return 0;
}
```

---

### Task 3: Update Refill Logic (4-5 hours)

**File**: `core/tiny_superslab_alloc.inc.h` or wherever `superslab_refill()` is defined

**Modify `superslab_refill()` to try all chunks**:

```c
// Before (BROKEN):
void* superslab_refill(int class_idx, int count) {
    SuperSlab* ss = get_superslab_for_class(class_idx);
    if (!ss) return NULL;

    if (ss->bitmap == 0x00000000) {
        // All slabs busy → OOM!
        return NULL;  // ← CRASH HERE
    }

    // Try to refill from this SuperSlab
    return refill_from_superslab(ss, count);
}

// After (DYNAMIC):
void* superslab_refill(int class_idx, int count) {
    SuperSlabHead* head = g_superslab_heads[class_idx];
    if (!head) {
        // Initialize head for this class (first time)
        head = init_superslab_head(class_idx);
        if (!head) return NULL;
        g_superslab_heads[class_idx] = head;
    }

    SuperSlabChunk* chunk = head->current_chunk;

    // Try current chunk first (fast path)
    if (chunk && chunk->bitmap != 0x00000000) {
        return refill_from_chunk(chunk, count);
    }

    // Current chunk exhausted, try to expand
    fprintf(stderr, "[DEBUG] SuperSlabChunk exhausted for class %d (bitmap=0x00000000), expanding...\n",
            class_idx);

    if (expand_superslab_head(head) < 0) {
        fprintf(stderr, "[HAKMEM] CRITICAL: Failed to expand SuperSlabHead for class %d\n",
                class_idx);
        return NULL;  // True system OOM
    }

    // Retry refill from new chunk
    chunk = head->current_chunk;
    if (!chunk || chunk->bitmap == 0x00000000) {
        fprintf(stderr, "[HAKMEM] CRITICAL: New chunk still has no free slabs for class %d\n",
                class_idx);
        return NULL;
    }

    return refill_from_chunk(chunk, count);
}
```

**Helper function** (must be declared before `superslab_refill()` uses it):
```c
// Refill from a specific chunk
static void* refill_from_chunk(SuperSlabChunk* chunk, int count) {
    if (!chunk || chunk->bitmap == 0x00000000) return NULL;

    // Use existing P0 optimization (ctz-based slab selection)
    uint32_t mask = chunk->bitmap;
    while (mask && count > 0) {
        int slab_idx = __builtin_ctz(mask);
        mask &= ~(1u << slab_idx);

        Slab* slab = &chunk->slabs[slab_idx];

        // Try to acquire slab and refill
        // ...
        // existing refill logic
    }

    return /* refilled blocks */;
}
```

---

### Task 4: Update Initialization (2-3 hours)

**File**: `core/hakmem_tiny.c` or initialization code

**Modify `hak_tiny_init()`**:

```c
// Forward declaration: the definition follows hak_tiny_init()
static SuperSlabHead* init_superslab_head(int class_idx);

void hak_tiny_init(void) {
    // Initialize SuperSlabHead for each class
    for (int class_idx = 0; class_idx < TINY_NUM_CLASSES; class_idx++) {
        SuperSlabHead* head = init_superslab_head(class_idx);
        if (!head) {
            fprintf(stderr, "[HAKMEM] CRITICAL: Failed to initialize SuperSlabHead for class %d\n",
                    class_idx);
            abort();
        }
        g_superslab_heads[class_idx] = head;
    }
}

// Initialize SuperSlabHead with initial chunk(s)
static SuperSlabHead* init_superslab_head(int class_idx) {
    SuperSlabHead* head = calloc(1, sizeof(SuperSlabHead));
    if (!head) return NULL;

    head->class_idx = class_idx;
    head->total_chunks = 0;
    pthread_mutex_init(&head->expansion_lock, NULL);

    // Allocate initial chunk(s)
    int initial_chunks = 1;

    // Hot classes (1, 4, 6) get 2 initial chunks
    if (class_idx == 1 || class_idx == 4 || class_idx == 6) {
        initial_chunks = 2;
    }

    for (int i = 0; i < initial_chunks; i++) {
        if (expand_superslab_head(head) < 0) {
            fprintf(stderr, "[HAKMEM] CRITICAL: Failed to allocate initial chunk %d for class %d\n",
                    i, class_idx);
            // Note: chunks already linked to head are not unmapped here
            pthread_mutex_destroy(&head->expansion_lock);
            free(head);
            return NULL;
        }
    }

    return head;
}
```

---

### Task 5: Update Free Path (2-3 hours)

**File**: `core/hakmem_tiny_free.inc` or free path code

**Modify free to find correct chunk**:

```c
void hak_tiny_free(void* ptr) {
    if (!ptr) return;

    // Determine class_idx from header or registry
    int class_idx = get_class_idx_for_ptr(ptr);
    if (class_idx < 0) {
        fprintf(stderr, "[HAKMEM] Invalid free: ptr=%p not in any SuperSlab\n", ptr);
        return;
    }

    // Find which chunk this ptr belongs to
    SuperSlabHead* head = g_superslab_heads[class_idx];
    if (!head) {
        fprintf(stderr, "[HAKMEM] Invalid free: no SuperSlabHead for class %d\n", class_idx);
        return;
    }

    SuperSlabChunk* chunk = head->first_chunk;
    while (chunk) {
        // Check if ptr is within this chunk's memory range
        uintptr_t chunk_start =
            (uintptr_t)chunk;
        uintptr_t chunk_end = chunk_start + SUPERSLAB_SIZE;
        uintptr_t ptr_addr = (uintptr_t)ptr;

        if (ptr_addr >= chunk_start && ptr_addr < chunk_end) {
            // Found the chunk, free to it
            free_to_chunk(chunk, ptr);
            return;
        }

        chunk = chunk->next;
    }

    fprintf(stderr, "[HAKMEM] Invalid free: ptr=%p not found in any chunk for class %d\n",
            ptr, class_idx);
}
```

---

### Task 6: Update Registry (3-4 hours)

**File**: Registry code (wherever SuperSlab registry is managed)

**Replace flat registry with per-class heads**:

```c
// Before:
SuperSlab* g_superslab_registry[MAX_SUPERSLABS];

// After:
SuperSlabHead* g_superslab_heads[TINY_NUM_CLASSES];
```

**Update registry lookup**:
```c
// Before:
SuperSlab* find_superslab_for_ptr(void* ptr) {
    for (int i = 0; i < MAX_SUPERSLABS; i++) {
        SuperSlab* ss = g_superslab_registry[i];
        if (ptr_in_range(ptr, ss)) return ss;
    }
    return NULL;
}

// After:
SuperSlabChunk* find_chunk_for_ptr(void* ptr, int* out_class_idx) {
    for (int class_idx = 0; class_idx < TINY_NUM_CLASSES; class_idx++) {
        SuperSlabHead* head = g_superslab_heads[class_idx];
        if (!head) continue;

        SuperSlabChunk* chunk = head->first_chunk;
        while (chunk) {
            if (ptr_in_chunk_range(ptr, chunk)) {
                if (out_class_idx) *out_class_idx = class_idx;
                return chunk;
            }
            chunk = chunk->next;
        }
    }
    return NULL;
}
```

---

## Testing Strategy

### Test 1: Build Verification

```bash
# Rebuild with new architecture
make clean
make HEADER_CLASSIDX=1 AGGRESSIVE_INLINE=1 PREWARM_TLS=1 larson_hakmem

# Check for compilation errors
echo $?
# Should be 0
```

### Test 2: Single-Thread Stability

```bash
# Should work perfectly (no change in behavior)
./larson_hakmem 1 1 128 1024 1 12345 1

# Expected: 2.68-2.71M ops/s (no regression)
```

### Test 3: 4T High-Contention (CRITICAL)

```bash
# Run 20 times, count successes
success=0
for i in {1..20}; do
    echo "=== Run $i ==="
    env HAKMEM_TINY_USE_SUPERSLAB=1 HAKMEM_TINY_MEM_DIET=0 \
        ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | tee "phase2a_run_$i.log"

    if grep -q "Throughput" "phase2a_run_$i.log"; then
        success=$((success + 1))  # Unlike ((success++)), never returns a non-zero status
        echo "✓ Success ($success/20)"
    else
        echo "✗ Failed"
    fi
done

echo "Final: $success/20 success rate"

# TARGET: 20/20 (100%)
# Current baseline: 10/20 (50%)
```

### Test 4: Chunk Expansion Verification

```bash
# Enable debug logging
HAKMEM_LOG=1 ./larson_hakmem 10 8 128 1024 1 12345 4 2>&1 | grep "Expanded SuperSlabHead"

# Should see:
# [HAKMEM] Expanded SuperSlabHead for class 4: 2 chunks now
# [HAKMEM] Expanded SuperSlabHead for class 4: 3 chunks now
# ...
```

### Test 5: Memory Leak Check

```bash
# Valgrind test (may be slow)
valgrind --leak-check=full --show-leak-kinds=all \
    ./larson_hakmem 1 1 128 1024 1 12345 1 2>&1 | tee valgrind_phase2a.log

# Check for leaks
grep "definitely lost" valgrind_phase2a.log
# Should be 0 bytes
```

---

## Success Criteria

- ✅ **Compilation**: No errors, no warnings
- ✅ **Single-thread**: 2.68-2.71M ops/s (no regression)
- ✅ **4T stability**: **20/20 (100%)** ← KEY METRIC
- ✅ **Chunk expansion**: Logs show multiple chunks allocated
- ✅ **No memory leaks**: Valgrind clean
- ✅ **Performance**: 4T throughput ≥981K ops/s (when it works)

---

## Deliverable

**Report file**: `/mnt/workdisk/public_share/hakmem/PHASE2A_IMPLEMENTATION_REPORT.md`

**Required sections**:
1. **Architecture changes** (SuperSlab → SuperSlabChunk + SuperSlabHead)
2. **Code diffs** (all modified files)
3. **Test results** (20/20 stability test)
4. **Performance comparison** (before/after)
5. **Chunk expansion behavior** (how many chunks allocated under load)
6. **Memory usage** (overhead per chunk, total memory)
7. **Production readiness** (YES/NO verdict)

---

## Files to Create/Modify

**New files**:
1. `core/superslab/superslab_alloc.c` - Chunk allocation functions

**Modified files**:
1. `core/superslab/superslab_types.h` - SuperSlabChunk + SuperSlabHead
2. `core/tiny_superslab_alloc.inc.h` - Refill logic with expansion
3. `core/hakmem_tiny_free.inc` - Free path with chunk lookup
4. `core/hakmem_tiny.c` - Initialization with SuperSlabHead
5. Registry code - Update to per-class heads

**Estimated LOC**: 500-800 lines (new code + modifications)

---

## Risk Mitigation

**Risk 1: Performance regression**
- Mitigation: Keep fast path (current_chunk) unchanged
- Single-chunk case should be identical to before

**Risk 2: Thread safety issues**
- Mitigation: Use expansion_lock only for chunk linking
- Slab-level atomics unchanged

**Risk 3: Memory overhead**
- Each chunk: 2MB (same as before)
- SuperSlabHead: ~64 bytes per class
- Total overhead: negligible

**Risk 4: Complexity**
- Mitigation: Follow mimalloc pattern (proven design)
- Keep chunk size fixed (2MB) for simplicity

---

**Let's implement Phase 2a and achieve 100% stability! 🚀**
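
### Appendix: Sketch of Expansion Under Contention

One point the Risk 2 mitigation leaves open is what happens when several threads find the same chunk exhausted at the same moment: each of them could append its own chunk. The sketch below shows one possible way to handle that, by re-checking under `expansion_lock` whether `current_chunk` is still the chunk the caller saw as full. It is a simplified illustration under stated assumptions, not the planned implementation: `DemoChunk`, `DemoHead`, `demo_head_init`, `demo_expand`, and `demo_take_slab` are hypothetical names, `calloc` stands in for `mmap`, the C11 atomics are an assumption about how the bitmap might be updated, chunks are never released, and `__builtin_ctz` requires GCC or Clang.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for SuperSlabChunk / SuperSlabHead. */
typedef struct DemoChunk {
    struct DemoChunk *next;              /* written once, under the lock */
    _Atomic uint32_t bitmap;             /* bit set = slab free, as in the plan */
} DemoChunk;

typedef struct DemoHead {
    DemoChunk *first_chunk;              /* only touched under the lock */
    _Atomic(DemoChunk *) current_chunk;  /* always the tail in this sketch */
    size_t total_chunks;                 /* only touched under the lock */
    pthread_mutex_t expansion_lock;
} DemoHead;

static void demo_head_init(DemoHead *head) {
    head->first_chunk = NULL;
    atomic_init(&head->current_chunk, NULL);
    head->total_chunks = 0;
    pthread_mutex_init(&head->expansion_lock, NULL);
}

/* Append one chunk unless another thread already expanded past `seen_full`.
 * Returns 0 on success (including the "someone else expanded" case), -1 on OOM. */
static int demo_expand(DemoHead *head, DemoChunk *seen_full) {
    int rc = 0;
    pthread_mutex_lock(&head->expansion_lock);

    DemoChunk *tail = atomic_load(&head->current_chunk);
    if (tail == seen_full) {             /* re-check under the lock */
        DemoChunk *chunk = calloc(1, sizeof *chunk);  /* mmap in the real design */
        if (!chunk) {
            rc = -1;
        } else {
            atomic_init(&chunk->bitmap, 0xFFFFFFFFu); /* all 32 slabs free */
            if (tail) tail->next = chunk;
            else      head->first_chunk = chunk;
            head->total_chunks++;
            atomic_store(&head->current_chunk, chunk); /* publish last */
        }
    }

    pthread_mutex_unlock(&head->expansion_lock);
    return rc;
}

/* Claim one free slab from the current chunk, expanding when it is full.
 * Returns the chunk-local slab index (0..31), or -1 on OOM. */
static int demo_take_slab(DemoHead *head) {
    for (;;) {
        DemoChunk *chunk = atomic_load(&head->current_chunk);
        uint32_t bits = chunk ? atomic_load(&chunk->bitmap) : 0;

        while (bits != 0) {
            int idx = __builtin_ctz(bits);            /* lowest free slab */
            /* On failure, `bits` is refreshed with the current bitmap. */
            if (atomic_compare_exchange_weak(&chunk->bitmap, &bits,
                                             bits & ~(1u << idx)))
                return idx;
        }

        if (demo_expand(head, chunk) < 0) return -1;  /* true OOM */
    }
}
```

In this sketch, claiming 33 slabs from a freshly initialized `DemoHead` leaves two linked chunks, and a call to `demo_expand()` with a stale chunk pointer returns 0 without allocating anything. Publishing `current_chunk` only after the new chunk is fully initialized is what lets the fast path read it without taking the lock.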