Option B: Periodic TLS SLL Drain - Fix Phase 9 LRU Architecture Issue
Root Cause:
- TLS SLL fast path (95-99% of frees) does NOT decrement meta->used
- Slabs never appear empty → SuperSlabs never freed → LRU never used
- Impact: 6,455 mmap/munmap calls per 200K iterations (74.8% time)
- Performance: -94% regression (9.38M → 563K ops/s)

Solution:
- Periodic drain every N frees (default: 1024) per size class
- Drain path: TLS SLL → slab freelist via tiny_free_local_box()
- This properly decrements meta->used and enables empty detection

Implementation:
1. core/box/tls_sll_drain_box.h - New drain box function
   - tiny_tls_sll_drain(): Pop from TLS SLL, push to slab freelist
   - tiny_tls_sll_try_drain(): Drain trigger with counter
   - ENV: HAKMEM_TINY_SLL_DRAIN_ENABLE=1/0 (default: 1)
   - ENV: HAKMEM_TINY_SLL_DRAIN_INTERVAL=N (default: 1024)
   - ENV: HAKMEM_TINY_SLL_DRAIN_DEBUG=1 (debug logging)
2. core/tiny_free_fast_v2.inc.h - Integrated drain trigger
   - Added drain call after successful TLS SLL push (line 145)
   - Cost: 2-3 cycles per free (counter increment + comparison)
   - Drain triggered every 1024 frees (0.1% overhead)

Expected Impact:
- mmap/munmap: 6,455 → ~100 calls (-96-97%)
- Throughput: 563K → 8-10M ops/s (+1,300-1,700%)
- LRU utilization: 0% → >90% (functional)

Reference: PHASE9_LRU_ARCHITECTURE_ISSUE.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
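The accounting problem and the periodic-drain fix described in this message can be illustrated with a small toy model. This is only a sketch under stated assumptions: every `sketch_*` name, the struct layouts, and the "drain everything at once" policy are hypothetical and are not taken from hakmem; the real drain goes through tiny_free_local_box() and may move blocks in batches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the default interval named in the commit message. */
#define SKETCH_DRAIN_INTERVAL 1024u

/* A freed block is reused as an intrusive list node (hypothetical layout). */
typedef struct sketch_node { struct sketch_node* next; } sketch_node;

typedef struct {
    uint32_t     used;      /* live blocks, playing the role of meta->used */
    sketch_node* freelist;  /* blocks owned by the slab again */
} sketch_slab;

typedef struct {
    sketch_node* sll_head;     /* thread-local singly linked list of freed blocks */
    uint32_t     sll_count;    /* blocks currently cached in the list */
    uint32_t     free_counter; /* frees seen since the last drain */
} sketch_tls;

/* Move every cached block back to the slab and repair the accounting. */
static uint32_t sketch_drain(sketch_tls* tls, sketch_slab* slab) {
    uint32_t drained = 0;
    while (tls->sll_head != NULL) {
        sketch_node* n = tls->sll_head;
        tls->sll_head = n->next;
        tls->sll_count--;
        n->next = slab->freelist;
        slab->freelist = n;
        assert(slab->used > 0);   /* a cached block was still counted as used */
        slab->used--;
        drained++;
    }
    return drained;
}

/* Fast-path free: push onto the thread-local list without touching 'used'.
 * Every SKETCH_DRAIN_INTERVAL frees, fall into the slow drain path. */
static int sketch_free_fast(sketch_tls* tls, sketch_slab* slab, void* ptr) {
    sketch_node* n = (sketch_node*)ptr;
    n->next = tls->sll_head;
    tls->sll_head = n;
    tls->sll_count++;
    if (++tls->free_counter >= SKETCH_DRAIN_INTERVAL) {
        tls->free_counter = 0;
        sketch_drain(tls, slab);
    }
    return 1;
}
```

In this model, a slab with 2048 live blocks still reports `used == 2048` after 1023 frees, and only the 1024th free lets `used` drop. That is the window in which empty detection cannot fire, and it is bounded by the drain interval rather than being unbounded as in the root-cause description.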
@@ -19,6 +19,7 @@
 #include "hakmem_build_flags.h"
 #include "hakmem_tiny_config.h"  // For TINY_TLS_MAG_CAP, TINY_NUM_CLASSES
 #include "box/tls_sll_box.h"  // Box TLS-SLL API
+#include "box/tls_sll_drain_box.h"  // Box TLS-SLL Drain (Option B)
 #include "hakmem_tiny_integrity.h"  // PRIORITY 1-4: Corruption detection

 // Phase 7: Header-based ultra-fast free
@@ -136,6 +137,13 @@ static inline int hak_tiny_free_fast_v2(void* ptr) {
         return 0;
     }

+    // Option B: Periodic TLS SLL Drain (restore slab accounting consistency)
+    // Purpose: Every N frees (default: 1024), drain TLS SLL → slab freelist
+    // Impact: Enables empty detection → SuperSlabs freed → LRU cache functional
+    // Cost: 2-3 cycles (counter increment + comparison, predict-not-taken)
+    // Benefit: +1,300-1,700% throughput (563K → 8-10M ops/s expected)
+    tiny_tls_sll_try_drain(class_idx);
+
     return 1; // Success - handled in fast path
 }
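The diff above only shows the call site; the body of tiny_tls_sll_try_drain() lives in core/box/tls_sll_drain_box.h, which is not part of this view. As a rough illustration of how a per-class trigger driven by the two environment variables from the commit message could look, here is a minimal sketch. The `sketch_*` names, the class count of 8, and the fallback behavior on invalid input are assumptions made for this example, not the actual implementation.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_NUM_CLASSES      8      /* hypothetical; the real value is TINY_NUM_CLASSES */
#define SKETCH_DEFAULT_INTERVAL 1024u  /* default interval named in the commit message */

typedef struct {
    int      enabled;   /* HAKMEM_TINY_SLL_DRAIN_ENABLE (default: 1) */
    uint32_t interval;  /* HAKMEM_TINY_SLL_DRAIN_INTERVAL (default: 1024) */
} sketch_drain_cfg;

/* Parse the interval; unset, non-numeric, or zero input falls back to the default. */
static uint32_t sketch_parse_interval(const char* s) {
    if (s == NULL || *s < '0' || *s > '9') return SKETCH_DEFAULT_INTERVAL;
    char* end = NULL;
    unsigned long v = strtoul(s, &end, 10);
    if (*end != '\0' || v == 0 || v > UINT32_MAX) return SKETCH_DEFAULT_INTERVAL;
    return (uint32_t)v;
}

/* Draining stays enabled unless the variable is exactly "0". */
static int sketch_parse_enable(const char* s) {
    return !(s != NULL && s[0] == '0' && s[1] == '\0');
}

/* Read both settings from the environment once, e.g. at thread or process init. */
static sketch_drain_cfg sketch_load_cfg(void) {
    sketch_drain_cfg c;
    c.enabled  = sketch_parse_enable(getenv("HAKMEM_TINY_SLL_DRAIN_ENABLE"));
    c.interval = sketch_parse_interval(getenv("HAKMEM_TINY_SLL_DRAIN_INTERVAL"));
    return c;
}

/* One counter per size class, private to each thread (C11 _Thread_local). */
static _Thread_local uint32_t sketch_counter[SKETCH_NUM_CLASSES];

/* Returns 1 when the current free should trigger a drain for class_idx.
 * The common case is one increment and one compare, matching the
 * "counter increment + comparison" cost described in the diff comment. */
static int sketch_should_drain(const sketch_drain_cfg* cfg, int class_idx) {
    if (!cfg->enabled) return 0;
    if (class_idx < 0 || class_idx >= SKETCH_NUM_CLASSES) return 0;
    if (++sketch_counter[class_idx] < cfg->interval) return 0;
    sketch_counter[class_idx] = 0;
    return 1;
}
```

Keeping the counters per class means a burst of frees in one size class does not force drains in the others, which fits the "per size class" wording in the commit message. Whether the real code caches the configuration or re-reads it is not visible from this diff.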