Option B: Periodic TLS SLL Drain - Fix Phase 9 LRU Architecture Issue

Root Cause:
- TLS SLL fast path (95-99% of frees) does NOT decrement meta->used
- Slabs never appear empty → SuperSlabs never freed → LRU never used
- Impact: 6,455 mmap/munmap calls per 200K iterations (74.8% time)
- Performance: -94% regression (9.38M → 563K ops/s)
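
To make the root cause concrete, here is a minimal toy model, not the real hakmem code. All names (`rc_block`, `rc_slab_meta`, `rc_free_fast`) are hypothetical stand-ins. It assumes only what is stated above: the fast path pushes onto a thread-local singly linked list and never touches the slab's `used` counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real hakmem structures. */
typedef struct rc_block { struct rc_block *next; } rc_block;
typedef struct { int used; } rc_slab_meta;

static rc_block *rc_tls_sll_head = NULL; /* thread-local in a real allocator */

/* Fast-path free: push onto the local list and return.
 * Note what is missing: meta->used is never decremented. */
static void rc_free_fast(rc_slab_meta *meta, rc_block *b) {
    assert(b != NULL);
    (void)meta; /* accounting is deliberately skipped, as in the bug */
    b->next = rc_tls_sll_head;
    rc_tls_sll_head = b;
}

/* Free every block of a fully used slab, then report meta.used. */
static int rc_used_after_free_all(void) {
    enum { N = 8 };
    static rc_block blocks[N];
    rc_slab_meta meta = { .used = N };
    rc_tls_sll_head = NULL;
    for (int i = 0; i < N; i++) rc_free_fast(&meta, &blocks[i]);
    return meta.used; /* still N, so the slab never looks empty */
}
```

In this model every block has been freed, yet `used` still equals the slab capacity, so an empty-slab check based on `used == 0` can never fire.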

Solution:
- Periodic drain every N frees (default: 1024) per size class
- Drain path: TLS SLL → slab freelist via tiny_free_local_box()
- This properly decrements meta->used and enables empty detection
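
The drain idea can be sketched as follows. This is an illustrative model under stated assumptions, not the contents of `tls_sll_drain_box.h`: the `toy_*` names are hypothetical, the interval is shrunk from the 1024 default to 4 for readability, and `toy_free_local()` merely stands in for the role the commit assigns to `tiny_free_local_box()`.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_DRAIN_INTERVAL 4 /* source default is 1024; small here for clarity */

typedef struct toy_node { struct toy_node *next; } toy_node;
typedef struct { toy_node *freelist; int used; } toy_slab;
typedef struct {
    toy_node *sll_head;  /* thread-local cache of freed blocks */
    unsigned free_count; /* frees since the last drain */
} toy_tls;

/* Slow-path free: return the block to its slab and fix the accounting. */
static void toy_free_local(toy_slab *s, toy_node *n) {
    n->next = s->freelist;
    s->freelist = n;
    s->used--;
}

/* Drain: pop every cached block and hand it to the slow path. */
static int toy_drain(toy_tls *t, toy_slab *s) {
    int drained = 0;
    while (t->sll_head != NULL) {
        toy_node *n = t->sll_head;
        t->sll_head = n->next;
        toy_free_local(s, n);
        drained++;
    }
    return drained;
}

/* Fast-path free with a periodic drain trigger. */
static void toy_free_fast(toy_tls *t, toy_slab *s, toy_node *n) {
    n->next = t->sll_head;
    t->sll_head = n;
    if (++t->free_count >= TOY_DRAIN_INTERVAL) {
        t->free_count = 0;
        toy_drain(t, s);
    }
}

/* Free nfrees blocks of a full 8-block slab and report slab.used. */
static int toy_used_after_frees(int nfrees) {
    static toy_node nodes[8];
    toy_slab s = { NULL, 8 };
    toy_tls t = { NULL, 0 };
    assert(nfrees >= 0 && nfrees <= 8);
    for (int i = 0; i < nfrees; i++) toy_free_fast(&t, &s, &nodes[i]);
    return s.used;
}
```

With an interval of 4, `used` stays at 8 after three frees, drops to 4 on the fourth, and reaches 0 on the eighth, at which point an empty-slab check could succeed.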

Implementation:
1. core/box/tls_sll_drain_box.h - New drain box function
   - tiny_tls_sll_drain(): Pop from TLS SLL, push to slab freelist
   - tiny_tls_sll_try_drain(): Drain trigger with counter
   - ENV: HAKMEM_TINY_SLL_DRAIN_ENABLE=1/0 (default: 1)
   - ENV: HAKMEM_TINY_SLL_DRAIN_INTERVAL=N (default: 1024)
   - ENV: HAKMEM_TINY_SLL_DRAIN_DEBUG=1 (debug logging)

2. core/tiny_free_fast_v2.inc.h - Integrated drain trigger
   - Added drain call after successful TLS SLL push (line 145)
   - Cost: 2-3 cycles per free (counter increment + comparison)
   - Drain triggered once per 1024 frees (about 0.1% of frees take the drain path)
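
A rough sketch of how the environment knobs and the counter-based trigger listed above might fit together. The environment variable names and defaults come from this commit message. The parsing rules, the `toy_drain_cfg` struct, and both function names are assumptions made for illustration and may differ from the real implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical config holder; the real code may store these differently. */
typedef struct {
    int enabled;       /* HAKMEM_TINY_SLL_DRAIN_ENABLE, default 1 */
    unsigned interval; /* HAKMEM_TINY_SLL_DRAIN_INTERVAL, default 1024 */
} toy_drain_cfg;

/* Read the knobs once; unset or malformed values keep the defaults. */
static toy_drain_cfg toy_drain_cfg_load(void) {
    toy_drain_cfg cfg = { 1, 1024u };
    const char *e = getenv("HAKMEM_TINY_SLL_DRAIN_ENABLE");
    if (e != NULL && *e != '\0') cfg.enabled = (atoi(e) != 0);
    const char *n = getenv("HAKMEM_TINY_SLL_DRAIN_INTERVAL");
    if (n != NULL && *n != '\0') {
        char *end = NULL;
        unsigned long v = strtoul(n, &end, 10);
        if (end != n && *end == '\0' && v > 0 && v <= 0xFFFFFFFFul)
            cfg.interval = (unsigned)v;
    }
    return cfg;
}

/* Per-free check: one increment and one comparison on the common path.
 * Returns 1 when the caller should drain now, 0 otherwise. */
static int toy_drain_due(const toy_drain_cfg *cfg, unsigned *counter) {
    if (!cfg->enabled) return 0;
    if (++*counter < cfg->interval) return 0;
    *counter = 0;
    return 1;
}
```

A caller would load the config once at startup, keep one counter per size class, and run the drain only when `toy_drain_due()` returns 1. This keeps the common path to the increment and comparison described above.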

Expected Impact:
- mmap/munmap: 6,455 → ~100 calls (about -98%)
- Throughput: 563K → 8-10M ops/s (+1,300-1,700%)
- LRU utilization: 0% → >90% (functional)

Reference: PHASE9_LRU_ARCHITECTURE_ISSUE.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Moe Charm (CI)
2025-11-14 07:09:18 +09:00
parent f95448c767
commit 88f3592ef6
2 changed files with 260 additions and 0 deletions

@@ -19,6 +19,7 @@
 #include "hakmem_build_flags.h"
 #include "hakmem_tiny_config.h" // For TINY_TLS_MAG_CAP, TINY_NUM_CLASSES
 #include "box/tls_sll_box.h" // Box TLS-SLL API
+#include "box/tls_sll_drain_box.h" // Box TLS-SLL Drain (Option B)
 #include "hakmem_tiny_integrity.h" // PRIORITY 1-4: Corruption detection
 // Phase 7: Header-based ultra-fast free
@@ -136,6 +137,13 @@ static inline int hak_tiny_free_fast_v2(void* ptr) {
 return 0;
 }
+// Option B: Periodic TLS SLL Drain (restore slab accounting consistency)
+// Purpose: Every N frees (default: 1024), drain TLS SLL → slab freelist
+// Impact: Enables empty detection → SuperSlabs freed → LRU cache functional
+// Cost: 2-3 cycles (counter increment + comparison, predict-not-taken)
+// Benefit: +1,300-1,700% throughput (563K → 8-10M ops/s expected)
+tiny_tls_sll_try_drain(class_idx);
 return 1; // Success - handled in fast path
 }