Phase 13-B: TinyHeapV2 supply path with dual-mode A/B framework (Stealing vs Leftover)

Summary:
- Implemented free-path supply with two ENV-gated A/B modes (HAKMEM_TINY_HEAP_V2_LEFTOVER_MODE)
- Mode 0 (Stealing, default): L0 (magazine) gets freed blocks first, ahead of L1 (TLS SLL) → +18% @ 32B
- Mode 1 (Leftover): L1 is the primary owner and L0 gets leftovers → Box-clean but -5% @ 16B
- Decision: default to Stealing for performance (per ChatGPT analysis, L0 doesn't corrupt learning-layer signals)
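
The difference between the two modes can be illustrated with a toy model. This is a minimal sketch, not the HAKMEM implementation: `ToyMag`, `ToySll`, the capacities, and the `toy_*` functions are all hypothetical names, and the real L1 is a thread-local singly linked list rather than an array. The only behavior taken from this commit is the ordering: in Mode 0 the freed block is offered to L0 first, and in Mode 1 an empty L0 is topped up with one block popped from L1.

```c
#include <assert.h>

#define TOY_MAG_CAP 2   /* hypothetical L0 capacity, kept tiny for illustration */
#define TOY_SLL_CAP 16  /* hypothetical bound for the toy L1 stack */

typedef struct { void* items[TOY_MAG_CAP]; int top; } ToyMag; /* L0: magazine */
typedef struct { void* items[TOY_SLL_CAP]; int top; } ToySll; /* L1: stand-in for the TLS SLL */

/* Push a block onto the toy L1 stack. */
static void toy_sll_push(ToySll* l1, void* blk) {
    assert(l1->top < TOY_SLL_CAP);
    l1->items[l1->top++] = blk;
}

/* Mode 0 (Stealing): the freed block goes to L0 first; L1 only sees overflow. */
static void toy_free_stealing(ToyMag* l0, ToySll* l1, void* blk) {
    if (l0->top < TOY_MAG_CAP) {
        l0->items[l0->top++] = blk;
        return;
    }
    toy_sll_push(l1, blk);
}

/* Mode 1 (Leftover): L1 receives every freed block; L0 is topped up with one
 * block popped from L1, and only when L0 is empty. */
static void toy_free_leftover(ToyMag* l0, ToySll* l1, void* blk) {
    if (l0->top == 0 && l1->top > 0) {
        l0->items[l0->top++] = l1->items[--l1->top];
    }
    toy_sll_push(l1, blk);
}
```

In this model, three frees in Mode 0 leave two blocks in L0 and one in L1, while three frees in Mode 1 leave one block in L0 and two in L1, which is why Leftover keeps L1 as the primary owner.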

Performance (100K iterations, workset=128):
- 16B: 43.9M → 45.6M ops/s (+3.9%)
- 32B: 41.9M → 49.6M ops/s (+18.4%) 
- 64B: 51.2M → 51.5M ops/s (+0.6%)
- 100% magazine hit rate (supply from free path working correctly)
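
The percentages above follow from the usual relative-change formula, (after - before) / before. A small sketch for re-deriving them from the ops/s figures; `pct_gain` is a hypothetical helper, not part of the benchmark harness:

```c
#include <assert.h>

/* Relative throughput change in percent; inputs are in M ops/s. */
static double pct_gain(double before, double after) {
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

For example, `pct_gain(41.9, 49.6)` evaluates to roughly 18.4, matching the 32B row.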

Implementation:
- tiny_free_fast_v2.inc.h: Dual-mode supply (lines 134-166)
- tiny_heap_v2.h: Add tiny_heap_v2_leftover_mode() flag + rationale doc
- tiny_alloc_fast.inc.h: Alloc hook with tiny_heap_v2_alloc_by_class()
- CURRENT_TASK.md: Updated Phase 13-B status (complete) with A/B results

ENV flags:
- HAKMEM_TINY_HEAP_V2=1                      # Enable TinyHeapV2
- HAKMEM_TINY_HEAP_V2_LEFTOVER_MODE=0        # Mode 0 (Stealing, default)
- HAKMEM_TINY_HEAP_V2_CLASS_MASK=0xE         # C1-C3 only (skips C0 to avoid its -5% regression)
- HAKMEM_TINY_HEAP_V2_STATS=1                # Print statistics
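
A sketch of how flags like these might be read and how a class mask such as 0xE selects C1-C3. This is an assumption-laden illustration: `env_ulong` and `class_enabled` are hypothetical helpers, and the mapping "bit i enables class Ci" is inferred from the `0xE` → C1-C3 comment above rather than from the HAKMEM source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: read an unsigned integer from the environment.
 * Base 0 lets strtoul accept both "14" and "0xE". Falls back to dflt
 * when the variable is unset, empty, or not a number. */
static unsigned long env_ulong(const char* name, unsigned long dflt) {
    const char* s = getenv(name);
    if (s == NULL || *s == '\0') return dflt;
    char* end = NULL;
    unsigned long v = strtoul(s, &end, 0);
    return (end == s) ? dflt : v;
}

/* Assumed mapping: bit i of the mask enables class Ci,
 * so 0xE (binary 1110) enables C1, C2, C3 and leaves C0 off. */
static int class_enabled(unsigned long mask, int class_idx) {
    assert(class_idx >= 0 && class_idx < 32);
    return (int)((mask >> class_idx) & 1UL);
}
```

A caller would typically read the mask once, for example `env_ulong("HAKMEM_TINY_HEAP_V2_CLASS_MASK", 0xFUL)`, and cache the result; the `0xF` default here is a placeholder, not the project's documented default.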

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Moe Charm (CI)
2025-11-15 16:28:40 +09:00
parent d9bbdcfc69
commit bb70d422dc
4 changed files with 115 additions and 77 deletions


@@ -132,9 +132,12 @@ static inline int hak_tiny_free_fast_v2(void* ptr) {
     void* base = (char*)ptr - 1;
     // Phase 13-B: TinyHeapV2 magazine supply (C0-C3 only)
-    // Try to supply to magazine first (L0 cache, faster than TLS SLL)
-    // Falls back to TLS SLL if magazine is full
-    if (class_idx <= 3 && tiny_heap_v2_enabled()) {
+    // Two supply modes (controlled by HAKMEM_TINY_HEAP_V2_LEFTOVER_MODE):
+    //   Mode 0 (default): L0 gets blocks first ("stealing" design)
+    //   Mode 1: L1 primary owner, L0 gets leftovers (ChatGPT recommended design)
+    if (class_idx <= 3 && tiny_heap_v2_enabled() && !tiny_heap_v2_leftover_mode()) {
+        // Mode 0: Try to supply to magazine first (L0 cache, faster than TLS SLL)
+        // Falls back to TLS SLL if magazine is full
         if (tiny_heap_v2_try_push(class_idx, base)) {
             // Successfully supplied to magazine
             return 1;
@@ -149,6 +152,19 @@ static inline int hak_tiny_free_fast_v2(void* ptr) {
         return 0;
     }
+    // Phase 13-B: Leftover mode - L0 gets leftovers from L1
+    // Mode 1: L1 (TLS SLL) is primary owner, L0 (magazine) gets leftovers
+    // Only refill L0 if it's empty (don't reduce L1 capacity)
+    if (class_idx <= 3 && tiny_heap_v2_enabled() && tiny_heap_v2_leftover_mode()) {
+        TinyHeapV2Mag* mag = &g_tiny_heap_v2_mag[class_idx];
+        if (mag->top == 0) { // Only refill if magazine is empty
+            void* leftover;
+            if (tls_sll_pop(class_idx, &leftover)) {
+                mag->items[mag->top++] = leftover;
+            }
+        }
+    }
     // Option B: Periodic TLS SLL Drain (restore slab accounting consistency)
     // Purpose: Every N frees (default: 1024), drain TLS SLL → slab freelist
     // Impact: Enables empty detection → SuperSlabs freed → LRU cache functional