Phase 13-A Step 1: TinyHeapV2 NO-REFILL L0 cache implementation

Implement TinyHeapV2 as a minimal "lucky hit" L0 cache that avoids
circular dependency with FastCache by eliminating self-refill.

Key Changes:
- New: core/front/tiny_heap_v2.h - NO-REFILL L0 cache implementation
  - tiny_heap_v2_alloc(): Pop from magazine if available, else return NULL
  - tiny_heap_v2_refill_mag(): No-op stub (no backend refill)
  - ENV: HAKMEM_TINY_HEAP_V2=1 to enable
  - ENV: HAKMEM_TINY_HEAP_V2_CLASS_MASK=bitmask (C0-C3 control)
  - ENV: HAKMEM_TINY_HEAP_V2_STATS=1 to print statistics
- Modified: core/hakmem_tiny_alloc_new.inc - Add TinyHeapV2 hook
  - Hook at entry point (after class_idx calculation)
  - Fallback to existing front if TinyHeapV2 returns NULL
- Modified: core/hakmem_tiny_alloc.inc - Add hook for legacy path
- Modified: core/hakmem_tiny.c - Add TLS variables and stats wrapper
  - TinyHeapV2Mag: Per-class magazine (capacity=16)
  - TinyHeapV2Stats: Per-class counters (alloc_calls, mag_hits, etc.)
  - tiny_heap_v2_print_stats(): Statistics output at exit
- New: TINY_HEAP_V2_TASK_SPEC.md - Phase 13 specification

Root Cause Fixed:
- BEFORE: TinyHeapV2 refilled from FastCache → circular dependency
  - TinyHeapV2 intercepted all allocs → FastCache never populated
  - Result: 100% backend OOM, 0% hit rate, 99% slowdown
- AFTER: TinyHeapV2 is passive L0 cache (no refill)
  - Magazine empty → return NULL → existing front handles it
  - Result: 0% overhead, stable baseline performance

A/B Test Results (100K iterations, fixed-size bench):
- C1 (8B):  Baseline 9,688 ops/s → HeapV2 ON 9,762 ops/s (+0.76%)
- C2 (16B): Baseline 9,804 ops/s → HeapV2 ON 9,845 ops/s (+0.42%)
- C3 (32B): Baseline 9,840 ops/s → HeapV2 ON 9,814 ops/s (-0.26%)
- All within noise range: NO PERFORMANCE REGRESSION 

Statistics (HeapV2 ON, C1-C3):
- alloc_calls: 200K (hook works correctly)
- mag_hits: 0 (0%) - Magazine empty as expected
- refill_calls: 0 - No refill executed (circular dependency avoided)
- backend_oom: 0 - No backend access

Next Steps (Phase 13-A Step 2):
- Implement magazine supply strategy (from existing front or free path)
- Goal: Populate magazine with "leftover" blocks from existing pipeline

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
Moe Charm (CI)
2025-11-15 01:42:57 +09:00
parent 9472ee90c9
commit 5cc1f93622
5 changed files with 482 additions and 0 deletions

View File

@ -148,6 +148,17 @@ void* hak_tiny_alloc(size_t size) {
}
} while (0);
// Phase 13-A: Tiny Heap v2 (per-thread heap, experimental)
// ENV-gated: HAKMEM_TINY_HEAP_V2=1
// Targets class 0-3 (16-256B) only, falls back to existing path if NULL
if (__builtin_expect(tiny_heap_v2_enabled(), 0) && class_idx <= 3) {
void* base = tiny_heap_v2_alloc(size);
if (base) {
HAK_RET_ALLOC(class_idx, base); // Header write + return USER pointer
}
// Fall through to existing front path if HeapV2 returned NULL (disabled class or OOM)
}
#if HAKMEM_TINY_MINIMAL_FRONT
// Minimal Front for hot tiny classes (bench-focused):
// SLL direct pop → minimal refill → pop, bypassing other layers.