Phase 75-1: C6-only Inline Slots (P2) - GO (+2.87%)

Modular implementation of hot-class inline slots optimization:
- Created 5 new boxes: env_box, tls_box, fast_path_api, integration_box, test_script
- Single decision point at TLS init (ENV gate: HAKMEM_TINY_C6_INLINE_SLOTS=0/1; see the sketch after this list)
- Integration: 2 minimal boundary points (alloc/free paths for C6 class)
- Default OFF: zero overhead when disabled (full backward compatibility)
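A minimal sketch of how the ENV gate could be resolved once per thread. The name `tiny_c6_inline_slots_enabled()` is taken from the diff below; the getenv-caching scheme shown here is only an illustrative assumption, not the actual env_box implementation:

```c
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

/* Sketch only: resolve HAKMEM_TINY_C6_INLINE_SLOTS once per thread.
 * -1 = not yet resolved, 0 = off (default), 1 = on. */
static __thread int g_c6_inline_enabled = -1;

static inline bool tiny_c6_inline_slots_enabled(void) {
    if (__builtin_expect(g_c6_inline_enabled < 0, 0)) {
        const char* v = getenv("HAKMEM_TINY_C6_INLINE_SLOTS");
        /* Default OFF: only "1" enables the C6 inline slots fast path. */
        g_c6_inline_enabled = (v != NULL && strcmp(v, "1") == 0) ? 1 : 0;
    }
    return g_c6_inline_enabled == 1;
}
```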

Results (10-run Mixed SSOT, WS=400):
- Baseline (C6 inline OFF):  44.24 M ops/s
- Treatment (C6 inline ON):  45.51 M ops/s
- Delta: +1.27 M ops/s (+2.87%)

Status: GO - Strong improvement via the C6 ring-buffer fast path
Mechanism: Branch elimination on unified_cache_push/pop for C6 allocations (free-side sketch below)
Next: Phase 75-2 (add C5 inline slots, target 85% C4-C7 coverage)
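The alloc-side boundary point appears in the diff below; the free-side counterpart is not shown in this excerpt. A hedged sketch of what that second boundary point might look like, where `c6_inline_push()`, `tiny_hot_free_fast_c6()`, and the `C6InlineSlots` type name are hypothetical (only `c6_inline_tls()` and `tiny_c6_inline_slots_enabled()` appear in the diff):

```c
#include <stdbool.h>

/* Assumed declarations; c6_inline_push() is a hypothetical mirror of
 * c6_inline_pop() that returns true when the pointer was absorbed. */
typedef struct C6InlineSlots C6InlineSlots;
bool tiny_c6_inline_slots_enabled(void);
C6InlineSlots* c6_inline_tls(void);
bool c6_inline_push(C6InlineSlots* s, void* base);

/* Sketch of the free-path boundary: returns true when the block was
 * absorbed into the thread-local C6 ring, letting the caller skip
 * unified_cache_push entirely. */
static inline bool tiny_hot_free_fast_c6(void* base, int class_idx) {
    if (class_idx == 6 && tiny_c6_inline_slots_enabled()) {
        if (c6_inline_push(c6_inline_tls(), base)) {
            return true;  /* C6 inline hit: unified cache path skipped */
        }
        /* Ring full -> fall through to the regular unified cache push. */
    }
    return false;
}
```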

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Author:  Moe Charm (CI)
Date:    2025-12-18 08:22:09 +09:00
Parent:  65f982aeec
Commit:  0009ce13b3
11 changed files with 743 additions and 10 deletions


@@ -31,6 +31,8 @@
#include "../front/tiny_unified_cache.h" // For TinyUnifiedCache
#include "tiny_header_box.h" // Phase 5 E5-2: For tiny_header_finalize_alloc
#include "tiny_unified_lifo_box.h" // Phase 15 v1: UnifiedCache FIFO→LIFO
#include "tiny_c6_inline_slots_env_box.h" // Phase 75-1: C6 inline slots ENV gate
#include "../front/tiny_c6_inline_slots.h" // Phase 75-1: C6 inline slots API
// ============================================================================
// Branch Prediction Macros (Pointer Safety - Prediction Hints)
@@ -110,6 +112,21 @@ __attribute__((always_inline))
static inline void* tiny_hot_alloc_fast(int class_idx) {
    extern __thread TinyUnifiedCache g_unified_cache[];
    // Phase 75-1: C6 Inline Slots early-exit (ENV gated)
    // Try C6 inline slots FIRST (before unified cache) for class 6
    if (class_idx == 6 && tiny_c6_inline_slots_enabled()) {
        void* base = c6_inline_pop(c6_inline_tls());
        if (TINY_HOT_LIKELY(base != NULL)) {
            TINY_HOT_METRICS_HIT(class_idx);
#if HAKMEM_TINY_HEADER_CLASSIDX
            return tiny_header_finalize_alloc(base, class_idx);
#else
            return base;
#endif
        }
        // C6 inline miss → fall through to unified cache
    }
    // TLS cache access (1 cache miss)
    // NOTE: Range check removed - caller (hak_tiny_size_to_class) guarantees valid class_idx
    TinyUnifiedCache* cache = &g_unified_cache[class_idx];
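For context, a speculative sketch of the thread-local state behind `c6_inline_tls()`/`c6_inline_pop()`. Only those two names come from the diff above; the fixed-capacity LIFO layout, the `C6InlineSlots` type name, and the capacity of 16 are assumptions:

```c
#include <stddef.h>

#define C6_INLINE_CAP 16   /* assumed capacity, not from the diff */

typedef struct {
    void*    slots[C6_INLINE_CAP];  /* cached C6 base pointers */
    unsigned count;                 /* number of valid entries */
} C6InlineSlots;

static __thread C6InlineSlots g_c6_inline_slots;

static inline C6InlineSlots* c6_inline_tls(void) {
    return &g_c6_inline_slots;
}

/* Pop the most recently pushed C6 block, or NULL when empty. No class_idx
 * branch and no unified-cache indexing on this path, which is the
 * branch-elimination effect the commit message describes. */
static inline void* c6_inline_pop(C6InlineSlots* s) {
    if (s->count == 0) return NULL;
    return s->slots[--s->count];
}
```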