hakmem/core/front/tiny_c2_local_cache.h
Moe Charm (CI) 89a9212700 Phase 83-1 + Allocator Comparison: Switch dispatch fixed (NO-GO +0.32%), PROFILE correction, SCORECARD update
Key changes:
- Phase 83-1: Switch dispatch fixed mode (tiny_inline_slots_switch_dispatch_fixed_box) - NO-GO (marginal +0.32%, branch reduction negligible)
  Reason: lazy-init pattern already optimal, Phase 78-1 pattern shows diminishing returns

- Allocator comparison baseline update (10-run SSOT, WS=400, ITERS=20M):
  tcmalloc: 115.26M (92.33% of mimalloc)
  jemalloc: 97.39M (77.96% of mimalloc)
  system: 85.20M (68.24% of mimalloc)
  mimalloc: 124.82M (baseline)

- hakmem PROFILE correction: scripts/run_mixed_10_cleanenv.sh + run_allocator_quick_matrix.sh
  PROFILE explicitly set to MIXED_TINYV3_C7_SAFE for hakmem measurements
  Result: baseline stabilized to 55.53M (44.46% of mimalloc)
  Previous unstable measurement (35.57M) was due to profile leak

- Documentation:
  * PERFORMANCE_TARGETS_SCORECARD.md: Reference allocators + M1/M2 milestone status
  * PHASE83_1_SWITCH_DISPATCH_FIXED_RESULTS.md: Phase 83-1 analysis (NO-GO)
  * ALLOCATOR_COMPARISON_QUICK_RUNBOOK.md: Quick comparison procedure
  * ALLOCATOR_COMPARISON_SSOT.md: Detailed SSOT methodology

- M2 milestone status: 44.46% (target 55%, gap -10.54pp) - structural improvements needed

🤖 Generated with Claude Code
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-18 18:50:00 +09:00


// tiny_c2_local_cache.h - Phase 79-1: C2 Local Cache Fast-Path API
//
// Goal: Zero-overhead always-inline push/pop for C2 FIFO ring buffer
// Scope: C2 allocations (32-64B)
// Design: Fail-fast to unified_cache on full/empty
//
// Fast-Path Strategy:
// - Always-inline push/pop for zero-call-overhead
// - Modulo arithmetic inlined (tail/head)
// - pop returns NULL on empty, push returns 0 on full (caller handles fallback)
// - No bounds checking (ring size fixed at compile time)
//
// Integration Points:
// - Alloc: Call c2_local_cache_pop() in tiny_front_hot_box BEFORE unified_cache
// - Free: Call c2_local_cache_push() in tiny_legacy_fallback BEFORE unified_cache
//
// Rationale:
// - Same pattern as C3/C4/C5/C6 inline slots (proven +7.05% C4-C6 cumulative)
// - Phase 79-0 analysis: C2 suffers Stage3 backend lock contention (not well served by TLS)
// - Lightweight capacity: 64 slots x 8B pointers = 512B/thread (Phase 79-0 specification)
// - Fail-fast design = no performance cliff if full/empty
#ifndef HAK_FRONT_TINY_C2_LOCAL_CACHE_H
#define HAK_FRONT_TINY_C2_LOCAL_CACHE_H
#include <stdint.h>
#include "../box/tiny_c2_local_cache_tls_box.h"
#include "../box/tiny_c2_local_cache_env_box.h"
// ============================================================================
// C2 Local Cache: Fast-Path Push/Pop (Always-Inline)
// ============================================================================
// Get TLS pointer for C2 local cache
// Inline for zero overhead
static inline TinyC2LocalCache* c2_local_cache_tls(void) {
    extern __thread TinyC2LocalCache g_tiny_c2_local_cache;
    return &g_tiny_c2_local_cache;
}
// Push pointer to C2 local cache ring
// Returns: 1 if success, 0 if full (caller must fallback to unified_cache)
__attribute__((always_inline))
static inline int c2_local_cache_push(TinyC2LocalCache* cache, void* ptr) {
    // Check if ring is full
    if (__builtin_expect(c2_local_cache_full(cache), 0)) {
        return 0;  // Full, caller must use unified_cache
    }
    // Enqueue at tail
    cache->slots[cache->tail] = ptr;
    cache->tail = (cache->tail + 1) % TINY_C2_LOCAL_CACHE_CAPACITY;
    return 1;  // Success
}
// Pop pointer from C2 local cache ring
// Returns: non-NULL if success, NULL if empty (caller must fallback to unified_cache)
__attribute__((always_inline))
static inline void* c2_local_cache_pop(TinyC2LocalCache* cache) {
    // Check if ring is empty
    if (__builtin_expect(c2_local_cache_empty(cache), 0)) {
        return NULL;  // Empty, caller must use unified_cache
    }
    // Dequeue from head
    void* ptr = cache->slots[cache->head];
    cache->head = (cache->head + 1) % TINY_C2_LOCAL_CACHE_CAPACITY;
    return ptr;  // Success
}
#endif // HAK_FRONT_TINY_C2_LOCAL_CACHE_H
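
Usage sketch (not part of the header). The header relies on TinyC2LocalCache, TINY_C2_LOCAL_CACHE_CAPACITY, c2_local_cache_full() and c2_local_cache_empty() from tiny_c2_local_cache_tls_box.h, which is not shown here. The example below is a minimal, self-contained approximation of the fail-fast push/pop pattern and of the "try the local ring first, then fall back" integration described in the header comment. Every name prefixed with sketch_ or Sketch, the count field used to tell full from empty, and the counters standing in for unified_cache are assumptions made for illustration; the real struct layout and fallback path may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for TINY_C2_LOCAL_CACHE_CAPACITY.
 * 64 slots of pointers is 512 B on a target with 8-byte pointers. */
#define SKETCH_CAPACITY 64u

/* Hypothetical stand-in for TinyC2LocalCache (real layout not shown). */
typedef struct {
    void*    slots[SKETCH_CAPACITY];
    uint32_t head;   /* index of the next slot to pop  */
    uint32_t tail;   /* index of the next slot to push */
    uint32_t count;  /* live entries; assumed way to tell full from empty */
} SketchRing;

static inline int sketch_full(const SketchRing* r)  { return r->count == SKETCH_CAPACITY; }
static inline int sketch_empty(const SketchRing* r) { return r->count == 0; }

/* Returns 1 on success, 0 if the ring is full (caller falls back). */
static inline int sketch_push(SketchRing* r, void* p) {
    if (sketch_full(r)) return 0;
    r->slots[r->tail] = p;
    r->tail = (r->tail + 1u) % SKETCH_CAPACITY;
    r->count++;
    return 1;
}

/* Returns the oldest pointer, or NULL if the ring is empty (caller falls back). */
static inline void* sketch_pop(SketchRing* r) {
    if (sketch_empty(r)) return NULL;
    void* p = r->slots[r->head];
    r->head = (r->head + 1u) % SKETCH_CAPACITY;
    r->count--;
    return p;
}

/* Hypothetical slower tier standing in for unified_cache. */
static int  g_fallback_frees  = 0;
static int  g_fallback_allocs = 0;
static char g_fallback_block[64];

/* Free path: try the local ring first, fall back when it is full. */
static void sketch_free(SketchRing* r, void* p) {
    if (!sketch_push(r, p)) g_fallback_frees++;
}

/* Alloc path: try the local ring first, fall back when it is empty. */
static void* sketch_alloc(SketchRing* r) {
    void* p = sketch_pop(r);
    if (p) return p;
    g_fallback_allocs++;
    return g_fallback_block;
}

/* Frees CAPACITY + 1 blocks, then allocates CAPACITY + 1 times.
 * Returns 1 if the ring handed blocks back in FIFO order and the
 * final allocation came from the fallback tier. */
static int sketch_demo(void) {
    static char blocks[SKETCH_CAPACITY + 1][64];
    SketchRing ring = {0};
    int fifo_ok = 1;

    g_fallback_frees  = 0;
    g_fallback_allocs = 0;

    for (size_t i = 0; i < SKETCH_CAPACITY + 1; i++)
        sketch_free(&ring, blocks[i]);          /* last free overflows */

    for (size_t i = 0; i < SKETCH_CAPACITY; i++)
        if (sketch_alloc(&ring) != (void*)blocks[i]) fifo_ok = 0;

    if (sketch_alloc(&ring) != (void*)g_fallback_block) fifo_ok = 0;
    assert(sketch_empty(&ring));
    return fifo_ok;
}
```

Calling sketch_demo() from a test or from main() exercises both fallback edges: one free overflows the full ring and one allocation misses the empty ring, and each is counted once by the stand-in tier. Neither edge needs special handling in the caller beyond checking the return value, which is the property the header describes as fail-fast.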