hakmem/core/front/tiny_c5_inline_slots.h
Moe Charm (CI) 89a9212700 Phase 83-1 + Allocator Comparison: Switch dispatch fixed (NO-GO +0.32%), PROFILE correction, SCORECARD update
Key changes:
- Phase 83-1: Switch dispatch fixed mode (tiny_inline_slots_switch_dispatch_fixed_box) - NO-GO (marginal +0.32%, branch reduction negligible)
  Reason: lazy-init pattern already optimal, Phase 78-1 pattern shows diminishing returns

- Allocator comparison baseline update (10-run SSOT, WS=400, ITERS=20M):
  tcmalloc: 115.26M (92.33% of mimalloc)
  jemalloc: 97.39M (77.96% of mimalloc)
  system: 85.20M (68.24% of mimalloc)
  mimalloc: 124.82M (baseline)

- hakmem PROFILE correction: scripts/run_mixed_10_cleanenv.sh + run_allocator_quick_matrix.sh
  PROFILE explicitly set to MIXED_TINYV3_C7_SAFE for hakmem measurements
  Result: baseline stabilized to 55.53M (44.46% of mimalloc)
  Previous unstable measurement (35.57M) was due to profile leak

- Documentation:
  * PERFORMANCE_TARGETS_SCORECARD.md: Reference allocators + M1/M2 milestone status
  * PHASE83_1_SWITCH_DISPATCH_FIXED_RESULTS.md: Phase 83-1 analysis (NO-GO)
  * ALLOCATOR_COMPARISON_QUICK_RUNBOOK.md: Quick comparison procedure
  * ALLOCATOR_COMPARISON_SSOT.md: Detailed SSOT methodology

- M2 milestone status: 44.46% (target 55%, gap -10.54pp) - structural improvements needed

🤖 Generated with Claude Code
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-18 18:50:00 +09:00


// tiny_c5_inline_slots.h - Phase 75-2: C5 Inline Slots Fast-Path API
//
// Goal: Zero-overhead fast-path API for C5 inline slot operations
// Scope: C5 class only (separate from C6, tested independently)
// Design: Always-inline, fail-fast to unified_cache on FULL/empty
//
// Performance Target:
// - Push: 1-2 cycles (full check plus ring index update)
// - Pop: 1-2 cycles (empty check plus ring index update)
// - Fallback: Silent delegation to unified_cache (existing path)
//
// Integration Points:
// - Alloc: Try c5_inline_pop() first, fallback to unified_cache_pop()
// - Free: Try c5_inline_push() first, fallback to unified_cache_push()
//
// Safety:
// - Caller must check c5_inline_ready() before calling
// - Caller must handle NULL return (pop) or full condition (push)
// - Only the full/empty state is checked internally; pointers are not validated (fail-fast design)
#ifndef HAK_FRONT_TINY_C5_INLINE_SLOTS_H
#define HAK_FRONT_TINY_C5_INLINE_SLOTS_H
#include <stddef.h>  // NULL
#include <stdint.h>
#include "../box/tiny_c5_inline_slots_env_box.h"
#include "../box/tiny_c5_inline_slots_tls_box.h"
#include "../box/tiny_inline_slots_fixed_mode_box.h"
// ============================================================================
// Fast-Path API (always_inline for zero call overhead)
// ============================================================================
// Push to C5 inline slots (free path)
// Returns: 1 on success, 0 if full (caller must fallback to unified_cache)
// Precondition: ptr is valid BASE pointer for C5 class
__attribute__((always_inline))
static inline int c5_inline_push(TinyC5InlineSlots* slots, void* ptr) {
    // Full check (single branch, likely NOT taken in steady state)
    if (__builtin_expect(c5_inline_full(slots), 0)) {
        return 0;  // Full, caller must fallback
    }
    // Push to tail (FIFO producer)
    slots->slots[slots->tail] = ptr;
    slots->tail = (slots->tail + 1) % TINY_C5_INLINE_CAPACITY;
    return 1;  // Success
}
// Pop from C5 inline slots (alloc path)
// Returns: BASE pointer on success, NULL if empty (caller must fallback to unified_cache)
// Precondition: slots is initialized and enabled
__attribute__((always_inline))
static inline void* c5_inline_pop(TinyC5InlineSlots* slots) {
    // Empty check (single branch, likely NOT taken in steady state)
    if (__builtin_expect(c5_inline_empty(slots), 0)) {
        return NULL;  // Empty, caller must fallback
    }
    // Pop from head (FIFO consumer)
    void* ptr = slots->slots[slots->head];
    slots->head = (slots->head + 1) % TINY_C5_INLINE_CAPACITY;
    return ptr;  // BASE pointer (caller converts to USER)
}
// ============================================================================
// Integration Helpers (for malloc_tiny_fast.h integration)
// ============================================================================
// Get TLS instance (wraps extern TLS variable)
static inline TinyC5InlineSlots* c5_inline_tls(void) {
    return &g_tiny_c5_inline_slots;
}
// Check if C5 inline is enabled AND initialized (combined gate)
// Returns: 1 if ready to use, 0 if disabled or uninitialized
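static inline int c5_inline_ready(void) {
    if (!tiny_c5_inline_slots_enabled_fast()) {
        return 0;
    }
    // TLS init check (once per thread)
    // Note: In production, this check can be eliminated if TLS init is guaranteed
    // Note: if `slots` is an inline array member (as the direct indexing in
    // push/pop suggests), `slots->slots != NULL` is always true and this
    // expression is constant 1; zero-initialized TLS is then a valid empty ring.
    TinyC5InlineSlots* slots = c5_inline_tls();
    return (slots->slots != NULL || slots->head == 0);  // Initialized if zero or non-null
}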
#endif // HAK_FRONT_TINY_C5_INLINE_SLOTS_H
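
The integration pattern described in the header comment (try the inline slots first, fall back to the slower cache when they are full or empty) can be sketched in isolation as below. This is a minimal, self-contained illustration and not the hakmem implementation: `SketchSlots`, `SKETCH_CAPACITY`, the `sketch_*` functions, and the counting fallback stub are all hypothetical stand-ins for `TinyC5InlineSlots`, `TINY_C5_INLINE_CAPACITY`, `c5_inline_push()`/`c5_inline_pop()`, and `unified_cache_push()`/`unified_cache_pop()`. The full/empty convention used here (one slot kept unused so that `head == tail` means empty) is also an assumption; the real `c5_inline_full()`/`c5_inline_empty()` live in the TLS box header and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; the real value is TINY_C5_INLINE_CAPACITY. */
#define SKETCH_CAPACITY 8u

/* Hypothetical stand-in for TinyC5InlineSlots. */
typedef struct {
    void*   slots[SKETCH_CAPACITY];
    uint8_t head;  /* next index to pop  */
    uint8_t tail;  /* next index to push */
} SketchSlots;

/* Assumed convention: one slot stays unused, so head == tail means empty. */
static inline int sketch_empty(const SketchSlots* s) {
    return s->head == s->tail;
}
static inline int sketch_full(const SketchSlots* s) {
    return (uint8_t)((s->tail + 1u) % SKETCH_CAPACITY) == s->head;
}

/* Same shape as c5_inline_push(): 1 on success, 0 if full. */
static inline int sketch_push(SketchSlots* s, void* base) {
    if (__builtin_expect(sketch_full(s), 0)) return 0;
    s->slots[s->tail] = base;
    s->tail = (uint8_t)((s->tail + 1u) % SKETCH_CAPACITY);
    return 1;
}

/* Same shape as c5_inline_pop(): pointer on success, NULL if empty. */
static inline void* sketch_pop(SketchSlots* s) {
    if (__builtin_expect(sketch_empty(s), 0)) return NULL;
    void* base = s->slots[s->head];
    s->head = (uint8_t)((s->head + 1u) % SKETCH_CAPACITY);
    return base;
}

/* Stub slow path standing in for unified_cache: a small LIFO with counters. */
static void* g_fallback_stack[16];
static int   g_fallback_top;
static int   g_fallback_pushes;
static int   g_fallback_pops;

static void sketch_fallback_push(void* base) {
    g_fallback_pushes++;
    assert(g_fallback_top < 16);
    g_fallback_stack[g_fallback_top++] = base;
}
static void* sketch_fallback_pop(void) {
    g_fallback_pops++;
    return g_fallback_top > 0 ? g_fallback_stack[--g_fallback_top] : NULL;
}

/* Free path: inline slots first, slow path only when the ring is full. */
static void sketch_free(SketchSlots* s, void* base) {
    if (!sketch_push(s, base)) sketch_fallback_push(base);
}

/* Alloc path: inline slots first, slow path only when the ring is empty. */
static void* sketch_alloc(SketchSlots* s) {
    void* base = sketch_pop(s);
    return base ? base : sketch_fallback_pop();
}
```

With this convention a capacity of 8 holds at most 7 pointers, so freeing 10 blocks into an empty ring sends 3 of them to the fallback. Allocation then drains the ring in FIFO order before touching the fallback at all.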