From 21f7b35503cfb140f3ac97f5c15249f072628c93 Mon Sep 17 00:00:00 2001
From: "Moe Charm (CI)"
Date: Sat, 29 Nov 2025 17:04:24 +0900
Subject: [PATCH] Phase 7-Step4: Replace runtime checks with config macros (+1.1% improvement)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

**What Changed**:
Replace 3 runtime checks with compile-time config macros in hot path:
- `g_fastcache_enable` → `TINY_FRONT_FASTCACHE_ENABLED` (line 421)
- `tiny_heap_v2_enabled()` → `TINY_FRONT_HEAP_V2_ENABLED` (line 809)
- `ultra_slim_mode_enabled()` → `TINY_FRONT_ULTRA_SLIM_ENABLED` (line 757)

**Why This Works**:
PGO mode (-DHAKMEM_TINY_FRONT_PGO=1 in bench builds):
- Config macros become compile-time constants (0 or 1)
- Compiler eliminates dead branches: if (0) { ... } → removed
- Smaller code size, better instruction cache locality
- Fewer branch mispredictions in hot path

Normal mode (default, backward compatible):
- Config macros expand to runtime function calls
- Preserves ENV variable control (e.g., HAKMEM_TINY_FRONT_V2=1)

**Performance**:
bench_random_mixed (ws=256):
- Before (Step 3): 80.6 M ops/s
- After (Step 4): 81.0 / 81.0 / 82.4 M ops/s
- Average: ~81.5 M ops/s (+1.1%, +0.9 M ops/s)

**Dead Code Elimination Benefit**:
- FastCache check eliminated (PGO mode: TINY_FRONT_FASTCACHE_ENABLED = 0)
- Heap V2 check eliminated (PGO mode: TINY_FRONT_HEAP_V2_ENABLED = 0)
- Ultra SLIM check eliminated (PGO mode: TINY_FRONT_ULTRA_SLIM_ENABLED = 0)

**Files Modified**:
- core/tiny_alloc_fast.inc.h (+6 lines comments, 3 lines changed)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude
---
 core/tiny_alloc_fast.inc.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/core/tiny_alloc_fast.inc.h b/core/tiny_alloc_fast.inc.h
index 2f9f09a6..1d31a004 100644
--- a/core/tiny_alloc_fast.inc.h
+++ b/core/tiny_alloc_fast.inc.h
@@ -417,7 +417,8 @@ static inline void* tiny_alloc_fast_pop(int class_idx) {
 #endif
 
     // Phase 1: Try array stack (FastCache) first for hottest tiny classes (C0–C3)
-    if (__builtin_expect(g_fastcache_enable && class_idx <= 3, 1)) {
+    // Phase 7-Step4: Use config macro for dead code elimination in PGO mode
+    if (__builtin_expect(TINY_FRONT_FASTCACHE_ENABLED && class_idx <= 3, 1)) {
         void* fc = fastcache_pop(class_idx);
         if (__builtin_expect(fc != NULL, 1)) {
             // Frontend FastCache hit (already tracked by g_front_fc_hit)
@@ -752,7 +753,8 @@ static inline void* tiny_alloc_fast(size_t size) {
     }
 #endif
 
-    if (__builtin_expect(ultra_slim_mode_enabled(), 0)) {
+    // Phase 7-Step4: Use config macro for dead code elimination in PGO mode
+    if (__builtin_expect(TINY_FRONT_ULTRA_SLIM_ENABLED, 0)) {
         return ultra_slim_alloc_with_refill(size);
     }
     // ========== End Phase 19-2: Ultra SLIM ==========
@@ -804,7 +806,8 @@ static inline void* tiny_alloc_fast(size_t size) {
     void* ptr = NULL;
 
     // Front-V2: TLS magazine front (A/B, default OFF)
-    if (__builtin_expect(tiny_heap_v2_enabled() && front_prune_heapv2_enabled() && class_idx <= 3, 0)) {
+    // Phase 7-Step4: Use config macro for dead code elimination in PGO mode
+    if (__builtin_expect(TINY_FRONT_HEAP_V2_ENABLED && front_prune_heapv2_enabled() && class_idx <= 3, 0)) {
         void* hv2 = tiny_heap_v2_alloc_by_class(class_idx);
         if (hv2) {
             front_metrics_heapv2_hit(class_idx);
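
The commit message says the config macros become constants under `-DHAKMEM_TINY_FRONT_PGO=1` and expand to runtime function calls otherwise, but the patch does not show the macro definitions. The following is a minimal sketch of how such a dual-mode macro could be written, not the project's actual config header. Every `demo_`/`DEMO_` name is hypothetical, and the environment lookup only assumes the `HAKMEM_TINY_FRONT_V2=1` convention mentioned in the message.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime check: reads the ENV variable once and caches it.
 * The real tiny_heap_v2_enabled() is not shown in this patch. */
static inline int demo_heap_v2_enabled(void) {
    static int cached = -1;               /* -1 = not yet queried */
    if (cached < 0) {
        const char* e = getenv("HAKMEM_TINY_FRONT_V2");
        cached = (e != NULL && strcmp(e, "1") == 0) ? 1 : 0;
    }
    return cached;
}

/* Dual-mode config macro (assumed shape):
 * - PGO build: a literal 0, so the guarded branch is dead code.
 * - Normal build: a runtime call, so ENV control is preserved. */
#if defined(HAKMEM_TINY_FRONT_PGO) && HAKMEM_TINY_FRONT_PGO
#define DEMO_FRONT_HEAP_V2_ENABLED 0
#else
#define DEMO_FRONT_HEAP_V2_ENABLED demo_heap_v2_enabled()
#endif

/* Stand-in for the hot path: reports which front end would be taken. */
static inline const char* demo_pick_path(int class_idx) {
    /* With the macro defined as 0 this condition folds to false and the
     * compiler can drop the whole branch. */
    if (__builtin_expect(DEMO_FRONT_HEAP_V2_ENABLED && class_idx <= 3, 0)) {
        return "heap_v2";
    }
    return "default";
}
```

Built with `-DHAKMEM_TINY_FRONT_PGO=1`, `demo_pick_path` reduces to returning `"default"`; built without it, the result for classes 0 to 3 follows the environment variable at first call.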