Phase 7-Step7: Replace g_tls_sll_enable with TINY_FRONT_TLS_SLL_ENABLED macro

**Goal**: Enable dead code elimination for TLS SLL checks in PGO mode

**Changes**:
1. core/box/tiny_front_config_box.h:
   - Add TINY_FRONT_TLS_SLL_ENABLED macro (PGO: 1, Normal: tiny_tls_sll_enabled())
   - Add tiny_tls_sll_enabled() wrapper function (static inline)

2. core/tiny_alloc_fast.inc.h (5 hot path locations):
   - Line 220: tiny_heap_v2_refill_mag() - early return check
   - Line 388: SLIM mode - SLL freelist check
   - Line 459: tiny_alloc_fast_pop() - Layer 1 SLL check
   - Line 774: Main alloc path - cached sll_enabled check (most critical!)
   - Line 815: Generic front - SLL toggle respect

3. core/hakmem_tiny_refill.inc.h (2 locations):
   - Line 186: bulk_mag_refill_fc() - refill from SLL
   - Line 213: bulk_mag_to_sll_if_room() - push to SLL
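
The config box change in item 1 can be sketched roughly as below. This is a minimal illustration, not the actual header: the build flag name HAKMEM_TINY_FRONT_PGO and the call site refill_allowed() are assumptions, and g_tls_sll_enable is defined locally here only so the snippet compiles on its own.

```c
/* Sketch of the config box header. HAKMEM_TINY_FRONT_PGO is an ASSUMED
 * flag name; the real build switch may differ. */

/* Stand-in for the allocator's runtime toggle. In the real code this is
 * defined elsewhere and only declared here. */
int g_tls_sll_enable = 1;

/* Wrapper lives in the config header, so call sites no longer need their
 * own "extern int g_tls_sll_enable;" declaration. */
static inline int tiny_tls_sll_enabled(void) {
    return g_tls_sll_enable;
}

#if defined(HAKMEM_TINY_FRONT_PGO) && HAKMEM_TINY_FRONT_PGO
#  define TINY_FRONT_TLS_SLL_ENABLED 1                      /* compile-time constant */
#else
#  define TINY_FRONT_TLS_SLL_ENABLED tiny_tls_sll_enabled() /* runtime check */
#endif

/* Hypothetical call site mirroring the early-return check in
 * tiny_heap_v2_refill_mag(). */
static inline int refill_allowed(int class_idx) {
    if (class_idx < 0 || class_idx > 3) return 0;
    if (!TINY_FRONT_TLS_SLL_ENABLED) return 0;
    return 1;
}
```

In a normal build the call site still honors the runtime toggle; only the spelling of the check changes.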

**Performance**: 79.9M ops/s (maintained, +0.1M vs Step 6)
- Normal mode: Same performance (runtime checks preserved)
- PGO mode: Ready for dead code elimination (if (!1) is always false, so the compiler removes the branch)

**Expected PGO benefit**:
- Eliminate 7 TLS SLL checks across hot paths
- Reduce instruction count in main alloc loop
- Less pressure on the branch predictor (the checks become compile-time constants)
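
To illustrate the expected effect, here is a small sketch contrasting a runtime check with a constant one. All names are hypothetical and not taken from the allocator. With the constant variant the condition is !1, so an optimizing compiler would typically drop both the load and the branch; note that this also means the runtime toggle has no effect in such a build.

```c
/* Illustrative only: every name below is hypothetical. */
static int g_toggle = 1;                 /* runtime toggle, read on each call */

#define SLL_ENABLED_RUNTIME (g_toggle)   /* normal build: global read */
#define SLL_ENABLED_PGO     1            /* PGO build: literal constant */

static int slow_path_hits = 0;
static int slow_path(void) { slow_path_hits++; return -1; }

/* Normal build: the load of g_toggle and the branch stay in the code. */
static int alloc_runtime(void) {
    if (!SLL_ENABLED_RUNTIME) return slow_path();
    return 42;
}

/* PGO build: the condition is the constant !1, so the call to
 * slow_path() is unreachable and can be removed by the compiler. */
static int alloc_pgo(void) {
    if (!SLL_ENABLED_PGO) return slow_path();
    return 42;
}
```

Both functions behave identically while the toggle is on; they diverge only when the toggle is cleared at runtime, which the constant variant ignores by design.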

**Design**: Config Box as single entry point
- All TLS SLL checks now use TINY_FRONT_TLS_SLL_ENABLED
- Consistent pattern with FASTCACHE/SFC/HEAP_V2 macros
- Include order independent (wrapper in config box header)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Moe Charm (CI)
2025-11-29 17:35:51 +09:00
parent ae00221a0a
commit 69e6df4cbc
3 changed files with 22 additions and 12 deletions


@@ -216,8 +216,8 @@ static inline int tiny_heap_v2_refill_mag(int class_idx) {
 if (class_idx < 0 || class_idx > 3) return 0;
 if (!tiny_heap_v2_class_enabled(class_idx)) return 0;
-extern int g_tls_sll_enable;
-if (!g_tls_sll_enable) return 0;
+// Phase 7-Step7: Use config macro for dead code elimination in PGO mode
+if (!TINY_FRONT_TLS_SLL_ENABLED) return 0;
 TinyHeapV2Mag* mag = &g_tiny_heap_v2_mag[class_idx];
 const int cap = TINY_HEAP_V2_MAG_CAP;
@@ -384,8 +384,8 @@ static inline void* tiny_alloc_fast_pop(int class_idx) {
 // SLIM MODE: Skip FastCache + SFC, go straight to SLL
 if (__builtin_expect(g_front_slim_enabled, 0)) {
 // Box Boundary: TLS SLL freelist pop (only layer in SLIM mode)
-extern int g_tls_sll_enable;
-if (__builtin_expect(g_tls_sll_enable, 1)) {
+// Phase 7-Step7: Use config macro for dead code elimination in PGO mode
+if (__builtin_expect(TINY_FRONT_TLS_SLL_ENABLED, 1)) {
 void* base = NULL;
 if (tls_sll_pop(class_idx, &base)) {
 // Front Gate: SLL hit (SLIM fast path - 3 instructions)
@@ -455,8 +455,8 @@ static inline void* tiny_alloc_fast_pop(int class_idx) {
 // Box Boundary: Layer 1 - pop the head of the TLS SLL freelist (can be disabled via env)
 // Note: This is in tiny_alloc_fast_pop(), not tiny_alloc_fast(), so use global variable
-extern int g_tls_sll_enable;
-if (__builtin_expect(g_tls_sll_enable, 1)) {
+// Phase 7-Step7: Use config macro for dead code elimination in PGO mode
+if (__builtin_expect(TINY_FRONT_TLS_SLL_ENABLED, 1)) {
 // Use Box TLS-SLL API (C7-safe pop)
 // CRITICAL: Pop FIRST, do NOT read g_tls_sll_head directly (race condition!)
 // Reading head before pop causes stale read → rbp=0xa0 SEGV
@@ -770,8 +770,8 @@ static inline void* tiny_alloc_fast(size_t size) {
 // P0.1: Cache g_tls_sll_enable once (Phase 3-4 instruction reduction)
 // Eliminates redundant global variable reads (2-3 instructions saved)
-extern int g_tls_sll_enable;
-const int sll_enabled = g_tls_sll_enable;
+// Phase 7-Step7: Use config macro for dead code elimination in PGO mode
+const int sll_enabled = TINY_FRONT_TLS_SLL_ENABLED;
 #if !HAKMEM_BUILD_RELEASE
 // Phase 3: Debug checks eliminated in release builds
@@ -811,7 +811,8 @@ static inline void* tiny_alloc_fast(size_t size) {
 // Generic front (FastCache/SFC/SLL)
 // Respect SLL global toggle
-if (__builtin_expect(g_tls_sll_enable, 1)) {
+// Phase 7-Step7: Use config macro for dead code elimination in PGO mode
+if (__builtin_expect(TINY_FRONT_TLS_SLL_ENABLED, 1)) {
 // For classes 0..3 keep ultra-inline POP; for >=4 use safe Box POP to avoid UB on bad heads.
 if (class_idx <= 3) {
 #if HAKMEM_TINY_INLINE_SLL