Commit Graph

2 Commits

Author SHA1 Message Date
ae00221a0a Phase 7-Step6: Fix include order issue - refill path optimization complete
**Problem**: Include order dependency prevented using TINY_FRONT_FASTCACHE_ENABLED
macro in hakmem_tiny_refill.inc.h (included before tiny_alloc_fast.inc.h).

**Solution** (from ChatGPT advice):
- Move wrapper functions to tiny_front_config_box.h as static inline
- This makes them available regardless of include order
- Enables dead code elimination in PGO mode for refill path

**Changes**:
1. core/box/tiny_front_config_box.h:
   - Add tiny_fastcache_enabled() and sfc_cascade_enabled() as static inline
   - These access static global variables via extern declaration

2. core/hakmem_tiny_refill.inc.h:
   - Include tiny_front_config_box.h
   - Use TINY_FRONT_FASTCACHE_ENABLED macro (line 162)
   - Enables dead code elimination in PGO mode

3. core/tiny_alloc_fast.inc.h:
   - Remove duplicate wrapper function definitions
   - Now uses functions from config box header

**Performance**: 79.8M ops/s (maintained, 77M/81M/81M across 3 runs)

**Design Principle**: Config Box as "single entry point" for Tiny Front policy
- All config checks go through TINY_FRONT_*_ENABLED macros
- Wrapper functions centralized in config box header
- Include order independent (static inline in header)

🐱 Generated with ChatGPT advice for solving include order dependencies

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-29 17:31:32 +09:00
e0aa51dba1 Phase 4-Step3: Add Front Config Box (+2.7-4.9% dead code elimination)
Implement compile-time configuration system for dead code elimination in Tiny
allocation hot paths. The Config Box provides dual-mode configuration:
- Normal mode: Runtime ENV checks (backward compatible, flexible)
- PGO mode: Compile-time constants (dead code elimination, performance)

PERFORMANCE:
- Baseline (runtime config): 50.32 M ops/s (avg of 5 runs)
- Config Box (PGO mode): 52.77 M ops/s (avg of 5 runs)
- Improvement: +2.45 M ops/s (+4.87% with outlier, +2.72% without)
- Target: +5-8% (partially achieved)

IMPLEMENTATION:

1. core/box/tiny_front_config_box.h (NEW):
   - Defines TINY_FRONT_*_ENABLED macros for all config checks
   - PGO mode (#if HAKMEM_TINY_FRONT_PGO): Macros expand to constants (0/1)
   - Normal mode (#else): Macros expand to function calls
   - Functions remain in their original locations (no code duplication)

2. core/hakmem_build_flags.h:
   - Added HAKMEM_TINY_FRONT_PGO build flag (default: 0, off)
   - Documentation: Usage with make EXTRA_CFLAGS="-DHAKMEM_TINY_FRONT_PGO=1"

3. core/box/hak_wrappers.inc.h:
   - Replaced front_gate_unified_enabled() with TINY_FRONT_UNIFIED_GATE_ENABLED
   - 2 call sites updated (malloc and free fast paths)
   - Added config box include

EXPECTED DEAD CODE ELIMINATION (PGO mode):
  if (TINY_FRONT_UNIFIED_GATE_ENABLED) { ... }
  → if (1) { ... }  // Constant, always true
  → Compiler optimizes away the branch, keeps body

SCOPE:
  Currently only front_gate_unified_enabled() is replaced (2 call sites).
  To achieve full +5-8% target, expand to other config checks:
  - ultra_slim_mode_enabled()
  - tiny_heap_v2_enabled()
  - sfc_cascade_enabled()
  - tiny_fastcache_enabled()
  - tiny_metrics_enabled()
  - tiny_diag_enabled()

BUILD USAGE:
  Normal mode (runtime config, default):
    make bench_random_mixed_hakmem

  PGO mode (compile-time config, dead code elimination):
    make EXTRA_CFLAGS="-DHAKMEM_TINY_FRONT_PGO=1" bench_random_mixed_hakmem

BOX PATTERN COMPLIANCE:
 Single Responsibility: Configuration management ONLY
 Clear Contract: Dual-mode (PGO = constants, Normal = runtime)
 Observable: Config report function (debug builds)
 Safe: Backward compatible (default is normal mode)
 Testable: Easy A/B comparison (PGO vs normal builds)

WHY +2.7-4.9% (below +5-8% target)?
- Limited scope: Only 2 call sites for 1 config function replaced
- Lazy init overhead: front_gate_unified_enabled() cached after first call
- Need to expand to more config checks for full benefit

NEXT STEPS:
- Expand config macro usage to other functions (optional)
- OR proceed with PGO re-enablement (Final polish)

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-29 12:18:37 +09:00