Moe Charm (CI)
2b567ac070
Phase FREE-TINY-FAST-DUALHOT-1: Optimize C0-C3 direct free path
Treat C0-C3 classes (48% of calls) as a "second hot path" instead of a
cold path. Skip the expensive policy snapshot and route determination,
and call tiny_legacy_fallback_free_base() directly.
Measurements from FREE-TINY-FAST-HOTCOLD-OPT-1 revealed that C0-C3 frees
are NOT rare (48.43% of all frees). The previous attempt to optimize via a
hot/cold split failed (-13% regression) because a noinline function call
on 48% of the workload cost more than it saved.
This phase applies the correct optimization: a direct inline path for the
frequent C0-C3 classes, without the policy snapshot overhead.
Implementation:
- Insert C0-C3 early-exit after C7 ULTRA check
- Skip tiny_front_v3_snapshot_get() for C0-C3 (saves 5-10 cycles)
- Skip route determination logic
- Safety: HAKMEM_TINY_LARSON_FIX=1 disables the optimization
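
The ordering of checks described above can be sketched roughly as follows. This is a minimal illustration, not the actual code: `tiny_free_route_sketch`, `enum free_path`, `snapshot_calls`, and the `larson_fix` parameter are hypothetical stand-ins, the C7 ULTRA check is assumed to be a plain class-index test, and the real signatures of tiny_front_v3_snapshot_get() and tiny_legacy_fallback_free_base() are not shown in this commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three outcomes discussed in this commit. */
enum free_path {
    PATH_C7_ULTRA,        /* existing C7 ULTRA fast path (checked first) */
    PATH_DUALHOT_LEGACY,  /* new: C0-C3 go straight to the legacy free */
    PATH_POLICY_ROUTED    /* everything else: snapshot + route lookup */
};

/* Counts simulated policy snapshots, standing in for the cost of
 * tiny_front_v3_snapshot_get() in the real allocator. */
static int snapshot_calls;

/* Sketch of the decision order only. `larson_fix` models
 * HAKMEM_TINY_LARSON_FIX=1; how the real code reads that setting
 * is an assumption left out here. */
static enum free_path tiny_free_route_sketch(unsigned class_idx, bool larson_fix)
{
    assert(class_idx <= 7);

    /* 1. C7 ULTRA check comes first (assumed to be a class test). */
    if (class_idx == 7)
        return PATH_C7_ULTRA;

    /* 2. DUALHOT early exit: C0-C3 skip the snapshot and routing,
     *    unless the safety switch disables the optimization. */
    if (class_idx <= 3 && !larson_fix)
        return PATH_DUALHOT_LEGACY;

    /* 3. Remaining classes pay for the policy snapshot and routing. */
    snapshot_calls++;
    return PATH_POLICY_ROUTED;
}
```

In this sketch, a C0-C3 free returns before the snapshot counter is touched, which is the saving the bullets describe. With the safety switch set, the same classes fall through to the policy-routed path.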
Benchmark Results (100M ops, 400 threads, MIXED_TINYV3_C7_SAFE):
- Baseline (optimization OFF): 44.50M ops/s (median)
- Optimized (DUALHOT ON): 48.74M ops/s (median)
- Improvement: +9.51% (+4.23M ops/s)
Perf Stats (optimized):
- Branch misses: 112.8M
- Cycles: 8.89B
- Instructions: 21.95B (2.47 IPC)
- Cache misses: 656K
Status: GO (significant improvement, no regression)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-13 03:46:36 +09:00