Phase 35-39: FAST build optimization complete (+7.13% cumulative)

Phase 35-A: BENCH_MINIMAL gate function elimination (GO +4.39%)
- tiny_front_v3_enabled() → constant true
- tiny_metadata_cache_enabled() → constant 0
- learner_v7_enabled() → constant false
- small_learner_v2_enabled() → constant false

Phase 36: Policy snapshot init-once (GO +0.71%)
- small_policy_v7_snapshot() version check skip in BENCH_MINIMAL
- TLS cache for policy snapshot

Phase 37: Standard TLS cache (NO-GO -0.07%)
- TLS cache for Standard build attempted
- Runtime gate overhead negates benefit

Phase 38: FAST/OBSERVE/Standard workflow established
- make perf_fast, make perf_observe targets
- Scorecard and documentation updates

Phase 39: Hot path gate constantization (GO +1.98%)
- front_gate_unified_enabled() → constant 1
- alloc_dualhot_enabled() → constant 0
- g_bench_fast_front, g_v3_enabled blocks → compile-out
- free_dispatch_stats_enabled() → constant false

Results:
- FAST v3: 56.04M ops/s (47.4% of mimalloc)
- Standard: 53.50M ops/s (45.3% of mimalloc)
- M1 target (50%): 5.5% remaining

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
Moe Charm (CI)
2025-12-16 15:01:56 +09:00
parent 506e724c3b
commit b7085c47e1
22 changed files with 1550 additions and 397 deletions

View File

@ -8,6 +8,7 @@ profile=${HAKMEM_PROFILE:-MIXED_TINYV3_C7_SAFE}
iters=${ITERS:-20000000}
ws=${WS:-400}
runs=${RUNS:-10}
bin=${BENCH_BIN:-./bench_random_mixed_hakmem}
# Force known research knobs OFF to avoid accidental carry-over.
export HAKMEM_TINY_HEADER_WRITE_ONCE=${HAKMEM_TINY_HEADER_WRITE_ONCE:-0}
@ -26,5 +27,5 @@ export HAKMEM_FREE_TINY_FAST_MONO_LEGACY_DIRECT=${HAKMEM_FREE_TINY_FAST_MONO_LEG
for i in $(seq 1 "${runs}"); do
echo "=== Run ${i}/${runs} ==="
HAKMEM_PROFILE="${profile}" ./bench_random_mixed_hakmem "${iters}" "${ws}" 1 2>&1 | rg "Throughput" || true
HAKMEM_PROFILE="${profile}" "${bin}" "${iters}" "${ws}" 1 2>&1 | rg "Throughput" || true
done