Phase FREE-FRONT-V3-1: Free route snapshot infrastructure + build fix

Summary:
========
Implemented Phase FREE-FRONT-V3 infrastructure to optimize free hotpath by:
1. Creating snapshot-based route decision table (consolidating route logic)
2. Removing redundant ENV checks from hot path
3. Preparing for future integration into hak_free_at()

Key Changes:
============

1. NEW FILES:
   - core/box/free_front_v3_env_box.h: Route snapshot definition & API
   - core/box/free_front_v3_env_box.c: Snapshot initialization & caching

2. Infrastructure Details:
   - FreeRouteSnapshotV3: Maps class_idx → free_route_kind for all 8 classes
   - Routes defined: LEGACY, TINY_V3, CORE_V6_C6, POOL_V1
   - ENV-gated initialization (HAKMEM_TINY_FREE_FRONT_V3_ENABLED, default OFF)
   - Per-thread TLS caching to avoid repeated ENV reads

3. Design Goals:
   - Consolidate tiny_route_for_class() results into snapshot table
   - Remove C7 ULTRA / v4 / v5 / v6 ENV checks from hot path
   - Limit lookup (ss_fast_lookup/slab_index_for) to paths that truly need it
   - Clear ownership boundary: front v3 handles routing, downstream handles free

4. Phase Plan:
   - v3-1  COMPLETE: Infrastructure (snapshot table, ENV initialization, TLS cache)
   - v3-2 (INFRASTRUCTURE ONLY): Placeholder integration in hak_free_api.inc.h
   - v3-3 (FUTURE): Full integration + benchmark A/B to measure hotpath improvement

5. BUILD FIX:
   - Added missing core/box/c7_meta_used_counter_box.o to OBJS_BASE in Makefile
   - This symbol was referenced but not linked, causing undefined reference errors
   - Benchmark targets now build cleanly without LTO

Status:
=======
- Build:  PASS (bench_allocators_hakmem builds without errors)
- Integration: Currently DISABLED (default OFF, ready for v3-2 phase)
- No performance impact: Infrastructure-only, hotpath unchanged

Future Work:
============
- Phase v3-2: Integrate snapshot routing into hak_free_at() main path
- Phase v3-3: Measure free hotpath performance improvement (target: 1-2% less branch mispredict)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
This commit is contained in:
Moe Charm (CI)
2025-12-11 19:17:30 +09:00
parent 224cc8d1ca
commit 7b7de53167
14 changed files with 462 additions and 10 deletions

View File

@ -410,6 +410,13 @@ uint32_t region_id_box_lookup(void* ptr);
- C6 → C5 → 他クラスと、hot class から CORE v6 に載せ、Mixed 161024B の perf を確認。
- C7 ULTRAL0と CORE v6L1の共存チューニング。
5. **Phase v6-4 以降C5/C4 拡張 + free hotpath 削減)**
- C6 で安定・baseline 同等が確認できたら、C5 / C4 を順次 CORE v6 に載せていき、free hotpath の `ss_fast_lookup`/`slab_index_for` 依存を削っていく。
- 各クラスごとに:
- heavy プロファイルC5-heavy, C4-heavyで v6 ON/OFF の A/Bまずは ±0〜数% を目標)。
- Mixed 161024B で v6 ON 時の impactfree% の減少と ops/s の変化)を確認。
- それでも free 側が支配的なら、最終的には front/gate の dispatcher 自体を薄くするフェーズfree dispatch 削減)に進む。
以降の Phase は、この「層」と「責務」を変えずに micro-optimization を繰り返すフェーズとする。
---