Phase FREE-FRONT-V3-1: Free route snapshot infrastructure + build fix

Summary:
========
Implemented the Phase FREE-FRONT-V3 infrastructure to optimize the free hot path by:
1. Creating a snapshot-based route decision table (consolidating route logic)
2. Preparing to remove redundant ENV checks from the hot path (the hot path itself is unchanged in this commit)
3. Preparing for future integration into hak_free_at()

Key Changes:
============

1. NEW FILES:
   - core/box/free_front_v3_env_box.h: Route snapshot definition & API
   - core/box/free_front_v3_env_box.c: Snapshot initialization & caching

2. Infrastructure Details:
   - FreeRouteSnapshotV3: Maps class_idx → free_route_kind for all 8 classes
   - Routes defined: LEGACY, TINY_V3, CORE_V6_C6, POOL_V1
   - ENV-gated initialization (HAKMEM_TINY_FREE_FRONT_V3_ENABLED, default OFF)
   - Per-thread TLS caching to avoid repeated ENV reads
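
   A minimal sketch of how such a snapshot could look, assuming C11
   `_Thread_local` and a stand-in per-class policy. Only the type name
   FreeRouteSnapshotV3, the four route names, the class count (8) and the ENV
   variable come from this commit; every function name, the field layout, and
   the "C6 goes to CORE_V6_C6" policy are hypothetical and are not the contents
   of free_front_v3_env_box.h.

   ```c
   #include <stdlib.h>

   #define FREE_ROUTE_NUM_CLASSES 8 /* C0..C7, per the commit message */

   typedef enum {
       FREE_ROUTE_LEGACY = 0,
       FREE_ROUTE_TINY_V3,
       FREE_ROUTE_CORE_V6_C6,
       FREE_ROUTE_POOL_V1
   } free_route_kind_t;

   typedef struct {
       unsigned char route[FREE_ROUTE_NUM_CLASSES]; /* class_idx -> route kind */
       int enabled;                                 /* ENV gate, resolved once */
   } FreeRouteSnapshotV3;

   /* Hypothetical stand-in for the real per-class decision
    * (the commit consolidates tiny_route_for_class() results instead). */
   static free_route_kind_t sketch_route_for_class(int class_idx) {
       return class_idx == 6 ? FREE_ROUTE_CORE_V6_C6 : FREE_ROUTE_LEGACY;
   }

   /* Assumption: only the literal value "1" enables the feature; unset means OFF. */
   static int sketch_parse_enabled(const char *value) {
       return value != NULL && value[0] == '1' && value[1] == '\0';
   }

   /* Fill the table once; a disabled snapshot routes everything to LEGACY. */
   static void free_route_snapshot_build(FreeRouteSnapshotV3 *snap, int enabled) {
       snap->enabled = enabled;
       for (int i = 0; i < FREE_ROUTE_NUM_CLASSES; i++) {
           snap->route[i] = (unsigned char)(enabled ? sketch_route_for_class(i)
                                                    : FREE_ROUTE_LEGACY);
       }
   }

   static _Thread_local FreeRouteSnapshotV3 t_snapshot;
   static _Thread_local int t_snapshot_ready;

   /* Per-thread accessor: getenv() runs at most once per thread. */
   static const FreeRouteSnapshotV3 *free_route_snapshot_get(void) {
       if (!t_snapshot_ready) {
           const char *env = getenv("HAKMEM_TINY_FREE_FRONT_V3_ENABLED");
           free_route_snapshot_build(&t_snapshot, sketch_parse_enabled(env));
           t_snapshot_ready = 1;
       }
       return &t_snapshot;
   }
   ```

   After the first call on a thread, a route lookup is a single indexed load
   from `route[class_idx]` with no further ENV reads.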

3. Design Goals:
   - Consolidate tiny_route_for_class() results into snapshot table
   - Remove C7 ULTRA / v4 / v5 / v6 ENV checks from hot path
   - Limit lookup (ss_fast_lookup/slab_index_for) to paths that truly need it
   - Clear ownership boundary: front v3 handles routing, downstream handles free
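
   The ownership boundary above could be sketched as a single switch on the
   snapshot entry. This is an illustration under stated assumptions, not the
   planned hak_free_at() code: the handler counters, `sketch_slab_lookup`, and
   the choice that only the LEGACY route performs the lookup are all
   hypothetical.

   ```c
   #include <stddef.h>

   typedef enum {
       ROUTE_LEGACY = 0,
       ROUTE_TINY_V3,
       ROUTE_CORE_V6_C6,
       ROUTE_POOL_V1,
       ROUTE_KIND_COUNT
   } route_kind_t;

   typedef struct {
       unsigned char route[8]; /* class_idx -> route kind, filled at init */
   } route_snapshot_t;

   /* Counters stand in for the downstream free implementations. */
   static int g_handled[ROUTE_KIND_COUNT];
   static int g_lookup_calls;

   /* Hypothetical stand-in for ss_fast_lookup / slab_index_for. */
   static int sketch_slab_lookup(void *ptr) {
       (void)ptr;
       g_lookup_calls++;
       return 0;
   }

   /* Front: decides the route only. Downstream: performs the actual free. */
   static void sketch_free_front(const route_snapshot_t *snap, void *ptr, int class_idx) {
       route_kind_t kind = (route_kind_t)snap->route[class_idx & 7];
       switch (kind) {
       case ROUTE_TINY_V3:
       case ROUTE_CORE_V6_C6:
       case ROUTE_POOL_V1:
           /* Assumption: these routes do not need the slab lookup. */
           g_handled[kind]++;
           break;
       case ROUTE_LEGACY:
       default:
           /* Assumption: only the legacy path pays for the lookup. */
           (void)sketch_slab_lookup(ptr);
           g_handled[ROUTE_LEGACY]++;
           break;
       }
   }
   ```

   The point of the shape is that no ENV check appears between the table load
   and the handler call.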

4. Phase Plan:
   - v3-1 (COMPLETE): Infrastructure (snapshot table, ENV initialization, TLS cache)
   - v3-2 (INFRASTRUCTURE ONLY): Placeholder integration in hak_free_api.inc.h
   - v3-3 (FUTURE): Full integration + benchmark A/B to measure hotpath improvement

5. BUILD FIX:
   - Added missing core/box/c7_meta_used_counter_box.o to OBJS_BASE in Makefile
   - Its symbols were referenced but the object was never linked, causing undefined reference errors
   - Benchmark targets now build cleanly without LTO

Status:
=======
- Build: PASS (bench_allocators_hakmem builds without errors)
- Integration: Currently DISABLED (default OFF, ready for v3-2 phase)
- No performance impact: Infrastructure-only, hotpath unchanged

Future Work:
============
- Phase v3-2: Integrate snapshot routing into hak_free_at() main path
- Phase v3-3: Measure free hot path performance improvement (target: 1-2% fewer branch mispredictions)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Moe Charm (CI)
2025-12-11 19:17:30 +09:00
parent 224cc8d1ca
commit 7b7de53167
14 changed files with 462 additions and 10 deletions


@@ -168,6 +168,39 @@ Step 2.5 was added to "fix" TLS_SLL_PUSH_DUP, but
---
## Phase FREE-LEGACY-OPT series (2025-12-11)
### Phase FREE-LEGACY-OPT-4-1: Legacy per-class analysis ✅ Done
**Goal**: Break down the 49.2% Legacy fallback by class
**Measurement results (Mixed 16-1024B)**:
- **C6 (513-1024B)**: 51.4% (137,319 / 266,942 Legacy calls)
- C5 (257-512B): 25.8%
- C4 (129-256B): 13.0%
- C3 (65-128B): 6.5%
- C2 (33-64B): 3.3%
- C0/C1/C7: 0.0%
**Primary target**: C6 accounts for more than half of all Legacy calls
**Details**: see `docs/analysis/FREE_LEGACY_PATH_ANALYSIS.md`
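
The arithmetic behind these figures can be checked with a short sketch. The three constants are copied from the measurement above; the function names are hypothetical. It also derives a bound the measurement implies: if every C6 Legacy free were diverted elsewhere, the Legacy fallback would drop from 49.2% to roughly 23.9%.

```c
/* Figures copied from the per-class measurement. */
static const double LEGACY_FALLBACK_PCT = 49.2;
static const long C6_LEGACY_CALLS = 137319;
static const long TOTAL_LEGACY_CALLS = 266942;

/* Fraction of Legacy calls that belong to C6 (about 0.514). */
static double c6_share_of_legacy(void) {
    return (double)C6_LEGACY_CALLS / (double)TOTAL_LEGACY_CALLS;
}

/* Legacy fallback percentage left if all C6 Legacy frees are diverted. */
static double legacy_pct_without_c6(void) {
    return LEGACY_FALLBACK_PCT * (1.0 - c6_share_of_legacy());
}
```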
### Phase FREE-LEGACY-OPT-4-2: C6_ULTRA_FREE_BOX implementation (in progress)
**Goal**: Catch only C6 frees in a C7 ULTRA-style TLS cache and halve the Legacy fallback
**Implementation scope**:
- C6 only, free only (alloc stays on the existing route)
- TLS holds `c6_freelist[32]` + `c6_count` + a segment range check
- ENV: `HAKMEM_TINY_C6_ULTRA_FREE_ENABLED=0` (research box, default OFF)
**Expected effect**:
- Legacy fallback: 49.2% → 24-27% (removes the C6 share)
- Mixed throughput: +5-8% improvement (44.8M → 47-48M ops/s)
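
A minimal sketch of the TLS free cache described above, assuming C11 `_Thread_local`. Only `c6_freelist[32]` and `c6_count` are named in this document; the struct name, the `seg_base`/`seg_end` fields, and `c6_ultra_free_try` are hypothetical, and the real box may differ in how it learns the segment range and what it does on overflow.

```c
#include <stddef.h>
#include <stdint.h>

#define C6_ULTRA_CAP 32

typedef struct {
    void *c6_freelist[C6_ULTRA_CAP]; /* cached C6 blocks awaiting reuse */
    unsigned c6_count;               /* number of valid entries */
    uintptr_t seg_base;              /* hypothetical: start of the C6 segment */
    uintptr_t seg_end;               /* hypothetical: one past the segment end */
} c6_ultra_tls_t;

static _Thread_local c6_ultra_tls_t t_c6;

/* Returns 1 if the pointer was cached in TLS, 0 if the caller should
 * fall back to the existing (Legacy) free path. */
static int c6_ultra_free_try(void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    if (p < t_c6.seg_base || p >= t_c6.seg_end) {
        return 0; /* segment range check failed: not ours */
    }
    if (t_c6.c6_count >= C6_ULTRA_CAP) {
        return 0; /* cache full: let the slow path handle it */
    }
    t_c6.c6_freelist[t_c6.c6_count++] = ptr;
    return 1;
}
```

Under these assumptions the fast path is two comparisons and one store, with no ENV read or slab lookup.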
---
## 🎯 Next actions
### Current options