Update CURRENT_TASK: Phase 3 D1 Complete (GO, +1.06%)

Moe Charm (CI)
2025-12-13 21:44:52 +09:00
parent f059c0ec83
commit 76ea5ad57d


@@ -244,15 +244,68 @@
- B4 (Wrapper split): +1.47%
- C3 (Static routing): +2.20%
- C2 (Metadata cache): -0.45%

Removed:
- **Total: ~6.1%** (baseline 37.5M → ~39.8M ops/s)
**Commit**: `deecda733`
**Priority C2** - Slab metadata cache optimization:
- Profile cache-miss hotspots (policy struct, slab metadata)
- Hot/cold split of metadata
- Inline first slab descriptor
- Expected: +5-10%

Added:
- D1 (Free route cache): +1.06%
- **Total: ~7.2%** (baseline 37.5M → ~40.2M ops/s)
**Commit**: `f059c0ec8`
#### Phase 3 D1: Free Path Route Cache ✅ GO (+1.06%)
**Design notes**: `docs/analysis/PHASE3_D1_FREE_ROUTE_CACHE_1_DESIGN.md`
**Goal**: Reduce the cost of `tiny_route_for_class()` on the free path (4.39% self + 24.78% children)
**Implementation complete** ✅:
- `core/box/tiny_free_route_cache_env_box.h` (ENV gate + lazy init)
- `core/front/malloc_tiny_fast.h` (lines 373-385, 780-791): route cache integrated at 2 sites
  - `free_tiny_fast_cold()` path: direct `g_tiny_route_class[]` lookup
  - `legacy_fallback` path: direct `g_tiny_route_class[]` lookup
- Fallback safety: `g_tiny_route_snapshot_done` check before cache use
- ENV gate: `HAKMEM_FREE_STATIC_ROUTE=0/1` (default OFF)
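The "ENV gate + lazy init" item can be pictured roughly as follows. This is a minimal sketch, not the contents of `tiny_free_route_cache_env_box.h`: the helper names `parse_env_gate()` and `free_static_route_enabled()` and the cached variable are assumptions made for illustration. Only the variable name `HAKMEM_FREE_STATIC_ROUTE` and the default-OFF behavior come from the notes above.

```c
#include <stdlib.h>

/* Hypothetical cached gate state: -1 = not read yet, 0 = OFF, 1 = ON. */
static int g_free_static_route_gate = -1;

/* Interpret the raw environment value; anything other than a leading '1'
 * (including an unset variable) is treated as OFF, matching "default OFF". */
static int parse_env_gate(const char *value)
{
    return (value != NULL && value[0] == '1') ? 1 : 0;
}

/* Lazy init: the environment is read on the first call only, so later
 * calls cost a single load and compare. Thread-safety is ignored here. */
static int free_static_route_enabled(void)
{
    if (g_free_static_route_gate < 0) {
        g_free_static_route_gate =
            parse_env_gate(getenv("HAKMEM_FREE_STATIC_ROUTE"));
    }
    return g_free_static_route_gate;
}
```

Because the value is cached after the first call, changing the environment variable later in the process has no effect in this sketch.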
**A/B test results** ✅ GO:
- Mixed (10-run):
  - Baseline (D1=0): 45,132,610 ops/s (avg), 45,756,040 ops/s (median)
  - Optimized (D1=1): 45,610,062 ops/s (avg), 45,402,234 ops/s (median)
  - **Average gain: +1.06%**, **Median change: -0.77%**
- **Decision: GO** (average gain exceeds the +1.0% threshold, although the median regressed)
- Action: Keep as ENV-gated optimization (candidate for future default)
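The percentages above follow directly from the reported ops/s figures. The sketch below shows the arithmetic; `pct_change()` and `is_go()` are hypothetical helper names, and the 1.0 threshold is the "+1.0% threshold" quoted in the decision line.

```c
/* Relative change of `optimized` over `baseline`, in percent. */
static double pct_change(double baseline, double optimized)
{
    return (optimized / baseline - 1.0) * 100.0;
}

/* GO rule as stated in these notes: the average gain must exceed +1.0%.
 * The median is reported but does not enter the decision. */
static int is_go(double avg_gain_pct)
{
    return avg_gain_pct > 1.0;
}

/* Example with the figures from the 10-run Mixed test:
 *   pct_change(45132610.0, 45610062.0)  -> about +1.06 (average)
 *   pct_change(45756040.0, 45402234.0)  -> about -0.77 (median)
 */
```

With these inputs the average clears the threshold narrowly while the median moves the other way, which is presumably why the change stays behind the ENV gate.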
**Rationale**:
- Eliminates `tiny_route_for_class()` call overhead in free path
- Uses existing `g_tiny_route_class[]` cache from Phase 3 C3 (Static Routing)
- Safe fallback: checks snapshot initialization before cache use
- Minimal code footprint: 2 integration points in malloc_tiny_fast.h
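The rationale can be sketched as a table lookup guarded by the snapshot flag. The names `g_tiny_route_class[]`, `g_tiny_route_snapshot_done`, and `tiny_route_for_class()` appear in these notes, but their types, the class count, the placeholder routing policy, and the helpers `tiny_route_snapshot_init()` and `free_route_lookup()` are assumptions for illustration only.

```c
#include <stdint.h>

#define TINY_NUM_CLASSES 8  /* assumed class count, for illustration only */

/* Stand-ins for the globals named in the notes; real types may differ. */
static uint8_t g_tiny_route_class[TINY_NUM_CLASSES];
static int     g_tiny_route_snapshot_done = 0;

/* Placeholder for the slower routing call; the real signature and policy
 * are not shown in this document. */
static uint8_t tiny_route_for_class(int class_idx)
{
    return (uint8_t)(class_idx == 7 ? 1 : 0);  /* arbitrary demo policy */
}

/* Hypothetical one-time snapshot: fill the table, then publish the flag. */
static void tiny_route_snapshot_init(void)
{
    for (int c = 0; c < TINY_NUM_CLASSES; c++)
        g_tiny_route_class[c] = tiny_route_for_class(c);
    g_tiny_route_snapshot_done = 1;
}

/* Free-path lookup: read the table only when the gate is on and the
 * snapshot exists; otherwise fall back to the function call. */
static uint8_t free_route_lookup(int class_idx, int gate_enabled)
{
    if (gate_enabled && g_tiny_route_snapshot_done)
        return g_tiny_route_class[class_idx];   /* cached, no call */
    return tiny_route_for_class(class_idx);     /* safe fallback */
}
```

The fallback branch keeps behavior identical when the gate is OFF or the table has not been populated yet, so the cache only ever replaces a call whose result is already known.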
**Current Cumulative Gain** (Phase 2-3):
- B3 (Routing shape): +2.89%
- B4 (Wrapper split): +1.47%
- C3 (Static routing): +2.20%
- D1 (Free route cache): +1.06%
- **Total: ~7.8%** (cumulative, assuming multiplicative gains)
**Commit**: `f059c0ec8`
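For reference, the two ways of combining the four listed gains can be computed as below. This is a small arithmetic sketch with hypothetical helper names; the inputs are the B3, B4, C3, and D1 percentages from the list above.

```c
#include <stddef.h>

/* Additive combination: simple sum of the percentages. */
static double sum_gains(const double *pct, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += pct[i];
    return total;
}

/* Multiplicative combination: compound each gain as a speedup factor,
 * then convert the overall factor back to a percentage. */
static double compound_gains(const double *pct, size_t n)
{
    double factor = 1.0;
    for (size_t i = 0; i < n; i++)
        factor *= 1.0 + pct[i] / 100.0;
    return (factor - 1.0) * 100.0;
}

/* Example: {2.89, 1.47, 2.20, 1.06} sums to 7.62 and compounds to
 * roughly 7.83. */
```

At gains this small the two methods differ by only about 0.2 percentage points, so the choice matters less than which items are included in the list.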
#### Phase 3 C4: MIXED MID_V3 Routing Fix ✅ ADOPT
**Key point**: With `MIXED_TINYV3_C7_SAFE`, `HAKMEM_MID_V3_ENABLED=1` is substantially slower, so **the preset default was changed to OFF**.
**Changes** (preset):
- `core/bench_profile.h`: `MIXED_TINYV3_C7_SAFE` now sets `HAKMEM_MID_V3_ENABLED=0` / `HAKMEM_MID_V3_CLASSES=0x0`
- `docs/analysis/ENV_PROFILE_PRESETS.md`: states explicitly that MID v3 is OFF for the Mixed mainline
**A/B (Mixed, ws=400, 20M iters, 10-run)**:
- Baseline (MID_V3=1): **mean ~43.33M ops/s**
- Optimized (MID_V3=0): **mean ~48.97M ops/s**
- **Delta: +13%** ✅ (GO)
**Rationale (observed)**:
- Routing C6 to MID_V3 turns `tiny_alloc_route_cold()` → the MID side into a "second hot path", and in Mixed the instruction / cache cost tends to dominate
- The Mixed mainline hits every class frequently, so C6 is faster when left on LEGACY (tiny unified cache)
**Rules**:
- Mixed mainline: MID v3 OFF (default)
- C6-heavy: MID v3 ON (as before)
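The two rules can be expressed as a per-profile default table. The sketch below is illustrative and does not reproduce `core/bench_profile.h`: the struct, the function name, the profile string `"C6_HEAVY"`, and the class-mask encoding (bit 6 for C6) are all assumptions. Only `MIXED_TINYV3_C7_SAFE` and its `ENABLED=0` / `CLASSES=0x0` defaults come from the notes above.

```c
#include <string.h>

/* Hypothetical container for the two MID v3 settings named in the notes. */
struct mid_v3_defaults {
    int      enabled;  /* models HAKMEM_MID_V3_ENABLED */
    unsigned classes;  /* models HAKMEM_MID_V3_CLASSES (bitmask, assumed) */
};

/* Return the preset defaults for a profile name. Unknown profiles get the
 * Mixed mainline behavior (MID v3 OFF) in this sketch. */
static struct mid_v3_defaults mid_v3_defaults_for(const char *profile)
{
    struct mid_v3_defaults d = { 0, 0x0 };          /* Mixed: MID v3 OFF */

    if (profile != NULL && strcmp(profile, "C6_HEAVY") == 0) {
        d.enabled = 1;                               /* C6-heavy: MID v3 ON */
        d.classes = 1u << 6;                         /* assumed: bit 6 = C6 */
    }
    return d;
}
```

Keeping the decision in one lookup makes the rule easy to audit: each profile has exactly one place where its MID v3 default is stated.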
### Architectural Insight (Long-term)