Update CURRENT_TASK: Phase 3 D1 Complete (GO, +1.06%)
@@ -244,15 +244,68 @@
- B4 (Wrapper split): +1.47%
- C3 (Static routing): +2.20%
- C2 (Metadata cache): -0.45%
- **Total: ~6.1%** (baseline 37.5M → ~39.8M ops/s)
- D1 (Free route cache): +1.06%
- **Total: ~7.2%** (baseline 37.5M → ~40.2M ops/s)

**Commit**: `deecda733`
**Commit**: `f059c0ec8`

**Priority C2** - Slab metadata cache optimization:
- Profile cache-miss hotspots (policy struct, slab metadata)
- Hot/cold split of metadata
- Inline first slab descriptor
- Expected: +5-10%
#### Phase 3 D1: Free Path Route Cache ✅ GO (+1.06%)

**Design memo**: `docs/analysis/PHASE3_D1_FREE_ROUTE_CACHE_1_DESIGN.md`

**Goal**: Reduce the cost of `tiny_route_for_class()` on the free path (4.39% self + 24.78% children)

**Implementation complete** ✅:
- `core/box/tiny_free_route_cache_env_box.h` (ENV gate + lazy init)
- `core/front/malloc_tiny_fast.h` (lines 373-385, 780-791) - route cache integration at two sites
  - `free_tiny_fast_cold()` path: direct `g_tiny_route_class[]` lookup
  - `legacy_fallback` path: direct `g_tiny_route_class[]` lookup
- Fallback safety: `g_tiny_route_snapshot_done` check before cache use
- ENV gate: `HAKMEM_FREE_STATIC_ROUTE=0/1` (default OFF)
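The gate-plus-fallback shape listed above can be sketched as follows. This is a minimal illustration, not the code in `malloc_tiny_fast.h`: the type `tiny_route_t`, the constant `TINY_NUM_CLASSES`, the flag `g_free_static_route`, the helper names `free_static_route_enabled()` / `free_route_for_class()`, and the placeholder body of `tiny_route_for_class()` are all assumptions. Only `g_tiny_route_class[]`, `g_tiny_route_snapshot_done`, and `HAKMEM_FREE_STATIC_ROUTE` come from the notes.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real definitions live in the hakmem tree. */
enum { TINY_NUM_CLASSES = 8 };            /* assumed class count (C0..C7) */
typedef uint8_t tiny_route_t;             /* assumed route-kind type */

static tiny_route_t g_tiny_route_class[TINY_NUM_CLASSES]; /* C3 static-routing table */
static int g_tiny_route_snapshot_done;    /* nonzero once the table is valid */
static int g_free_static_route = -1;      /* -1 = ENV not read yet (assumed name) */

/* Placeholder for the slower lookup that the cache bypasses. */
static tiny_route_t tiny_route_for_class(int class_idx) {
    (void)class_idx;
    return 0;
}

/* ENV gate with lazy init: read HAKMEM_FREE_STATIC_ROUTE once, default OFF.
 * Not synchronized; a real implementation may need an atomic or a benign race. */
static int free_static_route_enabled(void) {
    if (g_free_static_route < 0) {
        const char *e = getenv("HAKMEM_FREE_STATIC_ROUTE");
        g_free_static_route = (e != NULL && e[0] == '1') ? 1 : 0;
    }
    return g_free_static_route;
}

/* Free-path lookup: direct table read when the gate is ON and the snapshot
 * is initialized, otherwise fall back to the function call. */
static tiny_route_t free_route_for_class(int class_idx) {
    if (free_static_route_enabled() && g_tiny_route_snapshot_done) {
        return g_tiny_route_class[class_idx];
    }
    return tiny_route_for_class(class_idx);
}
```

Checking `g_tiny_route_snapshot_done` before every table read keeps the path correct when a free happens before the snapshot is built; in that window the sketch simply takes the pre-D1 route.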

**A/B test results** ✅ GO:
- Mixed (10-run):
  - Baseline (D1=0): 45,132,610 ops/s (avg), 45,756,040 ops/s (median)
  - Optimized (D1=1): 45,610,062 ops/s (avg), 45,402,234 ops/s (median)
  - **Average gain: +1.06%**, **Median gain: -0.77%**
- **Decision: GO** (average exceeds the +1.0% threshold, although the median moved the other way)
- Action: Keep as ENV-gated optimization (candidate for future default)
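The two gain figures follow directly from the throughput numbers. A small sketch of that arithmetic, where the helper names `pct_gain()` and `is_go()` are hypothetical and the strict `>` comparison is an assumption about how "exceeds" is applied:

```c
#include <stdbool.h>

/* Relative change of `optimized` over `baseline`, in percent. */
static double pct_gain(double baseline, double optimized) {
    return (optimized - baseline) / baseline * 100.0;
}

/* GO rule as described in the notes: the mean gain must exceed +1.0%. */
static bool is_go(double mean_gain_pct) {
    return mean_gain_pct > 1.0;
}
```

`pct_gain(45132610.0, 45610062.0)` is about +1.06 (the averages) and `pct_gain(45756040.0, 45402234.0)` is about -0.77 (the medians), so the decision rests on the mean alone.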

**Rationale**:
- Eliminates `tiny_route_for_class()` call overhead in free path
- Uses existing `g_tiny_route_class[]` cache from Phase 3 C3 (Static Routing)
- Safe fallback: checks snapshot initialization before cache use
- Minimal code footprint: 2 integration points in `malloc_tiny_fast.h`

**Current Cumulative Gain** (Phase 2-3):
- B3 (Routing shape): +2.89%
- B4 (Wrapper split): +1.47%
- C3 (Static routing): +2.20%
- D1 (Free route cache): +1.06%
- **Total: ~7.8%** (cumulative, assuming multiplicative gains: 1.0289 × 1.0147 × 1.0220 × 1.0106 ≈ 1.0783)
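For reference, compounding the per-phase gains multiplicatively can be sketched like this. The function name `compound_gain_pct()` is hypothetical, and the calculation assumes each percentage was measured against the previous phase as its baseline:

```c
#include <stddef.h>

/* Compound a list of percentage gains: product of (1 + g/100), minus 1. */
static double compound_gain_pct(const double *gains_pct, size_t n) {
    double factor = 1.0;
    for (size_t i = 0; i < n; i++) {
        factor *= 1.0 + gains_pct[i] / 100.0;
    }
    return (factor - 1.0) * 100.0;
}
```

For the four entries listed (2.89, 1.47, 2.20, 1.06) this evaluates to about 7.83, against a plain sum of 7.62.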

**Commit**: `f059c0ec8`

#### Phase 3 C4: MIXED MID_V3 Routing Fix ✅ ADOPT

**Key point**: With `MIXED_TINYV3_C7_SAFE`, `HAKMEM_MID_V3_ENABLED=1` causes a large slowdown, so **the preset default was changed to OFF**.

**Changes** (preset):
- `core/bench_profile.h`: set `HAKMEM_MID_V3_ENABLED=0` / `HAKMEM_MID_V3_CLASSES=0x0` for `MIXED_TINYV3_C7_SAFE`
- `docs/analysis/ENV_PROFILE_PRESETS.md`: documents that MID v3 is OFF on the Mixed mainline
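One way to picture a preset default is a small name/value table consulted when a variable is not set. The sketch below is illustrative only and does not reproduce `core/bench_profile.h`: the struct `env_default_t`, the table `k_mixed_defaults`, and both helper functions are hypothetical, and the rule that an explicitly set environment variable overrides the preset is an assumption.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *name;
    const char *value;
} env_default_t;

/* The two defaults named in the change list for MIXED_TINYV3_C7_SAFE. */
static const env_default_t k_mixed_defaults[] = {
    { "HAKMEM_MID_V3_ENABLED", "0"   },
    { "HAKMEM_MID_V3_CLASSES", "0x0" },
};

/* Look up the preset default for `name`; NULL if the preset has none. */
static const char *preset_default(const char *name) {
    size_t n = sizeof k_mixed_defaults / sizeof k_mixed_defaults[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(k_mixed_defaults[i].name, name) == 0) {
            return k_mixed_defaults[i].value;
        }
    }
    return NULL;
}

/* Assumption: a non-empty environment variable wins over the preset. */
static const char *effective_value(const char *name) {
    const char *e = getenv(name);
    return (e != NULL && e[0] != '\0') ? e : preset_default(name);
}
```

Under that assumption, a C6-heavy run could still opt in with `HAKMEM_MID_V3_ENABLED=1` while Mixed runs pick up the OFF default.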

**A/B (Mixed, ws=400, 20M iters, 10-run)**:
- Baseline (MID_V3=1): **mean ~43.33M ops/s**
- Optimized (MID_V3=0): **mean ~48.97M ops/s**
- **Delta: +13%** ✅ (GO)

**Rationale (observed)**:
- Routing C6 to MID_V3 turns `tiny_alloc_route_cold()` → the MID side into a "second hot path", and in Mixed the instruction / cache cost tends to dominate
- The Mixed mainline hits every class frequently, so C6 is faster when left on LEGACY (tiny unified cache)

**Rules**:
- Mixed mainline: MID v3 OFF (default)
- C6-heavy: MID v3 ON (as before)

### Architectural Insight (Long-term)