Update CURRENT_TASK: Phase 3 C2 Complete (NEUTRAL, research box)
@ -197,6 +197,57 @@
- The actual memory stall occurs when the slots[] array is accessed (after the prefetch)
- Improvement idea: move the prefetch earlier (before route_kind is decided), or change the code shape

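As a rough illustration of the "move the prefetch earlier" idea, the sketch below issues the prefetch for the target slot before the route decision, so the memory load can overlap with that work. All `demo_*` names, the struct layout, and the size-based route rule are hypothetical stand-ins, not the project's code; `__builtin_prefetch` is the GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache layout, for illustration only. */
typedef struct {
    void    *slots[2048];
    unsigned tail;
} demo_cache_t;

/* Hypothetical route decision: 0 = cache push, 1 = some other path. */
static int demo_decide_route(size_t size) {
    return size <= 64 ? 0 : 1;
}

/* Push with the prefetch hoisted above the route decision. */
static int demo_push_early_prefetch(demo_cache_t *c, void *p, size_t size) {
    assert(c != NULL);
    /* Prefetch for write (rw = 1) with high temporal locality (3),
       before route_kind is known. */
    __builtin_prefetch(&c->slots[c->tail], 1, 3);

    int route = demo_decide_route(size);
    if (route != 0) {
        return 0;               /* not handled by this path */
    }
    c->slots[c->tail] = p;      /* the access the prefetch was meant to hide */
    c->tail = (c->tail + 1u) & 2047u;
    return 1;
}
```

The trade-off in this shape is that the prefetch is also issued for requests that end up on another route, so it may cost bandwidth without benefit; whether it helps would need an A/B run.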
#### Phase 3 C2: Slab Metadata Cache Optimization 🔬 NEUTRAL / FREEZE

**Design memo**: `docs/analysis/PHASE3_C2_METADATA_CACHE_1_DESIGN.md`

**Goal**: Improve cache locality of metadata accesses (policy snapshot, slab descriptor) on the free path

**3 patches implemented** ✅:

1. **Policy Hot Cache** (Patch 1):
   - TinyPolicyHot struct: caches route_kind[8] in TLS (9 bytes packed)
   - Reduces policy_snapshot() calls (saves ~2 memory ops)
   - Safety: automatically disabled while learner v7 is active
   - Files: `core/box/tiny_metadata_cache_env_box.h`, `tiny_metadata_cache_hot_box.{h,c}`
   - Integration: `core/front/malloc_tiny_fast.h` (line 256) route selection

2. **First Page Inline Cache** (Patch 2):
   - TinyFirstPageCache struct: caches the current slab page pointer in TLS, per class
   - Avoids the superslab metadata lookup (1-2 memory ops)
   - Fast-path check in `tiny_legacy_fallback_free_base()`
   - Files: `core/front/tiny_first_page_cache.h`, `tiny_unified_cache.c`
   - Integration: `core/box/tiny_legacy_fallback_box.h` (lines 27-36)

3. **Bounds Check Compile-out** (Patch 3):
   - unified_cache capacity turned into a macro constant (hardcoded to 2048)
   - Modulo operation optimized at compile time (`& MASK`)
   - Macros: `TINY_UNIFIED_CACHE_CAPACITY_POW2=11`, `CAPACITY=2048`, `MASK=2047`
   - File: `core/front/tiny_unified_cache.h` (lines 35-41)

**A/B test results** 🔬 NEUTRAL:
- Mixed (10-run):
  - Baseline (C2=0): 40,433,519 ops/s (avg), 40,722,094 ops/s (median)
  - Optimized (C2=1): 40,252,836 ops/s (avg), 40,291,762 ops/s (median)
  - **Average gain: -0.45%**, **Median gain: -1.06%**
- **Decision: NEUTRAL** (average within the ±1.0% threshold; the median, at -1.06%, falls marginally outside it)
- Action: Keep as research box (ENV gate OFF by default)

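The gain figures above follow from the relative change between the two throughput numbers. The sketch below reproduces that arithmetic and applies a ±threshold rule; the `demo_*` names and the GO / NO-GO labels are assumptions for illustration (the notes only name NEUTRAL), and it is assumed here that the decision is taken on the average.

```c
#include <assert.h>

/* Relative change of `optimized` versus `baseline`, in percent. */
static double demo_gain_percent(double baseline, double optimized) {
    assert(baseline > 0.0);
    return (optimized - baseline) / baseline * 100.0;
}

/* Hypothetical verdict labels; only NEUTRAL appears in the notes. */
typedef enum { DEMO_NO_GO = -1, DEMO_NEUTRAL = 0, DEMO_GO = 1 } demo_verdict_t;

/* Classify a gain against a symmetric +/- threshold (in percent). */
static demo_verdict_t demo_classify(double gain_pct, double threshold_pct) {
    if (gain_pct > threshold_pct)  return DEMO_GO;
    if (gain_pct < -threshold_pct) return DEMO_NO_GO;
    return DEMO_NEUTRAL;
}
```

With the numbers above, `demo_gain_percent(40433519, 40252836)` is about -0.45 and classifies as neutral at a 1.0 threshold, while the median pair gives about -1.06, just past the threshold.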
**Rationale**:
- Policy hot cache: the interlock with the learner is expensive (checked on every probe)
- First page cache: the current free path only does a unified_cache push (no superslab lookup)
  - Needs integration into the drain path to have an effect (future optimization)
- Bounds check: already optimized by the compiler (power-of-2 detection)

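To make the interlock cost in the first point concrete, here is a minimal sketch of a TLS policy cache that must check the learner state on every probe. Everything prefixed `demo_` is hypothetical, including the use of the ninth byte as a valid flag; the real `TinyPolicyHot` layout and learner API are not shown in these notes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NUM_CLASSES 8

/* 8 route kinds + 1 flag byte = 9 bytes (flag usage is an assumption). */
typedef struct {
    uint8_t route_kind[DEMO_NUM_CLASSES];
    uint8_t valid;
} demo_policy_hot_t;

_Static_assert(sizeof(demo_policy_hot_t) == 9, "expected a 9-byte packed struct");

static _Thread_local demo_policy_hot_t demo_tls_hot;

/* Stand-ins for the shared policy and the learner state. */
static uint8_t  demo_global_policy[DEMO_NUM_CLASSES] = {0, 1, 1, 2, 2, 3, 3, 3};
static bool     demo_learner_active = false;
static unsigned demo_slow_lookups   = 0;   /* counts shared-policy reads */

static void demo_refill_tls(void) {
    demo_slow_lookups++;
    for (unsigned i = 0; i < DEMO_NUM_CLASSES; i++) {
        demo_tls_hot.route_kind[i] = demo_global_policy[i];
    }
    demo_tls_hot.valid = 1;
}

static uint8_t demo_route_kind(unsigned cls) {
    assert(cls < DEMO_NUM_CLASSES);
    /* Interlock: this branch runs on every probe, even on cache hits. */
    if (demo_learner_active) {
        demo_tls_hot.valid = 0;            /* cached copy may go stale */
        demo_slow_lookups++;
        return demo_global_policy[cls];
    }
    if (!demo_tls_hot.valid) {
        demo_refill_tls();
    }
    return demo_tls_hot.route_kind[cls];
}
```

In this shape the learner check sits on the hot path regardless of whether the cache hits, which is one plausible reading of why the saved memory ops did not translate into throughput.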
**Current Cumulative Gain** (Phase 2-3):
- B3 (Routing shape): +2.89%
- B4 (Wrapper split): +1.47%
- C3 (Static routing): +2.20%
- C2 (Metadata cache): -0.45%
- **Total: ~6.1%** (baseline 37.5M → ~39.8M ops/s)

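The total above matches a simple sum of the per-phase percentages. The sketch below shows that sum next to a compounded product for comparison; the `demo_*` helper names are hypothetical and the inputs are the four figures listed above.

```c
#include <assert.h>
#include <stddef.h>

/* Additive total: sum of per-phase gains, in percent. */
static double demo_sum_gain(const double *gains_pct, size_t n) {
    assert(gains_pct != NULL);
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += gains_pct[i];
    }
    return total;
}

/* Compounded total: product of (1 + g/100) factors, back in percent. */
static double demo_compound_gain(const double *gains_pct, size_t n) {
    assert(gains_pct != NULL);
    double factor = 1.0;
    for (size_t i = 0; i < n; i++) {
        factor *= 1.0 + gains_pct[i] / 100.0;
    }
    return (factor - 1.0) * 100.0;
}

/* Throughput after applying a total gain (percent) to a baseline. */
static double demo_project(double baseline_ops, double gain_pct) {
    return baseline_ops * (1.0 + gain_pct / 100.0);
}
```

For `{2.89, 1.47, 2.20, -0.45}` the additive total is 6.11% and the compounded total is roughly 6.2%; applying 6.11% to 37.5M ops/s gives about 39.8M ops/s, in line with the figure quoted.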
**Commit**: `deecda733`

**Priority C2** - Slab metadata cache optimization:
- Profile cache-miss hotspots (policy struct, slab metadata)
- Hot/cold split of metadata