Update CURRENT_TASK: Phase 3 C2 Complete (NEUTRAL, research box)

Moe Charm (CI)
2025-12-13 19:20:27 +09:00
parent deecda7336
commit d43a3ce611


@ -197,6 +197,57 @@
- The actual memory stall happens when the slots[] array is accessed (after the prefetch)
- Possible improvement: move the prefetch earlier (before route_kind is determined), or change the shape
#### Phase 3 C2: Slab Metadata Cache Optimization 🔬 NEUTRAL / FREEZE
**Design memo**: `docs/analysis/PHASE3_C2_METADATA_CACHE_1_DESIGN.md`
**Goal**: Improve cache locality of metadata access (policy snapshot, slab descriptor) on the free path
**3 patches implemented** ✅:
1. **Policy Hot Cache** (Patch 1):
- TinyPolicyHot struct: caches route_kind[8] in TLS (9 bytes, packed)
- Reduces policy_snapshot() calls (saves ~2 memory ops)
- Safety: automatically disabled while learner v7 is active
- Files: `core/box/tiny_metadata_cache_env_box.h`, `tiny_metadata_cache_hot_box.{h,c}`
- Integration: `core/front/malloc_tiny_fast.h` (line 256) route selection
2. **First Page Inline Cache** (Patch 2):
- TinyFirstPageCache struct: caches the current slab page pointer in TLS, per class
- Avoids the superslab metadata lookup (1-2 memory ops)
- Fast-path check in `tiny_legacy_fallback_free_base()`
- Files: `core/front/tiny_first_page_cache.h`, `tiny_unified_cache.c`
- Integration: `core/box/tiny_legacy_fallback_box.h` (lines 27-36)
3. **Bounds Check Compile-out** (Patch 3):
- Turns the unified_cache capacity into a macro constant (hardcoded to 2048)
- Lets the modulo operation be optimized at compile time (`& MASK`)
- Macros: `TINY_UNIFIED_CACHE_CAPACITY_POW2=11`, `CAPACITY=2048`, `MASK=2047`
- File: `core/front/tiny_unified_cache.h` (lines 35-41)
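The idea behind Patch 1 can be sketched as follows. This is a minimal illustration, not the code in `tiny_metadata_cache_hot_box.{h,c}`: only the `TinyPolicyHot` name, the `route_kind[8]` array, the 9-byte size, and the "disable while the learner is active" rule come from the notes above. The `valid` byte, the `g_*` stand-ins, and the function names `policy_route_slow` and `tiny_route_kind` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TINY_NUM_CLASSES 8 /* assumed from route_kind[8] */

/* Hypothetical layout: 8 route bytes + 1 validity byte = 9 bytes. */
typedef struct {
    uint8_t route_kind[TINY_NUM_CLASSES];
    uint8_t valid;
} TinyPolicyHot;

_Static_assert(sizeof(TinyPolicyHot) == 9, "expected a 9-byte hot struct");

/* Stand-ins for the real policy source and learner state (assumptions). */
static uint8_t g_policy_routes[TINY_NUM_CLASSES] = {0, 1, 1, 2, 2, 2, 3, 3};
static bool g_learner_active = false;
static unsigned g_slow_lookups = 0; /* counts slow-path reads, for the demo only */

/* One cache per thread, so the hot path needs no synchronization. */
static _Thread_local TinyPolicyHot t_policy_hot;

/* Slow path: models the policy_snapshot() read that the cache tries to avoid. */
static uint8_t policy_route_slow(unsigned class_idx) {
    g_slow_lookups++;
    return g_policy_routes[class_idx];
}

/* Route selection: bypass the cache while the learner may change routes. */
static uint8_t tiny_route_kind(unsigned class_idx) {
    assert(class_idx < TINY_NUM_CLASSES);
    if (g_learner_active) {
        t_policy_hot.valid = 0; /* drop the cache so stale routes are never served */
        return policy_route_slow(class_idx);
    }
    if (!t_policy_hot.valid) {
        /* Fill all classes once; later calls are a single TLS byte load. */
        for (unsigned i = 0; i < TINY_NUM_CLASSES; i++)
            t_policy_hot.route_kind[i] = policy_route_slow(i);
        t_policy_hot.valid = 1;
    }
    return t_policy_hot.route_kind[class_idx];
}
```

In this sketch the learner check sits on every call, which is the kind of interlock cost that can cancel out the saved memory operations.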
**A/B test results** 🔬 NEUTRAL:
- Mixed (10-run):
- Baseline (C2=0): 40,433,519 ops/s (avg), 40,722,094 ops/s (median)
- Optimized (C2=1): 40,252,836 ops/s (avg), 40,291,762 ops/s (median)
- **Average gain: -0.45%**, **Median gain: -1.06%**
- **Decision: NEUTRAL** (average within the ±1.0% threshold; the median sits marginally outside it at -1.06%)
- Action: Keep as research box (ENV gate OFF by default)
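For reference, the gain figures above follow from the throughput numbers as a plain relative change. The sketch below reproduces that arithmetic and a symmetric-threshold classification. The function names and the `AB_GO` / `AB_REGRESS` labels are hypothetical; only NEUTRAL and the ±1.0% threshold appear in these notes.

```c
#include <assert.h>

/* Hypothetical verdict labels; only NEUTRAL is named in the notes. */
typedef enum { AB_REGRESS = -1, AB_NEUTRAL = 0, AB_GO = 1 } AbVerdict;

/* Relative change of `optimized` over `baseline`, in percent. */
static double ab_gain_percent(double baseline, double optimized) {
    assert(baseline > 0.0);
    return (optimized - baseline) / baseline * 100.0;
}

/* Classify a gain against a symmetric threshold given in percent. */
static AbVerdict ab_classify(double gain_pct, double threshold_pct) {
    if (gain_pct > threshold_pct) return AB_GO;
    if (gain_pct < -threshold_pct) return AB_REGRESS;
    return AB_NEUTRAL;
}
```

Feeding in the averages (40,433,519 and 40,252,836 ops/s) gives about -0.45% and a neutral verdict at a 1.0% threshold. The medians (40,722,094 and 40,291,762 ops/s) give about -1.06%, which would fall just outside the same threshold.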
**Rationale**:
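- Policy hot cache: the interlock with the learner is expensive (checked on every probe)
- First page cache: the current free path only does a unified_cache push (no superslab lookup)
- To take effect it would have to be integrated into the drain path (a future optimization)
- Bounds check: the compiler had already optimized this (power-of-2 detection)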
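The bounds-check point can be illustrated with a small sketch. For an unsigned index and a power-of-two capacity, `% CAPACITY` and `& MASK` compute the same value, and optimizing compilers typically lower the former to the latter on their own, so making the mask explicit leaves little to gain. The values 11, 2048, and 2047 come from the notes above. The full macro names for capacity and mask, and the `ring_next_*` helpers, are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define TINY_UNIFIED_CACHE_CAPACITY_POW2 11
/* Full macro names below are assumed; the notes only give CAPACITY and MASK. */
#define TINY_UNIFIED_CACHE_CAPACITY (1u << TINY_UNIFIED_CACHE_CAPACITY_POW2) /* 2048 */
#define TINY_UNIFIED_CACHE_MASK (TINY_UNIFIED_CACHE_CAPACITY - 1u)           /* 2047 */

/* The mask trick is only valid when the capacity is a power of two. */
_Static_assert((TINY_UNIFIED_CACHE_CAPACITY & TINY_UNIFIED_CACHE_MASK) == 0,
               "capacity must be a power of two");

/* Advance a ring index using modulo (what the code looked like conceptually). */
static inline uint32_t ring_next_mod(uint32_t idx) {
    return (idx + 1u) % TINY_UNIFIED_CACHE_CAPACITY;
}

/* Advance a ring index using the explicit mask (the Patch 3 form). */
static inline uint32_t ring_next_mask(uint32_t idx) {
    return (idx + 1u) & TINY_UNIFIED_CACHE_MASK;
}
```

The equivalence holds for unsigned operands only. With a signed index the compiler must preserve the sign of the remainder, so the two forms are not interchangeable there.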
**Current Cumulative Gain** (Phase 2-3):
- B3 (Routing shape): +2.89%
- B4 (Wrapper split): +1.47%
- C3 (Static routing): +2.20%
- C2 (Metadata cache): -0.45%
- **Total: ~6.1%** (baseline 37.5M → ~39.8M ops/s)
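As a cross-check of the total, the sketch below computes the cumulative figure in two ways: as a plain sum of the per-phase percentages, and as a compounded product. It assumes the four gains can be combined at all, which is an approximation because each was measured against its own baseline. The array and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Per-phase gains in percent, as listed above: B3, B4, C3, C2. */
static const double PHASE_GAINS_PCT[4] = {2.89, 1.47, 2.20, -0.45};

/* Additive reading: simple sum of the percentages. */
static double gain_sum_percent(const double *gains_pct, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) sum += gains_pct[i];
    return sum;
}

/* Compounded reading: product of (1 + g/100), expressed back in percent. */
static double gain_compound_percent(const double *gains_pct, size_t n) {
    double factor = 1.0;
    for (size_t i = 0; i < n; i++) factor *= 1.0 + gains_pct[i] / 100.0;
    return (factor - 1.0) * 100.0;
}
```

The sum comes to about 6.1%, matching the total quoted above, and applied to the 37.5M ops/s baseline it gives roughly 39.8M ops/s. The compounded value is slightly higher, at about 6.2%.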
**Commit**: `deecda733`
**Priority C2** - Slab metadata cache optimization:
- Profile cache-miss hotspots (policy struct, slab metadata)
- Hot/cold split of metadata