Phase 15 v1: UnifiedCache FIFO→LIFO NEUTRAL (-0.70% Mixed, +0.42% C7)
Transform existing array-based UnifiedCache from FIFO ring to LIFO stack.
A/B Results:
- Mixed (16-1024B): -0.70% (52,965,966 → 52,593,948 ops/s)
- C7-only (1025-2048B): +0.42% (78,010,783 → 78,335,509 ops/s)
Verdict: NEUTRAL (both below +1.0% GO threshold) - freeze as research box
Implementation:
- L0 ENV gate: tiny_unified_lifo_env_box.{h,c} (HAKMEM_TINY_UNIFIED_LIFO=0/1)
- L1 LIFO ops: tiny_unified_lifo_box.h (unified_cache_try_pop/push_lifo)
- L2 integration: tiny_front_hot_box.h (mode check at entry)
- Reuses existing slots[] array (no intrusive pointers)
Root Causes:
1. Mode check overhead (tiny_unified_lifo_enabled() call)
2. Minimal LIFO vs FIFO locality delta in practice
3. Existing FIFO ring already well-optimized
Bonus Fix: LTO bug for tiny_c7_preserve_header_enabled() (Phase 13/14 latent issue)
- Converted static inline to extern + non-inline implementation
- Fixes undefined reference during LTO linking
Design: docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_DESIGN.md
Results: docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_AB_TEST_RESULTS.md
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@@ -268,6 +268,80 @@ Cumulative improvements achieved in Phases 6-10:

**Future Work**: Consider per-class cap tuning or alternative pointer-chase reduction strategies

### Phase 14 v2: Pointer Chase Reduction (Hot Path Integration) - NEUTRAL (+0.08%) ⚠️ RESEARCH BOX

**Date**: 2025-12-15
**Verdict**: **NEUTRAL (+0.08% Mixed)** / **-0.39% (C7-only)**; kept as a research box (default OFF)

**Motivation**: Phase 14 v1 left a suspicion that the alloc side was not actually consuming the tcache, so the tcache was wired into the hot alloc/free paths of `tiny_front_hot_box` and the A/B test was rerun.

**Results** (ops/s):
| Workload | TCACHE=0 | TCACHE=1 | Delta |
|---------|----------|----------|-------|
| Mixed (16–1024B) | 51,287,515 | 51,330,213 | **+0.08%** |
| C7-only | 80,975,651 | 80,660,283 | **-0.39%** |

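**Conclusion**:
- v2 confirmed that the tcache path is now live on the hot path, but it did not improve the Mixed "mainline" workload (below the +1.0% GO threshold)
- Keeping Phase 14 (tcache-style intrusive LIFO) **frozen** remains the appropriate call for now
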
**Possible root causes** (if this is investigated further):
1. The fence/auxiliary work in `tiny_next_load/store` may be too heavy for a TLS-only tcache
2. The fixed cost (load/branch) of `tiny_tcache_enabled/cap` cancels out the savings
3. In Mixed, the per-bin hit rate is thin (workload mismatch)

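For readers unfamiliar with the term, a tcache-style intrusive LIFO stores the "next" pointer inside each free block instead of in a side array. The sketch below shows that general shape only. `SketchTcacheBin` and the `sketch_*` functions are hypothetical names, and the plain `memcpy` load/store stands in for `tiny_next_load/store`, whose real implementation (including any fences) is not shown in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-class bin: a singly linked free list threaded
 * through the free blocks themselves, plus a count and a cap. */
typedef struct {
    void    *head;   /* most recently freed block, or NULL */
    unsigned count;  /* blocks currently in the list       */
    unsigned cap;    /* maximum blocks the bin may hold    */
} SketchTcacheBin;

/* Read the next pointer stored in the first bytes of a free block. */
static inline void *sketch_next_load(void *block) {
    void *next;
    memcpy(&next, block, sizeof next);
    return next;
}

/* Write the next pointer into the first bytes of a free block. */
static inline void sketch_next_store(void *block, void *next) {
    memcpy(block, &next, sizeof next);
}

/* Free side: link the block in front of the current head.
 * Returns 0 when the bin is at its cap so the caller can fall back. */
static int sketch_tcache_push(SketchTcacheBin *bin, void *block) {
    assert(block != NULL);
    if (bin->count >= bin->cap) return 0;
    sketch_next_store(block, bin->head);
    bin->head = block;
    bin->count++;
    return 1;
}

/* Alloc side: unlink the head. This is the dependent load ("pointer
 * chase"): the new head is only known after reading the old head block. */
static void *sketch_tcache_pop(SketchTcacheBin *bin) {
    void *block = bin->head;
    if (block == NULL) return NULL;
    bin->head = sketch_next_load(block);
    bin->count--;
    return block;
}
```

Every pop touches the block it returns, and every push and pop also pays for the cap/enabled checks, which is the fixed cost that root cause 2 refers to.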
**Refs**:
- v2 results: `docs/analysis/PHASE14_POINTER_CHASE_REDUCTION_2_AB_TEST_RESULTS.md`
- v2 instructions: `docs/analysis/PHASE14_POINTER_CHASE_REDUCTION_2_NEXT_INSTRUCTIONS.md`

---

### Phase 15 v1: UnifiedCache FIFO→LIFO (Stack) - NEUTRAL (-0.70% Mixed, +0.42% C7) ⚠️ RESEARCH BOX

**Date**: 2025-12-15
**Verdict**: **NEUTRAL (-0.70% Mixed, +0.42% C7-only)**; kept as a research box (default OFF)

**Motivation**: Because Phase 14 (tcache intrusive) came out NEUTRAL, this phase avoided adding more intrusive pointers and instead changed the existing `TinyUnifiedCache.slots[]` from a FIFO ring to a LIFO stack, aiming for better locality.

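To make the FIFO-ring versus LIFO-stack distinction concrete, here is a minimal sketch of both disciplines over the same array. `SketchCache`, `SKETCH_CAP`, and the `sketch_*` functions are hypothetical names for illustration; the real `TinyUnifiedCache` layout and the `unified_cache_try_pop/push_lifo` code in `tiny_unified_lifo_box.h` may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAP 8u  /* hypothetical capacity */
static_assert((SKETCH_CAP & (SKETCH_CAP - 1u)) == 0,
              "ring masking below assumes a power-of-two capacity");

/* Hypothetical stand-in for the cache: one slots[] array that can be
 * driven either as a FIFO ring or as a LIFO stack (never both at once). */
typedef struct {
    void    *slots[SKETCH_CAP];
    uint32_t head;  /* FIFO: free-running pop counter; unused by LIFO      */
    uint32_t tail;  /* FIFO: free-running push counter; LIFO: stack depth  */
} SketchCache;

/* FIFO ring: push at tail, pop at head, so the oldest block is reused first. */
static int sketch_push_fifo(SketchCache *c, void *p) {
    if (c->tail - c->head == SKETCH_CAP) return 0;   /* full */
    c->slots[c->tail++ & (SKETCH_CAP - 1u)] = p;
    return 1;
}

static void *sketch_pop_fifo(SketchCache *c) {
    if (c->head == c->tail) return NULL;             /* empty */
    return c->slots[c->head++ & (SKETCH_CAP - 1u)];
}

/* LIFO stack: push and pop at the same end, so the newest block is reused first. */
static int sketch_push_lifo(SketchCache *c, void *p) {
    if (c->tail == SKETCH_CAP) return 0;             /* full */
    c->slots[c->tail++] = p;
    return 1;
}

static void *sketch_pop_lifo(SketchCache *c) {
    if (c->tail == 0) return NULL;                   /* empty */
    return c->slots[--c->tail];
}
```

With the ring, a freed block is handed out again only after every older entry has been popped. With the stack, the most recently freed block, which is the one most likely to still be in cache, comes back first. That is the locality gain this phase was hoping for.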
**Results** (ops/s):
| Workload | LIFO=0 (FIFO) | LIFO=1 (LIFO) | Delta |
|---------|----------|----------|-------|
| Mixed (16–1024B) | 52,965,966 | 52,593,948 | **-0.70%** |
| C7-only (1025–2048B) | 78,010,783 | 78,335,509 | **+0.42%** |

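The Delta column is the relative change of the LIFO run against the FIFO baseline, compared against the +1.0% GO threshold. The sketch below reproduces that arithmetic. The function and enum names are hypothetical, and the symmetric -1.0% NO-GO cutoff is an assumption for illustration: this section only states the +1.0% GO threshold.

```c
#include <assert.h>

typedef enum {
    SKETCH_NO_GO   = -1,
    SKETCH_NEUTRAL = 0,
    SKETCH_GO      = 1
} SketchVerdict;

/* Relative throughput change of the candidate versus the baseline, in percent. */
static double sketch_delta_pct(double baseline_ops, double candidate_ops) {
    assert(baseline_ops > 0.0);
    return (candidate_ops - baseline_ops) / baseline_ops * 100.0;
}

/* GO at +1.0% or better (threshold taken from the text).
 * The -1.0% NO-GO cutoff is an assumed mirror of that threshold. */
static SketchVerdict sketch_verdict(double delta_pct) {
    if (delta_pct >= 1.0)  return SKETCH_GO;
    if (delta_pct <= -1.0) return SKETCH_NO_GO;
    return SKETCH_NEUTRAL;
}
```

Plugging in the table values, `sketch_delta_pct(52965966, 52593948)` comes out at roughly -0.70 and `sketch_delta_pct(78010783, 78335509)` at roughly +0.42, so both rows land in the NEUTRAL band.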
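**Conclusion**:
- Switching to LIFO did not deliver the expected benefit (Mixed regressed and C7 improved slightly, but neither reaches the GO threshold)
- The mode-check branch overhead (`tiny_unified_lifo_enabled()`) offsets the locality gain
- The existing FIFO ring implementation is already sufficiently optimized
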
**Root causes**:
1. Entry-point mode check overhead (`tiny_unified_lifo_enabled()` call)
2. Minimal LIFO vs FIFO locality delta in practice (cache warming mitigates)
3. Existing FIFO ring already well-optimized

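Root cause 1 is about the cost of consulting the gate on every hot-path entry. A cached environment gate of that kind might look like the sketch below. The variable name `HAKMEM_TINY_UNIFIED_LIFO` and its 0/1 values come from the commit message; the function names, the caching scheme, and the parsing rules are assumptions, not the contents of `tiny_unified_lifo_env_box.{h,c}`.

```c
#include <assert.h>
#include <stdlib.h>

/* Cached mode: -1 = not read yet, 0 = FIFO (default), 1 = LIFO.
 * Sketch only: a real gate would need a thread-safe initialization. */
static int g_sketch_lifo_mode = -1;

/* Treat exactly "1" as enabled; anything else, including unset, is disabled. */
static int sketch_parse_flag(const char *value) {
    return (value != NULL && value[0] == '1' && value[1] == '\0') ? 1 : 0;
}

/* Hot-path query: one load and one branch once the value is cached.
 * The environment is read only on the first call. */
static inline int sketch_lifo_enabled(void) {
    if (g_sketch_lifo_mode < 0) {
        g_sketch_lifo_mode = sketch_parse_flag(getenv("HAKMEM_TINY_UNIFIED_LIFO"));
        assert(g_sketch_lifo_mode == 0 || g_sketch_lifo_mode == 1);
    }
    return g_sketch_lifo_mode;
}
```

Even in the cached case, the load and branch are paid on every alloc and free, so the gate has a cost even when the feature is off. That fits the small Mixed regression, though the A/B numbers alone do not isolate it.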
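**Bonus**: Fixed an LTO link bug in `tiny_c7_preserve_header_enabled()` (a latent Phase 13/14 issue) by converting it from `static inline` to an `extern` declaration with a non-inline implementation, which resolves an undefined reference during LTO linking.
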
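The shape of that fix can be sketched as follows, collapsed into one listing for brevity. All `sketch_*` names and the flag variable are hypothetical. The body of the real `tiny_c7_preserve_header_enabled()` is not shown in this document, and neither is the exact reason the inline version failed to link.

```c
#include <assert.h>

/* ---- header side (hypothetical) ----------------------------------------
 * Before: the function body lived in the header as
 *     static inline int sketch_c7_preserve_header_enabled(void) { ... }
 * After: the header carries only a declaration, so every caller refers to
 * one externally visible symbol. */
extern int sketch_c7_preserve_header_enabled(void);
extern void sketch_c7_preserve_header_set(int enabled);

/* ---- implementation side (hypothetical .c file) ------------------------
 * Exactly one translation unit owns the state and the definitions. */
static int g_sketch_preserve_header = 0;

int sketch_c7_preserve_header_enabled(void) {
    return g_sketch_preserve_header;
}

void sketch_c7_preserve_header_set(int enabled) {
    assert(enabled == 0 || enabled == 1);
    g_sketch_preserve_header = enabled;
}
```

The trade-off is that callers now make a real function call unless the compiler or LTO inlines it again, in exchange for a single definition the linker can always resolve.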
**Refs**:
- A/B results: `docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_AB_TEST_RESULTS.md`
- Design: `docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_DESIGN.md`
- Instructions: `docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_NEXT_INSTRUCTIONS.md`

---

### Phase 14-15 Summary: Pointer-Chase & Cache-Shape Research ⚠️

**Conclusion**: Both phases came out NEUTRAL (frozen as research boxes)

| Phase | Approach | Mixed Delta | C7 Delta | Verdict |
|-------|----------|-------------|----------|---------|
| 14 v1 | tcache (free-side only) | +0.20% | N/A | NEUTRAL |
| 14 v2 | tcache (alloc+free) | +0.08% | -0.39% | NEUTRAL |
| 15 v1 | FIFO→LIFO (array cache) | -0.70% | +0.42% | NEUTRAL |

**Lessons**:
- Neither pointer-chase reduction nor a cache-shape change yields a significant improvement over the current TLS array cache
- Closing the remaining mimalloc gap (about 2.4x) will require a fundamentally different approach

## Update Notes (2025-12-14 Phase 5 E5-3 Analysis - Strategic Pivot)

### Phase 5 E5-3: Candidate Analysis & Strategic Recommendations ⚠️ DEFER (2025-12-14)
