Phase 15 v1: UnifiedCache FIFO→LIFO NEUTRAL (-0.70% Mixed, +0.42% C7)
Transform existing array-based UnifiedCache from FIFO ring to LIFO stack.
A/B Results:
- Mixed (16-1024B): -0.70% (52,965,966 → 52,593,948 ops/s)
- C7-only (1025-2048B): +0.42% (78,010,783 → 78,335,509 ops/s)
Verdict: NEUTRAL (both below +1.0% GO threshold) - freeze as research box
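The deltas above follow from the raw ops/s figures. A minimal sketch of the arithmetic, assuming the delta is the relative change against the baseline; the helper names `ab_delta_percent` and `ab_meets_go_threshold` are hypothetical and not part of hakmem, and only the +1.0% GO threshold is taken from this commit message:

```c
#include <assert.h>

/* Hypothetical helper: relative throughput change in percent,
 * measured against the baseline (LIFO=0) run. */
static double ab_delta_percent(double baseline_ops, double candidate_ops) {
    assert(baseline_ops > 0.0);
    return (candidate_ops - baseline_ops) / baseline_ops * 100.0;
}

/* Hypothetical helper: 1 if the delta reaches the +1.0% GO threshold
 * quoted in the verdict line, 0 otherwise. */
static int ab_meets_go_threshold(double delta_percent) {
    return delta_percent >= 1.0;
}
```

Feeding in the Mixed pair (52,965,966 and 52,593,948) and the C7-only pair (78,010,783 and 78,335,509) gives values that round to the reported -0.70% and +0.42%, neither of which reaches the threshold.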
Implementation:
- L0 ENV gate: tiny_unified_lifo_env_box.{h,c} (HAKMEM_TINY_UNIFIED_LIFO=0/1)
- L1 LIFO ops: tiny_unified_lifo_box.h (unified_cache_try_pop/push_lifo)
- L2 integration: tiny_front_hot_box.h (mode check at entry)
- Reuses existing slots[] array (no intrusive pointers)
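To illustrate the idea of reusing one slots[] array for both orderings, here is a minimal sketch. It is not the code in tiny_unified_lifo_box.h: `DemoCache`, `DEMO_CAP`, and the `demo_*` functions are hypothetical names, and the field layout is an assumption guided only by the diff comments ("head != tail" for the ring, "tail is stack depth" for LIFO mode).

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CAP 8u /* assumed power of two so the ring can mask instead of mod */

typedef struct {
    void*    slots[DEMO_CAP];
    uint16_t head; /* FIFO: next slot to pop; unused in LIFO mode */
    uint16_t tail; /* FIFO: next slot to push; LIFO: current stack depth */
} DemoCache;

/* FIFO ring: push at tail, pop at head, one slot kept empty to tell full from empty. */
static int demo_push_fifo(DemoCache* c, void* p) {
    uint16_t next = (uint16_t)((c->tail + 1u) & (DEMO_CAP - 1u));
    if (next == c->head) return 0; /* full */
    c->slots[c->tail] = p;
    c->tail = next;
    return 1;
}

static void* demo_pop_fifo(DemoCache* c) {
    if (c->head == c->tail) return NULL; /* empty */
    void* p = c->slots[c->head];
    c->head = (uint16_t)((c->head + 1u) & (DEMO_CAP - 1u));
    return p;
}

/* LIFO stack: the same array, but tail is the depth and head is ignored. */
static int demo_push_lifo(DemoCache* c, void* p) {
    if (c->tail >= DEMO_CAP) return 0; /* full */
    c->slots[c->tail++] = p;
    return 1;
}

static void* demo_pop_lifo(DemoCache* c) {
    if (c->tail == 0) return NULL; /* empty */
    return c->slots[--c->tail];
}
```

In this sketch the FIFO pop returns the oldest pointer, while the LIFO pop returns the most recently freed one, which is the locality effect the phase was testing. No next-pointer is written into the cached objects in either mode.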
Root Causes:
1. Mode check overhead (tiny_unified_lifo_enabled() call)
2. Minimal LIFO vs FIFO locality delta in practice
3. Existing FIFO ring already well-optimized
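Root cause 1 refers to the cost of a cached ENV gate. The sketch below shows the general shape of such a gate, modeled on the lazy-init pattern visible in this commit's `tiny_c7_preserve_header_enabled()`. The `demo_*` names are hypothetical, and the parsing rule (a leading '1' enables) is an assumption; only the variable name HAKMEM_TINY_UNIFIED_LIFO comes from the commit message. `__builtin_expect` is a GCC/Clang builtin.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* -1 = ENV not read yet, 0 = disabled, 1 = enabled */
static _Atomic int g_demo_lifo_enabled = -1;

/* Cold path: read the ENV variable and cache the result. */
static int demo_lifo_env_init(void) {
    const char* s = getenv("HAKMEM_TINY_UNIFIED_LIFO");
    int enabled = (s != NULL && s[0] == '1') ? 1 : 0; /* assumed parsing rule */
    atomic_store_explicit(&g_demo_lifo_enabled, enabled, memory_order_relaxed);
    return enabled;
}

/* Hot path: one relaxed load plus one rarely-taken branch per call. */
static inline int demo_lifo_enabled(void) {
    int val = atomic_load_explicit(&g_demo_lifo_enabled, memory_order_relaxed);
    if (__builtin_expect(val == -1, 0)) {
        val = demo_lifo_env_init();
    }
    return val;
}

/* Cold path: re-read the ENV variable, e.g. after a benchmark profile sets defaults. */
static void demo_lifo_env_refresh(void) {
    (void)demo_lifo_env_init();
}
```

Even after the first call, every alloc and free in this sketch still pays for the load and the branch. On a fast path of only a few instructions, that fixed cost can be comparable to the locality gain being measured.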
Bonus Fix: LTO bug for tiny_c7_preserve_header_enabled() (Phase 13/14 latent issue)
- Converted static inline to extern + non-inline implementation
- Fixes undefined reference during LTO linking
Design: docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_DESIGN.md
Results: docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_AB_TEST_RESULTS.md
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@ -268,6 +268,80 @@ Cumulative improvements achieved in Phases 6-10:
|
|||||||
|
|
||||||
**Future Work**: Consider per-class cap tuning or alternative pointer-chase reduction strategies
|
**Future Work**: Consider per-class cap tuning or alternative pointer-chase reduction strategies
|
||||||
|
|
||||||
|
### Phase 14 v2: Pointer Chase Reduction — Hot Path Integration — NEUTRAL (+0.08%) ⚠️ RESEARCH BOX
|
||||||
|
|
||||||
|
**Date**: 2025-12-15
|
||||||
|
**Verdict**: **NEUTRAL (+0.08% Mixed)** / **-0.39% (C7-only)**; kept as a research box (default OFF)
|
||||||
|
|
||||||
|
**Motivation**: Phase 14 v1 was suspected of leaving the alloc side without any tcache consumption, so tcache was wired into the hot alloc/free paths of `tiny_front_hot_box` and the A/B test was rerun.
|
||||||
|
|
||||||
|
**Results**:
|
||||||
|
| Workload | TCACHE=0 | TCACHE=1 | Delta |
|
||||||
|
|---------|----------|----------|-------|
|
||||||
|
| Mixed (16–1024B) | 51,287,515 | 51,330,213 | **+0.08%** |
|
||||||
|
| C7-only | 80,975,651 | 80,660,283 | **-0.39%** |
|
||||||
|
|
||||||
|
**Conclusion**:
|
||||||
|
- v2 confirmed that the tcache path is live, but it did not improve the mainline Mixed workload (below the +1.0% GO threshold)
|
||||||
|
- Phase 14 (tcache-style intrusive LIFO) should **stay frozen** for now
|
||||||
|
|
||||||
|
**Possible root causes** (if this is investigated further):
|
||||||
|
1. The fence and auxiliary work in `tiny_next_load/store` may be too heavy for a TLS-only tcache
|
||||||
|
2. The fixed cost (load/branch) of `tiny_tcache_enabled/cap` offsets the savings
|
||||||
|
3. In Mixed, the per-bin hit rate is low (workload mismatch)
|
||||||
|
|
||||||
|
**Refs**:
|
||||||
|
- v2 results: `docs/analysis/PHASE14_POINTER_CHASE_REDUCTION_2_AB_TEST_RESULTS.md`
|
||||||
|
- v2 instructions: `docs/analysis/PHASE14_POINTER_CHASE_REDUCTION_2_NEXT_INSTRUCTIONS.md`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Phase 15 v1: UnifiedCache FIFO→LIFO (Stack) — NEUTRAL (-0.70% Mixed, +0.42% C7) ⚠️ RESEARCH BOX
|
||||||
|
|
||||||
|
**Date**: 2025-12-15
|
||||||
|
**Verdict**: **NEUTRAL (-0.70% Mixed, +0.42% C7-only)**; kept as a research box (default OFF)
|
||||||
|
|
||||||
|
**Motivation**: Because Phase 14 (intrusive tcache) was NEUTRAL, this phase adds no intrusive pointers and instead changes the existing `TinyUnifiedCache.slots[]` from a FIFO ring to a LIFO stack, aiming for better locality.
|
||||||
|
|
||||||
|
**Results**:
|
||||||
|
| Workload | LIFO=0 (FIFO) | LIFO=1 (LIFO) | Delta |
|
||||||
|
|---------|----------|----------|-------|
|
||||||
|
| Mixed (16–1024B) | 52,965,966 | 52,593,948 | **-0.70%** |
|
||||||
|
| C7-only (1025–2048B) | 78,010,783 | 78,335,509 | **+0.42%** |
|
||||||
|
|
||||||
|
**Conclusion**:
|
||||||
|
- Switching to LIFO did not deliver the expected benefit (Mixed regressed and C7 improved slightly, both below the GO threshold)
|
||||||
|
- The mode-check branch overhead (`tiny_unified_lifo_enabled()`) offsets the locality gain
|
||||||
|
- The existing FIFO ring implementation is already well optimized
|
||||||
|
|
||||||
|
**Root causes**:
|
||||||
|
1. Entry-point mode check overhead (`tiny_unified_lifo_enabled()` call)
|
||||||
|
2. Minimal LIFO vs FIFO locality delta in practice (cache warming mitigates)
|
||||||
|
3. Existing FIFO ring already well-optimized
|
||||||
|
|
||||||
|
**Bonus**: LTO bug fix for `tiny_c7_preserve_header_enabled()` (Phase 13/14 latent issue)
|
||||||
|
|
||||||
|
**Refs**:
|
||||||
|
- A/B results: `docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_AB_TEST_RESULTS.md`
|
||||||
|
- Design: `docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_DESIGN.md`
|
||||||
|
- Instructions: `docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_NEXT_INSTRUCTIONS.md`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Phase 14-15 Summary: Pointer-Chase & Cache-Shape Research ⚠️
|
||||||
|
|
||||||
|
**Conclusion**: Both phases are NEUTRAL (frozen as research boxes)
|
||||||
|
|
||||||
|
| Phase | Approach | Mixed Delta | C7 Delta | Verdict |
|
||||||
|
|-------|----------|-------------|----------|---------|
|
||||||
|
| 14 v1 | tcache (free-side only) | +0.20% | N/A | NEUTRAL |
|
||||||
|
| 14 v2 | tcache (alloc+free) | +0.08% | -0.39% | NEUTRAL |
|
||||||
|
| 15 v1 | FIFO→LIFO (array cache) | -0.70% | +0.42% | NEUTRAL |
|
||||||
|
|
||||||
|
**Lessons learned**:
|
||||||
|
- Neither pointer-chase reduction nor a cache-shape change yields a significant improvement over the current TLS array cache
|
||||||
|
- Closing the remaining gap to mimalloc (about 2.4x) requires a fundamentally different approach
|
||||||
|
|
||||||
## Update notes (2025-12-14 Phase 5 E5-3 Analysis - Strategic Pivot)
|
## Update notes (2025-12-14 Phase 5 E5-3 Analysis - Strategic Pivot)
|
||||||
|
|
||||||
### Phase 5 E5-3: Candidate Analysis & Strategic Recommendations ⚠️ DEFER (2025-12-14)
|
### Phase 5 E5-3: Candidate Analysis & Strategic Recommendations ⚠️ DEFER (2025-12-14)
|
||||||
|
|||||||
6
Makefile
@ -218,12 +218,12 @@ LDFLAGS += $(EXTRA_LDFLAGS)
|
|||||||
|
|
||||||
# Targets
|
# Targets
|
||||||
TARGET = test_hakmem
|
TARGET = test_hakmem
|
||||||
OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o 
core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/hakmem_env_snapshot_box.o core/box/tiny_c7_preserve_header_env_box.o core/box/tiny_tcache_env_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
|
OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o 
core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/hakmem_env_snapshot_box.o core/box/tiny_c7_preserve_header_env_box.o core/box/tiny_tcache_env_box.o core/box/tiny_unified_lifo_env_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
|
||||||
OBJS = $(OBJS_BASE)
|
OBJS = $(OBJS_BASE)
|
||||||
|
|
||||||
# Shared library
|
# Shared library
|
||||||
SHARED_LIB = libhakmem.so
|
SHARED_LIB = libhakmem.so
|
||||||
SHARED_OBJS = hakmem_shared.o hakmem_config_shared.o hakmem_tiny_config_shared.o hakmem_ucb1_shared.o hakmem_bigcache_shared.o hakmem_pool_shared.o hakmem_l25_pool_shared.o hakmem_site_rules_shared.o hakmem_tiny_shared.o core/box/ss_allocation_box_shared.o superslab_stats_shared.o superslab_cache_shared.o superslab_ace_shared.o superslab_slab_shared.o superslab_backend_shared.o core/superslab_head_stub_shared.o hakmem_smallmid_shared.o core/box/superslab_expansion_box_shared.o core/box/integrity_box_shared.o core/box/mailbox_box_shared.o core/box/front_gate_box_shared.o core/box/front_gate_classifier_shared.o core/box/free_publish_box_shared.o core/box/capacity_box_shared.o core/box/carve_push_box_shared.o core/box/prewarm_box_shared.o core/box/ss_hot_prewarm_box_shared.o core/box/front_metrics_box_shared.o core/box/bench_fast_box_shared.o core/box/ss_addr_map_box_shared.o core/box/ss_pt_impl_shared.o core/box/slab_recycling_box_shared.o core/box/pagefault_telemetry_box_shared.o core/box/tiny_sizeclass_hist_box_shared.o core/box/tiny_env_box_shared.o core/box/tiny_route_box_shared.o core/box/free_front_v3_env_box_shared.o core/box/free_path_stats_box_shared.o core/box/free_dispatch_stats_box_shared.o core/box/alloc_gate_stats_box_shared.o core/box/tiny_page_box_shared.o core/box/tiny_class_policy_box_shared.o core/box/tiny_class_stats_box_shared.o core/box/tiny_policy_learner_box_shared.o core/box/ss_budget_box_shared.o core/box/tiny_mem_stats_box_shared.o core/box/wrapper_env_box_shared.o core/box/free_wrapper_env_snapshot_box_shared.o core/box/malloc_wrapper_env_snapshot_box_shared.o core/box/madvise_guard_box_shared.o core/box/libm_reloc_guard_box_shared.o core/box/hakmem_env_snapshot_box_shared.o core/box/tiny_c7_preserve_header_env_box_shared.o core/box/tiny_tcache_env_box_shared.o core/page_arena_shared.o core/front/tiny_unified_cache_shared.o core/tiny_alloc_fast_push_shared.o core/tiny_c7_ultra_segment_shared.o core/tiny_c7_ultra_shared.o 
core/link_stubs_shared.o core/tiny_failfast_shared.o tiny_sticky_shared.o tiny_remote_shared.o tiny_publish_shared.o tiny_debug_ring_shared.o hakmem_tiny_magazine_shared.o hakmem_tiny_stats_shared.o hakmem_tiny_sfc_shared.o hakmem_tiny_query_shared.o hakmem_tiny_rss_shared.o hakmem_tiny_registry_shared.o hakmem_tiny_remote_target_shared.o hakmem_tiny_bg_spill_shared.o tiny_adaptive_sizing_shared.o hakmem_super_registry_shared.o hakmem_shared_pool_shared.o hakmem_shared_pool_acquire_shared.o hakmem_shared_pool_release_shared.o hakmem_elo_shared.o hakmem_batch_shared.o hakmem_p2_shared.o hakmem_sizeclass_dist_shared.o hakmem_evo_shared.o hakmem_debug_shared.o hakmem_sys_shared.o hakmem_whale_shared.o hakmem_policy_shared.o hakmem_ace_shared.o hakmem_ace_stats_shared.o hakmem_ace_controller_shared.o hakmem_ace_metrics_shared.o hakmem_ace_ucb1_shared.o hakmem_prof_shared.o hakmem_learner_shared.o hakmem_size_hist_shared.o hakmem_learn_log_shared.o hakmem_syscall_shared.o tiny_fastcache_shared.o core/box/super_reg_box_shared.o core/box/shared_pool_box_shared.o core/box/remote_side_box_shared.o core/tiny_destructors_shared.o
|
SHARED_OBJS = hakmem_shared.o hakmem_config_shared.o hakmem_tiny_config_shared.o hakmem_ucb1_shared.o hakmem_bigcache_shared.o hakmem_pool_shared.o hakmem_l25_pool_shared.o hakmem_site_rules_shared.o hakmem_tiny_shared.o core/box/ss_allocation_box_shared.o superslab_stats_shared.o superslab_cache_shared.o superslab_ace_shared.o superslab_slab_shared.o superslab_backend_shared.o core/superslab_head_stub_shared.o hakmem_smallmid_shared.o core/box/superslab_expansion_box_shared.o core/box/integrity_box_shared.o core/box/mailbox_box_shared.o core/box/front_gate_box_shared.o core/box/front_gate_classifier_shared.o core/box/free_publish_box_shared.o core/box/capacity_box_shared.o core/box/carve_push_box_shared.o core/box/prewarm_box_shared.o core/box/ss_hot_prewarm_box_shared.o core/box/front_metrics_box_shared.o core/box/bench_fast_box_shared.o core/box/ss_addr_map_box_shared.o core/box/ss_pt_impl_shared.o core/box/slab_recycling_box_shared.o core/box/pagefault_telemetry_box_shared.o core/box/tiny_sizeclass_hist_box_shared.o core/box/tiny_env_box_shared.o core/box/tiny_route_box_shared.o core/box/free_front_v3_env_box_shared.o core/box/free_path_stats_box_shared.o core/box/free_dispatch_stats_box_shared.o core/box/alloc_gate_stats_box_shared.o core/box/tiny_page_box_shared.o core/box/tiny_class_policy_box_shared.o core/box/tiny_class_stats_box_shared.o core/box/tiny_policy_learner_box_shared.o core/box/ss_budget_box_shared.o core/box/tiny_mem_stats_box_shared.o core/box/wrapper_env_box_shared.o core/box/free_wrapper_env_snapshot_box_shared.o core/box/malloc_wrapper_env_snapshot_box_shared.o core/box/madvise_guard_box_shared.o core/box/libm_reloc_guard_box_shared.o core/box/hakmem_env_snapshot_box_shared.o core/box/tiny_c7_preserve_header_env_box_shared.o core/box/tiny_tcache_env_box_shared.o core/box/tiny_unified_lifo_env_box_shared.o core/page_arena_shared.o core/front/tiny_unified_cache_shared.o core/tiny_alloc_fast_push_shared.o core/tiny_c7_ultra_segment_shared.o 
core/tiny_c7_ultra_shared.o core/link_stubs_shared.o core/tiny_failfast_shared.o tiny_sticky_shared.o tiny_remote_shared.o tiny_publish_shared.o tiny_debug_ring_shared.o hakmem_tiny_magazine_shared.o hakmem_tiny_stats_shared.o hakmem_tiny_sfc_shared.o hakmem_tiny_query_shared.o hakmem_tiny_rss_shared.o hakmem_tiny_registry_shared.o hakmem_tiny_remote_target_shared.o hakmem_tiny_bg_spill_shared.o tiny_adaptive_sizing_shared.o hakmem_super_registry_shared.o hakmem_shared_pool_shared.o hakmem_shared_pool_acquire_shared.o hakmem_shared_pool_release_shared.o hakmem_elo_shared.o hakmem_batch_shared.o hakmem_p2_shared.o hakmem_sizeclass_dist_shared.o hakmem_evo_shared.o hakmem_debug_shared.o hakmem_sys_shared.o hakmem_whale_shared.o hakmem_policy_shared.o hakmem_ace_shared.o hakmem_ace_stats_shared.o hakmem_ace_controller_shared.o hakmem_ace_metrics_shared.o hakmem_ace_ucb1_shared.o hakmem_prof_shared.o hakmem_learner_shared.o hakmem_size_hist_shared.o hakmem_learn_log_shared.o hakmem_syscall_shared.o tiny_fastcache_shared.o core/box/super_reg_box_shared.o core/box/shared_pool_box_shared.o core/box/remote_side_box_shared.o core/tiny_destructors_shared.o
|
||||||
|
|
||||||
# Pool TLS Phase 1 (enable with POOL_TLS_PHASE1=1)
|
# Pool TLS Phase 1 (enable with POOL_TLS_PHASE1=1)
|
||||||
ifeq ($(POOL_TLS_PHASE1),1)
|
ifeq ($(POOL_TLS_PHASE1),1)
|
||||||
@ -427,7 +427,7 @@ test-box-refactor: box-refactor
|
|||||||
./larson_hakmem 10 8 128 1024 1 12345 4
|
./larson_hakmem 10 8 128 1024 1 12345 4
|
||||||
|
|
||||||
# Phase 4: Tiny Pool benchmarks (properly linked with hakmem)
|
# Phase 4: Tiny Pool benchmarks (properly linked with hakmem)
|
||||||
TINY_BENCH_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/free_publish_box.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/tiny_free_route_cache_env_box.o core/box/hakmem_env_snapshot_box.o core/box/tiny_c7_preserve_header_env_box.o core/box/tiny_tcache_env_box.o core/page_arena.o core/front/tiny_unified_cache.o tiny_sticky.o 
tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
|
TINY_BENCH_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/free_publish_box.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/tiny_free_route_cache_env_box.o core/box/hakmem_env_snapshot_box.o core/box/tiny_c7_preserve_header_env_box.o core/box/tiny_tcache_env_box.o core/box/tiny_unified_lifo_env_box.o core/page_arena.o 
core/front/tiny_unified_cache.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
|
||||||
TINY_BENCH_OBJS = $(TINY_BENCH_OBJS_BASE)
|
TINY_BENCH_OBJS = $(TINY_BENCH_OBJS_BASE)
|
||||||
ifeq ($(POOL_TLS_PHASE1),1)
|
ifeq ($(POOL_TLS_PHASE1),1)
|
||||||
TINY_BENCH_OBJS += pool_tls.o pool_refill.o core/pool_tls_arena.o pool_tls_registry.o pool_tls_remote.o
|
TINY_BENCH_OBJS += pool_tls.o pool_refill.o core/pool_tls_arena.o pool_tls_registry.o pool_tls_remote.o
|
||||||
|
|||||||
@ -12,6 +12,7 @@
|
|||||||
#include "box/tiny_free_route_cache_env_box.h" // tiny_free_static_route_refresh_from_env (Phase 8)
|
#include "box/tiny_free_route_cache_env_box.h" // tiny_free_static_route_refresh_from_env (Phase 8)
|
||||||
#include "box/tiny_c7_preserve_header_env_box.h" // tiny_c7_preserve_header_env_refresh_from_env (Phase 13 v1)
|
#include "box/tiny_c7_preserve_header_env_box.h" // tiny_c7_preserve_header_env_refresh_from_env (Phase 13 v1)
|
||||||
#include "box/tiny_tcache_env_box.h" // tiny_tcache_env_refresh_from_env (Phase 14 v1)
|
#include "box/tiny_tcache_env_box.h" // tiny_tcache_env_refresh_from_env (Phase 14 v1)
|
||||||
|
#include "box/tiny_unified_lifo_env_box.h" // tiny_unified_lifo_env_refresh_from_env (Phase 15 v1)
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
// Set the default only when the env variable is unset
|
// Set the default only when the env variable is unset
|
||||||
@ -190,5 +191,7 @@ static inline void bench_apply_profile(void) {
|
|||||||
tiny_c7_preserve_header_env_refresh_from_env();
|
tiny_c7_preserve_header_env_refresh_from_env();
|
||||||
// Phase 14 v1: Sync tcache ENV cache after bench_profile putenv defaults.
|
// Phase 14 v1: Sync tcache ENV cache after bench_profile putenv defaults.
|
||||||
tiny_tcache_env_refresh_from_env();
|
tiny_tcache_env_refresh_from_env();
|
||||||
|
// Phase 15 v1: Sync LIFO ENV cache after bench_profile putenv defaults.
|
||||||
|
tiny_unified_lifo_env_refresh_from_env();
|
||||||
#endif
|
#endif
|
||||||
}
|
}
|
||||||
|
|||||||
@ -39,6 +39,19 @@ int tiny_c7_preserve_header_env_init(void) {
|
|||||||
return enabled;
|
return enabled;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// ============================================================================
|
||||||
|
// Hot Path (LTO Fallback)
|
||||||
|
// ============================================================================
|
||||||
|
|
||||||
|
// LTO fallback: Non-inline version for cases where LTO can't inline
|
||||||
|
int tiny_c7_preserve_header_enabled(void) {
|
||||||
|
int val = atomic_load_explicit(&g_tiny_c7_preserve_header_enabled, memory_order_relaxed);
|
||||||
|
if (__builtin_expect(val == -1, 0)) {
|
||||||
|
val = tiny_c7_preserve_header_env_init();
|
||||||
|
}
|
||||||
|
return val;
|
||||||
|
}
|
||||||
|
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
// Refresh (Cold Path, called from bench_profile)
|
// Refresh (Cold Path, called from bench_profile)
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
|
|||||||
@ -44,22 +44,13 @@
|
|||||||
extern _Atomic int g_tiny_c7_preserve_header_enabled;
|
extern _Atomic int g_tiny_c7_preserve_header_enabled;
|
||||||
|
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
// Hot Inline API (L0)
|
// Hot API (L0)
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
|
|
||||||
// Check if C7 preserve header is enabled
|
// Check if C7 preserve header is enabled
|
||||||
// Returns: 1 if enabled, 0 if disabled
|
// Returns: 1 if enabled, 0 if disabled
|
||||||
static inline int tiny_c7_preserve_header_enabled(void) {
|
// Note: Implementation in .c file (non-inline for LTO compatibility)
|
||||||
int val = atomic_load_explicit(&g_tiny_c7_preserve_header_enabled, memory_order_relaxed);
|
extern int tiny_c7_preserve_header_enabled(void);
|
||||||
|
|
||||||
if (__builtin_expect(val == -1, 0)) {
|
|
||||||
// Lazy init: read ENV once
|
|
||||||
extern int tiny_c7_preserve_header_env_init(void);
|
|
||||||
val = tiny_c7_preserve_header_env_init();
|
|
||||||
}
|
|
||||||
|
|
||||||
return val;
|
|
||||||
}
|
|
||||||
|
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
// Cold API (L2)
|
// Cold API (L2)
|
||||||
|
|||||||
@ -30,6 +30,7 @@
|
|||||||
#include "../tiny_region_id.h"
|
#include "../tiny_region_id.h"
|
||||||
#include "../front/tiny_unified_cache.h" // For TinyUnifiedCache
|
#include "../front/tiny_unified_cache.h" // For TinyUnifiedCache
|
||||||
#include "tiny_header_box.h" // Phase 5 E5-2: For tiny_header_finalize_alloc
|
#include "tiny_header_box.h" // Phase 5 E5-2: For tiny_header_finalize_alloc
|
||||||
|
#include "tiny_unified_lifo_box.h" // Phase 15 v1: UnifiedCache FIFO→LIFO
|
||||||
|
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
// Branch Prediction Macros (Pointer Safety - Prediction Hints)
|
// Branch Prediction Macros (Pointer Safety - Prediction Hints)
|
||||||
@ -107,12 +108,34 @@
|
|||||||
//
|
//
|
||||||
__attribute__((always_inline))
|
__attribute__((always_inline))
|
||||||
static inline void* tiny_hot_alloc_fast(int class_idx) {
|
static inline void* tiny_hot_alloc_fast(int class_idx) {
|
||||||
|
// Phase 15 v1: Mode check at entry (once per call, not scattered in hot path)
|
||||||
|
int lifo_mode = tiny_unified_lifo_enabled();
|
||||||
|
|
||||||
extern __thread TinyUnifiedCache g_unified_cache[];
|
extern __thread TinyUnifiedCache g_unified_cache[];
|
||||||
|
|
||||||
// TLS cache access (1 cache miss)
|
// TLS cache access (1 cache miss)
|
||||||
// NOTE: Range check removed - caller (hak_tiny_size_to_class) guarantees valid class_idx
|
// NOTE: Range check removed - caller (hak_tiny_size_to_class) guarantees valid class_idx
|
||||||
TinyUnifiedCache* cache = &g_unified_cache[class_idx];
|
TinyUnifiedCache* cache = &g_unified_cache[class_idx];
|
||||||
|
|
||||||
|
// Phase 15 v1: LIFO vs FIFO mode switch
|
||||||
|
if (lifo_mode) {
|
||||||
|
// === LIFO MODE: Stack-based (LIFO) ===
|
||||||
|
// Try pop from stack (tail is stack depth)
|
||||||
|
void* base = unified_cache_try_pop_lifo(class_idx);
|
||||||
|
if (__builtin_expect(base != NULL, 1)) {
|
||||||
|
TINY_HOT_METRICS_HIT(class_idx);
|
||||||
|
#if HAKMEM_TINY_HEADER_CLASSIDX
|
||||||
|
return tiny_header_finalize_alloc(base, class_idx);
|
||||||
|
#else
|
||||||
|
return base;
|
||||||
|
#endif
|
||||||
|
}
|
||||||
|
// LIFO miss → fall through to cold path
|
||||||
|
TINY_HOT_METRICS_MISS(class_idx);
|
||||||
|
return NULL;
|
||||||
|
}
|
||||||
|
|
||||||
|
// === FIFO MODE: Ring-based (existing) ===
|
||||||
// Branch 1: Cache empty check (LIKELY hit)
|
// Branch 1: Cache empty check (LIKELY hit)
|
||||||
// Hot path: cache has objects (head != tail)
|
// Hot path: cache has objects (head != tail)
|
||||||
// Cold path: cache empty (head == tail) → refill needed
|
// Cold path: cache empty (head == tail) → refill needed
|
||||||
@ -164,12 +187,35 @@ static inline void* tiny_hot_alloc_fast(int class_idx) {
|
|||||||
//
|
//
|
||||||
__attribute__((always_inline))
|
__attribute__((always_inline))
|
||||||
static inline int tiny_hot_free_fast(int class_idx, void* base) {
|
static inline int tiny_hot_free_fast(int class_idx, void* base) {
|
||||||
|
// Phase 15 v1: Mode check at entry (once per call, not scattered in hot path)
|
||||||
|
int lifo_mode = tiny_unified_lifo_enabled();
|
||||||
|
|
||||||
extern __thread TinyUnifiedCache g_unified_cache[];
|
extern __thread TinyUnifiedCache g_unified_cache[];
|
||||||
|
|
||||||
// TLS cache access (1 cache miss)
|
// TLS cache access (1 cache miss)
|
||||||
// NOTE: Range check removed - caller guarantees valid class_idx
|
// NOTE: Range check removed - caller guarantees valid class_idx
|
||||||
TinyUnifiedCache* cache = &g_unified_cache[class_idx];
|
TinyUnifiedCache* cache = &g_unified_cache[class_idx];
|
||||||
|
|
||||||
|
// Phase 15 v1: LIFO vs FIFO mode switch
|
||||||
|
if (lifo_mode) {
|
||||||
|
// === LIFO MODE: Stack-based (LIFO) ===
|
||||||
|
// Try push to stack (tail is stack depth)
|
||||||
|
if (unified_cache_try_push_lifo(class_idx, base)) {
|
||||||
|
#if !HAKMEM_BUILD_RELEASE
|
||||||
|
extern __thread uint64_t g_unified_cache_push[];
|
||||||
|
g_unified_cache_push[class_idx]++;
|
||||||
|
#endif
|
||||||
|
return 1; // SUCCESS
|
||||||
|
}
|
||||||
|
// LIFO overflow → fall through to cold path
|
||||||
|
#if !HAKMEM_BUILD_RELEASE
|
||||||
|
extern __thread uint64_t g_unified_cache_full[];
|
||||||
|
g_unified_cache_full[class_idx]++;
|
||||||
|
#endif
|
||||||
|
return 0; // FULL
|
||||||
|
}
|
||||||
|
|
||||||
|
// === FIFO MODE: Ring-based (existing) ===
|
||||||
// Calculate next tail (for full check)
|
// Calculate next tail (for full check)
|
||||||
uint16_t next_tail = (cache->tail + 1) & cache->mask;
|
uint16_t next_tail = (cache->tail + 1) & cache->mask;
|
||||||
|
|
||||||
|
|||||||
121
core/box/tiny_unified_lifo_box.h
Normal file
@ -0,0 +1,121 @@
|
|||||||
|
// ============================================================================
|
||||||
|
// Phase 15 v1: Tiny Unified LIFO Box (L1) - LIFO Stack Operations
|
||||||
|
// ============================================================================
|
||||||
|
//
|
||||||
|
// Purpose: LIFO (stack) operations for UnifiedCache
|
||||||
|
//
|
||||||
|
// Design: docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_DESIGN.md
|
||||||
|
//
|
||||||
|
// Strategy:
|
||||||
|
// - Reuse existing TinyUnifiedCache.slots[] array
|
||||||
|
// - Treat `tail` as stack top (depth)
|
||||||
|
// - Treat `head` as unused (always 0)
|
||||||
|
// - No wrap-around (`mask` unused)
|
||||||
|
//
|
||||||
|
// Invariants:
|
||||||
|
// - 0 <= tail <= capacity (stack depth)
|
||||||
|
// - head == 0 (unused in LIFO mode)
|
||||||
|
// - LIFO and FIFO modes are mutually exclusive
|
||||||
|
//
|
||||||
|
// API:
|
||||||
|
// unified_cache_try_pop_lifo(class_idx) -> void* (BASE or NULL)
|
||||||
|
// unified_cache_try_push_lifo(class_idx, base) -> int (1=success, 0=full)
|
||||||
|
//
|
||||||
|
// Safety:
|
||||||
|
// - Debug: assert tail <= capacity
|
||||||
|
// - Release: fast path, no checks
|
||||||
|
//
|
||||||
|
// ============================================================================
|
||||||
|
|
||||||
|
#ifndef TINY_UNIFIED_LIFO_BOX_H
|
||||||
|
#define TINY_UNIFIED_LIFO_BOX_H
|
||||||
|
|
||||||
|
#include <stdint.h>
|
||||||
|
#include <assert.h>
|
||||||
|
#include "../hakmem_build_flags.h"
|
||||||
|
#include "../hakmem_tiny_config.h" // TINY_NUM_CLASSES
|
||||||
|
#include "../front/tiny_unified_cache.h" // TinyUnifiedCache
|
||||||
|
#include "tiny_unified_lifo_env_box.h" // tiny_unified_lifo_enabled
|
||||||
|
|
||||||
|
// ============================================================================
|
||||||
|
// LIFO Pop (Alloc Fast Path)
|
||||||
|
// ============================================================================
|
||||||
|
//
|
||||||
|
// Arguments:
|
||||||
|
// class_idx - Tiny class index (0-7)
|
||||||
|
//
|
||||||
|
// Returns:
|
||||||
|
// BASE pointer - Block from top of stack
|
||||||
|
// NULL - Stack empty
|
||||||
|
//
|
||||||
|
// Side effects:
|
||||||
|
// - Decrements tail (stack depth)
|
||||||
|
//
|
||||||
|
// LIFO semantics:
|
||||||
|
// - Pop from top of stack (tail - 1)
|
||||||
|
// - tail is "one past last element" (like vector.size())
|
||||||
|
//
|
||||||
|
|
||||||
|
__attribute__((always_inline))
|
||||||
|
static inline void* unified_cache_try_pop_lifo(int class_idx) {
|
||||||
|
extern __thread TinyUnifiedCache g_unified_cache[];
|
||||||
|
TinyUnifiedCache* cache = &g_unified_cache[class_idx];
|
||||||
|
|
||||||
|
// Empty check (tail == 0 means stack is empty)
|
||||||
|
if (__builtin_expect(cache->tail == 0, 0)) {
|
||||||
|
return NULL; // Empty
|
||||||
|
}
|
||||||
|
|
||||||
|
// Pop from top of stack (LIFO)
|
||||||
|
void* base = cache->slots[--cache->tail];
|
||||||
|
|
||||||
|
// Debug: validate stack depth
|
||||||
|
#if !HAKMEM_BUILD_RELEASE
|
||||||
|
assert(cache->tail <= cache->capacity);
|
||||||
|
#endif
|
||||||
|
|
||||||
|
return base; // HIT (BASE pointer)
|
||||||
|
}
|
||||||
|
|
||||||
|
// ============================================================================
|
||||||
|
// LIFO Push (Free Fast Path)
|
||||||
|
// ============================================================================
|
||||||
|
//
|
||||||
|
// Arguments:
|
||||||
|
// class_idx - Tiny class index (0-7)
|
||||||
|
// base - BASE pointer to freed block
|
||||||
|
//
|
||||||
|
// Returns:
|
||||||
|
// 1 - Block pushed to stack (success)
|
||||||
|
// 0 - Stack full (overflow)
|
||||||
|
//
|
||||||
|
// Side effects:
|
||||||
|
// - Increments tail (stack depth)
|
||||||
|
//
|
||||||
|
// LIFO semantics:
|
||||||
|
// - Push to top of stack (tail)
|
||||||
|
// - tail is "one past last element"
|
||||||
|
//
|
||||||
|
|
||||||
|
__attribute__((always_inline))
|
||||||
|
static inline int unified_cache_try_push_lifo(int class_idx, void* base) {
|
||||||
|
extern __thread TinyUnifiedCache g_unified_cache[];
|
||||||
|
TinyUnifiedCache* cache = &g_unified_cache[class_idx];
|
||||||
|
|
||||||
|
// Full check (tail == capacity means stack is full)
|
||||||
|
if (__builtin_expect(cache->tail >= cache->capacity, 0)) {
|
||||||
|
return 0; // Full
|
||||||
|
}
|
||||||
|
|
||||||
|
// Push to top of stack (LIFO)
|
||||||
|
cache->slots[cache->tail++] = base;
|
||||||
|
|
||||||
|
// Debug: validate stack depth
|
||||||
|
#if !HAKMEM_BUILD_RELEASE
|
||||||
|
assert(cache->tail <= cache->capacity);
|
||||||
|
#endif
|
||||||
|
|
||||||
|
return 1; // SUCCESS
|
||||||
|
}
|
||||||
|
|
||||||
|
#endif // TINY_UNIFIED_LIFO_BOX_H
|
||||||
50
core/box/tiny_unified_lifo_env_box.c
Normal file
@ -0,0 +1,50 @@
|
|||||||
|
// ============================================================================
|
||||||
|
// Phase 15 v1: Tiny Unified LIFO ENV Box (L0) - Implementation
|
||||||
|
// ============================================================================
|
||||||
|
|
||||||
|
#include "tiny_unified_lifo_env_box.h"
|
||||||
|
#include <stdlib.h>
|
||||||
|
#include <string.h>
|
||||||
|
#include <stdio.h>
|
||||||
|
#include <unistd.h>
|
||||||
|
|
||||||
|
// ============================================================================
|
||||||
|
// Global State
|
||||||
|
// ============================================================================
|
||||||
|
|
||||||
|
_Atomic int g_tiny_unified_lifo_enabled = -1;
|
||||||
|
|
||||||
|
// ============================================================================
|
||||||
|
// Init (Cold Path)
|
||||||
|
// ============================================================================
|
||||||
|
|
||||||
|
int tiny_unified_lifo_env_init(void) {
|
||||||
|
const char* env = getenv("HAKMEM_TINY_UNIFIED_LIFO");
|
||||||
|
int enabled = 0; // default: OFF (opt-in)
|
||||||
|
|
||||||
|
if (env && (env[0] == '1' || strcmp(env, "true") == 0 || strcmp(env, "TRUE") == 0)) {
|
||||||
|
enabled = 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Cache result
|
||||||
|
atomic_store_explicit(&g_tiny_unified_lifo_enabled, enabled, memory_order_relaxed);
|
||||||
|
|
||||||
|
// Log once (stderr for immediate visibility)
|
||||||
|
if (enabled) {
|
||||||
|
const char msg[] = "[UNIFIED_LIFO] enabled\n";
|
||||||
|
ssize_t w = write(2, msg, sizeof(msg) - 1);
|
||||||
|
(void)w;
|
||||||
|
}
|
||||||
|
|
||||||
|
return enabled;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ============================================================================
|
||||||
|
// Refresh (Cold Path, called from bench_profile)
|
||||||
|
// ============================================================================
|
||||||
|
|
||||||
|
void tiny_unified_lifo_env_refresh_from_env(void) {
|
||||||
|
// Reset to uninitialized state (-1)
|
||||||
|
// Next call to tiny_unified_lifo_enabled() will re-read ENV
|
||||||
|
atomic_store_explicit(&g_tiny_unified_lifo_enabled, -1, memory_order_relaxed);
|
||||||
|
}
|
||||||
72
core/box/tiny_unified_lifo_env_box.h
Normal file
@ -0,0 +1,72 @@
|
|||||||
|
// ============================================================================
|
||||||
|
// Phase 15 v1: Tiny Unified LIFO ENV Box (L0)
|
||||||
|
// ============================================================================
|
||||||
|
//
|
||||||
|
// Purpose: ENV gate for UnifiedCache FIFO→LIFO mode switch
|
||||||
|
//
|
||||||
|
// Design: docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_DESIGN.md
|
||||||
|
// Instructions: docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_NEXT_INSTRUCTIONS.md
|
||||||
|
//
|
||||||
|
// Strategy:
|
||||||
|
// - Change the shape of UnifiedCache from FIFO ring → LIFO stack
|
||||||
|
// - Reuse the existing slots[] as is (no intrusive nextptr)
|
||||||
|
// - Treat tail as "top"; head is unused
|
||||||
|
//
|
||||||
|
// ENV:
|
||||||
|
// HAKMEM_TINY_UNIFIED_LIFO=0/1 (default: 0, opt-in)
|
||||||
|
//
|
||||||
|
// API:
|
||||||
|
// tiny_unified_lifo_enabled() -> int
|
||||||
|
// tiny_unified_lifo_env_refresh_from_env()
|
||||||
|
//
|
||||||
|
// Box Theory:
|
||||||
|
// - L0: This file (ENV gate, reversible)
|
||||||
|
// - L1: tiny_unified_lifo_box.h (LIFO push/pop operations)
|
||||||
|
// - L2: tiny_front_hot_box.h (integration point)
|
||||||
|
//
|
||||||
|
// Safety:
|
||||||
|
// - ENV-gated (default OFF, opt-in)
|
||||||
|
// - Reversible (ENV toggle)
|
||||||
|
// - Mode switch requires cache drain/reset
|
||||||
|
//
|
||||||
|
// ============================================================================
|
||||||
|
|
||||||
|
#ifndef TINY_UNIFIED_LIFO_ENV_BOX_H
|
||||||
|
#define TINY_UNIFIED_LIFO_ENV_BOX_H
|
||||||
|
|
||||||
|
#include <stdatomic.h>
|
||||||
|
|
||||||
|
// ============================================================================
|
||||||
|
// Global State (L0)
|
||||||
|
// ============================================================================
|
||||||
|
|
||||||
|
// Cached state: -1 (uninitialized), 0 (disabled), 1 (enabled)
|
||||||
|
extern _Atomic int g_tiny_unified_lifo_enabled;
|
||||||
|
|
||||||
|
// ============================================================================
|
||||||
|
// Hot Inline API (L0)
|
||||||
|
// ============================================================================
|
||||||
|
|
||||||
|
// Check if LIFO mode is enabled
|
||||||
|
// Returns: 1 if enabled, 0 if disabled
|
||||||
|
static inline int tiny_unified_lifo_enabled(void) {
|
||||||
|
int val = atomic_load_explicit(&g_tiny_unified_lifo_enabled, memory_order_relaxed);
|
||||||
|
|
||||||
|
if (__builtin_expect(val == -1, 0)) {
|
||||||
|
// Lazy init: read ENV once
|
||||||
|
extern int tiny_unified_lifo_env_init(void);
|
||||||
|
val = tiny_unified_lifo_env_init();
|
||||||
|
}
|
||||||
|
|
||||||
|
return val;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ============================================================================
|
||||||
|
// Cold API (L2)
|
||||||
|
// ============================================================================
|
||||||
|
|
||||||
|
// Refresh ENV cache (called from bench_profile after putenv)
|
||||||
|
// Note: Mode switch during runtime requires cache drain/reset
|
||||||
|
extern void tiny_unified_lifo_env_refresh_from_env(void);
|
||||||
|
|
||||||
|
#endif // TINY_UNIFIED_LIFO_ENV_BOX_H
|
||||||
@ -129,3 +129,17 @@ scripts/verify_health_profiles.sh
|
|||||||
- Rollback:
|
- Rollback:
|
||||||
- `export HAKMEM_TINY_TCACHE=0`
|
- `export HAKMEM_TINY_TCACHE=0`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 6. Follow-up investigation (at most two items, if any)
|
||||||
|
|
||||||
|
If Phase 14 v2 is NEUTRAL, do not keep running an open-ended "cap search"; first narrow the cause down to two candidates:
|
||||||
|
|
||||||
|
1) **Make the tcache hit rate visible (TLS counters only, no atomics)**
|
||||||
|
- Count hit/miss/overflow of `tiny_tcache_try_pop/push` in TLS and confirm on Mixed/C7-only whether the tcache is actually being hit.
|
||||||
|
|
||||||
|
2) **v3: introduce a TLS-only nextptr wrapper (no fence) dedicated to the tcache**
|
||||||
|
- `tiny_next_load/store` is the general-purpose SSOT, so it carries fence and header-restore branches.
|
||||||
|
- Because the tcache is TLS-only, use only `tiny_nextptr_offset()` and do the load/store via memcpy or a direct write (no fence); confine this tcache-only "next" to L1 and A/B it.
|
||||||
|
|
||||||
|
If neither of these two pays off, finalize the freeze of the Phase 14 line (tcache addition) and move on to other structural differences (metadata/segment/remote/footprint).
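
The TLS-only counters proposed in item 1 could look roughly like the sketch below. All names (`DemoTcacheStats`, `demo_count_pop`, and so on) are hypothetical and not part of hakmem; the only assumption carried over from the source is that the counters live in thread-local storage and are incremented without atomics.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-thread counters; names are illustrative only. */
typedef struct {
    uint64_t pop_hit;
    uint64_t pop_miss;
    uint64_t push_ok;
    uint64_t push_overflow;
} DemoTcacheStats;

/* __thread (GCC/Clang) gives each thread its own copy, so plain
 * increments are enough and no atomic operation is needed. */
static __thread DemoTcacheStats g_demo_stats;

/* Record the outcome of one pop attempt (NULL result = miss). */
static inline void demo_count_pop(const void* result) {
    if (result != NULL) g_demo_stats.pop_hit++;
    else                g_demo_stats.pop_miss++;
}

/* Record the outcome of one push attempt (0 = cache was full). */
static inline void demo_count_push(int accepted) {
    if (accepted) g_demo_stats.push_ok++;
    else          g_demo_stats.push_overflow++;
}

/* Pop hit rate of the calling thread in percent; 0.0 if nothing was counted. */
static inline double demo_pop_hit_rate(void) {
    uint64_t total = g_demo_stats.pop_hit + g_demo_stats.pop_miss;
    if (total == 0) return 0.0;
    return 100.0 * (double)g_demo_stats.pop_hit / (double)total;
}

/* One-shot dump of the calling thread's counters. */
static inline void demo_dump_stats(FILE* out) {
    assert(out != NULL);
    fprintf(out,
            "pop hit=%" PRIu64 " miss=%" PRIu64 " (%.2f%%) "
            "push ok=%" PRIu64 " overflow=%" PRIu64 "\n",
            g_demo_stats.pop_hit, g_demo_stats.pop_miss, demo_pop_hit_rate(),
            g_demo_stats.push_ok, g_demo_stats.push_overflow);
}
```

The counters are per thread, so a multi-threaded benchmark would need to dump or sum them per thread; that aggregation is left out of the sketch.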
|
||||||
|
|||||||
83
docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_AB_TEST_RESULTS.md
Normal file
@ -0,0 +1,83 @@
|
|||||||
|
# Phase 15 v1: UnifiedCache FIFO→LIFO (Stack) A/B Test Results
|
||||||
|
|
||||||
|
**Date:** 2025-12-15
|
||||||
|
**Benchmark:** Mixed (16–1024B) + C7-only (1025–2048B) 10-run cleanenv
|
||||||
|
**Target:** Transform existing UnifiedCache from FIFO ring to LIFO stack
|
||||||
|
**Expected ROI:** +5-10% (design estimate, cache locality improvement)
|
||||||
|
**GO Threshold:** +1.0% mean improvement
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. Implementation Summary
|
||||||
|
|
||||||
|
Phase 15 v1 transforms the existing array-based UnifiedCache from FIFO (ring buffer) to LIFO (stack) layout.
|
||||||
|
|
||||||
|
**Key Changes:**
|
||||||
|
- **Patch 1**: L0 ENV gate box (`tiny_unified_lifo_env_box.{h,c}`)
|
||||||
|
- **Patch 2**: L1 LIFO operations (`tiny_unified_lifo_box.h`)
|
||||||
|
- **Patch 3**: Hot path integration (`tiny_front_hot_box.h` - alloc/free both)
|
||||||
|
- **Patch 4**: Makefile updates (added `.o` files)
|
||||||
|
- **Patch 5**: bench_profile.h refresh sync
|
||||||
|
|
||||||
|
**Design:**
|
||||||
|
- Reuses existing `TinyUnifiedCache.slots[]` array (no intrusive pointers)
|
||||||
|
- `tail` treated as stack top (depth), `head` unused (always 0)
|
||||||
|
- Mode check at function entry (once per call)
|
||||||
|
- No wrap-around (`mask` unused in LIFO mode)
|
||||||
|
|
||||||
|
**ENV Control:**
|
||||||
|
```bash
|
||||||
|
export HAKMEM_TINY_UNIFIED_LIFO=0 # Baseline (FIFO)
|
||||||
|
export HAKMEM_TINY_UNIFIED_LIFO=1 # Optimized (LIFO)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Bonus Fix:**
|
||||||
|
- Discovered and fixed pre-existing LTO linkage bug for `tiny_c7_preserve_header_enabled()` (Phase 13/14 latent issue)
|
||||||
|
- Converted static inline to extern declaration + non-inline implementation
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. A/B Test Results
|
||||||
|
|
||||||
|
### Mixed (16–1024B):
|
||||||
|
- **Baseline (LIFO=0):** 52,965,966 ops/s
|
||||||
|
- **Optimized (LIFO=1):** 52,593,948 ops/s
|
||||||
|
- **Delta:** **-0.70%** (regression)
|
||||||
|
|
||||||
|
### C7-only (1025–2048B):
|
||||||
|
- **Baseline (LIFO=0):** 78,010,783 ops/s
|
||||||
|
- **Optimized (LIFO=1):** 78,335,509 ops/s
|
||||||
|
- **Delta:** **+0.42%** (slight improvement)
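
For reference, the deltas above are plain relative changes against the baseline. A minimal sketch of that arithmetic follows; the helper name `delta_percent` is an assumption for illustration and does not exist in the benchmark scripts.

```c
#include <assert.h>
#include <stdint.h>

/* Relative change of `optimized` versus `baseline`, in percent.
 * Positive means the optimized run was faster (more ops/s). */
static inline double delta_percent(uint64_t baseline, uint64_t optimized) {
    assert(baseline > 0);  /* a zero baseline would divide by zero */
    return 100.0 * ((double)optimized - (double)baseline) / (double)baseline;
}
```

Applied to the figures in this section, `delta_percent(52965966, 52593948)` is about -0.70 and `delta_percent(78010783, 78335509)` is about +0.42.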
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. Verdict: NEUTRAL
|
||||||
|
|
||||||
|
**Result:** Mixed -0.70%, C7-only +0.42% (both below the +1.0% GO threshold)
|
||||||
|
|
||||||
|
**Comparison to Phase 14:**
|
||||||
|
- Phase 14 v1 (tcache free-side only): Mixed +0.20% (NEUTRAL)
|
||||||
|
- Phase 14 v2 (tcache alloc+free): Mixed +0.08%, C7-only -0.39% (NEUTRAL)
|
||||||
|
- Phase 15 v1 (FIFO→LIFO): Mixed -0.70%, C7-only +0.42% (NEUTRAL)
|
||||||
|
|
||||||
|
**Root Cause:**
|
||||||
|
1. **Mode check overhead**: The entry-point `tiny_unified_lifo_enabled()` call adds an atomic load and a branch to every alloc/free call
|
||||||
|
2. **Minimal locality delta**: LIFO vs FIFO temporal locality difference is small in practice
|
||||||
|
3. **Existing optimization**: FIFO ring implementation already well-optimized
|
||||||
|
4. **Cache warming**: TLS cache pre-warming reduces locality sensitivity
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. Recommendation: Freeze as Research Box
|
||||||
|
|
||||||
|
**Decision:** Freeze Phase 15 v1 as research box (HAKMEM_TINY_UNIFIED_LIFO=0 default, OFF)
|
||||||
|
|
||||||
|
**Rationale:**
|
||||||
|
- Neither LIFO nor FIFO shows a significant advantage
|
||||||
|
- The per-call mode check appears to offset any locality gain
|
||||||
|
- Existing FIFO ring is simple and already fast
|
||||||
|
|
||||||
|
**Next:** Explore alternative approaches:
|
||||||
|
- Hybrid strategies (per-class mode selection)
|
||||||
|
- Batch operations (reduce per-call overhead)
|
||||||
|
- Hardware prefetch hints (explicit locality control)
|
||||||
120
docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_DESIGN.md
Normal file
@ -0,0 +1,120 @@
|
|||||||
|
# Phase 15: UnifiedCache FIFO→LIFO (Stack) v1 Design
|
||||||
|
|
||||||
|
**Date:** 2025-12-15
|
||||||
|
**Status:** DESIGN (next candidate)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 0. Motivation (Why this next?)
|
||||||
|
|
||||||
|
Phase 14 (intrusive tcache) v1/v2 was confirmed to be wired in end to end, but the result was **NEUTRAL**.
|
||||||
|
Compared with system/mimalloc, however, the shape of Tiny's thread cache remains the leading hypothesis.
|
||||||
|
|
||||||
|
The current `TinyUnifiedCache` is a **FIFO ring (head/tail + mask)**:
|
||||||
|
- every pop/push updates `head/tail` and applies `mask`
|
||||||
|
- the most recently freed block cannot be reused first (weak locality)
|
||||||
|
|
||||||
|
The winning pattern in glibc tcache / mimalloc-style allocators is often **LIFO**.
|
||||||
|
Phase 15 adds no intrusive (nextptr) links; it reuses the existing `slots[]` array as is and
|
||||||
|
changes the shape from **FIFO to LIFO (stack)**, aiming to win on both instruction count and locality.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. Design (Box Theory)
|
||||||
|
|
||||||
|
### 1.1 Boxes
|
||||||
|
|
||||||
|
```
|
||||||
|
L0: unified_cache_shape_env_box (ENV gate, reversible)
|
||||||
|
↓
|
||||||
|
L1: unified_cache_lifo_box (push/pop LIFO only, no side effects)
|
||||||
|
↓
|
||||||
|
L2: existing unified_cache (FIFO) (fallback / compatibility)
|
||||||
|
```
|
||||||
|
|
||||||
|
**There is exactly one boundary**:
|
||||||
|
- LIFO disabled → use FIFO
|
||||||
|
- LIFO enabled → use LIFO
|
||||||
|
|
||||||
|
(In the implementation this may end up as a branch inside a single function, but responsibilities are still separated by box.)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. API (minimal)
|
||||||
|
|
||||||
|
ENV:
|
||||||
|
- `HAKMEM_TINY_UNIFIED_LIFO=0/1` (default 0, opt-in)
|
||||||
|
|
||||||
|
L1 API (internal, static inline):
|
||||||
|
- `unified_cache_try_pop_lifo(int class_idx) -> BASE or NULL`
|
||||||
|
- `unified_cache_try_push_lifo(int class_idx, BASE) -> 1/0`
|
||||||
|
|
||||||
|
Integration points (candidates):
|
||||||
|
- `tiny_hot_alloc_fast()` / `tiny_hot_free_fast()`
|
||||||
|
- This is the real hot path, where the FIFO/LIFO difference can be measured most directly
|
||||||
|
- or `unified_cache_pop()` / `unified_cache_push()`
|
||||||
|
- when the change should take effect broadly without adding new call sites
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. Implementation sketch (concept)
|
||||||
|
|
||||||
|
### 3.1 LIFO state
|
||||||
|
|
||||||
|
Rather than adding a `top` field to `TinyUnifiedCache`, v1 prioritizes compatibility:
|
||||||
|
- of the existing `head/tail`, **treat `tail` as "top"** (stack depth)
|
||||||
|
- `head` is always 0 (or unused)
|
||||||
|
- `mask` is not needed (no wrap-around)
|
||||||
|
|
||||||
|
LIFO push:
|
||||||
|
- if `tail < capacity`: `slots[tail++] = base`
|
||||||
|
|
||||||
|
LIFO pop:
|
||||||
|
- if `tail > 0`: `base = slots[--tail]`
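
The push/pop rules above can be sketched as a self-contained array-backed stack. This is only an illustration under stated assumptions: `DemoStack`, `DEMO_CAPACITY`, and the function names are hypothetical stand-ins, not the real `TinyUnifiedCache` layout or the L1 API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CAPACITY 4  /* illustrative only; real capacities are per class */

/* Hypothetical stand-in for the cache: tail doubles as the stack depth. */
typedef struct {
    void*    slots[DEMO_CAPACITY];
    uint16_t tail;      /* index one past the top element */
    uint16_t capacity;
} DemoStack;

static inline void demo_stack_init(DemoStack* s) {
    s->tail = 0;
    s->capacity = DEMO_CAPACITY;
}

/* Push: returns 1 on success, 0 if the stack is full. */
static inline int demo_push_lifo(DemoStack* s, void* base) {
    if (s->tail >= s->capacity) return 0;
    s->slots[s->tail++] = base;
    assert(s->tail <= s->capacity);  /* depth invariant */
    return 1;
}

/* Pop: returns the most recently pushed pointer, or NULL if empty. */
static inline void* demo_pop_lifo(DemoStack* s) {
    if (s->tail == 0) return NULL;
    return s->slots[--s->tail];
}
```

No `mask` or `head` is touched on either path, which is the instruction-count argument made in the motivation section.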
|
||||||
|
|
||||||
|
(LIFO cannot coexist with FIFO, so consistency across a mode switch is handled at a drain/reset boundary.)
|
||||||
|
|
||||||
|
### 3.2 Mode switch safety (Fail-Fast)
|
||||||
|
|
||||||
|
- When switching LIFO ON, **reset `head/tail` for every class** after `unified_cache_init()` (treat the cache as empty)
|
||||||
|
- bench/profile assumes the ENV is settled before init; if the research box still provides a "refresh":
|
||||||
|
- forbid mode switches on refresh (or force a drain/reset)
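
One way to read the fail-fast rule is sketched below: a switch is refused while any cache still holds blocks, and `head/tail` are reset only once everything is empty. The types and the function `demo_try_switch_mode` are hypothetical. The sketch relies on the LIFO invariant `head == 0`, so `head != tail` means "non-empty" in either mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { DEMO_MODE_FIFO = 0, DEMO_MODE_LIFO = 1 };

/* Hypothetical reduced view of one per-class cache. */
typedef struct {
    uint16_t head;
    uint16_t tail;
} DemoCacheState;

/* Try to change the cache shape.
 * Returns 1 if the mode is now `new_mode`, 0 if the switch was refused
 * because at least one cache still holds blocks (caller must drain first). */
static inline int demo_try_switch_mode(DemoCacheState* caches, size_t n,
                                       int* mode, int new_mode) {
    assert(new_mode == DEMO_MODE_FIFO || new_mode == DEMO_MODE_LIFO);
    if (*mode == new_mode) return 1;  /* nothing to do */

    /* Fail fast: FIFO and LIFO interpret head/tail differently. */
    for (size_t i = 0; i < n; i++) {
        if (caches[i].head != caches[i].tail) return 0;
    }
    /* All empty: normalize indices so both interpretations agree. */
    for (size_t i = 0; i < n; i++) {
        caches[i].head = 0;
        caches[i].tail = 0;
    }
    *mode = new_mode;
    return 1;
}
```

The reset matters because a wrapped ring such as `head = 5, tail = 3` would be misread by the LIFO path as a stack of depth 3.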
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. A/B Plan (same binary)
|
||||||
|
|
||||||
|
Baseline:
|
||||||
|
```sh
|
||||||
|
HAKMEM_TINY_UNIFIED_LIFO=0 scripts/run_mixed_10_cleanenv.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
Optimized:
|
||||||
|
```sh
|
||||||
|
HAKMEM_TINY_UNIFIED_LIFO=1 scripts/run_mixed_10_cleanenv.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
Additionally, check whether locality matters:
|
||||||
|
```sh
|
||||||
|
HAKMEM_BENCH_C7_ONLY=1 HAKMEM_TINY_UNIFIED_LIFO=0 scripts/run_mixed_10_cleanenv.sh
|
||||||
|
HAKMEM_BENCH_C7_ONLY=1 HAKMEM_TINY_UNIFIED_LIFO=1 scripts/run_mixed_10_cleanenv.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
GO/NO-GO:
|
||||||
|
- GO: Mixed mean +1.0% or better
|
||||||
|
- NO-GO: mean -1.0% or worse
|
||||||
|
- NEUTRAL: within ±1.0% (freeze)
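
The decision rule above reduces to two comparisons. A small sketch follows, with the hypothetical names `DemoVerdict` and `demo_classify`. It assumes that exactly +1.0% counts as GO and exactly -1.0% as NO-GO, following the "or better" / "or worse" wording.

```c
#include <assert.h>

typedef enum {
    DEMO_VERDICT_NO_GO   = -1,
    DEMO_VERDICT_NEUTRAL = 0,
    DEMO_VERDICT_GO      = 1
} DemoVerdict;

/* Classify a mean throughput delta given in percent. */
static inline DemoVerdict demo_classify(double delta_pct) {
    assert(delta_pct == delta_pct);  /* reject NaN */
    if (delta_pct >= 1.0)  return DEMO_VERDICT_GO;
    if (delta_pct <= -1.0) return DEMO_VERDICT_NO_GO;
    return DEMO_VERDICT_NEUTRAL;     /* strictly inside ±1.0%: freeze */
}
```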
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 5. Risks
|
||||||
|
|
||||||
|
1) Gains cancelled out by the **fixed cost of the mode branch** (a repeat of Phase 11/14)
|
||||||
|
→ Mitigation: decide once at the entry and add no further branches in the hot path (function pointer or snapshot)
|
||||||
|
|
||||||
|
2) **Consistency at mode switch** (FIFO state and LIFO state are not compatible)
|
||||||
|
→ Mitigation: forbid switching on refresh, or pin drain/reset to a single boundary
|
||||||
|
|
||||||
|
3) **Dependence on capacity tuning**
|
||||||
|
→ v1 changes only the shape to check ROI; cap search is deferred to v2
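
The "function pointer" mitigation named under risk 1 could be sketched as follows: the mode is resolved once at init and the hot path only makes an indirect call. Everything here (`DemoCache`, `demo_cache_init`, the ring/stack helpers) is a hypothetical illustration, not the shipped implementation. An indirect call also has a cost and blocks inlining, so this is a variant to A/B rather than a known win.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SLOTS 8  /* power of two so the FIFO ring can wrap with a mask */

typedef struct DemoCache DemoCache;
struct DemoCache {
    void*    slots[DEMO_SLOTS];
    uint16_t head, tail;
    void*  (*pop)(DemoCache*);         /* resolved once in demo_cache_init */
    int    (*push)(DemoCache*, void*);
};

/* FIFO ring: pop at head, push at tail, one slot kept free. */
static void* demo_ring_pop(DemoCache* c) {
    if (c->head == c->tail) return NULL;
    void* p = c->slots[c->head];
    c->head = (uint16_t)((c->head + 1) & (DEMO_SLOTS - 1));
    return p;
}
static int demo_ring_push(DemoCache* c, void* p) {
    uint16_t next = (uint16_t)((c->tail + 1) & (DEMO_SLOTS - 1));
    if (next == c->head) return 0;
    c->slots[c->tail] = p;
    c->tail = next;
    return 1;
}

/* LIFO stack: tail is the depth, head stays 0. */
static void* demo_stack_pop(DemoCache* c) {
    if (c->tail == 0) return NULL;
    return c->slots[--c->tail];
}
static int demo_stack_push(DemoCache* c, void* p) {
    if (c->tail >= DEMO_SLOTS) return 0;
    c->slots[c->tail++] = p;
    return 1;
}

/* The only place where the mode is inspected. */
static inline void demo_cache_init(DemoCache* c, int lifo_mode) {
    assert(c != NULL);
    c->head = 0;
    c->tail = 0;
    c->pop  = lifo_mode ? demo_stack_pop  : demo_ring_pop;
    c->push = lifo_mode ? demo_stack_push : demo_ring_push;
}
```

After `demo_cache_init(&cache, 1)`, callers use `cache.pop(&cache)` and `cache.push(&cache, ptr)` without re-checking the mode.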
|
||||||
|
|
||||||
@ -0,0 +1,98 @@
|
|||||||
|
# Phase 15: UnifiedCache FIFO→LIFO (Stack) v1 — Next Instructions
|
||||||
|
|
||||||
|
Design: `docs/analysis/PHASE15_UNIFIEDCACHE_LIFO_1_DESIGN.md`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 0. Status / Why now
|
||||||
|
|
||||||
|
- Phase 14 v1/v2 (intrusive tcache) was **NEUTRAL** → frozen (default OFF)
|
||||||
|
- Next target: without adding intrusive links, reuse the existing `slots[]` and change **FIFO ring → LIFO stack** (use the shape to gain on instruction count and locality)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. GO criteria
|
||||||
|
|
||||||
|
Mixed 10-run (clean env):
|
||||||
|
- **GO**: mean +1.0% or better
|
||||||
|
- **NO-GO**: mean -1.0% or worse
|
||||||
|
- **NEUTRAL**: within ±1.0% → freeze as research box
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. Patch order (small increments)
|
||||||
|
|
||||||
|
### Patch 1: L0 ENV gate box (reversible)
|
||||||
|
|
||||||
|
New:
|
||||||
|
- `core/box/tiny_unified_lifo_env_box.{h,c}`
|
||||||
|
|
||||||
|
ENV:
|
||||||
|
- `HAKMEM_TINY_UNIFIED_LIFO=0/1` (default 0)
|
||||||
|
|
||||||
|
Requirements:
|
||||||
|
- no `getenv()` in the hot path (use the cached value)
|
||||||
|
- if syncing with bench_profile's `putenv()` is needed, provide a refresh API (but mind consistency across mode switches)
|
||||||
|
|
||||||
|
### Patch 2: L1 LIFO operations box (no side effects)
|
||||||
|
|
||||||
|
New (expected to be static inline):
|
||||||
|
- `core/box/tiny_unified_lifo_box.h`
|
||||||
|
|
||||||
|
API:
|
||||||
|
- `unified_cache_try_pop_lifo(int class_idx) -> void* base_or_null`
|
||||||
|
- `unified_cache_try_push_lifo(int class_idx, void* base) -> int handled(1/0)`
|
||||||
|
|
||||||
|
Implementation approach:
|
||||||
|
- treat `tail` of `TinyUnifiedCache` as "top" (compatibility first)
|
||||||
|
- when LIFO is enabled, `head` is not used (or is fixed at 0)
|
||||||
|
|
||||||
|
### Patch 3: Integration point (once, at entry)
|
||||||
|
|
||||||
|
Integration candidates (in priority order):
|
||||||
|
1) `core/box/tiny_front_hot_box.h` (the actual hot alloc/free)
|
||||||
|
2) `core/front/tiny_unified_cache.h` (for a broader effect)
|
||||||
|
|
||||||
|
Principles:
|
||||||
|
- the mode check happens **only once, at function entry**
|
||||||
|
- do not scatter repeated mode checks through the hot path (lesson from Phase 11)
|
||||||
|
|
||||||
|
### Patch 4: Visibility (minimal)
|
||||||
|
|
||||||
|
Only when needed:
|
||||||
|
- count LIFO hit/miss in TLS counters (no atomics)
|
||||||
|
- one-shot dump (ENV opt-in)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. A/B (same binary)
|
||||||
|
|
||||||
|
Baseline:
|
||||||
|
```sh
|
||||||
|
HAKMEM_TINY_UNIFIED_LIFO=0 scripts/run_mixed_10_cleanenv.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
Optimized:
|
||||||
|
```sh
|
||||||
|
HAKMEM_TINY_UNIFIED_LIFO=1 scripts/run_mixed_10_cleanenv.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
Additional (does locality matter?):
|
||||||
|
```sh
|
||||||
|
HAKMEM_BENCH_C7_ONLY=1 HAKMEM_TINY_UNIFIED_LIFO=0 scripts/run_mixed_10_cleanenv.sh
|
||||||
|
HAKMEM_BENCH_C7_ONLY=1 HAKMEM_TINY_UNIFIED_LIFO=1 scripts/run_mixed_10_cleanenv.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. Health check
|
||||||
|
|
||||||
|
```sh
|
||||||
|
scripts/verify_health_profiles.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 5. Rollback
|
||||||
|
|
||||||
|
- `export HAKMEM_TINY_UNIFIED_LIFO=0`
|
||||||
14
hakmem.d
@ -98,13 +98,18 @@ hakmem.o: core/hakmem.c core/hakmem.h core/hakmem_build_flags.h \
|
|||||||
core/box/../front/../box/ptr_type_box.h \
|
core/box/../front/../box/ptr_type_box.h \
|
||||||
core/box/../front/../box/tiny_front_config_box.h \
|
core/box/../front/../box/tiny_front_config_box.h \
|
||||||
core/box/../front/../box/../hakmem_build_flags.h \
|
core/box/../front/../box/../hakmem_build_flags.h \
|
||||||
|
core/box/../front/../box/tiny_tcache_box.h \
|
||||||
|
core/box/../front/../box/../hakmem_tiny_config.h \
|
||||||
|
core/box/../front/../box/../tiny_nextptr.h \
|
||||||
|
core/box/../front/../box/tiny_tcache_env_box.h \
|
||||||
core/box/../front/../tiny_region_id.h core/box/../front/../hakmem_tiny.h \
|
core/box/../front/../tiny_region_id.h core/box/../front/../hakmem_tiny.h \
|
||||||
core/box/../front/../box/tiny_env_box.h \
|
core/box/../front/../box/tiny_env_box.h \
|
||||||
core/box/../front/../box/tiny_front_hot_box.h \
|
core/box/../front/../box/tiny_front_hot_box.h \
|
||||||
core/box/../front/../box/../hakmem_tiny_config.h \
|
|
||||||
core/box/../front/../box/../tiny_region_id.h \
|
core/box/../front/../box/../tiny_region_id.h \
|
||||||
core/box/../front/../box/../front/tiny_unified_cache.h \
|
core/box/../front/../box/../front/tiny_unified_cache.h \
|
||||||
core/box/../front/../box/tiny_header_box.h \
|
core/box/../front/../box/tiny_header_box.h \
|
||||||
|
core/box/../front/../box/tiny_unified_lifo_box.h \
|
||||||
|
core/box/../front/../box/tiny_unified_lifo_env_box.h \
|
||||||
core/box/../front/../box/tiny_front_cold_box.h \
|
core/box/../front/../box/tiny_front_cold_box.h \
|
||||||
core/box/../front/../box/tiny_layout_box.h \
|
core/box/../front/../box/tiny_layout_box.h \
|
||||||
core/box/../front/../box/tiny_hotheap_v2_box.h \
|
core/box/../front/../box/tiny_hotheap_v2_box.h \
|
||||||
@ -346,14 +351,19 @@ core/box/../front/tiny_unified_cache.h:
|
|||||||
core/box/../front/../box/ptr_type_box.h:
|
core/box/../front/../box/ptr_type_box.h:
|
||||||
core/box/../front/../box/tiny_front_config_box.h:
|
core/box/../front/../box/tiny_front_config_box.h:
|
||||||
core/box/../front/../box/../hakmem_build_flags.h:
|
core/box/../front/../box/../hakmem_build_flags.h:
|
||||||
|
core/box/../front/../box/tiny_tcache_box.h:
|
||||||
|
core/box/../front/../box/../hakmem_tiny_config.h:
|
||||||
|
core/box/../front/../box/../tiny_nextptr.h:
|
||||||
|
core/box/../front/../box/tiny_tcache_env_box.h:
|
||||||
core/box/../front/../tiny_region_id.h:
|
core/box/../front/../tiny_region_id.h:
|
||||||
core/box/../front/../hakmem_tiny.h:
|
core/box/../front/../hakmem_tiny.h:
|
||||||
core/box/../front/../box/tiny_env_box.h:
|
core/box/../front/../box/tiny_env_box.h:
|
||||||
core/box/../front/../box/tiny_front_hot_box.h:
|
core/box/../front/../box/tiny_front_hot_box.h:
|
||||||
core/box/../front/../box/../hakmem_tiny_config.h:
|
|
||||||
core/box/../front/../box/../tiny_region_id.h:
|
core/box/../front/../box/../tiny_region_id.h:
|
||||||
core/box/../front/../box/../front/tiny_unified_cache.h:
|
core/box/../front/../box/../front/tiny_unified_cache.h:
|
||||||
core/box/../front/../box/tiny_header_box.h:
|
core/box/../front/../box/tiny_header_box.h:
|
||||||
|
core/box/../front/../box/tiny_unified_lifo_box.h:
|
||||||
|
core/box/../front/../box/tiny_unified_lifo_env_box.h:
|
||||||
core/box/../front/../box/tiny_front_cold_box.h:
|
core/box/../front/../box/tiny_front_cold_box.h:
|
||||||
core/box/../front/../box/tiny_layout_box.h:
|
core/box/../front/../box/tiny_layout_box.h:
|
||||||
core/box/../front/../box/tiny_hotheap_v2_box.h:
|
core/box/../front/../box/tiny_hotheap_v2_box.h:
|
||||||
|
|||||||