Phase 13 v1 + E5-2 retest: Both NEUTRAL, freeze as research boxes
Phase 13 v1: Header Write Elimination (C7 preserve header)
- Verdict: NEUTRAL (+0.78%)
- Implementation: HAKMEM_TINY_C7_PRESERVE_HEADER ENV gate (default OFF)
- Makes C7 nextptr offset conditional (0→1 when enabled)
- 4-point matrix A/B test results:
* Case A (baseline): 51.49M ops/s
* Case B (WRITE_ONCE=1): 52.07M ops/s (+1.13%)
* Case C (C7_PRESERVE=1): 51.36M ops/s (-0.26%)
* Case D (both): 51.89M ops/s (+0.78% NEUTRAL)
- Action: Freeze as research box (default OFF, manual opt-in)
Phase 5 E5-2: Header Write-Once retest (promotion test)
- Verdict: NEUTRAL (+0.54%)
- Motivation: Phase 13 Case B showed +1.13% in a 10-run matrix, so it was retested with a dedicated 20-run benchmark
- Results (20-run):
* Case A (baseline): 51.10M ops/s
* Case B (WRITE_ONCE=1): 51.37M ops/s (+0.54%)
- Previous test: +0.45% (consistent with NEUTRAL)
- Action: Keep as research box (default OFF, manual opt-in)
Key findings:
- Header write tax optimization shows consistent NEUTRAL results
- Neither Phase 13 v1 nor E5-2 reaches GO threshold (+1.0%)
- Both implemented as reversible ENV gates for future research
Files changed:
- New: core/box/tiny_c7_preserve_header_env_box.{c,h}
- Modified: core/box/tiny_layout_box.h (C7 offset conditional)
- Modified: core/tiny_nextptr.h, core/box/tiny_header_box.h (comments)
- Modified: core/bench_profile.h (refresh sync)
- Modified: Makefile (add new .o files)
- Modified: scripts/run_mixed_10_cleanenv.sh (add C7_PRESERVE ENV)
- Docs: PHASE13_*, PHASE5_E5_2_HEADER_WRITE_ONCE_* (design/results)
Next: Phase 14 (Pointer-chase reduction, tcache-style intrusive LIFO)
🤖 Generated with Claude Code
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@@ -162,22 +162,74 @@ Cumulative improvements achieved in Phases 6-10:

 Details: `docs/analysis/PHASE12_STRATEGIC_PAUSE_RESULTS.md`

-### Next: Phase 13 (Header Write Elimination)
+### Phase 13: Header Write Elimination v1 - NEUTRAL (+0.78%) ⚠️ RESEARCH BOX

-**Direction decided**: Lift the pause and proceed to Phase 13 ✅
+**Date**: 2025-12-14
+**Verdict**: **NEUTRAL (+0.78%)** - Frozen as research box (default OFF, manual opt-in)

-**Target**: Eliminate the 1-byte header write (top-priority hypothesis)
+**Target**: Reduce the steady-state header write tax (top-priority hypothesis)

-**Strategy**:
-- Place the header before the user pointer (system malloc pattern)
-- Or use header-less classification (RegionId only)
+**Strategy (v1)**:
+- Move toward a layout in which **the C7 freelist does not clobber the header**, so that E5-2 (write-once) can also be applied to C7
+- ENV: `HAKMEM_TINY_C7_PRESERVE_HEADER=0/1` (default: 0)

-**Expected ROI**: **+10-20%**
+**Results (4-Point Matrix)**:
+| Case | C7_PRESERVE | WRITE_ONCE | Mean (ops/s) | Delta | Verdict |
+|------|-------------|------------|--------------|-------|---------|
+| A (baseline) | 0 | 0 | 51,490,500 | n/a | n/a |
+| **B (E5-2 only)** | 0 | 1 | **52,070,600** | **+1.13%** | candidate |
+| C (C7 preserve) | 1 | 0 | 51,355,200 | -0.26% | NEUTRAL |
+| D (Phase 13 v1) | 1 | 1 | 51,891,902 | +0.78% | NEUTRAL |

-**Next Actions**:
-1. Measure header write overhead (perf annotate)
-2. Verify the feasibility of header-less classification
-3. Write the Phase 13 design document
+**Key Findings**:
+1. **E5-2 (HAKMEM_TINY_HEADER_WRITE_ONCE=1) showed a one-off +1.13%, but the 20-run retest came out NEUTRAL (+0.54%)**
+   - Reference: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`
+   - Conclusion: E5-2 stays a research box (default OFF)
+
+2. **C7 preserve header alone: -0.26%** (slight regression)
+   - C7 offset=1 memcpy overhead outweighs benefits
+
+3. **Combined (Phase 13 v1): +0.78%** (positive but below GO)
+   - C7 preserve reduces E5-2 gains
+
+**Action**:
+- ✅ Freeze Phase 13 v1 as research box (default OFF)
+- ✅ Re-test Phase 5 E5-2 (WRITE_ONCE=1) with dedicated 20-run → NEUTRAL (+0.54%)
+- 📋 Document results: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md`
+
+### Phase 5 E5-2: Header Write-Once - Retest NEUTRAL (+0.54%) ⚪
+
+**Date**: 2025-12-14
+**Verdict**: ⚪ **NEUTRAL (+0.54%)** - Kept as research box (default OFF)
+
+**Motivation**: E5-2 alone recorded +1.13% in the Phase 13 4-point matrix, so a dedicated 20-run test was used to decide whether to promote it.
+
+**Results (20-run)**:
+| Case | WRITE_ONCE | Mean (ops/s) | Median (ops/s) | Delta |
+|------|------------|--------------|----------------|-------|
+| A (baseline) | 0 | 51,096,839 | 51,127,725 | n/a |
+| B (optimized) | 1 | 51,371,358 | 51,495,811 | **+0.54%** |
+
+**Verdict**: NEUTRAL (+0.54%), below the GO threshold (+1.0%)
+
+**Discussion**:
+- The +1.13% in Phase 13 was observed over 10 runs
+- The dedicated 20-run test gives +0.54% (more reliable)
+- Consistent with the earlier E5-2 test (+0.45%)
+
+**Action**:
+- ✅ Keep as research box (default OFF, manual opt-in)
+- ENV: `HAKMEM_TINY_HEADER_WRITE_ONCE=0/1` (default: 0)
+- 📋 Details: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`
+
+**Next**: Proceed to the next gap hypothesis from the Phase 12 Strategic Pause
+
+### Next: Phase 14 (Pointer Chase Reduction / Tiny tcache)
+
+**Goal**: Move closer to system malloc's tcache and reduce the "array/FIFO/indirection" cost of the Tiny frontend.
+
+- Design: `docs/analysis/PHASE14_POINTER_CHASE_REDUCTION_1_DESIGN.md`
+- Instructions: `docs/analysis/PHASE14_POINTER_CHASE_REDUCTION_1_NEXT_INSTRUCTIONS.md`

 ## Update Notes (2025-12-14 Phase 5 E5-3 Analysis - Strategic Pivot)

Makefile
@@ -218,12 +218,12 @@ LDFLAGS += $(EXTRA_LDFLAGS)

# Targets
TARGET = test_hakmem
OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o 
core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/hakmem_env_snapshot_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o 
core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/hakmem_env_snapshot_box.o core/box/tiny_c7_preserve_header_env_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
OBJS = $(OBJS_BASE)
# Shared library
SHARED_LIB = libhakmem.so
SHARED_OBJS = hakmem_shared.o hakmem_config_shared.o hakmem_tiny_config_shared.o hakmem_ucb1_shared.o hakmem_bigcache_shared.o hakmem_pool_shared.o hakmem_l25_pool_shared.o hakmem_site_rules_shared.o hakmem_tiny_shared.o core/box/ss_allocation_box_shared.o superslab_stats_shared.o superslab_cache_shared.o superslab_ace_shared.o superslab_slab_shared.o superslab_backend_shared.o core/superslab_head_stub_shared.o hakmem_smallmid_shared.o core/box/superslab_expansion_box_shared.o core/box/integrity_box_shared.o core/box/mailbox_box_shared.o core/box/front_gate_box_shared.o core/box/front_gate_classifier_shared.o core/box/free_publish_box_shared.o core/box/capacity_box_shared.o core/box/carve_push_box_shared.o core/box/prewarm_box_shared.o core/box/ss_hot_prewarm_box_shared.o core/box/front_metrics_box_shared.o core/box/bench_fast_box_shared.o core/box/ss_addr_map_box_shared.o core/box/ss_pt_impl_shared.o core/box/slab_recycling_box_shared.o core/box/pagefault_telemetry_box_shared.o core/box/tiny_sizeclass_hist_box_shared.o core/box/tiny_env_box_shared.o core/box/tiny_route_box_shared.o core/box/free_front_v3_env_box_shared.o core/box/free_path_stats_box_shared.o core/box/free_dispatch_stats_box_shared.o core/box/alloc_gate_stats_box_shared.o core/box/tiny_page_box_shared.o core/box/tiny_class_policy_box_shared.o core/box/tiny_class_stats_box_shared.o core/box/tiny_policy_learner_box_shared.o core/box/ss_budget_box_shared.o core/box/tiny_mem_stats_box_shared.o core/box/wrapper_env_box_shared.o core/box/free_wrapper_env_snapshot_box_shared.o core/box/malloc_wrapper_env_snapshot_box_shared.o core/box/madvise_guard_box_shared.o core/box/libm_reloc_guard_box_shared.o core/box/hakmem_env_snapshot_box_shared.o core/page_arena_shared.o core/front/tiny_unified_cache_shared.o core/tiny_alloc_fast_push_shared.o core/tiny_c7_ultra_segment_shared.o core/tiny_c7_ultra_shared.o core/link_stubs_shared.o core/tiny_failfast_shared.o tiny_sticky_shared.o tiny_remote_shared.o 
tiny_publish_shared.o tiny_debug_ring_shared.o hakmem_tiny_magazine_shared.o hakmem_tiny_stats_shared.o hakmem_tiny_sfc_shared.o hakmem_tiny_query_shared.o hakmem_tiny_rss_shared.o hakmem_tiny_registry_shared.o hakmem_tiny_remote_target_shared.o hakmem_tiny_bg_spill_shared.o tiny_adaptive_sizing_shared.o hakmem_super_registry_shared.o hakmem_shared_pool_shared.o hakmem_shared_pool_acquire_shared.o hakmem_shared_pool_release_shared.o hakmem_elo_shared.o hakmem_batch_shared.o hakmem_p2_shared.o hakmem_sizeclass_dist_shared.o hakmem_evo_shared.o hakmem_debug_shared.o hakmem_sys_shared.o hakmem_whale_shared.o hakmem_policy_shared.o hakmem_ace_shared.o hakmem_ace_stats_shared.o hakmem_ace_controller_shared.o hakmem_ace_metrics_shared.o hakmem_ace_ucb1_shared.o hakmem_prof_shared.o hakmem_learner_shared.o hakmem_size_hist_shared.o hakmem_learn_log_shared.o hakmem_syscall_shared.o tiny_fastcache_shared.o core/box/super_reg_box_shared.o core/box/shared_pool_box_shared.o core/box/remote_side_box_shared.o core/tiny_destructors_shared.o
SHARED_OBJS = hakmem_shared.o hakmem_config_shared.o hakmem_tiny_config_shared.o hakmem_ucb1_shared.o hakmem_bigcache_shared.o hakmem_pool_shared.o hakmem_l25_pool_shared.o hakmem_site_rules_shared.o hakmem_tiny_shared.o core/box/ss_allocation_box_shared.o superslab_stats_shared.o superslab_cache_shared.o superslab_ace_shared.o superslab_slab_shared.o superslab_backend_shared.o core/superslab_head_stub_shared.o hakmem_smallmid_shared.o core/box/superslab_expansion_box_shared.o core/box/integrity_box_shared.o core/box/mailbox_box_shared.o core/box/front_gate_box_shared.o core/box/front_gate_classifier_shared.o core/box/free_publish_box_shared.o core/box/capacity_box_shared.o core/box/carve_push_box_shared.o core/box/prewarm_box_shared.o core/box/ss_hot_prewarm_box_shared.o core/box/front_metrics_box_shared.o core/box/bench_fast_box_shared.o core/box/ss_addr_map_box_shared.o core/box/ss_pt_impl_shared.o core/box/slab_recycling_box_shared.o core/box/pagefault_telemetry_box_shared.o core/box/tiny_sizeclass_hist_box_shared.o core/box/tiny_env_box_shared.o core/box/tiny_route_box_shared.o core/box/free_front_v3_env_box_shared.o core/box/free_path_stats_box_shared.o core/box/free_dispatch_stats_box_shared.o core/box/alloc_gate_stats_box_shared.o core/box/tiny_page_box_shared.o core/box/tiny_class_policy_box_shared.o core/box/tiny_class_stats_box_shared.o core/box/tiny_policy_learner_box_shared.o core/box/ss_budget_box_shared.o core/box/tiny_mem_stats_box_shared.o core/box/wrapper_env_box_shared.o core/box/free_wrapper_env_snapshot_box_shared.o core/box/malloc_wrapper_env_snapshot_box_shared.o core/box/madvise_guard_box_shared.o core/box/libm_reloc_guard_box_shared.o core/box/hakmem_env_snapshot_box_shared.o core/box/tiny_c7_preserve_header_env_box_shared.o core/page_arena_shared.o core/front/tiny_unified_cache_shared.o core/tiny_alloc_fast_push_shared.o core/tiny_c7_ultra_segment_shared.o core/tiny_c7_ultra_shared.o core/link_stubs_shared.o core/tiny_failfast_shared.o 
tiny_sticky_shared.o tiny_remote_shared.o tiny_publish_shared.o tiny_debug_ring_shared.o hakmem_tiny_magazine_shared.o hakmem_tiny_stats_shared.o hakmem_tiny_sfc_shared.o hakmem_tiny_query_shared.o hakmem_tiny_rss_shared.o hakmem_tiny_registry_shared.o hakmem_tiny_remote_target_shared.o hakmem_tiny_bg_spill_shared.o tiny_adaptive_sizing_shared.o hakmem_super_registry_shared.o hakmem_shared_pool_shared.o hakmem_shared_pool_acquire_shared.o hakmem_shared_pool_release_shared.o hakmem_elo_shared.o hakmem_batch_shared.o hakmem_p2_shared.o hakmem_sizeclass_dist_shared.o hakmem_evo_shared.o hakmem_debug_shared.o hakmem_sys_shared.o hakmem_whale_shared.o hakmem_policy_shared.o hakmem_ace_shared.o hakmem_ace_stats_shared.o hakmem_ace_controller_shared.o hakmem_ace_metrics_shared.o hakmem_ace_ucb1_shared.o hakmem_prof_shared.o hakmem_learner_shared.o hakmem_size_hist_shared.o hakmem_learn_log_shared.o hakmem_syscall_shared.o tiny_fastcache_shared.o core/box/super_reg_box_shared.o core/box/shared_pool_box_shared.o core/box/remote_side_box_shared.o core/tiny_destructors_shared.o
# Pool TLS Phase 1 (enable with POOL_TLS_PHASE1=1)
ifeq ($(POOL_TLS_PHASE1),1)
@@ -427,7 +427,7 @@ test-box-refactor: box-refactor
./larson_hakmem 10 8 128 1024 1 12345 4
# Phase 4: Tiny Pool benchmarks (properly linked with hakmem)
TINY_BENCH_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/free_publish_box.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/tiny_free_route_cache_env_box.o core/box/hakmem_env_snapshot_box.o core/page_arena.o core/front/tiny_unified_cache.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o 
hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
TINY_BENCH_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/free_publish_box.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/tiny_free_route_cache_env_box.o core/box/hakmem_env_snapshot_box.o core/box/tiny_c7_preserve_header_env_box.o core/page_arena.o core/front/tiny_unified_cache.o tiny_sticky.o tiny_remote.o tiny_publish.o 
tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
TINY_BENCH_OBJS = $(TINY_BENCH_OBJS_BASE)
ifeq ($(POOL_TLS_PHASE1),1)
TINY_BENCH_OBJS += pool_tls.o pool_refill.o core/pool_tls_arena.o pool_tls_registry.o pool_tls_remote.o
@@ -10,6 +10,7 @@
 #include "box/tiny_static_route_box.h" // tiny_static_route_refresh_from_env (Phase 3 C3)
 #include "box/hakmem_env_snapshot_box.h" // hakmem_env_snapshot_refresh_from_env (Phase 4 E1)
 #include "box/tiny_free_route_cache_env_box.h" // tiny_free_static_route_refresh_from_env (Phase 8)
+#include "box/tiny_c7_preserve_header_env_box.h" // tiny_c7_preserve_header_env_refresh_from_env (Phase 13 v1)
 #endif

 // Set defaults only when the env var is not already set
@@ -184,5 +185,7 @@ static inline void bench_apply_profile(void) {
     hakmem_env_snapshot_refresh_from_env();
     // Phase 8: Sync free static route ENV cache after bench_profile putenv defaults.
     tiny_free_static_route_refresh_from_env();
+    // Phase 13 v1: Sync C7 preserve header ENV cache after bench_profile putenv defaults.
+    tiny_c7_preserve_header_env_refresh_from_env();
 #endif
 }

core/box/tiny_c7_preserve_header_env_box.c
Normal file
@@ -0,0 +1,50 @@
+// ============================================================================
+// Phase 13 v1: Tiny C7 Preserve Header ENV Box (L0) - Implementation
+// ============================================================================
+
+#include "tiny_c7_preserve_header_env_box.h"
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include <unistd.h>
+
+// ============================================================================
+// Global State
+// ============================================================================
+
+_Atomic int g_tiny_c7_preserve_header_enabled = -1;
+
+// ============================================================================
+// Init (Cold Path)
+// ============================================================================
+
+int tiny_c7_preserve_header_env_init(void) {
+    const char* env = getenv("HAKMEM_TINY_C7_PRESERVE_HEADER");
+    int enabled = 0; // default: OFF (opt-in)
+
+    if (env && (env[0] == '1' || strcmp(env, "true") == 0 || strcmp(env, "TRUE") == 0)) {
+        enabled = 1;
+    }
+
+    // Cache result
+    atomic_store_explicit(&g_tiny_c7_preserve_header_enabled, enabled, memory_order_relaxed);
+
+    // Log once (stderr for immediate visibility)
+    if (enabled) {
+        const char msg[] = "[C7_PRESERVE_HEADER] enabled\n";
+        ssize_t w = write(2, msg, sizeof(msg) - 1);
+        (void)w;
+    }
+
+    return enabled;
+}
+
+// ============================================================================
+// Refresh (Cold Path, called from bench_profile)
+// ============================================================================
+
+void tiny_c7_preserve_header_env_refresh_from_env(void) {
+    // Reset to uninitialized state (-1)
+    // Next call to tiny_c7_preserve_header_enabled() will re-read ENV
+    atomic_store_explicit(&g_tiny_c7_preserve_header_enabled, -1, memory_order_relaxed);
+}
core/box/tiny_c7_preserve_header_env_box.h
Normal file
@@ -0,0 +1,72 @@
+// ============================================================================
+// Phase 13 v1: Tiny C7 Preserve Header ENV Box (L0)
+// ============================================================================
+//
+// Purpose: ENV gate for C7 header-preserving freelist layout
+//
+// Design: docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_DESIGN.md
+// Instructions: docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_NEXT_INSTRUCTIONS.md
+//
+// Strategy:
+//   - Keep the C7 (1025-2048B) freelist from clobbering the header
+//   - Change the nextptr offset from 0 to 1 (skip the 1-byte header)
+//   - This reduces header rewrites at alloc time
+//
+// ENV:
+//   HAKMEM_TINY_C7_PRESERVE_HEADER=0/1 (default: 0, opt-in)
+//
+// API:
+//   tiny_c7_preserve_header_enabled() -> int
+//   tiny_c7_preserve_header_env_refresh_from_env()
+//
+// Box Theory:
+//   - L0: This file (ENV gate, reversible)
+//   - L1: tiny_layout_box.h (SSOT: tiny_nextptr_offset)
+//   - L2: tiny_nextptr.h, tiny_header_box.h (affected code)
+//
+// Safety:
+//   - ENV-gated (default OFF, opt-in)
+//   - Reversible (ENV toggle)
+//   - Minimal change (only the C7 offset, 0→1)
+//
+// ============================================================================
+
+#ifndef TINY_C7_PRESERVE_HEADER_ENV_BOX_H
+#define TINY_C7_PRESERVE_HEADER_ENV_BOX_H
+
+#include <stdatomic.h>
+
+// ============================================================================
+// Global State (L0)
+// ============================================================================
+
+// Cached state: -1 (uninitialized), 0 (disabled), 1 (enabled)
+extern _Atomic int g_tiny_c7_preserve_header_enabled;
+
+// ============================================================================
+// Hot Inline API (L0)
+// ============================================================================
+
+// Check if C7 preserve header is enabled
+// Returns: 1 if enabled, 0 if disabled
+static inline int tiny_c7_preserve_header_enabled(void) {
+    int val = atomic_load_explicit(&g_tiny_c7_preserve_header_enabled, memory_order_relaxed);
+
+    if (__builtin_expect(val == -1, 0)) {
+        // Lazy init: read ENV once
+        extern int tiny_c7_preserve_header_env_init(void);
+        val = tiny_c7_preserve_header_env_init();
+    }
+
+    return val;
+}
+
+// ============================================================================
+// Cold API (L2)
+// ============================================================================
+
+// Refresh ENV cache (called from bench_profile after putenv)
+// Pattern: Same as Phase 8 (FREE_STATIC_ROUTE)
+extern void tiny_c7_preserve_header_env_refresh_from_env(void);
+
+#endif // TINY_C7_PRESERVE_HEADER_ENV_BOX_H
@@ -41,13 +41,14 @@
 //
 // Returns:
 //   true  - C1-C6: Header preserved at offset 0, next at offset 1
-//   false - C0, C7: Header overwritten by next pointer at offset 0
+//   false - C0: Header overwritten by next pointer at offset 0
+//   Phase 13 v1: C7 returns false (default) or true (HAKMEM_TINY_C7_PRESERVE_HEADER=1)

 static inline bool tiny_class_preserves_header(int class_idx) {
 #if HAKMEM_TINY_HEADER_CLASSIDX
     // Delegate to tiny_layout_box.h specification (Single Source of Truth)
-    // next_off=0 → header overwritten (C0, C7)
-    // next_off=1 → header preserved (C1-C6)
+    // next_off=0 → header overwritten (C0, C7 default)
+    // next_off=1 → header preserved (C1-C6, C7 with HAKMEM_TINY_C7_PRESERVE_HEADER=1)
     return tiny_nextptr_offset(class_idx) != 0;
 #else
     // Headers disabled globally
@@ -87,7 +88,8 @@ static inline void tiny_header_write_if_preserved(void* base, int class_idx) {
 // ============================================================================
 //
 // Validates header ONLY if this class preserves headers.
-// For C0/C7, validation is impossible (next pointer is stored at offset 0).
+// For C0, validation is impossible (next pointer is stored at offset 0).
+// Phase 13 v1: C7 validation depends on HAKMEM_TINY_C7_PRESERVE_HEADER.
 //
 // Arguments:
 //   base - BASE pointer (not user pointer)

@@ -79,14 +79,29 @@ static inline size_t tiny_user_offset(int class_idx) {
 // Offset for storing the freelist next pointer inside a freed block.
 // This is distinct from tiny_user_offset():
 // - User offset is always +1 in header mode.
-// - Next offset is 0 for C0/C7 (cannot preserve header while free), else 1.
+// - Next offset:
+//   - C0: always 0 (16B, cannot fit header+next)
+//   - C1-C6: always 1 (header-preserving)
+//   - C7: 0 (default) or 1 (Phase 13 v1: header-preserving)
 static inline size_t tiny_nextptr_offset(int class_idx) {
 #if HAKMEM_TINY_HEADERLESS
     (void)class_idx;
     return 0;
 #elif HAKMEM_TINY_HEADER_CLASSIDX
-    // Bit pattern: C0=0, C1-C6=1, C7=0 → 0b01111110 = 0x7E
-    return (0x7Eu >> ((unsigned)class_idx & 7u)) & 1u;
+    // Phase 13 v1: C7 preserve header gate
+    // Bit pattern (default): C0=0, C1-C6=1, C7=0 → 0b01111110 = 0x7E
+    // Bit pattern (C7 preserve): C0=0, C1-C7=1 → 0b11111110 = 0xFE
+    unsigned int base_pattern = 0x7Eu; // default: C7 offset=0
+
+    // Phase 13 v1: Gate for C7 header-preserving layout
+    if (class_idx == 7) {
+        extern int tiny_c7_preserve_header_enabled(void);
+        if (tiny_c7_preserve_header_enabled()) {
+            base_pattern = 0xFEu; // C7 offset=1 (header-preserving)
+        }
+    }
+
+    return (base_pattern >> ((unsigned)class_idx & 7u)) & 1u;
 #else
     (void)class_idx;
     return 0u;
@@ -1,7 +1,8 @@
 // tiny_nextptr.h - Authoritative next-pointer offset/load/store for tiny boxes
 //
 // Finalized Phase E1-CORRECT spec (including physical constraints):
-// P0.1 updated: C0 and C7 use offset 0, C1-C6 use offset 1 (header preserved)
+// P0.1 updated: C0 uses offset 0, C1-C6 use offset 1 (header preserved)
+// Phase 13 v1: C7 uses offset 0 (default) or 1 (HAKMEM_TINY_C7_PRESERVE_HEADER=1)
 //
 // When HAKMEM_TINY_HEADER_CLASSIDX != 0:
 //
@@ -18,8 +19,8 @@
 //
 // Class 7:
 // [1B header][payload 2047B]
-// → header is overwritten; next is stored at base+0 (acceptable because this is the largest size class)
-// → next_off = 0
+// → next_off = 0 (default: header is overwritten)
+// → next_off = 1 (Phase 13 v1: HAKMEM_TINY_C7_PRESERVE_HEADER=1)
 //
 // When HAKMEM_TINY_HEADER_CLASSIDX == 0:
 //
@@ -56,7 +57,8 @@ static __thread void* g_tiny_next_ra1 __attribute__((unused)) = NULL;
 static __thread void* g_tiny_next_ra2 __attribute__((unused)) = NULL;
 
 // Compute freelist next-pointer offset within a block for the given class.
-// P0.1 updated: C0 and C7 use offset 0, C1-C6 use offset 1 (header preserved)
+// P0.1: C0 uses offset 0, C1-C6 use offset 1 (header preserved)
+// Phase 13 v1: C7 uses offset 0 (default) or 1 (HAKMEM_TINY_C7_PRESERVE_HEADER=1)
 // Rationale for C0: 8B stride cannot fit [1B header][8B next pointer] without overflow
 static inline __attribute__((always_inline)) size_t tiny_next_off(int class_idx) {
     return tiny_nextptr_offset(class_idx);
@@ -186,7 +188,8 @@ static inline __attribute__((always_inline)) void* tiny_next_load(const void* ba
 // - When class_map is used for class_idx lookup (default), header restoration is unnecessary
 // - Alloc path always writes fresh header before returning block to user (HAK_RET_ALLOC)
 // - ENV: HAKMEM_TINY_RESTORE_HEADER=1 to force header restoration (legacy mode)
-// P0.1: C7 uses offset 0 (overwrites header), C0-C6 use offset 1 (header preserved)
+// P0.1: C0 uses offset 0 (overwrites header), C1-C6 use offset 1 (header preserved)
+// Phase 13 v1: C7 uses offset 0 (default) or 1 (HAKMEM_TINY_C7_PRESERVE_HEADER=1)
 static inline __attribute__((always_inline)) void tiny_next_store(void* base, int class_idx, void* next) {
     size_t off = tiny_next_off(class_idx);
 
@@ -0,0 +1,58 @@
# Phase 13 v1: Header Write Elimination (C7 preserve header) A/B Results

**Date**: 2025-12-14
**Verdict**: ⚪ **NEUTRAL** (Phase 13 v1 is frozen as a research box, default OFF)

Design: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_DESIGN.md`
Procedure: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_NEXT_INSTRUCTIONS.md`

---

## 1. Goal

To test the Phase 12 gap hypothesis (header write tax), Phase 13 v1:

- **keeps the header instead of removing it**,
- makes the C7 freelist header-preserving, so that a free no longer clobbers the header, and
- checks whether **E5-2 (header write-once) can be extended to C7**.

---

## 2. Four-point matrix (throughput)

| Case | HAKMEM_TINY_C7_PRESERVE_HEADER | HAKMEM_TINY_HEADER_WRITE_ONCE | ops/s | vs Case A |
|------|--------------------------------|-------------------------------|-------|----------|
| A | 0 | 0 | 51,490,500 | baseline |
| B | 0 | 1 | 52,070,600 | **+1.13%** |
| C | 1 | 0 | 51,355,200 | -0.26% |
| D | 1 | 1 | 51,891,902 | +0.78% |

Conclusions:
- Phase 13 v1 (Case D) reaches **+0.78%**, which is **NEUTRAL** (below the +1.0% GO threshold).
- An important by-product: **E5-2 alone (Case B) measured +1.13%, which is GO-equivalent**.

---

## 3. Verdict

### 3.1 Phase 13 v1 (C7 preserve header)

- **Verdict**: ⚪ NEUTRAL → **research box freeze (default OFF)**
- Suspected cause:
  - The change in the freelist next offset under C7 preserve may cancel out the header writes it saves (unconfirmed).

### 3.2 Phase 5 E5-2 (Header write-once)

- **Retest results**:
  - The single observation in the Phase 13 matrix was **+1.13%** (Case B).
  - The dedicated clean-env 20-run retest measured **+0.54% (NEUTRAL)** → stays a research box (default OFF).
  - Details: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`

---

## 4. Next Actions (recommended)

1. Freeze Phase 13 v1 (keep the code, default OFF).
2. Freeze E5-2 (default OFF).
3. Follow-up variant of Phase 13 v1 (only if needed):
   - Study, as a research box, a design that places the C7 next pointer at a better-aligned position (v1b).
146
docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_DESIGN.md
Normal file
@@ -0,0 +1,146 @@
# Phase 13: Header Write Elimination v1 (C7 Header-Preserving Freelist)

**Date**: 2025-12-14
**Status**: DESIGN (Phase 13 kickoff) → ⚪ **NEUTRAL (+0.78%)** (research box freeze, default OFF)

---

## 0. Executive Summary (one page)

The Phase 12 comparison showed that **system malloc (glibc) is +63.7% faster than hakmem**, and the **steady-state header write ("write tax")** became the top-priority hypothesis for the next large structural difference.

However, hakmem's free hot path **reads the header** and relies on `HEADER_MAGIC`, so removing or corrupting the header would break safety.

Phase 13 v1 therefore keeps the header itself and changes the design so that **the C7 freelist does not overwrite the header**, which makes the existing **E5-2 (Header write-once)** **applicable to C7 as well**.

Aims:
- C1-C6 can already skip the alloc-time header write through write-once.
- **C7 currently has its header clobbered by the freelist next pointer on free, so every alloc must rewrite the header.**
- Moving the C7 next pointer to **base+1 (the start of the user area)** preserves the header, and write-once can then drop the alloc-side rewrite.

---

## 1. Current state (why only C7 writes every time)

### 1.1 Key premises (current source of truth)

- The free hot path (e.g. `free_tiny_fast()` in `core/front/malloc_tiny_fast.h`):
  - validates `HEADER_MAGIC` at `ptr-1`, and
  - extracts class_idx from the header.
  → **Header correctness is a prerequisite for both safety and the fast path.**

### 1.2 Scope of E5-2 (Header write-once)

- `tiny_header_finalize_alloc()` in `core/box/tiny_header_box.h` skips the alloc-time `tiny_region_id_write_header()` when:
  - `HAKMEM_TINY_HEADER_WRITE_ONCE=1`, and
  - `tiny_class_preserves_header(class_idx)=true` (C1-C6).

### 1.3 Why C7 cannot be write-once (root cause)

- In `tiny_nextptr_offset()` in `core/box/tiny_layout_box.h`:
  - C7 has `next_off=0` (next is written at `base+0`)
  → on free, **the header area is overwritten by the next pointer**
  → alloc must always run `tiny_region_id_write_header()` again

(C0 behaves the same way, but C0 has an 8B stride, so an 8B next pointer cannot be placed at `base+1`.)

---

## 2. Proposal (Phase 13 v1)

### 2.1 Core change

**Move the C7 next pointer to `base+1` (the start of the user area)**:

- Before (current):
  - C7: `next_off=0` → `*(void**)base = next` (header destroyed)
- After (Phase 13 v1):
  - C7: `next_off=1` → `memcpy(base+1, &next, 8)` (header preserved)

This makes C7 a "header-preserving class", so the E5-2 write-once also takes effect for C7.

### 2.2 Box Theory (box layout)

```
L0: tiny_c7_preserve_header_env_box (ENV gate, A/B, refresh)
↓
L1: tiny_layout_box (SSOT for tiny_nextptr_offset)
↓
L2: tiny_nextptr (next load/store refers to the SSOT)
↓
L3: tiny_header_box (class_preserves_header → write-once applies)
```

There is a single boundary:
- The "C7 next offset decision" is concentrated in `tiny_nextptr_offset()` (no branching elsewhere).

### 2.3 Reversible (A/B)

- ENV: `HAKMEM_TINY_C7_PRESERVE_HEADER=0/1` (default: 0)
- Introduce it as a research box first; promote it to a preset on GO.

---

## 3. Safety / Invariants (Fail-Fast)

### 3.1 Invariants

- `tiny_next_store/load` **always** go through `tiny_nextptr_offset()` (no direct writes).
- `tiny_class_preserves_header(class_idx)` is decided by offset != 0 (no hard-coding).
- With C7 preserve ON:
  - `*(uint8_t*)base == HEADER_MAGIC|cls` still holds after free (the header is not destroyed).

### 3.2 Fail-Fast (debug only)

- In debug builds only, with C7 preserve ON:
  - report a `tiny_header_validate(base, 7, ...)` mismatch as a one-shot message.
- Release builds never log; use a stats counter only if needed.

---

## 4. A/B measurement plan (same binary)

Because this change moves where the freelist next pointer lives, it would normally be a layout difference. Phase 13 v1 makes it **switchable via ENV** so that A/B runs keep using the same binary (a lesson from Phases 5-7).

### 4.1 Four-point matrix (required)

| Case | HAKMEM_TINY_C7_PRESERVE_HEADER | HAKMEM_TINY_HEADER_WRITE_ONCE | Meaning |
|------|--------------------------------|-------------------------------|------|
| A | 0 | 0 | Current baseline |
| B | 0 | 1 | E5-2 only (C1-C6) |
| C | 1 | 0 | Move C7 next into the user area (header still written every time) |
| D | 1 | 1 | Phase 13 v1 main candidate (write-once for C1-C7) |

### 4.2 GO/NO-GO (Mixed 10-run)

- GO: mean **+1.0% or more**
- NO-GO: mean **-1.0% or less**
- NEUTRAL: within ±1.0% → freeze (research box)

---

## 5. Risks and mitigations

### Risk 1: the C7 next pointer becomes unaligned and slower because it goes through memcpy

- Mitigation: always measure Case C (no write-once) to isolate the cost of the layout change alone.
- If Case C loses by a large margin:
  - consider a variant with "C7 next offset=8 (aligned)" (Phase 13 v1b).

### Risk 2: leftover class_idx hard-coding breaks

- Mitigation: clean up the hits from `rg "== 7|!= 7|C7 uses offset 0"` and route them through the SSOT (`tiny_layout_box`).

### Risk 3: the ENV refresh does not follow bench_profile putenv

- Mitigation: as in Phase 8, provide `*_env_refresh_from_env()` and call it from `bench_profile.h`.

---

## 6. Next (outlook beyond Phase 13)

Phase 13 v1 does not "remove" the header; it aims to **reduce steady-state header rewrites**.

If the gap to system malloc is still large, the next major theme is:
- Port a thread cache (a tcache-equivalent structure) into TinyUnifiedCache (Phase 14 candidate).
@@ -0,0 +1,134 @@
# Phase 13: Header Write Elimination v1, Next Instructions (C7 preserve header)

## 0. Status

- Phase 12 showed that system malloc is +63.7% faster than hakmem → Phase 13 started.
- Policy (v1): **keep the header** while making sure **the C7 freelist does not destroy it**, which removes the alloc-time header rewrite.
- Result: ⚪ **NEUTRAL (+0.78%) → freeze (default OFF)** (by-product: E5-2 measured +1.13%).

Design: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_DESIGN.md`
Results: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md`

---

## 1. Goal (GO criteria)

For Mixed 10-run (clean env):
- **GO**: mean +1.0% or more
- **NO-GO**: mean -1.0% or less (immediate rollback / freeze)
- **NEUTRAL**: within ±1.0% (research box freeze)

---

## 2. Implementation patch order (small steps)

### Patch 1: L0 ENV Box (reversible)

New files:
- `core/box/tiny_c7_preserve_header_env_box.h`
- `core/box/tiny_c7_preserve_header_env_box.c` (refresh)

Spec:
- ENV: `HAKMEM_TINY_C7_PRESERVE_HEADER=0/1` (default: 0)
- API:
  - `tiny_c7_preserve_header_enabled() -> int`
  - `tiny_c7_preserve_header_env_refresh_from_env()`

Requirements:
- **No getenv** in the hot path (lazy init + cached read only).

### Patch 2: L1 Layout SSOT change (single boundary)

Modify:
- `core/box/tiny_layout_box.h`

Change:
- Switch only the C7 case of `tiny_nextptr_offset(class_idx)` with the L0 gate:
  - OFF: existing behavior (C7 off=0)
  - ON: C7 off=1 (header-preserving)

### Patch 3: Align L2 NextPtr comments and assumptions with the SSOT

Modify (no behavior change):
- `core/tiny_nextptr.h`
- `core/box/tiny_header_box.h` (remove any comments such as "C7 = fixed offset 0")

Aim:
- Leave no assumption that the C7 offset is fixed (this removes a source of future design errors).

### Patch 4: Sync the bench profile refresh (prevent ENV mishaps)

Modify:
- `core/bench_profile.h`

Add:
- Call `tiny_c7_preserve_header_env_refresh_from_env()` after `bench_setenv_default(...)`.

(Same pattern as Phase 8.)

---

## 3. A/B test (four-point matrix required)

Use `scripts/run_mixed_10_cleanenv.sh` (prevents ENV leaks).

### Case A (baseline)

```sh
HAKMEM_TINY_C7_PRESERVE_HEADER=0 \
HAKMEM_TINY_HEADER_WRITE_ONCE=0 \
scripts/run_mixed_10_cleanenv.sh
```

### Case B (E5-2 only)

```sh
HAKMEM_TINY_C7_PRESERVE_HEADER=0 \
HAKMEM_TINY_HEADER_WRITE_ONCE=1 \
scripts/run_mixed_10_cleanenv.sh
```

### Case C (C7 preserve only)

```sh
HAKMEM_TINY_C7_PRESERVE_HEADER=1 \
HAKMEM_TINY_HEADER_WRITE_ONCE=0 \
scripts/run_mixed_10_cleanenv.sh
```

### Case D (Phase 13 v1 main candidate)

```sh
HAKMEM_TINY_C7_PRESERVE_HEADER=1 \
HAKMEM_TINY_HEADER_WRITE_ONCE=1 \
scripts/run_mixed_10_cleanenv.sh
```

Optional:
- Also take a 5-run with `HAKMEM_PROFILE=C6_HEAVY_LEGACY_POOLV1` (to confirm there is no regression).

---

## 4. Observability (minimal)

Existing:
- Use `HAKMEM_TINY_HEADER_WRITE_ONCE_STATS=1` and
  - confirm that the `alloc_skip_count / alloc_write_count` ratio increases.

If a new counter is needed (bare minimum):
- Add a C7-only counter only when "C7 skips are increasing" cannot be observed (avoid always-on atomics).

---

## 5. Promotion (GO only)

On GO:
1. Add the default to `core/bench_profile.h`:
   - `bench_setenv_default("HAKMEM_TINY_C7_PRESERVE_HEADER", "1");`
   - (if needed) also promote `HAKMEM_TINY_HEADER_WRITE_ONCE=1`
2. Append the Phase 13 v1 results (A/B table) to `CURRENT_TASK.md`.
3. Document the rollback procedure:
   - `export HAKMEM_TINY_C7_PRESERVE_HEADER=0`

On NO-GO:
- Freeze as a research box (stays default OFF) and record the cause in the design notes.
@@ -9,6 +9,19 @@
 
 ---
 
+## Addendum (2025-12-14)
+
+In the Phase 13 v1 four-point matrix, `HAKMEM_TINY_HEADER_WRITE_ONCE=1` alone measured **+1.13%** (promotion candidate).
+
+- Results: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md`
+- However, the dedicated clean-env 20-run retest measured **+0.54% (NEUTRAL)**, so promotion was deferred.
+  - Details: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`
+
+Conclusion:
+- E5-2 stays a research box (default OFF).
+
+---
+
 ## A/B Test Results (Mixed Workload)
 
 ### Configuration
@@ -6,6 +6,12 @@
 **Baseline**: 43.998M ops/s (Mixed, 40M iters, ws=400, E4-1+E4-2+E5-1 ON)
 **Goal**: +1-3% by moving header writes from allocation hot path to refill cold boundary
 
+**Update (2025-12-14)**:
+- In the Phase 13 v1 four-point matrix, `HAKMEM_TINY_HEADER_WRITE_ONCE=1` alone measured **+1.13%** (promotion candidate).
+  - `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md`
+- The dedicated clean-env 20-run retest measured **+0.54% (NEUTRAL)** → promotion was deferred.
+  - `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`
+
 ---
 
 ## Hypothesis
@@ -0,0 +1,76 @@
# Phase 5 E5-2: Header Write-Once, Instructions for the Promotion Decision

**Status**: ✅ COMPLETE → ⚪ NEUTRAL (promotion deferred)

Results: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`

## 0. Background

Earlier E5-2 A/B runs were NEUTRAL, but in the Phase 13 v1 four-point matrix re-measurement
`HAKMEM_TINY_HEADER_WRITE_ONCE=1` alone recorded **+1.13%** and became a GO candidate.

References:
- Old results: `docs/analysis/PHASE5_E5_2_HEADER_REFILL_ONCE_AB_TEST_RESULTS.md`
- New observation: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md`

Goal: settle with a dedicated A/B whether **E5-2 can be promoted to a preset default**.

---

## 1. A/B procedure (clean env, same binary)

Recommended: Mixed 20-run (for a more reliable mean/median).

### A: baseline (WRITE_ONCE=0)

```sh
RUNS=20 HAKMEM_TINY_HEADER_WRITE_ONCE=0 scripts/run_mixed_10_cleanenv.sh
```

### B: optimized (WRITE_ONCE=1)

```sh
RUNS=20 HAKMEM_TINY_HEADER_WRITE_ONCE=1 scripts/run_mixed_10_cleanenv.sh
```

Optional:
- Also take 5-runs with `HAKMEM_PROFILE=C6_HEAVY_LEGACY_POOLV1` at 0 and 1 (to confirm there is no regression).

---

## 2. Decision gate

- **GO**: Mixed 20-run mean **+1.0% or more** and a positive median delta
- **NO-GO**: mean **-1.0% or less**
- **NEUTRAL**: anything else (within ±1.0%) → stays a research box (default OFF)

---

## 3. Promotion steps on GO (small patches)

### Patch P1: preset promotion

- Add to `core/bench_profile.h` (target preset):
  - `bench_setenv_default("HAKMEM_TINY_HEADER_WRITE_ONCE", "1");`

Promoting only `MIXED_TINYV3_C7_SAFE` at first is sufficient (C6-heavy is optional).

### Patch P2: update the cleanenv script (prevent ENV leaks)

Revisit the defaults in `scripts/run_mixed_10_cleanenv.sh`:
- After promotion, stop treating `HAKMEM_TINY_HEADER_WRITE_ONCE` as a "research knob".
  - Example: `export HAKMEM_TINY_HEADER_WRITE_ONCE=${HAKMEM_TINY_HEADER_WRITE_ONCE:-1}`

(Existing behavior: bench_setenv_default cannot override a value that is already exported.)

### Patch P3: documentation update

- Consolidate the new re-measurement results into one document (e.g. `docs/analysis/PHASE5_E5_2_HEADER_REFILL_ONCE_RETEST_AB_TEST_RESULTS.md`).
- Add an "E5-2 ADOPT" record to `CURRENT_TASK.md`.

---

## 4. On NO-GO/NEUTRAL

- `HAKMEM_TINY_HEADER_WRITE_ONCE` stays a research box (default OFF).
- Note the factors behind the difference from the old results (baseline difference / env leak / build shape) and freeze.
@@ -0,0 +1,177 @@
# Phase 5 E5-2: Header Write-Once, Retest Results (Promotion Decision)

**Date**: 2025-12-14
**Verdict**: ⚪ **NEUTRAL (+0.54%)**: stays a research box (default OFF)

Background: `docs/analysis/PHASE5_E5_2_HEADER_REFILL_ONCE_AB_TEST_RESULTS.md`
Instructions: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_PROMOTION_NEXT_INSTRUCTIONS.md`

---

## 1. Background

In the Phase 13 v1 four-point matrix A/B, `HAKMEM_TINY_HEADER_WRITE_ONCE=1` alone recorded **+1.13%** and emerged as a GO candidate, so a dedicated clean-env 20-run was used to decide on promotion.

Reference: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md` (Case B)

---

## 2. Test configuration

- **Benchmark**: scripts/run_mixed_10_cleanenv.sh
- **Profile**: MIXED_TINYV3_C7_SAFE
- **Iterations**: 20,000,000 per run
- **Working set**: 400
- **Runs**: 20 per case
- **ENV**: `HAKMEM_TINY_C7_PRESERVE_HEADER=0` fixed (C7 preserve is not used)

---

## 3. Results (20-run)

| Case | WRITE_ONCE | Mean (ops/s) | Median (ops/s) | Delta vs A |
|------|------------|--------------|----------------|------------|
| A (baseline) | 0 | 51,096,839 | 51,127,725 | baseline |
| B (optimized) | 1 | 51,371,358 | 51,495,811 | **+0.54%** |

---

## 4. Verdict

### 4.1 GO criteria

- Mean **+1.0%** or more and a positive median delta
- This run: Mean +0.54%, Median +0.72%

### 4.2 Verdict

- **NEUTRAL (+0.54%)** → stays a research box (default OFF)
- Did not reach the GO threshold (+1.0%)

---

## 5. Discussion

### 5.1 Difference from the Phase 13 four-point matrix

| Test | WRITE_ONCE=1 result | Runs | Baseline |
|------|-------------------|------|----------|
| Phase 13 (Case B) | **+1.13%** | 10 | 51,490,500 ops/s |
| This test (dedicated 20-run) | **+0.54%** | 20 | 51,096,839 ops/s |

**Possible causes of the difference**:
1. **Baseline variation**: the Phase 13 baseline (51.49M) and this one (51.10M) differ by about -0.76%.
2. **Number of runs**: 10-run vs 20-run (the 20-run is more reliable).
3. **ENV contamination**: Phase 13 ran the four cases back to back (possible ENV leak).

### 5.2 Comparison with the old Phase 5 E5-2 results

Old test (`PHASE5_E5_2_HEADER_REFILL_ONCE_AB_TEST_RESULTS.md`):
- Result: +0.45% (NEUTRAL)
- This test: +0.54% (NEUTRAL)

**Consistency**: both tests fall consistently within the NEUTRAL range.

---

## 6. Next Actions

### 6.1 Handling of E5-2

- ✅ Keep as a research box (default OFF, manual opt-in)
- ENV: `HAKMEM_TINY_HEADER_WRITE_ONCE=0/1` (default: 0)

### 6.2 Handling of Phase 13 v1

- ✅ Keep as a research box (default OFF)
- ENV: `HAKMEM_TINY_C7_PRESERVE_HEADER=0/1` (default: 0)

### 6.3 Next optimization

Return to the gap hypothesis list from the Phase 12 Strategic Pause:
1. ~~Header write tax~~ → Phase 13 v1 NEUTRAL, E5-2 NEUTRAL
2. **Pointer chase overhead** (next candidate)
3. Lock contention (if applicable)
4. Memory fence overhead
5. Metadata access patterns

---

## 7. Raw Data

### Case A (baseline, WRITE_ONCE=0)
```
Run 1: 50725850 ops/s
Run 2: 51547217 ops/s
Run 3: 51076712 ops/s
Run 4: 51527474 ops/s
Run 5: 51193070 ops/s
Run 6: 51597708 ops/s
Run 7: 52239171 ops/s
Run 8: 52386008 ops/s
Run 9: 51618321 ops/s
Run 10: 50919588 ops/s
Run 11: 52415403 ops/s
Run 12: 51125404 ops/s
Run 13: 49785086 ops/s
Run 14: 50915858 ops/s
Run 15: 51130046 ops/s
Run 16: 48960162 ops/s
Run 17: 51385756 ops/s
Run 18: 50849945 ops/s
Run 19: 50550500 ops/s
Run 20: 49987500 ops/s

Mean: 51096838.95 ops/s
Median: 51127725.00 ops/s
```

### Case B (optimized, WRITE_ONCE=1)
```
Run 1: 51594697 ops/s
Run 2: 50145581 ops/s
Run 3: 52268972 ops/s
Run 4: 52083686 ops/s
Run 5: 50612405 ops/s
Run 6: 50556552 ops/s
Run 7: 49910193 ops/s
Run 8: 52657108 ops/s
Run 9: 52053748 ops/s
Run 10: 51957521 ops/s
Run 11: 52417281 ops/s
Run 12: 51712162 ops/s
Run 13: 51531743 ops/s
Run 14: 50832685 ops/s
Run 15: 51337254 ops/s
Run 16: 51218309 ops/s
Run 17: 50110155 ops/s
Run 18: 51459878 ops/s
Run 19: 51931080 ops/s
Run 20: 51036152 ops/s

Mean: 51371358.10 ops/s
Median: 51495810.50 ops/s
```

---

## 8. Rollback procedure

Phase 5 E5-2 is ENV-gated and default OFF, so no rollback is needed.

To disable it manually:
```sh
export HAKMEM_TINY_HEADER_WRITE_ONCE=0
```

---

## 9. Summary

Phase 5 E5-2 (Header Write-Once) recorded **+0.54% (NEUTRAL)** in the 20-run retest.

- Did not reach the GO threshold (+1.0%)
- Stays a research box (default OFF, manual opt-in)
- Phase 13 v1 likewise stays a research box

Next step: move on to the next gap hypothesis from the Phase 12 Strategic Pause.
@@ -11,6 +11,9 @@ runs=${RUNS:-10}

# Force known research knobs OFF to avoid accidental carry-over.
export HAKMEM_TINY_HEADER_WRITE_ONCE=${HAKMEM_TINY_HEADER_WRITE_ONCE:-0}
export HAKMEM_TINY_C7_PRESERVE_HEADER=${HAKMEM_TINY_C7_PRESERVE_HEADER:-0}
export HAKMEM_TINY_TCACHE=${HAKMEM_TINY_TCACHE:-0}
export HAKMEM_TINY_TCACHE_CAP=${HAKMEM_TINY_TCACHE_CAP:-64}
export HAKMEM_MALLOC_TINY_DIRECT=${HAKMEM_MALLOC_TINY_DIRECT:-0}
export HAKMEM_ENV_SNAPSHOT_SHAPE=${HAKMEM_ENV_SNAPSHOT_SHAPE:-0}
export HAKMEM_FREE_TINY_FAST_MONO_DUALHOT=${HAKMEM_FREE_TINY_FAST_MONO_DUALHOT:-0}
@@ -20,4 +23,3 @@ for i in $(seq 1 "${runs}"); do
  echo "=== Run ${i}/${runs} ==="
  HAKMEM_PROFILE="${profile}" ./bench_random_mixed_hakmem "${iters}" "${ws}" 1 2>&1 | rg "Throughput" || true
done