Phase 13 v1 + E5-2 retest: Both NEUTRAL, freeze as research boxes

Phase 13 v1: Header Write Elimination (C7 preserve header)
- Verdict: NEUTRAL (+0.78%)
- Implementation: HAKMEM_TINY_C7_PRESERVE_HEADER ENV gate (default OFF)
- Makes C7 nextptr offset conditional (0→1 when enabled)
- 4-point matrix A/B test results:
  * Case A (baseline): 51.49M ops/s
  * Case B (WRITE_ONCE=1): 52.07M ops/s (+1.13%)
  * Case C (C7_PRESERVE=1): 51.36M ops/s (-0.26%)
  * Case D (both): 51.89M ops/s (+0.78% NEUTRAL)
- Action: Freeze as research box (default OFF, manual opt-in)
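
The conditional C7 offset described above can be sketched as a per-class bitmask lookup. This is a minimal illustration built from the two masks quoted in the `tiny_layout_box.h` comments (0x7E and 0xFE); the function names and the `c7_preserve` parameter are hypothetical stand-ins for the cached ENV gate, not the project's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: bit i of the mask is the next-pointer offset of
 * class i. c7_preserve stands in for the cached value of
 * HAKMEM_TINY_C7_PRESERVE_HEADER. */
static size_t sketch_nextptr_offset(int class_idx, int c7_preserve) {
    assert(class_idx >= 0 && class_idx <= 7);
    /* 0x7E = 0b01111110: C0=0, C1-C6=1, C7=0 (default)
     * 0xFE = 0b11111110: C0=0, C1-C7=1 (C7 preserve enabled) */
    unsigned mask = c7_preserve ? 0xFEu : 0x7Eu;
    return (mask >> ((unsigned)class_idx & 7u)) & 1u;
}

/* A class preserves its header while free iff next is stored at offset 1. */
static int sketch_class_preserves_header(int class_idx, int c7_preserve) {
    return sketch_nextptr_offset(class_idx, c7_preserve) != 0;
}
```

With the gate off, C7 behaves like C0 (next pointer overwrites the header); with the gate on, only C0 still overwrites it.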

Phase 5 E5-2: Header Write-Once retest (promotion test)
- Verdict: NEUTRAL (+0.54%)
- Motivation: Phase 13 Case B showed +1.13%, so it was re-tested with a dedicated 20-run A/B
- Results (20-run):
  * Case A (baseline): 51.10M ops/s
  * Case B (WRITE_ONCE=1): 51.37M ops/s (+0.54%)
- Previous test: +0.45% (consistent with NEUTRAL)
- Action: Keep as research box (default OFF, manual opt-in)

Key findings:
- Header write tax optimization shows consistent NEUTRAL results
- Neither Phase 13 v1 nor E5-2 reaches GO threshold (+1.0%)
- Both implemented as reversible ENV gates for future research
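
The verdicts above follow from a relative delta against the baseline compared with the +1.0% GO threshold. A minimal sketch of that arithmetic, using the mean ops/s values from the result tables; the helper names are hypothetical, and treating the threshold as inclusive (`>=`) is an assumption.

```c
#include <assert.h>

/* GO threshold quoted in the commit message (+1.0%). */
#define SKETCH_GO_THRESHOLD_PCT 1.0

/* Relative change of candidate versus baseline, in percent. */
static double sketch_pct_delta(double baseline, double candidate) {
    assert(baseline > 0.0);
    return (candidate - baseline) / baseline * 100.0;
}

/* Returns 1 if the delta reaches the GO threshold, else 0.
 * Assumption: the threshold is inclusive. */
static int sketch_reaches_go(double baseline, double candidate) {
    return sketch_pct_delta(baseline, candidate) >= SKETCH_GO_THRESHOLD_PCT;
}
```

For example, `sketch_pct_delta(51096839.0, 51371358.0)` is about +0.54 (the 20-run retest), which stays below the threshold, while the Phase 13 matrix Case B (51,490,500 to 52,070,600) is about +1.13 and would pass on its own. That gap is why the dedicated retest was run before any promotion.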

Files changed:
- New: core/box/tiny_c7_preserve_header_env_box.{c,h}
- Modified: core/box/tiny_layout_box.h (C7 offset conditional)
- Modified: core/tiny_nextptr.h, core/box/tiny_header_box.h (comments)
- Modified: core/bench_profile.h (refresh sync)
- Modified: Makefile (add new .o files)
- Modified: scripts/run_mixed_10_cleanenv.sh (add C7_PRESERVE ENV)
- Docs: PHASE13_*, PHASE5_E5_2_HEADER_WRITE_ONCE_* (design/results)

Next: Phase 14 (Pointer-chase reduction, tcache-style intrusive LIFO)

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Commit: cbb35ee27f (parent 51f76153c4)
Author: Moe Charm (CI)
Date: 2025-12-15 00:32:25 +09:00
16 changed files with 836 additions and 27 deletions


@ -162,22 +162,74 @@ Cumulative improvements achieved in Phases 6-10:
 Details: `docs/analysis/PHASE12_STRATEGIC_PAUSE_RESULTS.md`
-### Next: Phase 13 (Header Write Elimination)
-**Direction decided**: lift the pause and proceed to Phase 13 ✅
-**Target**: eliminate the 1-byte header write (top-priority hypothesis)
-**Strategy**:
-- Place the header before the user pointer (system malloc pattern)
-- Or header-less classification (RegionId only)
-**Expected ROI**: **+10-20%**
-**Next Actions**:
-1. Measure header write overhead (perf annotate)
-2. Verify the feasibility of header-less classification
-3. Write the Phase 13 design document
+### Phase 13: Header Write Elimination v1 - NEUTRAL (+0.78%) ⚠️ RESEARCH BOX
+**Date**: 2025-12-14
+**Verdict**: **NEUTRAL (+0.78%)** - Frozen as research box (default OFF, manual opt-in)
+**Target**: reduce the steady-state header write tax (top-priority hypothesis)
+**Strategy (v1)**:
+- Make the **C7 freelist leave the header intact** so that E5-2 (write-once) can also be applied to C7
+- ENV: `HAKMEM_TINY_C7_PRESERVE_HEADER=0/1` (default: 0)
+**Results (4-Point Matrix)**:
+| Case | C7_PRESERVE | WRITE_ONCE | Mean (ops/s) | Delta | Verdict |
+|------|-------------|------------|--------------|-------|---------|
+| A (baseline) | 0 | 0 | 51,490,500 | - | - |
+| **B (E5-2 only)** | 0 | 1 | **52,070,600** | **+1.13%** | candidate |
+| C (C7 preserve) | 1 | 0 | 51,355,200 | -0.26% | NEUTRAL |
+| D (Phase 13 v1) | 1 | 1 | 51,891,902 | +0.78% | NEUTRAL |
+**Key Findings**:
+1. **E5-2 (HAKMEM_TINY_HEADER_WRITE_ONCE=1) showed +1.13% in a single measurement, but the 20-run retest came out NEUTRAL (+0.54%)**
+   - Reference: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`
+   - Conclusion: E5-2 stays a research box (default OFF)
+2. **C7 preserve header alone: -0.26%** (slight regression)
+   - C7 offset=1 memcpy overhead outweighs benefits
+3. **Combined (Phase 13 v1): +0.78%** (positive but below GO)
+   - C7 preserve reduces E5-2 gains
+**Action**:
+- ✅ Freeze Phase 13 v1 as research box (default OFF)
+- ✅ Re-test Phase 5 E5-2 (WRITE_ONCE=1) with dedicated 20-run → NEUTRAL (+0.54%)
+- 📋 Document results: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md`
+### Phase 5 E5-2: Header Write-Once - Retest NEUTRAL (+0.54%) ⚪
+**Date**: 2025-12-14
+**Verdict**: ⚪ **NEUTRAL (+0.54%)** - Kept as research box (default OFF)
+**Motivation**: In the Phase 13 4-point matrix, E5-2 alone recorded +1.13%, so a dedicated 20-run test was used to decide whether to promote it.
+**Results (20-run)**:
+| Case | WRITE_ONCE | Mean (ops/s) | Median (ops/s) | Delta |
+|------|------------|--------------|----------------|-------|
+| A (baseline) | 0 | 51,096,839 | 51,127,725 | - |
+| B (optimized) | 1 | 51,371,358 | 51,495,811 | **+0.54%** |
+**Verdict**: NEUTRAL (+0.54%) - below the GO threshold (+1.0%)
+**Discussion**:
+- The +1.13% in Phase 13 was observed over 10 runs
+- The dedicated 20-run test gives +0.54% (more reliable)
+- Consistent with the earlier E5-2 test (+0.45%)
+**Action**:
+- ✅ Keep as research box (default OFF, manual opt-in)
+- ENV: `HAKMEM_TINY_HEADER_WRITE_ONCE=0/1` (default: 0)
+- 📋 Details: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`
+**Next**: move on to the next gap hypothesis from the Phase 12 Strategic Pause
+### Next: Phase 14 (Pointer Chase Reduction / Tiny tcache)
+**Goal**: move closer to system malloc's tcache and reduce the "array/FIFO/indirection" cost of the Tiny frontend.
+- Design: `docs/analysis/PHASE14_POINTER_CHASE_REDUCTION_1_DESIGN.md`
+- Instructions: `docs/analysis/PHASE14_POINTER_CHASE_REDUCTION_1_NEXT_INSTRUCTIONS.md`
 ## Update memo (2025-12-14 Phase 5 E5-3 Analysis - Strategic Pivot)


@ -218,12 +218,12 @@ LDFLAGS += $(EXTRA_LDFLAGS)
# Targets # Targets
TARGET = test_hakmem TARGET = test_hakmem
OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o 
core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/hakmem_env_snapshot_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o 
hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/hakmem_env_snapshot_box.o core/box/tiny_c7_preserve_header_env_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o 
core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
OBJS = $(OBJS_BASE) OBJS = $(OBJS_BASE)
# Shared library # Shared library
SHARED_LIB = libhakmem.so SHARED_LIB = libhakmem.so
SHARED_OBJS = hakmem_shared.o hakmem_config_shared.o hakmem_tiny_config_shared.o hakmem_ucb1_shared.o hakmem_bigcache_shared.o hakmem_pool_shared.o hakmem_l25_pool_shared.o hakmem_site_rules_shared.o hakmem_tiny_shared.o core/box/ss_allocation_box_shared.o superslab_stats_shared.o superslab_cache_shared.o superslab_ace_shared.o superslab_slab_shared.o superslab_backend_shared.o core/superslab_head_stub_shared.o hakmem_smallmid_shared.o core/box/superslab_expansion_box_shared.o core/box/integrity_box_shared.o core/box/mailbox_box_shared.o core/box/front_gate_box_shared.o core/box/front_gate_classifier_shared.o core/box/free_publish_box_shared.o core/box/capacity_box_shared.o core/box/carve_push_box_shared.o core/box/prewarm_box_shared.o core/box/ss_hot_prewarm_box_shared.o core/box/front_metrics_box_shared.o core/box/bench_fast_box_shared.o core/box/ss_addr_map_box_shared.o core/box/ss_pt_impl_shared.o core/box/slab_recycling_box_shared.o core/box/pagefault_telemetry_box_shared.o core/box/tiny_sizeclass_hist_box_shared.o core/box/tiny_env_box_shared.o core/box/tiny_route_box_shared.o core/box/free_front_v3_env_box_shared.o core/box/free_path_stats_box_shared.o core/box/free_dispatch_stats_box_shared.o core/box/alloc_gate_stats_box_shared.o core/box/tiny_page_box_shared.o core/box/tiny_class_policy_box_shared.o core/box/tiny_class_stats_box_shared.o core/box/tiny_policy_learner_box_shared.o core/box/ss_budget_box_shared.o core/box/tiny_mem_stats_box_shared.o core/box/wrapper_env_box_shared.o core/box/free_wrapper_env_snapshot_box_shared.o core/box/malloc_wrapper_env_snapshot_box_shared.o core/box/madvise_guard_box_shared.o core/box/libm_reloc_guard_box_shared.o core/box/hakmem_env_snapshot_box_shared.o core/page_arena_shared.o core/front/tiny_unified_cache_shared.o core/tiny_alloc_fast_push_shared.o core/tiny_c7_ultra_segment_shared.o core/tiny_c7_ultra_shared.o core/link_stubs_shared.o core/tiny_failfast_shared.o tiny_sticky_shared.o tiny_remote_shared.o 
tiny_publish_shared.o tiny_debug_ring_shared.o hakmem_tiny_magazine_shared.o hakmem_tiny_stats_shared.o hakmem_tiny_sfc_shared.o hakmem_tiny_query_shared.o hakmem_tiny_rss_shared.o hakmem_tiny_registry_shared.o hakmem_tiny_remote_target_shared.o hakmem_tiny_bg_spill_shared.o tiny_adaptive_sizing_shared.o hakmem_super_registry_shared.o hakmem_shared_pool_shared.o hakmem_shared_pool_acquire_shared.o hakmem_shared_pool_release_shared.o hakmem_elo_shared.o hakmem_batch_shared.o hakmem_p2_shared.o hakmem_sizeclass_dist_shared.o hakmem_evo_shared.o hakmem_debug_shared.o hakmem_sys_shared.o hakmem_whale_shared.o hakmem_policy_shared.o hakmem_ace_shared.o hakmem_ace_stats_shared.o hakmem_ace_controller_shared.o hakmem_ace_metrics_shared.o hakmem_ace_ucb1_shared.o hakmem_prof_shared.o hakmem_learner_shared.o hakmem_size_hist_shared.o hakmem_learn_log_shared.o hakmem_syscall_shared.o tiny_fastcache_shared.o core/box/super_reg_box_shared.o core/box/shared_pool_box_shared.o core/box/remote_side_box_shared.o core/tiny_destructors_shared.o SHARED_OBJS = hakmem_shared.o hakmem_config_shared.o hakmem_tiny_config_shared.o hakmem_ucb1_shared.o hakmem_bigcache_shared.o hakmem_pool_shared.o hakmem_l25_pool_shared.o hakmem_site_rules_shared.o hakmem_tiny_shared.o core/box/ss_allocation_box_shared.o superslab_stats_shared.o superslab_cache_shared.o superslab_ace_shared.o superslab_slab_shared.o superslab_backend_shared.o core/superslab_head_stub_shared.o hakmem_smallmid_shared.o core/box/superslab_expansion_box_shared.o core/box/integrity_box_shared.o core/box/mailbox_box_shared.o core/box/front_gate_box_shared.o core/box/front_gate_classifier_shared.o core/box/free_publish_box_shared.o core/box/capacity_box_shared.o core/box/carve_push_box_shared.o core/box/prewarm_box_shared.o core/box/ss_hot_prewarm_box_shared.o core/box/front_metrics_box_shared.o core/box/bench_fast_box_shared.o core/box/ss_addr_map_box_shared.o core/box/ss_pt_impl_shared.o core/box/slab_recycling_box_shared.o 
core/box/pagefault_telemetry_box_shared.o core/box/tiny_sizeclass_hist_box_shared.o core/box/tiny_env_box_shared.o core/box/tiny_route_box_shared.o core/box/free_front_v3_env_box_shared.o core/box/free_path_stats_box_shared.o core/box/free_dispatch_stats_box_shared.o core/box/alloc_gate_stats_box_shared.o core/box/tiny_page_box_shared.o core/box/tiny_class_policy_box_shared.o core/box/tiny_class_stats_box_shared.o core/box/tiny_policy_learner_box_shared.o core/box/ss_budget_box_shared.o core/box/tiny_mem_stats_box_shared.o core/box/wrapper_env_box_shared.o core/box/free_wrapper_env_snapshot_box_shared.o core/box/malloc_wrapper_env_snapshot_box_shared.o core/box/madvise_guard_box_shared.o core/box/libm_reloc_guard_box_shared.o core/box/hakmem_env_snapshot_box_shared.o core/box/tiny_c7_preserve_header_env_box_shared.o core/page_arena_shared.o core/front/tiny_unified_cache_shared.o core/tiny_alloc_fast_push_shared.o core/tiny_c7_ultra_segment_shared.o core/tiny_c7_ultra_shared.o core/link_stubs_shared.o core/tiny_failfast_shared.o tiny_sticky_shared.o tiny_remote_shared.o tiny_publish_shared.o tiny_debug_ring_shared.o hakmem_tiny_magazine_shared.o hakmem_tiny_stats_shared.o hakmem_tiny_sfc_shared.o hakmem_tiny_query_shared.o hakmem_tiny_rss_shared.o hakmem_tiny_registry_shared.o hakmem_tiny_remote_target_shared.o hakmem_tiny_bg_spill_shared.o tiny_adaptive_sizing_shared.o hakmem_super_registry_shared.o hakmem_shared_pool_shared.o hakmem_shared_pool_acquire_shared.o hakmem_shared_pool_release_shared.o hakmem_elo_shared.o hakmem_batch_shared.o hakmem_p2_shared.o hakmem_sizeclass_dist_shared.o hakmem_evo_shared.o hakmem_debug_shared.o hakmem_sys_shared.o hakmem_whale_shared.o hakmem_policy_shared.o hakmem_ace_shared.o hakmem_ace_stats_shared.o hakmem_ace_controller_shared.o hakmem_ace_metrics_shared.o hakmem_ace_ucb1_shared.o hakmem_prof_shared.o hakmem_learner_shared.o hakmem_size_hist_shared.o hakmem_learn_log_shared.o hakmem_syscall_shared.o tiny_fastcache_shared.o 
core/box/super_reg_box_shared.o core/box/shared_pool_box_shared.o core/box/remote_side_box_shared.o core/tiny_destructors_shared.o
# Pool TLS Phase 1 (enable with POOL_TLS_PHASE1=1) # Pool TLS Phase 1 (enable with POOL_TLS_PHASE1=1)
ifeq ($(POOL_TLS_PHASE1),1) ifeq ($(POOL_TLS_PHASE1),1)
@ -427,7 +427,7 @@ test-box-refactor: box-refactor
./larson_hakmem 10 8 128 1024 1 12345 4 ./larson_hakmem 10 8 128 1024 1 12345 4
# Phase 4: Tiny Pool benchmarks (properly linked with hakmem) # Phase 4: Tiny Pool benchmarks (properly linked with hakmem)
TINY_BENCH_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/free_publish_box.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/tiny_free_route_cache_env_box.o core/box/hakmem_env_snapshot_box.o core/page_arena.o core/front/tiny_unified_cache.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o 
hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o TINY_BENCH_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/free_publish_box.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o 
core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/free_cold_shape_env_box.o core/box/free_cold_shape_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/tiny_free_route_cache_env_box.o core/box/hakmem_env_snapshot_box.o core/box/tiny_c7_preserve_header_env_box.o core/page_arena.o core/front/tiny_unified_cache.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o 
core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
TINY_BENCH_OBJS = $(TINY_BENCH_OBJS_BASE) TINY_BENCH_OBJS = $(TINY_BENCH_OBJS_BASE)
ifeq ($(POOL_TLS_PHASE1),1) ifeq ($(POOL_TLS_PHASE1),1)
TINY_BENCH_OBJS += pool_tls.o pool_refill.o core/pool_tls_arena.o pool_tls_registry.o pool_tls_remote.o TINY_BENCH_OBJS += pool_tls.o pool_refill.o core/pool_tls_arena.o pool_tls_registry.o pool_tls_remote.o


@ -10,6 +10,7 @@
 #include "box/tiny_static_route_box.h" // tiny_static_route_refresh_from_env (Phase 3 C3)
 #include "box/hakmem_env_snapshot_box.h" // hakmem_env_snapshot_refresh_from_env (Phase 4 E1)
 #include "box/tiny_free_route_cache_env_box.h" // tiny_free_static_route_refresh_from_env (Phase 8)
+#include "box/tiny_c7_preserve_header_env_box.h" // tiny_c7_preserve_header_env_refresh_from_env (Phase 13 v1)
 #endif
 // Set a default only when the env variable is not already set
@ -184,5 +185,7 @@ static inline void bench_apply_profile(void) {
     hakmem_env_snapshot_refresh_from_env();
     // Phase 8: Sync free static route ENV cache after bench_profile putenv defaults.
     tiny_free_static_route_refresh_from_env();
+    // Phase 13 v1: Sync C7 preserve header ENV cache after bench_profile putenv defaults.
+    tiny_c7_preserve_header_env_refresh_from_env();
 #endif
 }


@ -0,0 +1,50 @@
+// ============================================================================
+// Phase 13 v1: Tiny C7 Preserve Header ENV Box (L0) - Implementation
+// ============================================================================
+#include "tiny_c7_preserve_header_env_box.h"
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include <unistd.h>
+// ============================================================================
+// Global State
+// ============================================================================
+_Atomic int g_tiny_c7_preserve_header_enabled = -1;
+// ============================================================================
+// Init (Cold Path)
+// ============================================================================
+int tiny_c7_preserve_header_env_init(void) {
+    const char* env = getenv("HAKMEM_TINY_C7_PRESERVE_HEADER");
+    int enabled = 0; // default: OFF (opt-in)
+    if (env && (env[0] == '1' || strcmp(env, "true") == 0 || strcmp(env, "TRUE") == 0)) {
+        enabled = 1;
+    }
+    // Cache result
+    atomic_store_explicit(&g_tiny_c7_preserve_header_enabled, enabled, memory_order_relaxed);
+    // Log once (stderr for immediate visibility)
+    if (enabled) {
+        const char msg[] = "[C7_PRESERVE_HEADER] enabled\n";
+        ssize_t w = write(2, msg, sizeof(msg) - 1);
+        (void)w;
+    }
+    return enabled;
+}
+// ============================================================================
+// Refresh (Cold Path, called from bench_profile)
+// ============================================================================
+void tiny_c7_preserve_header_env_refresh_from_env(void) {
+    // Reset to uninitialized state (-1)
+    // Next call to tiny_c7_preserve_header_enabled() will re-read ENV
+    atomic_store_explicit(&g_tiny_c7_preserve_header_enabled, -1, memory_order_relaxed);
+}
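
The parsing rule in `tiny_c7_preserve_header_env_init()` accepts more values than a plain "1": it checks only the first character for `'1'`, plus the exact strings "true" and "TRUE". A small sketch of that predicate in isolation; the function name is hypothetical and it takes the value as a parameter rather than calling `getenv`, so it can be exercised directly.

```c
#include <string.h>

/* Hypothetical standalone copy of the enable check used by the init
 * function above. Returns 1 if the value enables the gate, else 0. */
static int sketch_env_flag_on(const char *value) {
    if (value == NULL) {
        return 0; /* variable unset: default OFF */
    }
    /* Only the first character is compared against '1', so "10" or "1x"
     * also enable the gate; "true"/"TRUE" must match exactly. */
    return value[0] == '1'
        || strcmp(value, "true") == 0
        || strcmp(value, "TRUE") == 0;
}
```

Under this rule "True" and "yes" leave the gate off, which may matter when writing benchmark scripts that export the variable.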


@ -0,0 +1,72 @@
+// ============================================================================
+// Phase 13 v1: Tiny C7 Preserve Header ENV Box (L0)
+// ============================================================================
+//
+// Purpose: ENV gate for C7 header-preserving freelist layout
+//
+// Design: docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_DESIGN.md
+// Instructions: docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_NEXT_INSTRUCTIONS.md
+//
+// Strategy:
+// - Keep the C7 (1025-2048B) freelist from clobbering the header
+// - Change the nextptr offset from 0 to 1 (skip the 1-byte header)
+// - This reduces header rewrites at alloc time
+//
+// ENV:
+// HAKMEM_TINY_C7_PRESERVE_HEADER=0/1 (default: 0, opt-in)
+//
+// API:
+// tiny_c7_preserve_header_enabled() -> int
+// tiny_c7_preserve_header_env_refresh_from_env()
+//
+// Box Theory:
+// - L0: This file (ENV gate, reversible)
+// - L1: tiny_layout_box.h (SSOT: tiny_nextptr_offset)
+// - L2: tiny_nextptr.h, tiny_header_box.h (affected code)
+//
+// Safety:
+// - ENV-gated (default OFF, opt-in)
+// - Reversible (ENV toggle)
+// - Minimal change (C7 offset 0→1 only)
+//
+// ============================================================================
+#ifndef TINY_C7_PRESERVE_HEADER_ENV_BOX_H
+#define TINY_C7_PRESERVE_HEADER_ENV_BOX_H
+#include <stdatomic.h>
+// ============================================================================
+// Global State (L0)
+// ============================================================================
+// Cached state: -1 (uninitialized), 0 (disabled), 1 (enabled)
+extern _Atomic int g_tiny_c7_preserve_header_enabled;
+// ============================================================================
+// Hot Inline API (L0)
+// ============================================================================
+// Check if C7 preserve header is enabled
+// Returns: 1 if enabled, 0 if disabled
+static inline int tiny_c7_preserve_header_enabled(void) {
+    int val = atomic_load_explicit(&g_tiny_c7_preserve_header_enabled, memory_order_relaxed);
+    if (__builtin_expect(val == -1, 0)) {
+        // Lazy init: read ENV once
+        extern int tiny_c7_preserve_header_env_init(void);
+        val = tiny_c7_preserve_header_env_init();
+    }
+    return val;
+}
+// ============================================================================
+// Cold API (L2)
+// ============================================================================
+// Refresh ENV cache (called from bench_profile after putenv)
+// Pattern: Same as Phase 8 (FREE_STATIC_ROUTE)
+extern void tiny_c7_preserve_header_env_refresh_from_env(void);
+#endif // TINY_C7_PRESERVE_HEADER_ENV_BOX_H
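
The header above combines a tri-state cache (-1 unread, 0 off, 1 on) with an explicit refresh, so that defaults installed after the first read are not missed. The sketch below reproduces that pattern in one file under simplifying assumptions: all `sketch_*` names are hypothetical, the parse accepts only a leading `'1'`, logging is omitted, and `setenv` with `overwrite=0` stands in for the "set a default only when unset" step in `bench_profile.h`.

```c
#define _POSIX_C_SOURCE 200809L /* for setenv() */
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define SKETCH_ENV_NAME "HAKMEM_TINY_C7_PRESERVE_HEADER"

/* Cached state: -1 = not read yet, 0 = disabled, 1 = enabled. */
static _Atomic int sketch_gate_state = -1;

/* Cold path: read the environment once and cache the result. */
static int sketch_gate_init(void) {
    const char *env = getenv(SKETCH_ENV_NAME);
    int enabled = (env != NULL && env[0] == '1'); /* simplified parse */
    atomic_store_explicit(&sketch_gate_state, enabled, memory_order_relaxed);
    return enabled;
}

/* Hot path: one relaxed load; falls into init only while uninitialized. */
static inline int sketch_gate_enabled(void) {
    int v = atomic_load_explicit(&sketch_gate_state, memory_order_relaxed);
    if (v == -1) {
        v = sketch_gate_init();
    }
    assert(v == 0 || v == 1); /* the cache is resolved at this point */
    return v;
}

/* Drop the cached value so the next query re-reads the environment. */
static void sketch_gate_refresh(void) {
    atomic_store_explicit(&sketch_gate_state, -1, memory_order_relaxed);
}

/* Install a default without overriding a value the user already set. */
static int sketch_set_default(const char *value) {
    return setenv(SKETCH_ENV_NAME, value, 0);
}
```

If `sketch_gate_enabled()` has already run, a later `sketch_set_default("1")` has no visible effect until `sketch_gate_refresh()` is called, which mirrors the ordering in `bench_apply_profile()` where the refresh follows the profile's defaults.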


@ -41,13 +41,14 @@
 //
 // Returns:
 // true - C1-C6: Header preserved at offset 0, next at offset 1
-// false - C0, C7: Header overwritten by next pointer at offset 0
+// false - C0: Header overwritten by next pointer at offset 0
+// Phase 13 v1: C7 returns false (default) or true (HAKMEM_TINY_C7_PRESERVE_HEADER=1)
 static inline bool tiny_class_preserves_header(int class_idx) {
 #if HAKMEM_TINY_HEADER_CLASSIDX
     // Delegate to tiny_layout_box.h specification (Single Source of Truth)
-    // next_off=0 → header overwritten (C0, C7)
-    // next_off=1 → header preserved (C1-C6)
+    // next_off=0 → header overwritten (C0, C7 default)
+    // next_off=1 → header preserved (C1-C6, C7 with HAKMEM_TINY_C7_PRESERVE_HEADER=1)
     return tiny_nextptr_offset(class_idx) != 0;
 #else
     // Headers disabled globally
@ -87,7 +88,8 @@ static inline void tiny_header_write_if_preserved(void* base, int class_idx) {
 // ============================================================================
 //
 // Validates header ONLY if this class preserves headers.
-// For C0/C7, validation is impossible (next pointer is stored at offset 0).
+// For C0, validation is impossible (next pointer is stored at offset 0).
+// Phase 13 v1: C7 validation depends on HAKMEM_TINY_C7_PRESERVE_HEADER.
 //
 // Arguments:
 // base - BASE pointer (not user pointer)


@ -79,14 +79,29 @@ static inline size_t tiny_user_offset(int class_idx) {
 // Offset for storing the freelist next pointer inside a freed block.
 // This is distinct from tiny_user_offset():
 // - User offset is always +1 in header mode.
-// - Next offset is 0 for C0/C7 (cannot preserve header while free), else 1.
+// - Next offset:
+// - C0: always 0 (16B, cannot fit header+next)
+// - C1-C6: always 1 (header-preserving)
+// - C7: 0 (default) or 1 (Phase 13 v1: header-preserving)
 static inline size_t tiny_nextptr_offset(int class_idx) {
 #if HAKMEM_TINY_HEADERLESS
     (void)class_idx;
     return 0;
 #elif HAKMEM_TINY_HEADER_CLASSIDX
-    // Bit pattern: C0=0, C1-C6=1, C7=0 → 0b01111110 = 0x7E
-    return (0x7Eu >> ((unsigned)class_idx & 7u)) & 1u;
+    // Phase 13 v1: C7 preserve header gate
+    // Bit pattern (default): C0=0, C1-C6=1, C7=0 → 0b01111110 = 0x7E
+    // Bit pattern (C7 preserve): C0=0, C1-C7=1 → 0b11111110 = 0xFE
+    unsigned int base_pattern = 0x7Eu; // default: C7 offset=0
+    // Phase 13 v1: Gate for C7 header-preserving layout
+    if (class_idx == 7) {
+        extern int tiny_c7_preserve_header_enabled(void);
+        if (tiny_c7_preserve_header_enabled()) {
+            base_pattern = 0xFEu; // C7 offset=1 (header-preserving)
+        }
+    }
+    return (base_pattern >> ((unsigned)class_idx & 7u)) & 1u;
 #else
     (void)class_idx;
     return 0u;


@ -1,7 +1,8 @@
 // tiny_nextptr.h - Authoritative next-pointer offset/load/store for tiny boxes
 //
 // Finalized Phase E1-CORRECT spec (including physical constraints):
-// P0.1 updated: C0 and C7 use offset 0, C1-C6 use offset 1 (header preserved)
+// P0.1 updated: C0 uses offset 0, C1-C6 use offset 1 (header preserved)
+// Phase 13 v1: C7 uses offset 0 (default) or 1 (HAKMEM_TINY_C7_PRESERVE_HEADER=1)
 //
 // When HAKMEM_TINY_HEADER_CLASSIDX != 0:
 //
@ -18,8 +19,8 @@
 //
 // Class 7:
 // [1B header][payload 2047B]
-// → header is overwritten and next is stored at base+0 (acceptable because this is the largest size class)
-// → next_off = 0
+// → next_off = 0 (default: header is overwritten)
+// → next_off = 1 (Phase 13 v1: HAKMEM_TINY_C7_PRESERVE_HEADER=1)
 //
 // When HAKMEM_TINY_HEADER_CLASSIDX == 0:
 //
@ -56,7 +57,8 @@ static __thread void* g_tiny_next_ra1 __attribute__((unused)) = NULL;
 static __thread void* g_tiny_next_ra2 __attribute__((unused)) = NULL;
 // Compute freelist next-pointer offset within a block for the given class.
-// P0.1 updated: C0 and C7 use offset 0, C1-C6 use offset 1 (header preserved)
+// P0.1: C0 uses offset 0, C1-C6 use offset 1 (header preserved)
+// Phase 13 v1: C7 uses offset 0 (default) or 1 (HAKMEM_TINY_C7_PRESERVE_HEADER=1)
 // Rationale for C0: 8B stride cannot fit [1B header][8B next pointer] without overflow
 static inline __attribute__((always_inline)) size_t tiny_next_off(int class_idx) {
     return tiny_nextptr_offset(class_idx);
@ -186,7 +188,8 @@ static inline __attribute__((always_inline)) void* tiny_next_load(const void* ba
 // - When class_map is used for class_idx lookup (default), header restoration is unnecessary
 // - Alloc path always writes fresh header before returning block to user (HAK_RET_ALLOC)
 // - ENV: HAKMEM_TINY_RESTORE_HEADER=1 to force header restoration (legacy mode)
-// P0.1: C7 uses offset 0 (overwrites header), C0-C6 use offset 1 (header preserved)
+// P0.1: C0 uses offset 0 (overwrites header), C1-C6 use offset 1 (header preserved)
+// Phase 13 v1: C7 uses offset 0 (default) or 1 (HAKMEM_TINY_C7_PRESERVE_HEADER=1)
 static inline __attribute__((always_inline)) void tiny_next_store(void* base, int class_idx, void* next) {
     size_t off = tiny_next_off(class_idx);


@ -0,0 +1,58 @@
# Phase 13 v1: Header Write Elimination (C7 preserve header) A/B Results
**Date**: 2025-12-14
**Verdict**: ⚪ **NEUTRAL** (Phase 13 v1 is frozen as a research box, default OFF)
Design: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_DESIGN.md`
Procedure: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_NEXT_INSTRUCTIONS.md`
---
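## 1. Goal
To address the Phase 12 gap hypothesis (header write tax), Phase 13 v1:
- **keeps the header rather than removing it**,
- makes the C7 freelist header-preserving (it no longer clobbers the header), and
- verifies whether **E5-2 (header write-once) can be extended to C7**.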
---
## 2. 4-point matrix (throughput)
| Case | HAKMEM_TINY_C7_PRESERVE_HEADER | HAKMEM_TINY_HEADER_WRITE_ONCE | ops/s | vs Case A |
|------|--------------------------------|-------------------------------|-------|----------|
| A | 0 | 0 | 51,490,500 | baseline |
| B | 0 | 1 | 52,070,600 | **+1.13%** |
| C | 1 | 0 | 51,355,200 | -0.26% |
| D | 1 | 1 | 51,891,902 | +0.78% |
Conclusions:
- Phase 13 v1 (Case D): **+0.78%** → **NEUTRAL** (below the +1.0% GO threshold)
- Important by-product: **E5-2 alone (Case B) measured +1.13%, which is GO-equivalent**
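The "vs Case A" column can be re-derived from the ops/s figures. The following is a minimal sketch, not code from the repository; `pct_delta` is a hypothetical helper name introduced only for illustration.

```c
#include <assert.h>

/* Relative throughput change of `value` versus `baseline`, in percent.
 * Hypothetical helper: not part of hakmem. */
double pct_delta(double baseline, double value) {
    assert(baseline > 0.0);
    return (value / baseline - 1.0) * 100.0;
}
```

Calling `pct_delta(51490500.0, 51891902.0)` gives roughly 0.78, which matches the Case D row.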
---
## 3. Verdict
### 3.1 Phase 13 v1 (C7 preserve header)
- **Verdict**: ⚪ NEUTRAL → **freeze as research box (default OFF)**
- Suspected cause:
  - The changed freelist next offset under C7 preserve cancels out the header writes that were saved (unconfirmed)
### 3.2 Phase 5 E5-2 (Header write-once)
- **Retest result**:
  - Single observation in the Phase 13 matrix: **+1.13%** (Case B)
  - Dedicated clean-env 20-run retest: **+0.54% (NEUTRAL)** → stays a research box (default OFF)
  - Details: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`
---
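## 4. Next Actions (recommended)
1. Freeze Phase 13 v1 (keep the code, default OFF)
2. Freeze E5-2 (default OFF)
3. Possible Phase 13 v1 derivative (if needed):
   - Study, as a research box, a design (v1b) that places the C7 next pointer at a "more aligned" position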


@ -0,0 +1,146 @@
# Phase 13: Header Write Elimination v1 (C7 Header-Preserving Freelist)
**Date**: 2025-12-14
**Status**: DESIGN (Phase 13 kickoff) → ⚪ **NEUTRAL (+0.78%)** (research box freeze, default OFF)
---
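## 0. Executive Summary (one page)
The Phase 12 comparison showed that **system malloc (glibc) is +63.7% faster than hakmem**, which made the **steady-state header write ("write tax")** the top-priority hypothesis for the next large structural difference.
However, the hakmem free hot path **reads the header** and relies on `HEADER_MAGIC`, so removing or corrupting the header would break safety.
Phase 13 v1 therefore keeps the header itself while moving to a design in which **the C7 freelist does not overwrite the header**, so that the existing **E5-2 (Header write-once)** **becomes applicable to C7 as well**.
Aim:
- C1-C6 can already skip the alloc-time header write under write-once
- **C7 currently has its header clobbered by the freelist next pointer on free, so alloc must rewrite the header every time**
- Moving the C7 next pointer to **base+1 (the start of the user area)** preserves the header, so write-once can drop the rewrite on the alloc side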
---
## 1. Current state (why only C7 rewrites the header every time)
### 1.1 Key premise (current source of truth)
- The free hot path (e.g. `free_tiny_fast()` in `core/front/malloc_tiny_fast.h`):
  - validates `HEADER_MAGIC` at `ptr-1`, and
  - extracts class_idx from the header
→ **Header correctness is a precondition for both safety and the fast path**
### 1.2 Scope of E5-2 (Header write-once)
- `tiny_header_finalize_alloc()` in `core/box/tiny_header_box.h` skips the alloc-time `tiny_region_id_write_header()` when:
  - `HAKMEM_TINY_HEADER_WRITE_ONCE=1`, and
  - `tiny_class_preserves_header(class_idx)=true` (C1-C6)
### 1.3 Why C7 cannot be write-once (root cause)
- In `tiny_nextptr_offset()` in `core/box/tiny_layout_box.h`:
  - C7 has `next_off=0` (next is written at `base+0`)
→ on free, **the header region is overwritten by the next pointer**
→ alloc must always run `tiny_region_id_write_header()` again
(C0 behaves the same way, but C0 has an 8B stride, so an 8B next pointer cannot be placed at `base+1`)
---
## 2. Proposal (Phase 13 v1)
### 2.1 Core change
**Move the C7 next pointer to `base+1` (the start of the user area)**:
- Before (current):
  - C7: `next_off=0` → `*(void**)base = next` (header destroyed)
- After (Phase 13 v1):
  - C7: `next_off=1` → `memcpy(base+1, &next, 8)` (header preserved)
This makes C7 a "header-preserving class", so E5-2 write-once also takes effect for C7.
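The before/after difference can be illustrated in isolation. This is a minimal sketch rather than the code in `core/tiny_nextptr.h`: `sketch_next_store` and `sketch_next_load` are hypothetical names, and the offset is passed in directly instead of being looked up per class.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store the freelist next pointer at base+off. memcpy keeps the access
 * well-defined even when base+off is not pointer-aligned (off == 1). */
void sketch_next_store(void *base, size_t off, void *next) {
    assert(base != NULL);
    memcpy((uint8_t *)base + off, &next, sizeof next);
}

/* Load the next pointer back from base+off. */
void *sketch_next_load(const void *base, size_t off) {
    void *next;
    assert(base != NULL);
    memcpy(&next, (const uint8_t *)base + off, sizeof next);
    return next;
}
```

With `off == 1` the byte at `base[0]` is never touched, so a header byte written earlier survives the free; with `off == 0` the pointer bytes replace it.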
### 2.2 Box Theory (box layout)
```
L0: tiny_c7_preserve_header_env_box (ENV gate, A/B, refresh)
L1: tiny_layout_box (SSOT for tiny_nextptr_offset)
L2: tiny_nextptr (next load/store consults the SSOT)
L3: tiny_header_box (class_preserves_header → write-once applies)
```
There is a single boundary:
- "The C7 next-offset decision" is concentrated in `tiny_nextptr_offset()` (no branching elsewhere)
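The single decision point can be pictured as a bitmask lookup. The sketch below assumes the two patterns quoted in the `tiny_layout_box.h` change (0x7E by default, 0xFE with C7 preserve); the function names are hypothetical, and the gate is passed as a parameter here instead of being read from the ENV box.

```c
#include <assert.h>
#include <stddef.h>

/* Next-pointer offset per size class. Bit i of the pattern is the offset
 * for class i: 0x7E = C1-C6 preserve, 0xFE = C1-C7 preserve. */
size_t sketch_nextptr_offset(int class_idx, int c7_preserve) {
    assert(class_idx >= 0 && class_idx <= 7);
    unsigned pattern = c7_preserve ? 0xFEu : 0x7Eu;
    return (pattern >> ((unsigned)class_idx & 7u)) & 1u;
}

/* A class preserves its header exactly when next is not stored at offset 0. */
int sketch_class_preserves_header(int class_idx, int c7_preserve) {
    return sketch_nextptr_offset(class_idx, c7_preserve) != 0;
}
```

Because the header predicate is derived from the offset, flipping the gate changes both layers together, which is the point of keeping one source of truth.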
### 2.3 Reversible (A/B)
- ENV: `HAKMEM_TINY_C7_PRESERVE_HEADER=0/1` (default: 0)
- Introduce it as a research box first; promote to a preset on GO
---
## 3. Safety / Invariants (Fail-Fast)
### 3.1 Invariants
- `tiny_next_store/load` **always** consult `tiny_nextptr_offset()` (no direct writes)
- `tiny_class_preserves_header(class_idx)` is determined by offset != 0 (no hard-coding)
- When C7 preserve is ON:
  - `*(uint8_t*)base == HEADER_MAGIC|cls` still holds after free (the header is not destroyed)
### 3.2 Fail-Fast (debug only)
- In debug builds only, when C7 preserve is ON:
  - report a `tiny_header_validate(base, 7, ...)` mismatch as a one-shot message
- Release builds never log; use stats counters only if needed
---
## 4. A/B measurement plan (same binary)
This change alters where the freelist next pointer lives, so it would normally be a layout difference; Phase 13 v1 makes it **switchable via ENV** to keep a same-binary A/B (a lesson from Phases 5-7).
### 4.1 4-point matrix (required)
| Case | HAKMEM_TINY_C7_PRESERVE_HEADER | HAKMEM_TINY_HEADER_WRITE_ONCE | Meaning |
|------|--------------------------------|-------------------------------|------|
| A | 0 | 0 | Current baseline |
| B | 0 | 1 | E5-2 only (C1-C6) |
| C | 1 | 0 | Move C7 next into the user area (header still written every time) |
| D | 1 | 1 | Phase 13 v1 target (C1-C7 write-once) |
### 4.2 GO/NO-GO (Mixed 10-run)
- GO: mean **+1.0% or more**
- NO-GO: mean **-1.0% or less**
- NEUTRAL: within ±1.0% → freeze (research box)
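The gate above reduces to two comparisons on the mean delta. A minimal sketch, with a hypothetical enum and function name that do not exist in the repository:

```c
#include <assert.h>

typedef enum { VERDICT_NO_GO, VERDICT_NEUTRAL, VERDICT_GO } verdict_t;

/* Classify a mean throughput delta (in percent) against the +/-1.0% gate. */
verdict_t classify_verdict(double mean_delta_pct) {
    assert(mean_delta_pct == mean_delta_pct); /* reject NaN */
    if (mean_delta_pct >= 1.0)  return VERDICT_GO;
    if (mean_delta_pct <= -1.0) return VERDICT_NO_GO;
    return VERDICT_NEUTRAL;
}
```

Under this rule a +0.78% mean is NEUTRAL, while a single +1.13% observation would classify as GO, which is why a borderline result calls for a dedicated retest before promotion.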
---
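## 5. Risks and mitigations
### Risk 1: The C7 next pointer becomes unaligned and the memcpy path is slower
- Mitigation: always measure Case C (no write-once) to isolate the cost of the layout change alone
- If C loses by a large margin:
  - consider a derivative with "C7 next offset=8 (aligned)" (Phase 13 v1b)
### Risk 2: Leftover class_idx hard-coding breaks
- Mitigation: clean up the hits from `rg "== 7|!= 7|C7 uses offset 0"` and route them through the SSOT (`tiny_layout_box`)
### Risk 3: ENV refresh does not follow the bench_profile putenv
- Mitigation: provide `*_env_refresh_from_env()` as in Phase 8 and call it from `bench_profile.h`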
---
## 6. Next (outlook beyond Phase 13)
Phase 13 v1 aims not to "remove" the header but to **reduce steady-state header rewrites**.
If the gap to system malloc is still large, the next major theme is:
- Port a thread cache (a tcache-like structure) into TinyUnifiedCache (Phase 14 candidate)


@ -0,0 +1,134 @@
# Phase 13: Header Write Elimination v1, Next Instructions (C7 preserve header)
## 0. Status
- Phase 12 showed that system malloc is +63.7% faster than hakmem → Phase 13 started
- Policy (v1): **keep the header**, but make the **C7 freelist stop destroying the header** so that the alloc-time header rewrite can be dropped
- Result: ⚪ **NEUTRAL (+0.78%) → freeze (default OFF)** (by-product: E5-2 measured +1.13%)
Design: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_DESIGN.md`
Results: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md`
---
## 1. Goal (GO criteria)
On Mixed 10-run (clean env):
- **GO**: mean +1.0% or more
- **NO-GO**: mean -1.0% or less (immediate rollback / freeze)
- **NEUTRAL**: within ±1.0% (freeze as research box)
---
## 2. Implementation patch order (small increments)
### Patch 1: L0 ENV Box (reversible)
New:
- `core/box/tiny_c7_preserve_header_env_box.h`
- `core/box/tiny_c7_preserve_header_env_box.c` (refresh)
Spec:
- ENV: `HAKMEM_TINY_C7_PRESERVE_HEADER=0/1` (default: 0)
- API:
  - `tiny_c7_preserve_header_enabled() -> int`
  - `tiny_c7_preserve_header_env_refresh_from_env()`
Requirement:
- **No getenv** on the hot path (lazy init + cached read only)
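The "lazy init + cached read" requirement can be sketched as follows. This is an illustration under assumptions, not the contents of the new ENV box: the variable name `SKETCH_C7_PRESERVE_HEADER`, the function names, and the parsing rule (first character is `'1'`) are all hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* -1 = not yet read, 0 = disabled, 1 = enabled. */
static _Atomic int g_sketch_gate = -1;

/* Cold path: parse the environment variable and publish the result. */
void sketch_gate_refresh_from_env(void) {
    const char *e = getenv("SKETCH_C7_PRESERVE_HEADER"); /* hypothetical name */
    int val = (e != NULL && e[0] == '1') ? 1 : 0;        /* assumed parsing rule */
    atomic_store_explicit(&g_sketch_gate, val, memory_order_relaxed);
}

/* Hot path: one relaxed load; getenv runs only while the cache is still -1. */
int sketch_gate_enabled(void) {
    int val = atomic_load_explicit(&g_sketch_gate, memory_order_relaxed);
    if (val == -1) {
        sketch_gate_refresh_from_env();
        val = atomic_load_explicit(&g_sketch_gate, memory_order_relaxed);
    }
    assert(val == 0 || val == 1);
    return val;
}
```

One consequence of caching is that a later change to the environment is not seen until the refresh function is called again, which is why an explicit refresh hook is part of the API.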
### Patch 2: L1 Layout SSOT change (single boundary)
Modify:
- `core/box/tiny_layout_box.h`
Change:
- Switch only the C7 case of `tiny_nextptr_offset(class_idx)` on the L0 gate
  - OFF: existing behavior (C7 off=0)
  - ON: C7 off=1 (header-preserving)
### Patch 3: Bring L2 NextPtr comments/assumptions in line with the SSOT
Modify (no change to code behavior):
- `core/tiny_nextptr.h`
- `core/box/tiny_header_box.h` (remove comments such as "C7=offset0 fixed", if any)
Aim:
- Leave no assumption that the C7 offset is fixed (nip design accidents in the bud)
### Patch 4: Sync refresh in the bench profile (prevent ENV accidents)
Modify:
- `core/bench_profile.h`
Add:
- Call `tiny_c7_preserve_header_env_refresh_from_env()` after `bench_setenv_default(...)`
(same pattern as Phase 8)
---
## 3. A/B test (4-point matrix required)
Use `scripts/run_mixed_10_cleanenv.sh` (prevents ENV leaks).
### Case A (baseline)
```sh
HAKMEM_TINY_C7_PRESERVE_HEADER=0 \
HAKMEM_TINY_HEADER_WRITE_ONCE=0 \
scripts/run_mixed_10_cleanenv.sh
```
### Case B (E5-2 only)
```sh
HAKMEM_TINY_C7_PRESERVE_HEADER=0 \
HAKMEM_TINY_HEADER_WRITE_ONCE=1 \
scripts/run_mixed_10_cleanenv.sh
```
### Case C (C7 preserve only)
```sh
HAKMEM_TINY_C7_PRESERVE_HEADER=1 \
HAKMEM_TINY_HEADER_WRITE_ONCE=0 \
scripts/run_mixed_10_cleanenv.sh
```
### Case D (Phase 13 v1 target)
```sh
HAKMEM_TINY_C7_PRESERVE_HEADER=1 \
HAKMEM_TINY_HEADER_WRITE_ONCE=1 \
scripts/run_mixed_10_cleanenv.sh
```
Additional (optional):
- Also take a 5-run with `HAKMEM_PROFILE=C6_HEAVY_LEGACY_POOLV1` (confirm no regression)
---
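## 4. Visibility (minimal)
Existing:
- Use `HAKMEM_TINY_HEADER_WRITE_ONCE_STATS=1` and
- confirm that the `alloc_skip_count / alloc_write_count` ratio increases
If adding something new (bare minimum):
- Add a C7-only counter only if "skips are increasing on C7" is not visible (avoid always-on atomics)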
---
## 5. Promotion (GO only)
On GO:
1. Add the default to `core/bench_profile.h`
   - `bench_setenv_default("HAKMEM_TINY_C7_PRESERVE_HEADER", "1");`
   - (if needed) also promote `HAKMEM_TINY_HEADER_WRITE_ONCE=1`
2. Append the Phase 13 v1 results (A/B table) to `CURRENT_TASK.md`
3. Document the rollback procedure
   - `export HAKMEM_TINY_C7_PRESERVE_HEADER=0`
On NO-GO:
- Freeze as research box (keep default OFF) and record the cause in the design notes


@ -9,6 +9,19 @@
 ---
+## Addendum (2025-12-14)
+In the Phase 13 v1 4-point matrix, `HAKMEM_TINY_HEADER_WRITE_ONCE=1` alone was observed at **+1.13%** (candidate).
+- Results: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md`
+- However, the dedicated clean-env 20-run retest came out at **+0.54% (NEUTRAL)**, so promotion was deferred.
+- Details: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`
+Conclusion:
+- E5-2 stays a research box (default OFF)
+---
 ## A/B Test Results (Mixed Workload)
 ### Configuration


@ -6,6 +6,12 @@
 **Baseline**: 43.998M ops/s (Mixed, 40M iters, ws=400, E4-1+E4-2+E5-1 ON)
 **Goal**: +1-3% by moving header writes from allocation hot path to refill cold boundary
+**Update (2025-12-14)**:
+- In the Phase 13 v1 4-point matrix, `HAKMEM_TINY_HEADER_WRITE_ONCE=1` alone was observed at **+1.13%** (candidate).
+- `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md`
+- The dedicated clean-env 20-run retest came out at **+0.54% (NEUTRAL)** → promotion deferred.
+- `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`
 ---
 ## Hypothesis


@ -0,0 +1,76 @@
# Phase 5 E5-2: Header Write-Once, Promotion Decision Instructions
**Status**: ✅ COMPLETE → ⚪ NEUTRAL (promotion deferred)
Results: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_RETEST_AB_TEST_RESULTS.md`
## 0. Background
Earlier E5-2 A/B runs were NEUTRAL, but in the Phase 13 v1 4-point matrix re-measurement
`HAKMEM_TINY_HEADER_WRITE_ONCE=1` alone recorded **+1.13%** and became a GO candidate.
References:
- Old result: `docs/analysis/PHASE5_E5_2_HEADER_REFILL_ONCE_AB_TEST_RESULTS.md`
- New observation: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md`
Goal: settle, with a dedicated A/B, **whether E5-2 can be promoted to a preset default**.
---
## 1. A/B procedure (clean env, same binary)
Recommended: Mixed 20-run (for a higher-confidence mean/median)
### A: baseline (WRITE_ONCE=0)
```sh
RUNS=20 HAKMEM_TINY_HEADER_WRITE_ONCE=0 scripts/run_mixed_10_cleanenv.sh
```
### B: optimized (WRITE_ONCE=1)
```sh
RUNS=20 HAKMEM_TINY_HEADER_WRITE_ONCE=1 scripts/run_mixed_10_cleanenv.sh
```
Optional:
- Also take 5-run measurements with `HAKMEM_PROFILE=C6_HEAVY_LEGACY_POOLV1` at 0/1 (confirm no regression)
---
## 2. Decision gate
- **GO**: Mixed 20-run mean **+1.0% or more** and median also positive
- **NO-GO**: mean **-1.0% or less**
- **NEUTRAL**: anything else (within ±1.0%) → stays a research box (default OFF)
---
## 3. Promotion procedure on GO (small patches)
### Patch P1: preset promotion
- Add to `core/bench_profile.h` (target preset):
  - `bench_setenv_default("HAKMEM_TINY_HEADER_WRITE_ONCE", "1");`
Promoting only `MIXED_TINYV3_C7_SAFE` at first is sufficient (C6-heavy is optional).
### Patch P2: update the cleanenv script (prevent ENV leaks)
Revisit the defaults in `scripts/run_mixed_10_cleanenv.sh`:
- After promotion, stop treating `HAKMEM_TINY_HEADER_WRITE_ONCE` as a "research knob"
- Example: `export HAKMEM_TINY_HEADER_WRITE_ONCE=${HAKMEM_TINY_HEADER_WRITE_ONCE:-1}`
(Existing behavior: `bench_setenv_default` cannot overwrite a value that is already exported)
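The interaction described above is what a "set only if unset" helper produces. The sketch below assumes such semantics and uses POSIX `setenv` with `overwrite = 0`; `sketch_setenv_default` is a hypothetical stand-in, and the real `bench_setenv_default` may be implemented differently.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdlib.h>

/* Set name=value only when name is not already present in the environment.
 * Returns 0 on success, -1 on error (same contract as setenv). */
int sketch_setenv_default(const char *name, const char *value) {
    assert(name != NULL && value != NULL);
    return setenv(name, value, 0); /* overwrite = 0: keep any existing value */
}
```

If the cleanenv script has already exported the knob as `0`, a later default of `"1"` from the profile is silently ignored, so the script default has to change together with the preset.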
### Patch P3: documentation update
- Consolidate the new re-measurement results into one document (e.g. `docs/analysis/PHASE5_E5_2_HEADER_REFILL_ONCE_RETEST_AB_TEST_RESULTS.md`)
- Record "E5-2 ADOPT" in `CURRENT_TASK.md`
---
## 4. On NO-GO/NEUTRAL
- `HAKMEM_TINY_HEADER_WRITE_ONCE` stays a research box (default OFF)
- Note the factors behind the difference from the old result (baseline drift / env leak / build shape) and freeze


@ -0,0 +1,177 @@
# Phase 5 E5-2: Header Write-Once, Retest Results (Promotion Decision)
**Date**: 2025-12-14
**Verdict**: ⚪ **NEUTRAL (+0.54%)**: stays a research box (default OFF)
Background: `docs/analysis/PHASE5_E5_2_HEADER_REFILL_ONCE_AB_TEST_RESULTS.md`
Instructions: `docs/analysis/PHASE5_E5_2_HEADER_WRITE_ONCE_PROMOTION_NEXT_INSTRUCTIONS.md`
---
## 1. Background
In the Phase 13 v1 4-point matrix A/B, `HAKMEM_TINY_HEADER_WRITE_ONCE=1` alone recorded **+1.13%** and emerged as a GO candidate, so a dedicated clean-env 20-run was used to decide on promotion.
Reference: `docs/analysis/PHASE13_HEADER_WRITE_ELIMINATION_1_AB_TEST_RESULTS.md` (Case B)
---
## 2. Test configuration
- **Benchmark**: scripts/run_mixed_10_cleanenv.sh
- **Profile**: MIXED_TINYV3_C7_SAFE
- **Iterations**: 20,000,000 per run
- **Working set**: 400
- **Runs**: 20 per case
- **ENV**: `HAKMEM_TINY_C7_PRESERVE_HEADER=0` fixed (C7 preserve is not used)
---
## 3. Results (20-run)
| Case | WRITE_ONCE | Mean (ops/s) | Median (ops/s) | Delta vs A |
|------|------------|--------------|----------------|------------|
| A (baseline) | 0 | 51,096,839 | 51,127,725 | — |
| B (optimized) | 1 | 51,371,358 | 51,495,811 | **+0.54%** |
---
## 4. Verdict
### 4.1 GO criteria
- Mean **+1.0%** or more and Median also positive
- This run: Mean +0.54%, Median +0.72%
### 4.2 Verdict
- **NEUTRAL (+0.54%)** → stays a research box (default OFF)
- Did not reach the GO threshold (+1.0%)
---
## 5. Discussion
### 5.1 Difference from the Phase 13 4-point matrix
| Test | WRITE_ONCE=1 result | Runs | Baseline |
|------|-------------------|------|----------|
| Phase 13 (Case B) | **+1.13%** | 10 | 51,490,500 ops/s |
| This test (dedicated 20-run) | **+0.54%** | 20 | 51,096,839 ops/s |
**Factors behind the difference**:
1. **Baseline drift**: Phase 13 baseline (51.49M) vs this test (51.10M), a difference of about -0.76%
2. **Number of runs**: 10-run vs 20-run (20-run is more reliable)
3. **ENV contamination**: Phase 13 ran the 4 cases back to back (possible ENV leak)
### 5.2 Comparison with the old Phase 5 E5-2 result
Old test (`PHASE5_E5_2_HEADER_REFILL_ONCE_AB_TEST_RESULTS.md`):
- Result: +0.45% (NEUTRAL)
- This test: +0.54% (NEUTRAL)
**Consistency**: both tests fall within the NEUTRAL range
---
## 6. Next Actions
### 6.1 Handling of E5-2
- ✅ Keep as research box (default OFF, manual opt-in)
- ENV: `HAKMEM_TINY_HEADER_WRITE_ONCE=0/1` (default: 0)
### 6.2 Handling of Phase 13 v1
- ✅ Keep as research box (default OFF)
- ENV: `HAKMEM_TINY_C7_PRESERVE_HEADER=0/1` (default: 0)
### 6.3 Next optimization
Return to the Phase 12 Strategic Pause gap-hypothesis list:
1. ~~Header write tax~~ → Phase 13 v1 NEUTRAL, E5-2 NEUTRAL
2. **Pointer chase overhead** (next candidate)
3. Lock contention (if applicable)
4. Memory fence overhead
5. Metadata access patterns
---
## 7. Raw Data
### Case A (baseline, WRITE_ONCE=0)
```
Run 1: 50725850 ops/s
Run 2: 51547217 ops/s
Run 3: 51076712 ops/s
Run 4: 51527474 ops/s
Run 5: 51193070 ops/s
Run 6: 51597708 ops/s
Run 7: 52239171 ops/s
Run 8: 52386008 ops/s
Run 9: 51618321 ops/s
Run 10: 50919588 ops/s
Run 11: 52415403 ops/s
Run 12: 51125404 ops/s
Run 13: 49785086 ops/s
Run 14: 50915858 ops/s
Run 15: 51130046 ops/s
Run 16: 48960162 ops/s
Run 17: 51385756 ops/s
Run 18: 50849945 ops/s
Run 19: 50550500 ops/s
Run 20: 49987500 ops/s
Mean: 51096838.95 ops/s
Median: 51127725.00 ops/s
```
### Case B (optimized, WRITE_ONCE=1)
```
Run 1: 51594697 ops/s
Run 2: 50145581 ops/s
Run 3: 52268972 ops/s
Run 4: 52083686 ops/s
Run 5: 50612405 ops/s
Run 6: 50556552 ops/s
Run 7: 49910193 ops/s
Run 8: 52657108 ops/s
Run 9: 52053748 ops/s
Run 10: 51957521 ops/s
Run 11: 52417281 ops/s
Run 12: 51712162 ops/s
Run 13: 51531743 ops/s
Run 14: 50832685 ops/s
Run 15: 51337254 ops/s
Run 16: 51218309 ops/s
Run 17: 50110155 ops/s
Run 18: 51459878 ops/s
Run 19: 51931080 ops/s
Run 20: 51036152 ops/s
Mean: 51371358.10 ops/s
Median: 51495810.50 ops/s
```
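The Mean and Median lines in the raw data can be re-derived from the per-run values. The following is a minimal sketch using the Case A runs listed above; the function and array names are hypothetical and are not taken from the benchmark scripts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Ascending comparator for qsort over doubles. */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

double sketch_mean(const double *v, size_t n) {
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) sum += v[i];
    return sum / (double)n;
}

/* Median of v[0..n-1]; sorts a private copy so the input is unchanged. */
double sketch_median(const double *v, size_t n) {
    assert(n > 0);
    double *tmp = malloc(n * sizeof *tmp);
    assert(tmp != NULL);
    memcpy(tmp, v, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, cmp_double);
    double m = (n % 2) ? tmp[n / 2] : (tmp[n / 2 - 1] + tmp[n / 2]) / 2.0;
    free(tmp);
    return m;
}

/* Case A (baseline, WRITE_ONCE=0) runs in ops/s, copied from the raw data. */
const double case_a_runs[20] = {
    50725850, 51547217, 51076712, 51527474, 51193070,
    51597708, 52239171, 52386008, 51618321, 50919588,
    52415403, 51125404, 49785086, 50915858, 51130046,
    48960162, 51385756, 50849945, 50550500, 49987500,
};
```

For Case A, `sketch_median(case_a_runs, 20)` is the average of the 10th and 11th sorted values (51,125,404 and 51,130,046), which reproduces the reported median of 51,127,725.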
---
## 8. Rollback procedure
Phase 5 E5-2 is ENV-gated and default OFF, so no rollback is needed.
To disable it manually:
```sh
export HAKMEM_TINY_HEADER_WRITE_ONCE=0
```
---
## 9. Summary
Phase 5 E5-2 (Header Write-Once) recorded **+0.54% (NEUTRAL)** in the 20-run retest.
- Did not reach the GO threshold (+1.0%)
- Kept as a research box (default OFF, manual opt-in)
- Phase 13 v1 likewise stays a research box
Next step: move on to the next gap hypothesis from the Phase 12 Strategic Pause


@ -11,6 +11,9 @@ runs=${RUNS:-10}
 # Force known research knobs OFF to avoid accidental carry-over.
 export HAKMEM_TINY_HEADER_WRITE_ONCE=${HAKMEM_TINY_HEADER_WRITE_ONCE:-0}
+export HAKMEM_TINY_C7_PRESERVE_HEADER=${HAKMEM_TINY_C7_PRESERVE_HEADER:-0}
+export HAKMEM_TINY_TCACHE=${HAKMEM_TINY_TCACHE:-0}
+export HAKMEM_TINY_TCACHE_CAP=${HAKMEM_TINY_TCACHE_CAP:-64}
 export HAKMEM_MALLOC_TINY_DIRECT=${HAKMEM_MALLOC_TINY_DIRECT:-0}
 export HAKMEM_ENV_SNAPSHOT_SHAPE=${HAKMEM_ENV_SNAPSHOT_SHAPE:-0}
 export HAKMEM_FREE_TINY_FAST_MONO_DUALHOT=${HAKMEM_FREE_TINY_FAST_MONO_DUALHOT:-0}
@ -20,4 +23,3 @@ for i in $(seq 1 "${runs}"); do
   echo "=== Run ${i}/${runs} ==="
   HAKMEM_PROFILE="${profile}" ./bench_random_mixed_hakmem "${iters}" "${ws}" 1 2>&1 | rg "Throughput" || true
 done