Files
hakmem/docs/analysis/PHASE5_POST_E1_NEXT_INSTRUCTIONS.md
Moe Charm (CI) 4a070d8a14 Phase 5 E4-1: Free Wrapper ENV Snapshot (+3.51% GO, ADOPTED)
Target: Consolidate free wrapper TLS reads (2→1)
- free() is 25.26% self% (top hot spot)
- Strategy: Apply E1 success pattern (ENV snapshot) to free path

Implementation:
- ENV gate: HAKMEM_FREE_WRAPPER_ENV_SNAPSHOT=0/1 (default 0)
- core/box/free_wrapper_env_snapshot_box.{h,c}: New box
  - Consolidates 2 TLS reads → 1 TLS read (50% reduction)
  - Reduces 4 branches → 3 branches (25% reduction)
  - Lazy init with probe window (bench_profile putenv sync)
- core/box/hak_wrappers.inc.h: Integration in free() wrapper
- Makefile: Add free_wrapper_env_snapshot_box.o to all targets

A/B Test Results (Mixed, 10-run, 20M iters):
- Baseline (SNAPSHOT=0): 45.35M ops/s (mean), 45.31M ops/s (median)
- Optimized (SNAPSHOT=1): 46.94M ops/s (mean), 47.15M ops/s (median)
- Improvement: +3.51% mean, +4.07% median

Decision: GO (+3.51% >= +1.0% threshold)
- Exceeded conservative estimate (+1.5% → +3.51%)
- Similar efficiency to E1 (+3.92%)
- Health check: PASS (all profiles)
- Action: PROMOTED to MIXED_TINYV3_C7_SAFE preset

Phase 5 Cumulative:
- E1 (ENV Snapshot): +3.92%
- E4-1 (Free Wrapper Snapshot): +3.51%
- Total Phase 4-5: ~+7.5%

E3-4 Correction:
- Phase 4 E3-4 (ENV Constructor Init): NO-GO / FROZEN
- Initial A/B showed +4.75%, but investigation revealed:
  - Branch prediction hint mismatch (UNLIKELY with always-true)
  - Retest confirmed -1.78% regression
  - Root cause: __builtin_expect(..., 0) with ctor_mode==1
- Decision: Freeze as research box (default OFF)
- Learning: Branch hints need careful tuning, TLS consolidation safer

Deliverables:
- docs/analysis/PHASE5_E4_FREE_GATE_OPTIMIZATION_1_DESIGN.md
- docs/analysis/PHASE5_E4_1_FREE_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md
- docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md (next)
- docs/analysis/PHASE5_POST_E1_NEXT_INSTRUCTIONS.md
- docs/analysis/ENV_PROFILE_PRESETS.md (E4-1 added, E3-4 corrected)
- CURRENT_TASK.md (E4-1 complete, E3-4 frozen)
- core/bench_profile.h (E4-1 promoted to default)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-14 04:24:34 +09:00

72 lines
1.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Phase 5: Post-E1 Baseline & Next Target次の指示書
## Status2025-12-14
- Phase 4 の勝ち箱は **E1ENV Snapshot**`MIXED_TINYV3_C7_SAFE` で default 化)
- E3-4ENV CTOR**NO-GO / freeze**
- Phase 5 の勝ち箱: **E4-1free wrapper snapshot**`MIXED_TINYV3_C7_SAFE` で default 化)
- 次は “形” ではなく **wrapper 入口の ENV/TLS** を削るE4-2か、perf で self% ≥ 5% を殴る
---
## Step 0: Baseline 固定Mixed
```sh
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE ./bench_random_mixed_hakmem 20000000 400 1
```
注意:
- 以後の A/B はこのプロファイル(=E1 ONを基準にする
---
## Step 1: perf で “芯” を選ぶself% ≥ 5%
```sh
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE perf record -F 99 -- \
./bench_random_mixed_hakmem 20000000 400 1
perf report --stdio --no-children
```
GO/NO-GO:
- self% が **5% 未満**の最適化は原則 NO-GOまず他を削る
---
## Step 2: 研究箱の候補を 1 つに絞るBox Theory
要件:
- L0 ENV gatedefault OFFを必ず用意戻せる
- 境界は 1 箇所(変換点を増やさない)
- 可視化はカウンタ 1 本まで(常時ログ禁止)
---
## Step 3: A/B で GO 判定Mixed
Mixed 10-run:
- GO: mean **+1.0% 以上**
- ±1%: NEUTRALfreeze
- -1% 以下: NO-GOfreeze
---
## Step 4: 健康診断
```sh
scripts/verify_health_profiles.sh
```
---
## Step 5: 昇格
- 勝ち箱だけを `core/bench_profile.h` のプリセットへ
- `docs/analysis/ENV_PROFILE_PRESETS.md` に結果rollback を追記
- `CURRENT_TASK.md` を更新
## Next
- E4-1 昇格: `docs/analysis/PHASE5_E4_1_FREE_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md`
- E4-2 設計/実装: `docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md`