Phase 5 E4-1: Free Wrapper ENV Snapshot (+3.51% GO, ADOPTED)
Target: Consolidate free wrapper TLS reads (2→1)
- free() is 25.26% self% (top hot spot)
- Strategy: Apply E1 success pattern (ENV snapshot) to free path
Implementation:
- ENV gate: HAKMEM_FREE_WRAPPER_ENV_SNAPSHOT=0/1 (default 0)
- core/box/free_wrapper_env_snapshot_box.{h,c}: New box
  - Consolidates 2 TLS reads → 1 TLS read (50% reduction)
  - Reduces 4 branches → 3 branches (25% reduction)
  - Lazy init with probe window (bench_profile putenv sync)
- core/box/hak_wrappers.inc.h: Integration in free() wrapper
- Makefile: Add free_wrapper_env_snapshot_box.o to all targets
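
A minimal sketch of the snapshot idea described above, not the actual contents of free_wrapper_env_snapshot_box.{h,c}: two gate flags that would otherwise each cost a TLS read are packed into one thread-local struct, so the free() wrapper touches TLS once. The struct, the function names, and the EXAMPLE_* variables are hypothetical, and the probe window / putenv sync is left out.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical snapshot: both gate flags live in one TLS object. */
typedef struct {
    bool inited;       /* set after the first (lazy) initialization */
    bool wrapper_on;   /* hypothetical gate flag A */
    bool fast_free_on; /* hypothetical gate flag B */
} free_env_snapshot_t;

static _Thread_local free_env_snapshot_t g_free_env_snap;

/* Treats a variable as enabled only when it is exactly "1". */
static bool env_flag(const char *name) {
    const char *v = getenv(name);
    return v != NULL && strcmp(v, "1") == 0;
}

/* One TLS access returns every flag the wrapper needs. */
static inline const free_env_snapshot_t *free_env_snapshot_get(void) {
    free_env_snapshot_t *s = &g_free_env_snap;
    if (!s->inited) {
        /* EXAMPLE_* names are placeholders, not real HAKMEM variables. */
        s->wrapper_on = env_flag("EXAMPLE_FREE_WRAPPER");
        s->fast_free_on = env_flag("EXAMPLE_FREE_FAST");
        s->inited = true;
    }
    return s;
}

/* The hot path reads both flags through the single snapshot pointer. */
static inline bool free_fast_path_enabled(void) {
    const free_env_snapshot_t *s = free_env_snapshot_get();
    return s->wrapper_on && s->fast_free_on;
}
```

A free() wrapper would call free_fast_path_enabled() once per call; after the first call on a thread, no further getenv() lookups happen.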
A/B Test Results (Mixed, 10-run, 20M iters):
- Baseline (SNAPSHOT=0): 45.35M ops/s (mean), 45.31M ops/s (median)
- Optimized (SNAPSHOT=1): 46.94M ops/s (mean), 47.15M ops/s (median)
- Improvement: +3.51% mean, +4.07% median
Decision: GO (+3.51% >= +1.0% threshold)
- Exceeded conservative estimate (+1.5% → +3.51%)
- Similar efficiency to E1 (+3.92%)
- Health check: PASS (all profiles)
- Action: PROMOTED to MIXED_TINYV3_C7_SAFE preset
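
The decision arithmetic above can be sketched as follows. The GO threshold (+1.0%) is taken from this message; the NEUTRAL and NO-GO bands follow the verdict rule quoted in the E3-4 instructions diff (within ±1%, and -1% or less). The function and enum names are hypothetical.

```c
/* Verdict bands for a 10-run mean A/B comparison. */
typedef enum { VERDICT_NO_GO, VERDICT_NEUTRAL, VERDICT_GO } ab_verdict_t;

/* Relative throughput change in percent; baseline_ops must be non-zero. */
static double ab_improvement_pct(double baseline_ops, double optimized_ops) {
    return (optimized_ops - baseline_ops) / baseline_ops * 100.0;
}

/* GO at +1.0% or more, NO-GO at -1.0% or less, NEUTRAL in between. */
static ab_verdict_t ab_classify(double pct) {
    if (pct >= 1.0) return VERDICT_GO;
    if (pct <= -1.0) return VERDICT_NO_GO;
    return VERDICT_NEUTRAL;
}
```

With the means reported above, ab_improvement_pct(45.35, 46.94) is (46.94 - 45.35) / 45.35 × 100, about 3.51, which ab_classify() maps to VERDICT_GO.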
Phase 5 Cumulative:
- E1 (ENV Snapshot): +3.92%
- E4-1 (Free Wrapper Snapshot): +3.51%
- Total Phase 4-5: ~+7.5%
E3-4 Correction:
- Phase 4 E3-4 (ENV Constructor Init): NO-GO / FROZEN
- Initial A/B showed +4.75%, but investigation revealed:
  - Branch prediction hint mismatch (UNLIKELY hint on an always-true condition)
  - Retest confirmed -1.78% regression
- Root cause: __builtin_expect(..., 0) with ctor_mode==1
- Decision: Freeze as research box (default OFF)
- Learning: Branch hints need careful tuning, TLS consolidation safer
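
To illustrate the hint mismatch, here is a hedged sketch using the GCC/Clang builtin __builtin_expect. The macro names, the g_ctor_mode variable, and both functions are hypothetical stand-ins rather than the real E3-4 code. Both variants return the same value; only the hint differs, and the compiler may use it to lay out the always-taken path as the cold one.

```c
/* Conventional wrappers around the GCC/Clang builtin. */
#define EX_LIKELY(x)   __builtin_expect(!!(x), 1)
#define EX_UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Hypothetical: stays 1 once constructor-time init has run. */
static int g_ctor_mode = 1;

/* Mismatched: the hint says "rarely true", yet with ctor_mode==1
 * the condition holds on every call. */
static int snapshot_ready_mismatched(void) {
    if (EX_UNLIKELY(g_ctor_mode == 1)) {
        return 1;
    }
    return 0;
}

/* Matched: the hint agrees with the runtime behavior. */
static int snapshot_ready_matched(void) {
    if (EX_LIKELY(g_ctor_mode == 1)) {
        return 1;
    }
    return 0;
}
```

Because the hint changes only code layout and static prediction, a mismatch like this is invisible to functional tests and shows up only in A/B measurements.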
Deliverables:
- docs/analysis/PHASE5_E4_FREE_GATE_OPTIMIZATION_1_DESIGN.md
- docs/analysis/PHASE5_E4_1_FREE_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md
- docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md (next)
- docs/analysis/PHASE5_POST_E1_NEXT_INSTRUCTIONS.md
- docs/analysis/ENV_PROFILE_PRESETS.md (E4-1 added, E3-4 corrected)
- CURRENT_TASK.md (E4-1 complete, E3-4 frozen)
- core/bench_profile.h (E4-1 promoted to default)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@@ -2,16 +2,15 @@
 
 ## Status (2025-12-14)
 
-- ✅ Implemented (research box / default OFF)
-- A/B (Mixed, 10-run, iter=20M, ws=400, E1=1) observed **+4.75% mean / +4.35% median**
+- ❌ NO-GO / FROZEN (default OFF)
+- Re-validation A/B (Mixed, 10-run, iter=20M, ws=400, E1=1): **-1.44% mean / -1.03% median**
 - ENV:
   - E1: `HAKMEM_ENV_SNAPSHOT=0/1` (default 0)
   - E3-4: `HAKMEM_ENV_SNAPSHOT_CTOR=0/1` (default 0, requires E1=1)
 
 ## Goal
 
-1) Reconfirm the "E3-4 win" and lock it in
-2) Decide whether to promote it to the mainline (preset), in a form that can be rolled back
+E3-4 is frozen, so the instructions here are "keep frozen / rollback", not "reproduce and validate".
 
 ---
 
@@ -30,7 +29,7 @@ scripts/verify_health_profiles.sh
 
 ---
 
-## Step 2: A/B (Mixed 10-run)
+## Step 2: Reproduction check (only if needed)
 
 Mixed 10-run (iter=20M, ws=400):
 
@@ -49,9 +48,7 @@ HAKMEM_ENV_SNAPSHOT_CTOR=1 \
 ```
 
 Verdict (10-run mean):
-- GO: **+1.0% or more**
-- ±1%: NEUTRAL (keep as research box)
-- -1% or less: NO-GO (freeze)
+- -1% or less → keep frozen (current state)
 
 Note:
 - For the "constructor pre-main init" to take effect, set the ENV before launching the process (a bench_profile putenv alone is too late).
@@ -75,20 +72,10 @@ perf report --stdio --no-children
 
 ---
 
-## Step 4: Promotion (GO only)
+## Step 4: Mainline adoption (E1 only)
 
-### Option A (recommended, safe): promote only E1 to the preset, keep E3-4 opt-in
-
-- `core/bench_profile.h` (`MIXED_TINYV3_C7_SAFE`):
-  - `bench_setenv_default("HAKMEM_ENV_SNAPSHOT","1");`
-  - Do not add `HAKMEM_ENV_SNAPSHOT_CTOR` (it stays a research box)
-- Add the recommended E1/E3-4 settings to `docs/analysis/ENV_PROFILE_PRESETS.md`
-- Update `CURRENT_TASK.md`
-
-### Option B (aggressive): promote E1+E3-4 to the preset
-
-- Only after passing a 20-run validation (both mean and median)
-- Note: if `HAKMEM_ENV_SNAPSHOT_CTOR=1` becomes the preset default, also revisit the branch hints/expected values (do not contaminate the baseline)
+- `HAKMEM_ENV_SNAPSHOT_CTOR=1` is not mainlined (freeze)
+- E1 (`HAKMEM_ENV_SNAPSHOT=1`) is a winning box, so promoting it to the preset takes priority
 
 ---
 
@@ -103,4 +90,4 @@ HAKMEM_ENV_SNAPSHOT_CTOR=0
 
 ## Next (Phase 4 Close)
 
-- Once it is decided how much of E1/E3-4 goes into the mainline, CLOSE Phase 4 (winning boxes go to presets, research boxes are frozen).
+- Lock in "winning box = E1" and CLOSE Phase 4. Next, use perf to pick the next core target.