Combined A/B Test Results (10-run Mixed):
- Baseline (both OFF): 44.48M ops/s (mean), 44.39M ops/s (median)
- Optimized (both ON): 47.34M ops/s (mean), 47.38M ops/s (median)
- Improvement: +6.43% mean, +6.74% median

Interaction Analysis:
- E4-1 alone: +3.51% (measured in separate session)
- E4-2 alone: +21.83% (measured in separate session)
- Combined: +6.43% (measured in same binary)
- Pattern: SUBADDITIVE (overlapping bottlenecks)

Key Finding: Single-binary incremental gain is the accurate metric
- E4-1 and E4-2 target overlapping TLS/branch resources
- Individual measurements were from different baselines/sessions
- Combined measurement (same binary, both flags) shows true progress

Phase 5 Total Progress:
- Original baseline (session start): 35.74M ops/s
- Combined optimized: 47.34M ops/s
- Total gain: +32.4% (cross-session, reference only)
- Same-binary gain: +6.43% (E4-1+E4-2 both ON vs both OFF)

New Baseline Perf Profile (47.0M ops/s):
- free: 37.56% self% (still top hotspot)
- tiny_alloc_gate_fast: 13.73% (reduced from 19.50%)
- malloc: 12.95% (reduced from 16.13%)
- tiny_region_id_write_header: 6.97% (header write tax)
- hakmem_env_snapshot_enabled: 4.29% (ENV overhead visible)

Health Check: PASS
- MIXED_TINYV3_C7_SAFE: 42.3M ops/s
- C6_HEAVY_LEGACY_POOLV1: 20.9M ops/s

Phase 5 E5 Candidates (from perf profile):
- E5-1: free() path internals (37.56% self%)
- E5-2: Header write reduction (6.97% self%)
- E5-3: ENV snapshot overhead (4.29% self%)

Deliverables:
- docs/analysis/PHASE5_E4_COMBINED_AB_TEST_RESULTS.md
- docs/analysis/PHASE5_E5_NEXT_INSTRUCTIONS.md
- CURRENT_TASK.md (E4 combined complete, E5 candidates)
- docs/analysis/PHASE5_POST_E1_NEXT_INSTRUCTIONS.md (E5 pointer)
- perf.data.e4combined (perf profile data)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
75 lines
2.0 KiB
Markdown
# Phase 5: Post-E1 Baseline & Next Target (Next Instructions)

## Status (2025-12-14)

- Phase 4 winning box: **E1 (ENV Snapshot)** (now the default in `MIXED_TINYV3_C7_SAFE`)
- E3-4 (ENV CTOR): **NO-GO / freeze**
- Phase 5 winning box: **E4-1 (free wrapper snapshot)** (now the default in `MIXED_TINYV3_C7_SAFE`)
- Phase 5 winning box: **E4-2 (malloc wrapper snapshot)** (now the default in `MIXED_TINYV3_C7_SAFE`)
- Next: instead of reworking the "shape", re-run perf on the **new baseline** and go after the "core" (hotspots with self% ≥ 5%)

---

## Step 0: Pin the baseline (Mixed)

```sh
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE ./bench_random_mixed_hakmem 20000000 400 1
```

Note:
- All subsequent A/B tests use this profile (= E1 ON) as the baseline

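A 10-run baseline is easier to compare when it is reduced to a mean and a median. The helper below is a minimal sketch: `mean_median` is a hypothetical name, and it assumes you have already extracted one throughput value per run (one number per line), since the benchmark's output format is not shown in this document. Pipe the ten per-run values into it to get `mean median` on one line.

```sh
# Hypothetical helper: reads one throughput value per line on stdin
# and prints "mean median", both with two decimals.
mean_median() {
    sort -n | awk '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) exit 1            # no samples: report failure
            if (NR % 2) med = v[(NR + 1) / 2]
            else        med = (v[NR / 2] + v[NR / 2 + 1]) / 2
            printf "%.2f %.2f\n", sum / NR, med
        }'
}

# Usage (values are placeholders, not measurements):
#   printf '%s\n' 44.0 46.0 45.0 49.0 | mean_median
```
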
---

## Step 1: Pick the "core" with perf (self% ≥ 5%)

```sh
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE perf record -F 99 -- \
  ./bench_random_mixed_hakmem 20000000 400 1
perf report --stdio --no-children
```

GO/NO-GO:
- Optimizations targeting symbols with **less than 5%** self% are NO-GO in principle (cut something else first)

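The 5% rule can be applied mechanically to the report. The filter below is a sketch, not part of the project's scripts: `hot_symbols` is a hypothetical name, and it assumes that with `--no-children` the first column of each data row is the self overhead written as `NN.NN%`. Rows that do not start with a percentage (headers, comments, blank lines) are dropped.

```sh
# Hypothetical filter: keep only report rows whose first column (self%)
# is at or above the threshold given as $1 (default 5).
hot_symbols() {
    awk -v min="${1:-5}" '
        $1 ~ /^[0-9.]+%$/ {
            pct = $1
            sub(/%/, "", pct)              # "37.56%" -> "37.56"
            if (pct + 0 >= min + 0) print
        }'
}

# Usage:
#   perf report --stdio --no-children | hot_symbols 5
```
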
---
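
## Step 2: Narrow the research box candidates down to one (Box Theory)

Requirements:
- Always provide an L0 ENV gate (default OFF) so the change can be rolled back
- Keep the boundary in one place (do not add conversion points)
- Limit visibility to a single counter (no always-on logging)
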
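Seen from the command line, a default-OFF gate means the box stays disabled unless the variable is explicitly set to `1`, and rollback is just running without it. The sketch below illustrates that contract only. `HAKMEM_EXAMPLE_BOX`, `gate_enabled`, and `with_gate` are hypothetical names, and the real gate check lives in the allocator's C code, which is not shown here.

```sh
# HAKMEM_EXAMPLE_BOX is a hypothetical gate name used only for illustration.
# Default OFF: anything other than an explicit "1" leaves the box disabled.
gate_enabled() {
    [ "${HAKMEM_EXAMPLE_BOX:-0}" = "1" ]
}

# Run a command with the gate forced to a value (0 = rolled back, 1 = box ON).
with_gate() {
    gate_value=$1
    shift
    env HAKMEM_EXAMPLE_BOX="$gate_value" "$@"
}

# Usage (A/B against the same binary):
#   with_gate 0 ./bench_random_mixed_hakmem 20000000 400 1
#   with_gate 1 ./bench_random_mixed_hakmem 20000000 400 1
```
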
---

## Step 3: GO decision by A/B (Mixed)

Mixed 10-run:
- GO: mean **+1.0% or more**
- Within ±1%: NEUTRAL (freeze)
- -1% or below: NO-GO (freeze)

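The thresholds above reduce to one relative-gain calculation, `(optimized - baseline) / baseline * 100`. The function below is a sketch under that assumption. `ab_verdict` is a hypothetical name, it takes the two 10-run means as arguments, and it treats exactly -1.0% as NO-GO. With the E4 combined means from the commit message (44.48M and 47.34M ops/s) it prints `+6.43% GO`.

```sh
# Hypothetical helper: ab_verdict BASELINE_MEAN OPTIMIZED_MEAN
# Prints the relative gain and the GO / NEUTRAL / NO-GO verdict.
ab_verdict() {
    awk -v base="$1" -v opt="$2" 'BEGIN {
        if (base + 0 <= 0) { print "invalid baseline"; exit 1 }
        gain = (opt - base) / base * 100
        if (gain >= 1.0)       verdict = "GO"
        else if (gain <= -1.0) verdict = "NO-GO"
        else                   verdict = "NEUTRAL"
        printf "%+.2f%% %s\n", gain, verdict
    }'
}

# Usage:
#   ab_verdict 44.48 47.34
```
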
---

## Step 4: Health check

```sh
scripts/verify_health_profiles.sh
```

---

## Step 5: Promotion

- Promote only the winning boxes into the presets in `core/bench_profile.h`
- Append the results and rollback procedure to `docs/analysis/ENV_PROFILE_PRESETS.md`
- Update `CURRENT_TASK.md`

## Next

- E4-1 promotion: `docs/analysis/PHASE5_E4_1_FREE_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md`
- E4-2 design/implementation: `docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md`
- E4 combined A/B: `docs/analysis/PHASE5_E4_COMBINED_AB_TEST_NEXT_INSTRUCTIONS.md`
- E5 next core: `docs/analysis/PHASE5_E5_NEXT_INSTRUCTIONS.md`