# Phase 4 E1: ENV Snapshot Consolidation (Next Instructions)
## Status (2025-12-14)
- ✅ GO (commit: `88717a873`)
- Mixed A/B (10-run, iter=20M, ws=400): **+3.92% avg / +4.01% median**
- Current state: kept as opt-in (default OFF)
## Goal
Consolidate the ENV gate calls on the MIXED hot path into a single snapshot read, targeting **+2.5% or better**.
Targets (combined perf self% ≈ 3.26%):
- `tiny_c7_ultra_enabled_env()`
- `tiny_front_v3_enabled()`
- `tiny_metadata_cache_enabled()`
## Step 0: Pre-check (current state)
Record a perf profile for Mixed (iter=20M, ws=400) and confirm that the three functions above rank near the top:
```sh
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE perf record -F 99 -- \
./bench_random_mixed_hakmem 20000000 400 1
perf report --stdio --no-children
```
## Step 1: Add the L0 box (EnvSnapshotBox)
New files:
- `core/box/hakmem_env_snapshot_box.h`
- `core/box/hakmem_env_snapshot_box.c`

Requirements:
- ENV: `HAKMEM_ENV_SNAPSHOT=0/1` (default 0)
- Provide `hakmem_env_snapshot_refresh_from_env()` (getenv only, no malloc)
- Keep `hakmem_env_snapshot_get_fast()` to roughly "1 load + 1 branch" on the hot path
- Compute `tiny_metadata_cache_eff = HAKMEM_TINY_METADATA_CACHE && !learner` inside the snapshot
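
The requirements above could be sketched roughly as follows. This is a minimal illustration, not the actual box: the struct layout, the `env_flag_is_on` helper, and the `EXAMPLE_*` variable names are assumptions made for the example (the real gates may use different ENV names, defaults, and parsing). Only `HAKMEM_ENV_SNAPSHOT`, `HAKMEM_TINY_METADATA_CACHE`, `tiny_metadata_cache_eff`, and the two function names come from this document.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical snapshot layout; only tiny_metadata_cache_eff is named in the doc. */
typedef struct {
    bool enabled;                 /* HAKMEM_ENV_SNAPSHOT=1 */
    bool tiny_c7_ultra;           /* stands in for tiny_c7_ultra_enabled_env() */
    bool tiny_front_v3;           /* stands in for tiny_front_v3_enabled() */
    bool tiny_metadata_cache_eff; /* HAKMEM_TINY_METADATA_CACHE && !learner */
} HakmemEnvSnapshot;

/* Zero-initialized, so every gate reads as OFF until the first refresh. */
static HakmemEnvSnapshot g_hakmem_env_snapshot;

/* getenv only, no allocation. Treats a value starting with '1' as ON (assumed parsing). */
static bool env_flag_is_on(const char *name) {
    const char *v = getenv(name);
    return v != NULL && v[0] == '1';
}

void hakmem_env_snapshot_refresh_from_env(void) {
    HakmemEnvSnapshot s;
    s.enabled       = env_flag_is_on("HAKMEM_ENV_SNAPSHOT");
    s.tiny_c7_ultra = env_flag_is_on("EXAMPLE_TINY_C7_ULTRA"); /* hypothetical name */
    s.tiny_front_v3 = env_flag_is_on("EXAMPLE_TINY_FRONT_V3"); /* hypothetical name */
    bool learner    = env_flag_is_on("EXAMPLE_LEARNER");       /* hypothetical name */
    /* The learner interlock is folded in once here, not at every call site. */
    s.tiny_metadata_cache_eff =
        env_flag_is_on("HAKMEM_TINY_METADATA_CACHE") && !learner;
    g_hakmem_env_snapshot = s;
}

/* Hot path: one address load; the caller then branches on a field. */
static inline const HakmemEnvSnapshot *hakmem_env_snapshot_get_fast(void) {
    return &g_hakmem_env_snapshot;
}
```

Because the refresh builds a local copy and stores it in one assignment, the hot path never performs `getenv` or string parsing; it only reads already-computed booleans.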
## Step 2: bench_profile sync (refresh after putenv)
At the end of the `#ifdef USE_HAKMEM` block in `core/bench_profile.h`, add:
- `hakmem_env_snapshot_refresh_from_env();`

(`wrapper_env_refresh_from_env()` and `tiny_static_route_refresh_from_env()` are already called there, so placing it alongside them is fine.)
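
The reason for the refresh can be illustrated with a small stand-alone sketch: a snapshot taken before the profile writes its ENV defaults stays stale until it is refreshed. Everything here is a simplified stand-in (a single flag, and a hypothetical `example_setenv_default` helper that is assumed not to overwrite a user-supplied value); it does not reproduce the real `bench_profile.h`.

```c
#define _POSIX_C_SOURCE 200809L /* for setenv() */
#include <stdbool.h>
#include <stdlib.h>

/* One-flag stand-in for the real snapshot. */
static bool g_snapshot_enabled;

static void snapshot_refresh_from_env(void) {
    const char *v = getenv("HAKMEM_ENV_SNAPSHOT");
    g_snapshot_enabled = (v != NULL && v[0] == '1');
}

/* Hypothetical helper: set a default only if the user did not set the variable. */
static void example_setenv_default(const char *name, const char *value) {
    setenv(name, value, 0); /* overwrite = 0 keeps an existing value */
}

/* Stand-in for a bench profile: write defaults, then re-read the ENV. */
static void example_profile_apply(void) {
    example_setenv_default("HAKMEM_ENV_SNAPSHOT", "1");
    snapshot_refresh_from_env(); /* without this call the snapshot stays stale */
}
```

If `snapshot_refresh_from_env()` ran only once at startup, `g_snapshot_enabled` would keep its pre-profile value even though the environment now says `1`.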
## Step 3: Minimal migration (call-site replacement)
Start by replacing only the call sites that are hit on every operation (3 gates → snapshot):
- `core/front/malloc_tiny_fast.h`
  - Switch `tiny_c7_ultra_enabled_env()` to a snapshot read (C7 ULTRA gate)
  - Switch `tiny_front_v3_enabled()` to a snapshot read (front_snap lookup on the free side)
- `core/box/tiny_legacy_fallback_box.h`
  - Switch `tiny_front_v3_enabled()` to a snapshot read
  - Switch `tiny_metadata_cache_enabled()` to the snapshot's `tiny_metadata_cache_eff`
- `core/box/tiny_metadata_cache_hot_box.h`
  - Switch `tiny_metadata_cache_enabled()` to the snapshot's `tiny_metadata_cache_eff`
  - (Clean this up so the learner interlock is not checked twice)

Note (fail-safe):
- When `HAKMEM_ENV_SNAPSHOT=0`, fall back to the existing functions (behavior must not change)
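
A replaced call site with the fail-safe fallback might look like the sketch below. The `Snap` struct, `g_snap`, and `legacy_tiny_front_v3_enabled` are stand-ins invented for the example so that it compiles on its own; in the real code the legacy path would be the existing `tiny_front_v3_enabled()`.

```c
#include <stdbool.h>

/* Stand-in snapshot with only the fields this call site needs. */
typedef struct {
    bool enabled;       /* HAKMEM_ENV_SNAPSHOT=1 */
    bool tiny_front_v3; /* value captured at refresh time */
} Snap;

static Snap g_snap;        /* all OFF by default */
static int g_legacy_calls; /* counts how often the legacy gate ran */

/* Stand-in for the existing gate function (assumed to return ON here). */
static bool legacy_tiny_front_v3_enabled(void) {
    g_legacy_calls++;
    return true;
}

/* Call-site helper: snapshot when enabled, otherwise the unchanged legacy path. */
static inline bool front_v3_enabled(void) {
    const Snap *s = &g_snap;
    if (s->enabled) {
        return s->tiny_front_v3;           /* one load + one branch */
    }
    return legacy_tiny_front_v3_enabled(); /* HAKMEM_ENV_SNAPSHOT=0: same as before */
}
```

Keeping the legacy call in the `else` path means that turning the snapshot off restores the previous behavior exactly, which is what makes the A/B comparison in Step 5 meaningful.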
## Step 4: Build & health check
```sh
make bench_random_mixed_hakmem -j
scripts/verify_health_profiles.sh
```
## Step 5: A/B (GO/NO-GO)
Mixed 10-run (iter=20M, ws=400):
```sh
# Baseline
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE HAKMEM_ENV_SNAPSHOT=0 \
./bench_random_mixed_hakmem 20000000 400 1
# Optimized
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE HAKMEM_ENV_SNAPSHOT=1 \
./bench_random_mixed_hakmem 20000000 400 1
```
Verdict:
- GO: mean **+2.5% or better**
- ±1%: NEUTRAL (research box)
- -1% or worse: NO-GO (freeze)
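
The verdict rule can be written down as a small helper for post-processing the 10-run results. This is a sketch under assumptions: the function and enum names are invented, and two edge cases are resolved by choice rather than by this document. Exactly -1% is classed as NO-GO, and a gain between +1% and +2.5% is reported as unspecified because the thresholds above do not cover that range.

```c
#include <stddef.h>

typedef enum {
    VERDICT_GO,
    VERDICT_NEUTRAL,
    VERDICT_NO_GO,
    VERDICT_UNSPECIFIED /* +1% < delta < +2.5%: not defined by the thresholds */
} Verdict;

/* Arithmetic mean of n throughput samples (ops/s); returns 0 for n == 0. */
static double mean_of(const double *runs, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += runs[i];
    }
    return n ? sum / (double)n : 0.0;
}

/* Relative change of the optimized mean versus the baseline mean, in percent. */
static double delta_percent(double baseline_mean, double optimized_mean) {
    return (optimized_mean / baseline_mean - 1.0) * 100.0;
}

static Verdict classify_delta(double pct) {
    if (pct >= 2.5)  return VERDICT_GO;      /* mean +2.5% or better */
    if (pct <= -1.0) return VERDICT_NO_GO;   /* -1% or worse: freeze */
    if (pct <= 1.0)  return VERDICT_NEUTRAL; /* within ±1%: research box */
    return VERDICT_UNSPECIFIED;
}
```

Feeding it the baseline and optimized samples from the commands above gives a reproducible verdict instead of an eyeballed one.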
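## Step 6: Confirm with perf that the gates are gone
Re-record perf with E1 enabled (`HAKMEM_ENV_SNAPSHOT=1`) and confirm:
- The three gate functions drop out of the top entries (their self% falls substantially)
- In their place, the snapshot load is consolidated in a single spot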
## Step 7: Promotion (GO only)
- In `core/bench_profile.h`, add `bench_setenv_default("HAKMEM_ENV_SNAPSHOT","1");` to the `MIXED_TINYV3_C7_SAFE` profile
- Append the results and the rollback procedure to `docs/analysis/ENV_PROFILE_PRESETS.md`
- Update `CURRENT_TASK.md` to mark E1 as complete

For NEUTRAL/NO-GO:
- Freeze with default OFF (keep the mainline clean)
## Next (Phase 4 E3-4)
- Design notes: `docs/analysis/PHASE4_E3_ENV_CONSTRUCTOR_INIT_DESIGN.md`
- Next instructions: `docs/analysis/PHASE4_E3_ENV_CONSTRUCTOR_INIT_NEXT_INSTRUCTIONS.md`