Phase 21.7 normalization: optimization pre-work + bench harness expansion

- Add opt-in optimizations (defaults OFF)
  - Ret purity verifier: NYASH_VERIFY_RET_PURITY=1
  - strlen FAST enhancement for const handles
  - FAST_INT gate for same-BB SSA optimization
  - length cache for string literals in llvmlite
- Expand bench harness (tools/perf/microbench.sh)
  - Add branch/call/stringchain/arraymap/chip8/kilo cases
  - Auto-calculate ratio vs C reference
  - Document in benchmarks/README.md
- Compiler health improvements
  - Unify PHI insertion to insert_phi_at_head()
  - Add NYASH_LLVM_SKIP_BUILD=1 for build reuse
- Runtime & safety enhancements
  - Clarify Rust/Hako ownership boundaries
  - Strengthen receiver localization (LocalSSA/pin/after-PHIs)
  - Stop excessive PluginInvoke→BoxCall rewrites
- Update CURRENT_TASK.md, docs, and canaries

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -1,31 +1,33 @@
-# Phase 21.5 — Optimization (ny-llvm crate line)
+# Phase 21.5 — Optimization (AotPrep‑First)

-Scope
-- Optimize hot paths for the crate (ny-llvmc) line using Hakorune scripts only.
-- Preserve default behavior; all risky changes behind dev toggles.
-- Measure EXE runtime (build once, run many) to avoid toolchain overhead noise.
+Purpose
+- Run preprocessing optimizations (structural only) on the .hako side (AotPrep) so that the IR handed to LLVM/AOT is lighter.
+- Default behavior is unchanged (opt-in). The Return purity guard ensures safety.

-Targets (initial)
-- loop: integer accumulations (no I/O)
-- strlen: FAST=1 path (pointer → nyrt_string_length)
-- box: construct/destroy minimal boxes (String/Integer)
+Checklist
+- [x] Split into passes (StrlenFold / LoopHoist / ConstDedup / CollectionsHot / BinopCSE)
+- [x] Introduce CollectionsHot (Array/Map), OFF by default
+- [x] Map key mode `NYASH_AOT_MAP_KEY_MODE={h|i64|hh|auto}`
+- [x] LoopHoist v1 / BinopCSE v1 (minimal)
+- [x] Add `linidx`/`maplin` benchmarks
+- [ ] LoopHoist v2 (hoist chains of `+`/`*` with a const right operand; fix-point)
+- [ ] BinopCSE v2 (stronger common-subexpression elimination for linear `i*n`)
+- [ ] CollectionsHot v2 (reuse a common SSA value for array indices)
+- [ ] Refine Map auto mode (recursive check in `_is_const_or_linear`)
+- [ ] Idempotence (an already-replaced tag keeps re-runs from changing the output)
+- [ ] `arraymap`/`matmul` ≤ 125% (relative to the C baseline)

-Methodology
-- Build once via ny-llvmc; time execution only (`--exe` mode).
-- Runs: 3–5; report median and average (target ≥ 100ms per run).
-- Observe NYASH_VM_STATS=1 (inst/compare/branch) where relevant to correlate structure and runtime.
-
-Commands (examples)
-- tools/perf/phase215/bench_loop.sh --runs 5
-- tools/perf/phase215/bench_strlen.sh --runs 5 --fast 1
-- tools/perf/phase215/run_all.sh --runs 5 --timeout 120
-
-Dev Toggles (keep OFF by default)
-- NYASH_LLVM_FAST=1 (strlen FAST)
-- HAKO_MIR_BUILDER_JSONFRAG_NORMALIZE=1 (normalize)
-- HAKO_MIR_BUILDER_NORMALIZE_TAG=1 (tag, test-only)
-
-Exit Criteria
-- Representative microbenches stable (≤ 5% variance) and ≥ 80% of C baselines.
-- No regression in EXE canaries (loop/print/strlen FAST) and VM parity canaries.
+Toggles
+- `NYASH_MIR_LOOP_HOIST=1` … enables StrlenFold/LoopHoist/ConstDedup/BinopCSE
+- `NYASH_AOT_COLLECTIONS_HOT=1` … CollectionsHot (Array/Map)
+- `NYASH_AOT_MAP_KEY_MODE` … `h|i64|hh|auto` (recommended: `auto`)
+- `NYASH_VERIFY_RET_PURITY=1` … Return purity guard (ON during development)

+Benchmarks (example)
+```bash
+export NYASH_SKIP_TOML_ENV=1 NYASH_DISABLE_PLUGINS=1 \
+  NYASH_LLVM_SKIP_BUILD=1 NYASH_LLVM_FAST=1 NYASH_LLVM_FAST_INT=1 \
+  NYASH_MIR_LOOP_HOIST=1 NYASH_AOT_COLLECTIONS_HOT=1 NYASH_VERIFY_RET_PURITY=1
+for c in arraymap matmul sieve linidx maplin; do \
+  tools/perf/microbench.sh --case $c --exe --runs 3; echo; done
+```
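
The methodology in the diff (3–5 runs, report median and average, keep run-to-run variation small) can be sketched as a small shell helper. This is a minimal illustration, not the code in `tools/perf/microbench.sh`: `summarize_runs` is a hypothetical name, timings are assumed to be in milliseconds, and "spread" (max minus min, as a percentage of the median) is only one possible reading of the "≤ 5% variance" criterion.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): summarize per-run timings.
# Arguments: one timing in milliseconds per run.
summarize_runs() {
  [ "$#" -gt 0 ] || return 1          # no timings given: nothing to summarize
  printf '%s\n' "$@" | sort -n | awk '
    { v[NR] = $1; sum += $1 }
    END {
      # Median: middle value, or the mean of the two middle values.
      if (NR % 2) med = v[(NR + 1) / 2]
      else        med = (v[NR / 2] + v[NR / 2 + 1]) / 2
      avg = sum / NR
      # Spread: (max - min) as a percentage of the median (assumed metric).
      spread = (v[NR] - v[1]) / med * 100
      printf "median=%.1f avg=%.1f spread=%.1f%%\n", med, avg, spread
    }'
}

summarize_runs 104 101 103 102 110
# prints: median=103.0 avg=104.0 spread=8.7%
```

With five runs the single slow outlier (110 ms) moves the average but not the median, which is why reporting both is useful.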
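
The diff states its performance target in two ways: "≥ 80% of C baselines" and "≤ 125% (relative to the C baseline)". If the first is read as relative speed and the second as relative run time, they describe the same bound, since 1 / 1.25 = 0.8. The sketch below computes the run-time ratio and checks it against a limit. The names `ratio_vs_c` and `within_limit` are hypothetical, and how `tools/perf/microbench.sh` actually computes its ratio is not shown in the diff.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repository).

# Run time of the Hakorune EXE as a percentage of the C reference.
# $1 = median EXE time (ms), $2 = median C reference time (ms)
ratio_vs_c() {
  awk -v ny="$1" -v c="$2" 'BEGIN {
    if (c + 0 <= 0) exit 1            # refuse a zero or negative baseline
    printf "%.1f\n", ny / c * 100
  }'
}

# Succeeds when the ratio is at or below the limit (default 125, from the checklist).
# $1 = ratio in percent, $2 = optional limit in percent
within_limit() {
  awk -v r="$1" -v lim="${2:-125}" 'BEGIN { exit (r + 0 <= lim + 0) ? 0 : 1 }'
}

r="$(ratio_vs_c 118 100)"
if within_limit "$r"; then
  echo "ok: ${r}% of C"
else
  echo "over limit: ${r}% of C"
fi
```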