diff --git a/.github/workflows/vm-legacy-build.yml b/.github/workflows/vm-legacy-build.yml new file mode 100644 index 00000000..b1a25e54 --- /dev/null +++ b/.github/workflows/vm-legacy-build.yml @@ -0,0 +1,29 @@ +name: vm-legacy-build + +on: + workflow_dispatch: + push: + paths: + - 'src/**' + - 'Cargo.toml' + - '.github/workflows/vm-legacy-build.yml' + +jobs: + build-vm-legacy: + runs-on: ubuntu-latest + timeout-minutes: 20 + steps: + - name: Checkout + uses: actions/checkout@v4 + + - name: Set up Rust + uses: dtolnay/rust-toolchain@stable + + - name: Cache Rust build + uses: Swatinem/rust-cache@v2 + with: + cache-targets: true + + - name: Build with vm-legacy + interpreter-legacy (compile check only) + run: cargo build --release --features vm-legacy,interpreter-legacy + diff --git a/CURRENT_TASK.md b/CURRENT_TASK.md index 65f0ee19..349cb8cf 100644 --- a/CURRENT_TASK.md +++ b/CURRENT_TASK.md @@ -14,29 +14,51 @@ What Changed (recent) - 新規サポート: Ternary/Peek の Lowering を実装し、`expr.rs` から `ternary.rs`/`peek.rs` へ委譲(MIR13 PHI‑off=Copy合流/PHI‑on=Phi 合流)。 - Self‑host 生成器(Stage‑1 JSON v0)に Peek emit を追加: `apps/selfhost-compiler/boxes/parser_box.nyash`。 - Selfhost/PyVM スモークを通して E2E 確認(peek/ternary)。 -- llvmlite stability for MIR13 - - Resolver: forbids cross‑block non‑dominating vmap reuse; for multi‑pred and no declared PHI, synthesizes a localization PHI at block head. - - Finalize remains function‑local; `block_end_values` snapshots and placeholder wiring still in place. 
+- llvmlite stability for MIR13 (bring-up in progress)
+  - Control-flow split: introduced `instructions/controlflow/{branch,jump,while_.py}`, shrinking the responsibilities of `llvm_builder.py`.
+  - Prepass introduced (enabled via environment variable): `NYASH_LLVM_PREPASS_LOOP=1`
+    - Loop detection (simple while shape) → structured lowering (falls back to the regular while lowering when LoopForm fails)
+  - CFG utilities: `cfg/utils.py` (preds/succs)
+  - Unified value-resolution policy: `utils/values.py` (prefer same-block SSA → resolver)
+  - Per-block vmap: `lower_block` keeps a `vmap_cur` and snapshots it into `block_end_values` at block end, curbing cross-block contamination.
+  - Resolver hardening: end-of-block resolution no longer naively adopts another block's PHI (avoids self-reference and non-dominance).
 - Parity runner pragmatics
   - `tools/pyvm_vs_llvmlite.sh` compares exit code by default; use `CMP_STRICT=1` for stdout+exit.
 - Stage‑2 smokes更新: `tools/selfhost_stage2_smoke.sh` に "Peek basic" を追加。
 
 Current Status
-- Self‑hosting Bridge → PyVM smokes: PASS (Stage‑2 reps: array/string/logic/if/loop).
-- Curated LLVM (PHI‑off default): PASS.
-- Curated LLVM (PHI‑on experimental): `apps/tests/loop_if_phi.nyash` shows a dominance issue (observed; not blocking, MIR13 recommended).
+- Self‑hosting Bridge → PyVM smokes: PASS (Stage-2 representatives: array/string/logic/if/loop/ternary/peek/dot-chain)
+- PyVM core fixes applied: safe handling of compare(None, x), Copy instruction support, max-step cap (NYASH_PYVM_MAX_STEPS)
+- MIR13 (PHI-off): fixed Copy emission into JSON at if/ternary/loop merges (emit_mir_json + builder no-phi merge)
+- Curated LLVM (PHI-off default): ongoing (individual IR-generation gaps not yet addressed)
+- LLVM harness (llvmlite):
+  - `loop_if_phi`: with prepass ON and structured while, EXE exit code 0 (green).
+  - `ternary_nested`: stability improved by the per-block vmap. Remaining: finalize and deduplicate the merge(ret) PHI wiring on the prepass/resolve side.
 
 Next (short plan)
+0) Refactor/Structure (ongoing)
+  - controlflow extraction done (branch/jump/while). Operand pre-resolution for binop/compare/copy consolidated into `utils/values.resolve_i64_strict` (done).
+  - Per-block vmap (done). Shrinking builder responsibilities and delegating to prepass/cfg/util (in progress).
+  - if-merge prepass: structuring/PHI finalization for ret-merge (planned).
 1) Legacy Interpreter/VM offboarding (phase‑A):
-  - Introduce `vm-legacy` feature (default OFF) to gate old VM execution層。
-  - 抽出: JIT が参照する最小型(例: `VMValue`)を薄い共通モジュールへ切替(`vm_types` 等)。
-  - `interpreter-legacy`/`vm-legacy` を既定ビルドから外し、ビルド警告を縮小。
+  - ✅ Introduced `vm-legacy` feature (default OFF) to gate the old VM execution layer.
+  - ✅ Extraction: switched the minimal types the JIT references (e.g. `VMValue`) to a thin shared module (`vm_types`).
+  - ✅ Removed `interpreter-legacy`/`vm-legacy` from the default build; the default is now the PyVM path (`--backend vm` falls back to PyVM).
+  - ✅ Runner: with vm-legacy OFF, `vm`/`interpreter` run in PyVM mode.
+  - ✅ HostAPI: VM-dependent GC barriers are active only with vm-legacy ON.
+  - ✅ Re-greened the PyVM/Bridge Stage-2 smokes (short-circuit/ternary/merge reflected)
 2) Legacy Interpreter/VM offboarding (phase‑B):
   - 物理移動: `src/archive/{interpreter_legacy,vm_legacy}/` へ移設(ドキュメント更新)。
-3) PHI‑on lane(任意): `loop_if_phi` 支配関係を finalize/resolve の順序強化で観察(低優先)。
-4) Runner refactor(小PR):
+3) LLVM/llvmlite work (current priority):
+  - Mirror MIR13 Copy merges equivalently in LLVM IR (pred-localize or PHI synthesis): per-block vmap done, resolver hardened.
+  - Representative cases:
+    - `apps/tests/loop_if_phi.nyash`: green with prepass ON (exit codes match).
+    - `apps/tests/ternary_nested.nyash`: implement structuring/PHI finalization via the if-merge prepass, up to IR verification passing and matching exit codes.
+    - Exit codes of PyVM and the EXE match via `tools/pyvm_vs_llvmlite.sh` (CMP_STRICT=1 as needed).
+4) PHI-on lane (optional): observe the `loop_if_phi` dominance relation while tightening finalize/resolve ordering (low priority).
+5) Runner refactor (small PRs):
 - `selfhost/{child.rs,json.rs}` 分離; `modes/common/{io,resolve,exec}.rs` 分割; `runner/mod.rs`の表面削減。
-5) Optimizer/Verifier thin‑hub cleanup(非機能): orchestrator最小化とパス境界の明確化。
+6) Optimizer/Verifier thin-hub cleanup (non-functional): minimize the orchestrator and clarify pass boundaries.
 
 How to Run
 - PyVM reference smokes: `tools/pyvm_stage2_smoke.sh`
@@ -44,6 +66,27 @@ How to Run
 - LLVM curated (PHI‑off default): `tools/smokes/curated_llvm.sh`
 - LLVM PHI‑on (experimental): `tools/smokes/curated_llvm.sh --phi-on`
 - Parity (AOT vs PyVM): `tools/pyvm_vs_llvmlite.sh ` (`CMP_STRICT=1` to enable stdout check)
+  - Dev aid: combine with `NYASH_LLVM_PREPASS_LOOP=1` (enables the loop/if-merge prepasses).
+
+Operational Notes
+- Environment variables
+  - `NYASH_PYVM_MAX_STEPS`: max PyVM instruction steps (default 200000). Terminates safely on runaway loops.
+  - `NYASH_VM_USE_PY=1`: routes `--backend vm` to the PyVM harness.
+  - `NYASH_PIPE_USE_PYVM=1`: also routes `--ny-parser-pipe` / the JSON v0 bridge to PyVM execution.
+  - `NYASH_CLI_VERBOSE=1`: verbose bridge/emit output.
+- Smoke invocation examples
+  - `timeout -s KILL 20s bash tools/pyvm_stage2_smoke.sh`
+  - `timeout -s KILL 30s bash tools/selfhost_stage2_bridge_smoke.sh`
+
+Backend selection (Phase‑A after vm‑legacy off)
+- Default: `vm-legacy` = OFF, `interpreter-legacy` = OFF
+- `--backend vm` → PyVM execution (requires python3 and `tools/pyvm_runner.py`)
+- `--backend interpreter` → PyVM execution after a legacy warning
+- `--benchmark` → requires vm-legacy (`cargo build --features vm-legacy`)
+
+Enable legacy VM/Interpreter (opt‑in)
+- `cargo build --features vm-legacy,interpreter-legacy`
+- `--backend vm`/`--backend interpreter` then take their legacy paths
 
 Key Flags
 - `NYASH_MIR_NO_PHI` (default 1): PHI‑off when 1 (MIR13). Set `0` for PHI‑on.
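The per-block vmap discipline described in the task notes above (a fresh `vmap_cur` per lowered block, snapshotted into `block_end_values` at block end) can be sketched in Python. `lower_function`, `lower_inst`, and the toy instruction dicts below are illustrative stand-ins, not the harness's actual API:

```python
def lower_inst(inst, vmap_cur, block_end_values):
    # Toy lowering: consts define values, copies forward them.
    op = inst["op"]
    if op == "const":
        vmap_cur[inst["dst"]] = inst["value"]
    elif op == "copy":
        # Prefer same-block SSA; fall back to a predecessor snapshot.
        src = inst["src"]
        if src in vmap_cur:
            vmap_cur[inst["dst"]] = vmap_cur[src]
        else:
            for end in block_end_values.values():
                if src in end:
                    vmap_cur[inst["dst"]] = end[src]
                    break

def lower_function(blocks):
    """Lower each block with its own value map, then snapshot it.

    block_end_values[bid] records the SSA values live at the end of
    block `bid`, so later blocks resolve cross-block uses from frozen
    snapshots instead of reading another block's mutable vmap.
    """
    block_end_values = {}
    for bid, insts in blocks.items():
        # Fresh per-block map: prevents cross-block contamination.
        vmap_cur = {}
        for inst in insts:
            lower_inst(inst, vmap_cur, block_end_values)
        # Snapshot (a copy!) at block end; later mutation must not leak.
        block_end_values[bid] = dict(vmap_cur)
    return block_end_values
```

The copy at the snapshot point is the key design choice: cross-block reads consult only frozen end-of-block state, which is what the notes mean by "curbing cross-block contamination".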
diff --git a/Cargo.toml b/Cargo.toml index 1251ce03..011ed45e 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -11,8 +11,9 @@ categories = ["development-tools::parsing", "interpreters"] # Default features - minimal CLI only [features] -default = ["cli", "plugins", "interpreter-legacy"] +default = ["cli", "plugins"] interpreter-legacy = [] +vm-legacy = [] e2e = [] cli = [] plugins-only = [] diff --git a/README.ja.md b/README.ja.md index 4278864e..1a5aedc2 100644 --- a/README.ja.md +++ b/README.ja.md @@ -108,7 +108,15 @@ local py = new PyRuntimeBox() // Pythonプラグイン ## 🏗️ **複数の実行モード** -重要: 現在、JIT ランタイム実行はデバッグ容易性のため封印しています。実行は「インタープリター/VM」、配布は「Cranelift AOT(EXE)/LLVM AOT(EXE)」の4体制です。 +重要: 現在、JIT ランタイム実行は封印中です。実行は「PyVM(既定)/VM(任意でレガシー有効)」、配布は「Cranelift AOT(EXE)/LLVM AOT(EXE)」の4体制です。 + +Phase‑15(自己ホスト期): VM/インタープリタはフィーチャーで切替 +- 既定ビルド: `--backend vm` は PyVM 実行(python3 + `tools/pyvm_runner.py` が必要) +- レガシー Rust VM/インタープリターを有効化するには: + ```bash + cargo build --release --features vm-legacy,interpreter-legacy + ``` + 以降、`--backend vm`/`--backend interpreter` が従来経路で動作します。 ### 1. **インタープリターモード** (開発用) ```bash @@ -118,13 +126,17 @@ local py = new PyRuntimeBox() // Pythonプラグイン - 完全なデバッグ情報 - 開発に最適 -### 2. **VMモード** (本番用) +### 2. **VMモード(既定は PyVM/レガシーは任意)** ```bash +# 既定: PyVM ハーネス(python3 必要) +./target/release/nyash --backend vm program.nyash + +# レガシー Rust VM を使う場合 +cargo build --release --features vm-legacy ./target/release/nyash --backend vm program.nyash ``` -- インタープリターより13.5倍高速 -- 最適化されたバイトコード実行 -- 本番環境対応のパフォーマンス +- 既定(vm-legacy OFF): MIR(JSON) を出力して `tools/pyvm_runner.py` で実行 +- レガシー VM: インタープリター比で 13.5x(歴史的実測)。比較・検証用途で維持 ### 3. **ネイティブバイナリ(Cranelift AOT)** (配布用) ```bash diff --git a/README.md b/README.md index 16bbe6e5..ccf9b39e 100644 --- a/README.md +++ b/README.md @@ -113,7 +113,15 @@ local py = new PyRuntimeBox() // Python plugin ## 🏗️ **Multiple Execution Modes** -Important: JIT runtime execution is sealed for now. 
Use Interpreter/VM for running, and Cranelift AOT/LLVM AOT for native executables. +Important: JIT runtime execution is sealed for now. Use PyVM/VM for running, and Cranelift AOT/LLVM AOT for native executables. + +Phase‑15 (Self‑Hosting): Legacy VM/Interpreter are feature‑gated +- Default build runs PyVM for `--backend vm` (python3 + `tools/pyvm_runner.py` required) +- To enable legacy Rust VM/Interpreter, build with: + ```bash + cargo build --release --features vm-legacy,interpreter-legacy + ``` + Then `--backend vm`/`--backend interpreter` use the legacy paths. ### 1. **Interpreter Mode** (Development) ```bash @@ -123,13 +131,17 @@ Important: JIT runtime execution is sealed for now. Use Interpreter/VM for runni - Full debug information - Perfect for development -### 2. **VM Mode** (Production) +### 2. **VM Mode (PyVM default / Legacy optional)** ```bash +# Default: PyVM harness (requires python3) +./target/release/nyash --backend vm program.nyash + +# Enable legacy Rust VM if needed +cargo build --release --features vm-legacy ./target/release/nyash --backend vm program.nyash ``` -- 13.5x faster than interpreter -- Optimized bytecode execution -- Production-ready performance +- Default (vm-legacy OFF): PyVM executes MIR(JSON) via `tools/pyvm_runner.py` +- Legacy VM: 13.5x over interpreter (historical); kept for comparison and plugin tests ### 3. 
**Native Binary (Cranelift AOT)** (Distribution) ```bash diff --git a/app_loop b/app_loop new file mode 100644 index 00000000..5d1a5e88 Binary files /dev/null and b/app_loop differ diff --git a/app_loop2 b/app_loop2 new file mode 100644 index 00000000..5d1a5e88 Binary files /dev/null and b/app_loop2 differ diff --git a/app_loop_cf b/app_loop_cf new file mode 100644 index 00000000..5d1a5e88 Binary files /dev/null and b/app_loop_cf differ diff --git a/app_loop_vmap b/app_loop_vmap new file mode 100644 index 00000000..5d1a5e88 Binary files /dev/null and b/app_loop_vmap differ diff --git a/app_pyvm_cmp b/app_pyvm_cmp index a0f21067..19c17192 100644 Binary files a/app_pyvm_cmp and b/app_pyvm_cmp differ diff --git a/app_tern4 b/app_tern4 new file mode 100644 index 00000000..6af84438 Binary files /dev/null and b/app_tern4 differ diff --git a/src/backend/mod.rs b/src/backend/mod.rs index 0ec60ca2..ed3a2913 100644 --- a/src/backend/mod.rs +++ b/src/backend/mod.rs @@ -2,24 +2,48 @@ * Backend module - Different execution backends for MIR */ -pub mod vm; -pub mod vm_boxcall; -pub mod vm_instructions; -pub mod vm_phi; -pub mod vm_stats; +// VM core types are always available pub mod vm_types; + +// Legacy VM execution pipeline (feature-gated) +#[cfg(feature = "vm-legacy")] +pub mod vm; +#[cfg(feature = "vm-legacy")] +pub mod vm_boxcall; +#[cfg(feature = "vm-legacy")] +pub mod vm_instructions; +#[cfg(feature = "vm-legacy")] +pub mod vm_phi; +#[cfg(feature = "vm-legacy")] +pub mod vm_stats; +#[cfg(feature = "vm-legacy")] pub mod vm_values; + +// When vm-legacy is disabled, provide a compatibility shim module so +// crate::backend::vm::VMValue etc. keep resolving to vm_types::*.
+#[cfg(not(feature = "vm-legacy"))] +pub mod vm { + pub use super::vm_types::{VMError, VMValue}; +} // Phase 9.78h: VM split scaffolding (control_flow/dispatch/frame) pub mod abi_util; // Shared ABI/utility helpers +#[cfg(feature = "vm-legacy")] pub mod control_flow; +#[cfg(feature = "vm-legacy")] pub mod dispatch; +#[cfg(feature = "vm-legacy")] pub mod frame; pub mod gc_helpers; pub mod mir_interpreter; +#[cfg(feature = "vm-legacy")] pub mod vm_control_flow; +#[cfg(feature = "vm-legacy")] mod vm_exec; // A3: execution loop extracted +#[cfg(feature = "vm-legacy")] mod vm_gc; // A3: GC roots & diagnostics extracted +#[cfg(feature = "vm-legacy")] mod vm_methods; // A3-S1: method dispatch wrappers extracted +#[cfg(feature = "vm-legacy")] mod vm_state; // A3: state & basic helpers extracted // Lightweight MIR interpreter #[cfg(feature = "wasm-backend")] @@ -38,7 +62,10 @@ pub mod cranelift; pub mod llvm; pub use mir_interpreter::MirInterpreter; -pub use vm::{VMError, VMValue, VM}; +// Always re-export VMError/VMValue from vm_types; VM (executor) only when enabled +pub use vm_types::{VMError, VMValue}; +#[cfg(feature = "vm-legacy")] +pub use vm::VM; #[cfg(feature = "wasm-backend")] pub use aot::{AotBackend, AotConfig, AotError, AotStats}; diff --git a/src/benchmarks.rs b/src/benchmarks.rs index 28341d42..8ea95447 100644 --- a/src/benchmarks.rs +++ b/src/benchmarks.rs @@ -9,6 +9,7 @@ #[cfg(feature = "wasm-backend")] use crate::backend::WasmBackend; +#[cfg(feature = "vm-legacy")] use crate::backend::VM; use crate::interpreter::NyashInterpreter; use crate::mir::MirCompiler; @@ -54,6 +55,7 @@ impl BenchmarkSuite { results.push(interpreter_result); } + #[cfg(feature = "vm-legacy")] if let Ok(vm_result) = self.run_vm_benchmark(name, &source) { results.push(vm_result); } @@ -104,6 +106,7 @@ impl BenchmarkSuite { } /// Run benchmark on VM backend + #[cfg(feature = "vm-legacy")] fn run_vm_benchmark( &self, name: &str, diff --git a/src/box_factory/mod.rs b/src/box_factory/mod.rs 
index 49d55a0b..ec7036f6 100644
--- a/src/box_factory/mod.rs
+++ b/src/box_factory/mod.rs
@@ -230,6 +230,7 @@ impl UnifiedBoxRegistry {
 pub mod builtin;
 pub mod plugin;
 /// Re-export submodules
+#[cfg(feature = "interpreter-legacy")]
 pub mod user_defined;
 
 #[cfg(test)]
diff --git a/src/boxes/mod.rs b/src/boxes/mod.rs
index c6688947..4ea1acff 100644
--- a/src/boxes/mod.rs
+++ b/src/boxes/mod.rs
@@ -155,6 +155,7 @@ pub mod stream;
 // P2P通信Box群 (NEW! - Completely rewritten)
 pub mod intent_box;
+#[cfg(feature = "interpreter-legacy")]
 pub mod p2p_box;
 
 // null関数も再エクスポート
@@ -176,4 +177,5 @@ pub use stream::{NyashStreamBox, StreamBox};
 // P2P通信Boxの再エクスポート
 pub use intent_box::IntentBox;
+#[cfg(feature = "interpreter-legacy")]
 pub use p2p_box::P2PBox;
diff --git a/src/llvm_py/README.md b/src/llvm_py/README.md
index 8de8f390..ea4b116b 100644
--- a/src/llvm_py/README.md
+++ b/src/llvm_py/README.md
@@ -13,12 +13,35 @@ ChatGPTが設計した`docs/design/LLVM_LAYER_OVERVIEW.md`の設計原則に従
 ## 📂 構造
 
 ```
 llvm_py/
-├── README.md # このファイル
-├── mir_reader.py # MIR JSON読み込み
-├── llvm_builder.py # メインのLLVM IR生成
-├── resolver.py # Resolver API(Python版)
-├── types.py # 型変換ユーティリティ
-└── test_simple.py # 基本テスト
+├── README.md          # This file
+├── llvm_builder.py    # Main LLVM IR generation (pass orchestration)
+├── mir_reader.py      # MIR(JSON) loader
+├── resolver.py        # Value resolution (SSA/PHI localization and caching)
+├── utils/
+│   └── values.py      # Shared policies such as same-block-first resolution
+├── cfg/
+│   └── utils.py       # CFG construction (pred/succ)
+├── prepass/
+│   ├── loops.py       # Loop detection (while shape)
+│   └── if_merge.py    # if-merge (ret-merge) preprocessing (pre-declared PHI plan)
+├── instructions/
+│   ├── controlflow/
+│   │   ├── branch.py  # Conditional branch
+│   │   ├── jump.py    # Unconditional jump
+│   │   └── while_.py  # Regular while lowering (fallback when LoopForm fails)
+│   ├── binop.py       # Binary operations
+│   ├── compare.py     # Comparisons (producing i1)
+│   ├── const.py       # Constants
+│   ├── copy.py        # Copy (merge representation for MIR13 PHI-off)
+│   ├── call.py        # Ny function calls
+│   ├── boxcall.py     # Box method calls
+│   ├── externcall.py  # External calls
+│   ├── newbox.py      # Box creation
+│   ├── ret.py         # return lowering (prefers the if-merge pre-declared PHI)
+│   ├── typeop.py      # Type conversions
+│   ├── safepoint.py   # Safepoints
+│   └── barrier.py     # Memory barriers
+└── test_simple.py     # Basic tests
 ```
 
 ## 🚀 使い方
@@ -30,18 +53,40 @@
 python src/llvm_py/llvm_builder.py input.mir.json -o output.o
 NYASH_LLVM_USE_HARNESS=1 ./target/release/nyash program.nyash
 ```
 
+## 🔧 Dev flags (prepass/trace)
+- `NYASH_LLVM_USE_HARNESS=1` … delegate execution from the Rust runner to the llvmlite harness
+- `NYASH_LLVM_PREPASS_LOOP=1` … enable the loop-detection prepass (structures while shapes)
+- `NYASH_LLVM_PREPASS_IFMERGE=1` … enable the if-merge (ret-merge) prepass (pre-declares the ret-value PHI)
+- `NYASH_LLVM_TRACE_PHI=1` … detailed trace of PHI wiring and end-of-block resolution
+- `NYASH_CLI_VERBOSE=1` … verbose lowering/snapshot logs
+- `NYASH_MIR_NO_PHI=1` … select MIR13 (PHI-off) explicitly (default 1)
+- `NYASH_VERIFY_ALLOW_NO_PHI=1` … allow PHI-less MIR in verification (default 1)
+
 ## 📋 設計原則(LLVM_LAYER_OVERVIEWに準拠)
-1. **Resolver-only reads** - 直接vmapアクセス禁止
-2. **Localize at block start** - BB先頭でPHI生成
-3. **Sealed SSA** - snapshot経由の配線
-4. **BuilderCursor相当** - 挿入位置の厳格管理
+1. Resolver-only reads (principle): avoid direct cross-block vmap reads; fetch values through the resolver
+2. Localize at block start: create PHIs at the block head (if-merge pre-declares them in the prepass)
+3. Sealed SSA: finalize_phis wiring based on end-of-block snapshots
+4. Builder cursor discipline: strict insertion-point management (never emit after a terminator)
 
 ## 🎨 実装状況
 - [ ] 基本構造(MIR読み込み)
-- [ ] Core-14命令の実装
-- [ ] Resolver API
-- [ ] LoopForm対応
-- [ ] テストスイート
+- [x] ControlFlow split (branch/jump/while_regular)
+- [x] CFG/prepass split (cfg/utils.py, prepass/loops.py, prepass/if_merge.py)
+- [x] Pre-declared PHI for if-merge (ret-merge) (gate: NYASH_LLVM_PREPASS_IFMERGE=1)
+- [x] Loop prepass (gate: NYASH_LLVM_PREPASS_LOOP=1)
+- [ ] Continued coverage of additional instructions / Stage-3
+
+## ✅ Tests & verification
+- Parity (llvmlite vs PyVM; exit codes only by default)
+  - `./tools/pyvm_vs_llvmlite.sh apps/tests/ternary_nested.nyash`
+  - Representative runs (prepasses enabled):
+    - `NYASH_LLVM_PREPASS_IFMERGE=1 ./tools/pyvm_vs_llvmlite.sh apps/tests/ternary_nested.nyash`
+    - `NYASH_LLVM_PREPASS_LOOP=1 ./tools/pyvm_vs_llvmlite.sh apps/tests/loop_if_phi.nyash`
+- Strict comparison (stdout + exit code)
+  - `CMP_STRICT=1 ./tools/pyvm_vs_llvmlite.sh `
+- Bundled smokes (PHI-off default)
+  - `tools/smokes/curated_llvm.sh`
+  - PHI-on check (experimental): `tools/smokes/curated_llvm.sh --phi-on`
 
 ## 📊 予想行数
 - 全体: 800-1000行
diff --git a/src/llvm_py/build_ctx.py b/src/llvm_py/build_ctx.py
new file mode 100644
index 00000000..13aed54e
--- /dev/null
+++ b/src/llvm_py/build_ctx.py
@@ -0,0 +1,34 @@
+"""
+Build context for instruction lowering and helpers.
+
+This structure aggregates frequently passed references so call sites can
+remain concise as we gradually refactor instruction signatures.
+""" + +from dataclasses import dataclass +from typing import Any, Dict, List, Optional +import llvmlite.ir as ir + +@dataclass +class BuildCtx: + # Core IR handles + module: ir.Module + i64: ir.IntType + i32: ir.IntType + i8: ir.IntType + i1: ir.IntType + i8p: ir.PointerType + + # SSA maps and CFG + vmap: Dict[int, ir.Value] + bb_map: Dict[int, ir.Block] + preds: Dict[int, List[int]] + block_end_values: Dict[int, Dict[int, ir.Value]] + + # Resolver (value queries, casts, string-ish tags) + resolver: Any + + # Optional diagnostics toggles (read from env by the builder) + trace_phi: bool = False + verbose: bool = False + diff --git a/src/llvm_py/cfg/utils.py b/src/llvm_py/cfg/utils.py new file mode 100644 index 00000000..fc757f99 --- /dev/null +++ b/src/llvm_py/cfg/utils.py @@ -0,0 +1,36 @@ +""" +CFG utilities +Build predecessor/successor maps and simple helpers. +""" + +from typing import Dict, List, Any, Tuple + +def build_preds_succs(block_by_id: Dict[int, Dict[str, Any]]) -> Tuple[Dict[int, List[int]], Dict[int, List[int]]]: + """Construct predecessor and successor maps from MIR(JSON) blocks.""" + succs: Dict[int, List[int]] = {} + preds: Dict[int, List[int]] = {} + for b in block_by_id.values(): + bid = int(b.get('id', 0)) + preds.setdefault(bid, []) + for b in block_by_id.values(): + src = int(b.get('id', 0)) + for inst in b.get('instructions', []) or []: + op = inst.get('op') + if op == 'jump': + t = inst.get('target') + if t is not None: + t = int(t) + succs.setdefault(src, []).append(t) + preds.setdefault(t, []).append(src) + elif op == 'branch': + th = inst.get('then'); el = inst.get('else') + if th is not None: + th = int(th) + succs.setdefault(src, []).append(th) + preds.setdefault(th, []).append(src) + if el is not None: + el = int(el) + succs.setdefault(src, []).append(el) + preds.setdefault(el, []).append(src) + return preds, succs + diff --git a/src/llvm_py/instructions/__init__.py b/src/llvm_py/instructions/__init__.py index cdf561d8..37ac69fd 
100644 --- a/src/llvm_py/instructions/__init__.py +++ b/src/llvm_py/instructions/__init__.py @@ -7,8 +7,9 @@ Each instruction has its own file, following Rust structure from .const import lower_const from .binop import lower_binop from .compare import lower_compare -from .jump import lower_jump -from .branch import lower_branch +# controlflow +from .controlflow.jump import lower_jump +from .controlflow.branch import lower_branch from .ret import lower_return from .phi import lower_phi from .call import lower_call @@ -29,4 +30,4 @@ __all__ = [ 'lower_externcall', 'lower_typeop', 'lower_safepoint', 'lower_barrier', 'lower_newbox', 'LoopFormContext', 'lower_while_loopform' -] \ No newline at end of file +] diff --git a/src/llvm_py/instructions/binop.py b/src/llvm_py/instructions/binop.py index 0cc8c865..fdc09b05 100644 --- a/src/llvm_py/instructions/binop.py +++ b/src/llvm_py/instructions/binop.py @@ -5,6 +5,7 @@ Handles +, -, *, /, %, &, |, ^, <<, >> import llvmlite.ir as ir from typing import Dict, Optional, Any +from utils.values import resolve_i64_strict from .compare import lower_compare import llvmlite.ir as ir @@ -38,12 +39,12 @@ def lower_binop( """ # Resolve operands as i64 (using resolver when available) # For now, simple vmap lookup - if resolver is not None and preds is not None and block_end_values is not None: - lhs_val = resolver.resolve_i64(lhs, current_block, preds, block_end_values, vmap, bb_map) - rhs_val = resolver.resolve_i64(rhs, current_block, preds, block_end_values, vmap, bb_map) - else: - lhs_val = vmap.get(lhs, ir.Constant(ir.IntType(64), 0)) - rhs_val = vmap.get(rhs, ir.Constant(ir.IntType(64), 0)) + lhs_val = resolve_i64_strict(resolver, lhs, current_block, preds, block_end_values, vmap, bb_map) + rhs_val = resolve_i64_strict(resolver, rhs, current_block, preds, block_end_values, vmap, bb_map) + if lhs_val is None: + lhs_val = ir.Constant(ir.IntType(64), 0) + if rhs_val is None: + rhs_val = ir.Constant(ir.IntType(64), 0) # 
Relational/equality operators delegate to compare if op in ('==','!=','<','>','<=','>='): @@ -60,6 +61,7 @@ def lower_binop( preds=preds, block_end_values=block_end_values, bb_map=bb_map, + ctx=getattr(resolver, 'ctx', None), ) return diff --git a/src/llvm_py/instructions/boxcall.py b/src/llvm_py/instructions/boxcall.py index f28f5962..c90df67f 100644 --- a/src/llvm_py/instructions/boxcall.py +++ b/src/llvm_py/instructions/boxcall.py @@ -4,7 +4,7 @@ Core of Nyash's "Everything is Box" philosophy """ import llvmlite.ir as ir -from typing import Dict, List, Optional +from typing import Dict, List, Optional, Any def _declare(module: ir.Module, name: str, ret, args): for f in module.functions: @@ -47,7 +47,8 @@ def lower_boxcall( resolver=None, preds=None, block_end_values=None, - bb_map=None + bb_map=None, + ctx: Optional[Any] = None, ) -> None: """ Lower MIR BoxCall instruction @@ -68,10 +69,43 @@ def lower_boxcall( i8 = ir.IntType(8) i8p = i8.as_pointer() + # Short-hands with ctx (backward-compatible fallback) + r = resolver + p = preds + bev = block_end_values + bbm = bb_map + if ctx is not None: + try: + r = getattr(ctx, 'resolver', r) + p = getattr(ctx, 'preds', p) + bev = getattr(ctx, 'block_end_values', bev) + bbm = getattr(ctx, 'bb_map', bbm) + except Exception: + pass + def _res_i64(vid: int): + if r is not None and p is not None and bev is not None and bbm is not None: + try: + return r.resolve_i64(vid, builder.block, p, bev, vmap, bbm) + except Exception: + return None + return vmap.get(vid) + + # If BuildCtx is provided, prefer its maps for consistency. 
+ if ctx is not None: + try: + if getattr(ctx, 'resolver', None) is not None: + resolver = ctx.resolver + if getattr(ctx, 'preds', None) is not None and preds is None: + preds = ctx.preds + if getattr(ctx, 'block_end_values', None) is not None and block_end_values is None: + block_end_values = ctx.block_end_values + if getattr(ctx, 'bb_map', None) is not None and bb_map is None: + bb_map = ctx.bb_map + except Exception: + pass # Receiver value - if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: - recv_val = resolver.resolve_i64(box_vid, builder.block, preds, block_end_values, vmap, bb_map) - else: + recv_val = _res_i64(box_vid) + if recv_val is None: recv_val = vmap.get(box_vid, ir.Constant(i64, 0)) # Minimal method bridging for strings and console @@ -96,11 +130,11 @@ def lower_boxcall( if method_name == "substring": # substring(start, end) # If receiver is a handle (i64), use handle-based helper; else pointer-based API - if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: - s = resolver.resolve_i64(args[0], builder.block, preds, block_end_values, vmap, bb_map) if args else ir.Constant(i64, 0) - e = resolver.resolve_i64(args[1], builder.block, preds, block_end_values, vmap, bb_map) if len(args) > 1 else ir.Constant(i64, 0) - else: + s = _res_i64(args[0]) if args else ir.Constant(i64, 0) + if s is None: s = vmap.get(args[0], ir.Constant(i64, 0)) if args else ir.Constant(i64, 0) + e = _res_i64(args[1]) if len(args) > 1 else ir.Constant(i64, 0) + if e is None: e = vmap.get(args[1], ir.Constant(i64, 0)) if len(args) > 1 else ir.Constant(i64, 0) if hasattr(recv_val, 'type') and isinstance(recv_val.type, ir.IntType): # handle-based @@ -191,9 +225,8 @@ def lower_boxcall( # ArrayBox.get(index) → nyash.array.get_h(handle, idx) # MapBox.get(key) → nyash.map.get_hh(handle, key_any) recv_h = _ensure_handle(builder, module, recv_val) - if resolver is not None and preds is 
not None and block_end_values is not None and bb_map is not None: - k = resolver.resolve_i64(args[0], builder.block, preds, block_end_values, vmap, bb_map) if args else ir.Constant(i64, 0) - else: + k = _res_i64(args[0]) if args else ir.Constant(i64, 0) + if k is None: k = vmap.get(args[0], ir.Constant(i64, 0)) if args else ir.Constant(i64, 0) callee_map = _declare(module, "nyash.map.get_hh", i64, [i64, i64]) res = builder.call(callee_map, [recv_h, k], name="map_get_hh") @@ -204,9 +237,8 @@ def lower_boxcall( if method_name == "push": # ArrayBox.push(val) → nyash.array.push_h(handle, val) recv_h = _ensure_handle(builder, module, recv_val) - if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: - v0 = resolver.resolve_i64(args[0], builder.block, preds, block_end_values, vmap, bb_map) if args else ir.Constant(i64, 0) - else: + v0 = _res_i64(args[0]) if args else ir.Constant(i64, 0) + if v0 is None: v0 = vmap.get(args[0], ir.Constant(i64, 0)) if args else ir.Constant(i64, 0) callee = _declare(module, "nyash.array.push_h", i64, [i64, i64]) res = builder.call(callee, [recv_h, v0], name="arr_push_h") @@ -217,11 +249,11 @@ def lower_boxcall( if method_name == "set": # MapBox.set(key, val) → nyash.map.set_hh(handle, key_any, val_any) recv_h = _ensure_handle(builder, module, recv_val) - if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: - k = resolver.resolve_i64(args[0], builder.block, preds, block_end_values, vmap, bb_map) if len(args) > 0 else ir.Constant(i64, 0) - v = resolver.resolve_i64(args[1], builder.block, preds, block_end_values, vmap, bb_map) if len(args) > 1 else ir.Constant(i64, 0) - else: + k = _res_i64(args[0]) if len(args) > 0 else ir.Constant(i64, 0) + if k is None: k = vmap.get(args[0], ir.Constant(i64, 0)) if len(args) > 0 else ir.Constant(i64, 0) + v = _res_i64(args[1]) if len(args) > 1 else ir.Constant(i64, 0) + if v is None: v = vmap.get(args[1], 
ir.Constant(i64, 0)) if len(args) > 1 else ir.Constant(i64, 0) callee = _declare(module, "nyash.map.set_hh", i64, [i64, i64, i64]) res = builder.call(callee, [recv_h, k, v], name="map_set_hh") @@ -232,9 +264,8 @@ def lower_boxcall( if method_name == "has": # MapBox.has(key) → nyash.map.has_hh(handle, key_any) recv_h = _ensure_handle(builder, module, recv_val) - if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: - k = resolver.resolve_i64(args[0], builder.block, preds, block_end_values, vmap, bb_map) if args else ir.Constant(i64, 0) - else: + k = _res_i64(args[0]) if args else ir.Constant(i64, 0) + if k is None: k = vmap.get(args[0], ir.Constant(i64, 0)) if args else ir.Constant(i64, 0) callee = _declare(module, "nyash.map.has_hh", i64, [i64, i64]) res = builder.call(callee, [recv_h, k], name="map_has_hh") diff --git a/src/llvm_py/instructions/call.py b/src/llvm_py/instructions/call.py index d379fa0b..e43dda25 100644 --- a/src/llvm_py/instructions/call.py +++ b/src/llvm_py/instructions/call.py @@ -4,7 +4,8 @@ Handles regular function calls (not BoxCall or ExternCall) """ import llvmlite.ir as ir -from typing import Dict, List, Optional +from typing import Dict, List, Optional, Any +from trace import debug as trace_debug def lower_call( builder: ir.IRBuilder, @@ -16,7 +17,8 @@ def lower_call( resolver=None, preds=None, block_end_values=None, - bb_map=None + bb_map=None, + ctx: Optional[Any] = None, ) -> None: """ Lower MIR Call instruction @@ -30,6 +32,50 @@ def lower_call( vmap: Value map resolver: Optional resolver for type handling """ + # If BuildCtx is provided, prefer its maps for consistency. 
+ if ctx is not None: + try: + if getattr(ctx, 'resolver', None) is not None: + resolver = ctx.resolver + if getattr(ctx, 'preds', None) is not None and preds is None: + preds = ctx.preds + if getattr(ctx, 'block_end_values', None) is not None and block_end_values is None: + block_end_values = ctx.block_end_values + if getattr(ctx, 'bb_map', None) is not None and bb_map is None: + bb_map = ctx.bb_map + except Exception: + pass + # Short-hands with ctx (backward-compatible fallback) + r = resolver + p = preds + bev = block_end_values + bbm = bb_map + if ctx is not None: + try: + r = getattr(ctx, 'resolver', r) + p = getattr(ctx, 'preds', p) + bev = getattr(ctx, 'block_end_values', bev) + bbm = getattr(ctx, 'bb_map', bbm) + except Exception: + pass + + # Resolver helpers (prefer resolver when available) + def _res_i64(vid: int): + if r is not None and p is not None and bev is not None and bbm is not None: + try: + return r.resolve_i64(vid, builder.block, p, bev, vmap, bbm) + except Exception: + return None + return vmap.get(vid) + + def _res_ptr(vid: int): + if r is not None and p is not None and bev is not None: + try: + return r.resolve_ptr(vid, builder.block, p, bev, vmap) + except Exception: + return None + return vmap.get(vid) + # Resolve function: accepts string name or value-id referencing a string literal actual_name = func_name if not isinstance(func_name, str): @@ -58,11 +104,10 @@ def lower_call( arg_val = None if i < len(func.args): expected_type = func.args[i].type - if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: - if hasattr(expected_type, 'is_pointer') and expected_type.is_pointer: - arg_val = resolver.resolve_ptr(arg_id, builder.block, preds, block_end_values, vmap) - else: - arg_val = resolver.resolve_i64(arg_id, builder.block, preds, block_end_values, vmap, bb_map) + if hasattr(expected_type, 'is_pointer') and expected_type.is_pointer: + arg_val = _res_ptr(arg_id) + else: + arg_val = 
_res_i64(arg_id) if arg_val is None: arg_val = vmap.get(arg_id) if arg_val is None: @@ -88,13 +133,8 @@ def lower_call( # Make the call result = builder.call(func, call_args, name=f"call_{func_name}") # Optional trace for final debugging - try: - import os - if os.environ.get('NYASH_LLVM_TRACE_FINAL') == '1' and isinstance(actual_name, str): - if actual_name in ("Main.node_json/3", "Main.esc_json/1", "main"): - print(f"[TRACE] call {actual_name} args={len(call_args)}", flush=True) - except Exception: - pass + if isinstance(actual_name, str) and actual_name in ("Main.node_json/3", "Main.esc_json/1", "main"): + trace_debug(f"[TRACE] call {actual_name} args={len(call_args)}") # Store result if needed if dst_vid is not None: diff --git a/src/llvm_py/instructions/compare.py b/src/llvm_py/instructions/compare.py index d623f931..7c87ac96 100644 --- a/src/llvm_py/instructions/compare.py +++ b/src/llvm_py/instructions/compare.py @@ -5,7 +5,9 @@ Handles comparison operations (<, >, <=, >=, ==, !=) import llvmlite.ir as ir from typing import Dict, Optional, Any +from utils.values import resolve_i64_strict from .externcall import lower_externcall +from trace import values as trace_values def lower_compare( builder: ir.IRBuilder, @@ -20,6 +22,7 @@ def lower_compare( block_end_values=None, bb_map=None, meta: Optional[Dict[str, Any]] = None, + ctx: Optional[Any] = None, ) -> None: """ Lower MIR Compare instruction @@ -32,15 +35,23 @@ def lower_compare( dst: Destination value ID vmap: Value map """ + # If BuildCtx is provided, prefer its maps for consistency. 
+    if ctx is not None:
+        try:
+            if getattr(ctx, 'resolver', None) is not None:
+                resolver = ctx.resolver
+            if getattr(ctx, 'preds', None) is not None and preds is None:
+                preds = ctx.preds
+            if getattr(ctx, 'block_end_values', None) is not None and block_end_values is None:
+                block_end_values = ctx.block_end_values
+            if getattr(ctx, 'bb_map', None) is not None and bb_map is None:
+                bb_map = ctx.bb_map
+        except Exception:
+            pass
     # Get operands
     # Prefer same-block SSA from vmap; fallback to resolver for cross-block dominance
-    lhs_val = vmap.get(lhs)
-    rhs_val = vmap.get(rhs)
-    if (lhs_val is None or rhs_val is None) and resolver is not None and preds is not None and block_end_values is not None and current_block is not None:
-        if lhs_val is None:
-            lhs_val = resolver.resolve_i64(lhs, current_block, preds, block_end_values, vmap, bb_map)
-        if rhs_val is None:
-            rhs_val = resolver.resolve_i64(rhs, current_block, preds, block_end_values, vmap, bb_map)
+    lhs_val = resolve_i64_strict(resolver, lhs, current_block, preds, block_end_values, vmap, bb_map)
+    rhs_val = resolve_i64_strict(resolver, rhs, current_block, preds, block_end_values, vmap, bb_map)

     i64 = ir.IntType(64)
     i8p = ir.IntType(8).as_pointer()
@@ -63,12 +74,7 @@ def lower_compare(
         except Exception:
             pass
     if force_string or lhs_tag or rhs_tag:
-        try:
-            import os
-            if os.environ.get('NYASH_LLVM_TRACE_VALUES') == '1':
-                print(f"[compare] string-eq path: lhs={lhs} rhs={rhs} force={force_string} tagL={lhs_tag} tagR={rhs_tag}", flush=True)
-        except Exception:
-            pass
+        trace_values(f"[compare] string-eq path: lhs={lhs} rhs={rhs} force={force_string} tagL={lhs_tag} tagR={rhs_tag}")
         # Prefer same-block SSA (vmap) since string handles are produced in-place; fallback to resolver
         lh = lhs_val if lhs_val is not None else (
             resolver.resolve_i64(lhs, current_block, preds, block_end_values, vmap, bb_map)
@@ -78,14 +84,7 @@ def lower_compare(
             resolver.resolve_i64(rhs, current_block, preds, block_end_values, vmap, bb_map)
             if (resolver is not None and preds is not None and block_end_values is not None and current_block is not None) else ir.Constant(i64, 0)
         )
-        try:
-            import os
-            if os.environ.get('NYASH_LLVM_TRACE_VALUES') == '1':
-                lz = isinstance(lh, ir.Constant) and getattr(getattr(lh,'constant',None),'constant',None) == 0
-                rz = isinstance(rh, ir.Constant) and getattr(getattr(rh,'constant',None),'constant',None) == 0
-                print(f"[compare] string-eq args: lh_is_const={isinstance(lh, ir.Constant)} rh_is_const={isinstance(rh, ir.Constant)}", flush=True)
-        except Exception:
-            pass
+        trace_values(f"[compare] string-eq args: lh_is_const={isinstance(lh, ir.Constant)} rh_is_const={isinstance(rh, ir.Constant)}")
         eqf = None
         for f in builder.module.functions:
             if f.name == 'nyash.string.eq_hh':
@@ -117,12 +116,11 @@ def lower_compare(
     # Perform signed comparison using canonical predicates ('<','>','<=','>=','==','!=')
     pred = op if op in ('<','>','<=','>=','==','!=') else '=='
     cmp_result = builder.icmp_signed(pred, lhs_val, rhs_val, name=f"cmp_{dst}")
-
-    # Convert i1 to i64 (0 or 1)
-    result = builder.zext(cmp_result, i64, name=f"cmp_i64_{dst}")
-
-    # Store result
-    vmap[dst] = result
+    # Store the canonical i1 compare result. Consumers that require i64
+    # should explicitly cast at their use site (e.g., via resolver or
+    # instruction-specific lowering) to avoid emitting casts after
+    # terminators when used as branch conditions.
+    vmap[dst] = cmp_result

 def lower_fcmp(
     builder: ir.IRBuilder,
diff --git a/src/llvm_py/instructions/branch.py b/src/llvm_py/instructions/controlflow/branch.py
similarity index 71%
rename from src/llvm_py/instructions/branch.py
rename to src/llvm_py/instructions/controlflow/branch.py
index 973fafad..481a6abb 100644
--- a/src/llvm_py/instructions/branch.py
+++ b/src/llvm_py/instructions/controlflow/branch.py
@@ -5,6 +5,7 @@ Conditional branch based on condition value
 import llvmlite.ir as ir
 from typing import Dict
+from utils.values import resolve_i64_strict

 def lower_branch(
     builder: ir.IRBuilder,
@@ -28,22 +29,22 @@ def lower_branch(
         vmap: Value map
         bb_map: Block map
     """
-    # Get condition value
-    if resolver is not None and preds is not None and block_end_values is not None:
-        cond = resolver.resolve_i64(cond_vid, builder.block, preds, block_end_values, vmap, bb_map)
-    else:
-        cond = vmap.get(cond_vid)
-    if not cond:
+    # Get condition value with preference to same-block SSA
+    cond = resolve_i64_strict(resolver, cond_vid, builder.block, preds, block_end_values, vmap, bb_map)
+    if cond is None:
         # Default to false if missing
         cond = ir.Constant(ir.IntType(1), 0)

     # Convert to i1 if needed
     if hasattr(cond, 'type'):
-        if cond.type == ir.IntType(64):
+        # If we already have an i1 (canonical compare result), use it directly.
+        if isinstance(cond.type, ir.IntType) and cond.type.width == 1:
+            pass
+        elif isinstance(cond.type, ir.IntType) and cond.type.width == 64:
             # i64 to i1: compare != 0
             zero = ir.Constant(ir.IntType(64), 0)
             cond = builder.icmp_unsigned('!=', cond, zero, name="cond_i1")
-        elif cond.type == ir.IntType(8).as_pointer():
+        elif isinstance(cond.type, ir.PointerType):
             # Pointer to i1: compare != null
             null = ir.Constant(cond.type, None)
             cond = builder.icmp_unsigned('!=', cond, null, name="cond_p1")
diff --git a/src/llvm_py/instructions/jump.py b/src/llvm_py/instructions/controlflow/jump.py
similarity index 93%
rename from src/llvm_py/instructions/jump.py
rename to src/llvm_py/instructions/controlflow/jump.py
index 4f33b589..eaff2f0b 100644
--- a/src/llvm_py/instructions/jump.py
+++ b/src/llvm_py/instructions/controlflow/jump.py
@@ -21,4 +21,5 @@ def lower_jump(
     """
     target_bb = bb_map.get(target_bid)
     if target_bb:
-        builder.branch(target_bb)
\ No newline at end of file
+        builder.branch(target_bb)
+
diff --git a/src/llvm_py/instructions/controlflow/while_.py b/src/llvm_py/instructions/controlflow/while_.py
new file mode 100644
index 00000000..b3071382
--- /dev/null
+++ b/src/llvm_py/instructions/controlflow/while_.py
@@ -0,0 +1,80 @@
+"""
+Lowering helpers for while-control flow (regular structured)
+"""
+
+from typing import List, Dict, Any
+import llvmlite.ir as ir
+
+def lower_while_regular(
+    builder: ir.IRBuilder,
+    func: ir.Function,
+    cond_vid: int,
+    body_insts: List[Dict[str, Any]],
+    loop_id: int,
+    vmap: Dict[int, ir.Value],
+    bb_map: Dict[int, ir.Block],
+    resolver,
+    preds,
+    block_end_values,
+):
+    """Create a minimal while in IR: cond -> body -> cond, with exit.
+    The body instructions are lowered using the caller's dispatcher.
+    """
+    i1 = ir.IntType(1)
+    i64 = ir.IntType(64)
+
+    # Create basic blocks: cond -> body -> cond, and exit
+    cond_bb = func.append_basic_block(name=f"while{loop_id}_cond")
+    body_bb = func.append_basic_block(name=f"while{loop_id}_body")
+    exit_bb = func.append_basic_block(name=f"while{loop_id}_exit")
+
+    # Jump from current to cond
+    builder.branch(cond_bb)
+
+    # Cond block
+    cbuild = ir.IRBuilder(cond_bb)
+    try:
+        # Resolve against the condition block to localize dominance
+        cond_val = resolver.resolve_i64(cond_vid, cond_bb, preds, block_end_values, vmap, bb_map)
+    except Exception:
+        cond_val = vmap.get(cond_vid)
+    if cond_val is None:
+        cond_val = ir.Constant(i1, 0)
+    # Normalize to i1
+    if hasattr(cond_val, 'type'):
+        if isinstance(cond_val.type, ir.IntType) and cond_val.type.width == 64:
+            zero64 = ir.Constant(i64, 0)
+            cond_val = cbuild.icmp_unsigned('!=', cond_val, zero64, name="while_cond_i1")
+        elif isinstance(cond_val.type, ir.PointerType):
+            nullp = ir.Constant(cond_val.type, None)
+            cond_val = cbuild.icmp_unsigned('!=', cond_val, nullp, name="while_cond_p1")
+        elif isinstance(cond_val.type, ir.IntType) and cond_val.type.width == 1:
+            # already i1
+            pass
+        else:
+            # Fallback: treat as false
+            cond_val = ir.Constant(i1, 0)
+    else:
+        cond_val = ir.Constant(i1, 0)
+
+    cbuild.cbranch(cond_val, body_bb, exit_bb)
+
+    # Body block
+    bbuild = ir.IRBuilder(body_bb)
+    # The caller must provide a dispatcher to lower body_insts; do a simple inline here.
+    # We expect the caller to have a method lower_instruction(builder, inst, func).
+    lower_instruction = getattr(resolver, '_owner_lower_instruction', None)
+    if lower_instruction is None:
+        raise RuntimeError('resolver._owner_lower_instruction not set (needs NyashLLVMBuilder.lower_instruction)')
+    for sub in body_insts:
+        if bbuild.block.terminator is not None:
+            cont = func.append_basic_block(name=f"cont_bb_{bbuild.block.name}")
+            bbuild.position_at_end(cont)
+        lower_instruction(bbuild, sub, func)
+    # Ensure terminator: if not terminated, branch back to cond
+    if bbuild.block.terminator is None:
+        bbuild.branch(cond_bb)
+
+    # Continue at exit
+    builder.position_at_end(exit_bb)
+
diff --git a/src/llvm_py/instructions/copy.py b/src/llvm_py/instructions/copy.py
new file mode 100644
index 00000000..82375825
--- /dev/null
+++ b/src/llvm_py/instructions/copy.py
@@ -0,0 +1,46 @@
+"""
+Copy instruction lowering
+MIR13 PHI-off uses explicit copies along edges/blocks to model merges.
+"""
+
+import llvmlite.ir as ir
+from typing import Dict, Optional, Any
+from utils.values import resolve_i64_strict
+
+def lower_copy(
+    builder: ir.IRBuilder,
+    dst: int,
+    src: int,
+    vmap: Dict[int, ir.Value],
+    resolver=None,
+    current_block=None,
+    preds=None,
+    block_end_values=None,
+    bb_map=None,
+    ctx: Optional[Any] = None,
+):
+    """Lower a copy by mapping dst to src value in the current block scope.
+
+    Prefer same-block SSA from vmap; fallback to resolver to preserve
+    dominance and to localize values across predecessors.
+    """
+    # If BuildCtx is provided, prefer its maps for consistency.
+    if ctx is not None:
+        try:
+            if getattr(ctx, 'resolver', None) is not None:
+                resolver = ctx.resolver
+            if getattr(ctx, 'vmap', None) is not None and vmap is None:
+                vmap = ctx.vmap
+            if getattr(ctx, 'preds', None) is not None and preds is None:
+                preds = ctx.preds
+            if getattr(ctx, 'block_end_values', None) is not None and block_end_values is None:
+                block_end_values = ctx.block_end_values
+            if getattr(ctx, 'bb_map', None) is not None and bb_map is None:
+                bb_map = ctx.bb_map
+        except Exception:
+            pass
+    # Prefer local SSA; resolve otherwise to preserve dominance
+    val = resolve_i64_strict(resolver, src, current_block, preds, block_end_values, vmap, bb_map)
+    if val is None:
+        val = ir.Constant(ir.IntType(64), 0)
+    vmap[dst] = val
diff --git a/src/llvm_py/instructions/externcall.py b/src/llvm_py/instructions/externcall.py
index c95c8ca4..7efc0208 100644
--- a/src/llvm_py/instructions/externcall.py
+++ b/src/llvm_py/instructions/externcall.py
@@ -4,7 +4,7 @@ Minimal mapping for NyRT-exported symbols (console/log family, etc.)
 """
 import llvmlite.ir as ir
-from typing import Dict, List, Optional
+from typing import Dict, List, Optional, Any

 def lower_externcall(
     builder: ir.IRBuilder,
@@ -16,7 +16,8 @@ def lower_externcall(
     resolver=None,
     preds=None,
     block_end_values=None,
-    bb_map=None
+    bb_map=None,
+    ctx: Optional[Any] = None,
 ) -> None:
     """
     Lower MIR ExternCall instruction
@@ -30,6 +31,19 @@ def lower_externcall(
         vmap: Value map
         resolver: Optional resolver for type handling
     """
+    # If BuildCtx is provided, prefer its maps for consistency.
+    if ctx is not None:
+        try:
+            if getattr(ctx, 'resolver', None) is not None:
+                resolver = ctx.resolver
+            if getattr(ctx, 'preds', None) is not None and preds is None:
+                preds = ctx.preds
+            if getattr(ctx, 'block_end_values', None) is not None and block_end_values is None:
+                block_end_values = ctx.block_end_values
+            if getattr(ctx, 'bb_map', None) is not None and bb_map is None:
+                bb_map = ctx.bb_map
+        except Exception:
+            pass
     # Accept full symbol names (e.g., "nyash.console.log", "nyash.string.len_h").
     llvm_name = func_name
@@ -83,13 +97,17 @@ def lower_externcall(
     call_args: List[ir.Value] = []
     for i, arg_id in enumerate(args):
         orig_arg_id = arg_id
-        # Prefer resolver
+        # Prefer resolver/ctx
+        aval = None
         if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None:
-            if len(func.args) > i and isinstance(func.args[i].type, ir.PointerType):
-                aval = resolver.resolve_ptr(arg_id, builder.block, preds, block_end_values, vmap)
-            else:
-                aval = resolver.resolve_i64(arg_id, builder.block, preds, block_end_values, vmap, bb_map)
-        else:
+            try:
+                if len(func.args) > i and isinstance(func.args[i].type, ir.PointerType):
+                    aval = resolver.resolve_ptr(arg_id, builder.block, preds, block_end_values, vmap)
+                else:
+                    aval = resolver.resolve_i64(arg_id, builder.block, preds, block_end_values, vmap, bb_map)
+            except Exception:
+                aval = None
+        if aval is None:
             aval = vmap.get(arg_id)
         if aval is None:
             # Default guess
diff --git a/src/llvm_py/instructions/loopform.py b/src/llvm_py/instructions/loopform.py
index 85ad8d12..55526cdd 100644
--- a/src/llvm_py/instructions/loopform.py
+++ b/src/llvm_py/instructions/loopform.py
@@ -123,7 +123,10 @@ def lower_while_loopform(
     lf.tag_phi = tag_phi
     lf.payload_phi = payload_phi

-    if os.environ.get('NYASH_CLI_VERBOSE') == '1':
-        print(f"[LoopForm] Created loop structure (id={loop_id})")
+    try:
+        from trace import debug as trace_debug
+        trace_debug(f"[LoopForm] Created loop structure (id={loop_id})")
+    except Exception:
+        pass

     return True
diff --git a/src/llvm_py/instructions/newbox.py b/src/llvm_py/instructions/newbox.py
index c86f0ef8..0a045b5f 100644
--- a/src/llvm_py/instructions/newbox.py
+++ b/src/llvm_py/instructions/newbox.py
@@ -4,7 +4,7 @@ Handles box creation (new StringBox(), new IntegerBox(), etc.)
 """
 import llvmlite.ir as ir
-from typing import Dict, List, Optional
+from typing import Dict, List, Optional, Any

 def lower_newbox(
     builder: ir.IRBuilder,
@@ -13,7 +13,8 @@ def lower_newbox(
     args: List[int],
     dst_vid: int,
     vmap: Dict[int, ir.Value],
-    resolver=None
+    resolver=None,
+    ctx: Optional[Any] = None
 ) -> None:
     """
     Lower MIR NewBox instruction
diff --git a/src/llvm_py/instructions/phi.py b/src/llvm_py/instructions/phi.py
index 830ff9b3..144c2896 100644
--- a/src/llvm_py/instructions/phi.py
+++ b/src/llvm_py/instructions/phi.py
@@ -134,12 +134,15 @@ def lower_phi(
     import os
     if used_default_zero and os.environ.get('NYASH_LLVM_PHI_STRICT') == '1':
         raise RuntimeError(f"[LLVM_PY] PHI dst={dst_vid} used synthesized zero; check preds/incoming")
-    if os.environ.get('NYASH_LLVM_TRACE_PHI') == '1':
+    try:
+        from trace import phi as trace_phi
         try:
             blkname = str(current_block.name)
         except Exception:
             blkname = ''
-        print(f"[PHI] {blkname} v{dst_vid} incoming={len(incoming_pairs)} zero={1 if used_default_zero else 0}")
+        trace_phi(f"[PHI] {blkname} v{dst_vid} incoming={len(incoming_pairs)} zero={1 if used_default_zero else 0}")
+    except Exception:
+        pass

     # Propagate string-ness: if any incoming value-id is tagged string-ish, mark dst as string-ish.
     try:
         if resolver is not None and hasattr(resolver, 'is_stringish') and hasattr(resolver, 'mark_string'):
diff --git a/src/llvm_py/instructions/ret.py b/src/llvm_py/instructions/ret.py
index 842b29fc..8eccbab2 100644
--- a/src/llvm_py/instructions/ret.py
+++ b/src/llvm_py/instructions/ret.py
@@ -4,7 +4,7 @@ Handles void and value returns
 """
 import llvmlite.ir as ir
-from typing import Dict, Optional
+from typing import Dict, Optional, Any

 def lower_return(
     builder: ir.IRBuilder,
@@ -14,7 +14,8 @@ def lower_return(
     resolver=None,
     preds=None,
     block_end_values=None,
-    bb_map=None
+    bb_map=None,
+    ctx: Optional[Any] = None,
 ) -> None:
     """
     Lower MIR Return instruction
@@ -25,6 +26,19 @@ def lower_return(
         vmap: Value map
         return_type: Expected return type
     """
+    # Prefer BuildCtx maps if provided
+    if ctx is not None:
+        try:
+            if getattr(ctx, 'resolver', None) is not None:
+                resolver = ctx.resolver
+            if getattr(ctx, 'preds', None) is not None and preds is None:
+                preds = ctx.preds
+            if getattr(ctx, 'block_end_values', None) is not None and block_end_values is None:
+                block_end_values = ctx.block_end_values
+            if getattr(ctx, 'bb_map', None) is not None and bb_map is None:
+                bb_map = ctx.bb_map
+        except Exception:
+            pass
     if value_id is None:
         # Void return
         builder.ret_void()
@@ -33,6 +47,53 @@ def lower_return(
     ret_val = None
     if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None:
         try:
+            # If this block has a declared PHI for the return value, force using the
+            # local PHI placeholder to ensure dominance and let finalize_phis wire it.
+            try:
+                block_name = builder.block.name
+                cur_bid = int(str(block_name).replace('bb',''))
+            except Exception:
+                cur_bid = -1
+            try:
+                bm = getattr(resolver, 'block_phi_incomings', {}) or {}
+            except Exception:
+                bm = {}
+            if isinstance(value_id, int) and isinstance(bm.get(cur_bid), dict) and value_id in bm.get(cur_bid):
+                # Reuse predeclared ret-phi when available
+                cur = None
+                try:
+                    rp = getattr(resolver, 'ret_phi_map', {}) or {}
+                    key = (int(cur_bid), int(value_id))
+                    if key in rp:
+                        cur = rp[key]
+                except Exception:
+                    cur = None
+                if cur is None:
+                    btop = ir.IRBuilder(builder.block)
+                    try:
+                        btop.position_at_start(builder.block)
+                    except Exception:
+                        pass
+                    # Reuse existing local phi if present; otherwise create
+                    cur = vmap.get(value_id)
+                    need_new = True
+                    try:
+                        need_new = not (cur is not None and hasattr(cur, 'add_incoming') and getattr(getattr(cur, 'basic_block', None), 'name', None) == builder.block.name)
+                    except Exception:
+                        need_new = True
+                    if need_new:
+                        cur = btop.phi(ir.IntType(64), name=f"phi_ret_{value_id}")
+                # Bind to maps
+                vmap[value_id] = cur
+                try:
+                    if hasattr(resolver, 'global_vmap') and isinstance(resolver.global_vmap, dict):
+                        resolver.global_vmap[value_id] = cur
+                except Exception:
+                    pass
+                ret_val = cur
+                if ret_val is not None:
+                    builder.ret(ret_val)
+                    return
             if isinstance(return_type, ir.PointerType):
                 ret_val = resolver.resolve_ptr(value_id, builder.block, preds, block_end_values, vmap)
             else:
diff --git a/src/llvm_py/instructions/safepoint.py b/src/llvm_py/instructions/safepoint.py
index f4fba8e9..f12542d1 100644
--- a/src/llvm_py/instructions/safepoint.py
+++ b/src/llvm_py/instructions/safepoint.py
@@ -4,7 +4,7 @@ GC safepoints where runtime can safely collect garbage
 """
 import llvmlite.ir as ir
-from typing import Dict, List, Optional
+from typing import Dict, List, Optional, Any

 def lower_safepoint(
     builder: ir.IRBuilder,
@@ -15,7 +15,8 @@ def lower_safepoint(
     resolver=None,
     preds=None,
     block_end_values=None,
-    bb_map=None
+    bb_map=None,
+    ctx: Optional[Any] = None
 ) -> None:
     """
     Lower MIR Safepoint instruction
@@ -53,8 +54,18 @@ def lower_safepoint(
     # Store each live value
     for i, vid in enumerate(live_values):
-        if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None:
-            val = resolver.resolve_i64(vid, builder.block, preds, block_end_values, vmap, bb_map)
+        # Prefer BuildCtx if provided
+        r = resolver; p = preds; bev = block_end_values; bbm = bb_map
+        if ctx is not None:
+            try:
+                r = getattr(ctx, 'resolver', r)
+                p = getattr(ctx, 'preds', p)
+                bev = getattr(ctx, 'block_end_values', bev)
+                bbm = getattr(ctx, 'bb_map', bbm)
+            except Exception:
+                pass
+        if r is not None and p is not None and bev is not None and bbm is not None:
+            val = r.resolve_i64(vid, builder.block, p, bev, vmap, bbm)
         else:
             val = vmap.get(vid, ir.Constant(i64, 0))
diff --git a/src/llvm_py/instructions/typeop.py b/src/llvm_py/instructions/typeop.py
index df79333c..986c6d34 100644
--- a/src/llvm_py/instructions/typeop.py
+++ b/src/llvm_py/instructions/typeop.py
@@ -4,7 +4,7 @@ Handles type conversions and type checks
 """
 import llvmlite.ir as ir
-from typing import Dict, Optional
+from typing import Dict, Optional, Any

 def lower_typeop(
     builder: ir.IRBuilder,
@@ -16,7 +16,8 @@ def lower_typeop(
     resolver=None,
     preds=None,
     block_end_values=None,
-    bb_map=None
+    bb_map=None,
+    ctx: Optional[Any] = None,
 ) -> None:
     """
     Lower MIR TypeOp instruction
@@ -35,6 +36,19 @@ def lower_typeop(
         vmap: Value map
         resolver: Optional resolver for type handling
     """
+    # Prefer BuildCtx maps when provided
+    if ctx is not None:
+        try:
+            if getattr(ctx, 'resolver', None) is not None:
+                resolver = ctx.resolver
+            if getattr(ctx, 'preds', None) is not None and preds is None:
+                preds = ctx.preds
+            if getattr(ctx, 'block_end_values', None) is not None and block_end_values is None:
+                block_end_values = ctx.block_end_values
+            if getattr(ctx, 'bb_map', None) is not None and bb_map is None:
+                bb_map = ctx.bb_map
+        except Exception:
+            pass
     if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None:
         src_val = resolver.resolve_i64(src_vid, builder.block, preds, block_end_values, vmap, bb_map)
     else:
@@ -83,7 +97,8 @@ def lower_convert(
     resolver=None,
     preds=None,
     block_end_values=None,
-    bb_map=None
+    bb_map=None,
+    ctx: Optional[Any] = None,
 ) -> None:
     """
     Lower type conversion between primitive types
@@ -96,6 +111,18 @@ def lower_convert(
         to_type: Target type
         vmap: Value map
     """
+    if ctx is not None:
+        try:
+            if getattr(ctx, 'resolver', None) is not None:
+                resolver = ctx.resolver
+            if getattr(ctx, 'preds', None) is not None and preds is None:
+                preds = ctx.preds
+            if getattr(ctx, 'block_end_values', None) is not None and block_end_values is None:
+                block_end_values = ctx.block_end_values
+            if getattr(ctx, 'bb_map', None) is not None and bb_map is None:
+                bb_map = ctx.bb_map
+        except Exception:
+            pass
     if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None:
         # Choose resolution based on from_type
         if from_type == "ptr":
diff --git a/src/llvm_py/llvm_builder.py b/src/llvm_py/llvm_builder.py
index 5782f1b2..3ae5cc71 100644
--- a/src/llvm_py/llvm_builder.py
+++ b/src/llvm_py/llvm_builder.py
@@ -15,9 +15,10 @@ import llvmlite.binding as llvm
 from instructions.const import lower_const
 from instructions.binop import lower_binop
 from instructions.compare import lower_compare
-from instructions.jump import lower_jump
-from instructions.branch import lower_branch
+from instructions.controlflow.jump import lower_jump
+from instructions.controlflow.branch import lower_branch
 from instructions.ret import lower_return
+from instructions.copy import lower_copy
 # PHI are deferred; finalize_phis wires incoming edges after snapshots
 from instructions.call import lower_call
 from instructions.boxcall import lower_boxcall
@@ -27,6 +28,13 @@ from instructions.newbox import lower_newbox
 from instructions.safepoint import lower_safepoint, insert_automatic_safepoint
 from instructions.barrier import lower_barrier
 from instructions.loopform import lower_while_loopform
+from instructions.controlflow.while_ import lower_while_regular
+from phi_wiring import setup_phi_placeholders as _setup_phi_placeholders, finalize_phis as _finalize_phis
+from trace import debug as trace_debug
+from trace import phi as trace_phi
+from prepass.loops import detect_simple_while
+from prepass.if_merge import plan_ret_phi_predeclare
+from build_ctx import BuildCtx
 from resolver import Resolver
 from mir_reader import MIRReader
@@ -71,6 +79,8 @@ class NyashLLVMBuilder:
         # Heuristics for minor gated fixes
         self.current_function_name: Optional[str] = None
         self._last_substring_vid: Optional[int] = None
+        # Map of (block_id, value_id) -> predeclared PHI for ret-merge if-merge prepass
+        self.predeclared_ret_phis: Dict[Tuple[int, int], ir.Instruction] = {}

     def build_from_mir(self, mir_json: Dict[str, Any]) -> str:
         """Build LLVM IR from MIR JSON"""
@@ -177,11 +187,12 @@ class NyashLLVMBuilder:
                     os.makedirs(os.path.dirname(dump_path), exist_ok=True)
                     with open(dump_path, 'w') as f:
                         f.write(ir_text)
-            elif os.environ.get('NYASH_CLI_VERBOSE') == '1':
+            else:
                 # Default dump location when verbose and not explicitly set
-                os.makedirs('tmp', exist_ok=True)
-                with open('tmp/nyash_harness.ll', 'w') as f:
-                    f.write(ir_text)
+                if os.environ.get('NYASH_CLI_VERBOSE') == '1':
+                    os.makedirs('tmp', exist_ok=True)
+                    with open('tmp/nyash_harness.ll', 'w') as f:
+                        f.write(ir_text)
         except Exception:
             pass
         return ir_text
@@ -322,13 +333,176 @@ class NyashLLVMBuilder:
         # Prepass: collect producer stringish hints and PHI metadata for all blocks
         # and create placeholders at each block head so that resolver can safely
        # return existing PHIs without creating new ones.
-        self.setup_phi_placeholders(blocks)
-
+        _setup_phi_placeholders(self, blocks)
+
+        # Optional: if-merge prepass → predeclare PHI for return-merge blocks
+        # Gate with NYASH_LLVM_PREPASS_IFMERGE=1
+        try:
+            if os.environ.get('NYASH_LLVM_PREPASS_IFMERGE') == '1':
+                plan = plan_ret_phi_predeclare(block_by_id)
+                if plan:
+                    # Ensure block_phi_incomings map exists
+                    if not hasattr(self, 'block_phi_incomings') or self.block_phi_incomings is None:
+                        self.block_phi_incomings = {}
+                    for bbid, ret_vid in plan.items():
+                        # Create a placeholder PHI at block head if missing
+                        bb0 = self.bb_map.get(bbid)
+                        if bb0 is not None:
+                            b0 = ir.IRBuilder(bb0)
+                            try:
+                                b0.position_at_start(bb0)
+                            except Exception:
+                                pass
+                            cur = self.vmap.get(ret_vid)
+                            need_new = True
+                            try:
+                                need_new = not (cur is not None and hasattr(cur, 'add_incoming'))
+                            except Exception:
+                                need_new = True
+                            if need_new:
+                                ph = b0.phi(self.i64, name=f"phi_ret_{ret_vid}")
+                                self.vmap[ret_vid] = ph
+                            else:
+                                ph = cur
+                            # Record for later unify
+                            try:
+                                self.predeclared_ret_phis[(int(bbid), int(ret_vid))] = ph
+                            except Exception:
+                                pass
+                            # Record declared incoming metadata using the same value-id
+                            # for each predecessor; finalize_phis will resolve per-pred end values.
+                            try:
+                                preds_raw = [p for p in self.preds.get(bbid, []) if p != bbid]
+                            except Exception:
+                                preds_raw = []
+                            # Dedup while preserving order
+                            seen = set()
+                            preds_list = []
+                            for p in preds_raw:
+                                if p not in seen:
+                                    preds_list.append(p)
+                                    seen.add(p)
+                            try:
+                                # finalize_phis reads pairs as (decl_b, v_src) and maps to nearest predecessor.
+                                # We provide (bb_pred, ret_vid) for all preds.
+                                self.block_phi_incomings.setdefault(int(bbid), {})[int(ret_vid)] = [
+                                    (int(p), int(ret_vid)) for p in preds_list
+                                ]
+                            except Exception:
+                                pass
+                            try:
+                                trace_debug(f"[prepass] if-merge: predeclare PHI at bb{bbid} for v{ret_vid} preds={preds_list}")
+                            except Exception:
+                                pass
+        except Exception:
+            pass
+
+        # Optional: simple loop prepass → synthesize a structured while body
+        loop_plan = None
+        try:
+            if os.environ.get('NYASH_LLVM_PREPASS_LOOP') == '1':
+                loop_plan = detect_simple_while(block_by_id)
+                if loop_plan is not None:
+                    trace_debug(f"[prepass] detect loop header=bb{loop_plan['header']} then=bb{loop_plan['then']} latch=bb{loop_plan['latch']} exit=bb{loop_plan['exit']}")
+        except Exception:
+            loop_plan = None
+
+        # Provide predeclared ret-phi map to resolver for ret lowering to reuse
+        try:
+            self.resolver.ret_phi_map = self.predeclared_ret_phis
+        except Exception:
+            pass
+        # Now lower blocks
+        skipped: set[int] = set()
+        if loop_plan is not None:
+            try:
+                for bskip in loop_plan.get('skip_blocks', []):
+                    if bskip != loop_plan.get('header'):
+                        skipped.add(int(bskip))
+            except Exception:
+                pass
         for bid in order:
             block_data = block_by_id.get(bid)
             if block_data is None:
                 continue
+            # If loop prepass applies, lower while once at header and skip loop-internal blocks
+            if loop_plan is not None and bid == loop_plan.get('header'):
+                bb = self.bb_map[bid]
+                builder = ir.IRBuilder(bb)
+                try:
+                    self.resolver.builder = builder
+                    self.resolver.module = self.module
+                except Exception:
+                    pass
+                # Lower while via loopform (if enabled) or regular fallback
+                self.loop_count += 1
+                body_insts = loop_plan.get('body_insts', [])
+                cond_vid = loop_plan.get('cond')
+                from instructions.loopform import lower_while_loopform
+                ok = False
+                try:
+                    # Use a clean per-while vmap context seeded from global placeholders
+                    self._current_vmap = dict(self.vmap)
+                    ok = lower_while_loopform(builder, func, cond_vid, body_insts,
+                                              self.loop_count, self.vmap, self.bb_map,
+                                              self.resolver, self.preds, self.block_end_values)
+                except Exception:
+                    ok = False
+                if not ok:
+                    # Prepare resolver backref for instruction dispatcher
+                    try:
+                        self.resolver._owner_lower_instruction = self.lower_instruction
+                    except Exception:
+                        pass
+                    lower_while_regular(builder, func, cond_vid, body_insts,
+                                        self.loop_count, self.vmap, self.bb_map,
+                                        self.resolver, self.preds, self.block_end_values)
+                # Clear while vmap context
+                try:
+                    delattr(self, '_current_vmap')
+                except Exception:
+                    pass
+                # Mark blocks to skip
+                for bskip in loop_plan.get('skip_blocks', []):
+                    skipped.add(bskip)
+                # Ensure skipped original blocks have a valid terminator: branch to while exit
+                try:
+                    exit_name = f"while{self.loop_count}_exit"
+                    exit_bb = None
+                    for bbf in func.blocks:
+                        try:
+                            if str(bbf.name) == exit_name:
+                                exit_bb = bbf
+                                break
+                        except Exception:
+                            pass
+                    if exit_bb is not None:
+                        # Connect while exit to original exit block if available
+                        try:
+                            orig_exit_bb = self.bb_map.get(loop_plan.get('exit'))
+                            if orig_exit_bb is not None and exit_bb.terminator is None:
+                                ibx = ir.IRBuilder(exit_bb)
+                                ibx.branch(orig_exit_bb)
+                        except Exception:
+                            pass
+                        for bskip in loop_plan.get('skip_blocks', []):
+                            if bskip == loop_plan.get('header'):
+                                continue
+                            bb_skip = self.bb_map.get(bskip)
+                            if bb_skip is None:
+                                continue
+                            try:
+                                if bb_skip.terminator is None:
+                                    ib = ir.IRBuilder(bb_skip)
+                                    ib.branch(exit_bb)
+                            except Exception:
+                                pass
+                except Exception:
+                    pass
+                continue
+            if bid in skipped:
+                continue
             bb = self.bb_map[bid]
             self.lower_block(bb, block_data, func)
@@ -337,10 +511,32 @@ class NyashLLVMBuilder:
             self.resolver.def_blocks = self.def_blocks
             # Provide phi metadata for this function to resolver
             self.resolver.block_phi_incomings = getattr(self, 'block_phi_incomings', {})
+            # Attach a BuildCtx object for future refactors (non-breaking)
+            try:
+                self.ctx = BuildCtx(
+                    module=self.module,
+                    i64=self.i64,
+                    i32=self.i32,
+                    i8=self.i8,
+                    i1=self.i1,
+                    i8p=self.i8p,
+                    vmap=self.vmap,
+                    bb_map=self.bb_map,
+                    preds=self.preds,
+                    block_end_values=self.block_end_values,
+                    resolver=self.resolver,
+                    trace_phi=os.environ.get('NYASH_LLVM_TRACE_PHI') == '1',
+                    verbose=os.environ.get('NYASH_CLI_VERBOSE') == '1',
+                )
+                # Also expose via resolver for convenience until migration completes
+                self.resolver.ctx = self.ctx
+            except Exception:
+                pass
         except Exception:
             pass
         # Finalize PHIs for this function now that all snapshots for it exist
-        self.finalize_phis()
+        _finalize_phis(self)
+
     def setup_phi_placeholders(self, blocks: List[Dict[str, Any]]):
         """Predeclare PHIs and collect incoming metadata for finalize_phis.
@@ -439,8 +635,17 @@ class NyashLLVMBuilder:
         pass

     def lower_block(self, bb: ir.Block, block_data: Dict[str, Any], func: ir.Function):
-        """Lower a single basic block"""
+        """Lower a single basic block.
+
+        Emit all non-terminator ops first, then control-flow terminators
+        (branch/jump/ret). This avoids generating IR after a terminator.
+        """
         builder = ir.IRBuilder(bb)
+        try:
+            import os
+            trace_debug(f"[llvm-py] === lower_block bb{block_data.get('id')} ===")
+        except Exception:
+            pass
         # Provide builder/module to resolver for PHI/casts insertion
         try:
             self.resolver.builder = builder
@@ -448,54 +653,187 @@ class NyashLLVMBuilder:
         except Exception:
             pass
         instructions = block_data.get("instructions", [])
-        # Lower non-PHI instructions strictly in original program order.
-        # Reordering here can easily introduce use-before-def within the same
-        # basic block (e.g., string ops that depend on prior me.* calls).
+        # Ensure JSON-declared PHIs are materialized at block start before any terminator
+        try:
+            phi_insts = [inst for inst in (instructions or []) if inst.get('op') == 'phi']
+            if phi_insts:
+                btop = ir.IRBuilder(bb)
+                btop.position_at_start(bb)
+                for pinst in phi_insts:
+                    dstp = pinst.get('dst')
+                    if isinstance(dstp, int):
+                        cur = self.vmap.get(dstp)
+                        need_new = True
+                        try:
+                            need_new = not (cur is not None and hasattr(cur, 'add_incoming'))
+                        except Exception:
+                            need_new = True
+                        if need_new:
+                            phi = btop.phi(self.i64, name=f"phi_{dstp}")
+                            self.vmap[dstp] = phi
+        except Exception:
+            pass
+        # Partition into body ops and terminators
+        body_ops: List[Dict[str, Any]] = []
+        term_ops: List[Dict[str, Any]] = []
+        for inst in (instructions or []):
+            opx = inst.get("op")
+            if opx in ("branch", "jump", "ret"):
+                term_ops.append(inst)
+            elif opx == "phi":
+                continue
+            else:
+                body_ops.append(inst)
+        # Per-block SSA map (avoid cross-block vmap pollution)
+        # Seed with non-PHI globals and PHIs that belong to this block only.
+ vmap_cur: Dict[int, ir.Value] = {} + try: + for _vid, _val in (self.vmap or {}).items(): + keep = True + try: + if hasattr(_val, 'add_incoming'): + bb_of = getattr(getattr(_val, 'basic_block', None), 'name', None) + keep = (bb_of == bb.name) + except Exception: + keep = False + if keep: + vmap_cur[_vid] = _val + except Exception: + vmap_cur = dict(self.vmap) + # Expose to lower_instruction users (e.g., while_ regular lowering) + self._current_vmap = vmap_cur created_ids: List[int] = [] - non_phi_insts = [inst for inst in instructions if inst.get("op") != "phi"] - for inst in non_phi_insts: - # Stop if a terminator has already been emitted for this block + # Compute ids defined in this block to help with copy/PHI decisions + defined_here_all: set = set() + for _inst in body_ops: + try: + d = _inst.get('dst') + if isinstance(d, int): + defined_here_all.add(d) + except Exception: + pass + # Keep PHI synthesis on-demand in resolver; avoid predeclaring here to reduce clashes. + # Lower body ops first in-order + for i_idx, inst in enumerate(body_ops): + try: + import os + trace_debug(f"[llvm-py] body op: {inst.get('op')} dst={inst.get('dst')} cond={inst.get('cond')}") + except Exception: + pass try: if bb.terminator is not None: break except Exception: pass builder.position_at_end(bb) - self.lower_instruction(builder, inst, func) + # Special-case copy: avoid forward self-block dependencies only when src is defined later in this block + if inst.get('op') == 'copy': + src_i = inst.get('src') + skip_now = False + if isinstance(src_i, int): + try: + # Check if src will be defined in a subsequent instruction + for _rest in body_ops[i_idx+1:]: + try: + if int(_rest.get('dst')) == int(src_i): + skip_now = True + break + except Exception: + pass + except Exception: + pass + if skip_now: + # Skip now; a later copy will remap after src becomes available + pass + else: + self.lower_instruction(builder, inst, func) + else: + self.lower_instruction(builder, inst, func) + # Sync 
per-block vmap snapshot with any new definitions that were + # written into the global vmap by lowering routines (e.g., copy) try: dst = inst.get("dst") - if isinstance(dst, int) and dst not in created_ids and dst in self.vmap: - created_ids.append(dst) + if isinstance(dst, int): + if dst in self.vmap: + _gval = self.vmap[dst] + # Avoid syncing PHIs that belong to other blocks (placeholders) + try: + if hasattr(_gval, 'add_incoming'): + bb_of = getattr(getattr(_gval, 'basic_block', None), 'name', None) + if bb_of == bb.name: + vmap_cur[dst] = _gval + else: + vmap_cur[dst] = _gval + except Exception: + vmap_cur[dst] = _gval + if dst not in created_ids and dst in vmap_cur: + created_ids.append(dst) except Exception: pass + # Ret-phi proactive insertion removed; resolver handles ret localization as needed. + + # Lower terminators at end, preserving order + for inst in term_ops: + try: + import os + trace_debug(f"[llvm-py] term op: {inst.get('op')} dst={inst.get('dst')} cond={inst.get('cond')}") + except Exception: + pass + try: + if bb.terminator is not None: + break + except Exception: + pass + builder.position_at_end(bb) + # (if-merge handled by resolver + finalize_phis) + self.lower_instruction(builder, inst, func) + # Sync back local PHIs created in this block into the global vmap so that + # finalize_phis targets the same SSA nodes as terminators just used. + try: + for vid in created_ids: + val = vmap_cur.get(vid) + if val is not None and hasattr(val, 'add_incoming'): + try: + if getattr(getattr(val, 'basic_block', None), 'name', None) == bb.name: + self.vmap[vid] = val + except Exception: + self.vmap[vid] = val + except Exception: + pass # Snapshot end-of-block values for sealed PHI wiring bid = block_data.get("id", 0) # Robust snapshot: clone the entire vmap at block end so that # values that were not redefined in this block (but remain live) # are available to PHI finalize wiring. This avoids omissions of # phi-dst/cyclic and carry-over values. 
- snap: Dict[int, ir.Value] = dict(self.vmap) + snap: Dict[int, ir.Value] = dict(vmap_cur) try: import os - if os.environ.get('NYASH_LLVM_TRACE_PHI') == '1': - keys = sorted(list(snap.keys())) - print(f"[builder] snapshot bb{bid} keys={keys[:20]}...", flush=True) + keys = sorted(list(snap.keys())) + trace_phi(f"[builder] snapshot bb{bid} keys={keys[:20]}...") except Exception: pass # Record block-local definitions for lifetime hinting for vid in created_ids: - if vid in self.vmap: + if vid in vmap_cur: self.def_blocks.setdefault(vid, set()).add(block_data.get("id", 0)) self.block_end_values[bid] = snap + # Clear current vmap context + try: + delattr(self, '_current_vmap') + except Exception: + pass def lower_instruction(self, builder: ir.IRBuilder, inst: Dict[str, Any], func: ir.Function): """Dispatch instruction to appropriate handler""" op = inst.get("op") + # Pick current vmap context + vmap_ctx = getattr(self, '_current_vmap', self.vmap) if op == "const": dst = inst.get("dst") value = inst.get("value") - lower_const(builder, self.module, dst, value, self.vmap, self.resolver) + lower_const(builder, self.module, dst, value, vmap_ctx, self.resolver) elif op == "binop": operation = inst.get("operation") @@ -504,23 +842,28 @@ class NyashLLVMBuilder: dst = inst.get("dst") dst_type = inst.get("dst_type") lower_binop(builder, self.resolver, operation, lhs, rhs, dst, - self.vmap, builder.block, self.preds, self.block_end_values, self.bb_map, + vmap_ctx, builder.block, self.preds, self.block_end_values, self.bb_map, dst_type=dst_type) elif op == "jump": target = inst.get("target") lower_jump(builder, target, self.bb_map) + elif op == "copy": + dst = inst.get("dst") + src = inst.get("src") + lower_copy(builder, dst, src, vmap_ctx, self.resolver, builder.block, self.preds, self.block_end_values, self.bb_map, getattr(self, 'ctx', None)) + elif op == "branch": cond = inst.get("cond") then_bid = inst.get("then") else_bid = inst.get("else") - lower_branch(builder, cond, 
then_bid, else_bid, self.vmap, self.bb_map, self.resolver, self.preds, self.block_end_values)
+            lower_branch(builder, cond, then_bid, else_bid, vmap_ctx, self.bb_map, self.resolver, self.preds, self.block_end_values)

         elif op == "ret":
             value = inst.get("value")
-            lower_return(builder, value, self.vmap, func.function_type.return_type,
-                         self.resolver, self.preds, self.block_end_values, self.bb_map)
+            lower_return(builder, value, vmap_ctx, func.function_type.return_type,
+                         self.resolver, self.preds, self.block_end_values, self.bb_map, getattr(self, 'ctx', None))

         elif op == "phi":
             # No-op here: PHIs are metadata only; the resolver materializes them on demand
@@ -533,15 +876,16 @@ class NyashLLVMBuilder:
             rhs = inst.get("rhs")
             dst = inst.get("dst")
             cmp_kind = inst.get("cmp_kind")
-            lower_compare(builder, operation, lhs, rhs, dst, self.vmap,
+            lower_compare(builder, operation, lhs, rhs, dst, vmap_ctx,
                          self.resolver, builder.block, self.preds, self.block_end_values, self.bb_map,
-                         meta={"cmp_kind": cmp_kind} if cmp_kind else None)
+                         meta={"cmp_kind": cmp_kind} if cmp_kind else None,
+                         ctx=getattr(self, 'ctx', None))

         elif op == "call":
             func_name = inst.get("func")
             args = inst.get("args", [])
             dst = inst.get("dst")
-            lower_call(builder, self.module, func_name, args, dst, self.vmap, self.resolver, self.preds, self.block_end_values, self.bb_map)
+            lower_call(builder, self.module, func_name, args, dst, vmap_ctx, self.resolver, self.preds, self.block_end_values, self.bb_map, getattr(self, 'ctx', None))

         elif op == "boxcall":
             box_vid = inst.get("box")
@@ -549,7 +893,7 @@ class NyashLLVMBuilder:
             args = inst.get("args", [])
             dst = inst.get("dst")
             lower_boxcall(builder, self.module, box_vid, method, args, dst,
-                          self.vmap, self.resolver, self.preds, self.block_end_values, self.bb_map)
+                          vmap_ctx, self.resolver, self.preds, self.block_end_values, self.bb_map, getattr(self, 'ctx', None))
             # Optional: honor explicit dst_type for tagging (string handle)
             try:
                 dst_type = inst.get("dst_type")
@@ -571,14 +915,14 @@ class
NyashLLVMBuilder: args = inst.get("args", []) dst = inst.get("dst") lower_externcall(builder, self.module, func_name, args, dst, - self.vmap, self.resolver, self.preds, self.block_end_values, self.bb_map) + vmap_ctx, self.resolver, self.preds, self.block_end_values, self.bb_map, getattr(self, 'ctx', None)) elif op == "newbox": box_type = inst.get("type") args = inst.get("args", []) dst = inst.get("dst") lower_newbox(builder, self.module, box_type, args, dst, - self.vmap, self.resolver) + vmap_ctx, self.resolver, getattr(self, 'ctx', None)) elif op == "typeop": operation = inst.get("operation") @@ -586,13 +930,14 @@ class NyashLLVMBuilder: dst = inst.get("dst") target_type = inst.get("target_type") lower_typeop(builder, operation, src, dst, target_type, - self.vmap, self.resolver, self.preds, self.block_end_values, self.bb_map) + vmap_ctx, self.resolver, self.preds, self.block_end_values, self.bb_map, getattr(self, 'ctx', None)) elif op == "safepoint": live = inst.get("live", []) - lower_safepoint(builder, self.module, live, self.vmap, + lower_safepoint(builder, self.module, live, vmap_ctx, resolver=self.resolver, preds=self.preds, - block_end_values=self.block_end_values, bb_map=self.bb_map) + block_end_values=self.block_end_values, bb_map=self.bb_map, + ctx=getattr(self, 'ctx', None)) elif op == "barrier": barrier_type = inst.get("type", "memory") @@ -606,11 +951,16 @@ class NyashLLVMBuilder: if not lower_while_loopform(builder, func, cond, body, self.loop_count, self.vmap, self.bb_map, self.resolver, self.preds, self.block_end_values): - # Fallback to regular while - self._lower_while_regular(builder, inst, func) + # Fallback to regular while (structured) + try: + self.resolver._owner_lower_instruction = self.lower_instruction + except Exception: + pass + lower_while_regular(builder, func, cond, body, + self.loop_count, self.vmap, self.bb_map, + self.resolver, self.preds, self.block_end_values) else: - if os.environ.get('NYASH_CLI_VERBOSE') == '1': - 
print(f"[Python LLVM] Unknown instruction: {op}") + trace_debug(f"[Python LLVM] Unknown instruction: {op}") # Record per-inst definition for lifetime hinting as soon as available try: dst_maybe = inst.get("dst") @@ -642,7 +992,8 @@ class NyashLLVMBuilder: # Cond block cbuild = ir.IRBuilder(cond_bb) try: - cond_val = self.resolver.resolve_i64(cond_vid, builder.block, self.preds, self.block_end_values, self.vmap, self.bb_map) + # Resolve against the condition block to localize dominance + cond_val = self.resolver.resolve_i64(cond_vid, cbuild.block, self.preds, self.block_end_values, self.vmap, self.bb_map) except Exception: cond_val = self.vmap.get(cond_vid) if cond_val is None: @@ -697,6 +1048,7 @@ class NyashLLVMBuilder: for fr in from_list: succs.setdefault(fr, []).append(to_bid) for block_id, dst_map in (getattr(self, 'block_phi_incomings', {}) or {}).items(): + trace_phi(f"[finalize] bb{block_id} dsts={list(dst_map.keys())}") bb = self.bb_map.get(block_id) if bb is None: continue @@ -706,15 +1058,34 @@ class NyashLLVMBuilder: except Exception: pass for dst_vid, incoming in (dst_map or {}).items(): + trace_phi(f"[finalize] dst v{dst_vid} incoming={incoming}") # Ensure placeholder exists at block head - phi = self.vmap.get(dst_vid) - try: - is_phi = hasattr(phi, 'add_incoming') - except Exception: - is_phi = False - if not is_phi: - phi = b.phi(self.i64, name=f"phi_{dst_vid}") + # Prefer predeclared ret-phi when available and force using it. + predecl = getattr(self, 'predeclared_ret_phis', {}) if hasattr(self, 'predeclared_ret_phis') else {} + phi = predecl.get((int(block_id), int(dst_vid))) if predecl else None + if phi is not None: + # Bind as canonical target self.vmap[dst_vid] = phi + else: + phi = self.vmap.get(dst_vid) + # Ensure we target a PHI belonging to the current block; if a + # global mapping points to a PHI in another block (due to + # earlier localization), create/replace with a local PHI. 
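The block-ownership test applied in this finalize path (only reuse a PHI that actually lives in the block being wired; otherwise create a local one) can be exercised in isolation. A sketch with duck-typed stand-ins for llvmlite's PHI objects; `needs_local_phi` and the `SimpleNamespace` mocks are illustrative, not the real `llvmlite.ir` API:

```python
from types import SimpleNamespace

def needs_local_phi(phi, block_name: str) -> bool:
    """Mirror of the finalize check: a usable target must be a PHI
    (i.e. expose add_incoming) and must belong to the block being wired."""
    if phi is None or not hasattr(phi, "add_incoming"):
        return True
    owner = getattr(getattr(phi, "basic_block", None), "name", None)
    return owner != block_name

foreign = SimpleNamespace(add_incoming=lambda v, b: None,
                          basic_block=SimpleNamespace(name="bb1"))
local = SimpleNamespace(add_incoming=lambda v, b: None,
                        basic_block=SimpleNamespace(name="bb2"))

assert needs_local_phi(None, "bb2")       # no placeholder at all
assert needs_local_phi(foreign, "bb2")    # PHI owned by another block
assert not needs_local_phi(local, "bb2")  # correctly-placed PHI is reused
```

When `needs_local_phi` is true, the patch emits a fresh `phi_{dst}` at the block head and rebinds `vmap[dst]` to it, which is what keeps the wired PHI dominance-safe.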
+ need_local_phi = False + try: + if not (phi is not None and hasattr(phi, 'add_incoming')): + need_local_phi = True + else: + bb_of_phi = getattr(getattr(phi, 'basic_block', None), 'name', None) + if bb_of_phi != bb.name: + need_local_phi = True + except Exception: + need_local_phi = True + if need_local_phi: + phi = b.phi(self.i64, name=f"phi_{dst_vid}") + self.vmap[dst_vid] = phi + n = getattr(phi, 'name', b'').decode() if hasattr(getattr(phi, 'name', None), 'decode') else str(getattr(phi, 'name', '')) + trace_phi(f"[finalize] target phi={n}") # Wire incoming per CFG predecessor; map src_vid when provided preds_raw = [p for p in self.preds.get(block_id, []) if p != block_id] # Deduplicate while preserving order @@ -820,6 +1191,10 @@ class NyashLLVMBuilder: if pred_bb is None: continue phi.add_incoming(val, pred_bb) + try: + trace_phi(f"[finalize] add incoming: bb{pred_bid} -> v{dst_vid}") + except Exception: + pass # Tag dst as string-ish if any declared source was string-ish (post-lowering info) try: if hasattr(self.resolver, 'is_stringish') and hasattr(self.resolver, 'mark_string'): diff --git a/src/llvm_py/phi_wiring.py b/src/llvm_py/phi_wiring.py new file mode 100644 index 00000000..2413a96e --- /dev/null +++ b/src/llvm_py/phi_wiring.py @@ -0,0 +1,200 @@ +""" +PHI wiring helpers + +- setup_phi_placeholders: Predeclare PHIs and collect incoming metadata +- finalize_phis: Wire PHI incomings using end-of-block snapshots and resolver + +These operate on the NyashLLVMBuilder instance to keep changes minimal. +""" + +from typing import Dict, List, Any +import llvmlite.ir as ir + +def setup_phi_placeholders(builder, blocks: List[Dict[str, Any]]): + """Predeclare PHIs and collect incoming metadata for finalize_phis. + + This pass is function-local and must be invoked after basic blocks are + created and before lowering individual blocks. It also tags string-ish + values eagerly to help downstream resolvers choose correct intrinsics. 
+ """ + try: + # Pass A: collect producer stringish hints per value-id + produced_str: Dict[int, bool] = {} + for block_data in blocks: + for inst in block_data.get("instructions", []) or []: + try: + opx = inst.get("op") + dstx = inst.get("dst") + if dstx is None: + continue + is_str = False + if opx == "const": + v = inst.get("value", {}) or {} + t = v.get("type") + if t == "string" or (isinstance(t, dict) and t.get("kind") in ("handle","ptr") and t.get("box_type") == "StringBox"): + is_str = True + elif opx in ("binop","boxcall","externcall"): + t = inst.get("dst_type") + if isinstance(t, dict) and t.get("kind") == "handle" and t.get("box_type") == "StringBox": + is_str = True + if is_str: + produced_str[int(dstx)] = True + except Exception: + pass + # Pass B: materialize PHI placeholders and record incoming metadata + builder.block_phi_incomings = {} + for block_data in blocks: + bid0 = block_data.get("id", 0) + bb0 = builder.bb_map.get(bid0) + for inst in block_data.get("instructions", []) or []: + if inst.get("op") == "phi": + try: + dst0 = int(inst.get("dst")) + incoming0 = inst.get("incoming", []) or [] + except Exception: + dst0 = None; incoming0 = [] + if dst0 is None: + continue + # Record incoming metadata for finalize_phis + try: + builder.block_phi_incomings.setdefault(bid0, {})[dst0] = [ + (int(b), int(v)) for (v, b) in incoming0 + ] + except Exception: + pass + # Ensure placeholder exists at block head + if bb0 is not None: + b0 = ir.IRBuilder(bb0) + try: + b0.position_at_start(bb0) + except Exception: + pass + existing = builder.vmap.get(dst0) + is_phi = False + try: + is_phi = hasattr(existing, 'add_incoming') + except Exception: + is_phi = False + if not is_phi: + ph0 = b0.phi(builder.i64, name=f"phi_{dst0}") + builder.vmap[dst0] = ph0 + # Tag propagation: if explicit dst_type marks string or any incoming was produced as string-ish, tag dst + try: + dst_type0 = inst.get("dst_type") + mark_str = isinstance(dst_type0, dict) and 
dst_type0.get("kind") == "handle" and dst_type0.get("box_type") == "StringBox" + if not mark_str: + for (_b_decl_i, v_src_i) in incoming0: + try: + if produced_str.get(int(v_src_i)): + mark_str = True; break + except Exception: + pass + if mark_str and hasattr(builder.resolver, 'mark_string'): + builder.resolver.mark_string(int(dst0)) + except Exception: + pass + except Exception: + pass + +def finalize_phis(builder): + """Finalize PHIs declared in JSON by wiring incoming edges at block heads. + Uses resolver._value_at_end_i64 to materialize values at predecessor ends, + ensuring casts/boxing are inserted in predecessor blocks (dominance-safe).""" + # Build succ map for nearest-predecessor mapping + succs: Dict[int, List[int]] = {} + for to_bid, from_list in (builder.preds or {}).items(): + for fr in from_list: + succs.setdefault(fr, []).append(to_bid) + for block_id, dst_map in (getattr(builder, 'block_phi_incomings', {}) or {}).items(): + bb = builder.bb_map.get(block_id) + if bb is None: + continue + b = ir.IRBuilder(bb) + try: + b.position_at_start(bb) + except Exception: + pass + for dst_vid, incoming in (dst_map or {}).items(): + # Ensure placeholder exists at block head + # Prefer predeclared ret-phi when available and force using it. 
+ predecl = getattr(builder, 'predeclared_ret_phis', {}) if hasattr(builder, 'predeclared_ret_phis') else {} + phi = predecl.get((int(block_id), int(dst_vid))) if predecl else None + if phi is not None: + builder.vmap[dst_vid] = phi + else: + phi = builder.vmap.get(dst_vid) + need_local_phi = False + try: + if not (phi is not None and hasattr(phi, 'add_incoming')): + need_local_phi = True + else: + bb_of_phi = getattr(getattr(phi, 'basic_block', None), 'name', None) + if bb_of_phi != bb.name: + need_local_phi = True + except Exception: + need_local_phi = True + if need_local_phi: + phi = b.phi(builder.i64, name=f"phi_{dst_vid}") + builder.vmap[dst_vid] = phi + # Wire incoming per CFG predecessor; map src_vid when provided + preds_raw = [p for p in builder.preds.get(block_id, []) if p != block_id] + # Deduplicate while preserving order + seen = set() + preds_list: List[int] = [] + for p in preds_raw: + if p not in seen: + preds_list.append(p) + seen.add(p) + # Helper: find the nearest immediate predecessor on a path decl_b -> ... 
-> block_id + def nearest_pred_on_path(decl_b: int): + from collections import deque + q = deque([decl_b]) + visited = set([decl_b]) + parent: Dict[int, Any] = {decl_b: None} + while q: + cur = q.popleft() + if cur == block_id: + par = parent.get(block_id) + return par if par in preds_list else None + for nx in succs.get(cur, []): + if nx not in visited: + visited.add(nx) + parent[nx] = cur + q.append(nx) + return None + # Precompute a non-self initial source (if present) to use for self-carry cases + init_src_vid = None + for (b_decl0, v_src0) in incoming: + try: + vs0 = int(v_src0) + except Exception: + continue + if vs0 != int(dst_vid): + init_src_vid = vs0 + break + # Pre-resolve declared incomings to nearest immediate predecessors + chosen: Dict[int, ir.Value] = {} + for (b_decl, v_src) in incoming: + try: + bd = int(b_decl); vs = int(v_src) + except Exception: + continue + pred_match = nearest_pred_on_path(bd) + if pred_match is None: + continue + if vs == int(dst_vid) and init_src_vid is not None: + vs = int(init_src_vid) + try: + val = builder.resolver._value_at_end_i64(vs, pred_match, builder.preds, builder.block_end_values, builder.vmap, builder.bb_map) + except Exception: + val = None + if val is None: + # As a last resort, zero + val = ir.Constant(builder.i64, 0) + chosen[pred_match] = val + # Finally add incomings + for pred_bid, val in chosen.items(): + pred_bb = builder.bb_map.get(pred_bid) + if pred_bb is None: + continue + phi.add_incoming(val, pred_bb) diff --git a/src/llvm_py/prepass/if_merge.py b/src/llvm_py/prepass/if_merge.py new file mode 100644 index 00000000..932c3e78 --- /dev/null +++ b/src/llvm_py/prepass/if_merge.py @@ -0,0 +1,35 @@ +""" +If-merge prepass utilities +For blocks that end with return and have multiple predecessors, plan PHI predeclare for return value ids. 
+""" + +from typing import Dict, Any, Optional +from cfg.utils import build_preds_succs + +def plan_ret_phi_predeclare(block_by_id: Dict[int, Dict[str, Any]]) -> Optional[Dict[int, int]]: + """Return a map {block_id: value_id} for blocks that end with ret + and have multiple predecessors. The caller can predeclare a PHI for value_id + at the block head to ensure dominance for the return. + """ + preds, _ = build_preds_succs(block_by_id) + plan: Dict[int, int] = {} + for bid, blk in block_by_id.items(): + term = None + if blk.get('instructions'): + last = blk.get('instructions')[-1] + if last.get('op') in ('jump','branch','ret'): + term = last + if term is None and 'terminator' in blk: + t = blk['terminator'] + if t and t.get('op') in ('jump','branch','ret'): + term = t + if not term or term.get('op') != 'ret': + continue + val = term.get('value') + if not isinstance(val, int): + continue + pred_list = [p for p in preds.get(int(bid), []) if p != int(bid)] + if len(pred_list) > 1: + plan[int(bid)] = int(val) + return plan or None + diff --git a/src/llvm_py/prepass/loops.py b/src/llvm_py/prepass/loops.py new file mode 100644 index 00000000..4ec62306 --- /dev/null +++ b/src/llvm_py/prepass/loops.py @@ -0,0 +1,106 @@ +""" +Loop prepass utilities +Detect simple while-shaped loops in MIR(JSON) and return a lowering plan. +""" + +from typing import Dict, List, Any, Optional +from cfg.utils import build_preds_succs + +def detect_simple_while(block_by_id: Dict[int, Dict[str, Any]]) -> Optional[Dict[str, Any]]: + """Detect a simple loop pattern: header(branch cond → then/else), + a latch that jumps back to header reachable from then, and exit on else. + Returns a plan dict or None. 
+ """ + # Build succ and pred maps from JSON quickly + preds, succs = build_preds_succs(block_by_id) + # Find a header with a branch terminator and else leading to a ret (direct) + for b in block_by_id.values(): + bid = int(b.get('id', 0)) + term = None + if b.get('instructions'): + last = b.get('instructions')[-1] + if last.get('op') in ('jump','branch','ret'): + term = last + if term is None and 'terminator' in b: + t = b['terminator'] + if t and t.get('op') in ('jump','branch','ret'): + term = t + if not term or term.get('op') != 'branch': + continue + then_bid = int(term.get('then')) + else_bid = int(term.get('else')) + cond_vid = int(term.get('cond')) if term.get('cond') is not None else None + if cond_vid is None: + continue + # Quick check: else block ends with ret + else_blk = block_by_id.get(else_bid) + has_ret = False + if else_blk is not None: + insts = else_blk.get('instructions', []) + if insts and insts[-1].get('op') == 'ret': + has_ret = True + elif else_blk.get('terminator', {}).get('op') == 'ret': + has_ret = True + if not has_ret: + continue + # Find a latch that jumps back to header reachable from then + latch = None + visited = set() + stack = [then_bid] + while stack: + cur = stack.pop() + if cur in visited: + continue + visited.add(cur) + cur_blk = block_by_id.get(cur) + if cur_blk is None: + continue + for inst in cur_blk.get('instructions', []) or []: + if inst.get('op') == 'jump' and int(inst.get('target')) == bid: + latch = cur + break + if latch is not None: + break + for nx in succs.get(cur, []) or []: + if nx not in visited and nx != else_bid: + stack.append(nx) + if latch is None: + continue + # Compose body_insts: collect insts along then-branch region up to latch (inclusive), + # excluding any final jump back to header to prevent double edges. 
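The latch search above is a plain reachability walk over the successor map: explore everything reachable from the `then` block without crossing the exit, and stop at the first block that jumps back to the header. Stripped of the JSON plumbing, the idea looks roughly like this (`find_latch` and the toy CFG are hypothetical; the patch scans each block's instruction list for a `jump` whose target is the header):

```python
from typing import Dict, List, Optional

def find_latch(succs: Dict[int, List[int]],
               jumps_to: Dict[int, int],
               header: int, then_bid: int, exit_bid: int) -> Optional[int]:
    """Walk blocks reachable from `then_bid` (never crossing the exit)
    and return the first block whose jump target is the header."""
    visited, stack = set(), [then_bid]
    while stack:
        cur = stack.pop()
        if cur in visited:
            continue
        visited.add(cur)
        if jumps_to.get(cur) == header:
            return cur  # found the back edge: cur is the latch
        for nx in succs.get(cur, []):
            if nx not in visited and nx != exit_bid:
                stack.append(nx)
    return None

# header 0 branches to body 1 (then) or exit 2 (else); body 1 jumps back.
succs = {0: [1, 2], 1: [0]}
jumps_to = {1: 0}
assert find_latch(succs, jumps_to, header=0, then_bid=1, exit_bid=2) == 1
```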
+ collect_order: List[int] = [] + visited2 = set() + stack2 = [then_bid] + while stack2: + cur = stack2.pop() + if cur in visited2 or cur == bid or cur == else_bid: + continue + visited2.add(cur) + collect_order.append(cur) + if cur == latch: + continue + for nx in succs.get(cur, []) or []: + if nx not in visited2 and nx != else_bid: + stack2.append(nx) + body_insts: List[Dict[str, Any]] = [] + for bbid in collect_order: + blk = block_by_id.get(bbid) + if blk is None: + continue + for inst in blk.get('instructions', []) or []: + if inst.get('op') == 'jump' and int(inst.get('target', -1)) == bid: + continue + body_insts.append(inst) + skip_blocks = set(collect_order) + skip_blocks.add(bid) + return { + 'header': bid, + 'then': then_bid, + 'else': else_bid, + 'latch': latch, + 'exit': else_bid, + 'cond': cond_vid, + 'body_insts': body_insts, + 'skip_blocks': list(skip_blocks), + } + return None diff --git a/src/llvm_py/pyvm/vm.py b/src/llvm_py/pyvm/vm.py index 1b5c0270..903dc20d 100644 --- a/src/llvm_py/pyvm/vm.py +++ b/src/llvm_py/pyvm/vm.py @@ -112,8 +112,17 @@ class PyVM: cur = min(fn.blocks.keys()) prev: Optional[int] = None - # Simple block execution loop + # Simple block execution loop with step budget to avoid infinite hangs + max_steps = 0 + try: + max_steps = int(os.environ.get("NYASH_PYVM_MAX_STEPS", "200000")) + except Exception: + max_steps = 200000 + steps = 0 while True: + steps += 1 + if max_steps and steps > max_steps: + raise RuntimeError(f"pyvm: max steps exceeded ({max_steps}) in function {fn.name}") block = fn.blocks.get(cur) if block is None: raise RuntimeError(f"block not found: {cur}") @@ -226,18 +235,28 @@ class PyVM: a = self._read(regs, inst.get("lhs")) b = self._read(regs, inst.get("rhs")) res: bool - if operation == "==": + # For ordering comparisons, be robust to None by coercing to ints + if operation in ("<", "<=", ">", ">="): + try: + ai = 0 if a is None else (int(a) if not isinstance(a, str) else 0) + except Exception: + ai = 0 + try: 
+ bi = 0 if b is None else (int(b) if not isinstance(b, str) else 0) + except Exception: + bi = 0 + if operation == "<": + res = ai < bi + elif operation == "<=": + res = ai <= bi + elif operation == ">": + res = ai > bi + else: + res = ai >= bi + elif operation == "==": res = (a == b) elif operation == "!=": res = (a != b) - elif operation == "<": - res = (a < b) - elif operation == "<=": - res = (a <= b) - elif operation == ">": - res = (a > b) - elif operation == ">=": - res = (a >= b) else: raise RuntimeError(f"unsupported compare: {operation}") # VM convention: booleans are i64 0/1 @@ -287,6 +306,12 @@ class PyVM: i += 1 continue + if op == "copy": + src = self._read(regs, inst.get("src")) + self._set(regs, inst.get("dst"), src) + i += 1 + continue + if op == "boxcall": recv = self._read(regs, inst.get("box")) method = inst.get("method") diff --git a/src/llvm_py/resolver.py b/src/llvm_py/resolver.py index 0f132ad2..8e132929 100644 --- a/src/llvm_py/resolver.py +++ b/src/llvm_py/resolver.py @@ -5,6 +5,8 @@ Based on src/backend/llvm/compiler/codegen/instructions/resolver.rs from typing import Dict, Optional, Any, Tuple import os +from trace import phi as trace_phi +from trace import values as trace_values import llvmlite.ir as ir class Resolver: @@ -26,6 +28,13 @@ class Resolver: # Legacy constructor (vmap, bb_map) — builder/module will be set later when available self.builder = None self.module = None + try: + # Keep references to global maps when provided + self.global_vmap = a if isinstance(a, dict) else None + self.global_bb_map = b if isinstance(b, dict) else None + except Exception: + self.global_vmap = None + self.global_bb_map = None # Caches: (block_name, value_id) -> llvm value self.i64_cache: Dict[Tuple[str, int], ir.Value] = {} @@ -95,9 +104,35 @@ class Resolver: bmap = self.block_phi_incomings.get(block_id) if isinstance(bmap, dict) and value_id in bmap: existing_cur = vmap.get(value_id) - if existing_cur is not None and hasattr(existing_cur, 
'add_incoming'): + # Use placeholder only if it belongs to the current block; otherwise + # create/ensure a local PHI at the current block head to dominate uses. + is_phi_here = False + try: + is_phi_here = ( + existing_cur is not None + and hasattr(existing_cur, 'add_incoming') + and getattr(getattr(existing_cur, 'basic_block', None), 'name', None) == current_block.name + ) + except Exception: + is_phi_here = False + if is_phi_here: self.i64_cache[cache_key] = existing_cur return existing_cur + # Materialize a local PHI placeholder at block start and bind to vmap + b = ir.IRBuilder(current_block) + try: + b.position_at_start(current_block) + except Exception: + pass + phi_local = b.phi(self.i64, name=f"phi_{value_id}") + vmap[value_id] = phi_local + try: + if isinstance(getattr(self, 'global_vmap', None), dict): + self.global_vmap[value_id] = phi_local + except Exception: + pass + self.i64_cache[cache_key] = phi_local + return phi_local except Exception: pass @@ -116,8 +151,7 @@ class Resolver: if defined_here: existing = vmap.get(value_id) if existing is not None and hasattr(existing, 'type') and isinstance(existing.type, ir.IntType) and existing.type.width == 64: - if os.environ.get('NYASH_LLVM_TRACE_VALUES') == '1': - print(f"[resolve] local reuse: bb{bid} v{value_id}", flush=True) + trace_values(f"[resolve] local reuse: bb{bid} v{value_id}") self.i64_cache[cache_key] = existing return existing else: @@ -131,8 +165,7 @@ class Resolver: base_val = vmap.get(value_id) if base_val is None: result = ir.Constant(self.i64, 0) - if os.environ.get('NYASH_LLVM_TRACE_PHI') == '1': - print(f"[resolve] bb{bid} v{value_id} entry/no-preds → 0", flush=True) + trace_phi(f"[resolve] bb{bid} v{value_id} entry/no-preds → 0") else: # If pointer string, box to handle in current block (use local builder) if hasattr(base_val, 'type') and isinstance(base_val.type, ir.PointerType) and self.module is not None: @@ -185,8 +218,7 @@ class Resolver: declared = False if declared: # Return 
existing placeholder if present; do not create a new PHI here. - if os.environ.get('NYASH_LLVM_TRACE_PHI') == '1': - print(f"[resolve] use placeholder PHI: bb{cur_bid} v{value_id}", flush=True) + trace_phi(f"[resolve] use placeholder PHI: bb{cur_bid} v{value_id}") placeholder = vmap.get(value_id) result = placeholder if (placeholder is not None and hasattr(placeholder, 'add_incoming')) else ir.Constant(self.i64, 0) else: @@ -258,19 +290,14 @@ class Resolver: bb_map: Optional[Dict[int, ir.Block]] = None, _vis: Optional[set] = None) -> ir.Value: """Resolve value as i64 at the end of a given block by traversing predecessors if needed.""" - if os.environ.get('NYASH_LLVM_TRACE_PHI') == '1': - try: - print(f"[resolve] end_i64 enter: bb{block_id} v{value_id}", flush=True) - except Exception: - pass + trace_phi(f"[resolve] end_i64 enter: bb{block_id} v{value_id}") key = (block_id, value_id) if key in self._end_i64_cache: return self._end_i64_cache[key] if _vis is None: _vis = set() if key in _vis: - if os.environ.get('NYASH_LLVM_TRACE_PHI') == '1': - print(f"[resolve] cycle detected at end_i64(bb{block_id}, v{value_id}) → 0", flush=True) + trace_phi(f"[resolve] cycle detected at end_i64(bb{block_id}, v{value_id}) → 0") return ir.Constant(self.i64, 0) _vis.add(key) @@ -285,16 +312,21 @@ class Resolver: is_phi_val = hasattr(val, 'add_incoming') except Exception: is_phi_val = False - if os.environ.get('NYASH_LLVM_TRACE_PHI') == '1': - try: - ty = 'phi' if is_phi_val else ('ptr' if hasattr(val, 'type') and isinstance(val.type, ir.PointerType) else ('i'+str(getattr(val.type,'width','?')) if hasattr(val,'type') and isinstance(val.type, ir.IntType) else 'other')) - print(f"[resolve] snap hit: bb{block_id} v{value_id} type={ty}", flush=True) - except Exception: - pass + try: + ty = 'phi' if is_phi_val else ('ptr' if hasattr(val, 'type') and isinstance(val.type, ir.PointerType) else ('i'+str(getattr(val.type,'width','?')) if hasattr(val,'type') and isinstance(val.type, ir.IntType) 
else 'other')) + trace_phi(f"[resolve] snap hit: bb{block_id} v{value_id} type={ty}") + except Exception: + pass if is_phi_val: - # Using a dominating PHI placeholder as incoming is valid for finalize_phis - self._end_i64_cache[key] = val - return val + # Accept PHI only when it belongs to the same block (dominates end-of-block). + try: + belongs_here = (getattr(getattr(val, 'basic_block', None), 'name', b'').decode() if hasattr(getattr(val, 'basic_block', None), 'name') else str(getattr(getattr(val, 'basic_block', None), 'name', ''))) == f"bb{block_id}" + except Exception: + belongs_here = False + if belongs_here: + self._end_i64_cache[key] = val + return val + # Otherwise ignore and try predecessors to avoid self-carry from foreign PHI coerced = self._coerce_in_block_to_i64(val, block_id, bb_map) self._end_i64_cache[key] = coerced return coerced @@ -310,9 +342,8 @@ class Resolver: # Do not use global vmap here; if not materialized by end of this block # (or its preds), bail out with zero to preserve dominance. - if os.environ.get('NYASH_LLVM_TRACE_PHI') == '1': - preds_s = ','.join(str(x) for x in pred_ids) - print(f"[resolve] end_i64 miss: bb{block_id} v{value_id} preds=[{preds_s}] → 0", flush=True) + preds_s = ','.join(str(x) for x in pred_ids) + trace_phi(f"[resolve] end_i64 miss: bb{block_id} v{value_id} preds=[{preds_s}] → 0") z = ir.Constant(self.i64, 0) self._end_i64_cache[key] = z return z diff --git a/src/llvm_py/test_simple.py b/src/llvm_py/test_simple.py index e7aa6658..c8bad0b1 100644 --- a/src/llvm_py/test_simple.py +++ b/src/llvm_py/test_simple.py @@ -1,51 +1,36 @@ #!/usr/bin/env python3 """ -Simple test for Nyash LLVM Python backend -Tests basic MIR -> LLVM compilation +Simple smoke for Nyash LLVM Python backend +Generates a minimal MIR(JSON) in the current schema and compiles it. 
""" -import json from llvm_builder import NyashLLVMBuilder -# Simple MIR test case: function that returns 42 -test_mir = { - "functions": { - "main": { +# Minimal MIR(JSON): main() { ret 42 } +TEST_MIR = { + "functions": [ + { "name": "main", "params": [], - "return_type": "i64", - "entry_block": 0, - "blocks": { - "0": { + "blocks": [ + { + "id": 0, "instructions": [ - { - "kind": "Const", - "dst": 0, - "value": {"type": "i64", "value": 42} - } - ], - "terminator": { - "kind": "Return", - "value": 0 - } + {"op": "const", "dst": 0, "value": {"type": "i64", "value": 42}}, + {"op": "ret", "value": 0} + ] } - } + ] } - } + ] } def test_basic(): - """Test basic MIR -> LLVM compilation""" builder = NyashLLVMBuilder() - - # Generate LLVM IR - llvm_ir = builder.build_from_mir(test_mir) - print("Generated LLVM IR:") - print(llvm_ir) - - # Compile to object file + ir = builder.build_from_mir(TEST_MIR) + print("Generated LLVM IR (truncated):\n" + "\n".join(ir.splitlines()[:8])) builder.compile_to_object("test_simple.o") - print("\nCompiled to test_simple.o") + print("Compiled to test_simple.o") if __name__ == "__main__": - test_basic() \ No newline at end of file + test_basic() diff --git a/src/llvm_py/trace.py b/src/llvm_py/trace.py new file mode 100644 index 00000000..38f9e6ae --- /dev/null +++ b/src/llvm_py/trace.py @@ -0,0 +1,41 @@ +""" +Lightweight tracing helpers for the LLVM Python backend.
+ +Environment flags (string '1' to enable): +- NYASH_CLI_VERBOSE: general lowering/debug logs +- NYASH_LLVM_TRACE_PHI: PHI resolution/snapshot wiring logs +- NYASH_LLVM_TRACE_VALUES: value resolution logs + +Import and use: + from trace import debug, phi, values + debug("message") + phi("phi message") + values("values message") +""" + +import os + +def _enabled(env_key: str) -> bool: + return os.environ.get(env_key) == '1' + +def debug(msg: str) -> None: + if _enabled('NYASH_CLI_VERBOSE'): + try: + print(msg, flush=True) + except Exception: + pass + +def phi(msg: str) -> None: + if _enabled('NYASH_LLVM_TRACE_PHI'): + try: + print(msg, flush=True) + except Exception: + pass + +def values(msg: str) -> None: + if _enabled('NYASH_LLVM_TRACE_VALUES'): + try: + print(msg, flush=True) + except Exception: + pass + diff --git a/src/llvm_py/utils/values.py b/src/llvm_py/utils/values.py new file mode 100644 index 00000000..802d8729 --- /dev/null +++ b/src/llvm_py/utils/values.py @@ -0,0 +1,31 @@ +""" +Value resolution helpers +Centralize policies like "prefer same-block SSA; otherwise resolve with dominance". +""" + +from typing import Any, Dict, Optional +import llvmlite.ir as ir + +def resolve_i64_strict( + resolver, + value_id: int, + current_block: ir.Block, + preds: Dict[int, list], + block_end_values: Dict[int, Dict[int, Any]], + vmap: Dict[int, Any], + bb_map: Optional[Dict[int, ir.Block]] = None, + *, + prefer_local: bool = True, +) -> ir.Value: + """Resolve i64 under policies: + - If prefer_local and vmap has a same-block definition, reuse it. + - Otherwise, delegate to resolver to localize with PHI/casts as needed. 
+ """ + # Prefer current vmap SSA first (block-local map is passed in vmap) + val = vmap.get(value_id) + if prefer_local and val is not None: + return val + # Fallback to resolver + if resolver is None: + return ir.Constant(ir.IntType(64), 0) + return resolver.resolve_i64(value_id, current_block, preds, block_end_values, vmap, bb_map) diff --git a/src/main.rs b/src/main.rs index 55442ddd..2b590bbb 100644 --- a/src/main.rs +++ b/src/main.rs @@ -20,7 +20,11 @@ pub mod environment; pub mod exception_box; pub mod finalization; pub mod instance_v2; // 🎯 Phase 9.78d: Simplified InstanceBox implementation +// Legacy interpreter module is not included by default; when disabled, re-export lib's stub +#[cfg(feature = "interpreter-legacy")] pub mod interpreter; +#[cfg(not(feature = "interpreter-legacy"))] +pub mod interpreter { pub use nyash_rust::interpreter::*; } pub mod messaging; // 🌐 P2P Communication Infrastructure pub mod method_box; pub mod operator_traits; @@ -41,8 +45,7 @@ pub mod backend; pub mod jit; pub mod semantics; // mirror library semantics module for crate path consistency in bin -// 📊 Performance Benchmarks -pub mod benchmarks; +// 📊 Performance Benchmarks (lib provides; bin does not re-declare) // 🚀 Refactored modules for better organization pub mod cli; diff --git a/src/mir/builder/phi.rs b/src/mir/builder/phi.rs index 51c969ba..a35a3c6a 100644 --- a/src/mir/builder/phi.rs +++ b/src/mir/builder/phi.rs @@ -73,6 +73,8 @@ impl MirBuilder { &mut self, then_block: BasicBlockId, else_block: BasicBlockId, + then_exit_block_opt: Option<BasicBlockId>, + else_exit_block_opt: Option<BasicBlockId>, then_value_raw: ValueId, else_value_raw: ValueId, pre_if_var_map: &HashMap<String, ValueId>, @@ -91,6 +93,33 @@ impl MirBuilder { let result_val = self.value_gen.next(); if self.is_no_phi_mode() { + // In PHI-off mode, emit per-predecessor copies into the actual predecessors + // of the current (merge) block instead of the entry blocks.
This correctly + // handles nested conditionals where the then-branch fans out and merges later. + let merge_block = self + .current_block + .ok_or_else(|| "normalize_if_else_phi: no current (merge) block".to_string())?; + let preds: Vec<BasicBlockId> = if let Some(ref fun_ro) = self.current_function { + if let Some(bb) = fun_ro.get_block(merge_block) { + bb.predecessors.iter().copied().collect() + } else { + Vec::new() + } + } else { + Vec::new() + }; + // Prefer explicit exit blocks if provided; fall back to predecessor scan + let then_exits: Vec<BasicBlockId> = if let Some(b) = then_exit_block_opt { + vec![b] + } else { + preds.iter().copied().filter(|p| *p != else_block).collect() + }; + let else_exits: Vec<BasicBlockId> = if let Some(b) = else_exit_block_opt { + vec![b] + } else { + preds.iter().copied().filter(|p| *p == else_block).collect() + }; + if let Some(var_name) = assigned_var_then.clone() { let else_assigns_same = assigned_var_else .as_ref() @@ -108,13 +137,22 @@ impl MirBuilder { } else { pre_then_var_value.unwrap_or(else_value_raw) }; - self.insert_edge_copy(then_block, result_val, then_value_for_var)?; - self.insert_edge_copy(else_block, result_val, else_value_for_var)?; + // Map predecessors: else_block retains else value; others take then value + for p in then_exits.iter().copied() { + self.insert_edge_copy(p, result_val, then_value_for_var)?; + } + for p in else_exits.iter().copied() { + self.insert_edge_copy(p, result_val, else_value_for_var)?; + } self.variable_map = pre_if_var_map.clone(); self.variable_map.insert(var_name, result_val); } else { - self.insert_edge_copy(then_block, result_val, then_value_raw)?; - self.insert_edge_copy(else_block, result_val, else_value_raw)?; + for p in then_exits.iter().copied() { + self.insert_edge_copy(p, result_val, then_value_raw)?; + } + for p in else_exits.iter().copied() { + self.insert_edge_copy(p, result_val, else_value_raw)?; + } self.variable_map = pre_if_var_map.clone(); } return Ok(result_val); diff --git a/src/mir/builder/stmts.rs
b/src/mir/builder/stmts.rs index af565345..fdd7b3ed 100644 --- a/src/mir/builder/stmts.rs +++ b/src/mir/builder/stmts.rs @@ -187,6 +187,7 @@ impl super::MirBuilder { // Build then with a clean snapshot of pre-if variables self.variable_map = pre_if_var_map.clone(); let then_value_raw = self.build_expression(then_branch)?; + let then_exit_block = Self::current_block(self)?; let then_var_map_end = self.variable_map.clone(); if !self.is_current_block_terminated() { self.emit_instruction(MirInstruction::Jump { @@ -211,6 +212,7 @@ impl super::MirBuilder { })?; (void_val, None, None) }; + let else_exit_block = Self::current_block(self)?; if !self.is_current_block_terminated() { self.emit_instruction(MirInstruction::Jump { target: merge_block, @@ -223,6 +225,8 @@ impl super::MirBuilder { let result_val = self.normalize_if_else_phi( then_block, else_block, + Some(then_exit_block), + Some(else_exit_block), then_value_raw, else_value_raw, &pre_if_var_map, @@ -489,3 +493,4 @@ impl super::MirBuilder { Ok(me_value) } } +use crate::mir::loop_api::LoopBuilderApi; // for current_block() diff --git a/src/runner/dispatch.rs b/src/runner/dispatch.rs index facbe95f..a4ddf525 100644 --- a/src/runner/dispatch.rs +++ b/src/runner/dispatch.rs @@ -115,21 +115,27 @@ pub(crate) fn execute_file_with_backend(runner: &NyashRunner, filename: &str) { if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { println!("🚀 Nyash VM Backend - Executing file: {} 🚀", filename); } - runner.execute_vm_mode(filename); + #[cfg(feature = "vm-legacy")] + { + runner.execute_vm_mode(filename); + } + #[cfg(not(feature = "vm-legacy"))] + { + // Legacy VM is disabled; use PyVM harness instead. + super::modes::pyvm::execute_pyvm_only(runner, filename); + } } "interpreter" => { eprintln!("⚠ interpreter backend is legacy and deprecated. 
Use 'vm' (PyVM/LLVM) instead."); - if std::env::var("NYASH_VM_USE_PY").ok().as_deref() == Some("1") { - if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { - println!("👉 Redirecting to VM backend (PyVM) as requested by NYASH_VM_USE_PY=1"); - } - runner.execute_vm_mode(filename); - } else { - if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { - println!("👉 Redirecting to VM backend"); - } + #[cfg(feature = "vm-legacy")] + { runner.execute_vm_mode(filename); } + #[cfg(not(feature = "vm-legacy"))] + { + // Legacy VM disabled; route to PyVM-only runner + super::modes::pyvm::execute_pyvm_only(runner, filename); + } } #[cfg(feature = "cranelift-jit")] "jit-direct" => { diff --git a/src/runner/mir_json_emit.rs b/src/runner/mir_json_emit.rs index 5b912000..cb13ae13 100644 --- a/src/runner/mir_json_emit.rs +++ b/src/runner/mir_json_emit.rs @@ -18,6 +18,10 @@ pub fn emit_mir_json_for_harness( let mut insts = Vec::new(); // PHI first(オプション) for inst in &bb.instructions { + if let I::Copy { dst, src } = inst { + insts.push(json!({"op":"copy","dst": dst.as_u32(), "src": src.as_u32()})); + continue; + } if let I::Phi { dst, inputs } = inst { let incoming: Vec<_> = inputs .iter() @@ -47,6 +51,13 @@ pub fn emit_mir_json_for_harness( // Non-PHI for inst in &bb.instructions { match inst { + I::Copy { dst, src } => { + insts.push(json!({ + "op": "copy", + "dst": dst.as_u32(), + "src": src.as_u32() + })); + } I::UnaryOp { dst, op, operand } => { let kind = match op { nyash_rust::mir::UnaryOp::Neg => "neg", @@ -296,6 +307,13 @@ pub fn emit_mir_json_for_harness_bin( } for inst in &bb.instructions { match inst { + I::Copy { dst, src } => { + insts.push(json!({ + "op": "copy", + "dst": dst.as_u32(), + "src": src.as_u32() + })); + } I::Const { dst, value } => match value { crate::mir::ConstValue::Integer(i) => { insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": "i64", "value": i}})); diff --git a/src/runner/mod.rs b/src/runner/mod.rs 
index 9a182ed4..4122e3c0 100644 --- a/src/runner/mod.rs +++ b/src/runner/mod.rs @@ -413,8 +413,17 @@ impl NyashRunner { println!("===================================="); println!("Running {} iterations per test...", self.config.iterations); println!(); - - self.execute_benchmark_mode(); + #[cfg(feature = "vm-legacy")] + { + self.execute_benchmark_mode(); + } + #[cfg(not(feature = "vm-legacy"))] + { + eprintln!( + "❌ Benchmark mode requires VM backend. Rebuild with --features vm-legacy." + ); + std::process::exit(1); + } return; } diff --git a/src/runner/modes/bench.rs b/src/runner/modes/bench.rs index c0f07a13..3deaba3b 100644 --- a/src/runner/modes/bench.rs +++ b/src/runner/modes/bench.rs @@ -236,3 +236,4 @@ impl NyashRunner { Ok(()) } } +#![cfg(feature = "vm-legacy")] diff --git a/src/runner/modes/mir_interpreter.rs b/src/runner/modes/mir_interpreter.rs index 15751ee4..57981e1d 100644 --- a/src/runner/modes/mir_interpreter.rs +++ b/src/runner/modes/mir_interpreter.rs @@ -101,4 +101,4 @@ impl NyashRunner { let _ = runtime; // reserved for future GC/safepoint integration } } - +#![cfg(feature = "interpreter-legacy")] diff --git a/src/runner/modes/mod.rs b/src/runner/modes/mod.rs index 1c416642..ecc11d0f 100644 --- a/src/runner/modes/mod.rs +++ b/src/runner/modes/mod.rs @@ -1,8 +1,12 @@ +#[cfg(feature = "vm-legacy")] pub mod bench; +#[cfg(feature = "interpreter-legacy")] pub mod interpreter; pub mod llvm; pub mod mir; +#[cfg(feature = "vm-legacy")] pub mod vm; +pub mod pyvm; #[cfg(feature = "cranelift-jit")] pub mod aot; diff --git a/src/runner/modes/pyvm.rs b/src/runner/modes/pyvm.rs new file mode 100644 index 00000000..65c10719 --- /dev/null +++ b/src/runner/modes/pyvm.rs @@ -0,0 +1,102 @@ +use super::super::NyashRunner; +use nyash_rust::{mir::MirCompiler, parser::NyashParser}; +use std::{fs, process}; + +/// Execute using PyVM only (no Rust VM runtime). Emits MIR(JSON) and invokes tools/pyvm_runner.py. 
+pub fn execute_pyvm_only(_runner: &NyashRunner, filename: &str) { + // Read the file + let code = match fs::read_to_string(filename) { + Ok(content) => content, + Err(e) => { + eprintln!("❌ Error reading file {}: {}", filename, e); + process::exit(1); + } + }; + + // Parse to AST + let ast = match NyashParser::parse_from_string(&code) { + Ok(ast) => ast, + Err(e) => { + eprintln!("❌ Parse error: {}", e); + process::exit(1); + } + }; + + // Compile to MIR (respect default optimizer setting) + let mut mir_compiler = MirCompiler::with_options(true); + let mut compile_result = match mir_compiler.compile(ast) { + Ok(result) => result, + Err(e) => { + eprintln!("❌ MIR compilation error: {}", e); + process::exit(1); + } + }; + + // Optional: VM-only escape analysis elision pass retained for parity with VM path + if std::env::var("NYASH_VM_ESCAPE_ANALYSIS").ok().as_deref() == Some("1") { + let removed = nyash_rust::mir::passes::escape::escape_elide_barriers_vm(&mut compile_result.module); + if removed > 0 && std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { + eprintln!("[PyVM] escape_elide_barriers: removed {} barriers", removed); + } + } + + // Emit MIR JSON for PyVM harness + let tmp_dir = std::path::Path::new("tmp"); + let _ = std::fs::create_dir_all(tmp_dir); + let mir_json_path = tmp_dir.join("nyash_pyvm_mir.json"); + if let Err(e) = crate::runner::mir_json_emit::emit_mir_json_for_harness( + &compile_result.module, + &mir_json_path, + ) { + eprintln!("❌ PyVM MIR JSON emit error: {}", e); + process::exit(1); + } + + // Pick entry: prefer Main.main or main + let entry = if compile_result.module.functions.contains_key("Main.main") { + "Main.main" + } else if compile_result.module.functions.contains_key("main") { + "main" + } else { + "Main.main" + }; + + // Locate python3 and run harness + let py = which::which("python3").ok(); + if let Some(py3) = py { + let runner = std::path::Path::new("tools/pyvm_runner.py"); + if !runner.exists() { + eprintln!("❌ 
PyVM runner not found: {}", runner.display()); + process::exit(1); + } + if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { + eprintln!( + "[Runner/PyVM] {} (mir={})", + filename, + mir_json_path.display() + ); + } + let status = std::process::Command::new(py3) + .args([ + runner.to_string_lossy().as_ref(), + "--in", + &mir_json_path.display().to_string(), + "--entry", + entry, + ]) + .status() + .map_err(|e| format!("spawn pyvm: {}", e)) + .unwrap(); + let code = status.code().unwrap_or(1); + if !status.success() { + if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { + eprintln!("❌ PyVM failed (status={})", code); + } + } + process::exit(code); + } else { + eprintln!("❌ python3 not found in PATH. Install Python 3 to use PyVM."); + process::exit(1); + } +} + diff --git a/src/runtime/host_api.rs b/src/runtime/host_api.rs index 924ab099..75389d08 100644 --- a/src/runtime/host_api.rs +++ b/src/runtime/host_api.rs @@ -9,16 +9,20 @@ use crate::box_trait::NyashBox; // ===== TLS: current VM pointer during plugin invoke ===== +// When legacy VM is enabled, keep a real pointer for write barriers. +#[cfg(feature = "vm-legacy")] thread_local! { static CURRENT_VM: std::cell::Cell<*mut crate::backend::vm::VM> = std::cell::Cell::new(std::ptr::null_mut()); } - +#[cfg(feature = "vm-legacy")] pub fn set_current_vm(ptr: *mut crate::backend::vm::VM) { CURRENT_VM.with(|c| c.set(ptr)); } +#[cfg(feature = "vm-legacy")] pub fn clear_current_vm() { CURRENT_VM.with(|c| c.set(std::ptr::null_mut())); } +#[cfg(feature = "vm-legacy")] fn with_current_vm_mut<F, R>(f: F) -> Option<R> where F: FnOnce(&mut crate::backend::vm::VM) -> R, @@ -33,6 +37,19 @@ where }) } +// When legacy VM is disabled, provide stubs (no GC barriers).
+#[cfg(not(feature = "vm-legacy"))] +pub fn set_current_vm(_ptr: *mut ()) {} +#[cfg(not(feature = "vm-legacy"))] +pub fn clear_current_vm() {} +#[cfg(not(feature = "vm-legacy"))] +fn with_current_vm_mut<F, R>(_f: F) -> Option<R> +where + F: FnOnce(&mut ()) -> R, +{ + None +} + // ===== Utilities: TLV encode helpers (single-value) ===== fn tlv_encode_one(val: &crate::backend::vm::VMValue) -> Vec<u8> { use crate::runtime::plugin_ffi_common as tlv; @@ -201,12 +218,15 @@ pub extern "C" fn nyrt_host_call_name( v => v.to_string(), }; // Barrier: use current VM runtime if available - let _ = with_current_vm_mut(|vm| { - crate::backend::gc_helpers::gc_write_barrier_site( - vm.runtime_ref(), - "HostAPI.setField", - ); - }); + #[cfg(feature = "vm-legacy")] + { + let _ = with_current_vm_mut(|vm| { + crate::backend::gc_helpers::gc_write_barrier_site( + vm.runtime_ref(), + "HostAPI.setField", + ); + }); + } // Accept primitives only for now let nv_opt = match argv[1].clone() { crate::backend::vm::VMValue::Integer(i) => { @@ -268,12 +288,15 @@ pub extern "C" fn nyrt_host_call_name( crate::backend::vm::VMValue::BoxRef(b) => b.share_box(), _ => Box::new(crate::box_trait::VoidBox::new()), }; - let _ = with_current_vm_mut(|vm| { - crate::backend::gc_helpers::gc_write_barrier_site( - vm.runtime_ref(), - "HostAPI.Array.set", - ); - }); + #[cfg(feature = "vm-legacy")] + { + let _ = with_current_vm_mut(|vm| { + crate::backend::gc_helpers::gc_write_barrier_site( + vm.runtime_ref(), + "HostAPI.Array.set", + ); + }); + } let out = arr.set(Box::new(crate::box_trait::IntegerBox::new(idx)), vb); let vmv = crate::backend::vm::VMValue::from_nyash_box(out); let buf = tlv_encode_one(&vmv); @@ -402,12 +425,15 @@ pub extern "C" fn nyrt_host_call_slot( crate::backend::vm::VMValue::String(s) => s.clone(), v => v.to_string(), }; - let _ =
with_current_vm_mut(|vm| { + crate::backend::gc_helpers::gc_write_barrier_site( + vm.runtime_ref(), + "HostAPI.setField", + ); + }); + } let nv_opt = match argv[1].clone() { crate::backend::vm::VMValue::Integer(i) => { Some(crate::value::NyashValue::Integer(i)) @@ -492,12 +518,15 @@ pub extern "C" fn nyrt_host_call_slot( crate::backend::vm::VMValue::BoxRef(b) => b.share_box(), _ => Box::new(crate::box_trait::VoidBox::new()), }; - let _ = with_current_vm_mut(|vm| { - crate::backend::gc_helpers::gc_write_barrier_site( - vm.runtime_ref(), - "HostAPI.Array.set", - ); - }); + #[cfg(feature = "vm-legacy")] + { + let _ = with_current_vm_mut(|vm| { + crate::backend::gc_helpers::gc_write_barrier_site( + vm.runtime_ref(), + "HostAPI.Array.set", + ); + }); + } let out = arr.set(Box::new(crate::box_trait::IntegerBox::new(idx)), vb); let vmv = crate::backend::vm::VMValue::from_nyash_box(out); let buf = tlv_encode_one(&vmv); diff --git a/src/tests/mir_no_phi_merge_tests.rs b/src/tests/mir_no_phi_merge_tests.rs new file mode 100644 index 00000000..f9cd1ef3 --- /dev/null +++ b/src/tests/mir_no_phi_merge_tests.rs @@ -0,0 +1,62 @@ +use crate::ast::{ASTNode, LiteralValue, Span, BinaryOperator}; +use crate::mir::{MirCompiler, MirInstruction}; + +fn lit_i(i: i64) -> ASTNode { + ASTNode::Literal { value: LiteralValue::Integer(i), span: Span::unknown() } +} + +fn bool_lt(lhs: ASTNode, rhs: ASTNode) -> ASTNode { + ASTNode::BinaryOp { operator: BinaryOperator::LessThan, left: Box::new(lhs), right: Box::new(rhs), span: Span::unknown() } +} + +#[test] +fn mir13_no_phi_if_merge_inserts_edge_copies_for_return() { + // Force PHI-off mode + std::env::set_var("NYASH_MIR_NO_PHI", "1"); + + // if (1 < 2) { return 40 } else { return 50 } + let ast = ASTNode::If { + condition: Box::new(bool_lt(lit_i(1), lit_i(2))), + then_body: vec![ASTNode::Return { value: Some(Box::new(lit_i(40))), span: Span::unknown() }], + else_body: Some(vec![ASTNode::Return { value: Some(Box::new(lit_i(50))), span: 
Span::unknown() }]), + span: Span::unknown(), + }; + + let mut mc = MirCompiler::with_options(false); + let cr = mc.compile(ast).expect("compile"); + let f = cr.module.functions.get("main").expect("function main"); + + // Find the block that returns a value and capture that return value id + let mut ret_block_id = None; + let mut ret_val = None; + for (bid, bb) in &f.blocks { + if let Some(MirInstruction::Return { value: Some(v) }) = bb.terminator.clone() { + ret_block_id = Some(*bid); + ret_val = Some(v); + break; + } + } + let ret_block = ret_block_id.expect("ret block"); + let out_v = ret_val.expect("ret value"); + + // In PHI-off mode we expect copies into each predecessor of the merge/ret block + let preds: Vec<_> = f + .blocks + .get(&ret_block) + .expect("ret block present") + .predecessors + .iter() + .copied() + .collect(); + assert!(preds.len() >= 2, "expected at least two predecessors at merge"); + + for p in preds { + let bb = f.blocks.get(&p).expect("pred block present"); + let has_copy = bb + .instructions + .iter() + .any(|inst| matches!(inst, MirInstruction::Copy { dst, .. 
} if *dst == out_v)); + assert!(has_copy, "expected Copy to out_v in predecessor {:?}", p); + } +} + diff --git a/tools/llvmlite_harness.py b/tools/llvmlite_harness.py index 27bcd948..247cb08b 100644 --- a/tools/llvmlite_harness.py +++ b/tools/llvmlite_harness.py @@ -50,6 +50,9 @@ def run_dummy(out_path: str) -> None: def run_from_json(in_path: str, out_path: str) -> None: # Delegate to python builder to keep code unified import runpy + # Enable safe defaults for prepasses unless explicitly disabled by env + os.environ.setdefault('NYASH_LLVM_PREPASS_LOOP', '0') + os.environ.setdefault('NYASH_LLVM_PREPASS_IFMERGE', '1') # Ensure src/llvm_py is on sys.path for relative imports builder_dir = str(PY_BUILDER.parent) if builder_dir not in sys.path: diff --git a/tools/smokes/curated_llvm.sh b/tools/smokes/curated_llvm.sh index 4a7c041a..b01bcc9d 100644 --- a/tools/smokes/curated_llvm.sh +++ b/tools/smokes/curated_llvm.sh @@ -2,7 +2,7 @@ set -euo pipefail # Curated LLVM smoke runner (llvmlite harness) -# Usage: tools/smokes/curated_llvm.sh [--phi-off] +# Usage: tools/smokes/curated_llvm.sh [--phi-off|--phi-on] [--with-if-merge] ROOT_DIR=$(cd "$(dirname "$0")/../.." && pwd) BIN="$ROOT_DIR/target/release/nyash" @@ -18,9 +18,15 @@ export NYASH_LLVM_USE_HARNESS=1 # Default: PHI-off (MIR13). Use --phi-on to test PHI-on path.
export NYASH_MIR_NO_PHI=${NYASH_MIR_NO_PHI:-1} export NYASH_VERIFY_ALLOW_NO_PHI=${NYASH_VERIFY_ALLOW_NO_PHI:-1} +WITH_IFMERGE=0 +if [[ "${1:-}" == "--with-if-merge" || "${2:-}" == "--with-if-merge" ]]; then + WITH_IFMERGE=1 + export NYASH_LLVM_PREPASS_IFMERGE=1 + echo "[curated-llvm] enabling if-merge prepass for ternary tests" >&2 +fi if [[ "${1:-}" == "--phi-on" ]]; then export NYASH_MIR_NO_PHI=0 echo "[curated-llvm] PHI-on (JSON PHI + finalize) enabled" >&2 else echo "[curated-llvm] PHI-off (edge-copy) enabled" >&2 fi @@ -50,4 +56,10 @@ run "$ROOT_DIR/apps/tests/peek_expr_block.nyash" # Try/finally control-flow without actual throw run "$ROOT_DIR/apps/tests/try_finally_break_inner_loop.nyash" +# Optional: if-merge (ret-merge) tests +if [[ "$WITH_IFMERGE" == "1" ]]; then + run "$ROOT_DIR/apps/tests/ternary_basic.nyash" + run "$ROOT_DIR/apps/tests/ternary_nested.nyash" +fi + echo "[curated-llvm] OK"
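Note on the value-resolution policy: the per-block `vmap_cur` and `utils/values.py` changes above all implement one decision order: prefer a same-block SSA definition, otherwise delegate to the resolver, and fall back to a zero `i64` constant when nothing dominating exists. A minimal sketch of that order, modeled in plain Python without llvmlite (the `resolver` callable and the constant `0` stand in for `Resolver.resolve_i64` and `ir.Constant(ir.IntType(64), 0)`; these names are illustrative, not the backend's API):

```python
# Toy model of the "prefer same-block SSA -> resolver -> i64 0" policy.
# In the real backend, vmap holds llvmlite ir.Values for the current block
# and the resolver localizes cross-block values with PHIs/casts.

def resolve_i64_strict(value_id, vmap, resolver=None, prefer_local=True):
    """Return a block-local value when one exists; otherwise delegate."""
    val = vmap.get(value_id)
    if prefer_local and val is not None:
        return val              # same-block definition already dominates
    if resolver is None:
        return 0                # conservative fallback (models i64 0)
    return resolver(value_id)   # localize via predecessors / PHIs

# A block-local definition wins even when a resolver is available.
vmap_cur = {3: 42}
print(resolve_i64_strict(3, vmap_cur, resolver=lambda v: -1))  # → 42
print(resolve_i64_strict(7, vmap_cur, resolver=lambda v: -1))  # → -1
print(resolve_i64_strict(7, {}))                               # → 0
```

The same preference order is what keeps cross-block vmap pollution out: a foreign PHI is never adopted just because it happens to be in the map.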