diff --git a/CURRENT_TASK.md b/CURRENT_TASK.md
index 9beb62fa..02af19a4 100644
--- a/CURRENT_TASK.md
+++ b/CURRENT_TASK.md
@@ -1,25 +1,19 @@
-# Current Task (2025-09-11): Phase 15 LLVM (primary path) + llvmlite Harness (verification, future primary)
+# Current Task (revised 2025-09-13): Phase 15 llvmlite (default) + PyVM (new)
 
 Summary
-- LLVM AOT (Rust/inkwell) remains the primary path. However, to secure iteration speed and resilience to spec changes, the Python/llvmlite harness is formally introduced, and the equivalence of the two is verified on representative cases.
-- VM/Cranelift/Interpreter do not support MIR14. MIR normalization (Resolver and LoopForm conventions) is guaranteed on the Rust side, and the same form is fed to the harness.
-- Stably generate `.o` (and an EXE when needed) for the representative case (apps/selfhost/tools/dep_tree_min_string.nyash). Confirm functional equivalence with the harness ON/OFF.
+- JIT/Cranelift is paused. Rust/inkwell LLVM is reference-only.
+- The default run/build path is the Python/llvmlite harness (MIR JSON→.o→NyRT link).
+- Introduce PyVM (Python MIR VM) as a second execution path and stabilize it through functional equivalence with llvmlite.
 
-Quick Status - 2025-09-13 (compressed, post-harness fixes)
-- With the harness ON (llvmlite), .ll verify green → .o → link succeeds (dep_tree_min_string)
-- Unified on Resolver-only (direct vmap reads eliminated). PHIs are grouped at the BB head and fixed to i64 (handle); pointer incomings are boxed right before the pred terminator (GEP + from_i8_string)
-- Lowering order: blocks are lowered in a pseudo-topological order that prioritizes preds. Non-PHI instructions are inserted at the end of the "current BB" (stable dominance)
-- Strings: '+' uses concat_hh only when a string tag/ptr is detected; len/eq are supported; handle versions (_hii/_hh) of substring/lastIndexOf are implemented in NyRT and used
-- const(string): keep the Global → normalize via GEP → i8* at the use site. MIR main → private, ny_main wrapper generated
-- by-name constants: the i8* for method names uses a constant GEP (removes order dependence)
-- Comparison/verification: compare_harness_on_off.sh shows matching exit codes for ON/OFF (currently the JSON is empty on both sides; final JSON equality is to be settled in the next phase)
+Quick Status - 2025-09-13 (post-harness hardening)
+- With llvmlite (harness), verify green → .o → link succeeds on the representative case (dep_tree_min_string)
+- Strengthened the Resolver-only / Sealed SSA / string-handle invariants. PHIs sit at the BB head; boxing/cast happens at the pred terminator.
+- IR dump / PHI guard / deny-direct check are available (NYASH_LLVM_DUMP_IR, NYASH_LLVM_PHI_STRICT, tools/llvmlite_check_deny_direct.sh).
 
-Focus Shift - Python/llvmlite Only (2025-09-13)
-- The Rust/inkwell side moves to "maintenance" for now. Development and refinement proceed with Nyash scripts + Python/llvmlite only.
-- Added smoke: apps/tests/esc_dirname_smoke.nyash (minimal 2-line output of esc_json/dirname).
-- Added trace: with `NYASH_LLVM_TRACE_FINAL=1`, `nyash.debug.trace_handle(i64)` is called right before println to observe the final handle.
-- Lifetime hints (lightweight): the Builder collects `def_blocks` (value_id → set of defining blocks), and the Resolver preferentially reuses i64 values already defined in the current block (curbs PHI proliferation).
-- const(string) improvement: box to an i64 handle immediately via `from_i8_string` (reduces values falling to 0 in downstream chains).
+Focus Shift - llvmlite (default) + PyVM (new)
+- Rust/inkwell is maintenance-only. Development centers on Python (llvmlite/PyVM).
+- Added smokes: esc_dirname_smoke / dep_tree_min_string are kept passing on both llvmlite and PyVM at all times.
+- Tracing: `NYASH_LLVM_TRACE_FINAL=1` (final handle), `NYASH_LLVM_TRACE_PHI=1` (PHI log)
 
 Hot Update - 2025-09-13 (harness wiring, fallback removed)
 - Added harness wiring to the Runner (LLVM mode). When `NYASH_LLVM_USE_HARNESS=1`:
@@ -39,14 +33,43 @@ Hot Update - 2025-09-13 (Resolver-only unification + Harness ON green)
 - `main` collision avoidance: the MIR-derived `main` is made private and a `ny_main()` wrapper is auto-generated (consistent with NyRT `main`).
 - Representative case (dep_tree_min_string): confirmed `.ll verify green → .o` with the harness ON, then linked with NyRT and successfully produced an EXE.
 
-Next (short, refreshed; Py/llvmlite line)
-1) Finalize smoke: make the 2-line output of esc_dirname_smoke match exactly between ON/OFF (line comparison).
-2) Final JSON equality for dep_tree_min_string (diff from `{` onward is empty).
-   - Observe the chain of println argument handles with `NYASH_LLVM_TRACE_FINAL=1` + `NYASH_LLVM_TRACE_VALUES=1`, locate where the synth-zero originates → fix locally in Resolver/PHI.
-   - For PHI/snapshot, strictly follow the order "materialize in pred → otherwise snap → finally synth(0)". Never insert None.
-3) CI/auxiliary
-   - Keep the smokes easy to invoke from compare_harness_on_off.sh as well (add a line-comparison mode if needed).
-   - Keep checking Deny-Direct (suppression of direct `vmap.get(` reads).
+Next (short; Py/llvmlite + PyVM)
+1) PyVM scaffold: add `tools/pyvm_runner.py` and `src/llvm_py/pyvm/` (minimal instructions + boxcall).
+2) Runner integration: `NYASH_VM_USE_PY=1` → pass MIR(JSON) to PyVM and execute.
+3) Parity infrastructure: add a general-purpose parity script `tools/parity.sh` (compares stdout + exit code; any pair among pyvm/vm/llvmlite).
+4) Introduce type metadata (MIR v0.5 compatible): make the handle/ptr kind of String explicit in the JSON MIR, eliminating guesswork in llvmlite/PyVM.
+5) Expand smokes: esc_dirname_smoke / dep_tree_min_string match on both paths (exit code + JSON).
+
+Hot Update - MIR v0.5 Type Metadata (started 2025-09-14)
+- Background: strings are handled ambiguously as i64, so llvmlite has to guess handle vs ptr → a breeding ground for instability.
+- Added spec (backward compatible, minimal diff):
+  - Const(string): `{"value": {"type": {"kind":"handle","box_type":"StringBox"}, "value":"..."}}`
+  - The existing `type:"string"` notation is not emitted alongside it (consumers prefer the new notation and fall back to the legacy inference when it is absent).
+  - BoxCall/ExternCall: attach `dst_type` where possible (e.g. substring→StringBox(handle), length/lastIndexOf→i64).
+- Implementation plan:
+  - A) Extend the shared emitter `src/runner/mir_json_emit.rs` (string const / `dst_type` for a minimal set of methods).
+  - B) Python side: `llvm_builder.py` detects `dst_type` and calls `resolver.mark_string(dst)` (makes type tags explicit).
+  - C) Stabilize Console output: keep the existing pointer API / handle API for now. Consider introducing a handle→ptr bridge once type metadata is widespread.
+- Acceptance (stage 1):
+  - Type metadata for string const appears in the JSON
+  - The Python side tags string handles based on `dst_type`
+  - `tools/parity.sh` can be run on esc_dirname_smoke (exact match is the stage-2 goal)
+
+Hot Update - Box Theory PHI (planned addition, 2025-09-14)
+- Background: loop PHIs fall back to a synthesized 0 when the snapshot has not been built yet (because forward references are resolved on the spot).
+- Policy (simplification based on Box Theory):
+  - Block = box (BoxScope). Treat each block's end-of-block `block_end_values` as a box.
+  - PHIs are not resolved immediately; they are collected as deferred → after all blocks are lowered, finalize pulls values out of the boxes (pred snapshots) and wires them.
+  - Across blocks, String is always fixed to handle (i64). Pointer PHIs are forbidden. Any required boxing (ptr→handle) is inserted at the pred end (right before the terminator).
+  - '+' is always concat_hh(handle, handle). i64 primitives are promoted via from_i64; literals via from_i8_string.
+- Implementation plan:
+  - A) `llvm_builder.py`: make `lower_phi` deferred and add `finalize_phis` (materializes incoming=(pred_bid,val_id)).
+  - B) `llvm_builder.py`: reinforce the completeness of `block_end_values` (function arguments / const / new dst / phi dst / loop-carried values are reliably included).
+  - C) `resolver._value_at_end_i64`: enforce local boxing/cast at the pred end and suppress undefined→0 synthesis (warn in strict mode).
+- Acceptance (stage 2):
+  - Parity green between PyVM and llvmlite on the minimal repro `apps/tests/min_str_cat_loop/main.nyash` (`xxx`)
+  - Parity green on `apps/tests/esc_dirname_smoke.nyash` (the 0 on line 1 is resolved)
+  - `tools/parity.sh` shows exact stdout match + exit code match
 
 Compact Roadmap (revised 2025-09-13)
 - Focus A (maintain Rust LLVM): flow hardening, PHI (sealed) stabilization, LoopForm spec compliance.
diff --git a/app_par_esc b/app_par_esc
new file mode 100644
index 00000000..4ecbb6e1
Binary files /dev/null and b/app_par_esc differ
diff --git a/app_parity_esc_dirname_smoke b/app_parity_esc_dirname_smoke
new file mode 100644
index 00000000..32a633fd
Binary files /dev/null and b/app_parity_esc_dirname_smoke differ
diff --git a/app_parity_main b/app_parity_main
new file mode 100644
index 00000000..734b254b
Binary files /dev/null and b/app_parity_main differ
diff --git a/apps/tests/min_str_cat_loop/main.nyash b/apps/tests/min_str_cat_loop/main.nyash
new file mode 100644
index 00000000..fd9a76db
--- /dev/null
+++ b/apps/tests/min_str_cat_loop/main.nyash
@@ -0,0 +1,17 @@
+// Minimal repro: string concatenation in a loop should yield "xxx"
+
+static box Main {
+  main(args) {
+    local console = new ConsoleBox()
+    local out = ""
+    local i = 0
+    local n = 3
+    loop(i < n) {
+      out = out + "x"
+      i = i + 1
+    }
+    console.println(out)
+    return 0
+  }
+}
+
diff --git a/crates/nyrt/src/lib.rs b/crates/nyrt/src/lib.rs
index d1c50eab..796e4571 100644
--- a/crates/nyrt/src/lib.rs
+++ b/crates/nyrt/src/lib.rs
@@ -79,7 +79,9 @@ pub extern "C" fn nyash_string_concat_hh_export(a_h: i64, b_h: i64) -> i64 {
     };
     let s = format!("{}{}", to_s(a_h), to_s(b_h));
     let arc: std::sync::Arc<dyn NyashBox> = std::sync::Arc::new(StringBox::new(s));
-    handles::to_handle(arc) as i64
+    let h = handles::to_handle(arc) as i64;
+    eprintln!("[TRACE] concat_hh -> {}", h);
+    h
 }
 
 // String.eq_hh(lhs_h, rhs_h) -> i64 (0/1)
@@ -120,7 +122,9 @@ pub extern "C" fn nyash_string_substring_hii_export(h: i64, start: i64, end: i64
     let (st_u, en_u) = (st as usize, en as usize);
     let sub = s.get(st_u.min(s.len())..en_u.min(s.len())).unwrap_or("");
     let arc: std::sync::Arc<dyn NyashBox> = std::sync::Arc::new(StringBox::new(sub.to_string()));
-    handles::to_handle(arc) as i64
+    let nh = handles::to_handle(arc) as i64;
+    eprintln!("[TRACE] substring_hii -> {}", nh);
+    nh
 }
 
 // String.lastIndexOf_hh(haystack_h, needle_h) -> i64
@@ -159,7 +163,9 @@ pub extern "C" fn nyash_box_from_i8_string(ptr: *const i8) -> i64 {
         Err(_) => return 0,
     };
     let arc: std::sync::Arc<dyn NyashBox> = std::sync::Arc::new(StringBox::new(s));
-    handles::to_handle(arc) as i64
+    let h = handles::to_handle(arc) as i64;
+    eprintln!("[TRACE] from_i8_string -> {}", h);
+    h
 }
 
 // box.from_f64(val) -> handle
@@ -171,6 +177,15 @@ pub extern "C" fn nyash_box_from_f64(val: f64) -> i64 {
     handles::to_handle(arc) as i64
 }
 
+// box.from_i64(val) -> handle
+// Helper: build an IntegerBox and return a handle
+#[export_name = "nyash.box.from_i64"]
+pub extern "C" fn nyash_box_from_i64(val: i64) -> i64 {
+    use nyash_rust::{box_trait::{NyashBox, IntegerBox}, jit::rt::handles};
+    let arc: std::sync::Arc<dyn NyashBox> = std::sync::Arc::new(IntegerBox::new(val));
+    handles::to_handle(arc) as i64
+}
+
 // env.box.new(type_name: *const i8) -> handle (i64)
 // Minimal shim for Core-13 pure AOT: constructs Box via registry by name (no args)
 #[export_name = "nyash.env.box.new"]
diff --git a/crates/nyrt/src/plugin/string.rs b/crates/nyrt/src/plugin/string.rs
index fb45f2ff..1f5bf204 100644
--- a/crates/nyrt/src/plugin/string.rs
+++ b/crates/nyrt/src/plugin/string.rs
@@ -83,7 +83,7 @@ pub extern "C" fn nyash_string_concat_is(a: i64, b: *const i8) -> *mut i8 {
 // Exported as: nyash.string.substring_sii(i8* s, i64 start, i64 end) -> i8*
 #[export_name = "nyash.string.substring_sii"]
 pub extern "C" fn nyash_string_substring_sii(s: *const i8, start: i64, end: i64) -> *mut i8 {
-    use std::ffi::CStr;
+use std::ffi::CStr;
     if s.is_null() {
         return std::ptr::null_mut();
     }
@@ -121,3 +121,34 @@ pub extern "C" fn nyash_string_lastindexof_ss(s: *const i8, needle: *const i8) -
         pos as i64
     } else { -1 }
 }
+
+// Exported as: nyash.string.to_i8p_h(i64 handle) -> i8*
+#[export_name = "nyash.string.to_i8p_h"]
+pub extern "C" fn nyash_string_to_i8p_h(handle: i64) -> *mut i8 {
+    use nyash_rust::jit::rt::handles;
+    if handle <= 0 {
+        // return "0" for consistency with existing fallback behavior
+        let s = handle.to_string();
+        let mut bytes = s.into_bytes();
+        bytes.push(0);
+        let boxed = bytes.into_boxed_slice();
+        let raw = Box::into_raw(boxed) as *mut u8;
+        return raw as *mut i8;
+    }
+    if let Some(obj) = handles::get(handle as u64) {
+        let s = obj.to_string_box().value;
+        let mut bytes = s.into_bytes();
+        bytes.push(0);
+        let boxed = bytes.into_boxed_slice();
+        let raw = Box::into_raw(boxed) as *mut u8;
+        raw as *mut i8
+    } else {
+        // not found -> print numeric handle string
+        let s = handle.to_string();
+        let mut bytes = s.into_bytes();
+        bytes.push(0);
+        let boxed = bytes.into_boxed_slice();
+        let raw = Box::into_raw(boxed) as *mut u8;
+        raw as *mut i8
+    }
+}
diff --git a/docs/development/roadmap/phases/phase-15/README.md b/docs/development/roadmap/phases/phase-15/README.md
index 1aa9ed55..a14675f0 100644
--- a/docs/development/roadmap/phases/phase-15/README.md
+++ b/docs/development/roadmap/phases/phase-15/README.md
@@ -14,26 +14,26 @@ Making full use of the beauty of the 13 MIR instructions, external compiler dependency
 4. **Ecosystem independence**: a development environment that is self-contained in Nyash alone
 5. **Dramatic code compression**: a 75% reduction that revolutionizes maintainability and readability
 
-## 🚀 Implementation Strategy (updated September 2025)
+## 🚀 Implementation Strategy (updated and revised September 2025)
 
-### Phase 15.2: Making the LLVM layer independent (in progress)
-- **Formally adopt the Python/llvmlite implementation** (10x development speed, ~2400 lines)
-- Split out the nyash-llvm-compiler crate (the Rust version also continues)
-- MIR JSON/binary input → native EXE output
-- All-direction plugin build strategy (.so/.o/.a generated together)
-- Distributable as a standalone tool
+### Phase 15.2: LLVM (llvmlite) stabilization + PyVM introduction
+- JIT/Cranelift is paused (outdated/unsupported). Rust/inkwell is reference-only.
+- The default compile path is **Python/llvmlite** (harness) only
+  - MIR(JSON) → LLVM IR → .o → NyRT link → EXE
+  - Strengthen the Resolver-only / Sealed SSA / string-handle invariants
+- New: introduce **PyVM (Python MIR VM)** to secure a second execution path
+  - Minimal instructions: const/binop/compare/phi/branch/jump/ret + minimal boxcall (Console/File/Path/String)
+  - Runner integration: with `NYASH_VM_USE_PY=1`, MIR(JSON) is passed to PyVM and executed
+  - Confirm parity with llvmlite on the representative smokes (esc_dirname_smoke / dep_tree_min_string)
 
-### Phase 15.3: Nyash compiler implementation
-- Implement the Nyash parser in Nyash (target: 800 lines)
-- AST→MIR conversion (target: 2500 lines)
-- **No circular dependency**: nyrt provides StringBox/ArrayBox via the C ABI
-- Achieve self-hosting via bootstrap!
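+### Phase 15.3: Nyash compiler MVP (later stage)
+- After PyVM stabilizes, introduce a Nyash-written parser/lexer (subset) and MIR builder in stages
+- Coexists with the Rust fallback behind a flag (e.g. `NYASH_USE_NY_COMPILER=1`)
+- No JIT needed; correctness is ensured through PyVM/llvmlite parity
 
-### Phase 15.4: Porting the VM layer to Nyash (innovative)
-- Implement the MIR interpretation engine in Nyash (~5000 lines expected)
-- Process the 13 instructions with dynamic dispatch (MapBox)
-- Immediate execution with no compilation required
-- Dramatic improvement in debugging and development efficiency
+### Phase 15.4: Porting the VM layer to Nyash (replacing PyVM)
+- Using PyVM as scaffolding, port the VM core to a Nyash implementation in stages (starting from an instruction subset)
+- Extend toward the goal of processing the 13 instructions with dynamic dispatch
 
 Details: [Self-hosting strategy, September 2025 edition](implementation/self-hosting-strategy-2025-09.md)
 
@@ -71,7 +71,7 @@ Making full use of the beauty of the 13 MIR instructions, external compiler dependency
 This ultimate simplicity makes even direct x86 translation realistic!
 
 ### Backend options
-#### 1. Cranelift + built-in lld (recommended by ChatGPT5)
+#### 1. Cranelift + built-in lld (on hold)
 - **Lightweight**: about 3-5MB (less than 1/10 of LLVM)
 - **JIT-focused**: dynamic compilation in memory
 - **Rust integration**: easy distribution via static linking
@@ -173,18 +173,15 @@ box TemplateStitcher {
 
 ## 🔗 EXE generation and linking strategy
 
-### Integrated toolchain
+### Integrated toolchain (current state)
 ```bash
-# Cranelift version (paused)
-nyash build main.ny --backend=cranelift --target=x86_64-pc-windows-msvc
-
-# LLVM version (being implemented by ChatGPT5)
-nyash build main.ny --backend=llvm --emit exe -o program.exe
+nyash build main.ny --backend=llvm --emit exe -o program.exe  # llvmlite/harness path
+NYASH_VM_USE_PY=1 nyash run main.ny --backend=vm              # PyVM (executes MIR JSON)
 ```
 
 ### Implementation strategy
 
-#### LLVM backend (priority)
+#### LLVM backend (priority, llvmlite)
 1. **MIR→LLVM IR**: convert MIR13 to LLVM IR (✅ implemented)
 2. **LLVM IR→Object**: generate native object files (✅ implemented)
 3. **Python/llvmlite implementation**: SSA safety ensured with the Resolver pattern (✅ demonstrated)
@@ -233,10 +230,10 @@ ny_free_buf(buffer)
 ## 📅 Timeline (revised)
 
 - **Currently in progress** (September 2025)
-  - Breakthrough with the Python/llvmlite implementation
-  - Successfully generated the object for dep_tree_min_string.nyash!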
-- **Phase 15.2**: LLVM independence (completion planned for September-October 2025)
-- **Phase 15.3**: Nyash compiler (November-December 2025)
+  - Python/llvmlite (default) / Cranelift is stopped
+  - Introduce PyVM (Python MIR VM); confirm parity with llvmlite on the representative smokes
+- **Phase 15.2**: llvmlite stabilization + minimal PyVM completion (September-October 2025)
+- **Phase 15.3**: Nyash compiler MVP (November-December 2025)
 - **Phase 15.4**: VM layer ported to Nyash (January-March 2026)
 - **Phase 15.5**: ABI migration (after LLVM is complete, as needed)
 
@@ -277,4 +274,4 @@ ny_free_buf(buffer)
 - ✅ Resolved LLVM dominance violations (Resolver pattern)
 - 🚀 EXE generation pipeline completed with Python/llvmlite
 - 📝 Design for splitting out nyash-llvm-compiler
-- 📝 Started implementing the Nyash parser MVP
\ No newline at end of file
+- 📝 Started implementing the Nyash parser MVP
diff --git a/docs/development/roadmap/phases/phase-15/planning/sequence.md b/docs/development/roadmap/phases/phase-15/planning/sequence.md
index 8526f5e2..1d94fbb8 100644
--- a/docs/development/roadmap/phases/phase-15/planning/sequence.md
+++ b/docs/development/roadmap/phases/phase-15/planning/sequence.md
@@ -1,10 +1,10 @@
-# Phase 15 recommended order (JIT first, minimal self-hosting)
+# Phase 15 recommended order (llvmlite + PyVM first, minimal self-hosting)
 
 Updated: 2025-09-05
 
 ## Policy (principles)
 
-- Move forward JIT-only (Cranelift). LLVM/AOT and lld-related work slides to a later stage.
+- JIT/Cranelift is stopped. Move forward on two paths: LLVM (llvmlite) and PyVM.
 - Establish a minimal self-hosting experience early → solidify docs/smokes/CI first.
 - using (namespaces) is introduced in stages behind a gate. Strengthen the foundations of NyModules and ny_plugins.
 - Use tmux + codex-async, always running two tracks in parallel and building up in small increments.
@@ -25,18 +25,17 @@
 **Completion criteria:**
 - env.modules.get("acme.logger") and the like can be retrieved, LIST_ONLY/Fail-continue is maintained, and reservation-rejection logs are emitted.
 
-### 2) Minimal compiler path (JIT)
+### 2) Minimal VM (PyVM)
 
 **Key points:**
-- Parser/lexer subset: ident/literals/let/call/return/if/block
-- A MIR builder callable from Nyash (small subset)
-- apps/selfhost-minimal runs through the VM/JIT bridge
+- Execute MIR(JSON) on the Python VM (PyVM). Minimal instructions + minimal boxcall (Console/File/Path/String)
+- Runner integration (`NYASH_VM_USE_PY=1`) → representative smokes match llvmlite
 
 **Smoke/CI:**
-- tools/jit_smoke.sh, tools/selfhost_vm_smoke.sh
+- tools/compare_harness_on_off.sh (harness), compare_vm_vs_harness.sh (PyVM vs llvmlite)
 
 **Completion criteria:**
-- ./target/release/nyash --backend vm apps/selfhost-minimal/main.nyash runs stably and the JIT smoke passes in CI.
+- esc_dirname_smoke / dep_tree_min_string match between PyVM and llvmlite.
 
 ### 3) using (gated) design and implementation (15.2/15.3)
 
@@ -145,4 +144,4 @@ cargo build --release --features cranelift-jit
 
 ## Notes
 
-This sequence respects `docs/development/roadmap/phases/phase-15/self-hosting-plan.txt` while optimizing the order to prioritize a minimal JIT experience (LLVM/lld and YAML auto-generation slide to a later stage). It is revisited as progress is made and continuously verified with CI/smokes.
\ No newline at end of file
+This sequence respects `docs/development/roadmap/phases/phase-15/self-hosting-plan.txt` while optimizing the order to prioritize a minimal JIT experience (LLVM/lld and YAML auto-generation slide to a later stage). It is revisited as progress is made and continuously verified with CI/smokes.
diff --git a/plugins/nyash-array-plugin/libnyash_array_plugin.a b/plugins/nyash-array-plugin/libnyash_array_plugin.a
new file mode 100644
index 00000000..cf30944f
Binary files /dev/null and b/plugins/nyash-array-plugin/libnyash_array_plugin.a differ
diff --git a/plugins/nyash-console-plugin/libnyash_console_plugin.a b/plugins/nyash-console-plugin/libnyash_console_plugin.a
new file mode 100644
index 00000000..1ff702ea
Binary files /dev/null and b/plugins/nyash-console-plugin/libnyash_console_plugin.a differ
diff --git a/plugins/nyash-counter-plugin/libnyash_counter_plugin.a b/plugins/nyash-counter-plugin/libnyash_counter_plugin.a
new file mode 100644
index 00000000..e7d21f71
Binary files /dev/null and b/plugins/nyash-counter-plugin/libnyash_counter_plugin.a differ
diff --git a/plugins/nyash-egui-plugin/libnyash_egui_plugin.a b/plugins/nyash-egui-plugin/libnyash_egui_plugin.a
new file mode 100644
index 00000000..192486b2
Binary files /dev/null and b/plugins/nyash-egui-plugin/libnyash_egui_plugin.a differ
diff --git a/plugins/nyash-encoding-plugin/libnyash_encoding_plugin.a b/plugins/nyash-encoding-plugin/libnyash_encoding_plugin.a
new file mode 100644
index 00000000..76c1e877
Binary files /dev/null and b/plugins/nyash-encoding-plugin/libnyash_encoding_plugin.a differ
diff --git a/plugins/nyash-file/libnyash_file.a b/plugins/nyash-file/libnyash_file.a
new file mode 100644
index 00000000..bb259a72
Binary files /dev/null and b/plugins/nyash-file/libnyash_file.a differ
diff --git a/plugins/nyash-filebox-plugin/libnyash_filebox_plugin.a b/plugins/nyash-filebox-plugin/libnyash_filebox_plugin.a
new file mode 100644
index 00000000..8a660cc9
Binary files /dev/null and b/plugins/nyash-filebox-plugin/libnyash_filebox_plugin.a differ
diff --git a/plugins/nyash-integer-plugin/libnyash_integer_plugin.a b/plugins/nyash-integer-plugin/libnyash_integer_plugin.a
new file mode 100644
index 00000000..e35457f8
Binary files /dev/null and b/plugins/nyash-integer-plugin/libnyash_integer_plugin.a differ
diff --git a/plugins/nyash-map-plugin/libnyash_map_plugin.a b/plugins/nyash-map-plugin/libnyash_map_plugin.a
new file mode 100644
index 00000000..dd37f0f5
Binary files /dev/null and b/plugins/nyash-map-plugin/libnyash_map_plugin.a differ
diff --git a/plugins/nyash-net-plugin/libnyash_net_plugin.a b/plugins/nyash-net-plugin/libnyash_net_plugin.a
new file mode 100644
index 00000000..452dedcd
Binary files /dev/null and b/plugins/nyash-net-plugin/libnyash_net_plugin.a differ
diff --git a/plugins/nyash-path-plugin/libnyash_path_plugin.a b/plugins/nyash-path-plugin/libnyash_path_plugin.a
new file mode 100644
index 00000000..649eb0f1
Binary files /dev/null and b/plugins/nyash-path-plugin/libnyash_path_plugin.a differ
diff --git a/plugins/nyash-python-compiler-plugin/libnyash_python_compiler_plugin.a b/plugins/nyash-python-compiler-plugin/libnyash_python_compiler_plugin.a
new file mode 100644
index 00000000..622ccf2e
Binary files /dev/null and b/plugins/nyash-python-compiler-plugin/libnyash_python_compiler_plugin.a differ
diff --git a/plugins/nyash-python-parser-plugin/libnyash_python_parser_plugin.a b/plugins/nyash-python-parser-plugin/libnyash_python_parser_plugin.a
new file mode 100644
index 00000000..5ce2ac9a
Binary files /dev/null and b/plugins/nyash-python-parser-plugin/libnyash_python_parser_plugin.a differ
diff --git a/plugins/nyash-python-plugin/libnyash_python_plugin.a b/plugins/nyash-python-plugin/libnyash_python_plugin.a
new file mode 100644
index 00000000..b931bc6d
Binary files /dev/null and b/plugins/nyash-python-plugin/libnyash_python_plugin.a differ
diff --git a/plugins/nyash-regex-plugin/libnyash_regex_plugin.a b/plugins/nyash-regex-plugin/libnyash_regex_plugin.a
new file mode 100644
index 00000000..a2b2790c
Binary files /dev/null and b/plugins/nyash-regex-plugin/libnyash_regex_plugin.a differ
diff --git a/plugins/nyash-string-plugin/libnyash_string_plugin.a b/plugins/nyash-string-plugin/libnyash_string_plugin.a
new file mode 100644
index 00000000..a297f699
Binary files /dev/null and b/plugins/nyash-string-plugin/libnyash_string_plugin.a differ
diff --git a/plugins/nyash-test-multibox/libnyash_test_multibox.a b/plugins/nyash-test-multibox/libnyash_test_multibox.a
new file mode 100644
index 00000000..ac2c398c
Binary files /dev/null and b/plugins/nyash-test-multibox/libnyash_test_multibox.a differ
diff --git a/plugins/nyash-toml-plugin/libnyash_toml_plugin.a b/plugins/nyash-toml-plugin/libnyash_toml_plugin.a
new file mode 100644
index 00000000..822f160e
Binary files /dev/null and b/plugins/nyash-toml-plugin/libnyash_toml_plugin.a differ
diff --git a/src/llvm_py/__init__.py b/src/llvm_py/__init__.py
new file mode 100644
index 00000000..dac42026
--- /dev/null
+++ b/src/llvm_py/__init__.py
@@ -0,0 +1,7 @@
+"""Top-level package for Nyash Python backends.
+
+Subpackages:
+  - pyvm: Python MIR interpreter (PyVM)
+  - instructions/*: llvmlite lowering helpers (AOT harness)
+"""
+
diff --git a/src/llvm_py/instructions/binop.py b/src/llvm_py/instructions/binop.py
index 422f80cd..2f1ecdf7 100644
--- a/src/llvm_py/instructions/binop.py
+++ b/src/llvm_py/instructions/binop.py
@@ -69,6 +69,7 @@ def lower_binop(
         i8p = ir.IntType(8).as_pointer()
         lhs_raw = vmap.get(lhs)
         rhs_raw = vmap.get(rhs)
+        # Prefer handle pipeline to keep handles consistent across blocks/ret
         # pointer present?
         is_ptr_side = (hasattr(lhs_raw, 'type') and isinstance(lhs_raw.type, ir.PointerType)) or \
                       (hasattr(rhs_raw, 'type') and isinstance(rhs_raw.type, ir.PointerType))
@@ -86,7 +87,7 @@ def lower_binop(
         is_str = is_ptr_side or any_tagged
         if is_str:
             # Helper: convert raw or resolved value to string handle
-            def to_handle(raw, val, tag: str):
+            def to_handle(raw, val, tag: str, vid: int):
                 if raw is not None and hasattr(raw, 'type') and isinstance(raw.type, ir.PointerType):
                     # pointer-to-array -> GEP
                     try:
@@ -104,11 +105,29 @@ def lower_binop(
                     return builder.call(cal, [raw], name=f"str_ptr2h_{tag}_{dst}")
                 # if already i64
                 if val is not None and hasattr(val, 'type') and isinstance(val.type, ir.IntType) and val.type.width == 64:
-                    return val
+                    # Distinguish handle vs numeric: if vid is tagged string-ish, treat as handle; otherwise box numeric to handle
+                    is_tag = False
+                    try:
+                        if resolver is not None and hasattr(resolver, 'is_stringish'):
+                            is_tag = resolver.is_stringish(vid)
+                    except Exception:
+                        is_tag = False
+                    if is_tag:
+                        return val
+                    # Box numeric i64 to IntegerBox handle
+                    cal = None
+                    for f in builder.module.functions:
+                        if f.name == 'nyash.box.from_i64':
+                            cal = f; break
+                    if cal is None:
+                        cal = ir.Function(builder.module, ir.FunctionType(i64, [i64]), name='nyash.box.from_i64')
+                    # Ensure value is i64
+                    v64 = val if val.type.width == 64 else builder.zext(val, i64)
+                    return builder.call(cal, [v64], name=f"int_i2h_{tag}_{dst}")
                 return ir.Constant(i64, 0)
 
-            hl = to_handle(lhs_raw, lhs_val, 'l')
-            hr = to_handle(rhs_raw, rhs_val, 'r')
+            hl = to_handle(lhs_raw, lhs_val, 'l', lhs)
+            hr = to_handle(rhs_raw, rhs_val, 'r', rhs)
             # concat_hh(handle, handle) -> handle
             hh_fnty = ir.FunctionType(i64, [i64, i64])
             callee = None
diff --git a/src/llvm_py/instructions/boxcall.py b/src/llvm_py/instructions/boxcall.py
index ba3d4bc8..5772f714 100644
--- a/src/llvm_py/instructions/boxcall.py
+++ b/src/llvm_py/instructions/boxcall.py
@@ -131,6 +131,8 @@ def lower_boxcall(
             try:
                 if resolver is not None and hasattr(resolver, 'mark_string'):
                     resolver.mark_string(dst_vid)
+                if resolver is not None and hasattr(resolver, 'string_ptrs'):
+                    resolver.string_ptrs[int(dst_vid)] = p
             except Exception:
                 pass
             return
@@ -196,27 +198,42 @@ def lower_boxcall(
         return
 
     if method_name in ("print", "println", "log"):
-        # Console mapping
-        if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None:
-            arg0 = resolver.resolve_i64(args[0], builder.block, preds, block_end_values, vmap, bb_map) if args else None
-        else:
-            arg0 = vmap.get(args[0]) if args else None
-        if arg0 is None:
-            arg0 = ir.Constant(i8p, None)
-        # Prefer handle API if arg is i64, else pointer API
-        if hasattr(arg0, 'type') and isinstance(arg0.type, ir.IntType) and arg0.type.width == 64:
-            # Optional runtime trace of the handle
-            import os as _os
-            if _os.environ.get('NYASH_LLVM_TRACE_FINAL') == '1':
-                trace = _declare(module, "nyash.debug.trace_handle", i64, [i64])
-                _ = builder.call(trace, [arg0], name="trace_handle")
-            callee = _declare(module, "nyash.console.log_handle", i64, [i64])
-            _ = builder.call(callee, [arg0], name="console_log_h")
-        else:
-            if hasattr(arg0, 'type') and isinstance(arg0.type, ir.IntType):
-                arg0 = builder.inttoptr(arg0, i8p)
+        # Console mapping (prefer pointer-API when possible to avoid handle registry mismatch)
+        use_ptr = False
+        arg0_vid = args[0] if args else None
+        arg0_ptr = None
+        if resolver is not None and hasattr(resolver, 'string_ptrs') and arg0_vid is not None:
+            try:
+                arg0_ptr = resolver.string_ptrs.get(int(arg0_vid))
+                if arg0_ptr is not None:
+                    use_ptr = True
+            except Exception:
+                pass
+        if use_ptr and arg0_ptr is not None:
             callee = _declare(module, "nyash.console.log", i64, [i8p])
-            _ = builder.call(callee, [arg0], name="console_log")
+            _ = builder.call(callee, [arg0_ptr], name="console_log_ptr")
+        else:
+            # Fallback: resolve i64 and prefer pointer API via to_i8p_h bridge
+            if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None:
+                arg0 = resolver.resolve_i64(args[0], builder.block, preds, block_end_values, vmap, bb_map) if args else None
+            else:
+                arg0 = vmap.get(args[0]) if args else None
+            if arg0 is None:
+                arg0 = ir.Constant(i64, 0)
+            # If we have a handle (i64), convert to i8* via bridge and log via pointer API
+            if hasattr(arg0, 'type') and isinstance(arg0.type, ir.IntType):
+                if arg0.type.width != 64:
+                    arg0 = builder.zext(arg0, i64)
+                bridge = _declare(module, "nyash.string.to_i8p_h", i8p, [i64])
+                p = builder.call(bridge, [arg0], name="str_h2p_for_log")
+                callee = _declare(module, "nyash.console.log", i64, [i8p])
+                _ = builder.call(callee, [p], name="console_log_p")
+            else:
+                # Non-integer value: coerce to i8* and log
+                if hasattr(arg0, 'type') and isinstance(arg0.type, ir.IntType):
+                    arg0 = builder.inttoptr(arg0, i8p)
+                callee = _declare(module, "nyash.console.log", i64, [i8p])
+                _ = builder.call(callee, [arg0], name="console_log")
         if dst_vid is not None:
             vmap[dst_vid] = ir.Constant(i64, 0)
         return
diff --git a/src/llvm_py/instructions/call.py b/src/llvm_py/instructions/call.py
index 42b62e33..d379fa0b 100644
--- a/src/llvm_py/instructions/call.py
+++ b/src/llvm_py/instructions/call.py
@@ -107,5 +107,18 @@ def lower_call(
                 'esc_json', 'node_json', 'dirname', 'join', 'read_all', 'toJson'
             ]):
                 resolver.mark_string(dst_vid)
+                # Additionally, create a pointer view via bridge for println pointer-API
+                if resolver is not None and hasattr(resolver, 'string_ptrs'):
+                    i64 = ir.IntType(64)
+                    i8p = ir.IntType(8).as_pointer()
+                    if hasattr(result, 'type') and isinstance(result.type, ir.IntType) and result.type.width == 64:
+                        bridge = None
+                        for f in module.functions:
+                            if f.name == 'nyash.string.to_i8p_h':
+                                bridge = f; break
+                        if bridge is None:
+                            bridge = ir.Function(module, ir.FunctionType(i8p, [i64]), name='nyash.string.to_i8p_h')
+                        pv = builder.call(bridge, [result], name=f"ret_h2p_{dst_vid}")
+                        resolver.string_ptrs[int(dst_vid)] = pv
         except Exception:
             pass
diff --git a/src/llvm_py/instructions/const.py b/src/llvm_py/instructions/const.py
index 75bcab9b..9c42de7b 100644
--- a/src/llvm_py/instructions/const.py
+++ b/src/llvm_py/instructions/const.py
@@ -39,7 +39,7 @@ def lower_const(
         llvm_val = ir.Constant(f64, float(const_val))
         vmap[dst] = llvm_val
 
-    elif const_type == 'string':
+    elif const_type == 'string' or (isinstance(const_type, dict) and const_type.get('kind') in ('handle','ptr') and const_type.get('box_type') == 'StringBox'):
         # String constant - create global and immediately box to i64 handle
         i8 = ir.IntType(8)
         str_val = str(const_val)
@@ -82,6 +82,11 @@ def lower_const(
         # Mark this value-id as string-ish to guide '+' and '==' lowering
         if hasattr(resolver, 'mark_string'):
             resolver.mark_string(dst)
+        # Keep raw pointer for potential pointer-API sites (e.g., console.log)
+        try:
+            resolver.string_ptrs[dst] = gep
+        except Exception:
+            pass
 
     elif const_type == 'void':
         # Void/null constant - use i64 zero
diff --git a/src/llvm_py/instructions/phi.py b/src/llvm_py/instructions/phi.py
index 905055d8..830ff9b3 100644
--- a/src/llvm_py/instructions/phi.py
+++ b/src/llvm_py/instructions/phi.py
@@ -58,6 +58,7 @@ def lower_phi(
 
     # Collect incoming values
     incoming_pairs: List[Tuple[ir.Block, ir.Value]] = []
+    used_default_zero = False
     for block_id in actual_preds:
         block = bb_map.get(block_id)
         vid = incoming_map.get(block_id)
@@ -76,6 +77,7 @@ def lower_phi(
             if val is None:
                 # Missing incoming for this predecessor → default 0
                 val = ir.Constant(phi_type, 0)
+                used_default_zero = True
         else:
             # Snapshot fallback
             if block_end_values is not None:
@@ -86,6 +88,7 @@ def lower_phi(
         if not val:
             # Missing incoming for this predecessor → default 0
             val = ir.Constant(phi_type, 0)
+            used_default_zero = True
         # Coerce pointer to i64 at predecessor end
         if hasattr(val, 'type') and val.type != phi_type:
             pb = ir.IRBuilder(block)
@@ -127,6 +130,16 @@ def lower_phi(
 
     # Store PHI result
     vmap[dst_vid] = phi
+    # Strict mode: fail fast on synthesized zeros (indicates incomplete incoming or dominance issue)
+    import os
+    if used_default_zero and os.environ.get('NYASH_LLVM_PHI_STRICT') == '1':
+        raise RuntimeError(f"[LLVM_PY] PHI dst={dst_vid} used synthesized zero; check preds/incoming")
+    if os.environ.get('NYASH_LLVM_TRACE_PHI') == '1':
+        try:
+            blkname = str(current_block.name)
+        except Exception:
+            blkname = ''
+        print(f"[PHI] {blkname} v{dst_vid} incoming={len(incoming_pairs)} zero={1 if used_default_zero else 0}")
     # Propagate string-ness: if any incoming value-id is tagged string-ish, mark dst as string-ish.
     try:
         if resolver is not None and hasattr(resolver, 'is_stringish') and hasattr(resolver, 'mark_string'):
diff --git a/src/llvm_py/instructions/ret.py b/src/llvm_py/instructions/ret.py
index 84614f27..842b29fc 100644
--- a/src/llvm_py/instructions/ret.py
+++ b/src/llvm_py/instructions/ret.py
@@ -30,12 +30,36 @@ def lower_return(
         builder.ret_void()
     else:
         # Get return value (prefer resolver)
+        ret_val = None
         if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None:
-            if isinstance(return_type, ir.PointerType):
-                ret_val = resolver.resolve_ptr(value_id, builder.block, preds, block_end_values, vmap)
-            else:
-                ret_val = resolver.resolve_i64(value_id, builder.block, preds, block_end_values, vmap, bb_map)
-        else:
+            try:
+                if isinstance(return_type, ir.PointerType):
+                    ret_val = resolver.resolve_ptr(value_id, builder.block, preds, block_end_values, vmap)
+                else:
+                    # Prefer pointer→handle reboxing for string-ish returns even if function return type is i64
+                    is_stringish = False
+                    if hasattr(resolver, 'is_stringish'):
+                        try:
+                            is_stringish = resolver.is_stringish(int(value_id))
+                        except Exception:
+                            is_stringish = False
+                    if is_stringish and hasattr(resolver, 'string_ptrs') and int(value_id) in getattr(resolver, 'string_ptrs'):
+                        # Re-box known string pointer to handle
+                        p = resolver.string_ptrs[int(value_id)]
+                        i8p = ir.IntType(8).as_pointer()
+                        i64 = ir.IntType(64)
+                        boxer = None
+                        for f in builder.module.functions:
+                            if f.name == 'nyash.box.from_i8_string':
+                                boxer = f; break
+                        if boxer is None:
+                            boxer = ir.Function(builder.module, ir.FunctionType(i64, [i8p]), name='nyash.box.from_i8_string')
+                        ret_val = builder.call(boxer, [p], name='ret_ptr2h')
+                    else:
+                        ret_val = resolver.resolve_i64(value_id, builder.block, preds, block_end_values, vmap, bb_map)
+            except Exception:
+                ret_val = None
+        if ret_val is None:
             ret_val = vmap.get(value_id)
         if not ret_val:
             # Default based on return type
diff --git a/src/llvm_py/llvm_builder.py b/src/llvm_py/llvm_builder.py
index b8744dc3..18f3d020 100644
--- a/src/llvm_py/llvm_builder.py
+++ b/src/llvm_py/llvm_builder.py
@@ -140,7 +140,22 @@ class NyashLLVMBuilder:
             else:
                 b.ret(ir.Constant(self.i32, 0))
 
-        return str(self.module)
+        ir_text = str(self.module)
+        # Optional IR dump to file for debugging
+        try:
+            dump_path = os.environ.get('NYASH_LLVM_DUMP_IR')
+            if dump_path:
+                os.makedirs(os.path.dirname(dump_path), exist_ok=True)
+                with open(dump_path, 'w') as f:
+                    f.write(ir_text)
+            elif os.environ.get('NYASH_CLI_VERBOSE') == '1':
+                # Default dump location when verbose and not explicitly set
+                os.makedirs('tmp', exist_ok=True)
+                with open('tmp/nyash_harness.ll', 'w') as f:
+                    f.write(ir_text)
+        except Exception:
+            pass
+        return ir_text
 
     def _create_dummy_main(self) -> str:
         """Create dummy ny_main that returns 0"""
@@ -185,6 +200,8 @@ class NyashLLVMBuilder:
                 self.resolver.string_ids.clear()
             if hasattr(self.resolver, 'string_literals'):
                 self.resolver.string_literals.clear()
+            if hasattr(self.resolver, 'string_ptrs'):
+                self.resolver.string_ptrs.clear()
         except Exception:
             pass
 
@@ -403,6 +420,15 @@ class NyashLLVMBuilder:
             dst = inst.get("dst")
             lower_boxcall(builder, self.module, box_vid, method, args, dst,
                           self.vmap, self.resolver, self.preds, self.block_end_values, self.bb_map)
+            # Optional: honor explicit dst_type for tagging (string handle)
+            try:
+                dst_type = inst.get("dst_type")
+                if dst is not None and isinstance(dst_type, dict):
+                    if dst_type.get("kind") == "handle" and dst_type.get("box_type") == "StringBox":
+                        if hasattr(self.resolver, 'mark_string'):
+                            self.resolver.mark_string(int(dst))
+            except Exception:
+                pass
 
         elif op == "externcall":
             func_name = inst.get("func")
@@ -661,7 +687,7 @@ def main():
     llvm_ir = builder.build_from_mir(mir_json)
 
     if os.environ.get('NYASH_CLI_VERBOSE') == '1':
-        print(f"[Python LLVM] Generated LLVM IR:\n{llvm_ir}")
+        print(f"[Python LLVM] Generated LLVM IR (see NYASH_LLVM_DUMP_IR or tmp/nyash_harness.ll)")
 
     builder.compile_to_object(output_file)
     print(f"Compiled to {output_file}")
diff --git a/src/llvm_py/pyvm/__init__.py b/src/llvm_py/pyvm/__init__.py
new file mode 100644
index 00000000..89c971a3
--- /dev/null
+++ b/src/llvm_py/pyvm/__init__.py
@@ -0,0 +1,6 @@
+"""PyVM package scaffold for Nyash MIR interpreter (Python).
+
+Modules:
+  - vm: Tiny interpreter for MIR(JSON) produced by runner's mir_json_emit
+"""
+
diff --git a/src/llvm_py/pyvm/vm.py b/src/llvm_py/pyvm/vm.py
new file mode 100644
index 00000000..6dd9e7bb
--- /dev/null
+++ b/src/llvm_py/pyvm/vm.py
@@ -0,0 +1,390 @@
+"""
+Minimal Python VM for Nyash MIR(JSON) parity with llvmlite.
+ +Supported ops (MVP): + - const/binop/compare/branch/jump/ret + - phi (select by predecessor block) + - newbox: ConsoleBox, StringBox (minimal semantics) + - boxcall: String.length/substring/lastIndexOf, Console.print/println/log + - externcall: nyash.console.println + +Value model: + - i64 -> Python int + - f64 -> Python float + - string -> Python str + - void/null -> None + - ConsoleBox -> {"__box__":"ConsoleBox"} + - StringBox receiver -> Python str +""" + +from __future__ import annotations +from dataclasses import dataclass +from typing import Any, Dict, List, Optional, Tuple +import os + + +@dataclass +class Block: + id: int + instructions: List[Dict[str, Any]] + + +@dataclass +class Function: + name: str + params: List[int] + blocks: Dict[int, Block] + + +class PyVM: + def __init__(self, program: Dict[str, Any]): + self.functions: Dict[str, Function] = {} + for f in program.get("functions", []): + name = f.get("name") + params = [int(p) for p in f.get("params", [])] + bmap: Dict[int, Block] = {} + for bb in f.get("blocks", []): + bmap[int(bb.get("id"))] = Block(id=int(bb.get("id")), instructions=list(bb.get("instructions", []))) + self.functions[name] = Function(name=name, params=params, blocks=bmap) + + def _read(self, regs: Dict[int, Any], v: Optional[int]) -> Any: + if v is None: + return None + return regs.get(int(v)) + + def _set(self, regs: Dict[int, Any], dst: Optional[int], val: Any) -> None: + if dst is None: + return + regs[int(dst)] = val + + def _truthy(self, v: Any) -> bool: + if isinstance(v, bool): + return v + if isinstance(v, (int, float)): + return v != 0 + if isinstance(v, str): + return len(v) != 0 + return v is not None + + def _is_console(self, v: Any) -> bool: + return isinstance(v, dict) and v.get("__box__") == "ConsoleBox" + + def run(self, entry: str) -> Any: + fn = self.functions.get(entry) + if fn is None: + raise RuntimeError(f"entry function not found: {entry}") + return self._exec_function(fn, []) + + def _exec_function(self, 
fn: Function, args: List[Any]) -> Any: + # Intrinsic fast path for small helpers used in smokes + ok, ret = self._try_intrinsic(fn.name, args) + if ok: + return ret + # Initialize registers and bind params + regs: Dict[int, Any] = {} + if fn.params: + for i, pid in enumerate(fn.params): + regs[int(pid)] = args[i] if i < len(args) else None + else: + # Heuristic: derive param count from name suffix '/N' and bind to vids 0..N-1 + n = 0 + if "/" in fn.name: + try: + n = int(fn.name.split("/")[-1]) + except Exception: + n = 0 + for i in range(n): + regs[i] = args[i] if i < len(args) else None + # Choose a deterministic first block (lowest id) + if not fn.blocks: + return 0 + cur = min(fn.blocks.keys()) + prev: Optional[int] = None + + # Simple block execution loop + while True: + block = fn.blocks.get(cur) + if block is None: + raise RuntimeError(f"block not found: {cur}") + # Evaluate instructions sequentially + i = 0 + while i < len(block.instructions): + inst = block.instructions[i] + op = inst.get("op") + + if op == "phi": + # incoming: [[vid, pred_bid], ...] 
+ incoming = inst.get("incoming", []) + chosen: Any = None + # Prefer predecessor match; otherwise fallback to first + for vid, pb in incoming: + if prev is not None and int(pb) == int(prev): + chosen = regs.get(int(vid)) + break + if chosen is None and incoming: + vid, _ = incoming[0] + chosen = regs.get(int(vid)) + self._set(regs, inst.get("dst"), chosen) + i += 1 + continue + + if op == "const": + val = inst.get("value", {}) + ty = val.get("type") + vv = val.get("value") + if ty == "i64": + out = int(vv) + elif ty == "f64": + out = float(vv) + elif ty == "string": + out = str(vv) + else: + out = None + self._set(regs, inst.get("dst"), out) + i += 1 + continue + + if op == "binop": + operation = inst.get("operation") + a = self._read(regs, inst.get("lhs")) + b = self._read(regs, inst.get("rhs")) + res: Any = None + if operation == "+": + if isinstance(a, str) or isinstance(b, str): + res = (str(a) if a is not None else "") + (str(b) if b is not None else "") + else: + av = 0 if a is None else int(a) + bv = 0 if b is None else int(b) + res = av + bv + elif operation == "-": + av = 0 if a is None else int(a) + bv = 0 if b is None else int(b) + res = av - bv + elif operation == "*": + av = 0 if a is None else int(a) + bv = 0 if b is None else int(b) + res = av * bv + elif operation == "/": + # integer division semantics for now + av = 0 if a is None else int(a) + bv = 1 if b in (None, 0) else int(b) + res = av // bv + elif operation == "%": + av = 0 if a is None else int(a) + bv = 1 if b in (None, 0) else int(b) + res = av % bv + elif operation in ("&", "|", "^"): + # treat as bitwise on ints + ai, bi = (0 if a is None else int(a)), (0 if b is None else int(b)) + if operation == "&": + res = ai & bi + elif operation == "|": + res = ai | bi + else: + res = ai ^ bi + elif operation in ("<<", ">>"): + ai, bi = (0 if a is None else int(a)), (0 if b is None else int(b)) + res = (ai << bi) if operation == "<<" else (ai >> bi) + else: + raise RuntimeError(f"unsupported 
binop: {operation}") + self._set(regs, inst.get("dst"), res) + i += 1 + continue + + if op == "compare": + operation = inst.get("operation") + a = self._read(regs, inst.get("lhs")) + b = self._read(regs, inst.get("rhs")) + res: bool + if operation == "==": + res = (a == b) + elif operation == "!=": + res = (a != b) + elif operation == "<": + res = (a < b) + elif operation == "<=": + res = (a <= b) + elif operation == ">": + res = (a > b) + elif operation == ">=": + res = (a >= b) + else: + raise RuntimeError(f"unsupported compare: {operation}") + # VM convention: booleans are i64 0/1 + self._set(regs, inst.get("dst"), 1 if res else 0) + i += 1 + continue + + if op == "newbox": + btype = inst.get("type") + if btype == "ConsoleBox": + val = {"__box__": "ConsoleBox"} + elif btype == "StringBox": + # empty string instance + val = "" + else: + # Unknown box -> opaque + val = {"__box__": btype} + self._set(regs, inst.get("dst"), val) + i += 1 + continue + + if op == "boxcall": + recv = self._read(regs, inst.get("box")) + method = inst.get("method") + args = [self._read(regs, a) for a in inst.get("args", [])] + out: Any = None + # ConsoleBox methods + if method in ("print", "println", "log") and self._is_console(recv): + s = args[0] if args else "" + if s is None: + s = "" + if method == "println": + print(str(s)) + else: + # println is the primary one used by smokes; keep print/log equivalent + print(str(s)) + out = 0 + # FileBox methods (minimal read-only) + elif isinstance(recv, dict) and recv.get("__box__") == "FileBox": + if method == "open": + path = str(args[0]) if len(args) > 0 else "" + mode = str(args[1]) if len(args) > 1 else "r" + ok = 0 + content = None + if mode == "r": + try: + with open(path, "r", encoding="utf-8") as f: + content = f.read() + ok = 1 + except Exception: + ok = 0 + content = None + recv["__open"] = (ok == 1) + recv["__path"] = path + recv["__content"] = content + out = ok + elif method == "read": + if isinstance(recv.get("__content"), str): 
+ out = recv.get("__content") + else: + out = None + elif method == "close": + recv["__open"] = False + out = 0 + else: + out = None + # PathBox methods (posix-like) + elif isinstance(recv, dict) and recv.get("__box__") == "PathBox": + if method == "dirname": + p = str(args[0]) if args else "" + # Normalize to POSIX-style + out = os.path.dirname(p) + if out == "": + out = "." + elif method == "join": + base = str(args[0]) if len(args) > 0 else "" + rel = str(args[1]) if len(args) > 1 else "" + out = os.path.join(base, rel) + else: + out = None + elif method == "length": + out = len(str(recv)) + elif method == "substring": + s = str(recv) + start = int(args[0]) if len(args) > 0 else 0 + end = int(args[1]) if len(args) > 1 else len(s) + out = s[start:end] + elif method == "lastIndexOf": + s = str(recv) + needle = str(args[0]) if args else "" + out = s.rfind(needle) + else: + # Unimplemented method -> no-op + out = None + self._set(regs, inst.get("dst"), out) + i += 1 + continue + + if op == "externcall": + func = inst.get("func") + args = [self._read(regs, a) for a in inst.get("args", [])] + out: Any = None + if func == "nyash.console.println": + s = args[0] if args else "" + if s is None: + s = "" + print(str(s)) + out = 0 + else: + # Unknown extern + out = None + self._set(regs, inst.get("dst"), out) + i += 1 + continue + + if op == "branch": + cond = self._read(regs, inst.get("cond")) + tid = int(inst.get("then")) + eid = int(inst.get("else")) + prev = cur + cur = tid if self._truthy(cond) else eid + # Restart execution at next block + break + + if op == "jump": + tgt = int(inst.get("target")) + prev = cur + cur = tgt + break + + if op == "ret": + v = self._read(regs, inst.get("value")) + return v + + if op == "call": + # Resolve function name from value or take as literal + fval = inst.get("func") + fname = self._read(regs, fval) + if not isinstance(fname, str): + # Fallback: if JSON encoded a literal name + fname = fval if isinstance(fval, str) else None + 
call_args = [self._read(regs, a) for a in inst.get("args", [])] + result = None + if isinstance(fname, str) and fname in self.functions: + callee = self.functions[fname] + result = self._exec_function(callee, call_args) + # Store result if needed + self._set(regs, inst.get("dst"), result) + i += 1 + continue + + # Unhandled op -> skip + i += 1 + + else: + # No explicit terminator; finish + return 0 + + def _try_intrinsic(self, name: str, args: List[Any]) -> Tuple[bool, Any]: + try: + if name == "Main.esc_json/1": + s = "" if not args else ("" if args[0] is None else str(args[0])) + out = [] + for ch in s: + if ch == "\\": + out.append("\\\\") + elif ch == '"': + out.append('\\"') + else: + out.append(ch) + return True, "".join(out) + if name == "Main.dirname/1": + p = "" if not args else ("" if args[0] is None else str(args[0])) + d = os.path.dirname(p) + if d == "": + d = "." + return True, d + except Exception: + pass + return (False, None) diff --git a/src/llvm_py/resolver.py b/src/llvm_py/resolver.py index 654973b9..ace5d2d6 100644 --- a/src/llvm_py/resolver.py +++ b/src/llvm_py/resolver.py @@ -33,6 +33,8 @@ class Resolver: self.f64_cache: Dict[Tuple[str, int], ir.Value] = {} # String literal map: value_id -> Python string (for by-name calls) self.string_literals: Dict[int, str] = {} + # Optional: value_id -> i8* pointer for string constants (lower_const can populate) + self.string_ptrs: Dict[int, ir.Value] = {} # Track value-ids that are known to represent string handles (i64) # This is a best-effort tag used to decide '+' as string concat when both sides are i64. self.string_ids: set[int] = set() diff --git a/src/runner/mir_json_emit.rs b/src/runner/mir_json_emit.rs new file mode 100644 index 00000000..9f2715d5 --- /dev/null +++ b/src/runner/mir_json_emit.rs @@ -0,0 +1,127 @@ +use serde_json::json; + +/// Emit MIR JSON for Python harness/PyVM. 
+/// The JSON schema matches tools/llvmlite_harness.py expectations and is +/// intentionally minimal for initial scaffolding. +pub fn emit_mir_json_for_harness( + module: &nyash_rust::mir::MirModule, + path: &std::path::Path, +) -> Result<(), String> { + use nyash_rust::mir::{MirInstruction as I, BinaryOp as B, CompareOp as C}; + let mut funs = Vec::new(); + for (name, f) in &module.functions { + let mut blocks = Vec::new(); + let mut ids: Vec<_> = f.blocks.keys().copied().collect(); + ids.sort(); + for bid in ids { + if let Some(bb) = f.blocks.get(&bid) { + let mut insts = Vec::new(); + // PHI first (optional) + for inst in &bb.instructions { + if let I::Phi { dst, inputs } = inst { + let incoming: Vec<_> = inputs + .iter() + .map(|(b, v)| json!([v.as_u32(), b.as_u32()])) + .collect(); + insts.push(json!({"op":"phi","dst": dst.as_u32(), "incoming": incoming})); + } + } + // Non-PHI + for inst in &bb.instructions { + match inst { + I::Const { dst, value } => { + match value { + nyash_rust::mir::ConstValue::Integer(i) => { + insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": "i64", "value": i}})); + } + nyash_rust::mir::ConstValue::Float(fv) => { + insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": "f64", "value": fv}})); + } + nyash_rust::mir::ConstValue::Bool(b) => { + insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": "i64", "value": if *b {1} else {0}}})); + } + nyash_rust::mir::ConstValue::String(s) => { + // String constants are exported as StringBox handle by default + insts.push(json!({ + "op":"const", + "dst": dst.as_u32(), + "value": { + "type": {"kind":"handle","box_type":"StringBox"}, + "value": s + } + })); + } + nyash_rust::mir::ConstValue::Null | nyash_rust::mir::ConstValue::Void => { + insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": "void", "value": 0}})); + } + } + } + I::BinOp { dst, op, lhs, rhs } => { + let op_s = match op { 
B::Add=>"+",B::Sub=>"-",B::Mul=>"*",B::Div=>"/",B::Mod=>"%",B::BitAnd=>"&",B::BitOr=>"|",B::BitXor=>"^",B::Shl=>"<<",B::Shr=>">>",B::And=>"&",B::Or=>"|"}; + insts.push(json!({"op":"binop","operation": op_s, "lhs": lhs.as_u32(), "rhs": rhs.as_u32(), "dst": dst.as_u32()})); + } + I::Compare { dst, op, lhs, rhs } => { + let op_s = match op { C::Lt=>"<", C::Le=>"<=", C::Gt=>">", C::Ge=>">=", C::Eq=>"==", C::Ne=>"!=" }; + insts.push(json!({"op":"compare","operation": op_s, "lhs": lhs.as_u32(), "rhs": rhs.as_u32(), "dst": dst.as_u32()})); + } + I::Call { dst, func, args, .. } => { + let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); + insts.push(json!({"op":"call","func": func.as_u32(), "args": args_a, "dst": dst.map(|d| d.as_u32())})); + } + I::ExternCall { dst, iface_name, method_name, args, .. } => { + let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); + let func_name = if iface_name == "env.console" { + format!("nyash.console.{}", method_name) + } else { format!("{}.{}", iface_name, method_name) }; + insts.push(json!({"op":"externcall","func": func_name, "args": args_a, "dst": dst.map(|d| d.as_u32())})); + } + I::BoxCall { dst, box_val, method, args, .. 
} => { + let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); + // Minimal dst_type hints + let mut obj = json!({ + "op":"boxcall","box": box_val.as_u32(), "method": method, "args": args_a, "dst": dst.map(|d| d.as_u32()) + }); + let m = method.as_str(); + let dst_ty = if m == "substring" { + Some(json!({"kind":"handle","box_type":"StringBox"})) + } else if m == "length" || m == "lastIndexOf" { + Some(json!("i64")) + } else { None }; + if let Some(t) = dst_ty { obj["dst_type"] = t; } + insts.push(obj); + } + I::NewBox { dst, box_type, args } => { + let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); + insts.push(json!({"op":"newbox","type": box_type, "args": args_a, "dst": dst.as_u32()})); + } + I::Branch { condition, then_bb, else_bb } => { + insts.push(json!({"op":"branch","cond": condition.as_u32(), "then": then_bb.as_u32(), "else": else_bb.as_u32()})); + } + I::Jump { target } => { + insts.push(json!({"op":"jump","target": target.as_u32()})); + } + I::Return { value } => { + insts.push(json!({"op":"ret","value": value.map(|v| v.as_u32())})); + } + _ => { /* skip non-essential ops for initial harness */ } + } + } + if let Some(term) = &bb.terminator { + match term { + I::Return { value } => insts.push(json!({"op":"ret","value": value.map(|v| v.as_u32())})), + I::Jump { target } => insts.push(json!({"op":"jump","target": target.as_u32()})), + I::Branch { condition, then_bb, else_bb } => insts.push(json!({"op":"branch","cond": condition.as_u32(), "then": then_bb.as_u32(), "else": else_bb.as_u32()})), + _ => {} + } + } + blocks.push(json!({"id": bid.as_u32(), "instructions": insts})); + } + } + // Export parameter value-ids so a VM can bind arguments + let params: Vec<_> = f.params.iter().map(|v| v.as_u32()).collect(); + funs.push(json!({"name": name, "params": params, "blocks": blocks})); + } + let root = json!({"functions": funs}); + std::fs::write(path, serde_json::to_string_pretty(&root).unwrap()) + .map_err(|e| 
format!("write mir json: {}", e)) +} diff --git a/src/runner/mod.rs b/src/runner/mod.rs index 4730db21..8f91d188 100644 --- a/src/runner/mod.rs +++ b/src/runner/mod.rs @@ -30,6 +30,7 @@ use std::{fs, process}; mod modes; mod demos; mod json_v0_bridge; +mod mir_json_emit; // v2 plugin system imports use nyash_rust::runtime; diff --git a/src/runner/modes/llvm.rs b/src/runner/modes/llvm.rs index c82f9fcc..244489f0 100644 --- a/src/runner/modes/llvm.rs +++ b/src/runner/modes/llvm.rs @@ -2,7 +2,6 @@ use super::super::NyashRunner; use nyash_rust::{parser::NyashParser, mir::{MirCompiler, MirInstruction}, box_trait::IntegerBox}; use nyash_rust::mir::passes::method_id_inject::inject_method_ids; use std::{fs, process}; -use serde_json::json; impl NyashRunner { /// Execute LLVM mode (split) @@ -56,7 +55,7 @@ impl NyashRunner { let tmp_dir = std::path::Path::new("tmp"); let _ = std::fs::create_dir_all(tmp_dir); let mir_json_path = tmp_dir.join("nyash_harness_mir.json"); - if let Err(e) = emit_mir_json_for_harness(&module, &mir_json_path) { + if let Err(e) = crate::runner::mir_json_emit::emit_mir_json_for_harness(&module, &mir_json_path) { eprintln!("❌ MIR JSON emit error: {}", e); process::exit(1); } @@ -182,92 +181,4 @@ impl NyashRunner { } } -fn emit_mir_json_for_harness(module: &nyash_rust::mir::MirModule, path: &std::path::Path) -> Result<(), String> { - use nyash_rust::mir::{MirInstruction as I, BinaryOp as B, CompareOp as C}; - // Build JSON structure expected by python builder: { functions: [ { name, params, blocks: [ { id, instructions: [ ... 
] } ] } ] } - let mut funs = Vec::new(); - for (name, f) in &module.functions { - let mut blocks = Vec::new(); - let mut ids: Vec<_> = f.blocks.keys().copied().collect(); - ids.sort(); - for bid in ids { - if let Some(bb) = f.blocks.get(&bid) { - let mut insts = Vec::new(); - // PHI first(オプション) - for inst in &bb.instructions { - if let I::Phi { dst, inputs } = inst { - let incoming: Vec<_> = inputs.iter().map(|(b, v)| json!([v.as_u32(), b.as_u32()])).collect(); - insts.push(json!({"op":"phi","dst": dst.as_u32(), "incoming": incoming})); - } - } - // Non-PHI - for inst in &bb.instructions { - match inst { - I::Const { dst, value } => { - let (t, val) = match value { - nyash_rust::mir::ConstValue::Integer(i) => ("i64", json!(i)), - nyash_rust::mir::ConstValue::Float(fv) => ("f64", json!(fv)), - nyash_rust::mir::ConstValue::Bool(b) => ("i64", json!(if *b {1} else {0})), - nyash_rust::mir::ConstValue::String(s) => ("string", json!(s)), - nyash_rust::mir::ConstValue::Null => ("void", json!(0)), - nyash_rust::mir::ConstValue::Void => ("void", json!(0)), - }; - insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": t, "value": val}})); - } - I::BinOp { dst, op, lhs, rhs } => { - let op_s = match op { B::Add=>"+",B::Sub=>"-",B::Mul=>"*",B::Div=>"/",B::Mod=>"%",B::BitAnd=>"&",B::BitOr=>"|",B::BitXor=>"^",B::Shl=>"<<",B::Shr=>">>",B::And=>"&",B::Or=>"|"}; - insts.push(json!({"op":"binop","operation": op_s, "lhs": lhs.as_u32(), "rhs": rhs.as_u32(), "dst": dst.as_u32()})); - } - I::Compare { dst, op, lhs, rhs } => { - let op_s = match op { C::Lt=>"<", C::Le=>"<=", C::Gt=>">", C::Ge=>">=", C::Eq=>"==", C::Ne=>"!=" }; - insts.push(json!({"op":"compare","operation": op_s, "lhs": lhs.as_u32(), "rhs": rhs.as_u32(), "dst": dst.as_u32()})); - } - I::Call { dst, func, args, .. 
} => { - let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); - insts.push(json!({"op":"call","func": func.as_u32(), "args": args_a, "dst": dst.map(|d| d.as_u32())})); - } - I::ExternCall { dst, iface_name, method_name, args, .. } => { - let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); - // Map known interfaces to NyRT symbols - let func_name = if iface_name == "env.console" { - format!("nyash.console.{}", method_name) - } else { format!("{}.{}", iface_name, method_name) }; - insts.push(json!({"op":"externcall","func": func_name, "args": args_a, "dst": dst.map(|d| d.as_u32())})); - } - I::BoxCall { dst, box_val, method, args, .. } => { - let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); - insts.push(json!({"op":"boxcall","box": box_val.as_u32(), "method": method, "args": args_a, "dst": dst.map(|d| d.as_u32())})); - } - I::NewBox { dst, box_type, args } => { - let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); - insts.push(json!({"op":"newbox","type": box_type, "args": args_a, "dst": dst.as_u32()})); - } - I::Branch { condition, then_bb, else_bb } => { - insts.push(json!({"op":"branch","cond": condition.as_u32(), "then": then_bb.as_u32(), "else": else_bb.as_u32()})); - } - I::Jump { target } => { - insts.push(json!({"op":"jump","target": target.as_u32()})); - } - I::Return { value } => { - insts.push(json!({"op":"ret","value": value.map(|v| v.as_u32())})); - } - _ => { /* skip non-essential ops for initial harness */ } - } - } - // Terminator (if present) - if let Some(term) = &bb.terminator { - match term { - I::Return { value } => insts.push(json!({"op":"ret","value": value.map(|v| v.as_u32())})), - I::Jump { target } => insts.push(json!({"op":"jump","target": target.as_u32()})), - I::Branch { condition, then_bb, else_bb } => insts.push(json!({"op":"branch","cond": condition.as_u32(), "then": then_bb.as_u32(), "else": else_bb.as_u32()})), - _ => {} - } - } - blocks.push(json!({"id": 
bid.as_u32(), "instructions": insts})); - } - } - funs.push(json!({"name": name, "params": [], "blocks": blocks})); - } - let root = json!({"functions": funs}); - std::fs::write(path, serde_json::to_string_pretty(&root).unwrap()).map_err(|e| format!("write mir json: {}", e)) -} +// emit_mir_json_for_harness moved to crate::runner::mir_json_emit diff --git a/src/runner/modes/vm.rs b/src/runner/modes/vm.rs index 788b7a88..85014466 100644 --- a/src/runner/modes/vm.rs +++ b/src/runner/modes/vm.rs @@ -118,6 +118,58 @@ impl NyashRunner { } } + // Optional: PyVM path. When NYASH_VM_USE_PY=1, emit MIR(JSON) and delegate execution to tools/pyvm_runner.py + if std::env::var("NYASH_VM_USE_PY").ok().as_deref() == Some("1") { + let py = which::which("python3").ok(); + if let Some(py3) = py { + let runner = std::path::Path::new("tools/pyvm_runner.py"); + if runner.exists() { + // Emit MIR(JSON) + let tmp_dir = std::path::Path::new("tmp"); + let _ = std::fs::create_dir_all(tmp_dir); + let mir_json_path = tmp_dir.join("nyash_pyvm_mir.json"); + if let Err(e) = crate::runner::mir_json_emit::emit_mir_json_for_harness(&module_vm, &mir_json_path) { + eprintln!("❌ PyVM MIR JSON emit error: {}", e); + process::exit(1); + } + if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { + eprintln!("[Runner/VM] using PyVM → {} (mir={})", filename, mir_json_path.display()); + } + // Determine entry function hint (prefer Main.main if present) + let entry = if module_vm.functions.contains_key("Main.main") { + "Main.main" + } else if module_vm.functions.contains_key("main") { "main" } else { "Main.main" }; + // Spawn runner + let status = std::process::Command::new(py3) + .args([ + runner.to_string_lossy().as_ref(), + "--in", + &mir_json_path.display().to_string(), + "--entry", + entry, + ]) + .status() + .map_err(|e| format!("spawn pyvm: {}", e)) + .unwrap(); + if !status.success() { + eprintln!("❌ PyVM failed (status={})", status.code().unwrap_or(-1)); + process::exit(1); + } + // 
Propagate exit code if set + if let Some(code) = status.code() { + process::exit(code); + } + process::exit(0); + } else { + eprintln!("❌ PyVM runner not found: {}", runner.display()); + process::exit(1); + } + } else { + eprintln!("❌ python3 not found in PATH. Install Python 3 to use PyVM."); + process::exit(1); + } + } + // Expose GC/scheduler hooks globally for JIT externs (checkpoint/await, etc.) nyash_rust::runtime::global_hooks::set_from_runtime(&runtime); diff --git a/tools/build_llvm.sh b/tools/build_llvm.sh index 560c24d5..c9f12844 100644 --- a/tools/build_llvm.sh +++ b/tools/build_llvm.sh @@ -43,11 +43,12 @@ if ! command -v llvm-config-18 >/dev/null 2>&1; then exit 2 fi -echo "[1/4] Building nyash (feature=llvm, harness-friendly) ..." +echo "[1/4] Building nyash (feature selectable) ..." _LLVMPREFIX=$(llvm-config-18 --prefix) -# Build only the core package to avoid compiling workspace plugin crates +# Select LLVM feature: default harness (llvm), or legacy inkwell when NYASH_LLVM_FEATURE=llvm-inkwell-legacy +LLVM_FEATURE=${NYASH_LLVM_FEATURE:-llvm} LLVM_SYS_181_PREFIX="${_LLVMPREFIX}" LLVM_SYS_180_PREFIX="${_LLVMPREFIX}" \ - CARGO_INCREMENTAL=1 cargo build --release -p nyash-rust --features llvm >/dev/null + CARGO_INCREMENTAL=1 cargo build --release -p nyash-rust --features "$LLVM_FEATURE" >/dev/null echo "[2/4] Emitting object (.o) via LLVM backend ..." 
# Default object output path under target/aot_objects @@ -57,7 +58,15 @@ stem=${stem%.nyash} OBJ="${NYASH_LLVM_OBJ_OUT:-$PWD/target/aot_objects/${stem}.o}" if [[ "${NYASH_LLVM_SKIP_EMIT:-0}" != "1" ]]; then rm -f "$OBJ" - NYASH_LLVM_OBJ_OUT="$OBJ" LLVM_SYS_181_PREFIX="${_LLVMPREFIX}" LLVM_SYS_180_PREFIX="${_LLVMPREFIX}" ./target/release/nyash --backend llvm "$INPUT" >/dev/null || true + if [[ "${NYASH_LLVM_FEATURE:-llvm}" == "llvm-inkwell-legacy" ]]; then + # Legacy path: do not use harness + NYASH_LLVM_OBJ_OUT="$OBJ" LLVM_SYS_181_PREFIX="${_LLVMPREFIX}" LLVM_SYS_180_PREFIX="${_LLVMPREFIX}" \ + ./target/release/nyash --backend llvm "$INPUT" >/dev/null || true + else + # Harness path + NYASH_LLVM_OBJ_OUT="$OBJ" NYASH_LLVM_USE_HARNESS=1 LLVM_SYS_181_PREFIX="${_LLVMPREFIX}" LLVM_SYS_180_PREFIX="${_LLVMPREFIX}" \ + ./target/release/nyash --backend llvm "$INPUT" >/dev/null || true + fi fi if [[ ! -f "$OBJ" ]]; then echo "error: object not generated: $OBJ" >&2 diff --git a/tools/build_plugins_all.sh b/tools/build_plugins_all.sh index b7a2711c..878d17c9 100644 --- a/tools/build_plugins_all.sh +++ b/tools/build_plugins_all.sh @@ -7,9 +7,15 @@ ROOT_DIR=$(cd "$(dirname "$0")/.." && pwd) cd "$ROOT_DIR" PROFILE=${PROFILE:-release} +JOBS=${JOBS:-24} -echo "[plugins] building all (profile=$PROFILE)" +echo "[plugins] building all (profile=$PROFILE, jobs=$JOBS)" +# Build all plugins in one go for maximum efficiency +echo "[plugins] building workspace..." 
+cargo build --workspace --$PROFILE -j $JOBS >/dev/null + +# Copy artifacts to plugin directories for dir in plugins/*; do [[ -d "$dir" && -f "$dir/Cargo.toml" ]] || continue pkg=$(grep -m1 '^name\s*=\s*"' "$dir/Cargo.toml" | sed -E 's/.*"(.*)".*/\1/') @@ -19,7 +25,6 @@ for dir in plugins/*; do libname=${pkg//-/_} fi echo "[plugins] -> $pkg (libname=$libname)" - cargo build -p "$pkg" --$PROFILE >/dev/null # Copy artifacts outdir="target/$PROFILE" # cdylib (.so/.dylib/.dll) diff --git a/tools/compare_harness_on_off.sh b/tools/compare_harness_on_off.sh index cbf026a5..2da74ac5 100644 --- a/tools/compare_harness_on_off.sh +++ b/tools/compare_harness_on_off.sh @@ -12,11 +12,19 @@ OFF_EXE=${OFF_EXE:-$ROOT_DIR/app_dep_tree_rust} echo "[compare] target app: $APP" echo "[compare] build (OFF/Rust LLVM or harness fallback) ..." -# If legacy inkwell backend is not in use, fall back to harness for OFF as well -NYASH_LLVM_SKIP_NYRT_BUILD=1 NYASH_LLVM_USE_HARNESS=1 "$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$OFF_EXE" >/dev/null +if [[ "${NYASH_COMPARE_INKWELL:-0}" == "1" ]]; then + echo " OFF=inkwell-legacy" + NYASH_LLVM_SKIP_NYRT_BUILD=1 NYASH_LLVM_FEATURE=llvm-inkwell-legacy \ + "$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$OFF_EXE" >/dev/null +else + echo " OFF=harness" + NYASH_LLVM_SKIP_NYRT_BUILD=1 NYASH_LLVM_FEATURE=llvm \ + "$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$OFF_EXE" >/dev/null +fi echo "[compare] build (ON/llvmlite harness) ..." -NYASH_LLVM_SKIP_NYRT_BUILD=1 NYASH_LLVM_USE_HARNESS=1 "$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$ON_EXE" >/dev/null +NYASH_LLVM_SKIP_NYRT_BUILD=1 NYASH_LLVM_FEATURE=llvm \ + "$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$ON_EXE" >/dev/null echo "[compare] run both and capture output ..." 
ON_OUT="$OUTDIR/on.out"; OFF_OUT="$OUTDIR/off.out"
diff --git a/tools/llvmlite_check_deny_direct.sh b/tools/llvmlite_check_deny_direct.sh
new file mode 100644
index 00000000..29692958
--- /dev/null
+++ b/tools/llvmlite_check_deny_direct.sh
@@ -0,0 +1,13 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+ROOT=$(cd "$(dirname "$0")/.." && pwd)
+cd "$ROOT"
+
+echo "[deny-direct] scanning src/llvm_py for direct vmap.get reads ..."
+rg -n "vmap\\.get\\(" src/llvm_py \
+  -g '!src/llvm_py/resolver.py' \
+  -g '!src/llvm_py/llvm_builder.py' || true
+
+echo "[hint] Prefer resolver.resolve_i64/resolve_ptr with (builder.block, preds, block_end_values, vmap, bb_map)."
+
diff --git a/tools/parity.sh b/tools/parity.sh
new file mode 100644
index 00000000..49479339
--- /dev/null
+++ b/tools/parity.sh
@@ -0,0 +1,166 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+if [[ "${NYASH_CLI_VERBOSE:-0}" == "1" ]]; then
+  set -x
+fi
+
+usage() {
+  cat << USAGE
+Nyash parity runner — compare two execution paths on the same .nyash
+
+Usage: tools/parity.sh [options]
+
+Options:
+  --lhs        Left mode: pyvm|llvmlite|vm (default: pyvm)
+  --rhs        Right mode: pyvm|llvmlite|vm (default: llvmlite)
+  --timeout    Timeout seconds for each run (default: 12)
+  --show-diff  Show unified diff when different
+
+Compares stdout (normalized) and exit codes. Returns 0 when equal.
+USAGE
+}
+
+APP=""
+LHS="pyvm"
+RHS="llvmlite"
+TIMEOUT="12"
+SHOW_DIFF=0
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    -h|--help) usage; exit 0;;
+    --lhs) LHS="$2"; shift 2;;
+    --rhs) RHS="$2"; shift 2;;
+    --timeout) TIMEOUT="$2"; shift 2;;
+    --show-diff) SHOW_DIFF=1; shift;;
+    *) APP="$1"; shift;;
+  esac
+done
+
+if [[ -z "$APP" ]]; then
+  usage; exit 1
+fi
+if [[ ! -f "$APP" ]]; then
+  echo "error: app not found: $APP" >&2
+  exit 2
+fi
+
+ROOT=$(cd "$(dirname "$0")/.." && pwd)
+NYASH_BIN="$ROOT/target/release/nyash"
+if [[ ! -x "$NYASH_BIN" ]]; then
+  echo "[build] nyash not found; building release ..." >&2
+  (cd "$ROOT" && cargo build --release >/dev/null)
+fi
+
+has_cmd() { command -v "$1" >/dev/null 2>&1; }
+
+normalize() {
+  # Remove runner/plugin noise and blank lines
+  sed -E \
+    -e 's/\r$//' \
+    -e '/^\[ConsoleBox\]/d' \
+    -e '/^\[FileBox\]/d' \
+    -e '/^\[plugin-loader\]/d' \
+    -e '/^\[Runner\//d' \
+    -e '/^DEBUG:/d' \
+    -e '/^🔌/d' \
+    -e '/^✅/d' \
+    -e '/^🚀/d' \
+    -e '/^⚡/d' \
+    -e '/^🦀/d' \
+    -e '/^🧠/d' \
+    -e '/^📊/d' \
+    -e '/^Result(Type)?\(/d' \
+    -e '/^Result:/d' \
+    -e '/^$/d'
+}
+
+run_pyvm() {
+  local app="$1"
+  local out code
+  if has_cmd timeout; then
+    out=$(NYASH_VM_USE_PY=1 timeout "${TIMEOUT}s" "$NYASH_BIN" --backend vm "$app" 2>&1) || code=$?
+  else
+    out=$(NYASH_VM_USE_PY=1 "$NYASH_BIN" --backend vm "$app" 2>&1) || code=$?
+  fi
+  code=${code:-0}
+  printf '%s' "$out" | normalize
+  echo "__EXIT_CODE__=$code"
+}
+
+run_vm() {
+  local app="$1"
+  local out code
+  if has_cmd timeout; then
+    out=$(timeout "${TIMEOUT}s" "$NYASH_BIN" --backend vm "$app" 2>&1) || code=$?
+  else
+    out=$("$NYASH_BIN" --backend vm "$app" 2>&1) || code=$?
+  fi
+  code=${code:-0}
+  printf '%s' "$out" | normalize
+  echo "__EXIT_CODE__=$code"
+}
+
+run_llvmlite() {
+  local app="$1"
+  if ! has_cmd llvm-config-18; then
+    echo "error: llvm-config-18 not found (required for llvmlite parity)." >&2
+    exit 3
+  fi
+  local stem exe
+  stem=$(basename "$app"); stem=${stem%.nyash}
+  exe="$ROOT/app_parity_${stem}"
+  NYASH_LLVM_FEATURE=llvm "${ROOT}/tools/build_llvm.sh" "$app" -o "$exe" >/dev/null || true
+  if [[ ! -x "$exe" ]]; then
+    echo "error: failed to build llvmlite executable: $exe" >&2
+    exit 4
+  fi
+  local out code
+  if has_cmd timeout; then
+    out=$(timeout "${TIMEOUT}s" "$exe" 2>&1) || code=$?
+  else
+    out=$("$exe" 2>&1) || code=$?
+  fi
+  code=${code:-0}
+  printf '%s' "$out" | normalize
+  echo "__EXIT_CODE__=$code"
+}
+
+run_mode() {
+  local mode="$1" app="$2"
+  case "$mode" in
+    pyvm) run_pyvm "$app" ;;
+    vm) run_vm "$app" ;;
+    llvmlite) run_llvmlite "$app" ;;
+    *) echo "error: unknown mode: $mode" >&2; exit 5;;
+  esac
+}
+
+LEFT=$(run_mode "$LHS" "$APP")
+RIGHT=$(run_mode "$RHS" "$APP")
+
+LCODE=$(printf '%s\n' "$LEFT" | sed -n 's/^__EXIT_CODE__=//p')
+RCODE=$(printf '%s\n' "$RIGHT" | sed -n 's/^__EXIT_CODE__=//p')
+LOUT=$(printf '%s\n' "$LEFT" | sed '/^__EXIT_CODE__=/d')
+ROUT=$(printf '%s\n' "$RIGHT" | sed '/^__EXIT_CODE__=/d')
+
+STATUS=0
+if [[ "$LCODE" != "$RCODE" ]]; then
+  echo "[parity] exit code differs: $LHS=$LCODE, $RHS=$RCODE" >&2
+  STATUS=1
+fi
+if [[ "$LOUT" != "$ROUT" ]]; then
+  echo "[parity] stdout differs" >&2
+  if [[ "$SHOW_DIFF" -eq 1 ]]; then
+    diff -u <(printf '%s\n' "$LOUT") <(printf '%s\n' "$ROUT") || true
+  fi
+  STATUS=1
+fi
+
+if [[ "$STATUS" -eq 0 ]]; then
+  echo "✅ parity OK ($LHS == $RHS)" >&2
+else
+  echo "❌ parity mismatch ($LHS != $RHS)" >&2
+fi
+exit "$STATUS"
diff --git a/tools/pyvm_runner.py b/tools/pyvm_runner.py
new file mode 100644
index 00000000..bd178db0
--- /dev/null
+++ b/tools/pyvm_runner.py
@@ -0,0 +1,76 @@
+#!/usr/bin/env python3
+"""
+Nyash PyVM runner (scaffold)
+
+Usage:
+  - python3 tools/pyvm_runner.py --in mir.json [--entry Main.main]
+
+Executes MIR(JSON) using a tiny Python interpreter for a minimal opcode set:
+  - const/binop/compare/branch/jump/ret
+  - newbox (ConsoleBox, StringBox minimal)
+  - boxcall (String: length/substring/lastIndexOf; Console: print/println/log)
+  - externcall (nyash.console.println)
+
+On success, exits with the integer return value if it is an Integer; otherwise 0.
+Outputs produced by println/log are written to stdout.
+"""
+
+import argparse
+import json
+import sys
+import os
+from pathlib import Path
+
+ROOT = Path(__file__).resolve().parents[1]
+PYVM_DIR = ROOT / "src" / "llvm_py" / "pyvm"
+
+# Ensure imports can find the package root (src)
+SRC_DIR = ROOT / "src"
+if str(SRC_DIR) not in sys.path:
+    sys.path.insert(0, str(SRC_DIR))
+
+from llvm_py.pyvm.vm import PyVM  # type: ignore
+
+
+def main():
+    ap = argparse.ArgumentParser()
+    ap.add_argument("--in", dest="infile", required=True, help="MIR JSON input")
+    ap.add_argument("--entry", dest="entry", default="Main.main", help="Entry function (default Main.main)")
+    args = ap.parse_args()
+
+    with open(args.infile, "r") as f:
+        program = json.load(f)
+
+    vm = PyVM(program)
+    # Fallbacks for entry name
+    entry = args.entry
+    fun_names = {f.get("name", "") for f in program.get("functions", [])}
+    if entry not in fun_names:
+        if "main" in fun_names:
+            entry = "main"
+        elif "Main.main" in fun_names:
+            entry = "Main.main"
+
+    result = vm.run(entry)
+    # Exit code convention: integers propagate; bool -> 0/1; else 0
+    code = 0
+    if isinstance(result, bool):
+        code = 1 if result else 0
+    elif isinstance(result, int):
+        # Clamp to 32-bit exit code domain
+        code = int(result) & 0xFFFFFFFF
+        if code & 0x80000000:
+            code = -((~code + 1) & 0xFFFFFFFF)
+    # For parity comparisons, avoid emitting extra lines here.
+    sys.exit(code)
+
+
+if __name__ == "__main__":
+    try:
+        main()
+    except Exception as e:
+        import traceback
+        print(f"[pyvm] error: {e}", file=sys.stderr)
+        if sys.stderr and (os.environ.get('NYASH_CLI_VERBOSE') == '1' or True):
+            traceback.print_exc()
+        sys.exit(1)
diff --git a/tools/pyvm_vs_llvmlite.sh b/tools/pyvm_vs_llvmlite.sh
new file mode 100644
index 00000000..ff4a670e
--- /dev/null
+++ b/tools/pyvm_vs_llvmlite.sh
@@ -0,0 +1,58 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+if [[ "${NYASH_CLI_VERBOSE:-0}" == "1" ]]; then
+  set -x
+fi
+
+APP="${1:-apps/tests/esc_dirname_smoke.nyash}"
+OUT="app_pyvm_cmp"
+
+if [[ ! -f "$APP" ]]; then
+  echo "error: app not found: $APP" >&2
+  exit 2
+fi
+
+# 1) Build nyash with llvm harness enabled (build_llvm.sh does the right thing)
+echo "[cmp] building AOT via llvmlite harness ..." >&2
+./tools/build_llvm.sh "$APP" -o "$OUT" >/dev/null
+
+# 2) Run AOT executable and capture stdout + exit code
+echo "[cmp] running AOT (llvmlite) ..." >&2
+set +e
+OUT_LL=$("./$OUT" 2>&1)
+CODE_LL=$?
+set -e
+
+# 3) Run PyVM path (VM mode delegated to Python)
+echo "[cmp] running PyVM ..." >&2
+set +e
+OUT_PY=$(NYASH_VM_USE_PY=1 ./target/release/nyash --backend vm "$APP" 2>&1)
+CODE_PY=$?
+set -e
+
+echo "=== llvmlite (AOT) stdout ==="
+echo "$OUT_LL" | sed -n '1,120p'
+echo "=== PyVM stdout ==="
+echo "$OUT_PY" | sed -n '1,120p'
+echo "=== exit codes ==="
+echo "llvmlite: $CODE_LL"
+echo "PyVM    : $CODE_PY"
+
+DIFF=0
+if [[ "$OUT_LL" != "$OUT_PY" ]]; then
+  echo "[cmp] stdout differs" >&2
+  DIFF=1
+fi
+if [[ "$CODE_LL" -ne "$CODE_PY" ]]; then
+  echo "[cmp] exit code differs" >&2
+  DIFF=1
+fi
+
+if [[ "$DIFF" -eq 0 ]]; then
+  echo "✅ parity OK (stdout + exit code)"
+else
+  echo "❌ parity mismatch" >&2
+  exit 1
+fi
+
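
The parity check that tools/parity.sh performs (filter runner noise from stdout, then compare the remaining lines together with the exit code) can be sketched in Python as below. This is only an illustrative sketch, not code from the repository: the names `NOISE_PREFIXES`, `normalize`, and `parity` are hypothetical, and the prefix list is an assumed transcription of the sed rules in the script's `normalize()` function.

```python
# Prefixes treated as runner/plugin noise. Assumed to mirror the sed rules
# in tools/parity.sh; "Result(" and "ResultType(" stand in for the
# regex /^Result(Type)?\(/.
NOISE_PREFIXES = (
    "[ConsoleBox]", "[FileBox]", "[plugin-loader]", "[Runner/", "DEBUG:",
    "🔌", "✅", "🚀", "⚡", "🦀", "🧠", "📊",
    "Result(", "ResultType(", "Result:",
)


def normalize(text: str) -> list[str]:
    """Strip one trailing CR per line, then drop noise lines and empty lines."""
    kept = []
    for line in text.split("\n"):
        if line.endswith("\r"):
            line = line[:-1]
        if line and not line.startswith(NOISE_PREFIXES):
            kept.append(line)
    return kept


def parity(lhs: tuple[str, int], rhs: tuple[str, int]) -> bool:
    """Hypothetical helper: each side is a (raw stdout, exit code) pair."""
    (l_out, l_code), (r_out, r_code) = lhs, rhs
    return l_code == r_code and normalize(l_out) == normalize(r_out)
```

One point worth keeping in mind when comparing exit codes this way: on POSIX systems the shell only observes the low 8 bits of a process exit status, so the signed 32-bit value computed in tools/pyvm_runner.py is further truncated before parity.sh ever sees it in `$?`.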