docs: update CURRENT_TASK with Box Theory PHI plan (defer/finalize) and MIR v0.5 type meta; add parity tooling and PyVM scaffolding

impl(pyvm/llvmlite):
- add tools/parity.sh; tools/pyvm_runner.py; src/llvm_py/pyvm/*
- emit string const as handle type in MIR JSON; add dst_type hints
- unify '+' to concat_hh with from_i64/from_i8_string bridges; console print via to_i8p_h
- add runtime bridges: nyash.box.from_i64, nyash.string.to_i8p_h

tests:
- add apps/tests/min_str_cat_loop (minimal repro for string cat loop)
Selfhosting Dev
2025-09-14 04:51:33 +09:00
parent 658a0d46da
commit 3e07763af8
49 changed files with 1231 additions and 201 deletions


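@@ -1,25 +1,19 @@
-# Current Task (2025-09-11) - Phase 15 LLVM (main path) + llvmlite Harness (verification, future lead)
+# Current Task (revised 2025-09-13) - Phase 15 llvmlite (default) + PyVM (new)

 Summary
-- LLVM AOT (Rust/inkwell) remains the main path. To secure iteration speed and resilience to spec changes, the Python/llvmlite harness is formally adopted and the two are checked for equivalence on representative cases.
-- VM/Cranelift/Interpreter do not support MIR14. MIR normalization (Resolver and LoopForm rules) is guaranteed on the Rust side, and the harness is fed the same form.
-- Stably generate `.o` (and an EXE when needed) for the representative case (apps/selfhost/tools/dep_tree_min_string.nyash). Confirm functional equivalence with Harness ON/OFF.
+- JIT/Cranelift is paused. Rust/inkwell LLVM is reference-only.
+- The default run/build path is the Python/llvmlite harness (MIR JSON → .o → NyRT link).
+- PyVM (a Python MIR VM) is introduced as a second execution path and stabilized through functional equivalence with llvmlite.

-Quick Status - 2025-09-13 (compressed, post-harness fixes)
-- Harness ON (llvmlite): .ll verify green → .o → link succeeds (dep_tree_min_string).
-- Unified on Resolver-only (no direct vmap reads). PHIs are gathered at the head of the BB and fixed to i64 (handle); pointer incomings are boxed just before the pred terminator (GEP + from_i8_string).
-- Lowering order: blocks are lowered in a pseudo-topological order that prioritizes preds. Non-PHI instructions are inserted at the end of the "current BB" (stable dominance).
-- Strings: + uses concat_hh only when a string tag/ptr is detected; len/eq are supported; substring/lastIndexOf use the handle versions (_hii/_hh) implemented in NyRT.
-- const(string): keep the Global, then normalize to i8* via GEP at the use site. MIR main → private; ny_main wrapper generated.
-- by-name constants: the method-name i8* uses a constant GEP (removes order dependence).
-- Comparison/verification: compare_harness_on_off.sh shows matching exit codes for ON/OFF (the JSON is currently empty on both sides; the final JSON match is left to the next phase).
+Quick Status - 2025-09-13 (post-harness hardening)
+- With llvmlite (harness), verify green → .o → link succeeds on the representative case (dep_tree_min_string).
+- Strengthened the Resolver-only / Sealed SSA / string-handle invariants. PHIs sit at the BB head; boxing/cast happens at the pred terminator.
+- IR dump, PHI guard, and deny-direct check are available (NYASH_LLVM_DUMP_IR, NYASH_LLVM_PHI_STRICT, tools/llvmlite_check_deny_direct.sh).

-Focus Shift - Python/llvmlite Only (2025-09-13)
-- The Rust/inkwell side goes into "maintenance" for now. Development and refinement proceed only with Nyash scripts + Python/llvmlite.
-- Added smoke: apps/tests/esc_dirname_smoke.nyash (minimal two-line output for esc_json/dirname).
-- Added trace: with `NYASH_LLVM_TRACE_FINAL=1`, `nyash.debug.trace_handle(i64)` is called right before println to observe the final handle.
-- Lifetime hint (lightweight): the Builder collects `def_blocks` (value_id → set of defining blocks); the Resolver prefers reusing an i64 already defined in the current block (curbs excess PHIs).
-- const(string) improvement: convert to an i64 handle immediately via `from_i8_string` (reduces values falling to 0 in downstream chains).
+Focus Shift - llvmlite (default) + PyVM (new)
+- Rust/inkwell is maintenance-only. Development centers on Python (llvmlite/PyVM).
+- Added smokes: esc_dirname_smoke / dep_tree_min_string are kept passing on both llvmlite and PyVM at all times.
+- Traces: `NYASH_LLVM_TRACE_FINAL=1` (final handle), `NYASH_LLVM_TRACE_PHI=1` (PHI log).

 Hot Update - 2025-09-13 (harness wiring, fallback removed)
 - Added harness wiring to the Runner (LLVM mode). When `NYASH_LLVM_USE_HARNESS=1`: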
@@ -39,14 +33,43 @@ Hot Update - 2025-09-13 (Resolver-only unification + Harness ON green)
 - `main` collision avoidance: the MIR-derived `main` is made private and an `ny_main()` wrapper is generated automatically (consistent with NyRT `main`).
 - Representative case (dep_tree_min_string): with Harness ON, confirmed `.ll verify green → .o`, then linked with NyRT and produced an EXE.

-Next (short, refreshed) - Py/llvmlite
-1) Lock down the smoke: make the two-line output of esc_dirname_smoke match exactly between ON and OFF (line comparison).
-2) Final JSON match for dep_tree_min_string (empty diff from `{` onward).
-  - With `NYASH_LLVM_TRACE_FINAL=1` / `NYASH_LLVM_TRACE_VALUES=1`, observe the handle chain of the println argument, locate where the synth-zero originates, and fix it locally in Resolver/PHI.
-  - PHI/snapshot strictly follows the order "materialize at pred → otherwise snapshot → finally synth(0)". Never insert None.
-3) CI/auxiliary
-  - Keep the smokes easy to call from compare_harness_on_off.sh as well (add a line-comparison mode if needed).
-  - Keep checking Deny-Direct (suppressing direct `vmap.get(` reads).
+Next (short) - Py/llvmlite + PyVM
+1) PyVM scaffold: add `tools/pyvm_runner.py` and `src/llvm_py/pyvm/` (minimal instructions + boxcall).
+2) Runner integration: `NYASH_VM_USE_PY=1` → pass MIR(JSON) to PyVM and execute it.
+3) Parity infrastructure: add the generic parity script `tools/parity.sh` (compares stdout + exit code for any pair of pyvm/vm/llvmlite).
+4) Type metadata (MIR v0.5 compatible): state the handle/ptr kind of String explicitly in the JSON MIR so that llvmlite/PyVM no longer have to guess.
+5) Smoke expansion: both paths agree on esc_dirname_smoke / dep_tree_min_string (exit code + JSON).

+Hot Update - MIR v0.5 Type Metadata (started 2025-09-14)
+- Background: strings are handled ambiguously as i64, so llvmlite has to guess between handle and ptr, which breeds instability.
+- Added spec (backward compatible, minimal diff):
+  - Const(string): `{"value": {"type": {"kind":"handle","box_type":"StringBox"}, "value":"..."}}`
+  - The existing `type:"string"` notation is not emitted alongside it (consumers prefer the new notation and fall back to the legacy guess when it is absent).
+  - BoxCall/ExternCall: attach `dst_type` where possible (e.g. substring → StringBox(handle), length/lastIndexOf → i64).
+- Implementation plan:
+  - A) Extend the shared emitter `src/runner/mir_json_emit.rs` (string const, plus `dst_type` for a minimal set of methods).
+  - B) Python side: `llvm_builder.py` detects `dst_type` and calls `resolver.mark_string(dst)` (explicit type tagging).
+  - C) Console output stabilization: keep the existing pointer API / handle API for now. Consider introducing a handle→ptr bridge once type metadata is widespread.
+- Acceptance (stage 1):
+  - The string const type metadata appears in the JSON.
+  - The Python side tags string handles based on `dst_type`.
+  - `tools/parity.sh` can run on esc_dirname_smoke (an exact match is the stage-2 goal).
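A minimal sketch of how a consumer might classify a const's `type` field under the rule above: prefer the new metadata, fall back to the legacy `"string"` tag. The helper name `is_string_const` and the sample payloads are assumptions made for illustration; only the JSON shape comes from the spec text.

```python
from typing import Any


def is_string_const(type_meta: Any) -> bool:
    """Return True when a const's type describes a StringBox value.

    Accepts the legacy plain tag ("string") and the v0.5-style dict
    {"kind": "handle" | "ptr", "box_type": "StringBox"}.
    """
    if type_meta == "string":  # legacy notation
        return True
    if isinstance(type_meta, dict):
        return (
            type_meta.get("kind") in ("handle", "ptr")
            and type_meta.get("box_type") == "StringBox"
        )
    return False


# Hypothetical const payloads shaped like the spec line above.
new_style = {"type": {"kind": "handle", "box_type": "StringBox"}, "value": "x"}
old_style = {"type": "string", "value": "x"}
int_const = {"type": "i64", "value": 3}
```

Keeping the check in one predicate means the legacy fallback can be dropped in a single place once every emitter produces the new notation.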
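+Hot Update - Box Theory PHI (to be added 2025-09-14)
+- Background: loop PHIs fall back to a synthesized 0 when the snapshot has not been built yet (because forward references are resolved on the spot).
+- Policy (simplification based on box theory):
+  - Block = box (BoxScope). Each block's end-of-block `block_end_values` is treated as a box.
+  - PHIs are not resolved immediately. They are collected (deferred), and after all blocks have been lowered, finalize wires them by taking values out of the boxes (the preds' snapshots).
+  - Across blocks, a String is always a handle (i64). Pointer PHIs are forbidden. Any required boxing (ptr→handle) is inserted at the pred's end (right before the terminator).
+  - + is always concat_hh(handle, handle). i64 primitives are promoted with from_i64, literals with from_i8_string.
+- Implementation plan:
+  - A) `llvm_builder.py`: make `lower_phi` deferred and add `finalize_phis` (materializes incoming=(pred_bid, val_id)).
+  - B) `llvm_builder.py`: strengthen the coverage of `block_end_values` (function args, const, new dst, phi dst, and loop-carried values are reliably recorded).
+  - C) `resolver._value_at_end_i64`: enforce local boxing/cast at the pred's end and suppress the undefined → synthesized 0 fallback (warn in strict mode).
+- Acceptance (stage 2):
+  - Parity is green between PyVM and llvmlite on the minimal repro `apps/tests/min_str_cat_loop/main.nyash` (`xxx`).
+  - Parity is green on `apps/tests/esc_dirname_smoke.nyash` (the 0 on line 1 is gone).
+  - `tools/parity.sh` shows an exact stdout match plus a matching exit code.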
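The defer/finalize idea can be sketched without llvmlite. The toy below is an assumption-laden model, not the planned `finalize_phis`: the tuple layout, the function signature, and the string stand-ins for LLVM values are all hypothetical. It only shows why wiring after all blocks are lowered avoids the synthesized zero on a back edge.

```python
from typing import Dict, List, Tuple

# One deferred PHI: (block id, dst value id, [(pred block id, incoming value id), ...])
DeferredPhi = Tuple[int, int, List[Tuple[int, int]]]


def finalize_phis(
    deferred: List[DeferredPhi],
    block_end_values: Dict[int, Dict[int, object]],
    strict: bool = False,
) -> Dict[Tuple[int, int], List[Tuple[int, object]]]:
    """Wire each deferred PHI from its predecessors' end-of-block snapshots.

    Returns {(block_id, dst): [(pred_id, value), ...]}. A value missing from
    a predecessor's snapshot raises in strict mode and becomes 0 otherwise.
    """
    wired: Dict[Tuple[int, int], List[Tuple[int, object]]] = {}
    for block_id, dst, incoming in deferred:
        pairs: List[Tuple[int, object]] = []
        for pred_id, val_id in incoming:
            snapshot = block_end_values.get(pred_id, {})
            if val_id in snapshot:
                pairs.append((pred_id, snapshot[val_id]))
            elif strict:
                raise RuntimeError(
                    f"PHI v{dst}: v{val_id} missing at end of bb{pred_id}"
                )
            else:
                pairs.append((pred_id, 0))  # synthesized zero
        wired[(block_id, dst)] = pairs
    return wired


# Loop shape: bb0 (entry) -> bb1 (header) <- bb2 (body, back edge).
deferred = [(1, 10, [(0, 1), (2, 11)])]
# Snapshots are complete only after every block has been lowered.
snapshots = {0: {1: "h_empty"}, 2: {11: "h_concat"}}
wired = finalize_phis(deferred, snapshots, strict=True)
```

If the header PHI were resolved while bb2 was still unlowered, the lookup for value 11 would miss and fall to 0, which is the failure described in the background bullet.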
 Compact Roadmap (revised 2025-09-13)
 - Focus A (keep Rust LLVM): flow hardening, PHI(sealed) stabilization, LoopForm spec compliance.

BIN
app_par_esc Normal file

Binary file not shown.

Binary file not shown.

BIN
app_parity_main Normal file

Binary file not shown.


@@ -0,0 +1,17 @@
+// Minimal repro: string concatenation in a loop should yield "xxx"
+static box Main {
+    main(args) {
+        local console = new ConsoleBox()
+        local out = ""
+        local i = 0
+        local n = 3
+        loop(i < n) {
+            out = out + "x"
+            i = i + 1
+        }
+        console.println(out)
+        return 0
+    }
+}
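What the repro exercises can be modeled with a toy handle table in Python. This is a sketch under stated assumptions: `HandleTable` and `run_min_str_cat_loop` are hypothetical names, and the methods only borrow the names `from_i8_string`, `from_i64`, and `concat_hh` from the commit; they are not the NyRT implementations.

```python
from typing import Dict, Union


class HandleTable:
    """Toy registry mapping positive integer handles to boxed values."""

    def __init__(self) -> None:
        self._objs: Dict[int, Union[str, int]] = {}
        self._next = 1  # handle 0 is reserved for "no value" in this model

    def _new(self, obj: Union[str, int]) -> int:
        h = self._next
        self._next += 1
        self._objs[h] = obj
        return h

    def from_i8_string(self, s: str) -> int:
        return self._new(s)

    def from_i64(self, v: int) -> int:
        return self._new(v)

    def concat_hh(self, a: int, b: int) -> int:
        # Both operands are handles; the result is a fresh string handle.
        return self._new(str(self._objs[a]) + str(self._objs[b]))

    def to_str(self, h: int) -> str:
        return str(self._objs[h])


def run_min_str_cat_loop(n: int = 3) -> str:
    """Same shape as the Nyash repro: out = out + "x", n times."""
    table = HandleTable()
    out = table.from_i8_string("")
    i = 0
    while i < n:
        out = table.concat_hh(out, table.from_i8_string("x"))
        i += 1
    return table.to_str(out)
```

In this model the loop-carried `out` is always a handle, so the value flowing around the back edge never changes representation; that is the invariant the loop PHI has to preserve.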


@@ -79,7 +79,9 @@ pub extern "C" fn nyash_string_concat_hh_export(a_h: i64, b_h: i64) -> i64 {
     };
     let s = format!("{}{}", to_s(a_h), to_s(b_h));
     let arc: std::sync::Arc<dyn NyashBox> = std::sync::Arc::new(StringBox::new(s));
-    handles::to_handle(arc) as i64
+    let h = handles::to_handle(arc) as i64;
+    eprintln!("[TRACE] concat_hh -> {}", h);
+    h
 }

 // String.eq_hh(lhs_h, rhs_h) -> i64 (0/1)
@@ -120,7 +122,9 @@ pub extern "C" fn nyash_string_substring_hii_export(h: i64, start: i64, end: i64
     let (st_u, en_u) = (st as usize, en as usize);
     let sub = s.get(st_u.min(s.len())..en_u.min(s.len())).unwrap_or("");
     let arc: std::sync::Arc<dyn NyashBox> = std::sync::Arc::new(StringBox::new(sub.to_string()));
-    handles::to_handle(arc) as i64
+    let nh = handles::to_handle(arc) as i64;
+    eprintln!("[TRACE] substring_hii -> {}", nh);
+    nh
 }

 // String.lastIndexOf_hh(haystack_h, needle_h) -> i64
@@ -159,7 +163,9 @@ pub extern "C" fn nyash_box_from_i8_string(ptr: *const i8) -> i64 {
         Err(_) => return 0,
     };
     let arc: std::sync::Arc<dyn NyashBox> = std::sync::Arc::new(StringBox::new(s));
-    handles::to_handle(arc) as i64
+    let h = handles::to_handle(arc) as i64;
+    eprintln!("[TRACE] from_i8_string -> {}", h);
+    h
 }

 // box.from_f64(val) -> handle
@@ -171,6 +177,15 @@ pub extern "C" fn nyash_box_from_f64(val: f64) -> i64 {
     handles::to_handle(arc) as i64
 }

+// box.from_i64(val) -> handle
+// Helper: build an IntegerBox and return a handle
+#[export_name = "nyash.box.from_i64"]
+pub extern "C" fn nyash_box_from_i64(val: i64) -> i64 {
+    use nyash_rust::{box_trait::{NyashBox, IntegerBox}, jit::rt::handles};
+    let arc: std::sync::Arc<dyn NyashBox> = std::sync::Arc::new(IntegerBox::new(val));
+    handles::to_handle(arc) as i64
+}
+
 // env.box.new(type_name: *const i8) -> handle (i64)
 // Minimal shim for Core-13 pure AOT: constructs Box via registry by name (no args)
 #[export_name = "nyash.env.box.new"]


@@ -83,7 +83,7 @@ pub extern "C" fn nyash_string_concat_is(a: i64, b: *const i8) -> *mut i8 {
 // Exported as: nyash.string.substring_sii(i8* s, i64 start, i64 end) -> i8*
 #[export_name = "nyash.string.substring_sii"]
 pub extern "C" fn nyash_string_substring_sii(s: *const i8, start: i64, end: i64) -> *mut i8 {
-    use std::ffi::CStr;
+    use std::ffi::CStr;
     if s.is_null() {
         return std::ptr::null_mut();
     }
@@ -121,3 +121,34 @@ pub extern "C" fn nyash_string_lastindexof_ss(s: *const i8, needle: *const i8) -
         pos as i64
     } else { -1 }
 }
+
+// Exported as: nyash.string.to_i8p_h(i64 handle) -> i8*
+#[export_name = "nyash.string.to_i8p_h"]
+pub extern "C" fn nyash_string_to_i8p_h(handle: i64) -> *mut i8 {
+    use nyash_rust::jit::rt::handles;
+    if handle <= 0 {
+        // return "0" for consistency with existing fallback behavior
+        let s = handle.to_string();
+        let mut bytes = s.into_bytes();
+        bytes.push(0);
+        let boxed = bytes.into_boxed_slice();
+        let raw = Box::into_raw(boxed) as *mut u8;
+        return raw as *mut i8;
+    }
+    if let Some(obj) = handles::get(handle as u64) {
+        let s = obj.to_string_box().value;
+        let mut bytes = s.into_bytes();
+        bytes.push(0);
+        let boxed = bytes.into_boxed_slice();
+        let raw = Box::into_raw(boxed) as *mut u8;
+        raw as *mut i8
+    } else {
+        // not found -> print numeric handle string
+        let s = handle.to_string();
+        let mut bytes = s.into_bytes();
+        bytes.push(0);
+        let boxed = bytes.into_boxed_slice();
+        let raw = Box::into_raw(boxed) as *mut u8;
+        raw as *mut i8
+    }
+}
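The three branches of the bridge reduce to a small decision table. The Python below is only a model of that logic as read from the hunk: `handle_to_text`, `to_c_bytes`, and the dict-based registry are hypothetical stand-ins, not part of NyRT.

```python
from typing import Dict, Optional


def handle_to_text(handle: int, registry: Dict[int, str]) -> str:
    """Mirror the bridge's three branches as plain string logic."""
    if handle <= 0:
        return str(handle)   # non-positive: decimal text of the value itself
    found: Optional[str] = registry.get(handle)
    if found is not None:
        return found         # live handle: the box's string form
    return str(handle)       # unknown handle: decimal text again


def to_c_bytes(text: str) -> bytes:
    """NUL-terminate the UTF-8 bytes, as the bridge does before returning."""
    return text.encode("utf-8") + b"\x00"
```

Two details stand out when the logic is written this way. The first branch yields `"0"` only for a handle of exactly 0; a negative handle prints as its own decimal text, which differs from what the in-code comment suggests. The Rust version also hands ownership out through `Box::into_raw`, and nothing in this hunk frees the buffer, so each call appears to leak one allocation unless a matching free exists elsewhere.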


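@@ -14,26 +14,26 @@ Making the most of the elegance of MIR's 13 instructions, external compiler dependence
 4. **Ecosystem independence**: a development environment that is complete with Nyash alone
 5. **Dramatic code compression**: a 75% reduction that transforms maintainability and readability

-## 🚀 Implementation strategy (updated September 2025)
+## 🚀 Implementation strategy (updated and revised September 2025)

-### Phase 15.2: LLVM layer independence (in progress)
-- **Python/llvmlite implementation formally adopted** (10x development speed, ~2400 lines)
-- Split out the nyash-llvm-compiler crate (the Rust version also continues)
-- MIR JSON/binary input → native EXE output
-- Plugin all-direction build strategy (.so/.o/.a generated together)
-- Distributable as a standalone tool
+### Phase 15.2: LLVM (llvmlite) stabilization + PyVM introduction
+- JIT/Cranelift is paused (old/unsupported). Rust/inkwell is reference-only.
+- The default compile path is **Python/llvmlite** (harness) only
+  - MIR(JSON) → LLVM IR → .o → NyRT link → EXE
+  - Strengthen the Resolver-only / Sealed SSA / string-handle invariants
+- New: introduce **PyVM (Python MIR VM)** to secure a second execution path
+  - Minimal instructions: const/binop/compare/phi/branch/jump/ret + minimal boxcall (Console/File/Path/String)
+  - Runner integration: `NYASH_VM_USE_PY=1` passes MIR(JSON) to PyVM for execution
+  - Confirm parity with llvmlite on the representative smokes (esc_dirname_smoke / dep_tree_min_string)

-### Phase 15.3: Nyash compiler implementation
-- Implement the Nyash parser in Nyash (target: 800 lines)
-- AST→MIR conversion (target: 2500 lines)
-- **No circular dependency**: nyrt provides StringBox/ArrayBox via the C ABI
-- Achieve self-hosting through bootstrapping!
+### Phase 15.3: Nyash compiler MVP (later stage)
+- After PyVM stabilizes, introduce the Nyash parser/lexer (subset) and the MIR builder in stages
+- Coexists with the Rust fallback behind a flag (e.g. `NYASH_USE_NY_COMPILER=1`)
+- No JIT required; correctness is backed by PyVM/llvmlite parity

-### Phase 15.4: Moving the VM layer to Nyash (innovative)
-- Implement the MIR interpretation engine in Nyash (~5000 lines expected)
-- Handle the 13 instructions through dynamic dispatch (MapBox)
-- Immediate execution with no compilation
-- Dramatically better debugging and development efficiency
+### Phase 15.4: Moving the VM layer to Nyash (replacing PyVM)
+- With PyVM as a foothold, port the VM core to a Nyash implementation in stages (starting from an instruction subset)
+- Expand toward handling the 13 instructions through dynamic dispatch

 Details: [Self-hosting strategy, September 2025 edition](implementation/self-hosting-strategy-2025-09.md)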
@@ -71,7 +71,7 @@ Making the most of the elegance of MIR's 13 instructions, external compiler dependence
 This extreme simplicity also makes direct x86 translation realistic

 ### Backend options
-#### 1. Cranelift + built-in lld (recommended by ChatGPT5)
+#### 1. Cranelift + built-in lld (on hold)
 - **Lightweight**: about 3-5 MB (less than 1/10 of LLVM)
 - **JIT-focused**: dynamic compilation in memory
 - **Rust integration**: easy to distribute through static linking
@@ -173,18 +173,15 @@ box TemplateStitcher {
 ## 🔗 EXE generation and linking strategy

-### Integrated toolchain
+### Integrated toolchain (current)
 ```bash
-# Cranelift version (currently paused)
-nyash build main.ny --backend=cranelift --target=x86_64-pc-windows-msvc
-
-# LLVM version (being implemented by ChatGPT5)
-nyash build main.ny --backend=llvm --emit exe -o program.exe
+nyash build main.ny --backend=llvm --emit exe -o program.exe  # llvmlite/harness path
+NYASH_VM_USE_PY=1 nyash run main.ny --backend=vm             # PyVM (executes MIR JSON)
 ```

 ### Implementation strategy

-#### LLVM backend (priority)
+#### LLVM backend (priority, llvmlite)
 1. **MIR→LLVM IR**: convert MIR13 to LLVM IR (✅ implemented)
 2. **LLVM IR→Object**: generate native object files (✅ implemented)
 3. **Python/llvmlite implementation**: SSA safety ensured by the Resolver pattern (✅ demonstrated)
@@ -233,10 +230,10 @@ ny_free_buf(buffer)
 ## 📅 Schedule (revised)

 - **In progress** (September 2025)
-  - Breakthrough with the Python/llvmlite implementation
-  - Object generation for dep_tree_min_string.nyash succeeded
-- **Phase 15.2**: LLVM independence (completion planned for Sep-Oct 2025)
-- **Phase 15.3**: Nyash compiler (Nov-Dec 2025)
+  - Python/llvmlite (default) / Cranelift (paused)
+  - PyVM (Python MIR VM) introduced; parity with llvmlite confirmed on the representative smokes
+- **Phase 15.2**: llvmlite stabilization + minimal PyVM complete (Sep-Oct 2025)
+- **Phase 15.3**: Nyash compiler MVP (Nov-Dec 2025)
 - **Phase 15.4**: VM layer moved to Nyash (Jan-Mar 2026)
 - **Phase 15.5**: ABI migration (after LLVM is complete, as needed)


@@ -1,10 +1,10 @@
-# Phase 15 recommended order (JIT first, minimal self-hosting)
+# Phase 15 recommended order (llvmlite+PyVM first, minimal self-hosting)

 Updated: 2025-09-05

 ## Policy (principles)

-- Move forward JIT-only (Cranelift). LLVM/AOT and lld work slides to a later stage.
+- JIT/Cranelift is stopped. Move forward on two paths: LLVM (llvmlite) and PyVM.
 - Establish a minimal self-hosting experience early → firm up docs, smokes, and CI first.
 - `using` (namespaces) is introduced in stages behind a gate. Strengthen the NyModules and ny_plugins foundations.
 - Use tmux + codex-async to keep two tracks running in parallel and build up in small increments.
@@ -25,18 +25,17 @@
 **Completion criteria:**
 - env.modules.get("acme.logger") and similar lookups work, LIST_ONLY/Fail-continue is preserved, and reservation-rejection logs are emitted.

-### 2) Minimal compiler path (JIT)
+### 2) Minimal VM (PyVM)
 **Key points:**
-- Parser/lexer subset: ident/literals/let/call/return/if/block
-- MIR builder callable from Nyash (small subset)
-- apps/selfhost-minimal runs through the VM/JIT bridge
+- Execute MIR(JSON) on the Python VM (PyVM). Minimal instructions + minimal boxcall (Console/File/Path/String)
+- Runner integration (`NYASH_VM_USE_PY=1`) → representative smokes match llvmlite

 **Smokes/CI:**
-- tools/jit_smoke.sh, tools/selfhost_vm_smoke.sh
+- tools/compare_harness_on_off.sh (harness), compare_vm_vs_harness.sh (PyVM vs llvmlite)

 **Completion criteria:**
-- ./target/release/nyash --backend vm apps/selfhost-minimal/main.nyash runs stably, and the JIT smoke passes in CI
+- esc_dirname_smoke / dep_tree_min_string match between PyVM and llvmlite
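The completion criterion is a parity check: two backends must produce the same stdout and the same exit code. A minimal sketch of that comparison follows. `RunResult`, `run`, and `parity_report` are hypothetical names, and this is not the contents of `tools/parity.sh`, which is not shown in this section.

```python
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunResult:
    stdout: str
    exit_code: int


def run(cmd: List[str]) -> RunResult:
    """Run one backend command and capture what the parity check compares."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return RunResult(proc.stdout, proc.returncode)


def parity_report(a: RunResult, b: RunResult) -> List[str]:
    """Return mismatch descriptions; an empty list means parity holds."""
    problems: List[str] = []
    if a.stdout != b.stdout:
        problems.append("stdout differs")
    if a.exit_code != b.exit_code:
        problems.append(f"exit code differs: {a.exit_code} vs {b.exit_code}")
    return problems
```

Comparing stdout and the exit code separately makes a failure report say which of the two diverged, which helps when one backend prints the right text but returns a boxed handle as its exit status.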
 ### 3) using (gated design and implementation: 15.2/15.3)

Binary file not shown.


src/llvm_py/__init__.py (new file, 7 lines)

@@ -0,0 +1,7 @@
+"""Top-level package for Nyash Python backends.
+
+Subpackages:
+- pyvm: Python MIR interpreter (PyVM)
+- instructions/*: llvmlite lowering helpers (AOT harness)
+"""


@@ -69,6 +69,7 @@ def lower_binop(
         i8p = ir.IntType(8).as_pointer()
         lhs_raw = vmap.get(lhs)
         rhs_raw = vmap.get(rhs)
+        # Prefer handle pipeline to keep handles consistent across blocks/ret
         # pointer present?
         is_ptr_side = (hasattr(lhs_raw, 'type') and isinstance(lhs_raw.type, ir.PointerType)) or \
                       (hasattr(rhs_raw, 'type') and isinstance(rhs_raw.type, ir.PointerType))
@@ -86,7 +87,7 @@ def lower_binop(
         is_str = is_ptr_side or any_tagged
         if is_str:
             # Helper: convert raw or resolved value to string handle
-            def to_handle(raw, val, tag: str):
+            def to_handle(raw, val, tag: str, vid: int):
                 if raw is not None and hasattr(raw, 'type') and isinstance(raw.type, ir.PointerType):
                     # pointer-to-array -> GEP
                     try:
@@ -104,11 +105,29 @@ def lower_binop(
                     return builder.call(cal, [raw], name=f"str_ptr2h_{tag}_{dst}")
                 # if already i64
                 if val is not None and hasattr(val, 'type') and isinstance(val.type, ir.IntType) and val.type.width == 64:
+                    # Distinguish handle vs numeric: if vid is tagged string-ish, treat as handle; otherwise box numeric to handle
+                    is_tag = False
+                    try:
+                        if resolver is not None and hasattr(resolver, 'is_stringish'):
+                            is_tag = resolver.is_stringish(vid)
+                    except Exception:
+                        is_tag = False
+                    if is_tag:
                         return val
+                    # Box numeric i64 to IntegerBox handle
+                    cal = None
+                    for f in builder.module.functions:
+                        if f.name == 'nyash.box.from_i64':
+                            cal = f; break
+                    if cal is None:
+                        cal = ir.Function(builder.module, ir.FunctionType(i64, [i64]), name='nyash.box.from_i64')
+                    # Ensure value is i64
+                    v64 = val if val.type.width == 64 else builder.zext(val, i64)
+                    return builder.call(cal, [v64], name=f"int_i2h_{tag}_{dst}")
                 return ir.Constant(i64, 0)

-            hl = to_handle(lhs_raw, lhs_val, 'l')
-            hr = to_handle(rhs_raw, rhs_val, 'r')
+            hl = to_handle(lhs_raw, lhs_val, 'l', lhs)
+            hr = to_handle(rhs_raw, rhs_val, 'r', rhs)
             # concat_hh(handle, handle) -> handle
             hh_fnty = ir.FunctionType(i64, [i64, i64])
             callee = None


@@ -131,6 +131,8 @@ def lower_boxcall(
         try:
             if resolver is not None and hasattr(resolver, 'mark_string'):
                 resolver.mark_string(dst_vid)
+            if resolver is not None and hasattr(resolver, 'string_ptrs'):
+                resolver.string_ptrs[int(dst_vid)] = p
         except Exception:
             pass
         return
@@ -196,23 +198,38 @@ def lower_boxcall(
         return

     if method_name in ("print", "println", "log"):
-        # Console mapping
+        # Console mapping (prefer pointer-API when possible to avoid handle registry mismatch)
+        use_ptr = False
+        arg0_vid = args[0] if args else None
+        arg0_ptr = None
+        if resolver is not None and hasattr(resolver, 'string_ptrs') and arg0_vid is not None:
+            try:
+                arg0_ptr = resolver.string_ptrs.get(int(arg0_vid))
+                if arg0_ptr is not None:
+                    use_ptr = True
+            except Exception:
+                pass
+        if use_ptr and arg0_ptr is not None:
+            callee = _declare(module, "nyash.console.log", i64, [i8p])
+            _ = builder.call(callee, [arg0_ptr], name="console_log_ptr")
+        else:
+            # Fallback: resolve i64 and prefer pointer API via to_i8p_h bridge
             if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None:
                 arg0 = resolver.resolve_i64(args[0], builder.block, preds, block_end_values, vmap, bb_map) if args else None
             else:
                 arg0 = vmap.get(args[0]) if args else None
             if arg0 is None:
-            arg0 = ir.Constant(i8p, None)
-        # Prefer handle API if arg is i64, else pointer API
-        if hasattr(arg0, 'type') and isinstance(arg0.type, ir.IntType) and arg0.type.width == 64:
-            # Optional runtime trace of the handle
-            import os as _os
-            if _os.environ.get('NYASH_LLVM_TRACE_FINAL') == '1':
-                trace = _declare(module, "nyash.debug.trace_handle", i64, [i64])
-                _ = builder.call(trace, [arg0], name="trace_handle")
-            callee = _declare(module, "nyash.console.log_handle", i64, [i64])
-            _ = builder.call(callee, [arg0], name="console_log_h")
+                arg0 = ir.Constant(i64, 0)
+            # If we have a handle (i64), convert to i8* via bridge and log via pointer API
+            if hasattr(arg0, 'type') and isinstance(arg0.type, ir.IntType):
+                if arg0.type.width != 64:
+                    arg0 = builder.zext(arg0, i64)
+                bridge = _declare(module, "nyash.string.to_i8p_h", i8p, [i64])
+                p = builder.call(bridge, [arg0], name="str_h2p_for_log")
+                callee = _declare(module, "nyash.console.log", i64, [i8p])
+                _ = builder.call(callee, [p], name="console_log_p")
             else:
+                # Non-integer value: coerce to i8* and log
                 if hasattr(arg0, 'type') and isinstance(arg0.type, ir.IntType):
                     arg0 = builder.inttoptr(arg0, i8p)
                 callee = _declare(module, "nyash.console.log", i64, [i8p])


@@ -107,5 +107,18 @@ def lower_call(
             'esc_json', 'node_json', 'dirname', 'join', 'read_all', 'toJson'
         ]):
             resolver.mark_string(dst_vid)
+            # Additionally, create a pointer view via bridge for println pointer-API
+            if resolver is not None and hasattr(resolver, 'string_ptrs'):
+                i64 = ir.IntType(64)
+                i8p = ir.IntType(8).as_pointer()
+                if hasattr(result, 'type') and isinstance(result.type, ir.IntType) and result.type.width == 64:
+                    bridge = None
+                    for f in module.functions:
+                        if f.name == 'nyash.string.to_i8p_h':
+                            bridge = f; break
+                    if bridge is None:
+                        bridge = ir.Function(module, ir.FunctionType(i8p, [i64]), name='nyash.string.to_i8p_h')
+                    pv = builder.call(bridge, [result], name=f"ret_h2p_{dst_vid}")
+                    resolver.string_ptrs[int(dst_vid)] = pv
     except Exception:
         pass


@@ -39,7 +39,7 @@ def lower_const(
         llvm_val = ir.Constant(f64, float(const_val))
         vmap[dst] = llvm_val

-    elif const_type == 'string':
+    elif const_type == 'string' or (isinstance(const_type, dict) and const_type.get('kind') in ('handle','ptr') and const_type.get('box_type') == 'StringBox'):
         # String constant - create global and immediately box to i64 handle
         i8 = ir.IntType(8)
         str_val = str(const_val)
@@ -82,6 +82,11 @@ def lower_const(
         # Mark this value-id as string-ish to guide '+' and '==' lowering
         if hasattr(resolver, 'mark_string'):
             resolver.mark_string(dst)
+        # Keep raw pointer for potential pointer-API sites (e.g., console.log)
+        try:
+            resolver.string_ptrs[dst] = gep
+        except Exception:
+            pass

     elif const_type == 'void':
         # Void/null constant - use i64 zero


@@ -58,6 +58,7 @@ def lower_phi(
     # Collect incoming values
     incoming_pairs: List[Tuple[ir.Block, ir.Value]] = []
+    used_default_zero = False
     for block_id in actual_preds:
         block = bb_map.get(block_id)
         vid = incoming_map.get(block_id)
@@ -76,6 +77,7 @@ def lower_phi(
             if val is None:
                 # Missing incoming for this predecessor → default 0
                 val = ir.Constant(phi_type, 0)
+                used_default_zero = True
         else:
             # Snapshot fallback
             if block_end_values is not None:
@@ -86,6 +88,7 @@ def lower_phi(
             if not val:
                 # Missing incoming for this predecessor → default 0
                 val = ir.Constant(phi_type, 0)
+                used_default_zero = True
         # Coerce pointer to i64 at predecessor end
         if hasattr(val, 'type') and val.type != phi_type:
             pb = ir.IRBuilder(block)
@@ -127,6 +130,16 @@ def lower_phi(
     # Store PHI result
     vmap[dst_vid] = phi
+    # Strict mode: fail fast on synthesized zeros (indicates incomplete incoming or dominance issue)
+    import os
+    if used_default_zero and os.environ.get('NYASH_LLVM_PHI_STRICT') == '1':
+        raise RuntimeError(f"[LLVM_PY] PHI dst={dst_vid} used synthesized zero; check preds/incoming")
+    if os.environ.get('NYASH_LLVM_TRACE_PHI') == '1':
+        try:
+            blkname = str(current_block.name)
+        except Exception:
+            blkname = '<blk>'
+        print(f"[PHI] {blkname} v{dst_vid} incoming={len(incoming_pairs)} zero={1 if used_default_zero else 0}")

     # Propagate string-ness: if any incoming value-id is tagged string-ish, mark dst as string-ish.
     try:
         if resolver is not None and hasattr(resolver, 'is_stringish') and hasattr(resolver, 'mark_string'):


@@ -30,12 +30,36 @@ def lower_return(
         builder.ret_void()
     else:
         # Get return value (prefer resolver)
+        ret_val = None
         if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None:
+            try:
                 if isinstance(return_type, ir.PointerType):
                     ret_val = resolver.resolve_ptr(value_id, builder.block, preds, block_end_values, vmap)
                 else:
-                ret_val = resolver.resolve_i64(value_id, builder.block, preds, block_end_values, vmap, bb_map)
+                    # Prefer pointer→handle reboxing for string-ish returns even if function return type is i64
+                    is_stringish = False
+                    if hasattr(resolver, 'is_stringish'):
+                        try:
+                            is_stringish = resolver.is_stringish(int(value_id))
+                        except Exception:
+                            is_stringish = False
+                    if is_stringish and hasattr(resolver, 'string_ptrs') and int(value_id) in getattr(resolver, 'string_ptrs'):
+                        # Re-box known string pointer to handle
+                        p = resolver.string_ptrs[int(value_id)]
+                        i8p = ir.IntType(8).as_pointer()
+                        i64 = ir.IntType(64)
+                        boxer = None
+                        for f in builder.module.functions:
+                            if f.name == 'nyash.box.from_i8_string':
+                                boxer = f; break
+                        if boxer is None:
+                            boxer = ir.Function(builder.module, ir.FunctionType(i64, [i8p]), name='nyash.box.from_i8_string')
+                        ret_val = builder.call(boxer, [p], name='ret_ptr2h')
                     else:
+                        ret_val = resolver.resolve_i64(value_id, builder.block, preds, block_end_values, vmap, bb_map)
+            except Exception:
+                ret_val = None
+        if ret_val is None:
             ret_val = vmap.get(value_id)
         if not ret_val:
             # Default based on return type


@@ -140,7 +140,22 @@ class NyashLLVMBuilder:
             else:
                 b.ret(ir.Constant(self.i32, 0))

-        return str(self.module)
+        ir_text = str(self.module)
+        # Optional IR dump to file for debugging
+        try:
+            dump_path = os.environ.get('NYASH_LLVM_DUMP_IR')
+            if dump_path:
+                os.makedirs(os.path.dirname(dump_path), exist_ok=True)
+                with open(dump_path, 'w') as f:
+                    f.write(ir_text)
+            elif os.environ.get('NYASH_CLI_VERBOSE') == '1':
+                # Default dump location when verbose and not explicitly set
+                os.makedirs('tmp', exist_ok=True)
+                with open('tmp/nyash_harness.ll', 'w') as f:
+                    f.write(ir_text)
+        except Exception:
+            pass
+        return ir_text

     def _create_dummy_main(self) -> str:
         """Create dummy ny_main that returns 0"""
@@ -185,6 +200,8 @@ class NyashLLVMBuilder:
                 self.resolver.string_ids.clear()
             if hasattr(self.resolver, 'string_literals'):
                 self.resolver.string_literals.clear()
+            if hasattr(self.resolver, 'string_ptrs'):
+                self.resolver.string_ptrs.clear()
         except Exception:
             pass

@@ -403,6 +420,15 @@ class NyashLLVMBuilder:
             dst = inst.get("dst")
             lower_boxcall(builder, self.module, box_vid, method, args, dst,
                           self.vmap, self.resolver, self.preds, self.block_end_values, self.bb_map)
+            # Optional: honor explicit dst_type for tagging (string handle)
+            try:
+                dst_type = inst.get("dst_type")
+                if dst is not None and isinstance(dst_type, dict):
+                    if dst_type.get("kind") == "handle" and dst_type.get("box_type") == "StringBox":
+                        if hasattr(self.resolver, 'mark_string'):
+                            self.resolver.mark_string(int(dst))
+            except Exception:
+                pass

         elif op == "externcall":
             func_name = inst.get("func")
@@ -661,7 +687,7 @@ def main():
     llvm_ir = builder.build_from_mir(mir_json)

     if os.environ.get('NYASH_CLI_VERBOSE') == '1':
-        print(f"[Python LLVM] Generated LLVM IR:\n{llvm_ir}")
+        print(f"[Python LLVM] Generated LLVM IR (see NYASH_LLVM_DUMP_IR or tmp/nyash_harness.ll)")

     builder.compile_to_object(output_file)
     print(f"Compiled to {output_file}")
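One edge case in the IR dump added above is worth a sketch. When `NYASH_LLVM_DUMP_IR` is a bare filename such as `out.ll`, `os.path.dirname` returns an empty string, `os.makedirs("")` raises `FileNotFoundError`, and the surrounding `except Exception: pass` swallows it, so no dump is written. The helper below is a hypothetical variant that guards that case; `dump_ir` is not a function in the commit.

```python
import os
import tempfile


def dump_ir(ir_text: str, dump_path: str) -> str:
    """Write ir_text to dump_path, creating the parent directory if there is one."""
    parent = os.path.dirname(dump_path)
    if parent:  # a bare filename has no parent; os.makedirs("") would raise
        os.makedirs(parent, exist_ok=True)
    with open(dump_path, "w", encoding="utf-8") as f:
        f.write(ir_text)
    return dump_path


# Example: dump into a fresh temporary directory, including a nested path.
workdir = tempfile.mkdtemp()
nested = dump_ir("; module", os.path.join(workdir, "ir", "harness.ll"))
```

Letting I/O errors propagate from a helper like this, and catching them at one call site, would also make a failed dump visible instead of silent.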


@@ -0,0 +1,6 @@
+"""PyVM package scaffold for Nyash MIR interpreter (Python).
+
+Modules:
+- vm: Tiny interpreter for MIR(JSON) produced by runner's mir_json_emit
+"""

390
src/llvm_py/pyvm/vm.py Normal file
View File

@ -0,0 +1,390 @@
"""
Minimal Python VM for Nyash MIR(JSON) parity with llvmlite.

Supported ops (MVP):
- const/binop/compare/branch/jump/ret
- phi (select by predecessor block)
- call (functions defined in the same MIR module)
- newbox: ConsoleBox, StringBox (minimal semantics); other boxes are opaque dicts
- boxcall: String.length/substring/lastIndexOf, Console.print/println/log,
  FileBox.open/read/close (read-only), PathBox.dirname/join
- externcall: nyash.console.println

Value model:
- i64 -> Python int
- f64 -> Python float
- string -> Python str
- void/null -> None
- ConsoleBox -> {"__box__":"ConsoleBox"}
- StringBox receiver -> Python str
"""
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple
import os


@dataclass
class Block:
    id: int
    instructions: List[Dict[str, Any]]


@dataclass
class Function:
    name: str
    params: List[int]
    blocks: Dict[int, Block]


class PyVM:
    def __init__(self, program: Dict[str, Any]):
        self.functions: Dict[str, Function] = {}
        for f in program.get("functions", []):
            name = f.get("name")
            params = [int(p) for p in f.get("params", [])]
            bmap: Dict[int, Block] = {}
            for bb in f.get("blocks", []):
                bmap[int(bb.get("id"))] = Block(id=int(bb.get("id")), instructions=list(bb.get("instructions", [])))
            self.functions[name] = Function(name=name, params=params, blocks=bmap)

    def _read(self, regs: Dict[int, Any], v: Optional[int]) -> Any:
        if v is None:
            return None
        return regs.get(int(v))

    def _set(self, regs: Dict[int, Any], dst: Optional[int], val: Any) -> None:
        if dst is None:
            return
        regs[int(dst)] = val

    def _truthy(self, v: Any) -> bool:
        if isinstance(v, bool):
            return v
        if isinstance(v, (int, float)):
            return v != 0
        if isinstance(v, str):
            return len(v) != 0
        return v is not None

    def _is_console(self, v: Any) -> bool:
        return isinstance(v, dict) and v.get("__box__") == "ConsoleBox"

    def run(self, entry: str) -> Any:
        fn = self.functions.get(entry)
        if fn is None:
            raise RuntimeError(f"entry function not found: {entry}")
        return self._exec_function(fn, [])

    def _exec_function(self, fn: Function, args: List[Any]) -> Any:
        # Intrinsic fast path for small helpers used in smokes
        ok, ret = self._try_intrinsic(fn.name, args)
        if ok:
            return ret
        # Initialize registers and bind params
        regs: Dict[int, Any] = {}
        if fn.params:
            for i, pid in enumerate(fn.params):
                regs[int(pid)] = args[i] if i < len(args) else None
        else:
            # Heuristic: derive param count from name suffix '/N' and bind to vids 0..N-1
            n = 0
            if "/" in fn.name:
                try:
                    n = int(fn.name.split("/")[-1])
                except Exception:
                    n = 0
            for i in range(n):
                regs[i] = args[i] if i < len(args) else None
        # Choose a deterministic first block (lowest id)
        if not fn.blocks:
            return 0
        cur = min(fn.blocks.keys())
        prev: Optional[int] = None
        # Simple block execution loop
        while True:
            block = fn.blocks.get(cur)
            if block is None:
                raise RuntimeError(f"block not found: {cur}")
            # Evaluate instructions sequentially
            i = 0
            while i < len(block.instructions):
                inst = block.instructions[i]
                op = inst.get("op")
                if op == "phi":
                    # incoming: [[vid, pred_bid], ...]
                    incoming = inst.get("incoming", [])
                    chosen: Any = None
                    # Prefer predecessor match; otherwise fallback to first
                    for vid, pb in incoming:
                        if prev is not None and int(pb) == int(prev):
                            chosen = regs.get(int(vid))
                            break
                    if chosen is None and incoming:
                        vid, _ = incoming[0]
                        chosen = regs.get(int(vid))
                    self._set(regs, inst.get("dst"), chosen)
                    i += 1
                    continue
                if op == "const":
                    val = inst.get("value", {})
                    ty = val.get("type")
                    vv = val.get("value")
                    if ty == "i64":
                        out = int(vv)
                    elif ty == "f64":
                        out = float(vv)
                    elif ty == "string" or (
                        isinstance(ty, dict)
                        and ty.get("kind") == "handle"
                        and ty.get("box_type") == "StringBox"
                    ):
                        # String constants arrive either as plain "string" or as a
                        # StringBox handle type (the form mir_json_emit.rs now emits).
                        out = str(vv)
                    else:
                        out = None
                    self._set(regs, inst.get("dst"), out)
                    i += 1
                    continue
                if op == "binop":
                    operation = inst.get("operation")
                    a = self._read(regs, inst.get("lhs"))
                    b = self._read(regs, inst.get("rhs"))
                    res: Any = None
                    if operation == "+":
                        if isinstance(a, str) or isinstance(b, str):
                            res = (str(a) if a is not None else "") + (str(b) if b is not None else "")
                        else:
                            av = 0 if a is None else int(a)
                            bv = 0 if b is None else int(b)
                            res = av + bv
                    elif operation == "-":
                        av = 0 if a is None else int(a)
                        bv = 0 if b is None else int(b)
                        res = av - bv
                    elif operation == "*":
                        av = 0 if a is None else int(a)
                        bv = 0 if b is None else int(b)
                        res = av * bv
                    elif operation == "/":
                        # Integer division semantics for now. Note: Python's // floors
                        # (rounds toward negative infinity), and a None/0 divisor is
                        # treated as 1 instead of raising.
                        av = 0 if a is None else int(a)
                        bv = 1 if b in (None, 0) else int(b)
                        res = av // bv
                    elif operation == "%":
                        av = 0 if a is None else int(a)
                        bv = 1 if b in (None, 0) else int(b)
                        res = av % bv
                    elif operation in ("&", "|", "^"):
                        # treat as bitwise on ints
                        ai, bi = (0 if a is None else int(a)), (0 if b is None else int(b))
                        if operation == "&":
                            res = ai & bi
                        elif operation == "|":
                            res = ai | bi
                        else:
                            res = ai ^ bi
                    elif operation in ("<<", ">>"):
                        ai, bi = (0 if a is None else int(a)), (0 if b is None else int(b))
                        res = (ai << bi) if operation == "<<" else (ai >> bi)
                    else:
                        raise RuntimeError(f"unsupported binop: {operation}")
                    self._set(regs, inst.get("dst"), res)
                    i += 1
                    continue
                if op == "compare":
                    operation = inst.get("operation")
                    a = self._read(regs, inst.get("lhs"))
                    b = self._read(regs, inst.get("rhs"))
                    res: bool
                    if operation == "==":
                        res = (a == b)
                    elif operation == "!=":
                        res = (a != b)
                    elif operation == "<":
                        res = (a < b)
                    elif operation == "<=":
                        res = (a <= b)
                    elif operation == ">":
                        res = (a > b)
                    elif operation == ">=":
                        res = (a >= b)
                    else:
                        raise RuntimeError(f"unsupported compare: {operation}")
                    # VM convention: booleans are i64 0/1
                    self._set(regs, inst.get("dst"), 1 if res else 0)
                    i += 1
                    continue
                if op == "newbox":
                    btype = inst.get("type")
                    if btype == "ConsoleBox":
                        val = {"__box__": "ConsoleBox"}
                    elif btype == "StringBox":
                        # empty string instance
                        val = ""
                    else:
                        # Unknown box -> opaque
                        val = {"__box__": btype}
                    self._set(regs, inst.get("dst"), val)
                    i += 1
                    continue
                if op == "boxcall":
                    recv = self._read(regs, inst.get("box"))
                    method = inst.get("method")
                    args = [self._read(regs, a) for a in inst.get("args", [])]
                    out: Any = None
                    # ConsoleBox methods
                    if method in ("print", "println", "log") and self._is_console(recv):
                        s = args[0] if args else ""
                        if s is None:
                            s = ""
                        # println is the primary one used by smokes; print/log are
                        # kept equivalent (all three emit a trailing newline).
                        print(str(s))
                        out = 0
                    # FileBox methods (minimal read-only)
                    elif isinstance(recv, dict) and recv.get("__box__") == "FileBox":
                        if method == "open":
                            path = str(args[0]) if len(args) > 0 else ""
                            mode = str(args[1]) if len(args) > 1 else "r"
                            ok = 0
                            content = None
                            if mode == "r":
                                try:
                                    with open(path, "r", encoding="utf-8") as f:
                                        content = f.read()
                                    ok = 1
                                except Exception:
                                    ok = 0
                                    content = None
                            recv["__open"] = (ok == 1)
                            recv["__path"] = path
                            recv["__content"] = content
                            out = ok
                        elif method == "read":
                            if isinstance(recv.get("__content"), str):
                                out = recv.get("__content")
                            else:
                                out = None
                        elif method == "close":
                            recv["__open"] = False
                            out = 0
                        else:
                            out = None
                    # PathBox methods (posix-like)
                    elif isinstance(recv, dict) and recv.get("__box__") == "PathBox":
                        if method == "dirname":
                            p = str(args[0]) if args else ""
                            # Normalize to POSIX-style
                            out = os.path.dirname(p)
                            if out == "":
                                out = "."
                        elif method == "join":
                            base = str(args[0]) if len(args) > 0 else ""
                            rel = str(args[1]) if len(args) > 1 else ""
                            out = os.path.join(base, rel)
                        else:
                            out = None
                    elif method == "length":
                        out = len(str(recv))
                    elif method == "substring":
                        s = str(recv)
                        start = int(args[0]) if len(args) > 0 else 0
                        end = int(args[1]) if len(args) > 1 else len(s)
                        out = s[start:end]
                    elif method == "lastIndexOf":
                        s = str(recv)
                        needle = str(args[0]) if args else ""
                        out = s.rfind(needle)
                    else:
                        # Unimplemented method -> no-op
                        out = None
                    self._set(regs, inst.get("dst"), out)
                    i += 1
                    continue
                if op == "externcall":
                    func = inst.get("func")
                    args = [self._read(regs, a) for a in inst.get("args", [])]
                    out: Any = None
                    if func == "nyash.console.println":
                        s = args[0] if args else ""
                        if s is None:
                            s = ""
                        print(str(s))
                        out = 0
                    else:
                        # Unknown extern
                        out = None
                    self._set(regs, inst.get("dst"), out)
                    i += 1
                    continue
                if op == "branch":
                    cond = self._read(regs, inst.get("cond"))
                    tid = int(inst.get("then"))
                    eid = int(inst.get("else"))
                    prev = cur
                    cur = tid if self._truthy(cond) else eid
                    # Restart execution at next block
                    break
                if op == "jump":
                    tgt = int(inst.get("target"))
                    prev = cur
                    cur = tgt
                    break
                if op == "ret":
                    v = self._read(regs, inst.get("value"))
                    return v
                if op == "call":
                    # Resolve function name from a register value, or take a
                    # literal name directly (avoids int() on a name string).
                    fval = inst.get("func")
                    if isinstance(fval, str):
                        fname = fval
                    else:
                        fname = self._read(regs, fval)
                        if not isinstance(fname, str):
                            fname = None
                    call_args = [self._read(regs, a) for a in inst.get("args", [])]
                    result = None
                    if isinstance(fname, str) and fname in self.functions:
                        callee = self.functions[fname]
                        result = self._exec_function(callee, call_args)
                    # Store result if needed
                    self._set(regs, inst.get("dst"), result)
                    i += 1
                    continue
                # Unhandled op -> skip
                i += 1
            else:
                # No explicit terminator; finish
                return 0

    def _try_intrinsic(self, name: str, args: List[Any]) -> Tuple[bool, Any]:
        try:
            if name == "Main.esc_json/1":
                s = "" if not args else ("" if args[0] is None else str(args[0]))
                out = []
                for ch in s:
                    if ch == "\\":
                        out.append("\\\\")
                    elif ch == '"':
                        out.append('\\"')
                    else:
                        out.append(ch)
                return True, "".join(out)
            if name == "Main.dirname/1":
                p = "" if not args else ("" if args[0] is None else str(args[0]))
                d = os.path.dirname(p)
                if d == "":
                    d = "."
                return True, d
        except Exception:
            pass
        return (False, None)
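
The `phi` handling in the interpreter above picks the incoming value whose predecessor block matches the block that was just left. A standalone sketch of that selection rule, assuming the `[[value_id, pred_block_id], ...]` layout used by the MIR JSON (the helper name `select_phi` is hypothetical):

```python
from typing import Any, Dict, List, Optional


def select_phi(incoming: List[List[int]], prev: Optional[int], regs: Dict[int, Any]) -> Any:
    """Pick the phi input for the predecessor block `prev`.

    Mirrors the interpreter: prefer the entry whose predecessor id equals
    `prev`; if nothing was chosen (no match, or the matched register is
    unset), fall back to the first incoming entry.
    """
    chosen: Any = None
    for vid, pred in incoming:
        if prev is not None and int(pred) == int(prev):
            chosen = regs.get(int(vid))
            break
    if chosen is None and incoming:
        chosen = regs.get(int(incoming[0][0]))
    return chosen


# Loop-counter shape: value 1 flows in from entry block 0, value 5 from loop body block 2.
regs = {1: 0, 5: 3}
incoming = [[1, 0], [5, 2]]
from_entry = select_phi(incoming, 0, regs)
from_body = select_phi(incoming, 2, regs)
no_prev = select_phi(incoming, None, regs)
print(from_entry, from_body, no_prev)
```

Note that the first-entry fallback also fires when the matching register holds `None`, so a legitimately void phi input cannot be distinguished from a missing one in this scheme.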


@@ -33,6 +33,8 @@ class Resolver:
self.f64_cache: Dict[Tuple[str, int], ir.Value] = {}
# String literal map: value_id -> Python string (for by-name calls)
self.string_literals: Dict[int, str] = {}
+ # Optional: value_id -> i8* pointer for string constants (lower_const can populate)
+ self.string_ptrs: Dict[int, ir.Value] = {}
# Track value-ids that are known to represent string handles (i64)
# This is a best-effort tag used to decide '+' as string concat when both sides are i64.
self.string_ids: set[int] = set()

src/runner/mir_json_emit.rs (new file, 127 lines)

@@ -0,0 +1,127 @@
use serde_json::json;
/// Emit MIR JSON for Python harness/PyVM.
/// The JSON schema matches tools/llvmlite_harness.py expectations and is
/// intentionally minimal for initial scaffolding.
pub fn emit_mir_json_for_harness(
module: &nyash_rust::mir::MirModule,
path: &std::path::Path,
) -> Result<(), String> {
use nyash_rust::mir::{MirInstruction as I, BinaryOp as B, CompareOp as C};
let mut funs = Vec::new();
for (name, f) in &module.functions {
let mut blocks = Vec::new();
let mut ids: Vec<_> = f.blocks.keys().copied().collect();
ids.sort();
for bid in ids {
if let Some(bb) = f.blocks.get(&bid) {
let mut insts = Vec::new();
// PHI first (optional)
for inst in &bb.instructions {
if let I::Phi { dst, inputs } = inst {
let incoming: Vec<_> = inputs
.iter()
.map(|(b, v)| json!([v.as_u32(), b.as_u32()]))
.collect();
insts.push(json!({"op":"phi","dst": dst.as_u32(), "incoming": incoming}));
}
}
// Non-PHI
for inst in &bb.instructions {
match inst {
I::Const { dst, value } => {
match value {
nyash_rust::mir::ConstValue::Integer(i) => {
insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": "i64", "value": i}}));
}
nyash_rust::mir::ConstValue::Float(fv) => {
insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": "f64", "value": fv}}));
}
nyash_rust::mir::ConstValue::Bool(b) => {
insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": "i64", "value": if *b {1} else {0}}}));
}
nyash_rust::mir::ConstValue::String(s) => {
// String constants are exported as StringBox handle by default
insts.push(json!({
"op":"const",
"dst": dst.as_u32(),
"value": {
"type": {"kind":"handle","box_type":"StringBox"},
"value": s
}
}));
}
nyash_rust::mir::ConstValue::Null | nyash_rust::mir::ConstValue::Void => {
insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": "void", "value": 0}}));
}
}
}
I::BinOp { dst, op, lhs, rhs } => {
let op_s = match op { B::Add=>"+",B::Sub=>"-",B::Mul=>"*",B::Div=>"/",B::Mod=>"%",B::BitAnd=>"&",B::BitOr=>"|",B::BitXor=>"^",B::Shl=>"<<",B::Shr=>">>",B::And=>"&",B::Or=>"|"};
insts.push(json!({"op":"binop","operation": op_s, "lhs": lhs.as_u32(), "rhs": rhs.as_u32(), "dst": dst.as_u32()}));
}
I::Compare { dst, op, lhs, rhs } => {
let op_s = match op { C::Lt=>"<", C::Le=>"<=", C::Gt=>">", C::Ge=>">=", C::Eq=>"==", C::Ne=>"!=" };
insts.push(json!({"op":"compare","operation": op_s, "lhs": lhs.as_u32(), "rhs": rhs.as_u32(), "dst": dst.as_u32()}));
}
I::Call { dst, func, args, .. } => {
let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect();
insts.push(json!({"op":"call","func": func.as_u32(), "args": args_a, "dst": dst.map(|d| d.as_u32())}));
}
I::ExternCall { dst, iface_name, method_name, args, .. } => {
let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect();
let func_name = if iface_name == "env.console" {
format!("nyash.console.{}", method_name)
} else { format!("{}.{}", iface_name, method_name) };
insts.push(json!({"op":"externcall","func": func_name, "args": args_a, "dst": dst.map(|d| d.as_u32())}));
}
I::BoxCall { dst, box_val, method, args, .. } => {
let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect();
// Minimal dst_type hints
let mut obj = json!({
"op":"boxcall","box": box_val.as_u32(), "method": method, "args": args_a, "dst": dst.map(|d| d.as_u32())
});
let m = method.as_str();
let dst_ty = if m == "substring" {
Some(json!({"kind":"handle","box_type":"StringBox"}))
} else if m == "length" || m == "lastIndexOf" {
Some(json!("i64"))
} else { None };
if let Some(t) = dst_ty { obj["dst_type"] = t; }
insts.push(obj);
}
I::NewBox { dst, box_type, args } => {
let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect();
insts.push(json!({"op":"newbox","type": box_type, "args": args_a, "dst": dst.as_u32()}));
}
I::Branch { condition, then_bb, else_bb } => {
insts.push(json!({"op":"branch","cond": condition.as_u32(), "then": then_bb.as_u32(), "else": else_bb.as_u32()}));
}
I::Jump { target } => {
insts.push(json!({"op":"jump","target": target.as_u32()}));
}
I::Return { value } => {
insts.push(json!({"op":"ret","value": value.map(|v| v.as_u32())}));
}
_ => { /* skip non-essential ops for initial harness */ }
}
}
if let Some(term) = &bb.terminator {
match term {
I::Return { value } => insts.push(json!({"op":"ret","value": value.map(|v| v.as_u32())})),
I::Jump { target } => insts.push(json!({"op":"jump","target": target.as_u32()})),
I::Branch { condition, then_bb, else_bb } => insts.push(json!({"op":"branch","cond": condition.as_u32(), "then": then_bb.as_u32(), "else": else_bb.as_u32()})),
_ => {}
}
}
blocks.push(json!({"id": bid.as_u32(), "instructions": insts}));
}
}
// Export parameter value-ids so a VM can bind arguments
let params: Vec<_> = f.params.iter().map(|v| v.as_u32()).collect();
funs.push(json!({"name": name, "params": params, "blocks": blocks}));
}
let root = json!({"functions": funs});
std::fs::write(path, serde_json::to_string_pretty(&root).unwrap())
.map_err(|e| format!("write mir json: {}", e))
}
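
To make the emitted schema concrete, here is a small Python sketch of a MIR JSON document in the shape the emitter above produces: `functions` containing `blocks` containing `instructions`, with string constants typed as a `StringBox` handle. The program itself and the helper `is_string_const` are made up for illustration; only the field names come from the emitter.

```python
import json

# Hand-written example in the emitter's shape; not real compiler output.
program = {
    "functions": [
        {
            "name": "Main.main",
            "params": [],
            "blocks": [
                {
                    "id": 0,
                    "instructions": [
                        {"op": "const", "dst": 1,
                         "value": {"type": {"kind": "handle", "box_type": "StringBox"},
                                   "value": "hello"}},
                        {"op": "const", "dst": 2, "value": {"type": "i64", "value": 0}},
                        {"op": "externcall", "func": "nyash.console.println",
                         "args": [1], "dst": None},
                        {"op": "ret", "value": 2},
                    ],
                }
            ],
        }
    ]
}


def is_string_const(inst):
    """True when a const instruction carries the StringBox handle type."""
    if inst.get("op") != "const":
        return False
    ty = inst["value"].get("type")
    return (isinstance(ty, dict)
            and ty.get("kind") == "handle"
            and ty.get("box_type") == "StringBox")


# Round-trip through JSON text, as the runner and the Python side would.
roundtrip = json.loads(json.dumps(program, indent=2))
string_dsts = [i["dst"]
               for f in roundtrip["functions"]
               for b in f["blocks"]
               for i in b["instructions"]
               if is_string_const(i)]
print(string_dsts)
```

A consumer that only checks `type == "string"` would miss these constants, so any reader of this JSON needs to accept the dict-valued `type` as well.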


@@ -30,6 +30,7 @@ use std::{fs, process};
mod modes;
mod demos;
mod json_v0_bridge;
+ mod mir_json_emit;
// v2 plugin system imports
use nyash_rust::runtime;


@@ -2,7 +2,6 @@ use super::super::NyashRunner;
use nyash_rust::{parser::NyashParser, mir::{MirCompiler, MirInstruction}, box_trait::IntegerBox};
use nyash_rust::mir::passes::method_id_inject::inject_method_ids;
use std::{fs, process};
- use serde_json::json;
impl NyashRunner {
/// Execute LLVM mode (split)
@@ -56,7 +55,7 @@ impl NyashRunner {
let tmp_dir = std::path::Path::new("tmp");
let _ = std::fs::create_dir_all(tmp_dir);
let mir_json_path = tmp_dir.join("nyash_harness_mir.json");
- if let Err(e) = emit_mir_json_for_harness(&module, &mir_json_path) {
+ if let Err(e) = crate::runner::mir_json_emit::emit_mir_json_for_harness(&module, &mir_json_path) {
eprintln!("❌ MIR JSON emit error: {}", e);
process::exit(1);
}
@@ -182,92 +181,4 @@ impl NyashRunner {
}
}
+ // emit_mir_json_for_harness moved to crate::runner::mir_json_emit
- fn emit_mir_json_for_harness(module: &nyash_rust::mir::MirModule, path: &std::path::Path) -> Result<(), String> {
use nyash_rust::mir::{MirInstruction as I, BinaryOp as B, CompareOp as C};
// Build JSON structure expected by python builder: { functions: [ { name, params, blocks: [ { id, instructions: [ ... ] } ] } ] }
let mut funs = Vec::new();
for (name, f) in &module.functions {
let mut blocks = Vec::new();
let mut ids: Vec<_> = f.blocks.keys().copied().collect();
ids.sort();
for bid in ids {
if let Some(bb) = f.blocks.get(&bid) {
let mut insts = Vec::new();
// PHI first (optional)
for inst in &bb.instructions {
if let I::Phi { dst, inputs } = inst {
let incoming: Vec<_> = inputs.iter().map(|(b, v)| json!([v.as_u32(), b.as_u32()])).collect();
insts.push(json!({"op":"phi","dst": dst.as_u32(), "incoming": incoming}));
}
}
// Non-PHI
for inst in &bb.instructions {
match inst {
I::Const { dst, value } => {
let (t, val) = match value {
nyash_rust::mir::ConstValue::Integer(i) => ("i64", json!(i)),
nyash_rust::mir::ConstValue::Float(fv) => ("f64", json!(fv)),
nyash_rust::mir::ConstValue::Bool(b) => ("i64", json!(if *b {1} else {0})),
nyash_rust::mir::ConstValue::String(s) => ("string", json!(s)),
nyash_rust::mir::ConstValue::Null => ("void", json!(0)),
nyash_rust::mir::ConstValue::Void => ("void", json!(0)),
};
insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": t, "value": val}}));
}
I::BinOp { dst, op, lhs, rhs } => {
let op_s = match op { B::Add=>"+",B::Sub=>"-",B::Mul=>"*",B::Div=>"/",B::Mod=>"%",B::BitAnd=>"&",B::BitOr=>"|",B::BitXor=>"^",B::Shl=>"<<",B::Shr=>">>",B::And=>"&",B::Or=>"|"};
insts.push(json!({"op":"binop","operation": op_s, "lhs": lhs.as_u32(), "rhs": rhs.as_u32(), "dst": dst.as_u32()}));
}
I::Compare { dst, op, lhs, rhs } => {
let op_s = match op { C::Lt=>"<", C::Le=>"<=", C::Gt=>">", C::Ge=>">=", C::Eq=>"==", C::Ne=>"!=" };
insts.push(json!({"op":"compare","operation": op_s, "lhs": lhs.as_u32(), "rhs": rhs.as_u32(), "dst": dst.as_u32()}));
}
I::Call { dst, func, args, .. } => {
let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect();
insts.push(json!({"op":"call","func": func.as_u32(), "args": args_a, "dst": dst.map(|d| d.as_u32())}));
}
I::ExternCall { dst, iface_name, method_name, args, .. } => {
let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect();
// Map known interfaces to NyRT symbols
let func_name = if iface_name == "env.console" {
format!("nyash.console.{}", method_name)
} else { format!("{}.{}", iface_name, method_name) };
insts.push(json!({"op":"externcall","func": func_name, "args": args_a, "dst": dst.map(|d| d.as_u32())}));
}
I::BoxCall { dst, box_val, method, args, .. } => {
let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect();
insts.push(json!({"op":"boxcall","box": box_val.as_u32(), "method": method, "args": args_a, "dst": dst.map(|d| d.as_u32())}));
}
I::NewBox { dst, box_type, args } => {
let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect();
insts.push(json!({"op":"newbox","type": box_type, "args": args_a, "dst": dst.as_u32()}));
}
I::Branch { condition, then_bb, else_bb } => {
insts.push(json!({"op":"branch","cond": condition.as_u32(), "then": then_bb.as_u32(), "else": else_bb.as_u32()}));
}
I::Jump { target } => {
insts.push(json!({"op":"jump","target": target.as_u32()}));
}
I::Return { value } => {
insts.push(json!({"op":"ret","value": value.map(|v| v.as_u32())}));
}
_ => { /* skip non-essential ops for initial harness */ }
}
}
// Terminator (if present)
if let Some(term) = &bb.terminator {
match term {
I::Return { value } => insts.push(json!({"op":"ret","value": value.map(|v| v.as_u32())})),
I::Jump { target } => insts.push(json!({"op":"jump","target": target.as_u32()})),
I::Branch { condition, then_bb, else_bb } => insts.push(json!({"op":"branch","cond": condition.as_u32(), "then": then_bb.as_u32(), "else": else_bb.as_u32()})),
_ => {}
}
}
blocks.push(json!({"id": bid.as_u32(), "instructions": insts}));
}
}
funs.push(json!({"name": name, "params": [], "blocks": blocks}));
}
let root = json!({"functions": funs});
std::fs::write(path, serde_json::to_string_pretty(&root).unwrap()).map_err(|e| format!("write mir json: {}", e))
}


@@ -118,6 +118,58 @@ impl NyashRunner {
}
}
// Optional: PyVM path. When NYASH_VM_USE_PY=1, emit MIR(JSON) and delegate execution to tools/pyvm_runner.py
if std::env::var("NYASH_VM_USE_PY").ok().as_deref() == Some("1") {
let py = which::which("python3").ok();
if let Some(py3) = py {
let runner = std::path::Path::new("tools/pyvm_runner.py");
if runner.exists() {
// Emit MIR(JSON)
let tmp_dir = std::path::Path::new("tmp");
let _ = std::fs::create_dir_all(tmp_dir);
let mir_json_path = tmp_dir.join("nyash_pyvm_mir.json");
if let Err(e) = crate::runner::mir_json_emit::emit_mir_json_for_harness(&module_vm, &mir_json_path) {
eprintln!("❌ PyVM MIR JSON emit error: {}", e);
process::exit(1);
}
if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") {
eprintln!("[Runner/VM] using PyVM → {} (mir={})", filename, mir_json_path.display());
}
// Determine entry function hint (prefer Main.main if present)
let entry = if module_vm.functions.contains_key("Main.main") {
"Main.main"
} else if module_vm.functions.contains_key("main") { "main" } else { "Main.main" };
// Spawn runner
let status = std::process::Command::new(py3)
.args([
runner.to_string_lossy().as_ref(),
"--in",
&mir_json_path.display().to_string(),
"--entry",
entry,
])
.status()
.map_err(|e| format!("spawn pyvm: {}", e))
.unwrap();
if !status.success() {
eprintln!("❌ PyVM failed (status={})", status.code().unwrap_or(-1));
process::exit(1);
}
// Propagate exit code if set
if let Some(code) = status.code() {
process::exit(code);
}
process::exit(0);
} else {
eprintln!("❌ PyVM runner not found: {}", runner.display());
process::exit(1);
}
} else {
eprintln!("❌ python3 not found in PATH. Install Python 3 to use PyVM.");
process::exit(1);
}
}
// Expose GC/scheduler hooks globally for JIT externs (checkpoint/await, etc.)
nyash_rust::runtime::global_hooks::set_from_runtime(&runtime);


@@ -43,11 +43,12 @@ if ! command -v llvm-config-18 >/dev/null 2>&1; then
exit 2
fi
- echo "[1/4] Building nyash (feature=llvm, harness-friendly) ..."
+ echo "[1/4] Building nyash (feature selectable) ..."
_LLVMPREFIX=$(llvm-config-18 --prefix)
- # Build only the core package to avoid compiling workspace plugin crates
+ # Select LLVM feature: default harness (llvm), or legacy inkwell when NYASH_LLVM_FEATURE=llvm-inkwell-legacy
+ LLVM_FEATURE=${NYASH_LLVM_FEATURE:-llvm}
LLVM_SYS_181_PREFIX="${_LLVMPREFIX}" LLVM_SYS_180_PREFIX="${_LLVMPREFIX}" \
- CARGO_INCREMENTAL=1 cargo build --release -p nyash-rust --features llvm >/dev/null
+ CARGO_INCREMENTAL=1 cargo build --release -p nyash-rust --features "$LLVM_FEATURE" >/dev/null
echo "[2/4] Emitting object (.o) via LLVM backend ..."
# Default object output path under target/aot_objects
@@ -57,7 +58,15 @@ stem=${stem%.nyash}
OBJ="${NYASH_LLVM_OBJ_OUT:-$PWD/target/aot_objects/${stem}.o}"
if [[ "${NYASH_LLVM_SKIP_EMIT:-0}" != "1" ]]; then
rm -f "$OBJ"
- NYASH_LLVM_OBJ_OUT="$OBJ" LLVM_SYS_181_PREFIX="${_LLVMPREFIX}" LLVM_SYS_180_PREFIX="${_LLVMPREFIX}" ./target/release/nyash --backend llvm "$INPUT" >/dev/null || true
+ if [[ "${NYASH_LLVM_FEATURE:-llvm}" == "llvm-inkwell-legacy" ]]; then
+ # Legacy path: do not use harness
+ NYASH_LLVM_OBJ_OUT="$OBJ" LLVM_SYS_181_PREFIX="${_LLVMPREFIX}" LLVM_SYS_180_PREFIX="${_LLVMPREFIX}" \
+ ./target/release/nyash --backend llvm "$INPUT" >/dev/null || true
+ else
+ # Harness path
+ NYASH_LLVM_OBJ_OUT="$OBJ" NYASH_LLVM_USE_HARNESS=1 LLVM_SYS_181_PREFIX="${_LLVMPREFIX}" LLVM_SYS_180_PREFIX="${_LLVMPREFIX}" \
+ ./target/release/nyash --backend llvm "$INPUT" >/dev/null || true
+ fi
fi
if [[ ! -f "$OBJ" ]]; then
echo "error: object not generated: $OBJ" >&2


@@ -7,9 +7,15 @@ ROOT_DIR=$(cd "$(dirname "$0")/.." && pwd)
cd "$ROOT_DIR"
PROFILE=${PROFILE:-release}
+ JOBS=${JOBS:-24}
- echo "[plugins] building all (profile=$PROFILE)"
+ echo "[plugins] building all (profile=$PROFILE, jobs=$JOBS)"
+ # Build all plugins in one go for maximum efficiency
+ echo "[plugins] building workspace..."
+ cargo build --workspace --$PROFILE -j $JOBS >/dev/null
+ # Copy artifacts to plugin directories
for dir in plugins/*; do
[[ -d "$dir" && -f "$dir/Cargo.toml" ]] || continue
pkg=$(grep -m1 '^name\s*=\s*"' "$dir/Cargo.toml" | sed -E 's/.*"(.*)".*/\1/')
@@ -19,7 +25,6 @@ for dir in plugins/*; do
libname=${pkg//-/_}
fi
echo "[plugins] -> $pkg (libname=$libname)"
- cargo build -p "$pkg" --$PROFILE >/dev/null
# Copy artifacts
outdir="target/$PROFILE"
# cdylib (.so/.dylib/.dll)


@@ -12,11 +12,19 @@ OFF_EXE=${OFF_EXE:-$ROOT_DIR/app_dep_tree_rust}
echo "[compare] target app: $APP"
echo "[compare] build (OFF/Rust LLVM or harness fallback) ..."
- # If legacy inkwell backend is not in use, fall back to harness for OFF as well
- NYASH_LLVM_SKIP_NYRT_BUILD=1 NYASH_LLVM_USE_HARNESS=1 "$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$OFF_EXE" >/dev/null
+ if [[ "${NYASH_COMPARE_INKWELL:-0}" == "1" ]]; then
+ echo " OFF=inkwell-legacy"
+ NYASH_LLVM_SKIP_NYRT_BUILD=1 NYASH_LLVM_FEATURE=llvm-inkwell-legacy \
+ "$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$OFF_EXE" >/dev/null
+ else
+ echo " OFF=harness"
+ NYASH_LLVM_SKIP_NYRT_BUILD=1 NYASH_LLVM_FEATURE=llvm \
+ "$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$OFF_EXE" >/dev/null
+ fi
echo "[compare] build (ON/llvmlite harness) ..."
- NYASH_LLVM_SKIP_NYRT_BUILD=1 NYASH_LLVM_USE_HARNESS=1 "$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$ON_EXE" >/dev/null
+ NYASH_LLVM_SKIP_NYRT_BUILD=1 NYASH_LLVM_FEATURE=llvm \
+ "$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$ON_EXE" >/dev/null
echo "[compare] run both and capture output ..."
ON_OUT="$OUTDIR/on.out"; OFF_OUT="$OUTDIR/off.out"


@@ -0,0 +1,13 @@
#!/usr/bin/env bash
set -euo pipefail
ROOT=$(cd "$(dirname "$0")/.." && pwd)
cd "$ROOT"
echo "[deny-direct] scanning src/llvm_py for direct vmap.get reads ..."
rg -n "vmap\\.get\\(" src/llvm_py \
-g '!src/llvm_py/resolver.py' \
-g '!src/llvm_py/llvm_builder.py' || true
echo "[hint] Prefer resolver.resolve_i64/resolve_ptr with (builder.block, preds, block_end_values, vmap, bb_map)."

tools/parity.sh (new file, 166 lines)

@@ -0,0 +1,166 @@
#!/usr/bin/env bash
set -euo pipefail
if [[ "${NYASH_CLI_VERBOSE:-0}" == "1" ]]; then
set -x
fi
usage() {
cat << USAGE
Nyash parity runner — compare two execution paths on the same .nyash
Usage: tools/parity.sh [options] <app.nyash>
Options:
--lhs <mode> Left mode: pyvm|llvmlite|vm (default: pyvm)
--rhs <mode> Right mode: pyvm|llvmlite|vm (default: llvmlite)
--timeout <s> Timeout seconds for each run (default: 12)
--show-diff Show unified diff when different
Compares stdout (normalized) and exit codes. Returns 0 when equal.
USAGE
}
APP=""
LHS="pyvm"
RHS="llvmlite"
TIMEOUT="12"
SHOW_DIFF=0
while [[ $# -gt 0 ]]; do
case "$1" in
-h|--help) usage; exit 0;;
--lhs) LHS="$2"; shift 2;;
--rhs) RHS="$2"; shift 2;;
--timeout) TIMEOUT="$2"; shift 2;;
--show-diff) SHOW_DIFF=1; shift;;
*) APP="$1"; shift;;
esac
done
if [[ -z "$APP" ]]; then
usage; exit 1
fi
if [[ ! -f "$APP" ]]; then
echo "error: app not found: $APP" >&2
exit 2
fi
ROOT=$(cd "$(dirname "$0")/.." && pwd)
NYASH_BIN="$ROOT/target/release/nyash"
if [[ ! -x "$NYASH_BIN" ]]; then
echo "[build] nyash not found; building release ..." >&2
(cd "$ROOT" && cargo build --release >/dev/null)
fi
has_cmd() { command -v "$1" >/dev/null 2>&1; }
normalize() {
# Remove runner/plugin noise and blank lines
sed -E \
-e 's/\r$//' \
-e '/^\[ConsoleBox\]/d' \
-e '/^\[FileBox\]/d' \
-e '/^\[plugin-loader\]/d' \
-e '/^\[Runner\//d' \
-e '/^DEBUG:/d' \
-e '/^🔌/d' \
-e '/^✅/d' \
-e '/^🚀/d' \
-e '/^⚡/d' \
-e '/^🦀/d' \
-e '/^🧠/d' \
-e '/^📊/d' \
-e '/^Result(Type)?\(/d' \
-e '/^Result:/d' \
-e '/^$/d'
}
run_pyvm() {
local app="$1"
local out code
if has_cmd timeout; then
out=$(NYASH_VM_USE_PY=1 timeout "${TIMEOUT}s" "$NYASH_BIN" --backend vm "$app" 2>&1) || code=$?
else
out=$(NYASH_VM_USE_PY=1 "$NYASH_BIN" --backend vm "$app" 2>&1) || code=$?
fi
code=${code:-0}
# Trailing newline keeps the marker below on its own line (GNU sed does not add one)
printf '%s\n' "$out" | normalize
echo "__EXIT_CODE__=$code"
}
run_vm() {
local app="$1"
local out code
if has_cmd timeout; then
out=$(timeout "${TIMEOUT}s" "$NYASH_BIN" --backend vm "$app" 2>&1) || code=$?
else
out=$("$NYASH_BIN" --backend vm "$app" 2>&1) || code=$?
fi
code=${code:-0}
# Trailing newline keeps the marker below on its own line (GNU sed does not add one)
printf '%s\n' "$out" | normalize
echo "__EXIT_CODE__=$code"
}
run_llvmlite() {
local app="$1"
if ! has_cmd llvm-config-18; then
echo "error: llvm-config-18 not found (required for llvmlite parity)." >&2
exit 3
fi
local stem exe
stem=$(basename "$app"); stem=${stem%.nyash}
exe="$ROOT/app_parity_${stem}"
NYASH_LLVM_FEATURE=llvm "${ROOT}/tools/build_llvm.sh" "$app" -o "$exe" >/dev/null || true
if [[ ! -x "$exe" ]]; then
echo "error: failed to build llvmlite executable: $exe" >&2
exit 4
fi
local out code
if has_cmd timeout; then
out=$(timeout "${TIMEOUT}s" "$exe" 2>&1) || code=$?
else
out=$("$exe" 2>&1) || code=$?
fi
code=${code:-0}
# Trailing newline keeps the marker below on its own line (GNU sed does not add one)
printf '%s\n' "$out" | normalize
echo "__EXIT_CODE__=$code"
}
run_mode() {
local mode="$1" app="$2"
case "$mode" in
pyvm) run_pyvm "$app" ;;
vm) run_vm "$app" ;;
llvmlite) run_llvmlite "$app" ;;
*) echo "error: unknown mode: $mode" >&2; exit 5;;
esac
}
LEFT=$(run_mode "$LHS" "$APP")
RIGHT=$(run_mode "$RHS" "$APP")
LCODE=$(printf '%s\n' "$LEFT" | sed -n 's/^__EXIT_CODE__=//p')
RCODE=$(printf '%s\n' "$RIGHT" | sed -n 's/^__EXIT_CODE__=//p')
LOUT=$(printf '%s\n' "$LEFT" | sed '/^__EXIT_CODE__=/d')
ROUT=$(printf '%s\n' "$RIGHT" | sed '/^__EXIT_CODE__=/d')
STATUS=0
if [[ "$LCODE" != "$RCODE" ]]; then
echo "[parity] exit code differs: $LHS=$LCODE, $RHS=$RCODE" >&2
STATUS=1
fi
if [[ "$LOUT" != "$ROUT" ]]; then
echo "[parity] stdout differs" >&2
if [[ "$SHOW_DIFF" -eq 1 ]]; then
diff -u <(printf '%s\n' "$LOUT") <(printf '%s\n' "$ROUT") || true
fi
STATUS=1
fi
if [[ "$STATUS" -eq 0 ]]; then
echo "✅ parity OK ($LHS == $RHS)" >&2
else
echo "❌ parity mismatch ($LHS != $RHS)" >&2
fi
exit "$STATUS"
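
The comparison rule in the script above (drop runner noise and blank lines, then require equal stdout and equal exit codes) can be sketched in Python as follows. This is an illustrative re-statement, not part of the tooling; `normalize` and `parity` are hypothetical helper names, and the prefix list copies the anchored patterns from the `sed` filter.

```python
from typing import List

# Line prefixes that the shell normalize() deletes (anchored at line start).
NOISE_PREFIXES = (
    "[ConsoleBox]", "[FileBox]", "[plugin-loader]", "[Runner/", "DEBUG:",
    "🔌", "✅", "🚀", "⚡", "🦀", "🧠", "📊",
    "Result(", "ResultType(", "Result:",
)


def normalize(text: str) -> List[str]:
    """Return the output lines that survive noise and blank-line removal."""
    kept = []
    for line in text.splitlines():  # splitlines also handles \r\n endings
        if not line or line.startswith(NOISE_PREFIXES):
            continue
        kept.append(line)
    return kept


def parity(lhs_out: str, lhs_code: int, rhs_out: str, rhs_code: int) -> bool:
    """True when both runs agree on exit code and normalized stdout."""
    return lhs_code == rhs_code and normalize(lhs_out) == normalize(rhs_out)


pyvm_out = "[Runner/VM] using PyVM\nhello\n\nResult: 0\n"
aot_out = "hello\r\n"
same = parity(pyvm_out, 0, aot_out, 0)
code_differs = parity(pyvm_out, 0, aot_out, 1)
print(same, code_differs)
```

Keeping the exit code as a separate field, rather than appending it to the output stream, avoids any dependence on whether the captured output ends with a newline.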

tools/pyvm_runner.py (new file, 76 lines)

@@ -0,0 +1,76 @@
#!/usr/bin/env python3
"""
Nyash PyVM runner (scaffold)
Usage:
- python3 tools/pyvm_runner.py --in mir.json [--entry Main.main]
Executes MIR(JSON) using a tiny Python interpreter for a minimal opcode set:
- const/binop/compare/branch/jump/ret
- newbox (ConsoleBox, StringBox minimal)
- boxcall (String: length/substring/lastIndexOf; Console: print/println/log)
- externcall (nyash.console.println)
On success, exits with the integer return value if it is an Integer; otherwise 0.
Outputs produced by println/log are written to stdout.
"""
import argparse
import json
import sys
import os
from pathlib import Path
ROOT = Path(__file__).resolve().parents[1]
PYVM_DIR = ROOT / "src" / "llvm_py" / "pyvm"
# Ensure imports can find the package root (src)
SRC_DIR = ROOT / "src"
if str(SRC_DIR) not in sys.path:
sys.path.insert(0, str(SRC_DIR))
from llvm_py.pyvm.vm import PyVM # type: ignore


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--in", dest="infile", required=True, help="MIR JSON input")
    ap.add_argument("--entry", dest="entry", default="Main.main", help="Entry function (default Main.main)")
    args = ap.parse_args()

    with open(args.infile, "r") as f:
        program = json.load(f)

    vm = PyVM(program)

    # Fallbacks for entry name
    entry = args.entry
    fun_names = {f.get("name", "") for f in program.get("functions", [])}
    if entry not in fun_names:
        if "main" in fun_names:
            entry = "main"
        elif "Main.main" in fun_names:
            entry = "Main.main"

    result = vm.run(entry)

    # Exit code convention: integers propagate; bool -> 0/1; else 0
    code = 0
    if isinstance(result, bool):
        code = 1 if result else 0
    elif isinstance(result, int):
        # Wrap into the signed 32-bit range (two's complement)
        code = int(result) & 0xFFFFFFFF
        if code & 0x80000000:
            code = -((~code + 1) & 0xFFFFFFFF)

    # For parity comparisons, avoid emitting extra lines here.
    sys.exit(code)
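

if __name__ == "__main__":
    try:
        main()
    except Exception as e:
        import traceback
        print(f"[pyvm] error: {e}", file=sys.stderr)
        # NOTE: the trailing `or True` makes this condition true whenever stderr
        # exists, so the traceback is printed regardless of NYASH_CLI_VERBOSE.
        if sys.stderr and (os.environ.get('NYASH_CLI_VERBOSE') == '1' or True):
            traceback.print_exc()
        sys.exit(1)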
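
The exit code convention used by the runner (bool to 0/1, integers wrapped into signed 32-bit, anything else 0) can be isolated as a small pure function, sketched below. The names `to_signed32`, `exit_code_for`, and `shell_visible_status` are hypothetical and not part of the commit. The last helper rests on the assumption that the caller is a POSIX shell, where `$?` reports only the low 8 bits of the status.

```python
def to_signed32(value):
    """Wrap an arbitrary Python int into the signed 32-bit range."""
    wrapped = value & 0xFFFFFFFF
    if wrapped & 0x80000000:
        return wrapped - 0x100000000
    return wrapped


def exit_code_for(result):
    """Map a VM return value to an exit code (hypothetical helper)."""
    # bool must be tested first: bool is a subclass of int in Python.
    if isinstance(result, bool):
        return 1 if result else 0
    if isinstance(result, int):
        return to_signed32(result)
    return 0


def shell_visible_status(code):
    """Low 8 bits only: what a POSIX shell reports in $? (assumption)."""
    return code & 0xFF
```

Because both sides of a parity check are compared through the shell, a return value of 256 and a return value of 0 would look identical there, even though the 32-bit values differ.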

58
tools/pyvm_vs_llvmlite.sh Normal file

@ -0,0 +1,58 @@
#!/usr/bin/env bash
set -euo pipefail
if [[ "${NYASH_CLI_VERBOSE:-0}" == "1" ]]; then
  set -x
fi
APP="${1:-apps/tests/esc_dirname_smoke.nyash}"
OUT="app_pyvm_cmp"
if [[ ! -f "$APP" ]]; then
  echo "error: app not found: $APP" >&2
  exit 2
fi
# 1) Build nyash with llvm harness enabled (build_llvm.sh does the right thing)
echo "[cmp] building AOT via llvmlite harness ..." >&2
./tools/build_llvm.sh "$APP" -o "$OUT" >/dev/null
# 2) Run AOT executable and capture output (stdout and stderr merged via 2>&1) + exit code
echo "[cmp] running AOT (llvmlite) ..." >&2
set +e
OUT_LL=$("./$OUT" 2>&1)
CODE_LL=$?
set -e
# 3) Run PyVM path (VM mode delegated to Python)
echo "[cmp] running PyVM ..." >&2
set +e
OUT_PY=$(NYASH_VM_USE_PY=1 ./target/release/nyash --backend vm "$APP" 2>&1)
CODE_PY=$?
set -e
echo "=== llvmlite (AOT) stdout ==="
echo "$OUT_LL" | sed -n '1,120p'
echo "=== PyVM stdout ==="
echo "$OUT_PY" | sed -n '1,120p'
echo "=== exit codes ==="
echo "llvmlite: $CODE_LL"
echo "PyVM : $CODE_PY"
DIFF=0
if [[ "$OUT_LL" != "$OUT_PY" ]]; then
  echo "[cmp] stdout differs" >&2
  DIFF=1
fi
if [[ "$CODE_LL" -ne "$CODE_PY" ]]; then
  echo "[cmp] exit code differs" >&2
  DIFF=1
fi
if [[ "$DIFF" -eq 0 ]]; then
  echo "✅ parity OK (stdout + exit code)"
else
  echo "❌ parity mismatch" >&2
  exit 1
fi
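
The comparison performed by this script (run two commands, capture merged output and exit code, report each difference) could be sketched in Python as follows. This is an assumption-laden sketch rather than the project's tooling: `capture` and `compare` are hypothetical names, and the commands passed in are whatever the caller chooses, not the actual nyash invocations.

```python
import subprocess
import sys


def capture(cmd):
    """Run cmd and return (merged stdout+stderr text, exit code)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same effect as the script's 2>&1
        text=True,
        check=False,               # a non-zero exit is data here, not an error
    )
    # $(...) in the shell strips trailing newlines; mirror that here.
    return proc.stdout.rstrip("\n"), proc.returncode


def compare(cmd_a, cmd_b):
    """Return a list of mismatch descriptions; empty means parity."""
    out_a, code_a = capture(cmd_a)
    out_b, code_b = capture(cmd_b)
    problems = []
    if out_a != out_b:
        problems.append("output differs")
    if code_a != code_b:
        problems.append("exit code differs")
    return problems
```

Collecting every mismatch before reporting, as the shell script does with its `DIFF` flag, shows in a single run whether the output, the exit code, or both diverged.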