diff --git a/.gitignore b/.gitignore index 831ccc8e..0d2949b5 100644 --- a/.gitignore +++ b/.gitignore @@ -142,4 +142,4 @@ docs/research/notes/ # 査読中・未公開論文 - Git追跡除外 docs/research/papers-under-review/ -# 完成・公開済み論文は docs/research/papers-published/ に配置(Git追跡対象) \ No newline at end of file +# 完成・公開済み論文は docs/research/papers-published/ に配置(Git追跡対象).pyenv/ diff --git a/CLAUDE.md b/CLAUDE.md index 40a22d5e..84b87805 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -216,13 +216,30 @@ NYASH_DISABLE_PLUGINS=1 ./target/release/nyash program.nyash # LLVMプラグイン実行(method_id使用) ./target/release/nyash --backend llvm program.nyash + +# Python/llvmliteハーネス使用(開発中) +NYASH_LLVM_USE_HARNESS=1 ./target/release/nyash program.nyash ``` -## 📝 Update (2025-09-12) 🎉 Python LLVM実装完了! -- 🐍 Python LLVM バックエンド実装完了(~2000行) +## 📝 Update (2025-09-13) 🎉 LLVM大進展! +- ✅ dep_tree_min_string.nyashのオブジェクト生成成功!(10.4KB) +- ✅ LLVM verifier green - dominance違反解決! +- 🐍 Python/llvmlite版を正式採用予定(開発速度10倍) - 🎯 Phase 15セルフホスティング継続中(80k→20k行目標) - 📋 詳細: [Phase 15 README](docs/development/roadmap/phases/phase-15/README.md) +### 🚀 新発見:プラグイン全方向ビルド戦略 +```bash +# 同じソースから全形式生成! +plugins/filebox/ +├── filebox.so # 動的版(開発用) +├── filebox.o # 静的リンク用 +└── filebox.a # アーカイブ版 + +# 単一EXE生成可能に! +clang main.o filebox.o pathbox.o libnyrt.a -o nyash_static.exe +``` + ## ⚡ 重要な設計原則 ### 🏗️ Everything is Box diff --git a/CURRENT_TASK.md b/CURRENT_TASK.md index e23c39b8..4dd22a72 100644 --- a/CURRENT_TASK.md +++ b/CURRENT_TASK.md @@ -1,24 +1,42 @@ -# Current Task (2025-09-11) — Phase 15 LLVM‑only +# Current Task (2025-09-11) — Phase 15 LLVM(主経路) + llvmlite Harness(検証・将来主役) Summary -- LLVM is the authoritative path; VM/Cranelift/Interpreter are not MIR14‑ready. -- Keep fallbacks minimal; fix MIR annotations first. -- ExternCall(console/debug) auto‑selects ptr/handle by IR type. -- StringBox NewBox i8* fast path; print/log choose automatically. -- Implement multi-function lowering and Call lowering for MIR14. 
+- LLVM AOT(Rust/inkwell)は引き続き主経路。ただし「反復速度・仕様変更耐性」を担保するため、Python/llvmlite ハーネスを正式導入し、代表ケースで両者の等価性を検証する。 +- VM/Cranelift/Interpreter は MIR14 非対応。MIR 正規化(Resolver・LoopForm規約)を Rust 側で担保し、ハーネスにも同じ形を供給する。 +- 代表ケース(apps/selfhost/tools/dep_tree_min_string.nyash)で `.o`(および必要時 EXE)を安定生成。Harness ON/OFF で機能同値を確認。 -Compact Roadmap (2025‑09‑12) -- Focus: LLVM AOT → Flow hardening, PHI(sealed)安定化, LoopForm導入, BuilderCursor厳格化。 +Hot Update — 2025‑09‑13(Harness 配線・フォールバック廃止) +- Runner(LLVMモード)にハーネス配線を追加。`NYASH_LLVM_USE_HARNESS=1` のとき: + - MIR(JSON) を `tmp/nyash_harness_mir.json` へ出力 + - `python3 tools/llvmlite_harness.py --in … --out …` で .o を生成 + - 失敗時は即エラー終了(Rust LLVM へのフォールバックは廃止) +- `tools/llvmlite_harness.py` を追加(ダミー/JSON入力の両方に対応)。 +- Python 側スキャフォールドを微修正(PHI 直配線、Resolver 最小実装、ExternCall x NyRT 記号、NewBox→`nyash.env.box.new`)。 +- プラグインを cdylib/staticlib 両対応に一括回収(主要プラグイン)。`tools/build_plugins_all.sh` 追加。 + +Hot Update — 2025‑09‑13(Resolver‑only 統一 + Harness ON green) +- Python/llvmlite 側で Resolver‑only を徹底(vmap 直参照を原則廃止)。 + - compare/binop/branch/call/externcall/boxcall/ret/typeop/safepoint のオペランド解決を `resolve_i64/resolve_ptr` に統一。 + - JSON φ はブロック先頭で即時降下(sealed配線)。incoming は pred の `block_end_values` から取得。型変換は pred terminator 直前に挿入。 + - 文字列はブロック間 i64(ハンドル)固定。i8* は call 直前のみ生成(concat/substring/lastIndexOf/len_h/eq_hh 実装)。 + - const(string) は GlobalVariable を保持し、使用側で GEP→i8* に正規化(dominator 違反回避)。 + - `main` 衝突回避: MIR 由来 `main` は private にし、`ny_main()` ラッパを自動生成(NyRT `main` と整合)。 +- 代表ケース(dep_tree_min_string): Harness ON で `.ll verify green → .o` を確認し、NyRT とリンクして EXE 生成成功。 + +Next(short) +1) ON/OFF 等価性の拡張(戻り値/検証ログ/最終出力の一致まで) +2) Resolver フォールバックの残存箇所を削除し、完全 Resolver‑only に固定 +3) 代表ケースの拡充(println/実出力の比較)とドキュメント更新(Resolver 規約・PHI/ptr/i64 ポリシー) + +Compact Roadmap(2025‑09‑13 改定) +- Focus A(Rust LLVM 維持): Flow hardening, PHI(sealed) 安定化, LoopForm 仕様遵守。 +- Focus B(Python Harness 導入): llvmlite による MIR(JSON)→IR/obj の高速経路を追加。ON/OFF で等価性を検証。 - Now: - - Fallback 
terminator整備、PHI(sealed)はsnapshot参照へ、castはpred終端直前に限定。 - - LoopForm Step 2.5/3(検出2段/dispatch骨格)完了。非破壊(Break集約のみ)。 - - BuilderCursor: post‑terminator挿入を即panic(strings/arith_ops/memへ適用済)。 -- Next (short): - 1) BuilderCursor厳格化の適用拡大(externcall→newbox→arrays→maps→call)。 - 2) Sealed SSA を既定ONに一本化(finalize_phis停止、seal_blockで完結)。NYASH_LLVM_PHI_SEALED は未設定時=ON。 - 3) LoopForm header PHI正規化の安定化(latch→header ON 時も verifier green)。 - 4) body→dispatchを単純ボディで常用化(段階ゲート)。 - 5) 計測: dispatch-only PHI/ゼロ合成減少、post‑terminator検知ゼロ継続。 + - Sealed SSA・Cursor 厳格化を導入済み。dep_tree_min_string の `.o` 生成と verifier green を Rust LLVM で確認済み。 +- Next(short): + 1) ON/OFF 等価性の拡張(戻り値/ログ/出力比較) + 2) Resolver フォールバックの完全除去(常時 Resolver 経由) + 3) ドキュメント更新(Resolver-only/局所化規律、PHI(sealed)、ptr/i64 ブリッジ) - Flags: - `NYASH_ENABLE_LOOPFORM=1`(非破壊ON) - `NYASH_LOOPFORM_BODY2DISPATCH=1`(実験: 単純ボディのbody→dispatch) @@ -54,6 +72,35 @@ Hot Update — 2025‑09‑12 (Plan: LLVM wrapper via Nyash ABI) - I/O 仕様: 入力=MIR(JSON/メモリ), 出力=.o(`NYASH_AOT_OBJECT_OUT` に書き出し)。 - 受け入れ: harness ON/OFF で dep_tree_min_string の出力一致(機能同値)。 +Update — 2025‑09‑13(Harness 本採用・責務分離) +- 目的: Rust×inkwell の反復コストを下げ、仕様変更への追従を高速化。 +- 切替方針: + - Rust: MIR 正規化(Resolver 統一・LoopForm 規約)+ MIR(JSON) 出力+ランチャー。 + - Python(llvmlite): IR/JIT/`.ll`/`.o` 生成(まずは `.ll→llc→.o`)。 +- スイッチ: + - `NYASH_LLVM_USE_HARNESS=1` → MIR(JSON) を書き出し `tools/llvmlite_harness.py` を起動し `.o` を生成。 + - OFF → 従来どおり Rust LLVM で `.o` 生成。 +- 受け入れ基準(A5 改定): + - dep_tree_min_string で Harness ON/OFF ともに `.ll verify green`(ハーネス経路)および `.o` 生成成功。 + - 代表ケースの戻り値・主なログが一致(必要に応じ IR 差分検査は参考)。 + +Tasks(Harness 導入の具体) +1) ランチャー配線と CLI 拡張 + - `--emit-mir-json ` を追加(Resolver/LoopForm 規約済みの MIR14 を JSON で吐く)。 + - `NYASH_LLVM_USE_HARNESS=1` 時は `.json → tools/llvmlite_harness.py --in … --out …` を実行して `.o` を生成。 + - 出力先は `NYASH_LLVM_OBJ_OUT`(既存)または `NYASH_AOT_OBJECT_OUT` を尊重。 +2) llvmlite_harness 実装(docs/LLVM_HARNESS.md に準拠) + - 最小命令: Const/BinOp/Compare/Phi/Branch/Jump/Return。 + - 文字列/NyRT 呼び出し: `nyash.string.*`, 
`nyash.box.*`, `nyash.env.box.*` を declare して call。 + - ループ/PHI: Rust 側が担保した dispatch‑only PHI に従い、PHI 作成と incoming 追加を素直に行う。 +3) スモーク&代表ケース + - ny-llvm-smoke で Round Trip → dep_tree_min_string で `.ll verify → .o` まで。 +4) Deny‑Direct 継続 + - lowering から `vmap.get(` 直参照ゼロ(Resolver 経由の原則を Python 側仕様にも反映)。 + +Notes(リンク形態) +- NyRT は静的リンク(libnyrt.a)。完全静的(-static)は musl 推奨で別途対応(プラグイン動的ロードは不可になる)。 + Scaffold — 2025‑09‑12 (llvmlite harness) - Added tools/llvmlite_harness.py (trivial ny_main returning 0) and docs/LLVM_HARNESS.md. - Use to validate toolchain wiring; extend to lower MIR14 JSON incrementally. diff --git a/app_dep_tree_py b/app_dep_tree_py new file mode 100644 index 00000000..5f3b2d02 Binary files /dev/null and b/app_dep_tree_py differ diff --git a/app_dep_tree_rust b/app_dep_tree_rust new file mode 100644 index 00000000..b5175176 Binary files /dev/null and b/app_dep_tree_rust differ diff --git a/app_llvm_guide b/app_llvm_guide new file mode 100644 index 00000000..22cd78ca Binary files /dev/null and b/app_llvm_guide differ diff --git a/app_llvm_test b/app_llvm_test index a21c6c33..affd304e 100644 Binary files a/app_llvm_test and b/app_llvm_test differ diff --git a/app_smoke b/app_smoke new file mode 100644 index 00000000..8594190b Binary files /dev/null and b/app_smoke differ diff --git a/docs/LLVM_HARNESS.md b/docs/LLVM_HARNESS.md index 16efebc7..87b87172 100644 --- a/docs/LLVM_HARNESS.md +++ b/docs/LLVM_HARNESS.md @@ -1,36 +1,41 @@ -# llvmlite Harness (Experimental) +# llvmlite Harness(正式導入・Rust LLVM 対置運用) Purpose -- Provide a fast, scriptable LLVM emission path using Python + llvmlite for validation and prototyping. -- Run in parallel with the Rust/inkwell path; keep outputs functionally equivalent for targeted smokes. +- Python + llvmlite による高速・柔軟な LLVM 生成経路を提供(検証・プロトタイプと将来の主役)。 +- Rust/inkwell 経路と並走し、代表ケースで機能同値(戻り値・検証)を維持。 Switch -- Set `NYASH_LLVM_USE_HARNESS=1` to prefer the harness (future: wired in LLVM backend entry). 
+- `NYASH_LLVM_USE_HARNESS=1` でハーネス優先(LLVM バックエンド入口から起動)。 -Protocol (tentative) -- Input: MIR14 JSON file path (subset sufficient for dep_tree_min_string initially). -- Output: `.o` object file written to `NYASH_AOT_OBJECT_OUT` or `--out` path. -- Entry function: `ny_main(i64 argc, i8** argv) -> i64` (returns app exit code/box-handle per ABI). +Protocol +- Input: MIR14 JSON(Rust 前段で Resolver/LoopForm 規約を満たした形)。 +- Output: `.o` オブジェクト(既定: `NYASH_AOT_OBJECT_OUT` または `NYASH_LLVM_OBJ_OUT`)。 +- 入口: `ny_main() -> i64`(戻り値は exit code 相当。必要時 handle 正規化を行う)。 Quick Start -- Install deps: `python3 -m pip install llvmlite` -- Generate a dummy object to validate toolchain: +- 依存: `python3 -m pip install llvmlite` +- ダミー生成(配線検証): - `python3 tools/llvmlite_harness.py --out /tmp/dummy.o` - - Link with NyRT as usual to produce an executable. + - NyRT(libnyrt.a)とリンクして EXE 化(例: `cc /tmp/dummy.o -L target/release -Wl,--whole-archive -lnyrt -Wl,--no-whole-archive -lpthread -ldl -lm -o app_dummy`)。 -Intended Wiring (Rust side) -- LLVM backend checks `NYASH_LLVM_USE_HARNESS=1` and, if set, exports MIR14 of the target module to a temp JSON, then invokes: - - `python3 tools/llvmlite_harness.py --in --out ` -- On success, the normal link step continues using ``. +Wiring(Rust 側) +- `NYASH_LLVM_USE_HARNESS=1` のとき: + 1) `--emit-mir-json ` 等で MIR(JSON) を出力 + 2) `python3 tools/llvmlite_harness.py --in --out ` を起動 + 3) 成功後は通常のリンク手順(NyRT とリンク) -Scope (Phase 15) -- Minimal ops: i64 arithmetic, comparisons, branches, PHI(Sealed), basic string ops through NyRT shims. -- Target case: `apps/selfhost/tools/dep_tree_min_string.nyash` builds and runs. 
+Scope(Phase 15) +- 最小命令: Const/BinOp/Compare/Phi/Branch/Jump/Return +- 文字列: NyRT Shim(`nyash.string.len_h`, `charCodeAt_h`, `concat_hh`, `eq_hh`)を declare → call +- NewBox/ExternCall/BoxCall: まずは固定シンボル/by-id を優先(段階導入) +- 目標: `apps/selfhost/tools/dep_tree_min_string.nyash` の `.ll verify green → .o` 安定化 Acceptance -- A5: Harness ON vs OFF produce functionally equivalent output for the target smoke. +- Harness ON/OFF で機能同値(戻り値/検証)。代表ケースで `.ll verify green` と `.o` 生成成功。 Notes -- The first version may ignore MIR details and emit a fixed `ny_main` body for smoke scaffolding; then iterate to lower MIR ops. -- Keep the harness self-contained; no external state besides inputs and env. +- 初版は固定 `ny_main` から開始してもよい(配線確認)。以降、MIR 命令を順次対応。 +- ハーネスは自律(外部状態に依存しない)。エラーは即 stderr に詳細を出す。 +Appendix: 静的リンクについて +- 生成 EXE は NyRT(libnyrt.a)を静的リンク。完全静的(-static)は musl 推奨(dlopen 不可になるため動的プラグインは使用不可)。 diff --git a/docs/LOWERING_CONTEXTS.md b/docs/LOWERING_CONTEXTS.md index fcb8d415..eb4dd993 100644 --- a/docs/LOWERING_CONTEXTS.md +++ b/docs/LOWERING_CONTEXTS.md @@ -56,7 +56,7 @@ Invariants Enforced by Design Dev Guards (optional, recommended) - `PhiGuard::assert_dispatch_only(&LowerFnCtx)` to fail fast when non-dispatch PHIs appear. - `LoopGuard::assert_preheader(&LowerFnCtx)` to ensure preheader presence and header i1 formation point. -- CI Deny-Direct: `rg "vmap\.get\("` must match zero in lowering sources. +- CI Deny-Direct: `rg -n "vmap\.get\(" src/backend/llvm/compiler/codegen/instructions | wc -l` must be `0`. Migration Plan 1) Introduce `LowerFnCtx`/`BlockCtx`/`InvokeCtx`; migrate `lower_boxcall` and invoke path first. @@ -66,5 +66,5 @@ Migration Plan Acceptance - Refactored entrypoints accept at most three boxed parameters. -- Deny-Direct passes (no direct `vmap.get` in lowering). +- Deny-Direct passes (no direct `vmap.get` in lowering/instructions). - Dominance: verifier green on representative functions (e.g., dep_tree_min_string). 
diff --git a/docs/development/roadmap/phases/phase-15/README.md b/docs/development/roadmap/phases/phase-15/README.md index 03808983..1aa9ed55 100644 --- a/docs/development/roadmap/phases/phase-15/README.md +++ b/docs/development/roadmap/phases/phase-15/README.md @@ -17,17 +17,21 @@ MIR 13命令の美しさを最大限に活かし、外部コンパイラ依存 ## 🚀 実装戦略(2025年9月更新) ### Phase 15.2: LLVM層の独立化(実装中) -- nyash-llvm-compiler crateの分離 +- **Python/llvmlite実装を正式採用**(開発速度10倍、~2400行) +- nyash-llvm-compiler crateの分離(Rust版も継続) - MIR JSON/バイナリ入力 → ネイティブEXE出力 +- プラグイン全方向ビルド戦略(.so/.o/.a同時生成) - 独立したツールとして配布可能 ### Phase 15.3: Nyashコンパイラ実装 -- NyashでNyashパーサー実装 -- AST→MIR変換 +- NyashでNyashパーサー実装(800行目標) +- AST→MIR変換(2500行目標) +- **循環依存なし**:nyrtがStringBox/ArrayBoxをC ABI経由で提供 - ブートストラップでセルフホスティング達成! ### Phase 15.4: VM層のNyash化(革新的) -- MIR解釈エンジンをNyashで実装 +- MIR解釈エンジンをNyashで実装(~5000行予想) +- 動的ディスパッチ(MapBox)で13命令処理 - コンパイル不要の即座実行 - デバッグ・開発効率の劇的向上 @@ -84,6 +88,7 @@ MIR 13命令の美しさを最大限に活かし、外部コンパイラ依存 - **型システム簡略化**: 動的型付けの恩恵(-20%) - **エラー処理統一**: Result地獄からの解放(-15%) - **動的ディスパッチ**: match文の大幅削減(-10%) +- **合計**: 80,000行→20,000行(75%削減) ### 実装例 ```nyash @@ -180,10 +185,11 @@ nyash build main.ny --backend=llvm --emit exe -o program.exe ### 実装戦略 #### LLVM バックエンド(優先) -1. **MIR→LLVM IR**: MIR13をLLVM IRに変換(実装済み) -2. **LLVM IR→Object**: ネイティブオブジェクトファイル生成(実装済み) -3. **Object→EXE**: リンカー統合でEXE作成(実装中) -4. **独立コンパイラ**: `nyash-llvm-compiler` crateとして分離(計画中) +1. **MIR→LLVM IR**: MIR13をLLVM IRに変換(✅ 実装済み) +2. **LLVM IR→Object**: ネイティブオブジェクトファイル生成(✅ 実装済み) +3. **Python/llvmlite実装**: Resolver patternでSSA安全性確保(✅ 実証済み) +4. **Object→EXE**: リンカー統合でEXE作成(🚀 実装中) +5. 
**独立コンパイラ**: `nyash-llvm-compiler` crateとして分離(📝 計画中) 詳細は[**LLVM EXE生成戦略**](implementation/llvm-exe-strategy.md)を参照。 @@ -224,13 +230,15 @@ ny_free_buf(buffer) - [Phase 12.7: ANCP圧縮](../phase-12.7/) - [Phase 15.1: AOT計画](phase-15.1/) -## 📅 実施時期 +## 📅 実施時期(修正版) - **現在進行中**(2025年9月) -- **Phase 15.2**: LLVM独立化(実装中) -- **Phase 15.3**: Nyashコンパイラ(2025年後半) -- **Phase 15.4**: VM層Nyash化(2026年前半) -- **Phase 15.5**: ABI移行(LLVM完成後) + - Python/llvmlite実装でブレークスルー + - dep_tree_min_string.nyashオブジェクト生成成功! +- **Phase 15.2**: LLVM独立化(2025年9-10月完成予定) +- **Phase 15.3**: Nyashコンパイラ(2025年11-12月) +- **Phase 15.4**: VM層Nyash化(2026年1-3月) +- **Phase 15.5**: ABI移行(LLVM完成後、必要に応じて) ## 💡 期待される成果 @@ -260,6 +268,13 @@ ny_free_buf(buffer) - コンパイラもBox - リンカーもBox - アセンブラもBox +- プラグインもBox(.so/.o/.a全方向対応) - すべてがBox! -**世界一美しい箱は、自分自身さえも美しく包み込む** \ No newline at end of file +**世界一美しい箱は、自分自身さえも美しく包み込む** + +### 🚀 次のマイルストーン +- ✅ LLVM dominance違反解決(Resolver pattern) +- 🚀 Python/llvmliteでEXE生成パイプライン完成 +- 📝 nyash-llvm-compiler分離設計 +- 📝 NyashパーサーMVP実装開始 \ No newline at end of file diff --git a/docs/development/roadmap/phases/phase-15/ROADMAP.md b/docs/development/roadmap/phases/phase-15/ROADMAP.md index 498f5d04..0bc0768c 100644 --- a/docs/development/roadmap/phases/phase-15/ROADMAP.md +++ b/docs/development/roadmap/phases/phase-15/ROADMAP.md @@ -19,16 +19,20 @@ This roadmap is a living checklist to advance Phase 15 with small, safe boxes. 
U ## Next (small boxes) 1) LLVM Native EXE Generation (Phase 15.2) 🚀 + - Python/llvmlite implementation as primary path (2400 lines, 10x faster development) - LLVM backend object → executable pipeline completion - Separate `nyash-llvm-compiler` crate (reduce main build weight) - Input: MIR (JSON/binary) → Output: native executable - Link with nyrt runtime (static/dynamic options) + - Plugin all-direction build strategy (.so/.o/.a simultaneous generation) - Integration: `nyash --backend llvm --emit exe program.nyash -o program.exe` 2) Standard Ny std impl (P0→実体化) - Implement P0 methods for string/array/map in Nyash (keep NyRT primitives minimal) - Enable via `nyash.toml` `[ny_plugins]` (opt‑in); extend `tools/jit_smoke.sh` 3) Ny compiler MVP (Ny→MIR on JIT path) (Phase 15.3) 🎯 - Ny tokenizer + recursive‑descent parser (current subset) in Ny; drive existing MIR builder + - Target: 800 lines parser + 2500 lines MIR builder = 3300 lines total + - No circular dependency: nyrt provides StringBox/ArrayBox via C ABI - Flag path: `NYASH_USE_NY_COMPILER=1` to switch rust→ny compiler; rust parser as fallback - Add apps/selfhost-compiler/ and minimal smokes 4) Bootstrap loop (c0→c1→c1') @@ -36,9 +40,11 @@ This roadmap is a living checklist to advance Phase 15 with small, safe boxes. U - **This achieves self-hosting!** Nyash compiles Nyash 5) VM Layer in Nyash (Phase 15.4) ⚡ - Implement MIR interpreter in Nyash (13 core instructions) + - Dynamic dispatch via MapBox for instruction handlers - BoxCall/ExternCall bridge to existing infrastructure - Optional LLVM JIT acceleration for hot paths - Enable instant execution without compilation + - Expected: 5000 lines for complete VM implementation 6) Plugins CI split (継続) - Core always‑on (JIT, plugins disabled); Plugins as optional job (strict off by default) @@ -65,6 +71,13 @@ This roadmap is a living checklist to advance Phase 15 with small, safe boxes. 
U - JSON v0 bridge: `tools/ny_parser_bridge_smoke.sh` / `tools/ny_parser_bridge_smoke.ps1` - E2E roundtrip: `tools/ny_roundtrip_smoke.sh` / `tools/ny_roundtrip_smoke.ps1` +## Implementation Dependencies + +- Phase 15.2 (LLVM EXE) → Phase 15.3 (Nyash Compiler) → Phase 15.4 (VM in Nyash) +- Python llvmlite serves as rapid prototyping path while Rust/inkwell continues +- Plugin all-direction build enables static executable generation +- Total expected Nyash code: ~20,000 lines (75% reduction from 80k Rust) + ## Stop criteria (Phase 15) - v0 E2E green (parser pipe + direct bridge) including Ny compiler MVP switch diff --git a/docs/development/runtime/ENV_VARS.md b/docs/development/runtime/ENV_VARS.md index 319ad4dd..12ae1ab1 100644 --- a/docs/development/runtime/ENV_VARS.md +++ b/docs/development/runtime/ENV_VARS.md @@ -43,6 +43,9 @@ NYASH_DISABLE_PLUGINS = "1" ## LLVM/AOT - LLVM_SYS_180_PREFIX: LLVM 18 のパス指定 - NYASH_LLVM_VINVOKE_RET_SMOKE, NYASH_LLVM_ARRAY_RET_SMOKE: CI 用スモークトグル +- NYASH_LLVM_OBJ_OUT: Rust LLVM 経路で生成する `.o` の出力パス(Runner/スクリプトが尊重) +- NYASH_AOT_OBJECT_OUT: AOT パイプラインで使用する `.o` 出力ディレクトリ/パス +- NYASH_LLVM_USE_HARNESS: "1" で llvmlite ハーネス経路を有効化(MIR(JSON)→Python→.ll→llc→.o) ## 管理方針(提案) - コード側: `src/config/env.rs` を単一の集約窓口に(JIT は `jit::config` に委譲)。 diff --git a/plugins/nyash-console-plugin/Cargo.toml b/plugins/nyash-console-plugin/Cargo.toml index af245903..1912e4c8 100644 --- a/plugins/nyash-console-plugin/Cargo.toml +++ b/plugins/nyash-console-plugin/Cargo.toml @@ -4,7 +4,7 @@ version = "0.1.0" edition = "2021" [lib] -crate-type = ["cdylib"] +crate-type = ["cdylib", "staticlib"] [dependencies] once_cell = "1.20" @@ -13,4 +13,3 @@ once_cell = "1.20" lto = true strip = true opt-level = "z" - diff --git a/plugins/nyash-counter-plugin/Cargo.toml b/plugins/nyash-counter-plugin/Cargo.toml index 8f986966..64ca1827 100644 --- a/plugins/nyash-counter-plugin/Cargo.toml +++ b/plugins/nyash-counter-plugin/Cargo.toml @@ -4,7 +4,7 @@ version = "0.1.0" edition = "2021" 
[lib] -crate-type = ["cdylib"] +crate-type = ["cdylib", "staticlib"] [dependencies] once_cell = "1.20" @@ -16,4 +16,3 @@ default = [] lto = true strip = true opt-level = "z" - diff --git a/plugins/nyash-egui-plugin/Cargo.toml b/plugins/nyash-egui-plugin/Cargo.toml index 170b0e5c..fb218b00 100644 --- a/plugins/nyash-egui-plugin/Cargo.toml +++ b/plugins/nyash-egui-plugin/Cargo.toml @@ -4,7 +4,7 @@ version = "0.1.0" edition = "2021" [lib] -crate-type = ["cdylib"] +crate-type = ["cdylib", "staticlib"] [dependencies] once_cell = "1.21" diff --git a/plugins/nyash-file/Cargo.toml b/plugins/nyash-file/Cargo.toml index f7564b36..73bbaff8 100644 --- a/plugins/nyash-file/Cargo.toml +++ b/plugins/nyash-file/Cargo.toml @@ -5,7 +5,7 @@ edition = "2021" [lib] name = "nyash_file" -crate-type = ["cdylib"] # 動的ライブラリとして生成 +crate-type = ["cdylib", "staticlib"] # 動的/静的 両対応 [dependencies] # C FFI用 @@ -15,4 +15,4 @@ libc = "0.2" # nyash-core = { path = "../.." } [features] -default = [] \ No newline at end of file +default = [] diff --git a/plugins/nyash-filebox-plugin/Cargo.toml b/plugins/nyash-filebox-plugin/Cargo.toml index 92d0f79f..80f50724 100644 --- a/plugins/nyash-filebox-plugin/Cargo.toml +++ b/plugins/nyash-filebox-plugin/Cargo.toml @@ -4,7 +4,7 @@ version = "0.1.0" edition = "2021" [lib] -crate-type = ["cdylib"] # 動的ライブラリとしてビルド +crate-type = ["cdylib", "staticlib"] # 動的/静的 両対応 [dependencies] # 最小限の依存関係のみ @@ -17,4 +17,4 @@ default = [] [profile.release] lto = true strip = true -opt-level = "z" # サイズ最適化 \ No newline at end of file +opt-level = "z" # サイズ最適化 diff --git a/plugins/nyash-net-plugin/Cargo.toml b/plugins/nyash-net-plugin/Cargo.toml index 38993e88..8d527fed 100644 --- a/plugins/nyash-net-plugin/Cargo.toml +++ b/plugins/nyash-net-plugin/Cargo.toml @@ -4,7 +4,7 @@ version = "0.1.0" edition = "2021" [lib] -crate-type = ["cdylib"] +crate-type = ["cdylib", "staticlib"] [dependencies] once_cell = "1.20" @@ -16,4 +16,3 @@ default = [] lto = true strip = true opt-level = "z" - 
diff --git a/plugins/nyash-python-parser-plugin/Cargo.toml b/plugins/nyash-python-parser-plugin/Cargo.toml index e0f73a1e..7916552e 100644 --- a/plugins/nyash-python-parser-plugin/Cargo.toml +++ b/plugins/nyash-python-parser-plugin/Cargo.toml @@ -4,11 +4,11 @@ version = "0.1.0" edition = "2021" [lib] -crate-type = ["cdylib"] +crate-type = ["cdylib", "staticlib"] [dependencies] pyo3 = { version = "0.22", features = ["auto-initialize"] } serde = { version = "1.0", features = ["derive"] } serde_json = "1.0" -[dev-dependencies] \ No newline at end of file +[dev-dependencies] diff --git a/plugins/nyash-test-multibox/Cargo.toml b/plugins/nyash-test-multibox/Cargo.toml index 84a1f9f9..36f0e7f8 100644 --- a/plugins/nyash-test-multibox/Cargo.toml +++ b/plugins/nyash-test-multibox/Cargo.toml @@ -4,6 +4,6 @@ version = "0.1.0" edition = "2021" [lib] -crate-type = ["cdylib"] +crate-type = ["cdylib", "staticlib"] -[dependencies] \ No newline at end of file +[dependencies] diff --git a/src/backend/llvm/compiler/codegen/instructions/boxcall.rs b/src/backend/llvm/compiler/codegen/instructions/boxcall.rs index b4ca0dae..5b6d4c06 100644 --- a/src/backend/llvm/compiler/codegen/instructions/boxcall.rs +++ b/src/backend/llvm/compiler/codegen/instructions/boxcall.rs @@ -91,7 +91,20 @@ pub(in super::super) fn lower_boxcall<'ctx, 'b>( } // getField/setField - if fields::try_handle_field_method(codegen, cursor, cur_bid, vmap, dst, method, args, recv_h, resolver, bb_map, preds, block_end_values)? { + if fields::try_handle_field_method( + codegen, + cursor, + cur_bid, + vmap, + dst, + method, + args, + recv_h, + resolver, + bb_map, + preds, + block_end_values, + )? { return Ok(()); } @@ -294,6 +307,50 @@ pub(in super::super) fn lower_boxcall_boxed<'ctx, 'b>( ) } +// Convenience wrapper: construct LowerFnCtx/BlockCtx inside to keep caller borrow scopes short. 
+pub(in super::super) fn lower_boxcall_via_ctx<'ctx, 'b>( + codegen: &'ctx CodegenContext<'ctx>, + cursor: &'b mut BuilderCursor<'ctx, 'b>, + resolver: &'b mut super::Resolver<'ctx>, + cur_bid: BasicBlockId, + func: &'b MirFunction, + vmap: &'b mut HashMap>, + dst: &Option, + box_val: &ValueId, + method: &str, + method_id: &Option, + args: &[ValueId], + box_type_ids: &'b HashMap, + entry_builder: &inkwell::builder::Builder<'ctx>, + bb_map: &'b std::collections::HashMap>, + preds: &'b std::collections::HashMap>, + block_end_values: &'b std::collections::HashMap>>, +) -> Result<(), String> { + let llbb = *bb_map.get(&cur_bid).ok_or("missing cur bb")?; + let blkctx = BlockCtx::new(cur_bid, llbb); + let mut fnctx = LowerFnCtx::new( + codegen, + func, + cursor, + resolver, + vmap, + bb_map, + preds, + block_end_values, + ) + .with_box_type_ids(box_type_ids); + lower_boxcall_boxed( + &mut fnctx, + &blkctx, + dst, + box_val, + method, + method_id, + args, + entry_builder, + ) +} + fn coerce_to_type<'ctx>( codegen: &CodegenContext<'ctx>, val: inkwell::values::BasicValueEnum<'ctx>, diff --git a/src/backend/llvm/compiler/codegen/instructions/mod.rs b/src/backend/llvm/compiler/codegen/instructions/mod.rs index 085e22a9..73fa16c5 100644 --- a/src/backend/llvm/compiler/codegen/instructions/mod.rs +++ b/src/backend/llvm/compiler/codegen/instructions/mod.rs @@ -21,7 +21,7 @@ pub(super) use blocks::{create_basic_blocks, precreate_phis}; pub(super) use flow::{emit_branch, emit_jump, emit_return}; pub(super) use externcall::lower_externcall; pub(super) use newbox::lower_newbox; -pub(super) use boxcall::{lower_boxcall, lower_boxcall_boxed}; +pub(super) use boxcall::{lower_boxcall, lower_boxcall_boxed, lower_boxcall_via_ctx}; pub(super) use arith::lower_compare; pub(super) use mem::{lower_load, lower_store}; pub(super) use consts::lower_const; diff --git a/src/backend/llvm/compiler/codegen/mod.rs b/src/backend/llvm/compiler/codegen/mod.rs index 79e56f29..845d114e 100644 --- 
a/src/backend/llvm/compiler/codegen/mod.rs +++ b/src/backend/llvm/compiler/codegen/mod.rs @@ -50,6 +50,7 @@ fn lower_one_function<'ctx>( func: &crate::mir::function::MirFunction, name: &str, box_type_ids: &HashMap, + llvm_funcs: &HashMap>, ) -> Result<(), String> { // Create basic blocks (prefix names with function label to avoid any ambiguity) let fn_label = sanitize_symbol(name); @@ -226,7 +227,7 @@ fn lower_one_function<'ctx>( callee, args, &build_const_str_map(func), - &std::collections::HashMap::new(), + llvm_funcs, &bb_map, &preds, &block_end_values, @@ -579,7 +580,7 @@ impl LLVMCompiler { // Lower all functions for (name, func) in &mir_module.functions { let llvm_func = *llvm_funcs.get(name).ok_or("predecl not found")?; - lower_one_function(&codegen, llvm_func, func, name, &box_type_ids)?; + lower_one_function(&codegen, llvm_func, func, name, &box_type_ids, &llvm_funcs)?; } // Build entry wrapper and emit object @@ -587,7 +588,7 @@ impl LLVMCompiler { } } -/* BEGIN_OLD_BLOCK +/* MirInstruction::NewBox { dst, box_type, args } => { instructions::lower_newbox( &codegen, @@ -1127,7 +1128,8 @@ impl LLVMCompiler { } } } -END_OLD_BLOCK */ +Old duplicate lowering block removed +*/ #[cfg(test)] mod tests { diff --git a/src/llvm_py/instructions/barrier.py b/src/llvm_py/instructions/barrier.py index 04b016bf..61418399 100644 --- a/src/llvm_py/instructions/barrier.py +++ b/src/llvm_py/instructions/barrier.py @@ -47,7 +47,11 @@ def lower_atomic_op( val_vid: Optional[int], dst_vid: Optional[int], vmap: Dict[int, ir.Value], - ordering: str = "seq_cst" + ordering: str = "seq_cst", + resolver=None, + preds=None, + block_end_values=None, + bb_map=None ) -> None: """ Lower atomic operations @@ -62,7 +66,10 @@ def lower_atomic_op( ordering: Memory ordering """ # Get pointer - ptr = vmap.get(ptr_vid) + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + ptr = resolver.resolve_ptr(ptr_vid, builder.block, preds, 
block_end_values, vmap) + else: + ptr = vmap.get(ptr_vid) if not ptr: # Create dummy pointer i64 = ir.IntType(64) @@ -78,13 +85,19 @@ def lower_atomic_op( elif op == "store": # Atomic store if val_vid is not None: - val = vmap.get(val_vid, ir.Constant(ir.IntType(64), 0)) + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + val = resolver.resolve_i64(val_vid, builder.block, preds, block_end_values, vmap, bb_map) + else: + val = vmap.get(val_vid, ir.Constant(ir.IntType(64), 0)) builder.store_atomic(val, ptr, ordering=ordering, align=8) elif op == "add": # Atomic add (fetch_add) if val_vid is not None: - val = vmap.get(val_vid, ir.Constant(ir.IntType(64), 1)) + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + val = resolver.resolve_i64(val_vid, builder.block, preds, block_end_values, vmap, bb_map) + else: + val = ir.Constant(ir.IntType(64), 1) result = builder.atomic_rmw("add", ptr, val, ordering=ordering) if dst_vid is not None: vmap[dst_vid] = result @@ -114,4 +127,4 @@ def insert_thread_fence( elif fence_type == "write": builder.fence("release") else: - builder.fence("seq_cst") \ No newline at end of file + builder.fence("seq_cst") diff --git a/src/llvm_py/instructions/binop.py b/src/llvm_py/instructions/binop.py index a3d10d65..41963896 100644 --- a/src/llvm_py/instructions/binop.py +++ b/src/llvm_py/instructions/binop.py @@ -5,6 +5,8 @@ Handles +, -, *, /, %, &, |, ^, <<, >> import llvmlite.ir as ir from typing import Dict +from .compare import lower_compare +import llvmlite.ir as ir def lower_binop( builder: ir.IRBuilder, @@ -14,7 +16,10 @@ def lower_binop( rhs: int, dst: int, vmap: Dict[int, ir.Value], - current_block: ir.Block + current_block: ir.Block, + preds=None, + block_end_values=None, + bb_map=None ) -> None: """ Lower MIR BinOp instruction @@ -31,18 +36,72 @@ def lower_binop( """ # Resolve operands as i64 (using resolver when available) # For 
now, simple vmap lookup - lhs_val = vmap.get(lhs, ir.Constant(ir.IntType(64), 0)) - rhs_val = vmap.get(rhs, ir.Constant(ir.IntType(64), 0)) + if resolver is not None and preds is not None and block_end_values is not None: + lhs_val = resolver.resolve_i64(lhs, current_block, preds, block_end_values, vmap, bb_map) + rhs_val = resolver.resolve_i64(rhs, current_block, preds, block_end_values, vmap, bb_map) + else: + lhs_val = vmap.get(lhs, ir.Constant(ir.IntType(64), 0)) + rhs_val = vmap.get(rhs, ir.Constant(ir.IntType(64), 0)) + # Relational/equality operators delegate to compare + if op in ('==','!=','<','>','<=','>='): + lower_compare(builder, op, lhs, rhs, dst, vmap) + return + + # String-aware concatenation unified to handles (i64) when any side is pointer string + if op == '+': + i64 = ir.IntType(64) + i8p = ir.IntType(8).as_pointer() + lhs_raw = vmap.get(lhs) + rhs_raw = vmap.get(rhs) + is_str = (hasattr(lhs_raw, 'type') and isinstance(lhs_raw.type, ir.PointerType)) or \ + (hasattr(rhs_raw, 'type') and isinstance(rhs_raw.type, ir.PointerType)) + if is_str: + # Helper: convert raw or resolved value to string handle + def to_handle(raw, val, tag: str): + if raw is not None and hasattr(raw, 'type') and isinstance(raw.type, ir.PointerType): + # pointer-to-array -> GEP + try: + if isinstance(raw.type.pointee, ir.ArrayType): + c0 = ir.Constant(ir.IntType(32), 0) + raw = builder.gep(raw, [c0, c0], name=f"bin_gep_{tag}_{dst}") + except Exception: + pass + cal = None + for f in builder.module.functions: + if f.name == 'nyash.box.from_i8_string': + cal = f; break + if cal is None: + cal = ir.Function(builder.module, ir.FunctionType(i64, [i8p]), name='nyash.box.from_i8_string') + return builder.call(cal, [raw], name=f"str_ptr2h_{tag}_{dst}") + # if already i64 + if val is not None and hasattr(val, 'type') and isinstance(val.type, ir.IntType) and val.type.width == 64: + return val + return ir.Constant(i64, 0) + + hl = to_handle(lhs_raw, lhs_val, 'l') + hr = 
to_handle(rhs_raw, rhs_val, 'r') + # concat_hh(handle, handle) -> handle + hh_fnty = ir.FunctionType(i64, [i64, i64]) + callee = None + for f in builder.module.functions: + if f.name == 'nyash.string.concat_hh': + callee = f; break + if callee is None: + callee = ir.Function(builder.module, hh_fnty, name='nyash.string.concat_hh') + res = builder.call(callee, [hl, hr], name=f"concat_hh_{dst}") + vmap[dst] = res + return + # Ensure both are i64 i64 = ir.IntType(64) if hasattr(lhs_val, 'type') and lhs_val.type != i64: # Type conversion if needed if lhs_val.type.is_pointer: - lhs_val = builder.ptrtoint(lhs_val, i64) + lhs_val = builder.ptrtoint(lhs_val, i64, name=f"binop_lhs_p2i_{dst}") if hasattr(rhs_val, 'type') and rhs_val.type != i64: if rhs_val.type.is_pointer: - rhs_val = builder.ptrtoint(rhs_val, i64) + rhs_val = builder.ptrtoint(rhs_val, i64, name=f"binop_rhs_p2i_{dst}") # Perform operation if op == '+': @@ -73,4 +132,4 @@ def lower_binop( result = ir.Constant(i64, 0) # Store result - vmap[dst] = result \ No newline at end of file + vmap[dst] = result diff --git a/src/llvm_py/instructions/boxcall.py b/src/llvm_py/instructions/boxcall.py index 7c984690..59ff7508 100644 --- a/src/llvm_py/instructions/boxcall.py +++ b/src/llvm_py/instructions/boxcall.py @@ -6,6 +6,36 @@ Core of Nyash's "Everything is Box" philosophy import llvmlite.ir as ir from typing import Dict, List, Optional +def _declare(module: ir.Module, name: str, ret, args): + for f in module.functions: + if f.name == name: + return f + fnty = ir.FunctionType(ret, args) + return ir.Function(module, fnty, name=name) + +def _ensure_handle(builder: ir.IRBuilder, module: ir.Module, v: ir.Value) -> ir.Value: + """Coerce a value to i64 handle. 
If pointer, box via nyash.box.from_i8_string.""" + i64 = ir.IntType(64) + if hasattr(v, 'type'): + if isinstance(v.type, ir.IntType) and v.type.width == 64: + return v + if isinstance(v.type, ir.PointerType): + # call nyash.box.from_i8_string(i8*) -> i64 + i8p = ir.IntType(8).as_pointer() + # If pointer-to-array, GEP to first element + try: + if isinstance(v.type.pointee, ir.ArrayType): + c0 = ir.IntType(32)(0) + v = builder.gep(v, [c0, c0], name="bc_str_gep") + except Exception: + pass + callee = _declare(module, "nyash.box.from_i8_string", i64, [i8p]) + return builder.call(callee, [v], name="str_ptr2h") + if isinstance(v.type, ir.IntType): + # extend/trunc to i64 + return builder.zext(v, i64) if v.type.width < 64 else builder.trunc(v, i64) + return ir.Constant(i64, 0) + def lower_boxcall( builder: ir.IRBuilder, module: ir.Module, @@ -14,7 +44,10 @@ def lower_boxcall( args: List[int], dst_vid: Optional[int], vmap: Dict[int, ir.Value], - resolver=None + resolver=None, + preds=None, + block_end_values=None, + bb_map=None ) -> None: """ Lower MIR BoxCall instruction @@ -31,74 +64,153 @@ def lower_boxcall( vmap: Value map resolver: Optional resolver for type handling """ - # Get box handle (i64) - box_handle = vmap.get(box_vid, ir.Constant(ir.IntType(64), 0)) - - # Ensure handle is i64 - if hasattr(box_handle, 'type') and box_handle.type.is_pointer: - box_handle = builder.ptrtoint(box_handle, ir.IntType(64)) - - # Method ID dispatch for plugin boxes - # This matches the current LLVM backend approach - method_id = hash(method_name) & 0xFFFF # Simple hash for demo - - # Look up or create ny_boxcall_by_id function - boxcall_func = None - for f in module.functions: - if f.name == "ny_boxcall_by_id": - boxcall_func = f - break - - if not boxcall_func: - # Declare ny_boxcall_by_id(handle: i64, method_id: i64, args: i8*) -> i64 - i8 = ir.IntType(8) - i64 = ir.IntType(64) - i8_ptr = i8.as_pointer() - - func_type = ir.FunctionType(i64, [i64, i64, i8_ptr]) - boxcall_func = 
ir.Function(module, func_type, name="ny_boxcall_by_id") - - # Prepare arguments array - i8 = ir.IntType(8) i64 = ir.IntType(64) - - if args: - # Allocate space for arguments (8 bytes per arg) - args_size = len(args) * 8 - args_ptr = builder.alloca(i8, size=args_size, name="boxcall_args") - - # Cast to i64* for storing arguments - i64_ptr_type = i64.as_pointer() - args_i64_ptr = builder.bitcast(args_ptr, i64_ptr_type) - - # Store each argument - for i, arg_id in enumerate(args): - arg_val = vmap.get(arg_id, ir.Constant(i64, 0)) - - # Ensure i64 - if hasattr(arg_val, 'type'): - if arg_val.type.is_pointer: - arg_val = builder.ptrtoint(arg_val, i64) - elif arg_val.type != i64: - # TODO: Handle other conversions - pass - - # Calculate offset and store - idx = ir.Constant(ir.IntType(32), i) - ptr = builder.gep(args_i64_ptr, [idx]) - builder.store(arg_val, ptr) - - # Cast back to i8* for call - call_args_ptr = builder.bitcast(args_i64_ptr, i8.as_pointer()) + i8 = ir.IntType(8) + i8p = i8.as_pointer() + + # Receiver value + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + recv_val = resolver.resolve_i64(box_vid, builder.block, preds, block_end_values, vmap, bb_map) else: - # No arguments - pass null - call_args_ptr = ir.Constant(i8.as_pointer(), None) - - # Make the boxcall - method_id_val = ir.Constant(i64, method_id) - result = builder.call(boxcall_func, [box_handle, method_id_val, call_args_ptr], - name=f"boxcall_{method_name}") - - # Store result if needed + recv_val = vmap.get(box_vid, ir.Constant(i64, 0)) + + # Minimal method bridging for strings and console + if method_name in ("length", "len"): + # Prefer handle-based len_h + recv_h = _ensure_handle(builder, module, recv_val) + callee = _declare(module, "nyash.string.len_h", i64, [i64]) + result = builder.call(callee, [recv_h], name="strlen_h") + if dst_vid is not None: + vmap[dst_vid] = result + return + + if method_name == "substring": + # substring(start, 
end) with pointer-based API + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + s = resolver.resolve_i64(args[0], builder.block, preds, block_end_values, vmap, bb_map) if args else ir.Constant(i64, 0) + e = resolver.resolve_i64(args[1], builder.block, preds, block_end_values, vmap, bb_map) if len(args) > 1 else ir.Constant(i64, 0) + else: + s = vmap.get(args[0], ir.Constant(i64, 0)) if args else ir.Constant(i64, 0) + e = vmap.get(args[1], ir.Constant(i64, 0)) if len(args) > 1 else ir.Constant(i64, 0) + # Coerce recv to i8* + recv_p = recv_val + if hasattr(recv_p, 'type') and isinstance(recv_p.type, ir.IntType): + recv_p = builder.inttoptr(recv_p, i8p, name="bc_i2p_recv") + elif hasattr(recv_p, 'type') and isinstance(recv_p.type, ir.PointerType): + try: + if isinstance(recv_p.type.pointee, ir.ArrayType): + c0 = ir.Constant(ir.IntType(32), 0) + recv_p = builder.gep(recv_p, [c0, c0], name="bc_gep_recv") + except Exception: + pass + else: + recv_p = ir.Constant(i8p, None) + # Coerce indices + if hasattr(s, 'type') and isinstance(s.type, ir.PointerType): + s = builder.ptrtoint(s, i64) + if hasattr(e, 'type') and isinstance(e.type, ir.PointerType): + e = builder.ptrtoint(e, i64) + callee = _declare(module, "nyash.string.substring_sii", i8p, [i8p, i64, i64]) + p = builder.call(callee, [recv_p, s, e], name="substring") + # Return as handle across blocks (i8* -> i64 via nyash.box.from_i8_string) + conv_fnty = ir.FunctionType(i64, [i8p]) + conv = _declare(module, "nyash.box.from_i8_string", i64, [i8p]) + h = builder.call(conv, [p], name="str_ptr2h_sub") + if dst_vid is not None: + vmap[dst_vid] = h + return + + if method_name == "lastIndexOf": + # lastIndexOf(needle) + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + needle = resolver.resolve_ptr(args[0], builder.block, preds, block_end_values, vmap) if args else ir.Constant(i8p, None) + else: + needle = 
vmap.get(args[0], ir.Constant(i8p, None)) if args else ir.Constant(i8p, None) + recv_p = recv_val + if hasattr(recv_p, 'type') and isinstance(recv_p.type, ir.IntType): + recv_p = builder.inttoptr(recv_p, i8p, name="bc_i2p_recv2") + elif hasattr(recv_p, 'type') and isinstance(recv_p.type, ir.PointerType): + try: + if isinstance(recv_p.type.pointee, ir.ArrayType): + c0 = ir.Constant(ir.IntType(32), 0) + recv_p = builder.gep(recv_p, [c0, c0], name="bc_gep_recv2") + except Exception: + pass + if hasattr(needle, 'type') and isinstance(needle.type, ir.IntType): + needle = builder.inttoptr(needle, i8p, name="bc_i2p_needle") + elif hasattr(needle, 'type') and isinstance(needle.type, ir.PointerType): + try: + if isinstance(needle.type.pointee, ir.ArrayType): + c0 = ir.Constant(ir.IntType(32), 0) + needle = builder.gep(needle, [c0, c0], name="bc_gep_needle") + except Exception: + pass + elif not hasattr(needle, 'type'): + needle = ir.Constant(i8p, None) + callee = _declare(module, "nyash.string.lastIndexOf_ss", i64, [i8p, i8p]) + res = builder.call(callee, [recv_p, needle], name="lastIndexOf") + if dst_vid is not None: + vmap[dst_vid] = res + return + + if method_name in ("print", "println", "log"): + # Console mapping + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + arg0 = resolver.resolve_i64(args[0], builder.block, preds, block_end_values, vmap, bb_map) if args else None + else: + arg0 = vmap.get(args[0]) if args else None + if arg0 is None: + arg0 = ir.Constant(i8p, None) + # Prefer handle API if arg is i64, else pointer API + if hasattr(arg0, 'type') and isinstance(arg0.type, ir.IntType) and arg0.type.width == 64: + callee = _declare(module, "nyash.console.log_handle", i64, [i64]) + _ = builder.call(callee, [arg0], name="console_log_h") + else: + if hasattr(arg0, 'type') and isinstance(arg0.type, ir.IntType): + arg0 = builder.inttoptr(arg0, i8p) + callee = _declare(module, "nyash.console.log", i64, [i8p]) + _ = 
builder.call(callee, [arg0], name="console_log") + if dst_vid is not None: + vmap[dst_vid] = ir.Constant(i64, 0) + return + + # Default: invoke via NyRT by-name shim (runtime resolves method id) + recv_h = _ensure_handle(builder, module, recv_val) + # Build C string for method name + mbytes = (method_name + "\0").encode('utf-8') + arr_ty = ir.ArrayType(ir.IntType(8), len(mbytes)) + try: + fn = builder.block.parent + fn_name = getattr(fn, 'name', 'fn') + except Exception: + fn_name = 'fn' + base = f".meth_{fn_name}_{method_name}" + existing = {g.name for g in module.global_values} + gname = base + k = 1 + while gname in existing: + gname = f"{base}.{k}"; k += 1 + g = ir.GlobalVariable(module, arr_ty, name=gname) + g.linkage = 'private' + g.global_constant = True + g.initializer = ir.Constant(arr_ty, bytearray(mbytes)) + c0 = ir.Constant(ir.IntType(32), 0) + mptr = builder.gep(g, [c0, c0], inbounds=True) + + # Up to 2 args for minimal path + argc = ir.Constant(i64, min(len(args), 2)) + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + a1 = resolver.resolve_i64(args[0], builder.block, preds, block_end_values, vmap, bb_map) if len(args) >= 1 else ir.Constant(i64, 0) + a2 = resolver.resolve_i64(args[1], builder.block, preds, block_end_values, vmap, bb_map) if len(args) >= 2 else ir.Constant(i64, 0) + else: + a1 = vmap.get(args[0], ir.Constant(i64, 0)) if len(args) >= 1 else ir.Constant(i64, 0) + a2 = vmap.get(args[1], ir.Constant(i64, 0)) if len(args) >= 2 else ir.Constant(i64, 0) + if hasattr(a1, 'type') and isinstance(a1.type, ir.PointerType): + a1 = builder.ptrtoint(a1, i64) + if hasattr(a2, 'type') and isinstance(a2.type, ir.PointerType): + a2 = builder.ptrtoint(a2, i64) + + callee = _declare(module, "nyash.plugin.invoke_by_name_i64", i64, [i64, i8p, i64, i64, i64]) + result = builder.call(callee, [recv_h, mptr, argc, a1, a2], name="pinvoke_by_name") if dst_vid is not None: - vmap[dst_vid] = result \ No newline 
at end of file + vmap[dst_vid] = result diff --git a/src/llvm_py/instructions/branch.py b/src/llvm_py/instructions/branch.py index 15e49d8f..973fafad 100644 --- a/src/llvm_py/instructions/branch.py +++ b/src/llvm_py/instructions/branch.py @@ -12,7 +12,10 @@ def lower_branch( then_bid: int, else_bid: int, vmap: Dict[int, ir.Value], - bb_map: Dict[int, ir.Block] + bb_map: Dict[int, ir.Block], + resolver=None, + preds=None, + block_end_values=None ) -> None: """ Lower MIR Branch instruction @@ -26,7 +29,10 @@ def lower_branch( bb_map: Block map """ # Get condition value - cond = vmap.get(cond_vid) + if resolver is not None and preds is not None and block_end_values is not None: + cond = resolver.resolve_i64(cond_vid, builder.block, preds, block_end_values, vmap, bb_map) + else: + cond = vmap.get(cond_vid) if not cond: # Default to false if missing cond = ir.Constant(ir.IntType(1), 0) @@ -47,4 +53,4 @@ def lower_branch( else_bb = bb_map.get(else_bid) if then_bb and else_bb: - builder.cbranch(cond, then_bb, else_bb) \ No newline at end of file + builder.cbranch(cond, then_bb, else_bb) diff --git a/src/llvm_py/instructions/call.py b/src/llvm_py/instructions/call.py index e1a54e2f..55ba45b9 100644 --- a/src/llvm_py/instructions/call.py +++ b/src/llvm_py/instructions/call.py @@ -13,7 +13,10 @@ def lower_call( args: List[int], dst_vid: Optional[int], vmap: Dict[int, ir.Value], - resolver=None + resolver=None, + preds=None, + block_end_values=None, + bb_map=None ) -> None: """ Lower MIR Call instruction @@ -27,49 +30,59 @@ def lower_call( vmap: Value map resolver: Optional resolver for type handling """ + # Resolve function: accepts string name or value-id referencing a string literal + actual_name = func_name + if not isinstance(func_name, str): + # Try resolver.string_literals + if resolver is not None and hasattr(resolver, 'string_literals'): + actual_name = resolver.string_literals.get(func_name) # Look up function in module func = None - for f in module.functions: - if 
f.name == func_name: - func = f - break + if isinstance(actual_name, str): + for f in module.functions: + if f.name == actual_name: + func = f + break if not func: - # Function not found - create declaration - # Default: i64(i64, ...) signature + # Function not found - create declaration with default i64 signature ret_type = ir.IntType(64) arg_types = [ir.IntType(64)] * len(args) + name = actual_name if isinstance(actual_name, str) else "unknown_fn" func_type = ir.FunctionType(ret_type, arg_types) - func = ir.Function(module, func_type, name=func_name) + func = ir.Function(module, func_type, name=name) # Prepare arguments call_args = [] for i, arg_id in enumerate(args): - arg_val = vmap.get(arg_id) - - if not arg_val: - # Default based on expected type + arg_val = None + if i < len(func.args): + expected_type = func.args[i].type + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + if hasattr(expected_type, 'is_pointer') and expected_type.is_pointer: + arg_val = resolver.resolve_ptr(arg_id, builder.block, preds, block_end_values, vmap) + else: + arg_val = resolver.resolve_i64(arg_id, builder.block, preds, block_end_values, vmap, bb_map) + if arg_val is None: + arg_val = vmap.get(arg_id) + if arg_val is None: if i < len(func.args): expected_type = func.args[i].type else: expected_type = ir.IntType(64) - if isinstance(expected_type, ir.IntType): arg_val = ir.Constant(expected_type, 0) elif isinstance(expected_type, ir.DoubleType): arg_val = ir.Constant(expected_type, 0.0) else: arg_val = ir.Constant(expected_type, None) - - # Type conversion if needed if i < len(func.args): expected_type = func.args[i].type if hasattr(arg_val, 'type') and arg_val.type != expected_type: if expected_type.is_pointer and isinstance(arg_val.type, ir.IntType): - arg_val = builder.inttoptr(arg_val, expected_type) + arg_val = builder.inttoptr(arg_val, expected_type, name=f"call_i2p_{i}") elif isinstance(expected_type, ir.IntType) and 
arg_val.type.is_pointer: - arg_val = builder.ptrtoint(arg_val, expected_type) - + arg_val = builder.ptrtoint(arg_val, expected_type, name=f"call_p2i_{i}") call_args.append(arg_val) # Make the call @@ -77,4 +90,4 @@ def lower_call( # Store result if needed if dst_vid is not None: - vmap[dst_vid] = result \ No newline at end of file + vmap[dst_vid] = result diff --git a/src/llvm_py/instructions/compare.py b/src/llvm_py/instructions/compare.py index 73db2d24..d9996bf8 100644 --- a/src/llvm_py/instructions/compare.py +++ b/src/llvm_py/instructions/compare.py @@ -5,6 +5,7 @@ Handles comparison operations (<, >, <=, >=, ==, !=) import llvmlite.ir as ir from typing import Dict +from .externcall import lower_externcall def lower_compare( builder: ir.IRBuilder, @@ -12,7 +13,12 @@ def lower_compare( lhs: int, rhs: int, dst: int, - vmap: Dict[int, ir.Value] + vmap: Dict[int, ir.Value], + resolver=None, + current_block=None, + preds=None, + block_end_values=None, + bb_map=None ) -> None: """ Lower MIR Compare instruction @@ -26,29 +32,62 @@ def lower_compare( vmap: Value map """ # Get operands - lhs_val = vmap.get(lhs, ir.Constant(ir.IntType(64), 0)) - rhs_val = vmap.get(rhs, ir.Constant(ir.IntType(64), 0)) - - # Ensure both are i64 + if resolver is not None and preds is not None and block_end_values is not None and current_block is not None: + lhs_val = resolver.resolve_i64(lhs, current_block, preds, block_end_values, vmap, bb_map) + rhs_val = resolver.resolve_i64(rhs, current_block, preds, block_end_values, vmap, bb_map) + else: + lhs_val = vmap.get(lhs) + rhs_val = vmap.get(rhs) + i64 = ir.IntType(64) - if hasattr(lhs_val, 'type') and lhs_val.type.is_pointer: + i8p = ir.IntType(8).as_pointer() + + # String-aware equality: if both are pointers, assume i8* strings + if op in ('==','!=') and hasattr(lhs_val, 'type') and hasattr(rhs_val, 'type'): + if isinstance(lhs_val.type, ir.PointerType) and isinstance(rhs_val.type, ir.PointerType): + # Box both to handles and call 
nyash.string.eq_hh + # nyash.box.from_i8_string(i8*) -> i64 + box_from = None + for f in builder.module.functions: + if f.name == 'nyash.box.from_i8_string': + box_from = f + break + if not box_from: + box_from = ir.Function(builder.module, ir.FunctionType(i64, [i8p]), name='nyash.box.from_i8_string') + lh = builder.call(box_from, [lhs_val], name='lhs_ptr2h') + rh = builder.call(box_from, [rhs_val], name='rhs_ptr2h') + + eqf = None + for f in builder.module.functions: + if f.name == 'nyash.string.eq_hh': + eqf = f + break + if not eqf: + eqf = ir.Function(builder.module, ir.FunctionType(i64, [i64, i64]), name='nyash.string.eq_hh') + eq = builder.call(eqf, [lh, rh], name='str_eq') + if op == '==': + vmap[dst] = eq + else: + # ne = 1 - eq + one = ir.Constant(i64, 1) + ne = builder.sub(one, eq, name='str_ne') + vmap[dst] = ne + return + + # Default integer compare path + if lhs_val is None: + lhs_val = ir.Constant(i64, 0) + if rhs_val is None: + rhs_val = ir.Constant(i64, 0) + + # Ensure both are i64 + if hasattr(lhs_val, 'type') and isinstance(lhs_val.type, ir.PointerType): lhs_val = builder.ptrtoint(lhs_val, i64) - if hasattr(rhs_val, 'type') and rhs_val.type.is_pointer: + if hasattr(rhs_val, 'type') and isinstance(rhs_val.type, ir.PointerType): rhs_val = builder.ptrtoint(rhs_val, i64) - # Map operations to LLVM predicates - op_map = { - '<': 'slt', # signed less than - '>': 'sgt', # signed greater than - '<=': 'sle', # signed less or equal - '>=': 'sge', # signed greater or equal - '==': 'eq', # equal - '!=': 'ne' # not equal - } - - pred = op_map.get(op, 'eq') - - # Perform comparison (returns i1) + # Perform signed comparison using canonical predicates ('<','>','<=','>=','==','!=') + pred = op if op in ('<','>','<=','>=','==','!=') else '==' cmp_result = builder.icmp_signed(pred, lhs_val, rhs_val, name=f"cmp_{dst}") # Convert i1 to i64 (0 or 1) @@ -81,19 +120,8 @@ def lower_fcmp( lhs_val = vmap.get(lhs, ir.Constant(f64, 0.0)) rhs_val = vmap.get(rhs, 
ir.Constant(f64, 0.0)) - # Map operations to LLVM predicates - op_map = { - '<': 'olt', # ordered less than - '>': 'ogt', # ordered greater than - '<=': 'ole', # ordered less or equal - '>=': 'oge', # ordered greater or equal - '==': 'oeq', # ordered equal - '!=': 'one' # ordered not equal - } - - pred = op_map.get(op, 'oeq') - - # Perform comparison (returns i1) + # Perform ordered comparison using canonical predicates + pred = op if op in ('<','>','<=','>=','==','!=') else '==' cmp_result = builder.fcmp_ordered(pred, lhs_val, rhs_val, name=f"fcmp_{dst}") # Convert i1 to i64 @@ -101,4 +129,4 @@ def lower_fcmp( result = builder.zext(cmp_result, i64, name=f"fcmp_i64_{dst}") # Store result - vmap[dst] = result \ No newline at end of file + vmap[dst] = result diff --git a/src/llvm_py/instructions/const.py b/src/llvm_py/instructions/const.py index e64089d1..9017abc2 100644 --- a/src/llvm_py/instructions/const.py +++ b/src/llvm_py/instructions/const.py @@ -11,7 +11,8 @@ def lower_const( module: ir.Module, dst: int, value: Dict[str, Any], - vmap: Dict[int, ir.Value] + vmap: Dict[int, ir.Value], + resolver=None ) -> None: """ Lower MIR Const instruction @@ -39,25 +40,31 @@ def lower_const( vmap[dst] = llvm_val elif const_type == 'string': - # String constant - create global and get pointer + # String constant - create global, store GlobalVariable (not GEP) to avoid dominance issues i8 = ir.IntType(8) str_val = str(const_val) - # Create array constant for the string str_bytes = str_val.encode('utf-8') + b'\0' - str_const = ir.Constant(ir.ArrayType(i8, len(str_bytes)), - bytearray(str_bytes)) - - # Create global string constant - global_name = f".str.{dst}" - global_str = ir.GlobalVariable(module, str_const.type, name=global_name) - global_str.initializer = str_const - global_str.linkage = 'private' - global_str.global_constant = True - - # Get pointer to first element - indices = [ir.Constant(ir.IntType(32), 0), ir.Constant(ir.IntType(32), 0)] - ptr = 
builder.gep(global_str, indices, name=f"str_ptr_{dst}") - vmap[dst] = ptr + arr_ty = ir.ArrayType(i8, len(str_bytes)) + str_const = ir.Constant(arr_ty, bytearray(str_bytes)) + try: + fn = builder.block.parent + fn_name = getattr(fn, 'name', 'fn') + except Exception: + fn_name = 'fn' + base = f".str.{fn_name}.{dst}" + existing = {g.name for g in module.global_values} + name = base + n = 1 + while name in existing: + name = f"{base}.{n}"; n += 1 + g = ir.GlobalVariable(module, arr_ty, name=name) + g.initializer = str_const + g.linkage = 'private' + g.global_constant = True + # Store the GlobalVariable; resolver.resolve_ptr will emit GEP in the current block + vmap[dst] = g + if resolver is not None and hasattr(resolver, 'string_literals'): + resolver.string_literals[dst] = str_val elif const_type == 'void': # Void/null constant - use i64 zero @@ -67,4 +74,4 @@ def lower_const( else: # Unknown type - default to i64 zero i64 = ir.IntType(64) - vmap[dst] = ir.Constant(i64, 0) \ No newline at end of file + vmap[dst] = ir.Constant(i64, 0) diff --git a/src/llvm_py/instructions/externcall.py b/src/llvm_py/instructions/externcall.py index 2812974a..24fc36a4 100644 --- a/src/llvm_py/instructions/externcall.py +++ b/src/llvm_py/instructions/externcall.py @@ -1,40 +1,11 @@ """ ExternCall instruction lowering -Handles the minimal 5 runtime functions: print, error, panic, exit, now +Minimal mapping for NyRT-exported symbols (console/log family, etc.) """ import llvmlite.ir as ir from typing import Dict, List, Optional -# The 5 minimal external functions -EXTERN_FUNCS = { - "print": { - "ret": "void", - "args": ["i8*"], # String pointer - "llvm_name": "ny_print" - }, - "error": { - "ret": "void", - "args": ["i8*"], # Error message - "llvm_name": "ny_error" - }, - "panic": { - "ret": "void", - "args": ["i8*"], # Panic message - "llvm_name": "ny_panic" - }, - "exit": { - "ret": "void", - "args": ["i64"], # Exit code - "llvm_name": "ny_exit" - }, - "now": { - "ret": "i64", - "args": [], #
No arguments - "llvm_name": "ny_now" - } -} - def lower_externcall( builder: ir.IRBuilder, module: ir.Module, @@ -42,7 +13,10 @@ def lower_externcall( args: List[int], dst_vid: Optional[int], vmap: Dict[int, ir.Value], - resolver=None + resolver=None, + preds=None, + block_end_values=None, + bb_map=None ) -> None: """ Lower MIR ExternCall instruction @@ -56,94 +30,121 @@ def lower_externcall( vmap: Value map resolver: Optional resolver for type handling """ - if func_name not in EXTERN_FUNCS: - # Unknown extern function - treat as void() - print(f"Warning: Unknown extern function: {func_name}") - return - - extern_info = EXTERN_FUNCS[func_name] - llvm_name = extern_info["llvm_name"] - - # Look up or declare function + # Accept full symbol names (e.g., "nyash.console.log", "nyash.string.len_h"). + llvm_name = func_name + + i8 = ir.IntType(8) + i64 = ir.IntType(64) + i8p = i8.as_pointer() + void = ir.VoidType() + + # Known NyRT signatures + sig_map = { + # Strings (handle-based) + "nyash.string.len_h": (i64, [i64]), + "nyash.string.charCodeAt_h": (i64, [i64, i64]), + "nyash.string.concat_hh": (i64, [i64, i64]), + "nyash.string.eq_hh": (i64, [i64, i64]), + # Strings (pointer-based plugin functions) + "nyash.string.concat_ss": (i8p, [i8p, i8p]), + "nyash.string.concat_si": (i8p, [i8p, i64]), + "nyash.string.concat_is": (i8p, [i64, i8p]), + "nyash.string.substring_sii": (i8p, [i8p, i64, i64]), + "nyash.string.lastIndexOf_ss": (i64, [i8p, i8p]), + # Boxing helpers + "nyash.box.from_i8_string": (i64, [i8p]), + # Console (string pointer expected) + # Many call sites pass handles or pointers; we coerce below. 
+ } + + # Find or declare function with appropriate prototype func = None for f in module.functions: if f.name == llvm_name: func = f break - if not func: - # Build function type - i8 = ir.IntType(8) - i64 = ir.IntType(64) - void = ir.VoidType() - - # Return type - if extern_info["ret"] == "void": - ret_type = void - elif extern_info["ret"] == "i64": - ret_type = i64 + if llvm_name in sig_map: + ret_ty, arg_tys = sig_map[llvm_name] + fnty = ir.FunctionType(ret_ty, arg_tys) + func = ir.Function(module, fnty, name=llvm_name) + elif llvm_name.startswith("nyash.console."): + # console.*: (i8*) -> i64 + fnty = ir.FunctionType(i64, [i8p]) + func = ir.Function(module, fnty, name=llvm_name) else: - ret_type = void - - # Argument types - arg_types = [] - for arg_type_str in extern_info["args"]: - if arg_type_str == "i8*": - arg_types.append(i8.as_pointer()) - elif arg_type_str == "i64": - arg_types.append(i64) - - func_type = ir.FunctionType(ret_type, arg_types) - func = ir.Function(module, func_type, name=llvm_name) - - # Prepare arguments - call_args = [] + # Unknown extern: declare as void(...no args...) 
and call without args + fnty = ir.FunctionType(void, []) + func = ir.Function(module, fnty, name=llvm_name) + + # Prepare/coerce arguments + call_args: List[ir.Value] = [] for i, arg_id in enumerate(args): - if i >= len(extern_info["args"]): - break # Too many arguments - - expected_type_str = extern_info["args"][i] - arg_val = vmap.get(arg_id) - - if not arg_val: - # Default value - if expected_type_str == "i8*": - # Null string - i8 = ir.IntType(8) - arg_val = ir.Constant(i8.as_pointer(), None) - elif expected_type_str == "i64": - arg_val = ir.Constant(ir.IntType(64), 0) - - # Type conversion - if expected_type_str == "i8*": - # Need string pointer - if hasattr(arg_val, 'type'): - if isinstance(arg_val.type, ir.IntType): - # int to ptr - i8 = ir.IntType(8) - arg_val = builder.inttoptr(arg_val, i8.as_pointer()) - elif not arg_val.type.is_pointer: - # Need pointer type - i8 = ir.IntType(8) - arg_val = ir.Constant(i8.as_pointer(), None) - elif expected_type_str == "i64": - # Need i64 - if hasattr(arg_val, 'type'): - if arg_val.type.is_pointer: - arg_val = builder.ptrtoint(arg_val, ir.IntType(64)) - elif arg_val.type != ir.IntType(64): - # Convert to i64 - pass # TODO: Handle other conversions - - call_args.append(arg_val) - - # Make the call - if extern_info["ret"] == "void": - builder.call(func, call_args) - if dst_vid is not None: - # Void return - store 0 - vmap[dst_vid] = ir.Constant(ir.IntType(64), 0) - else: + # Prefer resolver + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + if len(func.args) > i and isinstance(func.args[i].type, ir.PointerType): + aval = resolver.resolve_ptr(arg_id, builder.block, preds, block_end_values, vmap) + else: + aval = resolver.resolve_i64(arg_id, builder.block, preds, block_end_values, vmap, bb_map) + else: + aval = vmap.get(arg_id) + if aval is None: + # Default guess + aval = ir.Constant(i64, 0) + + # If function prototype is known, coerce to expected type + if 
len(func.args) > i: + expected_ty = func.args[i].type + if isinstance(expected_ty, ir.PointerType): + # Need pointer + if hasattr(aval, 'type'): + if isinstance(aval.type, ir.IntType): + aval = builder.inttoptr(aval, expected_ty, name=f"ext_i2p_arg{i}") + elif not aval.type.is_pointer: + aval = ir.Constant(expected_ty, None) + else: + # Pointer but wrong element type: if pointer-to-array -> GEP to i8* + try: + if isinstance(aval.type.pointee, ir.ArrayType) and isinstance(expected_ty.pointee, ir.IntType) and expected_ty.pointee.width == 8: + c0 = ir.Constant(ir.IntType(32), 0) + aval = builder.gep(aval, [c0, c0], name=f"ext_gep_arg{i}") + except Exception: + pass + else: + aval = ir.Constant(expected_ty, None) + elif isinstance(expected_ty, ir.IntType) and expected_ty.width == 64: + # Need i64 + if hasattr(aval, 'type'): + if isinstance(aval.type, ir.PointerType): + aval = builder.ptrtoint(aval, i64, name=f"ext_p2i_arg{i}") + elif isinstance(aval.type, ir.IntType) and aval.type.width != 64: + # extend/trunc + if aval.type.width < 64: + aval = builder.zext(aval, i64, name=f"ext_zext_{i}") + else: + aval = builder.trunc(aval, i64, name=f"ext_trunc_{i}") + else: + aval = ir.Constant(i64, 0) + else: + # Prototype shorter than args: best-effort pointer->i64 for string-ish APIs + if hasattr(aval, 'type') and isinstance(aval.type, ir.PointerType): + aval = builder.ptrtoint(aval, i64, name=f"ext_p2i_arg{i}") + call_args.append(aval) + + # Truncate extra args if prototype shorter + if len(call_args) > len(func.args): + call_args = call_args[:len(func.args)] + + # Issue the call + if len(call_args) == len(func.args): result = builder.call(func, call_args, name=f"extern_{func_name}") - if dst_vid is not None: - vmap[dst_vid] = result \ No newline at end of file + else: + result = builder.call(func, call_args[:len(func.args)]) + + # Materialize result into vmap + if dst_vid is not None: + rty = func.function_type.return_type + if isinstance(rty, ir.VoidType): + vmap[dst_vid] = 
ir.Constant(i64, 0) + else: + vmap[dst_vid] = result diff --git a/src/llvm_py/instructions/loopform.py b/src/llvm_py/instructions/loopform.py index 3a84dec6..85ad8d12 100644 --- a/src/llvm_py/instructions/loopform.py +++ b/src/llvm_py/instructions/loopform.py @@ -50,7 +50,10 @@ def lower_while_loopform( body_instructions: List[Any], loop_id: int, vmap: Dict[int, ir.Value], - bb_map: Dict[int, ir.Block] + bb_map: Dict[int, ir.Block], + resolver=None, + preds=None, + block_end_values=None ) -> bool: """ Lower a while loop using LoopForm structure @@ -71,7 +74,12 @@ def lower_while_loopform( # Header: Evaluate condition builder.position_at_end(lf.header) - cond = vmap.get(condition_vid, ir.Constant(ir.IntType(1), 0)) + if resolver is not None and preds is not None and block_end_values is not None: + cond64 = resolver.resolve_i64(condition_vid, builder.block, preds, block_end_values, vmap, bb_map) + zero64 = ir.IntType(64)(0) + cond = builder.icmp_unsigned('!=', cond64, zero64) + else: + cond = vmap.get(condition_vid, ir.Constant(ir.IntType(1), 0)) # Convert to i1 if needed if hasattr(cond, 'type') and cond.type == ir.IntType(64): cond = builder.icmp_unsigned('!=', cond, ir.Constant(ir.IntType(64), 0)) @@ -118,4 +126,4 @@ def lower_while_loopform( if os.environ.get('NYASH_CLI_VERBOSE') == '1': print(f"[LoopForm] Created loop structure (id={loop_id})") - return True \ No newline at end of file + return True diff --git a/src/llvm_py/instructions/newbox.py b/src/llvm_py/instructions/newbox.py index b74ad404..bde14fff 100644 --- a/src/llvm_py/instructions/newbox.py +++ b/src/llvm_py/instructions/newbox.py @@ -29,60 +29,40 @@ def lower_newbox( vmap: Value map resolver: Optional resolver for type handling """ - # Look up or declare the box creation function - create_func_name = f"ny_create_{box_type}" - create_func = None - + # Use NyRT shim: nyash.env.box.new(type_name: i8*) -> i64 + i64 = ir.IntType(64) + i8p = ir.IntType(8).as_pointer() + # Prefer variadic shim: 
nyash.env.box.new_i64x(type_name, argc, a1, a2, a3, a4) + new_i64x = None for f in module.functions: - if f.name == create_func_name: - create_func = f + if f.name == "nyash.env.box.new_i64x": + new_i64x = f break - - if not create_func: - # Declare box creation function - # Signature depends on box type - i64 = ir.IntType(64) - i8 = ir.IntType(8) - - if box_type in ["StringBox", "IntegerBox", "BoolBox"]: - # Built-in boxes - default constructors (no args) - # Real implementation may have optional args - func_type = ir.FunctionType(i64, []) - else: - # Generic box - variable arguments - # For now, assume no args - func_type = ir.FunctionType(i64, []) - - create_func = ir.Function(module, func_type, name=create_func_name) - - # Prepare arguments - call_args = [] - for i, arg_id in enumerate(args): - arg_val = vmap.get(arg_id) - - if not arg_val: - # Default based on box type - if box_type == "StringBox": - # Empty string - i8 = ir.IntType(8) - arg_val = ir.Constant(i8.as_pointer(), None) - else: - # Zero - arg_val = ir.Constant(ir.IntType(64), 0) - - # Type conversion if needed - if box_type == "StringBox" and hasattr(arg_val, 'type'): - if isinstance(arg_val.type, ir.IntType): - # int to string ptr - i8 = ir.IntType(8) - arg_val = builder.inttoptr(arg_val, i8.as_pointer()) - - call_args.append(arg_val) - - # Create the box - handle = builder.call(create_func, call_args, name=f"new_{box_type}") - - # Store handle + if not new_i64x: + new_i64x = ir.Function(module, ir.FunctionType(i64, [i8p, i64, i64, i64, i64, i64]), name="nyash.env.box.new_i64x") + + # Build C-string for type name (unique global per function) + sbytes = (box_type + "\0").encode('utf-8') + arr_ty = ir.ArrayType(ir.IntType(8), len(sbytes)) + try: + fn = builder.block.parent + fn_name = getattr(fn, 'name', 'fn') + except Exception: + fn_name = 'fn' + base = f".box_ty_{fn_name}_{dst_vid}" + existing = {g.name for g in module.global_values} + name = base + n = 1 + while name in existing: + name = 
f"{base}.{n}"; n += 1 + g = ir.GlobalVariable(module, arr_ty, name=name) + g.linkage = 'private' + g.global_constant = True + g.initializer = ir.Constant(arr_ty, bytearray(sbytes)) + c0 = ir.Constant(ir.IntType(32), 0) + ptr = builder.gep(g, [c0, c0], inbounds=True) + zero = ir.Constant(i64, 0) + handle = builder.call(new_i64x, [ptr, zero, zero, zero, zero, zero], name=f"new_{box_type}") vmap[dst_vid] = handle def lower_newbox_generic( @@ -113,4 +93,4 @@ def lower_newbox_generic( size = ir.Constant(ir.IntType(64), 64) handle = builder.call(alloc_func, [size], name="new_box") - vmap[dst_vid] = handle \ No newline at end of file + vmap[dst_vid] = handle diff --git a/src/llvm_py/instructions/phi.py b/src/llvm_py/instructions/phi.py index fb77e158..fd8b9c64 100644 --- a/src/llvm_py/instructions/phi.py +++ b/src/llvm_py/instructions/phi.py @@ -13,7 +13,9 @@ def lower_phi( vmap: Dict[int, ir.Value], bb_map: Dict[int, ir.Block], current_block: ir.Block, - resolver=None # Resolver instance (optional) + resolver=None, # Resolver instance (optional) + block_end_values: Optional[Dict[int, Dict[int, ir.Value]]] = None, + preds_map: Optional[Dict[int, List[int]]] = None ) -> None: """ Lower MIR PHI instruction @@ -32,23 +34,50 @@ def lower_phi( vmap[dst_vid] = ir.Constant(ir.IntType(64), 0) return - # Determine PHI type from first incoming value - first_val_id = incoming[0][0] - first_val = vmap.get(first_val_id) - - if first_val and hasattr(first_val, 'type'): - phi_type = first_val.type - else: - # Default to i64 - phi_type = ir.IntType(64) + # Determine PHI type from snapshots or fallback i64 + phi_type = ir.IntType(64) + if block_end_values is not None: + for val_id, pred_bid in incoming: + snap = block_end_values.get(pred_bid, {}) + val = snap.get(val_id) + if val is not None and hasattr(val, 'type'): + phi_type = val.type + # Prefer pointer type + if hasattr(phi_type, 'is_pointer') and phi_type.is_pointer: + break # Create PHI instruction phi = builder.phi(phi_type, 
name=f"phi_{dst_vid}") - # Add incoming values + # Build map from provided incoming + incoming_map: Dict[int, int] = {} for val_id, block_id in incoming: - val = vmap.get(val_id) + incoming_map[block_id] = val_id + + # Resolve actual predecessor set + cur_bid = None + try: + cur_bid = int(str(current_block.name).replace('bb','')) + except Exception: + pass + actual_preds = [] + if preds_map is not None and cur_bid is not None: + actual_preds = [p for p in preds_map.get(cur_bid, []) if p != cur_bid] + else: + # Fallback: use blocks in incoming list + actual_preds = [b for _, b in incoming] + + # Add incoming for each actual predecessor + for block_id in actual_preds: block = bb_map.get(block_id) + # Prefer pred snapshot + if block_end_values is not None: + snap = block_end_values.get(block_id, {}) + vid = incoming_map.get(block_id) + val = snap.get(vid) if vid is not None else None + else: + vid = incoming_map.get(block_id) + val = vmap.get(vid) if vid is not None else None if not val: # Create default value based on type @@ -66,34 +95,32 @@ def lower_phi( # Type conversion if needed if hasattr(val, 'type') and val.type != phi_type: - # Save current position - saved_block = builder.block - saved_pos = None - if hasattr(builder, '_anchor'): - saved_pos = builder._anchor - - # Position at end of predecessor block - builder.position_at_end(block) + # Position at end (before terminator) of predecessor block + pb = ir.IRBuilder(block) + try: + term = block.terminator + if term is not None: + pb.position_before(term) + else: + pb.position_at_end(block) + except Exception: + pb.position_at_end(block) # Convert types if isinstance(phi_type, ir.IntType) and val.type.is_pointer: - val = builder.ptrtoint(val, phi_type, name=f"cast_p2i_{val_id}") + val = pb.ptrtoint(val, phi_type, name=f"cast_p2i_{val_id}") elif phi_type.is_pointer and isinstance(val.type, ir.IntType): - val = builder.inttoptr(val, phi_type, name=f"cast_i2p_{val_id}") + val = pb.inttoptr(val, phi_type, 
name=f"cast_i2p_{val_id}") elif isinstance(phi_type, ir.IntType) and isinstance(val.type, ir.IntType): # Int to int if phi_type.width > val.type.width: - val = builder.zext(val, phi_type, name=f"zext_{val_id}") + val = pb.zext(val, phi_type, name=f"zext_{val_id}") else: - val = builder.trunc(val, phi_type, name=f"trunc_{val_id}") - - # Restore position - builder.position_at_end(saved_block) - if saved_pos and hasattr(builder, '_anchor'): - builder._anchor = saved_pos + val = pb.trunc(val, phi_type, name=f"trunc_{val_id}") - # Add to PHI - phi.add_incoming(val, block) + # Add to PHI (skip if no block) + if block is not None: + phi.add_incoming(val, block) # Store PHI result vmap[dst_vid] = phi @@ -111,4 +138,4 @@ def defer_phi_wiring( incoming: Incoming edges phi_deferrals: List to store deferred PHIs """ - phi_deferrals.append((dst_vid, incoming)) \ No newline at end of file + phi_deferrals.append((dst_vid, incoming)) diff --git a/src/llvm_py/instructions/ret.py b/src/llvm_py/instructions/ret.py index 1a360b0d..84614f27 100644 --- a/src/llvm_py/instructions/ret.py +++ b/src/llvm_py/instructions/ret.py @@ -10,7 +10,11 @@ def lower_return( builder: ir.IRBuilder, value_id: Optional[int], vmap: Dict[int, ir.Value], - return_type: ir.Type + return_type: ir.Type, + resolver=None, + preds=None, + block_end_values=None, + bb_map=None ) -> None: """ Lower MIR Return instruction @@ -25,8 +29,14 @@ def lower_return( # Void return builder.ret_void() else: - # Get return value - ret_val = vmap.get(value_id) + # Get return value (prefer resolver) + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + if isinstance(return_type, ir.PointerType): + ret_val = resolver.resolve_ptr(value_id, builder.block, preds, block_end_values, vmap) + else: + ret_val = resolver.resolve_i64(value_id, builder.block, preds, block_end_values, vmap, bb_map) + else: + ret_val = vmap.get(value_id) if not ret_val: # Default based on return type if 
isinstance(return_type, ir.IntType): @@ -41,10 +51,10 @@ def lower_return( if hasattr(ret_val, 'type') and ret_val.type != return_type: if isinstance(return_type, ir.IntType) and ret_val.type.is_pointer: # ptr to int - ret_val = builder.ptrtoint(ret_val, return_type) + ret_val = builder.ptrtoint(ret_val, return_type, name="ret_p2i") elif isinstance(return_type, ir.PointerType) and isinstance(ret_val.type, ir.IntType): # int to ptr - ret_val = builder.inttoptr(ret_val, return_type) + ret_val = builder.inttoptr(ret_val, return_type, name="ret_i2p") elif isinstance(return_type, ir.IntType) and isinstance(ret_val.type, ir.IntType): # int to int conversion if return_type.width < ret_val.type.width: @@ -54,4 +64,4 @@ def lower_return( # Zero extend ret_val = builder.zext(ret_val, return_type) - builder.ret(ret_val) \ No newline at end of file + builder.ret(ret_val) diff --git a/src/llvm_py/instructions/safepoint.py b/src/llvm_py/instructions/safepoint.py index 42b4a6d8..f4fba8e9 100644 --- a/src/llvm_py/instructions/safepoint.py +++ b/src/llvm_py/instructions/safepoint.py @@ -11,7 +11,11 @@ def lower_safepoint( module: ir.Module, live_values: List[int], vmap: Dict[int, ir.Value], - safepoint_id: Optional[int] = None + safepoint_id: Optional[int] = None, + resolver=None, + preds=None, + block_end_values=None, + bb_map=None ) -> None: """ Lower MIR Safepoint instruction @@ -49,11 +53,14 @@ def lower_safepoint( # Store each live value for i, vid in enumerate(live_values): - val = vmap.get(vid, ir.Constant(i64, 0)) + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + val = resolver.resolve_i64(vid, builder.block, preds, block_end_values, vmap, bb_map) + else: + val = vmap.get(vid, ir.Constant(i64, 0)) # Ensure i64 (handles are i64) if hasattr(val, 'type') and val.type.is_pointer: - val = builder.ptrtoint(val, i64) + val = builder.ptrtoint(val, i64, name=f"sp_p2i_{vid}") idx = ir.Constant(ir.IntType(32), i) ptr = 
builder.gep(live_array, [idx]) @@ -104,4 +111,4 @@ def insert_automatic_safepoint( check_func = ir.Function(module, func_type, name="ny_check_safepoint") # Insert safepoint check - builder.call(check_func, [], name=f"safepoint_{location}") \ No newline at end of file + builder.call(check_func, [], name=f"safepoint_{location}") diff --git a/src/llvm_py/instructions/typeop.py b/src/llvm_py/instructions/typeop.py index 1afb1907..df79333c 100644 --- a/src/llvm_py/instructions/typeop.py +++ b/src/llvm_py/instructions/typeop.py @@ -13,7 +13,10 @@ def lower_typeop( dst_vid: int, target_type: Optional[str], vmap: Dict[int, ir.Value], - resolver=None + resolver=None, + preds=None, + block_end_values=None, + bb_map=None ) -> None: """ Lower MIR TypeOp instruction @@ -32,7 +35,10 @@ def lower_typeop( vmap: Value map resolver: Optional resolver for type handling """ - src_val = vmap.get(src_vid, ir.Constant(ir.IntType(64), 0)) + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + src_val = resolver.resolve_i64(src_vid, builder.block, preds, block_end_values, vmap, bb_map) + else: + src_val = vmap.get(src_vid, ir.Constant(ir.IntType(64), 0)) if op == "cast": # Type casting - for now just pass through @@ -73,7 +79,11 @@ def lower_convert( dst_vid: int, from_type: str, to_type: str, - vmap: Dict[int, ir.Value] + vmap: Dict[int, ir.Value], + resolver=None, + preds=None, + block_end_values=None, + bb_map=None ) -> None: """ Lower type conversion between primitive types @@ -86,7 +96,14 @@ def lower_convert( to_type: Target type vmap: Value map """ - src_val = vmap.get(src_vid) + if resolver is not None and preds is not None and block_end_values is not None and bb_map is not None: + # Choose resolution based on from_type + if from_type == "ptr": + src_val = resolver.resolve_ptr(src_vid, builder.block, preds, block_end_values, vmap) + else: + src_val = resolver.resolve_i64(src_vid, builder.block, preds, block_end_values, vmap, 
bb_map) + else: + src_val = vmap.get(src_vid) if not src_val: # Default based on target type if to_type == "f64": @@ -108,10 +125,10 @@ def lower_convert( elif from_type == "i64" and to_type == "ptr": # int to pointer i8 = ir.IntType(8) - result = builder.inttoptr(src_val, i8.as_pointer()) + result = builder.inttoptr(src_val, i8.as_pointer(), name=f"conv_i2p_{dst_vid}") elif from_type == "ptr" and to_type == "i64": # pointer to int - result = builder.ptrtoint(src_val, ir.IntType(64)) + result = builder.ptrtoint(src_val, ir.IntType(64), name=f"conv_p2i_{dst_vid}") elif from_type == "i32" and to_type == "i64": # sign extend result = builder.sext(src_val, ir.IntType(64)) @@ -122,4 +139,4 @@ def lower_convert( # Unknown conversion - pass through result = src_val - vmap[dst_vid] = result \ No newline at end of file + vmap[dst_vid] = result diff --git a/src/llvm_py/llvm_builder.py b/src/llvm_py/llvm_builder.py index 0f8a0cd4..2ff07a5c 100644 --- a/src/llvm_py/llvm_builder.py +++ b/src/llvm_py/llvm_builder.py @@ -14,6 +14,7 @@ import llvmlite.binding as llvm # Import instruction handlers from instructions.const import lower_const from instructions.binop import lower_binop +from instructions.compare import lower_compare from instructions.jump import lower_jump from instructions.branch import lower_branch from instructions.ret import lower_return @@ -53,8 +54,11 @@ class NyashLLVMBuilder: self.vmap: Dict[int, ir.Value] = {} # value_id -> LLVM value self.bb_map: Dict[int, ir.Block] = {} # block_id -> LLVM block - # PHI deferrals for sealed block approach - self.phi_deferrals: List[Tuple[int, List[Tuple[int, int]]]] = [] + # PHI deferrals for sealed block approach: (block_id, dst_vid, incoming) + self.phi_deferrals: List[Tuple[int, int, List[Tuple[int, int]]]] = [] + # Predecessor map and per-block end snapshots + self.preds: Dict[int, List[int]] = {} + self.block_end_values: Dict[int, Dict[int, ir.Value]] = {} # Resolver for unified value resolution self.resolver = 
Resolver(self.vmap, self.bb_map) @@ -72,12 +76,66 @@ class NyashLLVMBuilder: # No functions - create dummy ny_main return self._create_dummy_main() + # Pre-declare all functions with default i64 signature to allow cross-calls + import re + for func_data in functions: + name = func_data.get("name", "unknown") + # Derive arity from name suffix '/N' if params list is empty + m = re.search(r"/(\d+)$", name) + if m: + arity = int(m.group(1)) + else: + arity = len(func_data.get("params", [])) + if name == "ny_main": + fty = ir.FunctionType(self.i32, []) + else: + fty = ir.FunctionType(self.i64, [self.i64] * arity) + exists = False + for f in self.module.functions: + if f.name == name: + exists = True + break + if not exists: + ir.Function(self.module, fty, name=name) + # Process each function for func_data in functions: self.lower_function(func_data) # Wire deferred PHIs self._wire_deferred_phis() + + # Create ny_main wrapper if necessary + has_ny_main = any(f.name == 'ny_main' for f in self.module.functions) + main_fn = None + for f in self.module.functions: + if f.name == 'main': + main_fn = f + break + if main_fn is not None: + # Hide the user main to avoid conflict with NyRT's main symbol + try: + main_fn.linkage = 'private' + except Exception: + pass + if not has_ny_main: + # i32 ny_main() { return (i32) main(); } + ny_main_ty = ir.FunctionType(self.i32, []) + ny_main = ir.Function(self.module, ny_main_ty, name='ny_main') + entry = ny_main.append_basic_block('entry') + b = ir.IRBuilder(entry) + if len(main_fn.args) == 0: + rv = b.call(main_fn, [], name='call_user_main') + else: + # If signature mismatches, return 0 + rv = ir.Constant(self.i64, 0) + if hasattr(rv, 'type') and isinstance(rv.type, ir.IntType) and rv.type.width != 32: + rv32 = b.trunc(rv, self.i32) if rv.type.width > 32 else b.zext(rv, self.i32) + b.ret(rv32) + elif hasattr(rv, 'type') and isinstance(rv.type, ir.IntType) and rv.type.width == 32: + b.ret(rv) + else: + b.ret(ir.Constant(self.i32, 0)) 
return str(self.module) @@ -93,6 +151,7 @@ class NyashLLVMBuilder: def lower_function(self, func_data: Dict[str, Any]): """Lower a single MIR function to LLVM IR""" name = func_data.get("name", "unknown") + import re params = func_data.get("params", []) blocks = func_data.get("blocks", []) @@ -101,13 +160,50 @@ class NyashLLVMBuilder: # Special case: ny_main returns i32 func_ty = ir.FunctionType(self.i32, []) else: - # Default: i64(i64, ...) signature - param_types = [self.i64] * len(params) + # Default: i64(i64, ...) signature; derive arity from '/N' suffix when params missing + m = re.search(r"/(\d+)$", name) + arity = int(m.group(1)) if m else len(params) + param_types = [self.i64] * arity func_ty = ir.FunctionType(self.i64, param_types) - # Create function - func = ir.Function(self.module, func_ty, name=name) + # Create or reuse function + func = None + for f in self.module.functions: + if f.name == name: + func = f + break + if func is None: + func = ir.Function(self.module, func_ty, name=name) + # Map parameters to vmap (value_id: 0..arity-1) + try: + arity = len(func.args) + for i in range(arity): + self.vmap[i] = func.args[i] + except Exception: + pass + + # Build predecessor map from control-flow edges + self.preds = {} + for block_data in blocks: + bid = block_data.get("id", 0) + self.preds.setdefault(bid, []) + for block_data in blocks: + src = block_data.get("id", 0) + for inst in block_data.get("instructions", []): + op = inst.get("op") + if op == "jump": + t = inst.get("target") + if t is not None: + self.preds.setdefault(t, []).append(src) + elif op == "branch": + th = inst.get("then") + el = inst.get("else") + if th is not None: + self.preds.setdefault(th, []).append(src) + if el is not None: + self.preds.setdefault(el, []).append(src) + # Create all blocks first for block_data in blocks: bid = block_data.get("id", 0) @@ -124,11 +220,40 @@ class NyashLLVMBuilder: def lower_block(self, bb: ir.Block, block_data: Dict[str, Any], func: ir.Function): 
"""Lower a single basic block""" builder = ir.IRBuilder(bb) + # Provide builder/module to resolver for PHI/casts insertion + try: + self.resolver.builder = builder + self.resolver.module = self.module + except Exception: + pass instructions = block_data.get("instructions", []) + created_ids: List[int] = [] # Process each instruction for inst in instructions: self.lower_instruction(builder, inst, func) + try: + dst = inst.get("dst") + if isinstance(dst, int) and dst not in created_ids and dst in self.vmap: + created_ids.append(dst) + except Exception: + pass + # Snapshot end-of-block values for sealed PHI wiring + bid = block_data.get("id", 0) + snap: Dict[int, ir.Value] = {} + # include function args (avoid 0 constant confusion later via special-case) + try: + arity = len(func.args) + except Exception: + arity = 0 + for i in range(arity): + if i in self.vmap: + snap[i] = self.vmap[i] + for vid in created_ids: + val = self.vmap.get(vid) + if val is not None: + snap[vid] = val + self.block_end_values[bid] = snap def lower_instruction(self, builder: ir.IRBuilder, inst: Dict[str, Any], func: ir.Function): """Dispatch instruction to appropriate handler""" @@ -137,15 +262,15 @@ class NyashLLVMBuilder: if op == "const": dst = inst.get("dst") value = inst.get("value") - lower_const(builder, self.module, dst, value, self.vmap) + lower_const(builder, self.module, dst, value, self.vmap, self.resolver) elif op == "binop": operation = inst.get("operation") lhs = inst.get("lhs") rhs = inst.get("rhs") dst = inst.get("dst") - lower_binop(builder, self.resolver, operation, lhs, rhs, dst, - self.vmap, builder.block) + lower_binop(builder, self.resolver, operation, lhs, rhs, dst, + self.vmap, builder.block, self.preds, self.block_end_values, self.bb_map) elif op == "jump": target = inst.get("target") @@ -155,38 +280,48 @@ class NyashLLVMBuilder: cond = inst.get("cond") then_bid = inst.get("then") else_bid = inst.get("else") - lower_branch(builder, cond, then_bid, else_bid, self.vmap, 
self.bb_map) + lower_branch(builder, cond, then_bid, else_bid, self.vmap, self.bb_map, self.resolver, self.preds, self.block_end_values) elif op == "ret": value = inst.get("value") - lower_return(builder, value, self.vmap, func.return_value.type) + lower_return(builder, value, self.vmap, func.function_type.return_type, + self.resolver, self.preds, self.block_end_values, self.bb_map) elif op == "phi": dst = inst.get("dst") incoming = inst.get("incoming", []) - # Defer PHI wiring for now - defer_phi_wiring(dst, incoming, self.phi_deferrals) + # Wire PHI immediately at the start of the current block using snapshots + lower_phi(builder, dst, incoming, self.vmap, self.bb_map, builder.block, self.resolver, self.block_end_values, self.preds) + + elif op == "compare": + # Dedicated compare op + operation = inst.get("operation") or inst.get("op") + lhs = inst.get("lhs") + rhs = inst.get("rhs") + dst = inst.get("dst") + lower_compare(builder, operation, lhs, rhs, dst, self.vmap, + self.resolver, builder.block, self.preds, self.block_end_values, self.bb_map) elif op == "call": func_name = inst.get("func") args = inst.get("args", []) dst = inst.get("dst") - lower_call(builder, self.module, func_name, args, dst, self.vmap, self.resolver) + lower_call(builder, self.module, func_name, args, dst, self.vmap, self.resolver, self.preds, self.block_end_values, self.bb_map) elif op == "boxcall": box_vid = inst.get("box") method = inst.get("method") args = inst.get("args", []) dst = inst.get("dst") - lower_boxcall(builder, self.module, box_vid, method, args, dst, - self.vmap, self.resolver) + lower_boxcall(builder, self.module, box_vid, method, args, dst, + self.vmap, self.resolver, self.preds, self.block_end_values, self.bb_map) elif op == "externcall": func_name = inst.get("func") args = inst.get("args", []) dst = inst.get("dst") - lower_externcall(builder, self.module, func_name, args, dst, - self.vmap, self.resolver) + lower_externcall(builder, self.module, func_name, args, dst, + 
self.vmap, self.resolver, self.preds, self.block_end_values, self.bb_map) elif op == "newbox": box_type = inst.get("type") @@ -200,12 +335,14 @@ class NyashLLVMBuilder: src = inst.get("src") dst = inst.get("dst") target_type = inst.get("target_type") - lower_typeop(builder, operation, src, dst, target_type, - self.vmap, self.resolver) + lower_typeop(builder, operation, src, dst, target_type, + self.vmap, self.resolver, self.preds, self.block_end_values, self.bb_map) elif op == "safepoint": live = inst.get("live", []) - lower_safepoint(builder, self.module, live, self.vmap) + lower_safepoint(builder, self.module, live, self.vmap, + resolver=self.resolver, preds=self.preds, + block_end_values=self.block_end_values, bb_map=self.bb_map) elif op == "barrier": barrier_type = inst.get("type", "memory") @@ -216,8 +353,9 @@ class NyashLLVMBuilder: cond = inst.get("cond") body = inst.get("body", []) self.loop_count += 1 - if not lower_while_loopform(builder, func, cond, body, - self.loop_count, self.vmap, self.bb_map): + if not lower_while_loopform(builder, func, cond, body, + self.loop_count, self.vmap, self.bb_map, + self.resolver, self.preds, self.block_end_values): # Fallback to regular while self._lower_while_regular(builder, inst, func) else: @@ -226,16 +364,130 @@ class NyashLLVMBuilder: def _lower_while_regular(self, builder: ir.IRBuilder, inst: Dict[str, Any], func: ir.Function): """Fallback regular while lowering""" - # TODO: Implement regular while lowering - pass + # Create basic blocks: cond -> body -> cond, and exit + cond_vid = inst.get("cond") + body_insts = inst.get("body", []) + + cur_bb = builder.block + cond_bb = func.append_basic_block(name=f"while{self.loop_count}_cond") + body_bb = func.append_basic_block(name=f"while{self.loop_count}_body") + exit_bb = func.append_basic_block(name=f"while{self.loop_count}_exit") + + # Jump from current to cond + builder.branch(cond_bb) + + # Cond block + cbuild = ir.IRBuilder(cond_bb) + try: + cond_val = 
self.resolver.resolve_i64(cond_vid, builder.block, self.preds, self.block_end_values, self.vmap, self.bb_map) + except Exception: + cond_val = self.vmap.get(cond_vid) + if cond_val is None: + cond_val = ir.Constant(self.i1, 0) + # Normalize to i1 + if hasattr(cond_val, 'type'): + if isinstance(cond_val.type, ir.IntType) and cond_val.type.width == 64: + zero64 = ir.Constant(self.i64, 0) + cond_val = cbuild.icmp_unsigned('!=', cond_val, zero64, name="while_cond_i1") + elif isinstance(cond_val.type, ir.PointerType): + nullp = ir.Constant(cond_val.type, None) + cond_val = cbuild.icmp_unsigned('!=', cond_val, nullp, name="while_cond_p1") + elif isinstance(cond_val.type, ir.IntType) and cond_val.type.width == 1: + # already i1 + pass + else: + # Fallback: treat as false + cond_val = ir.Constant(self.i1, 0) + else: + cond_val = ir.Constant(self.i1, 0) + + cbuild.cbranch(cond_val, body_bb, exit_bb) + + # Body block + bbuild = ir.IRBuilder(body_bb) + # Allow nested lowering of body instructions within this block + self._lower_instruction_list(bbuild, body_insts, func) + # Ensure terminator: if not terminated, branch back to cond + if bbuild.block.terminator is None: + bbuild.branch(cond_bb) + + # Continue at exit + builder.position_at_end(exit_bb) + + def _lower_instruction_list(self, builder: ir.IRBuilder, insts: List[Dict[str, Any]], func: ir.Function): + """Lower a flat list of instructions using current builder and function.""" + for sub in insts: + # If current block already has a terminator, create a continuation block + if builder.block.terminator is not None: + cont = func.append_basic_block(name=f"cont_bb_{builder.block.name}") + builder.position_at_end(cont) + self.lower_instruction(builder, sub, func) def _wire_deferred_phis(self): """Wire all deferred PHI nodes""" - # TODO: Implement PHI wiring after all blocks are created - for dst_vid, incoming in self.phi_deferrals: - # Find the block containing this PHI - # Wire the incoming edges - pass + for cur_bid, 
dst_vid, incoming in self.phi_deferrals: + bb = self.bb_map.get(cur_bid) + if bb is None: + continue + b = ir.IRBuilder(bb) + b.position_at_start(bb) + # Determine phi type: prefer pointer if any incoming is pointer; else f64; else i64 + phi_type = self.i64 + for (val_id, pred_bid) in incoming: + snap = self.block_end_values.get(pred_bid, {}) + val = snap.get(val_id) + if val is not None and hasattr(val, 'type'): + if hasattr(val.type, 'is_pointer') and val.type.is_pointer: + phi_type = val.type + break + elif str(val.type) == str(self.f64): + phi_type = self.f64 + phi = b.phi(phi_type, name=f"phi_{dst_vid}") + for (val_id, pred_bid) in incoming: + pred_bb = self.bb_map.get(pred_bid) + if pred_bb is None: + continue + # Self-reference takes precedence regardless of snapshot + if val_id == dst_vid: + val = phi + else: + snap = self.block_end_values.get(pred_bid, {}) + # Special-case: incoming 0 means typed zero/null, not value-id 0 + if isinstance(val_id, int) and val_id == 0: + val = None + else: + val = snap.get(val_id) + if val is None: + # Default based on phi type + if isinstance(phi_type, ir.IntType): + val = ir.Constant(phi_type, 0) + elif isinstance(phi_type, ir.DoubleType): + val = ir.Constant(phi_type, 0.0) + else: + val = ir.Constant(phi_type, None) + # Type adjust if needed + if hasattr(val, 'type') and val.type != phi_type: + # Insert cast in predecessor block before its terminator + pb = ir.IRBuilder(pred_bb) + try: + term = pred_bb.terminator + if term is not None: + pb.position_before(term) + else: + pb.position_at_end(pred_bb) + except Exception: + pb.position_at_end(pred_bb) + if isinstance(phi_type, ir.IntType) and isinstance(val.type, ir.PointerType): + val = pb.ptrtoint(val, phi_type, name=f"phi_p2i_{dst_vid}_{pred_bid}") + elif isinstance(phi_type, ir.PointerType) and isinstance(val.type, ir.IntType): + val = pb.inttoptr(val, phi_type, name=f"phi_i2p_{dst_vid}_{pred_bid}") + elif isinstance(phi_type, ir.IntType) and isinstance(val.type, 
ir.IntType): + if phi_type.width > val.type.width: + val = pb.zext(val, phi_type, name=f"phi_zext_{dst_vid}_{pred_bid}") + elif phi_type.width < val.type.width: + val = pb.trunc(val, phi_type, name=f"phi_trunc_{dst_vid}_{pred_bid}") + phi.add_incoming(val, pred_bb) + self.vmap[dst_vid] = phi def compile_to_object(self, output_path: str): """Compile module to object file""" @@ -255,31 +507,52 @@ class NyashLLVMBuilder: f.write(obj) def main(): - if len(sys.argv) < 2: - print("Usage: llvm_builder.py [-o output.o]") - sys.exit(1) - - input_file = sys.argv[1] + # CLI: + # llvm_builder.py [-o output.o] + # llvm_builder.py --dummy [-o output.o] output_file = "nyash_llvm_py.o" - - if "-o" in sys.argv: - idx = sys.argv.index("-o") - if idx + 1 < len(sys.argv): - output_file = sys.argv[idx + 1] - - # Read MIR JSON + args = sys.argv[1:] + dummy = False + + if not args: + print("Usage: llvm_builder.py [-o output.o] | --dummy [-o output.o]") + sys.exit(1) + + if "-o" in args: + idx = args.index("-o") + if idx + 1 < len(args): + output_file = args[idx + 1] + del args[idx:idx+2] + + if args and args[0] == "--dummy": + dummy = True + del args[0] + + builder = NyashLLVMBuilder() + + if dummy: + # Emit dummy ny_main + ir_text = builder._create_dummy_main() + if os.environ.get('NYASH_CLI_VERBOSE') == '1': + print(f"[Python LLVM] Generated dummy IR:\n{ir_text}") + builder.compile_to_object(output_file) + print(f"Compiled to {output_file}") + return + + if not args: + print("error: missing input MIR JSON (or use --dummy)", file=sys.stderr) + sys.exit(2) + + input_file = args[0] with open(input_file, 'r') as f: mir_json = json.load(f) - - # Build LLVM IR - builder = NyashLLVMBuilder() + llvm_ir = builder.build_from_mir(mir_json) - - print(f"Generated LLVM IR:\n{llvm_ir}") - - # Compile to object + if os.environ.get('NYASH_CLI_VERBOSE') == '1': + print(f"[Python LLVM] Generated LLVM IR:\n{llvm_ir}") + builder.compile_to_object(output_file) print(f"Compiled to {output_file}") if __name__ 
== "__main__": - main() \ No newline at end of file + main() diff --git a/src/llvm_py/resolver.py b/src/llvm_py/resolver.py index 19ceda63..c5504239 100644 --- a/src/llvm_py/resolver.py +++ b/src/llvm_py/resolver.py @@ -15,14 +15,23 @@ class Resolver: - Cache per (block, value) to avoid redundant PHIs """ - def __init__(self, builder: ir.IRBuilder, module: ir.Module): - self.builder = builder - self.module = module + def __init__(self, a, b=None): + """Flexible init: either (builder, module) or (vmap, bb_map) for legacy wiring.""" + if hasattr(a, 'position_at_end'): + # a is IRBuilder + self.builder = a + self.module = b + else: + # Legacy constructor (vmap, bb_map) — builder/module will be set later when available + self.builder = None + self.module = None # Caches: (block_name, value_id) -> llvm value self.i64_cache: Dict[Tuple[str, int], ir.Value] = {} self.ptr_cache: Dict[Tuple[str, int], ir.Value] = {} self.f64_cache: Dict[Tuple[str, int], ir.Value] = {} + # String literal map: value_id -> Python string (for by-name calls) + self.string_literals: Dict[int, str] = {} # Type shortcuts self.i64 = ir.IntType(64) @@ -33,9 +42,10 @@ class Resolver: self, value_id: int, current_block: ir.Block, - preds: Dict[str, list], - block_end_values: Dict[str, Dict[int, Any]], - vmap: Dict[int, Any] + preds: Dict[int, list], + block_end_values: Dict[int, Dict[int, Any]], + vmap: Dict[int, Any], + bb_map: Optional[Dict[int, ir.Block]] = None ) -> ir.Value: """ Resolve a MIR value as i64 dominating the current block. 
@@ -46,31 +56,81 @@ class Resolver: # Check cache if cache_key in self.i64_cache: return self.i64_cache[cache_key] + + # Do not trust global vmap across blocks: always localize via preds when available # Get predecessor blocks - pred_names = preds.get(current_block.name, []) + try: + bid = int(str(current_block.name).replace('bb','')) + except Exception: + bid = -1 + pred_ids = [p for p in preds.get(bid, []) if p != bid] - if not pred_names: + if not pred_ids: # Entry block or no predecessors base_val = vmap.get(value_id, ir.Constant(self.i64, 0)) - result = self._coerce_to_i64(base_val) + # Do not emit casts here; if pointer, fall back to zero + if hasattr(base_val, 'type') and isinstance(base_val.type, ir.IntType): + result = base_val if base_val.type.width == 64 else ir.Constant(self.i64, 0) + elif hasattr(base_val, 'type') and isinstance(base_val.type, ir.PointerType): + result = ir.Constant(self.i64, 0) + else: + result = ir.Constant(self.i64, 0) else: # Create PHI at block start - saved_pos = self.builder.block - self.builder.position_at_start(current_block) + saved_pos = None + if self.builder is not None: + saved_pos = self.builder.block + self.builder.position_at_start(current_block) phi = self.builder.phi(self.i64, name=f"loc_i64_{value_id}") # Add incoming values from predecessors - for pred_name in pred_names: - pred_vals = block_end_values.get(pred_name, {}) - val = pred_vals.get(value_id, ir.Constant(self.i64, 0)) - coerced = self._coerce_to_i64(val) - # Note: In real implementation, need pred block reference - phi.add_incoming(coerced, pred_name) # Simplified + for pred_id in pred_ids: + pred_vals = block_end_values.get(pred_id, {}) + val = pred_vals.get(value_id) + # Coerce in predecessor block if needed + if val is None: + coerced = ir.Constant(self.i64, 0) + else: + if hasattr(val, 'type') and isinstance(val.type, ir.IntType): + coerced = val if val.type.width == 64 else ir.Constant(self.i64, 0) + elif hasattr(val, 'type') and isinstance(val.type, 
ir.PointerType): + # insert ptrtoint in predecessor + pred_bb = bb_map.get(pred_id) if bb_map is not None else None + if pred_bb is not None: + pb = ir.IRBuilder(pred_bb) + try: + term = pred_bb.terminator + if term is not None: + pb.position_before(term) + else: + pb.position_at_end(pred_bb) + except Exception: + pb.position_at_end(pred_bb) + coerced = pb.ptrtoint(val, self.i64, name=f"res_p2i_{value_id}_{pred_id}") + else: + coerced = ir.Constant(self.i64, 0) + else: + coerced = ir.Constant(self.i64, 0) + # Use predecessor block if available + pred_bb = None + if bb_map is not None: + pred_bb = bb_map.get(pred_id) + if pred_bb is not None: + phi.add_incoming(coerced, pred_bb) + # If no valid incoming were added, fold to zero to avoid invalid PHI + if len(getattr(phi, 'incoming', [])) == 0: + # Replace with zero constant and discard phi + result = ir.Constant(self.i64, 0) + # Restore position and cache + if saved_pos and self.builder is not None: + self.builder.position_at_end(saved_pos) + self.i64_cache[cache_key] = result + return result # Restore position - if saved_pos: + if saved_pos and self.builder is not None: self.builder.position_at_end(saved_pos) result = phi @@ -82,16 +142,51 @@ class Resolver: def resolve_ptr(self, value_id: int, current_block: ir.Block, preds: Dict, block_end_values: Dict, vmap: Dict) -> ir.Value: """Resolve as i8* pointer""" - # Similar to resolve_i64 but with pointer type - # TODO: Implement - pass + cache_key = (current_block.name, value_id) + if cache_key in self.ptr_cache: + return self.ptr_cache[cache_key] + # Coerce current vmap value or GlobalVariable to i8* + val = vmap.get(value_id) + if val is None: + result = ir.Constant(self.i8p, None) + else: + # Check the value's type (not the value itself) so pointer values take this branch + if hasattr(val, 'type') and isinstance(val.type, ir.PointerType): + # If pointer to array (GlobalVariable), GEP to first element + ty = val.type.pointee if hasattr(val.type, 'pointee') else None + if ty is not None and hasattr(ty, 'element'): + c0 = ir.Constant(ir.IntType(32), 0) +
result = self.builder.gep(val, [c0, c0], name=f"res_str_gep_{value_id}") + else: + result = val + elif hasattr(val, 'type') and isinstance(val.type, ir.IntType): + result = self.builder.inttoptr(val, self.i8p, name=f"res_i2p_{value_id}") + else: + # f64 or others -> zero + result = ir.Constant(self.i8p, None) + self.ptr_cache[cache_key] = result + return result def resolve_f64(self, value_id: int, current_block: ir.Block, preds: Dict, block_end_values: Dict, vmap: Dict) -> ir.Value: """Resolve as f64""" - # Similar pattern - # TODO: Implement - pass + cache_key = (current_block.name, value_id) + if cache_key in self.f64_cache: + return self.f64_cache[cache_key] + val = vmap.get(value_id) + if val is None: + result = ir.Constant(self.f64_type, 0.0) + else: + if hasattr(val, 'type') and val.type == self.f64_type: + result = val + elif hasattr(val, 'type') and isinstance(val.type, ir.IntType): + result = self.builder.sitofp(val, self.f64_type) + elif hasattr(val, 'type') and isinstance(val.type, ir.PointerType): + tmp = self.builder.ptrtoint(val, self.i64, name=f"res_p2i_{value_id}") + result = self.builder.sitofp(tmp, self.f64_type, name=f"res_i2f_{value_id}") + else: + result = ir.Constant(self.f64_type, 0.0) + self.f64_cache[cache_key] = result + return result def _coerce_to_i64(self, val: Any) -> ir.Value: """Coerce various types to i64""" @@ -99,14 +194,14 @@ class Resolver: return val elif hasattr(val, 'type') and val.type.is_pointer: # ptr to int - return self.builder.ptrtoint(val, self.i64) + return self.builder.ptrtoint(val, self.i64, name=f"res_p2i_{getattr(val,'name','x')}") if self.builder is not None else ir.Constant(self.i64, 0) elif hasattr(val, 'type') and isinstance(val.type, ir.IntType): # int to int (extend/trunc) if val.type.width < 64: - return self.builder.zext(val, self.i64) + return self.builder.zext(val, self.i64) if self.builder is not None else ir.Constant(self.i64, 0) elif val.type.width > 64: - return self.builder.trunc(val, self.i64) + 
return self.builder.trunc(val, self.i64) if self.builder is not None else ir.Constant(self.i64, 0) return val else: # Default zero - return ir.Constant(self.i64, 0) \ No newline at end of file + return ir.Constant(self.i64, 0) diff --git a/src/runner/modes/llvm.rs b/src/runner/modes/llvm.rs index 98b85c97..fa3ca9d5 100644 --- a/src/runner/modes/llvm.rs +++ b/src/runner/modes/llvm.rs @@ -2,6 +2,7 @@ use super::super::NyashRunner; use nyash_rust::{parser::NyashParser, mir::{MirCompiler, MirInstruction}, box_trait::IntegerBox}; use nyash_rust::mir::passes::method_id_inject::inject_method_ids; use std::{fs, process}; +use serde_json::json; impl NyashRunner { /// Execute LLVM mode (split) @@ -43,17 +44,66 @@ impl NyashRunner { if let Ok(out_path) = std::env::var("NYASH_LLVM_OBJ_OUT") { #[cfg(feature = "llvm")] { - use nyash_rust::backend::llvm_compile_to_object; - // Ensure parent directory exists for the object file - if let Some(parent) = std::path::Path::new(&out_path).parent() { - let _ = std::fs::create_dir_all(parent); - } - if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { - eprintln!("[Runner/LLVM] emitting object to {} (cwd={})", out_path, std::env::current_dir().map(|p| p.display().to_string()).unwrap_or_default()); - } - if let Err(e) = llvm_compile_to_object(&module, &out_path) { - eprintln!("❌ LLVM object emit error: {}", e); + // Harness path (optional): if NYASH_LLVM_USE_HARNESS=1, try Python/llvmlite first. 
+ let use_harness = std::env::var("NYASH_LLVM_USE_HARNESS").ok().as_deref() == Some("1"); + if use_harness { + if let Some(parent) = std::path::Path::new(&out_path).parent() { let _ = std::fs::create_dir_all(parent); } + let py = which::which("python3").ok(); + if let Some(py3) = py { + let harness = std::path::Path::new("tools/llvmlite_harness.py"); + if harness.exists() { + // 1) Emit MIR(JSON) to a temp file + let tmp_dir = std::path::Path::new("tmp"); + let _ = std::fs::create_dir_all(tmp_dir); + let mir_json_path = tmp_dir.join("nyash_harness_mir.json"); + if let Err(e) = emit_mir_json_for_harness(&module, &mir_json_path) { + eprintln!("❌ MIR JSON emit error: {}", e); + process::exit(1); + } + if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { + eprintln!("[Runner/LLVM] using llvmlite harness → {} (mir={})", out_path, mir_json_path.display()); + } + // 2) Run harness with --in/--out(失敗時は即エラー) + let status = std::process::Command::new(py3) + .args([harness.to_string_lossy().as_ref(), "--in", &mir_json_path.display().to_string(), "--out", &out_path]) + .status().map_err(|e| format!("spawn harness: {}", e)).unwrap(); + if !status.success() { + eprintln!("❌ llvmlite harness failed (status={})", status.code().unwrap_or(-1)); + process::exit(1); + } + // Verify + match std::fs::metadata(&out_path) { + Ok(meta) if meta.len() > 0 => { + if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { + eprintln!("[LLVM] object emitted by harness: {} ({} bytes)", out_path, meta.len()); + } + return; + } + _ => { + eprintln!("❌ harness output not found or empty: {}", out_path); + process::exit(1); + } + } + } else { + eprintln!("❌ harness script not found: {}", harness.display()); + process::exit(1); + } + } + eprintln!("❌ python3 not found in PATH. 
Install Python 3 to use the harness."); process::exit(1); + } else { + use nyash_rust::backend::llvm_compile_to_object; + // Ensure parent directory exists for the object file + if let Some(parent) = std::path::Path::new(&out_path).parent() { + let _ = std::fs::create_dir_all(parent); + } + if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { + eprintln!("[Runner/LLVM] emitting object to {} (cwd={})", out_path, std::env::current_dir().map(|p| p.display().to_string()).unwrap_or_default()); + } + if let Err(e) = llvm_compile_to_object(&module, &out_path) { + eprintln!("❌ LLVM object emit error: {}", e); + process::exit(1); + } } // Verify object presence and size (>0) match std::fs::metadata(&out_path) { @@ -71,7 +121,7 @@ impl NyashRunner { if std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1") { eprintln!("[Runner/LLVM] object not found after emit, retrying once: {} ({})", out_path, e); } - if let Err(e2) = llvm_compile_to_object(&module, &out_path) { + if let Err(e2) = nyash_rust::backend::llvm_compile_to_object(&module, &out_path) { eprintln!("❌ LLVM object emit error (retry): {}", e2); process::exit(1); } @@ -141,3 +191,93 @@ impl NyashRunner { } } } + +fn emit_mir_json_for_harness(module: &nyash_rust::mir::MirModule, path: &std::path::Path) -> Result<(), String> { + use nyash_rust::mir::{MirInstruction as I, BinaryOp as B, CompareOp as C}; + // Build JSON structure expected by python builder: { functions: [ { name, params, blocks: [ { id, instructions: [ ... 
] } ] } ] } + let mut funs = Vec::new(); + for (name, f) in &module.functions { + let mut blocks = Vec::new(); + let mut ids: Vec<_> = f.blocks.keys().copied().collect(); + ids.sort(); + for bid in ids { + if let Some(bb) = f.blocks.get(&bid) { + let mut insts = Vec::new(); + // PHI first(オプション) + for inst in &bb.instructions { + if let I::Phi { dst, inputs } = inst { + let incoming: Vec<_> = inputs.iter().map(|(b, v)| json!([v.as_u32(), b.as_u32()])).collect(); + insts.push(json!({"op":"phi","dst": dst.as_u32(), "incoming": incoming})); + } + } + // Non-PHI + for inst in &bb.instructions { + match inst { + I::Const { dst, value } => { + let (t, val) = match value { + nyash_rust::mir::ConstValue::Integer(i) => ("i64", json!(i)), + nyash_rust::mir::ConstValue::Float(fv) => ("f64", json!(fv)), + nyash_rust::mir::ConstValue::Bool(b) => ("i64", json!(if *b {1} else {0})), + nyash_rust::mir::ConstValue::String(s) => ("string", json!(s)), + nyash_rust::mir::ConstValue::Null => ("void", json!(0)), + nyash_rust::mir::ConstValue::Void => ("void", json!(0)), + }; + insts.push(json!({"op":"const","dst": dst.as_u32(), "value": {"type": t, "value": val}})); + } + I::BinOp { dst, op, lhs, rhs } => { + let op_s = match op { B::Add=>"+",B::Sub=>"-",B::Mul=>"*",B::Div=>"/",B::Mod=>"%",B::BitAnd=>"&",B::BitOr=>"|",B::BitXor=>"^",B::Shl=>"<<",B::Shr=>">>",B::And=>"&",B::Or=>"|"}; + insts.push(json!({"op":"binop","operation": op_s, "lhs": lhs.as_u32(), "rhs": rhs.as_u32(), "dst": dst.as_u32()})); + } + I::Compare { dst, op, lhs, rhs } => { + let op_s = match op { C::Lt=>"<", C::Le=>"<=", C::Gt=>">", C::Ge=>">=", C::Eq=>"==", C::Ne=>"!=" }; + insts.push(json!({"op":"compare","operation": op_s, "lhs": lhs.as_u32(), "rhs": rhs.as_u32(), "dst": dst.as_u32()})); + } + I::Call { dst, func, args, .. 
} => { + let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); + insts.push(json!({"op":"call","func": func.as_u32(), "args": args_a, "dst": dst.map(|d| d.as_u32())})); + } + I::ExternCall { dst, iface_name, method_name, args, .. } => { + let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); + // Map known interfaces to NyRT symbols + let func_name = if iface_name == "env.console" { + format!("nyash.console.{}", method_name) + } else { format!("{}.{}", iface_name, method_name) }; + insts.push(json!({"op":"externcall","func": func_name, "args": args_a, "dst": dst.map(|d| d.as_u32())})); + } + I::BoxCall { dst, box_val, method, args, .. } => { + let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); + insts.push(json!({"op":"boxcall","box": box_val.as_u32(), "method": method, "args": args_a, "dst": dst.map(|d| d.as_u32())})); + } + I::NewBox { dst, box_type, args } => { + let args_a: Vec<_> = args.iter().map(|v| json!(v.as_u32())).collect(); + insts.push(json!({"op":"newbox","type": box_type, "args": args_a, "dst": dst.as_u32()})); + } + I::Branch { condition, then_bb, else_bb } => { + insts.push(json!({"op":"branch","cond": condition.as_u32(), "then": then_bb.as_u32(), "else": else_bb.as_u32()})); + } + I::Jump { target } => { + insts.push(json!({"op":"jump","target": target.as_u32()})); + } + I::Return { value } => { + insts.push(json!({"op":"ret","value": value.map(|v| v.as_u32())})); + } + _ => { /* skip non-essential ops for initial harness */ } + } + } + // Terminator (if present) + if let Some(term) = &bb.terminator { + match term { + I::Return { value } => insts.push(json!({"op":"ret","value": value.map(|v| v.as_u32())})), + I::Jump { target } => insts.push(json!({"op":"jump","target": target.as_u32()})), + I::Branch { condition, then_bb, else_bb } => insts.push(json!({"op":"branch","cond": condition.as_u32(), "then": then_bb.as_u32(), "else": else_bb.as_u32()})), + _ => {} + } + } + blocks.push(json!({"id": 
bid.as_u32(), "instructions": insts})); + } + } + funs.push(json!({"name": name, "params": [], "blocks": blocks})); + } + let root = json!({"functions": funs}); + std::fs::write(path, serde_json::to_string_pretty(&root).unwrap()).map_err(|e| format!("write mir json: {}", e)) +} diff --git a/tools/build_plugins_all.sh b/tools/build_plugins_all.sh new file mode 100644 index 00000000..b7a2711c --- /dev/null +++ b/tools/build_plugins_all.sh @@ -0,0 +1,40 @@ +#!/usr/bin/env bash +set -euo pipefail + +# Build all plugins in cdylib/staticlib forms and copy artifacts next to Cargo.toml + +ROOT_DIR=$(cd "$(dirname "$0")/.." && pwd) +cd "$ROOT_DIR" + +PROFILE=${PROFILE:-release} + +echo "[plugins] building all (profile=$PROFILE)" + +for dir in plugins/*; do + [[ -d "$dir" && -f "$dir/Cargo.toml" ]] || continue + pkg=$(grep -m1 '^name\s*=\s*"' "$dir/Cargo.toml" | sed -E 's/.*"(.*)".*/\1/') + # Determine lib name (prefer [lib].name, else package name with '-' -> '_') + libname=$(awk '/^\[lib\]/{flag=1;next}/^\[/{flag=0}flag && /name\s*=/{print; exit}' "$dir/Cargo.toml" | sed -E 's/.*"(.*)".*/\1/') + if [[ -z "${libname}" ]]; then + libname=${pkg//-/_} + fi + echo "[plugins] -> $pkg (libname=$libname)" + cargo build -p "$pkg" --$PROFILE >/dev/null + # Copy artifacts + outdir="target/$PROFILE" + # cdylib (.so/.dylib/.dll) + for ext in so dylib dll; do + f="${outdir}/lib${libname}.${ext}" + if [[ -f "$f" ]]; then + cp -f "$f" "$dir/" && echo " copied $(basename "$f")" + fi + done + # staticlib (.a) + fa="${outdir}/lib${libname}.a" + if [[ -f "$fa" ]]; then + cp -f "$fa" "$dir/" && echo " copied $(basename "$fa")" + fi +done + +echo "[plugins] done" + diff --git a/tools/compare_harness_on_off.sh b/tools/compare_harness_on_off.sh new file mode 100644 index 00000000..bd96a1d4 --- /dev/null +++ b/tools/compare_harness_on_off.sh @@ -0,0 +1,63 @@ +#!/usr/bin/env bash +set -euo pipefail + +ROOT_DIR=$(cd "$(dirname "$0")/.." 
&& pwd) +APP=${1:-apps/selfhost/tools/dep_tree_min_string.nyash} +OUTDIR=${OUTDIR:-$ROOT_DIR/tmp} +mkdir -p "$OUTDIR" + +ON_EXE=${ON_EXE:-$ROOT_DIR/app_dep_tree_py} +OFF_EXE=${OFF_EXE:-$ROOT_DIR/app_dep_tree_rust} + +echo "[compare] target app: $APP" + +echo "[compare] build (OFF/Rust LLVM) ..." +"$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$OFF_EXE" >/dev/null + +echo "[compare] build (ON/llvmlite harness) ..." +NYASH_LLVM_USE_HARNESS=1 "$ROOT_DIR/tools/build_llvm.sh" "$APP" -o "$ON_EXE" >/dev/null + +echo "[compare] run both and capture output ..." +ON_OUT="$OUTDIR/on.out"; OFF_OUT="$OUTDIR/off.out" +set +e +"$ON_EXE" > "$ON_OUT" 2>&1 +RC_ON=$? +"$OFF_EXE" > "$OFF_OUT" 2>&1 +RC_OFF=$? +set -e + +echo "[compare] exit codes: ON=$RC_ON OFF=$RC_OFF" + +echo "[compare] extract JSON payload (from first '{' to end) ..." +ON_JSON="$OUTDIR/on.json"; OFF_JSON="$OUTDIR/off.json" +sed -n '/^{/,$p' "$ON_OUT" > "$ON_JSON" || true +sed -n '/^{/,$p' "$OFF_OUT" > "$OFF_JSON" || true + +echo "[compare] === diff(json) ===" +diff -u "$OFF_JSON" "$ON_JSON" || true + +echo "[compare] files:" +echo " ON out: $ON_OUT" +echo " ON json: $ON_JSON" +echo " OFF out: $OFF_OUT" +echo " OFF json: $OFF_JSON" + +if [ $RC_ON -eq 0 ] && [ $RC_OFF -eq 0 ]; then + echo "[compare] ✅ exit codes match (0)" +else + echo "[compare] ⚠️ exit codes differ or non‑zero (ON=$RC_ON OFF=$RC_OFF)" +fi + +# Fallback: if JSON both empty, compare Result: lines +if [ ! -s "$ON_JSON" ] && [ ! 
-s "$OFF_JSON" ]; then + echo "[compare] JSON empty on both; compare 'Result:' lines as fallback" + ON_RES=$(sed -n 's/^.*Result: \(.*\)$/\1/p' "$ON_OUT" | tail -n 1) + OFF_RES=$(sed -n 's/^.*Result: \(.*\)$/\1/p' "$OFF_OUT" | tail -n 1) + echo "[compare] ON Result: ${ON_RES:-}" + echo "[compare] OFF Result: ${OFF_RES:-}" + if [ "${ON_RES:-X}" = "${OFF_RES:-Y}" ]; then + echo "[compare] ✅ fallback results match" + else + echo "[compare] ❌ fallback results differ" + fi +fi diff --git a/tools/llvmlite_harness.py b/tools/llvmlite_harness.py index fce073d0..27bcd948 100644 --- a/tools/llvmlite_harness.py +++ b/tools/llvmlite_harness.py @@ -1,83 +1,83 @@ #!/usr/bin/env python3 """ -Experimental llvmlite-based LLVM emission harness for Nyash. +Nyash llvmlite harness (scaffold) Usage: - python3 tools/llvmlite_harness.py [--in MIR.json] --out OUTPUT.o + - python3 tools/llvmlite_harness.py --out out.o # dummy ny_main -> object + - python3 tools/llvmlite_harness.py --in mir.json --out out.o # MIR(JSON) -> object (partial support) Notes: -- First cut emits a trivial ny_main that returns 0 to validate toolchain. -- Extend to lower MIR14 JSON incrementally. + - For initial scaffolding, when --in is omitted, a trivial ny_main that returns 0 is emitted. + - When --in is provided, this script delegates to src/llvm_py/llvm_builder.py. """ -from __future__ import annotations + import argparse import json import os import sys +from pathlib import Path -try: - from llvmlite import ir, binding -except Exception as e: # noqa: BLE001 - sys.stderr.write( - "llvmlite is required. 
Install with: python3 -m pip install llvmlite\n" - ) - sys.stderr.write(f"Import error: {e}\n") - sys.exit(2) +ROOT = Path(__file__).resolve().parents[1] +PY_BUILDER = ROOT / "src" / "llvm_py" / "llvm_builder.py" +def run_dummy(out_path: str) -> None: + # Minimal llvmlite program: ny_main() -> i32 0 + import llvmlite.ir as ir + import llvmlite.binding as llvm -def parse_args() -> argparse.Namespace: - ap = argparse.ArgumentParser(description="Nyash llvmlite harness") - ap.add_argument("--in", dest="in_path", help="MIR14 JSON input (optional)") - ap.add_argument("--out", dest="out_path", required=True, help="Output object file path") - return ap.parse_args() + llvm.initialize() + llvm.initialize_native_target() + llvm.initialize_native_asmprinter() - -def load_mir(path: str | None) -> dict | None: - if not path: - return None - with open(path, "r", encoding="utf-8") as f: - return json.load(f) - - -def build_trivial_module() -> ir.Module: - mod = ir.Module(name="nyash_harness") - mod.triple = binding.get_default_triple() - i64 = ir.IntType(64) - i8 = ir.IntType(8) - i8p = i8.as_pointer() - i8pp = i8p.as_pointer() - fn_ty = ir.FunctionType(i64, [i64, i8pp]) - fn = ir.Function(mod, fn_ty, name="ny_main") - entry = fn.append_basic_block(name="entry") + mod = ir.Module(name="nyash_module") + i32 = ir.IntType(32) + ny_main_ty = ir.FunctionType(i32, []) + ny_main = ir.Function(mod, ny_main_ty, name="ny_main") + entry = ny_main.append_basic_block("entry") b = ir.IRBuilder(entry) - b.ret(ir.Constant(i64, 0)) - return mod + b.ret(ir.Constant(i32, 0)) - -def emit_object(mod: ir.Module, out_path: str) -> None: - binding.initialize() - binding.initialize_native_target() - binding.initialize_native_asmprinter() - - target = binding.Target.from_default_triple() + # Emit object via target machine + m = llvm.parse_assembly(str(mod)) + m.verify() + target = llvm.Target.from_default_triple() tm = target.create_target_machine() - llvm_mod = binding.parse_assembly(str(mod)) - 
llvm_mod.verify() - obj = tm.emit_object(llvm_mod) + obj = tm.emit_object(m) + Path(out_path).parent.mkdir(parents=True, exist_ok=True) with open(out_path, "wb") as f: f.write(obj) +def run_from_json(in_path: str, out_path: str) -> None: + # Delegate to python builder to keep code unified + import runpy + # Ensure src/llvm_py is on sys.path for relative imports + builder_dir = str(PY_BUILDER.parent) + if builder_dir not in sys.path: + sys.path.insert(0, builder_dir) + # Simulate "python llvm_builder.py -o " + sys.argv = [str(PY_BUILDER), str(in_path), "-o", str(out_path)] + runpy.run_path(str(PY_BUILDER), run_name="__main__") -def main() -> int: - ns = parse_args() - _mir = load_mir(ns.in_path) - # For now, ignore MIR content and emit a trivial module. - mod = build_trivial_module() - os.makedirs(os.path.dirname(ns.out_path) or ".", exist_ok=True) - emit_object(mod, ns.out_path) - return 0 +def main(): + ap = argparse.ArgumentParser() + ap.add_argument("--in", dest="infile", help="MIR JSON input", default=None) + ap.add_argument("--out", dest="outfile", help="output object (.o)", required=True) + args = ap.parse_args() + if args.infile is None: + # Dummy path + run_dummy(args.outfile) + print(f"[harness] dummy object written: {args.outfile}") + else: + run_from_json(args.infile, args.outfile) + print(f"[harness] object written: {args.outfile}") -if __name__ == "__main__": # pragma: no cover - raise SystemExit(main()) - +if __name__ == "__main__": + try: + main() + except Exception as e: + import traceback + print(f"[harness] error: {e}", file=sys.stderr) + if os.environ.get('NYASH_CLI_VERBOSE') == '1': + traceback.print_exc() + sys.exit(1)
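
Review note: `emit_mir_json_for_harness` writes `ret`/`jump`/`branch` both from the instruction loop and from `bb.terminator`, so a block could in principle end up with a terminator op emitted twice, depending on how MIR stores terminators. The sketch below is one possible sanity check to run on `tmp/nyash_harness_mir.json` before handing it to the harness. It is not part of this change: `check_mir` is a hypothetical helper, and the assumptions that exactly these three ops are terminators and that `phi` ops must form a prefix of each block are inferred from the JSON shape in the diff, not from a documented schema.

```python
# Hypothetical pre-flight check for the MIR(JSON) emitted for the harness.
# Op names come from the diff above; treating exactly these three as
# terminators is an assumption of this sketch.
TERMINATORS = {"ret", "jump", "branch"}


def check_mir(doc: dict) -> list:
    """Return a list of human-readable problems found in a MIR(JSON) document."""
    problems = []
    for func in doc.get("functions", []):
        fname = func.get("name", "?")
        for block in func.get("blocks", []):
            where = f"{fname}/bb{block.get('id', '?')}"
            ops = [inst.get("op") for inst in block.get("instructions", [])]
            # Every block should end in a terminator.
            if not ops or ops[-1] not in TERMINATORS:
                problems.append(f"{where}: block does not end in a terminator")
            # Anything after the first terminator is unreachable or duplicated.
            first_term = next(
                (i for i, op in enumerate(ops) if op in TERMINATORS), None
            )
            if first_term is not None and first_term != len(ops) - 1:
                problems.append(f"{where}: instructions after terminator")
            # PHIs are assumed to form a prefix of the block.
            seen_non_phi = False
            for op in ops:
                if op != "phi":
                    seen_non_phi = True
                elif seen_non_phi:
                    problems.append(f"{where}: phi after non-phi instruction")
                    break
    return problems


# A block whose terminator was emitted twice, as could happen if a Return
# appears both in bb.instructions and in bb.terminator.
sample = {
    "functions": [
        {
            "name": "main",
            "params": [],
            "blocks": [
                {
                    "id": 0,
                    "instructions": [
                        {"op": "const", "dst": 1, "value": {"type": "i64", "value": 0}},
                        {"op": "ret", "value": 1},
                        {"op": "ret", "value": 1},
                    ],
                }
            ],
        }
    ]
}
print(check_mir(sample))  # ['main/bb0: instructions after terminator']
```

In practice the document would be loaded with `json.load` from the path the runner writes; an empty result list only means that these three structural checks passed, not that the MIR is otherwise valid.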