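📚 Phase 15 - Clarify the self-hosting strategy and implement EXE-first

## Main changes

### 🎯 Strategy shift and clarification
- Position PyVM as a development tool (not a production path)
- Explicitly prioritize the EXE-first strategy (build_compiler_exe.sh implemented)
- Reorder the phases: 15.2 (LLVM) → 15.3 (compiler) → 15.4 (VM)

### 🚀 Self-hosting foundation
- Implement the Nyash compiler MVP in apps/selfhost-compiler/
  - compiler.nyash: main entry point (accepts positional arguments)
  - boxes/: split into parser_box, emitter_box, debug_box
- tools/build_compiler_exe.sh: native EXE build + dist packaging
- Python MVP parser: Stage-2 complete (local/if/loop/call/method/new)

### 📝 Documentation
- Update the Phase 15 README/ROADMAP (state the Self-Hosting priority explicitly)
- docs/guides/exe-first-wsl.md: add a WSL quickstart
- docs/private/papers/: add papers G through L and a 41-case "blazing-fast incident log"

### 🔧 Technical improvements
- JSON v0 Bridge: implement If/Loop PHI generation (with help from ChatGPT)
- Add a PyVM/llvmlite parity verification suite
- using/namespace feature (gated implementation; not resolved in Phase 15)

## Next steps
1. Fix the parser infinite loop (implement the missing functions)
2. Build the EXE and demonstrate self-hosting
3. Establish the c0→c1→c1' bootstrap loop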

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
Selfhosting Dev
2025-09-15 18:44:49 +09:00
parent 8f11c79f19
commit d90216e9c4
68 changed files with 4521 additions and 1641 deletions

View File

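@ -16,6 +16,13 @@ Making the most of the beauty of the 13 MIR instructions, external-compiler dependency
## 🚀 Implementation strategy (updated and revised September 2025)
### Self-Hosting first (Phase 15 groundwork)
- Goal: complete the Nyash-written parser, language features, Bridge consistency, and parity, then achieve self-hosting c0→c1→c1'.
- Operation:
- Recommended: run from the Runner with `NYASH_USE_NY_COMPILER=1` (child-process execution → JSON v0 → Bridge → MIR execution).
- Keep EXE generation as an optional experimental path (distribution is outside Phase 15).
- Use PyVM as the reference executor for semantic verification, and keep monitoring parity.
### Phase 15.2: Stabilize LLVM (llvmlite) + introduce PyVM
- JIT/Cranelift is paused (outdated/unsupported). Rust/inkwell is kept for reference only.
- The default compile path is **Python/llvmlite** (harness) only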
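@ -34,7 +41,7 @@ Making the most of the beauty of the 13 MIR instructions, external-compiler dependency
#### PHI handling policy (during Phase 15)
- Current: the JSON v0 Bridge generates the PHIs for If/Loop (stable, green).
- Policy: finish Phase 15 with this design as is (no changes).
- Reason: when LoopForm (Core14) is introduced, we want to converge on the (recommended) approach of generating PHIs automatically via reverse lowering.
- Reason: when LoopForm (MIR18) is introduced, we want to converge on the (recommended) approach of generating PHIs automatically via reverse lowering.
- A PHI is "an alias at a merge point", not an operation on a Box.
- Preserves the purity of the abstraction layer (Everything is Box).
- Concentrates implementation responsibility in one place (fewer lines, easier maintenance).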
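To illustrate "a PHI is an alias at a merge point", the following is a minimal Python sketch of how a bridge could derive the PHIs for an if/else join. It is not the actual JSON v0 Bridge code: the function name `merge_if_phis`, the dict-based variable environments, and the `dst`/`incoming` record shape are all assumptions made for this example.

```python
def merge_if_phis(pre, then_env, else_env, then_block, else_block):
    """Return (merged_env, phis) for an if/else join (hypothetical helper).

    Each env maps a variable name to the SSA value id that currently holds it.
    A PHI is emitted only for variables on which the two branches disagree.
    """
    all_ids = [*pre.values(), *then_env.values(), *else_env.values()]
    next_id = max(all_ids, default=-1) + 1
    merged, phis = {}, []
    # Only variables visible before the `if` survive past the merge point.
    for name in sorted(pre):
        then_val = then_env.get(name, pre[name])
        else_val = else_env.get(name, pre[name])
        if then_val == else_val:
            merged[name] = then_val  # same value on both edges: no PHI needed
            continue
        phis.append({
            "dst": next_id,
            "incoming": [(then_block, then_val), (else_block, else_val)],
        })
        merged[name] = next_id  # the PHI result is the new name after the join
        next_id += 1
    return merged, phis
```

The PHI only renames a value at the join; no Box method is called, which is why the policy above keeps it out of the Box-level abstraction.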
@ -79,6 +86,7 @@ Making the most of the beauty of the 13 MIR instructions, external-compiler dependency
- Smokes / Tools (updated)
- `tools/selfhost_compiler_smoke.sh` (entry point)
- `tools/build_compiler_exe.sh` (builds the Selfhost Parser as an EXE)
- `tools/ny_stage2_bridge_smoke.sh` (arithmetic/comparison/short-circuit/nested if)
- `tools/ny_parser_stage2_phi_smoke.sh` (PHI merges for If/Loop)
- `tools/parity.sh --lhs pyvm --rhs llvmlite <test.nyash>` (always run)
@ -101,10 +109,10 @@ Imports/Namespace plan (15.3-late)
- `tools/ny_roundtrip_smoke.sh` (Case A/B)
- Run `apps/tests/esc_dirname_smoke.nyash` / `apps/selfhost/tools/dep_tree_min_string.nyash` through the Ny parser path and confirm parity with PyVM/llvmlite (stdout/exit code)
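What "parity (stdout/exit code)" means can be sketched as follows. This is not how `tools/parity.sh` is implemented; `run_backend` and `check_parity` are hypothetical names, and the commands passed in stand in for whatever invokes each backend.

```python
import subprocess
import sys

def run_backend(cmd):
    """Run one backend command and capture what parity compares."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.stdout, proc.returncode

def check_parity(lhs_cmd, rhs_cmd):
    """Return True when both commands agree on stdout and exit code."""
    return run_backend(lhs_cmd) == run_backend(rhs_cmd)

if __name__ == "__main__":
    # Stand-in commands: two Python one-liners instead of real backends.
    lhs = [sys.executable, "-c", "print(3)"]
    rhs = [sys.executable, "-c", "import sys; sys.stdout.write('3\\n')"]
    print("parity" if check_parity(lhs, rhs) else "mismatch")
```

Comparing only stdout and the exit code keeps the check independent of backend-specific logging on stderr.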
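#### Preview: PHI automation in LoopForm (Core14) (after Phase 15)
#### Preview: PHI automation in LoopForm (MIR18) (after Phase 15)
- Strengthen LoopForm and synthesize PHIs by reverse lowering from the structural information in `loop.begin(loop_carried_values) / loop.iter / loop.branch / loop.end`.
- Likewise for If/short-circuit: determine the merge point from the structured blocks and automate the PHIs.
- Schedule: after Phase 15 (to be considered and implemented in Core14). No changes in Phase 15.
- Schedule: after Phase 15 (to be considered and implemented in MIR18/LoopForm). No changes in Phase 15.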
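A rough Python sketch of the idea: if the loop construct declares its loop-carried values up front, the header PHIs can be reserved at `loop.begin` and completed at `loop.end` without any dominance analysis. The helper names `loop_begin` / `loop_end`, the environment dicts, and the PHI record shape are assumptions for illustration; LoopForm itself is not yet specified in the source.

```python
def loop_begin(carried, env, preheader, next_id):
    """Reserve one header PHI per loop-carried value (entry edge only)."""
    phis = {}
    body_env = dict(env)
    for name in carried:
        phis[name] = {"dst": next_id, "incoming": [(preheader, env[name])]}
        body_env[name] = next_id  # the loop body reads the PHI result
        next_id += 1
    return phis, body_env, next_id

def loop_end(phis, body_env, latch):
    """Complete each header PHI with the value that reaches the back edge."""
    for name, phi in phis.items():
        phi["incoming"].append((latch, body_env[name]))
    return phis
```

Between the two calls, the body is lowered as usual and simply rebinds names in `body_env`; whatever each carried name is bound to at the latch becomes the second incoming value.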
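### Phase 15.4: Move the VM layer to Nyash (replacing PyVM)
- Using PyVM as scaffolding, port the VM core to a Nyash implementation in stages (starting from an instruction subset)
@ -116,7 +124,7 @@ Imports/Namespace plan (15.3-late)
Note: handling of JSON v0 (compatibility)
- Phase 15: the Bridge generates the PHIs (current behavior continues).
- From Core14 onward: once LoopForm PHI automation lands, PHIs on the JSON side become optional (to be dropped eventually).
- From MIR18 (LoopForm) onward: once PHI automation lands, PHIs on the JSON side become optional (to be dropped eventually).
- Type metadata ("+" with mixed strings / string comparison) stays.
## 📊 Key deliverables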
@ -314,9 +322,13 @@ ny_free_buf(buffer)
### ✅ Quick smokes (current state)
- PyVM↔llvmlite parity: `tools/parity.sh --lhs pyvm --rhs llvmlite apps/tests/esc_dirname_smoke.nyash`
- dep_tree (harness ON): `NYASH_LLVM_FEATURE=llvm ./tools/build_llvm.sh apps/selfhost/tools/dep_tree_min_string.nyash -o app_dep && ./app_dep`
- Selfhost Parser EXE: `tools/build_compiler_exe.sh && (cd dist/nyash_compiler && ./nyash_compiler tmp/sample.nyash > sample.json)`
- JSON v0 bridge spec: `docs/reference/ir/json_v0.md`
- Stage-2 smokes: `tools/ny_stage2_bridge_smoke.sh`, `tools/ny_parser_stage2_phi_smoke.sh`, `tools/ny_me_dummy_smoke.sh`
WSL Quickstart
- See: `docs/guides/exe-first-wsl.md` (in order: install dependencies → bundle the Parser EXE → run the smokes)
### 📚 Related phases
- [Phase 10: Cranelift JIT](../phase-10/)
- [Phase 12.5: Optimization strategy](../phase-12.5/)

View File

@ -15,10 +15,17 @@ This roadmap is a living checklist to advance Phase 15 with small, safe boxes. U
- [x] using/namespace (gated) + nyash.link minimal resolver
- [x] NyModules + ny_plugins regression suite (Windows path normalization/namespace derivation)
- [x] Standard Ny scripts scaffolds added (string/array/map P0) + examples + jit_smoke
- [x] Selfhost Parser accepts positional input file arg (prerequisite for EXE operation)
## Next (small boxes)
1) LLVM Native EXE Generation (Phase 15.2) 🚀
1) EXE-first: Selfhost Parser → EXE (Phase 15.2) 🚀
- Build the EXE with tools/build_compiler_exe.sh (creates a bundled dist package)
- Runs standalone from dist/nyash_compiler/{nyash_compiler,nyash.toml,plugins/...}
- Input: Ny source → Output: JSON v0 (stdout)
- Smokes: sample.nyash → produces a JSON line (JSON-only output)
- Risk: plugin resolution (FileBox) is pinned via nyash.toml
2) LLVM Native EXE Generation (AOT pipeline, continued)
- Python/llvmlite implementation as primary path (2400 lines, 10x faster development)
- LLVM backend object → executable pipeline completion
- Separate `nyash-llvm-compiler` crate (reduce main build weight)
@ -26,10 +33,10 @@ This roadmap is a living checklist to advance Phase 15 with small, safe boxes. U
- Link with nyrt runtime (static/dynamic options)
- Plugin all-direction build strategy (.so/.o/.a simultaneous generation)
- Integration: `nyash --backend llvm --emit exe program.nyash -o program.exe`
2) Standard Ny std impl (P0 → concrete implementation)
3) Standard Ny std impl (P0 → concrete implementation)
- Implement P0 methods for string/array/map in Nyash (keep NyRT primitives minimal)
- Enable via `nyash.toml` `[ny_plugins]` (opt-in); extend `tools/jit_smoke.sh`
3) Ny compiler MVP (Ny→MIR on JIT path) (Phase 15.3) 🎯
4) Ny compiler MVP (Ny→MIR on JIT path) (Phase 15.3) 🎯
- Ny tokenizer + recursive-descent parser (current subset) in Ny; drive existing MIR builder
- Target: 800 lines parser + 2500 lines MIR builder = 3300 lines total
- No circular dependency: nyrt provides StringBox/ArrayBox via C ABI
@ -43,13 +50,13 @@ This roadmap is a living checklist to advance Phase 15 with small, safe boxes. U
- [ ] local/if/loop/call/method/new/var/logical/compare
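- [ ] PHI merges are delegated to the Bridge (If/Loop)
- [ ] Smokes: nested if / loop accumulation / and/or × if/loop
4) PHI automation comes after Phase 15 (Core14 LoopForm)
5) PHI automation comes after Phase 15 (LoopForm = MIR18)
- Phase 15: keep the current Bridge-PHI; E2E green and parity take top priority
- Core14: strengthen LoopForm + generate PHIs automatically via reverse lowering (standardized merge points)
4) Bootstrap loop (c0→c1→c1')
- MIR18 (LoopForm): strengthen LoopForm + generate PHIs automatically via reverse lowering (standardized merge points)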
6) Bootstrap loop (c0→c1→c1')
- Use existing trace/hash harness to compare parity; add optional CI gate
- **This achieves self-hosting!** Nyash compiles Nyash
5) VM Layer in Nyash (Phase 15.4) ⚡
7) VM Layer in Nyash (Phase 15.4) ⚡
- Implement MIR interpreter in Nyash (13 core instructions)
- Dynamic dispatch via MapBox for instruction handlers
- BoxCall/ExternCall bridge to existing infrastructure
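The "dynamic dispatch via MapBox for instruction handlers" idea can be pictured with a plain dictionary standing in for MapBox. The sketch below is in Python rather than Nyash, covers only three instruction kinds, and uses instruction field names (`kind`, `dst`, `lhs`, `rhs`, `op`, `value`) that are assumptions, not the actual MIR JSON layout.

```python
import operator

# Hypothetical binary operators supported by this toy interpreter.
BINOPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def op_const(regs, ins):
    regs[ins["dst"]] = ins["value"]

def op_binop(regs, ins):
    regs[ins["dst"]] = BINOPS[ins["op"]](regs[ins["lhs"]], regs[ins["rhs"]])

# The handler table plays the role MapBox would play in the Nyash VM.
HANDLERS = {"const": op_const, "binop": op_binop}

def run_block(instructions):
    """Interpret a straight-line block; return the value of the first `ret`."""
    regs = {}
    for ins in instructions:
        kind = ins["kind"]
        if kind == "ret":
            return regs[ins["value"]]
        handler = HANDLERS.get(kind)
        if handler is None:
            raise ValueError(f"unsupported instruction: {kind}")
        handler(regs, ins)
    return None
```

Adding an instruction then means registering one more handler in the table, which is what makes porting "from an instruction subset" incremental.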
@ -74,8 +81,9 @@ This roadmap is a living checklist to advance Phase 15 with small, safe boxes. U
- Parser path: `--parser {rust|ny}` or `NYASH_USE_NY_PARSER=1`
- JSON dump: `NYASH_DUMP_JSON_IR=1`
- Preview (LoopForm): to be specified in Core14
- Preview (LoopForm): to be specified in MIR18
- Selfhost compiler: `NYASH_USE_NY_COMPILER=1`, child quiet: `NYASH_JSON_ONLY=1`
- EXE-first bundle: `tools/build_compiler_exe.sh` → `dist/nyash_compiler/`
- Load Ny plugins: `NYASH_LOAD_NY_PLUGINS=1` / `--load-ny-plugins`
- AOT smoke: `CLIF_SMOKE_RUN=1`
@ -83,6 +91,7 @@ This roadmap is a living checklist to advance Phase 15 with small, safe boxes. U
- JSON v0 bridge: `tools/ny_parser_bridge_smoke.sh` / `tools/ny_parser_bridge_smoke.ps1`
- E2E roundtrip: `tools/ny_roundtrip_smoke.sh` / `tools/ny_roundtrip_smoke.ps1`
- EXE-first smoke: `tools/build_compiler_exe.sh && (cd dist/nyash_compiler && ./nyash_compiler tmp/sample.nyash > sample.json)`
## Implementation Dependencies
@ -96,7 +105,7 @@ This roadmap is a living checklist to advance Phase 15 with small, safe boxes. U
- v0 E2E green (parser pipe + direct bridge) including Ny compiler MVP switch
- v1 minimal samples pass via JSON bridge
- AOT P2: emit→link→run stable for constant/arith
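- The Phase 15 STOP criteria do not include the PHI switchover (PHI is handled in LoopForm/Core14)
- The Phase 15 STOP criteria do not include the PHI switchover (PHI is handled in LoopForm/MIR18)
- 15.3: Stage-1 representative samples green + Bootstrap smoke (fallback allowed) + statement-separation policy published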
- Docs/recipes usable on Windows/Unix

View File

@ -0,0 +1,100 @@
# MIR Builder EXE Design (Phase 15: EXE-First)
Purpose: define a standalone MIR Builder executable that takes Nyash JSON IR (v0/v1) and produces native outputs (object/executable), independent of the Rust Runner/VM. This aligns Phase 15 with an EXE-first delivery pipeline.
Goals
- Accept JSON IR from stdin or file and validate semantics (minimal passes).
- Emit: object (.o/.obj), LLVM IR (.ll), or final executable by linking with NyRT.
- Operate without the Rust VM path; suitable for CLI and scripted pipelines.
- Keep the boundary simple and observable (stdout diagnostics, exit codes).
CLI Interface (proposed)
- Basic form
- `ny_mir_builder [--in <file>|--stdin] [--emit {obj|exe|ll|json}] -o <out> [options]`
- Defaults: `--stdin`, `--emit obj`, `-o target/aot_objects/a.o`
- Options
- `--in <file>`: Input JSON IR file (v0/v1). If omitted, read stdin.
- `--emit {obj|exe|ll|json}`: Output kind. `json` emits validated/normalized IR for round-trip.
- `-o <path>`: Output path (object/exe/IR). Default under `target/aot_objects`.
- `--target <triple>`: Target triple override (default: host).
- `--nyrt <path>`: NyRT static runtime directory (for `--emit exe`).
- `--plugin-config <path>`: `nyash.toml` path resolution for boxes/plugins.
- `--quiet`: Minimize logs; only errors to stderr.
- `--validate-only`: Parse+validate IR; no codegen.
- `--verify-llvm`: Run LLVM verifier on generated IR (when `--emit {obj|exe}`).
Exit Codes
- `0` on success; >0 on error. Validation errors produce human-readable messages on stderr.
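The proposed flags and defaults can be written down as an argument-parser sketch. This only models the interface listed above in Python with `argparse`; the implementation language of the real `ny_mir_builder` is not stated, and `build_parser` / `parse_cli` are hypothetical names.

```python
import argparse

def build_parser():
    p = argparse.ArgumentParser(prog="ny_mir_builder")
    src = p.add_mutually_exclusive_group()
    # `in` is a Python keyword, so the value is stored under `input`.
    src.add_argument("--in", dest="input", metavar="FILE",
                     help="input JSON IR file (v0/v1)")
    src.add_argument("--stdin", action="store_true",
                     help="read JSON IR from stdin (default)")
    p.add_argument("--emit", choices=["obj", "exe", "ll", "json"], default="obj")
    p.add_argument("-o", dest="output", default="target/aot_objects/a.o")
    p.add_argument("--target", metavar="TRIPLE")
    p.add_argument("--nyrt", metavar="PATH")
    p.add_argument("--plugin-config", metavar="PATH")
    p.add_argument("--quiet", action="store_true")
    p.add_argument("--validate-only", action="store_true")
    p.add_argument("--verify-llvm", action="store_true")
    return p

def parse_cli(argv):
    args = build_parser().parse_args(argv)
    # Per the proposal, stdin is the source whenever --in is omitted.
    args.stdin = args.input is None
    return args
```

`argparse` exits with status 2 on a usage error, which is consistent with the ">0 on error" convention above.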
Input IR
- JSON v0 (current Bridge spec). Unknown fields are ignored; `meta.usings` is accepted.
- Future JSON v1 (additive) must remain backward compatible; builder performs normalization.
Outputs
- `--emit obj`: Native object file. Uses LLVM harness internally.
- `--emit ll`: Dumps LLVM IR for diagnostics.
- `--emit exe`: Produces a self-contained executable by linking the object with NyRT.
- `--emit json`: Emits normalized MIR JSON (post-validation) for round-trip tests.
Packaging Forms
- CLI executable: `ny_mir_builder` (primary).
- Optional shared lib: `libny_mir_builder` exposing a minimal C ABI for embedding.
C ABI Sketch (optional library form)
```c
#include <stddef.h>  // size_t
#include <stdint.h>  // uint8_t

// Input: JSON IR bytes. Output: newly allocated buffer with object bytes.
// Caller frees via ny_free_buf.
int ny_mir_to_obj(const uint8_t* json, size_t len,
                  const char* target_triple,
                  uint8_t** out_buf, size_t* out_len);

// Convenience linker: object → exe (returns 0=ok).
int ny_obj_to_exe(const uint8_t* obj, size_t len,
                  const char* nyrt_dir, const char* out_path);

// Releases a buffer returned through out_buf.
void ny_free_buf(void* p);
```
Internal Architecture
- Frontend
- JSON IR parser → AST/CFG structures compatible with existing MIR builder expectations.
- Validation passes: control-flow well-formedness, PHI consistency (incoming edges), type sanity for BoxCall/ExternCall minimal set.
- Codegen
- LLVM harness path (current primary). Environment fallback via `LLVM_SYS_180/181_PREFIX`.
- Option flag `NYASH_LLVM_FEATURE=llvm|llvm-inkwell-legacy` maintained for transitional builds.
- Linking (`--emit exe`)
- Use `cc` with `-L <nyrt>/target/release -lnyrt` (static preferred) + platform libs `-lpthread -ldl -lm` (Unix) or Win equivalents.
- Search `nyash.toml` near the output exe and current CWD (same heuristic as NyRT runtime) to initialize plugins at runtime.
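The Unix link step described above amounts to assembling one `cc` invocation. A small sketch, assuming the flag order shown and leaving the Windows equivalents unimplemented because the source does not spell them out; `link_command` is a hypothetical helper, not part of the existing tools.

```python
import sys

def link_command(obj_path, nyrt_dir, out_path, platform=sys.platform):
    """Build the argv for linking an object file against NyRT on Unix."""
    if platform.startswith("win"):
        # The design only says "Win equivalents"; they are not guessed here.
        raise NotImplementedError("Windows link line is not specified")
    return [
        "cc", obj_path,
        "-L", f"{nyrt_dir}/target/release", "-lnyrt",  # NyRT library
        "-lpthread", "-ldl", "-lm",                    # platform libs (Unix)
        "-o", out_path,
    ]
```

Returning an argv list (rather than a shell string) lets a caller pass it directly to a process-spawning API without quoting issues.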
Integration Points
- Parser EXE → MIR Builder EXE
- `./nyash_compiler <in.nyash> | ny_mir_builder --stdin --emit obj -o a.o`
- Compose with link step for quick end-to-end: `... --emit exe -o a.out`
- Runner (future option)
- `NYASH_USE_NY_COMPILER_EXE=1`: Runner spawns parser EXE; optionally chain into MIR Builder EXE for AOT.
- Fallback to in-proc Bridge when EXE pipeline fails.
Logging & Observability
- Default: single-line summaries on success, detailed errors on failure.
- `--quiet` to suppress non-essential logs; `--verify-llvm` to force verification.
- Print `OK obj:<path>` / `OK exe:<path>` on success (stable for scripts).
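A script consuming the stable success line might parse it as below. This is a sketch under the assumption that the line is exactly `OK <kind>:<path>` with `kind` being `obj` or `exe`; `parse_ok_line` is a hypothetical helper.

```python
def parse_ok_line(line):
    """Return (kind, path) for an `OK obj:<path>` / `OK exe:<path>` line, else None."""
    line = line.strip()
    if not line.startswith("OK "):
        return None
    # Split on the first colon only, so paths containing ':' stay intact.
    kind, sep, path = line[3:].partition(":")
    if not sep or kind not in ("obj", "exe") or not path:
        return None
    return kind, path
```

Splitting on the first colon only matters on Windows, where the path itself may contain a drive-letter colon.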
Security & Sandboxing
- No arbitrary file writes beyond `-o` and temp dirs.
- Deny network; fail fast on malformed JSON.
Platform Notes
- Windows: `.obj` + link with MSVC or lld-link; prefer bundling `nyrt` artifacts.
- macOS/Linux: `.o` + `cc` link; RPATH/loader path considerations documented.
Incremental Plan
1) CLI skeleton: stdin/file → validate → `--emit json/ll` (dry path) + golden tests.
2) Hook LLVM harness: `--emit obj` for const/arith/branch/ret subset.
3) Linker integration: `--emit exe` with NyRT static lib; add platform matrices.
4) Parity suite: run produced EXE against known cases (strings/collections minimal).
5) Packaging: `tools/build_mir_builder_exe.sh` + smoke `tools/mir_builder_exe_smoke.sh`
Reference Artifacts
- `tools/build_llvm.sh`: current harness flow used as a baseline for emission and link steps.
- `crates/nyrt`: runtime interface and plugin host initialization heuristics.

View File

@ -28,7 +28,7 @@ Acceptance (15.3)
- Ny compiler can lex/parse `using` forms without breaking Stage-1/2 programs
- Runner path (Rust) continues to resolve `using` and `nyash.toml` as before (parity unchanged)
Looking ahead (Core14 / Phase 16)
Looking ahead (MIR18 / Phase 16)
- Evaluate moving `nyash.toml` parsing to Ny as a library box (ConfigBox)
- Unify include/import/namespace into a single resolver pass in Ny with a small JSON side channel back to the runner
- Keep VM unchanged; all resolution before MIR build