stage3: unify to cleanup; MIR return-defer; docs+smokes updated; LLVM(harness): finalize_phis ownership, ret.py simplified, uses-predeclare; cleanup return override green; method-postfix cleanup return WIP (PHI head)

Selfhosting Dev
2025-09-19 02:07:38 +09:00
parent 951a050592
commit 5e818eeb7e
205 changed files with 9671 additions and 1849 deletions


@ -4,7 +4,7 @@ Scope
- Extend Stage-2 parser emission to cover control flow constructs usually seen in routine code bases:
- `break` / `continue`
- `throw expr`
- `try { ... } catch (Type err) { ... } finally { ... }`
- `try { ... } catch (Type err) { ... } cleanup { ... }`
- Alert: other Stage-3 ideas (switch/async) remain out of scope until after self-host parity.
- Preserve existing Stage-2 behaviour (locals/loop/if/call/method/new/ternary) with no regressions.
@ -15,31 +15,32 @@ Guiding Principles
Current Status (Phase 15.3 2025-09-16)
- ParserBox / Selfhost compiler expose `stage3_enable` and `--stage3` CLI flag, defaulting to the safe Stage-2 surface.
- Rust core parser accepts Stage-3 syntax behind env `NYASH_PARSER_STAGE3=1` (default OFF) to keep Stage-2 stable.
- Break/Continue JSON emission and Bridge lowering are implemented. Bridge now emits `Jump` to loop exit/head and records instrumentation events.
- Throw/Try nodes are emitted when the gate is on, but still degrade to expression/no-op during lowering; runtime semantics remain TBD.
- Documentation for JSON v0 (Stage-3 nodes) is updated; remaining runtime work is tracked in CURRENT_TASK.md.
- Throw/Try nodes are emitted when the gate is on. When `NYASH_TRY_RESULT_MODE=1`, the Bridge lowers try/catch/cleanup into structured blocks and jumps (no MIR Throw/Catch). Nested throws route to a single catch via a thread-local ThrowCtx. Policy: single catch per try (branch inside catch).
- Documentation for JSON v0 (Stage-3 nodes) is updated; remaining native unwind work is tracked in CURRENT_TASK.md.
Runtime snapshot
- MIR Builder already lowers `ASTNode::Throw` into `MirInstruction::Throw` (unless disabled via `NYASH_BUILDER_DISABLE_THROW`) and has a provisional `build_try_catch_statement` that emits `Catch`/`Jump` scaffolding with env flags controlling fallback.
- Rust VM (`interpreter::ControlFlow::Throw`) supports catch/finally semantics and rethrows unhandled exceptions.
- Rust VM (`interpreter::ControlFlow::Throw`) supports catch/cleanup semantics and rethrows unhandled exceptions.
- Bridge degradation prevents these MIR paths from activating unless `NYASH_BRIDGE_THROW_ENABLE=1`; by default the Bridge emits Const 0, and with the flag ON it actually generates `Throw`.
PyVM plan
- Current PyVM runner treats Stage-3 constructs as no-ops because JSON v0 never emits MIR throws; once Bridge emits them, PyVM must mirror Rust VM semantics:
- Introduce a lightweight `exception` representation (reuse ErrorBox JSON form) and propagate via structured returns.
- Implement try/catch/finally execution order identical to Rust VM (catch matches first, finally always runs, rethrow on miss).
- Implement try/catch/cleanup execution order identical to Rust VM (catch matches first, cleanup always runs, rethrow on miss).
- Add minimal smoke tests under `tools/pyvm_stage2_smoke.sh` (gated) to ensure PyVM and LLVM stay in sync when Stage-3 is enabled.
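The execution order described in the PyVM plan can be sketched as follows. This is a minimal illustration, not the PyVM implementation: `Outcome`, `run_try`, and the error dictionary shape are hypothetical names chosen for the example, and the exception is propagated as a structured return rather than a Python `raise`.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Outcome:
    """Structured return: a normal value, or a pending error (hypothetical ErrorBox-like dict)."""
    value: Any = None
    error: Optional[dict] = None


def run_try(body: Callable[[], Outcome],
            catch_type: Optional[str],
            handler: Callable[[dict], Outcome],
            cleanup: Callable[[], None],
            log: List[str]) -> Outcome:
    out = body()
    if out.error is not None:
        # Catch matches first: an untyped catch (None) accepts any error.
        if catch_type is None or out.error.get("type") == catch_type:
            log.append("catch")
            out = handler(out.error)
        # On a miss, `out` keeps the error so the caller sees a rethrow.
    # Cleanup always runs, whether or not the catch matched.
    log.append("cleanup")
    cleanup()
    return out
```

A caller would inspect `out.error` after `run_try` returns: a non-`None` value means the error was not handled and should keep propagating outward.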
LLVM plan
- Short term: continue degrading throw/try to keep LLVM pipeline green while implementation lands (Stage-3 smoke ensures awareness).
- Implementation steps once runtime semantics are ready:
1. Ensure MIR output contains `Throw`/`Catch` instructions; update LLVM codegen to treat `Throw` as a call to a runtime helper (`nyash.rt.throw`) that unwinds or aborts.
2. Model catch/finally blocks using landing pads or structured IR (likely via `invoke`/`landingpad` in LLVM); document minimal ABI expected from NyRT.
2. Model catch/cleanup blocks using landing pads or structured IR (likely via `invoke`/`landingpad` in LLVM); document minimal ABI expected from NyRT.
3. Add gated smoke (`NYASH_LLVM_STAGE3_SMOKE`) that expects non-degraded behaviour (distinct exit codes or printed markers) once helper is active.
- Until landing pad support exists, document that Stage-3 throw/try is unsupported in LLVM release mode and falls back to interpreter/PyVM.
Testing plan
- JSON fixtures: create `tests/json_v0_stage3/{break_continue,throw_basic,try_catch_finally}.json` to lock parser/bridge output and allow regression diffs.
- JSON fixtures: create `tests/json_v0_stage3/{break_continue,throw_basic,try_catch_cleanup}.json` to lock parser/bridge output and allow regression diffs.
- PyVM/VM: extend Stage-3 smoke scripts with throw/try cases (under gate) to ensure runtime consistency before enabling by default.
- LLVM: combine `NYASH_LLVM_STAGE3_SMOKE=1` with `NYASH_BRIDGE_THROW_ENABLE=1` / `NYASH_BRIDGE_TRY_ENABLE=1` to exercise the actual exception path. This is planned to become always-on in the future.
- CI gating: add optional job that runs Stage-3 smokes (PyVM + LLVM) nightly to guard against regressions while feature is still experimental.
@ -50,7 +51,7 @@ JSON v0 Additions
| break | `{ "type": "Break" }` | Lowered into loop exit block with implicit jump. |
| continue | `{ "type": "Continue" }` | Lowered into loop head block jump. |
| throw expr | `{ "type": "Throw", "expr": Expr }` | Initial implementation can degrade to `{ "type": "Expr", "expr": expr }` until VM/JIT semantics are ready. |
| try/catch/finally | `{ "type": "Try", "try": Stmt[], "catches": Catch[], "finally": Stmt[]? }` | Each `Catch` includes `{ "param": String?, "body": Stmt[] }`. Stage-1 implementation may treat as pass-through expression block. |
| try/catch/cleanup | `{ "type": "Try", "try": Stmt[], "catches": Catch[], "finally": Stmt[]? }` | Surface syntax uses `cleanup` but JSON v0 field remains `finally` for compatibility. Each `Catch` includes `{ "param": String?, "body": Stmt[] }`. Stage-1 implementation may treat as pass-through expression block. |
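To make the `cleanup`/`finally` naming concrete, here is a small sketch of a `Try` node as a Python dictionary, with two helper functions. The helpers `check_try` and `cleanup_body` are hypothetical and exist only for this example, and the `Int` expression shape is assumed from the JSON v0 expression list.

```python
# A Try node as it might appear in JSON v0. The surface keyword is `cleanup`,
# but the field is still named "finally".
try_node = {
    "type": "Try",
    "try": [{"type": "Throw", "expr": {"type": "Int", "value": 1}}],
    "catches": [{"param": "e", "body": []}],
    "finally": [{"type": "Expr", "expr": {"type": "Int", "value": 0}}],
}


def check_try(node: dict) -> bool:
    """Hypothetical sanity check: v0 field naming and the single-catch MVP policy."""
    if node.get("type") != "Try":
        return False
    if "cleanup" in node:  # the v0 field name is "finally", not "cleanup"
        return False
    return len(node.get("catches", [])) <= 1


def cleanup_body(node: dict) -> list:
    """Return the cleanup statements, which live under the "finally" key."""
    return node.get("finally") or []
```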
Lowering Strategy (Bridge)
1. **Break/Continue**
@ -58,13 +59,10 @@ Lowering Strategy (Bridge)
- `Break` maps to `Jump { target: loop_exit }`, `Continue` to `Jump { target: loop_head }`.
- MirBuilder already has `LoopBuilder`; expose helpers to fetch head/exit blocks.
2. **Throw/Try**
- Phase 15 MVP keeps them syntax-only to avoid VM/JIT churn. Parser/Emitter produce nodes; Bridge either degrades (Expr) or logs a structured event for future handling.
- Bridge helper `lower_throw` respects `NYASH_BRIDGE_THROW_ENABLE=1`; the default degrades to Const i64 0, and with the flag ON it actually generates `MirInstruction::Throw`.
- Try lowering plan:
1. Parse-time JSON already includes `catches`/`finally`. Bridge should map `try` body into a fresh region, emit basic blocks for each `catch`, and wire `finally` as a postamble block.
2. MIR needs explicit instructions/metadata for exception edges. Evaluate whether existing `MirInstruction::Throw` + `ControlFlow::Throw` is sufficient or if `Catch` terminators are required.
3. Until runtime implementation lands, keep current degrade path but log a structured event to flag unhandled try/catch.
2. **Throw/Try (Result-mode)**
- Enable `NYASH_TRY_RESULT_MODE=1` to lower try/catch/cleanup via structured blocks (no MIR Throw/Catch).
- A thread-local ThrowCtx records all throw sites in the try region and routes them to the single catch block. Catch param is wired via PHI (PHI-off uses edge-copy). Cleanup always executes.
- Nested throws are supported; multiple catch is not (MVP policy: branch inside catch).
3. **Metadata Events**
- Augment `crate::jit::observe` with `lower_shortcircuit`/`lower_try` stubs so instrumentation remains coherent when full support is wired.
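The Result-mode lowering in item 2 can be illustrated with a toy model. This sketch is not the Bridge code: the block names, the tuple-based instruction encoding, and the flat try body are assumptions, and `ThrowCtx` is a plain object here rather than a thread-local. It shows only the routing idea: every throw site becomes a `Jump` to the single catch block, and the catch parameter collects one incoming value per throw site.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ThrowCtx:
    """Records throw sites inside one try region (hypothetical shape)."""
    catch_block: str
    incoming: List[Tuple[str, str]] = field(default_factory=list)

    def record(self, pred_block: str, value: str) -> None:
        self.incoming.append((pred_block, value))


def lower_try(try_stmts: List[dict], catch_param: str) -> Dict[str, list]:
    blocks: Dict[str, list] = {"try0": []}
    ctx = ThrowCtx(catch_block="catch")
    cur = "try0"
    for stmt in try_stmts:
        if stmt["type"] == "Throw":
            # A throw is a plain jump to the catch block; no MIR Throw is emitted.
            ctx.record(cur, stmt["expr"])
            blocks[cur].append(("Jump", ctx.catch_block))
            cur = f"try{len(blocks)}"  # continuation block after the throw
            blocks[cur] = []
        else:
            blocks[cur].append(("Eval", stmt["expr"]))
    blocks[cur].append(("Jump", "cleanup"))  # normal path also reaches cleanup
    # The catch parameter merges the thrown values from all recorded sites.
    blocks["catch"] = [("Phi", catch_param, list(ctx.incoming)), ("Jump", "cleanup")]
    blocks["cleanup"] = [("Jump", "exit")]
    return blocks
```

In PHI-off mode the `Phi` entry would instead be realized as one copy per recorded predecessor, as the bullet on edge-copy describes.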
@ -77,8 +75,8 @@ Testing Plan
Migration Checklist
1. ParserBox emits Stage-3 nodes under `stage3_enable` gate to allow gradual rollout. ✅
2. Emitter attaches Stage-3 JSON when gate is enabled (otherwise degrade to existing Stage-2 forms). ✅
3. Bridge honours Stage-3 nodes when gate is on; break/continue lowering implemented, throw/try still degrade. ✅ (partial)
4. PyVM/VM/JIT semantics gradually enabled (throw/try remain degrade until corresponding runtime support is merged). 🔄 Pending runtime work.
3. Bridge honours Stage-3 nodes when gate is on; break/continue lowering implemented, throw/try supported via Result-mode structured blocks. ✅ (MVP)
4. PyVM/VM/JIT semantics gradually enabled (native unwind remains out of scope). 🔄 Future work.
5. Documentation kept in sync (`CURRENT_TASK.md`, release notes). ✅ (break/continue) / 🔄 (throw/try runtime notes).
References


@ -16,7 +16,7 @@ Statements (`StmtV0`)
- `Loop { cond, body: Stmt[] }` (Stage-2; while(cond) body)
- `Break` (Stage-3; exits current loop)
- `Continue` (Stage-3; jumps to loop head)
- `Try { try: Stmt[], catches?: Catch[], finally?: Stmt[] }` (Stage-3 skeleton; currently lowered as sequential `try` body only when runtime support is absent)
- `Try { try: Stmt[], catches?: Catch[], finally?: Stmt[] }` (Stage-3 skeleton; surface syntax uses `cleanup`, but the v0 field name remains `finally` for compatibility; currently lowered as sequential `try` body only when runtime support is absent)
Expressions (`ExprV0`)
- `Int { value }` where `value` is JSON number or digit string
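@ -41,6 +41,7 @@ PHI merging (Phase-15 late-stage policy)
- The MIR generation layer does not emit PHI (MIR13 operation). For If/Loop merges, the LLVM layer (llvmlite/Resolver) synthesizes the PHIs.
- For loops, the existing CFG (preheader→cond→{body|exit}; body→cond) is detected, and PHIs for loop-carried values are built in the header BB.
- In the future (LoopForm = MIR18), PHI construction is planned to be automated by reverse lowering from LoopForm placeholder instructions.
- PHI-off operation (Builder-side convention): place no copy inside the merge block; insert only edge_copy into the then/else predecessors (a self-copy is a no-op). This avoids use-before-def and duplicate copies by construction.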
Type meta (emitter/LLVM harness cooperation)
- `+` with any string operand → string concat path (handle fixed)
@ -81,4 +82,4 @@ If with local + PHI merge
]}
```
- `Break` / `Continue` are emitted when Stage-3 gate is enabled. When the bridge is compiled without Stage-3 lowering, frontends may degrade them into `Expr(Int(0))` as a safety fallback.
- `Try` nodes include optional `catches` entries of the form `{ param?: string, typeHint?: string, body: Stmt[] }`. Until runtime exception semantics land, downstream lowers only the `try` body and ignores handlers/finally.
- `Try` nodes include optional `catches` entries of the form `{ param?: string, typeHint?: string, body: Stmt[] }`. Until runtime exception semantics land, downstream lowers only the `try` body and ignores handlers/`finally`.


@ -42,3 +42,25 @@ Notes
- Array literal is enabled when syntax sugar is on (NYASH_SYNTAX_SUGAR_LEVEL=basic|full) or when NYASH_ENABLE_ARRAY_LITERAL=1 is set.
- Map literal is enabled when syntax sugar is on (NYASH_SYNTAX_SUGAR_LEVEL=basic|full) or when NYASH_ENABLE_MAP_LITERAL=1 is set.
- Identifier keys (`{name: v}`) are Stage-3 and require either NYASH_SYNTAX_SUGAR_LEVEL=full or NYASH_ENABLE_MAP_IDENT_KEY=1.
## Stage-3 (Gated) Additions
Enabled when `NYASH_PARSER_STAGE3=1` for the Rust parser (and via `--stage3`/`NYASH_NY_COMPILER_STAGE3=1` for the selfhost parser):
- try/catch/cleanup
- `try_stmt := 'try' block ('catch' '(' (IDENT IDENT | IDENT | ε) ')' block) ('cleanup' block)?`
- MVP policy: single `catch` per `try`
- `(Type var)` or `(var)` or `()` are accepted for the catch parameter.
- Block-postfix catch/cleanup (Phase 15.5)
- `block_catch := '{' stmt* '}' ('catch' '(' (IDENT IDENT | IDENT | ε) ')' block)? ('cleanup' block)?`
- Applies to standalone block statements. Do not attach to `if/else/loop` structural blocks (wrap with a standalone block when needed).
- Gate: `NYASH_BLOCK_CATCH=1` (or `NYASH_PARSER_STAGE3=1`).
- throw
- `throw_stmt := 'throw' expr`
- Method-level postfix catch/cleanup (Phase 15.6, gated)
- `method_decl := 'method' IDENT '(' params? ')' block ('catch' '(' (IDENT IDENT | IDENT | ε) ')' block)? ('cleanup' block)?`
- Gate: `NYASH_METHOD_CATCH=1` (or bundled with `NYASH_PARSER_STAGE3=1`)
These constructs remain experimental; behaviour may degrade to no-op in some backends until runtime support lands, as tracked in CURRENT_TASK.md.
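The three accepted catch-parameter forms, `(Type var)`, `(var)`, and `()`, can be checked with a small recognizer. This is only an illustration of the grammar alternative `(IDENT IDENT | IDENT | ε)`; the identifier pattern is an assumption (the real lexer rules are not given here), and `parse_catch_param` is a hypothetical helper, not part of either parser.

```python
import re
from typing import Optional, Tuple

# Assumed identifier shape; the actual IDENT lexical rule may differ.
IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
CATCH_PARAM = re.compile(
    rf"\(\s*(?:(?P<type>{IDENT})\s+(?P<var>{IDENT})|(?P<only>{IDENT}))?\s*\)"
)


def parse_catch_param(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (type_hint, var_name); either may be None."""
    m = CATCH_PARAM.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not a valid catch parameter: {text!r}")
    if m.group("type"):
        return (m.group("type"), m.group("var"))  # (Type var)
    return (None, m.group("only"))                # (var) or ()
```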


@ -36,7 +36,7 @@ High-performance execution via a Rust-built interpreter, and intuitive syntax
| `override` | Explicit override | `override speak() { }` |
| `break` | Loop exit | `break` |
| `catch` | Exception handling | `catch (e) { }` |
| `finally` | Finalization | `finally { }` |
| `cleanup` | Finalization (successor to `finally`) | `cleanup { }` |
| `throw` | Raise an exception | `throw error` |
| `nowait` | Asynchronous execution | `nowait future = task()` |
| `await` | Wait and fetch the result | `result = await future` |


@ -1,5 +1,8 @@
# MIR PHI Invariants
Note
- Default policy is PHI-off at MIR level. These invariants apply to the dev-only PHI-on mode and to how LLVM synthesizes PHIs from predecessor copies. See also `phi_policy.md`.
Scope: Builder/Bridge, PyVM, llvmlite (AOT)
Goal: Ensure deterministic PHI formation at control-flow merges so that
@ -32,4 +35,3 @@ Diagnostics
wiring in the LLVM path.
- Bridge verifier may allow `verify_allow_no_phi()` in PHI-off mode, but
the invariants above still apply to resolver synthesis order.


@ -0,0 +1,29 @@
## MIR PHI Policy (Phase-15)
Status
- Default: PHI-off (edge-copy mode). Builders/Bridge do not emit PHI; merges are realized via per-predecessor Copy into the merge destination.
- Dev-only: PHI-on is experimental for targeted tests (enable with `NYASH_MIR_NO_PHI=0`).
- LLVM: PHI synthesis is delegated to the LLVM/llvmlite path (AOT/EXE). PyVM serves as the semantic reference.
Rationale
- Simplify MIR builders and JSON bridge by removing PHI placement decisions from the core path.
- Centralize SSA join formation at a single backend (LLVM harness), reducing maintenance and divergence.
- Keep PyVM parity by treating merges as value routing; short-circuit semantics remain unchanged.
Operational Rules (PHI-off)
- Edge-copy only: predecessors write the merged destination via `Copy { dst=merged, src=pred_value }`.
- Merge block: must not emit a self-Copy for the same destination; merged value is already defined by predecessors.
- Verifier: `verify_allow_no_phi()` is on by default; dominance and merge checks are relaxed in PHI-off.
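The edge-copy rule can be sketched on an if/else merge. This is a toy model under stated assumptions: instructions are tuples, blocks are plain lists, and `lower_if_merge_phi_off` is a hypothetical helper, not a Builder API.

```python
from typing import List, Tuple

Instr = Tuple[str, str, str]  # ("Copy", dst, src) in this toy encoding


def lower_if_merge_phi_off(then_block: List[Instr],
                           else_block: List[Instr],
                           then_val: str,
                           else_val: str,
                           merged: str) -> List[Instr]:
    """Write `merged` from each predecessor; emit nothing for it in the merge block."""
    for block, src in ((then_block, then_val), (else_block, else_val)):
        if src != merged:
            # Edge copy: the predecessor defines the merged destination.
            block.append(("Copy", merged, src))
        # A self-copy (src == merged) is a no-op and is skipped.
    merge_block: List[Instr] = []  # no Copy and no Phi for `merged` here
    return merge_block
```

Because the merge block never redefines `merged`, any use there sees a value that every predecessor has already written, which is the property the operational rules aim for.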
Developer Notes (PHI-on dev mode)
- Requires cargo feature `phi-legacy`. Build with `--features phi-legacy` and enable with `NYASH_MIR_NO_PHI=0`.
- Builders may place `Phi` at block heads with inputs covering all predecessors.
- Use `NYASH_LLVM_TRACE_PHI=1` for wiring trace; prefer small, isolated tests.
Backends
- LLVM harness performs PHI synthesis based on predecessor copies and dominance.
- Other backends (Cranelift/JIT) are secondary during Phase-15; PHI synthesis there is not required.
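How a backend might recover PHI inputs from predecessor copies can be sketched as below. This is not the LLVM harness algorithm: it ignores dominance entirely, assumes every predecessor carries a copy into the merged value, and uses the same hypothetical tuple encoding `("Copy", dst, src)`.

```python
from typing import Dict, List, Tuple


def synthesize_phis(pred_blocks: Dict[str, List[Tuple]],
                    merged_values: List[str]) -> Dict[str, Dict[str, str]]:
    """Map each merged value to {predecessor_name: incoming_source}."""
    phis: Dict[str, Dict[str, str]] = {m: {} for m in merged_values}
    for name, ops in pred_blocks.items():
        for op in ops:
            if op[0] == "Copy" and op[1] in phis:
                # If a predecessor copies twice, the last copy wins.
                phis[op[1]][name] = op[2]
    return phis
```

Each inner dictionary corresponds to the incoming list of one PHI node at the head of the merge block.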
Acceptance
- Default smokes/CI run with PHI-off.
- Parity checks compare PyVM vs. LLVM AOT outputs; differences are resolved on the LLVM side when they stem from PHI formation.