fix(mir): fix else block scope bug - PHI materialization order

Root Cause:
- Else blocks were not propagating variable assignments to outer scope
- Bug 1 (if_form.rs): PHI materialization happened before variable_map reset,
  causing PHI nodes to be lost
- Bug 2 (phi.rs): Variable merge didn't check if else branch modified variables

Changes:
- src/mir/builder/if_form.rs:93-127
  - Reordered: reset variable_map BEFORE materializing PHI nodes
  - Now matches then-branch pattern (reset → materialize → execute)
  - Applied to both "else" and "no else" branches for consistency
- src/mir/builder/phi.rs:137-154
  - Added else_modified_var check to detect variable modifications
  - Use modified value from else_var_map_end_opt when available
  - Fall back to pre-if value only when truly not modified
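
The merge rule described for phi.rs can be sketched roughly as follows. This is a minimal illustration, not the code in src/mir/builder/phi.rs: `ValueId`, `VarMap`, `PhiInputs`, and `merge_if_var` are hypothetical names, and the sketch assumes a variable counts as "modified" when its end-of-branch value differs from the pre-if value. Only `else_var_map_end_opt` and the else-modified check come from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical SSA value handle; the real builder's type may differ.
type ValueId = u32;
type VarMap = HashMap<String, ValueId>;

/// Incoming values for one PHI node at the merge block.
#[derive(Debug, PartialEq)]
struct PhiInputs {
    then_val: ValueId,
    else_val: ValueId,
}

/// Pick PHI inputs for `name` after an if/else.
/// `else_var_map_end_opt` is `None` when the `if` has no else branch.
/// Returns `None` when the variable did not exist before the `if`
/// or when neither branch changed it (no PHI needed).
fn merge_if_var(
    name: &str,
    pre_if: &VarMap,
    then_var_map_end: &VarMap,
    else_var_map_end_opt: Option<&VarMap>,
) -> Option<PhiInputs> {
    let pre = *pre_if.get(name)?;
    let then_val = then_var_map_end.get(name).copied().unwrap_or(pre);

    // Value the else branch ended with, kept only if it differs from the pre-if value.
    let else_modified_var = else_var_map_end_opt
        .and_then(|m| m.get(name).copied())
        .filter(|v| *v != pre);

    // Use the modified value when available; fall back to the pre-if value otherwise.
    let else_val = else_modified_var.unwrap_or(pre);

    if then_val == pre && else_val == pre {
        return None;
    }
    Some(PhiInputs { then_val, else_val })
}
```

With `x` bound to value 1 before the `if`, 2 at the end of the then-branch, and 3 at the end of the else-branch, the sketch yields inputs (2, 3); without an else branch it yields (2, 1).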

Test Results:
 Simple block: { x=42 } → 42
 If block: if 1 { x=42 } → 42
 Else block: if 0 { x=99 } else { x=42 } → 42 (FIXED!)
 Stage-B body extraction: "return 42" correctly extracted (was null)

Impact:
- Else block variable assignments now work correctly
- Stage-B compiler body extraction restored
- Selfhost builder path can now function
- Foundation for Phase 21.x progress

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
nyash-codex
2025-11-13 20:16:20 +09:00
parent 801833df8d
commit 8b44c5009f
19 changed files with 309 additions and 205 deletions

@ -37,6 +37,9 @@ Parser/StageB
- HAKO_STAGEB_FUNC_SCAN=1
- Dev-only: inject a `defs` array into Program(JSON) with scanned method definitions for `box Main`.
- HAKO_STAGEB_BODY_EXTRACT=0|1
- Toggle the Stage-B body extractor. When `0`, skip method-body extraction and pass the full `--source` to `parse_program2`. Useful to avoid environment-specific drift in extractors; default is `1` (enabled).
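
A minimal sketch of how such a toggle could be interpreted, assuming that only the literal value `0` disables extraction and that an unset variable means the documented default of `1`. The helper names and the fallback to the full source when nothing was extracted are hypothetical, not taken from the codebase.

```rust
/// Interpret the raw value of HAKO_STAGEB_BODY_EXTRACT.
/// Assumption: unset or any value other than "0" means enabled.
fn body_extract_enabled(raw: Option<&str>) -> bool {
    !matches!(raw.map(str::trim), Some("0"))
}

/// Choose what is handed to the parser (hypothetical helper).
/// When extraction is disabled, the full source is passed through unchanged.
fn source_for_parser<'a>(
    full_source: &'a str,
    extracted_body: Option<&'a str>,
    raw: Option<&str>,
) -> &'a str {
    if body_extract_enabled(raw) {
        // Assumption: fall back to the full source if no body was extracted.
        extracted_body.unwrap_or(full_source)
    } else {
        full_source
    }
}
```

In a real program the raw value would typically come from `std::env::var("HAKO_STAGEB_BODY_EXTRACT").ok()` and be passed in via `as_deref()`.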
Selfhost builders and wrappers
- HAKO_SELFHOST_BUILDER_FIRST=1
- Prefer the Hako MirBuilder path first; wrappers fall back to Rust CLI builder on failure to keep runs green.
@ -70,12 +73,12 @@ Builder/Emit (Selfhost)
- HAKO_SELFHOST_TRACE=1
- Print additional traces during MIR emit bench/wrappers.
- HAKO_MIR_BUILDER_LOOP_FORCE_JSONFRAG=1
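- Force the selfhost builder (and wrappers) to emit a minimal, pure control-flow MIR(JSON) for loop cases (const/phi/compare/branch/binop/jump/ret).
- Dev-only. Combined with purify/normalize, this keeps side-effect instructions out of the ret block, which makes AOT/EXE verification easier.
- HAKO_MIR_BUILDER_LOOP_FORCE_JSONFRAG=1 (dev-only)
- Emergency workaround that forces generation of a minimal MIR(JSON) (const/phi/compare/branch/jump/ret only).
- Limited to diagnostics when emit is broken. Do not use in bench or production paths.
- HAKO_MIR_BUILDER_JSONFRAG_NORMALIZE=1, HAKO_MIR_BUILDER_JSONFRAG_PURIFY=1
- Enables JsonFrag normalization and purification. With purify=1, newbox/boxcall/externcall/mir_call are removed and instructions after ret are cut off (structural purification).
- HAKO_MIR_BUILDER_JSONFRAG_NORMALIZE=1, HAKO_MIR_BUILDER_JSONFRAG_PURIFY=1 (dev-only)
- JsonFrag formatting/purification utilities. Their purpose is to stabilize comparison and visualization; they are not "optimizations" that change semantics or performance.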
Provider path (delegate)
- HAKO_MIR_NORMALIZE_PROVIDER=1

@ -66,3 +66,31 @@ SSOT for Using/Resolver (summary)
- Verify routing: HAKO_VERIFY_PRIMARY=hakovm (default); hv1_inline perf path parity (env toggles only).
- Build: `cargo build --release` (default features); LLVM paths are opt-in.
- Docs: keep RESTORE steps for any archived parts; small diffs, easy rollback.
## Convergence Plan — Line Consolidation (A→D)
Goal: reduce parallel lines (Rust/Hako builders, VM variants, LLVM backends) to a clear SSOT while keeping reversibility.
Phase A — Stabilize (now)
- SSOT: semantics/normalization/optimization live in Hako (AotPrep/Normalize).
- Rust: limit to structure/safety/emit (SSA/PHI/guards/executor). No new rules.
- Gates: quick/integration canaries green; VM↔LLVM parity for representatives; no default flips.
Phase B — Defaultization (small flips)
- Stage-B/selfhost builder: default ON in dev/quick; provider as fallback. Document toggles and rollback.
- AotPrep passes: enable normalize/collections_hot behind canaries; promote gradually.
- Docs: state promotion criteria and rollback steps in ENV_VARS + CURRENT_TASK.
Phase C — Line Thinning
- LLVM: prefer crate (ny-llvmc) as default; llvmlite becomes optional job (deprecation window).
- VM: Hakorune VM = primary; PyVM = reference/comparison only.
- Remove duplicated heavy paths from default profiles; keep explicit toggles for restore.
Phase D — Toggle Cleanup & Sunsets
- Once stable in defaults for ≥2 weeks: remove legacy toggles and code paths (e.g., Rust normalize.rs).
- Record sunset plan (reason/range/restore) in CURRENT_TASK and changelog.
Acceptance (each phase)
- quick/integration green, parity holds (exit codes/log shape where applicable).
- Defaults unchanged until promotion; any flip is guarded and reversible.
- Small diffs; explicit RESTORE steps; minimal blast radius.

@ -16,7 +16,8 @@ Core
- Carrier analysis emits observation hints only (zero runtime cost).
- Break/continue lowering is unified via LoopBuilder; nested bare blocks inside loops are handled consistently (Program nodes recurse into loop-aware lowering).
- Scope
- Enter/Leave scope events are observable through MIR hints; they do not affect program semantics.
- Enter/Leave scope events are observable through MIR hints; they do not affect program semantics.
- Block-scoped locals: `local x = ...` declares a binding limited to the lexical block. Assignment without `local` updates the nearest enclosing binding; redeclaration with `local` shadows the outer variable. This is Lua-like and differs from Python, where blocks do not introduce a new scope.
Observability
- MIR hints can be traced via `NYASH_MIR_HINTS` (pipe style): `trace|scope|join|loop|phi` or `jsonl=path|loop`.

@ -16,6 +16,9 @@ Statement separation and semicolons
Imports and namespaces
- See: reference/language/using.md — `using` syntax, runner resolution, and style guidance.
Variables and scope
- See: reference/language/variables-and-scope.md — Block-scoped locals, assignment resolution, and strong/weak reference guidance.
Grammar (EBNF)
- See: reference/language/EBNF.md — Stage-2 grammar specification used by parser implementations.
- Unified Members (stored/computed/once/birth_once): see reference/language/EBNF.md “Box Members (Phase 15)” and the Language Reference section. Default ON (disable with `NYASH_ENABLE_UNIFIED_MEMBERS=0`).

@ -0,0 +1,66 @@
# Variables and Scope (Local/Block Semantics)
Status: Stable (Stage-3 surface for `local`), default strong references.
This document defines the variable model used by Hakorune/Nyash and clarifies how locals interact with blocks, memory, and references across VMs (Rust VM, Hakorune VM, LLVM harness).
## Local Variables
- Syntax: `local name = expr`
- Scope: Block-scoped. The variable is visible from its declaration to the end of the lexical block.
- Redeclaration: Writing `local name = ...` inside a nested block creates a new shadowing binding. Writing `name = ...` without `local` updates the nearest existing binding in an enclosing scope.
- Mutability: Locals are mutable unless future keywords specify otherwise (e.g., `const`).
- Lifetime: The variable binding is dropped at block end; any referenced objects live as long as at least one strong reference exists elsewhere.
Notes:
- Stage-3 gate: Parsing `local` requires Stage-3 to be enabled (`NYASH_PARSER_STAGE3=1` or equivalent runner profile).
## Assignment Resolution (Enclosing Scope Update)
Assignment to an identifier resolves as follows:
1) If a `local` declaration with the same name exists in the current block, update that binding.
2) Otherwise, search outward through enclosing blocks and update the first found binding.
3) If no binding exists in any enclosing scope, create a new binding in the current scope.
This matches intuitive block-scoped semantics (Lua-like). It differs from Python, where inner blocks do not create a new scope (scoping is per function) and assignment creates a function-local binding unless `nonlocal`/`global` is used.
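
The three resolution steps above can be sketched with a stack of per-block maps. This is an illustrative model only, not the Hakorune/Nyash implementation: the `Scopes` type and its methods are hypothetical, and values are simplified to `i64`.

```rust
use std::collections::HashMap;

/// Stack of lexical blocks; the last entry is the innermost block.
struct Scopes {
    blocks: Vec<HashMap<String, i64>>,
}

impl Scopes {
    fn new() -> Self {
        Scopes { blocks: vec![HashMap::new()] }
    }

    /// Enter a nested `{ ... }` block.
    fn enter(&mut self) {
        self.blocks.push(HashMap::new());
    }

    /// Leave the innermost block, dropping its bindings (the root block is kept).
    fn leave(&mut self) {
        if self.blocks.len() > 1 {
            self.blocks.pop();
        }
    }

    /// `local name = value`: always binds in the current block (may shadow).
    fn declare_local(&mut self, name: &str, value: i64) {
        self.blocks
            .last_mut()
            .expect("at least one block")
            .insert(name.to_string(), value);
    }

    /// `name = value`: steps 1 and 2 search from the current block outward;
    /// step 3 creates a binding in the current block if none was found.
    fn assign(&mut self, name: &str, value: i64) {
        for block in self.blocks.iter_mut().rev() {
            if let Some(slot) = block.get_mut(name) {
                *slot = value;
                return;
            }
        }
        self.declare_local(name, value);
    }

    /// Read the nearest visible binding.
    fn lookup(&self, name: &str) -> Option<i64> {
        self.blocks.iter().rev().find_map(|b| b.get(name).copied())
    }
}

/// Walks through update, implicit creation, and shadowing in one nested block.
fn demo() -> (Option<i64>, Option<i64>) {
    let mut s = Scopes::new();
    s.declare_local("x", 1);
    s.enter();
    s.assign("x", 42); // updates the outer `x`
    s.assign("y", 7); // no binding anywhere: created in the inner block
    s.declare_local("x", 99); // shadows; the outer `x` stays 42
    s.leave();
    (s.lookup("x"), s.lookup("y"))
}
```

After the inner block ends, `demo()` sees the updated outer `x` and no `y`, because `y` was created in the block that has since been left.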
## Reference Semantics (Strong/Weak)
- Default: Locals hold strong references to boxes/collections. Implementation uses reference counting (strong = ownership) with internal synchronization.
- Weak references: Use `WeakBox` to hold a non-owning (weak) reference. Weak refs do not keep the object alive; they can be upgraded to strong at use sites. Intended for back-pointers and cache-like links to avoid cycles.
- Typical guidance:
- Locals and return values: strong references.
- Object fields that create cycles (child→parent): weak references.
Example (nested block retains object via outer local):
```
local a = null
{
  local b = new Box(a)
  a = b  // outer binding updated; a and b point to the same object
}
// leaving the block drops `b` (strong count falls to 1), but `a` still keeps the object alive
```
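
For readers more familiar with Rust, the same lifetime behavior can be approximated with `Arc` and `Weak` from the standard library. This is an analogy only, chosen because the text describes synchronized reference counting; it is not a statement about how `WeakBox` or boxes are implemented, and `Node` is a hypothetical type.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical stand-in for a boxed object.
struct Node {
    name: String,
}

/// Returns (strong count inside the block, strong count after the block,
/// weak upgrade succeeds while `a` lives, weak upgrade fails after `a` is dropped).
fn demo() -> (usize, usize, bool, bool) {
    let a: Arc<Node>;
    let inside;
    {
        let b = Arc::new(Node { name: "box".to_string() });
        a = Arc::clone(&b); // outer binding now shares ownership with `b`
        inside = Arc::strong_count(&a);
    } // `b` is dropped here; the object survives through `a`
    let after = Arc::strong_count(&a);

    // A weak reference does not keep the object alive.
    let weak: Weak<Node> = Arc::downgrade(&a);
    let alive = weak.upgrade().map_or(false, |n| n.name == "box");
    drop(a); // last strong reference gone
    let dead = weak.upgrade().is_none();

    (inside, after, alive, dead)
}
```

The weak handle mirrors the guidance above for back-pointers: it can be upgraded while a strong owner exists and yields nothing once the last owner is gone.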
## Shadowing vs. Updating
- Shadowing: `local x = ...` inside a block hides an outer `x` for the remainder of the inner block. The outer `x` remains unchanged.
- Updating: `x = ...` without `local` updates the nearest enclosing `x` binding.
Prefer clarity: avoid accidental shadowing. If you intentionally shadow, consider naming or comments to clarify intent.
## Const/Immutability (Future)
- A separate keyword (e.g., `const`) can introduce an immutable local. Semantics: same scoping as `local`, but reassignment is a compile error. This does not affect reference ownership (still strong by default).
## Cross-VM Consistency
The above semantics are enforced consistently across:
- Rust VM (MIR interpreter): scope updates propagate to enclosing locals.
- Hakorune VM/runner: same resolution rules.
- LLVM harness/EXE: parity tests validate identical exit codes/behavior.
See also: quick/integration smokes `scope_assign_vm.sh`, `vm_llvm_scope_assign.sh`.