diff --git a/CURRENT_TASK.md b/CURRENT_TASK.md
index fe9bd69a..2d262449 100644
--- a/CURRENT_TASK.md
+++ b/CURRENT_TASK.md
@@ -1,6 +1,15 @@
-# Current Task — Phase 21.6 (Solidification: Hakorune‑only chain)
+# Current Task — Phase 21.7 (Normalization & Unification: Methodize Static Boxes)
-Today’s update (structure-first)
+Phase 21.6 wrap-up
+- Parser (Stage‑B): the conservative fallback for loop JSON restores only malformed shapes (well-formed shapes pass through unchanged)
+- Representative E2E canaries (return/binop/loop/call) PASS (call is lowered as a function: "Main.add" → Global function call)
+- Added user function calls to VM/LLVM (Global dispatch in the VM; arity correction and function pre-definition in LLVM)
+- docs added: phase-21.6-solidification; canaries tidied up
+
+Next — Phase 21.7 (Normalization & Unification)
+- Goal: move step by step from the Global("Box.method")-based function normalization to unified Method calls (static singleton receiver)
+- Defaults are not changed. Sequence: methodize behind a dev toggle → canaries green → promote to default
+- Complete naming and arity normalization before optimization
 - Added always-on quick runner: `tools/smokes/v2/run_quick.sh` (fast, SKIP-safe)
 - Includes Stage‑B Program(JSON) shape and loop_scan (!= with else Break/Continue) canaries.
 - Hardened MirBuilder internals: replaced JSON hand scans with `JsonFragBox` in core paths (Return/If/Loop/Method utils).
@@ -32,10 +41,13 @@ Additional updates (dev-only, behavior unchanged by default)
 - Enabled `NYASH_LLVM_VERIFY=1`/`NYASH_LLVM_VERIFY_IR=1` for deterministic cases
 - Promoted `return42` canary to FAIL on mismatch (was SKIP)
-Next — Phase 21.6 (Solidification before optimization)
- - Chain acceptance first: Parser(Stage‑B) → MirBuilder → VM/EXE parity on host
- - Bring‑up aids only behind env toggles; defaults remain unchanged
- - Optimizations are paused until all canaries are green (phase‑21.6 checklist)
+Action items (21.7)
+1) docs: phase-21.7-normalization/README.md, CHECKLIST.md (added)
+2) MirBuilder (dev): with HAKO_MIR_BUILDER_METHODIZE=1, rewrite Global("Box.method") → Method(receiver=singleton)
+3) VM: stabilize the static singleton on the Method path via ensure_static_box_instance (reuses existing functionality)
+4) LLVM: lower mir_call(Method) through the existing unified lowering; assist with receiver type/argument consistency if needed
+5) canary: keep rc=5 with methodize both OFF and ON; observe the method callee (dev)
+6) Default promotion and removal plan: once green is sustained, turn methodize ON by default → keep Global compatibility for the time being
 Phase 21.6 tasks (bug‑first, default OFF aids)
 1) Keep the Parser(Stage‑B) loop JSON canary green (tools/dev/stageb_loop_json_canary.sh)
@@ -48,16 +60,9 @@ Constraints / Guardrails
 - quick remains green; new toggles default OFF (`NYASH_LLVM_FAST` / `NYASH_VM_FAST`).
 - Changes small, reversible; acceptance = EXE parity + speedup in benches.
-Notes
-- Do not broaden default behavior; bring‑up aids remain opt‑in with clear flags.
-- The Rust layer is diagnostics only. Development proceeds with hakorune (Stage‑B/MirBuilder/VM) + ny‑llvmc (crate).
-
-Phase 21.6 references
-- docs/development/roadmap/phases/phase-21.6-solidification/README.md
-- docs/development/roadmap/phases/phase-21.6-solidification/CHECKLIST.md
-- tools/dev/enable_phase216_env.sh
-- tools/dev/stageb_loop_json_canary.sh
-- tools/dev/phase216_chain_canary.sh
+Phase references
+- 21.6: docs/development/roadmap/phases/phase-21.6-solidification/{README.md,CHECKLIST.md}
+- 21.7: docs/development/roadmap/phases/phase-21.7-normalization/{README.md,CHECKLIST.md}
 ---
diff --git a/docs/development/roadmap/phases/phase-21.6-solidification/CHECKLIST.md b/docs/development/roadmap/phases/phase-21.6-solidification/CHECKLIST.md
index f703c3e6..1cd14a01 100644
--- a/docs/development/roadmap/phases/phase-21.6-solidification/CHECKLIST.md
+++ b/docs/development/roadmap/phases/phase-21.6-solidification/CHECKLIST.md
@@ -6,16 +6,48 @@ Acceptance (all must be green on this host)
 - VM: MIR(JSON) for minimal loop returns 10
 - ny‑llvmc(crate) EXE: returns 10
+Canaries to run
+- bash tools/dev/stageb_loop_json_canary.sh
+- bash tools/dev/phase216_chain_canary.sh
+- bash tools/dev/phase216_chain_canary_return.sh
+- bash tools/dev/phase216_chain_canary_binop.sh
+- bash tools/dev/phase216_chain_canary_loop.sh
+- bash tools/dev/phase216_chain_canary_call.sh (Phase 21.6 extension)
+
 Guardrails
 - No default behavior changes; all aids behind env toggles.
 - Logs quiet; tags/dev traces are opt‑in.
 - No llvmlite in default chain; crate backend is main line.
-Canaries to run
-- bash tools/dev/stageb_loop_json_canary.sh
-- bash tools/dev/phase216_chain_canary.sh
-
 Rollback
 - Parser fallback in parse_loop is conservative; remove after VM/gpos fix lands.
 - Keep canaries; they protect against regressions.
+## Phase 21.6 Call Support Extension
+
+**Toggles**:
+- `HAKO_STAGEB_FUNC_SCAN=1`: Enable function definition scanning in Stage-B
+- `HAKO_MIR_BUILDER_FUNCS=1`: Enable function definition lowering in MirBuilder (currently unused; the Rust delegate is used instead)
+
+**Implementation**:
+1. Stage-B function scanner extracts `method <name>(params){ body }` definitions
+2. Stage-B injects `defs` array into Program(JSON v0)
+3. Rust delegate (src/runner/json_v0_bridge/) processes `defs` and generates MIR functions
+4. Function names are qualified as `Box.method` (e.g., `Main.add`)
+
+**Current Status**:
+- ✅ Stage-B scanning implemented (compiler_stageb.hako)
+- ✅ Rust delegate defs processing implemented (ast.rs, lowering.rs)
+- ✅ MIR functions generated with correct signatures and bodies
+- ⚠️ Call resolution incomplete: calls still use dynamic string lookup ("add") instead of a static Main.add reference
+
+**Next Steps for Call Completion**:
+- Need to resolve Call("add") → Call(Main.add) at lowering time
+- Implement call target resolution in json_v0_bridge/lowering.rs
+- Update Call instruction to use function reference instead of string
+- Alternative: use MirCall with Callee::Function for local calls
+
+**Removal Criteria**:
+- After Phase 22 introduces proper local function scoping with Callee typed calls
+- Or when unified with the broader namespace/using system (Phase 15.5+)
+
diff --git a/docs/development/roadmap/phases/phase-21.7-normalization/CHECKLIST.md b/docs/development/roadmap/phases/phase-21.7-normalization/CHECKLIST.md
new file mode 100644
index 00000000..c4baaa68
--- /dev/null
+++ b/docs/development/roadmap/phases/phase-21.7-normalization/CHECKLIST.md
@@ -0,0 +1,20 @@
+Phase 21.7 — Normalization Checklist (Methodization)
+
+Targets (must be green)
+- Naming: all user functions canonicalized as Box.method/N
+- Arity: params.len == call.args.len (or explicit mapping when defs absent)
+- Methodization (dev toggle): Global("Box.method") → Method(receiver=static singleton)
+- VM:
Method calls execute via ensure_static_box_instance for static boxes
+- LLVM: mir_call(Method) lowering produces correct IR; rc parity preserved
+
+Canaries
+- tools/dev/phase216_chain_canary_call.sh — remains PASS when OFF, PASS when ON
+- Add: tools/dev/phase217_methodize_canary.sh (dev) — asserts method callee usage in MIR or IR tags
+
+Toggles
+- HAKO_MIR_BUILDER_METHODIZE=1 (new)
+- HAKO_STAGEB_FUNC_SCAN=1 / HAKO_MIR_BUILDER_FUNCS=1 / HAKO_MIR_BUILDER_CALL_RESOLVE=1 (existing)
+
+Rollback
+- Disable HAKO_MIR_BUILDER_METHODIZE; remove methodization rewrite; keep Global path active.
+
diff --git a/docs/development/roadmap/phases/phase-21.7-normalization/README.md b/docs/development/roadmap/phases/phase-21.7-normalization/README.md
new file mode 100644
index 00000000..b0ad0a02
--- /dev/null
+++ b/docs/development/roadmap/phases/phase-21.7-normalization/README.md
@@ -0,0 +1,42 @@
+Phase 21.7 — Normalization & Unification (Methodize Static Boxes)
+
+Goal
+- Unify user-defined function calls onto a single, consistent representation.
+- Move from ad-hoc Global("Box.method") calls toward Method calls with an explicit (singleton) receiver when appropriate.
+- Keep defaults stable; introduce dev toggles and canaries; then promote to default after green.
+
+Scope
+- Parser/Stage‑B: keep emitting Program(JSON v0). No schema changes required for MVP.
+- MirBuilder: add methodization (dev toggle) that:
+  - Emits method functions as-is (defs) and/or provides a mapping to Method calls
+  - Rewrites Global("Box.method") to Method {receiver=static singleton, box_name, method}
+  - Preserves arity and naming as Box.method/N
+- VM: handle Method calls uniformly (receiver resolved via ensure_static_box_instance for static boxes).
+- LLVM: rely on mir_call(Method) lowering (already supported) and provide the static receiver where needed.
+
+Toggles
+- HAKO_MIR_BUILDER_METHODIZE=1 — enable methodization (rewrite Global→Method with static singleton receiver)
+- HAKO_STAGEB_FUNC_SCAN=1 — Stage‑B defs scan (already available)
+- HAKO_MIR_BUILDER_FUNCS=1 — lower defs to MIR functions (existing)
+- HAKO_MIR_BUILDER_CALL_RESOLVE=1 — Global name resolution (existing)
+
+Design Rules
+- Naming: canonical "Box.method/N". Arity N must equal params.len (or callsite args len if defs unknown).
+- Receiver: for static boxes, provide implicit singleton receiver. For instance boxes, preserve existing Method path.
+- Effects: preserve EffectMask semantics; do not broaden side effects.
+- Fail‑Fast: when rewrite is ambiguous (multiple candidates), keep original form and WARN under dev toggle; never change defaults silently.
+
+Acceptance
+- Canaries:
+  - call (global style) passes when methodization ON (rc unchanged)
+  - Existing return/binop/loop/call (global) canaries remain green when OFF
+  - A dedicated methodization canary asserts presence of callee.type=="Method" in MIR(JSON v1) or correct v0 lowering
+
+Rollout Plan
+1) Ship behind HAKO_MIR_BUILDER_METHODIZE=1 with canaries
+2) Validate on representative apps (selfhost‑first path)
+3) Promote to default ON only after green for a cycle; keep rollback instructions
+
+Rollback
+- Disable HAKO_MIR_BUILDER_METHODIZE. Revert to Global("Box.method") resolution path (current 21.6 behavior).
+
diff --git a/lang/src/compiler/entry/compiler_stageb.hako b/lang/src/compiler/entry/compiler_stageb.hako
index 77d3529f..64de6256 100644
--- a/lang/src/compiler/entry/compiler_stageb.hako
+++ b/lang/src/compiler/entry/compiler_stageb.hako
@@ -12,6 +12,7 @@ using sh_core as StringHelpers // Required: ParserStringUtilsBox depends on this (using chain unresolved)
 using "hako.compiler.entry.bundle_resolver" as BundleResolver
 using lang.compiler.parser.box as ParserBox
+using lang.compiler.entry.func_scanner as FuncScannerBox
 // Note: Runner resolves entry as Main.main by default.
 // Provide static box Main with method main(args) as the entry point.
@@ -339,6 +340,66 @@ static box Main {
     // Bridge(JSON v0) accepts Program v0 and lowers it to MIR, so AST(JSON v0) is emitted here.
     // By default, MIR is not emitted directly (avoids the heavy path and guarantees single-line output).
     local ast_json = p.parse_program2(body_src)
+
+    // 6.5) Dev-toggle: scan for function definitions (box Main { method <name>(...) {...} })
+    // Toggle: HAKO_STAGEB_FUNC_SCAN=1
+    // Policy: conservative minimal scanner for Phase 21.6 call canary (no nesting, basic only)
+    // Scope: method <name>(params) { ... } outside of main (same box Main)
+    // Output: inject "defs":[{"name":"","params":[...],"body":[...], "box":"Main"}] to Program JSON
+    local defs_json = ""
+    {
+      local func_scan = env.get("HAKO_STAGEB_FUNC_SCAN")
+      if func_scan != null && ("" + func_scan) == "1" {
+        // Use FuncScannerBox to extract method definitions
+        local methods = FuncScannerBox.scan_functions(src, "Main")
+
+        // Build defs JSON array
+        if methods.length() > 0 {
+          defs_json = ",\"defs\":["
+          local mi = 0
+          local mn = methods.length()
+          loop(mi < mn) {
+            local def = methods.get(mi)
+            local mname = "" + def.get("name")
+            local mparams = def.get("params")
+            local mbody = "" + def.get("body_json")
+            local mbox = "" + def.get("box")
+            // Build params array JSON
+            local params_arr = "["
+            local pi = 0
+            local pn = mparams.length()
+            loop(pi < pn) {
+              if pi > 0 { params_arr = params_arr + "," }
+              params_arr = params_arr + "\"" + ("" + mparams.get(pi)) + "\""
+              pi = pi + 1
+            }
+            params_arr = params_arr + "]"
+            if mi > 0 { defs_json = defs_json + "," }
+            defs_json = defs_json + "{\"name\":\"" + mname + "\",\"params\":" + params_arr + ",\"body\":" + mbody + ",\"box\":\"" + mbox + "\"}"
+            mi = mi + 1
+          }
+          defs_json = defs_json + "]"
+        }
+      }
+    }
+
+    // 7) Inject defs into Program JSON if available
+    if defs_json != "" && defs_json.length() > 0 {
+      // Insert defs before closing } of Program JSON
+      local ajson = "" + ast_json
+      local close_pos = -1
+      {
+        local j = ajson.length() - 1
+
loop(j >= 0) { + if ajson.substring(j, j + 1) == "}" { close_pos = j break } + j = j - 1 + } + } + if close_pos >= 0 { + ast_json = ajson.substring(0, close_pos) + defs_json + ajson.substring(close_pos, ajson.length()) + } + } + print(ast_json) return 0 } diff --git a/lang/src/compiler/entry/func_scanner.hako b/lang/src/compiler/entry/func_scanner.hako new file mode 100644 index 00000000..2decee5b --- /dev/null +++ b/lang/src/compiler/entry/func_scanner.hako @@ -0,0 +1,262 @@ +// FuncScannerBox — Function definition scanner for Stage-B compiler +// Policy: Extract method definitions from Hako source (conservative, minimal) +// Toggle: HAKO_STAGEB_FUNC_SCAN=1 +// Scope: method (params) { ... } outside of main (same box Main) +// Output: [{"name":"","params":[...],"body_json":"","box":"Main"}] + +using lang.compiler.parser.box as ParserBox + +static box FuncScannerBox { + // Scan source for method definitions (excluding main) + // Returns ArrayBox of method definitions + method scan_functions(source, box_name) { + local methods = new ArrayBox() + local s = "" + source + local n = s.length() + local i = 0 + + loop(i < n) { + // Search for "method " pattern + local k = -1 + { + local pat = "method " + local m = pat.length() + local j = i + loop(j + m <= n) { + if s.substring(j, j + m) == pat { k = j break } + j = j + 1 + } + } + if k < 0 { break } + i = k + 7 // skip "method " + + // Extract method name (alphanumeric until '(') + local name_start = i + local name_end = -1 + { + local j = i + loop(j < n) { + local ch = s.substring(j, j + 1) + if ch == "(" { name_end = j break } + j = j + 1 + } + } + if name_end < 0 { break } + local method_name = s.substring(name_start, name_end) + + // Skip main (already extracted as body) + if method_name == "main" { i = name_end continue } + + // Find '(' after name + local lparen = name_end + + // Find matching ')' for params (skip strings) + local rparen = -1 + { + local j = lparen + 1 + local in_str = 0 + local esc = 0 + loop(j < 
n) { + local ch = s.substring(j, j + 1) + if in_str == 1 { + if esc == 1 { esc = 0 j = j + 1 continue } + if ch == "\\" { esc = 1 j = j + 1 continue } + if ch == "\"" { in_str = 0 j = j + 1 continue } + j = j + 1 + continue + } + if ch == "\"" { in_str = 1 j = j + 1 continue } + if ch == ")" { rparen = j break } + j = j + 1 + } + } + if rparen < 0 { break } + + // Extract params (minimal: comma-separated names) + local params_str = s.substring(lparen + 1, rparen) + local params = me._parse_params(params_str) + + // Find opening '{' after ')' + local lbrace = -1 + { + local j = rparen + 1 + local in_str = 0 + local esc = 0 + loop(j < n) { + local ch = s.substring(j, j + 1) + if in_str == 1 { + if esc == 1 { esc = 0 j = j + 1 continue } + if ch == "\\" { esc = 1 j = j + 1 continue } + if ch == "\"" { in_str = 0 j = j + 1 continue } + j = j + 1 + continue + } + if ch == "\"" { in_str = 1 j = j + 1 continue } + if ch == "{" { lbrace = j break } + j = j + 1 + } + } + if lbrace < 0 { break } + + // Find matching '}' (balanced) + local rbrace = -1 + { + local depth = 0 + local j = lbrace + local in_str = 0 + local esc = 0 + loop(j < n) { + local ch = s.substring(j, j + 1) + if in_str == 1 { + if esc == 1 { esc = 0 j = j + 1 continue } + if ch == "\\" { esc = 1 j = j + 1 continue } + if ch == "\"" { in_str = 0 j = j + 1 continue } + j = j + 1 + continue + } + if ch == "\"" { in_str = 1 j = j + 1 continue } + if ch == "{" { depth = depth + 1 j = j + 1 continue } + if ch == "}" { + depth = depth - 1 + j = j + 1 + if depth == 0 { rbrace = j - 1 break } + continue + } + j = j + 1 + } + } + if rbrace < 0 { break } + + // Extract method body (inside braces) + local method_body = s.substring(lbrace + 1, rbrace) + + // Strip comments from method body + method_body = me._strip_comments(method_body) + + // Trim method body + method_body = me._trim(method_body) + + // Parse method body to JSON (statement list) + local body_json = null + if method_body.length() > 0 { + local p = new 
ParserBox() + p.stage3_enable(1) + body_json = p.parse_program2(method_body) + } + + // Store method definition + if body_json != null && body_json != "" { + local def = new MapBox() + def.set("name", method_name) + def.set("params", params) + def.set("body_json", body_json) + def.set("box", box_name) + methods.push(def) + } + i = rbrace + } + + return methods + } + + // Helper: parse comma-separated parameter names + method _parse_params(params_str) { + local params = new ArrayBox() + local pstr = "" + params_str + local pn = pstr.length() + local pstart = 0 + + loop(pstart < pn) { + // Skip whitespace + loop(pstart < pn) { + local ch = pstr.substring(pstart, pstart + 1) + if ch == " " || ch == "\t" || ch == "\n" || ch == "\r" { pstart = pstart + 1 } else { break } + } + if pstart >= pn { break } + + // Find next comma or end + local pend = pstart + loop(pend < pn) { + local ch = pstr.substring(pend, pend + 1) + if ch == "," { break } + pend = pend + 1 + } + + // Extract param name (trim) + local pname = pstr.substring(pstart, pend) + pname = me._trim(pname) + if pname.length() > 0 { params.push(pname) } + pstart = pend + 1 + } + + return params + } + + // Helper: strip comments from source + method _strip_comments(source) { + local s = "" + source + local out = "" + local i = 0 + local n = s.length() + local in_str = 0 + local esc = 0 + local in_line = 0 + local in_block = 0 + + loop(i < n) { + local ch = s.substring(i, i + 1) + if in_line == 1 { + if ch == "\n" { in_line = 0 out = out + ch } + i = i + 1 + continue + } + if in_block == 1 { + if ch == "*" && i + 1 < n && s.substring(i + 1, i + 2) == "/" { in_block = 0 i = i + 2 continue } + i = i + 1 + continue + } + if in_str == 1 { + if esc == 1 { out = out + ch esc = 0 i = i + 1 continue } + if ch == "\\" { out = out + ch esc = 1 i = i + 1 continue } + if ch == "\"" { out = out + ch in_str = 0 i = i + 1 continue } + out = out + ch + i = i + 1 + continue + } + // Not in string/comment + if ch == "\"" { out = out 
+ ch in_str = 1 i = i + 1 continue } + if ch == "/" && i + 1 < n { + local ch2 = s.substring(i + 1, i + 2) + if ch2 == "/" { in_line = 1 i = i + 2 continue } + if ch2 == "*" { in_block = 1 i = i + 2 continue } + } + out = out + ch + i = i + 1 + } + + return out + } + + // Helper: trim whitespace from string + method _trim(s) { + local str = "" + s + local n = str.length() + local b = 0 + + // left trim (space, tab, CR, LF) + loop(b < n) { + local ch = str.substring(b, b + 1) + if ch == " " || ch == "\t" || ch == "\r" || ch == "\n" { b = b + 1 } else { break } + } + + // right trim + local e = n + loop(e > b) { + local ch = str.substring(e - 1, e) + if ch == " " || ch == "\t" || ch == "\r" || ch == "\n" { e = e - 1 } else { break } + } + + if e > b { return str.substring(b, e) } + return "" + } +} diff --git a/lang/src/compiler/hako_module.toml b/lang/src/compiler/hako_module.toml index d12f8dd1..b02f0a84 100644 --- a/lang/src/compiler/hako_module.toml +++ b/lang/src/compiler/hako_module.toml @@ -35,6 +35,12 @@ builder.ssa.cond_inserter = "builder/ssa/cond_inserter.hako" builder.rewrite.special = "builder/rewrite/special.hako" builder.rewrite.known = "builder/rewrite/known.hako" +# Entry point modules (Phase 15 compiler infrastructure) +entry.func_scanner = "entry/func_scanner.hako" +entry.compiler = "entry/compiler.hako" +entry.compiler_stageb = "entry/compiler_stageb.hako" +entry.bundle_resolver = "entry/bundle_resolver.hako" + [dependencies] "selfhost.shared" = "^1.0.0" "selfhost.vm" = "^1.0.0" diff --git a/lang/src/mir/builder/MirBuilderBox.hako b/lang/src/mir/builder/MirBuilderBox.hako index 2978b5cc..c7f886d4 100644 --- a/lang/src/mir/builder/MirBuilderBox.hako +++ b/lang/src/mir/builder/MirBuilderBox.hako @@ -17,6 +17,7 @@ using selfhost.shared.json.utils.json_frag as JsonFragBox using "hako.mir.builder.internal.jsonfrag_normalizer" as NormBox using "hako.mir.builder.internal.pattern_util" as PatternUtilBox +using lang.mir.builder.func_lowering as 
FuncLoweringBox
 static box MirBuilderBox {
   // Availability probe (for canaries)
@@ -27,6 +28,7 @@ static box MirBuilderBox {
     if ("" + t) == "1" { return 1 } else { return 0 }
   }
+
   // Main entry
   method emit_from_program_json_v0(program_json, opts) {
     // Debug tag (dev toggle only)
@@ -45,12 +47,26 @@ static box MirBuilderBox {
       print("[mirbuilder/input/invalid] missing version/kind keys")
       return null
     }
-    // Helper: optional normalization (dev toggle, default OFF)
+
+    // Dev-toggle: extract and lower function definitions (defs)
+    // Toggle: HAKO_MIR_BUILDER_FUNCS=1
+    // Policy: delegate to FuncLoweringBox for lowering
+    // Output: inject additional MIR functions to output JSON
+    local func_defs_mir = ""
+    {
+      local funcs_toggle = env.get("HAKO_MIR_BUILDER_FUNCS")
+      if funcs_toggle != null && ("" + funcs_toggle) == "1" {
+        func_defs_mir = FuncLoweringBox.lower_func_defs(s, s)
+      }
+    }
+    // Helper: optional normalization (dev toggle, default OFF) + func injection
     local norm_if = function(m) {
       if m == null { return null }
+      // Inject function definitions if available
+      local result = FuncLoweringBox.inject_funcs(m, func_defs_mir)
       local nv = env.get("HAKO_MIR_BUILDER_JSONFRAG_NORMALIZE")
-      if nv != null && ("" + nv) == "1" { return NormBox.normalize_all(m) }
-      return m
+      if nv != null && ("" + nv) == "1" { return NormBox.normalize_all(result) }
+      return result
     }
     // Internal path (default ON): const(int)+ret, binop+ret, and others; registry-first lowering
     // Disable with: HAKO_MIR_BUILDER_INTERNAL=0
diff --git a/lang/src/mir/builder/func_lowering.hako b/lang/src/mir/builder/func_lowering.hako
new file mode 100644
index 00000000..a929a5a3
--- /dev/null
+++ b/lang/src/mir/builder/func_lowering.hako
@@ -0,0 +1,567 @@
+// FuncLoweringBox — Function definition lowering and Call resolution for MirBuilder
+// Policy: Lower function defs to MIR + resolve Call targets to qualified names
+// Toggle: HAKO_MIR_BUILDER_FUNCS=1, HAKO_MIR_BUILDER_CALL_RESOLVE=1
+// Scope: Minimal support for Return(Int),
Return(Binary(+|-|*|/, Int|Var, Int|Var)), Return(Call) +// Output: Additional MIR functions + resolved Call targets + +using selfhost.shared.json.utils.json_frag as JsonFragBox + +static box FuncLoweringBox { + // Lower function definitions to MIR + // Returns comma-separated JSON strings for additional functions + method lower_func_defs(program_json, defs_json) { + if defs_json == null || defs_json == "" { return "" } + + local s = "" + program_json + local func_defs_mir = "" + + // Check for "defs" key in Program JSON + local defs_idx = JsonFragBox.index_of_from(s, "\"defs\":", 0) + if defs_idx < 0 { return "" } + + // Extract defs array bounds + local defs_start = -1 + local defs_end = -1 + { + local j = defs_idx + 7 // skip "defs": + // skip whitespace + loop(j < s.length()) { + local ch = s.substring(j, j + 1) + if ch == " " || ch == "\t" || ch == "\n" || ch == "\r" { j = j + 1 } else { break } + } + if j < s.length() && s.substring(j, j + 1) == "[" { + defs_start = j + 1 + // Find matching ] + local depth = 1 + local k = j + 1 + local in_str = 0 + local esc = 0 + loop(k < s.length()) { + local ch = s.substring(k, k + 1) + if in_str == 1 { + if esc == 1 { esc = 0 k = k + 1 continue } + if ch == "\\" { esc = 1 k = k + 1 continue } + if ch == "\"" { in_str = 0 k = k + 1 continue } + k = k + 1 + continue + } + if ch == "\"" { in_str = 1 k = k + 1 continue } + if ch == "[" { depth = depth + 1 k = k + 1 continue } + if ch == "]" { + depth = depth - 1 + if depth == 0 { defs_end = k break } + k = k + 1 + continue + } + k = k + 1 + } + } + } + if defs_start < 0 || defs_end < 0 { return "" } + + // Parse each def object in defs array + local defs_str = s.substring(defs_start, defs_end) + local func_jsons = new ArrayBox() + local func_map = new MapBox() // For Call resolution: name -> "Box.method" + + // Scan for {"name": pattern + local pos = 0 + loop(pos < defs_str.length()) { + local name_idx = JsonFragBox.index_of_from(defs_str, "\"name\":\"", pos) + if name_idx < 
0 { break } + + // Extract name + local name_start = name_idx + 8 + local name_end = -1 + { + local j = name_start + loop(j < defs_str.length()) { + if defs_str.substring(j, j + 1) == "\"" { name_end = j break } + j = j + 1 + } + } + if name_end < 0 { break } + local func_name = defs_str.substring(name_start, name_end) + + // Extract box name + local box_name = "Main" // default + local box_idx = JsonFragBox.index_of_from(defs_str, "\"box\":\"", name_end) + if box_idx >= 0 { + local box_start = box_idx + 7 + local box_end = -1 + { + local j = box_start + loop(j < defs_str.length()) { + if defs_str.substring(j, j + 1) == "\"" { box_end = j break } + j = j + 1 + } + } + if box_end >= 0 { box_name = defs_str.substring(box_start, box_end) } + } + + // Register function in map for Call resolution + func_map.set(func_name, box_name + "." + func_name) + + // Extract params array + local params_arr = new ArrayBox() + local params_idx = JsonFragBox.index_of_from(defs_str, "\"params\":[", name_end) + if params_idx >= 0 { + local params_start = params_idx + 10 + local params_end = -1 + { + local j = params_start + local depth = 1 + local in_str = 0 + local esc = 0 + loop(j < defs_str.length()) { + local ch = defs_str.substring(j, j + 1) + if in_str == 1 { + if esc == 1 { esc = 0 j = j + 1 continue } + if ch == "\\" { esc = 1 j = j + 1 continue } + if ch == "\"" { in_str = 0 j = j + 1 continue } + j = j + 1 + continue + } + if ch == "\"" { in_str = 1 j = j + 1 continue } + if ch == "[" { depth = depth + 1 j = j + 1 continue } + if ch == "]" { + depth = depth - 1 + if depth == 0 { params_end = j break } + j = j + 1 + continue + } + j = j + 1 + } + } + if params_end >= 0 { + local params_str = defs_str.substring(params_start, params_end) + // Extract param names from JSON array + local p_pos = 0 + loop(p_pos < params_str.length()) { + local p_idx = JsonFragBox.index_of_from(params_str, "\"", p_pos) + if p_idx < 0 { break } + local p_start = p_idx + 1 + local p_end = -1 + { + 
local j = p_start + loop(j < params_str.length()) { + if params_str.substring(j, j + 1) == "\"" { p_end = j break } + j = j + 1 + } + } + if p_end < 0 { break } + params_arr.push(params_str.substring(p_start, p_end)) + p_pos = p_end + 1 + } + } + } + + // Extract body JSON (Program statements) + local body_idx = JsonFragBox.index_of_from(defs_str, "\"body\":", name_end) + if body_idx >= 0 { + local body_start = body_idx + 7 + // Find body object bounds (scan for balanced {}) + local body_end = -1 + { + local j = body_start + // skip whitespace + loop(j < defs_str.length()) { + local ch = defs_str.substring(j, j + 1) + if ch == " " || ch == "\t" || ch == "\n" || ch == "\r" { j = j + 1 } else { break } + } + if j < defs_str.length() && defs_str.substring(j, j + 1) == "{" { + local depth = 1 + local k = j + 1 + local in_str = 0 + local esc = 0 + loop(k < defs_str.length()) { + local ch = defs_str.substring(k, k + 1) + if in_str == 1 { + if esc == 1 { esc = 0 k = k + 1 continue } + if ch == "\\" { esc = 1 k = k + 1 continue } + if ch == "\"" { in_str = 0 k = k + 1 continue } + k = k + 1 + continue + } + if ch == "\"" { in_str = 1 k = k + 1 continue } + if ch == "{" { depth = depth + 1 k = k + 1 continue } + if ch == "}" { + depth = depth - 1 + if depth == 0 { body_end = k + 1 break } + k = k + 1 + continue + } + k = k + 1 + } + } + } + if body_end >= 0 { + local body_json = defs_str.substring(body_start, body_end) + // Try to lower body to MIR + local mir_func = me._lower_func_body(func_name, box_name, params_arr, body_json, func_map) + if mir_func != null && mir_func != "" { + func_jsons.push(mir_func) + } + } + } + pos = name_end + } + + // Build additional functions JSON + if func_jsons.length() > 0 { + local fi = 0 + local fn = func_jsons.length() + loop(fi < fn) { + func_defs_mir = func_defs_mir + "," + ("" + func_jsons.get(fi)) + fi = fi + 1 + } + } + + return func_defs_mir + } + + // Lower function body to MIR (minimal support) + // Supports: Return(Int), 
Return(Binary(+|-|*|/, Int|Var, Int|Var)), Return(Call) + method _lower_func_body(func_name, box_name, params_arr, body_json, func_map) { + local body_str = "" + body_json + + // Check for Return statement + local ret_idx = JsonFragBox.index_of_from(body_str, "\"type\":\"Return\"", 0) + if ret_idx < 0 { return null } + + // Check for Call in Return + local call_idx = JsonFragBox.index_of_from(body_str, "\"type\":\"Call\"", ret_idx) + if call_idx >= 0 { + // Return(Call(name, args)) + return me._lower_return_call(func_name, box_name, params_arr, body_str, call_idx, func_map) + } + + // Check for Binary in Return + local bin_idx = JsonFragBox.index_of_from(body_str, "\"type\":\"Binary\"", ret_idx) + if bin_idx >= 0 { + // Return(Binary(op, lhs, rhs)) + return me._lower_return_binary(func_name, box_name, params_arr, body_str, bin_idx) + } + + // Check for Return(Int) directly + local int_idx = JsonFragBox.index_of_from(body_str, "\"type\":\"Int\"", ret_idx) + if int_idx >= 0 { + local val_idx = JsonFragBox.index_of_from(body_str, "\"value\":", int_idx) + if val_idx >= 0 { + local val = JsonFragBox.read_int_after(body_str, val_idx + 8) + if val != null { + // Build params JSON array + local params_json = me._build_params_json(params_arr) + local mir = "{\\\"name\\\":\\\"" + box_name + "." 
+ func_name + "\\\",\\\"params\\\":" + params_json + ",\\\"locals\\\":[],\\\"blocks\\\":[{\\\"id\\\":0,\\\"instructions\\\":[{\\\"op\\\":\\\"const\\\",\\\"dst\\\":1,\\\"value\\\":{\\\"type\\\":\\\"i64\\\",\\\"value\\\":" + val + "}},{\\\"op\\\":\\\"ret\\\",\\\"value\\\":1}]}]}" + return mir + } + } + } + + return null + } + + // Lower Return(Binary(op, lhs, rhs)) + method _lower_return_binary(func_name, box_name, params_arr, body_str, bin_idx) { + // Extract op + local op_idx = JsonFragBox.index_of_from(body_str, "\"op\":\"", bin_idx) + if op_idx < 0 { return null } + local op = JsonFragBox.read_string_after(body_str, op_idx + 5) + if !(op == "+" || op == "-" || op == "*" || op == "/") { return null } + + // Extract lhs (Var or Int) + local lhs_idx = JsonFragBox.index_of_from(body_str, "\"lhs\":{", bin_idx) + local lhs_type = null + local lhs_val = null + if lhs_idx >= 0 { + local lhs_type_idx = JsonFragBox.index_of_from(body_str, "\"type\":\"", lhs_idx) + if lhs_type_idx >= 0 { + lhs_type = JsonFragBox.read_string_after(body_str, lhs_type_idx + 7) + if lhs_type == "Var" { + local var_idx = JsonFragBox.index_of_from(body_str, "\"name\":\"", lhs_type_idx) + if var_idx >= 0 { lhs_val = JsonFragBox.read_string_after(body_str, var_idx + 8) } + } else if lhs_type == "Int" { + local val_idx = JsonFragBox.index_of_from(body_str, "\"value\":", lhs_type_idx) + if val_idx >= 0 { lhs_val = JsonFragBox.read_int_after(body_str, val_idx + 8) } + } + } + } + + // Extract rhs (Var or Int) + local rhs_idx = JsonFragBox.index_of_from(body_str, "\"rhs\":{", bin_idx) + local rhs_type = null + local rhs_val = null + if rhs_idx >= 0 { + local rhs_type_idx = JsonFragBox.index_of_from(body_str, "\"type\":\"", rhs_idx) + if rhs_type_idx >= 0 { + rhs_type = JsonFragBox.read_string_after(body_str, rhs_type_idx + 7) + if rhs_type == "Var" { + local var_idx = JsonFragBox.index_of_from(body_str, "\"name\":\"", rhs_type_idx) + if var_idx >= 0 { rhs_val = JsonFragBox.read_string_after(body_str, 
var_idx + 8) } + } else if rhs_type == "Int" { + local val_idx = JsonFragBox.index_of_from(body_str, "\"value\":", rhs_type_idx) + if val_idx >= 0 { rhs_val = JsonFragBox.read_int_after(body_str, val_idx + 8) } + } + } + } + + if lhs_type == null || rhs_type == null || lhs_val == null || rhs_val == null { return null } + + // Build MIR function with params + local insts = "" + local next_reg = 1 + + // Map params to registers (params start at r1, r2, ...) + local param_map = new MapBox() + { + local pi = 0 + local pn = params_arr.length() + loop(pi < pn) { + param_map.set("" + params_arr.get(pi), "" + next_reg) + next_reg = next_reg + 1 + pi = pi + 1 + } + } + + // Load lhs + local lhs_reg = next_reg + if lhs_type == "Var" { + // Use param register + local preg = param_map.get("" + lhs_val) + if preg == null { return null } + lhs_reg = JsonFragBox._str_to_int("" + preg) + } else if lhs_type == "Int" { + insts = "{\\\"op\\\":\\\"const\\\",\\\"dst\\\":" + next_reg + ",\\\"value\\\":{\\\"type\\\":\\\"i64\\\",\\\"value\\\":" + lhs_val + "}}" + lhs_reg = next_reg + next_reg = next_reg + 1 + } + + // Load rhs + local rhs_reg = next_reg + if rhs_type == "Var" { + // Use param register + local preg = param_map.get("" + rhs_val) + if preg == null { return null } + rhs_reg = JsonFragBox._str_to_int("" + preg) + } else if rhs_type == "Int" { + if insts != "" { insts = insts + "," } + insts = insts + "{\\\"op\\\":\\\"const\\\",\\\"dst\\\":" + next_reg + ",\\\"value\\\":{\\\"type\\\":\\\"i64\\\",\\\"value\\\":" + rhs_val + "}}" + rhs_reg = next_reg + next_reg = next_reg + 1 + } + + // binop + if insts != "" { insts = insts + "," } + insts = insts + "{\\\"op\\\":\\\"binop\\\",\\\"operation\\\":\\\"" + op + "\\\",\\\"lhs\\\":" + lhs_reg + ",\\\"rhs\\\":" + rhs_reg + ",\\\"dst\\\":" + next_reg + "}" + local result_reg = next_reg + next_reg = next_reg + 1 + + // ret + insts = insts + ",{\\\"op\\\":\\\"ret\\\",\\\"value\\\":" + result_reg + "}" + + // Build params JSON array + local 
params_json = me._build_params_json(params_arr) + local mir = "{\\\"name\\\":\\\"" + box_name + "." + func_name + "\\\",\\\"params\\\":" + params_json + ",\\\"locals\\\":[],\\\"blocks\\\":[{\\\"id\\\":0,\\\"instructions\\\":[" + insts + "]}]}" + return mir + } + + // Lower Return(Call(name, args)) + method _lower_return_call(func_name, box_name, params_arr, body_str, call_idx, func_map) { + // Extract call function name + local func_idx = JsonFragBox.index_of_from(body_str, "\"func\":\"", call_idx) + if func_idx < 0 { return null } + local call_name = JsonFragBox.read_string_after(body_str, func_idx + 8) + if call_name == null { return null } + + // Resolve call target + local resolved_name = me.resolve_call_target(call_name, func_map) + + // Extract args (minimal: Int or Var) + local args_arr = new ArrayBox() + local args_idx = JsonFragBox.index_of_from(body_str, "\"args\":[", call_idx) + if args_idx >= 0 { + // Parse args array (simplified) + local args_pos = args_idx + 8 + loop(args_pos < body_str.length()) { + local arg_type_idx = JsonFragBox.index_of_from(body_str, "\"type\":\"", args_pos) + if arg_type_idx < 0 { break } + if arg_type_idx > call_idx + 200 { break } // Limit search scope + + local arg_type = JsonFragBox.read_string_after(body_str, arg_type_idx + 7) + if arg_type == "Int" { + local val_idx = JsonFragBox.index_of_from(body_str, "\"value\":", arg_type_idx) + if val_idx >= 0 { + local val = JsonFragBox.read_int_after(body_str, val_idx + 8) + if val != null { + local arg_info = new MapBox() + arg_info.set("type", "Int") + arg_info.set("value", val) + args_arr.push(arg_info) + } + } + } else if arg_type == "Var" { + local var_idx = JsonFragBox.index_of_from(body_str, "\"name\":\"", arg_type_idx) + if var_idx >= 0 { + local var_name = JsonFragBox.read_string_after(body_str, var_idx + 8) + if var_name != null { + local arg_info = new MapBox() + arg_info.set("type", "Var") + arg_info.set("value", var_name) + args_arr.push(arg_info) + } + } + } + 
args_pos = arg_type_idx + 20 + } + } + + // Build MIR for Call + local insts = "" + local next_reg = 1 + + // Map params to registers + local param_map = new MapBox() + { + local pi = 0 + local pn = params_arr.length() + loop(pi < pn) { + param_map.set("" + params_arr.get(pi), "" + next_reg) + next_reg = next_reg + 1 + pi = pi + 1 + } + } + + // Load const for function name + insts = "{\\\"op\\\":\\\"const\\\",\\\"dst\\\":" + next_reg + ",\\\"value\\\":{\\\"type\\\":\\\"string\\\",\\\"value\\\":\\\"" + resolved_name + "\\\"}}" + local func_reg = next_reg + next_reg = next_reg + 1 + + // Load args + local arg_regs = new ArrayBox() + { + local ai = 0 + local an = args_arr.length() + loop(ai < an) { + local arg_info = args_arr.get(ai) + local arg_type = "" + arg_info.get("type") + if arg_type == "Int" { + local val = arg_info.get("value") + insts = insts + ",{\\\"op\\\":\\\"const\\\",\\\"dst\\\":" + next_reg + ",\\\"value\\\":{\\\"type\\\":\\\"i64\\\",\\\"value\\\":" + val + "}}" + arg_regs.push(next_reg) + next_reg = next_reg + 1 + } else if arg_type == "Var" { + local var_name = "" + arg_info.get("value") + local preg = param_map.get(var_name) + if preg != null { + arg_regs.push(JsonFragBox._str_to_int("" + preg)) + } + } + ai = ai + 1 + } + } + + // Build args list + local args_list = "" + { + local ri = 0 + local rn = arg_regs.length() + loop(ri < rn) { + if ri > 0 { args_list = args_list + "," } + args_list = args_list + ("" + arg_regs.get(ri)) + ri = ri + 1 + } + } + + // call instruction (using func_reg for function name) + insts = insts + ",{\\\"op\\\":\\\"call\\\",\\\"func\\\":" + func_reg + ",\\\"args\\\":[" + args_list + "],\\\"dst\\\":" + next_reg + "}" + local result_reg = next_reg + next_reg = next_reg + 1 + + // ret + insts = insts + ",{\\\"op\\\":\\\"ret\\\",\\\"value\\\":" + result_reg + "}" + + // Build params JSON array + local params_json = me._build_params_json(params_arr) + local mir = "{\\\"name\\\":\\\"" + box_name + "." 
+ func_name + "\\\",\\\"params\\\":" + params_json + ",\\\"locals\\\":[],\\\"blocks\\\":[{\\\"id\\\":0,\\\"instructions\\\":[" + insts + "]}]}" + + // Debug log + if env.get("HAKO_MIR_BUILDER_DEBUG") == "1" { + print("[mirbuilder/call:lowered] " + func_name + " -> call(" + resolved_name + ")") + } + + return mir + } + + // Resolve call target using function map + // Toggle: HAKO_MIR_BUILDER_CALL_RESOLVE=1 + method resolve_call_target(call_name, func_map) { + if env.get("HAKO_MIR_BUILDER_CALL_RESOLVE") != "1" { return call_name } + + local resolved = func_map.get(call_name) + if resolved != null { + if env.get("HAKO_MIR_BUILDER_DEBUG") == "1" { + print("[mirbuilder/call:resolve] " + call_name + " => " + resolved) + } + return "" + resolved + } + return call_name + } + + // Helper: build params JSON array + method _build_params_json(params_arr) { + local params_json = "[" + local pi = 0 + local pn = params_arr.length() + loop(pi < pn) { + if pi > 0 { params_json = params_json + "," } + params_json = params_json + "\\\"" + ("" + params_arr.get(pi)) + "\\\"" + pi = pi + 1 + } + params_json = params_json + "]" + return params_json + } + + // Inject function definitions into MIR JSON + method inject_funcs(mir_json, func_defs_mir) { + if func_defs_mir == null || func_defs_mir == "" { return mir_json } + + // Find "functions":[{ in mir_json and inject after first function + local mir_str = "" + mir_json + local funcs_idx = JsonFragBox.index_of_from(mir_str, "\"functions\":[", 0) + if funcs_idx < 0 { return mir_json } + + // Find first function's closing } + local first_func_start = funcs_idx + 13 // skip "functions":[ + local brace_depth = 0 + local first_func_end = -1 + { + local j = first_func_start + local in_str = 0 + local esc = 0 + loop(j < mir_str.length()) { + local ch = mir_str.substring(j, j + 1) + if in_str == 1 { + if esc == 1 { esc = 0 j = j + 1 continue } + if ch == "\\" { esc = 1 j = j + 1 continue } + if ch == "\"" { in_str = 0 j = j + 1 continue } + j = j 
+ 1 + continue + } + if ch == "\"" { in_str = 1 j = j + 1 continue } + if ch == "{" { brace_depth = brace_depth + 1 j = j + 1 continue } + if ch == "}" { + brace_depth = brace_depth - 1 + if brace_depth == 0 { first_func_end = j + 1 break } + j = j + 1 + continue + } + j = j + 1 + } + } + if first_func_end < 0 { return mir_json } + + // Inject func_defs_mir after first function + local result = mir_str.substring(0, first_func_end) + func_defs_mir + mir_str.substring(first_func_end, mir_str.length()) + return result + } +} diff --git a/lang/src/mir/hako_module.toml b/lang/src/mir/hako_module.toml new file mode 100644 index 00000000..94749b78 --- /dev/null +++ b/lang/src/mir/hako_module.toml @@ -0,0 +1,26 @@ +[module] +name = "lang.mir" +version = "1.0.0" + +[exports] +# MIR builder modules +builder.func_lowering = "builder/func_lowering.hako" +builder.MirBuilderBox = "builder/MirBuilderBox.hako" +builder.MirBuilderMinBox = "builder/MirBuilderMinBox.hako" +builder.pattern_registry = "builder/pattern_registry.hako" + +# MIR builder internal modules +builder.internal.prog_scan_box = "builder/internal/prog_scan_box.hako" +builder.internal.lower_load_store_local_box = "builder/internal/lower_load_store_local_box.hako" +builder.internal.lower_typeop_cast_box = "builder/internal/lower_typeop_cast_box.hako" +builder.internal.lower_typeop_check_box = "builder/internal/lower_typeop_check_box.hako" +builder.internal.lower_loop_simple_box = "builder/internal/lower_loop_simple_box.hako" +builder.internal.loop_opts_adapter_box = "builder/internal/loop_opts_adapter_box.hako" +builder.internal.builder_config_box = "builder/internal/builder_config_box.hako" +builder.internal.jsonfrag_normalizer_box = "builder/internal/jsonfrag_normalizer_box.hako" + +# MIR emitter +min_emitter = "min_emitter.hako" + +[dependencies] +"selfhost.shared" = "^1.0.0" diff --git a/nyash.toml b/nyash.toml index 79347430..e26b456c 100644 --- a/nyash.toml +++ b/nyash.toml @@ -12,6 +12,7 @@ paths = ["apps", 
"lib", ".", "lang/src"] [modules.workspace] members = [ "lang/src/compiler/hako_module.toml", + "lang/src/mir/hako_module.toml", "lang/src/shared/hako_module.toml", "lang/src/vm/hako_module.toml", "lang/src/runtime/meta/hako_module.toml", @@ -115,8 +116,10 @@ path = "lang/src/shared/common/string_helpers.hako" "lang.compiler.builder.rewrite.special" = "lang/src/compiler/builder/rewrite/special.hako" "lang.compiler.builder.rewrite.known" = "lang/src/compiler/builder/rewrite/known.hako" "lang.compiler.pipeline_v2.localvar_ssa_box" = "lang/src/compiler/pipeline_v2/local_ssa_box.hako" +"lang.compiler.entry.func_scanner" = "lang/src/compiler/entry/func_scanner.hako" "lang.compiler.entry.compiler" = "lang/src/compiler/entry/compiler.hako" "lang.compiler.entry.compiler_stageb" = "lang/src/compiler/entry/compiler_stageb.hako" +"lang.compiler.entry.bundle_resolver" = "lang/src/compiler/entry/bundle_resolver.hako" "lang.compiler.emit.mir_emitter_box" = "lang/src/compiler/emit/mir_emitter_box.hako" "lang.compiler.emit.common.json_emit_box" = "lang/src/compiler/emit/common/json_emit_box.hako" "lang.compiler.emit.common.mir_emit_box" = "lang/src/compiler/emit/common/mir_emit_box.hako" @@ -129,6 +132,21 @@ path = "lang/src/shared/common/string_helpers.hako" "lang.compiler.emit.common.newbox_emit" = "lang/src/compiler/emit/common/newbox_emit_box.hako" "lang.compiler.emit.common.header_emit" = "lang/src/compiler/emit/common/header_emit_box.hako" +# MIR modules (Phase 15 compiler infrastructure) +"lang.mir.builder.func_lowering" = "lang/src/mir/builder/func_lowering.hako" +"lang.mir.builder.MirBuilderBox" = "lang/src/mir/builder/MirBuilderBox.hako" +"lang.mir.builder.MirBuilderMinBox" = "lang/src/mir/builder/MirBuilderMinBox.hako" +"lang.mir.builder.pattern_registry" = "lang/src/mir/builder/pattern_registry.hako" +"lang.mir.builder.internal.prog_scan_box" = "lang/src/mir/builder/internal/prog_scan_box.hako" +"lang.mir.builder.internal.lower_load_store_local_box" = 
"lang/src/mir/builder/internal/lower_load_store_local_box.hako" +"lang.mir.builder.internal.lower_typeop_cast_box" = "lang/src/mir/builder/internal/lower_typeop_cast_box.hako" +"lang.mir.builder.internal.lower_typeop_check_box" = "lang/src/mir/builder/internal/lower_typeop_check_box.hako" +"lang.mir.builder.internal.lower_loop_simple_box" = "lang/src/mir/builder/internal/lower_loop_simple_box.hako" +"lang.mir.builder.internal.loop_opts_adapter_box" = "lang/src/mir/builder/internal/loop_opts_adapter_box.hako" +"lang.mir.builder.internal.builder_config_box" = "lang/src/mir/builder/internal/builder_config_box.hako" +"lang.mir.builder.internal.jsonfrag_normalizer_box" = "lang/src/mir/builder/internal/jsonfrag_normalizer_box.hako" +"lang.mir.min_emitter" = "lang/src/mir/min_emitter.hako" + # Shared helpers (selfhost shared/vm) — kept under `lang/` tree "selfhost.shared.json_adapter" = "lang/src/shared/json_adapter.hako" "selfhost.shared.common.mini_vm_scan" = "lang/src/shared/common/mini_vm_scan.hako" diff --git a/src/backend/mir_interpreter/handlers/calls.rs b/src/backend/mir_interpreter/handlers/calls.rs index 6a7ee624..1b515d66 100644 --- a/src/backend/mir_interpreter/handlers/calls.rs +++ b/src/backend/mir_interpreter/handlers/calls.rs @@ -802,6 +802,30 @@ impl MirInterpreter { } } + // Fallback: user-defined function dispatch for Global calls + // If none of the above extern/provider/global bridges matched, + // try to resolve and execute a user function present in this module. 
+ { + // Use unique-tail resolver against snapshot of self.functions + let fname = call_resolution::resolve_function_name( + func_name, + args.len(), + &self.functions, + self.cur_fn.as_deref(), + ); + if let Some(fname) = fname { + if let Some(func) = self.functions.get(&fname).cloned() { + // Load arguments and execute + let mut argv: Vec = Vec::new(); + for a in args { argv.push(self.reg_load(*a)?); } + if std::env::var("NYASH_VM_CALL_TRACE").ok().as_deref() == Some("1") { + eprintln!("[vm] global-call resolved '{}' -> '{}'", func_name, fname); + } + return self.exec_function_inner(&func, Some(&argv)); + } + } + } + fn execute_extern_function( &mut self, extern_name: &str, diff --git a/src/llvm_py/builders/block_lower.py b/src/llvm_py/builders/block_lower.py index 20b29b09..95e105d8 100644 --- a/src/llvm_py/builders/block_lower.py +++ b/src/llvm_py/builders/block_lower.py @@ -196,8 +196,15 @@ def lower_blocks(builder, func: ir.Function, block_by_id: Dict[int, Dict[str, An try: dst = inst.get("dst") if isinstance(dst, int): - if dst in builder.vmap: + # Prefer current vmap context (_current_vmap) updates; fallback to global vmap + _gval = None + try: + _gval = vmap_cur.get(dst) + except Exception: + _gval = None + if _gval is None and dst in builder.vmap: _gval = builder.vmap[dst] + if _gval is not None: try: if hasattr(_gval, 'add_incoming'): bb_of = getattr(getattr(_gval, 'basic_block', None), 'name', None) diff --git a/src/llvm_py/builders/function_lower.py b/src/llvm_py/builders/function_lower.py index 03000486..f31e5e85 100644 --- a/src/llvm_py/builders/function_lower.py +++ b/src/llvm_py/builders/function_lower.py @@ -30,6 +30,13 @@ def lower_function(builder, func_data: Dict[str, Any]): # Default: i64(i64, ...) 
signature; derive arity from '/N' suffix when params missing m = re.search(r"/(\d+)$", name) arity = int(m.group(1)) if m else len(params) + # Dev fallback: when params are missing for global (Box.method) functions, + # use observed call-site arity if available (scanned in builder.build_from_mir) + if arity == 0 and '.' in name: + try: + arity = int(builder.call_arities.get(name, 0)) + except Exception: + pass param_types = [builder.i64] * arity func_ty = ir.FunctionType(builder.i64, param_types) @@ -67,11 +74,38 @@ def lower_function(builder, func_data: Dict[str, Any]): if func is None: func = ir.Function(builder.module, func_ty, name=name) - # Map parameters to vmap (value_id: 0..arity-1) + # Map parameters to vmap. Prefer mapping by referenced value-ids that have no + # local definition (common in v0 JSON where params appear as lhs/rhs ids). try: arity = len(func.args) + # Collect defined and used ids + defs = set() + uses = set() + for bb in (blocks or []): + for ins in (bb.get('instructions') or []): + try: + dstv = ins.get('dst') + if isinstance(dstv, int): + defs.add(int(dstv)) + except Exception: + pass + for k in ('lhs','rhs','value','cond','box_val'): + try: + v = ins.get(k) + if isinstance(v, int): + uses.add(int(v)) + except Exception: + pass + cand = [vid for vid in uses if vid not in defs] + cand.sort() + mapped = 0 + for i in range(min(arity, len(cand))): + builder.vmap[int(cand[i])] = func.args[i] + mapped += 1 + # Fallback: also map positional 0..arity-1 to args if not already mapped for i in range(arity): - builder.vmap[i] = func.args[i] + if i not in builder.vmap: + builder.vmap[i] = func.args[i] except Exception: pass diff --git a/src/llvm_py/llvm_builder.py b/src/llvm_py/llvm_builder.py index 8eeea3d8..7e102664 100644 --- a/src/llvm_py/llvm_builder.py +++ b/src/llvm_py/llvm_builder.py @@ -133,6 +133,46 @@ class NyashLLVMBuilder: # Parse MIR reader = MIRReader(mir_json) functions = reader.get_functions() + + # Pre-scan call sites to estimate 
arity for global functions when params are missing + def _scan_call_arities(funcs: List[Dict[str, Any]]): + ar: Dict[str, int] = {} + for f in funcs or []: + # Build map: const dst -> string name (per-function scope) + const_names: Dict[int, str] = {} + for bb in (f.get('blocks') or []): + for ins in (bb.get('instructions') or []): + try: + op = ins.get('op') + if op == 'const': + dst = ins.get('dst') + val = ins.get('value') or {} + name = None + if isinstance(val, dict): + v = val.get('value') + t = val.get('type') + if isinstance(v, str) and ( + t == 'string' or (isinstance(t, dict) and t.get('box_type') == 'StringBox') + ): + name = v + if isinstance(dst, int) and isinstance(name, str): + const_names[int(dst)] = name + elif op == 'call': + func_id = ins.get('func') + if isinstance(func_id, int) and func_id in const_names: + nm = const_names[func_id] + argc = len(ins.get('args') or []) + prev = ar.get(nm, 0) + if argc > prev: + ar[nm] = argc + except Exception: + continue + return ar + + try: + self.call_arities = _scan_call_arities(functions) + except Exception: + self.call_arities = {} if not functions: # No functions - create dummy ny_main @@ -149,6 +189,12 @@ class NyashLLVMBuilder: params_list = func_data.get("params", []) or [] if "." 
in name: arity = len(params_list) + # Dev fallback: when params missing for Box.method, use call-site arity + if arity == 0: + try: + arity = int(self.call_arities.get(name, 0)) + except Exception: + pass else: arity = int(m.group(1)) if m else len(params_list) if name == "ny_main": diff --git a/src/runner/json_v0_bridge/ast.rs b/src/runner/json_v0_bridge/ast.rs index 8a679724..b2bc805b 100644 --- a/src/runner/json_v0_bridge/ast.rs +++ b/src/runner/json_v0_bridge/ast.rs @@ -1,10 +1,21 @@ use serde::{Deserialize, Serialize}; -#[derive(Debug, Deserialize, Serialize)] +#[derive(Debug, Deserialize, Serialize, Clone)] pub(super) struct ProgramV0 { pub(super) version: i32, pub(super) kind: String, pub(super) body: Vec<StmtV0>, + #[serde(default)] + pub(super) defs: Vec<FuncDefV0>, +} + +#[derive(Debug, Deserialize, Serialize, Clone)] +pub(super) struct FuncDefV0 { + pub(super) name: String, + pub(super) params: Vec<String>, + pub(super) body: ProgramV0, + #[serde(rename = "box")] + pub(super) box_name: String, } #[derive(Debug, Deserialize, Serialize, Clone)] diff --git a/src/runner/json_v0_bridge/lexer.rs b/src/runner/json_v0_bridge/lexer.rs index 6d3bf11e..2c546700 100644 --- a/src/runner/json_v0_bridge/lexer.rs +++ b/src/runner/json_v0_bridge/lexer.rs @@ -185,6 +185,7 @@ pub(super) fn parse_source_v0_to_json(input: &str) -> Result<String, String> { version: 0, kind: "Program".into(), body: vec![StmtV0::Return { expr }], + defs: vec![], }; serde_json::to_string(&prog).map_err(|e| e.to_string()) } diff --git a/src/runner/json_v0_bridge/lowering.rs b/src/runner/json_v0_bridge/lowering.rs index b426be9c..827a1677 100644 --- a/src/runner/json_v0_bridge/lowering.rs +++ b/src/runner/json_v0_bridge/lowering.rs @@ -293,6 +293,103 @@ pub(super) fn lower_program(prog: ProgramV0) -> Result { f.signature.return_type = MirType::Unknown; // Phase M.2: PHI post-processing removed; PHI handling is already unified in MirBuilder/LoopBuilder module.add_function(f); + + // Phase 21.6: Process function definitions (defs) + // Toggle: HAKO_STAGEB_FUNC_SCAN=1 +
HAKO_MIR_BUILDER_FUNCS=1 + // Minimal support: Return(Int|Binary(+|-|*|/, Int|Var, Int|Var)) + let mut func_map: HashMap = HashMap::new(); + if !prog.defs.is_empty() { + for func_def in prog.defs { + // Create function signature: Main. + let func_name = format!("{}.{}", func_def.box_name, func_def.name); + + // Register function in map for Call resolution + func_map.insert(func_def.name.clone(), func_name.clone()); + + let param_ids: Vec = (0..func_def.params.len()) + .map(|i| ValueId::new(i as u32 + 1)) + .collect(); + let param_types: Vec = (0..func_def.params.len()) + .map(|_| MirType::Unknown) + .collect(); + let sig = FunctionSignature { + name: func_name, + params: param_types, + return_type: MirType::Integer, + effects: EffectMask::PURE, + }; + let entry = BasicBlockId::new(0); + let mut func = MirFunction::new(sig, entry); + + // Map params to value IDs + let mut func_var_map: HashMap = HashMap::new(); + for (i, param_name) in func_def.params.iter().enumerate() { + func_var_map.insert(param_name.clone(), param_ids[i]); + } + + // Lower function body + let mut loop_stack: Vec = Vec::new(); + let start_bb = func.entry_block; + let _end_bb = lower_stmt_list_with_vars( + &mut func, + start_bb, + &func_def.body.body, + &mut func_var_map, + &mut loop_stack, + &env, + )?; + + func.signature.return_type = MirType::Unknown; + module.add_function(func); + } + } + + // Phase 21.6: Call resolution post-processing + // Toggle: HAKO_MIR_BUILDER_CALL_RESOLVE=1 + // Resolve Call instructions to use qualified function names (e.g., "add" -> "Main.add") + if std::env::var("HAKO_MIR_BUILDER_CALL_RESOLVE").ok().as_deref() == Some("1") { + if !func_map.is_empty() { + for (_func_idx, func) in module.functions.iter_mut() { + for (_block_id, block) in func.blocks.iter_mut() { + let mut const_replacements: Vec<(ValueId, String)> = Vec::new(); + + // Find Call instructions and their associated Const values + for inst in &block.instructions { + if let MirInstruction::Call { func: 
func_reg, .. } = inst { + // Look for the Const instruction that defines func_reg + for const_inst in &block.instructions { + if let MirInstruction::Const { dst, value } = const_inst { + if dst == func_reg { + if let ConstValue::String(name) = value { + // Try to resolve the name + if let Some(resolved) = func_map.get(name) { + const_replacements.push((*dst, resolved.clone())); + if std::env::var("HAKO_MIR_BUILDER_DEBUG").ok().as_deref() == Some("1") { + eprintln!("[mirbuilder/call:resolve] {} => {}", name, resolved); + } + } + } + } + } + } + } + } + + // Apply replacements + for (dst, new_name) in const_replacements { + for inst in &mut block.instructions { + if let MirInstruction::Const { dst: d, value } = inst { + if d == &dst { + *value = ConstValue::String(new_name.clone()); + } + } + } + } + } + } + } + } + Ok(module) } diff --git a/tools/dev/phase216_chain_canary_call.sh b/tools/dev/phase216_chain_canary_call.sh index b2afec14..6558b4e8 100644 --- a/tools/dev/phase216_chain_canary_call.sh +++ b/tools/dev/phase216_chain_canary_call.sh @@ -15,7 +15,13 @@ HAKO TMP_JSON=$(mktemp --suffix .json) OUT_EXE=$(mktemp --suffix .exe) +# Bundle FuncScannerBox and FuncLoweringBox modules via compiler_stageb direct inclusion +# Skip func_scan for now - use simpler non-modular approach in Phase 21.6 + HAKO_SELFHOST_BUILDER_FIRST=1 \ +HAKO_STAGEB_FUNC_SCAN=1 \ +HAKO_MIR_BUILDER_FUNCS=1 \ +HAKO_MIR_BUILDER_CALL_RESOLVE=1 \ NYASH_USE_NY_COMPILER=0 HAKO_DISABLE_NY_COMPILER=1 \ NYASH_PARSER_STAGE3=1 HAKO_PARSER_STAGE3=1 NYASH_PARSER_ALLOW_SEMICOLON=1 \ NYASH_ENABLE_USING=1 HAKO_ENABLE_USING=1 \ @@ -24,14 +30,7 @@ NYASH_ENABLE_USING=1 HAKO_ENABLE_USING=1 \ NYASH_LLVM_BACKEND=crate NYASH_LLVM_SKIP_BUILD=1 \ NYASH_NY_LLVM_COMPILER="${NYASH_NY_LLVM_COMPILER:-$ROOT/target/release/ny-llvmc}" \ NYASH_EMIT_EXE_NYRT="${NYASH_EMIT_EXE_NYRT:-$ROOT/target/release}" \ - bash "$ROOT/tools/ny_mir_builder.sh" --in "$TMP_JSON" --emit exe -o "$OUT_EXE" --quiet >/dev/null || true - -if [[ ! 
-x "$OUT_EXE" ]]; then - # Likely unresolved local function symbol (e.g., `add`) in Stage‑B minimal chain - echo "[SKIP] phase216_call — local function linking not yet supported in Stage‑B minimal chain" >&2 - rm -f "$TMP_SRC" "$TMP_JSON" "$OUT_EXE" 2>/dev/null || true - exit 0 -fi + bash "$ROOT/tools/ny_mir_builder.sh" --in "$TMP_JSON" --emit exe -o "$OUT_EXE" --quiet >/dev/null set +e "$OUT_EXE"; rc=$?