builder+vm: unify method calls via emit_unified_call; add RouterPolicy trace; finalize LocalSSA/BlockSchedule guards; docs + selfhost quickstart

- Unify standard method calls to emit_unified_call; route via RouterPolicy and apply rewrite::{special,known} at a single entry.
- Stabilize emit-time invariants: LocalSSA finalize + BlockSchedule PHI→Copy→Call ordering; metadata propagation on copies.
- Known rewrite default ON (userbox only, strict guards) with opt-out flag NYASH_REWRITE_KNOWN_DEFAULT=0.
- Expand TypeAnnotation whitelist (is_digit_char/is_hex_digit_char/is_alpha_char/Map.has).
- Docs: unified-method-resolution design note; Quick Reference normalization note; selfhosting/quickstart.
- Tools: add tools/selfhost_smoke.sh (dev-only).
- Keep behavior unchanged for Unknown/core/user-instance via BoxCall fallback; all tests green (quick/integration).
nyash-codex
2025-09-28 20:38:09 +09:00
parent e442e5f612
commit dd65cf7e4c
60 changed files with 2523 additions and 471 deletions

@@ -0,0 +1,64 @@
# MIR Builder — Boxes Catalog (Phase 15.7)
Purpose
- Consolidate scattered responsibilities into small, focused “boxes” (modules) with clear APIs.
- Reduce regression surface by centralizing invariants and repeated patterns.
- Keep behavior unchanged (default-off for any new diagnostics). Adopt gradually.
Status (2025-09-28)
- S-tier (landed skeletons):
  - MetadataPropagationBox: type/origin propagation.
  - ConstantEmissionBox: Const emission helpers.
  - TypeAnnotationBox: minimal return-type annotation for known calls.
- S-tier (new in this pass):
  - RouterPolicyBox: route decision (Unified vs BoxCall).
  - EmitGuardBox: emit-time invariants (LocalSSA finalize + schedule verify).
  - NameConstBox: string Const for function names.
- A/B-tier: planned; do not implement by default.
Call Routing: Unification (2025-09-28)
- Standard method calls now delegate to `emit_unified_call` (single entry).
- The receiver class hint (origin/type) is resolved inside the unified path; handlers no longer duplicate it.
- RouterPolicy decides Unified vs BoxCall. Unknown/core/user-instance → BoxCall (behavior-preserving).
- Rewrites apply centrally: `rewrite::special` (toString/stringify→str, equals/1) and `rewrite::known` (Known→function).
- LocalSSA + BlockSchedule + EmitGuard enforce PHI→Copy→Call ordering and in-block materialization.
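The routing rule above can be sketched as a small pure function. This is only an illustration under stated assumptions: `BoxKind`, the reason strings, and the two-argument signature are hypothetical simplifications, and the real `router::policy::choose_route` takes `(box_name, method, certainty, arity)` and may classify receivers differently.

```rust
/// Route chosen for a method call (names follow the catalog above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    Unified,
    BoxCall,
}

/// How sure the builder is about the receiver's class.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Certainty {
    Known,
    Unknown,
}

/// Hypothetical receiver classification; the real policy derives this
/// from the box name rather than taking it as a parameter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BoxKind {
    Core,
    UserInstance,
    Other,
}

/// Returns the route plus a short reason tag (the tags are made up here).
/// Unknown, core, and user-instance receivers keep the BoxCall path;
/// everything else goes through the unified call.
pub fn choose_route(kind: BoxKind, certainty: Certainty) -> (Route, &'static str) {
    match (certainty, kind) {
        (Certainty::Unknown, _) => (Route::BoxCall, "unknown_recv"),
        (_, BoxKind::Core) => (Route::BoxCall, "core_box"),
        (_, BoxKind::UserInstance) => (Route::BoxCall, "user_instance"),
        (Certainty::Known, BoxKind::Other) => (Route::Unified, "unified"),
    }
}
```

Keeping the decision in one total `match` means every new receiver category forces an explicit routing choice at compile time.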
Structure
```
src/mir/builder/
├── metadata/propagate.rs # MetadataPropagationBox
├── emission/constant.rs # ConstantEmissionBox
├── emission/compare.rs # CompareEmissionBox (new)
├── emission/branch.rs # BranchEmissionBox (new)
├── types/annotation.rs # TypeAnnotationBox
├── router/policy.rs # RouterPolicyBox
├── emit_guard/mod.rs # EmitGuardBox
└── name_const.rs # NameConstBox
```
APIs (concise)
- metadata::propagate(builder, src, dst)
- metadata::propagate_with_override(builder, dst, MirType)
- emission::constant::{emit_integer, emit_string, emit_bool, emit_float, emit_null, emit_void}
- emission::compare::{emit_to, emit_eq_to, emit_ne_to}
- emission::branch::{emit_conditional, emit_jump}
- types::annotation::{set_type, annotate_from_function}
- router::policy::{Route, choose_route(box_name, method, certainty, arity)}
- emit_guard::{finalize_call_operands(builder, &mut Callee, &mut Vec<ValueId>), verify_after_call(builder)}
- name_const::{make_name_const_result(builder, &str) -> Result<ValueId, String>}
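As a rough illustration of what the two `metadata` entries do, the sketch below copies type and origin facts from one value to another. The `Builder` fields, the `MirType` variants, and the `u32` `ValueId` are stand-ins invented for this example and do not reflect the real builder's layout.

```rust
use std::collections::HashMap;

/// Stand-in for the builder's value handle (assumption).
pub type ValueId = u32;

/// Simplified stand-in for the MIR type enum (assumption).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum MirType {
    Integer,
    Str,
    Bool,
    BoxOf(String),
}

/// Hypothetical builder holding only the two side tables this sketch needs.
#[derive(Debug, Default)]
pub struct Builder {
    pub value_types: HashMap<ValueId, MirType>,
    pub value_origins: HashMap<ValueId, String>,
}

/// Copy whatever is known about `src` (type and origin class) onto `dst`.
/// Missing facts are simply not propagated.
pub fn propagate(builder: &mut Builder, src: ValueId, dst: ValueId) {
    if let Some(ty) = builder.value_types.get(&src).cloned() {
        builder.value_types.insert(dst, ty);
    }
    if let Some(origin) = builder.value_origins.get(&src).cloned() {
        builder.value_origins.insert(dst, origin);
    }
}

/// Set an explicit type on `dst`, replacing any propagated one.
pub fn propagate_with_override(builder: &mut Builder, dst: ValueId, ty: MirType) {
    builder.value_types.insert(dst, ty);
}
```

Routing every copy site through one helper like this is what lets a Copy emitted by LocalSSA keep the receiver's origin, so later routing decisions see the same facts as the original value.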
Adoption Plan (behavior-preserving)
1) Replace representative Const sites with `emission::constant`.
2) Replace ad-hoc type/origin copy with `metadata::propagate`.
3) Call `types::annotation` where return type is clearly known (string length/size/str etc.).
4) Use `router::policy::choose_route` in unified call path; later migrate utils prefer_legacy to it.
5) Use `emit_guard` to centralize LocalSSA finalize + schedule verify around calls; later extend to branch/compare.
6) Use `name_const` in rewrite paths to reduce duplication.
Diagnostics
- All new logs remain dev-only behind env toggles already present (e.g., NYASH_LOCAL_SSA_TRACE, NYASH_BLOCK_SCHEDULE_VERIFY).
- Router trace: `NYASH_ROUTER_TRACE=1` prints route decisions (stderr, short, default OFF).
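A dev-only trace gate of this kind might look like the following sketch. Only the variable name `NYASH_ROUTER_TRACE` comes from this note; the helper names, the rule that exactly `"1"` enables tracing, and the line layout are assumptions made for illustration.

```rust
use std::env;

/// Pure helper so the on/off decision can be tested without touching the
/// process environment. Assumption: only the literal value "1" enables it.
pub fn trace_enabled_from(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Reads NYASH_ROUTER_TRACE; unset or any other value means OFF.
pub fn router_trace_enabled() -> bool {
    trace_enabled_from(env::var("NYASH_ROUTER_TRACE").ok().as_deref())
}

/// Builds one short trace line. The exact format is hypothetical.
pub fn format_route_trace(
    route: &str,
    reason: &str,
    class: &str,
    method: &str,
    arity: usize,
    certainty: &str,
) -> String {
    format!(
        "[router] route={route} reason={reason} class={class} method={method} arity={arity} certainty={certainty}"
    )
}

/// Prints the line to stderr only when tracing is enabled, so default
/// runs stay quiet.
pub fn trace_route(route: &str, reason: &str, class: &str, method: &str, arity: usize, certainty: &str) {
    if router_trace_enabled() {
        eprintln!("{}", format_route_trace(route, reason, class, method, arity, certainty));
    }
}
```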
Guardrails
- Behavior must remain unchanged; only refactors/centralizations allowed.
- Keep diffs small; validate `make smoke-quick` and `make smoke-integration` stay green at each step.

@@ -0,0 +1,59 @@
# Unified Method Resolution — Design Note (Phase P4)
Purpose
- Document the unified pipeline for method resolution and how we will roll it out safely.
- Make behavior observable (dev-only) and gate any future default changes behind clear criteria.
Goals
- Single entry for all method calls via `emit_unified_call`.
- Behavior-preserving by default: Unknown/core/user-instance receivers route to BoxCall.
- Known receivers may be rewritten to function calls (obj.m → Class.m(me,…)) under strict conditions.
- Keep invariants around SSA and instruction order to prevent sporadic undefined uses.
Pipeline (concept)
1) Entry: `emit_unified_call(dst, CallTarget::Method { box_type, method, receiver }, args)`
2) Special rewrites (early): toString/stringify → str, equals/1 consolidation.
3) Known/unique rewrite (user boxes only): if class is Known and a unique function exists, rewrite to `Call(Class.m/arity)`.
4) Routing: `RouterPolicy.choose_route` decides Unified vs BoxCall (Unknown/core/user-instance → BoxCall; else Unified).
5) Emit guard: LocalSSA finalize (recv/args in current block) + BlockSchedule order contract (PHI → Copy → Call).
6) MIR emit: `Call { callee=Method/Extern/Global }` or `BoxCall` as routed.
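Step 2 of the pipeline can be pictured as a name-normalization table applied before any routing. The sketch below covers only the toString/stringify → str case; the function name `rewrite_special` and the arity-0 restriction are assumptions, and the equals/1 consolidation is left out because this note does not spell out its rule.

```rust
/// Returns the canonical method name when a special rewrite applies,
/// or None when the call should continue through the pipeline unchanged.
/// Assumption: the stringification aliases are only rewritten when called
/// with no arguments.
pub fn rewrite_special(method: &str, arity: usize) -> Option<&'static str> {
    match (method, arity) {
        ("toString", 0) | ("stringify", 0) => Some("str"),
        _ => None,
    }
}
```

Running this step first means the later Known rewrite and the router only ever see the canonical `str` spelling.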
Invariants (dev-verified)
- SSA locality: All operands are materialized within the current basic block before use.
- Order: PHI group at block head, then materialize Copies, then body (Calls). Verified with `NYASH_BLOCK_SCHEDULE_VERIFY=1`.
- Rewrites do not change semantics: Known rewrite only when a concrete target exists and is unique for the arity.
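The ordering contract can be checked with a single pass over a block. The following is a simplified sketch rather than the actual `NYASH_BLOCK_SCHEDULE_VERIFY` logic: `InstKind` and `verify_order` are hypothetical, and it treats every Copy after the first body instruction as a violation, which may be stricter than the real verifier.

```rust
/// Coarse instruction categories, enough to express the contract.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InstKind {
    Phi,
    Copy,
    Call,
    Other,
}

/// Maps each kind to its position in the PHI -> Copy -> body sequence.
fn stage(kind: InstKind) -> u8 {
    match kind {
        InstKind::Phi => 0,
        InstKind::Copy => 1,
        InstKind::Call | InstKind::Other => 2,
    }
}

/// Returns Ok(()) when stages never go backwards, otherwise the index of
/// the first instruction that appears too late in the block.
pub fn verify_order(block: &[InstKind]) -> Result<(), usize> {
    let mut current = 0u8;
    for (idx, &kind) in block.iter().enumerate() {
        let s = stage(kind);
        if s < current {
            return Err(idx);
        }
        current = s;
    }
    Ok(())
}
```

For example, `[Phi, Phi, Copy, Call]` passes, while `[Phi, Call, Copy]` is reported at index 2 because the Copy was materialized after the call that might use it.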
Behavior flags (existing)
- `NYASH_ROUTER_TRACE=1`: short route decisions to stderr (reason, class, method, arity, certainty).
- `NYASH_LOCAL_SSA_TRACE=1`: LocalSSA ensure/finalize traces (recv/arg/cond/cmp).
- `NYASH_BLOCK_SCHEDULE_VERIFY=1`: warn when Copy/Call ordering does not follow the contract.
- KPI (dev-only):
  - `NYASH_DEBUG_KPI_KNOWN=1` → aggregate Known rate for `resolve.choose`.
  - `NYASH_DEBUG_SAMPLE_EVERY=N` → sample output every N events.
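To make the KPI flags concrete, here is a minimal sketch of a Known-rate counter with every-N sampling. The `KnownKpi` type, its report format, and the choice that `0` disables sampling are all assumptions; the real aggregation lives in the observability module and may work differently.

```rust
/// Hypothetical aggregator for the Known rate of resolve decisions.
#[derive(Debug, Default)]
pub struct KnownKpi {
    total: u64,
    known: u64,
    sample_every: u64,
}

impl KnownKpi {
    /// `sample_every` plays the role of NYASH_DEBUG_SAMPLE_EVERY=N.
    /// Assumption: 0 means "never emit a sample".
    pub fn new(sample_every: u64) -> Self {
        Self { total: 0, known: 0, sample_every }
    }

    /// Fraction of recorded events whose receiver was Known.
    pub fn known_rate(&self) -> f64 {
        if self.total == 0 {
            0.0
        } else {
            self.known as f64 / self.total as f64
        }
    }

    /// Records one event and returns a report line on every N-th event.
    pub fn record(&mut self, is_known: bool) -> Option<String> {
        self.total += 1;
        if is_known {
            self.known += 1;
        }
        if self.sample_every != 0 && self.total % self.sample_every == 0 {
            Some(format!(
                "known={}/{} ({:.1}%)",
                self.known,
                self.total,
                self.known_rate() * 100.0
            ))
        } else {
            None
        }
    }
}
```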
Flag (P4)
- `NYASH_REWRITE_KNOWN_DEFAULT` (default ON; set to 0/false/off to disable):
  - Enables Known→function rewrite by default for user boxes if and only if:
    - receiver is Known (origin), and
    - function exists, and
    - candidate is unique for the arity.
  - When disabled, behavior remains conservative; routing still handles BoxCall fallback.
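A default-ON flag with an explicit opt-out could be parsed as in the sketch below. The helper names are hypothetical, and trimming plus case-insensitive matching are assumptions; the note only states that `0`, `false`, and `off` disable the rewrite.

```rust
use std::env;

/// Pure parser: unset means ON; "0", "false", or "off" mean OFF.
/// Assumption: surrounding whitespace and letter case are ignored, and any
/// other value leaves the default (ON) in place.
pub fn rewrite_known_default_from(value: Option<&str>) -> bool {
    match value {
        None => true,
        Some(v) => !matches!(
            v.trim().to_ascii_lowercase().as_str(),
            "0" | "false" | "off"
        ),
    }
}

/// Reads NYASH_REWRITE_KNOWN_DEFAULT from the process environment.
pub fn rewrite_known_default() -> bool {
    rewrite_known_default_from(env::var("NYASH_REWRITE_KNOWN_DEFAULT").ok().as_deref())
}
```

Separating the parser from the environment read keeps the opt-out rule testable without mutating process-wide state.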
Rollout note
- Default is ON with strict guards; set `NYASH_REWRITE_KNOWN_DEFAULT=0` to revert to conservative behavior.
- Continue to use `NYASH_ROUTER_TRACE=1` and KPI sampling to validate stability during development.
Key files
- Entry & routing: `src/mir/builder/builder_calls.rs`, `src/mir/builder/router/policy.rs`
- Rewrites: `src/mir/builder/rewrite/{special.rs, known.rs}`
- SSA & order: `src/mir/builder/ssa/local.rs`, `src/mir/builder/schedule/block.rs`, `src/mir/builder/emit_guard/`
- Observability: `src/mir/builder/observe/resolve.rs`
Acceptance for P4
- quick/integration stay green with flags OFF.
- With flags ON (dev), tests remain green; KPI reports sensible Known rates without mismatches.
- No noisy logs in default runs; all diagnostics behind flags.
Notes
- This design keeps Unknown/core/user-instance on BoxCall for stability and parity with legacy behavior.
- Known rewrite is structurally safe because user box methods are lowered to standalone MIR functions during build.
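Because user box methods exist as standalone functions, the Known rewrite reduces to a guarded table lookup. The sketch below is illustrative only: `try_known_rewrite` and the flat list of lowered names are hypothetical, the `Class.m/arity` spelling is taken from the pipeline description, and whether the arity counts the receiver is not specified here.

```rust
/// Attempts the Known -> function rewrite.
///
/// `lowered` stands in for the table of lowered function names
/// ("Class.method/arity"). Returns the target name only when the rewrite
/// is enabled, the receiver class is Known, and exactly one lowered
/// function matches for that arity.
pub fn try_known_rewrite(
    lowered: &[&str],
    known_class: Option<&str>,
    method: &str,
    arity: usize,
    enabled: bool,
) -> Option<String> {
    if !enabled {
        return None;
    }
    // Unknown receivers never rewrite; they fall through to routing.
    let class = known_class?;
    let target = format!("{class}.{method}/{arity}");
    let matches = lowered
        .iter()
        .filter(|name| **name == target.as_str())
        .count();
    (matches == 1).then_some(target)
}
```

When any guard fails the function returns `None`, and the call continues to the router, which keeps the BoxCall fallback intact.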