refactor: split optimizer/verifier/parser modules (mainline); add runner trace/directives; add LLVM terminator/select scaffolds; extract AST Span; update CURRENT_TASK with remaining plan

Selfhosting Dev
2025-09-17 05:56:33 +09:00
parent 154778fc57
commit 9dc5c9afb9
39 changed files with 1327 additions and 739 deletions

View File

@ -24,6 +24,40 @@ What Changed (today)
- Added the dev profile `tools/dev_env.sh phi_off` and the root cleanup utility `tools/clean_root_artifacts.sh`.
- Reworked CI (GH Actions) to run the curated LLVM set (PHI on/off). The old JIT job has been disabled.
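Refactor Progress (2025-09-16, end of day)
- Runner: split header-directive scanning and trace output into separate modules (`runner/cli_directives.rs`, `runner/trace.rs`); consolidated the `using` resolution logs.
- LLVM: added scaffolding for terminators/select and switched call sites to go through aliases (no behavior change).
- Optimizer: split per pass, with the orchestrator delegating to each one (scaffolding, no behavior change).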
- `optimizer_passes/{normalize,diagnostics,reorder,boxfield,intrinsics}.rs`
- Moved the statistics into `optimizer_stats.rs`.
- Verifier: modularized the main checks; `verification.rs` is now a thin orchestrator.
- `verification/{ssa,dom,cfg,barrier,legacy,awaits,utils}.rs`
- AST: moved `Span` to `ast/span.rs`; `ast.rs` re-exports it.
- Parser: started splitting expressions in stages (ternary/coalesce/logic moved to `parser/expr/*`).
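Remaining Refactors (Phase 15 mainline)
- Verifier (finishing)
- Remove the `compute_*` wrappers in `verification.rs` entirely and route every call through `verification::utils`.
- Add tests: simple cases for the reachability/phi/await checks (optional).
- Parser (continue the staged split)
- Move `bit_or/bit_xor/bit_and` and `comparison/range/term/shift/factor` to `parser/expr/`, keeping the `parse_expression` chain intact.
- Move `call/primary` last (they have many dependencies).
- AST (separate the structure)
- Split the node definitions into `ast/nodes/{structure,expression,statement}.rs` and reduce `ast.rs` to `pub use` aggregation only.
- Optimizer (scaffolding → implementation)
- Introduce the `reorder/boxfield/intrinsics` implementations in stages (small wins first: better key filtering for CSE, load-after-store for boxfield).
- Complete the terminator side of `normalize` (move over anything not yet migrated).
- Runner/env consolidation
- Replace environment lookups on hot paths with `config::env` getters (remaining: parts of VM trace/diagnostics).
- LLVM select/terminators (real implementation)
- Add a light normalization of the truthy convention to `select` (equivalence-preserving transforms only).
- Move the actual implementations into `terminators` (staged replacement of `flow`).
- VM dispatch (staged rollout)
- Introduce the `NYASH_VM_USE_DISPATCH=1` flag and switch to `backend/dispatch.rs`, starting with side-effect-free instructions.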
Notes
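- Proceed in stages and keep behavior equivalent throughout. Modules that are scaffolding today will receive their implementations gradually.
Self-Hosting plumbing (2025-09-16, later in day)
- Runner: the self-host path now runs the child program (`apps/selfhost-compiler/compiler.nyash`) first and always passes `--read-tmp` for stable operation.
- The PyVM-first policy (`NYASH_VM_USE_PY=1`) is respected on every branch: EXE/inline/child.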
@ -198,6 +232,33 @@ Recommended Next (short list)
- `function.rs` shrinks to a skeleton: BB iteration plus calls into each lowering.
- MIR Builder (C, continued)
- Add `builder/loops.rs` and extract small utilities for loop headers/exits (a helper layer for `LoopBuilder`).
Refactor Plan (MIR Core / Parser / Runner)
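- MIR Core (medium to high)
- Problems
- Processing is concentrated in `optimizer.rs` and `verification.rs`, a structure that copes poorly with new passes and additional checks (both have grown large: roughly 994/980 lines).
- Proposal
- Split into pass-driven modules: `mir/passes/{dce.rs, ssa.rs, const_fold.rs, simplify.rs}` plus a `MirPass` trait (`run(&mut MirModule)`).
- Progress: dce/cse are extracted and the `MirPass` skeleton is in place. Next: extract the normalize family in stages.
- Shrink `optimizer.rs` to pipeline assembly (ordering, gates, stats aggregation).
- Split `verification.rs` by category as well (block integrity, SSA/PHI, type consistency). Make failure messages consistent (unified hints).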
- Parser/Tokenizer
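- Problems
- `parser/expressions.rs` (~986 lines), `parser/statements.rs` (~562 lines), and `tokenizer.rs` (~863 lines) have grown too large.
- Proposal
- Move to a Pratt/precedence table to cut duplicated per-operator branching.
- Unify the `expected(...)` messages through a shared error-construction utility.
- Split `tokenizer.rs` into `tokens.rs` (definitions) and `lexer.rs` (implementation). Move the tests to `tests/lexer_*.rs`.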
- Runner
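- Problems
- `runner/modes/common.rs` (~734 lines) and `runner/mod.rs` (~597 lines) mix CLI handling, environment flags, and execution branching.
- Proposal
- Turn `runner/modes/common/` into a directory and separate CLI argument handling, environment-flag resolution, and mode dispatch.
- Consolidate duplicated logging/validation into shared helpers.
- Add small SSA variable-normalization helpers to `builder/vars.rs` in stages (variable rebinding, type-hint propagation at scope end, and so on).
- Runner (finishing)
- Move the helpers still left in `mod.rs` (`using` candidate suggestions, environment-injection logs) into `pipeline/dispatch`, leaving `mod.rs` as minimal orchestration.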
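
The pass-driven split proposed for MIR Core above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `MirModule` here is a stand-in with a flat instruction list, and `PassStats`, `DropNops`, `name()`, and `run_pipeline` are hypothetical names. The plan only states the `run(&mut MirModule)` shape; returning stats from `run` is an assumption of this sketch.

```rust
/// Stand-in for the real `MirModule` (assumption: a flat instruction list).
#[derive(Default)]
pub struct MirModule {
    pub instructions: Vec<String>,
}

/// Hypothetical per-pass statistics.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct PassStats {
    pub changed: usize,
}

/// Trait shape named in the plan: every pass exposes `run(&mut MirModule)`.
pub trait MirPass {
    fn name(&self) -> &'static str;
    fn run(&mut self, module: &mut MirModule) -> PassStats;
}

/// Hypothetical pass that removes `nop` instructions.
pub struct DropNops;

impl MirPass for DropNops {
    fn name(&self) -> &'static str {
        "drop_nops"
    }

    fn run(&mut self, module: &mut MirModule) -> PassStats {
        let before = module.instructions.len();
        module.instructions.retain(|inst| inst != "nop");
        PassStats { changed: before - module.instructions.len() }
    }
}

/// The orchestrator only assembles the pipeline: ordering, gates, stats.
/// Each entry pairs an "enabled" gate with a boxed pass.
pub fn run_pipeline(passes: &mut [(bool, Box<dyn MirPass>)], module: &mut MirModule) -> usize {
    let mut total = 0;
    for (enabled, pass) in passes.iter_mut() {
        if !*enabled {
            continue; // gated off, e.g. by an env toggle
        }
        total += pass.run(module).changed;
    }
    total
}
```

With this shape, adding a pass means adding one file and one pipeline entry, and the orchestrator itself stays small.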

View File

@ -173,3 +173,9 @@ src/mir/builder/
The Nyash codebase shows signs of rapid development with opportunities for significant refactoring. The plugin loader consolidation offers the highest impact for maintenance improvement, while the MIR builder modularization will improve long-term extensibility. A phased approach is recommended to minimize disruption while delivering incremental benefits.
The analysis reveals a well-architected system that would benefit from tactical refactoring to improve maintainability without compromising the innovative "Everything is Box" design philosophy.
---
Phase 15 Addendum (Mainline only)
- Implemented: extracted CLI directives scanning and fields-top lint into `src/runner/cli_directives.rs` to slim `src/runner/mod.rs` without behavior changes.
- Proposed next steps (non-JIT): see `docs/refactoring/candidates_phase15.md` for focused items on Runner/LLVM/VM.

View File

@ -0,0 +1,45 @@
# Refactoring Candidates — Phase 15 (Mainline only)
Scope: PyVM/LLVM/Runner mainline. JIT/Cranelift is explicitly out of scope for this pass.
Goals
- Improve maintainability and locality of logic in the primary execution paths.
- Reduce ad-hoc env access; prefer `src/config/env.rs` helpers.
- Clarify control-flow lowering responsibilities in LLVM codegen.
High-Value Candidates
- Runner CLI directives
- Extract header-comment directive scanning from `runner/mod.rs` into `runner/cli_directives.rs` (done).
- Next: Move using/alias/env merge traces behind a `runner::trace` helper to keep `mod.rs` slim.
- LLVM codegen structure
- Introduce `instructions/terminators.rs` (return/jump/branch emit glue) and `instructions/select.rs` (cond/short-circuit pre-normalize). Initially re-export from `flow.rs`; later migrate implementations.
- Keep `function.rs` focused on BB visitation + delegations (no heavy logic inline).
- VM dispatch integration
- Gradually route the big `match` in VM through `backend/dispatch.rs` (already scaffolded). Start with side-effect-free ops (Const/Compare/TypeOp) to de-risk.
- Add opt-in flag `NYASH_VM_USE_DISPATCH=1` to gate behavior during the transition.
- Env access centralization
- Replace scattered `std::env::var("NYASH_*")` with `config::env` getters in hot paths (VM tracing, GC barriers, resolver toggles).
- Add tiny wrappers where a tri-state is needed (off/soft/on).
- MIR builder granularity (non-breaking)
- Extract loop helpers: `mir/builder/loops.rs` (headers/exits/latch snapshot utilities).
- Extract phi helpers: `mir/builder/phi.rs` (if-else merge, header normalize for latch).
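
The env-centralization item above could look something like the following sketch. It is hedged throughout: `TriState`, `parse_tristate`, `env_flag`, `env_tristate`, and `vm_use_dispatch` are hypothetical names, not the actual `config::env` API, and the accepted spellings (`1`/`on`/`true`/`soft`) are assumptions. Only the `== Some("1")` check mirrors a pattern already used at the call sites this commit replaces.

```rust
use std::env;

/// Tri-state toggle (off/soft/on), as suggested for the env wrappers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TriState {
    Off,
    Soft,
    On,
}

/// Parse a raw value into a tri-state. Unknown or missing values mean `Off`.
/// The accepted spellings are an assumption of this sketch.
pub fn parse_tristate(raw: Option<&str>) -> TriState {
    match raw.map(|s| s.trim().to_ascii_lowercase()).as_deref() {
        Some("1") | Some("on") | Some("true") => TriState::On,
        Some("soft") => TriState::Soft,
        _ => TriState::Off,
    }
}

/// Boolean flag: true only when the variable is exactly "1".
pub fn env_flag(name: &str) -> bool {
    env::var(name).ok().as_deref() == Some("1")
}

/// Tri-state lookup for a named variable.
pub fn env_tristate(name: &str) -> TriState {
    parse_tristate(env::var(name).ok().as_deref())
}

/// Hypothetical getter for the opt-in dispatch flag named in this document.
pub fn vm_use_dispatch() -> bool {
    env_flag("NYASH_VM_USE_DISPATCH")
}
```

Keeping the parsing in a pure function (`parse_tristate`) makes the rules testable without touching the process environment.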
Low-Risk Cleanups
- Tools scripts: dedupe `tools/*smoke*.sh` into `tools/smokes/` with common helpers (env, timeout, exit filtering).
- Tests naming: prefer `*_test.rs` and `apps/tests/*.nyash` consistency for smokes.
- Logging: add `NYASH_CLI_VERBOSE` guards consistently; provide `runner::trace!(...)` macro for concise on/off.
Suggested Sequencing
1) Runner small extractions (directives/trace). Validate with existing smokes.
2) LLVM `terminators.rs`/`select.rs` scaffolding + staged migration; zero behavior change initially.
3) VM dispatch gating: land flag and migrate 3-5 simple opcodes; parity by unit tests.
4) MIR builder helpers: extract without API changes; run PyVM/LLVM curated smokes.
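
Step 3 (gated dispatch with parity tests) can be illustrated with a toy stack machine. This is only a sketch under stated assumptions: `Op`, `exec_legacy`, `dispatch`, and `run` are hypothetical and far smaller than the real VM, and `use_dispatch` stands in for whatever reads `NYASH_VM_USE_DISPATCH`.

```rust
/// Hypothetical miniature opcode set (the real VM instruction enum is far larger).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Const(i64),
    Add,
    Lt,
}

/// Legacy path: the single big `match` that handles every opcode.
pub fn exec_legacy(stack: &mut Vec<i64>, op: Op) {
    match op {
        Op::Const(v) => stack.push(v),
        Op::Add => {
            // Tuple fields evaluate left to right: `b` is popped first.
            let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
            stack.push(a + b);
        }
        Op::Lt => {
            let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
            stack.push((a < b) as i64);
        }
    }
}

/// New path: handles only the opcodes migrated so far (Const and Lt here)
/// and returns `false` for everything else so the caller can fall back.
pub fn dispatch(stack: &mut Vec<i64>, op: Op) -> bool {
    match op {
        Op::Const(v) => {
            stack.push(v);
            true
        }
        Op::Lt => {
            let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
            stack.push((a < b) as i64);
            true
        }
        _ => false,
    }
}

/// Run a program; `use_dispatch` plays the role of the opt-in flag.
pub fn run(program: &[Op], use_dispatch: bool) -> Vec<i64> {
    let mut stack = Vec::new();
    for &op in program {
        if use_dispatch && dispatch(&mut stack, op) {
            continue;
        }
        exec_legacy(&mut stack, op);
    }
    stack
}
```

A parity test then runs the same program with the flag on and off and compares the results, which is what lets opcodes move across one at a time.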
Notes
- Keep JIT/Cranelift untouched in this phase to avoid drift from the mainline policy.
- Prefer file-level docs on new modules to guide incremental migration.

View File

@ -8,81 +8,10 @@
use crate::box_trait::NyashBox;
use std::collections::HashMap;
use std::fmt;
mod span;
pub use span::Span;
/// Source location information for error reporting and debugging
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
pub start: usize, // start position (byte offset)
pub end: usize, // end position (byte offset)
pub line: usize, // line number (1-based)
pub column: usize, // column number (1-based)
}
impl Span {
/// Create a new Span
pub fn new(start: usize, end: usize, line: usize, column: usize) -> Self {
Self { start, end, line, column }
}
/// Default Span (unknown position)
pub fn unknown() -> Self {
Self { start: 0, end: 0, line: 1, column: 1 }
}
/// Merge two Spans (from the earliest start to the latest end)
pub fn merge(&self, other: Span) -> Span {
Span {
start: self.start.min(other.start),
end: self.end.max(other.end),
line: self.line,
column: self.column,
}
}
/// Extract the relevant source line and build a string for error display
pub fn error_context(&self, source: &str) -> String {
let lines: Vec<&str> = source.lines().collect();
if self.line == 0 || self.line > lines.len() {
return format!("line {}, column {}", self.line, self.column);
}
let line_content = lines[self.line - 1];
let mut context = String::new();
// Show the line number and the source line
context.push_str(&format!(" |\n{:3} | {}\n", self.line, line_content));
// Show the cursor position (simplified)
if self.column > 0 && self.column <= line_content.len() + 1 {
context.push_str(" | ");
for _ in 1..self.column {
context.push(' ');
}
let span_length = if self.end > self.start {
(self.end - self.start).min(line_content.len() - self.column + 1)
} else {
1
};
for _ in 0..span_length.max(1) {
context.push('^');
}
context.push('\n');
}
context
}
/// String representation of the location
pub fn location_string(&self) -> String {
format!("line {}, column {}", self.line, self.column)
}
}
impl fmt::Display for Span {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "line {}, column {}", self.line, self.column)
}
}
// Span moved to src/ast/span.rs (re-exported for backward compatibility)
/// 🌟 AST classification system: a three-layer architecture incorporating ChatGPT's advice
/// Clear separation of Structure/Expression/Statement improves type safety

71
src/ast/span.rs Normal file
View File

@ -0,0 +1,71 @@
use std::fmt;
/// Source location information for error reporting and debugging
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
pub start: usize, // start position (byte offset)
pub end: usize, // end position (byte offset)
pub line: usize, // line number (1-based)
pub column: usize, // column number (1-based)
}
impl Span {
/// Create a new Span
pub fn new(start: usize, end: usize, line: usize, column: usize) -> Self {
Self { start, end, line, column }
}
/// Default Span (unknown position)
pub fn unknown() -> Self {
Self { start: 0, end: 0, line: 1, column: 1 }
}
/// Merge two Spans (from the earliest start to the latest end)
pub fn merge(&self, other: Span) -> Span {
Span {
start: self.start.min(other.start),
end: self.end.max(other.end),
line: self.line,
column: self.column,
}
}
/// Extract the relevant source line and build a string for error display
pub fn error_context(&self, source: &str) -> String {
let lines: Vec<&str> = source.lines().collect();
if self.line == 0 || self.line > lines.len() {
return format!("line {}, column {}", self.line, self.column);
}
let line_content = lines[self.line - 1];
let mut context = String::new();
// Show the line number and the source line
context.push_str(&format!(" |\n{:3} | {}\n", self.line, line_content));
// Show the cursor position (simplified)
if self.column > 0 && self.column <= line_content.len() + 1 {
context.push_str(" | ");
for _ in 1..self.column { context.push(' '); }
let span_length = if self.end > self.start {
// Add before subtracting: `len - column` would underflow when column == len + 1.
(self.end - self.start).min(line_content.len() + 1 - self.column)
} else { 1 };
for _ in 0..span_length.max(1) { context.push('^'); }
context.push('\n');
}
context
}
/// String representation of the location
pub fn location_string(&self) -> String {
format!("line {}, column {}", self.line, self.column)
}
}
impl fmt::Display for Span {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
write!(f, "line {}, column {}", self.line, self.column)
}
}

View File

@ -400,7 +400,7 @@ pub(super) fn lower_one_function<'ctx>(
cursor.at_end(*bid, bb);
match term {
MirInstruction::Return { value } => {
instructions::emit_return(
instructions::term_emit_return(
codegen,
&mut cursor,
&mut resolver,
@ -443,7 +443,7 @@ pub(super) fn lower_one_function<'ctx>(
}
}
if !handled {
instructions::emit_jump(
instructions::term_emit_jump(
codegen,
&mut cursor,
*bid,
@ -526,12 +526,13 @@ pub(super) fn lower_one_function<'ctx>(
}
}
if !handled_by_loopform {
instructions::emit_branch(
let cond_norm = instructions::normalize_branch_condition(func, condition);
instructions::term_emit_branch(
codegen,
&mut cursor,
&mut resolver,
*bid,
condition,
&cond_norm,
then_bb,
else_bb,
&bb_map,
@ -545,7 +546,7 @@ pub(super) fn lower_one_function<'ctx>(
_ => {
cursor.at_end(*bid, bb);
if let Some(next_bid) = block_ids.get(bi + 1) {
instructions::emit_jump(
instructions::term_emit_jump(
codegen,
&mut cursor,
*bid,
@ -555,7 +556,7 @@ pub(super) fn lower_one_function<'ctx>(
)?;
} else {
let entry_first = func.entry_block;
instructions::emit_jump(
instructions::term_emit_jump(
codegen,
&mut cursor,
*bid,
@ -575,7 +576,7 @@ pub(super) fn lower_one_function<'ctx>(
}
cursor.at_end(*bid, bb);
if let Some(next_bid) = block_ids.get(bi + 1) {
instructions::emit_jump(
instructions::term_emit_jump(
codegen,
&mut cursor,
*bid,
@ -585,7 +586,7 @@ pub(super) fn lower_one_function<'ctx>(
)?;
} else {
let entry_first = func.entry_block;
instructions::emit_jump(
instructions::term_emit_jump(
codegen,
&mut cursor,
*bid,
@ -604,7 +605,7 @@ pub(super) fn lower_one_function<'ctx>(
}
cursor.at_end(*bid, bb);
if let Some(next_bid) = block_ids.get(bi + 1) {
instructions::emit_jump(
instructions::term_emit_jump(
codegen,
&mut cursor,
*bid,
@ -614,7 +615,7 @@ pub(super) fn lower_one_function<'ctx>(
)?;
} else {
let entry_first = func.entry_block;
instructions::emit_jump(
instructions::term_emit_jump(
codegen,
&mut cursor,
*bid,

View File

@ -16,6 +16,8 @@ mod newbox;
mod resolver;
pub mod string_ops;
mod strings;
mod terminators; // scaffolding: re-exports flow terminators
mod select; // scaffolding: prepare for cond/short-circuit helpers
pub(super) use arith::lower_compare;
pub(super) use arith_ops::{lower_binop, lower_unary};
@ -25,6 +27,9 @@ pub(super) use call::lower_call;
pub(super) use consts::lower_const;
pub(super) use externcall::lower_externcall;
pub(super) use flow::{emit_branch, emit_jump, emit_return};
// Future: swap callers to use `terminators::*` instead of `flow::*` directly
pub(super) use terminators::{emit_branch as term_emit_branch, emit_jump as term_emit_jump, emit_return as term_emit_return};
pub(super) use select::normalize_branch_condition;
pub(super) use loopform::dev_check_dispatch_only_phi;
pub(super) use loopform::normalize_header_phis_for_latch;
pub(super) use loopform::{lower_while_loopform, LoopFormContext};

View File

@ -0,0 +1,16 @@
/*!
* Select & Condition helpers (scaffolding)
*
* Placeholder for condition normalization / short-circuit pre-processing
* to keep `function.rs` focused on structure. Implementations will be
* added incrementally; for now, the only helper is a pass-through.
*/
use crate::mir::{function::MirFunction, ValueId};
/// Normalize a branch condition if needed (scaffolding).
/// Currently returns the input unchanged; provides a single place
/// to adjust semantics later (e.g., truthy rules, short-circuit pre-pass).
pub(crate) fn normalize_branch_condition(_func: &MirFunction, cond: &ValueId) -> ValueId {
*cond
}

View File

@ -0,0 +1,9 @@
/*!
* Terminators (scaffolding)
*
* Thin re-exports of flow-level terminators. Call sites can gradually
* migrate to `terminators::*` without changing behavior.
*/
pub use super::flow::{emit_branch, emit_jump, emit_return};

View File

@ -56,7 +56,7 @@ impl VM {
}
Entry::Vacant(v) => { v.insert(vec![(label.clone(), ver, func_name.to_string())]); }
}
if std::env::var("NYASH_VM_PIC_STATS").ok().as_deref() == Some("1") {
if crate::config::env::vm_pic_stats() {
if let Some(v) = self.boxcall_poly_pic.get(pic_site_key) {
eprintln!("[PIC] site={} size={} last=({}, v{}) -> {}", pic_site_key, v.len(), label, ver, func_name);
}

View File

@ -13,10 +13,13 @@ pub mod builder;
pub mod loop_builder; // SSA loop construction with phi nodes
pub mod loop_api; // Minimal LoopBuilder facade (adapter-ready)
pub mod verification;
pub mod verification_types; // extracted error types
pub mod printer;
pub mod value_id;
pub mod effect;
pub mod optimizer;
pub mod optimizer_stats; // extracted stats struct
pub mod optimizer_passes; // optimizer passes (normalize/diagnostics)
pub mod slot_registry; // Phase 9.79b.1: method slot resolution (IDs)
#[cfg(feature = "aot-plan-import")]
pub mod aot_plan_import;
@ -27,7 +30,8 @@ pub use instruction::{MirInstruction, BinaryOp, CompareOp, UnaryOp, ConstValue,
pub use basic_block::{BasicBlock, BasicBlockId, BasicBlockIdGenerator};
pub use function::{MirFunction, MirModule, FunctionSignature};
pub use builder::MirBuilder;
pub use verification::{MirVerifier, VerificationError};
pub use verification::MirVerifier;
pub use verification_types::VerificationError;
pub use printer::MirPrinter;
pub use value_id::{ValueId, LocalId, ValueIdGenerator};
pub use effect::{EffectMask, Effect};

View File

@ -9,6 +9,7 @@
*/
use super::{MirModule, MirFunction, MirInstruction, ValueId, MirType, EffectMask, Effect};
use crate::mir::optimizer_stats::OptimizationStats;
use std::collections::{HashMap, HashSet};
/// MIR optimization passes
@ -47,20 +48,20 @@ impl MirOptimizer {
// Pass 0: Normalize legacy instructions to unified forms
// - Includes optional Array→BoxCall guarded by env (inside the pass)
stats.merge(self.normalize_legacy_instructions(module));
stats.merge(crate::mir::optimizer_passes::normalize::normalize_legacy_instructions(self, module));
// Pass 0.1: RefGet/RefSet → BoxCall(getField/setField) (guarded)
if ref_to_boxcall {
stats.merge(self.normalize_ref_field_access(module));
stats.merge(crate::mir::optimizer_passes::normalize::normalize_ref_field_access(self, module));
}
// Option: Force BoxCall → PluginInvoke (env)
if crate::config::env::mir_plugin_invoke()
|| crate::config::env::plugin_only() {
stats.merge(self.force_plugin_invoke(module));
stats.merge(crate::mir::optimizer_passes::normalize::force_plugin_invoke(self, module));
}
// Normalize Python helper form: py.getattr(obj, name) → obj.getattr(name)
stats.merge(self.normalize_python_helper_calls(module));
stats.merge(crate::mir::optimizer_passes::normalize::normalize_python_helper_calls(self, module));
// Pass 1: Dead code elimination (modularized pass)
{
@ -75,15 +76,15 @@ impl MirOptimizer {
}
// Pass 3: Pure instruction reordering for better locality
stats.merge(self.reorder_pure_instructions(module));
stats.merge(crate::mir::optimizer_passes::reorder::reorder_pure_instructions(self, module));
// Pass 4: Intrinsic function optimization
stats.merge(self.optimize_intrinsic_calls(module));
stats.merge(crate::mir::optimizer_passes::intrinsics::optimize_intrinsic_calls(self, module));
// Safety-net passes removed (Phase 2: single conversion path). Only diagnostics run afterwards.
// Pass 5: BoxField dependency optimization
stats.merge(self.optimize_boxfield_operations(module));
stats.merge(crate::mir::optimizer_passes::boxfield::optimize_boxfield_operations(self, module));
// Pass 6: propagate receiver type hints (callsite→callee)
// Goal: in cases such as helper(arr){ return arr.length() },
@ -101,10 +102,10 @@ impl MirOptimizer {
println!("✅ Optimization complete: {}", stats);
}
// Diagnostics (informational): report unlowered patterns
let diag1 = self.diagnose_unlowered_type_ops(module);
let diag1 = crate::mir::optimizer_passes::diagnostics::diagnose_unlowered_type_ops(self, module);
stats.merge(diag1);
// Diagnostics (policy): detect legacy (pre-unified) instructions when requested
let diag2 = self.diagnose_legacy_instructions(module);
let diag2 = crate::mir::optimizer_passes::diagnostics::diagnose_legacy_instructions(self, module);
stats.merge(diag2);
stats
@ -371,98 +372,18 @@ impl MirOptimizer {
}
}
/// Reorder pure instructions for better locality
fn reorder_pure_instructions(&mut self, module: &mut MirModule) -> OptimizationStats {
let mut stats = OptimizationStats::new();
for (func_name, function) in &mut module.functions {
if self.debug {
println!(" 🔀 Pure instruction reordering in function: {}", func_name);
// Reorder/Intrinsics/BoxField passes moved to optimizer_passes/* modules
}
stats.reorderings += self.reorder_in_function(function);
}
stats
}
/// Reorder instructions in a function
fn reorder_in_function(&mut self, _function: &mut MirFunction) -> usize {
// Simplified implementation - in full version would implement:
// 1. Build dependency graph
// 2. Topological sort respecting effects
// 3. Group pure instructions together
// 4. Move loads closer to uses
0
}
/// Optimize intrinsic function calls
fn optimize_intrinsic_calls(&mut self, module: &mut MirModule) -> OptimizationStats {
let mut stats = OptimizationStats::new();
for (func_name, function) in &mut module.functions {
if self.debug {
println!(" ⚡ Intrinsic optimization in function: {}", func_name);
}
stats.intrinsic_optimizations += self.optimize_intrinsics_in_function(function);
}
stats
}
/// Optimize intrinsics in a function
fn optimize_intrinsics_in_function(&mut self, _function: &mut MirFunction) -> usize {
// Simplified implementation - would optimize:
// 1. Constant folding in intrinsic calls
// 2. Strength reduction (e.g., @unary_neg(@unary_neg(x)) → x)
// 3. Identity elimination (e.g., x + 0 → x)
0
}
/// Optimize BoxField operations
fn optimize_boxfield_operations(&mut self, module: &mut MirModule) -> OptimizationStats {
let mut stats = OptimizationStats::new();
for (func_name, function) in &mut module.functions {
if self.debug {
println!(" 📦 BoxField optimization in function: {}", func_name);
}
stats.boxfield_optimizations += self.optimize_boxfield_in_function(function);
}
stats
}
/// Optimize BoxField operations in a function
fn optimize_boxfield_in_function(&mut self, _function: &mut MirFunction) -> usize {
// Simplified implementation - would optimize:
// 1. Load-after-store elimination
// 2. Store-after-store elimination
// 3. Load forwarding
// 4. Field access coalescing
0
}
impl MirOptimizer {
/// Expose debug flag for helper modules
pub(crate) fn debug_enabled(&self) -> bool { self.debug }
}
impl MirOptimizer {
/// Rewrite all BoxCall to PluginInvoke to force plugin path (no builtin fallback)
fn force_plugin_invoke(&mut self, module: &mut MirModule) -> OptimizationStats {
use super::MirInstruction as I;
let mut stats = OptimizationStats::new();
for (_fname, function) in &mut module.functions {
for (_bb, block) in &mut function.blocks {
for inst in &mut block.instructions {
if let I::BoxCall { dst, box_val, method, args, effects, .. } = inst.clone() {
*inst = I::PluginInvoke { dst, box_val, method, args, effects };
stats.intrinsic_optimizations += 1;
}
}
}
}
stats
crate::mir::optimizer_passes::normalize::force_plugin_invoke(self, module)
}
/// Normalize Python helper calls that route via PyRuntimeBox into proper receiver form.
@ -470,33 +391,7 @@ impl MirOptimizer {
/// Rewrites: PluginInvoke { box_val=py (PyRuntimeBox), method="getattr"|"call", args=[obj, rest...] }
/// → PluginInvoke { box_val=obj, method, args=[rest...] }
fn normalize_python_helper_calls(&mut self, module: &mut MirModule) -> OptimizationStats {
use super::MirInstruction as I;
let mut stats = OptimizationStats::new();
for (_fname, function) in &mut module.functions {
for (_bb, block) in &mut function.blocks {
for inst in &mut block.instructions {
if let I::PluginInvoke { box_val, method, args, .. } = inst {
if method == "getattr" && args.len() >= 2 {
// Prefer metadata when available
// Heuristic: rewrite only for the helper form (obj, name)
// Rewrite receiver to args[0]
let new_recv = args[0];
// Remove first arg and keep the rest
args.remove(0);
*box_val = new_recv;
stats.intrinsic_optimizations += 1;
} else if method == "call" && !args.is_empty() {
// call: normalize the helper form (func, args...) to receiver=func
let new_recv = args[0];
args.remove(0);
*box_val = new_recv;
stats.intrinsic_optimizations += 1;
}
}
}
}
}
stats
crate::mir::optimizer_passes::normalize::normalize_python_helper_calls(self, module)
}
/// Normalize legacy instructions into unified MIR26 forms.
/// - TypeCheck/Cast → TypeOp(Check/Cast)
@ -767,50 +662,7 @@ impl Default for MirOptimizer {
}
}
/// Statistics from optimization passes
#[derive(Debug, Clone, Default)]
pub struct OptimizationStats {
pub dead_code_eliminated: usize,
pub cse_eliminated: usize,
pub reorderings: usize,
pub intrinsic_optimizations: usize,
pub boxfield_optimizations: usize,
pub diagnostics_reported: usize,
}
impl OptimizationStats {
pub fn new() -> Self {
Default::default()
}
pub fn merge(&mut self, other: OptimizationStats) {
self.dead_code_eliminated += other.dead_code_eliminated;
self.cse_eliminated += other.cse_eliminated;
self.reorderings += other.reorderings;
self.intrinsic_optimizations += other.intrinsic_optimizations;
self.boxfield_optimizations += other.boxfield_optimizations;
self.diagnostics_reported += other.diagnostics_reported;
}
pub fn total_optimizations(&self) -> usize {
self.dead_code_eliminated + self.cse_eliminated + self.reorderings +
self.intrinsic_optimizations + self.boxfield_optimizations
}
}
impl std::fmt::Display for OptimizationStats {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f,
"dead_code: {}, cse: {}, reorder: {}, intrinsic: {}, boxfield: {} (total: {})",
self.dead_code_eliminated,
self.cse_eliminated,
self.reorderings,
self.intrinsic_optimizations,
self.boxfield_optimizations,
self.total_optimizations()
)
}
}
// OptimizationStats moved to crate::mir::optimizer_stats
impl MirOptimizer {
/// Diagnostic: detect unlowered is/as/isType/asType after Builder

View File

@ -0,0 +1,16 @@
use crate::mir::MirModule;
use crate::mir::optimizer::MirOptimizer;
use crate::mir::optimizer_stats::OptimizationStats;
/// Optimize BoxField operations (scaffolding)
pub fn optimize_boxfield_operations(opt: &mut MirOptimizer, module: &mut MirModule) -> OptimizationStats {
let mut stats = OptimizationStats::new();
for (func_name, _function) in &mut module.functions {
if opt.debug_enabled() {
println!(" 📦 BoxField optimization in function: {}", func_name);
}
// Placeholder: no transformation yet; maintain existing behavior
}
stats
}

View File

@ -0,0 +1,105 @@
use crate::mir::{MirModule, MirInstruction, BasicBlockId, function::MirFunction, ValueId};
use crate::mir::optimizer_stats::OptimizationStats;
use crate::mir::optimizer::MirOptimizer;
/// Diagnostic: detect unlowered is/as/isType/asType after Builder
pub fn diagnose_unlowered_type_ops(opt: &mut MirOptimizer, module: &MirModule) -> OptimizationStats {
let mut stats = OptimizationStats::new();
let diag_on = opt.debug_enabled() || crate::config::env::opt_diag();
for (fname, function) in &module.functions {
let mut def_map: std::collections::HashMap<ValueId, (BasicBlockId, usize)> = std::collections::HashMap::new();
for (bb_id, block) in &function.blocks {
for (i, inst) in block.instructions.iter().enumerate() {
if let Some(dst) = inst.dst_value() { def_map.insert(dst, (*bb_id, i)); }
}
if let Some(term) = &block.terminator { if let Some(dst) = term.dst_value() { def_map.insert(dst, (*bb_id, usize::MAX)); } }
}
let mut count = 0usize;
for (_bb, block) in &function.blocks {
for inst in &block.instructions {
match inst {
MirInstruction::BoxCall { method, .. } if method == "is" || method == "as" || method == "isType" || method == "asType" => { count += 1; }
MirInstruction::Call { func, .. } => {
if let Some((bb, idx)) = def_map.get(func).copied() {
if let Some(b) = function.blocks.get(&bb) {
if idx < b.instructions.len() {
if let MirInstruction::Const { value: crate::mir::instruction::ConstValue::String(s), .. } = &b.instructions[idx] {
if s == "isType" || s == "asType" { count += 1; }
}
}
}
}
}
_ => {}
}
}
}
if count > 0 {
stats.diagnostics_reported += count;
if diag_on {
eprintln!("[OPT][DIAG] Function '{}' has {} unlowered type-op calls", fname, count);
}
}
}
stats
}
/// Diagnostic: detect legacy instructions that should be unified
/// Legacy set: TypeCheck/Cast/WeakNew/WeakLoad/BarrierRead/BarrierWrite/ArrayGet/ArraySet/RefGet/RefSet/PluginInvoke
pub fn diagnose_legacy_instructions(opt: &mut MirOptimizer, module: &MirModule) -> OptimizationStats {
let mut stats = OptimizationStats::new();
let diag_on = opt.debug_enabled()
|| crate::config::env::opt_diag()
|| crate::config::env::opt_diag_forbid_legacy();
for (fname, function) in &module.functions {
let mut count = 0usize;
for (_bb, block) in &function.blocks {
for inst in &block.instructions {
match inst {
MirInstruction::TypeCheck { .. }
| MirInstruction::Cast { .. }
| MirInstruction::WeakNew { .. }
| MirInstruction::WeakLoad { .. }
| MirInstruction::BarrierRead { .. }
| MirInstruction::BarrierWrite { .. }
| MirInstruction::ArrayGet { .. }
| MirInstruction::ArraySet { .. }
| MirInstruction::RefGet { .. }
| MirInstruction::RefSet { .. }
| MirInstruction::PluginInvoke { .. } => { count += 1; }
_ => {}
}
}
if let Some(term) = &block.terminator {
match term {
MirInstruction::TypeCheck { .. }
| MirInstruction::Cast { .. }
| MirInstruction::WeakNew { .. }
| MirInstruction::WeakLoad { .. }
| MirInstruction::BarrierRead { .. }
| MirInstruction::BarrierWrite { .. }
| MirInstruction::ArrayGet { .. }
| MirInstruction::ArraySet { .. }
| MirInstruction::RefGet { .. }
| MirInstruction::RefSet { .. }
| MirInstruction::PluginInvoke { .. } => { count += 1; }
_ => {}
}
}
}
if count > 0 {
stats.diagnostics_reported += count;
if diag_on {
eprintln!(
"[OPT][DIAG] Function '{}' has {} legacy MIR ops: unify to Core13 (TypeOp/WeakRef/Barrier/BoxCall)",
fname, count
);
if crate::config::env::opt_diag_forbid_legacy() {
panic!("NYASH_OPT_DIAG_FORBID_LEGACY=1: legacy MIR ops detected in '{}': {}", fname, count);
}
}
}
}
stats
}

View File

@ -0,0 +1,18 @@
use crate::mir::MirModule;
use crate::mir::optimizer::MirOptimizer;
use crate::mir::optimizer_stats::OptimizationStats;
/// Intrinsic optimization pass (scaffolding)
/// Keeps behavior identical for now (no transforms), but centralizes
/// debug printing and future hooks.
pub fn optimize_intrinsic_calls(opt: &mut MirOptimizer, module: &mut MirModule) -> OptimizationStats {
let mut stats = OptimizationStats::new();
for (func_name, _function) in &mut module.functions {
if opt.debug_enabled() {
println!(" ⚡ Intrinsic optimization in function: {}", func_name);
}
// Placeholder: no transformation; keep parity
}
stats
}

View File

@ -0,0 +1,5 @@
pub mod normalize;
pub mod diagnostics;
pub mod reorder;
pub mod boxfield;
pub mod intrinsics;

View File

@ -0,0 +1,223 @@
use crate::mir::{MirModule, MirInstruction, TypeOpKind, WeakRefOp, BarrierOp, ValueId};
use crate::mir::optimizer::MirOptimizer;
use crate::mir::optimizer_stats::OptimizationStats;
pub fn force_plugin_invoke(_opt: &mut MirOptimizer, module: &mut MirModule) -> OptimizationStats {
use crate::mir::MirInstruction as I;
let mut stats = OptimizationStats::new();
for (_fname, function) in &mut module.functions {
for (_bb, block) in &mut function.blocks {
for inst in &mut block.instructions {
if let I::BoxCall { dst, box_val, method, args, effects, .. } = inst.clone() {
*inst = I::PluginInvoke { dst, box_val, method, args, effects };
stats.intrinsic_optimizations += 1;
}
}
}
}
stats
}
pub fn normalize_python_helper_calls(_opt: &mut MirOptimizer, module: &mut MirModule) -> OptimizationStats {
use crate::mir::MirInstruction as I;
let mut stats = OptimizationStats::new();
for (_fname, function) in &mut module.functions {
for (_bb, block) in &mut function.blocks {
for inst in &mut block.instructions {
if let I::PluginInvoke { box_val, method, args, .. } = inst {
if method == "getattr" && args.len() >= 2 {
let new_recv = args[0];
args.remove(0);
*box_val = new_recv;
stats.intrinsic_optimizations += 1;
} else if method == "call" && !args.is_empty() {
let new_recv = args[0];
args.remove(0);
*box_val = new_recv;
stats.intrinsic_optimizations += 1;
}
}
}
}
}
stats
}
pub fn normalize_legacy_instructions(_opt: &mut MirOptimizer, module: &mut MirModule) -> OptimizationStats {
use crate::mir::MirInstruction as I;
let mut stats = OptimizationStats::new();
let rw_dbg = crate::config::env::rewrite_debug();
let rw_sp = crate::config::env::rewrite_safepoint();
let rw_future = crate::config::env::rewrite_future();
let core13 = crate::config::env::mir_core13();
let mut array_to_boxcall = crate::config::env::mir_array_boxcall();
if core13 { array_to_boxcall = true; }
for (_fname, function) in &mut module.functions {
for (_bb, block) in &mut function.blocks {
for inst in &mut block.instructions {
match inst {
I::WeakNew { dst, box_val } => {
let d = *dst; let v = *box_val;
*inst = I::WeakRef { dst: d, op: WeakRefOp::New, value: v };
stats.intrinsic_optimizations += 1;
}
I::WeakLoad { dst, weak_ref } => {
let d = *dst; let v = *weak_ref;
*inst = I::WeakRef { dst: d, op: WeakRefOp::Load, value: v };
stats.intrinsic_optimizations += 1;
}
I::BarrierRead { ptr } => {
let p = *ptr; *inst = I::Barrier { op: BarrierOp::Read, ptr: p };
stats.intrinsic_optimizations += 1;
}
I::BarrierWrite { ptr } => {
let p = *ptr; *inst = I::Barrier { op: BarrierOp::Write, ptr: p };
stats.intrinsic_optimizations += 1;
}
I::Print { value, .. } => {
let v = *value;
*inst = I::ExternCall { dst: None, iface_name: "env.console".to_string(), method_name: "log".to_string(), args: vec![v], effects: crate::mir::EffectMask::PURE.add(crate::mir::Effect::Io) };
stats.intrinsic_optimizations += 1;
}
I::ArrayGet { dst, array, index } if array_to_boxcall => {
let d = *dst; let a = *array; let i = *index;
let mid = crate::mir::slot_registry::resolve_slot_by_type_name("ArrayBox", "get");
*inst = I::BoxCall { dst: Some(d), box_val: a, method: "get".to_string(), method_id: mid, args: vec![i], effects: crate::mir::EffectMask::READ };
stats.intrinsic_optimizations += 1;
}
I::ArraySet { array, index, value } if array_to_boxcall => {
let a = *array; let i = *index; let v = *value;
let mid = crate::mir::slot_registry::resolve_slot_by_type_name("ArrayBox", "set");
*inst = I::BoxCall { dst: None, box_val: a, method: "set".to_string(), method_id: mid, args: vec![i, v], effects: crate::mir::EffectMask::WRITE };
stats.intrinsic_optimizations += 1;
}
I::PluginInvoke { dst, box_val, method, args, effects } => {
let d = *dst; let recv = *box_val; let m = method.clone(); let as_ = args.clone(); let eff = *effects;
*inst = I::BoxCall { dst: d, box_val: recv, method: m, method_id: None, args: as_, effects: eff };
stats.intrinsic_optimizations += 1;
}
I::Debug { .. } if !rw_dbg => {
*inst = I::Nop;
}
I::Safepoint if !rw_sp => {
*inst = I::Nop;
}
I::FutureNew { dst, value } if rw_future => {
let d = *dst; let v = *value;
*inst = I::ExternCall { dst: Some(d), iface_name: "env.future".to_string(), method_name: "new".to_string(), args: vec![v], effects: crate::mir::EffectMask::PURE.add(crate::mir::Effect::Io) };
}
I::FutureSet { future, value } if rw_future => {
let f = *future; let v = *value;
*inst = I::ExternCall { dst: None, iface_name: "env.future".to_string(), method_name: "set".to_string(), args: vec![f, v], effects: crate::mir::EffectMask::PURE.add(crate::mir::Effect::Io) };
}
I::Await { dst, future } if rw_future => {
let d = *dst; let f = *future;
*inst = I::ExternCall { dst: Some(d), iface_name: "env.future".to_string(), method_name: "await".to_string(), args: vec![f], effects: crate::mir::EffectMask::PURE.add(crate::mir::Effect::Io) };
}
_ => {}
}
}
// terminator rewrite (subset migrated as needed)
if let Some(term) = &mut block.terminator {
match term {
I::TypeCheck { dst, value, expected_type } => {
let ty = crate::mir::instruction::MirType::Box(expected_type.clone());
*term = I::TypeOp { dst: *dst, op: TypeOpKind::Check, value: *value, ty };
stats.intrinsic_optimizations += 1;
}
I::Cast { dst, value, target_type } => {
let ty = target_type.clone();
*term = I::TypeOp { dst: *dst, op: TypeOpKind::Cast, value: *value, ty };
stats.intrinsic_optimizations += 1;
}
I::WeakNew { dst, box_val } => {
let d = *dst; let v = *box_val;
*term = I::WeakRef { dst: d, op: WeakRefOp::New, value: v };
stats.intrinsic_optimizations += 1;
}
I::WeakLoad { dst, weak_ref } => {
let d = *dst; let v = *weak_ref;
*term = I::WeakRef { dst: d, op: WeakRefOp::Load, value: v };
stats.intrinsic_optimizations += 1;
}
I::BarrierRead { ptr } => {
let p = *ptr; *term = I::Barrier { op: BarrierOp::Read, ptr: p };
stats.intrinsic_optimizations += 1;
}
I::BarrierWrite { ptr } => {
let p = *ptr; *term = I::Barrier { op: BarrierOp::Write, ptr: p };
stats.intrinsic_optimizations += 1;
}
I::Print { value, .. } => {
let v = *value; *term = I::ExternCall { dst: None, iface_name: "env.console".to_string(), method_name: "log".to_string(), args: vec![v], effects: crate::mir::EffectMask::PURE.add(crate::mir::Effect::Io) };
stats.intrinsic_optimizations += 1;
}
I::ArrayGet { dst, array, index } if array_to_boxcall => {
let d = *dst; let a = *array; let i = *index;
*term = I::BoxCall { dst: Some(d), box_val: a, method: "get".to_string(), method_id: None, args: vec![i], effects: crate::mir::EffectMask::READ };
stats.intrinsic_optimizations += 1;
}
I::ArraySet { array, index, value } if array_to_boxcall => {
let a = *array; let i = *index; let v = *value;
*term = I::BoxCall { dst: None, box_val: a, method: "set".to_string(), method_id: None, args: vec![i, v], effects: crate::mir::EffectMask::WRITE };
stats.intrinsic_optimizations += 1;
}
_ => {}
}
}
}
}
stats
}
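/// Lower RefGet/RefSet to BoxCall("getField"/"setField"): the field name is materialized as a
/// fresh string Const, and RefSet additionally emits a write Barrier before the call.
pub fn normalize_ref_field_access(_opt: &mut MirOptimizer, module: &mut MirModule) -> OptimizationStats {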
use crate::mir::MirInstruction as I;
let mut stats = OptimizationStats::new();
for (_fname, function) in &mut module.functions {
for (_bb, block) in &mut function.blocks {
let mut out: Vec<I> = Vec::with_capacity(block.instructions.len() + 2);
let old = std::mem::take(&mut block.instructions);
for inst in old.into_iter() {
match inst {
I::RefGet { dst, reference, field } => {
let new_id = ValueId::new(function.next_value_id);
function.next_value_id += 1;
out.push(I::Const { dst: new_id, value: crate::mir::instruction::ConstValue::String(field) });
out.push(I::BoxCall { dst: Some(dst), box_val: reference, method: "getField".to_string(), method_id: None, args: vec![new_id], effects: crate::mir::EffectMask::READ });
stats.intrinsic_optimizations += 1;
}
I::RefSet { reference, field, value } => {
let new_id = ValueId::new(function.next_value_id);
function.next_value_id += 1;
out.push(I::Const { dst: new_id, value: crate::mir::instruction::ConstValue::String(field) });
out.push(I::Barrier { op: BarrierOp::Write, ptr: reference });
out.push(I::BoxCall { dst: None, box_val: reference, method: "setField".to_string(), method_id: None, args: vec![new_id, value], effects: crate::mir::EffectMask::WRITE });
stats.intrinsic_optimizations += 1;
}
other => out.push(other),
}
}
block.instructions = out;
if let Some(term) = block.terminator.take() {
block.terminator = Some(match term {
I::RefGet { dst, reference, field } => {
let new_id = ValueId::new(function.next_value_id);
function.next_value_id += 1;
block.instructions.push(I::Const { dst: new_id, value: crate::mir::instruction::ConstValue::String(field) });
I::BoxCall { dst: Some(dst), box_val: reference, method: "getField".to_string(), method_id: None, args: vec![new_id], effects: crate::mir::EffectMask::READ }
}
I::RefSet { reference, field, value } => {
let new_id = ValueId::new(function.next_value_id);
function.next_value_id += 1;
block.instructions.push(I::Const { dst: new_id, value: crate::mir::instruction::ConstValue::String(field) });
block.instructions.push(I::Barrier { op: BarrierOp::Write, ptr: reference });
I::BoxCall { dst: None, box_val: reference, method: "setField".to_string(), method_id: None, args: vec![new_id, value], effects: crate::mir::EffectMask::WRITE }
}
other => other,
});
}
}
}
stats
}
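
The RefGet lowering above can be sketched in isolation. This is a minimal model, not the project's code: `Inst`, `ConstStr`, `Other`, and `lower_ref_get` are hypothetical stand-ins for the real `MirInstruction` variants, and value ids are plain `u32`s.

```rust
// Toy instruction set (hypothetical; the real MirInstruction has many more variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Inst {
    RefGet { dst: u32, reference: u32, field: String },
    ConstStr { dst: u32, value: String },
    BoxCall { dst: Option<u32>, recv: u32, method: String, args: Vec<u32> },
    Other(u32),
}

/// Replace each RefGet with a string Const (the field name) followed by a
/// BoxCall("getField"). `next_value_id` supplies fresh ids for the Const.
pub fn lower_ref_get(insts: Vec<Inst>, next_value_id: &mut u32) -> Vec<Inst> {
    let mut out = Vec::with_capacity(insts.len() + 2);
    for inst in insts {
        match inst {
            Inst::RefGet { dst, reference, field } => {
                let name_id = *next_value_id;
                *next_value_id += 1;
                out.push(Inst::ConstStr { dst: name_id, value: field });
                out.push(Inst::BoxCall {
                    dst: Some(dst),
                    recv: reference,
                    method: "getField".to_string(),
                    args: vec![name_id],
                });
            }
            // Everything else passes through unchanged.
            other => out.push(other),
        }
    }
    out
}
```

Because one instruction expands into two, the sketch rebuilds the list rather than rewriting in place, which mirrors the `std::mem::take` plus `out` vector approach in the pass above.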

@ -0,0 +1,17 @@
use crate::mir::MirModule;
use crate::mir::optimizer::MirOptimizer;
use crate::mir::optimizer_stats::OptimizationStats;
/// Reorder pure instructions for better locality (scaffolding)
pub fn reorder_pure_instructions(opt: &mut MirOptimizer, module: &mut MirModule) -> OptimizationStats {
let mut stats = OptimizationStats::new();
for (func_name, _function) in &mut module.functions {
if opt.debug_enabled() {
println!(" 🔀 Pure instruction reordering in function: {}", func_name);
}
// Placeholder: keep behavior identical (no reordering yet)
// When implemented, set stats.reorderings += N per function.
}
stats
}

@ -0,0 +1,53 @@
/*!
* Optimizer statistics (extracted from optimizer.rs)
*/
/// Statistics from optimization passes
#[derive(Debug, Clone, Default)]
pub struct OptimizationStats {
pub dead_code_eliminated: usize,
pub cse_eliminated: usize,
pub reorderings: usize,
pub intrinsic_optimizations: usize,
pub boxfield_optimizations: usize,
pub diagnostics_reported: usize,
}
impl OptimizationStats {
pub fn new() -> Self {
Default::default()
}
pub fn merge(&mut self, other: OptimizationStats) {
self.dead_code_eliminated += other.dead_code_eliminated;
self.cse_eliminated += other.cse_eliminated;
self.reorderings += other.reorderings;
self.intrinsic_optimizations += other.intrinsic_optimizations;
self.boxfield_optimizations += other.boxfield_optimizations;
self.diagnostics_reported += other.diagnostics_reported;
}
pub fn total_optimizations(&self) -> usize {
self.dead_code_eliminated
+ self.cse_eliminated
+ self.reorderings
+ self.intrinsic_optimizations
+ self.boxfield_optimizations
}
}
impl std::fmt::Display for OptimizationStats {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(
f,
"dead_code: {}, cse: {}, reorder: {}, intrinsic: {}, boxfield: {} (total: {})",
self.dead_code_eliminated,
self.cse_eliminated,
self.reorderings,
self.intrinsic_optimizations,
self.boxfield_optimizations,
self.total_optimizations()
)
}
}
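
A short sketch of how per-pass stats might be folded together with `merge`. It assumes a trimmed stand-in struct (`Stats`, three counters only) and hypothetical pass functions (`cse_pass`, `normalize_pass`) and driver (`run_passes`); none of these names come from the commit.

```rust
// Trimmed stand-in for OptimizationStats, for illustration only.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Stats {
    pub cse_eliminated: usize,
    pub intrinsic_optimizations: usize,
    pub diagnostics_reported: usize,
}

impl Stats {
    pub fn merge(&mut self, other: Stats) {
        self.cse_eliminated += other.cse_eliminated;
        self.intrinsic_optimizations += other.intrinsic_optimizations;
        self.diagnostics_reported += other.diagnostics_reported;
    }

    // As in the struct above, diagnostics are merged but not counted in the total.
    pub fn total_optimizations(&self) -> usize {
        self.cse_eliminated + self.intrinsic_optimizations
    }
}

// Hypothetical passes that each report their own counters.
pub fn cse_pass() -> Stats {
    Stats { cse_eliminated: 2, ..Default::default() }
}

pub fn normalize_pass() -> Stats {
    Stats { intrinsic_optimizations: 3, diagnostics_reported: 1, ..Default::default() }
}

/// Hypothetical driver: run each pass and fold its stats into one accumulator.
pub fn run_passes(passes: &[fn() -> Stats]) -> Stats {
    let mut total = Stats::default();
    for pass in passes {
        total.merge(pass());
    }
    total
}
```

Having each pass return its own value and merging afterwards keeps the passes independent of a shared mutable counter, which appears to be the motivation for the split into delegating modules.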

@ -5,89 +5,15 @@
*/
use super::{MirModule, MirFunction, BasicBlockId, ValueId};
use crate::mir::verification_types::VerificationError;
use crate::debug::log as dlog;
use std::collections::{HashSet, HashMap};
mod legacy;
mod barrier;
mod awaits;
mod utils;
/// Verification error types
#[derive(Debug, Clone, PartialEq)]
pub enum VerificationError {
/// Undefined value used
UndefinedValue {
value: ValueId,
block: BasicBlockId,
instruction_index: usize,
},
/// Value defined multiple times
MultipleDefinition {
value: ValueId,
first_block: BasicBlockId,
second_block: BasicBlockId,
},
/// Invalid phi function
InvalidPhi {
phi_value: ValueId,
block: BasicBlockId,
reason: String,
},
/// Unreachable block
UnreachableBlock {
block: BasicBlockId,
},
/// Control flow error
ControlFlowError {
block: BasicBlockId,
reason: String,
},
/// Dominator violation
DominatorViolation {
value: ValueId,
use_block: BasicBlockId,
def_block: BasicBlockId,
},
/// Merge block uses predecessor-defined value directly instead of Phi
MergeUsesPredecessorValue {
value: ValueId,
merge_block: BasicBlockId,
pred_block: BasicBlockId,
},
/// WeakRef(load) must originate from a WeakNew/WeakRef(new)
InvalidWeakRefSource {
weak_ref: ValueId,
block: BasicBlockId,
instruction_index: usize,
reason: String,
},
/// Barrier pointer must not be a void constant
InvalidBarrierPointer {
ptr: ValueId,
block: BasicBlockId,
instruction_index: usize,
reason: String,
},
/// Barrier appears without nearby memory ops (diagnostic; strict mode only)
SuspiciousBarrierContext {
block: BasicBlockId,
instruction_index: usize,
note: String,
},
/// Legacy/Deprecated instruction encountered (should have been rewritten to Core-15)
UnsupportedLegacyInstruction {
block: BasicBlockId,
instruction_index: usize,
name: String,
},
/// Await must be surrounded by checkpoints (before and after)
MissingCheckpointAroundAwait {
block: BasicBlockId,
instruction_index: usize,
position: &'static str, // "before" | "after"
},
}
// VerificationError moved to crate::mir::verification_types
/// MIR verifier for SSA form and semantic correctness
pub struct MirVerifier {
@ -219,211 +145,25 @@ impl MirVerifier {
/// Reject legacy instructions that should be rewritten to Core-15 equivalents
/// Skips check when NYASH_VERIFY_ALLOW_LEGACY=1
fn verify_no_legacy_ops(&self, function: &MirFunction) -> Result<(), Vec<VerificationError>> {
if std::env::var("NYASH_VERIFY_ALLOW_LEGACY").ok().as_deref() == Some("1") {
return Ok(());
}
use super::MirInstruction as I;
let mut errors = Vec::new();
for (bid, block) in &function.blocks {
for (idx, inst) in block.all_instructions().enumerate() {
let legacy_name = match inst {
// Explicit legacy forms that must be rewritten to unified/core ops
I::TypeCheck { .. } => Some("TypeCheck"), // -> TypeOp(Check)
I::Cast { .. } => Some("Cast"), // -> TypeOp(Cast)
I::WeakNew { .. } => Some("WeakNew"), // -> WeakRef(New)
I::WeakLoad { .. } => Some("WeakLoad"), // -> WeakRef(Load)
I::BarrierRead { .. } => Some("BarrierRead"), // -> Barrier(Read)
I::BarrierWrite { .. } => Some("BarrierWrite"), // -> Barrier(Write)
I::Print { .. } => Some("Print"), // -> ExternCall(env.console.log)
I::ArrayGet { .. } => Some("ArrayGet"), // -> BoxCall("get")
I::ArraySet { .. } => Some("ArraySet"), // -> BoxCall("set")
I::RefGet { .. } => Some("RefGet"), // -> BoxCall("getField")
I::RefSet { .. } => Some("RefSet"), // -> BoxCall("setField")
I::PluginInvoke { .. } => Some("PluginInvoke"), // -> BoxCall
// Keep generic Call for now (migration ongoing)
// Meta/exceptional ops are handled separately; not hard-forbidden here
_ => None,
};
if let Some(name) = legacy_name {
errors.push(VerificationError::UnsupportedLegacyInstruction {
block: *bid,
instruction_index: idx,
name: name.to_string(),
});
}
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
legacy::check_no_legacy_ops(function)
}
/// Ensure that each Await instruction (or ExternCall(env.future.await)) is immediately
/// preceded and followed by a checkpoint.
/// A checkpoint is either MirInstruction::Safepoint or ExternCall("env.runtime", "checkpoint").
fn verify_await_checkpoints(&self, function: &MirFunction) -> Result<(), Vec<VerificationError>> {
use super::MirInstruction as I;
let mut errors = Vec::new();
let is_cp = |inst: &I| match inst {
I::Safepoint => true,
I::ExternCall { iface_name, method_name, .. } => iface_name == "env.runtime" && method_name == "checkpoint",
_ => false,
};
for (bid, block) in &function.blocks {
let instrs = &block.instructions;
for (idx, inst) in instrs.iter().enumerate() {
let is_await_like = match inst {
I::Await { .. } => true,
I::ExternCall { iface_name, method_name, .. } => iface_name == "env.future" && method_name == "await",
_ => false,
};
if is_await_like {
// Check immediate previous
if idx == 0 || !is_cp(&instrs[idx - 1]) {
errors.push(VerificationError::MissingCheckpointAroundAwait { block: *bid, instruction_index: idx, position: "before" });
}
// Check immediate next (within instructions list)
if idx + 1 >= instrs.len() || !is_cp(&instrs[idx + 1]) {
errors.push(VerificationError::MissingCheckpointAroundAwait { block: *bid, instruction_index: idx, position: "after" });
}
}
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
awaits::check_await_checkpoints(function)
}
/// Verify WeakRef/Barrier minimal semantics
fn verify_weakref_and_barrier(&self, function: &MirFunction) -> Result<(), Vec<VerificationError>> {
use super::MirInstruction;
let mut errors = Vec::new();
// Build def map value -> (block, idx, &inst)
let mut def_map: HashMap<ValueId, (BasicBlockId, usize, &MirInstruction)> = HashMap::new();
for (bid, block) in &function.blocks {
for (idx, inst) in block.all_instructions().enumerate() {
if let Some(dst) = inst.dst_value() {
def_map.insert(dst, (*bid, idx, inst));
}
}
}
for (bid, block) in &function.blocks {
for (idx, inst) in block.all_instructions().enumerate() {
match inst {
MirInstruction::WeakRef { op: super::WeakRefOp::Load, value, .. } => {
match def_map.get(value) {
Some((_db, _di, def_inst)) => match def_inst {
MirInstruction::WeakRef { op: super::WeakRefOp::New, .. } | MirInstruction::WeakNew { .. } => {}
_ => {
errors.push(VerificationError::InvalidWeakRefSource {
weak_ref: *value,
block: *bid,
instruction_index: idx,
reason: "weakref.load source is not a weakref.new/weak_new".to_string(),
});
}
},
None => {
errors.push(VerificationError::InvalidWeakRefSource {
weak_ref: *value,
block: *bid,
instruction_index: idx,
reason: "weakref.load source is undefined".to_string(),
});
}
}
}
MirInstruction::WeakLoad { weak_ref, .. } => {
match def_map.get(weak_ref) {
Some((_db, _di, def_inst)) => match def_inst {
MirInstruction::WeakNew { .. } | MirInstruction::WeakRef { op: super::WeakRefOp::New, .. } => {}
_ => {
errors.push(VerificationError::InvalidWeakRefSource {
weak_ref: *weak_ref,
block: *bid,
instruction_index: idx,
reason: "weak_load source is not a weak_new/weakref.new".to_string(),
});
}
},
None => {
errors.push(VerificationError::InvalidWeakRefSource {
weak_ref: *weak_ref,
block: *bid,
instruction_index: idx,
reason: "weak_load source is undefined".to_string(),
});
}
}
}
MirInstruction::Barrier { ptr, .. } | MirInstruction::BarrierRead { ptr } | MirInstruction::BarrierWrite { ptr } => {
if let Some((_db, _di, def_inst)) = def_map.get(ptr) {
if let MirInstruction::Const { value: super::ConstValue::Void, .. } = def_inst {
errors.push(VerificationError::InvalidBarrierPointer {
ptr: *ptr,
block: *bid,
instruction_index: idx,
reason: "barrier pointer is void".to_string(),
});
}
}
}
_ => {}
}
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
barrier::check_weakref_and_barrier(function)
}
/// Light diagnostic: Barrier should be near memory ops in the same block (best-effort)
/// Enabled only when NYASH_VERIFY_BARRIER_STRICT=1
fn verify_barrier_context(&self, function: &MirFunction) -> Result<(), Vec<VerificationError>> {
let strict = std::env::var("NYASH_VERIFY_BARRIER_STRICT").ok().as_deref() == Some("1");
if !strict { return Ok(()); }
use super::MirInstruction;
let mut errors = Vec::new();
for (bid, block) in &function.blocks {
// Build a flat vec of (idx, &inst) including terminator (as last)
let mut insts: Vec<(usize, &MirInstruction)> = block.instructions.iter().enumerate().collect();
if let Some(term) = &block.terminator {
insts.push((usize::MAX, term));
}
for (idx, inst) in &insts {
let is_barrier = matches!(inst,
MirInstruction::Barrier { .. } |
MirInstruction::BarrierRead { .. } |
MirInstruction::BarrierWrite { .. }
);
if !is_barrier { continue; }
// Look around ±2 instructions for a memory op hint
let mut has_mem_neighbor = false;
for (j, other) in &insts {
if *j == *idx { continue; }
// integer distance (treat usize::MAX as distant)
let dist = if *idx == usize::MAX || *j == usize::MAX { 99 } else { idx.max(j) - idx.min(j) };
if dist > 2 { continue; }
if matches!(other,
MirInstruction::Load { .. } |
MirInstruction::Store { .. } |
MirInstruction::ArrayGet { .. } |
MirInstruction::ArraySet { .. } |
MirInstruction::RefGet { .. } |
MirInstruction::RefSet { .. }
) {
has_mem_neighbor = true;
break;
}
}
if !has_mem_neighbor {
errors.push(VerificationError::SuspiciousBarrierContext {
block: *bid,
instruction_index: *idx,
note: "barrier without nearby memory op (±2 inst)".to_string(),
});
}
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
barrier::check_barrier_context(function)
}
/// Verify SSA form properties
@ -596,41 +336,7 @@ impl MirVerifier {
/// Compute reachable blocks from entry
fn compute_reachable_blocks(&self, function: &MirFunction) -> HashSet<BasicBlockId> {
let mut reachable = HashSet::new();
let mut worklist = vec![function.entry_block];
while let Some(current) = worklist.pop() {
if reachable.insert(current) {
if let Some(block) = function.blocks.get(&current) {
// Add normal successors
for successor in &block.successors {
if !reachable.contains(successor) {
worklist.push(*successor);
}
}
// Add exception handler blocks as reachable
for instruction in &block.instructions {
if let super::MirInstruction::Catch { handler_bb, .. } = instruction {
if !reachable.contains(handler_bb) {
worklist.push(*handler_bb);
}
}
}
// Also check terminator for exception handlers
if let Some(ref terminator) = block.terminator {
if let super::MirInstruction::Catch { handler_bb, .. } = terminator {
if !reachable.contains(handler_bb) {
worklist.push(*handler_bb);
}
}
}
}
}
}
reachable
utils::compute_reachable_blocks(function)
}
/// Get all verification errors from the last run
@ -645,66 +351,17 @@ impl MirVerifier {
/// Build predecessor map for all blocks
fn compute_predecessors(&self, function: &MirFunction) -> HashMap<BasicBlockId, Vec<BasicBlockId>> {
let mut preds: HashMap<BasicBlockId, Vec<BasicBlockId>> = HashMap::new();
for (bid, block) in &function.blocks {
for succ in &block.successors {
preds.entry(*succ).or_default().push(*bid);
}
}
preds
utils::compute_predecessors(function)
}
/// Build a map from ValueId to its defining block
fn compute_def_blocks(&self, function: &MirFunction) -> HashMap<ValueId, BasicBlockId> {
let mut def_block: HashMap<ValueId, BasicBlockId> = HashMap::new();
for (bid, block) in &function.blocks {
for inst in block.all_instructions() {
if let Some(dst) = inst.dst_value() { def_block.insert(dst, *bid); }
}
}
def_block
utils::compute_def_blocks(function)
}
/// Compute dominator sets per block using standard iterative algorithm
fn compute_dominators(&self, function: &MirFunction) -> HashMap<BasicBlockId, HashSet<BasicBlockId>> {
let all_blocks: HashSet<BasicBlockId> = function.blocks.keys().copied().collect();
let preds = self.compute_predecessors(function);
let mut dom: HashMap<BasicBlockId, HashSet<BasicBlockId>> = HashMap::new();
for &b in function.blocks.keys() {
if b == function.entry_block {
let mut set = HashSet::new();
set.insert(b);
dom.insert(b, set);
} else {
dom.insert(b, all_blocks.clone());
}
}
let mut changed = true;
while changed {
changed = false;
for &b in function.blocks.keys() {
if b == function.entry_block { continue; }
let mut new_set: HashSet<BasicBlockId> = all_blocks.clone();
if let Some(ps) = preds.get(&b) {
if !ps.is_empty() {
for (i, p) in ps.iter().enumerate() {
if let Some(p_set) = dom.get(p) {
if i == 0 { new_set = p_set.clone(); }
else { new_set = new_set.intersection(p_set).copied().collect(); }
}
}
}
}
new_set.insert(b);
if let Some(old) = dom.get(&b) {
if &new_set != old { dom.insert(b, new_set); changed = true; }
}
}
}
dom
utils::compute_dominators(function)
}
}
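
The iterative dominator computation that `compute_dominators` now delegates to can be illustrated on a bare graph. This is a sketch under assumptions: blocks are `u32` ids, the CFG is a successor map in which every block appears as a key, and `dominators` is a hypothetical free function rather than the project's `utils::compute_dominators`.

```rust
use std::collections::{HashMap, HashSet};

/// Iterative dominator sets: dom(entry) = {entry};
/// dom(b) = {b} ∪ ⋂ dom(p) over all predecessors p of b.
pub fn dominators(entry: u32, succs: &HashMap<u32, Vec<u32>>) -> HashMap<u32, HashSet<u32>> {
    let all: HashSet<u32> = succs.keys().copied().collect();

    // Invert the successor map into a predecessor map.
    let mut preds: HashMap<u32, Vec<u32>> = HashMap::new();
    for (&b, ss) in succs {
        for &s in ss {
            preds.entry(s).or_default().push(b);
        }
    }

    // Start from the most pessimistic assumption: everything dominates everything.
    let mut dom: HashMap<u32, HashSet<u32>> = HashMap::new();
    for &b in &all {
        let init = if b == entry { HashSet::from([b]) } else { all.clone() };
        dom.insert(b, init);
    }

    // Shrink the sets until a fixed point is reached.
    let mut changed = true;
    while changed {
        changed = false;
        for &b in &all {
            if b == entry {
                continue;
            }
            let mut acc: Option<HashSet<u32>> = None;
            for p in preds.get(&b).into_iter().flatten() {
                let p_set = &dom[p];
                acc = Some(match acc {
                    None => p_set.clone(),
                    Some(a) => a.intersection(p_set).copied().collect(),
                });
            }
            // Blocks without predecessors keep the full set, as in the original loop.
            let mut new_set = acc.unwrap_or_else(|| all.clone());
            new_set.insert(b);
            if dom[&b] != new_set {
                dom.insert(b, new_set);
                changed = true;
            }
        }
    }
    dom
}
```

On a diamond-shaped CFG (0 branches to 1 and 2, both join at 3), the join block ends up dominated only by the entry and itself, which is the property the merge-use and dominance checks rely on.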

@ -0,0 +1,34 @@
use crate::mir::{function::MirFunction, MirInstruction};
use crate::mir::verification_types::VerificationError;
/// Ensure that each Await instruction (or ExternCall(env.future.await)) is immediately
/// preceded and followed by a checkpoint.
/// A checkpoint is either MirInstruction::Safepoint or ExternCall("env.runtime", "checkpoint").
pub fn check_await_checkpoints(function: &MirFunction) -> Result<(), Vec<VerificationError>> {
let mut errors = Vec::new();
let is_cp = |inst: &MirInstruction| match inst {
MirInstruction::Safepoint => true,
MirInstruction::ExternCall { iface_name, method_name, .. } => iface_name == "env.runtime" && method_name == "checkpoint",
_ => false,
};
for (bid, block) in &function.blocks {
let instrs = &block.instructions;
for (idx, inst) in instrs.iter().enumerate() {
let is_await_like = match inst {
MirInstruction::Await { .. } => true,
MirInstruction::ExternCall { iface_name, method_name, .. } => iface_name == "env.future" && method_name == "await",
_ => false,
};
if is_await_like {
if idx == 0 || !is_cp(&instrs[idx - 1]) {
errors.push(VerificationError::MissingCheckpointAroundAwait { block: *bid, instruction_index: idx, position: "before" });
}
if idx + 1 >= instrs.len() || !is_cp(&instrs[idx + 1]) {
errors.push(VerificationError::MissingCheckpointAroundAwait { block: *bid, instruction_index: idx, position: "after" });
}
}
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
}
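
The adjacency rule enforced above can be shown on a flat list of opcodes. This is a simplified sketch: `Op` and `missing_checkpoints` are hypothetical names, only `Safepoint` is treated as a checkpoint, and the `ExternCall` forms are left out.

```rust
// Toy opcode set (hypothetical; stands in for MirInstruction).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Safepoint,
    Await,
    Other,
}

/// Return (index, position) for every await that lacks an adjacent checkpoint.
pub fn missing_checkpoints(ops: &[Op]) -> Vec<(usize, &'static str)> {
    let mut missing = Vec::new();
    for (idx, op) in ops.iter().enumerate() {
        if *op != Op::Await {
            continue;
        }
        // The instruction immediately before must be a checkpoint.
        if idx == 0 || ops[idx - 1] != Op::Safepoint {
            missing.push((idx, "before"));
        }
        // The instruction immediately after must be a checkpoint as well.
        if idx + 1 >= ops.len() || ops[idx + 1] != Op::Safepoint {
            missing.push((idx, "after"));
        }
    }
    missing
}
```

As in the check above, only the instruction list is scanned, so an await that is the last instruction of a block is reported as missing its "after" checkpoint regardless of the terminator.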

@ -0,0 +1,108 @@
use crate::mir::{function::MirFunction, MirInstruction};
use crate::mir::verification_types::VerificationError;
/// Verify WeakRef/Barrier minimal semantics
pub fn check_weakref_and_barrier(function: &MirFunction) -> Result<(), Vec<VerificationError>> {
use crate::mir::{BasicBlockId, ValueId};
let mut errors = Vec::new();
// Build def map value -> (block, idx, &inst)
let mut def_map: std::collections::HashMap<ValueId, (BasicBlockId, usize, &MirInstruction)> = std::collections::HashMap::new();
for (bid, block) in &function.blocks {
for (idx, inst) in block.all_instructions().enumerate() {
if let Some(dst) = inst.dst_value() { def_map.insert(dst, (*bid, idx, inst)); }
}
}
for (bid, block) in &function.blocks {
for (idx, inst) in block.all_instructions().enumerate() {
match inst {
MirInstruction::WeakRef { op: crate::mir::WeakRefOp::Load, value, .. } => {
match def_map.get(value) {
Some((_db, _di, def_inst)) => match def_inst {
MirInstruction::WeakRef { op: crate::mir::WeakRefOp::New, .. } | MirInstruction::WeakNew { .. } => {}
_ => errors.push(VerificationError::InvalidWeakRefSource {
weak_ref: *value, block: *bid, instruction_index: idx,
reason: "weakref.load source is not a weakref.new/weak_new".to_string(),
}),
},
None => errors.push(VerificationError::InvalidWeakRefSource {
weak_ref: *value, block: *bid, instruction_index: idx,
reason: "weakref.load source is undefined".to_string(),
}),
}
}
MirInstruction::WeakLoad { weak_ref, .. } => {
match def_map.get(weak_ref) {
Some((_db, _di, def_inst)) => match def_inst {
MirInstruction::WeakNew { .. } | MirInstruction::WeakRef { op: crate::mir::WeakRefOp::New, .. } => {}
_ => errors.push(VerificationError::InvalidWeakRefSource {
weak_ref: *weak_ref, block: *bid, instruction_index: idx,
reason: "weak_load source is not a weak_new/weakref.new".to_string(),
}),
},
None => errors.push(VerificationError::InvalidWeakRefSource {
weak_ref: *weak_ref, block: *bid, instruction_index: idx,
reason: "weak_load source is undefined".to_string(),
}),
}
}
MirInstruction::Barrier { ptr, .. } | MirInstruction::BarrierRead { ptr } | MirInstruction::BarrierWrite { ptr } => {
if let Some((_db, _di, def_inst)) = def_map.get(ptr) {
if let MirInstruction::Const { value: crate::mir::instruction::ConstValue::Void, .. } = def_inst {
errors.push(VerificationError::InvalidBarrierPointer {
ptr: *ptr, block: *bid, instruction_index: idx,
reason: "barrier pointer is void".to_string(),
});
}
}
}
_ => {}
}
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
}
/// Light diagnostic: Barrier should be near memory ops in the same block (best-effort)
/// Enabled only when NYASH_VERIFY_BARRIER_STRICT=1
pub fn check_barrier_context(function: &MirFunction) -> Result<(), Vec<VerificationError>> {
let strict = std::env::var("NYASH_VERIFY_BARRIER_STRICT").ok().as_deref() == Some("1");
if !strict { return Ok(()); }
let mut errors = Vec::new();
for (bid, block) in &function.blocks {
let mut insts: Vec<(usize, &MirInstruction)> = block.instructions.iter().enumerate().collect();
if let Some(term) = &block.terminator { insts.push((usize::MAX, term)); }
for (idx, inst) in &insts {
let is_barrier = matches!(inst,
MirInstruction::Barrier { .. } |
MirInstruction::BarrierRead { .. } |
MirInstruction::BarrierWrite { .. }
);
if !is_barrier { continue; }
// Look around ±2 instructions for a memory op hint
let mut has_mem_neighbor = false;
for (j, other) in &insts {
if *j == *idx { continue; }
let dist = if *idx == usize::MAX || *j == usize::MAX { 99 } else { idx.max(j) - idx.min(j) };
if dist > 2 { continue; }
if matches!(other,
MirInstruction::Load { .. } |
MirInstruction::Store { .. } |
MirInstruction::ArrayGet { .. } |
MirInstruction::ArraySet { .. } |
MirInstruction::RefGet { .. } |
MirInstruction::RefSet { .. }
) { has_mem_neighbor = true; break; }
}
if !has_mem_neighbor {
errors.push(VerificationError::SuspiciousBarrierContext {
block: *bid,
instruction_index: *idx,
note: "barrier without nearby memory op (±2 inst)".to_string(),
});
}
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
}

@ -0,0 +1,64 @@
use crate::mir::function::MirFunction;
use crate::mir::{BasicBlockId, ValueId};
use crate::mir::verification_types::VerificationError;
use crate::mir::verification::utils;
use std::collections::{HashMap, HashSet};
/// Verify CFG references and reachability
pub fn check_control_flow(function: &MirFunction) -> Result<(), Vec<VerificationError>> {
let mut errors = Vec::new();
for (block_id, block) in &function.blocks {
for successor in &block.successors {
if !function.blocks.contains_key(successor) {
errors.push(VerificationError::ControlFlowError {
block: *block_id,
reason: format!("References non-existent block {}", successor),
});
}
}
}
let reachable = utils::compute_reachable_blocks(function);
for block_id in function.blocks.keys() {
if !reachable.contains(block_id) && *block_id != function.entry_block {
errors.push(VerificationError::UnreachableBlock { block: *block_id });
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
}
/// Verify that merge blocks do not use predecessor-defined values directly (must go through Phi)
pub fn check_merge_uses(function: &MirFunction) -> Result<(), Vec<VerificationError>> {
if crate::config::env::verify_allow_no_phi() { return Ok(()); }
let mut errors = Vec::new();
let preds = utils::compute_predecessors(function);
let def_block = utils::compute_def_blocks(function);
let dominators = utils::compute_dominators(function);
let mut phi_dsts_in_block: HashMap<BasicBlockId, HashSet<ValueId>> = HashMap::new();
for (bid, block) in &function.blocks {
let set = phi_dsts_in_block.entry(*bid).or_default();
for inst in block.all_instructions() {
if let crate::mir::MirInstruction::Phi { dst, .. } = inst { set.insert(*dst); }
}
}
for (bid, block) in &function.blocks {
let Some(pred_list) = preds.get(bid) else { continue };
if pred_list.len() < 2 { continue; }
let phi_dsts = phi_dsts_in_block.get(bid);
let doms_of_block = dominators.get(bid).unwrap();
for inst in block.all_instructions() {
if let crate::mir::MirInstruction::Phi { .. } = inst { continue; }
for used in inst.used_values() {
if let Some(&db) = def_block.get(&used) {
if !doms_of_block.contains(&db) {
let is_phi_dst = phi_dsts.map(|s| s.contains(&used)).unwrap_or(false);
if !is_phi_dst {
errors.push(VerificationError::MergeUsesPredecessorValue { value: used, merge_block: *bid, pred_block: db });
}
}
}
}
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
}

@ -0,0 +1,32 @@
use crate::mir::function::MirFunction;
use crate::mir::verification_types::VerificationError;
use crate::mir::verification::utils;
/// Verify dominance: def must dominate use across blocks (Phi inputs excluded)
pub fn check_dominance(function: &MirFunction) -> Result<(), Vec<VerificationError>> {
if crate::config::env::verify_allow_no_phi() { return Ok(()); }
let mut errors = Vec::new();
let def_block = utils::compute_def_blocks(function);
let dominators = utils::compute_dominators(function);
for (use_block_id, block) in &function.blocks {
for instruction in block.all_instructions() {
if let crate::mir::MirInstruction::Phi { .. } = instruction { continue; }
for used_value in instruction.used_values() {
if let Some(&def_bb) = def_block.get(&used_value) {
if def_bb != *use_block_id {
let doms = dominators.get(use_block_id).unwrap();
if !doms.contains(&def_bb) {
errors.push(VerificationError::DominatorViolation {
value: used_value,
use_block: *use_block_id,
def_block: def_bb,
});
}
}
}
}
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
}

@ -0,0 +1,39 @@
use crate::mir::{function::MirFunction, MirInstruction};
use crate::mir::verification_types::VerificationError;
/// Reject legacy instructions that should be rewritten to Core-15 equivalents
/// Skips check when NYASH_VERIFY_ALLOW_LEGACY=1
pub fn check_no_legacy_ops(function: &MirFunction) -> Result<(), Vec<VerificationError>> {
if std::env::var("NYASH_VERIFY_ALLOW_LEGACY").ok().as_deref() == Some("1") {
return Ok(());
}
let mut errors = Vec::new();
for (bid, block) in &function.blocks {
for (idx, inst) in block.all_instructions().enumerate() {
let legacy_name = match inst {
MirInstruction::TypeCheck { .. } => Some("TypeCheck"), // -> TypeOp(Check)
MirInstruction::Cast { .. } => Some("Cast"), // -> TypeOp(Cast)
MirInstruction::WeakNew { .. } => Some("WeakNew"), // -> WeakRef(New)
MirInstruction::WeakLoad { .. } => Some("WeakLoad"), // -> WeakRef(Load)
MirInstruction::BarrierRead { .. } => Some("BarrierRead"), // -> Barrier(Read)
MirInstruction::BarrierWrite { .. } => Some("BarrierWrite"), // -> Barrier(Write)
MirInstruction::Print { .. } => Some("Print"), // -> ExternCall(env.console.log)
MirInstruction::ArrayGet { .. } => Some("ArrayGet"), // -> BoxCall("get")
MirInstruction::ArraySet { .. } => Some("ArraySet"), // -> BoxCall("set")
MirInstruction::RefGet { .. } => Some("RefGet"), // -> BoxCall("getField")
MirInstruction::RefSet { .. } => Some("RefSet"), // -> BoxCall("setField")
MirInstruction::PluginInvoke { .. } => Some("PluginInvoke"), // -> BoxCall
_ => None,
};
if let Some(name) = legacy_name {
errors.push(VerificationError::UnsupportedLegacyInstruction {
block: *bid,
instruction_index: idx,
name: name.to_string(),
});
}
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
}

@ -0,0 +1,41 @@
use crate::mir::function::MirFunction;
use crate::mir::ValueId;
use crate::mir::verification_types::VerificationError;
/// Verify SSA form: single assignment and all uses defined
pub fn check_ssa_form(function: &MirFunction) -> Result<(), Vec<VerificationError>> {
use std::collections::HashMap;
let mut errors = Vec::new();
let mut definitions: HashMap<ValueId, (crate::mir::BasicBlockId, usize)> = HashMap::new();
for (block_id, block) in &function.blocks {
for (inst_idx, instruction) in block.all_instructions().enumerate() {
if let Some(dst) = instruction.dst_value() {
if let Some((first_block, _)) = definitions.insert(dst, (*block_id, inst_idx)) {
errors.push(VerificationError::MultipleDefinition {
value: dst,
first_block,
second_block: *block_id,
});
}
}
}
}
for (block_id, block) in &function.blocks {
for (inst_idx, instruction) in block.all_instructions().enumerate() {
for used_value in instruction.used_values() {
if !definitions.contains_key(&used_value) {
errors.push(VerificationError::UndefinedValue {
value: used_value,
block: *block_id,
instruction_index: inst_idx,
});
}
}
}
}
if errors.is_empty() { Ok(()) } else { Err(errors) }
}
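The SSA check above runs in two passes: first record every definition and flag duplicates, then flag every use that has no definition anywhere in the function. A minimal sketch under simplified assumptions (`Inst` and `SsaError` are hypothetical miniature types, and blocks are a plain slice rather than the real `MirFunction`):

```rust
use std::collections::HashMap;

// Hypothetical miniature instruction: an optional destination plus the values it reads.
struct Inst {
    dst: Option<u32>,
    uses: Vec<u32>,
}

#[derive(Debug, PartialEq)]
enum SsaError {
    MultipleDefinition { value: u32, first_block: usize, second_block: usize },
    UndefinedValue { value: u32, block: usize, instruction_index: usize },
}

fn check_ssa(blocks: &[Vec<Inst>]) -> Result<(), Vec<SsaError>> {
    let mut errors = Vec::new();
    let mut defs: HashMap<u32, usize> = HashMap::new();

    // Pass 1: record the defining block of each value; `insert` returns the
    // previous entry, which signals a second definition.
    for (block, insts) in blocks.iter().enumerate() {
        for inst in insts {
            if let Some(dst) = inst.dst {
                if let Some(first_block) = defs.insert(dst, block) {
                    errors.push(SsaError::MultipleDefinition {
                        value: dst,
                        first_block,
                        second_block: block,
                    });
                }
            }
        }
    }

    // Pass 2: every used value must have been defined somewhere.
    for (block, insts) in blocks.iter().enumerate() {
        for (instruction_index, inst) in insts.iter().enumerate() {
            for &value in &inst.uses {
                if !defs.contains_key(&value) {
                    errors.push(SsaError::UndefinedValue { value, block, instruction_index });
                }
            }
        }
    }

    if errors.is_empty() { Ok(()) } else { Err(errors) }
}
```

Because `insert` overwrites the stored entry, a third definition of the same value would report the second definition's block as `first_block`. The sketch shares that property with the code above. Note also that this check only asks whether a definition exists; whether it dominates the use is a separate check.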


@ -0,0 +1,86 @@
use crate::mir::{function::MirFunction, BasicBlockId, ValueId};
use std::collections::{HashMap, HashSet};
pub fn compute_predecessors(function: &MirFunction) -> HashMap<BasicBlockId, Vec<BasicBlockId>> {
let mut preds: HashMap<BasicBlockId, Vec<BasicBlockId>> = HashMap::new();
for (bid, block) in &function.blocks {
for succ in &block.successors {
preds.entry(*succ).or_default().push(*bid);
}
}
preds
}
pub fn compute_def_blocks(function: &MirFunction) -> HashMap<ValueId, BasicBlockId> {
let mut def_block: HashMap<ValueId, BasicBlockId> = HashMap::new();
for (bid, block) in &function.blocks {
for inst in block.all_instructions() {
if let Some(dst) = inst.dst_value() { def_block.insert(dst, *bid); }
}
}
def_block
}
pub fn compute_dominators(function: &MirFunction) -> HashMap<BasicBlockId, HashSet<BasicBlockId>> {
let all_blocks: HashSet<BasicBlockId> = function.blocks.keys().copied().collect();
let preds = compute_predecessors(function);
let mut dom: HashMap<BasicBlockId, HashSet<BasicBlockId>> = HashMap::new();
for &b in function.blocks.keys() {
if b == function.entry_block {
let mut set = HashSet::new();
set.insert(b);
dom.insert(b, set);
} else {
dom.insert(b, all_blocks.clone());
}
}
let mut changed = true;
while changed {
changed = false;
for &b in function.blocks.keys() {
if b == function.entry_block { continue; }
let mut new_set = all_blocks.clone();
if let Some(p_list) = preds.get(&b) {
for p in p_list {
if let Some(p_dom) = dom.get(p) {
new_set = new_set.intersection(p_dom).copied().collect();
}
}
}
new_set.insert(b);
let cur = dom.get(&b).unwrap();
if &new_set != cur { dom.insert(b, new_set); changed = true; }
}
}
dom
}
pub fn compute_reachable_blocks(function: &MirFunction) -> HashSet<BasicBlockId> {
let mut reachable = HashSet::new();
let mut worklist = vec![function.entry_block];
while let Some(current) = worklist.pop() {
if reachable.insert(current) {
if let Some(block) = function.blocks.get(&current) {
for successor in &block.successors {
if !reachable.contains(successor) {
worklist.push(*successor);
}
}
for instruction in &block.instructions {
if let crate::mir::MirInstruction::Catch { handler_bb, .. } = instruction {
if !reachable.contains(handler_bb) { worklist.push(*handler_bb); }
}
}
if let Some(ref terminator) = block.terminator {
if let crate::mir::MirInstruction::Catch { handler_bb, .. } = terminator {
if !reachable.contains(handler_bb) { worklist.push(*handler_bb); }
}
}
}
}
}
reachable
}
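The dominator computation above is the classic iterative data-flow formulation: every non-entry block starts with the full block set, and each round replaces it with the intersection of its predecessors' sets plus the block itself, until nothing changes. A small sketch on a plain successor map (the `Cfg` alias and function names are illustrative; the sketch uses `BTreeMap`/`BTreeSet` for deterministic iteration and omits the `Catch` handler edges that the real reachability walk follows):

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical CFG representation: block id -> successor ids.
type Cfg = BTreeMap<u32, Vec<u32>>;

fn predecessors(cfg: &Cfg) -> BTreeMap<u32, Vec<u32>> {
    let mut preds: BTreeMap<u32, Vec<u32>> = BTreeMap::new();
    for (&b, succs) in cfg {
        for &s in succs {
            preds.entry(s).or_default().push(b);
        }
    }
    preds
}

fn dominators(cfg: &Cfg, entry: u32) -> BTreeMap<u32, BTreeSet<u32>> {
    let all: BTreeSet<u32> = cfg.keys().copied().collect();
    let preds = predecessors(cfg);
    // Entry is dominated only by itself; everything else starts at "all blocks".
    let mut dom: BTreeMap<u32, BTreeSet<u32>> = cfg
        .keys()
        .map(|&b| (b, if b == entry { BTreeSet::from([b]) } else { all.clone() }))
        .collect();
    let mut changed = true;
    while changed {
        changed = false;
        for &b in cfg.keys() {
            if b == entry {
                continue;
            }
            // Intersect the dominator sets of all predecessors, then add the block itself.
            let mut new_set = all.clone();
            for p in preds.get(&b).into_iter().flatten() {
                if let Some(p_dom) = dom.get(p) {
                    new_set = new_set.intersection(p_dom).copied().collect();
                }
            }
            new_set.insert(b);
            if dom[&b] != new_set {
                dom.insert(b, new_set);
                changed = true;
            }
        }
    }
    dom
}

fn reachable(cfg: &Cfg, entry: u32) -> BTreeSet<u32> {
    let mut seen = BTreeSet::new();
    let mut worklist = vec![entry];
    while let Some(cur) = worklist.pop() {
        if seen.insert(cur) {
            for &s in cfg.get(&cur).into_iter().flatten() {
                if !seen.contains(&s) {
                    worklist.push(s);
                }
            }
        }
    }
    seen
}
```

On a diamond `0 -> {1, 2} -> 3`, block 3 ends up dominated by `{0, 3}`. A block with no predecessors other than the entry keeps the full set, so an unreachable predecessor does not shrink the result for its successors.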


@ -0,0 +1,23 @@
/*!
* Verification types (extracted from verification.rs)
*/
use super::{BasicBlockId, ValueId};
/// Verification error types
#[derive(Debug, Clone, PartialEq)]
pub enum VerificationError {
UndefinedValue { value: ValueId, block: BasicBlockId, instruction_index: usize },
MultipleDefinition { value: ValueId, first_block: BasicBlockId, second_block: BasicBlockId },
InvalidPhi { phi_value: ValueId, block: BasicBlockId, reason: String },
UnreachableBlock { block: BasicBlockId },
ControlFlowError { block: BasicBlockId, reason: String },
DominatorViolation { value: ValueId, use_block: BasicBlockId, def_block: BasicBlockId },
MergeUsesPredecessorValue { value: ValueId, merge_block: BasicBlockId, pred_block: BasicBlockId },
InvalidWeakRefSource { weak_ref: ValueId, block: BasicBlockId, instruction_index: usize, reason: String },
InvalidBarrierPointer { ptr: ValueId, block: BasicBlockId, instruction_index: usize, reason: String },
SuspiciousBarrierContext { block: BasicBlockId, instruction_index: usize, note: String },
UnsupportedLegacyInstruction { block: BasicBlockId, instruction_index: usize, name: String },
MissingCheckpointAroundAwait { block: BasicBlockId, instruction_index: usize, position: &'static str },
}


@ -0,0 +1,33 @@
use crate::parser::{NyashParser, ParseError};
use crate::parser::common::ParserUtils;
use crate::tokenizer::TokenType;
use crate::ast::{ASTNode, Span};
#[inline]
fn is_sugar_enabled() -> bool { crate::parser::sugar_gate::is_enabled() }
impl NyashParser {
pub(crate) fn expr_parse_coalesce(&mut self) -> Result<ASTNode, ParseError> {
let mut expr = self.expr_parse_or()?;
while self.match_token(&TokenType::QmarkQmark) {
if !is_sugar_enabled() {
let line = self.current_token().line;
return Err(ParseError::UnexpectedToken {
found: self.current_token().token_type.clone(),
expected: "enable NYASH_SYNTAX_SUGAR_LEVEL=basic|full for '??'".to_string(),
line,
});
}
self.advance();
let rhs = self.expr_parse_or()?;
let scr = expr;
expr = ASTNode::PeekExpr {
scrutinee: Box::new(scr.clone()),
arms: vec![(crate::ast::LiteralValue::Null, rhs)],
else_expr: Box::new(scr),
span: Span::unknown(),
};
}
Ok(expr)
}
}
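The `??` handling above is a pure desugaring: `x ?? y` becomes `peek x { null => y, else => x }`, and the `while` loop folds a chain left to right. A minimal sketch with a hypothetical miniature AST (`Expr`, `desugar_coalesce`, and `desugar_chain` are illustrative names, not the real `ASTNode` API):

```rust
// Hypothetical miniature AST; the real parser builds ASTNode::PeekExpr.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(String),
    Null,
    Peek { scrutinee: Box<Expr>, null_arm: Box<Expr>, else_expr: Box<Expr> },
}

/// Desugars `lhs ?? rhs` into `peek lhs { null => rhs, else => lhs }`.
/// The left-hand side is cloned because it appears both as scrutinee and as else value.
fn desugar_coalesce(lhs: Expr, rhs: Expr) -> Expr {
    Expr::Peek {
        scrutinee: Box::new(lhs.clone()),
        null_arm: Box::new(rhs),
        else_expr: Box::new(lhs),
    }
}

/// Folds a chain `a ?? b ?? c` left to right, mirroring the parser's while loop.
fn desugar_chain(first: Expr, rest: Vec<Expr>) -> Expr {
    rest.into_iter().fold(first, desugar_coalesce)
}
```

Since the left operand is duplicated in the tree, a left operand with side effects could in principle be evaluated twice, depending on how the later lowering treats the two copies. The diff does not show that lowering, so this is only something to keep in mind.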

src/parser/expr/logic.rs Normal file

@ -0,0 +1,36 @@
use crate::parser::{NyashParser, ParseError};
use crate::parser::common::ParserUtils;
use crate::tokenizer::TokenType;
use crate::ast::{ASTNode, BinaryOperator, Span};
impl NyashParser {
pub(crate) fn expr_parse_or(&mut self) -> Result<ASTNode, ParseError> {
let mut expr = self.expr_parse_and()?;
while self.match_token(&TokenType::OR) {
let operator = BinaryOperator::Or;
self.advance();
let right = self.expr_parse_and()?;
if std::env::var("NYASH_GRAMMAR_DIFF").ok().as_deref() == Some("1") {
let ok = crate::grammar::engine::get().syntax_is_allowed_binop("or");
if !ok { eprintln!("[GRAMMAR-DIFF][Parser] binop 'or' not allowed by syntax rules"); }
}
expr = ASTNode::BinaryOp { operator, left: Box::new(expr), right: Box::new(right), span: Span::unknown() };
}
Ok(expr)
}
pub(crate) fn expr_parse_and(&mut self) -> Result<ASTNode, ParseError> {
let mut expr = self.parse_bit_or()?;
while self.match_token(&TokenType::AND) {
let operator = BinaryOperator::And;
self.advance();
let right = self.parse_equality()?;
if std::env::var("NYASH_GRAMMAR_DIFF").ok().as_deref() == Some("1") {
let ok = crate::grammar::engine::get().syntax_is_allowed_binop("and");
if !ok { eprintln!("[GRAMMAR-DIFF][Parser] binop 'and' not allowed by syntax rules"); }
}
expr = ASTNode::BinaryOp { operator, left: Box::new(expr), right: Box::new(right), span: Span::unknown() };
}
Ok(expr)
}
}
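Both functions above use the standard loop for left-associative binary operators: parse one operand at the next-tighter precedence level, then keep wrapping the accumulated expression while the operator token repeats. A self-contained sketch with a hypothetical token and AST type (`Tok`, `Ast`, and this `Parser` are illustrative and unrelated to `NyashParser`):

```rust
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Ident(&'static str),
    Or,
    And,
}

#[derive(Debug, PartialEq)]
enum Ast {
    Ident(&'static str),
    Bin { op: &'static str, left: Box<Ast>, right: Box<Ast> },
}

struct Parser {
    toks: Vec<Tok>,
    pos: usize,
}

impl Parser {
    fn peek(&self) -> Option<&Tok> {
        self.toks.get(self.pos)
    }

    fn parse_primary(&mut self) -> Result<Ast, String> {
        match self.peek().cloned() {
            Some(Tok::Ident(name)) => {
                self.pos += 1;
                Ok(Ast::Ident(name))
            }
            other => Err(format!("expected identifier, found {:?}", other)),
        }
    }

    // Tighter level: a && b && c  =>  ((a && b) && c)
    fn parse_and(&mut self) -> Result<Ast, String> {
        let mut expr = self.parse_primary()?;
        while self.peek() == Some(&Tok::And) {
            self.pos += 1;
            let right = self.parse_primary()?;
            expr = Ast::Bin { op: "and", left: Box::new(expr), right: Box::new(right) };
        }
        Ok(expr)
    }

    // Looser level: operands are parsed at the `and` level on both sides.
    fn parse_or(&mut self) -> Result<Ast, String> {
        let mut expr = self.parse_and()?;
        while self.peek() == Some(&Tok::Or) {
            self.pos += 1;
            let right = self.parse_and()?;
            expr = Ast::Bin { op: "or", left: Box::new(expr), right: Box::new(right) };
        }
        Ok(expr)
    }
}
```

The sketch parses both operands of each operator at the same sub-level. In `expr_parse_and` above, the left operand goes through `parse_bit_or` while the right goes through `parse_equality`; the removed `parse_and` had the same shape, so the refactor preserves it. Whether that asymmetry is intentional is not stated in the commit.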

src/parser/expr/mod.rs Normal file

@ -0,0 +1,4 @@
pub(crate) mod ternary;
pub(crate) mod coalesce;
pub(crate) mod logic;


@ -0,0 +1,26 @@
use crate::parser::{NyashParser, ParseError};
use crate::parser::common::ParserUtils;
use crate::tokenizer::TokenType;
use crate::ast::{ASTNode, Span};
#[inline]
fn is_sugar_enabled() -> bool { crate::parser::sugar_gate::is_enabled() }
impl NyashParser {
pub(crate) fn expr_parse_ternary(&mut self) -> Result<ASTNode, ParseError> {
let cond = self.expr_parse_coalesce()?;
if self.match_token(&TokenType::QUESTION) {
self.advance();
let then_expr = self.parse_expression()?;
self.consume(TokenType::COLON)?;
let else_expr = self.parse_expression()?;
return Ok(ASTNode::If {
condition: Box::new(cond),
then_body: vec![then_expr],
else_body: Some(vec![else_expr]),
span: Span::unknown(),
});
}
Ok(cond)
}
}
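The ternary handling above lowers `cond ? a : b` into an `If` node whose branches each hold a single expression. Because the else branch is parsed with the full expression parser, a nested ternary lands inside the else body. A minimal sketch with a hypothetical node type (`Node` and `lower_ternary` are illustrative, not the real `ASTNode`):

```rust
// Hypothetical miniature AST standing in for ASTNode::If.
#[derive(Debug, PartialEq)]
enum Node {
    Var(&'static str),
    If { condition: Box<Node>, then_body: Vec<Node>, else_body: Option<Vec<Node>> },
}

/// Lowers `cond ? then_expr : else_expr` to an If node with one-element bodies.
fn lower_ternary(cond: Node, then_expr: Node, else_expr: Node) -> Node {
    Node::If {
        condition: Box::new(cond),
        then_body: vec![then_expr],
        else_body: Some(vec![else_expr]),
    }
}
```

With this shape, `a ? b : (c ? d : e)` is simply `lower_ternary(a, b, lower_ternary(c, d, e))`.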


@ -85,99 +85,19 @@ impl NyashParser {
/// Ternary operator: cond ? then : else
/// Grammar (Phase 12.7): TernaryExpr = NullsafeExpr ( "?" Expr ":" Expr )?
/// Implementation: sits one level above coalesce and lowers `cond ? a : b` to an If expression.
fn parse_ternary(&mut self) -> Result<ASTNode, ParseError> {
let cond = self.parse_coalesce()?;
if self.match_token(&TokenType::QUESTION) {
// consume '?' and parse then/else expressions
self.advance();
let then_expr = self.parse_expression()?;
self.consume(TokenType::COLON)?; // ':'
let else_expr = self.parse_expression()?;
// Lower to If-expression AST (converted to Phi on the builder side)
return Ok(ASTNode::If {
condition: Box::new(cond),
then_body: vec![then_expr],
else_body: Some(vec![else_expr]),
span: Span::unknown(),
});
}
Ok(cond)
}
fn parse_ternary(&mut self) -> Result<ASTNode, ParseError> { self.expr_parse_ternary() }
/// Default value (??): x ?? y => peek x { null => y, else => x }
fn parse_coalesce(&mut self) -> Result<ASTNode, ParseError> {
let mut expr = self.parse_or()?;
while self.match_token(&TokenType::QmarkQmark) {
if !is_sugar_enabled() {
let line = self.current_token().line;
return Err(ParseError::UnexpectedToken {
found: self.current_token().token_type.clone(),
expected: "enable NYASH_SYNTAX_SUGAR_LEVEL=basic|full for '??'".to_string(),
line,
});
}
self.advance(); // consume '??'
let rhs = self.parse_or()?;
let scr = expr;
expr = ASTNode::PeekExpr {
scrutinee: Box::new(scr.clone()),
arms: vec![(crate::ast::LiteralValue::Null, rhs)],
else_expr: Box::new(scr),
span: Span::unknown(),
};
}
Ok(expr)
}
fn parse_coalesce(&mut self) -> Result<ASTNode, ParseError> { self.expr_parse_coalesce() }
/// Parse the OR operator: ||
fn parse_or(&mut self) -> Result<ASTNode, ParseError> {
let mut expr = self.parse_and()?;
while self.match_token(&TokenType::OR) {
let operator = BinaryOperator::Or;
self.advance();
let right = self.parse_and()?;
// Non-invasive syntax diff: record binop
if std::env::var("NYASH_GRAMMAR_DIFF").ok().as_deref() == Some("1") {
let ok = crate::grammar::engine::get().syntax_is_allowed_binop("or");
if !ok { eprintln!("[GRAMMAR-DIFF][Parser] binop 'or' not allowed by syntax rules"); }
}
expr = ASTNode::BinaryOp {
operator,
left: Box::new(expr),
right: Box::new(right),
span: Span::unknown(),
};
}
Ok(expr)
}
fn parse_or(&mut self) -> Result<ASTNode, ParseError> { self.expr_parse_or() }
/// Parse the AND operator: &&
fn parse_and(&mut self) -> Result<ASTNode, ParseError> {
let mut expr = self.parse_bit_or()?;
while self.match_token(&TokenType::AND) {
let operator = BinaryOperator::And;
self.advance();
let right = self.parse_equality()?;
if std::env::var("NYASH_GRAMMAR_DIFF").ok().as_deref() == Some("1") {
let ok = crate::grammar::engine::get().syntax_is_allowed_binop("and");
if !ok { eprintln!("[GRAMMAR-DIFF][Parser] binop 'and' not allowed by syntax rules"); }
}
expr = ASTNode::BinaryOp {
operator,
left: Box::new(expr),
right: Box::new(right),
span: Span::unknown(),
};
}
Ok(expr)
}
fn parse_and(&mut self) -> Result<ASTNode, ParseError> { self.expr_parse_and() }
/// Bitwise OR: |
fn parse_bit_or(&mut self) -> Result<ASTNode, ParseError> {
pub(crate) fn parse_bit_or(&mut self) -> Result<ASTNode, ParseError> {
let mut expr = self.parse_bit_xor()?;
while self.match_token(&TokenType::BitOr) {
let operator = BinaryOperator::BitOr;
@ -213,7 +133,7 @@ impl NyashParser {
}
/// Parse the equality operators: == !=
fn parse_equality(&mut self) -> Result<ASTNode, ParseError> {
pub(crate) fn parse_equality(&mut self) -> Result<ASTNode, ParseError> {
let mut expr = self.parse_comparison()?;
while self.match_token(&TokenType::EQUALS) || self.match_token(&TokenType::NotEquals) {


@ -19,6 +19,7 @@
// Submodule declarations
mod common;
mod expressions;
mod expr;
mod statements;
mod declarations;
mod items;


@ -0,0 +1,56 @@
/*!
* CLI Directives Scanner — early source comments to env plumbing
*
* Supports lightweight, file-scoped directives placed in the first lines
* of a Nyash source file. Current directives:
* - // @env KEY=VALUE → export KEY=VALUE into process env
* - // @plugin-builtins → NYASH_USE_PLUGIN_BUILTINS=1
* - // @jit-debug → enable common JIT debug flags (no-op if JIT unused)
* - // @jit-strict → strict JIT flags (no VM fallback) for experiments
*
* Also runs the "fields-at-top" lint delegated to pipeline::lint_fields_top.
*/
pub(super) fn apply_cli_directives_from_source(
code: &str,
strict_fields: bool,
verbose: bool,
) -> Result<(), String> {
// Scan only the header area (up to the first non-comment content line)
for (i, line) in code.lines().take(128).enumerate() {
let l = line.trim();
if !(l.starts_with("//") || l.starts_with("#!") || l.is_empty()) {
if i > 0 { break; }
}
if let Some(rest) = l.strip_prefix("//") {
let rest = rest.trim();
if let Some(dir) = rest.strip_prefix("@env ") {
if let Some((k, v)) = dir.split_once('=') {
let key = k.trim();
let val = v.trim();
if !key.is_empty() { std::env::set_var(key, val); }
}
} else if rest == "@plugin-builtins" {
std::env::set_var("NYASH_USE_PLUGIN_BUILTINS", "1");
} else if rest == "@jit-debug" {
// Safe even if JIT is disabled elsewhere; treated as no-op flags
std::env::set_var("NYASH_JIT_EXEC", "1");
std::env::set_var("NYASH_JIT_THRESHOLD", "1");
std::env::set_var("NYASH_JIT_EVENTS", "1");
std::env::set_var("NYASH_JIT_EVENTS_COMPILE", "1");
std::env::set_var("NYASH_JIT_EVENTS_RUNTIME", "1");
std::env::set_var("NYASH_JIT_SHIM_TRACE", "1");
} else if rest == "@jit-strict" {
std::env::set_var("NYASH_JIT_STRICT", "1");
std::env::set_var("NYASH_JIT_ARGS_HANDLE_ONLY", "1");
if std::env::var("NYASH_JIT_ONLY").ok().is_none() {
std::env::set_var("NYASH_JIT_ONLY", "1");
}
}
}
}
// Lint: enforce fields at top-of-box (delegated)
super::pipeline::lint_fields_top(code, strict_fields, verbose)
}
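The directive scanner above walks the header of a source file and turns comment directives into environment assignments. A side-effect-free sketch of the same scanning rules, assuming a hypothetical `scan_directives` that returns the key/value pairs instead of calling `std::env::set_var` (only `@env` and `@plugin-builtins` are modeled; the JIT directives and the lint call are left out):

```rust
/// Scans the header of `code` and returns the env assignments its directives imply.
/// Hypothetical helper: the real function applies them to the process environment.
fn scan_directives(code: &str) -> Vec<(String, String)> {
    let mut out = Vec::new();
    for (i, line) in code.lines().take(128).enumerate() {
        let l = line.trim();
        let is_header = l.starts_with("//") || l.starts_with("#!") || l.is_empty();
        // Same rule as the original: a non-header line ends the scan,
        // except when it is the very first line of the file.
        if !is_header && i > 0 {
            break;
        }
        if let Some(rest) = l.strip_prefix("//") {
            let rest = rest.trim();
            if let Some(dir) = rest.strip_prefix("@env ") {
                if let Some((k, v)) = dir.split_once('=') {
                    let key = k.trim();
                    if !key.is_empty() {
                        out.push((key.to_string(), v.trim().to_string()));
                    }
                }
            } else if rest == "@plugin-builtins" {
                out.push(("NYASH_USE_PLUGIN_BUILTINS".to_string(), "1".to_string()));
            }
        }
    }
    out
}
```

Returning the pairs makes the scanning rules easy to exercise in isolation. One consequence of the `i > 0` guard, shared with the code above, is that a code line on line 1 does not stop the scan, so a directive comment on line 2 is still honored.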


@ -21,6 +21,8 @@ mod json_v0_bridge;
mod mir_json_emit;
mod pipe_io;
mod pipeline;
mod cli_directives;
mod trace;
mod box_index;
mod tasks;
mod build;
@ -123,42 +125,10 @@ impl NyashRunner {
// // @plugin-builtins (NYASH_USE_PLUGIN_BUILTINS=1)
if let Some(ref filename) = self.config.file {
if let Ok(code) = fs::read_to_string(filename) {
// Scan first 128 lines for directives
for (i, line) in code.lines().take(128).enumerate() {
let l = line.trim();
if !(l.starts_with("//") || l.starts_with("#!") || l.is_empty()) {
// Stop early at first non-comment line to avoid scanning full file
if i > 0 { break; }
}
// Shebang with envs: handled by shell normally; keep placeholder
if let Some(rest) = l.strip_prefix("//") { let rest = rest.trim();
if let Some(dir) = rest.strip_prefix("@env ") {
if let Some((k,v)) = dir.split_once('=') {
let key = k.trim(); let val = v.trim();
if !key.is_empty() { std::env::set_var(key, val); }
}
} else if rest == "@jit-debug" {
std::env::set_var("NYASH_JIT_EXEC", "1");
std::env::set_var("NYASH_JIT_THRESHOLD", "1");
std::env::set_var("NYASH_JIT_EVENTS", "1");
std::env::set_var("NYASH_JIT_EVENTS_COMPILE", "1");
std::env::set_var("NYASH_JIT_EVENTS_RUNTIME", "1");
std::env::set_var("NYASH_JIT_SHIM_TRACE", "1");
} else if rest == "@plugin-builtins" {
std::env::set_var("NYASH_USE_PLUGIN_BUILTINS", "1");
} else if rest == "@jit-strict" {
std::env::set_var("NYASH_JIT_STRICT", "1");
std::env::set_var("NYASH_JIT_ARGS_HANDLE_ONLY", "1");
// In strict mode, default to JIT-only (no VM fallback)
if std::env::var("NYASH_JIT_ONLY").ok().is_none() { std::env::set_var("NYASH_JIT_ONLY", "1"); }
}
}
}
// Lint: fields must be at top of box
// Apply script-level directives and lint
let strict_fields = std::env::var("NYASH_FIELDS_TOP_STRICT").ok().as_deref() == Some("1");
if let Err(e) = pipeline::lint_fields_top(&code, strict_fields, self.config.cli_verbose) {
eprintln!("❌ Lint error: {}", e);
if let Err(e) = cli_directives::apply_cli_directives_from_source(&code, strict_fields, self.config.cli_verbose) {
eprintln!("❌ Lint/Directive error: {}", e);
std::process::exit(1);
}


@ -44,7 +44,7 @@ impl NyashRunner {
#[cfg(feature = "llvm-harness")]
{
// Harness path (optional): if NYASH_LLVM_USE_HARNESS=1, try Python/llvmlite first.
let use_harness = std::env::var("NYASH_LLVM_USE_HARNESS").ok().as_deref() == Some("1");
let use_harness = crate::config::env::llvm_use_harness();
if use_harness {
if let Some(parent) = std::path::Path::new(&_out_path).parent() { let _ = std::fs::create_dir_all(parent); }
let py = which::which("python3").ok();


@ -158,12 +158,12 @@ pub(super) fn resolve_using_target(
format!("{}|{}|{}|{}", tgt, base, strict as i32, using_paths.join(":"))
};
if let Some(hit) = crate::runner::box_index::cache_get(&key) {
if trace { eprintln!("[using/cache] '{}' -> '{}'", tgt, hit); }
if trace { crate::runner::trace::log(format!("[using/cache] '{}' -> '{}'", tgt, hit)); }
return Ok(hit);
}
// Resolve aliases early (provided map)
if let Some(v) = aliases.get(tgt) {
if trace { eprintln!("[using/resolve] alias '{}' -> '{}'", tgt, v); }
if trace { crate::runner::trace::log(format!("[using/resolve] alias '{}' -> '{}'", tgt, v)); }
crate::runner::box_index::cache_put(&key, v.clone());
return Ok(v.clone());
}
@ -173,7 +173,7 @@ pub(super) fn resolve_using_target(
if let Some((k,v)) = ent.split_once('=') {
if k.trim() == tgt {
let out = v.trim().to_string();
if trace { eprintln!("[using/resolve] env-alias '{}' -> '{}'", tgt, out); }
if trace { crate::runner::trace::log(format!("[using/resolve] env-alias '{}' -> '{}'", tgt, out)); }
crate::runner::box_index::cache_put(&key, out.clone());
return Ok(out);
}
@ -183,7 +183,7 @@ pub(super) fn resolve_using_target(
// 1) modules mapping
if let Some((_, p)) = modules.iter().find(|(n, _)| n == tgt) {
let out = p.clone();
if trace { eprintln!("[using/resolve] modules '{}' -> '{}'", tgt, out); }
if trace { crate::runner::trace::log(format!("[using/resolve] modules '{}' -> '{}'", tgt, out)); }
crate::runner::box_index::cache_put(&key, out.clone());
return Ok(out);
}
@ -204,9 +204,13 @@ pub(super) fn resolve_using_target(
if cands.len() < 5 { suggest_in_base("lib", leaf, &mut cands); }
if cands.len() < 5 { suggest_in_base(".", leaf, &mut cands); }
if cands.is_empty() {
eprintln!("[using] unresolved '{}' (searched: rel+paths)", tgt);
crate::runner::trace::log(format!("[using] unresolved '{}' (searched: rel+paths)", tgt));
} else {
eprintln!("[using] unresolved '{}' (searched: rel+paths) candidates: {}", tgt, cands.join(", "));
crate::runner::trace::log(format!(
"[using] unresolved '{}' (searched: rel+paths) candidates: {}",
tgt,
cands.join(", ")
));
}
}
return Ok(tgt.to_string());
@ -215,7 +219,7 @@ pub(super) fn resolve_using_target(
return Err(format!("ambiguous using '{}': {}", tgt, cand.join(", ")));
}
let out = cand.remove(0);
if trace { eprintln!("[using/resolve] '{}' -> '{}'", tgt, out); }
if trace { crate::runner::trace::log(format!("[using/resolve] '{}' -> '{}'", tgt, out)); }
crate::runner::box_index::cache_put(&key, out.clone());
Ok(out)
}

src/runner/trace.rs Normal file

@ -0,0 +1,18 @@
/*!
* Runner trace utilities — centralized verbose logging
*/
/// Returns true when runner-level verbose tracing is enabled.
/// Controlled by `NYASH_CLI_VERBOSE=1` or `NYASH_RESOLVE_TRACE=1`.
pub fn enabled() -> bool {
std::env::var("NYASH_CLI_VERBOSE").ok().as_deref() == Some("1")
|| std::env::var("NYASH_RESOLVE_TRACE").ok().as_deref() == Some("1")
}
/// Emit a single-line trace message when enabled.
pub fn log<S: AsRef<str>>(msg: S) {
if enabled() {
eprintln!("{}", msg.as_ref());
}
}
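The trace helper above gates output on two environment variables, each of which must be exactly `1`. A sketch of the same predicate with the environment lookup injected as a closure, so it can be exercised without touching process state (`trace_enabled` and `trace_line` are hypothetical names, and the sketch returns the message rather than writing to stderr):

```rust
/// Mirrors the `enabled()` predicate; `lookup` stands in for reading an env var.
fn trace_enabled<F: Fn(&str) -> Option<String>>(lookup: F) -> bool {
    lookup("NYASH_CLI_VERBOSE").as_deref() == Some("1")
        || lookup("NYASH_RESOLVE_TRACE").as_deref() == Some("1")
}

/// Returns the line that would be emitted, or None when tracing is off.
fn trace_line<F: Fn(&str) -> Option<String>>(lookup: F, msg: &str) -> Option<String> {
    if trace_enabled(lookup) {
        Some(msg.to_string())
    } else {
        None
    }
}
```

A caller wanting the real behavior could pass `|k| std::env::var(k).ok()` as the lookup. In the `resolve_using_target` hunks earlier in this diff, most call sites still check the local `trace` flag before calling `log`, so those messages need both the flag and one of the two variables; the `format!` argument is built before `log` decides whether to print.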