feat(mir/phi): improve LoopForm parameter detection - track param names

**Problem**: is_parameter() was too simple, checking only ValueId which changes through copies/PHIs. This caused parameters like 'data' to be misclassified as carriers, leading to incorrect PHI construction. **Solution**: Track original parameter names at function entry. **Changes**: 1. **Added function_param_names field** (builder.rs): - HashSet<String> to track original parameter names - Populated in lower_static_method_as_function() - Cleared and repopulated for each new function 2. **Improved is_parameter()** (loop_builder.rs): - Check name against function_param_names instead of ValueId - More reliable than checking func.params (ValueIds change) - __pin$*$@* variables correctly classified as carriers - Added debug logging with NYASH_LOOPFORM_DEBUG 3. **Enhanced debug output** (loopform_builder.rs): - Show carrier/pinned classification during prepare_structure() - Show variable_map state after emit_header_phis() **Test Results**: - ✅ 'args' correctly identified as parameter (was working) - ✅ 'data' now correctly identified as parameter (was broken) - ✅ __pin variables correctly classified as carriers - ✅ PHI values allocated and variable_map updated correctly - ⚠️ ValueId undefined errors persist (separate issue) **Remaining Issue**: ValueId(10) undefined error suggests PHI visibility problem or VM verification issue. Needs further investigation of emit_phi_at_block_start() or VM executor. **Backward Compatibility**: - Flag OFF: 100% existing behavior preserved (legacy path unchanged) - Feature-flagged with NYASH_LOOPFORM_PHI_V2=1 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 05:24:07 +09:00
parent f85e485195
commit c459135238
6 changed files with 258 additions and 11 deletions
--- a/CURRENT_TASK.md
+++ b/CURRENT_TASK.md
@ -237,7 +237,7 @@ Update (2025-11-16 — Phase 25.1b: selfhost builder multi-carrier & BoxTypeInsp
    - さらに Rust VM 側の `MirInterpreter::reg_load` に開発用の追加情報を付けたことで、`Invalid value: use of undefined value ValueId(N)` が発生した際に `fn` / `last_block` / `last_inst` がエラーメッセージに含まれるようになり、Stage‑B Main.main 内の `ParserBox.length()` 呼び出しが recv 未定義で落ちていることを特定できるようになった（NYASH_VM_TRACE/NYASH_VM_TRACE_EXEC 未設定時でも場所が分かる）。
  - なお、Stage‑B を selfhost CLI サンプルに対して実行した際に現時点で見えている残存課題は次の 2 点:
    - 1) `if args { ... }` まわりの truthy 判定（ArrayBox を boolean 条件に使っている部分）の扱いに起因する型エラーであり、これは SSA ではなく「条件式の型／truthy 規約」をどう定義するかという別問題として扱う（Phase 25.1c 以降の型システム整理タスクで扱う想定）。
-    - 2) Rust VM 実行時に `❌ VM error: Invalid value: use of undefined value ValueId(17)` が発生しており、拡張したエラーメッセージ（`fn=Main.main, last_block=Some(BasicBlockId(3419)), last_inst=Some(Call { ... ParserBox.length() [recv: %17] ... })`）から「Stage‑B Main.main 内の `ParserBox.length()` 呼び出しにおいて recv スロット（ValueId(17)) が定義されていない」ことが判明している。これは verifier (`NYASH_VM_VERIFY_MIR=1`) では検出されない「Callee.receiver 側の SSA 漏れ」であり、Phase 25.1c の Stage‑B / LoopBuilder / LocalSSA 整理タスクの中で「MethodCall の recv に対しても `Undefined value` 検査を掛ける＋MirBuilder 側で未定義 recv を構造的に防ぐ」方針で扱う想定。
+    - 2) Rust VM 実行時に `❌ VM error: Invalid value: use of undefined value ValueId(17)` が発生しており、拡張したエラーメッセージ（初期段階では `fn=Main.main`、Stage‑B 箱分割後は `fn=StageBArgsBox.resolve_src/1` として報告）から「Stage‑B パスにおける `ParserBox.length()` 呼び出しで Method recv（ValueId(17)) が適切に定義されていない」ことが判明している。Task先生の調査では、loop header/body の PHI 生成まわりで pinned スロットに対して循環依存する入力（例: `bb6` 内で `%17 = phi[%14,bb3,...]` かつ `%14` 自体が同じ `bb6` の後続で定義される）があり、Use-before-def ではなく「PHI 入力の cycle」に近い構造バグであることが分かっている。これは verifier (`NYASH_VM_VERIFY_MIR=1`) が現状まだ検出していない「Callee.receiver 側＋PHI 配線」の問題であり、Phase 25.1c の Stage‑B / LoopBuilder / LocalSSA 整理タスクの中で (a) MethodCall の recv に対しても `Undefined value` 検査を掛ける、(b) loop_phi / pinned 変数の PHI 配線を修正して循環入力を構造的に防ぐ、という方針で扱う想定。
 - Next tasks (Phase 25.1b → 25.1c handoff / Codex):
  1. Rust 層 Call/ExternCall 契約のドキュメント固定（Step 4.1）
     - `src/mir/builder/builder_calls.rs` / `src/backend/mir_interpreter/handlers/{calls,externs,extern_provider}.rs` / `src/runtime/plugin_loader_v2/enabled/extern_functions.rs` をベースに、「MethodCall/ExternCall/hostbridge.extern_invoke/ env.codegen/env.mirbuilder」の SSOT を Phase 25.1b README に記録（実施済み）。
--- a/docs/development/roadmap/phases/phase-25.1c/README.md
+++ b/docs/development/roadmap/phases/phase-25.1c/README.md
@ -46,12 +46,12 @@ Status: planning（構造整理フェーズ・挙動は変えない）
     - ParserBox 呼び出し (`parse_program2` → emit JSON)
     - defs スキャン (`FuncScannerBox.scan_all_boxes`)
     が 1 関数に詰め込まれており、MIR 上でも巨大な `Main.main` になっている。
-   - 25.1c ではこれを「箱理論」に沿って分割する:
-     - `StageBArgsBox`（CLI 引数と bundle/require の扱いだけを担当）
-     - `StageBBodyExtractorBox`（`body_src` 抽出ロジックだけを担当）
-     - `StageBDriverBox`（ParserBox/FuncScannerBox を呼んで Program(JSON v0) を emit）
-   - Rust 側 MirBuilder には、この箱ごとの小さな関数をそのまま MIR に落とさせることで、`Main.main` の SSA/Loop の複雑さを減らし、今回のような ValueId 追跡をしやすくする。
-   - 併せて、現状 selfhost CLI サンプルで観測されている `Main.main` 内の `ParserBox.length()` 呼び出しに対する recv 未定義エラー（`Invalid value: use of undefined value ValueId(17)`）を、Stage‑B Main 分割＋LocalSSA/LoopBuilder 整理の一環として根本修正する（MethodCall の recv にも SSA/verify を適用する）。
+   - 25.1c ではこれを「箱理論」に沿って分割する方針を立てており、Phase 25.1c 冒頭でまず Stage‑B 側を 4 箱構造にリファクタした:
+     - `Main`（エントリ薄箱）: `main(args){ return StageBDriverBox.main(args) }` のみを担当。
+     - `StageBDriverBox`（オーケストレーション）: `StageBArgsBox.resolve_src` → `StageBBodyExtractorBox.build_body_src` → `ParserBox.parse_program2` → defs 挿入 → `print(ast_json)` だけを見る。
+     - `StageBArgsBox`（CLI 引数と bundle/require の扱いだけを担当）: もともとの「args/src/src_file/HAKO_SOURCE_FILE_CONTENT/return 0」ロジックを完全移動。
+     - `StageBBodyExtractorBox`（`body_src` 抽出ロジック＋bundle/using/trim を担当）: もともとの `body_src` 抽出〜コメント削除〜BundleResolver/Stage1UsingResolverBox〜前後 trim までを丸ごとカプセル化。
+   - いずれもロジックはそのまま移動であり、コメント・using・ログを含めて挙動は完全に不変（同じ Program(JSON v0)、同じログ、同じ `VM error: Invalid value`）であることを selfhost CLI サンプルで確認済み。エラーの発生箇所は `Main.main` から `StageBArgsBox.resolve_src/1` に関数名だけ変わっており、SSA/Loop 側の根本修正はこの後のタスク（LoopBuilder / LocalSSA 整理）で扱う。

 6. **LoopBuilder / pin スロットの型付け・箱化**
   - いまの LoopBuilder は `__pin$*$@recv` のような文字列ベースの「内部変数名」を `variable_map` に直接突っ込んで、SSA/phi/pin を管理している。
--- a/src/mir/builder.rs
+++ b/src/mir/builder.rs
@ -109,6 +109,10 @@ pub struct MirBuilder {
    /// Index of static methods seen during lowering: name -> [(BoxName, arity)]
    pub(super) static_method_index: std::collections::HashMap<String, Vec<(String, usize)>>,

+    /// Function parameter names (for LoopForm PHI construction)
+    /// Tracks the original parameter names at function entry
+    pub(super) function_param_names: HashSet<String>,
+
    /// Fast lookup: method+arity tail → candidate function names (e.g., ".str/0" → ["JsonNode.str/0", ...])
    pub(super) method_tail_index: std::collections::HashMap<String, Vec<String>>,
    /// Source size snapshot to detect when to rebuild the tail index
@ -191,6 +195,7 @@ impl MirBuilder {
            plugin_method_sigs,
            current_static_box: None,
            static_method_index: std::collections::HashMap::new(),
+            function_param_names: HashSet::new(),
            method_tail_index: std::collections::HashMap::new(),
            method_tail_index_source_len: 0,

@ -549,6 +554,34 @@ impl MirBuilder {
        }
    }

+    /// Update an existing PHI instruction's inputs (for loop sealing)
+    /// Used by LoopFormBuilder to complete incomplete PHI nodes
+    pub(super) fn update_phi_instruction(
+        &mut self,
+        block: BasicBlockId,
+        phi_id: ValueId,
+        new_inputs: Vec<(BasicBlockId, ValueId)>,
+    ) -> Result<(), String> {
+        if let Some(ref mut function) = self.current_function {
+            if let Some(block_data) = function.get_block_mut(block) {
+                // Find PHI instruction with matching dst
+                for inst in &mut block_data.instructions {
+                    if let MirInstruction::Phi { dst, inputs } = inst {
+                        if *dst == phi_id {
+                            *inputs = new_inputs;
+                            return Ok(());
+                        }
+                    }
+                }
+                Err(format!("PHI instruction {} not found in block {}", phi_id, block))
+            } else {
+                Err(format!("Block {} not found", block))
+            }
+        } else {
+            Err("No current function".to_string())
+        }
+    }
+
    // フェーズM: is_no_phi_mode()メソッド削除

    // フェーズM: insert_edge_copy()メソッド削除（no_phi_mode撤廃により不要）
--- a/src/mir/builder/builder_calls.rs
+++ b/src/mir/builder/builder_calls.rs
@ -931,11 +931,14 @@ impl super::MirBuilder {
        self.ensure_block_exists(entry)?;
        // Allocate parameter ValueIds from function's own ID space (starting from %0)
        // This ensures params are %0, %1, %2... as expected by verification and printing
+        // Also track parameter names for LoopForm PHI construction
+        self.function_param_names.clear();
        if let Some(ref mut f) = self.current_function {
            for p in &params {
                let pid = f.next_value_id();  // Use function's own ID allocator, not global
                f.params.push(pid);
                self.variable_map.insert(p.clone(), pid);
+                self.function_param_names.insert(p.clone());
            }
        }
        let program_ast = function_lowering::wrap_in_program(body);
--- a/src/mir/loop_builder.rs
+++ b/src/mir/loop_builder.rs
@ -7,6 +7,7 @@

 use super::{BasicBlockId, ConstValue, MirInstruction, ValueId};
 use crate::mir::phi_core::loop_phi::IncompletePhi;
+use crate::mir::phi_core::loopform_builder::{LoopFormBuilder, LoopFormOps};
 use crate::ast::ASTNode;
 use std::collections::HashMap;

@ -112,11 +113,137 @@ impl<'a> LoopBuilder<'a> {
        }
    }

-    /// SSA形式でループを構築
+    /// SSA形式でループを構築 (Feature flag dispatch)
    pub fn build_loop(
        &mut self,
        condition: ASTNode,
        body: Vec<ASTNode>,
+    ) -> Result<ValueId, String> {
+        // Check feature flag for LoopForm PHI v2
+        let use_loopform_v2 = std::env::var("NYASH_LOOPFORM_PHI_V2")
+            .map(|v| v == "1" || v.to_lowercase() == "true")
+            .unwrap_or(false);
+
+        if use_loopform_v2 {
+            self.build_loop_with_loopform(condition, body)
+        } else {
+            self.build_loop_legacy(condition, body)
+        }
+    }
+
+    /// SSA形式でループを構築 (LoopFormBuilder implementation)
+    fn build_loop_with_loopform(
+        &mut self,
+        condition: ASTNode,
+        body: Vec<ASTNode>,
+    ) -> Result<ValueId, String> {
+        // Create loop structure blocks
+        let preheader_id = self.current_block()?;
+        let header_id = self.new_block();
+        let body_id = self.new_block();
+        let latch_id = self.new_block();
+        let exit_id = self.new_block();
+
+        // Initialize LoopFormBuilder with preheader and header blocks
+        let mut loopform = LoopFormBuilder::new(preheader_id, header_id);
+
+        // Capture current variable map snapshot at preheader
+        let current_vars = self.get_current_variable_map();
+
+        // Pass 1: Prepare structure (allocate all ValueIds upfront)
+        if std::env::var("NYASH_LOOPFORM_DEBUG").is_ok() {
+            eprintln!("[loopform] variable_map at loop entry:");
+            for (name, value) in &current_vars {
+                let is_param = self.is_parameter(name);
+                eprintln!("  {} -> {:?} (param={})", name, value, is_param);
+            }
+        }
+        loopform.prepare_structure(self, &current_vars)?;
+
+        // Pass 2: Emit preheader (copies and jump to header)
+        loopform.emit_preheader(self)?;
+
+        // Pass 3: Emit header PHIs (incomplete, only preheader edge)
+        self.set_current_block(header_id)?;
+
+        // Ensure header block exists before emitting PHIs
+        self.parent_builder.ensure_block_exists(header_id)?;
+
+        loopform.emit_header_phis(self)?;
+
+        if std::env::var("NYASH_LOOPFORM_DEBUG").is_ok() {
+            eprintln!("[loopform] variable_map after emit_header_phis:");
+            for (name, value) in self.get_current_variable_map().iter() {
+                eprintln!("  {} -> {:?}", name, value);
+            }
+        }
+
+        // Set up loop context for break/continue
+        crate::mir::builder::loops::push_loop_context(self.parent_builder, header_id, exit_id);
+        self.loop_header = Some(header_id);
+        self.continue_snapshots.clear();
+        self.exit_snapshots.clear();
+
+        // Emit condition check in header
+        let cond_value = self.parent_builder.build_expression(condition)?;
+        self.emit_branch(cond_value, body_id, exit_id)?;
+
+        // Lower loop body
+        self.set_current_block(body_id)?;
+        for stmt in body {
+            self.build_statement(stmt)?;
+            if is_current_block_terminated(self.parent_builder)? {
+                break;
+            }
+        }
+
+        // Capture variable snapshot at end of body (before jumping to latch)
+        let body_end_vars = self.get_current_variable_map();
+
+        // Jump to latch if not already terminated
+        let actual_latch_id = if !is_current_block_terminated(self.parent_builder)? {
+            let cur_body_end = self.current_block()?;
+            self.emit_jump(latch_id)?;
+            latch_id
+        } else {
+            // Body is terminated (break/continue), use current block as latch
+            self.current_block()?
+        };
+
+        // Latch: jump back to header
+        self.set_current_block(latch_id)?;
+
+        // Update variable map with body end values for sealing
+        for (name, value) in body_end_vars {
+            self.update_variable(name, value);
+        }
+
+        self.emit_jump(header_id)?;
+
+        // Pass 4: Seal PHIs with latch values
+        loopform.seal_phis(self, actual_latch_id)?;
+
+        // Exit block
+        self.set_current_block(exit_id)?;
+
+        // Build exit PHIs for break statements
+        let exit_snaps = self.exit_snapshots.clone();
+        loopform.build_exit_phis(self, exit_id, &exit_snaps)?;
+
+        // Pop loop context
+        crate::mir::builder::loops::pop_loop_context(self.parent_builder);
+
+        // Return void value
+        let void_dst = self.new_value();
+        self.emit_const(void_dst, ConstValue::Void)?;
+        Ok(void_dst)
+    }
+
+    /// SSA形式でループを構築 (Legacy implementation)
+    fn build_loop_legacy(
+        &mut self,
+        condition: ASTNode,
+        body: Vec<ASTNode>,
    ) -> Result<ValueId, String> {
        // Reserve a deterministic loop id for debug region labeling
        let loop_id = self.parent_builder.debug_next_loop_id();
@ -925,3 +1052,79 @@ impl crate::mir::phi_core::loop_phi::LoopPhiOps for LoopBuilder<'_> {
        self.add_predecessor(block, pred)
    }
 }
+
+// Implement LoopFormOps trait for LoopBuilder to support LoopFormBuilder integration
+impl<'a> LoopFormOps for LoopBuilder<'a> {
+    fn new_value(&mut self) -> ValueId {
+        self.parent_builder.value_gen.next()
+    }
+
+    fn is_parameter(&self, name: &str) -> bool {
+        // A parameter is a true function parameter that doesn't change across iterations
+        // Pinned receivers (__pin$*$@*) are NOT parameters - they're carriers
+        // because they can be reassigned in the loop body
+
+        // Pinned variables are always carriers (loop-variant)
+        if name.starts_with("__pin$") {
+            return false;
+        }
+
+        // Check if it's the receiver
+        if name == "me" {
+            return true;
+        }
+
+        // Check if it's in the original function parameter names
+        // This is more reliable than checking ValueIds, which can change through copies/PHIs
+        let is_param = self.parent_builder.function_param_names.contains(name);
+
+        if std::env::var("NYASH_LOOPFORM_DEBUG").is_ok() {
+            eprintln!("[is_parameter] {} -> {} (param_names = {:?})",
+                     name, is_param, self.parent_builder.function_param_names);
+        }
+
+        is_param
+    }
+
+    fn set_current_block(&mut self, block: BasicBlockId) -> Result<(), String> {
+        self.parent_builder.start_new_block(block)
+    }
+
+    fn emit_copy(&mut self, dst: ValueId, src: ValueId) -> Result<(), String> {
+        self.parent_builder.emit_instruction(MirInstruction::Copy { dst, src })
+    }
+
+    fn emit_jump(&mut self, target: BasicBlockId) -> Result<(), String> {
+        self.emit_jump(target)
+    }
+
+    fn emit_phi(
+        &mut self,
+        dst: ValueId,
+        inputs: Vec<(BasicBlockId, ValueId)>,
+    ) -> Result<(), String> {
+        self.emit_phi_at_block_start(
+            self.current_block()?,
+            dst,
+            inputs
+        )
+    }
+
+    fn update_phi_inputs(
+        &mut self,
+        block: BasicBlockId,
+        phi_id: ValueId,
+        inputs: Vec<(BasicBlockId, ValueId)>,
+    ) -> Result<(), String> {
+        self.parent_builder.update_phi_instruction(block, phi_id, inputs)
+    }
+
+    fn update_var(&mut self, name: String, value: ValueId) {
+        self.parent_builder.variable_map.insert(name, value);
+    }
+
+    fn get_variable_at_block(&self, name: &str, block: BasicBlockId) -> Option<ValueId> {
+        // Use the inherent method to avoid recursion
+        LoopBuilder::get_variable_at_block(self, name, block)
+    }
+}
--- a/src/mir/phi_core/loopform_builder.rs
+++ b/src/mir/phi_core/loopform_builder.rs
@ -38,7 +38,7 @@ pub struct PinnedVariable {
 ///
 /// Key Innovation: All ValueIds allocated upfront before any MIR emission,
 /// eliminating circular dependency issues.
-#[derive(Debug, Default)]
+#[derive(Debug)]
 pub struct LoopFormBuilder {
    pub carriers: Vec<CarrierVariable>,
    pub pinned: Vec<PinnedVariable>,
@ -77,6 +77,10 @@ impl LoopFormBuilder {
                    preheader_copy: ops.new_value(), // Allocate NOW
                    header_phi: ops.new_value(),     // Allocate NOW
                };
+                if std::env::var("NYASH_LOOPFORM_DEBUG").is_ok() {
+                    eprintln!("[loopform/prepare] pinned: {} -> init={:?}, copy={:?}, phi={:?}",
+                             name, value, pinned.preheader_copy, pinned.header_phi);
+                }
                self.pinned.push(pinned);
            } else {
                // Carrier variable (local, modified in loop)
@ -85,8 +89,12 @@ impl LoopFormBuilder {
                    init_value: value,
                    preheader_copy: ops.new_value(), // Allocate NOW
                    header_phi: ops.new_value(),     // Allocate NOW
-                    latch_value: ValueId::INVALID,   // Will be set during seal
+                    latch_value: ValueId(0),         // Will be set during seal (placeholder)
                };
+                if std::env::var("NYASH_LOOPFORM_DEBUG").is_ok() {
+                    eprintln!("[loopform/prepare] carrier: {} -> init={:?}, copy={:?}, phi={:?}",
+                             name, value, carrier.preheader_copy, carrier.header_phi);
+                }
                self.carriers.push(carrier);
            }
        }