hakorune/src/mir/join_ir/lowering/inline_boundary.rs
nyash-codex e30116f53d feat(joinir): Phase 171-fix ConditionEnv/ConditionBinding architecture
Proper HOST↔JoinIR ValueId separation for condition variables:

- Add ConditionEnv struct (name → JoinIR-local ValueId mapping)
- Add ConditionBinding struct (HOST/JoinIR ValueId pairs)
- Modify condition_to_joinir to use ConditionEnv instead of builder.variable_map
- Update Pattern2 lowerer to build ConditionEnv and ConditionBindings
- Extend JoinInlineBoundary with condition_bindings field
- Update BoundaryInjector to inject Copy instructions for condition variables
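
A minimal sketch of the two structures named above, under stated assumptions. The `ConditionBinding` field names (`name`, `host_value`, `join_value`) follow the doc example in inline_boundary.rs. The internal layout of `ConditionEnv`, the `ValueId` stand-in, and the helper `build_condition_env` are hypothetical and only illustrate the name → JoinIR-local mapping; the real lowerer may allocate ids differently.

```rust
use std::collections::BTreeMap;

/// Stand-in for the crate's `ValueId` (assumed to be a tuple struct over u32).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ValueId(pub u32);

/// Hypothetical layout: variable name -> JoinIR-local ValueId.
#[derive(Debug, Default)]
pub struct ConditionEnv {
    name_to_join: BTreeMap<String, ValueId>,
}

impl ConditionEnv {
    pub fn insert(&mut self, name: &str, join_id: ValueId) {
        self.name_to_join.insert(name.to_string(), join_id);
    }

    pub fn get(&self, name: &str) -> Option<ValueId> {
        self.name_to_join.get(name).copied()
    }
}

/// One HOST/JoinIR ValueId pair for a condition variable.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ConditionBinding {
    pub name: String,
    pub host_value: ValueId,
    pub join_value: ValueId,
}

/// Hypothetical helper: give each condition-only variable a JoinIR-local id
/// (counting up from `first_local`) and record the HOST/JoinIR pair.
/// The host map is read only here, never during condition lowering itself.
pub fn build_condition_env(
    host_variable_map: &BTreeMap<String, ValueId>,
    condition_vars: &[&str],
    first_local: u32,
) -> (ConditionEnv, Vec<ConditionBinding>) {
    let mut env = ConditionEnv::default();
    let mut bindings = Vec::new();
    let mut next = first_local;
    for &name in condition_vars {
        // Names unknown to the host are skipped in this sketch;
        // a real lowerer would more likely report an error.
        let Some(&host_value) = host_variable_map.get(name) else {
            continue;
        };
        let join_value = ValueId(next);
        next += 1;
        env.insert(name, join_value);
        bindings.push(ConditionBinding {
            name: name.to_string(),
            host_value,
            join_value,
        });
    }
    (env, bindings)
}
```

With this shape, condition lowering only needs `ConditionEnv::get`, so it has no reason to see a HOST ValueId; the bindings vector is what gets handed to the boundary.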

This fixes the undefined ValueId errors where HOST ValueIds were being
used directly in JoinIR instructions. Programs now execute (RC: 0),
though loop variable exit values still need Phase 172 work.

Key invariants established:
1. JoinIR uses ONLY JoinIR-local ValueIds
2. HOST↔JoinIR bridging is ONLY through JoinInlineBoundary
3. condition_to_joinir NEVER accesses builder.variable_map
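
The second invariant can be illustrated with a small sketch of Copy injection at the boundary. This is not the project's BoundaryInjector: the `Inst` enum, the `ValueId` stand-in, and the function `inject_boundary_copies` are hypothetical, and the fresh-id counter is an assumption about how the merged function allocates values.

```rust
use std::collections::HashMap;

/// Stand-in for the crate's `ValueId` (assumed to be a tuple struct over u32).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ValueId(pub u32);

/// Minimal stand-in for a MIR instruction; only Copy is modelled.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Inst {
    Copy { dst: ValueId, src: ValueId },
}

/// Hypothetical helper: for each (host, join) pair, allocate a fresh id in
/// the merged function, emit `fresh = Copy host`, and record `join -> fresh`
/// so JoinIR-local references can be remapped afterwards.
pub fn inject_boundary_copies(
    host_inputs: &[ValueId],
    join_inputs: &[ValueId],
    next_free: &mut u32,
) -> (Vec<Inst>, HashMap<ValueId, ValueId>) {
    assert_eq!(
        join_inputs.len(),
        host_inputs.len(),
        "join_inputs and host_inputs must have same length"
    );
    let mut copies = Vec::with_capacity(host_inputs.len());
    let mut remap = HashMap::new();
    for (&host, &join) in host_inputs.iter().zip(join_inputs) {
        let fresh = ValueId(*next_free);
        *next_free += 1;
        copies.push(Inst::Copy { dst: fresh, src: host });
        remap.insert(join, fresh);
    }
    (copies, remap)
}
```

Condition variables could presumably go through the same path by feeding each binding's `host_value` and `join_value` into the two slices, which keeps every HOST id on the source side of a Copy and out of the JoinIR body.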

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 01:45:03 +09:00


//! Phase 188-Impl-3: JoinInlineBoundary - Boundary information for JoinIR inlining
//!
//! This module defines the boundary between JoinIR fragments and the host MIR function.
//! It enables clean separation of concerns:
//!
//! - **Box A**: JoinIR Frontend (doesn't know about host ValueIds)
//! - **Box B**: Join→MIR Bridge (converts to MIR using local ValueIds)
//! - **Box C**: JoinInlineBoundary (stores boundary info - THIS FILE)
//! - **Box D**: JoinMirInlineMerger (injects Copy instructions at boundary)
//!
//! ## Design Philosophy
//!
//! The JoinIR lowerer should work with **local ValueIds** (0, 1, 2, ...) without
//! knowing anything about the host function's ValueId space. This ensures:
//!
//! 1. **Modularity**: JoinIR lowerers are pure transformers
//! 2. **Reusability**: Same lowerer can be used in different contexts
//! 3. **Testability**: JoinIR can be tested independently
//! 4. **Correctness**: SSA properties are maintained via explicit Copy instructions
//!
//! ## Example
//!
//! For `loop(i < 3) { print(i); i = i + 1 }`:
//!
//! ```text
//! Host Function:
//! ValueId(4) = Const 0 // i = 0 in host
//!
//! JoinIR Fragment (uses local IDs 0, 1, 2, ...):
//! ValueId(0) = param // i_param (local to JoinIR)
//! ValueId(1) = Const 3
//! ValueId(2) = Compare ...
//!
//! Boundary:
//! join_inputs: [ValueId(0)] // JoinIR's param slot
//! host_inputs: [ValueId(4)] // Host's `i` variable
//!
//! Merged MIR (with Copy injection):
//! entry:
//! ValueId(100) = Copy ValueId(4) // Connect host→JoinIR
//! ValueId(101) = Const 3
//! ...
//! ```
use crate::mir::ValueId;
/// Explicit binding between JoinIR exit value and host variable
///
/// This structure formalizes the connection between a JoinIR exit PHI value
/// and the host variable it should update. This eliminates implicit assumptions
/// about which variable a ValueId represents.
///
/// # Pattern 3 Example
///
/// For `loop(i < 3) { sum = sum + i; i = i + 1 }`:
///
/// ```text
/// LoopExitBinding {
/// carrier_name: "sum",
/// join_exit_value: ValueId(18), // k_exit's return value (JoinIR-local)
/// host_slot: ValueId(5), // variable_map["sum"] in host
/// }
/// ```
///
/// # Multi-Carrier Support (Pattern 4+)
///
/// Multiple carriers can be represented as a vector:
///
/// ```text
/// vec![
/// LoopExitBinding { carrier_name: "sum", join_exit_value: ValueId(18), host_slot: ValueId(5) },
/// LoopExitBinding { carrier_name: "count", join_exit_value: ValueId(19), host_slot: ValueId(6) },
/// ]
/// ```
#[derive(Debug, Clone)]
pub struct LoopExitBinding {
/// Carrier variable name (e.g., "sum", "count")
///
/// This is the variable name in the host's variable_map that should
/// receive the exit value.
pub carrier_name: String,
/// JoinIR-side ValueId from k_exit (or exit parameter)
///
/// This is the **JoinIR-local** ValueId that represents the exit value.
/// It will be remapped when merged into the host function.
pub join_exit_value: ValueId,
/// Host-side variable_map slot to reconnect
///
/// This is the host function's ValueId for the variable that should be
/// updated with the exit PHI result.
pub host_slot: ValueId,
}
/// Boundary information for inlining a JoinIR fragment into a host function
///
/// This structure captures the "interface" between a JoinIR fragment and the
/// host function, allowing the merger to inject necessary Copy instructions
/// to connect the two SSA value spaces.
///
/// # Design Note
///
/// This is a **pure data structure** with no logic. All transformation logic
/// lives in the merger (merge_joinir_mir_blocks).
#[derive(Debug, Clone)]
pub struct JoinInlineBoundary {
/// JoinIR-local ValueIds that act as "input slots"
///
/// These are the ValueIds used **inside** the JoinIR fragment to refer
/// to values that come from the host. They should be small sequential
/// IDs (0, 1, 2, ...) since JoinIR lowerers allocate locally.
///
/// Example: For a loop variable `i`, JoinIR uses ValueId(0) as the parameter.
pub join_inputs: Vec<ValueId>,
/// Host-function ValueIds that provide the input values
///
/// These are the ValueIds from the **host function** that correspond to
/// the join_inputs. The merger will inject Copy instructions to connect
/// host_inputs[i] → join_inputs[i].
///
/// Example: If host has `i` as ValueId(4), then host_inputs = [ValueId(4)].
pub host_inputs: Vec<ValueId>,
/// JoinIR-local ValueIds that represent outputs (if any)
///
/// For loops that produce values (e.g., loop result), these are the
/// JoinIR-local ValueIds that should be visible to the host after inlining.
///
/// Not used yet as of Phase 188/189; reserved for future multi-carrier
/// patterns (loops that return several variables at once).
pub join_outputs: Vec<ValueId>,
/// Host-function ValueIds that receive the outputs (DEPRECATED)
///
/// **DEPRECATED**: Use `exit_bindings` instead for explicit carrier naming.
///
/// These are the destination ValueIds in the host function that should
/// receive the values from join_outputs, or (in single-carrier cases such
/// as Pattern 3) the host-side SSA slots that receive the result of the
/// loop exit PHI.
///
/// Unused through Phase 188-Impl-3; Phase 189 uses it to reconnect carriers
/// that are updated at the loop exit, such as `sum` in loop_if_phi.hako.
#[deprecated(since = "Phase 190", note = "Use exit_bindings instead")]
pub host_outputs: Vec<ValueId>,
/// Explicit exit bindings for loop carriers (Phase 190+)
///
/// Each binding explicitly names which variable is being updated and
/// where the value comes from. This eliminates ambiguity and prepares
/// for multi-carrier support.
///
/// For Pattern 3 (single carrier "sum"):
/// ```text
/// exit_bindings: vec![
/// LoopExitBinding {
/// carrier_name: "sum",
/// join_exit_value: ValueId(18), // k_exit return value
/// host_slot: ValueId(5), // variable_map["sum"]
/// }
/// ]
/// ```
pub exit_bindings: Vec<LoopExitBinding>,
/// Condition-only input variables (Phase 171+ / Phase 171-fix)
///
/// **DEPRECATED**: Use `condition_bindings` instead (Phase 171-fix).
///
/// These are variables used ONLY in the loop condition, NOT as loop parameters.
/// They need to be available in JoinIR scope but are not modified by the loop.
///
/// # Example
///
/// For `loop(start < end) { i = i + 1 }`:
/// - Loop parameter: `i` → goes in `join_inputs`/`host_inputs`
/// - Condition-only: `start`, `end` → go in `condition_inputs`
///
/// # Format
///
/// Each entry is `(variable_name, host_value_id)`:
/// ```text
/// condition_inputs: vec![
/// ("start".to_string(), ValueId(33)), // HOST ID for "start"
/// ("end".to_string(), ValueId(34)), // HOST ID for "end"
/// ]
/// ```
///
/// The merger will:
/// 1. Extract unique variable names from condition AST
/// 2. Look up HOST ValueIds from `builder.variable_map`
/// 3. Inject Copy instructions for each condition input
/// 4. Remap JoinIR references to use the copied values
#[deprecated(since = "Phase 171-fix", note = "Use condition_bindings instead")]
pub condition_inputs: Vec<(String, ValueId)>,
/// Phase 171-fix: Condition bindings with explicit JoinIR ValueIds
///
/// Each binding explicitly specifies:
/// - Variable name
/// - HOST ValueId (source)
/// - JoinIR ValueId (destination)
///
/// This replaces `condition_inputs` to ensure proper ValueId separation.
pub condition_bindings: Vec<super::condition_to_joinir::ConditionBinding>,
}
impl JoinInlineBoundary {
/// Create a new boundary with input mappings only
///
/// This is the common case for loops like Pattern 1 where:
/// - Inputs: loop variables (e.g., `i` in `loop(i < 3)`)
/// - Outputs: none (loop returns void/0)
pub fn new_inputs_only(join_inputs: Vec<ValueId>, host_inputs: Vec<ValueId>) -> Self {
assert_eq!(
join_inputs.len(),
host_inputs.len(),
"join_inputs and host_inputs must have same length"
);
Self {
join_inputs,
host_inputs,
join_outputs: vec![],
#[allow(deprecated)]
host_outputs: vec![],
exit_bindings: vec![],
#[allow(deprecated)]
condition_inputs: vec![], // Phase 171: Default to empty (deprecated)
condition_bindings: vec![], // Phase 171-fix: Default to empty
}
}
/// Create a new boundary with both inputs and outputs (DEPRECATED)
///
/// **DEPRECATED**: Use `new_with_exit_bindings` instead.
///
/// Reserved for future loop patterns that produce values.
///
/// The current implementation does not support multi-carrier outputs yet,
/// but the type is kept able to represent multiple outputs.
#[allow(dead_code)]
#[deprecated(since = "Phase 190", note = "Use new_with_exit_bindings instead")]
pub fn new_with_outputs(
join_inputs: Vec<ValueId>,
host_inputs: Vec<ValueId>,
join_outputs: Vec<ValueId>,
host_outputs: Vec<ValueId>,
) -> Self {
assert_eq!(
join_inputs.len(),
host_inputs.len(),
"join_inputs and host_inputs must have same length"
);
assert_eq!(
join_outputs.len(),
host_outputs.len(),
"join_outputs and host_outputs must have same length"
);
Self {
join_inputs,
host_inputs,
join_outputs,
#[allow(deprecated)]
host_outputs,
exit_bindings: vec![],
#[allow(deprecated)]
condition_inputs: vec![], // Phase 171: Default to empty (deprecated)
condition_bindings: vec![], // Phase 171-fix: Default to empty
}
}
/// Create a new boundary with inputs and **host outputs only** (DEPRECATED)
///
/// **DEPRECATED**: Use `new_with_exit_bindings` instead for explicit carrier naming.
///
/// Use this when the JoinIR-side exit values (e.g., k_exit arguments) are
/// merged into a single PHI and that PHI result should be reconnected to a
/// host-side variable slot.
///
/// Typical case: Pattern 3 (loop_if_phi.hako)
/// - join_inputs : [i_init, sum_init]
/// - host_inputs : [host_i, host_sum]
/// - host_outputs : [host_sum] // variable to overwrite at loop exit
#[deprecated(since = "Phase 190", note = "Use new_with_exit_bindings instead")]
pub fn new_with_input_and_host_outputs(
join_inputs: Vec<ValueId>,
host_inputs: Vec<ValueId>,
host_outputs: Vec<ValueId>,
) -> Self {
assert_eq!(
join_inputs.len(),
host_inputs.len(),
"join_inputs and host_inputs must have same length"
);
Self {
join_inputs,
host_inputs,
join_outputs: vec![],
#[allow(deprecated)]
host_outputs,
exit_bindings: vec![],
#[allow(deprecated)]
condition_inputs: vec![], // Phase 171: Default to empty (deprecated)
condition_bindings: vec![], // Phase 171-fix: Default to empty
}
}
/// Create a new boundary with explicit exit bindings (Phase 190+)
///
/// This is the recommended constructor for loops with exit carriers.
/// Each exit binding explicitly names the carrier variable and its
/// source/destination values.
///
/// # Example: Pattern 3 (single carrier)
///
/// ```ignore
/// let boundary = JoinInlineBoundary::new_with_exit_bindings(
/// vec![ValueId(0), ValueId(1)], // join_inputs (i, sum init)
/// vec![loop_var_id, sum_var_id], // host_inputs
/// vec![
/// LoopExitBinding {
/// carrier_name: "sum".to_string(),
/// join_exit_value: ValueId(18), // k_exit return value
/// host_slot: sum_var_id, // variable_map["sum"]
/// }
/// ],
/// );
/// ```
///
/// # Example: Pattern 4+ (multiple carriers)
///
/// ```ignore
/// let boundary = JoinInlineBoundary::new_with_exit_bindings(
/// vec![ValueId(0), ValueId(1), ValueId(2)], // join_inputs
/// vec![i_id, sum_id, count_id], // host_inputs
/// vec![
/// LoopExitBinding { carrier_name: "sum".to_string(), ... },
/// LoopExitBinding { carrier_name: "count".to_string(), ... },
/// ],
/// );
/// ```
pub fn new_with_exit_bindings(
join_inputs: Vec<ValueId>,
host_inputs: Vec<ValueId>,
exit_bindings: Vec<LoopExitBinding>,
) -> Self {
assert_eq!(
join_inputs.len(),
host_inputs.len(),
"join_inputs and host_inputs must have same length"
);
Self {
join_inputs,
host_inputs,
join_outputs: vec![],
#[allow(deprecated)]
host_outputs: vec![],
exit_bindings,
#[allow(deprecated)]
condition_inputs: vec![], // Phase 171: Default to empty (deprecated)
condition_bindings: vec![], // Phase 171-fix: Default to empty
}
}
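/// Create a new boundary with condition inputs (Phase 171+)
///
/// This fills the deprecated `condition_inputs` field; prefer
/// `new_with_condition_bindings` (Phase 171-fix) for new code.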
///
/// # Arguments
///
/// * `join_inputs` - JoinIR-local ValueIds for loop parameters
/// * `host_inputs` - HOST ValueIds for loop parameters
/// * `condition_inputs` - Condition-only variables [(name, host_value_id)]
///
/// # Example
///
/// ```ignore
/// let boundary = JoinInlineBoundary::new_with_condition_inputs(
/// vec![ValueId(0)], // join_inputs (i)
/// vec![ValueId(5)], // host_inputs (i)
/// vec![
/// ("start".to_string(), ValueId(33)),
/// ("end".to_string(), ValueId(34)),
/// ],
/// );
/// ```
pub fn new_with_condition_inputs(
join_inputs: Vec<ValueId>,
host_inputs: Vec<ValueId>,
condition_inputs: Vec<(String, ValueId)>,
) -> Self {
assert_eq!(
join_inputs.len(),
host_inputs.len(),
"join_inputs and host_inputs must have same length"
);
Self {
join_inputs,
host_inputs,
join_outputs: vec![],
#[allow(deprecated)]
host_outputs: vec![],
exit_bindings: vec![],
#[allow(deprecated)]
condition_inputs,
condition_bindings: vec![], // Phase 171-fix: left empty here; see new_with_condition_bindings
}
}
/// Create boundary with inputs, exit bindings, AND condition inputs (Phase 171+)
///
/// This is the most complete constructor for loops with carriers and condition
/// variables, but it fills the deprecated `condition_inputs` field rather than
/// `condition_bindings`.
///
/// # Example: Pattern 3 with condition variables
///
/// ```ignore
/// let boundary = JoinInlineBoundary::new_with_exit_and_condition_inputs(
/// vec![ValueId(0), ValueId(1)], // join_inputs (i, sum)
/// vec![ValueId(5), ValueId(10)], // host_inputs
/// vec![
/// LoopExitBinding {
/// carrier_name: "sum".to_string(),
/// join_exit_value: ValueId(18),
/// host_slot: ValueId(10),
/// }
/// ],
/// vec![
/// ("start".to_string(), ValueId(33)),
/// ("end".to_string(), ValueId(34)),
/// ],
/// );
/// ```
pub fn new_with_exit_and_condition_inputs(
join_inputs: Vec<ValueId>,
host_inputs: Vec<ValueId>,
exit_bindings: Vec<LoopExitBinding>,
condition_inputs: Vec<(String, ValueId)>,
) -> Self {
assert_eq!(
join_inputs.len(),
host_inputs.len(),
"join_inputs and host_inputs must have same length"
);
Self {
join_inputs,
host_inputs,
join_outputs: vec![],
#[allow(deprecated)]
host_outputs: vec![],
exit_bindings,
#[allow(deprecated)]
condition_inputs,
condition_bindings: vec![], // Phase 171-fix: left empty here; see new_with_condition_bindings
}
}
/// Phase 171-fix: Create boundary with ConditionBindings (NEW constructor)
///
/// This is the recommended constructor that uses ConditionBindings instead of
/// the deprecated condition_inputs.
///
/// # Arguments
///
/// * `join_inputs` - JoinIR-local ValueIds for loop parameters
/// * `host_inputs` - HOST ValueIds for loop parameters
/// * `condition_bindings` - Explicit HOST ↔ JoinIR mappings for condition variables
///
/// # Example
///
/// ```ignore
/// let boundary = JoinInlineBoundary::new_with_condition_bindings(
/// vec![ValueId(0)], // join_inputs (loop param i)
/// vec![ValueId(5)], // host_inputs (loop param i)
/// vec![
/// ConditionBinding {
/// name: "start".to_string(),
/// host_value: ValueId(33), // HOST
/// join_value: ValueId(1), // JoinIR
/// },
/// ConditionBinding {
/// name: "end".to_string(),
/// host_value: ValueId(34), // HOST
/// join_value: ValueId(2), // JoinIR
/// },
/// ],
/// );
/// ```
pub fn new_with_condition_bindings(
join_inputs: Vec<ValueId>,
host_inputs: Vec<ValueId>,
condition_bindings: Vec<super::condition_to_joinir::ConditionBinding>,
) -> Self {
assert_eq!(
join_inputs.len(),
host_inputs.len(),
"join_inputs and host_inputs must have same length"
);
Self {
join_inputs,
host_inputs,
join_outputs: vec![],
#[allow(deprecated)]
host_outputs: vec![],
exit_bindings: vec![],
#[allow(deprecated)]
condition_inputs: vec![], // Deprecated, use condition_bindings instead
condition_bindings,
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_boundary_inputs_only() {
let boundary = JoinInlineBoundary::new_inputs_only(
vec![ValueId(0)], // JoinIR uses ValueId(0) for loop var
vec![ValueId(4)], // Host has loop var at ValueId(4)
);
assert_eq!(boundary.join_inputs.len(), 1);
assert_eq!(boundary.host_inputs.len(), 1);
assert_eq!(boundary.join_outputs.len(), 0);
#[allow(deprecated)]
{
assert_eq!(boundary.host_outputs.len(), 0);
assert_eq!(boundary.condition_inputs.len(), 0); // Phase 171: Deprecated field
}
assert_eq!(boundary.condition_bindings.len(), 0); // Phase 171-fix: New field
}
#[test]
#[should_panic(expected = "join_inputs and host_inputs must have same length")]
fn test_boundary_mismatched_inputs() {
JoinInlineBoundary::new_inputs_only(
vec![ValueId(0), ValueId(1)],
vec![ValueId(4)],
);
}
}
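
The doc comments on `LoopExitBinding` say that `join_exit_value` is remapped during the merge and that `host_slot` is the variable_map slot to reconnect. A rough sketch of that reconnection step follows, assuming the merger produces a JoinIR-local → merged ValueId remap table. The function `reconnect_exits`, the remap table, and the `ValueId` stand-in are hypothetical; the struct mirrors the one defined in this file.

```rust
use std::collections::{BTreeMap, HashMap};

/// Stand-in for the crate's `ValueId` (assumed to be a tuple struct over u32).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ValueId(pub u32);

/// Mirrors the struct defined in inline_boundary.rs.
#[derive(Debug, Clone)]
pub struct LoopExitBinding {
    pub carrier_name: String,
    pub join_exit_value: ValueId,
    pub host_slot: ValueId,
}

/// Hypothetical reconnection step: point each carrier's variable_map entry at
/// the merged ValueId of its exit value. Returns the carriers that were
/// skipped, either because the exit value has no remap entry or because the
/// host's current slot does not match `host_slot`.
pub fn reconnect_exits(
    bindings: &[LoopExitBinding],
    remap: &HashMap<ValueId, ValueId>,
    variable_map: &mut BTreeMap<String, ValueId>,
) -> Vec<String> {
    let mut skipped = Vec::new();
    for b in bindings {
        // Sanity check: the binding must describe the slot the host holds now.
        let slot_matches = variable_map.get(&b.carrier_name) == Some(&b.host_slot);
        match remap.get(&b.join_exit_value) {
            Some(&merged) if slot_matches => {
                variable_map.insert(b.carrier_name.clone(), merged);
            }
            _ => skipped.push(b.carrier_name.clone()),
        }
    }
    skipped
}
```

Because each binding carries its carrier name, the step does not have to guess which variable a ValueId belongs to, which is the ambiguity the positional `host_outputs` vector left open.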