hakorune/src/mir/join_ir/lowering/inline_boundary.rs
nyash-codex 3f58f34592 feat(llvm): Phase 132-P0 - block_end_values tuple-key fix for cross-function isolation
## Problem
`block_end_values` used block ID only as key, causing collisions when
multiple functions share the same block IDs (e.g., bb0 in both
condition_fn and main).

## Root Cause
- condition_fn's bb0 → block_end_values[0]
- main's bb0 → block_end_values[0] (OVERWRITES!)
- PHI resolution gets wrong snapshot → dominance error

## Solution (Box-First principle)
Change key from `int` to `Tuple[str, int]` (func_name, block_id):

```python
# Before
block_end_values: Dict[int, Dict[int, ir.Value]]

# After
block_end_values: Dict[Tuple[str, int], Dict[int, ir.Value]]
```
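
The collision and the fix can be reproduced in miniature. The sketch below is in Rust rather than the Python of the actual change, and every name in it (`Snapshot`, `save_flat`, `save_scoped`) is hypothetical. It only illustrates why a key of block ID alone lets a later function overwrite an earlier function's snapshot, and how a `(func_name, block_id)` key keeps them apart.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a block-end snapshot: value id -> value label.
type Snapshot = HashMap<u32, String>;

/// Save snapshots keyed by block id only (the pre-fix scheme).
/// Each entry is (func_name, block_id, label of value 0).
fn save_flat(entries: &[(&str, u32, &str)]) -> HashMap<u32, Snapshot> {
    let mut map = HashMap::new();
    for &(_func, block, label) in entries {
        // The function name is ignored, so equal block ids collide.
        map.insert(block, Snapshot::from([(0, label.to_string())]));
    }
    map
}

/// Save snapshots keyed by (func_name, block_id) (the post-fix scheme).
fn save_scoped(entries: &[(&str, u32, &str)]) -> HashMap<(String, u32), Snapshot> {
    let mut map = HashMap::new();
    for &(func, block, label) in entries {
        // Each function gets its own slot even when block ids are equal.
        map.insert(
            (func.to_string(), block),
            Snapshot::from([(0, label.to_string())]),
        );
    }
    map
}
```

With entries for `condition_fn` bb0 and `main` bb0, `save_flat` ends up with a single snapshot (the one written last), while `save_scoped` keeps both.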

## Files Modified (Python - 6 files)

1. `llvm_builder.py` - Type annotation update
2. `function_lower.py` - Pass func_name to lower_blocks
3. `block_lower.py` - Use tuple keys for snapshot save/load
4. `resolver.py` - Add func_name parameter to resolve_incoming
5. `wiring.py` - Thread func_name through PHI wiring
6. `phi_manager.py` - Debug traces

## Files Modified (Rust - cleanup)

- Removed deprecated `loop_to_join.rs` (297 lines deleted)
- Updated pattern lowerers for cleaner exit handling
- Added lifecycle management improvements

## Verification

- Pattern 1: VM RC: 3, LLVM Result: 3 (no regression)
- ⚠️ Case C: Still has dominance error (separate root cause)
  - Needs additional scope fixes (phi_manager, resolver caches)

## Design Principles

- **Box-First**: Each function is an isolated Box with scoped state
- **SSOT**: (func_name, block_id) uniquely identifies block snapshots
- **Fail-Fast**: No cross-function state contamination

## Known Issues (Phase 132-P1)

Other function-local state needs same treatment:
- phi_manager.predeclared
- resolver caches (i64_cache, ptr_cache, etc.)
- builder._jump_only_blocks

## Documentation

- docs/development/current/main/investigations/phase132-p0-case-c-root-cause.md
- docs/development/current/main/investigations/phase132-p0-tuple-key-implementation.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 05:36:50 +09:00


//! Phase 188-Impl-3: JoinInlineBoundary - Boundary information for JoinIR inlining
//!
//! This module defines the boundary between JoinIR fragments and the host MIR function.
//! It enables clean separation of concerns:
//!
//! - **Box A**: JoinIR Frontend (doesn't know about host ValueIds)
//! - **Box B**: Join→MIR Bridge (converts to MIR using local ValueIds)
//! - **Box C**: JoinInlineBoundary (stores boundary info - THIS FILE)
//! - **Box D**: JoinMirInlineMerger (injects Copy instructions at boundary)
//!
//! ## Design Philosophy
//!
//! The JoinIR lowerer should work with **JoinIR-side ValueIds** allocated via
//! `JoinValueSpace` (Param: 100-999, Local: 1000+) without knowing anything about
//! the host function's ValueId space. This ensures:
//!
//! 1. **Modularity**: JoinIR lowerers are pure transformers
//! 2. **Reusability**: Same lowerer can be used in different contexts
//! 3. **Testability**: JoinIR can be tested independently
//! 4. **Correctness**: SSA properties are maintained via explicit Copy instructions
//!
//! ## Example
//!
//! For `loop(i < 3) { print(i); i = i + 1 }`:
//!
//! ```text
//! Host Function:
//!   ValueId(4) = Const 0              // i = 0 in host
//!
//! JoinIR Fragment:
//!   ValueId(100) = param              // i_param (JoinIR Param region)
//!   ValueId(1000) = Const 3
//!   ValueId(1001) = Compare ...
//!
//! Boundary:
//!   join_inputs: [ValueId(100)]       // JoinIR's param slot (Param region)
//!   host_inputs: [ValueId(4)]         // Host's `i` variable
//!
//! Merged MIR (with Copy injection):
//!   entry:
//!     ValueId(100) = Copy ValueId(4)  // Connect host→JoinIR
//!     ValueId(101) = Const 3
//!     ...
//! ```
use super::carrier_info::CarrierRole;
use crate::mir::ValueId;
/// Explicit binding between JoinIR exit value and host variable
///
/// This structure formalizes the connection between a JoinIR exit PHI value
/// and the host variable it should update. This eliminates implicit assumptions
/// about which variable a ValueId represents.
///
/// # Pattern 3 Example
///
/// For `loop(i < 3) { sum = sum + i; i = i + 1 }`:
///
/// ```text
/// LoopExitBinding {
///     carrier_name: "sum",
///     join_exit_value: ValueId(18), // k_exit's return value (JoinIR-local)
///     host_slot: ValueId(5),        // variable_map["sum"] in host
///     role: LoopState,              // participates in exit PHI
/// }
/// ```
///
/// # Multi-Carrier Support (Pattern 4+)
///
/// Multiple carriers can be represented as a vector:
///
/// ```text
/// vec![
///     LoopExitBinding { carrier_name: "sum", join_exit_value: ValueId(18), host_slot: ValueId(5), role: LoopState },
///     LoopExitBinding { carrier_name: "count", join_exit_value: ValueId(19), host_slot: ValueId(6), role: LoopState },
/// ]
/// ```
#[derive(Debug, Clone)]
pub struct LoopExitBinding {
    /// Carrier variable name (e.g., "sum", "count", "is_digit_pos")
    ///
    /// This is the variable name in the host's variable_map that should
    /// receive the exit value.
    pub carrier_name: String,

    /// JoinIR-side ValueId from k_exit (or exit parameter)
    ///
    /// This is the **JoinIR-local** ValueId that represents the exit value.
    /// It will be remapped when merged into the host function.
    pub join_exit_value: ValueId,

    /// Host-side variable_map slot to reconnect
    ///
    /// This is the host function's ValueId for the variable that should be
    /// updated with the exit PHI result.
    pub host_slot: ValueId,

    /// Phase 227: Role of this carrier (LoopState or ConditionOnly)
    ///
    /// Determines whether this carrier should participate in exit PHI:
    /// - LoopState: Needs exit PHI (value used after loop)
    /// - ConditionOnly: No exit PHI (only used in loop condition)
    pub role: CarrierRole,
}
/// Boundary information for inlining a JoinIR fragment into a host function
///
/// This structure captures the "interface" between a JoinIR fragment and the
/// host function, allowing the merger to inject necessary Copy instructions
/// to connect the two SSA value spaces.
///
/// # Design Note
///
/// This is a **pure data structure** with no logic. All transformation logic
/// lives in the merger (merge_joinir_mir_blocks).
#[derive(Debug, Clone)]
pub struct JoinInlineBoundary {
    /// JoinIR-local ValueIds that act as "input slots"
    ///
    /// These are the ValueIds used **inside** the JoinIR fragment to refer
    /// to values that come from the host. They should be in the JoinValueSpace
    /// Param region (100-999). (They are typically allocated sequentially.)
    ///
    /// Example: For a loop variable `i`, JoinIR uses ValueId(100) as the parameter.
    pub join_inputs: Vec<ValueId>,

    /// Host-function ValueIds that provide the input values
    ///
    /// These are the ValueIds from the **host function** that correspond to
    /// the join_inputs. The merger will inject Copy instructions to connect
    /// host_inputs[i] → join_inputs[i].
    ///
    /// Example: If host has `i` as ValueId(4), then host_inputs = [ValueId(4)].
    pub host_inputs: Vec<ValueId>,
    /// JoinIR-local ValueIds that represent outputs (if any)
    ///
    /// For loops that produce values (e.g., loop result), these are the
    /// JoinIR-local ValueIds that should be visible to the host after inlining.
    ///
    /// Not yet used as of Phase 188/189; reserved for future multi-carrier
    /// patterns (loops that return several variables at once).
    pub join_outputs: Vec<ValueId>,

    /// Host-function ValueIds that receive the outputs (DEPRECATED)
    ///
    /// **DEPRECATED**: Use `exit_bindings` instead for explicit carrier naming.
    ///
    /// These are the destination ValueIds in the host function that should
    /// receive the values from join_outputs or, in single-carrier cases such
    /// as Pattern 3, the host-side SSA slots that receive the loop exit PHI
    /// result.
    ///
    /// Unused through Phase 188-Impl-3; Phase 189 uses it to reconnect
    /// carriers that are updated at loop exit, such as `sum` in
    /// loop_if_phi.hako.
    #[deprecated(since = "Phase 190", note = "Use exit_bindings instead")]
    pub host_outputs: Vec<ValueId>,
    /// Explicit exit bindings for loop carriers (Phase 190+)
    ///
    /// Each binding explicitly names which variable is being updated and
    /// where the value comes from. This eliminates ambiguity and prepares
    /// for multi-carrier support.
    ///
    /// For Pattern 3 (single carrier "sum"):
    /// ```text
    /// exit_bindings: vec![
    ///     LoopExitBinding {
    ///         carrier_name: "sum",
    ///         join_exit_value: ValueId(18),  // k_exit return value
    ///         host_slot: ValueId(5),         // variable_map["sum"]
    ///         role: LoopState,               // participates in exit PHI
    ///     }
    /// ]
    /// ```
    pub exit_bindings: Vec<LoopExitBinding>,
    /// Condition-only input variables (Phase 171+ / Phase 171-fix)
    ///
    /// **DEPRECATED**: Use `condition_bindings` instead (Phase 171-fix).
    ///
    /// These are variables used ONLY in the loop condition, NOT as loop parameters.
    /// They need to be available in JoinIR scope but are not modified by the loop.
    ///
    /// # Example
    ///
    /// For `loop(start < end) { i = i + 1 }`:
    /// - Loop parameter: `i` → goes in `join_inputs`/`host_inputs`
    /// - Condition-only: `start`, `end` → go in `condition_inputs`
    ///
    /// # Format
    ///
    /// Each entry is `(variable_name, host_value_id)`:
    /// ```text
    /// condition_inputs: vec![
    ///     ("start".to_string(), ValueId(33)),  // HOST ID for "start"
    ///     ("end".to_string(), ValueId(34)),    // HOST ID for "end"
    /// ]
    /// ```
    ///
    /// The merger will:
    /// 1. Extract unique variable names from condition AST
    /// 2. Look up HOST ValueIds from `builder.variable_map`
    /// 3. Inject Copy instructions for each condition input
    /// 4. Remap JoinIR references to use the copied values
    #[deprecated(since = "Phase 171-fix", note = "Use condition_bindings instead")]
    pub condition_inputs: Vec<(String, ValueId)>,
    /// Phase 171-fix: Condition bindings with explicit JoinIR ValueIds
    ///
    /// Each binding explicitly specifies:
    /// - Variable name
    /// - HOST ValueId (source)
    /// - JoinIR ValueId (destination)
    ///
    /// This replaces `condition_inputs` to ensure proper ValueId separation.
    pub condition_bindings: Vec<super::condition_to_joinir::ConditionBinding>,

    /// Phase 33-14: Expression result ValueId (JoinIR-local)
    ///
    /// If the loop is used as an expression (like `return loop(...)`), this field
    /// contains the JoinIR-local ValueId of k_exit's return value.
    ///
    /// - `Some(ValueId)`: Loop returns a value → k_exit return goes to exit_phi_inputs
    /// - `None`: Loop only updates carriers → no exit_phi_inputs generation
    ///
    /// # Example: joinir_min_loop.hako (expr result pattern)
    ///
    /// ```nyash
    /// loop(i < 3) { if (i >= 2) { break } i = i + 1 }
    /// return i
    /// ```
    ///
    /// Here, `expr_result = Some(i_exit)` because the loop's result is used.
    ///
    /// # Example: trim pattern (carrier-only)
    ///
    /// ```nyash
    /// loop(start < end) { start = start + 1 }
    /// print(start)  // Uses carrier after loop
    /// ```
    ///
    /// Here, `expr_result = None` because the loop doesn't return a value.
    pub expr_result: Option<crate::mir::ValueId>,

    /// Phase 33-16: Loop variable name (for LoopHeaderPhiBuilder)
    ///
    /// The name of the loop control variable (e.g., "i" in `loop(i < 3)`).
    /// Used to track which PHI corresponds to the loop variable.
    pub loop_var_name: Option<String>,

    /// Phase 228: Carrier metadata (for header PHI generation)
    ///
    /// Contains full carrier information including initialization policies.
    /// This allows header PHI generation to handle ConditionOnly carriers
    /// with explicit bool initialization.
    ///
    /// - `Some(CarrierInfo)`: Full carrier metadata available
    /// - `None`: Legacy path (derive carriers from exit_bindings)
    pub carrier_info: Option<super::carrier_info::CarrierInfo>,
}
impl JoinInlineBoundary {
    /// Create a new boundary with input mappings only
    ///
    /// This is the common case for loops like Pattern 1 where:
    /// - Inputs: loop variables (e.g., `i` in `loop(i < 3)`)
    /// - Outputs: none (loop returns void/0)
    pub fn new_inputs_only(join_inputs: Vec<ValueId>, host_inputs: Vec<ValueId>) -> Self {
        assert_eq!(
            join_inputs.len(),
            host_inputs.len(),
            "join_inputs and host_inputs must have same length"
        );
        Self {
            join_inputs,
            host_inputs,
            join_outputs: vec![],
            #[allow(deprecated)]
            host_outputs: vec![],
            exit_bindings: vec![],
            #[allow(deprecated)]
            condition_inputs: vec![], // Phase 171: Default to empty (deprecated)
            condition_bindings: vec![], // Phase 171-fix: Default to empty
            expr_result: None,        // Phase 33-14: Default to carrier-only pattern
            loop_var_name: None,      // Phase 33-16
            carrier_info: None,       // Phase 228: Default to None
        }
    }

    /// Create a new boundary with inputs and **host outputs only** (DEPRECATED)
    ///
    /// **DEPRECATED**: Use `new_with_exit_bindings` instead for explicit carrier naming.
    ///
    /// Use this when the JoinIR-side exit values (e.g., the arguments of k_exit)
    /// are merged into a single PHI and that PHI result should be reconnected
    /// to a host-side variable slot.
    ///
    /// Typical example: Pattern 3 (loop_if_phi.hako)
    /// - join_inputs  : [i_init, sum_init]
    /// - host_inputs  : [host_i, host_sum]
    /// - host_outputs : [host_sum]  // variable to overwrite at loop exit
    #[deprecated(since = "Phase 190", note = "Use new_with_exit_bindings instead")]
    pub fn new_with_input_and_host_outputs(
        join_inputs: Vec<ValueId>,
        host_inputs: Vec<ValueId>,
        host_outputs: Vec<ValueId>,
    ) -> Self {
        assert_eq!(
            join_inputs.len(),
            host_inputs.len(),
            "join_inputs and host_inputs must have same length"
        );
        Self {
            join_inputs,
            host_inputs,
            join_outputs: vec![],
            #[allow(deprecated)]
            host_outputs,
            exit_bindings: vec![],
            #[allow(deprecated)]
            condition_inputs: vec![], // Phase 171: Default to empty (deprecated)
            condition_bindings: vec![], // Phase 171-fix: Default to empty
            expr_result: None,        // Phase 33-14
            loop_var_name: None,      // Phase 33-16
            carrier_info: None,       // Phase 228
        }
    }

    /// Create a new boundary with explicit exit bindings (Phase 190+)
    ///
    /// This is the recommended constructor for loops with exit carriers.
    /// Each exit binding explicitly names the carrier variable and its
    /// source/destination values.
    ///
    /// # Example: Pattern 3 (single carrier)
    ///
    /// ```ignore
    /// let boundary = JoinInlineBoundary::new_with_exit_bindings(
    ///     vec![ValueId(0), ValueId(1)],   // join_inputs (i, sum init)
    ///     vec![loop_var_id, sum_var_id],  // host_inputs
    ///     vec![
    ///         LoopExitBinding {
    ///             carrier_name: "sum".to_string(),
    ///             join_exit_value: ValueId(18),  // k_exit return value
    ///             host_slot: sum_var_id,         // variable_map["sum"]
    ///             role: CarrierRole::LoopState,  // participates in exit PHI
    ///         }
    ///     ],
    /// );
    /// ```
    ///
    /// # Example: Pattern 4+ (multiple carriers)
    ///
    /// ```ignore
    /// let boundary = JoinInlineBoundary::new_with_exit_bindings(
    ///     vec![ValueId(0), ValueId(1), ValueId(2)],  // join_inputs
    ///     vec![i_id, sum_id, count_id],              // host_inputs
    ///     vec![
    ///         LoopExitBinding { carrier_name: "sum".to_string(), ... },
    ///         LoopExitBinding { carrier_name: "count".to_string(), ... },
    ///     ],
    /// );
    /// ```
    pub fn new_with_exit_bindings(
        join_inputs: Vec<ValueId>,
        host_inputs: Vec<ValueId>,
        exit_bindings: Vec<LoopExitBinding>,
    ) -> Self {
        assert_eq!(
            join_inputs.len(),
            host_inputs.len(),
            "join_inputs and host_inputs must have same length"
        );
        Self {
            join_inputs,
            host_inputs,
            join_outputs: vec![],
            #[allow(deprecated)]
            host_outputs: vec![],
            exit_bindings,
            #[allow(deprecated)]
            condition_inputs: vec![], // Phase 171: Default to empty (deprecated)
            condition_bindings: vec![], // Phase 171-fix: Default to empty
            expr_result: None,        // Phase 33-14
            loop_var_name: None,      // Phase 33-16
            carrier_info: None,       // Phase 228
        }
    }

    /// Create a new boundary with condition inputs (Phase 171+)
    ///
    /// This constructor fills the deprecated `condition_inputs` field; prefer
    /// `new_with_condition_bindings` (Phase 171-fix) for new code.
    ///
    /// # Arguments
    ///
    /// * `join_inputs` - JoinIR-local ValueIds for loop parameters
    /// * `host_inputs` - HOST ValueIds for loop parameters
    /// * `condition_inputs` - Condition-only variables [(name, host_value_id)]
    ///
    /// # Example
    ///
    /// ```ignore
    /// let boundary = JoinInlineBoundary::new_with_condition_inputs(
    ///     vec![ValueId(0)],  // join_inputs (i)
    ///     vec![ValueId(5)],  // host_inputs (i)
    ///     vec![
    ///         ("start".to_string(), ValueId(33)),
    ///         ("end".to_string(), ValueId(34)),
    ///     ],
    /// );
    /// ```
    pub fn new_with_condition_inputs(
        join_inputs: Vec<ValueId>,
        host_inputs: Vec<ValueId>,
        condition_inputs: Vec<(String, ValueId)>,
    ) -> Self {
        assert_eq!(
            join_inputs.len(),
            host_inputs.len(),
            "join_inputs and host_inputs must have same length"
        );
        Self {
            join_inputs,
            host_inputs,
            join_outputs: vec![],
            #[allow(deprecated)]
            host_outputs: vec![],
            exit_bindings: vec![],
            #[allow(deprecated)]
            condition_inputs,
            condition_bindings: vec![], // Phase 171-fix: Left empty; see new_with_condition_bindings
            expr_result: None,          // Phase 33-14
            loop_var_name: None,        // Phase 33-16
            carrier_info: None,         // Phase 228
        }
    }

    /// Create boundary with inputs, exit bindings, AND condition inputs (Phase 171+)
    ///
    /// This is the most complete constructor for loops with carriers and condition variables.
    ///
    /// # Example: Pattern 3 with condition variables
    ///
    /// ```ignore
    /// let boundary = JoinInlineBoundary::new_with_exit_and_condition_inputs(
    ///     vec![ValueId(0), ValueId(1)],   // join_inputs (i, sum)
    ///     vec![ValueId(5), ValueId(10)],  // host_inputs
    ///     vec![
    ///         LoopExitBinding {
    ///             carrier_name: "sum".to_string(),
    ///             join_exit_value: ValueId(18),
    ///             host_slot: ValueId(10),
    ///             role: CarrierRole::LoopState,
    ///         }
    ///     ],
    ///     vec![
    ///         ("start".to_string(), ValueId(33)),
    ///         ("end".to_string(), ValueId(34)),
    ///     ],
    /// );
    /// ```
    pub fn new_with_exit_and_condition_inputs(
        join_inputs: Vec<ValueId>,
        host_inputs: Vec<ValueId>,
        exit_bindings: Vec<LoopExitBinding>,
        condition_inputs: Vec<(String, ValueId)>,
    ) -> Self {
        assert_eq!(
            join_inputs.len(),
            host_inputs.len(),
            "join_inputs and host_inputs must have same length"
        );
        Self {
            join_inputs,
            host_inputs,
            join_outputs: vec![],
            #[allow(deprecated)]
            host_outputs: vec![],
            exit_bindings,
            #[allow(deprecated)]
            condition_inputs,
            condition_bindings: vec![], // Phase 171-fix: Left empty; see new_with_condition_bindings
            expr_result: None,          // Phase 33-14
            loop_var_name: None,        // Phase 33-16
            carrier_info: None,         // Phase 228
        }
    }

    /// Phase 171-fix: Create boundary with ConditionBindings (NEW constructor)
    ///
    /// This is the recommended constructor that uses ConditionBindings instead of
    /// the deprecated condition_inputs.
    ///
    /// # Arguments
    ///
    /// * `join_inputs` - JoinIR-local ValueIds for loop parameters
    /// * `host_inputs` - HOST ValueIds for loop parameters
    /// * `condition_bindings` - Explicit HOST ↔ JoinIR mappings for condition variables
    ///
    /// # Example
    ///
    /// ```ignore
    /// let boundary = JoinInlineBoundary::new_with_condition_bindings(
    ///     vec![ValueId(0)],  // join_inputs (loop param i)
    ///     vec![ValueId(5)],  // host_inputs (loop param i)
    ///     vec![
    ///         ConditionBinding {
    ///             name: "start".to_string(),
    ///             host_value: ValueId(33),  // HOST
    ///             join_value: ValueId(1),   // JoinIR
    ///         },
    ///         ConditionBinding {
    ///             name: "end".to_string(),
    ///             host_value: ValueId(34),  // HOST
    ///             join_value: ValueId(2),   // JoinIR
    ///         },
    ///     ],
    /// );
    /// ```
    pub fn new_with_condition_bindings(
        join_inputs: Vec<ValueId>,
        host_inputs: Vec<ValueId>,
        condition_bindings: Vec<super::condition_to_joinir::ConditionBinding>,
    ) -> Self {
        assert_eq!(
            join_inputs.len(),
            host_inputs.len(),
            "join_inputs and host_inputs must have same length"
        );
        Self {
            join_inputs,
            host_inputs,
            join_outputs: vec![],
            #[allow(deprecated)]
            host_outputs: vec![],
            exit_bindings: vec![],
            #[allow(deprecated)]
            condition_inputs: vec![], // Deprecated, use condition_bindings instead
            condition_bindings,
            expr_result: None,   // Phase 33-14
            loop_var_name: None, // Phase 33-16
            carrier_info: None,  // Phase 228
        }
    }
}
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_boundary_inputs_only() {
        let boundary = JoinInlineBoundary::new_inputs_only(
            vec![ValueId(0)], // JoinIR uses ValueId(0) for loop var
            vec![ValueId(4)], // Host has loop var at ValueId(4)
        );
        assert_eq!(boundary.join_inputs.len(), 1);
        assert_eq!(boundary.host_inputs.len(), 1);
        assert_eq!(boundary.join_outputs.len(), 0);
        #[allow(deprecated)]
        {
            assert_eq!(boundary.host_outputs.len(), 0);
            assert_eq!(boundary.condition_inputs.len(), 0); // Phase 171: Deprecated field
        }
        assert_eq!(boundary.condition_bindings.len(), 0); // Phase 171-fix: New field
    }

    #[test]
    #[should_panic(expected = "join_inputs and host_inputs must have same length")]
    fn test_boundary_mismatched_inputs() {
        JoinInlineBoundary::new_inputs_only(vec![ValueId(0), ValueId(1)], vec![ValueId(4)]);
    }
}
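
The module docs say the merger connects `host_inputs[i]` to `join_inputs[i]` with Copy instructions. The real merger (`merge_joinir_mir_blocks`) is not shown in this file, so the following is only a minimal standalone sketch of that pairing. `ValueId` here is a local stand-in assumed to be a `u32` newtype, and `CopyInst`, `Boundary`, and `entry_copies` are hypothetical names introduced for illustration.

```rust
/// Local stand-in for the crate's `ValueId` (assumed to be a u32 newtype).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ValueId(u32);

/// Hypothetical minimal instruction: `dst = Copy src`.
#[derive(Debug, PartialEq, Eq)]
struct CopyInst {
    dst: ValueId,
    src: ValueId,
}

/// Reduced boundary that carries only the two input vectors.
struct Boundary {
    join_inputs: Vec<ValueId>,
    host_inputs: Vec<ValueId>,
}

impl Boundary {
    /// Mirrors the length check performed by the constructors in this file.
    fn new_inputs_only(join_inputs: Vec<ValueId>, host_inputs: Vec<ValueId>) -> Self {
        assert_eq!(
            join_inputs.len(),
            host_inputs.len(),
            "join_inputs and host_inputs must have same length"
        );
        Self { join_inputs, host_inputs }
    }
}

/// Build one Copy per input pair: join_inputs[i] = Copy host_inputs[i].
fn entry_copies(boundary: &Boundary) -> Vec<CopyInst> {
    boundary
        .join_inputs
        .iter()
        .zip(boundary.host_inputs.iter())
        .map(|(&dst, &src)| CopyInst { dst, src })
        .collect()
}
```

For the module-level example (`join_inputs: [ValueId(100)]`, `host_inputs: [ValueId(4)]`), `entry_copies` yields the single instruction `ValueId(100) = Copy ValueId(4)`. Because the constructor rejects vectors of different lengths, `zip` never silently drops an input.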