419 lines
15 KiB
Markdown
419 lines
15 KiB
Markdown
|
|
# Phase 171-2: Condition Inputs Metadata Design
|
||
|
|
|
||
|
|
**Date**: 2025-12-07
|
||
|
|
**Status**: Design Complete
|
||
|
|
**Decision**: **Option A - Extend JoinInlineBoundary**
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Design Decision: Option A
|
||
|
|
|
||
|
|
### Rationale
|
||
|
|
|
||
|
|
**Option A: Extend JoinInlineBoundary** (CHOSEN ✅)
|
||
|
|
```rust
|
||
|
|
pub struct JoinInlineBoundary {
|
||
|
|
pub join_inputs: Vec<ValueId>,
|
||
|
|
pub host_inputs: Vec<ValueId>,
|
||
|
|
pub join_outputs: Vec<ValueId>,
|
||
|
|
pub host_outputs: Vec<ValueId>,
|
||
|
|
pub exit_bindings: Vec<LoopExitBinding>,
|
||
|
|
|
||
|
|
// NEW: Condition-only inputs
|
||
|
|
pub condition_inputs: Vec<(String, ValueId)>, // [(var_name, host_value_id)]
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
**Why this is best**:
|
||
|
|
1. **Minimal invasiveness**: Single structure change
|
||
|
|
2. **Clear semantics**: "Condition inputs" are distinct from "loop parameters"
|
||
|
|
3. **Reuses existing infrastructure**: Same Copy injection mechanism
|
||
|
|
4. **Future-proof**: Easy to extend for condition-only outputs (if needed)
|
||
|
|
5. **Symmetric design**: Mirrors how `exit_bindings` handle exit values
|
||
|
|
|
||
|
|
**Rejected alternatives**:
|
||
|
|
|
||
|
|
**Option B: Create new LoopInputBinding**
|
||
|
|
```rust
|
||
|
|
pub struct LoopInputBinding {
|
||
|
|
pub condition_vars: HashMap<String, ValueId>,
|
||
|
|
}
|
||
|
|
```
|
||
|
|
❌ **Rejected**: Introduces another structure; harder to coordinate with boundary
|
||
|
|
|
||
|
|
**Option C: Extend LoopExitBinding**
|
||
|
|
```rust
|
||
|
|
pub struct LoopExitBinding {
|
||
|
|
pub condition_inputs: Vec<String>, // NEW
|
||
|
|
// ...
|
||
|
|
}
|
||
|
|
```
|
||
|
|
❌ **Rejected**: Semantic mismatch (exit bindings are for outputs, not inputs)
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Detailed Design
|
||
|
|
|
||
|
|
### 1. Extended Structure Definition
|
||
|
|
|
||
|
|
**File**: `src/mir/join_ir/lowering/inline_boundary.rs`
|
||
|
|
|
||
|
|
```rust
|
||
|
|
#[derive(Debug, Clone)]
|
||
|
|
pub struct JoinInlineBoundary {
|
||
|
|
/// JoinIR-local ValueIds that act as "input slots"
|
||
|
|
///
|
||
|
|
/// These are the ValueIds used **inside** the JoinIR fragment to refer
|
||
|
|
/// to values that come from the host. They should be small sequential
|
||
|
|
/// IDs (0, 1, 2, ...) since JoinIR lowerers allocate locally.
|
||
|
|
///
|
||
|
|
/// Example: For a loop variable `i`, JoinIR uses ValueId(0) as the parameter.
|
||
|
|
pub join_inputs: Vec<ValueId>,
|
||
|
|
|
||
|
|
/// Host-function ValueIds that provide the input values
|
||
|
|
///
|
||
|
|
/// These are the ValueIds from the **host function** that correspond to
|
||
|
|
/// the join_inputs. The merger will inject Copy instructions to connect
|
||
|
|
/// host_inputs[i] → join_inputs[i].
|
||
|
|
///
|
||
|
|
/// Example: If host has `i` as ValueId(4), then host_inputs = [ValueId(4)].
|
||
|
|
pub host_inputs: Vec<ValueId>,
|
||
|
|
|
||
|
|
/// JoinIR-local ValueIds that represent outputs (if any)
|
||
|
|
pub join_outputs: Vec<ValueId>,
|
||
|
|
|
||
|
|
/// Host-function ValueIds that receive the outputs (DEPRECATED)
|
||
|
|
#[deprecated(since = "Phase 190", note = "Use exit_bindings instead")]
|
||
|
|
pub host_outputs: Vec<ValueId>,
|
||
|
|
|
||
|
|
/// Explicit exit bindings for loop carriers (Phase 190+)
|
||
|
|
pub exit_bindings: Vec<LoopExitBinding>,
|
||
|
|
|
||
|
|
/// Condition-only input variables (Phase 171+)
|
||
|
|
///
|
||
|
|
/// These are variables used ONLY in the loop condition, NOT as loop parameters.
|
||
|
|
/// They need to be available in JoinIR scope but are not modified by the loop.
|
||
|
|
///
|
||
|
|
/// # Example
|
||
|
|
///
|
||
|
|
/// For `loop(start < end) { i = i + 1 }`:
|
||
|
|
/// - Loop parameter: `i` → goes in `join_inputs`/`host_inputs`
|
||
|
|
/// - Condition-only: `start`, `end` → go in `condition_inputs`
|
||
|
|
///
|
||
|
|
/// # Format
|
||
|
|
///
|
||
|
|
/// Each entry is `(variable_name, host_value_id)`:
|
||
|
|
/// ```
|
||
|
|
/// condition_inputs: vec![
|
||
|
|
/// ("start".to_string(), ValueId(33)), // HOST ID for "start"
|
||
|
|
/// ("end".to_string(), ValueId(34)), // HOST ID for "end"
|
||
|
|
/// ]
|
||
|
|
/// ```
|
||
|
|
///
|
||
|
|
/// The merger will:
|
||
|
|
/// 1. Extract unique variable names from condition AST
|
||
|
|
/// 2. Look up HOST ValueIds from `builder.variable_map`
|
||
|
|
/// 3. Inject Copy instructions for each condition input
|
||
|
|
/// 4. Remap JoinIR references to use the copied values
|
||
|
|
pub condition_inputs: Vec<(String, ValueId)>,
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
### 2. Constructor Updates
|
||
|
|
|
||
|
|
**Add new constructor**:
|
||
|
|
|
||
|
|
```rust
|
||
|
|
impl JoinInlineBoundary {
|
||
|
|
/// Create a new boundary with condition inputs (Phase 171+)
|
||
|
|
///
|
||
|
|
/// # Arguments
|
||
|
|
///
|
||
|
|
/// * `join_inputs` - JoinIR-local ValueIds for loop parameters
|
||
|
|
/// * `host_inputs` - HOST ValueIds for loop parameters
|
||
|
|
/// * `condition_inputs` - Condition-only variables [(name, host_value_id)]
|
||
|
|
///
|
||
|
|
/// # Example
|
||
|
|
///
|
||
|
|
/// ```ignore
|
||
|
|
/// let boundary = JoinInlineBoundary::new_with_condition_inputs(
|
||
|
|
/// vec![ValueId(0)], // join_inputs (i)
|
||
|
|
/// vec![ValueId(5)], // host_inputs (i)
|
||
|
|
/// vec![
|
||
|
|
/// ("start".to_string(), ValueId(33)),
|
||
|
|
/// ("end".to_string(), ValueId(34)),
|
||
|
|
/// ],
|
||
|
|
/// );
|
||
|
|
/// ```
|
||
|
|
pub fn new_with_condition_inputs(
|
||
|
|
join_inputs: Vec<ValueId>,
|
||
|
|
host_inputs: Vec<ValueId>,
|
||
|
|
condition_inputs: Vec<(String, ValueId)>,
|
||
|
|
) -> Self {
|
||
|
|
assert_eq!(
|
||
|
|
join_inputs.len(),
|
||
|
|
host_inputs.len(),
|
||
|
|
"join_inputs and host_inputs must have same length"
|
||
|
|
);
|
||
|
|
Self {
|
||
|
|
join_inputs,
|
||
|
|
host_inputs,
|
||
|
|
join_outputs: vec![],
|
||
|
|
#[allow(deprecated)]
|
||
|
|
host_outputs: vec![],
|
||
|
|
exit_bindings: vec![],
|
||
|
|
condition_inputs,
|
||
|
|
}
|
||
|
|
}
|
||
|
|
|
||
|
|
/// Create boundary with inputs, exit bindings, AND condition inputs (Phase 171+)
|
||
|
|
pub fn new_with_exit_and_condition_inputs(
|
||
|
|
join_inputs: Vec<ValueId>,
|
||
|
|
host_inputs: Vec<ValueId>,
|
||
|
|
exit_bindings: Vec<LoopExitBinding>,
|
||
|
|
condition_inputs: Vec<(String, ValueId)>,
|
||
|
|
) -> Self {
|
||
|
|
assert_eq!(
|
||
|
|
join_inputs.len(),
|
||
|
|
host_inputs.len(),
|
||
|
|
"join_inputs and host_inputs must have same length"
|
||
|
|
);
|
||
|
|
Self {
|
||
|
|
join_inputs,
|
||
|
|
host_inputs,
|
||
|
|
join_outputs: vec![],
|
||
|
|
#[allow(deprecated)]
|
||
|
|
host_outputs: vec![],
|
||
|
|
exit_bindings,
|
||
|
|
condition_inputs,
|
||
|
|
}
|
||
|
|
}
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
**Update existing constructors** to set `condition_inputs: vec![]`:
|
||
|
|
|
||
|
|
```rust
|
||
|
|
pub fn new_inputs_only(join_inputs: Vec<ValueId>, host_inputs: Vec<ValueId>) -> Self {
|
||
|
|
// ... existing assertions
|
||
|
|
Self {
|
||
|
|
join_inputs,
|
||
|
|
host_inputs,
|
||
|
|
join_outputs: vec![],
|
||
|
|
#[allow(deprecated)]
|
||
|
|
host_outputs: vec![],
|
||
|
|
exit_bindings: vec![],
|
||
|
|
condition_inputs: vec![], // NEW: Default to empty
|
||
|
|
}
|
||
|
|
}
|
||
|
|
|
||
|
|
pub fn new_with_exit_bindings(
|
||
|
|
join_inputs: Vec<ValueId>,
|
||
|
|
host_inputs: Vec<ValueId>,
|
||
|
|
exit_bindings: Vec<LoopExitBinding>,
|
||
|
|
) -> Self {
|
||
|
|
// ... existing assertions
|
||
|
|
Self {
|
||
|
|
join_inputs,
|
||
|
|
host_inputs,
|
||
|
|
join_outputs: vec![],
|
||
|
|
#[allow(deprecated)]
|
||
|
|
host_outputs: vec![],
|
||
|
|
exit_bindings,
|
||
|
|
condition_inputs: vec![], // NEW: Default to empty
|
||
|
|
}
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
### 3. Value Flow Diagram
|
||
|
|
|
||
|
|
```
|
||
|
|
┌─────────────────────────────────────────────────────────────────┐
|
||
|
|
│ HOST MIR Builder │
|
||
|
|
│ │
|
||
|
|
│ variable_map: │
|
||
|
|
│ "i" → ValueId(5) (loop variable - becomes parameter) │
|
||
|
|
│ "start" → ValueId(33) (condition input - read-only) │
|
||
|
|
│ "end" → ValueId(34) (condition input - read-only) │
|
||
|
|
│ "sum" → ValueId(10) (carrier - exit binding) │
|
||
|
|
└─────────────────────────────────────────────────────────────────┘
|
||
|
|
↓
|
||
|
|
┌─────────────────┴─────────────────┐
|
||
|
|
│ │
|
||
|
|
↓ ↓
|
||
|
|
┌──────────────────────┐ ┌──────────────────────┐
|
||
|
|
│ JoinIR Lowerer │ │ Condition Extractor │
|
||
|
|
│ │ │ │
|
||
|
|
│ Allocates: │ │ Extracts variables: │
|
||
|
|
│ i_param = Val(0) │ │ ["start", "end"] │
|
||
|
|
│ sum_init = Val(1) │ │ │
|
||
|
|
└──────────────────────┘ └──────────────────────┘
|
||
|
|
↓ ↓
|
||
|
|
└─────────────────┬─────────────────┘
|
||
|
|
↓
|
||
|
|
┌────────────────────────────────────────────┐
|
||
|
|
│ JoinInlineBoundary │
|
||
|
|
│ │
|
||
|
|
│ join_inputs: [Val(0), Val(1)] │
|
||
|
|
│ host_inputs: [Val(5), Val(10)] │
|
||
|
|
│ │
|
||
|
|
│ condition_inputs: [ │
|
||
|
|
│ ("start", Val(33)), │
|
||
|
|
│ ("end", Val(34)) │
|
||
|
|
│ ] │
|
||
|
|
│ │
|
||
|
|
│ exit_bindings: [ │
|
||
|
|
│ { carrier: "sum", join_exit: Val(18), │
|
||
|
|
│ host_slot: Val(10) } │
|
||
|
|
│ ] │
|
||
|
|
└────────────────────────────────────────────┘
|
||
|
|
↓
|
||
|
|
┌────────────────────────────────────────────┐
|
||
|
|
│ merge_joinir_mir_blocks() │
|
||
|
|
│ │
|
||
|
|
│ Phase 1: Inject Copy instructions │
|
||
|
|
│ Val(100) = Copy Val(5) // i │
|
||
|
|
│ Val(101) = Copy Val(10) // sum │
|
||
|
|
│ Val(102) = Copy Val(33) // start ← NEW │
|
||
|
|
│ Val(103) = Copy Val(34) // end ← NEW │
|
||
|
|
│ │
|
||
|
|
│ Phase 2: Remap JoinIR ValueIds │
|
||
|
|
│ Val(0) → Val(100) // i param │
|
||
|
|
│ Val(1) → Val(101) // sum init │
|
||
|
|
│ │
|
||
|
|
│ Phase 3: Remap condition refs │
|
||
|
|
│ Compare { lhs: Val(33), ... } │
|
||
|
|
│ ↓ NO CHANGE (uses HOST ID directly) │
|
||
|
|
│ Compare { lhs: Val(102), ... } ← FIXED │
|
||
|
|
│ │
|
||
|
|
│ Phase 4: Reconnect exit bindings │
|
||
|
|
│ variable_map["sum"] = Val(200) │
|
||
|
|
└────────────────────────────────────────────┘
|
||
|
|
↓
|
||
|
|
✅ All ValueIds valid
|
||
|
|
```
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
### 4. Key Insight: Two Types of Inputs
|
||
|
|
|
||
|
|
This design recognizes **two distinct categories** of JoinIR inputs:
|
||
|
|
|
||
|
|
| Category | Examples | Boundary Field | Mutability | Treatment |
|
||
|
|
|----------|----------|----------------|-----------|-----------|
|
||
|
|
| **Loop Parameters** | `i` (loop var), `sum` (carrier) | `join_inputs`/`host_inputs` | Mutable | Passed as function params |
|
||
|
|
| **Condition Inputs** | `start`, `end`, `len` | `condition_inputs` | Read-only | Captured from HOST scope |
|
||
|
|
|
||
|
|
**Why separate?**
|
||
|
|
|
||
|
|
1. **Semantic clarity**: Loop parameters can be modified; condition inputs are immutable
|
||
|
|
2. **Implementation simplicity**: Condition inputs don't need JoinIR parameters - just Copy + remap
|
||
|
|
3. **Future extensibility**: May want condition-only outputs (e.g., for side-effectful conditions)
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
### 5. Implementation Strategy
|
||
|
|
|
||
|
|
**Step 1**: Modify `inline_boundary.rs`
|
||
|
|
- Add `condition_inputs` field
|
||
|
|
- Update all constructors to initialize it
|
||
|
|
- Add new constructors for condition input support
|
||
|
|
|
||
|
|
**Step 2**: Implement condition variable extraction
|
||
|
|
- Create `extract_condition_variables()` function
|
||
|
|
- Recursively traverse condition AST
|
||
|
|
- Collect unique variable names
|
||
|
|
|
||
|
|
**Step 3**: Update loop lowerers
|
||
|
|
- Call `extract_condition_variables()` on condition AST
|
||
|
|
- Look up HOST ValueIds from `builder.variable_map`
|
||
|
|
- Pass to boundary constructor
|
||
|
|
|
||
|
|
**Step 4**: Update merge logic
|
||
|
|
- Inject Copy instructions for condition inputs
|
||
|
|
- Create temporary mapping: var_name → copied_value_id
|
||
|
|
- Rewrite condition instructions to use copied ValueIds
|
||
|
|
|
||
|
|
**Step 5**: Test with trim pattern
|
||
|
|
- Should resolve ValueId(33) undefined error
|
||
|
|
- Verify condition evaluation uses correct values
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Remaining Questions
|
||
|
|
|
||
|
|
### Q1: Should condition inputs be remapped globally or locally?
|
||
|
|
|
||
|
|
**Answer**: **Locally** - only within JoinIR condition instructions
|
||
|
|
|
||
|
|
**Rationale**: Condition inputs are used in:
|
||
|
|
1. Loop condition evaluation (in `loop_step` function)
|
||
|
|
2. Nowhere else (by definition - they're condition-only)
|
||
|
|
|
||
|
|
So we only need to remap ValueIds in the condition instructions, not globally across all JoinIR.
|
||
|
|
|
||
|
|
### Q2: What if a condition input is ALSO a loop parameter?
|
||
|
|
|
||
|
|
**Example**: `loop(i < 10) { i = i + 1 }`
|
||
|
|
- `i` is both a loop parameter (mutated in body) AND used in condition
|
||
|
|
|
||
|
|
**Answer**: **Loop parameter takes precedence** - it's already in `join_inputs`/`host_inputs`
|
||
|
|
|
||
|
|
**Implementation**: When extracting condition variables, SKIP any that are already in `join_inputs`
|
||
|
|
|
||
|
|
```rust
|
||
|
|
fn extract_condition_variables(
|
||
|
|
condition_ast: &ASTNode,
|
||
|
|
join_inputs_names: &[String], // Already-registered parameters
|
||
|
|
) -> Vec<String> {
|
||
|
|
let all_vars = collect_variable_names_recursive(condition_ast);
|
||
|
|
all_vars.into_iter()
|
||
|
|
.filter(|name| !join_inputs_names.contains(name)) // Skip loop params
|
||
|
|
.collect()
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
### Q3: How to handle condition variables that don't exist in variable_map?
|
||
|
|
|
||
|
|
**Answer**: **Fail-fast with clear error**
|
||
|
|
|
||
|
|
```rust
|
||
|
|
let host_value_id = builder.variable_map.get(var_name)
|
||
|
|
.ok_or_else(|| {
|
||
|
|
format!(
|
||
|
|
"Condition variable '{}' not found in variable_map. \
|
||
|
|
Loop condition references undefined variable.",
|
||
|
|
var_name
|
||
|
|
)
|
||
|
|
})?;
|
||
|
|
```
|
||
|
|
|
||
|
|
This follows the "Fail-Fast" principle from CLAUDE.md.
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## Summary
|
||
|
|
|
||
|
|
**Design Choice**: Option A - Extend `JoinInlineBoundary` with `condition_inputs` field
|
||
|
|
|
||
|
|
**Key Properties**:
|
||
|
|
- ✅ Minimal invasiveness (single structure change)
|
||
|
|
- ✅ Clear semantics (condition-only inputs)
|
||
|
|
- ✅ Reuses existing Copy injection mechanism
|
||
|
|
- ✅ Symmetric with `exit_bindings` design
|
||
|
|
- ✅ Handles all edge cases (overlap with loop params, missing variables)
|
||
|
|
|
||
|
|
**Next Steps**: Phase 171-3 - Implement condition variable extraction in loop lowerers
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## References
|
||
|
|
|
||
|
|
- Phase 171-1 Analysis: `phase171-1-boundary-analysis.md`
|
||
|
|
- JoinInlineBoundary: `src/mir/join_ir/lowering/inline_boundary.rs`
|
||
|
|
- Merge Logic: `src/mir/builder/control_flow/joinir/merge/mod.rs`
|