Proper HOST↔JoinIR ValueId separation for condition variables: - Add ConditionEnv struct (name → JoinIR-local ValueId mapping) - Add ConditionBinding struct (HOST/JoinIR ValueId pairs) - Modify condition_to_joinir to use ConditionEnv instead of builder.variable_map - Update Pattern2 lowerer to build ConditionEnv and ConditionBindings - Extend JoinInlineBoundary with condition_bindings field - Update BoundaryInjector to inject Copy instructions for condition variables This fixes the undefined ValueId errors where HOST ValueIds were being used directly in JoinIR instructions. Programs now execute (RC: 0), though loop variable exit values still need Phase 172 work. Key invariants established: 1. JoinIR uses ONLY JoinIR-local ValueIds 2. HOST↔JoinIR bridging is ONLY through JoinInlineBoundary 3. condition_to_joinir NEVER accesses builder.variable_map 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
7.9 KiB
Phase 170: ValueId Boundary Mapping Analysis
Date: 2025-12-07 Status: Root Cause Identified Impact: CRITICAL - Blocks all JsonParserBox complex condition tests
Problem Summary
JoinIR loop patterns with complex conditions (e.g., start < end in _trim) compile successfully but fail silently at runtime because condition variable ValueIds are not properly mapped between HOST and JoinIR contexts.
Symptoms
[ssa-undef-debug] fn=TrimTest.trim/1 bb=BasicBlockId(12) inst_idx=0 used=ValueId(33) inst=Compare { dst: ValueId(26), op: Lt, lhs: ValueId(33), rhs: ValueId(34) }
[ssa-undef-debug] fn=TrimTest.trim/1 bb=BasicBlockId(12) inst_idx=0 used=ValueId(34) inst=Compare { dst: ValueId(26), op: Lt, lhs: ValueId(33), rhs: ValueId(34) }
- Condition uses undefined ValueIds (33, 34) for variables
startandend - Program compiles but produces no output (silent runtime failure)
- PHI nodes also reference undefined carrier values
Root Cause
Architecture Overview
The JoinIR merge process uses two separate ValueId "namespaces":
- HOST context: Main MirBuilder's ValueId space (e.g.,
start = ValueId(33)) - JoinIR context: Fresh ValueId allocator starting from 0 (e.g.,
ValueId(0), ValueId(1), ...)
The JoinInlineBoundary mechanism is supposed to bridge these two spaces by injecting Copy instructions at the entry block.
The Bug
Location: src/mir/builder/control_flow/joinir/patterns/pattern2_with_break.rs (and other patterns)
Current boundary creation:
let boundary = JoinInlineBoundary::new_inputs_only(
vec![ValueId(0)], // JoinIR's main() parameter (loop variable init)
vec![loop_var_id], // Host's loop variable
);
What's missing: Condition variables (start, end) are NOT in the boundary!
What happens:
-
condition_to_joinir.rslooks up variables inbuilder.variable_map:builder.variable_map.get("start") // Returns ValueId(33) from HOST builder.variable_map.get("end") // Returns ValueId(34) from HOST -
JoinIR instructions are generated with these HOST ValueIds:
JoinInst::Compute(MirLikeInst::Compare { dst: ValueId(26), op: Lt, lhs: ValueId(33), // HOST ValueId, not in JoinIR space! rhs: ValueId(34), // HOST ValueId, not in JoinIR space! }) -
During merge,
remap_values()only remaps ValueIds that are inused_values:- ValueIds 0, 1, 2, ... from JoinIR → new ValueIds from builder
- But ValueId(33) and ValueId(34) are not in JoinIR's used_values set!
- They're HOST ValueIds that leaked into JoinIR space
-
Result: Compare instruction references undefined ValueIds
Architectural Issue
The current design has a conceptual mismatch:
condition_to_joinir.rsassumes it can directly reference HOST ValueIds frombuilder.variable_map- JoinIR merge assumes all ValueIds come from JoinIR's fresh allocator
- Boundary mechanism only maps explicitly listed inputs/outputs
This works for simple patterns where:
- Condition is hardcoded (e.g.,
i < 3) - All condition values are constants or loop variables already in the boundary
This breaks when:
- Condition references variables from HOST context (e.g.,
start < end) - Those variables are not in the boundary inputs
Affected Code Paths
Phase 169 Integration: condition_to_joinir.rs
Lines 183-189:
ASTNode::Variable { name, .. } => {
builder
.variable_map
.get(name)
.copied()
.ok_or_else(|| format!("Variable '{}' not found in variable_map", name))
}
This returns HOST ValueIds directly without checking if they need boundary mapping.
Pattern Lowerers: pattern2_with_break.rs, etc.
Pattern lowerers create minimal boundaries that only include:
- Loop variable (e.g.,
i) - Accumulator (if present)
But NOT:
- Variables referenced in loop condition (e.g.,
start,endinstart < end) - Variables referenced in loop body expressions
Merge Infrastructure: merge/mod.rs
The merge process has no way to detect that HOST ValueIds have leaked into JoinIR instructions.
Test Case: TrimTest.trim/1
Code:
local start = 0
local end = s.length()
loop(start < end) { // Condition references start, end
// ...
break
}
Expected boundary:
JoinInlineBoundary::new_inputs_only(
vec![ValueId(0), ValueId(1), ValueId(2)], // loop var, start, end
vec![loop_var_id, start_id, end_id], // HOST ValueIds
)
Actual boundary:
JoinInlineBoundary::new_inputs_only(
vec![ValueId(0)], // Only loop var
vec![loop_var_id], // Only loop var
)
Result: start and end are undefined in JoinIR space
Solutions
Option A: Extract Condition Variables into Boundary (Recommended)
Where: Pattern lowerers (pattern1/2/3/4)
Steps:
- Before calling
lower_condition_to_joinir(), analyze AST to find all variables - For each variable, get HOST ValueId from
builder.variable_map - Allocate JoinIR-side ValueIds (e.g., ValueId(1), ValueId(2))
- Create boundary with all condition variables:
let cond_vars = extract_condition_variables(condition_ast, builder); let boundary = JoinInlineBoundary::new_inputs_only( vec![ValueId(0), ValueId(1), ValueId(2)], // loop var + cond vars vec![loop_var_id, cond_vars[0], cond_vars[1]], );
Pros:
- Minimal change to existing architecture
- Clear separation: boundary handles HOST↔JoinIR mapping
- Works for all condition complexity
Cons:
- Need to implement variable extraction from AST
- Each pattern needs updating
Option B: Delay Variable Resolution Until Merge
Where: condition_to_joinir.rs
Idea: Instead of resolving variables immediately, emit placeholder instructions and resolve during merge.
Pros:
- Cleaner separation: JoinIR doesn't touch HOST ValueIds
Cons:
- Major refactoring required
- Need new placeholder instruction type
- Complicates merge logic
Option C: Use Variable Names Instead of ValueIds in JoinIR
Where: JoinIR instruction format
Idea: JoinIR uses variable names (strings) instead of ValueIds, resolve during merge.
Pros:
- Most robust solution
- Eliminates ValueId namespace confusion
Cons:
- Breaks current JoinIR design (uses MirLikeInst which has ValueIds)
- Major architectural change
Recommendation
Option A - Extract condition variables and add to boundary.
Implementation Plan:
-
Create AST variable extractor (30 minutes)
- File:
src/mir/builder/control_flow/joinir/patterns/cond_var_extractor.rs - Function:
extract_condition_variables(ast: &ASTNode, builder: &MirBuilder) -> Vec<(String, ValueId)> - Recursively walk AST, collect all Variable nodes
- File:
-
Update Pattern2 (1 hour)
- Extract condition variables before calling pattern lowerer
- Create boundary with extracted variables
- Test with
TrimTest.trim/1
-
Update Pattern1, Pattern3, Pattern4 (1 hour each)
- Apply same pattern
-
Validation (30 minutes)
- Re-run
TrimTest.trim/1→ should output correctly - Re-run JsonParserBox tests → should work
- Re-run
Total Estimate: 4.5 hours
Files Affected
New:
src/mir/builder/control_flow/joinir/patterns/cond_var_extractor.rs(new utility)
Modified:
src/mir/builder/control_flow/joinir/patterns/pattern1_minimal.rssrc/mir/builder/control_flow/joinir/patterns/pattern2_with_break.rssrc/mir/builder/control_flow/joinir/patterns/pattern3_with_if_phi.rssrc/mir/builder/control_flow/joinir/patterns/pattern4_with_continue.rs
Related Issues
- Phase 169: BoolExprLowerer integration exposed this issue
- Phase 166: JsonParserBox validation blocked by this bug
- Phase 188-189: Boundary mechanism exists but incomplete
Next Steps
- Implement Option A (condition variable extraction)
- Update all 4 patterns
- Re-run Phase 170 validation tests
- Document the "always include condition variables in boundary" pattern