Files
hakorune/docs/development/current/main/phase170-valueid-boundary-analysis.md
nyash-codex e30116f53d feat(joinir): Phase 171-fix ConditionEnv/ConditionBinding architecture
Proper HOST↔JoinIR ValueId separation for condition variables:

- Add ConditionEnv struct (name → JoinIR-local ValueId mapping)
- Add ConditionBinding struct (HOST/JoinIR ValueId pairs)
- Modify condition_to_joinir to use ConditionEnv instead of builder.variable_map
- Update Pattern2 lowerer to build ConditionEnv and ConditionBindings
- Extend JoinInlineBoundary with condition_bindings field
- Update BoundaryInjector to inject Copy instructions for condition variables

This fixes the undefined ValueId errors where HOST ValueIds were being
used directly in JoinIR instructions. Programs now execute (RC: 0),
though loop variable exit values still need Phase 172 work.

Key invariants established:
1. JoinIR uses ONLY JoinIR-local ValueIds
2. HOST↔JoinIR bridging is ONLY through JoinInlineBoundary
3. condition_to_joinir NEVER accesses builder.variable_map

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 01:45:03 +09:00

7.9 KiB

Phase 170: ValueId Boundary Mapping Analysis

Date: 2025-12-07 Status: Root Cause Identified Impact: CRITICAL - Blocks all JsonParserBox complex condition tests

Problem Summary

JoinIR loop patterns with complex conditions (e.g., start < end in _trim) compile successfully but fail silently at runtime because condition variable ValueIds are not properly mapped between HOST and JoinIR contexts.

Symptoms

[ssa-undef-debug] fn=TrimTest.trim/1 bb=BasicBlockId(12) inst_idx=0 used=ValueId(33) inst=Compare { dst: ValueId(26), op: Lt, lhs: ValueId(33), rhs: ValueId(34) }
[ssa-undef-debug] fn=TrimTest.trim/1 bb=BasicBlockId(12) inst_idx=0 used=ValueId(34) inst=Compare { dst: ValueId(26), op: Lt, lhs: ValueId(33), rhs: ValueId(34) }
  • Condition uses undefined ValueIds (33, 34) for variables start and end
  • Program compiles but produces no output (silent runtime failure)
  • PHI nodes also reference undefined carrier values

Root Cause

Architecture Overview

The JoinIR merge process uses two separate ValueId "namespaces":

  1. HOST context: Main MirBuilder's ValueId space (e.g., start = ValueId(33))
  2. JoinIR context: Fresh ValueId allocator starting from 0 (e.g., ValueId(0), ValueId(1), ...)

The JoinInlineBoundary mechanism is supposed to bridge these two spaces by injecting Copy instructions at the entry block.

The Bug

Location: src/mir/builder/control_flow/joinir/patterns/pattern2_with_break.rs (and other patterns)

Current boundary creation:

let boundary = JoinInlineBoundary::new_inputs_only(
    vec![ValueId(0)],      // JoinIR's main() parameter (loop variable init)
    vec![loop_var_id],     // Host's loop variable
);

What's missing: Condition variables (start, end) are NOT in the boundary!

What happens:

  1. condition_to_joinir.rs looks up variables in builder.variable_map:

    builder.variable_map.get("start") // Returns ValueId(33) from HOST
    builder.variable_map.get("end")   // Returns ValueId(34) from HOST
    
  2. JoinIR instructions are generated with these HOST ValueIds:

    JoinInst::Compute(MirLikeInst::Compare {
        dst: ValueId(26),
        op: Lt,
        lhs: ValueId(33),  // HOST ValueId, not in JoinIR space!
        rhs: ValueId(34),  // HOST ValueId, not in JoinIR space!
    })
    
  3. During merge, remap_values() only remaps ValueIds that are in used_values:

    • ValueIds 0, 1, 2, ... from JoinIR → new ValueIds from builder
    • But ValueId(33) and ValueId(34) are not in JoinIR's used_values set!
    • They're HOST ValueIds that leaked into JoinIR space
  4. Result: Compare instruction references undefined ValueIds

Architectural Issue

The current design has a conceptual mismatch:

  • condition_to_joinir.rs assumes it can directly reference HOST ValueIds from builder.variable_map
  • JoinIR merge assumes all ValueIds come from JoinIR's fresh allocator
  • Boundary mechanism only maps explicitly listed inputs/outputs

This works for simple patterns where:

  • Condition is hardcoded (e.g., i < 3)
  • All condition values are constants or loop variables already in the boundary

This breaks when:

  • Condition references variables from HOST context (e.g., start < end)
  • Those variables are not in the boundary inputs

Affected Code Paths

Phase 169 Integration: condition_to_joinir.rs

Lines 183-189:

ASTNode::Variable { name, .. } => {
    builder
        .variable_map
        .get(name)
        .copied()
        .ok_or_else(|| format!("Variable '{}' not found in variable_map", name))
}

This returns HOST ValueIds directly without checking if they need boundary mapping.

Pattern Lowerers: pattern2_with_break.rs, etc.

Pattern lowerers create minimal boundaries that only include:

  • Loop variable (e.g., i)
  • Accumulator (if present)

But NOT:

  • Variables referenced in loop condition (e.g., start, end in start < end)
  • Variables referenced in loop body expressions

Merge Infrastructure: merge/mod.rs

The merge process has no way to detect that HOST ValueIds have leaked into JoinIR instructions.

Test Case: TrimTest.trim/1

Code:

local start = 0
local end = s.length()

loop(start < end) {  // Condition references start, end
    // ...
    break
}

Expected boundary:

JoinInlineBoundary::new_inputs_only(
    vec![ValueId(0), ValueId(1), ValueId(2)],  // loop var, start, end
    vec![loop_var_id, start_id, end_id],        // HOST ValueIds
)

Actual boundary:

JoinInlineBoundary::new_inputs_only(
    vec![ValueId(0)],      // Only loop var
    vec![loop_var_id],     // Only loop var
)

Result: start and end are undefined in JoinIR space

Solutions

Where: Pattern lowerers (pattern1/2/3/4)

Steps:

  1. Before calling lower_condition_to_joinir(), analyze AST to find all variables
  2. For each variable, get HOST ValueId from builder.variable_map
  3. Allocate JoinIR-side ValueIds (e.g., ValueId(1), ValueId(2))
  4. Create boundary with all condition variables:
    let cond_vars = extract_condition_variables(condition_ast, builder);
    let boundary = JoinInlineBoundary::new_inputs_only(
        vec![ValueId(0), ValueId(1), ValueId(2)],  // loop var + cond vars
        vec![loop_var_id, cond_vars[0], cond_vars[1]],
    );
    

Pros:

  • Minimal change to existing architecture
  • Clear separation: boundary handles HOST↔JoinIR mapping
  • Works for all condition complexity

Cons:

  • Need to implement variable extraction from AST
  • Each pattern needs updating

Option B: Delay Variable Resolution Until Merge

Where: condition_to_joinir.rs

Idea: Instead of resolving variables immediately, emit placeholder instructions and resolve during merge.

Pros:

  • Cleaner separation: JoinIR doesn't touch HOST ValueIds

Cons:

  • Major refactoring required
  • Need new placeholder instruction type
  • Complicates merge logic

Option C: Use Variable Names Instead of ValueIds in JoinIR

Where: JoinIR instruction format

Idea: JoinIR uses variable names (strings) instead of ValueIds, resolve during merge.

Pros:

  • Most robust solution
  • Eliminates ValueId namespace confusion

Cons:

  • Breaks current JoinIR design (uses MirLikeInst which has ValueIds)
  • Major architectural change

Recommendation

Option A - Extract condition variables and add to boundary.

Implementation Plan:

  1. Create AST variable extractor (30 minutes)

    • File: src/mir/builder/control_flow/joinir/patterns/cond_var_extractor.rs
    • Function: extract_condition_variables(ast: &ASTNode, builder: &MirBuilder) -> Vec<(String, ValueId)>
    • Recursively walk AST, collect all Variable nodes
  2. Update Pattern2 (1 hour)

    • Extract condition variables before calling pattern lowerer
    • Create boundary with extracted variables
    • Test with TrimTest.trim/1
  3. Update Pattern1, Pattern3, Pattern4 (1 hour each)

    • Apply same pattern
  4. Validation (30 minutes)

    • Re-run TrimTest.trim/1 → should output correctly
    • Re-run JsonParserBox tests → should work

Total Estimate: 4.5 hours

Files Affected

New:

  • src/mir/builder/control_flow/joinir/patterns/cond_var_extractor.rs (new utility)

Modified:

  • src/mir/builder/control_flow/joinir/patterns/pattern1_minimal.rs
  • src/mir/builder/control_flow/joinir/patterns/pattern2_with_break.rs
  • src/mir/builder/control_flow/joinir/patterns/pattern3_with_if_phi.rs
  • src/mir/builder/control_flow/joinir/patterns/pattern4_with_continue.rs
  • Phase 169: BoolExprLowerer integration exposed this issue
  • Phase 166: JsonParserBox validation blocked by this bug
  • Phase 188-189: Boundary mechanism exists but incomplete

Next Steps

  1. Implement Option A (condition variable extraction)
  2. Update all 4 patterns
  3. Re-run Phase 170 validation tests
  4. Document the "always include condition variables in boundary" pattern