hakorune/docs/development/current/main/phase33-16-qa.md
nyash-codex 4e32a803a7 feat(joinir): Phase 33-22 CommonPatternInitializer & JoinIRConversionPipeline integration
Unifies initialization and conversion logic across all 4 loop patterns,
eliminating code duplication and establishing single source of truth.

## Changes

### Infrastructure (New)
- CommonPatternInitializer (117 lines): Unified loop var extraction + CarrierInfo building
- JoinIRConversionPipeline (127 lines): Unified JoinIR→MIR→Merge flow

### Pattern Refactoring
- Pattern 1: Uses CommonPatternInitializer + JoinIRConversionPipeline (-25 lines)
- Pattern 2: Uses CommonPatternInitializer + JoinIRConversionPipeline (-25 lines)
- Pattern 3: Uses CommonPatternInitializer + JoinIRConversionPipeline (-25 lines)
- Pattern 4: Uses CommonPatternInitializer + JoinIRConversionPipeline (-40 lines)

### Code Reduction
- Total reduction: ~115 lines across all patterns
- Zero code duplication in initialization/conversion
- Pattern files: 806 lines total (down from ~920)

### Quality Improvements
- Single source of truth for initialization
- Consistent conversion flow across all patterns
- Guaranteed boundary.loop_var_name setting (prevents SSA-undef bugs)
- Improved maintainability and testability

### Testing
- All 4 patterns tested and passing:
  - Pattern 1 (Simple While)
  - Pattern 2 (With Break)
  - Pattern 3 (If-Else PHI)
  - Pattern 4 (With Continue)

### Documentation
- Phase 33-22 inventory and results document
- Updated joinir-architecture-overview.md with new infrastructure

## Breaking Changes
None - pure refactoring with no API changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-07 21:02:20 +09:00


Phase 33-16: Q&A - Implementation Flow Details

Your Questions Answered

Q1: Where exactly should LoopHeaderPhiBuilder::build() be called?

Answer: Between Phase 3 (remap_values) and Phase 4 (instruction_rewriter) in merge/mod.rs

Location: Line 107, after remap_values()

Why here:

  • After remap_values: We have remapped ValueIds (needed for phi_dst allocation)
  • Before instruction_rewriter: We need to know PHI dsts so instruction_rewriter can use them in exit values
  • Clear phase boundary: Phase 3.5

Code location in file:

// Line 107: After remap_values(...)
remap_values(builder, &used_values, &mut remapper, debug)?;

// INSERT HERE: Phase 3.5 - Build loop header PHIs
let mut loop_header_phi_info = if let Some(boundary) = boundary {
    // ... build logic ...
} else {
    loop_header_phi_builder::LoopHeaderPhiInfo::empty(BasicBlockId(0))
};

// Phase 4: Merge blocks and rewrite instructions
// PASS loop_header_phi_info to instruction_rewriter
let merge_result = instruction_rewriter::merge_and_rewrite(
    builder,
    mir_module,
    &mut remapper,
    &value_to_func_name,
    &function_params,
    boundary,
    &mut loop_header_phi_info,  // NEW: Pass mutable reference
    debug,
)?;

Q2: How do I get the header_block_id (loop_step's entry block after remapping)?

Answer: It's the entry function's entry block, obtained via remapper

Exact code:

let (entry_func_name, entry_func) = mir_module
    .functions
    .iter()
    .next()
    .ok_or("JoinIR module has no functions")?;
    
let entry_block_remapped = remapper
    .get_block(entry_func_name, entry_func.entry_block)
    .ok_or_else(|| format!("Entry block not found for '{}'", entry_func_name))?;

// For Pattern 2, entry block == header block
let header_block_id = entry_block_remapped;

Why this works:

  • JoinIR's first function is always the entry function
  • entry_func.entry_block is the BasicBlockId in JoinIR space
  • remapper.get_block() returns the remapped BasicBlockId in the host MIR

For more complex patterns (future):

  • Pattern 3/4 might have different header block logic
  • For now, assume entry_block == header_block (safe for Pattern 2)
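
The lookup described above can be illustrated with a toy model. This is a minimal sketch, not the project's remapper: `ToyRemapper`, `set_block`, and `header_block` are hypothetical names, and the real `BasicBlockId` type lives in the compiler. It only shows the shape of the idea, namely that a (function name, JoinIR block) pair maps to a host block and that a missing mapping becomes an error.

```rust
use std::collections::HashMap;

/// Toy stand-in for the MIR block id (assumption: the real type differs).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct BasicBlockId(pub u32);

/// Hypothetical remapper: (function name, JoinIR block) -> host MIR block.
#[derive(Default)]
pub struct ToyRemapper {
    blocks: HashMap<(String, BasicBlockId), BasicBlockId>,
}

impl ToyRemapper {
    /// Register a mapping from a JoinIR block to a host block.
    pub fn set_block(&mut self, func: &str, from: BasicBlockId, to: BasicBlockId) {
        self.blocks.insert((func.to_string(), from), to);
    }

    /// Look up the host block for a JoinIR block, if it was remapped.
    pub fn get_block(&self, func: &str, from: BasicBlockId) -> Option<BasicBlockId> {
        self.blocks.get(&(func.to_string(), from)).copied()
    }
}

/// Resolve the loop header as the remapped entry block of the entry function.
/// This mirrors the Pattern 2 assumption that entry block == header block.
pub fn header_block(
    remapper: &ToyRemapper,
    entry_func_name: &str,
    entry_block: BasicBlockId,
) -> Result<BasicBlockId, String> {
    remapper
        .get_block(entry_func_name, entry_block)
        .ok_or_else(|| format!("entry block not found for '{}'", entry_func_name))
}
```

With a mapping registered for `loop_step`, `header_block` returns the host-side id; for any function that was never remapped it returns an error instead of silently reusing the JoinIR id.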

Q3: How do I get the loop variable's initial value (host-side)?

Answer: Get it from the remapper (it's the remapped join_inputs[0])

Exact code:

// Loop variable's initial value is from join_inputs[0]
// It's been remapped by remap_values() in Phase 3
let loop_var_init = remapper
    .get_value(ValueId(0))  // JoinIR param slot (always 0 for loop var)
    .ok_or("Loop var init not remapped")?;

Why ValueId(0):

  • In JoinIR, loop parameter is always allocated as ValueId(0)
  • Pattern 2 lowerer does this (pattern2_with_break.rs line 84):
    env.insert(loop_var_name.clone(), crate::mir::ValueId(0));
    
  • After remap_values(), this becomes a new ValueId in host space

What to use, and what to avoid:

  • Don't read boundary.host_inputs[0] directly
  • Don't use boundary.join_inputs[0] (it's the pre-remap value)
  • Do use remapper.get_value(ValueId(0)) (it's the post-remap value)
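
The difference between the pre-remap and post-remap value can be sketched with a toy value remapper. Everything here is an assumption made for illustration: `ToyValueRemapper`, its `new` constructor, and the "allocate the next free host id" policy are hypothetical and are not taken from the real `remap_values()`.

```rust
use std::collections::HashMap;

/// Toy stand-in for the MIR value id (assumption: the real type differs).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ValueId(pub u32);

/// Hypothetical value remapper: JoinIR-local ValueId -> host ValueId.
pub struct ToyValueRemapper {
    values: HashMap<ValueId, ValueId>,
    next_host: u32,
}

impl ToyValueRemapper {
    /// `first_free_host_id` is the first host-side id not yet in use.
    pub fn new(first_free_host_id: u32) -> Self {
        Self { values: HashMap::new(), next_host: first_free_host_id }
    }

    /// Toy version of remap_values: give each used JoinIR value a fresh host id.
    /// Values that were already remapped keep their first assignment.
    pub fn remap_values(&mut self, used: &[ValueId]) {
        for v in used {
            if !self.values.contains_key(v) {
                self.values.insert(*v, ValueId(self.next_host));
                self.next_host += 1;
            }
        }
    }

    /// Post-remap lookup; None if the value was never remapped.
    pub fn get_value(&self, v: ValueId) -> Option<ValueId> {
        self.values.get(&v).copied()
    }
}

/// The loop variable sits in JoinIR slot ValueId(0); return its host-side id.
pub fn loop_var_init(remapper: &ToyValueRemapper) -> Result<ValueId, String> {
    remapper
        .get_value(ValueId(0))
        .ok_or_else(|| "Loop var init not remapped".to_string())
}
```

In this model, asking for the loop variable before remapping fails, and afterwards it yields a host-space id that is unrelated to the JoinIR-side `ValueId(0)`.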

Q4: Where should instruction_rewriter record latch_incoming?

Answer: In the tail call handling section, after parameter bindings

Location: instruction_rewriter.rs, in the tail call branch (lines 276-335)

Exact code:

if let Some((target_block, args)) = tail_call_target {
    // ... existing parameter binding code (lines 276-319) ...
    
    // NEW: Track latch incoming AFTER param bindings
    if let Some(loop_var_name) = boundary.and_then(|b| b.loop_var_name.as_ref()) {
        if !args.is_empty() {
            let latch_value = args[0]; // Updated loop variable from tail call args
            loop_header_phi_info.set_latch_incoming(
                loop_var_name,
                target_block,  // This is the loop header block (from tail call target)
                latch_value,   // This is i_next (the updated value)
            );
            
            if debug {
                eprintln!("[cf_loop/joinir] Phase 33-16: Set latch incoming for '{}': {:?}", 
                    loop_var_name, latch_value);
            }
        }
    }
    
    // ... then set terminator to Jump (lines 321-323) ...
}

Why this location:

  • Tail call args are the ACTUAL updated values (i_next, not i_param)
  • args[0] is always the loop variable (the Pattern 2 lowerer guarantees this)
  • target_block is the loop header (where we're jumping back to)
  • Runs for EACH block that ends in a tail call, so every path back to the header is tracked

Key insight: The latch block is NOT explicitly identified; we identify it by the Jump target!
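
The key insight can be sketched in isolation: scan the blocks, and treat any block whose tail call targets the header as a latch. This is a minimal sketch under stated assumptions. `Block`, `Terminator`, and `latch_incoming` are hypothetical, simplified types and do not reflect the real instruction_rewriter data structures.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BasicBlockId(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ValueId(pub u32);

/// Hypothetical, simplified terminator.
pub enum Terminator {
    /// Tail call lowered to a jump: target block plus call arguments.
    TailCall { target: BasicBlockId, args: Vec<ValueId> },
    /// Any terminator that leaves the loop.
    Return,
}

pub struct Block {
    pub id: BasicBlockId,
    pub term: Terminator,
}

/// For every block that tail-calls the header, return (latch block, args[0]).
/// args[0] is assumed to carry the updated loop variable, as in Pattern 2.
pub fn latch_incoming(blocks: &[Block], header: BasicBlockId) -> Vec<(BasicBlockId, ValueId)> {
    blocks
        .iter()
        .filter_map(|b| match &b.term {
            Terminator::TailCall { target, args } if *target == header => {
                // Skip tail calls without arguments, matching the !args.is_empty() guard.
                args.first().map(|v| (b.id, *v))
            }
            _ => None,
        })
        .collect()
}
```

No block is ever tagged as "the latch" up front; the jump target alone decides it, so a loop with several back edges yields several entries.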


Q5: Should the Phase 33-15 skip logic be removed or modified to use header PHI dst?

Answer: Modify, NOT remove. Use the header PHI dst when available, and fall back to the parameter otherwise.

What to do:

  1. Replace skip logic (lines 354-398 in instruction_rewriter.rs):
// OLD: Skip exit_phi_inputs collection
// if debug { eprintln!(...skip...); }

// NEW: Use header PHI dst if available
if let Some(ret_val) = value {
    let remapped_val = remapper.get_value(*ret_val).unwrap_or(*ret_val);
    
    // Phase 33-16: Prefer header PHI dst
    if let Some(loop_var_name) = boundary.and_then(|b| b.loop_var_name.as_ref()) {
        if let Some(phi_dst) = loop_header_phi_info.get_carrier_phi(loop_var_name) {
            // Use PHI dst (SSA-correct!)
            exit_phi_inputs.push((exit_block_id, phi_dst));
        } else {
            // Fallback: Use parameter (for backward compatibility)
            exit_phi_inputs.push((exit_block_id, remapped_val));
        }
    } else {
        // No boundary or loop_var_name: use parameter
        exit_phi_inputs.push((exit_block_id, remapped_val));
    }
}
  2. Modify carrier_inputs logic (lines 400-431):
// OLD: Skip carrier_inputs collection
// if debug { eprintln!(...skip...); }

// NEW: Use header PHI dsts for carriers
if let Some(boundary) = boundary {
    for binding in &boundary.exit_bindings {
        // Phase 33-16: Look up header PHI dst for this carrier
        if let Some(phi_dst) = loop_header_phi_info.get_carrier_phi(&binding.carrier_name) {
            carrier_inputs.entry(binding.carrier_name.clone())
                .or_default()
                .push((exit_block_id, phi_dst));
        }
        // If no PHI dst, skip this carrier (not yet implemented)
    }
}

Why this approach:

  • Phase 33-16 adds header PHIs → use them (SSA-correct)
  • If no header PHIs → fall back to the old behavior (backward compat)
  • Gradual migration: Patterns enable loop_var_name progressively
  • Easy to debug: Explicit "Using PHI" vs "Fallback" logs

Do and don't:

  • Don't remove the skip logic entirely (patterns without loop_var_name would break)
  • Don't add loop_header_phi_info to the merge_and_rewrite() signature unless you also track latch incoming
  • Do add both build() and finalize() to merge/mod.rs
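
The "prefer PHI dst, else fall back" rule reduces to a small selection function. The sketch below is illustrative only: `ToyHeaderPhiInfo`, `with_carrier`, `ExitSource`, and `select_exit_value` are hypothetical names, and only `get_carrier_phi` echoes a method named in the text (its real signature may differ).

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ValueId(pub u32);

/// Hypothetical, reduced view of the header PHI bookkeeping: carrier name -> PHI dst.
#[derive(Default)]
pub struct ToyHeaderPhiInfo {
    carrier_phis: HashMap<String, ValueId>,
}

impl ToyHeaderPhiInfo {
    /// Builder-style helper that registers a PHI dst for a carrier.
    pub fn with_carrier(mut self, name: &str, dst: ValueId) -> Self {
        self.carrier_phis.insert(name.to_string(), dst);
        self
    }

    /// PHI dst for a carrier, if a header PHI was allocated for it.
    pub fn get_carrier_phi(&self, name: &str) -> Option<ValueId> {
        self.carrier_phis.get(name).copied()
    }
}

/// Which branch produced the exit value (handy for "Using PHI" vs "Fallback" logs).
#[derive(Debug, PartialEq, Eq)]
pub enum ExitSource {
    HeaderPhi,
    Fallback,
}

/// Prefer the header PHI dst; otherwise fall back to the remapped parameter.
pub fn select_exit_value(
    info: &ToyHeaderPhiInfo,
    loop_var_name: Option<&str>,
    remapped_param: ValueId,
) -> (ValueId, ExitSource) {
    match loop_var_name.and_then(|name| info.get_carrier_phi(name)) {
        Some(dst) => (dst, ExitSource::HeaderPhi),
        None => (remapped_param, ExitSource::Fallback),
    }
}
```

Returning the source alongside the value keeps the two cases distinguishable at the call site, which is one way to get the explicit logging mentioned above without duplicating the branch.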

Q6: Flow Summary - How does it all fit together?

Complete flow:

merge_joinir_mir_blocks() {
    // Phase 1: Allocate block IDs
    allocate_blocks()
    
    // Phase 2: Collect values
    collect_values()
    
    // Phase 3: Remap ValueIds
    remap_values(builder, &used_values, &mut remapper, debug)?;
    
    // ===== Phase 3.5 (NEW) =====
    // Build loop header PHIs with entry incoming edges
    let mut loop_header_phi_info = if let Some(boundary) = boundary {
        if let Some(loop_var_name) = &boundary.loop_var_name {
            // Get header_block_id (entry block after remap)
            let entry_block = remapper.get_block(entry_func_name, entry_func.entry_block)?;
            
            // Get loop_var_init (remapped ValueId(0))
            let loop_var_init = remapper.get_value(ValueId(0))?;
            
            // Build header PHIs (allocates PHI dsts, doesn't emit yet)
            LoopHeaderPhiBuilder::build(
                builder,
                entry_block,      // header_block_id
                entry_block,      // entry_block_id
                loop_var_name,
                loop_var_init,
                &[],              // No other carriers yet
                boundary.expr_result.is_some(),
                debug,
            )?
        } else {
            LoopHeaderPhiInfo::empty(...)
        }
    } else {
        LoopHeaderPhiInfo::empty(...)
    };
    
    // ===== Phase 4 (MODIFIED) =====
    // Instruction rewriter sets latch incoming and uses PHI dsts
    let merge_result = instruction_rewriter::merge_and_rewrite(
        builder,
        mir_module,
        &mut remapper,
        ...,
        &mut loop_header_phi_info,  // PASS MUTABLE REFERENCE
        debug,
    )?;
    // Inside merge_and_rewrite:
    //   - When processing tail calls: record latch_incoming
    //   - When processing Return: use header PHI dsts (not parameters)
    
    // ===== Phase 4.5 (NEW) =====
    // Finalize loop header PHIs (insert into blocks)
    LoopHeaderPhiBuilder::finalize(builder, &loop_header_phi_info, debug)?;
    
    // ===== Phase 5 =====
    // Build exit PHI from exit_phi_inputs and carrier_inputs
    let (exit_phi_result_id, carrier_phis) = exit_phi_builder::build_exit_phi(...)?;
    
    // ===== Phase 6 =====
    // Reconnect exit values using carrier_phis from Phase 5
    if let Some(boundary) = boundary {
        ExitLineOrchestrator::execute(builder, boundary, &carrier_phis, debug)?;
    }
    
    // ... continue with boundary jump and exit block switch ...
}

Key transitions:

  1. Phase 3 → Phase 3.5: remap_values() gives us remapped ValueIds
  2. Phase 3.5 → Phase 4: loop_header_phi_info allocated (PHI dsts ready)
  3. Phase 4 → Phase 4.5: instruction_rewriter sets latch_incoming
  4. Phase 4.5 → Phase 5: Finalize emits PHIs into blocks
  5. Phase 5 → Phase 6: exit_phi_builder returns carrier_phis (PHI dsts)
  6. Phase 6: ExitLineOrchestrator uses carrier_phis to update variable_map
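
The allocate, record, finalize lifecycle of a header PHI can be sketched as a small state holder. This is a toy model under stated assumptions: `PendingPhi` and `PhiInst` are hypothetical types, and the choice to make `finalize` fail when no latch edge was recorded is a decision of this sketch, not something the text specifies for `LoopHeaderPhiBuilder::finalize`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BasicBlockId(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ValueId(pub u32);

/// Hypothetical PHI under construction for one loop variable.
pub struct PendingPhi {
    pub dst: ValueId,
    pub entry_incoming: (BasicBlockId, ValueId),
    pub latch_incoming: Option<(BasicBlockId, ValueId)>,
}

/// Hypothetical emitted PHI instruction: dst = phi [(pred, value), ...].
#[derive(Debug, PartialEq, Eq)]
pub struct PhiInst {
    pub dst: ValueId,
    pub inputs: Vec<(BasicBlockId, ValueId)>,
}

impl PendingPhi {
    /// Phase 3.5: reserve the dst and record the entry edge. Nothing is emitted yet,
    /// but `dst` is already usable as an operand by later phases.
    pub fn build(dst: ValueId, entry_block: BasicBlockId, init: ValueId) -> Self {
        Self { dst, entry_incoming: (entry_block, init), latch_incoming: None }
    }

    /// Phase 4: record the back edge discovered while rewriting tail calls.
    pub fn set_latch_incoming(&mut self, block: BasicBlockId, value: ValueId) {
        self.latch_incoming = Some((block, value));
    }

    /// Phase 4.5: emit the PHI. In this sketch a missing latch edge is an error.
    pub fn finalize(&self) -> Result<PhiInst, String> {
        let latch = self
            .latch_incoming
            .ok_or_else(|| format!("latch incoming not set for PHI {:?}", self.dst))?;
        Ok(PhiInst { dst: self.dst, inputs: vec![self.entry_incoming, latch] })
    }
}
```

Splitting allocation from emission is what lets Phase 4 refer to the PHI dst before the back edge is known: the id exists from Phase 3.5 onward, while the instruction itself is only materialized once both incoming edges are available.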

Summary: Exact Answer to Your Core Question

Where to call build(): Line 107 in merge/mod.rs, after remap_values()
How to get header_block_id: remapper.get_block(entry_func_name, entry_func.entry_block)?
How to get loop_var_init: remapper.get_value(ValueId(0))?
Where to record latch_incoming: In tail call handling (line ~300), use args[0] as latch_value
Skip logic: Modify rather than remove; prefer the header PHI dst and fall back to the parameter for backward compatibility

The magic: Loop header PHI dst (allocated in Phase 3.5, finalized in Phase 4.5) is SSA-defined and can be safely used in exit values instead of parameters!