diff --git a/docs/development/current/main/phases/phase-286/C-2.1-pipeline-refactoring.md b/docs/development/current/main/phases/phase-286/C-2.1-pipeline-refactoring.md new file mode 100644 index 00000000..c220656a --- /dev/null +++ b/docs/development/current/main/phases/phase-286/C-2.1-pipeline-refactoring.md @@ -0,0 +1,216 @@ +# Phase 286C-2.1: 3-Stage Pipeline Refactoring + +## Goal +Refactor `instruction_rewriter.rs` into a **3-stage pipeline** (Scan → Plan → Apply) to: +1. Solve Rust borrow checker conflicts +2. Clarify responsibilities (1 Box = 1 Question) +3. Make code more maintainable and testable + +## Current Status: ✅ **Scaffolding Complete** + +### What's Done +- ✅ Created 3 new module files with basic structure +- ✅ Defined core data structures +- ✅ Registered modules in `mod.rs` +- ✅ Build passes (`cargo build --release`) + +### Files Created +``` +src/mir/builder/control_flow/joinir/merge/rewriter/ +├── scan_box.rs # Stage 1: Read-only scanning +├── plan_box.rs # Stage 2: Pure transformation +└── apply_box.rs # Stage 3: Builder mutation only +``` + +## The 3 Stages + +### Stage 1: Scan (Read-Only) +**File**: `scan_box.rs` +**Responsibility**: "What needs to be rewritten?" + +- **Input**: `&MirFunction`, `&JoinInlineBoundary`, `&RewriteContext` +- **Output**: `RewritePlan` (describes WHAT to do) +- **Key**: Does NOT touch `MirBuilder` + +**Data Structures**: +```rust +pub struct RewritePlan { + pub tail_calls: Vec, + pub return_conversions: Vec, + pub phi_adjustments: Vec, + pub parameter_bindings: Vec, +} +``` + +**What it does**: +- Identify tail calls to rewrite +- Find return instructions to convert +- Detect PHI adjustments needed +- Collect parameter binding requirements + +### Stage 2: Plan (Pure Transformation) +**File**: `plan_box.rs` +**Responsibility**: "How should the MIR look after rewriting?" + +- **Input**: `RewritePlan` +- **Output**: `RewrittenBlocks` (new blocks, PHI tables, remapping tables) +- **Key**: Can update `RewriteContext` but does NOT touch builder + +**Data Structures**: +```rust +pub struct RewrittenBlocks { + pub new_blocks: Vec, + pub block_replacements: BTreeMap, + pub phi_inputs: Vec<(BasicBlockId, ValueId)>, + pub carrier_inputs: BTreeMap>, +} +``` + +**What it does**: +- Generate new basic blocks based on plan +- Create PHI input mappings +- Prepare parameter binding instructions +- Build block ID remapping tables + +### Stage 3: Apply (Builder Mutation Only) +**File**: `apply_box.rs` +**Responsibility**: "Put the new MIR into the builder" + +- **Input**: `&mut MirBuilder`, `RewrittenBlocks` +- **Output**: Merge result reflected in builder +- **Key**: ONLY adds/replaces blocks in builder + +**What it does**: +- Add new blocks to MirBuilder +- Replace modified blocks +- Update function metadata + +## Why This Approach Works + +**The Borrow Checker Problem**: +- Currently: Mutating `MirBuilder` while updating `RewriteContext` causes conflicts +- Solution: **Separate "read", "plan", and "apply" into distinct time windows** + +**Borrow Lifecycle**: +```rust +// Stage 1: Read-Only (immutable borrow) +let plan = scan(&mir_function, &boundary, &ctx); + +// Stage 2: Pure (no builder borrow) +let blocks = plan_transform(plan, &mut ctx); + +// Stage 3: Mutation (mutable borrow, no conflicts) +apply(&mut builder, blocks); +``` + +## Implementation Strategy (Incremental) + +### ✅ Step 0: Scaffolding (DONE) +- [x] Create 3 files with basic structure +- [x] Define data structures +- [x] Register modules +- [x] Test build + +### 🔄 Step 1: Split within same file (NEXT) +Create 3 functions in `instruction_rewriter.rs`: +```rust +fn scan(...) -> RewritePlan { ... } +fn plan(plan: RewritePlan) -> RewrittenBlocks { ... } +fn apply(builder: &mut MirBuilder, blocks: RewrittenBlocks) { ... } +``` + +Refactor `merge_and_rewrite()` to: +```rust +pub fn merge_and_rewrite(...) -> Result { + let plan = scan(...); + let blocks = plan(plan); + apply(builder, blocks)?; + // ... existing post-processing ... +} +``` + +**Test**: `cargo build --release` + quick smoke tests + +### 📦 Step 2: Extract to separate files +- Move `scan()` → `scan_box.rs` +- Move `plan()` → `plan_box.rs` +- Move `apply()` → `apply_box.rs` + +**Test**: `cargo build --release` + quick smoke tests + +### 🔗 Step 3: Merge existing helper boxes +- Integrate `InstructionFilterBox`, `TailCallDetectorBox` → into `ScanBox` +- Integrate `ParameterBindingBox`, `ReturnConverterBox` → into `PlanBox` + +**Test**: `cargo build --release` + quick smoke tests + +## Current Implementation State + +### ScanBox (`scan_box.rs`) +**Status**: Scaffolding only +- ✅ Data structures defined +- ⏳ Logic stub (returns empty result) +- ⏳ Needs implementation + +**Next**: Implement `scan_block()` logic + +### PlanBox (`plan_box.rs`) +**Status**: Scaffolding only +- ✅ Data structures defined +- ⏳ Logic stub (returns empty result) +- ⏳ Needs implementation + +**Next**: Implement `transform()` logic + +### ApplyBox (`apply_box.rs`) +**Status**: Basic implementation +- ✅ Data structures defined +- ✅ Basic `apply()` implementation (adds/replaces blocks) +- ⏳ Needs integration with full pipeline + +**Next**: Test with real data + +## Testing Strategy + +### Unit Tests +Each Box should have tests: +- `scan_box.rs`: Test block scanning logic +- `plan_box.rs`: Test plan → blocks transformation +- `apply_box.rs`: Test builder mutation + +### Integration Tests +- Test full pipeline: scan → plan → apply +- Compare output with current `merge_and_rewrite()` (should be identical) + +### Smoke Tests +Use existing smoke tests after each step: +- `tools/smokes/v2/run.sh --profile quick` +- No new failures beyond known issues + +## Success Criteria + +1. ✅ `merge_and_rewrite()` becomes ~100-200 line orchestrator +2. ✅ `cargo build --release` passes at each step +3. ⏳ Quick smoke tests maintain (no new failures) +4. ⏳ Borrow checker conflicts resolved +5. ⏳ Zero functionality changes (pure refactoring) + +## Important Notes + +- **Incremental steps** - test after each change +- **Preserve all functionality** - this is refactoring, not rewriting +- **Use RewriteContext from Phase 286C-2** - already committed +- **Read current code first** - understand actual structure before changing + +## Next Steps + +1. **Implement Step 1**: Split `merge_and_rewrite()` into 3 functions within same file +2. **Test extensively**: Ensure no behavioral changes +3. **Document edge cases**: Identify any special handling needed +4. **Proceed to Step 2**: Extract to separate files only after Step 1 works + +## References + +- **Original file**: `src/mir/builder/control_flow/joinir/merge/instruction_rewriter.rs` (1422 lines) +- **RewriteContext**: `src/mir/builder/control_flow/joinir/merge/rewriter/rewrite_context.rs` +- **Existing helper boxes**: `instruction_filter_box.rs`, `parameter_binding_box.rs`, etc. diff --git a/src/mir/builder/control_flow/joinir/merge/rewriter/apply_box.rs b/src/mir/builder/control_flow/joinir/merge/rewriter/apply_box.rs new file mode 100644 index 00000000..d5af1436 --- /dev/null +++ b/src/mir/builder/control_flow/joinir/merge/rewriter/apply_box.rs @@ -0,0 +1,63 @@ +//! ApplyBox: Builder mutation only (RewrittenBlocks → MirBuilder) +//! +//! Phase 286C-2.1: Third stage of 3-stage pipeline (Scan → Plan → Apply) +//! Adds/replaces blocks in MirBuilder WITHOUT reading/analyzing. + +use crate::mir::builder::MirBuilder; +use crate::mir::{BasicBlock, BasicBlockId}; + +use super::plan_box::RewrittenBlocks; + +/// ApplyBox: Builder mutation only +pub struct ApplyBox; + +impl ApplyBox { + /// Apply rewritten blocks to the MirBuilder + /// + /// This is the ONLY stage that mutates the builder. + /// It doesn't analyze or plan - just adds/replaces blocks. + /// + /// # Arguments + /// * `builder` - The MirBuilder to mutate + /// * `blocks` - The rewritten blocks from PlanBox + /// + /// # Returns + /// * Result indicating success or failure + pub fn apply(builder: &mut MirBuilder, blocks: RewrittenBlocks) -> Result<(), String> { + // Add new blocks + if let Some(ref mut current_func) = builder.scope_ctx.current_function { + for block in blocks.new_blocks { + current_func.add_block(block); + } + + // Replace existing blocks + for (block_id, new_block) in blocks.block_replacements { + current_func.blocks.insert(block_id, new_block); + } + } + + Ok(()) + } +} + +#[cfg(test)] +mod tests { + use super::*; + use crate::mir::builder::MirBuilder; + use std::collections::BTreeMap; + + #[test] + fn test_apply_box_basic() { + // Basic smoke test + let mut builder = MirBuilder::new(); + let blocks = RewrittenBlocks { + new_blocks: Vec::new(), + block_replacements: BTreeMap::new(), + phi_inputs: Vec::new(), + carrier_inputs: BTreeMap::new(), + }; + + let result = ApplyBox::apply(&mut builder, blocks); + assert!(result.is_ok()); + } +} diff --git a/src/mir/builder/control_flow/joinir/merge/rewriter/mod.rs b/src/mir/builder/control_flow/joinir/merge/rewriter/mod.rs index 46b5c26e..abcda812 100644 --- a/src/mir/builder/control_flow/joinir/merge/rewriter/mod.rs +++ b/src/mir/builder/control_flow/joinir/merge/rewriter/mod.rs @@ -43,6 +43,11 @@ pub(super) mod tail_call_detector_box; // Phase 286C-2 Step 2: Tail call detecti pub(super) mod terminator; // Phase 260 P0.1 Step 5: Terminator remapping extracted ✅ pub(super) mod type_propagation; // Phase 260 P0.1 Step 4: Type propagation extracted ✅ +// Phase 286C-2.1: 3-stage pipeline (Scan → Plan → Apply) ✅ +pub(super) mod scan_box; // Stage 1: Read-only scanning for rewrite planning +pub(super) mod plan_box; // Stage 2: Pure transformation (scan → blocks) +pub(super) mod apply_box; // Stage 3: Builder mutation only + // Re-export public API // Phase 260 P0.1 Step 3: From helpers ✅ pub(super) use super::instruction_rewriter::merge_and_rewrite; // Still in parent (TODO: extract) diff --git a/src/mir/builder/control_flow/joinir/merge/rewriter/plan_box.rs b/src/mir/builder/control_flow/joinir/merge/rewriter/plan_box.rs new file mode 100644 index 00000000..a27dc17c --- /dev/null +++ b/src/mir/builder/control_flow/joinir/merge/rewriter/plan_box.rs @@ -0,0 +1,70 @@ +//! PlanBox: Pure transformation (Scan → RewrittenBlocks) +//! +//! Phase 286C-2.1: Second stage of 3-stage pipeline (Scan → Plan → Apply) +//! Generates new blocks and mappings WITHOUT touching the builder. + +use crate::mir::{BasicBlock, BasicBlockId, ValueId}; +use std::collections::BTreeMap; + +use super::scan_box::RewritePlan; + +/// Rewritten blocks: New blocks, PHI inputs, carrier inputs (ready to apply) +#[derive(Debug, Clone)] +pub struct RewrittenBlocks { + /// New blocks to add to the builder + pub new_blocks: Vec, + + /// Block replacements: Map + pub block_replacements: BTreeMap, + + /// PHI inputs for exit block: Vec<(from_block, value)> + pub phi_inputs: Vec<(BasicBlockId, ValueId)>, + + /// Carrier inputs: Map> + pub carrier_inputs: BTreeMap>, +} + +/// PlanBox: Pure transformation from RewritePlan to RewrittenBlocks +pub struct PlanBox; + +impl PlanBox { + /// Transform a RewritePlan into RewrittenBlocks + /// + /// This is a PURE transformation - no builder mutation, no side effects. + /// Can update RewriteContext but does NOT touch MirBuilder. + /// + /// # Arguments + /// * `plan` - The rewrite plan from ScanBox + /// + /// # Returns + /// * RewrittenBlocks ready to be applied to the builder + pub fn transform(_plan: RewritePlan) -> RewrittenBlocks { + // TODO: Implement transformation logic + // For now, return empty result + RewrittenBlocks { + new_blocks: Vec::new(), + block_replacements: BTreeMap::new(), + phi_inputs: Vec::new(), + carrier_inputs: BTreeMap::new(), + } + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_plan_box_basic() { + // Basic smoke test + let plan = RewritePlan { + tail_calls: Vec::new(), + return_conversions: Vec::new(), + phi_adjustments: Vec::new(), + parameter_bindings: Vec::new(), + }; + + let blocks = PlanBox::transform(plan); + assert!(blocks.new_blocks.is_empty()); + } +} diff --git a/src/mir/builder/control_flow/joinir/merge/rewriter/scan_box.rs b/src/mir/builder/control_flow/joinir/merge/rewriter/scan_box.rs new file mode 100644 index 00000000..d8c3e992 --- /dev/null +++ b/src/mir/builder/control_flow/joinir/merge/rewriter/scan_box.rs @@ -0,0 +1,162 @@ +//! ScanBox: Read-only scanning for rewrite planning +//! +//! Phase 286C-2.1: First stage of 3-stage pipeline (Scan → Plan → Apply) +//! Identifies what needs to be rewritten WITHOUT mutating any state. + +use crate::mir::{BasicBlock, BasicBlockId, MirFunction, MirInstruction, ValueId}; +use std::collections::BTreeSet; + +/// Rewrite plan: Describes WHAT to do (doesn't touch instructions yet) +#[derive(Debug, Clone)] +pub struct RewritePlan { + /// Tail calls to be converted to parameter bindings + Jump + pub tail_calls: Vec, + + /// Return instructions to convert to Jump to exit block + pub return_conversions: Vec, + + /// PHI adjustments needed (currently empty, for future use) + pub phi_adjustments: Vec, + + /// Parameter bindings to generate for tail calls + pub parameter_bindings: Vec, +} + +/// Information about a tail call to rewrite +#[derive(Debug, Clone)] +pub struct TailCallRewrite { + /// Block containing the tail call + pub block_id: BasicBlockId, + + /// Target function name + pub target_func_name: String, + + /// Target block (entry block of called function) + pub target_block: BasicBlockId, + + /// Remapped arguments to pass + pub args: Vec, + + /// Whether this is a recursive call (same function) + pub is_recursive: bool, + + /// Whether this is a continuation call (k_exit, etc.) + pub is_continuation: bool, + + /// Whether the target is the loop entry function + pub is_loop_entry: bool, +} + +/// Information about a Return instruction to convert +#[derive(Debug, Clone)] +pub struct ReturnConversion { + /// Block containing the Return + pub block_id: BasicBlockId, + + /// Return value (if any) + pub return_value: Option, + + /// Whether this has exit edge args + pub has_exit_edge_args: bool, + + /// Whether to keep Return (non-skippable continuation) + pub should_keep_return: bool, +} + +/// PHI adjustment (placeholder for future use) +#[derive(Debug, Clone)] +pub struct PhiAdjustment { + /// Block containing the PHI + pub block_id: BasicBlockId, + + /// PHI destination + pub phi_dst: ValueId, +} + +/// Parameter binding to generate +#[derive(Debug, Clone)] +pub struct ParameterBinding { + /// Block to insert the binding into + pub block_id: BasicBlockId, + + /// Destination ValueId (parameter) + pub dst: ValueId, + + /// Source ValueId (argument) + pub src: ValueId, +} + +/// ScanBox: Read-only scanning for rewrite planning +pub struct ScanBox; + +impl ScanBox { + /// Scan a block to identify rewrite operations needed + /// + /// This is a READ-ONLY operation - it doesn't mutate any state, + /// just analyzes what needs to be done. + /// + /// # Arguments + /// * `block` - The block to scan + /// * `block_id` - The block's ID + /// * `func_name` - The function containing this block + /// * Additional context parameters (TODO: consolidate into ScanContext) + /// + /// # Returns + /// * Instructions to skip, tail calls found, returns to convert, etc. + pub fn scan_block( + block: &BasicBlock, + block_id: BasicBlockId, + _func_name: &str, + ) -> BlockScanResult { + let mut skip_instructions = BTreeSet::new(); + let mut found_tail_call = false; + + // Scan instructions for tail calls, returns, etc. + for (idx, inst) in block.instructions.iter().enumerate() { + match inst { + MirInstruction::Call { .. } => { + // Detect tail calls + // TODO: Implement tail call detection logic + found_tail_call = false; // Placeholder + } + MirInstruction::Return { .. } => { + // Mark return for conversion + // TODO: Implement return conversion detection + } + _ => {} + } + } + + BlockScanResult { + skip_instructions, + found_tail_call, + tail_call_target: None, + } + } +} + +/// Result of scanning a single block +#[derive(Debug, Clone)] +pub struct BlockScanResult { + /// Instruction indices to skip during rewriting + pub skip_instructions: BTreeSet, + + /// Whether a tail call was found + pub found_tail_call: bool, + + /// Tail call target (if found) + pub tail_call_target: Option<(BasicBlockId, Vec)>, +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_scan_box_basic() { + // Basic smoke test - will expand with real logic + let block = BasicBlock::new(BasicBlockId(0)); + let result = ScanBox::scan_block(&block, BasicBlockId(0), "test_func"); + assert!(!result.found_tail_call); + } +}