Files
hakorune/docs/development/current/main/phases/phase-286/C-2.1-pipeline-refactoring.md
tomoaki 1a8a70cc99 refactor(joinir): Phase 286C-2.1 - 3-stage pipeline scaffolding (Scan/Plan/Apply)
- Create 3-stage pipeline architecture to solve borrow checker conflicts
- Stage 1 (ScanBox): Read-only scanning, identify rewrites (170 lines)
- Stage 2 (PlanBox): Pure transformation, generate new blocks (85 lines)
- Stage 3 (ApplyBox): Builder mutation only (75 lines)

New Files:
- scan_box.rs: Stage 1 - Read-only scan for RewritePlan
- plan_box.rs: Stage 2 - Transform RewritePlan → RewrittenBlocks
- apply_box.rs: Stage 3 - Apply RewrittenBlocks to MirBuilder
- C-2.1-pipeline-refactoring.md: Implementation guide

Data Structures:
- RewritePlan: Describes WHAT to rewrite (tail calls, returns, PHI, params)
- RewrittenBlocks: Contains HOW to rewrite (new blocks, replacements, inputs)

Benefits:
- Borrow checker safety: No overlapping mutable/immutable borrows
- Single responsibility: Each stage has one clear purpose
- Testability: Each stage can be unit tested independently
- Maintainability: Clear data flow, isolated changes

Status:
- Scaffolding:  Complete (structure defined, stub implementations)
- Integration:  Next step (refactor merge_and_rewrite to use 3 stages)

Build: cargo build --release 

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-25 03:05:44 +09:00

6.5 KiB

Phase 286C-2.1: 3-Stage Pipeline Refactoring

Goal

Refactor instruction_rewriter.rs into a 3-stage pipeline (Scan → Plan → Apply) to:

  1. Solve Rust borrow checker conflicts
  2. Clarify responsibilities (1 Box = 1 Question)
  3. Make code more maintainable and testable

Current Status: Scaffolding Complete

What's Done

  • Created 3 new module files with basic structure
  • Defined core data structures
  • Registered modules in mod.rs
  • Build passes (cargo build --release)

Files Created

src/mir/builder/control_flow/joinir/merge/rewriter/
├── scan_box.rs       # Stage 1: Read-only scanning
├── plan_box.rs       # Stage 2: Pure transformation
└── apply_box.rs      # Stage 3: Builder mutation only

The 3 Stages

Stage 1: Scan (Read-Only)

File: scan_box.rs Responsibility: "What needs to be rewritten?"

  • Input: &MirFunction, &JoinInlineBoundary, &RewriteContext
  • Output: RewritePlan (describes WHAT to do)
  • Key: Does NOT touch MirBuilder

Data Structures:

pub struct RewritePlan {
    pub tail_calls: Vec<TailCallRewrite>,
    pub return_conversions: Vec<ReturnConversion>,
    pub phi_adjustments: Vec<PhiAdjustment>,
    pub parameter_bindings: Vec<ParameterBinding>,
}

What it does:

  • Identify tail calls to rewrite
  • Find return instructions to convert
  • Detect PHI adjustments needed
  • Collect parameter binding requirements

Stage 2: Plan (Pure Transformation)

File: plan_box.rs Responsibility: "How should the MIR look after rewriting?"

  • Input: RewritePlan
  • Output: RewrittenBlocks (new blocks, PHI tables, remapping tables)
  • Key: Can update RewriteContext but does NOT touch builder

Data Structures:

pub struct RewrittenBlocks {
    pub new_blocks: Vec<BasicBlock>,
    pub block_replacements: BTreeMap<BasicBlockId, BasicBlock>,
    pub phi_inputs: Vec<(BasicBlockId, ValueId)>,
    pub carrier_inputs: BTreeMap<String, Vec<(BasicBlockId, ValueId)>>,
}

What it does:

  • Generate new basic blocks based on plan
  • Create PHI input mappings
  • Prepare parameter binding instructions
  • Build block ID remapping tables

Stage 3: Apply (Builder Mutation Only)

File: apply_box.rs Responsibility: "Put the new MIR into the builder"

  • Input: &mut MirBuilder, RewrittenBlocks
  • Output: Merge result reflected in builder
  • Key: ONLY adds/replaces blocks in builder

What it does:

  • Add new blocks to MirBuilder
  • Replace modified blocks
  • Update function metadata

Why This Approach Works

The Borrow Checker Problem:

  • Currently: Mutating MirBuilder while updating RewriteContext causes conflicts
  • Solution: Separate "read", "plan", and "apply" into distinct time windows

Borrow Lifecycle:

// Stage 1: Read-Only (immutable borrow)
let plan = scan(&mir_function, &boundary, &ctx);

// Stage 2: Pure (no builder borrow)
let blocks = plan_transform(plan, &mut ctx);

// Stage 3: Mutation (mutable borrow, no conflicts)
apply(&mut builder, blocks);

Implementation Strategy (Incremental)

Step 0: Scaffolding (DONE)

  • Create 3 files with basic structure
  • Define data structures
  • Register modules
  • Test build

🔄 Step 1: Split within same file (NEXT)

Create 3 functions in instruction_rewriter.rs:

fn scan(...) -> RewritePlan { ... }
fn plan(plan: RewritePlan) -> RewrittenBlocks { ... }
fn apply(builder: &mut MirBuilder, blocks: RewrittenBlocks) { ... }

Refactor merge_and_rewrite() to:

pub fn merge_and_rewrite(...) -> Result<MergeResult, String> {
    let plan = scan(...);
    let blocks = plan(plan);
    apply(builder, blocks)?;
    // ... existing post-processing ...
}

Test: cargo build --release + quick smoke tests

📦 Step 2: Extract to separate files

  • Move scan()scan_box.rs
  • Move plan()plan_box.rs
  • Move apply()apply_box.rs

Test: cargo build --release + quick smoke tests

🔗 Step 3: Merge existing helper boxes

  • Integrate InstructionFilterBox, TailCallDetectorBox → into ScanBox
  • Integrate ParameterBindingBox, ReturnConverterBox → into PlanBox

Test: cargo build --release + quick smoke tests

Current Implementation State

ScanBox (scan_box.rs)

Status: Scaffolding only

  • Data structures defined
  • Logic stub (returns empty result)
  • Needs implementation

Next: Implement scan_block() logic

PlanBox (plan_box.rs)

Status: Scaffolding only

  • Data structures defined
  • Logic stub (returns empty result)
  • Needs implementation

Next: Implement transform() logic

ApplyBox (apply_box.rs)

Status: Basic implementation

  • Data structures defined
  • Basic apply() implementation (adds/replaces blocks)
  • Needs integration with full pipeline

Next: Test with real data

Testing Strategy

Unit Tests

Each Box should have tests:

  • scan_box.rs: Test block scanning logic
  • plan_box.rs: Test plan → blocks transformation
  • apply_box.rs: Test builder mutation

Integration Tests

  • Test full pipeline: scan → plan → apply
  • Compare output with current merge_and_rewrite() (should be identical)

Smoke Tests

Use existing smoke tests after each step:

  • tools/smokes/v2/run.sh --profile quick
  • No new failures beyond known issues

Success Criteria

  1. merge_and_rewrite() becomes ~100-200 line orchestrator
  2. cargo build --release passes at each step
  3. Quick smoke tests maintain (no new failures)
  4. Borrow checker conflicts resolved
  5. Zero functionality changes (pure refactoring)

Important Notes

  • Incremental steps - test after each change
  • Preserve all functionality - this is refactoring, not rewriting
  • Use RewriteContext from Phase 286C-2 - already committed
  • Read current code first - understand actual structure before changing

Next Steps

  1. Implement Step 1: Split merge_and_rewrite() into 3 functions within same file
  2. Test extensively: Ensure no behavioral changes
  3. Document edge cases: Identify any special handling needed
  4. Proceed to Step 2: Extract to separate files only after Step 1 works

References

  • Original file: src/mir/builder/control_flow/joinir/merge/instruction_rewriter.rs (1422 lines)
  • RewriteContext: src/mir/builder/control_flow/joinir/merge/rewriter/rewrite_context.rs
  • Existing helper boxes: instruction_filter_box.rs, parameter_binding_box.rs, etc.