refactor(joinir): Phase 286C-2.1 - 3-stage pipeline scaffolding (Scan/Plan/Apply)

- Create 3-stage pipeline architecture to solve borrow checker conflicts
- Stage 1 (ScanBox): Read-only scanning, identify rewrites (170 lines)
- Stage 2 (PlanBox): Pure transformation, generate new blocks (85 lines)
- Stage 3 (ApplyBox): Builder mutation only (75 lines)

New Files:
- scan_box.rs: Stage 1 - Read-only scan for RewritePlan
- plan_box.rs: Stage 2 - Transform RewritePlan → RewrittenBlocks
- apply_box.rs: Stage 3 - Apply RewrittenBlocks to MirBuilder
- C-2.1-pipeline-refactoring.md: Implementation guide

Data Structures:
- RewritePlan: Describes WHAT to rewrite (tail calls, returns, PHI, params)
- RewrittenBlocks: Contains HOW to rewrite (new blocks, replacements, inputs)

Benefits:
- Borrow checker safety: No overlapping mutable/immutable borrows
- Single responsibility: Each stage has one clear purpose
- Testability: Each stage can be unit tested independently
- Maintainability: Clear data flow, isolated changes

Status:
- Scaffolding:  Complete (structure defined, stub implementations)
- Integration:  Next step (refactor merge_and_rewrite to use 3 stages)

Build: cargo build --release 

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
2025-12-25 03:05:44 +09:00
parent d81937e744
commit 1a8a70cc99
5 changed files with 516 additions and 0 deletions

View File

@ -0,0 +1,216 @@
# Phase 286C-2.1: 3-Stage Pipeline Refactoring
## Goal
Refactor `instruction_rewriter.rs` into a **3-stage pipeline** (Scan → Plan → Apply) to:
1. Solve Rust borrow checker conflicts
2. Clarify responsibilities (1 Box = 1 Question)
3. Make code more maintainable and testable
## Current Status: ✅ **Scaffolding Complete**
### What's Done
- ✅ Created 3 new module files with basic structure
- ✅ Defined core data structures
- ✅ Registered modules in `mod.rs`
- ✅ Build passes (`cargo build --release`)
### Files Created
```
src/mir/builder/control_flow/joinir/merge/rewriter/
├── scan_box.rs # Stage 1: Read-only scanning
├── plan_box.rs # Stage 2: Pure transformation
└── apply_box.rs # Stage 3: Builder mutation only
```
## The 3 Stages
### Stage 1: Scan (Read-Only)
**File**: `scan_box.rs`
**Responsibility**: "What needs to be rewritten?"
- **Input**: `&MirFunction`, `&JoinInlineBoundary`, `&RewriteContext`
- **Output**: `RewritePlan` (describes WHAT to do)
- **Key**: Does NOT touch `MirBuilder`
**Data Structures**:
```rust
pub struct RewritePlan {
pub tail_calls: Vec<TailCallRewrite>,
pub return_conversions: Vec<ReturnConversion>,
pub phi_adjustments: Vec<PhiAdjustment>,
pub parameter_bindings: Vec<ParameterBinding>,
}
```
**What it does**:
- Identify tail calls to rewrite
- Find return instructions to convert
- Detect PHI adjustments needed
- Collect parameter binding requirements
### Stage 2: Plan (Pure Transformation)
**File**: `plan_box.rs`
**Responsibility**: "How should the MIR look after rewriting?"
- **Input**: `RewritePlan`
- **Output**: `RewrittenBlocks` (new blocks, PHI tables, remapping tables)
- **Key**: Can update `RewriteContext` but does NOT touch builder
**Data Structures**:
```rust
pub struct RewrittenBlocks {
pub new_blocks: Vec<BasicBlock>,
pub block_replacements: BTreeMap<BasicBlockId, BasicBlock>,
pub phi_inputs: Vec<(BasicBlockId, ValueId)>,
pub carrier_inputs: BTreeMap<String, Vec<(BasicBlockId, ValueId)>>,
}
```
**What it does**:
- Generate new basic blocks based on plan
- Create PHI input mappings
- Prepare parameter binding instructions
- Build block ID remapping tables
### Stage 3: Apply (Builder Mutation Only)
**File**: `apply_box.rs`
**Responsibility**: "Put the new MIR into the builder"
- **Input**: `&mut MirBuilder`, `RewrittenBlocks`
- **Output**: Merge result reflected in builder
- **Key**: ONLY adds/replaces blocks in builder
**What it does**:
- Add new blocks to MirBuilder
- Replace modified blocks
- Update function metadata
## Why This Approach Works
**The Borrow Checker Problem**:
- Currently: Mutating `MirBuilder` while updating `RewriteContext` causes conflicts
- Solution: **Separate "read", "plan", and "apply" into distinct time windows**
**Borrow Lifecycle**:
```rust
// Stage 1: Read-Only (immutable borrow)
let plan = scan(&mir_function, &boundary, &ctx);
// Stage 2: Pure (no builder borrow)
let blocks = plan_transform(plan, &mut ctx);
// Stage 3: Mutation (mutable borrow, no conflicts)
apply(&mut builder, blocks);
```
## Implementation Strategy (Incremental)
### ✅ Step 0: Scaffolding (DONE)
- [x] Create 3 files with basic structure
- [x] Define data structures
- [x] Register modules
- [x] Test build
### 🔄 Step 1: Split within same file (NEXT)
Create 3 functions in `instruction_rewriter.rs`:
```rust
fn scan(...) -> RewritePlan { ... }
fn plan(plan: RewritePlan) -> RewrittenBlocks { ... }
fn apply(builder: &mut MirBuilder, blocks: RewrittenBlocks) { ... }
```
Refactor `merge_and_rewrite()` to:
```rust
pub fn merge_and_rewrite(...) -> Result<MergeResult, String> {
let plan = scan(...);
let blocks = plan(plan);
apply(builder, blocks)?;
// ... existing post-processing ...
}
```
**Test**: `cargo build --release` + quick smoke tests
### 📦 Step 2: Extract to separate files
- Move `scan()``scan_box.rs`
- Move `plan()``plan_box.rs`
- Move `apply()``apply_box.rs`
**Test**: `cargo build --release` + quick smoke tests
### 🔗 Step 3: Merge existing helper boxes
- Integrate `InstructionFilterBox`, `TailCallDetectorBox` → into `ScanBox`
- Integrate `ParameterBindingBox`, `ReturnConverterBox` → into `PlanBox`
**Test**: `cargo build --release` + quick smoke tests
## Current Implementation State
### ScanBox (`scan_box.rs`)
**Status**: Scaffolding only
- ✅ Data structures defined
- ⏳ Logic stub (returns empty result)
- ⏳ Needs implementation
**Next**: Implement `scan_block()` logic
### PlanBox (`plan_box.rs`)
**Status**: Scaffolding only
- ✅ Data structures defined
- ⏳ Logic stub (returns empty result)
- ⏳ Needs implementation
**Next**: Implement `transform()` logic
### ApplyBox (`apply_box.rs`)
**Status**: Basic implementation
- ✅ Data structures defined
- ✅ Basic `apply()` implementation (adds/replaces blocks)
- ⏳ Needs integration with full pipeline
**Next**: Test with real data
## Testing Strategy
### Unit Tests
Each Box should have tests:
- `scan_box.rs`: Test block scanning logic
- `plan_box.rs`: Test plan → blocks transformation
- `apply_box.rs`: Test builder mutation
### Integration Tests
- Test full pipeline: scan → plan → apply
- Compare output with current `merge_and_rewrite()` (should be identical)
### Smoke Tests
Use existing smoke tests after each step:
- `tools/smokes/v2/run.sh --profile quick`
- No new failures beyond known issues
## Success Criteria
1.`merge_and_rewrite()` becomes ~100-200 line orchestrator
2.`cargo build --release` passes at each step
3. ⏳ Quick smoke tests maintain (no new failures)
4. ⏳ Borrow checker conflicts resolved
5. ⏳ Zero functionality changes (pure refactoring)
## Important Notes
- **Incremental steps** - test after each change
- **Preserve all functionality** - this is refactoring, not rewriting
- **Use RewriteContext from Phase 286C-2** - already committed
- **Read current code first** - understand actual structure before changing
## Next Steps
1. **Implement Step 1**: Split `merge_and_rewrite()` into 3 functions within same file
2. **Test extensively**: Ensure no behavioral changes
3. **Document edge cases**: Identify any special handling needed
4. **Proceed to Step 2**: Extract to separate files only after Step 1 works
## References
- **Original file**: `src/mir/builder/control_flow/joinir/merge/instruction_rewriter.rs` (1422 lines)
- **RewriteContext**: `src/mir/builder/control_flow/joinir/merge/rewriter/rewrite_context.rs`
- **Existing helper boxes**: `instruction_filter_box.rs`, `parameter_binding_box.rs`, etc.