docs: Add comprehensive modularization implementation plan
Create 5-document modularization strategy for 3 large source files: - control_flow.rs (1,632 lines → 19 modules) - generic_case_a.rs (1,056 lines → 7 modules) - loopform_builder.rs (1,166 lines → 11 modules) Documents included: 1. **README.md** (352 lines) Navigation hub with overview, metrics, timeline, ROI analysis 2. **modularization-implementation-plan.md** (960 lines) Complete 20-hour implementation guide with: - Phase-by-phase breakdown (13 phases across 3 files) - Hour-by-hour effort estimates - Risk assessment matrices - Success criteria (quantitative & qualitative) - Public API changes (zero breaking changes) 3. **modularization-quick-start.md** (253 lines) Actionable checklist with: - Copy-paste verification commands - Emergency rollback procedures - Timeline and milestones 4. **modularization-directory-structure.md** (426 lines) Visual guide with: - Before/after directory trees - File size metrics - Import path examples - Code quality comparison 5. **phase4-merge-function-breakdown.md** (789 lines) Detailed implementation for most complex phase: - 714-line merge_joinir_mir_blocks() → 6 focused modules - 10 detailed implementation steps - Common pitfalls and solutions Key metrics: - Total effort: 20 hours (2-3 weeks) - Files: 9 → 37 focused modules - Largest file: 1,632 → 180 lines (-89%) - Backward compatible: Zero breaking changes - ROI: Breakeven in 4 months (5 hrs/month saved) Priority: control_flow.rs Phase 1-4 (high impact) Safety: Fully reversible at each phase 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
352
docs/development/refactoring/README.md
Normal file
352
docs/development/refactoring/README.md
Normal file
@ -0,0 +1,352 @@
|
||||
# Modularization Implementation Resources
|
||||
|
||||
This directory contains comprehensive plans and guides for modularizing large source files in the Nyash codebase.
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
**Goal**: Break down 3 oversized files (3,854 lines total) into 37 focused modules with clear separation of concerns.
|
||||
|
||||
**Priority**:
|
||||
1. **control_flow.rs** (1,632 lines) - **HIGHEST** (blocking Pattern 4+ development)
|
||||
2. **generic_case_a.rs** (1,056 lines) - **MEDIUM** (high code deduplication potential)
|
||||
3. **loopform_builder.rs** (1,166 lines) - **LOWER** (already partially modularized)
|
||||
|
||||
---
|
||||
|
||||
## Documents in This Directory
|
||||
|
||||
### 1. [modularization-implementation-plan.md](./modularization-implementation-plan.md) ⭐ **START HERE**
|
||||
**Comprehensive implementation plan** covering all 3 files.
|
||||
|
||||
**Contents**:
|
||||
- Executive summary
|
||||
- Phase-by-phase migration plans for each file
|
||||
- Public API changes
|
||||
- Build verification strategies
|
||||
- Risk assessment matrices
|
||||
- Implementation effort breakdowns
|
||||
- Success criteria
|
||||
|
||||
**Who should read this**: Anyone implementing the modularization.
|
||||
|
||||
**Estimated read time**: 30 minutes
|
||||
|
||||
---
|
||||
|
||||
### 2. [modularization-quick-start.md](./modularization-quick-start.md) 🚀 **QUICK REFERENCE**
|
||||
**TL;DR checklist version** of the implementation plan.
|
||||
|
||||
**Contents**:
|
||||
- Step-by-step checklists for each phase
|
||||
- Verification commands
|
||||
- Emergency rollback commands
|
||||
- Timeline and milestones
|
||||
|
||||
**Who should read this**: Developers actively working on modularization.
|
||||
|
||||
**Estimated read time**: 10 minutes
|
||||
|
||||
---
|
||||
|
||||
### 3. [modularization-directory-structure.md](./modularization-directory-structure.md) 📊 **VISUAL GUIDE**
|
||||
**Visual directory structure diagrams** showing before/after states.
|
||||
|
||||
**Contents**:
|
||||
- Directory tree diagrams for all 3 files
|
||||
- Metrics comparison tables
|
||||
- Import path changes
|
||||
- Navigation improvement examples
|
||||
- File size distribution charts
|
||||
|
||||
**Who should read this**: Anyone wanting to understand the proposed structure.
|
||||
|
||||
**Estimated read time**: 15 minutes
|
||||
|
||||
---
|
||||
|
||||
### 4. [phase4-merge-function-breakdown.md](./phase4-merge-function-breakdown.md) 🔥 **CRITICAL PHASE**
|
||||
**Detailed implementation guide** for Phase 4 (merge_joinir_mir_blocks breakdown).
|
||||
|
||||
**Contents**:
|
||||
- Function analysis (714 lines → 6 modules)
|
||||
- Detailed module breakdowns with code examples
|
||||
- Step-by-step implementation steps (10 steps)
|
||||
- Verification checklist
|
||||
- Common pitfalls and solutions
|
||||
- Rollback procedure
|
||||
|
||||
**Who should read this**: Developers working on control_flow.rs Phase 4.
|
||||
|
||||
**Estimated read time**: 20 minutes
|
||||
|
||||
---
|
||||
|
||||
## Quick Navigation
|
||||
|
||||
### I want to...
|
||||
|
||||
#### **Start the modularization**
|
||||
→ Read [modularization-implementation-plan.md](./modularization-implementation-plan.md) (full plan)
|
||||
→ Use [modularization-quick-start.md](./modularization-quick-start.md) (checklist)
|
||||
|
||||
#### **Understand the proposed structure**
|
||||
→ Read [modularization-directory-structure.md](./modularization-directory-structure.md) (visual guide)
|
||||
|
||||
#### **Work on Phase 4 (merge function)**
|
||||
→ Read [phase4-merge-function-breakdown.md](./phase4-merge-function-breakdown.md) (detailed guide)
|
||||
|
||||
#### **Get approval for the plan**
|
||||
→ Share [modularization-implementation-plan.md](./modularization-implementation-plan.md) (comprehensive)
|
||||
→ Use [modularization-directory-structure.md](./modularization-directory-structure.md) (visual support)
|
||||
|
||||
#### **Estimate effort**
|
||||
→ See "Implementation Effort Breakdown" in [modularization-implementation-plan.md](./modularization-implementation-plan.md)
|
||||
|
||||
#### **Assess risks**
|
||||
→ See "Risk Assessment" sections in [modularization-implementation-plan.md](./modularization-implementation-plan.md)
|
||||
|
||||
---
|
||||
|
||||
## Recommended Reading Order
|
||||
|
||||
### For Implementers (Developers)
|
||||
1. **Quick Start** - [modularization-quick-start.md](./modularization-quick-start.md) (10 min)
|
||||
2. **Full Plan** - [modularization-implementation-plan.md](./modularization-implementation-plan.md) (30 min)
|
||||
3. **Phase 4 Guide** - [phase4-merge-function-breakdown.md](./phase4-merge-function-breakdown.md) (when ready for Phase 4)
|
||||
|
||||
### For Reviewers (Team Leads)
|
||||
1. **Visual Guide** - [modularization-directory-structure.md](./modularization-directory-structure.md) (15 min)
|
||||
2. **Full Plan** - [modularization-implementation-plan.md](./modularization-implementation-plan.md) (30 min)
|
||||
3. **Quick Start** - [modularization-quick-start.md](./modularization-quick-start.md) (verification commands)
|
||||
|
||||
### For Stakeholders (Management)
|
||||
1. **Executive Summary** - First page of [modularization-implementation-plan.md](./modularization-implementation-plan.md) (5 min)
|
||||
2. **Metrics Comparison** - Tables in [modularization-directory-structure.md](./modularization-directory-structure.md) (5 min)
|
||||
|
||||
---
|
||||
|
||||
## Key Metrics
|
||||
|
||||
### control_flow.rs
|
||||
- **Lines**: 1,632 → 1,850 (+13% for clarity)
|
||||
- **Files**: 1 → 19
|
||||
- **Largest file**: 1,632 → 180 (-89%)
|
||||
- **Effort**: 12.5 hours
|
||||
|
||||
### generic_case_a.rs
|
||||
- **Lines**: 1,056 → 1,470 (+39% for clarity)
|
||||
- **Files**: 3 → 7
|
||||
- **Largest file**: 1,056 → 500 (-53%)
|
||||
- **Effort**: 3.5 hours
|
||||
|
||||
### loopform_builder.rs
|
||||
- **Lines**: 1,166 → 1,450 (+24% for clarity)
|
||||
- **Files**: 5 → 11
|
||||
- **Largest file**: 1,166 → 200 (-83%)
|
||||
- **Effort**: 4 hours
|
||||
|
||||
### Total
|
||||
- **Lines**: 3,854 → 4,770 (+24% for clarity, distributed across 37 files)
|
||||
- **Files**: 9 → 37
|
||||
- **Total Effort**: 20 hours (2-3 weeks)
|
||||
|
||||
---
|
||||
|
||||
## Implementation Timeline
|
||||
|
||||
### Week 1: control_flow.rs Phases 1-3 (Low Risk)
|
||||
- **Monday**: Phase 1 (Debug utilities) - 30 min
|
||||
- **Tuesday**: Phase 2 (Pattern lowerers) - 2 hours
|
||||
- **Wednesday**: Phase 3 (JoinIR routing) - 1.5 hours
|
||||
- **Thursday-Friday**: Verification and buffer
|
||||
|
||||
**Deliverable**: Pattern lowerers and routing isolated
|
||||
|
||||
### Week 2: control_flow.rs Phase 4 (High Risk)
|
||||
- **Monday-Tuesday**: Phase 4 (merge function) - 6 hours
|
||||
- **Wednesday**: Buffer for issues
|
||||
- **Thursday-Friday**: Phases 5-7 (Exception, utils, cleanup) - 2.5 hours
|
||||
|
||||
**Deliverable**: control_flow.rs fully modularized
|
||||
|
||||
### Week 3: generic_case_a.rs (Optional)
|
||||
- **Monday-Tuesday**: generic_case_a.rs Phases 1-5 - 3.5 hours
|
||||
- **Wednesday**: Buffer
|
||||
- **Thursday-Friday**: Documentation & final verification
|
||||
|
||||
**Deliverable**: generic_case_a.rs fully modularized
|
||||
|
||||
### Future: loopform_builder.rs (After Pattern 4+)
|
||||
- **Timing**: After Pattern 4/5/6 development stabilizes
|
||||
- **Effort**: 4 hours
|
||||
- **Priority**: Lower (already partially modularized)
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria
|
||||
|
||||
### Quantitative
|
||||
- ✅ All 267+ tests pass (no regressions)
|
||||
- ✅ Build time ≤ current (no increase)
|
||||
- ✅ Largest file < 250 lines (vs 1,632 before)
|
||||
- ✅ Average file size < 150 lines
|
||||
|
||||
### Qualitative
|
||||
- ✅ Code is easier to navigate
|
||||
- ✅ New patterns can be added without modifying 1,600-line files
|
||||
- ✅ Debug traces remain functional
|
||||
- ✅ Documentation is clear and helpful
|
||||
|
||||
### Process
|
||||
- ✅ Zero breaking changes at any phase
|
||||
- ✅ Each phase can be rolled back independently
|
||||
- ✅ Commits are small and focused
|
||||
- ✅ CI/CD passes after every commit
|
||||
|
||||
---
|
||||
|
||||
## Verification Commands
|
||||
|
||||
### Quick Verification (after each phase)
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --lib
|
||||
```
|
||||
|
||||
### Comprehensive Verification (after critical phases)
|
||||
```bash
|
||||
cargo build --release --all-features
|
||||
cargo test --release
|
||||
cargo clippy --all-targets
|
||||
tools/smokes/v2/run.sh --profile quick
|
||||
```
|
||||
|
||||
### Debug Trace Verification (Phase 4 only)
|
||||
```bash
|
||||
NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako 2>&1 | grep "merge_joinir"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Emergency Rollback
|
||||
|
||||
### control_flow.rs
|
||||
```bash
|
||||
rm -rf src/mir/builder/control_flow/
|
||||
git checkout src/mir/builder/control_flow.rs
|
||||
cargo build --release && cargo test --lib
|
||||
```
|
||||
|
||||
### generic_case_a.rs
|
||||
```bash
|
||||
rm -rf src/mir/join_ir/lowering/generic_case_a/
|
||||
git checkout src/mir/join_ir/lowering/generic_case_a*.rs
|
||||
cargo build --release && cargo test --lib
|
||||
```
|
||||
|
||||
### loopform_builder.rs
|
||||
```bash
|
||||
rm -rf src/mir/phi_core/loopform/
|
||||
git checkout src/mir/phi_core/loopform*.rs
|
||||
cargo build --release && cargo test --lib
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Why Modularize?
|
||||
|
||||
### Current Pain Points
|
||||
1. **714-line merge function** - Impossible to understand without hours of study
|
||||
2. **1,632-line control_flow.rs** - Pattern 4+ would add another 500+ lines
|
||||
3. **Merge conflicts** - Multiple developers editing the same giant file
|
||||
4. **Hard to debug** - `NYASH_OPTION_C_DEBUG` traces are buried in massive files
|
||||
5. **Hard to test** - Can't test individual phases in isolation
|
||||
|
||||
### Benefits After Modularization
|
||||
1. **100-150 line modules** - Easy to understand at a glance
|
||||
2. **19 focused files** - Each with a single responsibility
|
||||
3. **Isolated changes** - Modify one phase without affecting others
|
||||
4. **Easy debugging** - Jump to specific module for traces
|
||||
5. **Testable** - Can unit test individual modules
|
||||
|
||||
### ROI (Return on Investment)
|
||||
- **Time investment**: 20 hours (2-3 weeks)
|
||||
- **Time saved**: ~5 hours/month on maintenance (conservatively)
|
||||
- **Breakeven**: 4 months
|
||||
- **Long-term benefit**: Much easier Pattern 4/5/6 development
|
||||
|
||||
---
|
||||
|
||||
## Implementation Order Justification
|
||||
|
||||
### Why control_flow.rs First?
|
||||
1. **Blocking Pattern 4+** - Currently blocking new pattern development
|
||||
2. **Highest pain** - 714-line merge function is the biggest code smell
|
||||
3. **Sets the pattern** - Establishes the modularization template for others
|
||||
4. **Most benefit** - Reduces merge conflicts immediately
|
||||
|
||||
### Why generic_case_a.rs Second?
|
||||
1. **High code deduplication** - 4 similar lowerers can be separated
|
||||
2. **Already partially split** - Companion files already exist
|
||||
3. **Medium priority** - Not blocking, but would improve maintainability
|
||||
|
||||
### Why loopform_builder.rs Last?
|
||||
1. **Already partially modularized** - Phase 191 did most of the work
|
||||
2. **Lower priority** - Not blocking anything
|
||||
3. **Can wait** - Best done after Pattern 4+ development stabilizes
|
||||
|
||||
---
|
||||
|
||||
## Questions & Concerns
|
||||
|
||||
### "Is this worth the effort?"
|
||||
**Yes.** 20 hours investment for ongoing maintenance benefits. Breakeven in 4 months.
|
||||
|
||||
### "Will this break anything?"
|
||||
**No.** Zero breaking changes, backward compatible at every phase. Full test suite verification.
|
||||
|
||||
### "Can we roll back if needed?"
|
||||
**Yes.** Each phase can be rolled back independently with simple git commands.
|
||||
|
||||
### "What if we only do control_flow.rs?"
|
||||
**Still valuable.** That's where the highest pain is. Do that first, others can wait.
|
||||
|
||||
### "Who should implement this?"
|
||||
**Experienced developer** familiar with MIR builder and JoinIR integration. Phase 4 requires careful attention.
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Review this README** - Understand the resources available
|
||||
2. **Read the full plan** - [modularization-implementation-plan.md](./modularization-implementation-plan.md)
|
||||
3. **Get approval** - Share with team leads
|
||||
4. **Create a branch** - `refactor/modularize-control-flow`
|
||||
5. **Start Phase 1** - Use [modularization-quick-start.md](./modularization-quick-start.md)
|
||||
|
||||
---
|
||||
|
||||
## Document Status
|
||||
|
||||
- **Created**: 2025-12-05
|
||||
- **Status**: Ready for review and implementation
|
||||
- **Maintainer**: Claude Code (AI-assisted planning)
|
||||
- **Next Review**: After Week 1 completion
|
||||
|
||||
---
|
||||
|
||||
## Feedback
|
||||
|
||||
If you have questions or suggestions about this modularization plan:
|
||||
|
||||
1. **Open an issue** - Tag with `refactoring` label
|
||||
2. **Update the plan** - Submit a PR with improvements
|
||||
3. **Document lessons learned** - Add notes to this README
|
||||
|
||||
**Contact**: Open a discussion in the team channel
|
||||
|
||||
---
|
||||
|
||||
**Happy Modularizing!** 🚀
|
||||
@ -0,0 +1,426 @@
|
||||
# Modularization Directory Structure Diagrams
|
||||
|
||||
Visual reference for the proposed directory structures after modularization.
|
||||
|
||||
---
|
||||
|
||||
## 1. control_flow.rs Modularization
|
||||
|
||||
### Before (Current State)
|
||||
```
|
||||
src/mir/builder/
|
||||
├── control_flow.rs (1,632 lines) ⚠️ MONOLITH
|
||||
├── if_form.rs
|
||||
├── loops.rs
|
||||
└── ... (other files)
|
||||
```
|
||||
|
||||
### After (Proposed Structure)
|
||||
```
|
||||
src/mir/builder/
|
||||
├── control_flow/
|
||||
│ ├── mod.rs (~150 lines) ✅ Entry points
|
||||
│ │ ├── pub(super) fn cf_block()
|
||||
│ │ ├── pub(super) fn cf_if()
|
||||
│ │ ├── pub(super) fn cf_loop()
|
||||
│ │ ├── pub(super) fn cf_try_catch()
|
||||
│ │ └── pub(super) fn cf_throw()
|
||||
│ │
|
||||
│ ├── debug.rs (~50 lines) ✅ Debug utilities
|
||||
│ │ └── fn trace_varmap()
|
||||
│ │
|
||||
│ ├── utils.rs (~50 lines) ✅ Utility functions
|
||||
│ │ └── fn extract_loop_variable_from_condition()
|
||||
│ │
|
||||
│ ├── joinir/
|
||||
│ │ ├── mod.rs (~100 lines) ✅ JoinIR coordinator
|
||||
│ │ │
|
||||
│ │ ├── routing.rs (~150 lines) ✅ Routing logic
|
||||
│ │ │ ├── fn try_cf_loop_joinir()
|
||||
│ │ │ └── fn cf_loop_joinir_impl()
|
||||
│ │ │
|
||||
│ │ ├── merge/
|
||||
│ │ │ ├── mod.rs (~100 lines) ✅ Merge coordinator
|
||||
│ │ │ │ └── pub fn merge_joinir_mir_blocks()
|
||||
│ │ │ │
|
||||
│ │ │ ├── id_remapper.rs (~150 lines) ✅ ID remapping
|
||||
│ │ │ │ ├── struct JoinIrIdRemapper
|
||||
│ │ │ │ ├── fn create_remapper()
|
||||
│ │ │ │ └── fn remap_ids()
|
||||
│ │ │ │
|
||||
│ │ │ ├── block_allocator.rs (~100 lines) ✅ Block allocation
|
||||
│ │ │ │ └── fn allocate_blocks()
|
||||
│ │ │ │
|
||||
│ │ │ ├── value_collector.rs (~100 lines) ✅ Value collection
|
||||
│ │ │ │ └── fn collect_values()
|
||||
│ │ │ │
|
||||
│ │ │ ├── instruction_rewriter.rs (~150 lines) ✅ Instruction rewriting
|
||||
│ │ │ │ ├── fn rewrite_instructions()
|
||||
│ │ │ │ └── fn convert_call_to_jump()
|
||||
│ │ │ │
|
||||
│ │ │ └── exit_phi_builder.rs (~100 lines) ✅ Exit PHI construction
|
||||
│ │ │ └── fn build_exit_phi()
|
||||
│ │ │
|
||||
│ │ └── patterns/
|
||||
│ │ ├── mod.rs (~50 lines) ✅ Pattern dispatcher
|
||||
│ │ │ └── pub fn dispatch_pattern()
|
||||
│ │ │
|
||||
│ │ ├── pattern1_minimal.rs (~150 lines) ✅ Pattern 1 lowering
|
||||
│ │ │ └── pub fn cf_loop_pattern1_minimal()
|
||||
│ │ │
|
||||
│ │ ├── pattern2_with_break.rs (~130 lines) ✅ Pattern 2 lowering
|
||||
│ │ │ └── pub fn cf_loop_pattern2_with_break()
|
||||
│ │ │
|
||||
│ │ └── pattern3_with_if_phi.rs (~180 lines) ✅ Pattern 3 lowering
|
||||
│ │ └── pub fn cf_loop_pattern3_with_if_phi()
|
||||
│ │
|
||||
│ └── exception/
|
||||
│ ├── mod.rs (~50 lines) ✅ Exception API
|
||||
│ │
|
||||
│ ├── try_catch.rs (~150 lines) ✅ try/catch impl
|
||||
│ │ └── pub fn cf_try_catch()
|
||||
│ │
|
||||
│ └── throw.rs (~30 lines) ✅ throw impl
|
||||
│ └── pub fn cf_throw()
|
||||
│
|
||||
├── if_form.rs
|
||||
├── loops.rs
|
||||
└── ... (other files - unchanged)
|
||||
```
|
||||
|
||||
### Metrics Comparison
|
||||
|
||||
| Metric | Before | After | Change |
|
||||
|--------|--------|-------|--------|
|
||||
| **Total lines** | 1,632 | ~1,850 | +13% (for clarity) |
|
||||
| **Number of files** | 1 | 19 | +1800% |
|
||||
| **Largest file** | 1,632 | ~180 | -89% |
|
||||
| **Average file size** | 1,632 | ~97 | -94% |
|
||||
| **Functions per file** | 13 | 1-2 | Much clearer |
|
||||
|
||||
### Benefits
|
||||
- ✅ **714-line merge function → 6 focused modules** (100-150 lines each)
|
||||
- ✅ **Pattern lowerers isolated** → Easy to add Pattern 4/5/6
|
||||
- ✅ **Debug traces easier to locate** → NYASH_OPTION_C_DEBUG output clearer
|
||||
- ✅ **Merge conflicts reduced** → Changes isolated to specific files
|
||||
- ✅ **Code navigation improved** → Jump to definition works better
|
||||
|
||||
---
|
||||
|
||||
## 2. generic_case_a.rs Modularization
|
||||
|
||||
### Before (Current State)
|
||||
```
|
||||
src/mir/join_ir/lowering/
|
||||
├── generic_case_a.rs (1,056 lines) ⚠️ LARGE
|
||||
├── generic_case_a_entry_builder.rs (4,828 bytes)
|
||||
├── generic_case_a_whitespace_check.rs (4,552 bytes)
|
||||
└── ... (other files)
|
||||
```
|
||||
|
||||
### After (Proposed Structure)
|
||||
```
|
||||
src/mir/join_ir/lowering/
|
||||
├── generic_case_a/
|
||||
│ ├── mod.rs (~100 lines) ✅ Public API
|
||||
│ │ ├── pub use skip_ws::lower_case_a_skip_ws_with_scope
|
||||
│ │ ├── pub use trim::lower_case_a_trim_with_scope
|
||||
│ │ ├── pub use append_defs::lower_case_a_append_defs_with_scope
|
||||
│ │ └── pub use stage1_using_resolver::...
|
||||
│ │
|
||||
│ ├── skip_ws.rs (~220 lines) ✅ skip_ws lowerer
|
||||
│ │ ├── pub fn lower_case_a_skip_ws_with_scope()
|
||||
│ │ └── fn lower_case_a_skip_ws_core()
|
||||
│ │
|
||||
│ ├── trim.rs (~500 lines) ✅ trim lowerer
|
||||
│ │ ├── pub fn lower_case_a_trim_with_scope()
|
||||
│ │ └── fn lower_case_a_trim_core()
|
||||
│ │
|
||||
│ ├── append_defs.rs (~170 lines) ✅ append_defs lowerer
|
||||
│ │ ├── pub fn lower_case_a_append_defs_with_scope()
|
||||
│ │ └── fn lower_case_a_append_defs_core()
|
||||
│ │
|
||||
│ ├── stage1_using_resolver.rs (~180 lines) ✅ stage1 lowerer
|
||||
│ │ ├── pub fn lower_case_a_stage1_usingresolver_with_scope()
|
||||
│ │ └── fn lower_case_a_stage1_usingresolver_core()
|
||||
│ │
|
||||
│ ├── entry_builder.rs (~150 lines) ✅ (moved from parent)
|
||||
│ │ └── struct EntryFunctionBuilder
|
||||
│ │
|
||||
│ └── whitespace_check.rs (~150 lines) ✅ (moved from parent)
|
||||
│ └── fn check_whitespace()
|
||||
│
|
||||
└── ... (other files - unchanged)
|
||||
```
|
||||
|
||||
### Metrics Comparison
|
||||
|
||||
| Metric | Before | After | Change |
|
||||
|--------|--------|-------|--------|
|
||||
| **Total lines** | 1,056 | ~1,470 | +39% (for clarity) |
|
||||
| **Number of files** | 3 (scattered) | 7 (organized) | +133% |
|
||||
| **Largest file** | 1,056 | ~500 | -53% |
|
||||
| **Average file size** | 352 | ~210 | -40% |
|
||||
| **Functions per file** | 4 | 1-2 | Much clearer |
|
||||
|
||||
### Benefits
|
||||
- ✅ **Each lowerer in its own file** → Easy to maintain
|
||||
- ✅ **Companion files integrated** → All Case A logic in one directory
|
||||
- ✅ **trim.rs still large (500 lines)** → Could be further split if needed
|
||||
- ✅ **Clear public API** → mod.rs shows what's exported
|
||||
|
||||
---
|
||||
|
||||
## 3. loopform_builder.rs Modularization
|
||||
|
||||
### Before (Current State - After Phase 191)
|
||||
```
|
||||
src/mir/phi_core/
|
||||
├── loopform_builder.rs (1,166 lines) ⚠️ LARGE
|
||||
├── loopform_context.rs (✅ already modularized)
|
||||
├── loopform_variable_models.rs (✅ already modularized)
|
||||
├── loopform_utils.rs (✅ already modularized)
|
||||
├── loopform_exit_phi.rs (✅ already modularized)
|
||||
└── ... (other files)
|
||||
```
|
||||
|
||||
### After (Proposed Structure)
|
||||
```
|
||||
src/mir/phi_core/
|
||||
├── loopform/
|
||||
│ ├── mod.rs (~100 lines) ✅ Public API
|
||||
│ │ ├── pub use context::LoopFormContext
|
||||
│ │ ├── pub use variable_models::{CarrierVariable, PinnedVariable}
|
||||
│ │ ├── pub use exit_phi::build_exit_phis_for_control
|
||||
│ │ └── pub use builder_core::LoopFormBuilder
|
||||
│ │
|
||||
│ ├── context.rs (~150 lines) ✅ (existing)
|
||||
│ │ └── pub struct LoopFormContext
|
||||
│ │
|
||||
│ ├── variable_models.rs (~150 lines) ✅ (existing)
|
||||
│ │ ├── pub struct CarrierVariable
|
||||
│ │ ├── pub struct PinnedVariable
|
||||
│ │ └── pub struct LoopBypassFlags
|
||||
│ │
|
||||
│ ├── utils.rs (~100 lines) ✅ (existing)
|
||||
│ │ ├── pub fn is_loopform_debug_enabled()
|
||||
│ │ └── pub fn get_loop_bypass_flags()
|
||||
│ │
|
||||
│ ├── exit_phi.rs (~150 lines) ✅ (existing)
|
||||
│ │ └── pub fn build_exit_phis_for_control()
|
||||
│ │
|
||||
│ ├── passes/
|
||||
│ │ ├── mod.rs (~50 lines) ✅ 4-pass coordinator
|
||||
│ │ │ └── pub fn run_4_pass_architecture()
|
||||
│ │ │
|
||||
│ │ ├── pass1_discovery.rs (~150 lines) ✅ Variable discovery
|
||||
│ │ │ └── pub fn discover_variables()
|
||||
│ │ │
|
||||
│ │ ├── pass2_header_phi.rs (~150 lines) ✅ Header PHI
|
||||
│ │ │ └── pub fn build_header_phi()
|
||||
│ │ │
|
||||
│ │ ├── pass3_latch.rs (~100 lines) ✅ Latch processing
|
||||
│ │ │ └── pub fn process_latch()
|
||||
│ │ │
|
||||
│ │ └── pass4_exit_phi.rs (~150 lines) ✅ Exit PHI
|
||||
│ │ └── pub fn build_exit_phi()
|
||||
│ │
|
||||
│ └── builder_core.rs (~200 lines) ✅ Core builder
|
||||
│ └── pub struct LoopFormBuilder
|
||||
│
|
||||
└── ... (other files - unchanged)
|
||||
```
|
||||
|
||||
### Metrics Comparison
|
||||
|
||||
| Metric | Before | After | Change |
|
||||
|--------|--------|-------|--------|
|
||||
| **Total lines** | 1,166 (main file) | ~1,450 | +24% (for clarity) |
|
||||
| **Number of files** | 5 (partially modularized) | 11 (fully modularized) | +120% |
|
||||
| **Largest file** | 1,166 | ~200 | -83% |
|
||||
| **Average file size** | 233 | ~132 | -43% |
|
||||
|
||||
### Benefits
|
||||
- ✅ **Completes Phase 191 modularization** → Finishes what was started
|
||||
- ✅ **4-pass architecture explicit** → Each pass in its own file
|
||||
- ✅ **Already partially modularized** → Lower risk than control_flow.rs
|
||||
- ✅ **Clear separation of concerns** → Context, models, passes, builder
|
||||
|
||||
---
|
||||
|
||||
## File Size Distribution Comparison
|
||||
|
||||
### control_flow.rs
|
||||
```
|
||||
Before:
|
||||
█████████████████████████████████████████████████████████████████ 1,632 lines
|
||||
|
||||
After:
|
||||
merge/instruction_rewriter.rs: ███████ 150 lines
|
||||
pattern1_minimal.rs: ███████ 150 lines
|
||||
routing.rs: ███████ 150 lines
|
||||
pattern3_with_if_phi.rs: ████████ 180 lines
|
||||
try_catch.rs: ███████ 150 lines
|
||||
... (14 more files < 150 lines each)
|
||||
```
|
||||
|
||||
### generic_case_a.rs
|
||||
```
|
||||
Before:
|
||||
███████████████████████████████████████████████████ 1,056 lines
|
||||
|
||||
After:
|
||||
trim.rs: ████████████████████████ 500 lines
|
||||
skip_ws.rs: ██████████ 220 lines
|
||||
stage1_using_resolver.rs: ████████ 180 lines
|
||||
append_defs.rs: ████████ 170 lines
|
||||
... (3 more files < 150 lines each)
|
||||
```
|
||||
|
||||
### loopform_builder.rs
|
||||
```
|
||||
Before:
|
||||
████████████████████████████████████████████████████████ 1,166 lines
|
||||
|
||||
After:
|
||||
builder_core.rs: █████████ 200 lines
|
||||
context.rs: ███████ 150 lines
|
||||
variable_models.rs: ███████ 150 lines
|
||||
exit_phi.rs: ███████ 150 lines
|
||||
pass1_discovery.rs: ███████ 150 lines
|
||||
... (6 more files < 150 lines each)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Import Path Changes
|
||||
|
||||
### control_flow.rs
|
||||
|
||||
#### Before
|
||||
```rust
|
||||
use crate::mir::builder::MirBuilder;
|
||||
|
||||
impl MirBuilder {
|
||||
pub(super) fn cf_loop(...) -> Result<ValueId, String> {
|
||||
// 1,632 lines of code
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### After
|
||||
```rust
|
||||
// src/mir/builder/control_flow/mod.rs
|
||||
use crate::mir::builder::MirBuilder;
|
||||
|
||||
impl MirBuilder {
|
||||
pub(super) fn cf_loop(...) -> Result<ValueId, String> {
|
||||
// Delegates to routing module
|
||||
joinir::routing::try_cf_loop_joinir(self, ...)
|
||||
}
|
||||
}
|
||||
|
||||
// src/mir/builder/control_flow/joinir/routing.rs
|
||||
pub(in crate::mir::builder::control_flow) fn try_cf_loop_joinir(
|
||||
builder: &mut MirBuilder,
|
||||
...
|
||||
) -> Result<Option<ValueId>, String> {
|
||||
// 150 lines of focused code
|
||||
}
|
||||
```
|
||||
|
||||
### generic_case_a.rs
|
||||
|
||||
#### Before
|
||||
```rust
|
||||
// src/mir/join_ir/lowering/generic_case_a.rs
|
||||
pub(crate) fn lower_case_a_skip_ws_with_scope(...) -> Option<JoinModule> {
|
||||
// 200+ lines
|
||||
}
|
||||
```
|
||||
|
||||
#### After
|
||||
```rust
|
||||
// src/mir/join_ir/lowering/generic_case_a/mod.rs
|
||||
pub(crate) use skip_ws::lower_case_a_skip_ws_with_scope;
|
||||
|
||||
// src/mir/join_ir/lowering/generic_case_a/skip_ws.rs
|
||||
pub(crate) fn lower_case_a_skip_ws_with_scope(...) -> Option<JoinModule> {
|
||||
// 220 lines
|
||||
}
|
||||
```
|
||||
|
||||
### loopform_builder.rs
|
||||
|
||||
#### Before
|
||||
```rust
|
||||
// src/mir/phi_core/loopform_builder.rs
|
||||
pub use loopform_context::LoopFormContext;
|
||||
|
||||
pub fn build_exit_phis_for_control<O: LoopFormOps>(...) {
|
||||
// 1,166 lines
|
||||
}
|
||||
```
|
||||
|
||||
#### After
|
||||
```rust
|
||||
// src/mir/phi_core/loopform/mod.rs
|
||||
pub use context::LoopFormContext;
|
||||
pub use exit_phi::build_exit_phis_for_control;
|
||||
|
||||
// src/mir/phi_core/loopform/exit_phi.rs
|
||||
pub fn build_exit_phis_for_control<O: LoopFormOps>(...) {
|
||||
// 150 lines
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Navigation Improvements
|
||||
|
||||
### Before (Monolith Files)
|
||||
```
|
||||
Developer: "Where is the merge function?"
|
||||
Answer: "control_flow.rs, line 864-1578 (search through 1,632 lines)"
|
||||
|
||||
Developer: "Where is Pattern 3 lowering?"
|
||||
Answer: "control_flow.rs, line 696-863 (search through 1,632 lines)"
|
||||
|
||||
Developer: "What does merge_joinir_mir_blocks do?"
|
||||
Answer: "Read 714 lines to understand"
|
||||
```
|
||||
|
||||
### After (Modularized)
|
||||
```
|
||||
Developer: "Where is the merge function?"
|
||||
Answer: "control_flow/joinir/merge/mod.rs (100 lines coordinator)"
|
||||
|
||||
Developer: "Where is Pattern 3 lowering?"
|
||||
Answer: "control_flow/joinir/patterns/pattern3_with_if_phi.rs (180 lines)"
|
||||
|
||||
Developer: "What does merge_joinir_mir_blocks do?"
|
||||
Answer: "Read merge/mod.rs (100 lines) → delegates to 6 sub-modules"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
The modularization dramatically improves code organization:
|
||||
|
||||
- **control_flow.rs**: 1,632 lines → 19 files (avg 97 lines)
|
||||
- **generic_case_a.rs**: 1,056 lines → 7 files (avg 210 lines)
|
||||
- **loopform_builder.rs**: 1,166 lines → 11 files (avg 132 lines)
|
||||
|
||||
**Total Impact**: 3,854 lines → 37 focused modules
|
||||
|
||||
**Developer Experience**:
|
||||
- ✅ Easier navigation (jump to definition works better)
|
||||
- ✅ Clearer separation of concerns
|
||||
- ✅ Less merge conflicts
|
||||
- ✅ Easier to add new patterns/lowerers
|
||||
- ✅ Better code review experience (smaller diffs)
|
||||
|
||||
---
|
||||
|
||||
**Last Updated**: 2025-12-05
|
||||
@ -0,0 +1,960 @@
|
||||
# Modularization Implementation Plan for Large Source Files
|
||||
|
||||
## Executive Summary
|
||||
|
||||
This plan details the modularization of three oversized source files in the Nyash MIR builder:
|
||||
- **control_flow.rs** (1,632 lines) - **HIGHEST PRIORITY**
|
||||
- **generic_case_a.rs** (1,056 lines) - **MEDIUM PRIORITY**
|
||||
- **loopform_builder.rs** (1,166 lines) - **LOWER PRIORITY**
|
||||
|
||||
The strategy prioritizes **control_flow.rs** first due to its critical role in JoinIR integration and ongoing Pattern 4+ development. Each modularization is broken into incremental phases that maintain backward compatibility, with clear verification points and rollback procedures.
|
||||
|
||||
**Total Estimated Effort**: 15-20 hours across 2-3 weeks
|
||||
**Key Principle**: Zero breaking changes, backward compatible at every phase
|
||||
|
||||
---
|
||||
|
||||
## Why control_flow.rs First?
|
||||
|
||||
**control_flow.rs** is the integration point for JoinIR lowering patterns, actively being extended for Pattern 4+. Modularizing now:
|
||||
|
||||
1. **Prevents future pain** - Pattern 4/5/6 would add another 500+ lines to an already massive file
|
||||
2. **Sets the pattern** - Establishes the modularization template for other files
|
||||
3. **Reduces merge conflicts** - Isolates pattern-specific changes to dedicated files
|
||||
4. **Improves debuggability** - `NYASH_OPTION_C_DEBUG` traces become easier to locate
|
||||
5. **Currently blocking** - The 714-line `merge_joinir_mir_blocks()` function is a code smell that makes maintenance difficult
|
||||
|
||||
---
|
||||
|
||||
# 1. control_flow.rs Modularization (HIGHEST PRIORITY)
|
||||
|
||||
## Current State
|
||||
|
||||
**File**: `src/mir/builder/control_flow.rs` (1,632 lines)
|
||||
|
||||
**Functions** (13 total):
|
||||
- `trace_varmap()` (8 lines) - Debug utility
|
||||
- `cf_block()` (4 lines) - Block entry point
|
||||
- `cf_if()` (10 lines) - If entry point
|
||||
- `cf_loop()` (80 lines) - Loop entry point
|
||||
- `try_cf_loop_joinir()` (91 lines) - JoinIR routing logic
|
||||
- `cf_loop_joinir_impl()` (247 lines) - JoinIR pattern dispatcher
|
||||
- `cf_loop_pattern1_minimal()` (143 lines) - Pattern 1 lowering
|
||||
- `cf_loop_pattern2_with_break()` (120 lines) - Pattern 2 lowering
|
||||
- `cf_loop_pattern3_with_if_phi()` (168 lines) - Pattern 3 lowering
|
||||
- **`merge_joinir_mir_blocks()` (714 lines) - LARGEST FUNCTION** ⚠️
|
||||
- `cf_try_catch()` (138 lines) - Exception handling
|
||||
- `extract_loop_variable_from_condition()` (31 lines) - Utility
|
||||
- `cf_throw()` (23 lines) - Throw entry point
|
||||
|
||||
**Key Issues**:
|
||||
- `merge_joinir_mir_blocks()` is 714 lines (44% of the file!)
|
||||
- Pattern lowerers (1/2/3) are isolated but scattered
|
||||
- JoinIR integration logic mixed with entry points
|
||||
- No clear separation between routing and implementation
|
||||
|
||||
---
|
||||
|
||||
## Proposed New Structure
|
||||
|
||||
```
|
||||
src/mir/builder/control_flow/
|
||||
├── mod.rs (~150 lines) - Public API, entry points
|
||||
├── debug.rs (~50 lines) - Debug utilities (trace_varmap)
|
||||
├── joinir/
|
||||
│ ├── mod.rs (~100 lines) - JoinIR integration coordinator
|
||||
│ ├── routing.rs (~150 lines) - try_cf_loop_joinir, dispatcher
|
||||
│ ├── merge/
|
||||
│ │ ├── mod.rs (~100 lines) - merge_joinir_mir_blocks entry point
|
||||
│ │ ├── id_remapper.rs (~150 lines) - ValueId/BlockId remapping
|
||||
│ │ ├── block_allocator.rs (~100 lines) - Block ID allocation
|
||||
│ │ ├── value_collector.rs (~100 lines) - Value collection phase
|
||||
│ │ ├── instruction_rewriter.rs (~150 lines) - Instruction rewriting
|
||||
│ │ └── exit_phi_builder.rs (~100 lines) - Exit PHI construction
|
||||
│ └── patterns/
|
||||
│ ├── mod.rs (~50 lines) - Pattern dispatcher
|
||||
│ ├── pattern1_minimal.rs (~150 lines) - Pattern 1 lowering
|
||||
│ ├── pattern2_with_break.rs (~130 lines) - Pattern 2 lowering
|
||||
│ └── pattern3_with_if_phi.rs (~180 lines) - Pattern 3 lowering
|
||||
├── exception/
|
||||
│ ├── mod.rs (~50 lines) - Exception handling API
|
||||
│ ├── try_catch.rs (~150 lines) - try/catch implementation
|
||||
│ └── throw.rs (~30 lines) - throw implementation
|
||||
└── utils.rs (~50 lines) - extract_loop_variable, etc.
|
||||
|
||||
Total: ~1,850 lines (13% increase for clarity, but distributed across 19 files)
|
||||
Average file size: ~97 lines (vs 1,632 lines monolith)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase-by-Phase Migration Plan
|
||||
|
||||
### Phase 1: Extract Debug Utilities (30 min)
|
||||
|
||||
**Goal**: Move `trace_varmap()` to a dedicated debug module.
|
||||
|
||||
**Steps**:
|
||||
1. Create `src/mir/builder/control_flow/debug.rs`
|
||||
2. Move `trace_varmap()` implementation
|
||||
3. Update `mod.rs` to re-export `pub(super) use debug::*;`
|
||||
4. Run verification
|
||||
|
||||
**Files Created**:
|
||||
- `src/mir/builder/control_flow/debug.rs` (~50 lines)
|
||||
|
||||
**Files Modified**:
|
||||
- `src/mir/builder/control_flow.rs` → `.../control_flow/mod.rs`
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --lib
|
||||
tools/smokes/v2/run.sh --profile quick --filter "loop_*"
|
||||
```
|
||||
|
||||
**Rollback**: Delete `debug.rs`, revert imports in `mod.rs`
|
||||
|
||||
**Estimated Effort**: 30 minutes
|
||||
|
||||
---
|
||||
|
||||
### Phase 2: Extract Pattern Lowerers (2 hours)
|
||||
|
||||
**Goal**: Move Pattern 1/2/3 lowering functions to dedicated files.
|
||||
|
||||
**Steps**:
|
||||
1. Create `src/mir/builder/control_flow/joinir/patterns/` directory
|
||||
2. Create `patterns/mod.rs` with dispatcher
|
||||
3. Move `cf_loop_pattern1_minimal()` to `pattern1_minimal.rs`
|
||||
4. Move `cf_loop_pattern2_with_break()` to `pattern2_with_break.rs`
|
||||
5. Move `cf_loop_pattern3_with_if_phi()` to `pattern3_with_if_phi.rs`
|
||||
6. Update imports in `mod.rs`
|
||||
7. Run verification
|
||||
|
||||
**Files Created**:
|
||||
- `control_flow/joinir/patterns/mod.rs` (~50 lines)
|
||||
- `control_flow/joinir/patterns/pattern1_minimal.rs` (~150 lines)
|
||||
- `control_flow/joinir/patterns/pattern2_with_break.rs` (~130 lines)
|
||||
- `control_flow/joinir/patterns/pattern3_with_if_phi.rs` (~180 lines)
|
||||
|
||||
**Public API Changes**: None (all functions are already `fn`, not `pub(super) fn`)
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --lib -- --include-ignored
|
||||
NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako
|
||||
tools/smokes/v2/run.sh --profile quick --filter "loop_*"
|
||||
```
|
||||
|
||||
**Rollback**: Delete `patterns/` directory, revert imports
|
||||
|
||||
**Estimated Effort**: 2 hours
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: Extract JoinIR Routing Logic (1.5 hours)
|
||||
|
||||
**Goal**: Move `try_cf_loop_joinir()` and `cf_loop_joinir_impl()` to routing module.
|
||||
|
||||
**Steps**:
|
||||
1. Create `control_flow/joinir/routing.rs`
|
||||
2. Move `try_cf_loop_joinir()` implementation
|
||||
3. Move `cf_loop_joinir_impl()` implementation
|
||||
4. Update imports in `mod.rs`
|
||||
5. Run verification
|
||||
|
||||
**Files Created**:
|
||||
- `control_flow/joinir/routing.rs` (~150 lines)
|
||||
|
||||
**Files Modified**:
|
||||
- `control_flow/mod.rs` (update imports)
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --release
|
||||
HAKO_JOINIR_PRINT_TOKENS_MAIN=1 ./target/release/nyash test_program.hako
|
||||
```
|
||||
|
||||
**Rollback**: Delete `routing.rs`, revert imports
|
||||
|
||||
**Estimated Effort**: 1.5 hours
|
||||
|
||||
---
|
||||
|
||||
### Phase 4: Break Down merge_joinir_mir_blocks (6 hours) ⚠️ **CRITICAL**
|
||||
|
||||
**Goal**: Split the 714-line monster function into 6 logical modules.
|
||||
|
||||
**Background**: `merge_joinir_mir_blocks()` performs 6 distinct phases:
|
||||
1. **Block ID allocation** (lines 864-923)
|
||||
2. **Value collection** (lines 931-971)
|
||||
3. **Block merging** (lines 973-1100)
|
||||
4. **Instruction rewriting** (lines 1102-1400)
|
||||
5. **Exit PHI construction** (lines 1402-1500)
|
||||
6. **Boundary reconnection** (lines 1502-1578)
|
||||
|
||||
**Steps**:
|
||||
1. Create `control_flow/joinir/merge/` directory structure
|
||||
2. Extract `id_remapper.rs` - ID remapping utilities
|
||||
3. Extract `block_allocator.rs` - Block ID allocation logic
|
||||
4. Extract `value_collector.rs` - Value collection phase
|
||||
5. Extract `instruction_rewriter.rs` - Instruction transformation
|
||||
6. Extract `exit_phi_builder.rs` - Exit PHI construction
|
||||
7. Create `merge/mod.rs` as the coordinator
|
||||
8. Update imports
|
||||
9. Run comprehensive verification
|
||||
|
||||
**Files Created**:
|
||||
- `control_flow/joinir/merge/mod.rs` (~100 lines) - Coordinator
|
||||
- `control_flow/joinir/merge/id_remapper.rs` (~150 lines)
|
||||
- `control_flow/joinir/merge/block_allocator.rs` (~100 lines)
|
||||
- `control_flow/joinir/merge/value_collector.rs` (~100 lines)
|
||||
- `control_flow/joinir/merge/instruction_rewriter.rs` (~150 lines)
|
||||
- `control_flow/joinir/merge/exit_phi_builder.rs` (~100 lines)
|
||||
|
||||
**Public API**:
|
||||
```rust
|
||||
// control_flow/joinir/merge/mod.rs
|
||||
pub(in crate::mir::builder) fn merge_joinir_mir_blocks(
|
||||
builder: &mut MirBuilder,
|
||||
mir_module: &MirModule,
|
||||
boundary: Option<&JoinInlineBoundary>,
|
||||
debug: bool,
|
||||
) -> Result<Option<ValueId>, String> {
|
||||
// Coordinator that calls the 6 sub-modules
|
||||
let remapper = id_remapper::create_remapper(builder, mir_module, debug)?;
|
||||
let values = value_collector::collect_values(mir_module, &remapper, debug)?;
|
||||
// ... etc
|
||||
}
|
||||
```
|
||||
|
||||
**Verification** (CRITICAL - run ALL tests):
|
||||
```bash
|
||||
# Step 1: Build verification
|
||||
cargo build --release
|
||||
cargo test --lib
|
||||
|
||||
# Step 2: Smoke tests (ALL patterns)
|
||||
tools/smokes/v2/run.sh --profile quick
|
||||
|
||||
# Step 3: Debug trace verification
|
||||
NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako 2>&1 | grep "merge_joinir"
|
||||
|
||||
# Step 4: Regression check (run 3 times for determinism)
|
||||
for i in 1 2 3; do
|
||||
echo "=== Run $i ==="
|
||||
cargo test --release test_loop_patterns 2>&1 | grep "test result"
|
||||
done
|
||||
|
||||
# Step 5: Full integration test
|
||||
cargo test --release --all-features
|
||||
```
|
||||
|
||||
**Rollback**:
|
||||
```bash
|
||||
# If anything breaks, immediately:
|
||||
rm -rf src/mir/builder/control_flow/joinir/merge
|
||||
git checkout src/mir/builder/control_flow.rs
|
||||
cargo build --release && cargo test
|
||||
```
|
||||
|
||||
**Risk Mitigation**:
|
||||
- Keep the original `merge_joinir_mir_blocks()` as a comment at the top of `merge/mod.rs`
|
||||
- Add `#[cfg(test)]` unit tests for each sub-module
|
||||
- Use feature flag `NYASH_USE_LEGACY_MERGE=1` for emergency fallback (optional)
|
||||
|
||||
**Estimated Effort**: 6 hours (most complex phase)
|
||||
|
||||
---
|
||||
|
||||
### Phase 5: Extract Exception Handling (1 hour)
|
||||
|
||||
**Goal**: Move `cf_try_catch()` and `cf_throw()` to exception module.
|
||||
|
||||
**Steps**:
|
||||
1. Create `control_flow/exception/` directory
|
||||
2. Move `cf_try_catch()` to `try_catch.rs`
|
||||
3. Move `cf_throw()` to `throw.rs`
|
||||
4. Create `exception/mod.rs` as coordinator
|
||||
5. Update imports
|
||||
6. Run verification
|
||||
|
||||
**Files Created**:
|
||||
- `control_flow/exception/mod.rs` (~50 lines)
|
||||
- `control_flow/exception/try_catch.rs` (~150 lines)
|
||||
- `control_flow/exception/throw.rs` (~30 lines)
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --lib -- exception
|
||||
```
|
||||
|
||||
**Rollback**: Delete `exception/` directory, revert imports
|
||||
|
||||
**Estimated Effort**: 1 hour
|
||||
|
||||
---
|
||||
|
||||
### Phase 6: Extract Utilities (30 min)
|
||||
|
||||
**Goal**: Move `extract_loop_variable_from_condition()` to utils module.
|
||||
|
||||
**Steps**:
|
||||
1. Create `control_flow/utils.rs`
|
||||
2. Move utility functions
|
||||
3. Update imports
|
||||
4. Run verification
|
||||
|
||||
**Files Created**:
|
||||
- `control_flow/utils.rs` (~50 lines)
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --lib
|
||||
```
|
||||
|
||||
**Rollback**: Delete `utils.rs`, revert imports
|
||||
|
||||
**Estimated Effort**: 30 minutes
|
||||
|
||||
---
|
||||
|
||||
### Phase 7: Final Cleanup & Documentation (1 hour)
|
||||
|
||||
**Goal**: Clean up `mod.rs`, add module documentation, verify all imports.
|
||||
|
||||
**Steps**:
|
||||
1. Review `control_flow/mod.rs` for clarity
|
||||
2. Add module-level documentation to each file
|
||||
3. Ensure all `pub(super)` visibility is correct
|
||||
4. Run final comprehensive verification
|
||||
5. Update CLAUDE.md with new structure
|
||||
|
||||
**Documentation Template**:
|
||||
```rust
|
||||
//! Pattern 1 Minimal Loop Lowering
|
||||
//!
|
||||
//! This module implements the simplest JoinIR loop lowering pattern:
|
||||
//! - Single loop variable
|
||||
//! - No break statements
|
||||
//! - Simple condition (i < N)
|
||||
//!
|
||||
//! Used by: minimal_ssa_skip_ws, simple while loops
|
||||
//! Phase: 188 (Pattern 1 implementation)
|
||||
```
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release --all-features
|
||||
cargo test --release
|
||||
cargo clippy --all-targets
|
||||
tools/smokes/v2/run.sh --profile integration
|
||||
```
|
||||
|
||||
**Estimated Effort**: 1 hour
|
||||
|
||||
---
|
||||
|
||||
## Public API Changes
|
||||
|
||||
### Before (control_flow.rs)
|
||||
```rust
|
||||
impl MirBuilder {
|
||||
pub(super) fn cf_block(...) -> Result<ValueId, String>
|
||||
pub(super) fn cf_if(...) -> Result<ValueId, String>
|
||||
pub(super) fn cf_loop(...) -> Result<ValueId, String>
|
||||
pub(super) fn cf_try_catch(...) -> Result<ValueId, String>
|
||||
pub(super) fn cf_throw(...) -> Result<ValueId, String>
|
||||
}
|
||||
```
|
||||
|
||||
### After (control_flow/mod.rs)
|
||||
```rust
|
||||
// Re-export all entry points (NO CHANGE to public API)
|
||||
pub(super) use entry_points::{cf_block, cf_if, cf_loop, cf_try_catch, cf_throw};
|
||||
|
||||
// Internal modules (not exposed outside control_flow)
|
||||
mod debug;
|
||||
mod utils;
|
||||
mod joinir;
|
||||
mod exception;
|
||||
mod entry_points;
|
||||
```
|
||||
|
||||
**Guarantee**: Zero breaking changes - all `pub(super)` functions remain accessible from `MirBuilder`.
|
||||
|
||||
---
|
||||
|
||||
## Build Verification Strategy
|
||||
|
||||
### After Each Phase:
|
||||
|
||||
```bash
|
||||
# 1. Compilation check
|
||||
cargo build --release
|
||||
echo "Build: $?" >> /tmp/modularization_log.txt
|
||||
|
||||
# 2. Unit tests
|
||||
cargo test --lib
|
||||
echo "Unit tests: $?" >> /tmp/modularization_log.txt
|
||||
|
||||
# 3. Integration tests
|
||||
cargo test --release
|
||||
echo "Integration tests: $?" >> /tmp/modularization_log.txt
|
||||
|
||||
# 4. Smoke tests (only for Phases 2-4)
|
||||
tools/smokes/v2/run.sh --profile quick --filter "loop_*"
|
||||
echo "Smoke tests: $?" >> /tmp/modularization_log.txt
|
||||
|
||||
# 5. Debug trace verification (for Phase 4)
|
||||
NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako 2>&1 | tee /tmp/debug_trace.txt
|
||||
grep -q "merge_joinir" /tmp/debug_trace.txt && echo "Debug trace: OK"
|
||||
```
|
||||
|
||||
### Failure Handling:
|
||||
```bash
|
||||
# If any step fails, STOP immediately
|
||||
if [ $? -ne 0 ]; then
|
||||
echo "FAILURE detected at Phase $PHASE"
|
||||
git status
|
||||
echo "Run rollback procedure? (y/n)"
|
||||
# ... rollback steps
|
||||
fi
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Risk Assessment
|
||||
|
||||
| Risk | Likelihood | Impact | Mitigation | Detection |
|
||||
|------|------------|--------|------------|-----------|
|
||||
| **Breaking imports** | Medium | High | Incremental phases, test after each | `cargo build` fails |
|
||||
| **Merge function breakage** | Low | Critical | Keep original as comment, feature flag | Smoke tests fail |
|
||||
| **Performance regression** | Very Low | Medium | No algorithmic changes | Benchmark before/after |
|
||||
| **Debug trace changes** | Low | Low | Verify `NYASH_OPTION_C_DEBUG` output | Manual inspection |
|
||||
| **HashMap non-determinism** | Very Low | Low | Already using BTreeMap in critical paths | Run tests 3x |
|
||||
|
||||
### Critical Mitigations:
|
||||
1. **Phase 4 (merge function)**: Keep original function as comment
|
||||
2. **All phases**: Commit after each successful phase
|
||||
3. **Rollback plan**: Document exact `git checkout` commands for each file
|
||||
|
||||
---
|
||||
|
||||
## Implementation Effort Breakdown
|
||||
|
||||
| Phase | Description | Effort | Risk |
|
||||
|-------|-------------|--------|------|
|
||||
| Phase 1 | Debug utilities | 30 min | Low |
|
||||
| Phase 2 | Pattern lowerers | 2 hours | Low |
|
||||
| Phase 3 | JoinIR routing | 1.5 hours | Low |
|
||||
| **Phase 4** | **merge_joinir_mir_blocks** | **6 hours** | **Medium** |
|
||||
| Phase 5 | Exception handling | 1 hour | Low |
|
||||
| Phase 6 | Utilities | 30 min | Low |
|
||||
| Phase 7 | Cleanup & docs | 1 hour | Low |
|
||||
| **Total** | **control_flow.rs** | **12.5 hours** | - |
|
||||
|
||||
**Recommended Schedule**:
|
||||
- **Week 1**: Phases 1-3 (4 hours total) - Low risk warmup
|
||||
- **Week 2**: Phase 4 (6 hours) - Dedicated time for merge function
|
||||
- **Week 3**: Phases 5-7 (2.5 hours) - Final cleanup
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- ✅ All 267 tests pass (no regressions)
|
||||
- ✅ Build time ≤ current (no increase)
|
||||
- ✅ `control_flow/mod.rs` is < 200 lines (88% reduction)
|
||||
- ✅ Largest single file is < 200 lines (vs 714 lines before)
|
||||
- ✅ Debug traces still work (`NYASH_OPTION_C_DEBUG=1`)
|
||||
- ✅ Smoke tests pass for all patterns (1/2/3)
|
||||
- ✅ No HashMap non-determinism introduced
|
||||
- ✅ Code is easier to navigate (measured by developer feedback)
|
||||
|
||||
---
|
||||
|
||||
# 2. generic_case_a.rs Modularization (MEDIUM PRIORITY)
|
||||
|
||||
## Current State
|
||||
|
||||
**File**: `src/mir/join_ir/lowering/generic_case_a.rs` (1,056 lines)
|
||||
|
||||
**Functions** (4 public lowerers):
|
||||
- `lower_case_a_skip_ws_with_scope()` + `lower_case_a_skip_ws_core()` (~203 lines)
|
||||
- `lower_case_a_trim_with_scope()` + `lower_case_a_trim_core()` (~479 lines)
|
||||
- `lower_case_a_append_defs_with_scope()` + `lower_case_a_append_defs_core()` (~156 lines)
|
||||
- `lower_case_a_stage1_usingresolver_with_scope()` + `lower_case_a_stage1_usingresolver_core()` (~167 lines)
|
||||
|
||||
**Key Observation**: This file already has companion files:
|
||||
- `generic_case_a_entry_builder.rs` (4,828 bytes)
|
||||
- `generic_case_a_whitespace_check.rs` (4,552 bytes)
|
||||
|
||||
**Strategy**: Complete the modularization pattern by splitting the 4 lowerers into separate files.
|
||||
|
||||
---
|
||||
|
||||
## Proposed New Structure
|
||||
|
||||
```
|
||||
src/mir/join_ir/lowering/generic_case_a/
|
||||
├── mod.rs (~100 lines) - Public API
|
||||
├── skip_ws.rs (~220 lines) - skip_ws lowerer
|
||||
├── trim.rs (~500 lines) - trim lowerer
|
||||
├── append_defs.rs (~170 lines) - append_defs lowerer
|
||||
├── stage1_using_resolver.rs (~180 lines) - stage1 using resolver
|
||||
├── entry_builder.rs (~150 lines) - (moved from parent)
|
||||
└── whitespace_check.rs (~150 lines) - (moved from parent)
|
||||
|
||||
Total: ~1,470 lines (39% increase for clarity, distributed across 7 files)
|
||||
Average: ~210 lines per file
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase-by-Phase Migration Plan
|
||||
|
||||
### Phase 1: Create Directory Structure (15 min)
|
||||
|
||||
**Steps**:
|
||||
1. Create `src/mir/join_ir/lowering/generic_case_a/` directory
|
||||
2. Create `mod.rs` with public API exports
|
||||
3. Move existing companion files into directory
|
||||
4. Update parent `mod.rs` imports
|
||||
5. Run verification
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
```
|
||||
|
||||
**Rollback**: Delete directory, revert parent `mod.rs`
|
||||
|
||||
**Estimated Effort**: 15 minutes
|
||||
|
||||
---
|
||||
|
||||
### Phase 2: Extract skip_ws Lowerer (45 min)
|
||||
|
||||
**Steps**:
|
||||
1. Create `generic_case_a/skip_ws.rs`
|
||||
2. Move `lower_case_a_skip_ws_with_scope()` and `_core()`
|
||||
3. Add module documentation
|
||||
4. Update `mod.rs` imports
|
||||
5. Run verification
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --lib -- skip_ws
|
||||
```
|
||||
|
||||
**Estimated Effort**: 45 minutes
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: Extract trim Lowerer (1 hour)
|
||||
|
||||
**Steps**:
|
||||
1. Create `generic_case_a/trim.rs`
|
||||
2. Move `lower_case_a_trim_with_scope()` and `_core()`
|
||||
3. Add documentation
|
||||
4. Update imports
|
||||
5. Run verification
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --lib -- trim
|
||||
```
|
||||
|
||||
**Estimated Effort**: 1 hour
|
||||
|
||||
---
|
||||
|
||||
### Phase 4: Extract append_defs & stage1 Lowerers (1 hour)
|
||||
|
||||
**Steps**:
|
||||
1. Create `generic_case_a/append_defs.rs`
|
||||
2. Create `generic_case_a/stage1_using_resolver.rs`
|
||||
3. Move respective functions
|
||||
4. Add documentation
|
||||
5. Update imports
|
||||
6. Run verification
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --release
|
||||
tools/smokes/v2/run.sh --profile quick --filter "funcscanner_*"
|
||||
```
|
||||
|
||||
**Estimated Effort**: 1 hour
|
||||
|
||||
---
|
||||
|
||||
### Phase 5: Final Cleanup (30 min)
|
||||
|
||||
**Steps**:
|
||||
1. Add module-level documentation
|
||||
2. Verify all imports are clean
|
||||
3. Run comprehensive tests
|
||||
4. Update documentation
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release --all-features
|
||||
cargo test --release
|
||||
```
|
||||
|
||||
**Estimated Effort**: 30 minutes
|
||||
|
||||
---
|
||||
|
||||
## Public API Changes
|
||||
|
||||
### Before
|
||||
```rust
|
||||
// src/mir/join_ir/lowering/generic_case_a.rs
|
||||
pub(crate) fn lower_case_a_skip_ws_with_scope(...) -> Option<JoinModule>
|
||||
pub(crate) fn lower_case_a_trim_with_scope(...) -> Option<JoinModule>
|
||||
// ... etc
|
||||
```
|
||||
|
||||
### After
|
||||
```rust
|
||||
// src/mir/join_ir/lowering/generic_case_a/mod.rs
|
||||
pub(crate) use skip_ws::lower_case_a_skip_ws_with_scope;
|
||||
pub(crate) use trim::lower_case_a_trim_with_scope;
|
||||
pub(crate) use append_defs::lower_case_a_append_defs_with_scope;
|
||||
pub(crate) use stage1_using_resolver::lower_case_a_stage1_usingresolver_with_scope;
|
||||
```
|
||||
|
||||
**Guarantee**: Zero breaking changes - all `pub(crate)` functions remain accessible.
|
||||
|
||||
---
|
||||
|
||||
## Implementation Effort Breakdown
|
||||
|
||||
| Phase | Description | Effort | Risk |
|
||||
|-------|-------------|--------|------|
|
||||
| Phase 1 | Directory setup | 15 min | Low |
|
||||
| Phase 2 | skip_ws lowerer | 45 min | Low |
|
||||
| Phase 3 | trim lowerer | 1 hour | Low |
|
||||
| Phase 4 | append_defs & stage1 | 1 hour | Low |
|
||||
| Phase 5 | Cleanup | 30 min | Low |
|
||||
| **Total** | **generic_case_a.rs** | **3.5 hours** | - |
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- ✅ All tests pass
|
||||
- ✅ `generic_case_a/mod.rs` is < 150 lines
|
||||
- ✅ Each lowerer is in a dedicated file
|
||||
- ✅ Companion files integrated into directory
|
||||
- ✅ Documentation added to all modules
|
||||
|
||||
---
|
||||
|
||||
# 3. loopform_builder.rs Modularization (LOWER PRIORITY)
|
||||
|
||||
## Current State
|
||||
|
||||
**File**: `src/mir/phi_core/loopform_builder.rs` (1,166 lines)
|
||||
|
||||
**Status**: Already partially modularized in Phase 191!
|
||||
|
||||
**Existing Structure** (Phase 191):
|
||||
```
|
||||
src/mir/phi_core/
|
||||
├── loopform_builder.rs (1,166 lines) - Main coordinator
|
||||
├── loopform_context.rs - ValueId management
|
||||
├── loopform_variable_models.rs - CarrierVariable, PinnedVariable
|
||||
├── loopform_utils.rs - Debug and bypass utilities
|
||||
├── loopform_exit_phi.rs - Exit PHI builder
|
||||
```
|
||||
|
||||
**Remaining Work**: The main `loopform_builder.rs` still contains implementation logic that should be moved to dedicated modules.
|
||||
|
||||
---
|
||||
|
||||
## Proposed New Structure
|
||||
|
||||
```
|
||||
src/mir/phi_core/loopform/
|
||||
├── mod.rs (~100 lines) - Public API
|
||||
├── context.rs (~150 lines) - (existing loopform_context.rs)
|
||||
├── variable_models.rs (~150 lines) - (existing loopform_variable_models.rs)
|
||||
├── utils.rs (~100 lines) - (existing loopform_utils.rs)
|
||||
├── exit_phi.rs (~150 lines) - (existing loopform_exit_phi.rs)
|
||||
├── passes/
|
||||
│ ├── mod.rs (~50 lines) - 4-pass architecture coordinator
|
||||
│ ├── pass1_discovery.rs (~150 lines) - Variable discovery
|
||||
│ ├── pass2_header_phi.rs (~150 lines) - Header PHI construction
|
||||
│ ├── pass3_latch.rs (~100 lines) - Latch block processing
|
||||
│ └── pass4_exit_phi.rs (~150 lines) - Exit PHI construction
|
||||
└── builder_core.rs (~200 lines) - Core builder logic
|
||||
|
||||
Total: ~1,450 lines (24% increase for clarity, distributed across 11 files)
|
||||
Average: ~132 lines per file
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase-by-Phase Migration Plan
|
||||
|
||||
### Phase 1: Directory Structure (30 min)
|
||||
|
||||
**Steps**:
|
||||
1. Create `src/mir/phi_core/loopform/` directory
|
||||
2. Move existing modular files into directory
|
||||
3. Create `mod.rs` with re-exports
|
||||
4. Update parent `mod.rs` imports
|
||||
5. Run verification
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --lib -- loopform
|
||||
```
|
||||
|
||||
**Estimated Effort**: 30 minutes
|
||||
|
||||
---
|
||||
|
||||
### Phase 2: Extract 4-Pass Architecture (2 hours)
|
||||
|
||||
**Steps**:
|
||||
1. Create `loopform/passes/` directory
|
||||
2. Identify the 4 passes in `loopform_builder.rs`
|
||||
3. Extract each pass to dedicated file
|
||||
4. Create `passes/mod.rs` as coordinator
|
||||
5. Update imports
|
||||
6. Run verification
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --release -- loopform
|
||||
NYASH_LOOPFORM_DEBUG=1 ./target/release/nyash test_loop.hako
|
||||
```
|
||||
|
||||
**Estimated Effort**: 2 hours
|
||||
|
||||
---
|
||||
|
||||
### Phase 3: Extract Core Builder Logic (1 hour)
|
||||
|
||||
**Steps**:
|
||||
1. Create `loopform/builder_core.rs`
|
||||
2. Move remaining builder logic
|
||||
3. Update imports
|
||||
4. Run verification
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --release
|
||||
```
|
||||
|
||||
**Estimated Effort**: 1 hour
|
||||
|
||||
---
|
||||
|
||||
### Phase 4: Final Cleanup (30 min)
|
||||
|
||||
**Steps**:
|
||||
1. Add module documentation
|
||||
2. Verify all re-exports
|
||||
3. Run comprehensive tests
|
||||
|
||||
**Verification**:
|
||||
```bash
|
||||
cargo build --release --all-features
|
||||
cargo test --release
|
||||
tools/smokes/v2/run.sh --profile quick --filter "phi_*"
|
||||
```
|
||||
|
||||
**Estimated Effort**: 30 minutes
|
||||
|
||||
---
|
||||
|
||||
## Public API Changes
|
||||
|
||||
### Before
|
||||
```rust
|
||||
// src/mir/phi_core/loopform_builder.rs
|
||||
pub use loopform_context::LoopFormContext;
|
||||
pub use loopform_variable_models::{CarrierVariable, PinnedVariable};
|
||||
pub fn build_exit_phis_for_control<O: LoopFormOps>(...)
|
||||
```
|
||||
|
||||
### After
|
||||
```rust
|
||||
// src/mir/phi_core/loopform/mod.rs
|
||||
pub use context::LoopFormContext;
|
||||
pub use variable_models::{CarrierVariable, PinnedVariable};
|
||||
pub use exit_phi::build_exit_phis_for_control;
|
||||
```
|
||||
|
||||
**Guarantee**: Zero breaking changes.
|
||||
|
||||
---
|
||||
|
||||
## Implementation Effort Breakdown
|
||||
|
||||
| Phase | Description | Effort | Risk |
|
||||
|-------|-------------|--------|------|
|
||||
| Phase 1 | Directory setup | 30 min | Low |
|
||||
| Phase 2 | 4-pass extraction | 2 hours | Medium |
|
||||
| Phase 3 | Core builder | 1 hour | Low |
|
||||
| Phase 4 | Cleanup | 30 min | Low |
|
||||
| **Total** | **loopform_builder.rs** | **4 hours** | - |
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- ✅ All tests pass
|
||||
- ✅ `loopform/mod.rs` is < 150 lines
|
||||
- ✅ Each pass is in a dedicated file
|
||||
- ✅ Existing modular files integrated
|
||||
- ✅ Documentation added
|
||||
|
||||
---
|
||||
|
||||
# Implementation Order & Timeline
|
||||
|
||||
## Recommended Schedule (3 weeks)
|
||||
|
||||
### Week 1: control_flow.rs Phases 1-3 (Low Risk)
|
||||
- **Monday**: Phase 1 (Debug utilities) - 30 min
|
||||
- **Tuesday**: Phase 2 (Pattern lowerers) - 2 hours
|
||||
- **Wednesday**: Phase 3 (JoinIR routing) - 1.5 hours
|
||||
- **Verification**: Run full smoke tests at end of week
|
||||
|
||||
### Week 2: control_flow.rs Phase 4 (High Risk)
|
||||
- **Monday-Tuesday**: Phase 4 (merge_joinir_mir_blocks) - 6 hours
|
||||
- **Wednesday**: Buffer day for fixing any issues
|
||||
- **Thursday-Friday**: Phases 5-7 (Exception, utilities, cleanup) - 2.5 hours
|
||||
|
||||
### Week 3: generic_case_a.rs (Optional)
|
||||
- **Monday-Tuesday**: generic_case_a.rs Phases 1-5 - 3.5 hours
|
||||
- **Wednesday**: Buffer
|
||||
- **Thursday-Friday**: Documentation & final verification
|
||||
|
||||
### Future (After Pattern 4+): loopform_builder.rs
|
||||
- **Timing**: After Pattern 4/5/6 development stabilizes
|
||||
- **Effort**: 4 hours
|
||||
- **Priority**: Lower (already partially modularized)
|
||||
|
||||
---
|
||||
|
||||
## Effort Summary
|
||||
|
||||
| File | Total Effort | Priority | Complexity | Blocking? |
|
||||
|------|--------------|----------|------------|-----------|
|
||||
| **control_flow.rs** | **12.5 hours** | **HIGHEST** | **High** | **Yes** (Pattern 4+) |
|
||||
| **generic_case_a.rs** | **3.5 hours** | **MEDIUM** | **Low** | No |
|
||||
| **loopform_builder.rs** | **4 hours** | **LOWER** | **Medium** | No |
|
||||
| **TOTAL** | **20 hours** | - | - | - |
|
||||
|
||||
---
|
||||
|
||||
## Risk/Mitigation Matrix
|
||||
|
||||
| Risk | Likelihood | Impact | Mitigation | Detection Method |
|
||||
|------|------------|--------|------------|------------------|
|
||||
| **Breaking imports** | Medium | High | Incremental phases, test after each | `cargo build` fails |
|
||||
| **Merge function breakage** | Low | Critical | Keep original as comment, feature flag | Smoke tests fail |
|
||||
| **Pattern lowerer breakage** | Low | High | Test each pattern independently | Integration tests |
|
||||
| **Performance regression** | Very Low | Medium | No algorithmic changes | Benchmark suite |
|
||||
| **Debug trace changes** | Low | Low | Verify `NYASH_OPTION_C_DEBUG` output | Manual inspection |
|
||||
| **Test failures** | Low | Medium | Run tests after every phase | CI/CD pipeline |
|
||||
| **Merge conflicts** | Medium | Low | Work on dedicated branch | Git status |
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria (Global)
|
||||
|
||||
### Quantitative:
|
||||
- ✅ All 267+ tests pass (no regressions)
|
||||
- ✅ Build time ≤ current (no increase)
|
||||
- ✅ Largest single file in modularized areas is < 250 lines
|
||||
- ✅ Average file size in modularized areas is < 150 lines
|
||||
|
||||
### Qualitative:
|
||||
- ✅ Code is easier to navigate (developer feedback)
|
||||
- ✅ New patterns can be added without modifying 1,600-line files
|
||||
- ✅ Debug traces remain functional
|
||||
- ✅ Documentation is clear and helpful
|
||||
|
||||
### Process:
|
||||
- ✅ Zero breaking changes at any phase
|
||||
- ✅ Each phase can be rolled back independently
|
||||
- ✅ Commits are small and focused
|
||||
- ✅ CI/CD passes after every commit
|
||||
|
||||
---
|
||||
|
||||
## Appendix: Emergency Rollback Procedure
|
||||
|
||||
If anything goes wrong during modularization:
|
||||
|
||||
```bash
|
||||
# 1. Identify the problematic phase
|
||||
echo "ROLLBACK: Phase $PHASE failed"
|
||||
|
||||
# 2. Check git status
|
||||
git status
|
||||
|
||||
# 3. Rollback created files
|
||||
rm -rf src/mir/builder/control_flow/ # (example)
|
||||
|
||||
# 4. Restore original file
|
||||
git checkout src/mir/builder/control_flow.rs
|
||||
|
||||
# 5. Verify build
|
||||
cargo build --release
|
||||
cargo test --lib
|
||||
|
||||
# 6. Document the issue
|
||||
echo "$(date): Phase $PHASE rollback due to: $REASON" >> docs/development/refactoring/rollback_log.txt
|
||||
|
||||
# 7. Review and adjust plan
|
||||
# - Was the failure expected?
|
||||
# - Do we need to adjust the approach?
|
||||
# - Should we skip this phase?
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
This modularization plan provides a **safe, incremental path** to breaking down 3 large files into maintainable modules. The approach prioritizes:
|
||||
|
||||
1. **Zero breaking changes** - Backward compatible at every step
|
||||
2. **Clear verification** - Test after each phase
|
||||
3. **Easy rollback** - Can undo any phase if issues arise
|
||||
4. **Pattern setting** - control_flow.rs establishes the template for others
|
||||
|
||||
**Next Steps**:
|
||||
1. Review this plan with the team
|
||||
2. Get approval for Week 1 (control_flow.rs Phases 1-3)
|
||||
3. Create a dedicated branch: `refactor/modularize-control-flow`
|
||||
4. Begin Phase 1!
|
||||
|
||||
**Questions? Concerns?** Open an issue or discuss in the team channel.
|
||||
|
||||
---
|
||||
|
||||
**Document Version**: 1.0
|
||||
**Created**: 2025-12-05
|
||||
**Author**: Claude Code (AI-assisted planning)
|
||||
**Status**: Draft - Awaiting Review
|
||||
253
docs/development/refactoring/modularization-quick-start.md
Normal file
253
docs/development/refactoring/modularization-quick-start.md
Normal file
@ -0,0 +1,253 @@
|
||||
# Modularization Quick Start Checklist
|
||||
|
||||
This is the **TL;DR** version of the full implementation plan. Use this as a quick reference during execution.
|
||||
|
||||
**Full Plan**: [modularization-implementation-plan.md](./modularization-implementation-plan.md)
|
||||
|
||||
---
|
||||
|
||||
## 1. control_flow.rs Modularization (12.5 hours)
|
||||
|
||||
### Phase 1: Debug Utilities (30 min)
|
||||
- [ ] Create `src/mir/builder/control_flow/debug.rs`
|
||||
- [ ] Move `trace_varmap()`
|
||||
- [ ] Update `mod.rs` imports
|
||||
- [ ] Run: `cargo build --release && cargo test --lib`
|
||||
- [ ] Commit: `git commit -m "refactor: Extract debug utilities from control_flow.rs"`
|
||||
|
||||
### Phase 2: Pattern Lowerers (2 hours)
|
||||
- [ ] Create `control_flow/joinir/patterns/` directory
|
||||
- [ ] Move `cf_loop_pattern1_minimal()` to `pattern1_minimal.rs`
|
||||
- [ ] Move `cf_loop_pattern2_with_break()` to `pattern2_with_break.rs`
|
||||
- [ ] Move `cf_loop_pattern3_with_if_phi()` to `pattern3_with_if_phi.rs`
|
||||
- [ ] Create `patterns/mod.rs` dispatcher
|
||||
- [ ] Run: `cargo build --release && cargo test --lib -- --include-ignored`
|
||||
- [ ] Test: `NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako`
|
||||
- [ ] Commit: `git commit -m "refactor: Extract pattern lowerers from control_flow.rs"`
|
||||
|
||||
### Phase 3: JoinIR Routing (1.5 hours)
|
||||
- [ ] Create `control_flow/joinir/routing.rs`
|
||||
- [ ] Move `try_cf_loop_joinir()`
|
||||
- [ ] Move `cf_loop_joinir_impl()`
|
||||
- [ ] Update imports
|
||||
- [ ] Run: `cargo build --release && cargo test --release`
|
||||
- [ ] Commit: `git commit -m "refactor: Extract JoinIR routing logic"`
|
||||
|
||||
### Phase 4: merge_joinir_mir_blocks (6 hours) ⚠️ CRITICAL
|
||||
- [ ] Create `control_flow/joinir/merge/` directory
|
||||
- [ ] Extract `id_remapper.rs` (150 lines)
|
||||
- [ ] Extract `block_allocator.rs` (100 lines)
|
||||
- [ ] Extract `value_collector.rs` (100 lines)
|
||||
- [ ] Extract `instruction_rewriter.rs` (150 lines)
|
||||
- [ ] Extract `exit_phi_builder.rs` (100 lines)
|
||||
- [ ] Create `merge/mod.rs` coordinator (100 lines)
|
||||
- [ ] Update all imports
|
||||
- [ ] Run: `cargo build --release && cargo test --lib`
|
||||
- [ ] Run: `tools/smokes/v2/run.sh --profile quick`
|
||||
- [ ] Run: `NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako 2>&1 | grep "merge_joinir"`
|
||||
- [ ] Run determinism test (3x):
|
||||
```bash
|
||||
for i in 1 2 3; do echo "=== Run $i ==="; cargo test --release test_loop_patterns; done
|
||||
```
|
||||
- [ ] Commit: `git commit -m "refactor: Break down merge_joinir_mir_blocks into modules"`
|
||||
|
||||
### Phase 5: Exception Handling (1 hour)
|
||||
- [ ] Create `control_flow/exception/` directory
|
||||
- [ ] Move `cf_try_catch()` to `try_catch.rs`
|
||||
- [ ] Move `cf_throw()` to `throw.rs`
|
||||
- [ ] Create `exception/mod.rs`
|
||||
- [ ] Run: `cargo build --release && cargo test --lib -- exception`
|
||||
- [ ] Commit: `git commit -m "refactor: Extract exception handling"`
|
||||
|
||||
### Phase 6: Utilities (30 min)
|
||||
- [ ] Create `control_flow/utils.rs`
|
||||
- [ ] Move `extract_loop_variable_from_condition()`
|
||||
- [ ] Run: `cargo build --release && cargo test --lib`
|
||||
- [ ] Commit: `git commit -m "refactor: Extract utilities"`
|
||||
|
||||
### Phase 7: Final Cleanup (1 hour)
|
||||
- [ ] Add module-level documentation to all files
|
||||
- [ ] Review `control_flow/mod.rs` for clarity
|
||||
- [ ] Verify all `pub(super)` visibility
|
||||
- [ ] Run: `cargo build --release --all-features`
|
||||
- [ ] Run: `cargo test --release`
|
||||
- [ ] Run: `cargo clippy --all-targets`
|
||||
- [ ] Run: `tools/smokes/v2/run.sh --profile integration`
|
||||
- [ ] Update `CLAUDE.md` with new structure
|
||||
- [ ] Commit: `git commit -m "docs: Update control_flow module documentation"`
|
||||
|
||||
---
|
||||
|
||||
## 2. generic_case_a.rs Modularization (3.5 hours)
|
||||
|
||||
### Phase 1: Directory Setup (15 min)
|
||||
- [ ] Create `src/mir/join_ir/lowering/generic_case_a/` directory
|
||||
- [ ] Create `mod.rs` with public API
|
||||
- [ ] Move `generic_case_a_entry_builder.rs` into directory
|
||||
- [ ] Move `generic_case_a_whitespace_check.rs` into directory
|
||||
- [ ] Update parent `mod.rs` imports
|
||||
- [ ] Run: `cargo build --release`
|
||||
- [ ] Commit: `git commit -m "refactor: Create generic_case_a directory structure"`
|
||||
|
||||
### Phase 2: Extract skip_ws (45 min)
|
||||
- [ ] Create `generic_case_a/skip_ws.rs`
|
||||
- [ ] Move `lower_case_a_skip_ws_with_scope()` and `_core()`
|
||||
- [ ] Add module documentation
|
||||
- [ ] Run: `cargo build --release && cargo test --lib -- skip_ws`
|
||||
- [ ] Commit: `git commit -m "refactor: Extract skip_ws lowerer"`
|
||||
|
||||
### Phase 3: Extract trim (1 hour)
|
||||
- [ ] Create `generic_case_a/trim.rs`
|
||||
- [ ] Move `lower_case_a_trim_with_scope()` and `_core()`
|
||||
- [ ] Add module documentation
|
||||
- [ ] Run: `cargo build --release && cargo test --lib -- trim`
|
||||
- [ ] Commit: `git commit -m "refactor: Extract trim lowerer"`
|
||||
|
||||
### Phase 4: Extract append_defs & stage1 (1 hour)
|
||||
- [ ] Create `generic_case_a/append_defs.rs`
|
||||
- [ ] Create `generic_case_a/stage1_using_resolver.rs`
|
||||
- [ ] Move respective functions
|
||||
- [ ] Add module documentation
|
||||
- [ ] Run: `cargo build --release && cargo test --release`
|
||||
- [ ] Run: `tools/smokes/v2/run.sh --profile quick --filter "funcscanner_*"`
|
||||
- [ ] Commit: `git commit -m "refactor: Extract append_defs and stage1 lowerers"`
|
||||
|
||||
### Phase 5: Final Cleanup (30 min)
|
||||
- [ ] Add module-level documentation to all files
|
||||
- [ ] Verify all imports are clean
|
||||
- [ ] Run: `cargo build --release --all-features && cargo test --release`
|
||||
- [ ] Commit: `git commit -m "docs: Complete generic_case_a modularization"`
|
||||
|
||||
---
|
||||
|
||||
## 3. loopform_builder.rs Modularization (4 hours) - FUTURE
|
||||
|
||||
**Status**: Already partially modularized in Phase 191. Remaining work is lower priority.
|
||||
|
||||
### Phase 1: Directory Setup (30 min)
|
||||
- [ ] Create `src/mir/phi_core/loopform/` directory
|
||||
- [ ] Move existing modular files into directory
|
||||
- [ ] Create `mod.rs` with re-exports
|
||||
- [ ] Run: `cargo build --release && cargo test --lib -- loopform`
|
||||
|
||||
### Phase 2: Extract 4-Pass Architecture (2 hours)
|
||||
- [ ] Create `loopform/passes/` directory
|
||||
- [ ] Extract Pass 1: Variable discovery
|
||||
- [ ] Extract Pass 2: Header PHI construction
|
||||
- [ ] Extract Pass 3: Latch block processing
|
||||
- [ ] Extract Pass 4: Exit PHI construction
|
||||
- [ ] Create `passes/mod.rs` coordinator
|
||||
- [ ] Run: `cargo build --release && cargo test --release -- loopform`
|
||||
|
||||
### Phase 3: Extract Core Builder (1 hour)
|
||||
- [ ] Create `loopform/builder_core.rs`
|
||||
- [ ] Move remaining builder logic
|
||||
- [ ] Run: `cargo build --release && cargo test --release`
|
||||
|
||||
### Phase 4: Final Cleanup (30 min)
|
||||
- [ ] Add module documentation
|
||||
- [ ] Run: `cargo build --release --all-features`
|
||||
- [ ] Run: `tools/smokes/v2/run.sh --profile quick --filter "phi_*"`
|
||||
|
||||
---
|
||||
|
||||
## Emergency Rollback Commands
|
||||
|
||||
```bash
|
||||
# Rollback control_flow.rs modularization
|
||||
rm -rf src/mir/builder/control_flow/
|
||||
git checkout src/mir/builder/control_flow.rs
|
||||
cargo build --release && cargo test --lib
|
||||
|
||||
# Rollback generic_case_a.rs modularization
|
||||
rm -rf src/mir/join_ir/lowering/generic_case_a/
|
||||
git checkout src/mir/join_ir/lowering/generic_case_a*.rs
|
||||
cargo build --release && cargo test --lib
|
||||
|
||||
# Rollback loopform_builder.rs modularization
|
||||
rm -rf src/mir/phi_core/loopform/
|
||||
git checkout src/mir/phi_core/loopform*.rs
|
||||
cargo build --release && cargo test --lib
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Verification Commands
|
||||
|
||||
### Quick Verification (after each phase)
|
||||
```bash
|
||||
cargo build --release
|
||||
cargo test --lib
|
||||
```
|
||||
|
||||
### Comprehensive Verification (after critical phases)
|
||||
```bash
|
||||
cargo build --release --all-features
|
||||
cargo test --release
|
||||
cargo clippy --all-targets
|
||||
tools/smokes/v2/run.sh --profile quick
|
||||
```
|
||||
|
||||
### Debug Trace Verification (Phase 4 only)
|
||||
```bash
|
||||
NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako 2>&1 | grep "merge_joinir"
|
||||
```
|
||||
|
||||
### Determinism Check (Phase 4 only)
|
||||
```bash
|
||||
for i in 1 2 3; do
|
||||
echo "=== Run $i ==="
|
||||
cargo test --release test_loop_patterns 2>&1 | grep "test result"
|
||||
done
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- ✅ All 267+ tests pass
|
||||
- ✅ Build time ≤ current
|
||||
- ✅ `control_flow/mod.rs` < 200 lines (vs 1,632)
|
||||
- ✅ Largest file < 250 lines (vs 714)
|
||||
- ✅ Debug traces still work
|
||||
- ✅ No HashMap non-determinism
|
||||
- ✅ Code is easier to navigate
|
||||
|
||||
---
|
||||
|
||||
## Timeline
|
||||
|
||||
### Week 1: control_flow.rs Phases 1-3 (Low Risk)
|
||||
- **Total**: 4 hours
|
||||
- **Risk**: Low
|
||||
- **Deliverable**: Pattern lowerers and routing isolated
|
||||
|
||||
### Week 2: control_flow.rs Phase 4 (High Risk)
|
||||
- **Total**: 6 hours
|
||||
- **Risk**: Medium
|
||||
- **Deliverable**: merge_joinir_mir_blocks modularized
|
||||
|
||||
### Week 3: control_flow.rs Phases 5-7 + generic_case_a.rs
|
||||
- **Total**: 6 hours
|
||||
- **Risk**: Low
|
||||
- **Deliverable**: control_flow.rs complete, generic_case_a.rs modularized
|
||||
|
||||
---
|
||||
|
||||
## Questions?
|
||||
|
||||
**Full Plan**: [modularization-implementation-plan.md](./modularization-implementation-plan.md)
|
||||
|
||||
**Open Issues**:
|
||||
- Create a GitHub issue if you encounter blockers
|
||||
- Tag with `refactoring` label
|
||||
|
||||
**Need Help?**:
|
||||
- Check the full plan for detailed explanations
|
||||
- Review the risk/mitigation matrix
|
||||
- Use the emergency rollback procedure if needed
|
||||
|
||||
---
|
||||
|
||||
**Last Updated**: 2025-12-05
|
||||
**Status**: Ready to execute
|
||||
789
docs/development/refactoring/phase4-merge-function-breakdown.md
Normal file
789
docs/development/refactoring/phase4-merge-function-breakdown.md
Normal file
@ -0,0 +1,789 @@
|
||||
# Phase 4: merge_joinir_mir_blocks Breakdown Guide
|
||||
|
||||
**CRITICAL PHASE** - This is the most complex modularization task.
|
||||
|
||||
**Context**: `merge_joinir_mir_blocks()` is a 714-line monster function that performs 6 distinct operations. This guide provides step-by-step instructions for breaking it down safely.
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
### Current State
|
||||
- **File**: `src/mir/builder/control_flow.rs`
|
||||
- **Function**: `merge_joinir_mir_blocks()` (lines 864-1578)
|
||||
- **Size**: 714 lines (44% of the entire file!)
|
||||
- **Complexity**: 6 distinct phases, multiple data structures, critical to JoinIR integration
|
||||
|
||||
### Target State
|
||||
- **Directory**: `src/mir/builder/control_flow/joinir/merge/`
|
||||
- **Files**: 7 files (1 coordinator + 6 sub-modules)
|
||||
- **Average size**: ~100-150 lines per file
|
||||
- **Maintainability**: Each phase in its own file
|
||||
|
||||
---
|
||||
|
||||
## Function Analysis
|
||||
|
||||
### Current Structure (Lines 864-1578)
|
||||
|
||||
```rust
|
||||
fn merge_joinir_mir_blocks(
|
||||
&mut self,
|
||||
mir_module: &MirModule,
|
||||
boundary: Option<&JoinInlineBoundary>,
|
||||
debug: bool,
|
||||
) -> Result<Option<ValueId>, String> {
|
||||
// Phase 1: Block ID Allocation (lines 864-923)
|
||||
// - Create JoinIrIdRemapper
|
||||
// - Allocate new block IDs for all functions
|
||||
// - Map function entry blocks
|
||||
|
||||
// Phase 2: Value Collection (lines 931-971)
|
||||
// - Collect all ValueIds used across functions
|
||||
// - Track Const String → function name mapping
|
||||
// - Collect function parameters for tail call conversion
|
||||
|
||||
// Phase 3: ValueId Remapping (lines 973-1100)
|
||||
// - Allocate new ValueIds for all collected values
|
||||
// - Create remapping table
|
||||
// - Handle determinism (BTreeSet/BTreeMap usage)
|
||||
|
||||
// Phase 4: Block Merging & Instruction Rewriting (lines 1102-1400)
|
||||
// - For each function in sorted order:
|
||||
// - For each block in sorted order:
|
||||
// - Rewrite instructions with remapped IDs
|
||||
// - Convert Call → Jump for intra-module calls
|
||||
// - Convert Return → Jump to exit block
|
||||
// - Handle tail call optimization
|
||||
// - Insert merged blocks into current_function
|
||||
|
||||
// Phase 5: Exit PHI Construction (lines 1402-1500)
|
||||
// - Collect return values from all functions
|
||||
// - Create exit block with PHI node
|
||||
// - Return exit PHI value
|
||||
|
||||
// Phase 6: Boundary Reconnection (lines 1502-1578)
|
||||
// - If boundary with host_outputs specified:
|
||||
// - Map exit PHI back to host variable_map
|
||||
// - Update SSA slots in variable_map
|
||||
// - Reconnect inlined code to host context
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Proposed Module Breakdown
|
||||
|
||||
### 1. `merge/mod.rs` - Coordinator (~100 lines)
|
||||
|
||||
**Purpose**: Orchestrate the 6 phases without implementing logic.
|
||||
|
||||
```rust
|
||||
//! JoinIR MIR Block Merging Coordinator
|
||||
//!
|
||||
//! This module coordinates the merging of JoinIR-generated MIR functions
|
||||
//! into the host MIR builder. The process is broken into 6 phases:
|
||||
//!
|
||||
//! 1. Block ID allocation
|
||||
//! 2. Value collection
|
||||
//! 3. ValueId remapping
|
||||
//! 4. Instruction rewriting
|
||||
//! 5. Exit PHI construction
|
||||
//! 6. Boundary reconnection
|
||||
|
||||
use crate::mir::{MirModule, ValueId};
|
||||
use crate::mir::join_ir::lowering::inline_boundary::JoinInlineBoundary;
|
||||
|
||||
mod id_remapper;
|
||||
mod block_allocator;
|
||||
mod value_collector;
|
||||
mod instruction_rewriter;
|
||||
mod exit_phi_builder;
|
||||
|
||||
use id_remapper::JoinIrIdRemapper;
|
||||
|
||||
pub(in crate::mir::builder) fn merge_joinir_mir_blocks(
|
||||
builder: &mut crate::mir::builder::MirBuilder,
|
||||
mir_module: &MirModule,
|
||||
boundary: Option<&JoinInlineBoundary>,
|
||||
debug: bool,
|
||||
) -> Result<Option<ValueId>, String> {
|
||||
// Phase 1: Allocate block IDs
|
||||
let mut remapper = block_allocator::allocate_blocks(builder, mir_module, debug)?;
|
||||
|
||||
// Phase 2: Collect values
|
||||
let (used_values, value_to_func_name, function_params) =
|
||||
value_collector::collect_values(mir_module, &mut remapper, debug)?;
|
||||
|
||||
// Phase 3: Remap ValueIds
|
||||
id_remapper::remap_values(builder, &used_values, &mut remapper, debug)?;
|
||||
|
||||
// Phase 4: Merge blocks and rewrite instructions
|
||||
let exit_block_id = instruction_rewriter::merge_and_rewrite(
|
||||
builder,
|
||||
mir_module,
|
||||
&mut remapper,
|
||||
&value_to_func_name,
|
||||
&function_params,
|
||||
debug,
|
||||
)?;
|
||||
|
||||
// Phase 5: Build exit PHI
|
||||
let exit_phi_value = exit_phi_builder::build_exit_phi(
|
||||
builder,
|
||||
mir_module,
|
||||
&remapper,
|
||||
exit_block_id,
|
||||
debug,
|
||||
)?;
|
||||
|
||||
// Phase 6: Reconnect boundary (if specified)
|
||||
if let Some(boundary) = boundary {
|
||||
reconnect_boundary(builder, boundary, exit_phi_value, debug)?;
|
||||
}
|
||||
|
||||
Ok(exit_phi_value)
|
||||
}
|
||||
|
||||
fn reconnect_boundary(
|
||||
builder: &mut crate::mir::builder::MirBuilder,
|
||||
boundary: &JoinInlineBoundary,
|
||||
exit_phi_value: Option<ValueId>,
|
||||
debug: bool,
|
||||
) -> Result<(), String> {
|
||||
// Phase 6 implementation (simple, can stay in mod.rs)
|
||||
// ... (lines 1502-1578 logic)
|
||||
Ok(())
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 2. `merge/id_remapper.rs` - ID Remapping (~150 lines)
|
||||
|
||||
**Purpose**: Manage ValueId and BlockId remapping.
|
||||
|
||||
```rust
|
||||
//! JoinIR ID Remapper
|
||||
//!
|
||||
//! Handles the mapping of ValueIds and BlockIds from JoinIR functions
|
||||
//! to the host MIR builder's ID space.
|
||||
|
||||
use crate::mir::{BasicBlockId, ValueId};
|
||||
use std::collections::BTreeMap;
|
||||
|
||||
/// Remapper for JoinIR → Host MIR ID translation
|
||||
pub(super) struct JoinIrIdRemapper {
|
||||
/// (function_name, old_block_id) → new_block_id
|
||||
block_map: BTreeMap<(String, BasicBlockId), BasicBlockId>,
|
||||
|
||||
/// old_value_id → new_value_id
|
||||
value_map: BTreeMap<ValueId, ValueId>,
|
||||
|
||||
/// function_name → entry_block_id
|
||||
function_entry_map: BTreeMap<String, BasicBlockId>,
|
||||
}
|
||||
|
||||
impl JoinIrIdRemapper {
|
||||
pub(super) fn new() -> Self {
|
||||
Self {
|
||||
block_map: BTreeMap::new(),
|
||||
value_map: BTreeMap::new(),
|
||||
function_entry_map: BTreeMap::new(),
|
||||
}
|
||||
}
|
||||
|
||||
pub(super) fn set_block(&mut self, func_name: String, old_id: BasicBlockId, new_id: BasicBlockId) {
|
||||
self.block_map.insert((func_name, old_id), new_id);
|
||||
}
|
||||
|
||||
pub(super) fn get_block(&self, func_name: &str, old_id: BasicBlockId) -> Option<BasicBlockId> {
|
||||
self.block_map.get(&(func_name.to_string(), old_id)).copied()
|
||||
}
|
||||
|
||||
pub(super) fn set_value(&mut self, old_id: ValueId, new_id: ValueId) {
|
||||
self.value_map.insert(old_id, new_id);
|
||||
}
|
||||
|
||||
pub(super) fn get_value(&self, old_id: ValueId) -> Option<ValueId> {
|
||||
self.value_map.get(&old_id).copied()
|
||||
}
|
||||
|
||||
pub(super) fn set_function_entry(&mut self, func_name: String, entry_block: BasicBlockId) {
|
||||
self.function_entry_map.insert(func_name, entry_block);
|
||||
}
|
||||
|
||||
pub(super) fn get_function_entry(&self, func_name: &str) -> Option<BasicBlockId> {
|
||||
self.function_entry_map.get(func_name).copied()
|
||||
}
|
||||
|
||||
// ... (other helper methods from lines 873-959)
|
||||
}
|
||||
|
||||
pub(super) fn remap_values(
|
||||
builder: &mut crate::mir::builder::MirBuilder,
|
||||
used_values: &std::collections::BTreeSet<ValueId>,
|
||||
remapper: &mut JoinIrIdRemapper,
|
||||
debug: bool,
|
||||
) -> Result<(), String> {
|
||||
// Phase 3 logic (lines 973-1100)
|
||||
// - Allocate new ValueIds
|
||||
// - Store in remapper
|
||||
Ok(())
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 3. `merge/block_allocator.rs` - Block Allocation (~100 lines)
|
||||
|
||||
**Purpose**: Allocate block IDs for all JoinIR functions.
|
||||
|
||||
```rust
|
||||
//! JoinIR Block ID Allocator
|
||||
//!
|
||||
//! Allocates new BasicBlockIds for all blocks in JoinIR functions
|
||||
//! to avoid ID conflicts with the host MIR builder.
|
||||
|
||||
use crate::mir::{BasicBlockId, MirModule};
|
||||
use super::id_remapper::JoinIrIdRemapper;
|
||||
|
||||
pub(super) fn allocate_blocks(
|
||||
builder: &mut crate::mir::builder::MirBuilder,
|
||||
mir_module: &MirModule,
|
||||
debug: bool,
|
||||
) -> Result<JoinIrIdRemapper, String> {
|
||||
let mut remapper = JoinIrIdRemapper::new();
|
||||
|
||||
if debug {
|
||||
eprintln!("[merge/block_allocator] Allocating block IDs for {} functions",
|
||||
mir_module.functions.len());
|
||||
}
|
||||
|
||||
// DETERMINISM FIX: Sort functions by name
|
||||
let mut functions: Vec<_> = mir_module.functions.iter().collect();
|
||||
functions.sort_by_key(|(name, _)| name.as_str());
|
||||
|
||||
for (func_name, func) in functions {
|
||||
if debug {
|
||||
eprintln!("[merge/block_allocator] Function: {}", func_name);
|
||||
}
|
||||
|
||||
// DETERMINISM FIX: Sort blocks by ID
|
||||
let mut blocks: Vec<_> = func.blocks.iter().collect();
|
||||
blocks.sort_by_key(|(id, _)| id.0);
|
||||
|
||||
for (old_block_id, _) in blocks {
|
||||
let new_block_id = builder.block_gen.next();
|
||||
remapper.set_block(func_name.clone(), *old_block_id, new_block_id);
|
||||
|
||||
if debug {
|
||||
eprintln!("[merge/block_allocator] Block: {:?} → {:?}",
|
||||
old_block_id, new_block_id);
|
||||
}
|
||||
}
|
||||
|
||||
// Store function entry block
|
||||
let entry_block_new = remapper.get_block(func_name, func.entry_block)
|
||||
.ok_or_else(|| format!("Entry block not found for {}", func_name))?;
|
||||
remapper.set_function_entry(func_name.clone(), entry_block_new);
|
||||
}
|
||||
|
||||
if debug {
|
||||
eprintln!("[merge/block_allocator] Allocation complete");
|
||||
}
|
||||
|
||||
Ok(remapper)
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 4. `merge/value_collector.rs` - Value Collection (~100 lines)
|
||||
|
||||
**Purpose**: Collect all ValueIds used across JoinIR functions.
|
||||
|
||||
```rust
|
||||
//! JoinIR Value Collector
|
||||
//!
|
||||
//! Collects all ValueIds used in JoinIR functions for remapping.
|
||||
|
||||
use crate::mir::{ValueId, MirModule};
|
||||
use super::id_remapper::JoinIrIdRemapper;
|
||||
use std::collections::{BTreeSet, HashMap};
|
||||
|
||||
pub(super) fn collect_values(
|
||||
mir_module: &MirModule,
|
||||
remapper: &mut JoinIrIdRemapper,
|
||||
debug: bool,
|
||||
) -> Result<(BTreeSet<ValueId>, HashMap<ValueId, String>, HashMap<String, Vec<ValueId>>), String> {
|
||||
let mut used_values = BTreeSet::new();
|
||||
let mut value_to_func_name = HashMap::new();
|
||||
let mut function_params = HashMap::new();
|
||||
|
||||
if debug {
|
||||
eprintln!("[merge/value_collector] Collecting values from {} functions",
|
||||
mir_module.functions.len());
|
||||
}
|
||||
|
||||
for (func_name, func) in &mir_module.functions {
|
||||
// Collect function parameters
|
||||
function_params.insert(func_name.clone(), func.params.clone());
|
||||
|
||||
for block in func.blocks.values() {
|
||||
// Collect values from instructions
|
||||
let block_values = collect_values_in_block(block);
|
||||
used_values.extend(block_values);
|
||||
|
||||
// Track Const String → function name mapping
|
||||
for inst in &block.instructions {
|
||||
if let crate::mir::MirInstruction::Const { dst, value } = inst {
|
||||
if let crate::mir::types::ConstValue::String(s) = value {
|
||||
// Check if this is a function name
|
||||
if mir_module.functions.contains_key(s) {
|
||||
value_to_func_name.insert(*dst, s.clone());
|
||||
used_values.insert(*dst);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if debug {
|
||||
eprintln!("[merge/value_collector] Collected {} values", used_values.len());
|
||||
}
|
||||
|
||||
Ok((used_values, value_to_func_name, function_params))
|
||||
}
|
||||
|
||||
fn collect_values_in_block(block: &crate::mir::BasicBlock) -> Vec<ValueId> {
|
||||
// Helper function from lines 948-970
|
||||
// ... (extract logic from remapper.collect_values_in_block)
|
||||
vec![]
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 5. `merge/instruction_rewriter.rs` - Instruction Rewriting (~150 lines)
|
||||
|
||||
**Purpose**: Rewrite instructions and merge blocks.
|
||||
|
||||
```rust
|
||||
//! JoinIR Instruction Rewriter
|
||||
//!
|
||||
//! Rewrites JoinIR instructions with remapped IDs and merges blocks
|
||||
//! into the host MIR builder.
|
||||
|
||||
use crate::mir::{BasicBlockId, ValueId, MirModule, MirInstruction, BasicBlock};
|
||||
use super::id_remapper::JoinIrIdRemapper;
|
||||
use std::collections::HashMap;
|
||||
|
||||
pub(super) fn merge_and_rewrite(
|
||||
builder: &mut crate::mir::builder::MirBuilder,
|
||||
mir_module: &MirModule,
|
||||
remapper: &mut JoinIrIdRemapper,
|
||||
value_to_func_name: &HashMap<ValueId, String>,
|
||||
function_params: &HashMap<String, Vec<ValueId>>,
|
||||
debug: bool,
|
||||
) -> Result<BasicBlockId, String> {
|
||||
// Create exit block
|
||||
let exit_block_id = builder.block_gen.next();
|
||||
|
||||
if debug {
|
||||
eprintln!("[merge/instruction_rewriter] Merging blocks from {} functions",
|
||||
mir_module.functions.len());
|
||||
eprintln!("[merge/instruction_rewriter] Exit block: {:?}", exit_block_id);
|
||||
}
|
||||
|
||||
// DETERMINISM FIX: Sort functions by name
|
||||
let mut functions: Vec<_> = mir_module.functions.iter().collect();
|
||||
functions.sort_by_key(|(name, _)| name.as_str());
|
||||
|
||||
for (func_name, func) in functions {
|
||||
merge_function_blocks(
|
||||
builder,
|
||||
func_name,
|
||||
func,
|
||||
remapper,
|
||||
value_to_func_name,
|
||||
function_params,
|
||||
exit_block_id,
|
||||
debug,
|
||||
)?;
|
||||
}
|
||||
|
||||
Ok(exit_block_id)
|
||||
}
|
||||
|
||||
fn merge_function_blocks(
|
||||
builder: &mut crate::mir::builder::MirBuilder,
|
||||
func_name: &str,
|
||||
func: &crate::mir::MirFunction,
|
||||
remapper: &mut JoinIrIdRemapper,
|
||||
value_to_func_name: &HashMap<ValueId, String>,
|
||||
function_params: &HashMap<String, Vec<ValueId>>,
|
||||
exit_block_id: BasicBlockId,
|
||||
debug: bool,
|
||||
) -> Result<(), String> {
|
||||
// DETERMINISM FIX: Sort blocks by ID
|
||||
let mut blocks: Vec<_> = func.blocks.iter().collect();
|
||||
blocks.sort_by_key(|(id, _)| id.0);
|
||||
|
||||
for (old_block_id, old_block) in blocks {
|
||||
let new_block_id = remapper.get_block(func_name, *old_block_id)
|
||||
.ok_or_else(|| format!("Block ID not found: {:?}", old_block_id))?;
|
||||
|
||||
let new_block = rewrite_block(
|
||||
old_block,
|
||||
remapper,
|
||||
value_to_func_name,
|
||||
function_params,
|
||||
exit_block_id,
|
||||
debug,
|
||||
)?;
|
||||
|
||||
builder.current_function.as_mut()
|
||||
.ok_or("No current function")?
|
||||
.blocks.insert(new_block_id, new_block);
|
||||
}
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn rewrite_block(
|
||||
old_block: &BasicBlock,
|
||||
remapper: &JoinIrIdRemapper,
|
||||
value_to_func_name: &HashMap<ValueId, String>,
|
||||
function_params: &HashMap<String, Vec<ValueId>>,
|
||||
exit_block_id: BasicBlockId,
|
||||
debug: bool,
|
||||
) -> Result<BasicBlock, String> {
|
||||
let mut new_instructions = Vec::new();
|
||||
|
||||
// Rewrite each instruction (lines 1102-1300)
|
||||
for inst in &old_block.instructions {
|
||||
let new_inst = rewrite_instruction(
|
||||
inst,
|
||||
remapper,
|
||||
value_to_func_name,
|
||||
function_params,
|
||||
exit_block_id,
|
||||
debug,
|
||||
)?;
|
||||
new_instructions.push(new_inst);
|
||||
}
|
||||
|
||||
// Rewrite terminator (lines 1300-1400)
|
||||
let new_terminator = rewrite_terminator(
|
||||
&old_block.terminator,
|
||||
remapper,
|
||||
exit_block_id,
|
||||
debug,
|
||||
)?;
|
||||
|
||||
Ok(BasicBlock {
|
||||
instructions: new_instructions,
|
||||
terminator: new_terminator,
|
||||
})
|
||||
}
|
||||
|
||||
fn rewrite_instruction(
|
||||
inst: &MirInstruction,
|
||||
remapper: &JoinIrIdRemapper,
|
||||
value_to_func_name: &HashMap<ValueId, String>,
|
||||
function_params: &HashMap<String, Vec<ValueId>>,
|
||||
exit_block_id: BasicBlockId,
|
||||
debug: bool,
|
||||
) -> Result<MirInstruction, String> {
|
||||
// Instruction rewriting logic (lines 1102-1300)
|
||||
// - Remap ValueIds
|
||||
// - Convert Call → Jump for intra-module calls
|
||||
// - Handle tail call optimization
|
||||
Ok(inst.clone()) // Placeholder
|
||||
}
|
||||
|
||||
fn rewrite_terminator(
|
||||
terminator: &Option<MirInstruction>,
|
||||
remapper: &JoinIrIdRemapper,
|
||||
exit_block_id: BasicBlockId,
|
||||
debug: bool,
|
||||
) -> Result<Option<MirInstruction>, String> {
|
||||
// Terminator rewriting logic (lines 1300-1400)
|
||||
// - Remap block IDs
|
||||
// - Convert Return → Jump to exit block
|
||||
Ok(terminator.clone()) // Placeholder
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 6. `merge/exit_phi_builder.rs` - Exit PHI Construction (~100 lines)
|
||||
|
||||
**Purpose**: Construct exit PHI node from return values.
|
||||
|
||||
```rust
|
||||
//! JoinIR Exit PHI Builder
|
||||
//!
|
||||
//! Constructs the exit block PHI node that merges return values
|
||||
//! from all inlined JoinIR functions.
|
||||
|
||||
use crate::mir::{BasicBlockId, ValueId, MirModule, MirInstruction, BasicBlock};
|
||||
use super::id_remapper::JoinIrIdRemapper;
|
||||
|
||||
pub(super) fn build_exit_phi(
|
||||
builder: &mut crate::mir::builder::MirBuilder,
|
||||
mir_module: &MirModule,
|
||||
remapper: &JoinIrIdRemapper,
|
||||
exit_block_id: BasicBlockId,
|
||||
debug: bool,
|
||||
) -> Result<Option<ValueId>, String> {
|
||||
if debug {
|
||||
eprintln!("[merge/exit_phi_builder] Building exit PHI");
|
||||
}
|
||||
|
||||
// Collect return values from all functions (lines 1402-1450)
|
||||
let mut return_values = Vec::new();
|
||||
|
||||
for (func_name, func) in &mir_module.functions {
|
||||
for (old_block_id, block) in &func.blocks {
|
||||
if let Some(MirInstruction::Return { value: Some(old_value) }) = &block.terminator {
|
||||
// Remap the return value
|
||||
let new_value = remapper.get_value(*old_value)
|
||||
.ok_or_else(|| format!("Return value not remapped: {:?}", old_value))?;
|
||||
|
||||
// Remap the block ID (predecessor of exit block)
|
||||
let new_block_id = remapper.get_block(func_name, *old_block_id)
|
||||
.ok_or_else(|| format!("Block ID not remapped: {:?}", old_block_id))?;
|
||||
|
||||
return_values.push((new_block_id, new_value));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if return_values.is_empty() {
|
||||
if debug {
|
||||
eprintln!("[merge/exit_phi_builder] No return values, creating void exit block");
|
||||
}
|
||||
|
||||
// Create empty exit block
|
||||
builder.current_function.as_mut()
|
||||
.ok_or("No current function")?
|
||||
.blocks.insert(exit_block_id, BasicBlock {
|
||||
instructions: vec![],
|
||||
terminator: None,
|
||||
});
|
||||
|
||||
return Ok(None);
|
||||
}
|
||||
|
||||
// Create PHI node (lines 1450-1500)
|
||||
let phi_value = builder.value_gen.next();
|
||||
|
||||
let phi_inst = MirInstruction::Phi {
|
||||
dst: phi_value,
|
||||
incoming: return_values.iter()
|
||||
.map(|(block_id, value_id)| (*block_id, *value_id))
|
||||
.collect(),
|
||||
};
|
||||
|
||||
// Create exit block with PHI
|
||||
builder.current_function.as_mut()
|
||||
.ok_or("No current function")?
|
||||
.blocks.insert(exit_block_id, BasicBlock {
|
||||
instructions: vec![phi_inst],
|
||||
terminator: None,
|
||||
});
|
||||
|
||||
if debug {
|
||||
eprintln!("[merge/exit_phi_builder] Exit PHI created: {:?}", phi_value);
|
||||
}
|
||||
|
||||
Ok(Some(phi_value))
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Implementation Steps
|
||||
|
||||
### Step 1: Create Directory Structure (5 min)
|
||||
```bash
|
||||
mkdir -p src/mir/builder/control_flow/joinir/merge
|
||||
```
|
||||
|
||||
### Step 2: Create Skeleton Files (10 min)
|
||||
```bash
|
||||
# Create empty files with module documentation
|
||||
touch src/mir/builder/control_flow/joinir/merge/mod.rs
|
||||
touch src/mir/builder/control_flow/joinir/merge/id_remapper.rs
|
||||
touch src/mir/builder/control_flow/joinir/merge/block_allocator.rs
|
||||
touch src/mir/builder/control_flow/joinir/merge/value_collector.rs
|
||||
touch src/mir/builder/control_flow/joinir/merge/instruction_rewriter.rs
|
||||
touch src/mir/builder/control_flow/joinir/merge/exit_phi_builder.rs
|
||||
```
|
||||
|
||||
### Step 3: Extract id_remapper.rs (30 min)
|
||||
- Copy `JoinIrIdRemapper` struct definition
|
||||
- Move helper methods
|
||||
- Add `remap_values()` function
|
||||
- Test compilation: `cargo build --release`
|
||||
|
||||
### Step 4: Extract block_allocator.rs (30 min)
|
||||
- Copy block allocation logic (lines 888-923)
|
||||
- Create `allocate_blocks()` function
|
||||
- Test compilation: `cargo build --release`
|
||||
|
||||
### Step 5: Extract value_collector.rs (30 min)
|
||||
- Copy value collection logic (lines 931-971)
|
||||
- Create `collect_values()` function
|
||||
- Test compilation: `cargo build --release`
|
||||
|
||||
### Step 6: Extract instruction_rewriter.rs (1.5 hours)
|
||||
- Copy instruction rewriting logic (lines 1102-1400)
|
||||
- Create `merge_and_rewrite()` function
|
||||
- Create helper functions for instruction/terminator rewriting
|
||||
- Test compilation: `cargo build --release`
|
||||
|
||||
### Step 7: Extract exit_phi_builder.rs (30 min)
|
||||
- Copy exit PHI logic (lines 1402-1500)
|
||||
- Create `build_exit_phi()` function
|
||||
- Test compilation: `cargo build --release`
|
||||
|
||||
### Step 8: Create mod.rs Coordinator (1 hour)
|
||||
- Create coordinator function
|
||||
- Wire up all sub-modules
|
||||
- Move boundary reconnection logic (lines 1502-1578)
|
||||
- Test compilation: `cargo build --release`
|
||||
|
||||
### Step 9: Update control_flow/mod.rs (15 min)
|
||||
- Update imports to use new `merge` module
|
||||
- Remove old `merge_joinir_mir_blocks()` implementation
|
||||
- Test compilation: `cargo build --release`
|
||||
|
||||
### Step 10: Comprehensive Testing (1 hour)
|
||||
```bash
|
||||
# Build verification
|
||||
cargo build --release
|
||||
cargo test --lib
|
||||
|
||||
# Smoke tests
|
||||
tools/smokes/v2/run.sh --profile quick
|
||||
|
||||
# Debug trace verification
|
||||
NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako 2>&1 | grep "merge"
|
||||
|
||||
# Determinism check (run 3 times)
|
||||
for i in 1 2 3; do
|
||||
echo "=== Run $i ==="
|
||||
cargo test --release test_loop_patterns 2>&1 | grep "test result"
|
||||
done
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [ ] All 267+ tests pass
|
||||
- [ ] Build time ≤ current (no regression)
|
||||
- [ ] Debug traces still appear with `NYASH_OPTION_C_DEBUG=1`
|
||||
- [ ] No compiler warnings
|
||||
- [ ] No clippy warnings (`cargo clippy --all-targets`)
|
||||
- [ ] Determinism test passes (3 consecutive runs produce identical results)
|
||||
- [ ] Smoke tests pass for all patterns (1/2/3)
|
||||
|
||||
---
|
||||
|
||||
## Rollback Procedure
|
||||
|
||||
If Phase 4 fails:
|
||||
|
||||
```bash
|
||||
# 1. Remove new directory
|
||||
rm -rf src/mir/builder/control_flow/joinir/merge
|
||||
|
||||
# 2. Restore original control_flow.rs
|
||||
git checkout src/mir/builder/control_flow.rs
|
||||
|
||||
# 3. Verify build
|
||||
cargo build --release
|
||||
cargo test --lib
|
||||
|
||||
# 4. Document failure
|
||||
echo "$(date): Phase 4 rollback due to: $REASON" >> docs/development/refactoring/rollback_log.txt
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Common Pitfalls
|
||||
|
||||
### Pitfall 1: HashMap Non-Determinism
|
||||
**Problem**: Using `HashMap` for iteration causes non-deterministic ValueId allocation.
|
||||
**Solution**: Use `BTreeMap` and `BTreeSet` for all ID mappings.
|
||||
|
||||
### Pitfall 2: Missing Value Remapping
|
||||
**Problem**: Forgetting to remap a ValueId causes "value not found" errors.
|
||||
**Solution**: Systematically collect ALL values before remapping phase.
|
||||
|
||||
### Pitfall 3: Block Order Matters
|
||||
**Problem**: Processing blocks in random order causes test failures.
|
||||
**Solution**: Always sort by name/ID before iteration.
|
||||
|
||||
### Pitfall 4: Forgetting to Update Imports
|
||||
**Problem**: Old imports cause compilation failures.
|
||||
**Solution**: Update all imports in a single commit after extraction.
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria
|
||||
|
||||
- ✅ 714-line function → 6 focused modules (100-150 lines each)
|
||||
- ✅ All tests pass (no regressions)
|
||||
- ✅ Debug traces work (`NYASH_OPTION_C_DEBUG=1`)
|
||||
- ✅ Determinism maintained (BTreeMap/BTreeSet usage)
|
||||
- ✅ Code is easier to understand (each phase isolated)
|
||||
- ✅ Future maintenance easier (modify one phase at a time)
|
||||
|
||||
---
|
||||
|
||||
## Timeline
|
||||
|
||||
- **Total Estimated Effort**: 6 hours
|
||||
- **Buffer**: +2 hours for unexpected issues
|
||||
- **Recommended Schedule**: 2 days (3 hours/day)
|
||||
|
||||
**Day 1**:
|
||||
- Steps 1-5 (directory setup, id_remapper, block_allocator, value_collector)
|
||||
- Verification after each step
|
||||
|
||||
**Day 2**:
|
||||
- Steps 6-10 (instruction_rewriter, exit_phi_builder, coordinator, testing)
|
||||
- Comprehensive verification
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
Phase 4 is the most critical modularization task. By breaking down the 714-line monster function into 6 focused modules, we:
|
||||
|
||||
1. **Improve maintainability** - Each phase in its own file
|
||||
2. **Reduce complexity** - 714 lines → 100-150 lines per file
|
||||
3. **Enable future development** - Easy to modify individual phases
|
||||
4. **Maintain stability** - Zero breaking changes, backward compatible
|
||||
|
||||
**Next Steps**:
|
||||
1. Review this guide
|
||||
2. Set aside 2 days for implementation
|
||||
3. Execute steps 1-10 systematically
|
||||
4. Run comprehensive verification
|
||||
5. Commit success or rollback if needed
|
||||
|
||||
**Questions?** Refer to the full implementation plan or open an issue.
|
||||
|
||||
---
|
||||
|
||||
**Document Version**: 1.0
|
||||
**Created**: 2025-12-05
|
||||
**Author**: Claude Code (AI-assisted planning)
|
||||
**Status**: Ready for execution
|
||||
Reference in New Issue
Block a user