Create 5-document modularization strategy for 3 large source files: - control_flow.rs (1,632 lines → 19 modules) - generic_case_a.rs (1,056 lines → 7 modules) - loopform_builder.rs (1,166 lines → 11 modules) Documents included: 1. **README.md** (352 lines) Navigation hub with overview, metrics, timeline, ROI analysis 2. **modularization-implementation-plan.md** (960 lines) Complete 20-hour implementation guide with: - Phase-by-phase breakdown (13 phases across 3 files) - Hour-by-hour effort estimates - Risk assessment matrices - Success criteria (quantitative & qualitative) - Public API changes (zero breaking changes) 3. **modularization-quick-start.md** (253 lines) Actionable checklist with: - Copy-paste verification commands - Emergency rollback procedures - Timeline and milestones 4. **modularization-directory-structure.md** (426 lines) Visual guide with: - Before/after directory trees - File size metrics - Import path examples - Code quality comparison 5. **phase4-merge-function-breakdown.md** (789 lines) Detailed implementation for most complex phase: - 714-line merge_joinir_mir_blocks() → 6 focused modules - 10 detailed implementation steps - Common pitfalls and solutions Key metrics: - Total effort: 20 hours (2-3 weeks) - Files: 9 → 37 focused modules - Largest file: 1,632 → 180 lines (-89%) - Backward compatible: Zero breaking changes - ROI: Breakeven in 4 months (5 hrs/month saved) Priority: control_flow.rs Phase 1-4 (high impact) Safety: Fully reversible at each phase 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Modularization Implementation Resources
This directory contains comprehensive plans and guides for modularizing large source files in the Nyash codebase.
Overview
Goal: Break down 3 oversized files (3,854 lines total) into 37 focused modules with clear separation of concerns.
Priority:
- control_flow.rs (1,632 lines) - HIGHEST (blocking Pattern 4+ development)
- generic_case_a.rs (1,056 lines) - MEDIUM (high code deduplication potential)
- loopform_builder.rs (1,166 lines) - LOWER (already partially modularized)
Documents in This Directory
1. modularization-implementation-plan.md ⭐ START HERE
Comprehensive implementation plan covering all 3 files.
Contents:
- Executive summary
- Phase-by-phase migration plans for each file
- Public API changes
- Build verification strategies
- Risk assessment matrices
- Implementation effort breakdowns
- Success criteria
Who should read this: Anyone implementing the modularization.
Estimated read time: 30 minutes
2. modularization-quick-start.md 🚀 QUICK REFERENCE
TL;DR checklist version of the implementation plan.
Contents:
- Step-by-step checklists for each phase
- Verification commands
- Emergency rollback commands
- Timeline and milestones
Who should read this: Developers actively working on modularization.
Estimated read time: 10 minutes
3. modularization-directory-structure.md 📊 VISUAL GUIDE
Visual directory structure diagrams showing before/after states.
Contents:
- Directory tree diagrams for all 3 files
- Metrics comparison tables
- Import path changes
- Navigation improvement examples
- File size distribution charts
Who should read this: Anyone wanting to understand the proposed structure.
Estimated read time: 15 minutes
4. phase4-merge-function-breakdown.md 🔥 CRITICAL PHASE
Detailed implementation guide for Phase 4 (merge_joinir_mir_blocks breakdown).
Contents:
- Function analysis (714 lines → 6 modules)
- Detailed module breakdowns with code examples
- Step-by-step implementation steps (10 steps)
- Verification checklist
- Common pitfalls and solutions
- Rollback procedure
Who should read this: Developers working on control_flow.rs Phase 4.
Estimated read time: 20 minutes
Quick Navigation
I want to...
Start the modularization
→ Read modularization-implementation-plan.md (full plan) → Use modularization-quick-start.md (checklist)
Understand the proposed structure
→ Read modularization-directory-structure.md (visual guide)
Work on Phase 4 (merge function)
→ Read phase4-merge-function-breakdown.md (detailed guide)
Get approval for the plan
→ Share modularization-implementation-plan.md (comprehensive) → Use modularization-directory-structure.md (visual support)
Estimate effort
→ See "Implementation Effort Breakdown" in modularization-implementation-plan.md
Assess risks
→ See "Risk Assessment" sections in modularization-implementation-plan.md
Recommended Reading Order
For Implementers (Developers)
- Quick Start - modularization-quick-start.md (10 min)
- Full Plan - modularization-implementation-plan.md (30 min)
- Phase 4 Guide - phase4-merge-function-breakdown.md (when ready for Phase 4)
For Reviewers (Team Leads)
- Visual Guide - modularization-directory-structure.md (15 min)
- Full Plan - modularization-implementation-plan.md (30 min)
- Quick Start - modularization-quick-start.md (verification commands)
For Stakeholders (Management)
- Executive Summary - First page of modularization-implementation-plan.md (5 min)
- Metrics Comparison - Tables in modularization-directory-structure.md (5 min)
Key Metrics
control_flow.rs
- Lines: 1,632 → 1,850 (+13% for clarity)
- Files: 1 → 19
- Largest file: 1,632 → 180 (-89%)
- Effort: 12.5 hours
generic_case_a.rs
- Lines: 1,056 → 1,470 (+39% for clarity)
- Files: 3 → 7
- Largest file: 1,056 → 500 (-53%)
- Effort: 3.5 hours
loopform_builder.rs
- Lines: 1,166 → 1,450 (+24% for clarity)
- Files: 5 → 11
- Largest file: 1,166 → 200 (-83%)
- Effort: 4 hours
Total
- Lines: 3,854 → 4,770 (+24% for clarity, distributed across 37 files)
- Files: 9 → 37
- Total Effort: 20 hours (2-3 weeks)
Implementation Timeline
Week 1: control_flow.rs Phases 1-3 (Low Risk)
- Monday: Phase 1 (Debug utilities) - 30 min
- Tuesday: Phase 2 (Pattern lowerers) - 2 hours
- Wednesday: Phase 3 (JoinIR routing) - 1.5 hours
- Thursday-Friday: Verification and buffer
Deliverable: Pattern lowerers and routing isolated
Week 2: control_flow.rs Phase 4 (High Risk)
- Monday-Tuesday: Phase 4 (merge function) - 6 hours
- Wednesday: Buffer for issues
- Thursday-Friday: Phases 5-7 (Exception, utils, cleanup) - 2.5 hours
Deliverable: control_flow.rs fully modularized
Week 3: generic_case_a.rs (Optional)
- Monday-Tuesday: generic_case_a.rs Phases 1-5 - 3.5 hours
- Wednesday: Buffer
- Thursday-Friday: Documentation & final verification
Deliverable: generic_case_a.rs fully modularized
Future: loopform_builder.rs (After Pattern 4+)
- Timing: After Pattern 4/5/6 development stabilizes
- Effort: 4 hours
- Priority: Lower (already partially modularized)
Success Criteria
Quantitative
- ✅ All 267+ tests pass (no regressions)
- ✅ Build time ≤ current (no increase)
- ✅ Largest file < 250 lines (vs 1,632 before)
- ✅ Average file size < 150 lines
Qualitative
- ✅ Code is easier to navigate
- ✅ New patterns can be added without modifying 1,600-line files
- ✅ Debug traces remain functional
- ✅ Documentation is clear and helpful
Process
- ✅ Zero breaking changes at any phase
- ✅ Each phase can be rolled back independently
- ✅ Commits are small and focused
- ✅ CI/CD passes after every commit
Verification Commands
Quick Verification (after each phase)
cargo build --release
cargo test --lib
Comprehensive Verification (after critical phases)
cargo build --release --all-features
cargo test --release
cargo clippy --all-targets
tools/smokes/v2/run.sh --profile quick
Debug Trace Verification (Phase 4 only)
NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako 2>&1 | grep "merge_joinir"
Emergency Rollback
control_flow.rs
rm -rf src/mir/builder/control_flow/
git checkout src/mir/builder/control_flow.rs
cargo build --release && cargo test --lib
generic_case_a.rs
rm -rf src/mir/join_ir/lowering/generic_case_a/
git checkout src/mir/join_ir/lowering/generic_case_a*.rs
cargo build --release && cargo test --lib
loopform_builder.rs
rm -rf src/mir/phi_core/loopform/
git checkout src/mir/phi_core/loopform*.rs
cargo build --release && cargo test --lib
Why Modularize?
Current Pain Points
- 714-line merge function - Impossible to understand without hours of study
- 1,632-line control_flow.rs - Pattern 4+ would add another 500+ lines
- Merge conflicts - Multiple developers editing the same giant file
- Hard to debug -
NYASH_OPTION_C_DEBUGtraces are buried in massive files - Hard to test - Can't test individual phases in isolation
Benefits After Modularization
- 100-150 line modules - Easy to understand at a glance
- 19 focused files - Each with a single responsibility
- Isolated changes - Modify one phase without affecting others
- Easy debugging - Jump to specific module for traces
- Testable - Can unit test individual modules
ROI (Return on Investment)
- Time investment: 20 hours (2-3 weeks)
- Time saved: ~5 hours/month on maintenance (conservatively)
- Breakeven: 4 months
- Long-term benefit: Much easier Pattern 4/5/6 development
Implementation Order Justification
Why control_flow.rs First?
- Blocking Pattern 4+ - Currently blocking new pattern development
- Highest pain - 714-line merge function is the biggest code smell
- Sets the pattern - Establishes the modularization template for others
- Most benefit - Reduces merge conflicts immediately
Why generic_case_a.rs Second?
- High code deduplication - 4 similar lowerers can be separated
- Already partially split - Companion files already exist
- Medium priority - Not blocking, but would improve maintainability
Why loopform_builder.rs Last?
- Already partially modularized - Phase 191 did most of the work
- Lower priority - Not blocking anything
- Can wait - Best done after Pattern 4+ development stabilizes
Questions & Concerns
"Is this worth the effort?"
Yes. 20 hours investment for ongoing maintenance benefits. Breakeven in 4 months.
"Will this break anything?"
No. Zero breaking changes, backward compatible at every phase. Full test suite verification.
"Can we roll back if needed?"
Yes. Each phase can be rolled back independently with simple git commands.
"What if we only do control_flow.rs?"
Still valuable. That's where the highest pain is. Do that first, others can wait.
"Who should implement this?"
Experienced developer familiar with MIR builder and JoinIR integration. Phase 4 requires careful attention.
Next Steps
- Review this README - Understand the resources available
- Read the full plan - modularization-implementation-plan.md
- Get approval - Share with team leads
- Create a branch -
refactor/modularize-control-flow - Start Phase 1 - Use modularization-quick-start.md
Document Status
- Created: 2025-12-05
- Status: Ready for review and implementation
- Maintainer: Claude Code (AI-assisted planning)
- Next Review: After Week 1 completion
Feedback
If you have questions or suggestions about this modularization plan:
- Open an issue - Tag with
refactoringlabel - Update the plan - Submit a PR with improvements
- Document lessons learned - Add notes to this README
Contact: Open a discussion in the team channel
Happy Modularizing! 🚀