Files
nyash-codex 1730fea206 docs: Add comprehensive modularization implementation plan
Create 5-document modularization strategy for 3 large source files:
- control_flow.rs (1,632 lines → 19 modules)
- generic_case_a.rs (1,056 lines → 7 modules)
- loopform_builder.rs (1,166 lines → 11 modules)

Documents included:

1. **README.md** (352 lines)
   Navigation hub with overview, metrics, timeline, ROI analysis

2. **modularization-implementation-plan.md** (960 lines)
   Complete 20-hour implementation guide with:
   - Phase-by-phase breakdown (13 phases across 3 files)
   - Hour-by-hour effort estimates
   - Risk assessment matrices
   - Success criteria (quantitative & qualitative)
   - Public API changes (zero breaking changes)

3. **modularization-quick-start.md** (253 lines)
   Actionable checklist with:
   - Copy-paste verification commands
   - Emergency rollback procedures
   - Timeline and milestones

4. **modularization-directory-structure.md** (426 lines)
   Visual guide with:
   - Before/after directory trees
   - File size metrics
   - Import path examples
   - Code quality comparison

5. **phase4-merge-function-breakdown.md** (789 lines)
   Detailed implementation for most complex phase:
   - 714-line merge_joinir_mir_blocks() → 6 focused modules
   - 10 detailed implementation steps
   - Common pitfalls and solutions

Key metrics:
- Total effort: 20 hours (2-3 weeks)
- Files: 9 → 37 focused modules
- Largest file: 1,632 → 180 lines (-89%)
- Backward compatible: Zero breaking changes
- ROI: Breakeven in 4 months (5 hrs/month saved)

Priority: control_flow.rs Phase 1-4 (high impact)
Safety: Fully reversible at each phase

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 20:13:31 +09:00

353 lines
11 KiB
Markdown

# Modularization Implementation Resources
This directory contains comprehensive plans and guides for modularizing large source files in the Nyash codebase.
---
## Overview
**Goal**: Break down 3 oversized files (3,854 lines total) into 37 focused modules with clear separation of concerns.
**Priority**:
1. **control_flow.rs** (1,632 lines) - **HIGHEST** (blocking Pattern 4+ development)
2. **generic_case_a.rs** (1,056 lines) - **MEDIUM** (high code deduplication potential)
3. **loopform_builder.rs** (1,166 lines) - **LOWER** (already partially modularized)
---
## Documents in This Directory
### 1. [modularization-implementation-plan.md](./modularization-implementation-plan.md) ⭐ **START HERE**
**Comprehensive implementation plan** covering all 3 files.
**Contents**:
- Executive summary
- Phase-by-phase migration plans for each file
- Public API changes
- Build verification strategies
- Risk assessment matrices
- Implementation effort breakdowns
- Success criteria
**Who should read this**: Anyone implementing the modularization.
**Estimated read time**: 30 minutes
---
### 2. [modularization-quick-start.md](./modularization-quick-start.md) 🚀 **QUICK REFERENCE**
**TL;DR checklist version** of the implementation plan.
**Contents**:
- Step-by-step checklists for each phase
- Verification commands
- Emergency rollback commands
- Timeline and milestones
**Who should read this**: Developers actively working on modularization.
**Estimated read time**: 10 minutes
---
### 3. [modularization-directory-structure.md](./modularization-directory-structure.md) 📊 **VISUAL GUIDE**
**Visual directory structure diagrams** showing before/after states.
**Contents**:
- Directory tree diagrams for all 3 files
- Metrics comparison tables
- Import path changes
- Navigation improvement examples
- File size distribution charts
**Who should read this**: Anyone wanting to understand the proposed structure.
**Estimated read time**: 15 minutes
---
### 4. [phase4-merge-function-breakdown.md](./phase4-merge-function-breakdown.md) 🔥 **CRITICAL PHASE**
**Detailed implementation guide** for Phase 4 (merge_joinir_mir_blocks breakdown).
**Contents**:
- Function analysis (714 lines → 6 modules)
- Detailed module breakdowns with code examples
- Step-by-step implementation steps (10 steps)
- Verification checklist
- Common pitfalls and solutions
- Rollback procedure
**Who should read this**: Developers working on control_flow.rs Phase 4.
**Estimated read time**: 20 minutes
---
## Quick Navigation
### I want to...
#### **Start the modularization**
→ Read [modularization-implementation-plan.md](./modularization-implementation-plan.md) (full plan)
→ Use [modularization-quick-start.md](./modularization-quick-start.md) (checklist)
#### **Understand the proposed structure**
→ Read [modularization-directory-structure.md](./modularization-directory-structure.md) (visual guide)
#### **Work on Phase 4 (merge function)**
→ Read [phase4-merge-function-breakdown.md](./phase4-merge-function-breakdown.md) (detailed guide)
#### **Get approval for the plan**
→ Share [modularization-implementation-plan.md](./modularization-implementation-plan.md) (comprehensive)
→ Use [modularization-directory-structure.md](./modularization-directory-structure.md) (visual support)
#### **Estimate effort**
→ See "Implementation Effort Breakdown" in [modularization-implementation-plan.md](./modularization-implementation-plan.md)
#### **Assess risks**
→ See "Risk Assessment" sections in [modularization-implementation-plan.md](./modularization-implementation-plan.md)
---
## Recommended Reading Order
### For Implementers (Developers)
1. **Quick Start** - [modularization-quick-start.md](./modularization-quick-start.md) (10 min)
2. **Full Plan** - [modularization-implementation-plan.md](./modularization-implementation-plan.md) (30 min)
3. **Phase 4 Guide** - [phase4-merge-function-breakdown.md](./phase4-merge-function-breakdown.md) (when ready for Phase 4)
### For Reviewers (Team Leads)
1. **Visual Guide** - [modularization-directory-structure.md](./modularization-directory-structure.md) (15 min)
2. **Full Plan** - [modularization-implementation-plan.md](./modularization-implementation-plan.md) (30 min)
3. **Quick Start** - [modularization-quick-start.md](./modularization-quick-start.md) (verification commands)
### For Stakeholders (Management)
1. **Executive Summary** - First page of [modularization-implementation-plan.md](./modularization-implementation-plan.md) (5 min)
2. **Metrics Comparison** - Tables in [modularization-directory-structure.md](./modularization-directory-structure.md) (5 min)
---
## Key Metrics
### control_flow.rs
- **Lines**: 1,632 → 1,850 (+13% for clarity)
- **Files**: 1 → 19
- **Largest file**: 1,632 → 180 (-89%)
- **Effort**: 12.5 hours
### generic_case_a.rs
- **Lines**: 1,056 → 1,470 (+39% for clarity)
- **Files**: 3 → 7
- **Largest file**: 1,056 → 500 (-53%)
- **Effort**: 3.5 hours
### loopform_builder.rs
- **Lines**: 1,166 → 1,450 (+24% for clarity)
- **Files**: 5 → 11
- **Largest file**: 1,166 → 200 (-83%)
- **Effort**: 4 hours
### Total
- **Lines**: 3,854 → 4,770 (+24% for clarity, distributed across 37 files)
- **Files**: 9 → 37
- **Total Effort**: 20 hours (2-3 weeks)
---
## Implementation Timeline
### Week 1: control_flow.rs Phases 1-3 (Low Risk)
- **Monday**: Phase 1 (Debug utilities) - 30 min
- **Tuesday**: Phase 2 (Pattern lowerers) - 2 hours
- **Wednesday**: Phase 3 (JoinIR routing) - 1.5 hours
- **Thursday-Friday**: Verification and buffer
**Deliverable**: Pattern lowerers and routing isolated
### Week 2: control_flow.rs Phase 4 (High Risk)
- **Monday-Tuesday**: Phase 4 (merge function) - 6 hours
- **Wednesday**: Buffer for issues
- **Thursday-Friday**: Phases 5-7 (Exception, utils, cleanup) - 2.5 hours
**Deliverable**: control_flow.rs fully modularized
### Week 3: generic_case_a.rs (Optional)
- **Monday-Tuesday**: generic_case_a.rs Phases 1-5 - 3.5 hours
- **Wednesday**: Buffer
- **Thursday-Friday**: Documentation & final verification
**Deliverable**: generic_case_a.rs fully modularized
### Future: loopform_builder.rs (After Pattern 4+)
- **Timing**: After Pattern 4/5/6 development stabilizes
- **Effort**: 4 hours
- **Priority**: Lower (already partially modularized)
---
## Success Criteria
### Quantitative
- ✅ All 267+ tests pass (no regressions)
- ✅ Build time ≤ current (no increase)
- ✅ Largest file < 250 lines (vs 1,632 before)
- Average file size < 150 lines
### Qualitative
- Code is easier to navigate
- New patterns can be added without modifying 1,600-line files
- Debug traces remain functional
- Documentation is clear and helpful
### Process
- Zero breaking changes at any phase
- Each phase can be rolled back independently
- Commits are small and focused
- CI/CD passes after every commit
---
## Verification Commands
### Quick Verification (after each phase)
```bash
cargo build --release
cargo test --lib
```
### Comprehensive Verification (after critical phases)
```bash
cargo build --release --all-features
cargo test --release
cargo clippy --all-targets
tools/smokes/v2/run.sh --profile quick
```
### Debug Trace Verification (Phase 4 only)
```bash
NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako 2>&1 | grep "merge_joinir"
```
---
## Emergency Rollback
### control_flow.rs
```bash
rm -rf src/mir/builder/control_flow/
git checkout src/mir/builder/control_flow.rs
cargo build --release && cargo test --lib
```
### generic_case_a.rs
```bash
rm -rf src/mir/join_ir/lowering/generic_case_a/
git checkout src/mir/join_ir/lowering/generic_case_a*.rs
cargo build --release && cargo test --lib
```
### loopform_builder.rs
```bash
rm -rf src/mir/phi_core/loopform/
git checkout src/mir/phi_core/loopform*.rs
cargo build --release && cargo test --lib
```
---
## Why Modularize?
### Current Pain Points
1. **714-line merge function** - Impossible to understand without hours of study
2. **1,632-line control_flow.rs** - Pattern 4+ would add another 500+ lines
3. **Merge conflicts** - Multiple developers editing the same giant file
4. **Hard to debug** - `NYASH_OPTION_C_DEBUG` traces are buried in massive files
5. **Hard to test** - Can't test individual phases in isolation
### Benefits After Modularization
1. **100-150 line modules** - Easy to understand at a glance
2. **19 focused files** - Each with a single responsibility
3. **Isolated changes** - Modify one phase without affecting others
4. **Easy debugging** - Jump to specific module for traces
5. **Testable** - Can unit test individual modules
### ROI (Return on Investment)
- **Time investment**: 20 hours (2-3 weeks)
- **Time saved**: ~5 hours/month on maintenance (conservatively)
- **Breakeven**: 4 months
- **Long-term benefit**: Much easier Pattern 4/5/6 development
---
## Implementation Order Justification
### Why control_flow.rs First?
1. **Blocking Pattern 4+** - Currently blocking new pattern development
2. **Highest pain** - 714-line merge function is the biggest code smell
3. **Sets the pattern** - Establishes the modularization template for others
4. **Most benefit** - Reduces merge conflicts immediately
### Why generic_case_a.rs Second?
1. **High code deduplication** - 4 similar lowerers can be separated
2. **Already partially split** - Companion files already exist
3. **Medium priority** - Not blocking, but would improve maintainability
### Why loopform_builder.rs Last?
1. **Already partially modularized** - Phase 191 did most of the work
2. **Lower priority** - Not blocking anything
3. **Can wait** - Best done after Pattern 4+ development stabilizes
---
## Questions & Concerns
### "Is this worth the effort?"
**Yes.** 20 hours investment for ongoing maintenance benefits. Breakeven in 4 months.
### "Will this break anything?"
**No.** Zero breaking changes, backward compatible at every phase. Full test suite verification.
### "Can we roll back if needed?"
**Yes.** Each phase can be rolled back independently with simple git commands.
### "What if we only do control_flow.rs?"
**Still valuable.** That's where the highest pain is. Do that first, others can wait.
### "Who should implement this?"
**Experienced developer** familiar with MIR builder and JoinIR integration. Phase 4 requires careful attention.
---
## Next Steps
1. **Review this README** - Understand the resources available
2. **Read the full plan** - [modularization-implementation-plan.md](./modularization-implementation-plan.md)
3. **Get approval** - Share with team leads
4. **Create a branch** - `refactor/modularize-control-flow`
5. **Start Phase 1** - Use [modularization-quick-start.md](./modularization-quick-start.md)
---
## Document Status
- **Created**: 2025-12-05
- **Status**: Ready for review and implementation
- **Maintainer**: Claude Code (AI-assisted planning)
- **Next Review**: After Week 1 completion
---
## Feedback
If you have questions or suggestions about this modularization plan:
1. **Open an issue** - Tag with `refactoring` label
2. **Update the plan** - Submit a PR with improvements
3. **Document lessons learned** - Add notes to this README
**Contact**: Open a discussion in the team channel
---
**Happy Modularizing!** 🚀