Files
nyash-codex 1730fea206 docs: Add comprehensive modularization implementation plan
Create 5-document modularization strategy for 3 large source files:
- control_flow.rs (1,632 lines → 19 modules)
- generic_case_a.rs (1,056 lines → 7 modules)
- loopform_builder.rs (1,166 lines → 11 modules)

Documents included:

1. **README.md** (352 lines)
   Navigation hub with overview, metrics, timeline, ROI analysis

2. **modularization-implementation-plan.md** (960 lines)
   Complete 20-hour implementation guide with:
   - Phase-by-phase breakdown (13 phases across 3 files)
   - Hour-by-hour effort estimates
   - Risk assessment matrices
   - Success criteria (quantitative & qualitative)
   - Public API changes (zero breaking changes)

3. **modularization-quick-start.md** (253 lines)
   Actionable checklist with:
   - Copy-paste verification commands
   - Emergency rollback procedures
   - Timeline and milestones

4. **modularization-directory-structure.md** (426 lines)
   Visual guide with:
   - Before/after directory trees
   - File size metrics
   - Import path examples
   - Code quality comparison

5. **phase4-merge-function-breakdown.md** (789 lines)
   Detailed implementation for most complex phase:
   - 714-line merge_joinir_mir_blocks() → 6 focused modules
   - 10 detailed implementation steps
   - Common pitfalls and solutions

Key metrics:
- Total effort: 20 hours (2-3 weeks)
- Files: 9 → 37 focused modules
- Largest file: 1,632 → 180 lines (-89%)
- Backward compatible: Zero breaking changes
- ROI: Breakeven in 4 months (5 hrs/month saved)

Priority: control_flow.rs Phase 1-4 (high impact)
Safety: Fully reversible at each phase

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 20:13:31 +09:00
..

Modularization Implementation Resources

This directory contains comprehensive plans and guides for modularizing large source files in the Nyash codebase.


Overview

Goal: Break down 3 oversized files (3,854 lines total) into 37 focused modules with clear separation of concerns.

Priority:

  1. control_flow.rs (1,632 lines) - HIGHEST (blocking Pattern 4+ development)
  2. generic_case_a.rs (1,056 lines) - MEDIUM (high code deduplication potential)
  3. loopform_builder.rs (1,166 lines) - LOWER (already partially modularized)

Documents in This Directory

1. modularization-implementation-plan.md START HERE

Comprehensive implementation plan covering all 3 files.

Contents:

  • Executive summary
  • Phase-by-phase migration plans for each file
  • Public API changes
  • Build verification strategies
  • Risk assessment matrices
  • Implementation effort breakdowns
  • Success criteria

Who should read this: Anyone implementing the modularization.

Estimated read time: 30 minutes


2. modularization-quick-start.md 🚀 QUICK REFERENCE

TL;DR checklist version of the implementation plan.

Contents:

  • Step-by-step checklists for each phase
  • Verification commands
  • Emergency rollback commands
  • Timeline and milestones

Who should read this: Developers actively working on modularization.

Estimated read time: 10 minutes


3. modularization-directory-structure.md 📊 VISUAL GUIDE

Visual directory structure diagrams showing before/after states.

Contents:

  • Directory tree diagrams for all 3 files
  • Metrics comparison tables
  • Import path changes
  • Navigation improvement examples
  • File size distribution charts

Who should read this: Anyone wanting to understand the proposed structure.

Estimated read time: 15 minutes


4. phase4-merge-function-breakdown.md 🔥 CRITICAL PHASE

Detailed implementation guide for Phase 4 (merge_joinir_mir_blocks breakdown).

Contents:

  • Function analysis (714 lines → 6 modules)
  • Detailed module breakdowns with code examples
  • Step-by-step implementation steps (10 steps)
  • Verification checklist
  • Common pitfalls and solutions
  • Rollback procedure

Who should read this: Developers working on control_flow.rs Phase 4.

Estimated read time: 20 minutes


Quick Navigation

I want to...

Start the modularization

→ Read modularization-implementation-plan.md (full plan) → Use modularization-quick-start.md (checklist)

Understand the proposed structure

→ Read modularization-directory-structure.md (visual guide)

Work on Phase 4 (merge function)

→ Read phase4-merge-function-breakdown.md (detailed guide)

Get approval for the plan

→ Share modularization-implementation-plan.md (comprehensive) → Use modularization-directory-structure.md (visual support)

Estimate effort

→ See "Implementation Effort Breakdown" in modularization-implementation-plan.md

Assess risks

→ See "Risk Assessment" sections in modularization-implementation-plan.md


For Implementers (Developers)

  1. Quick Start - modularization-quick-start.md (10 min)
  2. Full Plan - modularization-implementation-plan.md (30 min)
  3. Phase 4 Guide - phase4-merge-function-breakdown.md (when ready for Phase 4)

For Reviewers (Team Leads)

  1. Visual Guide - modularization-directory-structure.md (15 min)
  2. Full Plan - modularization-implementation-plan.md (30 min)
  3. Quick Start - modularization-quick-start.md (verification commands)

For Stakeholders (Management)

  1. Executive Summary - First page of modularization-implementation-plan.md (5 min)
  2. Metrics Comparison - Tables in modularization-directory-structure.md (5 min)

Key Metrics

control_flow.rs

  • Lines: 1,632 → 1,850 (+13% for clarity)
  • Files: 1 → 19
  • Largest file: 1,632 → 180 (-89%)
  • Effort: 12.5 hours

generic_case_a.rs

  • Lines: 1,056 → 1,470 (+39% for clarity)
  • Files: 3 → 7
  • Largest file: 1,056 → 500 (-53%)
  • Effort: 3.5 hours

loopform_builder.rs

  • Lines: 1,166 → 1,450 (+24% for clarity)
  • Files: 5 → 11
  • Largest file: 1,166 → 200 (-83%)
  • Effort: 4 hours

Total

  • Lines: 3,854 → 4,770 (+24% for clarity, distributed across 37 files)
  • Files: 9 → 37
  • Total Effort: 20 hours (2-3 weeks)

Implementation Timeline

Week 1: control_flow.rs Phases 1-3 (Low Risk)

  • Monday: Phase 1 (Debug utilities) - 30 min
  • Tuesday: Phase 2 (Pattern lowerers) - 2 hours
  • Wednesday: Phase 3 (JoinIR routing) - 1.5 hours
  • Thursday-Friday: Verification and buffer

Deliverable: Pattern lowerers and routing isolated

Week 2: control_flow.rs Phase 4 (High Risk)

  • Monday-Tuesday: Phase 4 (merge function) - 6 hours
  • Wednesday: Buffer for issues
  • Thursday-Friday: Phases 5-7 (Exception, utils, cleanup) - 2.5 hours

Deliverable: control_flow.rs fully modularized

Week 3: generic_case_a.rs (Optional)

  • Monday-Tuesday: generic_case_a.rs Phases 1-5 - 3.5 hours
  • Wednesday: Buffer
  • Thursday-Friday: Documentation & final verification

Deliverable: generic_case_a.rs fully modularized

Future: loopform_builder.rs (After Pattern 4+)

  • Timing: After Pattern 4/5/6 development stabilizes
  • Effort: 4 hours
  • Priority: Lower (already partially modularized)

Success Criteria

Quantitative

  • All 267+ tests pass (no regressions)
  • Build time ≤ current (no increase)
  • Largest file < 250 lines (vs 1,632 before)
  • Average file size < 150 lines

Qualitative

  • Code is easier to navigate
  • New patterns can be added without modifying 1,600-line files
  • Debug traces remain functional
  • Documentation is clear and helpful

Process

  • Zero breaking changes at any phase
  • Each phase can be rolled back independently
  • Commits are small and focused
  • CI/CD passes after every commit

Verification Commands

Quick Verification (after each phase)

cargo build --release
cargo test --lib

Comprehensive Verification (after critical phases)

cargo build --release --all-features
cargo test --release
cargo clippy --all-targets
tools/smokes/v2/run.sh --profile quick

Debug Trace Verification (Phase 4 only)

NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako 2>&1 | grep "merge_joinir"

Emergency Rollback

control_flow.rs

rm -rf src/mir/builder/control_flow/
git checkout src/mir/builder/control_flow.rs
cargo build --release && cargo test --lib

generic_case_a.rs

rm -rf src/mir/join_ir/lowering/generic_case_a/
git checkout src/mir/join_ir/lowering/generic_case_a*.rs
cargo build --release && cargo test --lib

loopform_builder.rs

rm -rf src/mir/phi_core/loopform/
git checkout src/mir/phi_core/loopform*.rs
cargo build --release && cargo test --lib

Why Modularize?

Current Pain Points

  1. 714-line merge function - Impossible to understand without hours of study
  2. 1,632-line control_flow.rs - Pattern 4+ would add another 500+ lines
  3. Merge conflicts - Multiple developers editing the same giant file
  4. Hard to debug - NYASH_OPTION_C_DEBUG traces are buried in massive files
  5. Hard to test - Can't test individual phases in isolation

Benefits After Modularization

  1. 100-150 line modules - Easy to understand at a glance
  2. 19 focused files - Each with a single responsibility
  3. Isolated changes - Modify one phase without affecting others
  4. Easy debugging - Jump to specific module for traces
  5. Testable - Can unit test individual modules

ROI (Return on Investment)

  • Time investment: 20 hours (2-3 weeks)
  • Time saved: ~5 hours/month on maintenance (conservatively)
  • Breakeven: 4 months
  • Long-term benefit: Much easier Pattern 4/5/6 development

Implementation Order Justification

Why control_flow.rs First?

  1. Blocking Pattern 4+ - Currently blocking new pattern development
  2. Highest pain - 714-line merge function is the biggest code smell
  3. Sets the pattern - Establishes the modularization template for others
  4. Most benefit - Reduces merge conflicts immediately

Why generic_case_a.rs Second?

  1. High code deduplication - 4 similar lowerers can be separated
  2. Already partially split - Companion files already exist
  3. Medium priority - Not blocking, but would improve maintainability

Why loopform_builder.rs Last?

  1. Already partially modularized - Phase 191 did most of the work
  2. Lower priority - Not blocking anything
  3. Can wait - Best done after Pattern 4+ development stabilizes

Questions & Concerns

"Is this worth the effort?"

Yes. 20 hours investment for ongoing maintenance benefits. Breakeven in 4 months.

"Will this break anything?"

No. Zero breaking changes, backward compatible at every phase. Full test suite verification.

"Can we roll back if needed?"

Yes. Each phase can be rolled back independently with simple git commands.

"What if we only do control_flow.rs?"

Still valuable. That's where the highest pain is. Do that first, others can wait.

"Who should implement this?"

Experienced developer familiar with MIR builder and JoinIR integration. Phase 4 requires careful attention.


Next Steps

  1. Review this README - Understand the resources available
  2. Read the full plan - modularization-implementation-plan.md
  3. Get approval - Share with team leads
  4. Create a branch - refactor/modularize-control-flow
  5. Start Phase 1 - Use modularization-quick-start.md

Document Status

  • Created: 2025-12-05
  • Status: Ready for review and implementation
  • Maintainer: Claude Code (AI-assisted planning)
  • Next Review: After Week 1 completion

Feedback

If you have questions or suggestions about this modularization plan:

  1. Open an issue - Tag with refactoring label
  2. Update the plan - Submit a PR with improvements
  3. Document lessons learned - Add notes to this README

Contact: Open a discussion in the team channel


Happy Modularizing! 🚀