Files
hakorune/docs/development/refactoring/modularization-implementation-plan.md
nyash-codex 1730fea206 docs: Add comprehensive modularization implementation plan
Create 5-document modularization strategy for 3 large source files:
- control_flow.rs (1,632 lines → 19 modules)
- generic_case_a.rs (1,056 lines → 7 modules)
- loopform_builder.rs (1,166 lines → 11 modules)

Documents included:

1. **README.md** (352 lines)
   Navigation hub with overview, metrics, timeline, ROI analysis

2. **modularization-implementation-plan.md** (960 lines)
   Complete 20-hour implementation guide with:
   - Phase-by-phase breakdown (13 phases across 3 files)
   - Hour-by-hour effort estimates
   - Risk assessment matrices
   - Success criteria (quantitative & qualitative)
   - Public API changes (zero breaking changes)

3. **modularization-quick-start.md** (253 lines)
   Actionable checklist with:
   - Copy-paste verification commands
   - Emergency rollback procedures
   - Timeline and milestones

4. **modularization-directory-structure.md** (426 lines)
   Visual guide with:
   - Before/after directory trees
   - File size metrics
   - Import path examples
   - Code quality comparison

5. **phase4-merge-function-breakdown.md** (789 lines)
   Detailed implementation for most complex phase:
   - 714-line merge_joinir_mir_blocks() → 6 focused modules
   - 10 detailed implementation steps
   - Common pitfalls and solutions

Key metrics:
- Total effort: 20 hours (2-3 weeks)
- Files: 9 → 37 focused modules
- Largest file: 1,632 → 180 lines (-89%)
- Backward compatible: Zero breaking changes
- ROI: Breakeven in 4 months (5 hrs/month saved)

Priority: control_flow.rs Phase 1-4 (high impact)
Safety: Fully reversible at each phase

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 20:13:31 +09:00

28 KiB

Modularization Implementation Plan for Large Source Files

Executive Summary

This plan details the modularization of three oversized source files in the Nyash MIR builder:

  • control_flow.rs (1,632 lines) - HIGHEST PRIORITY
  • generic_case_a.rs (1,056 lines) - MEDIUM PRIORITY
  • loopform_builder.rs (1,166 lines) - LOWER PRIORITY

The strategy prioritizes control_flow.rs first due to its critical role in JoinIR integration and ongoing Pattern 4+ development. Each modularization is broken into incremental phases that maintain backward compatibility, with clear verification points and rollback procedures.

Total Estimated Effort: 15-20 hours across 2-3 weeks Key Principle: Zero breaking changes, backward compatible at every phase


Why control_flow.rs First?

control_flow.rs is the integration point for JoinIR lowering patterns, actively being extended for Pattern 4+. Modularizing now:

  1. Prevents future pain - Pattern 4/5/6 would add another 500+ lines to an already massive file
  2. Sets the pattern - Establishes the modularization template for other files
  3. Reduces merge conflicts - Isolates pattern-specific changes to dedicated files
  4. Improves debuggability - NYASH_OPTION_C_DEBUG traces become easier to locate
  5. Currently blocking - The 714-line merge_joinir_mir_blocks() function is a code smell that makes maintenance difficult

1. control_flow.rs Modularization (HIGHEST PRIORITY)

Current State

File: src/mir/builder/control_flow.rs (1,632 lines)

Functions (13 total):

  • trace_varmap() (8 lines) - Debug utility
  • cf_block() (4 lines) - Block entry point
  • cf_if() (10 lines) - If entry point
  • cf_loop() (80 lines) - Loop entry point
  • try_cf_loop_joinir() (91 lines) - JoinIR routing logic
  • cf_loop_joinir_impl() (247 lines) - JoinIR pattern dispatcher
  • cf_loop_pattern1_minimal() (143 lines) - Pattern 1 lowering
  • cf_loop_pattern2_with_break() (120 lines) - Pattern 2 lowering
  • cf_loop_pattern3_with_if_phi() (168 lines) - Pattern 3 lowering
  • merge_joinir_mir_blocks() (714 lines) - LARGEST FUNCTION ⚠️
  • cf_try_catch() (138 lines) - Exception handling
  • extract_loop_variable_from_condition() (31 lines) - Utility
  • cf_throw() (23 lines) - Throw entry point

Key Issues:

  • merge_joinir_mir_blocks() is 714 lines (44% of the file!)
  • Pattern lowerers (1/2/3) are isolated but scattered
  • JoinIR integration logic mixed with entry points
  • No clear separation between routing and implementation

Proposed New Structure

src/mir/builder/control_flow/
├── mod.rs                          (~150 lines) - Public API, entry points
├── debug.rs                        (~50 lines)  - Debug utilities (trace_varmap)
├── joinir/
│   ├── mod.rs                      (~100 lines) - JoinIR integration coordinator
│   ├── routing.rs                  (~150 lines) - try_cf_loop_joinir, dispatcher
│   ├── merge/
│   │   ├── mod.rs                  (~100 lines) - merge_joinir_mir_blocks entry point
│   │   ├── id_remapper.rs          (~150 lines) - ValueId/BlockId remapping
│   │   ├── block_allocator.rs      (~100 lines) - Block ID allocation
│   │   ├── value_collector.rs      (~100 lines) - Value collection phase
│   │   ├── instruction_rewriter.rs (~150 lines) - Instruction rewriting
│   │   └── exit_phi_builder.rs     (~100 lines) - Exit PHI construction
│   └── patterns/
│       ├── mod.rs                  (~50 lines)  - Pattern dispatcher
│       ├── pattern1_minimal.rs     (~150 lines) - Pattern 1 lowering
│       ├── pattern2_with_break.rs  (~130 lines) - Pattern 2 lowering
│       └── pattern3_with_if_phi.rs (~180 lines) - Pattern 3 lowering
├── exception/
│   ├── mod.rs                      (~50 lines)  - Exception handling API
│   ├── try_catch.rs                (~150 lines) - try/catch implementation
│   └── throw.rs                    (~30 lines)  - throw implementation
└── utils.rs                        (~50 lines)  - extract_loop_variable, etc.

Total: ~1,850 lines (13% increase for clarity, but distributed across 19 files)
Average file size: ~97 lines (vs 1,632 lines monolith)

Phase-by-Phase Migration Plan

Phase 1: Extract Debug Utilities (30 min)

Goal: Move trace_varmap() to a dedicated debug module.

Steps:

  1. Create src/mir/builder/control_flow/debug.rs
  2. Move trace_varmap() implementation
  3. Update mod.rs to re-export pub(super) use debug::*;
  4. Run verification

Files Created:

  • src/mir/builder/control_flow/debug.rs (~50 lines)

Files Modified:

  • src/mir/builder/control_flow.rs.../control_flow/mod.rs

Verification:

cargo build --release
cargo test --lib
tools/smokes/v2/run.sh --profile quick --filter "loop_*"

Rollback: Delete debug.rs, revert imports in mod.rs

Estimated Effort: 30 minutes


Phase 2: Extract Pattern Lowerers (2 hours)

Goal: Move Pattern 1/2/3 lowering functions to dedicated files.

Steps:

  1. Create src/mir/builder/control_flow/joinir/patterns/ directory
  2. Create patterns/mod.rs with dispatcher
  3. Move cf_loop_pattern1_minimal() to pattern1_minimal.rs
  4. Move cf_loop_pattern2_with_break() to pattern2_with_break.rs
  5. Move cf_loop_pattern3_with_if_phi() to pattern3_with_if_phi.rs
  6. Update imports in mod.rs
  7. Run verification

Files Created:

  • control_flow/joinir/patterns/mod.rs (~50 lines)
  • control_flow/joinir/patterns/pattern1_minimal.rs (~150 lines)
  • control_flow/joinir/patterns/pattern2_with_break.rs (~130 lines)
  • control_flow/joinir/patterns/pattern3_with_if_phi.rs (~180 lines)

Public API Changes: None (all functions are already fn, not pub(super) fn)

Verification:

cargo build --release
cargo test --lib -- --include-ignored
NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako
tools/smokes/v2/run.sh --profile quick --filter "loop_*"

Rollback: Delete patterns/ directory, revert imports

Estimated Effort: 2 hours


Phase 3: Extract JoinIR Routing Logic (1.5 hours)

Goal: Move try_cf_loop_joinir() and cf_loop_joinir_impl() to routing module.

Steps:

  1. Create control_flow/joinir/routing.rs
  2. Move try_cf_loop_joinir() implementation
  3. Move cf_loop_joinir_impl() implementation
  4. Update imports in mod.rs
  5. Run verification

Files Created:

  • control_flow/joinir/routing.rs (~150 lines)

Files Modified:

  • control_flow/mod.rs (update imports)

Verification:

cargo build --release
cargo test --release
HAKO_JOINIR_PRINT_TOKENS_MAIN=1 ./target/release/nyash test_program.hako

Rollback: Delete routing.rs, revert imports

Estimated Effort: 1.5 hours


Phase 4: Break Down merge_joinir_mir_blocks (6 hours) ⚠️ CRITICAL

Goal: Split the 714-line monster function into 6 logical modules.

Background: merge_joinir_mir_blocks() performs 6 distinct phases:

  1. Block ID allocation (lines 864-923)
  2. Value collection (lines 931-971)
  3. Block merging (lines 973-1100)
  4. Instruction rewriting (lines 1102-1400)
  5. Exit PHI construction (lines 1402-1500)
  6. Boundary reconnection (lines 1502-1578)

Steps:

  1. Create control_flow/joinir/merge/ directory structure
  2. Extract id_remapper.rs - ID remapping utilities
  3. Extract block_allocator.rs - Block ID allocation logic
  4. Extract value_collector.rs - Value collection phase
  5. Extract instruction_rewriter.rs - Instruction transformation
  6. Extract exit_phi_builder.rs - Exit PHI construction
  7. Create merge/mod.rs as the coordinator
  8. Update imports
  9. Run comprehensive verification

Files Created:

  • control_flow/joinir/merge/mod.rs (~100 lines) - Coordinator
  • control_flow/joinir/merge/id_remapper.rs (~150 lines)
  • control_flow/joinir/merge/block_allocator.rs (~100 lines)
  • control_flow/joinir/merge/value_collector.rs (~100 lines)
  • control_flow/joinir/merge/instruction_rewriter.rs (~150 lines)
  • control_flow/joinir/merge/exit_phi_builder.rs (~100 lines)

Public API:

// control_flow/joinir/merge/mod.rs
pub(in crate::mir::builder) fn merge_joinir_mir_blocks(
    builder: &mut MirBuilder,
    mir_module: &MirModule,
    boundary: Option<&JoinInlineBoundary>,
    debug: bool,
) -> Result<Option<ValueId>, String> {
    // Coordinator that calls the 6 sub-modules
    let remapper = id_remapper::create_remapper(builder, mir_module, debug)?;
    let values = value_collector::collect_values(mir_module, &remapper, debug)?;
    // ... etc
}

Verification (CRITICAL - run ALL tests):

# Step 1: Build verification
cargo build --release
cargo test --lib

# Step 2: Smoke tests (ALL patterns)
tools/smokes/v2/run.sh --profile quick

# Step 3: Debug trace verification
NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako 2>&1 | grep "merge_joinir"

# Step 4: Regression check (run 3 times for determinism)
for i in 1 2 3; do
  echo "=== Run $i ==="
  cargo test --release test_loop_patterns 2>&1 | grep "test result"
done

# Step 5: Full integration test
cargo test --release --all-features

Rollback:

# If anything breaks, immediately:
rm -rf src/mir/builder/control_flow/joinir/merge
git checkout src/mir/builder/control_flow.rs
cargo build --release && cargo test

Risk Mitigation:

  • Keep the original merge_joinir_mir_blocks() as a comment at the top of merge/mod.rs
  • Add #[cfg(test)] unit tests for each sub-module
  • Use feature flag NYASH_USE_LEGACY_MERGE=1 for emergency fallback (optional)

Estimated Effort: 6 hours (most complex phase)


Phase 5: Extract Exception Handling (1 hour)

Goal: Move cf_try_catch() and cf_throw() to exception module.

Steps:

  1. Create control_flow/exception/ directory
  2. Move cf_try_catch() to try_catch.rs
  3. Move cf_throw() to throw.rs
  4. Create exception/mod.rs as coordinator
  5. Update imports
  6. Run verification

Files Created:

  • control_flow/exception/mod.rs (~50 lines)
  • control_flow/exception/try_catch.rs (~150 lines)
  • control_flow/exception/throw.rs (~30 lines)

Verification:

cargo build --release
cargo test --lib -- exception

Rollback: Delete exception/ directory, revert imports

Estimated Effort: 1 hour


Phase 6: Extract Utilities (30 min)

Goal: Move extract_loop_variable_from_condition() to utils module.

Steps:

  1. Create control_flow/utils.rs
  2. Move utility functions
  3. Update imports
  4. Run verification

Files Created:

  • control_flow/utils.rs (~50 lines)

Verification:

cargo build --release
cargo test --lib

Rollback: Delete utils.rs, revert imports

Estimated Effort: 30 minutes


Phase 7: Final Cleanup & Documentation (1 hour)

Goal: Clean up mod.rs, add module documentation, verify all imports.

Steps:

  1. Review control_flow/mod.rs for clarity
  2. Add module-level documentation to each file
  3. Ensure all pub(super) visibility is correct
  4. Run final comprehensive verification
  5. Update CLAUDE.md with new structure

Documentation Template:

//! Pattern 1 Minimal Loop Lowering
//!
//! This module implements the simplest JoinIR loop lowering pattern:
//! - Single loop variable
//! - No break statements
//! - Simple condition (i < N)
//!
//! Used by: minimal_ssa_skip_ws, simple while loops
//! Phase: 188 (Pattern 1 implementation)

Verification:

cargo build --release --all-features
cargo test --release
cargo clippy --all-targets
tools/smokes/v2/run.sh --profile integration

Estimated Effort: 1 hour


Public API Changes

Before (control_flow.rs)

impl MirBuilder {
    pub(super) fn cf_block(...) -> Result<ValueId, String>
    pub(super) fn cf_if(...) -> Result<ValueId, String>
    pub(super) fn cf_loop(...) -> Result<ValueId, String>
    pub(super) fn cf_try_catch(...) -> Result<ValueId, String>
    pub(super) fn cf_throw(...) -> Result<ValueId, String>
}

After (control_flow/mod.rs)

// Re-export all entry points (NO CHANGE to public API)
pub(super) use entry_points::{cf_block, cf_if, cf_loop, cf_try_catch, cf_throw};

// Internal modules (not exposed outside control_flow)
mod debug;
mod utils;
mod joinir;
mod exception;
mod entry_points;

Guarantee: Zero breaking changes - all pub(super) functions remain accessible from MirBuilder.


Build Verification Strategy

After Each Phase:

# 1. Compilation check
cargo build --release
echo "Build: $?" >> /tmp/modularization_log.txt

# 2. Unit tests
cargo test --lib
echo "Unit tests: $?" >> /tmp/modularization_log.txt

# 3. Integration tests
cargo test --release
echo "Integration tests: $?" >> /tmp/modularization_log.txt

# 4. Smoke tests (only for Phases 2-4)
tools/smokes/v2/run.sh --profile quick --filter "loop_*"
echo "Smoke tests: $?" >> /tmp/modularization_log.txt

# 5. Debug trace verification (for Phase 4)
NYASH_OPTION_C_DEBUG=1 ./target/release/nyash apps/tests/loop_min_while.hako 2>&1 | tee /tmp/debug_trace.txt
grep -q "merge_joinir" /tmp/debug_trace.txt && echo "Debug trace: OK"

Failure Handling:

# If any step fails, STOP immediately
if [ $? -ne 0 ]; then
    echo "FAILURE detected at Phase $PHASE"
    git status
    echo "Run rollback procedure? (y/n)"
    # ... rollback steps
fi

Risk Assessment

Risk Likelihood Impact Mitigation Detection
Breaking imports Medium High Incremental phases, test after each cargo build fails
Merge function breakage Low Critical Keep original as comment, feature flag Smoke tests fail
Performance regression Very Low Medium No algorithmic changes Benchmark before/after
Debug trace changes Low Low Verify NYASH_OPTION_C_DEBUG output Manual inspection
HashMap non-determinism Very Low Low Already using BTreeMap in critical paths Run tests 3x

Critical Mitigations:

  1. Phase 4 (merge function): Keep original function as comment
  2. All phases: Commit after each successful phase
  3. Rollback plan: Document exact git checkout commands for each file

Implementation Effort Breakdown

Phase Description Effort Risk
Phase 1 Debug utilities 30 min Low
Phase 2 Pattern lowerers 2 hours Low
Phase 3 JoinIR routing 1.5 hours Low
Phase 4 merge_joinir_mir_blocks 6 hours Medium
Phase 5 Exception handling 1 hour Low
Phase 6 Utilities 30 min Low
Phase 7 Cleanup & docs 1 hour Low
Total control_flow.rs 12.5 hours -

Recommended Schedule:

  • Week 1: Phases 1-3 (4 hours total) - Low risk warmup
  • Week 2: Phase 4 (6 hours) - Dedicated time for merge function
  • Week 3: Phases 5-7 (2.5 hours) - Final cleanup

Success Criteria

  • All 267 tests pass (no regressions)
  • Build time ≤ current (no increase)
  • control_flow/mod.rs is < 200 lines (88% reduction)
  • Largest single file is < 200 lines (vs 714 lines before)
  • Debug traces still work (NYASH_OPTION_C_DEBUG=1)
  • Smoke tests pass for all patterns (1/2/3)
  • No HashMap non-determinism introduced
  • Code is easier to navigate (measured by developer feedback)

2. generic_case_a.rs Modularization (MEDIUM PRIORITY)

Current State

File: src/mir/join_ir/lowering/generic_case_a.rs (1,056 lines)

Functions (4 public lowerers):

  • lower_case_a_skip_ws_with_scope() + lower_case_a_skip_ws_core() (~203 lines)
  • lower_case_a_trim_with_scope() + lower_case_a_trim_core() (~479 lines)
  • lower_case_a_append_defs_with_scope() + lower_case_a_append_defs_core() (~156 lines)
  • lower_case_a_stage1_usingresolver_with_scope() + lower_case_a_stage1_usingresolver_core() (~167 lines)

Key Observation: This file already has companion files:

  • generic_case_a_entry_builder.rs (4,828 bytes)
  • generic_case_a_whitespace_check.rs (4,552 bytes)

Strategy: Complete the modularization pattern by splitting the 4 lowerers into separate files.


Proposed New Structure

src/mir/join_ir/lowering/generic_case_a/
├── mod.rs                        (~100 lines) - Public API
├── skip_ws.rs                    (~220 lines) - skip_ws lowerer
├── trim.rs                       (~500 lines) - trim lowerer
├── append_defs.rs                (~170 lines) - append_defs lowerer
├── stage1_using_resolver.rs      (~180 lines) - stage1 using resolver
├── entry_builder.rs              (~150 lines) - (moved from parent)
└── whitespace_check.rs           (~150 lines) - (moved from parent)

Total: ~1,470 lines (39% increase for clarity, distributed across 7 files)
Average: ~210 lines per file

Phase-by-Phase Migration Plan

Phase 1: Create Directory Structure (15 min)

Steps:

  1. Create src/mir/join_ir/lowering/generic_case_a/ directory
  2. Create mod.rs with public API exports
  3. Move existing companion files into directory
  4. Update parent mod.rs imports
  5. Run verification

Verification:

cargo build --release

Rollback: Delete directory, revert parent mod.rs

Estimated Effort: 15 minutes


Phase 2: Extract skip_ws Lowerer (45 min)

Steps:

  1. Create generic_case_a/skip_ws.rs
  2. Move lower_case_a_skip_ws_with_scope() and _core()
  3. Add module documentation
  4. Update mod.rs imports
  5. Run verification

Verification:

cargo build --release
cargo test --lib -- skip_ws

Estimated Effort: 45 minutes


Phase 3: Extract trim Lowerer (1 hour)

Steps:

  1. Create generic_case_a/trim.rs
  2. Move lower_case_a_trim_with_scope() and _core()
  3. Add documentation
  4. Update imports
  5. Run verification

Verification:

cargo build --release
cargo test --lib -- trim

Estimated Effort: 1 hour


Phase 4: Extract append_defs & stage1 Lowerers (1 hour)

Steps:

  1. Create generic_case_a/append_defs.rs
  2. Create generic_case_a/stage1_using_resolver.rs
  3. Move respective functions
  4. Add documentation
  5. Update imports
  6. Run verification

Verification:

cargo build --release
cargo test --release
tools/smokes/v2/run.sh --profile quick --filter "funcscanner_*"

Estimated Effort: 1 hour


Phase 5: Final Cleanup (30 min)

Steps:

  1. Add module-level documentation
  2. Verify all imports are clean
  3. Run comprehensive tests
  4. Update documentation

Verification:

cargo build --release --all-features
cargo test --release

Estimated Effort: 30 minutes


Public API Changes

Before

// src/mir/join_ir/lowering/generic_case_a.rs
pub(crate) fn lower_case_a_skip_ws_with_scope(...) -> Option<JoinModule>
pub(crate) fn lower_case_a_trim_with_scope(...) -> Option<JoinModule>
// ... etc

After

// src/mir/join_ir/lowering/generic_case_a/mod.rs
pub(crate) use skip_ws::lower_case_a_skip_ws_with_scope;
pub(crate) use trim::lower_case_a_trim_with_scope;
pub(crate) use append_defs::lower_case_a_append_defs_with_scope;
pub(crate) use stage1_using_resolver::lower_case_a_stage1_usingresolver_with_scope;

Guarantee: Zero breaking changes - all pub(crate) functions remain accessible.


Implementation Effort Breakdown

Phase Description Effort Risk
Phase 1 Directory setup 15 min Low
Phase 2 skip_ws lowerer 45 min Low
Phase 3 trim lowerer 1 hour Low
Phase 4 append_defs & stage1 1 hour Low
Phase 5 Cleanup 30 min Low
Total generic_case_a.rs 3.5 hours -

Success Criteria

  • All tests pass
  • generic_case_a/mod.rs is < 150 lines
  • Each lowerer is in a dedicated file
  • Companion files integrated into directory
  • Documentation added to all modules

3. loopform_builder.rs Modularization (LOWER PRIORITY)

Current State

File: src/mir/phi_core/loopform_builder.rs (1,166 lines)

Status: Already partially modularized in Phase 191!

Existing Structure (Phase 191):

src/mir/phi_core/
├── loopform_builder.rs           (1,166 lines) - Main coordinator
├── loopform_context.rs            - ValueId management
├── loopform_variable_models.rs    - CarrierVariable, PinnedVariable
├── loopform_utils.rs              - Debug and bypass utilities
├── loopform_exit_phi.rs           - Exit PHI builder

Remaining Work: The main loopform_builder.rs still contains implementation logic that should be moved to dedicated modules.


Proposed New Structure

src/mir/phi_core/loopform/
├── mod.rs                        (~100 lines) - Public API
├── context.rs                    (~150 lines) - (existing loopform_context.rs)
├── variable_models.rs            (~150 lines) - (existing loopform_variable_models.rs)
├── utils.rs                      (~100 lines) - (existing loopform_utils.rs)
├── exit_phi.rs                   (~150 lines) - (existing loopform_exit_phi.rs)
├── passes/
│   ├── mod.rs                    (~50 lines)  - 4-pass architecture coordinator
│   ├── pass1_discovery.rs        (~150 lines) - Variable discovery
│   ├── pass2_header_phi.rs       (~150 lines) - Header PHI construction
│   ├── pass3_latch.rs            (~100 lines) - Latch block processing
│   └── pass4_exit_phi.rs         (~150 lines) - Exit PHI construction
└── builder_core.rs               (~200 lines) - Core builder logic

Total: ~1,450 lines (24% increase for clarity, distributed across 11 files)
Average: ~132 lines per file

Phase-by-Phase Migration Plan

Phase 1: Directory Structure (30 min)

Steps:

  1. Create src/mir/phi_core/loopform/ directory
  2. Move existing modular files into directory
  3. Create mod.rs with re-exports
  4. Update parent mod.rs imports
  5. Run verification

Verification:

cargo build --release
cargo test --lib -- loopform

Estimated Effort: 30 minutes


Phase 2: Extract 4-Pass Architecture (2 hours)

Steps:

  1. Create loopform/passes/ directory
  2. Identify the 4 passes in loopform_builder.rs
  3. Extract each pass to dedicated file
  4. Create passes/mod.rs as coordinator
  5. Update imports
  6. Run verification

Verification:

cargo build --release
cargo test --release -- loopform
NYASH_LOOPFORM_DEBUG=1 ./target/release/nyash test_loop.hako

Estimated Effort: 2 hours


Phase 3: Extract Core Builder Logic (1 hour)

Steps:

  1. Create loopform/builder_core.rs
  2. Move remaining builder logic
  3. Update imports
  4. Run verification

Verification:

cargo build --release
cargo test --release

Estimated Effort: 1 hour


Phase 4: Final Cleanup (30 min)

Steps:

  1. Add module documentation
  2. Verify all re-exports
  3. Run comprehensive tests

Verification:

cargo build --release --all-features
cargo test --release
tools/smokes/v2/run.sh --profile quick --filter "phi_*"

Estimated Effort: 30 minutes


Public API Changes

Before

// src/mir/phi_core/loopform_builder.rs
pub use loopform_context::LoopFormContext;
pub use loopform_variable_models::{CarrierVariable, PinnedVariable};
pub fn build_exit_phis_for_control<O: LoopFormOps>(...)

After

// src/mir/phi_core/loopform/mod.rs
pub use context::LoopFormContext;
pub use variable_models::{CarrierVariable, PinnedVariable};
pub use exit_phi::build_exit_phis_for_control;

Guarantee: Zero breaking changes.


Implementation Effort Breakdown

Phase Description Effort Risk
Phase 1 Directory setup 30 min Low
Phase 2 4-pass extraction 2 hours Medium
Phase 3 Core builder 1 hour Low
Phase 4 Cleanup 30 min Low
Total loopform_builder.rs 4 hours -

Success Criteria

  • All tests pass
  • loopform/mod.rs is < 150 lines
  • Each pass is in a dedicated file
  • Existing modular files integrated
  • Documentation added

Implementation Order & Timeline

Week 1: control_flow.rs Phases 1-3 (Low Risk)

  • Monday: Phase 1 (Debug utilities) - 30 min
  • Tuesday: Phase 2 (Pattern lowerers) - 2 hours
  • Wednesday: Phase 3 (JoinIR routing) - 1.5 hours
  • Verification: Run full smoke tests at end of week

Week 2: control_flow.rs Phase 4 (High Risk)

  • Monday-Tuesday: Phase 4 (merge_joinir_mir_blocks) - 6 hours
  • Wednesday: Buffer day for fixing any issues
  • Thursday-Friday: Phases 5-7 (Exception, utilities, cleanup) - 2.5 hours

Week 3: generic_case_a.rs (Optional)

  • Monday-Tuesday: generic_case_a.rs Phases 1-5 - 3.5 hours
  • Wednesday: Buffer
  • Thursday-Friday: Documentation & final verification

Future (After Pattern 4+): loopform_builder.rs

  • Timing: After Pattern 4/5/6 development stabilizes
  • Effort: 4 hours
  • Priority: Lower (already partially modularized)

Effort Summary

File Total Effort Priority Complexity Blocking?
control_flow.rs 12.5 hours HIGHEST High Yes (Pattern 4+)
generic_case_a.rs 3.5 hours MEDIUM Low No
loopform_builder.rs 4 hours LOWER Medium No
TOTAL 20 hours - - -

Risk/Mitigation Matrix

Risk Likelihood Impact Mitigation Detection Method
Breaking imports Medium High Incremental phases, test after each cargo build fails
Merge function breakage Low Critical Keep original as comment, feature flag Smoke tests fail
Pattern lowerer breakage Low High Test each pattern independently Integration tests
Performance regression Very Low Medium No algorithmic changes Benchmark suite
Debug trace changes Low Low Verify NYASH_OPTION_C_DEBUG output Manual inspection
Test failures Low Medium Run tests after every phase CI/CD pipeline
Merge conflicts Medium Low Work on dedicated branch Git status

Success Criteria (Global)

Quantitative:

  • All 267+ tests pass (no regressions)
  • Build time ≤ current (no increase)
  • Largest single file in modularized areas is < 250 lines
  • Average file size in modularized areas is < 150 lines

Qualitative:

  • Code is easier to navigate (developer feedback)
  • New patterns can be added without modifying 1,600-line files
  • Debug traces remain functional
  • Documentation is clear and helpful

Process:

  • Zero breaking changes at any phase
  • Each phase can be rolled back independently
  • Commits are small and focused
  • CI/CD passes after every commit

Appendix: Emergency Rollback Procedure

If anything goes wrong during modularization:

# 1. Identify the problematic phase
echo "ROLLBACK: Phase $PHASE failed"

# 2. Check git status
git status

# 3. Rollback created files
rm -rf src/mir/builder/control_flow/  # (example)

# 4. Restore original file
git checkout src/mir/builder/control_flow.rs

# 5. Verify build
cargo build --release
cargo test --lib

# 6. Document the issue
echo "$(date): Phase $PHASE rollback due to: $REASON" >> docs/development/refactoring/rollback_log.txt

# 7. Review and adjust plan
# - Was the failure expected?
# - Do we need to adjust the approach?
# - Should we skip this phase?

Conclusion

This modularization plan provides a safe, incremental path to breaking down 3 large files into maintainable modules. The approach prioritizes:

  1. Zero breaking changes - Backward compatible at every step
  2. Clear verification - Test after each phase
  3. Easy rollback - Can undo any phase if issues arise
  4. Pattern setting - control_flow.rs establishes the template for others

Next Steps:

  1. Review this plan with the team
  2. Get approval for Week 1 (control_flow.rs Phases 1-3)
  3. Create a dedicated branch: refactor/modularize-control-flow
  4. Begin Phase 1!

Questions? Concerns? Open an issue or discuss in the team channel.


Document Version: 1.0 Created: 2025-12-05 Author: Claude Code (AI-assisted planning) Status: Draft - Awaiting Review