7.9 KiB
🚀 Issue: Nyash Codebase File Size Refactoring - Strategic Module Organization
Priority: 🔧 MEDIUM - Code maintainability and future development efficiency
Impact: Improved code organization, maintainability, and development velocity
Status: Ready for systematic refactoring implementation
📊 Current File Size Problems
🚨 Critical Oversized Files
1. main.rs: 1,490 lines (MASSIVE - Entry point bloat!)
2. parser/mod.rs: 1,461 lines (Parser complexity explosion)
3. box_trait.rs: 1,456 lines (Box trait definition inflation)
4. interpreter/expressions.rs: 1,166 lines (Expression engine complexity)
5. mir/builder.rs: 1,107 lines (MIR construction complexity)
Total Impact: 35,810 lines, 2,037 functions, 587 classes - Maintenance nightmare level
🎯 Refactoring Strategy (Gemini AI Expert Analysis)
Staged Approach - Risk Mitigation Priority Order
Stage 1: main.rs (1,490 lines) - HIGHEST PRIORITY
Rationale: Entry point with loose coupling - safest to refactor first
Current Problems:
- CLI argument parsing mixed with execution logic
- Backend selection (Interpreter/VM/WASM/AOT) embedded in main
- Benchmark functionality scattered throughout
- Debug options handling integrated with core logic
Proposed Split:
// NEW FILE: src/cli.rs
// - All clap-based CLI argument definitions and parsing
// - Command-line option structures
// - Help text generation and validation
// NEW FILE: src/runner.rs
// - Backend selection logic (Interpreter/VM/WASM/AOT)
// - File execution coordination
// - Benchmark runner implementation
// - REPL mode handling
// UPDATED: main.rs (target: <100 lines)
// - Thin entry point only
// - Call cli.rs for argument parsing
// - Pass results to runner.rs for execution
Stage 2: box_trait.rs (1,456 lines) - HIGH PRIORITY
Rationale: Everything is Box core - systematic categorization possible
Current Problems:
- All 16 Box types crammed into single file
- Trait definitions mixed with implementations
- No logical grouping by functionality
Proposed Reorganization:
// NEW FILE: src/boxes/traits.rs
// - NyashBox core trait definition
// - BoxCore trait and shared interfaces
// - Common box behavior abstractions
// NEW FILE: src/boxes/primitives.rs
// - IntegerBox, StringBox, FloatBox, BoolBox
// - Basic data type implementations
// NEW FILE: src/boxes/collections.rs
// - ArrayBox, MapBox, HashBox
// - Collection-based Box implementations
// NEW FILE: src/boxes/io.rs
// - SocketBox, FileBox, HTTPBox
// - I/O related Box implementations
// NEW FILE: src/boxes/system.rs
// - ConsoleBox, DebugBox, TimeBox
// - System interaction Box implementations
// UPDATED: src/boxes/mod.rs
// - Module declarations and public API facade
// - Unified Box registration and management
Stage 3: parser/mod.rs (1,461 lines) - MEDIUM PRIORITY
Rationale: Self-contained parsing logic - clear separation boundaries
Proposed Split:
// NEW FILE: src/parser/expressions.rs
// - Binary operations, method calls, if expressions
// - All expression parsing logic
// NEW FILE: src/parser/statements.rs
// - let bindings, return statements, loop constructs
// - All statement parsing logic
// NEW FILE: src/parser/literals.rs
// - Number, string, array literal parsing
// - Literal value construction
// NEW FILE: src/parser/common.rs
// - Whitespace/comment skipping utilities
// - Common parser helper functions
// UPDATED: src/parser/mod.rs
// - Module orchestration and public API
// - Top-level parse function coordination
Stage 4: interpreter/expressions.rs (1,166 lines)
Rationale: Mirror parser structure for consistency
Proposed Split:
// NEW FILE: src/interpreter/eval_operations.rs
// - Binary/unary operator evaluation
// - Arithmetic and logical operations
// NEW FILE: src/interpreter/eval_calls.rs
// - Method call resolution and execution
// - Function call handling
// NEW FILE: src/interpreter/eval_control_flow.rs
// - if expression evaluation
// - loop and control flow handling
// UPDATED: src/interpreter/expressions.rs
// - Evaluation dispatcher and coordinator
// - Expression type routing
Stage 5: mir/builder.rs (1,107 lines)
Rationale: Complex but structured - AST node correspondence
Proposed Split:
// NEW FILE: src/mir/builder/expressions.rs
// - AST expression nodes → MIR instructions
// - Expression-specific MIR generation
// NEW FILE: src/mir/builder/statements.rs
// - AST statement nodes → MIR instructions
// - Statement-specific MIR generation
// NEW FILE: src/mir/builder/variables.rs
// - Variable binding and scope management
// - MIR variable lifecycle handling
// UPDATED: src/mir/builder.rs
// - MirBuilder struct definition
// - Top-level coordination and delegation
🎯 Implementation Requirements
Architecture Preservation
- ✅ Everything is Box philosophy: Maintain unified Box abstraction
- ✅ Arc threading: Preserve thread-safety model
- ✅ Four backend support: Keep Interpreter/VM/WASM/AOT compatibility
- ✅ 16 Box types: Ensure all existing Box functionality preserved
Quality Standards
- No functionality changes: Pure refactoring - no behavior modification
- Compile guarantee: Each stage must compile successfully before next stage
- Test preservation: All existing tests must continue passing
- Import cleanup: Remove unused imports revealed by modularization
Rust Best Practices Compliance
- Module system: Follow Rust conventional module organization
- Public API design: Minimize exposed implementation details
- Documentation: Add module-level documentation for new files
- Error handling: Maintain existing error propagation patterns
🧪 Validation Requirements
After Each Stage
# Compilation check
cargo check --all-targets
cargo build --release
# Functionality verification
./target/release/nyash test_comprehensive_operators.hako
./target/release/nyash app_dice_rpg.hako
./target/release/nyash --benchmark --iterations 10
# Regression testing
cargo test
File Size Targets (Post-Refactoring)
main.rs: 1,490 → <100 lines (15x reduction)
parser/mod.rs: 1,461 → <200 lines (7x reduction)
box_trait.rs: 1,456 → REMOVED (distributed to boxes/* modules)
interpreter/expressions.rs: 1,166 → <300 lines (4x reduction)
mir/builder.rs: 1,107 → <250 lines (4x reduction)
📝 Reporting Requirements
Progressive Reporting
- Stage completion: Report each stage completion with file size metrics
- Issue discovery: Report any architectural issues discovered during refactoring
- Import optimization: Document removed unused imports and dependencies
- Performance impact: Measure compilation time changes per stage
Before/After Analysis
- File count: Document new files created and their responsibilities
- Module dependencies: Show new module dependency graph
- API changes: List any public API modifications (should be none)
- Build time: Measure compilation performance impact
🚀 Expected Benefits
Developer Experience
- Navigation: Faster code navigation and understanding
- Maintenance: Isolated changes with minimal side effects
- Collaboration: Multiple developers can work on different modules simultaneously
- Testing: More focused unit testing capabilities
Future Development
- Extensibility: Easier to add new Box types and functionality
- Debugging: Clearer separation of concerns for troubleshooting
- Refactoring: Future refactoring becomes safer and more targeted
🎯 This refactoring is essential for long-term maintainability and development velocity. The staged approach minimizes risk while maximizing organizational benefits.
📋 Critical: Start with Stage 1 (main.rs) as it provides the highest safety margin and immediate developer experience improvement.