Files
hakorune/REFACTORING_ANALYSIS_REPORT.md

6.8 KiB

Nyash Codebase Refactoring Analysis Report

Executive Summary

This report analyzes the Nyash codebase for refactoring opportunities, focusing on large files, duplicate patterns, plugin loader consolidation, and dead code removal. The analysis was conducted manually using grep, find, and file analysis tools due to Docker unavailability for nekocode.

Key Findings

1. Large Files Requiring Splitting

Largest Files (1000+ lines):

  • src/mir/builder.rs (1517 lines) - MIR builder with mixed responsibilities
  • src/interpreter/plugin_loader.rs (1217 lines) - Plugin loading logic
  • src/interpreter/expressions/calls.rs (1016 lines) - Method call handling
  • src/interpreter/core.rs (1012 lines) - Core interpreter functionality
  • src/ast.rs (1012 lines) - AST definitions and implementations
  • src/backend/vm.rs (991 lines) - VM execution engine
  • src/runtime/plugin_loader_v2.rs (906 lines) - V2 plugin loader

2. Plugin Loader Consolidation Opportunities

Multiple Plugin Loading Systems Detected:

  • src/interpreter/plugin_loader.rs - Original plugin loader
  • src/runtime/plugin_loader_v2.rs - V2 plugin loader
  • src/runtime/plugin_loader_legacy.rs - Legacy plugin loader
  • src/bin/test_plugin_loader_v2.rs - Test implementation

Issues Identified:

  • Code duplication across 3+ plugin loader implementations
  • Inconsistent FFI handling patterns
  • Redundant memory management in FileBoxHandle, MathBoxHandle, RandomBoxHandle
  • Complex TLV handling duplicated across loaders

3. Duplicate Code Patterns

Box Implementation Patterns:

  • 67 impl NyashBox implementations with similar structure
  • Repeated constructor patterns (.new() implementations)
  • Common drop/finalization patterns across Box types
  • Similar error handling patterns across modules

Common Anti-patterns:

  • Excessive .clone() usage (797 instances across files)
  • 36 #[allow(dead_code)] suppressions
  • 797 instances of potentially unsafe operations (unreachable, panic, unwrap)

4. Dead Code Analysis

Dead Code Indicators:

  • 36 explicit dead code suppressions
  • Multiple TODO/FIXME comments (20+ instances)
  • Unused imports and deprecated methods
  • Legacy code paths maintained alongside new implementations

Specific Refactoring Recommendations

Priority 1: Plugin Loader Consolidation

Goal: Merge 3 plugin loaders into unified system

Steps:

  1. Create unified PluginLoaderV3 in src/runtime/plugin_loader_unified.rs
  2. Extract common FFI patterns into src/runtime/plugin_ffi_common.rs
  3. Consolidate handle management (FileBoxHandle, MathBoxHandle, etc.) into generic PluginHandle<T>
  4. Migrate existing code incrementally
  5. Remove legacy loaders

Impact: ~2000 lines reduction, improved maintainability

Priority 2: MIR Builder Modularization

Goal: Split src/mir/builder.rs (1517 lines) into focused modules

Proposed Structure:

src/mir/builder/
├── mod.rs                 # Public interface and coordination
├── expressions.rs         # Expression lowering (exists, enhance)
├── statements.rs          # Statement lowering
├── control_flow.rs        # If/loop/match lowering
├── type_operations.rs     # Type checking and casting
├── phi_insertion.rs       # SSA phi node management
├── optimization.rs        # Basic optimization passes
└── validation.rs          # Builder state validation

Impact: Improved maintainability, easier testing, clearer separation of concerns

Priority 3: Interpreter Core Splitting

Goal: Break down large interpreter files

Strategy:

  • src/interpreter/core.rs → Split into context.rs, execution.rs, environment.rs
  • src/interpreter/expressions/calls.rs → Extract static methods, stdlib integration
  • Create src/interpreter/dispatch/ module for method dispatch logic

Priority 4: Box Pattern Standardization

Goal: Reduce duplication in Box implementations

Actions:

  1. Create BoxMacros proc-macro for common patterns
  2. Standardize constructor patterns
  3. Extract common drop/finalization logic
  4. Create Box trait bounds helpers

Technical Debt Remediation

Error Handling Improvements

  • Replace unwrap() with proper error propagation (797 instances)
  • Standardize error types across modules
  • Implement consistent error context

Memory Management

  • Audit .clone() usage for performance impact
  • Implement Cow<T> where appropriate
  • Optimize Arc/Rc usage patterns

Dead Code Removal

  • Remove 36 dead code suppressions systematically
  • Clean up TODO/FIXME comments (convert to GitHub issues)
  • Remove deprecated method implementations

Implementation Roadmap

Phase 1: Foundation (Week 1-2)

  • Create unified plugin loader interfaces
  • Extract common FFI patterns
  • Set up modular structure for MIR builder

Phase 2: Migration (Week 3-4)

  • Migrate existing plugin usage to unified loader
  • Split MIR builder into modules
  • Begin interpreter core modularization

Phase 3: Optimization (Week 5-6)

  • Implement Box pattern standardization
  • Remove dead code systematically
  • Performance audit and optimization

Phase 4: Validation (Week 7-8)

  • Comprehensive testing of refactored code
  • Performance benchmarking
  • Documentation updates

Risk Assessment

Low Risk:

  • Dead code removal
  • Documentation improvements
  • Box pattern standardization

Medium Risk:

  • MIR builder modularization
  • Interpreter splitting
  • Memory management optimizations

High Risk:

  • Plugin loader consolidation (affects external plugins)
  • Error handling refactor (wide-reaching changes)

Success Metrics

  • Code Quality: Reduce cyclomatic complexity by 30%
  • Maintainability: Reduce average file size from 500+ to <300 lines
  • Performance: No regression in benchmark performance
  • Reliability: Maintain 100% test coverage
  • Developer Experience: Reduce onboarding time for new contributors

Conclusion

The Nyash codebase shows signs of rapid development with opportunities for significant refactoring. The plugin loader consolidation offers the highest impact for maintenance improvement, while the MIR builder modularization will improve long-term extensibility. A phased approach is recommended to minimize disruption while delivering incremental benefits.

The analysis reveals a well-architected system that would benefit from tactical refactoring to improve maintainability without compromising the innovative "Everything is Box" design philosophy.


Phase 15 Addendum (Mainline only)

  • Implemented: extracted CLI directives scanning and fields-top lint into src/runner/cli_directives.rs to slim src/runner/mod.rs without behavior changes.
  • Proposed next steps (non-JIT): see docs/development/refactoring/candidates_phase15.md for focused items on Runner/LLVM/VM.