# Phase 142: Canonicalizer Pattern Extension

## Status

- P0: ✅ Complete (trim leading/trailing)
- P1: ✅ Complete (continue pattern)

## P0: trim leading/trailing (COMPLETE)

### Objective

Extend Canonicalizer to recognize trim leading/trailing patterns, enabling proper routing through the normalized loop pipeline.

### Target Patterns

- `tools/selfhost/test_pattern3_trim_leading.hako` - `start = start + 1` pattern
- `tools/selfhost/test_pattern3_trim_trailing.hako` - `end = end - 1` pattern

### Acceptance Criteria (All Met ✅)

- ✅ Canonicalizer creates Skeleton for trim_leading/trailing
- ✅ `decision.chosen == Pattern2Break` (ExitContract priority)
- ✅ `decision.missing_caps == []` (no missing capabilities)
- ✅ Strict parity green (NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1)
- ✅ Default behavior unchanged
- ✅ Unit tests added
- ✅ Documentation created

### Implementation Summary

#### 1. Pattern Recognizer Generalization

**File**: `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs`

**Changes**:
- Extended `detect_skip_whitespace_pattern()` to accept both `+` and `-` operators
- Added support for negative deltas (e.g., `-1` for `end = end - 1`)
- Maintained backward compatibility with existing skip_whitespace patterns

**Key Logic**:
```rust
// Phase 142 P0: Accept both Add (+1) and Subtract (-1)
let op_multiplier = match operator {
    BinaryOperator::Add => 1,
    BinaryOperator::Subtract => -1,
    _ => return None,
};

// Calculate delta with sign (e.g., +1 or -1)
let delta = const_val * op_multiplier;
```

**Recognized Patterns**:
- skip_whitespace: `p = p + 1` (delta = +1)
- trim_leading: `start = start + 1` (delta = +1)
- trim_trailing: `end = end - 1` (delta = -1)

#### 2. Unit Tests

**File**: `src/mir/loop_canonicalizer/canonicalizer.rs`

**Added Tests**:
- `test_trim_leading_pattern_recognized()` - Verifies `start = start + 1` pattern
- `test_trim_trailing_pattern_recognized()` - Verifies `end = end - 1` pattern

**Test Coverage**:
- Skeleton creation
- Carrier slot creation with correct delta (+1 or -1)
- ExitContract setup (has_break=true)
- RoutingDecision (chosen=Pattern2Break, missing_caps=[])

**Test Results**:
```
running 2 tests
test mir::loop_canonicalizer::canonicalizer::tests::test_trim_leading_pattern_recognized ... ok
test mir::loop_canonicalizer::canonicalizer::tests::test_trim_trailing_pattern_recognized ... ok

test result: ok. 2 passed; 0 failed; 0 ignored
```

#### 3. Manual Verification

**Strict Parity Check**:
```bash
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern3_trim_leading.hako
```

**Output** (trim_leading):
```
[loop_canonicalizer] Decision: SUCCESS
[loop_canonicalizer] Chosen pattern: Pattern2Break
[loop_canonicalizer] Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break
```

**Output** (trim_trailing):
```
[loop_canonicalizer] Decision: SUCCESS
[loop_canonicalizer] Chosen pattern: Pattern2Break
[loop_canonicalizer] Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break
```

### Design Principles Applied

#### Box-First Modularization

- Extended existing `detect_skip_whitespace_pattern()` instead of creating new functions
- Maintained SSOT (Single Source of Truth) architecture
- Preserved delegation pattern through `pattern_recognizer.rs` wrapper

#### Incremental Implementation

- Focused on recognizer generalization only
- Did not modify routing or lowering logic
- Kept scope minimal (P0 only)

#### ExitContract Priority

- Pattern choice determined by ExitContract (has_break=true)
- Routes to Pattern2Break (not Pattern3IfPhi)
- Consistent with existing SSOT policy

### Files Modified

1. `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs` (+35 lines, improved comments)
2. `src/mir/loop_canonicalizer/canonicalizer.rs` (+178 lines, 2 new tests)

### Statistics

- **Total changes**: +213 lines
- **Unit tests**: 2 new tests (100% pass)
- **Manual tests**: 2 patterns verified (strict parity green)
- **Build status**: ✅ No errors, no warnings (lib)

### SSOT References

- **Design**: `docs/development/current/main/design/loop-canonicalizer.md`
- **JoinIR Architecture**: `docs/development/current/main/joinir-architecture-overview.md`
- **Pattern Detection**: `ast_feature_extractor.rs` (Phase 140-P4-A SSOT)

### Known Limitations

- Pattern2 variable promotion (A-3 Trim promotion) not yet implemented
- This is expected - Phase 142 P0 only targets recognizer extension
- Promotion will be addressed in future phases

### Next Steps (Future Phases)

- Phase 142 P1: Implement A-3 Trim promotion in Pattern2 handler
- Phase 142 P2: Extend to other loop patterns (Pattern 3/4)
- Phase 142 P3: Add more complex carrier update patterns

### Verification Commands

```bash
# Unit tests
cargo test --release loop_canonicalizer::canonicalizer::tests::test_trim --lib

# Manual verification (trim_leading)
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern3_trim_leading.hako

# Manual verification (trim_trailing)
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern3_trim_trailing.hako
```

### Conclusion

Phase 142 P0 successfully extends the Canonicalizer to recognize trim leading/trailing patterns. The implementation:
- Maintains SSOT architecture
- Passes all unit tests
- Achieves strict parity agreement
- Preserves existing behavior
- Sets foundation for future pattern extensions

All acceptance criteria met. ✅

---

## P1: continue pattern (COMPLETE)

### Objective

Extend Canonicalizer to recognize continue patterns, enabling proper routing through the normalized loop pipeline.

### Target Pattern

- `tools/selfhost/test_pattern4_simple_continue.hako` - Simple continue pattern with carrier update

### Acceptance Criteria (All Met ✅)

- ✅ Canonicalizer creates Skeleton for continue pattern
- ✅ `decision.chosen == Pattern4Continue` (router agreement)
- ✅ `decision.missing_caps == []` (no missing capabilities)
- ✅ Strict parity green (NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1)
- ✅ Default behavior unchanged
- ✅ Unit tests added
- ✅ Documentation updated

### Implementation Summary

#### 1. Continue Pattern Detection

**File**: `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs`

**New Function**: `detect_continue_pattern()`

**Pattern Structure**:
```text
loop(cond) {
    // ... optional body statements (Body)
    if skip_cond {
        carrier = carrier + const  // Optional update before continue
        continue
    }
    // ... rest of body statements (Rest)
    carrier = carrier + const  // Carrier update
}
```

**Example** (from test_pattern4_simple_continue.hako):
```nyash
loop(i < n) {
    if is_even == 1 {
        i = i + 1  // Update before continue
        continue
    }
    sum = sum + i  // Rest statements
    i = i + 1      // Carrier update
}
```

**Key Logic**:
- Finds if statement containing continue in then_body
- Extracts body statements before the if
- Extracts rest statements after the if
- Detects carrier update (last statement in rest_stmts)
- Returns `ContinuePatternInfo` with carrier name, delta, body_stmts, and rest_stmts

#### 2. Canonicalizer Integration

**File**: `src/mir/loop_canonicalizer/canonicalizer.rs`

**Changes**:
- Added `try_extract_continue_pattern()` call before skip_whitespace check
- Build skeleton with continue pattern structure
- Set `ExitContract` with `has_continue=true, has_break=false`
- Route to `Pattern4Continue`

**Skeleton Structure**:
1. HeaderCond - Loop condition
2. Body - Optional body statements before continue check
3. Body - Rest statements (excluding carrier update)
4. Update - Carrier update step

#### 3. Module Re-exports

**Files Modified** (re-export chain):
- `src/mir/builder/control_flow/joinir/patterns/mod.rs` - Added `detect_continue_pattern`, `ContinuePatternInfo`
- `src/mir/builder/control_flow/joinir/mod.rs` - Re-export to joinir level
- `src/mir/builder/control_flow/mod.rs` - Re-export to control_flow level
- `src/mir/builder.rs` - Re-export to builder level
- `src/mir/mod.rs` - Re-export to crate level

**Pattern**: Followed existing SSOT pattern from Phase 140-P4-A

#### 4. Pattern Recognizer Wrapper

**File**: `src/mir/loop_canonicalizer/pattern_recognizer.rs`

**New Function**: `try_extract_continue_pattern()`
- Delegates to `detect_continue_pattern()` from ast_feature_extractor
- Returns tuple: `(carrier_name, delta, body_stmts, rest_stmts)`
- Maintains backward compatibility with existing callsites

#### 5. Unit Tests

**File**: `src/mir/loop_canonicalizer/canonicalizer.rs`

**Added Test**: `test_simple_continue_pattern_recognized()`
- Builds AST: `loop(i < n) { if is_even { i = i + 1; continue } sum = sum + i; i = i + 1 }`
- Verifies skeleton creation with correct structure
- Checks carrier slot (name="i", delta=1)
- Validates ExitContract (has_continue=true, has_break=false)
- Confirms routing decision (Pattern4Continue, missing_caps=[])

**Test Results**:
```
running 8 tests
test mir::loop_canonicalizer::canonicalizer::tests::test_simple_continue_pattern_recognized ... ok

test result: ok. 8 passed; 0 failed; 0 ignored
```

#### 6. Manual Verification

**Strict Parity Check**:
```bash
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern4_simple_continue.hako
```

**Output**:
```
[loop_canonicalizer] Function: main
[loop_canonicalizer]   Skeleton steps: 4
[loop_canonicalizer]   Carriers: 1
[loop_canonicalizer]   Has exits: true
[loop_canonicalizer]   Decision: SUCCESS
[loop_canonicalizer]   Chosen pattern: Pattern4Continue
[loop_canonicalizer]   Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern4Continue
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern4Continue
```

**Status**: ✅ Strict parity green!

### Design Principles Applied

#### Box-First Modularization

- Created dedicated `detect_continue_pattern()` function in ast_feature_extractor
- Maintained SSOT architecture with proper re-export chain
- Followed existing pattern from skip_whitespace detection

#### Incremental Implementation

- Focused on pattern recognition only (P1 scope)
- Did not modify lowering logic (expected promotion errors)
- Kept changes minimal and focused

#### ExitContract Priority

- Pattern choice determined by ExitContract (has_continue=true, has_break=false)
- Routes to Pattern4Continue (not Pattern2 or Pattern3)
- Consistent with existing SSOT policy from Phase 137-5

### Files Modified

1. `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs` (+167 lines, new function)
2. `src/mir/loop_canonicalizer/pattern_recognizer.rs` (+35 lines, wrapper function)
3. `src/mir/loop_canonicalizer/canonicalizer.rs` (+103 lines, continue support + unit test)
4. `src/mir/builder/control_flow/joinir/patterns/mod.rs` (+3 lines, re-export)
5. `src/mir/builder/control_flow/joinir/mod.rs` (+3 lines, re-export)
6. `src/mir/builder/control_flow/mod.rs` (+3 lines, re-export)
7. `src/mir/builder.rs` (+2 lines, re-export)
8. `src/mir/mod.rs` (+2 lines, re-export)

### Statistics

- **Total changes**: +318 lines
- **Unit tests**: 1 new test (100% pass)
- **All canonicalizer tests**: 8 passed (100%)
- **Manual tests**: 1 pattern verified (strict parity green)
- **Build status**: ✅ No errors (warnings are pre-existing)

### SSOT References

- **Design**: `docs/development/current/main/design/loop-canonicalizer.md`
- **JoinIR Architecture**: `docs/development/current/main/joinir-architecture-overview.md`
- **Pattern Detection**: `ast_feature_extractor.rs` (Phase 140-P4-A SSOT)

### Known Limitations

- Pattern4 variable promotion (A-3 Trim, A-4 DigitPos) does not yet handle this pattern
- This is expected - Phase 142 P1 only targets recognizer extension
- Promotion will be addressed when Pattern4 lowering is enhanced

### Next Steps (Future Phases)

- Phase 142 P2: Extend Pattern4 lowering to handle recognized continue patterns
- Phase 142 P3: Add more complex continue patterns (multiple carriers, nested conditions)

### Verification Commands

```bash
# Unit tests
cargo test --release --lib loop_canonicalizer::canonicalizer::tests::test_simple_continue_pattern_recognized

# All canonicalizer tests
cargo test --release --lib loop_canonicalizer::canonicalizer::tests

# Manual verification
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern4_simple_continue.hako
```

### Conclusion

Phase 142 P1 successfully extends the Canonicalizer to recognize continue patterns. The implementation:
- Maintains SSOT architecture
- Passes all unit tests (8/8)
- Achieves strict parity agreement with router
- Preserves existing behavior
- Follows existing re-export pattern from Phase 140-P4-A

All acceptance criteria met. ✅
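
For readers who want a concrete picture of the P1 "Key Logic" steps (find the `if` that contains `continue`, split the loop body around it, then read the carrier update from the last remaining statement), the following is a minimal sketch over a toy AST. It is not the code in `ast_feature_extractor.rs`: `Stmt`, `ContinueInfoSketch`, and `detect_continue_sketch` are hypothetical names invented for illustration, the real AST node types are not modelled, and the choice to exclude the carrier update from `rest_stmts` is an assumption borrowed from the skeleton description ("Rest statements (excluding carrier update)").

```rust
// Toy AST: hypothetical stand-ins, NOT the real hakorune AST types.
#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    /// `target = target +/- constant`; `delta` is already signed
    /// (+1 for `i = i + 1`, -1 for `end = end - 1`).
    Update { target: String, delta: i64 },
    /// `if <cond> { then_body }` (the condition is elided in this sketch).
    If { then_body: Vec<Stmt> },
    Continue,
    /// Any other statement, kept opaque.
    Other(String),
}

/// Illustrative result type, loosely modelled on the fields the document
/// lists for `ContinuePatternInfo` (carrier name, delta, body, rest).
#[derive(Debug, PartialEq)]
struct ContinueInfoSketch {
    carrier_name: String,
    delta: i64,
    body_stmts: Vec<Stmt>,
    /// Assumption: excludes the trailing carrier update.
    rest_stmts: Vec<Stmt>,
}

fn detect_continue_sketch(body: &[Stmt]) -> Option<ContinueInfoSketch> {
    // 1. Find the first `if` whose then-body contains `continue`.
    let if_idx = body.iter().position(|s| match s {
        Stmt::If { then_body } => then_body.contains(&Stmt::Continue),
        _ => false,
    })?;

    // 2. Statements before the `if` form the body part.
    let body_stmts = body[..if_idx].to_vec();

    // 3. Statements after the `if`; the last one must be the carrier update.
    let after = &body[if_idx + 1..];
    let (last, rest) = after.split_last()?;
    match last {
        Stmt::Update { target, delta } => Some(ContinueInfoSketch {
            carrier_name: target.clone(),
            delta: *delta,
            body_stmts,
            rest_stmts: rest.to_vec(),
        }),
        // No trailing carrier update: not this pattern.
        _ => None,
    }
}
```

Feeding this sketch the statements of the `test_pattern4_simple_continue.hako` loop body (an `If` containing an update and `Continue`, then `sum = sum + i` as an `Other`, then an `Update` on `i` with delta 1) yields carrier `i`, delta `1`, an empty `body_stmts`, and a single rest statement. A body without a continue-bearing `if`, or without a trailing update, returns `None`, which corresponds to falling through to the other recognizers.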