Add parse_number pattern recognition to canonicalizer, expanding adaptation
range for digit collection loops with break in THEN clause.
## Changes
### New Recognizer (ast_feature_extractor.rs)
- `detect_parse_number_pattern()`: Detects `if invalid { break }` pattern
- `ParseNumberInfo`: Struct for extracted pattern info
- ~150 lines added
### Canonicalizer Integration (canonicalizer.rs)
- Parse_number pattern detection before skip_whitespace
- LoopSkeleton construction with 4 steps (Header + Body x2 + Update)
- Routes to Pattern2Break (has_break=true)
- ~60 lines modified
### Export Chain (6 files)
- patterns/mod.rs → joinir/mod.rs → control_flow/mod.rs
- builder.rs → mir/mod.rs
- 8 lines total
### Tests
- `test_parse_number_pattern_recognized()`: Unit test for recognition
- Strict parity verification: GREEN (canonical and router agree)
- ~130 lines added
## Pattern Comparison
| Aspect | Skip Whitespace | Parse Number |
|--------|----------------|--------------|
| Break location | ELSE clause | THEN clause |
| Pattern | `if cond { update } else { break }` | `if invalid { break } rest... update` |
| Body after if | None | Required (result append) |
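
The difference between the two shapes can be sketched on a toy AST. This is a minimal illustration, not the code in `ast_feature_extractor.rs`: the `Stmt` and `LoopShape` types and the `classify` function are hypothetical stand-ins, and the real recognizer works on the full hakorune AST.

```rust
// Toy AST for illustration only; the real hakorune AST is richer.
#[derive(Debug, Clone, PartialEq)]
pub enum Stmt {
    Break,
    Update(String), // carrier update such as `p = p + 1`
    Other(String),  // any other statement, e.g. a result append
    If { then_body: Vec<Stmt>, else_body: Vec<Stmt> },
}

#[derive(Debug, PartialEq)]
pub enum LoopShape {
    SkipWhitespace, // `if cond { update } else { break }`
    ParseNumber,    // `if invalid { break } rest... update`
}

/// Classify a loop body by where the `break` sits (hypothetical helper).
pub fn classify(body: &[Stmt]) -> Option<LoopShape> {
    // Locate the first `if` statement in the loop body.
    let if_idx = body.iter().position(|s| matches!(s, Stmt::If { .. }))?;
    let (then_body, else_body) = match &body[if_idx] {
        Stmt::If { then_body, else_body } => (then_body, else_body),
        _ => return None,
    };
    let rest = &body[if_idx + 1..];

    let then_breaks = matches!(then_body.as_slice(), [Stmt::Break]);
    let else_breaks = matches!(else_body.as_slice(), [Stmt::Break]);

    // Break in ELSE, update in THEN, nothing after the `if`.
    if else_breaks && matches!(then_body.as_slice(), [Stmt::Update(_)]) && rest.is_empty() {
        return Some(LoopShape::SkipWhitespace);
    }
    // Break in THEN, then at least one rest statement followed by the update.
    if then_breaks
        && else_body.is_empty()
        && rest.len() >= 2
        && matches!(rest.last(), Some(Stmt::Update(_)))
    {
        return Some(LoopShape::ParseNumber);
    }
    None
}
```
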
## Results
- ✅ Skeleton creation successful
- ✅ RoutingDecision matches router (Pattern2Break)
- ✅ Strict parity OK (canonicalizer ↔ router agreement)
- ✅ Unit test PASS
- ✅ Manual test: test_pattern2_parse_number.hako executes correctly
## Statistics
- New patterns: 1 (parse_number)
- Total patterns: 3 (skip_whitespace, parse_number, continue)
- Lines added: ~280
- Files modified: 8
- Parity status: Green ✅
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Phase 142: Canonicalizer Pattern Extension
Status
- P0: ✅ Complete (trim leading/trailing)
- P1: ✅ Complete (continue pattern)
P0: trim leading/trailing (COMPLETE)
Objective
Extend Canonicalizer to recognize trim leading/trailing patterns, enabling proper routing through the normalized loop pipeline.
Target Patterns
- `tools/selfhost/test_pattern3_trim_leading.hako` - `start = start + 1` pattern
- `tools/selfhost/test_pattern3_trim_trailing.hako` - `end = end - 1` pattern
Acceptance Criteria (All Met ✅)
- ✅ Canonicalizer creates Skeleton for trim_leading/trailing
- ✅ `decision.chosen == Pattern2Break` (ExitContract priority)
- ✅ `decision.missing_caps == []` (no missing capabilities)
- ✅ Strict parity green (`NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1`)
- ✅ Default behavior unchanged
- ✅ Unit tests added
- ✅ Documentation created
Implementation Summary
1. Pattern Recognizer Generalization
File: src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs
Changes:
- Extended `detect_skip_whitespace_pattern()` to accept both `+` and `-` operators
- Added support for negative deltas (e.g., `-1` for `end = end - 1`)
- Maintained backward compatibility with existing skip_whitespace patterns
Key Logic:
// Phase 142 P0: Accept both Add (+1) and Subtract (-1)
let op_multiplier = match operator {
    BinaryOperator::Add => 1,
    BinaryOperator::Subtract => -1,
    _ => return None,
};
// Calculate delta with sign (e.g., +1 or -1)
let delta = const_val * op_multiplier;
Recognized Patterns:
- skip_whitespace: `p = p + 1` (delta = +1)
- trim_leading: `start = start + 1` (delta = +1)
- trim_trailing: `end = end - 1` (delta = -1)
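
The sign handling from the Key Logic excerpt can be tried in isolation. The sketch below is self-contained and rests on assumptions: the `BinaryOperator` enum here is a three-variant stand-in for the real one, and `signed_delta` is a hypothetical helper name, not a function in the codebase.

```rust
// Stand-in for the real operator enum; only the variants needed here.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BinaryOperator {
    Add,
    Subtract,
    Multiply, // example of an operator the recognizer rejects
}

/// Signed delta for an update of the form `x = x <op> const_val`.
/// Returns `None` for operators other than Add and Subtract.
pub fn signed_delta(operator: BinaryOperator, const_val: i64) -> Option<i64> {
    let op_multiplier = match operator {
        BinaryOperator::Add => 1,
        BinaryOperator::Subtract => -1,
        _ => return None,
    };
    Some(const_val * op_multiplier)
}
```

With these definitions, `start = start + 1` maps to `Some(1)` and `end = end - 1` maps to `Some(-1)`, matching the deltas listed above.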
2. Unit Tests
File: src/mir/loop_canonicalizer/canonicalizer.rs
Added Tests:
- `test_trim_leading_pattern_recognized()` - Verifies `start = start + 1` pattern
- `test_trim_trailing_pattern_recognized()` - Verifies `end = end - 1` pattern
Test Coverage:
- Skeleton creation
- Carrier slot creation with correct delta (+1 or -1)
- ExitContract setup (has_break=true)
- RoutingDecision (chosen=Pattern2Break, missing_caps=[])
Test Results:
running 2 tests
test mir::loop_canonicalizer::canonicalizer::tests::test_trim_leading_pattern_recognized ... ok
test mir::loop_canonicalizer::canonicalizer::tests::test_trim_trailing_pattern_recognized ... ok
test result: ok. 2 passed; 0 failed; 0 ignored
3. Manual Verification
Strict Parity Check:
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
tools/selfhost/test_pattern3_trim_leading.hako
Output (trim_leading):
[loop_canonicalizer] Decision: SUCCESS
[loop_canonicalizer] Chosen pattern: Pattern2Break
[loop_canonicalizer] Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break
Output (trim_trailing):
[loop_canonicalizer] Decision: SUCCESS
[loop_canonicalizer] Chosen pattern: Pattern2Break
[loop_canonicalizer] Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break
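
The parity lines in the output compare the canonicalizer's decision with the router's actual choice. A rough sketch of such a check follows. The `check_parity` function, its signature, and the mismatch message are hypothetical, and the assumption that strict mode turns a mismatch into an error is not confirmed by this document.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PatternKind {
    Pattern2Break,
    Pattern4Continue,
}

/// Compare the canonical decision with the router's actual choice.
/// ASSUMPTION: in strict mode a mismatch is an error; otherwise it is
/// only reported. The real behavior may differ.
pub fn check_parity(
    func: &str,
    canonical: PatternKind,
    actual: PatternKind,
    strict: bool,
) -> Result<String, String> {
    if canonical == actual {
        return Ok(format!(
            "OK in function '{}': canonical and actual agree on {:?}",
            func, canonical
        ));
    }
    // Hypothetical mismatch message; the real wording is not shown in the log.
    let msg = format!(
        "MISMATCH in function '{}': canonical={:?}, actual={:?}",
        func, canonical, actual
    );
    if strict {
        Err(msg)
    } else {
        Ok(msg)
    }
}
```
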
Design Principles Applied
Box-First Modularization
- Extended existing `detect_skip_whitespace_pattern()` instead of creating new functions
- Maintained SSOT (Single Source of Truth) architecture
- Preserved delegation pattern through `pattern_recognizer.rs` wrapper
Incremental Implementation
- Focused on recognizer generalization only
- Did not modify routing or lowering logic
- Kept scope minimal (P0 only)
ExitContract Priority
- Pattern choice determined by ExitContract (has_break=true)
- Routes to Pattern2Break (not Pattern3IfPhi)
- Consistent with existing SSOT policy
Files Modified
- `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs` (+35 lines, improved comments)
- `src/mir/loop_canonicalizer/canonicalizer.rs` (+178 lines, 2 new tests)
Statistics
- Total changes: +213 lines
- Unit tests: 2 new tests (100% pass)
- Manual tests: 2 patterns verified (strict parity green)
- Build status: ✅ No errors, no warnings (lib)
SSOT References
- Design: `docs/development/current/main/design/loop-canonicalizer.md`
- JoinIR Architecture: `docs/development/current/main/joinir-architecture-overview.md`
- Pattern Detection: `ast_feature_extractor.rs` (Phase 140-P4-A SSOT)
Known Limitations
- Pattern2 variable promotion (A-3 Trim promotion) not yet implemented
- This is expected - Phase 142 P0 only targets recognizer extension
- Promotion will be addressed in future phases
Next Steps (Future Phases)
- Phase 142 P1: Implement A-3 Trim promotion in Pattern2 handler
- Phase 142 P2: Extend to other loop patterns (Pattern 3/4)
- Phase 142 P3: Add more complex carrier update patterns
Verification Commands
# Unit tests
cargo test --release loop_canonicalizer::canonicalizer::tests::test_trim --lib
# Manual verification (trim_leading)
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
tools/selfhost/test_pattern3_trim_leading.hako
# Manual verification (trim_trailing)
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
tools/selfhost/test_pattern3_trim_trailing.hako
Conclusion
Phase 142 P0 successfully extends the Canonicalizer to recognize trim leading/trailing patterns. The implementation:
- Maintains SSOT architecture
- Passes all unit tests
- Achieves strict parity agreement
- Preserves existing behavior
- Sets foundation for future pattern extensions
All acceptance criteria met. ✅
P1: continue pattern (COMPLETE)
Objective
Extend Canonicalizer to recognize continue patterns, enabling proper routing through the normalized loop pipeline.
Target Pattern
- `tools/selfhost/test_pattern4_simple_continue.hako` - Simple continue pattern with carrier update
Acceptance Criteria (All Met ✅)
- ✅ Canonicalizer creates Skeleton for continue pattern
- ✅ `decision.chosen == Pattern4Continue` (router agreement)
- ✅ `decision.missing_caps == []` (no missing capabilities)
- ✅ Strict parity green (`NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1`)
- ✅ Default behavior unchanged
- ✅ Unit tests added
- ✅ Documentation updated
Implementation Summary
1. Continue Pattern Detection
File: src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs
New Function: detect_continue_pattern()
Pattern Structure:
loop(cond) {
    // ... optional body statements (Body)
    if skip_cond {
        carrier = carrier + const  // Optional update before continue
        continue
    }
    // ... rest of body statements (Rest)
    carrier = carrier + const  // Carrier update
}
Example (from test_pattern4_simple_continue.hako):
loop(i < n) {
    if is_even == 1 {
        i = i + 1  // Update before continue
        continue
    }
    sum = sum + i  // Rest statements
    i = i + 1      // Carrier update
}
Key Logic:
- Finds if statement containing continue in then_body
- Extracts body statements before the if
- Extracts rest statements after the if
- Detects carrier update (last statement in rest_stmts)
- Returns `ContinuePatternInfo` with carrier name, delta, body_stmts, and rest_stmts
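
The steps above can be sketched on a toy AST. This is a minimal illustration under stated assumptions, not the actual `detect_continue_pattern()`. `Stmt`, `ContinueInfoSketch`, and `detect_continue` are hypothetical names, and the sketch assumes `rest_stmts` holds every statement after the `if`, with the carrier update as its last element.

```rust
// Toy AST for illustration only; the real hakorune AST is richer.
#[derive(Debug, Clone, PartialEq)]
pub enum Stmt {
    Continue,
    Update { name: String, delta: i64 }, // `name = name + delta`
    Other(String),
    If { then_body: Vec<Stmt> },
}

// Hypothetical stand-in for the fields described for `ContinuePatternInfo`.
#[derive(Debug, PartialEq)]
pub struct ContinueInfoSketch {
    pub carrier_name: String,
    pub delta: i64,
    pub body_stmts: Vec<Stmt>,
    pub rest_stmts: Vec<Stmt>,
}

pub fn detect_continue(body: &[Stmt]) -> Option<ContinueInfoSketch> {
    // 1. Find the `if` whose then-body contains `continue`.
    let if_idx = body.iter().position(|s| match s {
        Stmt::If { then_body } => then_body.contains(&Stmt::Continue),
        _ => false,
    })?;
    // 2. Statements before the `if` form the body part.
    let body_stmts = body[..if_idx].to_vec();
    // 3. Statements after the `if` form the rest part.
    let rest_stmts = body[if_idx + 1..].to_vec();
    // 4. The last rest statement must be the carrier update.
    match rest_stmts.last()? {
        Stmt::Update { name, delta } => Some(ContinueInfoSketch {
            carrier_name: name.clone(),
            delta: *delta,
            body_stmts,
            rest_stmts,
        }),
        _ => None,
    }
}
```

Applied to the loop from `test_pattern4_simple_continue.hako`, this sketch would report carrier `i` with delta 1, an empty body part, and two rest statements.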
2. Canonicalizer Integration
File: src/mir/loop_canonicalizer/canonicalizer.rs
Changes:
- Added `try_extract_continue_pattern()` call before skip_whitespace check
- Build skeleton with continue pattern structure
- Set `ExitContract` with `has_continue=true, has_break=false`
- Route to `Pattern4Continue`
Skeleton Structure:
- HeaderCond - Loop condition
- Body - Optional body statements before continue check
- Body - Rest statements (excluding carrier update)
- Update - Carrier update step
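
As a rough picture of this four-step layout, the sketch below builds the steps in order. The `Step` enum and `build_continue_skeleton` are hypothetical. The real `LoopSkeleton` types are not shown in this document, and emitting the first `Body` step even when it is empty is an assumption.

```rust
// Hypothetical step type mirroring the four entries listed above.
#[derive(Debug, Clone, PartialEq)]
pub enum Step {
    HeaderCond(String),
    Body(Vec<String>),
    Update { carrier: String, delta: i64 },
}

fn owned(stmts: &[&str]) -> Vec<String> {
    stmts.iter().map(|s| s.to_string()).collect()
}

/// Build the step list: HeaderCond, Body (before the continue check),
/// Body (rest, without the carrier update), Update.
/// ASSUMPTION: both Body steps are emitted even when empty.
pub fn build_continue_skeleton(
    cond: &str,
    body_stmts: &[&str],
    rest_without_update: &[&str],
    carrier: &str,
    delta: i64,
) -> Vec<Step> {
    vec![
        Step::HeaderCond(cond.to_string()),
        Step::Body(owned(body_stmts)),
        Step::Body(owned(rest_without_update)),
        Step::Update { carrier: carrier.to_string(), delta },
    ]
}
```
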
3. Module Re-exports
Files Modified (re-export chain):
- `src/mir/builder/control_flow/joinir/patterns/mod.rs` - Added `detect_continue_pattern`, `ContinuePatternInfo`
- `src/mir/builder/control_flow/joinir/mod.rs` - Re-export to joinir level
- `src/mir/builder/control_flow/mod.rs` - Re-export to control_flow level
- `src/mir/builder.rs` - Re-export to builder level
- `src/mir/mod.rs` - Re-export to crate level
Pattern: Followed existing SSOT pattern from Phase 140-P4-A
4. Pattern Recognizer Wrapper
File: src/mir/loop_canonicalizer/pattern_recognizer.rs
New Function: try_extract_continue_pattern()
- Delegates to `detect_continue_pattern()` from ast_feature_extractor
- Returns tuple: `(carrier_name, delta, body_stmts, rest_stmts)`
- Maintains backward compatibility with existing callsites
5. Unit Tests
File: src/mir/loop_canonicalizer/canonicalizer.rs
Added Test: test_simple_continue_pattern_recognized()
- Builds AST: `loop(i < n) { if is_even { i = i + 1; continue } sum = sum + i; i = i + 1 }`
- Verifies skeleton creation with correct structure
- Checks carrier slot (name="i", delta=1)
- Validates ExitContract (has_continue=true, has_break=false)
- Confirms routing decision (Pattern4Continue, missing_caps=[])
Test Results:
running 8 tests
test mir::loop_canonicalizer::canonicalizer::tests::test_simple_continue_pattern_recognized ... ok
test result: ok. 8 passed; 0 failed; 0 ignored
6. Manual Verification
Strict Parity Check:
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
tools/selfhost/test_pattern4_simple_continue.hako
Output:
[loop_canonicalizer] Function: main
[loop_canonicalizer] Skeleton steps: 4
[loop_canonicalizer] Carriers: 1
[loop_canonicalizer] Has exits: true
[loop_canonicalizer] Decision: SUCCESS
[loop_canonicalizer] Chosen pattern: Pattern4Continue
[loop_canonicalizer] Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern4Continue
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern4Continue
Status: ✅ Strict parity green!
Design Principles Applied
Box-First Modularization
- Created dedicated `detect_continue_pattern()` function in ast_feature_extractor
- Maintained SSOT architecture with proper re-export chain
- Followed existing pattern from skip_whitespace detection
Incremental Implementation
- Focused on pattern recognition only (P1 scope)
- Did not modify lowering logic (promotion errors are expected at this stage)
- Kept changes minimal and focused
ExitContract Priority
- Pattern choice determined by ExitContract (has_continue=true, has_break=false)
- Routes to Pattern4Continue (not Pattern2 or Pattern3)
- Consistent with existing SSOT policy from Phase 137-5
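
The ExitContract-driven choice described in both phases can be summarized in a few lines. This is a sketch under assumptions. The `ExitContract` struct below carries only the two flags mentioned in this document, `choose_from_exits` is a hypothetical name, and any flag combination not described here is deliberately left undecided.

```rust
// Reduced stand-in; the real ExitContract may carry more fields.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct ExitContract {
    pub has_break: bool,
    pub has_continue: bool,
}

#[derive(Debug, PartialEq)]
pub enum PatternKind {
    Pattern2Break,
    Pattern4Continue,
}

/// Pick a pattern from the exit flags alone (hypothetical helper).
pub fn choose_from_exits(exits: ExitContract) -> Option<PatternKind> {
    match (exits.has_break, exits.has_continue) {
        // P0: trim/skip_whitespace loops exit through `break`.
        (true, false) => Some(PatternKind::Pattern2Break),
        // P1: continue loops have `continue` and no `break`.
        (false, true) => Some(PatternKind::Pattern4Continue),
        // Other combinations are not covered by this document.
        _ => None,
    }
}
```
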
Files Modified
- `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs` (+167 lines, new function)
- `src/mir/loop_canonicalizer/pattern_recognizer.rs` (+35 lines, wrapper function)
- `src/mir/loop_canonicalizer/canonicalizer.rs` (+103 lines, continue support + unit test)
- `src/mir/builder/control_flow/joinir/patterns/mod.rs` (+3 lines, re-export)
- `src/mir/builder/control_flow/joinir/mod.rs` (+3 lines, re-export)
- `src/mir/builder/control_flow/mod.rs` (+3 lines, re-export)
- `src/mir/builder.rs` (+2 lines, re-export)
- `src/mir/mod.rs` (+2 lines, re-export)
Statistics
- Total changes: +318 lines
- Unit tests: 1 new test (100% pass)
- All canonicalizer tests: 8 passed (100%)
- Manual tests: 1 pattern verified (strict parity green)
- Build status: ✅ No errors (warnings are pre-existing)
SSOT References
- Design: `docs/development/current/main/design/loop-canonicalizer.md`
- JoinIR Architecture: `docs/development/current/main/joinir-architecture-overview.md`
- Pattern Detection: `ast_feature_extractor.rs` (Phase 140-P4-A SSOT)
Known Limitations
- Pattern4 variable promotion (A-3 Trim, A-4 DigitPos) does not yet handle this pattern
- This is expected - Phase 142 P1 only targets recognizer extension
- Promotion will be addressed when Pattern4 lowering is enhanced
Next Steps (Future Phases)
- Phase 142 P2: Extend Pattern4 lowering to handle recognized continue patterns
- Phase 142 P3: Add more complex continue patterns (multiple carriers, nested conditions)
Verification Commands
# Unit tests
cargo test --release --lib loop_canonicalizer::canonicalizer::tests::test_simple_continue_pattern_recognized
# All canonicalizer tests
cargo test --release --lib loop_canonicalizer::canonicalizer::tests
# Manual verification
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
tools/selfhost/test_pattern4_simple_continue.hako
Conclusion
Phase 142 P1 successfully extends the Canonicalizer to recognize continue patterns. The implementation:
- Maintains SSOT architecture
- Passes all unit tests (8/8)
- Achieves strict parity agreement with router
- Preserves existing behavior
- Follows existing re-export pattern from Phase 140-P4-A
All acceptance criteria met. ✅