feat(canonicalizer): Phase 143-P0 - parse_number pattern support

Add parse_number pattern recognition to the canonicalizer, extending its
coverage to digit-collection loops that break in the THEN clause.
## Changes
### New Recognizer (ast_feature_extractor.rs)
- `detect_parse_number_pattern()`: Detects `if invalid { break }` pattern
- `ParseNumberInfo`: Struct for extracted pattern info
- ~150 lines added
### Canonicalizer Integration (canonicalizer.rs)
- Parse_number pattern detection before skip_whitespace
- LoopSkeleton construction with 4 steps (Header + Body x2 + Update)
- Routes to Pattern2Break (has_break=true)
- ~60 lines modified
### Export Chain (6 files)
- patterns/mod.rs → joinir/mod.rs → control_flow/mod.rs
- builder.rs → mir/mod.rs
- 8 lines total
### Tests
- `test_parse_number_pattern_recognized()`: Unit test for recognition
- Strict parity verification: GREEN (canonical and router agree)
- ~130 lines added
## Pattern Comparison
| Aspect | Skip Whitespace | Parse Number |
|--------|----------------|--------------|
| Break location | ELSE clause | THEN clause |
| Pattern | `if cond { update } else { break }` | `if invalid { break } rest... update` |
| Body after if | None | Required (result append) |
## Results
- ✅ Skeleton creation successful
- ✅ RoutingDecision matches router (Pattern2Break)
- ✅ Strict parity OK (canonicalizer ↔ router agreement)
- ✅ Unit test PASS
- ✅ Manual test: test_pattern2_parse_number.hako executes correctly
## Statistics
- New patterns: 1 (parse_number)
- Total patterns: 3 (skip_whitespace, parse_number, continue)
- Lines added: ~280
- Files modified: 8
- Parity status: Green ✅
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
365
docs/development/current/main/phases/phase-142/README.md
Normal file
@@ -0,0 +1,365 @@
# Phase 142: Canonicalizer Pattern Extension

## Status
- P0: ✅ Complete (trim leading/trailing)
- P1: ✅ Complete (continue pattern)

## P0: trim leading/trailing (COMPLETE)

### Objective
Extend Canonicalizer to recognize trim leading/trailing patterns, enabling proper routing through the normalized loop pipeline.

### Target Patterns
- `tools/selfhost/test_pattern3_trim_leading.hako` - `start = start + 1` pattern
- `tools/selfhost/test_pattern3_trim_trailing.hako` - `end = end - 1` pattern

### Acceptance Criteria (All Met ✅)
- ✅ Canonicalizer creates Skeleton for trim_leading/trailing
- ✅ `decision.chosen == Pattern2Break` (ExitContract priority)
- ✅ `decision.missing_caps == []` (no missing capabilities)
- ✅ Strict parity green (NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1)
- ✅ Default behavior unchanged
- ✅ Unit tests added
- ✅ Documentation created

### Implementation Summary

#### 1. Pattern Recognizer Generalization
**File**: `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs`

**Changes**:
- Extended `detect_skip_whitespace_pattern()` to accept both `+` and `-` operators
- Added support for negative deltas (e.g., `-1` for `end = end - 1`)
- Maintained backward compatibility with existing skip_whitespace patterns

**Key Logic**:
```rust
// Phase 142 P0: Accept both Add (+1) and Subtract (-1)
let op_multiplier = match operator {
    BinaryOperator::Add => 1,
    BinaryOperator::Subtract => -1,
    _ => return None,
};

// Calculate delta with sign (e.g., +1 or -1)
let delta = const_val * op_multiplier;
```

**Recognized Patterns**:
- skip_whitespace: `p = p + 1` (delta = +1)
- trim_leading: `start = start + 1` (delta = +1)
- trim_trailing: `end = end - 1` (delta = -1)
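
The signed-delta calculation from Key Logic can be sketched as a self-contained function. The `BinaryOperator` enum below is a hypothetical three-variant stand-in and `signed_delta` is an illustrative helper name; neither is taken from `ast_feature_extractor.rs`, and the real recognizer presumably works on AST nodes rather than bare values.

```rust
/// Hypothetical stand-in for the AST operator type (assumption for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BinaryOperator {
    Add,
    Subtract,
    Multiply,
}

/// Returns the signed per-iteration delta for `x = x <op> const_val`,
/// or `None` when the operator is not an accepted carrier update.
pub fn signed_delta(operator: BinaryOperator, const_val: i64) -> Option<i64> {
    // Only Add and Subtract are accepted; everything else is rejected.
    let op_multiplier = match operator {
        BinaryOperator::Add => 1,
        BinaryOperator::Subtract => -1,
        _ => return None,
    };
    // Fold the operator's sign into the constant (e.g. `end - 1` gives -1).
    Some(const_val * op_multiplier)
}
```

Under these assumptions, `signed_delta(BinaryOperator::Subtract, 1)` yields `Some(-1)`, which corresponds to the trim_trailing case, and any other operator yields `None`.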

#### 2. Unit Tests
**File**: `src/mir/loop_canonicalizer/canonicalizer.rs`

**Added Tests**:
- `test_trim_leading_pattern_recognized()` - Verifies `start = start + 1` pattern
- `test_trim_trailing_pattern_recognized()` - Verifies `end = end - 1` pattern

**Test Coverage**:
- Skeleton creation
- Carrier slot creation with correct delta (+1 or -1)
- ExitContract setup (has_break=true)
- RoutingDecision (chosen=Pattern2Break, missing_caps=[])

**Test Results**:
```
running 2 tests
test mir::loop_canonicalizer::canonicalizer::tests::test_trim_leading_pattern_recognized ... ok
test mir::loop_canonicalizer::canonicalizer::tests::test_trim_trailing_pattern_recognized ... ok

test result: ok. 2 passed; 0 failed; 0 ignored
```

#### 3. Manual Verification
**Strict Parity Check**:
```bash
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern3_trim_leading.hako
```

**Output** (trim_leading):
```
[loop_canonicalizer] Decision: SUCCESS
[loop_canonicalizer] Chosen pattern: Pattern2Break
[loop_canonicalizer] Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break
```

**Output** (trim_trailing):
```
[loop_canonicalizer] Decision: SUCCESS
[loop_canonicalizer] Chosen pattern: Pattern2Break
[loop_canonicalizer] Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break
```
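
The parity lines above report that the canonicalizer's decision and the router's actual choice agree. One way to picture such a check is sketched here. `PatternKind`, `check_parity`, and the behaviour on mismatch (an error in strict mode, `Ok(false)` otherwise) are assumptions made for illustration; this README does not describe how the real check reacts to a disagreement.

```rust
/// Illustrative subset of pattern kinds (names taken from the log output above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PatternKind {
    Pattern2Break,
    Pattern3IfPhi,
    Pattern4Continue,
}

/// Compares the canonicalizer's choice with the router's choice.
/// Assumption: strict mode turns a mismatch into an error, while
/// non-strict mode only reports it as `Ok(false)`.
pub fn check_parity(
    canonical: PatternKind,
    actual: PatternKind,
    strict: bool,
) -> Result<bool, String> {
    if canonical == actual {
        return Ok(true);
    }
    let msg = format!(
        "parity mismatch: canonical={:?}, actual={:?}",
        canonical, actual
    );
    if strict {
        Err(msg)
    } else {
        Ok(false)
    }
}
```

In this sketch, both trim runs would correspond to `check_parity(PatternKind::Pattern2Break, PatternKind::Pattern2Break, true)`, which returns `Ok(true)`.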

### Design Principles Applied

#### Box-First Modularization
- Extended existing `detect_skip_whitespace_pattern()` instead of creating new functions
- Maintained SSOT (Single Source of Truth) architecture
- Preserved delegation pattern through `pattern_recognizer.rs` wrapper

#### Incremental Implementation
- Focused on recognizer generalization only
- Did not modify routing or lowering logic
- Kept scope minimal (P0 only)

#### ExitContract Priority
- Pattern choice determined by ExitContract (has_break=true)
- Routes to Pattern2Break (not Pattern3IfPhi)
- Consistent with existing SSOT policy

### Files Modified
1. `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs` (+35 lines, improved comments)
2. `src/mir/loop_canonicalizer/canonicalizer.rs` (+178 lines, 2 new tests)

### Statistics
- **Total changes**: +213 lines
- **Unit tests**: 2 new tests (100% pass)
- **Manual tests**: 2 patterns verified (strict parity green)
- **Build status**: ✅ No errors, no warnings (lib)

### SSOT References
- **Design**: `docs/development/current/main/design/loop-canonicalizer.md`
- **JoinIR Architecture**: `docs/development/current/main/joinir-architecture-overview.md`
- **Pattern Detection**: `ast_feature_extractor.rs` (Phase 140-P4-A SSOT)

### Known Limitations
- Pattern2 variable promotion (A-3 Trim promotion) not yet implemented
- This is expected - Phase 142 P0 only targets recognizer extension
- Promotion will be addressed in future phases

### Next Steps (Future Phases)
- Phase 142 P1: Implement A-3 Trim promotion in Pattern2 handler
- Phase 142 P2: Extend to other loop patterns (Pattern 3/4)
- Phase 142 P3: Add more complex carrier update patterns

### Verification Commands
```bash
# Unit tests
cargo test --release loop_canonicalizer::canonicalizer::tests::test_trim --lib

# Manual verification (trim_leading)
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern3_trim_leading.hako

# Manual verification (trim_trailing)
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern3_trim_trailing.hako
```

### Conclusion
Phase 142 P0 successfully extends the Canonicalizer to recognize trim leading/trailing patterns. The implementation:
- Maintains SSOT architecture
- Passes all unit tests
- Achieves strict parity agreement
- Preserves existing behavior
- Sets foundation for future pattern extensions

All acceptance criteria met. ✅

---

## P1: continue pattern (COMPLETE)

### Objective
Extend Canonicalizer to recognize continue patterns, enabling proper routing through the normalized loop pipeline.

### Target Pattern
- `tools/selfhost/test_pattern4_simple_continue.hako` - Simple continue pattern with carrier update

### Acceptance Criteria (All Met ✅)
- ✅ Canonicalizer creates Skeleton for continue pattern
- ✅ `decision.chosen == Pattern4Continue` (router agreement)
- ✅ `decision.missing_caps == []` (no missing capabilities)
- ✅ Strict parity green (NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1)
- ✅ Default behavior unchanged
- ✅ Unit tests added
- ✅ Documentation updated

### Implementation Summary

#### 1. Continue Pattern Detection
**File**: `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs`

**New Function**: `detect_continue_pattern()`

**Pattern Structure**:
```nyash
loop(cond) {
    // ... optional body statements (Body)
    if skip_cond {
        carrier = carrier + const // Optional update before continue
        continue
    }
    // ... rest of body statements (Rest)
    carrier = carrier + const // Carrier update
}
```

**Example** (from test_pattern4_simple_continue.hako):
```nyash
loop(i < n) {
    if is_even == 1 {
        i = i + 1 // Update before continue
        continue
    }
    sum = sum + i // Rest statements
    i = i + 1 // Carrier update
}
```

**Key Logic**:
- Finds if statement containing continue in then_body
- Extracts body statements before the if
- Extracts rest statements after the if
- Detects carrier update (last statement in rest_stmts)
- Returns `ContinuePatternInfo` with carrier name, delta, body_stmts, and rest_stmts
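
The detection steps above can be illustrated on a toy statement type. This is a sketch under stated assumptions, not the code in `ast_feature_extractor.rs`: `Stmt`, `ContinueSketch`, and `detect_continue_sketch` are hypothetical names, the `if` condition is omitted, and the sketch assumes the trailing carrier update is excluded from the returned `rest_stmts`.

```rust
/// Toy statement type for this sketch only (not the project's AST).
#[derive(Debug, Clone, PartialEq)]
pub enum Stmt {
    /// `name = name + delta` with a constant delta.
    Update { name: String, delta: i64 },
    /// `if <cond> { then_body }`; the condition is omitted here.
    If { then_body: Vec<Stmt> },
    Continue,
    /// Any other statement, kept as opaque text.
    Other(String),
}

/// Hypothetical result type mirroring the fields listed above.
#[derive(Debug, Clone, PartialEq)]
pub struct ContinueSketch {
    pub carrier_name: String,
    pub delta: i64,
    pub body_stmts: Vec<Stmt>,
    pub rest_stmts: Vec<Stmt>,
}

pub fn detect_continue_sketch(loop_body: &[Stmt]) -> Option<ContinueSketch> {
    // Find the first `if` whose then-branch contains `continue`.
    let if_idx = loop_body.iter().position(|stmt| match stmt {
        Stmt::If { then_body } => then_body.contains(&Stmt::Continue),
        _ => false,
    })?;
    // Statements before the `if` form the body part.
    let body_stmts = loop_body[..if_idx].to_vec();
    // The last statement after the `if` must be the carrier update.
    let (last, rest) = loop_body[if_idx + 1..].split_last()?;
    match last {
        Stmt::Update { name, delta } => Some(ContinueSketch {
            carrier_name: name.clone(),
            delta: *delta,
            body_stmts,
            rest_stmts: rest.to_vec(),
        }),
        _ => None,
    }
}
```

Fed the shape of the example loop (an `if` containing an update and `continue`, then `sum = sum + i`, then `i = i + 1`), the sketch reports carrier `i` with delta `1`, an empty body part, and one rest statement. A loop with no `continue` inside an `if` returns `None`.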

#### 2. Canonicalizer Integration
**File**: `src/mir/loop_canonicalizer/canonicalizer.rs`

**Changes**:
- Added `try_extract_continue_pattern()` call before skip_whitespace check
- Build skeleton with continue pattern structure
- Set `ExitContract` with `has_continue=true, has_break=false`
- Route to `Pattern4Continue`

**Skeleton Structure**:
1. HeaderCond - Loop condition
2. Body - Optional body statements before continue check
3. Body - Rest statements (excluding carrier update)
4. Update - Carrier update step
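
The four-step layout above can be sketched as plain data. `SkeletonStep` and `build_continue_skeleton` are hypothetical names, statements are represented as strings for brevity, and the sketch assumes both Body steps are always emitted even when empty; the real `LoopSkeleton` types in `canonicalizer.rs` may differ.

```rust
/// Hypothetical step type for illustration; statements are plain strings here.
#[derive(Debug, Clone, PartialEq)]
pub enum SkeletonStep {
    HeaderCond(String),
    Body(Vec<String>),
    Update { carrier: String, delta: i64 },
}

/// Builds the four steps in order: HeaderCond, Body (before the continue
/// check), Body (rest, without the carrier update), Update.
pub fn build_continue_skeleton(
    cond: &str,
    body_stmts: &[&str],
    rest_stmts: &[&str],
    carrier: &str,
    delta: i64,
) -> Vec<SkeletonStep> {
    vec![
        SkeletonStep::HeaderCond(cond.to_string()),
        SkeletonStep::Body(body_stmts.iter().map(|s| s.to_string()).collect()),
        SkeletonStep::Body(rest_stmts.iter().map(|s| s.to_string()).collect()),
        SkeletonStep::Update {
            carrier: carrier.to_string(),
            delta,
        },
    ]
}
```

For the example loop, a call such as `build_continue_skeleton("i < n", &[], &["sum = sum + i"], "i", 1)` produces four steps, with the first Body step empty because nothing precedes the continue check.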

#### 3. Module Re-exports
**Files Modified** (re-export chain):
- `src/mir/builder/control_flow/joinir/patterns/mod.rs` - Added `detect_continue_pattern`, `ContinuePatternInfo`
- `src/mir/builder/control_flow/joinir/mod.rs` - Re-export to joinir level
- `src/mir/builder/control_flow/mod.rs` - Re-export to control_flow level
- `src/mir/builder.rs` - Re-export to builder level
- `src/mir/mod.rs` - Re-export to crate level

**Pattern**: Followed existing SSOT pattern from Phase 140-P4-A

#### 4. Pattern Recognizer Wrapper
**File**: `src/mir/loop_canonicalizer/pattern_recognizer.rs`

**New Function**: `try_extract_continue_pattern()`
- Delegates to `detect_continue_pattern()` from ast_feature_extractor
- Returns tuple: `(carrier_name, delta, body_stmts, rest_stmts)`
- Maintains backward compatibility with existing callsites

#### 5. Unit Tests
**File**: `src/mir/loop_canonicalizer/canonicalizer.rs`

**Added Test**: `test_simple_continue_pattern_recognized()`
- Builds AST: `loop(i < n) { if is_even { i = i + 1; continue } sum = sum + i; i = i + 1 }`
- Verifies skeleton creation with correct structure
- Checks carrier slot (name="i", delta=1)
- Validates ExitContract (has_continue=true, has_break=false)
- Confirms routing decision (Pattern4Continue, missing_caps=[])

**Test Results**:
```
running 8 tests
test mir::loop_canonicalizer::canonicalizer::tests::test_simple_continue_pattern_recognized ... ok
test result: ok. 8 passed; 0 failed; 0 ignored
```

#### 6. Manual Verification
**Strict Parity Check**:
```bash
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern4_simple_continue.hako
```

**Output**:
```
[loop_canonicalizer] Function: main
[loop_canonicalizer] Skeleton steps: 4
[loop_canonicalizer] Carriers: 1
[loop_canonicalizer] Has exits: true
[loop_canonicalizer] Decision: SUCCESS
[loop_canonicalizer] Chosen pattern: Pattern4Continue
[loop_canonicalizer] Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern4Continue
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern4Continue
```

**Status**: ✅ Strict parity green!

### Design Principles Applied

#### Box-First Modularization
- Created dedicated `detect_continue_pattern()` function in ast_feature_extractor
- Maintained SSOT architecture with proper re-export chain
- Followed existing pattern from skip_whitespace detection

#### Incremental Implementation
- Focused on pattern recognition only (P1 scope)
- Did not modify lowering logic (promotion errors are expected until lowering is extended)
- Kept changes minimal and focused

#### ExitContract Priority
- Pattern choice determined by ExitContract (has_continue=true, has_break=false)
- Routes to Pattern4Continue (not Pattern2 or Pattern3)
- Consistent with existing SSOT policy from Phase 137-5
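
The two ExitContract rules stated in this document (has_break=true routes to Pattern2Break; has_continue=true with has_break=false routes to Pattern4Continue) can be written as a small decision function. The types and the `choose_pattern` name are hypothetical, and returning `None` for every other combination is a choice of this sketch, since the README does not say how those cases are routed.

```rust
/// Illustrative subset of pattern kinds named in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PatternKind {
    Pattern2Break,
    Pattern4Continue,
}

/// Hypothetical, reduced form of the exit contract: only the two flags
/// mentioned in this README.
#[derive(Debug, Clone, Copy, Default)]
pub struct ExitContract {
    pub has_break: bool,
    pub has_continue: bool,
}

/// Routes on the exit flags. Combinations the README does not cover
/// (both flags set, or neither) are left undecided as `None`.
pub fn choose_pattern(exits: ExitContract) -> Option<PatternKind> {
    match (exits.has_break, exits.has_continue) {
        (true, false) => Some(PatternKind::Pattern2Break),
        (false, true) => Some(PatternKind::Pattern4Continue),
        _ => None,
    }
}
```

With these assumptions, the trim loops from P0 (`has_break: true`) map to `Pattern2Break`, and the simple continue loop from P1 (`has_continue: true`) maps to `Pattern4Continue`.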

### Files Modified
1. `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs` (+167 lines, new function)
2. `src/mir/loop_canonicalizer/pattern_recognizer.rs` (+35 lines, wrapper function)
3. `src/mir/loop_canonicalizer/canonicalizer.rs` (+103 lines, continue support + unit test)
4. `src/mir/builder/control_flow/joinir/patterns/mod.rs` (+3 lines, re-export)
5. `src/mir/builder/control_flow/joinir/mod.rs` (+3 lines, re-export)
6. `src/mir/builder/control_flow/mod.rs` (+3 lines, re-export)
7. `src/mir/builder.rs` (+2 lines, re-export)
8. `src/mir/mod.rs` (+2 lines, re-export)

### Statistics
- **Total changes**: +318 lines
- **Unit tests**: 1 new test (100% pass)
- **All canonicalizer tests**: 8 passed (100%)
- **Manual tests**: 1 pattern verified (strict parity green)
- **Build status**: ✅ No errors (warnings are pre-existing)

### SSOT References
- **Design**: `docs/development/current/main/design/loop-canonicalizer.md`
- **JoinIR Architecture**: `docs/development/current/main/joinir-architecture-overview.md`
- **Pattern Detection**: `ast_feature_extractor.rs` (Phase 140-P4-A SSOT)

### Known Limitations
- Pattern4 variable promotion (A-3 Trim, A-4 DigitPos) does not yet handle this pattern
- This is expected - Phase 142 P1 only targets recognizer extension
- Promotion will be addressed when Pattern4 lowering is enhanced

### Next Steps (Future Phases)
- Phase 142 P2: Extend Pattern4 lowering to handle recognized continue patterns
- Phase 142 P3: Add more complex continue patterns (multiple carriers, nested conditions)

### Verification Commands
```bash
# Unit tests
cargo test --release --lib loop_canonicalizer::canonicalizer::tests::test_simple_continue_pattern_recognized

# All canonicalizer tests
cargo test --release --lib loop_canonicalizer::canonicalizer::tests

# Manual verification
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern4_simple_continue.hako
```

### Conclusion
Phase 142 P1 successfully extends the Canonicalizer to recognize continue patterns. The implementation:
- Maintains SSOT architecture
- Passes all unit tests (8/8)
- Achieves strict parity agreement with router
- Preserves existing behavior
- Follows existing re-export pattern from Phase 140-P4-A

All acceptance criteria met. ✅