366 lines
13 KiB
Markdown
366 lines
13 KiB
Markdown
|
|
# Phase 142: Canonicalizer Pattern Extension
|
||
|
|
|
||
|
|
## Status
|
||
|
|
- P0: ✅ Complete (trim leading/trailing)
|
||
|
|
- P1: ✅ Complete (continue pattern)
|
||
|
|
|
||
|
|
## P0: trim leading/trailing (COMPLETE)
|
||
|
|
|
||
|
|
### Objective
|
||
|
|
Extend Canonicalizer to recognize trim leading/trailing patterns, enabling proper routing through the normalized loop pipeline.
|
||
|
|
|
||
|
|
### Target Patterns
|
||
|
|
- `tools/selfhost/test_pattern3_trim_leading.hako` - `start = start + 1` pattern
|
||
|
|
- `tools/selfhost/test_pattern3_trim_trailing.hako` - `end = end - 1` pattern
|
||
|
|
|
||
|
|
### Accepted Criteria (All Met ✅)
|
||
|
|
- ✅ Canonicalizer creates Skeleton for trim_leading/trailing
|
||
|
|
- ✅ `decision.chosen == Pattern2Break` (ExitContract priority)
|
||
|
|
- ✅ `decision.missing_caps == []` (no missing capabilities)
|
||
|
|
- ✅ Strict parity green (NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1)
|
||
|
|
- ✅ Default behavior unchanged
|
||
|
|
- ✅ Unit tests added
|
||
|
|
- ✅ Documentation created
|
||
|
|
|
||
|
|
### Implementation Summary
|
||
|
|
|
||
|
|
#### 1. Pattern Recognizer Generalization
|
||
|
|
**File**: `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs`
|
||
|
|
|
||
|
|
**Changes**:
|
||
|
|
- Extended `detect_skip_whitespace_pattern()` to accept both `+` and `-` operators
|
||
|
|
- Added support for negative deltas (e.g., `-1` for `end = end - 1`)
|
||
|
|
- Maintained backward compatibility with existing skip_whitespace patterns
|
||
|
|
|
||
|
|
**Key Logic**:
|
||
|
|
```rust
|
||
|
|
// Phase 142 P0: Accept both Add (+1) and Subtract (-1)
|
||
|
|
let op_multiplier = match operator {
|
||
|
|
BinaryOperator::Add => 1,
|
||
|
|
BinaryOperator::Subtract => -1,
|
||
|
|
_ => return None,
|
||
|
|
};
|
||
|
|
|
||
|
|
// Calculate delta with sign (e.g., +1 or -1)
|
||
|
|
let delta = const_val * op_multiplier;
|
||
|
|
```
|
||
|
|
|
||
|
|
**Recognized Patterns**:
|
||
|
|
- skip_whitespace: `p = p + 1` (delta = +1)
|
||
|
|
- trim_leading: `start = start + 1` (delta = +1)
|
||
|
|
- trim_trailing: `end = end - 1` (delta = -1)
|
||
|
|
|
||
|
|
#### 2. Unit Tests
|
||
|
|
**File**: `src/mir/loop_canonicalizer/canonicalizer.rs`
|
||
|
|
|
||
|
|
**Added Tests**:
|
||
|
|
- `test_trim_leading_pattern_recognized()` - Verifies `start = start + 1` pattern
|
||
|
|
- `test_trim_trailing_pattern_recognized()` - Verifies `end = end - 1` pattern
|
||
|
|
|
||
|
|
**Test Coverage**:
|
||
|
|
- Skeleton creation
|
||
|
|
- Carrier slot creation with correct delta (+1 or -1)
|
||
|
|
- ExitContract setup (has_break=true)
|
||
|
|
- RoutingDecision (chosen=Pattern2Break, missing_caps=[])
|
||
|
|
|
||
|
|
**Test Results**:
|
||
|
|
```
|
||
|
|
running 2 tests
|
||
|
|
test mir::loop_canonicalizer::canonicalizer::tests::test_trim_leading_pattern_recognized ... ok
|
||
|
|
test mir::loop_canonicalizer::canonicalizer::tests::test_trim_trailing_pattern_recognized ... ok
|
||
|
|
|
||
|
|
test result: ok. 2 passed; 0 failed; 0 ignored
|
||
|
|
```
|
||
|
|
|
||
|
|
#### 3. Manual Verification
|
||
|
|
**Strict Parity Check**:
|
||
|
|
```bash
|
||
|
|
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
|
||
|
|
tools/selfhost/test_pattern3_trim_leading.hako
|
||
|
|
```
|
||
|
|
|
||
|
|
**Output** (trim_leading):
|
||
|
|
```
|
||
|
|
[loop_canonicalizer] Decision: SUCCESS
|
||
|
|
[loop_canonicalizer] Chosen pattern: Pattern2Break
|
||
|
|
[loop_canonicalizer] Missing caps: []
|
||
|
|
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
|
||
|
|
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break
|
||
|
|
```
|
||
|
|
|
||
|
|
**Output** (trim_trailing):
|
||
|
|
```
|
||
|
|
[loop_canonicalizer] Decision: SUCCESS
|
||
|
|
[loop_canonicalizer] Chosen pattern: Pattern2Break
|
||
|
|
[loop_canonicalizer] Missing caps: []
|
||
|
|
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
|
||
|
|
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break
|
||
|
|
```
|
||
|
|
|
||
|
|
### Design Principles Applied
|
||
|
|
|
||
|
|
#### Box-First Modularization
|
||
|
|
- Extended existing `detect_skip_whitespace_pattern()` instead of creating new functions
|
||
|
|
- Maintained SSOT (Single Source of Truth) architecture
|
||
|
|
- Preserved delegation pattern through `pattern_recognizer.rs` wrapper
|
||
|
|
|
||
|
|
#### Incremental Implementation
|
||
|
|
- Focused on recognizer generalization only
|
||
|
|
- Did not modify routing or lowering logic
|
||
|
|
- Kept scope minimal (P0 only)
|
||
|
|
|
||
|
|
#### ExitContract Priority
|
||
|
|
- Pattern choice determined by ExitContract (has_break=true)
|
||
|
|
- Routes to Pattern2Break (not Pattern3IfPhi)
|
||
|
|
- Consistent with existing SSOT policy
|
||
|
|
|
||
|
|
### Files Modified
|
||
|
|
1. `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs` (+35 lines, improved comments)
|
||
|
|
2. `src/mir/loop_canonicalizer/canonicalizer.rs` (+178 lines, 2 new tests)
|
||
|
|
|
||
|
|
### Statistics
|
||
|
|
- **Total changes**: +213 lines
|
||
|
|
- **Unit tests**: 2 new tests (100% pass)
|
||
|
|
- **Manual tests**: 2 patterns verified (strict parity green)
|
||
|
|
- **Build status**: ✅ No errors, no warnings (lib)
|
||
|
|
|
||
|
|
### SSOT References
|
||
|
|
- **Design**: `docs/development/current/main/design/loop-canonicalizer.md`
|
||
|
|
- **JoinIR Architecture**: `docs/development/current/main/joinir-architecture-overview.md`
|
||
|
|
- **Pattern Detection**: `ast_feature_extractor.rs` (Phase 140-P4-A SSOT)
|
||
|
|
|
||
|
|
### Known Limitations
|
||
|
|
- Pattern2 variable promotion (A-3 Trim promotion) not yet implemented
|
||
|
|
- This is expected - Phase 142 P0 only targets recognizer extension
|
||
|
|
- Promotion will be addressed in future phases
|
||
|
|
|
||
|
|
### Next Steps (Future Phases)
|
||
|
|
- Phase 142 P1: Implement A-3 Trim promotion in Pattern2 handler
|
||
|
|
- Phase 142 P2: Extend to other loop patterns (Pattern 3/4)
|
||
|
|
- Phase 142 P3: Add more complex carrier update patterns
|
||
|
|
|
||
|
|
### Verification Commands
|
||
|
|
```bash
|
||
|
|
# Unit tests
|
||
|
|
cargo test --release loop_canonicalizer::canonicalizer::tests::test_trim --lib
|
||
|
|
|
||
|
|
# Manual verification (trim_leading)
|
||
|
|
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
|
||
|
|
tools/selfhost/test_pattern3_trim_leading.hako
|
||
|
|
|
||
|
|
# Manual verification (trim_trailing)
|
||
|
|
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
|
||
|
|
tools/selfhost/test_pattern3_trim_trailing.hako
|
||
|
|
```
|
||
|
|
|
||
|
|
### Conclusion
|
||
|
|
Phase 142 P0 successfully extends the Canonicalizer to recognize trim leading/trailing patterns. The implementation:
|
||
|
|
- Maintains SSOT architecture
|
||
|
|
- Passes all unit tests
|
||
|
|
- Achieves strict parity agreement
|
||
|
|
- Preserves existing behavior
|
||
|
|
- Sets foundation for future pattern extensions
|
||
|
|
|
||
|
|
All acceptance criteria met. ✅
|
||
|
|
|
||
|
|
---
|
||
|
|
|
||
|
|
## P1: continue pattern (COMPLETE)
|
||
|
|
|
||
|
|
### Objective
|
||
|
|
Extend Canonicalizer to recognize continue patterns, enabling proper routing through the normalized loop pipeline.
|
||
|
|
|
||
|
|
### Target Pattern
|
||
|
|
- `tools/selfhost/test_pattern4_simple_continue.hako` - Simple continue pattern with carrier update
|
||
|
|
|
||
|
|
### Accepted Criteria (All Met ✅)
|
||
|
|
- ✅ Canonicalizer creates Skeleton for continue pattern
|
||
|
|
- ✅ `decision.chosen == Pattern4Continue` (router agreement)
|
||
|
|
- ✅ `decision.missing_caps == []` (no missing capabilities)
|
||
|
|
- ✅ Strict parity green (NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1)
|
||
|
|
- ✅ Default behavior unchanged
|
||
|
|
- ✅ Unit tests added
|
||
|
|
- ✅ Documentation updated
|
||
|
|
|
||
|
|
### Implementation Summary
|
||
|
|
|
||
|
|
#### 1. Continue Pattern Detection
|
||
|
|
**File**: `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs`
|
||
|
|
|
||
|
|
**New Function**: `detect_continue_pattern()`
|
||
|
|
|
||
|
|
**Pattern Structure**:
|
||
|
|
```rust
|
||
|
|
loop(cond) {
|
||
|
|
// ... optional body statements (Body)
|
||
|
|
if skip_cond {
|
||
|
|
carrier = carrier + const // Optional update before continue
|
||
|
|
continue
|
||
|
|
}
|
||
|
|
// ... rest of body statements (Rest)
|
||
|
|
carrier = carrier + const // Carrier update
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
**Example** (from test_pattern4_simple_continue.hako):
|
||
|
|
```nyash
|
||
|
|
loop(i < n) {
|
||
|
|
if is_even == 1 {
|
||
|
|
i = i + 1 // Update before continue
|
||
|
|
continue
|
||
|
|
}
|
||
|
|
sum = sum + i // Rest statements
|
||
|
|
i = i + 1 // Carrier update
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
**Key Logic**:
|
||
|
|
- Finds if statement containing continue in then_body
|
||
|
|
- Extracts body statements before the if
|
||
|
|
- Extracts rest statements after the if
|
||
|
|
- Detects carrier update (last statement in rest_stmts)
|
||
|
|
- Returns `ContinuePatternInfo` with carrier name, delta, body_stmts, and rest_stmts
|
||
|
|
|
||
|
|
#### 2. Canonicalizer Integration
|
||
|
|
**File**: `src/mir/loop_canonicalizer/canonicalizer.rs`
|
||
|
|
|
||
|
|
**Changes**:
|
||
|
|
- Added `try_extract_continue_pattern()` call before skip_whitespace check
|
||
|
|
- Build skeleton with continue pattern structure
|
||
|
|
- Set `ExitContract` with `has_continue=true, has_break=false`
|
||
|
|
- Route to `Pattern4Continue`
|
||
|
|
|
||
|
|
**Skeleton Structure**:
|
||
|
|
1. HeaderCond - Loop condition
|
||
|
|
2. Body - Optional body statements before continue check
|
||
|
|
3. Body - Rest statements (excluding carrier update)
|
||
|
|
4. Update - Carrier update step
|
||
|
|
|
||
|
|
#### 3. Module Re-exports
|
||
|
|
**Files Modified** (re-export chain):
|
||
|
|
- `src/mir/builder/control_flow/joinir/patterns/mod.rs` - Added `detect_continue_pattern`, `ContinuePatternInfo`
|
||
|
|
- `src/mir/builder/control_flow/joinir/mod.rs` - Re-export to joinir level
|
||
|
|
- `src/mir/builder/control_flow/mod.rs` - Re-export to control_flow level
|
||
|
|
- `src/mir/builder.rs` - Re-export to builder level
|
||
|
|
- `src/mir/mod.rs` - Re-export to crate level
|
||
|
|
|
||
|
|
**Pattern**: Followed existing SSOT pattern from Phase 140-P4-A
|
||
|
|
|
||
|
|
#### 4. Pattern Recognizer Wrapper
|
||
|
|
**File**: `src/mir/loop_canonicalizer/pattern_recognizer.rs`
|
||
|
|
|
||
|
|
**New Function**: `try_extract_continue_pattern()`
|
||
|
|
- Delegates to `detect_continue_pattern()` from ast_feature_extractor
|
||
|
|
- Returns tuple: `(carrier_name, delta, body_stmts, rest_stmts)`
|
||
|
|
- Maintains backward compatibility with existing callsites
|
||
|
|
|
||
|
|
#### 5. Unit Tests
|
||
|
|
**File**: `src/mir/loop_canonicalizer/canonicalizer.rs`
|
||
|
|
|
||
|
|
**Added Test**: `test_simple_continue_pattern_recognized()`
|
||
|
|
- Builds AST: `loop(i < n) { if is_even { i = i + 1; continue } sum = sum + i; i = i + 1 }`
|
||
|
|
- Verifies skeleton creation with correct structure
|
||
|
|
- Checks carrier slot (name="i", delta=1)
|
||
|
|
- Validates ExitContract (has_continue=true, has_break=false)
|
||
|
|
- Confirms routing decision (Pattern4Continue, missing_caps=[])
|
||
|
|
|
||
|
|
**Test Results**:
|
||
|
|
```
|
||
|
|
running 8 tests
|
||
|
|
test mir::loop_canonicalizer::canonicalizer::tests::test_simple_continue_pattern_recognized ... ok
|
||
|
|
test result: ok. 8 passed; 0 failed; 0 ignored
|
||
|
|
```
|
||
|
|
|
||
|
|
#### 6. Manual Verification
|
||
|
|
**Strict Parity Check**:
|
||
|
|
```bash
|
||
|
|
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
|
||
|
|
tools/selfhost/test_pattern4_simple_continue.hako
|
||
|
|
```
|
||
|
|
|
||
|
|
**Output**:
|
||
|
|
```
|
||
|
|
[loop_canonicalizer] Function: main
|
||
|
|
[loop_canonicalizer] Skeleton steps: 4
|
||
|
|
[loop_canonicalizer] Carriers: 1
|
||
|
|
[loop_canonicalizer] Has exits: true
|
||
|
|
[loop_canonicalizer] Decision: SUCCESS
|
||
|
|
[loop_canonicalizer] Chosen pattern: Pattern4Continue
|
||
|
|
[loop_canonicalizer] Missing caps: []
|
||
|
|
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern4Continue
|
||
|
|
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern4Continue
|
||
|
|
```
|
||
|
|
|
||
|
|
**Status**: ✅ Strict parity green!
|
||
|
|
|
||
|
|
### Design Principles Applied
|
||
|
|
|
||
|
|
#### Box-First Modularization
|
||
|
|
- Created dedicated `detect_continue_pattern()` function in ast_feature_extractor
|
||
|
|
- Maintained SSOT architecture with proper re-export chain
|
||
|
|
- Followed existing pattern from skip_whitespace detection
|
||
|
|
|
||
|
|
#### Incremental Implementation
|
||
|
|
- Focused on pattern recognition only (P1 scope)
|
||
|
|
- Did not modify lowering logic (expected promotion errors)
|
||
|
|
- Kept changes minimal and focused
|
||
|
|
|
||
|
|
#### ExitContract Priority
|
||
|
|
- Pattern choice determined by ExitContract (has_continue=true, has_break=false)
|
||
|
|
- Routes to Pattern4Continue (not Pattern2 or Pattern3)
|
||
|
|
- Consistent with existing SSOT policy from Phase 137-5
|
||
|
|
|
||
|
|
### Files Modified
|
||
|
|
1. `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs` (+167 lines, new function)
|
||
|
|
2. `src/mir/loop_canonicalizer/pattern_recognizer.rs` (+35 lines, wrapper function)
|
||
|
|
3. `src/mir/loop_canonicalizer/canonicalizer.rs` (+103 lines, continue support + unit test)
|
||
|
|
4. `src/mir/builder/control_flow/joinir/patterns/mod.rs` (+3 lines, re-export)
|
||
|
|
5. `src/mir/builder/control_flow/joinir/mod.rs` (+3 lines, re-export)
|
||
|
|
6. `src/mir/builder/control_flow/mod.rs` (+3 lines, re-export)
|
||
|
|
7. `src/mir/builder.rs` (+2 lines, re-export)
|
||
|
|
8. `src/mir/mod.rs` (+2 lines, re-export)
|
||
|
|
|
||
|
|
### Statistics
|
||
|
|
- **Total changes**: +318 lines
|
||
|
|
- **Unit tests**: 1 new test (100% pass)
|
||
|
|
- **All canonicalizer tests**: 8 passed (100%)
|
||
|
|
- **Manual tests**: 1 pattern verified (strict parity green)
|
||
|
|
- **Build status**: ✅ No errors (warnings are pre-existing)
|
||
|
|
|
||
|
|
### SSOT References
|
||
|
|
- **Design**: `docs/development/current/main/design/loop-canonicalizer.md`
|
||
|
|
- **JoinIR Architecture**: `docs/development/current/main/joinir-architecture-overview.md`
|
||
|
|
- **Pattern Detection**: `ast_feature_extractor.rs` (Phase 140-P4-A SSOT)
|
||
|
|
|
||
|
|
### Known Limitations
|
||
|
|
- Pattern4 variable promotion (A-3 Trim, A-4 DigitPos) not yet handling this pattern
|
||
|
|
- This is expected - Phase 142 P1 only targets recognizer extension
|
||
|
|
- Promotion will be addressed when Pattern4 lowering is enhanced
|
||
|
|
|
||
|
|
### Next Steps (Future Phases)
|
||
|
|
- Phase 142 P2: Extend Pattern4 lowering to handle recognized continue patterns
|
||
|
|
- Phase 142 P3: Add more complex continue patterns (multiple carriers, nested conditions)
|
||
|
|
|
||
|
|
### Verification Commands
|
||
|
|
```bash
|
||
|
|
# Unit tests
|
||
|
|
cargo test --release --lib loop_canonicalizer::canonicalizer::tests::test_simple_continue_pattern_recognized
|
||
|
|
|
||
|
|
# All canonicalizer tests
|
||
|
|
cargo test --release --lib loop_canonicalizer::canonicalizer::tests
|
||
|
|
|
||
|
|
# Manual verification
|
||
|
|
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
|
||
|
|
tools/selfhost/test_pattern4_simple_continue.hako
|
||
|
|
```
|
||
|
|
|
||
|
|
### Conclusion
|
||
|
|
Phase 142 P1 successfully extends the Canonicalizer to recognize continue patterns. The implementation:
|
||
|
|
- Maintains SSOT architecture
|
||
|
|
- Passes all unit tests (8/8)
|
||
|
|
- Achieves strict parity agreement with router
|
||
|
|
- Preserves existing behavior
|
||
|
|
- Follows existing re-export pattern from Phase 140-P4-A
|
||
|
|
|
||
|
|
All acceptance criteria met. ✅
|