feat(canonicalizer): Phase 143-P0 - parse_number pattern support

Add parse_number pattern recognition to canonicalizer, expanding adaptation
range for digit collection loops with break in THEN clause.

## Changes

### New Recognizer (ast_feature_extractor.rs)
- `detect_parse_number_pattern()`: Detects `if invalid { break }` pattern
- `ParseNumberInfo`: Struct for extracted pattern info
- ~150 lines added

### Canonicalizer Integration (canonicalizer.rs)
- Parse_number pattern detection before skip_whitespace
- LoopSkeleton construction with 4 steps (Header + Body x2 + Update)
- Routes to Pattern2Break (has_break=true)
- ~60 lines modified

### Export Chain (6 files)
- patterns/mod.rs → joinir/mod.rs → control_flow/mod.rs
- builder.rs → mir/mod.rs
- 8 lines total

### Tests
- `test_parse_number_pattern_recognized()`: Unit test for recognition
- Strict parity verification: GREEN (canonical and router agree)
- ~130 lines added

## Pattern Comparison

| Aspect | Skip Whitespace | Parse Number |
|--------|----------------|--------------|
| Break location | ELSE clause | THEN clause |
| Pattern | `if cond { update } else { break }` | `if invalid { break } rest... update` |
| Body after if | None | Required (result append) |

## Results

-  Skeleton creation successful
-  RoutingDecision matches router (Pattern2Break)
-  Strict parity OK (canonicalizer ↔ router agreement)
-  Unit test PASS
-  Manual test: test_pattern2_parse_number.hako executes correctly

## Statistics

- New patterns: 1 (parse_number)
- Total patterns: 3 (skip_whitespace, parse_number, continue)
- Lines added: ~280
- Files modified: 8
- Parity status: Green 

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
nyash-codex
2025-12-16 09:08:37 +09:00
parent 521e58d061
commit d605611a16
12 changed files with 2119 additions and 17 deletions

View File

@ -0,0 +1,218 @@
# Phase 142 P0: Implementation Summary
## Overview
Successfully extended the Canonicalizer to recognize trim leading/trailing patterns by generalizing the skip_whitespace pattern detector.
## Acceptance Criteria Status
✅ All criteria met:
- ✅ Canonicalizer creates Skeleton for trim_leading/trailing
- ✅ RoutingDecision.chosen == Pattern2Break (ExitContract priority)
- ✅ decision.missing_caps == [] (no missing capabilities)
- ✅ Strict parity green (both test files)
- ✅ Default behavior unchanged
- ✅ Unit tests added (2 new tests)
- ✅ Documentation created
## Files Modified
### 1. ast_feature_extractor.rs (+31 lines, -11 lines)
**Path**: `src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs`
**Key Changes**:
- Generalized `detect_skip_whitespace_pattern()` to accept both `Add` and `Subtract` operators
- Added `op_multiplier` logic to handle +1 and -1 deltas
- Updated documentation to reflect support for trim patterns
- Maintained SSOT architecture
**Core Logic**:
```rust
// Phase 142 P0: Accept both Add (+1) and Subtract (-1)
let op_multiplier = match operator {
BinaryOperator::Add => 1,
BinaryOperator::Subtract => -1,
_ => return None,
};
// Calculate delta with sign (e.g., +1 or -1)
let delta = const_val * op_multiplier;
```
### 2. canonicalizer.rs (+184 lines, -1 line)
**Path**: `src/mir/loop_canonicalizer/canonicalizer.rs`
**Key Changes**:
- Added `test_trim_leading_pattern_recognized()` unit test
- Added `test_trim_trailing_pattern_recognized()` unit test
- Fixed `test_skip_whitespace_fails_with_wrong_delta()` to use `Multiply` operator (clearer semantics)
**Test Coverage**:
- Skeleton structure verification
- Carrier slot verification (with correct delta sign)
- ExitContract verification
- RoutingDecision verification
### 3. pattern_recognizer.rs (-1 line)
**Path**: `src/mir/loop_canonicalizer/pattern_recognizer.rs`
**Key Changes**:
- Removed unused `SkipWhitespaceInfo` import
- Code cleanup only, no functional changes
## Statistics
### Code Changes
- **Total files modified**: 3
- **Total lines changed**: +206, -11
- **Net addition**: +195 lines
- **Test lines**: +178 lines (91% of changes)
### Test Results
```
Unit Tests:
running 7 tests
test result: ok. 7 passed; 0 failed; 0 ignored
Manual Verification (trim_leading):
[loop_canonicalizer] Decision: SUCCESS
[loop_canonicalizer] Chosen pattern: Pattern2Break
[loop_canonicalizer] Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break
Manual Verification (trim_trailing):
[loop_canonicalizer] Decision: SUCCESS
[loop_canonicalizer] Chosen pattern: Pattern2Break
[loop_canonicalizer] Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break
```
### Build Status
- ✅ Compilation: Success (no errors)
- ✅ Warnings: 0 new warnings (unused import fixed)
- ✅ Formatting: Applied (cargo fmt)
## Design Principles Applied
### 1. Box-First Modularization
- Extended existing function instead of creating new ones
- Maintained SSOT pattern
- Preserved delegation architecture
### 2. Incremental Implementation
- Minimal scope (recognizer only)
- No changes to routing or lowering logic
- P0 focus maintained
### 3. ExitContract Priority
- Pattern choice determined by ExitContract
- has_break=true → Pattern2Break
- Consistent with existing policy
### 4. Fail-Fast Principle
- Clear error messages
- No fallback logic
- Explicit pattern matching
### 5. Source Code Quality
- Clean, well-documented code
- Comprehensive comments
- Consistent formatting
## Recognized Patterns
### Before Phase 142
```rust
// Only: p = p + 1
if is_ws { p = p + 1 } else { break }
```
### After Phase 142 P0
```rust
// Pattern 1: skip_whitespace (original)
if is_ws { p = p + 1 } else { break }
// Pattern 2: trim_leading (new)
if is_ws { start = start + 1 } else { break }
// Pattern 3: trim_trailing (new)
if is_ws { end = end - 1 } else { break }
```
## Known Limitations
### Expected Behavior
- Pattern2 variable promotion (A-3 Trim promotion) not implemented
- This is intentional - Phase 142 P0 only targets recognizer extension
- Promotion will be addressed in future phases
### No Impact On
- Default behavior (unchanged)
- Existing patterns (backward compatible)
- Performance (minimal overhead)
## Verification Commands
### Unit Tests
```bash
cargo test --release loop_canonicalizer::canonicalizer::tests --lib
# Expected: 7 passed
```
### Manual Tests
```bash
# trim_leading
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
tools/selfhost/test_pattern3_trim_leading.hako
# Expected: [choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
# trim_trailing
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
tools/selfhost/test_pattern3_trim_trailing.hako
# Expected: [choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
```
### Diff Statistics
```bash
git diff --stat
# Expected:
# .../joinir/patterns/ast_feature_extractor.rs | 31 +++-
# src/mir/loop_canonicalizer/canonicalizer.rs | 184 +++++++++++++++++-
# src/mir/loop_canonicalizer/pattern_recognizer.rs | 2 +-
# 3 files changed, 206 insertions(+), 11 deletions(-)
```
## Next Steps
### Phase 142 P1 (Future)
- Implement A-3 Trim promotion in Pattern2 handler
- Enable full execution of trim patterns
- Address variable promotion issues
### Phase 142 P2 (Future)
- Extend to Pattern 3/4 routing
- Support more complex carrier updates
- Generalize condition patterns
### Phase 142 P3 (Future)
- Multi-carrier trim patterns
- Nested trim patterns
- Performance optimizations
## Conclusion
Phase 142 P0 successfully achieved all objectives:
- Recognizer generalization complete ✅
- Unit tests passing ✅
- Strict parity green ✅
- Documentation complete ✅
- Code quality maintained ✅
The implementation follows all design principles, maintains SSOT architecture, and sets a solid foundation for future pattern extensions.
---
**Implementation Date**: 2025-12-16
**Status**: ✅ Complete
**Tests**: 7/7 passing
**Parity**: Green
**Documentation**: Complete