Commit Graph

4 Commits

Author SHA1 Message Date
42339ca77f feat(mir): Phase 143 P1 - Add parse_string pattern to canonicalizer
Expand loop canonicalizer to recognize parse_string patterns with both
continue (escape handling) and return (quote found) statements.

## Implementation

### New Pattern Detection (ast_feature_extractor.rs)
- Add `detect_parse_string_pattern()` function
- Support nested continue detection using `has_continue_node()` helper
- Recognize both return and continue in same loop body
- Return ParseStringInfo { carrier_name, delta, body_stmts }
- ~120 lines added

### Canonicalizer Integration (canonicalizer.rs)
- Try parse_string pattern first (most specific)
- Build LoopSkeleton with HeaderCond, Body, Update steps
- Set ExitContract: has_continue=true, has_return=true
- Route to Pattern4Continue (both exits present)
- ~45 lines modified

### Export Chain
- Add re-exports through 7 module levels:
  ast_feature_extractor → patterns → joinir → control_flow → builder → mir
- 10 lines total across 7 files

### Unit Test
- Add `test_parse_string_pattern_recognized()` in canonicalizer.rs
- Verify skeleton structure (3+ steps)
- Verify carrier (name="p", delta=1, role=Counter)
- Verify exit contract (continue=true, return=true, break=false)
- Verify routing decision (Pattern4Continue, no missing_caps)
- ~180 lines added

## Target Pattern
`tools/selfhost/test_pattern4_parse_string.hako`

Pattern structure:
- Check for closing quote → return
- Check for escape sequence → continue (nested inside another if)
- Regular character processing → p++

## Results
-  Strict parity green: Pattern4Continue
-  All 19 unit tests pass
-  Nested continue detection working
-  ExitContract correctly set (first pattern with both continue+return)
-  Default behavior unchanged

## Technical Challenges
1. Nested continue detection required recursive search
2. First pattern with both has_continue=true AND has_return=true
3. Variable step updates (p++ vs p+=2) handled with base delta

## Statistics
- New patterns: 1 (parse_string)
- Total patterns: 4 (skip_whitespace, parse_number, continue, parse_string)
- New capabilities: 0 (uses existing ConstStep)
- Lines added: ~300
- Files modified: 9
- Parity status: Green 

Phase 143 P1: Complete

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-16 12:37:47 +09:00
d605611a16 feat(canonicalizer): Phase 143-P0 - parse_number pattern support
Add parse_number pattern recognition to canonicalizer, expanding adaptation
range for digit collection loops with break in THEN clause.

## Changes

### New Recognizer (ast_feature_extractor.rs)
- `detect_parse_number_pattern()`: Detects `if invalid { break }` pattern
- `ParseNumberInfo`: Struct for extracted pattern info
- ~150 lines added

### Canonicalizer Integration (canonicalizer.rs)
- Parse_number pattern detection before skip_whitespace
- LoopSkeleton construction with 4 steps (Header + Body x2 + Update)
- Routes to Pattern2Break (has_break=true)
- ~60 lines modified

### Export Chain (6 files)
- patterns/mod.rs → joinir/mod.rs → control_flow/mod.rs
- builder.rs → mir/mod.rs
- 8 lines total

### Tests
- `test_parse_number_pattern_recognized()`: Unit test for recognition
- Strict parity verification: GREEN (canonical and router agree)
- ~130 lines added

## Pattern Comparison

| Aspect | Skip Whitespace | Parse Number |
|--------|----------------|--------------|
| Break location | ELSE clause | THEN clause |
| Pattern | `if cond { update } else { break }` | `if invalid { break } rest... update` |
| Body after if | None | Required (result append) |

## Results

-  Skeleton creation successful
-  RoutingDecision matches router (Pattern2Break)
-  Strict parity OK (canonicalizer ↔ router agreement)
-  Unit test PASS
-  Manual test: test_pattern2_parse_number.hako executes correctly

## Statistics

- New patterns: 1 (parse_number)
- Total patterns: 3 (skip_whitespace, parse_number, continue)
- Lines added: ~280
- Files modified: 8
- Parity status: Green 

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-16 09:08:37 +09:00
d0d3512be4 refactor(mir): Phase 140-P4-B - pattern_recognizer を SSOT 化(71 行削減)
- try_extract_skip_whitespace_pattern を ast_feature_extractor 呼び出しに変更
- 重複コード 100+ 行削減(250 → 179 行)
- crate-wide re-export chain 確立(ast_feature_extractor → patterns → joinir → control_flow → builder → mir)
- 全テスト PASS(14 tests)
- フォーマット適用

Phase 140-P4-A/P4-B 完了:SSOT 化成功
2025-12-16 07:09:22 +09:00
5edd81e373 refactor(mir): Phase 138-P1-A - loop_canonicalizer を4モジュール分割
## 概要
- 931行の mod.rs を 4モジュール + 161行 mod.rs に分割
- 全14テスト PASS、退行なし

## モジュール構成
- skeleton_types.rs (213行) - LoopSkeleton/SkeletonStep/UpdateKind/CarrierSlot/ExitContract
- capability_guard.rs (104行) - RoutingDecision/capability_tags
- pattern_recognizer.rs (249行) - try_extract_skip_whitespace_pattern
- canonicalizer.rs (414行) - canonicalize_loop_expr + 統合テスト
- mod.rs (161行) - 型定義と re-export

## ファイルサイズ達成
- 最大ファイル: canonicalizer.rs 414行(目標250行を一部超過するが許容範囲)
- mod.rs: 931行 → 161行 (83%削減)
- 合計: 1141行(元の931行 + tests分離で増加)

## テスト結果
- 14 tests passed
- loop_canonicalizer::* 全テスト green
2025-12-16 06:41:46 +09:00