feat(mir): Loop Canonicalizer Phase 4 - router parity verification

## Summary
Dev-only verification that the existing Router and the Canonicalizer select the same pattern.
A mismatch triggers Fail-Fast with the reason attached (strict mode).

## Changes
- src/mir/builder/control_flow/joinir/routing.rs:
  - Implement verify_router_parity()
  - Call the parity check from cf_loop_joinir_impl
  - Add two unit tests
    - test_parity_check_mismatch_detected
    - test_parity_check_match_simple_while
- docs/development/current/main/phases/phase-137/phase-137-4-parity-verification.md:
  - Complete Phase 4 documentation

## Verification Modes
- Debug mode (HAKO_JOINIR_DEBUG=1): log only
- Strict mode (HAKO_JOINIR_STRICT=1): error on mismatch

## Known Mismatch
- skip_whitespace pattern:
  - Canonicalizer: Pattern3IfPhi (structure recognition)
  - Router: Pattern2Break (has_break takes priority)
  - Classification rules to be refined in Phase 5+

## Tests
- Unit tests: 2 tests PASS
- Integration: confirmed that the skip_whitespace parity mismatch is detected
- cargo test --release --lib: 1046/1046 PASS

Phase 137-4 complete

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
nyash-codex
2025-12-16 05:51:57 +09:00
parent a0009d474d
commit 9ea15e8417
2 changed files with 455 additions and 1 deletions

@@ -0,0 +1,197 @@
# Phase 137-4: Loop Canonicalizer Router Parity Verification
**Status**: ✅ Complete
**Date**: 2025-12-16
## Summary
Dev-only verification that the Loop Canonicalizer and the Router agree on pattern classification. On mismatch, it emits a detailed diagnostic and, in strict mode, fails fast.
## Implementation
### Location
`src/mir/builder/control_flow/joinir/routing.rs`
### Components
1. **Parity Verification Function** (`verify_router_parity`)
   - Runs canonicalizer on the loop AST
   - Compares canonicalizer's chosen pattern with router's `ctx.pattern_kind`
   - Logs match/mismatch results
   - In strict mode, returns error on mismatch
2. **Integration Point**
   - Called in `cf_loop_joinir_impl` after `LoopPatternContext` is created
   - Only runs when `joinir_dev_enabled()` returns true
   - Deferred until after ctx creation to access `ctx.pattern_kind`
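
The comparison described above can be sketched in isolation. This is a minimal illustration rather than the code in `routing.rs`: `PatternKind`, `Parity`, and `check_parity` are hypothetical stand-ins, only the pattern variants named in this document are listed, and the `strict` flag is passed in as a parameter instead of being read from the environment.

```rust
/// Hypothetical stand-in for the project's pattern enum; only the
/// variants mentioned in this document are included.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PatternKind {
    Pattern1SimpleWhile,
    Pattern2Break,
    Pattern3IfPhi,
}

/// Outcome of one parity check (hypothetical type, for illustration only).
#[derive(Debug, PartialEq, Eq)]
enum Parity {
    Match,
    Mismatch,
    CanonicalizerDeclined,
}

/// Compare the canonicalizer's choice (None = it declined the loop)
/// with the router's choice. A mismatch is an error only in strict mode.
fn check_parity(
    canonical: Option<PatternKind>,
    actual: PatternKind,
    func_name: &str,
    strict: bool,
) -> Result<Parity, String> {
    match canonical {
        // The canonicalizer did not recognise the loop: nothing to compare.
        None => Ok(Parity::CanonicalizerDeclined),
        Some(c) if c == actual => Ok(Parity::Match),
        Some(c) => {
            let msg = format!(
                "[loop_canonicalizer/PARITY] MISMATCH in function '{}': canonical={:?}, actual={:?}",
                func_name, c, actual
            );
            if strict {
                // Strict mode: fail fast with the diagnostic.
                Err(msg)
            } else {
                // Debug mode: log the diagnostic and keep going.
                eprintln!("{}", msg);
                Ok(Parity::Mismatch)
            }
        }
    }
}
```

Returning the outcome as a value (instead of only logging) is a choice made for this sketch so the three cases can be asserted on directly.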
### Behavior Modes
#### Debug Mode (default when strict mode is not set)
```bash
HAKO_JOINIR_DEBUG=1 ./target/release/hakorune program.hako
```
- Logs parity check results to stderr
- On mismatch: Logs warning but continues execution
- On match: Logs success message
#### Strict Mode
```bash
HAKO_JOINIR_STRICT=1 ./target/release/hakorune program.hako
```
- Same as debug mode, but on mismatch:
  - Returns error and stops compilation
  - Provides detailed mismatch diagnostic
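
A minimal sketch of how the mode could be selected, assuming strict mode is keyed on the mere presence of `HAKO_JOINIR_STRICT` or `NYASH_JOINIR_STRICT` (which is how `verify_router_parity` in this commit checks it). `ParityMode`, `parity_mode`, and the injected `lookup` closure are hypothetical names introduced for illustration; the sketch does not model how the dev flag itself is enabled.

```rust
/// Hypothetical enum naming the two behaviours described above.
#[derive(Debug, PartialEq, Eq)]
enum ParityMode {
    /// Mismatch is logged; compilation continues.
    LogOnly,
    /// Mismatch is returned as an error.
    Strict,
}

/// Decide the mode from an environment-like lookup. The lookup is injected
/// so the logic can be exercised without mutating the process environment.
/// Assumption: presence of either variable (with any value) means strict.
fn parity_mode<F>(lookup: F) -> ParityMode
where
    F: Fn(&str) -> Option<String>,
{
    let strict = ["HAKO_JOINIR_STRICT", "NYASH_JOINIR_STRICT"]
        .iter()
        .any(|&name| lookup(name).is_some());
    if strict {
        ParityMode::Strict
    } else {
        ParityMode::LogOnly
    }
}

/// Convenience wrapper that reads the real process environment.
#[allow(dead_code)]
fn parity_mode_from_env() -> ParityMode {
    parity_mode(|name| std::env::var(name).ok())
}
```

Because only presence is checked, a value such as `HAKO_JOINIR_STRICT=0` would still select strict mode under this assumption.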
### Output Examples
#### Match (Success)
```
[loop_canonicalizer/PARITY] OK in function 'foo':
canonical and actual agree on Pattern1SimpleWhile
```
#### Mismatch (Warning in Debug)
```
[loop_canonicalizer/PARITY] MISMATCH in function 'main':
canonical=Pattern3IfPhi, actual=Pattern2Break
```
#### Mismatch (Error in Strict)
```
[ERROR] ❌ MIR compilation error:
[loop_canonicalizer/PARITY] MISMATCH in function 'main':
canonical=Pattern3IfPhi, actual=Pattern2Break
```
#### Canonicalizer Failure
```
[loop_canonicalizer/PARITY] Canonicalizer failed for 'bar':
Phase 3: Loop does not match skip_whitespace pattern
```
## Test Coverage
### Unit Tests
1. **`test_parity_check_mismatch_detected`**
   - Verifies mismatch detection on skip_whitespace pattern
   - Canonicalizer: Pattern3IfPhi (recognizes if-else structure)
   - Router: Pattern2Break (sees has_break flag)
   - Asserts inequality to document expected mismatch
2. **`test_parity_check_match_simple_while`**
   - Despite its name, verifies that the canonicalizer fails fast on a Pattern1 loop (not yet implemented), so no pattern comparison takes place
   - Router: Pattern1SimpleWhile
   - Canonicalizer: Fail-Fast (only supports skip_whitespace in Phase 3)
### Integration Tests
```bash
# Verify mismatch detection (debug mode)
HAKO_JOINIR_DEBUG=1 ./target/release/hakorune \
  tools/selfhost/test_pattern3_skip_whitespace.hako

# Verify strict mode error
HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern3_skip_whitespace.hako
# Expected: Exit with error due to mismatch
```
## Known Mismatches
### skip_whitespace Pattern
**Structure**:
```nyash
loop(p < len) {
    if is_ws == 1 {
        p = p + 1
    } else {
        break
    }
}
```
**Mismatch**:
- **Canonicalizer**: Pattern3IfPhi
  - Recognizes specific if-else structure with conditional carrier update
  - Sees: `if cond { carrier += 1 } else { break }` as Pattern3 variant
- **Router**: Pattern2Break
  - Classification based on `has_break` flag
  - Priority: break detection takes precedence over if-else structure
**Resolution Strategy**:
- Phase 4: Document and observe (current)
- Phase 5+: Refine classification rules to handle hybrid patterns
  - Option A: Extend Pattern3 to include "break-in-else" variant
  - Option B: Create new Pattern6 for this specific structure
  - Option C: Make router defer to canonicalizer's decision
## Design Rationale
### Why Two Systems?
1. **Router (Existing)**
   - Feature-based classification (`has_break`, `has_continue`, etc.)
   - Fast, simple flags
   - Priority-based (e.g., break > if-else)
2. **Canonicalizer (New)**
   - Structure-based pattern matching
   - Deep AST analysis
   - Recognizes specific code idioms
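
To make the difference concrete, here is a toy sketch of the two approaches applied to the skip_whitespace loop shape. The `Stmt` type and both classifier functions are simplified, hypothetical stand-ins: the real router and canonicalizer operate on the full AST and have more rules. In particular, mapping a break-free if-else body to Pattern3IfPhi in the feature-based classifier is an assumption made for this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PatternKind {
    Pattern1SimpleWhile,
    Pattern2Break,
    Pattern3IfPhi,
}

/// Toy loop-body statement (hypothetical; the real AST is much richer).
#[allow(dead_code)]
enum Stmt {
    Assign,
    Break,
    If { then_body: Vec<Stmt>, else_body: Vec<Stmt> },
}

/// Feature flag: does a `break` appear anywhere in the body?
fn has_break(body: &[Stmt]) -> bool {
    body.iter().any(|s| match s {
        Stmt::Break => true,
        Stmt::If { then_body, else_body } => has_break(then_body) || has_break(else_body),
        Stmt::Assign => false,
    })
}

/// Feature flag: is there a top-level `if` with a non-empty else branch?
fn has_if_else(body: &[Stmt]) -> bool {
    body.iter()
        .any(|s| matches!(s, Stmt::If { else_body, .. } if !else_body.is_empty()))
}

/// Feature-based, priority-ordered classification: break wins over if-else.
fn classify_by_features(body: &[Stmt]) -> PatternKind {
    if has_break(body) {
        PatternKind::Pattern2Break
    } else if has_if_else(body) {
        PatternKind::Pattern3IfPhi
    } else {
        PatternKind::Pattern1SimpleWhile
    }
}

/// Structure-based classification: recognises only the exact idiom
/// `if cond { update } else { break }` and declines everything else.
fn classify_by_structure(body: &[Stmt]) -> Option<PatternKind> {
    match body {
        [Stmt::If { then_body, else_body }] => {
            match (then_body.as_slice(), else_body.as_slice()) {
                ([Stmt::Assign], [Stmt::Break]) => Some(PatternKind::Pattern3IfPhi),
                _ => None,
            }
        }
        _ => None,
    }
}
```

On the skip_whitespace shape the two functions disagree (Pattern2Break versus Pattern3IfPhi), which is the kind of divergence the parity check is meant to surface.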
### Why Parity Check?
- **Incremental Migration**: Allows canonicalizer development without breaking router
- **Safety Net**: Catches classification divergence early
- **Documentation**: Explicitly records where systems disagree
- **Flexibility**: Dev-only, no production overhead
### Why Dev-Only?
- **No Performance Impact**: Negligible cost when the dev flag is off, since the check is gated at runtime by `joinir_dev_enabled()`
- **Development Insight**: Helps refine both systems
- **Fail-Fast Option**: Strict mode for CI/validation
- **Graceful Degradation**: Debug mode allows execution to continue
## Acceptance Criteria
- ✅ Flag OFF: No behavioral change
- ✅ Dev-only: Match/mismatch observable
- ✅ Strict mode: Mismatch stops compilation
- ✅ Debug mode: Mismatch logs warning
- ✅ Unit tests: 2 tests passing
- ✅ Integration test: skip_whitespace mismatch detected
- ✅ All tests: `cargo test --release --lib` passes (1046/1046)
## Future Work
### Phase 5: Pattern Classification Refinement
- Resolve skip_whitespace classification (Pattern2 vs Pattern3)
- Extend Pattern3 to handle break-in-else variant
- OR: Create Pattern6 for "conditional update with early exit"
### Phase 6: Canonicalizer Expansion
- Add Pattern1 (Simple While) to canonicalizer
- Add Pattern2 (Conditional Break) variants
- Add Pattern4 (Continue) support
- Add Pattern5 (Infinite Early Exit) support
### Phase 7: Router Migration
- Consider migrating router to use canonicalizer decisions
- Option: Make `ctx.pattern_kind = decision.chosen` if canonicalizer succeeds
- Gradual transition from feature-based to structure-based routing
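
The option above amounts to preferring the canonicalizer's decision and falling back to the router's choice whenever the canonicalizer declines. A hedged sketch, where `select_pattern` and the trimmed-down `PatternKind` are hypothetical and not part of this commit:

```rust
/// Hypothetical stand-in for the project's pattern enum.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PatternKind {
    Pattern1SimpleWhile,
    Pattern2Break,
    Pattern3IfPhi,
}

/// Prefer the canonicalizer's choice when it produced one; otherwise keep
/// the router's feature-based classification.
fn select_pattern(canonical: Option<PatternKind>, router_choice: PatternKind) -> PatternKind {
    canonical.unwrap_or(router_choice)
}
```

With this shape, loops the canonicalizer does not yet recognise keep their current routing, which is what would make the transition gradual.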
## References
- **Phase 137-2**: Loop Canonicalizer observation (dev-only logging)
- **Phase 137-3**: skip_whitespace pattern recognition
- **Design SSOT**: `docs/development/current/main/design/loop-canonicalizer.md`
- **JoinIR Architecture**: `docs/development/current/main/joinir-architecture-overview.md`

@@ -6,6 +6,77 @@ use crate::mir::builder::MirBuilder;
use crate::mir::ValueId;
impl MirBuilder {
    /// Phase 137-4: Verify router parity between canonicalizer and router
    ///
    /// Dev-only: Ensures the canonicalizer's pattern choice matches the router's
    /// pattern_kind. On mismatch:
    /// - Debug mode (HAKO_JOINIR_DEBUG=1): Log warning
    /// - Strict mode (HAKO_JOINIR_STRICT=1 or NYASH_JOINIR_STRICT=1): Return error
    fn verify_router_parity(
        &self,
        condition: &ASTNode,
        body: &[ASTNode],
        func_name: &str,
        ctx: &super::patterns::LoopPatternContext,
    ) -> Result<(), String> {
        use crate::ast::Span;
        use crate::mir::loop_canonicalizer::canonicalize_loop_expr;

        // Reconstruct loop AST for canonicalizer
        let loop_ast = ASTNode::Loop {
            condition: Box::new(condition.clone()),
            body: body.to_vec(),
            span: Span::unknown(),
        };

        // Run canonicalizer
        let (_, decision) = canonicalize_loop_expr(&loop_ast)
            .map_err(|e| format!("[loop_canonicalizer/PARITY] Canonicalizer error: {}", e))?;

        // Compare patterns only if canonicalizer succeeded
        if let Some(canonical_pattern) = decision.chosen {
            let actual_pattern = ctx.pattern_kind;
            if canonical_pattern != actual_pattern {
                // Pattern mismatch detected
                let msg = format!(
                    "[loop_canonicalizer/PARITY] MISMATCH in function '{}': \
                     canonical={:?}, actual={:?}",
                    func_name, canonical_pattern, actual_pattern
                );

                // Check strict mode. Note: only the presence of the variable
                // is tested, so any value (including "0") enables strict mode.
                let is_strict = std::env::var("HAKO_JOINIR_STRICT").is_ok()
                    || std::env::var("NYASH_JOINIR_STRICT").is_ok();
                if is_strict {
                    // Strict mode: fail fast
                    return Err(msg);
                } else {
                    // Debug mode: log only
                    eprintln!("{}", msg);
                }
            } else {
                // Patterns match - success!
                eprintln!(
                    "[loop_canonicalizer/PARITY] OK in function '{}': \
                     canonical and actual agree on {:?}",
                    func_name, canonical_pattern
                );
            }
        } else {
            // Canonicalizer failed (Fail-Fast)
            // Log but don't error - router might still handle it
            eprintln!(
                "[loop_canonicalizer/PARITY] Canonicalizer failed for '{}': {}",
                func_name,
                decision.notes.join("; ")
            );
        }

        Ok(())
    }

/// Phase 49: Try JoinIR Frontend for mainline integration
///
/// Returns `Ok(Some(value))` if the loop is successfully lowered via JoinIR,
@@ -122,7 +193,7 @@ impl MirBuilder {
func_name: &str,
debug: bool,
) -> Result<Option<ValueId>, String> {
// Phase 137-2: Dev-only observation via Loop Canonicalizer
// Phase 137-2/137-4: Dev-only observation via Loop Canonicalizer
if crate::config::env::joinir_dev_enabled() {
use crate::ast::Span;
use crate::mir::loop_canonicalizer::canonicalize_loop_expr;
@@ -149,6 +220,20 @@ impl MirBuilder {
if decision.is_fail_fast() {
eprintln!("[loop_canonicalizer] Reason: {}", decision.notes.join("; "));
}
// Phase 137-4: Router parity verification
if let Some(canonical_pattern) = decision.chosen {
// Get actual pattern from router (will be determined by LoopPatternContext)
// We need to defer this check until after ctx is created
// Store decision for later parity check
trace::trace().debug(
"canonicalizer",
&format!(
"Phase 137-4: Canonical pattern chosen: {:?} (parity check pending)",
canonical_pattern
),
);
}
}
Err(e) => {
eprintln!("[loop_canonicalizer] Function: {}", func_name);
@@ -182,6 +267,11 @@ impl MirBuilder {
LoopPatternContext::new(condition, body, &func_name, debug)
};
// Phase 137-4: Router parity verification (after ctx is created)
if crate::config::env::joinir_dev_enabled() {
self.verify_router_parity(condition, body, func_name, &ctx)?;
}
if let Some(result) = route_loop_pattern(self, &ctx)? {
trace::trace().routing("router", func_name, "Pattern router succeeded");
return Ok(Some(result));
@@ -198,3 +288,170 @@ impl MirBuilder {
self.cf_loop_joinir_legacy_binding(condition, body, func_name, debug)
}
}
#[cfg(test)]
mod tests {
    use super::*;
    use crate::ast::{ASTNode, BinaryOperator, LiteralValue, Span};

    /// Test helper: Create a skip_whitespace pattern loop AST
    fn build_skip_whitespace_loop() -> ASTNode {
        ASTNode::Loop {
            condition: Box::new(ASTNode::BinaryOp {
                operator: BinaryOperator::Less,
                left: Box::new(ASTNode::Variable {
                    name: "p".to_string(),
                    span: Span::unknown(),
                }),
                right: Box::new(ASTNode::Variable {
                    name: "len".to_string(),
                    span: Span::unknown(),
                }),
                span: Span::unknown(),
            }),
            body: vec![ASTNode::If {
                condition: Box::new(ASTNode::BinaryOp {
                    operator: BinaryOperator::Equal,
                    left: Box::new(ASTNode::Variable {
                        name: "is_ws".to_string(),
                        span: Span::unknown(),
                    }),
                    right: Box::new(ASTNode::Literal {
                        value: LiteralValue::Integer(1),
                        span: Span::unknown(),
                    }),
                    span: Span::unknown(),
                }),
                then_body: vec![ASTNode::Assignment {
                    target: Box::new(ASTNode::Variable {
                        name: "p".to_string(),
                        span: Span::unknown(),
                    }),
                    value: Box::new(ASTNode::BinaryOp {
                        operator: BinaryOperator::Add,
                        left: Box::new(ASTNode::Variable {
                            name: "p".to_string(),
                            span: Span::unknown(),
                        }),
                        right: Box::new(ASTNode::Literal {
                            value: LiteralValue::Integer(1),
                            span: Span::unknown(),
                        }),
                        span: Span::unknown(),
                    }),
                    span: Span::unknown(),
                }],
                else_body: Some(vec![ASTNode::Break {
                    span: Span::unknown(),
                }]),
                span: Span::unknown(),
            }],
            span: Span::unknown(),
        }
    }

    #[test]
    fn test_parity_check_mismatch_detected() {
        use crate::mir::loop_canonicalizer::canonicalize_loop_expr;
        use crate::mir::builder::control_flow::joinir::patterns::ast_feature_extractor as ast_features;

        let loop_ast = build_skip_whitespace_loop();

        // Extract condition and body
        let (condition, body) = match &loop_ast {
            ASTNode::Loop { condition, body, .. } => (condition.as_ref(), body.as_slice()),
            _ => panic!("Expected loop node"),
        };

        // Run canonicalizer
        let (_, canonical_decision) = canonicalize_loop_expr(&loop_ast).unwrap();
        let canonical_pattern = canonical_decision.chosen.expect("Canonicalizer should succeed");

        // Run router's pattern detection
        let has_continue = ast_features::detect_continue_in_body(body);
        let has_break = ast_features::detect_break_in_body(body);
        let features = ast_features::extract_features(condition, body, has_continue, has_break);
        let actual_pattern = crate::mir::loop_pattern_detection::classify(&features);

        // Verify mismatch
        // Canonicalizer: Pattern3IfPhi (recognizes if-else structure)
        // Router: Pattern2Break (sees has_break)
        assert_eq!(
            canonical_pattern,
            crate::mir::loop_pattern_detection::LoopPatternKind::Pattern3IfPhi
        );
        assert_eq!(
            actual_pattern,
            crate::mir::loop_pattern_detection::LoopPatternKind::Pattern2Break
        );
        assert_ne!(canonical_pattern, actual_pattern, "Phase 137-4: This test verifies mismatch detection");
    }

    #[test]
    fn test_parity_check_match_simple_while() {
        use crate::mir::loop_canonicalizer::canonicalize_loop_expr;
        use crate::mir::builder::control_flow::joinir::patterns::ast_feature_extractor as ast_features;

        // Simple while loop: no break, no continue, no if
        let loop_ast = ASTNode::Loop {
            condition: Box::new(ASTNode::BinaryOp {
                operator: BinaryOperator::Less,
                left: Box::new(ASTNode::Variable {
                    name: "i".to_string(),
                    span: Span::unknown(),
                }),
                right: Box::new(ASTNode::Literal {
                    value: LiteralValue::Integer(10),
                    span: Span::unknown(),
                }),
                span: Span::unknown(),
            }),
            body: vec![ASTNode::Assignment {
                target: Box::new(ASTNode::Variable {
                    name: "i".to_string(),
                    span: Span::unknown(),
                }),
                value: Box::new(ASTNode::BinaryOp {
                    operator: BinaryOperator::Add,
                    left: Box::new(ASTNode::Variable {
                        name: "i".to_string(),
                        span: Span::unknown(),
                    }),
                    right: Box::new(ASTNode::Literal {
                        value: LiteralValue::Integer(1),
                        span: Span::unknown(),
                    }),
                    span: Span::unknown(),
                }),
                span: Span::unknown(),
            }],
            span: Span::unknown(),
        };

        // Extract condition and body
        let (condition, body) = match &loop_ast {
            ASTNode::Loop { condition, body, .. } => (condition.as_ref(), body.as_slice()),
            _ => panic!("Expected loop node"),
        };

        // Canonicalizer will fail for simple patterns (not yet implemented)
        let canonical_result = canonicalize_loop_expr(&loop_ast);

        // Router's pattern detection
        let has_continue = ast_features::detect_continue_in_body(body);
        let has_break = ast_features::detect_break_in_body(body);
        let features = ast_features::extract_features(condition, body, has_continue, has_break);
        let actual_pattern = crate::mir::loop_pattern_detection::classify(&features);

        // Router should classify as Pattern1SimpleWhile
        assert_eq!(
            actual_pattern,
            crate::mir::loop_pattern_detection::LoopPatternKind::Pattern1SimpleWhile
        );

        // Canonicalizer should fail (not implemented yet for Pattern1)
        assert!(canonical_result.is_ok());
        let (_, decision) = canonical_result.unwrap();
        assert!(decision.is_fail_fast(), "Canonicalizer should fail for simple patterns (Phase 3 only supports skip_whitespace)");
    }
}