feat(mir): Loop Canonicalizer Phase 4 - router parity verification
## Summary
既存 Router と Canonicalizer の選択が一致することを dev-only で検証。
不一致は理由付き Fail-Fast(strict mode)。
## Changes
- src/mir/builder/control_flow/joinir/routing.rs:
- verify_router_parity() 実装
- cf_loop_joinir_impl でパリティチェック呼び出し
- 2つのユニットテスト追加
- test_parity_check_mismatch_detected
- test_parity_check_match_simple_while
- docs/development/current/main/phases/phase-137/phase-137-4-parity-verification.md:
- Phase 4 完全ドキュメント
## Verification Modes
- Debug mode (HAKO_JOINIR_DEBUG=1): ログのみ
- Strict mode (HAKO_JOINIR_STRICT=1): 不一致でエラー
## Known Mismatch
- skip_whitespace pattern:
- Canonicalizer: Pattern3IfPhi (構造認識)
- Router: Pattern2Break (has_break優先)
- Phase 5+ で分類ルール改善予定
## Tests
- Unit tests: 2 tests PASS
- Integration: skip_whitespace parity mismatch 検出確認
- cargo test --release --lib: 1046/1046 PASS
Phase 137-4 complete
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
@ -0,0 +1,197 @@
|
||||
# Phase 137-4: Loop Canonicalizer Router Parity Verification
|
||||
|
||||
**Status**: ✅ Complete
|
||||
**Date**: 2025-12-16
|
||||
|
||||
## Summary
|
||||
|
||||
Dev-only verification that Loop Canonicalizer and Router pattern detection agree on pattern classification. On mismatch, provides detailed diagnostic with Fail-Fast option.
|
||||
|
||||
## Implementation
|
||||
|
||||
### Location
|
||||
|
||||
`src/mir/builder/control_flow/joinir/routing.rs`
|
||||
|
||||
### Components
|
||||
|
||||
1. **Parity Verification Function** (`verify_router_parity`)
|
||||
- Runs canonicalizer on the loop AST
|
||||
- Compares canonicalizer's chosen pattern with router's `ctx.pattern_kind`
|
||||
- Logs match/mismatch results
|
||||
- In strict mode, returns error on mismatch
|
||||
|
||||
2. **Integration Point**
|
||||
- Called in `cf_loop_joinir_impl` after `LoopPatternContext` is created
|
||||
- Only runs when `joinir_dev_enabled()` returns true
|
||||
- Deferred until after ctx creation to access `ctx.pattern_kind`
|
||||
|
||||
### Behavior Modes
|
||||
|
||||
#### Debug Mode (Default)
|
||||
```bash
|
||||
HAKO_JOINIR_DEBUG=1 ./target/release/hakorune program.hako
|
||||
```
|
||||
- Logs parity check results to stderr
|
||||
- On mismatch: Logs warning but continues execution
|
||||
- On match: Logs success message
|
||||
|
||||
#### Strict Mode
|
||||
```bash
|
||||
HAKO_JOINIR_STRICT=1 ./target/release/hakorune program.hako
|
||||
```
|
||||
- Same as debug mode, but on mismatch:
|
||||
- Returns error and stops compilation
|
||||
- Provides detailed mismatch diagnostic
|
||||
|
||||
### Output Examples
|
||||
|
||||
#### Match (Success)
|
||||
```
|
||||
[loop_canonicalizer/PARITY] OK in function 'foo':
|
||||
canonical and actual agree on Pattern1SimpleWhile
|
||||
```
|
||||
|
||||
#### Mismatch (Warning in Debug)
|
||||
```
|
||||
[loop_canonicalizer/PARITY] MISMATCH in function 'main':
|
||||
canonical=Pattern3IfPhi, actual=Pattern2Break
|
||||
```
|
||||
|
||||
#### Mismatch (Error in Strict)
|
||||
```
|
||||
[ERROR] ❌ MIR compilation error:
|
||||
[loop_canonicalizer/PARITY] MISMATCH in function 'main':
|
||||
canonical=Pattern3IfPhi, actual=Pattern2Break
|
||||
```
|
||||
|
||||
#### Canonicalizer Failure
|
||||
```
|
||||
[loop_canonicalizer/PARITY] Canonicalizer failed for 'bar':
|
||||
Phase 3: Loop does not match skip_whitespace pattern
|
||||
```
|
||||
|
||||
## Test Coverage
|
||||
|
||||
### Unit Tests
|
||||
|
||||
1. **`test_parity_check_mismatch_detected`**
|
||||
- Verifies mismatch detection on skip_whitespace pattern
|
||||
- Canonicalizer: Pattern3IfPhi (recognizes if-else structure)
|
||||
- Router: Pattern2Break (sees has_break flag)
|
||||
- Asserts inequality to document expected mismatch
|
||||
|
||||
2. **`test_parity_check_match_simple_while`**
|
||||
- Verifies canonicalizer fails on Pattern1 (not yet implemented)
|
||||
- Router: Pattern1SimpleWhile
|
||||
- Canonicalizer: Fail-Fast (only supports skip_whitespace in Phase 3)
|
||||
|
||||
### Integration Tests
|
||||
|
||||
```bash
|
||||
# Verify mismatch detection (debug mode)
|
||||
HAKO_JOINIR_DEBUG=1 ./target/release/hakorune \
|
||||
tools/selfhost/test_pattern3_skip_whitespace.hako
|
||||
|
||||
# Verify strict mode error
|
||||
HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
|
||||
tools/selfhost/test_pattern3_skip_whitespace.hako
|
||||
# Expected: Exit with error due to mismatch
|
||||
```
|
||||
|
||||
## Known Mismatches
|
||||
|
||||
### skip_whitespace Pattern
|
||||
|
||||
**Structure**:
|
||||
```nyash
|
||||
loop(p < len) {
|
||||
if is_ws == 1 {
|
||||
p = p + 1
|
||||
} else {
|
||||
break
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Mismatch**:
|
||||
- **Canonicalizer**: Pattern3IfPhi
|
||||
- Recognizes specific if-else structure with conditional carrier update
|
||||
- Sees: `if cond { carrier += 1 } else { break }` as Pattern3 variant
|
||||
|
||||
- **Router**: Pattern2Break
|
||||
- Classification based on `has_break` flag
|
||||
- Priority: break detection takes precedence over if-else structure
|
||||
|
||||
**Resolution Strategy**:
|
||||
- Phase 4: Document and observe (current)
|
||||
- Phase 5+: Refine classification rules to handle hybrid patterns
|
||||
- Option A: Extend Pattern3 to include "break-in-else" variant
|
||||
- Option B: Create new Pattern6 for this specific structure
|
||||
- Option C: Make router defer to canonicalizer's decision
|
||||
|
||||
## Design Rationale
|
||||
|
||||
### Why Two Systems?
|
||||
|
||||
1. **Router (Existing)**
|
||||
- Feature-based classification (`has_break`, `has_continue`, etc.)
|
||||
- Fast, simple flags
|
||||
- Priority-based (e.g., break > if-else)
|
||||
|
||||
2. **Canonicalizer (New)**
|
||||
- Structure-based pattern matching
|
||||
- Deep AST analysis
|
||||
- Recognizes specific code idioms
|
||||
|
||||
### Why Parity Check?
|
||||
|
||||
- **Incremental Migration**: Allows canonicalizer development without breaking router
|
||||
- **Safety Net**: Catches classification divergence early
|
||||
- **Documentation**: Explicitly records where systems disagree
|
||||
- **Flexibility**: Dev-only, no production overhead
|
||||
|
||||
### Why Dev-Only?
|
||||
|
||||
- **No Performance Impact**: Zero cost in release builds (flag-gated)
|
||||
- **Development Insight**: Helps refine both systems
|
||||
- **Fail-Fast Option**: Strict mode for CI/validation
|
||||
- **Graceful Degradation**: Debug mode allows execution to continue
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- ✅ Flag OFF: No behavioral change
|
||||
- ✅ Dev-only: Match/mismatch observable
|
||||
- ✅ Strict mode: Mismatch stops compilation
|
||||
- ✅ Debug mode: Mismatch logs warning
|
||||
- ✅ Unit tests: 2 tests passing
|
||||
- ✅ Integration test: skip_whitespace mismatch detected
|
||||
- ✅ All tests: `cargo test --release --lib` passes (1046/1046)
|
||||
|
||||
## Future Work
|
||||
|
||||
### Phase 5: Pattern Classification Refinement
|
||||
|
||||
- Resolve skip_whitespace classification (Pattern2 vs Pattern3)
|
||||
- Extend Pattern3 to handle break-in-else variant
|
||||
- OR: Create Pattern6 for "conditional update with early exit"
|
||||
|
||||
### Phase 6: Canonicalizer Expansion
|
||||
|
||||
- Add Pattern1 (Simple While) to canonicalizer
|
||||
- Add Pattern2 (Conditional Break) variants
|
||||
- Add Pattern4 (Continue) support
|
||||
- Add Pattern5 (Infinite Early Exit) support
|
||||
|
||||
### Phase 7: Router Migration
|
||||
|
||||
- Consider migrating router to use canonicalizer decisions
|
||||
- Option: Make `ctx.pattern_kind = decision.chosen` if canonicalizer succeeds
|
||||
- Gradual transition from feature-based to structure-based routing
|
||||
|
||||
## References
|
||||
|
||||
- **Phase 137-2**: Loop Canonicalizer observation (dev-only logging)
|
||||
- **Phase 137-3**: skip_whitespace pattern recognition
|
||||
- **Design SSOT**: `docs/development/current/main/design/loop-canonicalizer.md`
|
||||
- **JoinIR Architecture**: `docs/development/current/main/joinir-architecture-overview.md`
|
||||
@ -6,6 +6,77 @@ use crate::mir::builder::MirBuilder;
|
||||
use crate::mir::ValueId;
|
||||
|
||||
impl MirBuilder {
|
||||
/// Phase 137-4: Verify router parity between canonicalizer and router
|
||||
///
|
||||
/// Dev-only: Ensures the canonicalizer's pattern choice matches the router's
|
||||
/// pattern_kind. On mismatch:
|
||||
/// - Debug mode (HAKO_JOINIR_DEBUG=1): Log warning
|
||||
/// - Strict mode (HAKO_JOINIR_STRICT=1): Return error
|
||||
fn verify_router_parity(
|
||||
&self,
|
||||
condition: &ASTNode,
|
||||
body: &[ASTNode],
|
||||
func_name: &str,
|
||||
ctx: &super::patterns::LoopPatternContext,
|
||||
) -> Result<(), String> {
|
||||
use crate::ast::Span;
|
||||
use crate::mir::loop_canonicalizer::canonicalize_loop_expr;
|
||||
|
||||
// Reconstruct loop AST for canonicalizer
|
||||
let loop_ast = ASTNode::Loop {
|
||||
condition: Box::new(condition.clone()),
|
||||
body: body.to_vec(),
|
||||
span: Span::unknown(),
|
||||
};
|
||||
|
||||
// Run canonicalizer
|
||||
let (_, decision) = canonicalize_loop_expr(&loop_ast)
|
||||
.map_err(|e| format!("[loop_canonicalizer/PARITY] Canonicalizer error: {}", e))?;
|
||||
|
||||
// Compare patterns only if canonicalizer succeeded
|
||||
if let Some(canonical_pattern) = decision.chosen {
|
||||
let actual_pattern = ctx.pattern_kind;
|
||||
|
||||
if canonical_pattern != actual_pattern {
|
||||
// Pattern mismatch detected
|
||||
let msg = format!(
|
||||
"[loop_canonicalizer/PARITY] MISMATCH in function '{}': \
|
||||
canonical={:?}, actual={:?}",
|
||||
func_name, canonical_pattern, actual_pattern
|
||||
);
|
||||
|
||||
// Check strict mode
|
||||
let is_strict = std::env::var("HAKO_JOINIR_STRICT").is_ok()
|
||||
|| std::env::var("NYASH_JOINIR_STRICT").is_ok();
|
||||
|
||||
if is_strict {
|
||||
// Strict mode: fail fast
|
||||
return Err(msg);
|
||||
} else {
|
||||
// Debug mode: log only
|
||||
eprintln!("{}", msg);
|
||||
}
|
||||
} else {
|
||||
// Patterns match - success!
|
||||
eprintln!(
|
||||
"[loop_canonicalizer/PARITY] OK in function '{}': \
|
||||
canonical and actual agree on {:?}",
|
||||
func_name, canonical_pattern
|
||||
);
|
||||
}
|
||||
} else {
|
||||
// Canonicalizer failed (Fail-Fast)
|
||||
// Log but don't error - router might still handle it
|
||||
eprintln!(
|
||||
"[loop_canonicalizer/PARITY] Canonicalizer failed for '{}': {}",
|
||||
func_name,
|
||||
decision.notes.join("; ")
|
||||
);
|
||||
}
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Phase 49: Try JoinIR Frontend for mainline integration
|
||||
///
|
||||
/// Returns `Ok(Some(value))` if the loop is successfully lowered via JoinIR,
|
||||
@ -122,7 +193,7 @@ impl MirBuilder {
|
||||
func_name: &str,
|
||||
debug: bool,
|
||||
) -> Result<Option<ValueId>, String> {
|
||||
// Phase 137-2: Dev-only observation via Loop Canonicalizer
|
||||
// Phase 137-2/137-4: Dev-only observation via Loop Canonicalizer
|
||||
if crate::config::env::joinir_dev_enabled() {
|
||||
use crate::ast::Span;
|
||||
use crate::mir::loop_canonicalizer::canonicalize_loop_expr;
|
||||
@ -149,6 +220,20 @@ impl MirBuilder {
|
||||
if decision.is_fail_fast() {
|
||||
eprintln!("[loop_canonicalizer] Reason: {}", decision.notes.join("; "));
|
||||
}
|
||||
|
||||
// Phase 137-4: Router parity verification
|
||||
if let Some(canonical_pattern) = decision.chosen {
|
||||
// Get actual pattern from router (will be determined by LoopPatternContext)
|
||||
// We need to defer this check until after ctx is created
|
||||
// Store decision for later parity check
|
||||
trace::trace().debug(
|
||||
"canonicalizer",
|
||||
&format!(
|
||||
"Phase 137-4: Canonical pattern chosen: {:?} (parity check pending)",
|
||||
canonical_pattern
|
||||
),
|
||||
);
|
||||
}
|
||||
}
|
||||
Err(e) => {
|
||||
eprintln!("[loop_canonicalizer] Function: {}", func_name);
|
||||
@ -182,6 +267,11 @@ impl MirBuilder {
|
||||
LoopPatternContext::new(condition, body, &func_name, debug)
|
||||
};
|
||||
|
||||
// Phase 137-4: Router parity verification (after ctx is created)
|
||||
if crate::config::env::joinir_dev_enabled() {
|
||||
self.verify_router_parity(condition, body, func_name, &ctx)?;
|
||||
}
|
||||
|
||||
if let Some(result) = route_loop_pattern(self, &ctx)? {
|
||||
trace::trace().routing("router", func_name, "Pattern router succeeded");
|
||||
return Ok(Some(result));
|
||||
@ -198,3 +288,170 @@ impl MirBuilder {
|
||||
self.cf_loop_joinir_legacy_binding(condition, body, func_name, debug)
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use crate::ast::{ASTNode, BinaryOperator, LiteralValue, Span};
|
||||
|
||||
/// Test helper: Create a skip_whitespace pattern loop AST
|
||||
fn build_skip_whitespace_loop() -> ASTNode {
|
||||
ASTNode::Loop {
|
||||
condition: Box::new(ASTNode::BinaryOp {
|
||||
operator: BinaryOperator::Less,
|
||||
left: Box::new(ASTNode::Variable {
|
||||
name: "p".to_string(),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
right: Box::new(ASTNode::Variable {
|
||||
name: "len".to_string(),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
body: vec![ASTNode::If {
|
||||
condition: Box::new(ASTNode::BinaryOp {
|
||||
operator: BinaryOperator::Equal,
|
||||
left: Box::new(ASTNode::Variable {
|
||||
name: "is_ws".to_string(),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
right: Box::new(ASTNode::Literal {
|
||||
value: LiteralValue::Integer(1),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
then_body: vec![ASTNode::Assignment {
|
||||
target: Box::new(ASTNode::Variable {
|
||||
name: "p".to_string(),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
value: Box::new(ASTNode::BinaryOp {
|
||||
operator: BinaryOperator::Add,
|
||||
left: Box::new(ASTNode::Variable {
|
||||
name: "p".to_string(),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
right: Box::new(ASTNode::Literal {
|
||||
value: LiteralValue::Integer(1),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
span: Span::unknown(),
|
||||
}],
|
||||
else_body: Some(vec![ASTNode::Break {
|
||||
span: Span::unknown(),
|
||||
}]),
|
||||
span: Span::unknown(),
|
||||
}],
|
||||
span: Span::unknown(),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_parity_check_mismatch_detected() {
|
||||
use crate::mir::loop_canonicalizer::canonicalize_loop_expr;
|
||||
use crate::mir::builder::control_flow::joinir::patterns::ast_feature_extractor as ast_features;
|
||||
|
||||
let loop_ast = build_skip_whitespace_loop();
|
||||
|
||||
// Extract condition and body
|
||||
let (condition, body) = match &loop_ast {
|
||||
ASTNode::Loop { condition, body, .. } => (condition.as_ref(), body.as_slice()),
|
||||
_ => panic!("Expected loop node"),
|
||||
};
|
||||
|
||||
// Run canonicalizer
|
||||
let (_, canonical_decision) = canonicalize_loop_expr(&loop_ast).unwrap();
|
||||
let canonical_pattern = canonical_decision.chosen.expect("Canonicalizer should succeed");
|
||||
|
||||
// Run router's pattern detection
|
||||
let has_continue = ast_features::detect_continue_in_body(body);
|
||||
let has_break = ast_features::detect_break_in_body(body);
|
||||
let features = ast_features::extract_features(condition, body, has_continue, has_break);
|
||||
let actual_pattern = crate::mir::loop_pattern_detection::classify(&features);
|
||||
|
||||
// Verify mismatch
|
||||
// Canonicalizer: Pattern3IfPhi (recognizes if-else structure)
|
||||
// Router: Pattern2Break (sees has_break)
|
||||
assert_eq!(
|
||||
canonical_pattern,
|
||||
crate::mir::loop_pattern_detection::LoopPatternKind::Pattern3IfPhi
|
||||
);
|
||||
assert_eq!(
|
||||
actual_pattern,
|
||||
crate::mir::loop_pattern_detection::LoopPatternKind::Pattern2Break
|
||||
);
|
||||
assert_ne!(canonical_pattern, actual_pattern, "Phase 137-4: This test verifies mismatch detection");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_parity_check_match_simple_while() {
|
||||
use crate::mir::loop_canonicalizer::canonicalize_loop_expr;
|
||||
use crate::mir::builder::control_flow::joinir::patterns::ast_feature_extractor as ast_features;
|
||||
|
||||
// Simple while loop: no break, no continue, no if
|
||||
let loop_ast = ASTNode::Loop {
|
||||
condition: Box::new(ASTNode::BinaryOp {
|
||||
operator: BinaryOperator::Less,
|
||||
left: Box::new(ASTNode::Variable {
|
||||
name: "i".to_string(),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
right: Box::new(ASTNode::Literal {
|
||||
value: LiteralValue::Integer(10),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
body: vec![ASTNode::Assignment {
|
||||
target: Box::new(ASTNode::Variable {
|
||||
name: "i".to_string(),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
value: Box::new(ASTNode::BinaryOp {
|
||||
operator: BinaryOperator::Add,
|
||||
left: Box::new(ASTNode::Variable {
|
||||
name: "i".to_string(),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
right: Box::new(ASTNode::Literal {
|
||||
value: LiteralValue::Integer(1),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
span: Span::unknown(),
|
||||
}),
|
||||
span: Span::unknown(),
|
||||
}],
|
||||
span: Span::unknown(),
|
||||
};
|
||||
|
||||
// Extract condition and body
|
||||
let (condition, body) = match &loop_ast {
|
||||
ASTNode::Loop { condition, body, .. } => (condition.as_ref(), body.as_slice()),
|
||||
_ => panic!("Expected loop node"),
|
||||
};
|
||||
|
||||
// Canonicalizer will fail for simple patterns (not yet implemented)
|
||||
let canonical_result = canonicalize_loop_expr(&loop_ast);
|
||||
|
||||
// Router's pattern detection
|
||||
let has_continue = ast_features::detect_continue_in_body(body);
|
||||
let has_break = ast_features::detect_break_in_body(body);
|
||||
let features = ast_features::extract_features(condition, body, has_continue, has_break);
|
||||
let actual_pattern = crate::mir::loop_pattern_detection::classify(&features);
|
||||
|
||||
// Router should classify as Pattern1SimpleWhile
|
||||
assert_eq!(
|
||||
actual_pattern,
|
||||
crate::mir::loop_pattern_detection::LoopPatternKind::Pattern1SimpleWhile
|
||||
);
|
||||
|
||||
// Canonicalizer should fail (not implemented yet for Pattern1)
|
||||
assert!(canonical_result.is_ok());
|
||||
let (_, decision) = canonical_result.unwrap();
|
||||
assert!(decision.is_fail_fast(), "Canonicalizer should fail for simple patterns (Phase 3 only supports skip_whitespace)");
|
||||
}
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user