Files
hakorune/docs/development/current/main/phases/phase-142

Phase 142: Canonicalizer Pattern Extension

Status

  • P0: Complete (trim leading/trailing)
  • P1: Complete (continue pattern)

P0: trim leading/trailing (COMPLETE)

Objective

Extend Canonicalizer to recognize trim leading/trailing patterns, enabling proper routing through the normalized loop pipeline.

Target Patterns

  • tools/selfhost/test_pattern3_trim_leading.hako - start = start + 1 pattern
  • tools/selfhost/test_pattern3_trim_trailing.hako - end = end - 1 pattern

Accepted Criteria (All Met )

  • Canonicalizer creates Skeleton for trim_leading/trailing
  • decision.chosen == Pattern2Break (ExitContract priority)
  • decision.missing_caps == [] (no missing capabilities)
  • Strict parity green (NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1)
  • Default behavior unchanged
  • Unit tests added
  • Documentation created

Implementation Summary

1. Pattern Recognizer Generalization

File: src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs

Changes:

  • Extended detect_skip_whitespace_pattern() to accept both + and - operators
  • Added support for negative deltas (e.g., -1 for end = end - 1)
  • Maintained backward compatibility with existing skip_whitespace patterns

Key Logic:

// Phase 142 P0: Accept both Add (+1) and Subtract (-1)
let op_multiplier = match operator {
    BinaryOperator::Add => 1,
    BinaryOperator::Subtract => -1,
    _ => return None,
};

// Calculate delta with sign (e.g., +1 or -1)
let delta = const_val * op_multiplier;

Recognized Patterns:

  • skip_whitespace: p = p + 1 (delta = +1)
  • trim_leading: start = start + 1 (delta = +1)
  • trim_trailing: end = end - 1 (delta = -1)

2. Unit Tests

File: src/mir/loop_canonicalizer/canonicalizer.rs

Added Tests:

  • test_trim_leading_pattern_recognized() - Verifies start = start + 1 pattern
  • test_trim_trailing_pattern_recognized() - Verifies end = end - 1 pattern

Test Coverage:

  • Skeleton creation
  • Carrier slot creation with correct delta (+1 or -1)
  • ExitContract setup (has_break=true)
  • RoutingDecision (chosen=Pattern2Break, missing_caps=[])

Test Results:

running 2 tests
test mir::loop_canonicalizer::canonicalizer::tests::test_trim_leading_pattern_recognized ... ok
test mir::loop_canonicalizer::canonicalizer::tests::test_trim_trailing_pattern_recognized ... ok

test result: ok. 2 passed; 0 failed; 0 ignored

3. Manual Verification

Strict Parity Check:

NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern3_trim_leading.hako

Output (trim_leading):

[loop_canonicalizer]   Decision: SUCCESS
[loop_canonicalizer]   Chosen pattern: Pattern2Break
[loop_canonicalizer]   Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break

Output (trim_trailing):

[loop_canonicalizer]   Decision: SUCCESS
[loop_canonicalizer]   Chosen pattern: Pattern2Break
[loop_canonicalizer]   Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern2Break
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern2Break

Design Principles Applied

Box-First Modularization

  • Extended existing detect_skip_whitespace_pattern() instead of creating new functions
  • Maintained SSOT (Single Source of Truth) architecture
  • Preserved delegation pattern through pattern_recognizer.rs wrapper

Incremental Implementation

  • Focused on recognizer generalization only
  • Did not modify routing or lowering logic
  • Kept scope minimal (P0 only)

ExitContract Priority

  • Pattern choice determined by ExitContract (has_break=true)
  • Routes to Pattern2Break (not Pattern3IfPhi)
  • Consistent with existing SSOT policy

Files Modified

  1. src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs (+35 lines, improved comments)
  2. src/mir/loop_canonicalizer/canonicalizer.rs (+178 lines, 2 new tests)

Statistics

  • Total changes: +213 lines
  • Unit tests: 2 new tests (100% pass)
  • Manual tests: 2 patterns verified (strict parity green)
  • Build status: No errors, no warnings (lib)

SSOT References

  • Design: docs/development/current/main/design/loop-canonicalizer.md
  • JoinIR Architecture: docs/development/current/main/joinir-architecture-overview.md
  • Pattern Detection: ast_feature_extractor.rs (Phase 140-P4-A SSOT)

Known Limitations

  • Pattern2 variable promotion (A-3 Trim promotion) not yet implemented
  • This is expected - Phase 142 P0 only targets recognizer extension
  • Promotion will be addressed in future phases

Next Steps (Future Phases)

  • Phase 142 P1: Implement A-3 Trim promotion in Pattern2 handler
  • Phase 142 P2: Extend to other loop patterns (Pattern 3/4)
  • Phase 142 P3: Add more complex carrier update patterns

Verification Commands

# Unit tests
cargo test --release loop_canonicalizer::canonicalizer::tests::test_trim --lib

# Manual verification (trim_leading)
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern3_trim_leading.hako

# Manual verification (trim_trailing)
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern3_trim_trailing.hako

Conclusion

Phase 142 P0 successfully extends the Canonicalizer to recognize trim leading/trailing patterns. The implementation:

  • Maintains SSOT architecture
  • Passes all unit tests
  • Achieves strict parity agreement
  • Preserves existing behavior
  • Sets foundation for future pattern extensions

All acceptance criteria met.


P1: continue pattern (COMPLETE)

Objective

Extend Canonicalizer to recognize continue patterns, enabling proper routing through the normalized loop pipeline.

Target Pattern

  • tools/selfhost/test_pattern4_simple_continue.hako - Simple continue pattern with carrier update

Accepted Criteria (All Met )

  • Canonicalizer creates Skeleton for continue pattern
  • decision.chosen == Pattern4Continue (router agreement)
  • decision.missing_caps == [] (no missing capabilities)
  • Strict parity green (NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1)
  • Default behavior unchanged
  • Unit tests added
  • Documentation updated

Implementation Summary

1. Continue Pattern Detection

File: src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs

New Function: detect_continue_pattern()

Pattern Structure:

loop(cond) {
    // ... optional body statements (Body)
    if skip_cond {
        carrier = carrier + const  // Optional update before continue
        continue
    }
    // ... rest of body statements (Rest)
    carrier = carrier + const  // Carrier update
}

Example (from test_pattern4_simple_continue.hako):

loop(i < n) {
  if is_even == 1 {
    i = i + 1  // Update before continue
    continue
  }
  sum = sum + i  // Rest statements
  i = i + 1      // Carrier update
}

Key Logic:

  • Finds if statement containing continue in then_body
  • Extracts body statements before the if
  • Extracts rest statements after the if
  • Detects carrier update (last statement in rest_stmts)
  • Returns ContinuePatternInfo with carrier name, delta, body_stmts, and rest_stmts

2. Canonicalizer Integration

File: src/mir/loop_canonicalizer/canonicalizer.rs

Changes:

  • Added try_extract_continue_pattern() call before skip_whitespace check
  • Build skeleton with continue pattern structure
  • Set ExitContract with has_continue=true, has_break=false
  • Route to Pattern4Continue

Skeleton Structure:

  1. HeaderCond - Loop condition
  2. Body - Optional body statements before continue check
  3. Body - Rest statements (excluding carrier update)
  4. Update - Carrier update step

3. Module Re-exports

Files Modified (re-export chain):

  • src/mir/builder/control_flow/joinir/patterns/mod.rs - Added detect_continue_pattern, ContinuePatternInfo
  • src/mir/builder/control_flow/joinir/mod.rs - Re-export to joinir level
  • src/mir/builder/control_flow/mod.rs - Re-export to control_flow level
  • src/mir/builder.rs - Re-export to builder level
  • src/mir/mod.rs - Re-export to crate level

Pattern: Followed existing SSOT pattern from Phase 140-P4-A

4. Pattern Recognizer Wrapper

File: src/mir/loop_canonicalizer/pattern_recognizer.rs

New Function: try_extract_continue_pattern()

  • Delegates to detect_continue_pattern() from ast_feature_extractor
  • Returns tuple: (carrier_name, delta, body_stmts, rest_stmts)
  • Maintains backward compatibility with existing callsites

5. Unit Tests

File: src/mir/loop_canonicalizer/canonicalizer.rs

Added Test: test_simple_continue_pattern_recognized()

  • Builds AST: loop(i < n) { if is_even { i = i + 1; continue } sum = sum + i; i = i + 1 }
  • Verifies skeleton creation with correct structure
  • Checks carrier slot (name="i", delta=1)
  • Validates ExitContract (has_continue=true, has_break=false)
  • Confirms routing decision (Pattern4Continue, missing_caps=[])

Test Results:

running 8 tests
test mir::loop_canonicalizer::canonicalizer::tests::test_simple_continue_pattern_recognized ... ok
test result: ok. 8 passed; 0 failed; 0 ignored

6. Manual Verification

Strict Parity Check:

NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern4_simple_continue.hako

Output:

[loop_canonicalizer] Function: main
[loop_canonicalizer]   Skeleton steps: 4
[loop_canonicalizer]   Carriers: 1
[loop_canonicalizer]   Has exits: true
[loop_canonicalizer]   Decision: SUCCESS
[loop_canonicalizer]   Chosen pattern: Pattern4Continue
[loop_canonicalizer]   Missing caps: []
[choose_pattern_kind/PARITY] OK: canonical and actual agree on Pattern4Continue
[loop_canonicalizer/PARITY] OK in function 'main': canonical and actual agree on Pattern4Continue

Status: Strict parity green!

Design Principles Applied

Box-First Modularization

  • Created dedicated detect_continue_pattern() function in ast_feature_extractor
  • Maintained SSOT architecture with proper re-export chain
  • Followed existing pattern from skip_whitespace detection

Incremental Implementation

  • Focused on pattern recognition only (P1 scope)
  • Did not modify lowering logic (expected promotion errors)
  • Kept changes minimal and focused

ExitContract Priority

  • Pattern choice determined by ExitContract (has_continue=true, has_break=false)
  • Routes to Pattern4Continue (not Pattern2 or Pattern3)
  • Consistent with existing SSOT policy from Phase 137-5

Files Modified

  1. src/mir/builder/control_flow/joinir/patterns/ast_feature_extractor.rs (+167 lines, new function)
  2. src/mir/loop_canonicalizer/pattern_recognizer.rs (+35 lines, wrapper function)
  3. src/mir/loop_canonicalizer/canonicalizer.rs (+103 lines, continue support + unit test)
  4. src/mir/builder/control_flow/joinir/patterns/mod.rs (+3 lines, re-export)
  5. src/mir/builder/control_flow/joinir/mod.rs (+3 lines, re-export)
  6. src/mir/builder/control_flow/mod.rs (+3 lines, re-export)
  7. src/mir/builder.rs (+2 lines, re-export)
  8. src/mir/mod.rs (+2 lines, re-export)

Statistics

  • Total changes: +318 lines
  • Unit tests: 1 new test (100% pass)
  • All canonicalizer tests: 8 passed (100%)
  • Manual tests: 1 pattern verified (strict parity green)
  • Build status: No errors (warnings are pre-existing)

SSOT References

  • Design: docs/development/current/main/design/loop-canonicalizer.md
  • JoinIR Architecture: docs/development/current/main/joinir-architecture-overview.md
  • Pattern Detection: ast_feature_extractor.rs (Phase 140-P4-A SSOT)

Known Limitations

  • Pattern4 variable promotion (A-3 Trim, A-4 DigitPos) not yet handling this pattern
  • This is expected - Phase 142 P1 only targets recognizer extension
  • Promotion will be addressed when Pattern4 lowering is enhanced

Next Steps (Future Phases)

  • Phase 142 P2: Extend Pattern4 lowering to handle recognized continue patterns
  • Phase 142 P3: Add more complex continue patterns (multiple carriers, nested conditions)

Verification Commands

# Unit tests
cargo test --release --lib loop_canonicalizer::canonicalizer::tests::test_simple_continue_pattern_recognized

# All canonicalizer tests
cargo test --release --lib loop_canonicalizer::canonicalizer::tests

# Manual verification
NYASH_JOINIR_DEV=1 HAKO_JOINIR_STRICT=1 ./target/release/hakorune \
  tools/selfhost/test_pattern4_simple_continue.hako

Conclusion

Phase 142 P1 successfully extends the Canonicalizer to recognize continue patterns. The implementation:

  • Maintains SSOT architecture
  • Passes all unit tests (8/8)
  • Achieves strict parity agreement with router
  • Preserves existing behavior
  • Follows existing re-export pattern from Phase 140-P4-A

All acceptance criteria met.


P2: Pattern4 Lowering Extension (IN PROGRESS)

Objective

Extend Pattern4 lowering to handle "continue + return" patterns found in parse_string/array/object.

Target Pattern

  • tools/selfhost/test_pattern4_parse_string.hako - Parse string with continue (escape) + return (quote)

Pattern4 Lowering Contract (Phase 142 P2)

Accepted Minimum Structure

Return Handling:

  • Position: Early return inside one or more if blocks
  • Type: Scalar return values (complex returns are out of scope)
  • Constraint: Only the last return in loop body is processed

Continue Side Updates:

  • Pattern: if cond { carrier = carrier ± 1; continue }
  • Update: Constant step only (+1, -1, +2, -2, etc.)
  • Constraint: Multiple carriers not yet supported

Carrier and Payload:

  • Carrier: Loop variable used in loop condition
  • Payload: State updated on non-continue path (e.g., result string)

Exit Contract:

  • has_continue = true (continue pattern exists)
  • has_return = true (early return exists)
  • Both must coexist

Unsupported (Fail-Fast)

The following patterns are rejected with explicit error messages:

  • Multiple continue patterns (2+ continue statements)
  • Nested continue-return (continue inside if inside if)
  • Complex return values (returning multiple fields)
  • Variable step updates (escape sequence handling, etc.)

Implementation Strategy

Step 1: Clarify Pattern4 contract (this document) Step 2: Add E2E test case Step 3: Extend Pattern4 lowerer Step 4: Consider box-ification / modularization Step 5: Implementation and verification

Progress

  • Step 1: Contract clarification
  • Step 2: Add test case
  • Step 3: Extend lowerer
  • Step 4: Consider box-ification
  • Step 5: Verification complete

Acceptance Criteria

  • Representative test (parse_string or simple_continue) passes JoinIR lowering
  • Execution results are correct in both VM and LLVM (scope to be determined)
  • No regression in existing tests (phase132_exit_phi_parity, etc.)
  • Unsupported patterns fail fast with reason (error_tags)
  • No new environment variables added (dev-only observation only)
  • Documentation updated

Files to Modify

  1. docs/development/current/main/phases/phase-142/README.md - Contract documentation
  2. tools/selfhost/test_pattern4_parse_string_lowering.hako - Minimal E2E test (new)
  3. src/mir/builder/control_flow/joinir/patterns/pattern4_with_continue.rs - Lowerer extension
  4. src/mir/builder/control_flow/joinir/patterns/pattern4_carrier_analyzer.rs - Carrier analysis (if needed)

Step 3-A: Early Return Fail-Fast (COMPLETE )

Status: COMPLETE - Return detection and explicit error implemented

Implementation: Added has_return_in_body() helper function to Pattern4 lowerer

  • Recursively scans loop body for return statements
  • Returns explicit Fail-Fast error when return is detected
  • Error message references Phase 142 P2 for future lowering

Test Results: All 14 canonicalizer tests PASS (no regressions)

Key Achievement: Unsafe silent acceptance is now prevented - early returns explicitly surface as errors with actionable messages.

Step 3-B: Return Path JoinIR Generation (DEFERRED)

Status: 🔄 DEFERRED for separate session - Large-scale implementation requires careful design

Why separate: JoinIR generation involves responsibility boundary decisions (Pattern4 direct vs delegation to Pattern5) and ExitMeta/payload handling. Separating ensures cleaner cause analysis.

Design questions to resolve first:

  1. Should return be handled directly in Pattern4 lowerer, or delegated to Pattern5?
  2. How to transport return payload through exit/boundary/ExitMeta (can we reuse ContinueReturn assets)?

SSOT References

  • Design: docs/development/current/main/design/loop-canonicalizer.md
  • JoinIR Architecture: docs/development/current/main/joinir-architecture-overview.md
  • Pattern4 Implementation: src/mir/builder/control_flow/joinir/patterns/pattern4_with_continue.rs