## Overview Analyzed 34 loops across selfhost codebase to identify JoinIR coverage gaps. Current readiness: 47% (16/30 loops). Next frontier: Pattern P5b (Escape Handling). ## Current Status - Phase 91 planning document: Complete - Loop inventory across 6 key files - Priority ranking: P5b (escape) > P5 (guard) > P6 (nested) - Effort estimates and ROI analysis - Pattern P5b Design: Complete - Problem statement (variable-step carriers) - Pattern definition with Skeleton layout - Recognition algorithm (8-step detection) - Capability taxonomy (P5b-specific guards) - Lowering strategy (Phase 92 preview) - Test fixture: Created - Minimal escape sequence parser - JSON string with backslash escape - Loop Canonicalizer extended - Capability table updated with P5b entries - Fail-Fast criteria documented - Implementation checklist added ## Key Findings ### Loop Readiness Matrix | Category | Count | JoinIR Status | |----------|-------|--------------| | Pattern 1 (simple bounded) | 16 | ✅ Ready | | Pattern 2 (with break) | 1 | ⚠️ Partial | | **Pattern P5b (escape seq)** | ~3 | ❌ NEW | | Pattern P5 (guard-bounded) | ~2 | ❌ Deferred | | Pattern P6 (nested loops) | ~8 | ❌ Deferred | ### Top Candidates 1. **P5b**: json_loader.hako:30 (8 lines, high reuse) - Effort: 2-3 days (recognition) - Impact: Unlocks all escape parsers 2. **P5**: mini_vm_core.hako:541 (204 lines, monolithic) - Effort: 1-2 weeks - Impact: Major JSON optimization 3. **P6**: seam_inspector.hako:76 (7+ nesting) - Effort: 2-3 weeks - Impact: Demonstrates nested composition ## Phase 91 Strategy **Recognition-only phase** (no lowering in P1): - Step 1: Design & planning ✅ - Step 2: Canonicalizer implementation (detect_escape_pattern) - Step 3: Unit tests + parity verification - Step 4: Lowering deferred to Phase 92 ## Files Added - docs/development/current/main/phases/phase-91/README.md - Full analysis & planning - docs/development/current/main/design/pattern-p5b-escape-design.md - Technical design - tools/selfhost/test_pattern5b_escape_minimal.hako - Test fixture ## Files Modified - docs/development/current/main/design/loop-canonicalizer.md - Capability table extended with P5b entries - Pattern P5b full section added - Implementation checklist updated ## Acceptance Criteria (Phase 91 Step 1) - ✅ Loop inventory complete (34 loops across 6 files) - ✅ Pattern P5b design document ready - ✅ Test fixture created - ✅ Capability taxonomy extended - ⏳ Implementation deferred (Step 2+) ## References - JoinIR Architecture: joinir-architecture-overview.md - Phase 91 Plan: phases/phase-91/README.md - P5b Design: design/pattern-p5b-escape-design.md Next: Implement detect_escape_pattern() recognition in Phase 91 Step 2 🤖 Generated with Claude Code Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Phase 91: JoinIR Coverage Expansion (Selfhost depth-2)
Status
- 🔍 Analysis Complete: Loop inventory across selfhost codebase
- 📋 Planning: Pattern P5b (Escape Handling) candidate selected
- ⏳ Implementation: Deferred to dedicated session
Executive Summary
Current JoinIR Readiness: 47% (16/30 loops in selfhost code)
| Category | Count | Status | Effort |
|---|---|---|---|
| Pattern 1 (simple bounded) | 16 | ✅ Ready | None |
| Pattern 2 (with break) | 1 | ⚠️ Partial | Low |
| Pattern P5b (escape handling) | ~3 | ❌ Blocked | Medium |
| Pattern P5 (guard-bounded) | ~2 | ❌ Blocked | High |
| Pattern P6 (nested loops) | ~8 | ❌ Blocked | Very High |
Analysis Results
Loop Inventory by Component
File: apps/selfhost-vm/boxes/json_cur.hako (3 loops)
- Lines 9-14: ✅ Pattern 1 (simple bounded)
- Lines 23-32: ✅ Pattern 1 variant with break
- Lines 42-57: ✅ Pattern 1 with guard-less loop(true)
File: apps/selfhost-vm/json_loader.hako (3 loops)
- Lines 16-22: ✅ Pattern 1 (simple bounded)
- Lines 30-37: ❌ Pattern P5b CANDIDATE (escape sequence handling)
- Lines 43-48: ✅ Pattern 1 (simple bounded)
File: apps/selfhost-vm/boxes/mini_vm_core.hako (9 loops)
- Lines 208-231: ⚠️ Pattern 1 variant (with continue)
- Lines 239-253: ✅ Pattern 1 (with accumulator)
- Lines 388-400, 493-505: ✅ Pattern 1 (6 bounded search loops)
- Lines 541-745: ❌ Pattern P5 PRIME CANDIDATE (guard-bounded, 204-line collect_prints)
File: apps/selfhost-vm/boxes/seam_inspector.hako (13 loops)
- Lines 10-26: ✅ Pattern 1
- Lines 38-42, 116-120, 123-127: ✅ Pattern 1 variants
- Lines 76-107: ❌ Pattern P6 (deeply nested, 7+ levels)
- Remaining: Mix of ⚠️ Pattern 1 variants with nested loops
File: apps/selfhost-vm/boxes/mini_vm_prints.hako (1 loop)
- Line 118+: ❌ Pattern P5 (guard-bounded multi-case)
Candidate Selection: Priority Order
🥇 IMMEDIATE CANDIDATE: Pattern P5b (Escape Handling)
Target: json_loader.hako:30 - read_digits_from()
Scope: 8-line loop
Current Structure:
loop(i < n) {
local ch = s.substring(i, i+1)
if ch == "\"" { break }
if ch == "\\" {
i = i + 1
ch = s.substring(i, i+1)
}
out = out + ch
i = i + 1
}
Pattern Classification:
- Header:
loop(i < n) - Escape Check:
if ch == "\\" { i = i + 2 instead of i + 1 } - Body: Append character
- Carriers:
i(position),out(buffer) - Challenge: Variable increment (sometimes +1, sometimes +2)
Why This Candidate:
- ✅ Small scope (8 lines) - good for initial implementation
- ✅ High reuse potential - same pattern appears in multiple parser locations
- ✅ Moderate complexity - requires conditional step extension (not fully generic)
- ✅ Clear benefit - would unlock escape sequence handling across all string parsers
- ❌ Scope limitation - conditional increment not yet in Canonicalizer
Effort Estimate: 2-3 days
- Canonicalizer extension: 4-6 hours
- Pattern recognizer: 2-3 hours
- Lowering implementation: 4-6 hours
- Testing + verification: 2-3 hours
🥈 SECOND CANDIDATE: Pattern P5 (Guard-Bounded)
Target: mini_vm_core.hako:541 - collect_prints()
Scope: 204-line loop (monolithic)
Current Structure:
loop(true) {
guard = guard + 1
if guard > 200 { break }
local p = index_of_from(json, k_print, pos)
if p < 0 { break }
// 5 different cases based on JSON type
if is_binary_op { ... pos = ... out.push(...) }
if is_compare { ... pos = ... out.push(...) }
if is_literal { ... pos = ... out.push(...) }
if is_function_call { ... pos = ... out.push(...) }
if is_nested { ... pos = ... out.push(...) }
pos = obj_end + 1
}
Pattern Classification:
- Header:
loop(true)(unconditional) - Guard:
guard > LIMITwith increment each iteration - Body: Multiple case-based mutations
- Carriers:
pos,printed,guard,out(ArrayBox) - Exit conditions: Guard exhaustion OR search failure
Why This Candidate:
- ✅ Monolithic optimization opportunity - 204 lines of complex control flow
- ✅ Real-world JSON parsing - demonstrates practical JoinIR application
- ✅ High performance impact - guard counter could be eliminated via SSA
- ❌ High complexity - needs new Pattern5 guard-handling variant
- ❌ Large scope - would benefit from split into micro-loops first
Effort Estimate: 1-2 weeks
- Design: 2-3 days (pattern definition, contract)
- Implementation: 5-7 days
- Testing + verification: 2-3 days
Alternative Strategy: Could split into 5 micro-loops per case:
// Instead of one 204-line loop with 5 cases:
// Create 5 functions, each handling one case:
loop_binary_op() { ... }
loop_compare() { ... }
loop_literal() { ... }
loop_function_call() { ... }
loop_nested() { ... }
// Then main loop dispatches:
loop(true) {
guard = guard + 1
if guard > limit { break }
if type == BINARY_OP { loop_binary_op(...) }
...
}
This would make each sub-loop Pattern 1-compatible immediately.
🥉 THIRD CANDIDATE: Pattern P6 (Nested Loops)
Target: seam_inspector.hako:76 - _scan_boxes()
Scope: Multi-level nested (7+ nesting levels)
Current Structure: 37-line outer loop containing 6 nested loops
Pattern Classification:
- Nesting levels: 7+
- Carriers: Multiple per level (
i,j,k,name,pos, etc.) - Exit conditions: Varied per level (bounds, break, continue)
- Scope handoff: Complex state passing between levels
Why This Candidate:
- ✅ Demonstrates nested composition - needed for production parsers
- ✅ Realistic code - actual box/function scanner
- ❌ Highest complexity - requires recursive JoinIR composition
- ❌ Long-term project - 2-3 weeks minimum
Effort Estimate: 2-3 weeks
- Design recursive composition: 3-5 days
- Per-level implementation: 7-10 days
- Testing nested composition: 3-5 days
Recommended Immediate Action
Phase 91 (This Session): Pattern P5b Planning
Objective: Design Pattern P5b (escape sequence handling) with minimal implementation
Steps:
- ✅ Analysis complete (done by Explore agent)
- Design P5b pattern (canonicalizer contract)
- Create minimal fixture (
test_pattern5b_escape_minimal.hako) - Extend Canonicalizer to recognize escape patterns
- Plan lowering (defer implementation to next session)
- Document P5b architecture in loop-canonicalizer.md
Acceptance Criteria:
- ✅ Pattern P5b design document complete
- ✅ Minimal escape test fixture created
- ✅ Canonicalizer recognizes escape patterns (dev-only observation)
- ✅ Parity check passes (strict mode)
- ✅ No lowering changes yet (recognition-only phase)
Deliverables:
docs/development/current/main/phases/phase-91/README.md- This documentdocs/development/current/main/design/pattern-p5b-escape-design.md- Pattern design (new)tools/selfhost/test_pattern5b_escape_minimal.hako- Test fixture (new)- Updated
docs/development/current/main/design/loop-canonicalizer.md- Capability tags extended
Design: Pattern P5b (Escape Sequence Handling)
Motivation
String parsing commonly requires escape sequence handling:
- Double quotes:
"text with \" escaped quote" - Backslashes:
"path\\with\\backslashes" - Newlines:
"text with \n newline"
Current loops handle this with conditional increment:
if ch == "\\" {
i = i + 1 // Skip escape character itself
ch = next_char
}
i = i + 1 // Always advance
This variable-step pattern is not JoinIR-compatible because:
- Loop increment is conditional (sometimes +1, sometimes +2)
- Canonicalizer expects constant-delta carriers
- Lowering expects uniform update rules
Solution: Pattern P5b Definition
Header Requirement
loop(i < n) // Bounded loop on string length
Escape Check Requirement
if ch == "\\" {
i = i + delta_skip // Skip character (typically +1, +2, or variable)
// Optional: consume escape character
ch = s.substring(i, i+1)
}
After-Escape Requirement
// Standard character processing
out = out + ch
i = i + delta_normal // Standard increment (typically +1)
Skeleton Structure
LoopSkeleton {
steps: [
HeaderCond(i < n),
Body(escape_check_stmts),
Body(process_char_stmts),
Update(i = i + normal_delta, maybe(i = i + skip_delta))
]
}
Carrier Configuration
- Primary Carrier: Loop variable (
i)delta_normal: +1 (standard case)delta_escape: +1 or +2 (skip escape)
- Secondary Carrier: Accumulator (
out)- Pattern:
out = out + value
- Pattern:
ExitContract
ExitContract {
has_break: true, // Break on quote detection
has_continue: false,
has_return: false,
carriers: vec![
CarrierInfo { name: "i", deltas: [+1, +2] },
CarrierInfo { name: "out", pattern: Append }
]
}
Routing Decision
RoutingDecision {
chosen: Pattern5bEscape,
structure_notes: ["escape_handling", "variable_step"],
missing_caps: [] // All required capabilities present
}
Recognition Algorithm
AST Inspection Steps
-
Find escape check:
- Pattern:
if ch == "\\" { ... } - Extract: Escape character constant
- Extract: Increment inside if block
- Pattern:
-
Extract skip delta:
- Pattern:
i = i + <const> - Calculate:
skip_delta = <const>
- Pattern:
-
Find normal increment:
- Pattern:
i = i + <const>(after escape if block) - Calculate:
normal_delta = <const>
- Pattern:
-
Validate break condition:
- Pattern:
if <char> == "<quote>" { break } - Required for string boundary detection
- Pattern:
-
Build LoopSkeleton:
- Carriers:
[{name: "i", deltas: [normal, skip]}, ...] - ExitContract:
has_break=true - RoutingDecision:
chosen=Pattern5bEscape
- Carriers:
Implementation Plan
Canonicalizer Extension (src/mir/loop_canonicalizer/canonicalizer.rs)
Add detect_escape_pattern() recognition:
fn detect_escape_pattern(
loop_expr: &Expr,
carriers: &[String]
) -> Option<EscapePatternInfo> {
// Step 1-5 as above
// Return: { escape_char, skip_delta, normal_delta, carrier_name }
}
Priority: Call before detect_skip_whitespace_pattern() (more specific pattern first)
Pattern Recognizer Wrapper (src/mir/loop_canonicalizer/pattern_recognizer.rs)
Expose detect_escape_pattern():
pub fn try_extract_escape_pattern(
loop_expr: &Expr
) -> Option<(String, i64, i64)> { // (carrier, normal_delta, skip_delta)
// Delegate to canonicalizer detection
}
Test Fixture (tools/selfhost/test_pattern5b_escape_minimal.hako)
Minimal reproducible example:
// Minimal escape sequence parser
local s = "\\"hello\\" world"
local n = s.length()
local i = 0
local out = ""
loop(i < n) {
local ch = s.substring(i, i+1)
if ch == "\"" {
break
}
if ch == "\\" {
i = i + 1 // Skip escape character
if i < n {
ch = s.substring(i, i+1)
}
}
out = out + ch
i = i + 1
}
print(out) // Should print: hello" world
Files to Modify (Phase 91)
New Files
docs/development/current/main/phases/phase-91/README.md← You are heredocs/development/current/main/design/pattern-p5b-escape-design.md(new - detailed design)tools/selfhost/test_pattern5b_escape_minimal.hako(new - test fixture)
Modified Files
-
docs/development/current/main/design/loop-canonicalizer.md- Add Pattern P5b to capability matrix
- Add recognition algorithm
- Add routing decision table
-
(Phase 91 Step 2+)
src/mir/loop_canonicalizer/canonicalizer.rs- Add
detect_escape_pattern()function - Extend
canonicalize_loop_expr()to check for escape patterns
- Add
-
(Phase 91 Step 2+)
src/mir/loop_canonicalizer/pattern_recognizer.rs- Add
try_extract_escape_pattern()wrapper
- Add
Next Steps (Future Sessions)
Phase 91 Step 2: Implementation
- Implement
detect_escape_pattern()in Canonicalizer - Add unit tests for escape pattern recognition
- Verify strict parity with router
Phase 92: Lowering
- Implement Pattern5bEscape lowerer
- Handle variable-step carrier updates
- E2E test with
test_pattern5b_escape_minimal.hako
Phase 93: Pattern P5 (Guard-Bounded)
- Implement Pattern5 for
mini_vm_core.hako:541 - Consider micro-loop refactoring alternative
- Document guard-counter optimization strategy
Phase 94+: Pattern P6 (Nested Loops)
- Recursive JoinIR composition for
seam_inspector.hako:76 - Cross-level scope/carrier handoff
SSOT References
- JoinIR Architecture:
docs/development/current/main/joinir-architecture-overview.md - Loop Canonicalizer Design:
docs/development/current/main/design/loop-canonicalizer.md - Capability Tags:
src/mir/loop_canonicalizer/capability_guard.rs
Summary
Phase 91 establishes the next frontier of JoinIR coverage: Pattern P5b (Escape Handling).
This pattern unlocks:
- ✅ All string escape parsing loops
- ✅ Foundation for Pattern P5 (guard-bounded)
- ✅ Preparation for Pattern P6 (nested loops)
Current readiness: 47% (16/30 loops) After Phase 91: Expected to reach ~60% (18/30 loops) Long-term target: >90% coverage with P5, P5b, P6 patterns
All acceptance criteria defined. Implementation ready for next session.