Files
hakorune/docs/development/current/main/phase194-loop-inventory.md
nyash-codex 996925ebaf fix(joinir): Phase 196 Select double-remap bug in instruction_rewriter
Root cause: PHI inputs were being remapped twice in instruction_rewriter.rs
- Line 304: remap_instruction() already remapped JoinIR → Host ValueIds
- Line 328: remap_value() attempted to remap again → undefined ValueIds

Fix: Only remap block IDs, use already-remapped ValueIds as-is

Test results:
- phase195_sum_count.hako → 93  (multi-carrier P3)
- loop_if_phi.hako → sum=9  (single-carrier P3)
- loop_min_while.hako → 0,1,2  (Pattern 1)
- joinir_min_loop.hako → RC:0  (Pattern 2)
- No [joinir/freeze], no regressions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 14:45:04 +09:00

9.6 KiB

Phase 194: JsonParser Loop Inventory

Date: 2025-12-09 File: tools/hako_shared/json_parser.hako Total Loops: 10


JoinIR Target Loops (Phase 194)

Loops that CAN be handled by existing P1/P2/P5 patterns.

Loop Name Pattern Line Carrier Update Status
_skip_whitespace P2 312-319 p p+1 Whitelisted
_trim (leading) P5 330-337 start start+1 Whitelisted
_trim (trailing) P5 340-347 end end-1 Whitelisted
_match_literal P2 357-362 i i+1 Whitelisted

Pattern Details

_skip_whitespace (Line 312-319)

loop(p < s.length()) {
    local ch = s.substring(p, p+1)
    if ch == " " || ch == "\t" || ch == "\n" || ch == "\r" {
        p = p + 1
    } else {
        break
    }
}
  • Pattern: P2 (break condition)
  • Carrier: p (IntegerBox)
  • Update: p = p + 1
  • Routing: Already whitelisted as JsonParserBox._skip_whitespace/2

_trim Leading (Line 330-337)

loop(start < end) {
    local ch = s.substring(start, start+1)
    if ch == " " || ch == "\t" || ch == "\n" || ch == "\r" {
        start = start + 1
    } else {
        break
    }
}
  • Pattern: P5 (Trim specialized)
  • Carrier: start (IntegerBox)
  • Update: start = start + 1
  • Routing: Already whitelisted as JsonParserBox._trim/1

_trim Trailing (Line 340-347)

loop(end > start) {
    local ch = s.substring(end-1, end)
    if ch == " " || ch == "\t" || ch == "\n" || ch == "\r" {
        end = end - 1
    } else {
        break
    }
}
  • Pattern: P5 (Trim specialized)
  • Carrier: end (IntegerBox)
  • Update: end = end - 1
  • Routing: Already whitelisted as JsonParserBox._trim/1

_match_literal (Line 357-362)

loop(i < len) {
    if s.substring(pos + i, pos + i + 1) != literal.substring(i, i + 1) {
        return 0
    }
    i = i + 1
}
  • Pattern: P2 (break via return)
  • Carrier: i (IntegerBox)
  • Update: i = i + 1
  • Routing: Already whitelisted as JsonParserBox._match_literal/3

Deferred Loops (Phase 200+)

Loops that CANNOT be handled by current P1/P2/P5 patterns.

Loop Name Line Deferral Reason Target Phase
_parse_number 121-133 digits.indexOf() - ConditionEnv constraint Phase 200+
_atoi 453-460 digits.indexOf() - ConditionEnv constraint Phase 200+
_parse_string 150-178 Complex carriers (escaped flag via continue) Phase 195+
_unescape_string 373-431 Complex carriers (is_escape, has_next, process_escape) Phase 195+
_parse_array 203-231 Multiple MethodCalls (_parse_value, nested) Phase 195+
_parse_object 256-304 Multiple MethodCalls (_parse_string, _parse_value) Phase 195+

Deferral Details

_parse_number (Line 121-133) - ConditionEnv Constraint

local digits = "0123456789"  // ← External local variable
loop(p < s.length()) {
    local ch = s.substring(p, p+1)
    local digit_pos = digits.indexOf(ch)  // ← Uses external 'digits'

    if digit_pos < 0 {
        break
    }

    num_str = num_str + ch
    p = p + 1
}
  • Reason: digits is an external local variable not in ConditionEnv
  • Constraint: Phase 193 ConditionEnv only includes: function params, loop carriers, body-local vars
  • Workaround: None (fundamental ConditionEnv limitation)
  • Solution: Phase 200+ ConditionEnv expansion OR .hako rewrite to inline digits

_atoi (Line 453-460) - ConditionEnv Constraint

local digits = "0123456789"  // ← External local variable
loop(i < n) {
    local ch = s.substring(i, i+1)
    if ch < "0" || ch > "9" { break }
    local pos = digits.indexOf(ch)  // ← Uses external 'digits'
    if pos < 0 { break }
    v = v * 10 + pos
    i = i + 1
}
  • Reason: Same as _parse_number - digits.indexOf() dependency
  • Solution: Phase 200+ ConditionEnv expansion

_parse_string (Line 150-178) - Complex Carriers

loop(p < s.length()) {
    local ch = s.substring(p, p+1)

    if ch == '"' {
        // ... MapBox construction ...
        return result
    }

    if ch == "\\" {
        local has_next = 0
        if p + 1 < s.length() { has_next = 1 }

        if has_next == 0 { return null }

        str = str + ch
        p = p + 1
        str = str + s.substring(p, p+1)
        p = p + 1
        continue  // ← Escape handling via continue
    }

    str = str + ch
    p = p + 1
}
  • Reason: Escape handling logic uses continue + conditional flag (has_next)
  • Pattern Fit: P2 or P3 with complex control flow
  • Solution: Phase 195+ Pattern 3 extension for if-in-loop with continue

_unescape_string (Line 373-431) - Complex Carriers

loop(i < s.length()) {
    local ch = s.substring(i, i+1)

    local is_escape = 0
    local has_next = 0

    if ch == "\\" { is_escape = 1 }
    if i + 1 < s.length() { has_next = 1 }

    local process_escape = 0
    if is_escape == 1 {
        if has_next == 1 {
            process_escape = 1
        }
    }

    if process_escape == 1 {
        // ... multiple escape type checks ...
        continue
    }

    result = result + ch
    i = i + 1
}
  • Reason: Multiple body-local flags (is_escape, has_next, process_escape) + nested conditions
  • Pattern Fit: P3 with complex flatten pattern
  • Solution: Phase 195+ Pattern 3 extension for multi-flag carriers

_parse_array / _parse_object - Multiple MethodCalls

// _parse_array
loop(p < s.length()) {
    local elem_result = me._parse_value(s, p)  // ← MethodCall 1
    if elem_result == null { return null }

    local elem = elem_result.get("value")  // ← MethodCall 2
    arr.push(elem)  // ← MethodCall 3

    p = elem_result.get("pos")  // ← MethodCall 4
    // ...
}

// _parse_object
loop(p < s.length()) {
    local key_result = me._parse_string(s, p)  // ← MethodCall 1
    // ...
    local value_result = me._parse_value(s, p)  // ← MethodCall 2
    // ...
    obj.set(key, value)  // ← MethodCall 3
    // ...
}
  • Reason: Multiple MethodCalls per iteration (4+ calls)
  • Constraint: Phase 193 supports 1 MethodCall in init, not multiple in body
  • Solution: Phase 195+ MethodCall extension for loop body

Statistics

Coverage

  • JoinIR Target: 4/10 loops (40%)
  • Deferred: 6/10 loops (60%)

Pattern Distribution (Target Loops)

  • P2 (Break): 2 loops (_skip_whitespace, _match_literal)
  • P5 (Trim): 2 loops (_trim leading, _trim trailing)

Deferral Reasons

  • ConditionEnv Constraint: 2 loops (_parse_number, _atoi)
  • Complex Carriers: 2 loops (_parse_string, _unescape_string)
  • Multiple MethodCalls: 2 loops (_parse_array, _parse_object)

Implementation Strategy (Phase 194)

Already Whitelisted

All 4 target loops are ALREADY in the routing whitelist:

  • JsonParserBox._skip_whitespace/2 (Line 88)
  • JsonParserBox._trim/1 (Line 87)
  • JsonParserBox._match_literal/3 (Line 89)

🎯 E2E Validation

Goal: Verify these loops actually run on JoinIR route without fallback.

# Test 1: Basic execution
NYASH_JOINIR_CORE=1 ./target/release/hakorune tools/hako_shared/json_parser.hako

# Test 2: Trace verification
NYASH_JOINIR_CORE=1 NYASH_JOINIR_DEBUG=1 ./target/release/hakorune tools/hako_shared/json_parser.hako 2>&1 | grep "\[trace:joinir\]"

# Expected: NO [joinir/freeze] messages (freeze = fallback to legacy)

⚠️ No New Whitelist Additions Needed

Phase 194 is about validating existing infrastructure, not adding new functions.

📊 Success Criteria

  • All 4 target loops run on JoinIR route (no freeze)
  • JsonParser parses basic JSON successfully
  • No regressions in Phase 190-193 tests
  • Clear documentation of deferred loops (this file)

Next Phase Priorities (Based on Deferral Analysis)

Phase 200+: ConditionEnv Expansion (High Value)

Impact: 2 loops (_parse_number, _atoi) Solution: Expand ConditionEnv to include function-scoped locals OR .hako rewrite

Phase 195+: Pattern 3 Extension (Medium Value)

Impact: 2 loops (_parse_string, _unescape_string) Solution: Support complex carriers with multiple body-local flags

Phase 195+: MethodCall Extension (Low Priority)

Impact: 2 loops (_parse_array, _parse_object) Solution: Support multiple MethodCalls in loop body Note: These are complex parsers, may be better handled by future optimization passes


Recommendations

Short-term (Phase 194)

  1. E2E test with existing 4 whitelisted loops
  2. Verify no fallback via trace
  3. Document findings (this file)

Medium-term (Phase 195)

  1. Focus on Pattern 3 extension (if-in-loop + multiple carriers)
  2. Target _parse_string as representative case
  3. Defer _unescape_string complexity until Pattern 3 is stable

Long-term (Phase 200+)

  1. ConditionEnv expansion design document
  2. Evaluate .hako rewrite vs. compiler extension
  3. Consider digits.indexOf() as representative case for all external local dependencies

Conclusion

Phase 194 is validation-focused: 4 target loops are ALREADY whitelisted and should work with existing P1/P2/P5 patterns. The 6 deferred loops have clear constraints that require future infrastructure:

  • ConditionEnv: Needs Phase 200+ design
  • Complex Carriers: Needs Phase 195 Pattern 3 extension
  • Multiple MethodCalls: Needs Phase 195+ design

This inventory provides a clear roadmap for future JoinIR expansion based on real-world loop patterns.