hakorune/docs/development/current/main/phase175-multicarrier-design.md
nyash-codex 24aa8ced75 feat(joinir): Phase 175 - P5 multi-carrier architecture validation
Task 175-1: Analyzed _parse_string carrier candidates
- Identified 3 carriers: pos (position), result (buffer), is_ch_match (promoted)
- Categorized as: required carriers (pos, result), promoted carrier (is_ch_match)

Task 175-2: Validated existing boxes support multi-carrier
- CarrierInfo: Vec<CarrierData> supports arbitrary carriers
- LoopUpdateAnalyzer: Loops over all carriers
- ExitMeta: Vec<(String, ValueId)> supports all exit bindings
- ExitLineReconnector: Reconnects all carriers to variable_map
- No code changes needed - architecture already supports it!

Task 175-3: PoC test revealed Pattern2 limitation
- Test: test_jsonparser_parse_string_min2.hako (pos + result carriers)
- CarrierInfo detected 3 carriers correctly (pos, result, is_ch_match)
- variable_map contains all carriers at pattern2_start
- BUT: Pattern2's Trim optimization only emits pos carrier in MIR
- MIR shows result stays as empty string (no loop update emitted)
- Root cause: Trim pattern focuses on position-only optimization

Task 175-4: Documentation updates
- Created: phase175-multicarrier-design.md (comprehensive analysis)
- Updated: CURRENT_TASK.md (Phase 175 completion)
- Updated: routing.rs (added JsonParserStringTest2 whitelist)

Key Finding:
- Architecture is sound: all boxes support multi-carrier
- Pattern2 implementation gap: Trim optimization ignores non-position carriers
- Phase 176 scope: Extend Pattern2 to emit all carrier updates

Next: Phase 176 for escape sequence handling and full multi-carrier emission
2025-12-08 13:34:43 +09:00


Phase 175: P5 Multiple Carrier Support Design

Date: 2025-12-08
Purpose: Extend the P5 pipeline to support multiple carriers for complex loops like _parse_string


1. Target: _parse_string Carrier Analysis

1.1 Loop Structure (lines 150-178)

_parse_string(s, pos) {
    if s.substring(pos, pos+1) != '"' { return null }

    local p = pos + 1      // Carrier 1: position index
    local str = ""         // Carrier 2: result buffer

    loop(p < s.length()) {
        local ch = s.substring(p, p+1)  // LoopBodyLocal (promotion candidate)

        if ch == '"' {
            // End of string
            local result = new MapBox()
            result.set("value", me._unescape_string(str))  // Uses str carrier
            result.set("pos", p + 1)                        // Uses p carrier
            result.set("type", "string")
            return result
        }

        if ch == "\\" {
            // Escape sequence (Phase 176 scope)
            local has_next = 0
            if p + 1 < s.length() { has_next = 1 }
            if has_next == 0 { return null }

            str = str + ch
            p = p + 1
            str = str + s.substring(p, p+1)
            p = p + 1
            continue  // ⚠️ Phase 176 scope
        }

        str = str + ch  // Update carrier 2
        p = p + 1       // Update carrier 1
    }

    return null
}

1.2 Carrier Candidates Table

| Variable | Type | Update Pattern | Exit Usage | Carrier Status |
|----------|------|----------------|------------|----------------|
| p | IntegerBox | p = p + 1 | Position in result.set("pos", p + 1) | Required Carrier |
| str | StringBox | str = str + ch | String buffer in result.set("value", me._unescape_string(str)) | Required Carrier |
| ch | StringBox | local ch = s.substring(p, p+1) | Loop body comparison | LoopBodyLocal (promotion target) |
| has_next | IntegerBox | local has_next = 0 | Escape processing guard | Loop body only (Phase 176) |

1.3 Carrier Classification

Required Carriers (Exit-dependent):

  1. p: Position index - final value used in result.set("pos", p + 1)
  2. str: Result buffer - final value used in result.set("value", me._unescape_string(str))

Promoted Carriers (P5 mechanism):

  3. is_ch_match: Bool carrier promoted from ch (Trim pattern detection)

  • Pattern: ch = s.substring(p, p+1) → ch == "\"" equality chain
  • Promotion: LoopBodyCarrierPromoter converts to bool carrier

Loop-Internal Only (No carrier needed):

  • ch: LoopBodyLocal, promotion target → becomes is_ch_match carrier
  • has_next: Escape sequence guard (Phase 176 scope)

2. Phase 175 Minimal PoC Scope

Goal: Prove multi-carrier support with 2 carriers (p + str), excluding escape handling

2.1 Minimal PoC Structure

_parse_string_min2() {
    me.s = "hello world\""
    me.pos = 0
    me.len = me.s.length()
    me.result = ""

    // 2-carrier version: pos + result updated together
    loop(me.pos < me.len) {
        local ch = me.s.substring(me.pos, me.pos+1)
        if ch == "\"" {
            break
        } else {
            me.result = me.result + ch  // Carrier 2 update
            me.pos = me.pos + 1         // Carrier 1 update
        }
    }

    // Exit: both pos and result are used
    print("Parsed string: ")
    print(me.result)
    print(", final pos: ")
    print(me.pos)
}

Carrier Count: 2 (pos, result) + 1 promoted (is_ch_match) = 3 total

Excluded from Phase 175:

  • Escape sequence handling (the ch == "\\" branch and its continue path)
  • Complex nested conditionals
  • Focus: Simple char accumulation + position increment
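
The PoC loop can be transliterated into Rust to make the three carriers explicit. This is only a sketch of the intended semantics, not code from the repository: `CarriersSketch` and `parse_string_min2_sketch` are hypothetical names, and modelling the promoted `is_ch_match` carrier as a plain `bool` field is an assumption about how the promotion behaves.

```rust
/// Hypothetical bundle of the loop-carried state in the minimal PoC.
#[derive(Debug)]
struct CarriersSketch {
    pos: usize,        // carrier 1: position index
    result: String,    // carrier 2: result buffer
    is_ch_match: bool, // promoted carrier: did the current char match '"'?
}

/// Rust stand-in for `_parse_string_min2`: accumulate characters until a
/// double quote (or end of input) and return the final carrier values.
fn parse_string_min2_sketch(s: &str) -> CarriersSketch {
    let chars: Vec<char> = s.chars().collect();
    let mut c = CarriersSketch {
        pos: 0,
        result: String::new(),
        is_ch_match: false,
    };
    while c.pos < chars.len() {
        let ch = chars[c.pos]; // loop-body local, replaced by is_ch_match
        c.is_ch_match = ch == '"';
        if c.is_ch_match {
            break;
        }
        c.result.push(ch); // carrier 2 update
        c.pos += 1;        // carrier 1 update
    }
    c
}
```

For the PoC input `hello world"`, all three fields are live at loop exit, which is the property a correct lowering has to preserve.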

3. P5 Multi-Carrier Architecture

3.1 Existing Boxes Already Support Multi-Carrier

3.1.1 LoopUpdateAnalyzer (Multi-carrier ready)

File: src/mir/join_ir/lowering/loop_update_analyzer.rs
API: identify_updated_carriers(body, all_carriers) -> Result<CarrierInfo, String>
Multi-carrier support: Loops over all_carriers.carriers

Code:

pub fn identify_updated_carriers(
    body: &[ASTNode],
    all_carriers: &CarrierInfo,
) -> Result<CarrierInfo, String> {
    let mut updated = CarrierInfo::new();
    for carrier in &all_carriers.carriers {  // ✅ Multi-carrier loop
        if is_updated_in_body(body, &carrier.name) {
            updated.add_carrier(carrier.clone());
        }
    }
    Ok(updated)
}
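
The helper `is_updated_in_body` is not shown in this document. A minimal sketch of what such a check could look like follows, assuming a toy AST; `NodeSketch`, `is_updated_in_body_sketch`, and `identify_updated_sketch` are hypothetical stand-ins, not the real `ASTNode` or analyzer API. The point is that the search has to recurse into branches, because the PoC updates both carriers inside an `else` block.

```rust
/// Toy AST, standing in for the real `ASTNode` (assumption for illustration).
enum NodeSketch {
    Assign { target: String },
    If { then_body: Vec<NodeSketch>, else_body: Vec<NodeSketch> },
    Other, // any statement that assigns nothing of interest (local, break, ...)
}

/// Returns true if `name` is assigned anywhere in `body`, including nested branches.
fn is_updated_in_body_sketch(body: &[NodeSketch], name: &str) -> bool {
    body.iter().any(|node| match node {
        NodeSketch::Assign { target } => target.as_str() == name,
        NodeSketch::If { then_body, else_body } => {
            is_updated_in_body_sketch(then_body, name)
                || is_updated_in_body_sketch(else_body, name)
        }
        NodeSketch::Other => false,
    })
}

/// Keeps only the carriers that the loop body actually updates, in input order.
fn identify_updated_sketch<'a>(body: &[NodeSketch], all: &[&'a str]) -> Vec<&'a str> {
    all.iter()
        .copied()
        .filter(|name| is_updated_in_body_sketch(body, *name))
        .collect()
}
```

Under this sketch a promoted carrier such as `is_ch_match` is not reported as updated, since it never appears as an assignment target in the source loop body.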

3.1.2 LoopBodyCarrierPromoter (Adds carriers)

File: src/mir/loop_pattern_detection/loop_body_carrier_promoter.rs
API: try_promote(request) -> PromotionResult
Multi-carrier support: Generates additional carriers from promotion

Behavior:

let promoted = LoopBodyCarrierPromoter::try_promote(&request)?;
carrier_info.merge_from(&promoted.to_carrier_info());  // Add promoted carrier
// Result: carrier_info.carriers = [pos, result, is_ch_match]
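
The merge step can be sketched with simplified stand-in types. `CarrierSketch` and `CarrierInfoSketch` are hypothetical, and skipping carriers whose name is already present is an assumption made here for illustration; the real `merge_from` may behave differently.

```rust
/// Simplified stand-in for a carrier entry (only the name is modelled).
#[derive(Clone, Debug, PartialEq)]
struct CarrierSketch {
    name: String,
}

/// Simplified stand-in for `CarrierInfo`.
#[derive(Default, Debug)]
struct CarrierInfoSketch {
    carriers: Vec<CarrierSketch>,
}

impl CarrierInfoSketch {
    fn add_carrier(&mut self, name: &str) {
        self.carriers.push(CarrierSketch { name: name.to_string() });
    }

    /// Appends carriers from `other`, skipping names already present
    /// (deduplication is an assumption of this sketch).
    fn merge_from(&mut self, other: &CarrierInfoSketch) {
        for c in &other.carriers {
            if !self.carriers.iter().any(|existing| existing.name == c.name) {
                self.carriers.push(c.clone());
            }
        }
    }

    fn names(&self) -> Vec<&str> {
        self.carriers.iter().map(|c| c.name.as_str()).collect()
    }
}
```

Merging a promoted set containing `is_ch_match` into a set holding `pos` and `result` yields the three-element order shown in the comment above, and merging the same set again leaves it unchanged.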

3.1.3 CarrierInfo (Multi-carrier container)

File: src/mir/join_ir/lowering/carrier_info.rs
API: carriers: Vec<CarrierData>, merge_from(&other)
Multi-carrier support: Vec holds arbitrary number of carriers

Phase 175-3 Usage:

let mut carrier_info = CarrierInfo::new();
carrier_info.add_carrier(CarrierData {  // Carrier 1
    name: "pos".to_string(),
    update_expr: UpdateExpr::Simple { ... },
});
carrier_info.add_carrier(CarrierData {  // Carrier 2
    name: "result".to_string(),
    update_expr: UpdateExpr::Simple { ... },
});
// carrier_info.carriers.len() == 2 ✅

3.1.4 ExitMeta / ExitBinding (Multi-carrier ready)

File: src/mir/builder/control_flow/joinir/merge/exit_phi_builder.rs
API: carrier_exits: Vec<(String, ValueId)>
Multi-carrier support: Vec holds all carrier exits

ExitMetaCollector Behavior:

for carrier in &carrier_info.carriers {  // ✅ Multi-carrier loop
    exit_bindings.push((carrier.name.clone(), exit_value_id));
}
// exit_bindings = [("pos", ValueId(10)), ("result", ValueId(20)), ("is_ch_match", ValueId(30))]

3.1.5 ExitLineReconnector (Multi-carrier ready)

File: src/mir/builder/control_flow/joinir/merge/mod.rs
API: reconnect_exit_bindings(exit_bindings, loop_header_phi_info, variable_map)
Multi-carrier support: Loops over all exit_bindings

Behavior:

for (carrier_name, _) in exit_bindings {  // ✅ Multi-carrier loop
    if let Some(phi_dst) = loop_header_phi_info.get_carrier_phi_dst(carrier_name) {
        variable_map.insert(carrier_name.clone(), phi_dst);
    }
}
// variable_map: {"pos" -> ValueId(100), "result" -> ValueId(200), "is_ch_match" -> ValueId(300)}
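
A self-contained sketch of the reconnection loop follows, using plain `HashMap`s in place of the real builder state. `ValueIdSketch` and `reconnect_sketch` are hypothetical names, and returning the list of skipped carriers is an addition of this sketch, not part of the API described above. It shows that the reconnector handles any number of carriers, but can only rebind a carrier for which a loop-header PHI exists.

```rust
use std::collections::HashMap;

/// Stand-in for the MIR `ValueId` (assumption: a simple numeric id).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ValueIdSketch(u32);

/// Rebinds each exit-bound carrier to its loop-header PHI destination.
/// Returns the carriers that had no header PHI and were left untouched.
fn reconnect_sketch(
    exit_bindings: &[(String, ValueIdSketch)],
    header_phi_dst: &HashMap<String, ValueIdSketch>,
    variable_map: &mut HashMap<String, ValueIdSketch>,
) -> Vec<String> {
    let mut skipped = Vec::new();
    for (name, _exit_value) in exit_bindings {
        match header_phi_dst.get(name) {
            Some(dst) => {
                variable_map.insert(name.clone(), *dst);
            }
            None => skipped.push(name.clone()),
        }
    }
    skipped
}
```

If the lowerer emits a header PHI only for `pos`, this loop rebinds `pos` and leaves the other carriers at whatever value `variable_map` already held, which is consistent with a reconnector that is multi-carrier ready but dependent on what the pattern lowerer emits.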

4. Conclusion: Architecture Supports Multi-Carrier (Pattern2 limitation found)

4.1 Phase 175-3 Test Results

Test: local_tests/test_jsonparser_parse_string_min2.hako

MIR Analysis:

bb3:
    %9 = copy %8  // result = "" (initialization)

bb5:  // Loop header
    %25 = phi [%4, bb3], [%21, bb10]  // Only pos carrier!
    // ❌ Missing: result carrier PHI

bb10:  // Loop update
    %21 = %25 Add %20  // pos = pos + 1
    // ❌ Missing: result = result + ch update

bb12:  // Exit block
    %29 = copy %9  // Still uses original %9 (empty string)!

Root Cause: Pattern2's Trim optimization only emits pos carrier, ignoring result updates in loop body.
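
The observable effect of this gap can be reproduced with a small simulation. This is not the lowerer itself: `run_loop_sketch` and its `emit_result_update` flag are hypothetical, and the flag simply models whether the `result = result + ch` update is emitted.

```rust
/// Simulates the PoC loop. When `emit_result_update` is false, the body
/// update for `result` is dropped, mirroring the MIR shown above where
/// only the `pos` carrier is updated.
fn run_loop_sketch(s: &str, emit_result_update: bool) -> (usize, String) {
    let chars: Vec<char> = s.chars().collect();
    let mut pos = 0usize;
    let mut result = String::new();
    while pos < chars.len() {
        let ch = chars[pos];
        if ch == '"' {
            break;
        }
        if emit_result_update {
            result.push(ch); // the update that the MIR above is missing
        }
        pos += 1;
    }
    (pos, result)
}
```

With the flag off, `pos` still reaches the closing quote while `result` stays empty, matching the exit block that copies the original `%9`. With the flag on, both carriers hold their expected final values.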

Architecture Validation:

  • CarrierInfo detected 3 carriers (pos, result, is_ch_match)
  • variable_map contains all carriers at pattern2_start
  • Existing boxes (ExitMeta, ExitLineReconnector) support multi-carrier
  • Pattern2 lowerer only emits loop update for pos, not result

Conclusion:

  • Architecture is sound - all boxes support multi-carrier
  • Pattern2 implementation gap - Trim optimization doesn't emit body updates for non-position carriers
  • Phase 176 scope - Extend Pattern2 to emit all carrier updates, not just position

4.2 Next Steps

  • Phase 175-3 (done): PoC test run (test_jsonparser_parse_string_min2.hako), which exposed the Pattern2 gap described in 4.1
  • Phase 176: Extend Pattern2 to emit all carrier updates, then add escape sequence handling (continue path)
  • Phase 177: Full _parse_string with all edge cases

5. References