diff --git a/CURRENT_TASK.md b/CURRENT_TASK.md
index c7778898..859ef9ea 100644
--- a/CURRENT_TASK.md
+++ b/CURRENT_TASK.md
@@ -69,6 +69,14 @@
       → Task 174-5: Update documentation (phase174-jsonparser-p5b-design.md + CURRENT_TASK).
       → **Result**: Demonstrated that the Trim P5 pipeline also works for the minimized `_parse_string`.
         Confirmed that the same promotion pattern as Trim works even when the compared character is `"\""` (the closing quote).
+  - [x] Phase 175: P5 multi-carrier support (for _parse_string) ✅ (2025-12-08)
+      → Task 175-1: Enumerate the carrier candidates of _parse_string (identified 3 carriers: pos, result, is_ch_match).
+      → Task 175-2: Finalize the design for carrying multiple carriers in CarrierInfo / ExitMeta (confirmed that all existing boxes already support multiple carriers).
+      → Task 175-3: Run just 2 carriers through a small PoC (MIR inspection identified a Pattern2 limitation).
+      → Task 175-4: Update CURRENT_TASK / joinir-architecture-overview.
+      → **Result**: Confirmed that the P5 architecture already supports multiple carriers.
+        CarrierInfo detects 3 carriers (pos, result, is_ch_match).
+        Found a limitation: because of the Trim optimization, the Pattern2 implementation emits only `pos` (to be improved in Phase 176).
 
 ---
 
diff --git a/docs/development/current/main/phase175-multicarrier-design.md b/docs/development/current/main/phase175-multicarrier-design.md
new file mode 100644
index 00000000..999cb2ca
--- /dev/null
+++ b/docs/development/current/main/phase175-multicarrier-design.md
@@ -0,0 +1,259 @@
+# Phase 175: P5 Multiple Carrier Support Design
+
+**Date**: 2025-12-08
+**Purpose**: Extend P5 pipeline to support multiple carriers for complex loops like `_parse_string`
+
+---
+
+## 1. 
Target: _parse_string Carrier Analysis
+
+### 1.1 Loop Structure (lines 150-178)
+
+```hako
+_parse_string(s, pos) {
+    if s.substring(pos, pos+1) != '"' { return null }
+
+    local p = pos + 1         // Carrier 1: position index
+    local str = ""            // Carrier 2: result buffer
+
+    loop(p < s.length()) {
+        local ch = s.substring(p, p+1)  // LoopBodyLocal (promotion candidate)
+
+        if ch == '"' {
+            // End of string
+            local result = new MapBox()
+            result.set("value", me._unescape_string(str))  // Uses str carrier
+            result.set("pos", p + 1)                       // Uses p carrier
+            result.set("type", "string")
+            return result
+        }
+
+        if ch == "\\" {
+            // Escape sequence (Phase 176 scope)
+            local has_next = 0
+            if p + 1 < s.length() { has_next = 1 }
+            if has_next == 0 { return null }
+
+            str = str + ch
+            p = p + 1
+            str = str + s.substring(p, p+1)
+            p = p + 1
+            continue  // ⚠️ Phase 176 scope
+        }
+
+        str = str + ch  // Update carrier 2
+        p = p + 1       // Update carrier 1
+    }
+
+    return null
+}
+```
+
+### 1.2 Carrier Candidates Table
+
+| Variable | Type | Update Pattern | Exit Usage | Carrier Status |
+|---------|------|----------------|------------|----------------|
+| `p` | IntegerBox | `p = p + 1` | Position in `result.set("pos", p + 1)` | ✅ **Required Carrier** |
+| `str` | StringBox | `str = str + ch` | String buffer in `result.set("value", me._unescape_string(str))` | ✅ **Required Carrier** |
+| `ch` | StringBox | `local ch = s.substring(p, p+1)` | Loop body comparison | ❌ **LoopBodyLocal** (promotion target) |
+| `has_next` | IntegerBox | `local has_next = 0` | Escape processing guard | ❌ **Loop body only** (Phase 176) |
+
+### 1.3 Carrier Classification
+
+**Required Carriers (Exit-dependent)**:
+1. **`p`**: Position index - final value used in `result.set("pos", p + 1)`
+2. **`str`**: Result buffer - final value used in `result.set("value", me._unescape_string(str))`
+
+**Promoted Carriers (P5 mechanism)**:
+3. 
**`is_ch_match`**: Bool carrier promoted from `ch` (Trim pattern detection)
+   - Pattern: `ch = s.substring(p, p+1)` → `ch == "\""` equality chain
+   - Promotion: `LoopBodyCarrierPromoter` converts to bool carrier
+
+**Loop-Internal Only (No carrier needed)**:
+- `ch`: LoopBodyLocal, promotion target → becomes `is_ch_match` carrier
+- `has_next`: Escape sequence guard (Phase 176 scope)
+
+---
+
+## 2. Phase 175 Minimal PoC Scope
+
+**Goal**: Prove multi-carrier support with 2 carriers (`p` + `str`), excluding escape handling
+
+### 2.1 Minimal PoC Structure
+
+```hako
+_parse_string_min2() {
+    me.s = "hello world\""
+    me.pos = 0
+    me.len = me.s.length()
+    me.result = ""
+
+    // 2-carrier version: pos + result updated together
+    loop(me.pos < me.len) {
+        local ch = me.s.substring(me.pos, me.pos+1)
+        if ch == "\"" {
+            break
+        } else {
+            me.result = me.result + ch  // Carrier 2 update
+            me.pos = me.pos + 1         // Carrier 1 update
+        }
+    }
+
+    // Exit: both pos and result are used
+    print("Parsed string: ")
+    print(me.result)
+    print(", final pos: ")
+    print(me.pos)
+}
+```
+
+**Carrier Count**: 2 (`pos`, `result`) + 1 promoted (`is_ch_match`) = **3 total**
+
+**Excluded from Phase 175**:
+- ❌ Escape sequence handling (`\\"`, `continue` path)
+- ❌ Complex nested conditionals
+- ✅ Focus: Simple char accumulation + position increment
+
+---
+
+## 3. 
P5 Multi-Carrier Architecture
+
+### 3.1 Existing Boxes Already Support Multi-Carrier ✅
+
+#### 3.1.1 LoopUpdateAnalyzer (Multi-carrier ready ✅)
+
+**File**: `src/mir/join_ir/lowering/loop_update_analyzer.rs`
+**API**: `identify_updated_carriers(body, all_carriers) -> CarrierInfo`
+**Multi-carrier support**: ✅ Loops over `all_carriers.carriers`
+
+**Code**:
+```rust
+pub fn identify_updated_carriers(
+    body: &[ASTNode],
+    all_carriers: &CarrierInfo,
+) -> Result {
+    let mut updated = CarrierInfo::new();
+    for carrier in &all_carriers.carriers {  // ✅ Multi-carrier loop
+        if is_updated_in_body(body, &carrier.name) {
+            updated.add_carrier(carrier.clone());
+        }
+    }
+    Ok(updated)
+}
+```
+
+#### 3.1.2 LoopBodyCarrierPromoter (Adds carriers ✅)
+
+**File**: `src/mir/loop_pattern_detection/loop_body_carrier_promoter.rs`
+**API**: `try_promote(request) -> PromotionResult`
+**Multi-carrier support**: ✅ Generates **additional carriers** from promotion
+
+**Behavior**:
+```rust
+let promoted = LoopBodyCarrierPromoter::try_promote(&request)?;
+carrier_info.merge_from(&promoted.to_carrier_info());  // Add promoted carrier
+// Result: carrier_info.carriers = [pos, result, is_ch_match]
+```
+
+#### 3.1.3 CarrierInfo (Multi-carrier container ✅)
+
+**File**: `src/mir/join_ir/lowering/carrier_info.rs`
+**API**: `carriers: Vec`, `merge_from(&other)`
+**Multi-carrier support**: ✅ `Vec` holds arbitrary number of carriers
+
+**Phase 175-3 Usage**:
+```rust
+let mut carrier_info = CarrierInfo::new();
+carrier_info.add_carrier(CarrierData {  // Carrier 1
+    name: "pos".to_string(),
+    update_expr: UpdateExpr::Simple { ... },
+});
+carrier_info.add_carrier(CarrierData {  // Carrier 2
+    name: "result".to_string(),
+    update_expr: UpdateExpr::Simple { ... 
},
+});
+// carrier_info.carriers.len() == 2 ✅
+```
+
+#### 3.1.4 ExitMeta / ExitBinding (Multi-carrier ready ✅)
+
+**File**: `src/mir/builder/control_flow/joinir/merge/exit_phi_builder.rs`
+**API**: `carrier_exits: Vec<(String, ValueId)>`
+**Multi-carrier support**: ✅ `Vec` holds all carrier exits
+
+**ExitMetaCollector Behavior**:
+```rust
+for carrier in &carrier_info.carriers {  // ✅ Multi-carrier loop
+    exit_bindings.push((carrier.name.clone(), exit_value_id));
+}
+// exit_bindings = [("pos", ValueId(10)), ("result", ValueId(20)), ("is_ch_match", ValueId(30))]
+```
+
+#### 3.1.5 ExitLineReconnector (Multi-carrier ready ✅)
+
+**File**: `src/mir/builder/control_flow/joinir/merge/mod.rs`
+**API**: `reconnect_exit_bindings(exit_bindings, loop_header_phi_info, variable_map)`
+**Multi-carrier support**: ✅ Loops over all `exit_bindings`
+
+**Behavior**:
+```rust
+for (carrier_name, _) in exit_bindings {  // ✅ Multi-carrier loop
+    if let Some(phi_dst) = loop_header_phi_info.get_carrier_phi_dst(carrier_name) {
+        variable_map.insert(carrier_name.clone(), phi_dst);
+    }
+}
+// variable_map: {"pos" -> ValueId(100), "result" -> ValueId(200), "is_ch_match" -> ValueId(300)}
+```
+
+---
+
+## 4. Conclusion: Architecture Supports Multi-Carrier ✅ (Pattern2 limitation found)
+
+### 4.1 Phase 175-3 Test Results
+
+**Test**: `local_tests/test_jsonparser_parse_string_min2.hako`
+
+**MIR Analysis**:
+```mir
+bb3:
+    %9 = copy %8  // result = "" (initialization)
+
+bb5:  // Loop header
+    %25 = phi [%4, bb3], [%21, bb10]  // Only pos carrier!
+    // ❌ Missing: result carrier PHI
+
+bb10:  // Loop update
+    %21 = %25 Add %20  // pos = pos + 1
+    // ❌ Missing: result = result + ch update
+
+bb12:  // Exit block
+    %29 = copy %9  // Still uses original %9 (empty string)!
+```
+
+**Root Cause**: Pattern2's Trim optimization only emits `pos` carrier, ignoring `result` updates in loop body.
+
+**Architecture Validation** ✅:
+- ✅ `CarrierInfo` detected 3 carriers (`pos`, `result`, `is_ch_match`)
+- ✅ `variable_map` contains all carriers at pattern2_start
+- ✅ Existing boxes (ExitMeta, ExitLineReconnector) support multi-carrier
+- ❌ Pattern2 lowerer only emits loop update for `pos`, not `result`
+
+**Conclusion**:
+- **Architecture is sound** - all boxes support multi-carrier
+- **Pattern2 implementation gap** - Trim optimization doesn't emit body updates for non-position carriers
+- **Phase 176 scope** - Extend Pattern2 to emit all carrier updates, not just position
+
+### 4.2 Next Steps
+
+- **Phase 175-3**: Run PoC test (`test_jsonparser_parse_string_min2.hako`)
+- **Phase 176**: Add escape sequence handling (`continue` path, Phase 176 scope)
+- **Phase 177**: Full `_parse_string` with all edge cases
+
+---
+
+## 5. References
+
+- **Phase 170**: LoopUpdateSummary design
+- **Phase 171**: LoopBodyCarrierPromoter implementation
+- **Phase 174**: P5 minimal PoC (quote detection only)
+- **Pattern Space**: [docs/development/current/main/loop_pattern_space.md](loop_pattern_space.md)
diff --git a/src/mir/builder/control_flow/joinir/routing.rs b/src/mir/builder/control_flow/joinir/routing.rs
index 429858ca..a5583e44 100644
--- a/src/mir/builder/control_flow/joinir/routing.rs
+++ b/src/mir/builder/control_flow/joinir/routing.rs
@@ -94,6 +94,9 @@ impl MirBuilder {
             // Phase 174: JsonParser complex loop P5B extension test
             "JsonParserStringTest.parse_string_min/0" => true,
             "JsonParserStringTest.main/0" => true,
+            // Phase 175: P5 multi-carrier support (2 carriers: pos + result)
+            "JsonParserStringTest2.parse_string_min2/0" => true,
+            "JsonParserStringTest2.main/0" => true,
             _ => false,
         };
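
Review note: the Pattern2 gap described in section 4.1 of the design doc (only `pos` receives a loop-header PHI, so `result` keeps its initial value at the exit block) can be reproduced outside the compiler with a small model. The sketch below is illustrative only and is not code from this repository: `Carriers`, `run_loop`, and the `emitted` parameter are hypothetical names, and the loop simply mimics `_parse_string_min2` while applying only those carrier updates that a lowerer is assumed to emit.

```rust
/// Hypothetical model of the loop-carried state in `_parse_string_min2`.
/// The field names mirror the carriers named in the design doc.
#[derive(Debug, Clone, PartialEq)]
struct Carriers {
    pos: usize,
    result: String,
}

/// Runs the minimal PoC loop, applying only the carrier updates listed in
/// `emitted`. This stands in for a lowerer that emits a header PHI and a
/// back-edge update per listed carrier (an assumption of this sketch).
fn run_loop(input: &str, emitted: &[&str]) -> Carriers {
    // Without a `pos` update the loop could never advance, so require it.
    assert!(emitted.contains(&"pos"), "`pos` must be emitted for the loop to terminate");

    let chars: Vec<char> = input.chars().collect();
    let mut state = Carriers { pos: 0, result: String::new() };

    while state.pos < chars.len() {
        let ch = chars[state.pos];
        // Promoted bool carrier: the body-local `ch` is only ever compared.
        let is_ch_match = ch == '"';
        if is_ch_match {
            break;
        }

        // Values that would flow along the back edge into each header PHI.
        let next_pos = state.pos + 1;
        let next_result = format!("{}{}", state.result, ch);

        if emitted.contains(&"pos") {
            state.pos = next_pos;
        }
        if emitted.contains(&"result") {
            state.result = next_result;
        }
    }
    state
}
```

Calling `run_loop("hello world\"", &["pos"])` models the behavior seen in the MIR dump (the position advances while `result` stays empty), whereas passing `&["pos", "result"]` models the behavior that the Phase 176 extension of Pattern2 is meant to produce.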