Phase 185: Body-local Pattern2/4 integration skeleton
- Added collect_body_local_variables() helper
- Integrated UpdateEnv usage in loop_with_break_minimal
- Test files created (blocked by init lowering)

Phase 186: Body-local init lowering infrastructure
- Created LoopBodyLocalInitLowerer box (378 lines)
- Supports BinOp (+, -, *, /) + Const + Variable
- Fail-Fast for method calls/string operations
- 3 unit tests passing

Phase 187: String UpdateLowering design (doc-only)
- Defined UpdateKind whitelist (6 categories)
- StringAppendChar/Literal patterns identified
- 3-layer architecture documented
- No code changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

# Phase 187: String UpdateLowering Design (Doc-Only)

**Date**: 2025-12-09
**Status**: Design Phase (No Code Changes)
**Prerequisite**: Phase 178 Fail-Fast must remain intact

---

## Executive Summary

Phase 187 defines **what kinds of string updates are safe to handle in JoinIR**, using an UpdateKind-based whitelist approach. This is a design-only phase: no code will be changed.

**Core Principle**: Maintain Phase 178's Fail-Fast behavior while establishing a clear path forward for string operations.

---

## 1. UpdateKind Candidates

We classify update patterns into categories based on their complexity and safety:

### 1.1 Safe Patterns (Whitelist Candidates)

#### CounterLike
**Pattern**: `pos = pos + 1`, `i = i - 1`
**String Relevance**: Position tracking in string scanning loops
**Safety**: ✅ Simple arithmetic, deterministic
**Decision**: **ALLOW** (already supported in Phase 178)

#### AccumulationLike (Numeric)
**Pattern**: `sum = sum + i`, `total = total * factor`
**String Relevance**: None (numeric only)
**Safety**: ✅ Arithmetic operations, well-understood
**Decision**: **ALLOW** (already supported in Phase 178)

#### StringAppendChar
**Pattern**: `result = result + ch` (where `ch` is a single character variable)
**Example**: JsonParser `_parse_number`: `num_str = num_str + digit_ch`
**Safety**: ⚠️ Requires:
- RHS must be `UpdateRhs::Variable(name)`
- Variable scope: LoopBodyLocal or OuterLocal
- Single character (enforced at runtime by StringBox semantics)
**Decision**: **ALLOW** (with validation)

**Rationale**: This pattern is structurally identical to numeric accumulation:
```
sum = sum + i           // Numeric accumulation
result = result + ch    // String accumulation (char-by-char)
```

#### StringAppendLiteral
**Pattern**: `s = s + "..."` (where `"..."` is a string literal)
**Example**: `debug_output = debug_output + "[INFO] "`
**Safety**: ⚠️ Requires:
- RHS must be `UpdateRhs::StringLiteral(s)`
- Literal must be compile-time constant
**Decision**: **ALLOW** (with validation)

**Rationale**: Simpler than StringAppendChar: no variable resolution is needed.

### 1.2 Unsafe Patterns (Fail-Fast)

#### Complex (Method Calls)
**Pattern**: `result = result + s.substring(pos, end)`
**Example**: JsonParser `_unescape_string`
**Safety**: ❌ Requires:
- Method call evaluation
- Multiple arguments
- Potentially non-deterministic results
**Decision**: **REJECT** with `[joinir/freeze]`

**Error Message**:
```
[pattern2/can_lower] Complex string update detected (method call in RHS).
JoinIR does not support this pattern yet. Use simpler string operations.
```

#### Complex (Nested BinOp)
**Pattern**: `x = x + (a + b)`, `result = result + s1 + s2`
**Safety**: ❌ Nested expression evaluation required
**Decision**: **REJECT** with `[joinir/freeze]`
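
The categories above can be read as a small decision table over the carrier type and the shape of the RHS. The following is a minimal sketch under stated assumptions, not the project's actual definitions: `CarrierType`, the layout of the `UpdateKind` enum, the `i64` payload of `Const`, and the helpers `classify` and `is_whitelisted` are hypothetical names introduced for illustration. Only the `UpdateRhs` variant names are taken from this document.

```rust
// Hypothetical, simplified stand-ins for the analyzer types named in this
// document. The real definitions may carry more data.
#[derive(Debug, Clone, PartialEq)]
pub enum UpdateRhs {
    Const(i64), // payload type is an assumption
    Variable(String),
    StringLiteral(String),
    Other, // method calls, nested BinOp, anything unrecognized
}

// Assumed carrier typing; the document only says "String vs Integer".
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CarrierType {
    Integer,
    String,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum UpdateKind {
    CounterLike,
    AccumulationLike,
    StringAppendChar,
    StringAppendLiteral,
    Complex,
}

/// Classify `carrier = carrier <op> rhs` by carrier type and RHS shape.
pub fn classify(carrier: CarrierType, rhs: &UpdateRhs) -> UpdateKind {
    match (carrier, rhs) {
        (CarrierType::Integer, UpdateRhs::Const(_)) => UpdateKind::CounterLike,
        (CarrierType::Integer, UpdateRhs::Variable(_)) => UpdateKind::AccumulationLike,
        (CarrierType::String, UpdateRhs::Variable(_)) => UpdateKind::StringAppendChar,
        (CarrierType::String, UpdateRhs::StringLiteral(_)) => UpdateKind::StringAppendLiteral,
        // Everything else, including type mismatches, stays on the Fail-Fast path.
        _ => UpdateKind::Complex,
    }
}

/// Only `Complex` is outside the whitelist in this sketch.
pub fn is_whitelisted(kind: UpdateKind) -> bool {
    kind != UpdateKind::Complex
}
```

Under these assumptions, `classify(CarrierType::String, &UpdateRhs::Variable("ch".to_string()))` yields `StringAppendChar`, while any `UpdateRhs::Other` yields `Complex` regardless of carrier type.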

---

## 2. Fail-Fast Policy (Phase 178 Preservation)

**Non-Negotiable**: Phase 178's Fail-Fast behavior must remain intact.

### 2.1 Current Fail-Fast Logic (Untouched)

**File**: `src/mir/builder/control_flow/joinir/patterns/pattern2_with_break.rs`
**File**: `src/mir/builder/control_flow/joinir/patterns/pattern4_with_continue.rs`

```rust
// Phase 178: Reject string/complex updates
fn can_lower(...) -> bool {
    for update in carrier_updates.values() {
        match update {
            UpdateExpr::BinOp { rhs, .. } => {
                if matches!(rhs, UpdateRhs::StringLiteral(_) | UpdateRhs::Other) {
                    // Phase 178: Fail-Fast for string updates
                    return false;  // ← This stays unchanged in Phase 187
                }
            }
            _ => {}
        }
    }
    true
}
```

**Phase 187 Changes**: NONE (this code is not touched in Phase 187).
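
To make the preserved behavior concrete, the check can be exercised against stub types. This is a sketch only: the `UpdateExpr` and `UpdateRhs` definitions below are hypothetical reductions of the real ones (field names and payloads are assumptions), the `BTreeMap` container is assumed because the excerpt only shows `.values()`, and `updates` is a helper invented for this example.

```rust
use std::collections::BTreeMap;

// Hypothetical minimal stand-ins for the real analyzer types.
#[derive(Debug, Clone, PartialEq)]
pub enum UpdateRhs {
    Const(i64),
    Variable(String),
    StringLiteral(String),
    Other,
}

#[derive(Debug, Clone, PartialEq)]
pub enum UpdateExpr {
    Const(i64),
    BinOp { lhs: String, rhs: UpdateRhs },
}

/// Same decision as the Phase 178 excerpt: a single string-literal or
/// unclassified RHS rejects the whole loop.
pub fn can_lower(carrier_updates: &BTreeMap<String, UpdateExpr>) -> bool {
    carrier_updates.values().all(|update| match update {
        UpdateExpr::BinOp { rhs, .. } => {
            !matches!(rhs, UpdateRhs::StringLiteral(_) | UpdateRhs::Other)
        }
        _ => true,
    })
}

/// Illustration-only helper: build `name = name <op> rhs` updates.
pub fn updates(entries: &[(&str, UpdateRhs)]) -> BTreeMap<String, UpdateExpr> {
    entries
        .iter()
        .map(|(name, rhs)| {
            let expr = UpdateExpr::BinOp { lhs: name.to_string(), rhs: rhs.clone() };
            (name.to_string(), expr)
        })
        .collect()
}
```

With these stubs, a loop whose only update is `pos + 1` is accepted, and adding one `StringLiteral` or `Other` update to the same loop flips the result to `false`. Note that, as written in the excerpt, a `Variable` RHS is not rejected by this particular check.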

### 2.2 Future Whitelist Expansion (Phase 188+)

In **Phase 188** (implementation phase), we will:
1. Extend `can_lower()` to accept `StringAppendChar` and `StringAppendLiteral`
2. Add validation to ensure safety constraints (variable scope, literal type)
3. Extend `CarrierUpdateLowerer` to emit JoinIR for string append operations

**Phase 187 does NOT implement this**; we only define what "safe" means.
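
The validation in step 2 could look roughly like the sketch below. Everything here is an assumption made for illustration: the `VarScope` enum (only `LoopBodyLocal` and `OuterLocal` are named in this document; `Unresolved` is a placeholder for any other scope), the `scope_of` lookup callback, the function name `validate_string_append`, and the exact error wording. Only the `[joinir/freeze]` tag and the `UpdateRhs` variant names come from the design.

```rust
// Hypothetical scope classification for an RHS variable.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum VarScope {
    LoopBodyLocal,
    OuterLocal,
    Unresolved, // placeholder for every scope the whitelist does not allow
}

// Hypothetical reduction of the real RHS type.
#[derive(Debug, Clone, PartialEq)]
pub enum UpdateRhs {
    Const(i64),
    Variable(String),
    StringLiteral(String),
    Other,
}

/// Validate one update of a string carrier. `scope_of` is an assumed lookup
/// from variable name to scope; the real analysis may expose this differently.
pub fn validate_string_append(
    carrier: &str,
    rhs: &UpdateRhs,
    scope_of: impl Fn(&str) -> VarScope,
) -> Result<(), String> {
    match rhs {
        // StringAppendLiteral: a literal is a compile-time constant by construction.
        UpdateRhs::StringLiteral(_) => Ok(()),
        // StringAppendChar: the variable must be LoopBodyLocal or OuterLocal.
        UpdateRhs::Variable(name) => match scope_of(name.as_str()) {
            VarScope::LoopBodyLocal | VarScope::OuterLocal => Ok(()),
            VarScope::Unresolved => Err(format!(
                "[joinir/freeze] carrier '{}': RHS variable '{}' has an unsupported scope",
                carrier, name
            )),
        },
        UpdateRhs::Const(_) => Err(format!(
            "[joinir/freeze] carrier '{}': numeric constant on a string carrier",
            carrier
        )),
        UpdateRhs::Other => Err(format!(
            "[joinir/freeze] carrier '{}': complex RHS (method call or nested BinOp)",
            carrier
        )),
    }
}
```

Returning `Result<(), String>` rather than `bool` is a design choice of this sketch: it lets the caller surface a specific rejection reason, which matches the stated goal of more specific error messages in Phase 188+.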

---

## 3. Lowerer Responsibility Separation

### 3.1 Detection Layer (Pattern2/4)

**Responsibility**: UpdateKind classification only
**Location**: `pattern2_with_break.rs`, `pattern4_with_continue.rs`

```rust
// Phase 187 Design: What Pattern2/4 WILL check (future)
fn can_lower_string_update(update: &UpdateExpr) -> bool {
    match update {
        UpdateExpr::BinOp { rhs, .. } => {
            match rhs {
                UpdateRhs::Variable(_) => true,        // StringAppendChar
                UpdateRhs::StringLiteral(_) => true,   // StringAppendLiteral
                UpdateRhs::Other => false,             // Complex (reject)
                UpdateRhs::Const(_) => true,           // Numeric (already allowed)
            }
        }
        _ => true,
    }
}
```

**Key Point**: Pattern2/4 only perform classification; they do NOT emit JoinIR for strings.

### 3.2 Emission Layer (CarrierUpdateLowerer + Expr Lowerer)

**Responsibility**: Actual JoinIR instruction emission
**Location**: `src/mir/join_ir/lowering/carrier_update_lowerer.rs`

**Current State (Phase 184)**:
- Handles numeric carriers only (`CounterLike`, `AccumulationLike`)
- Emits `Compute { op: Add/Sub/Mul, ... }` for numeric BinOp

**Future State (Phase 188+ Implementation)**:
- Extend to handle `StringAppendChar`:
  ```rust
  // Emit StringBox.concat() call or equivalent
  let concat_result = emit_string_concat(lhs_value, ch_value);
  ```
- Extend to handle `StringAppendLiteral`:
  ```rust
  // Emit string literal + concat
  let literal_value = emit_string_literal("...");
  let concat_result = emit_string_concat(lhs_value, literal_value);
  ```

**Phase 187 Design**: Document this separation, but do NOT implement.
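
As a rough picture of how the emission layer might dispatch once string carriers are supported, consider the sketch below. It is not the real `CarrierUpdateLowerer`: `ValueId`, `JoinInst`, `CarrierUpdate`, and `Emitter` are hypothetical types, and the `StringConst` and `StringConcat` instructions are assumed stand-ins for "StringBox.concat() call or equivalent". Only the `Compute` instruction with `Add/Sub/Mul` and the helper names `emit_string_literal` / `emit_string_concat` are taken from this document.

```rust
// Hypothetical, simplified types for illustration only.
pub type ValueId = u32;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BinOpKind {
    Add,
    Sub,
    Mul,
}

#[derive(Debug, Clone, PartialEq)]
pub enum JoinInst {
    Compute { dst: ValueId, op: BinOpKind, lhs: ValueId, rhs: ValueId },
    StringConst { dst: ValueId, value: String },               // assumed instruction
    StringConcat { dst: ValueId, lhs: ValueId, rhs: ValueId }, // assumed instruction
}

/// An update that has already passed the can-lower check.
#[derive(Debug, Clone, PartialEq)]
pub enum CarrierUpdate {
    Numeric { op: BinOpKind, rhs: ValueId },
    StringAppendChar { ch: ValueId },
    StringAppendLiteral { literal: String },
}

pub struct Emitter {
    next_id: ValueId,
    pub insts: Vec<JoinInst>,
}

impl Emitter {
    pub fn new(first_free_id: ValueId) -> Self {
        Emitter { next_id: first_free_id, insts: Vec::new() }
    }

    fn fresh(&mut self) -> ValueId {
        let id = self.next_id;
        self.next_id += 1;
        id
    }

    fn emit_string_literal(&mut self, value: &str) -> ValueId {
        let dst = self.fresh();
        self.insts.push(JoinInst::StringConst { dst, value: value.to_string() });
        dst
    }

    fn emit_string_concat(&mut self, lhs: ValueId, rhs: ValueId) -> ValueId {
        let dst = self.fresh();
        self.insts.push(JoinInst::StringConcat { dst, lhs, rhs });
        dst
    }

    /// Emit the instructions for one update and return the new carrier value.
    /// No can-lower decision is made here; that stays in the detection layer.
    pub fn lower_update(&mut self, carrier: ValueId, update: &CarrierUpdate) -> ValueId {
        match update {
            CarrierUpdate::Numeric { op, rhs } => {
                let dst = self.fresh();
                self.insts.push(JoinInst::Compute { dst, op: *op, lhs: carrier, rhs: *rhs });
                dst
            }
            CarrierUpdate::StringAppendChar { ch } => self.emit_string_concat(carrier, *ch),
            CarrierUpdate::StringAppendLiteral { literal } => {
                let lit = self.emit_string_literal(literal);
                self.emit_string_concat(carrier, lit)
            }
        }
    }
}
```

In this sketch a literal append produces two instructions (the constant, then the concat), while a char append produces one, because the character value already exists in the loop body.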

---

## 4. Architecture Diagram

```
AST → LoopUpdateAnalyzer → UpdateKind classification
                                      ↓
                          Pattern2/4.can_lower()
                          (Whitelist check only)
                                      ↓
                    [ALLOW] → CarrierUpdateLowerer
                              (Emit JoinIR instructions)
                                      ↓
                               JoinIR Module

                    [REJECT] → [joinir/freeze] error
```

**Separation of Concerns**:
1. **LoopUpdateAnalyzer**: Extracts `UpdateExpr` from AST (already exists)
2. **Pattern2/4**: Classifies into Allow/Reject (Phase 178 logic + Phase 188 extension)
3. **CarrierUpdateLowerer**: Emits JoinIR (Phase 184 for numeric, Phase 188+ for string)

---

## 5. Representative Cases (Not Implemented)

### 5.1 JsonParser Update Patterns

#### _parse_number: `num_str = num_str + ch`
**UpdateKind**: `StringAppendChar`
**Classification**:
- `num_str`: carrier name
- `ch`: LoopBodyLocal variable (single character from string scan)
- RHS: `UpdateRhs::Variable("ch")`
**Decision**: **ALLOW** (Phase 188+)

#### _atoi: `num = num * 10 + digit`
**UpdateKind**: `AccumulationLike` (numeric) in intent, but currently classified as `Complex`
**Classification**:
- Nested BinOp: `(num * 10) + digit`
- Currently detected as `UpdateRhs::Other`
**Decision**: **REJECT** for now (Complex; requires BinOp tree analysis, Phase 189+)

#### _unescape_string: `result = result + s.substring(...)`
**UpdateKind**: `Complex` (method call)
**Classification**:
- RHS: `UpdateRhs::Other` (MethodCall)
**Decision**: **REJECT** with Fail-Fast

### 5.2 UpdateKind Mapping Table

| Loop Variable | Update Pattern | UpdateRhs | UpdateKind | Phase 187 Decision |
|---------------|----------------|-----------|------------|-------------------|
| `num_str` | `num_str + ch` | `Variable("ch")` | StringAppendChar | ALLOW (Phase 188+) |
| `result` | `result + "\n"` | `StringLiteral("\n")` | StringAppendLiteral | ALLOW (Phase 188+) |
| `num` | `num * 10 + digit` | `Other` (nested BinOp) | Complex | REJECT (Phase 189+) |
| `result` | `result + s.substring(...)` | `Other` (MethodCall) | Complex | REJECT (Fail-Fast) |
| `pos` | `pos + 1` | `Const(1)` | CounterLike | ALLOW (Phase 178 ✅) |
| `sum` | `sum + i` | `Variable("i")` | AccumulationLike | ALLOW (Phase 178 ✅) |

---

## 6. Next Steps (Phase 188+ Implementation)

### Phase 188: StringAppendChar/Literal Implementation

**Scope**: Extend Pattern2/4 and CarrierUpdateLowerer to support string append.

**Tasks**:
1. **Extend `can_lower()` whitelist** (Pattern2/4)
   - Accept `UpdateRhs::Variable(_)` for string carriers
   - Accept `UpdateRhs::StringLiteral(_)` for string carriers
   - Keep `UpdateRhs::Other` as Fail-Fast

2. **Extend CarrierUpdateLowerer** (emission layer)
   - Detect carrier type (String vs Integer)
   - Emit `StringBox.concat()` call for string append
   - Emit `Compute { Add }` for numeric (existing logic)

3. **Add validation**
   - Check variable scope (LoopBodyLocal or OuterLocal only)
   - Check literal type (string only)

4. **E2E Test**
   - `_parse_number` minimal version with `num_str = num_str + ch`

**Estimate**: 3-4 hours

### Phase 189+: Complex BinOp (Future)

**Scope**: Handle nested BinOp like `num * 10 + digit`.

**Tasks**:
1. Extend `analyze_rhs()` to recursively parse BinOp trees
2. Classify simple nested patterns (e.g., `(x * 10) + y`) as safe
3. Keep truly complex patterns (e.g., method calls in BinOp) as Fail-Fast

**Estimate**: 5-6 hours
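
The recursive analysis in the Phase 189+ tasks might take a shape like the following. This is a sketch under assumptions, not a proposal for the actual `analyze_rhs()` signature: the `Expr` tree, the `char`-encoded operator, the `max_depth` bound, and the helpers `bin`, `var`, and `is_simple_arith` are all hypothetical.

```rust
// Hypothetical expression tree; the real AST and UpdateRhs types will differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Const(i64),
    Var(String),
    BinOp { op: char, lhs: Box<Expr>, rhs: Box<Expr> },
    MethodCall { recv: Box<Expr>, name: String, args: Vec<Expr> },
}

/// Returns true when `expr` consists only of constants, variables, and
/// `+`, `-`, `*` BinOps nested at most `max_depth` levels deep.
/// Method calls anywhere in the tree keep the update on the Fail-Fast path.
pub fn is_simple_arith(expr: &Expr, max_depth: usize) -> bool {
    match expr {
        Expr::Const(_) | Expr::Var(_) => true,
        Expr::BinOp { op, lhs, rhs } => {
            max_depth > 0
                && matches!(*op, '+' | '-' | '*')
                && is_simple_arith(lhs, max_depth - 1)
                && is_simple_arith(rhs, max_depth - 1)
        }
        Expr::MethodCall { .. } => false,
    }
}

/// Illustration-only constructors.
pub fn bin(op: char, lhs: Expr, rhs: Expr) -> Expr {
    Expr::BinOp { op, lhs: Box::new(lhs), rhs: Box::new(rhs) }
}

pub fn var(name: &str) -> Expr {
    Expr::Var(name.to_string())
}
```

With a depth bound of 2, the `_atoi` RHS `(num * 10) + digit` is accepted, while `result + s.substring(pos, end)` is rejected at any depth because of the method call. The explicit depth bound is a choice of this sketch to keep "simple nested" patterns distinguishable from arbitrarily deep trees.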

---

## 7. Design Constraints

### 7.1 Box Theory Compliance

**Separation of Concerns**:
- UpdateKind classification → LoopUpdateAnalyzer (existing box)
- Can-lower decision → Pattern2/4 (control flow box)
- JoinIR emission → CarrierUpdateLowerer (lowering box)

**No Cross-Boundary Leakage**:
- Pattern2/4 do NOT emit JoinIR directly for string operations
- CarrierUpdateLowerer does NOT make can-lower decisions

### 7.2 Fail-Fast Preservation

**Phase 178 Logic Untouched**:
- All `UpdateRhs::StringLiteral` and `UpdateRhs::Other` continue to trigger Fail-Fast
- Phase 187 only documents what "safe" means; implementation is Phase 188+

**Error Messages**:
- Current: `"String/complex update detected, rejecting Pattern 2 (unsupported)"`
- Future (Phase 188+): More specific messages for different rejection reasons

### 7.3 Testability

**Unit Test Separation**:
- LoopUpdateAnalyzer tests: AST → UpdateExpr extraction
- Pattern2/4 tests: UpdateExpr → can_lower decision
- CarrierUpdateLowerer tests: UpdateExpr → JoinIR emission

**E2E Test**:
- JsonParser representative loops (Phase 188+)

---

## 8. Documentation Updates

### 8.1 joinir-architecture-overview.md

Add one sentence in Section 2.2 (条件式ライン, "Condition Expression Line"):

```markdown
- **LoopUpdateAnalyzer / CarrierUpdateLowerer**
  - Files:
    - `src/mir/join_ir/lowering/loop_update_analyzer.rs`
    - `src/mir/join_ir/lowering/carrier_update_lowerer.rs`
  - Responsibilities:
    - Detect the variables updated in a loop (carriers) and hold their UpdateExpr.
    - In Pattern 4, keep only the carriers that are actually updated.
    - **Phase 187 design**: String updates are to be handled through an UpdateKind-based whitelist (StringAppendChar/Literal are planned for implementation in Phase 188+).
```

### 8.2 CURRENT_TASK.md

Add Phase 187 entry:

```markdown
- [x] **Phase 187: String UpdateLowering Design** ✅ (2025-12-09)
  - UpdateKind-based whitelist design (doc-only)
  - Defined StringAppendChar/StringAppendLiteral as safe patterns
  - Complex (method call / nested BinOp) remains Fail-Fast
  - Phase 178 Fail-Fast fully preserved
  - Established the implementation plan for Phase 188+
```

---

## 9. Success Criteria (Phase 187)

- [x] Design document created (`phase187-string-update-design.md`)
- [x] UpdateKind whitelist defined (6 categories)
- [x] Fail-Fast preservation confirmed (Phase 178 untouched)
- [x] Lowerer responsibility separation documented
- [x] Representative cases analyzed (JsonParser loops)
- [x] Architecture diagram created
- [x] Next steps defined (Phase 188+ implementation)
- [x] `joinir-architecture-overview.md` updated (1-sentence addition)
- [x] `CURRENT_TASK.md` updated (Phase 187 entry added)

**All criteria met**: Phase 187 complete (design-only).

---

## 10. Conclusion

Phase 187 establishes a clear design for string update handling in JoinIR:

1. **Safe Patterns**: CounterLike, AccumulationLike, StringAppendChar, StringAppendLiteral
2. **Unsafe Patterns**: Complex (method calls, nested BinOp) → Fail-Fast
3. **Separation of Concerns**: Detection (Pattern2/4) vs Emission (CarrierUpdateLowerer)
4. **Phase 178 Preservation**: All Fail-Fast logic remains unchanged

**No code changes in Phase 187**: all design decisions are documented for Phase 188+ implementation.

**Next Phase**: Phase 188 - Implement StringAppendChar/Literal lowering (3-4 hours estimate).