371 lines
12 KiB
Markdown
371 lines
12 KiB
Markdown
|
|
# Phase 187: String UpdateLowering Design (Doc-Only)
|
|||
|
|
|
|||
|
|
**Date**: 2025-12-09
|
|||
|
|
**Status**: Design Phase (No Code Changes)
|
|||
|
|
**Prerequisite**: Phase 178 Fail-Fast must remain intact
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Executive Summary
|
|||
|
|
|
|||
|
|
Phase 187 defines **what kinds of string updates are safe to handle in JoinIR**, using an UpdateKind-based whitelist approach. This is a design-only phase—no code will be changed.
|
|||
|
|
|
|||
|
|
**Core Principle**: Maintain Phase 178's Fail-Fast behavior while establishing a clear path forward for string operations.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 1. UpdateKind Candidates
|
|||
|
|
|
|||
|
|
We classify update patterns into categories based on their complexity and safety:
|
|||
|
|
|
|||
|
|
### 1.1 Safe Patterns (Whitelist Candidates)
|
|||
|
|
|
|||
|
|
#### CounterLike
|
|||
|
|
**Pattern**: `pos = pos + 1`, `i = i - 1`
|
|||
|
|
**String Relevance**: Position tracking in string scanning loops
|
|||
|
|
**Safety**: ✅ Simple arithmetic, deterministic
|
|||
|
|
**Decision**: **ALLOW** (already supported in Phase 178)
|
|||
|
|
|
|||
|
|
#### AccumulationLike (Numeric)
|
|||
|
|
**Pattern**: `sum = sum + i`, `total = total * factor`
|
|||
|
|
**String Relevance**: None (numeric only)
|
|||
|
|
**Safety**: ✅ Arithmetic operations, well-understood
|
|||
|
|
**Decision**: **ALLOW** (already supported in Phase 178)
|
|||
|
|
|
|||
|
|
#### StringAppendChar
|
|||
|
|
**Pattern**: `result = result + ch` (where `ch` is a single character variable)
|
|||
|
|
**Example**: JsonParser `_parse_number`: `num_str = num_str + digit_ch`
|
|||
|
|
**Safety**: ⚠️ Requires:
|
|||
|
|
- RHS must be `UpdateRhs::Variable(name)`
|
|||
|
|
- Variable scope: LoopBodyLocal or OuterLocal
|
|||
|
|
- Single character (enforced at runtime by StringBox semantics)
|
|||
|
|
**Decision**: **ALLOW** (with validation)
|
|||
|
|
|
|||
|
|
**Rationale**: This pattern is structurally identical to numeric accumulation:
|
|||
|
|
```
|
|||
|
|
sum = sum + i // Numeric accumulation
|
|||
|
|
result = result + ch // String accumulation (char-by-char)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
#### StringAppendLiteral
|
|||
|
|
**Pattern**: `s = s + "..."` (where `"..."` is a string literal)
|
|||
|
|
**Example**: `debug_output = debug_output + "[INFO] "`
|
|||
|
|
**Safety**: ⚠️ Requires:
|
|||
|
|
- RHS must be `UpdateRhs::StringLiteral(s)`
|
|||
|
|
- Literal must be compile-time constant
|
|||
|
|
**Decision**: **ALLOW** (with validation)
|
|||
|
|
|
|||
|
|
**Rationale**: Simpler than StringAppendChar—no variable resolution needed.
|
|||
|
|
|
|||
|
|
### 1.2 Unsafe Patterns (Fail-Fast)
|
|||
|
|
|
|||
|
|
#### Complex (Method Calls)
|
|||
|
|
**Pattern**: `result = result + s.substring(pos, end)`
|
|||
|
|
**Example**: JsonParser `_unescape_string`
|
|||
|
|
**Safety**: ❌ Requires:
|
|||
|
|
- Method call evaluation
|
|||
|
|
- Multiple arguments
|
|||
|
|
- Potentially non-deterministic results
|
|||
|
|
**Decision**: **REJECT** with `[joinir/freeze]`
|
|||
|
|
|
|||
|
|
**Error Message**:
|
|||
|
|
```
|
|||
|
|
[pattern2/can_lower] Complex string update detected (method call in RHS).
|
|||
|
|
JoinIR does not support this pattern yet. Use simpler string operations.
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
#### Complex (Nested BinOp)
|
|||
|
|
**Pattern**: `x = x + (a + b)`, `result = result + s1 + s2`
|
|||
|
|
**Safety**: ❌ Nested expression evaluation required
|
|||
|
|
**Decision**: **REJECT** with `[joinir/freeze]`
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 2. Fail-Fast Policy (Phase 178 Preservation)
|
|||
|
|
|
|||
|
|
**Non-Negotiable**: Phase 178's Fail-Fast behavior must remain intact.
|
|||
|
|
|
|||
|
|
### 2.1 Current Fail-Fast Logic (Untouched)
|
|||
|
|
|
|||
|
|
**File**: `src/mir/builder/control_flow/joinir/patterns/pattern2_with_break.rs`
|
|||
|
|
**File**: `src/mir/builder/control_flow/joinir/patterns/pattern4_with_continue.rs`
|
|||
|
|
|
|||
|
|
```rust
|
|||
|
|
// Phase 178: Reject string/complex updates
|
|||
|
|
fn can_lower(...) -> bool {
|
|||
|
|
for update in carrier_updates.values() {
|
|||
|
|
match update {
|
|||
|
|
UpdateExpr::BinOp { rhs, .. } => {
|
|||
|
|
if matches!(rhs, UpdateRhs::StringLiteral(_) | UpdateRhs::Other) {
|
|||
|
|
// Phase 178: Fail-Fast for string updates
|
|||
|
|
return false; // ← This stays unchanged in Phase 187
|
|||
|
|
}
|
|||
|
|
}
|
|||
|
|
_ => {}
|
|||
|
|
}
|
|||
|
|
}
|
|||
|
|
true
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**Phase 187 Changes**: NONE (this code is not touched in Phase 187).
|
|||
|
|
|
|||
|
|
### 2.2 Future Whitelist Expansion (Phase 188+)
|
|||
|
|
|
|||
|
|
In **Phase 188** (implementation phase), we will:
|
|||
|
|
1. Extend `can_lower()` to accept `StringAppendChar` and `StringAppendLiteral`
|
|||
|
|
2. Add validation to ensure safety constraints (variable scope, literal type)
|
|||
|
|
3. Extend `CarrierUpdateLowerer` to emit JoinIR for string append operations
|
|||
|
|
|
|||
|
|
**Phase 187 does NOT implement this**—we only design what "safe" means.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 3. Lowerer Responsibility Separation
|
|||
|
|
|
|||
|
|
### 3.1 Detection Layer (Pattern2/4)
|
|||
|
|
|
|||
|
|
**Responsibility**: UpdateKind classification only
|
|||
|
|
**Location**: `pattern2_with_break.rs`, `pattern4_with_continue.rs`
|
|||
|
|
|
|||
|
|
```rust
|
|||
|
|
// Phase 187 Design: What Pattern2/4 WILL check (future)
|
|||
|
|
fn can_lower_string_update(update: &UpdateExpr) -> bool {
|
|||
|
|
match update {
|
|||
|
|
UpdateExpr::BinOp { rhs, .. } => {
|
|||
|
|
match rhs {
|
|||
|
|
UpdateRhs::Variable(_) => true, // StringAppendChar
|
|||
|
|
UpdateRhs::StringLiteral(_) => true, // StringAppendLiteral
|
|||
|
|
UpdateRhs::Other => false, // Complex (reject)
|
|||
|
|
UpdateRhs::Const(_) => true, // Numeric (already allowed)
|
|||
|
|
}
|
|||
|
|
}
|
|||
|
|
_ => true,
|
|||
|
|
}
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**Key Point**: Pattern2/4 only perform classification—they do NOT emit JoinIR for strings.
|
|||
|
|
|
|||
|
|
### 3.2 Emission Layer (CarrierUpdateLowerer + Expr Lowerer)
|
|||
|
|
|
|||
|
|
**Responsibility**: Actual JoinIR instruction emission
|
|||
|
|
**Location**: `src/mir/join_ir/lowering/carrier_update_lowerer.rs`
|
|||
|
|
|
|||
|
|
**Current State (Phase 184)**:
|
|||
|
|
- Handles numeric carriers only (`CounterLike`, `AccumulationLike`)
|
|||
|
|
- Emits `Compute { op: Add/Sub/Mul, ... }` for numeric BinOp
|
|||
|
|
|
|||
|
|
**Future State (Phase 188+ Implementation)**:
|
|||
|
|
- Extend to handle `StringAppendChar`:
|
|||
|
|
```rust
|
|||
|
|
// Emit StringBox.concat() call or equivalent
|
|||
|
|
let concat_result = emit_string_concat(lhs_value, ch_value);
|
|||
|
|
```
|
|||
|
|
- Extend to handle `StringAppendLiteral`:
|
|||
|
|
```rust
|
|||
|
|
// Emit string literal + concat
|
|||
|
|
let literal_value = emit_string_literal("...");
|
|||
|
|
let concat_result = emit_string_concat(lhs_value, literal_value);
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**Phase 187 Design**: Document this separation, but do NOT implement.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 4. Architecture Diagram
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
AST → LoopUpdateAnalyzer → UpdateKind classification
|
|||
|
|
↓
|
|||
|
|
Pattern2/4.can_lower()
|
|||
|
|
(Whitelist check only)
|
|||
|
|
↓
|
|||
|
|
[ALLOW] → CarrierUpdateLowerer
|
|||
|
|
(Emit JoinIR instructions)
|
|||
|
|
↓
|
|||
|
|
JoinIR Module
|
|||
|
|
|
|||
|
|
[REJECT] → [joinir/freeze] error
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**Separation of Concerns**:
|
|||
|
|
1. **LoopUpdateAnalyzer**: Extracts `UpdateExpr` from AST (already exists)
|
|||
|
|
2. **Pattern2/4**: Classifies into Allow/Reject (Phase 178 logic + Phase 188 extension)
|
|||
|
|
3. **CarrierUpdateLowerer**: Emits JoinIR (Phase 184 for numeric, Phase 188+ for string)
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 5. Representative Cases (Not Implemented)
|
|||
|
|
|
|||
|
|
### 5.1 JsonParser Update Patterns
|
|||
|
|
|
|||
|
|
#### _parse_number: `num_str = num_str + ch`
|
|||
|
|
**UpdateKind**: `StringAppendChar`
|
|||
|
|
**Classification**:
|
|||
|
|
- `num_str`: carrier name
|
|||
|
|
- `ch`: LoopBodyLocal variable (single character from string scan)
|
|||
|
|
- RHS: `UpdateRhs::Variable("ch")`
|
|||
|
|
**Decision**: **ALLOW** (Phase 188+)
|
|||
|
|
|
|||
|
|
#### _atoi: `num = num * 10 + digit`
|
|||
|
|
**UpdateKind**: `AccumulationLike` (numeric)
|
|||
|
|
**Classification**:
|
|||
|
|
- Nested BinOp: `(num * 10) + digit`
|
|||
|
|
- Currently detected as `UpdateRhs::Other`
|
|||
|
|
**Decision**: **COMPLEX** (requires BinOp tree analysis, Phase 189+)
|
|||
|
|
|
|||
|
|
#### _unescape_string: `result = result + s.substring(...)`
|
|||
|
|
**UpdateKind**: `Complex` (method call)
|
|||
|
|
**Classification**:
|
|||
|
|
- RHS: `UpdateRhs::Other` (MethodCall)
|
|||
|
|
**Decision**: **REJECT** with Fail-Fast
|
|||
|
|
|
|||
|
|
### 5.2 UpdateKind Mapping Table
|
|||
|
|
|
|||
|
|
| Loop Variable | Update Pattern | UpdateRhs | UpdateKind | Phase 187 Decision |
|
|||
|
|
|---------------|----------------|-----------|------------|-------------------|
|
|||
|
|
| `num_str` | `num_str + ch` | `Variable("ch")` | StringAppendChar | ALLOW (Phase 188+) |
|
|||
|
|
| `result` | `result + "\n"` | `StringLiteral("\n")` | StringAppendLiteral | ALLOW (Phase 188+) |
|
|||
|
|
| `num` | `num * 10 + digit` | `Other` (nested BinOp) | Complex | REJECT (Phase 189+) |
|
|||
|
|
| `result` | `result + s.substring(...)` | `Other` (MethodCall) | Complex | REJECT (Fail-Fast) |
|
|||
|
|
| `pos` | `pos + 1` | `Const(1)` | CounterLike | ALLOW (Phase 178 ✅) |
|
|||
|
|
| `sum` | `sum + i` | `Variable("i")` | AccumulationLike | ALLOW (Phase 178 ✅) |
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 6. Next Steps (Phase 188+ Implementation)
|
|||
|
|
|
|||
|
|
### Phase 188: StringAppendChar/Literal Implementation
|
|||
|
|
|
|||
|
|
**Scope**: Extend Pattern2/4 and CarrierUpdateLowerer to support string append.
|
|||
|
|
|
|||
|
|
**Tasks**:
|
|||
|
|
1. **Extend `can_lower()` whitelist** (Pattern2/4)
|
|||
|
|
- Accept `UpdateRhs::Variable(_)` for string carriers
|
|||
|
|
- Accept `UpdateRhs::StringLiteral(_)` for string carriers
|
|||
|
|
- Keep `UpdateRhs::Other` as Fail-Fast
|
|||
|
|
|
|||
|
|
2. **Extend CarrierUpdateLowerer** (emission layer)
|
|||
|
|
- Detect carrier type (String vs Integer)
|
|||
|
|
- Emit `StringBox.concat()` call for string append
|
|||
|
|
- Emit `Compute { Add }` for numeric (existing logic)
|
|||
|
|
|
|||
|
|
3. **Add validation**
|
|||
|
|
- Check variable scope (LoopBodyLocal or OuterLocal only)
|
|||
|
|
- Check literal type (string only)
|
|||
|
|
|
|||
|
|
4. **E2E Test**
|
|||
|
|
- `_parse_number` minimal version with `num_str = num_str + ch`
|
|||
|
|
|
|||
|
|
**Estimate**: 3-4 hours
|
|||
|
|
|
|||
|
|
### Phase 189+: Complex BinOp (Future)
|
|||
|
|
|
|||
|
|
**Scope**: Handle nested BinOp like `num * 10 + digit`.
|
|||
|
|
|
|||
|
|
**Tasks**:
|
|||
|
|
1. Extend `analyze_rhs()` to recursively parse BinOp trees
|
|||
|
|
2. Classify simple nested patterns (e.g., `(x * 10) + y`) as safe
|
|||
|
|
3. Keep truly complex patterns (e.g., method calls in BinOp) as Fail-Fast
|
|||
|
|
|
|||
|
|
**Estimate**: 5-6 hours
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 7. Design Constraints
|
|||
|
|
|
|||
|
|
### 7.1 Box Theory Compliance
|
|||
|
|
|
|||
|
|
**Separation of Concerns**:
|
|||
|
|
- UpdateKind classification → LoopUpdateAnalyzer (existing box)
|
|||
|
|
- Can-lower decision → Pattern2/4 (control flow box)
|
|||
|
|
- JoinIR emission → CarrierUpdateLowerer (lowering box)
|
|||
|
|
|
|||
|
|
**No Cross-Boundary Leakage**:
|
|||
|
|
- Pattern2/4 do NOT emit JoinIR directly for string operations
|
|||
|
|
- CarrierUpdateLowerer does NOT make can-lower decisions
|
|||
|
|
|
|||
|
|
### 7.2 Fail-Fast Preservation
|
|||
|
|
|
|||
|
|
**Phase 178 Logic Untouched**:
|
|||
|
|
- All `UpdateRhs::StringLiteral` and `UpdateRhs::Other` continue to trigger Fail-Fast
|
|||
|
|
- Phase 187 only documents what "safe" means—implementation is Phase 188+
|
|||
|
|
|
|||
|
|
**Error Messages**:
|
|||
|
|
- Current: `"String/complex update detected, rejecting Pattern 2 (unsupported)"`
|
|||
|
|
- Future (Phase 188+): More specific messages for different rejection reasons
|
|||
|
|
|
|||
|
|
### 7.3 Testability
|
|||
|
|
|
|||
|
|
**Unit Test Separation**:
|
|||
|
|
- LoopUpdateAnalyzer tests: AST → UpdateExpr extraction
|
|||
|
|
- Pattern2/4 tests: UpdateExpr → can_lower decision
|
|||
|
|
- CarrierUpdateLowerer tests: UpdateExpr → JoinIR emission
|
|||
|
|
|
|||
|
|
**E2E Test**:
|
|||
|
|
- JsonParser representative loops (Phase 188+)
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 8. Documentation Updates
|
|||
|
|
|
|||
|
|
### 8.1 joinir-architecture-overview.md
|
|||
|
|
|
|||
|
|
Add one sentence in Section 2.2 (条件式ライン):
|
|||
|
|
|
|||
|
|
```markdown
|
|||
|
|
- **LoopUpdateAnalyzer / CarrierUpdateLowerer**
|
|||
|
|
- ファイル:
|
|||
|
|
- `src/mir/join_ir/lowering/loop_update_analyzer.rs`
|
|||
|
|
- `src/mir/join_ir/lowering/carrier_update_lowerer.rs`
|
|||
|
|
- 責務:
|
|||
|
|
- ループで更新される変数(carrier)を検出し、UpdateExpr を保持。
|
|||
|
|
- Pattern 4 では実際に更新されるキャリアだけを残す。
|
|||
|
|
- **Phase 187設計**: String 更新は UpdateKind ベースのホワイトリストで扱う方針(StringAppendChar/Literal は Phase 188+ で実装予定)。
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 8.2 CURRENT_TASK.md
|
|||
|
|
|
|||
|
|
Add Phase 187 entry:
|
|||
|
|
|
|||
|
|
```markdown
|
|||
|
|
- [x] **Phase 187: String UpdateLowering 設計** ✅ (2025-12-09)
|
|||
|
|
- UpdateKind ベースのホワイトリスト設計(doc-only)
|
|||
|
|
- StringAppendChar/StringAppendLiteral を安全パターンとして定義
|
|||
|
|
- Complex (method call / nested BinOp) は Fail-Fast 維持
|
|||
|
|
- Phase 178 の Fail-Fast は完全保持
|
|||
|
|
- Phase 188+ での実装方針を確立
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 9. Success Criteria (Phase 187)
|
|||
|
|
|
|||
|
|
- [x] Design document created (`phase187-string-update-design.md`)
|
|||
|
|
- [x] UpdateKind whitelist defined (6 categories)
|
|||
|
|
- [x] Fail-Fast preservation confirmed (Phase 178 untouched)
|
|||
|
|
- [x] Lowerer responsibility separation documented
|
|||
|
|
- [x] Representative cases analyzed (JsonParser loops)
|
|||
|
|
- [x] Architecture diagram created
|
|||
|
|
- [x] Next steps defined (Phase 188+ implementation)
|
|||
|
|
- [x] `joinir-architecture-overview.md` updated (1-sentence addition)
|
|||
|
|
- [x] `CURRENT_TASK.md` updated (Phase 187 entry added)
|
|||
|
|
|
|||
|
|
**All criteria met**: Phase 187 complete (design-only).
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 10. Conclusion
|
|||
|
|
|
|||
|
|
Phase 187 establishes a clear design for string update handling in JoinIR:
|
|||
|
|
|
|||
|
|
1. **Safe Patterns**: CounterLike, AccumulationLike, StringAppendChar, StringAppendLiteral
|
|||
|
|
2. **Unsafe Patterns**: Complex (method calls, nested BinOp) → Fail-Fast
|
|||
|
|
3. **Separation of Concerns**: Detection (Pattern2/4) vs Emission (CarrierUpdateLowerer)
|
|||
|
|
4. **Phase 178 Preservation**: All Fail-Fast logic remains unchanged
|
|||
|
|
|
|||
|
|
**No code changes in Phase 187**—all design decisions documented for Phase 188+ implementation.
|
|||
|
|
|
|||
|
|
**Next Phase**: Phase 188 - Implement StringAppendChar/Literal lowering (3-4 hours estimate).
|