# Phase 187: String UpdateLowering Design (Doc-Only)
**Date**: 2025-12-09
**Status**: Design Phase (No Code Changes)
**Prerequisite**: Phase 178 Fail-Fast must remain intact

---
## Executive Summary
Phase 187 defines **which kinds of string updates are safe to handle in JoinIR**, using an UpdateKind-based whitelist. This is a design-only phase: no code is changed.
**Core Principle**: Maintain Phase 178's Fail-Fast behavior while establishing a clear path forward for string operations.

---
## 1. UpdateKind Candidates
We classify update patterns into categories based on their complexity and safety:
### 1.1 Safe Patterns (Whitelist Candidates)
#### CounterLike
**Pattern**: `pos = pos + 1`, `i = i - 1`
**String Relevance**: Position tracking in string scanning loops
**Safety**: ✅ Simple arithmetic, deterministic
**Decision**: **ALLOW** (already supported in Phase 178)
#### AccumulationLike (Numeric)
**Pattern**: `sum = sum + i`, `total = total * factor`
**String Relevance**: None (numeric only)
**Safety**: ✅ Arithmetic operations, well-understood
**Decision**: **ALLOW** (already supported in Phase 178)
#### StringAppendChar
**Pattern**: `result = result + ch` (where `ch` is a single character variable)
**Example**: JsonParser `_parse_number`: `num_str = num_str + digit_ch`
**Safety**: ⚠️ Requires:
- RHS must be `UpdateRhs::Variable(name)`
- Variable scope: LoopBodyLocal or OuterLocal
- Single character (enforced at runtime by StringBox semantics)
**Decision**: **ALLOW** (with validation)
**Rationale**: This pattern is structurally identical to numeric accumulation:
```
sum = sum + i // Numeric accumulation
result = result + ch // String accumulation (char-by-char)
```
#### StringAppendLiteral
**Pattern**: `s = s + "..."` (where `"..."` is a string literal)
**Example**: `debug_output = debug_output + "[INFO] "`
**Safety**: ⚠️ Requires:
- RHS must be `UpdateRhs::StringLiteral(s)`
- Literal must be compile-time constant
**Decision**: **ALLOW** (with validation)
**Rationale**: Simpler than StringAppendChar, because no variable resolution is needed.
### 1.2 Unsafe Patterns (Fail-Fast)
#### Complex (Method Calls)
**Pattern**: `result = result + s.substring(pos, end)`
**Example**: JsonParser `_unescape_string`
**Safety**: ❌ Requires:
- Method call evaluation
- Multiple arguments
- Potentially non-deterministic results
**Decision**: **REJECT** with `[joinir/freeze]`
**Error Message**:
```
[pattern2/can_lower] Complex string update detected (method call in RHS).
JoinIR does not support this pattern yet. Use simpler string operations.
```
#### Complex (Nested BinOp)
**Pattern**: `x = x + (a + b)`, `result = result + s1 + s2`
**Safety**: ❌ Nested expression evaluation required
**Decision**: **REJECT** with `[joinir/freeze]`
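The categories above can be summarized as a small classification function. The following is a minimal sketch, not the project's actual code: `UpdateRhs` here is a simplified stand-in for the analyzer's type, while `CarrierType`, `UpdateKind`, `classify`, and `is_whitelisted` are hypothetical names introduced only for illustration.

```rust
// Simplified stand-in for the analyzer's RHS type (assumption: the real
// definition may carry different payloads).
#[derive(Debug, Clone, PartialEq)]
enum UpdateRhs {
    Const(i64),
    Variable(String),
    StringLiteral(String),
    Other, // method calls, nested BinOps, anything else
}

// Hypothetical tag: assumes the classifier knows the carrier's type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CarrierType {
    Integer,
    String,
}

// Hypothetical enum mirroring the categories of Section 1.
#[derive(Debug, Clone, Copy, PartialEq)]
enum UpdateKind {
    CounterLike,
    AccumulationLike,
    StringAppendChar,
    StringAppendLiteral,
    Complex,
}

// Map (carrier type, RHS shape) to an UpdateKind.
fn classify(carrier: CarrierType, rhs: &UpdateRhs) -> UpdateKind {
    match (carrier, rhs) {
        (CarrierType::Integer, UpdateRhs::Const(_)) => UpdateKind::CounterLike,
        (CarrierType::Integer, UpdateRhs::Variable(_)) => UpdateKind::AccumulationLike,
        (CarrierType::String, UpdateRhs::Variable(_)) => UpdateKind::StringAppendChar,
        (CarrierType::String, UpdateRhs::StringLiteral(_)) => UpdateKind::StringAppendLiteral,
        // Everything else (including UpdateRhs::Other) stays Fail-Fast.
        _ => UpdateKind::Complex,
    }
}

// Only Complex is rejected; the four safe kinds are whitelist candidates.
fn is_whitelisted(kind: UpdateKind) -> bool {
    !matches!(kind, UpdateKind::Complex)
}
```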
---
## 2. Fail-Fast Policy (Phase 178 Preservation)
**Non-Negotiable**: Phase 178's Fail-Fast behavior must remain intact.
### 2.1 Current Fail-Fast Logic (Untouched)
**File**: `src/mir/builder/control_flow/joinir/patterns/pattern2_with_break.rs`
**File**: `src/mir/builder/control_flow/joinir/patterns/pattern4_with_continue.rs`
```rust
// Phase 178: Reject string/complex updates
fn can_lower(...) -> bool {
    for update in carrier_updates.values() {
        match update {
            UpdateExpr::BinOp { rhs, .. } => {
                if matches!(rhs, UpdateRhs::StringLiteral(_) | UpdateRhs::Other) {
                    // Phase 178: Fail-Fast for string updates
                    return false; // ← This stays unchanged in Phase 187
                }
            }
            _ => {}
        }
    }
    true
}
```
**Phase 187 Changes**: NONE (this code is not touched in Phase 187).
### 2.2 Future Whitelist Expansion (Phase 188+)
In **Phase 188** (implementation phase), we will:
1. Extend `can_lower()` to accept `StringAppendChar` and `StringAppendLiteral`
2. Add validation to ensure safety constraints (variable scope, literal type)
3. Extend `CarrierUpdateLowerer` to emit JoinIR for string append operations
**Phase 187 does NOT implement this**; it only defines what "safe" means.

---
## 3. Lowerer Responsibility Separation
### 3.1 Detection Layer (Pattern2/4)
**Responsibility**: UpdateKind classification only
**Location**: `pattern2_with_break.rs`, `pattern4_with_continue.rs`
```rust
// Phase 187 Design: What Pattern2/4 WILL check (future)
fn can_lower_string_update(update: &UpdateExpr) -> bool {
    match update {
        UpdateExpr::BinOp { rhs, .. } => {
            match rhs {
                UpdateRhs::Variable(_) => true,      // StringAppendChar
                UpdateRhs::StringLiteral(_) => true, // StringAppendLiteral
                UpdateRhs::Other => false,           // Complex (reject)
                UpdateRhs::Const(_) => true,         // Numeric (already allowed)
            }
        }
        _ => true,
    }
}
```
**Key Point**: Pattern2/4 only perform classification—they do NOT emit JoinIR for strings.
### 3.2 Emission Layer (CarrierUpdateLowerer + Expr Lowerer)
**Responsibility**: Actual JoinIR instruction emission
**Location**: `src/mir/join_ir/lowering/carrier_update_lowerer.rs`
**Current State (Phase 184)**:
- Handles numeric carriers only (`CounterLike`, `AccumulationLike`)
- Emits `Compute { op: Add/Sub/Mul, ... }` for numeric BinOp
**Future State (Phase 188+ Implementation)**:
- Extend to handle `StringAppendChar`:
  ```rust
  // Emit StringBox.concat() call or equivalent
  let concat_result = emit_string_concat(lhs_value, ch_value);
  ```
- Extend to handle `StringAppendLiteral`:
  ```rust
  // Emit string literal + concat
  let literal_value = emit_string_literal("...");
  let concat_result = emit_string_concat(lhs_value, literal_value);
  ```
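To make the two future cases concrete, here is a self-contained sketch of how an emission layer might turn a string append into instructions. Everything in it is an assumption for illustration: `JoinInst`, `ConstStr`, `StringConcat`, `AppendRhs`, and `Emitter` are hypothetical and do not describe the real JoinIR instruction set; only the `Compute { op: Add/Sub/Mul, ... }` shape is taken from the text above.

```rust
// Hypothetical, heavily simplified instruction set (not the real JoinIR).
#[derive(Debug, Clone, Copy, PartialEq)]
enum BinOpKind {
    Add,
    Sub,
    Mul,
}

#[derive(Debug, Clone, PartialEq)]
enum JoinInst {
    Compute { dst: u32, op: BinOpKind, lhs: u32, rhs: u32 },
    ConstStr { dst: u32, value: String },
    StringConcat { dst: u32, lhs: u32, rhs: u32 },
}

// RHS of a string append: an already-resolved value id (e.g. `ch`)
// or a compile-time string literal.
enum AppendRhs<'a> {
    Value(u32),
    Literal(&'a str),
}

struct Emitter {
    next_id: u32,
    insts: Vec<JoinInst>,
}

impl Emitter {
    fn new(first_free_id: u32) -> Self {
        Emitter { next_id: first_free_id, insts: Vec::new() }
    }

    // Allocate a fresh value id.
    fn fresh(&mut self) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        id
    }

    // Existing numeric path: one Compute instruction.
    fn emit_numeric(&mut self, op: BinOpKind, lhs: u32, rhs: u32) -> u32 {
        let dst = self.fresh();
        self.insts.push(JoinInst::Compute { dst, op, lhs, rhs });
        dst
    }

    // Future string path: materialize a literal if needed, then concat.
    fn emit_string_append(&mut self, lhs: u32, rhs: AppendRhs<'_>) -> u32 {
        let rhs_id = match rhs {
            AppendRhs::Value(id) => id,
            AppendRhs::Literal(s) => {
                let id = self.fresh();
                self.insts.push(JoinInst::ConstStr { dst: id, value: s.to_string() });
                id
            }
        };
        let dst = self.fresh();
        self.insts.push(JoinInst::StringConcat { dst, lhs, rhs: rhs_id });
        dst
    }
}
```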
**Phase 187 Design**: Document this separation, but do NOT implement.

---
## 4. Architecture Diagram
```
AST → LoopUpdateAnalyzer → UpdateKind classification
                                    ↓
                          Pattern2/4.can_lower()
                          (Whitelist check only)
                                    ↓
                 [ALLOW] → CarrierUpdateLowerer
                           (Emit JoinIR instructions)
                                    ↓
                              JoinIR Module

                 [REJECT] → [joinir/freeze] error
```
**Separation of Concerns**:
1. **LoopUpdateAnalyzer**: Extracts `UpdateExpr` from AST (already exists)
2. **Pattern2/4**: Classifies into Allow/Reject (Phase 178 logic + Phase 188 extension)
3. **CarrierUpdateLowerer**: Emits JoinIR (Phase 184 for numeric, Phase 188+ for string)
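
The flow in the diagram can be sketched as a gate followed by an emitter. This is an illustrative sketch of the Phase 188+ target shape under stated assumptions: `UpdateRhs` is a simplified stand-in, `lower_update` and `emit` are hypothetical helpers, and the textual pseudo-instructions are placeholders rather than real JoinIR output.

```rust
// Simplified stand-in for the analyzer's RHS type.
#[derive(Debug, Clone, PartialEq)]
enum UpdateRhs {
    Const(i64),
    Variable(String),
    StringLiteral(String),
    Other,
}

// Detection layer: whitelist check only, no emission.
fn can_lower(rhs: &UpdateRhs) -> bool {
    !matches!(rhs, UpdateRhs::Other)
}

// Emission layer: assumes the caller already passed the whitelist check.
fn emit(carrier: &str, rhs: &UpdateRhs) -> String {
    match rhs {
        UpdateRhs::Const(n) => format!("{} = add {}, {}", carrier, carrier, n),
        UpdateRhs::Variable(v) => format!("{} = add {}, {}", carrier, carrier, v),
        UpdateRhs::StringLiteral(s) => format!("{} = concat {}, {:?}", carrier, carrier, s),
        UpdateRhs::Other => unreachable!("rejected by can_lower"),
    }
}

// Pipeline: [ALLOW] goes to the emitter, [REJECT] becomes a freeze error.
fn lower_update(carrier: &str, rhs: &UpdateRhs) -> Result<String, String> {
    if !can_lower(rhs) {
        return Err(format!(
            "[joinir/freeze] unsupported update for carrier '{}'",
            carrier
        ));
    }
    Ok(emit(carrier, rhs))
}
```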
---
## 5. Representative Cases (Not Implemented)
### 5.1 JsonParser Update Patterns
#### _parse_number: `num_str = num_str + ch`
**UpdateKind**: `StringAppendChar`
**Classification**:
- `num_str`: carrier name
- `ch`: LoopBodyLocal variable (single character from string scan)
- RHS: `UpdateRhs::Variable("ch")`
**Decision**: **ALLOW** (Phase 188+)
#### _atoi: `num = num * 10 + digit`
**UpdateKind**: `AccumulationLike` (numeric) in intent, but currently classified as `Complex`
**Classification**:
- Nested BinOp: `(num * 10) + digit`
- Currently detected as `UpdateRhs::Other`
**Decision**: **REJECT** for now (Complex; requires BinOp tree analysis, planned for Phase 189+)
#### _unescape_string: `result = result + s.substring(...)`
**UpdateKind**: `Complex` (method call)
**Classification**:
- RHS: `UpdateRhs::Other` (MethodCall)
**Decision**: **REJECT** with Fail-Fast
### 5.2 UpdateKind Mapping Table
| Loop Variable | Update Pattern | UpdateRhs | UpdateKind | Phase 187 Decision |
|---------------|----------------|-----------|------------|-------------------|
| `num_str` | `num_str + ch` | `Variable("ch")` | StringAppendChar | ALLOW (Phase 188+) |
| `result` | `result + "\n"` | `StringLiteral("\n")` | StringAppendLiteral | ALLOW (Phase 188+) |
| `num` | `num * 10 + digit` | `Other` (nested BinOp) | Complex | REJECT (Phase 189+) |
| `result` | `result + s.substring(...)` | `Other` (MethodCall) | Complex | REJECT (Fail-Fast) |
| `pos` | `pos + 1` | `Const(1)` | CounterLike | ALLOW (Phase 178 ✅) |
| `sum` | `sum + i` | `Variable("i")` | AccumulationLike | ALLOW (Phase 178 ✅) |
---
## 6. Next Steps (Phase 188+ Implementation)
### Phase 188: StringAppendChar/Literal Implementation
**Scope**: Extend Pattern2/4 and CarrierUpdateLowerer to support string append.
**Tasks**:
1. **Extend `can_lower()` whitelist** (Pattern2/4)
- Accept `UpdateRhs::Variable(_)` for string carriers
- Accept `UpdateRhs::StringLiteral(_)` for string carriers
- Keep `UpdateRhs::Other` as Fail-Fast
2. **Extend CarrierUpdateLowerer** (emission layer)
- Detect carrier type (String vs Integer)
- Emit `StringBox.concat()` call for string append
- Emit `Compute { Add }` for numeric (existing logic)
3. **Add validation**
- Check variable scope (LoopBodyLocal or OuterLocal only)
- Check literal type (string only)
4. **E2E Test**
- `_parse_number` minimal version with `num_str = num_str + ch`
**Estimate**: 3-4 hours
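
Task 3 (validation) could look roughly like the sketch below. It is an assumption-laden illustration: `VarScope` (with a catch-all `Other` variant), the scope map, and `validate_string_append` are hypothetical names, and the error strings are placeholders.

```rust
use std::collections::HashMap;

// Hypothetical scope tag; only LoopBodyLocal and OuterLocal come from the
// design text, `Other` is a catch-all added for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum VarScope {
    LoopBodyLocal,
    OuterLocal,
    Other,
}

// Simplified stand-in for the analyzer's RHS type.
enum UpdateRhs {
    Const(i64),
    Variable(String),
    StringLiteral(String),
    Other,
}

// Validate the RHS of an update on a string carrier.
fn validate_string_append(
    rhs: &UpdateRhs,
    scopes: &HashMap<String, VarScope>,
) -> Result<(), String> {
    match rhs {
        UpdateRhs::Variable(name) => match scopes.get(name) {
            Some(VarScope::LoopBodyLocal) | Some(VarScope::OuterLocal) => Ok(()),
            Some(other) => Err(format!(
                "variable '{}' has unsupported scope {:?}",
                name, other
            )),
            None => Err(format!("variable '{}' is not in scope", name)),
        },
        // A string literal is a compile-time constant by construction.
        UpdateRhs::StringLiteral(_) => Ok(()),
        UpdateRhs::Const(_) => {
            Err("numeric constant cannot be appended to a string carrier".to_string())
        }
        UpdateRhs::Other => Err("complex RHS stays Fail-Fast".to_string()),
    }
}
```

Returning `Result<(), String>` rather than `bool` is a design choice made here so that each rejection can carry its own reason.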
### Phase 189+: Complex BinOp (Future)
**Scope**: Handle nested BinOp like `num * 10 + digit`.
**Tasks**:
1. Extend `analyze_rhs()` to recursively parse BinOp trees
2. Classify simple nested patterns (e.g., `(x * 10) + y`) as safe
3. Keep truly complex patterns (e.g., method calls in BinOp) as Fail-Fast
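
The recursive analysis in the tasks above might be sketched as follows. The `Expr` type, `Op`, the `bin` helper, and `is_simple_arith` are all hypothetical; the real AST and `analyze_rhs()` signature are not shown in this document, so this is only an illustration of the idea of walking a BinOp tree and rejecting any method call inside it.

```rust
// Hypothetical miniature expression tree (not the project's AST).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Add,
    Sub,
    Mul,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(String),
    Int(i64),
    BinOp { op: Op, lhs: Box<Expr>, rhs: Box<Expr> },
    MethodCall { receiver: Box<Expr>, method: String, args: Vec<Expr> },
}

// Convenience constructor for BinOp nodes.
fn bin(op: Op, lhs: Expr, rhs: Expr) -> Expr {
    Expr::BinOp { op, lhs: Box::new(lhs), rhs: Box::new(rhs) }
}

// True when the tree contains only variables, integer constants,
// and arithmetic BinOps; any method call makes it unsafe.
fn is_simple_arith(expr: &Expr) -> bool {
    match expr {
        Expr::Var(_) | Expr::Int(_) => true,
        Expr::BinOp { lhs, rhs, .. } => is_simple_arith(lhs) && is_simple_arith(rhs),
        Expr::MethodCall { .. } => false,
    }
}
```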
**Estimate**: 5-6 hours

---
## 7. Design Constraints
### 7.1 Box Theory Compliance
**Separation of Concerns**:
- UpdateKind classification → LoopUpdateAnalyzer (existing box)
- Can-lower decision → Pattern2/4 (control flow box)
- JoinIR emission → CarrierUpdateLowerer (lowering box)
**No Cross-Boundary Leakage**:
- Pattern2/4 do NOT emit JoinIR directly for string operations
- CarrierUpdateLowerer does NOT make can-lower decisions
### 7.2 Fail-Fast Preservation
**Phase 178 Logic Untouched**:
- All `UpdateRhs::StringLiteral` and `UpdateRhs::Other` updates continue to trigger Fail-Fast
- Phase 187 only documents what "safe" means; implementation is deferred to Phase 188+
**Error Messages**:
- Current: `"String/complex update detected, rejecting Pattern 2 (unsupported)"`
- Future (Phase 188+): More specific messages for different rejection reasons
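
One possible shape for the more specific messages is an enum of rejection reasons with a `Display` implementation. This is a sketch only: `RejectReason`, its variants, and the message wording are hypothetical and not taken from the codebase; only the `[joinir/freeze]` tag comes from this document.

```rust
use std::fmt;

// Hypothetical rejection reasons, one per Fail-Fast category.
#[derive(Debug, Clone, PartialEq)]
enum RejectReason {
    MethodCallInRhs,
    NestedBinOp,
    UnsupportedScope { var: String },
}

impl fmt::Display for RejectReason {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RejectReason::MethodCallInRhs => write!(
                f,
                "[joinir/freeze] string update with a method call in RHS is not supported"
            ),
            RejectReason::NestedBinOp => write!(
                f,
                "[joinir/freeze] nested BinOp in update RHS is not supported"
            ),
            RejectReason::UnsupportedScope { var } => write!(
                f,
                "[joinir/freeze] variable '{}' has a scope that cannot be lowered",
                var
            ),
        }
    }
}
```

Keeping the reason as data and formatting it in one place would let unit tests assert on the variant instead of matching message strings.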
### 7.3 Testability
**Unit Test Separation**:
- LoopUpdateAnalyzer tests: AST → UpdateExpr extraction
- Pattern2/4 tests: UpdateExpr → can_lower decision
- CarrierUpdateLowerer tests: UpdateExpr → JoinIR emission
**E2E Test**:
- JsonParser representative loops (Phase 188+)
---
## 8. Documentation Updates
### 8.1 joinir-architecture-overview.md
Add one sentence in Section 2.2 (Condition Expression Line):
```markdown
- **LoopUpdateAnalyzer / CarrierUpdateLowerer**
  - Files:
    - `src/mir/join_ir/lowering/loop_update_analyzer.rs`
    - `src/mir/join_ir/lowering/carrier_update_lowerer.rs`
  - Responsibilities:
    - Detect the variables (carriers) updated in a loop and hold their UpdateExpr.
    - For Pattern 4, keep only the carriers that are actually updated.
    - **Phase 187 design**: String updates are handled with an UpdateKind-based whitelist (StringAppendChar/Literal are planned for implementation in Phase 188+).
```
### 8.2 CURRENT_TASK.md
Add Phase 187 entry:
```markdown
- [x] **Phase 187: String UpdateLowering design** ✅ (2025-12-09)
  - UpdateKind-based whitelist design (doc-only)
  - StringAppendChar/StringAppendLiteral defined as safe patterns
  - Complex (method call / nested BinOp) remains Fail-Fast
  - Phase 178 Fail-Fast is fully preserved
  - Implementation plan for Phase 188+ established
```
---
## 9. Success Criteria (Phase 187)
- [x] Design document created (`phase187-string-update-design.md`)
- [x] UpdateKind categories defined (6 categories: 4 whitelisted, 2 rejected)
- [x] Fail-Fast preservation confirmed (Phase 178 untouched)
- [x] Lowerer responsibility separation documented
- [x] Representative cases analyzed (JsonParser loops)
- [x] Architecture diagram created
- [x] Next steps defined (Phase 188+ implementation)
- [x] `joinir-architecture-overview.md` updated (1-sentence addition)
- [x] `CURRENT_TASK.md` updated (Phase 187 entry added)
**All criteria met**: Phase 187 complete (design-only).

---
## 10. Conclusion
Phase 187 establishes a clear design for string update handling in JoinIR:
1. **Safe Patterns**: CounterLike, AccumulationLike, StringAppendChar, StringAppendLiteral
2. **Unsafe Patterns**: Complex (method calls, nested BinOp) → Fail-Fast
3. **Separation of Concerns**: Detection (Pattern2/4) vs Emission (CarrierUpdateLowerer)
4. **Phase 178 Preservation**: All Fail-Fast logic remains unchanged
**No code changes in Phase 187**: all design decisions are documented for Phase 188+ implementation.
**Next Phase**: Phase 188 - Implement StringAppendChar/Literal lowering (3-4 hours estimate).