hakorune/docs/development/current/main/phase187-string-update-design.md
nyash-codex d4231f5d3a feat(joinir): Phase 185-187 body-local infrastructure + string design
Phase 185: Body-local Pattern2/4 integration skeleton
- Added collect_body_local_variables() helper
- Integrated UpdateEnv usage in loop_with_break_minimal
- Test files created (blocked by init lowering)

Phase 186: Body-local init lowering infrastructure
- Created LoopBodyLocalInitLowerer box (378 lines)
- Supports BinOp (+/-/*//) + Const + Variable
- Fail-Fast for method calls/string operations
- 3 unit tests passing

Phase 187: String UpdateLowering design (doc-only)
- Defined UpdateKind whitelist (6 categories)
- StringAppendChar/Literal patterns identified
- 3-layer architecture documented
- No code changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-09 00:59:38 +09:00

# Phase 187: String UpdateLowering Design (Doc-Only)
**Date**: 2025-12-09
**Status**: Design Phase (No Code Changes)
**Prerequisite**: Phase 178 Fail-Fast must remain intact
---
## Executive Summary
Phase 187 defines **what kinds of string updates are safe to handle in JoinIR**, using an UpdateKind-based whitelist approach. This is a design-only phase—no code will be changed.
**Core Principle**: Maintain Phase 178's Fail-Fast behavior while establishing a clear path forward for string operations.
---
## 1. UpdateKind Candidates
We classify update patterns into categories based on their complexity and safety:
### 1.1 Safe Patterns (Whitelist Candidates)
#### CounterLike
**Pattern**: `pos = pos + 1`, `i = i - 1`
**String Relevance**: Position tracking in string scanning loops
**Safety**: ✅ Simple arithmetic, deterministic
**Decision**: **ALLOW** (already supported in Phase 178)
#### AccumulationLike (Numeric)
**Pattern**: `sum = sum + i`, `total = total * factor`
**String Relevance**: None (numeric only)
**Safety**: ✅ Arithmetic operations, well-understood
**Decision**: **ALLOW** (already supported in Phase 178)
#### StringAppendChar
**Pattern**: `result = result + ch` (where `ch` is a single character variable)
**Example**: JsonParser `_parse_number`: `num_str = num_str + digit_ch`
**Safety**: ⚠️ Requires:
- RHS must be `UpdateRhs::Variable(name)`
- Variable scope: LoopBodyLocal or OuterLocal
- Single character (enforced at runtime by StringBox semantics)
**Decision**: **ALLOW** (with validation)
**Rationale**: This pattern is structurally identical to numeric accumulation:
```
sum = sum + i // Numeric accumulation
result = result + ch // String accumulation (char-by-char)
```
#### StringAppendLiteral
**Pattern**: `s = s + "..."` (where `"..."` is a string literal)
**Example**: `debug_output = debug_output + "[INFO] "`
**Safety**: ⚠️ Requires:
- RHS must be `UpdateRhs::StringLiteral(s)`
- Literal must be compile-time constant
**Decision**: **ALLOW** (with validation)
**Rationale**: Simpler than StringAppendChar—no variable resolution needed.
### 1.2 Unsafe Patterns (Fail-Fast)
#### Complex (Method Calls)
**Pattern**: `result = result + s.substring(pos, end)`
**Example**: JsonParser `_unescape_string`
**Safety**: ❌ Requires:
- Method call evaluation
- Multiple arguments
- Potentially non-deterministic results
**Decision**: **REJECT** with `[joinir/freeze]`
**Error Message**:
```
[pattern2/can_lower] Complex string update detected (method call in RHS).
JoinIR does not support this pattern yet. Use simpler string operations.
```
#### Complex (Nested BinOp)
**Pattern**: `x = x + (a + b)`, `result = result + s1 + s2`
**Safety**: ❌ Nested expression evaluation required
**Decision**: **REJECT** with `[joinir/freeze]`
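
The classification above can be sketched as a small pure function. This is only an illustrative sketch, not the project's code: `CarrierType`, the `UpdateKind` enum layout, `classify`, and `is_whitelisted` are hypothetical names, and the `UpdateRhs` stand-in mirrors only the four variants named in this document. It also assumes that the carrier type is known at classification time, which this design does not yet specify.

```rust
/// Hypothetical stand-in for the RHS shapes named in this document.
#[derive(Debug, Clone, PartialEq)]
pub enum UpdateRhs {
    Const(i64),
    Variable(String),
    StringLiteral(String),
    Other, // method calls, nested BinOp, anything else
}

/// Hypothetical carrier type tag (assumption: known before classification).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CarrierType {
    Integer,
    String,
}

/// The update categories from Section 1; both Complex cases share one variant here.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum UpdateKind {
    CounterLike,
    AccumulationLike,
    StringAppendChar,
    StringAppendLiteral,
    Complex,
}

/// Map a (carrier type, RHS shape) pair to an UpdateKind.
/// Any combination not listed as safe falls through to Complex.
pub fn classify(carrier: CarrierType, rhs: &UpdateRhs) -> UpdateKind {
    match (carrier, rhs) {
        (CarrierType::Integer, UpdateRhs::Const(_)) => UpdateKind::CounterLike,
        (CarrierType::Integer, UpdateRhs::Variable(_)) => UpdateKind::AccumulationLike,
        (CarrierType::String, UpdateRhs::Variable(_)) => UpdateKind::StringAppendChar,
        (CarrierType::String, UpdateRhs::StringLiteral(_)) => UpdateKind::StringAppendLiteral,
        _ => UpdateKind::Complex,
    }
}

/// Whitelist check: everything except Complex is a candidate for lowering.
pub fn is_whitelisted(kind: UpdateKind) -> bool {
    kind != UpdateKind::Complex
}
```

Treating type-mismatched pairs (for example, a string literal added to an integer carrier) as `Complex` is a choice made for this sketch, so that anything outside the whitelist is rejected by default.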
---
## 2. Fail-Fast Policy (Phase 178 Preservation)
**Non-Negotiable**: Phase 178's Fail-Fast behavior must remain intact.
### 2.1 Current Fail-Fast Logic (Untouched)
**File**: `src/mir/builder/control_flow/joinir/patterns/pattern2_with_break.rs`
**File**: `src/mir/builder/control_flow/joinir/patterns/pattern4_with_continue.rs`
```rust
// Phase 178: Reject string/complex updates
fn can_lower(...) -> bool {
    for update in carrier_updates.values() {
        match update {
            UpdateExpr::BinOp { rhs, .. } => {
                if matches!(rhs, UpdateRhs::StringLiteral(_) | UpdateRhs::Other) {
                    // Phase 178: Fail-Fast for string updates
                    return false; // ← This stays unchanged in Phase 187
                }
            }
            _ => {}
        }
    }
    true
}
```
**Phase 187 Changes**: NONE (this code is not touched in Phase 187).
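
To make the rejection behavior concrete, here is a self-contained approximation of the check. The real signature is elided in the excerpt above, so the `BTreeMap` parameter, the `lhs` field, and the `UpdateExpr::Const` variant are assumptions made only so the sketch compiles. The rejection rule itself is the one shown in the excerpt.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the RHS shapes named in this document.
#[derive(Debug, Clone, PartialEq)]
pub enum UpdateRhs {
    Const(i64),
    Variable(String),
    StringLiteral(String),
    Other,
}

/// Hypothetical stand-in; the real UpdateExpr may carry different fields.
#[derive(Debug, Clone, PartialEq)]
pub enum UpdateExpr {
    Const(i64),
    BinOp { lhs: String, rhs: UpdateRhs },
}

/// Phase 178 rule: any string-literal or unclassified RHS rejects the whole loop.
pub fn can_lower(carrier_updates: &BTreeMap<String, UpdateExpr>) -> bool {
    carrier_updates.values().all(|update| match update {
        UpdateExpr::BinOp { rhs, .. } => {
            !matches!(rhs, UpdateRhs::StringLiteral(_) | UpdateRhs::Other)
        }
        _ => true,
    })
}
```

With these stand-ins, a loop whose only carrier update is `pos = pos + 1` is accepted. Adding a carrier updated with a string literal makes `can_lower` return `false` for the whole loop.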
### 2.2 Future Whitelist Expansion (Phase 188+)
In **Phase 188** (implementation phase), we will:
1. Extend `can_lower()` to accept `StringAppendChar` and `StringAppendLiteral`
2. Add validation to ensure safety constraints (variable scope, literal type)
3. Extend `CarrierUpdateLowerer` to emit JoinIR for string append operations
**Phase 187 does NOT implement this**—we only design what "safe" means.
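
As a rough sketch of what the Phase 188 validation step might look like, the function below checks the scope and literal constraints for a string carrier. The names `VarScope`, `validate_string_append`, the scope map, and the exact error strings are all hypothetical. Only the constraints themselves (variable scope must be LoopBodyLocal or OuterLocal, literals must be string literals, everything else fails fast) come from this document.

```rust
use std::collections::HashMap;

/// Hypothetical scope tag for a variable referenced in an update RHS.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum VarScope {
    LoopBodyLocal,
    OuterLocal,
    Unknown,
}

/// Hypothetical stand-in for the RHS shapes named in this document.
#[derive(Debug, Clone, PartialEq)]
pub enum UpdateRhs {
    Const(i64),
    Variable(String),
    StringLiteral(String),
    Other,
}

/// Validate the RHS of `carrier = carrier + rhs` for a string carrier.
/// Returns Ok(()) only for the two whitelisted string-append shapes.
pub fn validate_string_append(
    rhs: &UpdateRhs,
    scopes: &HashMap<String, VarScope>,
) -> Result<(), String> {
    match rhs {
        // StringAppendChar: the variable must have a supported scope.
        UpdateRhs::Variable(name) => match scopes.get(name) {
            Some(VarScope::LoopBodyLocal) | Some(VarScope::OuterLocal) => Ok(()),
            _ => Err(format!(
                "[joinir/freeze] variable '{}' has an unsupported scope",
                name
            )),
        },
        // StringAppendLiteral: a compile-time string constant needs no lookup.
        UpdateRhs::StringLiteral(_) => Ok(()),
        // A numeric constant on a string carrier is a type mismatch.
        UpdateRhs::Const(_) => {
            Err("[joinir/freeze] numeric constant appended to string carrier".to_string())
        }
        // Method calls and nested BinOp stay Fail-Fast.
        UpdateRhs::Other => Err("[joinir/freeze] complex string update".to_string()),
    }
}
```

Returning `Result<(), String>` rather than `bool` is a choice made for this sketch. It leaves room for the more specific rejection messages that Section 7.2 anticipates.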
---
## 3. Lowerer Responsibility Separation
### 3.1 Detection Layer (Pattern2/4)
**Responsibility**: UpdateKind classification only
**Location**: `pattern2_with_break.rs`, `pattern4_with_continue.rs`
```rust
// Phase 187 Design: What Pattern2/4 WILL check (future)
fn can_lower_string_update(update: &UpdateExpr) -> bool {
    match update {
        UpdateExpr::BinOp { rhs, .. } => {
            match rhs {
                UpdateRhs::Variable(_) => true,      // StringAppendChar
                UpdateRhs::StringLiteral(_) => true, // StringAppendLiteral
                UpdateRhs::Other => false,           // Complex (reject)
                UpdateRhs::Const(_) => true,         // Numeric (already allowed)
            }
        }
        _ => true,
    }
}
```
**Key Point**: Pattern2/4 only perform classification—they do NOT emit JoinIR for strings.
### 3.2 Emission Layer (CarrierUpdateLowerer + Expr Lowerer)
**Responsibility**: Actual JoinIR instruction emission
**Location**: `src/mir/join_ir/lowering/carrier_update_lowerer.rs`
**Current State (Phase 184)**:
- Handles numeric carriers only (`CounterLike`, `AccumulationLike`)
- Emits `Compute { op: Add/Sub/Mul, ... }` for numeric BinOp
**Future State (Phase 188+ Implementation)**:
- Extend to handle `StringAppendChar`:
  ```rust
  // Emit StringBox.concat() call or equivalent
  let concat_result = emit_string_concat(lhs_value, ch_value);
  ```
- Extend to handle `StringAppendLiteral`:
  ```rust
  // Emit string literal + concat
  let literal_value = emit_string_literal("...");
  let concat_result = emit_string_concat(lhs_value, literal_value);
  ```
**Phase 187 Design**: Document this separation, but do NOT implement.
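
For illustration only, the emission layer could look roughly like the sketch below. Nothing here is the actual JoinIR instruction set: `ValueId`, `JoinInst`, `ConstString`, `StringConcat`, `CarrierUpdate`, and `Emitter` are hypothetical names, and only the `Compute { op: Add/Sub/Mul, ... }` shape is taken from this document. The point is the split: the emitter receives an already-approved update and never makes a can-lower decision.

```rust
/// Numeric operators named in this document for Compute.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BinOpKind {
    Add,
    Sub,
    Mul,
}

/// Hypothetical SSA-style value handle.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ValueId(pub u32);

/// Hypothetical instruction set; only Compute is grounded in this document.
#[derive(Debug, Clone, PartialEq)]
pub enum JoinInst {
    Compute { dst: ValueId, op: BinOpKind, lhs: ValueId, rhs: ValueId },
    ConstString { dst: ValueId, value: String },
    StringConcat { dst: ValueId, lhs: ValueId, rhs: ValueId },
}

/// An update that the detection layer has already approved.
pub enum CarrierUpdate {
    Numeric { op: BinOpKind, rhs: ValueId },
    StringAppendChar { ch: ValueId },
    StringAppendLiteral { literal: String },
}

pub struct Emitter {
    next_id: u32,
    pub insts: Vec<JoinInst>,
}

impl Emitter {
    pub fn new(first_free_id: u32) -> Self {
        Emitter { next_id: first_free_id, insts: Vec::new() }
    }

    /// Allocate a fresh value id.
    fn fresh(&mut self) -> ValueId {
        let id = ValueId(self.next_id);
        self.next_id += 1;
        id
    }

    /// Emit instructions for one carrier update; returns the updated carrier value.
    pub fn emit_update(&mut self, carrier: ValueId, update: &CarrierUpdate) -> ValueId {
        match update {
            CarrierUpdate::Numeric { op, rhs } => {
                let dst = self.fresh();
                self.insts.push(JoinInst::Compute { dst, op: *op, lhs: carrier, rhs: *rhs });
                dst
            }
            CarrierUpdate::StringAppendChar { ch } => {
                let dst = self.fresh();
                self.insts.push(JoinInst::StringConcat { dst, lhs: carrier, rhs: *ch });
                dst
            }
            CarrierUpdate::StringAppendLiteral { literal } => {
                // Materialize the literal first, then concatenate.
                let lit = self.fresh();
                self.insts.push(JoinInst::ConstString { dst: lit, value: literal.clone() });
                let dst = self.fresh();
                self.insts.push(JoinInst::StringConcat { dst, lhs: carrier, rhs: lit });
                dst
            }
        }
    }
}
```

In this sketch a literal append costs two instructions (constant, then concat) and a char append costs one. Whether the real lowering emits a dedicated concat instruction or a `StringBox.concat()` call is left open by this design.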
---
## 4. Architecture Diagram
```
AST → LoopUpdateAnalyzer → UpdateKind classification
                                ↓
                      Pattern2/4.can_lower()
                      (Whitelist check only)
                                ↓
          [ALLOW]  → CarrierUpdateLowerer
                      (Emit JoinIR instructions)
                                ↓
                         JoinIR Module

          [REJECT] → [joinir/freeze] error
```
**Separation of Concerns**:
1. **LoopUpdateAnalyzer**: Extracts `UpdateExpr` from AST (already exists)
2. **Pattern2/4**: Classifies into Allow/Reject (Phase 178 logic + Phase 188 extension)
3. **CarrierUpdateLowerer**: Emits JoinIR (Phase 184 for numeric, Phase 188+ for string)
---
## 5. Representative Cases (Not Implemented)
### 5.1 JsonParser Update Patterns
#### _parse_number: `num_str = num_str + ch`
**UpdateKind**: `StringAppendChar`
**Classification**:
- `num_str`: carrier name
- `ch`: LoopBodyLocal variable (single character from string scan)
- RHS: `UpdateRhs::Variable("ch")`
**Decision**: **ALLOW** (Phase 188+)
#### _atoi: `num = num * 10 + digit`
**UpdateKind**: `AccumulationLike` (numeric) in intent, but currently classified as `Complex`
**Classification**:
- Nested BinOp: `(num * 10) + digit`
- Currently detected as `UpdateRhs::Other`
**Decision**: **REJECT** for now (requires BinOp tree analysis, deferred to Phase 189+)
#### _unescape_string: `result = result + s.substring(...)`
**UpdateKind**: `Complex` (method call)
**Classification**:
- RHS: `UpdateRhs::Other` (MethodCall)
**Decision**: **REJECT** with Fail-Fast
### 5.2 UpdateKind Mapping Table
| Loop Variable | Update Pattern | UpdateRhs | UpdateKind | Phase 187 Decision |
|---------------|----------------|-----------|------------|-------------------|
| `num_str` | `num_str + ch` | `Variable("ch")` | StringAppendChar | ALLOW (Phase 188+) |
| `result` | `result + "\n"` | `StringLiteral("\n")` | StringAppendLiteral | ALLOW (Phase 188+) |
| `num` | `num * 10 + digit` | `Other` (nested BinOp) | Complex | REJECT (Phase 189+) |
| `result` | `result + s.substring(...)` | `Other` (MethodCall) | Complex | REJECT (Fail-Fast) |
| `pos` | `pos + 1` | `Const(1)` | CounterLike | ALLOW (Phase 178 ✅) |
| `sum` | `sum + i` | `Variable("i")` | AccumulationLike | ALLOW (Phase 178 ✅) |
---
## 6. Next Steps (Phase 188+ Implementation)
### Phase 188: StringAppendChar/Literal Implementation
**Scope**: Extend Pattern2/4 and CarrierUpdateLowerer to support string append.
**Tasks**:
1. **Extend `can_lower()` whitelist** (Pattern2/4)
   - Accept `UpdateRhs::Variable(_)` for string carriers
   - Accept `UpdateRhs::StringLiteral(_)` for string carriers
   - Keep `UpdateRhs::Other` as Fail-Fast
2. **Extend CarrierUpdateLowerer** (emission layer)
   - Detect carrier type (String vs Integer)
   - Emit `StringBox.concat()` call for string append
   - Emit `Compute { Add }` for numeric (existing logic)
3. **Add validation**
   - Check variable scope (LoopBodyLocal or OuterLocal only)
   - Check literal type (string only)
4. **E2E Test**
   - `_parse_number` minimal version with `num_str = num_str + ch`
**Estimate**: 3-4 hours
### Phase 189+: Complex BinOp (Future)
**Scope**: Handle nested BinOp like `num * 10 + digit`.
**Tasks**:
1. Extend `analyze_rhs()` to recursively parse BinOp trees
2. Classify simple nested patterns (e.g., `(x * 10) + y`) as safe
3. Keep truly complex patterns (e.g., method calls in BinOp) as Fail-Fast
**Estimate**: 5-6 hours
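
One possible shape for the recursive BinOp analysis is sketched below. The `Expr` type, the `is_simple_arith` name, and the depth budget are all assumptions for illustration. The document does not define how `analyze_rhs()` represents expression trees or how "simple" is bounded.

```rust
/// Hypothetical expression tree for an update RHS.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Const(i64),
    Var(String),
    BinOp { op: char, lhs: Box<Expr>, rhs: Box<Expr> },
    MethodCall { receiver: Box<Expr>, method: String, args: Vec<Expr> },
}

/// Returns true when `expr` consists only of constants, variables, and
/// `+`, `-`, `*` BinOps nested at most `depth_budget` levels deep.
/// Any method call anywhere in the tree makes the whole expression unsafe.
pub fn is_simple_arith(expr: &Expr, depth_budget: u32) -> bool {
    match expr {
        Expr::Const(_) | Expr::Var(_) => true,
        Expr::BinOp { op, lhs, rhs } => {
            depth_budget > 0
                && matches!(*op, '+' | '-' | '*')
                && is_simple_arith(lhs, depth_budget - 1)
                && is_simple_arith(rhs, depth_budget - 1)
        }
        Expr::MethodCall { .. } => false,
    }
}
```

Under these assumptions, `(num * 10) + digit` passes with a budget of 2, while `result + s.substring(pos, end)` is rejected at any budget. The explicit budget is one way to keep "simple nested" bounded, so deeper trees stay Fail-Fast rather than being accepted by accident.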
---
## 7. Design Constraints
### 7.1 Box Theory Compliance
**Separation of Concerns**:
- UpdateKind classification → LoopUpdateAnalyzer (existing box)
- Can-lower decision → Pattern2/4 (control flow box)
- JoinIR emission → CarrierUpdateLowerer (lowering box)
**No Cross-Boundary Leakage**:
- Pattern2/4 do NOT emit JoinIR directly for string operations
- CarrierUpdateLowerer does NOT make can-lower decisions
### 7.2 Fail-Fast Preservation
**Phase 178 Logic Untouched**:
- All `UpdateRhs::StringLiteral` and `UpdateRhs::Other` continue to trigger Fail-Fast
- Phase 187 only documents what "safe" means—implementation is Phase 188+
**Error Messages**:
- Current: `"String/complex update detected, rejecting Pattern 2 (unsupported)"`
- Future (Phase 188+): More specific messages for different rejection reasons
### 7.3 Testability
**Unit Test Separation**:
- LoopUpdateAnalyzer tests: AST → UpdateExpr extraction
- Pattern2/4 tests: UpdateExpr → can_lower decision
- CarrierUpdateLowerer tests: UpdateExpr → JoinIR emission
**E2E Test**:
- JsonParser representative loops (Phase 188+)
---
## 8. Documentation Updates
### 8.1 joinir-architecture-overview.md
Add one sentence in Section 2.2 (Condition Expression Line):
```markdown
- **LoopUpdateAnalyzer / CarrierUpdateLowerer**
  - Files:
    - `src/mir/join_ir/lowering/loop_update_analyzer.rs`
    - `src/mir/join_ir/lowering/carrier_update_lowerer.rs`
  - Responsibilities:
    - Detect the variables (carriers) updated in a loop and retain their UpdateExpr.
    - In Pattern 4, keep only the carriers that are actually updated.
    - **Phase 187 design**: String updates are handled through an UpdateKind-based whitelist (StringAppendChar/Literal are planned for implementation in Phase 188+).
```
### 8.2 CURRENT_TASK.md
Add Phase 187 entry:
```markdown
- [x] **Phase 187: String UpdateLowering design** ✅ (2025-12-09)
  - UpdateKind-based whitelist design (doc-only)
  - StringAppendChar/StringAppendLiteral defined as safe patterns
  - Complex (method call / nested BinOp) remains Fail-Fast
  - Phase 178 Fail-Fast fully preserved
  - Implementation approach for Phase 188+ established
```
---
## 9. Success Criteria (Phase 187)
- [x] Design document created (`phase187-string-update-design.md`)
- [x] UpdateKind whitelist defined (6 categories)
- [x] Fail-Fast preservation confirmed (Phase 178 untouched)
- [x] Lowerer responsibility separation documented
- [x] Representative cases analyzed (JsonParser loops)
- [x] Architecture diagram created
- [x] Next steps defined (Phase 188+ implementation)
- [x] `joinir-architecture-overview.md` updated (1-sentence addition)
- [x] `CURRENT_TASK.md` updated (Phase 187 entry added)
**All criteria met**: Phase 187 complete (design-only).
---
## 10. Conclusion
Phase 187 establishes a clear design for string update handling in JoinIR:
1. **Safe Patterns**: CounterLike, AccumulationLike, StringAppendChar, StringAppendLiteral
2. **Unsafe Patterns**: Complex (method calls, nested BinOp) → Fail-Fast
3. **Separation of Concerns**: Detection (Pattern2/4) vs Emission (CarrierUpdateLowerer)
4. **Phase 178 Preservation**: All Fail-Fast logic remains unchanged
**No code changes in Phase 187**—all design decisions documented for Phase 188+ implementation.
**Next Phase**: Phase 188 - Implement StringAppendChar/Literal lowering (3-4 hours estimate).