Files
hakorune/docs/development/current/main/phase187-string-update-design.md
nyash-codex a2933880ae feat(joinir): Phase 188 StringAppend support in Pattern2/4
- Extended Pattern2/4 whitelist to accept StringLiteral updates
- CarrierUpdateEmitter now emits JoinIR for string append
- Selective Fail-Fast: accept safe patterns, reject complex

Changes:
- pattern2_with_break.rs: StringLiteral whitelist
- pattern4_with_continue.rs: StringLiteral whitelist
- carrier_update_emitter.rs: StringLiteral JoinIR emission

Tests:
- phase188_string_append_char.hako
- phase188_string_append_literal.hako
- 10/10 carrier_update_emitter tests PASS

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-09 01:09:54 +09:00

413 lines
14 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Phase 187: String UpdateLowering Design (Doc-Only)
**Date**: 2025-12-09
**Status**: Design Phase (No Code Changes)
**Prerequisite**: Phase 178 Fail-Fast must remain intact
---
## Executive Summary
Phase 187 defines **what kinds of string updates are safe to handle in JoinIR**, using an UpdateKind-based whitelist approach. This is a design-only phase—no code will be changed.
**Core Principle**: Maintain Phase 178's Fail-Fast behavior while establishing a clear path forward for string operations.
---
## 1. UpdateKind Candidates
We classify update patterns into categories based on their complexity and safety:
### 1.1 Safe Patterns (Whitelist Candidates)
#### CounterLike
**Pattern**: `pos = pos + 1`, `i = i - 1`
**String Relevance**: Position tracking in string scanning loops
**Safety**: ✅ Simple arithmetic, deterministic
**Decision**: **ALLOW** (already supported in Phase 178)
#### AccumulationLike (Numeric)
**Pattern**: `sum = sum + i`, `total = total * factor`
**String Relevance**: None (numeric only)
**Safety**: ✅ Arithmetic operations, well-understood
**Decision**: **ALLOW** (already supported in Phase 178)
#### StringAppendChar
**Pattern**: `result = result + ch` (where `ch` is a single character variable)
**Example**: JsonParser `_parse_number`: `num_str = num_str + digit_ch`
**Safety**: ⚠️ Requires:
- RHS must be `UpdateRhs::Variable(name)`
- Variable scope: LoopBodyLocal or OuterLocal
- Single character (enforced at runtime by StringBox semantics)
**Decision**: **ALLOW** (with validation)
**Rationale**: This pattern is structurally identical to numeric accumulation:
```
sum = sum + i // Numeric accumulation
result = result + ch // String accumulation (char-by-char)
```
#### StringAppendLiteral
**Pattern**: `s = s + "..."` (where `"..."` is a string literal)
**Example**: `debug_output = debug_output + "[INFO] "`
**Safety**: ⚠️ Requires:
- RHS must be `UpdateRhs::StringLiteral(s)`
- Literal must be compile-time constant
**Decision**: **ALLOW** (with validation)
**Rationale**: Simpler than StringAppendChar—no variable resolution needed.
### 1.2 Unsafe Patterns (Fail-Fast)
#### Complex (Method Calls)
**Pattern**: `result = result + s.substring(pos, end)`
**Example**: JsonParser `_unescape_string`
**Safety**: ❌ Requires:
- Method call evaluation
- Multiple arguments
- Potentially non-deterministic results
**Decision**: **REJECT** with `[joinir/freeze]`
**Error Message**:
```
[pattern2/can_lower] Complex string update detected (method call in RHS).
JoinIR does not support this pattern yet. Use simpler string operations.
```
#### Complex (Nested BinOp)
**Pattern**: `x = x + (a + b)`, `result = result + s1 + s2`
**Safety**: ❌ Nested expression evaluation required
**Decision**: **REJECT** with `[joinir/freeze]`
---
## 2. Fail-Fast Policy (Phase 178 Preservation)
**Non-Negotiable**: Phase 178's Fail-Fast behavior must remain intact.
### 2.1 Current Fail-Fast Logic (Untouched)
**File**: `src/mir/builder/control_flow/joinir/patterns/pattern2_with_break.rs`
**File**: `src/mir/builder/control_flow/joinir/patterns/pattern4_with_continue.rs`
```rust
// Phase 178: Reject string/complex updates
fn can_lower(...) -> bool {
for update in carrier_updates.values() {
match update {
UpdateExpr::BinOp { rhs, .. } => {
if matches!(rhs, UpdateRhs::StringLiteral(_) | UpdateRhs::Other) {
// Phase 178: Fail-Fast for string updates
return false; // ← This stays unchanged in Phase 187
}
}
_ => {}
}
}
true
}
```
**Phase 187 Changes**: NONE (this code is not touched in Phase 187).
### 2.2 Future Whitelist Expansion (Phase 188+)
In **Phase 188** (implementation phase), we will:
1. Extend `can_lower()` to accept `StringAppendChar` and `StringAppendLiteral`
2. Add validation to ensure safety constraints (variable scope, literal type)
3. Extend `CarrierUpdateLowerer` to emit JoinIR for string append operations
**Phase 187 does NOT implement this**—we only design what "safe" means.
---
## 3. Lowerer Responsibility Separation
### 3.1 Detection Layer (Pattern2/4)
**Responsibility**: UpdateKind classification only
**Location**: `pattern2_with_break.rs`, `pattern4_with_continue.rs`
```rust
// Phase 187 Design: What Pattern2/4 WILL check (future)
fn can_lower_string_update(update: &UpdateExpr) -> bool {
match update {
UpdateExpr::BinOp { rhs, .. } => {
match rhs {
UpdateRhs::Variable(_) => true, // StringAppendChar
UpdateRhs::StringLiteral(_) => true, // StringAppendLiteral
UpdateRhs::Other => false, // Complex (reject)
UpdateRhs::Const(_) => true, // Numeric (already allowed)
}
}
_ => true,
}
}
```
**Key Point**: Pattern2/4 only perform classification—they do NOT emit JoinIR for strings.
### 3.2 Emission Layer (CarrierUpdateLowerer + Expr Lowerer)
**Responsibility**: Actual JoinIR instruction emission
**Location**: `src/mir/join_ir/lowering/carrier_update_lowerer.rs`
**Current State (Phase 184)**:
- Handles numeric carriers only (`CounterLike`, `AccumulationLike`)
- Emits `Compute { op: Add/Sub/Mul, ... }` for numeric BinOp
**Future State (Phase 188+ Implementation)**:
- Extend to handle `StringAppendChar`:
```rust
// Emit StringBox.concat() call or equivalent
let concat_result = emit_string_concat(lhs_value, ch_value);
```
- Extend to handle `StringAppendLiteral`:
```rust
// Emit string literal + concat
let literal_value = emit_string_literal("...");
let concat_result = emit_string_concat(lhs_value, literal_value);
```
**Phase 187 Design**: Document this separation, but do NOT implement.
---
## 4. Architecture Diagram
```
AST → LoopUpdateAnalyzer → UpdateKind classification
Pattern2/4.can_lower()
(Whitelist check only)
[ALLOW] → CarrierUpdateLowerer
(Emit JoinIR instructions)
JoinIR Module
[REJECT] → [joinir/freeze] error
```
**Separation of Concerns**:
1. **LoopUpdateAnalyzer**: Extracts `UpdateExpr` from AST (already exists)
2. **Pattern2/4**: Classifies into Allow/Reject (Phase 178 logic + Phase 188 extension)
3. **CarrierUpdateLowerer**: Emits JoinIR (Phase 184 for numeric, Phase 188+ for string)
---
## 5. Representative Cases (Not Implemented)
### 5.1 JsonParser Update Patterns
#### _parse_number: `num_str = num_str + ch`
**UpdateKind**: `StringAppendChar`
**Classification**:
- `num_str`: carrier name
- `ch`: LoopBodyLocal variable (single character from string scan)
- RHS: `UpdateRhs::Variable("ch")`
**Decision**: **ALLOW** (Phase 188+)
#### _atoi: `num = num * 10 + digit`
**UpdateKind**: `AccumulationLike` (numeric)
**Classification**:
- Nested BinOp: `(num * 10) + digit`
- Currently detected as `UpdateRhs::Other`
**Decision**: **COMPLEX** (requires BinOp tree analysis, Phase 189+)
#### _unescape_string: `result = result + s.substring(...)`
**UpdateKind**: `Complex` (method call)
**Classification**:
- RHS: `UpdateRhs::Other` (MethodCall)
**Decision**: **REJECT** with Fail-Fast
### 5.2 UpdateKind Mapping Table
| Loop Variable | Update Pattern | UpdateRhs | UpdateKind | Phase 187 Decision |
|---------------|----------------|-----------|------------|-------------------|
| `num_str` | `num_str + ch` | `Variable("ch")` | StringAppendChar | ALLOW (Phase 188+) |
| `result` | `result + "\n"` | `StringLiteral("\n")` | StringAppendLiteral | ALLOW (Phase 188+) |
| `num` | `num * 10 + digit` | `Other` (nested BinOp) | Complex | REJECT (Phase 189+) |
| `result` | `result + s.substring(...)` | `Other` (MethodCall) | Complex | REJECT (Fail-Fast) |
| `pos` | `pos + 1` | `Const(1)` | CounterLike | ALLOW (Phase 178 ✅) |
| `sum` | `sum + i` | `Variable("i")` | AccumulationLike | ALLOW (Phase 178 ✅) |
---
## 6. Next Steps (Phase 188+ Implementation)
### Phase 188: StringAppendChar/Literal Implementation
**Scope**: Extend Pattern2/4 and CarrierUpdateLowerer to support string append.
**Tasks**:
1. **Extend `can_lower()` whitelist** (Pattern2/4)
- Accept `UpdateRhs::Variable(_)` for string carriers
- Accept `UpdateRhs::StringLiteral(_)` for string carriers
- Keep `UpdateRhs::Other` as Fail-Fast
2. **Extend CarrierUpdateLowerer** (emission layer)
- Detect carrier type (String vs Integer)
- Emit `StringBox.concat()` call for string append
- Emit `Compute { Add }` for numeric (existing logic)
3. **Add validation**
- Check variable scope (LoopBodyLocal or OuterLocal only)
- Check literal type (string only)
4. **E2E Test**
- `_parse_number` minimal version with `num_str = num_str + ch`
**Estimate**: 3-4 hours
### Phase 189+: Complex BinOp (Future)
**Scope**: Handle nested BinOp like `num * 10 + digit`.
**Tasks**:
1. Extend `analyze_rhs()` to recursively parse BinOp trees
2. Classify simple nested patterns (e.g., `(x * 10) + y`) as safe
3. Keep truly complex patterns (e.g., method calls in BinOp) as Fail-Fast
**Estimate**: 5-6 hours
---
## 7. Design Constraints
### 7.1 Box Theory Compliance
**Separation of Concerns**:
- UpdateKind classification → LoopUpdateAnalyzer (existing box)
- Can-lower decision → Pattern2/4 (control flow box)
- JoinIR emission → CarrierUpdateLowerer (lowering box)
**No Cross-Boundary Leakage**:
- Pattern2/4 do NOT emit JoinIR directly for string operations
- CarrierUpdateLowerer does NOT make can-lower decisions
### 7.2 Fail-Fast Preservation
**Phase 178 Logic Untouched**:
- All `UpdateRhs::StringLiteral` and `UpdateRhs::Other` continue to trigger Fail-Fast
- Phase 187 only documents what "safe" means—implementation is Phase 188+
**Error Messages**:
- Current: `"String/complex update detected, rejecting Pattern 2 (unsupported)"`
- Future (Phase 188+): More specific messages for different rejection reasons
### 7.3 Testability
**Unit Test Separation**:
- LoopUpdateAnalyzer tests: AST → UpdateExpr extraction
- Pattern2/4 tests: UpdateExpr → can_lower decision
- CarrierUpdateLowerer tests: UpdateExpr → JoinIR emission
**E2E Test**:
- JsonParser representative loops (Phase 188+)
---
## 8. Documentation Updates
### 8.1 joinir-architecture-overview.md
Add one sentence in Section 2.2 (条件式ライン):
```markdown
- **LoopUpdateAnalyzer / CarrierUpdateLowerer**
- ファイル:
- `src/mir/join_ir/lowering/loop_update_analyzer.rs`
- `src/mir/join_ir/lowering/carrier_update_lowerer.rs`
- 責務:
- ループで更新される変数carrierを検出し、UpdateExpr を保持。
- Pattern 4 では実際に更新されるキャリアだけを残す。
- **Phase 187設計**: String 更新は UpdateKind ベースのホワイトリストで扱う方針StringAppendChar/Literal は Phase 188+ で実装予定)。
```
### 8.2 CURRENT_TASK.md
Add Phase 187 entry:
```markdown
- [x] **Phase 187: String UpdateLowering 設計** ✅ (2025-12-09)
- UpdateKind ベースのホワイトリスト設計doc-only
- StringAppendChar/StringAppendLiteral を安全パターンとして定義
- Complex (method call / nested BinOp) は Fail-Fast 維持
- Phase 178 の Fail-Fast は完全保持
- Phase 188+ での実装方針を確立
```
---
## 9. Success Criteria (Phase 187)
- [x] Design document created (`phase187-string-update-design.md`)
- [x] UpdateKind whitelist defined (6 categories)
- [x] Fail-Fast preservation confirmed (Phase 178 untouched)
- [x] Lowerer responsibility separation documented
- [x] Representative cases analyzed (JsonParser loops)
- [x] Architecture diagram created
- [x] Next steps defined (Phase 188+ implementation)
- [x] `joinir-architecture-overview.md` updated (1-sentence addition)
- [x] `CURRENT_TASK.md` updated (Phase 187 entry added)
**All criteria met**: Phase 187 complete (design-only).
---
## Phase 188 Implementation Complete (2025-12-09)
### Implementation Summary
Phase 188 successfully implemented StringAppendChar and StringAppendLiteral support in JoinIR patterns.
**Changes Made**:
1. **Pattern2/4 `can_lower()` Whitelist** (Task 188-2)
- Updated `pattern2_with_break.rs` and `pattern4_with_continue.rs`
- Allow: `UpdateRhs::Const`, `UpdateRhs::Variable`, `UpdateRhs::StringLiteral`
- Reject: `UpdateRhs::Other` (complex updates only)
- Old behavior: Rejected all string updates
- New behavior: Accept safe string patterns, reject only complex ones
2. **CarrierUpdateLowerer JoinIR Emission** (Task 188-3)
- Updated `carrier_update_emitter.rs` (both UpdateEnv and ConditionEnv versions)
- `UpdateRhs::StringLiteral(s)` → Emit `Const { value: ConstValue::String(s) }` + `BinOp`
- `UpdateRhs::Variable(name)` → Resolve variable, emit `BinOp` (handles both numeric and string)
- `UpdateRhs::Other` → Return error (should be caught by can_lower)
3. **E2E Test Files** (Task 188-4)
- Created `apps/tests/phase188_string_append_char.hako` (Pattern 2 with break)
- Created `apps/tests/phase188_string_append_literal.hako` (Pattern 4 with continue)
- Both tests compile and run without errors
- JoinIR generation succeeds for both patterns
**Verification**:
```bash
# Pattern 2: StringAppendChar
NYASH_JOINIR_CORE=1 ./target/release/hakorune apps/tests/phase188_string_append_char.hako
# Output: Pattern 2 triggered, JoinIR generated successfully
# Pattern 4: StringAppendLiteral
NYASH_JOINIR_CORE=1 ./target/release/hakorune apps/tests/phase188_string_append_literal.hako
# Output: Pattern 4 triggered, JoinIR generated successfully
```
**Key Achievement**: Phase 178's Fail-Fast is now selective - only rejects truly complex updates (method calls, nested BinOp), while allowing safe string concatenation patterns.
---
## 10. Conclusion
Phase 187 establishes a clear design for string update handling in JoinIR:
1. **Safe Patterns**: CounterLike, AccumulationLike, StringAppendChar, StringAppendLiteral
2. **Unsafe Patterns**: Complex (method calls, nested BinOp) → Fail-Fast
3. **Separation of Concerns**: Detection (Pattern2/4) vs Emission (CarrierUpdateLowerer)
4. **Phase 178 Preservation**: All Fail-Fast logic remains unchanged
**No code changes in Phase 187**—all design decisions documented for Phase 188+ implementation.
**Next Phase**: Phase 188 - Implement StringAppendChar/Literal lowering (3-4 hours estimate).