Files
hakorune/docs/development/current/main/phase187-string-update-design.md
nyash-codex a2933880ae feat(joinir): Phase 188 StringAppend support in Pattern2/4
- Extended Pattern2/4 whitelist to accept StringLiteral updates
- CarrierUpdateEmitter now emits JoinIR for string append
- Selective Fail-Fast: accept safe patterns, reject complex

Changes:
- pattern2_with_break.rs: StringLiteral whitelist
- pattern4_with_continue.rs: StringLiteral whitelist
- carrier_update_emitter.rs: StringLiteral JoinIR emission

Tests:
- phase188_string_append_char.hako
- phase188_string_append_literal.hako
- 10/10 carrier_update_emitter tests PASS

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-09 01:09:54 +09:00

14 KiB
Raw Blame History

Phase 187: String UpdateLowering Design (Doc-Only)

Date: 2025-12-09 Status: Design Phase (No Code Changes) Prerequisite: Phase 178 Fail-Fast must remain intact


Executive Summary

Phase 187 defines what kinds of string updates are safe to handle in JoinIR, using an UpdateKind-based whitelist approach. This is a design-only phase—no code will be changed.

Core Principle: Maintain Phase 178's Fail-Fast behavior while establishing a clear path forward for string operations.


1. UpdateKind Candidates

We classify update patterns into categories based on their complexity and safety:

1.1 Safe Patterns (Whitelist Candidates)

CounterLike

Pattern: pos = pos + 1, i = i - 1 String Relevance: Position tracking in string scanning loops Safety: Simple arithmetic, deterministic Decision: ALLOW (already supported in Phase 178)

AccumulationLike (Numeric)

Pattern: sum = sum + i, total = total * factor String Relevance: None (numeric only) Safety: Arithmetic operations, well-understood Decision: ALLOW (already supported in Phase 178)

StringAppendChar

Pattern: result = result + ch (where ch is a single character variable) Example: JsonParser _parse_number: num_str = num_str + digit_ch Safety: ⚠️ Requires:

  • RHS must be UpdateRhs::Variable(name)
  • Variable scope: LoopBodyLocal or OuterLocal
  • Single character (enforced at runtime by StringBox semantics) Decision: ALLOW (with validation)

Rationale: This pattern is structurally identical to numeric accumulation:

sum = sum + i        // Numeric accumulation
result = result + ch // String accumulation (char-by-char)

StringAppendLiteral

Pattern: s = s + "..." (where "..." is a string literal) Example: debug_output = debug_output + "[INFO] " Safety: ⚠️ Requires:

  • RHS must be UpdateRhs::StringLiteral(s)
  • Literal must be compile-time constant Decision: ALLOW (with validation)

Rationale: Simpler than StringAppendChar—no variable resolution needed.

1.2 Unsafe Patterns (Fail-Fast)

Complex (Method Calls)

Pattern: result = result + s.substring(pos, end) Example: JsonParser _unescape_string Safety: Requires:

  • Method call evaluation
  • Multiple arguments
  • Potentially non-deterministic results Decision: REJECT with [joinir/freeze]

Error Message:

[pattern2/can_lower] Complex string update detected (method call in RHS).
JoinIR does not support this pattern yet. Use simpler string operations.

Complex (Nested BinOp)

Pattern: x = x + (a + b), result = result + s1 + s2 Safety: Nested expression evaluation required Decision: REJECT with [joinir/freeze]


2. Fail-Fast Policy (Phase 178 Preservation)

Non-Negotiable: Phase 178's Fail-Fast behavior must remain intact.

2.1 Current Fail-Fast Logic (Untouched)

File: src/mir/builder/control_flow/joinir/patterns/pattern2_with_break.rs File: src/mir/builder/control_flow/joinir/patterns/pattern4_with_continue.rs

// Phase 178: Reject string/complex updates
fn can_lower(...) -> bool {
    for update in carrier_updates.values() {
        match update {
            UpdateExpr::BinOp { rhs, .. } => {
                if matches!(rhs, UpdateRhs::StringLiteral(_) | UpdateRhs::Other) {
                    // Phase 178: Fail-Fast for string updates
                    return false; // ← This stays unchanged in Phase 187
                }
            }
            _ => {}
        }
    }
    true
}

Phase 187 Changes: NONE (this code is not touched in Phase 187).

2.2 Future Whitelist Expansion (Phase 188+)

In Phase 188 (implementation phase), we will:

  1. Extend can_lower() to accept StringAppendChar and StringAppendLiteral
  2. Add validation to ensure safety constraints (variable scope, literal type)
  3. Extend CarrierUpdateLowerer to emit JoinIR for string append operations

Phase 187 does NOT implement this—we only design what "safe" means.


3. Lowerer Responsibility Separation

3.1 Detection Layer (Pattern2/4)

Responsibility: UpdateKind classification only Location: pattern2_with_break.rs, pattern4_with_continue.rs

// Phase 187 Design: What Pattern2/4 WILL check (future)
fn can_lower_string_update(update: &UpdateExpr) -> bool {
    match update {
        UpdateExpr::BinOp { rhs, .. } => {
            match rhs {
                UpdateRhs::Variable(_) => true,       // StringAppendChar
                UpdateRhs::StringLiteral(_) => true,  // StringAppendLiteral
                UpdateRhs::Other => false,            // Complex (reject)
                UpdateRhs::Const(_) => true,          // Numeric (already allowed)
            }
        }
        _ => true,
    }
}

Key Point: Pattern2/4 only perform classification—they do NOT emit JoinIR for strings.

3.2 Emission Layer (CarrierUpdateLowerer + Expr Lowerer)

Responsibility: Actual JoinIR instruction emission Location: src/mir/join_ir/lowering/carrier_update_lowerer.rs

Current State (Phase 184):

  • Handles numeric carriers only (CounterLike, AccumulationLike)
  • Emits Compute { op: Add/Sub/Mul, ... } for numeric BinOp

Future State (Phase 188+ Implementation):

  • Extend to handle StringAppendChar:
    // Emit StringBox.concat() call or equivalent
    let concat_result = emit_string_concat(lhs_value, ch_value);
    
  • Extend to handle StringAppendLiteral:
    // Emit string literal + concat
    let literal_value = emit_string_literal("...");
    let concat_result = emit_string_concat(lhs_value, literal_value);
    

Phase 187 Design: Document this separation, but do NOT implement.


4. Architecture Diagram

AST → LoopUpdateAnalyzer → UpdateKind classification
                              ↓
                    Pattern2/4.can_lower()
                    (Whitelist check only)
                              ↓
                    [ALLOW] → CarrierUpdateLowerer
                              (Emit JoinIR instructions)
                              ↓
                              JoinIR Module

                    [REJECT] → [joinir/freeze] error

Separation of Concerns:

  1. LoopUpdateAnalyzer: Extracts UpdateExpr from AST (already exists)
  2. Pattern2/4: Classifies into Allow/Reject (Phase 178 logic + Phase 188 extension)
  3. CarrierUpdateLowerer: Emits JoinIR (Phase 184 for numeric, Phase 188+ for string)

5. Representative Cases (Not Implemented)

5.1 JsonParser Update Patterns

_parse_number: num_str = num_str + ch

UpdateKind: StringAppendChar Classification:

  • num_str: carrier name
  • ch: LoopBodyLocal variable (single character from string scan)
  • RHS: UpdateRhs::Variable("ch") Decision: ALLOW (Phase 188+)

_atoi: num = num * 10 + digit

UpdateKind: AccumulationLike (numeric) Classification:

  • Nested BinOp: (num * 10) + digit
  • Currently detected as UpdateRhs::Other Decision: COMPLEX (requires BinOp tree analysis, Phase 189+)

_unescape_string: result = result + s.substring(...)

UpdateKind: Complex (method call) Classification:

  • RHS: UpdateRhs::Other (MethodCall) Decision: REJECT with Fail-Fast

5.2 UpdateKind Mapping Table

Loop Variable Update Pattern UpdateRhs UpdateKind Phase 187 Decision
num_str num_str + ch Variable("ch") StringAppendChar ALLOW (Phase 188+)
result result + "\n" StringLiteral("\n") StringAppendLiteral ALLOW (Phase 188+)
num num * 10 + digit Other (nested BinOp) Complex REJECT (Phase 189+)
result result + s.substring(...) Other (MethodCall) Complex REJECT (Fail-Fast)
pos pos + 1 Const(1) CounterLike ALLOW (Phase 178 )
sum sum + i Variable("i") AccumulationLike ALLOW (Phase 178 )

6. Next Steps (Phase 188+ Implementation)

Phase 188: StringAppendChar/Literal Implementation

Scope: Extend Pattern2/4 and CarrierUpdateLowerer to support string append.

Tasks:

  1. Extend can_lower() whitelist (Pattern2/4)

    • Accept UpdateRhs::Variable(_) for string carriers
    • Accept UpdateRhs::StringLiteral(_) for string carriers
    • Keep UpdateRhs::Other as Fail-Fast
  2. Extend CarrierUpdateLowerer (emission layer)

    • Detect carrier type (String vs Integer)
    • Emit StringBox.concat() call for string append
    • Emit Compute { Add } for numeric (existing logic)
  3. Add validation

    • Check variable scope (LoopBodyLocal or OuterLocal only)
    • Check literal type (string only)
  4. E2E Test

    • _parse_number minimal version with num_str = num_str + ch

Estimate: 3-4 hours

Phase 189+: Complex BinOp (Future)

Scope: Handle nested BinOp like num * 10 + digit.

Tasks:

  1. Extend analyze_rhs() to recursively parse BinOp trees
  2. Classify simple nested patterns (e.g., (x * 10) + y) as safe
  3. Keep truly complex patterns (e.g., method calls in BinOp) as Fail-Fast

Estimate: 5-6 hours


7. Design Constraints

7.1 Box Theory Compliance

Separation of Concerns:

  • UpdateKind classification → LoopUpdateAnalyzer (existing box)
  • Can-lower decision → Pattern2/4 (control flow box)
  • JoinIR emission → CarrierUpdateLowerer (lowering box)

No Cross-Boundary Leakage:

  • Pattern2/4 do NOT emit JoinIR directly for string operations
  • CarrierUpdateLowerer does NOT make can-lower decisions

7.2 Fail-Fast Preservation

Phase 178 Logic Untouched:

  • All UpdateRhs::StringLiteral and UpdateRhs::Other continue to trigger Fail-Fast
  • Phase 187 only documents what "safe" means—implementation is Phase 188+

Error Messages:

  • Current: "String/complex update detected, rejecting Pattern 2 (unsupported)"
  • Future (Phase 188+): More specific messages for different rejection reasons

7.3 Testability

Unit Test Separation:

  • LoopUpdateAnalyzer tests: AST → UpdateExpr extraction
  • Pattern2/4 tests: UpdateExpr → can_lower decision
  • CarrierUpdateLowerer tests: UpdateExpr → JoinIR emission

E2E Test:

  • JsonParser representative loops (Phase 188+)

8. Documentation Updates

8.1 joinir-architecture-overview.md

Add one sentence in Section 2.2 (条件式ライン):

- **LoopUpdateAnalyzer / CarrierUpdateLowerer**
  - ファイル:
    - `src/mir/join_ir/lowering/loop_update_analyzer.rs`
    - `src/mir/join_ir/lowering/carrier_update_lowerer.rs`
  - 責務:
    - ループで更新される変数carrierを検出し、UpdateExpr を保持。
    - Pattern 4 では実際に更新されるキャリアだけを残す。
    - **Phase 187設計**: String 更新は UpdateKind ベースのホワイトリストで扱う方針StringAppendChar/Literal は Phase 188+ で実装予定)。

8.2 CURRENT_TASK.md

Add Phase 187 entry:

  - [x] **Phase 187: String UpdateLowering 設計** ✅ (2025-12-09)
        - UpdateKind ベースのホワイトリスト設計doc-only
        - StringAppendChar/StringAppendLiteral を安全パターンとして定義
        - Complex (method call / nested BinOp) は Fail-Fast 維持
        - Phase 178 の Fail-Fast は完全保持
        - Phase 188+ での実装方針を確立

9. Success Criteria (Phase 187)

  • Design document created (phase187-string-update-design.md)
  • UpdateKind whitelist defined (6 categories)
  • Fail-Fast preservation confirmed (Phase 178 untouched)
  • Lowerer responsibility separation documented
  • Representative cases analyzed (JsonParser loops)
  • Architecture diagram created
  • Next steps defined (Phase 188+ implementation)
  • joinir-architecture-overview.md updated (1-sentence addition)
  • CURRENT_TASK.md updated (Phase 187 entry added)

All criteria met: Phase 187 complete (design-only).


Phase 188 Implementation Complete (2025-12-09)

Implementation Summary

Phase 188 successfully implemented StringAppendChar and StringAppendLiteral support in JoinIR patterns.

Changes Made:

  1. Pattern2/4 can_lower() Whitelist (Task 188-2)

    • Updated pattern2_with_break.rs and pattern4_with_continue.rs
    • Allow: UpdateRhs::Const, UpdateRhs::Variable, UpdateRhs::StringLiteral
    • Reject: UpdateRhs::Other (complex updates only)
    • Old behavior: Rejected all string updates
    • New behavior: Accept safe string patterns, reject only complex ones
  2. CarrierUpdateLowerer JoinIR Emission (Task 188-3)

    • Updated carrier_update_emitter.rs (both UpdateEnv and ConditionEnv versions)
    • UpdateRhs::StringLiteral(s) → Emit Const { value: ConstValue::String(s) } + BinOp
    • UpdateRhs::Variable(name) → Resolve variable, emit BinOp (handles both numeric and string)
    • UpdateRhs::Other → Return error (should be caught by can_lower)
  3. E2E Test Files (Task 188-4)

    • Created apps/tests/phase188_string_append_char.hako (Pattern 2 with break)
    • Created apps/tests/phase188_string_append_literal.hako (Pattern 4 with continue)
    • Both tests compile and run without errors
    • JoinIR generation succeeds for both patterns

Verification:

# Pattern 2: StringAppendChar
NYASH_JOINIR_CORE=1 ./target/release/hakorune apps/tests/phase188_string_append_char.hako
# Output: Pattern 2 triggered, JoinIR generated successfully

# Pattern 4: StringAppendLiteral
NYASH_JOINIR_CORE=1 ./target/release/hakorune apps/tests/phase188_string_append_literal.hako
# Output: Pattern 4 triggered, JoinIR generated successfully

Key Achievement: Phase 178's Fail-Fast is now selective - only rejects truly complex updates (method calls, nested BinOp), while allowing safe string concatenation patterns.


10. Conclusion

Phase 187 establishes a clear design for string update handling in JoinIR:

  1. Safe Patterns: CounterLike, AccumulationLike, StringAppendChar, StringAppendLiteral
  2. Unsafe Patterns: Complex (method calls, nested BinOp) → Fail-Fast
  3. Separation of Concerns: Detection (Pattern2/4) vs Emission (CarrierUpdateLowerer)
  4. Phase 178 Preservation: All Fail-Fast logic remains unchanged

No code changes in Phase 187—all design decisions documented for Phase 188+ implementation.

Next Phase: Phase 188 - Implement StringAppendChar/Literal lowering (3-4 hours estimate).