hakorune/docs/development/current/main/phase48-norm-p4-design.md
nyash-codex 4ecb6435d3 docs(joinir): Phase 48 - Normalized P4 (Continue) design
Complete design documentation for Pattern4 (continue) Normalized support,
extending unified Normalized infrastructure to all 4 loop patterns.

Design documents:
- phase48-norm-p4-design.md: Complete P4 Normalized design (380 lines)
  - Key insight: continue = immediate TailCallFn(loop_step, ...) (no new instruction!)
  - Infrastructure reuse: 95%+ from P2/P3 (only ContinueCheck step kind new)
  - Target loops prioritized:
    - ◎ _parse_array (skip whitespace) - PRIMARY (Phase 48-A)
    - ○ _parse_object (skip whitespace) - Extended
    - △ _unescape_string, _parse_string - Later
  - Control flow: ContinueCheck before Updates (skip processing early)
  - Same EnvLayout/ConditionEnv/CarrierInfo/ExitLine as P2/P3
  - Implementation strategy: dev-only → canonical (proven approach)

Architecture:
- Unified Normalized for P1/P2/P3/P4 (all patterns same pipeline)
- P4 uses loop_step(env, k_exit) skeleton (same as P2/P3)
- continue semantics: TailCallFn = skip to next iteration (CPS natural fit)

Benefits:
- 95% infrastructure reuse from P2/P3
- No new JpInst needed (continue = existing TailCallFn)
- Incremental rollout (dev → canonical)
- Clear semantic: continue = immediate recursion

Implementation roadmap:
- Phase 48-A: Minimal continue (dev-only)
- Phase 48-B: Extended patterns (multi-carrier)
- Phase 48-C: Canonical promotion

Updates:
- joinir-architecture-overview.md: Added Phase 48 section
- CURRENT_TASK.md: Phase 48 entry (Design Complete)
- phase47-norm-p3-design.md: Minor formatting

Status: Design phase complete (doc-only, no implementation yet)
Next: Phase 48-A implementation (when requested)
2025-12-12 06:06:39 +09:00

Phase 48: Normalized P4 (Continue) Design

Status: Design Phase (doc-only)
Date: 2025-12-12

Goal

Design Pattern4 (continue) Normalized architecture, extending the unified Normalized infrastructure that successfully handles P1/P2/P3.

Key insight: P4 mirrors the control flow of P2 (break). Where P2 exits the loop early, P4 skips to the next iteration early. The infrastructure is the same; only the routing differs.

Background: Unified Normalized Success

Phase 43-47 established unified Normalized for P1/P2/P3:

  • Pattern1: Simple while loops
  • Pattern2: Break loops (skip_whitespace, _atoi, _parse_number)
  • Pattern3: If-sum loops (conditional carrier updates)

Infrastructure proven:

  • Structured→Normalized→MIR(direct) pipeline
  • EnvLayout, JpInst/JpOp, StepScheduleBox
  • ConditionEnv, CarrierInfo, ExitLine
  • All patterns use same loop_step(env, k_exit) skeleton

Why P4 Uses Same Normalized

Control Flow Comparison

| Aspect | P2 (Break) | P4 (Continue) | Difference |
|---|---|---|---|
| Normal flow | Execute body, update carriers, loop | Same | Identical |
| Early exit | if (cond) break → exit loop | if (cond) continue → next iteration | Flow direction |
| Carrier updates | Before break check | After continue check | Order |
| Infrastructure | ConditionEnv, ExitLine, PHI | Same | Reusable |

Key difference: continue = TailCallFn(loop_step, env', k_exit) (immediate recursion) vs break = TailCallKont(k_exit, result) (exit to continuation).

P4 in Normalized JoinIR

// P2 (break) structure:
loop_step(env, k_exit) {
    if (header_cond) {
        // body
        if (break_cond) {
            TailCallKont(k_exit, result)  // Exit early
        }
        // update carriers
        TailCallFn(loop_step, env', k_exit)  // Loop back
    } else {
        TailCallKont(k_exit, result)  // Normal exit
    }
}

// P4 (continue) structure:
loop_step(env, k_exit) {
    if (header_cond) {
        // body
        if (continue_cond) {
            TailCallFn(loop_step, env', k_exit)  // Skip to next iteration ← continue!
        }
        // update carriers (only if NOT continued)
        TailCallFn(loop_step, env'', k_exit)  // Loop back
    } else {
        TailCallKont(k_exit, result)  // Normal exit
    }
}

Observation: continue is just an early TailCallFn call. No new JpInst needed!
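
The observation above can be made concrete with a small Rust model. This is only a sketch under assumptions: `Env`, `Next`, `run`, and the extra `kept` carrier are hypothetical names introduced for illustration (the real EnvLayout for skip-whitespace holds only `i`), and the driver loop stands in for whatever actually executes tail calls.

```rust
// Hypothetical model of the P4 skeleton; names mirror the pseudocode above,
// not the real hakorune types.

/// Loop state threaded through every call to `loop_step`.
#[derive(Debug, Clone, PartialEq)]
struct Env {
    i: usize,     // carrier: position counter
    kept: String, // assumed extra carrier so the result is observable
}

/// What one step asks the driver to do next.
enum Next {
    TailCallFn(Env),   // re-enter loop_step with the new env
    TailCallKont(Env), // hand the result to k_exit
}

fn loop_step(src: &[char], env: Env) -> Next {
    if env.i < src.len() {
        // HeaderCond holds; BodyInit binds the body-local `ch`.
        let ch = src[env.i];
        if ch == ' ' || ch == '\t' {
            // ContinueCheck taken: an early TailCallFn, only `i` advances.
            return Next::TailCallFn(Env { i: env.i + 1, kept: env.kept });
        }
        // Updates: process the non-whitespace character.
        let mut kept = env.kept;
        kept.push(ch);
        // Tail: advance and loop back.
        Next::TailCallFn(Env { i: env.i + 1, kept })
    } else {
        // Normal exit through the continuation.
        Next::TailCallKont(env)
    }
}

/// Trampoline: keeps calling loop_step until it yields to k_exit.
fn run(input: &str) -> Env {
    let src: Vec<char> = input.chars().collect();
    let mut env = Env { i: 0, kept: String::new() };
    loop {
        match loop_step(&src, env) {
            Next::TailCallFn(next) => env = next,
            Next::TailCallKont(result) => return result,
        }
    }
}
```

Calling `run("[1, \t2]")` walks all seven characters and leaves the space and tab out of `kept`. Both the continue path and the normal path produce the same `Next::TailCallFn` variant, which is the point of the observation.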

Target P4 Loops (JsonParser)

Priority Assessment

| Loop | Pattern | Complexity | Priority | Rationale |
|---|---|---|---|---|
| _parse_array (skip whitespace) | P4 minimal | Low | ◎ PRIMARY | Simple continue, single carrier (i) |
| _parse_object (skip whitespace) | P4 minimal | Low | ○ Extended | Same as _parse_array |
| _unescape_string (skip special chars) | P4 mid | Medium | △ Later | String operations, body-local |
| _parse_string (escape handling) | P4 mid | Medium | △ Later | Complex escape sequences |

Phase 48-A Target: _parse_array (skip whitespace)

Example (simplified):

local i = 0
local s = "[1, 2]"
local len = s.length()

loop(i < len) {
    local ch = s.substring(i, i+1)

    if (ch == " " || ch == "\t") {
        i = i + 1
        continue  // Skip whitespace
    }

    // Process non-whitespace character
    // ...
    i = i + 1
}

Characteristics:

  • Simple condition: ch == " " || ch == "\t" (OR pattern)
  • Single carrier: i (position counter)
  • Body-local: ch (character)
  • continue fires before the processing updates (the continue branch performs its own i = i + 1)

Normalized shape:

  • EnvLayout: { i: int }
  • StepSchedule: [HeaderCond, BodyInit(ch), ContinueCheck, Updates(process), Tail(i++)]

Normalized Components for P4

StepScheduleBox Extension

P2/P3 steps (existing):

enum StepKind {
    HeaderCond,   // loop(cond)
    BodyInit,     // local ch = ...
    BreakCheck,   // if (cond) break  (P2)
    IfCond,       // if (cond) in body  (P3)
    ThenUpdates,  // carrier updates (P3)
    Updates,      // carrier updates
    Tail,         // i = i + 1
}

P4 addition:

enum StepKind {
    // ... existing ...

    ContinueCheck,  // if (cond) continue  (P4)
}

P4 schedule:

// _parse_array skip whitespace pattern
[HeaderCond, BodyInit, ContinueCheck, Updates, Tail]

// vs P2 pattern
[HeaderCond, BodyInit, BreakCheck, Updates, Tail]

// Observation: Same structure, different check semantics!
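
To illustrate how small the difference is, here is a hedged Rust sketch that builds both minimal schedules and reports the slots where they differ. `LoopPattern`, `minimal_schedule`, and `differing_slots` are hypothetical helpers, and `StepKind` is trimmed to the variants these two schedules use.

```rust
// Sketch only: a cut-down StepKind plus hypothetical helper functions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StepKind {
    HeaderCond,
    BodyInit,
    BreakCheck,
    ContinueCheck,
    Updates,
    Tail,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LoopPattern {
    P2Break,
    P4Continue,
}

/// Builds the minimal schedule; only the check slot depends on the pattern.
fn minimal_schedule(pattern: LoopPattern) -> Vec<StepKind> {
    let check = match pattern {
        LoopPattern::P2Break => StepKind::BreakCheck,
        LoopPattern::P4Continue => StepKind::ContinueCheck,
    };
    vec![
        StepKind::HeaderCond,
        StepKind::BodyInit,
        check,
        StepKind::Updates,
        StepKind::Tail,
    ]
}

/// Returns the indices at which two schedules disagree.
fn differing_slots(a: &[StepKind], b: &[StepKind]) -> Vec<usize> {
    a.iter()
        .zip(b.iter())
        .enumerate()
        .filter(|(_, (x, y))| x != y)
        .map(|(idx, _)| idx)
        .collect()
}
```

Comparing `minimal_schedule(LoopPattern::P2Break)` with `minimal_schedule(LoopPattern::P4Continue)` yields a single differing index, the check slot at position 2.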

JpInst Reuse

No new JpInst needed! P4 uses existing instructions:

// P2 break:
If { cond, then_target: k_exit, else_target: continue_body }

// P4 continue:
If { cond, then_target: loop_step_with_tail, else_target: process_body }

Key: continue = immediate TailCallFn(loop_step, ...), not a new instruction.
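
A possible shape for the lowering, sketched in Rust. Everything here is an assumption made for illustration: `Target`, `CheckKind`, and `lower_check` are hypothetical, and the real JpInst has more variants and richer operands than a condition string. The sketch only shows that both checks lower to the same `If` instruction with different targets.

```rust
// Hypothetical stand-ins for the real JpInst and its targets; only the
// shape of the If instruction is taken from the text above.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Target {
    KExit,            // break: TailCallKont(k_exit, result)
    ContinueBody,     // P2: rest of the body after a failed break check
    LoopStepWithTail, // continue: TailCallFn(loop_step, env', k_exit)
    ProcessBody,      // P4: Updates, then Tail
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum JpInst {
    If {
        cond: String,
        then_target: Target,
        else_target: Target,
    },
}

#[derive(Debug, Clone, Copy)]
enum CheckKind {
    Break,
    Continue,
}

/// Lowers a BreakCheck or ContinueCheck step to the same If instruction;
/// only the two targets depend on the kind of check.
fn lower_check(kind: CheckKind, cond: &str) -> JpInst {
    let (then_target, else_target) = match kind {
        CheckKind::Break => (Target::KExit, Target::ContinueBody),
        CheckKind::Continue => (Target::LoopStepWithTail, Target::ProcessBody),
    };
    JpInst::If {
        cond: cond.to_string(),
        then_target,
        else_target,
    }
}
```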

EnvLayout (Same as P2)

P2 example:

struct Pattern2Env {
    i: int,      // loop param
    sum: int,    // carrier
}

P4 example (identical structure):

struct Pattern4Env {
    i: int,      // loop param (position counter)
    // No additional carriers for skip whitespace
}

No new fields needed: P4 carriers work the same way as in P2/P3.

Architecture: Unified Normalized

┌────────────────────────────────────────────────┐
│   Structured JoinIR (shared by Pattern1-4)     │
│  - ConditionEnv (unified across P2/P3/P4)      │
│  - CarrierInfo                                 │
│  - ExitLine/Boundary                           │
└──────────────┬─────────────────────────────────┘
               │
               ▼
┌────────────────────────────────────────────────┐
│   Normalized JoinIR (shared by Pattern1-4)     │  ← P4 lands here too!
│  - EnvLayout (done in P2 → extended for P3/P4) │
│  - JpInst/JpOp (existing ones suffice)         │
│  - StepScheduleBox (ContinueCheck added)       │
└──────────────┬─────────────────────────────────┘
               │
               ▼
┌────────────────────────────────────────────────┐
│   MIR (shared by Pattern1-4)                   │
└────────────────────────────────────────────────┘

Implementation Strategy

Phase 48-A: Minimal _parse_array skip whitespace (dev-only)

Goal: Prove P4 can use Normalized infrastructure with minimal additions.

Steps:

  1. ShapeGuard: Add Pattern4ContinueMinimal shape
  2. StepScheduleBox: Add ContinueCheck step kind
  3. Normalized lowering:
    • Generate If JpInst for continue check
    • then_target → immediate TailCallFn(loop_step, ...) (continue)
    • else_target → process body, then tail
  4. Test: Verify Structured→Normalized→MIR(direct) matches Structured→MIR
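
Step 1 could look roughly like the following Rust sketch. It is speculative: `LoopSummary` is a hypothetical digest of a loop body, not what the real ShapeGuard inspects, and the acceptance rule is inferred from the scope notes in this document (one continue point, no break, no nested if).

```rust
// Speculative sketch: LoopSummary and classify are hypothetical stand-ins.
#[derive(Debug, PartialEq, Eq)]
enum Shape {
    Pattern4ContinueMinimal,
    Unsupported,
}

/// Assumed digest of a loop body, for illustration only.
struct LoopSummary {
    continue_points: usize, // number of `continue` statements
    break_points: usize,    // number of `break` statements
    max_if_depth: usize,    // 1 = the single `if` guarding the continue
}

/// Accepts only the minimal shape targeted by Phase 48-A.
fn classify(summary: &LoopSummary) -> Shape {
    let minimal = summary.continue_points == 1
        && summary.break_points == 0
        && summary.max_if_depth <= 1;
    if minimal {
        Shape::Pattern4ContinueMinimal
    } else {
        Shape::Unsupported
    }
}
```

A guard this strict would reject loops with several continue points or nested conditionals, which keeps the first dev-only fixture small.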

Expected additions:

  • shape_guard.rs: +1 shape variant
  • step_schedule.rs: +1 step kind (ContinueCheck)
  • normalized.rs: +40 lines (normalize_pattern4_continue_minimal)
  • tests/normalized_joinir_min.rs: +1 P4 test

Dev fixture: Create pattern4_continue_minimal from _parse_array skip whitespace

Phase 48-B: _parse_object, _unescape_string (dev-only)

Goal: Extend to multiple carriers, string operations.

Additions:

  • Multi-carrier EnvLayout (if needed)
  • String body-local handling (already exists from P2 DigitPos)

Phase 48-C: Canonical promotion

Goal: Move P4 minimal from dev-only to canonical (like P2/P3).

Key Design Decisions

1. Continue = TailCallFn, not new instruction

Rationale: continue is semantically "skip to next iteration", which is exactly what TailCallFn(loop_step, env', k_exit) does in CPS.

Benefit: No new JpInst, reuses existing MIR generation.

2. ContinueCheck step before Updates

Rationale: the continue check must run BEFORE the Updates step, so that a taken continue skips the processing for that iteration.

P4 step order:

HeaderCond → BodyInit → ContinueCheck → Updates (processing) → Tail (increment)
                             ↓ (if true)
                        TailCallFn (skip Updates)
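
The step order can be checked with a tiny trace function, sketched here in Rust. `executed_steps` is a hypothetical helper. It records the schedule steps that run in one iteration and stops at ContinueCheck when the continue is taken. How the continue branch folds its own increment into env' is deliberately left out.

```rust
// Sketch only: a cut-down StepKind and a hypothetical trace helper.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StepKind {
    HeaderCond,
    BodyInit,
    ContinueCheck,
    Updates,
    Tail,
}

/// Returns the schedule steps executed in one iteration.
/// When the continue is taken, TailCallFn fires at ContinueCheck and the
/// remaining steps of the schedule are skipped.
fn executed_steps(schedule: &[StepKind], continue_taken: bool) -> Vec<StepKind> {
    let mut trace = Vec::new();
    for &step in schedule {
        trace.push(step);
        if step == StepKind::ContinueCheck && continue_taken {
            break;
        }
    }
    trace
}
```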

3. Same EnvLayout as P2

Rationale: P4 carriers (position, accumulators) have the same types as P2 carriers.

Benefit: No new EnvLayout design, reuses P2 infrastructure 100%.

Comparison with P2/P3

| Component | P2 (Break) | P3 (If-Sum) | P4 (Continue) | Shared? |
|---|---|---|---|---|
| EnvLayout | ✓ | ✓ | ✓ | Yes |
| ConditionEnv | ✓ | ✓ | ✓ | Yes |
| CarrierInfo | ✓ | ✓ | ✓ | Yes |
| ExitLine | ✓ | ✓ | ✓ | Yes |
| StepKind | BreakCheck | IfCond, ThenUpdates | ContinueCheck | Additive |
| JpInst | If, TailCallFn, TailCallKont | Same | Same | Yes |
| Control flow | Exit early | Conditional update | Skip early | Different |

Infrastructure reuse: 95%+ (only StepKind and control flow routing differ)

Testing Strategy

Phase 48-A: Minimal

Test: test_normalized_pattern4_continue_minimal

#[cfg(feature = "normalized_dev")]
#[test]
fn test_normalized_pattern4_continue_minimal() {
    let source = r#"
        local i = 0
        local n = 5
        local count = 0
        loop(i < n) {
            if (i == 2) {
                i = i + 1
                continue
            }
            count = count + 1
            i = i + 1
        }
        print("count = " + count.to_string())
    "#;

    // Compare Structured→MIR vs Normalized→MIR(direct)
    assert_vm_output_matches(source);
}

Expected output:

count = 4  (skipped i==2, so counted 0,1,3,4)
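
The equivalence the test relies on can be previewed outside the compiler with a plain Rust model of the same loop. This is a sketch under assumptions: `structured_count`, `normalized_count`, and `Next` are hypothetical names, and the two functions only imitate the Structured and Normalized forms of the fixture. They do not exercise the real pipeline.

```rust
/// Structured form: an ordinary loop with `continue`.
fn structured_count(n: i64) -> i64 {
    let (mut i, mut count) = (0, 0);
    while i < n {
        if i == 2 {
            i += 1;
            continue; // skip the count update for i == 2
        }
        count += 1;
        i += 1;
    }
    count
}

/// Hypothetical result of one normalized step.
enum Next {
    TailCallFn(i64, i64), // (i, count) for the next iteration
    TailCallKont(i64),    // final count handed to k_exit
}

/// Normalized form: continue is an early TailCallFn with `count` unchanged.
fn loop_step(i: i64, count: i64, n: i64) -> Next {
    if i < n {
        if i == 2 {
            return Next::TailCallFn(i + 1, count);
        }
        Next::TailCallFn(i + 1, count + 1)
    } else {
        Next::TailCallKont(count)
    }
}

/// Driver that replays tail calls until the continuation is reached.
fn normalized_count(n: i64) -> i64 {
    let (mut i, mut count) = (0, 0);
    loop {
        match loop_step(i, count, n) {
            Next::TailCallFn(next_i, next_count) => {
                i = next_i;
                count = next_count;
            }
            Next::TailCallKont(result) => return result,
        }
    }
}
```

For `n = 5` both functions return 4, matching the expected output above.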

Success Criteria

Phase 48-A complete when:

  1. test_normalized_pattern4_continue_minimal passes (dev-only)
  2. Structured→Normalized→MIR(direct) output matches Structured→MIR
  3. All 938+ tests still pass (no regressions)
  4. ShapeGuard can detect Pattern4ContinueMinimal
  5. Documentation updated (architecture overview, CURRENT_TASK)

Phase 48-B complete when:

  1. _parse_object, _unescape_string tests pass (dev-only)
  2. Multi-carrier + string operations work in P4 Normalized

Phase 48-C complete when:

  1. P4 minimal promoted to canonical (always Normalized)
  2. Performance validated

Scope Management

In Scope (Phase 48-A):

  • Minimal P4 (simple continue pattern)
  • Dev-only Normalized support
  • Reuse P2/P3 infrastructure (ConditionEnv, CarrierInfo, ExitLine)

Out of Scope (deferred):

  • Complex P4 patterns (nested if, multiple continue points)
  • Canonical promotion (Phase 48-C)
  • Selfhost loops (later phase)

File Impact Estimate

Expected modifications (Phase 48-A):

  1. shape_guard.rs: +20 lines (Pattern4ContinueMinimal shape)
  2. step_schedule.rs: +10 lines (ContinueCheck step kind)
  3. normalized.rs: +40 lines (normalize_pattern4_continue_minimal)
  4. tests/normalized_joinir_min.rs: +30 lines (P4 test)
  5. phase48-norm-p4-design.md: +250 lines (this doc)
  6. joinir-architecture-overview.md: +10 lines (Phase 48 section)
  7. CURRENT_TASK.md: +5 lines (Phase 48 entry)

Total: ~365 lines (+), purely additive (no P1/P2/P3 code changes)

Benefits

  1. Infrastructure reuse: 95% of P2/P3 Normalized code works for P4
  2. Unified pipeline: All patterns (P1/P2/P3/P4) use same Normalized
  3. Incremental rollout: Dev-only → canonical, proven approach from P2/P3
  4. Semantic clarity: continue = immediate TailCallFn (no new concepts)

Next Steps After Phase 48

  1. Phase 48-A implementation: Minimal P4 (continue) dev-only
  2. Phase 48-B: Extended P4 (multi-carrier, string ops)
  3. Phase 48-C: Canonical promotion
  4. Selfhost loops: Complex patterns from selfhost compiler

References