hakorune/docs/development/current/main/investigations/phase132-llvm-exit-phi-wrong-result.md
nyash-codex 447d4ea246 feat(llvm): Phase 132 - Pattern 1 exit value parity fix + Box-First refactoring
## Phase 132: Exit PHI Value Parity Fix

### Problem
Pattern 1 (Simple While) returned 0 instead of final loop variable value (3)
- VM: RC: 3  (correct)
- LLVM: Result: 0  (wrong)

### Root Cause (Two Layers)
1. **JoinIR/Boundary**: Missing exit_bindings → ExitLineReconnector not firing
2. **LLVM Python**: block_end_values snapshot dropping PHI values

### Fix
**JoinIR** (simple_while_minimal.rs):
- Jump(k_exit, [i_param]) passes exit value

**Boundary** (pattern1_minimal.rs):
- Added LoopExitBinding with carrier_name="i", role=LoopState
- Enables ExitLineReconnector to update variable_map

**LLVM** (block_lower.py):
- Use predeclared_ret_phis for reliable PHI filtering
- Protect builder.vmap PHIs from overwrites (SSOT principle)

### Result
- VM: RC: 3
- LLVM: Result: 3
- VM/LLVM parity achieved

## Phase 132-Post: Box-First Refactoring

### Rust Side
**JoinModule::require_function()** (mod.rs):
- Encapsulate function search logic
- 10 lines → 1 line (90% reduction)
- Reusable for Pattern 2-5

### Python Side
**PhiManager Box** (phi_manager.py - new):
- Centralized PHI lifecycle management
- 47 lines → 8 lines (83% reduction)
- SSOT: builder.vmap owns PHIs
- Fail-Fast: No silent overwrites

**Integration**:
- LLVMBuilder: Added phi_manager
- block_lower.py: Delegated to PhiManager
- tagging.py: Register PHIs with manager

### Documentation
**New Files**:
- docs/development/architecture/exit-phi-design.md
- docs/development/current/main/investigations/phase132-llvm-exit-phi-wrong-result.md
- docs/development/current/main/phases/phase-132/

**Updated**:
- docs/development/current/main/10-Now.md
- docs/development/current/main/phase131-3-llvm-lowering-inventory.md

### Design Principles
- Box-First: Logic encapsulated in classes/methods
- SSOT: Single Source of Truth (builder.vmap for PHIs)
- Fail-Fast: Early explicit failures, no fallbacks
- Separation of Concerns: 3-layer architecture

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-15 03:17:31 +09:00


Phase 132: LLVM Exit PHI=0 Bug Investigation & Fix

  • Date: 2025-12-15
  • Status: Fixed
  • Impact: Critical - Exit PHIs from loops were returning 0 instead of correct values

Problem Statement

The LLVM backend returned 0 for exit PHI values in simple while loops, while the VM backend correctly returned the final loop variable value.

Test Case

static box Main {
  main() {
    local i = 0
    loop(i < 3) { i = i + 1 }
    return i  // Should return 3, was returning 0 in LLVM
  }
}

  • Expected: Result: 3 (matches VM)
  • Actual (before fix): Result: 0
  • MIR: Correct (bb3: %1 = phi [%3, bb6]; ret %1)
  • LLVM IR (before fix): ret i64 0 (wrong!)
  • LLVM IR (after fix): ret i64 %"phi_1" (correct!)

Root Cause Analysis

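This bug spanned two layers:

  1. In the JoinIR/Boundary layer, the exit value did not pass through the boundary (so even the VM could return 0).
  2. In the LLVM Python layer, the PHI SSOT was broken, so the exit PHI became 0 (LLVM broke even when the VM was correct).

This page mainly records the root fix for (2), the LLVM Python layer.
The fix for (1) landed separately on the code side as part of Phase 132 (see the list of modified files).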

Investigation Path

  1. PHI filtering issue in vmap_cur initialization

    • Confirmed: Filter relied on phi.basic_block attribute
    • Issue: llvmlite sets phi.basic_block = None until IR finalization
    • Filter at block_lower.py:323-365 was silently dropping ALL PHIs
  2. builder.vmap overwrite issue

    • Confirmed: The real root cause!

The Actual Bug

Two separate issues combined to cause the bug:

Issue 1: Unreliable PHI.basic_block Attribute

  • llvmlite's PHI instructions have basic_block = None when created
  • Filter logic at block_lower.py:326-340 relied on phi.basic_block.name comparison
  • Since basic_block was always None, filter excluded ALL PHIs from vmap_cur
  • Fix: Use predeclared_ret_phis dict instead of basic_block attribute

Issue 2: builder.vmap PHI Overwrites (Critical!)

At block_lower.py:437-448, Pass A syncs created values to builder.vmap:

# Phase 131-7: Sync ALL created values to global vmap
for vid in created_ids:
    val = vmap_cur.get(vid)
    if val is not None:
        builder.vmap[vid] = val  # ❌ Unconditional overwrite!

The Fatal Sequence:

  1. Pass A setup: Creates PHI v1 for bb3, stores in builder.vmap[1]
  2. Pass A processes bb0:
    • vmap_cur filters out v1 PHI (not from bb0)
    • const v1 instruction writes to vmap_cur[1]
    • Line 444: Syncs vmap_cur[1] → builder.vmap[1], overwriting PHI!
  3. Pass A processes bb3:
    • vmap_cur initialized from builder.vmap
    • builder.vmap[1] is now the const (not PHI!)
    • return v1 uses const 0 instead of PHI
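
The sequence above can be reproduced without llvmlite. The following is a minimal sketch using plain dicts: FakePhi and FakeConst are hypothetical stand-ins for llvmlite values (not the harness's real types), and the value IDs mirror the test case.

```python
class FakePhi:
    """Hypothetical stand-in for a PHI node; detected via add_incoming."""

    def __init__(self, name):
        self.name = name

    def add_incoming(self, value, block):
        pass  # Wiring is irrelevant for this demonstration.


class FakeConst:
    """Hypothetical stand-in for an integer constant."""

    def __init__(self, value):
        self.value = value


def unconditional_sync(global_vmap, vmap_cur, created_ids):
    """Mirror of the pre-fix loop: every created value overwrites the global map."""
    for vid in created_ids:
        val = vmap_cur.get(vid)
        if val is not None:
            global_vmap[vid] = val


# Step 1: setup creates the PHI placeholder for v1 (owned by bb3).
global_vmap = {1: FakePhi("phi_1")}

# Step 2: bb0 starts without the PHI, then `const v1` writes to the same ID.
vmap_bb0 = {}
vmap_bb0[1] = FakeConst(0)
unconditional_sync(global_vmap, vmap_bb0, created_ids=[1])

# Step 3: bb3 copies the global map and resolves v1 for `return v1`.
vmap_bb3 = dict(global_vmap)
returned = vmap_bb3[1]

print(type(returned).__name__, hasattr(returned, "add_incoming"))
```

In this sketch the value resolved in bb3 is the constant, not the PHI, which matches the `ret i64 0` seen in the generated IR.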

The Fix

Fix 1: PHI Filtering (block_lower.py:320-347)

Before (unreliable basic_block check):

if hasattr(_val, 'add_incoming'):
    bb_of = getattr(getattr(_val, 'basic_block', None), 'name', None)
    bb_name = getattr(bb, 'name', None)
    keep = (bb_of == bb_name)  # ❌ Always False! bb_of is None

After (use predeclared_ret_phis dict):

if hasattr(_val, 'add_incoming'):  # Is it a PHI?
    phi_key = (int(bid), int(_vid))
    if phi_key in predecl_phis:
        keep = True  # ✅ Reliable tracking
    else:
        keep = False  # Avoid namespace collision
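
To make the ownership check concrete, here is a hedged sketch of initializing a per-block map from the global one. The function name init_vmap_cur, the block ID 4 for the loop header, and the string stand-in for a non-PHI value are assumptions for illustration; only the (block_id, value_id) key shape comes from the fix above.

```python
from types import SimpleNamespace


def make_phi(name):
    # Hypothetical PHI stand-in: anything exposing add_incoming counts as a PHI.
    return SimpleNamespace(name=name, add_incoming=lambda value, block: None)


def init_vmap_cur(global_vmap, predecl_phis, bid):
    """Copy global values into a per-block map, keeping only PHIs owned by `bid`."""
    vmap_cur = {}
    for vid, val in global_vmap.items():
        if hasattr(val, "add_incoming"):
            # Ownership comes from the explicit registry, not from the PHI object.
            if (int(bid), int(vid)) not in predecl_phis:
                continue
        vmap_cur[vid] = val
    return vmap_cur


phi_1 = make_phi("phi_1")
phi_3 = make_phi("phi_3")
global_vmap = {1: phi_1, 3: phi_3, 7: "const 3"}
# Assumed ownership: phi_1 belongs to block 3, phi_3 to a loop header block 4.
predecl_phis = {(3, 1): phi_1, (4, 3): phi_3}

exit_block_map = init_vmap_cur(global_vmap, predecl_phis, bid=3)
entry_block_map = init_vmap_cur(global_vmap, predecl_phis, bid=0)
```

With these inputs, the exit block sees its own PHI plus the non-PHI value, while the entry block sees no PHIs at all.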

Fix 2: Protect builder.vmap PHIs (block_lower.py:437-455)

Before (unconditional overwrite):

for vid in created_ids:
    val = vmap_cur.get(vid)
    if val is not None:
        builder.vmap[vid] = val  # ❌ Overwrites PHIs!

After (PHI protection):

for vid in created_ids:
    val = vmap_cur.get(vid)
    if val is not None:
        existing = builder.vmap.get(vid)
        # Don't overwrite existing PHIs - SSOT principle
        if existing is not None and hasattr(existing, 'add_incoming'):
            continue  # ✅ Skip sync, preserve PHI
        builder.vmap[vid] = val
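
The guarded loop can also be exercised in isolation. This is a sketch under assumptions: sync_created_values and is_phi are hypothetical helper names, the string values stand in for lowered instructions, and returning the skipped IDs is an addition for observability that the fix above does not include.

```python
from types import SimpleNamespace


def is_phi(value):
    """Duck-typed PHI check, the same test the harness code uses."""
    return hasattr(value, "add_incoming")


def sync_created_values(global_vmap, vmap_cur, created_ids):
    """Sync per-block values into the global map without replacing existing PHIs."""
    skipped = []
    for vid in created_ids:
        val = vmap_cur.get(vid)
        if val is None:
            continue
        existing = global_vmap.get(vid)
        if existing is not None and is_phi(existing):
            skipped.append(vid)  # Preserve the PHI; the global map stays the SSOT.
            continue
        global_vmap[vid] = val
    return skipped


phi_1 = SimpleNamespace(name="phi_1", add_incoming=lambda value, block: None)
global_vmap = {1: phi_1}
vmap_cur = {1: "const 0", 2: "add"}

skipped = sync_created_values(global_vmap, vmap_cur, created_ids=[1, 2])
```

Here v1 keeps its PHI while v2, which has no PHI registered, is synced as before.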

Verification

Test Results

# ✅ LLVM matches VM
NYASH_LLVM_USE_HARNESS=1 NYASH_LLVM_STRICT=1 ./target/release/hakorune --backend llvm /tmp/p1_return_i.hako
# Output: Result: 3

# ✅ VM baseline
./target/release/hakorune --backend vm /tmp/p1_return_i.hako
# Output: RC: 3

Generated LLVM IR Comparison

Before (wrong):

bb3:
  %"phi_1" = phi  i64 [%"phi_3", %"bb6"]
  ret i64 0  ; ❌ Hardcoded 0!

After (correct):

bb3:
  %"phi_1" = phi  i64 [%"phi_3", %"bb6"]
  ret i64 %"phi_1"  ; ✅ Uses PHI value!

Design Lessons

The SSOT Principle

builder.vmap is the Single Source of Truth for PHI nodes:

  • PHIs are created once in setup_phi_placeholders
  • PHIs must NEVER be overwritten by later instructions
  • vmap_cur is per-block and must filter PHIs correctly

PHI Ownership Tracking

  • llvmlite limitation: PHI.basic_block is None until finalization
  • Solution: Explicit tracking via predeclared_ret_phis: Dict[(block_id, value_id), PHI]

Fail-Fast vs Silent Failures

The original filter silently dropped PHIs via broad exception handling:

except Exception:
    keep = False  # ❌ Silent failure!

Better approach: Explicit checks with trace logging for debugging.
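
One possible shape for such an explicit check is sketched below. It assumes NYASH_LLVM_TRACE_PHI=1 enables tracing as listed under the debug variables; the helper names and the log line format are hypothetical, not the harness's actual output.

```python
import os
import sys


def trace_phi(message):
    """Emit a PHI trace line to stderr only when tracing is enabled."""
    if os.environ.get("NYASH_LLVM_TRACE_PHI") == "1":
        print(f"[phi] {message}", file=sys.stderr)


def keep_phi(predecl_phis, bid, vid):
    """Decide whether block `bid` owns PHI `vid`, failing loudly on bad IDs."""
    try:
        key = (int(bid), int(vid))
    except (TypeError, ValueError) as exc:
        # Narrow exception handling: a malformed ID is a bug, not "drop the PHI".
        raise ValueError(f"invalid PHI key: block={bid!r} value={vid!r}") from exc
    keep = key in predecl_phis
    trace_phi(f"filter block={key[0]} value={key[1]} keep={keep}")
    return keep


registry = {(3, 1): object()}
owned = keep_phi(registry, 3, 1)
foreign = keep_phi(registry, 0, 1)
```

The difference from the original filter is that only the ID conversion is guarded, and a failure there raises instead of quietly setting keep = False.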

Related Phases

  • Phase 131: Block_end_values SSOT system
  • Phase 131-12: VMap snapshot investigation
  • Phase 131-14-B: Jump-only block resolution

Files Modified

JoinIR/Boundary layer (make the SSOT of the exit value explicit at the boundary)

  • src/mir/join_ir/lowering/simple_while_minimal.rs (Jump(k_exit, [i_param]) passes the exit value)
  • src/mir/builder/control_flow/joinir/patterns/pattern1_minimal.rs (creates a LoopExitBinding and sets it on the boundary)

LLVM Python layer (preserve the PHI SSOT)

  • src/llvm_py/builders/block_lower.py
    • Switched PHI filtering to be based on predeclared_ret_phis (removing the dependence on phi.basic_block)
    • When syncing to builder.vmap, existing PHIs are not overwritten (PHIs are protected as the SSOT)

Debug Environment Variables

NYASH_LLVM_STRICT=1           # Fail-fast on errors
NYASH_LLVM_TRACE_PHI=1        # PHI wiring traces
NYASH_LLVM_TRACE_VMAP=1       # VMap operation traces
NYASH_LLVM_DUMP_IR=/tmp/x.ll  # Dump generated IR

Acceptance Criteria

  • /tmp/p1_return_i.hako returns 3 in LLVM (was 0)
  • STRICT mode enabled, no fallback to 0
  • VM and LLVM results match
  • No regression on Phase 131 test cases
  • Generated LLVM IR uses ret i64 %phi_1 not ret i64 0

Next Steps

  • Add regression test for exit PHI patterns
  • Document PHI ownership model in the LLVM harness docs (SSOT: phase131-3-llvm-lowering-inventory.md)
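
As a starting point for the regression test, the parity check itself could look like the sketch below. It assumes the two backends print a line of the form "RC: N" (VM) and "Result: N" (LLVM), as in the verification output above; the helper names are hypothetical, and capturing the output from the hakorune binary is left out.

```python
import re


def parse_exit_value(output, label):
    """Extract the integer from a line of the form '<label>: <int>'."""
    pattern = rf"^{re.escape(label)}:\s*(-?\d+)\s*$"
    match = re.search(pattern, output, re.MULTILINE)
    if match is None:
        raise ValueError(f"no '{label}: <int>' line found in output")
    return int(match.group(1))


def check_parity(vm_output, llvm_output):
    """Return the shared exit value, or raise if the backends disagree."""
    vm_value = parse_exit_value(vm_output, "RC")
    llvm_value = parse_exit_value(llvm_output, "Result")
    if vm_value != llvm_value:
        raise AssertionError(f"VM/LLVM mismatch: VM={vm_value}, LLVM={llvm_value}")
    return vm_value


# Captured outputs would normally come from running both backends on one .hako file.
parity_value = check_parity("RC: 3\n", "Result: 3\n")
```

Feeding the pre-fix LLVM output ("Result: 0") into check_parity raises, which is the failure this regression test should catch.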