hakorune/docs/development/current/main/phase131-6-next-steps.md

# Phase 131-6: Next Steps - PHI Bug Fix

## Summary

**Problem**: LLVM backend generates infinite loop for `loop_min_while.hako`
**Root Cause**: PHI node incoming values not properly wired in `finalize_phis()`
**Impact**: P0 - Breaks basic loop functionality

## Recommended Fix Strategy

### Option A: Structural Fix (Recommended)

**Approach**: Fix the PHI wiring logic in `finalize_phis()` to properly connect incoming values.

**Location**: `/home/tomoaki/git/hakorune-selfhost/src/llvm_py/llvm_builder.py:601-735`

**Suspected Issue**:
```python
# Lines 679-681: Self-carry logic
if vs == int(dst_vid) and init_src_vid is not None:
    vs = int(init_src_vid)   # ← May incorrectly replace %12 with %2
```

**Fix Hypothesis**:
The self-carry logic is meant to handle PHI nodes that reference themselves, but it may be incorrectly triggering for normal loop-carried dependencies. We need to:

1. Add trace logging to see what values are being resolved
2. Check if `vs == int(dst_vid)` is incorrectly matching %12 (the updated value) as a self-reference
3. Verify that `_value_at_end_i64()` is correctly retrieving the value of %12 from bb7

**Debug Commands**:
```bash
# Enable verbose logging (if available)
export NYASH_CLI_VERBOSE=1
export NYASH_LLVM_DEBUG=1

# Generate LLVM IR to inspect
tools/build_llvm.sh apps/tests/loop_min_while.hako -o /tmp/loop_test

# Check generated LLVM IR
llvm-dis /tmp/loop_test.o  # If object contains IR
# Or add --emit-llvm-ir flag if available
```

**Steps**:
1. Add debug prints in `finalize_phis()` to log:
   - Which incoming values are being wired: `(block_id, dst_vid, [(pred_id, value_id)])`
   - What `nearest_pred_on_path()` returns for each incoming edge
   - What `_value_at_end_i64()` returns for each value

2. Compare debug output between working (VM) and broken (LLVM) paths

3. Fix the logic that's causing %12 to be ignored or replaced

4. Verify fix doesn't break Case A or B2

### Option B: Workaround (Not Recommended)

Disable loop optimization or use VM backend for loops. This doesn't solve the root cause.

## Acceptance Tests

### Must Pass

1. **Simple Add** (already passing):
```bash
./target/release/hakorune /tmp/simple_add.hako  # Should print 1
```

2. **Loop Min While** (currently failing):
```bash
tools/build_llvm.sh apps/tests/loop_min_while.hako -o /tmp/loop_test
timeout 2 /tmp/loop_test  # Should print 0\n1\n2 and exit
```

3. **Phase 87 LLVM Min** (regression check):
```bash
tools/build_llvm.sh apps/tests/phase87_llvm_exe_min.hako -o /tmp/phase87_test
/tmp/phase87_test
echo $?  # Should be 42
```

### Should Not Regress

- Case A: `phase87_llvm_exe_min.hako` ✅
- Case B2: Simple print without loop ✅
- VM backend: All existing VM tests ✅

## Implementation Checklist

- [ ] Add debug logging to `finalize_phis()`
- [ ] Identify which incoming value is being incorrectly wired
- [ ] Fix the wiring logic
- [ ] Test Case B (loop_min_while) - must output `0\n1\n2`
- [ ] Test Case A regression - must exit with 42
- [ ] Test Case B2 regression - must print 42
- [ ] Document the fix in this file and phase131-3-llvm-lowering-inventory.md
- [ ] Consider adding MIR→LLVM IR validation pass

## Timeline

- **Phase 131-6 Diagnosis**: 2025-12-14 ✅ Complete
- **Phase 131-7 Fix**: TBD
- **Phase 131-8 Verification**: TBD

## Related Documents

- [Phase 131-6 Diagnosis](phase131-6-ssa-dominance-diagnosis.md) - Full diagnostic report
- [Phase 131-3 Inventory](phase131-3-llvm-lowering-inventory.md) - Test case matrix
- [LLVM Builder Code](../../src/llvm_py/llvm_builder.py) - Implementation

## Notes for Future Self

**Why MIR is correct but LLVM is wrong**:
- MIR SSA form verified ✅
- VM execution verified ✅
- LLVM emission succeeds ✅
- LLVM linking succeeds ✅
- **LLVM runtime fails** ❌ → Bug is in IR generation, not MIR

**Key Insight**:
The bug is NOT in the MIR builder or JoinIR merger. The bug is specifically in how the Python LLVM builder (`llvm_builder.py`) translates MIR PHI nodes into LLVM IR PHI nodes. This is a **translation bug**, not a **semantic bug**.

**Architecture Note**:
This confirms the value of the 2-pillar strategy (VM + LLVM). The VM serves as a reference implementation to verify MIR correctness before blaming the frontend.
docs(phase131): Phase 131-6 根治調査完了 - MIR正常/LLVM壊れてる確定 Phase 131-6 根治調査: SSA Dominance 診断完了確定結果: - ✅ MIR は正しい（SSA 形式完璧、use-before-def なし） - ✅ Rust VM は正常（0,1,2 を出力して停止） - ❌ LLVM backend が壊れている（0 を無限に出力）根本原因特定: - 場所: src/llvm_py/llvm_builder.py の finalize_phis() (lines 601-735) - 問題: PHI incoming value wiring が壊れている - 疑わしいコード: lines 679-681 の self-carry logic - 結果: PHI %3 が常に初期値 0 を返す → ループカウンタ増えないテスト結果: - Simple Add: VM ✅ 1, LLVM ✅ 1 (PASS) - Loop Min While: VM ✅ 0,1,2, LLVM ❌ 0 forever (BUG) - Phase 87 Min: VM ✅ 42, LLVM ✅ 42 (PASS) 新規ドキュメント: - phase131-6-ssa-dominance-diagnosis.md: 完全な診断結果 - phase131-6-next-steps.md: 修正戦略と実装チェックリスト - phase131-3-llvm-lowering-inventory.md: 更新済み Next: Phase 131-7 (finalize_phis 修正) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> 2025-12-14 07:03:38 +09:00			`# Phase 131-6: Next Steps - PHI Bug Fix`

			`## Summary`

			Problem: LLVM backend generates infinite loop for `loop_min_while.hako`
			Root Cause: PHI node incoming values not properly wired in `finalize_phis()`
			`Impact: P0 - Breaks basic loop functionality`

			`## Recommended Fix Strategy`

			`### Option A: Structural Fix (Recommended)`

			Approach: Fix the PHI wiring logic in `finalize_phis()` to properly connect incoming values.

			Location: `/home/tomoaki/git/hakorune-selfhost/src/llvm_py/llvm_builder.py:601-735`

			`Suspected Issue:`
			```python
			`# Lines 679-681: Self-carry logic`
			`if vs == int(dst_vid) and init_src_vid is not None:`
			`vs = int(init_src_vid) # ← May incorrectly replace %12 with %2`
			```

			`Fix Hypothesis:`
			`The self-carry logic is meant to handle PHI nodes that reference themselves, but it may be incorrectly triggering for normal loop-carried dependencies. We need to:`

			`1. Add trace logging to see what values are being resolved`
			2. Check if `vs == int(dst_vid)` is incorrectly matching %12 (the updated value) as a self-reference
			3. Verify that `_value_at_end_i64()` is correctly retrieving the value of %12 from bb7

			`Debug Commands:`
			```bash
			`# Enable verbose logging (if available)`
			`export NYASH_CLI_VERBOSE=1`
			`export NYASH_LLVM_DEBUG=1`

			`# Generate LLVM IR to inspect`
			`tools/build_llvm.sh apps/tests/loop_min_while.hako -o /tmp/loop_test`

			`# Check generated LLVM IR`
			`llvm-dis /tmp/loop_test.o # If object contains IR`
			`# Or add --emit-llvm-ir flag if available`
			```

			`Steps:`
			1. Add debug prints in `finalize_phis()` to log:
			- Which incoming values are being wired: `(block_id, dst_vid, [(pred_id, value_id)])`
			- What `nearest_pred_on_path()` returns for each incoming edge
			- What `_value_at_end_i64()` returns for each value

			`2. Compare debug output between working (VM) and broken (LLVM) paths`

			`3. Fix the logic that's causing %12 to be ignored or replaced`

			`4. Verify fix doesn't break Case A or B2`

			`### Option B: Workaround (Not Recommended)`

			`Disable loop optimization or use VM backend for loops. This doesn't solve the root cause.`

			`## Acceptance Tests`

			`### Must Pass`

			`1. Simple Add (already passing):`
			```bash
			`./target/release/hakorune /tmp/simple_add.hako # Should print 1`
			```

			`2. Loop Min While (currently failing):`
			```bash
			`tools/build_llvm.sh apps/tests/loop_min_while.hako -o /tmp/loop_test`
			`timeout 2 /tmp/loop_test # Should print 0\n1\n2 and exit`
			```

			`3. Phase 87 LLVM Min (regression check):`
			```bash
			`tools/build_llvm.sh apps/tests/phase87_llvm_exe_min.hako -o /tmp/phase87_test`
			`/tmp/phase87_test`
			`echo $? # Should be 42`
			```

			`### Should Not Regress`

			- Case A: `phase87_llvm_exe_min.hako` ✅
			`- Case B2: Simple print without loop ✅`
			`- VM backend: All existing VM tests ✅`

			`## Implementation Checklist`

			- [ ] Add debug logging to `finalize_phis()`
			`- [ ] Identify which incoming value is being incorrectly wired`
			`- [ ] Fix the wiring logic`
			- [ ] Test Case B (loop_min_while) - must output `0\n1\n2`
			`- [ ] Test Case A regression - must exit with 42`
			`- [ ] Test Case B2 regression - must print 42`
			`- [ ] Document the fix in this file and phase131-3-llvm-lowering-inventory.md`
			`- [ ] Consider adding MIR→LLVM IR validation pass`

			`## Timeline`

			`- Phase 131-6 Diagnosis: 2025-12-14 ✅ Complete`
			`- Phase 131-7 Fix: TBD`
			`- Phase 131-8 Verification: TBD`

			`## Related Documents`

			`- [Phase 131-6 Diagnosis](phase131-6-ssa-dominance-diagnosis.md) - Full diagnostic report`
			`- [Phase 131-3 Inventory](phase131-3-llvm-lowering-inventory.md) - Test case matrix`
			`- [LLVM Builder Code](../../src/llvm_py/llvm_builder.py) - Implementation`

			`## Notes for Future Self`

			`Why MIR is correct but LLVM is wrong:`
			`- MIR SSA form verified ✅`
			`- VM execution verified ✅`
			`- LLVM emission succeeds ✅`
			`- LLVM linking succeeds ✅`
			`- LLVM runtime fails ❌ → Bug is in IR generation, not MIR`

			`Key Insight:`
			The bug is NOT in the MIR builder or JoinIR merger. The bug is specifically in how the Python LLVM builder (`llvm_builder.py`) translates MIR PHI nodes into LLVM IR PHI nodes. This is a translation bug, not a semantic bug.

			`Architecture Note:`
			`This confirms the value of the 2-pillar strategy (VM + LLVM). The VM serves as a reference implementation to verify MIR correctness before blaming the frontend.`