From 35db1f8d782c972399e818957fb0f30972c548a9 Mon Sep 17 00:00:00 2001 From: nyash-codex Date: Sun, 14 Dec 2025 07:03:38 +0900 Subject: [PATCH] =?UTF-8?q?docs(phase131):=20Phase=20131-6=20=E6=A0=B9?= =?UTF-8?q?=E6=B2=BB=E8=AA=BF=E6=9F=BB=E5=AE=8C=E4=BA=86=20-=20MIR?= =?UTF-8?q?=E6=AD=A3=E5=B8=B8/LLVM=E5=A3=8A=E3=82=8C=E3=81=A6=E3=82=8B?= =?UTF-8?q?=E7=A2=BA=E5=AE=9A?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Phase 131-6 根治調査: SSA Dominance 診断完了 確定結果: - ✅ MIR は正しい(SSA 形式完璧、use-before-def なし) - ✅ Rust VM は正常(0,1,2 を出力して停止) - ❌ LLVM backend が壊れている(0 を無限に出力) 根本原因特定: - 場所: src/llvm_py/llvm_builder.py の finalize_phis() (lines 601-735) - 問題: PHI incoming value wiring が壊れている - 疑わしいコード: lines 679-681 の self-carry logic - 結果: PHI %3 が常に初期値 0 を返す → ループカウンタ増えない テスト結果: - Simple Add: VM ✅ 1, LLVM ✅ 1 (PASS) - Loop Min While: VM ✅ 0,1,2, LLVM ❌ 0 forever (BUG) - Phase 87 Min: VM ✅ 42, LLVM ✅ 42 (PASS) 新規ドキュメント: - phase131-6-ssa-dominance-diagnosis.md: 完全な診断結果 - phase131-6-next-steps.md: 修正戦略と実装チェックリスト - phase131-3-llvm-lowering-inventory.md: 更新済み Next: Phase 131-7 (finalize_phis 修正) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 --- .../phase131-3-llvm-lowering-inventory.md | 35 ++- .../current/main/phase131-6-next-steps.md | 125 ++++++++++ .../phase131-6-ssa-dominance-diagnosis.md | 216 ++++++++++++++++++ 3 files changed, 368 insertions(+), 8 deletions(-) create mode 100644 docs/development/current/main/phase131-6-next-steps.md create mode 100644 docs/development/current/main/phase131-6-ssa-dominance-diagnosis.md diff --git a/docs/development/current/main/phase131-3-llvm-lowering-inventory.md b/docs/development/current/main/phase131-3-llvm-lowering-inventory.md index c1564952..ff2126e8 100644 --- a/docs/development/current/main/phase131-3-llvm-lowering-inventory.md +++ b/docs/development/current/main/phase131-3-llvm-lowering-inventory.md @@ -8,7 +8,7 @@ | Case | File | Emit | Link | Run | Notes | |------|------|------|------|-----|-------| | A | `apps/tests/phase87_llvm_exe_min.hako` | ✅ | ✅ | ✅ | **PASS** - Simple return 42, no BoxCall, exit code verified | -| B | `apps/tests/loop_min_while.hako` | ✅ | ✅ | ❌ | **TAG-RUN** - EMIT/LINK fixed (Phase 131-5), infinite loop in runtime (PHI update bug) | +| B | `apps/tests/loop_min_while.hako` | ✅ | ✅ | ❌ | **TAG-RUN** - EMIT/LINK fixed (Phase 131-5), **infinite loop in runtime** (PHI incoming value bug - Phase 131-6) | | B2 | `/tmp/case_b_simple.hako` | ✅ | ✅ | ✅ | **PASS** - Simple print(42) without loop works | | C | `apps/tests/llvm_stage3_loop_only.hako` | ❌ | - | - | **TAG-EMIT** - Complex loop (break/continue) fails JoinIR pattern matching | @@ -137,7 +137,7 @@ declare i64 @nyash.console.log(i8*) --- -### 3. TAG-RUN: Loop Infinite Iteration (Case B) - 🔍 NEW ISSUE +### 3. TAG-RUN: Loop Infinite Iteration (Case B) - ❌ CONFIRMED PHI BUG (Phase 131-6) **File**: `apps/tests/loop_min_while.hako` @@ -159,13 +159,32 @@ $ /tmp/loop_min_while ... (infinite loop, prints 0 forever) ``` -**Diagnosis**: -- Loop counter `i` is not being updated correctly -- PHI node receives correct values but store/load may be broken -- String conversion creates new handles (seen in trace: `from_i8_string -> N`) -- Loop condition (`i < 3`) always evaluates to true +**Root Cause** (Phase 131-6 Diagnosis - 2025-12-14): -**Hypothesis**: PHI value is computed correctly but not written back to memory location, causing `i = i + 1` to have no effect. +The MIR is **correct**. The MIR dump shows proper SSA form: + +```mir +bb4: + %3 = phi [%2, bb0], [%12, bb7] // ← Should receive %12 from loop body + +bb7: + extern_call env.console.log(%3) // ← Prints current value + %11 = const 1 + %12 = %3 Add %11 // ← Computes i+1 + br label bb4 // ← Loops back with %12 +``` + +**Verification**: +- ✅ VM execution: Correctly outputs `0, 1, 2` +- ❌ LLVM execution: Infinite loop outputting `0` +- ✅ MIR SSA dominance: No use-before-def violations +- ❌ LLVM PHI wiring: Incoming value from bb7 not properly connected + +**Bug Location**: `/home/tomoaki/git/hakorune-selfhost/src/llvm_py/llvm_builder.py` +- Function: `finalize_phis()` (lines 601-735+) +- Suspected issue: Self-carry logic (lines 679-681) or value resolution (line 683) causes %12 to be ignored + +**Detailed Analysis**: See [`docs/development/current/main/phase131-6-ssa-dominance-diagnosis.md`](/home/tomoaki/git/hakorune-selfhost/docs/development/current/main/phase131-6-ssa-dominance-diagnosis.md) **Next Steps**: 1. Inspect generated LLVM IR for store instructions after PHI diff --git a/docs/development/current/main/phase131-6-next-steps.md b/docs/development/current/main/phase131-6-next-steps.md new file mode 100644 index 00000000..7746e98f --- /dev/null +++ b/docs/development/current/main/phase131-6-next-steps.md @@ -0,0 +1,125 @@ +# Phase 131-6: Next Steps - PHI Bug Fix + +## Summary + +**Problem**: LLVM backend generates infinite loop for `loop_min_while.hako` +**Root Cause**: PHI node incoming values not properly wired in `finalize_phis()` +**Impact**: P0 - Breaks basic loop functionality + +## Recommended Fix Strategy + +### Option A: Structural Fix (Recommended) + +**Approach**: Fix the PHI wiring logic in `finalize_phis()` to properly connect incoming values. + +**Location**: `/home/tomoaki/git/hakorune-selfhost/src/llvm_py/llvm_builder.py:601-735` + +**Suspected Issue**: +```python +# Lines 679-681: Self-carry logic +if vs == int(dst_vid) and init_src_vid is not None: + vs = int(init_src_vid) # ← May incorrectly replace %12 with %2 +``` + +**Fix Hypothesis**: +The self-carry logic is meant to handle PHI nodes that reference themselves, but it may be incorrectly triggering for normal loop-carried dependencies. We need to: + +1. Add trace logging to see what values are being resolved +2. Check if `vs == int(dst_vid)` is incorrectly matching %12 (the updated value) as a self-reference +3. Verify that `_value_at_end_i64()` is correctly retrieving the value of %12 from bb7 + +**Debug Commands**: +```bash +# Enable verbose logging (if available) +export NYASH_CLI_VERBOSE=1 +export NYASH_LLVM_DEBUG=1 + +# Generate LLVM IR to inspect +tools/build_llvm.sh apps/tests/loop_min_while.hako -o /tmp/loop_test + +# Check generated LLVM IR +llvm-dis /tmp/loop_test.o # If object contains IR +# Or add --emit-llvm-ir flag if available +``` + +**Steps**: +1. Add debug prints in `finalize_phis()` to log: + - Which incoming values are being wired: `(block_id, dst_vid, [(pred_id, value_id)])` + - What `nearest_pred_on_path()` returns for each incoming edge + - What `_value_at_end_i64()` returns for each value + +2. Compare debug output between working (VM) and broken (LLVM) paths + +3. Fix the logic that's causing %12 to be ignored or replaced + +4. Verify fix doesn't break Case A or B2 + +### Option B: Workaround (Not Recommended) + +Disable loop optimization or use VM backend for loops. This doesn't solve the root cause. + +## Acceptance Tests + +### Must Pass + +1. **Simple Add** (already passing): +```bash +./target/release/hakorune /tmp/simple_add.hako # Should print 1 +``` + +2. **Loop Min While** (currently failing): +```bash +tools/build_llvm.sh apps/tests/loop_min_while.hako -o /tmp/loop_test +timeout 2 /tmp/loop_test # Should print 0\n1\n2 and exit +``` + +3. **Phase 87 LLVM Min** (regression check): +```bash +tools/build_llvm.sh apps/tests/phase87_llvm_exe_min.hako -o /tmp/phase87_test +/tmp/phase87_test +echo $? # Should be 42 +``` + +### Should Not Regress + +- Case A: `phase87_llvm_exe_min.hako` ✅ +- Case B2: Simple print without loop ✅ +- VM backend: All existing VM tests ✅ + +## Implementation Checklist + +- [ ] Add debug logging to `finalize_phis()` +- [ ] Identify which incoming value is being incorrectly wired +- [ ] Fix the wiring logic +- [ ] Test Case B (loop_min_while) - must output `0\n1\n2` +- [ ] Test Case A regression - must exit with 42 +- [ ] Test Case B2 regression - must print 42 +- [ ] Document the fix in this file and phase131-3-llvm-lowering-inventory.md +- [ ] Consider adding MIR→LLVM IR validation pass + +## Timeline + +- **Phase 131-6 Diagnosis**: 2025-12-14 ✅ Complete +- **Phase 131-7 Fix**: TBD +- **Phase 131-8 Verification**: TBD + +## Related Documents + +- [Phase 131-6 Diagnosis](phase131-6-ssa-dominance-diagnosis.md) - Full diagnostic report +- [Phase 131-3 Inventory](phase131-3-llvm-lowering-inventory.md) - Test case matrix +- [LLVM Builder Code](../../src/llvm_py/llvm_builder.py) - Implementation + +## Notes for Future Self + +**Why MIR is correct but LLVM is wrong**: +- MIR SSA form verified ✅ +- VM execution verified ✅ +- LLVM emission succeeds ✅ +- LLVM linking succeeds ✅ +- **LLVM runtime fails** ❌ → Bug is in IR generation, not MIR + +**Key Insight**: +The bug is NOT in the MIR builder or JoinIR merger. The bug is specifically in how the Python LLVM builder (`llvm_builder.py`) translates MIR PHI nodes into LLVM IR PHI nodes. This is a **translation bug**, not a **semantic bug**. + +**Architecture Note**: +This confirms the value of the 2-pillar strategy (VM + LLVM). The VM serves as a reference implementation to verify MIR correctness before blaming the frontend. diff --git a/docs/development/current/main/phase131-6-ssa-dominance-diagnosis.md b/docs/development/current/main/phase131-6-ssa-dominance-diagnosis.md new file mode 100644 index 00000000..62dbd935 --- /dev/null +++ b/docs/development/current/main/phase131-6-ssa-dominance-diagnosis.md @@ -0,0 +1,216 @@ +# Phase 131-6: MIR SSA Dominance Diagnosis + +## Executive Summary + +**Status**: ❌ LLVM Backend Bug Confirmed +**Severity**: P0 - Breaks basic loop functionality +**Root Cause**: PHI node incoming values not properly wired in LLVM IR generation + +## Evidence Chain + +### 1. Test Case SSOT + +**File**: `/tmp/simple_add.hako` +```nyash +static box Main { + main() { + local i + i = 0 + i = i + 1 + print(i) + return 0 + } +} +``` + +**Expected**: Prints `1` +**Actual (VM)**: ✅ Prints `1` +**Actual (LLVM)**: ✅ Prints `1` + +**File**: `apps/tests/loop_min_while.hako` +```nyash +static box Main { + main() { + local i = 0 + loop(i < 3) { + print(i) + i = i + 1 + } + return 0 + } +} +``` + +**Expected**: Prints `0\n1\n2` +**Actual (VM)**: ✅ Prints `0\n1\n2` +**Actual (LLVM)**: ❌ Prints `0` infinitely (infinite loop) + +### 2. MIR Verification + +**Command**: `./target/release/hakorune --dump-mir apps/tests/loop_min_while.hako` + +**MIR Output** (relevant blocks): +```mir +define i64 @main() { +bb0: + %1 = const 0 + %2 = copy %1 + br label bb4 + +bb3: + %17 = const 0 + ret %17 + +bb4: + %3 = phi [%2, bb0], [%12, bb7] // ← PHI node for loop variable + br label bb5 + +bb5: + %8 = const 3 + %9 = icmp Lt %3, %8 + %10 = Not %9 + br %10, label bb6, label bb7 + +bb6: + br label bb3 + +bb7: + extern_call env.console.log(%3) // ← Prints %3 (should increment each iteration) + %11 = const 1 + %12 = %3 Add %11 // ← %12 = %3 + 1 (updated value) + br label bb4 // ← Jumps back with %12 +} +``` + +**Analysis**: +- ✅ SSA form is correct +- ✅ All values defined before use within each block +- ✅ PHI node properly declares incoming values: `[%2, bb0]` (initial) and `[%12, bb7]` (loop update) +- ✅ No use-before-def violations + +### 3. VM Execution Verification + +**Command**: `timeout 2 ./target/release/hakorune apps/tests/loop_min_while.hako` + +**Output**: +``` +0 +1 +2 +RC: 0 +``` + +**Conclusion**: ✅ MIR is correct, VM interprets it correctly + +### 4. LLVM Execution Failure + +**Build Command**: `bash tools/build_llvm.sh apps/tests/loop_min_while.hako -o /tmp/loop_test` + +**Build Result**: ✅ Success (no errors) + +**Run Command**: `/tmp/loop_test` + +**Output** (truncated): +``` +0 +0 +0 +0 +... (repeats infinitely) +``` + +**Conclusion**: ❌ LLVM backend bug - PHI node not working + +## Root Cause Analysis + +### Affected Component +**File**: `/home/tomoaki/git/hakorune-selfhost/src/llvm_py/llvm_builder.py` +**Function**: `finalize_phis()` (lines 601-735+) + +### Bug Mechanism + +The PHI node `%3 = phi [%2, bb0], [%12, bb7]` should: +1. On first iteration: Use %2 (value 0 from bb0) +2. On subsequent iterations: Use %12 (updated value from bb7) + +**What's happening**: +- %3 always resolves to 0 (initial value from %2) +- The incoming value from bb7 (%12) is not being properly connected +- Loop variable never increments → infinite loop + +### Suspected Code Location + +In `finalize_phis()` around lines 670-688: +```python +chosen: Dict[int, ir.Value] = {} +for (b_decl, v_src) in incoming: + try: + bd = int(b_decl); vs = int(v_src) + except Exception: + continue + pred_match = nearest_pred_on_path(bd) + if pred_match is None: + continue + # If self-carry is specified (vs == dst_vid), map to init_src_vid when available + if vs == int(dst_vid) and init_src_vid is not None: + vs = int(init_src_vid) # ← SUSPICIOUS: May cause %12 to be ignored + try: + val = self.resolver._value_at_end_i64(vs, pred_match, self.preds, self.block_end_values, self.vmap, self.bb_map) + except Exception: + val = None + if val is None: + val = ir.Constant(self.i64, 0) # ← Falls back to 0 + chosen[pred_match] = val +``` + +### Hypothesis + +The self-carry logic (lines 679-681) or value resolution (line 683) may be incorrectly mapping or failing to retrieve %12 from bb7, causing the PHI to always use the fallback value of 0. + +## Next Steps + +### Immediate Action Required + +1. **Add Trace Logging**: + - Enable `NYASH_CLI_VERBOSE=1` or similar PHI-specific tracing + - Log what values are being wired to each PHI incoming edge + +2. **Minimal Fix Verification**: + - Verify `_value_at_end_i64(12, 7, ...)` returns the correct LLVM value + - Check if `nearest_pred_on_path()` correctly identifies bb7 as predecessor of bb4 + +3. **Test Matrix**: + - Simple Add: ✅ (already passing) + - Loop Min While: ❌ (currently failing) + - Case A/B2 from previous phases: (regression check needed) + +### Long-term Solution + +Implement structural dominance verification: +- MIR verifier pass to check SSA properties +- LLVM IR verification before object emission +- Automated test for PHI node correctness + +## Acceptance Criteria + +### Must Pass +1. ✅ `tools/build_llvm.sh apps/tests/loop_min_while.hako -o /tmp/loop_test && /tmp/loop_test` outputs `0\n1\n2` and exits +2. ✅ Simple Add still works: `/tmp/simple_add` outputs `1` +3. ✅ No regression in existing LLVM smoke tests + +### Documentation +1. ✅ This diagnosis added to `docs/development/current/main/` +2. ✅ Fix explanation added to phase131-3-llvm-lowering-inventory.md +3. ✅ Test case added to prevent regression + +## Files Modified (To Be Updated) + +- `src/llvm_py/llvm_builder.py` - PHI wiring logic +- `docs/development/current/main/phase131-3-llvm-lowering-inventory.md` - Add Phase 131-6 section +- (potential) Test case addition + +## Timeline + +- **Diagnosis**: 2025-12-14 (Complete) +- **Fix**: TBD +- **Verification**: TBD