hakorune/docs/development/current/main/cse-pass-callee-fix.md
nyash-codex 120fbdb523 fix(mir): Receiver used_values for DCE + trace + cleanup
- Fix: Call with Callee::Method now includes receiver in used_values()
  - Prevents DCE from eliminating Copy instructions that define receivers
  - Pattern 3 (loop_if_phi.hako) now works correctly (sum=9)

- Add: NYASH_DCE_TRACE=1 for debugging eliminated instructions
  - Shows which pure instructions DCE removes and from which block

- Cleanup: Consolidate Call used_values to single source of truth
  - Early return in methods.rs handles all Call variants
  - Removed duplicate match arm (now unreachable!())
  - ChatGPT's suggestion for cleaner architecture

- Docs: Phase 166 analysis of inst_meta layer architecture
  - Identified CSE pass callee bug (to be fixed next)
  - Improvement proposals for CallLikeInst

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 23:26:55 +09:00


# CSE Pass Fix Proposal - Callee Field Support
## Problem Details
### Current Implementation
**src/mir/passes/cse.rs lines 72-91**:
```rust
fn instruction_key(i: &MirInstruction) -> String {
    match i {
        MirInstruction::Const { value, .. } => {
            format!("const_{:?}", value)
        }
        MirInstruction::BinOp { op, lhs, rhs, .. } => {
            format!("binop_{:?}_{}_{}", op, lhs.as_u32(), rhs.as_u32())
        }
        MirInstruction::Compare { op, lhs, rhs, .. } => {
            format!("cmp_{:?}_{}_{}", op, lhs.as_u32(), rhs.as_u32())
        }
        MirInstruction::Call { func, args, .. } => {
            let args_str = args
                .iter()
                .map(|v| v.as_u32().to_string())
                .collect::<Vec<_>>()
                .join(",");
            format!("call_{}_{}", func.as_u32(), args_str)
            // ← the callee field is ignored here!
        }
        other => format!("other_{:?}", other),
    }
}
```
### Problem Scenarios
```hako
box StringUtil {
    upper(s) {
        return s.upper()
    }
}
local x = new StringBox("hello")
local y = new StringBox("world")
// Case 1: method calls with different receivers
%r1 = call Method { receiver: Some(%x), method: "upper", ... } ()
// CSE key: "call_<x>_" (callee ignored)
%r2 = call Method { receiver: Some(%y), method: "upper", ... } ()
// CSE key: "call_<y>_" (callee ignored)
// ↑ x and y are different ValueIds → the keys differ
// → OK in this case, but only by coincidence
// Case 2: the same method call made twice
%s1 = new StringBox("hello")
%r1 = call Method { receiver: Some(%s1), method: "upper", ... } ()
%r2 = call Method { receiver: Some(%s1), method: "upper", ... } ()
// Both keys: "call_<s1>_"
// → CSE correctly detects the duplicate
// Case 3: different methods on the same receiver
%obj = new StringBox("hello")
%r1 = call Method { receiver: Some(%obj), method: "upper", ... } ()
%r2 = call Method { receiver: Some(%obj), method: "lower", ... } ()
// Both keys: "call_<obj>_"
// ← WRONG! Different methods, yet the same key
// Case 4: global function call
%r1 = call Global("print") (%msg)
// callee field: Global("print")
// func field: ValueId::INVALID
// Current key: "call_<INVALID>_<msg>"
// ← func alone loses the callee (function/method) information
```
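The Case 3 collision can be reproduced in isolation. The following is a minimal sketch, not the actual MIR code: `ValueId`, `Callee`, and `Call` are simplified stand-ins, `current_key` and `method_call` are hypothetical names, and the sketch assumes (as the keys shown in the scenario imply) that the `func` slot carries the receiver's id.

```rust
// Simplified stand-ins for the MIR types (assumption: the real definitions
// in the code base carry more fields and variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ValueId(u32);

impl ValueId {
    fn as_u32(self) -> u32 {
        self.0
    }
}

#[derive(Debug, Clone)]
enum Callee {
    Method { method: String, receiver: Option<ValueId> },
}

#[derive(Debug, Clone)]
struct Call {
    func: ValueId,
    callee: Option<Callee>,
    args: Vec<ValueId>,
}

// Mirrors the current key logic: only `func` and `args` contribute.
fn current_key(call: &Call) -> String {
    let args_str = call
        .args
        .iter()
        .map(|v| v.as_u32().to_string())
        .collect::<Vec<_>>()
        .join(",");
    format!("call_{}_{}", call.func.as_u32(), args_str)
}

// Hypothetical helper: a zero-argument method call whose `func` slot holds
// the receiver id, as the keys in the scenario imply.
fn method_call(receiver: ValueId, method: &str) -> Call {
    Call {
        func: receiver,
        callee: Some(Callee::Method {
            method: method.to_string(),
            receiver: Some(receiver),
        }),
        args: Vec::new(),
    }
}
```

With `obj = ValueId(7)`, both `current_key(&method_call(obj, "upper"))` and `current_key(&method_call(obj, "lower"))` evaluate to `"call_7_"`, so a key-based CSE would treat the two calls as the same expression.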
### Fix
**Proposal 1: Include callee (recommended)**
```rust
fn instruction_key(i: &MirInstruction) -> String {
    match i {
        // ...
        MirInstruction::Call { callee, func, args, .. } => {
            let args_str = args
                .iter()
                .map(|v| v.as_u32().to_string())
                .collect::<Vec<_>>()
                .join(",");
            // When callee is present, build the key from it
            if let Some(c) = callee {
                match c {
                    Callee::Global(name) => {
                        format!("call_global_{}__{}", name, args_str)
                    }
                    Callee::Method {
                        box_name,
                        method,
                        receiver,
                        ..
                    } => {
                        // A missing receiver is treated as a static call
                        let recv_str = receiver.as_ref()
                            .map(|r| r.as_u32().to_string())
                            .unwrap_or_else(|| "static".to_string());
                        format!("call_method_{}_{}_{}_{}",
                            box_name, method, recv_str, args_str)
                    }
                    Callee::Value(v) => {
                        format!("call_value_{}__{}", v.as_u32(), args_str)
                    }
                    Callee::Extern(name) => {
                        format!("call_extern_{}__{}", name, args_str)
                    }
                    Callee::Constructor { box_type } => {
                        format!("call_ctor_{}_{}", box_type, args_str)
                    }
                    Callee::Closure { .. } => {
                        format!("call_closure__{}", args_str)
                    }
                }
            } else {
                // Legacy path: fall back to func
                format!("call_legacy_{}_{}", func.as_u32(), args_str)
            }
        }
        other => format!("other_{:?}", other),
    }
}
```
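To see why the key must separate distinct callees, it helps to look at how a key-based pass typically consumes these strings. The sketch below is a generic illustration, not the actual `cse.rs` logic: `find_redundant` is a hypothetical name, instructions are reduced to `(destination id, key)` pairs, and purity checks that a real pass would need are left out.

```rust
use std::collections::HashMap;

// Given (destination value id, CSE key) pairs in program order, returns a map
// from each redundant destination to the earlier destination it can reuse.
// Hypothetical helper for illustration only.
fn find_redundant(insts: &[(u32, String)]) -> HashMap<u32, u32> {
    let mut first_seen: HashMap<&str, u32> = HashMap::new();
    let mut replacements: HashMap<u32, u32> = HashMap::new();
    for (dst, key) in insts {
        // Copy the earlier id out so the lookup borrow ends before mutation.
        let earlier = first_seen.get(key.as_str()).copied();
        match earlier {
            // Same key seen before: this result can reuse the earlier one.
            Some(e) => {
                replacements.insert(*dst, e);
            }
            // First occurrence of this key: remember where it was defined.
            None => {
                first_seen.insert(key.as_str(), *dst);
            }
        }
    }
    replacements
}
```

Any two instructions that share a key are merged by a loop of this shape, which is why `upper` and `lower` on the same receiver must not produce the same string.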
**Proposal 2: Compact callee information (lightweight version)**
```rust
fn instruction_key(i: &MirInstruction) -> String {
    match i {
        // ...
        MirInstruction::Call { callee, func, args, .. } => {
            let args_str = args
                .iter()
                .map(|v| v.as_u32().to_string())
                .collect::<Vec<_>>()
                .join(",");
            // Include callee via its Debug string (or use hash(callee)).
            // On the legacy path (callee is None), fall back to func so that
            // distinct legacy calls keep distinct keys.
            let callee_key = match callee {
                Some(c) => format!("{:?}", c),
                None => format!("legacy_{}", func.as_u32()),
            };
            format!("call_{}__{}", callee_key, args_str)
        }
        other => format!("other_{:?}", other),
    }
}
```
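The two variants mentioned in Proposal 2 (Debug string versus hash) can be compared side by side. This is a sketch under assumptions: `Callee` here is a simplified stand-in with only two variants and plain `u32` ids, and `debug_key`, `hashed_key`, and `method` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Simplified stand-in for the real Callee enum (assumption).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Callee {
    Global(String),
    Method { box_name: String, method: String, receiver: Option<u32> },
}

// Variant A: embed the Debug rendering of the callee in the key.
fn debug_key(callee: &Option<Callee>, args: &[u32]) -> String {
    let args_str = args
        .iter()
        .map(|v| v.to_string())
        .collect::<Vec<_>>()
        .join(",");
    format!("call_{:?}__{}", callee, args_str)
}

// Variant B: hash the callee and the arguments together.
fn hashed_key(callee: &Option<Callee>, args: &[u32]) -> u64 {
    let mut hasher = DefaultHasher::new();
    callee.hash(&mut hasher);
    args.hash(&mut hasher);
    hasher.finish()
}

// Hypothetical helper: a method callee on StringBox with receiver id 7.
fn method(name: &str) -> Option<Callee> {
    Some(Callee::Method {
        box_name: "StringBox".to_string(),
        method: name.to_string(),
        receiver: Some(7),
    })
}
```

The Debug variant keeps keys readable in traces, but the exact Debug format is not something to parse or persist. The hash variant is compact, but two different callees can in principle hash to the same value, so a pass relying on it would still need an equality check before merging.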
## Test Cases
### Test 1: Same method, different receivers
```rust
#[test]
fn test_cse_different_receivers() {
    // MIR:
    // %x = new StringBox("hello")
    // %y = new StringBox("world")
    // %r1 = call Method { receiver: Some(%x), method: "upper", ... } ()
    // %r2 = call Method { receiver: Some(%y), method: "upper", ... } ()
    // → the CSE keys should differ
    let key1 = instruction_key(&call_method_upper_x());
    let key2 = instruction_key(&call_method_upper_y());
    assert_ne!(key1, key2); // different receivers → different keys
}
```
### Test 2: Different methods, same receiver
```rust
#[test]
fn test_cse_different_methods() {
    // MIR:
    // %obj = new StringBox("hello")
    // %r1 = call Method { receiver: Some(%obj), method: "upper", ... } ()
    // %r2 = call Method { receiver: Some(%obj), method: "lower", ... } ()
    // → the CSE keys should differ
    let key1 = instruction_key(&call_method_upper_obj());
    let key2 = instruction_key(&call_method_lower_obj());
    assert_ne!(key1, key2); // different methods → different keys
}
```
### Test 3: Global function call
```rust
#[test]
fn test_cse_global_function() {
    // MIR:
    // %r1 = call Global("print") (%msg1)
    // %r2 = call Global("print") (%msg1)
    // → the CSE keys should be identical
    let key1 = instruction_key(&call_global_print_msg1());
    let key2 = instruction_key(&call_global_print_msg1());
    assert_eq!(key1, key2); // same function, same arguments → same key
}
```
## Implementation Schedule
| Step | Task | Time |
|------|------|------|
| 1 | Fix instruction_key() in cse.rs | 1h |
| 2 | Add test cases | 0.5h |
| 3 | Verify existing smoke tests | 0.5h |
| 4 | Update documentation | 0.5h |

**Total**: 2.5 hours
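## Expected Benefits
- **Improved CSE correctness**: the optimization distinguishes calls by receiver and method
- **Bug prevention**: calls to different methods are no longer merged by mistake
- **Performance**: a small extra cost for key generation (within acceptable range)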