Files
hakorune/docs/development/current/main/phase170-completion-report.md
nyash-codex 4e32a803a7 feat(joinir): Phase 33-22 CommonPatternInitializer & JoinIRConversionPipeline integration
Unifies initialization and conversion logic across all 4 loop patterns,
eliminating code duplication and establishing single source of truth.

## Changes

### Infrastructure (New)
- CommonPatternInitializer (117 lines): Unified loop var extraction + CarrierInfo building
- JoinIRConversionPipeline (127 lines): Unified JoinIR→MIR→Merge flow

### Pattern Refactoring
- Pattern 1: Uses CommonPatternInitializer + JoinIRConversionPipeline (-25 lines)
- Pattern 2: Uses CommonPatternInitializer + JoinIRConversionPipeline (-25 lines)
- Pattern 3: Uses CommonPatternInitializer + JoinIRConversionPipeline (-25 lines)
- Pattern 4: Uses CommonPatternInitializer + JoinIRConversionPipeline (-40 lines)

### Code Reduction
- Total reduction: ~115 lines across all patterns
- Zero code duplication in initialization/conversion
- Pattern files: 806 lines total (down from ~920)

### Quality Improvements
- Single source of truth for initialization
- Consistent conversion flow across all patterns
- Guaranteed boundary.loop_var_name setting (prevents SSA-undef bugs)
- Improved maintainability and testability

### Testing
- All 4 patterns tested and passing:
  - Pattern 1 (Simple While): 
  - Pattern 2 (With Break): 
  - Pattern 3 (If-Else PHI): 
  - Pattern 4 (With Continue): 

### Documentation
- Phase 33-22 inventory and results document
- Updated joinir-architecture-overview.md with new infrastructure

## Breaking Changes
None - pure refactoring with no API changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-07 21:02:20 +09:00

379 lines
15 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Phase 170: JsonParserBox JoinIR Preparation & Re-validation - Completion Report
**Date**: 2025-12-07
**Duration**: 1 session (autonomous work)
**Status**: ✅ **Complete** - Environment prepared, blockers identified, next phase planned
---
## Executive Summary
Phase 170 successfully prepared the environment for JsonParserBox JoinIR validation and identified the critical ValueId boundary mapping issue blocking runtime execution. All tasks completed:
-**Task A-1**: JoinIR routing whitelist expanded (6 JsonParserBox methods + test helper)
-**Task A-2**: ValueId boundary issue identified with full root cause analysis
- ⚠️ **Task B**: Mini tests blocked by `using` statement (workaround: simplified test created)
-**Task C**: Next phase direction decided (Option A: Fix boundary mapping)
**Key Achievement**: Identified that BoolExprLowerer integration (Phase 167-169) is correct, but the boundary mechanism needs condition variable extraction to work properly.
---
## Phase 170-A: Environment Setup ✅
### Task A-1: JoinIR Routing Whitelist Expansion
**Objective**: Allow JsonParserBox methods to route to JoinIR patterns instead of `[joinir/freeze]`.
**Changes Made**:
**File**: `src/mir/builder/control_flow/joinir/routing.rs` (lines 68-76)
**Added entries**:
```rust
// Phase 170-A-1: Enable JsonParserBox methods for JoinIR routing
"JsonParserBox._trim/1" => true,
"JsonParserBox._skip_whitespace/2" => true,
"JsonParserBox._match_literal/2" => true,
"JsonParserBox._parse_string/2" => true,
"JsonParserBox._parse_array/2" => true,
"JsonParserBox._parse_object/2" => true,
// Phase 170-A-1: Test methods (simplified versions)
"TrimTest.trim/1" => true,
```
**Result**: ✅ Methods now route to pattern matching instead of immediate `[joinir/freeze]` rejection.
**Evidence**:
```bash
HAKO_JOINIR_DEBUG=1 ./target/release/hakorune local_tests/test_trim_main_pattern.hako
# Output: [joinir/pattern2] Generated JoinIR for Loop with Break Pattern (Phase 169)
```
---
### Task A-2: ValueId Boundary Issue Identification
**Objective**: Understand if ValueId boundary mapping affects JsonParserBox tests.
**Test Created**: `local_tests/test_trim_main_pattern.hako` (48 lines)
- Simplified `_trim` method with same loop structure as JsonParserBox
- Two loops with break (Pattern2 x 2)
- Condition variables: `start < end`, `end > start`
**Findings**:
1. **Pattern Detection**: ✅ Works correctly
- Both loops match Pattern2
- JoinIR generation succeeds
2. **Runtime Execution**: ❌ Silent failure
- Program compiles successfully
- No output produced
- Exit code 0 (but no print statements executed)
3. **Root Cause Identified**: ValueId boundary mapping
- Condition variables (`start`, `end`) resolved from HOST `variable_map`
- HOST ValueIds (33, 34, 48, 49) used directly in JoinIR
- Not included in `JoinInlineBoundary`
- Merge process doesn't remap them → undefined at runtime
**Evidence**:
```
[ssa-undef-debug] fn=TrimTest.trim/1 bb=BasicBlockId(12) inst_idx=0 used=ValueId(33) inst=Compare { dst: ValueId(26), op: Lt, lhs: ValueId(33), rhs: ValueId(34) }
[ssa-undef-debug] fn=TrimTest.trim/1 bb=BasicBlockId(12) inst_idx=0 used=ValueId(34) inst=Compare { dst: ValueId(26), op: Lt, lhs: ValueId(33), rhs: ValueId(34) }
```
**Impact**: CRITICAL - Blocks ALL JsonParserBox methods with complex conditions.
**Detailed Analysis**: See [phase170-valueid-boundary-analysis.md](phase170-valueid-boundary-analysis.md)
---
## Phase 170-B: JsonParserBox Mini Test Re-execution ⚠️
### Original Test Files
**Location**: `tools/selfhost/json_parser_{string,array,object}_min.hako`
**Blocker**: `using` statement not working
```
[using] not found: 'tools/hako_shared/json_parser.hako" with JsonParserBox'
```
**Root Cause**: JsonParserBox is defined in external file, not compiled/loaded at runtime.
**Impact**: Can't run original integration tests in current form.
---
### Workaround: Simplified Test
**Created**: `local_tests/test_trim_main_pattern.hako`
**Purpose**: Test same loop structure without `using` dependency.
**Structure**:
```nyash
static box TrimTest {
method trim(s) {
// Same structure as JsonParserBox._trim
loop(start < end) { ... break }
loop(end > start) { ... break }
}
main(args) { ... }
}
```
**Result**: Successfully routes to Pattern2, exposes boundary issue.
---
## Phase 170-C: Next Phase Planning ✅
### Immediate TODOs (Phase 171+ Candidates)
**Priority 1: Fix ValueId Boundary Mapping** (HIGHEST PRIORITY)
- **Why**: Blocks all JsonParserBox complex condition tests
- **What**: Extract condition variables and add to `JoinInlineBoundary`
- **Where**: Pattern lowerers (pattern1/2/3/4)
- **Estimate**: 4.5 hours
- **Details**: See Option A in [phase170-valueid-boundary-analysis.md](phase170-valueid-boundary-analysis.md)
**Priority 2: Using Statement / Box Loading** (MEDIUM)
- **Why**: Enable actual JsonParserBox integration tests
- **What**: Compile and register boxes from `using` statements
- **Alternatives**:
- Inline JsonParser code in tests (quick workaround)
- Auto-compile static boxes (proper solution)
**Priority 3: Multi-Loop Function Support** (LOW)
- **Why**: `_trim` has 2 loops in one function
- **Current**: Each loop calls JoinIR routing separately (seems to work)
- **Risk**: May need validation that multiple JoinIR calls per function work correctly
---
### Recommended Next Phase Direction
**Option A: Fix Boundary Mapping First****RECOMMENDED**
**Rationale**:
1. **Root blocker**: Boundary issue blocks ALL tests, not just one
2. **BoolExprLowerer correct**: Phase 169 integration is solid
3. **Pattern matching correct**: Routing and detection work perfectly
4. **Isolated fix**: Boundary extraction is well-scoped and testable
5. **High impact**: Once fixed, all JsonParser methods should work
**Alternative**: Option B (simplify code) or Option C (postpone) - both less effective.
---
## Test Results Matrix
| Method/Test | Pattern | JoinIR Status | Blocker | Notes |
|-------------|---------|---------------|---------|-------|
| `TrimTest.trim/1` (loop 1) | Pattern2 | ⚠️ Routes OK, runtime fail | ValueId boundary | `start < end` uses undefined ValueId(33, 34) |
| `TrimTest.trim/1` (loop 2) | Pattern2 | ⚠️ Routes OK, runtime fail | ValueId boundary | `end > start` uses undefined ValueId(48, 49) |
| `JsonParserBox._trim/1` | (untested) | - | Using statement | Can't load JsonParserBox at runtime |
| `JsonParserBox._skip_whitespace/2` | (untested) | - | Using statement | Can't load JsonParserBox at runtime |
| `JsonParserBox._match_literal/2` | (untested) | - | Using statement | Can't load JsonParserBox at runtime |
| `JsonParserBox._parse_string/2` | (untested) | - | Using statement | Can't load JsonParserBox at runtime |
| `JsonParserBox._parse_array/2` | (untested) | - | Using statement | Can't load JsonParserBox at runtime |
| `JsonParserBox._parse_object/2` | (untested) | - | Using statement | Can't load JsonParserBox at runtime |
**Summary**:
- **Routing**: ✅ All methods whitelisted, pattern detection works
- **Compilation**: ✅ BoolExprLowerer generates correct JoinIR
- **Runtime**: ❌ ValueId boundary issue prevents execution
- **Integration**: ⚠️ `using` statement blocks full JsonParser tests
---
## Files Modified
**Modified**:
- `src/mir/builder/control_flow/joinir/routing.rs` (+8 lines, whitelist expansion)
**Created**:
- `local_tests/test_trim_main_pattern.hako` (+48 lines, test file)
- `docs/development/current/main/phase170-valueid-boundary-analysis.md` (+270 lines, analysis)
- `docs/development/current/main/phase170-completion-report.md` (+THIS file)
**Updated**:
- `CURRENT_TASK.md` (added Phase 170 section with progress summary)
- `docs/development/current/main/phase166-validation-report.md` (added Phase 170 update section)
---
## Technical Insights
### Boundary Mechanism Gap
**Current Design**:
```rust
JoinInlineBoundary::new_inputs_only(
vec![ValueId(0)], // JoinIR loop variable
vec![loop_var_id], // HOST loop variable
);
```
**What's Missing**: Condition variables!
**Needed Design**:
```rust
JoinInlineBoundary::new_inputs_only(
vec![ValueId(0), ValueId(1), ValueId(2)], // loop var + cond vars
vec![loop_var_id, start_id, end_id], // HOST ValueIds
);
```
**Why It Matters**:
- `condition_to_joinir.rs` directly references HOST `variable_map` ValueIds
- These ValueIds are NOT in JoinIR's fresh allocator space
- Without boundary mapping, they remain undefined after merge
- Silent failure: compiles but doesn't execute
### Two ValueId Namespaces
**HOST Context** (Main MirBuilder):
- ValueIds from 0 upward (e.g., `start = ValueId(33)`)
- All variables in `builder.variable_map`
- Pre-existing before JoinIR call
**JoinIR Context** (Fresh Allocator):
- ValueIds from 0 upward (independent sequence)
- Generated by JoinIR lowerer
- Post-merge: remapped to new HOST ValueIds
**Bridge**: `JoinInlineBoundary` maps between the two spaces with Copy instructions.
**Current Gap**: Only explicitly listed variables get bridged. Condition variables are implicitly referenced but not bridged.
---
## Validation Checklist
- [x] Whitelist expanded (6 JsonParserBox methods + test)
- [x] Pattern routing verified (Pattern2 detected correctly)
- [x] BoolExprLowerer integration verified (generates JoinIR correctly)
- [x] Boundary issue identified (root cause documented)
- [x] Test file created (simplified _trim test)
- [x] Root cause analysis completed (270-line document)
- [x] Next phase direction decided (Option A recommended)
- [x] Documentation updated (CURRENT_TASK.md, phase166 report)
- [x] Files committed (ready for next phase)
---
## Next Phase: Phase 171 - Boundary Mapping Fix
**Recommended Implementation**:
1. **Create condition variable extractor** (30 min)
- File: `src/mir/builder/control_flow/joinir/patterns/cond_var_extractor.rs`
- Function: `extract_condition_variables(ast: &ASTNode, builder: &MirBuilder) -> Vec<(String, ValueId)>`
2. **Update Pattern2** (1 hour)
- Extract condition variables before lowering
- Create expanded boundary with condition vars
- Test with `TrimTest.trim/1`
3. **Update Pattern1, Pattern3, Pattern4** (3 hours)
- Apply same pattern
- Ensure all patterns include condition vars in boundary
4. **Validation** (30 min)
- Re-run `TrimTest.trim/1` → should print output
- Re-run JsonParserBox tests (if `using` resolved)
**Total Estimate**: 5 hours
---
## Conclusion
Phase 170 successfully prepared the environment for JsonParserBox validation and identified the critical blocker preventing runtime execution. The boundary mapping issue is well-understood, with a clear solution path (Option A: extract condition variables).
**Key Achievements**:
- ✅ Whitelist expansion enables JsonParserBox routing
- ✅ BoolExprLowerer integration verified working correctly
- ✅ Boundary issue root cause identified and documented
- ✅ Clear next steps with 5-hour implementation estimate
**Next Step**: Implement Phase 171 - Condition Variable Extraction for Boundary Mapping.
---
## Phase 170C1: CaseA Shape 検出の暫定実装メモ
Phase 170C1 では、当初「LoopUpdateAnalyzer (AST) → UpdateExpr を使って Generic 判定を減らす」方針だったが、
実装コストと他フェーズとの依存関係を考慮し、まずは **carrier 名ベースの軽量ヒューリスティック** を導入した。
### 現状の実装方針
- `CaseALoweringShape::detect_from_features()` の内部で、LoopFeatures だけでは足りない情報を
**carrier 名からのヒント** で補っている:
- `i`, `e`, `idx`, `pos` など → 「位置・インデックス」寄りのキャリア
- `result`, `defs` など → 「蓄積・結果」寄りのキャリア
- これにより、`Generic` 一択だったものを簡易的に:
- StringExamination 系(位置スキャン系)
- ArrayAccumulation 系(配列への追加系)
に二分できるようにしている。
### 限界と今後
- これはあくまで **Phase 170C1 の暫定策** であり、箱理論上の最終形ではない:
- 変数名に依存しているため、完全にハードコードを排除できているわけではない。
- 真に綺麗にするには、LoopUpdateAnalyzer / 型推定層から UpdateKind や carrier 型情報を LoopFeatures に統合する必要がある。
- 今後のフェーズ170C2 以降)では:
- `LoopUpdateAnalyzer``UpdateKind` の分類を追加し、
- `CounterLike` / `AccumulationLike` 等を LoopFeatures に持たせる。
- 可能であれば carrier の型String / Array 等)を推定する軽量層を追加し、
`CaseALoweringShape`**名前ではなく UpdateKind/型情報だけ** を見て判定する方向に寄せていく。
この暫定実装は「Phase 200 での loop_to_join.rs ハードコード削除に向けた足場」として扱い、
将来的には carrier 名依存のヒューリスティックを段階的に薄めていく予定。
---
## Phase 170C2b: LoopUpdateSummary 統合(実装メモ)
Phase 170C2b では、LoopUpdateSummaryBox を実コードに差し込み、
CaseALoweringShape が直接 carrier 名を見ることなく UpdateKind 経由で判定できるようにした。
### 実装ポイント
- 既存の `LoopUpdateSummary` 型を活用し、`LoopFeatures` にフィールドを追加:
```rust
pub struct LoopFeatures {
// 既存フィールド …
pub update_summary: Option<LoopUpdateSummary>, // ← new
}
```
- `CaseALoweringShape` 側に `detect_with_updates()` を追加し、
`LoopUpdateSummary` 内の `UpdateKind` を見て形を決めるようにした:
```rust
match update.kind {
UpdateKind::CounterLike => CaseALoweringShape::StringExamination,
UpdateKind::AccumulationLike => CaseALoweringShape::ArrayAccumulation,
UpdateKind::Other => CaseALoweringShape::Generic,
}
```
- `loop_to_join.rs` では、まず `detect_with_updates()` を試し、
それで決まらない場合のみ従来のフォールバックに流す構造に変更。
### 効果と現状
- carrier 名に依存するロジックは `LoopUpdateSummaryBox` の内部に閉じ込められ、
CaseALoweringShape / loop_to_join.rs からは UpdateKind だけが見える形になった。
- 代表的な ループスモーク 16 本のうち 15 本が PASS1 本は既知の別問題)で、
既存パターンへの回帰は維持されている。
この状態を起点に、今後 Phase 170C3 以降で `LoopUpdateSummary` の中身AST/MIR ベースの解析)だけを差し替えることで、
段階的に carrier 名ヒューリスティックを薄めていく計画。