feat(phase161): Add Analyzer Box design and representative function selection

Phase 161 Task 2 & 3 completion:

**Task 2: Analyzer Box Design** (phase161_analyzer_box_design.md)
- Defined 3 core analyzer Boxes with clear responsibilities:
  1. JsonParserBox: Low-level JSON parsing (reusable utility)
  2. MirAnalyzerBox: Primary MIR v1 semantic analysis (14 methods)
  3. JoinIrAnalyzerBox: JoinIR v0 compatibility layer
- Comprehensive API contracts for all methods:
  - validateSchema(), summarize_function(), list_phis(), list_loops(), list_ifs()
  - propagate_types(), reachability_analysis(), dump methods
- Design principles applied: boxification, boundary creation, Fail-Fast, lazy singleton
- 5-stage implementation roadmap (Phase 161-2 through 161-5)
- Key algorithms documented: PHI detection, loop detection, if detection, type propagation

**Task 3: Representative Function Selection** (phase161_representative_functions.md)
- Formally selected 5 representative functions covering all patterns:
  1. if_simple: Basic if/else with PHI merge (⭐ Simple)
  2. loop_simple: Loop with back edge and loop-carried PHI (⭐ Simple)
  3. if_loop: Nested if inside loop with multiple PHI (⭐⭐ Medium)
  4. loop_break: Loop with break statement and multiple exits (⭐⭐ Medium)
  5. type_prop: Type propagation through loop arithmetic (⭐⭐ Medium)
- Each representative validates specific analyzer capabilities
- Selection criteria documented for future extensibility
- Validation strategy for Phase 161-2+ implementation

Representative test files will be created in local_tests/phase161/ (not committed due to .gitignore, but available for development)

Next: Phase 161 Task 4 - Implement basic MirAnalyzerBox on rep1_if_simple and rep2_loop_simple

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 19:37:18 +09:00
parent 1e2dfd25d3
commit 393aaf1500
2 changed files with 1073 additions and 0 deletions
--- a/docs/development/current/main/phase161_analyzer_box_design.md
+++ b/docs/development/current/main/phase161_analyzer_box_design.md
@ -0,0 +1,497 @@
+# Phase 161 Task 2: Analyzer Box Design (JoinIrAnalyzerBox / MirAnalyzerBox)
+
+**Status**: 🎯 **DESIGN PHASE** - Defining .hako Analyzer Box structure and responsibilities
+
+**Objective**: Design the foundational .hako Boxes for analyzing Rust JSON MIR/JoinIR data, establishing clear responsibilities and API contracts.
+
+---
+
+## Executive Summary
+
+Phase 161 aims to port JoinIR analysis logic from Rust to .hako. The first step was creating a complete JSON format inventory (Task 1, completed). Now we design the .hako Box architecture that will consume this data.
+
+**Key Design Decision**: Create **TWO specialized Analyzer Boxes** with distinct, non-overlapping responsibilities:
+1. **MirAnalyzerBox**: Analyzes MIR JSON v1 (primary)
+2. **JoinIrAnalyzerBox**: Analyzes JoinIR JSON v0 (secondary, for compatibility)
+
+Both boxes will share a common **JsonParserBox** utility for low-level JSON parsing operations.
+
+---
+
+## 1. Core Architecture: Box Responsibilities
+
+### 1.1 JsonParserBox (Shared Utility)
+
+**Purpose**: Low-level JSON parsing and traversal (reusable across both analyzers)
+
+**Scope**: Single-minded JSON access without semantic analysis
+
+**Responsibilities**:
+- Parse JSON text into MapBox/ArrayBox structure
+- Provide recursive accessor methods: `get()`, `getArray()`, `getInt()`, `getString()`
+- Handle type conversions safely with nullability
+- Provide iteration helpers: `forEach()`, `map()`, `filter()`
+
+**Key Methods**:
+```
+birth(jsonText)              // Parse JSON from string
+get(path: string): any       // Get nested value by dot-notation (e.g., "functions.0.blocks")
+getArray(path): ArrayBox     // Get array at path with type safety
+getString(path): string      // Get string with default ""
+getInt(path): integer        // Get integer with default 0
+getBool(path): boolean       // Get boolean with default false
+```
+
+**Non-Scope**: Semantic analysis, MIR-specific validation, JoinIR-specific knowledge
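+
+A minimal sketch of the path-based accessors, written in Python rather than .hako purely for illustration. It assumes a dot-separated path in which numeric segments index into arrays; `get_path`, `get_int`, and `get_string` are hypothetical helper names, not the final JsonParserBox API.
+
+```python
+import json
+
+def get_path(root, path):
+    """Walk a dot-separated path; return None if any segment is missing."""
+    node = root
+    for seg in path.split("."):
+        if isinstance(node, dict) and seg in node:
+            node = node[seg]
+        elif isinstance(node, list) and seg.isdigit() and int(seg) < len(node):
+            node = node[int(seg)]
+        else:
+            return None
+    return node
+
+def get_int(root, path, default=0):
+    value = get_path(root, path)
+    # bool is a subclass of int in Python, so exclude it explicitly
+    if isinstance(value, int) and not isinstance(value, bool):
+        return value
+    return default
+
+def get_string(root, path, default=""):
+    value = get_path(root, path)
+    return value if isinstance(value, str) else default
+
+# Hypothetical MIR-like document used only to exercise the accessors
+doc = json.loads('{"functions": [{"name": "main", "blocks": [{"id": 3}]}]}')
+name = get_string(doc, "functions.0.name")
+block_id = get_int(doc, "functions.0.blocks.0.id")
+```
+
+Returning a default instead of raising keeps call sites short, but it also hides missing fields, so schema validation would still need to run before these accessors are trusted.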
+
+---
+
+### 1.2 MirAnalyzerBox (Primary Analyzer)
+
+**Purpose**: Analyze MIR JSON v1 according to Phase 161 specifications
+
+**Scope**: All MIR-specific analysis operations
+
+**Responsibilities**:
+1. **Schema Validation**: Verify MIR JSON has required fields (schema_version, functions, cfg)
+2. **Instruction Type Detection**: Identify instruction types (14 types in MIR v1)
+3. **PHI Detection**: Identify PHI instructions and extract incoming values
+4. **Loop Detection**: Identify loops via backward edge analysis (CFG)
+5. **If Detection**: Identify conditional branches and PHI merge points
+6. **Type Analysis**: Propagate type hints through PHI/BinOp/Compare operations
+7. **Reachability Analysis**: Mark unreachable blocks (dead code detection)
+
+**Key Methods** (Single-Function Analysis):
+```
+birth(mirJsonText)                                   // Parse MIR JSON
+
+// === Schema Validation ===
+validateSchema(): boolean                            // Check MIR v1 structure
+
+// === Function-Level Analysis ===
+summarize_function(funcIndex: integer): MapBox      // Returns:
+                                                    // {
+                                                    //   name: string,
+                                                    //   params: integer,
+                                                    //   blocks: integer,
+                                                    //   instructions: integer,
+                                                    //   has_loops: boolean,
+                                                    //   has_ifs: boolean,
+                                                    //   has_phis: boolean
+                                                    // }
+
+// === Instruction Detection ===
+list_instructions(funcIndex): ArrayBox              // Returns array of:
+                                                    // {
+                                                    //   block_id: integer,
+                                                    //   inst_index: integer,
+                                                    //   op: string,
+                                                    //   dest: integer (ValueId),
+                                                    //   src1, src2: integer (ValueId)
+                                                    // }
+
+// === PHI Analysis ===
+list_phis(funcIndex): ArrayBox                      // Returns array of PHI instructions:
+                                                    // {
+                                                    //   block_id: integer,
+                                                    //   dest: integer (ValueId),
+                                                    //   incoming: ArrayBox of
+                                                    //     [value_id, from_block_id]
+                                                    // }
+
+// === Loop Detection ===
+list_loops(funcIndex): ArrayBox                     // Returns array of loop structures:
+                                                    // {
+                                                    //   header_block: integer,
+                                                    //   exit_block: integer,
+                                                    //   back_edge_from: integer,
+                                                    //   contains_blocks: ArrayBox
+                                                    // }
+
+// === If Detection ===
+list_ifs(funcIndex): ArrayBox                       // Returns array of if structures:
+                                                    // {
+                                                    //   condition_block: integer,
+                                                    //   condition_value: integer (ValueId),
+                                                    //   true_block: integer,
+                                                    //   false_block: integer,
+                                                    //   merge_block: integer
+                                                    // }
+
+// === Type Analysis ===
+propagate_types(funcIndex): MapBox                  // Returns type map:
+                                                    // {
+                                                    //   value_id: type_string
+                                                    //   (e.g., "i64", "void", "boxref")
+                                                    // }
+
+// === Control Flow Analysis ===
+reachability_analysis(funcIndex): MapBox            // Returns:
+                                                    // {
+                                                    //   reachable_blocks: ArrayBox,
+                                                    //   unreachable_blocks: ArrayBox
+                                                    // }
+```
+
+**Key Algorithms**:
+
+#### PHI Detection Algorithm
+```
+For each block in function:
+  For each instruction in block:
+    If instruction.op == "phi":
+      Extract destination ValueId
+      For each [value, from_block] in instruction.incoming:
+        Record PHI merge point
+      Mark block as PHI merge block
+```
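+
+The PHI scan above can be sketched in Python as follows. The JSON shape (`blocks`, `instructions`, `op`, `dest`, `incoming`) is an assumption modeled on the method contract in this section, not a confirmed MIR v1 field layout.
+
+```python
+def list_phis(function):
+    """Collect every PHI instruction together with the block that holds it."""
+    phis = []
+    for block in function["blocks"]:
+        for inst in block["instructions"]:
+            if inst["op"] != "phi":
+                continue
+            phis.append({
+                "block_id": block["id"],
+                "dest": inst["dest"],
+                # each incoming entry is a [value_id, from_block_id] pair
+                "incoming": [tuple(pair) for pair in inst["incoming"]],
+            })
+    return phis
+
+# Hypothetical function: block 3 merges value 4 (from block 1) and value 5 (from block 2)
+example = {
+    "blocks": [
+        {"id": 0, "instructions": [{"op": "branch"}]},
+        {"id": 3, "instructions": [
+            {"op": "phi", "dest": 6, "incoming": [[4, 1], [5, 2]]},
+            {"op": "ret"},
+        ]},
+    ]
+}
+phis = list_phis(example)
+```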
+
+#### Loop Detection Algorithm (CFG-based)
+```
+Build successor adjacency list from CFG (block → [successor blocks])
+For each block B:
+  For each successor S of B:
+    If S's block_id < B's block_id:   // heuristic: assumes block IDs follow layout order
+      Found backward edge B → S
+      S is loop header
+      Find all blocks in loop by walking predecessors backward from B until S is reached
+      Record loop structure
+```
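+
+A Python sketch of the back-edge approach, under the assumption that the CFG is already available as a `block_id -> successors` mapping. The loop body is collected with the standard natural-loop construction (a backward walk over predecessors starting at the back-edge source); `exit_block` is left out to keep the sketch short.
+
+```python
+def list_loops(successors):
+    """successors: dict mapping block_id -> list of successor block_ids."""
+    # Invert the edges once so loop bodies can be walked backward.
+    predecessors = {b: [] for b in successors}
+    for b, succs in successors.items():
+        for s in succs:
+            predecessors.setdefault(s, []).append(b)
+
+    loops = []
+    for b in sorted(successors):
+        for s in successors[b]:
+            if s < b:  # backward edge b -> s, so s is treated as the loop header
+                body = {s}
+                stack = [b]
+                while stack:
+                    n = stack.pop()
+                    if n not in body:
+                        body.add(n)
+                        stack.extend(predecessors.get(n, []))
+                loops.append({
+                    "header_block": s,
+                    "back_edge_from": b,
+                    "contains_blocks": sorted(body),
+                })
+    return loops
+
+# Hypothetical CFG: 0 = entry, 1 = header, 2 = body, 3 = exit
+cfg = {0: [1], 1: [2, 3], 2: [1], 3: []}
+loops = list_loops(cfg)
+```
+
+The `s < b` test is only a heuristic: it depends on block IDs being assigned in layout order, so a dominator-based check may be needed if that assumption does not hold for emitted MIR.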
+
+#### If Detection Algorithm
+```
+For each block B with Branch instruction:
+  condition = branch.condition (ValueId)
+  true_block = branch.targets[0]
+  false_block = branch.targets[1]
+
+  For each successor block S of true_block OR false_block:
+    If S has PHI instruction with incoming from both true_block AND false_block:
+      S is the merge block
+      Record if structure
+```
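+
+The merge-block search can be sketched in Python like this. The per-block dictionary layout (`branch`, `successors`, `phi_sources`) is a hypothetical pre-digested form chosen for the example, not the raw MIR JSON.
+
+```python
+def list_ifs(blocks):
+    """blocks maps block_id -> {"branch": (cond, true_id, false_id) or None,
+    "successors": [ids], "phi_sources": [set of predecessor ids, one per PHI]}."""
+    ifs = []
+    for block_id in sorted(blocks):
+        branch = blocks[block_id]["branch"]
+        if branch is None:
+            continue
+        condition, true_block, false_block = branch
+        arms = {true_block, false_block}
+        candidates = blocks[true_block]["successors"] + blocks[false_block]["successors"]
+        for s in candidates:
+            # a merge block has a PHI fed by both arms of the branch
+            if any(arms <= sources for sources in blocks[s]["phi_sources"]):
+                ifs.append({
+                    "condition_block": block_id,
+                    "condition_value": condition,
+                    "true_block": true_block,
+                    "false_block": false_block,
+                    "merge_block": s,
+                })
+                break
+    return ifs
+
+# Hypothetical diamond: block 0 branches on value 7, block 3 merges both arms
+blocks = {
+    0: {"branch": (7, 1, 2), "successors": [1, 2], "phi_sources": []},
+    1: {"branch": None, "successors": [3], "phi_sources": []},
+    2: {"branch": None, "successors": [3], "phi_sources": []},
+    3: {"branch": None, "successors": [], "phi_sources": [{1, 2}]},
+}
+ifs = list_ifs(blocks)
+```
+
+A branch whose arms never meet at a PHI (for example a loop-condition branch) produces no entry here, which matches the algorithm's definition of an if structure as branch plus PHI merge.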
+
+#### Type Propagation Algorithm
+```
+Initialize: type_map[v] = v.hint (from Const/Compare/BinOp)
+Iterate 4 times:  // Maximum iterations before convergence
+  For each PHI instruction:
+    incoming_types = [type_map[v] for each [v, _] in phi.incoming]
+    Merge types: take most specific common type
+    type_map[phi.dest] = merged_type
+
+  For each BinOp/Compare/etc:
+    Propagate operand types to result
+```
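+
+A Python sketch of the bounded propagation loop. The merge rule is an assumption standing in for "most specific common type": unknown inputs are ignored, and a value stays untyped when its known inputs disagree. The input layout (`hints`, `phis`, `binops`) is hypothetical.
+
+```python
+MAX_ITERATIONS = 4
+
+def merge_types(types):
+    """Return the single known type among the inputs, or None."""
+    known = {t for t in types if t is not None}
+    if len(known) == 1:
+        return known.pop()
+    return None  # no information yet, or conflicting hints
+
+def propagate_types(hints, phis, binops):
+    """hints: {value_id: type}; phis: {dest: [incoming value ids]};
+    binops: {dest: (lhs, rhs)}."""
+    type_map = dict(hints)
+    for _ in range(MAX_ITERATIONS):
+        changed = False
+        for dest, incoming in phis.items():
+            merged = merge_types(type_map.get(v) for v in incoming)
+            if merged is not None and type_map.get(dest) != merged:
+                type_map[dest] = merged
+                changed = True
+        for dest, (lhs, rhs) in binops.items():
+            merged = merge_types([type_map.get(lhs), type_map.get(rhs)])
+            if merged is not None and type_map.get(dest) != merged:
+                type_map[dest] = merged
+                changed = True
+        if not changed:
+            break  # reached a fixed point before the iteration cap
+    return type_map
+
+# Hypothetical loop counter: v2 = phi(v1, v3), v3 = v2 + v4, with v1 and v4 constants
+types = propagate_types({1: "i64", 4: "i64"}, {2: [1, 3]}, {3: (2, 4)})
+```
+
+The early exit on a pass with no changes means simple cases stop well before the fourth iteration; the cap only matters for long PHI chains.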
+
+---
+
+### 1.3 JoinIrAnalyzerBox (Secondary Analyzer)
+
+**Purpose**: Analyze JoinIR JSON v0 (CPS-style format)
+
+**Scope**: JoinIR-specific analysis operations
+
+**Responsibilities**:
+1. **Schema Validation**: Verify JoinIR JSON has required fields
+2. **Continuation Extraction**: Parse CPS-style continuation structures
+3. **Direct Conversion to MIR**: Transform JoinIR JSON to MIR-compatible format
+4. **Backward Compatibility**: Support legacy JoinIR analysis workflows
+
+**Key Methods**:
+```
+birth(joinirJsonText)                               // Parse JoinIR JSON
+
+validateSchema(): boolean                            // Check JoinIR v0 structure
+
+// === JoinIR-Specific Analysis ===
+list_continuations(funcIndex): ArrayBox            // Returns continuation structures
+
+// === Conversion ===
+convert_to_mir(funcIndex): string                  // Returns MIR JSON equivalent
+                                                   // (enables reuse of MirAnalyzerBox)
+```
+
+**Note on Design**: JoinIrAnalyzerBox is intentionally minimal - its primary purpose is converting JoinIR to MIR format, then delegating to MirAnalyzerBox for actual analysis. This avoids code duplication.
+
+---
+
+## 2. Shared Infrastructure
+
+### 2.1 AnalyzerCommonBox (Base Utilities)
+
+**Purpose**: Common helper methods used by both analyzers
+
+**Key Methods**:
+```
+// === Utility Methods ===
+extract_function(funcIndex: integer): MapBox       // Extract single function data
+extract_cfg(funcIndex: integer): MapBox             // Extract CFG for block analysis
+build_adjacency_list(cfg): MapBox                  // Build block→blocks adjacency
+
+// === Debugging/Tracing ===
+set_verbose(enabled: boolean)                      // Enable detailed output
+dump_function(funcIndex): string                   // Pretty-print function data
+dump_cfg(funcIndex): string                        // Pretty-print CFG
+```
+
+---
+
+## 3. Data Flow Architecture
+
+```
+JSON Input (MIR or JoinIR)
+    ↓
+JsonParserBox (Parse to MapBox/ArrayBox)
+    ↓
+    ├─→ MirAnalyzerBox → Semantic Analysis
+    │       ↓
+    │   (PHI detection, loop detection, etc.)
+    │       ↓
+    │   Analysis Results (ArrayBox/MapBox)
+    │
+    └─→ JoinIrAnalyzerBox → Convert to MIR
+            ↓
+        (Transform JoinIR → MIR)
+            ↓
+        MirAnalyzerBox (reuse)
+            ↓
+        Analysis Results
+```
+
+---
+
+## 4. API Contract: Method Signatures (Finalized)
+
+### MirAnalyzerBox
+
+```hako
+static box MirAnalyzerBox {
+    // Parser state
+    parsed_mir: MapBox
+    json_parser: JsonParserBox
+
+    // Analysis cache
+    func_cache: MapBox          // Memoization for expensive operations
+    verbose_mode: BoolBox
+
+    // Constructor
+    birth(mir_json_text: string) {
+        me.json_parser = new JsonParserBox(mir_json_text)  // parsing happens in JsonParserBox.birth()
+        me.parsed_mir = me.json_parser.root
+        me.func_cache = new MapBox()
+        me.verbose_mode = false
+    }
+
+    // === Validation ===
+    validateSchema(): BoolBox {
+        // Returns true if MIR v1 schema valid
+    }
+
+    // === Analysis Methods ===
+    summarize_function(funcIndex: IntegerBox): MapBox {
+        // Returns { name, params, blocks, instructions, has_loops, has_ifs, has_phis }
+    }
+
+    list_instructions(funcIndex: IntegerBox): ArrayBox {
+        // Returns array of { block_id, inst_index, op, dest, src1, src2 }
+    }
+
+    list_phis(funcIndex: IntegerBox): ArrayBox {
+        // Returns array of { block_id, dest, incoming }
+    }
+
+    list_loops(funcIndex: IntegerBox): ArrayBox {
+        // Returns array of { header_block, exit_block, back_edge_from, contains_blocks }
+    }
+
+    list_ifs(funcIndex: IntegerBox): ArrayBox {
+        // Returns array of { condition_block, condition_value, true_block, false_block, merge_block }
+    }
+
+    propagate_types(funcIndex: IntegerBox): MapBox {
+        // Returns { value_id: type_string }
+    }
+
+    reachability_analysis(funcIndex: IntegerBox): MapBox {
+        // Returns { reachable_blocks, unreachable_blocks }
+    }
+
+    // === Debugging ===
+    set_verbose(enabled: BoolBox) { }
+    dump_function(funcIndex: IntegerBox): StringBox { }
+    dump_cfg(funcIndex: IntegerBox): StringBox { }
+}
+```
+
+### JsonParserBox
+
+```hako
+static box JsonParserBox {
+    root: MapBox
+
+    birth(json_text: string) {
+        // Parse JSON text into MapBox/ArrayBox structure
+    }
+
+    get(path: string): any {
+        // Get value by dot-notation path
+    }
+
+    getArray(path: string): ArrayBox { }
+    getString(path: string): string { }
+    getInt(path: string): integer { }
+    getBool(path: string): boolean { }
+}
+```
+
+---
+
+## 5. Implementation Strategy
+
+### Phase 161-2: Basic MirAnalyzerBox Structure (First Iteration)
+
+**Scope**: Get basic structure working, focus on `summarize_function()` and `list_instructions()`
+
+1. Implement JsonParserBox (simple recursive MapBox builder)
+2. Implement MirAnalyzerBox.birth() to parse MIR JSON
+3. Implement validateSchema() to verify structure
+4. Implement summarize_function() (basic field extraction)
+5. Implement list_instructions() (iterate blocks, extract instructions)
+
+**Success Criteria**:
+- Can parse MIR JSON test files
+- Can extract function metadata
+- Can list all instructions in order
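+
+As a rough illustration of the first-iteration scope, the field extraction behind `summarize_function()` might look like the Python sketch below. The field names (`name`, `params`, `blocks`, `instructions`, `op`) and the `"branch"` opcode string are assumptions; `has_loops` is omitted because it needs the CFG analysis planned for Phase 161-3.
+
+```python
+def summarize_function(function):
+    """Extract basic metadata from one function entry of the MIR JSON."""
+    blocks = function["blocks"]
+    ops = [inst["op"] for block in blocks for inst in block["instructions"]]
+    return {
+        "name": function["name"],
+        "params": len(function.get("params", [])),
+        "blocks": len(blocks),
+        "instructions": len(ops),
+        "has_phis": "phi" in ops,
+        "has_ifs": "branch" in ops,  # coarse: any branch counts at this stage
+    }
+
+# Hypothetical two-block function
+function = {
+    "name": "main",
+    "params": [],
+    "blocks": [
+        {"id": 0, "instructions": [{"op": "const"}, {"op": "branch"}]},
+        {"id": 1, "instructions": [{"op": "phi"}, {"op": "ret"}]},
+    ],
+}
+summary = summarize_function(function)
+```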
+
+---
+
+### Phase 161-3: PHI/Loop/If Detection
+
+**Scope**: Advanced control flow analysis
+
+1. Implement list_phis() using pattern matching
+2. Implement list_loops() using CFG and backward edge detection
+3. Implement list_ifs() using condition and merge detection
+4. Test on representative functions
+
+**Success Criteria**:
+- Correctly identifies all PHI instructions
+- Correctly detects loop header and back edges
+- Correctly identifies if/merge structures
+
+---
+
+### Phase 161-4: Type Propagation
+
+**Scope**: Type hint system
+
+1. Implement type extraction from Const/Compare/BinOp
+2. Implement 4-iteration propagation algorithm
+3. Build type map for ValueId
+
+**Success Criteria**:
+- Type map captures all reachable types
+- No type conflicts or inconsistencies
+
+---
+
+### Phase 161-5: Analysis Features
+
+**Scope**: Extended functionality
+
+1. Implement reachability analysis (mark unreachable blocks)
+2. Implement dump methods for debugging
+3. Add caching to expensive operations
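+
+Reachability can be computed with a plain breadth-first walk from the entry block. The Python sketch below assumes the CFG is a `block_id -> successors` mapping and that block 0 is the entry; both are assumptions for illustration.
+
+```python
+from collections import deque
+
+def reachability_analysis(successors, entry=0):
+    """Split blocks into those reachable from the entry block and the rest."""
+    reachable = set()
+    queue = deque([entry])
+    while queue:
+        block = queue.popleft()
+        if block in reachable:
+            continue
+        reachable.add(block)
+        queue.extend(successors.get(block, []))
+    return {
+        "reachable_blocks": sorted(reachable),
+        "unreachable_blocks": sorted(set(successors) - reachable),
+    }
+
+# Hypothetical CFG where block 2 has no incoming path from the entry
+result = reachability_analysis({0: [1], 1: [], 2: [1]})
+```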
+
+---
+
+## 6. Representative Functions for Testing
+
+Per Task 3 selection criteria, these functions will be used for Phase 161-2+ validation:
+
+1. **if_select_simple** (Simple if/else with PHI)
+   - 4 BasicBlocks
+   - 1 Branch instruction
+   - 1 PHI instruction at merge
+   - Type: Simple if pattern
+
+2. **min_loop** (Minimal loop with PHI)
+   - 2 BasicBlocks (header + body)
+   - Loop back edge
+   - PHI instruction at header
+   - Type: Loop pattern
+
+3. **skip_ws** (From JoinIR, more complex)
+   - 6+ BasicBlocks
+   - Nested control flow
+   - Multiple PHI instructions
+   - Type: Complex pattern
+
+**Usage**: Each will be analyzed by MirAnalyzerBox to verify correctness of detection algorithms.
+
+---
+
+## 7. Design Principles Applied
+
+### 🏗️ Boxification
+- Each analyzer box has single responsibility
+- Clear API boundary (methods) with defined input/output contracts
+- No shared mutable state between boxes
+
+### 🌳 Clear Boundaries
+- JsonParserBox: Low-level JSON only
+- MirAnalyzerBox: MIR semantics only
+- JoinIrAnalyzerBox: JoinIR conversion only
+- No intermingling of concerns
+
+### ⚡ Fail-Fast
+- validateSchema() must pass or error (no silent failures)
+- Invalid instruction types cause immediate error
+- Type propagation inconsistencies detected and reported
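+
+A fail-fast schema check could be sketched in Python as below. The required field names come from the Schema Validation responsibility in section 1.2; the `SchemaError` type and the extra array check are hypothetical additions for the example.
+
+```python
+REQUIRED_FIELDS = ("schema_version", "functions", "cfg")
+
+class SchemaError(ValueError):
+    """Raised as soon as the MIR JSON is found to be malformed."""
+
+def validate_schema(mir):
+    missing = [field for field in REQUIRED_FIELDS if field not in mir]
+    if missing:
+        raise SchemaError("MIR JSON missing required fields: " + ", ".join(missing))
+    if not isinstance(mir["functions"], list):
+        raise SchemaError("'functions' must be an array")
+    return True
+
+valid = validate_schema({"schema_version": "1", "functions": [], "cfg": {}})
+try:
+    validate_schema({"functions": []})
+    error_message = None
+except SchemaError as exc:
+    error_message = str(exc)
+```
+
+Raising instead of returning `false` makes the failure impossible to ignore, which is the point of the principle; a boolean-returning wrapper can still be layered on top for callers that only need a yes/no answer.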
+
+### 🔄 Lazy Singleton (Lazy Evaluation)
+- Each method computes its result on-demand
+- Results are cached in func_cache to avoid recomputation
+- No pre-computation of unnecessary analysis
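+
+The on-demand caching described here can be sketched in Python as a small memoizing wrapper. `CachedAnalyzer`, `query`, and the `computations` counter are hypothetical names used only for this illustration.
+
+```python
+class CachedAnalyzer:
+    """Compute each analysis on first request and reuse the stored result."""
+
+    def __init__(self, analyses):
+        self._analyses = analyses      # name -> function(func_index)
+        self._func_cache = {}          # (name, func_index) -> result
+        self.computations = 0          # counts real computations, for illustration
+
+    def query(self, name, func_index):
+        key = (name, func_index)
+        if key not in self._func_cache:
+            self.computations += 1
+            self._func_cache[key] = self._analyses[name](func_index)
+        return self._func_cache[key]
+
+analyzer = CachedAnalyzer({"list_loops": lambda func_index: [{"header_block": 1}]})
+first = analyzer.query("list_loops", 0)
+second = analyzer.query("list_loops", 0)
+```
+
+Keying the cache on both the analysis name and the function index keeps results for different functions separate while still letting `summarize_function()` reuse loop and PHI results computed earlier.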
+
+---
+
+## 8. Questions Answered by This Design
+
+**Q: Why two separate analyzer boxes?**
+A: MIR and JoinIR have fundamentally different schemas. Separate boxes with clear single responsibilities are easier to test, maintain, and extend.
+
+**Q: Why separate JsonParserBox?**
+A: JSON parsing is orthogonal to semantic analysis. Extracting it enables reuse and makes testing easier.
+
+**Q: Why caching?**
+A: Control flow analysis is expensive (CFG traversal, reachability). Caching prevents redundant computation when multiple methods query the same data.
+
+**Q: Why 4 iterations for type propagation?**
+A: Based on Phase 25 experience, four iterations handle most practical programs. Documented in phase161_joinir_analyzer_design.md.
+
+---
+
+## 9. Next Steps (Task 3)
+
+Once this design is approved:
+
+1. **Task 3**: Formally select 3-5 representative functions that cover all detection patterns
+2. **Task 4**: Implement basic .hako JsonParserBox and MirAnalyzerBox
+3. **Task 5**: Create joinir_analyze.sh CLI entry point
+4. **Task 6**: Test on representative functions
+5. **Task 7**: Update CURRENT_TASK.md and roadmap
+
+---
+
+## 10. References
+
+- **Phase 161 Task 1**: [phase161_joinir_analyzer_design.md](phase161_joinir_analyzer_design.md) - JSON schema inventory
+- **Phase 173-B**: [phase173b-boxification-assessment.md](phase173b-boxification-assessment.md) - Boxification design principles
+- **MIR INSTRUCTION_SET**: [docs/reference/mir/INSTRUCTION_SET.md](../../../reference/mir/INSTRUCTION_SET.md)
+- **Box System**: [docs/reference/boxes-system/](../../../reference/boxes-system/)
+
+---
+
+**Status**: 🎯 Ready for Task 3 approval and representative function selection
--- a/docs/development/current/main/phase161_representative_functions.md
+++ b/docs/development/current/main/phase161_representative_functions.md
@ -0,0 +1,576 @@
+# Phase 161 Task 3: Representative Functions Selection
+
+**Status**: 🎯 **SELECTION PHASE** - Identifying test functions that cover all analyzer patterns
+
+**Objective**: Select 5-7 representative functions from the codebase that exercise all key analysis patterns (if/loop/phi detection, type propagation, CFG analysis) to serve as the validation test suite for Phase 161-2+.
+
+---
+
+## Executive Summary
+
+Phase 161-2 will implement the MirAnalyzerBox with core methods:
+- `summarize_function()`
+- `list_instructions()`
+- `list_phis()`
+- `list_loops()`
+- `list_ifs()`
+
+To ensure complete correctness, we need a **minimal but comprehensive test suite** that covers:
+1. Simple if/else with single PHI merge
+2. Loop with back edge and loop-carried PHI
+3. Nested if/loop (complex control flow)
+4. Loop with multiple exits (break/continue patterns)
+5. Complex PHI with multiple incoming values
+
+This document identifies the best candidates from the existing Rust codebase and defines minimal synthetic test cases.
+
+---
+
+## 1. Selection Criteria
+
+Each representative function must:
+
+✅ **Coverage**: Exercise at least one unique analysis pattern not covered by others
+✅ **Minimal**: Simple enough to understand completely (~100 instructions max)
+✅ **Realistic**: Based on actual Nyash code patterns, not artificial
+✅ **Debuggable**: MIR JSON output human-readable and easy to trace
+✅ **Fast**: Emits MIR in <100ms
+
+---
+
+## 2. Representative Function Patterns
+
+### Pattern 1: Simple If/Else (PHI Merge)
+**Analysis Focus**: Branch detection, if-merge identification, single PHI
+
+**Structure**:
+```
+Block 0: if condition
+  ├─ Block 1: true_branch
+  │  └─ Block 3: merge (PHI)
+  └─ Block 2: false_branch
+     └─ Block 3: merge (PHI)
+```
+
+**What to verify**:
+- Branch instruction detected correctly
+- Merge block identified as "if merge"
+- PHI instruction found with 2 incoming values
+- Both branches' ValueIds appear in PHI incoming
+
+**Representative Function**: `if_select_simple` (already in JSON snippets from Task 1)
+
+---
+
+### Pattern 2: Simple Loop (Back Edge + Loop PHI)
+**Analysis Focus**: Loop detection, back edge identification, loop-carried PHI
+
+**Structure**:
+```
+Block 0: loop entry
+  └─ Block 1: loop header (PHI)
+     ├─ Block 2: loop body
+     │  └─ Block 1 (back edge) ← backward jump
+     └─ Block 3: loop exit
+```
+
+**What to verify**:
+- Backward edge detected (Block 2 → Block 1)
+- Block 1 identified as loop header
+- PHI instruction at header with incoming from [Block 0, Block 2]
+- Loop body blocks identified correctly
+
+**Representative Function**: `min_loop` (already in JSON snippets from Task 1)
+
+---
+
+### Pattern 3: If Inside Loop (Nested Control Flow)
+**Analysis Focus**: Complex PHI detection, nested block analysis
+
+**Structure**:
+```
+Block 0: loop entry
+  └─ Block 1: loop header (PHI)
+     ├─ Block 2: if condition (Branch)
+     │  ├─ Block 3: true branch
+     │  │  └─ Block 5: if merge (PHI)
+     │  └─ Block 4: false branch
+     │     └─ Block 5: if merge (PHI)
+     │
+     └─ Block 5: if merge (PHI)
+        └─ Block 1 (loop back edge)
+```
+
+**What to verify**:
+- 2 PHI instructions identified (Block 1 loop PHI + Block 5 if PHI)
+- Loop header and back edge detected despite nested if
+- Both PHI instructions have correct incoming values
+
+**Representative Function**: Candidate search needed
+
+---
+
+### Pattern 4: Loop with Break (Multiple Exits)
+**Analysis Focus**: Loop with multiple exit paths, complex PHI
+
+**Structure**:
+```
+Block 0: loop entry
+  └─ Block 1: loop header (PHI)
+     ├─ Block 2: condition (Branch for break)
+     │  ├─ Block 3: break taken
+     │  │  └─ Block 5: exit merge (PHI)
+     │  └─ Block 4: break not taken
+     │     └─ Block 1 (loop back)
+     └─ Block 5: exit merge (PHI)
+```
+
+**What to verify**:
+- Single loop detected (header Block 1)
+- TWO exit blocks (normal exit + break exit)
+- Exit PHI correctly merges both paths
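+
+One way to check the multiple-exit property is to list every edge that leaves the loop body. The Python sketch below transcribes the edges from the structure above; treating blocks 1, 2, and 4 as the loop body is an assumption (they are the blocks that can reach the back edge 4 → 1).
+
+```python
+def loop_exit_edges(successors, loop_blocks):
+    """Return (from, to) edges that leave the loop body, in sorted order."""
+    body = set(loop_blocks)
+    return sorted(
+        (block, succ)
+        for block in body
+        for succ in successors.get(block, [])
+        if succ not in body
+    )
+
+# Edges of Pattern 4: header 1, break check 2, break taken 3, loop back 4, exit merge 5
+cfg = {0: [1], 1: [2, 5], 2: [3, 4], 3: [5], 4: [1], 5: []}
+exits = loop_exit_edges(cfg, [1, 2, 4])
+```
+
+Under these assumptions the loop has two exit edges, one from the header (normal exit) and one from the break check, and both paths end at Block 5.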
+
+**Representative Function**: Candidate search needed
+
+---
+
+### Pattern 5: Multiple Nested PHI (Type Propagation)
+**Analysis Focus**: Type hint propagation through multiple PHI layers
+
+**Structure**:
+```
+Loop with PHI type carries through multiple blocks:
+- Block 1 (PHI): integer init value → copies type
+- Block 2 (BinOp): type preserved through arithmetic
+- Block 3 (PHI merge): receives from multiple paths
+- Block 4 (Compare): uses PHI result
+```
+
+**What to verify**:
+- Type propagation correctly tracks through PHI chain
+- Final type map is consistent
+- No conflicts in type inference
+
+**Representative Function**: Candidate search needed
+
+---
+
+## 3. Candidate Analysis from Codebase
+
+### Search Strategy
+To find representative functions, we search for:
+1. Simple if/loop functions in test code
+2. Functions with interesting MIR patterns
+3. Functions that stress-test analyzer
+
+### Candidates Found
+
+#### Candidate A: Simple If (CONFIRMED ✅)
+**Source**: `apps/tests/if_simple.hako` or similar
+**Status**: Already documented in Task 1 JSON snippets as `if_select_simple`
+**Properties**:
+- 4 blocks
+- 1 branch instruction
+- 1 PHI instruction
+- Simple, clean structure
+
+**Decision**: ✅ SELECTED as Pattern 1
+
+---
+
+#### Candidate B: Simple Loop (CONFIRMED ✅)
+**Source**: `apps/tests/loop_min.hako` or similar
+**Status**: Already documented in Task 1 JSON snippets as `min_loop`
+**Properties**:
+- 2-3 blocks
+- Loop back edge
+- 1 PHI instruction at header
+- Minimal but representative
+
+**Decision**: ✅ SELECTED as Pattern 2
+
+---
+
+#### Candidate C: If-Loop Combination
+**Source**: Search for `loop(...)` with nested `if` statements
+**Pattern**: Nyash code like:
+```
+loop(condition) {
+    if (x == 5) {
+        result = 10
+    } else {
+        result = 20
+    }
+    x = x + 1
+}
+```
+
+**Search Command**:
+```bash
+rg "loop\s*\(" apps/tests/*.hako | head -20
+rg -A 5 "loop\s*\(" apps/tests/*.hako | rg "if\s*\(" | head -20
+```
+
+**Decision**: Requires search - **PENDING**
+
+---
+
+#### Candidate D: Loop with Break
+**Source**: Search for `break` statements inside loops
+**Pattern**: Nyash code like:
+```
+loop(i < 10) {
+    if (i == 5) {
+        break
+    }
+    i = i + 1
+}
+```
+
+**Search Command**:
+```bash
+rg "break" apps/tests/*.hako | head -20
+```
+
+**Decision**: Requires search - **PENDING**
+
+---
+
+#### Candidate E: Complex Control Flow
+**Source**: Real compiler code patterns
+**Pattern**: Functions like MIR emitters or AST walkers
+
+**Search Command**:
+```bash
+rg "PHI|phi" docs/development/current/main/phase161_joinir_analyzer_design.md | head -10
+```
+
+**Decision**: Requires analysis - **PENDING**
+
+---
+
+## 4. Formal Representative Function Selection
+
+Based on analysis, here are the **FINAL 5 REPRESENTATIVES**:
+
+### Representative 1: Simple If/Else with PHI Merge ✅
+
+**Name**: `if_select_simple`
+**Source**: Synthetic minimal test case
+**File**: `local_tests/phase161/rep1_if_simple.hako`
+**Nyash Code**:
+```hako
+box Main {
+    main() {
+        local x = 5
+        local result
+
+        if x > 3 {
+            result = 10
+        } else {
+            result = 20
+        }
+
+        print(result)  // PHI merge here
+    }
+}
+```
+
+**MIR Structure**:
+- Block 0: entry, load x
+- Block 1: branch on condition
+  - true → Block 2
+  - false → Block 3
+- Block 2: const 10 → Block 4
+- Block 3: const 20 → Block 4
+- Block 4: PHI instruction, merge results
+- Block 5: call print
+
+**Analyzer Verification**:
+- `list_phis()` returns 1 PHI (destination for merged values)
+- `list_ifs()` returns 1 if structure with merge_block=4
+- `summarize_function()` reports has_ifs=true, has_phis=true
+
+**Test Assertions**:
+```
+✓ exactly 1 PHI found
+✓ PHI has 2 incoming values
+✓ merge_block correctly identified
+✓ both true_block and false_block paths lead to merge
+```
+
+---
+
+### Representative 2: Simple Loop with Back Edge ✅
+
+**Name**: `min_loop`
+**Source**: Synthetic minimal test case
+**File**: `local_tests/phase161/rep2_loop_simple.hako`
+**Nyash Code**:
+```hako
+box Main {
+    main() {
+        local i = 0
+
+        loop(i < 10) {
+            print(i)
+            i = i + 1    // PHI at header carries i value
+        }
+    }
+}
+```
+
+**MIR Structure**:
+- Block 0: entry, i = 0
+  └→ Block 1: loop header
+- Block 1: PHI instruction (incoming from Block 0 initial, Block 3 loop-carry)
+  └─ Block 2: branch condition
+  ├─ true → Block 3: loop body
+  │        └→ Block 1 (back edge)
+  └─ false → Block 4: exit
+
+**Analyzer Verification**:
+- `list_loops()` returns 1 loop (header=Block 1, back_edge from Block 3)
+- `list_phis()` returns 1 PHI at Block 1
+- CFG correctly identifies backward edge (Block 3 → Block 1)
+
+**Test Assertions**:
+```
+✓ exactly 1 loop detected
+✓ loop header correctly identified as Block 1
+✓ back edge from Block 3 to Block 1
+✓ loop body blocks identified (Block 2, 3)
+✓ exit block correctly identified
+```
+
+---
+
+### Representative 3: Nested If Inside Loop
+
+**Name**: `if_in_loop`
+**Source**: Real Nyash pattern
+**File**: `local_tests/phase161/rep3_if_loop.hako`
+**Nyash Code**:
+```hako
+box Main {
+    main() {
+        local i = 0
+        local sum = 0
+
+        loop(i < 10) {
+            if i % 2 == 0 {
+                sum = sum + i
+            } else {
+                sum = sum - i
+            }
+            i = i + 1
+        }
+
+        print(sum)
+    }
+}
+```
+
+**MIR Structure**:
+- Block 0: entry
+  └→ Block 1: loop header (PHI for i, sum)
+- Block 1: PHI × 2 (for i and sum loop carries)
+  ├─ Block 2: condition (i < 10)
+  │  ├─ Block 3: inner condition (i % 2 == 0)
+  │  │  ├─ Block 4: true → sum = sum + i
+  │  │  │         └→ Block 6: if merge
+  │  │  └─ Block 5: false → sum = sum - i
+  │  │            └→ Block 6: if merge (PHI)
+  │  │
+  │  └─ Block 7: i = i + 1
+  │           └→ Block 1 (back edge, loop carry for i, sum)
+  └─ Block 8: exit
+
+**Analyzer Verification**:
+- `list_loops()` returns 1 loop (header=Block 1)
+- `list_phis()` returns 3 PHI instructions:
+  - Block 1: 2 PHIs (for i and sum)
+  - Block 6: 1 PHI (if merge)
+- `list_ifs()` returns 1 if structure (nested inside loop)
+
+**Test Assertions**:
+```
+✓ 1 loop and 1 if detected
+✓ 3 total PHI instructions found (2 at header, 1 at merge)
+✓ nested structure correctly represented
+```
+
+---
+
+### Representative 4: Loop with Break Statement
+
+**Name**: `loop_with_break`
+**Source**: Real Nyash pattern
+**File**: `local_tests/phase161/rep4_loop_break.hako`
+**Nyash Code**:
+```hako
+box Main {
+    main() {
+        local i = 0
+
+        loop(true) {
+            if i == 5 {
+                break
+            }
+            print(i)
+            i = i + 1
+        }
+    }
+}
+```
+
+**MIR Structure**:
+- Block 0: entry
+  └→ Block 1: loop header (PHI for i)
+- Block 1: PHI for i
+  └─ Block 2: condition (i == 5)
+  ├─ Block 3: if true (break)
+  │        └→ Block 6: exit
+  └─ Block 4: if false (continue loop)
+     ├─ Block 5: loop body
+     │        └→ Block 1 (back edge)
+     └─ Block 6: exit (merge from break)
+
+**Analyzer Verification**:
+- `list_loops()` returns 1 loop with 2 exits (normal + break)
+- `list_ifs()` returns 1 if (the break condition check)
+- Exit reachability correct (2 paths to Block 6)
+
+**Test Assertions**:
+```
+✓ 1 loop detected
+✓ multiple exit paths identified
+✓ break target correctly resolved
+```
+
+---
+
+### Representative 5: Type Propagation Test
+
+**Name**: `type_propagation_loop`
+**Source**: Compiler stress test
+**File**: `local_tests/phase161/rep5_type_prop.hako`
+**Nyash Code**:
+```hako
+box Main {
+    main() {
+        local x: integer = 0
+        local y: integer = 10
+
+        loop(x < y) {
+            local z = x + 1     // type: i64
+            if z > 5 {
+                x = z * 2       // type: i64
+            } else {
+                x = z - 1       // type: i64
+            }
+        }
+
+        print(x)
+    }
+}
+```
+
+**MIR Structure**:
+- Multiple PHI instructions carrying i64 type
+- BinOp instructions propagating type
+- Compare operations with type hints
+
+**Analyzer Verification**:
+- `propagate_types()` returns type_map with all values typed correctly
+- Type propagation converges within 4 iterations
+- No type conflicts detected
+
+**Test Assertions**:
+```
+✓ type propagation completes
+✓ all ValueIds have consistent types
+✓ PHI merges compatible types
+```
+
+---
+
+## 5. Test File Creation
+
+These 5 functions will be stored in `local_tests/phase161/`:
+
+```
+local_tests/phase161/
+├── README.md                      (setup instructions)
+├── rep1_if_simple.hako           (if/else pattern)
+├── rep1_if_simple.mir.json       (reference MIR output)
+├── rep2_loop_simple.hako         (loop pattern)
+├── rep2_loop_simple.mir.json
+├── rep3_if_loop.hako             (nested if/loop)
+├── rep3_if_loop.mir.json
+├── rep4_loop_break.hako          (loop with break)
+├── rep4_loop_break.mir.json
+├── rep5_type_prop.hako           (type propagation)
+├── rep5_type_prop.mir.json
+└── expected_outputs.json         (analyzer output validation)
+```
+
+Each `.mir.json` file contains the reference MIR output that MirAnalyzerBox should parse and analyze.
+
+---
+
+## 6. Validation Strategy for Phase 161-2
+
+When MirAnalyzerBox is implemented, it will be tested as:
+
+```
+For each representative function rep_N:
+  1. Load rep_N.mir.json
+  2. Create MirAnalyzerBox(json_text)
+  3. Call each analyzer method
+  4. Compare output with expected_outputs.json[rep_N]
+  5. Verify: {
+       - PHIs found: N ✓
+       - Loops detected: M ✓
+       - Ifs detected: K ✓
+       - Types propagated correctly ✓
+     }
+```
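+
+The comparison step could be sketched in Python as below. The layout of `expected_outputs.json` (one entry per test file name with `phis`, `loops`, and `ifs` counts) is an assumption; the counts for the two entries shown follow the assertions listed for Representatives 1 and 3.
+
+```python
+def check_representative(name, actual, expected_outputs):
+    """Compare analyzer counts for one case; return mismatch messages."""
+    expected = expected_outputs[name]
+    failures = []
+    for key in ("phis", "loops", "ifs"):
+        if actual.get(key) != expected[key]:
+            failures.append(
+                f"{name}: {key} expected {expected[key]}, got {actual.get(key)}"
+            )
+    return failures
+
+# Hypothetical contents of expected_outputs.json
+expected_outputs = {
+    "rep1_if_simple": {"phis": 1, "loops": 0, "ifs": 1},
+    "rep3_if_loop": {"phis": 3, "loops": 1, "ifs": 1},
+}
+ok = check_representative(
+    "rep1_if_simple", {"phis": 1, "loops": 0, "ifs": 1}, expected_outputs
+)
+bad = check_representative(
+    "rep3_if_loop", {"phis": 2, "loops": 1, "ifs": 1}, expected_outputs
+)
+```
+
+Returning messages rather than a single boolean makes it easier to see which detection method regressed when a representative fails.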
+
+---
+
+## 7. Quick Reference: Selection Summary
+
+| # | Name | Pattern | File | Complexity |
+|---|------|---------|------|------------|
+| 1 | if_simple | if/else+PHI | rep1_if_simple.hako | ⭐ Simple |
+| 2 | loop_simple | loop+back-edge | rep2_loop_simple.hako | ⭐ Simple |
+| 3 | if_loop | nested if/loop | rep3_if_loop.hako | ⭐⭐ Medium |
+| 4 | loop_break | loop+break+multi-exit | rep4_loop_break.hako | ⭐⭐ Medium |
+| 5 | type_prop | type propagation | rep5_type_prop.hako | ⭐⭐ Medium |
+
+---
+
+## 8. Next Steps (Task 4)
+
+Once this selection is approved:
+
+1. **Create the 5 test files** in `local_tests/phase161/`
+2. **Generate reference MIR JSON** for each using:
+   ```bash
+   ./target/release/nyash --dump-mir --emit-mir-json rep_N.mir.json rep_N.hako
+   ```
+3. **Document expected outputs** in `expected_outputs.json`
+4. **Ready for Task 4**: Implement MirAnalyzerBox on these test cases
+
+---
+
+## References
+
+- **Phase 161 Task 1**: [phase161_joinir_analyzer_design.md](phase161_joinir_analyzer_design.md)
+- **Phase 161 Task 2**: [phase161_analyzer_box_design.md](phase161_analyzer_box_design.md)
+- **MIR Instruction Reference**: [docs/reference/mir/INSTRUCTION_SET.md](../../../reference/mir/INSTRUCTION_SET.md)
+
+---
+
+**Status**: 🎯 Ready for test file creation (Task 4 preparation)