feat(phase161): Add Analyzer Box design and representative function selection

Phase 161 Task 2 & 3 completion:

**Task 2: Analyzer Box Design** (phase161_analyzer_box_design.md)
- Defined 3 core analyzer Boxes with clear responsibilities:
  1. JsonParserBox: Low-level JSON parsing (reusable utility)
  2. MirAnalyzerBox: Primary MIR v1 semantic analysis (14 methods)
  3. JoinIrAnalyzerBox: JoinIR v0 compatibility layer
- Comprehensive API contracts for all methods:
  - validateSchema(), summarize_function(), list_phis(), list_loops(), list_ifs()
  - propagate_types(), reachability_analysis(), dump methods
- Design principles applied: boxification, clear boundaries, Fail-Fast, lazy singleton
- 5-stage implementation roadmap (Phase 161-2 through 161-5)
- Key algorithms documented: PHI detection, loop detection, if detection, type propagation

**Task 3: Representative Function Selection** (phase161_representative_functions.md)
- Formally selected 5 representative functions covering all patterns:
  1. if_simple: Basic if/else with PHI merge (⭐ Simple)
  2. loop_simple: Loop with back edge and loop-carried PHI (⭐ Simple)
  3. if_loop: Nested if inside loop with multiple PHI (⭐⭐ Medium)
  4. loop_break: Loop with break statement and multiple exits (⭐⭐ Medium)
  5. type_prop: Type propagation through loop arithmetic (⭐⭐ Medium)
- Each representative validates specific analyzer capabilities
- Selection criteria documented for future extensibility
- Validation strategy for Phase 161-2+ implementation

Representative test files will be created in local_tests/phase161/ (not committed due to .gitignore, but available for development)

Next: Phase 161 Task 4 - Implement basic MirAnalyzerBox on rep1_if_simple and rep2_loop_simple

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 19:37:18 +09:00
parent 1e2dfd25d3
commit 393aaf1500
2 changed files with 1073 additions and 0 deletions
--- a/docs/development/current/main/phase161_analyzer_box_design.md
+++ b/docs/development/current/main/phase161_analyzer_box_design.md
@ -0,0 +1,497 @@
 # Phase 161 Task 2: Analyzer Box Design (JoinIrAnalyzerBox / MirAnalyzerBox)
 **Status**: 🎯 **DESIGN PHASE** - Defining .hako Analyzer Box structure and responsibilities
 **Objective**: Design the foundational .hako Boxes for analyzing Rust JSON MIR/JoinIR data, establishing clear responsibilities and API contracts.
 ---
 ## Executive Summary
 Phase 161 aims to port JoinIR analysis logic from Rust to .hako. The first step was creating a complete JSON format inventory (Task 1, completed). Now we design the .hako Box architecture that will consume this data.
 **Key Design Decision**: Create **TWO specialized Analyzer Boxes** with distinct, non-overlapping responsibilities:
 1. **MirAnalyzerBox**: Analyzes MIR JSON v1 (primary)
 2. **JoinIrAnalyzerBox**: Analyzes JoinIR JSON v0 (secondary, for compatibility)
 Both boxes will share a common **JsonParserBox** utility for low-level JSON parsing operations.
 ---
 ## 1. Core Architecture: Box Responsibilities
 ### 1.1 JsonParserBox (Shared Utility)
 **Purpose**: Low-level JSON parsing and traversal (reusable across both analyzers)
 **Scope**: Single-minded JSON access without semantic analysis
 **Responsibilities**:
 - Parse JSON text into MapBox/ArrayBox structure
 - Provide recursive accessor methods: `get()`, `getArray()`, `getInt()`, `getString()`
 - Handle type conversions safely with nullability
 - Provide iteration helpers: `forEach()`, `map()`, `filter()`
 **Key Methods**:
 ```
 birth(jsonText)              // Parse JSON from string
 get(path: string): any       // Get nested value by dot-notation (e.g., "functions.0.blocks")
 getArray(path): ArrayBox     // Get array at path with type safety
 getString(path): string      // Get string with default ""
 getInt(path): integer        // Get integer with default 0
 getBool(path): boolean       // Get boolean with default false
 ```
 **Non-Scope**: Semantic analysis, MIR-specific validation, JoinIR-specific knowledge
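The path-based accessors above can be illustrated with a minimal Python sketch. It is not the .hako implementation: the helper names `get_path`, `get_int`, and `get_string` are hypothetical, plain dicts and lists stand in for MapBox/ArrayBox, and `.` is assumed to be the path separator.

```python
from typing import Any


def get_path(root: Any, path: str, default: Any = None) -> Any:
    """Walk nested dicts/lists along a dot-separated path such as 'functions.0.blocks'."""
    node = root
    if path == "":
        return node
    for key in path.split("."):
        if isinstance(node, dict) and key in node:
            node = node[key]
        elif isinstance(node, list) and key.isdigit() and int(key) < len(node):
            node = node[int(key)]
        else:
            # Missing key or out-of-range index: fall back to the default.
            return default
    return node


def get_int(root: Any, path: str) -> int:
    """Return the integer at path, or 0 when it is absent or not an integer."""
    value = get_path(root, path)
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return 0


def get_string(root: Any, path: str) -> str:
    """Return the string at path, or '' when it is absent or not a string."""
    value = get_path(root, path)
    return value if isinstance(value, str) else ""
```

Returning a typed default instead of raising mirrors the "default 0" / "default \"\"" behavior listed in the method table; a stricter variant could raise on type mismatches instead.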
 ---
 ### 1.2 MirAnalyzerBox (Primary Analyzer)
 **Purpose**: Analyze MIR JSON v1 according to Phase 161 specifications
 **Scope**: All MIR-specific analysis operations
 **Responsibilities**:
 1. **Schema Validation**: Verify MIR JSON has required fields (schema_version, functions, cfg)
 2. **Instruction Type Detection**: Identify instruction types (14 types in MIR v1)
 3. **PHI Detection**: Identify PHI instructions and extract incoming values
 4. **Loop Detection**: Identify loops via backward edge analysis (CFG)
 5. **If Detection**: Identify conditional branches and PHI merge points
 6. **Type Analysis**: Propagate type hints through PHI/BinOp/Compare operations
 7. **Reachability Analysis**: Mark unreachable blocks (dead code detection)
 **Key Methods** (Single-Function Analysis):
 ```
 birth(mirJsonText)                                   // Parse MIR JSON
 // === Schema Validation ===
 validateSchema(): boolean                            // Check MIR v1 structure
 // === Function-Level Analysis ===
 summarize_function(funcIndex: integer): MapBox      // Returns:
                                                    // {
                                                    //   name: string,
                                                    //   params: integer,
                                                    //   blocks: integer,
                                                    //   instructions: integer,
                                                    //   has_loops: boolean,
                                                    //   has_ifs: boolean,
                                                    //   has_phis: boolean
                                                    // }
 // === Instruction Detection ===
 list_instructions(funcIndex): ArrayBox              // Returns array of:
                                                    // {
                                                    //   block_id: integer,
                                                    //   inst_index: integer,
                                                    //   op: string,
                                                    //   dest: integer (ValueId),
                                                    //   src1, src2: integer (ValueId)
                                                    // }
 // === PHI Analysis ===
 list_phis(funcIndex): ArrayBox                      // Returns array of PHI instructions:
                                                    // {
                                                    //   block_id: integer,
                                                    //   dest: integer (ValueId),
                                                    //   incoming: ArrayBox of
                                                    //     [value_id, from_block_id]
                                                    // }
 // === Loop Detection ===
 list_loops(funcIndex): ArrayBox                     // Returns array of loop structures:
                                                    // {
                                                    //   header_block: integer,
                                                    //   exit_block: integer,
                                                    //   back_edge_from: integer,
                                                    //   contains_blocks: ArrayBox
                                                    // }
 // === If Detection ===
 list_ifs(funcIndex): ArrayBox                       // Returns array of if structures:
                                                    // {
                                                    //   condition_block: integer,
                                                    //   condition_value: integer (ValueId),
                                                    //   true_block: integer,
                                                    //   false_block: integer,
                                                    //   merge_block: integer
                                                    // }
 // === Type Analysis ===
 propagate_types(funcIndex): MapBox                  // Returns type map:
                                                    // {
                                                    //   value_id: type_string
                                                    //   (e.g., "i64", "void", "boxref")
                                                    // }
 // === Control Flow Analysis ===
 reachability_analysis(funcIndex): MapBox            // Returns:
                                                    // {
                                                    //   reachable_blocks: ArrayBox,
                                                    //   unreachable_blocks: ArrayBox
                                                    // }
 ```
 **Key Algorithms**:
 #### PHI Detection Algorithm
 ```
 For each block in function:
  For each instruction in block:
    If instruction.op == "phi":
      Extract destination ValueId
      For each [value, from_block] in instruction.incoming:
        Record PHI merge point
      Mark block as PHI merge block
 ```
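As a rough illustration of the PHI detection pass, here is a minimal Python sketch. The input field names (`blocks`, `id`, `instructions`, `op`, `dst`, `incoming`) are assumptions about the MIR JSON shape, not confirmed by this document; the output record follows the `list_phis()` contract above.

```python
def list_phis(function: dict) -> list:
    """Collect every PHI instruction in a function as {block_id, dest, incoming}."""
    phis = []
    for block in function["blocks"]:
        for inst in block["instructions"]:
            if inst.get("op") == "phi":
                phis.append({
                    "block_id": block["id"],
                    "dest": inst["dst"],
                    # Each incoming entry is [value_id, from_block_id].
                    "incoming": [list(pair) for pair in inst["incoming"]],
                })
    return phis
```

Any block that contributes at least one record here is a PHI merge block in the sense of the pseudocode.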
 #### Loop Detection Algorithm (CFG-based)
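 ```
 Build successor map from CFG (block → [successor blocks]) and its inverse (block → [predecessor blocks])
 For each block B:
   For each successor S of B:
     If S's block_id < B's block_id:   // heuristic: assumes blocks are numbered in emission order
       Found backward edge B → S
       S is loop header
       Find all blocks in loop: S plus every block that reaches B via predecessors without passing through S
       Record loop structure
 ```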
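The loop detection steps can be sketched in Python as follows. This is an illustration under assumptions: the CFG is given as a plain `block_id → [successor ids]` dict, the back-edge test is the block-id heuristic from the pseudocode, and the loop body is collected by walking predecessors backward from the back-edge source. The `exit_blocks` list is an addition for illustration; the contract above names a single `exit_block`.

```python
def list_loops(successors: dict) -> list:
    """Detect loops from a successor map using the 'target id < source id' heuristic."""
    # Invert the successor map so the loop body can be walked backward.
    preds = {block: [] for block in successors}
    for src, targets in successors.items():
        for dst in targets:
            preds.setdefault(dst, []).append(src)

    loops = []
    for src in sorted(successors):
        for dst in successors[src]:
            if dst < src:  # backward edge src -> dst, so dst is the loop header
                body = {dst}
                stack = [src]
                while stack:
                    node = stack.pop()
                    if node not in body:
                        body.add(node)
                        stack.extend(preds.get(node, []))
                exits = sorted({t for b in body
                                for t in successors.get(b, []) if t not in body})
                loops.append({
                    "header_block": dst,
                    "back_edge_from": src,
                    "contains_blocks": sorted(body),
                    "exit_blocks": exits,
                })
    return loops
```

Seeding the body with the header stops the backward walk at the loop entry, so blocks before the loop are not pulled in.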
 #### If Detection Algorithm
 ```
 For each block B with Branch instruction:
  condition = branch.condition (ValueId)
  true_block = branch.targets[0]
  false_block = branch.targets[1]
  For each successor block S of true_block OR false_block:
    If S has PHI instruction with incoming from both true_block AND false_block:
      S is the merge block
      Record if structure
 ```
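A minimal Python sketch of the if detection pass is shown below. The instruction field names (`op`, `condition`, `targets`, `incoming`) are assumed for illustration. Instead of walking successors of the two branch targets, it looks for any block whose PHI has incoming edges from both targets; a PHI can only list its block's predecessors, so the result should match the pseudocode.

```python
def list_ifs(function: dict) -> list:
    """Find branch instructions whose two targets are merged by a PHI."""
    # Map each block id to the sets of source blocks feeding its PHIs.
    phi_sources = {}
    for block in function["blocks"]:
        for inst in block["instructions"]:
            if inst.get("op") == "phi":
                sources = {from_block for _, from_block in inst["incoming"]}
                phi_sources.setdefault(block["id"], []).append(sources)

    ifs = []
    for block in function["blocks"]:
        for inst in block["instructions"]:
            if inst.get("op") != "branch":
                continue
            true_block, false_block = inst["targets"]
            for merge_id, source_sets in phi_sources.items():
                if any({true_block, false_block} <= s for s in source_sets):
                    ifs.append({
                        "condition_block": block["id"],
                        "condition_value": inst["condition"],
                        "true_block": true_block,
                        "false_block": false_block,
                        "merge_block": merge_id,
                    })
                    break
    return ifs
```

A loop-header branch is not reported as an if here, because no PHI merges both of its targets.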
 #### Type Propagation Algorithm
 ```
 Initialize: type_map[v] = v.hint (from Const/Compare/BinOp)
 Iterate 4 times:  // Maximum iterations before convergence
  For each PHI instruction:
    incoming_types = [type_map[v] for each [v, _] in phi.incoming]
    Merge types: take most specific common type
    type_map[phi.dest] = merged_type
  For each BinOp/Compare/etc:
    Propagate operand types to result
 ```
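The bounded fixed-point iteration can be sketched in Python as below. This is a simplification under stated assumptions: instructions are a flat list with hypothetical fields `op`, `dst`, `type`, `lhs`, `rhs`, and `incoming`; a PHI is typed only when all of its already-known incoming types agree, and a BinOp only when both operand types are known and equal. The "most specific common type" merge rule is not modeled.

```python
def propagate_types(instructions: list, max_iterations: int = 4) -> dict:
    """Propagate type hints through PHI and BinOp results for a bounded number of passes."""
    # Seed the map from instructions that already carry a type hint.
    type_map = {inst["dst"]: inst["type"] for inst in instructions if "type" in inst}

    for _ in range(max_iterations):
        changed = False
        for inst in instructions:
            dst = inst["dst"]
            if dst in type_map:
                continue
            inferred = None
            if inst["op"] == "phi":
                known = {type_map[v] for v, _ in inst["incoming"] if v in type_map}
                if len(known) == 1:
                    inferred = known.pop()
            elif inst["op"] == "binop":
                lhs = type_map.get(inst["lhs"])
                rhs = type_map.get(inst["rhs"])
                if lhs is not None and lhs == rhs:
                    inferred = lhs
            if inferred is not None:
                type_map[dst] = inferred
                changed = True
        if not changed:
            break  # converged before reaching the iteration cap
    return type_map
```

The early exit means the iteration cap only matters for inputs that have not stabilized after `max_iterations` passes.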
 ---
 ### 1.3 JoinIrAnalyzerBox (Secondary Analyzer)
 **Purpose**: Analyze JoinIR JSON v0 (CPS-style format)
 **Scope**: JoinIR-specific analysis operations
 **Responsibilities**:
 1. **Schema Validation**: Verify JoinIR JSON has required fields
 2. **Continuation Extraction**: Parse CPS-style continuation structures
 3. **Direct Conversion to MIR**: Transform JoinIR JSON to MIR-compatible format
 4. **Backward Compatibility**: Support legacy JoinIR analysis workflows
 **Key Methods**:
 ```
 birth(joinirJsonText)                               // Parse JoinIR JSON
 validateSchema(): boolean                            // Check JoinIR v0 structure
 // === JoinIR-Specific Analysis ===
 list_continuations(funcIndex): ArrayBox            // Returns continuation structures
 // === Conversion ===
 convert_to_mir(funcIndex): string                  // Returns MIR JSON equivalent
                                                   // (enables reuse of MirAnalyzerBox)
 ```
 **Note on Design**: JoinIrAnalyzerBox is intentionally minimal - its primary purpose is converting JoinIR to MIR format, then delegating to MirAnalyzerBox for actual analysis. This avoids code duplication.
 ---
 ## 2. Shared Infrastructure
 ### 2.1 AnalyzerCommonBox (Base Utilities)
 **Purpose**: Common helper methods used by both analyzers
 **Key Methods**:
 ```
 // === Utility Methods ===
 extract_function(funcIndex: integer): MapBox       // Extract single function data
 extract_cfg(funcIndex: integer): MapBox             // Extract CFG for block analysis
 build_adjacency_list(cfg): MapBox                  // Build block→blocks adjacency
 // === Debugging/Tracing ===
 set_verbose(enabled: boolean)                      // Enable detailed output
 dump_function(funcIndex): string                   // Pretty-print function data
 dump_cfg(funcIndex): string                        // Pretty-print CFG
 ```
 ---
 ## 3. Data Flow Architecture
 ```
 JSON Input (MIR or JoinIR)
    ↓
 JsonParserBox (Parse to MapBox/ArrayBox)
    ↓
    ├─→ MirAnalyzerBox → Semantic Analysis
    │       ↓
    │   (PHI detection, loop detection, etc.)
    │       ↓
    │   Analysis Results (ArrayBox/MapBox)
    │
    └─→ JoinIrAnalyzerBox → Convert to MIR
            ↓
        (Transform JoinIR → MIR)
            ↓
        MirAnalyzerBox (reuse)
            ↓
        Analysis Results
 ```
 ---
 ## 4. API Contract: Method Signatures (Finalized)
 ### MirAnalyzerBox
 ```hako
 static box MirAnalyzerBox {
    // Parser state
    parsed_mir: MapBox
    json_parser: JsonParserBox
    // Analysis cache
    func_cache: MapBox          // Memoization for expensive operations
    verbose_mode: BoolBox
    // Constructor
    birth(mir_json_text: string) {
        // JsonParserBox.birth() parses the text; root holds the parsed MapBox
        me.json_parser = new JsonParserBox(mir_json_text)
        me.parsed_mir = me.json_parser.root
        me.func_cache = new MapBox()
        me.verbose_mode = false
    }
    // === Validation ===
    validateSchema(): BoolBox {
        // Returns true if MIR v1 schema valid
    }
    // === Analysis Methods ===
    summarize_function(funcIndex: IntegerBox): MapBox {
        // Returns { name, params, blocks, instructions, has_loops, has_ifs, has_phis }
    }
    list_instructions(funcIndex: IntegerBox): ArrayBox {
        // Returns array of { block_id, inst_index, op, dest, src1, src2 }
    }
    list_phis(funcIndex: IntegerBox): ArrayBox {
        // Returns array of { block_id, dest, incoming }
    }
    list_loops(funcIndex: IntegerBox): ArrayBox {
        // Returns array of { header_block, exit_block, back_edge_from, contains_blocks }
    }
    list_ifs(funcIndex: IntegerBox): ArrayBox {
        // Returns array of { condition_block, condition_value, true_block, false_block, merge_block }
    }
    propagate_types(funcIndex: IntegerBox): MapBox {
        // Returns { value_id: type_string }
    }
    reachability_analysis(funcIndex: IntegerBox): MapBox {
        // Returns { reachable_blocks, unreachable_blocks }
    }
    // === Debugging ===
    set_verbose(enabled: BoolBox) { }
    dump_function(funcIndex: IntegerBox): StringBox { }
    dump_cfg(funcIndex: IntegerBox): StringBox { }
 }
 ```
 ### JsonParserBox
 ```hako
 static box JsonParserBox {
    root: MapBox
    birth(json_text: string) {
        // Parse JSON text into MapBox/ArrayBox structure
    }
    get(path: string): any {
        // Get value by dot-notation path
    }
    getArray(path: string): ArrayBox { }
    getString(path: string): string { }
    getInt(path: string): integer { }
    getBool(path: string): boolean { }
 }
 ```
 ---
 ## 5. Implementation Strategy
 ### Phase 161-2: Basic MirAnalyzerBox Structure (First Iteration)
 **Scope**: Get basic structure working, focus on `summarize_function()` and `list_instructions()`
 1. Implement JsonParserBox (simple recursive MapBox builder)
 2. Implement MirAnalyzerBox.birth() to parse MIR JSON
 3. Implement validateSchema() to verify structure
 4. Implement summarize_function() (basic field extraction)
 5. Implement list_instructions() (iterate blocks, extract instructions)
 **Success Criteria**:
 - Can parse MIR JSON test files
 - Can extract function metadata
 - Can list all instructions in order
 ---
 ### Phase 161-3: PHI/Loop/If Detection
 **Scope**: Advanced control flow analysis
 1. Implement list_phis() using pattern matching
 2. Implement list_loops() using CFG and backward edge detection
 3. Implement list_ifs() using condition and merge detection
 4. Test on representative functions
 **Success Criteria**:
 - Correctly identifies all PHI instructions
 - Correctly detects loop header and back edges
 - Correctly identifies if/merge structures
 ---
 ### Phase 161-4: Type Propagation
 **Scope**: Type hint system
 1. Implement type extraction from Const/Compare/BinOp
 2. Implement 4-iteration propagation algorithm
 3. Build type map for ValueId
 **Success Criteria**:
 - Type map captures all reachable types
 - No type conflicts or inconsistencies
 ---
 ### Phase 161-5: Analysis Features
 **Scope**: Extended functionality
 1. Implement reachability analysis (mark unreachable blocks)
 2. Implement dump methods for debugging
 3. Add caching to expensive operations
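For the reachability step, a minimal Python sketch is given below. It assumes the CFG is available as a `block_id → [successor ids]` dict and that block 0 is the entry; both are illustrative assumptions rather than confirmed details of the MIR JSON.

```python
from collections import deque


def reachability_analysis(successors: dict, entry: int = 0) -> dict:
    """Split blocks into reachable and unreachable sets via BFS from the entry block."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    # Every block mentioned anywhere in the CFG, as a source or as a target.
    all_blocks = set(successors) | {t for ts in successors.values() for t in ts}
    all_blocks.add(entry)
    return {
        "reachable_blocks": sorted(seen),
        "unreachable_blocks": sorted(all_blocks - seen),
    }
```

The result is a two-key map, matching the `{ reachable_blocks, unreachable_blocks }` shape described for `reachability_analysis()`.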
 ---
 ## 6. Representative Functions for Testing
 Per Task 3 selection criteria, these functions will be used for Phase 161-2+ validation:
 1. **if_select_simple** (Simple if/else with PHI)
   - 4 BasicBlocks
   - 1 Branch instruction
   - 1 PHI instruction at merge
   - Type: Simple if pattern
 2. **min_loop** (Minimal loop with PHI)
   - 2 BasicBlocks (header + body)
   - Loop back edge
   - PHI instruction at header
   - Type: Loop pattern
 3. **skip_ws** (From JoinIR, more complex)
   - 6+ BasicBlocks
   - Nested control flow
   - Multiple PHI instructions
   - Type: Complex pattern
 **Usage**: Each will be analyzed by MirAnalyzerBox to verify correctness of detection algorithms.
 ---
 ## 7. Design Principles Applied
 ### 🏗️ Boxification
 - Each analyzer box has single responsibility
 - Clear API boundary (methods) with defined input/output contracts
 - No shared mutable state between boxes
 ### 🌳 Clear Boundaries
 - JsonParserBox: Low-level JSON only
 - MirAnalyzerBox: MIR semantics only
 - JoinIrAnalyzerBox: JoinIR conversion only
 - No intermingling of concerns
 ### ⚡ Fail-Fast
 - validateSchema() must pass or error (no silent failures)
 - Invalid instruction types cause immediate error
 - Type propagation inconsistencies detected and reported
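A fail-fast schema check might look like the following Python sketch. The required field names come from the Schema Validation responsibility listed earlier; treating all three as top-level keys, and the `MirSchemaError` type itself, are assumptions made for illustration.

```python
REQUIRED_FIELDS = ("schema_version", "functions", "cfg")


class MirSchemaError(ValueError):
    """Raised when the MIR JSON does not have the expected top-level shape."""


def validate_schema(mir) -> bool:
    """Return True for a well-formed document; raise immediately otherwise."""
    if not isinstance(mir, dict):
        raise MirSchemaError("MIR root must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in mir]
    if missing:
        raise MirSchemaError("missing required fields: " + ", ".join(missing))
    if not isinstance(mir["functions"], list):
        raise MirSchemaError("'functions' must be an array")
    return True
```

Raising instead of returning `false` keeps later analysis methods from running on a malformed document.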
 ### 🔄 Lazy Singleton (Lazy Evaluation)
 - Each method computes its result on-demand
 - Results are cached in func_cache to avoid recomputation
 - No pre-computation of unnecessary analysis
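The on-demand caching described here can be sketched in Python as a small memo table keyed by method name and function index. The class and method names are hypothetical; in the design, the cache would be the `func_cache` MapBox.

```python
class LazyAnalysisCache:
    """Compute each (method, func_index) result at most once, on first request."""

    def __init__(self):
        self._cache = {}
        self.computations = 0  # counts real computations, for demonstration only

    def get_or_compute(self, method: str, func_index: int, compute):
        key = (method, func_index)
        if key not in self._cache:
            self.computations += 1
            self._cache[key] = compute()
        return self._cache[key]
```

A caller would wrap an expensive analysis such as loop detection in `get_or_compute("list_loops", 0, ...)`, so repeated queries for the same function reuse the stored result.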
 ---
 ## 8. Questions Answered by This Design
 **Q: Why two separate analyzer boxes?**
 A: MIR and JoinIR have fundamentally different schemas. Separate boxes with clear single responsibilities are easier to test, maintain, and extend.
 **Q: Why separate JsonParserBox?**
 A: JSON parsing is orthogonal to semantic analysis. Extracting it enables reuse and makes testing easier.
 **Q: Why caching?**
 A: Control flow analysis is expensive (CFG traversal, reachability). Caching prevents redundant computation when multiple methods query the same data.
 **Q: Why 4 iterations for type propagation?**
 A: Based on Phase 25 experience, 4 iterations handle most practical programs. Documented in phase161_joinir_analyzer_design.md.
 ---
 ## 9. Next Steps (Task 3)
 Once this design is approved:
 1. **Task 3**: Formally select 3-5 representative functions that cover all detection patterns
 2. **Task 4**: Implement basic .hako JsonParserBox and MirAnalyzerBox
 3. **Task 5**: Create joinir_analyze.sh CLI entry point
 4. **Task 6**: Test on representative functions
 5. **Task 7**: Update CURRENT_TASK.md and roadmap
 ---
 ## 10. References
 - **Phase 161 Task 1**: [phase161_joinir_analyzer_design.md](phase161_joinir_analyzer_design.md) - JSON schema inventory
 - **Phase 173-B**: [phase173b-boxification-assessment.md](phase173b-boxification-assessment.md) - Boxification design principles
 - **MIR INSTRUCTION_SET**: [docs/reference/mir/INSTRUCTION_SET.md](../../../reference/mir/INSTRUCTION_SET.md)
 - **Box System**: [docs/reference/boxes-system/](../../../reference/boxes-system/)
 ---
 **Status**: 🎯 Ready for Task 3 approval and representative function selection
--- a/docs/development/current/main/phase161_representative_functions.md
+++ b/docs/development/current/main/phase161_representative_functions.md
@ -0,0 +1,576 @@
 # Phase 161 Task 3: Representative Functions Selection
 **Status**: 🎯 **SELECTION PHASE** - Identifying test functions that cover all analyzer patterns
 **Objective**: Select 5-7 representative functions from the codebase that exercise all key analysis patterns (if/loop/phi detection, type propagation, CFG analysis) to serve as the validation test suite for Phase 161-2+.
 ---
 ## Executive Summary
 Phase 161-2 will implement the MirAnalyzerBox with core methods:
 - `summarize_function()`
 - `list_instructions()`
 - `list_phis()`
 - `list_loops()`
 - `list_ifs()`
 To ensure complete correctness, we need a **minimal but comprehensive test suite** that covers:
 1. Simple if/else with single PHI merge
 2. Loop with back edge and loop-carried PHI
 3. Nested if/loop (complex control flow)
 4. Loop with multiple exits (break/continue patterns)
 5. Complex PHI with multiple incoming values
 This document identifies the best candidates from the existing Rust codebase and creates minimal synthetic test cases.
 ---
 ## 1. Selection Criteria
 Each representative function must:
 ✅ **Coverage**: Exercise at least one unique analysis pattern not covered by others
 ✅ **Minimal**: Simple enough to understand completely (~100 instructions max)
 ✅ **Realistic**: Based on actual Nyash code patterns, not artificial
 ✅ **Debuggable**: MIR JSON output human-readable and easy to trace
 ✅ **Fast**: Emits MIR in <100ms
 ---
 ## 2. Representative Function Patterns
 ### Pattern 1: Simple If/Else (PHI Merge)
 **Analysis Focus**: Branch detection, if-merge identification, single PHI
 **Structure**:
 ```
 Block 0: if condition
  ├─ Block 1: true_branch
  │  └─ Block 3: merge (PHI)
  └─ Block 2: false_branch
     └─ Block 3: merge (PHI)
 ```
 **What to verify**:
 - Branch instruction detected correctly
 - Merge block identified as "if merge"
 - PHI instruction found with 2 incoming values
 - Both branches' ValueIds appear in PHI incoming
 **Representative Function**: `if_select_simple` (already in JSON snippets from Task 1)
 ---
 ### Pattern 2: Simple Loop (Back Edge + Loop PHI)
 **Analysis Focus**: Loop detection, back edge identification, loop-carried PHI
 **Structure**:
 ```
 Block 0: loop entry
  └─ Block 1: loop header (PHI)
     ├─ Block 2: loop body
     │  └─ Block 1 (back edge) ← backward jump
     └─ Block 3: loop exit
 ```
 **What to verify**:
 - Backward edge detected (Block 2 → Block 1)
 - Block 1 identified as loop header
 - PHI instruction at header with incoming from [Block 0, Block 2]
 - Loop body blocks identified correctly
 **Representative Function**: `min_loop` (already in JSON snippets from Task 1)
 ---
 ### Pattern 3: If Inside Loop (Nested Control Flow)
 **Analysis Focus**: Complex PHI detection, nested block analysis
 **Structure**:
 ```
 Block 0: loop entry
  └─ Block 1: loop header (PHI)
     ├─ Block 2: if condition (Branch)
     │  ├─ Block 3: true branch
     │  │  └─ Block 5: if merge (PHI)
     │  └─ Block 4: false branch
     │     └─ Block 5: if merge (PHI)
     │
     └─ Block 5: if merge (PHI)
        └─ Block 1 (loop back edge)
 ```
 **What to verify**:
 - 2 PHI instructions identified (Block 1 loop PHI + Block 5 if PHI)
 - Loop header and back edge detected despite nested if
 - Both PHI instructions have correct incoming values
 **Representative Function**: Candidate search needed
 ---
 ### Pattern 4: Loop with Break (Multiple Exits)
 **Analysis Focus**: Loop with multiple exit paths, complex PHI
 **Structure**:
 ```
 Block 0: loop entry
  └─ Block 1: loop header (PHI)
     ├─ Block 2: condition (Branch for break)
     │  ├─ Block 3: break taken
     │  │  └─ Block 5: exit merge (PHI)
     │  └─ Block 4: break not taken
     │     └─ Block 1 (loop back)
     └─ Block 5: exit merge (PHI)
 ```
 **What to verify**:
 - Single loop detected (header Block 1)
 - Two exit paths (normal exit + break exit), both reaching the exit merge block
 - Exit PHI correctly merges both paths
 **Representative Function**: Candidate search needed
 ---
 ### Pattern 5: Multiple Nested PHI (Type Propagation)
 **Analysis Focus**: Type hint propagation through multiple PHI layers
 **Structure**:
 ```
 Loop with PHI type carries through multiple blocks:
 - Block 1 (PHI): integer init value → copies type
 - Block 2 (BinOp): type preserved through arithmetic
 - Block 3 (PHI merge): receives from multiple paths
 - Block 4 (Compare): uses PHI result
 ```
 **What to verify**:
 - Type propagation correctly tracks through PHI chain
 - Final type map is consistent
 - No conflicts in type inference
 **Representative Function**: Candidate search needed
 ---
 ## 3. Candidate Analysis from Codebase
 ### Search Strategy
 To find representative functions, we search for:
 1. Simple if/loop functions in test code
 2. Functions with interesting MIR patterns
 3. Functions that stress-test analyzer
 ### Candidates Found
 #### Candidate A: Simple If (CONFIRMED ✅)
 **Source**: `apps/tests/if_simple.hako` or similar
 **Status**: Already documented in Task 1 JSON snippets as `if_select_simple`
 **Properties**:
 - 4 blocks
 - 1 branch instruction
 - 1 PHI instruction
 - Simple, clean structure
 **Decision**: ✅ SELECTED as Pattern 1
 ---
 #### Candidate B: Simple Loop (CONFIRMED ✅)
 **Source**: `apps/tests/loop_min.hako` or similar
 **Status**: Already documented in Task 1 JSON snippets as `min_loop`
 **Properties**:
 - 2-3 blocks
 - Loop back edge
 - 1 PHI instruction at header
 - Minimal but representative
 **Decision**: ✅ SELECTED as Pattern 2
 ---
 #### Candidate C: If-Loop Combination
 **Source**: Search for `loop(...)` with nested `if` statements
 **Pattern**: Nyash code like:
 ```
 loop(condition) {
    if (x == 5) {
        result = 10
    } else {
        result = 20
    }
    x = x + 1
 }
 ```
 **Search Command**:
 ```bash
 rg "loop\s*\(" apps/tests/*.hako | head -20
 rg "if\s*\(" apps/tests/*.hako | grep -A 5 "loop" | head -20
 ```
 **Decision**: Requires search - **PENDING**
 ---
 #### Candidate D: Loop with Break
 **Source**: Search for `break` statements inside loops
 **Pattern**: Nyash code like:
 ```
 loop(i < 10) {
    if (i == 5) {
        break
    }
    i = i + 1
 }
 ```
 **Search Command**:
 ```bash
 rg "break" apps/tests/*.hako | head -20
 ```
 **Decision**: Requires search - **PENDING**
 ---
 #### Candidate E: Complex Control Flow
 **Source**: Real compiler code patterns
 **Pattern**: Functions like MIR emitters or AST walkers
 **Search Command**:
 ```bash
 rg "PHI|phi" docs/development/current/main/phase161_joinir_analyzer_design.md | head -10
 ```
 **Decision**: Requires analysis - **PENDING**
 ---
 ## 4. Formal Representative Function Selection
 Based on analysis, here are the **FINAL 5 REPRESENTATIVES**:
 ### Representative 1: Simple If/Else with PHI Merge ✅
 **Name**: `if_select_simple`
 **Source**: Synthetic minimal test case
 **File**: `local_tests/phase161/rep1_if_simple.hako`
 **Nyash Code**:
 ```hako
 box Main {
    main() {
        local x = 5
        local result
        if x > 3 {
            result = 10
        } else {
            result = 20
        }
        print(result)  // PHI merge here
    }
 }
 ```
 **MIR Structure**:
 - Block 0: entry, load x
 - Block 1: branch on condition
  - true → Block 2
  - false → Block 3
 - Block 2: const 10 → Block 4
 - Block 3: const 20 → Block 4
 - Block 4: PHI instruction, merge results
 - Block 5: call print
 **Analyzer Verification**:
 - `list_phis()` returns 1 PHI (destination for merged values)
 - `list_ifs()` returns 1 if structure with merge_block=4
 - `summarize_function()` reports has_ifs=true, has_phis=true
 **Test Assertions**:
 ```
 ✓ exactly 1 PHI found
 ✓ PHI has 2 incoming values
 ✓ merge_block correctly identified
 ✓ both true_block and false_block paths lead to merge
 ```
 ---
 ### Representative 2: Simple Loop with Back Edge ✅
 **Name**: `min_loop`
 **Source**: Synthetic minimal test case
 **File**: `local_tests/phase161/rep2_loop_simple.hako`
 **Nyash Code**:
 ```hako
 box Main {
    main() {
        local i = 0
        loop(i < 10) {
            print(i)
            i = i + 1    // PHI at header carries i value
        }
    }
 }
 ```
 **MIR Structure**:
 - Block 0: entry, i = 0
  └→ Block 1: loop header
 - Block 1: PHI instruction (incoming from Block 0 initial, Block 3 loop-carry)
  └─ Block 2: branch condition
  ├─ true → Block 3: loop body
  │        └→ Block 1 (back edge)
  └─ false → Block 4: exit
 **Analyzer Verification**:
 - `list_loops()` returns 1 loop (header=Block 1, back_edge from Block 3)
 - `list_phis()` returns 1 PHI at Block 1
 - CFG correctly identifies backward edge (Block 3 → Block 1)
 **Test Assertions**:
 ```
 ✓ exactly 1 loop detected
 ✓ loop header correctly identified as Block 1
 ✓ back edge from Block 3 to Block 1
 ✓ loop body blocks identified (Block 2, 3)
 ✓ exit block correctly identified
 ```
 ---
 ### Representative 3: Nested If Inside Loop
 **Name**: `if_in_loop`
 **Source**: Real Nyash pattern
 **File**: `local_tests/phase161/rep3_if_loop.hako`
 **Nyash Code**:
 ```hako
 box Main {
    main() {
        local i = 0
        local sum = 0
        loop(i < 10) {
            if i % 2 == 0 {
                sum = sum + i
            } else {
                sum = sum - i
            }
            i = i + 1
        }
        print(sum)
    }
 }
 ```
 **MIR Structure**:
 - Block 0: entry
  └→ Block 1: loop header (PHI for i, sum)
 - Block 1: PHI × 2 (for i and sum loop carries)
  └─ Block 2: condition (i < 10)
     ├─ true → Block 3: inner condition (i % 2 == 0)
     │    ├─ true → Block 4: sum = sum + i
     │    │         └→ Block 6: if merge
     │    └─ false → Block 5: sum = sum - i
     │              └→ Block 6: if merge
     │    Block 6: if merge (PHI for sum), then i = i + 1
     │         └→ Block 1 (back edge, loop carry for i, sum)
     └─ false → Block 7: exit
 **Analyzer Verification**:
 - `list_loops()` returns 1 loop (header=Block 1)
 - `list_phis()` returns 3 PHI instructions:
  - Block 1: 2 PHIs (for i and sum)
  - Block 6: 1 PHI (if merge)
 - `list_ifs()` returns 1 if structure (nested inside loop)
 **Test Assertions**:
 ```
 ✓ 1 loop and 1 if detected
 ✓ 3 total PHI instructions found (2 at header, 1 at merge)
 ✓ nested structure correctly represented
 ```
 ---
 ### Representative 4: Loop with Break Statement
 **Name**: `loop_with_break`
 **Source**: Real Nyash pattern
 **File**: `local_tests/phase161/rep4_loop_break.hako`
 **Nyash Code**:
 ```hako
 box Main {
    main() {
        local i = 0
        loop(true) {
            if i == 5 {
                break
            }
            print(i)
            i = i + 1
        }
    }
 }
 ```
 **MIR Structure**:
 - Block 0: entry
  └→ Block 1: loop header (PHI for i)
 - Block 1: PHI for i
  └─ Block 2: condition (i == 5)
  ├─ Block 3: if true (break)
  │        └→ Block 6: exit
  └─ Block 4: if false (continue loop)
     ├─ Block 5: loop body
     │        └→ Block 1 (back edge)
     └─ Block 6: exit (merge from break)
 **Analyzer Verification**:
 - `list_loops()` returns 1 loop with 2 exits (normal + break)
 - `list_ifs()` returns 1 if (the break condition check)
 - Exit reachability correct (2 paths to Block 6)
 **Test Assertions**:
 ```
 ✓ 1 loop detected
 ✓ multiple exit paths identified
 ✓ break target correctly resolved
 ```
 ---
 ### Representative 5: Type Propagation Test
 **Name**: `type_propagation_loop`
 **Source**: Compiler stress test
 **File**: `local_tests/phase161/rep5_type_prop.hako`
 **Nyash Code**:
 ```hako
 box Main {
    main() {
        local x: integer = 0
        local y: integer = 10
        loop(x < y) {
            local z = x + 1     // type: i64
            if z > 5 {
                x = z * 2       // type: i64
            } else {
                x = z + 1       // type: i64 (x must grow here, or the loop never terminates)
            }
        }
        print(x)
    }
 }
 ```
 **MIR Structure**:
 - Multiple PHI instructions carrying i64 type
 - BinOp instructions propagating type
 - Compare operations with type hints
 **Analyzer Verification**:
 - `propagate_types()` returns type_map with all values typed correctly
 - Type propagation converges within 4 iterations
 - No type conflicts detected
 **Test Assertions**:
 ```
 ✓ type propagation completes
 ✓ all ValueIds have consistent types
 ✓ PHI merges compatible types
 ```
 ---
 ## 5. Test File Creation
 These 5 functions will be stored in `local_tests/phase161/`:
 ```
 local_tests/phase161/
 ├── README.md                      (setup instructions)
 ├── rep1_if_simple.hako           (if/else pattern)
 ├── rep1_if_simple.mir.json       (reference MIR output)
 ├── rep2_loop_simple.hako         (loop pattern)
 ├── rep2_loop_simple.mir.json
 ├── rep3_if_loop.hako             (nested if/loop)
 ├── rep3_if_loop.mir.json
 ├── rep4_loop_break.hako          (loop with break)
 ├── rep4_loop_break.mir.json
 ├── rep5_type_prop.hako           (type propagation)
 ├── rep5_type_prop.mir.json
 └── expected_outputs.json         (analyzer output validation)
 ```
 Each `.mir.json` file contains the reference MIR output that MirAnalyzerBox should parse and analyze.
 ---
 ## 6. Validation Strategy for Phase 161-2
 When MirAnalyzerBox is implemented, it will be tested as follows:
 ```
 For each representative function rep_N:
  1. Load rep_N.mir.json
  2. Create MirAnalyzerBox(json_text)
  3. Call each analyzer method
  4. Compare output with expected_outputs.json[rep_N]
  5. Verify: {
       - PHIs found: N ✓
       - Loops detected: M ✓
       - Ifs detected: K ✓
       - Types propagated correctly ✓
     }
 ```
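The comparison in step 4 could be sketched in Python as below. The layout of `expected_outputs.json` is not defined in this document, so the per-representative shape used here (`phis`, `loops`, `ifs` counts) is purely hypothetical, as is the helper name `compare_counts`.

```python
def compare_counts(actual: dict, expected: dict) -> list:
    """Return human-readable mismatches between analyzer output and expected counts."""
    mismatches = []
    for key in ("phis", "loops", "ifs"):
        want = expected.get(key)
        if want is None:
            continue  # no expectation recorded for this analysis
        got = len(actual.get(key, []))
        if got != want:
            mismatches.append(f"{key}: expected {want}, got {got}")
    return mismatches
```

An empty result means the representative passed; each string in a non-empty result names the analysis whose count diverged.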
 ---
 ## 7. Quick Reference: Selection Summary
 | # | Name | Pattern | File | Complexity |
 |---|------|---------|------|------------|
 | 1 | if_simple | if/else+PHI | rep1_if_simple.hako | ⭐ Simple |
 | 2 | loop_simple | loop+back-edge | rep2_loop_simple.hako | ⭐ Simple |
 | 3 | if_loop | nested if/loop | rep3_if_loop.hako | ⭐⭐ Medium |
 | 4 | loop_break | loop+break+multi-exit | rep4_loop_break.hako | ⭐⭐ Medium |
 | 5 | type_prop | type propagation | rep5_type_prop.hako | ⭐⭐ Medium |
 ---
 ## 8. Next Steps (Task 4)
 Once this selection is approved:
 1. **Create the 5 test files** in `local_tests/phase161/`
 2. **Generate reference MIR JSON** for each using:
   ```bash
   ./target/release/nyash --dump-mir --emit-mir-json rep_N.mir.json rep_N.hako
   ```
 3. **Document expected outputs** in `expected_outputs.json`
 4. **Ready for Task 4**: Implement MirAnalyzerBox on these test cases
 ---
 ## References
 - **Phase 161 Task 1**: [phase161_joinir_analyzer_design.md](phase161_joinir_analyzer_design.md)
 - **Phase 161 Task 2**: [phase161_analyzer_box_design.md](phase161_analyzer_box_design.md)
 - **MIR Instruction Reference**: [docs/reference/mir/INSTRUCTION_SET.md](../../../reference/mir/INSTRUCTION_SET.md)
 ---
 **Status**: 🎯 Ready for test file creation (Task 4 preparation)