# Phase 161 Task 3: Representative Functions Selection

**Status**: 🎯 **SELECTION PHASE** - Identifying test functions that cover all analyzer patterns

**Objective**: Select 5-7 representative functions from the codebase that exercise all key analysis patterns (if/loop/phi detection, type propagation, CFG analysis) to serve as the validation test suite for Phase 161-2+.

---

## Executive Summary

Phase 161-2 will implement the MirAnalyzerBox with these core methods:
- `summarize_function()`
- `list_instructions()`
- `list_phis()`
- `list_loops()`
- `list_ifs()`

To check each method against known control-flow shapes, we need a **minimal but comprehensive test suite** that covers:
1. Simple if/else with single PHI merge
2. Loop with back edge and loop-carried PHI
3. Nested if/loop (complex control flow)
4. Loop with multiple exits (break/continue patterns)
5. Complex PHI with multiple incoming values

This document identifies the best candidates from the existing Rust codebase and defines minimal synthetic test cases.

---

## 1. Selection Criteria

Each representative function must satisfy:

- ✅ **Coverage**: Exercise at least one unique analysis pattern not covered by the others
- ✅ **Minimal**: Simple enough to understand completely (~100 instructions max)
- ✅ **Realistic**: Based on actual Nyash code patterns, not artificial
- ✅ **Debuggable**: MIR JSON output is human-readable and easy to trace
- ✅ **Fast**: Emits MIR in <100ms

---

## 2. Representative Function Patterns

### Pattern 1: Simple If/Else (PHI Merge)

**Analysis Focus**: Branch detection, if-merge identification, single PHI

**Structure**:
```
Block 0: if condition
  ├─ Block 1: true_branch
  │   └─ Block 3: merge (PHI)
  └─ Block 2: false_branch
      └─ Block 3: merge (PHI)
```

**What to verify**:
- Branch instruction detected correctly
- Merge block identified as "if merge"
- PHI instruction found with 2 incoming values
- Both branches' ValueIds appear in PHI incoming

**Representative Function**: `if_select_simple` (already in JSON snippets from Task 1)

---

### Pattern 2: Simple Loop (Back Edge + Loop PHI)

**Analysis Focus**: Loop detection, back edge identification, loop-carried PHI

**Structure**:
```
Block 0: loop entry
  └─ Block 1: loop header (PHI)
      ├─ Block 2: loop body
      │   └─ Block 1 (back edge) ← backward jump
      └─ Block 3: loop exit
```

**What to verify**:
- Backward edge detected (Block 2 → Block 1)
- Block 1 identified as loop header
- PHI instruction at header with incoming from [Block 0, Block 2]
- Loop body blocks identified correctly

**Representative Function**: `min_loop` (already in JSON snippets from Task 1)

---

### Pattern 3: If Inside Loop (Nested Control Flow)

**Analysis Focus**: Complex PHI detection, nested block analysis

**Structure**:
```
Block 0: loop entry
  └─ Block 1: loop header (PHI)
      ├─ Block 2: if condition (Branch)
      │   ├─ Block 3: true branch
      │   │   └─ Block 5: if merge (PHI)
      │   └─ Block 4: false branch
      │       └─ Block 5: if merge (PHI)
      │
      └─ Block 5: if merge (PHI)
          └─ Block 1 (loop back edge)
```

**What to verify**:
- 2 PHI instructions identified (Block 1 loop PHI + Block 5 if PHI)
- Loop header and back edge detected despite the nested if
- Both PHI instructions have correct incoming values

**Representative Function**: Candidate search needed (see Candidate C)

---

### Pattern 4: Loop with Break (Multiple Exits)

**Analysis Focus**: Loop with multiple exit paths, complex PHI

**Structure**:
```
Block 0: loop entry
  └─ Block 1: loop header (PHI)
      ├─ Block 2: condition (Branch for break)
      │   ├─ Block 3: break taken
      │   │   └─ Block 5: exit merge (PHI)
      │   └─ Block 4: break not taken
      │       └─ Block 1 (loop back)
      └─ Block 5: exit merge (PHI)
```

**What to verify**:
- Single loop detected (header Block 1)
- TWO exit paths (normal exit + break exit)
- Exit PHI correctly merges both paths

**Representative Function**: Candidate search needed (see Candidate D)

---

### Pattern 5: Multiple Nested PHI (Type Propagation)

**Analysis Focus**: Type hint propagation through multiple PHI layers

**Structure**:
```
Loop with PHI type carries through multiple blocks:
- Block 1 (PHI): integer init value → copies type
- Block 2 (BinOp): type preserved through arithmetic
- Block 3 (PHI merge): receives from multiple paths
- Block 4 (Compare): uses PHI result
```

**What to verify**:
- Type propagation correctly tracks through the PHI chain
- Final type map is consistent
- No conflicts in type inference

**Representative Function**: Candidate search needed (see Candidate E)

---

## 3. Candidate Analysis from Codebase

### Search Strategy

To find representative functions, we search for:
1. Simple if/loop functions in test code
2. Functions whose MIR contains the patterns from Section 2 (PHI merges, back edges, multiple exits)
3. Functions that stress-test the analyzer

### Candidates Found

#### Candidate A: Simple If (CONFIRMED ✅)

**Source**: `apps/tests/if_simple.hako` or similar

**Status**: Already documented in Task 1 JSON snippets as `if_select_simple`

**Properties**:
- 4 blocks
- 1 branch instruction
- 1 PHI instruction
- Simple, clean structure

**Decision**: ✅ SELECTED as Pattern 1

---

#### Candidate B: Simple Loop (CONFIRMED ✅)

**Source**: `apps/tests/loop_min.hako` or similar

**Status**: Already documented in Task 1 JSON snippets as `min_loop`

**Properties**:
- 2-3 blocks
- Loop back edge
- 1 PHI instruction at header
- Minimal but representative

**Decision**: ✅ SELECTED as Pattern 2

---

#### Candidate C: If-Loop Combination

**Source**: Search for `loop(...)` with nested `if` statements

**Pattern**: Nyash code like:
```
loop(condition) {
  if (x == 5) {
    result = 10
  } else {
    result = 20
  }
  x = x + 1
}
```

**Search Command**:
```bash
# Lines that open a loop
rg "loop\s*\(" apps/tests/*.hako | head -20
# Within files that contain a loop, list `if` lines (with or without parentheses)
rg -l "loop\s*\(" apps/tests/*.hako | xargs rg -n "\bif\b" | head -20
```

**Decision**: Requires search - **PENDING**

---

#### Candidate D: Loop with Break

**Source**: Search for `break` statements inside loops

**Pattern**: Nyash code like:
```
loop(i < 10) {
  if (i == 5) {
    break
  }
  i = i + 1
}
```

**Search Command**:
```bash
rg "break" apps/tests/*.hako | head -20
```

**Decision**: Requires search - **PENDING**

---

#### Candidate E: Complex Control Flow

**Source**: Real compiler code patterns

**Pattern**: Functions like MIR emitters or AST walkers

**Search Command**:
```bash
rg "PHI|phi" docs/development/current/main/phase161_joinir_analyzer_design.md | head -10
```

**Decision**: Requires analysis - **PENDING**

---

## 4. Formal Representative Function Selection

Based on the analysis above, the **FINAL 5 REPRESENTATIVES** are:

### Representative 1: Simple If/Else with PHI Merge ✅

**Name**: `if_select_simple`

**Source**: Synthetic minimal test case

**File**: `local_tests/phase161/rep1_if_simple.hako`

**Nyash Code**:
```hako
box Main {
  main() {
    local x = 5
    local result
    if x > 3 {
      result = 10
    } else {
      result = 20
    }
    print(result)  // result comes from the PHI merge
  }
}
```

**MIR Structure**:
- Block 0: entry, load x
- Block 1: branch on condition
  - true → Block 2
  - false → Block 3
- Block 2: const 10 → Block 4
- Block 3: const 20 → Block 4
- Block 4: PHI instruction, merge results
- Block 5: call print

**Analyzer Verification**:
- `list_phis()` returns 1 PHI (destination for merged values)
- `list_ifs()` returns 1 if structure with merge_block=4
- `summarize_function()` reports has_ifs=true, has_phis=true

**Test Assertions**:
```
✓ exactly 1 PHI found
✓ PHI has 2 incoming values
✓ merge_block correctly identified
✓ both true_block and false_block paths lead to merge
```

---

### Representative 2: Simple Loop with Back Edge ✅

**Name**: `min_loop`

**Source**: Synthetic minimal test case

**File**: `local_tests/phase161/rep2_loop_simple.hako`

**Nyash Code**:
```hako
box Main {
  main() {
    local i = 0
    loop(i < 10) {
      print(i)
      i = i + 1  // PHI at header carries i value
    }
  }
}
```

**MIR Structure**:
- Block 0: entry, i = 0
  └→ Block 1: loop header
- Block 1: PHI instruction (incoming from Block 0 initial, Block 3 loop-carry)
  └─ Block 2: branch condition
      ├─ true → Block 3: loop body
      │   └→ Block 1 (back edge)
      └─ false → Block 4: exit

**Analyzer Verification**:
- `list_loops()` returns 1 loop (header=Block 1, back_edge from Block 3)
- `list_phis()` returns 1 PHI at Block 1
- CFG correctly identifies backward edge (Block 3 → Block 1)

**Test Assertions**:
```
✓ exactly 1 loop detected
✓ loop header correctly identified as Block 1
✓ back edge from Block 3 to Block 1
✓ loop body blocks identified (Block 2, 3)
✓ exit block correctly identified
```

---

### Representative 3: Nested If Inside Loop

**Name**: `if_in_loop`

**Source**: Real Nyash pattern

**File**: `local_tests/phase161/rep3_if_loop.hako`

**Nyash Code**:
```hako
box Main {
  main() {
    local i = 0
    local sum = 0
    loop(i < 10) {
      if i % 2 == 0 {
        sum = sum + i
      } else {
        sum = sum - i
      }
      i = i + 1
    }
    print(sum)
  }
}
```

**MIR Structure**:
- Block 0: entry
  └→ Block 1: loop header (PHI for i, sum)
- Block 1: PHI × 2 (for i and sum loop carries)
  ├─ Block 2: condition (i < 10)
  │   ├─ Block 3: inner condition (i % 2 == 0)
  │   │   ├─ Block 4: true → sum = sum + i
  │   │   │   └→ Block 5: if merge
  │   │   └─ false branch (own block, id TBD from generated MIR) → sum = sum - i
  │   │       └→ Block 5: if merge (PHI)
  │   │
  │   └─ Block 6: i = i + 1
  │       └→ Block 1 (back edge, loop carry for i, sum)
  └─ Block 7: exit

**Analyzer Verification**:
- `list_loops()` returns 1 loop (header=Block 1)
- `list_phis()` returns 3 PHI instructions:
  - Block 1: 2 PHIs (for i and sum)
  - Block 5: 1 PHI (if merge)
- `list_ifs()` returns 1 if structure (nested inside loop)

**Test Assertions**:
```
✓ 1 loop and 1 if detected
✓ 3 total PHI instructions found (2 at header, 1 at merge)
✓ nested structure correctly represented
```

---

### Representative 4: Loop with Break Statement

**Name**: `loop_with_break`

**Source**: Real Nyash pattern

**File**: `local_tests/phase161/rep4_loop_break.hako`

**Nyash Code**:
```hako
box Main {
  main() {
    local i = 0
    loop(true) {
      if i == 5 {
        break
      }
      print(i)
      i = i + 1
    }
  }
}
```

**MIR Structure**:
- Block 0: entry
  └→ Block 1: loop header (PHI for i)
- Block 1: PHI for i
  └─ Block 2: condition (i == 5)
      ├─ Block 3: if true (break)
      │   └→ Block 6: exit
      └─ Block 4: if false (continue loop)
          ├─ Block 5: loop body
          │   └→ Block 1 (back edge)
          └─ Block 6: exit (merge from break)

**Analyzer Verification**:
- `list_loops()` returns 1 loop with 2 exits (normal + break)
- `list_ifs()` returns 1 if (the break condition check)
- Exit reachability correct (2 paths to Block 6)

**Test Assertions**:
```
✓ 1 loop detected
✓ multiple exit paths identified
✓ break target correctly resolved
```

---

### Representative 5: Type Propagation Test

**Name**: `type_propagation_loop`

**Source**: Compiler stress test

**File**: `local_tests/phase161/rep5_type_prop.hako`

**Nyash Code**:
```hako
box Main {
  main() {
    local x: integer = 0
    local y: integer = 10
    loop(x < y) {
      local z = x + 1   // type: i64
      if z > 5 {
        x = z * 2       // type: i64
      } else {
        x = z + 1       // type: i64
      }
    }
    print(x)
  }
}
```

**MIR Structure**:
- Multiple PHI instructions carrying i64 type
- BinOp instructions propagating type
- Compare operations with type hints

**Analyzer Verification**:
- `propagate_types()` returns type_map with all values typed correctly
- Type propagation converges within 4 iterations
- No type conflicts detected

**Test Assertions**:
```
✓ type propagation completes
✓ all ValueIds have consistent types
✓ PHI merges compatible types
```

---

## 5. Test File Creation

These 5 functions will be stored in `local_tests/phase161/`:

```
local_tests/phase161/
├── README.md                  (setup instructions)
├── rep1_if_simple.hako        (if/else pattern)
├── rep1_if_simple.mir.json    (reference MIR output)
├── rep2_loop_simple.hako      (loop pattern)
├── rep2_loop_simple.mir.json
├── rep3_if_loop.hako          (nested if/loop)
├── rep3_if_loop.mir.json
├── rep4_loop_break.hako       (loop with break)
├── rep4_loop_break.mir.json
├── rep5_type_prop.hako        (type propagation)
├── rep5_type_prop.mir.json
└── expected_outputs.json      (analyzer output validation)
```

Each `.mir.json` file contains the reference MIR output that MirAnalyzerBox should parse and analyze.

---

## 6. Validation Strategy for Phase 161-2

Once MirAnalyzerBox is implemented, it will be tested as follows:

```
For each representative function rep_N:
  1. Load rep_N.mir.json
  2. Create MirAnalyzerBox(json_text)
  3. Call each analyzer method
  4. Compare output with expected_outputs.json[rep_N]
  5. Verify: {
       - PHIs found: N ✓
       - Loops detected: M ✓
       - Ifs detected: K ✓
       - Types propagated correctly ✓
     }
```

---

## 7. Quick Reference: Selection Summary

| # | Name | Pattern | File | Complexity |
|---|------|---------|------|------------|
| 1 | `if_select_simple` | if/else+PHI | rep1_if_simple.hako | ⭐ Simple |
| 2 | `min_loop` | loop+back-edge | rep2_loop_simple.hako | ⭐ Simple |
| 3 | `if_in_loop` | nested if/loop | rep3_if_loop.hako | ⭐⭐ Medium |
| 4 | `loop_with_break` | loop+break+multi-exit | rep4_loop_break.hako | ⭐⭐ Medium |
| 5 | `type_propagation_loop` | type propagation | rep5_type_prop.hako | ⭐⭐ Medium |

---

## 8. Next Steps (Task 4)

Once this selection is approved:

1. **Create the 5 test files** in `local_tests/phase161/`
2. **Generate reference MIR JSON** for each using:
   ```bash
   ./target/release/nyash --dump-mir --emit-mir-json rep_N.mir.json rep_N.hako
   ```
3. **Document expected outputs** in `expected_outputs.json`
4. **Ready for Task 4**: Implement MirAnalyzerBox on these test cases

---

## References

- **Phase 161 Task 1**: [phase161_joinir_analyzer_design.md](phase161_joinir_analyzer_design.md)
- **Phase 161 Task 2**: [phase161_analyzer_box_design.md](phase161_analyzer_box_design.md)
- **MIR Instruction Reference**: [docs/reference/mir/INSTRUCTION_SET.md](../../../reference/mir/INSTRUCTION_SET.md)

---

**Status**: 🎯 Ready for test file creation (Task 4 preparation)
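
---

As a rough illustration of the validation loop in Section 6, the sketch below runs two of the checks (PHI count and back-edge detection) against a hand-written stand-in for the `min_loop` MIR. It is written in Python for brevity rather than as a MirAnalyzerBox implementation. The JSON field names (`blocks`, `id`, `successors`, `instructions`, `op`, `incoming`) and the helper names `find_back_edges`, `summarize`, and `validate` are assumptions made for this example and may not match the real MIR JSON schema or the final analyzer API. If-merge detection is left out.

```python
import json

# Hypothetical, simplified MIR JSON for the min_loop shape (Representative 2).
# The schema here is an assumption for illustration only.
MIN_LOOP_MIR = json.dumps({
    "name": "min_loop",
    "blocks": [
        {"id": 0, "successors": [1], "instructions": [{"op": "const"}]},
        {"id": 1, "successors": [2],
         # incoming pairs are [predecessor_block, value_id]
         "instructions": [{"op": "phi", "incoming": [[0, 1], [3, 5]]}]},
        {"id": 2, "successors": [3, 4], "instructions": [{"op": "branch"}]},
        {"id": 3, "successors": [1], "instructions": [{"op": "binop"}]},
        {"id": 4, "successors": [], "instructions": [{"op": "ret"}]},
    ],
})


def list_phis(func):
    """Return (block_id, instruction) pairs for every PHI instruction."""
    return [(block["id"], ins)
            for block in func["blocks"]
            for ins in block["instructions"]
            if ins["op"] == "phi"]


def find_back_edges(func, entry=0):
    """Return edges (src, dst) whose target is still on the DFS stack."""
    succs = {block["id"]: block["successors"] for block in func["blocks"]}
    state = {}       # block id -> "active" (on the DFS stack) or "done"
    back_edges = []

    def visit(block_id):
        state[block_id] = "active"
        for nxt in succs[block_id]:
            if state.get(nxt) == "active":
                back_edges.append((block_id, nxt))   # jump to an open ancestor
            elif nxt not in state:
                visit(nxt)
        state[block_id] = "done"

    visit(entry)
    return back_edges


def summarize(func):
    """Collect the counts that the expected-output file would be compared to."""
    back_edges = find_back_edges(func)
    return {
        "phis": len(list_phis(func)),
        "loops": len({dst for _, dst in back_edges}),  # one loop per header
        "back_edges": back_edges,
    }


def validate(mir_json, expected):
    """Compare each expected field with the analyzer summary."""
    summary = summarize(json.loads(mir_json))
    return {key: summary[key] == want for key, want in expected.items()}


if __name__ == "__main__":
    print(validate(MIN_LOOP_MIR,
                   {"phis": 1, "loops": 1, "back_edges": [(3, 1)]}))
```

The depth-first search treats an edge as a back edge when its target is still being visited, which is enough for the reducible control flow shown in the representatives. A dominator-based check would be the more general choice if irreducible graphs ever need to be handled.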