Files
hakorune/docs/archive/phases/phase-150-167/phase154_feedback.md
nyash-codex d4f90976da refactor(joinir): Phase 244 - ConditionLoweringBox trait unification
Unify condition lowering logic across Pattern 2/4 with trait-based API.

New infrastructure:
- condition_lowering_box.rs: ConditionLoweringBox trait + ConditionContext (293 lines)
- ExprLowerer implements ConditionLoweringBox trait (+51 lines)

Pattern migrations:
- Pattern 2 (loop_with_break_minimal.rs): Use trait API
- Pattern 4 (loop_with_continue_minimal.rs): Use trait API

Benefits:
- Unified condition lowering interface
- Extensible for future lowering strategies
- Clean API boundary between patterns and lowering logic
- Zero code duplication

Test results: 911/911 PASS (+2 new tests)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 02:35:31 +09:00

10 KiB

Phase 154: Feedback & Recommendations

Implementation Feedback

What Went Well

  1. Clear Architecture from the Start

    • The MIR already had CFG information (predecessors, successors, reachability)
    • Following Phase 153's boxed pattern made DeadBlockAnalyzerBox straightforward
    • Separation of concerns: CFG extraction (Rust) vs. analysis (.hako)
  2. Comprehensive Documentation

    • Phase 150 results provided excellent test cases to reference
    • Inventory document clarified the data flow early
    • Implementation summary will help Phase 155 engineer
  3. Test-Driven Design

    • Created test cases before implementation
    • 4 patterns cover common unreachable code scenarios
    • Smoke script validates infrastructure even without data bridge
  4. Graceful Degradation

    • DeadBlockAnalyzerBox skips gracefully if CFG unavailable
    • Smoke test doesn't fail during MVP phase
    • Non-breaking changes to existing hako_check flow

Challenges Encountered ⚠️

  1. Data Bridge Gap

    • Issue: analysis_consumer.hako builds IR from text scanning, not MIR
    • Impact: CFG extractor implemented but not yet usable
    • Resolution: Documented as Phase 155 task (2-3 hour estimate)
  2. Module System Complexity

    • Initial confusion about where to add cfg_extractor module
    • Resolved by examining existing join_ir module pattern
    • Lesson: Always check mod.rs structure first
  3. Testing Without Data

    • Hard to validate DeadBlockAnalyzerBox without actual CFG
    • Created mock-aware smoke test as workaround
    • Lesson: Unit tests in Rust covered CFG extraction logic

Recommendations for Future Phases

Immediate (Phase 155) 🔥

Priority 1: Complete CFG Data Bridge

Add builtin function to extract CFG from compiled MIR:

// In src/boxes/compiler_box.rs or similar
fn extract_mir_cfg(text: &str, path: &str) -> Result<serde_json::Value, String> {
    // 1. Compile text to MIR
    let ast = parse_source(text)?;
    let module = MirCompiler::new().compile(&ast)?;

    // 2. Extract CFG
    let cfg = extract_cfg_info(&module);

    Ok(cfg)
}

Then update analysis_consumer.hako:

build_from_source_flags(text, path, no_ast) {
    local ir = new MapBox()
    // ... existing text scanning ...

    // NEW: Add CFG if available
    local cfg = extract_mir_cfg(text, path)  // Rust builtin
    if cfg != null {
        ir.set("cfg", cfg)
    }

    return ir
}

Estimated Time: 2-3 hours Complexity: Low (mostly plumbing)

Short-term (Phase 156-158) 📋

Enhancement 1: Source Location Mapping

  • Track span information in CFG extractor
  • Show line numbers in HC020 output
  • Example: [HC020] ... :: test.hako:15

Enhancement 2: Better Reason Inference

  • Analyze surrounding context (not just terminator type)
  • Distinguish between different Branch patterns
  • Example: "constant false condition" vs "always true guard"

Enhancement 3: Integration Testing

  • Add hako_check tests to main test suite
  • Verify HC020 doesn't trigger false positives
  • Test with large real-world .hako files

Medium-term (Phase 160-165) 🚀

Feature 1: Constant Propagation

local x = 0
if x > 0 {  // Can be proven false at compile time
    // Should trigger HC020
}

Currently: May not detect (depends on MIR optimizer) Future: Always detect via symbolic execution

Feature 2: Path-Sensitive Analysis

if condition {
    if not condition {  // Contradiction!
        // Should trigger HC020
    }
}

Feature 3: CFG Visualization

# Generate DOT graph with dead blocks highlighted
./tools/hako_check.sh --format dot --dead-blocks program.hako > cfg.dot
dot -Tpng cfg.dot -o cfg.png

Red nodes = unreachable blocks Gray edges = never taken

Long-term (Phase 200+) 🌟

Feature 4: Interactive Reports

  • HTML output with clickable CFG
  • Hover over blocks to see code
  • Click to jump to source location

Feature 5: Fix Suggestions

[HC020] Unreachable basic block: fn=Main.test bb=5 (after early return)
Suggestion: Remove unreachable code at lines 15-20, or move before return at line 12

Design Improvements for Similar Tasks

1. Early Data Flow Validation

What to do differently:

  • Before implementing analyzer, verify data availability
  • Create end-to-end mockup with hardcoded data
  • Test analysis logic before building pipeline

Why it helps:

  • Catches integration gaps early
  • Validates assumptions about data format
  • Allows parallel development (data bridge + analysis)

2. Incremental Testing

What to do differently:

  • Test CFG extractor → Test DeadBlockAnalyzerBox → Test CLI integration
  • Each step independently validated
  • Mock data for middle layers

Why it helps:

  • Pinpoints failures faster
  • Easier to debug
  • More confidence at each stage

3. Documentation-First Approach

What worked well:

  • Writing inventory doc forced careful investigation
  • Implementation summary guided development
  • Feedback doc captures lessons learned

Apply to future phases:

  • Always start with design doc
  • Document decisions and alternatives
  • Create feedback doc after completion

Specific Code Improvements

CFG Extractor

Current:

fn terminator_to_string(inst: &MirInstruction) -> String {
    match inst {
        MirInstruction::Branch { .. } => "Branch".to_string(),
        MirInstruction::Jump { .. } => "Jump".to_string(),
        MirInstruction::Return { .. } => "Return".to_string(),
        _ => "Unknown".to_string(),
    }
}

Future Enhancement:

fn terminator_to_string(inst: &MirInstruction) -> String {
    match inst {
        MirInstruction::Branch { condition, .. } => {
            // Check if condition is constant
            if is_constant_value(condition) {
                "ConstantBranch".to_string()
            } else {
                "Branch".to_string()
            }
        }
        // ... rest ...
    }
}

Benefit: Enables better reason inference in DeadBlockAnalyzerBox

DeadBlockAnalyzerBox

Current:

_infer_unreachable_reason(terminator) {
    if terminator == "Return" { return "after early return" }
    if terminator == "Jump" { return "unreachable branch" }
    // ...
}

Future Enhancement:

_infer_unreachable_reason(block, cfg) {
    // Look at parent block's terminator
    local parent_id = me._find_parent_block(block, cfg)
    local parent = me._get_block_by_id(cfg, parent_id)

    if parent.terminator == "ConstantBranch" {
        return "constant false condition"
    }
    // ... more sophisticated analysis ...
}

Benefit: More actionable error messages

Testing Recommendations

Unit Test Coverage

Add tests for:

  • CFG extraction with complex control flow (nested loops, try-catch)
  • DeadBlockAnalyzerBox edge cases (empty functions, single-block functions)
  • Terminator inference logic

Test file:

// src/mir/cfg_extractor.rs
#[test]
fn test_nested_loop_cfg() { /* ... */ }

#[test]
fn test_try_catch_cfg() { /* ... */ }

Integration Test Patterns

Once bridge is complete:

  1. Positive Tests (should detect HC020)

    • Early return in nested if
    • Break in loop with subsequent code
    • Panic/throw with subsequent code
  2. Negative Tests (should NOT detect HC020)

    • Conditional return (not always taken)
    • Loop with conditional break
    • Reachable code after if-else
  3. Edge Cases

    • Empty functions
    • Functions with only one block
    • Functions with dead blocks but live code

Process Improvements

1. Parallel Development Strategy

For Phase 155:

  • One person implements data bridge (Rust-side)
  • Another prepares integration tests (.hako-side)
  • Meet in middle with agreed JSON format

2. Smoke Test Evolution

Current: Lenient (allows CFG unavailability) Phase 155: Strict (requires CFG, expects HC020) Phase 160: Comprehensive (multiple files, performance tests)

Update smoke script as pipeline matures.

3. Documentation Maintenance

Add to each phase:

  • Before: Design doc with alternatives
  • During: Progress log with blockers
  • After: Feedback doc with lessons

Benefit: Knowledge transfer, future reference, debugging aid

Questions for Discussion

Q1: CFG Bridge Location

Options:

  1. New builtin function: extract_mir_cfg(text, path)
  2. Extend existing parser: HakoParserCoreBox.parse_with_cfg(text)
  3. Separate tool: hakorune --extract-cfg program.hako

Recommendation: Option 1 (cleaner separation)

Q2: Performance Optimization

Question: Should we cache CFG extraction?

Analysis:

  • MIR compilation is expensive (~50-100ms per file)
  • CFG extraction is cheap (~1ms)
  • But hako_check runs once, so caching not useful

Recommendation: No caching for now, revisit in CI/CD context

Q3: Error Handling

Question: What if MIR compilation fails?

Options:

  1. Skip HC020 silently
  2. Show warning "CFG unavailable due to compile error"
  3. Fail entire hako_check run

Recommendation: Option 2 (user-friendly, informative)

Success Metrics (Phase 155)

Definition of Done

  • CFG data bridge implemented and tested
  • All 4 test cases produce HC020 output
  • Smoke test passes without warnings
  • No false positives on large .hako files
  • Documentation updated (bridge complete)

Performance Targets

  • CFG extraction: <5ms per function
  • HC020 analysis: <10ms per file
  • Total overhead: <5% of hako_check runtime

Quality Gates

  • Zero false positives on test suite
  • HC020 output includes function and block ID
  • Error messages are actionable
  • No crashes on malformed input

Conclusion

Phase 154 successfully delivered the infrastructure for block-level dead code detection. The core components are complete, tested, and documented.

Key Success Factors:

  • Clear architecture from Phase 153 pattern
  • Thorough investigation before implementation
  • Graceful degradation during MVP
  • Comprehensive documentation for handoff

Next Phase (155) is Straightforward:

  • 2-3 hours of Rust-side plumbing
  • Well-defined task with clear deliverables
  • High confidence in success

Recommendation: Proceed with Phase 155 implementation immediately. The foundation is solid, and the remaining work is mechanical.


Feedback Author: Claude (Anthropic) Date: 2025-12-04 Phase: 154 (Complete) Next Phase: 155 (CFG Data Bridge) Status: Historical