2025-12-04 15:15:27 +09:00
|
|
|
# Phase 154: Feedback & Recommendations
|
|
|
|
|
|
|
|
|
|
## Implementation Feedback
|
|
|
|
|
|
|
|
|
|
### What Went Well ✅
|
|
|
|
|
|
|
|
|
|
1. **Clear Architecture from the Start**
|
|
|
|
|
- The MIR already had CFG information (predecessors, successors, reachability)
|
|
|
|
|
- Following Phase 153's boxed pattern made DeadBlockAnalyzerBox straightforward
|
|
|
|
|
- Separation of concerns: CFG extraction (Rust) vs. analysis (.hako)
|
|
|
|
|
|
|
|
|
|
2. **Comprehensive Documentation**
|
|
|
|
|
- Phase 150 results provided excellent test cases to reference
|
|
|
|
|
- Inventory document clarified the data flow early
|
|
|
|
|
- Implementation summary will help Phase 155 engineer
|
|
|
|
|
|
|
|
|
|
3. **Test-Driven Design**
|
|
|
|
|
- Created test cases before implementation
|
|
|
|
|
- 4 patterns cover common unreachable code scenarios
|
|
|
|
|
- Smoke script validates infrastructure even without data bridge
|
|
|
|
|
|
|
|
|
|
4. **Graceful Degradation**
|
|
|
|
|
- DeadBlockAnalyzerBox skips gracefully if CFG unavailable
|
|
|
|
|
- Smoke test doesn't fail during MVP phase
|
|
|
|
|
- Non-breaking changes to existing hako_check flow
|
|
|
|
|
|
|
|
|
|
### Challenges Encountered ⚠️
|
|
|
|
|
|
|
|
|
|
1. **Data Bridge Gap**
|
|
|
|
|
- **Issue:** analysis_consumer.hako builds IR from text scanning, not MIR
|
|
|
|
|
- **Impact:** CFG extractor implemented but not yet usable
|
|
|
|
|
- **Resolution:** Documented as Phase 155 task (2-3 hour estimate)
|
|
|
|
|
|
|
|
|
|
2. **Module System Complexity**
|
|
|
|
|
- Initial confusion about where to add `cfg_extractor` module
|
|
|
|
|
- Resolved by examining existing join_ir module pattern
|
|
|
|
|
- **Lesson:** Always check `mod.rs` structure first
|
|
|
|
|
|
|
|
|
|
3. **Testing Without Data**
|
|
|
|
|
- Hard to validate DeadBlockAnalyzerBox without actual CFG
|
|
|
|
|
- Created mock-aware smoke test as workaround
|
|
|
|
|
- **Lesson:** Unit tests in Rust covered CFG extraction logic
|
|
|
|
|
|
|
|
|
|
## Recommendations for Future Phases
|
|
|
|
|
|
|
|
|
|
### Immediate (Phase 155) 🔥
|
|
|
|
|
|
|
|
|
|
**Priority 1: Complete CFG Data Bridge**
|
|
|
|
|
|
|
|
|
|
Add builtin function to extract CFG from compiled MIR:
|
|
|
|
|
|
|
|
|
|
```rust
|
|
|
|
|
// In src/boxes/compiler_box.rs or similar
|
|
|
|
|
fn extract_mir_cfg(text: &str, path: &str) -> Result<serde_json::Value, String> {
|
|
|
|
|
// 1. Compile text to MIR
|
|
|
|
|
let ast = parse_source(text)?;
|
|
|
|
|
let module = MirCompiler::new().compile(&ast)?;
|
|
|
|
|
|
|
|
|
|
// 2. Extract CFG
|
|
|
|
|
let cfg = extract_cfg_info(&module);
|
|
|
|
|
|
|
|
|
|
Ok(cfg)
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Then update `analysis_consumer.hako`:
|
|
|
|
|
```hako
|
|
|
|
|
build_from_source_flags(text, path, no_ast) {
|
|
|
|
|
local ir = new MapBox()
|
|
|
|
|
// ... existing text scanning ...
|
|
|
|
|
|
|
|
|
|
// NEW: Add CFG if available
|
|
|
|
|
local cfg = extract_mir_cfg(text, path) // Rust builtin
|
|
|
|
|
if cfg != null {
|
|
|
|
|
ir.set("cfg", cfg)
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
return ir
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
**Estimated Time:** 2-3 hours
|
|
|
|
|
**Complexity:** Low (mostly plumbing)
|
|
|
|
|
|
|
|
|
|
### Short-term (Phase 156-158) 📋
|
|
|
|
|
|
|
|
|
|
**Enhancement 1: Source Location Mapping**
|
|
|
|
|
- Track span information in CFG extractor
|
|
|
|
|
- Show line numbers in HC020 output
|
|
|
|
|
- Example: `[HC020] ... :: test.hako:15`
|
|
|
|
|
|
|
|
|
|
**Enhancement 2: Better Reason Inference**
|
|
|
|
|
- Analyze surrounding context (not just terminator type)
|
|
|
|
|
- Distinguish between different Branch patterns
|
|
|
|
|
- Example: "constant false condition" vs "always true guard"
|
|
|
|
|
|
|
|
|
|
**Enhancement 3: Integration Testing**
|
|
|
|
|
- Add hako_check tests to main test suite
|
|
|
|
|
- Verify HC020 doesn't trigger false positives
|
|
|
|
|
- Test with large real-world .hako files
|
|
|
|
|
|
|
|
|
|
### Medium-term (Phase 160-165) 🚀
|
|
|
|
|
|
|
|
|
|
**Feature 1: Constant Propagation**
|
|
|
|
|
```hako
|
|
|
|
|
local x = 0
|
|
|
|
|
if x > 0 { // Can be proven false at compile time
|
|
|
|
|
// Should trigger HC020
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
Currently: May not detect (depends on MIR optimizer)
|
|
|
|
|
Future: Always detect via symbolic execution
|
|
|
|
|
|
|
|
|
|
**Feature 2: Path-Sensitive Analysis**
|
|
|
|
|
```hako
|
|
|
|
|
if condition {
|
|
|
|
|
if not condition { // Contradiction!
|
|
|
|
|
// Should trigger HC020
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
**Feature 3: CFG Visualization**
|
|
|
|
|
```bash
|
|
|
|
|
# Generate DOT graph with dead blocks highlighted
|
|
|
|
|
./tools/hako_check.sh --format dot --dead-blocks program.hako > cfg.dot
|
|
|
|
|
dot -Tpng cfg.dot -o cfg.png
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Red nodes = unreachable blocks
|
|
|
|
|
Gray edges = never taken
|
|
|
|
|
|
|
|
|
|
### Long-term (Phase 200+) 🌟
|
|
|
|
|
|
|
|
|
|
**Feature 4: Interactive Reports**
|
|
|
|
|
- HTML output with clickable CFG
|
|
|
|
|
- Hover over blocks to see code
|
|
|
|
|
- Click to jump to source location
|
|
|
|
|
|
|
|
|
|
**Feature 5: Fix Suggestions**
|
|
|
|
|
```
|
|
|
|
|
[HC020] Unreachable basic block: fn=Main.test bb=5 (after early return)
|
|
|
|
|
Suggestion: Remove unreachable code at lines 15-20, or move before return at line 12
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
## Design Improvements for Similar Tasks
|
|
|
|
|
|
|
|
|
|
### 1. Early Data Flow Validation
|
|
|
|
|
|
|
|
|
|
**What to do differently:**
|
|
|
|
|
- Before implementing analyzer, verify data availability
|
|
|
|
|
- Create end-to-end mockup with hardcoded data
|
|
|
|
|
- Test analysis logic before building pipeline
|
|
|
|
|
|
|
|
|
|
**Why it helps:**
|
|
|
|
|
- Catches integration gaps early
|
|
|
|
|
- Validates assumptions about data format
|
|
|
|
|
- Allows parallel development (data bridge + analysis)
|
|
|
|
|
|
|
|
|
|
### 2. Incremental Testing
|
|
|
|
|
|
|
|
|
|
**What to do differently:**
|
|
|
|
|
- Test CFG extractor → Test DeadBlockAnalyzerBox → Test CLI integration
|
|
|
|
|
- Each step independently validated
|
|
|
|
|
- Mock data for middle layers
|
|
|
|
|
|
|
|
|
|
**Why it helps:**
|
|
|
|
|
- Pinpoints failures faster
|
|
|
|
|
- Easier to debug
|
|
|
|
|
- More confidence at each stage
|
|
|
|
|
|
|
|
|
|
### 3. Documentation-First Approach
|
|
|
|
|
|
|
|
|
|
**What worked well:**
|
|
|
|
|
- Writing inventory doc forced careful investigation
|
|
|
|
|
- Implementation summary guided development
|
|
|
|
|
- Feedback doc captures lessons learned
|
|
|
|
|
|
|
|
|
|
**Apply to future phases:**
|
|
|
|
|
- Always start with design doc
|
|
|
|
|
- Document decisions and alternatives
|
|
|
|
|
- Create feedback doc after completion
|
|
|
|
|
|
|
|
|
|
## Specific Code Improvements
|
|
|
|
|
|
|
|
|
|
### CFG Extractor
|
|
|
|
|
|
|
|
|
|
**Current:**
|
|
|
|
|
```rust
|
|
|
|
|
fn terminator_to_string(inst: &MirInstruction) -> String {
|
|
|
|
|
match inst {
|
|
|
|
|
MirInstruction::Branch { .. } => "Branch".to_string(),
|
|
|
|
|
MirInstruction::Jump { .. } => "Jump".to_string(),
|
|
|
|
|
MirInstruction::Return { .. } => "Return".to_string(),
|
|
|
|
|
_ => "Unknown".to_string(),
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
**Future Enhancement:**
|
|
|
|
|
```rust
|
|
|
|
|
fn terminator_to_string(inst: &MirInstruction) -> String {
|
|
|
|
|
match inst {
|
|
|
|
|
MirInstruction::Branch { condition, .. } => {
|
|
|
|
|
// Check if condition is constant
|
|
|
|
|
if is_constant_value(condition) {
|
|
|
|
|
"ConstantBranch".to_string()
|
|
|
|
|
} else {
|
|
|
|
|
"Branch".to_string()
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
// ... rest ...
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Benefit: Enables better reason inference in DeadBlockAnalyzerBox
|
|
|
|
|
|
|
|
|
|
### DeadBlockAnalyzerBox
|
|
|
|
|
|
|
|
|
|
**Current:**
|
|
|
|
|
```hako
|
|
|
|
|
_infer_unreachable_reason(terminator) {
|
|
|
|
|
if terminator == "Return" { return "after early return" }
|
|
|
|
|
if terminator == "Jump" { return "unreachable branch" }
|
|
|
|
|
// ...
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
**Future Enhancement:**
|
|
|
|
|
```hako
|
|
|
|
|
_infer_unreachable_reason(block, cfg) {
|
|
|
|
|
// Look at parent block's terminator
|
|
|
|
|
local parent_id = me._find_parent_block(block, cfg)
|
|
|
|
|
local parent = me._get_block_by_id(cfg, parent_id)
|
|
|
|
|
|
|
|
|
|
if parent.terminator == "ConstantBranch" {
|
|
|
|
|
return "constant false condition"
|
|
|
|
|
}
|
|
|
|
|
// ... more sophisticated analysis ...
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Benefit: More actionable error messages
|
|
|
|
|
|
|
|
|
|
## Testing Recommendations
|
|
|
|
|
|
|
|
|
|
### Unit Test Coverage
|
|
|
|
|
|
|
|
|
|
**Add tests for:**
|
|
|
|
|
- CFG extraction with complex control flow (nested loops, try-catch)
|
|
|
|
|
- DeadBlockAnalyzerBox edge cases (empty functions, single-block functions)
|
|
|
|
|
- Terminator inference logic
|
|
|
|
|
|
|
|
|
|
**Test file:**
|
|
|
|
|
```rust
|
|
|
|
|
// src/mir/cfg_extractor.rs
|
|
|
|
|
#[test]
|
|
|
|
|
fn test_nested_loop_cfg() { /* ... */ }
|
|
|
|
|
|
|
|
|
|
#[test]
|
|
|
|
|
fn test_try_catch_cfg() { /* ... */ }
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
### Integration Test Patterns
|
|
|
|
|
|
|
|
|
|
**Once bridge is complete:**
|
|
|
|
|
|
|
|
|
|
1. **Positive Tests** (should detect HC020)
|
|
|
|
|
- Early return in nested if
|
|
|
|
|
- Break in loop with subsequent code
|
|
|
|
|
- Panic/throw with subsequent code
|
|
|
|
|
|
|
|
|
|
2. **Negative Tests** (should NOT detect HC020)
|
|
|
|
|
- Conditional return (not always taken)
|
|
|
|
|
- Loop with conditional break
|
|
|
|
|
- Reachable code after if-else
|
|
|
|
|
|
|
|
|
|
3. **Edge Cases**
|
|
|
|
|
- Empty functions
|
|
|
|
|
- Functions with only one block
|
|
|
|
|
- Functions with dead blocks but live code
|
|
|
|
|
|
|
|
|
|
## Process Improvements
|
|
|
|
|
|
|
|
|
|
### 1. Parallel Development Strategy
|
|
|
|
|
|
|
|
|
|
**For Phase 155:**
|
|
|
|
|
- One person implements data bridge (Rust-side)
|
|
|
|
|
- Another prepares integration tests (.hako-side)
|
|
|
|
|
- Meet in middle with agreed JSON format
|
|
|
|
|
|
|
|
|
|
### 2. Smoke Test Evolution
|
|
|
|
|
|
|
|
|
|
**Current:** Lenient (allows CFG unavailability)
|
|
|
|
|
**Phase 155:** Strict (requires CFG, expects HC020)
|
|
|
|
|
**Phase 160:** Comprehensive (multiple files, performance tests)
|
|
|
|
|
|
|
|
|
|
Update smoke script as pipeline matures.
|
|
|
|
|
|
|
|
|
|
### 3. Documentation Maintenance
|
|
|
|
|
|
|
|
|
|
**Add to each phase:**
|
|
|
|
|
- Before: Design doc with alternatives
|
|
|
|
|
- During: Progress log with blockers
|
|
|
|
|
- After: Feedback doc with lessons
|
|
|
|
|
|
|
|
|
|
**Benefit:** Knowledge transfer, future reference, debugging aid
|
|
|
|
|
|
|
|
|
|
## Questions for Discussion
|
|
|
|
|
|
|
|
|
|
### Q1: CFG Bridge Location
|
|
|
|
|
|
|
|
|
|
**Options:**
|
|
|
|
|
1. New builtin function: `extract_mir_cfg(text, path)`
|
|
|
|
|
2. Extend existing parser: `HakoParserCoreBox.parse_with_cfg(text)`
|
|
|
|
|
3. Separate tool: `hakorune --extract-cfg program.hako`
|
|
|
|
|
|
|
|
|
|
**Recommendation:** Option 1 (cleaner separation)
|
|
|
|
|
|
|
|
|
|
### Q2: Performance Optimization
|
|
|
|
|
|
|
|
|
|
**Question:** Should we cache CFG extraction?
|
|
|
|
|
|
|
|
|
|
**Analysis:**
|
|
|
|
|
- MIR compilation is expensive (~50-100ms per file)
|
|
|
|
|
- CFG extraction is cheap (~1ms)
|
|
|
|
|
- But hako_check runs once, so caching not useful
|
|
|
|
|
|
|
|
|
|
**Recommendation:** No caching for now, revisit in CI/CD context
|
|
|
|
|
|
|
|
|
|
### Q3: Error Handling
|
|
|
|
|
|
|
|
|
|
**Question:** What if MIR compilation fails?
|
|
|
|
|
|
|
|
|
|
**Options:**
|
|
|
|
|
1. Skip HC020 silently
|
|
|
|
|
2. Show warning "CFG unavailable due to compile error"
|
|
|
|
|
3. Fail entire hako_check run
|
|
|
|
|
|
|
|
|
|
**Recommendation:** Option 2 (user-friendly, informative)
|
|
|
|
|
|
|
|
|
|
## Success Metrics (Phase 155)
|
|
|
|
|
|
|
|
|
|
### Definition of Done
|
|
|
|
|
|
|
|
|
|
- [ ] CFG data bridge implemented and tested
|
|
|
|
|
- [ ] All 4 test cases produce HC020 output
|
|
|
|
|
- [ ] Smoke test passes without warnings
|
|
|
|
|
- [ ] No false positives on large .hako files
|
|
|
|
|
- [ ] Documentation updated (bridge complete)
|
|
|
|
|
|
|
|
|
|
### Performance Targets
|
|
|
|
|
|
|
|
|
|
- CFG extraction: <5ms per function
|
|
|
|
|
- HC020 analysis: <10ms per file
|
|
|
|
|
- Total overhead: <5% of hako_check runtime
|
|
|
|
|
|
|
|
|
|
### Quality Gates
|
|
|
|
|
|
|
|
|
|
- Zero false positives on test suite
|
|
|
|
|
- HC020 output includes function and block ID
|
|
|
|
|
- Error messages are actionable
|
|
|
|
|
- No crashes on malformed input
|
|
|
|
|
|
|
|
|
|
## Conclusion
|
|
|
|
|
|
|
|
|
|
Phase 154 **successfully delivered the infrastructure** for block-level dead code detection. The core components are complete, tested, and documented.
|
|
|
|
|
|
|
|
|
|
**Key Success Factors:**
|
|
|
|
|
- Clear architecture from Phase 153 pattern
|
|
|
|
|
- Thorough investigation before implementation
|
|
|
|
|
- Graceful degradation during MVP
|
|
|
|
|
- Comprehensive documentation for handoff
|
|
|
|
|
|
|
|
|
|
**Next Phase (155) is Straightforward:**
|
|
|
|
|
- 2-3 hours of Rust-side plumbing
|
|
|
|
|
- Well-defined task with clear deliverables
|
|
|
|
|
- High confidence in success
|
|
|
|
|
|
|
|
|
|
**Recommendation:** Proceed with Phase 155 implementation immediately. The foundation is solid, and the remaining work is mechanical.
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
**Feedback Author:** Claude (Anthropic)
|
|
|
|
|
**Date:** 2025-12-04
|
|
|
|
|
**Phase:** 154 (Complete)
|
|
|
|
|
**Next Phase:** 155 (CFG Data Bridge)
|
2025-12-11 02:35:31 +09:00
|
|
|
Status: Historical
|