Implements the Trim pattern detection logic for carrier promotion:
- find_definition_in_body(): Iterative AST traversal to locate variable definitions
- is_substring_method_call(): Detects substring() method calls
- extract_equality_literals(): Extracts string literals from OR chains (ch == " " || ch == "\t")
- TrimPatternInfo: Captures detected pattern details for carrier promotion
This enables Pattern 5 to detect trim-style loops:
```hako
loop(start < end) {
    local ch = s.substring(start, start+1)
    if ch == " " || ch == "\t" || ch == "\n" || ch == "\r" {
        start = start + 1
    } else {
        break
    }
}
```
Unit tests cover:
- Simple and nested definition detection
- substring method call detection
- Single and chained equality literal extraction
- Full Trim pattern detection with 2-4 whitespace characters
Next: Phase 171-C-3 integration with Pattern 2/4 routing
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
# Phase 171-A: Blocked Loop Inventory

**Date**: 2025-12-07
**Status**: Initial inventory complete
**Purpose**: Identify loops blocked by LoopBodyLocal variables in break conditions

---

## Overview

This document catalogs loops that cannot be lowered by Pattern 2/4 because they use **LoopBodyLocal** variables in their break conditions. These are candidates for Pattern 5 carrier promotion.

---

## Blocked Loops Found

### 1. TrimTest.trim/1 - Leading Whitespace Trim

**File**: `local_tests/test_trim_main_pattern.hako`
**Lines**: 20-27

```hako
loop(start < end) {
    local ch = s.substring(start, start+1)
    if ch == " " || ch == "\t" || ch == "\n" || ch == "\r" {
        start = start + 1
    } else {
        break
    }
}
```

**Blocking Variable**: `ch` (LoopBodyLocal)

**Analysis**:
- Loop parameter: `start` (LoopParam)
- Outer variable: `end` (OuterLocal)
- Break condition uses: `ch` (LoopBodyLocal)
- `ch` is defined inside the loop body as `s.substring(start, start+1)`

**Error Message**:
```
[trace:debug] pattern2: Pattern 2 lowerer failed: Variable 'ch' not bound in ConditionEnv
```

**Why Blocked**:
Pattern 2 expects break conditions to use only:
- LoopParam (`start`)
- OuterLocal (`end`, `s`)

But the break condition `ch == " " || ...` uses `ch`, which is defined inside the loop body.
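
The scope check behind this failure can be sketched as follows. This is a minimal illustration, not the compiler's actual code: `VarScope`, `classify`, and `blocking_vars` are hypothetical names chosen to mirror the terms used above, and the real analysis presumably works on AST nodes rather than plain name lists.

```rust
use std::collections::HashSet;

/// Scope categories as named in this document (hypothetical type, not the real one).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VarScope {
    LoopParam,
    OuterLocal,
    LoopBodyLocal,
}

/// Classify one variable, given the loop parameter and the names defined before the loop.
fn classify(name: &str, loop_param: &str, outer_locals: &HashSet<&str>) -> VarScope {
    if name == loop_param {
        VarScope::LoopParam
    } else if outer_locals.contains(name) {
        VarScope::OuterLocal
    } else {
        // Anything else must have been introduced inside the loop body.
        VarScope::LoopBodyLocal
    }
}

/// Return the break-condition variables that Pattern 2 cannot bind, in input order.
fn blocking_vars<'a>(
    cond_vars: &[&'a str],
    loop_param: &str,
    outer_locals: &HashSet<&str>,
) -> Vec<&'a str> {
    cond_vars
        .iter()
        .copied()
        .filter(|v| classify(v, loop_param, outer_locals) == VarScope::LoopBodyLocal)
        .collect()
}
```

For Loop 1, calling `blocking_vars(&["ch"], "start", &outer)` with `outer = {"end", "s"}` reports `ch` as the only blocking variable, which matches the `not bound in ConditionEnv` error.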

---

### 2. TrimTest.trim/1 - Trailing Whitespace Trim

**File**: `local_tests/test_trim_main_pattern.hako`
**Lines**: 30-37

```hako
loop(end > start) {
    local ch = s.substring(end-1, end)
    if ch == " " || ch == "\t" || ch == "\n" || ch == "\r" {
        end = end - 1
    } else {
        break
    }
}
```

**Blocking Variable**: `ch` (LoopBodyLocal)

**Analysis**:
- Loop parameter: `end` (LoopParam)
- Outer variable: `start` (OuterLocal)
- Break condition uses: `ch` (LoopBodyLocal)
- `ch` is defined inside loop body as `s.substring(end-1, end)`

**Same blocking reason as Loop 1.**

---

### 3. JsonParserBox - MethodCall in Condition

**File**: `tools/hako_shared/json_parser.hako`

```hako
loop(i < s.length()) {
    // ...
}
```

**Blocking Issue**: `s.length()` is a MethodCall in the condition expression.

**Error Message**:
```
[ERROR] ❌ MIR compilation error: [cf_loop/pattern4] Lowering failed:
Unsupported expression in value context: MethodCall {
    object: Variable { name: "s", ... },
    method: "length",
    arguments: [],
    ...
}
```

**Why Blocked**:
Pattern 4's value context lowering doesn't support MethodCall expressions yet.

**Note**: This is not a LoopBodyLocal issue, but a MethodCall limitation. May be addressed in Phase 171-D (Optional).

---

## Pattern5-A Target Decision

### Selected Target: TrimTest Loop 1 (Leading Whitespace)

We select the **first loop** from TrimTest as the Pattern5-A target for the following reasons:

1. **Clear structure**: Simple substring + equality checks
2. **Representative**: Same pattern as many real-world parsers
3. **Self-contained**: Doesn't depend on complex outer state
4. **Testable**: Easy to write unit tests

### Pattern5-A Specification

**Loop Structure**:
```hako
loop(start < end) {
    local ch = s.substring(start, start+1)
    if ch == " " || ch == "\t" || ch == "\n" || ch == "\r" {
        start = start + 1
    } else {
        break
    }
}
```

**Variables**:
- `start`: LoopParam (carrier, mutated)
- `end`: OuterLocal (condition-only)
- `s`: OuterLocal (used in body)
- `ch`: LoopBodyLocal (blocking variable)

**Break Condition**:
```
!(ch == " " || ch == "\t" || ch == "\n" || ch == "\r")
```

---

## Promotion Strategy: Design D (Evaluated Bool Carrier)

### Rationale

We choose **Design D** (Evaluated Bool Carrier) over other options:

**Why not carry `ch` directly?**
- `ch` is a StringBox, not a primitive value
- Would require complex carrier type system
- Would break existing Pattern 2/4 assumptions

**Design D approach**:
- Introduce a new carrier: `is_whitespace` (bool)
- Evaluate `ch == " " || ...` in loop body
- Store result in `is_whitespace` carrier
- Use `is_whitespace` in break condition

### Transformed Structure

**Before (Pattern5-A)**:
```hako
loop(start < end) {
    local ch = s.substring(start, start+1)
    if ch == " " || ch == "\t" || ch == "\n" || ch == "\r" {
        start = start + 1
    } else {
        break
    }
}
```

**After (Pattern 2 compatible)**:
```hako
// Initialization (before loop)
local is_whitespace = true  // Initial assumption

loop(start < end && is_whitespace) {
    local ch = s.substring(start, start+1)
    is_whitespace = (ch == " " || ch == "\t" || ch == "\n" || ch == "\r")

    if is_whitespace {
        start = start + 1
    } else {
        break  // Now redundant, but kept for clarity
    }
}
```

**Key transformations**:
1. Add `is_whitespace` carrier initialization
2. Update loop condition to include `is_whitespace`
3. Compute `is_whitespace` in loop body
4. Original if-else becomes simpler (could be optimized away)
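
To check that the transformation preserves behavior, both loop shapes can be modeled in Rust and compared. This is only a sketch under stated assumptions: the function names are made up for illustration, and byte-index slicing stands in for `substring`, so it is valid for ASCII input only.

```rust
/// The four characters the Trim pattern treats as whitespace.
fn is_ws(ch: &str) -> bool {
    ch == " " || ch == "\t" || ch == "\n" || ch == "\r"
}

/// Original shape: the loop-body local `ch` decides the break.
fn trim_start_original(s: &str, mut start: usize, end: usize) -> usize {
    while start < end {
        let ch = &s[start..start + 1]; // ASCII-only stand-in for substring()
        if is_ws(ch) {
            start += 1;
        } else {
            break;
        }
    }
    start
}

/// Promoted shape: the break decision lives in the bool carrier `is_whitespace`.
fn trim_start_promoted(s: &str, mut start: usize, end: usize) -> usize {
    let mut is_whitespace = true; // initial assumption, as in the transformed loop
    while start < end && is_whitespace {
        let ch = &s[start..start + 1];
        is_whitespace = is_ws(ch);
        if is_whitespace {
            start += 1;
        }
        // No explicit break needed: the loop condition observes the carrier.
    }
    start
}
```

Both functions return the same index for any ASCII string, including the empty string and an all-whitespace string, which is the property the promotion relies on.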

---

## Next Steps (Phase 171-C)

### Phase 171-C-1: Skeleton Implementation ✅
- Create `LoopBodyCarrierPromoter` box
- Define `PromotionRequest` / `PromotionResult` types
- Implement skeleton `try_promote()` method
- Add `find_definition_in_body()` helper
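
A helper like `find_definition_in_body()` could look roughly like the sketch below. The `Stmt` type is a toy AST invented for this example (the real node type is not shown in this document); the point is the explicit work stack, which walks nested branches without recursion.

```rust
/// Toy statement AST; an assumption for illustration, not the real node type.
#[derive(Debug, PartialEq)]
enum Stmt {
    /// `local name = init`
    Local { name: String, init: String },
    /// `if ... { then_body } else { else_body }`
    If { then_body: Vec<Stmt>, else_body: Vec<Stmt> },
    /// Any other statement.
    Other,
}

/// Iteratively search `body`, including nested if-branches, for the first
/// `local` definition of `var` in source order.
fn find_definition_in_body<'a>(body: &'a [Stmt], var: &str) -> Option<&'a Stmt> {
    // Push in reverse so that statements pop in source order.
    let mut stack: Vec<&Stmt> = body.iter().rev().collect();
    while let Some(stmt) = stack.pop() {
        match stmt {
            Stmt::Local { name, .. } if name == var => return Some(stmt),
            Stmt::If { then_body, else_body } => {
                // Else-branch goes on first so the then-branch is visited first.
                stack.extend(else_body.iter().rev());
                stack.extend(then_body.iter().rev());
            }
            _ => {}
        }
    }
    None
}
```

Returning a reference to the defining statement lets the caller inspect the initializer afterwards, for example to test whether it is a `substring()` call.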

### Phase 171-C-2: Trim Pattern Promotion Logic
- Detect substring + equality pattern
- Generate `is_whitespace` carrier
- Generate initialization statement
- Generate update statement
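
The equality half of the detection step can be sketched as a walk over the OR chain. The `Expr` type and the `eq`/`or` constructors are hypothetical and exist only for this example; the real `extract_equality_literals()` may have a different signature.

```rust
/// Toy condition AST; an assumption for illustration only.
enum Expr {
    Var(String),
    Str(String),
    Eq(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

/// Collect the string literals compared against `var` in a chain such as
/// `var == "a" || var == "b"`. Returns None if any leaf has another shape.
fn extract_equality_literals(cond: &Expr, var: &str) -> Option<Vec<String>> {
    match cond {
        Expr::Or(lhs, rhs) => {
            let mut lits = extract_equality_literals(lhs, var)?;
            lits.extend(extract_equality_literals(rhs, var)?);
            Some(lits)
        }
        Expr::Eq(lhs, rhs) => match (lhs.as_ref(), rhs.as_ref()) {
            (Expr::Var(name), Expr::Str(lit)) if name == var => Some(vec![lit.clone()]),
            _ => None,
        },
        _ => None,
    }
}

/// Convenience constructor for `var == "lit"`.
fn eq(var: &str, lit: &str) -> Expr {
    Expr::Eq(
        Box::new(Expr::Var(var.to_string())),
        Box::new(Expr::Str(lit.to_string())),
    )
}

/// Convenience constructor for `a || b`.
fn or(a: Expr, b: Expr) -> Expr {
    Expr::Or(Box::new(a), Box::new(b))
}
```

Rejecting the whole chain when one leaf does not match keeps the promotion conservative: a condition that mixes in other tests is left for the normal routing to reject.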

### Phase 171-C-3: Integration with Pattern 2/4
- Call `LoopBodyCarrierPromoter::try_promote()` in routing
- If promotion succeeds, route to Pattern 2
- If promotion fails, return UnsupportedPattern
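
The routing decision described above might be wired up along these lines. Everything here is a placeholder: the variants of `PromotionResult`, the `Route` enum, and the boolean argument are assumptions, and `try_promote` is a stand-in for `LoopBodyCarrierPromoter::try_promote()` that skips the AST analysis entirely.

```rust
/// Hypothetical promotion outcome; the real PromotionResult is not shown in this document.
#[derive(Debug, PartialEq)]
enum PromotionResult {
    Promoted { carrier: String },
    CannotPromote { reason: String },
}

/// Hypothetical routing outcome.
#[derive(Debug, PartialEq)]
enum Route {
    Pattern2 { extra_carrier: String },
    UnsupportedPattern(String),
}

/// Stand-in for the promoter: it inspects a precomputed flag instead of an AST.
fn try_promote(is_trim_pattern: bool) -> PromotionResult {
    if is_trim_pattern {
        PromotionResult::Promoted { carrier: "is_whitespace".to_string() }
    } else {
        PromotionResult::CannotPromote {
            reason: "no substring + equality chain".to_string(),
        }
    }
}

/// Route a loop: promoted loops go to Pattern 2, everything else is rejected.
fn route_loop(is_trim_pattern: bool) -> Route {
    match try_promote(is_trim_pattern) {
        PromotionResult::Promoted { carrier } => Route::Pattern2 { extra_carrier: carrier },
        PromotionResult::CannotPromote { reason } => Route::UnsupportedPattern(reason),
    }
}
```

Carrying the failure reason through to `UnsupportedPattern` keeps the error message as specific as the existing `not bound in ConditionEnv` trace.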

### Phase 171-D: MethodCall Support (Optional)
- Handle `s.length()` in loop conditions
- May require carrier promotion for method results
- Lower priority than Trim pattern

---

## Summary

**Blocked Loops**:
- 2 loops in TrimTest (LoopBodyLocal `ch`)
- 1+ loops in JsonParser (MethodCall in condition)

**Pattern5-A Target**:
- TrimTest leading whitespace trim loop
- Clear, representative, testable

**Promotion Strategy**:
- Design D: Evaluated Bool Carrier
- Transform `ch` checks → `is_whitespace` carrier
- Make compatible with Pattern 2

**Implementation Status**:
- Phase 171-A: ✅ Inventory complete
- Phase 171-B: ✅ Target selected
- Phase 171-C-1: ✅ Skeleton implementation complete
- Phase 171-C-2: ✅ Trim pattern detection implemented
  - `find_definition_in_body()`: AST traversal for variable definitions
  - `is_substring_method_call()`: Detects `substring()` method calls
  - `extract_equality_literals()`: Extracts string literals from OR chains
  - `TrimPatternInfo`: Captures pattern details for carrier promotion
- Phase 171-C-3: ⏳ Integration with Pattern 2/4 routing