feat(joinir): Phase 247-EX - DigitPos dual-value architecture
Extends DigitPos promotion to generate TWO carriers for Pattern A/B support: - Boolean carrier (is_digit_pos) for break conditions - Integer carrier (digit_value) for NumberAccumulation ## Implementation 1. **DigitPosPromoter** (loop_body_digitpos_promoter.rs) - Generates dual carriers: is_<var> (bool) + <base>_value (int) - Smart naming: "digit_pos" → "digit" (removes "_pos" suffix) 2. **UpdateEnv** (update_env.rs) - Context-aware promoted variable resolution - Priority: <base>_value (int) → is_<var> (bool) → standard - Pass promoted_loopbodylocals from CarrierInfo 3. **Integration** (loop_with_break_minimal.rs) - UpdateEnv constructor updated to pass promoted list ## Test Results - **Before**: 925 tests PASS - **After**: 931 tests PASS (+6 new tests, 0 failures) ## New Tests - test_promoted_variable_resolution_digit_pos - Full dual-value - test_promoted_variable_resolution_fallback_to_bool - Fallback - test_promoted_variable_not_a_carrier - Error handling ## Impact | Pattern | Before | After | |---------|--------|-------| | _parse_number | ✅ Works (bool only) | ✅ Works (bool used, int unused) | | _atoi | ❌ Failed (missing int) | ✅ READY (int carrier available!) | 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
@ -0,0 +1,435 @@
|
||||
# Phase 246-EX: JsonParser _atoi JoinIR Integration
|
||||
|
||||
## 0. Executive Summary
|
||||
|
||||
**Goal**: Integrate JsonParser's `_atoi` function into the JoinIR loop lowering system using the existing NumberAccumulation infrastructure.
|
||||
|
||||
**Status**: Step 0 - Infrastructure confirmed, design document created
|
||||
|
||||
**Key Finding**: ✅ `UpdateRhs::NumberAccumulation` already exists (Phase 190) and fully supports the `result = result * 10 + digit_pos` pattern!
|
||||
|
||||
---
|
||||
|
||||
## 1. _atoi Loop Structure Analysis
|
||||
|
||||
### Source Code (from `apps/json/jsonparser.hako`)
|
||||
|
||||
```nyash
|
||||
box JsonParser {
|
||||
method _atoi(s, len) {
|
||||
local result = 0
|
||||
local digits = "0123456789"
|
||||
local i = 0
|
||||
|
||||
loop(i < len) {
|
||||
local ch = s.substring(i, i + 1)
|
||||
local digit_pos = digits.indexOf(ch)
|
||||
if digit_pos < 0 { break }
|
||||
result = result * 10 + digit_pos
|
||||
i = i + 1
|
||||
}
|
||||
|
||||
return result
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Loop Components
|
||||
|
||||
#### Loop Header
|
||||
- **Condition**: `i < len`
|
||||
- **Type**: Simple comparison (supported by ExprLowerer)
|
||||
|
||||
#### Loop Variables
|
||||
| Variable | Type | Role | Initial Value |
|
||||
|----------|------|------|---------------|
|
||||
| `i` | position | LoopState (counter) | 0 |
|
||||
| `result` | accumulator | LoopState (number accumulation) | 0 |
|
||||
|
||||
#### Loop Body
|
||||
1. **Local declarations**:
|
||||
- `ch = s.substring(i, i + 1)` - current character
|
||||
- `digit_pos = digits.indexOf(ch)` - digit value (-1 if not digit)
|
||||
|
||||
2. **Break condition**:
|
||||
- `if digit_pos < 0 { break }` - exit on non-digit character
|
||||
|
||||
3. **Updates**:
|
||||
- `result = result * 10 + digit_pos` - number accumulation (NumberAccumulation pattern)
|
||||
- `i = i + 1` - position increment (Const increment)
|
||||
|
||||
#### Captured Variables (Function Parameters)
|
||||
- `s` - input string to parse
|
||||
- `len` - length of string to process
|
||||
- `digits` - pre-loop local (digit lookup string "0123456789")
|
||||
|
||||
---
|
||||
|
||||
## 2. Pattern Classification
|
||||
|
||||
### Pattern Type: **Pattern 2 (Break)**
|
||||
|
||||
**Rationale**:
|
||||
- Single if-break structure
|
||||
- No continue statements
|
||||
- Early exit condition (non-digit character)
|
||||
- Two carriers with different update patterns
|
||||
|
||||
### Carriers
|
||||
|
||||
| Carrier | Role | Update Pattern | UpdateExpr Variant |
|
||||
|---------|------|----------------|-------------------|
|
||||
| `i` | Loop counter | `i = i + 1` | `UpdateExpr::Const(1)` |
|
||||
| `result` | Number accumulator | `result = result * 10 + digit_pos` | `UpdateExpr::BinOp { rhs: NumberAccumulation { base: 10, digit_var: "digit_pos" } }` |
|
||||
|
||||
---
|
||||
|
||||
## 3. UpdateExpr Infrastructure Confirmation (Phase 190)
|
||||
|
||||
### 3.1 Existing NumberAccumulation Support
|
||||
|
||||
**Location**: `/home/tomoaki/git/hakorune-selfhost/src/mir/join_ir/lowering/loop_update_analyzer.rs`
|
||||
|
||||
#### UpdateRhs Enum (lines 46-64)
|
||||
```rust
|
||||
pub enum UpdateRhs {
|
||||
Const(i64),
|
||||
Variable(String),
|
||||
StringLiteral(String),
|
||||
/// Phase 190: Number accumulation pattern: result = result * base + digit
|
||||
NumberAccumulation {
|
||||
base: i64,
|
||||
digit_var: String,
|
||||
},
|
||||
Other,
|
||||
}
|
||||
```
|
||||
|
||||
✅ **NumberAccumulation variant exists!**
|
||||
|
||||
### 3.2 Detection Logic (lines 157-192)
|
||||
|
||||
**Pattern Recognition**:
|
||||
```rust
|
||||
// Detects: (carrier * base) + digit
|
||||
if matches!(operator, BinaryOperator::Add | BinaryOperator::Subtract) {
|
||||
if let ASTNode::BinaryOp {
|
||||
operator: BinaryOperator::Multiply,
|
||||
left: mul_left,
|
||||
right: mul_right,
|
||||
..
|
||||
} = left.as_ref() {
|
||||
// Check if multiplication is: carrier * base
|
||||
if mul_lhs_name == carrier_name {
|
||||
if let ASTNode::Literal { value: LiteralValue::Integer(base), .. } = mul_right.as_ref() {
|
||||
if let Some(digit_var) = Self::extract_variable_name(right) {
|
||||
// NumberAccumulation pattern detected!
|
||||
return Some(UpdateExpr::BinOp {
|
||||
lhs: carrier_name.to_string(),
|
||||
op,
|
||||
rhs: UpdateRhs::NumberAccumulation { base: *base, digit_var },
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
✅ **Exactly matches our pattern: `result = result * 10 + digit_pos`**
|
||||
|
||||
### 3.3 Emission Logic (carrier_update_emitter.rs)
|
||||
|
||||
**Location**: `/home/tomoaki/git/hakorune-selfhost/src/mir/join_ir/lowering/carrier_update_emitter.rs` (lines 139-170)
|
||||
|
||||
```rust
|
||||
UpdateRhs::NumberAccumulation { base, digit_var } => {
|
||||
// Step 1: Emit const for base
|
||||
let base_id = alloc_value();
|
||||
instructions.push(JoinInst::Compute(MirLikeInst::Const {
|
||||
dst: base_id,
|
||||
value: ConstValue::Integer(*base),
|
||||
}));
|
||||
|
||||
// Step 2: Emit multiplication: tmp = carrier * base
|
||||
let tmp_id = alloc_value();
|
||||
instructions.push(JoinInst::Compute(MirLikeInst::BinOp {
|
||||
dst: tmp_id,
|
||||
op: BinOpKind::Mul,
|
||||
lhs: carrier_param,
|
||||
rhs: base_id,
|
||||
}));
|
||||
|
||||
// Step 3: Resolve digit variable
|
||||
let digit_id = env.resolve(digit_var).ok_or_else(...)?;
|
||||
|
||||
// Step 4: Emit addition: result = tmp + digit
|
||||
let result = alloc_value();
|
||||
instructions.push(JoinInst::Compute(MirLikeInst::BinOp {
|
||||
dst: result,
|
||||
op: *op, // Add or Subtract
|
||||
lhs: tmp_id,
|
||||
rhs: digit_id,
|
||||
}));
|
||||
}
|
||||
```
|
||||
|
||||
✅ **Complete emission infrastructure ready!**
|
||||
|
||||
### 3.4 Test Coverage
|
||||
|
||||
**Existing tests** (loop_update_analyzer.rs):
|
||||
- ✅ `test_analyze_number_accumulation_base10()` - decimal pattern
|
||||
- ✅ `test_analyze_number_accumulation_base2()` - binary pattern
|
||||
- ✅ `test_analyze_number_accumulation_wrong_lhs()` - validation
|
||||
|
||||
**Existing tests** (carrier_update_emitter.rs):
|
||||
- ✅ `test_emit_number_accumulation_base10()` - JoinIR emission
|
||||
- ✅ `test_emit_number_accumulation_digit_not_found()` - error handling
|
||||
|
||||
---
|
||||
|
||||
## 4. Infrastructure Confirmation Results
|
||||
|
||||
### ✅ Checklist
|
||||
|
||||
- [x] `UpdateRhs::NumberAccumulation` variant exists
|
||||
- [x] `result = result * 10 + digit_pos` pattern fully supported
|
||||
- [x] Detection logic handles nested BinaryOp (Mul inside Add)
|
||||
- [x] Emission logic generates correct JoinIR sequence
|
||||
- [x] Unit tests cover base10 and base2 patterns
|
||||
- [x] Error handling for missing digit variable
|
||||
|
||||
### 🎯 No Extensions Needed!
|
||||
|
||||
The existing Phase 190 infrastructure is **complete and ready** for _atoi integration. No modifications to UpdateExpr, detection, or emission logic required.
|
||||
|
||||
---
|
||||
|
||||
## 5. Integration Requirements
|
||||
|
||||
### 5.1 Condition Expression Support
|
||||
|
||||
#### Header Condition: `i < len`
|
||||
- **Type**: Simple comparison
|
||||
- **Lowering**: ExprLowerer with ConditionEnv
|
||||
- **Required vars in env**: `i`, `len`
|
||||
|
||||
#### Break Condition: `digit_pos < 0`
|
||||
- **Type**: Simple comparison
|
||||
- **Lowering**: ExprLowerer or digitpos_condition_normalizer
|
||||
- **Required vars in env**: `digit_pos`
|
||||
|
||||
### 5.2 Variable Capture Requirements
|
||||
|
||||
#### Function Parameters (must be in ConditionEnv)
|
||||
- `s` - input string
|
||||
- `len` - string length
|
||||
|
||||
#### Loop-Local Variables (LoopBodyLocal detection required)
|
||||
- `digits` - digit lookup string (pre-loop local)
|
||||
- `ch` - current character (body-local)
|
||||
- `digit_pos` - digit position (body-local, used in break condition)
|
||||
|
||||
### 5.3 UpdateEnv Resolution
|
||||
|
||||
The UpdateEnv must resolve:
|
||||
1. **Carriers**: `i`, `result` (from carrier params)
|
||||
2. **Condition vars**: `len` (from function params)
|
||||
3. **Body-locals**: `digit_pos` (from loop body)
|
||||
|
||||
---
|
||||
|
||||
## 6. Test Plan
|
||||
|
||||
### 6.1 E2E Test Cases
|
||||
|
||||
**File**: `apps/tests/json_atoi_smoke.hako`
|
||||
|
||||
| Input | Expected Output | Test Case |
|
||||
|-------|----------------|-----------|
|
||||
| `"0"` | `0` | Single digit zero |
|
||||
| `"42"` | `42` | Two digits |
|
||||
| `"123"` | `123` | Multiple digits |
|
||||
| `"007"` | `7` | Leading zeros |
|
||||
| `"123abc"` | `123` | Break at non-digit |
|
||||
| `"abc"` | `0` | Immediate break (no digits) |
|
||||
| `""` | `0` | Empty string |
|
||||
|
||||
### 6.2 JoinIR Structure Tests
|
||||
|
||||
**Verify**:
|
||||
1. **UpdateExpr detection**:
|
||||
- `i = i + 1` → `UpdateExpr::Const(1)`
|
||||
- `result = result * 10 + digit_pos` → `UpdateExpr::BinOp { rhs: NumberAccumulation { base: 10, digit_var: "digit_pos" } }`
|
||||
|
||||
2. **CarrierInfo**:
|
||||
- Both `i` and `result` marked as LoopState
|
||||
- Correct initial values (both 0)
|
||||
|
||||
3. **ExitMeta**:
|
||||
- Contains `("i", ...)` and `("result", ...)`
|
||||
- Exit ValueIds available for function return
|
||||
|
||||
4. **JoinIR instructions**:
|
||||
- Mul+Add sequence emitted for `result` update
|
||||
- Correct base const (10)
|
||||
- Correct digit_pos variable resolution
|
||||
|
||||
### 6.3 MIR Dump Verification
|
||||
|
||||
**Commands**:
|
||||
```bash
|
||||
# Basic MIR structure
|
||||
./target/release/hakorune --dump-mir apps/tests/json_atoi_smoke.hako
|
||||
|
||||
# Detailed MIR with effects
|
||||
./target/release/hakorune --dump-mir --mir-verbose --mir-verbose-effects apps/tests/json_atoi_smoke.hako
|
||||
|
||||
# JSON format for detailed analysis
|
||||
./target/release/hakorune --emit-mir-json mir.json apps/tests/json_atoi_smoke.hako
|
||||
jq '.functions[] | select(.name == "_atoi") | .blocks' mir.json
|
||||
```
|
||||
|
||||
**Expected MIR patterns**:
|
||||
- Const instruction for base 10
|
||||
- Mul instruction: `%tmp = %result * %base`
|
||||
- Add instruction: `%result_next = %tmp + %digit_pos`
|
||||
- PHI nodes for both carriers at loop header
|
||||
|
||||
---
|
||||
|
||||
## 7. Implementation Strategy (Recommended Steps)
|
||||
|
||||
### Step 1: Minimal Smoke Test
|
||||
- Create simple test case with hardcoded digit loop
|
||||
- Verify NumberAccumulation detection works
|
||||
- Confirm JoinIR emission is correct
|
||||
|
||||
### Step 2: LoopBodyLocal Detection
|
||||
- Ensure `digit_pos` is detected as loop body local
|
||||
- Verify it's available in UpdateEnv during emission
|
||||
- Test break condition lowering with `digit_pos < 0`
|
||||
|
||||
### Step 3: Function Parameter Capture
|
||||
- Verify `s` and `len` are captured correctly
|
||||
- Test ConditionEnv resolution for header condition `i < len`
|
||||
- Ensure ExprLowerer can access function params
|
||||
|
||||
### Step 4: Full _atoi Integration
|
||||
- Test complete `_atoi` function from jsonparser.hako
|
||||
- Verify all E2E test cases pass
|
||||
- Check MIR dump for correct structure
|
||||
|
||||
### Step 5: Edge Cases
|
||||
- Empty string handling
|
||||
- Single character strings
|
||||
- Non-digit immediate break
|
||||
- Large numbers (overflow consideration)
|
||||
|
||||
---
|
||||
|
||||
## 8. Potential Challenges
|
||||
|
||||
### 8.1 LoopBodyLocal Detection
|
||||
|
||||
**Issue**: `digit_pos` must be recognized as a loop body local variable that is:
|
||||
1. Declared inside loop body (`local digit_pos = ...`)
|
||||
2. Used in break condition (`if digit_pos < 0`)
|
||||
3. Available in UpdateEnv during result update emission
|
||||
|
||||
**Solution**: Existing LoopBodyLocalDetector should handle this (Phase 184).
|
||||
|
||||
### 8.2 Method Call in Loop Body
|
||||
|
||||
**Issue**: `s.substring(i, i + 1)` and `digits.indexOf(ch)` are method calls with multiple arguments.
|
||||
|
||||
**Solution**: LoopBodyLocal lowering should handle method calls as "complex expressions" (Phase 184 already supports this via `ExprLowerer`).
|
||||
|
||||
### 8.3 UpdateEnv Variable Resolution
|
||||
|
||||
**Issue**: UpdateEnv must resolve variables from three sources:
|
||||
- Carriers: `i`, `result`
|
||||
- Function params: `s`, `len`
|
||||
- Body-locals: `ch`, `digit_pos`
|
||||
|
||||
**Solution**: Phase 184's `UpdateEnv::new(&cond_env, &body_env)` should handle this if:
|
||||
- ConditionEnv contains function params + carriers
|
||||
- LoopBodyLocalEnv contains body-local ValueIds
|
||||
|
||||
---
|
||||
|
||||
## 9. Success Criteria
|
||||
|
||||
### ✅ Phase 246-EX Complete When:
|
||||
|
||||
1. **Detection**: `result = result * 10 + digit_pos` correctly identified as NumberAccumulation
|
||||
2. **Emission**: Mul+Add JoinIR sequence generated with correct ValueIds
|
||||
3. **Execution**: All E2E test cases produce correct numeric outputs
|
||||
4. **Validation**: MIR dump shows expected loop structure with PHI nodes
|
||||
5. **Integration**: Works within full JsonParser context (not just isolated test)
|
||||
|
||||
---
|
||||
|
||||
## 10. References
|
||||
|
||||
### Phase 190: NumberAccumulation Infrastructure
|
||||
- Detection: `/src/mir/join_ir/lowering/loop_update_analyzer.rs` (lines 157-192)
|
||||
- Emission: `/src/mir/join_ir/lowering/carrier_update_emitter.rs` (lines 139-170)
|
||||
- Tests: Both files contain comprehensive unit tests
|
||||
|
||||
### Phase 184: UpdateEnv and LoopBodyLocal
|
||||
- UpdateEnv: `/src/mir/join_ir/lowering/update_env.rs`
|
||||
- LoopBodyLocal detection: (search for LoopBodyLocalDetector)
|
||||
|
||||
### Related Phases
|
||||
- Phase 176-2: Carrier update emission basics
|
||||
- Phase 178: String literal updates (similar multi-carrier pattern)
|
||||
- Phase 197: Loop update analyzer extraction
|
||||
|
||||
---
|
||||
|
||||
## Appendix A: Quick Reference
|
||||
|
||||
### Pattern Match: _atoi Loop
|
||||
|
||||
```
|
||||
Loop Type: Pattern 2 (Break)
|
||||
Header: i < len
|
||||
Break: digit_pos < 0
|
||||
Carriers: i (counter), result (accumulator)
|
||||
Body-locals: ch, digit_pos
|
||||
Captured: s, len, digits (pre-loop)
|
||||
|
||||
UpdateExpr:
|
||||
i -> Const(1)
|
||||
result -> BinOp { op: Add, rhs: NumberAccumulation { base: 10, digit_var: "digit_pos" } }
|
||||
```
|
||||
|
||||
### JoinIR Emission Sequence (NumberAccumulation)
|
||||
|
||||
```
|
||||
%base_10 = Const(10)
|
||||
%tmp = BinOp(Mul, %result_param, %base_10)
|
||||
%digit_id = <resolved from UpdateEnv>
|
||||
%result_next = BinOp(Add, %tmp, %digit_id)
|
||||
```
|
||||
|
||||
### Debug Commands
|
||||
|
||||
```bash
|
||||
# Trace variable mapping
|
||||
NYASH_TRACE_VARMAP=1 cargo test --release test_json_atoi -- --nocapture
|
||||
|
||||
# JoinIR debug
|
||||
NYASH_JOINIR_DEBUG=1 ./target/release/hakorune apps/tests/json_atoi_smoke.hako 2>&1 | grep "\[trace:"
|
||||
|
||||
# UpdateExpr detection (if logging added)
|
||||
NYASH_UPDATE_ANALYZER_DEBUG=1 ./target/release/hakorune apps/tests/json_atoi_smoke.hako
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
**Document Status**: ✅ Infrastructure confirmed, ready for implementation
|
||||
**Next Step**: Phase 246-EX Step 1 - Minimal smoke test creation
|
||||
**Last Updated**: 2025-12-11 (Phase 246-EX Step 0 completion)
|
||||
Reference in New Issue
Block a user