Phase 73 plans migration from name-based to BindingId-based scope management in JoinIR lowering, aligning with MIR's lexical scope model.
Design decision: Option A (Parallel BindingId Layer) with gradual migration.
Migration roadmap: Phases 74-77, ~8-12 hours total, zero production impact.
Changes:
- phase73-scope-manager-design.md: SSOT design (~700 lines)
- phase73-completion-summary.md: Deliverables summary
- phase73-index.md: Navigation index
- scope_manager_bindingid_poc/: Working PoC (437 lines, dev-only)
Tests: 6/6 PoC tests PASS, lib 950/950 PASS
Implementation: Parallel layer (no changes to existing code paths)
🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Phase 73: JoinIR ScopeManager → BindingId-Based Design
Status: Design Phase (No Production Code Changes)
Date: 2025-12-13
Purpose: SSOT document for migrating JoinIR lowering's name-based lookup to BindingId-based scope management
Executive Summary
Problem
JoinIR lowering currently uses name-based variable lookup (String → ValueId maps) while MIR builder uses BindingId-based lexical scope tracking (Phase 68-69). This mismatch creates potential bugs:
- Shadowing Confusion: Same variable name in nested scopes can reference different bindings
- Future Bug Source: As lexical scope becomes more sophisticated, name-only lookup will break
- Inconsistent Mental Model: Developers must track two different scope systems
Solution Direction
Introduce BindingId into JoinIR lowering's scope management to align with MIR's lexical scope model.
Non-Goal (Phase 73)
- ❌ No production code changes
- ❌ No breaking changes to existing APIs
- ✅ Design-only: Document current state, proposed architecture, migration path
Current State Analysis
1. MIR Builder: BindingId + LexicalScope (Phase 68-69)
Location: src/mir/builder/vars/lexical_scope.rs
Key Structures:
// Conceptual model (from ast_analyzer.rs - dev-only)
struct BindingId(u32); // Unique ID for each variable binding
struct LexicalScopeFrame {
    declared: BTreeSet<String>, // Names declared in this scope
    restore: BTreeMap<String, Option<ValueId>>, // Shadowing restoration map
}
How It Works:
- Each local x declaration creates a new binding with a unique BindingId
- LexicalScopeGuard tracks scope entry/exit via RAII
- On scope exit, shadowed bindings are restored via the restore map
- variable_map: HashMap<String, ValueId> is the SSA resolution map (name → current ValueId)
Shadowing Example:
local x = 1; // BindingId(0) → ValueId(5)
{
    local x = 2; // BindingId(1) → ValueId(10) (shadows BindingId(0))
    print(x); // Resolves to ValueId(10)
}
print(x); // Restores to ValueId(5)
Key Insight: MIR builder uses name → ValueId for SSA conversion, but BindingId for scope tracking (declared/restore).
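The RAII behavior described above can be sketched as a small standalone model. This is an illustration only: `Builder`, `Frame`, `ScopeGuard`, and `shadowing_demo` are hypothetical stand-ins, not the actual types in lexical_scope.rs, and the real LexicalScopeGuard may be structured differently.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ValueId(u32);

/// Hypothetical stand-in for LexicalScopeFrame.
#[derive(Default)]
struct Frame {
    declared: BTreeSet<String>,
    restore: BTreeMap<String, Option<ValueId>>,
}

/// Hypothetical stand-in for the relevant part of the MIR builder.
#[derive(Default)]
struct Builder {
    variable_map: HashMap<String, ValueId>,
    scopes: Vec<Frame>,
}

impl Builder {
    fn declare_local(&mut self, name: &str, value: ValueId) {
        let frame = self.scopes.last_mut().expect("declaration outside any scope");
        // Remember the shadowed value only on the first declaration in this frame.
        if frame.declared.insert(name.to_string()) {
            let previous = self.variable_map.get(name).copied();
            frame.restore.insert(name.to_string(), previous);
        }
        self.variable_map.insert(name.to_string(), value);
    }
}

/// Pushes a frame on creation and restores shadowed names when dropped.
struct ScopeGuard<'a> {
    builder: &'a mut Builder,
}

impl<'a> ScopeGuard<'a> {
    fn enter(builder: &'a mut Builder) -> Self {
        builder.scopes.push(Frame::default());
        ScopeGuard { builder }
    }
}

impl Drop for ScopeGuard<'_> {
    fn drop(&mut self) {
        let frame = self.builder.scopes.pop().expect("scope stack underflow");
        for (name, previous) in frame.restore {
            match previous {
                Some(id) => { self.builder.variable_map.insert(name, id); }
                None => { self.builder.variable_map.remove(&name); }
            }
        }
    }
}

/// Replays the shadowing example: returns (x inside the block, x after the block).
fn shadowing_demo() -> (Option<ValueId>, Option<ValueId>) {
    let mut b = Builder::default();
    b.scopes.push(Frame::default()); // function-level scope
    b.declare_local("x", ValueId(5));
    let inner = {
        let mut guard = ScopeGuard::enter(&mut b);
        guard.builder.declare_local("x", ValueId(10));
        let seen = guard.builder.variable_map.get("x").copied();
        seen
    }; // guard dropped here, so the outer x is restored
    let outer = b.variable_map.get("x").copied();
    (inner, outer)
}
```

In this model `shadowing_demo()` yields `ValueId(10)` inside the block and `ValueId(5)` after it, matching the comments in the shadowing example.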
2. JoinIR Lowering: Name-Based Lookup (Current)
Location: src/mir/join_ir/lowering/
Key Structures:
2.1 ConditionEnv (condition_env.rs)
pub struct ConditionEnv {
    name_to_join: BTreeMap<String, ValueId>, // Loop params + condition-only vars
    captured: BTreeMap<String, ValueId>, // Captured function-scoped vars
}
- Maps variable names to JoinIR-local ValueIds
- Used for loop condition lowering (i < n, p < s.length())
2.2 LoopBodyLocalEnv (loop_body_local_env.rs)
pub struct LoopBodyLocalEnv {
    locals: BTreeMap<String, ValueId>, // Body-local variables
}
- Maps body-local variable names to ValueIds
- Example: local temp = i * 2 inside loop body
2.3 CarrierInfo (carrier_info.rs)
pub struct CarrierInfo {
    loop_var_name: String,
    loop_var_id: ValueId,
    carriers: Vec<CarrierVar>,
    promoted_loopbodylocals: Vec<String>, // Phase 224: Promoted variable names
}
pub struct CarrierVar {
    name: String,
    host_id: ValueId, // HOST function's ValueId
    join_id: Option<ValueId>, // JoinIR-local ValueId
    role: CarrierRole,
    init: CarrierInit,
}
- Tracks carrier variables (loop state, condition-only)
- Uses naming convention for promoted variables:
  - DigitPos pattern: "digit_pos" → "is_digit_pos"
  - Trim pattern: "ch" → "is_ch_match"
- Relies on string matching (resolve_promoted_join_id)
2.4 ScopeManager Trait (scope_manager.rs - Phase 231)
pub trait ScopeManager {
    fn lookup(&self, name: &str) -> Option<ValueId>;
    fn scope_of(&self, name: &str) -> Option<VarScopeKind>;
}
pub struct Pattern2ScopeManager<'a> {
    condition_env: &'a ConditionEnv,
    loop_body_local_env: Option<&'a LoopBodyLocalEnv>,
    captured_env: Option<&'a CapturedEnv>,
    carrier_info: &'a CarrierInfo,
}
Lookup Order (Pattern2ScopeManager):
- ConditionEnv (loop var, carriers, condition-only)
- LoopBodyLocalEnv (body-local variables)
- CapturedEnv (function-scoped captured variables)
- Promoted LoopBodyLocal → Carrier (via naming convention)
Current Issues:
- ✅ Works for current patterns: No shadowing within JoinIR fragments
- ⚠️ Fragile: Relies on naming convention (is_digit_pos) and string matching
- ⚠️ Shadowing-Unaware: If the same name appears in multiple scopes, the first match in lookup order wins, regardless of lexical nesting
- ⚠️ Mismatch with MIR: MIR uses BindingId for shadowing, JoinIR uses name-only
3. Where Shadowing Can Go Wrong
3.1 Current Patterns (Safe for Now)
- Pattern 1-4: No shadowing within single JoinIR fragment
- Carrier promotion: Naming convention avoids conflicts (digit_pos → is_digit_pos)
- Captured vars: Function-scoped, no re-declaration
3.2 Future Risks
Scenario: Loop body that shadows an outer variable
local i = 0;
loop(i < 10) {
    local i = i * 2; // BindingId(1) shadows BindingId(0)
    print(i); // Which ValueId does ScopeManager return?
}
Current Behavior: ScopeManager::lookup("i") would return the first match in ConditionEnv, ignoring inner scope.
Expected Behavior: Should respect lexical scope like MIR builder does.
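A minimal sketch of the mismatch, assuming a simplified two-layer scope where both layers are keyed by name only. `NameScopes` and `shadowed_lookup_demo` are hypothetical names used for illustration; they are not the real Pattern2ScopeManager.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ValueId(u32);

/// Simplified stand-ins for ConditionEnv and LoopBodyLocalEnv (name-keyed only).
struct NameScopes {
    condition_env: BTreeMap<String, ValueId>,
    body_locals: BTreeMap<String, ValueId>,
}

impl NameScopes {
    /// Follows the documented lookup order: ConditionEnv first, then body locals.
    fn lookup(&self, name: &str) -> Option<ValueId> {
        self.condition_env
            .get(name)
            .or_else(|| self.body_locals.get(name))
            .copied()
    }
}

/// Registers an outer `i` and a shadowing body-local `i`, then looks up "i".
fn shadowed_lookup_demo() -> Option<ValueId> {
    let mut scopes = NameScopes {
        condition_env: BTreeMap::new(),
        body_locals: BTreeMap::new(),
    };
    scopes.condition_env.insert("i".to_string(), ValueId(100)); // outer i (loop param)
    scopes.body_locals.insert("i".to_string(), ValueId(200)); // inner local i = i * 2
    scopes.lookup("i")
}
```

Here the lookup returns `ValueId(100)`, the outer binding, even though the use site `print(i)` lexically refers to the inner one. A name alone carries too little information to pick the right entry.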
3.3 Promoted Variable Naming Brittleness
// CarrierInfo::resolve_promoted_join_id (lines 432-464)
let candidates = [
    format!("is_{}", original_name), // DigitPos pattern
    format!("is_{}_match", original_name), // Trim pattern
];
for carrier_name in &candidates {
    if let Some(carrier) = self.carriers.iter().find(|c| c.name == *carrier_name) {
        return carrier.join_id;
    }
}
- Fragile: Relies on string prefixes (is_, is_*_match)
- Not Future-Proof: New patterns require new naming conventions
- BindingId Alternative: Store original BindingId → promoted BindingId mapping
Proposed Architecture
Phase 73 Goals
- Document the BindingId-based design
- Identify minimal changes needed
- Define migration path (phased approach)
- No production code changes (design-only)
Design Option A: Parallel BindingId Layer (Recommended)
Strategy: Add BindingId alongside existing name-based lookup, gradually migrate.
A.1 Enhanced ConditionEnv
pub struct ConditionEnv {
    // Phase 73: Legacy name-based (keep for backward compatibility)
    name_to_join: BTreeMap<String, ValueId>,
    captured: BTreeMap<String, ValueId>,
    // Phase 73+: NEW - BindingId-based tracking
    binding_to_join: BTreeMap<BindingId, ValueId>, // BindingId → JoinIR ValueId
    name_to_binding: BTreeMap<String, BindingId>, // Name → current BindingId (for shadowing)
}
Benefits:
- ✅ Backward compatible (legacy code uses name_to_join)
- ✅ Gradual migration (new code uses binding_to_join)
- ✅ Shadowing-aware (name_to_binding tracks current binding)
Implementation Path:
- Add binding_to_join and name_to_binding fields (initially empty)
- Update get() to check binding_to_join first, fall back to name_to_join
- Migrate one pattern at a time (Pattern 1 → 2 → 3 → 4)
- Remove legacy fields after full migration
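The fallback step in this path could look roughly like the sketch below. It is a standalone model, not the real ConditionEnv: the `insert_legacy` and `insert_binding` helpers and the exact shape of `get()` are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct BindingId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ValueId(u32);

/// Reduced model of the enhanced ConditionEnv (captured map omitted).
#[derive(Default)]
struct ConditionEnv {
    name_to_join: BTreeMap<String, ValueId>,
    binding_to_join: BTreeMap<BindingId, ValueId>,
    name_to_binding: BTreeMap<String, BindingId>,
}

impl ConditionEnv {
    /// Hypothetical helper: registers a variable through the legacy path only.
    fn insert_legacy(&mut self, name: &str, value: ValueId) {
        self.name_to_join.insert(name.to_string(), value);
    }

    /// Hypothetical helper: registers a variable through the BindingId path.
    fn insert_binding(&mut self, name: &str, binding: BindingId, value: ValueId) {
        self.name_to_binding.insert(name.to_string(), binding);
        self.binding_to_join.insert(binding, value);
    }

    /// Prefers the BindingId path and falls back to the legacy name map.
    fn get(&self, name: &str) -> Option<ValueId> {
        self.name_to_binding
            .get(name)
            .and_then(|binding| self.binding_to_join.get(binding))
            .or_else(|| self.name_to_join.get(name))
            .copied()
    }
}

/// Looks up a migrated name, a legacy-only name, and an unknown name.
fn dual_lookup_demo() -> (Option<ValueId>, Option<ValueId>, Option<ValueId>) {
    let mut env = ConditionEnv::default();
    env.insert_legacy("n", ValueId(1)); // not yet migrated
    env.insert_legacy("i", ValueId(2));
    env.insert_binding("i", BindingId(7), ValueId(3)); // migrated, takes precedence
    (env.get("i"), env.get("n"), env.get("missing"))
}
```

With both maps populated, `i` resolves through its BindingId while `n` still resolves through the legacy map, which is what lets patterns migrate one at a time.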
A.2 Enhanced CarrierInfo
pub struct CarrierVar {
    name: String,
    host_id: ValueId,
    join_id: Option<ValueId>,
    role: CarrierRole,
    init: CarrierInit,
    // Phase 73+: NEW
    host_binding: Option<BindingId>, // HOST function's BindingId
}
pub struct CarrierInfo {
    loop_var_name: String,
    loop_var_id: ValueId,
    carriers: Vec<CarrierVar>,
    trim_helper: Option<TrimLoopHelper>,
    // Phase 73+: Replace string list with BindingId map
    promoted_bindings: BTreeMap<BindingId, BindingId>, // Original → Promoted
}
Benefits:
- ✅ No more naming convention hacks (is_digit_pos, is_ch_match)
- ✅ Direct BindingId → BindingId mapping for promoted variables
- ✅ Type-safe promotion tracking
Migration:
// Phase 73+: Promoted variable resolution
fn resolve_promoted_binding(&self, original: BindingId) -> Option<BindingId> {
    self.promoted_bindings.get(&original).copied()
}
// Legacy fallback (Phase 73 transition only)
fn resolve_promoted_join_id(&self, name: &str) -> Option<ValueId> {
    // OLD: String matching
    // NEW: BindingId lookup
    // Placeholder so the sketch type-checks; the body is defined in Phase 76.
    todo!("Phase 76: replace string matching with BindingId lookup")
}
A.3 Enhanced ScopeManager
pub trait ScopeManager {
    // Phase 73+: NEW - BindingId-based lookup
    fn lookup_binding(&self, binding: BindingId) -> Option<ValueId>;
    // Legacy (keep for backward compatibility)
    fn lookup(&self, name: &str) -> Option<ValueId>;
    fn scope_of(&self, name: &str) -> Option<VarScopeKind>;
}
pub struct Pattern2ScopeManager<'a> {
    condition_env: &'a ConditionEnv,
    loop_body_local_env: Option<&'a LoopBodyLocalEnv>,
    captured_env: Option<&'a CapturedEnv>,
    carrier_info: &'a CarrierInfo,
    // Phase 73+: NEW - BindingId context from HOST
    host_bindings: Option<&'a BTreeMap<String, BindingId>>,
}
impl<'a> ScopeManager for Pattern2ScopeManager<'a> {
    fn lookup_binding(&self, binding: BindingId) -> Option<ValueId> {
        // 1. Check condition_env.binding_to_join
        if let Some(id) = self.condition_env.binding_to_join.get(&binding) {
            return Some(*id);
        }
        // 2. Check promoted bindings
        if let Some(promoted) = self.carrier_info.resolve_promoted_binding(binding) {
            return self.condition_env.binding_to_join.get(&promoted).copied();
        }
        // 3. Fallback to legacy name-based lookup (transition only)
        None
    }
}
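The two-step resolution in lookup_binding can be exercised in isolation with a reduced model. `Scope` and `promoted_lookup_demo` are hypothetical; the BindingId and ValueId numbers are arbitrary and only mirror the style of the examples in this document.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct BindingId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ValueId(u32);

/// Reduced model: only the two maps that lookup_binding consults.
struct Scope {
    binding_to_join: BTreeMap<BindingId, ValueId>,
    promoted_bindings: BTreeMap<BindingId, BindingId>, // Original → Promoted
}

impl Scope {
    fn lookup_binding(&self, binding: BindingId) -> Option<ValueId> {
        // 1. Direct hit in binding_to_join.
        if let Some(id) = self.binding_to_join.get(&binding) {
            return Some(*id);
        }
        // 2. Follow the promotion edge, then resolve the promoted binding.
        let promoted = self.promoted_bindings.get(&binding)?;
        self.binding_to_join.get(promoted).copied()
    }
}

/// Resolves a carrier binding, a promoted body-local binding, and an unknown one.
fn promoted_lookup_demo() -> (Option<ValueId>, Option<ValueId>, Option<ValueId>) {
    // Assumed ids: BindingId(5) is the body-local, BindingId(10) its promoted carrier.
    let scope = Scope {
        binding_to_join: BTreeMap::from([(BindingId(10), ValueId(42))]),
        promoted_bindings: BTreeMap::from([(BindingId(5), BindingId(10))]),
    };
    (
        scope.lookup_binding(BindingId(10)),
        scope.lookup_binding(BindingId(5)),
        scope.lookup_binding(BindingId(99)),
    )
}
```

Both the carrier and the original body-local resolve to the same ValueId without any string comparison, and an unknown binding yields `None` rather than an accidental name match.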
Design Option B: Full BindingId Replacement (Not Recommended for Phase 73)
Strategy: Replace all name-based maps with BindingId-based maps in one go.
Why Not Recommended:
- ❌ High risk (breaks existing code)
- ❌ Requires simultaneous changes to MIR builder, JoinIR lowering, all patterns
- ❌ Hard to rollback if issues arise
- ❌ Violates Phase 73 constraint (design-only)
When to Use: Phase 80+ (after Option A migration complete)
Integration with MIR Builder
Challenge: BindingId Source of Truth
Question: Where do BindingIds come from in JoinIR lowering?
Answer: MIR builder's variable_map + LexicalScopeFrame
Current Flow (Phase 73)
- MIR builder maintains variable_map: HashMap<String, ValueId>
- JoinIR lowering receives variable_map and creates ConditionEnv
- ConditionEnv uses names as keys (no BindingId tracking)
Proposed Flow (Phase 73+)
- MIR builder maintains:
  - variable_map: HashMap<String, ValueId> (SSA conversion)
  - binding_map: HashMap<String, BindingId> (NEW - lexical scope tracking)
- JoinIR lowering receives both maps
- ConditionEnv builds:
  - name_to_join: BTreeMap<String, ValueId> (legacy)
  - binding_to_join: BTreeMap<BindingId, ValueId> (NEW - from binding_map)
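One way the proposed flow might derive both ConditionEnv maps from the host maps is sketched below. `build_env_maps` and `env_maps_demo` are hypothetical helpers, and allocating JoinIR-local ValueIds from a simple counter is an assumption; the design does not fix the allocation strategy.

```rust
use std::collections::{BTreeMap, HashMap};

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct BindingId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ValueId(u32);

/// Hypothetical helper: builds name_to_join and binding_to_join for `names`.
/// JoinIR-local ids start at `next_join_id` (assumed allocation scheme).
fn build_env_maps(
    variable_map: &HashMap<String, ValueId>,
    binding_map: &HashMap<String, BindingId>,
    names: &[&str],
    mut next_join_id: u32,
) -> (BTreeMap<String, ValueId>, BTreeMap<BindingId, ValueId>) {
    let mut name_to_join = BTreeMap::new();
    let mut binding_to_join = BTreeMap::new();
    for &name in names {
        // Skip names the host function does not know about.
        if !variable_map.contains_key(name) {
            continue;
        }
        let join_id = ValueId(next_join_id);
        next_join_id += 1;
        name_to_join.insert(name.to_string(), join_id);
        // Names without a BindingId yet stay reachable through the legacy map only.
        if let Some(&binding) = binding_map.get(name) {
            binding_to_join.insert(binding, join_id);
        }
    }
    (name_to_join, binding_to_join)
}

/// Host knows `i` and `n`, but only `i` has a BindingId so far.
fn env_maps_demo() -> (BTreeMap<String, ValueId>, BTreeMap<BindingId, ValueId>) {
    let variable_map = HashMap::from([
        ("i".to_string(), ValueId(5)),
        ("n".to_string(), ValueId(6)),
    ]);
    let binding_map = HashMap::from([("i".to_string(), BindingId(0))]);
    build_env_maps(&variable_map, &binding_map, &["i", "n", "zzz"], 100)
}
```

Keeping the legacy map complete while the BindingId map fills in gradually is what makes the parallel layer safe during the transition.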
Required MIR Builder Changes
1. Add binding_map to MirBuilder
// src/mir/builder.rs
pub struct MirBuilder {
    pub variable_map: HashMap<String, ValueId>,
    // Phase 73+: NEW
    pub binding_map: HashMap<String, BindingId>, // Current BindingId per name
    next_binding_id: u32,
    // Existing fields...
}
2. Update declare_local_in_current_scope
// src/mir/builder/vars/lexical_scope.rs
pub fn declare_local_in_current_scope(
    &mut self,
    name: &str,
    value: ValueId,
) -> Result<BindingId, String> { // Phase 73+: Return BindingId
    let frame = self.lexical_scope_stack.last_mut()
        .ok_or("COMPILER BUG: local declaration outside lexical scope")?;
    // Allocate new BindingId
    let binding = BindingId(self.next_binding_id);
    self.next_binding_id += 1;
    if frame.declared.insert(name.to_string()) {
        let previous_value = self.variable_map.get(name).copied();
        let previous_binding = self.binding_map.get(name).copied(); // Phase 73+
        frame.restore.insert(name.to_string(), previous_value);
        frame.restore_bindings.insert(name.to_string(), previous_binding); // Phase 73+
    }
    self.variable_map.insert(name.to_string(), value);
    self.binding_map.insert(name.to_string(), binding); // Phase 73+
    Ok(binding)
}
3. Update pop_lexical_scope
pub fn pop_lexical_scope(&mut self) {
    let frame = self.lexical_scope_stack.pop()
        .expect("COMPILER BUG: pop_lexical_scope without push_lexical_scope");
    for (name, previous) in frame.restore {
        match previous {
            Some(prev_id) => { self.variable_map.insert(name, prev_id); }
            None => { self.variable_map.remove(&name); }
        }
    }
    // Phase 73+: Restore BindingIds
    for (name, previous_binding) in frame.restore_bindings {
        match previous_binding {
            Some(prev_binding) => { self.binding_map.insert(name, prev_binding); }
            None => { self.binding_map.remove(&name); }
        }
    }
}
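Taken together, the proposed declare and pop changes can be checked with a compact standalone model. `Builder`, `Frame`, the shortened method names, and `binding_restore_demo` are hypothetical; `restore_bindings` is modeled as a new frame field, as the proposal implies.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct BindingId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ValueId(u32);

#[derive(Default)]
struct Frame {
    declared: BTreeSet<String>,
    restore: BTreeMap<String, Option<ValueId>>,
    restore_bindings: BTreeMap<String, Option<BindingId>>, // assumed new field
}

#[derive(Default)]
struct Builder {
    variable_map: HashMap<String, ValueId>,
    binding_map: HashMap<String, BindingId>,
    next_binding_id: u32,
    stack: Vec<Frame>,
}

impl Builder {
    fn push(&mut self) {
        self.stack.push(Frame::default());
    }

    fn declare(&mut self, name: &str, value: ValueId) -> Result<BindingId, String> {
        let frame = self.stack.last_mut()
            .ok_or("local declaration outside lexical scope")?;
        let binding = BindingId(self.next_binding_id);
        self.next_binding_id += 1;
        // Record what this declaration shadows, once per frame and name.
        if frame.declared.insert(name.to_string()) {
            frame.restore.insert(name.to_string(), self.variable_map.get(name).copied());
            frame.restore_bindings.insert(name.to_string(), self.binding_map.get(name).copied());
        }
        self.variable_map.insert(name.to_string(), value);
        self.binding_map.insert(name.to_string(), binding);
        Ok(binding)
    }

    fn pop(&mut self) {
        let frame = self.stack.pop().expect("pop without matching push");
        for (name, previous) in frame.restore {
            match previous {
                Some(id) => { self.variable_map.insert(name, id); }
                None => { self.variable_map.remove(&name); }
            }
        }
        for (name, previous) in frame.restore_bindings {
            match previous {
                Some(binding) => { self.binding_map.insert(name, binding); }
                None => { self.binding_map.remove(&name); }
            }
        }
    }
}

/// Returns (outer id, inner id, current binding inside scope, current binding after pop).
fn binding_restore_demo() -> (BindingId, BindingId, Option<BindingId>, Option<BindingId>) {
    let mut b = Builder::default();
    b.push();
    let outer = b.declare("sum", ValueId(1)).unwrap();
    b.push();
    let inner = b.declare("sum", ValueId(2)).unwrap();
    let inside = b.binding_map.get("sum").copied();
    b.pop();
    let after = b.binding_map.get("sum").copied();
    (outer, inner, inside, after)
}
```

In this model the inner `sum` receives `BindingId(1)`, and popping the inner scope restores `BindingId(0)` as the current binding for `sum`.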
Migration Path (Phased Approach)
Phase 73 (Current - Design Only)
- ✅ This document (SSOT)
- ✅ No production code changes
- ✅ Define acceptance criteria for Phase 74+
Phase 74 (Infrastructure)
Goal: Add BindingId infrastructure without breaking existing code
Tasks:
- Add binding_map to MirBuilder (default empty)
- Add binding_to_join to ConditionEnv (default empty)
- Add host_binding to CarrierVar (default None)
- Update declare_local_in_current_scope to return BindingId
- Add #[cfg(feature = "normalized_dev")] gated BindingId tests
Acceptance Criteria:
- ✅ All existing tests pass (no behavior change)
- ✅ binding_map populated during local declarations
- ✅ BindingId allocator works (unit tests)
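As a rough idea of what the allocator criterion could cover, here is a minimal sketch. `BindingIdAllocator` and `alloc_n` are hypothetical names, and the overflow check is an added assumption rather than part of the design.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct BindingId(u32);

/// Hypothetical allocator: hands out sequential ids starting at 0.
#[derive(Default)]
struct BindingIdAllocator {
    next: u32,
}

impl BindingIdAllocator {
    fn alloc(&mut self) -> BindingId {
        let id = BindingId(self.next);
        // Fail loudly instead of wrapping around and reusing an id.
        self.next = self.next.checked_add(1).expect("BindingId space exhausted");
        id
    }
}

/// Allocates `n` ids from a fresh allocator.
fn alloc_n(n: usize) -> Vec<BindingId> {
    let mut allocator = BindingIdAllocator::default();
    (0..n).map(|_| allocator.alloc()).collect()
}
```

A unit test would then assert that the ids are sequential and distinct, and that a fresh allocator starts again from `BindingId(0)`.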
Phase 75 (Pattern 1 Pilot)
Goal: Migrate Pattern 1 (Simple While Minimal) to use BindingId
Why Pattern 1?
- Simplest pattern (no carriers, no shadowing)
- Low risk (easy to validate)
- Proves BindingId integration works
Tasks:
- Update CarrierInfo::from_variable_map to accept binding_map
- Update Pattern1ScopeManager (if exists) to use lookup_binding
- Add E2E test with Pattern 1 + BindingId
Acceptance Criteria:
- ✅ Pattern 1 tests pass with BindingId lookup
- ✅ Legacy name-based lookup still works (fallback)
Phase 76 (Pattern 2 - Carrier Promotion)
Goal: Migrate Pattern 2 (with promoted LoopBodyLocal) to BindingId
Challenges:
- Promoted variable tracking (digit_pos → is_digit_pos)
- Replace promoted_loopbodylocals: Vec<String> with promoted_bindings: BTreeMap<BindingId, BindingId>
Tasks:
- Add promoted_bindings to CarrierInfo
- Update resolve_promoted_join_id to use BindingId
- Update Pattern 2 lowering to populate promoted_bindings
Acceptance Criteria:
- ✅ Pattern 2 tests pass (DigitPos pattern)
- ✅ No more naming convention hacks (is_*, is_*_match)
Phase 77 (Pattern 3 & 4)
Goal: Complete migration for remaining patterns
Tasks:
- Migrate Pattern 3 (multi-carrier)
- Migrate Pattern 4 (generic case A)
- Remove legacy name_to_join fallbacks
Acceptance Criteria:
- ✅ All patterns use BindingId exclusively
- ✅ Legacy code paths removed
- ✅ Full test suite passes
Phase 78+ (Future Enhancements)
Optional Improvements:
- Nested loop shadowing support
- BindingId-based ownership analysis (Phase 63 integration)
- BindingId-based SSA optimization (dead code elimination)
Acceptance Criteria (Phase 73)
Design Document Complete
- ✅ Current state analysis (MIR + JoinIR scope systems)
- ✅ Proposed architecture (Option A: Parallel BindingId Layer)
- ✅ Integration points (MirBuilder changes)
- ✅ Migration path (Phases 74-77)
No Production Code Changes
- ✅ No changes to src/mir/builder.rs
- ✅ No changes to src/mir/join_ir/lowering/*.rs
- ✅ Optional: Minimal PoC in #[cfg(feature = "normalized_dev")]
Stakeholder Review
- ⏰ User review (confirm design makes sense)
- ⏰ Identify any missed edge cases
Open Questions
Q1: Should BindingId be global or per-function?
Current Assumption: Per-function (like ValueId)
Reasoning:
- Each function has independent binding scope
- No cross-function binding references
- Simpler allocation (no global state)
Alternative: Global BindingId pool (for Phase 63 ownership analysis)
Q2: How to handle captured variables?
Current: CapturedEnv uses names, marks as immutable
Proposed: Add binding_id to CapturedVar
pub struct CapturedVar {
    name: String,
    host_id: ValueId,
    host_binding: BindingId, // Phase 73+
    is_immutable: bool,
}
Q3: Performance impact of dual maps?
Concern: binding_to_join + name_to_join doubles memory
Mitigation:
- Phases 74-77: Both maps active (transition)
- End of Phase 77: Remove name_to_join once all patterns are migrated
- BTreeMap overhead minimal for typical loop sizes (<10 variables)
References
Related Phases
- Phase 68-69: MIR lexical scope + shadowing (existing implementation)
- Phase 63: Ownership analysis (dev-only, uses BindingId)
- Phase 231: ScopeManager trait (current implementation)
- Phase 238: ExprLowerer scope boundaries (design doc)
Key Files
- src/mir/builder/vars/lexical_scope.rs - MIR lexical scope implementation
- src/mir/join_ir/lowering/scope_manager.rs - JoinIR ScopeManager trait
- src/mir/join_ir/lowering/condition_env.rs - ConditionEnv (name-based)
- src/mir/join_ir/lowering/carrier_info.rs - CarrierInfo (name-based promotion)
- src/mir/join_ir/ownership/ast_analyzer.rs - BindingId usage (dev-only)
Appendix: Example Scenarios
A1: Shadowing Handling (Future)
local sum = 0;
loop(i < n) {
    local sum = i * 2; // BindingId(1) shadows BindingId(0)
    total = total + sum;
}
print(sum); // BindingId(0) restored
Expected Behavior:
- Inner sum has BindingId(1)
- ScopeManager resolves sum to BindingId(1) inside loop
- Outer sum (BindingId(0)) restored after loop
A2: Promoted Variable Tracking (Current)
loop(p < len) {
    local digit_pos = digits.indexOf(ch);
    if digit_pos < 0 { break; } // Promoted to carrier
}
Current (Phase 73): String-based promotion
- promoted_loopbodylocals: ["digit_pos"]
- resolve_promoted_join_id("digit_pos") → searches for "is_digit_pos"
Proposed (Phase 76+): BindingId-based promotion
- promoted_bindings: { BindingId(5) → BindingId(10) }
- lookup_binding(BindingId(5)) → returns ValueId from BindingId(10)
Conclusion
Phase 73 Deliverable: This design document serves as SSOT for BindingId migration.
Next Steps:
- User review and approval
- Phase 74: Infrastructure implementation (BindingId allocation)
- Phase 75-77: Gradual pattern migration
Estimated Total Effort:
- Phase 73 (design): ✅ Complete
- Phase 74 (infra): 2-3 hours
- Phase 75 (Pattern 1): 1-2 hours
- Phase 76 (Pattern 2): 2-3 hours
- Phase 77 (Pattern 3-4): 2-3 hours
- Total (Phases 74-77): 7-11 hours
Risk Level: Low (gradual migration, backward compatible)