Parser improvements:
- Added expression statement fallback in parse_statement() for flexible syntax
- Fixed ternary operator to use PeekExpr instead of If AST (better lowering)
- Added peek_token() check to avoid ?/?: operator conflicts
LLVM Python improvements:
- Added optional ESC_JSON_FIX environment flag for string concatenation
- Improved PHI generation with better default handling
- Enhanced substring tracking for esc_json pattern
Documentation updates:
- Updated language guide with peek expression examples
- Added box theory diagrams to Phase 15 planning
- Clarified peek vs when syntax differences
These changes enable cleaner parser implementation for self-hosting,
especially for handling digit conversion with peek expressions instead
of 19-line if-else chains.
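
A rough Python analogue of the digit-conversion case is sketched below. It does not show Nyash peek syntax; it only illustrates how one table-style expression can stand in for a per-digit if-else chain. The helper name `digit_value` and the `-1` sentinel are assumptions made for this sketch.

```python
def digit_value(ch: str) -> int:
    """Map one decimal digit character to its integer value.

    Hypothetical helper: a single lookup expression replaces a
    per-character if/else chain. Returns -1 for anything else.
    """
    digits = "0123456789"
    # str.find returns the index of the character, or -1 if it is absent.
    # The length guard rejects "" (which find would report at index 0)
    # and multi-character strings such as "12".
    return digits.find(ch) if len(ch) == 1 else -1
```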
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Changes to resolver.py:
- Improved PHI value tracking in _value_at_end_i64() (lines 268-285)
- Added trace logging for snap hits with PHI detection
- Fixed PHI placeholder reuse logic to preserve dominance
- PHI values now returned directly from snapshots when valid
Changes to llvm_builder.py:
- Fixed externcall instruction parsing (line 522: 'func' instead of 'name')
- Improved block snapshot tracing (line 439)
- Added PHI incoming metadata tracking (lines 316-376)
- Enhanced definition tracking for lifetime hints
This should help debug the string carry=0 issue in esc_dirname_smoke where
PHI values were being incorrectly coerced instead of preserved.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Resolver-only reads across BBs; remove vmap fallbacks
- Create PHIs at block start; insert casts in preds before terminators
- Re-materialize int in preds to satisfy dominance (add/zext/trunc)
- Use constant GEP for method strings to avoid order dependency
- Order non-PHI lowering to preserve producer→consumer dominance
- Update docs: RESOLVER_API.md, LLVM_HARNESS.md
- compare_harness_on_off: ON/OFF exits match; linking green
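
The placement rules above (PHIs at block start, casts in predecessors before the terminator) can be sketched on a toy IR. This is a minimal illustration using plain strings, not the harness code; `Block`, `insert_phi`, and `insert_before_terminator` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Toy basic block: instruction strings, normally ending in a terminator."""
    name: str
    insts: list = field(default_factory=list)


# Assumption for this sketch: only these two terminator spellings occur.
TERMINATORS = ("br ", "ret ")


def insert_phi(block: Block, phi: str) -> None:
    """Place a PHI after any existing PHIs and before every non-PHI instruction."""
    idx = 0
    while idx < len(block.insts) and " = phi " in block.insts[idx]:
        idx += 1
    block.insts.insert(idx, phi)


def insert_before_terminator(block: Block, inst: str) -> None:
    """Place a cast at the end of a predecessor, just before its terminator."""
    if block.insts and block.insts[-1].startswith(TERMINATORS):
        block.insts.insert(len(block.insts) - 1, inst)
    else:
        # No terminator yet: appending keeps the instruction last for now.
        block.insts.append(inst)
```

Casting in the predecessor keeps the widened value defined on the edge that feeds the PHI, which is what the dominance rule requires.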
Major improvement to reduce parameter explosion (15+ args → 3-4 contexts):
- Add LowerFnCtx/BlockCtx for grouping related parameters
- Add lightweight StrHandle/StrPtr newtypes for string safety
- Implement boxed API wrappers for boxcall/fields/invoke
- Add dev checks infrastructure (NYASH_DEV_CHECK_DISPATCH_ONLY_PHI)
Key achievements:
- lower_boxcall: 16 args → 7 args via boxed API
- fields/invoke: Similar parameter reduction
- BuilderCursor discipline enforced throughout
- String handle invariant: i64 across blocks, i8* only at call sites
Status:
- Internal migration in progress (fields → invoke → marshal)
- Full cutover pending due to borrow checker constraints
- dep_tree_min_string.o generation successful (sealed=ON)
Next: Complete internal migration before flipping to boxed APIs
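
The parameter-grouping idea can be illustrated in Python, although the actual change is in the Rust lowering code. The type names `LowerFnCtx`, `BlockCtx`, `StrHandle`, and `StrPtr` come from this commit; every field and the `lower_string_len` function are assumptions invented for the sketch.

```python
from dataclasses import dataclass
from typing import NewType

# Distinct static types so a handle cannot be passed silently as a pointer.
StrHandle = NewType("StrHandle", int)  # i64 handle, may be carried across blocks
StrPtr = NewType("StrPtr", int)        # stands in for i8*, used only at a call site


@dataclass
class LowerFnCtx:
    """Per-function state that would otherwise travel as separate arguments."""
    func_name: str      # hypothetical field
    value_types: dict   # hypothetical field


@dataclass
class BlockCtx:
    """Per-block state grouped into one parameter."""
    cur_block: str      # hypothetical field
    succs: tuple        # hypothetical field


def lower_string_len(fn: LowerFnCtx, blk: BlockCtx, recv: StrHandle) -> str:
    """Emit a toy call line; three parameters replace a long argument list."""
    return f"{fn.func_name}/{blk.cur_block}: call len_h(i64 {int(recv)})"
```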
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added paper-g-ai-assisted-compiler folder documenting:
- Week-long LLVM backend development with AI assistance
- Key insights from PHI/SSA struggles to Resolver API solution
- Development log capturing the chaotic reality
- Abstract in both English and Japanese
Key quote: 'I don't remember anymore' - capturing the authentic
experience of intensive AI-assisted development where the process
itself becomes the research data.
This is potentially the first fully documented case of
building a compiler backend primarily through AI assistance.
Added:
- Resolver API (resolve_i64) for unified value resolution with per-block cache
- llvmlite harness (Python) for rapid PHI/SSA verification
- Comprehensive LLVM documentation suite:
  - LLVM_LAYER_OVERVIEW.md: Overall architecture and invariants
  - RESOLVER_API.md: Value resolution strategy
  - LLVM_HARNESS.md: Python verification harness
Updated:
- BuilderCursor applied to ALL lowering paths (externcall/newbox/arrays/maps/call)
- localize_to_i64 for dominance safety in strings/compare/flow
- NYASH_LLVM_DUMP_ON_FAIL=1 for debug IR output
Key insight: LoopForm did not cause the problems; it exposed existing design flaws:
- Scattered value resolution (now unified via Resolver)
- Inconsistent type conversion placement
- Ambiguous PHI wiring responsibilities
Next: Wire Resolver throughout, achieve sealed=ON green for dep_tree_min_string
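
A minimal sketch of the per-block cache behind `resolve_i64`, written in Python for brevity. Only the method name and the per-block caching idea come from this commit; the constructor, the `materialize` callback, and the string value names are assumptions.

```python
class Resolver:
    """Toy resolver: at most one cached i64 view of each value per basic block."""

    def __init__(self, materialize):
        # materialize(block, value_id) is a hypothetical callback that returns
        # the name of an i64 value that is valid inside `block`.
        self._materialize = materialize
        self._cache = {}

    def resolve_i64(self, block: str, value_id: int) -> str:
        """Return the cached value for (block, value_id), creating it once."""
        key = (block, value_id)
        if key not in self._cache:
            self._cache[key] = self._materialize(block, value_id)
        return self._cache[key]
```

Keying the cache by block, not by value alone, means a value localized in one block is never reused in another block it might not dominate.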
## Phase 15 Documentation Updates
### ROADMAP.md
- Added LLVM Native EXE Generation as item 5 in "Next (small boxes)"
- Covers complete pipeline from MIR to executable
- Includes plan for separate nyash-llvm-compiler crate
### New Document: llvm-exe-strategy.md
- Detailed implementation strategy for LLVM backend EXE generation
- Architecture for separating LLVM compiler into independent crate
- Reduces main build time from 5-7min to 1-2min
- Enables parallel builds in CI/CD
### README.md Updates
- Updated EXE generation section to prioritize LLVM over Cranelift
- Added links to new LLVM strategy document
- Clarified current implementation status
## Benefits
- Faster development cycles with reduced build times
- Better CI/CD performance through parallelization
- Clear separation of concerns between core and LLVM compiler
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
## LLVM Call Instruction Modularization
- Moved MirInstruction::Call lowering to separate instructions/call.rs
- Follows the principle of one MIR instruction per file
- Call implementation was already complete; it only needed modularization
## Phase 21 Documentation
- Moved all Phase 21 content to private/papers/paper-f-self-parsing-db/
- Preserved AI evaluations from Gemini and Codex
- Academic paper potential confirmed by both AIs
- Self-parsing AST database approach validated
## Next Steps
- Continue monitoring ChatGPT5's LLVM improvements
- Consider creating separate nyash-llvm-compiler crate when LLVM layer is stable
- This will reduce build times by isolating LLVM dependencies
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Major improvements to LLVM backend function call infrastructure:
## Key Changes
### Function Call System Complete
- All MIR functions now properly lowered to LLVM (not just entry)
- Function parameter binding to LLVM arguments implemented
- ny_main() wrapper added for proper entry point handling
- Callee resolution from ValueId to function symbols working
### Call Instruction Analysis
- MirInstruction::Call was implemented, but the surrounding call system was incomplete
- Fixed "rhs missing" errors caused by undefined Call return values
- Function calls now properly return values through the system
### Code Modularization (Ongoing)
- BoxCall → instructions/boxcall.rs ✓
- ExternCall → instructions/externcall.rs ✓
- Call remains in mod.rs (to be refactored)
### Phase 21 Documentation
- Added comprehensive AI evaluation from Gemini and Codex
- Both AIs confirm academic paper potential for self-parsing AST DB approach
- "Code as Database" concept validated as novel contribution
Co-authored-by: ChatGPT5 <noreply@openai.com>
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Major enhancements to LLVM code generation and type handling:
1. String Operations:
- Added StringBox length fast-path (length/len methods)
- Converts i8* to handle when needed for len_h call
- Consistent handle-based string operations
2. Array/Map Fast-paths:
- ArrayBox: get/set/push/length operations
- MapBox: get/set/has/size with handle-based keys
- Optimized paths for common collection operations
3. Field Access:
- getField/setField implementation with handle conversion
- Proper i64 handle to pointer conversions
4. NewBox Improvements:
- StringBox/IntegerBox pass-through optimizations
- Fallback to env.box.new when type_id unavailable
- Support for dynamic box creation
5. Documentation:
- Added ARCHITECTURE.md for overall design
- Added EXTERNCALL.md for external call specs
- Added LOWERING_LLVM.md for LLVM lowering rules
- Added PLUGIN_ABI.md for plugin interface
6. Type System:
- Added UserBox type registration in nyash_box.toml
- Consistent handle (i64) representation across system
Results: More robust LLVM code generation with proper type handling
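
The fast-path selection described above can be sketched as a table lookup with a generic fallback. The box and method names are taken from the lists in this commit; the function name, the `fast:` label format, and the `generic_boxcall` fallback label are hypothetical.

```python
# Box types and methods as listed in this commit message.
FAST_PATH_METHODS = {
    "StringBox": ("length", "len"),
    "ArrayBox": ("get", "set", "push", "length"),
    "MapBox": ("get", "set", "has", "size"),
}


def select_lowering(box_type: str, method: str) -> str:
    """Pick a fast path when one is listed, else fall back to the generic call."""
    if method in FAST_PATH_METHODS.get(box_type, ()):
        return f"fast:{box_type}.{method}"
    return "generic_boxcall"  # hypothetical label for the slow path
```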
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Documented the architectural decision for Nyash runtime design:
1. Core boxes (String/Integer/Array/Map/Bool) built into nyrt
- Essential for self-hosting
- Available at boot without plugin loader
- High performance (no FFI overhead)
2. All other boxes as plugins (File/Net/User-defined)
- Extensible ecosystem
- Clear separation of concerns
3. Minimal ExternCall (only 5 functions)
- print/error (output)
- panic/exit (process control)
- now (time)
Key principle: Everything goes through BoxCall interface
- No special fast paths
- Unified architecture
- "Everything is Box" philosophy maintained
This design balances self-hosting requirements with architectural purity.
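
As a small illustration of the split, the five ExternCall names listed above can be treated as an allowlist, with everything else routed through BoxCall. The function name and the returned labels are assumptions for this sketch, and the names are shown unqualified exactly as they appear in this message.

```python
# The five names come from the list in this commit message.
ALLOWED_EXTERNS = frozenset({"print", "error", "panic", "exit", "now"})


def classify_call(name: str) -> str:
    """Return "extern" for the minimal ExternCall set, otherwise "boxcall"."""
    return "extern" if name in ALLOWED_EXTERNS else "boxcall"
```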
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Added eprintln! debug messages to trace handle values
- Helps investigate why plugin return values display as blank
- Part of ongoing LLVM backend plugin return value investigation
Related to the issue where print(c.get()) shows blank output
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
- Add src/tests/parser_bitops_test.rs and vm_bitops_test.rs
- Update tokenizer unit test to expect SHIFT_RIGHT
- Update quick-reference and language guide to document &,|,^,<<,>> and Arrow deprecation
Known issue: one unrelated, pre-existing test failure (consolebox println TLV vs typebox).
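
A sketch of why the tokenizer needs longest-match-first for the shift operators: `>>` has to be tried before `>`. This is a Python illustration, not the project's tokenizer. `SHIFT_RIGHT` is the token name mentioned in this commit; all other token names and the function itself are assumptions.

```python
def tokenize_ops(src: str) -> list:
    """Tokenize a string of operators, trying two-character forms first."""
    # Order matters: "<<" and ">>" must precede "<" and ">".
    table = [
        ("<<", "SHIFT_LEFT"),   # hypothetical token name
        (">>", "SHIFT_RIGHT"),  # name taken from the commit message
        ("&", "BIT_AND"),       # hypothetical token name
        ("|", "BIT_OR"),        # hypothetical token name
        ("^", "BIT_XOR"),       # hypothetical token name
        ("<", "LESS"),          # hypothetical token name
        (">", "GREATER"),       # hypothetical token name
    ]
    out, i = [], 0
    while i < len(src):
        if src[i].isspace():
            i += 1
            continue
        for text, kind in table:
            if src.startswith(text, i):
                out.append(kind)
                i += len(text)
                break
        else:
            # No table entry matched at position i.
            raise ValueError(f"unexpected character {src[i]!r} at {i}")
    return out
```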
- Add JIT Self-Host Quickstart section for Phase 15
- Include important flags reference (plugins, parsers, debugging)
- Add Codex async workflow documentation for parallel tasks
- Update test execution with Phase 15 smoke tests
- Improve build time notes (JIT vs LLVM)
- Align with current Phase 15 progress and tooling
🎉 Bootstrap (c0→c1→c1') test confirmed working!
Co-Authored-By: Claude <noreply@anthropic.com>
- Update phase indicator to Phase 15 (Self-Hosting)
- Update documentation links to Phase 15 resources
- Reflect completion of R1-R5 tasks and ongoing work
- Fix CURRENT_TASK.md location to root directory
Co-Authored-By: Claude <noreply@anthropic.com>