docs: add MIR13 mode doc and set PHI-off as default; bridge lowering split (if/loop/try); llvmlite resolver stabilization; curated runner default PHI-off; refresh CURRENT_TASK.md

Selfhosting Dev
2025-09-17 10:58:12 +09:00
parent 31f90012e0
commit d99b941218
131 changed files with 2584 additions and 2657 deletions

@ -0,0 +1,47 @@
# ArrayBox get/set -> Invalid arguments (plugin side)
Status: open
Summary
- Error messages observed during an AOT/LLVM smoke run that touches ArrayBox:
  - "Plugin invoke error: ArrayBox.set -> Invalid arguments"
  - "Plugin invoke error: ArrayBox.get -> Invalid arguments"
- VInvoke(MapBox) path is stable; issue is isolated to ArrayBox plugin invocation.
Environment
- LLVM 18 / inkwell 0.5.0 (llvm18-0)
- nyash-rust with Phase 11.2 lowering
- tools/llvm_smoke.sh (Array smoke is gated via NYASH_LLVM_ARRAY_SMOKE=1)
Repro
1) Enable array smoke explicitly:
   - `NYASH_LLVM_ARRAY_SMOKE=1 ./tools/llvm_smoke.sh release`
2) Observe plugin-side errors for ArrayBox.get/set.
Expected
- Array get/set should be routed to NyRT safety shims (`nyash_array_get_h/set_h`) with handle + index/value semantics that match the core VM.
Observed
- Plugin path is taken for ArrayBox.get/set and the plugin rejects arguments as invalid.
Notes / Hypothesis
- LLVM lowering is intended to map ArrayBox.get/set to NyRT shims. The plugin path should not be engaged for array core operations.
- If by-name fallback occurs (NYASH_LLVM_ALLOW_BY_NAME=1), the array methods might route to plugin-by-name with i64-only ABI and mismatched TLV types (index/value encoding).
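The intended routing can be sketched as a small decision function. This is a minimal illustration in Python, not the actual Rust lowering code: the `BoxCall` record, the route labels, and `select_call_target` are hypothetical names; only the shim names `nyash_array_get_h` / `nyash_array_set_h` come from this note.

```python
from dataclasses import dataclass
from typing import Tuple

# Shim names are taken from the "Expected" section; everything else is hypothetical.
ARRAY_SHIMS = {
    "get": "nyash_array_get_h",
    "set": "nyash_array_set_h",
}


@dataclass(frozen=True)
class BoxCall:
    box_type: str  # e.g. "ArrayBox", "MapBox"
    method: str    # e.g. "get", "set"


def select_call_target(call: BoxCall, allow_by_name: bool) -> Tuple[str, str]:
    """Return a (route, symbol) pair for a BoxCall."""
    # Core array operations are checked first, so the by-name flag
    # can never divert them to the plugin path.
    if call.box_type == "ArrayBox" and call.method in ARRAY_SHIMS:
        return ("nyrt_shim", ARRAY_SHIMS[call.method])
    symbol = f"{call.box_type}.{call.method}"
    if allow_by_name:
        return ("plugin_by_name", symbol)
    return ("plugin_by_id", symbol)
```

Checking the array case before consulting the flag is the design point: if the real lowering tests the by-name flag first, that ordering would be one place where the plugin path could be engaged unintentionally.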
Plan
1) Confirm lowering branch for BoxCall(ArrayBox.get/set) always selects NyRT shims under LLVM, regardless of by-name flag.
2) If by-name fallback is unavoidable in the current scenario, ensure integer index/value are encoded/tagged correctly (tag=3 for i64) and the receiver is a handle.
3) Add a targeted smoke (OFF by default) that calls only `get/set/length` and prints a deterministic result.
4) Optional: Add debug env `NYASH_PLUGIN_TLV_DUMP=1` to print decoded TLV for failing invokes to speed diagnosis.
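A TLV dump along the lines of step 4 might look like the following sketch. The entry layout used here (tag `u8`, reserved `u8`, little-endian `u16` size, then payload) is an assumption made for illustration and may not match the real plugin ABI; only `tag=3` for i64 comes from the plan above. `encode_i64` and `dump_tlv` are hypothetical helper names.

```python
import struct

TAG_I64 = 3  # from the plan above; other tags are not modeled here


def encode_i64(value: int) -> bytes:
    """Encode one i64 argument using the assumed entry layout."""
    payload = struct.pack("<q", value)
    # Assumed header: tag (u8), reserved (u8), payload size (u16, little-endian).
    return struct.pack("<BBH", TAG_I64, 0, len(payload)) + payload


def dump_tlv(buf: bytes) -> list:
    """Decode consecutive entries into (tag, size, decoded) tuples for logging."""
    entries = []
    offset = 0
    while offset < len(buf):
        if offset + 4 > len(buf):
            raise ValueError("truncated TLV header")
        tag, _reserved, size = struct.unpack_from("<BBH", buf, offset)
        offset += 4
        payload = buf[offset:offset + size]
        if len(payload) != size:
            raise ValueError("truncated TLV payload")
        offset += size
        if tag == TAG_I64 and size == 8:
            decoded = struct.unpack("<q", payload)[0]
        else:
            # Unknown tags are shown as hex so mismatches stay visible.
            decoded = payload.hex()
        entries.append((tag, size, decoded))
    return entries
```

For an `ArrayBox.set(index, value)` call, a dump showing anything other than two tag-3 entries of size 8 would point at the encoding mismatch hypothesized in the notes.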
Workarounds
- Keep `NYASH_LLVM_ARRAY_SMOKE=0` in CI until fixed.

@ -0,0 +1,40 @@
# LLVM lowering: string + int causes binop type mismatch
Status: open
Summary
- When compiling code that concatenates a string literal with a non-string (e.g., integer), LLVM object emission fails with a type mismatch in binop.
- Example from `apps/ny-map-llvm-smoke/main.nyash`: `print("Map: v=" + v)` and `print("size=" + s)`.
Environment
- LLVM 18 / inkwell 0.5.0
- Phase 11.2 lowering
Repro
1) Run: `NYASH_LLVM_ARRAY_SMOKE=1 ./tools/llvm_smoke.sh release` (or build/link the map smoke similarly)
2) Observe: `❌ LLVM object emit error: binop type mismatch`
Expected
- String concatenation should be lowered to a safe runtime shim (e.g., NyRT string builder or `nyash_string_concat`) that accepts `(i8* string, i64/int)` and returns `i8*`.
Observed
- `+` binop is currently generated as integer addition for non-float operands, leading to a type mismatch when one side is a pointer (string) and the other is integer.
Plan
1) Introduce string-like detection in lowering: if either operand is `String` (or pointer from `nyash_string_new`), route to a NyRT concat shim.
2) Provide NyRT APIs:
   - `nyash.string.concat_ss(i8*, i8*) -> i8*`
   - `nyash.string.concat_si(i8*, i64) -> i8*`
   - Optional: `concat_sf`, `concat_sb` (format helpers)
3) As an interim simplification for smoke, emit each `print("...")` in two steps (print the string literal, then print the value) to avoid mixed-type `+` until the concat shim is ready.
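The detection in step 1 can be sketched as a dispatch on coarse operand types. This is an illustrative Python model rather than the inkwell lowering itself: the type labels and `select_add_lowering` are hypothetical, the shim names are the ones proposed in step 2, and `add` / `fadd` stand for the corresponding LLVM instructions.

```python
from typing import Optional


def select_add_lowering(lhs_ty: str, rhs_ty: str) -> Optional[str]:
    """Pick a lowering for `lhs + rhs` given coarse operand types.

    Types are modeled as the strings "string", "int", and "float".
    Returns None when no lowering is defined for the combination.
    """
    if lhs_ty == "string" and rhs_ty == "string":
        return "nyash.string.concat_ss"
    if lhs_ty == "string" and rhs_ty == "int":
        return "nyash.string.concat_si"
    if lhs_ty == "int" and rhs_ty == "int":
        return "add"   # integer addition
    if lhs_ty == "float" and rhs_ty == "float":
        return "fadd"  # floating-point addition
    # e.g. int + string: not covered by the shims listed above.
    return None
```

With this dispatch, `"size=" + s` for an integer `s` selects `nyash.string.concat_si` instead of falling through to integer addition. Combinations such as `int + string` return `None` here because the plan lists no shim for them; how to handle those is left open.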
CI
- Keep `apps/ny-llvm-smoke` OFF by default. Re-enable once concat shim lands and binop lowering is updated.

@ -0,0 +1,26 @@
# Parser/Bridge: Unary and ASI Alignment (Stage-2)
Context
- The Rust parser already parses unary minus with higher precedence (parse_unary → factor → term), but the PyVM pipe path did not reflect unary when emitting MIR JSON for the PyVM harness.
- The Bridge JSON v0 path is correct for unary: the Python MVP transforms it to `0 - expr`. The Rust→PyVM path, however, uses `emit_mir_json_for_harness`, which skipped `UnaryOp`.
- ASI in arguments split over newlines is not yet supported in Rust (e.g., a newline inside `(..., ...)` after a comma in a chained call), while Bridge/Selfhost cover ASI for statements and operators.
Proposed minimal steps
- Unary for PyVM harness:
  - Option A (preferred later): extend `emit_mir_json_for_harness[_bin]` to export a `unop` instruction and add PyVM support. Requires a schema change.
  - Option B (quick): legalize unary `Neg` to `Const(0); BinOp('-', 0, v)` before emitting, by inserting a synthetic temporary. This requires value-id minting in the emitter to remain self-consistent, which we currently do not have, so Option B is non-trivial without changing emitter capabilities.
  - Decision: keep the Bridge JSON v0 path authoritative for unary tests; avoid relying on Rust→PyVM for unary until we add a `unop` schema.
- ASI inside call arguments (multiline):
  - Keep as NOT SUPPORTED for the Rust parser in Phase 15. Use single-line args in tests.
  - The Selfhost/Bridge side already tolerates optional semicolons after statements; operator-continuation is supported in the Bridge MVP.
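To make the cost of Option B concrete, here is a minimal sketch of the `Neg` → `0 - v` legalization over a flat instruction list. It is written in Python for illustration only; the dict shapes for `const` and `binop` and the `legalize_neg` name are hypothetical, and the caller is assumed to supply a `next_id` that is larger than every value id already in use.

```python
def legalize_neg(instructions, next_id):
    """Rewrite each neg unop as `0 - src`.

    Returns (new_instructions, next_free_id). The input list is not mutated.
    """
    out = []
    for inst in instructions:
        if inst.get("op") == "unop" and inst.get("kind") == "neg":
            # Mint a fresh value id for the synthetic zero constant.
            zero = next_id
            next_id += 1
            out.append({"op": "const", "dst": zero, "value": 0})
            out.append({
                "op": "binop",
                "kind": "-",
                "lhs": zero,
                "rhs": inst["src"],
                "dst": inst["dst"],
            })
        else:
            out.append(inst)
    return out, next_id
```

The `next_id` parameter is the value-id minting that the emitter currently lacks: without a reliable source of fresh ids, the synthetic constant could collide with an existing value.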
Tracking
- If we want to support unary in the PyVM harness emitter:
  - Add `unop` to tools/pyvm_runner.py and src/llvm_py/pyvm/vm.py (accept `{op:"unop", kind:"neg", src: vid, dst: vid}`)
  - Teach emitters to export `UnaryOp` accordingly (`emit_mir_json_for_harness[_bin]`).
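On the PyVM side, accepting the proposed instruction could be as small as the sketch below. It assumes, hypothetically, that the VM keeps values in a dict keyed by value id; `exec_unop` is an illustrative name and not an existing function in `vm.py`.

```python
def exec_unop(inst, regs):
    """Execute {op: "unop", kind: "neg", src: vid, dst: vid} against `regs`.

    `regs` is assumed to map value ids to Python values.
    """
    kind = inst["kind"]
    if kind == "neg":
        regs[inst["dst"]] = -regs[inst["src"]]
    else:
        # Fail loudly on kinds the schema does not define yet.
        raise NotImplementedError(f"unsupported unop kind: {kind}")
```

Raising on unknown kinds keeps a schema mismatch between emitter and VM from passing silently in smoke runs.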
Status
- Bridge unary: OK (`ny_stage2_bridge_smoke` includes unary)
- Rust→PyVM unary: not supported in the emitter; will stay out of CI until the schema update
- ASI in args over newline: not supported by the Rust parser; keep tests single-line for now