6.5 KiB
Phase 274 P2 (impl): LLVM (llvmlite harness) TypeOp alignment
Status: planned / design-first
Goal: make LLVM harness execution match the SSOT semantics in docs/reference/language/types.md for:
TypeOp(Check, value, ty)→BoolTypeOp(Cast, value, ty)→valueorTypeError
Primary reference implementation (SSOT runtime): src/backend/mir_interpreter/handlers/type_ops.rs
0. What is currently wrong (must-fix)
LLVM harness TypeOp is stubbed in src/llvm_py/instructions/typeop.py:
is: returns 0 for most types (IntegerBox is “non-zero” heuristic)cast/as: pass-through (never errors)
This conflicts with SSOT:
ismust reflect actual runtime type match.asmust fail-fast (TypeError) on mismatch.
Note:
- It is OK if the compiler constant-folds trivial cases (e.g.
1.is("Integer")). - For P2 verification, you must use a fixture that keeps
TypeOpin MIR (runtime-unknown / union value).
1. Acceptance criteria (minimum)
- Behavior parity with Rust VM (SSOT)
- With
NYASH_LLVM_USE_HARNESS=1and--backend llvm, this fixture behaves the same as VM:apps/tests/phase274_p2_typeop_primitives_only.hako(recommended: harness-safe baseline)
- Fail-fast
ason mismatch must raise a TypeError (not return 0 / pass-through).ismust return0/1deterministically (no “unknown → 0” unless it is truly not a match).
- No hardcode / no new env sprawl
- No “BoxName string match special-cases” except small alias normalization shared with frontend (
IntegerBox/StringBoxetc.). - Do not add new environment variables for behavior.
2. Design constraint: LLVM harness value representation (key risk)
In llvmlite harness, a runtime “value” is currently represented as an i64, but it mixes:
- raw integers (from
const i64) - boxed handles (e.g. strings are boxed to handles via
nyash.box.from_i8_string) - various call/bridge conventions
Because a handle is also an i64, the harness cannot reliably decide at runtime whether an i64 is “raw int” or “handle”, unless the value representation is made uniform.
This means TypeOp parity cannot be achieved reliably without addressing representation.
3. Recommended implementation strategy (P2): make representation uniform for TypeOp
Strategy A (recommended): “all values are handles” in LLVM harness
Make every runtime value in llvmlite harness be a handle (i64) to a boxed value:
- integers:
nyash.box.from_i64(i64) -> handle - floats:
nyash.box.from_f64(f64) -> handle - strings: already boxed (
nyash.box.from_i8_string) - bool/void: use existing conventions (or add kernel shims if needed)
Then TypeOp becomes implementable via runtime introspection on handles.
A.1 Kernel helper needed (small, SSOT-friendly)
Add a kernel export (in crates/nyash_kernel/src/lib.rs) that checks a handle’s runtime type:
nyash.any.is_type_h(handle: i64, type_name: *const i8) -> i64(0/1)- optionally
nyash.any.cast_h(handle: i64, type_name: *const i8) -> i64(handle or 0; but prefer fail-fast at caller)
Implementation rule:
- Must use actual runtime object type (builtins + plugin boxes + InstanceBox class name).
- Must not guess via resolver/type facts.
A.2 LLVM harness lowering
Update src/llvm_py/instructions/typeop.py:
- Always resolve
src_valas handle (i64). check/is: callnyash.any.is_type_h(src_val, type_name_ptr)→ i64 0/1cast/as: callis_type_h; if false, emit a runtime error (use existing “panic”/error path if available) or call a kernelnyash.panic.type_errorstyle function (add if missing).
Also update other lowerers incrementally so that values feeding TypeOp are handles (start with the fixture path).
Strategy B (fallback): keep mixed representation, but document divergence (not parity)
If Strategy A is too large for P2, constrain scope:
- Implement
TypeOpusing compile-timeresolver.value_typeshints. - Document clearly in Phase 274 README: LLVM harness TypeOp is “best-effort using type facts” and is not SSOT-correct under re-assignment.
This keeps the harness useful for SSA/CFG validation, but is not runtime-parity.
Note: Strategy B should be treated as temporary and must be called out as backend divergence in docs.
4. Concrete work items (P2)
- Audit current failure path
- Identify how LLVM harness reports runtime errors today (type errors, asserts).
- Prefer a single runtime helper rather than sprinkling Python exceptions.
1.5) Fix MIR JSON emission for TypeOp (required)
The LLVM harness consumes MIR JSON emitted by the Rust runner.
If TypeOp is missing in that JSON, the harness will never see it (and the JSON can become invalid due to missing defs).
Checklist:
src/runner/mir_json_emit/mod.rsmust emit{"op":"typeop", ...}in both emitters:emit_mir_json_for_harness(nyash_rust::mir) ✅ already supports TypeOpemit_mir_json_for_harness_bin(crate::mir) ⚠️ ensure TypeOp is included
- Add kernel introspection helper(s)
crates/nyash_kernel/src/lib.rs: addnyash.any.is_type_h.- It must handle:
- primitives boxed (
IntegerBox,FloatBox,BoolBox,StringBox,VoidBox) InstanceBoxuser classes (byclass_name)- plugin boxes (by metadata / resolved type name)
- primitives boxed (
- Implement real
TypeOplowering
src/llvm_py/instructions/typeop.py:- normalize
target_typealiases (same mapping as frontend docs:Int→IntegerBox, etc.) is→ call kernel checkas/cast→ check then return src or TypeError
- normalize
- Add LLVM smoke (integration)
- New script (name suggestion):
tools/smokes/v2/profiles/integration/apps/phase274_p2_typeop_is_as_llvm.sh
- Run:
NYASH_LLVM_USE_HARNESS=1 ./target/release/hakorune --backend llvm apps/tests/phase274_p2_typeop_primitives_only.hako
- Expect: exit code
3(same as VM).
5. Notes / non-goals (P2)
- Do not implement a full static type system here.
- Do not add rule logic to the resolver (no “guessing chains”).
- Do not add new environment variables for behavior selection.
- If you must limit scope, limit it by fixtures and document it in Phase 274 README as explicit divergence.
Fixture rule (important)
To avoid “TypeOp disappeared” false negatives:
- Do not use pure compile-time constants for
is/aschecks. - Prefer a union value formed by a runtime-unknown branch (e.g.
process.argv().size() > 0).- Note:
env.process.argvis currently not supported on Rust VM, andenv.getis not linked for LLVM AOT yet; keep harness fixtures minimal unless the required externs are implemented in NyRT.
- Note: