
Phase 274 P2 (impl): LLVM (llvmlite harness) TypeOp alignment

Status: planned / design-first

Goal: make LLVM harness execution match the SSOT semantics in docs/reference/language/types.md for:

  • TypeOp(Check, value, ty) → Bool
  • TypeOp(Cast, value, ty) → value or TypeError

Primary reference implementation (SSOT runtime): src/backend/mir_interpreter/handlers/type_ops.rs


0. What is currently wrong (must-fix)

LLVM harness TypeOp is stubbed in src/llvm_py/instructions/typeop.py:

  • is: returns 0 for most types (for IntegerBox it uses a “non-zero” heuristic)
  • cast/as: pass-through (never errors)

This conflicts with SSOT:

  • is must reflect actual runtime type match.
  • as must fail-fast (TypeError) on mismatch.

Note:

  • It is OK if the compiler constant-folds trivial cases (e.g. 1.is("Integer")).
  • For P2 verification, you must use a fixture that keeps TypeOp in MIR (runtime-unknown / union value).

1. Acceptance criteria (minimum)

  1. Behavior parity with Rust VM (SSOT)
  • With NYASH_LLVM_USE_HARNESS=1 and --backend llvm, this fixture behaves the same as VM:
    • apps/tests/phase274_p2_typeop_primitives_only.hako (recommended: harness-safe baseline)
  2. Fail-fast
  • as on mismatch must raise a TypeError (not return 0 / pass-through).
  • is must return 0/1 deterministically (no “unknown → 0” unless it is truly not a match).
  3. No hardcode / no new env sprawl
  • No “BoxName string match special-cases” except small alias normalization shared with frontend (IntegerBox/StringBox etc.).
  • Do not add new environment variables for behavior.

2. Design constraint: LLVM harness value representation (key risk)

In llvmlite harness, a runtime “value” is currently represented as an i64, but it mixes:

  • raw integers (from const i64)
  • boxed handles (e.g. strings are boxed to handles via nyash.box.from_i8_string)
  • various call/bridge conventions

Because a handle is also an i64, the harness cannot reliably decide at runtime whether an i64 is “raw int” or “handle”, unless the value representation is made uniform.

This means TypeOp parity cannot be achieved reliably without addressing representation.
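The ambiguity can be illustrated with a small plain-Python model. This is only a sketch: the handle table, its numbering, and the `box_string` helper are assumptions made for illustration (a stand-in for nyash.box.from_i8_string), not the harness's actual data structures.

```python
# Minimal model of the mixed representation: every value is a plain int (i64).
# The handle table and its numbering scheme are illustrative assumptions.

handle_table = {}   # handle (int) -> boxed Python object
_next_handle = 1


def box_string(text: str) -> int:
    """Hypothetical stand-in for nyash.box.from_i8_string: returns a fresh handle."""
    global _next_handle
    handle = _next_handle
    _next_handle += 1
    handle_table[handle] = text
    return handle


raw_int = 1                     # as produced by `const i64 1`
str_handle = box_string("hi")   # in this model, also the integer 1

# Both values are the same i64, so nothing in the value itself says whether
# it is "raw integer 1" or "handle to a string".
print(raw_int == str_handle)               # True
print(type(raw_int) is type(str_handle))   # True
```

Any `is`/`as` implementation that only sees the i64 has to guess, which is the heuristic the current stub relies on.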


Strategy A: uniform handle representation

Make every runtime value in llvmlite harness be a handle (i64) to a boxed value:

  • integers: nyash.box.from_i64(i64) -> handle
  • floats: nyash.box.from_f64(f64) -> handle
  • strings: already boxed (nyash.box.from_i8_string)
  • bool/void: use existing conventions (or add kernel shims if needed)

Then TypeOp becomes implementable via runtime introspection on handles.
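A rough Python model of the uniform representation is sketched below. The `HandleRegistry` class and its method names are hypothetical and only mirror the kernel entry points listed above; the real boxing happens in the kernel, not in Python.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Boxed:
    type_name: str   # e.g. "IntegerBox"
    payload: Any


class HandleRegistry:
    """Hypothetical model: every runtime value lives behind an integer handle."""

    def __init__(self) -> None:
        self._boxes: Dict[int, Boxed] = {}
        self._next = 1

    def _alloc(self, type_name: str, payload: Any) -> int:
        handle = self._next
        self._next += 1
        self._boxes[handle] = Boxed(type_name, payload)
        return handle

    def from_i64(self, value: int) -> int:       # models nyash.box.from_i64
        return self._alloc("IntegerBox", value)

    def from_f64(self, value: float) -> int:     # models nyash.box.from_f64
        return self._alloc("FloatBox", value)

    def from_string(self, value: str) -> int:    # models nyash.box.from_i8_string
        return self._alloc("StringBox", value)

    def type_of(self, handle: int) -> str:
        return self._boxes[handle].type_name


reg = HandleRegistry()
h_int = reg.from_i64(1)
h_str = reg.from_string("1")
print(reg.type_of(h_int), reg.type_of(h_str))   # IntegerBox StringBox
```

Once every i64 is a handle, the type question is answered by a lookup rather than by inspecting the integer's value.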

A.1 Kernel helper needed (small, SSOT-friendly)

Add a kernel export (in crates/nyash_kernel/src/lib.rs) that checks a handle's runtime type:

  • nyash.any.is_type_h(handle: i64, type_name: *const i8) -> i64 (0/1)
  • optionally nyash.any.cast_h(handle: i64, type_name: *const i8) -> i64 (handle or 0; but prefer fail-fast at caller)

Implementation rule:

  • Must use actual runtime object type (builtins + plugin boxes + InstanceBox class name).
  • Must not guess via resolver/type facts.
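The intended lookup rule can be modelled in Python as follows. This is a sketch under assumptions: the three dataclasses, the exact-name comparison, and the example names `Point` and `FileBox` are invented for illustration; the real helper would be a Rust export in the kernel, and invalid-handle handling is left out here.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class BuiltinBox:
    box_name: str             # e.g. "IntegerBox"


@dataclass
class InstanceBox:
    class_name: str           # user-defined class name


@dataclass
class PluginBox:
    resolved_type_name: str   # taken from plugin metadata


RuntimeObject = Union[BuiltinBox, InstanceBox, PluginBox]


def runtime_type_name(obj: RuntimeObject) -> str:
    """Name of the actual runtime object; never consults compile-time facts."""
    if isinstance(obj, BuiltinBox):
        return obj.box_name
    if isinstance(obj, InstanceBox):
        return obj.class_name
    return obj.resolved_type_name


def is_type_h(handles: Dict[int, RuntimeObject], handle: int, type_name: str) -> int:
    """Model of nyash.any.is_type_h: returns 1 on a match, else 0."""
    return 1 if runtime_type_name(handles[handle]) == type_name else 0


handles = {1: BuiltinBox("IntegerBox"), 2: InstanceBox("Point"), 3: PluginBox("FileBox")}
print(is_type_h(handles, 1, "IntegerBox"))   # 1
print(is_type_h(handles, 2, "IntegerBox"))   # 0
print(is_type_h(handles, 3, "FileBox"))      # 1
```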

A.2 LLVM harness lowering

Update src/llvm_py/instructions/typeop.py:

  • Always resolve src_val as handle (i64).
  • check/is: call nyash.any.is_type_h(src_val, type_name_ptr) → i64 0/1
  • cast/as: call is_type_h; if it returns 0, emit a runtime error: use the existing “panic”/error path if available, or call a kernel function in the style of nyash.panic.type_error (add it if missing).

Also update other lowerers incrementally so that values feeding TypeOp are handles (start with the fixture path).
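The control flow this lowering needs might look roughly like the following. To stay dependency-free, the sketch emits LLVM IR text from plain Python instead of using llvmlite builder calls. The function names, value names, and the `(i64, i8*)` signature of nyash.panic.type_error are assumptions; that kernel function may not exist yet.

```python
from typing import List


def emit_typeop_check(dst: str, src: str, type_ptr: str) -> List[str]:
    """check/is: a single kernel call whose i64 result is 0 or 1."""
    return [f"  %{dst} = call i64 @nyash.any.is_type_h(i64 %{src}, i8* %{type_ptr})"]


def emit_typeop_cast(dst: str, src: str, type_ptr: str, label: str) -> List[str]:
    """cast/as: check first, then either fail fast or pass the handle through."""
    ok, fail = f"{label}.ok", f"{label}.fail"
    return [
        f"  %{dst}.m = call i64 @nyash.any.is_type_h(i64 %{src}, i8* %{type_ptr})",
        f"  %{dst}.c = icmp ne i64 %{dst}.m, 0",
        f"  br i1 %{dst}.c, label %{ok}, label %{fail}",
        f"{fail}:",
        # Hypothetical kernel function; name and signature are assumptions.
        f"  call void @nyash.panic.type_error(i64 %{src}, i8* %{type_ptr})",
        "  unreachable",
        f"{ok}:",
        # On success the result is the original handle, unchanged.
        f"  %{dst} = add i64 %{src}, 0",
    ]


print("\n".join(emit_typeop_cast("v3", "v1", "ty", "cast0")))
```

The key point is that the mismatch branch ends in `unreachable` after the error call, so a failed cast can never fall through and return the source value.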

Strategy B (fallback): keep mixed representation, but document divergence (not parity)

If Strategy A is too large for P2, constrain scope:

  • Implement TypeOp using compile-time resolver.value_types hints.
  • Document clearly in Phase 274 README: LLVM harness TypeOp is “best-effort using type facts” and is not SSOT-correct under re-assignment.

This keeps the harness useful for SSA/CFG validation, but is not runtime-parity.

Note: Strategy B should be treated as temporary and must be called out as backend divergence in docs.
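The divergence can be shown with a toy model. It assumes that resolver.value_types behaves like a mapping from value id to a type name and that a merged (union) value has no usable entry; both are illustrative assumptions, not the resolver's real structure.

```python
from typing import Dict, Optional

# Assumed shape of compile-time facts: value id -> type name.
# Value 3 models a merge of values 1 and 2, so no single fact exists for it.
value_types: Dict[int, str] = {1: "IntegerBox", 2: "StringBox"}


def check_with_facts(value_id: int, type_name: str) -> int:
    """Strategy B style check: answers from compile-time hints only."""
    fact: Optional[str] = value_types.get(value_id)
    return 1 if fact == type_name else 0


def check_at_runtime(runtime_type: str, type_name: str) -> int:
    """SSOT style check: answers from the actual runtime type."""
    return 1 if runtime_type == type_name else 0


# Suppose the string branch was taken at runtime for value 3.
print(check_with_facts(3, "StringBox"))              # 0
print(check_at_runtime("StringBox", "StringBox"))    # 1
```

The two answers disagree for exactly the runtime-unknown values that the P2 fixture is meant to exercise.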


4. Concrete work items (P2)

  1. Audit current failure path
  • Identify how LLVM harness reports runtime errors today (type errors, asserts).
  • Prefer a single runtime helper rather than sprinkling Python exceptions.

1.5) Fix MIR JSON emission for TypeOp (required)

The LLVM harness consumes MIR JSON emitted by the Rust runner. If TypeOp is missing in that JSON, the harness will never see it (and the JSON can become invalid due to missing defs).

Checklist:

  • src/runner/mir_json_emit/mod.rs must emit {"op":"typeop", ...} in both emitters:
    • emit_mir_json_for_harness (nyash_rust::mir) already supports TypeOp
    • emit_mir_json_for_harness_bin (crate::mir) ⚠️ ensure TypeOp is included
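A quick way to confirm that TypeOp survived emission is to count instructions whose "op" is "typeop" in the emitted JSON. The walker below is a sketch that deliberately assumes nothing about the JSON layout beyond the "op" key; the sample document in it is invented for illustration and is not the real MIR JSON schema.

```python
import json
from typing import Any


def count_ops(node: Any, op_name: str) -> int:
    """Recursively count dicts whose "op" field equals op_name."""
    if isinstance(node, dict):
        own = 1 if node.get("op") == op_name else 0
        return own + sum(count_ops(child, op_name) for child in node.values())
    if isinstance(node, list):
        return sum(count_ops(child, op_name) for child in node)
    return 0


# Invented sample; only the {"op": "typeop"} shape comes from this document.
sample = json.loads(
    '{"functions": [{"blocks": [{"instructions": [{"op": "const"}, {"op": "typeop"}]}]}]}'
)
print(count_ops(sample, "typeop"))   # 1
```

A count of 0 on a fixture that is expected to keep TypeOp in MIR points at the emitter rather than at the harness.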
  2. Add kernel introspection helper(s)
  • crates/nyash_kernel/src/lib.rs: add nyash.any.is_type_h.
  • It must handle:
    • primitives boxed (IntegerBox, FloatBox, BoolBox, StringBox, VoidBox)
    • InstanceBox user classes (by class_name)
    • plugin boxes (by metadata / resolved type name)
  3. Implement real TypeOp lowering
  • src/llvm_py/instructions/typeop.py:
    • normalize target_type aliases (same mapping as frontend docs: Int → IntegerBox, etc.)
    • is → call kernel check
    • as/cast → check then return src or TypeError
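Alias normalization can stay a small table lookup. In the sketch below only Int → IntegerBox is taken from this document; the other entries and the function name are assumptions and should be replaced by whatever mapping the frontend actually uses.

```python
from typing import Dict

TYPE_ALIASES: Dict[str, str] = {
    "Int": "IntegerBox",       # stated in this document
    "Integer": "IntegerBox",   # assumed alias
    "String": "StringBox",     # assumed alias
}


def normalize_type_name(name: str) -> str:
    """Map a known alias to its box name; leave every other name untouched."""
    return TYPE_ALIASES.get(name, name)


print(normalize_type_name("Int"))          # IntegerBox
print(normalize_type_name("IntegerBox"))   # IntegerBox
print(normalize_type_name("Point"))        # Point
```

Unknown names pass through unchanged, so user classes and plugin boxes are matched by the kernel check rather than by string special-cases in the lowering.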
  4. Add LLVM smoke (integration)
  • New script (name suggestion):
    • tools/smokes/v2/profiles/integration/apps/phase274_p2_typeop_is_as_llvm.sh
  • Run:
    • NYASH_LLVM_USE_HARNESS=1 ./target/release/hakorune --backend llvm apps/tests/phase274_p2_typeop_primitives_only.hako
  • Expect: exit code 3 (same as VM).
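The smoke check itself is just "run the command, compare the exit code". The suggested script is a shell script; the logic is sketched here in Python, with `run_and_check` being a hypothetical helper name.

```python
import os
import subprocess
import sys
from typing import Dict, List


def run_and_check(cmd: List[str], extra_env: Dict[str, str], expected_exit: int) -> bool:
    """Run cmd with extra environment variables and compare its exit code."""
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(cmd, env=env, capture_output=True, text=True)
    if result.returncode != expected_exit:
        print(f"FAIL: exit {result.returncode}, expected {expected_exit}", file=sys.stderr)
        return False
    return True


# Command and expected exit code as given in this document.
HARNESS_CMD = [
    "./target/release/hakorune",
    "--backend", "llvm",
    "apps/tests/phase274_p2_typeop_primitives_only.hako",
]
HARNESS_ENV = {"NYASH_LLVM_USE_HARNESS": "1"}
EXPECTED_EXIT = 3
```

From the repository root, with a release build present, `run_and_check(HARNESS_CMD, HARNESS_ENV, EXPECTED_EXIT)` would return True once the harness matches the VM.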

5. Notes / non-goals (P2)

  • Do not implement a full static type system here.
  • Do not add rule logic to the resolver (no “guessing chains”).
  • Do not add new environment variables for behavior selection.
  • If you must limit scope, limit it by fixtures and document it in Phase 274 README as explicit divergence.

Fixture rule (important)

To avoid “TypeOp disappeared” false negatives:

  • Do not use pure compile-time constants for is/as checks.
  • Prefer a union value formed by a runtime-unknown branch (e.g. process.argv().size() > 0).
    • Note: env.process.argv is currently not supported on Rust VM, and env.get is not linked for LLVM AOT yet; keep harness fixtures minimal unless the required externs are implemented in NyRT.