feat(perf): add Phase 21.8 foundation for IntArrayCore/MatI64 numeric boxes

Prepare infrastructure for specialized numeric array benchmarking:

- Add IntArrayCore plugin stub (crates/nyash_kernel/src/plugin/intarray.rs)
- Add IntArrayCore/MatI64 box definitions (lang/src/runtime/numeric/)
- Add Phase 21.8 documentation and task tracking
- Update nyash.toml/hako.toml with numeric library configuration
- Extend microbench.sh for matmul_core benchmark case

Next: Resolve Stage-B MirBuilder to recognize MatI64/IntArrayCore as boxes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@ -4,7 +4,7 @@
- Perform preprocessing optimizations (structural only) on the .hako side (AotPrep) to lighten the IR handed to LLVM/AOT.
- Behavior is unchanged by default (opt‑in). A Return purification guard ensures safety.

Checklist
Checklist (where things landed as of 21.5)
- [x] Pass split (StrlenFold / LoopHoist / ConstDedup / CollectionsHot / BinopCSE)
- [x] Introduce CollectionsHot (Array/Map) (default OFF)
- [x] Map key mode `NYASH_AOT_MAP_KEY_MODE={h|i64|hh|auto}`
@ -17,6 +17,11 @@
- [ ] Idempotence (output stays unchanged on re-run, using already-rewritten tags)
- [ ] `arraymap`/`matmul` ≤ 125% (relative to C)

Notes (21.5 closing)
- linidx/maplin and similar "linear index + Array/Map" cases reach roughly C≈100% with CollectionsHot + hoist/CSE.
- arraymap: converting the Array/Map portion to externcall made progress, but string key generation (toString / `"k"+idx`) and the hash path dominate, so the phase closed with this case resting on fundamentally different premises from C's plain int[].
- matmul: CollectionsHot is effective on its own, but the matrix product itself is ArrayBox-based, and without Core numeric boxes it did not reach the 80% target. This is handled in 21.6 and later by introducing "Core numeric boxes + matrix boxes".

Toggles
- `NYASH_MIR_LOOP_HOIST=1` … enables StrlenFold/LoopHoist/ConstDedup/BinopCSE
- `NYASH_AOT_COLLECTIONS_HOT=1` … CollectionsHot (Array/Map)
@ -0,0 +1,90 @@
# Phase 21.6 — Core Numeric Boxes (Draft)

Status: proposal (to refine at 21.6 kickoff)

## Goal

Provide explicit, low‑level numeric boxes that:

- Give Nyash a “fair” core for int/f64 benchmarks against C.
- Stay compatible with the existing ArrayBox API (no breaking changes).
- Can be used both explicitly in `.hako` and (later) as conservative AotPrep targets.

This phase focuses on design + minimal implementation; aggressive auto‑rewrites stay behind opt‑in flags.

## Scope (21.6)

- Design and add **IntArrayCore** numeric core (NyRT + Hako wrapper):
  - NyRT: `IntArrayCore` box (Rust) with internal layout `Vec<i64>` (contiguous, row‑major semantics).
  - Hako: `IntArrayCoreBox` in `nyash.core.numeric.intarray`, wrapping NyRT via externcall:
    - `static new(len: i64) -> IntArrayCoreBox` → `nyash.intarray.new_h`
    - `length(self) -> i64` → `nyash.intarray.len_h`
    - `get_unchecked(self, idx: i64) -> i64` → `nyash.intarray.get_hi`
    - `set_unchecked(self, idx: i64, v: i64)` → `nyash.intarray.set_hii`
  - Semantics: i64‑only, fixed length (no structural changes). Bounds checking is confined to the NyRT side (Fail‑Fast); the Hako side stays a thin wrapper dedicated to numeric kernels.

- Design and add **MatI64** (matrix box) on top of IntArrayCore:
  - Internal layout: `rows: i64`, `cols: i64`, `stride: i64`, `core: IntArrayCoreBox`.
  - Minimal API:
    - `new(rows: i64, cols: i64) -> MatI64`
    - `rows(self) -> i64`, `cols(self) -> i64`
    - `at(self, r: i64, c: i64) -> i64`
    - `set(self, r: i64, c: i64, v: i64)`
  - Provide one reference implementation:
    - `MatOps.matmul_naive(a: MatI64, b: MatI64) -> MatI64` (O(n³), clear structure, not tuned).

- Bench alignment:
  - Add `matmul_core` benchmark:
    - Nyash: MatI64 + IntArrayCore implementation.
    - C: struct `{ int64_t *ptr; int64_t rows; int64_t cols; int64_t stride; }` + helper `get/set`.
  - Keep existing `matmul` (ArrayBox vs raw `int*`) as “language‑level” benchmark.
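
The layering in this scope list (a fixed-length i64 core, with a matrix box on top that owns it) can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the NyRT implementation: it models neither handles nor the `nyash.intarray.*` externs, and the panic-on-out-of-range behaviour is only one reading of "Fail‑Fast". The method bodies, the `checked` and `offset` helpers, and the zero-initialization are all assumptions.

```rust
use std::convert::TryFrom;
use std::sync::RwLock;

/// Illustrative stand-in for the NyRT core: fixed-length, i64-only storage.
pub struct IntArrayCore {
    data: RwLock<Vec<i64>>,
}

impl IntArrayCore {
    /// Allocates `len` zeroed elements; a negative length fails fast.
    pub fn new(len: i64) -> Self {
        let len = usize::try_from(len).expect("IntArrayCore::new: negative length");
        IntArrayCore { data: RwLock::new(vec![0; len]) }
    }

    pub fn len(&self) -> i64 {
        self.data.read().unwrap().len() as i64
    }

    /// Bounds-checked read; an out-of-range index panics (assumed fail-fast policy).
    pub fn get(&self, idx: i64) -> i64 {
        let data = self.data.read().unwrap();
        let i = Self::checked(idx, data.len());
        data[i]
    }

    /// Bounds-checked write; the length never changes after construction.
    pub fn set(&self, idx: i64, v: i64) {
        let mut data = self.data.write().unwrap();
        let i = Self::checked(idx, data.len());
        data[i] = v;
    }

    /// Hypothetical helper: converts an i64 index to usize or panics.
    fn checked(idx: i64, len: usize) -> usize {
        match usize::try_from(idx) {
            Ok(i) if i < len => i,
            _ => panic!("IntArrayCore: index {} out of bounds (len {})", idx, len),
        }
    }
}

/// Illustrative matrix box: 2D shape and indexing over an owned IntArrayCore.
pub struct MatI64 {
    rows: i64,
    cols: i64,
    stride: i64,
    core: IntArrayCore,
}

impl MatI64 {
    pub fn new(rows: i64, cols: i64) -> Self {
        assert!(rows >= 0 && cols >= 0, "MatI64::new: negative dimension");
        MatI64 { rows, cols, stride: cols, core: IntArrayCore::new(rows * cols) }
    }

    pub fn rows(&self) -> i64 { self.rows }
    pub fn cols(&self) -> i64 { self.cols }

    pub fn at(&self, r: i64, c: i64) -> i64 {
        self.core.get(self.offset(r, c))
    }

    pub fn set(&self, r: i64, c: i64, v: i64) {
        self.core.set(self.offset(r, c), v)
    }

    /// Row-major linear index; checks the 2D shape before the core checks the flat range.
    fn offset(&self, r: i64, c: i64) -> i64 {
        assert!(0 <= r && r < self.rows && 0 <= c && c < self.cols, "MatI64: index out of bounds");
        r * self.stride + c
    }
}
```

In this sketch `MatI64` holds no element data of its own; every access becomes a flat index into the core, which is the split between "muscle" box and shape box that the scope describes.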

Out of scope for 21.6:

- Auto‑rewrite from `ArrayBox` → `IntArrayCore` / `MatI64` in AotPrep (only sketched, not default).
- SIMD / blocked matmul / cache‑tuned kernels (can be separate optimization phases).
- f64/complex variants (only type skeletons, if any).

## Design Notes

- **Layering**
  - Core: IntArrayCore (and future F64ArrayCore) are “muscle” boxes: minimal, numeric‑only. Exposed as IntArrayCore (Rust) in NyRT and as IntArrayCoreBox in Hako.
  - Matrix: MatI64 expresses 2D shape and indexing; it owns an IntArrayCoreBox.
  - High‑level: ArrayBox / MapBox / existing user APIs remain unchanged.

- **Hako ABI vs Nyash implementation**
  - IntArrayCore lives as a NyRT box (C/Rust implementation) exposed via Hako ABI (`nyash.intarray.*`).
  - IntArrayCoreBox, MatI64 and MatOps are written in Nyash, calling IntArrayCore via externcall while exposing boxcall APIs to user code.
  - This keeps heavy lifting in NyRT while keeping the 2D semantics in `.hako`.

- **Fair C comparison**
  - For `matmul_core`, C should mirror IntArrayCore/MatI64:
    - Same struct layout (ptr + len / rows + cols + stride).
    - Same naive O(n³) algorithm.
  - This separates:
    - “Nyash vs C as languages” → existing `matmul` (ArrayBox vs `int*`).
    - “Core numeric kernel parity” → new `matmul_core` (IntArrayCore vs equivalent C).
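
To make the "same layout, same naive O(n³) algorithm" requirement concrete, here is a minimal Rust sketch of the kernel shape that both sides of `matmul_core` are meant to share. `MatView`, `zeros` and `from_rows` are hypothetical names; an owned `Vec<i64>` stands in for the raw `int64_t *ptr`, and wrapping arithmetic is an assumption about overflow behaviour that the draft does not specify.

```rust
/// Hypothetical mirror of the `ptr + rows + cols + stride` layout.
/// An owned buffer stands in for the raw pointer.
pub struct MatView {
    pub data: Vec<i64>,
    pub rows: usize,
    pub cols: usize,
    pub stride: usize,
}

impl MatView {
    /// Zero-filled matrix with a dense row-major layout (stride == cols).
    pub fn zeros(rows: usize, cols: usize) -> Self {
        MatView { data: vec![0; rows * cols], rows, cols, stride: cols }
    }

    /// Builds a matrix from row-major values.
    pub fn from_rows(rows: usize, cols: usize, values: &[i64]) -> Self {
        assert_eq!(values.len(), rows * cols, "value count must match shape");
        MatView { data: values.to_vec(), rows, cols, stride: cols }
    }

    #[inline]
    pub fn get(&self, r: usize, c: usize) -> i64 {
        self.data[r * self.stride + c]
    }

    #[inline]
    pub fn set(&mut self, r: usize, c: usize, v: i64) {
        self.data[r * self.stride + c] = v;
    }
}

/// Naive triple loop: no blocking, no SIMD, no transposition.
pub fn matmul_naive(a: &MatView, b: &MatView) -> MatView {
    assert_eq!(a.cols, b.rows, "inner dimensions must agree");
    let mut out = MatView::zeros(a.rows, b.cols);
    for i in 0..a.rows {
        for j in 0..b.cols {
            let mut acc: i64 = 0;
            for k in 0..a.cols {
                // Wrapping arithmetic is an assumption, chosen so overflow cannot panic.
                acc = acc.wrapping_add(a.get(i, k).wrapping_mul(b.get(k, j)));
            }
            out.set(i, j, acc);
        }
    }
    out
}
```

Keeping the loop order and the `r * stride + c` indexing identical on both sides is what lets the benchmark compare the numeric core alone, separately from the ArrayBox-based `matmul` case.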

## AotPrep / Future Work (21.6+)

Not for default in 21.6, but to keep in mind:

- Add conservative patterns in Collections/AotPrep to detect:
  - `ArrayBox<i64>` with:
    - Fixed length.
    - No structural mutations after initialization.
    - Access patterns of the form `base + i*cols + j` (or similar linear forms).
  - Allow opt‑in rewrite from such patterns to IntArrayCore/MatI64 calls.

- Keep all auto‑rewrites:
  - Behind env toggles (e.g. `NYASH_AOT_INTARRAY_CORE=1`).
  - Semantics‑preserving by construction; fall back to ArrayBox path when unsure.
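
The conditions above amount to a conservative gate: rewrite only when the toggle is set and every fact is positively established, otherwise stay on the ArrayBox path. A minimal Rust sketch of that decision follows. `ArrayUsage`, its fields, and both functions are hypothetical names for illustration; how AotPrep would actually collect these facts is not specified in this draft.

```rust
/// Hypothetical facts an analysis pass might have established about one ArrayBox value.
/// A `false` value means "not proven", not "proven false".
pub struct ArrayUsage {
    pub elements_are_i64: bool,
    pub fixed_length: bool,
    pub no_structural_mutation_after_init: bool,
    pub all_accesses_linear: bool,
}

/// Reads the example toggle named in the text; `None` when it is unset.
pub fn toggle_from_env() -> Option<String> {
    std::env::var("NYASH_AOT_INTARRAY_CORE").ok()
}

/// Returns true only when the toggle is exactly "1" and every condition is proven.
/// Any doubt keeps the existing ArrayBox path.
pub fn may_rewrite_to_intarray_core(toggle: Option<&str>, usage: &ArrayUsage) -> bool {
    if toggle != Some("1") {
        return false; // opt-in only: default behaviour stays unchanged
    }
    usage.elements_are_i64
        && usage.fixed_length
        && usage.no_structural_mutation_after_init
        && usage.all_accesses_linear
}
```

A caller would pass `toggle_from_env().as_deref()` as the first argument; taking the toggle as a parameter keeps the decision itself testable without touching the process environment.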

## Open Questions for 21.6 Kickoff

- Exact module names:
  - `nyash.core.intarray` / `nyash.core.matrix` vs `nyash.linalg.*`.
- Bounds checking policy for IntArrayCore:
  - Always on (fail‑fast) vs dev toggle for light checks in hot loops.
- Interop:
  - Whether MatI64 should expose its IntArrayCore (e.g. `as_core_row_major()`) for advanced users.
85
docs/development/roadmap/phases/phase-21.8/README.md
Normal file
@ -0,0 +1,85 @@
# Phase 21.8 — Numeric Core Integration & Builder Support

Status: proposal (to hand off to Claude Code)

## Goal

Integrate the new numeric core boxes (IntArrayCore + MatI64) into the Hakorune selfhost chain so that:

- Stage‑B → MirBuilder → ny‑llvmc(crate) can emit MIR(JSON) and EXE for code that uses:
  - `using nyash.core.numeric.intarray as IntArrayCore`
  - `using nyash.core.numeric.matrix_i64 as MatI64`
- The `matmul_core` microbench (MatI64 + IntArrayCore) runs end‑to‑end in EXE mode and can be compared fairly against a matching C implementation.

21.6 provides the core boxes; 21.8 focuses on wiring them into the builder/runtime chain without changing default behaviour for other code.

## Scope (21.8, this host)

- Stage‑B / MirBuilder:
  - Ensure `MatI64` and `IntArrayCore` are recognized as valid boxes when referenced via:
    - `using nyash.core.numeric.matrix_i64 as MatI64`
    - `using nyash.core.numeric.intarray as IntArrayCore`
  - Fix the current provider‑emit failure:
    - Error today: `[mirbuilder/parse/error] undefined variable: MatI64` during `env.mirbuilder.emit`.
    - Diagnose and adjust Stage‑B / MirBuilder so that static box references (`MatI64.new`, `A.mul_naive`) compile in the same way as other boxes.

- AotPrep / emit pipeline:
  - Keep AotPrep unchanged for now; the goal is to make `tools/hakorune_emit_mir.sh` succeed on `matmul_core` sources without special‑casing.
  - Ensure `tools/hakorune_emit_mir.sh`, run with `HAKO_APPLY_AOT_PREP=1 NYASH_AOT_COLLECTIONS_HOT=1 NYASH_LLVM_FAST=1 NYASH_MIR_LOOP_HOIST=1`, can emit valid MIR(JSON) for MatI64/IntArrayCore code.

- Microbench integration:
  - Finish wiring `matmul_core` in `tools/perf/microbench.sh`:
    - Hako side: MatI64/IntArrayCore based O(n³) matmul (`MatI64.mul_naive`).
    - C side: `MatI64Core { int64_t *ptr; int64_t rows; int64_t cols; int64_t stride; }` with identical algorithm.
  - Accept that performance may still be far from the 80% target; 21.8 focuses on **structural integration and parity**, not tuning.

Out of scope:

- New optimizations inside AotPrep / CollectionsHot.
- SIMD/blocked matmul kernels (to be handled in a later optimization phase).
- f64/complex matrix variants.

## Tasks for implementation (Claude Code)

1) **Fix MatI64 visibility in Stage‑B / MirBuilder**
   - Reproduce the current failure:
     - Use a small `.hako` like:
       - `using nyash.core.numeric.matrix_i64 as MatI64`
       - `static box Main { method main(args) { local n = 4; local A = MatI64.new(n,n); return A.at(0,0); } }`
     - Confirm `env.mirbuilder.emit` reports `undefined variable: MatI64`.
   - Investigate how modules from `nyash.toml` (`"nyash.core.numeric.matrix_i64" = "lang/src/runtime/numeric/mat_i64_box.hako"`) are made visible to Stage‑B and MirBuilder.
   - Adjust the resolver / module prelude so that `MatI64` (and `IntArrayCore`) are treated like other core boxes:
     - Either via explicit prelude inclusion,
     - Or via module registry entries consumed by the builder.

2) **Ensure `tools/hakorune_emit_mir.sh` can emit MIR(JSON) for matmul_core**
   - Once MatI64 is visible, run:
     - `HAKO_APPLY_AOT_PREP=1 NYASH_AOT_COLLECTIONS_HOT=1 NYASH_LLVM_FAST=1 NYASH_MIR_LOOP_HOIST=1 NYASH_JSON_ONLY=1 tools/hakorune_emit_mir.sh <matmul_core.hako> tmp/matmul_core.json`
   - Acceptance:
     - No `undefined variable: MatI64` / `IntArrayCore` errors.
     - `tmp/matmul_core.json` is valid MIR(JSON) (same schema as existing matmul case).

3) **Finish `matmul_core` microbench**
   - Use the existing skeleton in `tools/perf/microbench.sh` (`case matmul_core`):
     - Confirm Hako side compiles and runs under `--backend vm`.
     - Confirm EXE path works:
       - `NYASH_SKIP_TOML_ENV=1 NYASH_LLVM_SKIP_BUILD=1 tools/perf/microbench.sh --case matmul_core --backend llvm --exe --runs 1 --n 64`
   - Update `benchmarks/README.md`:
     - Add `matmul_core` row with a short description:
       - “MatI64/IntArrayCore vs MatI64Core C struct (ptr+rows+cols+stride)”
     - Record initial ratios (even if far from 80%).

4) **Keep existing behaviour stable**
   - No changes to default user behaviour, env toggles, or existing benches beyond adding `matmul_core`.
   - Ensure quick/profile smokes (where applicable) remain green with numeric core present.
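
For Task 1, the "module registry entries consumed by the builder" option can be pictured as follows: a `using <module> as <alias>` line binds the alias in the name scope, so a later static box reference resolves instead of failing with `undefined variable`. The Rust sketch below is purely illustrative. `ModuleRegistry`, `Scope`, `bind_using` and `resolve_box` are hypothetical names, and nothing here reflects how Stage‑B or MirBuilder are actually structured.

```rust
use std::collections::HashMap;

/// Hypothetical registry: module name -> source path, in the style of a `nyash.toml` entry.
#[derive(Default)]
pub struct ModuleRegistry {
    modules: HashMap<String, String>,
}

impl ModuleRegistry {
    pub fn register(&mut self, module: &str, path: &str) {
        self.modules.insert(module.to_string(), path.to_string());
    }
}

/// Hypothetical name scope filled in while `using` lines are processed.
#[derive(Default)]
pub struct Scope {
    boxes: HashMap<String, String>, // alias -> source path
}

impl Scope {
    /// Handles `using <module> as <alias>` by binding the alias to the module's path.
    pub fn bind_using(
        &mut self,
        registry: &ModuleRegistry,
        module: &str,
        alias: &str,
    ) -> Result<(), String> {
        let path = registry
            .modules
            .get(module)
            .ok_or_else(|| format!("unknown module: {}", module))?;
        self.boxes.insert(alias.to_string(), path.clone());
        Ok(())
    }

    /// Resolves a static box reference such as the `MatI64` in `MatI64.new(n, n)`.
    pub fn resolve_box(&self, name: &str) -> Result<&str, String> {
        self.boxes
            .get(name)
            .map(|path| path.as_str())
            .ok_or_else(|| format!("undefined variable: {}", name))
    }
}
```

In this model the reported failure corresponds to `resolve_box` being called before any `bind_using` has run for the alias, which is one plausible reading of the symptom and worth checking against the real resolver.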

## Notes

- 21.6 already introduced:
  - NyRT `IntArrayCore` (`Vec<i64>` + `RwLock`) and handle‑based externs (`nyash.intarray.*`).
  - Hako wrappers `IntArrayCore` and `MatI64` in `lang/src/runtime/numeric/`.
  - `nyash.toml` module aliases for `nyash.core.numeric.intarray` and `nyash.core.numeric.matrix_i64`.
- 21.8 is about wiring these into the builder/emit chain so that Hakorune can compile and benchmark numeric core code end‑to‑end.