docs/ci: selfhost bootstrap/exe-first workflows; add ny-llvmc scaffolding + JSON v0 schema validation; plan: unify to Nyash ABI v2 (no backwards compat)

Selfhosting Dev
2025-09-17 20:33:19 +09:00
parent a5054a271b
commit 4ea3ca2685
56 changed files with 2275 additions and 1623 deletions


@ -16,6 +16,12 @@ Protocol
- Output: `.o` object file (default path: `NYASH_AOT_OBJECT_OUT` or `NYASH_LLVM_OBJ_OUT`).
- Entry point: `ny_main() -> i64` (the return value serves as the exit code; handles are normalized when necessary).
CLI (crate)
- `ny-llvmc`, provided by `crates/nyash-llvm-compiler`, is a thin wrapper around the llvmlite harness.
- Dummy: `./target/release/ny-llvmc --dummy --out /tmp/dummy.o`
- From JSON: `./target/release/ny-llvmc --in mir.json --out out.o`
- The default harness path is `tools/llvmlite_harness.py`; override it with `--harness <path>`.
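The two invocations above can be wrapped in a small helper. This is a minimal sketch and not part of the repository: the names `ny_llvmc_command` and `run_ny_llvmc` and their parameters are hypothetical, and only the flags documented above (`--dummy`, `--in`, `--out`, `--harness`) are used.

```python
import subprocess


# Hypothetical helper (not part of the repository): builds the argument list
# for ny-llvmc using only the flags documented above.
def ny_llvmc_command(out, mir_json=None, harness=None,
                     binary="./target/release/ny-llvmc"):
    cmd = [binary]
    if mir_json is None:
        cmd.append("--dummy")               # no input given: emit the dummy object
    else:
        cmd += ["--in", str(mir_json)]      # compile from a MIR JSON file
    cmd += ["--out", str(out)]
    if harness is not None:
        cmd += ["--harness", str(harness)]  # override the default harness path
    return cmd


def run_ny_llvmc(out, **kwargs):
    # Raises subprocess.CalledProcessError if ny-llvmc exits with a non-zero status.
    subprocess.run(ny_llvmc_command(out, **kwargs), check=True)
```

Keeping command construction separate from execution makes the argument list easy to inspect or log before anything is run.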
Quick Start
- Dependency: `python3 -m pip install llvmlite`
- Dummy generation (wiring check):
@ -41,5 +47,9 @@ Notes
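- The first version may start from a fixed `ny_main` (wiring check); MIR instructions are then supported incrementally.
- The harness is self-contained (it does not depend on external state). Errors are reported in detail on stderr immediately.
Schema Validation (optional)
- The JSON v0 schema lives at `docs/reference/mir/json_v0.schema.json`.
- Validate: `python3 tools/validate_mir_json.py <mir.json>` (requires: `python3 -m pip install jsonschema`).
Appendix: Static linking
- The generated EXE statically links NyRT (`libnyrt.a`). For a fully static build (`-static`), musl is recommended (dlopen becomes unavailable, so dynamic plugins cannot be used).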


@ -55,6 +55,7 @@
- [Language Reference](reference/language/LANGUAGE_REFERENCE_2025.md)
- [Architecture Overview](reference/architecture/TECHNICAL_ARCHITECTURE_2025.md)
- [Execution Backends](reference/architecture/execution-backends.md)
- [GC Modes and Operation](reference/runtime/gc.md)
- [Plugin System](reference/plugin-system/)
- [CLI Options Quick Reference](tools/cli-options.md)


@ -0,0 +1,36 @@
Self-Hosting Pilot: Quick Guide (Phase-15)
Overview
- Goal: Run Ny→JSON v0 via the selfhost compiler path and execute with PyVM/LLVM.
- Default remains env-gated for safety; CI runs smokes to build confidence.
Recommended Flows
- Runner (pilot): `NYASH_USE_NY_COMPILER=1 ./target/release/nyash --backend vm apps/examples/string_p0.nyash`
- Emit-only: `NYASH_USE_NY_COMPILER=1 NYASH_NY_COMPILER_EMIT_ONLY=1 ...`
- EXE-first (parser EXE): `tools/build_compiler_exe.sh && NYASH_USE_NY_COMPILER=1 NYASH_USE_NY_COMPILER_EXE=1 ./target/release/nyash --backend vm apps/examples/string_p0.nyash`
- LLVM AOT: `NYASH_LLVM_USE_HARNESS=1 tools/build_llvm.sh apps/... -o app && ./app`
CI Workflows
- Selfhost Bootstrap (always): `.github/workflows/selfhost-bootstrap.yml`
  - Builds nyash (`cranelift-jit`) and runs `tools/bootstrap_selfhost_smoke.sh`.
- Selfhost EXE-first (optional): `.github/workflows/selfhost-exe-first.yml`
  - Installs LLVM 18 + llvmlite, then runs `tools/exe_first_smoke.sh`.
Useful Env Flags
- `NYASH_USE_NY_COMPILER=1`: Enable selfhost compiler pipeline.
- `NYASH_NY_COMPILER_EMIT_ONLY=1`: Print JSON v0 only (no execution).
- `NYASH_NY_COMPILER_TIMEOUT_MS=4000`: Child timeout (ms). Default 2000.
- `NYASH_USE_NY_COMPILER_EXE=1`: Prefer external parser EXE.
- `NYASH_NY_COMPILER_EXE_PATH=<path>`: Override EXE path.
- `NYASH_SELFHOST_READ_TMP=1`: Child reads `tmp/ny_parser_input.ny` when supported.
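The flags above compose naturally into a child-process environment. The following is a minimal sketch under stated assumptions: the helper name `selfhost_env` and its parameters are hypothetical (they do not exist in the repository), and only environment variables listed above are set.

```python
import os


# Hypothetical helper (not part of the repository): returns a copy of the
# environment with the selfhost pilot flags from the list above applied.
def selfhost_env(emit_only=False, use_exe=False, exe_path=None,
                 timeout_ms=None, base=None):
    env = dict(os.environ if base is None else base)
    env["NYASH_USE_NY_COMPILER"] = "1"            # enable the selfhost pipeline
    if emit_only:
        env["NYASH_NY_COMPILER_EMIT_ONLY"] = "1"  # print JSON v0, do not execute
    if use_exe:
        env["NYASH_USE_NY_COMPILER_EXE"] = "1"    # prefer the external parser EXE
        if exe_path is not None:
            env["NYASH_NY_COMPILER_EXE_PATH"] = str(exe_path)
    if timeout_ms is not None:
        env["NYASH_NY_COMPILER_TIMEOUT_MS"] = str(int(timeout_ms))
    return env
```

A runner invocation could then pass the result to `subprocess.run([...], env=selfhost_env(timeout_ms=4000), check=True)` with the `nyash` command line from Recommended Flows.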
Troubleshooting (short)
- No Python found: install `python3` (PyVM / harness).
- No `llvm-config-18`: install LLVM 18 dev (see EXE-first workflow).
- llvmlite import error: `python3 -m pip install llvmlite`.
- Parser child timeout: raise `NYASH_NY_COMPILER_TIMEOUT_MS`.
- EXE-first bridge mismatch: re-run with `NYASH_CLI_VERBOSE=1` and keep `dist/nyash_compiler/sample.json` for inspection.
Notes
- JSON v0 schema is stable but not yet versioned; validation is planned.
- Default backend `vm` maps to PyVM unless legacy VM features are enabled.


@ -0,0 +1,96 @@
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://nyash.dev/schema/mir/json_v0.schema.json",
  "title": "Nyash MIR JSON v0",
  "type": "object",
  "additionalProperties": true,
  "properties": {
    "schema_version": { "type": ["integer", "string"] },
    "functions": {
      "oneOf": [
        { "$ref": "#/definitions/functionList" },
        { "$ref": "#/definitions/functionMap" }
      ]
    }
  },
  "required": ["functions"],
  "definitions": {
    "functionList": {
      "type": "array",
      "items": { "$ref": "#/definitions/function" }
    },
    "functionMap": {
      "type": "object",
      "additionalProperties": { "$ref": "#/definitions/functionBody" }
    },
    "function": {
      "type": "object",
      "additionalProperties": true,
      "properties": {
        "name": { "type": "string" },
        "params": {
          "type": "array",
          "items": {
            "type": "object",
            "additionalProperties": true,
            "properties": {
              "name": { "type": "string" },
              "type": { "type": "string" }
            },
            "required": ["name", "type"]
          }
        },
        "return_type": { "type": "string" },
        "entry_block": { "type": ["integer", "string"] },
        "blocks": { "$ref": "#/definitions/blocks" }
      },
      "required": ["name", "blocks"]
    },
    "functionBody": {
      "type": "object",
      "additionalProperties": true,
      "properties": {
        "params": { "$ref": "#/definitions/function/properties/params" },
        "return_type": { "type": "string" },
        "entry_block": { "type": ["integer", "string"] },
        "blocks": { "$ref": "#/definitions/blocks" }
      },
      "required": ["blocks"]
    },
    "blocks": {
      "oneOf": [
        {
          "type": "array",
          "items": { "$ref": "#/definitions/block" }
        },
        {
          "type": "object",
          "additionalProperties": { "$ref": "#/definitions/block" }
        }
      ]
    },
    "block": {
      "type": "object",
      "additionalProperties": true,
      "properties": {
        "id": { "type": ["integer", "string"] },
        "instructions": { "$ref": "#/definitions/instructions" },
        "terminator": { "$ref": "#/definitions/instruction" }
      },
      "required": ["id", "instructions"]
    },
    "instructions": {
      "type": "array",
      "items": { "$ref": "#/definitions/instruction" }
    },
    "instruction": {
      "type": "object",
      "additionalProperties": true,
      "properties": {
        "kind": { "type": "string" }
      },
      "required": ["kind"]
    }
  }
}
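
To make the shape the schema accepts concrete, here is a minimal sketch of a structural check written with the standard library only. It is a simplified stand-in for real JSON Schema validation (it is not `tools/validate_mir_json.py`), it checks only the `required` keys and the list/map alternatives, and it ignores `params`, `return_type`, and `entry_block`. The function name `check_mir_v0` and the instruction kinds `const` and `ret` in the example are assumptions for illustration.

```python
# Hypothetical stdlib-only check mirroring the schema's required fields.
def check_mir_v0(doc):
    """Return a list of problems; an empty list means the shape looks valid."""
    errors = []

    def check_instruction(ins, where):
        if not isinstance(ins, dict) or not isinstance(ins.get("kind"), str):
            errors.append(f"{where}: instruction needs a string 'kind'")

    def check_block(block, where):
        if not isinstance(block, dict):
            errors.append(f"{where}: block must be an object")
            return
        bid = block.get("id")
        if isinstance(bid, bool) or not isinstance(bid, (int, str)):
            errors.append(f"{where}: 'id' must be an integer or string")
        instructions = block.get("instructions")
        if not isinstance(instructions, list):
            errors.append(f"{where}: 'instructions' must be an array")
        else:
            for i, ins in enumerate(instructions):
                check_instruction(ins, f"{where}.instructions[{i}]")
        if "terminator" in block:
            check_instruction(block["terminator"], f"{where}.terminator")

    def check_blocks(blocks, where):
        if isinstance(blocks, list):
            items = enumerate(blocks)      # array form
        elif isinstance(blocks, dict):
            items = blocks.items()         # map form
        else:
            errors.append(f"{where}: 'blocks' must be an array or object")
            return
        for key, block in items:
            check_block(block, f"{where}.blocks[{key}]")

    functions = doc.get("functions") if isinstance(doc, dict) else None
    if isinstance(functions, list):        # functionList: each needs 'name'
        for i, fn in enumerate(functions):
            where = f"functions[{i}]"
            if not isinstance(fn, dict):
                errors.append(f"{where}: function must be an object")
                continue
            if not isinstance(fn.get("name"), str):
                errors.append(f"{where}: 'name' must be a string")
            check_blocks(fn.get("blocks"), where)
    elif isinstance(functions, dict):      # functionMap: the key is the name
        for name, body in functions.items():
            where = f"functions[{name!r}]"
            if not isinstance(body, dict):
                errors.append(f"{where}: function body must be an object")
                continue
            check_blocks(body.get("blocks"), where)
    else:
        errors.append("'functions' must be an array or object")
    return errors


# Smallest list-form document; the 'kind' values are illustrative only.
example = {
    "functions": [
        {"name": "main",
         "blocks": [{"id": 0,
                     "instructions": [{"kind": "const"}],
                     "terminator": {"kind": "ret"}}]}
    ]
}
```

Because every level sets `additionalProperties: true`, extra keys are accepted everywhere; the sketch likewise reports only missing or mistyped required fields.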


@ -0,0 +1,62 @@
Nyash GC Modes — Design and Usage
Overview
- Nyash adopts a pragmatic GC strategy that balances safety, performance, and simplicity.
- Default is reference counting with a periodic cycle collector; advanced modes exist for tuning and debugging.
User-Facing Modes (recommended)
- rc+cycle (default, safe)
  - Reference counting with periodic cycle detection/collection.
  - Recommended for most applications; memory leaks from cycles are handled.
- minorgen (high-performance)
  - Lightweight generational GC: moving nursery (Gen0), non-moving upper generations.
  - Write barrier (old→new) is minimal; plugin/FFI objects remain non-moving via handle indirection.
Advanced Modes (for language dev/debug)
- stw (debug/verification)
  - Non-moving stop-the-world mark-and-sweep. Useful for strict correctness checks and leak cause isolation.
- rc (baseline for comparisons)
  - rc+cycle with cycle detection disabled. For performance comparisons or targeted debugging.
- off (expert, self-responsibility)
  - Cycle detection and tracing off. Use only when cycles are guaranteed not to occur. Not recommended for long-running services.
Selection & Precedence
- CLI: `--gc {auto,rc+cycle,minorgen,stw,rc,off}` (auto = rc+cycle)
- ENV: `NYASH_GC_MODE` (overridden by CLI)
- nyash.toml [env] applies last
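
The CLI-over-ENV rule can be sketched as follows. This is an illustrative sketch, not the runtime's implementation: the function name `resolve_gc_mode` is hypothetical, rejecting unknown values with an error is an assumption (the actual behavior is not stated here), and the `nyash.toml [env]` layer is deliberately left out.

```python
import os

# Mode names accepted by --gc, as listed above.
GC_MODES = ("auto", "rc+cycle", "minorgen", "stw", "rc", "off")


# Hypothetical resolver: CLI value wins over NYASH_GC_MODE; 'auto' maps to rc+cycle.
def resolve_gc_mode(cli_value=None, env=None):
    env = os.environ if env is None else env
    raw = cli_value if cli_value is not None else env.get("NYASH_GC_MODE", "auto")
    if raw not in GC_MODES:
        # Assumption: unknown modes are treated as errors in this sketch.
        raise ValueError(f"unknown GC mode: {raw!r}")
    return "rc+cycle" if raw == "auto" else raw
```

For example, `resolve_gc_mode("stw", env={"NYASH_GC_MODE": "rc"})` picks the CLI value, while a call with neither source set falls through to `auto` and therefore rc+cycle.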
Instrumentation & Diagnostics
- `NYASH_GC_METRICS=1`: print brief metrics (allocs/bytes/cycles/pauses)
- `NYASH_GC_METRICS_JSON=1`: emit JSON metrics for CI/aggregation
- `NYASH_GC_LEAK_DIAG=1`: on exit, dump suspected unreleased objects (Top-K by type/site)
- `NYASH_GC_ALLOC_THRESHOLD=<N>`: warn or fail when allocations/bytes exceed threshold
Operational Guidance
- Default: rc+cycle for stable operations.
- Try minorgen when throughput/latency matter; it will fall back to rc+cycle on unsupported platforms or when plugin objects are declared non-moving.
- off/rc are for special cases only; prefer enabling leak diagnostics when using them in development.
Implementation Roadmap (Stepwise)
1) Wiring & Observability
  - Introduce `GcMode`, `GcController`, unify roots (handles, globals, frames) and safepoints.
  - Add `LeakRegistry` (allocation ledger) and exit-time dump.
  - Ship rc+cycle (trial deletion) behind the controller (dev default can be rc+cycle).
2) minorgen (nursery)
  - Moving Gen0 with simple promotion; upper generations use non-moving mark-sweep.
  - Minimal write barrier (old→new card marking). Plugin/FFI objects remain non-moving.
3) stw (dev verify)
  - Non-moving STW mark-and-sweep for correctness checks.
Notes
- Safepoint and barrier MIR ops already exist and are reused as GC coordination hooks.
- Handle indirection keeps future moving GCs compatible with plugin/FFI boundaries.
LLVM Safepoints
- Automatic safepoint insertion can be toggled for the LLVM harness/backend:
  - `NYASH_LLVM_AUTO_SAFEPOINT=1` enables insertion (default 1)
  - Injection points: loop headers, function calls, externcalls, and selected boxcalls.
- Safepoints call `ny_check_safepoint`/`ny_safepoint` in NyRT, which forwards to runtime hooks (GC.safepoint + scheduler poll).
Controller & Metrics
- The unified GcController implements GcHooks and aggregates metrics (safepoints/read/write/alloc).
- CountingGc is a thin wrapper around GcController for compatibility.


@ -16,6 +16,19 @@
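- `--vm-stats`: Enable VM instruction statistics (`NYASH_VM_STATS=1`)
- `--vm-stats-json`: Output VM statistics as JSON (`NYASH_VM_STATS_JSON=1`)
## GC
- `--gc {auto|rc+cycle|minorgen|stw|rc|off}`: GC mode (default: `auto` → rc+cycle)
  - `rc+cycle`: reference counting + cycle collection (recommended, stable)
  - `minorgen`: lightweight generational GC for high performance (moving Gen0, non-moving upper generations)
  - `stw`: non-moving mark-sweep for verification (for developers)
  - `rc`: RC without cycle collection (for comparison)
  - `off`: at-your-own-risk mode (cycles leak)
- Related ENV
  - `NYASH_GC_MODE` (CLI takes precedence)
  - `NYASH_GC_METRICS` / `NYASH_GC_METRICS_JSON`
  - `NYASH_GC_LEAK_DIAG` / `NYASH_GC_ALLOC_THRESHOLD`
- Details: `docs/reference/runtime/gc.md`
## WASM/AOT
- `--compile-wasm`: Output WAT
- `--compile-native` / `--aot`: Output an AOT executable (requires wasm-backend)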