builder/vm: stabilize json_lint_vm under unified calls

- Fix condition_fn resolution: Value call path + dev safety + stub injection
- VM bridge: handle Method::birth via BoxCall; ArrayBox push/get/length/set direct bridge
- Receiver safety: pin receiver in method_call_handlers to avoid undefined use across blocks
- Local vars: materialize on declaration (use init ValueId; void for uninit)
- Prefer legacy BoxCall for Array/Map/String/user boxes in emit_box_or_plugin_call (stability-first)
- Test runner: update LLVM hint to llvmlite harness (remove LLVM_SYS_180_PREFIX guidance)
- Docs/roadmap: update CURRENT_TASK with unified default-ON + guards

Note: NYASH_DEV_BIRTH_INJECT_BUILTINS=1 can re-enable builtin birth() injection during migration.
nyash-codex
2025-09-28 12:19:49 +09:00
parent 41a46b433d
commit 510f4cf523
74 changed files with 2846 additions and 825 deletions

View File

@ -1,172 +1,14 @@
# Nyash JSON Native
Layer Guard — json_native
> Replaces yyjson (a C dependency) with a pure Nyash implementation, eliminating all external dependencies
Scope and responsibility
- This layer implements a minimal native JSON library in Ny.
- Responsibilities: scanning, tokenizing, and parsing JSON; building node structures.
- Forbidden: runtime/VM specifics, code generation, non-JSON language concerns.
## 🎯 Project goals
Imports policy (SSOT)
- Dev/CI: file-using allowed for development convenience.
- Prod: use only `nyash.toml` using entries (no ad hoc file imports).
- **Zero C dependencies**: move off yyjson entirely
- **Everything is Box**: implement everything as Nyash Boxes
- **80/20 rule**: working code first, optimization later
- **Staged switchover**: `NYASH_JSON_PROVIDER=nyash|yyjson|serde`
## 📦 Architecture
### 🔍 yyjson analysis
```c
// Core yyjson design pattern
struct yyjson_val {
uint64_t tag; // type + subtype + length
yyjson_val_uni uni; // payload union
};
```
### 🎨 Nyash design (Everything is Box)
```nyash
// 🌟 Box representing a JSON value
box JsonNode {
kind: StringBox // "null"|"bool"|"int"|"string"|"array"|"object"
value: Box // the actual value (depends on kind)
children: ArrayBox // for arrays and objects (optional)
keys: ArrayBox // key array for objects (optional)
}
// 🔧 JSON lexer
box JsonLexer {
text: StringBox // input string
pos: IntegerBox // current position
tokens: ArrayBox // token array
}
// 🏗️ JSON parser
box JsonParser {
lexer: JsonLexer // lexer
current: IntegerBox // current token position
}
```
## 📂 File layout
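```
apps/lib/json_native/
├── README.md # this design document
├── lexer.nyash # JSON lexer (state-machine based)
├── parser.nyash # JSON parser (recursive descent)
├── node.nyash # JsonNode implementation (Everything is Box)
├── tests/ # test cases
│ ├── lexer_test.nyash
│ ├── parser_test.nyash
│ └── integration_test.nyash
└── examples/ # usage examples
├── simple_parse.nyash
└── complex_object.nyash
```
## 🎯 Implementation strategy
### Phase 1: JsonNode foundation (week 1)
- [x] JsonNode Box implementation
- [x] Basic value type support (null, bool, int, string)
- [x] Array and object structure support
### Phase 2: JsonLexer implementation (week 2)
- [ ] Tokenizer implementation (state-machine based)
- [ ] Error handling
- [ ] Position tracking
### Phase 3: JsonParser implementation (week 3)
- [ ] Recursive descent parser implementation
- [ ] JsonNode construction
- [ ] Nested structure support
### Phase 4: Integration and optimization (week 4)
- [ ] C ABI bridge (minimal implementation)
- [ ] Compatibility with the existing JSONBox
- [ ] Performance measurement and optimization
## 🔄 Integration with existing systems
### Staged migration strategy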
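```bash
# Switch providers via environment variable
export NYASH_JSON_PROVIDER=nyash # new implementation
export NYASH_JSON_PROVIDER=yyjson # existing C implementation
export NYASH_JSON_PROVIDER=serde # Rust implementation
```
### Compatibility with the existing API
```nyash
// Keep the existing JSONBox API
local json = new JSONBox()
json.parse("{\"key\": \"value\"}") // uses the Nyash implementation internally
```
## 🎨 Design philosophy
### The Everything is Box principle
- **JsonNode**: a complete Box implementation
- **Clear boundaries**: each component is its own Box
- **Swappable**: supports provider switching
### Applying the 80/20 rule
- **80%**: a working implementation first (string-manipulation based)
- **20%**: optimize later (binary processing, SIMD, etc.)
### Fallback strategy
- **On error**: fall back safely to the existing implementation
- **If performance is insufficient**: can switch to yyjson
- **Compatibility**: keep the existing API 100% intact
## 🧪 Test strategy
### Basic test cases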
```json
{"null": null}
{"bool": true}
{"int": 42}
{"string": "hello"}
{"array": [1,2,3]}
{"object": {"nested": "value"}}
```
### Error cases
```
{invalid} // invalid JSON
{"unclosed": "str // unterminated string
[1,2, // incomplete array
```
### Performance test cases
```
Large JSON files (10 MB+)
Deeply nested structures (100+ levels)
Many small objects (100,000+)
```
## 🚀 Usage example
```nyash
// Basic usage
using "apps/lib/json_native/node.nyash" as JsonNative
// Parse a JSON string
local text = "{\"name\": \"Nyash\", \"version\": 1}"
local node = JsonNative.parse(text)
// Access values
print(node.get("name").str()) // "Nyash"
print(node.get("version").int()) // 1
// Convert back to a JSON string
print(node.stringify()) // {"name":"Nyash","version":1}
```
## 📊 Progress tracking
- [ ] Week 1: JsonNode foundation
- [ ] Week 2: JsonLexer implementation
- [ ] Week 3: JsonParser implementation
- [ ] Week 4: Integration and optimization
**Start date**: 2025-09-22
**Target completion**: 2025-10-20
**Implementers**: Claude × User collaboration
Notes
- Error messages aim to include: “Error at line X, column Y: …”.
- Unterminated string → tokenizer emits "Unterminated string literal" (locked by quick smoke).
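The two notes above can be illustrated with a small reference sketch in Python (not Nyash, and not the actual lexer in this layer). The helper names `position_of` and `scan_string` are hypothetical, and reporting the position of the opening quote (rather than the point where scanning gave up) is an assumption; only the message format and the "Unterminated string literal" text come from the notes.

```python
def position_of(text: str, index: int) -> tuple[int, int]:
    """Return the 1-based (line, column) of text[index]."""
    line = text.count("\n", 0, index) + 1
    line_start = text.rfind("\n", 0, index) + 1  # 0 when no newline precedes index
    return line, index - line_start + 1


def scan_string(text: str, start: int) -> tuple[str, int]:
    """Scan a JSON string literal whose opening quote is at text[start].

    Returns (raw_body, index_after_closing_quote). Escape sequences are kept
    verbatim for brevity; this is not a full JSON string decoder.
    """
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == '"':
            return text[start + 1:i], i + 1
        if ch == "\n":
            break  # raw newlines are not allowed inside JSON strings
        i += 1
    # Assumption: the error points at the opening quote of the literal.
    line, column = position_of(text, start)
    raise ValueError(
        f"Error at line {line}, column {column}: Unterminated string literal"
    )
```

Called on the error case `{"unclosed": "str` with `start` at the second string's opening quote, this sketch raises a `ValueError` whose message follows the "Error at line X, column Y: …" shape.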

View File

@ -59,9 +59,14 @@ box JsonParser {
// Step 2: Parse
local result = me.parse_value()
// Step 3: Check for trailing tokens
// Step 3: Check for trailing tokens (with details)
if result != null and not me.is_at_end() {
me.add_error("Unexpected tokens after JSON value")
local extra = me.current_token()
if extra != null {
me.add_error("Unexpected tokens after JSON value: " + extra.get_type() + "(" + extra.get_value() + ")")
} else {
me.add_error("Unexpected tokens after JSON value")
}
return null
}

View File

@ -0,0 +1,14 @@
Layer Guard — selfhost/vm
Scope and responsibility
- Minimal Ny-based executors and helpers for selfhosting experiments.
- Responsibilities: trial executors (MIR JSON v0), tiny helpers (scan/binop/compare), smoke drivers.
- Forbidden: full parser implementation, heavy runtime logic, code generation.
Imports policy (SSOT)
- Dev/CI: file-using allowed; drivers may embed JSON for tiny smokes.
- Prod: prefer `nyash.toml` mapping under `[modules.selfhost.*]`.
Notes
- MirVmMin covers: const/binop/compare/ret (M2). Branch/jump/phi are later.
- Keep changes minimal and spec-neutral; new behavior is gated by new tests.

View File

@ -0,0 +1,112 @@
// mir_vm_m2.nyash — minimal MIR (JSON v0) executor written in Ny (M2: const/binop/ret)
static box MirVmM2 {
_str_to_int(s) {
local i = 0
local n = s.length()
local acc = 0
loop (i < n) {
local ch = s.substring(i, i+1)
if ch == "0" { acc = acc * 10 + 0 i = i + 1 continue }
if ch == "1" { acc = acc * 10 + 1 i = i + 1 continue }
if ch == "2" { acc = acc * 10 + 2 i = i + 1 continue }
if ch == "3" { acc = acc * 10 + 3 i = i + 1 continue }
if ch == "4" { acc = acc * 10 + 4 i = i + 1 continue }
if ch == "5" { acc = acc * 10 + 5 i = i + 1 continue }
if ch == "6" { acc = acc * 10 + 6 i = i + 1 continue }
if ch == "7" { acc = acc * 10 + 7 i = i + 1 continue }
if ch == "8" { acc = acc * 10 + 8 i = i + 1 continue }
if ch == "9" { acc = acc * 10 + 9 i = i + 1 continue }
break
}
return acc
}
_int_to_str(n) {
if n == 0 { return "0" }
local v = n
local out = ""
loop (v > 0) {
local d = v % 10
local ch = "0"
if d == 1 { ch = "1" } else { if d == 2 { ch = "2" } else { if d == 3 { ch = "3" } else { if d == 4 { ch = "4" } else { if d == 5 { ch = "5" } else { if d == 6 { ch = "6" } else { if d == 7 { ch = "7" } else { if d == 8 { ch = "8" } else { if d == 9 { ch = "9" } } } } } } } }
out = ch + out
v = v / 10
}
return out
}
_find_int_in(seg, keypat) {
local p = seg.indexOf(keypat)
if p < 0 { return null }
p = p + keypat.length()
local i = p
local out = ""
loop(true) {
local ch = seg.substring(i, i+1)
if ch == "" { break }
if ch == "0" || ch == "1" || ch == "2" || ch == "3" || ch == "4" || ch == "5" || ch == "6" || ch == "7" || ch == "8" || ch == "9" { out = out + ch i = i + 1 } else { break }
}
if out == "" { return null }
return me._str_to_int(out)
}
_find_str_in(seg, keypat) {
local p = seg.indexOf(keypat)
if p < 0 { return "" }
p = p + keypat.length()
local q = seg.indexOf("\"", p)
if q < 0 { return "" }
return seg.substring(p, q)
}
_get(regs, id) { if regs.has(id) { return regs.get(id) } return 0 }
_set(regs, id, v) { regs.set(id, v) }
_bin(kind, a, b) {
if kind == "Add" { return a + b }
if kind == "Sub" { return a - b }
if kind == "Mul" { return a * b }
if kind == "Div" { if b == 0 { return 0 } else { return a / b } }
return 0
}
run(json) {
local regs = new MapBox()
local pos = json.indexOf("\"instructions\":[")
if pos < 0 {
print("0")
return 0
}
local cur = pos
loop(true) {
local op_pos = json.indexOf("\"op\":\"", cur)
if op_pos < 0 { break }
local name_start = op_pos + 6
local name_end = json.indexOf("\"", name_start)
if name_end < 0 { break }
local opname = json.substring(name_start, name_end)
local next_pos = json.indexOf("\"op\":\"", name_end)
if next_pos < 0 { next_pos = json.length() }
local seg = json.substring(op_pos, next_pos)
if opname == "const" {
local dst = me._find_int_in(seg, "\"dst\":")
local val = me._find_int_in(seg, "\"value\":{\"type\":\"i64\",\"value\":")
if dst != null and val != null { me._set(regs, "" + dst, val) }
} else { if opname == "binop" {
local dst = me._find_int_in(seg, "\"dst\":")
local kind = me._find_str_in(seg, "\"op_kind\":\"")
local lhs = me._find_int_in(seg, "\"lhs\":")
local rhs = me._find_int_in(seg, "\"rhs\":")
if dst != null and lhs != null and rhs != null {
local a = me._get(regs, "" + lhs)
local b = me._get(regs, "" + rhs)
me._set(regs, "" + dst, me._bin(kind, a, b))
}
} else { if opname == "ret" {
local v = me._find_int_in(seg, "\"value\":")
if v == null { v = 0 }
local out = me._get(regs, "" + v)
print(me._int_to_str(out))
return 0
} } }
cur = next_pos
}
print("0")
return 0
}
}

View File

@ -7,9 +7,17 @@
// {"op":"ret","value":1}
// ]}]}]
// }
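// Behavior: read the value of the first const i64 and print it. ret is expected to reference a value slot, but the MVP ignores it.
// Behavior:
// - M1: read the value of the first const i64 and print it
// - M2: minimal implementation of const/binop/compare/ret (interpreted safely via simple scanning)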
static box MirVmMin {
// Public entry used by parity tests (calls into minimal runner)
run(mjson) {
local v = me._run_min(mjson)
print(me._int_to_str(v))
return v
}
// Minimal scan functions (zero-dependency version)
index_of_from(hay, needle, pos) {
if pos < 0 { pos = 0 }
@ -26,11 +34,11 @@ static box MirVmMin {
}
return -1
}
read_digits(json, pos) {
read_digits(text, pos) {
local out = ""
local i = pos
loop (true) {
local s = json.substring(i, i+1)
local s = text.substring(i, i+1)
if s == "" { break }
if s == "0" || s == "1" || s == "2" || s == "3" || s == "4" || s == "5" || s == "6" || s == "7" || s == "8" || s == "9" {
out = out + s
@ -63,10 +71,10 @@ static box MirVmMin {
if n == 0 { return "0" }
local v = n
local out = ""
local digits = "0123456789"
loop (v > 0) {
local d = v % 10
local ch = "0"
if d == 1 { ch = "1" } else { if d == 2 { ch = "2" } else { if d == 3 { ch = "3" } else { if d == 4 { ch = "4" } else { if d == 5 { ch = "5" } else { if d == 6 { ch = "6" } else { if d == 7 { ch = "7" } else { if d == 8 { ch = "8" } else { if d == 9 { ch = "9" } } } } } } } }
local ch = digits.substring(d, d+1)
out = ch + out
v = v / 10
}
@ -74,26 +82,228 @@ static box MirVmMin {
}
// MVP: extract the value of the first const i64
_extract_first_const_i64(json) {
if json == null { return 0 }
_extract_first_const_i64(text) {
if text == null { return 0 }
// Look for "op":"const"
local p = json.indexOf("\"op\":\"const\"")
local p = text.indexOf("\"op\":\"const\"")
if p < 0 { return 0 }
// From there, look for "\"value\":{\"type\":\"i64\",\"value\":"
local key = "\"value\":{\"type\":\"i64\",\"value\":"
local q = me.index_of_from(json, key, p)
local q = me.index_of_from(text, key, p)
if q < 0 { return 0 }
q = q + key.length()
// Read the run of consecutive digits
local digits = me.read_digits(json, q)
local digits = me.read_digits(text, q)
if digits == "" { return 0 }
return me._str_to_int(digits)
}
// Run: print the value and return 0 (MVP). May later be tied to the exit code.
run(mir_json_text) {
local v = me._extract_first_const_i64(mir_json_text)
print(me._int_to_str(v))
// --- M2 addition: minimal MIR execution (const/binop/compare/ret) ---
_get_map(regs, key) { if regs.has(key) { return regs.get(key) } return 0 }
_set_map(regs, key, val) { regs.set(key, val) }
_find_int_in(seg, keypat) {
local p = seg.indexOf(keypat)
if p < 0 { return null }
p = p + keypat.length()
local i = p
local out = ""
loop(true) {
local ch = seg.substring(i, i+1)
if ch == "" { break }
if ch == "0" || ch == "1" || ch == "2" || ch == "3" || ch == "4" || ch == "5" || ch == "6" || ch == "7" || ch == "8" || ch == "9" { out = out + ch i = i + 1 } else { break }
}
if out == "" { return null }
return me._str_to_int(out)
}
_find_str_in(seg, keypat) {
local p = seg.indexOf(keypat)
if p < 0 { return "" }
p = p + keypat.length()
local q = me.index_of_from(seg, "\"", p)
if q < 0 { return "" }
return seg.substring(p, q)
}
// --- JSON segment helpers (brace/bracket aware, minimal) ---
_seek_obj_start(text, from_pos) {
// scan backward to the nearest '{'
local i = from_pos
loop(true) {
i = i - 1
if i < 0 { return 0 }
local ch = text.substring(i, i+1)
if ch == "{" { return i }
}
return 0
}
_seek_obj_end(text, obj_start) {
// starting at '{', find matching '}' using depth counter
local i = obj_start
local depth = 0
loop(true) {
local ch = text.substring(i, i+1)
if ch == "" { break }
if ch == "{" { depth = depth + 1 }
else { if ch == "}" { depth = depth - 1 } }
if depth == 0 { return i + 1 }
i = i + 1
}
return i
}
_seek_array_end(text, array_after_bracket_pos) {
// given pos right after '[', find the matching ']'
local i = array_after_bracket_pos
local depth = 1
loop(true) {
local ch = text.substring(i, i+1)
if ch == "" { break }
if ch == "[" { depth = depth + 1 }
else { if ch == "]" { depth = depth - 1 } }
if depth == 0 { return i }
i = i + 1
}
return i
}
_map_binop_symbol(sym) {
if sym == "+" { return "Add" }
if sym == "-" { return "Sub" }
if sym == "*" { return "Mul" }
if sym == "/" { return "Div" }
if sym == "%" { return "Mod" }
return "" }
_map_cmp_symbol(sym) {
if sym == "==" { return "Eq" }
if sym == "!=" { return "Ne" }
if sym == "<" { return "Lt" }
if sym == "<=" { return "Le" }
if sym == ">" { return "Gt" }
if sym == ">=" { return "Ge" }
return "" }
_eval_binop(kind, a, b) {
if kind == "Add" { return a + b }
if kind == "Sub" { return a - b }
if kind == "Mul" { return a * b }
if kind == "Div" { if b == 0 { return 0 } else { return a / b } }
if kind == "Mod" { if b == 0 { return 0 } else { return a % b } }
return 0 }
_eval_cmp(kind, a, b) {
if kind == "Eq" { if a == b { return 1 } else { return 0 } }
if kind == "Ne" { if a != b { return 1 } else { return 0 } }
if kind == "Lt" { if a < b { return 1 } else { return 0 } }
if kind == "Gt" { if a > b { return 1 } else { return 0 } }
if kind == "Le" { if a <= b { return 1 } else { return 0 } }
if kind == "Ge" { if a >= b { return 1 } else { return 0 } }
return 0
}
// Locate start of instructions array for given block id
_block_insts_start(mjson, bid) {
local key = "\"id\":" + me._int_to_str(bid)
local p = mjson.indexOf(key)
if p < 0 { return -1 }
local q = me.index_of_from(mjson, "\"instructions\":[", p)
if q < 0 { return -1 }
// "\"instructions\":[" is 16 chars → return pos right after '['
return q + 16
}
_block_insts_end(mjson, insts_start) {
// Bound to the end bracket of this block's instructions array
return me._seek_array_end(mjson, insts_start)
}
_run_min(mjson) {
local regs = new MapBox()
// Control flow: start at block 0, process until ret
local bb = 0
loop(true) {
local pos = me._block_insts_start(mjson, bb)
if pos < 0 { return me._extract_first_const_i64(mjson) }
local block_end = me._block_insts_end(mjson, pos)
// Single-pass over instructions: segment each op object precisely and evaluate
local scan = pos
local moved = 0
loop(true) {
// find next op field within this block
local opos = me.index_of_from(mjson, "\"op\":\"", scan)
if opos < 0 || opos >= block_end { break }
// find exact JSON object bounds for this instruction
// Determine object start as the last '{' between pos..opos
local i = pos
local obj_start = opos
loop(i <= opos) {
local ch0 = mjson.substring(i, i+1)
if ch0 == "{" { obj_start = i }
i = i + 1
}
local obj_end = me._seek_obj_end(mjson, obj_start)
if obj_end > block_end { obj_end = block_end }
local seg = mjson.substring(obj_start, obj_end)
// dispatch by op name (v0/v1 tolerant)
local opname = me._find_str_in(seg, "\"op\":\"")
if opname == "const" {
local cdst = me._find_int_in(seg, "\"dst\":")
local cval = me._find_int_in(seg, "\"value\":{\"type\":\"i64\",\"value\":")
if cdst != null and cval != null { me._set_map(regs, "" + cdst, cval) }
} else {
if opname == "binop" {
local bdst = me._find_int_in(seg, "\"dst\":")
local bkind = me._find_str_in(seg, "\"op_kind\":\"")
if bkind == "" { bkind = me._map_binop_symbol(me._find_str_in(seg, "\"operation\":\"")) }
local blhs = me._find_int_in(seg, "\"lhs\":")
local brhs = me._find_int_in(seg, "\"rhs\":")
if bdst != null and blhs != null and brhs != null {
local a = me._get_map(regs, "" + blhs)
local b = me._get_map(regs, "" + brhs)
local r = me._eval_binop(bkind, a, b)
me._set_map(regs, "" + bdst, r)
}
} else {
if opname == "compare" {
local kdst = me._find_int_in(seg, "\"dst\":")
local kkind = me._find_str_in(seg, "\"cmp\":\"")
if kkind == "" { kkind = me._map_cmp_symbol(me._find_str_in(seg, "\"operation\":\"")) }
local klhs = me._find_int_in(seg, "\"lhs\":")
local krhs = me._find_int_in(seg, "\"rhs\":")
if kdst != null and klhs != null and krhs != null {
local a = me._get_map(regs, "" + klhs)
local b = me._get_map(regs, "" + krhs)
local r = me._eval_cmp(kkind, a, b)
me._set_map(regs, "" + kdst, r)
}
} else {
if opname == "jump" {
local tgt = me._find_int_in(seg, "\"target\":")
if tgt != null { bb = tgt scan = block_end moved = 1 break }
} else {
if opname == "branch" {
local cond = me._find_int_in(seg, "\"cond\":")
local then_id = me._find_int_in(seg, "\"then\":")
local else_id = me._find_int_in(seg, "\"else\":")
local cval = 0
if cond != null { cval = me._get_map(regs, "" + cond) }
if cval != 0 { bb = then_id } else { bb = else_id }
scan = block_end
moved = 1
break
} else {
if opname == "ret" {
local rv = me._find_int_in(seg, "\"value\":")
if rv == null { rv = 0 }
return me._get_map(regs, "" + rv)
}
}
}
}
}
}
// advance to the end of this instruction object
scan = obj_end
}
// No ret encountered in this block; if control moved, continue with new bb
if moved == 1 { continue }
// Fallback when ret not found at all in processed blocks
return me._extract_first_const_i64(mjson)
}
return me._extract_first_const_i64(mjson)
}
}

View File

@ -0,0 +1,32 @@
// vm_kernel_box.nyash — NYABI Kernel (skeleton, dev-only; not wired)
// Scope: Provide policy/decision helpers behind an explicit OFF toggle.
// Notes: This box is not referenced by the VM by default.
static box VmKernelBox {
// Report version and supported features.
caps() {
// v0 draft: features are informative only.
return "{\"version\":0,\"features\":[\"policy\"]}"
}
// Decide stringify strategy for a given type.
// Returns: "direct" | "rewrite_stringify" | "fallback"
stringify_policy(typeName) {
if typeName == "VoidBox" { return "rewrite_stringify" }
return "fallback"
}
// Decide equals strategy for two types.
// Returns: "object" | "value" | "fallback"
equals_policy(lhsType, rhsType) {
if lhsType == rhsType { return "value" }
return "fallback"
}
// Batch resolve method dispatch plans.
// Input/Output via tiny JSON strings (draft). Returns "{\"plans\":[]}" for now.
resolve_method_batch(reqs_json) {
return "{\"plans\":[]}"
}
}

View File

@ -7,7 +7,9 @@ static box Main {
main(args) {
// Default minimal MIR (JSON v0)
local json = "{\"functions\":[{\"name\":\"main\",\"params\":[],\"blocks\":[{\"id\":0,\"instructions\":[{\"op\":\"const\",\"dst\":1,\"value\":{\"type\":\"i64\",\"value\":42}},{\"op\":\"ret\",\"value\":1}]}]}]}"
if args { if args.size() > 0 { local s = args.get(0) if s { json = s } } }
return MirVmMin.run(json)
if args != null { if args.size() > 0 { local s = args.get(0) if s != null { json = s } } }
local v = MirVmMin._run_min(json)
print(MirVmMin._int_to_str(v))
return 0
}
}