json-native: token positions (line/column); escape utils BMP coverage + surrogate guard; add smokes for string escapes, nested, and error cases (AST/VM)

This commit is contained in:
Selfhosting Dev
2025-09-26 00:42:55 +09:00
parent b3a96faccb
commit 041cef875a
16 changed files with 206 additions and 44 deletions

View File

@ -71,7 +71,7 @@ Nyashは「Everything is Box」。実装・最適化・検証のすべてを「
### 実践方法 ### 実践方法
1. **まず動くものを作る**80% 1. **まず動くものを作る**80%
2. **改善アイデアは `docs/ideas/` フォルダに記録**20% 2. **改善アイデアは `docs/development/proposals/ideas/` フォルダに記録**20%
3. **優先度に応じて後から改善** 3. **優先度に応じて後から改善**
## 🚀 クイックスタート ## 🚀 クイックスタート
@ -537,7 +537,7 @@ Call { func: ValueId, callee: Option<Callee> } // 段階移行で破壊的変
- 🎯 **All or Nothing**: Phase 10.7でPython transpilation、フォールバック無し設計 - 🎯 **All or Nothing**: Phase 10.7でPython transpilation、フォールバック無し設計
- 📚 **完全ドキュメント化**: README.md導線、実装戦略、技術仕様すべて完備 - 📚 **完全ドキュメント化**: README.md導線、実装戦略、技術仕様すべて完備
- 🗃️ **アーカイブ整理**: 古いphaseファイル群をarchiveに移動、導線クリーンアップ完了 - 🗃️ **アーカイブ整理**: 古いphaseファイル群をarchiveに移動、導線クリーンアップ完了
- 📋 詳細: [Property System仕様](docs/proposals/unified-members.md) | [Python統合計画](docs/development/roadmap/phases/phase-10.7/) - 📋 詳細: [Property System仕様](docs/development/proposals/unified-members.md) | [Python統合計画](docs/development/roadmap/phases/phase-10.7/)
## 📝 Update (2025-09-24) ✅ 改行処理革命Phase 2-B完了実用レベル到達 ## 📝 Update (2025-09-24) ✅ 改行処理革命Phase 2-B完了実用レベル到達
- 🎯 **改行処理革命Phase 2-B完了** Box宣言系ファイルから14箇所のskip_newlines()完全削除 - 🎯 **改行処理革命Phase 2-B完了** Box宣言系ファイルから14箇所のskip_newlines()完全削除
@ -913,12 +913,12 @@ CODEX_MAX_CONCURRENT=2 CODEX_DEDUP=1 CODEX_ASYNC_DETACH=1 \
pgrep -af 'codex.*exec' pgrep -af 'codex.*exec'
``` ```
### 💡 アイデア管理docs/ideas/フォルダ) ### 💡 アイデア管理docs/development/proposals/ideas/ フォルダ)
**80/20ルールの「残り20%」を整理して管理** **80/20ルールの「残り20%」を整理して管理**
``` ```
docs/ideas/ docs/development/proposals/ideas/
├── improvements/ # 80%実装の残り20%改善候補 ├── improvements/ # 80%実装の残り20%改善候補
├── new-features/ # 新機能アイデア ├── new-features/ # 新機能アイデア
└── other/ # その他すべて(調査、メモ、設計案) └── other/ # その他すべて(調査、メモ、設計案)

View File

@ -1114,7 +1114,7 @@ This page is trimmed to reflect the active work only. The previous long form has
- Refreshed docs index with clear "Start here" links (blueprints/strings, EBNF, strings reference) - Refreshed docs index with clear "Start here" links (blueprints/strings, EBNF, strings reference)
- Clarified operator/loop sugar policy in `guides/language-core-and-sugar.md` ("!" adopted, dowhile not adopted) - Clarified operator/loop sugar policy in `guides/language-core-and-sugar.md` ("!" adopted, dowhile not adopted)
- Concurrency docs (design-only): box model, semantics, and patterns/checklist added - Concurrency docs (design-only): box model, semantics, and patterns/checklist added
- `docs/proposals/concurrency/boxes.md` - `docs/development/proposals/concurrency/boxes.md`
- `docs/reference/concurrency/semantics.md` - `docs/reference/concurrency/semantics.md`
- `docs/guides/box-patterns.md`, `docs/guides/box-design-checklist.md` - `docs/guides/box-patterns.md`, `docs/guides/box-design-checklist.md`
- CI/Smokes - CI/Smokes
@ -1351,7 +1351,7 @@ Progress
- Language: Flow blocks & `->` pipingdesign — docs/development/design/legacy/flow-blocks.md - Language: Flow blocks & `->` pipingdesign — docs/development/design/legacy/flow-blocks.md
- Guards: Range/CharClass sugarreference — docs/reference/language/match-guards.md - Guards: Range/CharClass sugarreference — docs/reference/language/match-guards.md
- Strings: `toDigitOrNull` / `toIntOrNull`design note — docs/reference/language/strings.md - Strings: `toDigitOrNull` / `toIntOrNull`design note — docs/reference/language/strings.md
- Concurrency: Box modelRoutine/Channel/Select/Scope — docs/proposals/concurrency/boxes.md - Concurrency: Box modelRoutine/Channel/Select/Scope — docs/development/proposals/concurrency/boxes.md
- Concurrency semanticsblocking/close/select/trace — docs/reference/concurrency/semantics.md - Concurrency semanticsblocking/close/select/trace — docs/reference/concurrency/semantics.md
## Nyash VM めど後 — 機能追加リンク(備忘) ## Nyash VM めど後 — 機能追加リンク(備忘)

View File

@ -30,11 +30,11 @@ Phase15202509アップデート
仕様と既知制約 仕様と既知制約
- 必須不変条件Invariants: `docs/reference/invariants.md` - 必須不変条件Invariants: `docs/reference/invariants.md`
- 制約(既知/一時/解消済み): `docs/reference/constraints.md` - 制約(既知/一時/解消済み): `docs/reference/constraints.md`
- PHI と SSA の設計: `docs/architecture/phi-and-ssa.md` - PHI と SSA の設計: `docs/reference/architecture/phi-and-ssa.md`
- 既定のPHI挙動: Phase15 で PHI-ONMIR14が標準になったよ。ループ・break/continue・構造化制御の合流で PHI を必ず生成するよ。 - 既定のPHI挙動: Phase15 で PHI-ONMIR14が標準になったよ。ループ・break/continue・構造化制御の合流で PHI を必ず生成するよ。
- レガシー互換: `NYASH_MIR_NO_PHI=1`(必要なら `NYASH_VERIFY_ALLOW_NO_PHI=1` も)で PHI-OFFエッジコピーに切り替えできるよ。 - レガシー互換: `NYASH_MIR_NO_PHI=1`(必要なら `NYASH_VERIFY_ALLOW_NO_PHI=1` も)で PHI-OFFエッジコピーに切り替えできるよ。
- テスト行列(仕様→テスト対応): `docs/guides/testing-matrix.md` - テスト行列(仕様→テスト対応): `docs/guides/testing-matrix.md`
- 他言語との比較: `docs/comparison/nyash-vs-others.md` - 他言語との比較: `docs/guides/comparison/nyash-vs-others.md`
プロファイル(クイック) プロファイル(クイック)
- `--profile dev` → マクロONstrict、PyVM 開発向けの既定を適用(必要に応じて環境で上書き可) - `--profile dev` → マクロONstrict、PyVM 開発向けの既定を適用(必要に応じて環境で上書き可)

View File

@ -52,9 +52,9 @@ Profiles (quick)
Specs & Constraints Specs & Constraints
- Invariants (must-hold): `docs/reference/invariants.md` - Invariants (must-hold): `docs/reference/invariants.md`
- Constraints (known/temporary/resolved): `docs/reference/constraints.md` - Constraints (known/temporary/resolved): `docs/reference/constraints.md`
- PHI & SSA design: `docs/architecture/phi-and-ssa.md` - PHI & SSA design: `docs/reference/architecture/phi-and-ssa.md`
- Testing matrix (spec → tests): `docs/guides/testing-matrix.md` - Testing matrix (spec → tests): `docs/guides/testing-matrix.md`
- Comparison with other languages: `docs/comparison/nyash-vs-others.md` - Comparison with other languages: `docs/guides/comparison/nyash-vs-others.md`
## Table of Contents ## Table of Contents
- [SelfHosting (Dev Focus)](#self-hosting) - [SelfHosting (Dev Focus)](#self-hosting)
@ -378,7 +378,7 @@ box DataProcessor {
**Result**: Python code runs 10-50x faster as native Nyash binaries! **Result**: Python code runs 10-50x faster as native Nyash binaries!
### Documentation ### Documentation
- **[Property System Specification](docs/proposals/unified-members.md)** - Complete syntax reference - **[Property System Specification](docs/development/proposals/unified-members.md)** - Complete syntax reference
- **[Python Integration Guide](docs/development/roadmap/phases/phase-10.7/)** - Python → Nyash transpilation - **[Python Integration Guide](docs/development/roadmap/phases/phase-10.7/)** - Python → Nyash transpilation
- **[Implementation Strategy](docs/private/papers/paper-m-method-postfix-catch/implementation-strategy.md)** - Technical details - **[Implementation Strategy](docs/private/papers/paper-m-method-postfix-catch/implementation-strategy.md)** - Technical details

View File

@ -53,6 +53,13 @@ box JsonToken {
get_line() { return me.line } get_line() { return me.line }
get_column() { return me.column } get_column() { return me.column }
// 位置情報の設定(トークナイザーから付与)
set_line_column(line, column) {
me.line = line
me.column = column
return me
}
// ===== 判定メソッド ===== // ===== 判定メソッド =====
is_literal() { is_literal() {
@ -247,4 +254,4 @@ static box TokenStats {
i = i + 1 i = i + 1
} }
} }
} }

View File

@ -66,37 +66,39 @@ box JsonTokenizer {
// EOF チェック // EOF チェック
if me.scanner.is_eof() { if me.scanner.is_eof() {
return new JsonToken("EOF", "", me.scanner.get_position(), me.scanner.get_position()) return new JsonToken("EOF", "", me.scanner.get_position(), me.scanner.get_position()).set_line_column(me.scanner.get_line(), me.scanner.get_column())
} }
local start_pos = me.scanner.get_position() local start_pos = me.scanner.get_position()
local start_line = me.scanner.get_line()
local start_col = me.scanner.get_column()
local ch = me.scanner.current() local ch = me.scanner.current()
// 構造文字(単一文字) // 構造文字(単一文字)
local structural_type = me.char_to_token_type(ch) local structural_type = me.char_to_token_type(ch)
if structural_type != null { if structural_type != null {
me.scanner.advance() me.scanner.advance()
return this.create_structural_token(structural_type, start_pos) return this.create_structural_token(structural_type, start_pos).set_line_column(start_line, start_col)
} }
// 文字列リテラル // 文字列リテラル
if ch == "\"" { if ch == "\"" {
return me.tokenize_string() return me.tokenize_string().set_line_column(start_line, start_col)
} }
// 数値リテラル // 数値リテラル
if me.is_number_start_char(ch) { if me.is_number_start_char(ch) {
return me.tokenize_number() return me.tokenize_number().set_line_column(start_line, start_col)
} }
// キーワードnull, true, false // キーワードnull, true, false
if me.is_alpha_char(ch) { if me.is_alpha_char(ch) {
return me.tokenize_keyword() return me.tokenize_keyword().set_line_column(start_line, start_col)
} }
// 不明な文字(エラー) // 不明な文字(エラー)
me.scanner.advance() me.scanner.advance()
return new JsonToken("ERROR", "Unexpected character: '" + ch + "'", start_pos, me.scanner.get_position()) return new JsonToken("ERROR", "Unexpected character: '" + ch + "'", start_pos, me.scanner.get_position()).set_line_column(start_line, start_col)
} }
// ===== 専用トークナイザーメソッド ===== // ===== 専用トークナイザーメソッド =====

View File

@ -233,18 +233,110 @@ static box EscapeUtils {
(ch >= "A" and ch <= "F") (ch >= "A" and ch <= "F")
} }
// 4桁の16進数文字列を文字に変換簡易版 // 4桁の16進数文字列を文字に変換MVP: BMPの基本ASCIIとサロゲート検知
hex_to_char(hex) { hex_to_char(hex) {
// 簡易実装: 基本的なASCII文字のみ対応 // サロゲート半の範囲は '?' に置換(結合は現段階で未対応
return match hex { if hex >= "D800" and hex <= "DFFF" {
"0020" => " ", // スペース return "?"
"0021" => "!", // 感嘆符
"0022" => "\"", // ダブルクォート
"005C" => "\\", // バックスラッシュ
"0041" => "A", // A
"0061" => "a", // a
_ => "?" // 不明な文字は?で代替
} }
// 簡易: よく使う範囲0x20-0x7Eを網羅
if hex == "005C" { return "\\" }
if hex == "0022" { return "\"" }
// 0-9, A-Z, a-z, 空白と基本記号
if hex == "0020" { return " " }
if hex == "0021" { return "!" }
if hex == "0023" { return "#" }
if hex == "0024" { return "$" }
if hex == "0025" { return "%" }
if hex == "0026" { return "&" }
if hex == "0027" { return "'" }
if hex == "0028" { return "(" }
if hex == "0029" { return ")" }
if hex == "002A" { return "*" }
if hex == "002B" { return "+" }
if hex == "002C" { return "," }
if hex == "002D" { return "-" }
if hex == "002E" { return "." }
if hex == "002F" { return "/" }
if hex == "0030" { return "0" }
if hex == "0031" { return "1" }
if hex == "0032" { return "2" }
if hex == "0033" { return "3" }
if hex == "0034" { return "4" }
if hex == "0035" { return "5" }
if hex == "0036" { return "6" }
if hex == "0037" { return "7" }
if hex == "0038" { return "8" }
if hex == "0039" { return "9" }
if hex == "003A" { return ":" }
if hex == "003B" { return ";" }
if hex == "003C" { return "<" }
if hex == "003D" { return "=" }
if hex == "003E" { return ">" }
if hex == "003F" { return "?" }
if hex == "0040" { return "@" }
if hex == "0041" { return "A" }
if hex == "0042" { return "B" }
if hex == "0043" { return "C" }
if hex == "0044" { return "D" }
if hex == "0045" { return "E" }
if hex == "0046" { return "F" }
if hex == "0047" { return "G" }
if hex == "0048" { return "H" }
if hex == "0049" { return "I" }
if hex == "004A" { return "J" }
if hex == "004B" { return "K" }
if hex == "004C" { return "L" }
if hex == "004D" { return "M" }
if hex == "004E" { return "N" }
if hex == "004F" { return "O" }
if hex == "0050" { return "P" }
if hex == "0051" { return "Q" }
if hex == "0052" { return "R" }
if hex == "0053" { return "S" }
if hex == "0054" { return "T" }
if hex == "0055" { return "U" }
if hex == "0056" { return "V" }
if hex == "0057" { return "W" }
if hex == "0058" { return "X" }
if hex == "0059" { return "Y" }
if hex == "005A" { return "Z" }
if hex == "005B" { return "[" }
if hex == "005D" { return "]" }
if hex == "005E" { return "^" }
if hex == "005F" { return "_" }
if hex == "0060" { return "`" }
if hex == "0061" { return "a" }
if hex == "0062" { return "b" }
if hex == "0063" { return "c" }
if hex == "0064" { return "d" }
if hex == "0065" { return "e" }
if hex == "0066" { return "f" }
if hex == "0067" { return "g" }
if hex == "0068" { return "h" }
if hex == "0069" { return "i" }
if hex == "006A" { return "j" }
if hex == "006B" { return "k" }
if hex == "006C" { return "l" }
if hex == "006D" { return "m" }
if hex == "006E" { return "n" }
if hex == "006F" { return "o" }
if hex == "0070" { return "p" }
if hex == "0071" { return "q" }
if hex == "0072" { return "r" }
if hex == "0073" { return "s" }
if hex == "0074" { return "t" }
if hex == "0075" { return "u" }
if hex == "0076" { return "v" }
if hex == "0077" { return "w" }
if hex == "0078" { return "x" }
if hex == "0079" { return "y" }
if hex == "007A" { return "z" }
if hex == "007B" { return "{" }
if hex == "007C" { return "|" }
if hex == "007D" { return "}" }
if hex == "007E" { return "~" }
return "?"
} }
// ===== 妥当性検証 ===== // ===== 妥当性検証 =====

View File

@ -8,7 +8,7 @@ How to run (after full build):
- `copyFrom = { method_id = 7, args = [ { kind = "box", category = "plugin" } ] }` - `copyFrom = { method_id = 7, args = [ { kind = "box", category = "plugin" } ] }`
- `cloneSelf = { method_id = 8 }` - `cloneSelf = { method_id = 8 }`
- Build the plugin: `cd plugins/nyash-filebox-plugin && cargo build --release` - Build the plugin: `cd plugins/nyash-filebox-plugin && cargo build --release`
- Run the example: `./target/release/nyash docs/examples/plugin_boxref_return.nyash` - Run the example: `./target/release/nyash docs/guides/examples/plugin_boxref_return.nyash`
Expected behavior: Expected behavior:
- Creates two FileBox instances (`f`, `g`), writes to `f`, copies content to `g` via `copyFrom`, then closes both. - Creates two FileBox instances (`f`, `g`), writes to `f`, copies content to `g` via `copyFrom`, then closes both.

View File

@ -49,5 +49,4 @@
## 7. 参考 ## 7. 参考
- 仕様: `docs/reference/plugin-system/nyash-toml-v2_1-spec.md` - 仕様: `docs/reference/plugin-system/nyash-toml-v2_1-spec.md`
- 実装: `src/runtime/plugin_loader_v2.rs`(引数検証/Handle戻り値復元 - 実装: `src/runtime/plugin_loader_v2.rs`(引数検証/Handle戻り値復元
- 例: `docs/examples/plugin_boxref_return.nyash` - 例: `docs/guides/examples/plugin_boxref_return.nyash`

View File

@ -49,5 +49,4 @@
## 7. 参考 ## 7. 参考
- 仕様: `docs/reference/plugin-system/nyash-toml-v2_1-spec.md` - 仕様: `docs/reference/plugin-system/nyash-toml-v2_1-spec.md`
- 実装: `src/runtime/plugin_loader_v2.rs`(引数検証/Handle戻り値復元 - 実装: `src/runtime/plugin_loader_v2.rs`(引数検証/Handle戻り値復元
- 例: `docs/examples/plugin_boxref_return.nyash` - 例: `docs/guides/examples/plugin_boxref_return.nyash`

View File

@ -2,7 +2,7 @@
## 📝 概要 ## 📝 概要
Rust/inkwellの複雑性を回避し、llvmliteを使ってシンプルに実装する実験的バックエンド。 Rust/inkwellの複雑性を回避し、llvmliteを使ってシンプルに実装する実験的バックエンド。
ChatGPTが設計した`docs/design/LLVM_LAYER_OVERVIEW.md`の設計原則に従う。 ChatGPTが設計した`docs/development/design/legacy/LLVM_LAYER_OVERVIEW.md`の設計原則に従う。
## 🎯 目的 ## 🎯 目的
1. **検証ハーネス** - PHI/SSA構造の高速検証 1. **検証ハーネス** - PHI/SSA構造の高速検証

View File

@ -1,7 +1,7 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
""" """
Nyash LLVM Python Backend - Main Builder Nyash LLVM Python Backend - Main Builder
Following the design principles in docs/design/LLVM_LAYER_OVERVIEW.md Following the design principles in docs/development/design/legacy/LLVM_LAYER_OVERVIEW.md
""" """
import json import json

View File

@ -12,7 +12,7 @@ import llvmlite.ir as ir
class Resolver: class Resolver:
""" """
Centralized value resolution with per-block caching. Centralized value resolution with per-block caching.
Following the Core Invariants from docs/design/LLVM_LAYER_OVERVIEW.md: Following the Core Invariants from docs/development/design/legacy/LLVM_LAYER_OVERVIEW.md:
- Resolver-only reads - Resolver-only reads
- Localize at block start (PHI creation) - Localize at block start (PHI creation)
- Cache per (block, value) to avoid redundant PHIs - Cache per (block, value) to avoid redundant PHIs

View File

@ -1,6 +1,6 @@
# Nyash AOT-Plan (Phase 15.1) — Scripts Skeleton # Nyash AOT-Plan (Phase 15.1) — Scripts Skeleton
This folder will contain Nyash scripts that analyze a project (following `using` imports) and emit `aot_plan.v1.json` per docs/design/aot-plan-v1.md. This folder will contain Nyash scripts that analyze a project (following `using` imports) and emit `aot_plan.v1.json` per docs/development/design/legacy/aot-plan-v1.md.
Phase 15.1 scope: Phase 15.1 scope:
- Keep scripts minimal and deterministic - Keep scripts minimal and deterministic

View File

@ -7,13 +7,13 @@ ROOT_DIR="$(cd "$(dirname "$0")/.." && pwd)"
cd "$ROOT_DIR" cd "$ROOT_DIR"
PAIRS=( PAIRS=(
"local_tests/typeop_is_as_func_poc.nyash docs/status/golden/typeop_is_as_func_poc.mir.txt" "local_tests/typeop_is_as_func_poc.nyash docs/development/testing/golden/typeop_is_as_func_poc.mir.txt"
"local_tests/typeop_is_as_poc.nyash docs/status/golden/typeop_is_as_poc.mir.txt" "local_tests/typeop_is_as_poc.nyash docs/development/testing/golden/typeop_is_as_poc.mir.txt"
"local_tests/extern_console_log.nyash docs/status/golden/extern_console_log.mir.txt" "local_tests/extern_console_log.nyash docs/development/testing/golden/extern_console_log.mir.txt"
"local_tests/simple_loop_test.nyash docs/status/golden/loop_simple.mir.txt" "local_tests/simple_loop_test.nyash docs/development/testing/golden/loop_simple.mir.txt"
"local_tests/test_vm_array_getset.nyash docs/status/golden/boxcall_array_getset.mir.txt" "local_tests/test_vm_array_getset.nyash docs/development/testing/golden/boxcall_array_getset.mir.txt"
"local_tests/typeop_mixed.nyash docs/status/golden/typeop_mixed.mir.txt" "local_tests/typeop_mixed.nyash docs/development/testing/golden/typeop_mixed.mir.txt"
"local_tests/loop_nested_if_test.nyash docs/status/golden/loop_nested_if.mir.txt" "local_tests/loop_nested_if_test.nyash docs/development/testing/golden/loop_nested_if.mir.txt"
) )
FAILED=0 FAILED=0

View File

@ -0,0 +1,63 @@
#!/bin/bash
# json_string_escapes_ast.sh - JSON string escapes roundtrip via AST using
source "$(dirname "$0")/../../../lib/test_runner.sh"
require_env || exit 2
preflight_plugins || exit 2
TEST_DIR="/tmp/json_string_escapes_ast_$$"
mkdir -p "$TEST_DIR"
cd "$TEST_DIR"
cat > nyash.toml << EOF
[using.json_native]
path = "$NYASH_ROOT/apps/lib/json_native/"
main = "parser/parser.nyash"
[using.aliases]
json = "json_native"
EOF
cat > driver.nyash << 'EOF'
using json_native as JsonParserModule
static box Main {
main() {
// input → expected stringify
local inputs = new ArrayBox()
local expect = new ArrayBox()
inputs.push("\"A\""); expect.push("\"A\"")
inputs.push("\"\\n\" "); expect.push("\"\\n\"")
inputs.push("\"\\t\""); expect.push("\"\\t\"")
inputs.push("\"\\\\\""); expect.push("\"\\\\\"")
inputs.push("\"\\\"\""); expect.push("\"\\\"\"")
inputs.push("\"\\u0041\""); expect.push("\"A\"")
local i = 0
loop(i < inputs.length()) {
local s = inputs.get(i)
local r = JsonParserModule.roundtrip_test(s)
print(r)
i = i + 1
}
return 0
}
}
EOF
expected=$(cat << 'TXT'
"A"
"\n"
"\t"
"\\"
"\""
"A"
TXT
)
output=$("$NYASH_BIN" --backend vm driver.nyash 2>&1 | filter_noise)
compare_outputs "$expected" "$output" "json_string_escapes_ast"
cd /
rm -rf "$TEST_DIR"