pyvm: split op handlers into ops_core/ops_box/ops_ctrl; add ops_flow + intrinsic; delegate vm.py without behavior change

net-plugin: modularize constants (consts.rs) and sockets (sockets.rs); remove legacy commented socket code; fix unused imports
mir: move instruction unit tests to tests/mir_instruction_unit.rs (file lean-up); no semantic changes
runner/pyvm: ensure using pre-strip; misc docs updates

Build: cargo build ok; legacy cfg warnings remain as before
Selfhosting Dev
2025-09-21 08:53:00 +09:00
parent ee17cfd979
commit c8063c9e41
247 changed files with 10187 additions and 23124 deletions


@ -17,7 +17,7 @@ logic := compare (('&&' | '||') compare)*
compare := sum (( '==' | '!=' | '<' | '>' | '<=' | '>=' ) sum)?
sum := term (('+' | '-') term)*
term := unary (('*' | '/') unary)*
unary := '-' unary | factor
unary := ('-' | '!' | 'not') unary | factor
factor := INT
| STRING
@ -50,6 +50,8 @@ args := expr (',' expr)*
Notes
- ASI: Newline is the primary statement separator. Do not insert a semicolon between a closed block and a following 'else'.
- Semicolon (optional): When `NYASH_PARSER_ALLOW_SEMICOLON=1` is set, `;` is accepted as an additional statement separator (equivalent to newline). It is not allowed between `}` and a following `else`.
- Do-while: not supported by design. Prefer a single-entry, precondition loop; sugar such as `repeat N {}` / `until cond {}` is normalized to a `loop` with clear break conditions.
- Short-circuit: '&&' and '||' must not evaluate the RHS when not needed.
- Unary minus has higher precedence than '*' and '/'.
- IDENT names consist of [A-Za-z_][A-Za-z0-9_]*
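
As a rough illustration of the updated `unary` rule, the following Python sketch parses a pre-tokenized input into a nested tuple. The function name `parse_unary`, the AST tags, and the restriction to `INT` factors are assumptions made for this example; they are not taken from the Nyash parser.

```python
# Hypothetical sketch of: unary := ('-' | '!' | 'not') unary | factor
# Only INT factors are modeled; AST tag names are illustrative.

def parse_unary(tokens, pos=0):
    """Return (ast, next_pos) for the unary rule starting at tokens[pos]."""
    tok = tokens[pos]
    if tok == "-":
        operand, nxt = parse_unary(tokens, pos + 1)
        return ("neg", operand), nxt
    if tok in ("!", "not"):
        # Both spellings are treated as the same logical negation here.
        operand, nxt = parse_unary(tokens, pos + 1)
        return ("not", operand), nxt
    # factor := INT (the only factor alternative handled in this sketch)
    return ("int", int(tok)), pos + 1


ast, end = parse_unary(["-", "!", "5"])
```

Because the rule is right-recursive, prefix operators nest from the outside in, so `- ! 5` becomes a negation wrapped around a logical not.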
@ -92,6 +94,11 @@ block_as_role := block 'as' ( 'once' | 'birth_once' )? IDENT ':' TYPE
handler_tail := ( catch_block )? ( cleanup_block )?
catch_block := 'catch' ( '(' ( IDENT IDENT | IDENT )? ')' )? block
cleanup_block := 'cleanup' block
; Stage-3 (Phase 1 via normalization gate NYASH_CATCH_NEW=1)
; Postfix handlers for expressions and calls
postfix_catch := primary_expr 'catch' ( '(' ( IDENT IDENT | IDENT )? ')' )? block
postfix_cleanup := primary_expr 'cleanup' block
```
Semantics (summary)


@ -27,7 +27,7 @ High-performance execution via a Rust-based interpreter, and intuitive syntax
| `loop` | Loop (the only form) | `loop(condition) { }` |
| `continue` | Continue the loop | `continue` |
| `match` | Pattern matching (structure/type/guard) | `match value { "A" => 1, _ => 0 }` |
| `try` | Begin exception capture | `try { }` |
| `try` | Begin exception capture | `try { }` (deprecated; use postfix `catch/cleanup`) |
| `interface` | Interface definition | `interface Comparable { }` |
| `once` | **NEW** Lazily evaluated property | `once cache: CacheBox { build() }` |
| `birth_once` | **NEW** Eagerly evaluated property | `birth_once config: ConfigBox { load() }` |
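@ -37,8 +37,8 @ High-performance execution via a Rust-based interpreter, and intuitive syntax
|-------|------|---|
| `override` | Explicit override | `override speak() { }` |
| `break` | Exit the loop | `break` |
| `catch` | Exception handling | `catch (e) { }` |
| `cleanup` | Final processing (successor to finally) | `cleanup { }` |
| `catch` | Exception handling | `catch (e) { }` (also allowed as a postfix on expressions/calls; Stage-3) |
| `cleanup` | Final processing (successor to finally) | `cleanup { }` (also allowed as a postfix on expressions/calls; Stage-3) |
| `throw` | Raise an exception | `throw error` |
| `nowait` | Asynchronous execution | `nowait future = task()` |
| `await` | Wait and obtain the result | `result = await future` |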
@ -55,7 +55,7 @ High-performance execution via a Rust-based interpreter, and intuitive syntax
### **Operators and Logic**
| Operator/Keyword | Purpose | Example |
|-------|------|---|
| `not` | Logical negation | `not condition` |
| `not` / `!` | Logical negation | `not condition` / `!condition` |
| `and` | Logical AND | `a and b` |
| `or` | Logical OR | `a or b` |
| `true`/`false` | Boolean values | `flag = true` |
@ -188,9 +188,12 @@ loop(condition) {
}
}
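# ❌ Removed - cannot be used
while condition { } # parser error
loop() { } # parser error
# ❌ Syntax not adopted (design policy)
while condition { } # head-condition loops are unified into `loop(condition){}`
do { body } while(cond) # do-while is not adopted; express it with `repeat/until` sugar, normalized to a head condition
loop() { } # write an unconditional loop explicitly as `loop(true){}`
> Design note: Nyash emphasizes a "single-entry, head-condition" control-flow discipline, so do-while is not adopted. Bodies that must always execute are expressed with a `loop(1)` wrapper or the `repeat/until` sugar and normalized at zero cost.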
```
#### **Peek expression (added in Phase 12.7)**
@ -716,3 +719,20 @@ let [first, second, ...rest] = array
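**🎉 Nyash 2025 is a cutting-edge language system built through AI-collaborative design, fully combining simplicity and power.**
*Last updated: September 4, 2025 - accurately reflects the features implemented in Phase 12.7*
### 2.x Exceptions and Error Handling (postfix catch / cleanup)
Policy
- `try` is deprecated. Use postfix `catch` / `cleanup`.
- `catch` handles exceptions raised by the immediately preceding expression/call.
- `cleanup` always runs (successor to finally).
Example (expression-level postfix)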
```
do_work() catch(Error e) { env.console.log(e) }
open(path) cleanup { env.console.log("close") }
connect(url)
catch(NetworkError e) { env.console.warn(e) }
cleanup { env.console.log("done") }
```
Note: In Phase 1, normalization (gate `NYASH_CATCH_NEW=1`) expands these forms into the legacy TryCatch. In Phase 2, the parser accepts them directly.
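
The postfix forms can be read as a try/catch/finally wrapped around a single expression. The Python sketch below models that reading only. The helper name `with_handlers` is hypothetical, and treating the handler's return value as the value of the whole expression is an assumption the source does not confirm.

```python
# Hypothetical model of `expr catch(e) { ... } cleanup { ... }`.
# Not the Nyash normalizer; it only mirrors the described semantics.

def with_handlers(expr, catch=None, cleanup=None):
    """Evaluate expr() with an optional catch handler and optional cleanup."""
    try:
        return expr()
    except Exception as e:
        if catch is None:
            raise            # no catch block: the exception propagates
        return catch(e)      # assumption: handler result becomes the value
    finally:
        if cleanup is not None:
            cleanup()        # always runs, like `cleanup { ... }`


log = []

def failing():
    raise RuntimeError("boom")

with_handlers(
    failing,
    catch=lambda e: log.append(f"caught {e}"),
    cleanup=lambda: log.append("done"),
)
```

After this runs, `log` holds the catch message followed by the cleanup message, matching the order in the `connect(url)` example above.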


@ -0,0 +1,41 @@
# Match Guards — Syntax and Lowering (MVP + Design Notes)
Status: reference + design additions during freeze (no implementation changes)
Scope
- Guarded branches as a readable form of first-match selection.
- Canonical lowering target: if/else chain + PHI merges.
Syntax (MVP)
- Guard chain (first-match wins):
```nyash
guard <cond> -> { /* then */ }
guard <cond> -> { /* then */ }
else -> { /* else */ }
```
- Conditions may combine comparisons, `is/as` type checks, and literals with `&&` / `||`.
Lowering
- Always lowers to a linear if/else chain with early exit on first true guard.
- Merge points use normal PHI formation invariants (see `reference/mir/phi_invariants.md`).
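
To make the lowering concrete, here is a small Python model of first-match selection with early exit. It assumes guards are represented as `(cond, then)` pairs of callables. The names `select_first_match` and `guard` are invented for this sketch, and the single return value stands in for the PHI merge.

```python
# Illustrative model of a guard chain lowered to a linear if/else chain.

def select_first_match(guards, default):
    """guards: list of (cond, then) callables; default: the else branch."""
    for cond, then in guards:
        if cond():          # first true guard wins
            return then()   # early exit: later guards are never evaluated
    return default()


evaluated = []

def guard(name, value):
    """Build a condition that records when it is evaluated."""
    def cond():
        evaluated.append(name)
        return value
    return cond

result = select_first_match(
    [
        (guard("g1", False), lambda: "first"),
        (guard("g2", True), lambda: "second"),
        (guard("g3", True), lambda: "third"),
    ],
    default=lambda: "else",
)
```

Here `g3` is never evaluated because `g2` already matched, which is the early-exit property the lowering relies on.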
Design additions (frozen; docs only)
- Range Pattern (sugar):
  - `guard x in '0'..'9' -> { ... }`
  - Lowers to: `('0' <= x && x <= '9')`.
  - Multiple ranges: `in A..B || C..D` → OR of each bound check.
- CharClass (predefined sets):
  - `Digit ≡ '0'..'9'`, `AZ ≡ 'A'..'Z'`, `az ≡ 'a'..'z'`, `Alnum ≡ Digit || AZ || az`, `Space ≡ ' '\t\r\n` (MVP set; expandable later).
  - `guard ch in Digit -> { ... }` expands to range checks.
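
The range and CharClass sugar reduces to plain bound comparisons. The Python sketch below shows one way to model that expansion. The table `CHAR_CLASSES` and the helpers `in_ranges` / `in_class` are hypothetical names, and `Space` is assumed to mean space, tab, CR, and LF.

```python
# Illustrative expansion of range patterns and CharClass into bound checks.
# Each class is a list of inclusive (lo, hi) code point ranges.

CHAR_CLASSES = {
    "Digit": [("0", "9")],
    "AZ": [("A", "Z")],
    "az": [("a", "z")],
    # Assumption: Space is exactly these four characters.
    "Space": [(" ", " "), ("\t", "\t"), ("\r", "\r"), ("\n", "\n")],
}
# Alnum is defined as Digit || AZ || az.
CHAR_CLASSES["Alnum"] = (
    CHAR_CLASSES["Digit"] + CHAR_CLASSES["AZ"] + CHAR_CLASSES["az"]
)

def in_ranges(x, ranges):
    """OR of (lo <= x && x <= hi), one check per range."""
    return any(lo <= x <= hi for lo, hi in ranges)

def in_class(ch, name):
    """Model of `guard ch in <name>` for a single character."""
    return in_ranges(ch, CHAR_CLASSES[name])
```

For example, `in_class("7", "Digit")` is true while `in_class("_", "Alnum")` is false, using only comparisons and no set lookups.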
Errors & Rules (MVP)
- Default `_` branch does not accept guards.
- Type guard succeeds inside the then-branch; bindings (e.g., `StringBox(s)`) are introduced at branch head.
- Short-circuit semantics follow standard branch evaluation (right side is evaluated only if needed).
Observability (design)
- `NYASH_FLOW_TRACE=1` may trace how guard chains desugar into if/else.
Notes
- This page describes existing guard semantics and adds range/charclass as documentation-only sugar during freeze.


@ -0,0 +1,61 @@
# Nyash Strings: UTF-8 First, Bytes Separate
Status: Design committed. This document defines how Nyash treats text vs bytes and the minimal APIs we expose in each layer.
## Principles
- UTF-8 is the only in-memory encoding for `StringBox`.
- Text operations are defined in terms of Unicode code points (CP). Grapheme cluster (GC) helpers may be added on top.
- Bytes are not text. Byte operations live in a separate `ByteCursorBox` and byte-level instructions.
- Conversions are explicit.
## Model
- `StringBox`: immutable UTF-8 string value. Public text APIs are CP-indexed.
- `Utf8CursorBox`: delegated implementation for scanning and slicing `StringBox` as CPs.
- `ByteCursorBox`: independent binary view/holder for byte sequences.
## Invariants
- Indices are zero-based. Slices use half-open intervals `[i, j)`.
- CP APIs never intermix with byte APIs. GC APIs are explicitly suffixed (e.g., `*_gc`).
- Conversions must be explicit. No implicit transcoding.
## Core APIs (MVP)
Text (UTF-8/CP): implemented by `StringBox` delegating to `Utf8CursorBox`.
- `length() -> i64` — number of code points.
- `substring(i,j) -> StringBox` — CP slice.
- `indexOf(substr, from=0) -> i64` — CP index or `-1`.
- Optional helpers: `startsWith/endsWith/replace/split/trim` as sugar.
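
As a minimal sketch of what CP-indexed means in practice, the Python class below stores UTF-8 bytes and exposes code point based `length`, `substring`, and `index_of`. The class name `Utf8Text` and its methods are stand-ins invented for illustration, and the clamping of out-of-range indices follows the behavior stated in the Errors section.

```python
# Hypothetical stand-in for StringBox: UTF-8 input, CP-indexed API.
# Python's str is indexed by code point, which does the heavy lifting here.

class Utf8Text:
    def __init__(self, data: bytes):
        self._text = data.decode("utf-8")

    def length(self) -> int:
        """Number of code points (not bytes)."""
        return len(self._text)

    def substring(self, i: int, j: int) -> "Utf8Text":
        """Half-open CP slice [i, j), with out-of-range indices clamped."""
        n = len(self._text)
        i = max(0, min(i, n))
        j = max(i, min(j, n))
        return Utf8Text(self._text[i:j].encode("utf-8"))

    def index_of(self, substr: "Utf8Text", start: int = 0) -> int:
        """CP index of the first match at or after start, or -1."""
        return self._text.find(substr._text, max(0, start))

    def to_bytes(self) -> bytes:
        return self._text.encode("utf-8")


# "a", U+00E9, U+2192, "b": 4 code points stored in 7 UTF-8 bytes.
s = Utf8Text("a\u00e9\u2192b".encode("utf-8"))
```

With this input, `s.length()` counts 4 code points while `len(s.to_bytes())` is 7, which is the gap the CP/byte separation is meant to keep explicit.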
Bytes: handled by `ByteCursorBox`.
- `len_bytes() -> i64`
- `slice_bytes(i,j) -> ByteCursorBox`
- `find_bytes(pattern, from=0) -> i64`
- `to_string_utf8(strict=true) -> StringBox | Error`: strict throws on invalid UTF-8 (MVP may replace with U+FFFD when `strict=false`).
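
The byte-side API can be modeled in a few lines of Python. The class `ByteCursor` below is a hypothetical stand-in for `ByteCursorBox`: it returns a Python `str` rather than a `StringBox`, and it raises `UnicodeDecodeError` where the doc says `Error`.

```python
# Hypothetical model of ByteCursorBox using Python bytes.

class ByteCursor:
    def __init__(self, data: bytes):
        self._data = bytes(data)

    def len_bytes(self) -> int:
        return len(self._data)

    def slice_bytes(self, i: int, j: int) -> "ByteCursor":
        """Half-open byte slice [i, j), with out-of-range indices clamped."""
        n = len(self._data)
        i = max(0, min(i, n))
        j = max(i, min(j, n))
        return ByteCursor(self._data[i:j])

    def find_bytes(self, pattern: bytes, start: int = 0) -> int:
        """Byte index of the first match at or after start, or -1."""
        return self._data.find(pattern, max(0, start))

    def to_string_utf8(self, strict: bool = True) -> str:
        """Explicit conversion: strict raises, non-strict inserts U+FFFD."""
        errors = "strict" if strict else "replace"
        return self._data.decode("utf-8", errors=errors)


cursor = ByteCursor(b"ok\xff")
lenient = cursor.to_string_utf8(strict=False)
```

Calling `cursor.to_string_utf8()` with the default `strict=True` raises on the invalid `0xFF` byte, so the lossy path has to be requested explicitly.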
## Errors
- CP APIs clamp out-of-range indices (dev builds may enable strict). Byte APIs mirror the same behavior for byte indices.
- `to_string_utf8(strict=true)` fails on invalid input; `strict=false` replaces invalid sequences by U+FFFD.
## Interop
- FFI/ABI boundaries use UTF-8. Non-UTF-8 sources must enter via `ByteCursorBox` + explicit transcoding.
- Other encodings (e.g., UTF-16) are future work via separate cursor boxes; `StringBox` remains UTF-8.
## Roadmap
1) Provide Nyash-level MVP boxes: `Utf8CursorBox`, `ByteCursorBox`.
2) Route `StringBox` public methods through `Utf8CursorBox`.
3) Migrate Mini-VM and macro scanners to use `Utf8CursorBox` helpers.
4) Add CP/byte parity smokes; later add GC helpers and normalizers.
## Proposed Convenience (design only)
Parsing helpers (sugar; freeze-era design, not implemented):
- `toDigitOrNull(base=10) -> i64 | null`
  - Returns 0..9 when the code point is a decimal digit (or base subset), otherwise `null`.
  - CP based; delegates to `Utf8CursorBox` to read the leading code point.
- `toIntOrNull() -> i64 | null`
  - Parses the leading consecutive decimal digits into an integer; returns `null` when no digit at head.
  - Pure function; does not move any external cursor (callers decide how to advance).
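
The two helpers can be sketched in Python using only comparisons and simple arithmetic. The snake_case names and the use of `None` for `null` are adaptations for this example. Reading "base subset" as a base of at most 10 is an assumption, as is accepting only ASCII `'0'..'9'`.

```python
# Illustrative versions of toDigitOrNull / toIntOrNull (design-only helpers).
from typing import Optional

def to_digit_or_none(s: str, base: int = 10) -> Optional[int]:
    """Digit value of the leading code point, or None if it is not a digit."""
    if not s:
        return None
    ch = s[0]
    if "0" <= ch <= "9":
        d = ord(ch) - ord("0")
        if d < base:          # e.g. '9' is rejected when base=8
            return d
    return None

def to_int_or_none(s: str) -> Optional[int]:
    """Parse leading consecutive decimal digits; None when none at head."""
    value = None
    for ch in s:
        d = to_digit_or_none(ch)
        if d is None:
            break
        value = d if value is None else value * 10 + d
    return value
```

Both functions are pure: they read from the input and return a value without tracking how far they advanced, which leaves cursor movement to the caller.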
Notes
- Zero new runtime opcodes; compiled as comparisons and simple arithmetic.
- `Option/Maybe` may replace `null` in a future revision; documenting `null` keeps MVP simple.