🚀 Start Phase 15.3: Nyash compiler MVP implementation

Major milestone:
- Set up apps/selfhost-compiler/ directory structure
- Implement basic Nyash compiler in Nyash (CompilerBox)
- Stage-1: Basic arithmetic parser (int/string/+/-/*/parentheses/return)
- JSON v0 output compatible with --ny-parser-pipe
- Runner integration with NYASH_USE_NY_COMPILER=1 flag
- Comprehensive smoke tests for PHI/Bridge/Stage-2

Technical updates:
- Updated CLAUDE.md with Phase 15.3 status and MIR14 details
- Statement separation policy: newline-based with minimal ASI
- Fixed runaway ny-parser-pipe processes (CPU 94.9%)
- Clarified MIR14 as canonical instruction set (not 13/18)
- LoopForm strategy: PHI auto-generation during reverse lowering

Collaborative development:
- ChatGPT5 implementing compiler skeleton
- Codex provided LoopForm PHI generation guidance
- Claude maintaining documentation and coordination

🎉 A historic step toward self-hosting! The day it compiles itself is near, nya!

Co-Authored-By: ChatGPT <noreply@openai.com>
Selfhosting Dev
2025-09-15 01:21:37 +09:00
parent d01f9b9c93
commit af11c6855b
28 changed files with 1007 additions and 40 deletions


@@ -0,0 +1,81 @@
# Ny JSON IR v0: Minimal Spec (Stage-2)
Status: experimental but stable for Phase-15 Stage-2. Input to `--ny-parser-pipe`.
Version and root
- `version`: 0
- `kind`: "Program"
- `body`: array of statements
Statements (`StmtV0`)
- `Return { expr }`
- `Extern { iface, method, args[] }` (optional; passes through to `ExternCall`)
- `Expr { expr }` (expression statement; side effects only)
- `Local { name, expr }` (Stage-2)
- `If { cond, then: Stmt[], else?: Stmt[] }` (Stage-2)
- `Loop { cond, body: Stmt[] }` (Stage-2; while(cond) body)
Expressions (`ExprV0`)
- `Int { value }` where `value` is JSON number or digit string
- `Str { value: string }`
- `Bool { value: bool }`
- `Binary { op: "+"|"-"|"*"|"/", lhs, rhs }`
- `Compare { op: "=="|"!="|"<"|"<="|">"|">=", lhs, rhs }`
- `Logical { op: "&&"|"||", lhs, rhs }` (short-circuit)
- `Call { name: string, args[] }` (function by name)
- `Method { recv: Expr, method: string, args[] }` (box method)
- `New { class: string, args[] }` (construct Box)
- `Var { name: string }`
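A minimal sketch of assembling a JSON v0 program from the node shapes listed above, using Python's standard `json` module. The helpers `node` and `program` are hypothetical conveniences, not part of Nyash. The `"type"` discriminator key is taken from the examples later in this spec, and re-using `Local` to update `i` inside the loop follows the pattern of the `If` example rather than any documented assignment node.

```python
import json


def node(type_, **fields):
    """Build one JSON v0 node; "type" is the discriminator seen in the spec's examples."""
    return {"type": type_, **fields}


def program(*stmts):
    """Wrap statements in the version/kind/body root object."""
    return {"version": 0, "kind": "Program", "body": list(stmts)}


# Roughly: local i = 0; while (i < 3) { i = i + 1 }; return i
var_i = node("Var", name="i")
prog = program(
    node("Local", name="i", expr=node("Int", value=0)),
    node(
        "Loop",
        cond=node("Compare", op="<", lhs=var_i, rhs=node("Int", value=3)),
        body=[
            node(
                "Local",
                name="i",
                expr=node("Binary", op="+", lhs=var_i, rhs=node("Int", value=1)),
            )
        ],
    ),
    node("Return", expr=var_i),
)

text = json.dumps(prog)  # a string that could be piped to --ny-parser-pipe
```

Building nodes through one helper keeps the discriminator key in a single place, which helps if the wire format changes.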
CFG conventions (lowered by the bridge)
- If: create `then_bb`, `else_bb`, `merge_bb`. Both branches jump to merge if unterminated.
- Loop: `preheader -> cond_bb -> (body_bb or exit_bb)`, body jumps back to cond.
- Short-circuit Logical: create `rhs_bb`, `fall_bb`, `merge_bb` with constants on fall path.
- All blocks end with a terminator (branch/jump/return).
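The block layout described above can be sketched with a toy CFG model. This is an illustration under stated assumptions, not the bridge's actual code: `Block`, `seal`, `lower_if`, and `lower_loop` are hypothetical names, instructions are plain strings, and `merge_bb`/`exit_bb` are left open for whatever statement follows to terminate.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Block:
    name: str
    insts: List[str] = field(default_factory=list)
    # ("branch", cond, true_target, false_target) | ("jump", target) | ("return",)
    term: Optional[Tuple] = None


def seal(block, target):
    """Add a jump to `target` only if the block is still unterminated."""
    if block.term is None:
        block.term = ("jump", target.name)


def lower_if(cond, then_insts, else_insts=()):
    """If: then_bb / else_bb / merge_bb; both branches fall through to merge."""
    then_bb = Block("then_bb", list(then_insts))
    else_bb = Block("else_bb", list(else_insts))
    merge_bb = Block("merge_bb")  # terminated later by the following code
    entry = Block("entry", term=("branch", cond, then_bb.name, else_bb.name))
    seal(then_bb, merge_bb)
    seal(else_bb, merge_bb)
    return [entry, then_bb, else_bb, merge_bb]


def lower_loop(cond, body_insts):
    """Loop: preheader -> cond_bb -> (body_bb or exit_bb); body jumps back to cond."""
    preheader, cond_bb = Block("preheader"), Block("cond_bb")
    body_bb, exit_bb = Block("body_bb", list(body_insts)), Block("exit_bb")
    seal(preheader, cond_bb)
    cond_bb.term = ("branch", cond, body_bb.name, exit_bb.name)
    seal(body_bb, cond_bb)  # back edge
    return [preheader, cond_bb, body_bb, exit_bb]
```

`seal` captures the "jump to merge if unterminated" rule: a branch that already ended in a return would keep its own terminator.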
PHI merging (current behavior)
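- If: locals updated in `then`/`else` merge at `merge_bb` via `phi`.
  - When `else` is missing, the else side uses the pre-branch (base) value.
  - A new variable that exists on only one side is treated as out of scope and is not propagated outward.
- Loop: header PHIs are placed up front in `cond_bb` (merging preheader/base with latch/body end).
  - Purpose: a bridge to stabilize Stage-2 early. In the future (Core-14), PHI generation is planned to be automated by reverse lowering from LoopForm.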
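The If merge rules above can be sketched as a small function over variable environments. This is a hypothetical model, not the bridge's implementation: environments map variable names to opaque value ids, `merge_if_locals` and the `phi_<name>` ids are invented for illustration, and merging a variable that is newly defined on both sides is an assumption the spec does not state.

```python
def merge_if_locals(base, then_env, else_env=None):
    """Compute the environment at merge_bb plus the PHIs that are needed.

    base:     name -> value id before the branch
    then_env: name -> value id at the end of then_bb
    else_env: same for else_bb, or None when the else branch is missing
    """
    if else_env is None:
        # Missing else: the else side contributes the pre-branch values.
        else_env = dict(base)

    # Variables known before the branch, plus (assumption) ones new on BOTH sides.
    names = list(base) + [n for n in then_env if n in else_env and n not in base]

    merged, phis = {}, []
    for name in names:
        t = then_env.get(name, base.get(name))
        e = else_env.get(name, base.get(name))
        if t == e:
            merged[name] = t  # unchanged or identical: no PHI needed
        else:
            phi_id = f"phi_{name}"
            phis.append((phi_id, [("then_bb", t), ("else_bb", e)]))
            merged[name] = phi_id
    # Names present on only one side never enter `names`, so they stay out of scope.
    return merged, phis
```

For example, with `base = {"x": "v1"}` and a then branch ending in `{"x": "v2", "tmp": "v9"}` and no else, `x` merges through a PHI of `v2` and `v1`, while `tmp` is dropped.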
Type meta (emitter/LLVM harness cooperation)
- `+` with any string operand → string concat path (handle-fixed).
- `==/!=` with both strings → string compare path.
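The two type-meta rules above amount to a small dispatch on the operator and operand types. A sketch, assuming the emitter already knows whether each operand is a string; `select_path` and the path labels it returns are hypothetical names chosen for illustration.

```python
def select_path(op, lhs_is_str, rhs_is_str):
    """Pick a lowering path for a Binary/Compare node from operand string-ness."""
    if op == "+" and (lhs_is_str or rhs_is_str):
        return "string_concat"  # any string operand selects concatenation
    if op in ("==", "!=") and lhs_is_str and rhs_is_str:
        return "string_compare"  # requires BOTH operands to be strings
    return "default"
```

Note the asymmetry: concatenation triggers on either operand, comparison only when both are strings.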
Special notes
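- `Var("me")`: by default the bridge reports an undefined-variable error. For debugging, setting `NYASH_BRIDGE_ME_DUMMY=1` injects a dummy `NewBox{class}` (the class defaults to `Main` when `NYASH_BRIDGE_ME_CLASS` is omitted).
- `--ny-parser-pipe` reads JSON v0 from stdin and executes it via MIR → MIRInterp.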
CLI/Env cheatsheet
- Pipe: `echo '{...}' | target/release/nyash --ny-parser-pipe`
- File: `target/release/nyash --json-file sample.json`
- Verbose MIR dump: `NYASH_CLI_VERBOSE=1`
- me dummy: `NYASH_BRIDGE_ME_DUMMY=1 NYASH_BRIDGE_ME_CLASS=ConsoleBox`
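For scripted use, the pipe invocation above might be wrapped as follows. This is a sketch: `pipe_invocation` and `run_pipe` are hypothetical helpers, the binary path is the one shown in the cheatsheet, and how the runner reports results (stdout, exit code) is not specified here, so the wrapper simply returns the completed process.

```python
import json
import os
import subprocess


def pipe_invocation(program, binary="target/release/nyash", verbose=False):
    """Prepare argv, stdin text, and environment for a --ny-parser-pipe run."""
    env = dict(os.environ)
    if verbose:
        env["NYASH_CLI_VERBOSE"] = "1"  # verbose MIR dump, per the cheatsheet
    return [binary, "--ny-parser-pipe"], json.dumps(program), env


def run_pipe(program, **kwargs):
    """Run the binary with the JSON v0 program on stdin (requires a built nyash)."""
    argv, stdin_text, env = pipe_invocation(program, **kwargs)
    return subprocess.run(
        argv, input=stdin_text, text=True, env=env, capture_output=True
    )
```

Separating preparation from execution lets the command be inspected or logged without a built binary.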
Examples
Arithmetic
```json
{"version":0,"kind":"Program","body":[
  {"type":"Return","expr":{
    "type":"Binary","op":"+",
    "lhs":{"type":"Int","value":1},
    "rhs":{"type":"Binary","op":"*","lhs":{"type":"Int","value":2},"rhs":{"type":"Int","value":3}}
  }}
]}
```
If with local + PHI merge
```json
{"version":0,"kind":"Program","body":[
  {"type":"Local","name":"x","expr":{"type":"Int","value":1}},
  {"type":"If","cond":{"type":"Compare","op":"<","lhs":{"type":"Int","value":1},"rhs":{"type":"Int","value":2}},
    "then":[{"type":"Local","name":"x","expr":{"type":"Int","value":10}}],
    "else":[{"type":"Local","name":"x","expr":{"type":"Int","value":20}}]
  },
  {"type":"Return","expr":{"type":"Var","name":"x"}}
]}
```
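To check what such programs are expected to compute, a toy tree-walking evaluator for a subset of JSON v0 can help. This is a reading aid, not the MIR bridge: it uses one flat environment (so `Local` inside a branch simply overwrites the outer variable), handles integer arithmetic only (no string concat), treats `/` as floor division by assumption, and omits `Call`, `Method`, `New`, and `Extern`.

```python
import operator

BINARY = {"+": operator.add, "-": operator.sub, "*": operator.mul,
          "/": operator.floordiv}  # "/" semantics are an assumption
COMPARE = {"==": operator.eq, "!=": operator.ne, "<": operator.lt,
           "<=": operator.le, ">": operator.gt, ">=": operator.ge}


class ReturnSignal(Exception):
    """Unwinds nested statement lists when a Return is executed."""
    def __init__(self, value):
        self.value = value


def eval_expr(e, env):
    t = e["type"]
    if t == "Int":
        return int(e["value"])  # accepts a JSON number or a digit string
    if t in ("Str", "Bool"):
        return e["value"]
    if t == "Var":
        return env[e["name"]]
    if t == "Binary":
        return BINARY[e["op"]](eval_expr(e["lhs"], env), eval_expr(e["rhs"], env))
    if t == "Compare":
        return COMPARE[e["op"]](eval_expr(e["lhs"], env), eval_expr(e["rhs"], env))
    if t == "Logical":  # short-circuit: rhs is evaluated only when needed
        lhs = eval_expr(e["lhs"], env)
        if e["op"] == "&&":
            return lhs and eval_expr(e["rhs"], env)
        return lhs or eval_expr(e["rhs"], env)
    raise NotImplementedError(t)


def exec_stmts(stmts, env):
    for s in stmts:
        t = s["type"]
        if t == "Return":
            raise ReturnSignal(eval_expr(s["expr"], env))
        elif t == "Local":
            env[s["name"]] = eval_expr(s["expr"], env)
        elif t == "Expr":
            eval_expr(s["expr"], env)
        elif t == "If":
            taken = s["then"] if eval_expr(s["cond"], env) else s.get("else", [])
            exec_stmts(taken, env)
        elif t == "Loop":
            while eval_expr(s["cond"], env):
                exec_stmts(s["body"], env)
        else:
            raise NotImplementedError(t)


def run(program):
    """Evaluate a JSON v0 program (already parsed into dicts); None if no Return."""
    assert program["version"] == 0 and program["kind"] == "Program"
    try:
        exec_stmts(program["body"], {})
    except ReturnSignal as r:
        return r.value
    return None
```

Under these assumptions, `run` yields 7 for the arithmetic example and 10 for the If example above (after `json.loads`).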


@@ -10,6 +10,9 @@ This is the entry point for Nyash language documentation.
- Sugar Transformations (?., ??, |> and friends): parser/sugar.rs (source) and tools/nyfmt/NYFMT_POC_ROADMAP.md
- Peek Expression Design/Usage: covered in the Language Reference and Phase 12.7 specs above
Statement separation and semicolons
- See: reference/language/statements.md — newline as primary separator; semicolons optional for multiple statements on one line; minimal ASI rules.
Related implementation notes
- Tokenizer: src/tokenizer.rs
- Parser (expressions/statements): src/parser/expressions.rs, src/parser/statements.rs


@@ -0,0 +1,70 @@
# Statement Separation and Semicolons
Status: Adopted for Phase 15.3+; parser implementation is staged.
Policy
- Newline as primary statement separator.
- Semicolons are optional and only needed when multiple statements appear on one physical line.
- Minimal ASI (auto semicolon insertion) rules to avoid surprises.
Rules (minimal and predictable)
- Newline ends a statement when:
  - Parenthesis/brace/bracket depth is 0, and
  - The line does not end with a continuation token (`+ - * / . ,` etc.).
- Newline does NOT end a statement when:
  - Inside any open grouping `(...)`, `[...]`, `{...}`; or
  - The previous token is a continuation token.
- `return/break/continue` end the statement at newline unless the value is on the same line or grouped via parentheses.
- `if/else` (and similar paired constructs): do not insert a semicolon between a block and a following `else`.
- One-line multi-statements are allowed with semicolons: `x = 1; y = 2; print(y)`.
- Method chains can break across lines after a dot: `obj\n .method()` (newline treated as whitespace).
Style guidance
- Prefer newline separation (no semicolons) for readability.
- Use semicolons only when placing multiple statements on a single line.
Examples
```nyash
// Preferred (no semicolons)
local x = 5
x = x + 1
print(x)

// One line with multiple statements (use semicolons)
local x = 5; x = x + 1; print(x)

// Line continuation by operator
local n = 1 +
  2 +
  3

// Grouping across lines
return (
  1 + 2 + 3
)

// if / else on separate lines without inserting a semicolon
if cond {
  x = x - 1
}
else {
  print(x)
}

// Dot chain across lines
local v = obj
  .methodA()
  .methodB(42)
```
Implementation notes (parser)
- Tokenizer keeps track of grouping depth.
- At newline, attempt ASI only when depth==0 and previous token is not a continuation.
- Error messages should suggest adding a continuation token or grouping when a newline unexpectedly ends a statement.
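The depth-plus-continuation rule in these notes can be sketched as a top-level statement splitter. This is an illustration in Python, not the code in `src/tokenizer.rs`: `TOKEN_RE` and `split_statements` are hypothetical, the tokenizer is deliberately crude (no comments, no escapes in strings), and only the "previous token" rule is modeled, so `else` on its own line and chains with a leading dot would need an extra lookahead that is not shown.

```python
import re

# Strings, identifiers/numbers, newlines, or any single punctuation character.
TOKEN_RE = re.compile(r'"[^"\n]*"|\w+|\n|[^\w\s]')

CONTINUATION = {"+", "-", "*", "/", ".", ","}
OPENERS, CLOSERS = set("([{"), set(")]}")


def split_statements(source):
    """Split source into top-level statements (lists of tokens)."""
    statements, current, depth = [], [], 0
    for tok in TOKEN_RE.findall(source):
        if tok in OPENERS:
            depth += 1
        elif tok in CLOSERS:
            depth -= 1

        if tok == "\n":
            # ASI only at depth 0 and when the previous token is not a continuation.
            if depth == 0 and current and current[-1] not in CONTINUATION:
                statements.append(current)
                current = []
            continue  # otherwise the newline is plain whitespace

        if tok == ";" and depth == 0:
            if current:
                statements.append(current)
                current = []
            continue

        current.append(tok)

    if current:
        statements.append(current)
    return statements
```

Because depth is tracked across all three bracket kinds, a braced block stays attached to its statement here; splitting the statements inside a block would be a recursive application of the same rule.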
Parser dev notes (Stage-1/2)
- return + newline: treat bare `return` as statement end. To return an expression on the next line, require grouping with parentheses.
- if/else: never insert a semicolon between a closed block and `else` (a position where ASI is prohibited).
- Dot chains: treat `.` followed by newline as whitespace (line continuation).
- One-line multi-statements: accept `;` as statement separator, but formatter should prefer newlines.
- Unary minus: disambiguate from binary minus; implement after Stage-1 (for now, work around it with parentheses).