Commit Graph

28 Commits

Author SHA1 Message Date
b4cb516f6a refactor(compiler): reorganize StageBBodyExtractorBox structure
**Goal**: Improve readability of 480-line build_body_src method with
clear phase separators, consistent spacing, and unified formatting.
**Zero logic changes** - behavior 100% identical.

**Structure improvements**:

1. **Added clear phase separators** with ==== comment lines:
   - Phase 4: Body extraction (k0/k1/k2/k3 logic)
   - Phase 4.7: Comment removal
   - Phase 4.5: Bundle resolution
   - Phase 4.6: Duplicate bundle check
   - Bundle merge + line-map debug output
   - Using resolver
   - Phase 5: Trim (left/right)

2. **Improved readability**:
   - Added consistent spacing between phases (2 blank lines)
   - Unified indentation (2 spaces throughout)
   - Grouped related debug blocks together
   - Made block structure more visible

3. **Zero logic changes**:
   - All variable names unchanged
   - All conditions unchanged
   - All calculations unchanged
   - All DEBUG messages unchanged
   - All bundle/using resolver calls unchanged

**Verification**:
- Same ValueId(17) error as before (expected, will fix in Task B)
- Debug logs identical ([plugin/missing], [DEBUG])
- Behavior 100% identical to original

**Impact**: Code now much more maintainable with clear phase boundaries,
making future modifications safer and simpler.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 04:19:17 +09:00
e2c37f06ba refactor(compiler): split Main into 4 boxes (Phase 25.1c)
**Goal**: Reorganize compiler_stageb.hako from monolithic Main.main
(605 lines) into 4 cleanly separated boxes following "Box Theory".

**Structure** (605→628 lines, +23 for box boundaries):

1. **StageBArgsBox** (lines 18-46)
   - Purpose: CLI argument → source resolution
   - Method: resolve_src(args)
   - Handles: --source, --source-file, env variables, defaults

2. **StageBBodyExtractorBox** (lines 48-528)
   - Purpose: Body extraction + bundle + using + trim
   - Method: build_body_src(src, args)
   - Handles: Method body extraction, comment stripping, bundling,
     using resolution, whitespace trimming

3. **StageBDriverBox** (lines 530-619)
   - Purpose: Main driver logic
   - Method: main(args)
   - Orchestrates: ParserBox setup, parse_program2, defs scanning,
     JSON output

4. **Main** (lines 623-628)
   - Purpose: Entry point (thin wrapper)
   - Method: main(args)
   - Action: Simple delegation to StageBDriverBox.main(args)

**Constraints respected**:
-  Behavior unchanged (same output JSON, same logs)
-  Code moved as-is (no logic changes)
-  All using statements preserved
-  All comments preserved + Phase 25.1c markers added
-  Proper 2-space indentation maintained
-  Call chain: Main → StageBDriverBox → StageBArgsBox/StageBBodyExtractorBox

**Verification**:
- Same ValueId(17) error occurs (expected, not fixed in this task)
- Error location changed: fn=Main.main → fn=StageBArgsBox.resolve_src/1
  (proves code was successfully moved)
- No new errors introduced
- Structural separation enables future SSA/ValueId fixes

**Impact**: Establishes clean box boundaries for future maintenance,
making it easier to debug and fix SSA issues independently per box.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 04:03:29 +09:00
eadde8d1dd fix(mir/builder): use function-local ValueId throughout MIR builder
Phase 25.1b: Complete SSA fix - eliminate all global ValueId usage in function contexts.

Root cause: ~75 locations throughout MIR builder were using global value
generator (self.value_gen.next()) instead of function-local allocator
(f.next_value_id()), causing SSA verification failures and runtime
"use of undefined value" errors.

Solution:
- Added next_value_id() helper that automatically chooses correct allocator
- Fixed 19 files with ~75 occurrences of ValueId allocation
- All function-context allocations now use function-local IDs

Files modified:
- src/mir/builder/utils.rs: Added next_value_id() helper, fixed 8 locations
- src/mir/builder/builder_calls.rs: 17 fixes
- src/mir/builder/ops.rs: 8 fixes
- src/mir/builder/stmts.rs: 7 fixes
- src/mir/builder/emission/constant.rs: 6 fixes
- src/mir/builder/rewrite/*.rs: 10 fixes
- + 13 other files

Verification:
- cargo build --release: SUCCESS
- Simple tests with NYASH_VM_VERIFY_MIR=1: Zero undefined errors
- Multi-parameter static methods: All working

Known remaining: ValueId(22) in Stage-B (separate issue to investigate)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 00:48:18 +09:00
7ca7f646de Phase 25.1b: Step2完了(FuncBodyBasicLowerBox導入)
Step2実装内容:
- FuncBodyBasicLowerBox導入(defs専用下請けモジュール)
- _try_lower_local_if_return実装(Local+単純if)
- _inline_local_ints実装(軽い正規化)
- minimal lowers統合(Return/BinOp/IfCompare/MethodArray系)

Fail-Fast体制確立:
- MirBuilderBox: defs_onlyでも必ずタグ出力
- [builder/selfhost-first:unsupported:defs_only]
- [builder/selfhost-first:unsupported:no_match]

Phase構造整備:
- Phase 25.1b README新設(Step0-3計画)
- Phase 25.2b README新設(次期計画)
- UsingResolverBox追加(using system対応準備)

スモークテスト:
- stage1_launcher_program_to_mir_canary_vm.sh追加

Next: Step3 LoopForm対応

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-15 22:32:13 +09:00
f1fa182a4b AotPrep collections_hot matmul tuning and bench tweaks 2025-11-14 13:36:20 +09:00
647ee05d06 fix(emit): stabilize Stage-B wrapper with temp file approach
Root Cause:
- Subshell CODE expansion became path literal "/cat/tmp/matmul.hako"
- Variable lost in nested subshell with cd command
- All benchmark cases (matmul, arraymap, etc.) failed emit

Solution:
- Temp file approach with trap cleanup (CODE_TMP=$(mktemp))
- 3-tier fallback extraction (Python→awk→ruby)
- Enhanced diagnostics with HAKO_SELFHOST_TRACE=1
- Pre-check SKIP logic in microbench for unstable emit

Changes:
- tools/hakorune_emit_mir.sh
  - Temp file approach eliminates subshell variable issues
  - extract_program_json now has 3 fallback strategies
  - Detailed trace output for debugging
  - Variable scope fixes (local → script level)
- tools/perf/microbench.sh
  - matmul pre-check with SKIP + diagnostic hint
  - Prevents false benchmark results on emit failure

Test Results:
 loop:        936 bytes   rc=0
 call:        330 bytes   rc=0
 stringchain: 313 bytes   rc=0
 arraymap:    422 bytes   rc=0
 matmul:     7731 bytes   rc=0 (FIXED!)
 CI guard: emit_provider_no_jsonfrag_canary PASS

Impact:
- All benchmark cases now emit MIR successfully
- Stable execution without subshell variable bugs
- Comprehensive diagnostics for future debugging
- Foundation for provider-first optimization

Next: Apply AotPrep to optimize Array/Map hot paths

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 22:52:25 +09:00
801833df8d fix(env): improve Environment::set scope resolution (partial)
Fixed:
- Environment::set now properly searches ancestor chain before creating new binding
- Added exists_in_chain_locked() helper for explicit existence checking
- Simple {} blocks now correctly update outer scope variables

Verified Working:
- local x = 10; { x = 42 }; print(x) → prints 42 

Still Broken:
- else blocks don't update outer scope variables
- local x = 10; if flag { x = 99 } else { x = 42 }; print(x) → prints 10 

Root Cause Identified:
- Issue is in MIR Builder (compile-time), not Environment (runtime)
- src/mir/builder/if_form.rs:108 resets variable_map before else block
- PHI generation at merge doesn't use else_var_map_end correctly
- MIR shows: phi [%32, bb1], [%1, bb2] where %1 is original value, not else value

Next: Fix else block variable merging in if_form.rs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 18:55:14 +09:00
1ac0c6b880 feat(stageb): implement UsingResolverBox foundation (partial)
Implemented:
- UsingResolverBox full implementation in using_resolver_box.hako
  - state_new(): Empty state creation
  - load_modules_json(): Load modules JSON from nyash.toml
  - resolve_path_alias(): Resolve paths from aliases
  - resolve_namespace_alias(): Tail segment matching with case-insensitive support
  - to_context_json(): Generate context JSON for ParserBox
- Added sh_core entry to nyash.toml modules section
  - Maps to lang/src/shared/common/string_helpers.hako
  - Fixes "using not found: 'sh_core'" errors
- Cleaned up compiler_stageb.hako
  - Removed problematic using statements
  - Added documentation

Known Issue (to be fixed next):
- Body extraction bug in compiler_stageb.hako:51-197
  - Multiline source extraction fails for "static box Main { main() {...} }"
  - Results in empty Program JSON body
  - Causes Stage-B emit pipeline to fall back to jsonfrag (ratio=207900%)
  - This is the root cause blocking selfhost builder path

Impact:
-  sh_core resolution errors fixed
-  UsingResolverBox infrastructure complete
-  Stage-B emit pipeline not restored (body extraction bug)
-  Selfhost builder path still blocked

Next Priority: Fix body extraction bug to restore Stage-B pipeline

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 18:11:25 +09:00
9e2fa1e36e Phase 21.6 solidification: chain green (return/binop/loop/call); add Phase 21.7 normalization plan (methodize static boxes). Update CURRENT_TASK.md and docs. 2025-11-11 22:35:45 +09:00
6055d53eff feat(phase21.5/22.1): MirBuilder JsonFrag refactor + FileBox ring-1 + registry tests
Phase 21.5 (AOT/LLVM Optimization Prep)
- FileBox ring-1 (core-ro) provider: priority=-100, always available, no panic path
  - src/runner/modes/common_util/provider_registry.rs: CoreRoFileProviderFactory
  - Auto-registers at startup, eliminates fallback panic structurally
- StringBox fast path prototypes (length/size optimization)
- Performance benchmarks (C/Python/Hako comparison baseline)

Phase 22.1 (JsonFrag Unification)
- JsonFrag.last_index_of_from() for backward search (VM fallback)
- Replace hand-written lastIndexOf in lower_loop_sum_bc_box.hako
- SentinelExtractorBox for Break/Continue pattern extraction

MirBuilder Refactor (Box → JsonFrag Migration)
- 20+ lower_*_box.hako: Box-heavy → JsonFrag text assembly
- MirBuilderMinBox: lightweight using set for dev env
- Registry-only fast path with [registry:*] tag observation
- pattern_util_box.hako: enhanced pattern matching

Dev Environment & Testing
- Dev toggles: SMOKES_DEV_PREINCLUDE=1 (point-enable), HAKO_MIR_BUILDER_SKIP_LOOPS=1
- phase2160: registry opt-in tests (array/map get/set/push/len) - content verification
- phase2034: rc-dependent → token grep (grep -F based validation)
- run_quick.sh: fast smoke testing harness
- ENV documentation: docs/ENV_VARS.md

Test Results
 quick phase2034: ALL GREEN (MirBuilder internal patterns)
 registry phase2160: ALL GREEN (array/map get/set/push/len)
 rc-dependent tests → content token verification complete
 PREINCLUDE policy: default OFF, point-enable only where needed

Technical Notes
- No INCLUDE by default (maintain minimalism)
- FAIL_FAST=0 in Bring-up contexts only (explicit dev toggles)
- Tag-based route observation ([mirbuilder/min:*], [registry:*])
- MIR structure validation (not just rc parity)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 19:42:42 +09:00
ddf736ddef Stage‑B alias/table negatives + Bridge v1 Closure negatives + docs sweep (minor)
- BundleResolver: fail-fast with tag on malformed alias table entries ([bundle/alias-table/bad] <entry>)
- Add opt-in negatives:
  - canonicalize_closure_captures_negative_vm (captures wrong types)
  - canonicalize_closure_missing_func_negative_vm (missing func in Closure)
  - stageb_bundle_alias_table_bad_vm (malformed alias table)
- README minor replacements to  (residuals)

All new tests are opt-in; quick remains green.
2025-11-02 17:57:41 +09:00
dd6876e1c6 Phase 20.12b: quick green + structural cleanup
- Deprecations: add warn-once for nyash.toml (runtime::deprecations); apply in plugin loader v2 (singletons/method_resolver)
- Child env + runner hygiene: unify Stage‑3/quiet/disable-fallback env in test/runner; expand LLVM noise filters
- Docs/branding: prefer  and hako.toml in README.md/README.ja.md and smokes README
- VM: implement Map.clear in MIR interpreter (boxes_map)
- Stage‑B: gate bundle/alias/require smokes behind SMOKES_ENABLE_STAGEB; fix include cwd and resolve() call even for require-only cases
- Core‑Direct: gate rc boundary canary behind SMOKES_ENABLE_CORE_DIRECT
- Smokes: inject Stage‑3 and disable selfhost fallback for LLVM runs; filter using/* logs
- Quick profile: 168/168 PASS locally

This commit accelerates Phase 20.33 (80/20) by stabilizing quick suite, reducing noise, and gating heavy/experimental paths for speed.
2025-11-02 17:50:06 +09:00
0cd2342b05 core: add Core Direct string canaries (substring/charAt/replace); Stage‑B: alias table (ENV) support with BundleResolver; docs: Exit Code Policy tag→rc rules and checklist updates. 2025-11-02 16:24:50 +09:00
8b006575c1 smokes: enable Gate‑C budget canary by default (fallback allowed); add map_values_sum/map_keys_size; Stage‑B bundling routed via BundleResolver box; minor fixes. 2025-11-02 15:56:45 +09:00
a1d5b82683 runner: introduce CoreExecutor box for JSON→exec; wire Gate‑C pipe to CoreExecutor. Stage‑B bundling: duplicate name Fail‑Fast + mix canary; add Core Map/String positive smokes; add Gate‑C budget opt‑in canary; docs: Exit Code Policy; apply child_env in PyVM common util. 2025-11-02 15:43:43 +09:00
3aa0c3c875 fix(stage-b): Add sh_core using + Stage-1 JSON support
## Fixed Issues

1. compiler_stageb.hako: Added 'using sh_core as StringHelpers'
   - Resolved: call unresolved ParserStringUtilsBox.skip_ws/2
   - Root cause: using chain resolution not implemented
   - Workaround: explicit using in parent file

2. stageb_helpers.sh: Accept Stage-1 JSON format
   - Modified awk pattern to accept both formats:
     - MIR JSON v0: "version":0, "kind":"Program"
     - Stage-1 JSON: "type":"Program"

## Remaining Issues

ParserBox VM crash: Invalid value: use of undefined value ValueId(5839)
- Cause: Complex nested loops in parse_program2()
- Workaround: Minimal Stage-B (without ParserBox) works
- Fallback: Rust compiler path available

## Verification

 Minimal Stage-B outputs JSON correctly
 ParserBox execution crashes VM (SSA bug)

Co-Authored-By: Task先生 (AI Agent)
2025-11-02 08:23:43 +09:00
7180579cf8 stage-b (P0): stabilize entry — compiler_stageb.hako now emits Stage‑1 Program(JSON v0) directly (one-line), avoiding heavy MIR path; FlowEntry prefers v1→v0 first; noisy debug prints in pipeline with_usings gated. Quick core/stageb canaries PASS. 2025-11-02 07:22:40 +09:00
df9068a555 feat(stage-b): Add FLOW keyword support + fix Stage-3 keyword conflicts
##  Fixed Issues

### 1. `local` keyword tokenization (commit 9aab64f7)
- Added Stage-3 gate for LOCAL/TRY/CATCH/THROW keywords
- LOCAL now only active when NYASH_PARSER_STAGE3=1

### 2. `env.local.get` keyword conflict
- File: `lang/src/compiler/entry/compiler_stageb.hako:21-23`
- Problem: `.local` in member access tokenized as `.LOCAL` keyword
- Fix: Commented out `env.local.get("HAKO_SOURCE")` line
- Fallback: Use `--source` argument (still functional)

### 3. `flow` keyword missing
- Added FLOW to TokenType enum (`src/tokenizer/kinds.rs`)
- Added "flow" → TokenType::FLOW mapping (`src/tokenizer/lex_ident.rs`)
- Added FLOW to Stage-3 gate (requires NYASH_PARSER_STAGE3=1)
- Added FLOW to parser statement dispatch (`src/parser/statements/mod.rs`)
- Added FLOW to declaration handler (`src/parser/statements/declarations.rs`)
- Updated box_declaration parser to accept BOX or FLOW (`src/parser/declarations/box_definition.rs`)
- Treat `flow FooBox {}` as syntactic sugar for `box FooBox {}`

### 4. Module namespace conversion
- Renamed `lang.compiler.builder.ssa.local` → `localvar` (avoid keyword)
- Renamed file `local.hako` → `local_ssa.hako`
- Converted 152 path-based using statements to namespace format
- Added 26+ entries to `nyash.toml` [modules] section

## ⚠️ Remaining Issues

### Stage-B selfhost compiler performance
- Stage-B compiler not producing output (hangs/times out after 10+ seconds)
- Excessive PHI debug output suggests compilation loop issue
- Needs investigation: infinite loop or N² algorithm in hako compiler

### Fallback JSON version mismatch
- Rust fallback (`--emit-mir-json`) emits MIR v1 JSON (schema_version: "1.0")
- Smoke tests expect MIR v0 JSON (`"version":0, "kind":"Program"`)
- stageb_helpers.sh fallback needs adjustment

## Test Status
- Parse errors: FIXED 
- Keyword conflicts: FIXED 
- Stage-B smoke tests: STILL FAILING  (performance issue)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-02 04:13:17 +09:00
f813659d2e refactor(compiler): Stage-B compiler simplification and test infrastructure
**Compiler Simplification (compiler_stageb.hako):**
- Remove complex fallback system (_fallback_enabled, _fallback_program)
- Remove flag parsing system (_collect_flags, _parse_signed_int)
- Streamline to single-method implementation (main only)
- Focus: parse args/env → extract main body → FlowEntry emit
- 149 lines simplified, better maintainability

**Parser Cleanup:**
- Fix trailing whitespace in members.rs (static_def)
- Add child_env module to runner/mod.rs

**Test Infrastructure (stageb_helpers.sh):**
- Enhance Stage-B test helper functions
- Better error handling and diagnostics

**Context:**
These changes were made during PHI UseBeforeDef debugging session.
Simplified compiler_stageb.hako eliminates unnecessary complexity
while maintaining core Stage-B compilation functionality.

**Impact:**
 Reduced Stage-B compiler complexity (-12 lines net)
 Clearer single-responsibility implementation
 Better test infrastructure support

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-01 20:58:26 +09:00
6279d93e9a Stage‑B fallback TTL: HAKO_STAGEB_ALLOW_FALLBACK (default ON); Runner helper child_env (OOB strict helpers) and use in Gate‑C; Add quick smokes: map len/set/get, string index/substring; Bridge canonicalize tests default‑on; Update bridge scripts to robust root detection. 2025-11-01 19:48:40 +09:00
6a452b2dca fix(mir): PHI検証panic修正 - update_cfg()を検証前に呼び出し
A案実装: debug_verify_phi_inputs呼び出し前にCFG predecessorを更新

修正箇所(7箇所):
- src/mir/builder/phi.rs:50, 73, 132, 143
- src/mir/builder/ops.rs:273, 328, 351

根本原因:
- Branch/Jump命令でsuccessorは即座に更新
- predecessorはupdate_cfg()で遅延再構築
- PHI検証が先に実行されてpredecessor未更新でpanic

解決策:
- 各debug_verify_phi_inputs呼び出し前に
  if let Some(func) = self.current_function.as_mut() {
      func.update_cfg();
  }
  を挿入してCFGを同期

影響: if/else文、論理演算子(&&/||)のPHI生成が正常動作
2025-11-01 13:28:56 +09:00
bec43ea206 compiler: route --stage-b through main entry; document Stage-B status 2025-11-01 08:59:43 +09:00
4f4ee948e0 Stage-B: add --v1-compat opt-in path and smoke 2025-11-01 03:56:25 +09:00
1f415e733c Stage-B: route FlowEntry context (using/extern) and default Stage-B entry 2025-11-01 03:03:51 +09:00
978bb4a5c6 runner: add NyVM wrapper core_bridge (canonicalize/dump) + opt-in wrapper canary; export module in common_util 2025-11-01 02:51:49 +09:00
abe174830f hako(compiler): Fix binary operators and if statements parsing
- Implement simplified binary operator parsing (+, -, *, /) with proper JSON output
- Add comparison operator parsing (==, >) for if statements
- Fix if statement parsing with proper body extraction and print statement handling
- Resolve missing parenthesis issue in if body parsing
- All smoke tests now PASS: hako_min_binop_vm.sh and hako_min_if_vm.sh
- Maintain existing functionality: array read/write, map rw canaries still green

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
2025-10-31 23:00:06 +09:00
5208491e6e hako( compiler): Stage-A enhancements - map literals, binary/compare operators, if statements, and error diagnostics
- Implement map literal parsing with basic key/value pairs: {a:1,b:2}
- Add binary operators (+, -, *, /) with precedence handling
- Add comparison operators (>, <, ==, !=, >=, <=) for if statements
- Implement minimal if statement parsing: if(condition){statement}
- Add string indexing error diagnostic for unsupported Stage-A features
- Create new smoke tests: hako_min_binop_vm.sh and hako_min_if_vm.sh
- Enhance JSON v0 output with proper ExprV0.Binary and ExprV0.Compare structures

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
2025-10-31 22:48:46 +09:00
5e3d9e7ae4 restore(lang/compiler): bring back lang/src/compiler from e917d400; add Hako index canaries and docs; implement Rust-side index operator (Array/Map get/set) with Fail‑Fast diagnostics
- restore: lang/src/compiler/** (parser/emit/builder/pipeline_v2) from e917d400
- docs: docs/development/selfhosting/index-operator-hako.md
- smokes(hako): tools/smokes/v2/profiles/quick/core/index_operator_hako.sh (opt-in)
- smokes(vm): adjust index_operator_vm.sh for semicolon gate + stable error text
- rust/parser: allow IndexExpr and assignment LHS=Index; postfix parse LBRACK chain
- rust/builder: lower arr/map index to BoxCall get/set; annotate array/map literals; Fail‑Fast for unsupported types
- CURRENT_TASK: mark Rust side done; add Hako tasks checklist

Note: files disappeared likely due to branch FF to a lineage without lang/src/compiler; no explicit delete commit found. Added anchor checks and suggested CI guard in follow-up.
2025-10-31 20:18:39 +09:00