
Selfhosting Dev e114f9bfe3 fix(llvm): Implement handle-based console.log functions for plugin return values

- Add nyash.console.log_handle(i64) -> i64 family functions to nyrt
- Replace invalid int-to-pointer conversion with proper handle-based calls
- Fix bool(i1) -> i64 type conversion in LLVM compiler
- Resolve LLVM function verification errors
- Enable plugin method execution without NYASH_LLVM_ALLOW_BY_NAME
- Merge codex TLV fixes for plugin return value handling (2000+ lines)

Technical Details:
- Root cause: build_int_to_ptr(handle_value, i8*, "arg_i2p") treated
  handle IDs as memory addresses (invalid operation)
- Solution: Direct i64 handle passing to nyrt functions with proper
  handle registry lookup and to_string_box() conversion
- Type safety: Added proper i1/i32/i64 -> i64 conversion handling

Status: Console.log type errors resolved, plugin return value display
still under investigation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-09-11 00:21:11 +09:00


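GPU Box Technical Specification (Draft)

🔧 Technical Details

GPU Box Requirements

Purity
- No side effects
- No dependence on external state
- Deterministic execution results
Type constraints
- Only GPU-compatible types may be used
- Pointers and references are restricted
- Recursive calls are prohibited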
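
The "no recursive calls" rule above is the kind of constraint a compiler could check on a call graph. The following is a minimal Python sketch under that assumption; `has_recursion` and the dictionary-based call graph are hypothetical illustrations and not part of the Nyash compiler.

```python
def has_recursion(call_graph, entry):
    """Return True if a call cycle is reachable from `entry`.

    `call_graph` maps a function name to the names it calls
    (a hypothetical representation chosen for this sketch).
    """
    visiting = set()   # functions on the current DFS path
    done = set()       # functions already proven cycle-free

    def visit(fn):
        if fn in visiting:
            return True            # back edge: fn calls itself, directly or indirectly
        if fn in done:
            return False
        visiting.add(fn)
        found = any(visit(callee) for callee in call_graph.get(fn, ()))
        visiting.discard(fn)
        done.add(fn)
        return found

    return visit(entry)
```

A method whose call graph passes this check would still need to satisfy the purity and type constraints before it could be considered for GPU execution.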

Memory Model

// GPU memory layout
gpu box ParticleArray {
    // Fields are laid out automatically as Structure of Arrays (SoA)
    positions: GPUBuffer<Float3>
    velocities: GPUBuffer<Float3>
    masses: GPUBuffer<Float>
}
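
To make the SoA layout concrete, here is a minimal CPU-side Python sketch. The `Particle` record and the `to_soa` helper are assumptions made for illustration (they do not exist in Nyash); the sketch only shows how per-particle records would be split into one contiguous buffer per field.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Particle:
    """Hypothetical array-of-structures record: one object per particle."""
    position: tuple
    velocity: tuple
    mass: float


def to_soa(particles):
    """Split records into one flat float32 buffer per field (SoA)."""
    positions = array("f")
    velocities = array("f")
    masses = array("f")
    for p in particles:
        positions.extend(p.position)    # x, y, z of each particle stored back to back
        velocities.extend(p.velocity)
        masses.append(p.mass)
    return {"positions": positions, "velocities": velocities, "masses": masses}
```

With this layout, neighbouring threads that read the same field touch neighbouring memory, which is the usual motivation for SoA on GPUs.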

MIR → GPU IR Conversion

// Nyash MIR
BoxCall { 
    box_val: %particles,
    method: "update",
    args: [%deltaTime]
}

// ↓ converted to

// GPU IR (pseudocode)
kernel particle_update {
    params: [particles_ptr, deltaTime]
    threads: particles.count
    
    thread_body: {
        idx = thread_id()
        pos = load(particles.positions[idx])
        vel = load(particles.velocities[idx]) 
        new_pos = pos + vel * deltaTime
        store(particles.positions[idx], new_pos)
    }
}
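
The semantics of the pseudocode kernel can be emulated on the CPU. The following Python sketch is an illustration only: `particle_update` is a hypothetical stand-in in which each loop iteration plays the role of one GPU thread, and positions and velocities are plain lists of 3-tuples.

```python
def particle_update(positions, velocities, delta_time):
    """CPU emulation of the kernel: one loop iteration per GPU thread."""
    for idx in range(len(positions)):       # idx plays the role of thread_id()
        pos = positions[idx]                # load(particles.positions[idx])
        vel = velocities[idx]               # load(particles.velocities[idx])
        new_pos = tuple(p + v * delta_time for p, v in zip(pos, vel))
        positions[idx] = new_pos            # store(particles.positions[idx], new_pos)
    return positions
```

Because every iteration reads and writes only its own index, the iterations are independent, which is what allows the real kernel to run them as parallel threads.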

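Backend Support

CUDA (NVIDIA GPUs)
- PTX generation
- cuBLAS/cuDNN integration
OpenCL (cross-platform)
- SPIR-V generation
- Support for a wide range of GPUs
Vulkan Compute (modern API)
- SPIR-V generation
- Mobile GPU support
Metal (Apple GPUs)
- Metal Shading Language
- Apple Silicon optimization

Optimization Techniques

Box Fusion

// Fuse these operations into a single GPU kernel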
data.map(x => x * 2)
    .filter(x => x > 10)
    .reduce(+)
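
The effect of fusion can be shown on the CPU. In this Python sketch (an illustration, not the Nyash optimizer), `unfused` materializes an intermediate list per stage, while `fused` performs the map, filter, and sum in one pass with no intermediates, which is what a single fused kernel would do.

```python
def unfused(data):
    """Three separate passes, each producing an intermediate buffer."""
    doubled = [x * 2 for x in data]             # map
    kept = [x for x in doubled if x > 10]       # filter
    return sum(kept)                            # reduce(+)


def fused(data):
    """One pass over the input; no intermediate buffers."""
    total = 0
    for x in data:
        y = x * 2          # map
        if y > 10:         # filter
            total += y     # reduce(+)
    return total
```

Both functions return the same value for any input; the fused form avoids writing and re-reading the intermediate results, which on a GPU would otherwise go through global memory between kernel launches.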

Coalesced memory access
- Optimal layout of Box fields
- Maximized cache efficiency
Occupancy optimization
- Automatic thread block size tuning
- Control of register usage
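
As a rough illustration of how block size and register usage interact, the following Python sketch uses a deliberately simplified occupancy model. The device limits (2048 resident threads and 65536 registers per multiprocessor) are assumed example values, and the model ignores shared memory, allocation granularity, and per-multiprocessor block limits; `occupancy` and `pick_block_size` are hypothetical helpers, not a Nyash API.

```python
def occupancy(block_size, regs_per_thread,
              max_threads_per_sm=2048, regs_per_sm=65536):
    """Fraction of the thread slots on one multiprocessor that can be filled.

    Simplified model: resident blocks are limited only by the thread
    count and by the register file (assumed example limits).
    """
    by_threads = max_threads_per_sm // block_size
    by_regs = regs_per_sm // (regs_per_thread * block_size)
    resident_blocks = min(by_threads, by_regs)
    return resident_blocks * block_size / max_threads_per_sm


def pick_block_size(regs_per_thread, candidates=(64, 128, 256, 512, 1024)):
    """Choose the candidate block size with the highest modelled occupancy."""
    return max(candidates, key=lambda b: occupancy(b, regs_per_thread))
```

In this model, raising register usage per thread lowers occupancy for larger blocks first, which is why the two bullets above are tuned together.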

Error Handling

gpu box SafeDiv {
    @gpu
    divide(a, b) {
        if b == 0 {
            // GPU exceptions are handled on the CPU side
            gpu.raise(DivisionByZeroError)
        }
        return a / b
    }
}
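
One common way to realize "GPU exceptions are handled on the CPU side" is for each thread to write an error code into a status buffer that the host inspects after the kernel finishes. The Python sketch below emulates that pattern; the status codes and the `divide_kernel` / `safe_divide` names are assumptions for illustration, and the source does not specify how `gpu.raise` is implemented.

```python
ERR_NONE = 0
ERR_DIV_BY_ZERO = 1


def divide_kernel(a, b):
    """Emulated kernel: each index is one thread; errors go to a status buffer."""
    out = [0.0] * len(a)
    status = [ERR_NONE] * len(a)
    for idx in range(len(a)):
        if b[idx] == 0:
            status[idx] = ERR_DIV_BY_ZERO   # record the error instead of raising
            continue
        out[idx] = a[idx] / b[idx]
    return out, status


def safe_divide(a, b):
    """Host side: run the kernel, then turn recorded errors into an exception."""
    out, status = divide_kernel(a, b)
    failed = [i for i, s in enumerate(status) if s != ERR_NONE]
    if failed:
        raise ZeroDivisionError(f"division by zero in threads {failed}")
    return out
```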

🔍 Challenges and Solutions

Challenge 1: Difficulty of Debugging

Solution: GPU execution tracing

// Record GPU execution in debug mode
local result = particles.gpuExecute("update", 0.016, debug: true)
print(result.trace)  // Execution history of each thread
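
What a per-thread trace might contain can be sketched with a CPU emulation. In the Python example below, `traced_update` and the shape of the trace entries are hypothetical; the source does not define the format of `result.trace`. Positions and velocities are scalars to keep the sketch short.

```python
def traced_update(positions, velocities, delta_time):
    """Emulated kernel that records what each 'thread' loaded and stored."""
    trace = []
    for idx in range(len(positions)):       # one iteration per emulated thread
        old = positions[idx]
        new = old + velocities[idx] * delta_time
        positions[idx] = new
        trace.append({"thread": idx, "load": old, "store": new})
    return positions, trace
```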

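Challenge 2: CPU/GPU Synchronization Overhead

Solution: Asynchronous execution and pipelining

// Run the GPU work asynchronously
local future = particles.gpuExecuteAsync("update", 0.016)
// Continue with other work on the CPU
doOtherWork()
// Fetch the result when it is needed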
local result = await future
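
The same submit, overlap, then wait pattern can be expressed with Python's standard `concurrent.futures` module. In this sketch, `gpu_update` is a stand-in for a real kernel launch and `do_other_work` is a placeholder; both are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def gpu_update(positions, delta_time):
    """Stand-in for a GPU kernel launch (runs on a worker thread here)."""
    return [p + delta_time for p in positions]


def do_other_work():
    """Placeholder for CPU work that overlaps with the GPU call."""
    return "cpu work done"


def run_overlapped(positions, delta_time):
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(gpu_update, positions, delta_time)  # start async work
        cpu_result = do_other_work()                             # overlap CPU work
        gpu_result = future.result()                             # block only when needed
    return cpu_result, gpu_result
```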

Challenge 3: Memory Limits

Solution: Streaming

// Process large data in chunks
largeData.gpuStream(chunkSize: 1_000_000)
    .map(process)
    .collect()
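
Chunked streaming can be sketched in Python as follows. `chunks` and `gpu_stream` are hypothetical helpers that mirror the shape of the call above; each chunk stands for one batch that would fit in GPU memory, and `process` is applied on the CPU here.

```python
from itertools import islice


def chunks(iterable, chunk_size):
    """Yield successive lists of at most `chunk_size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def gpu_stream(data, chunk_size, process):
    """Apply `process` chunk by chunk and collect the results in order."""
    results = []
    for chunk in chunks(data, chunk_size):      # only one chunk is resident at a time
        results.extend(process(x) for x in chunk)
    return results
```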

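🎓 Lowering the Learning Curve

Automatic GPU offloading
- The compiler automatically determines whether code can run on the GPU
- Hint display: "This Box can run on the GPU"
Incremental migration
- Existing code is guaranteed to keep running on the CPU
- Adding @gpu is all it takes to move code to the GPU

Profiling Support

// Visualize the benefit of GPU execution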
local profile = Profiler.compare(
    cpu: => particles.update(0.016),
    gpu: => particles.gpuExecute("update", 0.016)
)
print(profile.speedup)  // "GPU: 156.3x faster"
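
A comparison helper of this kind can be sketched with Python's `time.perf_counter`. The `compare` function and the keys of the returned dictionary are assumptions for illustration, not the `Profiler` API, and the two callables here are ordinary CPU functions standing in for the CPU and GPU paths.

```python
import time


def compare(cpu, gpu, repeats=5):
    """Time two callables and report the ratio of their best run times."""
    def best_time(fn):
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn()
            best = min(best, time.perf_counter() - start)
        return best

    cpu_seconds = best_time(cpu)
    gpu_seconds = best_time(gpu)
    # Guard against a zero measurement on very coarse timers.
    speedup = cpu_seconds / gpu_seconds if gpu_seconds > 0 else float("inf")
    return {"cpu_seconds": cpu_seconds, "gpu_seconds": gpu_seconds, "speedup": speedup}
```

Taking the best of several runs reduces noise from warm-up and scheduling; a real GPU measurement would also need to account for transfer time and wait for the kernel to finish before stopping the clock.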
