Phase 131-3 完了: MIR→LLVM lowering 棚卸し テスト結果マトリックス: - Case A (phase87_llvm_exe_min.hako): ✅ PASS (baseline) - Case B (loop_min_while.hako): ❌ TAG-EMIT (PHI after terminator) - Case B2 (print(42) simple): ✅ PASS (BoxCall works) - Case C (llvm_stage3_loop_only.hako): ❌ TAG-EMIT (JoinIR pattern gap) Critical Bugs: 1. Bug #1: PHI After Terminator (Case B) - 原因: function_lower.py が terminator を PHI より先に emit - 修正: 4-pass block emission (2-3h) 2. Bug #2: JoinIR Pattern Gap (Case C) - 原因: loop(true) { break } パターンが JoinIR 未対応 - 修正: Pattern 5 設計・実装 (3-4h) Next Actions: - P1 (推奨): PHI ordering 修正 → 80% のループを有効化 - P2: JoinIR Pattern 5 → infinite loop 対応 ドキュメント: - phase131-3-llvm-lowering-inventory.md: 詳細棚卸し結果 - phase87-selfhost-llvm-exe-line.md: LLVM IR parsing error 追記 - CURRENT_TASK.md: phase131-3 参照追加 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
19 KiB
Phase 87: LLVM Exe Line SSOT (2025-12-13)
Goal
Establish single source of truth for .hako → .o → executable → execution pipeline.
SSOT Tool: tools/build_llvm.sh
Prerequisites
-
llvm-config-18 available:
llvm-config-18 --version # Expected: 18.x.x -
hakorune built with LLVM features:
cargo build --release --features llvm ./target/release/hakorune --version # Check --backend llvm available in --help -
Python llvmlite (for LLVM harness):
python3 -c "import llvmlite; print(llvmlite.__version__)" # Expected: 0.40.0 or newer
Compiler Modes
tools/build_llvm.sh supports two compiler modes for LLVM object generation:
Harness (Default) - Production Ready
Python llvmlite-based LLVM IR generation
- Stability: ✅ Proven stable, battle-tested
- Build Time: Fast (~1-3s for minimal programs)
- Dependencies: Python 3, llvmlite, LLVM 18
- Use Case: Default for all production builds
Enable (default behavior):
# Explicit mode selection (optional):
NYASH_LLVM_COMPILER=harness tools/build_llvm.sh program.hako -o output
# Default (no env var needed):
tools/build_llvm.sh program.hako -o output
How it works:
hakorune --backend llvminvokes Python harnesssrc/llvm_py/llvm_builder.pygenerates LLVM IR via llvmlitellc-18compiles IR to object file
Crate (Experimental) - Rust-native Compiler
Pure Rust LLVM IR generation via crates/nyash-llvm-compiler
- Stability: ⚠️ Experimental, under active development
- Build Time: Slower (~5-10s, requires crate compilation)
- Dependencies: LLVM 18 dev libraries, Rust toolchain
- Use Case: Advanced users, development/testing
Enable:
NYASH_LLVM_COMPILER=crate tools/build_llvm.sh program.hako -o output
How it works:
hakorune --emit-mir-jsongenerates MIR JSONny-llvmc(Rust crate) reads JSON and emits LLVM IRllc-18compiles IR to object file
Advanced: Direct exe emission (experimental):
NYASH_LLVM_COMPILER=crate NYASH_LLVM_EMIT=exe \
tools/build_llvm.sh program.hako -o output
# Skips separate linking step, emits executable directly
Mode Comparison Table
| Feature | Harness (Default) | Crate (Experimental) |
|---|---|---|
| Stability | ✅ Production ready | ⚠️ Experimental |
| Build Time | Fast (1-3s) | Moderate (5-10s) |
| Dependencies | Python + llvmlite | LLVM dev + Rust |
| MIR JSON | Internal | Explicit generation |
| Direct exe | ❌ Not supported | ✅ Experimental |
| Recommended For | All users | Advanced/dev only |
Default recommendation: Use harness mode (no env vars needed).
Environment Variables Reference
環境変数の一覧は docs/reference/environment-variables.md の「LLVM Build Pipeline」セクションを参照してください。
主要な環境変数(クイックリファレンス):
NYASH_LLVM_COMPILER: コンパイラモード (harnessまたはcrate)NYASH_CLI_VERBOSE=1: 詳細ビルド出力を有効化NYASH_LLVM_ONLY_OBJ=1: オブジェクト生成後に停止- その他14変数の詳細は上記SSOTドキュメントを参照
Standard Procedure
Build and execute a .hako program to native executable:
# Step 1: Build .hako → executable
tools/build_llvm.sh apps/tests/your_program.hako -o tmp/your_program
# Step 2: Execute
./tmp/your_program
echo $? # Check exit code
What it does:
- Compiles
.hako→ MIR (hakorune compiler) - MIR → LLVM IR (llvmlite harness,
src/llvm_py/) - LLVM IR → object file
.o(llvm tools) - Links object → executable (clang)
Example: Minimal Program
File: apps/tests/phase87_llvm_exe_min.hako
static box Main {
main() {
return 42
}
}
Build:
tools/build_llvm.sh apps/tests/phase87_llvm_exe_min.hako -o tmp/phase87_test
Execute:
./tmp/phase87_test
echo $? # Output: 42
Detailed Pipeline Explanation
Step 1: Hako → MIR JSON
Command (internal to build_llvm.sh):
./target/release/hakorune --emit-mir-json tmp/program.json program.hako
Output: MIR JSON representation of the program
Step 2: MIR JSON → LLVM IR
Command (internal to build_llvm.sh):
python3 src/llvm_py/llvm_builder.py tmp/program.json -o tmp/program.ll
Output: LLVM IR text file (.ll)
Step 3: LLVM IR → Object File
Command (internal to build_llvm.sh):
llc-18 tmp/program.ll -o tmp/program.o -filetype=obj
Output: Object file (.o)
Step 4: Object File → Executable
Command (internal to build_llvm.sh):
clang-18 tmp/program.o -o tmp/program
Output: Native executable
Success/Failure Criteria
Success Indicators
A successful build via tools/build_llvm.sh exhibits:
1. Exit Code: 0
tools/build_llvm.sh program.hako -o output
echo $? # Should output: 0
2. All 4 Steps Complete:
[1/4] Building hakorune (feature selectable) ...
[2/4] Emitting object (.o) via LLVM backend ...
[3/4] Building Nyash Kernel static runtime ...
[4/4] Linking output ...
✅ Done: output
3. Executable Generated:
ls -lh output
# Should exist and be executable
file output
# Output: ELF 64-bit LSB executable, x86-64, dynamically linked
4. Executable Runs:
./output
echo $?
# Should match expected exit code (e.g., 42 for phase87_llvm_exe_min.hako)
Failure Modes
build_llvm.sh uses distinct exit codes for different failure types:
| Exit Code | Meaning | Common Cause |
|---|---|---|
| 0 | ✅ Success | Build completed normally |
| 1 | Usage error | Missing input file or invalid arguments |
| 2 | Missing dependency | llvm-config-18 not found |
| 3 | Compilation failure | Object file not generated (MIR/LLVM IR error) |
| Other | System/linking error | Linking failure, missing libraries |
Exit Code 1 - Usage Error:
tools/build_llvm.sh
# Output: Usage: tools/build_llvm.sh <input.hako> [-o <output>]
# Exit: 1
Exit Code 2 - Missing LLVM:
# When llvm-config-18 not installed
tools/build_llvm.sh program.hako -o output
# Output: error: llvm-config-18 not found (install LLVM 18 dev).
# Exit: 2
Exit Code 3 - Object Generation Failed:
# When MIR/LLVM IR compilation fails
tools/build_llvm.sh bad_program.hako -o output
# Output: error: object not generated: target/aot_objects/bad_program.o
# Exit: 3
Validation Commands
Verify object file validity:
# Check object file exists and has correct format
file target/aot_objects/program.o
# Expected: ELF 64-bit relocatable, x86-64
# Check object file symbols
nm target/aot_objects/program.o | grep -E '(main|nyash_)'
# Should show exported symbols
Verify LLVM IR validity (when using crate mode with JSON):
# Step 1: Generate LLVM IR (manual)
NYASH_LLVM_COMPILER=crate NYASH_LLVM_MIR_JSON=tmp/test.json \
tools/build_llvm.sh program.hako -o output
# Step 2: Validate LLVM IR
llvm-as-18 tmp/test.ll -o /dev/null
# Should complete without errors
# Step 3: Disassemble and inspect
llvm-dis-18 target/aot_objects/program.o -o - | less
# Should show valid LLVM IR
Verify MIR JSON validity (crate mode):
# Ensure MIR JSON is well-formed
jq . tmp/test.json > /dev/null
echo $? # Should output: 0
# Optional: Schema validation
NYASH_LLVM_VALIDATE_JSON=1 NYASH_LLVM_COMPILER=crate \
tools/build_llvm.sh program.hako -o output
Build Time Expectations
Typical build times for phase87_llvm_exe_min.hako (minimal program):
| Step | Expected Time | Notes |
|---|---|---|
| [1/4] Build hakorune | ~0.5-2s | Incremental build (release) |
| [2/4] Emit object | ~1-2s | Harness mode (llvmlite) |
| ~5-10s | Crate mode (ny-llvmc) | |
| [3/4] Build Nyash Kernel | ~1-3s | Incremental build (release) |
| [4/4] Linking | ~0.2-0.5s | Native linker (cc/clang) |
| Total | ~3-8s | Harness mode |
| ~7-15s | Crate mode |
First build: Add ~30-60s for initial cargo build --release compilation.
Performance factors:
- Parallel builds:
-j 24used by default (seebuild_llvm.sh) - Incremental builds:
CARGO_INCREMENTAL=1enabled - Cache hits: Subsequent builds much faster (~1-3s total)
Troubleshooting slow builds:
# Check cargo cache status
cargo clean --release -p nyash-rust
cargo clean --release -p nyash-llvm-compiler
# Rebuild with timing information
time tools/build_llvm.sh program.hako -o output
# Verbose output for bottleneck analysis
NYASH_CLI_VERBOSE=1 time tools/build_llvm.sh program.hako -o output
Troubleshooting
Issue: llvm-config-18 not found
Symptom: build_llvm.sh fails with "llvm-config-18: command not found"
Solution:
# Ubuntu/Debian:
sudo apt-get install llvm-18-dev llvm-18-tools
# macOS (Homebrew):
brew install llvm@18
export PATH="/opt/homebrew/opt/llvm@18/bin:$PATH"
# WSL (Ubuntu):
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 18
Issue: Python llvmlite not found
Symptom: ModuleNotFoundError: No module named 'llvmlite'
Solution:
pip3 install llvmlite
# If system-wide install fails, use virtual environment:
python3 -m venv venv
source venv/bin/activate
pip install llvmlite
Issue: Linking fails
Symptom: ld: symbol(s) not found for architecture
Check:
- Ensure clang-18 is installed
- Verify LLVM 18 libraries available:
llvm-config-18 --libdir ls $(llvm-config-18 --libdir)
Issue: MIR compilation error
Symptom: hakorune fails to compile .hako to MIR
Issue: LLVM IR parsing error(expected instruction opcode / PHI placement)
Symptom: llvmlite が生成した LLVM IR の parse に失敗する(例: expected instruction opcode)。
Next:
- まず棚卸しと代表ケース表を確認:
docs/development/current/main/phase131-3-llvm-lowering-inventory.md - 典型例: ループ + PHI が絡むケースで “PHI が terminator の後に出る” など、LLVM IR の不変条件違反が起きる
Debug:
# Test MIR generation manually:
./target/release/hakorune --emit-mir-json test.json test.hako
# Check error messages:
cat test.json # Should be valid JSON
Issue: LLVM IR generation error
Symptom: llvm_builder.py fails
Debug:
# Run Python builder manually:
python3 src/llvm_py/llvm_builder.py test.json -o test.ll
# Check LLVM IR validity:
llvm-as-18 test.ll -o /dev/null
# Should complete without errors
Debugging Build Pipeline
When build_llvm.sh fails, use these techniques to isolate the problem:
Enable Verbose Mode
Global verbose output:
NYASH_CLI_VERBOSE=1 tools/build_llvm.sh program.hako -o output
# Shows detailed command execution via set -x
Step-by-step verbosity:
# Verbose hakorune compilation
NYASH_CLI_VERBOSE=1 ./target/release/hakorune --emit-mir-json tmp/debug.json program.hako
# Verbose Python LLVM builder
python3 -v src/llvm_py/llvm_builder.py tmp/debug.json -o tmp/debug.ll
# Verbose LLVM compilation
llc-18 -debug tmp/debug.ll -o tmp/debug.o -filetype=obj
# Verbose linking
cc -v tmp/debug.o -L crates/nyash_kernel/target/release -lnyash_kernel -o output
Manual Step Tracing
Isolate each step to find exact failure point:
Step 1: Test MIR emission:
./target/release/hakorune --emit-mir-json tmp/test.json program.hako
echo $? # Should be 0
jq . tmp/test.json # Validate JSON
Step 2: Test LLVM IR generation:
# Harness mode (default)
NYASH_LLVM_USE_HARNESS=1 ./target/release/hakorune --backend llvm program.hako
# Check exit code
# Crate mode
cargo build --release -p nyash-llvm-compiler
./target/release/ny-llvmc --in tmp/test.json --out tmp/test.o
file tmp/test.o # Should be ELF object
Step 3: Test object compilation:
# If .ll file available (crate mode intermediate)
llc-18 -filetype=obj tmp/test.ll -o tmp/test.o
file tmp/test.o # Verify ELF format
nm tmp/test.o # Check symbols
Step 4: Test linking:
# Ensure Nyash Kernel built
cd crates/nyash_kernel && cargo build --release
cd ../..
# Manual link
cc tmp/test.o \
-L crates/nyash_kernel/target/release \
-Wl,--whole-archive -lnyash_kernel -Wl,--no-whole-archive \
-lpthread -ldl -lm -o tmp/manual_output
# Test execution
./tmp/manual_output
echo $?
Save Intermediate Files
Preserve all build artifacts for inspection:
# Create debug directory
mkdir -p debug_build
# Step 1: Emit MIR JSON
./target/release/hakorune --emit-mir-json debug_build/program.json program.hako
# Step 2: Generate LLVM IR (harness mode, manual Python call)
python3 src/llvm_py/llvm_builder.py debug_build/program.json -o debug_build/program.ll
# Step 3: Compile to object
llc-18 debug_build/program.ll -o debug_build/program.o -filetype=obj
# Step 4: Link
cc debug_build/program.o \
-L crates/nyash_kernel/target/release \
-Wl,--whole-archive -lnyash_kernel -Wl,--no-whole-archive \
-lpthread -ldl -lm -o debug_build/program
# Inspect all intermediate files
ls -lh debug_build/
file debug_build/*
Inspect saved artifacts:
# View MIR JSON structure
jq '.functions[] | {name: .name, blocks: .blocks | length}' debug_build/program.json
# View LLVM IR
less debug_build/program.ll
# Disassemble object file
objdump -d debug_build/program.o | less
# Check symbols
nm debug_build/program.o
nm debug_build/program
Common LLVM IR Issues
Problem 1: Invalid function signature
error: expected type '...' but found '...'
Diagnosis: MIR → LLVM IR type mismatch
Fix: Check MIR JSON functions[].signature, ensure correct types
Problem 2: Undefined symbol
error: undefined reference to 'nyash_...'
Diagnosis: Missing Nyash Kernel runtime symbols Fix:
# Rebuild Nyash Kernel
cd crates/nyash_kernel && cargo clean && cargo build --release
# Verify symbols available
nm crates/nyash_kernel/target/release/libnyash_kernel.a | grep nyash_
Problem 3: Invalid IR instruction
error: invalid IR instruction '...'
Diagnosis: Python llvm_builder.py bug or unsupported MIR instruction Fix:
# Check LLVM IR syntax
llvm-as-18 -o /dev/null debug_build/program.ll
# Error message shows exact line number
# Inspect problematic instruction
sed -n '<line>p' debug_build/program.ll
Problem 4: Linking failure
ld: symbol(s) not found for architecture x86_64
Diagnosis: Missing system libraries or incorrect link order Fix:
# Check what symbols are needed
nm -u debug_build/program.o
# Verify Nyash Kernel provides them
nm crates/nyash_kernel/target/release/libnyash_kernel.a | grep <symbol>
# If system library missing, add to NYASH_LLVM_LIBS
NYASH_LLVM_LIBS="-lmissing_lib" tools/build_llvm.sh program.hako -o output
Environment Variables for Debugging
Combine multiple debugging flags:
# Maximum verbosity + preserve artifacts
NYASH_CLI_VERBOSE=1 \
NYASH_LLVM_COMPILER=crate \
NYASH_LLVM_MIR_JSON=/tmp/debug.json \
NYASH_LLVM_VALIDATE_JSON=1 \
tools/build_llvm.sh program.hako -o /tmp/debug_output
# Then inspect intermediate files
ls -lh /tmp/debug*
jq . /tmp/debug.json
cat /tmp/debug.ll
Recommended debugging workflow:
- Enable
NYASH_CLI_VERBOSE=1for initial diagnosis - Use manual step tracing to isolate failure
- Save intermediate files for inspection
- Check LLVM IR validity with
llvm-as-18 - Verify object symbols with
nm - Test linking manually with verbose
cc -v
What NOT to Do
❌ DO NOT create custom link procedures:
- Scattered linking logic across multiple scripts
- Manual
clanginvocations outsidebuild_llvm.sh - Duplicate .o → exe pipelines
❌ DO NOT bypass build_llvm.sh:
- Direct llvm_builder.py invocations for production
- Custom shell scripts for one-off builds
- Hardcoded paths in makefiles
✅ DO use tools/build_llvm.sh for all LLVM exe generation
Integration Test
Location: tools/smokes/v2/profiles/integration/apps/phase87_llvm_exe_min.sh
What it tests:
- Full pipeline: .hako → exe → execution
- Exit code verification (42)
- SKIP if LLVM unavailable (graceful degradation)
Run manually:
tools/smokes/v2/run.sh --profile integration --filter 'phase87_llvm_exe_min\.sh'
Expected outcomes:
- PASS: If llvm-config-18 available → exit code 42 verified
- SKIP: If llvm-config-18 not found → graceful skip message
Why Exit Code 42?
- Stdout-independent: Works even if stdout is redirected/buffered
- Deterministic: No parsing required, simple integer comparison
- Traditional: Unix convention for testable exit codes
- Minimal: No dependencies on print/console boxes
Advanced Usage
Custom output location
# Default: output to tmp/
tools/build_llvm.sh program.hako -o custom/path/binary
# Ensure directory exists first:
mkdir -p custom/path
Debugging build steps
Set verbose mode (if supported by build_llvm.sh):
VERBOSE=1 tools/build_llvm.sh program.hako -o output
Check intermediate files:
# MIR JSON:
ls -lh tmp/*.json
# LLVM IR:
ls -lh tmp/*.ll
# Object file:
ls -lh tmp/*.o
Comparing with VM backend
VM execution (interpreted):
./target/release/hakorune --backend vm program.hako
echo $?
LLVM execution (native):
tools/build_llvm.sh program.hako -o tmp/program
./tmp/program
echo $?
Should produce identical exit codes for correct programs.
Performance Characteristics
Build time: ~1-3 seconds for minimal programs
- .hako → MIR: ~100ms
- MIR → LLVM IR: ~500ms
- LLVM IR → .o: ~1s
- Linking: ~200ms
Execution time: Native speed (no VM overhead)
- Typical speedup: 10-100x vs VM backend
- No JIT warmup required
- Full LLVM optimizations applied
Related Documentation
- LLVM Python harness:
src/llvm_py/README.md - MIR spec:
docs/reference/mir/INSTRUCTION_SET.md - Integration smokes:
tools/smokes/v2/README.md - build_llvm.sh implementation:
tools/build_llvm.sh(read source for details)
SSOT Principle
Single Source of Truth:
- ONE script:
tools/build_llvm.sh - ONE pipeline: .hako → MIR → LLVM IR → .o → exe
- ONE integration test:
phase87_llvm_exe_min.sh
Benefits:
- Maintainability: Update one script, not scattered logic
- Consistency: All LLVM builds use same pipeline
- Testability: Single smoke test covers full pipeline
- Documentation: One canonical reference
Anti-patterns to avoid:
- Multiple competing build scripts
- Copy-pasted linking commands
- Ad-hoc shell scripts for "quick builds"
Status
- ✅ SSOT established:
tools/build_llvm.sh - ✅ Integration smoke added:
phase87_llvm_exe_min.sh - ✅ Documentation complete
- ✅ Prerequisites verified: llvm-config-18, llvmlite, LLVM features
- 🎯 Production ready: Use for all LLVM native compilations
Future Enhancements (Out of Scope for Phase 87)
- Optimization levels: -O0, -O1, -O2, -O3 flags
- Debug symbols: -g flag support
- Static linking: --static flag
- Cross-compilation: --target flag
- LTO: Link-time optimization support
Current scope: Baseline SSOT pipeline establishment only.