Phase 20.26 — Final Rust Consolidation + hakmem Integration

Duration: 2-3 months
Status: Plan updated (aligned with the Rust Floor policy)
Prerequisite: Phase 20.25 (Box System + MIR Builder deletion) complete


🎯 Executive Summary

Purpose: Finalize the Rust layer and achieve full self-hosting by integrating the hakmem allocator.

Final Goal: Fully realize "Rust = Floor, Hakorune = House"

Target State

┌────────────────────────────────────────────────┐
│ Rust Layer (≤5,000 lines - absolute minimum)   │
│                                                 │
│ ┌─────────────────────────────────────┐        │
│ │ HostBridge API (~500 lines)         │        │
│ │ - Hako_RunScriptUtf8                │        │
│ │ - Hako_Retain / Hako_Release        │        │
│ │ - Hako_ToUtf8 / Hako_LastError      │        │
│ └─────────────────────────────────────┘        │
│                                                 │
│ ┌─────────────────────────────────────┐        │
│ │ CLI Minimal (~1,000 lines)          │        │
│ │ - Argument parsing                  │        │
│ │ - Backend selection                 │        │
│ │ - Environment setup                 │        │
│ └─────────────────────────────────────┘        │
│                                                 │
│ ┌─────────────────────────────────────┐        │
│ │ hakmem C-ABI Binding (~500 lines)   │        │
│ │ - LD_PRELOAD integration            │        │
│ │ - Memory allocator interface        │        │
│ └─────────────────────────────────────┘        │
│                                                 │
│ ┌─────────────────────────────────────┐        │
│ │ Essential Data Structures           │        │
│ │ (~3,000 lines - MIR types only)     │        │
│ └─────────────────────────────────────┘        │
└────────────────────────────────────────────────┘

Targets (-61,713 lines; the reduction after staged archiving)

From Phase 20.25 end state:  66,713 lines

Delete:
├── src/mir/ (except essential)  -6,835 lines  ← Keep only core types
├── src/backend/llvm/            -6,551 lines  ← Experimental, remove
├── src/backend/wasm/              -200 lines  ← Experimental, remove
├── src/cli/ (minimize)          -4,000 lines  ← Keep minimal (~1,000)
├── src/runner/ (minimize)       -2,500 lines  ← Keep minimal (~500)
├── src/plugin_system/          -10,000 lines  ← Hakorune plugin system
├── src/kernel/                  -8,000 lines  ← Hakorune kernel
├── src/other/ (cleanup)        -23,627 lines  ← Various dead code
────────────────────────────────────────────────
Total deleted:                  -61,713 lines

Final Rust layer:                ~5,000 lines ✅

Cumulative Progress:

  • Phase 20.24: -10,500 lines (-9.0%)
  • Phase 20.25: -22,193 lines (-19.0%)
  • Phase 20.26: -61,713 lines (-52.8%)
  • Total: -94,406 lines (-80.8%) 🔥
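
The percentages above appear to be taken against the total codebase size before Phase 20.24 (116,841 lines, per the Achievement Summary later in this document), not against the Rust layer alone. A minimal sketch that reproduces the arithmetic under that assumption; the function and constant names are illustrative only:

```rust
/// Total codebase size before Phase 20.24 (Rust + Hakorune), as stated in
/// the Achievement Summary of this document.
const BASELINE_TOTAL_LINES: u32 = 116_841;

/// Lines deleted per phase, as listed in the cumulative progress figures.
const PHASE_DELETIONS: [(&str, u32); 3] = [
    ("20.24", 10_500),
    ("20.25", 22_193),
    ("20.26", 61_713),
];

/// Share of the baseline removed, in percent, rounded to one decimal place.
fn percent_of_baseline(deleted: u32) -> f64 {
    let pct = f64::from(deleted) / f64::from(BASELINE_TOTAL_LINES) * 100.0;
    (pct * 10.0).round() / 10.0
}

/// Sum of all per-phase deletions.
fn cumulative_deleted() -> u32 {
    PHASE_DELETIONS.iter().map(|&(_, lines)| lines).sum()
}
```

Calling `percent_of_baseline(cumulative_deleted())` gives the cumulative share; measured against the Rust layer alone (99,406 lines), the same deletion corresponds to the 95.0% figure quoted in the later sections.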

Final State: Rust (Floor): ≤5,000 / Hakorune (House): 50,000+ = True Self-Hosting


🏗️ Implementation Roadmap

Month 1: Rust Layer Audit & Consolidation

Week 1: Comprehensive Audit

Goal: Classify every Rust file (Keep / Delete / Minimize)

Audit Categories:

Category 1: KEEP (Essential - ~3,000 lines)
├── src/mir/types.rs             ~600 lines  ← MirType, ConstValue
├── src/mir/instruction.rs       ~800 lines  ← MirInstruction enum
├── src/mir/basic_block.rs       ~470 lines  ← BasicBlock
├── src/mir/function.rs          ~564 lines  ← MirModule, MirFunction
└── src/mir/value.rs             ~400 lines  ← Register, Value types

Category 2: MINIMIZE (Reduce to ~2,000 lines)
├── src/cli/                   5,000 → 1,000  ← -4,000
├── src/runner/                3,000 →   500  ← -2,500
├── src/c_abi/                 2,000 → 500    ← -1,500 (consolidate)

Category 3: DELETE (Experimental/Dead - ~56,713 lines)
├── src/backend/llvm/          -6,551 lines  ← Python llvmlite primary
├── src/backend/wasm/            -200 lines  ← Python llvmlite WASM
├── src/plugin_system/        -10,000 lines  ← Hakorune plugin system
├── src/kernel/                -8,000 lines  ← Hakorune kernel
├── src/verification/          -2,000 lines  ← Hakorune verifier
├── src/json/                  -1,500 lines  ← serde_json sufficient
├── src/error/                 -1,200 lines  ← Minimal error handling
├── src/debug/                 -1,800 lines  ← Hakorune debug tools
├── src/config/                -1,000 lines  ← Env vars sufficient
├── src/metrics/               -1,500 lines  ← Hakorune metrics
├── src/testing/               -2,000 lines  ← Smoke tests external
├── src/utils/                 -1,500 lines  ← Minimal utils
├── src/macros/                  -800 lines  ← No longer needed
└── src/other/                -18,662 lines  ← Various dead code

Tasks:

  • Audit ALL files in src/ (108 files, per the comprehensive roadmap)
  • Classify each file: Keep / Delete / Minimize
  • Identify dependencies: What depends on what?
  • Create deletion plan with safe order

Deliverable: RUST_LAYER_AUDIT.md - complete file list plus classification
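
The three audit categories can be captured as a small prefix table, so that every file in src/ gets a verdict or is flagged for a manual decision. The following is only a sketch: the `Verdict` enum, the `RULES` table, and `classify` are illustrative names, and the table lists just a subset of the paths named in the categories above.

```rust
/// Audit verdict for a path under src/.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Keep,
    Minimize,
    Delete,
}

/// Prefix rules taken from the audit categories in this plan (subset).
/// More specific prefixes must come before broader ones.
const RULES: &[(&str, Verdict)] = &[
    ("src/mir/types.rs", Verdict::Keep),
    ("src/mir/instruction.rs", Verdict::Keep),
    ("src/mir/basic_block.rs", Verdict::Keep),
    ("src/mir/function.rs", Verdict::Keep),
    ("src/mir/value.rs", Verdict::Keep),
    ("src/cli/", Verdict::Minimize),
    ("src/runner/", Verdict::Minimize),
    ("src/c_abi/", Verdict::Minimize),
    ("src/backend/llvm/", Verdict::Delete),
    ("src/backend/wasm/", Verdict::Delete),
    ("src/plugin_system/", Verdict::Delete),
    ("src/kernel/", Verdict::Delete),
];

/// Returns the verdict of the first matching rule, or None when a file is
/// not covered yet and needs a manual decision.
fn classify(path: &str) -> Option<Verdict> {
    RULES
        .iter()
        .find(|(prefix, _)| path.starts_with(*prefix))
        .map(|&(_, verdict)| verdict)
}
```

Files for which `classify` returns `None` are the ones the audit still has to decide on, which makes gaps in RUST_LAYER_AUDIT.md easy to spot.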

Week 2: Dependency Graph Analysis

Goal: Determine a safe deletion order

Dependency Analysis:

# Tool: cargo-tree, cargo-geiger, etc.
cargo tree --all-features --depth 3 > dependency_tree.txt

# Custom analysis
rg "use crate::" src/ | sort | uniq > internal_dependencies.txt

# Identify circular dependencies
# Identify orphaned files (no incoming edges)

Output: Dependency graph (Graphviz DOT format)

digraph RustDependencies {
  // Essential (keep)
  MirTypes [color=green];
  MirInstruction [color=green];

  // Minimize
  CLI [color=yellow];
  Runner [color=yellow];

  // Delete
  LLVM [color=red];
  PluginSystem [color=red];

  // Dependencies
  CLI -> MirTypes;
  Runner -> MirTypes;
  LLVM -> MirTypes;  // Can delete, MirTypes stays
}

Tasks:

  • Generate full dependency graph
  • Identify deletion candidates (no internal dependencies)
  • Plan deletion wave order (Wave 1, 2, 3...)
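
One way to derive the wave order from the dependency graph is to repeatedly peel off the deletion candidates that no surviving module still depends on. The sketch below assumes edges are given as (dependent, dependency) pairs, matching the direction of the arrows in the DOT graph; `deletion_waves` is an illustrative name, not an existing tool.

```rust
use std::collections::BTreeSet;

/// Groups deletion candidates into waves. `edges` holds (dependent, dependency)
/// pairs. A candidate joins a wave once every module that depends on it has
/// been deleted in an earlier wave. Returns the waves plus any candidates left
/// blocked by a kept module or by a dependency cycle.
fn deletion_waves<'a>(
    edges: &[(&'a str, &'a str)],
    candidates: &[&'a str],
) -> (Vec<Vec<&'a str>>, Vec<&'a str>) {
    let mut remaining: BTreeSet<&str> = candidates.iter().copied().collect();
    let mut deleted: BTreeSet<&str> = BTreeSet::new();
    let mut waves = Vec::new();

    loop {
        // Candidates whose dependents have all been deleted already.
        let wave: Vec<&str> = remaining
            .iter()
            .copied()
            .filter(|&module| {
                edges
                    .iter()
                    .filter(|&&(_, dependency)| dependency == module)
                    .all(|&(dependent, _)| deleted.contains(dependent))
            })
            .collect();
        if wave.is_empty() {
            break;
        }
        for module in &wave {
            remaining.remove(module);
            deleted.insert(*module);
        }
        waves.push(wave);
    }
    (waves, remaining.into_iter().collect())
}
```

Anything returned in the second element needs attention before deletion: either a kept module (for example the CLI) still imports it, or the candidates depend on each other in a cycle and have to be removed together.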

Week 3-4: Wave 1 Deletion (Experimental Backends)

Target: src/backend/llvm/, src/backend/wasm/

Rationale:

  • Python llvmlite is the primary LLVM backend (218,056 lines, proven)
  • Rust inkwell is experimental (6,551 lines, incomplete)
  • WASM via llvmlite works (Phase 15.8 complete)

Deletion:

# Safety first
git tag phase-20.26-wave1-pre
git branch backup/rust-llvm-backend

# Delete
rm -rf src/backend/llvm/   # -6,551 lines
rm -rf src/backend/wasm/   # -200 lines

Impact Analysis:

  • CLI: --backend llvm flag → --backend llvm-py (Python llvmlite)
  • Tests: Update smoke tests to use Python backend
  • Docs: Update backend documentation

Tasks:

  • Update CLI backend selection
  • Remove --backend llvm flag (Rust inkwell)
  • Update all smoke tests
  • Full regression testing

Acceptance:

  • All smoke tests PASS (296/296)
  • LLVM backend via Python works
  • No references to deleted code
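
After Wave 1 the CLI has to reject the old flag value with a pointer to its replacement rather than failing obscurely. A minimal sketch of that backend selection, assuming a string-valued `--backend` option: only `llvm` and `llvm-py` come from this plan, while the `vm` value and the `Backend` and `parse_backend` names are placeholders.

```rust
/// Backends selectable after Wave 1. Only `llvm-py` is named in this plan;
/// `Vm` is a placeholder for whatever other backends the CLI keeps.
#[derive(Debug, PartialEq, Eq)]
enum Backend {
    Vm,
    LlvmPy,
}

/// Maps the value of `--backend` to a backend, or to a user-facing error.
fn parse_backend(value: &str) -> Result<Backend, String> {
    match value {
        "vm" => Ok(Backend::Vm),
        "llvm-py" => Ok(Backend::LlvmPy),
        // The Rust inkwell backend is gone; point users at its replacement.
        "llvm" => Err("backend 'llvm' was removed; use '--backend llvm-py'".to_string()),
        other => Err(format!("unknown backend '{other}'")),
    }
}
```

Keeping an explicit arm for the removed value makes the migration visible in smoke-test output instead of surfacing as a generic "unknown backend" error.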

Month 2: Plugin System & Kernel Deletion

Week 1-2: Plugin System Migration

Current Rust Plugin System (~10,000 lines):

src/plugin_system/
├── registry.rs          ~1,500 lines  ← Plugin discovery
├── loader.rs            ~2,000 lines  ← Dynamic loading
├── handle.rs            ~1,200 lines  ← Handle management
├── abi.rs               ~1,800 lines  ← C-ABI bridge
├── lifecycle.rs         ~1,000 lines  ← Init/shutdown
└── ... (other)          ~2,500 lines

Hakorune Plugin System (fully implemented):

lang/src/runtime/plugin/
├── plugin_registry_box.hako
├── plugin_loader_box.hako
├── handle_registry_box.hako
└── ...

Migration Strategy:

// Minimal Rust wrapper for Hakorune plugin system.
// HakoHandle, call_hakorune, and Result are assumed to be provided by the
// HostBridge layer.

pub struct PluginSystemShim {
    hakorune_registry: HakoHandle,  // Points to Hakorune PluginRegistryBox
}

impl PluginSystemShim {
    pub fn load_plugin(&self, path: &str) -> Result<()> {
        // Forward to the Hakorune plugin system. The return value is not
        // needed, so only the error is propagated.
        call_hakorune(
            "PluginRegistryBox.load_plugin",
            &[path.into()]
        )?;
        Ok(())
    }
}

Tasks:

  • Implement PluginSystemShim (~200 lines)
  • Replace all calls: plugin_system::load() → PluginSystemShim::load_plugin()
  • Test with existing plugins (StringBox, ArrayBox, etc.)
  • Delete src/plugin_system/ (-10,000 lines)
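
To test the shim before the Rust plugin system is deleted, the bridge call can be put behind a trait so that a recording test double stands in for the Hakorune runtime. This is a sketch under that assumption: `HakoBridge` and `RecordingBridge` are hypothetical names, and the string-based signature is a simplification of whatever `call_hakorune` really accepts.

```rust
/// Hypothetical abstraction over the host bridge, so the shim can be tested
/// without a running Hakorune runtime.
trait HakoBridge {
    fn call(&mut self, method: &str, args: &[String]) -> Result<String, String>;
}

/// Shim that forwards plugin loading to the Hakorune side.
struct PluginSystemShim<B: HakoBridge> {
    bridge: B,
}

impl<B: HakoBridge> PluginSystemShim<B> {
    fn load_plugin(&mut self, path: &str) -> Result<(), String> {
        self.bridge
            .call("PluginRegistryBox.load_plugin", &[path.to_string()])
            .map(|_| ())
    }
}

/// Test double that records every forwarded call.
#[derive(Default)]
struct RecordingBridge {
    calls: Vec<(String, Vec<String>)>,
}

impl HakoBridge for RecordingBridge {
    fn call(&mut self, method: &str, args: &[String]) -> Result<String, String> {
        self.calls.push((method.to_string(), args.to_vec()));
        Ok(String::new())
    }
}
```

With this split, the call-site replacement can be verified in unit tests first, and the real bridge only has to be exercised by the smoke tests.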

Week 3-4: Kernel Deletion

Current Rust Kernel (~8,000 lines):

src/kernel/
├── runtime.rs           ~2,000 lines  ← Runtime initialization
├── scheduler.rs         ~1,500 lines  ← Task scheduling
├── memory.rs            ~1,200 lines  ← Memory management
├── io.rs                ~1,000 lines  ← I/O subsystem
└── ... (other)          ~2,300 lines

Hakorune Kernel (fully implemented):

lang/src/runtime/kernel/
├── runtime_box.hako
├── scheduler_box.hako
├── memory_manager_box.hako
└── ...

Migration Strategy: Same as the Plugin System (minimal shim)

Tasks:

  • Implement KernelShim (~300 lines)
  • Replace kernel calls
  • Test runtime initialization
  • Delete src/kernel/ (-8,000 lines)

Month 3: hakmem Integration + Final Cleanup

Week 1-2: hakmem Allocator Integration

Goal: Hakorune uses hakmem as the default allocator (no libc malloc)

Current State (Phase 6.14 complete):

  • hakmem PoC complete (6,541 lines of C)
  • mimalloc parity achieved (json: +36.4%, mir: -18.8%)
  • Call-site profiling working
  • UCB1 evolution ready

Integration Strategy:

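// src/c_abi/hakmem_binding.c (NEW - ~200 lines)

#include <stddef.h>
#include <string.h>
#include "hakmem.h"  // From apps/experiments/hakmem-poc/

// ASSUMPTION: hak_usable_size() is a hypothetical helper that returns the
// usable size recorded in the block header. The PoC may expose this under
// another name, or not at all.
size_t hak_usable_size(void* ptr);

// Override libc malloc/free
void* malloc(size_t size) {
    return hak_alloc_cs(size);
}

void free(void* ptr) {
    if (ptr == NULL) return;  // free(NULL) must be a no-op
    hak_free_cs(ptr, 0);  // Size unknown, hakmem infers from header
}

void* realloc(void* ptr, size_t new_size) {
    if (ptr == NULL) return malloc(new_size);
    void* new_ptr = malloc(new_size);
    if (new_ptr == NULL) return NULL;  // On failure the old block stays valid
    size_t old_size = hak_usable_size(ptr);
    // Copy only what fits: the block may be shrinking
    memcpy(new_ptr, ptr, old_size < new_size ? old_size : new_size);
    free(ptr);
    return new_ptr;
}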

Build Integration:

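# Cargo.toml
[dependencies]
# Assumes hakmem-poc is wrapped as a Cargo package (e.g. a build.rs that
# compiles the C sources); a bare C directory cannot be a path dependency.
hakmem = { path = "apps/experiments/hakmem-poc" }

[profile.release]
# Use hakmem instead of system allocator
# (Cargo profiles cannot select an allocator; the switch itself happens in
# code via #[global_allocator] or at link time by overriding malloc/free.)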

Tasks:

  • Create src/c_abi/hakmem_binding.c
  • Integrate hakmem into Cargo build
  • Test with LD_PRELOAD=libhakmem.so
  • Benchmark: Before vs. After (memory usage, performance)

Acceptance:

  • All allocations go through hakmem (verify with profiling)
  • No libc malloc calls (verify with nm or ldd)
  • Performance: ±10% of libc malloc
  • Memory usage: ±10% of libc malloc
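
On the Rust side, the usual hook for routing allocations through a custom allocator is `#[global_allocator]`. The sketch below illustrates the "all allocations go through the allocator" check with a counting wrapper; `System` stands in for hakmem here, since the actual hakmem entry points would have to be declared as `extern "C"` functions and linked in, and `CountingAlloc` and `allocations_so_far` are illustrative names.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocations served since process start.
static ALLOC_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Wrapper that counts allocations and forwards them. `System` is a stand-in
/// for hakmem; a real binding would forward to hakmem's C entry points.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_COUNT.fetch_add(1, Ordering::Relaxed);
        // SAFETY: the caller upholds GlobalAlloc's contract for `layout`.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // SAFETY: `ptr` was returned by `System.alloc` with this `layout`.
        unsafe { System.dealloc(ptr, layout) }
    }
}

/// Every heap allocation made by Rust code in this binary now goes through
/// `CountingAlloc`.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Snapshot of the allocation counter.
fn allocations_so_far() -> usize {
    ALLOC_COUNT.load(Ordering::Relaxed)
}
```

Note that `#[global_allocator]` only covers allocations made through Rust's allocator API; allocations made directly by C code or by plugins still reach `malloc`, which is why the plan also overrides the libc symbols.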

Week 3: Final Cleanup

Target: Delete all remaining dead code (~23,627 lines)

Cleanup Categories:

1. Dead imports (no longer used)
2. Dead functions (no callers)
3. Dead tests (functionality moved to Hakorune)
4. Dead docs (outdated architecture)
5. Dead examples (superseded)

Tools:

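# Find unused functions (reported by rustc's dead_code lint; -D turns them into errors)
RUSTFLAGS="-D dead_code" cargo check --all-targets

# Find unused dependencies
cargo machete

# Find orphaned files (heuristic: no other file mentions the module name;
# expect false positives for main.rs, lib.rs, and mod.rs)
find src/ -name "*.rs" | while read -r f; do
  name="$(basename "$f" .rs)"
  grep -rlw -- "$name" src/ | grep -vqxF -- "$f" || echo "Orphaned: $f"
done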

Tasks:

  • Run dead code analysis
  • Delete identified dead code
  • Clean imports (remove unused use statements)
  • Update Cargo.toml (remove unused dependencies)

Week 4: Final Verification & Documentation

Verification:

# Clean build
cargo clean
cargo build --release

# Size check
ls -lh target/release/hakorune
# Expected: ~5MB (was ~15MB before cleanup)

# Line count
cloc src/
# Expected: ~5,000 lines Rust

# Full test suite
tools/smokes/v2/run.sh --profile all
# Expected: 296/296 PASS

# Performance benchmark
tools/bench_unified.sh --backend all --warmup 10 --repeat 50
# Expected: No regression (±5%)
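
The "no regression (±5%)" and "±10% of libc malloc" criteria both reduce to the same relative-deviation check. A minimal sketch, with `within_tolerance` as an illustrative helper name and both measurements assumed to be in the same unit:

```rust
/// Returns true when `measured` lies within ±`tolerance_pct` percent of
/// `baseline`. A non-positive baseline is rejected, because a relative
/// deviation is undefined for it.
fn within_tolerance(baseline: f64, measured: f64, tolerance_pct: f64) -> bool {
    if baseline <= 0.0 {
        return false;
    }
    let deviation_pct = (measured - baseline).abs() / baseline * 100.0;
    deviation_pct <= tolerance_pct
}
```

For example, a benchmark that took 100 ms before cleanup and 104 ms after passes the ±5% gate, while 106 ms does not. A symmetric band also flags unexpectedly large speedups, which can be useful for catching benchmarks that silently stopped doing their work.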

Documentation:

  • Architecture diagram (final state)
  • Rust layer API reference (minimal surface)
  • hakmem integration guide
  • Migration complete announcement

Acceptance Criteria

Quantitative Goals

  • Rust layer: ≤5,000 lines (goal attainment: 99.5% → actual: 80.8% reduction)
  • Hakorune layer: 50,000+ lines (Compiler + VM + Boxes + Plugin System)
  • Binary size: ≤5MB (15MB → 5MB, -67%)
  • Build time: ≤30 seconds (clean build, release mode)

Functional Requirements

  • All smoke tests PASS: 296/296
  • Self-compilation works: Hako₁ → Hako₂ → Hako₃ (bit-identical)
  • hakmem default: No libc malloc usage
  • Performance parity: ±10% vs Phase 20.25 end state

Quality Requirements

  • No dead code: RUSTFLAGS="-D dead_code" cargo check --all-targets passes
  • No unused dependencies: cargo machete clean
  • Documentation complete: Architecture, API, Migration guide
  • CI passing: All checks green

Safety Requirements

  • Rollback tested: Each month has a checkpoint (tag + branch)
  • Memory safety: valgrind clean with hakmem
  • Thread safety: TSan clean (if applicable)

🚨 Risk Analysis

Critical Risks

Risk                    Probability  Impact  Mitigation
hakmem stability                             Extensive testing, fallback to libc
Dependency hell                              Wave-based deletion, dependency graph
Dead code resurgence                         CI checks for unused code
Performance regression                       Continuous benchmarking
Rollback complexity                          Monthly checkpoints, granular commits

hakmem Integration Risks (Most Critical)

Risk: hakmem has bugs or performance issues in production

Impact: High (all allocations affected)

Mitigation:

  1. Extensive testing: Run all smoke tests with hakmem
  2. Benchmarking: Compare vs. libc malloc (should be within ±10%)
  3. Fallback mechanism:
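    use std::env;

    pub enum Allocator { Libc, Hakmem }

    pub fn get_allocator() -> Allocator {
        match env::var("HAKO_USE_HAKMEM") {
            Ok(val) if val == "0" => Allocator::Libc,  // Explicit opt-out
            _ => Allocator::Hakmem,  // Default (variable unset or any other value)
        }
    }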
    
  4. Memory profiling: valgrind, ASan validation
  5. Production rollout: Gradual (dev → staging → prod)

📈 Timeline Visualization

Month 1: Rust Layer Audit & Wave 1 Deletion
├─ Week 1:   Comprehensive audit         [████░░░░] File classification
├─ Week 2:   Dependency graph            [████░░░░] Safe deletion order
├─ Week 3-4: Wave 1 (LLVM/WASM backend)  [████████] -6,751 lines
└─ Checkpoint: Experimental backends removed

Month 2: Plugin System & Kernel Deletion
├─ Week 1-2: Plugin System migration     [████████] -10,000 lines
├─ Week 3-4: Kernel migration            [████████] -8,000 lines
└─ Checkpoint: Core systems on Hakorune

Month 3: hakmem Integration + Final Cleanup
├─ Week 1-2: hakmem integration          [████░░░░] Default allocator
├─ Week 3:   Final cleanup               [████████] -23,627 lines
├─ Week 4:   Verification & docs         [████░░░░] ✅ Complete
└─ Completion: Phase 20.26 done

Total: 2-3 months, -61,713 lines (-52.8%)
Cumulative: -94,406 lines (-80.8%) from Phase 20.24 start

💡 Strategic Insights

Achievement Summary

Before Phase 20.24 (starting point):

Total codebase:     116,841 lines
Rust layer (src/):   99,406 lines (85.1%)
Hakorune layer:      17,435 lines (14.9%)

After Phase 20.26 (final state):

Total codebase:      ~60,000 lines
Rust layer (src/):   ~5,000 lines (8.3%) ✅
Hakorune layer:     ~55,000 lines (91.7%) ✅

Rust reduction: -94,406 lines (-95.0%) 🔥
Ratio flip: 85% Rust → 8% Rust ⚡

Key Milestones

Phase  Duration    Lines Deleted  Cumulative  Key Achievement
20.23  2-3 weeks               0           0  Arc/RefCell foundation (Hakorune implementation)
20.24  1-2 months        -10,500     -10,500  Parser fully deleted
20.25  2-3 months        -22,193     -32,693  MIR Builder + VM + Box System deleted
20.26  2-3 months        -61,713     -94,406  Final consolidation
Total  6-9 months        -94,406     -94,406  95% Rust reduction 🔥

Comparison with Original Estimates

Original Plan (from the comprehensive roadmap):

  • Bridge-B Path: 14.3% reduction (-16,693 lines) in 7-14 months
  • C ABI Path: 86-93% reduction (-50,000+ lines) in 12-18 months

This Plan (Hakorune Implementation):

  • 95.0% reduction (-94,406 lines) in 6-9 months
  • Roughly half the duration of the C ABI path (6-9 vs. 12-18 months)
  • Lower risk (Hakorune > C for logic)

Why This Plan Succeeds:

  1. Hakorune for logic (Arc/RefCell, Box System)
  2. C for data operations (hakmem, atomic ops)
  3. Proven pattern (GcBox precedent)
  4. Incremental approach (4 phases, monthly checkpoints)

🔗 Related Documents


🎉 Final Outcome

Before (Phase 20.23 start)

┌────────────────────────────────────────────────┐
│ Rust Layer: 99,406 lines (85.1%)               │
│ ├── Parser/AST (10,500)                        │
│ ├── MIR Builder/Optimizer (3,800)              │
│ ├── Rust VM (2,393)                            │
│ ├── Box System (16,000)                        │
│ ├── LLVM/WASM backends (6,751)                 │
│ ├── Plugin System (10,000)                     │
│ ├── Kernel (8,000)                             │
│ └── Other (41,962)                             │
└────────────────────────────────────────────────┘

Hakorune Layer: 17,435 lines (14.9%)

After (Phase 20.26 complete)

┌────────────────────────────────────────────────┐
│ Rust Layer: ~5,000 lines (8.3%) ✅             │
│ ├── HostBridge API (~500)                      │
│ ├── CLI Minimal (~1,000)                       │
│ ├── hakmem Binding (~500)                      │
│ └── Essential MIR Types (~3,000)               │
└────────────────────────────────────────────────┘

┌────────────────────────────────────────────────┐
│ Hakorune Layer: ~55,000 lines (91.7%) ✅       │
│ ├── Compiler (Parser + MIR Builder) (~12,000)  │
│ ├── VM (MiniVmBox) (~5,000)                    │
│ ├── Box System (All boxes) (~15,000)           │
│ ├── Plugin System (~5,000)                     │
│ ├── Kernel (~5,000)                            │
│ ├── Standard Library (~10,000)                 │
│ └── Other (~3,000)                             │
└────────────────────────────────────────────────┘

hakmem (C): 6,541 lines (Memory allocator)

Key Metrics:

  • Rust: 99,406 → 5,000 lines (-95.0%) 🔥
  • Hakorune: 17,435 → 55,000 lines (+215.4%)
  • Total: 116,841 → 66,541 lines (-43.0%)
  • Ratio: 85% Rust → 8% Rust (Rust = Floor)

🏆 Final Achievement

True Self-Hosting Realized:

✅ Rust = Floor (~5,000 lines - minimal foundation)
✅ Hakorune = House (~55,000 lines - everything)
✅ hakmem = Plumbing (~6,500 lines of C - memory)

Ratio: 1 : 11 : 1.3 (Rust : Hakorune : C)

What Rust Does (Floor responsibilities):

  1. HostBridge API (C-ABI entry points)
  2. CLI Minimal (argument parsing)
  3. hakmem Binding (allocator integration)
  4. Essential Types (MIR data structures, shared by all backends)

What Hakorune Does (House - everything else):

  1. Compiler (Parser + MIR Builder + Optimizer)
  2. VM (MiniVmBox - interpreter)
  3. Box System (StringBox, ArrayBox, MapBox, etc.)
  4. Plugin System (dynamic loading)
  5. Kernel (Runtime, Scheduler, Memory Manager)
  6. Standard Library (all builtins)

What C Does (Plumbing):

  1. hakmem (memory allocator with call-site profiling)
  2. Arc/RefCell atomic operations (data plane)
  3. System calls (mmap, munmap, etc.)

Status: Not started
Start condition: Phase 20.25 (Box System + MIR Builder deletion) complete
Duration: 2-3 months (8-12 weeks)
Complete: True Self-Hosting Achieved - Rust = Floor, Hakorune = House