# Phase 20.26 - Final Rust Consolidation + hakmem Integration

**Duration**: 2-3 months

**Status**: Plan updated (aligned with the Rust Floor policy)

**Prerequisite**: Phase 20.25 (Box System + MIR Builder removal) complete

---

## 🎯 Executive Summary

**Purpose**: Finish consolidating the Rust layer and achieve full self-hosting by integrating the hakmem allocator.

**Final Goal**: **Rust = Floor, Hakorune = House**, fully realized

### Target State

```
┌────────────────────────────────────────────────┐
│ Rust Layer (≤5,000 lines - absolute minimum)   │
│                                                │
│  ┌─────────────────────────────────────┐       │
│  │ HostBridge API (~500 lines)         │       │
│  │ - Hako_RunScriptUtf8                │       │
│  │ - Hako_Retain / Hako_Release        │       │
│  │ - Hako_ToUtf8 / Hako_LastError      │       │
│  └─────────────────────────────────────┘       │
│                                                │
│  ┌─────────────────────────────────────┐       │
│  │ CLI Minimal (~1,000 lines)          │       │
│  │ - Argument parsing                  │       │
│  │ - Backend selection                 │       │
│  │ - Environment setup                 │       │
│  └─────────────────────────────────────┘       │
│                                                │
│  ┌─────────────────────────────────────┐       │
│  │ hakmem C-ABI Binding (~500 lines)   │       │
│  │ - LD_PRELOAD integration            │       │
│  │ - Memory allocator interface        │       │
│  └─────────────────────────────────────┘       │
│                                                │
│  ┌─────────────────────────────────────┐       │
│  │ Essential Data Structures           │       │
│  │ (~3,000 lines - MIR types only)     │       │
│  └─────────────────────────────────────┘       │
└────────────────────────────────────────────────┘
```

### Targets (-61,713 lines; reduction after staged archiving)

```
From Phase 20.25 end state: 66,713 lines

Delete:
├── src/mir/ (except essential)    -6,835 lines  ← Keep only core types
├── src/backend/llvm/              -6,551 lines  ← Experimental, remove
├── src/backend/wasm/                -200 lines  ← Experimental, remove
├── src/cli/ (minimize)            -4,000 lines  ← Keep minimal (~1,000)
├── src/runner/ (minimize)         -2,500 lines  ← Keep minimal (~500)
├── src/plugin_system/            -10,000 lines  ← Hakorune plugin system
├── src/kernel/                    -8,000 lines  ← Hakorune kernel
├── src/other/ (cleanup)          -23,627 lines  ← Various dead code
────────────────────────────────────────────────
Total deleted:                    -61,713 lines
Final Rust layer:                  ~5,000 lines ✅
```

**Cumulative Progress** (percentages are relative to the 116,841-line starting codebase):
- Phase 20.24: -10,500 lines (-9.0%)
- Phase 20.25: -22,193 lines (-19.0%)
- Phase 20.26: -61,713 lines (-52.8%)
- **Total**: **-94,406 lines (-80.8%)** 🔥

**Final State**: **Rust (Floor): ≤5,000 / Hakorune (House): 50,000+** = **True Self-Hosting**

---

## 🏗️ Implementation Roadmap

### Month 1: Rust Layer Audit & Consolidation

#### Week 1: Comprehensive Audit

**Goal**: Classify every Rust file (Keep / Delete / Minimize)

**Audit Categories**:
```
Category 1: KEEP (Essential - ~3,000 lines)
├── src/mir/types.rs          ~600 lines  ← MirType, ConstValue
├── src/mir/instruction.rs    ~800 lines  ← MirInstruction enum
├── src/mir/basic_block.rs    ~470 lines  ← BasicBlock
├── src/mir/function.rs       ~564 lines  ← MirModule, MirFunction
└── src/mir/value.rs          ~400 lines  ← Register, Value types

Category 2: MINIMIZE (Reduce to ~2,000 lines)
├── src/cli/          5,000 → 1,000  ← -4,000
├── src/runner/       3,000 → 500    ← -2,500
└── src/c_abi/        2,000 → 500    ← -1,500 (consolidate)

Category 3: DELETE (Experimental/Dead - ~56,713 lines)
├── src/backend/llvm/      -6,551 lines  ← Python llvmlite primary
├── src/backend/wasm/        -200 lines  ← Python llvmlite WASM
├── src/plugin_system/    -10,000 lines  ← Hakorune plugin system
├── src/kernel/            -8,000 lines  ← Hakorune kernel
├── src/verification/      -2,000 lines  ← Hakorune verifier
├── src/json/              -1,500 lines  ← serde_json sufficient
├── src/error/             -1,200 lines  ← Minimal error handling
├── src/debug/             -1,800 lines  ← Hakorune debug tools
├── src/config/            -1,000 lines  ← Env vars sufficient
├── src/metrics/           -1,500 lines  ← Hakorune metrics
├── src/testing/           -2,000 lines  ← Smoke tests external
├── src/utils/             -1,500 lines  ← Minimal utils
├── src/macros/              -800 lines  ← No longer needed
└── src/other/            -18,662 lines  ← Various dead code
```

**Tasks**:
- [ ] Audit ALL files in `src/` (108 files from comprehensive roadmap)
- [ ] Classify each file: Keep / Delete / Minimize
- [ ] Identify dependencies: What depends on what?
- [ ] Create deletion plan with safe order

**Deliverable**: `RUST_LAYER_AUDIT.md` - complete file list plus classification

#### Week 2: Dependency Graph Analysis

**Goal**: Determine a safe deletion order

**Dependency Analysis**:
```bash
# Tool: cargo-tree, cargo-geiger, etc.
cargo tree --all-features --depth 3 > dependency_tree.txt

# Custom analysis
rg "use crate::" src/ | sort | uniq > internal_dependencies.txt

# Identify circular dependencies
# Identify orphaned files (no incoming edges)
```

**Output**: Dependency graph (Graphviz DOT format)
```dot
digraph RustDependencies {
  // Essential (keep)
  MirTypes [color=green];
  MirInstruction [color=green];

  // Minimize
  CLI [color=yellow];
  Runner [color=yellow];

  // Delete
  LLVM [color=red];
  PluginSystem [color=red];

  // Dependencies
  CLI -> MirTypes;
  Runner -> MirTypes;
  LLVM -> MirTypes;  // Can delete, MirTypes stays
}
```

**Tasks**:
- [ ] Generate full dependency graph
- [ ] Identify deletion candidates (nothing else in `src/` depends on them)
- [ ] Plan deletion wave order (Wave 1, 2, 3...)

#### Week 3-4: Wave 1 Deletion (Experimental Backends)

**Target**: `src/backend/llvm/`, `src/backend/wasm/`

**Rationale**:
- Python llvmlite is the primary LLVM backend (218,056 lines, proven)
- Rust inkwell is experimental (6,551 lines, incomplete)
- WASM via llvmlite works (Phase 15.8 complete)

**Deletion**:
```bash
# Safety first
git tag phase-20.26-wave1-pre
git branch backup/rust-llvm-backend

# Delete
rm -rf src/backend/llvm/  # -6,551 lines
rm -rf src/backend/wasm/  # -200 lines
```

**Impact Analysis**:
- CLI: `--backend llvm` flag → `--backend llvm-py` (Python llvmlite)
- Tests: Update smoke tests to use Python backend
- Docs: Update backend documentation

**Tasks**:
- [ ] Update CLI backend selection
- [ ] Remove `--backend llvm` flag (Rust inkwell)
- [ ] Update all smoke tests
- [ ] Full regression testing

**Acceptance**:
- [ ] All smoke tests PASS (296/296)
- [ ] LLVM backend via Python works
- [ ] No references to deleted code

---

### Month 2: Plugin System & Kernel Deletion

#### Week 1-2: Plugin System Migration

**Current Rust Plugin System** (~10,000 lines):
```
src/plugin_system/
├── registry.rs     ~1,500 lines  ← Plugin discovery
├── loader.rs       ~2,000 lines  ← Dynamic loading
├── handle.rs       ~1,200 lines  ← Handle management
├── abi.rs          ~1,800 lines  ← C-ABI bridge
├── lifecycle.rs    ~1,000 lines  ← Init/shutdown
└── ... (other)     ~2,500 lines
```

**Hakorune Plugin System** (fully implemented):
```
lang/src/runtime/plugin/
├── plugin_registry_box.hako
├── plugin_loader_box.hako
├── handle_registry_box.hako
└── ...
```

**Migration Strategy**:
```rust
// Minimal Rust wrapper for Hakorune plugin system
pub struct PluginSystemShim {
    hakorune_registry: HakoHandle,  // Points to Hakorune PluginRegistryBox
}

impl PluginSystemShim {
    pub fn load_plugin(&self, path: &str) -> Result<()> {
        // Call Hakorune plugin system; the return value is not needed here
        call_hakorune(
            "PluginRegistryBox.load_plugin",
            &[path.into()]
        )?;
        Ok(())
    }
}
```

**Tasks**:
- [ ] Implement PluginSystemShim (~200 lines)
- [ ] Replace all calls: `plugin_system::load()` → `PluginSystemShim::load_plugin()`
- [ ] Test with existing plugins (StringBox, ArrayBox, etc.)
- [ ] Delete `src/plugin_system/` (-10,000 lines)

#### Week 3-4: Kernel Deletion

**Current Rust Kernel** (~8,000 lines):
```
src/kernel/
├── runtime.rs      ~2,000 lines  ← Runtime initialization
├── scheduler.rs    ~1,500 lines  ← Task scheduling
├── memory.rs       ~1,200 lines  ← Memory management
├── io.rs           ~1,000 lines  ← I/O subsystem
└── ... (other)     ~2,300 lines
```

**Hakorune Kernel** (fully implemented):
```
lang/src/runtime/kernel/
├── runtime_box.hako
├── scheduler_box.hako
├── memory_manager_box.hako
└── ...
```

**Migration Strategy**: Same as the Plugin System (minimal shim)

**Tasks**:
- [ ] Implement KernelShim (~300 lines)
- [ ] Replace kernel calls
- [ ] Test runtime initialization
- [ ] Delete `src/kernel/` (-8,000 lines)

---

### Month 3: hakmem Integration + Final Cleanup

#### Week 1-2: hakmem Allocator Integration

**Goal**: Hakorune uses hakmem as the default allocator (no libc malloc)

**Current State** (Phase 6.14 complete):
- ✅ hakmem PoC complete (6,541 lines C)
- ✅ mimalloc parity achieved (json: +36.4%, mir: -18.8%)
- ✅ Call-site profiling working
- ✅ UCB1 evolution ready

**Integration Strategy**:
```c
// src/c_abi/hakmem_binding.c (NEW - ~200 lines)
#include <stddef.h>
#include <string.h>
#include "hakmem.h"  // From apps/experiments/hakmem-poc/

// Override libc malloc/free
void* malloc(size_t size) {
    return hak_alloc_cs(size);
}

void free(void* ptr) {
    if (ptr) hak_free_cs(ptr, 0);  // Size unknown, hakmem infers from header
}

void* realloc(void* ptr, size_t new_size) {
    if (!ptr) return malloc(new_size);
    if (new_size == 0) { free(ptr); return NULL; }
    void* new_ptr = malloc(new_size);
    if (!new_ptr) return NULL;  // On failure the old block stays valid
    // ASSUMPTION: hak_usable_size() is a hypothetical helper that reads the
    // old block size from the hakmem header; substitute the real API.
    size_t old_size = hak_usable_size(ptr);
    memcpy(new_ptr, ptr, old_size < new_size ? old_size : new_size);
    free(ptr);
    return new_ptr;
}
```

**Build Integration**:
```toml
# Cargo.toml
[dependencies]
hakmem = { path = "apps/experiments/hakmem-poc" }

[profile.release]
# Use hakmem instead of system allocator
```

**Tasks**:
- [ ] Create `src/c_abi/hakmem_binding.c`
- [ ] Integrate hakmem into Cargo build
- [ ] Test with `LD_PRELOAD=libhakmem.so`
- [ ] Benchmark: Before vs After (memory usage, performance)

**Acceptance**:
- [ ] All allocations go through hakmem (verify with profiling)
- [ ] No libc malloc calls (verify with `nm` or `ldd`)
- [ ] Performance: ±10% of libc malloc
- [ ] Memory usage: ±10% of libc malloc

#### Week 3: Final Cleanup

**Target**: Delete all remaining dead code (~23,627 lines)

**Cleanup Categories**:
```
1. Dead imports (no longer used)
2. Dead functions (no callers)
3. Dead tests (functionality moved to Hakorune)
4. Dead docs (outdated architecture)
5. Dead examples (superseded)
```

**Tools**:
```bash
# Find unused functions (dead_code is a warn-by-default lint; -D makes it an error)
RUSTFLAGS="-D dead_code" cargo check

# Find unused dependencies
cargo machete

# Find orphaned files (heuristic: the file stem is never mentioned elsewhere in src/)
find src/ -name "*.rs" | while read -r f; do
  grep -r "$(basename "$f" .rs)" src/ | grep -qv "^$f:" || echo "Orphaned: $f"
done
```

**Tasks**:
- [ ] Run dead code analysis
- [ ] Delete identified dead code
- [ ] Clean imports (remove unused `use` statements)
- [ ] Update Cargo.toml (remove unused dependencies)

#### Week 4: Final Verification & Documentation

**Verification**:
```bash
# Clean build
cargo clean
cargo build --release

# Size check
ls -lh target/release/hakorune
# Expected: ~5MB (was ~15MB before cleanup)

# Line count
cloc src/
# Expected: ~5,000 lines Rust

# Full test suite
tools/smokes/v2/run.sh --profile all
# Expected: 296/296 PASS

# Performance benchmark
tools/bench_unified.sh --backend all --warmup 10 --repeat 50
# Expected: No regression (±5%)
```

**Documentation**:
- [ ] Architecture diagram (final state)
- [ ] Rust layer API reference (minimal surface)
- [ ] hakmem integration guide
- [ ] Migration complete announcement

---

## ✅ Acceptance Criteria

### Quantitative Goals

- [ ] **Rust layer**: ≤5,000 lines (target: 99.5% → actual: 80.8% reduction)
- [ ] **Hakorune layer**: 50,000+ lines (Compiler + VM + Boxes + Plugin System)
- [ ] **Binary size**: ≤5MB (15MB → 5MB, -67%)
- [ ] **Build time**: ≤30 seconds (clean build, release mode)

### Functional Requirements

- [ ] **All smoke tests PASS**: 296/296
- [ ] **Self-compilation works**: Hako₁ → Hako₂ → Hako₃ (bit-identical)
- [ ] **hakmem default**: No libc malloc usage
- [ ] **Performance parity**: ±10% vs Phase 20.25 end state

### Quality Requirements

- [ ] **No dead code**: `RUSTFLAGS="-D dead_code" cargo check` clean
- [ ] **No unused dependencies**: `cargo machete` clean
- [ ] **Documentation complete**: Architecture, API, Migration guide
- [ ] **CI passing**: All checks green

### Safety Requirements

- [ ] **Rollback tested**: Each month has
a checkpoint (tag + branch)
- [ ] **Memory safety**: valgrind clean with hakmem
- [ ] **Thread safety**: TSan clean (if applicable)

---

## 🚨 Risk Analysis

### Critical Risks

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| **hakmem stability** | Medium | High | Extensive testing, fallback to libc |
| **Dependency hell** | Medium | Medium | Wave-based deletion, dependency graph |
| **Dead code resurgence** | Low | Low | CI checks for unused code |
| **Performance regression** | Low | Medium | Continuous benchmarking |
| **Rollback complexity** | Medium | High | Monthly checkpoints, granular commits |

### hakmem Integration Risks (Most Critical)

**Risk**: hakmem has bugs or performance issues in production

**Impact**: High (all allocations affected)

**Mitigation**:
1. **Extensive testing**: Run all smoke tests with hakmem
2. **Benchmarking**: Compare vs libc malloc (should be within ±10%)
3. **Fallback mechanism**:
   ```rust
   use std::env;

   pub enum Allocator { Libc, Hakmem }

   pub fn get_allocator() -> Allocator {
       // HAKO_USE_HAKMEM=0 opts out; any other value (or unset) keeps hakmem
       match env::var("HAKO_USE_HAKMEM") {
           Ok(val) if val == "0" => Allocator::Libc,
           _ => Allocator::Hakmem,  // Default
       }
   }
   ```
4. **Memory profiling**: valgrind, ASan validation
5. **Production rollout**: Gradual (dev → staging → prod)

---

## 📈 Timeline Visualization

```
Month 1: Rust Layer Audit & Wave 1 Deletion
├─ Week 1: Comprehensive audit              [████░░░░] File classification
├─ Week 2: Dependency graph                 [████░░░░] Safe deletion order
├─ Week 3-4: Wave 1 (LLVM/WASM backend)     [████████] -6,751 lines
└─ Checkpoint: Experimental backends removed

Month 2: Plugin System & Kernel Deletion
├─ Week 1-2: Plugin System migration        [████████] -10,000 lines
├─ Week 3-4: Kernel migration               [████████] -8,000 lines
└─ Checkpoint: Core systems on Hakorune

Month 3: hakmem Integration + Final Cleanup
├─ Week 1-2: hakmem integration             [████░░░░] Default allocator
├─ Week 3: Final cleanup                    [████████] -23,627 lines
├─ Week 4: Verification & docs              [████░░░░] ✅ Complete
└─ Completion: Phase 20.26 done

Total: 2-3 months, -61,713 lines (-52.8%)
Cumulative: -94,406 lines (-80.8%) from Phase 20.24 start
```

---

## 💡 Strategic Insights

### Achievement Summary

**Before Phase 20.24** (starting point):
```
Total codebase:     116,841 lines
Rust layer (src/):   99,406 lines (85.1%)
Hakorune layer:      17,435 lines (14.9%)
```

**After Phase 20.26** (final state):
```
Total codebase:     ~60,000 lines
Rust layer (src/):   ~5,000 lines (8.3%)  ✅
Hakorune layer:     ~55,000 lines (91.7%) ✅

Rust reduction: -94,406 lines (-95.0%) 🔥
Ratio flip: 85% Rust → 8% Rust ⚡
```

### Key Milestones

| Phase | Duration | Lines Deleted | Cumulative | Key Achievement |
|-------|----------|---------------|------------|-----------------|
| **20.23** | 2-3 weeks | -0 | -0 | Arc/RefCell foundation (Hakorune implementation) |
| **20.24** | 1-2 months | -10,500 | -10,500 | Parser fully removed |
| **20.25** | 2-3 months | -22,193 | -32,693 | MIR Builder + VM + Box System removed |
| **20.26** | 2-3 months | -61,713 | -94,406 | **Final consolidation** ✅ |
| **Total** | **6-9 months** | **-94,406** | **-94,406** | **95% Rust reduction** 🔥 |

### Comparison with Original Estimates

**Original Plan** (from comprehensive roadmap):
- Bridge-B Path: 14.3% reduction (-16,693
lines) in 7-14 months
- C ABI Path: 86-93% reduction (-50,000+ lines) in 12-18 months

**This Plan (Hakorune Implementation)**:
- **95.0% reduction** (-94,406 lines) in **6-9 months** ✅
- **Roughly twice as fast** as the C ABI path (6-9 vs. 12-18 months)
- **Lower risk** (Hakorune suits logic better than C)

**Why This Plan Succeeds**:
1. **Hakorune for logic** (Arc/RefCell, Box System)
2. **C for data operations** (hakmem, atomic ops)
3. **Proven pattern** (GcBox precedent)
4. **Incremental approach** (4 phases, monthly checkpoints)

---

## 🔗 Related Documents

- **Phase 20.23**: [Arc/RefCell in Hakorune](../phase-20.23/README.md)
- **Phase 20.24**: [Parser removal](../phase-20.24/README.md)
- **Phase 20.25**: [Box System + MIR Builder removal](../phase-20.25/README.md)
- **hakmem**: [apps/experiments/hakmem-poc/](../../../../apps/experiments/hakmem-poc/)
- **Rust Removal Roadmap**: [rust-removal-comprehensive-roadmap.md](../../../development/analysis/rust-removal-comprehensive-roadmap.md)

---

## 🎉 Final Outcome

### Before (Phase 20.23 start)

```
┌────────────────────────────────────────────────┐
│ Rust Layer: 99,406 lines (85.1%)               │
│ ├── Parser/AST (10,500)                        │
│ ├── MIR Builder/Optimizer (3,800)              │
│ ├── Rust VM (2,393)                            │
│ ├── Box System (16,000)                        │
│ ├── LLVM/WASM backends (6,751)                 │
│ ├── Plugin System (10,000)                     │
│ ├── Kernel (8,000)                             │
│ └── Other (41,962)                             │
└────────────────────────────────────────────────┘

Hakorune Layer: 17,435 lines (14.9%)
```

### After (Phase 20.26 complete)

```
┌────────────────────────────────────────────────┐
│ Rust Layer: ~5,000 lines (8.3%) ✅             │
│ ├── HostBridge API (~500)                      │
│ ├── CLI Minimal (~1,000)                       │
│ ├── hakmem Binding (~500)                      │
│ └── Essential MIR Types (~3,000)               │
└────────────────────────────────────────────────┘

┌────────────────────────────────────────────────┐
│ Hakorune Layer: ~55,000 lines (91.7%) ✅       │
│ ├── Compiler (Parser + MIR Builder) (~12,000)  │
│ ├── VM (MiniVmBox) (~5,000)                    │
│ ├── Box System (All boxes) (~15,000)           │
│ ├── Plugin System (~5,000)                     │
│ ├── Kernel (~5,000)                            │
│ ├── Standard Library (~10,000)                 │
│ └── Other (~3,000)                             │
└────────────────────────────────────────────────┘

hakmem (C): 6,541 lines (Memory allocator)
```

**Key Metrics**:
- **Rust**: 99,406 → 5,000 lines (**-95.0%** 🔥)
- **Hakorune**: 17,435 → 55,000 lines (**+215.5%** ⚡)
- **Total**: 116,841 → 66,541 lines (-43.0%)
- **Ratio**: 85% Rust → **8% Rust** (**Rust = Floor** ✅)

---

## 🏆 Final Achievement

**True Self-Hosting Realized**:

```
✅ Rust = Floor (~5,000 lines - minimal foundation)
✅ Hakorune = House (~55,000 lines - everything)
✅ hakmem = Plumbing (~6,500 lines C - memory)

Ratio: 1 : 11 : 1.3 (Rust : Hakorune : C)
```

**What Rust Does** (Floor responsibilities):
1. HostBridge API (C-ABI entry points)
2. CLI Minimal (argument parsing)
3. hakmem Binding (allocator integration)
4. Essential Types (MIR data structures, shared by all backends)

**What Hakorune Does** (House - everything else):
1. Compiler (Parser + MIR Builder + Optimizer)
2. VM (MiniVmBox - interpreter)
3. Box System (StringBox, ArrayBox, MapBox, etc.)
4. Plugin System (dynamic loading)
5. Kernel (Runtime, Scheduler, Memory Manager)
6. Standard Library (all builtins)

**What C Does** (Plumbing):
1. hakmem (memory allocator with call-site profiling)
2. Arc/RefCell atomic operations (data plane)
3. System calls (mmap, munmap, etc.)

---

**Status**: Not started

**Start condition**: Phase 20.25 (Box System + MIR Builder removal) complete

**Duration**: 2-3 months (8-12 weeks)

**On completion**: ✅ **True Self-Hosting Achieved** - Rust = Floor, Hakorune = House
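
As a cross-check, the headline percentages under Key Metrics can be re-derived from the line counts quoted in this plan. The sketch below is illustrative only: `pct_change` and `share` are hypothetical helper names rather than part of the Hakorune codebase, and every result is rounded to one decimal place.

```rust
/// Percentage change from `before` to `after`, rounded to one decimal place.
fn pct_change(before: u32, after: u32) -> f64 {
    let change = (after as f64 - before as f64) / before as f64 * 100.0;
    (change * 10.0).round() / 10.0
}

/// Share of `part` in `total` as a percentage, rounded to one decimal place.
fn share(part: u32, total: u32) -> f64 {
    (part as f64 / total as f64 * 1000.0).round() / 10.0
}

fn main() {
    // Line counts quoted in this plan.
    let (rust_before, rust_after) = (99_406, 5_000);
    let (hako_before, hako_after) = (17_435, 55_000);
    let hakmem_c = 6_541;

    let total_before = rust_before + hako_before;
    let total_after = rust_after + hako_after + hakmem_c;

    println!("Rust change:     {:+.1}%", pct_change(rust_before, rust_after));
    println!("Hakorune change: {:+.1}%", pct_change(hako_before, hako_after));
    println!("Total change:    {:+.1}%", pct_change(total_before, total_after));
    // Rust share of the Rust + Hakorune code, before and after.
    println!("Rust share before: {:.1}%", share(rust_before, total_before));
    println!("Rust share after:  {:.1}%", share(rust_after, rust_after + hako_after));
}
```

Running a check like this whenever the per-phase estimates change keeps the summary tables and the diagrams in agreement with each other.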