# Phase 20.26 — Final Rust Consolidation + hakmem Integration
**Duration**: 2-3 months
**Status**: Plan updated (aligned with the Rust Floor policy)
**Prerequisite**: Phase 20.25 (Box System + MIR Builder removal) complete
---
## 🎯 Executive Summary
**Purpose**: Finish consolidating the Rust layer and integrate the hakmem allocator to achieve full self-hosting.
**Final Goal**: Fully realize **Rust = Floor, Hakorune = House**.
### Target State
```
┌────────────────────────────────────────────────┐
│ Rust Layer (≤5,000 lines - absolute minimum) │
│ │
│ ┌─────────────────────────────────────┐ │
│ │ HostBridge API (~500 lines) │ │
│ │ - Hako_RunScriptUtf8 │ │
│ │ - Hako_Retain / Hako_Release │ │
│ │ - Hako_ToUtf8 / Hako_LastError │ │
│ └─────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────┐ │
│ │ CLI Minimal (~1,000 lines) │ │
│ │ - Argument parsing │ │
│ │ - Backend selection │ │
│ │ - Environment setup │ │
│ └─────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────┐ │
│ │ hakmem C-ABI Binding (~500 lines) │ │
│ │ - LD_PRELOAD integration │ │
│ │ - Memory allocator interface │ │
│ └─────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────┐ │
│ │ Essential Data Structures │ │
│ │ (~3,000 lines - MIR types only) │ │
│ └─────────────────────────────────────┘ │
└────────────────────────────────────────────────┘
```
### Targets (-61,713 lines; reduction after staged archiving)
```
From Phase 20.25 end state: 66,713 lines
Delete:
├── src/mir/ (except essential) -6,835 lines ← Keep only core types
├── src/backend/llvm/ -6,551 lines ← Experimental, remove
├── src/backend/wasm/ -200 lines ← Experimental, remove
├── src/cli/ (minimize) -4,000 lines ← Keep minimal (~1,000)
├── src/runner/ (minimize) -2,500 lines ← Keep minimal (~500)
├── src/plugin_system/ -10,000 lines ← Hakorune plugin system
├── src/kernel/ -8,000 lines ← Hakorune kernel
├── src/other/ (cleanup) -23,627 lines ← Various dead code
────────────────────────────────────────────────
Total deleted: -61,713 lines
Final Rust layer: ~5,000 lines ✅
```
**Cumulative Progress** (percentages are relative to the 116,841-line total codebase at the Phase 20.24 starting point):
- Phase 20.24: -10,500 lines (-9.0%)
- Phase 20.25: -22,193 lines (-19.0%)
- Phase 20.26: -61,713 lines (-52.8%)
- **Total**: **-94,406 lines (-80.8%)** 🔥
**Final State**: **Rust (Floor): ≤5,000 / Hakorune (House): 50,000+** = **True Self-Hosting**
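The cumulative figures can be re-derived with integer arithmetic. The sketch below assumes the baselines quoted in the Achievement Summary of this document (116,841 lines total, 99,406 lines of Rust); the function and constant names are illustrative only.

```rust
/// Percentage of `part` relative to `whole`, expressed in tenths of a
/// percent and rounded to nearest (integer arithmetic only).
pub fn percent_tenths(part: u64, whole: u64) -> u64 {
    (part * 1000 + whole / 2) / whole
}

/// Baselines quoted in this plan (before Phase 20.24).
pub const TOTAL_CODEBASE: u64 = 116_841; // Rust + Hakorune
pub const RUST_LAYER: u64 = 99_406; // src/ only

/// Lines deleted per phase, as listed above.
pub const PHASE_DELETIONS: [(&str, u64); 3] = [
    ("20.24", 10_500),
    ("20.25", 22_193),
    ("20.26", 61_713),
];

/// Sum of all lines deleted across the three phases.
pub fn total_deleted() -> u64 {
    PHASE_DELETIONS.iter().map(|(_, n)| n).sum()
}
```

Measured against the total codebase the cumulative deletion is 80.8%; measured against the Rust layer alone it is 95.0%, which is why both numbers appear in this document.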
---
## 🏗️ Implementation Roadmap
### Month 1: Rust Layer Audit & Consolidation
#### Week 1: Comprehensive Audit
**Goal**: Classify every Rust file (Keep / Delete / Minimize)
**Audit Categories**:
```
Category 1: KEEP (Essential - ~3,000 lines)
├── src/mir/types.rs ~600 lines ← MirType, ConstValue
├── src/mir/instruction.rs ~800 lines ← MirInstruction enum
├── src/mir/basic_block.rs ~470 lines ← BasicBlock
├── src/mir/function.rs ~564 lines ← MirModule, MirFunction
└── src/mir/value.rs ~400 lines ← Register, Value types
Category 2: MINIMIZE (Reduce to ~2,000 lines)
├── src/cli/ 5,000 → 1,000 ← -4,000
├── src/runner/ 3,000 → 500 ← -2,500
├── src/c_abi/ 2,000 → 500 ← -1,500 (consolidate)
Category 3: DELETE (Experimental/Dead - ~56,713 lines)
├── src/backend/llvm/ -6,551 lines ← Python llvmlite primary
├── src/backend/wasm/ -200 lines ← Python llvmlite WASM
├── src/plugin_system/ -10,000 lines ← Hakorune plugin system
├── src/kernel/ -8,000 lines ← Hakorune kernel
├── src/verification/ -2,000 lines ← Hakorune verifier
├── src/json/ -1,500 lines ← serde_json sufficient
├── src/error/ -1,200 lines ← Minimal error handling
├── src/debug/ -1,800 lines ← Hakorune debug tools
├── src/config/ -1,000 lines ← Env vars sufficient
├── src/metrics/ -1,500 lines ← Hakorune metrics
├── src/testing/ -2,000 lines ← Smoke tests external
├── src/utils/ -1,500 lines ← Minimal utils
├── src/macros/ -800 lines ← No longer needed
└── src/other/ -18,662 lines ← Various dead code
```
**Tasks**:
- [ ] Audit ALL files in `src/` (108 files, per the comprehensive roadmap)
- [ ] Classify each file: Keep / Delete / Minimize
- [ ] Identify dependencies: What depends on what?
- [ ] Create deletion plan with safe order
**Deliverable**: `RUST_LAYER_AUDIT.md` - complete file list plus classification
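As a rough illustration of how the audit classification could be mechanized, here is a minimal sketch that maps a path to a category using the lists above. The prefix-matching rule, the `Unclassified` fallback, and the type names are assumptions for illustration, not part of the plan.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Keep,
    Minimize,
    Delete,
    Unclassified, // Not covered by the audit lists; needs manual review
}

// Category 1: essential MIR type files.
const KEEP_FILES: &[&str] = &[
    "src/mir/types.rs",
    "src/mir/instruction.rs",
    "src/mir/basic_block.rs",
    "src/mir/function.rs",
    "src/mir/value.rs",
];

// Category 2: directories to shrink.
const MINIMIZE_DIRS: &[&str] = &["src/cli/", "src/runner/", "src/c_abi/"];

// Category 3, plus the non-essential remainder of src/mir/.
const DELETE_DIRS: &[&str] = &[
    "src/mir/", "src/backend/llvm/", "src/backend/wasm/", "src/plugin_system/",
    "src/kernel/", "src/verification/", "src/json/", "src/error/", "src/debug/",
    "src/config/", "src/metrics/", "src/testing/", "src/utils/", "src/macros/",
];

/// Classifies a path; exact Keep matches win over directory prefixes.
pub fn classify(path: &str) -> Category {
    if KEEP_FILES.contains(&path) {
        Category::Keep
    } else if MINIMIZE_DIRS.iter().any(|d| path.starts_with(d)) {
        Category::Minimize
    } else if DELETE_DIRS.iter().any(|d| path.starts_with(d)) {
        Category::Delete
    } else {
        Category::Unclassified
    }
}
```

Checking the Keep list first matters because `src/mir/` also appears as a delete prefix for everything that is not an essential type file.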
#### Week 2: Dependency Graph Analysis
**Goal**: Determine a safe deletion order
**Dependency Analysis**:
```bash
# Tool: cargo-tree, cargo-geiger, etc.
cargo tree --all-features --depth 3 > dependency_tree.txt
# Custom analysis
rg "use crate::" src/ | sort | uniq > internal_dependencies.txt
# Identify circular dependencies
# Identify orphaned files (no incoming edges)
```
**Output**: Dependency graph (Graphviz DOT format)
```dot
digraph RustDependencies {
// Essential (keep)
MirTypes [color=green];
MirInstruction [color=green];
// Minimize
CLI [color=yellow];
Runner [color=yellow];
// Delete
LLVM [color=red];
PluginSystem [color=red];
// Dependencies
CLI -> MirTypes;
Runner -> MirTypes;
LLVM -> MirTypes; // Can delete, MirTypes stays
}
```
**Tasks**:
- [ ] Generate full dependency graph
- [ ] Identify deletion candidates (no internal dependencies)
- [ ] Plan deletion wave order (Wave 1, 2, 3...)
#### Week 3-4: Wave 1 Deletion (Experimental Backends)
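One way to derive the wave order from the dependency graph is to repeatedly peel off deletion targets that no remaining module imports. The sketch below assumes a simple module-to-dependencies map; the function name and the error convention (returning the blocked targets) are illustrative choices.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// `deps` maps a module to the modules it imports; `targets` are modules
/// slated for deletion. Each wave holds targets that no still-present
/// module imports. Returns `Err` with the blocked targets when progress
/// stops (a kept module still imports them, or they form a cycle).
pub fn deletion_waves<'a>(
    deps: &BTreeMap<&'a str, Vec<&'a str>>,
    targets: &BTreeSet<&'a str>,
) -> Result<Vec<Vec<&'a str>>, Vec<&'a str>> {
    // Every module mentioned anywhere starts out present.
    let mut present: BTreeSet<&'a str> = deps.keys().copied().collect();
    for ds in deps.values() {
        present.extend(ds.iter().copied());
    }
    present.extend(targets.iter().copied());

    let mut remaining = targets.clone();
    let mut waves = Vec::new();
    while !remaining.is_empty() {
        let wave: Vec<&'a str> = remaining
            .iter()
            .copied()
            .filter(|t| {
                // Deletable now if no other present module imports it.
                !deps
                    .iter()
                    .any(|(m, ds)| present.contains(m) && m != t && ds.contains(t))
            })
            .collect();
        if wave.is_empty() {
            return Err(remaining.into_iter().collect());
        }
        for t in &wave {
            remaining.remove(t);
            present.remove(t);
        }
        waves.push(wave);
    }
    Ok(waves)
}
```

With the edges from the DOT example, `LLVM` and `PluginSystem` land in the same first wave because nothing imports them. If `CLI` imported `LLVM`, the function would report `LLVM` as blocked until the CLI is updated.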
**Target**: `src/backend/llvm/`, `src/backend/wasm/`
**Rationale**:
- Python llvmlite is the primary LLVM backend (218,056 lines, proven)
- Rust inkwell is experimental (6,551 lines, incomplete)
- WASM via llvmlite works (Phase 15.8 complete)
**Deletion**:
```bash
# Safety first
git tag phase-20.26-wave1-pre
git branch backup/rust-llvm-backend
# Delete
rm -rf src/backend/llvm/ # -6,551 lines
rm -rf src/backend/wasm/ # -200 lines
```
**Impact Analysis**:
- CLI: `--backend llvm` flag → `--backend llvm-py` (Python llvmlite)
- Tests: Update smoke tests to use Python backend
- Docs: Update backend documentation
**Tasks**:
- [ ] Update CLI backend selection
- [ ] Remove `--backend llvm` flag (Rust inkwell)
- [ ] Update all smoke tests
- [ ] Full regression testing
**Acceptance**:
- [ ] All smoke tests PASS (296/296)
- [ ] LLVM backend via Python works
- [ ] No references to deleted code
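The CLI change could look roughly like the following: the removed `llvm` value is rejected with a message pointing at `llvm-py`. This is a sketch under assumptions; the `Backend` enum and `parse_backend` function are hypothetical names, and other backends the real CLI supports are omitted.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    /// Python llvmlite backend, selected with `--backend llvm-py`.
    LlvmPy,
    // Other backends omitted in this sketch.
}

/// Parses the value given to `--backend`.
pub fn parse_backend(name: &str) -> Result<Backend, String> {
    match name {
        "llvm-py" => Ok(Backend::LlvmPy),
        // The Rust inkwell backend is deleted in Wave 1; guide users to the replacement.
        "llvm" => Err(
            "backend `llvm` (Rust inkwell) has been removed; use `--backend llvm-py`".to_string(),
        ),
        other => Err(format!("unknown backend `{other}`")),
    }
}
```

Returning a dedicated error for the old value, rather than silently aliasing it, makes stale scripts and smoke tests fail loudly during the migration.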
---
### Month 2: Plugin System & Kernel Deletion
#### Week 1-2: Plugin System Migration
**Current Rust Plugin System** (~10,000 lines):
```
src/plugin_system/
├── registry.rs ~1,500 lines ← Plugin discovery
├── loader.rs ~2,000 lines ← Dynamic loading
├── handle.rs ~1,200 lines ← Handle management
├── abi.rs ~1,800 lines ← C-ABI bridge
├── lifecycle.rs ~1,000 lines ← Init/shutdown
└── ... (other) ~2,500 lines
```
**Hakorune Plugin System** (fully implemented):
```
lang/src/runtime/plugin/
├── plugin_registry_box.hako
├── plugin_loader_box.hako
├── handle_registry_box.hako
└── ...
```
**Migration Strategy**:
```rust
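// Minimal Rust wrapper that forwards to the Hakorune plugin system.
// `HakoHandle` and `call_hakorune` are assumed to come from the HostBridge layer.
pub struct PluginSystemShim {
    hakorune_registry: HakoHandle, // Points to the Hakorune PluginRegistryBox
}

impl PluginSystemShim {
    pub fn load_plugin(&self, path: &str) -> Result<()> {
        // Delegate to the Hakorune plugin system; `?` propagates any failure.
        // The returned value is not needed here, so it is discarded explicitly.
        let _ = call_hakorune("PluginRegistryBox.load_plugin", &[path.into()])?;
        Ok(())
    }
}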
```
**Tasks**:
- [ ] Implement PluginSystemShim (~200 lines)
- [ ] Replace all calls: `plugin_system::load()` → `PluginSystemShim::load_plugin()`
- [ ] Test with existing plugins (StringBox, ArrayBox, etc.)
- [ ] Delete `src/plugin_system/` (-10,000 lines)
#### Week 3-4: Kernel Deletion
**Current Rust Kernel** (~8,000 lines):
```
src/kernel/
├── runtime.rs ~2,000 lines ← Runtime initialization
├── scheduler.rs ~1,500 lines ← Task scheduling
├── memory.rs ~1,200 lines ← Memory management
├── io.rs ~1,000 lines ← I/O subsystem
└── ... (other) ~2,300 lines
```
**Hakorune Kernel** (fully implemented):
```
lang/src/runtime/kernel/
├── runtime_box.hako
├── scheduler_box.hako
├── memory_manager_box.hako
└── ...
```
**Migration Strategy**: Same as the Plugin System (minimal shim)
**Tasks**:
- [ ] Implement KernelShim (~300 lines)
- [ ] Replace kernel calls
- [ ] Test runtime initialization
- [ ] Delete `src/kernel/` (-8,000 lines)
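Following the same shim pattern, a KernelShim might be structured so that it can be tested without a live Hakorune runtime. Everything named below is hypothetical: the `HakoRuntime` trait, the `RuntimeBox.init` method name, and the recording test double are assumptions made for this sketch.

```rust
use std::cell::RefCell;

/// Hypothetical abstraction over "call a Hakorune method by name".
pub trait HakoRuntime {
    fn call(&self, method: &str, args: &[String]) -> Result<String, String>;
}

/// Thin Rust-side shim; all kernel logic lives on the Hakorune side.
pub struct KernelShim<R: HakoRuntime> {
    runtime: R,
}

impl<R: HakoRuntime> KernelShim<R> {
    pub fn new(runtime: R) -> Self {
        Self { runtime }
    }

    /// Forwards runtime initialization to the Hakorune kernel.
    /// The method name is an assumption, not a confirmed entry point.
    pub fn init(&self) -> Result<(), String> {
        self.runtime.call("RuntimeBox.init", &[]).map(|_| ())
    }

    pub fn runtime(&self) -> &R {
        &self.runtime
    }
}

/// Test double that records which methods were called.
#[derive(Default)]
pub struct RecordingRuntime {
    pub calls: RefCell<Vec<String>>,
}

impl HakoRuntime for RecordingRuntime {
    fn call(&self, method: &str, _args: &[String]) -> Result<String, String> {
        self.calls.borrow_mut().push(method.to_string());
        Ok(String::new())
    }
}
```

Keeping the runtime behind a trait lets the "Test runtime initialization" task run as a plain unit test before the real bridge is wired in.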
---
### Month 3: hakmem Integration + Final Cleanup
#### Week 1-2: hakmem Allocator Integration
**Goal**: Hakorune uses hakmem as its default allocator (no libc malloc)
**Current State** (Phase 6.14 complete):
- ✅ hakmem PoC complete (6,541 lines of C)
- ✅ mimalloc parity achieved (json: +36.4%, mir: -18.8%)
- ✅ Call-site profiling working
- ✅ UCB1 evolution ready
**Integration Strategy**:
```c
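// src/c_abi/hakmem_binding.c (NEW - ~200 lines)
#include <stddef.h>  // size_t
#include <string.h>  // memcpy
#include "hakmem.h"  // From apps/experiments/hakmem-poc/

// ASSUMPTION: hakmem can report the usable size of a live block from its
// header. `hak_usable_size` is a placeholder name, not a confirmed hakmem API.
size_t hak_usable_size(void* ptr);

// Override libc malloc/free
void* malloc(size_t size) {
    return hak_alloc_cs(size);
}

void free(void* ptr) {
    if (ptr) hak_free_cs(ptr, 0);  // Size unknown, hakmem infers from header
}

void* realloc(void* ptr, size_t new_size) {
    if (!ptr) return malloc(new_size);  // realloc(NULL, n) behaves like malloc(n)
    void* new_ptr = malloc(new_size);
    if (!new_ptr) return NULL;          // On failure the old block stays valid
    size_t old_size = hak_usable_size(ptr);
    // Copy only what fits: the smaller of the old and new sizes.
    memcpy(new_ptr, ptr, old_size < new_size ? old_size : new_size);
    free(ptr);
    return new_ptr;
}
// A complete override also needs calloc (and usually the aligned variants).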
```
**Build Integration**:
```toml
# Cargo.toml
[dependencies]
hakmem = { path = "apps/experiments/hakmem-poc" }
[profile.release]
# Use hakmem instead of system allocator
```
**Tasks**:
- [ ] Create `src/c_abi/hakmem_binding.c`
- [ ] Integrate hakmem into Cargo build
- [ ] Test with `LD_PRELOAD=libhakmem.so`
- [ ] Benchmark: Before vs After (memory usage, performance)
**Acceptance**:
- [ ] All allocations go through hakmem (verify with profiling)
- [ ] No libc malloc calls (verify with `nm` or `ldd`)
- [ ] Performance: ±10% of libc malloc
- [ ] Memory usage: ±10% of libc malloc
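On the Rust side, one way to confirm that allocations flow through a single choke point is a `GlobalAlloc` wrapper with a counter. The sketch below uses `std::alloc::System` as a stand-in for hakmem, since the hakmem C signatures are not reproduced here; a real binding would forward to hakmem where the comments indicate.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocation requests seen by the global allocator.
pub static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Counting wrapper; `System` stands in for the hakmem backend.
pub struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        // A real binding would call into hakmem here instead of System.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Likewise, this would forward to hakmem's free routine.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Route every Rust heap allocation in this binary through the wrapper.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;
```

The counter only observes Rust-side allocations. Allocations made directly by C code still need the `nm` / `ldd` style checks listed above.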
#### Week 3: Final Cleanup
**Target**: Delete all remaining dead code (~23,627 lines)
**Cleanup Categories**:
```
1. Dead imports (no longer used)
2. Dead functions (no callers)
3. Dead tests (functionality moved to Hakorune)
4. Dead docs (outdated architecture)
5. Dead examples (superseded)
```
**Tools**:
```bash
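# Find unused functions and imports (-D turns these default-on lints into errors)
RUSTFLAGS="-D dead_code -D unused_imports" cargo check
# Find unused dependencies
cargo machete
# Find orphaned files (heuristic: the file stem is mentioned in no other .rs file)
find src/ -name "*.rs" | while read -r f; do
    stem="$(basename "$f" .rs)"
    if ! grep -rlw --include='*.rs' -- "$stem" src/ | grep -qvxF -- "$f"; then
        echo "Orphaned: $f"
    fi
done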
```
**Tasks**:
- [ ] Run dead code analysis
- [ ] Delete identified dead code
- [ ] Clean imports (remove unused `use` statements)
- [ ] Update Cargo.toml (remove unused dependencies)
#### Week 4: Final Verification & Documentation
**Verification**:
```bash
# Clean build
cargo clean
cargo build --release
# Size check
ls -lh target/release/hakorune
# Expected: ~5MB (was ~15MB before cleanup)
# Line count
cloc src/
# Expected: ~5,000 lines Rust
# Full test suite
tools/smokes/v2/run.sh --profile all
# Expected: 296/296 PASS
# Performance benchmark
tools/bench_unified.sh --backend all --warmup 10 --repeat 50
# Expected: No regression (±5%)
```
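The tolerance checks in this phase (±5% for benchmarks, ±10% for allocator parity) reduce to a relative-change comparison. A minimal sketch, with hypothetical function names and a symmetric tolerance as the ± notation suggests:

```rust
/// Relative change of `measured` against `baseline`, in percent.
/// Negative values mean the measurement went down.
pub fn relative_change_pct(baseline: f64, measured: f64) -> f64 {
    (measured - baseline) / baseline * 100.0
}

/// True when `measured` is within ±`tolerance_pct` percent of `baseline`.
/// A non-positive baseline is rejected because the ratio is undefined.
pub fn within_tolerance(baseline: f64, measured: f64, tolerance_pct: f64) -> bool {
    baseline > 0.0 && relative_change_pct(baseline, measured).abs() <= tolerance_pct
}
```

For example, a benchmark that moves from 100 ms to 104 ms passes a ±5% gate while 106 ms does not. The same helper gives roughly -67% for the 15MB to 5MB binary-size target.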
**Documentation**:
- [ ] Architecture diagram (final state)
- [ ] Rust layer API reference (minimal surface)
- [ ] hakmem integration guide
- [ ] Migration complete announcement
---
## ✅ Acceptance Criteria
### Quantitative Goals
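- [ ] **Rust layer**: ≤5,000 lines (target attainment: 99.5% → actual: 80.8% reduction, measured against the total codebase; 95.0% measured against the original Rust layer)
- [ ] **Hakorune layer**: 50,000+ lines (Compiler + VM + Boxes + Plugin System)
- [ ] **Binary size**: ≤5MB (15MB → 5MB, -67%)
- [ ] **Build time**: ≤30 seconds (clean build, release mode)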
### Functional Requirements
- [ ] **All smoke tests PASS**: 296/296
- [ ] **Self-compilation works**: Hako₁ → Hako₂ → Hako₃ (bit-identical)
- [ ] **hakmem default**: No libc malloc usage
- [ ] **Performance parity**: ±10% vs Phase 20.25 end state
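The bit-identical requirement for self-compilation can be checked by comparing stage outputs byte for byte. This sketch assumes each stage produces a single output file and that comparing two consecutive stages is sufficient; the function name is illustrative.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns true when both files have exactly the same bytes.
/// Reads both files fully into memory, which is fine for compiler-sized
/// artifacts; very large files would call for a streaming comparison.
pub fn bit_identical(a: &Path, b: &Path) -> io::Result<bool> {
    Ok(fs::read(a)? == fs::read(b)?)
}
```

In a bootstrap check this would typically be applied to the outputs of the second and third stages, since a compiler that reproduces itself exactly has reached a fixed point.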
### Quality Requirements
- [ ] **No dead code**: `RUSTFLAGS="-D dead_code" cargo check` clean
- [ ] **No unused dependencies**: `cargo machete` clean
- [ ] **Documentation complete**: Architecture, API, Migration guide
- [ ] **CI passing**: All checks green
### Safety Requirements
- [ ] **Rollback tested**: Each month has a checkpoint (tag + branch)
- [ ] **Memory safety**: valgrind clean with hakmem
- [ ] **Thread safety**: TSan clean (if applicable)
---
## 🚨 Risk Analysis
### Critical Risks
| Risk | Probability | Impact | Mitigation |
|--------|------|------|--------|
| **hakmem stability** | Medium | High | Extensive testing, fallback to libc |
| **Dependency hell** | Medium | Medium | Wave-based deletion, dependency graph |
| **Dead code resurgence** | Low | Low | CI checks for unused code |
| **Performance regression** | Low | Medium | Continuous benchmarking |
| **Rollback complexity** | Medium | High | Monthly checkpoints, granular commits |
### hakmem Integration Risks (Most Critical)
**Risk**: hakmem has bugs or performance issues in production
**Impact**: High (all allocations affected)
**Mitigation**:
1. **Extensive testing**: Run all smoke tests with hakmem
2. **Benchmarking**: Compare vs libc malloc (should be within ±10%)
3. **Fallback mechanism**:
```rust
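use std::env;

pub enum Allocator { Libc, Hakmem }

pub fn get_allocator() -> Allocator {
    // HAKO_USE_HAKMEM=0 opts out; any other value (or unset) selects hakmem.
    match env::var("HAKO_USE_HAKMEM") {
        Ok(val) if val == "0" => Allocator::Libc,
        _ => Allocator::Hakmem, // Default
    }
}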
```
4. **Memory profiling**: valgrind, ASan validation
5. **Production rollout**: Gradual (dev → staging → prod)
---
## 📈 Timeline Visualization
```
Month 1: Rust Layer Audit & Wave 1 Deletion
├─ Week 1: Comprehensive audit [████░░░░] File classification
├─ Week 2: Dependency graph [████░░░░] Safe deletion order
├─ Week 3-4: Wave 1 (LLVM/WASM backend) [████████] -6,751 lines
└─ Checkpoint: Experimental backends removed
Month 2: Plugin System & Kernel Deletion
├─ Week 1-2: Plugin System migration [████████] -10,000 lines
├─ Week 3-4: Kernel migration [████████] -8,000 lines
└─ Checkpoint: Core systems on Hakorune
Month 3: hakmem Integration + Final Cleanup
├─ Week 1-2: hakmem integration [████░░░░] Default allocator
├─ Week 3: Final cleanup [████████] -23,627 lines
├─ Week 4: Verification & docs [████░░░░] ✅ Complete
└─ Completion: Phase 20.26 done
Total: 2-3 months, -61,713 lines (-52.8%)
Cumulative: -94,406 lines (-80.8%) from Phase 20.24 start
```
---
## 💡 Strategic Insights
### Achievement Summary
**Before Phase 20.24** (starting point):
```
Total codebase: 116,841 lines
Rust layer (src/): 99,406 lines (85.1%)
Hakorune layer: 17,435 lines (14.9%)
```
**After Phase 20.26** (final state):
```
Total codebase: ~60,000 lines
Rust layer (src/): ~5,000 lines (8.3%) ✅
Hakorune layer: ~55,000 lines (91.7%) ✅
Rust reduction: -94,406 lines (-95.0%) 🔥
Ratio flip: 85% Rust → 8% Rust ⚡
```
### Key Milestones
| Phase | Duration | Lines Deleted | Cumulative | Key Achievement |
|-------|----------|---------------|------------|-----------------|
| **20.23** | 2-3 weeks | -0 | -0 | Arc/RefCell foundation (Hakorune implementation) |
| **20.24** | 1-2 months | -10,500 | -10,500 | Parser fully removed |
| **20.25** | 2-3 months | -22,193 | -32,693 | MIR Builder + VM + Box System removed |
| **20.26** | 2-3 months | -61,713 | -94,406 | **Final consolidation** ✅ |
| **Total** | **6-9 months** | **-94,406** | **-94,406** | **95% Rust reduction** 🔥 |
### Comparison with Original Estimates
**Original Plan** (from the comprehensive roadmap):
- Bridge-B Path: 14.3% reduction (-16,693 lines) in 7-14 months
- C ABI Path: 86-93% reduction (-50,000+ lines) in 12-18 months
**This Plan (Hakorune Implementation)**:
- **95.0% reduction** (-94,406 lines) in **6-9 months** ✅
- **About half the time** of the C ABI path (6-9 months vs 12-18 months)
- **Lower risk** (Hakorune > C for logic)
**Why This Plan Succeeds**:
1. **Hakorune for logic** (Arc/RefCell, Box System)
2. **C for data operations** (hakmem, atomic ops)
3. **Proven pattern** (GcBox precedent)
4. **Incremental approach** (4 phases, monthly checkpoints)
---
## 🔗 Related Documents
- **Phase 20.23**: [Arc/RefCell in Hakorune](../phase-20.23/README.md)
- **Phase 20.24**: [Parser removal](../phase-20.24/README.md)
- **Phase 20.25**: [Box System + MIR Builder removal](../phase-20.25/README.md)
- **hakmem**: [apps/experiments/hakmem-poc/](../../../../apps/experiments/hakmem-poc/)
- **Rust Removal Roadmap**: [rust-removal-comprehensive-roadmap.md](../../../development/analysis/rust-removal-comprehensive-roadmap.md)
---
## 🎉 Final Outcome
### Before (Phase 20.23 start)
```
┌────────────────────────────────────────────────┐
│ Rust Layer: 99,406 lines (85.1%) │
│ ├── Parser/AST (10,500) │
│ ├── MIR Builder/Optimizer (3,800) │
│ ├── Rust VM (2,393) │
│ ├── Box System (16,000) │
│ ├── LLVM/WASM backends (6,751) │
│ ├── Plugin System (10,000) │
│ ├── Kernel (8,000) │
│ └── Other (41,962) │
└────────────────────────────────────────────────┘
Hakorune Layer: 17,435 lines (14.9%)
```
### After (Phase 20.26 complete)
```
┌────────────────────────────────────────────────┐
│ Rust Layer: ~5,000 lines (8.3%) ✅ │
│ ├── HostBridge API (~500) │
│ ├── CLI Minimal (~1,000) │
│ ├── hakmem Binding (~500) │
│ └── Essential MIR Types (~3,000) │
└────────────────────────────────────────────────┘
┌────────────────────────────────────────────────┐
│ Hakorune Layer: ~55,000 lines (91.7%) ✅ │
│ ├── Compiler (Parser + MIR Builder) (~12,000) │
│ ├── VM (MiniVmBox) (~5,000) │
│ ├── Box System (All boxes) (~15,000) │
│ ├── Plugin System (~5,000) │
│ ├── Kernel (~5,000) │
│ ├── Standard Library (~10,000) │
│ └── Other (~3,000) │
└────────────────────────────────────────────────┘
hakmem (C): 6,541 lines (Memory allocator)
```
**Key Metrics**:
- **Rust**: 99,406 → 5,000 lines (**-95.0%** 🔥)
- **Hakorune**: 17,435 → 55,000 lines (**+215.5%** ⚡)
- **Total**: 116,841 → 66,541 lines (-43.0%), where the final total includes the 6,541 lines of hakmem C
- **Ratio**: 85% Rust → **8% Rust** (**Rust = Floor** ✅)
---
## 🏆 Final Achievement
**True Self-Hosting Realized**:
```
✅ Rust = Floor (~5,000 lines - minimal foundation)
✅ Hakorune = House (~55,000 lines - everything)
✅ hakmem = Plumbing (~6,500 lines C - memory)
Ratio: 1 : 11 : 1.3 (Rust : Hakorune : C)
```
**What Rust Does** (Floor responsibilities):
1. HostBridge API (C-ABI entry points)
2. CLI Minimal (argument parsing)
3. hakmem Binding (allocator integration)
4. Essential Types (MIR data structures, shared by all backends)
**What Hakorune Does** (House - everything else):
1. Compiler (Parser + MIR Builder + Optimizer)
2. VM (MiniVmBox - interpreter)
3. Box System (StringBox, ArrayBox, MapBox, etc.)
4. Plugin System (dynamic loading)
5. Kernel (Runtime, Scheduler, Memory Manager)
6. Standard Library (all builtins)
**What C Does** (Plumbing):
1. hakmem (memory allocator with call-site profiling)
2. Arc/RefCell atomic operations (data plane)
3. System calls (mmap, munmap, etc.)
---
**Status**: Not started
**Start condition**: Phase 20.25 (Box System + MIR Builder removal) complete
**Duration**: 2-3 months (8-12 weeks)
**On completion**: ✅ **True Self-Hosting Achieved** - Rust = Floor, Hakorune = House