# HAKMEM Tiny Allocator Refactoring Plan - Executive Summary

## Overview

This is a **large-scale refactoring plan, based on Box Theory,** for the HAKMEM Tiny allocator.

**Goal**: Split the 1470-line mega-file (hakmem_tiny_free.inc) into single-responsibility units of at most 500 lines each, improving maintainability, performance, and development speed.

---

## Current State Analysis

### Problems

| Item | Current state | Problem |
|------|------|------|
| **Largest file** | hakmem_tiny_free.inc (1470 lines) | High complexity, frequent bugs |
| **Mixed responsibilities** | Free + Alloc + Query + Shutdown | Violates the Single Responsibility Principle (SRP) |
| **Include complexity** | hakmem_tiny.c includes 44 .inc files | Unclear dependencies |
| **Performance** | 20+ instructions on the fast path | Slower than the system tcache's 3-4 instructions |
| **Maintainability** | 3 hours per code review | High complexity |

### Target State

| Item | Current | Target | Effect |
|------|------|------|------|
| **Largest file** | 1470 lines | <= 500 lines | -66% lines (lower complexity) |
| **Separation of responsibilities** | Mixed | 9 Boxes | 100% clarity |
| **Fast path** | 20+ instructions | 3-4 instructions | -80% cycles |
| **Code review** | 3 hours | 30 minutes | -83% time |
| **Throughput** | 52 M ops/s | 58-65 M ops/s | +10-25% |

---

## The 9 Boxes, Based on Box Theory

```
┌─────────────────────────────────────────────────────────────┐
│  Integration Layer                                          │
│  (hakmem_tiny.c - include aggregator)                       │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│  Box 9: Intel-specific optimizations (3 files × 300 lines)  │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│  Box 8: Lifecycle & Init (5 files × 150 lines)              │
├─────────────────────────────────────────────────────────────┤
│  Box 7: Statistics & Query (4 files × 200 lines, existing)  │
├─────────────────────────────────────────────────────────────┤
│  Box 6: Free Path (3 files × 250 lines)                     │
│    - tiny_free_fast.inc.h (same-thread)                     │
│    - tiny_free_remote.inc.h (cross-thread)                  │
│    - tiny_free_guard.inc.h (validation)                     │
├─────────────────────────────────────────────────────────────┤
│  Box 5: Allocation Path (3 files × 350 lines)               │
│    - tiny_alloc_fast.inc.h (cache pop, 3-4 instructions)    │
│    - hakmem_tiny_refill.inc.h (existing, 410 lines)         │
│    - tiny_alloc_slow.inc.h (superslab refill)               │
├─────────────────────────────────────────────────────────────┤
│  Box 4: Publish/Adopt (4 files × 300 lines)                 │
│    - tiny_publish.c (existing)                              │
│    - tiny_mailbox.c (existing + split)                      │
│    - tiny_adopt.inc.h (new)                                 │
├─────────────────────────────────────────────────────────────┤
│  Box 3: SuperSlab Core (2 files × 800 lines)                │
│    - hakmem_tiny_superslab.h/c (existing, well-structured)  │
├─────────────────────────────────────────────────────────────┤
│  Box 2: Remote Queue & Ownership (4 files × 350 lines)      │
│    - tiny_remote_queue.inc.h (new)                          │
│    - tiny_remote_drain.inc.h (new)                          │
│    - tiny_owner.inc.h (new)                                 │
│    - slab_handle.h (existing, 295 lines)                    │
├─────────────────────────────────────────────────────────────┤
│  Box 1: Atomic Ops (1 file × 80 lines)                      │
│    - tiny_atomic.h (new)                                    │
└─────────────────────────────────────────────────────────────┘
```

---

## Implementation Plan (6 weeks)

### Week 1: Fast Path (Priority 1) ✨
**Goal**: Achieve a 3-4 instruction fast path

**Deliverables**:
- [ ] `tiny_atomic.h` (80 lines) - Unified interface for atomic operations
- [ ] `tiny_alloc_fast.inc.h` (250 lines) - TLS cache pop (3-4 instructions)
- [ ] `tiny_free_fast.inc.h` (200 lines) - Same-thread free
- [ ] Shrink hakmem_tiny_free.inc (1470 lines → 800 lines)

**Expected results**:
- Fast path: 3-4 instructions (verified by assembly review)
- Throughput: +10% (16-64B size classes)

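
To make the 3-4 instruction target concrete, here is a minimal sketch of a TLS free-list pop and push in C. It assumes an intrusive singly linked list in which each free block stores the next pointer in its first word. `tiny_tls_head`, `TINY_NUM_CLASSES`, and both function names are hypothetical and are not taken from the HAKMEM source. The pop body is a load, an empty test, a second load, and a store, but whether the compiler really emits 3-4 instructions still has to be confirmed by the assembly review listed above. The caller is assumed to fall back to the slow path when the pop returns `NULL`.

```c
#include <stddef.h>

#define TINY_NUM_CLASSES 8  /* hypothetical number of size classes */

/* Per-thread free-list heads, one per size class (hypothetical layout). */
static _Thread_local void *tiny_tls_head[TINY_NUM_CLASSES];

/* Fast-path pop: load head, test for empty, load next, store new head. */
static inline void *tiny_alloc_fast_pop(unsigned cls) {
    void *head = tiny_tls_head[cls];
    if (head == NULL) {
        return NULL;                      /* empty: caller takes the slow path */
    }
    tiny_tls_head[cls] = *(void **)head;  /* next pointer is the block's first word */
    return head;
}

/* Same-thread free: link the block in front of the current head. */
static inline void tiny_free_fast_push(unsigned cls, void *block) {
    *(void **)block = tiny_tls_head[cls];
    tiny_tls_head[cls] = block;
}
```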

---

### Week 2: Remote & Ownership (Priority 2)
**Goal**: Modularize the remote queue and owner TID management

**Deliverables**:
- [ ] `tiny_remote_queue.inc.h` (300 lines) - MPSC stack ops
- [ ] `tiny_remote_drain.inc.h` (150 lines) - Drain logic
- [ ] `tiny_owner.inc.h` (120 lines) - Ownership tracking
- [ ] Clean up tiny_remote.c (645 lines → 350 lines)

**Expected results**:
- Remote queue ops separated and testable in isolation
- More stable cross-thread free

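
The MPSC stack mentioned above can be sketched with C11 atomics as follows. This is only an illustration under assumptions: the type and function names (`TinyRemoteNode`, `TinyRemoteQueue`, `tiny_remote_push`, `tiny_remote_drain`) are hypothetical, and the real HAKMEM queue may be laid out differently. Any thread pushes with a CAS loop, and only the owner thread drains by swapping the head with `NULL`. Because the consumer detaches the whole list in one exchange and never pops single nodes, this shape avoids the ABA problem on the consumer side. The drained list comes back in LIFO order.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Intrusive node: a freed block reuses its first word as the link. */
typedef struct TinyRemoteNode {
    struct TinyRemoteNode *next;
} TinyRemoteNode;

typedef struct {
    _Atomic(TinyRemoteNode *) head;
} TinyRemoteQueue;

/* Producer side (any thread): lock-free push using a CAS loop. */
static inline void tiny_remote_push(TinyRemoteQueue *q, TinyRemoteNode *n) {
    TinyRemoteNode *old = atomic_load_explicit(&q->head, memory_order_relaxed);
    do {
        n->next = old;  /* on CAS failure, old is reloaded with the current head */
    } while (!atomic_compare_exchange_weak_explicit(
        &q->head, &old, n, memory_order_release, memory_order_relaxed));
}

/* Consumer side (owner thread only): detach the entire list at once. */
static inline TinyRemoteNode *tiny_remote_drain(TinyRemoteQueue *q) {
    return atomic_exchange_explicit(&q->head, (TinyRemoteNode *)NULL,
                                    memory_order_acquire);
}
```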

---

### Week 3: SuperSlab Integration (Priority 3)
**Goal**: Integrate the Publish/Adopt mechanism

**Deliverables**:
- [ ] `tiny_adopt.inc.h` (300 lines) - Adopt logic
- [ ] `tiny_mailbox_push.inc.h` (80 lines)
- [ ] `tiny_mailbox_drain.inc.h` (150 lines)
- [ ] Strengthen Box 3 (SuperSlab)

**Expected results**:
- Multi-thread adoption fully integrated
- Improved memory efficiency


---

### Week 4: Allocation/Free Slow Path (Priority 4)
**Goal**: Cleanly separate the slow path

**Deliverables**:
- [ ] `tiny_alloc_slow.inc.h` (300 lines) - SuperSlab refill
- [ ] `tiny_free_remote.inc.h` (300 lines) - Cross-thread push
- [ ] `tiny_free_guard.inc.h` (120 lines) - Validation
- [ ] Finalize hakmem_tiny_free.inc (1470 lines → 300 lines)

**Expected results**:
- Slow path split into 20+ functions, each testable on its own
- Stable guard checks


---

### Week 5: Lifecycle & Config (Priority 5)
**Goal**: Unify initialization and cleanup

**Deliverables**:
- [ ] `tiny_init_globals.inc.h` (150 lines)
- [ ] `tiny_init_config.inc.h` (150 lines)
- [ ] `tiny_init_pools.inc.h` (150 lines)
- [ ] `tiny_lifecycle_trim.inc.h` (120 lines)
- [ ] `tiny_lifecycle_shutdown.inc.h` (120 lines)

**Expected results**:
- hakmem_tiny_init.inc split up (544 lines → 3 files × 150 lines)
- Duplication removed and configuration management unified


---

### Week 6: Testing + Integration + Benchmark
**Goal**: Complete the tests, benchmarks, and documentation

**Deliverables**:
- [ ] Unit tests (per Box, 10+ tests)
- [ ] Integration tests (end-to-end)
- [ ] Performance validation
- [ ] Documentation update

**Expected results**:
- All tests PASS
- Throughput: +10-25% (16-64B size classes)
- Memory efficiency: on par with or better than the system allocator

---

## Split Strategy (Details)

### Extraction Sources

| From | To | Lines | Notes |
|------|----|----|------|
| hakmem_tiny_free.inc | tiny_alloc_fast.inc.h | 150 | Fast pop/push |
| hakmem_tiny_free.inc | tiny_free_fast.inc.h | 200 | Same-thread free |
| hakmem_tiny_free.inc | tiny_remote_queue.inc.h | 300 | Remote queue ops |
| hakmem_tiny_free.inc | tiny_alloc_slow.inc.h | 300 | SuperSlab refill |
| hakmem_tiny_free.inc | tiny_free_remote.inc.h | 300 | Cross-thread push |
| hakmem_tiny_free.inc | tiny_free_guard.inc.h | 120 | Validation |
| hakmem_tiny_free.inc | tiny_lifecycle_shutdown.inc.h | 30 | Cleanup |
| hakmem_tiny_free.inc | **Delete** | 100 | Commented-out Query API |
| **Total extracted** | - | **1100 lines** | **-75% reduction** |
| **Remaining** | - | **370 lines** | **Glue code** |

### New Files

```
✨ New Files (20 files, ~3,700 lines in total):

Box 1:
  - tiny_atomic.h (80 lines)

Box 2:
  - tiny_remote_queue.inc.h (300 lines)
  - tiny_remote_drain.inc.h (150 lines)
  - tiny_owner.inc.h (120 lines)

Box 4:
  - tiny_adopt.inc.h (300 lines)
  - tiny_mailbox_push.inc.h (80 lines)
  - tiny_mailbox_drain.inc.h (150 lines)

Box 5:
  - tiny_alloc_fast.inc.h (250 lines)
  - tiny_alloc_slow.inc.h (300 lines)

Box 6:
  - tiny_free_fast.inc.h (200 lines)
  - tiny_free_remote.inc.h (300 lines)
  - tiny_free_guard.inc.h (120 lines)

Box 8:
  - tiny_init_globals.inc.h (150 lines)
  - tiny_init_config.inc.h (150 lines)
  - tiny_init_pools.inc.h (150 lines)
  - tiny_lifecycle_trim.inc.h (120 lines)
  - tiny_lifecycle_shutdown.inc.h (120 lines)

Box 9:
  - tiny_intel_common.inc.h (150 lines)
  - tiny_intel_fast.inc.h (300 lines)
  - tiny_intel_cache.inc.h (200 lines)
```

---

## Expected Effects

### Performance

| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Fast path instruction count | 20+ | 3-4 | -80% |
| Fast path cycle latency | 50-100 cycles | 15-20 cycles | -70% |
| Branch misprediction penalty | High | Low | -60% |
| Tiny (16-64B) throughput | 52 M ops/s | 58-65 M ops/s | +10-25% |
| Cache hit rate | 70% | 85%+ | +15 points |

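
As a check on the arithmetic, the improvement column can be reproduced from the before/after values. The choice of endpoints is an assumption: 20 instructions is taken as the baseline (the table says "20+", so the real reduction may be larger) and 4 as the upper end of the 3-4 target.

$$
\begin{aligned}
\text{instruction count:}\quad & \frac{4 - 20}{20} = -80\% \\
\text{cycle latency:}\quad & \frac{15 - 50}{50} = -70\%, \qquad \frac{20 - 100}{100} = -80\% \\
\text{throughput:}\quad & \frac{58 - 52}{52} \approx +11.5\%, \qquad \frac{65 - 52}{52} = +25\%
\end{aligned}
$$

On these numbers the stated -70% latency figure is the conservative end of the range, and the lower throughput bound of +10% is a rounded value.
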
### Maintainability

| Metric | Before | After |
|--------|--------|-------|
| Max file size | 1470 lines | <= 500 lines |
| Cyclic dependencies | Many | 0 (strict DAG) |
| Code review time | 3h | 30min |
| Test coverage | ~60% | 95%+ |
| SRP compliance | 30% | 100% |

### Development Speed

| Task | Before | After |
|------|--------|-------|
| Bug fix | 2-4h | 30min |
| Optimization | 4-6h | 1-2h |
| Feature add | 6-8h | 2-3h |
| Regression debug | 2-3h | 30min |

---

## Include Order (New)

New layout for **hakmem_tiny.c**:

```
LAYER 0: tiny_atomic.h
LAYER 1: tiny_owner.inc.h, slab_handle.h
LAYER 2: hakmem_tiny_superslab.{h,c}
LAYER 2b: tiny_remote_queue.inc.h, tiny_remote_drain.inc.h
LAYER 3: tiny_publish.{h,c}, tiny_mailbox.*, tiny_adopt.inc.h
LAYER 4: tiny_alloc_fast.inc.h, tiny_free_fast.inc.h
LAYER 5: hakmem_tiny_refill.inc.h, tiny_alloc_slow.inc.h, tiny_free_remote.inc.h, tiny_free_guard.inc.h
LAYER 6: hakmem_tiny_stats.*, hakmem_tiny_query.c
LAYER 7: tiny_init_*.inc.h, tiny_lifecycle_*.inc.h
LAYER 8: tiny_intel_*.inc.h
LAYER 9: Legacy compat (.inc files)
```

**The dependencies form a strict DAG**:
```
L0 (tiny_atomic.h)
  ↓
L1 (tiny_owner, slab_handle)
  ↓
L2 (SuperSlab, remote_queue, remote_drain)
  ↓
L3 (Publish/Adopt)
  ↓
L4 (Fast path)
  ↓
L5 (Slow path)
  ↓
L6-L9 (Stats, Lifecycle, Intel, Legacy)
```

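
One way the layer ordering could be checked automatically is sketched below. This is a hypothetical helper, not part of HAKMEM: it maps file-name prefixes to the layer numbers listed above (layer 2b is treated as layer 2, and any unmatched file as layer 9, legacy) and reports whether a sequence of includes ever steps back to a lower layer. The prefix table and the names `tiny_layer_of` and `tiny_include_order_ok` are assumptions made for illustration; a real check would also need to extract the include list from hakmem_tiny.c.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *prefix;  /* file-name prefix to match */
    int layer;           /* layer number from the include-order list */
} TinyLayerRule;

static const TinyLayerRule tiny_layer_rules[] = {
    {"tiny_atomic", 0},
    {"tiny_owner", 1},            {"slab_handle", 1},
    {"hakmem_tiny_superslab", 2}, {"tiny_remote_", 2},
    {"tiny_publish", 3},          {"tiny_mailbox", 3},   {"tiny_adopt", 3},
    {"tiny_alloc_fast", 4},       {"tiny_free_fast", 4},
    {"hakmem_tiny_refill", 5},    {"tiny_alloc_slow", 5},
    {"tiny_free_remote", 5},      {"tiny_free_guard", 5},
    {"hakmem_tiny_stats", 6},     {"hakmem_tiny_query", 6},
    {"tiny_init_", 7},            {"tiny_lifecycle_", 7},
    {"tiny_intel_", 8},
};

/* Returns the layer of a file, or 9 (legacy) when no rule matches. */
static int tiny_layer_of(const char *file) {
    size_t n = sizeof tiny_layer_rules / sizeof tiny_layer_rules[0];
    for (size_t i = 0; i < n; i++) {
        const char *p = tiny_layer_rules[i].prefix;
        if (strncmp(file, p, strlen(p)) == 0) {
            return tiny_layer_rules[i].layer;
        }
    }
    return 9;
}

/* Returns 1 when the layer never decreases along the include sequence. */
static int tiny_include_order_ok(const char *const *files, size_t count) {
    int prev = 0;
    for (size_t i = 0; i < count; i++) {
        int layer = tiny_layer_of(files[i]);
        if (layer < prev) {
            return 0;  /* a lower layer was included after a higher one */
        }
        prev = layer;
    }
    return 1;
}
```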

---

## Risk & Mitigation

| Risk | Impact | Mitigation |
|------|--------|-----------|
| Include order bug | Compilation failure | Layer-wise testing, CI |
| Inlining threshold | Performance regression | `__always_inline`, perf profiling |
| TLS contention | Bottleneck | Lock-free CAS, batch ops |
| Remote queue scalability | High-contention bottleneck | Adaptive backoff, sharding |

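
For the inlining risk, note that `__always_inline` is not standard C. It is a macro provided by some system headers, so a project-local wrapper is a common way to stay portable. The sketch below assumes GCC or Clang as the primary compilers. The macro name `TINY_ALWAYS_INLINE` and the example helper `tiny_size_to_class`, including its 16-byte class granularity, are hypothetical and only serve to show the macro in use.

```c
/* Portable force-inline wrapper (hypothetical name). */
#if defined(__GNUC__) || defined(__clang__)
#  define TINY_ALWAYS_INLINE static inline __attribute__((always_inline))
#elif defined(_MSC_VER)
#  define TINY_ALWAYS_INLINE static __forceinline
#else
#  define TINY_ALWAYS_INLINE static inline  /* fallback: only a hint */
#endif

/* Hypothetical hot-path helper: round a size up to a 16-byte class index. */
TINY_ALWAYS_INLINE unsigned tiny_size_to_class(unsigned size) {
    return (size + 15u) / 16u;
}
```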

---

## Success Criteria

- ✅ **All tests pass** (unit + integration + larson)
- ✅ **Fast path = 3-4 instructions** (assembly verification)
- ✅ **+10-25% throughput** (16-64B size classes, vs baseline)
- ✅ **All files <= 500 lines**
- ✅ **Zero cyclic dependencies** (include graph analysis)
- ✅ **Documentation complete**

---

## Documents

This refactoring plan consists of the following documents:

1. **REFACTOR_PLAN.md** - Detailed strategy, analysis, and timeline
2. **REFACTOR_IMPLEMENTATION_GUIDE.md** - Implementation steps, code examples, and tests
3. **REFACTOR_SUMMARY.md** (this file) - Executive summary

---

## Next Steps

1. **Start Week 1**: Create Box 1 (tiny_atomic.h)
2. **Run benchmarks**: Record the baseline
3. **Strengthen CI**: Check the include order automatically
4. **Gradual migration**: Proceed incrementally, one Box at a time

---

## Contact & Questions

- For implementation details, see REFACTOR_IMPLEMENTATION_GUIDE.md
- For the overall strategy, see REFACTOR_PLAN.md
- For the responsibilities of each Box, see the Phase 2 section

✨ **Let's refactor HAKMEM Tiny to be as simple and fast as System tcache!** ✨