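# HAKMEM Tiny Allocator Super-Refactoring Plan
## Executive Summary
### Current State
- **hakmem_tiny.c (1584 lines)**: a container that aggregates multiple .inc files
- **hakmem_tiny_free.inc (1470 lines)**: the largest file, with mixed responsibilities
  - Free path (lines 33-558)
  - SuperSlab allocation (lines 559-998)
  - SuperSlab free (lines 999-1369)
  - Query API (commented out; extracted to hakmem_tiny_query.c)
**Problems**:
1. A single mega-file (1470 lines)
2. Free and allocation logic are mixed together
3. Responsibilities are unclear
4. `static inline` functions are deeply nested
### Goal
**"Split the code into files of 500 lines or fewer, following Box Theory"**
- Each file has a single responsibility (SRP)
- `static inline` keeps the boundaries zero-cost
- Dependencies are made explicit
- The refactoring order is optimized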
---
## Phase 1: Current-State Analysis
### Top 10 Largest Files
| Rank | File | Lines | Responsibility |
|------|------|-------|----------------|
| 1 | hakmem_pool.c | 2592 | Mid/Large allocator (out of scope) |
| 2 | hakmem_tiny.c | 1584 | Tiny aggregator (under analysis) |
| 3 | **hakmem_tiny_free.inc** | **1470** | Free + SS Alloc + Query (must be split) |
| 4 | hakmem.c | 1449 | Top-level allocator (out of scope) |
| 5 | hakmem_l25_pool.c | 1195 | L25 pool (out of scope) |
| 6 | hakmem_tiny_intel.inc | 863 | Intel optimizations (split candidate) |
| 7 | hakmem_tiny_superslab.c | 810 | SuperSlab (keep; already hardened) |
| 8 | hakmem_tiny_stats.c | 697 | Statistics (keep) |
| 9 | tiny_remote.c | 645 | Remote queue (keep; split candidate) |
| 10 | hakmem_learner.c | 603 | Learning (out of scope) |
### Tiny-Related Files Over 500 Lines
```
hakmem_tiny_free.inc 1470 ← must be split (top priority)
hakmem_tiny_intel.inc 863 ← split candidate
hakmem_tiny_init.inc 544 ← split candidate
tiny_remote.c 645 ← split candidate
```
### .inc Files Included by hakmem_tiny.c (44 files)
**Largest (over 500 lines):**
- hakmem_tiny_free.inc (1470) ← **top priority**
- hakmem_tiny_intel.inc (863)
- hakmem_tiny_init.inc (544)
**Medium (200-500 lines):**
- hakmem_tiny_refill.inc.h (410)
- hakmem_tiny_alloc_new.inc (275)
- hakmem_tiny_background.inc (261)
- hakmem_tiny_alloc.inc (249)
- hakmem_tiny_lifecycle.inc (244)
- hakmem_tiny_metadata.inc (226)
**Small (100-200 lines):**
- hakmem_tiny_ultra_simple.inc (176)
- hakmem_tiny_slab_mgmt.inc (163)
- hakmem_tiny_fastcache.inc.h (149)
- hakmem_tiny_hotmag.inc.h (147)
- hakmem_tiny_smallmag.inc.h (139)
- hakmem_tiny_hot_pop.inc.h (118)
- hakmem_tiny_bump.inc.h (107)
---
## Phase 2: Classifying Responsibilities with Box Theory
### Box 1: Atomic Ops (lowest layer, 50-100 lines)
**Responsibility**: wrappers for CAS/exchange/fetch and memory-order management
**New file**:
- `tiny_atomic.h` (80 lines)
**Contents**:
```c
// Atomics for remote queue, owner_tid, refcount:
//   tiny_atomic_cas()
//   tiny_atomic_exchange()
//   tiny_atomic_load() / tiny_atomic_store()
//   Memory-order wrappers
```
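As a rough illustration of what such a header could contain, here is a minimal sketch built only on C11 `<stdatomic.h>`. The signatures (all operating on `_Atomic uintptr_t`) and the chosen memory orders are assumptions made for this example; the plan names the functions but does not specify them.

```c
// tiny_atomic.h (sketch): thin wrappers over C11 <stdatomic.h>.
// Signatures and memory orders are assumptions, not the project's final API.
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

// The wrappers carry pointers as uintptr_t, so both must have the same size.
static_assert(sizeof(uintptr_t) == sizeof(void*), "uintptr_t must hold a pointer");

// Acquire load: reads after this call cannot be reordered before it.
static inline uintptr_t tiny_atomic_load(_Atomic uintptr_t* p) {
    return atomic_load_explicit(p, memory_order_acquire);
}

// Release store: writes before this call are visible to an acquire load of v.
static inline void tiny_atomic_store(_Atomic uintptr_t* p, uintptr_t v) {
    atomic_store_explicit(p, v, memory_order_release);
}

// Exchange with acquire-release ordering; returns the previous value.
static inline uintptr_t tiny_atomic_exchange(_Atomic uintptr_t* p, uintptr_t v) {
    return atomic_exchange_explicit(p, v, memory_order_acq_rel);
}

// Strong CAS; on failure *expected is overwritten with the observed value.
static inline bool tiny_atomic_cas(_Atomic uintptr_t* p, uintptr_t* expected,
                                   uintptr_t desired) {
    return atomic_compare_exchange_strong_explicit(
        p, expected, desired, memory_order_acq_rel, memory_order_acquire);
}
```

Fixing the memory order inside each wrapper keeps call sites short; callers that need relaxed ordering on a hot path would need separate variants.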
---
### Box 2: Remote Queue & Ownership (lower layer, 500-700 lines)
#### 2.1: Remote Queue Operations (`tiny_remote_queue.inc.h`, 250-350 lines)
**Responsibility**: MPSC stack operations, guard checks, node management
**Source**: extracted from the remote-queue portion of hakmem_tiny_free.inc
```c
// Planned functions:
//   tiny_remote_queue_contains_guard()
//   tiny_remote_queue_push()
//   tiny_remote_queue_pop()
//   tiny_remote_drain_owner()  // from hakmem_tiny_free.inc:170
```
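The MPSC stack mentioned above can be sketched as an intrusive lock-free stack: any thread pushes with a CAS loop, and the single owner detaches the whole chain with one exchange. The `TinyRemoteQueue` type, the `tiny_remote_queue_pop_all()` name, and the choice to drain everything at once are assumptions for illustration; the existing code in hakmem_tiny_free.inc may differ.

```c
// Sketch of an intrusive MPSC stack for remote frees (hypothetical types).
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

// Freed blocks store the "next" link in their first word.
typedef struct {
    _Atomic(void*) head;
} TinyRemoteQueue;

// Producer side (any thread): lock-free push via a CAS loop.
static inline void tiny_remote_queue_push(TinyRemoteQueue* q, void* block) {
    assert(block != NULL);
    void* old = atomic_load_explicit(&q->head, memory_order_relaxed);
    do {
        *(void**)block = old;  // Link block in front of the observed head
    } while (!atomic_compare_exchange_weak_explicit(
        &q->head, &old, block, memory_order_release, memory_order_relaxed));
}

// Consumer side (owner thread only): detach the whole chain in one exchange.
// Taking the full chain at once avoids the ABA problem of popping single nodes.
static inline void* tiny_remote_queue_pop_all(TinyRemoteQueue* q) {
    return atomic_exchange_explicit(&q->head, NULL, memory_order_acquire);
}

// Walk a detached chain and count its blocks.
static inline size_t tiny_remote_chain_length(void* chain) {
    size_t n = 0;
    for (void* p = chain; p != NULL; p = *(void**)p) n++;
    return n;
}
```

The owner can then walk the detached chain without further synchronization, since no other thread can reach those blocks anymore.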
#### 2.2: Remote Drain Logic (`tiny_remote_drain.inc.h`, 200-250 lines)
**Responsibility**: drain logic, TLS cleanup
**Source**: the drain logic in hakmem_tiny_free.inc
```c
// Planned functions:
//   tiny_remote_drain_batch()
//   tiny_remote_process_mailbox()
```
#### 2.3: Ownership (Owner TID) (`tiny_owner.inc.h`, 100-150 lines)
**Responsibility**: acquire/release of owner_tid, slab ownership
**Existing**: slab_handle.h (295 lines; kept and hardened)
**New**: tiny_owner.inc.h
```c
// Planned functions:
//   tiny_owner_acquire()
//   tiny_owner_release()
//   tiny_owner_self()
```
**Depends on**: Box 1 (Atomic)
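One plausible shape for the ownership helpers is a CAS on the `owner_tid` field. In the sketch below, `TinySlabOwner`, `TINY_OWNER_NONE`, and `tiny_owner_is_self()` are hypothetical names introduced for the example, and the convention that 0 means "unowned" is an assumption.

```c
// Sketch of owner_tid acquire/release (hypothetical types and constants).
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TINY_OWNER_NONE 0u  // Assumption: 0 never identifies a real thread

typedef struct {
    _Atomic uint32_t owner_tid;  // In the real code this lives in slab metadata
} TinySlabOwner;

// Claim an unowned slab; fails if another thread already owns it.
static inline bool tiny_owner_acquire(TinySlabOwner* s, uint32_t self_tid) {
    assert(self_tid != TINY_OWNER_NONE);
    uint32_t expected = TINY_OWNER_NONE;
    return atomic_compare_exchange_strong_explicit(
        &s->owner_tid, &expected, self_tid,
        memory_order_acq_rel, memory_order_relaxed);
}

// Give the slab up; succeeds only when called by the current owner.
static inline bool tiny_owner_release(TinySlabOwner* s, uint32_t self_tid) {
    uint32_t expected = self_tid;
    return atomic_compare_exchange_strong_explicit(
        &s->owner_tid, &expected, TINY_OWNER_NONE,
        memory_order_release, memory_order_relaxed);
}

// True when the calling thread (identified by self_tid) owns the slab.
static inline bool tiny_owner_is_self(TinySlabOwner* s, uint32_t self_tid) {
    return atomic_load_explicit(&s->owner_tid, memory_order_acquire) == self_tid;
}
```

Releasing with a CAS rather than a plain store means a buggy release from a non-owner is detected instead of silently clearing another thread's ownership.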
---
### Box 3: SuperSlab Core (`hakmem_tiny_superslab.c` + `hakmem_tiny_superslab.h`, keep)
**Responsibility**: SuperSlab allocation, cache, registry
**Current state**: 810 lines, already well-structured
**Enhancement**: integrate with the following Boxes
- Box 4: Publish/Adopt
- Box 2: Remote ops
---
### Box 4: Publish/Adopt (upper layer, 400-500 lines)
#### 4.1: Publish (`tiny_publish.c/h`, keep, 34 lines)
**Responsibility**: publish freelist changes
**Existing**: tiny_publish.c (34 lines) ← already tiny
#### 4.2: Mailbox (`tiny_mailbox.c/h`, keep, 252 lines)
**Responsibility**: adopt requests from other threads
**Existing**: tiny_mailbox.c (252 lines) → consider splitting
```c
// tiny_mailbox_push()   (50 lines)
// tiny_mailbox_drain()  (150 lines)
```
**Proposed split**:
- `tiny_mailbox_push.inc.h` (50 lines)
- `tiny_mailbox_drain.inc.h` (150 lines)
#### 4.3: Adopt Logic (`tiny_adopt.inc.h`, 200-300 lines)
**Responsibility**: logic for adopting a slab from a SuperSlab
**Source**: extracted from the adoption logic in hakmem_tiny_free.inc
```c
// Planned functions:
//   tiny_adopt_request()
//   tiny_adopt_select()
//   tiny_adopt_cooldown()
```
**Depends on**: Box 3 (SuperSlab), Box 4.2 (Mailbox), Box 2 (Ownership)
---
### Box 5: Allocation Path (cross-cutting, 600-800 lines)
#### 5.1: Fast Path (`tiny_alloc_fast.inc.h`, 200-300 lines)
**Responsibility**: a 3-4 instruction fast path (direct pop from the TLS cache)
**Source**: hakmem_tiny_ultra_simple.inc (176 lines) + hakmem_tiny_fastcache.inc.h (149 lines)
```c
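// Ultra-simple fast path (SRP): pop the head of the per-class TLS freelist.
static inline void* tiny_fast_alloc(int class_idx) {
    void** head = &g_tls_cache[class_idx];
    void* ptr = *head;
    if (ptr) {
        *head = *(void**)ptr;  // Pop: the next pointer lives in the block's first word
        // Assumed fix: keep the count in sync with tiny_fast_push(),
        // otherwise the cache eventually looks full.
        atomic_fetch_sub_explicit(&g_tls_cache_count[class_idx], 1, memory_order_relaxed);
    }
    return ptr;
}
// Fast push: returns 1 if cached, 0 if the cache is full (caller takes the slow path).
static inline int tiny_fast_push(int class_idx, void* ptr) {
    int cap = g_tls_cache_cap[class_idx];
    int cnt = atomic_load_explicit(&g_tls_cache_count[class_idx], memory_order_relaxed);
    if (cnt < cap) {
        void** head = &g_tls_cache[class_idx];
        *(void**)ptr = *head;  // Link the block in front of the current head
        *head = ptr;
        // C11 has no atomic_increment(); atomic_fetch_add_explicit() is the standard call.
        atomic_fetch_add_explicit(&g_tls_cache_count[class_idx], 1, memory_order_relaxed);
        return 1;
    }
    return 0;  // Slow path
}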
```
#### 5.2: Refill Logic (`hakmem_tiny_refill.inc.h`, 410 lines, existing)
**Responsibility**: refilling the cache
**Current state**: hakmem_tiny_refill.inc.h (410 lines) ← already well-sized
#### 5.3: Slow Path (`tiny_alloc_slow.inc.h`, 250-350 lines)
**Responsibility**: SuperSlab → new slab → refill
**Source**: superslab_refill and the allocation logic from hakmem_tiny_free.inc,
plus hakmem_tiny_alloc.inc (249 lines)
```c
// Planned functions:
//   tiny_alloc_slow()
//   tiny_refill_from_superslab()
//   tiny_new_slab_alloc()
```
**Depends on**: Box 3 (SuperSlab), Box 5.2 (Refill)
---
### Box 6: Free Path (cross-cutting, 600-800 lines)
#### 6.1: Fast Free (`tiny_free_fast.inc.h`, 200-250 lines)
**Responsibility**: same-thread free, TLS cache push
**Source**: the fast-path free logic in hakmem_tiny_free.inc
```c
// Fast same-thread free: returns 1 if the block was cached, 0 to fall back to the slow path.
static inline int tiny_free_fast(void* ptr, int class_idx) {
    // Owner check + cache push
    uint32_t self_tid = tiny_self_u32();
    TinySlab* slab = hak_tiny_owner_slab(ptr);
    if (!slab || slab->owner_tid != self_tid)
        return 0;  // Not ours (or unknown pointer): slow path
    return tiny_fast_push(class_idx, ptr);  // 0 here also means slow path (cache full)
}
```
#### 6.2: Cross-Thread Free (`tiny_free_remote.inc.h`, 250-300 lines)
**Responsibility**: remote queue push, publish
**Source**: the cross-thread logic and remote push in hakmem_tiny_free.inc
```c
// Planned functions:
//   tiny_free_remote()
//   tiny_free_remote_queue_push()
```
**Depends on**: Box 2 (Remote Queue), Box 4.1 (Publish)
#### 6.3: Guard/Safety (`tiny_free_guard.inc.h`, 100-150 lines)
**Responsibility**: guard sentinel check, bounds validation
**Source**: the guard logic in hakmem_tiny_free.inc
```c
// Planned functions:
//   tiny_free_guard_check()
//   tiny_free_validate_ptr()
```
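The bounds-validation half of this Box could look roughly like the following. The `TinySlabBounds` struct and the signature of `tiny_free_validate_ptr()` are invented for this sketch (the plan only names the function), and the sentinel check is left out because its format is not described here.

```c
// Sketch of pointer validation before a free (hypothetical slab description).
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t base;      // Address of the first block in the slab
    size_t block_size;   // Bytes per block (must be > 0)
    size_t block_count;  // Number of blocks in the slab
} TinySlabBounds;

// Returns true only if ptr is the start address of a block inside the slab.
static inline bool tiny_free_validate_ptr(const TinySlabBounds* s, const void* ptr) {
    assert(s->block_size > 0);
    uintptr_t p = (uintptr_t)ptr;
    if (p < s->base) return false;                 // Below the slab
    uintptr_t off = p - s->base;
    if (off >= (uintptr_t)s->block_size * s->block_count)
        return false;                              // Past the last block
    return off % s->block_size == 0;               // Interior pointers are rejected
}
```

A check of this kind is cheap enough for debug builds; whether it belongs on the release fast path is a separate decision that the plan leaves open.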
---
### Box 7: Statistics & Query (analysis layer, 700-900 lines)
#### Existing (keep):
- hakmem_tiny_stats.c (697 lines) - Stats aggregate
- hakmem_tiny_stats_api.h (103 lines) - Stats API
- hakmem_tiny_stats.h (278 lines) - Stats internal
- hakmem_tiny_query.c (72 lines) - Query API
#### Split under consideration:
hakmem_tiny_stats.c (697 lines) is dedicated to the statistics engine, so it can stay as is
---
### Box 8: Lifecycle (initialization and cleanup, 544 lines)
#### Existing:
- hakmem_tiny_init.inc (544 lines) - Initialization
- hakmem_tiny_lifecycle.inc (244 lines) - Lifecycle
- hakmem_tiny_slab_mgmt.inc (163 lines) - Slab management
**Proposed split**:
- `tiny_init_globals.inc.h` (150 lines) - Global vars
- `tiny_init_config.inc.h` (150 lines) - Config from env
- `tiny_init_pools.inc.h` (150 lines) - Pool allocation
- `tiny_lifecycle_trim.inc.h` (120 lines) - Trim logic
- `tiny_lifecycle_shutdown.inc.h` (120 lines) - Shutdown
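For the "Config from env" piece, a small helper that parses an integer with a default and a clamp range keeps each setting to one line. Everything named below is hypothetical: `tiny_parse_int()`, `tiny_config_cache_cap()`, the variable `HAKMEM_TINY_CACHE_CAP`, and the default and limits are placeholders, not values taken from the code base.

```c
// Sketch of env-based configuration parsing (all names are placeholders).
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

// Parse s as a decimal integer; fall back to def on any error, clamp to [lo, hi].
static int tiny_parse_int(const char* s, int def, int lo, int hi) {
    assert(lo <= hi);
    if (s == NULL || *s == '\0') return def;       // Unset or empty
    char* end = NULL;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0')
        return def;                                // Overflow or trailing garbage
    if (v < lo) return lo;
    if (v > hi) return hi;
    return (int)v;
}

// Example setting: per-class TLS cache capacity read once at init time.
static int tiny_config_cache_cap(void) {
    return tiny_parse_int(getenv("HAKMEM_TINY_CACHE_CAP"), 64, 8, 4096);
}
```

Separating the parser from `getenv()` makes the parsing rules testable without touching the process environment.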
---
### Box 9: Intel Specific (863 lines)
**Proposed split**:
- `tiny_intel_fast.inc.h` (300 lines) - Prefetch + PAUSE
- `tiny_intel_cache.inc.h` (200 lines) - Cache tuning
- `tiny_intel_cfl.inc.h` (150 lines) - CFL-specific
- `tiny_intel_skl.inc.h` (150 lines) - SKL-specific (to be merged into shared code)
---
## Phase 3: Split Execution Plan
### Priority 1: Critical Path (1 week)
**Goal**: reduce the fast path to the 3-4 instruction level
1. **Box 1: tiny_atomic.h** (80 lines) ✨
   - `atomic_load_explicit()` wrapper
   - `atomic_store_explicit()` wrapper
   - `atomic_compare_exchange_strong_explicit()` wrapper (CAS)
   - Depends on: `<stdatomic.h>` only
2. **Box 5.1: tiny_alloc_fast.inc.h** (250 lines) ✨
   - Ultra-simple TLS cache pop
   - Depends on: Box 1
3. **Box 6.1: tiny_free_fast.inc.h** (200 lines) ✨
   - Same-thread fast free
   - Depends on: Box 1, Box 5.1
4. **Extract from hakmem_tiny_free.inc**:
   - Fast path logic (500 lines) → into the files above
   - SuperSlab path (400 lines) → Box 5.3 and 6.2
   - Remote logic (250 lines) → Box 2
   - Cleanup → hakmem_tiny_free.inc shrinks to 300 lines
**Effect**: the fast path is optimized to a level comparable to the system tcache
---
### Priority 2: Remote & Ownership (1 week)
5. **Box 2.1: tiny_remote_queue.inc.h** (300 lines)
   - Remote queue ops
   - Depends on: Box 1
6. **Box 2.3: tiny_owner.inc.h** (120 lines)
   - Owner TID management
   - Depends on: Box 1, slab_handle.h (existing)
7. **Clean up tiny_remote.c** (645 lines)
   - `tiny_remote_queue_ops()` → move to tiny_remote_queue.inc.h
   - `tiny_remote_side_*()` → keep
   - Resize: shrink from 645 to 350 lines
**Effect**: remote ops are modularized
---
### Priority 3: SuperSlab Integration (1-2 weeks)
8. **Box 3 enhancement**: hakmem_tiny_superslab.c (810 lines, keep)
   - Publish/Adopt integration
   - Depends on: Box 2, Box 4
9. **Box 4.1-4.3: Publish/Adopt Path** (400-500 lines)
   - `tiny_publish.c` (34 lines, existing)
   - `tiny_mailbox.c` → split
   - `tiny_adopt.inc.h` (new)
**Effect**: SuperSlab adoption is fully integrated
---
### Priority 4: Allocation/Free Slow Path (1 week)
10. **Box 5.2-5.3: Refill & Slow Allocation** (650 lines)
    - hakmem_tiny_refill.inc.h (410 lines, existing)
    - `tiny_alloc_slow.inc.h` (new, 300 lines)
11. **Box 6.2-6.3: Cross-thread Free** (400 lines)
    - `tiny_free_remote.inc.h` (new)
    - `tiny_free_guard.inc.h` (new)
**Effect**: the slow path is cleanly separated
---
### Priority 5: Lifecycle & Config (1-2 weeks)
12. **Box 8: Split Lifecycle** (400-500 lines)
    - hakmem_tiny_init.inc (544 lines) → 150 + 150 + 150
    - hakmem_tiny_lifecycle.inc (244 lines) → 120 + 120
    - Remove duplication
13. **Box 9: Reorganize Intel-specific code** (863 lines)
    - `tiny_intel_fast.inc.h` (300 lines)
    - `tiny_intel_cache.inc.h` (200 lines)
    - `tiny_intel_common.inc.h` (150 lines)
    - Deduplicate × 3 architectures
**Effect**: configuration management is unified
---
## Phase 4: Proposed New File Layout
### Final Layout
```
core/
├─ Box 1: Atomic Ops
│  └─ tiny_atomic.h (80 lines)
├─ Box 2: Remote & Ownership
│  ├─ tiny_remote.h (80 lines, existing, slimmed down)
│  ├─ tiny_remote_queue.inc.h (300 lines, new)
│  ├─ tiny_remote_drain.inc.h (150 lines, new)
│  ├─ tiny_owner.inc.h (120 lines, new)
│  └─ slab_handle.h (295 lines, existing, keep)
├─ Box 3: SuperSlab Core
│  ├─ hakmem_tiny_superslab.h (500 lines, existing)
│  └─ hakmem_tiny_superslab.c (810 lines, existing)
├─ Box 4: Publish/Adopt
│  ├─ tiny_publish.h (6 lines, existing)
│  ├─ tiny_publish.c (34 lines, existing)
│  ├─ tiny_mailbox.h (11 lines, existing)
│  ├─ tiny_mailbox.c (252 lines, existing) → can be split
│  ├─ tiny_mailbox_push.inc.h (80 lines, new)
│  ├─ tiny_mailbox_drain.inc.h (150 lines, new)
│  └─ tiny_adopt.inc.h (300 lines, new)
├─ Box 5: Allocation
│  ├─ tiny_alloc_fast.inc.h (250 lines, new)
│  ├─ hakmem_tiny_refill.inc.h (410 lines, existing)
│  └─ tiny_alloc_slow.inc.h (300 lines, new)
├─ Box 6: Free
│  ├─ tiny_free_fast.inc.h (200 lines, new)
│  ├─ tiny_free_remote.inc.h (300 lines, new)
│  ├─ tiny_free_guard.inc.h (120 lines, new)
│  └─ hakmem_tiny_free.inc (1470 lines, existing) → reduced to 300 lines
├─ Box 7: Statistics
│  ├─ hakmem_tiny_stats.c (697 lines, existing)
│  ├─ hakmem_tiny_stats.h (278 lines, existing)
│  ├─ hakmem_tiny_stats_api.h (103 lines, existing)
│  └─ hakmem_tiny_query.c (72 lines, existing)
├─ Box 8: Lifecycle
│  ├─ tiny_init_globals.inc.h (150 lines, new)
│  ├─ tiny_init_config.inc.h (150 lines, new)
│  ├─ tiny_init_pools.inc.h (150 lines, new)
│  ├─ tiny_lifecycle_trim.inc.h (120 lines, new)
│  └─ tiny_lifecycle_shutdown.inc.h (120 lines, new)
├─ Box 9: Intel-specific
│  ├─ tiny_intel_common.inc.h (150 lines, new)
│  ├─ tiny_intel_fast.inc.h (300 lines, new)
│  └─ tiny_intel_cache.inc.h (200 lines, new)
└─ Integration
   └─ hakmem_tiny.c (1584 lines, existing, include aggregator)
      └─ New format:
         1. Includes Box 1-9
         2. Minimal glue code only
```
---
## Phase 5: Optimizing Include Order
### Safe Include Dependencies
```mermaid
graph TD
A[Box 1: tiny_atomic.h] --> B[Box 2: tiny_remote.h]
A --> C[Box 5/6: Alloc/Free]
B --> D[Box 2.1: tiny_remote_queue.inc.h]
D --> E[tiny_remote.c]
A --> F[Box 4: Publish/Adopt]
E --> F
C --> G[Box 3: SuperSlab]
F --> G
G --> H[Box 5.3/6.2: Slow Path]
I[Box 8: Lifecycle] --> H
J[Box 9: Intel] --> C
```
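Because the Boxes are textually included into one translation unit, a wrong include order may only surface as a confusing compile error. One possible safeguard, shown here as a sketch, is for each Box to define a marker macro and for higher layers to check the markers they depend on. The `TINY_HAVE_*` macro names are invented for this example, and the three "files" are simulated inline in a single block.

```c
// Sketch: compile-time include-order checks via marker macros (hypothetical names).
#include <assert.h>

// --- contents of tiny_atomic.h (Box 1) ---
#define TINY_HAVE_ATOMIC 1

// --- contents of tiny_remote_queue.inc.h (Box 2.1) ---
#ifndef TINY_HAVE_ATOMIC
#error "tiny_remote_queue.inc.h requires tiny_atomic.h to be included first"
#endif
#define TINY_HAVE_REMOTE_QUEUE 1

// --- contents of tiny_adopt.inc.h (Box 4.3) ---
#if !defined(TINY_HAVE_ATOMIC) || !defined(TINY_HAVE_REMOTE_QUEUE)
#error "tiny_adopt.inc.h requires Box 1 and Box 2.1 to be included first"
#endif
#define TINY_HAVE_ADOPT 1

// --- end of hakmem_tiny.c: confirm every expected layer was pulled in ---
static_assert(TINY_HAVE_ATOMIC && TINY_HAVE_REMOTE_QUEUE && TINY_HAVE_ADOPT,
              "all required Boxes must be included");
```

Moving one of the simulated sections above its dependency turns the mistake into a one-line `#error` message that names the missing header.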
### New Format for hakmem_tiny.c
```c
#include "hakmem_tiny.h"
#include "hakmem_tiny_config.h"
// ============================================================
// LAYER 0: Atomic + Ownership (lowest)
// ============================================================
#include "tiny_atomic.h"
#include "tiny_owner.inc.h"
#include "slab_handle.h"
// ============================================================
// LAYER 1: Remote Queue + SuperSlab Core
// ============================================================
#include "hakmem_tiny_superslab.h"
#include "tiny_remote_queue.inc.h"
#include "tiny_remote_drain.inc.h"
#include "tiny_remote.inc" // tiny_remote_side_*
#include "tiny_remote.c" // Link-time
// ============================================================
// LAYER 2: Publish/Adopt (publication mechanism)
// ============================================================
#include "tiny_publish.h"
#include "tiny_publish.c"
#include "tiny_mailbox.h"
#include "tiny_mailbox_push.inc.h"
#include "tiny_mailbox_drain.inc.h"
#include "tiny_mailbox.c"
#include "tiny_adopt.inc.h"
// ============================================================
// LAYER 3: Fast Path (allocation + free)
// ============================================================
#include "tiny_alloc_fast.inc.h"
#include "tiny_free_fast.inc.h"
// ============================================================
// LAYER 4: Slow Path (refill + cross-thread free)
// ============================================================
#include "hakmem_tiny_refill.inc.h"
#include "tiny_alloc_slow.inc.h"
#include "tiny_free_remote.inc.h"
#include "tiny_free_guard.inc.h"
// ============================================================
// LAYER 5: Statistics + Query + Metadata
// ============================================================
#include "hakmem_tiny_stats.h"
#include "hakmem_tiny_query.c"
#include "hakmem_tiny_metadata.inc"
// ============================================================
// LAYER 6: Lifecycle + Init
// ============================================================
#include "tiny_init_globals.inc.h"
#include "tiny_init_config.inc.h"
#include "tiny_init_pools.inc.h"
#include "tiny_lifecycle_trim.inc.h"
#include "tiny_lifecycle_shutdown.inc.h"
// ============================================================
// LAYER 7: Intel-specific optimizations
// ============================================================
#include "tiny_intel_common.inc.h"
#include "tiny_intel_fast.inc.h"
#include "tiny_intel_cache.inc.h"
// ============================================================
// LAYER 8: Legacy/Experimental (kept for compat)
// ============================================================
#include "hakmem_tiny_ultra_simple.inc"
#include "hakmem_tiny_alloc.inc"
#include "hakmem_tiny_slow.inc"
// ============================================================
// LAYER 9: Old free.inc (minimal, mostly extracted)
// ============================================================
#include "hakmem_tiny_free.inc" // Now just cleanup
#include "hakmem_tiny_background.inc"
#include "hakmem_tiny_magazine.h"
#include "tiny_refill.h"
#include "tiny_mmap_gate.h"
```
---
## Phase 6: Implementation Guide
### Key Principles
1. **SRP (Single Responsibility Principle)**
   - Each file: one responsibility, 500 lines or fewer
   - No sideways dependencies
2. **Zero-Cost Abstraction**
   - All boundaries via `static inline`
   - No function pointer indirection
   - Compiler inlines aggressively
3. **Cyclic Dependency Prevention**
   - Layer 0 → Layer 1 → ... → Layer 9
   - Avoid backward dependencies
4. **Backward Compatibility**
   - Keep the legacy .inc files (for compatibility)
   - Migrate to the new files incrementally
### Where to Use Static Inline
#### ✅ Use `static inline`:
```c
// tiny_atomic.h
// Takes an _Atomic pointer directly: casting a plain int* to _Atomic int* is not portable.
static inline void tiny_atomic_store(_Atomic int* p, int v) {
    atomic_store_explicit(p, v, memory_order_release);
}
// tiny_alloc_fast.inc.h
static inline void* tiny_fast_pop_alloc(int class_idx) {
    void** head = &g_tls_cache[class_idx];
    void* ptr = *head;
    if (ptr) *head = *(void**)ptr;  // Pop: advance head to the next free block
    return ptr;
}
// tiny_alloc_slow.inc.h
static inline void* tiny_refill_from_superslab(int class_idx) {
    SuperSlab* ss = g_tls_current_ss[class_idx];
    if (ss) return superslab_alloc_from_slab(ss, ...);  // Remaining arguments elided in the plan
    return NULL;
}
```
#### ❌ Don't use `static inline` for:
- Large functions (>20 lines)
- Slow path logic
- Setup/teardown code
#### ✅ Use regular functions:
```c
// tiny_remote.c
void tiny_remote_drain_batch(int class_idx) {
    // 50+ lines: slow path → regular function
}
// hakmem_tiny_superslab.c
SuperSlab* superslab_refill(int class_idx) {
    // Complex allocation → regular function
}
```
### Macro Usage
#### Use Macros for:
```c
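// tiny_atomic.h
// Note: typeof is a GNU extension (standardized only in C23).
// Macro arguments are parenthesized so expressions such as `base + i` expand safely.
#define TINY_ATOMIC_LOAD(ptr, order) \
    atomic_load_explicit((_Atomic typeof(*(ptr))*)(ptr), (order))
#define TINY_ATOMIC_CAS(ptr, expected, desired) \
    atomic_compare_exchange_strong_explicit( \
        (_Atomic typeof(*(ptr))*)(ptr), (expected), (desired), \
        memory_order_release, memory_order_relaxed)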
```
#### Don't over-use for:
- Complex logic (use functions)
- Multiple statements (hard to debug)
---
## Phase 7: Testing Strategy
### Per-File Unit Tests
```c
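// test_tiny_alloc_fast.c
#include <assert.h>
#include <stdlib.h>
void test_tiny_alloc_fast_pop_empty(void) {
    g_tls_cache[0] = NULL;
    assert(tiny_fast_pop_alloc(0) == NULL);
}
void test_tiny_alloc_fast_push_pop(void) {
    void* ptr = malloc(sizeof(void*));  // Stand-in block, big enough for the next pointer
    assert(ptr != NULL);
    int pushed = tiny_fast_push(0, ptr);  // tiny_fast_push() is the push defined in Box 5.1
    assert(pushed == 1);
    assert(tiny_fast_pop_alloc(0) == ptr);  // LIFO: the last pushed block comes back first
    free(ptr);
}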
```
### Integration Tests
```c
// test_tiny_alloc_free_cycle.c
void test_alloc_free_single_thread(void) {
    void* p1 = hak_tiny_alloc(8);
    void* p2 = hak_tiny_alloc(8);
    hak_tiny_free(p1);
    hak_tiny_free(p2);
    // Verify no memory leak
}
void test_alloc_free_cross_thread(void) {
    // Thread A allocs, Thread B frees
    // Verify remote queue works
}
```
---
## Expected Benefits
### Performance
| Metric | Current | Target | Effect |
|--------|---------|--------|--------|
| Fast path instruction count | 20+ | 3-4 | -80% cycles |
| Branch misprediction | 50-100 cycles | 15-20 cycles | -70% |
| TLS cache hit rate | 70% | 85% | +15% throughput |
### Maintainability
| Metric | Current | Target | Effect |
|--------|---------|--------|--------|
| Max file size | 1470 lines | 300-400 lines | -70% complexity |
| Cyclic dependencies | Many | 0 | 100% clarified |
| Code review time | 3h | 30min | -83% |
### Development Speed
| Task | Current | After refactoring |
|------|---------|-------------------|
| Bug fix | 2-4h | 30min |
| Optimization | 4-6h | 1-2h |
| Feature add | 6-8h | 2-3h |
---
## Timeline
| Week | Task | Owner | Status |
|------|------|-------|--------|
| 1 | Box 1,5,6 (Fast path) | Claude | TODO |
| 2 | Box 2,3 (Remote/SS) | Claude | TODO |
| 3 | Box 4 (Publish/Adopt) | Claude | TODO |
| 4 | Box 8,9 (Lifecycle/Intel) | Claude | TODO |
| 5 | Testing + Integration | Claude | TODO |
| 6 | Benchmark + Tuning | Claude | TODO |
---
## Rollback Strategy
If performance regresses:
1. Keep all old .inc files (legacy compatibility)
2. hakmem_tiny.c can include either old or new
3. Gradual migration: one Box at a time
4. Benchmark after each Box
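One way to make "include either old or new" concrete is a compile-time switch per Box, so a regression can be rolled back by flipping a single flag. The `HAKMEM_TINY_BOX5_NEW` and `HAKMEM_TINY_BOX6_NEW` flags and the `TinyImpl` probe functions below are hypothetical; in the real aggregator each branch would contain the corresponding `#include` instead of a stub.

```c
// Sketch: per-Box compile-time switches for gradual migration (hypothetical flags).
#include <assert.h>

#ifndef HAKMEM_TINY_BOX5_NEW
#define HAKMEM_TINY_BOX5_NEW 1  // Allocation fast path: already on the new Box file
#endif
#ifndef HAKMEM_TINY_BOX6_NEW
#define HAKMEM_TINY_BOX6_NEW 0  // Free fast path: still on the legacy file
#endif
static_assert(HAKMEM_TINY_BOX5_NEW == 0 || HAKMEM_TINY_BOX5_NEW == 1, "flag must be 0 or 1");
static_assert(HAKMEM_TINY_BOX6_NEW == 0 || HAKMEM_TINY_BOX6_NEW == 1, "flag must be 0 or 1");

typedef enum { TINY_IMPL_LEGACY = 0, TINY_IMPL_BOX = 1 } TinyImpl;

#if HAKMEM_TINY_BOX5_NEW
// Real aggregator: #include "tiny_alloc_fast.inc.h"
static inline TinyImpl tiny_alloc_fast_impl(void) { return TINY_IMPL_BOX; }
#else
// Real aggregator: #include "hakmem_tiny_ultra_simple.inc"
static inline TinyImpl tiny_alloc_fast_impl(void) { return TINY_IMPL_LEGACY; }
#endif

#if HAKMEM_TINY_BOX6_NEW
// Real aggregator: #include "tiny_free_fast.inc.h"
static inline TinyImpl tiny_free_fast_impl(void) { return TINY_IMPL_BOX; }
#else
// Real aggregator: #include "hakmem_tiny_free.inc"
static inline TinyImpl tiny_free_fast_impl(void) { return TINY_IMPL_LEGACY; }
#endif
```

Building the benchmark twice, once with a flag at 0 and once at 1, gives the per-Box comparison that step 4 asks for.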
---
## Known Risks
1. **Include order sensitivity**: the order of the new Boxes is critical → test carefully
2. **Inlining threshold**: the compiler may not inline every `static inline` function → profiling needed
3. **TLS cache contention**: simplifying the fast path may turn TLS synchronization into a bottleneck → monitor g_tls_cache_count
4. **RemoteQueue scalability**: the Box 2 remote queue is weak under high contention → consider making it lock-free
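For risk 2, if profiling shows that a hot function is not being inlined, GCC and Clang offer function attributes to force the decision. The sketch below is one possible mitigation, not part of the plan: the `TINY_ALWAYS_INLINE` and `TINY_NOINLINE` macros and the `tiny_sketch_*` functions are illustrative names, and a single global stands in for one TLS freelist head.

```c
// Sketch: forcing inline on the fast path and keeping the slow path out of line.
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define TINY_ALWAYS_INLINE static inline __attribute__((always_inline))
#define TINY_NOINLINE __attribute__((noinline))
#else
#define TINY_ALWAYS_INLINE static inline  // Fallback: plain hint only
#define TINY_NOINLINE
#endif

static void* g_sketch_head;  // Stand-in for one TLS freelist head

// Slow path stays a real call so it does not bloat every call site.
TINY_NOINLINE static void* tiny_sketch_slow_refill(void) {
    return NULL;  // A real refill would fetch blocks from a SuperSlab
}

// Fast path: forced inline, falls through to the slow path only when empty.
TINY_ALWAYS_INLINE void* tiny_sketch_alloc(void) {
    void* p = g_sketch_head;
    if (p != NULL) {
        g_sketch_head = *(void**)p;  // Pop the head block
        return p;
    }
    return tiny_sketch_slow_refill();
}
```

Forced inlining is worth confirming in the generated assembly, since it can increase code size at every call site.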
---
## Success Criteria
✅ All tests pass (unit + integration + larson)
✅ Fast path = 3-4 instructions (assembly analysis)
✅ +10-15% throughput on Tiny allocations
✅ All files <= 500 lines
✅ Zero cyclic dependencies
✅ Documentation complete