# MID_POOL_V3 Design Document
## Overview
Mid/Pool v3 is a next-generation architecture that evolves the existing SmallObject v4 (MF2) and integrates O(1) ptr→page_meta lookup via RegionIdBox.
**Division of roles**: MID v3 handles 257-768B only; C7 ULTRA handles 769-1024B.
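The size split can be sketched as a small routing function. This is only an illustration of the two ranges stated above: the enum and function names are hypothetical, and how sizes outside 257-1024B are handled is not covered by this document.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical routing result; not a hakmem type. */
typedef enum {
    ROUTE_OTHER_SKETCH = 0,   /* outside both ranges; handled elsewhere */
    ROUTE_MID_V3_SKETCH,      /* 257-768 bytes */
    ROUTE_C7_ULTRA_SKETCH     /* 769-1024 bytes */
} route_sketch_t;

/* Pick the allocator front end for a request size, using the documented ranges. */
static route_sketch_t route_for_size_sketch(size_t size) {
    if (size >= 257 && size <= 768)  return ROUTE_MID_V3_SKETCH;
    if (size >= 769 && size <= 1024) return ROUTE_C7_ULTRA_SKETCH;
    return ROUTE_OTHER_SKETCH;
}
```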
## Phase Plan
| Phase | Content | Depends on |
|-------|---------|------------|
| MID-V3-0 | Design doc (this document) | - |
| MID-V3-1 | Type skeleton + ENV | MID-V3-0 |
| MID-V3-2 | Complete the RegionIdBox Registration API (V6-HDR-2) | MID-V3-1 |
| MID-V3-3 | RegionId integration (page registration at carve) | MID-V3-2 |
| MID-V3-4 | Allocation fast path implementation | MID-V3-3 |
| MID-V3-5 | Free/cold path implementation | MID-V3-4 |
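## Design Issue: Lane vs. Page Dual Management
### Problem (raised in Task Review)
The initial design assumed that both Lane and Page would manage a freelist.
However, the per-page freelist already works in the existing v4 MF2,
so adding Lane would duplicate management responsibility.
### Existing v4 MF2 Structure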
```c
// core/smallobject_hotbox_v4.c
typedef struct small_page_v4 {
    uint8_t  class_idx;
    uint16_t capacity;
    uint16_t used;
    uint32_t block_size;
    uint8_t* base;
    void*    freelist;              // ← Per-page freelist
    void*    slab_ref;
    void*    segment;
    struct small_page_v4* next;
    uint16_t flags;
} small_page_v4;

typedef struct small_class_heap_v4 {
    small_page_v4* current;         // Current working page
    small_page_v4* partial_head;    // Partial pages list
    uint32_t       partial_count;
    small_page_v4* full_head;       // Full pages list
} small_class_heap_v4;
```
### Solution: Lane = Page Index Cache
Rather than treating Lane as an independent freelist management unit,
redefine it as **an index/cache into the page that the TLS is currently working on**.
```
┌──────────────────────────────────────────────────────┐
│ MidHotBoxV3 (L0 TLS) │
│ ┌─────────────────────────────────────────────────┐ │
│ │ lane[class] = { page_idx, freelist_cache } │ │
│ │ ↓ │ │
│ │ page_idx → MidPageDesc (via RegionIdBox) │ │
│ └─────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────┘
↓ refill/retire
┌──────────────────────────────────────────────────────┐
│ MidColdIfaceV3 (L1) │
│ - page carve (segment → page) │
│ - page return (page → segment) │
│ - RegionIdBox registration │
└──────────────────────────────────────────────────────┘
```
### Lane Structure (Revised)
```c
typedef struct MidLaneV3 {
    uint32_t page_idx;        // Current working page index
    void*    freelist_head;   // TLS-local freelist snapshot (fast path)
    uint32_t freelist_count;  // Remaining items in freelist
    // Note: the actual freelist lives in MidPageDesc;
    // the lane acts as a TLS cache
} MidLaneV3;
```
### Page Structure
```c
typedef struct MidPageDescV3 {
    uint8_t* base;       // Page base address
    uint32_t capacity;   // Total slots
    uint32_t used;       // Used count
    void*    freelist;   // Actual freelist (authoritative)
    uint32_t region_id;  // RegionIdBox registration ID
    uint8_t  class_idx;  // Size class
    uint8_t  flags;      // Page state flags
} MidPageDescV3;
```
## RegionIdBox Integration
### Current State (V6-HDR-1)
`region_id_register_v6()` is currently a stub:
```c
// core/region_id_v6.c:202
uint32_t region_id_register_v6(void* base, size_t size, region_kind_t kind, void* metadata) {
    (void)base;
    (void)size;
    (void)kind;
    (void)metadata;
    return 1; // Single region for now
}
```
### V6-HDR-2: Completing the Registration API (MID-V3-2)
```c
// Required functionality:
// 1. Region entry array (fixed-size or dynamic)
// 2. ptr → region_entry lookup (radix tree or sorted array)
// 3. Thread-safe registration/unregistration
typedef struct RegionEntry {
    uintptr_t     base;
    uintptr_t     end;
    region_kind_t kind;
    void*         metadata;  // MidPageDescV3* for SMALL_V4
    uint32_t      id;
} RegionEntry;

// API
uint32_t region_id_register_v6(void* base, size_t size, region_kind_t kind, void* metadata);
void region_id_unregister_v6(uint32_t region_id);
RegionLookupV6 region_id_lookup_v6(void* ptr);
```
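To make the register/unregister/lookup contract concrete, here is a minimal single-threaded sketch built on a fixed-size entry array with a linear scan. It is not the planned implementation: every name ending in `_sketch` is hypothetical, the scan is O(n) rather than the radix-tree or sorted-array lookup listed above, and thread safety (requirement 3) is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names ending in _sketch are hypothetical stand-ins, not hakmem APIs. */
typedef enum {
    REGION_KIND_NONE_SKETCH = 0,
    REGION_KIND_MID_V3_SKETCH = 1
} region_kind_sketch_t;

typedef struct RegionEntrySketch {
    uintptr_t base;               /* inclusive start */
    uintptr_t end;                /* exclusive end */
    region_kind_sketch_t kind;
    void*     metadata;           /* e.g. a page descriptor */
    uint32_t  id;                 /* 0 means "slot unused" */
} RegionEntrySketch;

typedef struct RegionLookupSketch {
    region_kind_sketch_t kind;
    void*    page_meta;
    uint32_t region_id;
} RegionLookupSketch;

#define REGION_MAX_SKETCH 64
static RegionEntrySketch g_regions_sketch[REGION_MAX_SKETCH];

/* Register [base, base+size); returns a non-zero id, or 0 if the table is full. */
static uint32_t region_register_sketch(void* base, size_t size,
                                       region_kind_sketch_t kind, void* metadata) {
    assert(base != NULL && size > 0);
    for (uint32_t i = 0; i < REGION_MAX_SKETCH; i++) {
        RegionEntrySketch* e = &g_regions_sketch[i];
        if (e->id != 0) continue;   /* slot taken */
        e->base = (uintptr_t)base;
        e->end = e->base + size;
        e->kind = kind;
        e->metadata = metadata;
        e->id = i + 1;              /* slot index + 1, so 0 stays invalid */
        return e->id;
    }
    return 0;
}

/* Clear the slot that belongs to `id`; unknown ids are ignored. */
static void region_unregister_sketch(uint32_t id) {
    if (id == 0 || id > REGION_MAX_SKETCH) return;
    g_regions_sketch[id - 1] = (RegionEntrySketch){0};
}

/* Find the region containing ptr; kind is NONE when nothing matches. */
static RegionLookupSketch region_lookup_sketch(const void* ptr) {
    uintptr_t p = (uintptr_t)ptr;
    for (uint32_t i = 0; i < REGION_MAX_SKETCH; i++) {
        const RegionEntrySketch* e = &g_regions_sketch[i];
        if (e->id != 0 && p >= e->base && p < e->end) {
            return (RegionLookupSketch){ e->kind, e->metadata, e->id };
        }
    }
    return (RegionLookupSketch){ REGION_KIND_NONE_SKETCH, NULL, 0 };
}
```

Encoding the id as slot index + 1 keeps unregistration O(1) and reserves 0 as an "invalid" value, which matches the way the allocation slow path treats a zero index as "no page".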
### MID-V3-3: Page Registration at Carve
```c
// Register with RegionIdBox when a page is carved
MidPageDescV3* mid_cold_v3_carve_page(MidSegmentV3* seg, int class_idx) {
    MidPageDescV3* page = /* ... carve from segment ... */;
    // Register with RegionIdBox
    page->region_id = region_id_register_v6(
        page->base,
        page->capacity * stride_for_class(class_idx),
        REGION_KIND_SMALL_V4,  // or new REGION_KIND_MID_V3
        page
    );
    return page;
}

// Unregister when a page is returned
void mid_cold_v3_return_page(MidPageDescV3* page) {
    region_id_unregister_v6(page->region_id);
    /* ... return to segment ... */
}
```
## Allocation Fast Path (MID-V3-4)
```c
// Forward declaration: the fast path falls through to the slow path defined below.
static void* mid_hot_v3_alloc_slow(MidHotBoxV3* hot, int class_idx);

void* mid_hot_v3_alloc(MidHotBoxV3* hot, int class_idx) {
    MidLaneV3* lane = &hot->lanes[class_idx];
    // L0: TLS freelist cache hit
    if (likely(lane->freelist_head)) {
        void* blk = lane->freelist_head;
        lane->freelist_head = *(void**)blk;  // next pointer lives in the block's first word
        lane->freelist_count--;
        return blk;
    }
    // L0 miss: Refill from page or cold path
    return mid_hot_v3_alloc_slow(hot, class_idx);
}

static void* mid_hot_v3_alloc_slow(MidHotBoxV3* hot, int class_idx) {
    MidLaneV3* lane = &hot->lanes[class_idx];
    // Try to refill from current page (page_idx 0 means "no page assigned")
    if (lane->page_idx != 0) {
        MidPageDescV3* page = mid_page_from_idx(lane->page_idx);
        if (page && page->freelist) {
            // Batch transfer from page to lane
            mid_lane_refill_from_page(lane, page);
            return mid_hot_v3_alloc(hot, class_idx);
        }
    }
    // Cold path: Get new page
    MidPageDescV3* new_page = mid_cold_v3_refill_page(hot, class_idx);
    if (!new_page) return NULL;
    lane->page_idx = mid_page_to_idx(new_page);
    mid_lane_refill_from_page(lane, new_page);
    return mid_hot_v3_alloc(hot, class_idx);
}
```
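The body of `mid_lane_refill_from_page()` is not given in this document. The following is one possible shape for the batch transfer, written against reduced stand-in structs: `LaneSketch`, `PageSketch` and the explicit `batch` parameter are assumptions, as is counting lane-cached blocks as `used`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-ins for MidLaneV3 / MidPageDescV3; only the fields used here. */
typedef struct LaneSketch {
    void*    freelist_head;
    uint32_t freelist_count;
} LaneSketch;

typedef struct PageSketch {
    void*    freelist;
    uint32_t used;
} PageSketch;

/* Move up to `batch` blocks from the page freelist into an empty lane cache.
 * Each free block stores the pointer to the next free block in its first word.
 * Returns the number of blocks moved. */
static uint32_t lane_refill_from_page_sketch(LaneSketch* lane, PageSketch* page,
                                             uint32_t batch) {
    assert(lane->freelist_head == NULL && lane->freelist_count == 0);
    void* head = page->freelist;
    if (head == NULL || batch == 0) return 0;

    /* Walk to the last block of the batch. */
    void* tail = head;
    uint32_t n = 1;
    while (n < batch && *(void**)tail != NULL) {
        tail = *(void**)tail;
        n++;
    }

    /* Detach [head..tail] from the page and hand the chain to the lane. */
    page->freelist = *(void**)tail;
    *(void**)tail = NULL;
    page->used += n;  /* assumption: blocks cached in a lane count as used */
    lane->freelist_head = head;
    lane->freelist_count = n;
    return n;
}
```

Splicing a whole chain costs one walk of at most `batch` links and touches the page descriptor once, so the fast path pops that follow never need to read the page.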
## Free Path (MID-V3-5)
```c
void mid_hot_v3_free(void* ptr) {
    // RegionIdBox lookup (O(1) via TLS cache)
    RegionLookupV6 lk = region_id_lookup_cached_v6(ptr);
    // Assumes MID v3 pages are registered as REGION_KIND_MID_V3 at carve time
    if (lk.kind != REGION_KIND_MID_V3) {
        // Not a MID v3 allocation: the caller is expected to delegate it
        // to the owning allocator (delegation itself is not shown here)
        return;
    }
    MidPageDescV3* page = (MidPageDescV3*)lk.page_meta;
    // Check if local thread owns this page
    MidHotBoxV3* hot = mid_hot_box_v3_get();
    MidLaneV3* lane = &hot->lanes[page->class_idx];
    if (lane->page_idx == mid_page_to_idx(page)) {
        // Local page: direct push to lane freelist
        *(void**)ptr = lane->freelist_head;
        lane->freelist_head = ptr;
        lane->freelist_count++;
        return;
    }
    // Remote page: push to page freelist (atomic if needed)
    mid_page_push_free(page, ptr);
}
```
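For the remote case, "atomic if needed" could be realised with a lock-free LIFO push using C11 atomics. The sketch below is an assumption about how `mid_page_push_free()` might look, not its actual definition: `RemotePageSketch`, its `remote_free` field and both function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical page descriptor whose remote-free head any thread may update. */
typedef struct RemotePageSketch {
    _Atomic(void*) remote_free;  /* head of a lock-free LIFO of freed blocks */
} RemotePageSketch;

/* Push one block; safe to call concurrently from several threads. */
static void page_push_free_sketch(RemotePageSketch* page, void* blk) {
    void* old = atomic_load_explicit(&page->remote_free, memory_order_relaxed);
    do {
        *(void**)blk = old;  /* link the block in front of the current head */
    } while (!atomic_compare_exchange_weak_explicit(
                 &page->remote_free, &old, blk,
                 memory_order_release, memory_order_relaxed));
}

/* Owner side: detach the whole list in one exchange instead of popping items. */
static void* page_drain_remote_sketch(RemotePageSketch* page) {
    return atomic_exchange_explicit(&page->remote_free, (void*)NULL,
                                    memory_order_acquire);
}
```

Draining with a single exchange avoids the ABA hazard that per-item CAS pops would introduce, at the cost of keeping remote frees on a list separate from the owner's freelist until the next drain.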
## ENV Controls
```
HAKMEM_MID_V3_ENABLED=1 # Enable MID v3 (default: 0)
HAKMEM_MID_V3_CLASSES=0x40 # Class bitmask (default: 0, recommended: 0x40 = C6 only)
# NOTE: C7 (0x80) NOT recommended - use C7 ULTRA instead
HAKMEM_MID_V3_DEBUG=1 # Debug logging (default: 0)
HAKMEM_MID_V3_LANE_BATCH=16 # Lane refill batch size (default: 16)
```
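A sketch of how these variables might be read at startup follows. The variable names and defaults are taken from the list above; `MidV3Config` and the helper functions are hypothetical, and treating bit N of the mask as class N is inferred from "0x40 = C6 only".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical config holder; field names are illustrative only. */
typedef struct MidV3Config {
    int      enabled;     /* HAKMEM_MID_V3_ENABLED    (default 0)  */
    uint32_t class_mask;  /* HAKMEM_MID_V3_CLASSES    (default 0)  */
    int      debug;       /* HAKMEM_MID_V3_DEBUG      (default 0)  */
    uint32_t lane_batch;  /* HAKMEM_MID_V3_LANE_BATCH (default 16) */
} MidV3Config;

/* Read an unsigned value; base 0 lets strtoul accept both "0x40" and "64". */
static uint32_t mid_v3_env_u32(const char* name, uint32_t fallback) {
    const char* s = getenv(name);
    if (s == NULL || *s == '\0') return fallback;
    char* end = NULL;
    unsigned long v = strtoul(s, &end, 0);
    if (end == s) return fallback;  /* no digits parsed */
    return (uint32_t)v;
}

static MidV3Config mid_v3_config_load(void) {
    MidV3Config c;
    c.enabled    = mid_v3_env_u32("HAKMEM_MID_V3_ENABLED", 0) != 0;
    c.class_mask = mid_v3_env_u32("HAKMEM_MID_V3_CLASSES", 0);
    c.debug      = mid_v3_env_u32("HAKMEM_MID_V3_DEBUG", 0) != 0;
    c.lane_batch = mid_v3_env_u32("HAKMEM_MID_V3_LANE_BATCH", 16);
    if (c.lane_batch == 0) c.lane_batch = 16;  /* a zero batch would never refill */
    return c;
}

/* A class is served by MID v3 only when the feature is on and its bit is set. */
static int mid_v3_class_enabled(const MidV3Config* c, int class_idx) {
    assert(class_idx >= 0 && class_idx < 32);
    return c->enabled && ((c->class_mask >> class_idx) & 1u);
}
```

With `HAKMEM_MID_V3_ENABLED=1` and `HAKMEM_MID_V3_CLASSES=0x40`, `mid_v3_class_enabled()` returns 1 for class 6 and 0 for class 7, which matches the recommendation to leave C7 to C7 ULTRA.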
## Performance Results
| Workload | Baseline (ops/s) | MID v3 ON (ops/s) | Improvement |
|----------|------------------|-------------------|-------------|
| C6 (257-768B) | 1,043,379 | 1,159,390 | **+11.1%** |
| Mixed (257-768B) | 976,057 | 1,169,648 | **+19.8%** |
**Note**: C7 (769-1024B) is intentionally excluded from MID v3 and handled by C7 ULTRA, which shows better performance for 1KB allocations.
## Checklist
- [ ] MID-V3-1: Type skeleton + ENV
  - [ ] MidHotBoxV3 structure
  - [ ] MidLaneV3 structure
  - [ ] MidPageDescV3 structure
  - [ ] MidColdIfaceV3 interface
  - [ ] ENV parsing
- [ ] MID-V3-2: RegionIdBox Registration API (V6-HDR-2)
  - [ ] RegionEntry structure
  - [ ] region_id_register_v6() implementation
  - [ ] region_id_unregister_v6() implementation
  - [ ] Lookup integration (ptr → page_meta)
- [ ] MID-V3-3: RegionId integration
  - [ ] Page carve time registration
  - [ ] Page return time unregistration
  - [ ] TLS segment auto-registration
- [ ] MID-V3-4: Allocation fast path
  - [ ] Lane freelist fast path
  - [ ] Page refill slow path
  - [ ] Cold refill integration
- [ ] MID-V3-5: Free/cold path
  - [ ] RegionIdBox lookup in free
  - [ ] Local page fast free
  - [ ] Remote page handling
## Reference: Relationship to Existing Code
| Existing | v3 counterpart |
|----------|----------------|
| smallobject_hotbox_v4.c | mid_hotbox_v3.c (new) |
| small_page_v4 | MidPageDescV3 |
| small_class_heap_v4 | MidLaneV3 |
| cold_refill_page_v4() | mid_cold_v3_refill_page() |
| cold_retire_page_v4() | mid_cold_v3_retire_page() |
| region_id_v6.c | RegionIdBox API extension |