Role separation based on ultrathink analysis:
- MID v3: dedicated to 257-768B (C6 only, HAKMEM_MID_V3_CLASSES=0x40)
- C7 ULTRA: dedicated to 769-1024B (existing optimized path)

Changes:
- core/box/hak_alloc_api.inc.h: Remove C7 route, restrict to 257-768B
- core/box/mid_hotbox_v3_env_box.h: Update ENV comments
- docs/analysis/MID_POOL_V3_DESIGN.md: Add performance results & role
- CURRENT_TASK.md: Document MID-V3 completion & role separation

Verified:
- 257-768B with v3 ON: 1,199,526 ops/s (+1.7% vs baseline)
- 769-1024B with v3 ON: 1,181,254 ops/s (same as baseline, C7 excluded)
- C7 correctly routes to ULTRA instead of MID v3

Rationale: C7-only showed -11% regression, but C6/mixed showed +11-19% improvement. Specializing to mid-range (257-768B) leverages v3 strengths while keeping C7 on the proven ULTRA path.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
MID_POOL_V3 Design Document
Overview
Mid/Pool v3 is a next-generation architecture that extends the existing SmallObject v4 (MF2) and integrates an O(1) ptr→page_meta lookup via RegionIdBox.
Role split: MID v3 handles 257-768B only; C7 ULTRA handles 769-1024B.
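The role split can be pictured as a size-based dispatch. The sketch below is illustrative only: the `route_t` enum, the function name `route_for_size`, and the handling of sizes outside 257-1024B are assumptions, not taken from core/box/hak_alloc_api.inc.h. It also assumes C7 ULTRA is enabled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route tags; the real allocator's names may differ. */
typedef enum { ROUTE_OTHER, ROUTE_MID_V3, ROUTE_C7_ULTRA } route_t;

/* Pick a route from the request size and the HAKMEM_MID_V3_CLASSES bitmask.
 * Bit 6 (0x40) enables C6, i.e. 257-768B, on MID v3. */
static route_t route_for_size(size_t size, uint32_t mid_v3_class_mask) {
    if (size >= 257 && size <= 768) {
        return (mid_v3_class_mask & (1u << 6)) ? ROUTE_MID_V3 : ROUTE_OTHER;
    }
    if (size >= 769 && size <= 1024) {
        return ROUTE_C7_ULTRA;  /* C7 never goes to MID v3, whatever the mask says */
    }
    return ROUTE_OTHER;
}
```

With the recommended mask 0x40, a 512B request maps to `ROUTE_MID_V3` and a 1000B request maps to `ROUTE_C7_ULTRA`. Setting bit 7 (0x80) changes nothing in this sketch, mirroring the removal of the C7 route.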
Phase Plan
| Phase | Content | Depends on |
|---|---|---|
| MID-V3-0 | Design doc (this document) | - |
| MID-V3-1 | Type skeletons + ENV | MID-V3-0 |
| MID-V3-2 | Complete the RegionIdBox registration API (V6-HDR-2) | MID-V3-1 |
| MID-V3-3 | RegionId integration (page registration at carve) | MID-V3-2 |
| MID-V3-4 | Allocation fast path implementation | MID-V3-3 |
| MID-V3-5 | Free/cold path implementation | MID-V3-4 |
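Design issue: duplicated Lane vs. Page management
Problem (raised in Task Review)
The initial design assumed that both Lane and Page would manage a freelist. In the existing v4 MF2, however, the per-page freelist already works, so adding Lane would duplicate the management responsibility.
Existing v4 MF2 structure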
// core/smallobject_hotbox_v4.c
typedef struct small_page_v4 {
    uint8_t  class_idx;
    uint16_t capacity;
    uint16_t used;
    uint32_t block_size;
    uint8_t* base;
    void*    freelist;   // ← Per-page freelist
    void*    slab_ref;
    void*    segment;
    struct small_page_v4* next;
    uint16_t flags;
} small_page_v4;

typedef struct small_class_heap_v4 {
    small_page_v4* current;       // Current working page
    small_page_v4* partial_head;  // Partial pages list
    uint32_t       partial_count;
    small_page_v4* full_head;     // Full pages list
} small_class_heap_v4;
Solution: Lane = Page Index Cache
Instead of treating Lane as an independent freelist management unit, redefine it as an index/cache that points at the page the TLS is currently working on.
┌──────────────────────────────────────────────────────┐
│                 MidHotBoxV3 (L0 TLS)                 │
│ ┌──────────────────────────────────────────────────┐ │
│ │ lane[class] = { page_idx, freelist_cache }       │ │
│ │        ↓                                         │ │
│ │ page_idx → MidPageDesc (via RegionIdBox)         │ │
│ └──────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────┘
                          ↓ refill/retire
┌──────────────────────────────────────────────────────┐
│                 MidColdIfaceV3 (L1)                  │
│ - page carve (segment → page)                        │
│ - page return (page → segment)                       │
│ - RegionIdBox registration                           │
└──────────────────────────────────────────────────────┘
Lane structure (revised)
typedef struct MidLaneV3 {
    uint32_t page_idx;        // Current working page index
    void*    freelist_head;   // TLS-local freelist snapshot (fast path)
    uint32_t freelist_count;  // Remaining items in freelist
    // Note: the authoritative freelist lives in MidPageDesc;
    // the lane only acts as a TLS cache.
} MidLaneV3;
Page structure
typedef struct MidPageDescV3 {
    uint8_t* base;       // Page base address
    uint32_t capacity;   // Total slots
    uint32_t used;       // Used count
    void*    freelist;   // Actual freelist (authoritative)
    uint32_t region_id;  // RegionIdBox registration ID
    uint8_t  class_idx;  // Size class
    uint8_t  flags;      // Page state flags
} MidPageDescV3;
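The lane stores a `page_idx` rather than a pointer, and the slow path further down treats `page_idx == 0` as "no page". One way the index mapping could look is sketched here. The fixed table `g_mid_pages` and its capacity are assumptions for illustration; the document does not say where descriptors are stored.

```c
#include <stddef.h>
#include <stdint.h>

#define MID_PAGE_TABLE_CAP 1024u  /* hypothetical fixed capacity */

typedef struct MidPageDescV3 {
    uint8_t* base;
    uint32_t capacity;
    uint32_t used;
    void*    freelist;
    uint32_t region_id;
    uint8_t  class_idx;
    uint8_t  flags;
} MidPageDescV3;

/* Hypothetical descriptor storage; slot i corresponds to page_idx i + 1. */
static MidPageDescV3 g_mid_pages[MID_PAGE_TABLE_CAP];

/* page_idx 0 means "lane has no page"; valid indices are 1..CAP. */
static MidPageDescV3* mid_page_from_idx(uint32_t page_idx) {
    if (page_idx == 0 || page_idx > MID_PAGE_TABLE_CAP) return NULL;
    return &g_mid_pages[page_idx - 1];
}

/* Precondition: page is NULL or points into g_mid_pages. */
static uint32_t mid_page_to_idx(const MidPageDescV3* page) {
    if (page == NULL) return 0;
    return (uint32_t)(page - g_mid_pages) + 1;
}
```

Reserving 0 as the empty value lets a zero-initialized TLS lane start out with no page and no extra setup.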
RegionIdBox integration
Current state (V6-HDR-1)
region_id_register_v6() is currently a stub:
// core/region_id_v6.c:202
uint32_t region_id_register_v6(void* base, size_t size, region_kind_t kind, void* metadata) {
    (void)base;
    (void)size;
    (void)kind;
    (void)metadata;
    return 1;  // Single region for now
}
V6-HDR-2: Completing the registration API (MID-V3-2)
// Required functionality:
// 1. Region entry array (fixed-size or dynamic)
// 2. ptr → region_entry lookup (radix tree or sorted array)
// 3. Thread-safe registration/unregistration
typedef struct RegionEntry {
    uintptr_t     base;
    uintptr_t     end;
    region_kind_t kind;
    void*         metadata;  // MidPageDescV3* for SMALL_V4
    uint32_t      id;
} RegionEntry;

// API
uint32_t region_id_register_v6(void* base, size_t size, region_kind_t kind, void* metadata);
void region_id_unregister_v6(uint32_t region_id);
RegionLookupV6 region_id_lookup_v6(void* ptr);
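To make the three API calls concrete, here is a minimal sketch backed by a fixed-size table. It deliberately uses a linear scan rather than the radix tree or sorted array listed above, and it is not thread-safe. The enum values, the `RegionLookupV6` layout (only `kind` and `page_meta` are used elsewhere in this document), `REGION_KIND_UNKNOWN`, and the table capacity are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for types the real headers would provide. */
typedef enum { REGION_KIND_UNKNOWN = 0, REGION_KIND_SMALL_V4, REGION_KIND_MID_V3 } region_kind_t;

typedef struct RegionEntry {
    uintptr_t     base;
    uintptr_t     end;       /* exclusive */
    region_kind_t kind;
    void*         metadata;
    uint32_t      id;        /* 0 marks a free slot */
} RegionEntry;

typedef struct RegionLookupV6 {
    region_kind_t kind;
    uint32_t      region_id;
    void*         page_meta;
} RegionLookupV6;

#define REGION_TABLE_CAP 256u  /* hypothetical capacity */
static RegionEntry g_regions[REGION_TABLE_CAP];

/* Returns a nonzero id, or 0 if the table is full or the range is empty. */
uint32_t region_id_register_v6(void* base, size_t size, region_kind_t kind, void* metadata) {
    if (base == NULL || size == 0) return 0;
    for (uint32_t i = 0; i < REGION_TABLE_CAP; i++) {
        if (g_regions[i].id == 0) {
            g_regions[i] = (RegionEntry){ (uintptr_t)base, (uintptr_t)base + size,
                                          kind, metadata, i + 1 };
            return i + 1;
        }
    }
    return 0;
}

void region_id_unregister_v6(uint32_t region_id) {
    if (region_id == 0 || region_id > REGION_TABLE_CAP) return;
    g_regions[region_id - 1] = (RegionEntry){ 0 };
}

/* Linear scan; kind == REGION_KIND_UNKNOWN means "not found". */
RegionLookupV6 region_id_lookup_v6(void* ptr) {
    uintptr_t p = (uintptr_t)ptr;
    for (uint32_t i = 0; i < REGION_TABLE_CAP; i++) {
        const RegionEntry* e = &g_regions[i];
        if (e->id != 0 && p >= e->base && p < e->end) {
            return (RegionLookupV6){ e->kind, e->id, e->metadata };
        }
    }
    return (RegionLookupV6){ REGION_KIND_UNKNOWN, 0, NULL };
}
```

A production version would need the O(1) lookup and the thread safety called for above; the sketch only fixes the contract: register returns an id, lookup maps any interior pointer back to the metadata, and unregister frees the slot.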
MID-V3-3: Page Registration at Carve
// Register the page with RegionIdBox when it is carved
MidPageDescV3* mid_cold_v3_carve_page(MidSegmentV3* seg, int class_idx) {
    MidPageDescV3* page = /* ... carve from segment ... */;

    // Register with RegionIdBox
    page->region_id = region_id_register_v6(
        page->base,
        page->capacity * stride_for_class(class_idx),
        REGION_KIND_SMALL_V4,  // or new REGION_KIND_MID_V3
        page
    );
    return page;
}

// Unregister the page when it is returned
void mid_cold_v3_return_page(MidPageDescV3* page) {
    region_id_unregister_v6(page->region_id);
    /* ... return to segment ... */
}
Allocation Fast Path (MID-V3-4)
// Forward declaration: the slow path is defined after the fast path.
static void* mid_hot_v3_alloc_slow(MidHotBoxV3* hot, int class_idx);

void* mid_hot_v3_alloc(MidHotBoxV3* hot, int class_idx) {
    MidLaneV3* lane = &hot->lanes[class_idx];

    // L0: TLS freelist cache hit
    if (likely(lane->freelist_head)) {
        void* blk = lane->freelist_head;
        lane->freelist_head = *(void**)blk;  // next pointer is stored in the block itself
        lane->freelist_count--;
        return blk;
    }

    // L0 miss: Refill from page or cold path
    return mid_hot_v3_alloc_slow(hot, class_idx);
}

static void* mid_hot_v3_alloc_slow(MidHotBoxV3* hot, int class_idx) {
    MidLaneV3* lane = &hot->lanes[class_idx];

    // Try to refill from current page (page_idx 0 means "no page")
    if (lane->page_idx != 0) {
        MidPageDescV3* page = mid_page_from_idx(lane->page_idx);
        if (page && page->freelist) {
            // Batch transfer from page to lane
            mid_lane_refill_from_page(lane, page);
            return mid_hot_v3_alloc(hot, class_idx);
        }
    }

    // Cold path: Get new page
    MidPageDescV3* new_page = mid_cold_v3_refill_page(hot, class_idx);
    if (!new_page) return NULL;
    lane->page_idx = mid_page_to_idx(new_page);
    mid_lane_refill_from_page(lane, new_page);
    return mid_hot_v3_alloc(hot, class_idx);
}
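mid_lane_refill_from_page() is called above but not defined in this document. A minimal single-threaded sketch of the batch transfer follows, assuming an intrusive singly linked freelist (next pointer in the first word of each block, as in the fast path). The name `mid_lane_refill_batch` and the explicit `batch` parameter are hypothetical. `batch` stands in for HAKMEM_MID_V3_LANE_BATCH.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct MidLaneV3 {
    uint32_t page_idx;
    void*    freelist_head;
    uint32_t freelist_count;
} MidLaneV3;

typedef struct MidPageDescV3 {
    uint8_t* base;
    uint32_t capacity;
    uint32_t used;
    void*    freelist;
    uint32_t region_id;
    uint8_t  class_idx;
    uint8_t  flags;
} MidPageDescV3;

/* Move up to `batch` blocks from the page freelist to an empty lane.
 * Returns the number of blocks moved. No atomics: owner thread only. */
static uint32_t mid_lane_refill_batch(MidLaneV3* lane, MidPageDescV3* page, uint32_t batch) {
    if (lane->freelist_head != NULL || page->freelist == NULL || batch == 0) return 0;

    void* head = page->freelist;
    void* tail = head;
    uint32_t moved = 1;
    /* Walk forward until the batch is full or the page list ends. */
    while (moved < batch && *(void**)tail != NULL) {
        tail = *(void**)tail;
        moved++;
    }
    page->freelist = *(void**)tail;  /* remainder stays on the page */
    *(void**)tail = NULL;            /* terminate the lane's chain */

    lane->freelist_head = head;
    lane->freelist_count = moved;
    return moved;
}
```

Because the chain is detached in one step, the page freelist stays authoritative for everything the lane has not taken. Whether `used` is updated at refill time or at allocation time is left open here.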
Free Path (MID-V3-5)
void mid_hot_v3_free(void* ptr) {
    // RegionIdBox lookup (O(1) via TLS cache)
    RegionLookupV6 lk = region_id_lookup_cached_v6(ptr);
    // Note: this check only matches if pages are registered as
    // REGION_KIND_MID_V3 at carve time.
    if (lk.kind != REGION_KIND_MID_V3) {
        // Not our allocation, delegate (delegation elided here)
        return;
    }

    MidPageDescV3* page = (MidPageDescV3*)lk.page_meta;

    // Check if local thread owns this page
    MidHotBoxV3* hot = mid_hot_box_v3_get();
    MidLaneV3* lane = &hot->lanes[page->class_idx];
    if (lane->page_idx == mid_page_to_idx(page)) {
        // Local page: direct push to lane freelist
        *(void**)ptr = lane->freelist_head;
        lane->freelist_head = ptr;
        lane->freelist_count++;
        return;
    }

    // Remote page: push to page freelist (atomic if needed)
    mid_page_push_free(page, ptr);
}
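The "atomic if needed" case in the remote branch could be handled with a lock-free push using C11 atomics. This is a sketch under assumptions: `MidRemoteFreeList`, `mid_remote_push`, and `mid_remote_drain` are hypothetical names, and the design's MidPageDescV3 has no atomic field. The freelist there is a plain `void*`.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-page list that collects frees from non-owner threads. */
typedef struct MidRemoteFreeList {
    _Atomic(void*) head;
} MidRemoteFreeList;

/* Lock-free push of one block; may be called from any thread. */
static void mid_remote_push(MidRemoteFreeList* list, void* blk) {
    void* old = atomic_load_explicit(&list->head, memory_order_relaxed);
    do {
        *(void**)blk = old;  /* link the block in front of the current head */
    } while (!atomic_compare_exchange_weak_explicit(
                 &list->head, &old, blk,
                 memory_order_release, memory_order_relaxed));
}

/* Owner thread takes the whole chain at once, which sidesteps ABA on pop. */
static void* mid_remote_drain(MidRemoteFreeList* list) {
    return atomic_exchange_explicit(&list->head, NULL, memory_order_acquire);
}
```

Keeping remote frees on a separate list means the owner's fast path never needs an atomic operation. The owner would drain the remote list only when its lane and page freelist are both empty.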
ENV Controls
HAKMEM_MID_V3_ENABLED=1 # Enable MID v3 (default: 0)
HAKMEM_MID_V3_CLASSES=0x40 # Class bitmask (default: 0, recommended: 0x40 = C6 only)
# NOTE: C7 (0x80) NOT recommended - use C7 ULTRA instead
HAKMEM_MID_V3_DEBUG=1 # Debug logging (default: 0)
HAKMEM_MID_V3_LANE_BATCH=16 # Lane refill batch size (default: 16)
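A sketch of how these variables might be read at startup. `parse_u32`, `MidV3Env`, and `mid_v3_env_load` are hypothetical names, not the contents of core/box/mid_hotbox_v3_env_box.h. Treating a zero batch size as unset is also an assumption.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse an unsigned value such as "0x40" or "64".
 * Returns `fallback` when the string is missing or malformed. */
static uint32_t parse_u32(const char* s, uint32_t fallback) {
    if (s == NULL || *s == '\0') return fallback;
    char* end = NULL;
    unsigned long v = strtoul(s, &end, 0);  /* base 0 accepts hex and decimal */
    if (end == s || *end != '\0' || v > UINT32_MAX) return fallback;
    return (uint32_t)v;
}

typedef struct MidV3Env {  /* hypothetical container */
    int      enabled;
    uint32_t class_mask;
    uint32_t lane_batch;
} MidV3Env;

static MidV3Env mid_v3_env_load(void) {
    MidV3Env e;
    e.enabled    = parse_u32(getenv("HAKMEM_MID_V3_ENABLED"), 0) != 0;
    e.class_mask = parse_u32(getenv("HAKMEM_MID_V3_CLASSES"), 0);
    e.lane_batch = parse_u32(getenv("HAKMEM_MID_V3_LANE_BATCH"), 16);
    if (e.lane_batch == 0) e.lane_batch = 16;  /* assumption: zero means unset */
    return e;
}
```

Falling back to the documented defaults on malformed input keeps a typo in the environment from silently enabling extra classes.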
Performance Results
| Workload | Baseline (ops/s) | MID v3 ON (ops/s) | Improvement |
|---|---|---|---|
| C6 (257-768B) | 1,043,379 | 1,159,390 | +11.1% |
| Mixed (257-768B) | 976,057 | 1,169,648 | +19.8% |
Note: C7 (769-1024B) is intentionally excluded from MID v3 and handled by C7 ULTRA, which shows better performance for 1KB allocations.
Checklist
- MID-V3-1: Type skeletons + ENV
  - MidHotBoxV3 structure
  - MidLaneV3 structure
  - MidPageDescV3 structure
  - MidColdIfaceV3 interface
  - ENV parsing
- MID-V3-2: RegionIdBox Registration API (V6-HDR-2)
  - RegionEntry structure
  - region_id_register_v6() implementation
  - region_id_unregister_v6() implementation
  - Lookup integration (ptr → page_meta)
- MID-V3-3: RegionId integration
  - Page carve time registration
  - Page return time unregistration
  - TLS segment auto-registration
- MID-V3-4: Allocation fast path
  - Lane freelist fast path
  - Page refill slow path
  - Cold refill integration
- MID-V3-5: Free/cold path
  - RegionIdBox lookup in free
  - Local page fast free
  - Remote page handling
Reference: relationship to existing code
| Existing | v3 counterpart |
|---|---|
| smallobject_hotbox_v4.c | mid_hotbox_v3.c (new) |
| small_page_v4 | MidPageDescV3 |
| small_class_heap_v4 | MidLaneV3 |
| cold_refill_page_v4() | mid_cold_v3_refill_page() |
| cold_retire_page_v4() | mid_cold_v3_retire_page() |
| region_id_v6.c | RegionIdBox API extension |