# MID_POOL_V3 Design Document

## Overview

Mid/Pool v3 is a next-generation architecture that builds on the existing SmallObject v4 (MF2) and integrates O(1) ptr→page_meta lookup via RegionIdBox.

**Division of responsibility**: MID v3 handles 257-768B only; C7 ULTRA handles 769-1024B.

## Phase Plan

| Phase | Content | Depends on |
|-------|---------|------------|
| MID-V3-0 | Design doc (this document) | - |
| MID-V3-1 | Type skeletons + ENV | MID-V3-0 |
| MID-V3-2 | Complete RegionIdBox Registration API (V6-HDR-2) | MID-V3-1 |
| MID-V3-3 | RegionId integration (page registration at carve) | MID-V3-2 |
| MID-V3-4 | Allocation fast path implementation | MID-V3-3 |
| MID-V3-5 | Free/cold path implementation | MID-V3-4 |

## Design Issue: Lane vs Page Dual Management

### Problem (raised in Task Review)

The initial design assumed that both Lane and Page would manage a freelist.
However, the existing v4 MF2 already has a working per-page freelist,
so adding Lanes would duplicate freelist ownership.

### Existing v4 MF2 Structure

```c
// core/smallobject_hotbox_v4.c
typedef struct small_page_v4 {
    uint8_t  class_idx;
    uint16_t capacity;
    uint16_t used;
    uint32_t block_size;
    uint8_t* base;
    void*    freelist;      // ← Per-page freelist
    void*    slab_ref;
    void*    segment;
    struct small_page_v4* next;
    uint16_t flags;
} small_page_v4;

typedef struct small_class_heap_v4 {
    small_page_v4* current;       // Current working page
    small_page_v4* partial_head;  // Partial pages list
    uint32_t       partial_count;
    small_page_v4* full_head;     // Full pages list
} small_class_heap_v4;
```

### Solution: Lane = Page Index Cache

Rather than treating a Lane as an independent freelist owner,
redefine it as **an index/cache pointing at the page the TLS is currently working on**.

```
┌──────────────────────────────────────────────────────┐
│  MidHotBoxV3 (L0 TLS)                                │
│  ┌─────────────────────────────────────────────────┐ │
│  │ lane[class] = { page_idx, freelist_cache }      │ │
│  │      ↓                                          │ │
│  │ page_idx → MidPageDesc (via RegionIdBox)        │ │
│  └─────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────┘
                          ↓ refill/retire
┌──────────────────────────────────────────────────────┐
│  MidColdIfaceV3 (L1)                                 │
│  - page carve (segment → page)                       │
│  - page return (page → segment)                      │
│  - RegionIdBox registration                          │
└──────────────────────────────────────────────────────┘
```

### Lane Structure (Revised)

```c
typedef struct MidLaneV3 {
    uint32_t page_idx;        // Current working page index (0 = no page assigned)
    void*    freelist_head;   // TLS-local freelist snapshot (fast path)
    uint32_t freelist_count;  // Remaining items in freelist
    // Note: the authoritative freelist lives in MidPageDesc;
    //       the lane acts only as a TLS cache.
} MidLaneV3;
```

### Page Structure

```c
typedef struct MidPageDescV3 {
    uint8_t* base;        // Page base address
    uint32_t capacity;    // Total slots
    uint32_t used;        // Used count
    void*    freelist;    // Actual freelist (authoritative)
    uint32_t region_id;   // RegionIdBox registration ID
    uint8_t  class_idx;   // Size class
    uint8_t  flags;       // Page state flags
} MidPageDescV3;
```

## RegionIdBox Integration

### Current State (V6-HDR-1)

`region_id_register_v6()` is currently a stub:

```c
// core/region_id_v6.c:202
uint32_t region_id_register_v6(void* base, size_t size, region_kind_t kind, void* metadata) {
    (void)base; (void)size; (void)kind; (void)metadata;
    return 1;  // Single region for now
}
```

### V6-HDR-2: Completing the Registration API (MID-V3-2)

```c
// Required features:
// 1. Region entry array (fixed-size or dynamic)
// 2. ptr → region_entry lookup (radix tree or sorted array)
// 3. Thread-safe registration/unregistration

typedef struct RegionEntry {
    uintptr_t     base;
    uintptr_t     end;
    region_kind_t kind;
    void*         metadata;  // MidPageDescV3* for SMALL_V4
    uint32_t      id;
} RegionEntry;

// API
uint32_t region_id_register_v6(void* base, size_t size, region_kind_t kind, void* metadata);
void region_id_unregister_v6(uint32_t region_id);
RegionLookupV6 region_id_lookup_v6(void* ptr);
```

### MID-V3-3: Page Registration at Carve

```c
// Register with RegionIdBox when a page is carved
MidPageDescV3* mid_cold_v3_carve_page(MidSegmentV3* seg, int class_idx) {
    MidPageDescV3* page = /* ... carve from segment ... */;

    // Register with RegionIdBox.
    // NOTE: mid_hot_v3_free() checks for REGION_KIND_MID_V3, so the kind
    //       registered here must match whatever the free path tests.
    page->region_id = region_id_register_v6(
        page->base,
        page->capacity * stride_for_class(class_idx),
        REGION_KIND_SMALL_V4,  // or new REGION_KIND_MID_V3
        page
    );
    return page;
}

// Unregister when the page is returned
void mid_cold_v3_return_page(MidPageDescV3* page) {
    region_id_unregister_v6(page->region_id);
    /* ... return to segment ... */
}
```

## Allocation Fast Path (MID-V3-4)

```c
// Forward declaration: the slow path is defined after the fast path.
static void* mid_hot_v3_alloc_slow(MidHotBoxV3* hot, int class_idx);

void* mid_hot_v3_alloc(MidHotBoxV3* hot, int class_idx) {
    MidLaneV3* lane = &hot->lanes[class_idx];

    // L0: TLS freelist cache hit
    if (likely(lane->freelist_head)) {
        void* blk = lane->freelist_head;
        lane->freelist_head = *(void**)blk;
        lane->freelist_count--;
        return blk;
    }

    // L0 miss: Refill from page or cold path
    return mid_hot_v3_alloc_slow(hot, class_idx);
}

static void* mid_hot_v3_alloc_slow(MidHotBoxV3* hot, int class_idx) {
    MidLaneV3* lane = &hot->lanes[class_idx];

    // Try to refill from current page
    if (lane->page_idx != 0) {
        MidPageDescV3* page = mid_page_from_idx(lane->page_idx);
        if (page && page->freelist) {
            // Batch transfer from page to lane
            mid_lane_refill_from_page(lane, page);
            return mid_hot_v3_alloc(hot, class_idx);
        }
    }

    // Cold path: Get new page
    MidPageDescV3* new_page = mid_cold_v3_refill_page(hot, class_idx);
    if (!new_page) return NULL;

    lane->page_idx = mid_page_to_idx(new_page);
    mid_lane_refill_from_page(lane, new_page);
    return mid_hot_v3_alloc(hot, class_idx);
}
```

## Free Path (MID-V3-5)

```c
void mid_hot_v3_free(void* ptr) {
    // RegionIdBox lookup (O(1) via TLS cache)
    RegionLookupV6 lk = region_id_lookup_cached_v6(ptr);
    if (lk.kind != REGION_KIND_MID_V3) {
        // Not our allocation, delegate
        return;
    }

    MidPageDescV3* page = (MidPageDescV3*)lk.page_meta;

    // Check if local thread owns this page
    MidHotBoxV3* hot = mid_hot_box_v3_get();
    MidLaneV3* lane = &hot->lanes[page->class_idx];

    if (lane->page_idx == mid_page_to_idx(page)) {
        // Local page: direct push to lane freelist
        *(void**)ptr = lane->freelist_head;
        lane->freelist_head = ptr;
        lane->freelist_count++;
        return;
    }

    // Remote page: push to page freelist (atomic if needed)
    mid_page_push_free(page, ptr);
}
```

## ENV Controls

```
HAKMEM_MID_V3_ENABLED=1       # Enable MID v3 (default: 0)
HAKMEM_MID_V3_CLASSES=0x40    # Class bitmask (default: 0, recommended: 0x40 = C6 only)
                              # NOTE: C7 (0x80) NOT recommended - use C7 ULTRA instead
HAKMEM_MID_V3_DEBUG=1         # Debug logging (default: 0)
HAKMEM_MID_V3_LANE_BATCH=16   # Lane refill batch size (default: 16)
```

## Performance Results

| Workload | Baseline (ops/s) | MID v3 ON (ops/s) | Improvement |
|----------|------------------|-------------------|-------------|
| C6 (257-768B) | 1,043,379 | 1,159,390 | **+11.1%** |
| Mixed (257-768B) | 976,057 | 1,169,648 | **+19.8%** |

**Note**: C7 (769-1024B) is intentionally excluded from MID v3 and handled by C7 ULTRA, which shows better performance for 1KB allocations.

## Checklist

- [ ] MID-V3-1: Type skeletons + ENV
  - [ ] MidHotBoxV3 structure
  - [ ] MidLaneV3 structure
  - [ ] MidPageDescV3 structure
  - [ ] MidColdIfaceV3 interface
  - [ ] ENV parsing
- [ ] MID-V3-2: RegionIdBox Registration API (V6-HDR-2)
  - [ ] RegionEntry structure
  - [ ] region_id_register_v6() implementation
  - [ ] region_id_unregister_v6() implementation
  - [ ] Lookup integration (ptr → page_meta)
- [ ] MID-V3-3: RegionId integration
  - [ ] Page carve time registration
  - [ ] Page return time unregistration
  - [ ] TLS segment auto-registration
- [ ] MID-V3-4: Allocation fast path
  - [ ] Lane freelist fast path
  - [ ] Page refill slow path
  - [ ] Cold refill integration
- [ ] MID-V3-5: Free/cold path
  - [ ] RegionIdBox lookup in free
  - [ ] Local page fast free
  - [ ] Remote page handling

## Reference: Relationship to Existing Code

| Existing | v3 counterpart |
|----------|----------------|
| smallobject_hotbox_v4.c | mid_hotbox_v3.c (new) |
| small_page_v4 | MidPageDescV3 |
| small_class_heap_v4 | MidLaneV3 |
| cold_refill_page_v4() | mid_cold_v3_refill_page() |
| cold_retire_page_v4() | mid_cold_v3_retire_page() |
| region_id_v6.c | RegionIdBox API extension |
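
## Supplement: Lane Refill Sketch

The allocation slow path calls `mid_lane_refill_from_page()` but this document does not define it. The following is a minimal sketch of how a batch transfer from the authoritative page freelist into the lane's TLS cache could look. It is an illustration under assumptions, not the actual implementation: the structs are trimmed to the fields the refill touches, the explicit `batch` parameter is hypothetical (the design passes only `lane` and `page`, with the batch size presumably coming from `HAKMEM_MID_V3_LANE_BATCH`), and the sketch is single-threaded, so it ignores any synchronization that remote frees may require.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed-down stand-ins for the design's structs (assumption: only the
 * fields used by the refill are shown). */
typedef struct MidPageDescV3 {
    void* freelist;            /* authoritative per-page freelist */
} MidPageDescV3;

typedef struct MidLaneV3 {
    void*    freelist_head;    /* TLS-local freelist cache */
    uint32_t freelist_count;   /* number of blocks in the cache */
} MidLaneV3;

/* Hypothetical helper: detach up to `batch` blocks from the head of the
 * page freelist and splice them in front of the lane's cache.
 * Each free block stores the pointer to the next free block in its
 * first word. Returns the number of blocks moved. */
static uint32_t mid_lane_refill_from_page(MidLaneV3* lane,
                                          MidPageDescV3* page,
                                          uint32_t batch)
{
    assert(batch > 0);

    void* head = page->freelist;
    if (head == NULL) return 0;           /* nothing to transfer */

    /* Walk at most `batch` blocks to find the tail of the chunk. */
    void*    tail = head;
    uint32_t n    = 1;
    while (n < batch && *(void**)tail != NULL) {
        tail = *(void**)tail;
        n++;
    }

    page->freelist       = *(void**)tail;        /* remainder stays on the page */
    *(void**)tail        = lane->freelist_head;  /* chain chunk onto old cache  */
    lane->freelist_head  = head;
    lane->freelist_count += n;
    return n;
}
```

Moving a whole chunk with one walk keeps the fast path in `mid_hot_v3_alloc()` free of any page access until the lane runs dry. Returning the number of blocks moved also lets a caller detect a refill that transferred nothing, rather than re-entering the fast path unconditionally.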