# Phase v11a Implementation Roadmap: MID v3.5

## 1. File Structure (New Files Planned)

### New Box Definitions

```
core/box/
├─ smallobject_segment_mid_v3_box.h    [NEW] Multi-class segment layout
├─ smallobject_stats_mid_v3_box.h      [NEW] SmallPageStatsMID_v3 type
├─ smallobject_learner_v2_box.h        [NEW] SmallLearnerStatsV2 type
└─ smallobject_policy_v2_box.h         [NEW] Policy v2 update functions
```

### Implementation Files

```
core/
├─ smallobject_segment_mid_v3.c        [NEW] Segment alloc/free/refill
├─ smallobject_learner_v2.c            [NEW] Learner stats aggregation
└─ smallobject_policy_v2.c             [NEW] Policy update logic
```

### Changes to Existing Files

```
core/
├─ smallobject_mid_v3.c                [MODIFY] C7 support, stats recording
├─ front/malloc_tiny_fast.h            [MODIFY] C7 routing (if SMALL_ROUTE_MID_V3)
├─ hakmem.c                            [MODIFY] Init smallobject_learner_v2
└─ hakmem.h                            [MODIFY] Export v2 types
```

## 2. Phase v11a-1: Design & Infrastructure

### Task 1.1: smallobject_segment_mid_v3_box.h

```c
// File: core/box/smallobject_segment_mid_v3_box.h [NEW]

#ifndef SMALLOBJECT_SEGMENT_MID_V3_BOX_H
#define SMALLOBJECT_SEGMENT_MID_V3_BOX_H

#include <stddef.h>
#include <stdint.h>

// SmallSegment_MID_v3: unified 2MiB segment for C5-C7
typedef struct {
    void *start;
    size_t total_size;              // 2 MiB
    size_t page_size;               // 64 KiB
    uint32_t num_pages;             // 32

    // Per-class page stacks
    void *free_pages[8];            // free page stack per class (LIFO)
    uint32_t free_count[8];         // free page count per class

    // Current allocation page per class
    void *current_page[8];
    uint32_t page_offset[8];        // allocation offset in current page

    // Metadata for pages
    struct SmallPageMeta **pages;   // [32] page pointers

    // Region ID (for lookup)
    uint32_t region_id;
} SmallSegment_MID_v3;

typedef struct {
    SmallSegment_MID_v3 *seg;
    void *page[8];                  // TLS cache: current page per class
    uint32_t offset[8];             // TLS cache: offset per class
} SmallHeapCtx_MID_v3;

// API
SmallSegment_MID_v3* small_segment_mid_v3_create(void);
void small_segment_mid_v3_destroy(SmallSegment_MID_v3 *seg);
void* small_segment_mid_v3_alloc_fast(
    SmallSegment_MID_v3 *seg,
    uint32_t class_idx,
    size_t size
);
void small_segment_mid_v3_free_page(
    SmallSegment_MID_v3 *seg,
    uint32_t class_idx,
    void *page
);

#endif
```

**Rationale**: Defines the multi-class segment geometry, with per-class free stacks and a TLS caching pattern.

### Task 1.2: smallobject_stats_mid_v3_box.h

```c
// File: core/box/smallobject_stats_mid_v3_box.h [NEW]

#include <stdint.h>

typedef struct {
    uint32_t class_idx;
    uint64_t total_allocations;
    uint64_t total_frees;
    uint32_t page_alloc_count;      // Slots on page
    uint32_t free_hit_ratio_bps;    // Free hit rate in basis points (0-10000)
} SmallPageStatsMID_v3;

typedef struct {
    SmallPageStatsMID_v3 stat;
    void *page_ptr;
    uint64_t retire_timestamp;
} SmallPageStatsPublished_MID_v3;

// API
void small_stats_mid_v3_publish(const SmallPageStatsMID_v3 *stat);
const SmallPageStatsPublished_MID_v3* small_stats_mid_v3_latest(void);
```

**Rationale**: Separates the stats type from policy so that the Learner input stays clean.

### Task 1.3: smallobject_learner_v2_box.h

```c
// File: core/box/smallobject_learner_v2_box.h [NEW]

#include <stdint.h>

typedef struct {
    uint64_t allocs[8];             // Allocation count per class
    uint32_t retire_ratio_pct[8];   // Retire efficiency per class (%)
    uint64_t avg_page_utilization;  // Global average utilization
    uint32_t free_hit_ratio_bps;    // Global free hit rate (basis points)
    uint64_t eval_count;
    uint64_t sample_count;
} SmallLearnerStatsV2;

// API
void small_learner_v2_record_refill(uint32_t class_idx, uint64_t capacity);
void small_learner_v2_record_retire(uint32_t class_idx, uint32_t free_hit_ratio_bps);
void small_learner_v2_evaluate(void);
const SmallLearnerStatsV2* small_learner_v2_stats_snapshot(void);
```

**Rationale**: Extends the Learner beyond the C5-only metrics of v7 to multi-dimensional metrics.

### Task 1.4: smallobject_policy_v2_box.h

```c
// File: core/box/smallobject_policy_v2_box.h [NEW]

#include <stdint.h>
#include "smallobject_learner_v2_box.h"   // SmallLearnerStatsV2

// Policy v2: Route decision with Learner-driven updates
typedef struct {
    uint8_t route_kind[8];          // Route per class (ULTRA, MID_V3, V7, LEGACY)
    uint32_t policy_version;        // Version for TLS cache invalidation
} SmallPolicyV2;

// API
const SmallPolicyV2* small_policy_v2_snapshot(void);
void small_policy_v2_init_from_env(SmallPolicyV2 *policy);
void small_policy_v2_update_from_learner(
    const SmallLearnerStatsV2 *stats,
    SmallPolicyV2 *policy_out
);
```

**Rationale**: Extends the Policy Box to handle the expanded Learner inputs.

### Task 1.5: Benchmark Suite Extension

**File**: `core/bench/bench_allocators.c`

```c
// Add test cases for Phase v11a
//
// BENCH_C5_C6_C7_MIXED:
//   - Min size: 200B (C5)
//   - Max size: 1000B (C7)
//   - Mixed ratio: 30% C5, 40% C6, 30% C7
//   - Expected perf: 42-48M ops/s (with MID_v3)
//
// BENCH_C7_HEAVY:
//   - Min size: 800B
//   - Max size: 1000B
//   - Expected perf: 35-40M ops/s (vs ULTRA baseline)
//
// BENCH_LEARNER_ROUTE_SWITCH:
//   - Start with C5-heavy (80% C5)
//   - Expect route[5] = V7 initially
//   - Then shift to C6-heavy (80% C6)
//   - Expect route[5] switch to MID_V3
```

## 3. Phase v11a-2: Core Implementation

### Task 2.1: SmallSegment_MID_v3 Creation

**File**: `core/smallobject_segment_mid_v3.c`

```c
SmallSegment_MID_v3* small_segment_mid_v3_create(void) {
    // Allocate 2MiB segment
    // Initialize 32 x 64KiB pages
    // Set up per-class free stacks
    // Register in RegionIdBox
    return NULL;  // placeholder until the steps above are implemented
}
```

**Complexity**: Medium

- Memory layout: 2MiB = 32 pages of 64KiB each
- Metadata: SmallPageMeta per page
- Region registration: via RegionIdBox_v7 API (existing)

### Task 2.2: Fast Alloc Path for C5/C6/C7

**File**: `core/smallobject_mid_v3.c`

Modify the existing C5/C6 alloc path to support C7:

```c
// Current (v3):
//   - TLS fast path: C5/C6 from tls_mid_ctx.page
//   - Refill: get page from free stack or allocate

// v11a:
//   - TLS fast path: C5/C6/C7 from tls_mid_ctx.page[class_idx]
//   - Refill: per-class free stack
//   - Retire: record stats with class_idx
```

**Changes**:

- [ ] Extend TLS context to support C7
- [ ] Update refill logic for multi-class
- [ ] Add C7 routing in malloc_tiny_fast.h

### Task 2.3: Stats Recording

**File**: `core/smallobject_mid_v3.c`

```c
void small_cold_mid_v3_retire_page(
    SmallSegment_MID_v3 *seg,
    uint32_t class_idx,
    void *page
) {
    SmallPageMeta *meta = page_to_meta(page);

    // Record stats
    uint32_t free_hit_ratio_bps = calc_free_hit_ratio(meta);
    SmallPageStatsMID_v3 stat = {
        .class_idx = class_idx,
        .total_allocations = meta->alloc_count,
        .total_frees = meta->free_count,
        .page_alloc_count = meta->capacity,
        .free_hit_ratio_bps = free_hit_ratio_bps
    };

    // Publish to stats system
    small_stats_mid_v3_publish(&stat);

    // Feed to Learner
    small_learner_v2_record_retire(class_idx, free_hit_ratio_bps);

    // Free page (return to free stack or OS)
    ...
}
```

**Key Detail**: `class_idx` must be recorded so the Learner can aggregate per class.

### Task 2.4: Learner v2 Aggregation

**File**: `core/smallobject_learner_v2.c`

```c
static SmallLearnerStatsV2 g_learner_v2_stats;

void small_learner_v2_record_retire(uint32_t class_idx, uint32_t free_hit_ratio_bps) {
    if (class_idx >= 8) return;

    g_learner_v2_stats.allocs[class_idx]++;

    // Exponential smoothing (weight 0.9 old, 0.1 new); bps / 100 converts to percent.
    // The cast makes the truncation toward zero explicit.
    g_learner_v2_stats.retire_ratio_pct[class_idx] = (uint32_t)(
        (g_learner_v2_stats.retire_ratio_pct[class_idx] * 0.9) +
        (free_hit_ratio_bps / 100.0) * 0.1);

    // Periodic evaluation
    static uint64_t total_retires = 0;
    if (++total_retires % LEARNER_EVAL_INTERVAL == 0) {
        small_learner_v2_evaluate();
    }
}

void small_learner_v2_evaluate(void) {
    // Update global version to invalidate TLS policy cache
    __sync_fetch_and_add(&g_policy_v2_version, 1);
    g_learner_v2_stats.eval_count++;
}
```

### Task 2.5: Policy v2 Update

**File**: `core/smallobject_policy_v2.c`

```c
void small_policy_v2_update_from_learner(
    const SmallLearnerStatsV2 *stats,
    SmallPolicyV2 *policy_out
) {
    if (!stats || !policy_out) return;

    // C5 decision (Phase v11a: same logic as v7)
    uint64_t total_allocs = 0;
    for (int i = 0; i < 8; i++) {
        total_allocs += stats->allocs[i];
    }

    if (total_allocs > 0) {
        uint64_t c5_ratio_pct = (stats->allocs[5] * 100) / total_allocs;
        if (c5_ratio_pct >= 30) {
            policy_out->route_kind[5] = SMALL_ROUTE_V7;
        } else {
            policy_out->route_kind[5] = SMALL_ROUTE_MID_V3;
        }
    }

    // Future (Phase v11b): Multi-dimensional decisions
    // if (retire_ratio[5] < 50% && free_hit < 7000bps) → LEGACY
    // etc.
}
```

## 4. Phase v11a-3: Integration & Testing

### Task 3.1: C7 Routing in malloc_tiny_fast.h

**File**: `core/front/malloc_tiny_fast.h`

Modify the alloc switch statement:

```c
// Current (v10):
// case TINY_ROUTE_SMALL_HEAP_V7: return small_heap_alloc_v7(...);
// case TINY_ROUTE_SMALL_HEAP_MID_V3: return small_heap_alloc_mid_v3(...);

// v11a:
// Add support for C7 routing to MID_v3
switch (policy->route_kind[class_idx]) {
    case SMALL_ROUTE_ULTRA:
        return ULTRA_alloc(...);
    case SMALL_ROUTE_MID_V3:
        return small_heap_alloc_mid_v3(class_idx, size);  // ← v11a: supports C7
    case SMALL_ROUTE_V7:
        return small_heap_alloc_v7(class_idx, size);
    case SMALL_ROUTE_LEGACY:
        return legacy_alloc(...);
}
```

### Task 3.2: Free Path C7 Support

**File**: `core/front/malloc_tiny_fast.h`

```c
// v11a: Allow C7 free to route to MID_v3
if (SMALL_MID_V3_CLASS_SUPPORTED(class_idx)) {
    if (policy->route_kind[class_idx] == SMALL_ROUTE_MID_V3) {
        small_heap_free_mid_v3(ptr, class_idx);
        return;
    }
}
```

### Task 3.3: Integration Tests

**File**: `core/test/test_mid_v3_c7.c` [NEW]

```c
void test_mid_v3_c7_alloc_free(void) {
    // Test C7 allocation and free through MID_v3
    // Expected: successful alloc/free without segfault
    // Verify: Policy routing is correct
    // Verify: Learner stats are recorded
}

void test_learner_v2_route_switch(void) {
    // Allocate C5-heavy workload
    // Verify: route[5] = V7
    // Switch to C6-heavy workload
    // Verify: route[5] switches to MID_V3
    // Check stderr: "[LEARNER_V2] C5 route switch: V7 → MID_V3"
}

void test_mid_v3_perf_c5_c6_c7_mixed(void) {
    // Performance baseline for C5/C6/C7 mixed
    // Expected: 42-48M ops/s
    // Verify: no regression vs v7 research preset
}
```

### Task 3.4: Regression Testing

**Ensure**:

- [ ] v7 research preset (C5/C6 + Learner) still works
- [ ] Mixed profile (16-1024B, v7 OFF) unchanged
- [ ] ULTRA (C4-C7) unchanged
- [ ] Legacy fallback unchanged

## 5. Build & Compilation

### Makefile Changes

```makefile
# Add new object files to HAKMEM_OBJS
HAKMEM_OBJS += \
    core/smallobject_segment_mid_v3.o \
    core/smallobject_learner_v2.o \
    core/smallobject_policy_v2.o

# Add new box headers to HEADERS
HEADERS += \
    core/box/smallobject_segment_mid_v3_box.h \
    core/box/smallobject_stats_mid_v3_box.h \
    core/box/smallobject_learner_v2_box.h \
    core/box/smallobject_policy_v2_box.h
```

## 6. Testing Commands

### Benchmark Suite (after Phase v11a-2)

```bash
# C5/C6/C7 mixed (expected MID_v3 preferred)
HAKMEM_SMALL_HEAP_V7_ENABLED=0 \
HAKMEM_MID_V3_ENABLED=1 \
HAKMEM_MID_V3_CLASSES=0x70 \
./bench_allocators bench_c5_c6_c7_mixed 300000

# C7 heavy (expected MID_v3 performance)
HAKMEM_SMALL_HEAP_V7_ENABLED=0 \
HAKMEM_MID_V3_ENABLED=1 \
./bench_allocators bench_c7_heavy 200000

# Learner route switch verification
HAKMEM_SMALL_HEAP_V7_ENABLED=1 \
HAKMEM_SMALL_HEAP_V7_CLASSES=0x60 \
HAKMEM_MID_V3_ENABLED=1 \
./bench_allocators bench_learner_route_switch 500000
```

### Expected Output

```
[POLICY_V2_INIT] Route assignments:
  C0: LEGACY
  C1: LEGACY
  C2: LEGACY
  C3: LEGACY
  C4: ULTRA
  C5: MID_V3
  C6: MID_V3
  C7: MID_V3

[LEARNER_V2] eval_count=1, C5_ratio=28%, retire_ratio[5]=92%

C5/C6/C7 mixed (300K iter): 44.2M ops/s ✓ (+4% vs baseline)
```

## 7. Dependency Graph

```
smallobject_segment_mid_v3_box.h
    ↓
smallobject_segment_mid_v3.c
    ↓ calls
smallobject_stats_mid_v3.c
    ↓ publishes to
smallobject_learner_v2.c
    ↓ feeds to
smallobject_policy_v2.c
    ↓ updates
malloc_tiny_fast.h (routing)
```

Recommended implementation order:

1. `smallobject_segment_mid_v3_box.h` and `smallobject_segment_mid_v3.c` (foundation)
2. `smallobject_stats_mid_v3_box.h` (simple type def)
3. `smallobject_mid_v3.c` changes (core alloc/free)
4. `smallobject_learner_v2_box.h` and `smallobject_learner_v2.c` (stats aggregation)
5. `smallobject_policy_v2_box.h` and `smallobject_policy_v2.c` (learner integration)
6. `malloc_tiny_fast.h` (routing)
7. Tests & benchmarks

---

**Document Date**: 2025-12-12
**Phase**: v11a-1 (Design & Infrastructure)
**Status**: Ready for Task 1.1-1.5 implementation
**Next Review**: After Phase v11a-1 completion
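
### Appendix: Sketch of the C5 Routing Rule

The C5 decision in Task 2.5 and the route switch expected by `test_learner_v2_route_switch` both reduce to one integer threshold: a C5 share of at least 30% keeps C5 on v7, anything lower sends it to MID_v3. The block below is a minimal, standalone sketch of that rule for unit testing outside the allocator. All `Sketch*`/`sketch_*` names and the enum values are hypothetical; the real `SMALL_ROUTE_*` constants and `SmallLearnerStatsV2` are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_CLASSES 8            /* matches the [8] arrays in this roadmap */
#define SKETCH_C5_V7_THRESHOLD_PCT 30   /* threshold used in Task 2.5 */

/* Hypothetical route values; the real SMALL_ROUTE_* constants may differ. */
typedef enum {
    SKETCH_ROUTE_LEGACY = 0,
    SKETCH_ROUTE_ULTRA,
    SKETCH_ROUTE_MID_V3,
    SKETCH_ROUTE_V7
} SketchRoute;

/* Share of C5 among all recorded allocations, in whole percent.
 * Returns 0 when nothing has been recorded yet. */
static uint32_t sketch_c5_ratio_pct(const uint64_t allocs[SKETCH_NUM_CLASSES]) {
    assert(allocs != NULL);
    uint64_t total = 0;
    for (int i = 0; i < SKETCH_NUM_CLASSES; i++) {
        total += allocs[i];
    }
    if (total == 0) {
        return 0;
    }
    return (uint32_t)((allocs[5] * 100) / total);
}

/* Task 2.5 rule: C5 share >= 30% selects V7, otherwise MID_V3.
 * With no samples the current route is kept, mirroring the
 * `if (total_allocs > 0)` guard in the roadmap code. */
static SketchRoute sketch_decide_c5_route(const uint64_t allocs[SKETCH_NUM_CLASSES],
                                          SketchRoute current) {
    assert(allocs != NULL);
    uint64_t total = 0;
    for (int i = 0; i < SKETCH_NUM_CLASSES; i++) {
        total += allocs[i];
    }
    if (total == 0) {
        return current;
    }
    return (sketch_c5_ratio_pct(allocs) >= SKETCH_C5_V7_THRESHOLD_PCT)
               ? SKETCH_ROUTE_V7
               : SKETCH_ROUTE_MID_V3;
}
```

Under this rule a workload with exactly 30% C5 allocations (the nominal mix of `BENCH_C5_C6_C7_MIXED`) lands on the V7 side, while the 28% shown in the expected output falls below the threshold and selects MID_V3. The integer division truncates, so a true share of 29.9% is treated as 29%.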