diff --git a/docs/analysis/PHASE_V11A_DESIGN_MID_V3.5.md b/docs/analysis/PHASE_V11A_DESIGN_MID_V3.5.md
new file mode 100644
index 00000000..92aaf375
--- /dev/null
+++ b/docs/analysis/PHASE_V11A_DESIGN_MID_V3.5.md
@@ -0,0 +1,326 @@
+# Phase v11a Design Specification: MID v3.5 (257-1KiB Unified Box)
+
+## 1. Positioning
+
+**Migration from v10 to v11a**:
+- v10: v7 frozen as a C5/C6-only research box, Learner default ON
+- v11a: **MID v3.5 unified and extended as the main 257-1KiB implementation**
+
+**Architectural roles**:
+```
+L0: ULTRA (C4-C7) → FROZEN (unchanged)
+L1: MID v3.5 (C5-C7) → main line ★NEW
+    ├─ C5: TLS cache + 2MiB segment (multi-page)
+    ├─ C6: TLS cache + 2MiB segment (multi-page)
+    └─ C7: TLS cache + 2MiB segment (multi-page)
+L1-research: v7 (C5/C6) → research box (frozen)
+L2: Segment/ColdIface/RegionId
+L3: Policy Box + Learner v2 (expanded stats)
+```
+
+## 2. Changes from MID v3 to MID v3.5
+
+### 2-1. Current MID v3 configuration
+
+**Implemented**:
+- C5/C6 multi-class TLS heap
+- 2MiB segments
+- RefillPolicy (TLS segment hint, pool fallback)
+- Policy routing (via Policy Box v7-4)
+- Legacy Stats (basic data recorded at page retire)
+
+**Limitations**:
+- C7 not supported (handled only by ULTRA)
+- No Learner statistics (v7 only)
+- Assumes single-class segments
+
+### 2-2. Features added in v11a
+
+#### Feature 1: Full C7 support
+**Goal**: MID v3.5 covers all of C5-C7
+
+**Implementation**:
+```c
+// mid_v3.5.h - new extension
+
+// TLS context for C7
+typedef struct {
+    SmallHeapCtx ctx;    // Reuse existing context
+    void *tls_page;      // Current page pointer (C7)
+    uint32_t tls_offset; // Allocation offset in page
+} SmallHeapCtx_C7_MID;
+
+// Allocation fast path
+// C7: size > 512B → check MID_ROUTE_C7
+// If enabled: try TLS fast alloc → refill on demand
+```
+
+**Policy routing**:
+```c
+// mid_policy.h
+route_kind[7] = SMALL_ROUTE_MID_V3; // If C7 enabled, else ULTRA
+```
+
+**Stats tracking**:
+```c
+// SmallPageStatsMID_v3: record class_idx for all retires
+typedef struct {
+    uint32_t class_idx;
+    uint64_t total_allocations;
+    uint64_t total_frees;
+    uint32_t page_alloc_count;   // ← v11a new
+    uint32_t free_hit_ratio_bps; // ← v11a new (basis points)
+} SmallPageStatsMID_v3;
+```
+
+#### Feature 2: Multi-class segment design decision
+
+**Two options considered**:
+
+**Design A: Separate segments**
+```
+MID_v3_segment[3] = {
+    [0] → segment_C5,
+    [1] → segment_C6,
+    [2] → segment_C7
+}
+```
+Pros: Simple, clean class separation
+Cons: 3x segment overhead, more complex TLS lookup
+
+**Design B: Shared segment + per-class pages**
+```
+SmallSegment_MID_v3 {
+    free_pages[8];  // per class free stack
+    class_pages[8]; // current page per class
+    page_alloc[8];  // allocation count per class
+}
+```
+Pros: Only one segment is needed, no RegionIdBox changes
+Cons: More complex logic
+
+**v11a decision**: **Design B (shared segment)**
+- Reason: RegionIdBox already exists (keeps changes minimal)
+- Unified segment geometry (2MiB/64KiB, same as v7)
+- Can support multi-class TLS hints
+
+#### Feature 3: Learner v2 (Expanded Stats)
+
+**Limitations of the v7-7 Learner**:
+```c
+// Current: monitors only the C5 ratio
+c5_ratio_pct = (stats->per_class[5].v7_allocs * 100) / total_allocs;
+if (c5_ratio_pct >= THRESHOLD) → route[5] = V7;
+```
+
+**v11a Learner v2 extensions**:
+```c
+typedef struct {
+    uint64_t allocs[8];            // per class allocation count
+    uint32_t retire_ratio_pct[8];  // per class retire efficiency
+    uint64_t avg_page_utilization; // global metric
+    uint32_t free_hit_ratio_bps;   // global free hit (basis points)
+    uint64_t eval_count;
+} SmallLearnerStatsV2;
+
+// Route decisions based on multiple metrics (can be extended later)
+// Example (Phase v11b):
+// - C5_ratio < 30% AND retire_ratio < 50% → MID_v3
+// - C5_ratio >= 30% AND free_hit > 8000bps → V7
+```
+
+**Implementation flow**:
+```
+MID_v3 page retire
+    ↓ record stats
+SmallPageStatsMID_v3 {class_idx, allocs, free_hit_ratio}
+    ↓ periodic publish (every LEARNER_EVAL_INTERVAL)
+SmallLearnerStatsV2 aggregate
+    ↓
+small_learner_v2_evaluate()
+    ↓
+small_policy_v2_update_from_learner() ← NEW (Policy v2)
+    ↓
+TLS policy cache invalidation
+```
+
+### 2-3. Components inherited from the existing design
+
+**Unchanged**:
+- RegionIdBox: Segment ptr → region lookup (existing behavior)
+- Policy Box: route_kind[8] array (existing API)
+- ColdIface: refill/retire interface (existing)
+- TLS cache: per-class fast path (existing pattern)
+
+**Requires changes**:
+- Policy initialization: add C7 routing
+- Learner stats recording: also record class_idx
+- Stats aggregation: multi-class support
+
+## 3. Implementation Schedule
+
+### Phase v11a-1: Design & Infrastructure (Week 1-2)
+- [ ] Finalize the SmallSegment_MID_v3 multi-class layout
+- [ ] SmallPageStatsMID_v3 type definition + publish API
+- [ ] SmallLearnerStatsV2 type definition
+- [ ] Sketch the Policy v2 update function
+- [ ] Extend the bench suite: C5/C6/C7 individual tests
+
+### Phase v11a-2: Core Implementation (Week 3-4)
+- [ ] SmallHeap_MID_v3_C7 alloc/free path
+- [ ] Multi-class refill logic
+- [ ] Stats recording (per-page class_idx)
+- [ ] Learner stats aggregation
+- [ ] Policy update_from_learner v2
+
+### Phase v11a-3: Integration & Testing (Week 5)
+- [ ] Learner default ON for MID_v3
+- [ ] Perf benchmarks: C5/C6/C7 mixed
+- [ ] Learner route switch verification
+- [ ] Regression: v7 research preset still works
+
+### Phase v11b: Multi-segment Expansion (TBD)
+- [ ] Evaluate separate segment approach
+- [ ] TLS multi-segment hint optimization
+- [ ] C4 support decision (ULTRA vs MID_v3)
+
+## 4. Minimizing API Changes
+
+### Policy Box API (minimal changes)
+```c
+// Existing: function signatures stay as they are
+const SmallPolicyV7* small_policy_v7_snapshot(void);
+void small_policy_v7_init_from_env(SmallPolicyV7* policy);
+void small_policy_v7_update_from_learner(
+    const SmallLearnerStatsV7* stats,
+    SmallPolicyV7* policy_out
+);
+
+// v11a: only the type name is extended
+// typedef SmallLearnerStatsV7 → SmallLearnerStatsV2 (backward compat)
+// → the new v2 fields are optional internally
+```
+
+### Learner Box API (newly added)
+```c
+// smallobject_learner_v2_box.h
+typedef struct { /* SmallLearnerStatsV2 */ } SmallLearnerStatsV2;
+
+void small_learner_v2_record_retire(uint32_t class_idx,
+                                    uint32_t free_hit_ratio_bps);
+void small_learner_v2_evaluate(void);
+const SmallLearnerStatsV2* small_learner_v2_stats_snapshot(void);
+```
+
+### ColdIface API (unchanged)
+```c
+// Existing refill/retire interface
+typedef void (*cold_refill_page_fn)(uint32_t class_idx, ...);
+typedef void (*cold_retire_page_fn)(uint32_t class_idx, ...);
+```
+
+## 5. Performance Projections
+
+### Current MID v3 (C5/C6)
+```
+C5/C6 mixed (200-500B, 300K iter): 38.7M ops/s
+C6 heavy (400-510B, 500K iter): 56.3M ops/s
+Mixed 16-1024B (v7 OFF): 21.5M ops/s
+```
+
+### Expected MID v3.5 (after implementation)
+```
+C5/C6/C7 mixed (200-1000B): +3-5% (more pages, better locality)
+C7 heavy (800-1000B): +2-3% (vs ULTRA fallback)
+Mixed 16-1024B (with Learner): +1-2% (dynamic routing)
+```
+
+**Metrics**:
+- Throughput: +1-3% overall
+- Overhead: ~5-8% (relative to ULTRA baseline)
+- Learner accuracy: > 95% on workload pattern detection
+
+## 6. Finalized Design Decisions
+
+### Segment Geometry (v11a)
+```
+SmallSegment_MID_v3:
+  - Total size: 2 MiB (same as v7)
+  - Page size: 64 KiB (same as v7)
+  - Free stack: per-class (C5/C6/C7 each)
+  - Class pages: current[8], partial[8]
+  - RegionId: single segment per TLS thread
+```
+
+### TLS Caching Pattern
+```c
+// TLS MID context
+struct {
+    SmallSegment_MID_v3 *seg;
+    void *page[8];      // Current page per class
+    uint32_t offset[8]; // Allocation offset
+    uint32_t cache_hits;
+    uint32_t cache_misses;
+} __thread tls_mid_v3_ctx;
+```
+
+### Stats Recording
+```c
+// On page retire:
+void small_cold_mid_v3_retire_page(..., uint32_t class_idx) {
+    SmallPageMeta* meta = page->meta;
+    meta->class_idx = class_idx; // ← record class
+
+    // Calculate metrics
+    uint32_t free_hit = calc_free_hit_ratio(page);
+
+    // Publish stats
+    SmallPageStatsMID_v3 stat = {
+        .class_idx = class_idx,
+        .total_allocations = page->alloc_count,
+        .total_frees = page->free_count,
+        .page_alloc_count = capacity,
+        .free_hit_ratio_bps = free_hit
+    };
+
+    // Feed to Learner
+    small_learner_v2_ingest_stats(&stat);
+}
+```
+
+## 7. Next Decision Points
+
+### Criteria for moving to v11b
+```
+Go to v11b (multi-segment) if:
+  ✓ C7 performance matches ULTRA (±2%)
+  ✓ Learner accuracy > 90% on class patterns
+  ✓ RegionId lookup latency acceptable (<2% overhead)
+```
+
+### Stay in v11a (iterate) if:
+```
+  ✗ C7 performance < 90% vs ULTRA
+  ✗ Learner detection < 80% accuracy
+  ✗ Stats aggregation cost > 5% CPU
+```
+
+## 8. Pruning Targets (Later)
+
+### Branch Cutting (Phase v12+)
+- Fine-grained optimization of the v3 backend
+- Verification of v6 headerless gains
+- Verification of v7 multi-class
+- Multi-dimensional Learner optimization (free_pressure, fragmentation)
+
+### Not in v11a
+- Complex Policy v2 routing (multi-dimensional conditions)
+- Simultaneous optimization of v6/v7/MID
+- Large-scale ColdIface refactoring
+
+---
+
+**Document Date**: 2025-12-12
+**Decision**: Option A (MID v3.5 consolidation)
+**Target Completion**: Phase v11a end (2025-12-31)
+**Next Review**: After Phase v11a-2 implementation
diff --git a/docs/analysis/PHASE_V11A_IMPLEMENTATION_ROADMAP.md b/docs/analysis/PHASE_V11A_IMPLEMENTATION_ROADMAP.md
new file mode 100644
index 00000000..2e0cbc9e
--- /dev/null
+++ b/docs/analysis/PHASE_V11A_IMPLEMENTATION_ROADMAP.md
@@ -0,0 +1,480 @@
+# Phase v11a Implementation Roadmap: MID v3.5
+
+## 1. File Structure (planned new files)
+
+### New box definitions
+```
+core/box/
+  ├─ smallobject_segment_mid_v3_box.h   [NEW] Multi-class segment layout
+  ├─ smallobject_stats_mid_v3_box.h     [NEW] SmallPageStatsMID_v3 type
+  ├─ smallobject_learner_v2_box.h       [NEW] SmallLearnerStatsV2 type
+  └─ smallobject_policy_v2_box.h        [NEW] Policy v2 update functions
+```
+
+### Implementation files
+```
+core/
+  ├─ smallobject_segment_mid_v3.c       [NEW] Segment alloc/free/refill
+  ├─ smallobject_learner_v2.c           [NEW] Learner stats aggregation
+  └─ smallobject_policy_v2.c            [NEW] Policy update logic
+```
+
+### Changes to existing files
+```
+core/
+  ├─ smallobject_mid_v3.c               [MODIFY] C7 support, stats recording
+  ├─ front/malloc_tiny_fast.h           [MODIFY] C7 routing (if SMALL_ROUTE_MID_V3)
+  ├─ hakmem.c                           [MODIFY] Init smallobject_learner_v2
+  └─ hakmem.h                           [MODIFY] Export v2 types
+```
+
+## 2. Phase v11a-1: Design & Infrastructure
+
+### Task 1.1: smallobject_segment_mid_v3_box.h
+```c
+// File: core/box/smallobject_segment_mid_v3_box.h [NEW]
+
+#ifndef SMALLOBJECT_SEGMENT_MID_V3_BOX_H
+#define SMALLOBJECT_SEGMENT_MID_V3_BOX_H
+
+#include <stddef.h>
+#include <stdint.h>
+
+// SmallSegment_MID_v3: unified 2MiB segment for C5-C7
+typedef struct {
+    void *start;
+    size_t total_size;       // 2 MiB
+    size_t page_size;        // 64 KiB
+    uint32_t num_pages;      // 32
+
+    // Per-class page stacks
+    void *free_pages[8];     // free page stack per class (LIFO)
+    uint32_t free_count[8];  // free page count per class
+
+    // Current allocation page per class
+    void *current_page[8];
+    uint32_t page_offset[8]; // allocation offset in current page
+
+    // Metadata for pages
+    struct SmallPageMeta **pages; // [32] page pointers
+
+    // Region ID (for lookup)
+    uint32_t region_id;
+} SmallSegment_MID_v3;
+
+typedef struct {
+    SmallSegment_MID_v3 *seg;
+    void *page[8];      // TLS cache: current page per class
+    uint32_t offset[8]; // TLS cache: offset per class
+} SmallHeapCtx_MID_v3;
+
+// API
+SmallSegment_MID_v3* small_segment_mid_v3_create(void);
+void small_segment_mid_v3_destroy(SmallSegment_MID_v3 *seg);
+
+void* small_segment_mid_v3_alloc_fast(
+    SmallSegment_MID_v3 *seg,
+    uint32_t class_idx,
+    size_t size
+);
+
+void small_segment_mid_v3_free_page(
+    SmallSegment_MID_v3 *seg,
+    uint32_t class_idx,
+    void *page
+);
+
+#endif
+```
+
+**Rationale**: Defines the multi-class segment geometry with per-class free stacks and TLS caching pattern
+
+### Task 1.2: smallobject_stats_mid_v3_box.h
+```c
+// File: core/box/smallobject_stats_mid_v3_box.h [NEW]
+
+typedef struct {
+    uint32_t class_idx;
+    uint64_t total_allocations;
+    uint64_t total_frees;
+    uint32_t page_alloc_count;   // Slots on page
+    uint32_t free_hit_ratio_bps; // Free hit rate in basis points (0-10000)
+} SmallPageStatsMID_v3;
+
+typedef struct {
+    SmallPageStatsMID_v3 stat;
+    void *page_ptr;
+    uint64_t retire_timestamp;
+} SmallPageStatsPublished_MID_v3;
+
+// API
+void small_stats_mid_v3_publish(const SmallPageStatsMID_v3 *stat);
+const SmallPageStatsPublished_MID_v3* small_stats_mid_v3_latest(void);
+```
+
+**Rationale**: Separates stats type from policy to keep Learner input clean
+
+### Task 1.3: smallobject_learner_v2_box.h
+```c
+// File: core/box/smallobject_learner_v2_box.h [NEW]
+
+typedef struct {
+    uint64_t allocs[8];            // Allocation count per class
+    uint32_t retire_ratio_pct[8];  // Retire efficiency per class (%)
+    uint64_t avg_page_utilization; // Global average utilization
+    uint32_t free_hit_ratio_bps;   // Global free hit rate (basis points)
+    uint64_t eval_count;
+    uint64_t sample_count;
+} SmallLearnerStatsV2;
+
+// API
+void small_learner_v2_record_refill(uint32_t class_idx, uint64_t capacity);
+void small_learner_v2_record_retire(uint32_t class_idx,
+                                    uint32_t free_hit_ratio_bps);
+void small_learner_v2_evaluate(void);
+const SmallLearnerStatsV2* small_learner_v2_stats_snapshot(void);
+```
+
+**Rationale**: Extends learner beyond v7 C5-only to multi-dimensional metrics
+
+### Task 1.4: smallobject_policy_v2_box.h
+```c
+// File: core/box/smallobject_policy_v2_box.h [NEW]
+
+// Policy v2: Route decision with Learner-driven updates
+typedef struct {
+    uint8_t route_kind[8];   // Route per class (ULTRA, MID_V3, V7, LEGACY)
+    uint32_t policy_version; // Version for TLS cache invalidation
+} SmallPolicyV2;
+
+// API
+const SmallPolicyV2* small_policy_v2_snapshot(void);
+void small_policy_v2_init_from_env(SmallPolicyV2 *policy);
+void small_policy_v2_update_from_learner(
+    const SmallLearnerStatsV2 *stats,
+    SmallPolicyV2 *policy_out
+);
+```
+
+**Rationale**: Extends Policy Box to handle expanded Learner inputs
+
+### Task 1.5: Benchmark Suite Extension
+**File**: `core/bench/bench_allocators.c`
+
+```c
+// Add test cases for Phase v11a
+//
+// BENCH_C5_C6_C7_MIXED:
+// - Min size: 200B (C5)
+// - Max size: 1000B (C7)
+// - Mixed ratio: 30% C5, 40% C6, 30% C7
+// - Expected perf: 42-48M ops/s (with MID_v3)
+//
+// BENCH_C7_HEAVY:
+// - Min size: 800B
+// - Max size: 1000B
+// - Expected perf: 35-40M ops/s (vs ULTRA baseline)
+//
+// BENCH_LEARNER_ROUTE_SWITCH:
+// - Start with C5-heavy (80% C5)
+// - Expect route[5] = V7 initially
+// - Then shift to C6-heavy (80% C6)
+// - Expect route[5] switch to MID_V3
+```
+
+## 3. Phase v11a-2: Core Implementation
+
+### Task 2.1: SmallSegment_MID_v3 Creation
+**File**: `core/smallobject_segment_mid_v3.c`
+
+```c
+SmallSegment_MID_v3* small_segment_mid_v3_create(void) {
+    // Allocate 2MiB segment
+    // Initialize 32 x 64KiB pages
+    // Set up per-class free stacks
+    // Register in RegionIdBox
+}
+```
+
+**Complexity**: Medium
+- Memory layout: 2MiB = 32 pages of 64KiB each
+- Metadata: SmallPageMeta per page
+- Region registration: via RegionIdBox_v7 API (existing)
+
+### Task 2.2: Fast Alloc Path for C5/C6/C7
+**File**: `core/smallobject_mid_v3.c`
+
+Modify existing C5/C6 alloc to support C7:
+
+```c
+// Current (v3):
+// - TLS fast path: C5/C6 from tls_mid_ctx.page
+// - Refill: get page from free stack or allocate
+
+// v11a:
+// - TLS fast path: C5/C6/C7 from tls_mid_ctx.page[class_idx]
+// - Refill: per-class free stack
+// - Retire: record stats with class_idx
+```
+
+**Changes**:
+- [ ] Extend TLS context to support C7
+- [ ] Update refill logic for multi-class
+- [ ] Add C7 routing in malloc_tiny_fast.h
+
+### Task 2.3: Stats Recording
+**File**: `core/smallobject_mid_v3.c`
+
+```c
+void small_cold_mid_v3_retire_page(
+    SmallSegment_MID_v3 *seg,
+    uint32_t class_idx,
+    void *page
+) {
+    SmallPageMeta *meta = page_to_meta(page);
+
+    // Record stats
+    uint32_t free_hit_ratio_bps = calc_free_hit_ratio(meta);
+    SmallPageStatsMID_v3 stat = {
+        .class_idx = class_idx,
+        .total_allocations = meta->alloc_count,
+        .total_frees = meta->free_count,
+        .page_alloc_count = meta->capacity,
+        .free_hit_ratio_bps = free_hit_ratio_bps
+    };
+
+    // Publish to stats system
+    small_stats_mid_v3_publish(&stat);
+
+    // Feed to Learner
+    small_learner_v2_record_retire(class_idx, free_hit_ratio_bps);
+
+    // Free page (return to free stack or OS)
+    ...
+}
+```
+
+**Key Detail**: Must record `class_idx` for Learner aggregation
+
+### Task 2.4: Learner v2 Aggregation
+**File**: `core/smallobject_learner_v2.c`
+
+```c
+static SmallLearnerStatsV2 g_learner_v2_stats;
+
+void small_learner_v2_record_retire(uint32_t class_idx,
+                                    uint32_t free_hit_ratio_bps) {
+    if (class_idx >= 8) return;
+
+    g_learner_v2_stats.allocs[class_idx]++;
+    g_learner_v2_stats.retire_ratio_pct[class_idx] =
+        (g_learner_v2_stats.retire_ratio_pct[class_idx] * 0.9) +
+        (free_hit_ratio_bps / 100.0) * 0.1; // Exponential smoothing
+
+    // Periodic evaluation
+    static uint64_t total_retires = 0;
+    if (++total_retires % LEARNER_EVAL_INTERVAL == 0) {
+        small_learner_v2_evaluate();
+    }
+}
+
+void small_learner_v2_evaluate(void) {
+    // Update global version to invalidate TLS policy cache
+    __sync_fetch_and_add(&g_policy_v2_version, 1);
+
+    g_learner_v2_stats.eval_count++;
+}
+```
+
+### Task 2.5: Policy v2 Update
+**File**: `core/smallobject_policy_v2.c`
+
+```c
+void small_policy_v2_update_from_learner(
+    const SmallLearnerStatsV2 *stats,
+    SmallPolicyV2 *policy_out
+) {
+    if (!stats || !policy_out) return;
+
+    // C5 decision (Phase v11a: same logic as v7)
+    uint64_t total_allocs = 0;
+    for (int i = 0; i < 8; i++) {
+        total_allocs += stats->allocs[i];
+    }
+
+    if (total_allocs > 0) {
+        uint64_t c5_ratio_pct = (stats->allocs[5] * 100) / total_allocs;
+
+        if (c5_ratio_pct >= 30) {
+            policy_out->route_kind[5] = SMALL_ROUTE_V7;
+        } else {
+            policy_out->route_kind[5] = SMALL_ROUTE_MID_V3;
+        }
+    }
+
+    // Future (Phase v11b): Multi-dimensional decisions
+    // if (retire_ratio[5] < 50% && free_hit < 7000bps) → LEGACY
+    // etc.
+}
+```
+
+## 4. Phase v11a-3: Integration & Testing
+
+### Task 3.1: C7 Routing in malloc_tiny_fast.h
+**File**: `core/front/malloc_tiny_fast.h`
+
+Modify alloc switch statement:
+
+```c
+// Current (v10):
+// case TINY_ROUTE_SMALL_HEAP_V7: return small_heap_alloc_v7(...);
+// case TINY_ROUTE_SMALL_HEAP_MID_V3: return small_heap_alloc_mid_v3(...);
+
+// v11a:
+// Add support for C7 routing to MID_v3
+switch (policy->route_kind[class_idx]) {
+    case SMALL_ROUTE_ULTRA:
+        return ULTRA_alloc(...);
+    case SMALL_ROUTE_MID_V3:
+        return small_heap_alloc_mid_v3(class_idx, size); // ← v11a: supports C7
+    case SMALL_ROUTE_V7:
+        return small_heap_alloc_v7(class_idx, size);
+    case SMALL_ROUTE_LEGACY:
+        return legacy_alloc(...);
+}
+```
+
+### Task 3.2: Free Path C7 Support
+**File**: `core/front/malloc_tiny_fast.h`
+
+```c
+// v11a: Allow C7 free to route to MID_v3
+if (SMALL_MID_V3_CLASS_SUPPORTED(class_idx)) {
+    if (policy->route_kind[class_idx] == SMALL_ROUTE_MID_V3) {
+        small_heap_free_mid_v3(ptr, class_idx);
+        return;
+    }
+}
+```
+
+### Task 3.3: Integration Tests
+**File**: `core/test/test_mid_v3_c7.c` [NEW]
+
+```c
+void test_mid_v3_c7_alloc_free(void) {
+    // Test C7 allocation and free through MID_v3
+    // Expected: successful alloc/free without segfault
+    // Verify: Policy routing is correct
+    // Verify: Learner stats are recorded
+}
+
+void test_learner_v2_route_switch(void) {
+    // Allocate C5-heavy workload
+    // Verify: route[5] = V7
+    // Switch to C6-heavy workload
+    // Verify: route[5] switches to MID_V3
+    // Check stderr: "[LEARNER_V2] C5 route switch: V7 → MID_V3"
+}
+
+void test_mid_v3_perf_c5_c6_c7_mixed(void) {
+    // Performance baseline for C5/C6/C7 mixed
+    // Expected: 42-48M ops/s
+    // Verify: no regression vs v7 research preset
+}
+```
+
+### Task 3.4: Regression Testing
+**Ensure**:
+- [ ] v7 research preset (C5/C6 + Learner) still works
+- [ ] Mixed profile (16-1024B, v7 OFF) unchanged
+- [ ] ULTRA (C4-C7) unchanged
+- [ ] Legacy fallback unchanged
+
+## 5. Build & Compilation
+
+### Makefile Changes
+```makefile
+# Add new object files to HAKMEM_OBJS
+HAKMEM_OBJS += \
+    core/smallobject_segment_mid_v3.o \
+    core/smallobject_learner_v2.o \
+    core/smallobject_policy_v2.o
+
+# Add new box headers to HEADERS
+HEADERS += \
+    core/box/smallobject_segment_mid_v3_box.h \
+    core/box/smallobject_stats_mid_v3_box.h \
+    core/box/smallobject_learner_v2_box.h \
+    core/box/smallobject_policy_v2_box.h
+```
+
+## 6. Testing Commands
+
+### Benchmark Suite (after Phase v11a-2)
+```bash
+# C5/C6/C7 mixed (expected MID_v3 preferred)
+HAKMEM_SMALL_HEAP_V7_ENABLED=0 \
+HAKMEM_MID_V3_ENABLED=1 \
+HAKMEM_MID_V3_CLASSES=0xE0 \
+./bench_allocators bench_c5_c6_c7_mixed 300000
+
+# C7 heavy (expected MID_v3 performance)
+HAKMEM_SMALL_HEAP_V7_ENABLED=0 \
+HAKMEM_MID_V3_ENABLED=1 \
+./bench_allocators bench_c7_heavy 200000
+
+# Learner route switch verification
+HAKMEM_SMALL_HEAP_V7_ENABLED=1 \
+HAKMEM_SMALL_HEAP_V7_CLASSES=0x60 \
+HAKMEM_MID_V3_ENABLED=1 \
+./bench_allocators bench_learner_route_switch 500000
+```
+
+### Expected Output
+```
+[POLICY_V2_INIT] Route assignments:
+  C0: LEGACY
+  C1: LEGACY
+  C2: LEGACY
+  C3: LEGACY
+  C4: ULTRA
+  C5: MID_V3
+  C6: MID_V3
+  C7: MID_V3
+
+[LEARNER_V2] eval_count=1, C5_ratio=28%, retire_ratio[5]=92%
+
+C5/C6/C7 mixed (300K iter): 44.2M ops/s ✓ (+4% vs baseline)
+```
+
+## 7. Dependency Graph
+
+```
+smallobject_segment_mid_v3_box.h
+    ↓
+smallobject_segment_mid_v3.c
+    ↓ calls
+smallobject_stats_mid_v3.c
+    ↓ publishes to
+smallobject_learner_v2.c
+    ↓ feeds to
+smallobject_policy_v2.c
+    ↓ updates
+malloc_tiny_fast.h (routing)
+```
+
+Recommended implementation order:
+1. smallobject_segment_mid_v3.h/c (foundation)
+2. smallobject_stats_mid_v3.h (simple type def)
+3. smallobject_mid_v3.c changes (core alloc/free)
+4. smallobject_learner_v2.h/c (stats aggregation)
+5. smallobject_policy_v2.h/c (learner integration)
+6. malloc_tiny_fast.h (routing)
+7. Tests & benchmarks
+
+---
+
+**Document Date**: 2025-12-12
+**Phase**: v11a-1 (Design & Infrastructure)
+**Status**: Ready for Task 1.1-1.5 implementation
+**Next Review**: After Phase v11a-1 completion
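
As a rough illustration of the Learner and Policy logic described in Tasks 2.4 and 2.5 of the roadmap, the standalone C sketch below shows one possible integer-only form of the 0.9/0.1 exponential smoothing, plus the 30% C5 routing threshold. Everything prefixed with `sketch_` or `SKETCH_` is a hypothetical name introduced for this example and is not part of the hakmem API; returning MID_V3 when no allocations have been observed is also an assumption (the roadmap snippet simply leaves the policy untouched in that case).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route identifiers for this sketch only; the real
 * SMALL_ROUTE_* constants are not spelled out in the roadmap. */
enum sketch_route { SKETCH_ROUTE_MID_V3 = 0, SKETCH_ROUTE_V7 = 1 };

/* Integer exponential smoothing: new = (9 * old + sample) / 10, rounded to
 * nearest. The sample arrives in basis points and is converted to percent,
 * mirroring the 0.9 / 0.1 weights in Task 2.4 without floating point. */
static uint32_t sketch_ema_pct(uint32_t old_pct, uint32_t sample_bps) {
    uint32_t sample_pct = sample_bps / 100u;
    return (old_pct * 9u + sample_pct + 5u) / 10u;
}

/* C5 routing rule from Task 2.5: route C5 to V7 when C5 accounts for at
 * least 30% of all allocations, otherwise to MID_V3. Returning MID_V3 when
 * no allocations were seen is an assumption of this sketch. */
static enum sketch_route sketch_decide_c5(const uint64_t allocs[8]) {
    assert(allocs != NULL);
    uint64_t total = 0;
    for (size_t i = 0; i < 8; i++) {
        total += allocs[i];
    }
    if (total == 0) {
        return SKETCH_ROUTE_MID_V3;
    }
    uint64_t c5_pct = (allocs[5] * 100u) / total;
    return (c5_pct >= 30u) ? SKETCH_ROUTE_V7 : SKETCH_ROUTE_MID_V3;
}
```

For instance, `sketch_ema_pct(50, 10000)` moves a 50% estimate to 55% after one fully efficient retire, and a workload with 30 C5 allocations out of 100 sits exactly on the threshold and routes to V7. Keeping the arithmetic in integers avoids the implicit double-to-`uint32_t` conversion that the roadmap snippet relies on, though whether that matters on the cold retire path is a judgment call.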