Phase v11a: Architecture design and implementation roadmap documents

Create comprehensive design specifications for Phase v11a (MID v3.5):

1. PHASE_V11A_DESIGN_MID_V3.5.md
   - Decision rationale: Option A chosen (consolidation vs expansion)
   - MID v3.5 architecture: unified 257-1KiB box
   - Role clarification: v7 frozen as research preset
   - Learner v2 scope: multi-class tracking, C5 ratio primary decision
   - Segment design decision: shared segment (Design B) vs separate segments
   - Stats expansion: per-class efficiency metrics
   - API changes: minimal, backward compatible

2. PHASE_V11A_IMPLEMENTATION_ROADMAP.md
   - Detailed task breakdown for v11a-1, v11a-2, v11a-3
   - File structure: new boxes, implementation files, modified files
   - Concrete function signatures and integration points
   - Benchmark commands and expected performance
   - Dependency graph and implementation order
   - Build/Makefile changes needed
   - Testing strategy and regression checks

Key Design Decisions:
- Multi-class segment uses shared 2MiB segment (not separate)
- Per-class free page stacks for efficient refill
- Stats published per-page retire (for Learner ingestion)
- TLS version-based cache invalidation (atomic policy updates)
- Backward compatibility: Policy v2 extends v1 interface

Next Step: Phase v11a-2 (Core Implementation)
- Implement segment creation/alloc/free
- Add C7 support to existing MID_v3
- Stats recording during page retire
- Learner aggregation logic

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Moe Charm (CI)
2025-12-12 06:20:14 +09:00
parent babd884b96
commit 57313f7822
2 changed files with 806 additions and 0 deletions


@ -0,0 +1,326 @@
# Phase v11a Design Specification: MID v3.5 (257-1KiB Unified Box)
## 1. Positioning
**Transition from v10 to v11a**:
- v10: v7 frozen as a C5/C6-only research box; Learner default ON
- v11a: **MID v3.5 extended and unified as the main 257-1KiB implementation**
**Architecture roles**:
```
L0: ULTRA (C4-C7) → FROZEN (unchanged)
L1: MID v3.5 (C5-C7) → main line ★NEW
├─ C5: TLS cache + 2MiB segment (multi-page)
├─ C6: TLS cache + 2MiB segment (multi-page)
└─ C7: TLS cache + 2MiB segment (multi-page)
L1-research: v7 (C5/C6) → research box (frozen)
L2: Segment/ColdIface/RegionId
L3: Policy Box + Learner v2 (expanded stats)
```
## 2. Changes from MID v3 to MID v3.5
### 2-1. Current MID v3 configuration
**Already implemented**:
- C5/C6 multi-class TLS heap
- 2MiB segments
- RefillPolicy (TLS segment hint, pool fallback)
- Policy routing (via Policy Box v7-4)
- Legacy stats (basic data recorded at page retire)
**Limitations**:
- C7 not supported (ULTRA-only)
- No Learner stats (v7 only)
- Assumes single-class segments
### 2-2. Features added in v11a
#### Feature 1: Full C7 support
**Goal**: MID v3.5 covers all of C5-C7
**Implementation**:
```c
// mid_v3.5.h - new extension
// TLS context for C7
typedef struct {
SmallHeapCtx ctx; // Reuse existing context
void *tls_page; // Current page pointer (C7)
uint32_t tls_offset; // Allocation offset in page
} SmallHeapCtx_C7_MID;
// Allocation fast path
// C7: size > 512B → check MID_ROUTE_C7
// If enabled: try TLS fast alloc → refill on demand
```
**Policy routing**:
```c
// mid_policy.h
route_kind[7] = SMALL_ROUTE_MID_V3; // If C7 enabled, else ULTRA
```
**Stats tracking**:
```c
// SmallPageStatsMID_v3: record class_idx for all retires
typedef struct {
uint32_t class_idx;
uint64_t total_allocations;
uint64_t total_frees;
uint32_t page_alloc_count; // ← v11a new
uint32_t free_hit_ratio_bps; // ← v11a new (basis points)
} SmallPageStatsMID_v3;
```
#### Feature 2: Multi-class segment design decision
**Two options considered**:
**Design A: Separate segments**
```
MID_v3_segment[3] = {
[0] → segment_C5,
[1] → segment_C6,
[2] → segment_C7
}
```
Pros: simple, clean class separation
Cons: 3x segment overhead, more complex TLS lookup
**Design B: Shared segment + per-class pages**
```
SmallSegment_MID_v3 {
free_pages[8]; // per class free stack
class_pages[8]; // current page per class
page_alloc[8]; // allocation count per class
}
```
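Pros: a single segment suffices; no RegionIdBox changes needed
Cons: more complex logic
**v11a decision**: **Design B (shared segment)**
- Rationale: RegionIdBox already exists (minimizes changes)
- Unified segment geometry (same 2MiB/64KiB as v7)
- Can support multi-class TLS hints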
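The per-class free page stacks that Design B relies on can be sketched as intrusive LIFO lists inside one shared segment. This is a minimal illustration only: `DemoPage`, `DemoSegment`, and the `demo_*` functions are hypothetical names, and storing the link inside the free page itself is an assumption, not something the design fixes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_CLASSES 8  /* matches the [8] arrays in the design */

/* Hypothetical page header: the stack link lives inside the free page. */
typedef struct DemoPage {
    struct DemoPage *next_free;
    uint32_t class_idx;
} DemoPage;

/* Hypothetical shared segment: one LIFO stack head per size class. */
typedef struct {
    DemoPage *free_pages[DEMO_NUM_CLASSES];
    uint32_t  free_count[DEMO_NUM_CLASSES];
} DemoSegment;

/* Push a retired page onto the stack of its class. */
static void demo_push_free_page(DemoSegment *seg, uint32_t class_idx,
                                DemoPage *page) {
    assert(class_idx < DEMO_NUM_CLASSES);
    page->class_idx = class_idx;
    page->next_free = seg->free_pages[class_idx];
    seg->free_pages[class_idx] = page;
    seg->free_count[class_idx]++;
}

/* Pop the most recently freed page of a class, or NULL if the stack is empty. */
static DemoPage *demo_pop_free_page(DemoSegment *seg, uint32_t class_idx) {
    assert(class_idx < DEMO_NUM_CLASSES);
    DemoPage *page = seg->free_pages[class_idx];
    if (page == NULL) return NULL;
    seg->free_pages[class_idx] = page->next_free;
    page->next_free = NULL;
    seg->free_count[class_idx]--;
    return page;
}
```

Because each class has its own stack head, a refill for C7 never has to scan pages that were last used by C5 or C6.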
#### Feature 3: Learner v2 (Expanded Stats)
**Limitations of the v7-7 Learner**:
```c
// Current: monitors only the C5 ratio
c5_ratio_pct = (stats->per_class[5].v7_allocs * 100) / total_allocs;
if (c5_ratio_pct >= THRESHOLD) route[5] = V7;
```
**v11a Learner v2 extensions**:
```c
typedef struct {
uint64_t allocs[8]; // per class allocation count
uint32_t retire_ratio_pct[8]; // per class retire efficiency
uint64_t avg_page_utilization; // global metric
uint32_t free_hit_ratio_bps; // global free hit (basis points)
uint64_t eval_count;
} SmallLearnerStatsV2;
// Route decisions based on multiple metrics (can be extended later)
// Example (Phase v11b):
// - C5_ratio < 30% AND retire_ratio < 50% → MID_v3
// - C5_ratio >= 30% AND free_hit > 8000bps → V7
```
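The two example rules for Phase v11b could be written as a small pure function. This is a sketch under stated assumptions: `DemoRoute` and `demo_decide_c5_route` are hypothetical names, and keeping the current route when neither rule matches is an assumption, since the example does not say what happens in that case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical route identifiers; the real values belong to the Policy Box. */
typedef enum { DEMO_ROUTE_MID_V3, DEMO_ROUTE_V7 } DemoRoute;

/* Apply the two example rules in order.
 * Assumption: if neither rule matches, the current route is kept. */
static DemoRoute demo_decide_c5_route(uint32_t c5_ratio_pct,
                                      uint32_t retire_ratio_pct,
                                      uint32_t free_hit_ratio_bps,
                                      DemoRoute current) {
    if (c5_ratio_pct < 30 && retire_ratio_pct < 50) {
        return DEMO_ROUTE_MID_V3;
    }
    if (c5_ratio_pct >= 30 && free_hit_ratio_bps > 8000) {
        return DEMO_ROUTE_V7;
    }
    return current;
}
```

Keeping the decision free of side effects makes it easy to unit-test the thresholds separately from the stats plumbing.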
**Implementation flow**:
```
MID_v3 page retire
↓ record stats
SmallPageStatsMID_v3 {class_idx, allocs, free_hit_ratio}
↓ periodic publish (every LEARNER_EVAL_INTERVAL)
SmallLearnerStatsV2 aggregate
↓
small_learner_v2_evaluate()
↓
small_policy_v2_update_from_learner() ← NEW (Policy v2)
↓
TLS policy cache invalidation
```
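The last step of the flow, version-based TLS cache invalidation, might look like the sketch below. All `demo_*` names are hypothetical. The sketch assumes a single writer (the Learner) and does not protect a reader that copies the policy while it is being written; a real implementation would need a seqlock or immutable snapshots for that.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  route_kind[8];
    uint32_t version;
} DemoPolicy;

static DemoPolicy g_demo_policy;                /* shared policy, written by the Learner */
static _Atomic uint32_t g_demo_policy_version;  /* bumped after every update */

static _Thread_local DemoPolicy tls_demo_policy; /* per-thread copy */
static _Thread_local int tls_demo_valid;         /* 0 until the first refresh */

/* Writer side: change a route, then publish it by bumping the version. */
static void demo_policy_set_route(uint32_t class_idx, uint8_t route) {
    assert(class_idx < 8);
    g_demo_policy.route_kind[class_idx] = route;
    atomic_fetch_add_explicit(&g_demo_policy_version, 1, memory_order_release);
}

/* Reader side: one atomic load on the fast path; recopy only when the
 * global version differs from the one cached in this thread. */
static const DemoPolicy *demo_policy_get(void) {
    uint32_t v = atomic_load_explicit(&g_demo_policy_version,
                                      memory_order_acquire);
    if (!tls_demo_valid || tls_demo_policy.version != v) {
        memcpy(tls_demo_policy.route_kind, g_demo_policy.route_kind,
               sizeof tls_demo_policy.route_kind);
        tls_demo_policy.version = v;
        tls_demo_valid = 1;
    }
    return &tls_demo_policy;
}
```

The allocation fast path pays only for a version comparison; the copy happens once per thread per policy update.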
### 2-3. Inherited existing components
**Unchanged**:
- RegionIdBox: segment ptr → region lookup (existing behavior)
- Policy Box: route_kind[8] array (existing API)
- ColdIface: refill/retire interface (existing)
- TLS cache: per-class fast path (existing pattern)
**Requires changes**:
- Policy initialization: add C7 routing
- Learner stats recording: add class_idx recording
- Stats aggregation: multi-class support
## 3. Implementation schedule
### Phase v11a-1: Design & Infrastructure (Week 1-2)
- [ ] Finalize the SmallSegment_MID_v3 multi-class layout
- [ ] SmallPageStatsMID_v3 type definition + publish API
- [ ] SmallLearnerStatsV2 type definition
- [ ] Sketch the Policy v2 update function
- [ ] Extend the bench suite: individual C5/C6/C7 tests
### Phase v11a-2: Core Implementation (Week 3-4)
- [ ] SmallHeap_MID_v3_C7 alloc/free path
- [ ] Multi-class refill logic
- [ ] Stats recording (per-page class_idx)
- [ ] Learner stats aggregation
- [ ] Policy update_from_learner v2
### Phase v11a-3: Integration & Testing (Week 5)
- [ ] Learner default ON for MID_v3
- [ ] Perf benchmarks: C5/C6/C7 mixed
- [ ] Learner route switch verification
- [ ] Regression: v7 research preset still works
### Phase v11b: Multi-segment Expansion (TBD)
- [ ] Evaluate separate segment approach
- [ ] TLS multi-segment hint optimization
- [ ] C4 support decision (ULTRA vs MID_v3)
## 4. Minimizing API changes
### Policy Box API (minimal changes)
```c
// Existing: function signatures unchanged
const SmallPolicyV7* small_policy_v7_snapshot(void);
void small_policy_v7_init_from_env(SmallPolicyV7* policy);
void small_policy_v7_update_from_learner(
const SmallLearnerStatsV7* stats,
SmallPolicyV7* policy_out
);
// v11a: only the type name is extended
// typedef SmallLearnerStatsV7 → SmallLearnerStatsV2 (backward compat)
// → internally, the new v2 fields are optional
```
### Learner Box API (newly added)
```c
// smallobject_learner_v2_box.h
typedef struct { /* SmallLearnerStatsV2 */ } SmallLearnerStatsV2;
void small_learner_v2_record_retire(uint32_t class_idx,
uint32_t free_hit_ratio_bps);
void small_learner_v2_evaluate(void);
const SmallLearnerStatsV2* small_learner_v2_stats_snapshot(void);
```
### ColdIface API (unchanged)
```c
// Existing refill/retire interface
typedef void (*cold_refill_page_fn)(uint32_t class_idx, ...);
typedef void (*cold_retire_page_fn)(uint32_t class_idx, ...);
```
## 5. Performance projections
### Current MID v3 (C5/C6)
```
C5/C6 mixed (200-500B, 300K iter): 38.7M ops/s
C6 heavy (400-510B, 500K iter): 56.3M ops/s
Mixed 16-1024B (v7 OFF): 21.5M ops/s
```
### Expected MID v3.5 (after implementation)
```
C5/C6/C7 mixed (200-1000B): +3-5% (more pages, better locality)
C7 heavy (800-1000B): +2-3% (vs ULTRA fallback)
Mixed 16-1024B (with Learner): +1-2% (dynamic routing)
```
**Metrics**:
- Throughput: +1-3% overall
- Overhead: ~5-8% (relative to ULTRA baseline)
- Learner accuracy: > 95% on workload pattern detection
## 6. Finalized design decisions
### Segment Geometry (v11a)
```
SmallSegment_MID_v3:
- Total size: 2 MiB (same as v7)
- Page size: 64 KiB (same as v7)
- Free stack: per-class (C5/C6/C7 each)
- Class pages: current[8], partial[8]
- RegionId: single segment per TLS thread
```
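The geometry above implies 2 MiB / 64 KiB = 32 pages per segment. The arithmetic, and one way to map an address to its page index, can be sketched as follows. The `DEMO_*` names are hypothetical, and the sketch assumes the segment is aligned to its own size so the base can be recovered by masking; the design does not state that alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SEGMENT_SIZE ((size_t)2 << 20)   /* 2 MiB */
#define DEMO_PAGE_SIZE    ((size_t)64 << 10)  /* 64 KiB */
#define DEMO_NUM_PAGES    (DEMO_SEGMENT_SIZE / DEMO_PAGE_SIZE)

/* Compile-time check of the page count implied by the geometry. */
static_assert(DEMO_NUM_PAGES == 32, "2 MiB / 64 KiB must give 32 pages");

/* Page index of an address inside a segment.
 * Assumption: segments are aligned to DEMO_SEGMENT_SIZE. */
static uint32_t demo_page_index(uintptr_t addr) {
    uintptr_t seg_base = addr & ~((uintptr_t)DEMO_SEGMENT_SIZE - 1);
    return (uint32_t)((addr - seg_base) / DEMO_PAGE_SIZE);
}
```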
### TLS Caching Pattern
```c
// TLS MID context
__thread struct {
SmallSegment_MID_v3 *seg;
void *page[8]; // Current page per class
uint32_t offset[8]; // Allocation offset
uint32_t cache_hits;
uint32_t cache_misses;
} tls_mid_v3_ctx;
```
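A fast path over this kind of context can be sketched as a per-class bump allocation. This is an illustration under assumptions, not the planned implementation: `DemoTlsCtx` and `demo_tls_alloc` are hypothetical, the context is passed explicitly so the function is testable, and freed-block reuse inside a page is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE ((uint32_t)64 * 1024)

/* Hypothetical mirror of the TLS context: one cached page per class. */
typedef struct {
    uint8_t *page[8];
    uint32_t offset[8];
    uint32_t cache_hits;
    uint32_t cache_misses;
} DemoTlsCtx;

/* Bump-allocate one block from the cached page of a class.
 * Returns NULL when there is no page or the page is exhausted,
 * which is where the caller would take the refill slow path. */
static void *demo_tls_alloc(DemoTlsCtx *ctx, uint32_t class_idx,
                            uint32_t block_size) {
    assert(class_idx < 8 && block_size > 0 && block_size <= DEMO_PAGE_SIZE);
    uint8_t *page = ctx->page[class_idx];
    uint32_t off = ctx->offset[class_idx];
    if (page == NULL || off > DEMO_PAGE_SIZE - block_size) {
        ctx->cache_misses++;
        return NULL;
    }
    ctx->offset[class_idx] = off + block_size;
    ctx->cache_hits++;
    return page + off;
}
```

Indexing both arrays by `class_idx` is what lets C7 share the code path already used for C5 and C6.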
### Stats Recording
```c
// On page retire:
void small_cold_mid_v3_retire_page(..., uint32_t class_idx) {
SmallPageMeta* meta = page->meta;
meta->class_idx = class_idx; // ← record class
// Calculate metrics
uint32_t free_hit = calc_free_hit_ratio(page);
// Publish stats
SmallPageStatsMID_v3 stat = {
.class_idx = class_idx,
.total_allocations = page->alloc_count,
.total_frees = page->free_count,
.page_alloc_count = meta->capacity,
.free_hit_ratio_bps = free_hit
};
// Feed to Learner
small_learner_v2_ingest_stats(&stat);
}
```
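The document does not define `calc_free_hit_ratio`. One plausible reading, shown here purely as an assumption, is frees divided by allocations, expressed in basis points and clamped to the 0-10000 range. The name `demo_free_hit_ratio_bps` and its two-counter signature are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definition: frees / allocations, in basis points (0-10000).
 * Counters are assumed small enough that total_frees * 10000 fits in 64 bits. */
static uint32_t demo_free_hit_ratio_bps(uint64_t total_frees,
                                        uint64_t total_allocations) {
    if (total_allocations == 0) return 0;  /* avoid division by zero */
    uint64_t bps = (total_frees * 10000u) / total_allocations;
    return bps > 10000u ? 10000u : (uint32_t)bps;
}
```

Basis points keep two extra decimal digits of precision over whole percentages while staying in integer arithmetic.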
## 7. Next Decision Points
### Criteria for moving to v11b
```
Go to v11b (multi-segment) if:
✓ C7 performance matches ULTRA (±2%)
✓ Learner accuracy > 90% on class patterns
✓ RegionId lookup latency acceptable (<2% overhead)
```
### Stay in v11a (iterate) if:
```
✗ C7 performance < 90% vs ULTRA
✗ Learner detection < 80% accuracy
✗ Stats aggregation cost > 5% CPU
```
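## 8. Branch-cutting targets (later)
### Branch Cutting (Phase v12+)
- Fine-grained optimization of the v3 backend
- Verification of v6 headerless gains
- Verification of v7 multi-class
- Multi-dimensional Learner optimization (free_pressure, fragmentation)
### Not in v11a
- Complex Policy v2 routing (multi-dimensional conditions)
- Simultaneous optimization of v6/v7/MID
- Large-scale ColdIface refactoring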
---
**Document Date**: 2025-12-12
**Decision**: Option A (MID v3.5 consolidation)
**Target Completion**: Phase v11a end (2025-12-31)
**Next Review**: After Phase v11a-2 implementation


@ -0,0 +1,480 @@
# Phase v11a Implementation Roadmap: MID v3.5
## 1. File structure (files to be created)
### New box definitions
```
core/box/
├─ smallobject_segment_mid_v3_box.h [NEW] Multi-class segment layout
├─ smallobject_stats_mid_v3_box.h [NEW] SmallPageStatsMID_v3 type
├─ smallobject_learner_v2_box.h [NEW] SmallLearnerStatsV2 type
└─ smallobject_policy_v2_box.h [NEW] Policy v2 update functions
```
### Implementation files
```
core/
├─ smallobject_segment_mid_v3.c [NEW] Segment alloc/free/refill
├─ smallobject_learner_v2.c [NEW] Learner stats aggregation
└─ smallobject_policy_v2.c [NEW] Policy update logic
```
### Modifications to existing files
```
core/
├─ smallobject_mid_v3.c [MODIFY] C7 support, stats recording
├─ front/malloc_tiny_fast.h [MODIFY] C7 routing (if SMALL_ROUTE_MID_V3)
├─ hakmem.c [MODIFY] Init smallobject_learner_v2
└─ hakmem.h [MODIFY] Export v2 types
```
## 2. Phase v11a-1: Design & Infrastructure
### Task 1.1: smallobject_segment_mid_v3_box.h
```c
// File: core/box/smallobject_segment_mid_v3_box.h [NEW]
#ifndef SMALLOBJECT_SEGMENT_MID_V3_BOX_H
#define SMALLOBJECT_SEGMENT_MID_V3_BOX_H
#include <stdint.h>
#include <stddef.h>
// SmallSegment_MID_v3: unified 2MiB segment for C5-C7
typedef struct {
void *start;
size_t total_size; // 2 MiB
size_t page_size; // 64 KiB
uint32_t num_pages; // 32
// Per-class page stacks
void *free_pages[8]; // free page stack per class (LIFO)
uint32_t free_count[8]; // free page count per class
// Current allocation page per class
void *current_page[8];
uint32_t page_offset[8]; // allocation offset in current page
// Metadata for pages
struct SmallPageMeta **pages; // [32] page pointers
// Region ID (for lookup)
uint32_t region_id;
} SmallSegment_MID_v3;
typedef struct {
SmallSegment_MID_v3 *seg;
void *page[8]; // TLS cache: current page per class
uint32_t offset[8]; // TLS cache: offset per class
} SmallHeapCtx_MID_v3;
// API
SmallSegment_MID_v3* small_segment_mid_v3_create(void);
void small_segment_mid_v3_destroy(SmallSegment_MID_v3 *seg);
void* small_segment_mid_v3_alloc_fast(
SmallSegment_MID_v3 *seg,
uint32_t class_idx,
size_t size
);
void small_segment_mid_v3_free_page(
SmallSegment_MID_v3 *seg,
uint32_t class_idx,
void *page
);
#endif
```
**Rationale**: Defines the multi-class segment geometry with per-class free stacks and TLS caching pattern
### Task 1.2: smallobject_stats_mid_v3_box.h
```c
// File: core/box/smallobject_stats_mid_v3_box.h [NEW]
typedef struct {
uint32_t class_idx;
uint64_t total_allocations;
uint64_t total_frees;
uint32_t page_alloc_count; // Slots on page
uint32_t free_hit_ratio_bps; // Free hit rate in basis points (0-10000)
} SmallPageStatsMID_v3;
typedef struct {
SmallPageStatsMID_v3 stat;
void *page_ptr;
uint64_t retire_timestamp;
} SmallPageStatsPublished_MID_v3;
// API
void small_stats_mid_v3_publish(const SmallPageStatsMID_v3 *stat);
const SmallPageStatsPublished_MID_v3* small_stats_mid_v3_latest(void);
```
**Rationale**: Separates stats type from policy to keep Learner input clean
### Task 1.3: smallobject_learner_v2_box.h
```c
// File: core/box/smallobject_learner_v2_box.h [NEW]
typedef struct {
uint64_t allocs[8]; // Allocation count per class
uint32_t retire_ratio_pct[8]; // Retire efficiency per class (%)
uint64_t avg_page_utilization; // Global average utilization
uint32_t free_hit_ratio_bps; // Global free hit rate (basis points)
uint64_t eval_count;
uint64_t sample_count;
} SmallLearnerStatsV2;
// API
void small_learner_v2_record_refill(uint32_t class_idx, uint64_t capacity);
void small_learner_v2_record_retire(uint32_t class_idx,
uint32_t free_hit_ratio_bps);
void small_learner_v2_evaluate(void);
const SmallLearnerStatsV2* small_learner_v2_stats_snapshot(void);
```
**Rationale**: Extends learner beyond v7 C5-only to multi-dimensional metrics
### Task 1.4: smallobject_policy_v2_box.h
```c
// File: core/box/smallobject_policy_v2_box.h [NEW]
// Policy v2: Route decision with Learner-driven updates
typedef struct {
uint8_t route_kind[8]; // Route per class (ULTRA, MID_V3, V7, LEGACY)
uint32_t policy_version; // Version for TLS cache invalidation
} SmallPolicyV2;
// API
const SmallPolicyV2* small_policy_v2_snapshot(void);
void small_policy_v2_init_from_env(SmallPolicyV2 *policy);
void small_policy_v2_update_from_learner(
const SmallLearnerStatsV2 *stats,
SmallPolicyV2 *policy_out
);
```
**Rationale**: Extends Policy Box to handle expanded Learner inputs
### Task 1.5: Benchmark Suite Extension
**File**: `core/bench/bench_allocators.c`
```c
// Add test cases for Phase v11a
//
// BENCH_C5_C6_C7_MIXED:
// - Min size: 200B (C5)
// - Max size: 1000B (C7)
// - Mixed ratio: 30% C5, 40% C6, 30% C7
// - Expected perf: 42-48M ops/s (with MID_v3)
//
// BENCH_C7_HEAVY:
// - Min size: 800B
// - Max size: 1000B
// - Expected perf: 35-40M ops/s (vs ULTRA baseline)
//
// BENCH_LEARNER_ROUTE_SWITCH:
// - Start with C5-heavy (80% C5)
// - Expect route[5] = V7 initially
// - Then shift to C6-heavy (80% C6)
// - Expect route[5] switch to MID_V3
```
## 3. Phase v11a-2: Core Implementation
### Task 2.1: SmallSegment_MID_v3 Creation
**File**: `core/smallobject_segment_mid_v3.c`
```c
SmallSegment_MID_v3* small_segment_mid_v3_create(void) {
// Allocate 2MiB segment
// Initialize 32 x 64KiB pages
// Set up per-class free stacks
// Register in RegionIdBox
}
```
**Complexity**: Medium
- Memory layout: 2MiB = 32 pages of 64KiB each
- Metadata: SmallPageMeta per page
- Region registration: via RegionIdBox_v7 API (existing)
### Task 2.2: Fast Alloc Path for C5/C6/C7
**File**: `core/smallobject_mid_v3.c`
Modify existing C5/C6 alloc to support C7:
```c
// Current (v3):
// - TLS fast path: C5/C6 from tls_mid_ctx.page
// - Refill: get page from free stack or allocate
// v11a:
// - TLS fast path: C5/C6/C7 from tls_mid_ctx.page[class_idx]
// - Refill: per-class free stack
// - Retire: record stats with class_idx
```
**Changes**:
- [ ] Extend TLS context to support C7
- [ ] Update refill logic for multi-class
- [ ] Add C7 routing in malloc_tiny_fast.h
### Task 2.3: Stats Recording
**File**: `core/smallobject_mid_v3.c`
```c
void small_cold_mid_v3_retire_page(
SmallSegment_MID_v3 *seg,
uint32_t class_idx,
void *page
) {
SmallPageMeta *meta = page_to_meta(page);
// Record stats
uint32_t free_hit_ratio_bps = calc_free_hit_ratio(meta);
SmallPageStatsMID_v3 stat = {
.class_idx = class_idx,
.total_allocations = meta->alloc_count,
.total_frees = meta->free_count,
.page_alloc_count = meta->capacity,
.free_hit_ratio_bps = free_hit_ratio_bps
};
// Publish to stats system
small_stats_mid_v3_publish(&stat);
// Feed to Learner
small_learner_v2_record_retire(class_idx, free_hit_ratio_bps);
// Free page (return to free stack or OS)
...
}
```
**Key Detail**: Must record `class_idx` for Learner aggregation
### Task 2.4: Learner v2 Aggregation
**File**: `core/smallobject_learner_v2.c`
```c
static SmallLearnerStatsV2 g_learner_v2_stats;
void small_learner_v2_record_retire(uint32_t class_idx,
uint32_t free_hit_ratio_bps) {
if (class_idx >= 8) return;
g_learner_v2_stats.allocs[class_idx]++;
// Exponential smoothing (new-sample weight 0.1) in integer arithmetic;
// free_hit_ratio_bps / 100 converts basis points to percent
g_learner_v2_stats.retire_ratio_pct[class_idx] =
(g_learner_v2_stats.retire_ratio_pct[class_idx] * 9 +
free_hit_ratio_bps / 100) / 10;
// Periodic evaluation
static uint64_t total_retires = 0;
if (++total_retires % LEARNER_EVAL_INTERVAL == 0) {
small_learner_v2_evaluate();
}
}
void small_learner_v2_evaluate(void) {
// Update global version to invalidate TLS policy cache
__sync_fetch_and_add(&g_policy_v2_version, 1);
g_learner_v2_stats.eval_count++;
}
```
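The exponential smoothing in `small_learner_v2_record_retire` writes its result back into an integer percentage field, so fractional progress is truncated on every update and the average can settle below the true value. One possible alternative, sketched here with hypothetical names, keeps the average in Q8 fixed point (basis points times 256) and rounds only when reading.

```c
#include <assert.h>
#include <stdint.h>

/* One smoothing step with new-sample weight 1/10.
 * The state is the average in basis points, scaled by 256 (Q8). */
static uint32_t demo_ema_update_q8(uint32_t ema_q8, uint32_t sample_bps) {
    int64_t target = (int64_t)sample_bps << 8;
    int64_t delta = (target - (int64_t)ema_q8) / 10;
    return (uint32_t)((int64_t)ema_q8 + delta);
}

/* Read the smoothed value back in basis points, rounded to nearest. */
static uint32_t demo_ema_read_bps(uint32_t ema_q8) {
    return (ema_q8 + 128u) >> 8;
}
```

The extra 8 fractional bits cost nothing at runtime and let a steady input converge to its own value instead of stalling a few percent short.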
### Task 2.5: Policy v2 Update
**File**: `core/smallobject_policy_v2.c`
```c
void small_policy_v2_update_from_learner(
const SmallLearnerStatsV2 *stats,
SmallPolicyV2 *policy_out
) {
if (!stats || !policy_out) return;
// C5 decision (Phase v11a: same logic as v7)
uint64_t total_allocs = 0;
for (int i = 0; i < 8; i++) {
total_allocs += stats->allocs[i];
}
if (total_allocs > 0) {
uint64_t c5_ratio_pct = (stats->allocs[5] * 100) / total_allocs;
if (c5_ratio_pct >= 30) {
policy_out->route_kind[5] = SMALL_ROUTE_V7;
} else {
policy_out->route_kind[5] = SMALL_ROUTE_MID_V3;
}
}
// Future (Phase v11b): Multi-dimensional decisions
// if (retire_ratio[5] < 50% && free_hit < 7000bps) → LEGACY
// etc.
}
```
## 4. Phase v11a-3: Integration & Testing
### Task 3.1: C7 Routing in malloc_tiny_fast.h
**File**: `core/front/malloc_tiny_fast.h`
Modify alloc switch statement:
```c
// Current (v10):
// case TINY_ROUTE_SMALL_HEAP_V7: return small_heap_alloc_v7(...);
// case TINY_ROUTE_SMALL_HEAP_MID_V3: return small_heap_alloc_mid_v3(...);
// v11a:
// Add support for C7 routing to MID_v3
switch (policy->route_kind[class_idx]) {
case SMALL_ROUTE_ULTRA:
return ULTRA_alloc(...);
case SMALL_ROUTE_MID_V3:
return small_heap_alloc_mid_v3(class_idx, size); // ← v11a: supports C7
case SMALL_ROUTE_V7:
return small_heap_alloc_v7(class_idx, size);
case SMALL_ROUTE_LEGACY:
return legacy_alloc(...);
}
```
### Task 3.2: Free Path C7 Support
**File**: `core/front/malloc_tiny_fast.h`
```c
// v11a: Allow C7 free to route to MID_v3
if (SMALL_MID_V3_CLASS_SUPPORTED(class_idx)) {
if (policy->route_kind[class_idx] == SMALL_ROUTE_MID_V3) {
small_heap_free_mid_v3(ptr, class_idx);
return;
}
}
```
### Task 3.3: Integration Tests
**File**: `core/test/test_mid_v3_c7.c` [NEW]
```c
void test_mid_v3_c7_alloc_free(void) {
// Test C7 allocation and free through MID_v3
// Expected: successful alloc/free without segfault
// Verify: Policy routing is correct
// Verify: Learner stats are recorded
}
void test_learner_v2_route_switch(void) {
// Allocate C5-heavy workload
// Verify: route[5] = V7
// Switch to C6-heavy workload
// Verify: route[5] switches to MID_V3
// Check stderr: "[LEARNER_V2] C5 route switch: V7 → MID_V3"
}
void test_mid_v3_perf_c5_c6_c7_mixed(void) {
// Performance baseline for C5/C6/C7 mixed
// Expected: 42-48M ops/s
// Verify: no regression vs v7 research preset
}
```
### Task 3.4: Regression Testing
**Ensure**:
- [ ] v7 research preset (C5/C6 + Learner) still works
- [ ] Mixed profile (16-1024B, v7 OFF) unchanged
- [ ] ULTRA (C4-C7) unchanged
- [ ] Legacy fallback unchanged
## 5. Build & Compilation
### Makefile Changes
```makefile
# Add new object files to HAKMEM_OBJS
HAKMEM_OBJS += \
core/smallobject_segment_mid_v3.o \
core/smallobject_learner_v2.o \
core/smallobject_policy_v2.o
# Add new box headers to HEADERS
HEADERS += \
core/box/smallobject_segment_mid_v3_box.h \
core/box/smallobject_stats_mid_v3_box.h \
core/box/smallobject_learner_v2_box.h \
core/box/smallobject_policy_v2_box.h
```
## 6. Testing Commands
### Benchmark Suite (after Phase v11a-2)
```bash
# C5/C6/C7 mixed (expected MID_v3 preferred)
HAKMEM_SMALL_HEAP_V7_ENABLED=0 \
HAKMEM_MID_V3_ENABLED=1 \
HAKMEM_MID_V3_CLASSES=0x70 \
./bench_allocators bench_c5_c6_c7_mixed 300000
# C7 heavy (expected MID_v3 performance)
HAKMEM_SMALL_HEAP_V7_ENABLED=0 \
HAKMEM_MID_V3_ENABLED=1 \
./bench_allocators bench_c7_heavy 200000
# Learner route switch verification
HAKMEM_SMALL_HEAP_V7_ENABLED=1 \
HAKMEM_SMALL_HEAP_V7_CLASSES=0x60 \
HAKMEM_MID_V3_ENABLED=1 \
./bench_allocators bench_learner_route_switch 500000
```
### Expected Output
```
[POLICY_V2_INIT] Route assignments:
C0: LEGACY
C1: LEGACY
C2: LEGACY
C3: LEGACY
C4: ULTRA
C5: MID_V3
C6: MID_V3
C7: MID_V3
[LEARNER_V2] eval_count=1, C5_ratio=28%, retire_ratio[5]=92%
C5/C6/C7 mixed (300K iter): 44.2M ops/s ✓ (+4% vs baseline)
```
## 7. Dependency Graph
```
smallobject_segment_mid_v3_box.h
smallobject_segment_mid_v3.c
↓ calls
smallobject_stats_mid_v3.c
↓ publishes to
smallobject_learner_v2.c
↓ feeds to
smallobject_policy_v2.c
↓ updates
malloc_tiny_fast.h (routing)
```
Recommended implementation order:
1. smallobject_segment_mid_v3_box.h / smallobject_segment_mid_v3.c (foundation)
2. smallobject_stats_mid_v3_box.h (simple type def)
3. smallobject_mid_v3.c changes (core alloc/free)
4. smallobject_learner_v2_box.h / smallobject_learner_v2.c (stats aggregation)
5. smallobject_policy_v2_box.h / smallobject_policy_v2.c (learner integration)
6. malloc_tiny_fast.h (routing)
7. Tests & benchmarks
---
**Document Date**: 2025-12-12
**Phase**: v11a-1 (Design & Infrastructure)
**Status**: Ready for Task 1.1-1.5 implementation
**Next Review**: After Phase v11a-1 completion