# Phase v11a-3: MID v3.5 Implementation Summary
**Date:** 2025-12-12
**Status:** Build Complete - Ready for A/B Benchmarking
**Author:** Claude Opus 4.5
## Overview
Phase v11a-3 successfully integrated MID v3.5 into the active code path, making it available for routing C5/C6/C7 allocations. This phase activates the Segment/ColdIface/Stats/Learner v2 infrastructure implemented in Phase v11a-2.
## Implementation Tasks Completed
### Task 1: Policy Box Updates (L3)
**Files Modified:**
- `/mnt/workdisk/public_share/hakmem/core/box/smallobject_policy_v7_box.h`
- `/mnt/workdisk/public_share/hakmem/core/smallobject_policy_v7.c`
**Changes:**
1. Added `SMALL_ROUTE_MID_V35` to `SmallRouteKind` enum
2. Implemented ENV gate functions:
   - `mid_v35_enabled()` - checks `HAKMEM_MID_V35_ENABLED`
   - `mid_v35_class_mask()` - reads `HAKMEM_MID_V35_CLASSES` (default: 0x60 for C5+C6)
3. Updated policy init with priority: ULTRA > MID_V35 > V7 > MID_V3 > LEGACY
4. Added MID_V35 case to `small_route_kind_name()`
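
The two ENV gates can be illustrated with a minimal sketch. All `sketch_*` names are hypothetical and are not taken from the hakmem source. The sketch assumes that bit n of the mask selects class Cn (consistent with 0x20 = C5 and 0x40 = C6 in this document) and that an unset `HAKMEM_MID_V35_ENABLED` means disabled. The real helpers likely cache their result, which this sketch omits.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical re-creation of the ENV gates; not the hakmem implementation. */

/* Treats NULL (unset) and "0" as disabled, any other non-zero integer as enabled. */
static int sketch_parse_enabled(const char *s) {
    return (s != NULL && atoi(s) != 0) ? 1 : 0;
}

/* Parses a class bitmask where bit n selects class Cn.
 * Falls back to 0x60 (C5+C6) when the value is unset or empty.
 * Unparseable text yields 0, which selects no classes. */
static uint32_t sketch_parse_class_mask(const char *s) {
    if (s == NULL || *s == '\0') return 0x60u;
    /* Base 0 lets strtoul accept hex ("0x60") as well as decimal ("96"). */
    return (uint32_t)strtoul(s, NULL, 0) & 0xFFu;
}

/* Thin wrappers that read the environment variables named in this document. */
static int sketch_mid_v35_enabled(void) {
    return sketch_parse_enabled(getenv("HAKMEM_MID_V35_ENABLED"));
}

static uint32_t sketch_mid_v35_class_mask(void) {
    return sketch_parse_class_mask(getenv("HAKMEM_MID_V35_CLASSES"));
}

/* Returns 1 when class_idx (0..7) is selected by mask, otherwise 0. */
static int sketch_class_in_mask(uint32_t mask, int class_idx) {
    return class_idx >= 0 && class_idx <= 7 && ((mask >> class_idx) & 1u) != 0;
}
```

Keeping the string parsing separate from the `getenv` call makes the gate logic testable without touching the process environment.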
### Task 2: MID v3.5 HotBox Implementation (L1)
**Files Created:**
- `/mnt/workdisk/public_share/hakmem/core/box/smallobject_mid_v35_box.h` - Public API
- `/mnt/workdisk/public_share/hakmem/core/smallobject_mid_v35.c` - Implementation
**Implementation:**
- TLS-cached page allocation (per-class fast path)
- Slot sizes: C5=384B, C6=512B, C7=1024B
- Page size: 64 KiB (170/128/64 slots for C5/C6/C7; 170 × 384 B leaves 256 bytes of a C5 page unused)
- Alloc: Fast path (TLS cache hit) + Slow path (refill via ColdIface)
- Free: Simplified counting (no freelist yet - deferred to v11b)
- Header writing: Integrates with `tiny_region_id_write_header`
**Key Design Decisions:**
- Reused `SmallPageMeta` typedef (matches segment/cold_iface structure)
- Simplified free path (cross-page free deferred to v11b RegionIdBox integration)
- Header integration for compatibility with existing Tiny infrastructure
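
The per-page behavior described above (bump allocation on the fast path, counter-only free with no slot reuse) can be sketched as follows. `SketchPage` and the `sketch_*` functions are hypothetical; the real `SmallPageMeta` layout is not shown in this summary. TLS caching and header writing are left out, and the sketch assumes the whole 64 KiB page is available for slots.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE (64u * 1024u)  /* 64 KiB page, per this summary */

/* Hypothetical page descriptor; not the real SmallPageMeta. */
typedef struct {
    uint8_t *base;       /* start of the page */
    uint32_t slot_size;  /* 384, 512, or 1024 bytes */
    uint32_t capacity;   /* number of slots in the page */
    uint32_t next;       /* index of the next never-used slot */
    uint32_t used;       /* live slot count, decremented on free */
} SketchPage;

/* Integer division drops any partial slot at the end of the page. */
static uint32_t sketch_slots_per_page(uint32_t slot_size) {
    return SKETCH_PAGE_SIZE / slot_size;
}

static void sketch_page_init(SketchPage *p, void *mem, uint32_t slot_size) {
    p->base = (uint8_t *)mem;
    p->slot_size = slot_size;
    p->capacity = sketch_slots_per_page(slot_size);
    p->next = 0;
    p->used = 0;
}

/* Fast path: hand out the next unused slot.
 * Returns NULL when the page is exhausted; a caller would then refill. */
static void *sketch_page_alloc(SketchPage *p) {
    if (p->next >= p->capacity) return NULL;
    void *slot = p->base + (size_t)p->next * p->slot_size;
    p->next++;
    p->used++;
    return slot;
}

/* Counter-only free: the slot is not put back on any freelist.
 * Returns 1 if ptr lies inside this page, 0 otherwise (cross-page free). */
static int sketch_page_free(SketchPage *p, const void *ptr) {
    uintptr_t b = (uintptr_t)p->base;
    uintptr_t q = (uintptr_t)ptr;
    if (q < b || q - b >= SKETCH_PAGE_SIZE) return 0;
    if (p->used > 0) p->used--;
    return 1;
}
```

In this sketch a freed slot is never handed out again, which mirrors the "no freelist yet" limitation; a page can only be recycled as a whole once `used` drops back to zero.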
### Task 3: Front Gate Integration (L0/L1 Boundary)
**Files Modified:**
- `/mnt/workdisk/public_share/hakmem/core/front/malloc_tiny_fast.h`
**Changes:**
1. Added MID v3.5 box header include
2. Alloc path: Check policy for MID_V35, call `small_mid_v35_alloc()` before V7
3. Free path: Check policy for MID_V35, call `small_mid_v35_free()` after ULTRA checks
**Priority Order (Alloc/Free):**
1. ULTRA (C4-C7)
2. MID v3.5 (Policy-driven)
3. V7 (Policy-driven)
4. Legacy routes
### Task 4: Build System Updates
**Files Modified:**
- `/mnt/workdisk/public_share/hakmem/Makefile`
**Changes:**
- Added `core/smallobject_mid_v35.o` to `OBJS_BASE`
- Added `core/smallobject_mid_v35.o` to `BENCH_HAKMEM_OBJS_BASE`
- Added `core/smallobject_mid_v35.o` to `TINY_BENCH_OBJS_BASE`
**Build Results:**
- Clean build successful
- Benchmarks compiled: `bench_random_mixed_hakmem`, `bench_mid_large_mt_hakmem`
- Minor warnings (unused parameter `size`) - non-critical
## ENV Configuration
### MID v3.5 Activation
```bash
# Enable MID v3.5
export HAKMEM_MID_V35_ENABLED=1
# Configure classes (default: 0x60 = C5+C6)
export HAKMEM_MID_V35_CLASSES=0x60 # C5 + C6
# export HAKMEM_MID_V35_CLASSES=0x20 # C5 only
# export HAKMEM_MID_V35_CLASSES=0x40 # C6 only
```
### Policy Debug Output
Policy initialization prints route assignments on first call:
```
[POLICY_V7_INIT] Route assignments:
C0: LEGACY
C1: LEGACY
C2: LEGACY
C3: LEGACY
C4: ULTRA
C5: MID_V35
C6: MID_V35
C7: ULTRA
```
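
A table like the one above could come from a priority resolution along these lines. This is a sketch under stated assumptions, not the policy code: the `Sketch*` names are hypothetical, each route is represented as a per-class bitmask (bit n = class Cn), and the ULTRA mask covering only C4 and C7 is inferred from the sample output rather than confirmed by the source.

```c
#include <stdint.h>

/* Hypothetical route kinds mirroring the names in the debug output. */
typedef enum {
    SKETCH_ROUTE_LEGACY = 0,
    SKETCH_ROUTE_MID_V3,
    SKETCH_ROUTE_V7,
    SKETCH_ROUTE_MID_V35,
    SKETCH_ROUTE_ULTRA
} SketchRoute;

/* Resolves one class against per-route masks.
 * Checks run in priority order: ULTRA > MID_V35 > V7 > MID_V3 > LEGACY. */
static SketchRoute sketch_resolve_route(int class_idx,
                                        uint32_t ultra_mask,
                                        uint32_t mid_v35_mask,
                                        uint32_t v7_mask,
                                        uint32_t mid_v3_mask) {
    if (class_idx < 0 || class_idx > 7) return SKETCH_ROUTE_LEGACY;
    uint32_t bit = 1u << class_idx;
    if (ultra_mask & bit)   return SKETCH_ROUTE_ULTRA;
    if (mid_v35_mask & bit) return SKETCH_ROUTE_MID_V35;
    if (v7_mask & bit)      return SKETCH_ROUTE_V7;
    if (mid_v3_mask & bit)  return SKETCH_ROUTE_MID_V3;
    return SKETCH_ROUTE_LEGACY;
}

/* Printable name for a route, in the style of the debug output. */
static const char *sketch_route_name(SketchRoute r) {
    switch (r) {
    case SKETCH_ROUTE_ULTRA:   return "ULTRA";
    case SKETCH_ROUTE_MID_V35: return "MID_V35";
    case SKETCH_ROUTE_V7:      return "V7";
    case SKETCH_ROUTE_MID_V3:  return "MID_V3";
    default:                   return "LEGACY";
    }
}
```

With an ULTRA mask of 0x90 (C4 and C7) and a MID v3.5 mask of 0x60, the resolution reproduces the assignments listed above. If the ULTRA mask also covered C5 or C6, ULTRA would win for those classes because it is checked first.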
## Next Steps (Task 5: A/B Benchmarks)
### Benchmark 1: C6-Heavy (MID Specialization Check)
```bash
# Baseline: MID v3.5 OFF
HAKMEM_PROFILE=C6_HEAVY_LEGACY_POOLV1 \
HAKMEM_MID_V35_ENABLED=0 \
./bench_mid_large_mt_hakmem 1 1000000 400 1
# Test: MID v3.5 ON (C6 only)
HAKMEM_PROFILE=C6_HEAVY_LEGACY_POOLV1 \
HAKMEM_MID_V35_ENABLED=1 \
HAKMEM_MID_V35_CLASSES=0x40 \
./bench_mid_large_mt_hakmem 1 1000000 400 1
```
**Expected:** Performance within ±5% of MID v3
### Benchmark 2: C5+C6-Only (257-768B Range)
```bash
# Baseline: MID v3.5 OFF
HAKMEM_BENCH_MIN_SIZE=257 \
HAKMEM_BENCH_MAX_SIZE=768 \
HAKMEM_MID_V35_ENABLED=0 \
./bench_random_mixed_hakmem 1000000 400 1
# Test: MID v3.5 ON (C5+C6)
HAKMEM_BENCH_MIN_SIZE=257 \
HAKMEM_BENCH_MAX_SIZE=768 \
HAKMEM_MID_V35_ENABLED=1 \
HAKMEM_MID_V35_CLASSES=0x60 \
./bench_random_mixed_hakmem 1000000 400 1
```
**Expected:** +2-4% improvement (matching v7 gains)
### Benchmark 3: Mixed 16-1024B (Reference)
```bash
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE \
HAKMEM_MID_V35_ENABLED=1 \
HAKMEM_MID_V35_CLASSES=0x60 \
./bench_random_mixed_hakmem 1000000 400 1
```
**Expected:** Baseline ±3% (no regression)
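
The expectations for the three benchmarks are all stated as a percentage band around a baseline. A small helper for checking a measured pair against such a band might look like this; the function names are hypothetical, and the throughput values are whatever the benchmark binaries report (no results are implied here).

```c
/* Relative change of the test run versus the baseline run, in percent.
 * Positive means the test run was faster (higher throughput).
 * The caller must pass a non-zero baseline. */
static double sketch_delta_pct(double baseline_ops, double test_ops) {
    return (test_ops - baseline_ops) / baseline_ops * 100.0;
}

/* Returns 1 when delta_pct lies inside [lo_pct, hi_pct], otherwise 0. */
static int sketch_within_band(double delta_pct, double lo_pct, double hi_pct) {
    return delta_pct >= lo_pct && delta_pct <= hi_pct;
}
```

For the benchmarks above, the bands would be [-5, +5] for Benchmark 1, [+2, +4] for Benchmark 2, and [-3, +3] for Benchmark 3.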
## Known Limitations (Deferred to v11b)
1. **Cross-Page Free:** Current implementation only handles frees to the current TLS page
   - Need RegionIdBox lookup for cross-page free
   - Planned for v11b with proper segment ownership check
2. **Freelist Reuse:** No freelist implementation yet
   - Slots are marked free via counter only
   - Full freelist support planned for v11b
3. **Learner Route Switching:** Learner v2 is in observation mode only
   - Dynamic route switching deferred to v11b
   - Current implementation: static routing based on ENV
## Code Quality
- **Modularity:** Clean L1/L2/L3 separation maintained
- **Box Boundaries:** Proper isolation between HotBox/ColdIface/Stats/Learner
- **Header Compatibility:** Integrates with existing Tiny region_id infrastructure
- **Build Hygiene:** All targets compile; the only warnings are for unused parameters
## Files Summary
### New Files (2)
1. `core/box/smallobject_mid_v35_box.h` - Public API (48 lines)
2. `core/smallobject_mid_v35.c` - Implementation (165 lines)
### Modified Files (4)
1. `core/box/smallobject_policy_v7_box.h` - Enum update
2. `core/smallobject_policy_v7.c` - ENV helpers + priority logic
3. `core/front/malloc_tiny_fast.h` - Route integration
4. `Makefile` - Object file lists (3 locations)
### Total Code Addition
- **New:** ~213 lines
- **Modified:** ~60 lines
- **Total Impact:** ~273 lines
## Architecture Notes
### Layer Interaction (L0→L1→L2→L3)
```
L0 (Front) malloc_tiny_fast.h
↓ (Policy check: MID_V35?)
L1 (HotBox) smallobject_mid_v35.c
↓ (Refill needed?)
L2 (Cold) smallobject_cold_iface_mid_v3.c
↓ (Get page from segment)
L2 (Segment) smallobject_segment_mid_v3.c
↓ (Stats recording)
L2 (Stats) smallobject_stats_mid_v3.c
↓ (Learner evaluation - observation only)
L2 (Learner) smallobject_learner_v2.c
↓ (Policy update - dormant)
L3 (Policy) smallobject_policy_v7.c
```
### Memory Layout
- **Segment:** 2 MiB contiguous region
- **Pages:** 64 KiB per page
- **Slots:** 384B (C5), 512B (C6), 1024B (C7)
- **TLS Cache:** One current page per class per thread
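
The geometry above implies 32 pages per segment (2 MiB / 64 KiB). The following sketch works through that arithmetic and shows how a page index and slot index could be derived from an address. It assumes the entire segment is divided into pages with no header area and that pages start at the segment base; neither assumption is confirmed by this summary, and the `sketch_*` names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_SEGMENT_SIZE (2u * 1024u * 1024u)  /* 2 MiB */
#define SKETCH_PAGE_SIZE    (64u * 1024u)         /* 64 KiB */
#define SKETCH_PAGES_PER_SEGMENT (SKETCH_SEGMENT_SIZE / SKETCH_PAGE_SIZE)

/* Compile-time check of the arithmetic: 2 MiB / 64 KiB = 32 pages. */
_Static_assert(SKETCH_PAGES_PER_SEGMENT == 32, "expected 32 pages per segment");

/* Index of the page containing ptr, or -1 if ptr is outside the segment. */
static int sketch_page_index(const void *segment_base, const void *ptr) {
    uintptr_t base = (uintptr_t)segment_base;
    uintptr_t p = (uintptr_t)ptr;
    if (p < base || p - base >= SKETCH_SEGMENT_SIZE) return -1;
    return (int)((p - base) / SKETCH_PAGE_SIZE);
}

/* Index of the slot containing ptr within its page.
 * The caller must first confirm ptr is inside the segment. */
static uint32_t sketch_slot_index(const void *segment_base, const void *ptr,
                                  uint32_t slot_size) {
    uintptr_t offset_in_page =
        ((uintptr_t)ptr - (uintptr_t)segment_base) % SKETCH_PAGE_SIZE;
    return (uint32_t)(offset_in_page / slot_size);
}
```

A lookup of this kind is one way an ownership check for cross-page frees could locate the page for an arbitrary pointer, although the planned RegionIdBox mechanism may work differently.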
## Conclusion
Phase v11a-3 successfully activated MID v3.5 infrastructure. The implementation is ready for A/B benchmarking to validate performance against the original MID v3 and establish a baseline for future optimizations.
**Status:** ✅ Build Complete
**Next:** Task 5 (A/B Benchmarks) - Performance validation
**Future:** Task 6 (Documentation) - Results recording