fe70e3baf5
Phase MID-V35-HOTPATH-OPT-1 complete: +7.3% on C6-heavy
Step 0: Geometry SSOT
- New: core/box/smallobject_mid_v35_geom_box.h (L1/L2 consistency)
- Fix: C6 slots/page 102→128 in L2 (smallobject_cold_iface_mid_v3.c)
- Applied: smallobject_mid_v35.c, smallobject_segment_mid_v3.c
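A minimal sketch of what a geometry single source of truth could look like, so that slots/page is derived in one place instead of being hard-coded per layer. The macro and function names are hypothetical (they are not taken from smallobject_mid_v35_geom_box.h), the 64KiB page size comes from the v11a-2 commit, and the slot sizes (384/512/1024 B) are an assumption inferred from the slot counts 170/128/64:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical SSOT geometry; names are illustrative only. */
#define MID_GEOM_PAGE_SIZE ((size_t)64 * 1024) /* 64KiB pages */

/* Assumed slot size per class (inferred from the slot counts, not confirmed). */
static inline size_t mid_geom_slot_size(int class_idx) {
    switch (class_idx) {
    case 5: return 384;
    case 6: return 512;
    case 7: return 1024;
    default: return 0; /* not a MID class in this sketch */
    }
}

/* Slots per page are always derived from page size and slot size. */
static inline size_t mid_geom_slots_per_page(int class_idx) {
    size_t slot = mid_geom_slot_size(class_idx);
    return slot ? MID_GEOM_PAGE_SIZE / slot : 0;
}

/* Compile-time guard against the L1/L2 mismatch (102 vs 128 for C6). */
static_assert(MID_GEOM_PAGE_SIZE / 512 == 128, "C6 must have 128 slots/page");
```

If both L1 and L2 call the same helper, a stale constant such as 102 cannot survive in one layer only.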
Steps 1-3: ENV gates for hotpath optimizations
- New: core/box/mid_v35_hotpath_env_box.h
* HAKMEM_MID_V35_HEADER_PREFILL (default 0)
* HAKMEM_MID_V35_HOT_COUNTS (default 1)
* HAKMEM_MID_V35_C6_FASTPATH (default 0)
- Implementation: smallobject_mid_v35.c
* Header prefill at refill boundary (Step 1)
* Gated alloc_count++ in hot path (Step 2)
* C6 specialized fast path with constant slot_size (Step 3)
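The ENV gates above could be read roughly as follows. This is a sketch under assumptions: only the variable names and defaults come from this commit, while the helper names and the caching scheme are hypothetical and not the contents of mid_v35_hotpath_env_box.h:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: read a 0/1 gate from the environment.
 * Unset or empty falls back to default_value; any non-zero integer means ON. */
static int env_gate_read(const char *name, int default_value) {
    assert(name != NULL);
    const char *s = getenv(name);
    if (s == NULL || *s == '\0') return default_value;
    return atoi(s) != 0;
}

/* Each gate is read once and cached: -1 = not read yet, 0 = off, 1 = on.
 * The cache is not synchronized in this sketch; a real allocator would
 * resolve the gates during single-threaded init or use an atomic. */
static int mid_v35_header_prefill_enabled(void) {
    static int cached = -1;
    if (cached < 0) cached = env_gate_read("HAKMEM_MID_V35_HEADER_PREFILL", 0);
    return cached;
}

static int mid_v35_hot_counts_enabled(void) {
    static int cached = -1;
    if (cached < 0) cached = env_gate_read("HAKMEM_MID_V35_HOT_COUNTS", 1);
    return cached;
}

static int mid_v35_c6_fastpath_enabled(void) {
    static int cached = -1;
    if (cached < 0) cached = env_gate_read("HAKMEM_MID_V35_C6_FASTPATH", 0);
    return cached;
}
```

Caching keeps the cost of a gate check in the hot path to one load and one branch after the first call.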
A/B Results:
C6-heavy (257–768B): 8.75M→9.39M ops/s (+7.3%, 5-run mean) ✅
Mixed (16–1024B): 9.98M→9.96M ops/s (-0.2%, within noise) ✓
Decision: FROZEN - defaults OFF; ON recommended for C6-heavy, Mixed kept as-is
Documentation: ENV_PROFILE_PRESETS.md updated
🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-12 19:19:25 +09:00
d5ffb3eeb2
Fix MID v3.5 activation bugs: policy loop + malloc recursion
Two critical bugs fixed:
1. Policy snapshot infinite loop (smallobject_policy_v7.c):
- Condition `g_policy_v7_version == 0` caused reinit on every call
- Fixed via CAS to set global version to 1 after first init
2. Malloc recursion (smallobject_segment_mid_v3.c):
- Internal malloc() routed back through hakmem → MID v3.5 → segment
creation → malloc → infinite recursion / stack overflow
- Fixed by using mmap() directly for internal allocations:
- Segment struct, pages array, page metadata block
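Both fixes can be sketched in a few lines. This is an illustration under assumptions, not the code in smallobject_policy_v7.c or smallobject_segment_mid_v3.c: all identifiers below are hypothetical, and the snapshot construction itself is omitted.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std=c11 on glibc */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Fix 1: version 0 means "never initialized". After the first init the
 * version is moved 0 -> 1 by CAS, so later calls stop re-initializing. */
static _Atomic uint32_t g_policy_version_sketch = 0;

/* Returns 1 if this call published the initial version, 0 otherwise. */
static int policy_ensure_init(void) {
    if (atomic_load(&g_policy_version_sketch) != 0) return 0; /* already done */
    /* ... build the initial policy snapshot here (omitted) ... */
    uint32_t expected = 0;
    return atomic_compare_exchange_strong(&g_policy_version_sketch,
                                          &expected, 1) ? 1 : 0;
}

/* Fix 2: internal metadata comes straight from mmap(), so it can never
 * re-enter the allocator through malloc(). Anonymous mappings are zero-filled. */
static void *internal_alloc(size_t bytes) {
    assert(bytes > 0);
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

static void internal_free(void *p, size_t bytes) {
    if (p != NULL) munmap(p, bytes);
}
```

Note that in this sketch two threads racing on the first call may both build a snapshot; only one CAS succeeds, so the version is still published exactly once.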
Performance results (bench_random_mixed 257-512B):
- Baseline (LEGACY): 34.0M ops/s
- MID_V35 ON (C6): 35.8M ops/s
- Improvement: +5.1% ✓
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 07:12:24 +09:00
0dba67ba9d
Phase v11a-2: Core MID v3.5 implementation - segment, cold iface, stats, learner
Implement 5-layer infrastructure for multi-class MID v3.5 (C5-C7, 257B-1KiB):
1. SegmentBox_mid_v3 (L2 Physical)
- core/smallobject_segment_mid_v3.c (9.5 KB)
- 2MiB segments, 64KiB pages (32 per segment)
- Per-class free page stacks (LIFO)
- RegionIdBox registration
- Slots: C5→170, C6→102, C7→64
2. ColdIface_mid_v3 (L2→L1)
- core/box/smallobject_cold_iface_mid_v3_box.h (NEW)
- core/smallobject_cold_iface_mid_v3.c (3.5 KB)
- refill: get page from free stack or new segment
- retire: calculate free_hit_ratio, publish stats, return to stack
- Clean separation: TLS cache for hot path, ColdIface for cold path
3. StatsBox_mid_v3 (L2→L3)
- core/smallobject_stats_mid_v3.c (7.2 KB)
- Circular buffer history (1000 events)
- Per-page metrics: class_idx, allocs, frees, free_hit_ratio_bps
- Periodic aggregation (every 100 retires)
- Learner notification callback
4. Learner v2 (L3)
- core/smallobject_learner_v2.c (11 KB)
- Multi-class aggregation: allocs[8], retire_count[8], avg_free_hit_bps[8]
- Exponential smoothing (90% history + 10% new)
- Per-class efficiency tracking
- Stats snapshot API
- Route decision disabled for v11a-2 (v11b feature)
5. Build Integration
- Modified Makefile: added 4 new .o files (segment, cold_iface, stats, learner)
- Updated box header prototypes
- Clean compilation, all dependencies resolved
Architecture Decision Implementation:
- v7 remains frozen (C5/C6 research preset)
- MID v3.5 becomes the unified main path for 257B-1KiB
- Multi-class isolation: per-class free stacks
- Dormant infrastructure: linked but not active (zero overhead)
Performance:
- Build: clean compilation
- Sanity benchmark: 27.3M ops/s (no regression vs v10)
- Memory: ~30MB RSS (baseline maintained)
Design Compliance:
✅ Layer separation: L2 (segment) → L2 (cold iface) → L3 (stats) → L3 (learner)
✅ Hot path clean: alloc/free never touch stats/learner
✅ Backward compatible: existing MID v3 routes unchanged
✅ Transparent: v11a-2 is dormant (no behavior change)
Next Phase (v11a-3):
- Activate C5/C6/C7 routing through MID v3.5
- Connect TLS cache to segment refill
- Verify performance under load
- Then Phase v11a-4: dynamic C5 ratio routing
🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-12 06:37:06 +09:00