Phase 87: Inline Slots Overflow Observation Results
Objective
Measure inline slots overflow frequency (C3/C4/C5/C6) to determine if Phase 88 (batch drain optimization) is worth implementing.
Observation Setup
- Workload: Mixed SSOT (WS=400, 16-1024B allocation sizes)
- Operations: 20,000,000 random alloc/free operations
- Runs: single-run observation (OBSERVE binary)
- Configuration:
  - Route assignments: LEGACY for all C0-C7
  - Inline slots: C4/C5/C6 enabled (Phase 75/76), fixed mode ON (Phase 78), switch dispatch ON (Phase 80)
Critical Fix (measurement correctness)
An earlier observation run reported PUSH TOTAL/POP TOTAL = 0 for all classes.
That was not valid evidence that inline slots were unused.
Root cause was telemetry compile gating:
- tiny_inline_slots_overflow_enabled() is a header-only hot-path check.
- The original implementation relied on a #define inside tiny_inline_slots_overflow_stats_box.c, which does not apply to other translation units.
- Fix: introduce HAKMEM_INLINE_SLOTS_OVERFLOW_STATS_COMPILED in core/hakmem_build_flags.h and make the enabled check depend on it.
- OBSERVE build now enables it via Makefile: bench_random_mixed_hakmem_observe adds -DHAKMEM_INLINE_SLOTS_OVERFLOW_STATS_COMPILED=1.
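The gating pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual hakmem source: only the flag name HAKMEM_INLINE_SLOTS_OVERFLOW_STATS_COMPILED comes from this report, while the sketch_* names and the counter are hypothetical. The point is that the flag lives in a shared header (or arrives via -D), so every translation unit that includes the header-only check sees the same value.

```c
/* Hypothetical stand-in for the shared build-flags header: the flag
 * defaults to 0 and can be overridden by the build system with
 * -DHAKMEM_INLINE_SLOTS_OVERFLOW_STATS_COMPILED=1. */
#ifndef HAKMEM_INLINE_SLOTS_OVERFLOW_STATS_COMPILED
#define HAKMEM_INLINE_SLOTS_OVERFLOW_STATS_COMPILED 0
#endif

/* Header-only check (hypothetical name). Because it depends only on the
 * shared flag, it evaluates identically in every translation unit, unlike
 * a #define placed inside a single .c file. */
static inline int sketch_overflow_stats_enabled(void) {
#if HAKMEM_INLINE_SLOTS_OVERFLOW_STATS_COMPILED
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical counter, bumped only when telemetry is compiled in. */
static unsigned long sketch_gated_push_total;

static inline void sketch_record_push(void) {
    if (sketch_overflow_stats_enabled()) {
        sketch_gated_push_total++;
    }
}
```

With the flag left at 0, the check is a compile-time constant and the counter never moves, which matches the symptom of the earlier run: totals of 0 regardless of how often the path executed.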
Verified Result: inline slots are being called (WS=400 SSOT)
Total Operation Counts (Verification)
PUSH TOTAL (Free Path Attempts):
C4: 687,564
C5: 1,373,605
C6: 2,750,862
TOTAL (C4-C6): 4,812,031
POP TOTAL (Alloc Path Attempts):
C4: 687,564
C5: 1,373,605
C6: 2,750,862
TOTAL (C4-C6): 4,812,031
This confirms:
- ✅ tiny_legacy_fallback_free_base_with_env() is being executed (LEGACY fallback path).
- ✅ C4/C5/C6 inline slots push/pop are active in the LEGACY fallback/hot alloc paths.
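To make the four counters concrete (PUSH TOTAL, PUSH FULL, POP TOTAL, POP EMPTY), here is a minimal sketch of a fixed-capacity ring that records them. It is an assumption-laden illustration only: the capacity, the FIFO ordering, the struct layout, and all sketch_* names are hypothetical and are not taken from the hakmem inline slots implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_CAP 8u /* hypothetical capacity (power of two) */

typedef struct {
    void    *slots[SKETCH_RING_CAP];
    uint32_t head;       /* index of the next element to pop  */
    uint32_t tail;       /* index of the next free slot to push */
    uint64_t push_total; /* every push attempt (free path)     */
    uint64_t push_full;  /* push attempts that found the ring full */
    uint64_t pop_total;  /* every pop attempt (alloc path)     */
    uint64_t pop_empty;  /* pop attempts that found the ring empty */
} sketch_ring_t;

/* Returns 1 if the pointer was cached, 0 on overflow (caller falls back). */
static int sketch_ring_push(sketch_ring_t *r, void *p) {
    r->push_total++;
    if (r->tail - r->head == SKETCH_RING_CAP) {
        r->push_full++;
        return 0;
    }
    r->slots[r->tail % SKETCH_RING_CAP] = p;
    r->tail++;
    return 1;
}

/* Returns a cached pointer, or NULL on underflow (caller falls back). */
static void *sketch_ring_pop(sketch_ring_t *r) {
    r->pop_total++;
    if (r->head == r->tail) {
        r->pop_empty++;
        return NULL;
    }
    void *p = r->slots[r->head % SKETCH_RING_CAP];
    r->head++;
    return p;
}
```

In this model, a batch drain would only ever run on the push_full branch, so its potential benefit is bounded by how often that branch is taken.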
Overflow / Underflow Rates (WS=400 SSOT)
PUSH FULL (Free Path Ring Overflow):
TOTAL: 0 (0.00%)
POP EMPTY (Alloc Path Ring Underflow):
TOTAL: 168 (0.003%)
Interpretation:
- WS=400 SSOT is a near-perfect steady state for C4/C5/C6 inline slots.
- Overflow batching ROI is effectively zero: push_full=0, pop_empty≈0.003%.
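The totals and rates reported above can be rechecked with a few lines of arithmetic. The sketch below hard-codes the counts from this report; the helper names are hypothetical and are not part of the telemetry box.

```c
#include <stdint.h>

/* Per-class attempt counts copied from this report (C4, C5, C6). */
static const uint64_t sketch_reported_push[3] = {687564u, 1373605u, 2750862u};
static const uint64_t sketch_reported_pop[3]  = {687564u, 1373605u, 2750862u};

/* Sum the three per-class counters. */
static uint64_t sketch_sum3(const uint64_t v[3]) {
    return v[0] + v[1] + v[2];
}

/* Rate in percent; returns 0 when there were no attempts at all. */
static double sketch_rate_percent(uint64_t events, uint64_t attempts) {
    if (attempts == 0) {
        return 0.0;
    }
    return 100.0 * (double)events / (double)attempts;
}
```

With the reported figures, sketch_sum3 gives 4,812,031 for both push and pop, and sketch_rate_percent(168, 4812031) comes out at roughly 0.0035%, consistent with the rounded 0.003% quoted above.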
Phase 88 ROI Decision: NO-GO
Recommendation
DO NOT IMPLEMENT Phase 88 (Batch Drain Optimization)
Rationale
- Overflow is essentially absent: push_full=0, pop_empty≈0.003%.
- Batch drain overhead would dominate: any additional logic is far more likely to incur layout/branch tax than to save work.
- This is already the desirable state: inline slots are sized correctly for WS=400 SSOT.
Cost-Benefit Analysis
- Implementation Cost: high (batch logic, tests, ongoing maintenance)
- Benefit Under SSOT: ~0% (overflow frequency too low)
- Risk: layout tax / regression in a hot-path-heavy code region
Alternative Path (If overflow work is desired)
Use a research workload that intentionally produces misses/overflow (e.g. larger WS), and re-run this observation. Do not use WS=400 SSOT for that validation.
Implementation Artifacts
Files Created
- core/box/tiny_inline_slots_overflow_stats_box.h - Telemetry box header
- core/box/tiny_inline_slots_overflow_stats_box.c - Telemetry implementation
- core/front/tiny_c{3,4,5,6}_inline_slots.h - Updated with total counter calls
Telemetry Infrastructure
- Atomic counters for thread-safe measurement
- Compile-time enabled (always in observation builds)
- Zero overhead when disabled (checked at init time)
- Percentage calculations for overflow rates
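As a rough illustration of the atomic-counter approach listed above, the following sketch uses C11 atomics with relaxed ordering, which is usually sufficient for statistics that are only read after the workload finishes. The struct and function names are hypothetical; the actual layout of the telemetry box is not shown in this report.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counter block; zero-initialized because it has static storage. */
typedef struct {
    _Atomic uint64_t push_total;
    _Atomic uint64_t push_full;
} sketch_overflow_stats_t;

static sketch_overflow_stats_t g_sketch_stats;

/* Record one push attempt. Relaxed ordering: the counters do not
 * synchronize any other memory, they only need to avoid lost updates. */
static inline void sketch_stats_on_push(int was_full) {
    atomic_fetch_add_explicit(&g_sketch_stats.push_total, 1, memory_order_relaxed);
    if (was_full) {
        atomic_fetch_add_explicit(&g_sketch_stats.push_full, 1, memory_order_relaxed);
    }
}

/* Read-side helpers, intended for end-of-run reporting. */
static inline uint64_t sketch_stats_push_total(void) {
    return atomic_load_explicit(&g_sketch_stats.push_total, memory_order_relaxed);
}

static inline uint64_t sketch_stats_push_full(void) {
    return atomic_load_explicit(&g_sketch_stats.push_full, memory_order_relaxed);
}
```

Even relaxed atomic increments have a cost on a hot path, which is one reason to keep such counters behind a compile-time flag in non-observation builds.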
Conclusion
Phase 87 observation (with fixed telemetry gating) confirms that inline slots are active and overflow is negligible for WS=400 SSOT. Phase 88 is therefore correctly frozen as NO-GO for SSOT performance work.
Score: NO-GO ✗
- Expected Improvement: ~0% (overflow extremely rare)
- Actual Improvement: N/A (measurement-only)
- Implementation Burden: High (new code path, batch logic)
- Recommendation: Archive Phase 88 pending inline slots adoption