C7 Alloc Hotpath Flattening (design memo)
=========================================

Goals
-----
- Make C7 alloc as close to a straight line as possible.
- Minimise branches/indirections on the steady hit path (UC/TLS/Warm already stable).
- Keep Box boundaries intact; isolate feature gates to one lookup.

Current shape (simplified)
--------------------------
1. size→class LUT → `class_idx = 7` for 1024B path.
2. Route/Policy checks (tiny_route_get, tiny_policy_get) → gate UC/Warm/Page.
3. UC pop: hit path shares code with miss/refill, includes stats/guards.
4. TLS/Warm engagement happens behind UC miss boundary.
5. Multiple helper calls on the hit path (gate box, policy box, UC helpers).

Target shape
------------
1. size→class LUT (unchanged).
2. One policy snapshot: `const TinyClassPolicy* pol = tiny_policy_get(7);`
3. One route decision: C7 fast path assumes Tiny→UC→TLS/Warm enabled.
4. Hit path specialised:
   - Inline `tiny_unified_cache_pop_fast_c7()` that only touches the hot cache lines.
   - Stats optional/sampled (avoid atomic on every hit).
   - No feature/env reads.
5. Miss path remains boxed and guarded; enters existing refill flow unchanged.

Possible refactors
------------------
- Add `malloc_tiny_fast_c7_inline(...)` as a static inline used only when class==7.
- Precompute `pol->warm_enabled/page_box_enabled` once per thread and reuse.
- Split UC helpers into `*_hit_fast` vs `*_miss` to keep the hit CFG tiny.

Trade-offs / checks
-------------------
- Keep the Box boundaries (Gate/Route/Policy) but allow an inline “fast lane” for C7.
- Ensure Debug/Policy logging stays in the slow/miss path only.
- Validate with IPC/ops after implementation; target +10–15% for C7-heavy mixes.
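
Illustrative sketch
-------------------
The hit/miss split described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the actual implementation: the memo does not show the real `TinyClassPolicy` or unified cache layouts, so every type, field, constant, and function here is a hypothetical stand-in. `sketch_uc_pop_hit_fast_c7` plays the role of `tiny_unified_cache_pop_fast_c7()`, `sketch_alloc_c7` plays the role of `malloc_tiny_fast_c7_inline(...)`, and plain `malloc` stands in for the existing refill flow. The 1-in-64 sampling rate is an arbitrary choice for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* ASSUMPTION: simplified stand-in for the real TinyClassPolicy. */
typedef struct {
    int warm_enabled;
    int page_box_enabled;
} TinyClassPolicy;

#define C7_UC_CAP           8     /* hypothetical cache capacity        */
#define C7_BLOCK_SIZE       1024  /* C7 serves the 1024B path           */
#define C7_STAT_SAMPLE_MASK 63u   /* hypothetical: sample 1 hit in 64   */

/* ASSUMPTION: simplified per-thread unified cache for class 7. */
typedef struct {
    void    *slots[C7_UC_CAP];
    uint32_t count;
    uint32_t hit_tick;      /* plain thread-local counter, no atomics */
    uint32_t sampled_hits;  /* bumped once per 64 hits                */
    uint32_t misses;        /* bumped on the slow path only           */
} SketchUnifiedCacheC7;

static _Thread_local SketchUnifiedCacheC7 g_uc_c7;
static const TinyClassPolicy g_policy_c7 = { 1, 1 };

/* Stand-in for the policy lookup; the real one lives in the Policy box. */
static const TinyClassPolicy *sketch_policy_get(int class_idx) {
    (void)class_idx;
    return &g_policy_c7;
}

/* Hit path: one emptiness check, one pop, sampled stats.
 * No policy, feature, or environment reads. */
static inline void *sketch_uc_pop_hit_fast_c7(SketchUnifiedCacheC7 *uc) {
    if (uc->count == 0) {
        return NULL;  /* caller falls through to the miss path */
    }
    void *p = uc->slots[--uc->count];
    if ((++uc->hit_tick & C7_STAT_SAMPLE_MASK) == 0) {
        uc->sampled_hits++;
    }
    return p;
}

/* Miss path: kept out of line; policy checks and logging would live here.
 * malloc() is only a placeholder for the existing refill flow. */
static void *sketch_uc_miss_c7(SketchUnifiedCacheC7 *uc,
                               const TinyClassPolicy *pol) {
    (void)pol;  /* the real flow would consult warm/page settings here */
    uc->misses++;
    while (uc->count < C7_UC_CAP) {
        void *blk = malloc(C7_BLOCK_SIZE);
        if (blk == NULL) {
            break;
        }
        uc->slots[uc->count++] = blk;
    }
    return uc->count ? uc->slots[--uc->count] : NULL;
}

/* Entry point used only when class == 7: straight-line on a hit. */
static inline void *sketch_alloc_c7(void) {
    void *p = sketch_uc_pop_hit_fast_c7(&g_uc_c7);
    if (p != NULL) {
        return p;
    }
    return sketch_uc_miss_c7(&g_uc_c7, sketch_policy_get(7));
}
```

In this sketch the first call on a thread takes the miss path once and refills the cache; subsequent calls return from the hit path after a single branch and never touch the policy. One deliberate simplification: the policy snapshot is taken only on the miss path here, whereas the target shape takes it once up front; either placement keeps policy reads off the per-hit path if the snapshot is cached per thread.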