Boxify superslab registry, add bench profile, and document C7 hotpath experiments

Moe Charm (CI)
2025-12-07 03:12:27 +09:00
parent 18faa6a1c4
commit fda6cd2e67
71 changed files with 2052 additions and 286 deletions

@@ -0,0 +1,35 @@
C7 Free Hotpath (design memo)
=============================
Goals
-----
- Flatten the dominant C7 free path to minimise branches and helper hops.
- Keep safety checks boxed; keep the hot lane minimal.
Current typical path (C7)
-------------------------
1. size→class LUT → `class_idx = 7`.
2. free gate / route box decides Tiny vs Pool.
3. Tiny free fast v2:
   - Policy/env checks.
   - TLS SLL push.
   - Warm/UC interaction as needed.
4. Multiple helper calls along the way (gate, policy, SLL push).
Target hot lane
---------------
1. Take a single policy snapshot for C7 (warm/page/TLS on).
2. Straight to TLS SLL push with minimal bookkeeping.
3. Optional UC/Warm stats only in sampled mode.
4. Rare branches (remote/free-list edge cases) stay in boxed slow path.
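
The snapshot and sampled-stats steps above might look roughly like the following. This is a minimal sketch, not the allocator's actual code: the `C7_POL_*` bits, `c7_policy_snapshot()`, `c7_hot_lane_ok()`, `c7_stats`, and the 1-in-256 sampling period are all hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy bits; the real switches and where they come from
 * (env, config) are not specified in this memo. */
enum {
    C7_POL_WARM = 1u << 0,
    C7_POL_PAGE = 1u << 1,
    C7_POL_TLS  = 1u << 2,
    C7_POL_ALL  = C7_POL_WARM | C7_POL_PAGE | C7_POL_TLS
};

/* Fold the individual switches into one word. Intended to be computed
 * once (for example at thread init), not on every free. */
static inline uint32_t c7_policy_snapshot(bool warm_on, bool page_on, bool tls_on) {
    return (warm_on ? (uint32_t)C7_POL_WARM : 0u)
         | (page_on ? (uint32_t)C7_POL_PAGE : 0u)
         | (tls_on  ? (uint32_t)C7_POL_TLS  : 0u);
}

/* The hot lane is eligible only when every switch is on: one compare. */
static inline bool c7_hot_lane_ok(uint32_t snapshot) {
    return snapshot == (uint32_t)C7_POL_ALL;
}

/* Sampled stats: count only one free out of every C7_SAMPLE_PERIOD.
 * The period is an assumed value and must be a power of two. */
#define C7_SAMPLE_PERIOD 256u

typedef struct c7_stats {
    uint32_t tick;     /* incremented on every free */
    uint64_t sampled;  /* incremented on every C7_SAMPLE_PERIOD-th free */
} c7_stats;

static inline void c7_stats_sample(c7_stats *s) {
    if (((++s->tick) & (C7_SAMPLE_PERIOD - 1u)) == 0u) {
        s->sampled++;
    }
}
```

With this shape, the per-free cost of the policy check is a single compare against a cached word, and the stats update touches the sampled counter only once per period.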
Ideas to explore
----------------
- Add an inline `hak_tiny_free_fast_v2_c7()` that is used when `class_idx == 7`.
- Fold gate/policy reads into one branch per free call.
- Keep the TLS SLL push inline; push remote/cross-thread cases behind unlikely branches.
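
As a rough illustration of these ideas, the sketch below shows a C7-specialised inline free with one folded check and an inline singly linked list push. Everything here is hypothetical: `sketch_tls`, `sketch_free_fast_c7()`, `sketch_free_slow()`, the `same_thread` argument, and the `sll_cap` limit are stand-ins and do not reflect the real `hak_tiny_free_fast_v2_c7()` signature or the real TLS layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define SKETCH_UNLIKELY(x) (x)
#endif

/* Hypothetical per-thread state for class 7. */
typedef struct sketch_tls {
    void    *sll_head;    /* head of the thread-local free list */
    uint32_t sll_count;   /* blocks currently on the list */
    uint32_t sll_cap;     /* assumed capacity limit (not from the memo) */
    bool     hot_ok;      /* gate and policy reads folded into one flag */
    uint32_t slow_calls;  /* demo counter so the slow path is observable */
} sketch_tls;

/* Stand-in for the boxed slow path (remote frees, full list, checks). */
static void sketch_free_slow(sketch_tls *t, void *block) {
    (void)block;
    t->slow_calls++;
}

/* C7 hot lane: one unlikely branch, then an inline SLL push. */
static inline void sketch_free_fast_c7(sketch_tls *t, void *block, bool same_thread) {
    /* Non-short-circuit OR so the conditions can fold into a single test. */
    if (SKETCH_UNLIKELY(!t->hot_ok | !same_thread | (t->sll_count >= t->sll_cap))) {
        sketch_free_slow(t, block);
        return;
    }
    *(void **)block = t->sll_head;  /* next pointer lives in the freed block */
    t->sll_head = block;
    t->sll_count++;
}

/* Dispatch: only class 7 takes the specialised lane in this sketch. */
static inline void sketch_free(sketch_tls *t, void *block, int class_idx, bool same_thread) {
    if (class_idx == 7) {
        sketch_free_fast_c7(t, block, same_thread);
        return;
    }
    sketch_free_slow(t, block);
}
```

The design choice illustrated is that every rare condition shares one branch, so the common case is a compare, two stores, and an increment; whether that matches what the compiler emits would need to be confirmed in the generated code.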
Validation
----------
- Compare C7-only throughput (ops/s) before and after the change.
- Ensure remote/free-list invariants stay enforced in the slow path.
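
For the ops/s comparison, a throwaway harness along the following lines may be enough. It is a sketch under stated assumptions: it goes through plain `malloc`/`free` (so it measures whichever allocator is linked or preloaded), the block size passed in is the caller's guess at a C7-sized request, and `bench_alloc_free_ops_per_sec()` and `speedup_percent()` are hypothetical helper names, not part of the repository's bench profile.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Monotonic wall-clock time in seconds (POSIX clock_gettime). */
static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Allocate `batch` blocks of `size` bytes, free them all, and repeat
 * `rounds` times. Returns combined malloc+free operations per second,
 * or 0.0 if the slot array cannot be allocated. */
static double bench_alloc_free_ops_per_sec(size_t size, size_t batch, size_t rounds) {
    void **slots = malloc(batch * sizeof *slots);
    if (!slots) return 0.0;

    double t0 = now_sec();
    for (size_t r = 0; r < rounds; r++) {
        for (size_t i = 0; i < batch; i++) {
            slots[i] = malloc(size);
            /* Touch the block so the pair is not optimised away. */
            if (slots[i]) *(volatile char *)slots[i] = 1;
        }
        for (size_t i = 0; i < batch; i++) {
            free(slots[i]);
        }
    }
    double dt = now_sec() - t0;
    free(slots);

    double ops = 2.0 * (double)batch * (double)rounds;
    return dt > 0.0 ? ops / dt : 0.0;
}

/* Relative change of `after` versus `before`, in percent. */
static double speedup_percent(double before, double after) {
    return (after - before) / before * 100.0;
}
```

Running the same call against the build before and after the change, several times each, and feeding the medians to `speedup_percent()` gives a first-order answer; it does not replace the checks on the slow-path invariants.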