Phase 6.16 Update (2025-10-23)

What changed (summary)
- Tiny fast path simplified and sped up:
  - Removed the owner pointer from TLS magazine items (smaller push/pop).
  - Added sampled counter updates for alloc/free (ENV: HAKMEM_TINY_COUNT_SAMPLE, default 1/256).
- Mid overhead trimmed:
  - hits/misses/frees are now updated with sampling (ENV: HAKMEM_POOL_COUNT_SAMPLE, default 1/256).
  - Extra metrics exposed for trylock attempts/successes and ring underflow; the learner writes CSV lines `M,<ts>,<try>,<succ>,<under>,<rate>` to HAKMEM_LOG_FILE.
- Mid page-descriptor registry (baseline):
  - Mini hash mapping each 64KiB page to {class_idx, owner_tid}.
  - TLS-owned pages are registered with their owner_tid; refill pages are registered with owner_tid=0.
  - The free path consults the descriptor when the header is light (owner not set).
- Header modes (Mid): HAKMEM_HDR_LIGHT now supports 0 (full), 1 (minimal), and 2 (skip writes/validation; experimental).
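
To illustrate the page-descriptor registry described above, here is a minimal single-threaded sketch in C. The table size, hash constant, linear probing, and every name (`page_desc_t`, `reg_insert`, `reg_lookup`) are assumptions made for this example, not taken from the hakmem source; a real registry would also need synchronization.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 16u    /* 64 KiB pages */
#define REG_SLOTS  1024u  /* hypothetical table size; must be a power of two */

typedef struct {
    uintptr_t page;       /* page base address; 0 marks an empty slot */
    uint8_t   class_idx;  /* size class of the blocks in this page */
    uint64_t  owner_tid;  /* owning thread id; 0 = no owner (refill page) */
} page_desc_t;

static page_desc_t g_reg[REG_SLOTS];

/* Round any address inside a page down to the 64 KiB page base. */
static uintptr_t page_base(const void *p) {
    return (uintptr_t)p & ~(((uintptr_t)1 << PAGE_SHIFT) - 1);
}

/* Multiplicative hash of the page number (constant chosen for illustration). */
static size_t reg_hash(uintptr_t page) {
    uint64_t h = (uint64_t)(page >> PAGE_SHIFT) * 0x9E3779B97F4A7C15ull;
    return (size_t)(h >> 32) & (REG_SLOTS - 1u);
}

/* Register (or update) the descriptor for the page containing p.
   Returns 0 on success, -1 if the table is full. */
static int reg_insert(const void *p, uint8_t class_idx, uint64_t owner_tid) {
    uintptr_t page = page_base(p);
    size_t i = reg_hash(page);
    for (size_t n = 0; n < REG_SLOTS; n++, i = (i + 1) & (REG_SLOTS - 1u)) {
        if (g_reg[i].page == 0 || g_reg[i].page == page) {
            g_reg[i] = (page_desc_t){ page, class_idx, owner_tid };
            return 0;
        }
    }
    return -1;
}

/* Look up the descriptor for the page containing p; NULL if unregistered. */
static const page_desc_t *reg_lookup(const void *p) {
    uintptr_t page = page_base(p);
    size_t i = reg_hash(page);
    for (size_t n = 0; n < REG_SLOTS; n++, i = (i + 1) & (REG_SLOTS - 1u)) {
        if (g_reg[i].page == page) return &g_reg[i];
        if (g_reg[i].page == 0)    return NULL;  /* hit an empty slot: not present */
    }
    return NULL;
}
```

In this sketch a free path with a light header would call `reg_lookup(ptr)` and compare `owner_tid` against the calling thread to decide between a local and a remote free.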

Key ENV additions
- HAKMEM_TINY_COUNT_SAMPLE, HAKMEM_POOL_COUNT_SAMPLE: 2^N sampling for hot-path counters (default rate 1/256).
- HAKMEM_HDR_LIGHT=0|1|2: header write policy for Mid (0 full, 1 minimal, 2 skip writes/validation; experimental).
- Mid TLS two-tier knobs: HAKMEM_TRYLOCK_PROBES, HAKMEM_RING_RETURN_DIV, HAKMEM_TLS_LO_MAX, HAKMEM_POOL_TLS_RING, HAKMEM_SHARD_MIX.
- Tiny wrapper toggles: HAKMEM_WRAP_TINY=1, HAKMEM_WRAP_TINY_REFILL=1.
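
The 2^N counter sampling can be pictured with the following sketch. It assumes the environment value is the exponent N (so the 1/256 default would correspond to N=8); that encoding, the accepted range, and all names (`sampled_counter_t`, `sample_shift_from_env`, `sc_event`) are assumptions for illustration, not the actual hakmem implementation.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t mask;   /* 2^N - 1 */
    uint32_t tick;   /* cheap per-thread event counter */
    uint64_t count;  /* estimated total, bumped once per 2^N events */
} sampled_counter_t;

/* Read the exponent N from an environment variable (assumed encoding).
   Falls back to def_shift when the variable is unset, malformed, or out of range. */
static uint32_t sample_shift_from_env(const char *name, uint32_t def_shift) {
    const char *s = getenv(name);
    if (s == NULL || *s == '\0') return def_shift;
    char *end = NULL;
    long v = strtol(s, &end, 10);
    if (*end != '\0' || v < 0 || v > 16) return def_shift;
    return (uint32_t)v;
}

static void sc_init(sampled_counter_t *c, uint32_t shift) {
    c->mask  = (1u << shift) - 1u;
    c->tick  = 0;
    c->count = 0;
}

/* Hot path: one increment and one mask test. The total is only written
   on every 2^N-th event, where it is advanced by 2^N to stay unbiased. */
static inline void sc_event(sampled_counter_t *c) {
    if ((++c->tick & c->mask) == 0) {
        c->count += (uint64_t)c->mask + 1u;
    }
}
```

A caller would initialize once with something like `sc_init(&c, sample_shift_from_env("HAKMEM_TINY_COUNT_SAMPLE", 8))` and then call `sc_event(&c)` on each alloc or free. The trade-off is that the reported total can lag the true count by up to 2^N - 1 events.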

Quick A/B results (10s, larson, BURST)
- Tiny 8–64B
  - 1T: hakmem 17.93M (prev ~14.6M) / mimalloc ~34.3M / system ~37.4M
  - 4T: hakmem 56.10M (prev ~30.9M) / mimalloc ~52.0M / system ~82.3M
  - Note: hakmem beats mimalloc at 4T here, attributed to the TLS/magazine simplifications.
- Mid 2–32KiB
  - 1T: hakmem ~4.5–4.8M vs mimalloc ~11–14.8M (needs transfer cache + headerless hot path; see next steps)
  - 4T: hakmem ~13.5–15.0M vs mimalloc ~29–30M
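
For orientation, the Tiny figures above translate into the following ratios. Several inputs are marked approximate in the results, so these are rough values only:

$$
\frac{17.93}{14.6} \approx 1.23, \qquad
\frac{56.10}{30.9} \approx 1.82, \qquad
\frac{56.10}{52.0} \approx 1.08, \qquad
\frac{17.93}{34.3} \approx 0.52
$$

That is roughly +23% at 1T and +82% at 4T over the previous build, about 8% ahead of mimalloc at 4T, and about half of mimalloc's throughput at 1T.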

Logs
- Tiny: docs/benchmarks/20251023_035642_TINY_AB/summary.txt
- Mid A/B (fast): docs/benchmarks/20251023_033008_AB_FAST_MID/summary.txt
- Head-to-head suites: docs/benchmarks/20251023_033358_HEAD2HEAD/summary.txt, ..._HEAD2HEAD_POST, ..._HDR2

Next steps (recommended)
1) Mid Transfer Cache (minimal): per-(class, thread) MPSC queue plus a small-budget drain in alloc; removes central freelist round trips on cross-thread frees.
2) Page-descriptor routing: hold class/owner in the descriptor and make Mid headerless on the hot path (keep assertions in debug builds).
3) Tune TLS parameters (ring cap 8/16, LIFO max 256/512) via scripts/ab_fast_mid.sh.
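
One possible shape for the transfer cache in step 1 is sketched below using C11 atomics: remote threads push freed blocks onto a lock-free stack, and the owning thread detaches the chain and takes at most a small budget of blocks per alloc. All names (`tc_slot_t`, `tc_push`, `tc_drain`) are hypothetical, and the sketch assumes each freed block is large enough to hold one pointer; it is not the planned hakmem design.

```c
#include <stdatomic.h>
#include <stddef.h>

/* A freed block is reused as a list node (assumes block size >= sizeof(void *)). */
typedef struct tc_node { struct tc_node *next; } tc_node_t;

/* One slot per (class, owner thread): many producers, a single consumer. */
typedef struct {
    _Atomic(tc_node_t *) head;
} tc_slot_t;

/* Producer side (any thread): lock-free push of a freed block. */
static void tc_push(tc_slot_t *s, void *block) {
    tc_node_t *n = (tc_node_t *)block;
    tc_node_t *old = atomic_load_explicit(&s->head, memory_order_relaxed);
    do {
        n->next = old;  /* on CAS failure, old is refreshed and we retry */
    } while (!atomic_compare_exchange_weak_explicit(
                 &s->head, &old, n,
                 memory_order_release, memory_order_relaxed));
}

/* Consumer side (owner thread only): detach the whole chain in one exchange,
   hand out up to `budget` blocks, and push any remainder back for later. */
static size_t tc_drain(tc_slot_t *s, void **out, size_t budget) {
    tc_node_t *n = atomic_exchange_explicit(&s->head, (tc_node_t *)NULL,
                                            memory_order_acquire);
    size_t got = 0;
    while (n != NULL && got < budget) {
        tc_node_t *next = n->next;
        out[got++] = n;
        n = next;
    }
    while (n != NULL) {  /* over budget: return the rest to the slot */
        tc_node_t *next = n->next;
        tc_push(s, n);
        n = next;
    }
    return got;
}
```

Because the single consumer only ever takes the entire chain with an exchange, this variant avoids the ABA hazard of a lock-free pop. Re-pushing the remainder is linear in its length, which a real implementation might avoid by keeping the leftover chain in thread-local state.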