Files
hakmem/docs/status/archive/PHASE_6.16_UPDATE_2025_10_23.md

41 lines
2.3 KiB
Markdown
Raw Normal View History

Phase 6.16 Update (20251023)
What changed (summary)
- Tiny fast path simplified and sped up:
- Removed owner pointer from TLS magazine items (smaller push/pop).
- Added sampled counter updates for alloc/free (ENV: HAKMEM_TINY_COUNT_SAMPLE, default 1/256).
- Mid overhead trimmed:
- hits/misses/frees now updated with sampling (ENV: HAKMEM_POOL_COUNT_SAMPLE, default 1/256).
- Extra metrics exposed for trylock attempts/success and ring underflow; learner writes CSV lines `M,<ts>,<try>,<succ>,<under>,<rate>` to HAKMEM_LOG_FILE.
- Mid pagedescriptor registry (baseline):
- 64KiB page → {class_idx, owner_tid} mini hash.
- TLSowned pages registered with owner_tid; refill pages registered with owner_tid=0.
- Free path consults descriptor when header is light (owner not set).
- Header modes (Mid): HAKMEM_HDR_LIGHT now supports 0 (full), 1 (minimal), 2 (skip writes/validation; experimental).
Key ENV additions
- HAKMEM_TINY_COUNT_SAMPLE, HAKMEM_POOL_COUNT_SAMPLE — 2^N sampling for hotpath counters.
- HAKMEM_HDR_LIGHT=0|1|2 — header write policy for Mid.
- Mid TLS twotier knobs: HAKMEM_TRYLOCK_PROBES, HAKMEM_RING_RETURN_DIV, HAKMEM_TLS_LO_MAX, HAKMEM_POOL_TLS_RING, HAKMEM_SHARD_MIX.
- Tiny wrapper toggles: HAKMEM_WRAP_TINY=1, HAKMEM_WRAP_TINY_REFILL=1.
Quick A/B results (10s, larson, BURST)
- Tiny 864B
- 1T: hakmem 17.93M (prev ~14.6M) / mimalloc ~34.3M / system ~37.4M
- 4T: hakmem 56.10M (prev ~30.9M) / mimalloc ~52.0M / system ~82.3M
- Note: hakmem > mimalloc in 4T here due to TLS/mag simplifications.
- Mid 232KiB
- 1T: hakmem ~4.54.8M vs mimalloc ~1114.8M (needs TC + headerless hot path)
- 4T: hakmem ~13.515.0M vs mimalloc ~2930M
Logs
- Tiny: docs/benchmarks/20251023_035642_TINY_AB/summary.txt
- Mid A/B (fast): docs/benchmarks/20251023_033008_AB_FAST_MID/summary.txt
- Headtohead suites: docs/benchmarks/20251023_033358_HEAD2HEAD/summary.txt, ..._HEAD2HEAD_POST, ..._HDR2
Next steps (recommended)
1) Mid Transfer Cache (minimal): (class,thread) MPSC + small budget drain in alloc; remove central freelist round trips on crossthread frees.
2) Pagedescriptor routing: hold class/owner in descriptor, make Mid headerless on hot path (keep assertions in debug).
3) Tune TLS parameters (ring cap 8/16, LIFO max 256/512) via scripts/ab_fast_mid.sh.