hakmem/MAINLINE_INTEGRATION.md at d355041638d0b77d0bf2f8b4928e8201dbd52f95


Moe Charm (CI) 52386401b3 Debug Counters Implementation - Clean History

Major Features:
- Debug counter infrastructure for Refill Stage tracking
- Free Pipeline counters (ss_local, ss_remote, tls_sll)
- Diagnostic counters for early return analysis
- Unified larson.sh benchmark runner with profiles
- Phase 6-3 regression analysis documentation

Bug Fixes:
- Fix SuperSlab disabled by default (HAKMEM_TINY_USE_SUPERSLAB)
- Fix profile variable naming consistency
- Add .gitignore patterns for large files

Performance:
- Phase 6-3: 4.79 M ops/s (has OOM risk)
- With SuperSlab: 3.13 M ops/s (+19% improvement)

This is a clean repository without large log files.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-05 12:31:14 +09:00


Mainline Integration Plan (Tiny focus)

What we promoted to mainline (safe, general)

Safer tiny refill into SLL (already integrated)
- sll_refill_small_from_ss(class_idx, max_take) now caps the refill at the SLL's actual free capacity, so it never takes more blocks than the SLL can hold and no longer wastes meta->used increments.
Entry order consolidation in the small fast path
- Prefer SLL → Magazine → SuperSlab in the normal (non‑bench) path; Quick/FrontCache/Ultra tiers remain opt‑in.
Targeted remote‑drain queue (compiled in, default OFF)
- Per‑class Treiber queue for slabs that exceed remote thresholds; disabled by default via env knobs to preserve conservative behavior.
PGO recipe (opt‑in)
- Makefile targets for PGO on tiny benches are available, but not required for normal builds.
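
The capped refill described above can be illustrated with a minimal sketch. The types `tiny_sll_t` and `slab_meta_t`, their fields, and the helper `refill_capped` are hypothetical stand-ins invented for this example; the real signature and internals of `sll_refill_small_from_ss` are not shown in this document. The idea is only that the take count is clamped to the SLL's free room before `used` is incremented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the actual hakmem structures. */
typedef struct {
    unsigned count;     /* blocks currently cached in the SLL */
    unsigned capacity;  /* maximum blocks the SLL may hold */
} tiny_sll_t;

typedef struct {
    unsigned used;      /* blocks already handed out from the slab */
    unsigned total;     /* blocks the slab can provide overall */
} slab_meta_t;

/* Move at most max_take blocks from the slab into the SLL.
 * The request is clamped to the SLL's free room and to what the
 * slab still has, so used is never bumped for blocks that could
 * not be stored. Returns the number of blocks actually taken. */
unsigned refill_capped(tiny_sll_t *sll, slab_meta_t *meta, unsigned max_take)
{
    assert(sll != NULL && meta != NULL);
    assert(sll->count <= sll->capacity && meta->used <= meta->total);

    unsigned room  = sll->capacity - sll->count;  /* free SLL capacity */
    unsigned avail = meta->total - meta->used;    /* blocks left in slab */
    unsigned take  = max_take;

    if (take > room)  take = room;
    if (take > avail) take = avail;

    meta->used += take;   /* account only for what was really moved */
    sll->count += take;
    return take;
}
```

With an SLL holding 60 of 64 blocks, a request for 32 takes only 4, and `used` grows by 4 rather than 32.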

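For readers unfamiliar with the term, a Treiber queue is a lock-free singly linked stack updated with compare-and-swap. The sketch below shows the general technique with C11 atomics; it is not hakmem's implementation, and `remote_slab_t`, `remote_queue_t`, and both function names are assumptions made for this example. Draining detaches the whole list with one exchange, which sidesteps the ABA hazard of popping nodes one at a time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node: a slab that crossed its remote-free threshold. */
typedef struct remote_slab {
    struct remote_slab *next;
    int class_idx;
} remote_slab_t;

/* One such queue would exist per size class (assumption). */
typedef struct {
    _Atomic(remote_slab_t *) head;
} remote_queue_t;

/* Push a slab; safe to call from multiple threads concurrently. */
void remote_queue_push(remote_queue_t *q, remote_slab_t *s)
{
    remote_slab_t *old = atomic_load_explicit(&q->head, memory_order_relaxed);
    do {
        s->next = old;  /* link in front of the head we last observed */
    } while (!atomic_compare_exchange_weak_explicit(
                 &q->head, &old, s,
                 memory_order_release, memory_order_relaxed));
}

/* Detach every queued slab at once; returns the old head (LIFO order)
 * or NULL when the queue is empty. */
remote_slab_t *remote_queue_take_all(remote_queue_t *q)
{
    return atomic_exchange_explicit(&q->head, (remote_slab_t *)NULL,
                                    memory_order_acquire);
}
```

A drainer would call `remote_queue_take_all` and walk the returned list; producers can keep pushing while it does so.
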
What remains bench‑only (NOT promoted)

SLL‑only front (Magazine compiled out) and TLS warmup
- These are highly benchmark‑specific and are kept in bench‑only builds.
Free‑side SLL‑first push without owner/stats
- Mainline preserves learning/stats semantics; bench builds can cut them out.
Quick/FrontCache and 32/64 specialization hardwiring
- Retained as A/B options; not enabled by default in mainline.

Recommended “Perf‑Main” preset (opt‑in, no bench macros)

Environment (Tiny‑Hot biased but general):
- HAKMEM_TINY_TLS_SLL=1
- HAKMEM_TINY_REFILL_MAX=96
- HAKMEM_TINY_REFILL_MAX_HOT=192
- HAKMEM_TINY_SPILL_HYST=16
- HAKMEM_TINY_BG_REMOTE=0 (keep targeted remote drain off by default)
- Keep Quick/FrontCache/Ultra OFF unless explicitly A/B tested
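
As an illustration of how integer knobs like these could be consumed, here is a minimal sketch built on the standard `getenv` and `strtol`. The helper names `parse_knob`, `env_knob`, `load_perf_main_knobs` and the struct `tiny_knobs_t` are hypothetical, and the fallback values are simply the Perf-Main numbers from the list above, not hakmem's built-in defaults (which this document does not state).

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a decimal integer; return dflt for NULL, empty, or malformed text. */
long parse_knob(const char *s, long dflt)
{
    if (s == NULL || *s == '\0')
        return dflt;
    char *end = NULL;
    long v = strtol(s, &end, 10);
    return (*end == '\0') ? v : dflt;  /* reject trailing garbage */
}

/* Read one knob from the environment. */
long env_knob(const char *name, long dflt)
{
    return parse_knob(getenv(name), dflt);
}

/* Hypothetical container for the knobs named in the preset. */
typedef struct {
    long tls_sll;
    long refill_max;
    long refill_max_hot;
    long spill_hyst;
    long bg_remote;
} tiny_knobs_t;

/* Fallbacks are the Perf-Main values (assumption for this sketch). */
tiny_knobs_t load_perf_main_knobs(void)
{
    tiny_knobs_t k;
    k.tls_sll        = env_knob("HAKMEM_TINY_TLS_SLL", 1);
    k.refill_max     = env_knob("HAKMEM_TINY_REFILL_MAX", 96);
    k.refill_max_hot = env_knob("HAKMEM_TINY_REFILL_MAX_HOT", 192);
    k.spill_hyst     = env_knob("HAKMEM_TINY_SPILL_HYST", 16);
    k.bg_remote      = env_knob("HAKMEM_TINY_BG_REMOTE", 0);
    return k;
}
```

Treating malformed values as unset keeps a typo in the environment from silently turning a knob into 0.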

How to try Perf‑Main locally

Build benches: make bench_fast
Run tiny‑hot triad (no bench macros): bash scripts/run_tiny_hot_triad.sh 60000
Run random‑mixed matrix: bash scripts/run_random_mixed_matrix.sh 100000

Notes

LD_PRELOAD/app mode remains conservative (LD_SAFE staging). Tiny‑only and pass‑through modes are recommended for stability. The bench‑only optimizations are intentionally not applied in LD mode.
