hakmem/docs/archive/MAINLINE_INTEGRATION.md
Moe Charm (CI) 52386401b3 Debug Counters Implementation - Clean History
Major Features:
- Debug counter infrastructure for Refill Stage tracking
- Free Pipeline counters (ss_local, ss_remote, tls_sll)
- Diagnostic counters for early return analysis
- Unified larson.sh benchmark runner with profiles
- Phase 6-3 regression analysis documentation

Bug Fixes:
- Fix SuperSlab disabled by default (HAKMEM_TINY_USE_SUPERSLAB)
- Fix profile variable naming consistency
- Add .gitignore patterns for large files

Performance:
- Phase 6-3: 4.79 M ops/s (has OOM risk)
- With SuperSlab: 3.13 M ops/s (+19% improvement)

This is a clean repository without large log files.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 12:31:14 +09:00


# Mainline Integration Plan (Tiny focus)

## What we promoted to mainline (safe, general)

- Safer tiny refill into SLL (already integrated)
  - `sll_refill_small_from_ss(class_idx, max_take)` now caps the refill at the SLL's actual free capacity, avoiding over-take and wasted `meta->used` increments.
- Entry order consolidation in the small fast path
  - Prefer `SLL → Magazine → SuperSlab` in the normal (non-bench) path; Quick/FrontCache/Ultra tiers remain opt-in.
- Targeted remote-drain queue (compiled in, default OFF)
  - Per-class Treiber queue for slabs that exceed remote thresholds; disabled by default via env knobs to preserve conservative behavior.
- PGO recipe (opt-in)
  - Makefile targets for PGO on tiny benches are available, but not required for normal builds.
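
The capped refill described in the first item can be illustrated with a minimal sketch. This is not the hakmem implementation: the types `Block`, `TinySLL`, and `SlabMeta`, their fields, and the function name `sll_refill_capped` are hypothetical stand-ins chosen for illustration. It only shows the idea of bounding the take by the list's free capacity so that `used` is never incremented for a block that does not fit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; the real hakmem structures differ. */
typedef struct Block { struct Block *next; } Block;

typedef struct {
    Block   *head;   /* top of the thread-local singly linked list (SLL) */
    unsigned count;  /* blocks currently cached in the SLL */
    unsigned cap;    /* maximum number of blocks the SLL may hold */
} TinySLL;

typedef struct {
    Block   *freelist; /* free blocks still owned by the slab */
    unsigned used;     /* blocks handed out (stand-in for meta->used) */
} SlabMeta;

/* Move up to max_take blocks from the slab into the SLL, but never more
 * than the SLL has room for. Returns the number of blocks moved. */
unsigned sll_refill_capped(TinySLL *sll, SlabMeta *meta, unsigned max_take)
{
    assert(sll != NULL && meta != NULL);

    unsigned room  = (sll->count < sll->cap) ? sll->cap - sll->count : 0;
    unsigned want  = (max_take < room) ? max_take : room;
    unsigned taken = 0;

    while (taken < want && meta->freelist != NULL) {
        Block *b = meta->freelist;
        meta->freelist = b->next;   /* detach from the slab */
        meta->used++;               /* counted only for blocks that fit */
        b->next = sll->head;        /* push onto the SLL */
        sll->head = b;
        sll->count++;
        taken++;
    }
    return taken;
}
```

With an SLL capacity of 3 and `max_take` of 5, the sketch moves 3 blocks and leaves `used` at 3; a second call moves nothing and leaves `used` unchanged.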

## What remains bench-only (NOT promoted)

- SLL-only front (Magazine compiled out) and TLS warm-up
  - These are highly benchmark-specific and are kept in bench-only builds.
- Free-side SLL-first push without owner/stats
  - Mainline preserves learning/stats semantics; bench builds can cut them out.
- Quick/FrontCache / 32/64 specialization hard-wiring
  - Retained as A/B options; not enabled by default in mainline.

## Recommended “PerfMain” preset (opt-in, no bench macros)

- Environment (Tiny-Hot biased but general):
  - `HAKMEM_TINY_TLS_SLL=1`
  - `HAKMEM_TINY_REFILL_MAX=96`
  - `HAKMEM_TINY_REFILL_MAX_HOT=192`
  - `HAKMEM_TINY_SPILL_HYST=16`
  - `HAKMEM_TINY_BG_REMOTE=0` (keep targeted remote drain off by default)
- Keep Quick/FrontCache/Ultra OFF unless explicitly A/B tested
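
As a rough illustration of how a bench harness might resolve these knobs, the sketch below keeps the preset values from the list above in a table and lets an explicit environment setting override them. The helper names (`knob_parse`, `perf_main_preset`, `perf_main_effective`) are hypothetical, and treating every knob as a plain decimal integer is an assumption; hakmem's own parsing and built-in defaults may differ.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct { const char *name; int value; } Knob;

/* Values copied from the preset list above. */
static const Knob PERF_MAIN[] = {
    { "HAKMEM_TINY_TLS_SLL",        1   },
    { "HAKMEM_TINY_REFILL_MAX",     96  },
    { "HAKMEM_TINY_REFILL_MAX_HOT", 192 },
    { "HAKMEM_TINY_SPILL_HYST",     16  },
    { "HAKMEM_TINY_BG_REMOTE",      0   },
};

/* Parse a decimal integer; return fallback for NULL, empty, malformed,
 * or out-of-range text. */
int knob_parse(const char *text, int fallback)
{
    if (text == NULL || *text == '\0') return fallback;
    char *end = NULL;
    long v = strtol(text, &end, 10);
    if (end == text || *end != '\0') return fallback;
    if (v < INT_MIN || v > INT_MAX) return fallback;
    return (int)v;
}

/* Preset value for name, or fallback when name is not in the preset. */
int perf_main_preset(const char *name, int fallback)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof PERF_MAIN / sizeof PERF_MAIN[0]; i++) {
        if (strcmp(PERF_MAIN[i].name, name) == 0) return PERF_MAIN[i].value;
    }
    return fallback;
}

/* Effective value: an explicit environment setting wins over the preset. */
int perf_main_effective(const char *name, int fallback)
{
    return knob_parse(getenv(name), perf_main_preset(name, fallback));
}
```

Letting the environment win keeps the preset opt-in: anything set explicitly for an A/B run is respected, and only unset knobs fall back to the table.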

## How to try PerfMain locally

- Build benches: `make bench_fast`
- Run tiny-hot triad (no bench macros): `bash scripts/run_tiny_hot_triad.sh 60000`
- Run random-mixed matrix: `bash scripts/run_random_mixed_matrix.sh 100000`

## Notes

- LD_PRELOAD/app mode remains conservative (LD_SAFE staging). Tiny-only and pass-through modes are recommended for stability. The bench-only optimizations are intentionally not applied in LD mode.