2025-10-22 Sweep Notes (Larson)
Excerpts (1-second runs) and the commands to reproduce them. See the raw logs for details.
Environment
- Build: make shared (use make debug to enable instrumentation)
- Shared library: LD_PRELOAD=$(readlink -f ./libhakmem.so)
- Representative ENV (set as needed):
  - HAKMEM_PROF=1 HAKMEM_PROF_SAMPLE=7
  - HAKMEM_LEARN=1 (CAP learning ON)
  - HAKMEM_WRAP_L2=1 HAKMEM_WRAP_L25=1 (allow L1 inside the wrapper)
DYN1 (14KB) effect (wrapper OFF)
# 13–15KB, 1T, 1s
DYN1=OFF → 1.44M ops/s
DYN1=ON → 4.57M ops/s
Commands:
LD_PRELOAD=... HAKMEM_MID_DYN1=0 mimalloc-bench/bench/larson/larson 1 13000 15000 10000 1 12345 1
LD_PRELOAD=... HAKMEM_MID_DYN1=14336 mimalloc-bench/bench/larson/larson 1 13000 15000 10000 1 12345 1
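The OFF/ON pair above differs only in the value of HAKMEM_MID_DYN1, so the comparison can be generated from one template. The following is a minimal sketch, not part of the repository: the function name dyn1_cmd and the LIB and LARSON variables are hypothetical, and the default library path is assumed from the Environment section. It only prints the command lines, so it is safe to run without the benchmark installed.

```shell
#!/bin/sh
# Hypothetical helper (not in the repo): build the larson command line
# for one HAKMEM_MID_DYN1 setting. Paths are assumptions; adjust as needed.
LIB="${LIB:-./libhakmem.so}"
LARSON="${LARSON:-mimalloc-bench/bench/larson/larson}"

dyn1_cmd() {
    # $1 = value for HAKMEM_MID_DYN1 (0 = OFF, 14336 = 14KB, as in the notes)
    printf 'LD_PRELOAD=%s HAKMEM_MID_DYN1=%s %s 1 13000 15000 10000 1 12345 1\n' \
        "$LIB" "$1" "$LARSON"
}

# Dry run: print one command per setting instead of executing it.
for v in 0 14336; do
    dyn1_cmd "$v"
done
```

Piping the output to sh would execute the runs; keeping the generation step separate makes it easy to inspect the exact environment before measuring.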
Wrapper ON, after tuning (minimum bundle = 3)
# 13–15KB, 1T, 1s, WRAP L1 ON
DYN1=ON → 4.18M ops/s
DYN1=OFF → 4.66M ops/s
# 2–32KB, 4T, 1s, WRAP L1 ON
≈ 4.02M ops/s
Commands:
HAKMEM_WRAP_L2=1 HAKMEM_WRAP_L25=1 HAKMEM_POOL_MIN_BUNDLE=3 LD_PRELOAD=... HAKMEM_MID_DYN1=14336 mimalloc-bench/bench/larson/larson 1 13000 15000 10000 1 12345 1
HAKMEM_WRAP_L2=1 HAKMEM_WRAP_L25=1 HAKMEM_POOL_MIN_BUNDLE=3 LD_PRELOAD=... HAKMEM_MID_DYN1=0 mimalloc-bench/bench/larson/larson 1 13000 15000 10000 1 12345 1
HAKMEM_WRAP_L2=1 HAKMEM_WRAP_L25=1 HAKMEM_POOL_MIN_BUNDLE=3 LD_PRELOAD=... mimalloc-bench/bench/larson/larson 1 2048 32768 10000 1 12345 4
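Notes:
- With the wrapper OFF, the effect of DYN1 is clear: 1.44M → 4.57M ops/s (about 3.2x).
- With the wrapper ON, the cap/steal/bundle tuning has largely eliminated the regression, although DYN1=ON (4.18M ops/s) still trails DYN1=OFF (4.66M ops/s) by about 10%. Next steps: fine-tune the DYN1 CAP initial value, the bundle lower bound, and the steal width.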
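To make the comparison concrete, the ratios behind these notes can be recomputed from the ops/s figures listed above. This is a small sketch using awk; the function name ratio is hypothetical and the inputs are the values from this page (in M ops/s).

```shell
#!/bin/sh
# Hypothetical helper: ratio of two throughput figures, two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

ratio 4.57 1.44   # wrapper OFF: DYN1=ON vs DYN1=OFF, prints 3.17
ratio 4.18 4.66   # wrapper ON:  DYN1=ON vs DYN1=OFF, prints 0.90
```

These are single 1-second runs, so the ratios are indicative only.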