HAKMEM Docs Index (2025-10-29)
Purpose
- One‑page map for current work: how to build, run, compare, and tune.
- Focus on Tiny fast‑path tuning vs system/mimalloc, with safe LD guidance.
Quick Build
- Direct link (recommended for perf tuning)
  - make bench_fast
  - Run: HAKMEM_WRAP_TINY=1 ./bench_comprehensive_hakmem
- PGO (direct link)
  - ./build_pgo.sh (profile + build)
  - Run: HAKMEM_WRAP_TINY=1 ./bench_comprehensive_hakmem
- Shared (LD_PRELOAD) PGO
  - make pgo-profile-shared && make pgo-build-shared
  - Run: HAKMEM_WRAP_TINY=1 LD_PRELOAD=./libhakmem.so ./bench_comprehensive_system
Direct‑Link Comparisons (CSV)
- Pair (HAKMEM vs mimalloc):
  - bash scripts/run_comprehensive_pair.sh
  - CSV: bench_results/comp_pair_YYYYMMDD_HHMMSS/summary.csv
- Tiny hot triad (HAKMEM/System/mimalloc):
  - bash scripts/run_tiny_hot_triad.sh 80000
  - CSV: bench_results/tiny_hot_triad_YYYYMMDD_HHMMSS/results.csv
- Random mixed triad:
  - bash scripts/run_random_mixed_matrix.sh 120000
  - CSV: bench_results/random_mixed_YYYYMMDD_HHMMSS/results.csv
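Each run writes to a fresh timestamped directory, so finding the newest CSV means finding the newest directory. A minimal sketch, assuming the directories follow the `<prefix>_YYYYMMDD_HHMMSS` naming shown above (so name order equals time order); `latest_run_dir` is a hypothetical helper, not a script in this repository:

```shell
# Sketch: print the newest timestamped run directory for a given prefix.
# Runs in a subshell so its variables do not leak into the caller.
latest_run_dir() (
    root=$1      # results root, e.g. bench_results
    prefix=$2    # run prefix, e.g. comp_pair
    latest=""
    for dir in "$root"/"$prefix"_*/; do
        [ -d "$dir" ] || continue   # skip the literal pattern when nothing matches
        latest=${dir%/}             # globs expand in sorted order; keep the last
    done
    [ -n "$latest" ] || exit 1      # non-zero status when no run exists
    printf '%s\n' "$latest"
)
```

For example, `cat "$(latest_run_dir bench_results comp_pair)/summary.csv"` would print the most recent pair summary.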
Perf‑Main preset (safe, mainline‑oriented)
- Build + run triad:
  - bash scripts/run_perf_main_triad.sh 60000
  - Applies recommended tiny env (TLS_SLL=1, REFILL_MAX=96, HOT=192, HYST=16) without bench‑only macros.
Tiny param sweeps
- Basic: bash scripts/sweep_tiny_params.sh 100000
- Advanced (SLL multiplier, refill, per‑class MAG, etc.): bash scripts/sweep_tiny_advanced.sh 80000 --mag64-512
LD_PRELOAD Apps (opt‑in)
- Script: bash scripts/run_apps_with_hakmem.sh
- Default safety: HAKMEM_LD_SAFE=2 (pass‑through) set in script, then per‑case LD_PRELOAD on.
- Recommendation: use direct‑link for perf; LD runs are for stability sampling only.
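The "per‑case LD_PRELOAD" pattern can be sketched as a wrapper that sets the preload for one child process only, so the calling shell and its other children stay on the system allocator. This is an illustration, not the contents of `run_apps_with_hakmem.sh`; `run_preloaded` is a hypothetical name, and the choice to fail when the library is missing is an assumption:

```shell
# Sketch: run one command with the allocator preloaded, leaving the caller untouched.
# Usage: run_preloaded ./libhakmem.so <command> [args...]
run_preloaded() (
    lib=$1; shift
    if [ ! -f "$lib" ]; then
        # Refuse to run rather than silently measure the system allocator.
        echo "run_preloaded: $lib not found" >&2
        exit 1
    fi
    # Assignments prefixed to a command apply to that command's environment only.
    HAKMEM_LD_SAFE=${HAKMEM_LD_SAFE:-2} LD_PRELOAD=$lib "$@"
)
```

Because the function body is a subshell and the variables are command prefixes, `LD_PRELOAD` never becomes part of the calling shell's environment.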
Tiny Modes and Knobs
- Normal (default): TLS magazine + TLS SLL (≤256B)
  - HAKMEM_TINY_TLS_SLL=1 (default)
  - HAKMEM_TINY_MAG_CAP=128 (good tiny bench preset; 64B may prefer 512)
- TinyQuickSlot (minimal front end; experimental)
  - HAKMEM_TINY_QUICK=1
  - Keeps items[6] within a single (cache) line. On a miss, refills a small number of items from the SLL/Mag and returns immediately.
- Ultra (SLL‑only, experimental):
  - HAKMEM_TINY_ULTRA=1 (opt‑in)
  - HAKMEM_TINY_ULTRA_VALIDATE=0/1 (perf vs safety)
  - Per‑class overrides: HAKMEM_TINY_ULTRA_BATCH_C{0..7}, HAKMEM_TINY_ULTRA_SLL_CAP_C{0..7}
- FLINT (Fast Lightweight INTelligence): Frontend + deferred Intelligence (experimental)
  - HAKMEM_TINY_FRONTEND=1 (enable array FastCache; miss falls back)
  - HAKMEM_TINY_FASTCACHE=1 (low‑level switch; keep OFF unless A/B)
  - HAKMEM_INT_ENGINE=1 (event ring + BG thread adjusts fill targets)
  - Event extension (internal): timestamp/tier/flags/site_id/thread are accumulated in the ring (off the hot path) for use in future adaptation.
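When switching between these modes for A/B runs, it can help to keep the knob sets in one place. The sketch below maps a mode name to the variables listed above; `tiny_mode_env` and the mode names are hypothetical, and it assumes each mode is enabled by exactly the switches shown (per‑class overrides and VALIDATE are left out):

```shell
# Sketch: print the env assignments for a named tiny mode.
tiny_mode_env() {
    case "$1" in
        normal) echo "HAKMEM_TINY_TLS_SLL=1 HAKMEM_TINY_MAG_CAP=128" ;;
        quick)  echo "HAKMEM_TINY_QUICK=1" ;;
        ultra)  echo "HAKMEM_TINY_ULTRA=1" ;;
        flint)  echo "HAKMEM_TINY_FRONTEND=1 HAKMEM_INT_ENGINE=1" ;;
        *)      echo "unknown tiny mode: $1" >&2; return 1 ;;
    esac
}
```

A run would then look like `env $(tiny_mode_env ultra) HAKMEM_WRAP_TINY=1 ./bench_comprehensive_hakmem`; the unquoted substitution is intentional so each assignment becomes a separate argument to `env`.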
Best‑Known Presets (direct link)
- Tiny hot focus
  - export HAKMEM_WRAP_TINY=1
  - export HAKMEM_TINY_TLS_SLL=1
  - export HAKMEM_TINY_MAG_CAP=128 (64B: try 512)
  - export HAKMEM_TINY_REMOTE_DRAIN_TRYRATE=0
  - export HAKMEM_TINY_REMOTE_DRAIN_THRESHOLD=1000000
- Memory efficiency A/B
  - export HAKMEM_TINY_FLUSH_ON_EXIT=1
  - Run bench/app; compare steady‑state RSS with/without.
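For the memory‑efficiency A/B, one rough way to compare RSS is to sample `VmRSS` from `/proc` at a fixed point in the run. This is a Linux‑only sketch; `rss_kb` and `measure_rss` are hypothetical helpers, and a single sample after a fixed delay is only a proxy for steady state:

```shell
# Sketch (Linux only): sample a running process's resident set size.

# Print VmRSS in kB for the given PID.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Start a command, wait <delay> seconds, print its RSS in kB, then let it finish.
# Usage: measure_rss <delay-seconds> <command> [args...]
measure_rss() (
    delay=$1; shift
    "$@" >/dev/null 2>&1 &
    pid=$!
    sleep "$delay"
    rss=$(rss_kb "$pid" 2>/dev/null) || rss=""
    if [ -z "$rss" ]; then
        # The process already exited (or is a zombie), so there is nothing to sample.
        echo "measure_rss: process exited before sampling" >&2
        wait "$pid"
        exit 1
    fi
    printf '%s\n' "$rss"
    wait "$pid"
)
```

The two arms could then be compared with `measure_rss 5 ./bench_comprehensive_hakmem` against `measure_rss 5 env HAKMEM_TINY_FLUSH_ON_EXIT=1 ./bench_comprehensive_hakmem`; `env` replaces itself with the benchmark, so the sampled PID is still the benchmark's.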
Refill Batch (A/B)
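- HAKMEM_TINY_REFILL_MAX_HOT (default 192) / HAKMEM_TINY_REFILL_MAX (default 64)
- Search for the peak in the small size classes (8/16/32B). In the current environment, values near the defaults perform best.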
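A peak search over the two refill knobs amounts to running the same benchmark across a small grid around the defaults. The sketch below is not one of the repository's sweep scripts; `sweep_refill` is a hypothetical name and the grid values other than the defaults (192 and 64) are illustrative:

```shell
# Sketch: run one command over a 3x3 grid of refill settings.
# Usage: sweep_refill <command> [args...]
sweep_refill() (
    for hot in 128 192 256; do
        for max in 48 64 96; do
            printf 'REFILL_MAX_HOT=%s REFILL_MAX=%s\n' "$hot" "$max"
            # The assignments apply to this one invocation only.
            HAKMEM_TINY_REFILL_MAX_HOT=$hot HAKMEM_TINY_REFILL_MAX=$max "$@" || exit 1
        done
    done
)
```

Each grid point prints a label line followed by the command's own output, so the result can be redirected to a file and compared by eye or parsed later.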
Current Results (high level)
- Tiny hot triad (Perf‑Main, 60–80k cycles, safe):
  - 16–64B: System ≈ 300–335 M; HAKMEM ≈ 250–300 M; mimalloc 535–620 M.
  - 128B: HAKMEM ≈ 250–270 M; System 170–176 M; mimalloc 575–586 M.
- Comprehensive (direct link): mimalloc ≈ 0.9–1.0B; HAKMEM ≈ 0.25–0.27B.
- Random mixed: three close; mimalloc slightly ahead; HAKMEM ≈ System ± a few %.
Bench‑only highlight (reference values, dedicated build)
- SLL‑only + warmup + PGO (≤64B): 8–24B exceeds 400M; 32B/b100 peaks at 429.18M (System: 312.55M).
- Run: bash scripts/run_tiny_sllonly_triad.sh 30000 (not included in the safe normal build)
Open Focus
- Close the 16–64B gap (cap/batch tuning; SLL/mini‑mag overhead shave).
- Ultra (opt‑in) stabilization; A/B vs normal.
- Frontend refill heuristics; BG engine stop/join wiring (added).
Mid Range MT (8-32KB, mimalloc-style)
- Status: COMPLETE (2025-11-01) - 110M ops/sec achieved ✅
- Quick benchmark: bash benchmarks/scripts/mid/run_mid_mt_bench.sh
- Comparison: bash benchmarks/scripts/mid/compare_mid_mt_allocators.sh
- Full report: MID_MT_COMPLETION_REPORT.md
- Implementation: core/hakmem_mid_mt.{c,h}
- Results: 110M ops/sec (100-101% of mimalloc, 2.12x faster than glibc)
ACE Learning Layer (Adaptive Control Engine)
- Status: Phase 1 COMPLETE ✅ (2025-11-01) - Infrastructure ready 🚀
- Goal: Fix weaknesses with adaptive learning (aiming to surpass mimalloc)
  - Fragmentation stress: 3.87 → 10-20 M ops/s (2.6-5.2x target)
  - Large WS: 22.15 → 30-45 M ops/s (1.4-2.0x target)
  - realloc: 277ns → 140-210ns (1.3-2.0x target)
- Documentation:
  - User guide: docs/ACE_LEARNING_LAYER.md ✅
  - Technical plan: docs/ACE_LEARNING_LAYER_PLAN.md ✅
  - Progress report: ACE_PHASE1_PROGRESS.md ✅
- Phase 1 Deliverables (COMPLETE ✅):
  - ✅ Metrics collection (hakmem_ace_metrics.{c,h})
  - ✅ UCB1 learning algorithm (hakmem_ace_ucb1.{c,h})
  - ✅ Dual-loop controller (hakmem_ace_controller.{c,h})
  - ✅ Dynamic TLS capacity adjustment
  - ✅ Hot-path metrics integration (alloc/free tracking)
  - ✅ A/B benchmark script (scripts/bench_ace_ab.sh)
- Usage:
  - Enable: HAKMEM_ACE_ENABLED=1 ./your_benchmark
  - Debug: HAKMEM_ACE_ENABLED=1 HAKMEM_ACE_LOG_LEVEL=2 ./your_benchmark
  - A/B test: ./scripts/bench_ace_ab.sh
- Next: Phase 2 - Extended benchmarking + learning convergence validation
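The deliverables name UCB1 without stating its scoring rule. For reference, the textbook UCB1 rule picks, at step $t$, the arm with the highest optimistic estimate. Whether hakmem_ace_ucb1 uses exactly this form, this exploration constant, or this reward scaling is not stated on this page, so treat the mapping to ACE (arms as candidate settings such as TLS capacities, reward as a normalized throughput signal) as an assumption:

```latex
a_t = \arg\max_{i} \left( \bar{x}_i + \sqrt{\frac{2 \ln t}{n_i}} \right)
```

Here $\bar{x}_i$ is the mean observed reward of arm $i$ (scaled to $[0, 1]$), $n_i$ is how many times arm $i$ has been tried, and $t$ is the total number of trials. As a worked example with $t = 100$: an arm with $\bar{x} = 0.6$ and $n = 10$ scores about $0.6 + 0.96 = 1.56$, while an arm with $\bar{x} = 0.7$ and $n = 90$ scores about $0.7 + 0.32 = 1.02$, so the less-explored arm is tried next even though its current mean is lower.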
Directory Structure (2025-11-01 Reorganization)
- benchmarks/ - All benchmark-related files
  - src/ - Benchmark source code (tiny/mid/comprehensive/stress)
  - scripts/ - Benchmark scripts organized by category
  - results/ - Benchmark results (formerly bench_results/)
  - perf/ - Performance profiling data (formerly perf_data/)
- tests/ - Test files (unit/integration/stress)
- core/ - Core allocator implementation
- docs/ - Documentation (benchmarks/, api/, guides/)
- scripts/ - Development scripts (build/, apps/, maintenance/)
- archive/ - Historical documents and analysis
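Older notes and scripts may still refer to the pre‑reorganization locations. The two renames stated above can be captured in a small path translator; `to_new_layout` is a hypothetical helper, and it assumes the directory contents kept their relative layout after the move:

```shell
# Sketch: rewrite a pre-reorganization path to the 2025-11-01 layout.
to_new_layout() {
    case "$1" in
        bench_results/*) printf 'benchmarks/results/%s\n' "${1#bench_results/}" ;;
        perf_data/*)     printf 'benchmarks/perf/%s\n' "${1#perf_data/}" ;;
        *)               printf '%s\n' "$1" ;;   # everything else is unchanged
    esac
}
```

For example, `to_new_layout bench_results/comp_pair_YYYYMMDD_HHMMSS/summary.csv` prints the same path under `benchmarks/results/`.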
Where to Read More
- SlabHandle Box: docs/SLAB_HANDLE.md (encapsulates ownership, remote drain, and metadata)
- Free Safety: docs/FREE_SAFETY.md (fail‑fast on double free and class mismatch, plus ring operation)
- Cleanup/Organization: CLEANUP_SUMMARY_2025_11_01.md (latest)
- Archive: archive/README.md - Historical docs and analysis
- Bench mode: BENCH_MODE.md
- Env knobs: ENV_VARS.md
- Tiny hot microbench: TINY_HOT_BENCH.md
- Frontend/Backend split: FRONTEND_BACKEND_PLAN.md
- LD status/safety: LD_PRELOAD_STATUS.md
- Goals/Targets: GOALS_2025_10_29.md
- Latest results: BENCH_RESULTS_2025_10_29.md (today), BENCH_RESULTS_2025_10_28.md (yesterday)
- Mainline integration plan: MAINLINE_INTEGRATION.md
- FLINT Intelligence (events/adaptation): FLINT_INTELLIGENCE.md
Hako / MIR / FFI
- HAKO_MIR_FFI_SPEC.md: specification in which type checking is completed in the front end, MIR only carries the result, and FFI lowering is mechanical.
Notes
- LD mode: keep HAKMEM_LD_SAFE=2 default for apps; prefer direct‑link for tuning.
- Ultra/Frontend are experimental; keep OFF by default and use scripts for A/B.