HAKMEM Docs Index (2025-10-29)

Purpose
- One‑page map for current work: how to build, run, compare, and tune.
- Focus on Tiny fast‑path tuning vs system/mimalloc, with safe LD guidance.

Quick Build
- Direct link (recommended for perf tuning)
  - `make bench_fast`
  - Run: `HAKMEM_WRAP_TINY=1 ./bench_comprehensive_hakmem`
- PGO (direct link)
  - `./build_pgo.sh` (profile+build)
  - Run: `HAKMEM_WRAP_TINY=1 ./bench_comprehensive_hakmem`
- Shared (LD_PRELOAD) PGO
  - `make pgo-profile-shared && make pgo-build-shared`
  - Run: `HAKMEM_WRAP_TINY=1 LD_PRELOAD=./libhakmem.so ./bench_comprehensive_system`

Direct‑Link Comparisons (CSV)
- Pair (HAKMEM vs mimalloc): `bash scripts/run_comprehensive_pair.sh`
  - CSV: `bench_results/comp_pair_YYYYMMDD_HHMMSS/summary.csv`
- Tiny hot triad (HAKMEM/System/mimalloc): `bash scripts/run_tiny_hot_triad.sh 80000`
  - CSV: `bench_results/tiny_hot_triad_YYYYMMDD_HHMMSS/results.csv`
- Random mixed triad: `bash scripts/run_random_mixed_matrix.sh 120000`
  - CSV: `bench_results/random_mixed_YYYYMMDD_HHMMSS/results.csv`

Perf‑Main preset (safe, mainline‑oriented)
- Build + run triad: `bash scripts/run_perf_main_triad.sh 60000`
- Applies the recommended tiny env (TLS_SLL=1, REFILL_MAX=96, HOT=192, HYST=16) without bench‑only macros.

Tiny param sweeps
- Basic: `bash scripts/sweep_tiny_params.sh 100000`
- Advanced (SLL multiplier, refill, per‑class MAG, etc.): `bash scripts/sweep_tiny_advanced.sh 80000 --mag64-512`

LD_PRELOAD Apps (opt‑in)
- Script: `bash scripts/run_apps_with_hakmem.sh`
- Default safety: the script sets `HAKMEM_LD_SAFE=2` (pass‑through), then turns `LD_PRELOAD` on per case.
- Recommendation: use direct‑link for perf; LD runs are for stability sampling only.
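Each comparison script listed under Direct‑Link Comparisons writes its CSV into a fresh timestamped directory, so a small helper can locate the most recent one. The following is a minimal sketch, not part of the HAKMEM scripts: `latest_csv` is a hypothetical name, and it assumes the `PREFIX_YYYYMMDD_HHMMSS` directory layout shown above, relying on the fact that fixed-width timestamps sort chronologically under the shell's glob ordering.

```shell
#!/usr/bin/env bash
# latest_csv PREFIX FILE [ROOT]
# Hypothetical helper: print the path of FILE inside the newest
# ROOT/PREFIX_YYYYMMDD_HHMMSS directory that actually contains FILE.
# ROOT defaults to bench_results (the location named in this index).
latest_csv() {
    local prefix="$1" file="$2" root="${3:-bench_results}"
    local dir newest=""
    # Glob results are sorted, so the last match has the latest timestamp.
    for dir in "$root/${prefix}"_*/; do
        # Skips directories without FILE; also skips the unexpanded
        # pattern when nothing matches.
        [ -f "${dir}${file}" ] || continue
        newest="${dir}${file}"
    done
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}
```

For example, `latest_csv comp_pair summary.csv` would print the newest pair summary, and `latest_csv tiny_hot_triad results.csv` the newest triad result. The function returns a non-zero status when no run has produced the file yet.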

Tiny Modes and Knobs
- Normal (default): TLS magazine + TLS SLL (≤256B)
  - `HAKMEM_TINY_TLS_SLL=1` (default)
  - `HAKMEM_TINY_MAG_CAP=128` (good tiny bench preset; 64B may prefer 512)
- TinyQuickSlot (minimal front end; experimental)
  - `HAKMEM_TINY_QUICK=1`
  - Keeps items[6] in one cache line. On a miss, refills a small number of items from SLL/Mag and returns immediately.
- Ultra (SLL‑only, experimental):
  - `HAKMEM_TINY_ULTRA=1` (opt‑in)
  - `HAKMEM_TINY_ULTRA_VALIDATE=0/1` (perf vs safety)
  - Per‑class overrides: `HAKMEM_TINY_ULTRA_BATCH_C{0..7}`, `HAKMEM_TINY_ULTRA_SLL_CAP_C{0..7}`
- FLINT (Fast Lightweight INTelligence): Frontend + deferred Intelligence (experimental)
  - `HAKMEM_TINY_FRONTEND=1` (enable array FastCache; miss falls back)
  - `HAKMEM_TINY_FASTCACHE=1` (low‑level switch; keep OFF unless A/B)
  - `HAKMEM_INT_ENGINE=1` (event ring + BG thread adjusts fill targets)
  - Event extension (internal): accumulates timestamp/tier/flags/site_id/thread in the ring (off the hot path), for use in future adaptation.

Best‑Known Presets (direct link)
- Tiny hot focus
  - `export HAKMEM_WRAP_TINY=1`
  - `export HAKMEM_TINY_TLS_SLL=1`
  - `export HAKMEM_TINY_MAG_CAP=128` (64B: try 512)
  - `export HAKMEM_TINY_REMOTE_DRAIN_TRYRATE=0`
  - `export HAKMEM_TINY_REMOTE_DRAIN_THRESHOLD=1000000`
- Memory efficiency A/B
  - `export HAKMEM_TINY_FLUSH_ON_EXIT=1`
  - Run bench/app; compare steady‑state RSS with/without.

Refill Batch (A/B)
- `HAKMEM_TINY_REFILL_MAX_HOT` (default 192) / `HAKMEM_TINY_REFILL_MAX` (default 64)
- Search for the peak in the small size classes (8/16/32B). In the current environment, values near the defaults perform best.

Current Results (high level)
- Tiny hot triad (Perf‑Main, 60–80k cycles, safe):
  - 16–64B: System ≈ 300–335 M; HAKMEM ≈ 250–300 M; mimalloc 535–620 M.
  - 128B: HAKMEM ≈ 250–270 M; System 170–176 M; mimalloc 575–586 M.
- Comprehensive (direct link): mimalloc ≈ 0.9–1.0B; HAKMEM ≈ 0.25–0.27B.
- Random mixed: all three are close; mimalloc slightly ahead; HAKMEM ≈ System ± a few %.
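The "Tiny hot focus" preset under Best‑Known Presets, together with its "64B: try 512" hint, can be wrapped so that an A/B run does not leak settings into the calling shell. This is only a sketch: `tiny_hot_preset` and `sweep_mag_cap` are hypothetical helper names, not existing HAKMEM scripts. The variable names and values are the ones listed in the preset.

```shell
#!/usr/bin/env bash
# Hypothetical helper: export the "Tiny hot focus" preset from this index.
# Optional argument: magazine capacity (default 128, as in the preset).
tiny_hot_preset() {
    local mag_cap="${1:-128}"
    export HAKMEM_WRAP_TINY=1
    export HAKMEM_TINY_TLS_SLL=1
    export HAKMEM_TINY_MAG_CAP="$mag_cap"
    export HAKMEM_TINY_REMOTE_DRAIN_TRYRATE=0
    export HAKMEM_TINY_REMOTE_DRAIN_THRESHOLD=1000000
}

# Hypothetical helper: run the given command once with MAG_CAP=128 and once
# with MAG_CAP=512. Each run happens in a subshell, so the exported
# variables disappear when the run finishes.
sweep_mag_cap() {
    local cap
    for cap in 128 512; do
        (
            tiny_hot_preset "$cap"
            echo "MAG_CAP=$cap"
            "$@"
        )
    done
}
```

A typical call would be `sweep_mag_cap ./bench_comprehensive_hakmem`, which prints a `MAG_CAP=...` marker line before each run's output so the two runs can be told apart.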

Bench‑only highlight (reference values, dedicated build)
- With SLL‑only + warmup + PGO (≤64B), 8–24B exceeds 400M; 32B/b100 peaks at 429.18M (System 312.55M).
- Run: `bash scripts/run_tiny_sllonly_triad.sh 30000` (not included in the safe normal build)

Open Focus
- Close the 16–64B gap (cap/batch tuning; SLL/mini‑mag overhead shave).
- Ultra (opt‑in) stabilization; A/B vs normal.
- Frontend refill heuristics; BG engine stop/join wiring (added).

Mid Range MT (8-32KB, mimalloc-style)
- **Status**: COMPLETE (2025-11-01) - 110M ops/sec achieved ✅
- Quick benchmark: `bash benchmarks/scripts/mid/run_mid_mt_bench.sh`
- Comparison: `bash benchmarks/scripts/mid/compare_mid_mt_allocators.sh`
- Full report: `MID_MT_COMPLETION_REPORT.md`
- Implementation: `core/hakmem_mid_mt.{c,h}`
- Results: 110M ops/sec (100-101% of mimalloc, 2.12x faster than glibc)

ACE Learning Layer (Adaptive Control Engine)
- **Status**: Phase 1 COMPLETE ✅ (2025-11-01) - Infrastructure ready 🚀
- **Goal**: Fix weaknesses with adaptive learning (aiming to beat mimalloc)
  - Fragmentation stress: 3.87 → 10-20 M ops/s (2.6-5.2x target)
  - Large WS: 22.15 → 30-45 M ops/s (1.4-2.0x target)
  - realloc: 277ns → 140-210ns (1.3-2.0x target)
- **Documentation**:
  - User guide: `docs/ACE_LEARNING_LAYER.md` ✅
  - Technical plan: `docs/ACE_LEARNING_LAYER_PLAN.md` ✅
  - Progress report: `ACE_PHASE1_PROGRESS.md` ✅
- **Phase 1 Deliverables** (COMPLETE ✅):
  - ✅ Metrics collection (`hakmem_ace_metrics.{c,h}`)
  - ✅ UCB1 learning algorithm (`hakmem_ace_ucb1.{c,h}`)
  - ✅ Dual-loop controller (`hakmem_ace_controller.{c,h}`)
  - ✅ Dynamic TLS capacity adjustment
  - ✅ Hot-path metrics integration (alloc/free tracking)
  - ✅ A/B benchmark script (`scripts/bench_ace_ab.sh`)
- **Usage**:
  - Enable: `HAKMEM_ACE_ENABLED=1 ./your_benchmark`
  - Debug: `HAKMEM_ACE_ENABLED=1 HAKMEM_ACE_LOG_LEVEL=2 ./your_benchmark`
  - A/B test: `./scripts/bench_ace_ab.sh`
- **Next**: Phase 2 - Extended benchmarking + learning convergence validation

Directory Structure (2025-11-01 Reorganization)
- **benchmarks/** - All benchmark-related files
  - `src/` - Benchmark source code (tiny/mid/comprehensive/stress)
  - `scripts/` - Benchmark scripts organized by category
  - `results/` - Benchmark results (formerly bench_results/)
  - `perf/` - Performance profiling data (formerly perf_data/)
- **tests/** - Test files (unit/integration/stress)
- **core/** - Core allocator implementation
- **docs/** - Documentation (benchmarks/, api/, guides/)
- **scripts/** - Development scripts (build/, apps/, maintenance/)
- **archive/** - Historical documents and analysis

Where to Read More
- **SlabHandle Box**: `docs/SLAB_HANDLE.md` (encapsulates ownership + remote drain + metadata)
- **Free Safety**: `docs/FREE_SAFETY.md` (fail‑fast on double free / class mismatch, and ring operation)
- **Cleanup/Organization**: `CLEANUP_SUMMARY_2025_11_01.md` (latest)
- **Archive**: `archive/README.md` - Historical docs and analysis
- Bench mode: `BENCH_MODE.md`
- Env knobs: `ENV_VARS.md`
- Tiny hot microbench: `TINY_HOT_BENCH.md`
- Frontend/Backend split: `FRONTEND_BACKEND_PLAN.md`
- LD status/safety: `LD_PRELOAD_STATUS.md`
- Goals/Targets: `GOALS_2025_10_29.md`
- Latest results: `BENCH_RESULTS_2025_10_29.md` (today), `BENCH_RESULTS_2025_10_28.md` (yesterday)
- Mainline integration plan: `MAINLINE_INTEGRATION.md`
- FLINT Intelligence (events/adaptation): `FLINT_INTELLIGENCE.md`

Hako / MIR / FFI
- `HAKO_MIR_FFI_SPEC.md`: spec in which type verification completes in the front end, MIR only carries data through, and FFI is lowered mechanically.

Notes
- LD mode: keep `HAKMEM_LD_SAFE=2` default for apps; prefer direct‑link for tuning.
- Ultra/Frontend are experimental; keep OFF by default and use scripts for A/B.
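
The LD‑mode note can be illustrated with a small wrapper that preloads the library for a single command only and leaves `HAKMEM_LD_SAFE` at 2 unless the caller overrides it. This is a sketch under assumptions: `run_with_hakmem` is a hypothetical name and is not the logic of `scripts/run_apps_with_hakmem.sh`, whose contents are not shown in this index.

```shell
#!/usr/bin/env bash
# run_with_hakmem LIB COMMAND [ARGS...]
# Hypothetical helper: run COMMAND with LIB in LD_PRELOAD.
# HAKMEM_LD_SAFE keeps the default of 2 recommended in the Notes unless the
# caller has already set it to something else.
run_with_hakmem() {
    if [ "$#" -lt 2 ]; then
        echo "usage: run_with_hakmem LIB COMMAND [ARGS...]" >&2
        return 2
    fi
    local lib="$1"
    shift
    if [ ! -f "$lib" ]; then
        echo "run_with_hakmem: library not found: $lib" >&2
        return 1
    fi
    # Both variables are set for this one command only, so the calling
    # shell itself is never run under LD_PRELOAD.
    HAKMEM_LD_SAFE="${HAKMEM_LD_SAFE:-2}" LD_PRELOAD="$lib" "$@"
}
```

For instance, `run_with_hakmem ./libhakmem.so ./bench_comprehensive_system` would mirror the shared‑library run from Quick Build while keeping the safety level explicit; `HAKMEM_LD_SAFE=1 run_with_hakmem ...` would pass a caller‑chosen level through unchanged.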