hakmem/DOCS_INDEX.md at d355041638d0b77d0bf2f8b4928e8201dbd52f95




HAKMEM Docs Index (2025-10-29)

Purpose

One‑page map for current work: how to build, run, compare, and tune.
Focus on Tiny fast‑path tuning vs system/mimalloc, with safe LD guidance.

Quick Build

Direct link (recommended for perf tuning)
- make bench_fast
- Run: HAKMEM_WRAP_TINY=1 ./bench_comprehensive_hakmem
PGO (direct link)
- ./build_pgo.sh (profile+build)
- Run: HAKMEM_WRAP_TINY=1 ./bench_comprehensive_hakmem
Shared (LD_PRELOAD) PGO
- make pgo-profile-shared && make pgo-build-shared
- Run: HAKMEM_WRAP_TINY=1 LD_PRELOAD=./libhakmem.so ./bench_comprehensive_system

Direct‑Link Comparisons (CSV)

Pair (HAKMEM vs mimalloc): bash scripts/run_comprehensive_pair.sh
- CSV: bench_results/comp_pair_YYYYMMDD_HHMMSS/summary.csv
Tiny hot triad (HAKMEM/System/mimalloc): bash scripts/run_tiny_hot_triad.sh 80000
- CSV: bench_results/tiny_hot_triad_YYYYMMDD_HHMMSS/results.csv
Random mixed triad: bash scripts/run_random_mixed_matrix.sh 120000
- CSV: bench_results/random_mixed_YYYYMMDD_HHMMSS/results.csv
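Each comparison script writes to a fresh timestamped directory, so finding the newest CSV is a recurring chore. A minimal sketch, assuming the directories follow the `<prefix>_YYYYMMDD_HHMMSS` pattern shown above (so lexicographic order is also chronological); `latest_result_dir` is a hypothetical helper, not part of the repository:

```shell
# Print the newest timestamped result directory for a given prefix.
# Assumes names of the form <root>/<prefix>_YYYYMMDD_HHMMSS.
latest_result_dir() {
    root=$1      # e.g. bench_results
    prefix=$2    # e.g. comp_pair
    newest=
    for d in "$root/${prefix}"_*/; do
        [ -d "$d" ] || continue   # the glob matched nothing
        newest=${d%/}             # globs expand in sorted order; keep the last
    done
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}
```

For example, `head -n 5 "$(latest_result_dir bench_results comp_pair)/summary.csv"` would show the top of the most recent pair comparison.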

Perf‑Main preset (safe, mainline‑oriented)

Build + run triad: bash scripts/run_perf_main_triad.sh 60000
- Applies recommended tiny env (TLS_SLL=1, REFILL_MAX=96, HOT=192, HYST=16) without bench‑only macros.

Tiny param sweeps

Basic: bash scripts/sweep_tiny_params.sh 100000
Advanced (SLL multiplier, refill, per‑class MAG, etc.): bash scripts/sweep_tiny_advanced.sh 80000 --mag64-512

LD_PRELOAD Apps (opt‑in)

Script: bash scripts/run_apps_with_hakmem.sh
Default safety: the script sets HAKMEM_LD_SAFE=2 (pass‑through), then turns LD_PRELOAD on per case.
Recommendation: use direct‑link for perf; LD runs are for stability sampling only.
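The "per‑case LD_PRELOAD" idea can be sketched as a wrapper that preloads the library for a single command only, so the calling shell is never itself running under the allocator. This is an illustration, not the contents of scripts/run_apps_with_hakmem.sh; `run_preloaded` is a hypothetical name, and the library path is whatever your build produced:

```shell
# Run one command with the allocator preloaded and HAKMEM_LD_SAFE=2,
# leaving the calling shell's environment untouched.
run_preloaded() {
    lib=$1
    shift
    if [ ! -f "$lib" ]; then
        echo "run_preloaded: library not found: $lib" >&2
        return 1
    fi
    # env sets the variables only for the child process it executes.
    env HAKMEM_LD_SAFE=2 LD_PRELOAD="$lib" "$@"
}
```

Usage would look like `run_preloaded ./libhakmem.so ./bench_comprehensive_system`. Going through `env` keeps the preload scoped to one process tree, which matches the opt‑in intent of this section.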

Tiny Modes and Knobs

Normal (default): TLS magazine + TLS SLL (≤256B)
- HAKMEM_TINY_TLS_SLL=1 (default)
- HAKMEM_TINY_MAG_CAP=128 (good tiny bench preset; 64B may prefer 512)
TinyQuickSlot (minimal front end; experimental)
- HAKMEM_TINY_QUICK=1
- Keeps items[6] in a single cache line. On a miss, refills a small batch from the SLL/Mag and returns immediately.
Ultra (SLL‑only, experimental):
- HAKMEM_TINY_ULTRA=1 (opt‑in)
- HAKMEM_TINY_ULTRA_VALIDATE=0/1 (perf vs safety)
- Per‑class overrides: HAKMEM_TINY_ULTRA_BATCH_C{0..7}, HAKMEM_TINY_ULTRA_SLL_CAP_C{0..7}
FLINT (Fast Lightweight INTelligence): Frontend + deferred Intelligence (experimental)
- HAKMEM_TINY_FRONTEND=1 (enable array FastCache; miss falls back)
- HAKMEM_TINY_FASTCACHE=1 (low‑level switch; keep OFF unless A/B)
- HAKMEM_INT_ENGINE=1 (event ring + BG thread adjusts fill targets)
- Event extension (internal): timestamp/tier/flags/site_id/thread are accumulated in the ring (off the hot path), for use in future adaptation.
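The per‑class Ultra overrides listed above come as eight variables per knob. A small loop avoids typing them out when the same value should apply to every class. This sketch assumes the `C{0..7}` notation means the suffixes `C0` through `C7`; `set_ultra_per_class` is a hypothetical helper, and any numbers passed to it are placeholders rather than tuned recommendations:

```shell
# Export one Ultra batch size and one SLL cap for every tiny class C0..C7.
set_ultra_per_class() {
    batch=$1
    sll_cap=$2
    c=0
    while [ "$c" -le 7 ]; do
        export "HAKMEM_TINY_ULTRA_BATCH_C${c}=${batch}"
        export "HAKMEM_TINY_ULTRA_SLL_CAP_C${c}=${sll_cap}"
        c=$((c + 1))
    done
}
```

After `set_ultra_per_class 32 256`, individual classes can still be overridden by exporting a single variable again. Ultra itself remains opt‑in, so `HAKMEM_TINY_ULTRA=1` must also be set for these to matter.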

Best‑Known Presets (direct link)

Tiny hot focus
- export HAKMEM_WRAP_TINY=1
- export HAKMEM_TINY_TLS_SLL=1
- export HAKMEM_TINY_MAG_CAP=128 (64B: try 512)
- export HAKMEM_TINY_REMOTE_DRAIN_TRYRATE=0
- export HAKMEM_TINY_REMOTE_DRAIN_THRESHOLD=1000000
Memory efficiency A/B
- export HAKMEM_TINY_FLUSH_ON_EXIT=1
- Run bench/app; compare steady‑state RSS with/without.
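To keep the "Tiny hot focus" preset from leaking into later runs, it can be applied to one command at a time instead of being exported into the shell. A minimal sketch using only the variables listed above; `with_tiny_hot_preset` and the `MAG_CAP` override are hypothetical names introduced for illustration:

```shell
# Run a command under the "Tiny hot focus" preset without modifying
# the calling shell. MAG_CAP (hypothetical) overrides the magazine cap,
# e.g. MAG_CAP=512 when focusing on 64B.
with_tiny_hot_preset() {
    env HAKMEM_WRAP_TINY=1 \
        HAKMEM_TINY_TLS_SLL=1 \
        HAKMEM_TINY_MAG_CAP="${MAG_CAP:-128}" \
        HAKMEM_TINY_REMOTE_DRAIN_TRYRATE=0 \
        HAKMEM_TINY_REMOTE_DRAIN_THRESHOLD=1000000 \
        "$@"
}
```

For example, `with_tiny_hot_preset ./bench_comprehensive_hakmem` runs the direct‑link benchmark under the preset, and the next command in the same shell starts from a clean environment.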

Refill Batch (A/B)

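HAKMEM_TINY_REFILL_MAX_HOT (default 192) / HAKMEM_TINY_REFILL_MAX (default 64)
Search for the peak in the small size band (8/16/32B). In the current environment, values near the defaults perform best.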
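An A/B over the refill batch size can be sketched as a loop that reruns the same command with different values of HAKMEM_TINY_REFILL_MAX while holding HAKMEM_TINY_REFILL_MAX_HOT at its default. The candidate values below are illustrative points around the default of 64, not measured optima, and `sweep_refill_max` is a hypothetical helper:

```shell
# Rerun a command once per candidate HAKMEM_TINY_REFILL_MAX value.
# Each run is prefixed with the value used so results can be matched up.
sweep_refill_max() {
    for max in 32 48 64 96 128; do
        printf 'REFILL_MAX=%s ' "$max"
        env HAKMEM_TINY_REFILL_MAX="$max" \
            HAKMEM_TINY_REFILL_MAX_HOT=192 \
            "$@"
    done
}
```

Invoke it with the benchmark command as arguments, for example `sweep_refill_max ./bench_comprehensive_hakmem`. For systematic sweeps, the repository's own sweep scripts are the better starting point.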

Current Results (high level)

Tiny hot triad (Perf‑Main, 60–80k cycles, safe):
- 16–64B: System ≈ 300–335 M; HAKMEM ≈ 250–300 M; mimalloc 535–620 M.
- 128B: HAKMEM ≈ 250–270 M; System 170–176 M; mimalloc 575–586 M.
Comprehensive (direct link): mimalloc ≈ 0.9–1.0B; HAKMEM ≈ 0.25–0.27B.
Random mixed: three close; mimalloc slightly ahead; HAKMEM ≈ System ± a few %.

Bench‑only highlight (reference values, dedicated build)

With SLL‑only + warmup + PGO (≤64B), 8–24B exceeds 400M, and 32B/b100 peaks at 429.18M (System: 312.55M).
- Run: bash scripts/run_tiny_sllonly_triad.sh 30000 (not included in the safe normal build)

Open Focus

Close the 16–64B gap (cap/batch tuning; SLL/mini‑mag overhead shave).
Ultra (opt‑in) stabilization; A/B vs normal.
Frontend refill heuristics; BG engine stop/join wiring (added).

Mid Range MT (8-32KB, mimalloc-style)

Status: COMPLETE (2025-11-01) - 110M ops/sec achieved ✅
Quick benchmark: bash benchmarks/scripts/mid/run_mid_mt_bench.sh
Comparison: bash benchmarks/scripts/mid/compare_mid_mt_allocators.sh
Full report: MID_MT_COMPLETION_REPORT.md
Implementation: core/hakmem_mid_mt.{c,h}
Results: 110M ops/sec (100-101% of mimalloc, 2.12x faster than glibc)

ACE Learning Layer (Adaptive Control Engine)

Status: Phase 1 COMPLETE ✅ (2025-11-01) - Infrastructure ready 🚀
Goal: Fix weaknesses with adaptive learning (aiming to surpass mimalloc!)
- Fragmentation stress: 3.87 → 10-20 M ops/s (2.6-5.2x target)
- Large WS: 22.15 → 30-45 M ops/s (1.4-2.0x target)
- realloc: 277ns → 140-210ns (1.3-2.0x target)
Documentation:
- User guide: docs/ACE_LEARNING_LAYER.md ✅
- Technical plan: docs/ACE_LEARNING_LAYER_PLAN.md ✅
- Progress report: ACE_PHASE1_PROGRESS.md ✅
Phase 1 Deliverables (COMPLETE ✅):
- ✅ Metrics collection (hakmem_ace_metrics.{c,h})
- ✅ UCB1 learning algorithm (hakmem_ace_ucb1.{c,h})
- ✅ Dual-loop controller (hakmem_ace_controller.{c,h})
- ✅ Dynamic TLS capacity adjustment
- ✅ Hot-path metrics integration (alloc/free tracking)
- ✅ A/B benchmark script (scripts/bench_ace_ab.sh)
Usage:
- Enable: HAKMEM_ACE_ENABLED=1 ./your_benchmark
- Debug: HAKMEM_ACE_ENABLED=1 HAKMEM_ACE_LOG_LEVEL=2 ./your_benchmark
- A/B test: ./scripts/bench_ace_ab.sh
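For a quick manual A/B outside scripts/bench_ace_ab.sh, the same command can be run once without and once with the enable flag. This sketch assumes that leaving HAKMEM_ACE_ENABLED unset means ACE is off, which the usage above suggests but does not state outright; `ace_ab` is a hypothetical helper and is not the repository script:

```shell
# Run a command twice: first with HAKMEM_ACE_ENABLED unset, then set to 1.
# Subshells and env keep both settings out of the calling shell.
ace_ab() {
    echo "== ACE off =="
    ( unset HAKMEM_ACE_ENABLED; "$@" )
    echo "== ACE on =="
    env HAKMEM_ACE_ENABLED=1 "$@"
}
```

Usage: `ace_ab ./your_benchmark`. Since ACE adjusts capacities as it learns, short runs may not show a difference; longer runs give the controller more time to adapt.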
Next: Phase 2 - Extended benchmarking + learning convergence validation

Directory Structure (2025-11-01 Reorganization)

benchmarks/ - All benchmark-related files
- src/ - Benchmark source code (tiny/mid/comprehensive/stress)
- scripts/ - Benchmark scripts organized by category
- results/ - Benchmark results (formerly bench_results/)
- perf/ - Performance profiling data (formerly perf_data/)
tests/ - Test files (unit/integration/stress)
core/ - Core allocator implementation
docs/ - Documentation (benchmarks/, api/, guides/)
scripts/ - Development scripts (build/, apps/, maintenance/)
archive/ - Historical documents and analysis

Where to Read More

SlabHandle Box: docs/SLAB_HANDLE.md (encapsulates ownership, remote drain, and metadata)
Free Safety: docs/FREE_SAFETY.md (Fail‑Fast on double free / class mismatch, and ring operation)
Cleanup/Organization: CLEANUP_SUMMARY_2025_11_01.md (latest)
Archive: archive/README.md - Historical docs and analysis
Bench mode: BENCH_MODE.md
Env knobs: ENV_VARS.md
Tiny hot microbench: TINY_HOT_BENCH.md
Frontend/Backend split: FRONTEND_BACKEND_PLAN.md
LD status/safety: LD_PRELOAD_STATUS.md
Goals/Targets: GOALS_2025_10_29.md
Latest results: BENCH_RESULTS_2025_10_29.md (today), BENCH_RESULTS_2025_10_28.md (yesterday)
Mainline integration plan: MAINLINE_INTEGRATION.md
FLINT Intelligence (events/adaptation): FLINT_INTELLIGENCE.md

Hako / MIR / FFI

HAKO_MIR_FFI_SPEC.md: specification in which type checking is completed in the front end, MIR only carries the result, and FFI lowering is mechanical

Notes

LD mode: keep HAKMEM_LD_SAFE=2 default for apps; prefer direct‑link for tuning.
Ultra/Frontend are experimental; keep OFF by default and use scripts for A/B.
