hakmem/docs/specs/DOCS_INDEX.md
HAKMEM Docs Index (2025-10-29)
Purpose
- One-page map for current work: how to build, run, compare, and tune.
- Focus on Tiny fast-path tuning vs system/mimalloc, with safe LD guidance.
Quick Build
- Direct link (recommended for perf tuning)
- `make bench_fast`
- Run: `HAKMEM_WRAP_TINY=1 ./bench_comprehensive_hakmem`
- PGO (direct link)
- `./build_pgo.sh` (profile+build)
- Run: `HAKMEM_WRAP_TINY=1 ./bench_comprehensive_hakmem`
- Shared (LD_PRELOAD) PGO
- `make pgo-profile-shared && make pgo-build-shared`
- Run: `HAKMEM_WRAP_TINY=1 LD_PRELOAD=./libhakmem.so ./bench_comprehensive_system`
Direct-Link Comparisons (CSV)
- Pair (HAKMEM vs mimalloc): `bash scripts/run_comprehensive_pair.sh`
- CSV: `bench_results/comp_pair_YYYYMMDD_HHMMSS/summary.csv`
- Tiny hot triad (HAKMEM/System/mimalloc): `bash scripts/run_tiny_hot_triad.sh 80000`
- CSV: `bench_results/tiny_hot_triad_YYYYMMDD_HHMMSS/results.csv`
- Random mixed triad: `bash scripts/run_random_mixed_matrix.sh 120000`
- CSV: `bench_results/random_mixed_YYYYMMDD_HHMMSS/results.csv`
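Because each result directory carries a `YYYYMMDD_HHMMSS` suffix, the newest run sorts last lexicographically. A minimal sketch for locating the most recent CSV, assuming only the directory naming shown above; the helper name `latest_csv` is hypothetical and not part of the repository:

```shell
# latest_csv PREFIX FILE [ROOT]
# Prints the path of FILE inside the newest ROOT/PREFIX_YYYYMMDD_HHMMSS directory.
# Hypothetical helper; relies only on the naming pattern listed above.
latest_csv() {
  prefix=$1
  file=$2
  root=${3:-bench_results}
  newest=""
  # Glob results are sorted, and the timestamp suffix sorts chronologically,
  # so the last matching directory is the most recent run.
  for dir in "$root/${prefix}"_*/; do
    [ -d "$dir" ] && newest=$dir
  done
  [ -n "$newest" ] || return 1   # no matching run directory found
  printf '%s%s\n' "$newest" "$file"
}
```

For example, `cat "$(latest_csv comp_pair summary.csv)"` would show the summary of the latest pair run, if one exists.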
Perf-Main preset (safe, mainline-oriented)
- Build + run triad: `bash scripts/run_perf_main_triad.sh 60000`
- Applies recommended tiny env (TLS_SLL=1, REFILL_MAX=96, HOT=192, HYST=16) without bench-only macros.
Tiny param sweeps
- Basic: `bash scripts/sweep_tiny_params.sh 100000`
- Advanced (SLL multiplier, refill, per-class MAG, etc.): `bash scripts/sweep_tiny_advanced.sh 80000 --mag64-512`
LD_PRELOAD Apps (opt-in)
- Script: `bash scripts/run_apps_with_hakmem.sh`
- Default safety: `HAKMEM_LD_SAFE=2` (passthrough) set in script, then per-case `LD_PRELOAD` on.
- Recommendation: use direct-link for perf; LD runs are for stability sampling only.
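The per-case approach can be sketched as a wrapper that sets `LD_PRELOAD` for a single child process only, so the calling shell and later commands stay unaffected. This is an illustration, not the contents of `run_apps_with_hakmem.sh`; the helper name `run_preloaded` is hypothetical:

```shell
# run_preloaded LIB CMD...
# Runs CMD with LD_PRELOAD and HAKMEM_LD_SAFE set for that one process.
# Hypothetical helper; HAKMEM_LD_SAFE falls back to 2, the default named above.
run_preloaded() {
  lib=$1
  shift
  # Refuse to run if the library is missing, rather than silently
  # measuring the system allocator.
  [ -r "$lib" ] || { echo "missing library: $lib" >&2; return 1; }
  HAKMEM_LD_SAFE="${HAKMEM_LD_SAFE:-2}" LD_PRELOAD="$lib" "$@"
}
```

Usage would look like `run_preloaded ./libhakmem.so ./bench_comprehensive_system`.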
Tiny Modes and Knobs
- Normal (default): TLS magazine + TLS SLL (≤256B)
- `HAKMEM_TINY_TLS_SLL=1` (default)
- `HAKMEM_TINY_MAG_CAP=128` (good tiny bench preset; 64B may prefer 512)
- TinyQuickSlot (minimal front end; experimental)
- `HAKMEM_TINY_QUICK=1`
- Keeps items[6] in a single cache line. On a miss, refills a small batch from SLL/Mag and returns immediately.
- Ultra (SLL-only, experimental):
- `HAKMEM_TINY_ULTRA=1` (opt-in)
- `HAKMEM_TINY_ULTRA_VALIDATE=0/1` (perf vs safety)
- Per-class overrides: `HAKMEM_TINY_ULTRA_BATCH_C{0..7}`, `HAKMEM_TINY_ULTRA_SLL_CAP_C{0..7}`
- FLINT (Fast Lightweight INTelligence): Frontend + deferred Intelligence (experimental)
- `HAKMEM_TINY_FRONTEND=1` (enable array FastCache; miss falls back)
- `HAKMEM_TINY_FASTCACHE=1` (low-level switch; keep OFF unless A/B)
- `HAKMEM_INT_ENGINE=1` (event ring + BG thread adjusts fill targets)
- Event extension (internal): timestamp/tier/flags/site_id/thread are accumulated in the ring (off the hot path) for use in future adaptation.
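The per-class override names expand to eight variables each (classes 0 through 7). A small sketch for setting one family in a loop; the helper name `set_per_class` is hypothetical, and the values in the usage line are placeholders for an A/B run, not recommendations:

```shell
# set_per_class VAR_PREFIX V0 V1 ... V7
# Exports VAR_PREFIX0 .. VAR_PREFIX7 from eight positional values.
# Hypothetical helper; the values passed in are the caller's choice.
set_per_class() {
  var_prefix=$1
  shift
  [ "$#" -eq 8 ] || { echo "expected 8 values, got $#" >&2; return 1; }
  class=0
  for value in "$@"; do
    export "${var_prefix}${class}=${value}"
    class=$((class + 1))
  done
}
```

For instance, `set_per_class HAKMEM_TINY_ULTRA_BATCH_C 16 16 16 16 32 32 32 32` exports `HAKMEM_TINY_ULTRA_BATCH_C0` through `HAKMEM_TINY_ULTRA_BATCH_C7`; these only matter when `HAKMEM_TINY_ULTRA=1` is also set.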
Best-Known Presets (direct link)
- Tiny hot focus
- `export HAKMEM_WRAP_TINY=1`
- `export HAKMEM_TINY_TLS_SLL=1`
- `export HAKMEM_TINY_MAG_CAP=128` (64B: try 512)
- `export HAKMEM_TINY_REMOTE_DRAIN_TRYRATE=0`
- `export HAKMEM_TINY_REMOTE_DRAIN_THRESHOLD=1000000`
- Memory efficiency A/B
- `export HAKMEM_TINY_FLUSH_ON_EXIT=1`
- Run bench/app; compare steady-state RSS with/without.
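The "Tiny hot focus" preset can be wrapped in a function so it is applied consistently across runs. The variables and values are the ones listed above; the function name `hakmem_preset_tiny_hot` and the optional argument are assumptions made for illustration:

```shell
# Export the "Tiny hot focus" preset as listed above.
# Hypothetical helper; an optional first argument overrides MAG_CAP
# (for example 512 when testing the 64B case).
hakmem_preset_tiny_hot() {
  export HAKMEM_WRAP_TINY=1
  export HAKMEM_TINY_TLS_SLL=1
  export HAKMEM_TINY_MAG_CAP="${1:-128}"
  export HAKMEM_TINY_REMOTE_DRAIN_TRYRATE=0
  export HAKMEM_TINY_REMOTE_DRAIN_THRESHOLD=1000000
}
```

Running it inside a subshell, as in `( hakmem_preset_tiny_hot; ./bench_comprehensive_hakmem )`, keeps the exports from leaking into later runs.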
Refill Batch (A/B)
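- `HAKMEM_TINY_REFILL_MAX_HOT` (default 192) / `HAKMEM_TINY_REFILL_MAX` (default 64)
- Search for the peak in the small size classes (8/16/32B). In the current environment, values near the defaults perform best.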
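A peak search over the refill batch size can be sketched as a loop that sets the variable for one child process at a time. The candidate values below are placeholders bracketing the default of 64, and the helper name `sweep_refill_max` is hypothetical; the repository's own sweep scripts may work differently:

```shell
# sweep_refill_max CMD...
# Runs CMD once per candidate HAKMEM_TINY_REFILL_MAX value, setting the
# variable only for that child process.
# Candidate values are placeholders around the default (64).
sweep_refill_max() {
  for refill in 32 48 64 96 128; do
    echo "== HAKMEM_TINY_REFILL_MAX=$refill"
    HAKMEM_TINY_REFILL_MAX=$refill "$@"
  done
}
```

A run such as `HAKMEM_WRAP_TINY=1 sweep_refill_max ./bench_comprehensive_hakmem` prints one labelled result block per value; the same pattern applies to `HAKMEM_TINY_REFILL_MAX_HOT`.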
Current Results (high level)
- Tiny hot triad (Perf-Main, 60-80k cycles, safe):
- 16-64B: System ≈ 300-335 M; HAKMEM ≈ 250-300 M; mimalloc 535-620 M.
- 128B: HAKMEM ≈ 250-270 M; System 170-176 M; mimalloc 575-586 M.
- Comprehensive (direct link): mimalloc ≈ 0.9-1.0B; HAKMEM ≈ 0.25-0.27B.
- Random mixed: three close; mimalloc slightly ahead; HAKMEM ≈ System ± a few %.
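To put the gaps in relative terms, the quoted ranges can be turned into worst-case and best-case ratios. This is plain arithmetic on the figures above, not a new measurement; the helper name `ratio_range` is hypothetical:

```shell
# ratio_range A_LO A_HI B_LO B_HI
# Prints the lowest and highest possible ratio A/B for two throughput ranges.
ratio_range() {
  awk -v alo="$1" -v ahi="$2" -v blo="$3" -v bhi="$4" \
    'BEGIN { printf "%.2f-%.2f\n", alo / bhi, ahi / blo }'
}

ratio_range 250 300 535 620    # 16-64B: HAKMEM relative to mimalloc
ratio_range 250 270 170 176    # 128B: HAKMEM relative to System
ratio_range 0.25 0.27 0.9 1.0  # Comprehensive: HAKMEM relative to mimalloc
```

The three calls print `0.40-0.56`, `1.42-1.59`, and `0.25-0.30`: HAKMEM reaches roughly 40-56% of mimalloc at 16-64B, is about 1.4-1.6x System at 128B, and reaches about 25-30% of mimalloc on the comprehensive bench.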
Bench-only highlight (reference values, dedicated build)
- SLL-only + warmup + PGO (≤64B): 8-24B exceeds 400M; 32B/b100 peaks at 429.18M (System: 312.55M).
- Run: `bash scripts/run_tiny_sllonly_triad.sh 30000` (not included in the safe normal build)
Open Focus
- Close the 16-64B gap (cap/batch tuning; SLL/mini-mag overhead shave).
- Ultra (opt-in) stabilization; A/B vs normal.
- Frontend refill heuristics; BG engine stop/join wiring (added).
Mid Range MT (8-32KB, mimalloc-style)
- **Status**: COMPLETE (2025-11-01) - 110M ops/sec achieved ✅
- Quick benchmark: `bash benchmarks/scripts/mid/run_mid_mt_bench.sh`
- Comparison: `bash benchmarks/scripts/mid/compare_mid_mt_allocators.sh`
- Full report: `MID_MT_COMPLETION_REPORT.md`
- Implementation: `core/hakmem_mid_mt.{c,h}`
- Results: 110M ops/sec (100-101% of mimalloc, 2.12x faster than glibc)
ACE Learning Layer (Adaptive Control Engine)
- **Status**: Phase 1 COMPLETE ✅ (2025-11-01) - Infrastructure ready 🚀
- **Goal**: Fix weaknesses with adaptive learning (aiming to surpass mimalloc)
- Fragmentation stress: 3.87 → 10-20 M ops/s (2.6-5.2x target)
- Large WS: 22.15 → 30-45 M ops/s (1.4-2.0x target)
- realloc: 277ns → 140-210ns (1.3-2.0x target)
- **Documentation**:
- User guide: `docs/ACE_LEARNING_LAYER.md`
- Technical plan: `docs/ACE_LEARNING_LAYER_PLAN.md`
- Progress report: `ACE_PHASE1_PROGRESS.md`
- **Phase 1 Deliverables** (COMPLETE ✅):
- ✅ Metrics collection (`hakmem_ace_metrics.{c,h}`)
- ✅ UCB1 learning algorithm (`hakmem_ace_ucb1.{c,h}`)
- ✅ Dual-loop controller (`hakmem_ace_controller.{c,h}`)
- ✅ Dynamic TLS capacity adjustment
- ✅ Hot-path metrics integration (alloc/free tracking)
- ✅ A/B benchmark script (`scripts/bench_ace_ab.sh`)
- **Usage**:
- Enable: `HAKMEM_ACE_ENABLED=1 ./your_benchmark`
- Debug: `HAKMEM_ACE_ENABLED=1 HAKMEM_ACE_LOG_LEVEL=2 ./your_benchmark`
- A/B test: `./scripts/bench_ace_ab.sh`
- **Next**: Phase 2 - Extended benchmarking + learning convergence validation
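For readers unfamiliar with UCB1, the selection rule picks the option with the highest value of mean reward plus an exploration bonus, `mean + sqrt(2 * ln(total_pulls) / pulls)`, and tries any untested option first. The sketch below is the textbook rule only; how `hakmem_ace_ucb1.{c,h}` defines arms and rewards is not described here, so the arm names and numbers in the usage line are made up for illustration:

```shell
# ucb1_pick: textbook UCB1 arm selection, sketched with awk.
# Reads lines of "<arm> <mean_reward> <pulls>" on stdin and prints the
# arm to try next. Arms with zero pulls are tried first.
ucb1_pick() {
  awk '
    { arm[NR] = $1; mean[NR] = $2; n[NR] = $3; total += $3 }
    END {
      best = ""; best_score = 0
      for (i = 1; i <= NR; i++) {
        if (n[i] == 0) { print arm[i]; exit }   # untried arm: explore it
        score = mean[i] + sqrt(2 * log(total) / n[i])
        if (best == "" || score > best_score) { best = arm[i]; best_score = score }
      }
      print best
    }'
}
```

With hypothetical arms standing for candidate TLS capacities, `printf 'cap64 0.50 10\ncap128 0.55 10\ncap256 0.40 2\n' | ucb1_pick` prints `cap256`: its mean is lowest, but with only 2 pulls its exploration bonus dominates.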
Directory Structure (2025-11-01 Reorganization)
- **benchmarks/** - All benchmark-related files
- `src/` - Benchmark source code (tiny/mid/comprehensive/stress)
- `scripts/` - Benchmark scripts organized by category
- `results/` - Benchmark results (formerly bench_results/)
- `perf/` - Performance profiling data (formerly perf_data/)
- **tests/** - Test files (unit/integration/stress)
- **core/** - Core allocator implementation
- **docs/** - Documentation (benchmarks/, api/, guides/)
- **scripts/** - Development scripts (build/, apps/, maintenance/)
- **archive/** - Historical documents and analysis
Where to Read More
- **SlabHandle Box**: `docs/SLAB_HANDLE.md` (encapsulation of ownership + remote drain + metadata)
- **Free Safety**: `docs/FREE_SAFETY.md` (fail-fast on double free / class mismatch, and ring operation)
- **Cleanup/Organization**: `CLEANUP_SUMMARY_2025_11_01.md` (latest)
- **Archive**: `archive/README.md` - Historical docs and analysis
- Bench mode: `BENCH_MODE.md`
- Env knobs: `ENV_VARS.md`
- Tiny hot microbench: `TINY_HOT_BENCH.md`
- Frontend/Backend split: `FRONTEND_BACKEND_PLAN.md`
- LD status/safety: `LD_PRELOAD_STATUS.md`
- Goals/Targets: `GOALS_2025_10_29.md`
- Latest results: `BENCH_RESULTS_2025_10_29.md` (today), `BENCH_RESULTS_2025_10_28.md` (yesterday)
- Mainline integration plan: `MAINLINE_INTEGRATION.md`
- FLINT Intelligence (events/adaptation): `FLINT_INTELLIGENCE.md`
Hako / MIR / FFI
- `HAKO_MIR_FFI_SPEC.md`: spec in which type checking completes in the front end, MIR only carries data, and FFI lowering is mechanical.
Notes
- LD mode: keep `HAKMEM_LD_SAFE=2` default for apps; prefer direct-link for tuning.
- Ultra/Frontend are experimental; keep OFF by default and use scripts for A/B.