diff --git a/CURRENT_TASK.md b/CURRENT_TASK.md
index 50e71022..2ceefdc5 100644
--- a/CURRENT_TASK.md
+++ b/CURRENT_TASK.md
@@ -1,53 +1,21 @@
-## HAKMEM Bug Investigation: OOM Spam (ACE 33KB) - December 1, 2025
+## HAKMEM Status Memo (updated 2025-12-XX)
 
-### Objective
-Investigate and provide a mechanism to diagnose "OOM spam caused by continuous NULL returns for ACE 33KB allocations." The goal is to distinguish between:
-1. Threshold issues (size class rounding)
-2. Cache exhaustion (pool empty)
-3. Mapping failures (OS mmap failure)
+### Current State
+- The Mid MT layer has been removed entirely (code, build dependencies, and the early branch in free), so Mid/Large allocations now go through ACE+Pool only.
+- Mid W_MAX has been relaxed to 2.0 so that the 32–52KB Bridge class path is hit reliably. The segfault in the 33KB range is fixed.
+- The free wrapper keeps the Superslab/Tiny guards while making routing to Mid/L2/L25 reliable (Tiny pointers not registered in a Superslab are ignored; Mid/L2/L25 pointers are caught by classification plus the registry).
+- Mid/L2/L25 wrap detection is ON by default (set `HAKMEM_WRAP_L2=0` / `HAKMEM_WRAP_L25=0` to turn it OFF). Only nested recursion is blocked.
 
-### Work Performed & Resolution
+### Recent Results
+- Bench repro: `./bench_mid_large_mt_hakmem 4 20000 1024 4` runs to completion with no ACE-FAIL spam.
+- Removed all Mid MT build, initialization, and dependency code; the Makefile has been cleaned up as well.
 
-1. **Implemented ACE Tracing**:
-    * Added a runtime-controlled tracing mechanism via the `HAKMEM_ACE_TRACE=1` environment variable.
-    * Instrumentation was added to `core/hakmem_ace.c`, `core/hakmem_pool.c`, and `core/hakmem_l25_pool.c` to log specific failure reasons to `stderr`.
-    * Log messages distinguish between `[ACE-FAIL] Threshold`, `[ACE-FAIL] Exhaustion`, and `[ACE-FAIL] MapFail`.
+### Usage Notes
+- Verify 33KB-range behavior using ACE/Pool only. Tune fragmentation with `HAKMEM_WMAX_MID` (default 2.0).
+- To prevent Tiny header misclassification, the mandatory Superslab registration check is kept in free/fast-free.
+- If the old Mid MT is needed, refer to another branch or a past commit (it no longer exists on the current branch).
 
-2. **Resolved Build & Linkage Issues**:
-    * **Undefined Symbol `classify_ptr`**: Identified that `core/box/front_gate_classifier.c` was not correctly linked into `libhakmem.so`. The `Makefile` was updated to include `core/box/front_gate_classifier_shared.o` in the `SHARED_OBJS` list.
-    * **Removed Temporary Debug Logs**: All interim `write(2, ...)` and `fprintf(stderr, ...)` debug statements introduced during the investigation have been removed to restore a clean code state.
-
-3. **Clarified `malloc` Wrapper Behavior**:
-    * Discovered that `libhakmem.so`'s `malloc` wrapper had logic to force fallback to `libc`'s `malloc` for larger allocations (`> TINY_MAX_SIZE`) and when `jemalloc` was detected, especially under `LD_PRELOAD`.
-    * This was preventing 33KB allocations from reaching the `hakmem` ACE layer.
-    * **Solution**: Identified the necessary environment variables to disable these bypasses for testing purposes: `HAKMEM_LD_SAFE=0` and `HAKMEM_LD_BLOCK_JEMALLOC=0`.
-
-4. **Verified Trace Functionality**:
-    * A test program (`test_ace_trace.c`) was used to allocate 33KB.
-    * By setting `HAKMEM_WMAX_MID=1.01` and `HAKMEM_WMAX_LARGE=1.01` (to force threshold failures), the `[ACE-FAIL] Threshold` logs were successfully generated, confirming the tracing mechanism works as intended.
-
-### How to Use the Trace Feature (for Users)
-
-To diagnose the 33KB OOM spam issue in your application:
-
-1. **Ensure Correct `libhakmem.so` Build**:
-    Make sure `libhakmem.so` is built without `POOL_TLS_PHASE1` enabled (e.g., `make shared POOL_TLS_PHASE1=0`). The current `libhakmem.so` reflects this.
-
-2. **Run Your Application with Specific Environment Variables**:
-    ```bash
-    export HAKMEM_FRONT_GATE_UNIFIED=0
-    export HAKMEM_SMALLMID_ENABLE=0
-    export HAKMEM_FORCE_LIBC_ALLOC=0
-    export HAKMEM_LD_BLOCK_JEMALLOC=0
-    export HAKMEM_ACE_TRACE=1 # Crucial for seeing the logs
-    export HAKMEM_WMAX_MID=1.60 # Use default or adjust as needed for W_MAX analysis
-    export HAKMEM_WMAX_LARGE=1.30 # Use default or adjust as needed for W_MAX analysis
-    export LD_PRELOAD=/path/to/hakmem/libhakmem.so
-
-    ./your_application 2> stderr.log # Redirect stderr to a file for analysis
-    ```
-
-3. **Analyze `stderr.log`**:
-    Look for `[ACE-FAIL]` messages to determine if the issue is a `Threshold` (e.g., `size=33000 wmax=...`), `Exhaustion` (pool empty), or `MapFail` (OS allocation error). This will provide the necessary data to pinpoint the root cause of the OOM spam.
-
-This setup will allow for precise diagnosis of 33KB allocation failures within the hakmem ACE component.
+### Remaining Tasks / Proposals
+1. Clean up or archive the Mid MT-related documents and scripts under docs/benchmarks/scripts.
+2. Re-measure footprint vs. hit rate with a lightweight A/B of W_MAX/Cap (environment variables are sufficient).
+3. Ignore or clean the dirty status of `core/box/front_gate_classifier.d`, `hakmem.d`, and `mimalloc-bench` as needed.
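
The lightweight W_MAX A/B listed under the remaining tasks could be driven by a small sweep script along these lines. This is only a sketch: the benchmark command and `HAKMEM_WMAX_MID` come from the memo, while the `BENCH`, `BENCH_ARGS`, and `DRY_RUN` variables, the function names, and the candidate values other than the default 2.0 are illustrative assumptions. How footprint and hit rate are collected is left out because the memo does not say.

```bash
#!/usr/bin/env bash
# Sketch of an A/B sweep over HAKMEM_WMAX_MID.
# BENCH, BENCH_ARGS, DRY_RUN and the function names are hypothetical helpers.
set -u

BENCH=${BENCH:-./bench_mid_large_mt_hakmem}   # benchmark binary named in the memo
BENCH_ARGS=${BENCH_ARGS:-"4 20000 1024 4"}    # arguments from the memo's repro command
DRY_RUN=${DRY_RUN:-1}                         # 1 = only print the commands (assumed switch)

# Print (or run) one benchmark invocation for a given W_MAX value.
run_variant() {
    local wmax=$1
    if [ "$DRY_RUN" = "1" ]; then
        echo "HAKMEM_WMAX_MID=$wmax $BENCH $BENCH_ARGS"
    else
        # The variable is set for this one command only.
        # BENCH_ARGS is intentionally unquoted so it splits into separate arguments.
        HAKMEM_WMAX_MID="$wmax" $BENCH $BENCH_ARGS
    fi
}

# Run every value passed on the command line, in order.
sweep_wmax() {
    local wmax
    for wmax in "$@"; do
        run_variant "$wmax"
    done
}

# 2.0 is the documented default; 1.6 and 2.5 are arbitrary comparison points.
sweep_wmax 1.6 2.0 2.5
```

By default the script only prints the commands it would run; invoking it with `DRY_RUN=0` executes the benchmark once per value.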