diff --git a/CURRENT_TASK.md b/CURRENT_TASK.md index a1f98784..d161a6f0 100644 --- a/CURRENT_TASK.md +++ b/CURRENT_TASK.md @@ -397,3 +397,167 @@ Similar or better improvement expected! --- **Status**: Ready to implement - awaiting user confirmation to proceed! 🚀 + +--- + +## NEW 2025-11-11: Tiny L1-miss増加とUB修正(FastCache/Freeチェイン) + +構造方針(確認) +- 結論: 構造はこのままでよい。`tiny_nextptr.h` に next を集約した箱構成で安全性と一貫性は確保。 +- この前提で A/B とパラメータ最適化を継続し、必要時のみ“クラス限定ヘッダ”などの再設計に進む。 + +現象(提供値 + 再現計測) +- 平均スループット: 56.7M → 55.95M ops/s(-1.3% 誤差範囲) +- L1-dcache-miss: 335M → 501M(+49.5%) +- 当環境の `bench_random_mixed_hakmem 100000 256 42` でも L1 miss ≈ 3.7–4.0%(安定) +- mimalloc 同条件: 98–110M ops/s(大差) + +根因仮説(高確度) +1) ヘッダ方式によるアラインメント崩れ(本丸) + - 1バイトヘッダで user ptr を +1 するため、stride=サイズ+1 となり多くのクラスで16B整列を失う。 + - 例: 256B→257B stride で 16ブロック中15ブロックが非整列。L1 miss/μops増の主因。 +2) 非整列 next の void** デリファレンス(UB) + - C0–C6 は next を base+1 に保存/参照しており、C言語的には非整列アクセスで UB。 + - コンパイラ最適化の悪影響やスピル増の可能性。 + +対処(適用済み:UB除去の最小パッチ) +- 追加: 安全 next アクセス小箱 `core/tiny_nextptr.h:1` + - `tiny_next_off(int)`, `tiny_next_load(void*, cls)`, `tiny_next_store(void*, cls, void*)` + - memcpy ベースの実装で、非整列でも未定義動作を回避 +- 適用先(ホットパス差し替え) + - `core/hakmem_tiny_fastcache.inc.h:76,108` + - `core/tiny_free_magazine.inc.h:83,94` + - `core/tiny_alloc_fast_inline.h:54` および push 側 + - `core/hakmem_tiny_tls_list.h:63,76,109,115` 他(pop/push/bulk) + - `core/hakmem_tiny_bg_spill.c`(ループ分割/再接続部) + - `core/hakmem_tiny_bg_spill.h`(spill push 経路) + - `core/tiny_alloc_fast_sfc.inc.h`(pop/push) + - `core/hakmem_tiny_lifecycle.inc`(SLL/Fast 層の drain 処理) + +リリースログ抑制(無害化) +- `core/superslab/superslab_inline.h:208` の `[DEBUG ss_remote_push]` を + `!HAKMEM_BUILD_RELEASE && HAKMEM_DEBUG_VERBOSE` ガード下へ +- `core/tiny_superslab_free.inc.h:36` の `[C7_FIRST_FREE]` も同様に + `!HAKMEM_BUILD_RELEASE && HAKMEM_DEBUG_VERBOSE` のみで出力 + +効果 +- スループット/ミス率は誤差範囲(正当性の改善が中心) +- 非整列 next の UB を除去し、将来の最適化で悪化しづらい状態に整備 +- mimalloc との差は依然大きく、根因は主に「整列崩れ+キャッシュ設計差」と判断 + +計測結果(抜粋) +- hakmem Tiny: + - `./bench_random_mixed_hakmem 100000 256 42` + - Throughput: ≈8.8–9.1M ops/s + - L1-dcache-load-misses: ≈1.50–1.60M(3.7–4.0%) +- mimalloc: + - `LD_LIBRARY_PATH=... ./bench_random_mixed_mi 100000 256 42` + - Throughput: ≈98–110M ops/s +- 固定256B(ヘッダON/OFF比較): + - `./bench_fixed_size_hakmem 100000 256 42` + - ヘッダON: ~3.86M ops/s, L1D miss ≈4.07% + - ヘッダOFF: ~4.00M ops/s, L1D miss ≈4.12%(誤差級) + +新規に特定した懸念と対応案 +- 整列崩れ(最有力) + - 1Bヘッダにより stride=サイズ+1 となり、16B 整列を崩すクラスが多い(例: 256→257B)。 + - 単純なヘッダON/OFF比較では差は小さく、他要因との複合影響と見做し継続調査。 +- UB(未定義動作) + - 非整列 void** load/store を `tiny_nextptr.h` による安全アクセサへ置換済み。 +- リリースガード漏れ + - `[C7_FIRST_FREE]` / `[DEBUG ss_remote_push]` は release ビルドでは + `HAKMEM_DEBUG_VERBOSE` 未指定時に出ないよう修正済み。 + +成功判定(Tiny側) +- A/B(ヘッダOFF or クラス限定ヘッダ)で 256B 固定の L1 miss 低下・ops/s 改善 +- mimalloc との差を段階的に圧縮(まず 2–3x 程度まで、将来的に 1.5x 以内を目標) + +トラッキング(参照ファイル/行) +- 安全 next 小箱: + - `core/tiny_nextptr.h:1` +- 呼び出し側差し替え: + - `core/hakmem_tiny_fastcache.inc.h:76,108` + - `core/tiny_free_magazine.inc.h:83,94` + - `core/tiny_alloc_fast_inline.h:54` 他 + - `core/hakmem_tiny_tls_list.h:63,76,109,115` + - `core/hakmem_tiny_bg_spill.c` / `core/hakmem_tiny_bg_spill.h` + - `core/tiny_alloc_fast_sfc.inc.h` + - `core/hakmem_tiny_lifecycle.inc` +- リリースログガード: + - `core/superslab/superslab_inline.h:208` + - `core/tiny_superslab_free.inc.h:36` + +現象(提供値 + 再現計測) +- 平均スループット: 56.7M → 55.95M ops/s(-1.3% 誤差範囲) +- L1-dcache-miss: 335M → 501M(+49.5%) +- 当環境の `bench_random_mixed_hakmem 100000 256 42` でも L1 miss ≈ 3.7–4.0%(安定) +- mimalloc 同条件: 98–110M ops/s(大差) + +根因仮説(高確度) +1) ヘッダ方式によるアラインメント崩れ(本丸) + - 1バイトヘッダで user ptr を +1 するため、stride=サイズ+1 となり多くのクラスで16B整列を失う。 + - 例: 256B→257B stride で 16ブロック中15ブロックが非整列。L1 miss/μops増の主因。 +2) 非整列 next の void** デリファレンス(UB) + - C0–C6 は next を base+1 に保存/参照しており、C言語的には非整列アクセスで UB。 + - コンパイラ最適化の悪影響やスピル増の可能性。 + +対処(適用済み:UB除去の最小パッチ) +- 追加: 安全 next アクセス小箱 `core/tiny_nextptr.h:1` + - `tiny_next_load()/tiny_next_store()` を memcpy ベースで提供(非整列でもUBなし) +- 適用先(ホットパス) + - `core/hakmem_tiny_fastcache.inc.h:76,108`(tiny_fast_pop/push) + - `core/tiny_free_magazine.inc.h:83,94`(BG spill チェイン構築) + +効果(短期計測) +- Throughput/L1 miss は誤差範囲で横ばい(正当性の改善が主、性能は現状維持) +- 本質は「整列崩れ」→ 次の対策で A/B 確認へ + +未解決の懸念(要フォロー) +- Release ガード漏れの可能性: `[C7_FIRST_FREE]`/`[DEBUG ss_remote_push]` が release でも1回だけ出力 + - 該当箇所: `core/tiny_superslab_free.inc.h:36`, `core/superslab/superslab_inline.h:208` + - Makefile上は `-DHAKMEM_BUILD_RELEASE=1`(print-flags でも確認)。TUごとのCFLAGS齟齬を監査。 + +次アクション(Tiny alignment 検証のA/B) +1) ヘッダ全無効 A/B(即時) +``` +# A: 現行(ヘッダON) +./build.sh bench_random_mixed_hakmem +perf stat -e cycles,instructions,branches,branch-misses,cache-references,cache-misses,\ + L1-dcache-loads,L1-dcache-load-misses -r 5 -- ./bench_random_mixed_hakmem 100000 256 42 + +# B: ヘッダOFF(クラス全体) +EXTRA_MAKEFLAGS="HEADER_CLASSIDX=0" ./build.sh bench_random_mixed_hakmem +perf stat -e cycles,instructions,branches,branch-misses,cache-references,cache-misses,\ + L1-dcache-loads,L1-dcache-load-misses -r 5 -- ./bench_random_mixed_hakmem 100000 256 42 +``` +2) 固定サイズ 256B の比較(alignment 影響の顕在化狙い) +``` +./build.sh bench_fixed_size_hakmem +perf stat -e cycles,instructions,cache-references,cache-misses,L1-dcache-loads,L1-dcache-load-misses \ + -r 5 -- ./bench_fixed_size_hakmem 100000 256 42 +``` +3) FastCache 稼働確認(C0–C3 ヒット率の見える化) +``` +HAKMEM_TINY_FAST_STATS=1 ./bench_random_mixed_hakmem 100000 256 42 +``` + +中期対策(Box設計の指針) +- 方針A(簡易・高効果): ヘッダを小クラス(C0–C3)限定に縮小、C4–C6は整列重視(ヘッダなし)。 + - 実装: まず A/B でヘッダ全OFFの効果を確認→効果大なら「クラス限定ヘッダ」へ段階導入。 +- 方針B(高度): フッタ方式やビットタグ化など“アラインメント維持”の識別方式へ移行。 + - 例: 16B整列を保つパディング/タグで class_idx を保持(RSS/複雑性と要トレードオフ検証)。 + +トラッキング(ファイル/行) +- 安全 next 小箱: `core/tiny_nextptr.h:1` +- 差し替え: `core/hakmem_tiny_fastcache.inc.h:76,108`, `core/tiny_free_magazine.inc.h:83,94` +- 追加監査対象(未修正だが next を直接触る箇所) + - `core/tiny_alloc_fast_inline.h:54,297`, `core/hakmem_tiny_tls_list.h:63,76,109,115` ほか + +成功判定(Tiny) +- A/B(ヘッダOFF)で 256B 固定の L1 miss 低下、ops/s 上昇(±20–50% を期待) +- mimalloc との差が大幅に縮小(まず 2–3x → 継続改善で 1.5x 以内へ) + +最新A/Bスナップショット(当環境, RandomMixed 256B) +- HEADER_CLASSIDX=1(現行): 平均 ≈ 8.16M ops/s, L1D miss ≈ 3.79% +- HEADER_CLASSIDX=0(全OFF): 平均 ≈ 9.12M ops/s, L1D miss ≈ 3.74% +- 差分: +11.7% 前後の改善(整列効果は小〜中。追加のチューニング継続) diff --git a/Makefile b/Makefile index 82c01cf5..ce1a0dc4 100644 --- a/Makefile +++ b/Makefile @@ -756,6 +756,14 @@ bench_debug: CFLAGS += -DHAKMEM_DEBUG_COUNTERS=1 -g -O2 bench_debug: clean bench_comprehensive_hakmem bench_tiny_hot_hakmem bench_tiny_hot_system bench_tiny_hot_mi @echo "✓ bench_debug build complete (debug counters enabled)" +# Debug build for random_mixed (enable counters for SFC stats) +.PHONY: bench_random_mixed_debug +bench_random_mixed_debug: + @echo "[debug] Rebuilding bench_random_mixed_hakmem with HAKMEM_DEBUG_COUNTERS=1" + $(MAKE) clean >/dev/null + $(MAKE) CFLAGS+=" -DHAKMEM_DEBUG_COUNTERS=1 -O2 -g" bench_random_mixed_hakmem >/dev/null + @echo "✓ bench_random_mixed_debug built" + # ======================================== # Phase 7 便利ターゲット(重要な定数がデフォルト化されています) # ======================================== diff --git a/core/box/free_local_box.d b/core/box/free_local_box.d index 9f2c3e0e..f891b0ed 100644 --- a/core/box/free_local_box.d +++ b/core/box/free_local_box.d @@ -2,13 +2,13 @@ core/box/free_local_box.o: core/box/free_local_box.c \ core/box/free_local_box.h core/hakmem_tiny_superslab.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/superslab/superslab_inline.h core/superslab/superslab_types.h \ - core/tiny_debug_ring.h core/tiny_remote.h \ + core/tiny_debug_ring.h core/hakmem_build_flags.h core/tiny_remote.h \ core/superslab/../tiny_box_geometry.h \ core/superslab/../hakmem_tiny_superslab_constants.h \ core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ - core/hakmem_build_flags.h core/box/free_publish_box.h core/hakmem_tiny.h \ - core/hakmem_trace.h core/hakmem_tiny_mini_mag.h + core/box/free_publish_box.h core/hakmem_tiny.h core/hakmem_trace.h \ + core/hakmem_tiny_mini_mag.h core/box/free_local_box.h: core/hakmem_tiny_superslab.h: core/superslab/superslab_types.h: @@ -16,6 +16,7 @@ core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: core/tiny_debug_ring.h: +core/hakmem_build_flags.h: core/tiny_remote.h: core/superslab/../tiny_box_geometry.h: core/superslab/../hakmem_tiny_superslab_constants.h: @@ -23,7 +24,6 @@ core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: -core/hakmem_build_flags.h: core/box/free_publish_box.h: core/hakmem_tiny.h: core/hakmem_trace.h: diff --git a/core/box/free_publish_box.d b/core/box/free_publish_box.d index b903dbd2..6b724204 100644 --- a/core/box/free_publish_box.d +++ b/core/box/free_publish_box.d @@ -2,14 +2,14 @@ core/box/free_publish_box.o: core/box/free_publish_box.c \ core/box/free_publish_box.h core/hakmem_tiny_superslab.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/superslab/superslab_inline.h core/superslab/superslab_types.h \ - core/tiny_debug_ring.h core/tiny_remote.h \ + core/tiny_debug_ring.h core/hakmem_build_flags.h core/tiny_remote.h \ core/superslab/../tiny_box_geometry.h \ core/superslab/../hakmem_tiny_superslab_constants.h \ core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ - core/hakmem_build_flags.h core/hakmem_tiny.h core/hakmem_trace.h \ - core/hakmem_tiny_mini_mag.h core/tiny_route.h core/tiny_ready.h \ - core/hakmem_tiny.h core/box/mailbox_box.h + core/hakmem_tiny.h core/hakmem_trace.h core/hakmem_tiny_mini_mag.h \ + core/tiny_route.h core/tiny_ready.h core/hakmem_tiny.h \ + core/box/mailbox_box.h core/box/free_publish_box.h: core/hakmem_tiny_superslab.h: core/superslab/superslab_types.h: @@ -17,6 +17,7 @@ core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: core/tiny_debug_ring.h: +core/hakmem_build_flags.h: core/tiny_remote.h: core/superslab/../tiny_box_geometry.h: core/superslab/../hakmem_tiny_superslab_constants.h: @@ -24,7 +25,6 @@ core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: -core/hakmem_build_flags.h: core/hakmem_tiny.h: core/hakmem_trace.h: core/hakmem_tiny_mini_mag.h: diff --git a/core/box/free_remote_box.d b/core/box/free_remote_box.d index 150fcee3..b868ed8b 100644 --- a/core/box/free_remote_box.d +++ b/core/box/free_remote_box.d @@ -2,13 +2,13 @@ core/box/free_remote_box.o: core/box/free_remote_box.c \ core/box/free_remote_box.h core/hakmem_tiny_superslab.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/superslab/superslab_inline.h core/superslab/superslab_types.h \ - core/tiny_debug_ring.h core/tiny_remote.h \ + core/tiny_debug_ring.h core/hakmem_build_flags.h core/tiny_remote.h \ core/superslab/../tiny_box_geometry.h \ core/superslab/../hakmem_tiny_superslab_constants.h \ core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ - core/hakmem_build_flags.h core/box/free_publish_box.h core/hakmem_tiny.h \ - core/hakmem_trace.h core/hakmem_tiny_mini_mag.h + core/box/free_publish_box.h core/hakmem_tiny.h core/hakmem_trace.h \ + core/hakmem_tiny_mini_mag.h core/box/free_remote_box.h: core/hakmem_tiny_superslab.h: core/superslab/superslab_types.h: @@ -16,6 +16,7 @@ core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: core/tiny_debug_ring.h: +core/hakmem_build_flags.h: core/tiny_remote.h: core/superslab/../tiny_box_geometry.h: core/superslab/../hakmem_tiny_superslab_constants.h: @@ -23,7 +24,6 @@ core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: -core/hakmem_build_flags.h: core/box/free_publish_box.h: core/hakmem_tiny.h: core/hakmem_trace.h: diff --git a/core/box/front_gate_box.d b/core/box/front_gate_box.d index 64a39671..f1904a40 100644 --- a/core/box/front_gate_box.d +++ b/core/box/front_gate_box.d @@ -1,10 +1,10 @@ core/box/front_gate_box.o: core/box/front_gate_box.c \ core/box/front_gate_box.h core/hakmem_tiny.h core/hakmem_build_flags.h \ core/hakmem_trace.h core/hakmem_tiny_mini_mag.h \ - core/tiny_alloc_fast_sfc.inc.h core/hakmem_tiny.h core/box/tls_sll_box.h \ - core/box/../ptr_trace.h core/box/../hakmem_tiny_config.h \ - core/box/../hakmem_build_flags.h core/box/../tiny_region_id.h \ - core/box/../hakmem_build_flags.h + core/tiny_alloc_fast_sfc.inc.h core/hakmem_tiny.h core/tiny_nextptr.h \ + core/box/tls_sll_box.h core/box/../ptr_trace.h \ + core/box/../hakmem_tiny_config.h core/box/../hakmem_build_flags.h \ + core/box/../tiny_region_id.h core/box/../hakmem_build_flags.h core/box/front_gate_box.h: core/hakmem_tiny.h: core/hakmem_build_flags.h: @@ -12,6 +12,7 @@ core/hakmem_trace.h: core/hakmem_tiny_mini_mag.h: core/tiny_alloc_fast_sfc.inc.h: core/hakmem_tiny.h: +core/tiny_nextptr.h: core/box/tls_sll_box.h: core/box/../ptr_trace.h: core/box/../hakmem_tiny_config.h: diff --git a/core/box/front_gate_classifier.d b/core/box/front_gate_classifier.d index c0c2ffa5..112f2ebb 100644 --- a/core/box/front_gate_classifier.d +++ b/core/box/front_gate_classifier.d @@ -5,7 +5,8 @@ core/box/front_gate_classifier.o: core/box/front_gate_classifier.c \ core/hakmem_tiny_superslab_constants.h \ core/box/../superslab/superslab_inline.h \ core/box/../superslab/superslab_types.h core/tiny_debug_ring.h \ - core/tiny_remote.h core/box/../superslab/../tiny_box_geometry.h \ + core/hakmem_build_flags.h core/tiny_remote.h \ + core/box/../superslab/../tiny_box_geometry.h \ core/box/../superslab/../hakmem_tiny_superslab_constants.h \ core/box/../superslab/../hakmem_tiny_config.h \ core/box/../tiny_debug_ring.h core/box/../tiny_remote.h \ @@ -15,8 +16,7 @@ core/box/front_gate_classifier.o: core/box/front_gate_classifier.c \ core/box/../hakmem.h core/box/../hakmem_config.h \ core/box/../hakmem_features.h core/box/../hakmem_sys.h \ core/box/../hakmem_whale.h core/box/../hakmem_tiny_config.h \ - core/box/../hakmem_super_registry.h core/box/../hakmem_tiny_superslab.h \ - core/box/../pool_tls_registry.h + core/box/../hakmem_super_registry.h core/box/../hakmem_tiny_superslab.h core/box/front_gate_classifier.h: core/box/../tiny_region_id.h: core/box/../hakmem_build_flags.h: @@ -26,6 +26,7 @@ core/hakmem_tiny_superslab_constants.h: core/box/../superslab/superslab_inline.h: core/box/../superslab/superslab_types.h: core/tiny_debug_ring.h: +core/hakmem_build_flags.h: core/tiny_remote.h: core/box/../superslab/../tiny_box_geometry.h: core/box/../superslab/../hakmem_tiny_superslab_constants.h: @@ -44,4 +45,3 @@ core/box/../hakmem_whale.h: core/box/../hakmem_tiny_config.h: core/box/../hakmem_super_registry.h: core/box/../hakmem_tiny_superslab.h: -core/box/../pool_tls_registry.h: diff --git a/core/box/hak_core_init.inc.h b/core/box/hak_core_init.inc.h index 54add955..27a6b147 100644 --- a/core/box/hak_core_init.inc.h +++ b/core/box/hak_core_init.inc.h @@ -298,6 +298,14 @@ static void hak_init_impl(void) { extern void hak_tiny_prewarm_tls_cache(void); hak_tiny_prewarm_tls_cache(); HAKMEM_LOG("TLS cache pre-warmed for %d classes\n", TINY_NUM_CLASSES); + // After TLS prewarm, cascade some hot blocks into SFC to raise early hit rate + { + extern int g_sfc_enabled; + if (g_sfc_enabled) { + extern void sfc_cascade_from_tls_initial(void); + sfc_cascade_from_tls_initial(); + } + } #endif g_initializing = 0; diff --git a/core/box/mailbox_box.d b/core/box/mailbox_box.d index 0ebbaa45..4b8bbb3e 100644 --- a/core/box/mailbox_box.d +++ b/core/box/mailbox_box.d @@ -2,12 +2,12 @@ core/box/mailbox_box.o: core/box/mailbox_box.c core/box/mailbox_box.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ core/superslab/superslab_types.h core/tiny_debug_ring.h \ - core/tiny_remote.h core/superslab/../tiny_box_geometry.h \ + core/hakmem_build_flags.h core/tiny_remote.h \ + core/superslab/../tiny_box_geometry.h \ core/superslab/../hakmem_tiny_superslab_constants.h \ core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ - core/hakmem_build_flags.h core/hakmem_tiny.h core/hakmem_trace.h \ - core/hakmem_tiny_mini_mag.h + core/hakmem_tiny.h core/hakmem_trace.h core/hakmem_tiny_mini_mag.h core/box/mailbox_box.h: core/hakmem_tiny_superslab.h: core/superslab/superslab_types.h: @@ -15,6 +15,7 @@ core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: core/tiny_debug_ring.h: +core/hakmem_build_flags.h: core/tiny_remote.h: core/superslab/../tiny_box_geometry.h: core/superslab/../hakmem_tiny_superslab_constants.h: @@ -22,7 +23,6 @@ core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: -core/hakmem_build_flags.h: core/hakmem_tiny.h: core/hakmem_trace.h: core/hakmem_tiny_mini_mag.h: diff --git a/core/hakmem_build_flags.h b/core/hakmem_build_flags.h index 05e5edf7..d0a65f23 100644 --- a/core/hakmem_build_flags.h +++ b/core/hakmem_build_flags.h @@ -99,7 +99,7 @@ // Minimal/strict front variants (bench/debug only) #ifndef HAKMEM_TINY_MINIMAL_FRONT -# define HAKMEM_TINY_MINIMAL_FRONT 0 +# define HAKMEM_TINY_MINIMAL_FRONT 1 #endif #ifndef HAKMEM_TINY_STRICT_FRONT # define HAKMEM_TINY_STRICT_FRONT 0 diff --git a/core/hakmem_tiny.c b/core/hakmem_tiny.c index 4c03c64a..884629d6 100644 --- a/core/hakmem_tiny.c +++ b/core/hakmem_tiny.c @@ -72,7 +72,7 @@ static inline int superslab_trace_enabled(void) { // (UltraFront/Quick/Frontend/HotMag/SS-try/BumpShadow), leaving: // SLL → TLS Magazine → SuperSlab → (remaining slow path) #ifndef HAKMEM_TINY_MINIMAL_FRONT -#define HAKMEM_TINY_MINIMAL_FRONT 0 +#define HAKMEM_TINY_MINIMAL_FRONT 1 #endif // Strict front: compile-out optional front tiers but keep baseline structure intact #ifndef HAKMEM_TINY_STRICT_FRONT @@ -362,9 +362,10 @@ static int g_tiny_refill_max_hot = 192; // HAKMEM_TINY_REFILL_MAX_HOT for clas // hakmem_tiny_tls_list.h already included at top static __thread TinyTLSList g_tls_lists[TINY_NUM_CLASSES]; -static int g_tls_list_enable = 1; // Default ON (scope bug fixed 2025-11-11); disable via HAKMEM_TINY_TLS_LIST=0 +static int g_tls_list_enable = 0; // Default OFF for bench; override via HAKMEM_TINY_TLS_LIST=1 static inline int tls_refill_from_tls_slab(int class_idx, TinyTLSList* tls, uint32_t want); static int g_fast_enable = 1; +static int g_fastcache_enable = 1; // Default ON (array stack for C0-C3); override via HAKMEM_TINY_FASTCACHE=0 static uint16_t g_fast_cap[TINY_NUM_CLASSES]; static int g_ultra_bump_shadow = 0; // HAKMEM_TINY_BUMP_SHADOW=1 static uint8_t g_fast_cap_locked[TINY_NUM_CLASSES]; @@ -979,6 +980,8 @@ static inline void tiny_tls_refresh_params(int class_idx, TinyTLSList* tls) { // Forward declarations for functions defined in hakmem_tiny_fastcache.inc.h static inline void* tiny_fast_pop(int class_idx); static inline int tiny_fast_push(int class_idx, void* ptr); +static inline void* fastcache_pop(int class_idx); +static inline int fastcache_push(int class_idx, void* ptr); // ============================================================================ // EXTRACTED TO hakmem_tiny_hot_pop.inc.h (Phase 2D-1) @@ -1046,7 +1049,13 @@ static __attribute__((cold, noinline, unused)) void* tiny_slow_alloc_fast(int cl hak_tiny_set_used(slab, extra_idx); slab->free_count--; void* extra = (void*)(base + ((size_t)extra_idx * block_size)); - if (!tiny_fast_push(class_idx, extra)) { + int pushed = 0; + if (__builtin_expect(g_fastcache_enable && class_idx <= 3, 1)) { + pushed = fastcache_push(class_idx, extra); + } else { + pushed = tiny_fast_push(class_idx, extra); + } + if (!pushed) { if (tls_enabled) { tiny_tls_list_guard_push(class_idx, tls, extra); tls_list_push(tls, extra, class_idx); @@ -1147,7 +1156,6 @@ typedef struct __attribute__((aligned(64))) { int top; int _pad[15]; } TinyFastCache; -static int g_fastcache_enable = 0; // HAKMEM_TINY_FASTCACHE=1 static __thread TinyFastCache g_fast_cache[TINY_NUM_CLASSES]; static int g_frontend_enable = 0; // HAKMEM_TINY_FRONTEND=1 (experimental ultra-fast frontend) // SLL capacity multiplier for hot tiny classes (env: HAKMEM_SLL_MULTIPLIER) @@ -1170,6 +1178,10 @@ static inline __attribute__((always_inline)) uint32_t tiny_self_u32(void) { // Cached pthread_t as-is for APIs that require pthread_t comparison static __thread pthread_t g_tls_pt_self; static __thread int g_tls_pt_inited; + +// Frontend FastCache hit/miss counters (Small diagnostics) +unsigned long long g_front_fc_hit[TINY_NUM_CLASSES] = {0}; +unsigned long long g_front_fc_miss[TINY_NUM_CLASSES] = {0}; // Phase 6-1.7: Export for box refactor (Box 6 needs access from hakmem.c) #ifdef HAKMEM_TINY_PHASE6_BOX_REFACTOR inline __attribute__((always_inline)) pthread_t tiny_self_pt(void) { diff --git a/core/hakmem_tiny.d b/core/hakmem_tiny.d index 9eda0da7..f4e41af2 100644 --- a/core/hakmem_tiny.d +++ b/core/hakmem_tiny.d @@ -20,7 +20,7 @@ core/hakmem_tiny.o: core/hakmem_tiny.c core/hakmem_tiny.h \ core/tiny_ready.h core/box/mailbox_box.h core/hakmem_tiny_superslab.h \ core/tiny_remote_bg.h core/hakmem_tiny_remote_target.h \ core/tiny_ready_bg.h core/tiny_route.h core/box/adopt_gate_box.h \ - core/tiny_tls_guard.h core/hakmem_tiny_tls_list.h \ + core/tiny_tls_guard.h core/hakmem_tiny_tls_list.h core/tiny_nextptr.h \ core/hakmem_tiny_bg_spill.h core/tiny_adaptive_sizing.h \ core/tiny_system.h core/hakmem_prof.h core/tiny_publish.h \ core/box/tls_sll_box.h core/box/../ptr_trace.h \ @@ -95,6 +95,7 @@ core/tiny_route.h: core/box/adopt_gate_box.h: core/tiny_tls_guard.h: core/hakmem_tiny_tls_list.h: +core/tiny_nextptr.h: core/hakmem_tiny_bg_spill.h: core/tiny_adaptive_sizing.h: core/tiny_system.h: diff --git a/core/hakmem_tiny_bg_spill.c b/core/hakmem_tiny_bg_spill.c index f983f97d..ea037995 100644 --- a/core/hakmem_tiny_bg_spill.c +++ b/core/hakmem_tiny_bg_spill.c @@ -53,17 +53,19 @@ void bg_spill_drain_class(int class_idx, pthread_mutex_t* lock) { #endif while (cur && processed < g_bg_spill_max_batch) { prev = cur; - cur = *(void**)((uint8_t*)cur + next_off); + #include "tiny_nextptr.h" + cur = tiny_next_load(cur, class_idx); processed++; } - if (cur != NULL) { rest = cur; *(void**)((uint8_t*)prev + next_off) = NULL; } + if (cur != NULL) { rest = cur; tiny_next_store(prev, class_idx, NULL); } // Return processed nodes to SS freelists pthread_mutex_lock(lock); uint32_t self_tid = tiny_self_u32_guard(); void* node = (void*)chain; while (node) { - void* next = *(void**)((uint8_t*)node + next_off); + #include "tiny_nextptr.h" + void* next = tiny_next_load(node, class_idx); SuperSlab* owner_ss = hak_super_lookup(node); if (owner_ss && owner_ss->magic == SUPERSLAB_MAGIC) { int slab_idx = slab_index_for(owner_ss, node); @@ -94,10 +96,10 @@ void bg_spill_drain_class(int class_idx, pthread_mutex_t* lock) { // Prepend remainder back to head uintptr_t old_head; void* tail = rest; - while (*(void**)((uint8_t*)tail + next_off)) tail = *(void**)((uint8_t*)tail + next_off); + while (tiny_next_load(tail, class_idx)) tail = tiny_next_load(tail, class_idx); do { old_head = atomic_load_explicit(&g_bg_spill_head[class_idx], memory_order_acquire); - *(void**)((uint8_t*)tail + next_off) = (void*)old_head; + tiny_next_store(tail, class_idx, (void*)old_head); } while (!atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], &old_head, (uintptr_t)rest, memory_order_release, memory_order_relaxed)); diff --git a/core/hakmem_tiny_bg_spill.h b/core/hakmem_tiny_bg_spill.h index a378c09e..d273248b 100644 --- a/core/hakmem_tiny_bg_spill.h +++ b/core/hakmem_tiny_bg_spill.h @@ -4,6 +4,7 @@ #include #include #include +#include "tiny_nextptr.h" // Forward declarations typedef struct TinySlab TinySlab; @@ -24,13 +25,7 @@ static inline void bg_spill_push_one(int class_idx, void* p) { uintptr_t old_head; do { old_head = atomic_load_explicit(&g_bg_spill_head[class_idx], memory_order_acquire); - // Phase 7: header-aware next placement (C0-C6: base+1, C7: base) -#if HAKMEM_TINY_HEADER_CLASSIDX - const size_t next_off = (class_idx == 7) ? 0 : 1; -#else - const size_t next_off = 0; -#endif - *(void**)((uint8_t*)p + next_off) = (void*)old_head; + tiny_next_store(p, class_idx, (void*)old_head); } while (!atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], &old_head, (uintptr_t)p, memory_order_release, memory_order_relaxed)); @@ -42,13 +37,7 @@ static inline void bg_spill_push_chain(int class_idx, void* head, void* tail, in uintptr_t old_head; do { old_head = atomic_load_explicit(&g_bg_spill_head[class_idx], memory_order_acquire); - // Phase 7: header-aware next placement for tail link -#if HAKMEM_TINY_HEADER_CLASSIDX - const size_t next_off = (class_idx == 7) ? 0 : 1; -#else - const size_t next_off = 0; -#endif - *(void**)((uint8_t*)tail + next_off) = (void*)old_head; + tiny_next_store(tail, class_idx, (void*)old_head); } while (!atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], &old_head, (uintptr_t)head, memory_order_release, memory_order_relaxed)); diff --git a/core/hakmem_tiny_config.c b/core/hakmem_tiny_config.c index 300a7d70..15ae4b96 100644 --- a/core/hakmem_tiny_config.c +++ b/core/hakmem_tiny_config.c @@ -11,19 +11,20 @@ // ============================================================================ // Factory defaults (“balanced”) – mutable at runtime +// Small classes (0..2) are given higher caps by default to favor hot small-size throughput. static const uint16_t k_fast_cap_defaults_factory[TINY_NUM_CLASSES] = { - 128, // Class 0: 8B - 128, // Class 1: 16B - 128, // Class 2: 32B + 256, // Class 0: 8B (was 128) + 256, // Class 1: 16B (was 128) + 256, // Class 2: 32B (was 128) 128, // Class 3: 64B (reduced from 512 to limit RSS) 128, // Class 4: 128B (trimmed via ACE/TLS caps) - 96, // Class 5: 256B (favor fewer round-trips) + 224, // Class 5: 256B (bench-optimized default) 128, // Class 6: 512B 48 // Class 7: 1KB (reduce superslab reliance) }; uint16_t g_fast_cap_defaults[TINY_NUM_CLASSES] = { - 128, 128, 128, 128, 128, 96, 128, 48 + 256, 256, 256, 128, 128, 224, 128, 48 }; void tiny_config_reset_defaults(void) { diff --git a/core/hakmem_tiny_fastcache.inc.h b/core/hakmem_tiny_fastcache.inc.h index 4c397ad8..0e95ac17 100644 --- a/core/hakmem_tiny_fastcache.inc.h +++ b/core/hakmem_tiny_fastcache.inc.h @@ -85,7 +85,12 @@ static inline __attribute__((always_inline)) void* tiny_fast_pop(int class_idx) #else const size_t next_offset = 0; #endif - void* next = *(void**)((uint8_t*)head + next_offset); + // Use safe unaligned load for "next" to avoid UB when offset==1 + void* next = NULL; + { + #include "tiny_nextptr.h" + next = tiny_next_load(head, class_idx); + } g_fast_head[class_idx] = next; uint16_t count = g_fast_count[class_idx]; if (count > 0) { @@ -124,7 +129,10 @@ static inline __attribute__((always_inline)) int tiny_fast_push(int class_idx, v #else const size_t next_offset2 = 0; #endif - *(void**)((uint8_t*)ptr + next_offset2) = g_fast_head[class_idx]; + { + #include "tiny_nextptr.h" + tiny_next_store(ptr, class_idx, g_fast_head[class_idx]); + } g_fast_head[class_idx] = ptr; g_fast_count[class_idx] = (uint16_t)(count + 1); g_fast_push_hits[class_idx]++; diff --git a/core/hakmem_tiny_init.inc b/core/hakmem_tiny_init.inc index b9cd44bf..998717a9 100644 --- a/core/hakmem_tiny_init.inc +++ b/core/hakmem_tiny_init.inc @@ -108,6 +108,13 @@ void hak_tiny_init(void) { if (superslab_env) { g_use_superslab = (atoi(superslab_env) != 0) ? 1 : 0; } + + // Initialize Super Front Cache (SFC) with bench-friendly defaults + // Enabled by default; can be disabled via HAKMEM_SFC_ENABLE=0 + { + extern void sfc_init(void); + sfc_init(); + } // Note: Diet mode no longer overrides g_use_superslab (removed lines 104-105) // SuperSlab defaults to 1 unless explicitly disabled via env var // One-shot hint: publish/adopt requires SuperSlab ON diff --git a/core/hakmem_tiny_lifecycle.inc b/core/hakmem_tiny_lifecycle.inc index 34a69f8b..2fcac9dd 100644 --- a/core/hakmem_tiny_lifecycle.inc +++ b/core/hakmem_tiny_lifecycle.inc @@ -149,12 +149,8 @@ static void tiny_tls_cache_drain(int class_idx) { g_tls_sll_head[class_idx] = NULL; g_tls_sll_count[class_idx] = 0; while (sll) { -#if HAKMEM_TINY_HEADER_CLASSIDX - const size_t next_off_sll = (class_idx == 7) ? 0 : 1; -#else - const size_t next_off_sll = 0; -#endif - void* next = *(void**)((uint8_t*)sll + next_off_sll); + #include "tiny_nextptr.h" + void* next = tiny_next_load(sll, class_idx); tiny_tls_list_guard_push(class_idx, tls, sll); tls_list_push(tls, sll, class_idx); sll = next; @@ -165,12 +161,8 @@ static void tiny_tls_cache_drain(int class_idx) { g_fast_head[class_idx] = NULL; g_fast_count[class_idx] = 0; while (fast) { -#if HAKMEM_TINY_HEADER_CLASSIDX - const size_t next_off_fast = (class_idx == 7) ? 0 : 1; -#else - const size_t next_off_fast = 0; -#endif - void* next = *(void**)((uint8_t*)fast + next_off_fast); + #include "tiny_nextptr.h" + void* next = tiny_next_load(fast, class_idx); tiny_tls_list_guard_push(class_idx, tls, fast); tls_list_push(tls, fast, class_idx); fast = next; @@ -184,13 +176,8 @@ static void tiny_tls_cache_drain(int class_idx) { if (taken == 0u || head == NULL) break; void* cur = head; while (cur) { - // Header-aware next pointer from TLS list chain -#if HAKMEM_TINY_HEADER_CLASSIDX - const size_t next_off_tls = (class_idx == 7) ? 0 : 1; -#else - const size_t next_off_tls = 0; -#endif - void* next = *(void**)((uint8_t*)cur + next_off_tls); + #include "tiny_nextptr.h" + void* next = tiny_next_load(cur, class_idx); SuperSlab* ss = hak_super_lookup(cur); if (ss && ss->magic == SUPERSLAB_MAGIC) { hak_tiny_free_superslab(cur, ss); diff --git a/core/hakmem_tiny_refill.inc.h b/core/hakmem_tiny_refill.inc.h index 4e8f971c..4b2aa0f2 100644 --- a/core/hakmem_tiny_refill.inc.h +++ b/core/hakmem_tiny_refill.inc.h @@ -141,6 +141,18 @@ static inline void tiny_debug_validate_node_base(int class_idx, void* node, cons // Fast cache refill and take operation static inline void* tiny_fast_refill_and_take(int class_idx, TinyTLSList* tls) { + // Phase 1: C0–C3 prefer headerless array stack (FastCache) for lowest latency + if (__builtin_expect(g_fastcache_enable && class_idx <= 3, 1)) { + void* fc = fastcache_pop(class_idx); + if (fc) { + extern unsigned long long g_front_fc_hit[]; + g_front_fc_hit[class_idx]++; + return fc; + } else { + extern unsigned long long g_front_fc_miss[]; + g_front_fc_miss[class_idx]++; + } + } void* direct = tiny_fast_pop(class_idx); if (direct) return direct; uint16_t cap = g_fast_cap[class_idx]; @@ -173,11 +185,16 @@ static inline void* tiny_fast_refill_and_take(int class_idx, TinyTLSList* tls) { while (node && remaining > 0u) { void* next = *(void**)((uint8_t*)node + next_off_tls); - if (tiny_fast_push(class_idx, node)) { - node = next; - remaining--; + int pushed = 0; + if (__builtin_expect(g_fastcache_enable && class_idx <= 3, 1)) { + // Headerless array stack for hottest tiny classes + pushed = fastcache_push(class_idx, node); } else { - // Push failed, return remaining to TLS + pushed = tiny_fast_push(class_idx, node); + } + if (pushed) { node = next; remaining--; } + else { + // Push failed, return remaining to TLS (preserve order) tls_list_bulk_put(tls, node, batch_tail, remaining, class_idx); return ret; } diff --git a/core/hakmem_tiny_sfc.c b/core/hakmem_tiny_sfc.c index 70d6ee02..b551fdb9 100644 --- a/core/hakmem_tiny_sfc.c +++ b/core/hakmem_tiny_sfc.c @@ -31,7 +31,7 @@ sfc_stats_t g_sfc_stats[TINY_NUM_CLASSES] = {0}; // Box 5-NEW: Global Config (from ENV) // ============================================================================ -int g_sfc_enabled = 0; // Default: OFF (A/B testing) +int g_sfc_enabled = 1; // Default: ON (bench-focused; A/B via HAKMEM_SFC_ENABLE) static int g_sfc_default_capacity = SFC_DEFAULT_CAPACITY; static int g_sfc_default_refill = SFC_DEFAULT_REFILL_COUNT; @@ -110,6 +110,9 @@ void sfc_init(void) { } } + // Register shutdown hook for optional stats dump + atexit(sfc_shutdown); + // One-shot debug log static int debug_printed = 0; if (!debug_printed) { @@ -144,6 +147,37 @@ void sfc_shutdown(void) { // No cleanup needed (TLS memory freed by OS) } +// Cascade a first batch from TLS SLL into SFC after TLS prewarm. +// Hot classes only (0..3 and 5) to focus on 256B/小サイズ。 +void sfc_cascade_from_tls_initial(void) { + if (!g_sfc_enabled) return; + // TLS SLL externs + extern __thread void* g_tls_sll_head[]; + extern __thread uint32_t g_tls_sll_count[]; + for (int cls = 0; cls < TINY_NUM_CLASSES; cls++) { + if (!(cls <= 3 || cls == 5)) continue; // focus: 8..64B and 256B + uint32_t cap = g_sfc_capacity[cls]; + if (cap == 0) continue; + // target: max half of SFC cap or available SLL count + uint32_t avail = g_tls_sll_count[cls]; + if (avail == 0) continue; + uint32_t target = cap / 2; + if (target == 0) target = (avail < 16 ? avail : 16); + if (target > avail) target = avail; + // transfer + while (target-- > 0 && g_tls_sll_count[cls] > 0 && g_sfc_count[cls] < g_sfc_capacity[cls]) { + void* ptr = NULL; + // pop one from SLL + extern int tls_sll_pop(int class_idx, void** out_ptr); + if (!tls_sll_pop(cls, &ptr)) break; + // push into SFC + tiny_next_store(ptr, cls, g_sfc_head[cls]); + g_sfc_head[cls] = ptr; + g_sfc_count[cls]++; + } + } +} + // ============================================================================ // Box 5-NEW: Refill (Slow Path) - STUB (real logic in hakmem.c) // ============================================================================ diff --git a/core/hakmem_tiny_tls_list.h b/core/hakmem_tiny_tls_list.h index a6007382..9dc920e8 100644 --- a/core/hakmem_tiny_tls_list.h +++ b/core/hakmem_tiny_tls_list.h @@ -3,6 +3,8 @@ #include #include "tiny_remote.h" // TINY_REMOTE_SENTINEL for head poisoning guard +#include "tiny_nextptr.h" // header-aware next load/store +#include "tiny_nextptr.h" // Forward declarations typedef struct TinySlabMeta TinySlabMeta; @@ -57,23 +59,33 @@ static inline void* tls_list_pop(TinyTLSList* tls, int class_idx) { tls->count = 0; return NULL; } - if (__builtin_expect(class_idx == 7, 0)) { - tls->head = *(void**)head; - } else { - tls->head = *(void**)((uint8_t*)head + 1); - } + tls->head = tiny_next_load(head, class_idx); if (tls->count > 0) tls->count--; return head; } static inline void tls_list_push(TinyTLSList* tls, void* node, int class_idx) { if (!node) return; -#if HAKMEM_TINY_HEADER_CLASSIDX - const size_t next_off = (class_idx == 7) ? 0 : 1; -#else - const size_t next_off = 0; -#endif - *(void**)((uint8_t*)node + next_off) = tls->head; + tiny_next_store(node, class_idx, tls->head); + tls->head = node; + tls->count++; +} + +// Fast variants: no sentinel/guard checks, minimal bookkeeping +// Preconditions: +// - tls->head is not poisoned +// - node/head pointers belong to correct class +// - caller handles spill/thresholds separately +static inline void* tls_list_pop_fast(TinyTLSList* tls, int class_idx) { + void* head = tls->head; if (!head) return NULL; + tls->head = tiny_next_load(head, class_idx); + if (tls->count > 0) tls->count--; + return head; +} + +static inline void tls_list_push_fast(TinyTLSList* tls, void* node, int class_idx) { + if (!node) return; + tiny_next_store(node, class_idx, tls->head); tls->head = node; tls->count++; } @@ -83,13 +95,6 @@ static inline uint32_t tls_list_bulk_take(TinyTLSList* tls, void** out_head, void** out_tail, int class_idx) { - // Define next_off at function scope to avoid scope violation -#if HAKMEM_TINY_HEADER_CLASSIDX - const size_t next_off = (class_idx == 7) ? 0 : 1; -#else - const size_t next_off = 0; -#endif - if (out_head) *out_head = NULL; if (out_tail) *out_tail = NULL; if (tls->head == NULL || tls->count == 0) return 0; @@ -106,14 +111,14 @@ static inline uint32_t tls_list_bulk_take(TinyTLSList* tls, void* cur = head; uint32_t taken = 1; while (taken < want) { - void* next = *(void**)((uint8_t*)cur + next_off); + void* next = tiny_next_load(cur, class_idx); if (!next) break; cur = next; taken++; } void* tail = cur; - void* rest = *(void**)((uint8_t*)tail + next_off); - *(void**)((uint8_t*)tail + next_off) = NULL; + void* rest = tiny_next_load(tail, class_idx); + tiny_next_store(tail, class_idx, NULL); tls->head = rest; tls->count -= taken; @@ -125,12 +130,7 @@ static inline uint32_t tls_list_bulk_take(TinyTLSList* tls, static inline uint32_t tls_list_count_chain(void* head, int class_idx) { uint32_t cnt = 0; if (!head) return 0; -#if HAKMEM_TINY_HEADER_CLASSIDX - const size_t next_off = (class_idx == 7) ? 0 : 1; -#else - const size_t next_off = 0; -#endif - while (head) { cnt++; head = *(void**)((uint8_t*)head + next_off); } + while (head) { cnt++; head = tiny_next_load(head, class_idx); } return cnt; } @@ -139,29 +139,22 @@ static inline void tls_list_bulk_put(TinyTLSList* tls, void* tail, uint32_t count, int class_idx) { - // Define next_off at function scope to avoid scope violation -#if HAKMEM_TINY_HEADER_CLASSIDX - const size_t next_off = (class_idx == 7) ? 0 : 1; -#else - const size_t next_off = 0; -#endif - if (!head) return; if (!tail) { // Determine tail and count if not supplied tail = head; uint32_t computed = 1; - while (*(void**)((uint8_t*)tail + next_off)) { tail = *(void**)((uint8_t*)tail + next_off); computed++; } + while (tiny_next_load(tail, class_idx)) { tail = tiny_next_load(tail, class_idx); computed++; } if (count == 0) count = computed; } if (count == 0) { count = tls_list_count_chain(head, class_idx); // Move tail pointer to end if still NULL (just to be safe) - void* cur = head; - while (*(void**)((uint8_t*)cur + next_off)) cur = *(void**)((uint8_t*)cur + next_off); - tail = cur; + void* cur2 = head; + while (tiny_next_load(cur2, class_idx)) cur2 = tiny_next_load(cur2, class_idx); + tail = cur2; } - *(void**)((uint8_t*)tail + next_off) = tls->head; + tiny_next_store(tail, class_idx, tls->head); tls->head = head; tls->count += count; } diff --git a/core/superslab/superslab_inline.h b/core/superslab/superslab_inline.h index 330fd6fb..16623eb6 100644 --- a/core/superslab/superslab_inline.h +++ b/core/superslab/superslab_inline.h @@ -201,7 +201,7 @@ static inline uint8_t hak_tiny_superslab_next_lg(int class_idx) { // Remote free push (MPSC stack) - returns 1 if transitioned from empty static inline int ss_remote_push(SuperSlab* ss, int slab_idx, void* ptr) { atomic_fetch_add_explicit(&g_ss_remote_push_calls, 1, memory_order_relaxed); -#if !HAKMEM_BUILD_RELEASE +#if !HAKMEM_BUILD_RELEASE && HAKMEM_DEBUG_VERBOSE static _Atomic int g_remote_push_count = 0; int count = atomic_fetch_add_explicit(&g_remote_push_count, 1, memory_order_relaxed); if (count < 5) { diff --git a/core/tiny_alloc_fast.inc.h b/core/tiny_alloc_fast.inc.h index 23eab173..b4b57215 100644 --- a/core/tiny_alloc_fast.inc.h +++ b/core/tiny_alloc_fast.inc.h @@ -16,6 +16,8 @@ #include "hakmem_tiny.h" #include "tiny_route.h" #include "tiny_alloc_fast_sfc.inc.h" // Box 5-NEW: SFC Layer +#include "hakmem_tiny_fastcache.inc.h" // Array stack (FastCache) for C0–C3 +#include "hakmem_tiny_tls_list.h" // TLS List (for tiny_fast_refill_and_take) #include "tiny_region_id.h" // Phase 7: Header-based class_idx lookup #include "tiny_adaptive_sizing.h" // Phase 2b: Adaptive sizing #include "box/tls_sll_box.h" // Box TLS-SLL: C7-safe push/pop/splice @@ -186,6 +188,20 @@ static inline void* tiny_alloc_fast_pop(int class_idx) { uint64_t start = tiny_profile_enabled() ? tiny_fast_rdtsc() : 0; #endif + // Phase 1: Try array stack (FastCache) first for hottest tiny classes (C0–C3) + if (__builtin_expect(g_fastcache_enable && class_idx <= 3, 1)) { + void* fc = fastcache_pop(class_idx); + if (__builtin_expect(fc != NULL, 1)) { + // Frontend FastCache hit + extern unsigned long long g_front_fc_hit[]; + g_front_fc_hit[class_idx]++; + return fc; + } else { + extern unsigned long long g_front_fc_miss[]; + g_front_fc_miss[class_idx]++; + } + } + // Box 5-NEW: Layer 0 - Try SFC first (if enabled) // Cache g_sfc_enabled in TLS to avoid global load on every allocation static __thread int sfc_check_done = 0; @@ -457,34 +473,34 @@ static inline void* tiny_alloc_fast(size_t size) { } ROUTE_BEGIN(class_idx); - // 2. Fast path: TLS freelist pop (3-4 instructions, 95% hit rate) - // CRITICAL: Use Box TLS-SLL API (static inline, same performance as macro but SAFE!) - // The old macro had race condition: read head before pop → rbp=0xa0 SEGV - void* ptr = NULL; - tls_sll_pop(class_idx, &ptr); + // 2. Fast path: Frontend pop (FastCache/SFC/SLL) + // Try the consolidated fast pop path first (includes FastCache for C0–C3) + void* ptr = tiny_alloc_fast_pop(class_idx); if (__builtin_expect(ptr != NULL, 1)) { - // C7 (1024B, headerless): clear embedded next pointer before returning to user - if (__builtin_expect(class_idx == 7, 0)) { - *(void**)ptr = NULL; - } + // C7 (1024B, headerless) is never returned by tiny_alloc_fast_pop (returns NULL for C7) HAK_RET_ALLOC(class_idx, ptr); } - // 3. Miss: Refill from backend (Box 3: SuperSlab) + // 3. Miss: Refill from TLS List/SuperSlab and take one into FastCache/front + { + // Use header-aware TLS List bulk transfer that prefers FastCache for C0–C3 + extern __thread TinyTLSList g_tls_lists[TINY_NUM_CLASSES]; + void* took = tiny_fast_refill_and_take(class_idx, &g_tls_lists[class_idx]); + if (took) { + HAK_RET_ALLOC(class_idx, took); + } + } + + // 4. Still miss: Fallback to existing backend refill and retry int refilled = tiny_alloc_fast_refill(class_idx); if (__builtin_expect(refilled > 0, 1)) { - // Refill success → retry pop using safe Box TLS-SLL API - ptr = NULL; - tls_sll_pop(class_idx, &ptr); + ptr = tiny_alloc_fast_pop(class_idx); if (ptr) { - if (__builtin_expect(class_idx == 7, 0)) { - *(void**)ptr = NULL; - } HAK_RET_ALLOC(class_idx, ptr); } } - // 4. Refill failure or still empty → slow path (OOM or new SuperSlab) + // 5. Refill failure or still empty → slow path (OOM or new SuperSlab) // Box Boundary: Delegate to Slow Path (Box 3 backend) ptr = hak_tiny_alloc_slow(size, class_idx); if (ptr) { diff --git a/core/tiny_alloc_fast_sfc.inc.h b/core/tiny_alloc_fast_sfc.inc.h index 1c56d163..c0a6fa10 100644 --- a/core/tiny_alloc_fast_sfc.inc.h +++ b/core/tiny_alloc_fast_sfc.inc.h @@ -9,14 +9,15 @@ #include // For debug output (getenv, fprintf, stderr) #include // For getenv #include "hakmem_tiny.h" +#include "tiny_nextptr.h" // ============================================================================ // Box 5-NEW: Super Front Cache - Global Config // ============================================================================ // Default capacities (can be overridden per-class) -#define SFC_DEFAULT_CAPACITY 128 -#define SFC_DEFAULT_REFILL_COUNT 64 +#define SFC_DEFAULT_CAPACITY 256 +#define SFC_DEFAULT_REFILL_COUNT 128 #define SFC_DEFAULT_SPILL_THRESH 90 // Spill when >90% full // Per-class capacity limits @@ -78,13 +79,8 @@ static inline void* sfc_alloc(int cls) { void* base = g_sfc_head[cls]; if (__builtin_expect(base != NULL, 1)) { -#if HAKMEM_TINY_HEADER_CLASSIDX - const size_t next_offset = (cls == 7) ? 0 : 1; -#else - const size_t next_offset = 0; -#endif - // Pop: header-aware next - g_sfc_head[cls] = *(void**)((uint8_t*)base + next_offset); + // Pop: safe header-aware next + g_sfc_head[cls] = tiny_next_load(base, cls); g_sfc_count[cls]--; // count-- #if HAKMEM_DEBUG_COUNTERS @@ -109,23 +105,22 @@ static inline int sfc_free_push(int cls, void* ptr) { uint32_t cap = g_sfc_capacity[cls]; uint32_t cnt = g_sfc_count[cls]; - // Debug: Always log sfc_free_push calls when SFC_DEBUG is set - static __thread int free_debug_count = 0; - if (getenv("HAKMEM_SFC_DEBUG") && free_debug_count < 20) { - free_debug_count++; - extern int g_sfc_enabled; - fprintf(stderr, "[SFC_FREE_PUSH] cls=%d, ptr=%p, cnt=%u, cap=%u, will_succeed=%d, enabled=%d\n", - cls, ptr, cnt, cap, (cnt < cap), g_sfc_enabled); - } +#if !HAKMEM_BUILD_RELEASE && defined(HAKMEM_SFC_DEBUG_LOG) + // Debug logging (compile-time gated; zero cost in release) + do { + static __thread int free_debug_count = 0; + if (getenv("HAKMEM_SFC_DEBUG") && free_debug_count < 20) { + free_debug_count++; + extern int g_sfc_enabled; + fprintf(stderr, "[SFC_FREE_PUSH] cls=%d, ptr=%p, cnt=%u, cap=%u, will_succeed=%d, enabled=%d\n", + cls, ptr, cnt, cap, (cnt < cap), g_sfc_enabled); + } + } while(0); +#endif if (__builtin_expect(cnt < cap, 1)) { -#if HAKMEM_TINY_HEADER_CLASSIDX - const size_t next_offset = (cls == 7) ? 0 : 1; -#else - const size_t next_offset = 0; -#endif - // Push: header-aware next placement - *(void**)((uint8_t*)ptr + next_offset) = g_sfc_head[cls]; + // Push: safe header-aware next placement + tiny_next_store(ptr, cls, g_sfc_head[cls]); g_sfc_head[cls] = ptr; // head = base g_sfc_count[cls] = cnt + 1; // count++ @@ -149,6 +144,7 @@ static inline int sfc_free_push(int cls, void* ptr) { // Initialize SFC (called once at startup) void sfc_init(void); +void sfc_cascade_from_tls_initial(void); // Shutdown SFC (called at exit, optional) void sfc_shutdown(void); diff --git a/core/tiny_debug_ring.h b/core/tiny_debug_ring.h index 7c544914..ed5800af 100644 --- a/core/tiny_debug_ring.h +++ b/core/tiny_debug_ring.h @@ -3,6 +3,7 @@ #include #include +#include "hakmem_build_flags.h" // Tiny Debug Ring Trace (Phase 8 tooling) // Environment: HAKMEM_TINY_TRACE_RING=1 to enable @@ -36,7 +37,16 @@ enum { TINY_RING_EVENT_ROUTE }; +#if HAKMEM_BUILD_RELEASE && !HAKMEM_DEBUG_VERBOSE +static inline void tiny_debug_ring_init(void) { + (void)0; +} +static inline void tiny_debug_ring_record(uint16_t event, uint16_t class_idx, void* ptr, uintptr_t aux) { + (void)event; (void)class_idx; (void)ptr; (void)aux; +} +#else void tiny_debug_ring_init(void); void tiny_debug_ring_record(uint16_t event, uint16_t class_idx, void* ptr, uintptr_t aux); +#endif #endif // TINY_DEBUG_RING_H diff --git a/core/tiny_free_magazine.inc.h b/core/tiny_free_magazine.inc.h index 91ca6294..f85d479c 100644 --- a/core/tiny_free_magazine.inc.h +++ b/core/tiny_free_magazine.inc.h @@ -80,7 +80,8 @@ #else const size_t next_off = 0; #endif - *(void**)((uint8_t*)head + next_off) = NULL; + #include "tiny_nextptr.h" + tiny_next_store(head, class_idx, NULL); void* tail = head; // current tail int taken = 1; while (taken < limit && mag->top > 0) { @@ -90,7 +91,7 @@ #else const size_t next_off2 = 0; #endif - *(void**)((uint8_t*)p2 + next_off2) = head; + tiny_next_store(p2, class_idx, head); head = p2; taken++; } @@ -211,7 +212,7 @@ if (tls->count < tls->cap) { void* base = (class_idx == 7) ? ptr : (void*)((uint8_t*)ptr - 1); tiny_tls_list_guard_push(class_idx, tls, base); - tls_list_push(tls, base, class_idx); + tls_list_push_fast(tls, base, class_idx); HAK_STAT_FREE(class_idx); return; } @@ -222,7 +223,7 @@ { void* base = (class_idx == 7) ? ptr : (void*)((uint8_t*)ptr - 1); tiny_tls_list_guard_push(class_idx, tls, base); - tls_list_push(tls, base, class_idx); + tls_list_push_fast(tls, base, class_idx); } if (tls_list_should_spill(tls)) { tls_list_spill_excess(class_idx, tls); diff --git a/core/tiny_nextptr.h b/core/tiny_nextptr.h new file mode 100644 index 00000000..b87c6a87 --- /dev/null +++ b/core/tiny_nextptr.h @@ -0,0 +1,59 @@ +// tiny_nextptr.h - Safe load/store for header-aware next pointers +// +// Context: +// - Tiny classes 0–6 place a 1-byte header immediately before the user pointer +// - Freelist "next" is stored inside the block at an offset that depends on class +// - Many hot paths currently cast to void** at base+1, which is unaligned and UB in C +// +// This header centralizes the offset calculation and uses memcpy-based loads/stores +// to avoid undefined behavior from unaligned pointer access. Compilers will optimize +// these to efficient byte moves on x86_64 while remaining standards-compliant. + +#ifndef TINY_NEXTPTR_H +#define TINY_NEXTPTR_H + +#include +#include +#include "hakmem_build_flags.h" + +// Compute freelist next-pointer offset within a block for the given class. +// - Class 7 (1024B) is headerless → next at offset 0 (block base) +// - Classes 0–6 have 1-byte header → next at offset 1 +static inline __attribute__((always_inline)) size_t tiny_next_off(int class_idx) { +#if HAKMEM_TINY_HEADER_CLASSIDX + return (class_idx == 7) ? 0 : 1; +#else + (void)class_idx; + return 0; +#endif +} + +// Safe load of next pointer from a block base +static inline __attribute__((always_inline)) void* tiny_next_load(const void* base, int class_idx) { + size_t off = tiny_next_off(class_idx); +#if HAKMEM_TINY_HEADER_CLASSIDX + if (__builtin_expect(off != 0, 0)) { + void* next = NULL; + const uint8_t* p = (const uint8_t*)base + off; + memcpy(&next, p, sizeof(void*)); + return next; + } +#endif + // Either headers are disabled, or this class uses offset 0 (aligned) + return *(void* const*)base; +} + +// Safe store of next pointer into a block base +static inline __attribute__((always_inline)) void tiny_next_store(void* base, int class_idx, void* next) { + size_t off = tiny_next_off(class_idx); +#if HAKMEM_TINY_HEADER_CLASSIDX + if (__builtin_expect(off != 0, 0)) { + uint8_t* p = (uint8_t*)base + off; + memcpy(p, &next, sizeof(void*)); + return; + } +#endif + *(void**)base = next; +} + +#endif // TINY_NEXTPTR_H diff --git a/core/tiny_route.h b/core/tiny_route.h index 7f8d143f..2e1ac6c9 100644 --- a/core/tiny_route.h +++ b/core/tiny_route.h @@ -46,30 +46,38 @@ static inline uint32_t route_sample_mask(void) { return (g_route_sample_lg >= 31) ? 0xFFFFFFFFu : ((1u << g_route_sample_lg) - 1u); } -#define ROUTE_BEGIN(cls) do { \ - if (__builtin_expect(!route_enabled_runtime(), 1)) { g_route_active = 0; break; } \ - uint32_t m = route_sample_mask(); \ - uint32_t s = ++g_route_seq; \ - g_route_active = ((s & m) == 0u); \ - g_route_fp = 0ull; \ - (void)(cls); \ -} while(0) +#if HAKMEM_BUILD_RELEASE && !HAKMEM_ROUTE + #define ROUTE_BEGIN(cls) do { (void)(cls); } while(0) + #define ROUTE_MARK(bit) do { (void)(bit); } while(0) + #define ROUTE_COMMIT(cls, tag) do { (void)(cls); (void)(tag); } while(0) + static inline void route_free_commit(int class_idx, uint64_t bits, uint16_t tag) { + (void)class_idx; (void)bits; (void)tag; + } +#else + #define ROUTE_BEGIN(cls) do { \ + if (__builtin_expect(!route_enabled_runtime(), 1)) { g_route_active = 0; break; } \ + uint32_t m = route_sample_mask(); \ + uint32_t s = ++g_route_seq; \ + g_route_active = ((s & m) == 0u); \ + g_route_fp = 0ull; \ + (void)(cls); \ + } while(0) -#define ROUTE_MARK(bit) do { if (__builtin_expect(g_route_active, 0)) { g_route_fp |= (1ull << (bit)); } } while(0) + #define ROUTE_MARK(bit) do { if (__builtin_expect(g_route_active, 0)) { g_route_fp |= (1ull << (bit)); } } while(0) -#define ROUTE_COMMIT(cls, tag) do { \ - if (__builtin_expect(g_route_active, 0)) { \ - uintptr_t aux = ((uintptr_t)(tag & 0xFFFF) << 48) | (uintptr_t)(g_route_fp & 0x0000FFFFFFFFFFFFull); \ - tiny_debug_ring_record(TINY_RING_EVENT_ROUTE, (uint16_t)(cls), (void*)(uintptr_t)g_route_fp, aux); \ - g_route_active = 0; \ - } \ -} while(0) + #define ROUTE_COMMIT(cls, tag) do { \ + if (__builtin_expect(g_route_active, 0)) { \ + uintptr_t aux = ((uintptr_t)(tag & 0xFFFF) << 48) | (uintptr_t)(g_route_fp & 0x0000FFFFFFFFFFFFull); \ + tiny_debug_ring_record(TINY_RING_EVENT_ROUTE, (uint16_t)(cls), (void*)(uintptr_t)g_route_fp, aux); \ + g_route_active = 0; \ + } \ + } while(0) -// Free-side one-shot route commit (independent of alloc-side COMMIT) -static inline void route_free_commit(int class_idx, uint64_t bits, uint16_t tag) { - if (!route_enabled_runtime()) return; - uintptr_t aux = ((uintptr_t)(tag & 0xFFFF) << 48) | (uintptr_t)(bits & 0x0000FFFFFFFFFFFFull); - tiny_debug_ring_record(TINY_RING_EVENT_ROUTE, (uint16_t)class_idx, (void*)(uintptr_t)bits, aux); -} + static inline void route_free_commit(int class_idx, uint64_t bits, uint16_t tag) { + if (!route_enabled_runtime()) return; + uintptr_t aux = ((uintptr_t)(tag & 0xFFFF) << 48) | (uintptr_t)(bits & 0x0000FFFFFFFFFFFFull); + tiny_debug_ring_record(TINY_RING_EVENT_ROUTE, (uint16_t)class_idx, (void*)(uintptr_t)bits, aux); + } +#endif // Note: Build-time gate removed to keep integration simple; runtime env controls activation. diff --git a/hakmem.d b/hakmem.d index a0f4aab5..d2ad4a31 100644 --- a/hakmem.d +++ b/hakmem.d @@ -19,11 +19,11 @@ hakmem.o: core/hakmem.c core/hakmem.h core/hakmem_build_flags.h \ core/hakmem_ace_metrics.h core/hakmem_ace_ucb1.h core/ptr_trace.h \ core/box/hak_exit_debug.inc.h core/box/hak_kpi_util.inc.h \ core/box/hak_core_init.inc.h core/hakmem_phase7_config.h \ - core/box/hak_alloc_api.inc.h core/box/../pool_tls.h \ - core/box/hak_free_api.inc.h core/hakmem_tiny_superslab.h \ - core/box/../tiny_free_fast_v2.inc.h core/box/../tiny_region_id.h \ - core/box/../hakmem_build_flags.h core/box/../hakmem_tiny_config.h \ - core/box/../box/tls_sll_box.h core/box/../box/../hakmem_tiny_config.h \ + core/box/hak_alloc_api.inc.h core/box/hak_free_api.inc.h \ + core/hakmem_tiny_superslab.h core/box/../tiny_free_fast_v2.inc.h \ + core/box/../tiny_region_id.h core/box/../hakmem_build_flags.h \ + core/box/../hakmem_tiny_config.h core/box/../box/tls_sll_box.h \ + core/box/../box/../hakmem_tiny_config.h \ core/box/../box/../hakmem_build_flags.h \ core/box/../box/../tiny_region_id.h core/box/front_gate_classifier.h \ core/box/hak_wrappers.inc.h @@ -77,7 +77,6 @@ core/box/hak_kpi_util.inc.h: core/box/hak_core_init.inc.h: core/hakmem_phase7_config.h: core/box/hak_alloc_api.inc.h: -core/box/../pool_tls.h: core/box/hak_free_api.inc.h: core/hakmem_tiny_superslab.h: core/box/../tiny_free_fast_v2.inc.h: diff --git a/hakmem_tiny_bg_spill.d b/hakmem_tiny_bg_spill.d index 696ef4c0..944a17b4 100644 --- a/hakmem_tiny_bg_spill.d +++ b/hakmem_tiny_bg_spill.d @@ -1,5 +1,6 @@ hakmem_tiny_bg_spill.o: core/hakmem_tiny_bg_spill.c \ - core/hakmem_tiny_bg_spill.h core/hakmem_tiny_superslab.h \ + core/hakmem_tiny_bg_spill.h core/tiny_nextptr.h \ + core/hakmem_build_flags.h core/hakmem_tiny_superslab.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/superslab/superslab_inline.h core/superslab/superslab_types.h \ core/tiny_debug_ring.h core/tiny_remote.h \ @@ -7,9 +8,11 @@ hakmem_tiny_bg_spill.o: core/hakmem_tiny_bg_spill.c \ core/superslab/../hakmem_tiny_superslab_constants.h \ core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ - core/hakmem_build_flags.h core/hakmem_super_registry.h \ - core/hakmem_tiny.h core/hakmem_trace.h core/hakmem_tiny_mini_mag.h + core/hakmem_super_registry.h core/hakmem_tiny.h core/hakmem_trace.h \ + core/hakmem_tiny_mini_mag.h core/hakmem_tiny_bg_spill.h: +core/tiny_nextptr.h: +core/hakmem_build_flags.h: core/hakmem_tiny_superslab.h: core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: @@ -23,7 +26,6 @@ core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: -core/hakmem_build_flags.h: core/hakmem_super_registry.h: core/hakmem_tiny.h: core/hakmem_trace.h: diff --git a/hakmem_tiny_sfc.d b/hakmem_tiny_sfc.d index 5360fdfe..7dca2a13 100644 --- a/hakmem_tiny_sfc.d +++ b/hakmem_tiny_sfc.d @@ -1,10 +1,11 @@ hakmem_tiny_sfc.o: core/hakmem_tiny_sfc.c core/tiny_alloc_fast_sfc.inc.h \ core/hakmem_tiny.h core/hakmem_build_flags.h core/hakmem_trace.h \ - core/hakmem_tiny_mini_mag.h core/hakmem_tiny_config.h \ - core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ - core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ - core/tiny_remote.h core/superslab/../tiny_box_geometry.h \ + core/hakmem_tiny_mini_mag.h core/tiny_nextptr.h \ + core/hakmem_tiny_config.h core/hakmem_tiny_superslab.h \ + core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ + core/superslab/superslab_inline.h core/superslab/superslab_types.h \ + core/tiny_debug_ring.h core/tiny_remote.h \ + core/superslab/../tiny_box_geometry.h \ core/superslab/../hakmem_tiny_superslab_constants.h \ core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ @@ -14,6 +15,7 @@ core/hakmem_tiny.h: core/hakmem_build_flags.h: core/hakmem_trace.h: core/hakmem_tiny_mini_mag.h: +core/tiny_nextptr.h: core/hakmem_tiny_config.h: core/hakmem_tiny_superslab.h: core/superslab/superslab_types.h: diff --git a/hakmem_tiny_superslab.d b/hakmem_tiny_superslab.d index 570be1d2..a0104bfa 100644 --- a/hakmem_tiny_superslab.d +++ b/hakmem_tiny_superslab.d @@ -2,20 +2,22 @@ hakmem_tiny_superslab.o: core/hakmem_tiny_superslab.c \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ core/superslab/superslab_types.h core/tiny_debug_ring.h \ - core/tiny_remote.h core/superslab/../tiny_box_geometry.h \ + core/hakmem_build_flags.h core/tiny_remote.h \ + core/superslab/../tiny_box_geometry.h \ core/superslab/../hakmem_tiny_superslab_constants.h \ core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ - core/hakmem_build_flags.h core/hakmem_super_registry.h \ - core/hakmem_tiny.h core/hakmem_trace.h core/hakmem_tiny_mini_mag.h \ - core/hakmem_internal.h core/hakmem.h core/hakmem_config.h \ - core/hakmem_features.h core/hakmem_sys.h core/hakmem_whale.h + core/hakmem_super_registry.h core/hakmem_tiny.h core/hakmem_trace.h \ + core/hakmem_tiny_mini_mag.h core/hakmem_internal.h core/hakmem.h \ + core/hakmem_config.h core/hakmem_features.h core/hakmem_sys.h \ + core/hakmem_whale.h core/hakmem_tiny_superslab.h: core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: core/tiny_debug_ring.h: +core/hakmem_build_flags.h: core/tiny_remote.h: core/superslab/../tiny_box_geometry.h: core/superslab/../hakmem_tiny_superslab_constants.h: @@ -23,7 +25,6 @@ core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: -core/hakmem_build_flags.h: core/hakmem_super_registry.h: core/hakmem_tiny.h: core/hakmem_trace.h: diff --git a/tiny_debug_ring.d b/tiny_debug_ring.d index deaa016b..a204eb45 100644 --- a/tiny_debug_ring.d +++ b/tiny_debug_ring.d @@ -1,8 +1,8 @@ tiny_debug_ring.o: core/tiny_debug_ring.c core/tiny_debug_ring.h \ - core/hakmem_tiny.h core/hakmem_build_flags.h core/hakmem_trace.h \ + core/hakmem_build_flags.h core/hakmem_tiny.h core/hakmem_trace.h \ core/hakmem_tiny_mini_mag.h core/tiny_debug_ring.h: -core/hakmem_tiny.h: core/hakmem_build_flags.h: +core/hakmem_tiny.h: core/hakmem_trace.h: core/hakmem_tiny_mini_mag.h: diff --git a/tiny_remote.d b/tiny_remote.d index eb16a160..f32254de 100644 --- a/tiny_remote.d +++ b/tiny_remote.d @@ -2,10 +2,11 @@ tiny_remote.o: core/tiny_remote.c core/tiny_remote.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ core/superslab/superslab_types.h core/tiny_debug_ring.h \ - core/tiny_remote.h core/superslab/../tiny_box_geometry.h \ + core/hakmem_build_flags.h core/tiny_remote.h \ + core/superslab/../tiny_box_geometry.h \ core/superslab/../hakmem_tiny_superslab_constants.h \ core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ - core/hakmem_tiny_superslab_constants.h core/hakmem_build_flags.h + core/hakmem_tiny_superslab_constants.h core/tiny_remote.h: core/hakmem_tiny_superslab.h: core/superslab/superslab_types.h: @@ -13,10 +14,10 @@ core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: core/tiny_debug_ring.h: +core/hakmem_build_flags.h: core/tiny_remote.h: core/superslab/../tiny_box_geometry.h: core/superslab/../hakmem_tiny_superslab_constants.h: core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/hakmem_tiny_superslab_constants.h: -core/hakmem_build_flags.h: