Fix C7 warm/TLS Release path and unify debug instrumentation
@@ -1,4 +1,4 @@
-## HAKMEM status memo (updated 2025-12-05)
+## HAKMEM status memo (updated 2025-12-05 / C7 Warm/TLS Bind applied)
 
 ### Current state (Tiny / Superslab / Warm Pool)
 - Tiny Front / Superslab / Shared Pool have been organized into a 3-layer (HOT/WARM/COLD) structure per Box Theory.
@@ -27,10 +27,26 @@
 - Added `core/box/tiny_page_box.h` / `core/box/tiny_page_box.c`, implementing a Page Box whose enabled classes are controlled via `HAKMEM_TINY_PAGE_BOX_CLASSES`.
 - `tiny_tls_bind_slab()` now calls `tiny_page_box_on_new_slab()`, registering the C7 slab bound by TLS in the per-thread page pool.
 - Added a Page Box path at the head of `unified_cache_refill()`: for C7, batch supply is first attempted from the freelist/carve inside the page the TLS currently holds, before falling back to the Warm Pool / Shared Pool (Box boundaries keep the order `Tiny Page Box → Warm Pool → Shared Pool`).
+- Introduced the TLS Bind Box:
+  - Added `ss_tls_bind_one()` to `core/box/ss_tls_bind_box.h`, consolidating the "Superslab + slab_idx → TLS" bind sequence (`superslab_init_slab` / setting `meta->class_idx` / `tiny_tls_bind_slab`) in one place.
+  - `superslab_refill()` (Shared Pool path) and the experimental Warm Pool path now both connect to TLS through this Box.
+- Implemented and verified the C7 Warm/TLS Bind path:
+  - Added a C7-specific Warm/TLS Bind mode (0/1/2) to `core/front/tiny_unified_cache.c`; in Debug it is switchable via `HAKMEM_WARM_TLS_BIND_C7`.
+  - mode 0: Legacy Warm (legacy/debug only; for C7 it often carves 0 blocks, so it is deprecated)
+  - mode 1: Bind-only (production path: Superslabs taken from Warm are bound via the TLS Bind Box)
+  - mode 2: Bind+TLS carve (experimental path that carves directly from TLS)
+  - Release builds are always pinned to mode=1. Debug can switch via `HAKMEM_WARM_TLS_BIND_C7=0/1/2`.
+- Detailed Warm Pool / Unified Cache instrumentation:
+  - Extended `warm_pool_dbg_box.h` and the Unified Cache measurement hooks so that, for C7, Debug builds can observe:
+  - Warm pop attempts / hits / actual carve counts
+  - TLS carve attempts / successes / failures
+  - UC misses classified into Warm / TLS / Shared
+- Added `HAKMEM_BENCH_C7_ONLY=1` to `bench_random_mixed.c`, adding a C7-size-only micro-bench.
 
 ### Performance status (Random Mixed, HEAD)
 - Conditions: `bench_random_mixed_hakmem 1000000 256 42` (1T, ws=256, RELEASE, 16–1024B)
-  - HAKMEM: ~5.0M ops/s
+  - HAKMEM: ~27.6M ops/s (after the C7 Warm/TLS repair)
   - system malloc: ~90–100M ops/s
   - mimalloc: ~120–130M ops/s
 - Conditions: `bench_random_mixed_hakmem 1000000 256 42` +
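The mode switch described in the memo (Release pinned to mode=1, Debug selectable via `HAKMEM_WARM_TLS_BIND_C7`) can be sketched as below. This is an illustrative sketch, not the actual `tiny_unified_cache.c` code: the function name `c7_bind_mode` and the use of `NDEBUG` as a stand-in for `HAKMEM_BUILD_RELEASE` are assumptions.

```c
#include <stdlib.h>

/* Hedged sketch of a build-gated env switch like HAKMEM_WARM_TLS_BIND_C7.
 * Mode values mirror the memo: 0 = Legacy Warm, 1 = Bind-only (production),
 * 2 = Bind+TLS carve (experimental). */
enum { C7_BIND_LEGACY = 0, C7_BIND_ONLY = 1, C7_BIND_TLS_CARVE = 2 };

static int c7_bind_mode(void) {
#ifdef NDEBUG                      /* stand-in for HAKMEM_BUILD_RELEASE */
    return C7_BIND_ONLY;           /* Release: always mode=1, env ignored */
#else
    const char* e = getenv("HAKMEM_WARM_TLS_BIND_C7");
    if (!e) return C7_BIND_ONLY;   /* default to the production path */
    int m = atoi(e);
    return (m >= C7_BIND_LEGACY && m <= C7_BIND_TLS_CARVE) ? m : C7_BIND_ONLY;
#endif
}
```

Reading the env var once at refill setup (rather than per allocation) keeps the switch off the hot path.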
@@ -38,26 +54,27 @@
 - HAKMEM Tiny Front: ~80–90M ops/s (same order as mimalloc)
 - Conditions: `bench_random_mixed_hakmem 1000000 256 42` +
   `HAKMEM_BENCH_MIN_SIZE=129 HAKMEM_BENCH_MAX_SIZE=1024` (Tiny C5–C7 only)
-  - HAKMEM: ~4.7–4.8M ops/s
+  - HAKMEM: ~28.0M ops/s (after applying the Warm/TLS guard)
+- Conditions: C7-only micro-bench (Debug, `HAKMEM_BENCH_C7_ONLY=1 HAKMEM_TINY_PROFILE=full HAKMEM_WARM_C7_MAX=8 HAKMEM_WARM_C7_PREFETCH=4`, etc.)
+  - mode 0 (Legacy Warm): ~2.0M ops/s; C7 Warm hits 0, many Shared Pool locks (`slab_carve_from_ss` frequently returns 0)
+  - mode 1 (Bind-only): ~20M ops/s (iters=200K, ws=32); Warm hit ≈100%, Shared Pool locks reduced to 5
+  - mode 2 (Bind+TLS carve, experimental): equal to or slightly above mode 1 (UC misses increase but concentrate in `uc_miss_tls`, and avg_refill shortens)
+- Conditions: C7-only micro-bench (Release, `HAKMEM_BENCH_C7_ONLY=1 HAKMEM_TINY_PROFILE=full HAKMEM_WARM_C7_MAX=8 HAKMEM_WARM_C7_PREFETCH=4`)
+  - HAKMEM: ~18.8M ops/s (after introducing the empty-slab-enforcing guard + reset; back to the same order as Debug)
 - Conclusions:
   - The Tiny front itself (8–128B) is fast enough, reaching the same order as mimalloc.
-  - The Tiny C5–C7 path (129–1024B) has a bottleneck of Unified Cache hit=0 / frequent Shared Pool locks,
-    which dominates overall Random Mixed performance.
+  - The C5–C7 path's "full C7 slabs were being re-supplied to Warm" problem was fixed by the empty-slab-only guard plus a reset shared between Release and Debug;
+    C7-only Release also recovered to ~18.8M ops/s, and Random Mixed Release improved to the 27M class.
 
-### Next steps (priority tasks: verify and tune the effectiveness of the C7 Page Box)
-1. **Measure the effectiveness of the C7 Page Box path**
-   - ENV: `HAKMEM_BENCH_MIN_SIZE=129 HAKMEM_BENCH_MAX_SIZE=1024` + `HAKMEM_MEASURE_UNIFIED_CACHE=1`, run
-     `bench_random_mixed_hakmem 1000000 256 42` and, for C7, compare:
-     - Unified Cache refill count / average cycles
-     - `shared_pool_acquire_slab(C7)` lock count
-     with Page Box ON/OFF (`HAKMEM_TINY_PAGE_BOX_CLASSES=` unset vs `7`).
-2. **Tune C7 Unified Cache capacity and batch size**
-   - Vary `HAKMEM_TINY_UNIFIED_C7` and the `max_batch` setting of `unified_cache_refill()` while
-     observing C7 hit rate / Shared Pool lock count / throughput with Page Box ON, to find the capacity/batch size best suited to C7.
-3. **Decide whether to extend the Page Box to C5/C6**
-   - If C7 shows enough benefit (large Shared Pool lock reduction + throughput gain),
-     try `HAKMEM_TINY_PAGE_BOX_CLASSES=5,6,7` and check stability/performance with C5/C6 made Tiny-Plus as well.
-   - If no problems appear, consider moving the default profile toward "C5–C7 Page Box enabled".
+### Next steps (confirm stability under broader conditions)
+1. With `HAKMEM_BENCH_MIN_SIZE=129 HAKMEM_BENCH_MAX_SIZE=1024` and the plain `bench_random_mixed_hakmem 1000000 256 42`,
+   keep confirming that the empty-slab-only guard works without side effects (currently 27–28M ops/s confirmed in Release).
+2. Documentation updates:
+   - Root cause of C7 Warm being dead only in Release = the Shared Pool was re-supplying full C7 slabs without resetting them.
+   - The empty-slab-enforcing guard on Acquire plus the Release/Debug-shared reset recovered C7-only Release to ~18.8M ops/s.
+3. Next-phase candidates:
+   - apply the same Warm/TLS optimization and empty-slab guard to C5/C6, or
+   - sweep the remaining Random Mixed bottlenecks (Shared Pool locks / wrapper / mid-size path, etc.) — choose one.
 
 ### Notes
 - The page-fault problem has been resolved to a sufficient level by the Prefault Box + warm-up; the main bottleneck has now moved to the user-space boxes (Unified Cache / free / Pool).
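The fix the memo describes — restrict C7 Acquire to empty slabs and reset the meta on Release — can be sketched as below. This is a simplified illustration under assumed names: `TinySlabMetaSketch`, `c7_slab_acquirable`, and `c7_slab_reset` are stand-ins, not the actual `ss_slab_reset_box.h` / Shared Pool code (which also resets `class_map` and the remote-free queues).

```c
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-in for TinySlabMeta (assumption, not the real layout). */
typedef struct {
    uint16_t used, carved, capacity;
    void*    freelist;
    uint8_t  class_idx;
} TinySlabMetaSketch;

/* Acquire-side guard: for C7, only hand out slabs with no live blocks and no
 * freelist, so a "full" slab can never be re-supplied to the Warm Pool. */
static int c7_slab_acquirable(const TinySlabMetaSketch* m) {
    return m->used == 0 && m->freelist == NULL;
}

/* Release-side reset, in the spirit of ss_slab_reset_meta_for_tiny(): clear
 * the carve state so the slab re-enters the pool as genuinely empty. */
static void c7_slab_reset(TinySlabMetaSketch* m, uint8_t class_idx) {
    m->used = 0;
    m->carved = 0;
    m->freelist = NULL;
    m->class_idx = class_idx;
}
```

Without the reset, a slab whose `used == capacity` passes through Acquire looking usable but carves zero blocks — which is exactly the Release-only stall the memo diagnoses.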
@@ -1,5 +1,8 @@
 # HAKMEM Allocator Performance Analysis Results
 
+**Latest note (2025-12-05)**: C7 Warm/TLS Bind is unified to Bind-only (mode=1) on the production path. Debug can switch via `HAKMEM_WARM_TLS_BIND_C7=0/1/2`, but Release is always pinned to mode=1. On C7-only workloads, mode=1 is ~4–10x faster than legacy (mode=0); mode=2 remains in place as a TLS carve experiment.
+**Addendum (2025-12-05, Release repair)**: The reason C7 Warm was dead only in Release was that full C7 slabs stayed resident in the Shared Pool and empty slabs never reached Warm. A guard that restricts C7 to empty slabs on Acquire and resets the meta on Release recovered C7-only Release to ~18.8M ops/s and Random Mixed Release to ~27–28M ops/s.
+
 **Analysis date**: 2025-11-28
 **Target**: HAKMEM allocator (commit 0ce20bb83)
 **Benchmark**: bench_random_mixed (1,000,000 ops, working set=256)
@@ -13,9 +13,16 @@
 #include <stdint.h>
 #include <time.h>
 #include <string.h>
+#include <stdatomic.h>
+#define C7_META_COUNTER_DEFINE
+#include "core/box/c7_meta_used_counter_box.h"
+#undef C7_META_COUNTER_DEFINE
+#include "core/box/warm_pool_rel_counters_box.h"
 
 #ifdef USE_HAKMEM
 #include "hakmem.h"
+#include "hakmem_build_flags.h"
+#include "core/box/c7_meta_used_counter_box.h"
 
 // Box BenchMeta: Benchmark metadata management (bypass hakmem wrapper)
 // Phase 15: Separate BenchMeta (slots array) from CoreAlloc (user workload)
@@ -253,6 +260,38 @@ int main(int argc, char** argv){
     extern void tiny_warm_pool_print_stats_public(void);
     tiny_warm_pool_print_stats_public();
+
+#if HAKMEM_BUILD_RELEASE
+    // Minimal Release-side telemetry to verify Warm path usage (C7-only)
+    extern _Atomic uint64_t g_rel_c7_warm_pop;
+    extern _Atomic uint64_t g_rel_c7_warm_push;
+    fprintf(stderr,
+            "[REL_C7_CARVE] attempts=%llu success=%llu zero=%llu\n",
+            (unsigned long long)warm_pool_rel_c7_carve_attempts(),
+            (unsigned long long)warm_pool_rel_c7_carve_successes(),
+            (unsigned long long)warm_pool_rel_c7_carve_zeroes());
+    fprintf(stderr,
+            "[REL_C7_WARM] pop=%llu push=%llu\n",
+            (unsigned long long)atomic_load_explicit(&g_rel_c7_warm_pop, memory_order_relaxed),
+            (unsigned long long)atomic_load_explicit(&g_rel_c7_warm_push, memory_order_relaxed));
+    fprintf(stderr,
+            "[REL_C7_WARM_PREFILL] calls=%llu slabs=%llu\n",
+            (unsigned long long)warm_pool_rel_c7_prefill_calls(),
+            (unsigned long long)warm_pool_rel_c7_prefill_slabs());
+    fprintf(stderr,
+            "[REL_C7_META_USED_INC] total=%llu backend=%llu tls=%llu front=%llu\n",
+            (unsigned long long)c7_meta_used_total(),
+            (unsigned long long)c7_meta_used_backend(),
+            (unsigned long long)c7_meta_used_tls(),
+            (unsigned long long)c7_meta_used_front());
+#else
+    fprintf(stderr,
+            "[DBG_C7_META_USED_INC] total=%llu backend=%llu tls=%llu front=%llu\n",
+            (unsigned long long)c7_meta_used_total(),
+            (unsigned long long)c7_meta_used_backend(),
+            (unsigned long long)c7_meta_used_tls(),
+            (unsigned long long)c7_meta_used_front());
+#endif
+
     // Phase 21-1: Ring cache - DELETED (A/B test: OFF is faster)
     // extern void ring_cache_print_stats(void);
     // ring_cache_print_stats();
core/box/c7_meta_used_counter_box.h (new file, 59 lines)
@@ -0,0 +1,59 @@
+// c7_meta_used_counter_box.h
+// Box: C7 meta->used increment counters (shared between Release and Debug)
+#pragma once
+
+#include <stdatomic.h>
+#include <stdint.h>
+
+typedef enum C7MetaUsedSource {
+    C7_META_USED_SRC_UNKNOWN = 0,
+    C7_META_USED_SRC_BACKEND = 1,
+    C7_META_USED_SRC_TLS = 2,
+    C7_META_USED_SRC_FRONT = 3,
+} C7MetaUsedSource;
+
+#ifdef C7_META_COUNTER_DEFINE
+#define C7_META_COUNTER_EXTERN
+#else
+#define C7_META_COUNTER_EXTERN extern
+#endif
+
+C7_META_COUNTER_EXTERN _Atomic uint64_t g_c7_meta_used_inc_total;
+C7_META_COUNTER_EXTERN _Atomic uint64_t g_c7_meta_used_inc_backend;
+C7_META_COUNTER_EXTERN _Atomic uint64_t g_c7_meta_used_inc_tls;
+C7_META_COUNTER_EXTERN _Atomic uint64_t g_c7_meta_used_inc_front;
+
+static inline void c7_meta_used_note(int class_idx, C7MetaUsedSource src) {
+    if (__builtin_expect(class_idx != 7, 1)) {
+        return;
+    }
+    atomic_fetch_add_explicit(&g_c7_meta_used_inc_total, 1, memory_order_relaxed);
+    switch (src) {
+    case C7_META_USED_SRC_BACKEND:
+        atomic_fetch_add_explicit(&g_c7_meta_used_inc_backend, 1, memory_order_relaxed);
+        break;
+    case C7_META_USED_SRC_TLS:
+        atomic_fetch_add_explicit(&g_c7_meta_used_inc_tls, 1, memory_order_relaxed);
+        break;
+    case C7_META_USED_SRC_FRONT:
+        atomic_fetch_add_explicit(&g_c7_meta_used_inc_front, 1, memory_order_relaxed);
+        break;
+    default:
+        break;
+    }
+}
+
+static inline uint64_t c7_meta_used_total(void) {
+    return atomic_load_explicit(&g_c7_meta_used_inc_total, memory_order_relaxed);
+}
+static inline uint64_t c7_meta_used_backend(void) {
+    return atomic_load_explicit(&g_c7_meta_used_inc_backend, memory_order_relaxed);
+}
+static inline uint64_t c7_meta_used_tls(void) {
+    return atomic_load_explicit(&g_c7_meta_used_inc_tls, memory_order_relaxed);
+}
+static inline uint64_t c7_meta_used_front(void) {
+    return atomic_load_explicit(&g_c7_meta_used_inc_front, memory_order_relaxed);
+}
+
+#undef C7_META_COUNTER_EXTERN
@@ -15,6 +15,7 @@
 #include "tiny_header_box.h"      // Header Box: Single Source of Truth for header operations
 #include "../tiny_refill_opt.h"   // TinyRefillChain, trc_linear_carve()
 #include "../tiny_box_geometry.h" // tiny_stride_for_class(), tiny_slab_base_for_geometry()
+#include "c7_meta_used_counter_box.h"
 
 // External declarations
 extern __thread TinyTLSSlab g_tls_slabs[TINY_NUM_CLASSES];
@@ -191,6 +192,7 @@ uint32_t box_carve_and_push_with_freelist(int class_idx, uint32_t want) {
         void* p = meta->freelist;
         meta->freelist = tiny_next_read(class_idx, p);
         meta->used++;
+        c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
 
         // CRITICAL FIX: Restore header BEFORE pushing to TLS SLL
         // Freelist blocks may have stale data at offset 0
@@ -41,7 +41,7 @@ core/box/carve_push_box.o: core/box/carve_push_box.c \
 core/box/../tiny_region_id.h core/box/../hakmem_tiny_integrity.h \
 core/box/../box/slab_freelist_atomic.h core/box/tiny_header_box.h \
 core/box/../tiny_refill_opt.h core/box/../box/tls_sll_box.h \
-core/box/../tiny_box_geometry.h
+core/box/../tiny_box_geometry.h core/box/c7_meta_used_counter_box.h
 core/box/../hakmem_tiny.h:
 core/box/../hakmem_build_flags.h:
 core/box/../hakmem_trace.h:
@@ -116,3 +116,4 @@ core/box/tiny_header_box.h:
 core/box/../tiny_refill_opt.h:
 core/box/../box/tls_sll_box.h:
 core/box/../tiny_box_geometry.h:
+core/box/c7_meta_used_counter_box.h:
@@ -9,12 +9,15 @@
 
 #include <stdint.h>
 #include <string.h>
+#include <stdio.h>
+#include <stdatomic.h>
 #include "../hakmem_tiny_config.h"
 #include "../hakmem_tiny_superslab.h"
 #include "../superslab/superslab_inline.h"
 #include "../tiny_box_geometry.h"
 #include "../box/tiny_next_ptr_box.h"
 #include "../box/pagefault_telemetry_box.h"
+#include "c7_meta_used_counter_box.h"
 
 // ============================================================================
 // Slab Carving API (Inline for Hot Path)
@@ -46,11 +49,31 @@ static inline int slab_carve_from_ss(int class_idx, SuperSlab* ss,
 
     // Find an available slab in this SuperSlab
     int cap = ss_slabs_capacity(ss);
+#if HAKMEM_BUILD_RELEASE
+    static _Atomic int rel_c7_meta_logged = 0;
+    TinySlabMeta* rel_c7_meta = NULL;
+    int rel_c7_meta_idx = -1;
+#else
+    static __thread int dbg_c7_meta_logged = 0;
+    TinySlabMeta* dbg_c7_meta = NULL;
+    int dbg_c7_meta_idx = -1;
+#endif
     for (int slab_idx = 0; slab_idx < cap; slab_idx++) {
         TinySlabMeta* meta = &ss->slabs[slab_idx];
 
         // Check if this slab matches our class and has capacity
         if (meta->class_idx != (uint8_t)class_idx) continue;
+#if HAKMEM_BUILD_RELEASE
+        if (class_idx == 7 && atomic_load_explicit(&rel_c7_meta_logged, memory_order_relaxed) == 0 && !rel_c7_meta) {
+            rel_c7_meta = meta;
+            rel_c7_meta_idx = slab_idx;
+        }
+#else
+        if (class_idx == 7 && dbg_c7_meta_logged == 0 && !dbg_c7_meta) {
+            dbg_c7_meta = meta;
+            dbg_c7_meta_idx = slab_idx;
+        }
+#endif
         if (meta->used >= meta->capacity && !meta->freelist) continue;
 
         // Carve blocks from this slab
@@ -73,6 +96,7 @@ static inline int slab_carve_from_ss(int class_idx, SuperSlab* ss,
 
                 meta->freelist = next_node;
                 meta->used++;
+                c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
 
             } else if (meta->carved < meta->capacity) {
                 // Linear carve
@@ -84,6 +108,7 @@ static inline int slab_carve_from_ss(int class_idx, SuperSlab* ss,
 
                 meta->carved++;
                 meta->used++;
+                c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
 
             } else {
                 break; // This slab exhausted
@@ -99,6 +124,48 @@ static inline int slab_carve_from_ss(int class_idx, SuperSlab* ss,
         // If this slab had no freelist and no carved capacity, continue to next
     }
+
+#if !HAKMEM_BUILD_RELEASE
+    static __thread int dbg_c7_slab_carve_zero_logs = 0;
+    if (class_idx == 7 && dbg_c7_slab_carve_zero_logs < 10) {
+        fprintf(stderr, "[C7_SLAB_CARVE_ZERO] ss=%p no blocks carved\n", (void*)ss);
+        dbg_c7_slab_carve_zero_logs++;
+    }
+#endif
+#if HAKMEM_BUILD_RELEASE
+    if (class_idx == 7 &&
+        atomic_load_explicit(&rel_c7_meta_logged, memory_order_relaxed) == 0 &&
+        rel_c7_meta) {
+        size_t bs = tiny_stride_for_class(class_idx);
+        fprintf(stderr,
+                "[REL_C7_CARVE_META] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p stride=%zu slabs_cap=%d\n",
+                (void*)ss,
+                rel_c7_meta_idx,
+                (unsigned)rel_c7_meta->class_idx,
+                (unsigned)rel_c7_meta->used,
+                (unsigned)rel_c7_meta->capacity,
+                (unsigned)rel_c7_meta->carved,
+                rel_c7_meta->freelist,
+                bs,
+                cap);
+        atomic_store_explicit(&rel_c7_meta_logged, 1, memory_order_relaxed);
+    }
+#else
+    if (class_idx == 7 && dbg_c7_meta_logged == 0 && dbg_c7_meta) {
+        size_t bs = tiny_stride_for_class(class_idx);
+        fprintf(stderr,
+                "[DBG_C7_CARVE_META] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p stride=%zu slabs_cap=%d\n",
+                (void*)ss,
+                dbg_c7_meta_idx,
+                (unsigned)dbg_c7_meta->class_idx,
+                (unsigned)dbg_c7_meta->used,
+                (unsigned)dbg_c7_meta->capacity,
+                (unsigned)dbg_c7_meta->carved,
+                dbg_c7_meta->freelist,
+                bs,
+                cap);
+        dbg_c7_meta_logged = 1;
+    }
+#endif
     return 0; // No slab in this SuperSlab had available capacity
 }
core/box/ss_slab_reset_box.h (new file, 26 lines)
@@ -0,0 +1,26 @@
+// ss_slab_reset_box.h
+// Box: Reset TinySlabMeta for reuse (C7 diagnostics-friendly)
+#pragma once
+
+#include "ss_slab_meta_box.h"
+#include "../superslab/superslab_inline.h"
+#include <stdatomic.h>
+
+static inline void ss_slab_reset_meta_for_tiny(SuperSlab* ss,
+                                               int slab_idx,
+                                               int class_idx)
+{
+    if (!ss) return;
+    if (slab_idx < 0 || slab_idx >= ss_slabs_capacity(ss)) return;
+
+    TinySlabMeta* meta = &ss->slabs[slab_idx];
+    meta->used = 0;
+    meta->carved = 0;
+    meta->freelist = NULL;
+    meta->class_idx = (uint8_t)class_idx;
+    ss->class_map[slab_idx] = (uint8_t)class_idx;
+
+    // Reset remote queue state to avoid stale pending frees on reuse.
+    atomic_store_explicit(&ss->remote_heads[slab_idx], 0, memory_order_relaxed);
+    atomic_store_explicit(&ss->remote_counts[slab_idx], 0, memory_order_relaxed);
+}
@@ -13,6 +13,7 @@
 #include "../hakmem_tiny_config.h"
 #include "../box/tiny_page_box.h" // For tiny_page_box_on_new_slab()
 #include <stdio.h>
+#include <stdatomic.h>
 
 // Forward declaration if not included
 // CRITICAL FIX: type must match core/hakmem_tiny_config.h (const size_t, not uint16_t)
@@ -64,9 +65,7 @@ static inline int ss_tls_bind_one(int class_idx,
     // superslab_init_slab() only sets it if meta->class_idx==255.
     // We must explicitly set it to the requested class to avoid C0/C7 confusion.
     TinySlabMeta* meta = &ss->slabs[slab_idx];
-#if !HAKMEM_BUILD_RELEASE
     uint8_t old_cls = meta->class_idx;
-#endif
     meta->class_idx = (uint8_t)class_idx;
 #if !HAKMEM_BUILD_RELEASE
     if (class_idx == 7 && old_cls != class_idx) {
@@ -75,6 +74,36 @@ static inline int ss_tls_bind_one(int class_idx,
     }
 #endif
+
+#if HAKMEM_BUILD_RELEASE
+    static _Atomic int rel_c7_bind_logged = 0;
+    if (class_idx == 7 &&
+        atomic_load_explicit(&rel_c7_bind_logged, memory_order_relaxed) == 0) {
+        fprintf(stderr,
+                "[REL_C7_BIND] ss=%p slab=%d cls=%u cap=%u used=%u carved=%u\n",
+                (void*)ss,
+                slab_idx,
+                (unsigned)meta->class_idx,
+                (unsigned)meta->capacity,
+                (unsigned)meta->used,
+                (unsigned)meta->carved);
+        atomic_store_explicit(&rel_c7_bind_logged, 1, memory_order_relaxed);
+    }
+#else
+    static __thread int dbg_c7_bind_logged = 0;
+    if (class_idx == 7 && dbg_c7_bind_logged == 0) {
+        fprintf(stderr,
+                "[DBG_C7_BIND] ss=%p slab=%d old_cls=%u new_cls=%u cap=%u used=%u carved=%u\n",
+                (void*)ss,
+                slab_idx,
+                (unsigned)old_cls,
+                (unsigned)meta->class_idx,
+                (unsigned)meta->capacity,
+                (unsigned)meta->used,
+                (unsigned)meta->carved);
+        dbg_c7_bind_logged = 1;
+    }
+#endif
+
     // Bind this slab to TLS for fast subsequent allocations.
     // Inline implementation of tiny_tls_bind_slab() to avoid header dependencies.
     // Original logic:
@@ -109,4 +138,4 @@ static inline int ss_tls_bind_one(int class_idx,
     return 1;
 }
 
 #endif // HAK_SS_TLS_BIND_BOX_H
@@ -4,6 +4,7 @@
 
 #include <stdlib.h>
 #include <string.h>
+#include <stdio.h>
 
 // Default: conservative profile (all classes TINY_FIRST).
 // This keeps Tiny in the fast path but always allows Pool fallback.
@@ -40,5 +41,16 @@ void tiny_route_init(void)
         // - all classes TINY_FIRST (use Tiny, but always with Pool fallback)
         memset(g_tiny_route, ROUTE_TINY_FIRST, sizeof(g_tiny_route));
     }
-}
+
+#if HAKMEM_BUILD_RELEASE
+    static int rel_logged = 0;
+    if (!rel_logged) {
+        const char* mode =
+            (g_tiny_route[7] == ROUTE_TINY_ONLY) ? "TINY_ONLY" :
+            (g_tiny_route[7] == ROUTE_TINY_FIRST) ? "TINY_FIRST" :
+            (g_tiny_route[7] == ROUTE_POOL_ONLY) ? "POOL_ONLY" : "UNKNOWN";
+        fprintf(stderr, "[REL_C7_ROUTE] profile=%s route=%s\n", profile, mode);
+        rel_logged = 1;
+    }
+#endif
+}
@@ -19,6 +19,7 @@
 #define TINY_ROUTE_BOX_H
 
 #include <stdint.h>
+#include <stdio.h>
 
 // Routing policy per Tiny class.
 typedef enum {
@@ -43,8 +44,21 @@ void tiny_route_init(void);
 // Uses simple array lookup; class_idx is masked to [0,7] defensively.
 static inline TinyRoutePolicy tiny_route_get(int class_idx)
 {
-    return (TinyRoutePolicy)g_tiny_route[class_idx & 7];
+    TinyRoutePolicy p = (TinyRoutePolicy)g_tiny_route[class_idx & 7];
+#if HAKMEM_BUILD_RELEASE
+    if ((class_idx & 7) == 7) {
+        static int rel_route_logged = 0;
+        if (!rel_route_logged) {
+            const char* mode =
+                (p == ROUTE_TINY_ONLY) ? "TINY_ONLY" :
+                (p == ROUTE_TINY_FIRST) ? "TINY_FIRST" :
+                (p == ROUTE_POOL_ONLY) ? "POOL_ONLY" : "UNKNOWN";
+            fprintf(stderr, "[REL_C7_ROUTE] via tiny_route_get route=%s\n", mode);
+            rel_route_logged = 1;
+        }
+    }
+#endif
+    return p;
 }
 
 #endif // TINY_ROUTE_BOX_H
core/box/tiny_tls_carve_one_block_box.h (new file, 102 lines)
@@ -0,0 +1,102 @@
+// tiny_tls_carve_one_block_box.h
+// Box: Shared TLS carve helper (linear or freelist) for Tiny classes.
+#pragma once
+
+#include "../tiny_tls.h"
+#include "../tiny_box_geometry.h"
+#include "../tiny_debug_api.h" // tiny_refill_failfast_level(), tiny_failfast_abort_ptr()
+#include "c7_meta_used_counter_box.h" // C7 meta->used telemetry (shared Release/Debug)
+#include "tiny_next_ptr_box.h"
+#include "../superslab/superslab_inline.h"
+#include <stdatomic.h>
+#include <signal.h>
+
+#if !HAKMEM_BUILD_RELEASE
+extern int g_tiny_safe_free;
+extern int g_tiny_safe_free_strict;
+#endif
+
+enum {
+    TINY_TLS_CARVE_PATH_NONE = 0,
+    TINY_TLS_CARVE_PATH_LINEAR = 1,
+    TINY_TLS_CARVE_PATH_FREELIST = 2,
+};
+
+typedef struct TinyTLSCarveOneResult {
+    void* block;
+    int path;
+} TinyTLSCarveOneResult;
+
+// Carve one block from the current TLS slab.
+// Returns .block == NULL on failure. path describes which sub-path was taken.
+static inline TinyTLSCarveOneResult
+tiny_tls_carve_one_block(TinyTLSSlab* tls, int class_idx)
+{
+    TinyTLSCarveOneResult res = {.block = NULL, .path = TINY_TLS_CARVE_PATH_NONE};
+
+    if (!tls) return res;
+
+    TinySlabMeta* meta = tls->meta;
+    if (!meta || !tls->ss || tls->slab_base == NULL) return res;
+    if (meta->class_idx != (uint8_t)class_idx) return res;
+    if (tls->slab_idx < 0 || tls->slab_idx >= ss_slabs_capacity(tls->ss)) return res;
+
+    // Freelist pop
+    if (meta->freelist) {
+#if !HAKMEM_BUILD_RELEASE
+        if (__builtin_expect(g_tiny_safe_free, 0)) {
+            size_t blk = tiny_stride_for_class(meta->class_idx);
+            uint8_t* base = tiny_slab_base_for_geometry(tls->ss, tls->slab_idx);
+            uintptr_t delta = (uintptr_t)meta->freelist - (uintptr_t)base;
+            int align_ok = ((delta % blk) == 0);
+            int range_ok = (delta / blk) < meta->capacity;
+            if (!align_ok || !range_ok) {
+                if (g_tiny_safe_free_strict) { raise(SIGUSR2); return res; }
+                return res;
+            }
+        }
+#endif
+        void* block = meta->freelist;
+        meta->freelist = tiny_next_read(class_idx, block);
+        meta->used++;
+        c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_TLS);
+        ss_active_add(tls->ss, 1);
+        res.block = block;
+        res.path = TINY_TLS_CARVE_PATH_FREELIST;
+        return res;
+    }
+
+    // Linear carve
+    if (meta->used < meta->capacity) {
+        size_t block_size = tiny_stride_for_class(meta->class_idx);
+        void* block = tiny_block_at_index(tls->slab_base, meta->used, block_size);
+
+#if !HAKMEM_BUILD_RELEASE
+        if (__builtin_expect(tiny_refill_failfast_level() >= 2, 0)) {
+            uintptr_t base_ss = (uintptr_t)tls->ss;
+            size_t ss_size = (size_t)1ULL << tls->ss->lg_size;
+            uintptr_t p = (uintptr_t)block;
+            int in_range = (p >= base_ss) && (p < base_ss + ss_size);
+            int aligned = ((p - (uintptr_t)tls->slab_base) % block_size) == 0;
+            int idx_ok = (tls->slab_idx >= 0) &&
+                         (tls->slab_idx < ss_slabs_capacity(tls->ss));
+            if (!in_range || !aligned || !idx_ok || meta->used + 1 > meta->capacity) {
+                tiny_failfast_abort_ptr("tls_carve_align",
+                                        tls->ss,
+                                        tls->slab_idx,
+                                        block,
+                                        "tiny_tls_carve_one_block");
+            }
+        }
+#endif
+
+        meta->used++;
+        c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_TLS);
+        ss_active_add(tls->ss, 1);
+        res.block = block;
+        res.path = TINY_TLS_CARVE_PATH_LINEAR;
+        return res;
+    }
+
+    return res;
+}
core/box/warm_pool_dbg_box.h (new file, +121)
@ -0,0 +1,121 @@
// warm_pool_dbg_box.h
// Box: Debug-only counters for C7 Warm Pool instrumentation.
#pragma once

#include <stdatomic.h>
#include <stdint.h>

#if !HAKMEM_BUILD_RELEASE
#ifdef WARM_POOL_DBG_DEFINE
_Atomic uint64_t g_dbg_c7_warm_pop_attempts = 0;
_Atomic uint64_t g_dbg_c7_warm_pop_hits = 0;
_Atomic uint64_t g_dbg_c7_warm_pop_carve = 0;
_Atomic uint64_t g_dbg_c7_tls_carve_attempts = 0;
_Atomic uint64_t g_dbg_c7_tls_carve_success = 0;
_Atomic uint64_t g_dbg_c7_tls_carve_fail = 0;
_Atomic uint64_t g_dbg_c7_uc_miss_warm_refill = 0;
_Atomic uint64_t g_dbg_c7_uc_miss_tls_refill = 0;
_Atomic uint64_t g_dbg_c7_uc_miss_shared_refill = 0;
#else
extern _Atomic uint64_t g_dbg_c7_warm_pop_attempts;
extern _Atomic uint64_t g_dbg_c7_warm_pop_hits;
extern _Atomic uint64_t g_dbg_c7_warm_pop_carve;
extern _Atomic uint64_t g_dbg_c7_tls_carve_attempts;
extern _Atomic uint64_t g_dbg_c7_tls_carve_success;
extern _Atomic uint64_t g_dbg_c7_tls_carve_fail;
extern _Atomic uint64_t g_dbg_c7_uc_miss_warm_refill;
extern _Atomic uint64_t g_dbg_c7_uc_miss_tls_refill;
extern _Atomic uint64_t g_dbg_c7_uc_miss_shared_refill;
#endif

static inline void warm_pool_dbg_c7_attempt(void) {
    atomic_fetch_add_explicit(&g_dbg_c7_warm_pop_attempts, 1, memory_order_relaxed);
}

static inline void warm_pool_dbg_c7_hit(void) {
    atomic_fetch_add_explicit(&g_dbg_c7_warm_pop_hits, 1, memory_order_relaxed);
}

static inline void warm_pool_dbg_c7_carve(void) {
    atomic_fetch_add_explicit(&g_dbg_c7_warm_pop_carve, 1, memory_order_relaxed);
}

static inline void warm_pool_dbg_c7_tls_attempt(void) {
    atomic_fetch_add_explicit(&g_dbg_c7_tls_carve_attempts, 1, memory_order_relaxed);
}

static inline void warm_pool_dbg_c7_tls_success(void) {
    atomic_fetch_add_explicit(&g_dbg_c7_tls_carve_success, 1, memory_order_relaxed);
}

static inline void warm_pool_dbg_c7_tls_fail(void) {
    atomic_fetch_add_explicit(&g_dbg_c7_tls_carve_fail, 1, memory_order_relaxed);
}

static inline void warm_pool_dbg_c7_uc_miss_warm(void) {
    atomic_fetch_add_explicit(&g_dbg_c7_uc_miss_warm_refill, 1, memory_order_relaxed);
}

static inline void warm_pool_dbg_c7_uc_miss_tls(void) {
    atomic_fetch_add_explicit(&g_dbg_c7_uc_miss_tls_refill, 1, memory_order_relaxed);
}

static inline void warm_pool_dbg_c7_uc_miss_shared(void) {
    atomic_fetch_add_explicit(&g_dbg_c7_uc_miss_shared_refill, 1, memory_order_relaxed);
}

static inline uint64_t warm_pool_dbg_c7_attempts(void) {
    return atomic_load_explicit(&g_dbg_c7_warm_pop_attempts, memory_order_relaxed);
}

static inline uint64_t warm_pool_dbg_c7_hits(void) {
    return atomic_load_explicit(&g_dbg_c7_warm_pop_hits, memory_order_relaxed);
}

static inline uint64_t warm_pool_dbg_c7_carves(void) {
    return atomic_load_explicit(&g_dbg_c7_warm_pop_carve, memory_order_relaxed);
}

static inline uint64_t warm_pool_dbg_c7_tls_attempts(void) {
    return atomic_load_explicit(&g_dbg_c7_tls_carve_attempts, memory_order_relaxed);
}

static inline uint64_t warm_pool_dbg_c7_tls_successes(void) {
    return atomic_load_explicit(&g_dbg_c7_tls_carve_success, memory_order_relaxed);
}

static inline uint64_t warm_pool_dbg_c7_tls_failures(void) {
    return atomic_load_explicit(&g_dbg_c7_tls_carve_fail, memory_order_relaxed);
}

static inline uint64_t warm_pool_dbg_c7_uc_miss_warm_refills(void) {
    return atomic_load_explicit(&g_dbg_c7_uc_miss_warm_refill, memory_order_relaxed);
}

static inline uint64_t warm_pool_dbg_c7_uc_miss_tls_refills(void) {
    return atomic_load_explicit(&g_dbg_c7_uc_miss_tls_refill, memory_order_relaxed);
}

static inline uint64_t warm_pool_dbg_c7_uc_miss_shared_refills(void) {
    return atomic_load_explicit(&g_dbg_c7_uc_miss_shared_refill, memory_order_relaxed);
}
#else
static inline void warm_pool_dbg_c7_attempt(void) { }
static inline void warm_pool_dbg_c7_hit(void) { }
static inline void warm_pool_dbg_c7_carve(void) { }
static inline void warm_pool_dbg_c7_tls_attempt(void) { }
static inline void warm_pool_dbg_c7_tls_success(void) { }
static inline void warm_pool_dbg_c7_tls_fail(void) { }
static inline void warm_pool_dbg_c7_uc_miss_warm(void) { }
static inline void warm_pool_dbg_c7_uc_miss_tls(void) { }
static inline void warm_pool_dbg_c7_uc_miss_shared(void) { }
static inline uint64_t warm_pool_dbg_c7_attempts(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_hits(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_carves(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_tls_attempts(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_tls_successes(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_tls_failures(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_uc_miss_warm_refills(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_uc_miss_tls_refills(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_uc_miss_shared_refills(void) { return 0; }
#endif
@ -7,11 +7,51 @@
 #define HAK_WARM_POOL_PREFILL_BOX_H
 
 #include <stdint.h>
+#include <stdatomic.h>
+#include <stdio.h>
 #include "../hakmem_tiny_config.h"
 #include "../hakmem_tiny_superslab.h"
 #include "../tiny_tls.h"
 #include "../front/tiny_warm_pool.h"
 #include "../box/warm_pool_stats_box.h"
+#include "../box/warm_pool_rel_counters_box.h"
+
+static inline void warm_prefill_log_c7_meta(const char* tag, TinyTLSSlab* tls) {
+    if (!tls || !tls->ss) return;
+#if HAKMEM_BUILD_RELEASE
+    static _Atomic uint32_t rel_logs = 0;
+    uint32_t n = atomic_fetch_add_explicit(&rel_logs, 1, memory_order_relaxed);
+    if (n < 4) {
+        TinySlabMeta* meta = &tls->ss->slabs[tls->slab_idx];
+        fprintf(stderr,
+                "[REL_C7_%s] ss=%p slab=%u cls=%u used=%u cap=%u carved=%u freelist=%p\n",
+                tag,
+                (void*)tls->ss,
+                (unsigned)tls->slab_idx,
+                (unsigned)meta->class_idx,
+                (unsigned)meta->used,
+                (unsigned)meta->capacity,
+                (unsigned)meta->carved,
+                meta->freelist);
+    }
+#else
+    static _Atomic uint32_t dbg_logs = 0;
+    uint32_t n = atomic_fetch_add_explicit(&dbg_logs, 1, memory_order_relaxed);
+    if (n < 4) {
+        TinySlabMeta* meta = &tls->ss->slabs[tls->slab_idx];
+        fprintf(stderr,
+                "[DBG_C7_%s] ss=%p slab=%u cls=%u used=%u cap=%u carved=%u freelist=%p\n",
+                tag,
+                (void*)tls->ss,
+                (unsigned)tls->slab_idx,
+                (unsigned)meta->class_idx,
+                (unsigned)meta->used,
+                (unsigned)meta->capacity,
+                (unsigned)meta->carved,
+                meta->freelist);
+    }
+#endif
+}
+
 // Forward declarations
 extern __thread TinyTLSSlab g_tls_slabs[TINY_NUM_CLASSES];
@ -45,9 +85,17 @@ extern SuperSlab* superslab_refill(int class_idx);
 // Performance: Only triggered when pool is empty, cold path cost
 //
 static inline int warm_pool_do_prefill(int class_idx, TinyTLSSlab* tls) {
+#if HAKMEM_BUILD_RELEASE
+    if (class_idx == 7) {
+        warm_pool_rel_c7_prefill_call();
+    }
+#endif
     int budget = (tiny_warm_pool_count(class_idx) == 0) ? WARM_POOL_PREFILL_BUDGET : 1;
+
     while (budget > 0) {
+        if (class_idx == 7) {
+            warm_prefill_log_c7_meta("PREFILL_META", tls);
+        }
         if (!tls->ss) {
             // Need to load a new SuperSlab
             if (!superslab_refill(class_idx)) {
@ -61,16 +109,75 @@ static inline int warm_pool_do_prefill(int class_idx, TinyTLSSlab* tls) {
             break;
         }
+
+        // C7 safety: prefer only pristine slabs (used=0 carved=0 freelist=NULL)
+        if (class_idx == 7) {
+            TinySlabMeta* meta = &tls->ss->slabs[tls->slab_idx];
+            if (meta->class_idx == 7 &&
+                (meta->used > 0 || meta->carved > 0 || meta->freelist != NULL)) {
+#if HAKMEM_BUILD_RELEASE
+                static _Atomic int rel_c7_skip_logged = 0;
+                if (atomic_load_explicit(&rel_c7_skip_logged, memory_order_relaxed) == 0) {
+                    fprintf(stderr,
+                            "[REL_C7_PREFILL_SKIP_NONEMPTY] ss=%p slab=%u used=%u cap=%u carved=%u freelist=%p\n",
+                            (void*)tls->ss,
+                            (unsigned)tls->slab_idx,
+                            (unsigned)meta->used,
+                            (unsigned)meta->capacity,
+                            (unsigned)meta->carved,
+                            meta->freelist);
+                    atomic_store_explicit(&rel_c7_skip_logged, 1, memory_order_relaxed);
+                }
+#else
+                static __thread int dbg_c7_skip_logged = 0;
+                if (dbg_c7_skip_logged < 4) {
+                    fprintf(stderr,
+                            "[DBG_C7_PREFILL_SKIP_NONEMPTY] ss=%p slab=%u used=%u cap=%u carved=%u freelist=%p\n",
+                            (void*)tls->ss,
+                            (unsigned)tls->slab_idx,
+                            (unsigned)meta->used,
+                            (unsigned)meta->capacity,
+                            (unsigned)meta->carved,
+                            meta->freelist);
+                    dbg_c7_skip_logged++;
+                }
+#endif
+                tls->ss = NULL; // Drop exhausted slab and try another
+                budget--;
+                continue;
+            }
+        }
+
         if (budget > 1) {
             // Prefill mode: push to pool and load another
             tiny_warm_pool_push(class_idx, tls->ss);
             warm_pool_record_prefilled(class_idx);
-            tls->ss = NULL; // Force next iteration to refill
-            budget--;
-        } else {
-            // Final slab: keep in TLS for immediate carving
-            budget = 0;
-        }
+#if HAKMEM_BUILD_RELEASE
+            if (class_idx == 7) {
+                warm_pool_rel_c7_prefill_slab();
+            }
+#else
+            if (class_idx == 7) {
+                static __thread int dbg_c7_prefill_logs = 0;
+                if (dbg_c7_prefill_logs < 8) {
+                    TinySlabMeta* meta = &tls->ss->slabs[tls->slab_idx];
+                    fprintf(stderr,
+                            "[DBG_C7_PREFILL] ss=%p slab=%u used=%u cap=%u carved=%u freelist=%p\n",
+                            (void*)tls->ss,
+                            (unsigned)tls->slab_idx,
+                            (unsigned)meta->used,
+                            (unsigned)meta->capacity,
+                            (unsigned)meta->carved,
+                            meta->freelist);
+                    dbg_c7_prefill_logs++;
+                }
+            }
+#endif
+            tls->ss = NULL; // Force next iteration to refill
+            budget--;
+        } else {
+            // Final slab: keep in TLS for immediate carving
+            budget = 0;
+        }
     }
 
     return 0; // Success
core/box/warm_pool_rel_counters_box.h (new file, +64)
@ -0,0 +1,64 @@
// warm_pool_rel_counters_box.h
// Box: Lightweight Release-side counters for C7 Warm/TLS instrumentation.
#pragma once

#include <stdatomic.h>
#include <stdint.h>

#if HAKMEM_BUILD_RELEASE
#ifdef WARM_POOL_REL_DEFINE
_Atomic uint64_t g_rel_c7_carve_attempts = 0;
_Atomic uint64_t g_rel_c7_carve_success = 0;
_Atomic uint64_t g_rel_c7_carve_zero = 0;
_Atomic uint64_t g_rel_c7_warm_prefill_calls = 0;
_Atomic uint64_t g_rel_c7_warm_prefill_slabs = 0;
#else
extern _Atomic uint64_t g_rel_c7_carve_attempts;
extern _Atomic uint64_t g_rel_c7_carve_success;
extern _Atomic uint64_t g_rel_c7_carve_zero;
extern _Atomic uint64_t g_rel_c7_warm_prefill_calls;
extern _Atomic uint64_t g_rel_c7_warm_prefill_slabs;
#endif

static inline void warm_pool_rel_c7_carve_attempt(void) {
    atomic_fetch_add_explicit(&g_rel_c7_carve_attempts, 1, memory_order_relaxed);
}
static inline void warm_pool_rel_c7_carve_success(void) {
    atomic_fetch_add_explicit(&g_rel_c7_carve_success, 1, memory_order_relaxed);
}
static inline void warm_pool_rel_c7_carve_zero(void) {
    atomic_fetch_add_explicit(&g_rel_c7_carve_zero, 1, memory_order_relaxed);
}
static inline void warm_pool_rel_c7_prefill_call(void) {
    atomic_fetch_add_explicit(&g_rel_c7_warm_prefill_calls, 1, memory_order_relaxed);
}
static inline void warm_pool_rel_c7_prefill_slab(void) {
    atomic_fetch_add_explicit(&g_rel_c7_warm_prefill_slabs, 1, memory_order_relaxed);
}
static inline uint64_t warm_pool_rel_c7_carve_attempts(void) {
    return atomic_load_explicit(&g_rel_c7_carve_attempts, memory_order_relaxed);
}
static inline uint64_t warm_pool_rel_c7_carve_successes(void) {
    return atomic_load_explicit(&g_rel_c7_carve_success, memory_order_relaxed);
}
static inline uint64_t warm_pool_rel_c7_carve_zeroes(void) {
    return atomic_load_explicit(&g_rel_c7_carve_zero, memory_order_relaxed);
}
static inline uint64_t warm_pool_rel_c7_prefill_calls(void) {
    return atomic_load_explicit(&g_rel_c7_warm_prefill_calls, memory_order_relaxed);
}
static inline uint64_t warm_pool_rel_c7_prefill_slabs(void) {
    return atomic_load_explicit(&g_rel_c7_warm_prefill_slabs, memory_order_relaxed);
}
#else
static inline void warm_pool_rel_c7_carve_attempt(void) { }
static inline void warm_pool_rel_c7_carve_success(void) { }
static inline void warm_pool_rel_c7_carve_zero(void) { }
static inline void warm_pool_rel_c7_prefill_call(void) { }
static inline void warm_pool_rel_c7_prefill_slab(void) { }
static inline uint64_t warm_pool_rel_c7_carve_attempts(void) { return 0; }
static inline uint64_t warm_pool_rel_c7_carve_successes(void) { return 0; }
static inline uint64_t warm_pool_rel_c7_carve_zeroes(void) { return 0; }
static inline uint64_t warm_pool_rel_c7_prefill_calls(void) { return 0; }
static inline uint64_t warm_pool_rel_c7_prefill_slabs(void) { return 0; }
#endif
core/box/warm_tls_bind_logger_box.h (new file, +57)
@ -0,0 +1,57 @@
// warm_tls_bind_logger_box.h
// Box: Warm TLS Bind experiment logging with simple throttling.
#pragma once

#include "../hakmem_tiny_superslab.h"
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#if !HAKMEM_BUILD_RELEASE
static _Atomic int g_warm_tls_bind_log_limit = -1;
static _Atomic int g_warm_tls_bind_log_count = 0;

static inline int warm_tls_bind_log_limit(void) {
    int limit = atomic_load_explicit(&g_warm_tls_bind_log_limit, memory_order_relaxed);
    if (__builtin_expect(limit == -1, 0)) {
        const char* e = getenv("HAKMEM_WARM_TLS_BIND_LOG_MAX");
        int parsed = (e && *e) ? atoi(e) : 1;
        atomic_store_explicit(&g_warm_tls_bind_log_limit, parsed, memory_order_relaxed);
        limit = parsed;
    }
    return limit;
}

static inline int warm_tls_bind_log_acquire(void) {
    int limit = warm_tls_bind_log_limit();
    int prev = atomic_fetch_add_explicit(&g_warm_tls_bind_log_count, 1, memory_order_relaxed);
    return prev < limit;
}

static inline void warm_tls_bind_log_success(SuperSlab* ss, int slab_idx) {
    if (warm_tls_bind_log_acquire()) {
        fprintf(stderr, "[WARM_TLS_BIND] C7 bind success: ss=%p slab=%d\n",
                (void*)ss, slab_idx);
    }
}

static inline void warm_tls_bind_log_tls_carve(SuperSlab* ss, int slab_idx, void* block) {
    if (warm_tls_bind_log_acquire()) {
        fprintf(stderr,
                "[WARM_TLS_BIND] C7 TLS carve success: ss=%p slab=%d block=%p\n",
                (void*)ss, slab_idx, block);
    }
}

static inline void warm_tls_bind_log_tls_fail(SuperSlab* ss, int slab_idx) {
    if (warm_tls_bind_log_acquire()) {
        fprintf(stderr,
                "[WARM_TLS_BIND] C7 TLS carve failed, fallback (ss=%p slab=%d)\n",
                (void*)ss, slab_idx);
    }
}
#else
static inline void warm_tls_bind_log_success(SuperSlab* ss, int slab_idx) { (void)ss; (void)slab_idx; }
static inline void warm_tls_bind_log_tls_carve(SuperSlab* ss, int slab_idx, void* block) { (void)ss; (void)slab_idx; (void)block; }
static inline void warm_tls_bind_log_tls_fail(SuperSlab* ss, int slab_idx) { (void)ss; (void)slab_idx; }
#endif
@ -12,10 +12,19 @@
 #include "../box/ss_slab_meta_box.h" // For ss_active_add() and slab metadata operations
 #include "../box/warm_pool_stats_box.h" // Box: Warm Pool Statistics Recording (inline)
 #include "../box/slab_carve_box.h" // Box: Slab Carving (inline O(slabs) scan)
+#define WARM_POOL_REL_DEFINE
+#include "../box/warm_pool_rel_counters_box.h" // Box: Release-side C7 counters
+#undef WARM_POOL_REL_DEFINE
+#include "../box/c7_meta_used_counter_box.h" // Box: C7 meta->used increment counters
 #include "../box/warm_pool_prefill_box.h" // Box: Warm Pool Prefill (secondary optimization)
 #include "../hakmem_env_cache.h" // Priority-2: ENV cache (eliminate syscalls)
 #include "../box/tiny_page_box.h" // Tiny-Plus Page Box (C5–C7 initial hook)
 #include "../box/ss_tls_bind_box.h" // Box: TLS Bind (SuperSlab -> TLS binding)
+#include "../box/tiny_tls_carve_one_block_box.h" // Box: TLS carve helper (shared)
+#include "../box/warm_tls_bind_logger_box.h" // Box: Warm TLS Bind logging (throttled)
+#define WARM_POOL_DBG_DEFINE
+#include "../box/warm_pool_dbg_box.h" // Box: Warm Pool C7 debug counters
+#undef WARM_POOL_DBG_DEFINE
 #include <stdlib.h>
 #include <string.h>
 #include <stdatomic.h>
@ -84,6 +93,12 @@ __thread uint64_t g_unified_cache_push[TINY_NUM_CLASSES] = {0};
 __thread uint64_t g_unified_cache_full[TINY_NUM_CLASSES] = {0};
 #endif
 
+// Release-side lightweight telemetry (C7 Warm path only)
+#if HAKMEM_BUILD_RELEASE
+_Atomic uint64_t g_rel_c7_warm_pop = 0;
+_Atomic uint64_t g_rel_c7_warm_push = 0;
+#endif
+
 // Warm Pool metrics (definition - declared in tiny_warm_pool.h as extern)
 // Note: These are kept outside !HAKMEM_BUILD_RELEASE for profiling in release builds
 __thread TinyWarmPoolStats g_warm_pool_stats[TINY_NUM_CLASSES] = {0};
@ -98,46 +113,36 @@ _Atomic uint64_t g_dbg_warm_pop_attempts = 0;
 _Atomic uint64_t g_dbg_warm_pop_hits = 0;
 _Atomic uint64_t g_dbg_warm_pop_empty = 0;
 _Atomic uint64_t g_dbg_warm_pop_carve_zero = 0;
+#endif
 
-// Debug-only: cached ENV for Warm TLS Bind (C7)
-static int g_warm_tls_bind_mode_c7 = -1;
+// Warm TLS Bind (C7) mode selector
+//   mode 0: Legacy warm path (debug only; deprecated for C7)
+//   mode 1: Bind-only, the production path (C7 default)
+//   mode 2: Bind + TLS carve, experimental (debug builds only)
+// Release builds default to mode 1 (Bind-only).
 static inline int warm_tls_bind_mode_c7(void) {
+#if HAKMEM_BUILD_RELEASE
+    static int g_warm_tls_bind_mode_c7 = -1;
     if (__builtin_expect(g_warm_tls_bind_mode_c7 == -1, 0)) {
         const char* e = getenv("HAKMEM_WARM_TLS_BIND_C7");
-        // 0/empty: disabled, 1: bind only, 2: bind + TLS carve one block
-        g_warm_tls_bind_mode_c7 = (e && *e) ? atoi(e) : 0;
+        int mode = (e && *e) ? atoi(e) : 1; // default = Bind-only
+        if (mode < 0) mode = 0;
+        if (mode > 2) mode = 2;
+        g_warm_tls_bind_mode_c7 = mode;
     }
     return g_warm_tls_bind_mode_c7;
-}
-
-static inline void* warm_tls_carve_one_block(int class_idx) {
-    TinyTLSSlab* tls = &g_tls_slabs[class_idx];
-    TinySlabMeta* meta = tls->meta;
-    if (!meta || !tls->ss || tls->slab_base == NULL) return NULL;
-    if (meta->class_idx != (uint8_t)class_idx) return NULL;
-    if (tls->slab_idx < 0 || tls->slab_idx >= ss_slabs_capacity(tls->ss)) return NULL;
-    if (meta->freelist) {
-        void* block = meta->freelist;
-        meta->freelist = tiny_next_read(class_idx, block);
-        meta->used++;
-        ss_active_add(tls->ss, 1);
-        return block;
-    }
-    if (meta->used < meta->capacity) {
-        size_t block_size = tiny_stride_for_class(meta->class_idx);
-        void* block = tiny_block_at_index(tls->slab_base, meta->used, block_size);
-        meta->used++;
-        ss_active_add(tls->ss, 1);
-        return block;
-    }
-    return NULL;
-}
+#else
+    static int g_warm_tls_bind_mode_c7 = -1;
+    if (__builtin_expect(g_warm_tls_bind_mode_c7 == -1, 0)) {
+        const char* e = getenv("HAKMEM_WARM_TLS_BIND_C7");
+        int mode = (e && *e) ? atoi(e) : 1; // default = Bind-only
+        if (mode < 0) mode = 0;
+        if (mode > 2) mode = 2;
+        g_warm_tls_bind_mode_c7 = mode;
+    }
+    return g_warm_tls_bind_mode_c7;
 #endif
+}
 
 // Forward declaration for Warm Pool stats printer (defined later in this file)
 static inline void tiny_warm_pool_print_stats(void);
@ -157,6 +162,15 @@ int unified_cache_enabled(void) {
         fprintf(stderr, "[Unified-INIT] unified_cache_enabled() = %d\n", g_enable);
         fflush(stderr);
     }
+#else
+    if (g_enable) {
+        static int printed = 0;
+        if (!printed) {
+            fprintf(stderr, "[Rel-Unified] unified_cache_enabled() = %d\n", g_enable);
+            fflush(stderr);
+            printed = 1;
+        }
+    }
 #endif
     }
     return g_enable;
@ -311,6 +325,32 @@ static inline void tiny_warm_pool_print_stats(void) {
             (unsigned long long)atomic_load_explicit(&g_dbg_warm_pop_hits, memory_order_relaxed),
             (unsigned long long)atomic_load_explicit(&g_dbg_warm_pop_empty, memory_order_relaxed),
             (unsigned long long)atomic_load_explicit(&g_dbg_warm_pop_carve_zero, memory_order_relaxed));
+    uint64_t c7_attempts = warm_pool_dbg_c7_attempts();
+    uint64_t c7_hits = warm_pool_dbg_c7_hits();
+    uint64_t c7_carve = warm_pool_dbg_c7_carves();
+    uint64_t c7_tls_attempts = warm_pool_dbg_c7_tls_attempts();
+    uint64_t c7_tls_success = warm_pool_dbg_c7_tls_successes();
+    uint64_t c7_tls_fail = warm_pool_dbg_c7_tls_failures();
+    uint64_t c7_uc_warm = warm_pool_dbg_c7_uc_miss_warm_refills();
+    uint64_t c7_uc_tls = warm_pool_dbg_c7_uc_miss_tls_refills();
+    uint64_t c7_uc_shared = warm_pool_dbg_c7_uc_miss_shared_refills();
+    if (c7_attempts || c7_hits || c7_carve ||
+        c7_tls_attempts || c7_tls_success || c7_tls_fail ||
+        c7_uc_warm || c7_uc_tls || c7_uc_shared) {
+        fprintf(stderr,
+                "  [DBG_C7] warm_pop_attempts=%llu warm_pop_hits=%llu warm_pop_carve=%llu "
+                "tls_carve_attempts=%llu tls_carve_success=%llu tls_carve_fail=%llu "
+                "uc_miss_warm=%llu uc_miss_tls=%llu uc_miss_shared=%llu\n",
+                (unsigned long long)c7_attempts,
+                (unsigned long long)c7_hits,
+                (unsigned long long)c7_carve,
+                (unsigned long long)c7_tls_attempts,
+                (unsigned long long)c7_tls_success,
+                (unsigned long long)c7_tls_fail,
+                (unsigned long long)c7_uc_warm,
+                (unsigned long long)c7_uc_tls,
+                (unsigned long long)c7_uc_shared);
+    }
 #endif
     fflush(stderr);
 }
@ -515,6 +555,7 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
     // - This guarantees room <= max_batch <= 512 always holds, preventing out[] overrun.
     void* out[512];
     int produced = 0;
+    int tls_carved = 0; // Debug bookkeeping: track TLS carve experiment hits
 
     // ========== PAGE BOX HOT PATH (Tiny-Plus layer): Try page box FIRST ==========
     // C7-specific page-level freelist management will eventually be consolidated here.
@ -554,10 +595,21 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
     // This is the critical optimization - avoid superslab_refill() registry scan
 #if !HAKMEM_BUILD_RELEASE
     atomic_fetch_add_explicit(&g_dbg_warm_pop_attempts, 1, memory_order_relaxed);
+    if (class_idx == 7) {
+        warm_pool_dbg_c7_attempt();
+    }
+#endif
+#if HAKMEM_BUILD_RELEASE
+    if (class_idx == 7) {
+        atomic_fetch_add_explicit(&g_rel_c7_warm_pop, 1, memory_order_relaxed);
+    }
 #endif
     SuperSlab* warm_ss = tiny_warm_pool_pop(class_idx);
     if (warm_ss) {
 #if !HAKMEM_BUILD_RELEASE
+        if (class_idx == 7) {
+            warm_pool_dbg_c7_hit();
+        }
         // Debug-only: Warm TLS Bind experiment (C7 only)
         if (class_idx == 7) {
             int warm_mode = warm_tls_bind_mode_c7();
@ -577,25 +629,22 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
             TinyTLSSlab* tls = &g_tls_slabs[class_idx];
             uint32_t tid = (uint32_t)(uintptr_t)pthread_self();
             if (ss_tls_bind_one(class_idx, tls, warm_ss, slab_idx, tid)) {
-                static int logged = 0;
-                if (!logged) {
-                    fprintf(stderr, "[WARM_TLS_BIND] C7 bind success: ss=%p slab=%d\n",
-                            (void*)warm_ss, slab_idx);
-                    logged = 1;
-                }
+                warm_tls_bind_log_success(warm_ss, slab_idx);
+
                 // Mode 2: carve a single block via TLS fast path
                 if (warm_mode == 2) {
-                    void* tls_block = warm_tls_carve_one_block(class_idx);
-                    if (tls_block) {
-                        fprintf(stderr,
-                                "[WARM_TLS_BIND] C7 TLS carve success: ss=%p slab=%d block=%p\n",
-                                (void*)warm_ss, slab_idx, tls_block);
-                        out[0] = tls_block;
+                    warm_pool_dbg_c7_tls_attempt();
+                    TinyTLSCarveOneResult tls_carve =
+                        tiny_tls_carve_one_block(tls, class_idx);
+                    if (tls_carve.block) {
+                        warm_tls_bind_log_tls_carve(warm_ss, slab_idx, tls_carve.block);
+                        warm_pool_dbg_c7_tls_success();
+                        out[0] = tls_carve.block;
                         produced = 1;
+                        tls_carved = 1;
                     } else {
-                        fprintf(stderr,
-                                "[WARM_TLS_BIND] C7 TLS carve failed, fallback\n");
+                        warm_tls_bind_log_tls_fail(warm_ss, slab_idx);
+                        warm_pool_dbg_c7_tls_fail();
                     }
                 }
             }
@ -607,7 +656,21 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
|
|||||||
#endif
|
#endif
|
||||||
// HOT PATH: Warm pool hit, try to carve directly
|
// HOT PATH: Warm pool hit, try to carve directly
|
||||||
if (produced == 0) {
|
if (produced == 0) {
|
||||||
|
#if HAKMEM_BUILD_RELEASE
|
||||||
|
if (class_idx == 7) {
|
||||||
|
warm_pool_rel_c7_carve_attempt();
|
||||||
|
}
|
||||||
|
#endif
|
||||||
produced = slab_carve_from_ss(class_idx, warm_ss, out, room);
|
produced = slab_carve_from_ss(class_idx, warm_ss, out, room);
|
||||||
|
#if HAKMEM_BUILD_RELEASE
|
||||||
|
if (class_idx == 7) {
|
||||||
|
if (produced > 0) {
|
||||||
|
warm_pool_rel_c7_carve_success();
|
||||||
|
} else {
|
||||||
|
warm_pool_rel_c7_carve_zero();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
#endif
|
||||||
if (produced > 0) {
|
if (produced > 0) {
|
||||||
// Update active counter for carved blocks
|
// Update active counter for carved blocks
|
||||||
ss_active_add(warm_ss, (uint32_t)produced);
|
ss_active_add(warm_ss, (uint32_t)produced);
|
||||||
@ -615,7 +678,22 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
if (produced > 0) {
|
if (produced > 0) {
|
||||||
|
#if !HAKMEM_BUILD_RELEASE
|
||||||
|
if (class_idx == 7) {
|
||||||
|
warm_pool_dbg_c7_carve();
|
||||||
|
if (tls_carved) {
|
||||||
|
warm_pool_dbg_c7_uc_miss_tls();
|
||||||
|
} else {
|
||||||
|
warm_pool_dbg_c7_uc_miss_warm();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
#endif
|
||||||
// Success! Return SuperSlab to warm pool for next use
|
// Success! Return SuperSlab to warm pool for next use
|
||||||
|
#if HAKMEM_BUILD_RELEASE
|
||||||
|
if (class_idx == 7) {
|
||||||
|
atomic_fetch_add_explicit(&g_rel_c7_warm_push, 1, memory_order_relaxed);
|
||||||
|
}
|
||||||
|
#endif
|
||||||
tiny_warm_pool_push(class_idx, warm_ss);
|
tiny_warm_pool_push(class_idx, warm_ss);
|
||||||
|
|
||||||
// Track warm pool hit (always compiled, ENV-gated printing)
|
// Track warm pool hit (always compiled, ENV-gated printing)
|
||||||
@ -761,6 +839,9 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
#if !HAKMEM_BUILD_RELEASE
|
#if !HAKMEM_BUILD_RELEASE
|
||||||
|
if (class_idx == 7) {
|
||||||
|
warm_pool_dbg_c7_uc_miss_shared();
|
||||||
|
}
|
||||||
g_unified_cache_miss[class_idx]++;
|
g_unified_cache_miss[class_idx]++;
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
|||||||
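The release-path counters introduced in this hunk (`g_rel_c7_warm_pop` / `g_rel_c7_warm_push`) are plain relaxed atomics: incremented on the hot path, read once at the end of a run for smoke validation. A minimal standalone sketch of that pattern, with illustrative names rather than the tree's:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hot-path telemetry counters in the style of g_rel_c7_warm_pop /
 * g_rel_c7_warm_push (names here are illustrative). */
static _Atomic uint64_t g_warm_pop = 0;
static _Atomic uint64_t g_warm_push = 0;

static inline void warm_pop_event(void) {
    /* relaxed is enough: the value is only read for end-of-run reporting */
    atomic_fetch_add_explicit(&g_warm_pop, 1, memory_order_relaxed);
}

static inline void warm_push_event(void) {
    atomic_fetch_add_explicit(&g_warm_push, 1, memory_order_relaxed);
}

static inline uint64_t warm_pop_total(void) {
    return atomic_load_explicit(&g_warm_pop, memory_order_relaxed);
}

static inline uint64_t warm_push_total(void) {
    return atomic_load_explicit(&g_warm_push, memory_order_relaxed);
}
```

Relaxed ordering keeps the increments nearly free on the allocation fast path; the totals are best-effort, which is all a smoke check needs.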
@@ -40,10 +40,18 @@ core/front/tiny_unified_cache.o: core/front/tiny_unified_cache.c \
   core/front/../box/../superslab/superslab_inline.h \
   core/front/../box/../tiny_box_geometry.h \
   core/front/../box/../box/pagefault_telemetry_box.h \
+  core/front/../box/c7_meta_used_counter_box.h \
+  core/front/../box/warm_pool_rel_counters_box.h \
   core/front/../box/warm_pool_prefill_box.h \
   core/front/../box/../tiny_tls.h \
   core/front/../box/../box/warm_pool_stats_box.h \
-  core/front/../hakmem_env_cache.h core/front/../box/tiny_page_box.h
+  core/front/../hakmem_env_cache.h core/front/../box/tiny_page_box.h \
+  core/front/../box/ss_tls_bind_box.h \
+  core/front/../box/../box/tiny_page_box.h \
+  core/front/../box/tiny_tls_carve_one_block_box.h \
+  core/front/../box/../tiny_debug_api.h \
+  core/front/../box/warm_tls_bind_logger_box.h \
+  core/front/../box/warm_pool_dbg_box.h
 core/front/tiny_unified_cache.h:
 core/front/../hakmem_build_flags.h:
 core/front/../hakmem_tiny_config.h:
@@ -104,8 +112,16 @@ core/front/../box/../hakmem_tiny_superslab.h:
 core/front/../box/../superslab/superslab_inline.h:
 core/front/../box/../tiny_box_geometry.h:
 core/front/../box/../box/pagefault_telemetry_box.h:
+core/front/../box/c7_meta_used_counter_box.h:
+core/front/../box/warm_pool_rel_counters_box.h:
 core/front/../box/warm_pool_prefill_box.h:
 core/front/../box/../tiny_tls.h:
 core/front/../box/../box/warm_pool_stats_box.h:
 core/front/../hakmem_env_cache.h:
 core/front/../box/tiny_page_box.h:
+core/front/../box/ss_tls_bind_box.h:
+core/front/../box/../box/tiny_page_box.h:
+core/front/../box/tiny_tls_carve_one_block_box.h:
+core/front/../box/../tiny_debug_api.h:
+core/front/../box/warm_tls_bind_logger_box.h:
+core/front/../box/warm_pool_dbg_box.h:
@@ -87,6 +87,10 @@ extern __thread uint64_t g_unified_cache_hit[TINY_NUM_CLASSES];  // Alloc hits
 extern __thread uint64_t g_unified_cache_miss[TINY_NUM_CLASSES]; // Alloc misses
 extern __thread uint64_t g_unified_cache_push[TINY_NUM_CLASSES]; // Free pushes
 extern __thread uint64_t g_unified_cache_full[TINY_NUM_CLASSES]; // Free full (fallback to SuperSlab)
+#else
+// Release-side lightweight C7 warm path counters (for smoke validation)
+extern _Atomic uint64_t g_rel_c7_warm_pop;
+extern _Atomic uint64_t g_rel_c7_warm_push;
 #endif

 // ============================================================================
@@ -10,11 +10,145 @@
 #include "hakmem_policy.h"
 #include "hakmem_env_cache.h"     // Priority-2: ENV cache
 #include "front/tiny_warm_pool.h" // Warm Pool: Prefill during registry scans
+#include "box/ss_slab_reset_box.h" // Box: Reset slab metadata on reuse (C7 guard)

 #include <stdlib.h>
 #include <stdio.h>
 #include <stdatomic.h>

+static inline void c7_log_meta_state(const char* tag, SuperSlab* ss, int slab_idx) {
+    if (!ss) return;
+#if HAKMEM_BUILD_RELEASE
+    static _Atomic uint32_t rel_c7_meta_logs = 0;
+    uint32_t n = atomic_fetch_add_explicit(&rel_c7_meta_logs, 1, memory_order_relaxed);
+    if (n < 8) {
+        TinySlabMeta* m = &ss->slabs[slab_idx];
+        fprintf(stderr,
+                "[REL_C7_%s] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p\n",
+                tag,
+                (void*)ss,
+                slab_idx,
+                (unsigned)m->class_idx,
+                (unsigned)m->used,
+                (unsigned)m->capacity,
+                (unsigned)m->carved,
+                m->freelist);
+    }
+#else
+    static _Atomic uint32_t dbg_c7_meta_logs = 0;
+    uint32_t n = atomic_fetch_add_explicit(&dbg_c7_meta_logs, 1, memory_order_relaxed);
+    if (n < 8) {
+        TinySlabMeta* m = &ss->slabs[slab_idx];
+        fprintf(stderr,
+                "[DBG_C7_%s] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p\n",
+                tag,
+                (void*)ss,
+                slab_idx,
+                (unsigned)m->class_idx,
+                (unsigned)m->used,
+                (unsigned)m->capacity,
+                (unsigned)m->carved,
+                m->freelist);
+    }
+#endif
+}
+
+static inline int c7_meta_is_pristine(TinySlabMeta* m) {
+    return m && m->used == 0 && m->carved == 0 && m->freelist == NULL;
+}
+
+static inline void c7_log_skip_nonempty_acquire(SuperSlab* ss,
+                                                int slab_idx,
+                                                TinySlabMeta* m,
+                                                const char* tag) {
+    if (!(ss && m)) return;
+#if HAKMEM_BUILD_RELEASE
+    static _Atomic uint32_t rel_c7_skip_logs = 0;
+    uint32_t n = atomic_fetch_add_explicit(&rel_c7_skip_logs, 1, memory_order_relaxed);
+    if (n < 4) {
+        fprintf(stderr,
+                "[REL_C7_%s] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p\n",
+                tag,
+                (void*)ss,
+                slab_idx,
+                (unsigned)m->class_idx,
+                (unsigned)m->used,
+                (unsigned)m->capacity,
+                (unsigned)m->carved,
+                m->freelist);
+    }
+#else
+    static _Atomic uint32_t dbg_c7_skip_logs = 0;
+    uint32_t n = atomic_fetch_add_explicit(&dbg_c7_skip_logs, 1, memory_order_relaxed);
+    if (n < 4) {
+        fprintf(stderr,
+                "[DBG_C7_%s] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p\n",
+                tag,
+                (void*)ss,
+                slab_idx,
+                (unsigned)m->class_idx,
+                (unsigned)m->used,
+                (unsigned)m->capacity,
+                (unsigned)m->carved,
+                m->freelist);
+    }
+#endif
+}
+
+static inline int c7_reset_and_log_if_needed(SuperSlab* ss,
+                                             int slab_idx,
+                                             int class_idx) {
+    if (class_idx != 7) {
+        return 0;
+    }
+
+    TinySlabMeta* m = &ss->slabs[slab_idx];
+    c7_log_meta_state("ACQUIRE_META", ss, slab_idx);
+
+    if (m->class_idx != 255 && m->class_idx != (uint8_t)class_idx) {
+#if HAKMEM_BUILD_RELEASE
+        static _Atomic uint32_t rel_c7_class_mismatch_logs = 0;
+        uint32_t n = atomic_fetch_add_explicit(&rel_c7_class_mismatch_logs, 1, memory_order_relaxed);
+        if (n < 4) {
+            fprintf(stderr,
+                    "[REL_C7_CLASS_MISMATCH] ss=%p slab=%d want=%d have=%u used=%u cap=%u carved=%u\n",
+                    (void*)ss,
+                    slab_idx,
+                    class_idx,
+                    (unsigned)m->class_idx,
+                    (unsigned)m->used,
+                    (unsigned)m->capacity,
+                    (unsigned)m->carved);
+        }
+#else
+        static _Atomic uint32_t dbg_c7_class_mismatch_logs = 0;
+        uint32_t n = atomic_fetch_add_explicit(&dbg_c7_class_mismatch_logs, 1, memory_order_relaxed);
+        if (n < 4) {
+            fprintf(stderr,
+                    "[DBG_C7_CLASS_MISMATCH] ss=%p slab=%d want=%d have=%u used=%u cap=%u carved=%u freelist=%p\n",
+                    (void*)ss,
+                    slab_idx,
+                    class_idx,
+                    (unsigned)m->class_idx,
+                    (unsigned)m->used,
+                    (unsigned)m->capacity,
+                    (unsigned)m->carved,
+                    m->freelist);
+        }
+#endif
+        return -1;
+    }
+
+    if (!c7_meta_is_pristine(m)) {
+        c7_log_skip_nonempty_acquire(ss, slab_idx, m, "SKIP_NONEMPTY_ACQUIRE");
+        return -1;
+    }
+
+    ss_slab_reset_meta_for_tiny(ss, slab_idx, class_idx);
+    c7_log_meta_state("ACQUIRE", ss, slab_idx);
+    return 0;
+}
+
 // ============================================================================
 // Performance Measurement: Shared Pool Lock Contention (ENV-gated)
 // ============================================================================
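The `c7_meta_is_pristine()` helper added above gates every C7 slab acquisition on `used == 0 && carved == 0 && freelist == NULL`, so a slot that still carries state from a previous owner is skipped rather than handed out. A reduced sketch of the predicate, using a simplified stand-in for `TinySlabMeta` (the real struct has more fields):

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for TinySlabMeta (illustrative, not the repo's type). */
typedef struct {
    uint16_t used;     /* blocks currently handed out                */
    uint16_t carved;   /* blocks ever carved from this slab          */
    void*    freelist; /* intrusive freelist head, NULL when empty   */
} SlabMetaSketch;

/* A slab slot is safe to rebind only if nothing was carved, nothing is
 * in use, and no freelist survives from a previous owner. */
static inline int meta_is_pristine(const SlabMetaSketch* m) {
    return m != NULL && m->used == 0 && m->carved == 0 && m->freelist == NULL;
}
```

Checking all three fields matters: a slot with `used == 0` but a non-NULL freelist would hand stale block pointers to the new class.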
@@ -147,7 +281,12 @@ sp_acquire_from_empty_scan(int class_idx, SuperSlab** ss_out, int* slab_idx_out,
             fprintf(stderr, "[STAGE0.5_STATS] hits=%lu attempts=%lu rate=%.1f%% (scan_limit=%d warm_pool=%d)\n",
                     hits, attempts, (double)hits * 100.0 / attempts, scan_limit, tiny_warm_pool_count(class_idx));
         }
-        return 0;
+        if (c7_reset_and_log_if_needed(primary_result, primary_slab_idx, class_idx) == 0) {
+            return 0;
+        }
+        primary_result = NULL;
+        *ss_out = NULL;
+        *slab_idx_out = -1;
     }
     return -1;
 }
@@ -216,6 +355,15 @@ stage1_retry_after_tension_drain:
         if (ss_guard) {
             tiny_tls_slab_reuse_guard(ss_guard);

+            if (class_idx == 7) {
+                TinySlabMeta* meta = &ss_guard->slabs[reuse_slot_idx];
+                if (!c7_meta_is_pristine(meta)) {
+                    c7_log_skip_nonempty_acquire(ss_guard, reuse_slot_idx, meta, "SKIP_NONEMPTY_ACQUIRE");
+                    sp_freelist_push_lockfree(class_idx, reuse_meta, reuse_slot_idx);
+                    goto stage2_fallback;
+                }
+            }
+
             // P-Tier: Skip DRAINING tier SuperSlabs
             if (!ss_tier_is_hot(ss_guard)) {
                 // DRAINING SuperSlab - skip this slot and fall through to Stage 2
@@ -270,6 +418,15 @@ stage1_retry_after_tension_drain:

             *ss_out = ss;
             *slab_idx_out = reuse_slot_idx;
+            if (c7_reset_and_log_if_needed(ss, reuse_slot_idx, class_idx) != 0) {
+                *ss_out = NULL;
+                *slab_idx_out = -1;
+                if (g_lock_stats_enabled == 1) {
+                    atomic_fetch_add(&g_lock_release_count, 1);
+                }
+                pthread_mutex_unlock(&g_shared_pool.alloc_lock);
+                goto stage2_fallback;
+            }

             if (g_lock_stats_enabled == 1) {
                 atomic_fetch_add(&g_lock_release_count, 1);
@@ -338,6 +495,19 @@ stage2_fallback:
                                           1, memory_order_relaxed);
             }

+            if (class_idx == 7) {
+                TinySlabMeta* meta = &ss->slabs[claimed_idx];
+                if (!c7_meta_is_pristine(meta)) {
+                    c7_log_skip_nonempty_acquire(ss, claimed_idx, meta, "SKIP_NONEMPTY_ACQUIRE");
+                    sp_slot_mark_empty(hint_meta, claimed_idx);
+                    if (g_lock_stats_enabled == 1) {
+                        atomic_fetch_add(&g_lock_release_count, 1);
+                    }
+                    pthread_mutex_unlock(&g_shared_pool.alloc_lock);
+                    goto stage2_scan;
+                }
+            }
+
             // Update SuperSlab metadata under mutex
             ss->slab_bitmap |= (1u << claimed_idx);
             ss_slab_meta_class_idx_set(ss, claimed_idx, (uint8_t)class_idx);
@@ -353,6 +523,15 @@ stage2_fallback:
                 // Hint is still good, no need to update
                 *ss_out = ss;
                 *slab_idx_out = claimed_idx;
+                if (c7_reset_and_log_if_needed(ss, claimed_idx, class_idx) != 0) {
+                    *ss_out = NULL;
+                    *slab_idx_out = -1;
+                    if (g_lock_stats_enabled == 1) {
+                        atomic_fetch_add(&g_lock_release_count, 1);
+                    }
+                    pthread_mutex_unlock(&g_shared_pool.alloc_lock);
+                    goto stage2_scan;
+                }
                 sp_fix_geometry_if_needed(ss, claimed_idx, class_idx);

                 if (g_lock_stats_enabled == 1) {
@@ -432,6 +611,19 @@ stage2_scan:
                                           1, memory_order_relaxed);
             }
+
+            if (class_idx == 7) {
+                TinySlabMeta* meta_slab = &ss->slabs[claimed_idx];
+                if (!c7_meta_is_pristine(meta_slab)) {
+                    c7_log_skip_nonempty_acquire(ss, claimed_idx, meta_slab, "SKIP_NONEMPTY_ACQUIRE");
+                    sp_slot_mark_empty(meta, claimed_idx);
+                    if (g_lock_stats_enabled == 1) {
+                        atomic_fetch_add(&g_lock_release_count, 1);
+                    }
+                    pthread_mutex_unlock(&g_shared_pool.alloc_lock);
+                    continue;
+                }
+            }

             // Update SuperSlab metadata under mutex
             ss->slab_bitmap |= (1u << claimed_idx);
             ss_slab_meta_class_idx_set(ss, claimed_idx, (uint8_t)class_idx);
@@ -449,6 +641,15 @@ stage2_scan:

                 *ss_out = ss;
                 *slab_idx_out = claimed_idx;
+                if (c7_reset_and_log_if_needed(ss, claimed_idx, class_idx) != 0) {
+                    *ss_out = NULL;
+                    *slab_idx_out = -1;
+                    if (g_lock_stats_enabled == 1) {
+                        atomic_fetch_add(&g_lock_release_count, 1);
+                    }
+                    pthread_mutex_unlock(&g_shared_pool.alloc_lock);
+                    continue;
+                }
                 sp_fix_geometry_if_needed(ss, claimed_idx, class_idx);

                 if (g_lock_stats_enabled == 1) {
@@ -623,6 +824,15 @@ stage2_scan:

     *ss_out = new_ss;
     *slab_idx_out = first_slot;
+    if (c7_reset_and_log_if_needed(new_ss, first_slot, class_idx) != 0) {
+        *ss_out = NULL;
+        *slab_idx_out = -1;
+        if (g_lock_stats_enabled == 1) {
+            atomic_fetch_add(&g_lock_release_count, 1);
+        }
+        pthread_mutex_unlock(&g_shared_pool.alloc_lock);
+        return -1;
+    }
     sp_fix_geometry_if_needed(new_ss, first_slot, class_idx);

     if (g_lock_stats_enabled == 1) {
@@ -6,11 +6,42 @@
 #include "hakmem_env_cache.h"           // Priority-2: ENV cache
 #include "superslab/superslab_inline.h" // superslab_ref_get guard for TLS pins
 #include "box/ss_release_guard_box.h"   // Box: SuperSlab Release Guard
+#include "box/ss_slab_reset_box.h"      // Box: Reset slab metadata on reuse path

 #include <stdlib.h>
 #include <stdio.h>
 #include <stdatomic.h>

+static inline void c7_release_log_once(SuperSlab* ss, int slab_idx) {
+#if HAKMEM_BUILD_RELEASE
+    static _Atomic uint32_t rel_c7_release_logs = 0;
+    uint32_t n = atomic_fetch_add_explicit(&rel_c7_release_logs, 1, memory_order_relaxed);
+    if (n < 8) {
+        TinySlabMeta* meta = &ss->slabs[slab_idx];
+        fprintf(stderr,
+                "[REL_C7_RELEASE] ss=%p slab=%d used=%u cap=%u carved=%u\n",
+                (void*)ss,
+                slab_idx,
+                (unsigned)meta->used,
+                (unsigned)meta->capacity,
+                (unsigned)meta->carved);
+    }
+#else
+    static _Atomic uint32_t dbg_c7_release_logs = 0;
+    uint32_t n = atomic_fetch_add_explicit(&dbg_c7_release_logs, 1, memory_order_relaxed);
+    if (n < 8) {
+        TinySlabMeta* meta = &ss->slabs[slab_idx];
+        fprintf(stderr,
+                "[DBG_C7_RELEASE] ss=%p slab=%d used=%u cap=%u carved=%u\n",
+                (void*)ss,
+                slab_idx,
+                (unsigned)meta->used,
+                (unsigned)meta->capacity,
+                (unsigned)meta->carved);
+    }
+#endif
+}
+
 void
 shared_pool_release_slab(SuperSlab* ss, int slab_idx)
 {
@@ -75,6 +106,9 @@ shared_pool_release_slab(SuperSlab* ss, int slab_idx)
     }

     uint8_t class_idx = slab_meta->class_idx;
+    if (class_idx == 7) {
+        c7_release_log_once(ss, slab_idx);
+    }

     // Guard: if SuperSlab is pinned (TLS/remote references), defer release to avoid
     // class_map=255 while pointers are still in-flight.
@@ -101,6 +135,39 @@ shared_pool_release_slab(SuperSlab* ss, int slab_idx)
     }
 #endif

+    if (class_idx == 7) {
+        ss_slab_reset_meta_for_tiny(ss, slab_idx, class_idx);
+#if HAKMEM_BUILD_RELEASE
+        static _Atomic uint32_t rel_c7_reset_logs = 0;
+        uint32_t rn = atomic_fetch_add_explicit(&rel_c7_reset_logs, 1, memory_order_relaxed);
+        if (rn < 4) {
+            TinySlabMeta* m = &ss->slabs[slab_idx];
+            fprintf(stderr,
+                    "[REL_C7_RELEASE_RESET] ss=%p slab=%d used=%u cap=%u carved=%u freelist=%p\n",
+                    (void*)ss,
+                    slab_idx,
+                    (unsigned)m->used,
+                    (unsigned)m->capacity,
+                    (unsigned)m->carved,
+                    m->freelist);
+        }
+#else
+        static _Atomic uint32_t dbg_c7_reset_logs = 0;
+        uint32_t rn = atomic_fetch_add_explicit(&dbg_c7_reset_logs, 1, memory_order_relaxed);
+        if (rn < 4) {
+            TinySlabMeta* m = &ss->slabs[slab_idx];
+            fprintf(stderr,
+                    "[DBG_C7_RELEASE_RESET] ss=%p slab=%d used=%u cap=%u carved=%u freelist=%p\n",
+                    (void*)ss,
+                    slab_idx,
+                    (unsigned)m->used,
+                    (unsigned)m->capacity,
+                    (unsigned)m->carved,
+                    m->freelist);
+        }
+#endif
+    }
+
     // Find SharedSSMeta for this SuperSlab
     SharedSSMeta* sp_meta = NULL;
     uint32_t count = atomic_load_explicit(&g_shared_pool.ss_meta_count, memory_order_relaxed);
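Each logging helper above duplicates its body for Release (`REL_C7_*`) and Debug (`DBG_C7_*`) with only the message prefix and counter name differing. One way the duplication could be folded together is a single first-N helper that picks the prefix at compile time. A hedged sketch, not the repo's API; `HAKMEM_BUILD_RELEASE` is defaulted here only so the sketch is self-contained:

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption for this sketch: treat an undefined flag as a Debug build. */
#ifndef HAKMEM_BUILD_RELEASE
#define HAKMEM_BUILD_RELEASE 0
#endif

#if HAKMEM_BUILD_RELEASE
#define C7_LOG_PREFIX "REL_C7"
#else
#define C7_LOG_PREFIX "DBG_C7"
#endif

/* One shared helper replaces the per-build duplicated blocks: each call
 * site owns an atomic counter, and only the first `limit` events print.
 * Returns 1 when a line was emitted, so the throttle is observable. */
static inline int c7_log_first_n(_Atomic uint32_t* counter, uint32_t limit,
                                 const char* tag) {
    uint32_t n = atomic_fetch_add_explicit(counter, 1, memory_order_relaxed);
    if (n < limit) {
        fprintf(stderr, "[" C7_LOG_PREFIX "_%s]\n", tag);
        return 1;
    }
    return 0;
}
```

The relaxed fetch-add keeps the throttle race-free without imposing ordering on the surrounding allocator code.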
@@ -25,6 +25,7 @@
 #include "front/tiny_heap_v2.h"
 #include "tiny_tls_guard.h"
 #include "tiny_ready.h"
+#include "box/c7_meta_used_counter_box.h"
 #include "hakmem_tiny_tls_list.h"
 #include "hakmem_tiny_remote_target.h" // Phase 2C-1: Remote target queue
 #include "hakmem_tiny_bg_spill.h"      // Phase 2C-2: Background spill queue
@@ -334,6 +335,7 @@ static inline void* hak_tiny_alloc_superslab_try_fast(int class_idx) {
     size_t block_size = tiny_stride_for_class(meta->class_idx);
     void* block = tls->slab_base + ((size_t)meta->used * block_size);
     meta->used++;
+    c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_FRONT);
     // Track active blocks in SuperSlab for conservative reclamation
     ss_active_inc(tls->ss);
     return block;
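`c7_meta_used_note()` calls are threaded through every `meta->used++` site in the hunks that follow, tagged with the source path (`C7_META_USED_SRC_FRONT`); the box header itself is not part of this diff. A hypothetical sketch of what such a counter box could look like, where the enum values and array are illustrative rather than the repo's header:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shape of the c7_meta_used_counter box: tally every
 * meta->used increment for class 7, keyed by which path performed it. */
enum { USED_SRC_FRONT = 0, USED_SRC_REFILL = 1, USED_SRC_MAX = 2 };

static _Atomic uint64_t g_used_notes[USED_SRC_MAX];

static inline void meta_used_note(int class_idx, int src) {
    if (class_idx != 7) return; /* instrument class 7 only */
    atomic_fetch_add_explicit(&g_used_notes[src], 1, memory_order_relaxed);
}

static inline uint64_t meta_used_total(int src) {
    return atomic_load_explicit(&g_used_notes[src], memory_order_relaxed);
}
```

Splitting the tally by source path is what lets an end-of-run report attribute a `used`/`capacity` mismatch to a specific refill or carve path.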
@@ -17,6 +17,7 @@
 // Phase E1-CORRECT: Box API for next pointer operations
 #include "box/tiny_next_ptr_box.h"
 #include "front/tiny_heap_v2.h"
+#include "box/c7_meta_used_counter_box.h"

 // Debug counters (thread-local)
 static __thread uint64_t g_3layer_bump_hits = 0;
@@ -265,6 +266,7 @@ static void* tiny_alloc_slow_new(int class_idx) {
             meta->freelist = tiny_next_read(node); // Phase E1-CORRECT: Box API
             items[got++] = node;
             meta->used++;
+            c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
         }

         // Then linear carve (KEY OPTIMIZATION - direct array fill!)
@@ -285,6 +287,11 @@ static void* tiny_alloc_slow_new(int class_idx) {
             }

             meta->used += need; // Reserve to TLS; not active until returned to user
+            if (class_idx == 7) {
+                for (uint32_t i = 0; i < need; ++i) {
+                    c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
+                }
+            }
         }

         if (got == 0) {
@ -18,6 +18,7 @@
|
|||||||
#include "tiny_box_geometry.h"
|
#include "tiny_box_geometry.h"
|
||||||
#include "superslab/superslab_inline.h" // Provides hak_super_lookup() and SUPERSLAB_MAGIC
|
#include "superslab/superslab_inline.h" // Provides hak_super_lookup() and SUPERSLAB_MAGIC
|
||||||
#include "box/tls_sll_box.h"
|
#include "box/tls_sll_box.h"
|
||||||
|
#include "box/c7_meta_used_counter_box.h"
|
||||||
#include "box/tiny_header_box.h" // Header Box: Single Source of Truth for header operations
|
#include "box/tiny_header_box.h" // Header Box: Single Source of Truth for header operations
|
||||||
#include "box/tiny_front_config_box.h" // Phase 7-Step6-Fix: Config macros for dead code elimination
|
#include "box/tiny_front_config_box.h" // Phase 7-Step6-Fix: Config macros for dead code elimination
|
||||||
#include "hakmem_tiny_integrity.h"
|
#include "hakmem_tiny_integrity.h"
|
||||||
@ -94,6 +95,39 @@ static inline void tiny_debug_validate_node_base(int class_idx, void* node, cons
|
|||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
static inline void c7_log_used_assign_cap(TinySlabMeta* meta,
|
||||||
|
int class_idx,
|
||||||
|
const char* tag) {
|
||||||
|
if (__builtin_expect(class_idx != 7, 1)) {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
#if HAKMEM_BUILD_RELEASE
|
||||||
|
static _Atomic uint32_t rel_logs = 0;
|
||||||
|
uint32_t n = atomic_fetch_add_explicit(&rel_logs, 1, memory_order_relaxed);
|
||||||
|
if (n < 4) {
|
||||||
|
fprintf(stderr,
|
||||||
|
"[REL_C7_USED_ASSIGN] tag=%s used=%u cap=%u carved=%u freelist=%p\n",
|
||||||
|
tag,
|
||||||
|
(unsigned)meta->used,
|
||||||
|
(unsigned)meta->capacity,
|
||||||
|
(unsigned)meta->carved,
|
||||||
|
meta->freelist);
|
||||||
|
}
|
||||||
|
#else
|
||||||
|
static _Atomic uint32_t dbg_logs = 0;
|
||||||
|
uint32_t n = atomic_fetch_add_explicit(&dbg_logs, 1, memory_order_relaxed);
|
||||||
|
if (n < 4) {
|
||||||
|
fprintf(stderr,
|
||||||
|
"[DBG_C7_USED_ASSIGN] tag=%s used=%u cap=%u carved=%u freelist=%p\n",
|
||||||
|
tag,
|
||||||
|
(unsigned)meta->used,
|
||||||
|
(unsigned)meta->capacity,
|
||||||
|
(unsigned)meta->carved,
|
||||||
|
meta->freelist);
|
||||||
|
}
|
||||||
|
#endif
|
||||||
|
}
|
||||||
|
|
||||||
// ========= superslab_tls_bump_fast =========
|
// ========= superslab_tls_bump_fast =========
|
||||||
//
|
//
|
||||||
 // Ultra bump shadow: when the current slab's freelist is empty and carved < capacity,
@@ -141,6 +175,11 @@ static inline void* superslab_tls_bump_fast(int class_idx) {

     meta->carved = (uint16_t)(carved + (uint16_t)chunk);
     meta->used = (uint16_t)(meta->used + (uint16_t)chunk);
+    if (class_idx == 7) {
+        for (uint32_t i = 0; i < chunk; ++i) {
+            c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
+        }
+    }
     ss_active_add(tls->ss, chunk);
 #if HAKMEM_DEBUG_COUNTERS
     g_bump_arms[class_idx]++;
@@ -365,8 +404,10 @@ int sll_refill_small_from_ss(int class_idx, int max_take)

         meta->freelist = next_raw;
         meta->used++;
+        c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
         if (__builtin_expect(meta->used > meta->capacity, 0)) {
             // On anomaly detection, roll back and stop (abort quietly to avoid fail-fast)
+            c7_log_used_assign_cap(meta, class_idx, "FREELIST_OVERRUN");
             meta->used = meta->capacity;
             break;
         }
@@ -414,7 +455,9 @@ int sll_refill_small_from_ss(int class_idx, int max_take)

         meta->carved++;
         meta->used++;
+        c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
         if (__builtin_expect(meta->used > meta->capacity, 0)) {
+            c7_log_used_assign_cap(meta, class_idx, "CARVE_OVERRUN");
             meta->used = meta->capacity;
             break;
         }
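The hunks above thread a `c7_meta_used_note()` call after every `meta->used++` on the front path. The counter box header itself is not shown in this diff, so the following is only a minimal sketch of what such a per-source telemetry box could look like (the names `c7_note`, `c7_read`, and `C7_SRC_*` are illustrative, not the project's actual `c7_meta_used_counter_box.h` API):

```c
#include <assert.h>
#include <stdatomic.h>

/* One relaxed atomic counter per increment source; only class 7 is counted,
 * mirroring how the diff gates the calls on class_idx == 7. */
enum { C7_SRC_FRONT, C7_SRC_BACKEND, C7_SRC__COUNT };

static _Atomic unsigned long g_c7_used[C7_SRC__COUNT];

static inline void c7_note(int class_idx, int src) {
    if (class_idx != 7) return;  /* telemetry is C7-only */
    atomic_fetch_add_explicit(&g_c7_used[src], 1UL, memory_order_relaxed);
}

static inline unsigned long c7_read(int src) {
    return atomic_load_explicit(&g_c7_used[src], memory_order_relaxed);
}
```

Relaxed ordering suffices here because the counters are diagnostic: they are read after the fact, never used to synchronize.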
@@ -33,6 +33,7 @@
 #ifndef HEADER_CLASS_MASK
 #define HEADER_CLASS_MASK 0x0F
 #endif
+#include "../box/c7_meta_used_counter_box.h"

 // ========================================================================
 // REFILL CONTRACT: ss_refill_fc_fill() - Standard Refill Entry Point
@@ -131,12 +132,14 @@ static inline int ss_refill_fc_fill(int class_idx, int want) {
             p = meta->freelist;
             meta->freelist = tiny_next_read(class_idx, p);
             meta->used++;
+            c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
         }
         // Option B: Carve new block (if capacity available)
         else if (meta->carved < meta->capacity) {
             p = (void*)(slab_base + (meta->carved * stride));
             meta->carved++;
             meta->used++;
+            c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
         }
         // Option C: Slab exhausted, need new slab
         else {
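The refill loop above follows an Option A / B / C shape: pop the slab freelist first, linear-carve while capacity remains, and bail out when the slab is exhausted. A self-contained sketch of that shape (the `SlabMeta` struct and `refill_batch` helper are simplified stand-ins, not the project's real types):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified slab metadata: freelist links freed blocks through their
 * first word; carved counts blocks ever handed out linearly. */
typedef struct {
    void*    freelist;
    uint16_t used, carved, capacity;
    uint8_t* slab_base;
    size_t   stride;
} SlabMeta;

/* Take up to `want` blocks: Option A (freelist pop) before
 * Option B (linear carve); Option C (exhausted) stops the loop. */
static int refill_batch(SlabMeta* m, void** out, int want) {
    int got = 0;
    while (got < want) {
        void* p;
        if (m->freelist) {                       /* Option A */
            p = m->freelist;
            m->freelist = *(void**)m->freelist;
        } else if (m->carved < m->capacity) {    /* Option B */
            p = m->slab_base + (size_t)m->carved * m->stride;
            m->carved++;
        } else {
            break;                               /* Option C: need new slab */
        }
        m->used++;
        out[got++] = p;
    }
    return got;
}
```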
@@ -9,6 +9,7 @@
 #include "tiny_debug_ring.h"
 #include "tiny_remote.h"
 #include "box/tiny_next_ptr_box.h" // Box API: next pointer read/write
+#include "box/c7_meta_used_counter_box.h"

 extern int g_debug_remote_guard;
 extern int g_tiny_safe_free_strict;
@@ -311,6 +312,7 @@ static inline void* slab_freelist_pop(SlabHandle* h) {
     void* next = tiny_next_read(h->meta->class_idx, ptr); // Box API: next pointer read
     h->meta->freelist = next;
     h->meta->used++;
+    c7_meta_used_note(h->meta->class_idx, C7_META_USED_SRC_FRONT);
     // Optional freelist mask clear when freelist becomes empty
     do {
         static int g_mask_en2 = -1;
@@ -4,6 +4,10 @@
 // Date: 2025-11-28

 #include "hakmem_tiny_superslab_internal.h"
+#include "box/c7_meta_used_counter_box.h"
+#include <stdatomic.h>

+static _Atomic uint32_t g_c7_backend_calls = 0;
+
 // Note: Legacy backend moved to archive/superslab_backend_legacy.c (not built).

@@ -83,6 +87,20 @@ void* hak_tiny_alloc_superslab_backend_shared(int class_idx)
         return NULL;
     }

+    if (class_idx == 7) {
+        uint32_t n = atomic_fetch_add_explicit(&g_c7_backend_calls, 1, memory_order_relaxed);
+        if (n < 8) {
+            fprintf(stderr,
+                    "[REL_C7_BACKEND_CALL] cls=%d meta_cls=%u used=%u cap=%u ss=%p slab=%d\n",
+                    class_idx,
+                    (unsigned)meta->class_idx,
+                    (unsigned)meta->used,
+                    (unsigned)meta->capacity,
+                    (void*)ss,
+                    slab_idx);
+        }
+    }
+
     // Simple bump allocation within this slab.
     if (meta->used >= meta->capacity) {
         // Slab exhausted: in minimal Phase12-2 backend we do not loop;
@@ -101,6 +119,7 @@ void* hak_tiny_alloc_superslab_backend_shared(int class_idx)
     uint8_t* base = (uint8_t*)ss + slab_base_off + offset;

     meta->used++;
+    c7_meta_used_note(class_idx, C7_META_USED_SRC_BACKEND);
     atomic_fetch_add_explicit(&ss->total_active_blocks, 1, memory_order_relaxed);

     HAK_RET_ALLOC_BLOCK_TRACED(class_idx, base, ALLOC_PATH_BACKEND);
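The `[REL_C7_BACKEND_CALL]` instrumentation above logs only the first 8 backend calls: `atomic_fetch_add` hands each caller a unique ticket, so the cap holds even under concurrency and the hot path pays only one relaxed RMW once the cap is hit. A standalone sketch of that first-N pattern (`log_first_n` and `LOG_CAP` are illustrative names):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

#define LOG_CAP 8u

static _Atomic unsigned g_calls;

/* Returns 1 when this call was allowed to log, 0 when suppressed.
 * Each caller gets a distinct ticket n, so exactly LOG_CAP calls log. */
static int log_first_n(const char* msg) {
    unsigned n = atomic_fetch_add_explicit(&g_calls, 1u, memory_order_relaxed);
    if (n < LOG_CAP) {
        fprintf(stderr, "[CALL %u] %s\n", n, msg);
        return 1;
    }
    return 0;
}
```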
@@ -6,6 +6,7 @@
 #include "hakmem_tiny_superslab_internal.h"
 #include "box/slab_recycling_box.h"
 #include "hakmem_env_cache.h" // Priority-2: ENV cache (eliminate syscalls)
+#include <stdio.h>

 // ============================================================================
 // Remote Drain (MPSC queue to freelist conversion)
@@ -175,6 +176,37 @@ void superslab_init_slab(SuperSlab* ss, int slab_idx, size_t block_size, uint32_
         }
     }

+#if HAKMEM_BUILD_RELEASE
+    static _Atomic int rel_c7_init_logged = 0;
+    if (meta->class_idx == 7 &&
+        atomic_load_explicit(&rel_c7_init_logged, memory_order_relaxed) == 0) {
+        fprintf(stderr,
+                "[REL_C7_INIT] ss=%p slab=%d cls=%u cap=%u used=%u carved=%u stride=%zu\n",
+                (void*)ss,
+                slab_idx,
+                (unsigned)meta->class_idx,
+                (unsigned)meta->capacity,
+                (unsigned)meta->used,
+                (unsigned)meta->carved,
+                stride);
+        atomic_store_explicit(&rel_c7_init_logged, 1, memory_order_relaxed);
+    }
+#else
+    static __thread int dbg_c7_init_logged = 0;
+    if (meta->class_idx == 7 && dbg_c7_init_logged == 0) {
+        fprintf(stderr,
+                "[DBG_C7_INIT] ss=%p slab=%d cls=%u cap=%u used=%u carved=%u stride=%zu\n",
+                (void*)ss,
+                slab_idx,
+                (unsigned)meta->class_idx,
+                (unsigned)meta->capacity,
+                (unsigned)meta->used,
+                (unsigned)meta->carved,
+                stride);
+        dbg_c7_init_logged = 1;
+    }
+#endif
+
     superslab_activate_slab(ss, slab_idx);
 }

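The hunk above logs `[REL_C7_INIT]` once per process (an `_Atomic int` flag) in release, and `[DBG_C7_INIT]` once per thread (`__thread`) in debug. Note the release variant uses a separate relaxed load and store, so two racing threads can both pass the check and log twice; a single `atomic_exchange` closes that window, as in this sketch (names illustrative):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

static _Atomic int g_logged;

/* Returns 1 for exactly one caller (the one that wins the exchange),
 * 0 for every later caller. Swap the _Atomic flag for a __thread int
 * to get the debug behavior: once per thread instead of once per process. */
static int log_once(const char* msg) {
    if (atomic_exchange_explicit(&g_logged, 1, memory_order_relaxed) != 0)
        return 0;
    fprintf(stderr, "%s\n", msg);
    return 1;
}
```

For a purely diagnostic message the double-log race is harmless, which is presumably why the diff keeps the cheaper load/store pair.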
@@ -7,6 +7,8 @@

 #include "box/superslab_expansion_box.h" // Box E: Expansion with TLS state guarantee
 #include "box/tiny_next_ptr_box.h" // Box API: Next pointer read/write
+#include "box/tiny_tls_carve_one_block_box.h" // Box: Shared TLS carve helper
+#include "box/c7_meta_used_counter_box.h" // Box: C7 meta->used telemetry
 #include "hakmem_tiny_superslab_constants.h"
 #include "tiny_box_geometry.h" // Box 3: Geometry & Capacity Calculator
 #include "tiny_debug_api.h" // Guard/failfast declarations
@@ -33,6 +35,7 @@ static inline void* superslab_alloc_from_slab(SuperSlab* ss, int slab_idx) {
     uint8_t* base = tiny_slab_base_for_geometry(ss, slab_idx);
     void* block = tiny_block_at_index(base, meta->used, unit_sz);
     meta->used++;
+    c7_meta_used_note(cls, C7_META_USED_SRC_FRONT);
     ss_active_inc(ss);
     HAK_RET_ALLOC(cls, block);
 }
@@ -105,6 +108,7 @@ static inline void* superslab_alloc_from_slab(SuperSlab* ss, int slab_idx) {
     }
 #endif
     meta->used++;
+    c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_FRONT);
     void* user =
 #if HAKMEM_TINY_HEADER_CLASSIDX
         tiny_region_id_write_header(block_base, meta->class_idx);
@@ -157,6 +161,7 @@ static inline void* superslab_alloc_from_slab(SuperSlab* ss, int slab_idx) {

     meta->freelist = tiny_next_read(meta->class_idx, block);
     meta->used++;
+    c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_FRONT);

     if (__builtin_expect(tiny_refill_failfast_level() >= 2, 0) &&
         __builtin_expect(meta->used > meta->capacity, 0)) {
@@ -294,54 +299,33 @@ static inline void* hak_tiny_alloc_superslab(int class_idx) {
     }

     // Fast path: linear carve from current TLS slab
-    if (meta && meta->freelist == NULL && meta->used < meta->capacity && tls->slab_base) {
-        size_t block_size = tiny_stride_for_class(meta->class_idx);
-        uint8_t* base = tls->slab_base;
-        void* block = base + ((size_t)meta->used * block_size);
-        meta->used++;
-        if (__builtin_expect(tiny_refill_failfast_level() >= 2, 0)) {
-            uintptr_t base_ss = (uintptr_t)tls->ss;
-            size_t ss_size = (size_t)1ULL << tls->ss->lg_size;
-            uintptr_t p = (uintptr_t)block;
-            int in_range = (p >= base_ss) && (p < base_ss + ss_size);
-            int aligned = ((p - (uintptr_t)base) % block_size) == 0;
-            int idx_ok = (tls->slab_idx >= 0) &&
-                         (tls->slab_idx < ss_slabs_capacity(tls->ss));
-            if (!in_range || !aligned || !idx_ok || meta->used > meta->capacity) {
-                tiny_failfast_abort_ptr("alloc_ret_align",
-                                        tls->ss,
-                                        tls->slab_idx,
-                                        block,
-                                        "superslab_tls_invariant");
-            }
-        }
-
-        ss_active_inc(tls->ss);
-        ROUTE_MARK(11); ROUTE_COMMIT(class_idx, 0x60);
-        HAK_RET_ALLOC(class_idx, block);
-    }
-
-    // Freelist path from current TLS slab
-    if (meta && meta->freelist) {
-        void* block = meta->freelist;
-        if (__builtin_expect(g_tiny_safe_free, 0)) {
-            size_t blk = tiny_stride_for_class(meta->class_idx);
-            uint8_t* base = tiny_slab_base_for_geometry(tls->ss, tls->slab_idx);
-            uintptr_t delta = (uintptr_t)block - (uintptr_t)base;
-            int align_ok = ((delta % blk) == 0);
-            int range_ok = (delta / blk) < meta->capacity;
-            if (!align_ok || !range_ok) {
-                if (g_tiny_safe_free_strict) { raise(SIGUSR2); return NULL; }
-                return NULL;
-            }
-        }
-        void* next = tiny_next_read(class_idx, block);
-        meta->freelist = next;
-        meta->used++;
-        ss_active_inc(tls->ss);
-        ROUTE_MARK(12); ROUTE_COMMIT(class_idx, 0x61);
-        HAK_RET_ALLOC(class_idx, block);
-    }
+    if (meta && tls->slab_base) {
+        TinyTLSCarveOneResult carve = tiny_tls_carve_one_block(tls, class_idx);
+        if (carve.block) {
+#if !HAKMEM_BUILD_RELEASE
+            if (__builtin_expect(g_debug_remote_guard, 0)) {
+                const char* tag = (carve.path == TINY_TLS_CARVE_PATH_FREELIST)
+                                      ? "freelist_alloc"
+                                      : "linear_alloc";
+                tiny_remote_track_on_alloc(tls->ss, slab_idx, carve.block, tag, 0);
+                tiny_remote_assert_not_remote(tls->ss, slab_idx, carve.block, tag, 0);
+            }
+#endif

+#if HAKMEM_TINY_SS_TLS_HINT
+            {
+                void* ss_base = (void*)tls->ss;
+                size_t ss_size = (size_t)1ULL << tls->ss->lg_size;
+                tls_ss_hint_update(tls->ss, ss_base, ss_size);
+            }
+#endif
+            if (carve.path == TINY_TLS_CARVE_PATH_LINEAR) {
+                ROUTE_MARK(11); ROUTE_COMMIT(class_idx, 0x60);
+            } else if (carve.path == TINY_TLS_CARVE_PATH_FREELIST) {
+                ROUTE_MARK(12); ROUTE_COMMIT(class_idx, 0x61);
+            }
+            HAK_RET_ALLOC(class_idx, carve.block);
+        }
+    }

     // Slow path: acquire a new slab via shared pool
@@ -363,6 +347,7 @@ static inline void* hak_tiny_alloc_superslab(int class_idx) {
     size_t block_size = tiny_stride_for_class(meta->class_idx);
     void* block = tiny_block_at_index(tls->slab_base, meta->used, block_size);
     meta->used++;
+    c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_FRONT);
     ss_active_inc(ss);
     HAK_RET_ALLOC(class_idx, block);
 }
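The `@@ -294,54 +299,33 @@` hunk collapses the separate linear-carve and freelist fast paths into one call to `tiny_tls_carve_one_block()`, which returns a result tagged with the path that produced the block so route marking happens once at the call site. The real `TinyTLSCarveOneResult` is defined in `tiny_tls_carve_one_block_box.h`, which this diff does not show; the following is only a hedged sketch of that contract with simplified, hypothetical types:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shape of the carve-one contract: freelist first,
 * then linear carve, with the winning path reported to the caller. */
typedef enum { CARVE_PATH_NONE, CARVE_PATH_FREELIST, CARVE_PATH_LINEAR } CarvePath;
typedef struct { void* block; CarvePath path; } CarveResult;

typedef struct {
    void*    freelist;   /* freed blocks linked through their first word */
    uint16_t used, carved, capacity;
    uint8_t* slab_base;
    size_t   stride;
} SlabMeta;

static CarveResult carve_one(SlabMeta* m) {
    CarveResult r = { NULL, CARVE_PATH_NONE };
    if (m->freelist) {                       /* reuse a freed block */
        r.block = m->freelist;
        m->freelist = *(void**)m->freelist;
        m->used++;
        r.path = CARVE_PATH_FREELIST;
    } else if (m->carved < m->capacity) {    /* carve the next fresh block */
        r.block = m->slab_base + (size_t)m->carved * m->stride;
        m->carved++;
        m->used++;
        r.path = CARVE_PATH_LINEAR;
    }
    return r;                                /* exhausted: block == NULL */
}
```

Returning the path tag is what lets the caller keep distinct `ROUTE_MARK(11)`/`ROUTE_MARK(12)` telemetry while sharing one allocation helper.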