Fix C7 warm/TLS Release path and unify debug instrumentation

This commit is contained in:
Moe Charm (CI)
2025-12-05 23:41:01 +09:00
parent 96c2988381
commit d17ec46628
29 changed files with 1314 additions and 123 deletions

View File

@ -1,4 +1,4 @@
## HAKMEM 状況メモ (2025-12-05 更新) ## HAKMEM 状況メモ (2025-12-05 更新 / C7 Warm/TLS Bind 反映)
### 現在の状態Tiny / Superslab / Warm Pool ### 現在の状態Tiny / Superslab / Warm Pool
- Tiny Front / Superslab / Shared Pool は Box Theory 準拠で 3 層構造に整理済みHOT/WARM/COLD - Tiny Front / Superslab / Shared Pool は Box Theory 準拠で 3 層構造に整理済みHOT/WARM/COLD
@ -27,10 +27,26 @@
- `core/box/tiny_page_box.h` / `core/box/tiny_page_box.c` を追加し、`HAKMEM_TINY_PAGE_BOX_CLASSES` で有効クラスを制御できる Page Box を実装。 - `core/box/tiny_page_box.h` / `core/box/tiny_page_box.c` を追加し、`HAKMEM_TINY_PAGE_BOX_CLASSES` で有効クラスを制御できる Page Box を実装。
- `tiny_tls_bind_slab()` から `tiny_page_box_on_new_slab()` を呼び出し、TLS が bind した C7 slab を per-thread の page pool に登録。 - `tiny_tls_bind_slab()` から `tiny_page_box_on_new_slab()` を呼び出し、TLS が bind した C7 slab を per-thread の page pool に登録。
- `unified_cache_refill()` の先頭に Page Box 経路を追加し、C7 では「TLS が掴んでいるページ内 freelist/carve」からバッチ供給を試みてから Warm Pool / Shared Pool に落ちるようにしたBox 境界は `Tiny Page Box → Warm Pool → Shared Pool` の順序を維持)。 - `unified_cache_refill()` の先頭に Page Box 経路を追加し、C7 では「TLS が掴んでいるページ内 freelist/carve」からバッチ供給を試みてから Warm Pool / Shared Pool に落ちるようにしたBox 境界は `Tiny Page Box → Warm Pool → Shared Pool` の順序を維持)。
- TLS Bind Box の導入:
- `core/box/ss_tls_bind_box.h``ss_tls_bind_one()` を追加し、「Superslab + slab_idx → TLS」のバインド処理`superslab_init_slab` / `meta->class_idx` 設定 / `tiny_tls_bind_slab`)を 1 箇所に集約。
- `superslab_refill()`Shared Pool 経路)および Warm Pool 実験経路から、この Box を経由して TLS に接続するよう統一。
- C7 Warm/TLS Bind 経路の実装と検証:
- `core/front/tiny_unified_cache.c` に C7 専用の Warm/TLS Bind モード0/1/2を追加し、Debug では `HAKMEM_WARM_TLS_BIND_C7` で切替可能にした。
- mode 0: Legacy Warmレガシー/デバッグ用、C7 では carve 0 が多く非推奨)
- mode 1: Bind-onlyWarm から取得した Superslab を TLS Bind Box 経由でバインドする本番経路)
- mode 2: Bind+TLS carveTLS から直接 carve する実験経路)
- Release ビルドでは常に mode=1 固定。Debug では `HAKMEM_WARM_TLS_BIND_C7=0/1/2` で切替。
- Warm Pool / Unified Cache の詳細計測:
- `warm_pool_dbg_box.h` と Unified Cache の計測フックを拡張し、C7 向けに
- Warm pop 試行/ヒット/実 carve 回数
- TLS carve 試行/成功/失敗
- UC ミスを Warm/TLS/Shared 別に分類
を Debug ビルドで観測可能にした。
- `bench_random_mixed.c``HAKMEM_BENCH_C7_ONLY=1` を追加し、C7 サイズ専用の micro-bench を追加。
### 性能の現状Random Mixed, HEAD ### 性能の現状Random Mixed, HEAD
- 条件: `bench_random_mixed_hakmem 1000000 256 42`1T, ws=256, RELEASE, 161024B - 条件: `bench_random_mixed_hakmem 1000000 256 42`1T, ws=256, RELEASE, 161024B
- HAKMEM: 約 5.0M ops/s - HAKMEM: 約 27.6M ops/sC7 Warm/TLS 修復後)
- system malloc: 約 90100M ops/s - system malloc: 約 90100M ops/s
- mimalloc: 約 120130M ops/s - mimalloc: 約 120130M ops/s
- 条件: `bench_random_mixed_hakmem 1000000 256 42` + - 条件: `bench_random_mixed_hakmem 1000000 256 42` +
@ -38,26 +54,27 @@
- HAKMEM Tiny Front: 約 8090M ops/smimalloc と同オーダー) - HAKMEM Tiny Front: 約 8090M ops/smimalloc と同オーダー)
- 条件: `bench_random_mixed_hakmem 1000000 256 42` + - 条件: `bench_random_mixed_hakmem 1000000 256 42` +
`HAKMEM_BENCH_MIN_SIZE=129 HAKMEM_BENCH_MAX_SIZE=1024`Tiny C5C7 のみ) `HAKMEM_BENCH_MIN_SIZE=129 HAKMEM_BENCH_MAX_SIZE=1024`Tiny C5C7 のみ)
- HAKMEM: 約 4.74.8M ops/s - HAKMEM: 約 28.0M ops/sWarm/TLS ガード適用後)
- 条件: C7 専用 micro-benchDebug, `HAKMEM_BENCH_C7_ONLY=1 HAKMEM_TINY_PROFILE=full HAKMEM_WARM_C7_MAX=8 HAKMEM_WARM_C7_PREFETCH=4` ほか)
- mode 0Legacy Warm: 約 2.0M ops/s、C7 Warm ヒット 0・Shared Pool ロック多数(`slab_carve_from_ss` が 0 を頻発)
- mode 1Bind-only: 約 20M ops/siters=200K, ws=32、Warm hit ≈100%・Shared Pool ロック 5 回まで減少
- mode 2Bind+TLS carve 実験): mode 1 と同等〜わずかに上UC ミスは増えるが `uc_miss_tls` に集中し、avg_refill は短縮)
- 条件: C7 専用 micro-benchRelease, `HAKMEM_BENCH_C7_ONLY=1 HAKMEM_TINY_PROFILE=full HAKMEM_WARM_C7_MAX=8 HAKMEM_WARM_C7_PREFETCH=4`
- HAKMEM: 約 18.8M ops/s空スラブ強制ガード + リセット導入後、Debug と同オーダーまで回復)
- 結論: - 結論:
- Tiny front 自体8128Bは十分速く、mimalloc と同オーダーまで出ている。 - Tiny front 自体8128Bは十分速く、mimalloc と同オーダーまで出ている。
- 1291024B の Tiny C5C7 経路で Unified Cache hit=0 / Shared Pool ロック多発というボトルネックがあり - C5C7 経路は「満杯 C7 slab を Warm に再供給していた」問題を空スラブ限定ガードRelease/Debug 共通リセットで解消し
Random Mixed 全体の性能を支配している C7-only Release も ~18.8M ops/s に回復。Random Mixed Release も 27M クラスまで改善
### 次にやること(優先タスクC7 Page Box の実効性検証とチューニング ### 次にやること(広い条件での安定化確認
1. **C7 Page Box 経路の実効性を計測** 1. `HAKMEM_BENCH_MIN_SIZE=129 HAKMEM_BENCH_MAX_SIZE=1024` や通常の `bench_random_mixed_hakmem 1000000 256 42`
- ENV: `HAKMEM_BENCH_MIN_SIZE=129 HAKMEM_BENCH_MAX_SIZE=1024` `HAKMEM_MEASURE_UNIFIED_CACHE=1` 空スラブ限定ガードが副作用なく動くかを継続確認(現状 Release で 2728M ops/s を確認済み)。
`bench_random_mixed_hakmem 1000000 256 42` を実行し、C7 の: 2. ドキュメント更新:
- Unified Cache refill 回数・平均 cycles - Release だけ C7 Warm が死んでいた根本原因 = 満杯 C7 slab を Shared Pool がリセットせず再供給していた。
- `shared_pool_acquire_slab(C7)` のロック回数 - Acquire の空スラブ強制ガードRelease/Debug 共通リセットで C7-only Release が ~18.8M ops/s まで回復した。
を、Page Box ON/OFF`HAKMEM_TINY_PAGE_BOX_CLASSES=` 未設定 vs `7`)で比較する。 3. 次フェーズ案:
2. **C7 の Unified Cache 容量・バッチサイズのチューニング** - C5/C6 でも同様の Warm/TLS 最適化・空スラブガードを適用するか、
- `HAKMEM_TINY_UNIFIED_C7``unified_cache_refill()``max_batch` 設定を変えつつ、 - Random Mixed 全体のボトルネックShared Pool ロック/Wrapper/mid-size path など)を洗うかを選択。
Page Box ON 時の C7 ヒット率・Shared Pool ロック回数・throughput を観測し、C7 にとって最適な容量/バッチサイズを探る。
3. **Page Box を C5/C6 に拡張するかの判断**
- C7 で十分な効果Shared Pool ロック大幅減 + throughput 向上)が得られた場合、
`HAKMEM_TINY_PAGE_BOX_CLASSES=5,6,7` を試し、C5/C6 も Tiny-Plus 化したときの安定性・性能を確認する。
- 問題がなければ、デフォルトプロファイルを「C5C7 Page Box 有効」に近づけるかを検討する。
### メモ ### メモ
- ページフォルト問題は Prefault Box + ウォームアップで一定水準まで解消済みで、現在の主ボトルネックはユーザー空間の箱Unified Cache / free / Pool側に移っている。 - ページフォルト問題は Prefault Box + ウォームアップで一定水準まで解消済みで、現在の主ボトルネックはユーザー空間の箱Unified Cache / free / Pool側に移っている。

View File

@ -1,5 +1,8 @@
# HAKMEM Allocator Performance Analysis Results # HAKMEM Allocator Performance Analysis Results
**最新メモ (2025-12-05)**: C7 Warm/TLS Bind は本番経路を Bind-only (mode=1) に統一。Debug では `HAKMEM_WARM_TLS_BIND_C7=0/1/2` で切替可能だが、Release は常に mode=1 固定。C7-only ワークロードでは mode=1 が legacy (mode=0) 比で ~410x 速く、mode=2 は TLS carve 実験として残置。
**追記 (2025-12-05, Release 修復)**: Release だけ C7 Warm が死んでいた原因は「満杯 C7 slab が Shared Pool に居残り、空スラブが Warm に渡っていなかった」こと。Acquire で C7 は空スラブ限定、Release でメタをリセットするガードを導入し、C7-only Release で ~18.8M ops/s、Random Mixed Release で ~2728M ops/s まで回復。
**分析実施日**: 2025-11-28 **分析実施日**: 2025-11-28
**分析対象**: HAKMEM allocator (commit 0ce20bb83) **分析対象**: HAKMEM allocator (commit 0ce20bb83)
**ベンチマーク**: bench_random_mixed (1,000,000 ops, working set=256) **ベンチマーク**: bench_random_mixed (1,000,000 ops, working set=256)

View File

@ -13,9 +13,16 @@
#include <stdint.h> #include <stdint.h>
#include <time.h> #include <time.h>
#include <string.h> #include <string.h>
#include <stdatomic.h>
#define C7_META_COUNTER_DEFINE
#include "core/box/c7_meta_used_counter_box.h"
#undef C7_META_COUNTER_DEFINE
#include "core/box/warm_pool_rel_counters_box.h"
#ifdef USE_HAKMEM #ifdef USE_HAKMEM
#include "hakmem.h" #include "hakmem.h"
#include "hakmem_build_flags.h"
#include "core/box/c7_meta_used_counter_box.h"
// Box BenchMeta: Benchmark metadata management (bypass hakmem wrapper) // Box BenchMeta: Benchmark metadata management (bypass hakmem wrapper)
// Phase 15: Separate BenchMeta (slots array) from CoreAlloc (user workload) // Phase 15: Separate BenchMeta (slots array) from CoreAlloc (user workload)
@ -253,6 +260,38 @@ int main(int argc, char** argv){
extern void tiny_warm_pool_print_stats_public(void); extern void tiny_warm_pool_print_stats_public(void);
tiny_warm_pool_print_stats_public(); tiny_warm_pool_print_stats_public();
#if HAKMEM_BUILD_RELEASE
// Minimal Release-side telemetry to verify Warm path usage (C7-only)
extern _Atomic uint64_t g_rel_c7_warm_pop;
extern _Atomic uint64_t g_rel_c7_warm_push;
fprintf(stderr,
"[REL_C7_CARVE] attempts=%llu success=%llu zero=%llu\n",
(unsigned long long)warm_pool_rel_c7_carve_attempts(),
(unsigned long long)warm_pool_rel_c7_carve_successes(),
(unsigned long long)warm_pool_rel_c7_carve_zeroes());
fprintf(stderr,
"[REL_C7_WARM] pop=%llu push=%llu\n",
(unsigned long long)atomic_load_explicit(&g_rel_c7_warm_pop, memory_order_relaxed),
(unsigned long long)atomic_load_explicit(&g_rel_c7_warm_push, memory_order_relaxed));
fprintf(stderr,
"[REL_C7_WARM_PREFILL] calls=%llu slabs=%llu\n",
(unsigned long long)warm_pool_rel_c7_prefill_calls(),
(unsigned long long)warm_pool_rel_c7_prefill_slabs());
fprintf(stderr,
"[REL_C7_META_USED_INC] total=%llu backend=%llu tls=%llu front=%llu\n",
(unsigned long long)c7_meta_used_total(),
(unsigned long long)c7_meta_used_backend(),
(unsigned long long)c7_meta_used_tls(),
(unsigned long long)c7_meta_used_front());
#else
fprintf(stderr,
"[DBG_C7_META_USED_INC] total=%llu backend=%llu tls=%llu front=%llu\n",
(unsigned long long)c7_meta_used_total(),
(unsigned long long)c7_meta_used_backend(),
(unsigned long long)c7_meta_used_tls(),
(unsigned long long)c7_meta_used_front());
#endif
// Phase 21-1: Ring cache - DELETED (A/B test: OFF is faster) // Phase 21-1: Ring cache - DELETED (A/B test: OFF is faster)
// extern void ring_cache_print_stats(void); // extern void ring_cache_print_stats(void);
// ring_cache_print_stats(); // ring_cache_print_stats();

View File

@ -0,0 +1,59 @@
// c7_meta_used_counter_box.h
// Box: C7 meta->used increment counters (Release/Debug共通)
#pragma once
#include <stdatomic.h>
#include <stdint.h>
typedef enum C7MetaUsedSource {
C7_META_USED_SRC_UNKNOWN = 0,
C7_META_USED_SRC_BACKEND = 1,
C7_META_USED_SRC_TLS = 2,
C7_META_USED_SRC_FRONT = 3,
} C7MetaUsedSource;
#ifdef C7_META_COUNTER_DEFINE
#define C7_META_COUNTER_EXTERN
#else
#define C7_META_COUNTER_EXTERN extern
#endif
C7_META_COUNTER_EXTERN _Atomic uint64_t g_c7_meta_used_inc_total;
C7_META_COUNTER_EXTERN _Atomic uint64_t g_c7_meta_used_inc_backend;
C7_META_COUNTER_EXTERN _Atomic uint64_t g_c7_meta_used_inc_tls;
C7_META_COUNTER_EXTERN _Atomic uint64_t g_c7_meta_used_inc_front;
static inline void c7_meta_used_note(int class_idx, C7MetaUsedSource src) {
if (__builtin_expect(class_idx != 7, 1)) {
return;
}
atomic_fetch_add_explicit(&g_c7_meta_used_inc_total, 1, memory_order_relaxed);
switch (src) {
case C7_META_USED_SRC_BACKEND:
atomic_fetch_add_explicit(&g_c7_meta_used_inc_backend, 1, memory_order_relaxed);
break;
case C7_META_USED_SRC_TLS:
atomic_fetch_add_explicit(&g_c7_meta_used_inc_tls, 1, memory_order_relaxed);
break;
case C7_META_USED_SRC_FRONT:
atomic_fetch_add_explicit(&g_c7_meta_used_inc_front, 1, memory_order_relaxed);
break;
default:
break;
}
}
static inline uint64_t c7_meta_used_total(void) {
return atomic_load_explicit(&g_c7_meta_used_inc_total, memory_order_relaxed);
}
static inline uint64_t c7_meta_used_backend(void) {
return atomic_load_explicit(&g_c7_meta_used_inc_backend, memory_order_relaxed);
}
static inline uint64_t c7_meta_used_tls(void) {
return atomic_load_explicit(&g_c7_meta_used_inc_tls, memory_order_relaxed);
}
static inline uint64_t c7_meta_used_front(void) {
return atomic_load_explicit(&g_c7_meta_used_inc_front, memory_order_relaxed);
}
#undef C7_META_COUNTER_EXTERN

View File

@ -15,6 +15,7 @@
#include "tiny_header_box.h" // Header Box: Single Source of Truth for header operations #include "tiny_header_box.h" // Header Box: Single Source of Truth for header operations
#include "../tiny_refill_opt.h" // TinyRefillChain, trc_linear_carve() #include "../tiny_refill_opt.h" // TinyRefillChain, trc_linear_carve()
#include "../tiny_box_geometry.h" // tiny_stride_for_class(), tiny_slab_base_for_geometry() #include "../tiny_box_geometry.h" // tiny_stride_for_class(), tiny_slab_base_for_geometry()
#include "c7_meta_used_counter_box.h"
// External declarations // External declarations
extern __thread TinyTLSSlab g_tls_slabs[TINY_NUM_CLASSES]; extern __thread TinyTLSSlab g_tls_slabs[TINY_NUM_CLASSES];
@ -191,6 +192,7 @@ uint32_t box_carve_and_push_with_freelist(int class_idx, uint32_t want) {
void* p = meta->freelist; void* p = meta->freelist;
meta->freelist = tiny_next_read(class_idx, p); meta->freelist = tiny_next_read(class_idx, p);
meta->used++; meta->used++;
c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
// CRITICAL FIX: Restore header BEFORE pushing to TLS SLL // CRITICAL FIX: Restore header BEFORE pushing to TLS SLL
// Freelist blocks may have stale data at offset 0 // Freelist blocks may have stale data at offset 0

View File

@ -41,7 +41,7 @@ core/box/carve_push_box.o: core/box/carve_push_box.c \
core/box/../tiny_region_id.h core/box/../hakmem_tiny_integrity.h \ core/box/../tiny_region_id.h core/box/../hakmem_tiny_integrity.h \
core/box/../box/slab_freelist_atomic.h core/box/tiny_header_box.h \ core/box/../box/slab_freelist_atomic.h core/box/tiny_header_box.h \
core/box/../tiny_refill_opt.h core/box/../box/tls_sll_box.h \ core/box/../tiny_refill_opt.h core/box/../box/tls_sll_box.h \
core/box/../tiny_box_geometry.h core/box/../tiny_box_geometry.h core/box/c7_meta_used_counter_box.h
core/box/../hakmem_tiny.h: core/box/../hakmem_tiny.h:
core/box/../hakmem_build_flags.h: core/box/../hakmem_build_flags.h:
core/box/../hakmem_trace.h: core/box/../hakmem_trace.h:
@ -116,3 +116,4 @@ core/box/tiny_header_box.h:
core/box/../tiny_refill_opt.h: core/box/../tiny_refill_opt.h:
core/box/../box/tls_sll_box.h: core/box/../box/tls_sll_box.h:
core/box/../tiny_box_geometry.h: core/box/../tiny_box_geometry.h:
core/box/c7_meta_used_counter_box.h:

View File

@ -9,12 +9,15 @@
#include <stdint.h> #include <stdint.h>
#include <string.h> #include <string.h>
#include <stdio.h>
#include <stdatomic.h>
#include "../hakmem_tiny_config.h" #include "../hakmem_tiny_config.h"
#include "../hakmem_tiny_superslab.h" #include "../hakmem_tiny_superslab.h"
#include "../superslab/superslab_inline.h" #include "../superslab/superslab_inline.h"
#include "../tiny_box_geometry.h" #include "../tiny_box_geometry.h"
#include "../box/tiny_next_ptr_box.h" #include "../box/tiny_next_ptr_box.h"
#include "../box/pagefault_telemetry_box.h" #include "../box/pagefault_telemetry_box.h"
#include "c7_meta_used_counter_box.h"
// ============================================================================ // ============================================================================
// Slab Carving API (Inline for Hot Path) // Slab Carving API (Inline for Hot Path)
@ -46,11 +49,31 @@ static inline int slab_carve_from_ss(int class_idx, SuperSlab* ss,
// Find an available slab in this SuperSlab // Find an available slab in this SuperSlab
int cap = ss_slabs_capacity(ss); int cap = ss_slabs_capacity(ss);
#if HAKMEM_BUILD_RELEASE
static _Atomic int rel_c7_meta_logged = 0;
TinySlabMeta* rel_c7_meta = NULL;
int rel_c7_meta_idx = -1;
#else
static __thread int dbg_c7_meta_logged = 0;
TinySlabMeta* dbg_c7_meta = NULL;
int dbg_c7_meta_idx = -1;
#endif
for (int slab_idx = 0; slab_idx < cap; slab_idx++) { for (int slab_idx = 0; slab_idx < cap; slab_idx++) {
TinySlabMeta* meta = &ss->slabs[slab_idx]; TinySlabMeta* meta = &ss->slabs[slab_idx];
// Check if this slab matches our class and has capacity // Check if this slab matches our class and has capacity
if (meta->class_idx != (uint8_t)class_idx) continue; if (meta->class_idx != (uint8_t)class_idx) continue;
#if HAKMEM_BUILD_RELEASE
if (class_idx == 7 && atomic_load_explicit(&rel_c7_meta_logged, memory_order_relaxed) == 0 && !rel_c7_meta) {
rel_c7_meta = meta;
rel_c7_meta_idx = slab_idx;
}
#else
if (class_idx == 7 && dbg_c7_meta_logged == 0 && !dbg_c7_meta) {
dbg_c7_meta = meta;
dbg_c7_meta_idx = slab_idx;
}
#endif
if (meta->used >= meta->capacity && !meta->freelist) continue; if (meta->used >= meta->capacity && !meta->freelist) continue;
// Carve blocks from this slab // Carve blocks from this slab
@ -73,6 +96,7 @@ static inline int slab_carve_from_ss(int class_idx, SuperSlab* ss,
meta->freelist = next_node; meta->freelist = next_node;
meta->used++; meta->used++;
c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
} else if (meta->carved < meta->capacity) { } else if (meta->carved < meta->capacity) {
// Linear carve // Linear carve
@ -84,6 +108,7 @@ static inline int slab_carve_from_ss(int class_idx, SuperSlab* ss,
meta->carved++; meta->carved++;
meta->used++; meta->used++;
c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
} else { } else {
break; // This slab exhausted break; // This slab exhausted
@ -99,6 +124,48 @@ static inline int slab_carve_from_ss(int class_idx, SuperSlab* ss,
// If this slab had no freelist and no carved capacity, continue to next // If this slab had no freelist and no carved capacity, continue to next
} }
#if !HAKMEM_BUILD_RELEASE
static __thread int dbg_c7_slab_carve_zero_logs = 0;
if (class_idx == 7 && dbg_c7_slab_carve_zero_logs < 10) {
fprintf(stderr, "[C7_SLAB_CARVE_ZERO] ss=%p no blocks carved\n", (void*)ss);
dbg_c7_slab_carve_zero_logs++;
}
#endif
#if HAKMEM_BUILD_RELEASE
if (class_idx == 7 &&
atomic_load_explicit(&rel_c7_meta_logged, memory_order_relaxed) == 0 &&
rel_c7_meta) {
size_t bs = tiny_stride_for_class(class_idx);
fprintf(stderr,
"[REL_C7_CARVE_META] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p stride=%zu slabs_cap=%d\n",
(void*)ss,
rel_c7_meta_idx,
(unsigned)rel_c7_meta->class_idx,
(unsigned)rel_c7_meta->used,
(unsigned)rel_c7_meta->capacity,
(unsigned)rel_c7_meta->carved,
rel_c7_meta->freelist,
bs,
cap);
atomic_store_explicit(&rel_c7_meta_logged, 1, memory_order_relaxed);
}
#else
if (class_idx == 7 && dbg_c7_meta_logged == 0 && dbg_c7_meta) {
size_t bs = tiny_stride_for_class(class_idx);
fprintf(stderr,
"[DBG_C7_CARVE_META] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p stride=%zu slabs_cap=%d\n",
(void*)ss,
dbg_c7_meta_idx,
(unsigned)dbg_c7_meta->class_idx,
(unsigned)dbg_c7_meta->used,
(unsigned)dbg_c7_meta->capacity,
(unsigned)dbg_c7_meta->carved,
dbg_c7_meta->freelist,
bs,
cap);
dbg_c7_meta_logged = 1;
}
#endif
return 0; // No slab in this SuperSlab had available capacity return 0; // No slab in this SuperSlab had available capacity
} }

View File

@ -0,0 +1,26 @@
// ss_slab_reset_box.h
// Box: Reset TinySlabMeta for reuse (C7 diagnostics-friendly)
#pragma once
#include "ss_slab_meta_box.h"
#include "../superslab/superslab_inline.h"
#include <stdatomic.h>
static inline void ss_slab_reset_meta_for_tiny(SuperSlab* ss,
int slab_idx,
int class_idx)
{
if (!ss) return;
if (slab_idx < 0 || slab_idx >= ss_slabs_capacity(ss)) return;
TinySlabMeta* meta = &ss->slabs[slab_idx];
meta->used = 0;
meta->carved = 0;
meta->freelist = NULL;
meta->class_idx = (uint8_t)class_idx;
ss->class_map[slab_idx] = (uint8_t)class_idx;
// Reset remote queue state to avoid stale pending frees on reuse.
atomic_store_explicit(&ss->remote_heads[slab_idx], 0, memory_order_relaxed);
atomic_store_explicit(&ss->remote_counts[slab_idx], 0, memory_order_relaxed);
}

View File

@ -13,6 +13,7 @@
#include "../hakmem_tiny_config.h" #include "../hakmem_tiny_config.h"
#include "../box/tiny_page_box.h" // For tiny_page_box_on_new_slab() #include "../box/tiny_page_box.h" // For tiny_page_box_on_new_slab()
#include <stdio.h> #include <stdio.h>
#include <stdatomic.h>
// Forward declaration if not included // Forward declaration if not included
// CRITICAL FIX: type must match core/hakmem_tiny_config.h (const size_t, not uint16_t) // CRITICAL FIX: type must match core/hakmem_tiny_config.h (const size_t, not uint16_t)
@ -64,9 +65,7 @@ static inline int ss_tls_bind_one(int class_idx,
// superslab_init_slab() only sets it if meta->class_idx==255. // superslab_init_slab() only sets it if meta->class_idx==255.
// We must explicitly set it to the requested class to avoid C0/C7 confusion. // We must explicitly set it to the requested class to avoid C0/C7 confusion.
TinySlabMeta* meta = &ss->slabs[slab_idx]; TinySlabMeta* meta = &ss->slabs[slab_idx];
#if !HAKMEM_BUILD_RELEASE
uint8_t old_cls = meta->class_idx; uint8_t old_cls = meta->class_idx;
#endif
meta->class_idx = (uint8_t)class_idx; meta->class_idx = (uint8_t)class_idx;
#if !HAKMEM_BUILD_RELEASE #if !HAKMEM_BUILD_RELEASE
if (class_idx == 7 && old_cls != class_idx) { if (class_idx == 7 && old_cls != class_idx) {
@ -75,6 +74,36 @@ static inline int ss_tls_bind_one(int class_idx,
} }
#endif #endif
#if HAKMEM_BUILD_RELEASE
static _Atomic int rel_c7_bind_logged = 0;
if (class_idx == 7 &&
atomic_load_explicit(&rel_c7_bind_logged, memory_order_relaxed) == 0) {
fprintf(stderr,
"[REL_C7_BIND] ss=%p slab=%d cls=%u cap=%u used=%u carved=%u\n",
(void*)ss,
slab_idx,
(unsigned)meta->class_idx,
(unsigned)meta->capacity,
(unsigned)meta->used,
(unsigned)meta->carved);
atomic_store_explicit(&rel_c7_bind_logged, 1, memory_order_relaxed);
}
#else
static __thread int dbg_c7_bind_logged = 0;
if (class_idx == 7 && dbg_c7_bind_logged == 0) {
fprintf(stderr,
"[DBG_C7_BIND] ss=%p slab=%d old_cls=%u new_cls=%u cap=%u used=%u carved=%u\n",
(void*)ss,
slab_idx,
(unsigned)old_cls,
(unsigned)meta->class_idx,
(unsigned)meta->capacity,
(unsigned)meta->used,
(unsigned)meta->carved);
dbg_c7_bind_logged = 1;
}
#endif
// Bind this slab to TLS for fast subsequent allocations. // Bind this slab to TLS for fast subsequent allocations.
// Inline implementation of tiny_tls_bind_slab() to avoid header dependencies. // Inline implementation of tiny_tls_bind_slab() to avoid header dependencies.
// Original logic: // Original logic:
@ -109,4 +138,4 @@ static inline int ss_tls_bind_one(int class_idx,
return 1; return 1;
} }
#endif // HAK_SS_TLS_BIND_BOX_H #endif // HAK_SS_TLS_BIND_BOX_H

View File

@ -4,6 +4,7 @@
#include <stdlib.h> #include <stdlib.h>
#include <string.h> #include <string.h>
#include <stdio.h>
// Default: conservative profile (all classes TINY_FIRST). // Default: conservative profile (all classes TINY_FIRST).
// This keeps Tiny in the fast path but always allows Pool fallback. // This keeps Tiny in the fast path but always allows Pool fallback.
@ -40,5 +41,16 @@ void tiny_route_init(void)
// - 全クラス TINY_FIRSTTiny を使うが必ず Pool fallbackあり // - 全クラス TINY_FIRSTTiny を使うが必ず Pool fallbackあり
memset(g_tiny_route, ROUTE_TINY_FIRST, sizeof(g_tiny_route)); memset(g_tiny_route, ROUTE_TINY_FIRST, sizeof(g_tiny_route));
} }
}
#if HAKMEM_BUILD_RELEASE
static int rel_logged = 0;
if (!rel_logged) {
const char* mode =
(g_tiny_route[7] == ROUTE_TINY_ONLY) ? "TINY_ONLY" :
(g_tiny_route[7] == ROUTE_TINY_FIRST) ? "TINY_FIRST" :
(g_tiny_route[7] == ROUTE_POOL_ONLY) ? "POOL_ONLY" : "UNKNOWN";
fprintf(stderr, "[REL_C7_ROUTE] profile=%s route=%s\n", profile, mode);
rel_logged = 1;
}
#endif
}

View File

@ -19,6 +19,7 @@
#define TINY_ROUTE_BOX_H #define TINY_ROUTE_BOX_H
#include <stdint.h> #include <stdint.h>
#include <stdio.h>
// Routing policy per Tiny class. // Routing policy per Tiny class.
typedef enum { typedef enum {
@ -43,8 +44,21 @@ void tiny_route_init(void);
// Uses simple array lookup; class_idx is masked to [0,7] defensively. // Uses simple array lookup; class_idx is masked to [0,7] defensively.
static inline TinyRoutePolicy tiny_route_get(int class_idx) static inline TinyRoutePolicy tiny_route_get(int class_idx)
{ {
return (TinyRoutePolicy)g_tiny_route[class_idx & 7]; TinyRoutePolicy p = (TinyRoutePolicy)g_tiny_route[class_idx & 7];
#if HAKMEM_BUILD_RELEASE
if ((class_idx & 7) == 7) {
static int rel_route_logged = 0;
if (!rel_route_logged) {
const char* mode =
(p == ROUTE_TINY_ONLY) ? "TINY_ONLY" :
(p == ROUTE_TINY_FIRST) ? "TINY_FIRST" :
(p == ROUTE_POOL_ONLY) ? "POOL_ONLY" : "UNKNOWN";
fprintf(stderr, "[REL_C7_ROUTE] via tiny_route_get route=%s\n", mode);
rel_route_logged = 1;
}
}
#endif
return p;
} }
#endif // TINY_ROUTE_BOX_H #endif // TINY_ROUTE_BOX_H

View File

@ -0,0 +1,102 @@
// tiny_tls_carve_one_block_box.h
// Box: Shared TLS carve helper (linear or freelist) for Tiny classes.
#pragma once
#include "../tiny_tls.h"
#include "../tiny_box_geometry.h"
#include "../tiny_debug_api.h" // tiny_refill_failfast_level(), tiny_failfast_abort_ptr()
#include "c7_meta_used_counter_box.h" // C7 meta->used telemetry (Release/Debug共通)
#include "tiny_next_ptr_box.h"
#include "../superslab/superslab_inline.h"
#include <stdatomic.h>
#include <signal.h>
#if !HAKMEM_BUILD_RELEASE
extern int g_tiny_safe_free;
extern int g_tiny_safe_free_strict;
#endif
enum {
TINY_TLS_CARVE_PATH_NONE = 0,
TINY_TLS_CARVE_PATH_LINEAR = 1,
TINY_TLS_CARVE_PATH_FREELIST = 2,
};
typedef struct TinyTLSCarveOneResult {
void* block;
int path;
} TinyTLSCarveOneResult;
// Carve one block from the current TLS slab.
// Returns .block == NULL on failure. path describes which sub-path was taken.
static inline TinyTLSCarveOneResult
tiny_tls_carve_one_block(TinyTLSSlab* tls, int class_idx)
{
TinyTLSCarveOneResult res = {.block = NULL, .path = TINY_TLS_CARVE_PATH_NONE};
if (!tls) return res;
TinySlabMeta* meta = tls->meta;
if (!meta || !tls->ss || tls->slab_base == NULL) return res;
if (meta->class_idx != (uint8_t)class_idx) return res;
if (tls->slab_idx < 0 || tls->slab_idx >= ss_slabs_capacity(tls->ss)) return res;
// Freelist pop
if (meta->freelist) {
#if !HAKMEM_BUILD_RELEASE
if (__builtin_expect(g_tiny_safe_free, 0)) {
size_t blk = tiny_stride_for_class(meta->class_idx);
uint8_t* base = tiny_slab_base_for_geometry(tls->ss, tls->slab_idx);
uintptr_t delta = (uintptr_t)meta->freelist - (uintptr_t)base;
int align_ok = ((delta % blk) == 0);
int range_ok = (delta / blk) < meta->capacity;
if (!align_ok || !range_ok) {
if (g_tiny_safe_free_strict) { raise(SIGUSR2); return res; }
return res;
}
}
#endif
void* block = meta->freelist;
meta->freelist = tiny_next_read(class_idx, block);
meta->used++;
c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_TLS);
ss_active_add(tls->ss, 1);
res.block = block;
res.path = TINY_TLS_CARVE_PATH_FREELIST;
return res;
}
// Linear carve
if (meta->used < meta->capacity) {
size_t block_size = tiny_stride_for_class(meta->class_idx);
void* block = tiny_block_at_index(tls->slab_base, meta->used, block_size);
#if !HAKMEM_BUILD_RELEASE
if (__builtin_expect(tiny_refill_failfast_level() >= 2, 0)) {
uintptr_t base_ss = (uintptr_t)tls->ss;
size_t ss_size = (size_t)1ULL << tls->ss->lg_size;
uintptr_t p = (uintptr_t)block;
int in_range = (p >= base_ss) && (p < base_ss + ss_size);
int aligned = ((p - (uintptr_t)tls->slab_base) % block_size) == 0;
int idx_ok = (tls->slab_idx >= 0) &&
(tls->slab_idx < ss_slabs_capacity(tls->ss));
if (!in_range || !aligned || !idx_ok || meta->used + 1 > meta->capacity) {
tiny_failfast_abort_ptr("tls_carve_align",
tls->ss,
tls->slab_idx,
block,
"tiny_tls_carve_one_block");
}
}
#endif
meta->used++;
c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_TLS);
ss_active_add(tls->ss, 1);
res.block = block;
res.path = TINY_TLS_CARVE_PATH_LINEAR;
return res;
}
return res;
}

View File

@ -0,0 +1,121 @@
// warm_pool_dbg_box.h
// Box: Debug-only counters for C7 Warm Pool instrumentation.
#pragma once
#include <stdatomic.h>
#include <stdint.h>
#if !HAKMEM_BUILD_RELEASE
#ifdef WARM_POOL_DBG_DEFINE
_Atomic uint64_t g_dbg_c7_warm_pop_attempts = 0;
_Atomic uint64_t g_dbg_c7_warm_pop_hits = 0;
_Atomic uint64_t g_dbg_c7_warm_pop_carve = 0;
_Atomic uint64_t g_dbg_c7_tls_carve_attempts = 0;
_Atomic uint64_t g_dbg_c7_tls_carve_success = 0;
_Atomic uint64_t g_dbg_c7_tls_carve_fail = 0;
_Atomic uint64_t g_dbg_c7_uc_miss_warm_refill = 0;
_Atomic uint64_t g_dbg_c7_uc_miss_tls_refill = 0;
_Atomic uint64_t g_dbg_c7_uc_miss_shared_refill = 0;
#else
extern _Atomic uint64_t g_dbg_c7_warm_pop_attempts;
extern _Atomic uint64_t g_dbg_c7_warm_pop_hits;
extern _Atomic uint64_t g_dbg_c7_warm_pop_carve;
extern _Atomic uint64_t g_dbg_c7_tls_carve_attempts;
extern _Atomic uint64_t g_dbg_c7_tls_carve_success;
extern _Atomic uint64_t g_dbg_c7_tls_carve_fail;
extern _Atomic uint64_t g_dbg_c7_uc_miss_warm_refill;
extern _Atomic uint64_t g_dbg_c7_uc_miss_tls_refill;
extern _Atomic uint64_t g_dbg_c7_uc_miss_shared_refill;
#endif
static inline void warm_pool_dbg_c7_attempt(void) {
atomic_fetch_add_explicit(&g_dbg_c7_warm_pop_attempts, 1, memory_order_relaxed);
}
static inline void warm_pool_dbg_c7_hit(void) {
atomic_fetch_add_explicit(&g_dbg_c7_warm_pop_hits, 1, memory_order_relaxed);
}
static inline void warm_pool_dbg_c7_carve(void) {
atomic_fetch_add_explicit(&g_dbg_c7_warm_pop_carve, 1, memory_order_relaxed);
}
static inline void warm_pool_dbg_c7_tls_attempt(void) {
atomic_fetch_add_explicit(&g_dbg_c7_tls_carve_attempts, 1, memory_order_relaxed);
}
static inline void warm_pool_dbg_c7_tls_success(void) {
atomic_fetch_add_explicit(&g_dbg_c7_tls_carve_success, 1, memory_order_relaxed);
}
static inline void warm_pool_dbg_c7_tls_fail(void) {
atomic_fetch_add_explicit(&g_dbg_c7_tls_carve_fail, 1, memory_order_relaxed);
}
static inline void warm_pool_dbg_c7_uc_miss_warm(void) {
atomic_fetch_add_explicit(&g_dbg_c7_uc_miss_warm_refill, 1, memory_order_relaxed);
}
static inline void warm_pool_dbg_c7_uc_miss_tls(void) {
atomic_fetch_add_explicit(&g_dbg_c7_uc_miss_tls_refill, 1, memory_order_relaxed);
}
static inline void warm_pool_dbg_c7_uc_miss_shared(void) {
atomic_fetch_add_explicit(&g_dbg_c7_uc_miss_shared_refill, 1, memory_order_relaxed);
}
static inline uint64_t warm_pool_dbg_c7_attempts(void) {
return atomic_load_explicit(&g_dbg_c7_warm_pop_attempts, memory_order_relaxed);
}
static inline uint64_t warm_pool_dbg_c7_hits(void) {
return atomic_load_explicit(&g_dbg_c7_warm_pop_hits, memory_order_relaxed);
}
static inline uint64_t warm_pool_dbg_c7_carves(void) {
return atomic_load_explicit(&g_dbg_c7_warm_pop_carve, memory_order_relaxed);
}
static inline uint64_t warm_pool_dbg_c7_tls_attempts(void) {
return atomic_load_explicit(&g_dbg_c7_tls_carve_attempts, memory_order_relaxed);
}
static inline uint64_t warm_pool_dbg_c7_tls_successes(void) {
return atomic_load_explicit(&g_dbg_c7_tls_carve_success, memory_order_relaxed);
}
static inline uint64_t warm_pool_dbg_c7_tls_failures(void) {
return atomic_load_explicit(&g_dbg_c7_tls_carve_fail, memory_order_relaxed);
}
static inline uint64_t warm_pool_dbg_c7_uc_miss_warm_refills(void) {
return atomic_load_explicit(&g_dbg_c7_uc_miss_warm_refill, memory_order_relaxed);
}
static inline uint64_t warm_pool_dbg_c7_uc_miss_tls_refills(void) {
return atomic_load_explicit(&g_dbg_c7_uc_miss_tls_refill, memory_order_relaxed);
}
static inline uint64_t warm_pool_dbg_c7_uc_miss_shared_refills(void) {
return atomic_load_explicit(&g_dbg_c7_uc_miss_shared_refill, memory_order_relaxed);
}
#else
static inline void warm_pool_dbg_c7_attempt(void) { }
static inline void warm_pool_dbg_c7_hit(void) { }
static inline void warm_pool_dbg_c7_carve(void) { }
static inline void warm_pool_dbg_c7_tls_attempt(void) { }
static inline void warm_pool_dbg_c7_tls_success(void) { }
static inline void warm_pool_dbg_c7_tls_fail(void) { }
static inline void warm_pool_dbg_c7_uc_miss_warm(void) { }
static inline void warm_pool_dbg_c7_uc_miss_tls(void) { }
static inline void warm_pool_dbg_c7_uc_miss_shared(void) { }
static inline uint64_t warm_pool_dbg_c7_attempts(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_hits(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_carves(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_tls_attempts(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_tls_successes(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_tls_failures(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_uc_miss_warm_refills(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_uc_miss_tls_refills(void) { return 0; }
static inline uint64_t warm_pool_dbg_c7_uc_miss_shared_refills(void) { return 0; }
#endif

View File

@ -7,11 +7,51 @@
#define HAK_WARM_POOL_PREFILL_BOX_H #define HAK_WARM_POOL_PREFILL_BOX_H
#include <stdint.h> #include <stdint.h>
#include <stdatomic.h>
#include <stdio.h>
#include "../hakmem_tiny_config.h" #include "../hakmem_tiny_config.h"
#include "../hakmem_tiny_superslab.h" #include "../hakmem_tiny_superslab.h"
#include "../tiny_tls.h" #include "../tiny_tls.h"
#include "../front/tiny_warm_pool.h" #include "../front/tiny_warm_pool.h"
#include "../box/warm_pool_stats_box.h" #include "../box/warm_pool_stats_box.h"
#include "../box/warm_pool_rel_counters_box.h"
static inline void warm_prefill_log_c7_meta(const char* tag, TinyTLSSlab* tls) {
if (!tls || !tls->ss) return;
#if HAKMEM_BUILD_RELEASE
static _Atomic uint32_t rel_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&rel_logs, 1, memory_order_relaxed);
if (n < 4) {
TinySlabMeta* meta = &tls->ss->slabs[tls->slab_idx];
fprintf(stderr,
"[REL_C7_%s] ss=%p slab=%u cls=%u used=%u cap=%u carved=%u freelist=%p\n",
tag,
(void*)tls->ss,
(unsigned)tls->slab_idx,
(unsigned)meta->class_idx,
(unsigned)meta->used,
(unsigned)meta->capacity,
(unsigned)meta->carved,
meta->freelist);
}
#else
static _Atomic uint32_t dbg_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&dbg_logs, 1, memory_order_relaxed);
if (n < 4) {
TinySlabMeta* meta = &tls->ss->slabs[tls->slab_idx];
fprintf(stderr,
"[DBG_C7_%s] ss=%p slab=%u cls=%u used=%u cap=%u carved=%u freelist=%p\n",
tag,
(void*)tls->ss,
(unsigned)tls->slab_idx,
(unsigned)meta->class_idx,
(unsigned)meta->used,
(unsigned)meta->capacity,
(unsigned)meta->carved,
meta->freelist);
}
#endif
}
// Forward declarations // Forward declarations
extern __thread TinyTLSSlab g_tls_slabs[TINY_NUM_CLASSES]; extern __thread TinyTLSSlab g_tls_slabs[TINY_NUM_CLASSES];
@ -45,9 +85,17 @@ extern SuperSlab* superslab_refill(int class_idx);
// Performance: Only triggered when pool is empty, cold path cost // Performance: Only triggered when pool is empty, cold path cost
// //
static inline int warm_pool_do_prefill(int class_idx, TinyTLSSlab* tls) { static inline int warm_pool_do_prefill(int class_idx, TinyTLSSlab* tls) {
#if HAKMEM_BUILD_RELEASE
if (class_idx == 7) {
warm_pool_rel_c7_prefill_call();
}
#endif
int budget = (tiny_warm_pool_count(class_idx) == 0) ? WARM_POOL_PREFILL_BUDGET : 1; int budget = (tiny_warm_pool_count(class_idx) == 0) ? WARM_POOL_PREFILL_BUDGET : 1;
while (budget > 0) { while (budget > 0) {
if (class_idx == 7) {
warm_prefill_log_c7_meta("PREFILL_META", tls);
}
if (!tls->ss) { if (!tls->ss) {
// Need to load a new SuperSlab // Need to load a new SuperSlab
if (!superslab_refill(class_idx)) { if (!superslab_refill(class_idx)) {
@ -61,16 +109,75 @@ static inline int warm_pool_do_prefill(int class_idx, TinyTLSSlab* tls) {
break; break;
} }
// C7 safety: prefer only pristine slabs (used=0 carved=0 freelist=NULL)
if (class_idx == 7) {
TinySlabMeta* meta = &tls->ss->slabs[tls->slab_idx];
if (meta->class_idx == 7 &&
(meta->used > 0 || meta->carved > 0 || meta->freelist != NULL)) {
#if HAKMEM_BUILD_RELEASE
static _Atomic int rel_c7_skip_logged = 0;
if (atomic_load_explicit(&rel_c7_skip_logged, memory_order_relaxed) == 0) {
fprintf(stderr,
"[REL_C7_PREFILL_SKIP_NONEMPTY] ss=%p slab=%u used=%u cap=%u carved=%u freelist=%p\n",
(void*)tls->ss,
(unsigned)tls->slab_idx,
(unsigned)meta->used,
(unsigned)meta->capacity,
(unsigned)meta->carved,
meta->freelist);
atomic_store_explicit(&rel_c7_skip_logged, 1, memory_order_relaxed);
}
#else
static __thread int dbg_c7_skip_logged = 0;
if (dbg_c7_skip_logged < 4) {
fprintf(stderr,
"[DBG_C7_PREFILL_SKIP_NONEMPTY] ss=%p slab=%u used=%u cap=%u carved=%u freelist=%p\n",
(void*)tls->ss,
(unsigned)tls->slab_idx,
(unsigned)meta->used,
(unsigned)meta->capacity,
(unsigned)meta->carved,
meta->freelist);
dbg_c7_skip_logged++;
}
#endif
tls->ss = NULL; // Drop exhausted slab and try another
budget--;
continue;
}
}
if (budget > 1) { if (budget > 1) {
// Prefill mode: push to pool and load another // Prefill mode: push to pool and load another
tiny_warm_pool_push(class_idx, tls->ss); tiny_warm_pool_push(class_idx, tls->ss);
warm_pool_record_prefilled(class_idx); warm_pool_record_prefilled(class_idx);
tls->ss = NULL; // Force next iteration to refill #if HAKMEM_BUILD_RELEASE
budget--; if (class_idx == 7) {
} else { warm_pool_rel_c7_prefill_slab();
// Final slab: keep in TLS for immediate carving
budget = 0;
} }
#else
if (class_idx == 7) {
static __thread int dbg_c7_prefill_logs = 0;
if (dbg_c7_prefill_logs < 8) {
TinySlabMeta* meta = &tls->ss->slabs[tls->slab_idx];
fprintf(stderr,
"[DBG_C7_PREFILL] ss=%p slab=%u used=%u cap=%u carved=%u freelist=%p\n",
(void*)tls->ss,
(unsigned)tls->slab_idx,
(unsigned)meta->used,
(unsigned)meta->capacity,
(unsigned)meta->carved,
meta->freelist);
dbg_c7_prefill_logs++;
}
}
#endif
tls->ss = NULL; // Force next iteration to refill
budget--;
} else {
// Final slab: keep in TLS for immediate carving
budget = 0;
}
} }
return 0; // Success return 0; // Success

View File

@ -0,0 +1,64 @@
// warm_pool_rel_counters_box.h
// Box: Lightweight Release-side counters for C7 Warm/TLS instrumentation.
#pragma once
#include <stdatomic.h>
#include <stdint.h>
#if HAKMEM_BUILD_RELEASE
#ifdef WARM_POOL_REL_DEFINE
_Atomic uint64_t g_rel_c7_carve_attempts = 0;
_Atomic uint64_t g_rel_c7_carve_success = 0;
_Atomic uint64_t g_rel_c7_carve_zero = 0;
_Atomic uint64_t g_rel_c7_warm_prefill_calls = 0;
_Atomic uint64_t g_rel_c7_warm_prefill_slabs = 0;
#else
extern _Atomic uint64_t g_rel_c7_carve_attempts;
extern _Atomic uint64_t g_rel_c7_carve_success;
extern _Atomic uint64_t g_rel_c7_carve_zero;
extern _Atomic uint64_t g_rel_c7_warm_prefill_calls;
extern _Atomic uint64_t g_rel_c7_warm_prefill_slabs;
#endif
static inline void warm_pool_rel_c7_carve_attempt(void) {
atomic_fetch_add_explicit(&g_rel_c7_carve_attempts, 1, memory_order_relaxed);
}
static inline void warm_pool_rel_c7_carve_success(void) {
atomic_fetch_add_explicit(&g_rel_c7_carve_success, 1, memory_order_relaxed);
}
static inline void warm_pool_rel_c7_carve_zero(void) {
atomic_fetch_add_explicit(&g_rel_c7_carve_zero, 1, memory_order_relaxed);
}
static inline void warm_pool_rel_c7_prefill_call(void) {
atomic_fetch_add_explicit(&g_rel_c7_warm_prefill_calls, 1, memory_order_relaxed);
}
static inline void warm_pool_rel_c7_prefill_slab(void) {
atomic_fetch_add_explicit(&g_rel_c7_warm_prefill_slabs, 1, memory_order_relaxed);
}
static inline uint64_t warm_pool_rel_c7_carve_attempts(void) {
return atomic_load_explicit(&g_rel_c7_carve_attempts, memory_order_relaxed);
}
static inline uint64_t warm_pool_rel_c7_carve_successes(void) {
return atomic_load_explicit(&g_rel_c7_carve_success, memory_order_relaxed);
}
static inline uint64_t warm_pool_rel_c7_carve_zeroes(void) {
return atomic_load_explicit(&g_rel_c7_carve_zero, memory_order_relaxed);
}
static inline uint64_t warm_pool_rel_c7_prefill_calls(void) {
return atomic_load_explicit(&g_rel_c7_warm_prefill_calls, memory_order_relaxed);
}
static inline uint64_t warm_pool_rel_c7_prefill_slabs(void) {
return atomic_load_explicit(&g_rel_c7_warm_prefill_slabs, memory_order_relaxed);
}
#else
static inline void warm_pool_rel_c7_carve_attempt(void) { }
static inline void warm_pool_rel_c7_carve_success(void) { }
static inline void warm_pool_rel_c7_carve_zero(void) { }
static inline void warm_pool_rel_c7_prefill_call(void) { }
static inline void warm_pool_rel_c7_prefill_slab(void) { }
static inline uint64_t warm_pool_rel_c7_carve_attempts(void) { return 0; }
static inline uint64_t warm_pool_rel_c7_carve_successes(void) { return 0; }
static inline uint64_t warm_pool_rel_c7_carve_zeroes(void) { return 0; }
static inline uint64_t warm_pool_rel_c7_prefill_calls(void) { return 0; }
static inline uint64_t warm_pool_rel_c7_prefill_slabs(void) { return 0; }
#endif

View File

@ -0,0 +1,57 @@
// warm_tls_bind_logger_box.h
// Box: Warm TLS Bind experiment logging with simple throttling.
#pragma once
#include "../hakmem_tiny_superslab.h"
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#if !HAKMEM_BUILD_RELEASE
static _Atomic int g_warm_tls_bind_log_limit = -1;
static _Atomic int g_warm_tls_bind_log_count = 0;
static inline int warm_tls_bind_log_limit(void) {
int limit = atomic_load_explicit(&g_warm_tls_bind_log_limit, memory_order_relaxed);
if (__builtin_expect(limit == -1, 0)) {
const char* e = getenv("HAKMEM_WARM_TLS_BIND_LOG_MAX");
int parsed = (e && *e) ? atoi(e) : 1;
atomic_store_explicit(&g_warm_tls_bind_log_limit, parsed, memory_order_relaxed);
limit = parsed;
}
return limit;
}
static inline int warm_tls_bind_log_acquire(void) {
int limit = warm_tls_bind_log_limit();
int prev = atomic_fetch_add_explicit(&g_warm_tls_bind_log_count, 1, memory_order_relaxed);
return prev < limit;
}
static inline void warm_tls_bind_log_success(SuperSlab* ss, int slab_idx) {
if (warm_tls_bind_log_acquire()) {
fprintf(stderr, "[WARM_TLS_BIND] C7 bind success: ss=%p slab=%d\n",
(void*)ss, slab_idx);
}
}
static inline void warm_tls_bind_log_tls_carve(SuperSlab* ss, int slab_idx, void* block) {
if (warm_tls_bind_log_acquire()) {
fprintf(stderr,
"[WARM_TLS_BIND] C7 TLS carve success: ss=%p slab=%d block=%p\n",
(void*)ss, slab_idx, block);
}
}
static inline void warm_tls_bind_log_tls_fail(SuperSlab* ss, int slab_idx) {
if (warm_tls_bind_log_acquire()) {
fprintf(stderr,
"[WARM_TLS_BIND] C7 TLS carve failed, fallback (ss=%p slab=%d)\n",
(void*)ss, slab_idx);
}
}
#else
static inline void warm_tls_bind_log_success(SuperSlab* ss, int slab_idx) { (void)ss; (void)slab_idx; }
static inline void warm_tls_bind_log_tls_carve(SuperSlab* ss, int slab_idx, void* block) { (void)ss; (void)slab_idx; (void)block; }
static inline void warm_tls_bind_log_tls_fail(SuperSlab* ss, int slab_idx) { (void)ss; (void)slab_idx; }
#endif

View File

@ -12,10 +12,19 @@
#include "../box/ss_slab_meta_box.h" // For ss_active_add() and slab metadata operations #include "../box/ss_slab_meta_box.h" // For ss_active_add() and slab metadata operations
#include "../box/warm_pool_stats_box.h" // Box: Warm Pool Statistics Recording (inline) #include "../box/warm_pool_stats_box.h" // Box: Warm Pool Statistics Recording (inline)
#include "../box/slab_carve_box.h" // Box: Slab Carving (inline O(slabs) scan) #include "../box/slab_carve_box.h" // Box: Slab Carving (inline O(slabs) scan)
#define WARM_POOL_REL_DEFINE
#include "../box/warm_pool_rel_counters_box.h" // Box: Release-side C7 counters
#undef WARM_POOL_REL_DEFINE
#include "../box/c7_meta_used_counter_box.h" // Box: C7 meta->used increment counters
#include "../box/warm_pool_prefill_box.h" // Box: Warm Pool Prefill (secondary optimization) #include "../box/warm_pool_prefill_box.h" // Box: Warm Pool Prefill (secondary optimization)
#include "../hakmem_env_cache.h" // Priority-2: ENV cache (eliminate syscalls) #include "../hakmem_env_cache.h" // Priority-2: ENV cache (eliminate syscalls)
#include "../box/tiny_page_box.h" // Tiny-Plus Page Box (C5C7 initial hook) #include "../box/tiny_page_box.h" // Tiny-Plus Page Box (C5C7 initial hook)
#include "../box/ss_tls_bind_box.h" // Box: TLS Bind (SuperSlab -> TLS binding) #include "../box/ss_tls_bind_box.h" // Box: TLS Bind (SuperSlab -> TLS binding)
#include "../box/tiny_tls_carve_one_block_box.h" // Box: TLS carve helper (shared)
#include "../box/warm_tls_bind_logger_box.h" // Box: Warm TLS Bind logging (throttled)
#define WARM_POOL_DBG_DEFINE
#include "../box/warm_pool_dbg_box.h" // Box: Warm Pool C7 debug counters
#undef WARM_POOL_DBG_DEFINE
#include <stdlib.h> #include <stdlib.h>
#include <string.h> #include <string.h>
#include <stdatomic.h> #include <stdatomic.h>
@ -84,6 +93,12 @@ __thread uint64_t g_unified_cache_push[TINY_NUM_CLASSES] = {0};
__thread uint64_t g_unified_cache_full[TINY_NUM_CLASSES] = {0}; __thread uint64_t g_unified_cache_full[TINY_NUM_CLASSES] = {0};
#endif #endif
// Release-side lightweight telemetry (C7 Warm path only)
#if HAKMEM_BUILD_RELEASE
_Atomic uint64_t g_rel_c7_warm_pop = 0;
_Atomic uint64_t g_rel_c7_warm_push = 0;
#endif
// Warm Pool metrics (definition - declared in tiny_warm_pool.h as extern) // Warm Pool metrics (definition - declared in tiny_warm_pool.h as extern)
// Note: These are kept outside !HAKMEM_BUILD_RELEASE for profiling in release builds // Note: These are kept outside !HAKMEM_BUILD_RELEASE for profiling in release builds
__thread TinyWarmPoolStats g_warm_pool_stats[TINY_NUM_CLASSES] = {0}; __thread TinyWarmPoolStats g_warm_pool_stats[TINY_NUM_CLASSES] = {0};
@ -98,46 +113,36 @@ _Atomic uint64_t g_dbg_warm_pop_attempts = 0;
_Atomic uint64_t g_dbg_warm_pop_hits = 0; _Atomic uint64_t g_dbg_warm_pop_hits = 0;
_Atomic uint64_t g_dbg_warm_pop_empty = 0; _Atomic uint64_t g_dbg_warm_pop_empty = 0;
_Atomic uint64_t g_dbg_warm_pop_carve_zero = 0; _Atomic uint64_t g_dbg_warm_pop_carve_zero = 0;
#endif
// Debug-only: cached ENV for Warm TLS Bind (C7) // Warm TLS Bind (C7) mode selector
static int g_warm_tls_bind_mode_c7 = -1; // mode 0: Legacy warm pathデバッグ専用・C7では非推奨
// mode 1: Bind-only 本番経路C7 標準)
// mode 2: Bind + TLS carve 実験経路Debug 専用)
// Release ビルドでは常に mode=1 に固定し、ENV は無視する。
static inline int warm_tls_bind_mode_c7(void) { static inline int warm_tls_bind_mode_c7(void) {
#if HAKMEM_BUILD_RELEASE
static int g_warm_tls_bind_mode_c7 = -1;
if (__builtin_expect(g_warm_tls_bind_mode_c7 == -1, 0)) { if (__builtin_expect(g_warm_tls_bind_mode_c7 == -1, 0)) {
const char* e = getenv("HAKMEM_WARM_TLS_BIND_C7"); const char* e = getenv("HAKMEM_WARM_TLS_BIND_C7");
// 0/empty: disabled, 1: bind only, 2: bind + TLS carve one block int mode = (e && *e) ? atoi(e) : 1; // default = Bind-only
g_warm_tls_bind_mode_c7 = (e && *e) ? atoi(e) : 0; if (mode < 0) mode = 0;
if (mode > 2) mode = 2;
g_warm_tls_bind_mode_c7 = mode;
} }
return g_warm_tls_bind_mode_c7; return g_warm_tls_bind_mode_c7;
} #else
static int g_warm_tls_bind_mode_c7 = -1;
static inline void* warm_tls_carve_one_block(int class_idx) { if (__builtin_expect(g_warm_tls_bind_mode_c7 == -1, 0)) {
TinyTLSSlab* tls = &g_tls_slabs[class_idx]; const char* e = getenv("HAKMEM_WARM_TLS_BIND_C7");
TinySlabMeta* meta = tls->meta; int mode = (e && *e) ? atoi(e) : 1; // default = Bind-only
if (mode < 0) mode = 0;
if (!meta || !tls->ss || tls->slab_base == NULL) return NULL; if (mode > 2) mode = 2;
if (meta->class_idx != (uint8_t)class_idx) return NULL; g_warm_tls_bind_mode_c7 = mode;
if (tls->slab_idx < 0 || tls->slab_idx >= ss_slabs_capacity(tls->ss)) return NULL;
if (meta->freelist) {
void* block = meta->freelist;
meta->freelist = tiny_next_read(class_idx, block);
meta->used++;
ss_active_add(tls->ss, 1);
return block;
} }
return g_warm_tls_bind_mode_c7;
if (meta->used < meta->capacity) {
size_t block_size = tiny_stride_for_class(meta->class_idx);
void* block = tiny_block_at_index(tls->slab_base, meta->used, block_size);
meta->used++;
ss_active_add(tls->ss, 1);
return block;
}
return NULL;
}
#endif #endif
}
// Forward declaration for Warm Pool stats printer (defined later in this file) // Forward declaration for Warm Pool stats printer (defined later in this file)
static inline void tiny_warm_pool_print_stats(void); static inline void tiny_warm_pool_print_stats(void);
@ -157,6 +162,15 @@ int unified_cache_enabled(void) {
fprintf(stderr, "[Unified-INIT] unified_cache_enabled() = %d\n", g_enable); fprintf(stderr, "[Unified-INIT] unified_cache_enabled() = %d\n", g_enable);
fflush(stderr); fflush(stderr);
} }
#else
if (g_enable) {
static int printed = 0;
if (!printed) {
fprintf(stderr, "[Rel-Unified] unified_cache_enabled() = %d\n", g_enable);
fflush(stderr);
printed = 1;
}
}
#endif #endif
} }
return g_enable; return g_enable;
@ -311,6 +325,32 @@ static inline void tiny_warm_pool_print_stats(void) {
(unsigned long long)atomic_load_explicit(&g_dbg_warm_pop_hits, memory_order_relaxed), (unsigned long long)atomic_load_explicit(&g_dbg_warm_pop_hits, memory_order_relaxed),
(unsigned long long)atomic_load_explicit(&g_dbg_warm_pop_empty, memory_order_relaxed), (unsigned long long)atomic_load_explicit(&g_dbg_warm_pop_empty, memory_order_relaxed),
(unsigned long long)atomic_load_explicit(&g_dbg_warm_pop_carve_zero, memory_order_relaxed)); (unsigned long long)atomic_load_explicit(&g_dbg_warm_pop_carve_zero, memory_order_relaxed));
uint64_t c7_attempts = warm_pool_dbg_c7_attempts();
uint64_t c7_hits = warm_pool_dbg_c7_hits();
uint64_t c7_carve = warm_pool_dbg_c7_carves();
uint64_t c7_tls_attempts = warm_pool_dbg_c7_tls_attempts();
uint64_t c7_tls_success = warm_pool_dbg_c7_tls_successes();
uint64_t c7_tls_fail = warm_pool_dbg_c7_tls_failures();
uint64_t c7_uc_warm = warm_pool_dbg_c7_uc_miss_warm_refills();
uint64_t c7_uc_tls = warm_pool_dbg_c7_uc_miss_tls_refills();
uint64_t c7_uc_shared = warm_pool_dbg_c7_uc_miss_shared_refills();
if (c7_attempts || c7_hits || c7_carve ||
c7_tls_attempts || c7_tls_success || c7_tls_fail ||
c7_uc_warm || c7_uc_tls || c7_uc_shared) {
fprintf(stderr,
" [DBG_C7] warm_pop_attempts=%llu warm_pop_hits=%llu warm_pop_carve=%llu "
"tls_carve_attempts=%llu tls_carve_success=%llu tls_carve_fail=%llu "
"uc_miss_warm=%llu uc_miss_tls=%llu uc_miss_shared=%llu\n",
(unsigned long long)c7_attempts,
(unsigned long long)c7_hits,
(unsigned long long)c7_carve,
(unsigned long long)c7_tls_attempts,
(unsigned long long)c7_tls_success,
(unsigned long long)c7_tls_fail,
(unsigned long long)c7_uc_warm,
(unsigned long long)c7_uc_tls,
(unsigned long long)c7_uc_shared);
}
#endif #endif
fflush(stderr); fflush(stderr);
} }
@ -515,6 +555,7 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
// - これにより、room <= max_batch <= 512 が常に成り立ち、out[] オーバーランを防止する。 // - これにより、room <= max_batch <= 512 が常に成り立ち、out[] オーバーランを防止する。
void* out[512]; void* out[512];
int produced = 0; int produced = 0;
int tls_carved = 0; // Debug bookkeeping: track TLS carve experiment hits
// ========== PAGE BOX HOT PATHTiny-Plus 層): Try page box FIRST ========== // ========== PAGE BOX HOT PATHTiny-Plus 層): Try page box FIRST ==========
// 将来的に C7 専用の page-level freelist 管理をここに統合する。 // 将来的に C7 専用の page-level freelist 管理をここに統合する。
@ -554,10 +595,21 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
// This is the critical optimization - avoid superslab_refill() registry scan // This is the critical optimization - avoid superslab_refill() registry scan
#if !HAKMEM_BUILD_RELEASE #if !HAKMEM_BUILD_RELEASE
atomic_fetch_add_explicit(&g_dbg_warm_pop_attempts, 1, memory_order_relaxed); atomic_fetch_add_explicit(&g_dbg_warm_pop_attempts, 1, memory_order_relaxed);
if (class_idx == 7) {
warm_pool_dbg_c7_attempt();
}
#endif
#if HAKMEM_BUILD_RELEASE
if (class_idx == 7) {
atomic_fetch_add_explicit(&g_rel_c7_warm_pop, 1, memory_order_relaxed);
}
#endif #endif
SuperSlab* warm_ss = tiny_warm_pool_pop(class_idx); SuperSlab* warm_ss = tiny_warm_pool_pop(class_idx);
if (warm_ss) { if (warm_ss) {
#if !HAKMEM_BUILD_RELEASE #if !HAKMEM_BUILD_RELEASE
if (class_idx == 7) {
warm_pool_dbg_c7_hit();
}
// Debug-only: Warm TLS Bind experiment (C7 only) // Debug-only: Warm TLS Bind experiment (C7 only)
if (class_idx == 7) { if (class_idx == 7) {
int warm_mode = warm_tls_bind_mode_c7(); int warm_mode = warm_tls_bind_mode_c7();
@ -577,25 +629,22 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
TinyTLSSlab* tls = &g_tls_slabs[class_idx]; TinyTLSSlab* tls = &g_tls_slabs[class_idx];
uint32_t tid = (uint32_t)(uintptr_t)pthread_self(); uint32_t tid = (uint32_t)(uintptr_t)pthread_self();
if (ss_tls_bind_one(class_idx, tls, warm_ss, slab_idx, tid)) { if (ss_tls_bind_one(class_idx, tls, warm_ss, slab_idx, tid)) {
static int logged = 0; warm_tls_bind_log_success(warm_ss, slab_idx);
if (!logged) {
fprintf(stderr, "[WARM_TLS_BIND] C7 bind success: ss=%p slab=%d\n",
(void*)warm_ss, slab_idx);
logged = 1;
}
// Mode 2: carve a single block via TLS fast path // Mode 2: carve a single block via TLS fast path
if (warm_mode == 2) { if (warm_mode == 2) {
void* tls_block = warm_tls_carve_one_block(class_idx); warm_pool_dbg_c7_tls_attempt();
if (tls_block) { TinyTLSCarveOneResult tls_carve =
fprintf(stderr, tiny_tls_carve_one_block(tls, class_idx);
"[WARM_TLS_BIND] C7 TLS carve success: ss=%p slab=%d block=%p\n", if (tls_carve.block) {
(void*)warm_ss, slab_idx, tls_block); warm_tls_bind_log_tls_carve(warm_ss, slab_idx, tls_carve.block);
out[0] = tls_block; warm_pool_dbg_c7_tls_success();
out[0] = tls_carve.block;
produced = 1; produced = 1;
tls_carved = 1;
} else { } else {
fprintf(stderr, warm_tls_bind_log_tls_fail(warm_ss, slab_idx);
"[WARM_TLS_BIND] C7 TLS carve failed, fallback\n"); warm_pool_dbg_c7_tls_fail();
} }
} }
} }
@ -607,7 +656,21 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
#endif #endif
// HOT PATH: Warm pool hit, try to carve directly // HOT PATH: Warm pool hit, try to carve directly
if (produced == 0) { if (produced == 0) {
#if HAKMEM_BUILD_RELEASE
if (class_idx == 7) {
warm_pool_rel_c7_carve_attempt();
}
#endif
produced = slab_carve_from_ss(class_idx, warm_ss, out, room); produced = slab_carve_from_ss(class_idx, warm_ss, out, room);
#if HAKMEM_BUILD_RELEASE
if (class_idx == 7) {
if (produced > 0) {
warm_pool_rel_c7_carve_success();
} else {
warm_pool_rel_c7_carve_zero();
}
}
#endif
if (produced > 0) { if (produced > 0) {
// Update active counter for carved blocks // Update active counter for carved blocks
ss_active_add(warm_ss, (uint32_t)produced); ss_active_add(warm_ss, (uint32_t)produced);
@ -615,7 +678,22 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
} }
if (produced > 0) { if (produced > 0) {
#if !HAKMEM_BUILD_RELEASE
if (class_idx == 7) {
warm_pool_dbg_c7_carve();
if (tls_carved) {
warm_pool_dbg_c7_uc_miss_tls();
} else {
warm_pool_dbg_c7_uc_miss_warm();
}
}
#endif
// Success! Return SuperSlab to warm pool for next use // Success! Return SuperSlab to warm pool for next use
#if HAKMEM_BUILD_RELEASE
if (class_idx == 7) {
atomic_fetch_add_explicit(&g_rel_c7_warm_push, 1, memory_order_relaxed);
}
#endif
tiny_warm_pool_push(class_idx, warm_ss); tiny_warm_pool_push(class_idx, warm_ss);
// Track warm pool hit (always compiled, ENV-gated printing) // Track warm pool hit (always compiled, ENV-gated printing)
@ -761,6 +839,9 @@ hak_base_ptr_t unified_cache_refill(int class_idx) {
} }
#if !HAKMEM_BUILD_RELEASE #if !HAKMEM_BUILD_RELEASE
if (class_idx == 7) {
warm_pool_dbg_c7_uc_miss_shared();
}
g_unified_cache_miss[class_idx]++; g_unified_cache_miss[class_idx]++;
#endif #endif

View File

@ -40,10 +40,18 @@ core/front/tiny_unified_cache.o: core/front/tiny_unified_cache.c \
core/front/../box/../superslab/superslab_inline.h \ core/front/../box/../superslab/superslab_inline.h \
core/front/../box/../tiny_box_geometry.h \ core/front/../box/../tiny_box_geometry.h \
core/front/../box/../box/pagefault_telemetry_box.h \ core/front/../box/../box/pagefault_telemetry_box.h \
core/front/../box/c7_meta_used_counter_box.h \
core/front/../box/warm_pool_rel_counters_box.h \
core/front/../box/warm_pool_prefill_box.h \ core/front/../box/warm_pool_prefill_box.h \
core/front/../box/../tiny_tls.h \ core/front/../box/../tiny_tls.h \
core/front/../box/../box/warm_pool_stats_box.h \ core/front/../box/../box/warm_pool_stats_box.h \
core/front/../hakmem_env_cache.h core/front/../box/tiny_page_box.h core/front/../hakmem_env_cache.h core/front/../box/tiny_page_box.h \
core/front/../box/ss_tls_bind_box.h \
core/front/../box/../box/tiny_page_box.h \
core/front/../box/tiny_tls_carve_one_block_box.h \
core/front/../box/../tiny_debug_api.h \
core/front/../box/warm_tls_bind_logger_box.h \
core/front/../box/warm_pool_dbg_box.h
core/front/tiny_unified_cache.h: core/front/tiny_unified_cache.h:
core/front/../hakmem_build_flags.h: core/front/../hakmem_build_flags.h:
core/front/../hakmem_tiny_config.h: core/front/../hakmem_tiny_config.h:
@ -104,8 +112,16 @@ core/front/../box/../hakmem_tiny_superslab.h:
core/front/../box/../superslab/superslab_inline.h: core/front/../box/../superslab/superslab_inline.h:
core/front/../box/../tiny_box_geometry.h: core/front/../box/../tiny_box_geometry.h:
core/front/../box/../box/pagefault_telemetry_box.h: core/front/../box/../box/pagefault_telemetry_box.h:
core/front/../box/c7_meta_used_counter_box.h:
core/front/../box/warm_pool_rel_counters_box.h:
core/front/../box/warm_pool_prefill_box.h: core/front/../box/warm_pool_prefill_box.h:
core/front/../box/../tiny_tls.h: core/front/../box/../tiny_tls.h:
core/front/../box/../box/warm_pool_stats_box.h: core/front/../box/../box/warm_pool_stats_box.h:
core/front/../hakmem_env_cache.h: core/front/../hakmem_env_cache.h:
core/front/../box/tiny_page_box.h: core/front/../box/tiny_page_box.h:
core/front/../box/ss_tls_bind_box.h:
core/front/../box/../box/tiny_page_box.h:
core/front/../box/tiny_tls_carve_one_block_box.h:
core/front/../box/../tiny_debug_api.h:
core/front/../box/warm_tls_bind_logger_box.h:
core/front/../box/warm_pool_dbg_box.h:

View File

@ -87,6 +87,10 @@ extern __thread uint64_t g_unified_cache_hit[TINY_NUM_CLASSES]; // Alloc hits
extern __thread uint64_t g_unified_cache_miss[TINY_NUM_CLASSES]; // Alloc misses extern __thread uint64_t g_unified_cache_miss[TINY_NUM_CLASSES]; // Alloc misses
extern __thread uint64_t g_unified_cache_push[TINY_NUM_CLASSES]; // Free pushes extern __thread uint64_t g_unified_cache_push[TINY_NUM_CLASSES]; // Free pushes
extern __thread uint64_t g_unified_cache_full[TINY_NUM_CLASSES]; // Free full (fallback to SuperSlab) extern __thread uint64_t g_unified_cache_full[TINY_NUM_CLASSES]; // Free full (fallback to SuperSlab)
#else
// Release-side lightweight C7 warm path counters (for smoke validation)
extern _Atomic uint64_t g_rel_c7_warm_pop;
extern _Atomic uint64_t g_rel_c7_warm_push;
#endif #endif
// ============================================================================ // ============================================================================

View File

@ -10,11 +10,145 @@
#include "hakmem_policy.h" #include "hakmem_policy.h"
#include "hakmem_env_cache.h" // Priority-2: ENV cache #include "hakmem_env_cache.h" // Priority-2: ENV cache
#include "front/tiny_warm_pool.h" // Warm Pool: Prefill during registry scans #include "front/tiny_warm_pool.h" // Warm Pool: Prefill during registry scans
#include "box/ss_slab_reset_box.h" // Box: Reset slab metadata on reuse (C7 guard)
#include <stdlib.h> #include <stdlib.h>
#include <stdio.h> #include <stdio.h>
#include <stdatomic.h> #include <stdatomic.h>
static inline void c7_log_meta_state(const char* tag, SuperSlab* ss, int slab_idx) {
if (!ss) return;
#if HAKMEM_BUILD_RELEASE
static _Atomic uint32_t rel_c7_meta_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&rel_c7_meta_logs, 1, memory_order_relaxed);
if (n < 8) {
TinySlabMeta* m = &ss->slabs[slab_idx];
fprintf(stderr,
"[REL_C7_%s] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p\n",
tag,
(void*)ss,
slab_idx,
(unsigned)m->class_idx,
(unsigned)m->used,
(unsigned)m->capacity,
(unsigned)m->carved,
m->freelist);
}
#else
static _Atomic uint32_t dbg_c7_meta_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&dbg_c7_meta_logs, 1, memory_order_relaxed);
if (n < 8) {
TinySlabMeta* m = &ss->slabs[slab_idx];
fprintf(stderr,
"[DBG_C7_%s] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p\n",
tag,
(void*)ss,
slab_idx,
(unsigned)m->class_idx,
(unsigned)m->used,
(unsigned)m->capacity,
(unsigned)m->carved,
m->freelist);
}
#endif
}
static inline int c7_meta_is_pristine(TinySlabMeta* m) {
return m && m->used == 0 && m->carved == 0 && m->freelist == NULL;
}
static inline void c7_log_skip_nonempty_acquire(SuperSlab* ss,
int slab_idx,
TinySlabMeta* m,
const char* tag) {
if (!(ss && m)) return;
#if HAKMEM_BUILD_RELEASE
static _Atomic uint32_t rel_c7_skip_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&rel_c7_skip_logs, 1, memory_order_relaxed);
if (n < 4) {
fprintf(stderr,
"[REL_C7_%s] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p\n",
tag,
(void*)ss,
slab_idx,
(unsigned)m->class_idx,
(unsigned)m->used,
(unsigned)m->capacity,
(unsigned)m->carved,
m->freelist);
}
#else
static _Atomic uint32_t dbg_c7_skip_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&dbg_c7_skip_logs, 1, memory_order_relaxed);
if (n < 4) {
fprintf(stderr,
"[DBG_C7_%s] ss=%p slab=%d cls=%u used=%u cap=%u carved=%u freelist=%p\n",
tag,
(void*)ss,
slab_idx,
(unsigned)m->class_idx,
(unsigned)m->used,
(unsigned)m->capacity,
(unsigned)m->carved,
m->freelist);
}
#endif
}
static inline int c7_reset_and_log_if_needed(SuperSlab* ss,
int slab_idx,
int class_idx) {
if (class_idx != 7) {
return 0;
}
TinySlabMeta* m = &ss->slabs[slab_idx];
c7_log_meta_state("ACQUIRE_META", ss, slab_idx);
if (m->class_idx != 255 && m->class_idx != (uint8_t)class_idx) {
#if HAKMEM_BUILD_RELEASE
static _Atomic uint32_t rel_c7_class_mismatch_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&rel_c7_class_mismatch_logs, 1, memory_order_relaxed);
if (n < 4) {
fprintf(stderr,
"[REL_C7_CLASS_MISMATCH] ss=%p slab=%d want=%d have=%u used=%u cap=%u carved=%u\n",
(void*)ss,
slab_idx,
class_idx,
(unsigned)m->class_idx,
(unsigned)m->used,
(unsigned)m->capacity,
(unsigned)m->carved);
}
#else
static _Atomic uint32_t dbg_c7_class_mismatch_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&dbg_c7_class_mismatch_logs, 1, memory_order_relaxed);
if (n < 4) {
fprintf(stderr,
"[DBG_C7_CLASS_MISMATCH] ss=%p slab=%d want=%d have=%u used=%u cap=%u carved=%u freelist=%p\n",
(void*)ss,
slab_idx,
class_idx,
(unsigned)m->class_idx,
(unsigned)m->used,
(unsigned)m->capacity,
(unsigned)m->carved,
m->freelist);
}
#endif
return -1;
}
if (!c7_meta_is_pristine(m)) {
c7_log_skip_nonempty_acquire(ss, slab_idx, m, "SKIP_NONEMPTY_ACQUIRE");
return -1;
}
ss_slab_reset_meta_for_tiny(ss, slab_idx, class_idx);
c7_log_meta_state("ACQUIRE", ss, slab_idx);
return 0;
}
// ============================================================================ // ============================================================================
// Performance Measurement: Shared Pool Lock Contention (ENV-gated) // Performance Measurement: Shared Pool Lock Contention (ENV-gated)
// ============================================================================ // ============================================================================
@ -147,7 +281,12 @@ sp_acquire_from_empty_scan(int class_idx, SuperSlab** ss_out, int* slab_idx_out,
fprintf(stderr, "[STAGE0.5_STATS] hits=%lu attempts=%lu rate=%.1f%% (scan_limit=%d warm_pool=%d)\n", fprintf(stderr, "[STAGE0.5_STATS] hits=%lu attempts=%lu rate=%.1f%% (scan_limit=%d warm_pool=%d)\n",
hits, attempts, (double)hits * 100.0 / attempts, scan_limit, tiny_warm_pool_count(class_idx)); hits, attempts, (double)hits * 100.0 / attempts, scan_limit, tiny_warm_pool_count(class_idx));
} }
return 0; if (c7_reset_and_log_if_needed(primary_result, primary_slab_idx, class_idx) == 0) {
return 0;
}
primary_result = NULL;
*ss_out = NULL;
*slab_idx_out = -1;
} }
return -1; return -1;
} }
@ -216,6 +355,15 @@ stage1_retry_after_tension_drain:
if (ss_guard) { if (ss_guard) {
tiny_tls_slab_reuse_guard(ss_guard); tiny_tls_slab_reuse_guard(ss_guard);
if (class_idx == 7) {
TinySlabMeta* meta = &ss_guard->slabs[reuse_slot_idx];
if (!c7_meta_is_pristine(meta)) {
c7_log_skip_nonempty_acquire(ss_guard, reuse_slot_idx, meta, "SKIP_NONEMPTY_ACQUIRE");
sp_freelist_push_lockfree(class_idx, reuse_meta, reuse_slot_idx);
goto stage2_fallback;
}
}
// P-Tier: Skip DRAINING tier SuperSlabs // P-Tier: Skip DRAINING tier SuperSlabs
if (!ss_tier_is_hot(ss_guard)) { if (!ss_tier_is_hot(ss_guard)) {
// DRAINING SuperSlab - skip this slot and fall through to Stage 2 // DRAINING SuperSlab - skip this slot and fall through to Stage 2
@ -270,6 +418,15 @@ stage1_retry_after_tension_drain:
*ss_out = ss; *ss_out = ss;
*slab_idx_out = reuse_slot_idx; *slab_idx_out = reuse_slot_idx;
if (c7_reset_and_log_if_needed(ss, reuse_slot_idx, class_idx) != 0) {
*ss_out = NULL;
*slab_idx_out = -1;
if (g_lock_stats_enabled == 1) {
atomic_fetch_add(&g_lock_release_count, 1);
}
pthread_mutex_unlock(&g_shared_pool.alloc_lock);
goto stage2_fallback;
}
if (g_lock_stats_enabled == 1) { if (g_lock_stats_enabled == 1) {
atomic_fetch_add(&g_lock_release_count, 1); atomic_fetch_add(&g_lock_release_count, 1);
@ -338,6 +495,19 @@ stage2_fallback:
1, memory_order_relaxed); 1, memory_order_relaxed);
} }
if (class_idx == 7) {
TinySlabMeta* meta = &ss->slabs[claimed_idx];
if (!c7_meta_is_pristine(meta)) {
c7_log_skip_nonempty_acquire(ss, claimed_idx, meta, "SKIP_NONEMPTY_ACQUIRE");
sp_slot_mark_empty(hint_meta, claimed_idx);
if (g_lock_stats_enabled == 1) {
atomic_fetch_add(&g_lock_release_count, 1);
}
pthread_mutex_unlock(&g_shared_pool.alloc_lock);
goto stage2_scan;
}
}
// Update SuperSlab metadata under mutex // Update SuperSlab metadata under mutex
ss->slab_bitmap |= (1u << claimed_idx); ss->slab_bitmap |= (1u << claimed_idx);
ss_slab_meta_class_idx_set(ss, claimed_idx, (uint8_t)class_idx); ss_slab_meta_class_idx_set(ss, claimed_idx, (uint8_t)class_idx);
@ -353,6 +523,15 @@ stage2_fallback:
// Hint is still good, no need to update // Hint is still good, no need to update
*ss_out = ss; *ss_out = ss;
*slab_idx_out = claimed_idx; *slab_idx_out = claimed_idx;
if (c7_reset_and_log_if_needed(ss, claimed_idx, class_idx) != 0) {
*ss_out = NULL;
*slab_idx_out = -1;
if (g_lock_stats_enabled == 1) {
atomic_fetch_add(&g_lock_release_count, 1);
}
pthread_mutex_unlock(&g_shared_pool.alloc_lock);
goto stage2_scan;
}
sp_fix_geometry_if_needed(ss, claimed_idx, class_idx); sp_fix_geometry_if_needed(ss, claimed_idx, class_idx);
if (g_lock_stats_enabled == 1) { if (g_lock_stats_enabled == 1) {
@ -432,6 +611,19 @@ stage2_scan:
1, memory_order_relaxed); 1, memory_order_relaxed);
} }
if (class_idx == 7) {
TinySlabMeta* meta_slab = &ss->slabs[claimed_idx];
if (!c7_meta_is_pristine(meta_slab)) {
c7_log_skip_nonempty_acquire(ss, claimed_idx, meta_slab, "SKIP_NONEMPTY_ACQUIRE");
sp_slot_mark_empty(meta, claimed_idx);
if (g_lock_stats_enabled == 1) {
atomic_fetch_add(&g_lock_release_count, 1);
}
pthread_mutex_unlock(&g_shared_pool.alloc_lock);
continue;
}
}
// Update SuperSlab metadata under mutex // Update SuperSlab metadata under mutex
ss->slab_bitmap |= (1u << claimed_idx); ss->slab_bitmap |= (1u << claimed_idx);
ss_slab_meta_class_idx_set(ss, claimed_idx, (uint8_t)class_idx); ss_slab_meta_class_idx_set(ss, claimed_idx, (uint8_t)class_idx);
@ -449,6 +641,15 @@ stage2_scan:
*ss_out = ss; *ss_out = ss;
*slab_idx_out = claimed_idx; *slab_idx_out = claimed_idx;
if (c7_reset_and_log_if_needed(ss, claimed_idx, class_idx) != 0) {
*ss_out = NULL;
*slab_idx_out = -1;
if (g_lock_stats_enabled == 1) {
atomic_fetch_add(&g_lock_release_count, 1);
}
pthread_mutex_unlock(&g_shared_pool.alloc_lock);
continue;
}
sp_fix_geometry_if_needed(ss, claimed_idx, class_idx); sp_fix_geometry_if_needed(ss, claimed_idx, class_idx);
if (g_lock_stats_enabled == 1) { if (g_lock_stats_enabled == 1) {
@ -623,6 +824,15 @@ stage2_scan:
*ss_out = new_ss; *ss_out = new_ss;
*slab_idx_out = first_slot; *slab_idx_out = first_slot;
if (c7_reset_and_log_if_needed(new_ss, first_slot, class_idx) != 0) {
*ss_out = NULL;
*slab_idx_out = -1;
if (g_lock_stats_enabled == 1) {
atomic_fetch_add(&g_lock_release_count, 1);
}
pthread_mutex_unlock(&g_shared_pool.alloc_lock);
return -1;
}
sp_fix_geometry_if_needed(new_ss, first_slot, class_idx); sp_fix_geometry_if_needed(new_ss, first_slot, class_idx);
if (g_lock_stats_enabled == 1) { if (g_lock_stats_enabled == 1) {

View File

@ -6,11 +6,42 @@
#include "hakmem_env_cache.h" // Priority-2: ENV cache #include "hakmem_env_cache.h" // Priority-2: ENV cache
#include "superslab/superslab_inline.h" // superslab_ref_get guard for TLS pins #include "superslab/superslab_inline.h" // superslab_ref_get guard for TLS pins
#include "box/ss_release_guard_box.h" // Box: SuperSlab Release Guard #include "box/ss_release_guard_box.h" // Box: SuperSlab Release Guard
#include "box/ss_slab_reset_box.h" // Box: Reset slab metadata on reuse path
#include <stdlib.h> #include <stdlib.h>
#include <stdio.h> #include <stdio.h>
#include <stdatomic.h> #include <stdatomic.h>
static inline void c7_release_log_once(SuperSlab* ss, int slab_idx) {
#if HAKMEM_BUILD_RELEASE
static _Atomic uint32_t rel_c7_release_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&rel_c7_release_logs, 1, memory_order_relaxed);
if (n < 8) {
TinySlabMeta* meta = &ss->slabs[slab_idx];
fprintf(stderr,
"[REL_C7_RELEASE] ss=%p slab=%d used=%u cap=%u carved=%u\n",
(void*)ss,
slab_idx,
(unsigned)meta->used,
(unsigned)meta->capacity,
(unsigned)meta->carved);
}
#else
static _Atomic uint32_t dbg_c7_release_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&dbg_c7_release_logs, 1, memory_order_relaxed);
if (n < 8) {
TinySlabMeta* meta = &ss->slabs[slab_idx];
fprintf(stderr,
"[DBG_C7_RELEASE] ss=%p slab=%d used=%u cap=%u carved=%u\n",
(void*)ss,
slab_idx,
(unsigned)meta->used,
(unsigned)meta->capacity,
(unsigned)meta->carved);
}
#endif
}
void void
shared_pool_release_slab(SuperSlab* ss, int slab_idx) shared_pool_release_slab(SuperSlab* ss, int slab_idx)
{ {
@ -75,6 +106,9 @@ shared_pool_release_slab(SuperSlab* ss, int slab_idx)
} }
uint8_t class_idx = slab_meta->class_idx; uint8_t class_idx = slab_meta->class_idx;
if (class_idx == 7) {
c7_release_log_once(ss, slab_idx);
}
// Guard: if SuperSlab is pinned (TLS/remote references), defer release to avoid // Guard: if SuperSlab is pinned (TLS/remote references), defer release to avoid
// class_map=255 while pointers are still in-flight. // class_map=255 while pointers are still in-flight.
@ -101,6 +135,39 @@ shared_pool_release_slab(SuperSlab* ss, int slab_idx)
} }
#endif #endif
if (class_idx == 7) {
ss_slab_reset_meta_for_tiny(ss, slab_idx, class_idx);
#if HAKMEM_BUILD_RELEASE
static _Atomic uint32_t rel_c7_reset_logs = 0;
uint32_t rn = atomic_fetch_add_explicit(&rel_c7_reset_logs, 1, memory_order_relaxed);
if (rn < 4) {
TinySlabMeta* m = &ss->slabs[slab_idx];
fprintf(stderr,
"[REL_C7_RELEASE_RESET] ss=%p slab=%d used=%u cap=%u carved=%u freelist=%p\n",
(void*)ss,
slab_idx,
(unsigned)m->used,
(unsigned)m->capacity,
(unsigned)m->carved,
m->freelist);
}
#else
static _Atomic uint32_t dbg_c7_reset_logs = 0;
uint32_t rn = atomic_fetch_add_explicit(&dbg_c7_reset_logs, 1, memory_order_relaxed);
if (rn < 4) {
TinySlabMeta* m = &ss->slabs[slab_idx];
fprintf(stderr,
"[DBG_C7_RELEASE_RESET] ss=%p slab=%d used=%u cap=%u carved=%u freelist=%p\n",
(void*)ss,
slab_idx,
(unsigned)m->used,
(unsigned)m->capacity,
(unsigned)m->carved,
m->freelist);
}
#endif
}
// Find SharedSSMeta for this SuperSlab // Find SharedSSMeta for this SuperSlab
SharedSSMeta* sp_meta = NULL; SharedSSMeta* sp_meta = NULL;
uint32_t count = atomic_load_explicit(&g_shared_pool.ss_meta_count, memory_order_relaxed); uint32_t count = atomic_load_explicit(&g_shared_pool.ss_meta_count, memory_order_relaxed);

View File

@ -25,6 +25,7 @@
#include "front/tiny_heap_v2.h" #include "front/tiny_heap_v2.h"
#include "tiny_tls_guard.h" #include "tiny_tls_guard.h"
#include "tiny_ready.h" #include "tiny_ready.h"
#include "box/c7_meta_used_counter_box.h"
#include "hakmem_tiny_tls_list.h" #include "hakmem_tiny_tls_list.h"
#include "hakmem_tiny_remote_target.h" // Phase 2C-1: Remote target queue #include "hakmem_tiny_remote_target.h" // Phase 2C-1: Remote target queue
#include "hakmem_tiny_bg_spill.h" // Phase 2C-2: Background spill queue #include "hakmem_tiny_bg_spill.h" // Phase 2C-2: Background spill queue
@ -334,6 +335,7 @@ static inline void* hak_tiny_alloc_superslab_try_fast(int class_idx) {
size_t block_size = tiny_stride_for_class(meta->class_idx); size_t block_size = tiny_stride_for_class(meta->class_idx);
void* block = tls->slab_base + ((size_t)meta->used * block_size); void* block = tls->slab_base + ((size_t)meta->used * block_size);
meta->used++; meta->used++;
c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_FRONT);
// Track active blocks in SuperSlab for conservative reclamation // Track active blocks in SuperSlab for conservative reclamation
ss_active_inc(tls->ss); ss_active_inc(tls->ss);
return block; return block;

View File

@ -17,6 +17,7 @@
// Phase E1-CORRECT: Box API for next pointer operations // Phase E1-CORRECT: Box API for next pointer operations
#include "box/tiny_next_ptr_box.h" #include "box/tiny_next_ptr_box.h"
#include "front/tiny_heap_v2.h" #include "front/tiny_heap_v2.h"
#include "box/c7_meta_used_counter_box.h"
// Debug counters (thread-local) // Debug counters (thread-local)
static __thread uint64_t g_3layer_bump_hits = 0; static __thread uint64_t g_3layer_bump_hits = 0;
@ -265,6 +266,7 @@ static void* tiny_alloc_slow_new(int class_idx) {
meta->freelist = tiny_next_read(node); // Phase E1-CORRECT: Box API meta->freelist = tiny_next_read(node); // Phase E1-CORRECT: Box API
items[got++] = node; items[got++] = node;
meta->used++; meta->used++;
c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
} }
// Then linear carve (KEY OPTIMIZATION - direct array fill!) // Then linear carve (KEY OPTIMIZATION - direct array fill!)
@ -285,6 +287,11 @@ static void* tiny_alloc_slow_new(int class_idx) {
} }
meta->used += need; // Reserve to TLS; not active until returned to user meta->used += need; // Reserve to TLS; not active until returned to user
if (class_idx == 7) {
for (uint32_t i = 0; i < need; ++i) {
c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
}
}
} }
if (got == 0) { if (got == 0) {

View File

@ -18,6 +18,7 @@
#include "tiny_box_geometry.h" #include "tiny_box_geometry.h"
#include "superslab/superslab_inline.h" // Provides hak_super_lookup() and SUPERSLAB_MAGIC #include "superslab/superslab_inline.h" // Provides hak_super_lookup() and SUPERSLAB_MAGIC
#include "box/tls_sll_box.h" #include "box/tls_sll_box.h"
#include "box/c7_meta_used_counter_box.h"
#include "box/tiny_header_box.h" // Header Box: Single Source of Truth for header operations #include "box/tiny_header_box.h" // Header Box: Single Source of Truth for header operations
#include "box/tiny_front_config_box.h" // Phase 7-Step6-Fix: Config macros for dead code elimination #include "box/tiny_front_config_box.h" // Phase 7-Step6-Fix: Config macros for dead code elimination
#include "hakmem_tiny_integrity.h" #include "hakmem_tiny_integrity.h"
@ -94,6 +95,39 @@ static inline void tiny_debug_validate_node_base(int class_idx, void* node, cons
} }
#endif #endif
static inline void c7_log_used_assign_cap(TinySlabMeta* meta,
int class_idx,
const char* tag) {
if (__builtin_expect(class_idx != 7, 1)) {
return;
}
#if HAKMEM_BUILD_RELEASE
static _Atomic uint32_t rel_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&rel_logs, 1, memory_order_relaxed);
if (n < 4) {
fprintf(stderr,
"[REL_C7_USED_ASSIGN] tag=%s used=%u cap=%u carved=%u freelist=%p\n",
tag,
(unsigned)meta->used,
(unsigned)meta->capacity,
(unsigned)meta->carved,
meta->freelist);
}
#else
static _Atomic uint32_t dbg_logs = 0;
uint32_t n = atomic_fetch_add_explicit(&dbg_logs, 1, memory_order_relaxed);
if (n < 4) {
fprintf(stderr,
"[DBG_C7_USED_ASSIGN] tag=%s used=%u cap=%u carved=%u freelist=%p\n",
tag,
(unsigned)meta->used,
(unsigned)meta->capacity,
(unsigned)meta->carved,
meta->freelist);
}
#endif
}
// ========= superslab_tls_bump_fast ========= // ========= superslab_tls_bump_fast =========
// //
// Ultra bump shadow: current slabが freelist 空で carved<capacity のとき、 // Ultra bump shadow: current slabが freelist 空で carved<capacity のとき、
@ -141,6 +175,11 @@ static inline void* superslab_tls_bump_fast(int class_idx) {
meta->carved = (uint16_t)(carved + (uint16_t)chunk); meta->carved = (uint16_t)(carved + (uint16_t)chunk);
meta->used = (uint16_t)(meta->used + (uint16_t)chunk); meta->used = (uint16_t)(meta->used + (uint16_t)chunk);
if (class_idx == 7) {
for (uint32_t i = 0; i < chunk; ++i) {
c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
}
}
ss_active_add(tls->ss, chunk); ss_active_add(tls->ss, chunk);
#if HAKMEM_DEBUG_COUNTERS #if HAKMEM_DEBUG_COUNTERS
g_bump_arms[class_idx]++; g_bump_arms[class_idx]++;
@ -365,8 +404,10 @@ int sll_refill_small_from_ss(int class_idx, int max_take)
meta->freelist = next_raw; meta->freelist = next_raw;
meta->used++; meta->used++;
c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
if (__builtin_expect(meta->used > meta->capacity, 0)) { if (__builtin_expect(meta->used > meta->capacity, 0)) {
// 異常検出時はロールバックして終了fail-fast 回避のため静かに中断) // 異常検出時はロールバックして終了fail-fast 回避のため静かに中断)
c7_log_used_assign_cap(meta, class_idx, "FREELIST_OVERRUN");
meta->used = meta->capacity; meta->used = meta->capacity;
break; break;
} }
@ -414,7 +455,9 @@ int sll_refill_small_from_ss(int class_idx, int max_take)
meta->carved++; meta->carved++;
meta->used++; meta->used++;
c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
if (__builtin_expect(meta->used > meta->capacity, 0)) { if (__builtin_expect(meta->used > meta->capacity, 0)) {
c7_log_used_assign_cap(meta, class_idx, "CARVE_OVERRUN");
meta->used = meta->capacity; meta->used = meta->capacity;
break; break;
} }

View File

@ -33,6 +33,7 @@
#ifndef HEADER_CLASS_MASK #ifndef HEADER_CLASS_MASK
#define HEADER_CLASS_MASK 0x0F #define HEADER_CLASS_MASK 0x0F
#endif #endif
#include "../box/c7_meta_used_counter_box.h"
// ======================================================================== // ========================================================================
// REFILL CONTRACT: ss_refill_fc_fill() - Standard Refill Entry Point // REFILL CONTRACT: ss_refill_fc_fill() - Standard Refill Entry Point
@ -131,12 +132,14 @@ static inline int ss_refill_fc_fill(int class_idx, int want) {
p = meta->freelist; p = meta->freelist;
meta->freelist = tiny_next_read(class_idx, p); meta->freelist = tiny_next_read(class_idx, p);
meta->used++; meta->used++;
c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
} }
// Option B: Carve new block (if capacity available) // Option B: Carve new block (if capacity available)
else if (meta->carved < meta->capacity) { else if (meta->carved < meta->capacity) {
p = (void*)(slab_base + (meta->carved * stride)); p = (void*)(slab_base + (meta->carved * stride));
meta->carved++; meta->carved++;
meta->used++; meta->used++;
c7_meta_used_note(class_idx, C7_META_USED_SRC_FRONT);
} }
// Option C: Slab exhausted, need new slab // Option C: Slab exhausted, need new slab
else { else {

View File

@ -9,6 +9,7 @@
#include "tiny_debug_ring.h" #include "tiny_debug_ring.h"
#include "tiny_remote.h" #include "tiny_remote.h"
#include "box/tiny_next_ptr_box.h" // Box API: next pointer read/write #include "box/tiny_next_ptr_box.h" // Box API: next pointer read/write
#include "box/c7_meta_used_counter_box.h"
extern int g_debug_remote_guard; extern int g_debug_remote_guard;
extern int g_tiny_safe_free_strict; extern int g_tiny_safe_free_strict;
@ -311,6 +312,7 @@ static inline void* slab_freelist_pop(SlabHandle* h) {
void* next = tiny_next_read(h->meta->class_idx, ptr); // Box API: next pointer read void* next = tiny_next_read(h->meta->class_idx, ptr); // Box API: next pointer read
h->meta->freelist = next; h->meta->freelist = next;
h->meta->used++; h->meta->used++;
c7_meta_used_note(h->meta->class_idx, C7_META_USED_SRC_FRONT);
// Optional freelist mask clear when freelist becomes empty // Optional freelist mask clear when freelist becomes empty
do { do {
static int g_mask_en2 = -1; static int g_mask_en2 = -1;

View File

@ -4,6 +4,10 @@
// Date: 2025-11-28 // Date: 2025-11-28
#include "hakmem_tiny_superslab_internal.h" #include "hakmem_tiny_superslab_internal.h"
#include "box/c7_meta_used_counter_box.h"
#include <stdatomic.h>
static _Atomic uint32_t g_c7_backend_calls = 0;
// Note: Legacy backend moved to archive/superslab_backend_legacy.c (not built). // Note: Legacy backend moved to archive/superslab_backend_legacy.c (not built).
@ -83,6 +87,20 @@ void* hak_tiny_alloc_superslab_backend_shared(int class_idx)
return NULL; return NULL;
} }
if (class_idx == 7) {
uint32_t n = atomic_fetch_add_explicit(&g_c7_backend_calls, 1, memory_order_relaxed);
if (n < 8) {
fprintf(stderr,
"[REL_C7_BACKEND_CALL] cls=%d meta_cls=%u used=%u cap=%u ss=%p slab=%d\n",
class_idx,
(unsigned)meta->class_idx,
(unsigned)meta->used,
(unsigned)meta->capacity,
(void*)ss,
slab_idx);
}
}
// Simple bump allocation within this slab. // Simple bump allocation within this slab.
if (meta->used >= meta->capacity) { if (meta->used >= meta->capacity) {
// Slab exhausted: in minimal Phase12-2 backend we do not loop; // Slab exhausted: in minimal Phase12-2 backend we do not loop;
@ -101,6 +119,7 @@ void* hak_tiny_alloc_superslab_backend_shared(int class_idx)
uint8_t* base = (uint8_t*)ss + slab_base_off + offset; uint8_t* base = (uint8_t*)ss + slab_base_off + offset;
meta->used++; meta->used++;
c7_meta_used_note(class_idx, C7_META_USED_SRC_BACKEND);
atomic_fetch_add_explicit(&ss->total_active_blocks, 1, memory_order_relaxed); atomic_fetch_add_explicit(&ss->total_active_blocks, 1, memory_order_relaxed);
HAK_RET_ALLOC_BLOCK_TRACED(class_idx, base, ALLOC_PATH_BACKEND); HAK_RET_ALLOC_BLOCK_TRACED(class_idx, base, ALLOC_PATH_BACKEND);

View File

@ -6,6 +6,7 @@
#include "hakmem_tiny_superslab_internal.h" #include "hakmem_tiny_superslab_internal.h"
#include "box/slab_recycling_box.h" #include "box/slab_recycling_box.h"
#include "hakmem_env_cache.h" // Priority-2: ENV cache (eliminate syscalls) #include "hakmem_env_cache.h" // Priority-2: ENV cache (eliminate syscalls)
#include <stdio.h>
// ============================================================================ // ============================================================================
// Remote Drain (MPSC queue to freelist conversion) // Remote Drain (MPSC queue to freelist conversion)
@ -175,6 +176,37 @@ void superslab_init_slab(SuperSlab* ss, int slab_idx, size_t block_size, uint32_
} }
} }
#if HAKMEM_BUILD_RELEASE
static _Atomic int rel_c7_init_logged = 0;
if (meta->class_idx == 7 &&
atomic_load_explicit(&rel_c7_init_logged, memory_order_relaxed) == 0) {
fprintf(stderr,
"[REL_C7_INIT] ss=%p slab=%d cls=%u cap=%u used=%u carved=%u stride=%zu\n",
(void*)ss,
slab_idx,
(unsigned)meta->class_idx,
(unsigned)meta->capacity,
(unsigned)meta->used,
(unsigned)meta->carved,
stride);
atomic_store_explicit(&rel_c7_init_logged, 1, memory_order_relaxed);
}
#else
static __thread int dbg_c7_init_logged = 0;
if (meta->class_idx == 7 && dbg_c7_init_logged == 0) {
fprintf(stderr,
"[DBG_C7_INIT] ss=%p slab=%d cls=%u cap=%u used=%u carved=%u stride=%zu\n",
(void*)ss,
slab_idx,
(unsigned)meta->class_idx,
(unsigned)meta->capacity,
(unsigned)meta->used,
(unsigned)meta->carved,
stride);
dbg_c7_init_logged = 1;
}
#endif
superslab_activate_slab(ss, slab_idx); superslab_activate_slab(ss, slab_idx);
} }

View File

@ -7,6 +7,8 @@
#include "box/superslab_expansion_box.h" // Box E: Expansion with TLS state guarantee #include "box/superslab_expansion_box.h" // Box E: Expansion with TLS state guarantee
#include "box/tiny_next_ptr_box.h" // Box API: Next pointer read/write #include "box/tiny_next_ptr_box.h" // Box API: Next pointer read/write
#include "box/tiny_tls_carve_one_block_box.h" // Box: Shared TLS carve helper
#include "box/c7_meta_used_counter_box.h" // Box: C7 meta->used telemetry
#include "hakmem_tiny_superslab_constants.h" #include "hakmem_tiny_superslab_constants.h"
#include "tiny_box_geometry.h" // Box 3: Geometry & Capacity Calculator" #include "tiny_box_geometry.h" // Box 3: Geometry & Capacity Calculator"
#include "tiny_debug_api.h" // Guard/failfast declarations #include "tiny_debug_api.h" // Guard/failfast declarations
@ -33,6 +35,7 @@ static inline void* superslab_alloc_from_slab(SuperSlab* ss, int slab_idx) {
uint8_t* base = tiny_slab_base_for_geometry(ss, slab_idx); uint8_t* base = tiny_slab_base_for_geometry(ss, slab_idx);
void* block = tiny_block_at_index(base, meta->used, unit_sz); void* block = tiny_block_at_index(base, meta->used, unit_sz);
meta->used++; meta->used++;
c7_meta_used_note(cls, C7_META_USED_SRC_FRONT);
ss_active_inc(ss); ss_active_inc(ss);
HAK_RET_ALLOC(cls, block); HAK_RET_ALLOC(cls, block);
} }
@ -105,6 +108,7 @@ static inline void* superslab_alloc_from_slab(SuperSlab* ss, int slab_idx) {
} }
#endif #endif
meta->used++; meta->used++;
c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_FRONT);
void* user = void* user =
#if HAKMEM_TINY_HEADER_CLASSIDX #if HAKMEM_TINY_HEADER_CLASSIDX
tiny_region_id_write_header(block_base, meta->class_idx); tiny_region_id_write_header(block_base, meta->class_idx);
@ -157,6 +161,7 @@ static inline void* superslab_alloc_from_slab(SuperSlab* ss, int slab_idx) {
meta->freelist = tiny_next_read(meta->class_idx, block); meta->freelist = tiny_next_read(meta->class_idx, block);
meta->used++; meta->used++;
c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_FRONT);
if (__builtin_expect(tiny_refill_failfast_level() >= 2, 0) && if (__builtin_expect(tiny_refill_failfast_level() >= 2, 0) &&
__builtin_expect(meta->used > meta->capacity, 0)) { __builtin_expect(meta->used > meta->capacity, 0)) {
@ -294,54 +299,33 @@ static inline void* hak_tiny_alloc_superslab(int class_idx) {
} }
// Fast path: linear carve from current TLS slab // Fast path: linear carve from current TLS slab
if (meta && meta->freelist == NULL && meta->used < meta->capacity && tls->slab_base) { if (meta && tls->slab_base) {
size_t block_size = tiny_stride_for_class(meta->class_idx); TinyTLSCarveOneResult carve = tiny_tls_carve_one_block(tls, class_idx);
uint8_t* base = tls->slab_base; if (carve.block) {
void* block = base + ((size_t)meta->used * block_size); #if !HAKMEM_BUILD_RELEASE
meta->used++; if (__builtin_expect(g_debug_remote_guard, 0)) {
const char* tag = (carve.path == TINY_TLS_CARVE_PATH_FREELIST)
if (__builtin_expect(tiny_refill_failfast_level() >= 2, 0)) { ? "freelist_alloc"
uintptr_t base_ss = (uintptr_t)tls->ss; : "linear_alloc";
size_t ss_size = (size_t)1ULL << tls->ss->lg_size; tiny_remote_track_on_alloc(tls->ss, slab_idx, carve.block, tag, 0);
uintptr_t p = (uintptr_t)block; tiny_remote_assert_not_remote(tls->ss, slab_idx, carve.block, tag, 0);
int in_range = (p >= base_ss) && (p < base_ss + ss_size);
int aligned = ((p - (uintptr_t)base) % block_size) == 0;
int idx_ok = (tls->slab_idx >= 0) &&
(tls->slab_idx < ss_slabs_capacity(tls->ss));
if (!in_range || !aligned || !idx_ok || meta->used > meta->capacity) {
tiny_failfast_abort_ptr("alloc_ret_align",
tls->ss,
tls->slab_idx,
block,
"superslab_tls_invariant");
} }
} #endif
ss_active_inc(tls->ss); #if HAKMEM_TINY_SS_TLS_HINT
ROUTE_MARK(11); ROUTE_COMMIT(class_idx, 0x60); {
HAK_RET_ALLOC(class_idx, block); void* ss_base = (void*)tls->ss;
} size_t ss_size = (size_t)1ULL << tls->ss->lg_size;
tls_ss_hint_update(tls->ss, ss_base, ss_size);
// Freelist path from current TLS slab
if (meta && meta->freelist) {
void* block = meta->freelist;
if (__builtin_expect(g_tiny_safe_free, 0)) {
size_t blk = tiny_stride_for_class(meta->class_idx);
uint8_t* base = tiny_slab_base_for_geometry(tls->ss, tls->slab_idx);
uintptr_t delta = (uintptr_t)block - (uintptr_t)base;
int align_ok = ((delta % blk) == 0);
int range_ok = (delta / blk) < meta->capacity;
if (!align_ok || !range_ok) {
if (g_tiny_safe_free_strict) { raise(SIGUSR2); return NULL; }
return NULL;
} }
#endif
if (carve.path == TINY_TLS_CARVE_PATH_LINEAR) {
ROUTE_MARK(11); ROUTE_COMMIT(class_idx, 0x60);
} else if (carve.path == TINY_TLS_CARVE_PATH_FREELIST) {
ROUTE_MARK(12); ROUTE_COMMIT(class_idx, 0x61);
}
HAK_RET_ALLOC(class_idx, carve.block);
} }
void* next = tiny_next_read(class_idx, block);
meta->freelist = next;
meta->used++;
ss_active_inc(tls->ss);
ROUTE_MARK(12); ROUTE_COMMIT(class_idx, 0x61);
HAK_RET_ALLOC(class_idx, block);
} }
// Slow path: acquire a new slab via shared pool // Slow path: acquire a new slab via shared pool
@ -363,6 +347,7 @@ static inline void* hak_tiny_alloc_superslab(int class_idx) {
size_t block_size = tiny_stride_for_class(meta->class_idx); size_t block_size = tiny_stride_for_class(meta->class_idx);
void* block = tiny_block_at_index(tls->slab_base, meta->used, block_size); void* block = tiny_block_at_index(tls->slab_base, meta->used, block_size);
meta->used++; meta->used++;
c7_meta_used_note(meta->class_idx, C7_META_USED_SRC_FRONT);
ss_active_inc(ss); ss_active_inc(ss);
HAK_RET_ALLOC(class_idx, block); HAK_RET_ALLOC(class_idx, block);
} }