diff --git a/CURRENT_TASK.md b/CURRENT_TASK.md
index 642d84d1..c0db2af5 100644
--- a/CURRENT_TASK.md
+++ b/CURRENT_TASK.md
@@ -1,501 +1,113 @@
-# CURRENT TASK – Tiny / SuperSlab / Shared Pool Recent Summary
+# CURRENT TASK – Performance Optimization Status
-**Last Updated**: 2025-11-21
-**Scope**: Phase 12-1.1 (EMPTY Slab Reuse) / Phase 19 (Frontend tcache) / Box Theory Refactoring
-**Note**: Older detailed history has been moved to the PHASE\* / REPORT\* files (this file keeps only the recent summary)
+**Last Updated**: 2025-11-25
+**Scope**: Random Mixed 16-1024B / Arena Allocator / Architecture Limit Analysis
 ---
-## 🎨 Box Theory Refactoring - Complete (2025-11-21)
+## 🎯 Current Status Summary
-**Result**: Reduced hakmem_tiny.c from **2081 lines → 562 lines (-73%)** and extracted 12 modules
+### ✅ Arena Allocator Implemented - mmap Reduced by 95%
-### Systematic three-phase refactoring
-- **Phase 1** (ChatGPT): -30% (config_box, publish_box)
-- **Phase 2** (Claude): -58% (globals_box, legacy_slow_box, slab_lookup_box)
-- **Phase 3** (Task-sensei analysis): -9% (ss_active_box, eventq_box, sll_cap_box, ultra_batch_box)
+| Metric | Before | After | Improvement |
+|--------|--------|-------|-------------|
+| mmap syscalls | 401 | 32 | -92% |
+| munmap syscalls | 378 | 3 | -99% |
+| Performance (10M) | ~60M ops/s | **68-70M ops/s** | +15% |
-**Details**: Commit 4c33ccdf8 "Box Theory Refactoring - Phase 1-3 Complete"
-
----
-
-## 🚀 Phase 12-1.1: EMPTY Slab Detection + Immediate Reuse (2025-11-21)
-
-### Overview
-Task-sensei's Priority 1 recommendation: add empty_mask to SuperSlab and cut Stage 3 (mmap) overhead by detecting and reusing EMPTY slabs (used==0) immediately
-
-### Implementation
-
-#### 1. SuperSlab struct extension (`core/superslab/superslab_types.h`)
-```c
-uint32_t empty_mask; // slabs with used==0 (highest reuse priority)
-uint8_t empty_count; // number of EMPTY slabs for quick check
+### Current performance comparison (10M iterations)
 ```
-
-#### 2. EMPTY detection API (`core/box/ss_hot_cold_box.h`)
-- `ss_is_slab_empty()`: EMPTY check (capacity > 0 && used == 0)
-- `ss_mark_slab_empty()`: mark the EMPTY state
-- `ss_clear_slab_empty()`: clear the EMPTY state (on reactivation)
-- `ss_update_hot_cold_indices()`: three-way EMPTY/Hot/Cold classification
-
-#### 3. Free path integration (`core/box/free_local_box.c`)
-```c
-meta->used--;
-if (meta->used == 0) {
-    ss_mark_slab_empty(ss, slab_idx); // mark EMPTY immediately
-}
-```
-
-#### 4. Shared Pool Stage 0.5 (`core/hakmem_shared_pool.c`)
-```c
-// New stage before Stage 1: scan EMPTY slabs directly
-for (int i = 0; i < scan_limit; i++) {
-    SuperSlab* ss = g_super_reg_by_class[class_idx][i];
-    if (ss->empty_count > 0) {
-        uint32_t mask = ss->empty_mask;
-        while (mask) {
-            int empty_idx = __builtin_ctz(mask);
-            // Reuse the EMPTY slab, avoiding Stage 3
-        }
-    }
-}
-```
-
-### Performance results
-
-**Random Mixed 256B (1M iterations)**:
-```
-Baseline (OFF): 22.9M ops/s (average)
-Phase 12-1.1 (ON): 23.2M ops/s (+1.3%)
-Delta: within noise, though up to +14.9% was observed amid run-to-run variance
-```
-
-**Stage statistics (HAKMEM_SHARED_POOL_STAGE_STATS=1)**:
-```
-Class 6 (256B):
-  Stage 1 (EMPTY): 95.1% ← already extremely efficient!
-  Stage 2 (UNUSED): 4.7%
-  Stage 3 (new SS): 0.2% ← bottleneck almost eliminated
-```
-
-### Key findings 🔍
-
-1. **Task-sensei's premise does not hold**
-   - Expected: Stage 3 dominates at 87-95% (high)
-   - Actual: Stage 3 is 0.2% (**the Phase 12 Shared Pool is already effective**)
-   - Conclusion: little headroom remains for further Stage 3 reduction
-
-2. **True cause of the +14.9% improvement**
-   - The stage distribution is unchanged (95.1% / 4.7% / 0.2%)
-   - Hypothesis: prioritizing EMPTY slabs → **better cache locality**
-   - Even within the same Stage 1, reusing hot memory is faster
-
-3. **Limits of the Phase 12 strategy**
-   - Tiny backend optimization (SS-Reuse) is already saturated
-   - Next bottleneck: **Frontend (Priority 2)**
-
-### ENV controls
-
-```bash
-# Enable EMPTY reuse (default OFF for A/B testing)
-export HAKMEM_SS_EMPTY_REUSE=1
-
-# Adjust the scan range (default 16)
-export HAKMEM_SS_EMPTY_SCAN_LIMIT=16
-
-# Measure stage statistics
-export HAKMEM_SHARED_POOL_STAGE_STATS=1
-```
-
-### Files changed
-- `core/superslab/superslab_types.h` - added empty_mask/empty_count
-- `core/box/ss_hot_cold_box.h` - EMPTY detection API
-- `core/box/free_local_box.c` - EMPTY detection in the free path
-- `core/hakmem_shared_pool.c` - Stage 0.5 EMPTY scan
-
-**Commit**: 6afaa5703 "Phase 12-1.1: EMPTY Slab Detection + Immediate Reuse"
-
----
-
-## 🎯 Phase 19: Frontend Fast Path Optimization (next implementation)
-
-### Background
-Phase 12-1.1 showed that **backend optimization is saturated** (Stage 3: 0.2%). The real bottleneck is the **frontend fast path**:
-- Current: 31ns (HAKMEM) vs 9ns (mimalloc) = **3.4x slower**
-- Target: 31ns → 15ns (-50%), giving **22M → 40M ops/s**
-
-### ChatGPT-sensei's strategy (promoted from Priority 2 to Priority 1)
-
-#### Phase 19 premise: hit-rate analysis
-```
-HeapV2: 88-99% (primary)
-UltraHot: 0-12% (limited)
-FC/SFC: 0% (unused)
-```
-→ Layers other than HeapV2 are candidates for removal
-
----
-
-### Phase 19-1: Quick Prune (branch pruning) - 🚀 Top priority
-
-**Goal**: skip every unnecessary frontend layer, leaving a simple HeapV2 → SLL → SS path
-
-**Implementation**:
-```c
-// File: core/tiny_alloc_fast.inc.h
-// Add an early-return gate at the front entry
-
-#ifdef HAKMEM_TINY_FRONT_SLIM
-    // Skip fastcache/sfc/ultrahot/class5 entirely
-    // Go straight to HeapV2 → SLL → SS
-    if (class_idx >= 5) {
-        // class5+ falls back to the old path
-    }
-    // Try HeapV2 pop → miss → SLL → SS
-#endif
-```
-
-**Characteristics**:
-- ✅ **Existing code unchanged** - bypass only, via early return
-- ✅ **A/B gate** - ENV=0 reverts immediately
-- ✅ **Minimal risk** - layers can be pruned incrementally
-
-**ENV controls**:
-```bash
-export HAKMEM_TINY_FRONT_SLIM=1 # Enable Quick Prune
-```
-
-**Expected effect**: 22M → 27-30M ops/s (+22-36%)
-
----
-
-### Phase 19-2: Front-V2 (single-layer tcache) - ⚡ Main candidate
-
-**Goal**: unify the frontend into a tcache-style design (a single-layer per-class magazine)
-
-**Design**:
-```c
-// File: core/front/tiny_heap_v2.h (new)
-typedef struct {
-    void* items[32]; // cap 32 (tunable)
-    uint8_t top; // stack top index
-    uint8_t class_idx; // bound class
-} TinyFrontV2;
-
-__thread TinyFrontV2 g_front_v2[TINY_NUM_CLASSES];
-
-// Pop operation (ultra-fast)
-static inline void* front_v2_pop(int class_idx) {
-    TinyFrontV2* f = &g_front_v2[class_idx];
-    if (f->top == 0) return NULL; // empty
-    return f->items[--f->top]; // 1 instruction pop
-}
-
-// Push operation
-static inline int front_v2_push(int class_idx, void* ptr) {
-    TinyFrontV2* f = &g_front_v2[class_idx];
-    if (f->top >= 32) return 0; // full → spill to SLL
-    f->items[f->top++] = ptr; // 1 instruction push
-    return 1;
-}
-
-// Refill from backend (only place calling tiny_alloc_fast_refill)
-static inline int front_v2_refill(int class_idx) {
-    // Boundary: drain → bind → owner logic (AGENTS guide)
-    // Bulk take from SLL/SS (e.g., 8-16 blocks)
-}
-```
-
-**Fast path flow**:
-```c
-void* ptr = front_v2_pop(class_idx); // 1 branch + 1 array lookup
-if (!ptr) {
-    front_v2_refill(class_idx); // miss → refill
-    ptr = front_v2_pop(class_idx); // retry
-    if (!ptr) {
-        // backend fallback (SLL/SS)
-    }
-}
-return ptr;
-```
-
-**Target classes**: C0-C3 (hot classes); C4-C5 are off
-
-**ENV controls**:
-```bash
-export HAKMEM_TINY_FRONT_V2=1 # Enable Front-V2
-export HAKMEM_FRONT_V2_CAP=32 # Magazine capacity (default 32)
-```
-
-**Expected effect**: 30M → 40M ops/s (+33%)
-
----
-
-### Phase 19-3: A/B Testing & Metrics
-
-**Metrics**:
-```c
-// File: core/front/tiny_heap_v2.c
-_Atomic uint64_t g_front_v2_hits[TINY_NUM_CLASSES];
-_Atomic uint64_t g_front_v2_miss[TINY_NUM_CLASSES];
-_Atomic uint64_t g_front_v2_refill_count[TINY_NUM_CLASSES];
-```
-
-**ENV controls**:
-```bash
-export HAKMEM_TINY_FRONT_METRICS=1 # Enable metrics
-```
-
-**Benchmark order**:
-1. **Short run (100K)** - confirm no SEGV or regression
-   ```bash
-   HAKMEM_TINY_FRONT_SLIM=1 ./bench_random_mixed_hakmem 100000 256 42
-   ```
-
-2. **Latency measurement** - target 31ns → 15ns
-   ```bash
-   HAKMEM_TINY_FRONT_V2=1 ./bench_random_mixed_hakmem 500000 256 42
-   ```
-
-3. **Larson short run** - confirm no MT regression
-   ```bash
-   HAKMEM_TINY_FRONT_V2=1 ./larson_hakmem 10 10000 8 100000
-   ```
-
----
-
-### Phase 19 implementation order
-
-```
-Week 1: Phase 19-1 Quick Prune
-  - Add the gate to tiny_alloc_fast.inc.h
-  - Implement ENV=HAKMEM_TINY_FRONT_SLIM=1
-  - 100K short-run test
-  - Measure performance (expected: 22M → 27-30M)
-
-Week 2: Phase 19-2 Front-V2 design
-  - Create core/front/tiny_heap_v2.{h,c}
-  - Implement front_v2_pop/push/refill
-  - C0-C3 integration test
-
-Week 3: Phase 19-2 Front-V2 integration
-  - Add the Front-V2 path to tiny_alloc_fast.inc.h
-  - Implement ENV=HAKMEM_TINY_FRONT_V2=1
-  - A/B benchmark
-
-Week 4: Phase 19-3 optimization
-  - Tune magazine capacity (16/32/64)
-  - Adjust the refill batch size
-  - Confirm Larson/MT stability
+System malloc: 93M ops/s (baseline)
+HAKMEM: 68-70M ops/s (73-76% of system malloc)
+Gap: ~25% (structural overhead)
 ```
 ---
-### Expected final performance
+## 🔬 Phase 27 Investigation Results: Confirming the Architectural Limit
-```
-Baseline (Phase 12-1.1): 22M ops/s
-Phase 19-1 (Slim): 27-30M ops/s (+22-36%)
-Phase 19-2 (V2): 40M ops/s (+82%) ← target
-System malloc: 78M ops/s (reference)
+### Optimizations tried (all failed)
+| Optimization | Result | Effect |
+|---------|------|------|
+| C5 TLS capacity 2x (1024→2048) | 68-69M | No change |
+| Remove registry lookup | 68-70M | No change |
+| Ultra SLIM 4-layer | ~69M | No change |
+| **Phase 27-A: Ultra-Inline (all sizes)** | **56-61M** | **-15% regression** ❌ |
+| **Phase 27-B: Ultra-Inline (9-512B)** | **61-62M** | **-10% regression** ❌ |
-Gap closure: 28% → 51% (major improvement!)
-```
+### Why Phase 27 failed
+- ~52% of the workload falls into headerless classes (cls 0: 1-8B, cls 7: 513-1024B)
+- The conditional branch that filters out headerless classes is itself overhead
+- Benefit from classes 1-6 < overhead of the conditional branch
+
+### Causes of the remaining 25% gap (structural overhead)
+1. **Header byte overhead** - one byte written/read on every alloc/free
+2. **TLS SLL counter** - count++ / count-- on every operation (vs tcache: pointer only)
+3. **Multi-layer branching** - 4-5 dispatch layers (vs tcache: 2-3 layers)
+
+### Conclusion
+**68-70M ops/s is the practical limit of the current architecture**. Reaching System malloc's 93M ops/s would require:
+- A full redesign toward a header-free design
+- Imitating tcache (removing counters, reducing branches)
+
+Both would be needed, but the return on investment is low at this point.
 ---
-## 1. Tiny Phase 3d – Hot/Cold Split Status
+## 📁 Main Modified Files (Arena Allocator Implementation)
-### 1.1 Phase 3d-C: Hot/Cold Split (complete)
-
-- **Goal**: prioritize Hot slabs (high utilization) within a SuperSlab to reduce L1D misses and branch misses.
-- **Main changes**:
-  - `core/superslab/superslab_types.h`
-    - `hot_count / cold_count`
-    - `hot_indices[16] / cold_indices[16]`
-  - `core/box/ss_hot_cold_box.h`
-    - `ss_is_slab_hot()` – classifies used > 50% as Hot
-    - `ss_update_hot_cold_indices()` – scans active slabs and updates the index arrays
-  - `core/hakmem_tiny_superslab.c`
-    - `superslab_activate_slab()` calls `ss_update_hot_cold_indices()` when a slab is activated
-- **Perf (Random Mixed 256B, 100K ops)**:
-  - Phase 3d-B → 3d-C: **22.6M → 25.0M ops/s (+10.8%)**
-  - Phase 3c → 3d-C cumulative: **9.38M → 25.0M ops/s (+167%)**
-
-### 1.2 Phase 3d-D: Hot-first refill (failed → reverted)
-
-- **What was tried (summary)**:
-  - Changed Stage 2 of `shared_pool_acquire_slab()` in `core/hakmem_shared_pool.c` to a two-pass structure.
-    - Pass 1: scan `ss->hot_indices[]` first and acquire an UNUSED slot via CAS.
-    - Pass 2: scan `ss->cold_indices[]` as a fallback.
-  - Goal: make Stage 2 prefer Hot slabs and further reduce L1D/branch misses.
-
-- **Result**:
-  - Random Mixed 256B bench: regressed from **23.2M → 6.9M ops/s (-72%)**.
-
-- **Main causes**:
-  1. **`hot_count/cold_count` never actually build up**
-     - Because new SuperSlab allocation dominates, SuperSlabs rotate out before Hot/Cold information accumulates.
-     - As a result, `hot_count == cold_count == 0`, Pass 1/2 are skipped almost every time, and only the frequency of fallback to Stage 3 increases.
-  2. **Stage 2 is not the bottleneck in the first place**
-     - Statistics after introducing SP-SLOT:
-       - Stage1 (EMPTY reuse): about 5%
-       - Stage2 (UNUSED reuse): about 92%
-       - Stage3 (new Superslab): about 3%
-     - → Tweaking which UNUSED slot Stage 2 takes has structurally almost no effect on futex/mmap/L1 misses.
-  3. **The Shared Pool / Superslab design leaves little room for improvement**
-     - Even if the Stage 2 scan cost drops from O(number of slots) → O(hot+cold), Stage 2 accounts for only a small share of total cycles.
-     - The theoretical upper bound is at most a few percent.
-
-- **Conclusion**:
-  - The Phase 3d-D implementation has been **reverted**.
-  - `shared_pool_acquire_slab()` Stage 2 returns to the simple UNUSED scan equivalent to Phase 3d-C.
-  - The Hot/Cold indices remain candidates for reuse in a different future Box (e.g. outside the refill path, in a learning layer, or for visualization).
+- `core/box/ss_cache_box.inc:138-229` - added the SSArena allocator
+- `core/box/tls_sll_box.h:509-561` - made the recycle check optional in Release mode
+- `core/tiny_free_fast_v2.inc.h:113-148` - removed the cross-check in Release mode
+- `core/hakmem_tiny_sll_cap_box.inc:8-25` - changed C5 capacity to full capacity
+- `core/hakmem_policy.c:24-30` - min_keep tuning
+- `core/tiny_alloc_fast_sfc.inc.h:18-26` - SFC defaults tuning
 ---
-## 2. hakmem_tiny.c Box Theory Refactoring (in progress)
+## 🗃 Past Problems and Solutions (Reference)
-### 2.1 Goal
+### State before the Arena Allocator
+- **Random Mixed (5M ops)**: ~56-60M ops/s, **418 mmap calls** (26x mimalloc)
+- **Root cause**: a design mismatch in which SuperSlab = allocation unit = cache unit
+- **Problem**: at ws=256, slabs stall at 5-15% utilization → never become fully EMPTY → the LRU cache never kicks in → mmap/munmap every time
-- `core/hakmem_tiny.c` exceeded 2000 lines, hurting readability and maintainability.
-- Following Box Theory, carve out boxes by role and tidy only the layout within the translation unit, **without changing the original behavior**.
-- Parts with heavy dependencies, such as cross-thread TLS / Superslab / Shared Pool, are only **converted to `.inc`, not moved to a separate .c**.
-
-### 2.2 Splits so far (major ones only)
-
-(Note: for the actual details, see each `core/hakmem_tiny_*.inc` / `*_box.inc`)
-
-- **Phase 1 – Config / Publish Box extraction (including this run)**
-  - New files:
-    - `core/hakmem_tiny_config_box.inc` (about 200 lines)
-      - `g_tiny_class_sizes[]`
-      - `tiny_get_max_size()`
-      - Integrity counters (`g_integrity_check_*`)
-      - Debug/bench macros (`HAKMEM_TINY_BENCH_*`, `HAK_RET_ALLOC`, `HAK_STAT_FREE`, etc.)
-    - `core/hakmem_tiny_publish_box.inc` (about 400 lines)
-      - Publish/Adopt statistics (`g_pub_*`, `g_slab_publish_dbg`, etc.)
-      - Bench mailbox / partial ring (`bench_pub_*`, `slab_partial_*`)
-      - Live cap / Hot slot (`live_cap_for_class()`, `hot_slot_*`)
-      - TLS target helpers (`tiny_tls_publish_targets()`, `tiny_tls_refresh_params()`, etc.)
-  - Main file:
-    - Removed the corresponding blocks from `hakmem_tiny.c` and inserted `#include` at the same positions.
-    - The translation unit is preserved, so TLS / static function dependencies are unchanged.
-  - Effect:
-    - `hakmem_tiny.c`: **2081 lines → 1456 lines** (about -30%)
-    - Build: ✅ passes (behavior unchanged).
-
-### 2.3 Future candidates (not done, TODO)
-
-- Around the frontend / fast-cache / TID cache
-  - `tiny_self_u32()`, `tiny_self_pt()`
-  - `g_fast_cache[]`, `g_front_fc_hit/miss[]`
-- Phase 6 front gate wrapper
-  - `hak_tiny_alloc_fast_wrapper()`, `hak_tiny_free_fast_wrapper()`
-  - Surrounding debug / integrity checks
-
-**Policy**:
-- All of them take the same form, "`#include`d from hakmem_tiny.c as `.inc` rather than a separate .c",
-  so that the scope of TLS and static variables is not broken.
+### Solution via the Arena Allocator
+- Implemented an Arena allocator that treats the SuperSlab as the OS-level unit
+- mmap 418 calls → 32 calls (-92%), munmap 378 calls → 3 calls (-99%)
+- Performance 60M → 68-70M ops/s (+15%)
 ---
-## 3. SuperSlab / Shared Pool Current Status Summary
+## 📊 Architectural Correspondence with Other Allocators (Reference)
-### 3.1 SuperSlab stabilization (Phase 6-2.x)
-
-- **Main problems**:
-  - The guess loop read `magic` from unmapped regions and caused SEGV.
-  - When Tiny free could not find the Superslab, the fallback failed to recover and crashed.
-- **Main fixes**:
-  - Removed the guess loop (`hakmem_free_api.inc.h`).
-  - Made the Superslab registry's base `_Atomic uintptr_t` and unified acquire/release usage.
-  - Perform a safety check with `hak_is_memory_readable()` (mincore-based) on the fallback path only.
-- **Result**:
-  - SEGVs in Random Mixed / mid_large_mt and others are resolved.
-  - Because mincore is limited to the fallback path, its impact on the hot path is negligible.
-
-### 3.2 SharedSuperSlabPool (Phase 12 SP-SLOT Box)
-
-- **Structure**:
-  - Stage1: EMPTY slot reuse (per-class free list / lock-free freelist)
-  - Stage2: UNUSED slot reuse (`SharedSSMeta` + lock-free CAS)
-  - Stage3: new Superslab (LRU pop → mmap)
-- **Achievements**:
-  - Cut new SuperSlab allocation (mmap/munmap) calls by roughly **48%**.
-  - Escaped the "mmap every time" state.
-- **Remaining issues**:
-  - In Larson and some other workloads, `shared_pool_acquire_slab()` takes up most of the CPU time.
-  - Stage3 mutex frequency and wait time are high, and in some cases futex is ~70% of syscall time.
-  - The SS-Reuse policy for keeping warm Superslabs around longer is still weak.
+| HAKMEM | mimalloc | tcmalloc | jemalloc |
+|--------|----------|----------|----------|
+| SuperSlab (2MB) | Segment (~2MiB) | PageHeap | Extent |
+| Slab (64KB) | Page (~64KiB) | Span | Run/slab |
+| per-class freelist | pages_queue | Central freelist | bin/slab lists |
+| Arena allocator | segment cache | PageHeap | extent_avail |
 ---
-## 4. Small‑Mid / Mid‑Large – Crash Fixes and Current Status
+## 🚀 Future Possibilities (Long Term)
-### 4.1 Mid‑Large Crash FIX (2025-11-16)
+### Slab-level EMPTY Recycling (not implemented)
+- **Goal**: make slabs reusable across classes
+- **Design**: manage EMPTY slabs in a lock-free stack and assign class_idx dynamically at alloc time
+- **Expected effect**: better memory efficiency (though the performance gain is limited)
-- **Symptom**:
-  - `free(): invalid pointer` → immediate SEGV in the Mid‑Large / VM Mixed benches.
-- **Root Cause**:
-  - `classify_ptr()` did not look at the Mid‑Large `AllocHeader` and misclassified pointers as `PTR_KIND_UNKNOWN`.
-  - The free wrapper did not handle the `PTR_KIND_MID_LARGE` case.
-- **Fix**:
-  - Added an AllocHeader check to `classify_ptr()` so it returns `PTR_KIND_MID_LARGE`.
-  - Added a `PTR_KIND_MID_LARGE` case to the free wrapper, handled as HAKMEM-managed.
-- **Result**:
-  - Mid‑Large MT (8–32KB): 0 → **10.5M ops/s** (**120%** of System's 8.7M).
-  - VM Mixed: 0 → **285K ops/s** (30.4% of System's 939K).
-  - Mid‑Large crashes are resolved.
-
-### 4.2 random_mixed / Larson Crash FIX (2025-11-16)
-
-- **random_mixed**:
-  - The AllocHeader read added in the Mid‑Large fix crossed a page boundary and caused SEGV.
-  - Fix: guard the read so the header is read only when the in-page offset is at least the header size.
-  - Result: SEGV → recovered to about **1.9M ops/s**.
-
-- **Larson**:
-  - Layer1: cross-thread free was corrupting the TLS SLL → made the `owner_tid_low` cross‑thread check always ON and diverted such frees to the remote queue.
-  - Layer2: `MAX_SS_METADATA_ENTRIES` capped out at 2048 → expanded to 8192.
-  - Result: crashes and hangs are resolved (performance is still far behind System).
+### Abandoned SuperSlab (for MT, not implemented)
+- **Goal**: let other threads reclaim memory after a thread exits
+- **Design**: equivalent to mimalloc's abandoned segments
+- **When to implement**: once MT workloads need it
 ---
-## 5. TODO (short-term focus)
+## ✅ Completed Milestones
-**Tiny / Backend**
-- [ ] Design the SS-Reuse Box
-  - Organize the Superslab-level reuse strategy (handling of EMPTY slabs, lifetime of warm SS, relationship between LRU and shared_pool).
-- [ ] Strengthen observation of `shared_pool_acquire_slab()` Stage3
-  - Add counters for futex count / wait time to show which classes are spawning new Superslabs excessively.
-
-**Tiny / Frontend (light)**
-- [ ] C2/C3 Hot Ring Cache proto (Phase 21-1)
-  - Implement the Ring → SLL → Superslab hierarchy for C2/C3 only first, then evaluate the effect and complexity.
-
-**Small‑Mid / Mid‑Large**
-- [ ] Keep the Small‑Mid Box (Phase 17) code but leave it default OFF (kept as an archive of experimental results).
-- [ ] Revisit Mid‑Large / VM Mixed perf improvements after Tiny/Backend stabilizes.
-
----
-
-## 6. Links to older detailed logs
-
-This CURRENT_TASK is a digest of the most recent phases only.
-For the full historical logs and trial-and-error, see the following files:
-
-- Tiny / Frontend / Phase 23–26:
-  - `PHASE23_CAPACITY_OPTIMIZATION_RESULTS.md`
-  - `PHASE25_*`, `PHASE26_*` documents
-- SuperSlab / Shared Pool / Backend:
-  - `PHASE12_SHARED_SUPERSLAB_POOL_DESIGN.md`
-  - `PHASE12_SP_SLOT_BOX_IMPLEMENTATION_REPORT.md`
-  - `BOTTLENECK_ANALYSIS_REPORT_20251114.md`
-- Small‑Mid / Mid‑Large / Larson:
-  - `MID_LARGE_*` reports
-  - `LARSON_*` reports
-  - `P0_*`, `CRITICAL_BUG_REPORT.md`
-
-The plan is to dig these up individually as needed and discuss/implement them Box by Box.
+1. **Arena Allocator implementation** - mmap reduced by 95% ✅
+2. **Phase 27 investigation** - architectural limit confirmed ✅
+3. **Performance of 68-70M ops/s** - reached 73-76% of System malloc ✅
+**Current recommendation**: accept 68-70M ops/s as the baseline and focus on optimizing other workloads (Mid-Large, Larson, etc.).
diff --git a/Makefile b/Makefile
index 7a0715c1..1cb8b66a 100644
--- a/Makefile
+++ b/Makefile
@@ -190,12 +190,12 @@ LDFLAGS += $(EXTRA_LDFLAGS)
 # Targets
 TARGET = test_hakmem
-OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o hakmem_tiny_superslab.o hakmem_smallmid.o hakmem_smallmid_superslab.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_mid_mt.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_local_box.o core/box/free_remote_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/unified_batch_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/page_arena.o core/front/tiny_ring_cache.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/link_stubs.o core/tiny_failfast.o test_hakmem.o
+OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o
hakmem_tiny_superslab.o hakmem_smallmid.o hakmem_smallmid_superslab.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_mid_mt.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_local_box.o core/box/free_remote_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/unified_batch_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/link_stubs.o core/tiny_failfast.o test_hakmem.o OBJS = $(OBJS_BASE) # Shared library SHARED_LIB = libhakmem.so -SHARED_OBJS = hakmem_shared.o hakmem_config_shared.o hakmem_tiny_config_shared.o hakmem_ucb1_shared.o hakmem_bigcache_shared.o hakmem_pool_shared.o hakmem_l25_pool_shared.o hakmem_site_rules_shared.o hakmem_tiny_shared.o hakmem_tiny_superslab_shared.o hakmem_smallmid_shared.o core/box/superslab_expansion_box_shared.o core/box/integrity_box_shared.o core/box/mailbox_box_shared.o core/box/front_gate_box_shared.o core/box/free_local_box_shared.o core/box/free_remote_box_shared.o core/box/free_publish_box_shared.o core/box/capacity_box_shared.o core/box/carve_push_box_shared.o 
core/box/prewarm_box_shared.o core/box/bench_fast_box_shared.o core/front/tiny_ring_cache_shared.o core/front/tiny_unified_cache_shared.o tiny_sticky_shared.o tiny_remote_shared.o tiny_publish_shared.o tiny_debug_ring_shared.o hakmem_tiny_magazine_shared.o hakmem_tiny_stats_shared.o hakmem_tiny_sfc_shared.o hakmem_tiny_query_shared.o hakmem_tiny_rss_shared.o hakmem_tiny_registry_shared.o hakmem_tiny_remote_target_shared.o hakmem_tiny_bg_spill_shared.o tiny_adaptive_sizing_shared.o hakmem_mid_mt_shared.o hakmem_super_registry_shared.o hakmem_elo_shared.o hakmem_batch_shared.o hakmem_p2_shared.o hakmem_sizeclass_dist_shared.o hakmem_evo_shared.o hakmem_debug_shared.o hakmem_sys_shared.o hakmem_whale_shared.o hakmem_policy_shared.o hakmem_ace_shared.o hakmem_ace_stats_shared.o hakmem_ace_controller_shared.o hakmem_ace_metrics_shared.o hakmem_ace_ucb1_shared.o hakmem_prof_shared.o hakmem_learner_shared.o hakmem_size_hist_shared.o hakmem_learn_log_shared.o hakmem_syscall_shared.o tiny_fastcache_shared.o +SHARED_OBJS = hakmem_shared.o hakmem_config_shared.o hakmem_tiny_config_shared.o hakmem_ucb1_shared.o hakmem_bigcache_shared.o hakmem_pool_shared.o hakmem_l25_pool_shared.o hakmem_site_rules_shared.o hakmem_tiny_shared.o hakmem_tiny_superslab_shared.o hakmem_smallmid_shared.o core/box/superslab_expansion_box_shared.o core/box/integrity_box_shared.o core/box/mailbox_box_shared.o core/box/front_gate_box_shared.o core/box/free_local_box_shared.o core/box/free_remote_box_shared.o core/box/free_publish_box_shared.o core/box/capacity_box_shared.o core/box/carve_push_box_shared.o core/box/prewarm_box_shared.o core/box/bench_fast_box_shared.o core/front/tiny_unified_cache_shared.o tiny_sticky_shared.o tiny_remote_shared.o tiny_publish_shared.o tiny_debug_ring_shared.o hakmem_tiny_magazine_shared.o hakmem_tiny_stats_shared.o hakmem_tiny_sfc_shared.o hakmem_tiny_query_shared.o hakmem_tiny_rss_shared.o hakmem_tiny_registry_shared.o hakmem_tiny_remote_target_shared.o 
hakmem_tiny_bg_spill_shared.o tiny_adaptive_sizing_shared.o hakmem_mid_mt_shared.o hakmem_super_registry_shared.o hakmem_elo_shared.o hakmem_batch_shared.o hakmem_p2_shared.o hakmem_sizeclass_dist_shared.o hakmem_evo_shared.o hakmem_debug_shared.o hakmem_sys_shared.o hakmem_whale_shared.o hakmem_policy_shared.o hakmem_ace_shared.o hakmem_ace_stats_shared.o hakmem_ace_controller_shared.o hakmem_ace_metrics_shared.o hakmem_ace_ucb1_shared.o hakmem_prof_shared.o hakmem_learner_shared.o hakmem_size_hist_shared.o hakmem_learn_log_shared.o hakmem_syscall_shared.o tiny_fastcache_shared.o # Pool TLS Phase 1 (enable with POOL_TLS_PHASE1=1) ifeq ($(POOL_TLS_PHASE1),1) @@ -222,7 +222,7 @@ endif # Benchmark targets BENCH_HAKMEM = bench_allocators_hakmem BENCH_SYSTEM = bench_allocators_system -BENCH_HAKMEM_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o hakmem_tiny_superslab.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_mid_mt.o hakmem_super_registry.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_local_box.o core/box/free_remote_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/unified_batch_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o 
core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/page_arena.o core/front/tiny_ring_cache.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/link_stubs.o core/tiny_failfast.o bench_allocators_hakmem.o +BENCH_HAKMEM_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o hakmem_tiny_superslab.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_mid_mt.o hakmem_super_registry.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_local_box.o core/box/free_remote_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/unified_batch_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/link_stubs.o core/tiny_failfast.o bench_allocators_hakmem.o BENCH_HAKMEM_OBJS = $(BENCH_HAKMEM_OBJS_BASE) ifeq ($(POOL_TLS_PHASE1),1) BENCH_HAKMEM_OBJS += pool_tls.o pool_refill.o pool_tls_arena.o pool_tls_registry.o pool_tls_remote.o @@ -399,7 +399,7 @@ test-box-refactor: box-refactor 
./larson_hakmem 10 8 128 1024 1 12345 4 # Phase 4: Tiny Pool benchmarks (properly linked with hakmem) -TINY_BENCH_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o hakmem_tiny_superslab.o hakmem_smallmid.o hakmem_smallmid_superslab.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/free_local_box.o core/box/free_remote_box.o core/box/free_publish_box.o core/box/capacity_box.o core/box/carve_push_box.o core/box/unified_batch_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/tiny_sizeclass_hist_box.o core/box/pagefault_telemetry_box.o core/page_arena.o core/front/tiny_ring_cache.o core/front/tiny_unified_cache.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_mid_mt.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/tiny_alloc_fast_push.o core/link_stubs.o core/tiny_failfast.o +TINY_BENCH_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o hakmem_tiny_superslab.o hakmem_smallmid.o hakmem_smallmid_superslab.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o 
core/box/free_local_box.o core/box/free_remote_box.o core/box/free_publish_box.o core/box/capacity_box.o core/box/carve_push_box.o core/box/unified_batch_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/tiny_sizeclass_hist_box.o core/box/pagefault_telemetry_box.o core/page_arena.o core/front/tiny_unified_cache.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_mid_mt.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/tiny_alloc_fast_push.o core/link_stubs.o core/tiny_failfast.o TINY_BENCH_OBJS = $(TINY_BENCH_OBJS_BASE) ifeq ($(POOL_TLS_PHASE1),1) TINY_BENCH_OBJS += pool_tls.o pool_refill.o core/pool_tls_arena.o pool_tls_registry.o pool_tls_remote.o diff --git a/core/box/carve_push_box.d b/core/box/carve_push_box.d index 07d62e34..9dd87a60 100644 --- a/core/box/carve_push_box.d +++ b/core/box/carve_push_box.d @@ -5,20 +5,25 @@ core/box/carve_push_box.o: core/box/carve_push_box.c \ core/box/../superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h \ core/box/../superslab/superslab_inline.h \ - core/box/../superslab/superslab_types.h core/box/../tiny_debug_ring.h \ - core/box/../tiny_remote.h core/box/../hakmem_tiny_superslab_constants.h \ + core/box/../superslab/superslab_types.h \ + core/box/../superslab/../tiny_box_geometry.h \ + core/box/../superslab/../hakmem_tiny_superslab_constants.h \ + core/box/../superslab/../hakmem_tiny_config.h \ 
+ core/box/../tiny_debug_ring.h core/box/../tiny_remote.h \
+ core/box/../hakmem_tiny_superslab_constants.h \
  core/box/../hakmem_tiny_config.h core/box/../hakmem_tiny_superslab.h \
  core/box/../hakmem_tiny_integrity.h core/box/../hakmem_tiny.h \
  core/box/carve_push_box.h core/box/capacity_box.h core/box/tls_sll_box.h \
  core/box/../hakmem_build_flags.h core/box/../tiny_remote.h \
  core/box/../tiny_region_id.h core/box/../tiny_box_geometry.h \
- core/box/../hakmem_tiny_config.h core/box/../ptr_track.h \
- core/box/../hakmem_super_registry.h core/box/../ptr_track.h \
- core/box/../ptr_trace.h core/box/../box/tiny_next_ptr_box.h \
- core/hakmem_tiny_config.h core/tiny_nextptr.h core/hakmem_build_flags.h \
- core/tiny_region_id.h core/superslab/superslab_inline.h \
- core/box/../tiny_debug_ring.h core/box/../superslab/superslab_inline.h \
- core/box/../tiny_refill_opt.h core/box/../tiny_region_id.h \
+ core/box/../ptr_track.h core/box/../hakmem_super_registry.h \
+ core/box/../ptr_track.h core/box/../ptr_trace.h \
+ core/box/../box/tiny_next_ptr_box.h core/hakmem_tiny_config.h \
+ core/tiny_nextptr.h core/hakmem_build_flags.h core/tiny_region_id.h \
+ core/superslab/superslab_inline.h core/box/../tiny_debug_ring.h \
+ core/box/../superslab/superslab_inline.h core/box/../tiny_refill_opt.h \
+ core/box/../tiny_region_id.h core/box/../box/slab_freelist_atomic.h \
+ core/box/../box/../superslab/superslab_types.h \
  core/box/../box/tls_sll_box.h core/box/../tiny_box_geometry.h
 core/box/../hakmem_tiny.h:
 core/box/../hakmem_build_flags.h:
@@ -30,6 +35,9 @@ core/box/../superslab/superslab_types.h:
 core/hakmem_tiny_superslab_constants.h:
 core/box/../superslab/superslab_inline.h:
 core/box/../superslab/superslab_types.h:
+core/box/../superslab/../tiny_box_geometry.h:
+core/box/../superslab/../hakmem_tiny_superslab_constants.h:
+core/box/../superslab/../hakmem_tiny_config.h:
 core/box/../tiny_debug_ring.h:
 core/box/../tiny_remote.h:
 core/box/../hakmem_tiny_superslab_constants.h:
@@ -44,7 +52,6 @@ core/box/../hakmem_build_flags.h:
 core/box/../tiny_remote.h:
 core/box/../tiny_region_id.h:
 core/box/../tiny_box_geometry.h:
-core/box/../hakmem_tiny_config.h:
 core/box/../ptr_track.h:
 core/box/../hakmem_super_registry.h:
 core/box/../ptr_track.h:
@@ -59,5 +66,7 @@ core/box/../tiny_debug_ring.h:
 core/box/../superslab/superslab_inline.h:
 core/box/../tiny_refill_opt.h:
 core/box/../tiny_region_id.h:
+core/box/../box/slab_freelist_atomic.h:
+core/box/../box/../superslab/superslab_types.h:
 core/box/../box/tls_sll_box.h:
 core/box/../tiny_box_geometry.h:
diff --git a/core/box/free_local_box.d b/core/box/free_local_box.d
index 6a43c382..8e0dfc50 100644
--- a/core/box/free_local_box.d
+++ b/core/box/free_local_box.d
@@ -2,12 +2,15 @@ core/box/free_local_box.o: core/box/free_local_box.c \
  core/box/free_local_box.h core/hakmem_tiny_superslab.h \
  core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \
  core/superslab/superslab_inline.h core/superslab/superslab_types.h \
- core/tiny_debug_ring.h core/hakmem_build_flags.h core/tiny_remote.h \
+ core/superslab/../tiny_box_geometry.h \
+ core/superslab/../hakmem_tiny_superslab_constants.h \
+ core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \
+ core/hakmem_build_flags.h core/tiny_remote.h \
  core/hakmem_tiny_superslab_constants.h core/box/free_publish_box.h \
  core/hakmem_tiny.h core/hakmem_trace.h core/hakmem_tiny_mini_mag.h \
  core/box/tiny_next_ptr_box.h core/hakmem_tiny_config.h \
  core/tiny_nextptr.h core/tiny_region_id.h core/tiny_box_geometry.h \
- core/hakmem_tiny_config.h core/ptr_track.h core/hakmem_super_registry.h \
+ core/ptr_track.h core/hakmem_super_registry.h \
  core/hakmem_tiny_superslab.h core/box/ss_hot_cold_box.h \
  core/box/../superslab/superslab_types.h core/tiny_region_id.h
 core/box/free_local_box.h:
@@ -16,6 +19,9 @@ core/superslab/superslab_types.h:
 core/hakmem_tiny_superslab_constants.h:
 core/superslab/superslab_inline.h:
 core/superslab/superslab_types.h:
+core/superslab/../tiny_box_geometry.h:
+core/superslab/../hakmem_tiny_superslab_constants.h:
+core/superslab/../hakmem_tiny_config.h:
 core/tiny_debug_ring.h:
 core/hakmem_build_flags.h:
 core/tiny_remote.h:
@@ -29,7 +35,6 @@ core/hakmem_tiny_config.h:
 core/tiny_nextptr.h:
 core/tiny_region_id.h:
 core/tiny_box_geometry.h:
-core/hakmem_tiny_config.h:
 core/ptr_track.h:
 core/hakmem_super_registry.h:
 core/hakmem_tiny_superslab.h:
diff --git a/core/box/free_publish_box.d b/core/box/free_publish_box.d
index 31859bee..564704e0 100644
--- a/core/box/free_publish_box.d
+++ b/core/box/free_publish_box.d
@@ -2,7 +2,10 @@ core/box/free_publish_box.o: core/box/free_publish_box.c \
  core/box/free_publish_box.h core/hakmem_tiny_superslab.h \
  core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \
  core/superslab/superslab_inline.h core/superslab/superslab_types.h \
- core/tiny_debug_ring.h core/hakmem_build_flags.h core/tiny_remote.h \
+ core/superslab/../tiny_box_geometry.h \
+ core/superslab/../hakmem_tiny_superslab_constants.h \
+ core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \
+ core/hakmem_build_flags.h core/tiny_remote.h \
  core/hakmem_tiny_superslab_constants.h core/hakmem_tiny.h \
  core/hakmem_trace.h core/hakmem_tiny_mini_mag.h core/tiny_route.h \
  core/tiny_ready.h core/hakmem_tiny.h core/box/mailbox_box.h
@@ -12,6 +15,9 @@ core/superslab/superslab_types.h:
 core/hakmem_tiny_superslab_constants.h:
 core/superslab/superslab_inline.h:
 core/superslab/superslab_types.h:
+core/superslab/../tiny_box_geometry.h:
+core/superslab/../hakmem_tiny_superslab_constants.h:
+core/superslab/../hakmem_tiny_config.h:
 core/tiny_debug_ring.h:
 core/hakmem_build_flags.h:
 core/tiny_remote.h:
diff --git a/core/box/free_remote_box.d b/core/box/free_remote_box.d
index 1102b987..7d3bfa23 100644
--- a/core/box/free_remote_box.d
+++ b/core/box/free_remote_box.d
@@ -2,7 +2,10 @@ core/box/free_remote_box.o: core/box/free_remote_box.c \
  core/box/free_remote_box.h
core/hakmem_tiny_superslab.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/superslab/superslab_inline.h core/superslab/superslab_types.h \ - core/tiny_debug_ring.h core/hakmem_build_flags.h core/tiny_remote.h \ + core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ + core/hakmem_build_flags.h core/tiny_remote.h \ core/hakmem_tiny_superslab_constants.h core/box/free_publish_box.h \ core/hakmem_tiny.h core/hakmem_trace.h core/hakmem_tiny_mini_mag.h \ core/hakmem_tiny_integrity.h core/hakmem_tiny.h @@ -12,6 +15,9 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/hakmem_build_flags.h: core/tiny_remote.h: diff --git a/core/box/front_gate_box.d b/core/box/front_gate_box.d index 7bf8a500..453c29c7 100644 --- a/core/box/front_gate_box.d +++ b/core/box/front_gate_box.d @@ -8,8 +8,8 @@ core/box/front_gate_box.o: core/box/front_gate_box.c \ core/ptr_track.h core/hakmem_super_registry.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ - core/tiny_remote.h core/box/tls_sll_box.h \ + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/tiny_debug_ring.h core/tiny_remote.h core/box/tls_sll_box.h \ core/box/../hakmem_tiny_config.h core/box/../hakmem_build_flags.h \ core/box/../tiny_remote.h core/box/../tiny_region_id.h \ core/box/../hakmem_tiny_integrity.h core/box/../hakmem_tiny.h \ @@ -37,6 +37,7 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: 
core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/box/tls_sll_box.h: diff --git a/core/box/front_gate_classifier.d b/core/box/front_gate_classifier.d index 01fb4bb6..2db2743d 100644 --- a/core/box/front_gate_classifier.d +++ b/core/box/front_gate_classifier.d @@ -7,14 +7,15 @@ core/box/front_gate_classifier.o: core/box/front_gate_classifier.c \ core/box/../superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h \ core/box/../superslab/superslab_inline.h \ - core/box/../superslab/superslab_types.h core/box/../tiny_debug_ring.h \ - core/box/../tiny_remote.h core/box/../hakmem_tiny_superslab.h \ + core/box/../superslab/superslab_types.h \ + core/box/../superslab/../tiny_box_geometry.h \ + core/box/../tiny_debug_ring.h core/box/../tiny_remote.h \ + core/box/../hakmem_tiny_superslab.h \ core/box/../superslab/superslab_inline.h \ core/box/../hakmem_build_flags.h core/box/../hakmem_internal.h \ core/box/../hakmem.h core/box/../hakmem_config.h \ core/box/../hakmem_features.h core/box/../hakmem_sys.h \ - core/box/../hakmem_whale.h core/box/../hakmem_tiny_config.h \ - core/box/../pool_tls_registry.h + core/box/../hakmem_whale.h core/box/../hakmem_tiny_config.h core/box/front_gate_classifier.h: core/box/../tiny_region_id.h: core/box/../hakmem_build_flags.h: @@ -28,6 +29,7 @@ core/box/../superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/box/../superslab/superslab_inline.h: core/box/../superslab/superslab_types.h: +core/box/../superslab/../tiny_box_geometry.h: core/box/../tiny_debug_ring.h: core/box/../tiny_remote.h: core/box/../hakmem_tiny_superslab.h: @@ -40,4 +42,3 @@ core/box/../hakmem_features.h: core/box/../hakmem_sys.h: core/box/../hakmem_whale.h: core/box/../hakmem_tiny_config.h: -core/box/../pool_tls_registry.h: diff --git a/core/box/mailbox_box.d b/core/box/mailbox_box.d index 65b60cdd..ee5eec23 100644 --- a/core/box/mailbox_box.d +++ b/core/box/mailbox_box.d 
@@ -1,7 +1,9 @@ core/box/mailbox_box.o: core/box/mailbox_box.c core/box/mailbox_box.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/hakmem_build_flags.h core/tiny_remote.h \ core/hakmem_tiny_superslab_constants.h core/hakmem_tiny.h \ core/hakmem_trace.h core/hakmem_tiny_mini_mag.h core/tiny_debug_ring.h @@ -11,6 +13,9 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/hakmem_build_flags.h: core/tiny_remote.h: diff --git a/core/box/prewarm_box.d b/core/box/prewarm_box.d index bff86065..f2b9bf1d 100644 --- a/core/box/prewarm_box.d +++ b/core/box/prewarm_box.d @@ -5,8 +5,12 @@ core/box/prewarm_box.o: core/box/prewarm_box.c core/box/../hakmem_tiny.h \ core/box/../superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h \ core/box/../superslab/superslab_inline.h \ - core/box/../superslab/superslab_types.h core/box/../tiny_debug_ring.h \ - core/box/../tiny_remote.h core/box/../hakmem_tiny_superslab_constants.h \ + core/box/../superslab/superslab_types.h \ + core/box/../superslab/../tiny_box_geometry.h \ + core/box/../superslab/../hakmem_tiny_superslab_constants.h \ + core/box/../superslab/../hakmem_tiny_config.h \ + core/box/../tiny_debug_ring.h core/box/../tiny_remote.h \ + core/box/../hakmem_tiny_superslab_constants.h \ core/box/../hakmem_tiny_config.h core/box/../hakmem_tiny_superslab.h \ core/box/../hakmem_tiny_integrity.h core/box/../hakmem_tiny.h \ 
core/box/prewarm_box.h core/box/capacity_box.h core/box/carve_push_box.h @@ -20,6 +24,9 @@ core/box/../superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/box/../superslab/superslab_inline.h: core/box/../superslab/superslab_types.h: +core/box/../superslab/../tiny_box_geometry.h: +core/box/../superslab/../hakmem_tiny_superslab_constants.h: +core/box/../superslab/../hakmem_tiny_config.h: core/box/../tiny_debug_ring.h: core/box/../tiny_remote.h: core/box/../hakmem_tiny_superslab_constants.h: diff --git a/core/box/superslab_expansion_box.d b/core/box/superslab_expansion_box.d index 9ba96768..9568677c 100644 --- a/core/box/superslab_expansion_box.d +++ b/core/box/superslab_expansion_box.d @@ -5,9 +5,12 @@ core/box/superslab_expansion_box.o: core/box/superslab_expansion_box.c \ core/box/../hakmem_tiny_superslab.h \ core/box/../superslab/superslab_types.h \ core/box/../superslab/superslab_inline.h \ - core/box/../superslab/superslab_types.h core/box/../tiny_debug_ring.h \ - core/box/../hakmem_build_flags.h core/box/../tiny_remote.h \ - core/box/../hakmem_tiny_superslab_constants.h \ + core/box/../superslab/superslab_types.h \ + core/box/../superslab/../tiny_box_geometry.h \ + core/box/../superslab/../hakmem_tiny_superslab_constants.h \ + core/box/../superslab/../hakmem_tiny_config.h \ + core/box/../tiny_debug_ring.h core/box/../hakmem_build_flags.h \ + core/box/../tiny_remote.h core/box/../hakmem_tiny_superslab_constants.h \ core/box/../hakmem_tiny_superslab.h \ core/box/../hakmem_tiny_superslab_constants.h core/box/superslab_expansion_box.h: @@ -18,6 +21,9 @@ core/box/../hakmem_tiny_superslab.h: core/box/../superslab/superslab_types.h: core/box/../superslab/superslab_inline.h: core/box/../superslab/superslab_types.h: +core/box/../superslab/../tiny_box_geometry.h: +core/box/../superslab/../hakmem_tiny_superslab_constants.h: +core/box/../superslab/../hakmem_tiny_config.h: core/box/../tiny_debug_ring.h: core/box/../hakmem_build_flags.h: 
core/box/../tiny_remote.h: diff --git a/core/box/unified_batch_box.d b/core/box/unified_batch_box.d index 9e30bef9..76145b72 100644 --- a/core/box/unified_batch_box.d +++ b/core/box/unified_batch_box.d @@ -13,6 +13,7 @@ core/box/unified_batch_box.o: core/box/unified_batch_box.c \ core/hakmem_tiny_superslab_constants.h \ core/box/../box/../superslab/superslab_inline.h \ core/box/../box/../superslab/superslab_types.h \ + core/box/../box/../superslab/../tiny_box_geometry.h \ core/box/../box/../tiny_debug_ring.h core/box/../box/../tiny_remote.h \ core/box/../box/../hakmem_tiny_integrity.h \ core/box/../box/../hakmem_tiny.h core/box/../box/../hakmem_trace.h \ @@ -40,6 +41,7 @@ core/box/../box/../superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/box/../box/../superslab/superslab_inline.h: core/box/../box/../superslab/superslab_types.h: +core/box/../box/../superslab/../tiny_box_geometry.h: core/box/../box/../tiny_debug_ring.h: core/box/../box/../tiny_remote.h: core/box/../box/../hakmem_tiny_integrity.h: diff --git a/core/front/tiny_front_c23.h b/core/front/tiny_front_c23.h deleted file mode 100644 index dd390b33..00000000 --- a/core/front/tiny_front_c23.h +++ /dev/null @@ -1,157 +0,0 @@ -// tiny_front_c23.h - Ultra-Simple Front Path for C2/C3 (Phase B) -// Purpose: Bypass SFC/SLL/Magazine complexity for 128B/256B allocations -// Target: 15-20M ops/s (vs current 8-9M ops/s) -// -// Architecture: -// - C2/C3 only (class_idx 2 or 3) -// - Direct FastCache access (no SLL/Magazine overhead) -// - Direct SuperSlab refill (ss_refill_fc_fill) -// - ENV-gated: HAKMEM_TINY_FRONT_C23_SIMPLE=1 -// -// Performance Strategy: -// - Minimize layers: FC → SS (2 layers instead of 5+) -// - Minimize branches: ENV check cached in TLS -// - Minimize overhead: No stats, no logging in hot path -// -// Box Theory Compliance: -// - Clear boundary: Front ← Backend (ss_refill_fc_fill) -// - Safe fallback: NULL return → caller handles slow path -// - Header preservation: BASE 
pointers only, HAK_RET_ALLOC at caller - -#ifndef TINY_FRONT_C23_H -#define TINY_FRONT_C23_H - -#include -#include -#include -#include -#include "../hakmem_build_flags.h" - -// Forward declarations (functions from other modules) -// These are declared in hakmem_tiny_fastcache.inc.h and refill/ss_refill_fc.h -extern void* fastcache_pop(int class_idx); -extern int fastcache_push(int class_idx, void* ptr); -extern int ss_refill_fc_fill(int class_idx, int want); - -// ENV-gated enable/disable (TLS cached for zero overhead after first check) -static inline int tiny_front_c23_enabled(void) { - static __thread int cached = -1; - if (__builtin_expect(cached == -1, 0)) { - const char* env = getenv("HAKMEM_TINY_FRONT_C23_SIMPLE"); - cached = (env && atoi(env) == 1) ? 1 : 0; - if (cached) { - fprintf(stderr, "[TINY_FRONT_C23] Enabled for C2/C3 (128B/256B)\n"); - } - } - return cached; -} - -// Refill target: 64 blocks (optimized via A/B testing) -// A/B Results (100K iterations): -// 128B: refill=64 → 9.55M ops/s (+15.5% vs baseline 8.27M) -// 256B: refill=64 → 8.47M ops/s (+7.2% vs baseline 7.90M) -// 256B: refill=32 → 8.61M ops/s (+9.0%, slightly better for 256B) -// Decision: refill=64 for balanced performance across C2/C3 -static inline int tiny_front_c23_refill_target(int class_idx) { - (void)class_idx; - static __thread int target = -1; - if (__builtin_expect(target == -1, 0)) { - const char* env = getenv("HAKMEM_TINY_FRONT_C23_REFILL"); - target = (env && *env) ? atoi(env) : 64; // Default: 64 (A/B optimized) - if (target <= 0) target = 64; - if (target > 128) target = 128; // Cap at 128 to avoid excessive latency - } - return target; -} - -// Ultra-simple alloc for C2/C3 -// Returns: BASE pointer or NULL -// -// Flow: -// 1. Try FastCache pop (L1, ultra-fast array access) -// 2. If miss, call ss_refill_fc_fill (SuperSlab → FC direct, bypass SLL) -// 3. Try FastCache pop again (should succeed after refill) -// 4. 
Return NULL if all failed (caller handles slow path) -// -// Contract: -// - Input: size (64-1024B), class_idx (2 or 3) -// - Output: BASE pointer (header at ptr-1 for C2/C3) -// - Caller: Must call HAK_RET_ALLOC(class_idx, ptr) to convert BASE → USER -// - Safety: NULL checks, class_idx bounds checks, fallback to slow path -// -// Performance: -// - Hot path (FC hit): ~3-5 instructions (array[top--]) -// - Cold path (FC miss): ~20-50 instructions (ss_refill_fc_fill + retry) -// - Expected hit rate: 90-95% (based on Phase 7 results) -static inline void* tiny_front_c23_alloc(size_t size, int class_idx) { - // Safety: Bounds check (should never fail, but defense-in-depth) - if (__builtin_expect(class_idx < 2 || class_idx > 3, 0)) { - return NULL; // Not C2/C3, caller should use generic path - } - - (void)size; // Unused, class_idx already determined by caller - - // Step 1: Try FastCache pop (L1, ultra-fast) - void* ptr = fastcache_pop(class_idx); - if (__builtin_expect(ptr != NULL, 1)) { - // FastCache hit! Return BASE pointer (caller will apply HAK_RET_ALLOC) - return ptr; - } - - // Step 2: FastCache miss → Refill from SuperSlab - int want = tiny_front_c23_refill_target(class_idx); - int refilled = ss_refill_fc_fill(class_idx, want); - - if (__builtin_expect(refilled <= 0, 0)) { - // Refill failed (OOM or capacity exhausted) - return NULL; // Caller will try slow path - } - - // Step 3: Retry FastCache pop (should succeed now) - ptr = fastcache_pop(class_idx); - if (__builtin_expect(ptr != NULL, 1)) { - // Success! 
Return BASE pointer - return ptr; - } - - // Step 4: Still NULL (rare, indicates FC capacity issue or race) - // Fallback: Let caller try slow path - return NULL; -} - -// Performance Notes: -// -// Expected improvement over generic path: -// - Generic: FC → SLL → Magazine → Backend (4-5 layers) -// - C23: FC → SS (2 layers) -// - Reduction: -50-60% instructions in refill path -// -// Expected latency: -// - Hot path (FC hit): 3-5 instructions (1-2 cycles) -// - Cold path (refill): 20-50 instructions (10-20 cycles) -// - vs Generic cold: 50-100+ instructions (25-50 cycles) -// -// Memory impact: -// - Zero additional memory (reuses existing FastCache) -// - No new TLS state (uses existing ss_refill_fc_fill backend) -// -// Integration Notes: -// -// Usage (from tiny_alloc_fast.inc.h): -// if (tiny_front_c23_enabled() && (class_idx == 2 || class_idx == 3)) { -// void* ptr = tiny_front_c23_alloc(size, class_idx); -// if (ptr) return ptr; // Success via C23 fast path -// // Fall through to existing path if C23 path failed -// } -// -// ENV Controls: -// HAKMEM_TINY_FRONT_C23_SIMPLE=1 - Enable C23 fast path -// HAKMEM_TINY_FRONT_C23_REFILL=N - Set refill target (default: 16) -// -// A/B Testing: -// export HAKMEM_TINY_FRONT_C23_SIMPLE=1 -// export HAKMEM_TINY_FRONT_C23_REFILL=16 # Conservative -// export HAKMEM_TINY_FRONT_C23_REFILL=32 # Balanced -// export HAKMEM_TINY_FRONT_C23_REFILL=64 # Aggressive - -#endif // TINY_FRONT_C23_H diff --git a/core/front/tiny_ring_cache.c b/core/front/tiny_ring_cache.c deleted file mode 100644 index 02cfd019..00000000 --- a/core/front/tiny_ring_cache.c +++ /dev/null @@ -1,212 +0,0 @@ -// tiny_ring_cache.c - Phase 21-1: Ring cache implementation -#include "tiny_ring_cache.h" -#include "../box/tls_sll_box.h" // For tls_sll_pop/push (Phase 21-1-C refill) -#include -#include - -// ============================================================================ -// TLS Variables (defined here, extern in header) -// 
============================================================================ - -__thread TinyRingCache g_ring_cache_c2 = {NULL, 0, 0, 0, 0}; -__thread TinyRingCache g_ring_cache_c3 = {NULL, 0, 0, 0, 0}; -__thread TinyRingCache g_ring_cache_c5 = {NULL, 0, 0, 0, 0}; - -// ============================================================================ -// Metrics (Phase 21-1-E, optional for Phase 21-1-C) -// ============================================================================ - -#if !HAKMEM_BUILD_RELEASE -__thread uint64_t g_ring_cache_hit[8] = {0}; -__thread uint64_t g_ring_cache_miss[8] = {0}; -__thread uint64_t g_ring_cache_push[8] = {0}; -__thread uint64_t g_ring_cache_full[8] = {0}; -__thread uint64_t g_ring_cache_refill[8] = {0}; -#endif - -// ============================================================================ -// Init (called at thread start, from hakmem_tiny.c) -// ============================================================================ - -void ring_cache_init(void) { - if (!ring_cache_enabled()) return; - - // C2 init - size_t cap_c2 = ring_capacity_c2(); - g_ring_cache_c2.slots = (void**)calloc(cap_c2, sizeof(void*)); - if (!g_ring_cache_c2.slots) { -#if !HAKMEM_BUILD_RELEASE - fprintf(stderr, "[Ring-INIT] Failed to allocate C2 ring (%zu slots)\n", cap_c2); - fflush(stderr); -#endif - return; - } - g_ring_cache_c2.capacity = (uint16_t)cap_c2; - g_ring_cache_c2.mask = (uint16_t)(cap_c2 - 1); - g_ring_cache_c2.head = 0; - g_ring_cache_c2.tail = 0; - - // C3 init - size_t cap_c3 = ring_capacity_c3(); - g_ring_cache_c3.slots = (void**)calloc(cap_c3, sizeof(void*)); - if (!g_ring_cache_c3.slots) { -#if !HAKMEM_BUILD_RELEASE - fprintf(stderr, "[Ring-INIT] Failed to allocate C3 ring (%zu slots)\n", cap_c3); - fflush(stderr); -#endif - // Free C2 if C3 failed - free(g_ring_cache_c2.slots); - g_ring_cache_c2.slots = NULL; - return; - } - g_ring_cache_c3.capacity = (uint16_t)cap_c3; - g_ring_cache_c3.mask = (uint16_t)(cap_c3 - 1); - 
g_ring_cache_c3.head = 0; - g_ring_cache_c3.tail = 0; - - // C5 init - size_t cap_c5 = ring_capacity_c5(); - g_ring_cache_c5.slots = (void**)calloc(cap_c5, sizeof(void*)); - if (!g_ring_cache_c5.slots) { -#if !HAKMEM_BUILD_RELEASE - fprintf(stderr, "[Ring-INIT] Failed to allocate C5 ring (%zu slots)\n", cap_c5); - fflush(stderr); -#endif - // Free C2 and C3 if C5 failed - free(g_ring_cache_c2.slots); - g_ring_cache_c2.slots = NULL; - free(g_ring_cache_c3.slots); - g_ring_cache_c3.slots = NULL; - return; - } - g_ring_cache_c5.capacity = (uint16_t)cap_c5; - g_ring_cache_c5.mask = (uint16_t)(cap_c5 - 1); - g_ring_cache_c5.head = 0; - g_ring_cache_c5.tail = 0; - -#if !HAKMEM_BUILD_RELEASE - fprintf(stderr, "[Ring-INIT] C2=%zu slots (%zu bytes), C3=%zu slots (%zu bytes), C5=%zu slots (%zu bytes)\n", - cap_c2, cap_c2 * sizeof(void*), - cap_c3, cap_c3 * sizeof(void*), - cap_c5, cap_c5 * sizeof(void*)); - fflush(stderr); -#endif -} - -// ============================================================================ -// Shutdown (called at thread exit, optional) -// ============================================================================ - -void ring_cache_shutdown(void) { - if (!ring_cache_enabled()) return; - - // Drain rings to TLS SLL before shutdown (prevent leak) - // TODO: Implement drain logic in Phase 21-1-C - - // Free ring buffers - if (g_ring_cache_c2.slots) { - free(g_ring_cache_c2.slots); - g_ring_cache_c2.slots = NULL; - } - - if (g_ring_cache_c3.slots) { - free(g_ring_cache_c3.slots); - g_ring_cache_c3.slots = NULL; - } - - if (g_ring_cache_c5.slots) { - free(g_ring_cache_c5.slots); - g_ring_cache_c5.slots = NULL; - } - -#if !HAKMEM_BUILD_RELEASE - fprintf(stderr, "[Ring-SHUTDOWN] C2/C3/C5 rings freed\n"); - fflush(stderr); -#endif -} - -// ============================================================================ -// Refill from TLS SLL (cascade, Phase 21-1-C) -// ============================================================================ - -// Refill 
ring from TLS SLL (one-way cascade: SLL → Ring) -// Returns: number of blocks transferred -int ring_refill_from_sll(int class_idx, int target_count) { - if (!ring_cascade_enabled()) return 0; - if (class_idx != 2 && class_idx != 3) return 0; - - int transferred = 0; - - while (transferred < target_count) { - void* ptr = NULL; - - // Pop from TLS SLL - if (!tls_sll_pop(class_idx, &ptr)) { - break; // SLL empty - } - - // Push to Ring - if (!ring_cache_push(class_idx, ptr)) { - // Ring full, push back to SLL - tls_sll_push(class_idx, ptr, (uint32_t)-1); // Unlimited capacity - break; - } - - transferred++; - } - -#if !HAKMEM_BUILD_RELEASE - if (transferred > 0) { - g_ring_cache_refill[class_idx]++; // Count refill operations - fprintf(stderr, "[Ring-REFILL] C%d: %d blocks transferred from SLL to Ring\n", - class_idx, transferred); - fflush(stderr); - } -#endif - - return transferred; -} - -// ============================================================================ -// Stats (Phase 21-1-C/E metrics) -// ============================================================================ - -void ring_cache_print_stats(void) { - if (!ring_cache_enabled()) return; - -#if !HAKMEM_BUILD_RELEASE - // Current occupancy - uint16_t c2_count = (g_ring_cache_c2.tail >= g_ring_cache_c2.head) - ? (g_ring_cache_c2.tail - g_ring_cache_c2.head) - : (g_ring_cache_c2.capacity - g_ring_cache_c2.head + g_ring_cache_c2.tail); - - uint16_t c3_count = (g_ring_cache_c3.tail >= g_ring_cache_c3.head) - ? 
(g_ring_cache_c3.tail - g_ring_cache_c3.head) - : (g_ring_cache_c3.capacity - g_ring_cache_c3.head + g_ring_cache_c3.tail); - - fprintf(stderr, "\n[Ring-STATS] Ring Cache Metrics (C2/C3):\n"); - fprintf(stderr, " C2: %u/%u slots occupied\n", c2_count, g_ring_cache_c2.capacity); - fprintf(stderr, " C3: %u/%u slots occupied\n", c3_count, g_ring_cache_c3.capacity); - - // Metrics summary (C2/C3 only) - for (int c = 2; c <= 3; c++) { - uint64_t total_allocs = g_ring_cache_hit[c] + g_ring_cache_miss[c]; - uint64_t total_frees = g_ring_cache_push[c] + g_ring_cache_full[c]; - double hit_rate = (total_allocs > 0) ? (100.0 * g_ring_cache_hit[c] / total_allocs) : 0.0; - double full_rate = (total_frees > 0) ? (100.0 * g_ring_cache_full[c] / total_frees) : 0.0; - - if (total_allocs > 0 || total_frees > 0) { - fprintf(stderr, " C%d: hit=%llu miss=%llu (%.1f%% hit), push=%llu full=%llu (%.1f%% full), refill=%llu\n", - c, - (unsigned long long)g_ring_cache_hit[c], - (unsigned long long)g_ring_cache_miss[c], - hit_rate, - (unsigned long long)g_ring_cache_push[c], - (unsigned long long)g_ring_cache_full[c], - full_rate, - (unsigned long long)g_ring_cache_refill[c]); - } - } - fflush(stderr); -#endif -} diff --git a/core/front/tiny_ring_cache.d b/core/front/tiny_ring_cache.d deleted file mode 100644 index cc3079d3..00000000 --- a/core/front/tiny_ring_cache.d +++ /dev/null @@ -1,61 +0,0 @@ -core/front/tiny_ring_cache.o: core/front/tiny_ring_cache.c \ - core/front/tiny_ring_cache.h core/front/../hakmem_build_flags.h \ - core/front/../box/tls_sll_box.h \ - core/front/../box/../hakmem_tiny_config.h \ - core/front/../box/../hakmem_build_flags.h \ - core/front/../box/../tiny_remote.h core/front/../box/../tiny_region_id.h \ - core/front/../box/../hakmem_build_flags.h \ - core/front/../box/../tiny_box_geometry.h \ - core/front/../box/../hakmem_tiny_superslab_constants.h \ - core/front/../box/../hakmem_tiny_config.h \ - core/front/../box/../ptr_track.h \ - 
core/front/../box/../hakmem_super_registry.h \ - core/front/../box/../hakmem_tiny_superslab.h \ - core/front/../box/../superslab/superslab_types.h \ - core/hakmem_tiny_superslab_constants.h \ - core/front/../box/../superslab/superslab_inline.h \ - core/front/../box/../superslab/superslab_types.h \ - core/front/../box/../tiny_debug_ring.h \ - core/front/../box/../tiny_remote.h \ - core/front/../box/../hakmem_tiny_integrity.h \ - core/front/../box/../hakmem_tiny.h core/front/../box/../hakmem_trace.h \ - core/front/../box/../hakmem_tiny_mini_mag.h \ - core/front/../box/../ptr_track.h core/front/../box/../ptr_trace.h \ - core/front/../box/../box/tiny_next_ptr_box.h core/hakmem_tiny_config.h \ - core/tiny_nextptr.h core/hakmem_build_flags.h core/tiny_region_id.h \ - core/superslab/superslab_inline.h core/front/../box/../tiny_debug_ring.h \ - core/front/../box/../superslab/superslab_inline.h -core/front/tiny_ring_cache.h: -core/front/../hakmem_build_flags.h: -core/front/../box/tls_sll_box.h: -core/front/../box/../hakmem_tiny_config.h: -core/front/../box/../hakmem_build_flags.h: -core/front/../box/../tiny_remote.h: -core/front/../box/../tiny_region_id.h: -core/front/../box/../hakmem_build_flags.h: -core/front/../box/../tiny_box_geometry.h: -core/front/../box/../hakmem_tiny_superslab_constants.h: -core/front/../box/../hakmem_tiny_config.h: -core/front/../box/../ptr_track.h: -core/front/../box/../hakmem_super_registry.h: -core/front/../box/../hakmem_tiny_superslab.h: -core/front/../box/../superslab/superslab_types.h: -core/hakmem_tiny_superslab_constants.h: -core/front/../box/../superslab/superslab_inline.h: -core/front/../box/../superslab/superslab_types.h: -core/front/../box/../tiny_debug_ring.h: -core/front/../box/../tiny_remote.h: -core/front/../box/../hakmem_tiny_integrity.h: -core/front/../box/../hakmem_tiny.h: -core/front/../box/../hakmem_trace.h: -core/front/../box/../hakmem_tiny_mini_mag.h: -core/front/../box/../ptr_track.h: -core/front/../box/../ptr_trace.h: 
-core/front/../box/../box/tiny_next_ptr_box.h: -core/hakmem_tiny_config.h: -core/tiny_nextptr.h: -core/hakmem_build_flags.h: -core/tiny_region_id.h: -core/superslab/superslab_inline.h: -core/front/../box/../tiny_debug_ring.h: -core/front/../box/../superslab/superslab_inline.h: diff --git a/core/front/tiny_ring_cache.h b/core/front/tiny_ring_cache.h deleted file mode 100644 index 318498f5..00000000 --- a/core/front/tiny_ring_cache.h +++ /dev/null @@ -1,263 +0,0 @@ -// tiny_ring_cache.h - Phase 21-1: Array-based hot cache (C2/C3/C5) -// -// Goal: Eliminate pointer chasing in TLS SLL by using ring buffer -// Target: +15-20% performance (54.4M → 62-65M ops/s) -// -// Design (ChatGPT feedback): -// - Ring → SLL → SuperSlab (3-layer hierarchy) -// - Ring size: 128 slots (ENV: 64/128/256 A/B test) -// - C2/C3 only (hot classes, 33-128B) -// - Replaces UltraHot (Phase 19-3: +12.9% by removing UltraHot) -// -// Performance: -// - Alloc: 1-2 instructions (array access, no pointer chasing) -// - Free: 1-2 instructions (array write, no pointer chasing) -// - vs TLS SLL: 3 mem accesses → 2 mem accesses, 1 cache miss → 0 -// -// ENV Variables: -// HAKMEM_TINY_HOT_RING_ENABLE=1 # Enable Ring cache (default: 0) -// HAKMEM_TINY_HOT_RING_C2=128 # C2 ring size (default: 128) -// HAKMEM_TINY_HOT_RING_C3=128 # C3 ring size (default: 128) -// HAKMEM_TINY_HOT_RING_CASCADE=1 # Enable SLL → Ring refill (default: 0) - -#ifndef HAK_FRONT_TINY_RING_CACHE_H -#define HAK_FRONT_TINY_RING_CACHE_H - -#include -#include -#include -#include "../hakmem_build_flags.h" - -// ============================================================================ -// Ring Buffer Structure -// ============================================================================ - -typedef struct { - void** slots; // Dynamic array (allocated at init, power-of-2 size) - uint16_t head; // Pop index (consumer) - uint16_t tail; // Push index (producer) - uint16_t capacity; // Ring size (power of 2 for fast modulo: & (capacity-1)) 
- uint16_t mask; // Capacity - 1 (for fast modulo) -} TinyRingCache; - -// ============================================================================ -// External TLS Variables (defined in tiny_ring_cache.c) -// ============================================================================ - -extern __thread TinyRingCache g_ring_cache_c2; -extern __thread TinyRingCache g_ring_cache_c3; -extern __thread TinyRingCache g_ring_cache_c5; - -// ============================================================================ -// Metrics (Phase 21-1-E, optional for Phase 21-1-C) -// ============================================================================ - -#if !HAKMEM_BUILD_RELEASE -extern __thread uint64_t g_ring_cache_hit[8]; // Alloc hits -extern __thread uint64_t g_ring_cache_miss[8]; // Alloc misses -extern __thread uint64_t g_ring_cache_push[8]; // Free pushes -extern __thread uint64_t g_ring_cache_full[8]; // Free full (fallback to SLL) -extern __thread uint64_t g_ring_cache_refill[8]; // Refill count (SLL → Ring) -#endif - -// ============================================================================ -// ENV Control (cached, lazy init) -// ============================================================================ - -// Enable flag (default: 1, ON) -static inline int ring_cache_enabled(void) { - static int g_enable = -1; - if (__builtin_expect(g_enable == -1, 0)) { - const char* e = getenv("HAKMEM_TINY_HOT_RING_ENABLE"); - g_enable = (e && *e == '0') ? 0 : 1; // DEFAULT: ON (set ENV=0 to disable) -#if !HAKMEM_BUILD_RELEASE - if (g_enable) { - fprintf(stderr, "[Ring-INIT] ring_cache_enabled() = %d\n", g_enable); - fflush(stderr); - } -#endif - } - return g_enable; -} - -// C2 capacity (default: 128) -static inline size_t ring_capacity_c2(void) { - static size_t g_cap = 0; - if (__builtin_expect(g_cap == 0, 0)) { - const char* e = getenv("HAKMEM_TINY_HOT_RING_C2"); - g_cap = (e && *e) ? 
(size_t)atoi(e) : 128; // Default: 128 - - // Round up to power of 2 (for fast modulo) - if (g_cap < 32) g_cap = 32; - if (g_cap > 256) g_cap = 256; - - // Ensure power of 2 - size_t pow2 = 32; - while (pow2 < g_cap) pow2 *= 2; - g_cap = pow2; - -#if !HAKMEM_BUILD_RELEASE - fprintf(stderr, "[Ring-INIT] C2 capacity = %zu (power of 2)\n", g_cap); - fflush(stderr); -#endif - } - return g_cap; -} - -// C3 capacity (default: 128) -static inline size_t ring_capacity_c3(void) { - static size_t g_cap = 0; - if (__builtin_expect(g_cap == 0, 0)) { - const char* e = getenv("HAKMEM_TINY_HOT_RING_C3"); - g_cap = (e && *e) ? (size_t)atoi(e) : 128; // Default: 128 - - // Round up to power of 2 - if (g_cap < 32) g_cap = 32; - if (g_cap > 256) g_cap = 256; - - size_t pow2 = 32; - while (pow2 < g_cap) pow2 *= 2; - g_cap = pow2; - -#if !HAKMEM_BUILD_RELEASE - fprintf(stderr, "[Ring-INIT] C3 capacity = %zu (power of 2)\n", g_cap); - fflush(stderr); -#endif - } - return g_cap; -} - -// C5 capacity (default: 128) -static inline size_t ring_capacity_c5(void) { - static size_t g_cap = 0; - if (__builtin_expect(g_cap == 0, 0)) { - const char* e = getenv("HAKMEM_TINY_HOT_RING_C5"); - g_cap = (e && *e) ? (size_t)atoi(e) : 128; // Default: 128 - - // Round up to power of 2 - if (g_cap < 32) g_cap = 32; - if (g_cap > 256) g_cap = 256; - - size_t pow2 = 32; - while (pow2 < g_cap) pow2 *= 2; - g_cap = pow2; - -#if !HAKMEM_BUILD_RELEASE - fprintf(stderr, "[Ring-INIT] C5 capacity = %zu (power of 2)\n", g_cap); - fflush(stderr); -#endif - } - return g_cap; -} - -// Cascade enable flag (default: 0, OFF) -static inline int ring_cascade_enabled(void) { - static int g_enable = -1; - if (__builtin_expect(g_enable == -1, 0)) { - const char* e = getenv("HAKMEM_TINY_HOT_RING_CASCADE"); - g_enable = (e && *e && *e != '0') ? 
1 : 0; -#if !HAKMEM_BUILD_RELEASE - if (g_enable) { - fprintf(stderr, "[Ring-INIT] ring_cascade_enabled() = %d\n", g_enable); - fflush(stderr); - } -#endif - } - return g_enable; -} - -// ============================================================================ -// Init/Shutdown Forward Declarations (needed by pop/push) -// ============================================================================ - -void ring_cache_init(void); -void ring_cache_shutdown(void); -void ring_cache_print_stats(void); - -// ============================================================================ -// Ultra-Fast Pop/Push (1-2 instructions) -// ============================================================================ - -// Pop from ring (alloc fast path) -// Returns: BASE pointer (caller must convert to USER with +1) -static inline void* ring_cache_pop(int class_idx) { - // Fast path: Ring disabled or wrong class → return NULL immediately - if (__builtin_expect(!ring_cache_enabled(), 0)) return NULL; - if (__builtin_expect(class_idx != 2 && class_idx != 3 && class_idx != 5, 0)) return NULL; - - TinyRingCache* ring = (class_idx == 2) ? &g_ring_cache_c2 : - (class_idx == 3) ? 
&g_ring_cache_c3 : &g_ring_cache_c5; - - // Lazy init check (once per thread) - if (__builtin_expect(ring->slots == NULL, 0)) { - ring_cache_init(); // First call in this thread - // Re-check after init (may fail if allocation failed) - if (ring->slots == NULL) return NULL; - } - - // Empty check - if (__builtin_expect(ring->head == ring->tail, 0)) { -#if !HAKMEM_BUILD_RELEASE - g_ring_cache_miss[class_idx]++; -#endif - return NULL; // Empty - } - - // Pop from head (consumer) - void* base = ring->slots[ring->head]; - ring->head = (ring->head + 1) & ring->mask; // Fast modulo (power of 2) - -#if !HAKMEM_BUILD_RELEASE - g_ring_cache_hit[class_idx]++; -#endif - - return base; // Return BASE pointer -} - -// Push to ring (free fast path) -// Input: BASE pointer (caller must pass BASE, not USER) -// Returns: 1=SUCCESS, 0=FULL -static inline int ring_cache_push(int class_idx, void* base) { - // Fast path: Ring disabled or wrong class → return 0 (not handled) - if (__builtin_expect(!ring_cache_enabled(), 0)) return 0; - if (__builtin_expect(class_idx != 2 && class_idx != 3 && class_idx != 5, 0)) return 0; - - TinyRingCache* ring = (class_idx == 2) ? &g_ring_cache_c2 : - (class_idx == 3) ? 
&g_ring_cache_c3 : &g_ring_cache_c5; - - // Lazy init check (once per thread) - if (__builtin_expect(ring->slots == NULL, 0)) { - ring_cache_init(); // First call in this thread - // Re-check after init (may fail if allocation failed) - if (ring->slots == NULL) return 0; - } - - uint16_t next_tail = (ring->tail + 1) & ring->mask; - - // Full check (leave 1 slot empty to distinguish full/empty) - if (__builtin_expect(next_tail == ring->head, 0)) { -#if !HAKMEM_BUILD_RELEASE - g_ring_cache_full[class_idx]++; -#endif - return 0; // Full - } - - // Push to tail (producer) - ring->slots[ring->tail] = base; - ring->tail = next_tail; - -#if !HAKMEM_BUILD_RELEASE - g_ring_cache_push[class_idx]++; -#endif - - return 1; // SUCCESS -} - -// ============================================================================ -// Refill from TLS SLL (cascade, Phase 21-1-C) -// ============================================================================ - -// Forward declaration (defined in tiny_ring_cache.c) -int ring_refill_from_sll(int class_idx, int target_count); - -#endif // HAK_FRONT_TINY_RING_CACHE_H diff --git a/core/front/tiny_ultra_hot.h b/core/front/tiny_ultra_hot.h deleted file mode 100644 index 89b62367..00000000 --- a/core/front/tiny_ultra_hot.h +++ /dev/null @@ -1,458 +0,0 @@ -// tiny_ultra_hot.h - Ultra-fast hot path for C2/C3/C4/C5 (16B-128B allocations) -// Purpose: -// - Minimize L1 dcache misses (30x → 3x target) by using 2 cache line TLS -// - Minimize instructions (6.2x → 2x target) by ultra-simple straight-line path -// - Minimize branches (7.1x → 2x target) by predict-likely hints -// -// Design (ChatGPT consultation Phase 14 + Phase 14-B): -// - Phase 14: C2/C3 (16B/32B) - Coverage: 1.71% -// - Phase 14-B: +C4/C5 (64B/128B) - Coverage: 11.14% (6.5x improvement!) 
-// - TLS structure: 2 cache lines (128B) for 4 magazines with adaptive slot counts -// - Path: 2-3 instructions per alloc/free (pop/push from magazine) -// - Fallback: If magazine empty/full → existing TinyHeapV2/FastCache path -// -// Cache locality strategy: -// - All state in 1 cache line (64B): 2x mag[8] + 2x top + padding -// - No pointer chasing, no indirect access -// - Touches only 1 struct per alloc/free -// -// Instruction reduction strategy: -// - Size→class: 1 compare (size <= 16 ? C1 : C2) -// - Magazine access: Direct array index (no loops) -// - Fallback: Return NULL immediately (caller handles) -// -// Branch prediction strategy: -// - __builtin_expect(hit, 1) - expect 95%+ hit rate -// - No nested branches in hot path - -#ifndef HAK_FRONT_TINY_ULTRA_HOT_H -#define HAK_FRONT_TINY_ULTRA_HOT_H - -#include -#include -#include -#include "../box/tls_sll_box.h" // Phase 14-C: Borrowing design - refill from TLS SLL - -// Magazine capacity - adaptive sizing for cache locality (Phase 14-B) -// Design principle: Balance capacity vs cache line usage -// -// Cache line 0 (64B): C2 + C3 magazines -// C2 (16B): 4 slots × 8B ptr = 32B -// C3 (32B): 4 slots × 8B ptr = 32B -// Total: 64B (perfect fit!) -// -// Cache line 1 (64B): C4 + C5 magazines + counters -// C4 (64B): 2 slots × 8B ptr = 16B -// C5 (128B): 1 slot × 8B ptr = 8B -// Counters: c1_top, c2_top, c4_top, c5_top = 4B -// Padding: 36B -// Total: 64B (fits!) -// -// Why fewer slots for larger classes? 
-// - Maintain cache locality (2 cache lines = 128B total) -// - Block size scales, so magazine memory scales proportionally -// - Free path supplies blocks → even 1-2 slots maintain high hit rate -// -#ifndef ULTRA_HOT_MAG_CAP_C2 -#define ULTRA_HOT_MAG_CAP_C2 4 // C2 (16B) - 4 slots -#endif -#ifndef ULTRA_HOT_MAG_CAP_C3 -#define ULTRA_HOT_MAG_CAP_C3 4 // C3 (32B) - 4 slots -#endif -#ifndef ULTRA_HOT_MAG_CAP_C4 -#define ULTRA_HOT_MAG_CAP_C4 2 // C4 (64B) - 2 slots (NEW Phase 14-B) -#endif -#ifndef ULTRA_HOT_MAG_CAP_C5 -#define ULTRA_HOT_MAG_CAP_C5 1 // C5 (128B) - 1 slot (NEW Phase 14-B) -#endif - -// TLS structure: 2 cache lines (128B) for hot path (Phase 14-B expanded) -// Layout: -// Cache line 0 (64B): C2_mag[4] (32B) + C3_mag[4] (32B) -// Cache line 1 (64B): C4_mag[2] (16B) + C5_mag[1] (8B) + counters (4B) + pad (36B) -// Cache line 2+: Statistics (cold path) -// Total hot state: 128B (2 cache lines) -typedef struct { - // ===== Cache line 0 (64B): C2/C3 magazines ===== - void* c1_mag[ULTRA_HOT_MAG_CAP_C2]; // C2 (16B) - 4 slots, 32B - void* c2_mag[ULTRA_HOT_MAG_CAP_C3]; // C3 (32B) - 4 slots, 32B - - // ===== Cache line 1 (64B): C4/C5 magazines + counters ===== - void* c4_mag[ULTRA_HOT_MAG_CAP_C4]; // C4 (64B) - 2 slots, 16B (NEW Phase 14-B) - void* c5_mag[ULTRA_HOT_MAG_CAP_C5]; // C5 (128B) - 1 slot, 8B (NEW Phase 14-B) - - uint8_t c1_top; // C2 magazine top index - uint8_t c2_top; // C3 magazine top index - uint8_t c4_top; // C4 magazine top index (NEW Phase 14-B) - uint8_t c5_top; // C5 magazine top index (NEW Phase 14-B) - uint8_t pad[36]; // Padding to cache line boundary - - // ===== Statistics (cold path, cache line 2+) ===== - uint64_t c1_alloc_calls; - uint64_t c1_hits; - uint64_t c1_misses; - uint64_t c2_alloc_calls; - uint64_t c2_hits; - uint64_t c2_misses; - uint64_t c4_alloc_calls; // NEW Phase 14-B - uint64_t c4_hits; // NEW Phase 14-B - uint64_t c4_misses; // NEW Phase 14-B - uint64_t c5_alloc_calls; // NEW Phase 14-B - uint64_t c5_hits; // NEW 
Phase 14-B - uint64_t c5_misses; // NEW Phase 14-B - - uint64_t c1_free_calls; - uint64_t c1_free_hits; - uint64_t c2_free_calls; - uint64_t c2_free_hits; - uint64_t c4_free_calls; // NEW Phase 14-B - uint64_t c4_free_hits; // NEW Phase 14-B - uint64_t c5_free_calls; // NEW Phase 14-B - uint64_t c5_free_hits; // NEW Phase 14-B -} __attribute__((aligned(64))) TinyUltraHot; - -// External TLS variable (defined in hakmem_tiny.c) -extern __thread TinyUltraHot g_ultra_hot; - -// Enable flag (cached) -// ENV: HAKMEM_TINY_ULTRA_HOT -// - 0: Disable (use existing TinyHeapV2/FastCache) -// - 1 (default): Enable ultra-fast C1/C2 path -static inline int ultra_hot_enabled(void) { - static int g_enable = -1; - if (__builtin_expect(g_enable == -1, 0)) { - const char* e = getenv("HAKMEM_TINY_ULTRA_HOT"); - if (e && *e) { - g_enable = (*e != '0') ? 1 : 0; - } else { - g_enable = 1; // Default: ON (Phase 14 decision) - } -#if !HAKMEM_BUILD_RELEASE - fprintf(stderr, "[UltraHot-INIT] ultra_hot_enabled() = %d\n", g_enable); - fflush(stderr); -#endif - } - return g_enable; -} - -// Phase 14-C: Max size control (ENV: HAKMEM_TINY_ULTRA_HOT_MAX_SIZE) -// Purpose: Control which size classes UltraHot handles -// Default: 32 (C2/C3 only, safe for Random Mixed) -// Fixed-size: 128 (C2-C5, optimal for fixed-size workloads) -static inline size_t ultra_hot_max_size(void) { - static size_t g_max_size = 0; - if (__builtin_expect(g_max_size == 0, 0)) { - const char* e = getenv("HAKMEM_TINY_ULTRA_HOT_MAX_SIZE"); - if (e && *e) { - g_max_size = (size_t)atoi(e); - } else { - g_max_size = 32; // Default: C2/C3 only (Phase 14 behavior) - } -#if !HAKMEM_BUILD_RELEASE - fprintf(stderr, "[UltraHot-INIT] ultra_hot_max_size() = %zu\n", g_max_size); - fflush(stderr); -#endif - } - return g_max_size; -} - -// Ultra-fast alloc (C2/C3/C4/C5 - Phase 14-B expanded) -// Contract: -// - Input: size (must be 9-128B for C2-C5) -// - Output: BASE pointer (not USER pointer!) 
or NULL -// - Caller converts BASE → USER via HAK_RET_ALLOC -// -// Hot path (expect 95% hit rate): -// 1. size → class (cascading compares) -// 2. magazine pop (1 load + 1 decrement + 1 store) -// 3. return BASE -// -// Cold path (5% miss rate): -// - return NULL → caller uses existing TinyHeapV2/FastCache -// -// Performance target: -// - L1 dcache: 2 cache lines load (128B) - all 4 mags -// - Instructions: 5-7 instructions total per hit -// - Branches: 2 branches (size check + mag empty check) -static inline void* ultra_hot_alloc(size_t size) { - // Fast path: size → class (cascading compares for branch prediction) - // C2 = 16B (9-16), C3 = 32B (17-32), C4 = 64B (33-64), C5 = 128B (65-128) - if (__builtin_expect(size <= 16, 1)) { - // C2 path (16B) - g_ultra_hot.c1_alloc_calls++; - - if (__builtin_expect(g_ultra_hot.c1_top > 0, 1)) { - // Magazine hit! (5 instructions: load top, dec, load mag, store top, ret) - g_ultra_hot.c1_hits++; - uint8_t idx = --g_ultra_hot.c1_top; - void* base = g_ultra_hot.c1_mag[idx]; - return base; // Return BASE (caller converts to USER) - } else { - // Magazine empty (cold path) - g_ultra_hot.c1_misses++; - return NULL; - } - } else if (__builtin_expect(size <= 32, 1)) { - // C3 path (32B) - g_ultra_hot.c2_alloc_calls++; - - if (__builtin_expect(g_ultra_hot.c2_top > 0, 1)) { - // Magazine hit! - g_ultra_hot.c2_hits++; - uint8_t idx = --g_ultra_hot.c2_top; - void* base = g_ultra_hot.c2_mag[idx]; - return base; - } else { - // Magazine empty - g_ultra_hot.c2_misses++; - return NULL; - } - } else if (__builtin_expect(size <= 64 && ultra_hot_max_size() >= 64, 0)) { - // C4 path (64B) - Phase 14-C: ENV gated - g_ultra_hot.c4_alloc_calls++; - - if (__builtin_expect(g_ultra_hot.c4_top > 0, 1)) { - // Magazine hit! 
- g_ultra_hot.c4_hits++; - uint8_t idx = --g_ultra_hot.c4_top; - void* base = g_ultra_hot.c4_mag[idx]; - return base; - } else { - // Magazine empty - g_ultra_hot.c4_misses++; - return NULL; - } - } else if (__builtin_expect(size <= 128 && ultra_hot_max_size() >= 128, 0)) { - // C5 path (128B) - Phase 14-C: ENV gated - g_ultra_hot.c5_alloc_calls++; - - if (__builtin_expect(g_ultra_hot.c5_top > 0, 1)) { - // Magazine hit! - g_ultra_hot.c5_hits++; - uint8_t idx = --g_ultra_hot.c5_top; - void* base = g_ultra_hot.c5_mag[idx]; - return base; - } else { - // Magazine empty - g_ultra_hot.c5_misses++; - return NULL; - } - } else { - // Size out of range (C6+ or C0) - return NULL; - } -} - -// Ultra-fast free (C2/C3/C4/C5 - Phase 14-B expanded) -// Contract: -// - Input: base (BASE pointer), class_idx -// - Output: 1 if handled, 0 if magazine full (fallback to existing path) -// -// Hot path (expect 95% hit rate): -// 1. class check (1 compare) -// 2. magazine push (1 load top + 1 store mag + 1 increment + 1 store top) -// 3. return 1 -// -// Cold path (5% miss rate): -// - return 0 → caller uses existing TinyHeapV2/TLS SLL path -static inline int ultra_hot_free_by_class(void* base, int class_idx) { - // Fast path: class → magazine - // NOTE: HAKMEM class numbering: C0=8B, C1=?, C2=16B, C3=32B, C4=64B, C5=128B - if (__builtin_expect(class_idx == 2, 1)) { - // C2 path (16B) - g_ultra_hot.c1_free_calls++; - - if (__builtin_expect(g_ultra_hot.c1_top < ULTRA_HOT_MAG_CAP_C2, 1)) { - // Magazine has room! (5 instructions) - g_ultra_hot.c1_free_hits++; - uint8_t idx = g_ultra_hot.c1_top++; - g_ultra_hot.c1_mag[idx] = base; - return 1; // Success - } else { - // Magazine full → fallback - return 0; - } - } else if (__builtin_expect(class_idx == 3, 1)) { - // C3 path (32B) - g_ultra_hot.c2_free_calls++; - - if (__builtin_expect(g_ultra_hot.c2_top < ULTRA_HOT_MAG_CAP_C3, 1)) { - // Magazine has room! 
- g_ultra_hot.c2_free_hits++; - uint8_t idx = g_ultra_hot.c2_top++; - g_ultra_hot.c2_mag[idx] = base; - return 1; - } else { - // Magazine full - return 0; - } - } else if (__builtin_expect(class_idx == 4, 0)) { - // C4 path (64B) - NEW Phase 14-B - g_ultra_hot.c4_free_calls++; - - if (__builtin_expect(g_ultra_hot.c4_top < ULTRA_HOT_MAG_CAP_C4, 1)) { - // Magazine has room! - g_ultra_hot.c4_free_hits++; - uint8_t idx = g_ultra_hot.c4_top++; - g_ultra_hot.c4_mag[idx] = base; - return 1; - } else { - // Magazine full - return 0; - } - } else if (__builtin_expect(class_idx == 5, 0)) { - // C5 path (128B) - NEW Phase 14-B - g_ultra_hot.c5_free_calls++; - - if (__builtin_expect(g_ultra_hot.c5_top < ULTRA_HOT_MAG_CAP_C5, 1)) { - // Magazine has room! - g_ultra_hot.c5_free_hits++; - uint8_t idx = g_ultra_hot.c5_top++; - g_ultra_hot.c5_mag[idx] = base; - return 1; - } else { - // Magazine full - return 0; - } - } else { - // Class out of range (not C2-C5) - return 0; - } -} - -// Magazine refill (called from existing front when it has spare blocks) -// Strategy: TinyHeapV2 / FastCache can "donate" blocks to UltraHot -// This is optional - UltraHot can work with just free path supply -static inline void ultra_hot_try_refill_c1(void* base) { - if (g_ultra_hot.c1_top < ULTRA_HOT_MAG_CAP_C2) { - g_ultra_hot.c1_mag[g_ultra_hot.c1_top++] = base; - } -} - -static inline void ultra_hot_try_refill_c2(void* base) { - if (g_ultra_hot.c2_top < ULTRA_HOT_MAG_CAP_C3) { - g_ultra_hot.c2_mag[g_ultra_hot.c2_top++] = base; - } -} - -static inline void ultra_hot_try_refill_c4(void* base) { - if (g_ultra_hot.c4_top < ULTRA_HOT_MAG_CAP_C4) { - g_ultra_hot.c4_mag[g_ultra_hot.c4_top++] = base; - } -} - -static inline void ultra_hot_try_refill_c5(void* base) { - if (g_ultra_hot.c5_top < ULTRA_HOT_MAG_CAP_C5) { - g_ultra_hot.c5_mag[g_ultra_hot.c5_top++] = base; - } -} - -// Print statistics (called at program exit if HAKMEM_TINY_ULTRA_HOT_STATS=1) -// Declaration only (implementation in 
hakmem_tiny.c for external linkage) -void ultra_hot_print_stats(void); - -// Design notes: -// -// 1. Cache locality: -// - All state fits in 2 cache lines (128B total) -// - First line (64B): Both magazines (C1 + C2) -// - Second line (64B): Counters + stats -// - Expected L1 miss: ~1-2 per alloc/free (vs 30+ currently) -// -// 2. Instruction count: -// - Alloc hit: ~7 instructions (size check + mag pop + return) -// - Free hit: ~7 instructions (size check + mag push + return) -// - Total: ~14 instructions per alloc/free pair (vs ~281M/500K = 562 currently) -// - Reduction: 562 → 14 = 40x improvement -// -// 3. Branch prediction: -// - Size check: __builtin_expect(size <= 16, 1) - predict C1 likely -// - Magazine check: __builtin_expect(top > 0, 1) - predict hit likely -// - Expected branch-miss: ~5% (vs 7.83% currently) -// -// 4. Integration with existing front: -// - UltraHot is L0 (fastest) -// - TinyHeapV2 is L1 (fast) -// - FastCache is L2 (normal) -// - If UltraHot misses → fallback to L1/L2 -// - Free path supplies both UltraHot and TinyHeapV2 -// -// 5. Supply strategy: -// - Free path: Always try UltraHot first, then TinyHeapV2, then TLS SLL -// - Alloc path: Try UltraHot first, then TinyHeapV2, then FastCache -// - No refill from backend (keeps UltraHot ultra-simple) -// -// 6. Expected performance: -// - Current: 9.3M ops/s (Random Mixed 256B) -// - Target: 40-60M ops/s (+330-545%) -// - L1 miss: 2.9M → ~300K (-90%) -// - Instructions: 281M → ~80M (-71%) -// - Branches: 59M → ~15M (-75%) -// -// 7. Why C1/C2 only? -// - C1 (16B) + C2 (32B) cover ~60% of tiny allocations -// - Small magazine (4 slots) fits both in 1-2 cache lines -// - Size check is trivial (size <= 16 / size <= 32) -// - Larger classes (C3+) have different access patterns (less cache-sensitive) -// -// 8. Why not C0 (8B)? -// - TinyHeapV2 showed -5% regression on C0 -// - 8B allocations are rare in real workloads -// - Magazine overhead too high for 8B blocks -// -// 9. 
Comparison with TinyHeapV2: -// - TinyHeapV2: 16 slots per class, covers C1-C3 -// - UltraHot: 4 slots per class, covers C1-C2 only -// - UltraHot is "ultra-hot subset" of TinyHeapV2 -// - Trade magazine capacity for cache locality -// -// 10. ENV flags: -// - HAKMEM_TINY_ULTRA_HOT=0/1 - Enable/disable (default: 1) -// - HAKMEM_TINY_ULTRA_HOT_STATS=0/1 - Print stats at exit (default: 0) - -// ============================================================================= -// Phase 14-C: Borrowing Design - Refill from TLS SLL (borrow from the source of truth) -// ============================================================================= -// Design: UltraHot acts as a view sitting in front of the TLS SLL -// - Free: return blocks to the source of truth (TLS SLL); do not intercept them -// - Alloc miss: borrow from the TLS SLL to refill the magazine -// - The learning layer (Superslab/drain) can then track inventory correctly -// -// Call this after ultra_hot_alloc() miss to refill magazine from TLS SLL -static inline void ultra_hot_try_refill(int class_idx) { - if (!ultra_hot_enabled()) return; - if (class_idx < 2 || class_idx > 5) return; // C2-C5 only - - // Refill magazine to full capacity (borrow from TLS SLL = source of truth) - if (class_idx == 2) { - // C2 (16B): 4 slots magazine - while (g_ultra_hot.c1_top < ULTRA_HOT_MAG_CAP_C2) { - void* ptr = NULL; - if (!tls_sll_pop(class_idx, &ptr)) break; // borrow from TLS SLL - g_ultra_hot.c1_mag[g_ultra_hot.c1_top++] = ptr; - } - } else if (class_idx == 3) { - // C3 (32B): 4 slots magazine - while (g_ultra_hot.c2_top < ULTRA_HOT_MAG_CAP_C3) { - void* ptr = NULL; - if (!tls_sll_pop(class_idx, &ptr)) break; - g_ultra_hot.c2_mag[g_ultra_hot.c2_top++] = ptr; - } - } else if (class_idx == 4) { - // C4 (64B): 2 slots magazine - while (g_ultra_hot.c4_top < ULTRA_HOT_MAG_CAP_C4) { - void* ptr = NULL; - if (!tls_sll_pop(class_idx, &ptr)) break; - g_ultra_hot.c4_mag[g_ultra_hot.c4_top++] = ptr; - } - } else if (class_idx == 5) { - // C5 (128B): 1 slot magazine - while (g_ultra_hot.c5_top < ULTRA_HOT_MAG_CAP_C5) { - void* ptr = NULL; - if (!tls_sll_pop(class_idx, &ptr)) break; -
g_ultra_hot.c5_mag[g_ultra_hot.c5_top++] = ptr; - } - } -} - -#endif // HAK_FRONT_TINY_ULTRA_HOT_H diff --git a/core/front/tiny_unified_cache.d b/core/front/tiny_unified_cache.d index f5f9f03b..ef268091 100644 --- a/core/front/tiny_unified_cache.d +++ b/core/front/tiny_unified_cache.d @@ -6,15 +6,17 @@ core/front/tiny_unified_cache.o: core/front/tiny_unified_cache.c \ core/hakmem_tiny_superslab_constants.h \ core/front/../superslab/superslab_inline.h \ core/front/../superslab/superslab_types.h \ + core/front/../superslab/../tiny_box_geometry.h \ + core/front/../superslab/../hakmem_tiny_superslab_constants.h \ + core/front/../superslab/../hakmem_tiny_config.h \ core/front/../tiny_debug_ring.h core/front/../hakmem_build_flags.h \ core/front/../tiny_remote.h \ core/front/../hakmem_tiny_superslab_constants.h \ - core/front/../tiny_box_geometry.h core/front/../hakmem_tiny_config.h \ - core/front/../box/tiny_next_ptr_box.h core/hakmem_tiny_config.h \ - core/tiny_nextptr.h core/hakmem_build_flags.h core/tiny_region_id.h \ - core/tiny_box_geometry.h core/ptr_track.h core/hakmem_super_registry.h \ - core/hakmem_tiny_superslab.h core/superslab/superslab_inline.h \ - core/front/../hakmem_tiny_superslab.h \ + core/front/../tiny_box_geometry.h core/front/../box/tiny_next_ptr_box.h \ + core/hakmem_tiny_config.h core/tiny_nextptr.h core/hakmem_build_flags.h \ + core/tiny_region_id.h core/tiny_box_geometry.h core/ptr_track.h \ + core/hakmem_super_registry.h core/hakmem_tiny_superslab.h \ + core/superslab/superslab_inline.h core/front/../hakmem_tiny_superslab.h \ core/front/../superslab/superslab_inline.h \ core/front/../box/pagefault_telemetry_box.h core/front/tiny_unified_cache.h: @@ -27,12 +29,14 @@ core/front/../superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/front/../superslab/superslab_inline.h: core/front/../superslab/superslab_types.h: +core/front/../superslab/../tiny_box_geometry.h: +core/front/../superslab/../hakmem_tiny_superslab_constants.h: 
+core/front/../superslab/../hakmem_tiny_config.h: core/front/../tiny_debug_ring.h: core/front/../hakmem_build_flags.h: core/front/../tiny_remote.h: core/front/../hakmem_tiny_superslab_constants.h: core/front/../tiny_box_geometry.h: -core/front/../hakmem_tiny_config.h: core/front/../box/tiny_next_ptr_box.h: core/hakmem_tiny_config.h: core/tiny_nextptr.h: diff --git a/core/hakmem_tiny_init.inc b/core/hakmem_tiny_init.inc index 4b9df30f..ca6d50b4 100644 --- a/core/hakmem_tiny_init.inc +++ b/core/hakmem_tiny_init.inc @@ -15,14 +15,6 @@ void hak_tiny_init(void) { // Step 1: Simple initialization (static global is already zero-initialized) g_tiny_initialized = 1; - // Hot-class toggle: class5 (256B) dedicated TLS fast path - // Default ON; allow runtime override via HAKMEM_TINY_HOTPATH_CLASS5 - { - const char* hp5 = getenv("HAKMEM_TINY_HOTPATH_CLASS5"); - if (hp5 && *hp5) { - g_tiny_hotpath_class5 = (atoi(hp5) != 0) ? 1 : 0; - } - } // Reset fast-cache defaults and apply preset (if provided) tiny_config_reset_defaults(); @@ -101,50 +93,6 @@ void hak_tiny_init(void) { tls->spill_high = tiny_tls_default_spill(base_cap); tiny_tls_publish_targets(i, base_cap); } - // Optional: override TLS parameters for hot class 5 (256B) - if (g_tiny_hotpath_class5) { - TinyTLSList* tls5 = &g_tls_lists[5]; - int cap_def = 512; // thick cache for hot class - int refill_def = 128; // refill low-water mark - int spill_def = 0; // 0 → use cap as hard spill threshold - const char* ecap = getenv("HAKMEM_TINY_CLASS5_TLS_CAP"); - const char* eref = getenv("HAKMEM_TINY_CLASS5_TLS_REFILL"); - const char* espl = getenv("HAKMEM_TINY_CLASS5_TLS_SPILL"); - if (ecap && *ecap) cap_def = atoi(ecap); - if (eref && *eref) refill_def = atoi(eref); - if (espl && *espl) spill_def = atoi(espl); - if (cap_def < 64) cap_def = 64; if (cap_def > 4096) cap_def = 4096; - if (refill_def < 16) refill_def = 16; if (refill_def > cap_def) refill_def = cap_def; - if (spill_def < 0) spill_def = 0; if (spill_def > cap_def) 
spill_def = cap_def; - tls5->cap = (uint32_t)cap_def; - tls5->refill_low = (uint32_t)refill_def; - tls5->spill_high = (uint32_t)spill_def; // 0 → use cap logic in helper - tiny_tls_publish_targets(5, (uint32_t)cap_def); - - // Optional: one-shot TLS prewarm for class5 - // Env: HAKMEM_TINY_CLASS5_PREWARM= (default 128, 0 disables) - int prewarm = 128; - const char* pw = getenv("HAKMEM_TINY_CLASS5_PREWARM"); - if (pw && *pw) prewarm = atoi(pw); - if (prewarm < 0) prewarm = 0; - if (prewarm > (int)tls5->cap) prewarm = (int)tls5->cap; - - if (prewarm > 0) { - // ✅ NEW: Use Box Prewarm API (safe, simple, handles all initialization) - // Box Prewarm guarantees: - // - Correct initialization order (capacity system initialized first) - // - No orphaned blocks (atomic carve-and-push) - // - No double-free risk (all-or-nothing semantics) - // - Clear error handling - int taken = box_prewarm_tls(5, prewarm); - - #if !HAKMEM_BUILD_RELEASE - // Debug logging (optional) - fprintf(stderr, "[PREWARM] class=5 requested=%d taken=%d\n", prewarm, taken); - #endif - (void)taken; // Suppress unused warning in release builds - } - } if (mem_diet_enabled) { tiny_apply_mem_diet(); } diff --git a/core/hakmem_tiny_sfc.c b/core/hakmem_tiny_sfc.c index eb3a97fb..919e6ae9 100644 --- a/core/hakmem_tiny_sfc.c +++ b/core/hakmem_tiny_sfc.c @@ -111,12 +111,6 @@ void sfc_init(void) { } } - // If class5 hotpath is enabled, disable SFC for class 5 by default - // unless explicitly overridden via HAKMEM_SFC_CAPACITY_CLASS5 - extern int g_tiny_hotpath_class5; - if (g_tiny_hotpath_class5 && g_sfc_capacity_override[5] == 0) { - g_sfc_capacity[5] = 0; - } // Register shutdown hook for optional stats dump atexit(sfc_shutdown); diff --git a/core/hakmem_tiny_tls_state_box.inc b/core/hakmem_tiny_tls_state_box.inc index b460551f..ac85c768 100644 --- a/core/hakmem_tiny_tls_state_box.inc +++ b/core/hakmem_tiny_tls_state_box.inc @@ -1,7 +1,6 @@ // Hot-path cheap sampling counter to avoid rand() in allocation path 
// Phase 9.4: TLS single-linked freelist (mimalloc-inspired) for hottest classes (≤128B/≤256B) int g_tls_sll_enable = 1; // HAKMEM_TINY_TLS_SLL=0 to disable -int g_tiny_hotpath_class5 = 0; // HAKMEM_TINY_HOTPATH_CLASS5=1 to enable class 5 hotpath // Phase 6-1.7: Export TLS variables for box refactor (Box 5/6 need access from hakmem.c) // CRITICAL FIX: Explicit initializers prevent SEGV from uninitialized TLS in worker threads // PRIORITY 3: TLS Canaries - Add canaries around TLS arrays to detect buffer overruns diff --git a/core/tiny_alloc_fast_push.d b/core/tiny_alloc_fast_push.d index cbf610bc..cdb347fe 100644 --- a/core/tiny_alloc_fast_push.d +++ b/core/tiny_alloc_fast_push.d @@ -9,15 +9,17 @@ core/tiny_alloc_fast_push.o: core/tiny_alloc_fast_push.c \ core/box/../superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h \ core/box/../superslab/superslab_inline.h \ - core/box/../superslab/superslab_types.h core/box/../tiny_debug_ring.h \ - core/box/../tiny_remote.h core/box/../hakmem_tiny_integrity.h \ - core/box/../hakmem_tiny.h core/box/../hakmem_trace.h \ - core/box/../hakmem_tiny_mini_mag.h core/box/../ptr_track.h \ - core/box/../ptr_trace.h core/box/../box/tiny_next_ptr_box.h \ - core/hakmem_tiny_config.h core/tiny_nextptr.h core/hakmem_build_flags.h \ - core/tiny_region_id.h core/superslab/superslab_inline.h \ - core/box/../tiny_debug_ring.h core/box/../superslab/superslab_inline.h \ - core/box/front_gate_box.h core/hakmem_tiny.h + core/box/../superslab/superslab_types.h \ + core/box/../superslab/../tiny_box_geometry.h \ + core/box/../tiny_debug_ring.h core/box/../tiny_remote.h \ + core/box/../hakmem_tiny_integrity.h core/box/../hakmem_tiny.h \ + core/box/../hakmem_trace.h core/box/../hakmem_tiny_mini_mag.h \ + core/box/../ptr_track.h core/box/../ptr_trace.h \ + core/box/../box/tiny_next_ptr_box.h core/hakmem_tiny_config.h \ + core/tiny_nextptr.h core/hakmem_build_flags.h core/tiny_region_id.h \ + core/superslab/superslab_inline.h 
core/box/../tiny_debug_ring.h \ + core/box/../superslab/superslab_inline.h core/box/front_gate_box.h \ + core/hakmem_tiny.h core/hakmem_tiny_config.h: core/box/tls_sll_box.h: core/box/../hakmem_tiny_config.h: @@ -35,6 +37,7 @@ core/box/../superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/box/../superslab/superslab_inline.h: core/box/../superslab/superslab_types.h: +core/box/../superslab/../tiny_box_geometry.h: core/box/../tiny_debug_ring.h: core/box/../tiny_remote.h: core/box/../hakmem_tiny_integrity.h: diff --git a/core/tiny_failfast.d b/core/tiny_failfast.d index d0c7f3f1..a01725a1 100644 --- a/core/tiny_failfast.d +++ b/core/tiny_failfast.d @@ -1,13 +1,19 @@ core/tiny_failfast.o: core/tiny_failfast.c core/hakmem_tiny_superslab.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/superslab/superslab_inline.h core/superslab/superslab_types.h \ - core/tiny_debug_ring.h core/hakmem_build_flags.h core/tiny_remote.h \ + core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ + core/hakmem_build_flags.h core/tiny_remote.h \ core/hakmem_tiny_superslab_constants.h core/hakmem_tiny_superslab.h: core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/hakmem_build_flags.h: core/tiny_remote.h: diff --git a/core/tiny_free_fast.inc.h b/core/tiny_free_fast.inc.h index 62612d38..341997e2 100644 --- a/core/tiny_free_fast.inc.h +++ b/core/tiny_free_fast.inc.h @@ -40,8 +40,6 @@ extern pthread_t tiny_self_pt(void); // External TLS variables (from Box 5) // Phase 3d-B: TLS Cache Merge - Unified TLS SLL structure extern __thread TinyTLSSLL g_tls_sll[TINY_NUM_CLASSES]; -// Hot-class 
toggle: class5 (256B) dedicated TLS fast path -extern int g_tiny_hotpath_class5; extern __thread TinyTLSList g_tls_lists[TINY_NUM_CLASSES]; // Box 5 helper (TLS push) @@ -135,13 +133,9 @@ static inline int tiny_free_fast_ss(SuperSlab* ss, int slab_idx, void* base, uin g_free_via_ss_local[class_idx]++; #endif - // Box 5 integration: class5 can use dedicated TLS List hotpath + // Box 5 integration extern int g_sfc_enabled; - if (__builtin_expect(g_tiny_hotpath_class5 && class_idx == 5, 0)) { - TinyTLSList* tls5 = &g_tls_lists[5]; - // Use guarded push for class5 to avoid sentinel/next poisoning during triage - tls_list_push(tls5, base, 5); - } else if (g_sfc_enabled) { + if (g_sfc_enabled) { // Box 5-NEW: Try SFC (128-256 slots) if (!sfc_free_push(class_idx, base)) { // SFC full → skip caching, use slow path (return 0) diff --git a/hakmem.d b/hakmem.d index 9aa46c85..9fa01f74 100644 --- a/hakmem.d +++ b/hakmem.d @@ -7,11 +7,13 @@ hakmem.o: core/hakmem.c core/hakmem.h core/hakmem_build_flags.h \ core/hakmem_tiny_mini_mag.h core/hakmem_tiny_superslab.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/superslab/superslab_inline.h core/superslab/superslab_types.h \ - core/tiny_debug_ring.h core/tiny_remote.h \ - core/hakmem_tiny_superslab_constants.h core/tiny_fastcache.h \ - core/box/tiny_next_ptr_box.h core/hakmem_tiny_config.h \ - core/tiny_nextptr.h core/tiny_region_id.h core/tiny_box_geometry.h \ - core/hakmem_tiny_config.h core/ptr_track.h core/hakmem_super_registry.h \ + core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ + core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ + core/tiny_fastcache.h core/box/tiny_next_ptr_box.h \ + core/hakmem_tiny_config.h core/tiny_nextptr.h core/tiny_region_id.h \ + core/tiny_box_geometry.h core/ptr_track.h core/hakmem_super_registry.h \ core/hakmem_mid_mt.h core/hakmem_elo.h 
core/hakmem_ace_stats.h \ core/hakmem_batch.h core/hakmem_evo.h core/hakmem_debug.h \ core/hakmem_prof.h core/hakmem_syscall.h core/hakmem_ace_controller.h \ @@ -20,11 +22,11 @@ hakmem.o: core/hakmem.c core/hakmem.h core/hakmem_build_flags.h \ core/box/hak_kpi_util.inc.h core/box/hak_core_init.inc.h \ core/hakmem_phase7_config.h core/box/ss_hot_prewarm_box.h \ core/box/hak_alloc_api.inc.h core/box/../hakmem_tiny.h \ - core/box/../hakmem_smallmid.h core/box/../pool_tls.h \ - core/box/hak_free_api.inc.h core/hakmem_tiny_superslab.h \ - core/box/../tiny_free_fast_v2.inc.h core/box/../tiny_region_id.h \ - core/box/../hakmem_build_flags.h core/box/../hakmem_tiny_config.h \ - core/box/../box/tls_sll_box.h core/box/../box/../hakmem_tiny_config.h \ + core/box/../hakmem_smallmid.h core/box/hak_free_api.inc.h \ + core/hakmem_tiny_superslab.h core/box/../tiny_free_fast_v2.inc.h \ + core/box/../tiny_region_id.h core/box/../hakmem_build_flags.h \ + core/box/../hakmem_tiny_config.h core/box/../box/tls_sll_box.h \ + core/box/../box/../hakmem_tiny_config.h \ core/box/../box/../hakmem_build_flags.h core/box/../box/../tiny_remote.h \ core/box/../box/../tiny_region_id.h \ core/box/../box/../hakmem_tiny_integrity.h \ @@ -36,10 +38,10 @@ hakmem.o: core/hakmem.c core/hakmem.h core/hakmem_build_flags.h \ core/box/../superslab/superslab_inline.h \ core/box/../box/ss_slab_meta_box.h \ core/box/../box/../superslab/superslab_types.h \ - core/box/../box/free_remote_box.h core/box/front_gate_v2.h \ - core/box/external_guard_box.h core/box/ss_slab_meta_box.h \ - core/box/hak_wrappers.inc.h core/box/front_gate_classifier.h \ - core/box/../front/malloc_tiny_fast.h \ + core/box/../box/slab_freelist_atomic.h core/box/../box/free_remote_box.h \ + core/box/front_gate_v2.h core/box/external_guard_box.h \ + core/box/ss_slab_meta_box.h core/box/hak_wrappers.inc.h \ + core/box/front_gate_classifier.h core/box/../front/malloc_tiny_fast.h \ core/box/../front/../hakmem_build_flags.h \ 
core/box/../front/../hakmem_tiny_config.h \ core/box/../front/tiny_unified_cache.h \ @@ -67,6 +69,9 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: @@ -76,7 +81,6 @@ core/hakmem_tiny_config.h: core/tiny_nextptr.h: core/tiny_region_id.h: core/tiny_box_geometry.h: -core/hakmem_tiny_config.h: core/ptr_track.h: core/hakmem_super_registry.h: core/hakmem_mid_mt.h: @@ -100,7 +104,6 @@ core/box/ss_hot_prewarm_box.h: core/box/hak_alloc_api.inc.h: core/box/../hakmem_tiny.h: core/box/../hakmem_smallmid.h: -core/box/../pool_tls.h: core/box/hak_free_api.inc.h: core/hakmem_tiny_superslab.h: core/box/../tiny_free_fast_v2.inc.h: @@ -124,6 +127,7 @@ core/box/../hakmem_tiny_integrity.h: core/box/../superslab/superslab_inline.h: core/box/../box/ss_slab_meta_box.h: core/box/../box/../superslab/superslab_types.h: +core/box/../box/slab_freelist_atomic.h: core/box/../box/free_remote_box.h: core/box/front_gate_v2.h: core/box/external_guard_box.h: diff --git a/hakmem_learner.d b/hakmem_learner.d index 083b76d0..30bf2167 100644 --- a/hakmem_learner.d +++ b/hakmem_learner.d @@ -6,7 +6,9 @@ hakmem_learner.o: core/hakmem_learner.c core/hakmem_learner.h \ core/hakmem_size_hist.h core/hakmem_learn_log.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/tiny_remote.h core/hakmem_tiny_superslab_constants.h core/hakmem_learner.h: 
core/hakmem_internal.h: @@ -28,6 +30,9 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: diff --git a/hakmem_shared_pool.d b/hakmem_shared_pool.d index 05bef189..9f798757 100644 --- a/hakmem_shared_pool.d +++ b/hakmem_shared_pool.d @@ -1,38 +1,48 @@ hakmem_shared_pool.o: core/hakmem_shared_pool.c core/hakmem_shared_pool.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/hakmem_build_flags.h core/tiny_remote.h \ core/hakmem_tiny_superslab_constants.h core/box/ss_slab_meta_box.h \ - core/box/../superslab/superslab_types.h core/box/ss_hot_cold_box.h \ + core/box/../superslab/superslab_types.h core/box/slab_freelist_atomic.h \ + core/box/tiny_next_ptr_box.h core/hakmem_tiny_config.h \ + core/tiny_nextptr.h core/tiny_region_id.h core/tiny_box_geometry.h \ + core/ptr_track.h core/hakmem_super_registry.h core/box/ss_hot_cold_box.h \ core/box/pagefault_telemetry_box.h core/box/tls_sll_drain_box.h \ core/box/tls_sll_box.h core/box/../hakmem_tiny_config.h \ core/box/../hakmem_build_flags.h core/box/../tiny_remote.h \ - core/box/../tiny_region_id.h core/box/../hakmem_build_flags.h \ - core/box/../tiny_box_geometry.h \ - core/box/../hakmem_tiny_superslab_constants.h \ - core/box/../hakmem_tiny_config.h core/box/../ptr_track.h \ - core/box/../hakmem_super_registry.h core/box/../hakmem_tiny_superslab.h \ - 
core/box/../superslab/superslab_inline.h \ - core/box/../hakmem_tiny_integrity.h core/box/../hakmem_tiny.h \ + core/box/../tiny_region_id.h core/box/../hakmem_tiny_integrity.h \ + core/box/../hakmem_tiny.h core/box/../hakmem_build_flags.h \ core/box/../hakmem_trace.h core/box/../hakmem_tiny_mini_mag.h \ core/box/../ptr_track.h core/box/../ptr_trace.h \ - core/box/../box/tiny_next_ptr_box.h core/hakmem_tiny_config.h \ - core/tiny_nextptr.h core/tiny_region_id.h core/box/../tiny_debug_ring.h \ - core/box/../superslab/superslab_inline.h core/box/free_local_box.h \ - core/hakmem_tiny_superslab.h core/hakmem_policy.h + core/box/../tiny_debug_ring.h core/box/../superslab/superslab_inline.h \ + core/box/free_local_box.h core/hakmem_tiny_superslab.h \ + core/hakmem_policy.h core/hakmem_shared_pool.h: core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/hakmem_tiny_superslab.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/hakmem_build_flags.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: core/box/ss_slab_meta_box.h: core/box/../superslab/superslab_types.h: +core/box/slab_freelist_atomic.h: +core/box/tiny_next_ptr_box.h: +core/hakmem_tiny_config.h: +core/tiny_nextptr.h: +core/tiny_region_id.h: +core/tiny_box_geometry.h: +core/ptr_track.h: +core/hakmem_super_registry.h: core/box/ss_hot_cold_box.h: core/box/pagefault_telemetry_box.h: core/box/tls_sll_drain_box.h: @@ -41,24 +51,13 @@ core/box/../hakmem_tiny_config.h: core/box/../hakmem_build_flags.h: core/box/../tiny_remote.h: core/box/../tiny_region_id.h: -core/box/../hakmem_build_flags.h: -core/box/../tiny_box_geometry.h: -core/box/../hakmem_tiny_superslab_constants.h: -core/box/../hakmem_tiny_config.h: -core/box/../ptr_track.h: -core/box/../hakmem_super_registry.h: 
-core/box/../hakmem_tiny_superslab.h: -core/box/../superslab/superslab_inline.h: core/box/../hakmem_tiny_integrity.h: core/box/../hakmem_tiny.h: +core/box/../hakmem_build_flags.h: core/box/../hakmem_trace.h: core/box/../hakmem_tiny_mini_mag.h: core/box/../ptr_track.h: core/box/../ptr_trace.h: -core/box/../box/tiny_next_ptr_box.h: -core/hakmem_tiny_config.h: -core/tiny_nextptr.h: -core/tiny_region_id.h: core/box/../tiny_debug_ring.h: core/box/../superslab/superslab_inline.h: core/box/free_local_box.h: diff --git a/hakmem_smallmid.d b/hakmem_smallmid.d index 4ed898ab..4bab52ea 100644 --- a/hakmem_smallmid.d +++ b/hakmem_smallmid.d @@ -5,8 +5,8 @@ hakmem_smallmid.o: core/hakmem_smallmid.c core/hakmem_smallmid.h \ core/ptr_track.h core/hakmem_super_registry.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ - core/tiny_remote.h + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/tiny_debug_ring.h core/tiny_remote.h core/hakmem_smallmid.h: core/hakmem_build_flags.h: core/hakmem_smallmid_superslab.h: @@ -21,5 +21,6 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: core/tiny_debug_ring.h: core/tiny_remote.h: diff --git a/hakmem_super_registry.d b/hakmem_super_registry.d index 5a6bfdff..9c99b253 100644 --- a/hakmem_super_registry.d +++ b/hakmem_super_registry.d @@ -2,7 +2,10 @@ hakmem_super_registry.o: core/hakmem_super_registry.c \ core/hakmem_super_registry.h core/hakmem_tiny_superslab.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/superslab/superslab_inline.h core/superslab/superslab_types.h \ - core/tiny_debug_ring.h core/hakmem_build_flags.h core/tiny_remote.h \ + core/superslab/../tiny_box_geometry.h \ + 
core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ + core/hakmem_build_flags.h core/tiny_remote.h \ core/hakmem_tiny_superslab_constants.h core/hakmem_super_registry.h: core/hakmem_tiny_superslab.h: @@ -10,6 +13,9 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/hakmem_build_flags.h: core/tiny_remote.h: diff --git a/hakmem_tiny_bg_spill.d b/hakmem_tiny_bg_spill.d index 04cced2c..d60757e8 100644 --- a/hakmem_tiny_bg_spill.d +++ b/hakmem_tiny_bg_spill.d @@ -6,9 +6,9 @@ hakmem_tiny_bg_spill.o: core/hakmem_tiny_bg_spill.c \ core/ptr_track.h core/hakmem_super_registry.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ - core/tiny_remote.h core/hakmem_tiny.h core/hakmem_trace.h \ - core/hakmem_tiny_mini_mag.h + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/tiny_debug_ring.h core/tiny_remote.h core/hakmem_tiny.h \ + core/hakmem_trace.h core/hakmem_tiny_mini_mag.h core/hakmem_tiny_bg_spill.h: core/box/tiny_next_ptr_box.h: core/hakmem_tiny_config.h: @@ -25,6 +25,7 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny.h: diff --git a/hakmem_tiny_magazine.d b/hakmem_tiny_magazine.d index c33f0796..8f0d70e3 100644 --- a/hakmem_tiny_magazine.d +++ b/hakmem_tiny_magazine.d @@ -4,11 +4,13 @@ hakmem_tiny_magazine.o: core/hakmem_tiny_magazine.c \ core/hakmem_tiny_config.h 
core/hakmem_tiny_superslab.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/superslab/superslab_inline.h core/superslab/superslab_types.h \ - core/tiny_debug_ring.h core/tiny_remote.h \ - core/hakmem_tiny_superslab_constants.h core/hakmem_super_registry.h \ - core/hakmem_prof.h core/hakmem_internal.h core/hakmem.h \ - core/hakmem_config.h core/hakmem_features.h core/hakmem_sys.h \ - core/hakmem_whale.h core/box/tiny_next_ptr_box.h \ + core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ + core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ + core/hakmem_super_registry.h core/hakmem_prof.h core/hakmem_internal.h \ + core/hakmem.h core/hakmem_config.h core/hakmem_features.h \ + core/hakmem_sys.h core/hakmem_whale.h core/box/tiny_next_ptr_box.h \ core/hakmem_tiny_config.h core/tiny_nextptr.h core/tiny_region_id.h \ core/tiny_box_geometry.h core/ptr_track.h core/hakmem_tiny_magazine.h: @@ -22,6 +24,9 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: diff --git a/hakmem_tiny_query.d b/hakmem_tiny_query.d index fe3f7bc3..6331abf1 100644 --- a/hakmem_tiny_query.d +++ b/hakmem_tiny_query.d @@ -4,9 +4,11 @@ hakmem_tiny_query.o: core/hakmem_tiny_query.c core/hakmem_tiny.h \ core/hakmem_tiny_query_api.h core/hakmem_tiny_superslab.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/superslab/superslab_inline.h core/superslab/superslab_types.h \ - core/tiny_debug_ring.h core/tiny_remote.h \ - core/hakmem_tiny_superslab_constants.h core/hakmem_super_registry.h \ - core/hakmem_config.h 
core/hakmem_features.h + core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ + core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ + core/hakmem_super_registry.h core/hakmem_config.h core/hakmem_features.h core/hakmem_tiny.h: core/hakmem_build_flags.h: core/hakmem_trace.h: @@ -18,6 +20,9 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: diff --git a/hakmem_tiny_sfc.d b/hakmem_tiny_sfc.d index 56275f0e..992a436e 100644 --- a/hakmem_tiny_sfc.d +++ b/hakmem_tiny_sfc.d @@ -6,13 +6,14 @@ hakmem_tiny_sfc.o: core/hakmem_tiny_sfc.c core/tiny_alloc_fast_sfc.inc.h \ core/hakmem_tiny_config.h core/ptr_track.h core/hakmem_super_registry.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ - core/tiny_remote.h core/tiny_tls.h core/box/tls_sll_box.h \ - core/box/../hakmem_tiny_config.h core/box/../hakmem_build_flags.h \ - core/box/../tiny_remote.h core/box/../tiny_region_id.h \ - core/box/../hakmem_tiny_integrity.h core/box/../hakmem_tiny.h \ - core/box/../ptr_track.h core/box/../ptr_trace.h \ - core/box/../tiny_debug_ring.h core/box/../superslab/superslab_inline.h + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/tiny_debug_ring.h core/tiny_remote.h core/tiny_tls.h \ + core/box/tls_sll_box.h core/box/../hakmem_tiny_config.h \ + core/box/../hakmem_build_flags.h core/box/../tiny_remote.h \ + core/box/../tiny_region_id.h core/box/../hakmem_tiny_integrity.h \ + 
core/box/../hakmem_tiny.h core/box/../ptr_track.h \ + core/box/../ptr_trace.h core/box/../tiny_debug_ring.h \ + core/box/../superslab/superslab_inline.h core/tiny_alloc_fast_sfc.inc.h: core/hakmem_tiny.h: core/hakmem_build_flags.h: @@ -32,6 +33,7 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/tiny_tls.h: diff --git a/hakmem_tiny_stats.d b/hakmem_tiny_stats.d index 7e961c01..0b4a57ae 100644 --- a/hakmem_tiny_stats.d +++ b/hakmem_tiny_stats.d @@ -4,9 +4,11 @@ hakmem_tiny_stats.o: core/hakmem_tiny_stats.c core/hakmem_tiny.h \ core/hakmem_tiny_stats_api.h core/hakmem_tiny_superslab.h \ core/superslab/superslab_types.h core/hakmem_tiny_superslab_constants.h \ core/superslab/superslab_inline.h core/superslab/superslab_types.h \ - core/tiny_debug_ring.h core/tiny_remote.h \ - core/hakmem_tiny_superslab_constants.h core/hakmem_config.h \ - core/hakmem_features.h core/hakmem_tiny_stats.h + core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ + core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ + core/hakmem_config.h core/hakmem_features.h core/hakmem_tiny_stats.h core/hakmem_tiny.h: core/hakmem_build_flags.h: core/hakmem_trace.h: @@ -18,6 +20,9 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: diff --git a/hakmem_tiny_superslab.d b/hakmem_tiny_superslab.d index a504b4b6..dd8f2865 100644 --- a/hakmem_tiny_superslab.d +++ b/hakmem_tiny_superslab.d @@ -1,7 +1,9 @@ 
hakmem_tiny_superslab.o: core/hakmem_tiny_superslab.c \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/hakmem_build_flags.h core/tiny_remote.h \ core/hakmem_tiny_superslab_constants.h core/box/ss_hot_cold_box.h \ core/box/../superslab/superslab_types.h core/hakmem_super_registry.h \ @@ -11,12 +13,16 @@ hakmem_tiny_superslab.o: core/hakmem_tiny_superslab.c \ core/hakmem_features.h core/hakmem_sys.h core/hakmem_whale.h \ core/tiny_region_id.h core/tiny_box_geometry.h core/ptr_track.h \ core/hakmem_tiny_integrity.h core/box/tiny_next_ptr_box.h \ - core/hakmem_tiny_config.h core/tiny_nextptr.h + core/hakmem_tiny_config.h core/tiny_nextptr.h \ + core/box/slab_freelist_atomic.h core/hakmem_tiny_superslab.h: core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/hakmem_build_flags.h: core/tiny_remote.h: @@ -42,3 +48,4 @@ core/hakmem_tiny_integrity.h: core/box/tiny_next_ptr_box.h: core/hakmem_tiny_config.h: core/tiny_nextptr.h: +core/box/slab_freelist_atomic.h: diff --git a/tiny_adaptive_sizing.d b/tiny_adaptive_sizing.d index 38af50e6..b989ea4b 100644 --- a/tiny_adaptive_sizing.d +++ b/tiny_adaptive_sizing.d @@ -7,8 +7,8 @@ tiny_adaptive_sizing.o: core/tiny_adaptive_sizing.c \ core/ptr_track.h core/hakmem_super_registry.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - 
core/superslab/superslab_types.h core/tiny_debug_ring.h \ - core/tiny_remote.h + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/tiny_debug_ring.h core/tiny_remote.h core/tiny_adaptive_sizing.h: core/hakmem_tiny.h: core/hakmem_build_flags.h: @@ -28,5 +28,6 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: core/tiny_debug_ring.h: core/tiny_remote.h: diff --git a/tiny_fastcache.d b/tiny_fastcache.d index edc670dc..b42d2568 100644 --- a/tiny_fastcache.d +++ b/tiny_fastcache.d @@ -5,9 +5,9 @@ tiny_fastcache.o: core/tiny_fastcache.c core/tiny_fastcache.h \ core/hakmem_tiny_config.h core/ptr_track.h core/hakmem_super_registry.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ - core/tiny_remote.h core/hakmem_tiny.h core/hakmem_trace.h \ - core/hakmem_tiny_mini_mag.h + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/tiny_debug_ring.h core/tiny_remote.h core/hakmem_tiny.h \ + core/hakmem_trace.h core/hakmem_tiny_mini_mag.h core/tiny_fastcache.h: core/box/tiny_next_ptr_box.h: core/hakmem_tiny_config.h: @@ -24,6 +24,7 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny.h: diff --git a/tiny_publish.d b/tiny_publish.d index ac491700..18f02817 100644 --- a/tiny_publish.d +++ b/tiny_publish.d @@ -3,7 +3,9 @@ tiny_publish.o: core/tiny_publish.c core/hakmem_tiny.h \ core/hakmem_tiny_mini_mag.h core/box/mailbox_box.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h 
core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/tiny_remote.h core/hakmem_tiny_superslab_constants.h \ core/tiny_publish.h core/hakmem_tiny_superslab.h \ core/hakmem_tiny_stats_api.h @@ -17,6 +19,9 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: diff --git a/tiny_remote.d b/tiny_remote.d index df9c624b..f2b226d6 100644 --- a/tiny_remote.d +++ b/tiny_remote.d @@ -1,7 +1,9 @@ tiny_remote.o: core/tiny_remote.c core/tiny_remote.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/hakmem_build_flags.h core/hakmem_tiny_superslab_constants.h core/tiny_remote.h: core/hakmem_tiny_superslab.h: @@ -9,6 +11,9 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/hakmem_build_flags.h: core/hakmem_tiny_superslab_constants.h: diff --git a/tiny_sticky.d b/tiny_sticky.d index 7ddc7792..d4b718b4 100644 --- a/tiny_sticky.d +++ 
b/tiny_sticky.d @@ -3,7 +3,9 @@ tiny_sticky.o: core/tiny_sticky.c core/hakmem_tiny.h \ core/hakmem_tiny_mini_mag.h core/tiny_sticky.h \ core/hakmem_tiny_superslab.h core/superslab/superslab_types.h \ core/hakmem_tiny_superslab_constants.h core/superslab/superslab_inline.h \ - core/superslab/superslab_types.h core/tiny_debug_ring.h \ + core/superslab/superslab_types.h core/superslab/../tiny_box_geometry.h \ + core/superslab/../hakmem_tiny_superslab_constants.h \ + core/superslab/../hakmem_tiny_config.h core/tiny_debug_ring.h \ core/tiny_remote.h core/hakmem_tiny_superslab_constants.h core/hakmem_tiny.h: core/hakmem_build_flags.h: @@ -15,6 +17,9 @@ core/superslab/superslab_types.h: core/hakmem_tiny_superslab_constants.h: core/superslab/superslab_inline.h: core/superslab/superslab_types.h: +core/superslab/../tiny_box_geometry.h: +core/superslab/../hakmem_tiny_superslab_constants.h: +core/superslab/../hakmem_tiny_config.h: core/tiny_debug_ring.h: core/tiny_remote.h: core/hakmem_tiny_superslab_constants.h: