hakmem/core/box/pagefault_telemetry_box.h
Moe Charm (CI) 03ba62df4d Phase 23 Unified Cache + PageFaultTelemetry generalization: Mid/VM page-fault bottleneck identified
Summary:
- Phase 23 Unified Cache: +30% improvement (Random Mixed 256B: 18.18M → 23.68M ops/s)
- PageFaultTelemetry: Extended to generic buckets (C0-C7, MID, L25, SSM)
- Measurement-driven decision: Mid/VM page-faults (80-100K) >> Tiny (6K) → prioritize Mid/VM optimization

Phase 23 Changes:
1. Unified Cache implementation (core/front/tiny_unified_cache.{c,h})
   - Direct SuperSlab carve (TLS SLL bypass)
   - Self-contained pop-or-refill pattern
   - ENV: HAKMEM_TINY_UNIFIED_CACHE=1, HAKMEM_TINY_UNIFIED_C{0-7}=128

2. Fast path pruning (tiny_alloc_fast.inc.h, tiny_free_fast_v2.inc.h)
   - Unified ON → direct cache access (skip all intermediate layers)
   - Alloc: unified_cache_pop_or_refill() → immediate fail to slow
   - Free: unified_cache_push() → fallback to SLL only if full

PageFaultTelemetry Changes:
3. Generic bucket architecture (core/box/pagefault_telemetry_box.{c,h})
   - PF_BUCKET_{C0-C7, MID, L25, SSM} for domain-specific measurement
   - Integration: hak_pool_try_alloc(), l25_alloc_new_run(), shared_pool_allocate_superslab_unlocked()

4. Measurement results (Random Mixed 500K / 256B):
   - Tiny C2-C7: 2-33 pages, high reuse (64-3.8 touches/page)
   - SSM: 512 pages (initialization footprint)
   - MID/L25: 0 (unused in this workload)
   - Mid/Large VM benchmarks: 80-100K page-faults (13-16x higher than Tiny)

Ring Cache Enhancements:
5. Hot Ring Cache (core/front/tiny_ring_cache.{c,h})
   - ENV: HAKMEM_TINY_HOT_RING_ENABLE=1, HAKMEM_TINY_HOT_RING_C{0-7}=size
   - Conditional compilation cleanup

Documentation:
6. Analysis reports
   - RANDOM_MIXED_BOTTLENECK_ANALYSIS.md: Page-fault breakdown
   - RANDOM_MIXED_SUMMARY.md: Phase 23 summary
   - RING_CACHE_ACTIVATION_GUIDE.md: Ring cache usage
   - CURRENT_TASK.md: Updated with Phase 23 results and Phase 24 plan

Next Steps (Phase 24):
- Target: Mid/VM PageArena/HotSpanBox (page-fault reduction 80-100K → 30-40K)
- Tiny SSM optimization deferred (low ROI, ~6K page-faults already optimal)
- Expected improvement: +30-50% for Mid/Large workloads

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-17 02:47:58 +09:00
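
The "pop-or-refill" and "push, fall back only if full" behavior listed under Phase 23 Changes can be illustrated with a minimal sketch. This is not the code in core/front/tiny_unified_cache.c: every name below (`unified_cache_sketch_t`, `sketch_carve`, `sketch_pop_or_refill`, `sketch_push`, `REFILL_BATCH`) is hypothetical, and a static arena stands in for the direct SuperSlab carve. Only the capacity of 128 is taken from the commit message (HAKMEM_TINY_UNIFIED_C{0-7}=128).

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_CAP    128  /* mirrors HAKMEM_TINY_UNIFIED_C{0-7}=128 */
#define REFILL_BATCH 32   /* hypothetical batch size, not from the source */

/* Hypothetical per-class cache: a plain LIFO array of free blocks. */
typedef struct {
    void*    slots[CACHE_CAP];
    uint32_t count;
} unified_cache_sketch_t;

/* Hypothetical stand-in for a SuperSlab carve: hands out fixed-size
 * blocks from a static arena until the arena is exhausted. */
static unsigned char g_arena[64 * 256];
static size_t        g_arena_used;

static size_t sketch_carve(void** out, size_t want, size_t block_size) {
    size_t n = 0;
    while (n < want && g_arena_used + block_size <= sizeof g_arena) {
        out[n++] = &g_arena[g_arena_used];
        g_arena_used += block_size;
    }
    return n;
}

/* Pop a block. If the cache is empty, refill it with one batch first.
 * Returns NULL when the refill yields nothing, which the caller treats
 * as "go to the slow path". */
static void* sketch_pop_or_refill(unified_cache_sketch_t* c, size_t block_size) {
    if (c->count == 0) {
        c->count = (uint32_t)sketch_carve(c->slots, REFILL_BATCH, block_size);
        if (c->count == 0) {
            return NULL;
        }
    }
    return c->slots[--c->count];
}

/* Push a freed block. Returns 0 when the cache is full so the caller
 * can fall back to another free list. */
static int sketch_push(unified_cache_sketch_t* c, void* p) {
    if (c->count >= CACHE_CAP) {
        return 0;
    }
    c->slots[c->count++] = p;
    return 1;
}
```

In this sketch both the alloc and free paths touch a single array and a counter, which is the sense in which a unified cache can skip intermediate layers; how the real implementation sizes its batches is not stated in the commit message.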


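// pagefault_telemetry_box.h - Box PageFaultTelemetry: Tiny page-touch visualization
// Purpose:
// - A box that approximately measures, per class, how many pages were touched and how often.
// - Called only from the Tiny front end; does not change Superslab or kernel-side behavior.
//
// Design:
// - Normalize addresses to 4KB page granularity and hash them into a simple Bloom-style bitset.
// - Provide 1024 bits (= 16 x uint64_t) per class and derive an approximate page count with popcount.
// - Collisions can occur, but the result is adequate as a lower-bound estimate; the goal is to see trends.
//
// ENV Control:
// - HAKMEM_TINY_PAGEFAULT_TELEMETRY=1 ... enable measurement
// - HAKMEM_TINY_PAGEFAULT_DUMP=1 ... dump once to stderr at exit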
#ifndef HAK_BOX_PAGEFAULT_TELEMETRY_H
#define HAK_BOX_PAGEFAULT_TELEMETRY_H
#include <stdint.h>
#ifdef __cplusplus
extern "C" {
#endif
// Number of Tiny classes (assumed to be 8 if not already defined)
#ifndef TINY_NUM_CLASSES
#define TINY_NUM_CLASSES 8
#endif
// Domain bucket definitions:
// 0..7 : Tiny C0..C7
// 8 : Mid Pool (hak_pool_*)
// 9 : L25 Pool (hak_l25_pool_*)
// 10 : Shared SuperSlab meta / backing
// 11 : Reserved
enum {
    PF_BUCKET_TINY_BASE  = 0,
    PF_BUCKET_TINY_LIMIT = TINY_NUM_CLASSES,
    PF_BUCKET_MID        = TINY_NUM_CLASSES,
    PF_BUCKET_L25        = TINY_NUM_CLASSES + 1,
    PF_BUCKET_SS_META    = TINY_NUM_CLASSES + 2,
    PF_BUCKET_RESERVED   = TINY_NUM_CLASSES + 3,
    PF_BUCKET_MAX        = TINY_NUM_CLASSES + 4
};
// Bitset storage (1024 bits per bucket)
extern __thread uint64_t g_pf_bloom[PF_BUCKET_MAX][16];
// Total touch count (number of calls, not number of pages)
extern __thread uint64_t g_pf_touch[PF_BUCKET_MAX];
// Enabled/disabled check driven by ENV (result is cached)
int pagefault_telemetry_enabled(void);
// Aggregate and dump (prints only when ENV HAKMEM_TINY_PAGEFAULT_DUMP=1)
void pagefault_telemetry_dump(void);
// ----------------------------------------------------------------------------
// Inline helper: record a page touch
// ----------------------------------------------------------------------------
static inline void pagefault_telemetry_touch(int cls, const void* ptr) {
#if HAKMEM_DEBUG_COUNTERS
    if (!pagefault_telemetry_enabled()) {
        return;
    }
    if (cls < 0 || cls >= PF_BUCKET_MAX) {
        return;
    }
    // Normalize to a 4KB page
    uintptr_t addr = (uintptr_t)ptr;
    uintptr_t page = addr >> 12;
    // Hash into the 1024-entry bitset
    uint32_t idx = (uint32_t)(page & 1023u);
    uint32_t word = idx >> 6;
    uint32_t bit = idx & 63u;
    uint64_t mask = (uint64_t)1u << bit;
    uint64_t old = g_pf_bloom[cls][word];
    if (!(old & mask)) {
        g_pf_bloom[cls][word] = old | mask;
    }
    g_pf_touch[cls]++;
#else
    (void)cls;
    (void)ptr;
#endif
}
#ifdef __cplusplus
}
#endif
#endif // HAK_BOX_PAGEFAULT_TELEMETRY_H
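
The header declares pagefault_telemetry_dump() but does not show how the bitset becomes a page count. A minimal sketch of that step, assuming the dump simply sums the set bits of one bucket's 16 words and divides the touch counter by the result, might look like the following. The helper names (`pf_popcount64`, `pf_estimate_pages`, `pf_touches_per_page`, `pf_print_bucket`) and the output format are hypothetical and not taken from pagefault_telemetry_box.c.

```c
#include <stdint.h>
#include <stdio.h>

#define PF_WORDS 16  /* 16 x uint64_t = 1024 bits per bucket, as in the header */

/* Portable population count (clears the lowest set bit each iteration). */
static unsigned pf_popcount64(uint64_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Approximate number of distinct 4KB pages recorded in one bucket. */
static unsigned pf_estimate_pages(const uint64_t bloom[PF_WORDS]) {
    unsigned pages = 0;
    for (int w = 0; w < PF_WORDS; w++) {
        pages += pf_popcount64(bloom[w]);
    }
    return pages;
}

/* Average touches per estimated page; 0.0 when nothing was recorded. */
static double pf_touches_per_page(uint64_t touches, unsigned pages) {
    return pages ? (double)touches / (double)pages : 0.0;
}

/* Hypothetical one-line report for a single bucket. */
static void pf_print_bucket(FILE* out, int bucket,
                            const uint64_t bloom[PF_WORDS], uint64_t touches) {
    unsigned pages = pf_estimate_pages(bloom);
    fprintf(out, "bucket=%d pages~=%u touches=%llu touches/page=%.1f\n",
            bucket, pages, (unsigned long long)touches,
            pf_touches_per_page(touches, pages));
}
```

Because the index is `page & 1023`, two pages whose page numbers differ by a multiple of 1024 (4 MiB apart) share a bit, so the estimate can only undercount and saturates at 1024 pages per bucket.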