Files
hakmem/core/box/tiny_route_env_box.h

177 lines
7.2 KiB
C
Raw Normal View History

// tiny_route_env_box.h - Route snapshot for Tiny front (Heap vs Legacy)
// 役割:
// - 起動時に「各クラスが TinyHeap を使うか」をスナップショットし、ホットパスでは LUT 1 回に縮約する。
// - HAKMEM_TINY_HEAP_PROFILE / HAKMEM_TINY_HEAP_BOX / HAKMEM_TINY_HEAP_CLASSES の組合せをここで解決する。
#pragma once
#include <stdint.h>
#include <stdlib.h>
#include "../hakmem_tiny_config.h"
#include "tiny_heap_env_box.h"
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測 **目的**: free dispatcher(29%)の内訳を細分化して計測。 **実装内容**: - FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0) - カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls - hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み - 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ) **計測結果**: Mixed 16-1024B (1M iter, ws=400): - total=8,081, route_calls=267,967, env_checks=9 - BENCH_FAST_FRONT により大半は早期リターン - route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees) - ENV check は初期化時の 9回のみ(snapshot 効果) C6-heavy (257-768B, 1M iter, ws=400): - total=500,099, route_calls=1,034, env_checks=9 - fg_classify_domain に到達する free が多い - route_for_class 呼び出しは極小(snapshot 効果) **結論**: - ENV check は既に十分最適化されている(初期化時のみ) - route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1) - 次フェーズ(OPT-2)では別のアプローチを検討 **ドキュメント追加**: - docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規) - CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
#include "free_dispatch_stats_box.h" // Phase FREE-DISPATCHER-OPT-1
#include "smallobject_hotbox_v3_env_box.h"
#include "smallobject_hotbox_v4_env_box.h"
#include "smallobject_v5_env_box.h"
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor Phase v6-1: C6-only route stub (v1/pool fallback) Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation - 2MiB segment / 64KiB page allocation - O(1) ptr→page_meta lookup with segment masking - C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s) Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching) - TLS ownership fast-path skip page_meta for 90%+ of frees - Batch header writes during refill (32 allocs = 1 header write) - TLS batch refill (1/32 refill frequency) - C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅ Phase v6-4: Mixed hang fix (segment metadata lookup correction) - Root cause: metadata lookup was reading mmap region instead of TLS slot - Fix: use TLS slot descriptor with in_use validation - Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅ Phase v6-refactor: Code quality improvements (macro unification + inline + docs) - Add SMALL_V6_* prefix macros (header, pointer conversion, page index) - Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6) - Doxygen-style comments for all public functions - Result: 0 compiler warnings, maintained +1.2% performance Files: - core/box/smallobject_core_v6_box.h (new, type & API definitions) - core/box/smallobject_cold_iface_v6.h (new, cold iface API) - core/box/smallsegment_v6_box.h (new, segment type definitions) - core/smallobject_core_v6.c (new, C6 alloc/free implementation) - core/smallobject_cold_iface_v6.c (new, refill/retire logic) - core/smallsegment_v6.c (new, segment allocator) - docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document) - core/box/tiny_route_env_box.h (modified, v6 route added) - core/front/malloc_tiny_fast.h (modified, v6 case in route switch) - Makefile (modified, v6 objects added) - CURRENT_TASK.md (modified, v6 status added) Status: - C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅ - Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅ - Build: 0 warnings, fully documented ✅ 🤖 Generated with Claude Code Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
// ENV sentinel values
#ifndef ENV_UNINIT
#define ENV_UNINIT (-1)
#define ENV_ENABLED (1)
#define ENV_DISABLED (0)
#endif
typedef enum {
TINY_ROUTE_LEGACY = 0,
TINY_ROUTE_HEAP = 1, // TinyHeap v1
TINY_ROUTE_HOTHEAP_V2 = 2, // TinyHotHeap v2
TINY_ROUTE_SMALL_HEAP_V3 = 3, // SmallObject HotHeap v3 (C7-first,研究箱)
TINY_ROUTE_SMALL_HEAP_V4 = 4, // SmallObject HotHeap v4 (stub, route未使用)
TINY_ROUTE_SMALL_HEAP_V5 = 5, // SmallObject HotHeap v5 (C6-only route stub, Phase v5-1)
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor Phase v6-1: C6-only route stub (v1/pool fallback) Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation - 2MiB segment / 64KiB page allocation - O(1) ptr→page_meta lookup with segment masking - C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s) Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching) - TLS ownership fast-path skip page_meta for 90%+ of frees - Batch header writes during refill (32 allocs = 1 header write) - TLS batch refill (1/32 refill frequency) - C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅ Phase v6-4: Mixed hang fix (segment metadata lookup correction) - Root cause: metadata lookup was reading mmap region instead of TLS slot - Fix: use TLS slot descriptor with in_use validation - Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅ Phase v6-refactor: Code quality improvements (macro unification + inline + docs) - Add SMALL_V6_* prefix macros (header, pointer conversion, page index) - Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6) - Doxygen-style comments for all public functions - Result: 0 compiler warnings, maintained +1.2% performance Files: - core/box/smallobject_core_v6_box.h (new, type & API definitions) - core/box/smallobject_cold_iface_v6.h (new, cold iface API) - core/box/smallsegment_v6_box.h (new, segment type definitions) - core/smallobject_core_v6.c (new, C6 alloc/free implementation) - core/smallobject_cold_iface_v6.c (new, refill/retire logic) - core/smallsegment_v6.c (new, segment allocator) - docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document) - core/box/tiny_route_env_box.h (modified, v6 route added) - core/front/malloc_tiny_fast.h (modified, v6 case in route switch) - Makefile (modified, v6 objects added) - CURRENT_TASK.md (modified, v6 status added) Status: - C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅ - Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅ - Build: 0 warnings, fully documented ✅ 🤖 Generated with Claude Code Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
TINY_ROUTE_SMALL_HEAP_V6 = 6, // SmallObject Core v6 (C6-only route stub, Phase v6-1)
TINY_ROUTE_SMALL_HEAP_V7 = 7, // SmallObject HotHeap v7 (C6-only route stub, Phase v7-1)
} tiny_route_kind_t;
extern tiny_route_kind_t g_tiny_route_class[TINY_NUM_CLASSES];
extern int g_tiny_route_snapshot_done;
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor Phase v6-1: C6-only route stub (v1/pool fallback) Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation - 2MiB segment / 64KiB page allocation - O(1) ptr→page_meta lookup with segment masking - C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s) Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching) - TLS ownership fast-path skip page_meta for 90%+ of frees - Batch header writes during refill (32 allocs = 1 header write) - TLS batch refill (1/32 refill frequency) - C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅ Phase v6-4: Mixed hang fix (segment metadata lookup correction) - Root cause: metadata lookup was reading mmap region instead of TLS slot - Fix: use TLS slot descriptor with in_use validation - Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅ Phase v6-refactor: Code quality improvements (macro unification + inline + docs) - Add SMALL_V6_* prefix macros (header, pointer conversion, page index) - Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6) - Doxygen-style comments for all public functions - Result: 0 compiler warnings, maintained +1.2% performance Files: - core/box/smallobject_core_v6_box.h (new, type & API definitions) - core/box/smallobject_cold_iface_v6.h (new, cold iface API) - core/box/smallsegment_v6_box.h (new, segment type definitions) - core/smallobject_core_v6.c (new, C6 alloc/free implementation) - core/smallobject_cold_iface_v6.c (new, refill/retire logic) - core/smallsegment_v6.c (new, segment allocator) - docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document) - core/box/tiny_route_env_box.h (modified, v6 route added) - core/front/malloc_tiny_fast.h (modified, v6 case in route switch) - Makefile (modified, v6 objects added) - CURRENT_TASK.md (modified, v6 status added) Status: - C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅ - Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅ - Build: 0 warnings, fully documented ✅ 🤖 Generated with Claude Code Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
// ============================================================================
// Phase v6-1: SmallObject Core v6 ENV gate (must be before tiny_route_snapshot_init)
// ============================================================================
// small_heap_v6_enabled() - グローバル v6 enable check
static inline int small_heap_v6_enabled(void) {
static int g_enabled = ENV_UNINIT;
if (__builtin_expect(g_enabled == ENV_UNINIT, 0)) {
const char* e = getenv("HAKMEM_SMALL_HEAP_V6_ENABLED");
g_enabled = (e && *e && *e != '0') ? ENV_ENABLED : ENV_DISABLED;
}
return (g_enabled == ENV_ENABLED);
}
// small_heap_v6_class_mask() - v6 対象クラスのビットマスク
static inline uint32_t small_heap_v6_class_mask(void) {
static int g_mask = ENV_UNINIT;
if (__builtin_expect(g_mask == ENV_UNINIT, 0)) {
const char* e = getenv("HAKMEM_SMALL_HEAP_V6_CLASSES");
if (e && *e) {
g_mask = (int)strtoul(e, NULL, 0);
} else {
g_mask = 0x0; // default: OFF
}
}
return (uint32_t)g_mask;
}
// small_heap_v6_class_enabled() - 指定クラスが v6 有効か
static inline int small_heap_v6_class_enabled(uint32_t class_idx) {
if (class_idx >= 8) return 0;
if (!small_heap_v6_enabled()) return 0;
uint32_t mask = small_heap_v6_class_mask();
return (mask & (1u << class_idx)) ? 1 : 0;
}
// ============================================================================
// Phase v7-1: SmallObject HotHeap v7 ENV gate (must be before tiny_route_snapshot_init)
// ============================================================================
// small_heap_v7_enabled() - グローバル v7 enable check
static inline int small_heap_v7_enabled(void) {
static int g_enabled = ENV_UNINIT;
if (__builtin_expect(g_enabled == ENV_UNINIT, 0)) {
const char* e = getenv("HAKMEM_SMALL_HEAP_V7_ENABLED");
g_enabled = (e && *e && *e != '0') ? ENV_ENABLED : ENV_DISABLED;
}
return (g_enabled == ENV_ENABLED);
}
// small_heap_v7_class_mask() - v7 対象クラスのビットマスク
static inline uint32_t small_heap_v7_class_mask(void) {
static int g_mask = ENV_UNINIT;
if (__builtin_expect(g_mask == ENV_UNINIT, 0)) {
const char* e = getenv("HAKMEM_SMALL_HEAP_V7_CLASSES");
if (e && *e) {
g_mask = (int)strtoul(e, NULL, 0);
} else {
g_mask = 0x0; // default: OFF
}
}
return (uint32_t)g_mask;
}
// small_heap_v7_class_enabled() - 指定クラスが v7 有効か
static inline int small_heap_v7_class_enabled(uint32_t class_idx) {
if (class_idx >= 8) return 0;
if (!small_heap_v7_enabled()) return 0;
uint32_t mask = small_heap_v7_class_mask();
return (mask & (1u << class_idx)) ? 1 : 0;
}
static inline void tiny_route_snapshot_init(void) {
for (int i = 0; i < TINY_NUM_CLASSES; i++) {
// Phase v7-1: C6-only v7 route stub (highest priority)
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測 **目的**: free dispatcher(29%)の内訳を細分化して計測。 **実装内容**: - FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0) - カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls - hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み - 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ) **計測結果**: Mixed 16-1024B (1M iter, ws=400): - total=8,081, route_calls=267,967, env_checks=9 - BENCH_FAST_FRONT により大半は早期リターン - route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees) - ENV check は初期化時の 9回のみ(snapshot 効果) C6-heavy (257-768B, 1M iter, ws=400): - total=500,099, route_calls=1,034, env_checks=9 - fg_classify_domain に到達する free が多い - route_for_class 呼び出しは極小(snapshot 効果) **結論**: - ENV check は既に十分最適化されている(初期化時のみ) - route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1) - 次フェーズ(OPT-2)では別のアプローチを検討 **ドキュメント追加**: - docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規) - CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
FREE_DISPATCH_STAT_INC(env_checks); // ENV check counter
if (small_heap_v7_class_enabled((uint32_t)i)) {
g_tiny_route_class[i] = TINY_ROUTE_SMALL_HEAP_V7;
FREE_DISPATCH_STAT_INC(route_core_v7);
} else if (small_heap_v6_class_enabled((uint32_t)i)) {
// Phase v6-1: C6-only v6 route stub
FREE_DISPATCH_STAT_INC(env_checks);
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor Phase v6-1: C6-only route stub (v1/pool fallback) Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation - 2MiB segment / 64KiB page allocation - O(1) ptr→page_meta lookup with segment masking - C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s) Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching) - TLS ownership fast-path skip page_meta for 90%+ of frees - Batch header writes during refill (32 allocs = 1 header write) - TLS batch refill (1/32 refill frequency) - C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅ Phase v6-4: Mixed hang fix (segment metadata lookup correction) - Root cause: metadata lookup was reading mmap region instead of TLS slot - Fix: use TLS slot descriptor with in_use validation - Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅ Phase v6-refactor: Code quality improvements (macro unification + inline + docs) - Add SMALL_V6_* prefix macros (header, pointer conversion, page index) - Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6) - Doxygen-style comments for all public functions - Result: 0 compiler warnings, maintained +1.2% performance Files: - core/box/smallobject_core_v6_box.h (new, type & API definitions) - core/box/smallobject_cold_iface_v6.h (new, cold iface API) - core/box/smallsegment_v6_box.h (new, segment type definitions) - core/smallobject_core_v6.c (new, C6 alloc/free implementation) - core/smallobject_cold_iface_v6.c (new, refill/retire logic) - core/smallsegment_v6.c (new, segment allocator) - docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document) - core/box/tiny_route_env_box.h (modified, v6 route added) - core/front/malloc_tiny_fast.h (modified, v6 case in route switch) - Makefile (modified, v6 objects added) - CURRENT_TASK.md (modified, v6 status added) Status: - C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅ - Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅ - Build: 0 warnings, fully documented ✅ 🤖 Generated with Claude Code Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
g_tiny_route_class[i] = TINY_ROUTE_SMALL_HEAP_V6;
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測 **目的**: free dispatcher(29%)の内訳を細分化して計測。 **実装内容**: - FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0) - カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls - hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み - 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ) **計測結果**: Mixed 16-1024B (1M iter, ws=400): - total=8,081, route_calls=267,967, env_checks=9 - BENCH_FAST_FRONT により大半は早期リターン - route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees) - ENV check は初期化時の 9回のみ(snapshot 効果) C6-heavy (257-768B, 1M iter, ws=400): - total=500,099, route_calls=1,034, env_checks=9 - fg_classify_domain に到達する free が多い - route_for_class 呼び出しは極小(snapshot 効果) **結論**: - ENV check は既に十分最適化されている(初期化時のみ) - route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1) - 次フェーズ(OPT-2)では別のアプローチを検討 **ドキュメント追加**: - docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規) - CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
FREE_DISPATCH_STAT_INC(route_core_v6);
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor Phase v6-1: C6-only route stub (v1/pool fallback) Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation - 2MiB segment / 64KiB page allocation - O(1) ptr→page_meta lookup with segment masking - C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s) Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching) - TLS ownership fast-path skip page_meta for 90%+ of frees - Batch header writes during refill (32 allocs = 1 header write) - TLS batch refill (1/32 refill frequency) - C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅ Phase v6-4: Mixed hang fix (segment metadata lookup correction) - Root cause: metadata lookup was reading mmap region instead of TLS slot - Fix: use TLS slot descriptor with in_use validation - Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅ Phase v6-refactor: Code quality improvements (macro unification + inline + docs) - Add SMALL_V6_* prefix macros (header, pointer conversion, page index) - Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6) - Doxygen-style comments for all public functions - Result: 0 compiler warnings, maintained +1.2% performance Files: - core/box/smallobject_core_v6_box.h (new, type & API definitions) - core/box/smallobject_cold_iface_v6.h (new, cold iface API) - core/box/smallsegment_v6_box.h (new, segment type definitions) - core/smallobject_core_v6.c (new, C6 alloc/free implementation) - core/smallobject_cold_iface_v6.c (new, refill/retire logic) - core/smallsegment_v6.c (new, segment allocator) - docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document) - core/box/tiny_route_env_box.h (modified, v6 route added) - core/front/malloc_tiny_fast.h (modified, v6 case in route switch) - Makefile (modified, v6 objects added) - CURRENT_TASK.md (modified, v6 status added) Status: - C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅ - Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅ - Build: 0 warnings, fully documented ✅ 🤖 Generated with Claude Code Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
} else if (i == 6 && small_heap_v5_class_enabled(6)) {
// Phase v5-1: C6-only v5 route stub (before v4 check)
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測 **目的**: free dispatcher(29%)の内訳を細分化して計測。 **実装内容**: - FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0) - カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls - hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み - 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ) **計測結果**: Mixed 16-1024B (1M iter, ws=400): - total=8,081, route_calls=267,967, env_checks=9 - BENCH_FAST_FRONT により大半は早期リターン - route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees) - ENV check は初期化時の 9回のみ(snapshot 効果) C6-heavy (257-768B, 1M iter, ws=400): - total=500,099, route_calls=1,034, env_checks=9 - fg_classify_domain に到達する free が多い - route_for_class 呼び出しは極小(snapshot 効果) **結論**: - ENV check は既に十分最適化されている(初期化時のみ) - route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1) - 次フェーズ(OPT-2)では別のアプローチを検討 **ドキュメント追加**: - docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規) - CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
FREE_DISPATCH_STAT_INC(env_checks);
g_tiny_route_class[i] = TINY_ROUTE_SMALL_HEAP_V5;
} else if (small_heap_v4_class_enabled((uint8_t)i)) {
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測 **目的**: free dispatcher(29%)の内訳を細分化して計測。 **実装内容**: - FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0) - カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls - hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み - 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ) **計測結果**: Mixed 16-1024B (1M iter, ws=400): - total=8,081, route_calls=267,967, env_checks=9 - BENCH_FAST_FRONT により大半は早期リターン - route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees) - ENV check は初期化時の 9回のみ(snapshot 効果) C6-heavy (257-768B, 1M iter, ws=400): - total=500,099, route_calls=1,034, env_checks=9 - fg_classify_domain に到達する free が多い - route_for_class 呼び出しは極小(snapshot 効果) **結論**: - ENV check は既に十分最適化されている(初期化時のみ) - route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1) - 次フェーズ(OPT-2)では別のアプローチを検討 **ドキュメント追加**: - docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規) - CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
FREE_DISPATCH_STAT_INC(env_checks);
g_tiny_route_class[i] = TINY_ROUTE_SMALL_HEAP_V4;
} else if (small_heap_v3_class_enabled((uint8_t)i)) {
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測 **目的**: free dispatcher(29%)の内訳を細分化して計測。 **実装内容**: - FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0) - カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls - hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み - 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ) **計測結果**: Mixed 16-1024B (1M iter, ws=400): - total=8,081, route_calls=267,967, env_checks=9 - BENCH_FAST_FRONT により大半は早期リターン - route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees) - ENV check は初期化時の 9回のみ(snapshot 効果) C6-heavy (257-768B, 1M iter, ws=400): - total=500,099, route_calls=1,034, env_checks=9 - fg_classify_domain に到達する free が多い - route_for_class 呼び出しは極小(snapshot 効果) **結論**: - ENV check は既に十分最適化されている(初期化時のみ) - route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1) - 次フェーズ(OPT-2)では別のアプローチを検討 **ドキュメント追加**: - docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規) - CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
FREE_DISPATCH_STAT_INC(env_checks);
g_tiny_route_class[i] = TINY_ROUTE_SMALL_HEAP_V3;
} else if (tiny_hotheap_v2_class_enabled((uint8_t)i)) {
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測 **目的**: free dispatcher(29%)の内訳を細分化して計測。 **実装内容**: - FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0) - カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls - hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み - 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ) **計測結果**: Mixed 16-1024B (1M iter, ws=400): - total=8,081, route_calls=267,967, env_checks=9 - BENCH_FAST_FRONT により大半は早期リターン - route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees) - ENV check は初期化時の 9回のみ(snapshot 効果) C6-heavy (257-768B, 1M iter, ws=400): - total=500,099, route_calls=1,034, env_checks=9 - fg_classify_domain に到達する free が多い - route_for_class 呼び出しは極小(snapshot 効果) **結論**: - ENV check は既に十分最適化されている(初期化時のみ) - route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1) - 次フェーズ(OPT-2)では別のアプローチを検討 **ドキュメント追加**: - docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規) - CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
FREE_DISPATCH_STAT_INC(env_checks);
g_tiny_route_class[i] = TINY_ROUTE_HOTHEAP_V2;
} else if (tiny_heap_box_enabled() && tiny_heap_class_route_enabled(i)) {
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測 **目的**: free dispatcher(29%)の内訳を細分化して計測。 **実装内容**: - FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0) - カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls - hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み - 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ) **計測結果**: Mixed 16-1024B (1M iter, ws=400): - total=8,081, route_calls=267,967, env_checks=9 - BENCH_FAST_FRONT により大半は早期リターン - route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees) - ENV check は初期化時の 9回のみ(snapshot 効果) C6-heavy (257-768B, 1M iter, ws=400): - total=500,099, route_calls=1,034, env_checks=9 - fg_classify_domain に到達する free が多い - route_for_class 呼び出しは極小(snapshot 効果) **結論**: - ENV check は既に十分最適化されている(初期化時のみ) - route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1) - 次フェーズ(OPT-2)では別のアプローチを検討 **ドキュメント追加**: - docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規) - CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
FREE_DISPATCH_STAT_INC(env_checks);
FREE_DISPATCH_STAT_INC(env_checks);
g_tiny_route_class[i] = TINY_ROUTE_HEAP;
} else {
g_tiny_route_class[i] = TINY_ROUTE_LEGACY;
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測 **目的**: free dispatcher(29%)の内訳を細分化して計測。 **実装内容**: - FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0) - カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls - hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み - 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ) **計測結果**: Mixed 16-1024B (1M iter, ws=400): - total=8,081, route_calls=267,967, env_checks=9 - BENCH_FAST_FRONT により大半は早期リターン - route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees) - ENV check は初期化時の 9回のみ(snapshot 効果) C6-heavy (257-768B, 1M iter, ws=400): - total=500,099, route_calls=1,034, env_checks=9 - fg_classify_domain に到達する free が多い - route_for_class 呼び出しは極小(snapshot 効果) **結論**: - ENV check は既に十分最適化されている(初期化時のみ) - route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1) - 次フェーズ(OPT-2)では別のアプローチを検討 **ドキュメント追加**: - docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規) - CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
FREE_DISPATCH_STAT_INC(route_tiny_legacy);
}
}
g_tiny_route_snapshot_done = 1;
}
static inline tiny_route_kind_t tiny_route_for_class(uint8_t ci) {
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測 **目的**: free dispatcher(29%)の内訳を細分化して計測。 **実装内容**: - FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0) - カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls - hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み - 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ) **計測結果**: Mixed 16-1024B (1M iter, ws=400): - total=8,081, route_calls=267,967, env_checks=9 - BENCH_FAST_FRONT により大半は早期リターン - route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees) - ENV check は初期化時の 9回のみ(snapshot 効果) C6-heavy (257-768B, 1M iter, ws=400): - total=500,099, route_calls=1,034, env_checks=9 - fg_classify_domain に到達する free が多い - route_for_class 呼び出しは極小(snapshot 効果) **結論**: - ENV check は既に十分最適化されている(初期化時のみ) - route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1) - 次フェーズ(OPT-2)では別のアプローチを検討 **ドキュメント追加**: - docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規) - CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
FREE_DISPATCH_STAT_INC(route_for_class_calls); // Phase FREE-DISPATCHER-OPT-1
if (__builtin_expect(!g_tiny_route_snapshot_done, 0)) {
tiny_route_snapshot_init();
}
if (__builtin_expect(ci >= TINY_NUM_CLASSES, 0)) {
return TINY_ROUTE_LEGACY;
}
return g_tiny_route_class[ci];
}
static inline int tiny_route_is_heap_kind(tiny_route_kind_t route) {
return route == TINY_ROUTE_HEAP ||
route == TINY_ROUTE_HOTHEAP_V2 ||
route == TINY_ROUTE_SMALL_HEAP_V3 ||
route == TINY_ROUTE_SMALL_HEAP_V4 ||
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor Phase v6-1: C6-only route stub (v1/pool fallback) Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation - 2MiB segment / 64KiB page allocation - O(1) ptr→page_meta lookup with segment masking - C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s) Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching) - TLS ownership fast-path skip page_meta for 90%+ of frees - Batch header writes during refill (32 allocs = 1 header write) - TLS batch refill (1/32 refill frequency) - C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅ Phase v6-4: Mixed hang fix (segment metadata lookup correction) - Root cause: metadata lookup was reading mmap region instead of TLS slot - Fix: use TLS slot descriptor with in_use validation - Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅ Phase v6-refactor: Code quality improvements (macro unification + inline + docs) - Add SMALL_V6_* prefix macros (header, pointer conversion, page index) - Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6) - Doxygen-style comments for all public functions - Result: 0 compiler warnings, maintained +1.2% performance Files: - core/box/smallobject_core_v6_box.h (new, type & API definitions) - core/box/smallobject_cold_iface_v6.h (new, cold iface API) - core/box/smallsegment_v6_box.h (new, segment type definitions) - core/smallobject_core_v6.c (new, C6 alloc/free implementation) - core/smallobject_cold_iface_v6.c (new, refill/retire logic) - core/smallsegment_v6.c (new, segment allocator) - docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document) - core/box/tiny_route_env_box.h (modified, v6 route added) - core/front/malloc_tiny_fast.h (modified, v6 case in route switch) - Makefile (modified, v6 objects added) - CURRENT_TASK.md (modified, v6 status added) Status: - C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅ - Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅ - Build: 0 warnings, fully documented ✅ 🤖 Generated with Claude Code Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
route == TINY_ROUTE_SMALL_HEAP_V5 ||
route == TINY_ROUTE_SMALL_HEAP_V6 ||
route == TINY_ROUTE_SMALL_HEAP_V7;
}
// C7 front が TinyHeap を使うかRoute snapshot 経由で判定)
static inline int tiny_c7_front_uses_heap(void) {
return tiny_route_is_heap_kind(tiny_route_for_class(7));
}
// C6 front が TinyHeap を使うかRoute snapshot 経由で判定)
static inline int tiny_c6_front_uses_heap(void) {
return tiny_route_is_heap_kind(tiny_route_for_class(6));
}