2025-12-08 21:30:21 +09:00
|
|
|
|
// tiny_route_env_box.h - Route snapshot for Tiny front (Heap vs Legacy)
|
|
|
|
|
|
// 役割:
|
|
|
|
|
|
// - 起動時に「各クラスが TinyHeap を使うか」をスナップショットし、ホットパスでは LUT 1 回に縮約する。
|
|
|
|
|
|
// - HAKMEM_TINY_HEAP_PROFILE / HAKMEM_TINY_HEAP_BOX / HAKMEM_TINY_HEAP_CLASSES の組合せをここで解決する。
|
|
|
|
|
|
#pragma once
|
|
|
|
|
|
|
|
|
|
|
|
#include <stdint.h>
|
|
|
|
|
|
#include <stdlib.h>
|
|
|
|
|
|
#include "../hakmem_tiny_config.h"
|
|
|
|
|
|
#include "tiny_heap_env_box.h"
|
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測
**目的**: free dispatcher(29%)の内訳を細分化して計測。
**実装内容**:
- FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0)
- カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls
- hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み
- 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ)
**計測結果**:
Mixed 16-1024B (1M iter, ws=400):
- total=8,081, route_calls=267,967, env_checks=9
- BENCH_FAST_FRONT により大半は早期リターン
- route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees)
- ENV check は初期化時の 9回のみ(snapshot 効果)
C6-heavy (257-768B, 1M iter, ws=400):
- total=500,099, route_calls=1,034, env_checks=9
- fg_classify_domain に到達する free が多い
- route_for_class 呼び出しは極小(snapshot 効果)
**結論**:
- ENV check は既に十分最適化されている(初期化時のみ)
- route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1)
- 次フェーズ(OPT-2)では別のアプローチを検討
**ドキュメント追加**:
- docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規)
- CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
|
|
|
|
#include "free_dispatch_stats_box.h" // Phase FREE-DISPATCHER-OPT-1
|
2025-12-08 21:30:21 +09:00
|
|
|
|
|
2025-12-09 21:50:15 +09:00
|
|
|
|
#include "smallobject_hotbox_v3_env_box.h"
|
2025-12-10 17:58:42 +09:00
|
|
|
|
#include "smallobject_hotbox_v4_env_box.h"
|
2025-12-11 03:25:37 +09:00
|
|
|
|
#include "smallobject_v5_env_box.h"
|
2025-12-09 21:50:15 +09:00
|
|
|
|
|
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor
Phase v6-1: C6-only route stub (v1/pool fallback)
Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation
- 2MiB segment / 64KiB page allocation
- O(1) ptr→page_meta lookup with segment masking
- C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s)
Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching)
- TLS ownership fast-path skip page_meta for 90%+ of frees
- Batch header writes during refill (32 allocs = 1 header write)
- TLS batch refill (1/32 refill frequency)
- C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅
Phase v6-4: Mixed hang fix (segment metadata lookup correction)
- Root cause: metadata lookup was reading mmap region instead of TLS slot
- Fix: use TLS slot descriptor with in_use validation
- Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅
Phase v6-refactor: Code quality improvements (macro unification + inline + docs)
- Add SMALL_V6_* prefix macros (header, pointer conversion, page index)
- Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6)
- Doxygen-style comments for all public functions
- Result: 0 compiler warnings, maintained +1.2% performance
Files:
- core/box/smallobject_core_v6_box.h (new, type & API definitions)
- core/box/smallobject_cold_iface_v6.h (new, cold iface API)
- core/box/smallsegment_v6_box.h (new, segment type definitions)
- core/smallobject_core_v6.c (new, C6 alloc/free implementation)
- core/smallobject_cold_iface_v6.c (new, refill/retire logic)
- core/smallsegment_v6.c (new, segment allocator)
- docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document)
- core/box/tiny_route_env_box.h (modified, v6 route added)
- core/front/malloc_tiny_fast.h (modified, v6 case in route switch)
- Makefile (modified, v6 objects added)
- CURRENT_TASK.md (modified, v6 status added)
Status:
- C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅
- Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅
- Build: 0 warnings, fully documented ✅
🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
|
|
|
|
// ENV sentinel values
|
|
|
|
|
|
#ifndef ENV_UNINIT
|
|
|
|
|
|
#define ENV_UNINIT (-1)
|
|
|
|
|
|
#define ENV_ENABLED (1)
|
|
|
|
|
|
#define ENV_DISABLED (0)
|
|
|
|
|
|
#endif
|
|
|
|
|
|
|
2025-12-08 21:30:21 +09:00
|
|
|
|
typedef enum {
|
|
|
|
|
|
TINY_ROUTE_LEGACY = 0,
|
2025-12-09 21:50:15 +09:00
|
|
|
|
TINY_ROUTE_HEAP = 1, // TinyHeap v1
|
|
|
|
|
|
TINY_ROUTE_HOTHEAP_V2 = 2, // TinyHotHeap v2
|
|
|
|
|
|
TINY_ROUTE_SMALL_HEAP_V3 = 3, // SmallObject HotHeap v3 (C7-first,研究箱)
|
2025-12-10 17:58:42 +09:00
|
|
|
|
TINY_ROUTE_SMALL_HEAP_V4 = 4, // SmallObject HotHeap v4 (stub, route未使用)
|
2025-12-11 03:25:37 +09:00
|
|
|
|
TINY_ROUTE_SMALL_HEAP_V5 = 5, // SmallObject HotHeap v5 (C6-only route stub, Phase v5-1)
|
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor
Phase v6-1: C6-only route stub (v1/pool fallback)
Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation
- 2MiB segment / 64KiB page allocation
- O(1) ptr→page_meta lookup with segment masking
- C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s)
Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching)
- TLS ownership fast-path skip page_meta for 90%+ of frees
- Batch header writes during refill (32 allocs = 1 header write)
- TLS batch refill (1/32 refill frequency)
- C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅
Phase v6-4: Mixed hang fix (segment metadata lookup correction)
- Root cause: metadata lookup was reading mmap region instead of TLS slot
- Fix: use TLS slot descriptor with in_use validation
- Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅
Phase v6-refactor: Code quality improvements (macro unification + inline + docs)
- Add SMALL_V6_* prefix macros (header, pointer conversion, page index)
- Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6)
- Doxygen-style comments for all public functions
- Result: 0 compiler warnings, maintained +1.2% performance
Files:
- core/box/smallobject_core_v6_box.h (new, type & API definitions)
- core/box/smallobject_cold_iface_v6.h (new, cold iface API)
- core/box/smallsegment_v6_box.h (new, segment type definitions)
- core/smallobject_core_v6.c (new, C6 alloc/free implementation)
- core/smallobject_cold_iface_v6.c (new, refill/retire logic)
- core/smallsegment_v6.c (new, segment allocator)
- docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document)
- core/box/tiny_route_env_box.h (modified, v6 route added)
- core/front/malloc_tiny_fast.h (modified, v6 case in route switch)
- Makefile (modified, v6 objects added)
- CURRENT_TASK.md (modified, v6 status added)
Status:
- C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅
- Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅
- Build: 0 warnings, fully documented ✅
🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
|
|
|
|
TINY_ROUTE_SMALL_HEAP_V6 = 6, // SmallObject Core v6 (C6-only route stub, Phase v6-1)
|
2025-12-12 03:12:28 +09:00
|
|
|
|
TINY_ROUTE_SMALL_HEAP_V7 = 7, // SmallObject HotHeap v7 (C6-only route stub, Phase v7-1)
|
2025-12-08 21:30:21 +09:00
|
|
|
|
} tiny_route_kind_t;
|
|
|
|
|
|
|
|
|
|
|
|
extern tiny_route_kind_t g_tiny_route_class[TINY_NUM_CLASSES];
|
|
|
|
|
|
extern int g_tiny_route_snapshot_done;
|
|
|
|
|
|
|
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor
Phase v6-1: C6-only route stub (v1/pool fallback)
Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation
- 2MiB segment / 64KiB page allocation
- O(1) ptr→page_meta lookup with segment masking
- C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s)
Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching)
- TLS ownership fast-path skip page_meta for 90%+ of frees
- Batch header writes during refill (32 allocs = 1 header write)
- TLS batch refill (1/32 refill frequency)
- C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅
Phase v6-4: Mixed hang fix (segment metadata lookup correction)
- Root cause: metadata lookup was reading mmap region instead of TLS slot
- Fix: use TLS slot descriptor with in_use validation
- Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅
Phase v6-refactor: Code quality improvements (macro unification + inline + docs)
- Add SMALL_V6_* prefix macros (header, pointer conversion, page index)
- Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6)
- Doxygen-style comments for all public functions
- Result: 0 compiler warnings, maintained +1.2% performance
Files:
- core/box/smallobject_core_v6_box.h (new, type & API definitions)
- core/box/smallobject_cold_iface_v6.h (new, cold iface API)
- core/box/smallsegment_v6_box.h (new, segment type definitions)
- core/smallobject_core_v6.c (new, C6 alloc/free implementation)
- core/smallobject_cold_iface_v6.c (new, refill/retire logic)
- core/smallsegment_v6.c (new, segment allocator)
- docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document)
- core/box/tiny_route_env_box.h (modified, v6 route added)
- core/front/malloc_tiny_fast.h (modified, v6 case in route switch)
- Makefile (modified, v6 objects added)
- CURRENT_TASK.md (modified, v6 status added)
Status:
- C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅
- Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅
- Build: 0 warnings, fully documented ✅
🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
|
|
|
|
// ============================================================================
|
|
|
|
|
|
// Phase v6-1: SmallObject Core v6 ENV gate (must be before tiny_route_snapshot_init)
|
|
|
|
|
|
// ============================================================================
|
|
|
|
|
|
|
|
|
|
|
|
// small_heap_v6_enabled() - グローバル v6 enable check
|
|
|
|
|
|
static inline int small_heap_v6_enabled(void) {
|
|
|
|
|
|
static int g_enabled = ENV_UNINIT;
|
|
|
|
|
|
if (__builtin_expect(g_enabled == ENV_UNINIT, 0)) {
|
|
|
|
|
|
const char* e = getenv("HAKMEM_SMALL_HEAP_V6_ENABLED");
|
|
|
|
|
|
g_enabled = (e && *e && *e != '0') ? ENV_ENABLED : ENV_DISABLED;
|
|
|
|
|
|
}
|
|
|
|
|
|
return (g_enabled == ENV_ENABLED);
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// small_heap_v6_class_mask() - v6 対象クラスのビットマスク
|
|
|
|
|
|
static inline uint32_t small_heap_v6_class_mask(void) {
|
|
|
|
|
|
static int g_mask = ENV_UNINIT;
|
|
|
|
|
|
if (__builtin_expect(g_mask == ENV_UNINIT, 0)) {
|
|
|
|
|
|
const char* e = getenv("HAKMEM_SMALL_HEAP_V6_CLASSES");
|
|
|
|
|
|
if (e && *e) {
|
|
|
|
|
|
g_mask = (int)strtoul(e, NULL, 0);
|
|
|
|
|
|
} else {
|
|
|
|
|
|
g_mask = 0x0; // default: OFF
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
return (uint32_t)g_mask;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// small_heap_v6_class_enabled() - 指定クラスが v6 有効か
|
|
|
|
|
|
static inline int small_heap_v6_class_enabled(uint32_t class_idx) {
|
|
|
|
|
|
if (class_idx >= 8) return 0;
|
|
|
|
|
|
if (!small_heap_v6_enabled()) return 0;
|
|
|
|
|
|
uint32_t mask = small_heap_v6_class_mask();
|
|
|
|
|
|
return (mask & (1u << class_idx)) ? 1 : 0;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2025-12-12 03:12:28 +09:00
|
|
|
|
// ============================================================================
|
|
|
|
|
|
// Phase v7-1: SmallObject HotHeap v7 ENV gate (must be before tiny_route_snapshot_init)
|
|
|
|
|
|
// ============================================================================
|
|
|
|
|
|
|
|
|
|
|
|
// small_heap_v7_enabled() - グローバル v7 enable check
|
|
|
|
|
|
static inline int small_heap_v7_enabled(void) {
|
|
|
|
|
|
static int g_enabled = ENV_UNINIT;
|
|
|
|
|
|
if (__builtin_expect(g_enabled == ENV_UNINIT, 0)) {
|
|
|
|
|
|
const char* e = getenv("HAKMEM_SMALL_HEAP_V7_ENABLED");
|
|
|
|
|
|
g_enabled = (e && *e && *e != '0') ? ENV_ENABLED : ENV_DISABLED;
|
|
|
|
|
|
}
|
|
|
|
|
|
return (g_enabled == ENV_ENABLED);
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// small_heap_v7_class_mask() - v7 対象クラスのビットマスク
|
|
|
|
|
|
static inline uint32_t small_heap_v7_class_mask(void) {
|
|
|
|
|
|
static int g_mask = ENV_UNINIT;
|
|
|
|
|
|
if (__builtin_expect(g_mask == ENV_UNINIT, 0)) {
|
|
|
|
|
|
const char* e = getenv("HAKMEM_SMALL_HEAP_V7_CLASSES");
|
|
|
|
|
|
if (e && *e) {
|
|
|
|
|
|
g_mask = (int)strtoul(e, NULL, 0);
|
|
|
|
|
|
} else {
|
|
|
|
|
|
g_mask = 0x0; // default: OFF
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
return (uint32_t)g_mask;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// small_heap_v7_class_enabled() - 指定クラスが v7 有効か
|
|
|
|
|
|
static inline int small_heap_v7_class_enabled(uint32_t class_idx) {
|
|
|
|
|
|
if (class_idx >= 8) return 0;
|
|
|
|
|
|
if (!small_heap_v7_enabled()) return 0;
|
|
|
|
|
|
uint32_t mask = small_heap_v7_class_mask();
|
|
|
|
|
|
return (mask & (1u << class_idx)) ? 1 : 0;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2025-12-08 21:30:21 +09:00
|
|
|
|
static inline void tiny_route_snapshot_init(void) {
|
|
|
|
|
|
for (int i = 0; i < TINY_NUM_CLASSES; i++) {
|
2025-12-12 03:12:28 +09:00
|
|
|
|
// Phase v7-1: C6-only v7 route stub (highest priority)
|
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測
**目的**: free dispatcher(29%)の内訳を細分化して計測。
**実装内容**:
- FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0)
- カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls
- hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み
- 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ)
**計測結果**:
Mixed 16-1024B (1M iter, ws=400):
- total=8,081, route_calls=267,967, env_checks=9
- BENCH_FAST_FRONT により大半は早期リターン
- route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees)
- ENV check は初期化時の 9回のみ(snapshot 効果)
C6-heavy (257-768B, 1M iter, ws=400):
- total=500,099, route_calls=1,034, env_checks=9
- fg_classify_domain に到達する free が多い
- route_for_class 呼び出しは極小(snapshot 効果)
**結論**:
- ENV check は既に十分最適化されている(初期化時のみ)
- route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1)
- 次フェーズ(OPT-2)では別のアプローチを検討
**ドキュメント追加**:
- docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規)
- CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
|
|
|
|
FREE_DISPATCH_STAT_INC(env_checks); // ENV check counter
|
2025-12-12 03:12:28 +09:00
|
|
|
|
if (small_heap_v7_class_enabled((uint32_t)i)) {
|
|
|
|
|
|
g_tiny_route_class[i] = TINY_ROUTE_SMALL_HEAP_V7;
|
|
|
|
|
|
FREE_DISPATCH_STAT_INC(route_core_v7);
|
|
|
|
|
|
} else if (small_heap_v6_class_enabled((uint32_t)i)) {
|
|
|
|
|
|
// Phase v6-1: C6-only v6 route stub
|
|
|
|
|
|
FREE_DISPATCH_STAT_INC(env_checks);
|
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor
Phase v6-1: C6-only route stub (v1/pool fallback)
Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation
- 2MiB segment / 64KiB page allocation
- O(1) ptr→page_meta lookup with segment masking
- C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s)
Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching)
- TLS ownership fast-path skip page_meta for 90%+ of frees
- Batch header writes during refill (32 allocs = 1 header write)
- TLS batch refill (1/32 refill frequency)
- C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅
Phase v6-4: Mixed hang fix (segment metadata lookup correction)
- Root cause: metadata lookup was reading mmap region instead of TLS slot
- Fix: use TLS slot descriptor with in_use validation
- Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅
Phase v6-refactor: Code quality improvements (macro unification + inline + docs)
- Add SMALL_V6_* prefix macros (header, pointer conversion, page index)
- Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6)
- Doxygen-style comments for all public functions
- Result: 0 compiler warnings, maintained +1.2% performance
Files:
- core/box/smallobject_core_v6_box.h (new, type & API definitions)
- core/box/smallobject_cold_iface_v6.h (new, cold iface API)
- core/box/smallsegment_v6_box.h (new, segment type definitions)
- core/smallobject_core_v6.c (new, C6 alloc/free implementation)
- core/smallobject_cold_iface_v6.c (new, refill/retire logic)
- core/smallsegment_v6.c (new, segment allocator)
- docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document)
- core/box/tiny_route_env_box.h (modified, v6 route added)
- core/front/malloc_tiny_fast.h (modified, v6 case in route switch)
- Makefile (modified, v6 objects added)
- CURRENT_TASK.md (modified, v6 status added)
Status:
- C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅
- Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅
- Build: 0 warnings, fully documented ✅
🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
|
|
|
|
g_tiny_route_class[i] = TINY_ROUTE_SMALL_HEAP_V6;
|
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測
**目的**: free dispatcher(29%)の内訳を細分化して計測。
**実装内容**:
- FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0)
- カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls
- hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み
- 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ)
**計測結果**:
Mixed 16-1024B (1M iter, ws=400):
- total=8,081, route_calls=267,967, env_checks=9
- BENCH_FAST_FRONT により大半は早期リターン
- route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees)
- ENV check は初期化時の 9回のみ(snapshot 効果)
C6-heavy (257-768B, 1M iter, ws=400):
- total=500,099, route_calls=1,034, env_checks=9
- fg_classify_domain に到達する free が多い
- route_for_class 呼び出しは極小(snapshot 効果)
**結論**:
- ENV check は既に十分最適化されている(初期化時のみ)
- route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1)
- 次フェーズ(OPT-2)では別のアプローチを検討
**ドキュメント追加**:
- docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規)
- CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
|
|
|
|
FREE_DISPATCH_STAT_INC(route_core_v6);
|
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor
Phase v6-1: C6-only route stub (v1/pool fallback)
Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation
- 2MiB segment / 64KiB page allocation
- O(1) ptr→page_meta lookup with segment masking
- C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s)
Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching)
- TLS ownership fast-path skip page_meta for 90%+ of frees
- Batch header writes during refill (32 allocs = 1 header write)
- TLS batch refill (1/32 refill frequency)
- C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅
Phase v6-4: Mixed hang fix (segment metadata lookup correction)
- Root cause: metadata lookup was reading mmap region instead of TLS slot
- Fix: use TLS slot descriptor with in_use validation
- Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅
Phase v6-refactor: Code quality improvements (macro unification + inline + docs)
- Add SMALL_V6_* prefix macros (header, pointer conversion, page index)
- Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6)
- Doxygen-style comments for all public functions
- Result: 0 compiler warnings, maintained +1.2% performance
Files:
- core/box/smallobject_core_v6_box.h (new, type & API definitions)
- core/box/smallobject_cold_iface_v6.h (new, cold iface API)
- core/box/smallsegment_v6_box.h (new, segment type definitions)
- core/smallobject_core_v6.c (new, C6 alloc/free implementation)
- core/smallobject_cold_iface_v6.c (new, refill/retire logic)
- core/smallsegment_v6.c (new, segment allocator)
- docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document)
- core/box/tiny_route_env_box.h (modified, v6 route added)
- core/front/malloc_tiny_fast.h (modified, v6 case in route switch)
- Makefile (modified, v6 objects added)
- CURRENT_TASK.md (modified, v6 status added)
Status:
- C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅
- Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅
- Build: 0 warnings, fully documented ✅
🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
|
|
|
|
} else if (i == 6 && small_heap_v5_class_enabled(6)) {
|
|
|
|
|
|
// Phase v5-1: C6-only v5 route stub (before v4 check)
|
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測
**目的**: free dispatcher(29%)の内訳を細分化して計測。
**実装内容**:
- FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0)
- カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls
- hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み
- 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ)
**計測結果**:
Mixed 16-1024B (1M iter, ws=400):
- total=8,081, route_calls=267,967, env_checks=9
- BENCH_FAST_FRONT により大半は早期リターン
- route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees)
- ENV check は初期化時の 9回のみ(snapshot 効果)
C6-heavy (257-768B, 1M iter, ws=400):
- total=500,099, route_calls=1,034, env_checks=9
- fg_classify_domain に到達する free が多い
- route_for_class 呼び出しは極小(snapshot 効果)
**結論**:
- ENV check は既に十分最適化されている(初期化時のみ)
- route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1)
- 次フェーズ(OPT-2)では別のアプローチを検討
**ドキュメント追加**:
- docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規)
- CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
|
|
|
|
FREE_DISPATCH_STAT_INC(env_checks);
|
2025-12-11 03:25:37 +09:00
|
|
|
|
g_tiny_route_class[i] = TINY_ROUTE_SMALL_HEAP_V5;
|
|
|
|
|
|
} else if (small_heap_v4_class_enabled((uint8_t)i)) {
|
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測
**目的**: free dispatcher(29%)の内訳を細分化して計測。
**実装内容**:
- FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0)
- カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls
- hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み
- 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ)
**計測結果**:
Mixed 16-1024B (1M iter, ws=400):
- total=8,081, route_calls=267,967, env_checks=9
- BENCH_FAST_FRONT により大半は早期リターン
- route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees)
- ENV check は初期化時の 9回のみ(snapshot 効果)
C6-heavy (257-768B, 1M iter, ws=400):
- total=500,099, route_calls=1,034, env_checks=9
- fg_classify_domain に到達する free が多い
- route_for_class 呼び出しは極小(snapshot 効果)
**結論**:
- ENV check は既に十分最適化されている(初期化時のみ)
- route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1)
- 次フェーズ(OPT-2)では別のアプローチを検討
**ドキュメント追加**:
- docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規)
- CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
|
|
|
|
FREE_DISPATCH_STAT_INC(env_checks);
|
2025-12-10 17:58:42 +09:00
|
|
|
|
g_tiny_route_class[i] = TINY_ROUTE_SMALL_HEAP_V4;
|
|
|
|
|
|
} else if (small_heap_v3_class_enabled((uint8_t)i)) {
|
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測
**目的**: free dispatcher(29%)の内訳を細分化して計測。
**実装内容**:
- FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0)
- カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls
- hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み
- 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ)
**計測結果**:
Mixed 16-1024B (1M iter, ws=400):
- total=8,081, route_calls=267,967, env_checks=9
- BENCH_FAST_FRONT により大半は早期リターン
- route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees)
- ENV check は初期化時の 9回のみ(snapshot 効果)
C6-heavy (257-768B, 1M iter, ws=400):
- total=500,099, route_calls=1,034, env_checks=9
- fg_classify_domain に到達する free が多い
- route_for_class 呼び出しは極小(snapshot 効果)
**結論**:
- ENV check は既に十分最適化されている(初期化時のみ)
- route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1)
- 次フェーズ(OPT-2)では別のアプローチを検討
**ドキュメント追加**:
- docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規)
- CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
|
|
|
|
FREE_DISPATCH_STAT_INC(env_checks);
|
2025-12-09 21:50:15 +09:00
|
|
|
|
g_tiny_route_class[i] = TINY_ROUTE_SMALL_HEAP_V3;
|
|
|
|
|
|
} else if (tiny_hotheap_v2_class_enabled((uint8_t)i)) {
|
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測
**目的**: free dispatcher(29%)の内訳を細分化して計測。
**実装内容**:
- FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0)
- カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls
- hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み
- 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ)
**計測結果**:
Mixed 16-1024B (1M iter, ws=400):
- total=8,081, route_calls=267,967, env_checks=9
- BENCH_FAST_FRONT により大半は早期リターン
- route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees)
- ENV check は初期化時の 9回のみ(snapshot 効果)
C6-heavy (257-768B, 1M iter, ws=400):
- total=500,099, route_calls=1,034, env_checks=9
- fg_classify_domain に到達する free が多い
- route_for_class 呼び出しは極小(snapshot 効果)
**結論**:
- ENV check は既に十分最適化されている(初期化時のみ)
- route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1)
- 次フェーズ(OPT-2)では別のアプローチを検討
**ドキュメント追加**:
- docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規)
- CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
|
|
|
|
FREE_DISPATCH_STAT_INC(env_checks);
|
2025-12-08 21:30:21 +09:00
|
|
|
|
g_tiny_route_class[i] = TINY_ROUTE_HOTHEAP_V2;
|
|
|
|
|
|
} else if (tiny_heap_box_enabled() && tiny_heap_class_route_enabled(i)) {
|
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測
**目的**: free dispatcher(29%)の内訳を細分化して計測。
**実装内容**:
- FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0)
- カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls
- hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み
- 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ)
**計測結果**:
Mixed 16-1024B (1M iter, ws=400):
- total=8,081, route_calls=267,967, env_checks=9
- BENCH_FAST_FRONT により大半は早期リターン
- route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees)
- ENV check は初期化時の 9回のみ(snapshot 効果)
C6-heavy (257-768B, 1M iter, ws=400):
- total=500,099, route_calls=1,034, env_checks=9
- fg_classify_domain に到達する free が多い
- route_for_class 呼び出しは極小(snapshot 効果)
**結論**:
- ENV check は既に十分最適化されている(初期化時のみ)
- route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1)
- 次フェーズ(OPT-2)では別のアプローチを検討
**ドキュメント追加**:
- docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規)
- CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
|
|
|
|
FREE_DISPATCH_STAT_INC(env_checks);
|
|
|
|
|
|
FREE_DISPATCH_STAT_INC(env_checks);
|
2025-12-08 21:30:21 +09:00
|
|
|
|
g_tiny_route_class[i] = TINY_ROUTE_HEAP;
|
|
|
|
|
|
} else {
|
|
|
|
|
|
g_tiny_route_class[i] = TINY_ROUTE_LEGACY;
|
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測
**目的**: free dispatcher(29%)の内訳を細分化して計測。
**実装内容**:
- FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0)
- カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls
- hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み
- 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ)
**計測結果**:
Mixed 16-1024B (1M iter, ws=400):
- total=8,081, route_calls=267,967, env_checks=9
- BENCH_FAST_FRONT により大半は早期リターン
- route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees)
- ENV check は初期化時の 9回のみ(snapshot 効果)
C6-heavy (257-768B, 1M iter, ws=400):
- total=500,099, route_calls=1,034, env_checks=9
- fg_classify_domain に到達する free が多い
- route_for_class 呼び出しは極小(snapshot 効果)
**結論**:
- ENV check は既に十分最適化されている(初期化時のみ)
- route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1)
- 次フェーズ(OPT-2)では別のアプローチを検討
**ドキュメント追加**:
- docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規)
- CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
|
|
|
|
FREE_DISPATCH_STAT_INC(route_tiny_legacy);
|
2025-12-08 21:30:21 +09:00
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
g_tiny_route_snapshot_done = 1;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
static inline tiny_route_kind_t tiny_route_for_class(uint8_t ci) {
|
Phase FREE-DISPATCHER-OPT-1: free dispatcher 統計計測
**目的**: free dispatcher(29%)の内訳を細分化して計測。
**実装内容**:
- FreeDispatchStats 構造体追加(ENV: HAKMEM_FREE_DISPATCH_STATS, default 0)
- カウンタ: total_calls / domain (tiny/mid/large) / route (ultra/legacy/pool/v6) / env_checks / route_for_class_calls
- hak_free_at / tiny_route_for_class / tiny_route_snapshot_init にカウンタ埋め込み
- 挙動変更なし(計測のみ、ENV OFF 時は overhead ゼロ)
**計測結果**:
Mixed 16-1024B (1M iter, ws=400):
- total=8,081, route_calls=267,967, env_checks=9
- BENCH_FAST_FRONT により大半は早期リターン
- route_for_class は主に alloc 側で呼ばれる(267k calls vs 8k frees)
- ENV check は初期化時の 9回のみ(snapshot 効果)
C6-heavy (257-768B, 1M iter, ws=400):
- total=500,099, route_calls=1,034, env_checks=9
- fg_classify_domain に到達する free が多い
- route_for_class 呼び出しは極小(snapshot 効果)
**結論**:
- ENV check は既に十分最適化されている(初期化時のみ)
- route_for_class は alloc 側での呼び出しが主で、free 側は snapshot で O(1)
- 次フェーズ(OPT-2)では別のアプローチを検討
**ドキュメント追加**:
- docs/analysis/FREE_DISPATCHER_ANALYSIS.md(新規)
- CURRENT_TASK.md に Phase FREE-DISPATCHER-OPT-1 セクション追加
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 21:21:40 +09:00
|
|
|
|
FREE_DISPATCH_STAT_INC(route_for_class_calls); // Phase FREE-DISPATCHER-OPT-1
|
2025-12-08 21:30:21 +09:00
|
|
|
|
if (__builtin_expect(!g_tiny_route_snapshot_done, 0)) {
|
|
|
|
|
|
tiny_route_snapshot_init();
|
|
|
|
|
|
}
|
|
|
|
|
|
if (__builtin_expect(ci >= TINY_NUM_CLASSES, 0)) {
|
|
|
|
|
|
return TINY_ROUTE_LEGACY;
|
|
|
|
|
|
}
|
|
|
|
|
|
return g_tiny_route_class[ci];
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
static inline int tiny_route_is_heap_kind(tiny_route_kind_t route) {
|
2025-12-10 17:58:42 +09:00
|
|
|
|
return route == TINY_ROUTE_HEAP ||
|
|
|
|
|
|
route == TINY_ROUTE_HOTHEAP_V2 ||
|
|
|
|
|
|
route == TINY_ROUTE_SMALL_HEAP_V3 ||
|
2025-12-11 03:25:37 +09:00
|
|
|
|
route == TINY_ROUTE_SMALL_HEAP_V4 ||
|
Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor
Phase v6-1: C6-only route stub (v1/pool fallback)
Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation
- 2MiB segment / 64KiB page allocation
- O(1) ptr→page_meta lookup with segment masking
- C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s)
Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching)
- TLS ownership fast-path skip page_meta for 90%+ of frees
- Batch header writes during refill (32 allocs = 1 header write)
- TLS batch refill (1/32 refill frequency)
- C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅
Phase v6-4: Mixed hang fix (segment metadata lookup correction)
- Root cause: metadata lookup was reading mmap region instead of TLS slot
- Fix: use TLS slot descriptor with in_use validation
- Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅
Phase v6-refactor: Code quality improvements (macro unification + inline + docs)
- Add SMALL_V6_* prefix macros (header, pointer conversion, page index)
- Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6)
- Doxygen-style comments for all public functions
- Result: 0 compiler warnings, maintained +1.2% performance
Files:
- core/box/smallobject_core_v6_box.h (new, type & API definitions)
- core/box/smallobject_cold_iface_v6.h (new, cold iface API)
- core/box/smallsegment_v6_box.h (new, segment type definitions)
- core/smallobject_core_v6.c (new, C6 alloc/free implementation)
- core/smallobject_cold_iface_v6.c (new, refill/retire logic)
- core/smallsegment_v6.c (new, segment allocator)
- docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document)
- core/box/tiny_route_env_box.h (modified, v6 route added)
- core/front/malloc_tiny_fast.h (modified, v6 case in route switch)
- Makefile (modified, v6 objects added)
- CURRENT_TASK.md (modified, v6 status added)
Status:
- C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅
- Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅
- Build: 0 warnings, fully documented ✅
🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
|
|
|
|
route == TINY_ROUTE_SMALL_HEAP_V5 ||
|
2025-12-12 03:12:28 +09:00
|
|
|
|
route == TINY_ROUTE_SMALL_HEAP_V6 ||
|
|
|
|
|
|
route == TINY_ROUTE_SMALL_HEAP_V7;
|
2025-12-08 21:30:21 +09:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// C7 front が TinyHeap を使うか(Route snapshot 経由で判定)
|
|
|
|
|
|
static inline int tiny_c7_front_uses_heap(void) {
|
|
|
|
|
|
return tiny_route_is_heap_kind(tiny_route_for_class(7));
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// C6 front が TinyHeap を使うか(Route snapshot 経由で判定)
|
|
|
|
|
|
static inline int tiny_c6_front_uses_heap(void) {
|
|
|
|
|
|
return tiny_route_is_heap_kind(tiny_route_for_class(6));
|
|
|
|
|
|
}
|