hakmem/core/box/free_path_stats_box.c
Moe Charm (CI) 1b196b3ac0 Phase FREE-LEGACY-OPT-4-2/4-3: C6 ULTRA-free TLS cache + segment learning
Phase 4-2:
- Add TinyC6UltraFreeTLS structure with 128-slot TLS freelist
- Implement tiny_c6_ultra_free_fast/slow for C6 free hot path
- Add c6_ultra_free_fast counter to FreePathStats
- ENV gate: HAKMEM_TINY_C6_ULTRA_FREE_ENABLED (default: OFF)
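The Phase 4-2 items above might look roughly like the following. This is a minimal sketch, not the code from the commit. The struct layout, the names `C6FreeCacheSketch`, `c6_ultra_free_enabled_sketch` and `c6_free_fast_sketch`, and the rule that only the value `1` turns the gate on are all assumptions made for illustration. Only the 128-slot capacity and the environment variable name come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define C6_FREE_CACHE_CAP 128  /* slot count stated in the commit message */

/* Hypothetical layout; the real TinyC6UltraFreeTLS may differ. */
typedef struct {
    void  *slots[C6_FREE_CACHE_CAP];
    size_t count;  /* number of cached blocks */
} C6FreeCacheSketch;

/* One cache per thread, so the fast path needs no locking. */
static _Thread_local C6FreeCacheSketch t_c6_cache;

/* ENV gate, default OFF. Parsing "1" as "on" is an assumption. */
bool c6_ultra_free_enabled_sketch(void) {
    static _Thread_local int cached = -1;  /* -1 = not read yet */
    if (cached < 0) {
        const char *v = getenv("HAKMEM_TINY_C6_ULTRA_FREE_ENABLED");
        cached = (v != NULL && strcmp(v, "1") == 0) ? 1 : 0;
    }
    return cached == 1;
}

/* Fast path: push the block into the TLS cache.
 * Returns false when the cache is full, in which case the caller
 * would fall through to a slow or legacy free path. */
bool c6_free_fast_sketch(void *p) {
    assert(p != NULL);
    C6FreeCacheSketch *c = &t_c6_cache;
    if (c->count >= C6_FREE_CACHE_CAP) {
        return false;
    }
    c->slots[c->count++] = p;
    return true;
}
```

In this sketch a caller would check `c6_ultra_free_enabled_sketch()` first, try `c6_free_fast_sketch(p)`, and use the existing free path whenever either returns false.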

Phase 4-3:
- Add segment learning on first C6 free via ss_fast_lookup()
- Learn seg_base/seg_end from SuperSlab for range check
- Increase cache capacity from 32 to 128 blocks
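The segment learning step can be sketched as follows. The real `ss_fast_lookup()` signature is not shown in this file, so the sketch passes in a hypothetical lookup callback instead, and uses a static buffer as a toy "segment". All names here (`C6SegRangeSketch`, `seg_lookup_fn`, `demo_lookup`, `c6_in_learned_segment`) are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Learned address range; seg_end == 0 means "not learned yet". */
typedef struct {
    uintptr_t seg_base;  /* inclusive */
    uintptr_t seg_end;   /* exclusive */
} C6SegRangeSketch;

/* Hypothetical stand-in for ss_fast_lookup(): reports the bounds of
 * the segment that owns p, or returns false if there is none. */
typedef bool (*seg_lookup_fn)(const void *p, uintptr_t *base, uintptr_t *end);

/* Toy segment used only to exercise the sketch. */
static unsigned char demo_segment[4096];

bool demo_lookup(const void *p, uintptr_t *base, uintptr_t *end) {
    uintptr_t a  = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)demo_segment;
    uintptr_t hi = lo + sizeof demo_segment;
    if (a < lo || a >= hi) {
        return false;
    }
    *base = lo;
    *end  = hi;
    return true;
}

/* On the first free, pay for one lookup and remember the bounds.
 * Every later free only needs two comparisons. */
bool c6_in_learned_segment(C6SegRangeSketch *r, const void *p, seg_lookup_fn lookup) {
    assert(r != NULL && lookup != NULL);
    uintptr_t addr = (uintptr_t)p;
    if (r->seg_end == 0) {
        uintptr_t base, end;
        if (!lookup(p, &base, &end)) {
            return false;  /* nothing learned; caller uses the normal path */
        }
        r->seg_base = base;
        r->seg_end  = end;
    }
    return addr >= r->seg_base && addr < r->seg_end;
}
```

A pointer outside the learned range makes the function return false, which in this sketch stands for "skip the TLS cache and take the regular free path".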

Results:
- Segment learning works: fast path captures blocks in segment
- However, without alloc-side integration nothing drains the cache, so it fills up and later frees overflow to the legacy path
- Net effect: +1-3% (within noise range)
- Drain strategy also tested: no benefit (equal overhead)
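A toy model shows why a free-only cache saturates. It is not a benchmark and does not reproduce the numbers above. The function name and the strict alloc/free ping-pong workload are assumptions chosen to keep the arithmetic obvious.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counts how many of n_frees hit a cap-slot cache.
 * Each loop iteration models one alloc followed by one free.
 * alloc_pops == false: the allocator never takes blocks back out
 *                      (free-only cache, as in this commit).
 * alloc_pops == true:  the allocator reuses a cached block first
 *                      (alloc-side integration). */
size_t simulate_fast_frees(size_t n_frees, size_t cap, bool alloc_pops) {
    assert(cap > 0);
    size_t cached = 0;
    size_t fast   = 0;
    for (size_t i = 0; i < n_frees; i++) {
        if (alloc_pops && cached > 0) {
            cached--;          /* alloc reuses a cached block */
        }
        if (cached < cap) {
            cached++;          /* free fast path: block is cached */
            fast++;
        }
        /* otherwise the free overflows to the legacy path */
    }
    return fast;
}
```

Under this model, `simulate_fast_frees(1000, 128, false)` returns 128, because only the first 128 frees fit, while `simulate_fast_frees(1000, 128, true)` returns 1000, because each alloc makes room for the next free.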

Conclusion:
- Free-only TLS cache is limited without alloc-side integration
- Core v6 already has alloc/free integrated TLS (but is 12% slower)
- Keep as research box (ENV default OFF)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-11 18:34:27 +09:00


#include "free_path_stats_box.h"

#include <stdio.h>

// Global free-path counters, zero-initialized.
FreePathStats g_free_path_stats = {0};

// Helper function for pool_api.inc.h (to avoid inline include issues)
void free_path_stat_inc_pool_v1_fast(void) {
    // Hint to the compiler that stats collection is usually disabled.
    if (__builtin_expect(free_path_stats_enabled(), 0)) {
        g_free_path_stats.pool_v1_fast++;
    }
}

// Destructor: dumps the counters to stderr at process exit when stats are enabled.
__attribute__((destructor))
static void free_path_stats_dump(void) {
    if (!free_path_stats_enabled()) {
        return;
    }

    // One summary line: total calls followed by the count for each free path.
    fprintf(stderr, "[FREE_PATH_STATS] total=%lu c7_ultra=%lu c6_ultra_free=%lu small_v3=%lu v6=%lu tiny_v1=%lu pool_v1=%lu remote=%lu super_lookup=%lu legacy_fb=%lu\n",
            g_free_path_stats.total_calls,
            g_free_path_stats.c7_ultra_fast,
            g_free_path_stats.c6_ultra_free_fast,  // Phase 4-2
            g_free_path_stats.smallheap_v3_fast,
            g_free_path_stats.smallheap_v6_fast,
            g_free_path_stats.tiny_heap_v1_fast,
            g_free_path_stats.pool_v1_fast,
            g_free_path_stats.remote_free,
            g_free_path_stats.super_lookup_called,
            g_free_path_stats.legacy_fallback);

    // Phase 4-1: Legacy per-class breakdown
    fprintf(stderr, "[FREE_PATH_STATS_LEGACY_BY_CLASS] c0=%lu c1=%lu c2=%lu c3=%lu c4=%lu c5=%lu c6=%lu c7=%lu\n",
            g_free_path_stats.legacy_by_class[0],
            g_free_path_stats.legacy_by_class[1],
            g_free_path_stats.legacy_by_class[2],
            g_free_path_stats.legacy_by_class[3],
            g_free_path_stats.legacy_by_class[4],
            g_free_path_stats.legacy_by_class[5],
            g_free_path_stats.legacy_by_class[6],
            g_free_path_stats.legacy_by_class[7]);

    fflush(stderr);
}