P2: TLS SLL Redesign - class_map default, tls_cached tracking, conditional header restore

This commit completes the P2 phase of the Tiny Pool TLS SLL redesign, fixing the
Header/Next pointer conflict that was causing a ~30% crash rate.

Changes:
- P2.1: Make class_map lookup the default (ENV: HAKMEM_TINY_NO_CLASS_MAP=1 for legacy)
- P2.2: Add meta->tls_cached field to track blocks cached in TLS SLL
- P2.3: Make Header restoration conditional in tiny_next_store() (default: skip)
- P2.4: Add invariant verification functions (active + tls_cached ≈ used)
- P0.4: Document new ENV variables in ENV_VARS.md
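
The P2.4 invariant can be sketched as follows. This is an illustrative model, not the committed code: only the field names `active` and `tls_cached` come from this commit; the `used` counter, the struct layout, and the slack parameter are assumptions. The idea is that blocks handed to the user (`active`) plus blocks parked in a thread's TLS SLL (`tls_cached`) should roughly equal the blocks carved out of the slab (`used`); a small slack is tolerated because the counters are updated with relaxed atomics and may be read mid-transition.

```c
#include <stdatomic.h>

/* Hypothetical slab metadata; active and tls_cached follow the commit,
 * used is an assumed per-slab allocation counter. */
typedef struct {
    atomic_uint active;     /* blocks currently held by the user */
    atomic_uint tls_cached; /* blocks parked in a thread's TLS SLL */
    atomic_uint used;       /* blocks carved out of the slab so far */
} TinySlabMeta;

/* P2.4-style check: active + tls_cached ≈ used, within `slack`. */
int tiny_invariant_holds(const TinySlabMeta* m, unsigned slack) {
    unsigned a = atomic_load_explicit(&m->active, memory_order_relaxed);
    unsigned c = atomic_load_explicit(&m->tls_cached, memory_order_relaxed);
    unsigned u = atomic_load_explicit(&m->used, memory_order_relaxed);
    unsigned sum = a + c;
    unsigned diff = (sum > u) ? (sum - u) : (u - sum);
    return diff <= slack;
}
```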

New ENV variables:
- HAKMEM_TINY_ACTIVE_TRACK=1: Enable active/tls_cached tracking (~1% overhead)
- HAKMEM_TINY_NO_CLASS_MAP=1: Disable class_map (legacy mode)
- HAKMEM_TINY_RESTORE_HEADER=1: Force header restoration (legacy mode)
- HAKMEM_TINY_INVARIANT_CHECK=1: Enable invariant verification (debug)
- HAKMEM_TINY_INVARIANT_DUMP=1: Enable periodic state dumps (debug)
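
All of these gates use the same pattern, visible in the diff: the first call on each thread reads the variable with `getenv()` and caches the result in a `__thread` flag, so the hot path pays only one well-predicted branch. A minimal self-contained sketch (the variable name `HAKMEM_TINY_NO_CLASS_MAP` is real; the wrapper function is illustrative):

```c
#include <stdlib.h>

/* Lazily resolve an ENV gate once per thread and cache the answer.
 * Mirrors the HAKMEM_TINY_NO_CLASS_MAP handling in the diff:
 * default ON, set the variable to a non-'0' value to disable. */
int use_class_map(void) {
    static __thread int g_use_class_map = -1; /* -1 = not resolved yet */
    if (__builtin_expect(g_use_class_map == -1, 0)) {
        const char* e = getenv("HAKMEM_TINY_NO_CLASS_MAP");
        /* Default is ON; HAKMEM_TINY_NO_CLASS_MAP=1 disables it. */
        g_use_class_map = (e && *e && *e != '0') ? 0 : 1;
    }
    return g_use_class_map;
}
```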

Benchmark results (bench_tiny_hot_hakmem 64B):
- Default (class_map ON): 84.49 M ops/sec
- ACTIVE_TRACK=1: 83.62 M ops/sec (-1%)
- NO_CLASS_MAP=1 (legacy): 85.06 M ops/sec
- Multi-threaded (MT) performance: +21-28% vs the system allocator

No crashes observed. All tests passed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Author: Moe Charm (CI)
Date:   2025-11-28 14:11:37 +09:00
parent 6b86c60a20
commit a6e681aae7
8 changed files with 154 additions and 17 deletions


@@ -107,17 +107,18 @@ static inline int hak_tiny_free_fast_v2(void* ptr) {
     }
 #endif
-    // P1.2: Use class_map instead of Header to avoid Header/Next contention
-    // ENV: HAKMEM_TINY_USE_CLASS_MAP=1 to enable (default: 0 for compatibility)
+    // P2.1: Use class_map instead of Header to avoid Header/Next contention
+    // ENV: HAKMEM_TINY_NO_CLASS_MAP=1 to disable (default: ON - class_map is preferred)
     int class_idx = -1;
     {
         static __thread int g_use_class_map = -1;
         if (__builtin_expect(g_use_class_map == -1, 0)) {
-            const char* e = getenv("HAKMEM_TINY_USE_CLASS_MAP");
-            g_use_class_map = (e && *e && *e != '0') ? 1 : 0;
+            const char* e = getenv("HAKMEM_TINY_NO_CLASS_MAP");
+            // P2.1: Default is ON (use class_map), set HAKMEM_TINY_NO_CLASS_MAP=1 to disable
+            g_use_class_map = (e && *e && *e != '0') ? 0 : 1;
         }
-        if (__builtin_expect(g_use_class_map, 0)) {
+        if (__builtin_expect(g_use_class_map, 1)) {
             // P1.2: class_map path - avoid Header read
             SuperSlab* ss = ss_fast_lookup((uint8_t*)ptr - 1);
             if (ss && ss->magic == SUPERSLAB_MAGIC) {
@@ -144,7 +145,7 @@ static inline int hak_tiny_free_fast_v2(void* ptr) {
 #endif
         }
     } else {
-        // Default: Header read (existing behavior)
+        // P2.1: Fallback to Header read (disabled class_map mode)
         class_idx = tiny_region_id_read_header(ptr);
 #if HAKMEM_DEBUG_VERBOSE
         if (atomic_load(&debug_calls) <= 5) {
@@ -329,8 +330,9 @@ static inline int hak_tiny_free_fast_v2(void* ptr) {
         return 0;
     }
-    // P1.3: Decrement meta->active when block is freed (user gives it back)
+    // P1.3/P2.2: Track active/tls_cached when block is freed (user gives it back)
     // ENV gate: HAKMEM_TINY_ACTIVE_TRACK=1 to enable (default: 0 for performance)
+    // Flow: User → TLS SLL means active--, tls_cached++
     {
         static __thread int g_active_track = -1;
         if (__builtin_expect(g_active_track == -1, 0)) {
@@ -345,6 +347,7 @@ static inline int hak_tiny_free_fast_v2(void* ptr) {
             if (slab_idx >= 0 && slab_idx < ss_slabs_capacity(ss)) {
                 TinySlabMeta* meta = &ss->slabs[slab_idx];
                 atomic_fetch_sub_explicit(&meta->active, 1, memory_order_relaxed);
+                atomic_fetch_add_explicit(&meta->tls_cached, 1, memory_order_relaxed); // P2.2
             }
         }
     }
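
The counter movement in the free path above, and its presumable inverse on the allocation path, can be sketched as a standalone model. Only `active`, `tls_cached`, and the relaxed-atomic updates come from this commit; the struct layout and function names are illustrative, and the real code additionally gates this behind `HAKMEM_TINY_ACTIVE_TRACK=1`.

```c
#include <stdatomic.h>

/* Minimal model of the P2.2 accounting; the struct is illustrative. */
typedef struct {
    atomic_uint active;     /* blocks held by the user */
    atomic_uint tls_cached; /* blocks sitting in the TLS SLL */
} TinySlabMeta;

/* Free path: User -> TLS SLL, so active--, tls_cached++. */
void on_free_to_tls(TinySlabMeta* m) {
    atomic_fetch_sub_explicit(&m->active, 1, memory_order_relaxed);
    atomic_fetch_add_explicit(&m->tls_cached, 1, memory_order_relaxed);
}

/* Alloc path (inverse): TLS SLL -> User, so tls_cached--, active++. */
void on_alloc_from_tls(TinySlabMeta* m) {
    atomic_fetch_sub_explicit(&m->tls_cached, 1, memory_order_relaxed);
    atomic_fetch_add_explicit(&m->active, 1, memory_order_relaxed);
}
```

Because both transitions move a block between the two counters without changing their sum, the P2.4 invariant `active + tls_cached ≈ used` is preserved by construction on these paths.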