Problem: Range-based ownership check caused SEGV in MT benchmarks
Root cause: Arena range tracking complexity + initialization race condition
Solution: Simplified to TID-cache-only approach
- Removed arena range tracking (arena_base, arena_end)
- Fast same-thread check via TID comparison only
- gettid() cached in TLS to avoid repeated syscalls
Changes:
1. core/pool_tls_bind.h - Simplified to TID cache struct
- PoolTLSBind: only stores tid (no arena range)
- pool_get_my_tid(): inline TID cache accessor
- pool_tls_is_mine_tid(owner_tid): simple TID comparison
2. core/pool_tls_bind.c - Minimal TLS storage only
- All logic moved to inline functions in header
- Only defines: __thread PoolTLSBind g_pool_tls_bind = {0};
3. core/pool_tls.c - Use TID comparison in pool_free()
- Changed: pool_tls_is_mine(ptr) → pool_tls_is_mine_tid(owner_tid)
- Registry lookup still needed for owner_tid (accepted overhead)
- Fixed gettid_cached() duplicate definition (#ifdef guard)
4. core/pool_tls_arena.c - Removed arena range hooks
- Removed: pool_tls_bind_update_range() call (disabled)
- Removed: pool_arena_get_my_range() implementation
5. core/pool_tls_arena.h - Removed getter API
- Removed: pool_arena_get_my_range() declaration
Results:
- MT stability: ✅ 2T/4T benchmarks SEGV-free
- Throughput: 2T=0.93M ops/s, 4T=1.64M ops/s
- Code simplicity: 90% reduction in BIND_BOX complexity
Trade-off:
- Registry lookup still required (TID-only doesn't eliminate it)
- But: simplified code, no initialization complexity, MT-safe
Next: Profile with perf to find remaining Mid-Large bottlenecks
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
57 lines
1.5 KiB
C
57 lines
1.5 KiB
C
#ifndef HAKMEM_POOL_TLS_BIND_H
|
|
#define HAKMEM_POOL_TLS_BIND_H
|
|
|
|
#include <stdint.h>
|
|
#include <sys/types.h>
|
|
#include <sys/syscall.h>
|
|
#include <unistd.h>
|
|
|
|
/**
|
|
* POOL_TLS_BIND_BOX - TID Cache for Fast Same-Thread Detection
|
|
*
|
|
* Box Theory:
|
|
* - Boundary: Thread initialization - cache TID in TLS once
|
|
* - Internal: Hot path uses TID comparison (no gettid syscall)
|
|
* - Fallback: gettid_cached() on first access
|
|
*
|
|
* Performance:
|
|
* - Eliminates repeated gettid() calls from hot path
|
|
* - Simple TID comparison (1 comparison vs registry lookup)
|
|
* - Expected: Reduce cache misses and syscall overhead
|
|
*/
|
|
|
|
// TLS binding for fast TID caching
|
|
typedef struct PoolTLSBind {
|
|
pid_t tid; // My thread ID (cached, 0 = uninitialized)
|
|
} PoolTLSBind;
|
|
|
|
// TLS cache (per-thread, automatically zero-initialized)
|
|
extern __thread PoolTLSBind g_pool_tls_bind;
|
|
|
|
// Inline helper: gettid with caching
|
|
static inline pid_t gettid_cached(void) {
|
|
static __thread pid_t cached_tid = 0;
|
|
if (__builtin_expect(cached_tid == 0, 0)) {
|
|
cached_tid = (pid_t)syscall(SYS_gettid);
|
|
}
|
|
return cached_tid;
|
|
}
|
|
|
|
// API
|
|
|
|
// Get my thread ID (cached in TLS)
|
|
static inline pid_t pool_get_my_tid(void) {
|
|
if (__builtin_expect(g_pool_tls_bind.tid == 0, 0)) {
|
|
g_pool_tls_bind.tid = gettid_cached();
|
|
}
|
|
return g_pool_tls_bind.tid;
|
|
}
|
|
|
|
// Fast same-thread check (TID comparison only)
|
|
// Returns 1 if owner_tid matches my TID, 0 otherwise
|
|
static inline int pool_tls_is_mine_tid(pid_t owner_tid) {
|
|
return owner_tid == pool_get_my_tid();
|
|
}
|
|
|
|
#endif // HAKMEM_POOL_TLS_BIND_H
|