Add ss_fast_lookup() for O(1) SuperSlab lookup via mask

Replaces expensive hak_super_lookup() (registry hash lookup, 50-100 cycles)
with fast mask-based lookup (~5-10 cycles) in free hot paths.

Algorithm:
1. Mask pointer with SUPERSLAB_SIZE_MIN (1MB) - works for both 1MB and 2MB SS
2. Validate magic (SUPERSLAB_MAGIC)
3. Range check using ss->lg_size

Applied to:
- tiny_free_fast.inc.h: tiny_free_fast() SuperSlab path
- tiny_free_fast_v2.inc.h: LARSON_FIX cross-thread check
- front/malloc_tiny_fast.h: free_tiny_fast() LARSON_FIX path

Note: Performance impact minimal with LARSON_FIX=OFF (default) since
SuperSlab lookup is skipped entirely in that case. Optimization benefits
LARSON_FIX=ON path for safe multi-threaded operation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
Moe Charm (CI)
2025-11-27 12:47:10 +09:00
parent 0a8bdb8b18
commit 64ed3d8d8c
4 changed files with 38 additions and 8 deletions

View File

@ -11,6 +11,33 @@ void _ss_remote_drain_to_freelist_unsafe(SuperSlab* ss, int slab_idx, TinySlabMe
// Optional debug counter (defined in hakmem_tiny_superslab.c)
extern _Atomic uint64_t g_ss_active_dec_calls;
// ========== Fast SuperSlab Lookup via Mask (Phase 12 optimization) ==========
// Purpose: Replace expensive hak_super_lookup() with O(1) mask calculation
// Invariant: All SuperSlabs are aligned to at least SUPERSLAB_SIZE_MIN (1MB)
// Cost: ~5-10 cycles (vs 50-100 cycles for registry lookup)
static inline SuperSlab* ss_fast_lookup(void* ptr)
{
if (__builtin_expect(!ptr, 0)) return NULL;
uintptr_t p = (uintptr_t)ptr;
// Step 1: Mask with minimum SuperSlab size (1MB alignment)
// Note: 2MB SuperSlabs are also 1MB aligned, so this works for both
SuperSlab* ss = (SuperSlab*)(p & ~((uintptr_t)SUPERSLAB_SIZE_MIN - 1u));
// Step 2: Validate magic (quick reject for non-SuperSlab memory)
if (__builtin_expect(ss->magic != SUPERSLAB_MAGIC, 0)) {
return NULL;
}
// Step 3: Range check (ptr must be within this SuperSlab)
size_t ss_size = (size_t)1 << ss->lg_size;
if (__builtin_expect(p >= (uintptr_t)ss + ss_size, 0)) {
return NULL;
}
return ss;
}
// Return maximum number of slabs for this SuperSlab based on lg_size.
static inline int ss_slabs_capacity(SuperSlab* ss)
{