Add ss_fast_lookup() for O(1) SuperSlab lookup via mask

Replaces the expensive hak_super_lookup() (registry hash lookup, 50-100 cycles)
with a fast mask-based lookup (~5-10 cycles) in the free hot paths.

Algorithm (a sketch follows below):
1. Mask the pointer with SUPERSLAB_SIZE_MIN (1MB) - works for both 1MB and 2MB SuperSlabs
2. Validate the magic (SUPERSLAB_MAGIC)
3. Range-check using ss->lg_size
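
A minimal sketch of those three steps, assuming a SuperSlab header that starts
with the magic and lg_size fields; the exact layout, the magic value, and the
1MB back-off used to handle 2MB SuperSlabs are illustrative assumptions, not
the exact code in the tree:

```c
#include <stdint.h>
#include <stddef.h>

#define SUPERSLAB_SIZE_MIN ((uintptr_t)1 << 20)  /* 1MB minimum SuperSlab size (step 1) */
#define SUPERSLAB_MAGIC    0x53534C42u           /* placeholder; the real constant differs */

typedef struct SuperSlab {
    uint32_t magic;    /* SUPERSLAB_MAGIC when the header is live (step 2)        */
    uint8_t  lg_size;  /* log2 of this SuperSlab's size: 20 (1MB) or 21 (2MB)     */
    /* ... rest of the header elided ... */
} SuperSlab;

static inline SuperSlab* ss_fast_lookup(void* ptr) {
    uintptr_t p = (uintptr_t)ptr;

    /* Step 1: mask down to the nearest 1MB boundary. For a 1MB SuperSlab this
     * is the header itself; for a 2MB SuperSlab it is either the header or
     * the slab's midpoint. */
    SuperSlab* ss = (SuperSlab*)(p & ~(SUPERSLAB_SIZE_MIN - 1));

    /* Step 2: validate the magic. If we landed on the midpoint of a 2MB
     * SuperSlab, step back one 1MB stride and retry (assumed handling). */
    if (ss->magic != SUPERSLAB_MAGIC) {
        ss = (SuperSlab*)((uintptr_t)ss - SUPERSLAB_SIZE_MIN);
        if (ss->magic != SUPERSLAB_MAGIC)
            return NULL;  /* not a SuperSlab-managed pointer */
    }

    /* Step 3: range check via ss->lg_size - the pointer must lie inside
     * [ss, ss + (1 << lg_size)). */
    if (p - (uintptr_t)ss >= ((uintptr_t)1 << ss->lg_size))
        return NULL;

    return ss;
}
```

With no hashing, locks, or loops beyond one possible retry, this stays in the
~5-10 cycle range cited above.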

Applied to:
- tiny_free_fast.inc.h: tiny_free_fast() SuperSlab path
- tiny_free_fast_v2.inc.h: LARSON_FIX cross-thread check
- front/malloc_tiny_fast.h: free_tiny_fast() LARSON_FIX path

Note: The performance impact is minimal with LARSON_FIX=OFF (the default),
since the SuperSlab lookup is skipped entirely in that case. The optimization
benefits the LARSON_FIX=ON path used for safe multi-threaded operation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Author: Moe Charm (CI)
Date:   2025-11-27 12:47:10 +09:00
Parent: 0a8bdb8b18
Commit: 64ed3d8d8c

4 changed files with 38 additions and 8 deletions


@@ -194,7 +194,8 @@ static inline int hak_tiny_free_fast_v2(void* ptr) {
     if (__builtin_expect(g_larson_fix, 0)) {
         // Cross-thread check enabled - MT safe mode
-        SuperSlab* ss = hak_super_lookup(base);
+        // Phase 12 optimization: Use fast mask-based lookup (~5-10 cycles vs 50-100)
+        SuperSlab* ss = ss_fast_lookup(base);
         if (__builtin_expect(ss != NULL, 1)) {
             int slab_idx = slab_index_for(ss, base);
             if (__builtin_expect(slab_idx >= 0, 1)) {