hakmem/core
Claude 5ec9d1746f Option A (Full): Inline TLS cache access in malloc()
Implementation:
1. Added g_initialized check to fast path (skip bootstrap overhead)
2. Inlined hak_tiny_size_to_class() - LUT lookup (~1 load)
3. Inlined TLS cache pop - direct g_tls_sll_head access (3-4 instructions)
4. Eliminated function call overhead on fast path hit
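The four steps above can be sketched roughly as follows. This is a minimal illustration, not the hakmem source: only `g_initialized` and `g_tls_sll_head` are names taken from the commit message. The class count, the 64-byte tiny limit, the LUT layout, and every `sketch_*` name are assumptions made for the example, and the slow path simply defers to the system `malloc()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: 8 size classes covering 8..64 bytes in 8-byte steps. */
#define SKETCH_NUM_CLASSES 8
#define SKETCH_MAX_TINY    64

/* Names from the commit message; types and sizes are assumptions. */
static int g_initialized = 0;
static _Thread_local void *g_tls_sll_head[SKETCH_NUM_CLASSES];

/* LUT indexed by (size + 7) / 8, so the lookup is a single load. */
static const uint8_t sketch_class_lut[SKETCH_MAX_TINY / 8 + 1] = {
    0, 0, 1, 2, 3, 4, 5, 6, 7
};

/* Stand-in for the inlined hak_tiny_size_to_class(); caller checks the size. */
static inline int sketch_size_to_class(size_t size) {
    return sketch_class_lut[(size + 7) >> 3];
}

/* Push a free block onto this thread's list for the class.
 * The "next" pointer is stored in the first word of the block. */
static inline void sketch_tls_push(int cls, void *block) {
    assert(cls >= 0 && cls < SKETCH_NUM_CLASSES);
    *(void **)block = g_tls_sll_head[cls];
    g_tls_sll_head[cls] = block;
}

/* Hypothetical slow path: bootstrap, refill, or large sizes. */
static void *sketch_slow_path(size_t size) {
    return malloc(size);
}

/* Fast path: initialized check, LUT lookup, TLS load, pop, return. */
static inline void *sketch_malloc(size_t size) {
    if (g_initialized && size <= SKETCH_MAX_TINY) {
        int cls = sketch_size_to_class(size);
        void *head = g_tls_sll_head[cls];
        if (head != NULL) {
            g_tls_sll_head[cls] = *(void **)head; /* pop: head = head->next */
            return head;
        }
    }
    return sketch_slow_path(size);
}
```

On a cache hit the function returns without making any call, which is the point of step 4. A miss, an uninitialized allocator, or an oversized request all fall through to the slow path.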

Result: +11.5% improvement (1.31M → 1.46M ops/s avg, threads=4)
- Before: Function call + internal processing (~15-20 instructions)
- After: LUT + TLS load + pop + return (~5-6 instructions)

Still about 19% below the 1.81M ops/s target. Next: RDTSC profiling to identify the remaining bottleneck.
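One possible shape for the RDTSC profiling step is sketched below. It assumes GCC or Clang on x86, where `__rdtsc()` is provided by `<x86intrin.h>`, and falls back to a wall-clock nanosecond counter elsewhere. The `sketch_*` names and the per-region accumulator are hypothetical and not part of hakmem.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the CPU timestamp counter (cycles, not serialized). */
static inline uint64_t sketch_ticks(void) {
    return __rdtsc();
}
#else
#include <time.h>
/* Fallback for non-x86 targets: nanoseconds from the C11 wall clock. */
static inline uint64_t sketch_ticks(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

/* Accumulated cost of one instrumented region (hypothetical helper). */
typedef struct {
    uint64_t total; /* sum of measured ticks */
    uint64_t calls; /* number of samples */
} sketch_prof_t;

/* Record one sample given the tick values taken before and after the region. */
static inline void sketch_prof_add(sketch_prof_t *p, uint64_t start, uint64_t end) {
    assert(end >= start);
    p->total += end - start;
    p->calls += 1;
}

/* Average ticks per sample, or 0 when nothing was recorded. */
static inline uint64_t sketch_prof_avg(const sketch_prof_t *p) {
    return p->calls ? p->total / p->calls : 0;
}
```

In use, a region such as the slow path would be bracketed with two `sketch_ticks()` calls and the pair passed to `sketch_prof_add()`. Comparing the averages across regions can show where the remaining cycles go. Because the counter read is not serialized, very short regions are measured only approximately.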
2025-11-05 07:07:47 +00:00