hakmem/core
Moe Charm (CI) a32d0fafd4 Two-Speed Optimization Part 2: Remove atomic trace counters from hot path
Performance improvements:
- lock incl instructions completely removed from malloc/free hot paths
- Cache misses: 24.4% → 13.4% of cycles
- Throughput: 85M → 89.12M ops/sec (+4.8%)
- Cycles/op: 48.8 → 48.25 (-1.1%)

Changes in core/box/hak_wrappers.inc.h:
- malloc: Guard g_wrap_malloc_trace_count atomic with #if !HAKMEM_BUILD_RELEASE
- free: Guard g_wrap_free_trace_count and g_free_wrapper_calls with the same #if !HAKMEM_BUILD_RELEASE guard

Debug builds retain full instrumentation via HAK_TRACE.
Release builds execute completely clean hot paths without atomic operations.
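
A minimal sketch of the guard pattern described above, not the actual contents of core/box/hak_wrappers.inc.h. The counter names and the HAKMEM_BUILD_RELEASE macro come from this commit message. The counter types, the sketch_* function names, and the default of HAKMEM_BUILD_RELEASE=0 are assumptions made so the example compiles on its own. HAK_TRACE is not reproduced because its definition is not shown here.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: the build defines HAKMEM_BUILD_RELEASE as 0 (debug) or
 * 1 (release). Default to debug so this sketch compiles standalone. */
#ifndef HAKMEM_BUILD_RELEASE
#define HAKMEM_BUILD_RELEASE 0
#endif

#if !HAKMEM_BUILD_RELEASE
/* Debug-only counters. Names follow the commit message; types are assumed. */
static _Atomic int g_wrap_malloc_trace_count = 0;
static _Atomic int g_wrap_free_trace_count = 0;
#endif

/* Hypothetical stand-in for the malloc wrapper. */
void *sketch_malloc(size_t size) {
#if !HAKMEM_BUILD_RELEASE
    /* On x86-64 an atomic increment like this typically compiles to a
     * lock-prefixed instruction. Release builds never see this line. */
    atomic_fetch_add_explicit(&g_wrap_malloc_trace_count, 1,
                              memory_order_relaxed);
#endif
    return malloc(size);
}

/* Hypothetical stand-in for the free wrapper. */
void sketch_free(void *ptr) {
#if !HAKMEM_BUILD_RELEASE
    atomic_fetch_add_explicit(&g_wrap_free_trace_count, 1,
                              memory_order_relaxed);
#endif
    free(ptr);
}

/* Traced malloc calls so far, or -1 when tracing is compiled out. */
int sketch_malloc_trace_count(void) {
#if !HAKMEM_BUILD_RELEASE
    return atomic_load_explicit(&g_wrap_malloc_trace_count,
                                memory_order_relaxed);
#else
    return -1;
#endif
}

/* Traced free calls so far, or -1 when tracing is compiled out. */
int sketch_free_trace_count(void) {
#if !HAKMEM_BUILD_RELEASE
    return atomic_load_explicit(&g_wrap_free_trace_count,
                                memory_order_relaxed);
#else
    return -1;
#endif
}
```

Compiling with -DHAKMEM_BUILD_RELEASE=1 removes both the counters and the increments at preprocessing time, so the wrappers reduce to plain calls to malloc and free.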

Verified via:
- perf report: lock incl instructions gone
- perf stat: cycles/op reduced, cache miss % improved
- objdump: 0 lock instructions in hot paths

Next: Inline unified_cache_refill for an additional 3-4 cycles/op improvement

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 19:20:44 +09:00