hakmem

tomoaki/hakmem

Commit Graph

Author: Moe Charm (CI)

SHA1: 29fefa2018

P0 Lock Contention Analysis: Instrumentation + comprehensive report

**P0-2: Lock Instrumentation** (✅ Complete)
- Add atomic counters to g_shared_pool.alloc_lock
- Track acquire_slab() vs release_slab() separately
- Environment: HAKMEM_SHARED_POOL_LOCK_STATS=1
- Report stats at shutdown via destructor
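
The instrumentation described above might look roughly like the following. This is a minimal sketch, not the code in core/hakmem_shared_pool.c: only the environment variable name comes from the commit message, and every identifier here (`pool_lock`, `lock_count`, `stats_enabled`, `lock_path_t`) is hypothetical. It also assumes a GCC/Clang toolchain for the destructor attribute.

```c
#include <inttypes.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names: only the environment variable is taken from the commit message. */
typedef enum { PATH_ACQUIRE = 0, PATH_RELEASE = 1 } lock_path_t;

static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_uint_fast64_t lock_count[2]; /* zero-initialized; indexed by lock_path_t */
static atomic_int stats_enabled = -1;      /* -1 = environment not read yet */

/* Read HAKMEM_SHARED_POOL_LOCK_STATS once and cache the answer. */
static int lock_stats_enabled(void) {
    int on = atomic_load_explicit(&stats_enabled, memory_order_relaxed);
    if (on < 0) {
        const char *v = getenv("HAKMEM_SHARED_POOL_LOCK_STATS");
        on = (v != NULL && v[0] == '1');
        atomic_store_explicit(&stats_enabled, on, memory_order_relaxed);
    }
    return on;
}

/* Take the pool lock, counting the acquisition against the calling path. */
static void pool_lock(lock_path_t path) {
    if (lock_stats_enabled())
        atomic_fetch_add_explicit(&lock_count[path], 1, memory_order_relaxed);
    pthread_mutex_lock(&alloc_lock);
}

static void pool_unlock(void) {
    pthread_mutex_unlock(&alloc_lock);
}

/* GCC/Clang extension: runs at process exit, after main() returns. */
__attribute__((destructor))
static void lock_stats_report(void) {
    if (!lock_stats_enabled()) return;
    uint64_t a = atomic_load_explicit(&lock_count[PATH_ACQUIRE], memory_order_relaxed);
    uint64_t r = atomic_load_explicit(&lock_count[PATH_RELEASE], memory_order_relaxed);
    fprintf(stderr, "lock acquisitions: acquire path=%" PRIu64
                    ", release path=%" PRIu64 "\n", a, r);
}
```

Relaxed ordering is enough here because the counters are pure statistics and never order other memory accesses. Counting before taking the mutex keeps the extra work outside the critical section.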

**P0-3: Analysis Results** (✅ Complete)
- 100% of lock acquisitions come from acquire_slab() (allocation path)
- 0% come from release_slab() (effectively lock-free)
- Lock rate: 0.206% of operations take the lock (TLS hit rate: 99.8%)
- Scaling: 4T→8T = 1.44x (sublinear, lock bottleneck)

**Key Findings**:
- 4T: 330 lock acquisitions / 160K ops
- 8T: 658 lock acquisitions / 320K ops
- futex: 68% of syscall time (from previous strace)
- Bottleneck: acquire_slab() runs its 3-stage logic while holding the mutex
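
The lock rate and TLS hit rate quoted above follow directly from the acquisition counts. A small sketch of the arithmetic, assuming "160K" and "320K" mean exactly 160,000 and 320,000 operations and that every operation which skips the lock counts as a TLS hit (`lock_rate_percent` is a hypothetical helper, not a function from the repository):

```c
/* Percentage of operations that had to take the shared-pool lock. */
static double lock_rate_percent(unsigned long lock_acquisitions, unsigned long ops) {
    if (ops == 0) return 0.0; /* avoid dividing by zero on an empty run */
    return 100.0 * (double)lock_acquisitions / (double)ops;
}

/* Percentage of operations served without the lock (the TLS fast path). */
static double tls_hit_percent(unsigned long lock_acquisitions, unsigned long ops) {
    return 100.0 - lock_rate_percent(lock_acquisitions, ops);
}
```

With these inputs, 330 / 160,000 gives about 0.206% at 4T and 658 / 320,000 gives about 0.206% at 8T, so the per-operation lock rate is flat while the absolute number of acquisitions doubles.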

**Report**: MID_LARGE_LOCK_CONTENTION_ANALYSIS.md (2.3KB)
- Detailed breakdown by code path
- Root cause analysis (TLS miss → shared pool lock)
- Lock-free implementation roadmap (P0-4/P0-5)
- Expected impact: +50-73% throughput

**Files Modified**:
- core/hakmem_shared_pool.c: +60 lines instrumentation
  - Atomic counters: g_lock_acquire/release_slab_count
  - lock_stats_init() + lock_stats_report()
  - Per-path tracking in acquire/release functions

**Next Steps**:
- P0-4: Lock-free per-class free lists (Stage 1: LIFO stack CAS)
- P0-5: Lock-free slot claiming (Stage 2: atomic bitmap)
- P0-6: A/B comparison (target: +50-73%)
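
The two techniques named for P0-4 and P0-5 can be sketched generically as follows. This is an illustration of a CAS-based LIFO stack and an atomic-bitmap slot claim under stated assumptions, not the planned hakmem implementation: all names (`lifo_t`, `free_node_t`, `lifo_push`, `lifo_pop`, `bitmap_claim`, `bitmap_release`) are hypothetical, and the 32-slot bitmap width is an arbitrary choice.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Stage 1 sketch: intrusive LIFO free list updated with compare-and-swap. */
typedef struct free_node { struct free_node *next; } free_node_t;
typedef struct { _Atomic(free_node_t *) head; } lifo_t;

static void lifo_push(lifo_t *s, free_node_t *n) {
    free_node_t *old = atomic_load_explicit(&s->head, memory_order_relaxed);
    do {
        n->next = old; /* link to the head we expect; a failed CAS refreshes old */
    } while (!atomic_compare_exchange_weak_explicit(
                 &s->head, &old, n, memory_order_release, memory_order_relaxed));
}

/* Caveat: this simple pop is exposed to the ABA problem if a popped node can
 * be pushed back while another thread still holds a stale head pointer.
 * Production code usually adds a version tag or a reclamation scheme. */
static free_node_t *lifo_pop(lifo_t *s) {
    free_node_t *old = atomic_load_explicit(&s->head, memory_order_acquire);
    while (old != NULL &&
           !atomic_compare_exchange_weak_explicit(
               &s->head, &old, old->next,
               memory_order_acquire, memory_order_acquire)) {
        /* a failed CAS reloaded old; retry with the new head */
    }
    return old; /* NULL when the list is empty */
}

/* Stage 2 sketch: claim the lowest free slot in a 32-bit occupancy bitmap.
 * Returns the slot index, or -1 when every slot is taken. */
static int bitmap_claim(_Atomic(uint32_t) *used) {
    uint32_t cur = atomic_load_explicit(used, memory_order_relaxed);
    while (cur != UINT32_MAX) {
        int slot = 0;
        while (cur & (UINT32_C(1) << slot)) slot++; /* lowest zero bit */
        if (atomic_compare_exchange_weak_explicit(
                used, &cur, cur | (UINT32_C(1) << slot),
                memory_order_acquire, memory_order_relaxed))
            return slot;
        /* CAS failed: cur now holds the latest bitmap, so rescan */
    }
    return -1;
}

static void bitmap_release(_Atomic(uint32_t) *used, int slot) {
    atomic_fetch_and_explicit(used, ~(UINT32_C(1) << slot), memory_order_release);
}
```

Both routines retry only when another thread changed the shared word in between, so an uncontended call completes with a single CAS and never blocks on a mutex.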

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Date: 2025-11-14 15:32:07 +09:00

1 commit