Claude AI Instructions for HAKMEM Project
🚫 CRITICAL: LD_PRELOAD Benchmark Ban
NEVER suggest or attempt LD_PRELOAD benchmarking
Rule
If you encounter segfaults or issues with LD_PRELOAD:
- STOP immediately - Do not debug
- Remind user: "LD_PRELOAD benchmarks are banned (see LD_PRELOAD_BENCHMARK_BAN.md)"
- Suggest alternative: Use direct linking instead
Why This Rule Exists
- Previous sessions wasted hours debugging LD_PRELOAD segfaults
- Problem is NOT in HAKMEM - it's a glibc limitation
- Industry-wide issue affecting tcmalloc, jemalloc, mimalloc, hardened_malloc
- Trade-off: making LD_PRELOAD interposition safe requires mincore() checks (sketched below) → ~6.4x performance loss → unacceptable
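For context, the sketch below shows the kind of mincore() guard an interposition-safe free path ends up needing to decide whether an incoming pointer even lies in a mapped region. The helper name and exact policy are illustrative assumptions, not HAKMEM or hardened_malloc code; the point is that paying a syscall on the hot path is what produces the ~6.4x loss.
#define _GNU_SOURCE
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>

/* Illustrative only: ask the kernel whether the page containing p is
 * mapped in this address space. mincore() returns 0 for mapped pages
 * and fails with ENOMEM for unmapped ones. One syscall per free() is
 * the cost that makes this approach unacceptable for benchmarking. */
static int page_is_mapped(const void *p) {
    unsigned char vec;
    uintptr_t pagesz = (uintptr_t)sysconf(_SC_PAGESIZE);
    void *page = (void *)((uintptr_t)p & ~(pagesz - 1));
    return mincore(page, (size_t)pagesz, &vec) == 0;
}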
Correct Approach
# ✅ ALWAYS USE THIS
gcc -o bench bench.c libhakmem.a -lpthread
./bench
# ❌ NEVER USE THIS FOR BENCHMARKING
LD_PRELOAD=./libhakmem.so ./bench
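As a reference point, a directly linked micro-benchmark can be as small as the sketch below. The file name bench.c matches the command above, but the workload (a malloc/free loop over mixed sizes) is an illustrative assumption, not the project's actual benchmark.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Illustrative only: because libhakmem.a is linked in directly, these
 * malloc/free calls resolve to HAKMEM at link time - no interposition. */
int main(void) {
    enum { ITERS = 1000000 };
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITERS; i++) {
        size_t sz = (size_t)16 << (i % 8);      /* 16 B .. 2 KiB mix */
        char *p = malloc(sz);
        if (!p) return 1;
        p[0] = (char)i;                         /* touch the allocation */
        free(p);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%.2f M ops/s\n", (double)ITERS / secs / 1e6);
    return 0;
}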
Reference
See LD_PRELOAD_BENCHMARK_BAN.md for full details including:
- WebSearch evidence (hardened_malloc #98, mimalloc #21, Stack Overflow)
- Historical attempts (Phase 6.15, Phase 8.2)
- Technical root causes (dlsym recursion, printf malloc dependency, glibc edge cases); the dlsym recursion is sketched below
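The dlsym recursion comes from the standard interposer pattern shown below. This is a generic sketch of the failure mode, not HAKMEM code, and it assumes the whole malloc family is interposed, as a real allocator must do.
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

static void *(*real_malloc)(size_t) = NULL;

void *malloc(size_t size) {
    if (!real_malloc) {
        /* glibc's dlsym() can itself allocate (it calls calloc), which
         * re-enters the interposed allocator before real_malloc is set:
         * recursion or a crash before main() ever runs. printf() has the
         * same problem because it may malloc its output buffer. */
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
    }
    return real_malloc ? real_malloc(size) : NULL;
}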
Project Context
HAKMEM is a high-performance malloc replacement with the following tiers (a size-routing sketch follows the list):
- L0 Tiny Pool (≤1KiB): TLS magazine + TLS Active Slab
- L1 Mid Pool (1-16KiB): Thread-local cache
- L2 Pool (16-256KiB): Sharded locks + remote free rings
- L2.5 Pool (256KiB-2MiB): Size-class caching
- L3 BigCache (>2MiB): mmap with batch madvise
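A minimal sketch of how a request might be routed across these tiers is shown below. The function names and the plain-malloc stub bodies are illustrative assumptions added so the sketch compiles; only the size thresholds mirror the list above.
#include <stdlib.h>

/* Tier stubs: placeholders only; each real tier has its own fast path. */
static void *tiny_alloc(size_t n) { return malloc(n); }  /* L0   */
static void *mid_alloc(size_t n)  { return malloc(n); }  /* L1   */
static void *l2_alloc(size_t n)   { return malloc(n); }  /* L2   */
static void *l25_alloc(size_t n)  { return malloc(n); }  /* L2.5 */
static void *big_alloc(size_t n)  { return malloc(n); }  /* L3   */

/* Route by request size, matching the ranges listed above. */
void *hakmem_route(size_t n) {
    if (n <= (size_t)1 << 10)    return tiny_alloc(n);   /* <= 1 KiB        */
    if (n <= (size_t)16 << 10)   return mid_alloc(n);    /* 1 - 16 KiB      */
    if (n <= (size_t)256 << 10)  return l2_alloc(n);     /* 16 - 256 KiB    */
    if (n <= (size_t)2 << 20)    return l25_alloc(n);    /* 256 KiB - 2 MiB */
    return big_alloc(n);                                 /* > 2 MiB         */
}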
Current focus: Performance optimization and memory overhead reduction.
Last Updated: 2025-10-27