LD_PRELOAD Benchmark Policy - BANNED
Date: 2025-10-27
Status: 🚫 PERMANENTLY BANNED
Reason: Industry-wide glibc edge cases cause unpredictable segfaults
TL;DR
DO NOT USE LD_PRELOAD FOR BENCHMARKING
Use direct linking instead:
# ✅ CORRECT
gcc -o bench bench.c libhakmem.a -lpthread
./bench
# ❌ BANNED
LD_PRELOAD=./libhakmem.so ./bench
Why Banned?
1. Previous Investigation History
We spent significant time debugging segfaults with LD_PRELOAD benchmarks:
- Multithreaded tests crashed unpredictably
- Different commands (ls, locale tools) crashed randomly
- Even after fixes, new crashes appeared
Conclusion from previous sessions: The problem is NOT fixable in HAKMEM - it's a glibc limitation.
2. Industry-Wide Issue (WebSearch Evidence 2024)
hardened_malloc (GrapheneOS):
- Issue #98: sssd crashes with LD_PRELOAD
- Same problem affects production-grade allocators
mimalloc (Microsoft):
- Issue #21: Firefox crashes on start with ld.so.preload
- Widely used allocator, still has LD_PRELOAD issues
Stack Overflow consensus:
"Even though glibc officially supports using linker preloading for malloc replacement, there are edge cases that aren't supported properly by them"
Root causes:
- dlsym() calls calloc(32) → infinite recursion → stack overflow
- printf() internally uses malloc → recursion → segfault
- mbstowcs() (locale) calls internal malloc → crashes unrelated code
- Incomplete malloc substitution in glibc edge cases
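The dlsym() root cause can be illustrated with a small simulation. The sketch below does not interpose the real calloc and is not HAKMEM code: every name in it (wrapped_calloc, fake_dlsym_lookup, DEPTH_LIMIT, simulate_naive_interposition) is hypothetical, and the depth limit exists only so the demo stops instead of overflowing the stack.

```c
#include <stddef.h>

/* Simulation only: all names here are hypothetical and nothing is
   actually interposed. */
enum { DEPTH_LIMIT = 8 };       /* demo cut-off; real code would overflow the stack */

static int depth = 0;           /* current nesting of wrapper calls */
static int max_depth_seen = 0;  /* deepest nesting reached */

static void *wrapped_calloc(size_t n, size_t size);

/* Stand-in for dlsym(RTLD_NEXT, "calloc"). The real dlsym may allocate
   internally, and that allocation lands back in the preloaded calloc. */
static void fake_dlsym_lookup(void) {
    (void)wrapped_calloc(1, 32);
}

/* Naive wrapper: tries to resolve the real calloc before it can serve
   any request, including the one made by the lookup itself. */
static void *wrapped_calloc(size_t n, size_t size) {
    (void)n;
    (void)size;
    depth++;
    if (depth > max_depth_seen) max_depth_seen = depth;
    if (depth < DEPTH_LIMIT) {
        fake_dlsym_lookup();    /* re-enters wrapped_calloc */
    }
    depth--;
    return NULL;                /* the simulation never allocates */
}

/* Runs one top-level call and reports how deep the re-entry went. */
int simulate_naive_interposition(void) {
    depth = 0;
    max_depth_seen = 0;
    (void)wrapped_calloc(4, 16);
    return max_depth_seen;
}
```

A single call to simulate_naive_interposition() reaches DEPTH_LIMIT: the request never completes on its own and only stops because the demo caps the depth.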
3. Performance Impact (Phase 8.2)
When we tried to make LD_PRELOAD safe (commit c8139f3):
- Added mincore() syscall for safety → 6.4x slowdown (145M → 22.78M ops/sec)
- Removed mincore() to recover performance → LD_PRELOAD broken again
Trade-off: LD_PRELOAD safety OR performance - cannot have both.
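To make the cost concrete, here is a minimal sketch of what a mincore()-based guard can look like on Linux. The source does not show how HAKMEM implemented its check, so the helper name page_is_mapped and its shape are assumptions. The point is only that each guarded pointer costs one system call, which is expensive on an allocator's hot path.

```c
#define _DEFAULT_SOURCE   /* exposes mincore() in <sys/mman.h> on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, not HAKMEM's code.
   Returns 1 if the page containing p is mapped, 0 otherwise. */
int page_is_mapped(const void *p) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return 0;

    /* mincore() requires a page-aligned start address. */
    uintptr_t base = (uintptr_t)p & ~((uintptr_t)page - 1);

    unsigned char vec[1];  /* one status byte per page queried */

    /* mincore() fails (ENOMEM) when the range is not mapped;
       success means the page belongs to the address space. */
    return mincore((void *)base, (size_t)page, vec) == 0;
}
```

Calling such a helper before every pointer dereference in free() turns a handful of instructions into a kernel round trip, which is consistent with the slowdown reported above.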
What Works Instead
✅ Direct Linking (Recommended)
# Static linking
gcc -o bench bench.c libhakmem.a -lpthread
# Dynamic linking with -lhakmem
gcc -o bench bench.c -L. -lhakmem -lpthread
export LD_LIBRARY_PATH=.
./bench
Benefits:
- No glibc edge cases
- Full performance (no mincore overhead)
- Predictable behavior
- No random segfaults
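For reference, the core of a directly linked benchmark can be very small. The sketch below is not the project's bench.c: run_alloc_loop and its parameters are hypothetical, and it assumes that libhakmem.a exports malloc and free so that the calls resolve to HAKMEM at link time.

```c
#define _POSIX_C_SOURCE 199309L  /* clock_gettime, CLOCK_MONOTONIC */
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical benchmark kernel: performs `iters` malloc/free pairs of
   `size` bytes and returns pairs per second (0.0 on any failure). */
double run_alloc_loop(size_t iters, size_t size) {
    struct timespec t0, t1;
    if (size == 0 || clock_gettime(CLOCK_MONOTONIC, &t0) != 0) return 0.0;

    for (size_t i = 0; i < iters; i++) {
        /* volatile store keeps the compiler from eliding the allocation */
        volatile unsigned char *p = malloc(size);
        if (p == NULL) return 0.0;
        p[0] = (unsigned char)i;
        free((void *)p);
    }

    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0) return 0.0;

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return secs > 0.0 ? (double)iters / secs : 0.0;
}
```

With a main() that calls run_alloc_loop and prints the result, this builds with the static or dynamic commands shown above; no LD_PRELOAD is involved at any point.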
✅ What LD_PRELOAD Can Do (Limited)
Simple commands work fine:
LD_PRELOAD=./libhakmem.so echo "hello" # ✅ Works
LD_PRELOAD=./libhakmem.so cat file.txt # ✅ Works
LD_PRELOAD=./libhakmem.so grep pattern # ✅ Works
But DO NOT use for benchmarking - use direct linking.
Historical Context
Phase 6.15 (Oct 2025)
- Attempted LD_PRELOAD multithreaded safety fixes
- Added pthread_once() for initialization
- Result: Still crashed - not a HAKMEM bug, glibc limitation
Commit c8139f3 (Phase 8.2)
- Removed mincore() for performance
- Decision: Sacrificed LD_PRELOAD safety for 10.9x speedup
- Implicit ban on LD_PRELOAD benchmarks
Commit e4d61d5
- Documented LD_PRELOAD limitations
- But did NOT explicitly ban benchmarking ← This was the mistake
AI Instructions (claude.md)
Added to claude.md:
## LD_PRELOAD Benchmark Policy
**NEVER suggest or attempt LD_PRELOAD benchmarking**
If encountering segfaults with LD_PRELOAD:
1. Immediately stop debugging
2. Remind user: "LD_PRELOAD benchmarks are banned (see LD_PRELOAD_BENCHMARK_BAN.md)"
3. Suggest direct linking instead
References
WebSearch findings (2024-10-27):
- hardened_malloc Issue #98
  - https://github.com/GrapheneOS/hardened_malloc/issues/98
  - sssd crashes with LD_PRELOAD
- Stack Overflow - dlsym recursion
  - https://stackoverflow.com/questions/6083337/overriding-malloc-using-the-ld-preload-mechanism
  - Infinite loop segfaults
- mimalloc Issue #21
  - https://github.com/microsoft/mimalloc/issues/21
  - Firefox crashes on start
- Stack Overflow - glibc edge cases
  - https://stackoverflow.com/questions/27322295/ld-preload-causing-segmentation-fault-in-dynamic-library-loader
  - "edge cases that aren't supported properly by glibc"
Summary
| Approach | Safety | Performance | Verdict |
|---|---|---|---|
| Direct linking | ✅ Safe | ✅ Fast | ✅ USE THIS |
| LD_PRELOAD (with mincore) | ⚠️ Safer | ❌ 6.4x slower | ❌ Unacceptable |
| LD_PRELOAD (without mincore) | ❌ Crashes | ✅ Fast | ❌ BANNED |
Final Decision: Use direct linking for all benchmarks.
Last Updated: 2025-10-27
Policy Expires: Never (permanent ban)