# LD_PRELOAD Benchmark Policy - BANNED
**Date**: 2025-10-27
**Status**: 🚫 **PERMANENTLY BANNED**
**Reason**: Industry-wide glibc edge cases cause unpredictable segfaults
---
## TL;DR
**DO NOT USE LD_PRELOAD FOR BENCHMARKING**
Use direct linking instead:
```bash
# ✅ CORRECT
gcc -o bench bench.c libhakmem.a
./bench
# ❌ BANNED
LD_PRELOAD=./libhakmem.so ./bench
```
---
## Why Banned?
### 1. Previous Investigation History
We spent significant time debugging segfaults with LD_PRELOAD benchmarks:
- Multithreaded tests crashed unpredictably
- Different commands (ls, locale tools) crashed randomly
- Even after fixes, new crashes appeared
**Conclusion from previous sessions**: The problem is NOT fixable in HAKMEM - it's a glibc limitation.
### 2. Industry-Wide Issue (WebSearch Evidence 2024)
**hardened_malloc (GrapheneOS)**:
- Issue #98: sssd crashes with LD_PRELOAD
- Same problem affects production-grade allocators
**mimalloc (Microsoft)**:
- Issue #21: Firefox crashes on start with ld.so.preload
- Widely used allocator, still has LD_PRELOAD issues
**Stack Overflow consensus**:
> "Even though glibc officially supports using linker preloading for malloc replacement,
> there are edge cases that aren't supported properly by them"
**Root causes**:
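1. `dlsym()` calls `calloc(32)` internally; if the preloaded `calloc` is itself still resolving the real function through `dlsym()`, the result is infinite recursion → stack overflow
2. `printf()` uses malloc internally; calling it from inside the allocator re-enters malloc → recursion → segfault
3. `mbstowcs()` (locale) calls internal malloc → crashes in unrelated code
4. Incomplete malloc substitution in glibc edge cases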
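
The first two root causes share one shape: work done inside the allocator calls back into the allocator. A common workaround is a reentrancy guard backed by a small static pool. The sketch below illustrates the pattern only. All names (`guarded_alloc`, `bootstrap_pool`, `internal_work_that_allocates`) are hypothetical and are not HAKMEM symbols, and `malloc` stands in for the real allocation path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* All names here are hypothetical; none are HAKMEM symbols. */
static _Thread_local int in_allocator;   /* nonzero while this thread is inside the hook */
static _Alignas(16) unsigned char bootstrap_pool[4096];
static size_t bootstrap_used;            /* not thread-safe; a real pool needs atomics or a lock */
static void *last_nested;                /* records the nested result for inspection */

/* Bump-allocate from the static pool. Blocks are never freed. */
static void *bootstrap_alloc(size_t size) {
    assert(in_allocator && "bootstrap pool is only for nested calls");
    size_t rounded = (size + 15u) & ~(size_t)15u;
    if (rounded < size || rounded > sizeof bootstrap_pool - bootstrap_used)
        return NULL;
    void *p = bootstrap_pool + bootstrap_used;
    bootstrap_used += rounded;
    return p;
}

/* A free() replacement must recognise pool blocks and ignore them. */
int is_bootstrap_ptr(const void *p) {
    uintptr_t a = (uintptr_t)p, lo = (uintptr_t)bootstrap_pool;
    return a >= lo && a < lo + sizeof bootstrap_pool;
}

void *guarded_alloc(size_t size);

/* Stand-in for dlsym() or printf(): allocator-internal work that allocates. */
static void internal_work_that_allocates(void) {
    last_nested = guarded_alloc(32);
}

void *guarded_alloc(size_t size) {
    if (in_allocator)                 /* nested call: serve from the pool, do not recurse */
        return bootstrap_alloc(size);
    in_allocator = 1;
    internal_work_that_allocates();   /* would recurse forever without the guard */
    void *p = malloc(size);           /* stand-in for the real allocation path */
    in_allocator = 0;
    return p;
}
```

A guard like this covers only the recursion cases. It does nothing for root causes 3 and 4, which is why the ban stands.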
### 3. Performance Impact (Phase 8.2)
When we tried to make LD_PRELOAD safe (commit c8139f3):
- Added `mincore()` syscall for safety → **6.4x slowdown** (145M → 22.78M ops/sec)
- Removed `mincore()` to recover performance → LD_PRELOAD broken again
**Trade-off**: LD_PRELOAD safety OR performance - cannot have both.
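
The numbers above work out to roughly 6.9 ns per operation without `mincore()` and 43.9 ns with it. This document does not record how the check was wired in, so the sketch below is an assumption: a hypothetical helper `page_is_mapped()` that asks the kernel whether the page holding a pointer is mapped before the allocator touches it. It shows why such a check costs a system call every time it runs.

```c
#define _DEFAULT_SOURCE   /* exposes mincore() in <sys/mman.h> on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, not a HAKMEM function.
 * Returns 1 if the page containing ptr is mapped, 0 otherwise.
 * Any mincore() failure is treated as "not mapped". */
int page_is_mapped(const void *ptr) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return 0;
    /* mincore() requires a page-aligned start address. */
    uintptr_t base = (uintptr_t)ptr & ~((uintptr_t)page - 1);
    unsigned char vec[1];  /* one status byte per page in the range */
    /* mincore() fails with ENOMEM when the range contains unmapped pages. */
    return mincore((void *)base, (size_t)page, vec) == 0;
}
```

Calling a helper like this on every `free()` puts a kernel round trip on the hot path, which fits the slowdown reported above.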
---
## What Works Instead
### ✅ Direct Linking (Recommended)
```bash
# Static linking
gcc -o bench bench.c libhakmem.a -lpthread
# Dynamic linking with -lhakmem
gcc -o bench bench.c -L. -lhakmem -lpthread
export LD_LIBRARY_PATH=.
./bench
```
**Benefits**:
- No glibc edge cases
- Full performance (no mincore overhead)
- Predictable behavior
- No random segfaults
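
For illustration, a directly linked benchmark needs no allocator-specific code. The sketch below is a hypothetical kernel for `bench.c`, not the project's actual benchmark: it calls plain `malloc`/`free`, and it assumes `libhakmem.a` exports those standard symbols so the linker binds the calls to HAKMEM at build time.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical benchmark kernel: malloc/free `iterations` blocks of `size` bytes.
 * Returns elapsed CPU seconds, or -1.0 if an allocation fails. */
double bench_malloc_free(size_t iterations, size_t size) {
    assert(size > 0);
    clock_t start = clock();
    for (size_t i = 0; i < iterations; i++) {
        volatile unsigned char *p = malloc(size);
        if (p == NULL)
            return -1.0;
        p[0] = (unsigned char)i;  /* touch the block so the pair is not optimized away */
        free((void *)p);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A `main` would call `bench_malloc_free()` and report `iterations / seconds`. Because the allocator is bound at link time, no interposition happens during process startup.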
### ✅ What LD_PRELOAD Can Do (Limited)
Simple commands work fine:
```bash
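# Use /bin/echo: the shell builtin echo starts no new process, so the library is never loaded
LD_PRELOAD=./libhakmem.so /bin/echo "hello"        # ✅ Works
LD_PRELOAD=./libhakmem.so cat file.txt             # ✅ Works
LD_PRELOAD=./libhakmem.so grep pattern file.txt    # ✅ Works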
```
But DO NOT use it for benchmarking. Use direct linking.
---
## Historical Context
### Phase 6.15 (Oct 2025)
- Attempted LD_PRELOAD multithreaded safety fixes
- Added pthread_once() for initialization
- **Result**: Still crashed - not a HAKMEM bug, glibc limitation
### Commit c8139f3 (Phase 8.2)
- Removed mincore() for performance
- **Decision**: Sacrificed LD_PRELOAD safety for a 6.4x speedup (22.78M → 145M ops/sec)
- Implicit ban on LD_PRELOAD benchmarks
### Commit e4d61d5
- Documented LD_PRELOAD limitations
- **But did NOT explicitly ban benchmarking** ← This was the mistake
---
## AI Instructions (claude.md)
Added to claude.md:
```markdown
## LD_PRELOAD Benchmark Policy
**NEVER suggest or attempt LD_PRELOAD benchmarking**
If encountering segfaults with LD_PRELOAD:
1. Immediately stop debugging
2. Remind user: "LD_PRELOAD benchmarks are banned (see LD_PRELOAD_BENCHMARK_BAN.md)"
3. Suggest direct linking instead
```
---
## References
**WebSearch findings (2025-10-27)**:
1. hardened_malloc Issue #98
   - https://github.com/GrapheneOS/hardened_malloc/issues/98
   - sssd crashes with LD_PRELOAD
2. Stack Overflow - dlsym recursion
   - https://stackoverflow.com/questions/6083337/overriding-malloc-using-the-ld-preload-mechanism
   - Infinite loop segfaults
3. mimalloc Issue #21
   - https://github.com/microsoft/mimalloc/issues/21
   - Firefox crashes on start
4. Stack Overflow - glibc edge cases
   - https://stackoverflow.com/questions/27322295/ld-preload-causing-segmentation-fault-in-dynamic-library-loader
   - "edge cases that aren't supported properly by glibc"
---
## Summary
| Approach | Safety | Performance | Verdict |
|----------|--------|-------------|---------|
| Direct linking | ✅ Safe | ✅ Fast | ✅ **USE THIS** |
| LD_PRELOAD (with mincore) | ⚠️ Safer | ❌ 6.4x slower | ❌ Unacceptable |
| LD_PRELOAD (without mincore) | ❌ Crashes | ✅ Fast | ❌ **BANNED** |
**Final Decision**: Use direct linking for all benchmarks.
---
**Last Updated**: 2025-10-27
**Policy Expires**: Never (permanent ban)