# LD_PRELOAD Benchmark Policy - BANNED

**Date**: 2025-10-27
**Status**: 🚫 **PERMANENTLY BANNED**
**Reason**: Industry-wide glibc edge cases cause unpredictable segfaults

---

## TL;DR

**DO NOT USE LD_PRELOAD FOR BENCHMARKING**

Use direct linking instead:
```bash
# ✅ CORRECT
gcc -o bench bench.c libhakmem.a
./bench

# ❌ BANNED
LD_PRELOAD=./libhakmem.so ./bench
```

---

## Why Banned?

### 1. Previous Investigation History

We spent significant time debugging segfaults with LD_PRELOAD benchmarks:
- Multithreaded tests crashed unpredictably
- Different commands (ls, locale tools) crashed randomly
- Even after fixes, new crashes appeared

**Conclusion from previous sessions**: The problem is NOT fixable in HAKMEM - it's a glibc limitation.

### 2. Industry-Wide Issue (WebSearch Evidence 2024)

**hardened_malloc (GrapheneOS)**:
- Issue #98: sssd crashes with LD_PRELOAD
- The same problem affects production-grade allocators

**mimalloc (Microsoft)**:
- Issue #21: Firefox crashes on start with ld.so.preload
- A widely used allocator that still has LD_PRELOAD issues

**Stack Overflow consensus**:
> "Even though glibc officially supports using linker preloading for malloc replacement,
> there are edge cases that aren't supported properly by them"

**Root causes**:
1. `dlsym()` itself calls `calloc(32)`; if the preloaded `calloc` is what called `dlsym()`, this recurses infinitely → stack overflow
2. `printf()` internally uses malloc; calling it from inside the allocator recurses → segfault
3. `mbstowcs()` (locale) calls internal malloc → crashes unrelated code
4. Incomplete malloc substitution in glibc edge cases

### 3. Performance Impact (Phase 8.2)

When we tried to make LD_PRELOAD safe (commit c8139f3):
- Added `mincore()` syscall for safety → **6.4x slowdown** (145M → 22.78M ops/sec)
- Removed `mincore()` to recover performance → LD_PRELOAD broken again

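
The slowdown factor follows directly from the two throughput figures above. As a quick sanity check, here is a minimal sketch using `awk`; the variable names are illustrative and do not come from the HAKMEM scripts:

```bash
# Throughput figures quoted above, in M ops/sec.
without_mincore=145
with_mincore=22.78

# Slowdown = throughput without mincore() / throughput with mincore().
slowdown=$(awk -v fast="$without_mincore" -v slow="$with_mincore" \
    'BEGIN { printf "%.1f", fast / slow }')

echo "mincore() slowdown: ${slowdown}x"   # prints: mincore() slowdown: 6.4x
```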
**Trade-off**: LD_PRELOAD safety OR performance - cannot have both.

---

## What Works Instead

### ✅ Direct Linking (Recommended)

```bash
# Static linking
gcc -o bench bench.c libhakmem.a -lpthread

# Dynamic linking with -lhakmem
gcc -o bench bench.c -L. -lhakmem -lpthread
export LD_LIBRARY_PATH=.
./bench
```

**Benefits**:
- No glibc edge cases
- Full performance (no mincore overhead)
- Predictable behavior
- No random segfaults

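
For the dynamic variant, it can be worth confirming that the benchmark binary really resolves `libhakmem.so` before trusting its numbers. The following is a minimal sketch, assuming `ldd` is available; `links_hakmem` is a hypothetical helper name and is not part of HAKMEM:

```bash
# Hypothetical helper: reads `ldd` output on stdin and succeeds only if
# libhakmem.so is listed and was resolved (not reported as "not found").
links_hakmem() {
    awk '
        /libhakmem\.so/ { listed = 1; if ($0 ~ /not found/) unresolved = 1 }
        END { if (listed && !unresolved) exit 0; exit 1 }
    '
}

# Usage (assumes ./bench was built with -lhakmem as shown above):
#   ldd ./bench | links_hakmem && echo "bench links against libhakmem.so"
```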
### ✅ What LD_PRELOAD Can Do (Limited)

Simple commands work fine:
```bash
LD_PRELOAD=./libhakmem.so echo "hello"     # ✅ Works
LD_PRELOAD=./libhakmem.so cat file.txt     # ✅ Works
LD_PRELOAD=./libhakmem.so grep pattern     # ✅ Works
```

But DO NOT use LD_PRELOAD for benchmarking; use direct linking.

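
A benchmark runner script can enforce this rule mechanically by refusing to start when `LD_PRELOAD` is set. This is a minimal sketch; `require_no_preload` is a hypothetical helper name and is not part of the existing HAKMEM scripts:

```bash
# Hypothetical guard for benchmark scripts: fails if LD_PRELOAD is set
# to a non-empty value in the current environment.
require_no_preload() {
    if [ -n "${LD_PRELOAD:-}" ]; then
        echo "ERROR: LD_PRELOAD benchmarks are banned; use direct linking." >&2
        return 1
    fi
    return 0
}

# Usage at the top of a benchmark script:
#   require_no_preload || exit 1
#   ./bench
```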

---

## Historical Context

### Phase 6.15 (Oct 2025)
- Attempted LD_PRELOAD multithreaded safety fixes
- Added `pthread_once()` for initialization
- **Result**: Still crashed - not a HAKMEM bug, glibc limitation

### Commit c8139f3 (Phase 8.2)
- Removed `mincore()` for performance
- **Decision**: Sacrificed LD_PRELOAD safety for 10.9x speedup
- Implicit ban on LD_PRELOAD benchmarks

### Commit e4d61d5
- Documented LD_PRELOAD limitations
- **But did NOT explicitly ban benchmarking** ← This was the mistake

---

## AI Instructions (claude.md)

Added to claude.md:

```markdown
## LD_PRELOAD Benchmark Policy

**NEVER suggest or attempt LD_PRELOAD benchmarking**

If encountering segfaults with LD_PRELOAD:
1. Immediately stop debugging
2. Remind user: "LD_PRELOAD benchmarks are banned (see LD_PRELOAD_BENCHMARK_BAN.md)"
3. Suggest direct linking instead
```

---

## References

**WebSearch findings (2024-10-27)**:

1. hardened_malloc Issue #98
   - https://github.com/GrapheneOS/hardened_malloc/issues/98
   - sssd crashes with LD_PRELOAD

2. Stack Overflow - dlsym recursion
   - https://stackoverflow.com/questions/6083337/overriding-malloc-using-the-ld-preload-mechanism
   - Infinite loop segfaults

3. mimalloc Issue #21
   - https://github.com/microsoft/mimalloc/issues/21
   - Firefox crashes on start

4. Stack Overflow - glibc edge cases
   - https://stackoverflow.com/questions/27322295/ld-preload-causing-segmentation-fault-in-dynamic-library-loader
   - "edge cases that aren't supported properly by glibc"

---

## Summary

| Approach | Safety | Performance | Verdict |
|----------|--------|-------------|---------|
| Direct linking | ✅ Safe | ✅ Fast | ✅ **USE THIS** |
| LD_PRELOAD (with mincore) | ⚠️ Safer | ❌ 6.4x slower | ❌ Unacceptable |
| LD_PRELOAD (without mincore) | ❌ Crashes | ✅ Fast | ❌ **BANNED** |

**Final Decision**: Use direct linking for all benchmarks.

---

**Last Updated**: 2025-10-27
**Policy Expires**: Never (permanent ban)