Phase 8-TLS-Fix: BenchFast crash root cause fixes

Two critical bugs fixed:

1. TLS→Atomic guard (cross-thread safety):
   - Changed `__thread int bench_fast_init_in_progress` to `atomic_int`
   - Root cause: pthread_once() creates threads with fresh TLS (= 0)
   - Guard must protect entire process, not just calling thread
   - Box Contract: Observable state across all threads

2. Direct header write (P3 optimization bypass):
   - bench_fast_alloc() now writes header directly: 0xa0 | class_idx
   - Root cause: P3 optimization skips header writes by default
   - BenchFast REQUIRES headers for free routing (0xa0-0xa7 magic)
   - Box Contract: BenchFast always writes headers

Result:
- Normal mode: 16.3M ops/s (working)
- BenchFast mode: No crash (pool exhaustion expected with 128 blocks/class)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
Moe Charm (CI)
2025-11-30 05:12:32 +09:00
parent 191e659837
commit da8f4d2c86
3 changed files with 21 additions and 11 deletions

View File

@ -57,7 +57,8 @@ void* malloc(size_t size) {
// Phase 20-2: BenchFast mode (structural ceiling measurement)
// WARNING: Bypasses ALL safety checks - benchmark only!
// IMPORTANT: Do NOT use BenchFast during preallocation/init to avoid recursion.
if (__builtin_expect(!bench_fast_init_in_progress && bench_fast_enabled(), 0)) {
// Phase 8-TLS-Fix: Use atomic_load for cross-thread safety
if (__builtin_expect(!atomic_load(&g_bench_fast_init_in_progress) && bench_fast_enabled(), 0)) {
if (size <= 1024) { // Tiny range
return bench_fast_alloc(size);
}