Fix C7 TLS SLL header restoration regression + Document Larson MT race condition

## Bug Fix: Restore C7 Exception in TLS SLL Push

**File**: `core/box/tls_sll_box.h:309`

**Problem**: Commit 25d963a4a (Code Cleanup) accidentally reverted the C7 fix by changing:
```c
if (class_idx != 0 && class_idx != 7) {  // CORRECT (commit 8b67718bf)
if (class_idx != 0) {                     // BROKEN (commit 25d963a4a)
```

**Impact**: C7 (1024B class) header restoration in TLS SLL push overwrote next pointer at base[0], causing corruption.

**Fix**: Restored `&& class_idx != 7` check to prevent header restoration for C7.

**Why C7 Needs Exception**:
- C7 uses offset=0 (stores next pointer at base[0])
- User pointer is at base+1
- Next pointer MUST NOT be overwritten by header restoration
- C1-C6 use offset=1 (next at base[1]), so base[0] header restoration is safe

## Investigation: Larson MT Race Condition (SEPARATE ISSUE)

**Finding**: Larson still crashes with 3+ threads due to UNRELATED multi-threading race condition in unified cache freelist management.

**Root Cause**: Non-atomic freelist operations in `TinySlabMeta`:
```c
typedef struct TinySlabMeta {
    void* freelist;    //  NOT ATOMIC
    uint16_t used;     //  NOT ATOMIC
} TinySlabMeta;
```

**Evidence**:
```
1 thread:   PASS (1.88M - 41.8M ops/s)
2 threads:  PASS (24.6M ops/s)
3 threads:  SEGV (race condition)
4+ threads:  SEGV (race condition)
```

**Status**: C7 fix is CORRECT. Larson crash is separate MT issue requiring atomic freelist implementation.

## Documentation Added

Created comprehensive investigation reports:
- `LARSON_CRASH_ROOT_CAUSE_REPORT.md` - Full technical analysis
- `LARSON_DIAGNOSTIC_PATCH.md` - Implementation guide
- `LARSON_INVESTIGATION_SUMMARY.md` - Executive summary
- `LARSON_QUICK_REF.md` - Quick reference
- `verify_race_condition.sh` - Automated verification script

## Next Steps

Implement atomic freelist operations for full MT safety (7-9 hour effort):
1. Make `TinySlabMeta.freelist` atomic with CAS loop
2. Audit 87 freelist access sites
3. Test with Larson 8+ threads

🔧 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
Moe Charm (CI)
2025-11-22 02:15:34 +09:00
parent 3ad1e4c3fe
commit d8168a2021
13 changed files with 1391 additions and 9 deletions

View File

@ -13,7 +13,8 @@ core/box/front_gate_classifier.o: core/box/front_gate_classifier.c \
core/box/../hakmem_build_flags.h core/box/../hakmem_internal.h \
core/box/../hakmem.h core/box/../hakmem_config.h \
core/box/../hakmem_features.h core/box/../hakmem_sys.h \
core/box/../hakmem_whale.h core/box/../hakmem_tiny_config.h
core/box/../hakmem_whale.h core/box/../hakmem_tiny_config.h \
core/box/../pool_tls_registry.h
core/box/front_gate_classifier.h:
core/box/../tiny_region_id.h:
core/box/../hakmem_build_flags.h:
@ -39,3 +40,4 @@ core/box/../hakmem_features.h:
core/box/../hakmem_sys.h:
core/box/../hakmem_whale.h:
core/box/../hakmem_tiny_config.h:
core/box/../pool_tls_registry.h:

View File

@ -302,10 +302,11 @@ static inline bool tls_sll_push(int class_idx, void* ptr, uint32_t capacity)
}
#if HAKMEM_TINY_HEADER_CLASSIDX
// Header handling for header classes (class != 0,7).
// Header handling for header classes (class 1-6 only, NOT 0 or 7).
// C0, C7 use offset=0, so next pointer is at base[0] and MUST NOT restore header.
// Safe mode (HAKMEM_TINY_SLL_SAFEHEADER=1): never overwrite header; reject on magic mismatch.
// Default mode: restore expected header.
if (class_idx != 0) {
if (class_idx != 0 && class_idx != 7) {
static int g_sll_safehdr = -1;
static int g_sll_ring_en = -1; // optional ring trace for TLS-SLL anomalies
if (__builtin_expect(g_sll_safehdr == -1, 0)) {