Larson double-free investigation: Enhanced diagnostics + Remove buggy drain pushback

**Problem**: The Larson benchmark aborts with TLS_SLL_DUP (double-free detected) on every run of the debug build (100% crash rate)

**Root Cause**: TLS drain pushback code (commit c2f104618) created duplicates by
pushing pointers back to TLS SLL while they were still in the linked list chain.
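
To illustrate why pushing a node that is still linked corrupts a singly linked list, here is a minimal sketch. The `node_t`, `sll_t`, and `sll_*` names are hypothetical stand-ins, not the hakmem TLS SLL API, and the push is deliberately unguarded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal intrusive singly linked list (not the hakmem TLS SLL). */
typedef struct node { struct node* next; } node_t;
typedef struct { node_t* head; unsigned count; } sll_t;

/* Unguarded push: the node stores the old head and becomes the new head. */
static void sll_push(sll_t* l, node_t* n) {
    assert(l && n);
    n->next = l->head;
    l->head = n;
    l->count++;
}

/* Walks at most `limit` links and counts how often `n` is visited. */
static unsigned sll_occurrences(const sll_t* l, const node_t* n, unsigned limit) {
    unsigned hits = 0;
    const node_t* cur = l->head;
    for (unsigned i = 0; cur && i < limit; i++, cur = cur->next) {
        if (cur == n) hits++;
    }
    return hits;
}

/* Pushes a, b, then a again; returns how often a appears within `count` links. */
static unsigned demo_double_push(void) {
    static node_t a, b;
    sll_t l = { NULL, 0 };
    sll_push(&l, &a);   /* list: a */
    sll_push(&l, &b);   /* list: b -> a */
    sll_push(&l, &a);   /* a.next = b, head = a: a -> b -> a -> ... (cycle) */
    return sll_occurrences(&l, &a, l.count);
}
```

In this sketch the second push of `a` overwrites `a.next`, so the list becomes cyclic and `a` is reachable twice within `count` links. Two later pops would then return the same pointer.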

**Diagnostic Enhancements** (ChatGPT + Claude collaboration):
1. **Callsite Tracking**: Track file:line for each TLS SLL push (debug only)
   - Arrays: g_tls_sll_push_file[], g_tls_sll_push_line[]
   - Macro: tls_sll_push() auto-records __FILE__, __LINE__

2. **Enhanced Duplicate Detection**:
   - Scan depth: 64 → 256 nodes (deep duplicate detection)
   - Error message shows both the current push location and the last recorded push location for that class
   - Calls ptr_trace_dump_now() for detailed analysis

3. **Evidence Captured**:
   - Both duplicate pushes from same line (221)
   - Pointer at position 11 in TLS SLL (count=18, scanned=11)
   - Confirms the pointer was handed out by an allocation without being popped from the TLS SLL
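
The callsite tracking and bounded duplicate scan described above can be sketched in isolation as follows. All `demo_*` names and `DEMO_NUM_CLASSES` are assumptions made for illustration. Unlike the real guard, this sketch returns `false` on a duplicate instead of printing both callsites and calling `abort()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_CLASSES 8      /* hypothetical class count */
#define DEMO_SCAN_LIMIT  256u   /* scan depth named in the commit message */

typedef struct demo_node { struct demo_node* next; } demo_node_t;
typedef struct { demo_node_t* head; uint32_t count; } demo_sll_t;

static demo_sll_t  demo_sll[DEMO_NUM_CLASSES];
static const char* demo_push_file[DEMO_NUM_CLASSES]; /* last successful push, per class */
static int         demo_push_line[DEMO_NUM_CLASSES];

/* Returns the 0-based position of ptr within the first min(count, 256) nodes, or -1. */
static int demo_find_dup(int cls, const void* ptr) {
    uint32_t limit = demo_sll[cls].count < DEMO_SCAN_LIMIT ? demo_sll[cls].count
                                                           : DEMO_SCAN_LIMIT;
    const demo_node_t* cur = demo_sll[cls].head;
    for (uint32_t scanned = 0; cur && scanned < limit; scanned++, cur = cur->next) {
        if ((const void*)cur == ptr) return (int)scanned;
    }
    return -1;
}

/* Guarded push: rejects duplicates, records the callsite only on success. */
static bool demo_push_at(int cls, void* ptr, const char* file, int line) {
    assert(cls >= 0 && cls < DEMO_NUM_CLASSES && ptr);
    if (demo_find_dup(cls, ptr) >= 0) return false;
    demo_node_t* n = ptr;
    n->next = demo_sll[cls].head;
    demo_sll[cls].head = n;
    demo_sll[cls].count++;
    demo_push_file[cls] = file;
    demo_push_line[cls] = line;
    return true;
}

/* Call sites stay unchanged; the macro supplies __FILE__ and __LINE__. */
#define demo_push(cls, ptr) demo_push_at((cls), (ptr), __FILE__, __LINE__)
```

Note that the record is kept per class, so the "previous push" it reports is the most recent successful push for that class, which is not necessarily the push that first inserted the duplicated pointer.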

**Fix**:
- **core/box/tls_sll_drain_box.h**: Remove pushback code entirely
  - Old: Push back to TLS SLL on validation failure → duplicates!
  - New: Skip pointer (accept rare leak) to avoid duplicates
  - Rationale: SuperSlab lookup failures are transient/rare
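
A rough sketch of the skip-instead-of-pushback policy, under stated assumptions: the `drain_*` and `demo_drain` names are hypothetical, `drain_is_valid` stands in for the SuperSlab lookup, and the actual drain loop in `tls_sll_drain_box.h` is not shown in this commit view.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct drain_node { struct drain_node* next; } drain_node_t;
typedef struct { uint32_t freed; uint32_t skipped; } drain_stats_t;

/* Hypothetical stand-in for the SuperSlab lookup: rejects one chosen pointer. */
static const void* drain_rejected;
static bool drain_is_valid(const void* p) { return p != drain_rejected; }

/* Drains the whole list. A node that fails validation is dropped (leaked)
 * rather than pushed back, so it can never be linked into the list twice. */
static drain_stats_t demo_drain(drain_node_t** head, uint32_t* count,
                                bool (*valid)(const void*)) {
    assert(head && count && valid);
    drain_stats_t st = { 0, 0 };
    while (*head) {
        drain_node_t* n = *head;
        *head = n->next;        /* pop: unlink before deciding what to do */
        n->next = NULL;
        if (*count > 0) (*count)--;
        if (valid(n)) {
            st.freed++;         /* real code would return the block to its slab */
        } else {
            st.skipped++;       /* accept a rare leak; do not push back */
        }
    }
    return st;
}
```

The trade-off is the one stated in the rationale: a transient lookup failure costs one leaked block, whereas a pushback risks a duplicate entry and a later double allocation.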

**Status**: Fix implemented, ready for testing

**Updated**:
- LARSON_DOUBLE_FREE_INVESTIGATION.md: Root cause confirmed
Moe Charm (CI)
2025-11-27 07:30:32 +09:00
parent c2f104618f
commit 8553894171
3 changed files with 83 additions and 45 deletions


@@ -52,6 +52,12 @@ extern __thread uint64_t g_tls_canary_after_sll;
 extern __thread const char* g_tls_sll_last_writer[TINY_NUM_CLASSES];
 extern int g_tls_sll_class_mask; // bit i=1 → SLL allowed for class i
+#if !HAKMEM_BUILD_RELEASE
+// Global callsite record (debug only; zero overhead in release)
+static const char* g_tls_sll_push_file[TINY_NUM_CLASSES] = {0};
+static int g_tls_sll_push_line[TINY_NUM_CLASSES] = {0};
+#endif
 // ========== Debug guard ==========
 #if !HAKMEM_BUILD_RELEASE
@@ -669,12 +675,51 @@ static inline uint32_t tls_sll_splice(int class_idx,
 // ========== Macro Wrappers ==========
 //
-// Box Theory: Callers use tls_sll_push/pop() macros which auto-insert __func__.
-// No changes required to 20+ call sites.
+// Box Theory: Callers use tls_sll_push/pop() macros which auto-insert callsite info (debug only).
+// No changes required to call sites.
 #if !HAKMEM_BUILD_RELEASE
+static inline bool tls_sll_push_guarded(int class_idx, void* ptr, uint32_t capacity,
+                                        const char* where, const char* file, int line) {
+    // Enhanced duplicate guard (scan up to 256 nodes for deep duplicates)
+    uint32_t scanned = 0;
+    void* cur = g_tls_sll[class_idx].head;
+    const uint32_t limit = (g_tls_sll[class_idx].count < 256) ? g_tls_sll[class_idx].count : 256;
+    while (cur && scanned < limit) {
+        if (cur == ptr) {
+            // Enhanced error message with both old and new callsite info
+            const char* last_file = g_tls_sll_push_file[class_idx] ? g_tls_sll_push_file[class_idx] : "(null)";
+            fprintf(stderr,
+                    "[TLS_SLL_DUP] cls=%d ptr=%p head=%p count=%u scanned=%u\n"
+                    " Current push: where=%s at %s:%d\n"
+                    " Previous push: %s:%d\n",
+                    class_idx, ptr, g_tls_sll[class_idx].head, g_tls_sll[class_idx].count, scanned,
+                    where, file, line,
+                    last_file, g_tls_sll_push_line[class_idx]);
+            // Dump pointer trace for detailed analysis
+            ptr_trace_dump_now("tls_sll_dup");
+            abort();
+        }
+        void* next = NULL;
+        PTR_NEXT_READ("tls_sll_dupcheck", class_idx, cur, 0, next);
+        cur = next;
+        scanned++;
+    }
+    // Call impl (duplicate check in impl will be skipped since we already checked above and would abort)
+    // Note: impl has its own duplicate check, but we'll never reach it because we abort above
+    bool ok = tls_sll_push_impl(class_idx, ptr, capacity, where);
+    if (ok) {
+        g_tls_sll_push_file[class_idx] = file;
+        g_tls_sll_push_line[class_idx] = line;
+    }
+    return ok;
+}
 # define tls_sll_push(cls, ptr, cap) \
-    tls_sll_push_impl((cls), (ptr), (cap), __func__)
+    tls_sll_push_guarded((cls), (ptr), (cap), __func__, __FILE__, __LINE__)
 # define tls_sll_pop(cls, out) \
     tls_sll_pop_impl((cls), (out), __func__)
 #else