hakmem/docs/analysis/C7_TLS_SLL_CORRUPTION_ANALYSIS.md
Moe Charm (CI) 67fb15f35f Wrap debug fprintf in !HAKMEM_BUILD_RELEASE guards (Release build optimization)
2025-11-26 13:14:18 +09:00


C7 (1024B) TLS SLL Corruption Root Cause Analysis

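Symptoms

Still occurring after the fix:

  • TLS SLL corruption persists in Class 7 (1024B)
  • tiny_nextptr.h line 45 has already been changed to return 1u (C7 now also uses offset=1)
  • The corruption moved from Class 6 to Class 7 (the fix had an effect but did not address the root cause)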

Observations:

[TLS_SLL_POP_INVALID] cls=7 head=0x5d dropped count=1
[TLS_SLL_POP_INVALID] cls=7 last_push=0x7815fa801003  ← odd address!
[TLS_SLL_POP_INVALID] cls=7 head=0xfd dropped count=2
[TLS_SLL_POP_INVALID] cls=7 last_push=0x7815f99a0801  ← odd address!

  1. head holds invalid small values (0x5d, 0xfd, etc.)
  2. last_push addresses are odd (0x...03, 0x...01, etc.)
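
Both observations can be turned into a cheap sanity check. The following is a minimal sketch, not hakmem code: the name sketch_ptr_plausible, the 8-byte alignment requirement, and the 4096-byte lower bound are all assumptions chosen for illustration, and the cast assumes a 64-bit target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when p could plausibly be a block base.
 * Assumptions (not taken from hakmem): bases are 8-byte aligned and never
 * fall inside the first 4096 bytes of the address space. */
static bool sketch_ptr_plausible(uintptr_t p) {
    if (p < 4096u) {
        return false;            /* tiny values such as 0x5d or 0xfd */
    }
    return (p & 7u) == 0;        /* odd or otherwise misaligned addresses fail */
}

/* Usage sketch: every value from the log above is rejected. */
static void sketch_ptr_plausible_demo(void) {
    assert(!sketch_ptr_plausible((uintptr_t)0x5d));
    assert(!sketch_ptr_plausible((uintptr_t)0xfd));
    assert(!sketch_ptr_plausible((uintptr_t)0x7815fa801003ULL));
    assert(!sketch_ptr_plausible((uintptr_t)0x7815f99a0801ULL));
}
```

A check of this kind only flags the symptom; it does not say which write produced the bad value.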

Architecture Review

Allocation Path (normal)

tiny_alloc_fast.inc.h:

  • tiny_alloc_fast_pop() returns base (SuperSlab block start)
  • HAK_RET_ALLOC(7, base):
    *(uint8_t*)(base) = 0xa7;      // Write header at base[0]
    return (void*)((uint8_t*)(base) + 1);  // Return user = base + 1
    
  • User receives: ptr = base + 1

Free Path (the problem may be here)

tiny_free_fast_v2.inc.h (line 106-144):

int class_idx = tiny_region_id_read_header(ptr);  // Read from ptr-1 = base ✓
void* base = (char*)ptr - 1;  // base = user - 1 ✓
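
The header round trip between allocation and free can be sketched in isolation. This is an illustrative model only: the sketch_ names are hypothetical, and the encoding header = 0xa0 | class_idx is an assumption inferred solely from the 0xa7 value shown for class 7.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: header byte = 0xa0 | class_idx (inferred from 0xa7 for C7);
 * the real encoding may differ. */
static uint8_t sketch_header_for(int class_idx) {
    return (uint8_t)(0xa0u | (unsigned)class_idx);
}

/* Allocation side: stamp the header at base[0], hand out base + 1. */
static void *sketch_alloc_return(void *base, int class_idx) {
    *(uint8_t *)base = sketch_header_for(class_idx);
    return (uint8_t *)base + 1;
}

/* Free side: the class index is read from the byte just before the user pointer. */
static int sketch_read_class(const void *user) {
    return ((const uint8_t *)user)[-1] & 0x0f;
}

/* Free side: recover the base that should be pushed onto the TLS SLL. */
static void *sketch_user_to_base(void *user) {
    return (uint8_t *)user - 1;
}

/* Usage sketch: a class 7 block survives the round trip. */
static void sketch_header_demo(void) {
    uint8_t block[32] = {0};
    void *user = sketch_alloc_return(block, 7);
    assert(block[0] == 0xa7);
    assert(user == (void *)(block + 1));
    assert(sketch_read_class(user) == 7);
    assert(sketch_user_to_base(user) == (void *)block);
}
```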

tls_sll_box.h (line 117, 235-238):

static inline bool tls_sll_push(int class_idx, void* ptr, uint32_t capacity) {
    // ptr parameter = base (from caller)
    ...
    PTR_NEXT_WRITE("tls_push", class_idx, ptr, 0, g_tls_sll[class_idx].head);
    g_tls_sll[class_idx].head = ptr;
    ...
    s_tls_sll_last_push[class_idx] = ptr;  // ← Should store base
}

tiny_next_ptr_box.h (line 39):

static inline void tiny_next_write(int class_idx, void *base, void *next_value) {
    tiny_next_store(base, class_idx, next_value);
}

tiny_nextptr.h (line 44-45, 69-80):

static inline size_t tiny_next_off(int class_idx) {
    return (class_idx == 0) ? 0u : 1u;  // C7 → offset = 1 ✓
}

static inline void tiny_next_store(void* base, int class_idx, void* next) {
    size_t off = tiny_next_off(class_idx);  // C7 → off = 1

    if (off == 0) {
        *(void**)base = next;
        return;
    }

    // off == 1: C7 takes this path
    uint8_t* p = (uint8_t*)base + off;  // p = base + 1 = user pointer!
    memcpy(p, &next, sizeof(void*));    // Write next at user pointer
}
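
To make the store concrete, here is a minimal standalone sketch of the offset-based write together with a matching read. The store mirrors the code quoted above; the load function is a hypothetical counterpart, since the analysis does not show the real one, and all sketch_ names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the offset rule quoted above: class 0 uses 0, all others use 1. */
static size_t sketch_next_off(int class_idx) {
    return (class_idx == 0) ? 0u : 1u;
}

static void sketch_next_store(void *base, int class_idx, void *next) {
    /* memcpy keeps the write well defined even though base + 1 is unaligned. */
    memcpy((uint8_t *)base + sketch_next_off(class_idx), &next, sizeof next);
}

/* Hypothetical counterpart of the store; reads from the same offset. */
static void *sketch_next_load(const void *base, int class_idx) {
    void *next;
    memcpy(&next, (const uint8_t *)base + sketch_next_off(class_idx), sizeof next);
    return next;
}

/* Usage sketch: with offset 1 the link round-trips and base[0] is untouched. */
static void sketch_next_demo(void) {
    uint8_t a[32] = {0}, b[32] = {0};
    a[0] = 0xa7;                          /* pretend header byte */
    sketch_next_store(a, 7, b);
    assert(sketch_next_load(a, 7) == (void *)b);
    assert(a[0] == 0xa7);
}
```

The important property is that the store and the load agree on the offset for a given class; any path that reads the link at a different offset would see a value that is not a valid block address.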

Expected Behavior (C7 on the freelist)

Memory layout (C7 on the freelist):

Address:     base      base+1        base+9          base+2048
            ┌────┬──────────────┬───────────────┬──────────┐
Content:    │ ?? │ next (8B)    │  (unused)     │          │
            └────┴──────────────┴───────────────┴──────────┘
            header  ← next is stored here (offset=1)
  • base: header position (may be clobbered while on the freelist, same as C0)
  • base + 1: stores the next pointer (uses the first 8 bytes of user data)

Hypotheses

Hypothesis 1: header restoration logic

tls_sll_box.h line 176:

if (class_idx != 0 && class_idx != 7) {
    // C7 does not enter this branch → no header restoration
    ...
}

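Like C0, C7 is designed to "clobber the header while on the freelist", but in tiny_nextptr.h:

  • C0: offset = 0 → next is written starting at base[0] (header clobbered)
  • C7: offset = 1 → next is written starting at base[1] (header preserved). Contradiction!

This is the root cause: C7 is designed on the assumption that the header is clobbered (offset=0), but it currently preserves the header (offset=1).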
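
The hypothesis can be restated as an invariant: a class should keep its header (offset 1) exactly when the push path restores that header. The sketch below checks this pairing for every class. It is an illustration of the argument, not hakmem code; the class count of 8 and all sketch_ names are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SKETCH_NUM_CLASSES = 8 };   /* assumption: classes C0..C7 */

/* Condition quoted from tls_sll_box.h line 176. */
static bool sketch_push_restores_header(int class_idx) {
    return class_idx != 0 && class_idx != 7;
}

/* Offset rule currently in tiny_nextptr.h. */
static size_t sketch_off_current(int class_idx) {
    return (class_idx == 0) ? 0u : 1u;
}

/* Counts classes whose offset disagrees with the restoration policy:
 * a class is consistent when (offset == 1) matches "header is restored". */
static int sketch_count_mismatches(size_t (*off)(int)) {
    int mismatches = 0;
    for (int c = 0; c < SKETCH_NUM_CLASSES; c++) {
        bool keeps_header = (off(c) == 1u);
        if (keeps_header != sketch_push_restores_header(c)) {
            mismatches++;
        }
    }
    return mismatches;
}

/* Usage sketch: under the current rule exactly one class (C7) is inconsistent. */
static void sketch_mismatch_demo(void) {
    assert(sketch_count_mismatches(sketch_off_current) == 1);
}
```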

Proposed Fixes

Option A: Revert C7 to offset=0 (follow the original design)

Modify tiny_nextptr.h line 44-45:

static inline size_t tiny_next_off(int class_idx) {
    // Class 0, 7: offset 0 (header is clobbered while on the freelist)
    // Class 1-6: offset 1 (header is preserved)
    return (class_idx == 0 || class_idx == 7) ? 0u : 1u;
}

Rationale:

  • C7 (2048B total) = [1B header] + [2047B payload]
  • The next pointer (8B) is written starting at the header position → the 2047B payload is kept
  • Header restoration is performed at allocation time (HAK_RET_ALLOC)
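
A minimal sketch of how Option A behaves end to end: the link overwrites the header while the block is free, and allocation re-stamps the header before returning the user pointer. The single global head, the sketch_ names, and the header encoding 0xa0 | class_idx are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Option A offset rule: classes 0 and 7 overwrite the header while free. */
static size_t sketch_off_option_a(int class_idx) {
    return (class_idx == 0 || class_idx == 7) ? 0u : 1u;
}

/* Hypothetical stand-in for a single TLS SLL head. */
static void *sketch_head_a = NULL;

static void sketch_push_a(void *base, int class_idx) {
    memcpy((uint8_t *)base + sketch_off_option_a(class_idx),
           &sketch_head_a, sizeof sketch_head_a);
    sketch_head_a = base;
}

static void *sketch_pop_a(int class_idx) {
    void *base = sketch_head_a;
    if (base != NULL) {
        memcpy(&sketch_head_a,
               (uint8_t *)base + sketch_off_option_a(class_idx),
               sizeof sketch_head_a);
    }
    return base;
}

/* Allocation rewrites the header, so clobbering it on the free list is harmless.
 * Assumption: header = 0xa0 | class_idx, inferred from the 0xa7 value for C7. */
static void *sketch_alloc_a(int class_idx) {
    uint8_t *base = sketch_pop_a(class_idx);
    if (base == NULL) {
        return NULL;
    }
    base[0] = (uint8_t)(0xa0u | (unsigned)class_idx);
    return base + 1;
}

/* Usage sketch: two blocks are freed, then handed out again in LIFO order. */
static void sketch_option_a_demo(void) {
    uint8_t b1[32] = {0}, b2[32] = {0};
    sketch_head_a = NULL;
    sketch_push_a(b1, 7);
    sketch_push_a(b2, 7);
    uint8_t *u = sketch_alloc_a(7);
    assert(u == b2 + 1 && b2[0] == 0xa7);
    u = sketch_alloc_a(7);
    assert(u == b1 + 1 && b1[0] == 0xa7);
    assert(sketch_alloc_a(7) == NULL);
}
```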

Option B: C7 also preserves the header (keep the current offset=1) and add restoration

Modify tls_sll_box.h line 176:

if (class_idx != 0) {  // include C7 as well
    // All header classes (C1-C7) restore header during push
    ...
}

Rationale:

  • Uniformity: all header classes (C1-C7) preserve the header
  • Payload: 2047B → 2039B (8B next pointer)
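
For comparison, Option B could look roughly like the sketch below: the push path re-stamps the header for every class except C0 and then stores the link at base + 1. The elided body of the real branch is not reproduced here; the restoration statement, the sketch_ names, and the header encoding 0xa0 | class_idx are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a single TLS SLL head. */
static void *sketch_head_b = NULL;

/* Option B push: every header class (C1-C7) re-stamps base[0], then stores
 * next at base + 1 so the header stays valid while the block is free.
 * Assumption: header = 0xa0 | class_idx, inferred from 0xa7 for C7. */
static void sketch_push_b(void *base, int class_idx) {
    uint8_t *p = base;
    size_t off = (class_idx == 0) ? 0u : 1u;
    if (class_idx != 0) {
        p[0] = (uint8_t)(0xa0u | (unsigned)class_idx);
    }
    memcpy(p + off, &sketch_head_b, sizeof sketch_head_b);
    sketch_head_b = base;
}

/* Usage sketch: a block with a damaged header is repaired by the push. */
static void sketch_option_b_demo(void) {
    uint8_t blk[32];
    memset(blk, 0xee, sizeof blk);       /* stale contents, header damaged */
    sketch_head_b = NULL;
    sketch_push_b(blk, 7);
    assert(blk[0] == 0xa7);              /* header valid while on the list */
    assert(sketch_head_b == (void *)blk);
}
```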

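Recommendation: Option A

Justification:

  1. Design Consistency: C0 and C7 share the same design philosophy of sacrificing the header to maximize payload
  2. Memory Efficiency: keeps the 2047B payload (saves 8B)
  3. Performance: no header restoration needed (one fewer instruction)
  4. Code Simplicity: reuses the existing C0 logic

Implementation Steps

  1. Modify core/tiny_nextptr.h line 44-45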
  2. Build & test with C7 (1024B) allocations
  3. Verify no TLS_SLL_POP_INVALID errors
  4. Verify last_push addresses are even (base pointers)
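
Steps 3 and 4 can be rehearsed on a standalone model before running the real allocator. The sketch below is not a hakmem test: it builds a toy free list over a static arena using the Option A layout (link at offset 0) and counts pops that are not valid block bases. The arena sizes and all sketch_ names are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SKETCH_BLOCKS = 16, SKETCH_STRIDE = 64 };   /* illustrative sizes */

static uint8_t sketch_arena[SKETCH_BLOCKS * SKETCH_STRIDE];

/* True when p is the start of a block inside the arena. */
static int sketch_is_base(const void *p) {
    uintptr_t lo = (uintptr_t)sketch_arena;
    uintptr_t a = (uintptr_t)p;
    return a >= lo && a < lo + sizeof sketch_arena &&
           (a - lo) % SKETCH_STRIDE == 0;
}

/* Runs `rounds` push-all/pop-all cycles; returns the number of anomalies
 * (invalid head values plus rounds that popped fewer blocks than pushed). */
static int sketch_stress(int rounds) {
    int anomalies = 0;
    for (int r = 0; r < rounds; r++) {
        void *head = NULL;
        for (int i = 0; i < SKETCH_BLOCKS; i++) {
            void *base = sketch_arena + (size_t)i * SKETCH_STRIDE;
            memcpy(base, &head, sizeof head);   /* Option A: next at offset 0 */
            head = base;
        }
        int popped = 0;
        while (head != NULL) {
            if (!sketch_is_base(head)) {
                anomalies++;                    /* analogous to TLS_SLL_POP_INVALID */
                break;
            }
            void *base = head;
            memcpy(&head, base, sizeof head);
            popped++;
        }
        if (popped != SKETCH_BLOCKS) {
            anomalies++;
        }
    }
    return anomalies;
}

/* Usage sketch: a consistent layout produces no anomalies. */
static void sketch_stress_demo(void) {
    assert(sketch_stress(1000) == 0);
}
```

The model only confirms that a consistent offset yields clean pops; the actual verification remains the allocator run described in the steps above.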

Expected Results

After the fix:

# 100K iterations, no errors
Throughput = 25-30M ops/s (current: 1.5M ops/s with corruption)