Moe Charm (CI) 1da8754d45 CRITICAL FIX: Fully resolve the 4T SEGV caused by uninitialized TLS

**Problem:**
- Larson 4T hits SEGV 100% of the time (1T completes at 2.09M ops/s)
- System/mimalloc run normally at 4T with 33.52M ops/s
- 4T still hits SEGV even with SS OFF + Remote OFF

**Root cause: (findings from the Task agent ultrathink investigation)**
```
CRASH: mov (%r15),%r13
R15 = 0x6261  ← ASCII "ba" (garbage value, uninitialized TLS)
```

TLS variables in worker threads are uninitialized:
- `__thread void* g_tls_sll_head[TINY_NUM_CLASSES];`  ← no initializer
- Not zero-initialized in threads created by pthread_create()
- The NULL check passes (0x6261 != NULL) → dereference → SEGV
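
The failure mode described above can be illustrated with a minimal sketch of a pop from a thread-local singly linked free list. The only guard is a NULL check, so any non-NULL head value reaches the load of the next pointer, which matches the faulting `mov (%r15),%r13`. The helpers `sll_push` and `sll_pop` and the value of `TINY_NUM_CLASSES` are assumptions for illustration, not the actual HAKMEM code.

```c
#include <stddef.h>

/* Assumed value for illustration; the real constant lives in HAKMEM. */
#define TINY_NUM_CLASSES 8

/* Per-thread free-list heads, written with the explicit initializer. */
static __thread void* g_tls_sll_head[TINY_NUM_CLASSES] = {0};

/* Hypothetical helper: push a free block onto the per-class list.
 * The first word of the block stores the next pointer. */
static void sll_push(int cls, void* block) {
    *(void**)block = g_tls_sll_head[cls];
    g_tls_sll_head[cls] = block;
}

/* Hypothetical helper: pop a block, or return NULL if the list is empty. */
static void* sll_pop(int cls) {
    void* head = g_tls_sll_head[cls];
    if (head == NULL) {
        return NULL;                      /* the only guard on this path */
    }
    /* If head held a non-NULL garbage value such as 0x6261, this load
     * would typically fault: the check above cannot detect it. */
    g_tls_sll_head[cls] = *(void**)head;
    return head;
}
```

With a zeroed head array, the first `sll_pop` on a fresh thread returns NULL and the caller can fall back to a slower refill path instead of dereferencing the head.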

**Fix:**
Add an explicit `= {0}` initializer to every TLS array:

1. **core/hakmem_tiny.c:**
   - `g_tls_sll_head[TINY_NUM_CLASSES] = {0}`
   - `g_tls_sll_count[TINY_NUM_CLASSES] = {0}`
   - `g_tls_live_ss[TINY_NUM_CLASSES] = {0}`
   - `g_tls_bcur[TINY_NUM_CLASSES] = {0}`
   - `g_tls_bend[TINY_NUM_CLASSES] = {0}`

2. **core/tiny_fastcache.c:**
   - `g_tiny_fast_cache[TINY_FAST_CLASS_COUNT] = {0}`
   - `g_tiny_fast_count[TINY_FAST_CLASS_COUNT] = {0}`
   - `g_tiny_fast_free_head[TINY_FAST_CLASS_COUNT] = {0}`
   - `g_tiny_fast_free_count[TINY_FAST_CLASS_COUNT] = {0}`

3. **core/hakmem_tiny_magazine.c:**
   - `g_tls_mags[TINY_NUM_CLASSES] = {0}`

4. **core/tiny_sticky.c:**
   - `g_tls_sticky_ss[TINY_NUM_CLASSES][TINY_STICKY_RING] = {0}`
   - `g_tls_sticky_idx[TINY_NUM_CLASSES][TINY_STICKY_RING] = {0}`
   - `g_tls_sticky_pos[TINY_NUM_CLASSES] = {0}`
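
A minimal sketch of the pattern applied above, together with a check that each newly created thread sees zeroed slots on entry. The value of `TINY_NUM_CLASSES`, the `uint32_t` element type of `g_tls_sll_count`, and the helpers `worker_check` and `count_dirty_tls_slots` are assumptions made for illustration, not taken from the HAKMEM sources.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration; the real constant lives in HAKMEM. */
#define TINY_NUM_CLASSES 8
#define MAX_CHECK_THREADS 16

/* TLS arrays with the explicit initializer the commit adds.
 * The uint32_t element type is an assumption. */
static __thread void*    g_tls_sll_head[TINY_NUM_CLASSES]  = {0};
static __thread uint32_t g_tls_sll_count[TINY_NUM_CLASSES] = {0};

/* Hypothetical worker: counts TLS slots that are not zero on thread entry. */
static void* worker_check(void* arg) {
    (void)arg;
    intptr_t dirty = 0;
    for (int i = 0; i < TINY_NUM_CLASSES; i++) {
        if (g_tls_sll_head[i] != NULL || g_tls_sll_count[i] != 0) {
            dirty++;
        }
    }
    return (void*)dirty;
}

/* Hypothetical helper: starts nthreads workers and returns the total number
 * of non-zero slots they observed, or -1 if a thread could not be created. */
static int count_dirty_tls_slots(int nthreads) {
    pthread_t tid[MAX_CHECK_THREADS];
    int started = 0;
    int total = 0;

    if (nthreads > MAX_CHECK_THREADS) {
        nthreads = MAX_CHECK_THREADS;
    }
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker_check, NULL) != 0) {
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        void* ret = NULL;
        pthread_join(tid[i], &ret);
        total += (int)(intptr_t)ret;
    }
    return (started == nthreads) ? total : -1;
}
```

A check like `count_dirty_tls_slots(4) == 0` could run as a startup smoke test before the 4T Larson run; build with `-pthread`.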

**Result:**
```
Before: 1T: 2.09M ✅  |  4T: SEGV 💀
After:  1T: 2.41M ✅  |  4T: 4.19M ✅  (+15% 1T, SEGV resolved)
```
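
The quoted gain can be reproduced from the rounded figures in the block above. This is a small sketch with hypothetical helper names `pct_change` and `speedup`; the inputs are the M ops/s values from the Before/After lines.

```c
/* Hypothetical helper: relative change from `before` to `after`, in percent. */
static double pct_change(double before, double after) {
    return (after - before) / before * 100.0;
}

/* Hypothetical helper: how many times faster `value` is than `base`. */
static double speedup(double base, double value) {
    return value / base;
}
```

`pct_change(2.09, 2.41)` comes to roughly 15.3, in line with the stated +15% at 1T, and `speedup(2.41, 4.19)` comes to roughly 1.74, the 4T throughput relative to 1T after the fix.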

**Tests:**
```bash
# 1 thread: runs to completion
./larson_hakmem 2 8 128 1024 1 12345 1
→ Throughput = 2,407,597 ops/s ✅

# 4 threads: runs to completion (previously SEGV)
./larson_hakmem 2 8 128 1024 1 12345 4
→ Throughput = 4,192,155 ops/s ✅
```

**Investigation credit:** Root cause pinpointed precisely by the Task agent (ultrathink mode)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-07 01:27:04 +09:00

Results Snapshot (short runs)

Measured: 2025-11-06 (short runs, reference values only)

Larson (8–128B, chunks=1024, seed=12345, 2s)

system 1T: Throughput ≈ 13.58M ops/s
mimalloc 1T: Throughput ≈ 14.54M ops/s
HAKMEM 1T: Throughput ≈ 2.20M ops/s
system 4T: Throughput ≈ 16.76M ops/s
mimalloc 4T: Throughput ≈ 16.76M ops/s
HAKMEM 4T: Throughput ≈ 4.19M ops/s

Tiny Hot (LIFO, batch=100, cycles=60000)

64B: system ≈ 73.13M ops/s, HAKMEM ≈ 24.32M ops/s
32B: HAKMEM ≈ 26.76M ops/s

Random Mixed (16–1024B, ws=8192)

400k ops: system ≈ 53.82M ops/s, HAKMEM ≈ 4.65M ops/s
300k ops (matrix): system ≈ 47.7–48.2M ops/s, HAKMEM ≈ 4.31–4.80M ops/s

Mid/Large MT (8–32KiB, ws=2048)

4T, cycles=40000: system ≈ 8.27M ops/s, HAKMEM ≈ 4.06M ops/s
1T, cycles=20000 (matrix): system ≈ 2.16M ops/s, HAKMEM ≈ 1.59–1.63M ops/s
4T, cycles=20000 (matrix): system ≈ 6.22M ops/s (HAKMEM result still to be collected)

VM Mixed (512KB–<2MB, ws=256, cycles=20000)

system: ≈ 0.95–1.03M ops/s
HAKMEM (L25=0): ≈ 263k–268k ops/s
HAKMEM (L25=1): ≈ 235k ops/s

Notes:

The figures above are short smoke-test values. For official comparisons, use benchmarks/scripts/*_matrix.sh with reps=5/10 and longer runs (e.g. 10s).
Example output CSVs:
- random_mixed: bench_results/auto/random_mixed_20251106_100710/results.csv
- mid_large_mt: bench_results/auto/mid_large_mt_20251106_100710/results.csv
- vm_mixed: bench_results/auto/vm_mixed_20251106_100709/results.csv
