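**Problem:**
- Larson with 4 threads hits SEGV 100% of the time (1 thread completes at 2.09M ops/s)
- System/mimalloc runs normally with 4 threads at 33.52M ops/s
- SEGV persists with 4 threads even with SS OFF + Remote OFF
**Root cause (findings from the Task agent ultrathink investigation):**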
```
CRASH: mov (%r15),%r13
R15 = 0x6261 ← ASCII "ba" (garbage value, uninitialized TLS)
```
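TLS variables in worker threads are uninitialized:
- `__thread void* g_tls_sll_head[TINY_NUM_CLASSES];` ← no initializer
- The investigation found these were not zero-initialized in threads created by pthread_create()
- The NULL check passes (0x6261 != NULL) → dereference → SEGV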
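The failure mode in the list above can be sketched as follows. This is a minimal illustration, not the allocator's actual code: `sll_push`, `sll_pop`, and `sll_head_is_plausible` are hypothetical names, and `TINY_NUM_CLASSES` is given an assumed value. It shows why a bare NULL test offers no protection once a head slot holds a non-NULL garbage value such as 0x6261, and how a cheap alignment check would flag that value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TINY_NUM_CLASSES 8  /* assumed value for this sketch */

/* Per-thread free-list heads, mirroring the declaration quoted above. */
static __thread void* g_tls_sll_head[TINY_NUM_CLASSES];

/* Push a block: the first word of the block stores the old head. */
static void sll_push(int class_idx, void* block) {
    assert(class_idx >= 0 && class_idx < TINY_NUM_CLASSES && block != NULL);
    *(void**)block = g_tls_sll_head[class_idx];
    g_tls_sll_head[class_idx] = block;
}

/* Pop guarded only by a NULL test: any non-NULL head is dereferenced,
 * so a garbage head such as 0x6261 would fault on the load below. */
static void* sll_pop(int class_idx) {
    assert(class_idx >= 0 && class_idx < TINY_NUM_CLASSES);
    void* head = g_tls_sll_head[class_idx];
    if (head == NULL) return NULL;
    g_tls_sll_head[class_idx] = *(void**)head;  /* load of the next pointer */
    return head;
}

/* Hypothetical debug check: a valid head is NULL or pointer-aligned. */
static int sll_head_is_plausible(const void* head) {
    uintptr_t bits = (uintptr_t)head;
    if (head == NULL) return 1;
    return (bits % sizeof(void*)) == 0;
}
```

A check like `sll_head_is_plausible` is only a diagnostic aid under these assumptions; it does not replace correct initialization.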
**Fix:**
Add an explicit `= {0}` initializer to every TLS array:
1. **core/hakmem_tiny.c:**
- `g_tls_sll_head[TINY_NUM_CLASSES] = {0}`
- `g_tls_sll_count[TINY_NUM_CLASSES] = {0}`
- `g_tls_live_ss[TINY_NUM_CLASSES] = {0}`
- `g_tls_bcur[TINY_NUM_CLASSES] = {0}`
- `g_tls_bend[TINY_NUM_CLASSES] = {0}`
2. **core/tiny_fastcache.c:**
- `g_tiny_fast_cache[TINY_FAST_CLASS_COUNT] = {0}`
- `g_tiny_fast_count[TINY_FAST_CLASS_COUNT] = {0}`
- `g_tiny_fast_free_head[TINY_FAST_CLASS_COUNT] = {0}`
- `g_tiny_fast_free_count[TINY_FAST_CLASS_COUNT] = {0}`
3. **core/hakmem_tiny_magazine.c:**
- `g_tls_mags[TINY_NUM_CLASSES] = {0}`
4. **core/tiny_sticky.c:**
- `g_tls_sticky_ss[TINY_NUM_CLASSES][TINY_STICKY_RING] = {0}`
- `g_tls_sticky_idx[TINY_NUM_CLASSES][TINY_STICKY_RING] = {0}`
- `g_tls_sticky_pos[TINY_NUM_CLASSES] = {0}`
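As a rough sketch of the pattern applied above, the declarations below give two of the listed arrays an explicit `= {0}` initializer and add a helper that a worker thread could call at startup to confirm its slots are empty. The element type of `g_tls_sll_count`, the value of `TINY_NUM_CLASSES`, and both helper functions are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TINY_NUM_CLASSES 8  /* assumed value for this sketch */

/* Explicit initializers, as in the fix listed above. */
static __thread void*    g_tls_sll_head[TINY_NUM_CLASSES]  = {0};
static __thread uint32_t g_tls_sll_count[TINY_NUM_CLASSES] = {0};  /* type assumed */

/* Count the slots of the calling thread that are not in the empty state. */
static int tls_sll_dirty_slots(void) {
    int dirty = 0;
    for (int i = 0; i < TINY_NUM_CLASSES; i++) {
        if (g_tls_sll_head[i] != NULL || g_tls_sll_count[i] != 0) dirty++;
    }
    return dirty;
}

/* Hypothetical startup guard: abort early instead of crashing later. */
static void tls_sll_assert_clean(void) {
    assert(tls_sll_dirty_slots() == 0);
}
```

Calling `tls_sll_assert_clean()` at the top of each worker's entry function would turn a bad initial state into an immediate, easily located failure.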
**Result:**
```
Before: 1T: 2.09M ✅ | 4T: SEGV 💀
After: 1T: 2.41M ✅ | 4T: 4.19M ✅ (+15% 1T, SEGV resolved)
```
**Tests:**
```bash
# 1 thread: completes
./larson_hakmem 2 8 128 1024 1 12345 1
# → Throughput = 2,407,597 ops/s ✅
# 4 threads: completes (previously SEGV)
./larson_hakmem 2 8 128 1024 1 12345 4
# → Throughput = 4,192,155 ops/s ✅
```
**Investigation credit:** root cause pinpointed by the Task agent (ultrathink mode)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
hakmem_tiny_free.inc Split Implementation Details
Line-count mapping by section
Current file structure
hakmem_tiny_free.inc (1,711 lines)
SECTION                        Lines      Code  Comments  Description
════════════════════════════════════════════════════════════════════════════════
Includes & declarations        1-13       10    3         External dependencies
Helper: drain_to_sll_budget    16-25      10    5         ENV-based SLL drain budget
Helper: drain_freelist_to_sll  27-42      16    8         Freelist → SLL splicing
Helper: remote_queue_contains  44-64      21    10        Duplicate detection
════════════════════════════════════════════════════════════════════════════════
MAIN FREE FUNCTION             68-625     462   96        hak_tiny_free_with_slab()
└─ SuperSlab mode              70-133     64    29        If slab==NULL dispatch
└─ Same-thread TLS paths       135-206    72    36        Fast/List/HotMag
└─ Magazine/SLL paths          208-620    413   97        **TO EXTRACT**
════════════════════════════════════════════════════════════════════════════════
ALLOCATION SECTION             626-1019   308   86        SuperSlab alloc & refill
└─ superslab_alloc_from_slab   626-709    71    22        **TO EXTRACT**
└─ superslab_refill            712-1019   237   64        **TO EXTRACT**
════════════════════════════════════════════════════════════════════════════════
FREE SECTION                   1171-1475  281   82        hak_tiny_free_superslab()
└─ Validation & safety         1200-1230  30    20        Bounds/magic check
└─ Same-thread path            1232-1310  79    45        **TO EXTRACT**
└─ Remote/cross-thread         1312-1470  159   80        **TO EXTRACT**
════════════════════════════════════════════════════════════════════════════════
EXTRACTED COMMENTS             1612-1625  0     14        (Placeholder)
════════════════════════════════════════════════════════════════════════════════
SHUTDOWN                       1676-1705  28    7         hak_tiny_shutdown()
════════════════════════════════════════════════════════════════════════════════
Split plan (3 new files)
SPLIT 1: tiny_free_magazine.inc.h
Source: hakmem_tiny_free.inc lines 208-620
Contents:
LINES CODE CONTENT
────────────────────────────────────────────────────────────
208-217 10 #if !HAKMEM_BUILD_RELEASE & includes
218-226 9 TinyQuickSlot fast path
227-241 15 TLS SLL fast path (3-4 instruction check)
242-247 6 Magazine hysteresis threshold
248-263 16 Magazine push (top < cap + hyst)
264-290 27 Background spill async queue
291-620 330 Publisher final fallback + loop
Estimated size: 413 lines → 400 lines (include overhead: -3 lines)
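The line counts in this plan are inclusive spans, so a range `first-last` covers `last - first + 1` lines. The small helper below (hypothetical, written only to check the arithmetic) sums the sub-ranges listed for this split and confirms they add up to the 413 lines of the 208-620 block.

```c
#include <assert.h>

/* Number of lines in the inclusive range [first, last]. */
static int span_len(int first, int last) {
    assert(first <= last);
    return last - first + 1;
}

/* Sum of the sub-range lengths planned for tiny_free_magazine.inc.h. */
static int magazine_block_total(void) {
    static const int ranges[][2] = {
        {208, 217}, {218, 226}, {227, 241}, {242, 247},
        {248, 263}, {264, 290}, {291, 620},
    };
    int total = 0;
    for (unsigned i = 0; i < sizeof ranges / sizeof ranges[0]; i++) {
        total += span_len(ranges[i][0], ranges[i][1]);
    }
    return total;
}
```

For example, `span_len(291, 620)` gives 330 and `magazine_block_total()` gives 413, matching `span_len(208, 620)`.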
New public functions: (none; everything is inline/helper)
Headers included:
#include "hakmem_tiny_magazine.h" // TinyTLSMag, mag operations
#include "tiny_tls_guard.h" // tls_list_push, guard ops
#include "mid_tcache.h" // midtc_enabled, midtc_push
#include "box/free_publish_box.h" // publisher operations
#include <stdatomic.h> // atomic operations
Call site:
// In hak_tiny_free_with_slab(), after line 206:
#include "tiny_free_magazine.inc.h"
if (g_tls_list_enable) {
#include logic here
}
// Else magazine path
#include logic here
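The "Magazine push (top < cap + hyst)" and "Background spill" steps listed for this split can be sketched roughly as below. This is an illustration under stated assumptions, not the contents of the real file: `DemoMag` is a hypothetical stand-in for `TinyTLSMag`, and `MAG_CAP`, `MAG_HYST`, `demo_mag_push`, and `demo_mag_spill` are invented for the example. The idea shown is that pushes are accepted up to capacity plus a hysteresis margin, and a spill then drains well below capacity so that frees near the boundary do not spill on every call.

```c
#include <assert.h>
#include <stddef.h>

#define MAG_CAP  4  /* assumed nominal capacity */
#define MAG_HYST 2  /* assumed hysteresis margin */

typedef struct {
    int   top;                        /* number of cached blocks */
    void* items[MAG_CAP + MAG_HYST];  /* room for cap + hysteresis */
} DemoMag;  /* hypothetical stand-in for TinyTLSMag */

/* Returns 1 if the block was cached, 0 if the caller must spill first. */
static int demo_mag_push(DemoMag* mag, void* block) {
    assert(mag != NULL && block != NULL);
    if (mag->top < MAG_CAP + MAG_HYST) {
        mag->items[mag->top++] = block;
        return 1;
    }
    return 0;
}

/* Move blocks into out[] until only `keep` remain; returns the count moved.
 * out[] must have room for every block currently cached. */
static int demo_mag_spill(DemoMag* mag, void** out, int keep) {
    assert(mag != NULL && out != NULL && keep >= 0);
    int moved = 0;
    while (mag->top > keep) {
        out[moved++] = mag->items[--mag->top];
    }
    return moved;
}
```

In this sketch a caller that gets 0 from `demo_mag_push` would spill down to `MAG_CAP / 2` and retry; the real code routes the spilled blocks to a background queue or the publisher fallback.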
SPLIT 2: tiny_superslab_alloc.inc.h
Source: hakmem_tiny_free.inc lines 626-1019
Contents:
LINES CODE FUNCTION
──────────────────────────────────────────────────────
626-709 71 superslab_alloc_from_slab()
├─ Remote queue drain
├─ Linear allocation
└─ Freelist allocation
712-1019 237 superslab_refill()
├─ Mid-size simple refill (747-782)
├─ SuperSlab adoption (785-947)
│ ├─ First-fit slab selection
│ ├─ Scoring algorithm
│ └─ Slab acquisition
└─ Fresh SuperSlab alloc (949-1019)
├─ superslab_allocate()
├─ Init slab 0
└─ Refcount mgmt
Estimated size: 394 lines → 380 lines
Required headers:
#include "tiny_refill.h" // ss_partial_adopt, superslab_allocate
#include "slab_handle.h" // slab_try_acquire, slab_release
#include "tiny_remote.h" // Remote tracking
#include <stdatomic.h> // atomic operations
#include <string.h> // memset
#include <stdlib.h> // malloc
#include <errno.h> // errno
Public functions:
- static SuperSlab* superslab_refill(int class_idx)
- static inline void* superslab_alloc_from_slab(SuperSlab* ss, int slab_idx)
- static inline void* hak_tiny_alloc_superslab(int class_idx) (lines 1020-1170)
Call site:
// In hakmem_tiny_free.inc, replace lines 626-1019 with:
#include "tiny_superslab_alloc.inc.h"
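The "Linear allocation" and "Freelist allocation" steps listed for `superslab_alloc_from_slab()` can be pictured with the sketch below. It is a simplified model under assumptions: `DemoSlab` and its fields are hypothetical, the remote queue drain step is omitted, and the ordering (linear carve first, then freelist) is taken from the order of the listing rather than from the source code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned char* base;        /* start of the slab's block area */
    size_t         block_size;  /* bytes per block, at least sizeof(void*) */
    size_t         capacity;    /* total number of blocks */
    size_t         carved;      /* blocks handed out linearly so far */
    void*          freelist;    /* blocks returned by free, linked in place */
} DemoSlab;  /* hypothetical stand-in for the real slab metadata */

/* Allocate one block: carve fresh space first, then reuse freed blocks. */
static void* demo_slab_alloc(DemoSlab* s) {
    assert(s != NULL && s->block_size >= sizeof(void*));
    if (s->carved < s->capacity) {
        return s->base + s->block_size * s->carved++;
    }
    if (s->freelist != NULL) {
        void* block = s->freelist;
        s->freelist = *(void**)block;  /* next pointer lives in the block */
        return block;
    }
    return NULL;  /* slab exhausted; the caller would refill */
}

/* Return a block to the slab's freelist. */
static void demo_slab_free(DemoSlab* s, void* block) {
    assert(s != NULL && block != NULL);
    *(void**)block = s->freelist;
    s->freelist = block;
}
```

A NULL return is the point at which a caller would fall through to a refill path such as `superslab_refill()`.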
SPLIT 3: tiny_superslab_free.inc.h
Source: hakmem_tiny_free.inc lines 1171-1475
Contents:
LINES CODE CONTENT
────────────────────────────────────────────────────
1171-1198 28 Entry & debug initialization
1200-1230 30 Validation & safety checks
1232-1310 79 Same-thread freelist push
├─ ROUTE_MARK tracking
├─ Direct freelist push
├─ remote guard validation
├─ MidTC integration
└─ First-free publish
1312-1470 159 Remote/cross-thread path
├─ Owner tid validation
├─ Remote queue enqueue
├─ Sentinel validation
└─ Pending coordination
Estimated size: 305 lines → 290 lines
Required headers:
#include "box/free_local_box.h" // tiny_free_local_box()
#include "box/free_remote_box.h" // tiny_free_remote_box()
#include "tiny_remote.h" // Remote validation & tracking
#include "slab_handle.h" // slab_index_for
#include "mid_tcache.h" // midtc operations
#include <signal.h> // raise()
#include <stdatomic.h> // atomic operations
Public functions:
static inline void hak_tiny_free_superslab(void* ptr, SuperSlab* ss)
Call site:
// In hakmem_tiny_free.inc, replace lines 1171-1475 with:
#include "tiny_superslab_free.inc.h"
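The split between the same-thread path and the remote/cross-thread path can be sketched as an owner check followed by either a plain freelist push or a lock-free push onto a remote queue. The code below is a minimal model using C11 atomics; `DemoSlabMeta` and all `demo_*` functions are hypothetical, and the sentinel validation and pending coordination steps from the listing are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t       owner_tid;    /* thread that owns the slab */
    void*          freelist;     /* touched by the owner only */
    _Atomic(void*) remote_head;  /* pushed to by any other thread */
} DemoSlabMeta;  /* hypothetical stand-in for the real slab metadata */

static void demo_meta_init(DemoSlabMeta* m, uint32_t owner_tid) {
    m->owner_tid = owner_tid;
    m->freelist = NULL;
    atomic_init(&m->remote_head, (void*)NULL);
}

/* Same-thread path: plain push, no atomics needed. */
static void demo_free_local(DemoSlabMeta* m, void* block) {
    *(void**)block = m->freelist;
    m->freelist = block;
}

/* Cross-thread path: CAS push onto the remote queue. */
static void demo_free_remote(DemoSlabMeta* m, void* block) {
    void* old = atomic_load_explicit(&m->remote_head, memory_order_relaxed);
    do {
        *(void**)block = old;  /* link before publishing */
    } while (!atomic_compare_exchange_weak_explicit(
        &m->remote_head, &old, block,
        memory_order_release, memory_order_relaxed));
}

/* Dispatch on ownership; returns 1 if the remote path was taken. */
static int demo_free(DemoSlabMeta* m, void* block, uint32_t self_tid) {
    assert(m != NULL && block != NULL);
    if (m->owner_tid == self_tid) {
        demo_free_local(m, block);
        return 0;
    }
    demo_free_remote(m, block);
    return 1;
}

/* Owner-side drain: take the whole remote list and splice it locally. */
static int demo_drain_remote(DemoSlabMeta* m) {
    void* p = atomic_exchange_explicit(&m->remote_head, (void*)NULL,
                                       memory_order_acquire);
    int drained = 0;
    while (p != NULL) {
        void* next = *(void**)p;
        demo_free_local(m, p);
        p = next;
        drained++;
    }
    return drained;
}
```

The release ordering on the push pairs with the acquire on the drain, so the owner sees each block's link word before walking the list.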
Makefile dependency update
Current:
libhakmem.so: hakmem_tiny_free.inc (indirect dependency)
After the change:
libhakmem.so: core/hakmem_tiny_free.inc \
core/tiny_free_magazine.inc.h \
core/tiny_superslab_alloc.inc.h \
core/tiny_superslab_free.inc.h
Or rely on automatic dependency generation (already in the Makefile):
# Detected automatically via gcc's -MMD -MP flags
# .inc dependencies are also recorded in the .d files
Per-function move checklist
Functions remaining in hakmem_tiny_free.inc
- tiny_drain_to_sll_budget() (lines 16-25)
- tiny_drain_freelist_to_sll_once() (lines 27-42)
- tiny_remote_queue_contains_guard() (lines 44-64)
- hak_tiny_free_with_slab() (lines 68-625, reduced)
- hak_tiny_free() (lines 1476-1610)
- hak_tiny_shutdown() (lines 1676-1705)
Moved to tiny_free_magazine.inc.h
- hotmag_push() (inline from magazine.h)
- tls_list_push() (inline from guard)
- bulk_mag_to_sll_if_room()
- Magazine hysteresis logic
- Background spill logic
- Publisher fallback logic
Moved to tiny_superslab_alloc.inc.h
- superslab_alloc_from_slab() (lines 626-709)
- superslab_refill() (lines 712-1019)
- hak_tiny_alloc_superslab() (lines 1020-1170)
- Adoption scoring helpers
- Registry scan helpers
Moved to tiny_superslab_free.inc.h
- hak_tiny_free_superslab() (lines 1171-1475)
- Inline: tiny_free_local_box()
- Inline: tiny_free_remote_box()
- Remote queue sentinel validation
- First-free publish detection
Post-merge/split verification checklist
Build Verification
[ ] make clean
[ ] make build # Should not error
[ ] make bench_comprehensive_hakmem
[ ] Check: No new compiler warnings
Behavioral Verification
[ ] ./larson_hakmem 2 8 128 1024 1 12345 4
→ Score should match baseline (±1%)
[ ] Run with various ENV flags:
[ ] HAKMEM_TINY_DRAIN_TO_SLL=16
[ ] HAKMEM_TINY_SS_ADOPT=1
[ ] HAKMEM_SAFE_FREE=1
[ ] HAKMEM_TINY_FREE_TO_SS=1
Code Quality
[ ] grep -n "hak_tiny_free_with_slab\|superslab_refill" core/*.inc.h
→ Should find only in appropriate files
[ ] Check cyclomatic complexity reduced
[ ] hak_tiny_free_with_slab: 28 → ~8
[ ] superslab_refill: 18 (isolated)
[ ] hak_tiny_free_superslab: 16 (isolated)
Git Verification
[ ] git diff core/hakmem_tiny_free.inc | wc -l
→ Should show ~700 deletions, ~300 additions
[ ] git add core/tiny_free_magazine.inc.h
[ ] git add core/tiny_superslab_alloc.inc.h
[ ] git add core/tiny_superslab_free.inc.h
[ ] git commit -m "Split hakmem_tiny_free.inc into 3 focused modules"
Rollback procedure for the split (emergency)
# Step 1: Restore backup
cp core/hakmem_tiny_free.inc.bak core/hakmem_tiny_free.inc
# Step 2: Remove new files
rm core/tiny_free_magazine.inc.h
rm core/tiny_superslab_alloc.inc.h
rm core/tiny_superslab_free.inc.h
# Step 3: Reset git
git checkout core/hakmem_tiny_free.inc
git reset --hard HEAD~1 # If committed
# Step 4: Rebuild
make clean && make
Architecture diagram after the split
┌────────────────────────────────────────────────────────┐
│ hak_tiny_free() Entry Point                            │
│ (1476-1610, 135 lines, CC=12)                          │
└───────────┬───────────────────────────┬────────────────┘
            │                           │
            v                           v
       [SuperSlab]                 [TinySlab]
    g_use_superslab=1               fallback
            │                           │
            v                           v
┌──────────────────────┐   ┌────────────────────────────┐
│ tiny_superslab_      │   │ hak_tiny_free_with_slab()  │
│ free.inc.h           │   │ (dispatches to:)           │
│ (305 lines)          │   └────────────────────────────┘
│ CC=16                │
│                      │   ┌────────────────────────────┐
│ ├─ Validation        │   │ tiny_free_magazine.inc.h   │
│ ├─ Same-thread       │   │ (400 lines)                │
│ │  path (79L)        │   │ CC=10                      │
│ └─ Remote path       │   │                            │
│    (159L)            │   │ ├─ TinyQuickSlot           │
└──────────────────────┘   │ ├─ TLS SLL push            │
                           │ ├─ Magazine push           │
                           │ ├─ Background spill        │
                           │ └─ Publisher fallback      │
                           └────────────────────────────┘
        [Alloc]
      ┌──────────┐
      v          v
┌──────────────────────┐
│ tiny_superslab_alloc │
│ .inc.h               │
│ (394 lines)          │
│ CC=18                │
│                      │
│ ├─ superslab_refill  │
│ │  (308L, O(n) path) │
│ ├─ alloc_from_slab   │
│ │  (84L)             │
│ └─ entry point       │
│    (151L)            │
└──────────────────────┘
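Predicted performance impact
Compile time
- Before: ~500ms (1 large file)
- After: ~650ms (4 files with includes)
- Increase: +30% (within the acceptable range)
Runtime performance
- No change (all code is inline/static)
- Reason: the .inc.h files are merged into a single translation unit at compile time
Verification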
./larson_hakmem 2 8 128 1024 1 12345 4
# Expected: 4.19M ± 2% ops/sec (baseline maintained)
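The "baseline ± N%" criterion used in this check can be made explicit with a small helper. The function below is a hypothetical sketch for scripting the comparison; the baseline of 4.19M ops/s and the 2% tolerance come from the expectation above, while the function name and signature are assumptions.

```c
#include <assert.h>

/* Returns 1 if measured lies within baseline ± pct percent, else 0. */
static int within_tolerance(double measured, double baseline, double pct) {
    assert(baseline > 0.0 && pct >= 0.0);
    double delta = measured - baseline;
    if (delta < 0.0) delta = -delta;
    return delta <= baseline * pct / 100.0;
}
```

With a 4.19M baseline and a 2% tolerance, the allowed deviation is 83,800 ops/s, so the 4-thread result of 4,192,155 ops/s recorded earlier passes, while a run at 4.0M ops/s would not.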
Documentation update checklist
- CLAUDE.md - describe the new file structure
- README.md - add split information to the overview (if needed)
- Makefile comments - explain the dependencies
- This file (SPLIT_DETAILS.md)