hakmem/POINTER_CONVERSION_BUG_ANALYSIS.md
Moe Charm (CI) 72b38bc994 Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)

**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → an 8B next pointer at offset 1 needs 9B = IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)

**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
  - Class 0, 7: next at offset 0 (overwrites header when on freelist)
  - Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
  - All classes: next at offset 0

**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion

## Fixes Applied

### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)

// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```

### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files

Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
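
To make the 3-argument shape concrete, here is a minimal sketch of what such accessors could look like. The `sketch_` names are hypothetical and this is not the code in `tiny_next_ptr_box.h`; it only assumes the offset rule stated above and uses `memcpy` because a link stored at offset 1 is not pointer-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset rule from the specification above: classes 0 and 7 keep the
 * next pointer at offset 0, classes 1-6 keep it after the 1-byte header. */
static size_t sketch_next_offset(int class_idx) {
    return (class_idx == 0 || class_idx == 7) ? 0 : 1;
}

/* Store the freelist link inside the free block.
 * memcpy avoids an unaligned pointer store at offset 1. */
static void sketch_next_write(int class_idx, void *base, void *next_value) {
    memcpy((uint8_t *)base + sketch_next_offset(class_idx),
           &next_value, sizeof next_value);
}

/* Load the freelist link back from the free block. */
static void *sketch_next_read(int class_idx, const void *base) {
    void *next_value;
    memcpy(&next_value,
           (const uint8_t *)base + sketch_next_offset(class_idx),
           sizeof next_value);
    return next_value;
}
```

Because the class index is an explicit argument, a caller cannot read a link without stating which offset rule applies, which is the property the 2-argument form lacked.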

### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
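
A guard of this kind might look roughly like the sketch below. The sentinel value, the `SketchList` type, and the function names are all hypothetical (the real sentinel and list structures are not shown in this report), and the link is read at offset 0 for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical marker value; the real remote-free sentinel is defined
 * elsewhere in hakmem and is not shown in this report. */
#define SKETCH_REMOTE_SENTINEL ((uintptr_t)0xBADA55u)

typedef struct {
    void *head;
    uint32_t count;
} SketchList;

/* Read the link stored at the start of a free node (offset 0 here for
 * simplicity; the real offset depends on the class). */
static void *sketch_load_next(const void *node) {
    void *next;
    memcpy(&next, node, sizeof next);
    return next;
}

/* Push a node unless it, or the link it currently carries, is the sentinel. */
static bool sketch_guarded_push(SketchList *list, void *node) {
    if (node == NULL) return false;
    if ((uintptr_t)node == SKETCH_REMOTE_SENTINEL) return false;
    if ((uintptr_t)sketch_load_next(node) == SKETCH_REMOTE_SENTINEL) return false;
    memcpy(node, &list->head, sizeof list->head);  /* node->next = old head */
    list->head = node;
    list->count++;
    return true;
}
```

The pointer itself is checked before it is dereferenced, so a leaked sentinel is rejected without touching memory.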

## Verification (GPT5 Report)

**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`

**Results**:
- Main loop completed successfully
- Drain phase completed successfully
- NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers

**Analysis**:
- Class 0 immediate SEGV: RESOLVED (correct offset 0 now used)
- 66K iteration crash: RESOLVED (offset consistency fixed)
- Box API conflicts: RESOLVED (unified 3-arg API)

## Technical Details

### Offset Logic Justification
```
Class 0:  8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
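
The fit argument in the table can be checked mechanically. The sketch below assumes the block sizes listed above (8 bytes for class 0, doubling up to 1024 bytes for class 7); the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Block size per class as listed above: 8, 16, 32, ..., 1024 bytes. */
static size_t sketch_block_size(int class_idx) {
    return (size_t)8 << class_idx;
}

/* True when a next pointer of ptr_size bytes placed at `offset`
 * stays inside the block of the given class. */
static bool sketch_next_fits(int class_idx, size_t offset, size_t ptr_size) {
    return offset + ptr_size <= sketch_block_size(class_idx);
}
```

With an 8-byte pointer, `sketch_next_fits(0, 1, 8)` is false while every class from 1 upward passes. Class 7 would also fit at offset 1, so its offset 0 is a compatibility choice rather than a size constraint.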

### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports

## Remaining Work

None for Box API offset bugs - all structural issues resolved.

Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00


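Root Cause Analysis of the Pointer Conversion Bug

🔍 Investigation Summary

Nature of the bug: DOUBLE CONVERSION - the BASE → USER conversion is executed twice

Scope of impact: an alignment error occurs in Class 7 (1KB headerless)

Fix: the TLS SLL stores BASE pointers, and HAK_RET_ALLOC performs the USER conversion exactly once


📊 Complete Pointer Contract Map

1. Storage Layout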

Phase E1-CORRECT: ALL classes (C0-C7) have 1-byte header

Memory Layout:
  storage[0]     = 1-byte header (0xa0 | class_idx)
  storage[1..N]  = user data

Pointers:
  BASE = storage     (points to header at offset 0)
  USER = storage+1   (points to user data at offset 1)
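
The two pointer views can be captured in a pair of helpers. This is a sketch only: the `sketch_` names are hypothetical, and the class mask of `0x0f` is an assumption (the report names `HEADER_CLASS_MASK` but does not give its value).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HEADER_MAGIC      0xa0u  /* from the layout above */
#define SKETCH_HEADER_CLASS_MASK 0x0fu  /* assumed: low nibble holds class_idx */

/* BASE points at the header byte, USER points one byte past it. */
static inline void *sketch_base_to_user(void *base) { return (uint8_t *)base + 1; }
static inline void *sketch_user_to_base(void *user) { return (uint8_t *)user - 1; }

/* Write the 1-byte header at storage[0]. */
static inline void sketch_write_header(void *base, int class_idx) {
    *(uint8_t *)base =
        (uint8_t)(SKETCH_HEADER_MAGIC | ((unsigned)class_idx & SKETCH_HEADER_CLASS_MASK));
}

/* Recover the class index from the byte just before a USER pointer. */
static inline int sketch_class_from_user(const void *user) {
    return ((const uint8_t *)user)[-1] & SKETCH_HEADER_CLASS_MASK;
}
```

Each helper is the exact inverse of the other, so applying either one twice in a row moves the pointer off the block boundary.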

2. Allocation Path (correct)

2.1 HAK_RET_ALLOC macro (hakmem_tiny.c:160-162)

#define HAK_RET_ALLOC(cls, base_ptr) do { \
    *(uint8_t*)(base_ptr) = HEADER_MAGIC | ((cls) & HEADER_CLASS_MASK); \
    return (void*)((uint8_t*)(base_ptr) + 1);  /* ✅ BASE → USER conversion */ \
} while(0)

Contract:

  • INPUT: BASE pointer (storage)
  • OUTPUT: USER pointer (storage+1)
  • Conversions: 1

2.2 Linear Carve (tiny_refill_opt.h:292-313)

uint8_t* cursor = base + (meta->carved * stride);
void* head = (void*)cursor;  // ← BASE pointer

// Line 313: Write header to storage[0]
*block = HEADER_MAGIC | class_idx;

// Line 334: Link chain using BASE pointers
tiny_next_write(class_idx, cursor, next);  // ← BASE + next_offset

Contract:

  • Produces: BASE pointer chain
  • Header: already written (line 313)
  • Next pointer: stored at base+1 (C0-C6)

2.3 TLS SLL Splice (tls_sll_box.h:449-561)

static inline uint32_t tls_sll_splice(int class_idx, void* chain_head, ...) {
    // Line 508: Restore headers for ALL nodes
    *(uint8_t*)node = HEADER_MAGIC | (class_idx & HEADER_CLASS_MASK);

    // Line 557: Set SLL head to BASE pointer
    g_tls_sll_head[class_idx] = chain_head;  // ← BASE pointer
}

Contract:

  • INPUT: BASE pointer chain
  • Stores: BASE pointers in SLL
  • Header: rewritten as defense in depth (line 508)
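
The header rewrite during splice can be sketched as a chain walk. The function below is hypothetical, not the body of `tls_sll_splice`: it assumes a NULL-terminated chain of BASE pointers whose links live at offset 1, and a class mask of `0x0f`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Walk a NULL-terminated chain of BASE pointers whose links live at
 * offset 1, rewriting the header byte of every node.
 * Returns the number of nodes visited. */
static uint32_t sketch_restore_headers(void *chain_head, int class_idx) {
    uint32_t n = 0;
    for (void *node = chain_head; node != NULL; n++) {
        void *next;
        memcpy(&next, (uint8_t *)node + 1, sizeof next);  /* read link first */
        *(uint8_t *)node = (uint8_t)(0xa0u | ((unsigned)class_idx & 0x0fu));
        node = next;
    }
    return n;
}
```

With the link at offset 1 the header byte and the link do not overlap, so rewriting the header never disturbs the chain.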

3. ⚠️ BUG: TLS SLL Pop (tls_sll_box.h:224-430)

3.1 Pop implementation (BEFORE FIX)

static inline bool tls_sll_pop(int class_idx, void** out) {
    void* base = g_tls_sll_head[class_idx];  // ← BASE pointer
    if (!base) return false;

    // Read next pointer
    void* next = tiny_next_read(class_idx, base);
    g_tls_sll_head[class_idx] = next;

    *out = base;  // ✅ Return BASE pointer
    return true;
}

Contract (design intent):

  • SLL stores: BASE pointers
  • Returns: BASE pointer
  • Caller: converts BASE → USER via HAK_RET_ALLOC

3.2 Allocation call site (tiny_alloc_fast.inc.h:271-291)

void* base = NULL;
if (tls_sll_pop(class_idx, &base)) {
    // ✅ FIX #16 comment: "Return BASE pointer (not USER)"
    // Line 290: "Caller will call HAK_RET_ALLOC → tiny_region_id_write_header"
    return base;  // ← returns the BASE pointer
}

Contract:

  • tls_sll_pop() returns: BASE
  • tiny_alloc_fast_pop() returns: BASE
  • Caller will apply HAK_RET_ALLOC

3.3 tiny_alloc_fast() call (tiny_alloc_fast.inc.h:580-582)

ptr = tiny_alloc_fast_pop(class_idx);  // ← BASE pointer
if (__builtin_expect(ptr != NULL, 1)) {
    HAK_RET_ALLOC(class_idx, ptr);  // ← BASE → USER conversion (1st) ✅
}

Conversions: 1 (correct)


4. 🐛 ROOT CAUSE: DOUBLE CONVERSION in Free Path

4.1 Application → hak_free_at()

// Application frees USER pointer
void* user_ptr = malloc(1024);  // Returns storage+1
free(user_ptr);                  // ← USER pointer

INPUT: USER pointer (storage+1)

4.2 hak_free_at() → hak_tiny_free() (hak_free_api.inc.h:119)

case PTR_KIND_TINY_HEADERLESS: {
    // C7: Headerless 1KB blocks
    hak_tiny_free(ptr);  // ← ptr is USER pointer
    goto done;
}

Contract:

  • INPUT: ptr = USER pointer (storage+1)
  • Expected: a BASE pointer should be passed

4.3 hak_tiny_free_superslab() (tiny_superslab_free.inc.h:28)

static inline void hak_tiny_free_superslab(void* ptr, SuperSlab* ss) {
    int slab_idx = slab_index_for(ss, ptr);
    TinySlabMeta* meta = &ss->slabs[slab_idx];

    // Phase E1-CORRECT: ALL classes (C0-C7) have 1-byte header
    void* base = (void*)((uint8_t*)ptr - 1);  // ← USER → BASE conversion (1st)

    // ... push to freelist or remote queue
}

Conversions: 1 (USER → BASE)

4.4 Alignment Check (tiny_superslab_free.inc.h:95-117)

if (__builtin_expect(ss->size_class == 7, 0)) {
    size_t blk = g_tiny_class_sizes[ss->size_class];  // 1024
    uint8_t* slab_base = tiny_slab_base_for(ss, slab_idx);
    uintptr_t delta = (uintptr_t)base - (uintptr_t)slab_base;
    int align_ok = (delta % blk) == 0;

    if (!align_ok) {
        // 🚨 CRASH HERE!
        fprintf(stderr, "[C7_ALIGN_CHECK_FAIL] ptr=%p base=%p\n", ptr, base);
        fprintf(stderr, "[C7_ALIGN_CHECK_FAIL] delta=%zu blk=%zu delta%%blk=%zu\n",
                delta, blk, delta % blk);
        return;
    }
}

Error log from Task-sensei:

[C7_ALIGN_CHECK_FAIL] ptr=0x7f605c414402 base=0x7f605c414401
[C7_ALIGN_CHECK_FAIL] delta=17409 blk=1024 delta%blk=1

Analysis:

ptr       = 0x...402 (storage+2) ← expected: storage+1 (USER) ❌
base      = ptr - 1 = 0x...401 (storage+1)
expected  = storage (0x...400)

delta = 17409 = 17 * 1024 + 1
delta % 1024 = 1  ← OFF BY ONE!

Conclusion: ptr is storage+2 = DOUBLE CONVERSION
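
The arithmetic behind the log can be replayed in a few lines. The slab base address is not printed in the log; in this sketch it is derived as `base - delta`, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t block_index;  /* which blk-sized block the address falls in */
    uint64_t remainder;    /* 0 when the address is a block start (BASE) */
} SketchAlign;

/* Split the distance from the slab base into block index and remainder,
 * mirroring the delta % blk test in the C7 alignment check. */
static SketchAlign sketch_align_check(uint64_t slab_base, uint64_t addr, uint64_t blk) {
    uint64_t delta = addr - slab_base;
    SketchAlign r = { delta / blk, delta % blk };
    return r;
}
```

Feeding in the logged `base` of `0x7f605c414401` with a slab base of `0x7f605c410000` gives block 17 with remainder 1, while the address one byte lower gives remainder 0, which is what a true BASE pointer would produce.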


🔬 Bug Propagation Path

Phase 1: Carve → TLS SLL (correct)

[Linear Carve] cursor = base + carved*stride  // BASE pointer (storage)
               ↓ (BASE chain)
[TLS SLL Splice] g_tls_sll_head = chain_head  // BASE pointer (storage)

Phase 2: TLS SLL → Allocation (correct)

[TLS SLL Pop] base = g_tls_sll_head[cls]      // BASE pointer (storage)
              *out = base                       // Return BASE
              ↓ (BASE)
[tiny_alloc_fast] ptr = tiny_alloc_fast_pop()  // BASE pointer (storage)
                  HAK_RET_ALLOC(cls, ptr)       // BASE → USER (storage+1) ✅
                  ↓ (USER)
[Application] p = malloc(1024)                 // Receives USER (storage+1) ✅

Phase 3: Free → TLS SLL (BUG)

[Application] free(p)                          // USER pointer (storage+1)
              ↓ (USER)
[hak_free_at] hak_tiny_free(ptr)               // ptr = USER (storage+1) ❌
              ↓ (USER)
[hak_tiny_free_superslab]
    base = ptr - 1                             // USER → BASE (storage) ← 1st conversion
    ↓ (BASE)
    ss_remote_push(ss, slab_idx, base)         // BASE pushed to remote queue
    ↓ (BASE in remote queue)
[Adoption: Remote → Local Freelist]
    trc_pop_from_freelist(meta, ..., &chain)  // BASE chain
    ↓ (BASE)
[TLS SLL Splice] g_tls_sll_head = chain_head   // BASE stored in SLL ✅

Everything is correct up to this point: BASE pointers are stored in the SLL.

Phase 4: Next Allocation (DOUBLE CONVERSION)

[TLS SLL Pop] base = g_tls_sll_head[cls]      // BASE pointer (storage)
              *out = base                       // Return BASE (storage)
              ↓ (BASE)
[tiny_alloc_fast] ptr = tiny_alloc_fast_pop()  // BASE pointer (storage)
                  HAK_RET_ALLOC(cls, ptr)       // BASE → USER (storage+1) ✅
                  ↓ (USER = storage+1)
[Application] p = malloc(1024)                 // Receives USER (storage+1) ✅
              ... use memory ...
              free(p)                            // USER pointer (storage+1)
              ↓ (USER = storage+1)
[hak_tiny_free] ptr = storage+1
    base = ptr - 1 = storage                   // ✅ USER → BASE (1st)
    ↓ (BASE = storage)
[hak_tiny_free_superslab]
    base = ptr - 1                             // ❌ USER → BASE (2nd!) DOUBLE CONVERSION!
    ↓ (storage - 1) ← WRONG!

Expected: base = storage (aligned to 1024)
Actual:   base = storage - 1 (offset 1023 of the preceding block → delta % 1024 = 1023, not 0) ❌

WRONG! hak_tiny_free() receives a USER pointer, yet hak_tiny_free_superslab() subtracts 1 once more!


🎯 Summary of Contradictions

A. Design Intent (Correct Contract)

| Layer | Stores | Input | Output | Conversion |
|---|---|---|---|---|
| Carve | - | - | BASE | None (BASE generated) |
| TLS SLL | BASE | BASE | BASE | None |
| Alloc Pop | - | - | BASE | None |
| HAK_RET_ALLOC | - | BASE | USER | BASE → USER (once) |
| Application | - | USER | USER | None |
| Free Enter | - | USER | - | USER → BASE (once) |
| Freelist/Remote | BASE | BASE | - | None |

Total conversions: 2 (Alloc: BASE→USER, Free: USER→BASE)

B. Actual Implementation (Buggy Implementation)

| Function | Input | Processing | Output |
|---|---|---|---|
| hak_free_at() | USER (storage+1) | Pass through | USER |
| hak_tiny_free() | USER (storage+1) | Pass through | USER |
| hak_tiny_free_superslab() | USER (storage+1) | base = ptr - 1 | BASE (storage) |

Problem: hak_tiny_free_superslab() expects a BASE pointer but receives a USER pointer!

Result:

  1. First free: USER → BASE conversion (correct)
  2. BASE is pushed to the remote queue (correct)
  3. Adoption moves the BASE chain into the TLS SLL (correct)
  4. Next alloc: BASE → USER conversion (correct)
  5. Next free: the USER → BASE conversion is executed twice

💡 Fix Approach (Option C: Explicit Conversion at Boundary)

Fix Strategy

Principle: convert explicitly at the Box API boundary

  1. TLS SLL: stores BASE pointers (unchanged)
  2. Alloc: HAK_RET_ALLOC converts BASE → USER (unchanged)
  3. Free Entry: consolidate the USER → BASE conversion in one place ← FIX!
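
The three rules can be exercised with a toy single-class free list. This is an illustrative sketch with hypothetical names, not hakmem code: the list holds BASE pointers only, the link is kept at offset 0 (as for class 7), and the class mask `0x0f` is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static void *sketch_sll_head = NULL;  /* holds BASE pointers only */

/* Free entry: the single USER -> BASE conversion, then push BASE. */
static void sketch_free(void *user) {
    void *base = (uint8_t *)user - 1;
    memcpy(base, &sketch_sll_head, sizeof sketch_sll_head);  /* next at offset 0 */
    sketch_sll_head = base;
}

/* Alloc: pop BASE, write the header, then the single BASE -> USER conversion. */
static void *sketch_alloc(int class_idx) {
    void *base = sketch_sll_head;
    if (base == NULL) return NULL;
    memcpy(&sketch_sll_head, base, sizeof sketch_sll_head);  /* head = base->next */
    *(uint8_t *)base = (uint8_t)(0xa0u | ((unsigned)class_idx & 0x0fu));
    return (uint8_t *)base + 1;
}
```

Repeated free and alloc cycles on the same block keep returning the same USER pointer, because each direction converts exactly once and nothing in between touches the offset.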

Concrete Fixes

Fix 1: Convert USER → BASE in hak_free_at()

File: /mnt/workdisk/public_share/hakmem/core/box/hak_free_api.inc.h

Before (line 119):

case PTR_KIND_TINY_HEADERLESS: {
    hak_tiny_free(ptr);  // ← ptr is USER
    goto done;
}

After (FIX):

case PTR_KIND_TINY_HEADERLESS: {
    // ✅ FIX: Convert USER → BASE at API boundary
    void* base = (void*)((uint8_t*)ptr - 1);
    hak_tiny_free_base(base);  // ← Pass BASE pointer
    goto done;
}

Fix 2: Change hak_tiny_free_superslab() to a _base variant

File: /mnt/workdisk/public_share/hakmem/core/tiny_superslab_free.inc.h

Option A: Rename function (recommended)

// OLD: static inline void hak_tiny_free_superslab(void* ptr, SuperSlab* ss)
// NEW: Takes BASE pointer explicitly
static inline void hak_tiny_free_superslab_base(void* base, SuperSlab* ss) {
    int slab_idx = slab_index_for(ss, base);  // ← Use base directly
    TinySlabMeta* meta = &ss->slabs[slab_idx];

    // ❌ REMOVE: void* base = (void*)((uint8_t*)ptr - 1);  // DOUBLE CONVERSION!

    // Alignment check now uses correct base
    if (__builtin_expect(ss->size_class == 7, 0)) {
        size_t blk = g_tiny_class_sizes[ss->size_class];
        uint8_t* slab_base = tiny_slab_base_for(ss, slab_idx);
        uintptr_t delta = (uintptr_t)base - (uintptr_t)slab_base;  // ✅ Correct delta
        int align_ok = (delta % blk) == 0;  // ✅ Should be 0 now!
        // ...
    }
    // ... rest of free logic
}

Option B: Keep function name, add parameter

static inline void hak_tiny_free_superslab(void* ptr, SuperSlab* ss, bool is_base) {
    void* base = is_base ? ptr : (void*)((uint8_t*)ptr - 1);
    // ... rest as above
}

Fix 3: Update all call sites

Files to update:

  1. /mnt/workdisk/public_share/hakmem/core/box/hak_free_api.inc.h (line 119, 127)
  2. /mnt/workdisk/public_share/hakmem/core/hakmem_tiny_free.inc (line 173, 470)

Pattern:

// OLD: hak_tiny_free_superslab(ptr, ss);
// NEW: hak_tiny_free_superslab_base(base, ss);

🧪 検証計画

1. Unit Test

void test_pointer_conversion(void) {
    // Allocate
    void* user_ptr = hak_tiny_alloc(1024);  // Should return USER (storage+1)
    assert(user_ptr != NULL);

    // Check alignment (USER pointer should be offset 1 from BASE)
    void* base = (void*)((uint8_t*)user_ptr - 1);
    assert(((uintptr_t)base % 1024) == 0);  // BASE aligned
    assert(((uintptr_t)user_ptr % 1024) == 1);  // USER offset by 1

    // Free (should accept USER pointer)
    hak_tiny_free(user_ptr);

    // Reallocate (should return same USER pointer)
    void* user_ptr2 = hak_tiny_alloc(1024);
    assert(user_ptr2 == user_ptr);  // Same block reused

    hak_tiny_free(user_ptr2);
}

2. Alignment Error Test

# Run with C7 allocation (1KB blocks)
./bench_fixed_size_hakmem 10000 1024 128

# Expected: No [C7_ALIGN_CHECK_FAIL] errors
# Before fix: delta%blk=1 (off by one)
# After fix:  delta%blk=0 (aligned)

3. Stress Test

# Run long allocation/free cycles
./bench_random_mixed_hakmem 1000000 1024 42

# Expected: Stable, no crashes
# Monitor: [C7_ALIGN_CHECK_FAIL] should be 0

4. Grep Audit (pre-verification)

# Check for other USER → BASE conversions
grep -rn "(uint8_t\*)ptr - 1" core/

# Expected: Only 1 occurrence (at hak_free_at boundary)
# Before fix: 2+ occurrences (multiple conversions)

📝 Impact Analysis

Affected Classes

| Class | Size | Header | Impact |
|---|---|---|---|
| C0 | 8B | Yes | Same bug (overwrite header with next) |
| C1-C6 | 16-512B | Yes | Same bug pattern |
| C7 | 1KB | Yes (Phase E1) | Detected (alignment check) |

Why does only C7 crash?

  • The C7 alignment check is strict (1024B aligned)
  • The off-by-one is easy to detect (delta % 1024 == 1)
  • C0-C6 have smaller alignment (8-512B), so the error tends to stay silent
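
One way to see why an off-by-one pointer can stay silent: any lookup that derives a block index by integer division gives the same answer for `storage` and `storage+1`, and only an exact-multiple test tells them apart. The sketch below illustrates that general point with hypothetical helpers; it does not reproduce how hakmem indexes blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Index computed by integer division: an offset that is off by one
 * still maps to the same block. */
static size_t sketch_block_index(size_t delta, size_t blk) {
    return delta / blk;
}

/* Strict check in the style of the C7 guard: the offset must be an
 * exact multiple of the block size. */
static bool sketch_is_block_start(size_t delta, size_t blk) {
    return delta % blk == 0;
}
```

For every block size from 8 to 1024 bytes, `delta` and `delta + 1` land in the same block, yet only `delta` passes the strict check.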

Do other free paths have the same bug?

Yes! The following need the same fix:

  1. PTR_KIND_TINY_HEADER (line 119):
case PTR_KIND_TINY_HEADER: {
    // ✅ FIX: Convert USER → BASE
    void* base = (void*)((uint8_t*)ptr - 1);
    hak_tiny_free_base(base);
    goto done;
}
  2. Direct SuperSlab free (hakmem_tiny_free.inc line 470):
if (ss && ss->magic == SUPERSLAB_MAGIC) {
    // ✅ FIX: Convert USER → BASE before passing to superslab free
    void* base = (void*)((uint8_t*)ptr - 1);
    hak_tiny_free_superslab_base(base, ss);
    HAK_STAT_FREE(ss->size_class);
    return;
}

🎯 Minimizing the Fix

Files Changed (3 files only)

  1. core/box/hak_free_api.inc.h (2 locations)

    • Line 119: add USER → BASE conversion
    • Line 127: add USER → BASE conversion
  2. core/tiny_superslab_free.inc.h (1 location)

    • Line 28: remove void* base = (void*)((uint8_t*)ptr - 1);
    • Add the _base suffix to the function signature
  3. core/hakmem_tiny_free.inc (2 locations)

    • Line 173: Call site update
    • Line 470: Call site update + add USER → BASE conversion

Lines Changed

  • Added: about 10 lines (USER → BASE conversions)
  • Removed: 1 line (DOUBLE CONVERSION removal)
  • Modified: 2 lines (function call updates)

Total: < 15 lines changed


🚀 Implementation Order

Phase 1: Preparation (5 min)

  1. Grep audit: list every call to hak_tiny_free_superslab
  2. Grep audit: list every ptr - 1 conversion
  3. Test baseline: record the current benchmark results

Phase 2: Core Fix (10 min)

  1. tiny_superslab_free.inc.h: Rename function, remove DOUBLE CONVERSION
  2. hak_free_api.inc.h: Add USER → BASE at boundary (2 locations)
  3. hakmem_tiny_free.inc: Update call sites (2 locations)

Phase 3: Verification (10 min)

  1. Build test: ./build.sh bench_fixed_size_hakmem
  2. Unit test: Run alignment check test (1KB blocks)
  3. Stress test: Run 100K iterations, check for errors

Phase 4: Validation (5 min)

  1. Benchmark: Verify performance unchanged (< 1% regression acceptable)
  2. Grep audit: Verify only 1 USER → BASE conversion point
  3. Final test: Run full bench suite

Total time: 30 min


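📚 Summary

Root Cause

DOUBLE CONVERSION: the USER → BASE conversion is executed twice

  1. hak_free_at() receives a USER pointer
  2. hak_tiny_free() passes the USER pointer through unchanged
  3. hak_tiny_free_superslab() converts USER → BASE (1st)
  4. On the next free, USER → BASE is converted again (2nd) ← BUG!

Solution

Convert explicitly at the Box API boundary

  1. hak_free_at(): USER → BASE conversion (consolidated in one place)
  2. hak_tiny_free_superslab(): expects a BASE pointer (conversion removed)
  3. All internal paths: BASE pointers only

Impact

  • Minimal change: 3 files, < 15 lines
  • Performance: no impact (the number of conversions is the same)
  • Safety: the pointer contract is made explicit, preventing the bug from recurring

Verification

  • The C7 alignment check successfully detected the bug
  • After the fix, delta % 1024 == 0
  • Consistency is maintained across all classes (C0-C7)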