Files
hakmem/PHASE7_DEBUG_COMMANDS.md
Moe Charm (CI) 1010a961fb Tiny: fix header/stride mismatch and harden refill paths
- Root cause: header-based class indexing (HEADER_CLASSIDX=1) wrote a 1-byte
  header during allocation, but linear carve/refill and initial slab capacity
  still used bare class block sizes. This mismatch could overrun slab usable
  space and corrupt freelists, causing reproducible SEGV at ~100k iters.

Changes
- Superslab: compute capacity with effective stride (block_size + header for
  classes 0..6; class7 remains headerless) in superslab_init_slab(). Add a
  debug-only bound check in superslab_alloc_from_slab() to fail fast if carve
  would exceed usable bytes.
- Refill (non-P0 and P0): use header-aware stride for all linear carving and
  TLS window bump operations. Ensure alignment/validation in tiny_refill_opt.h
  also uses stride, not raw class size.
- Drain: keep existing defense-in-depth for remote sentinel and sanitize nodes
  before splicing into freelist (already present).

Notes
- This unifies the memory layout across alloc/linear-carve/refill with a single
  stride definition and keeps class7 (1024B) headerless as designed.
- Debug builds add fail-fast checks; release builds remain lean.

Next
- Re-run Tiny benches (256/1024B) in debug to confirm stability, then in
  release. If any remaining crash persists, bisect with HAKMEM_TINY_P0_BATCH_REFILL=0
  to isolate P0 batch carve, and continue reducing branch-miss as planned.
2025-11-09 18:55:50 +09:00

9.2 KiB

Phase 7 Debugging Commands - Action Checklist

Purpose: Debug why Phase 7 header-based fast free is NOT working


Quick Status Check

cd /mnt/workdisk/public_share/hakmem

# Verify Phase 7 flags are enabled
grep -E "HEADER_CLASSIDX|PREWARM_TLS|AGGRESSIVE_INLINE" build.sh

# Should show:
#   HEADER_CLASSIDX=1
#   AGGRESSIVE_INLINE=1
#   PREWARM_TLS=1

Investigation 1: Are Headers Being Written?

Add Debug Logging to Header Write

File: core/tiny_region_id.h:44-58

Add this after line 50:

#if !HAKMEM_BUILD_RELEASE
    fprintf(stderr, "[HEADER_WRITE] ptr=%p cls=%d magic=0x%02x\n",
            user_ptr, class_idx, header);
#endif

Build and Test

make clean
./build.sh bench_random_mixed_hakmem

# Run with small count to see header writes
./bench_random_mixed_hakmem 10 128 42 2>&1 | grep "HEADER_WRITE"

# Expected output:
# [HEADER_WRITE] ptr=0x7f... cls=4 magic=0xa4
# [HEADER_WRITE] ptr=0x7f... cls=4 magic=0xa4
# ...

If NO output: Headers are NOT being written! (allocation bug) If output present: Headers ARE being written (continue to Investigation 2)


Investigation 2: Why Does Header Read Fail?

Add Debug Logging to Header Read

File: core/tiny_free_fast_v2.inc.h:50-71

Add this after line 66 (header read):

#if !HAKMEM_BUILD_RELEASE
    static int log_count = 0;
    if (log_count < 20) {
        fprintf(stderr, "[HEADER_READ] ptr=%p header_addr=%p header=0x%02x magic_match=%d page_boundary=%d\n",
                ptr, header_addr, header,
                ((header & 0xF0) == TINY_MAGIC) ? 1 : 0,
                (((uintptr_t)ptr & 0xFFF) == 0) ? 1 : 0);
        log_count++;
    }
#endif

Build and Test

make clean
./build.sh bench_random_mixed_hakmem

./bench_random_mixed_hakmem 100 128 42 2>&1 | grep "HEADER_READ"

# Expected output (if working):
# [HEADER_READ] ptr=0x7f... header_addr=0x7f... header=0xa4 magic_match=1 page_boundary=0

# If magic_match=0: Header validation is failing!
# If page_boundary=1: mincore() might be blocking

Analysis:

  • header=0xa4 (class 4, magic 0xa) → Correct
  • header=0xb4 (Pool TLS magic) → Wrong allocator
  • header=0x00 or random → Header not written or corrupted
  • magic_match=0 Validation logic wrong

Investigation 3: Check Dispatch Priority

Verify Pool TLS is Not Interfering

File: core/box/hak_free_api.inc.h:81-110

Line 102 checks Pool magic BEFORE Tiny magic!

if ((header & 0xF0) == POOL_MAGIC) {  // 0xb0
    pool_free(ptr);
    goto done;
}
// Tiny check comes AFTER (line 116)

Problem: If Pool TLS accidentally claims Tiny allocations, they never reach Phase 7 Tiny path!

Test: Disable Pool TLS temporarily

# Edit build.sh - comment out Pool TLS flag
# POOL_TLS_PHASE1=1  ← comment this line

make clean
./build.sh bench_random_mixed_hakmem

HAKMEM_FREE_ROUTE_TRACE=1 ./bench_random_mixed_hakmem 100 128 42 2>&1 | grep "FREE_ROUTE" | sort | uniq -c

# Expected (if Pool TLS was interfering):
# 95 [FREE_ROUTE] header_fast
#  5 [FREE_ROUTE] header_16byte

# If still shows ss_hit: Pool TLS is NOT the problem

Investigation 4: Check Return Value of hak_tiny_free_fast_v2

Add Debug at Call Site

File: core/box/hak_free_api.inc.h:116-122

Add this:

#if !HAKMEM_BUILD_RELEASE
    int result = hak_tiny_free_fast_v2(ptr);
    static int log_count = 0;
    if (log_count < 20) {
        fprintf(stderr, "[FREE_V2] ptr=%p result=%d\n", ptr, result);
        log_count++;
    }
    if (__builtin_expect(result, 1)) {
#else
    if (__builtin_expect(hak_tiny_free_fast_v2(ptr), 1)) {
#endif

Build and Test

make clean
./build.sh bench_random_mixed_hakmem

./bench_random_mixed_hakmem 100 128 42 2>&1 | grep "FREE_V2"

# Expected output:
# [FREE_V2] ptr=0x7f... result=1  ← Success!
# [FREE_V2] ptr=0x7f... result=0  ← Failure (why?)

# If all result=0: Function ALWAYS fails (logic bug)
# If mixed 0/1: Some allocations work, others don't (routing issue)

Investigation 5: Full Trace (Allocation + Free)

Enable All Debug Logs

# Temporarily enable all debug in one run
make clean
./build.sh bench_random_mixed_hakmem

./bench_random_mixed_hakmem 10 128 42 2>&1 | tee phase7_debug_full.log

# Analyze log
grep "HEADER_WRITE" phase7_debug_full.log | wc -l  # Count writes
grep "HEADER_READ" phase7_debug_full.log | wc -l   # Count reads
grep "FREE_V2.*result=1" phase7_debug_full.log | wc -l  # Count successes
grep "FREE_V2.*result=0" phase7_debug_full.log | wc -l  # Count failures
grep "FREE_ROUTE.*header_fast" phase7_debug_full.log | wc -l  # Count fast path
grep "FREE_ROUTE.*ss_hit" phase7_debug_full.log | wc -l  # Count slow path

Expected Pattern (if working):

HEADER_WRITE: 10
HEADER_READ: 10
FREE_V2 result=1: 10
header_fast: 10
ss_hit: 0

Actual Pattern (broken):

HEADER_WRITE: 10 (or 0!)
HEADER_READ: 10
FREE_V2 result=0: 10
header_fast: 0
ss_hit: 10

Investigation 6: Memory Inspection (Advanced)

Check Header in Memory Directly

Add this test:

// In bench_random_mixed.c (after allocation)
void* p = malloc(128);
if (p) {
    unsigned char* header_addr = (unsigned char*)p - 1;
    fprintf(stderr, "[MEM_CHECK] ptr=%p header_addr=%p header=0x%02x\n",
            p, header_addr, *header_addr);
}

Expected: header=0xa4 (class 4, magic 0xa) If different: Header write is broken


Investigation 7: Check Magic Constants

Verify Magic Definitions

grep -rn "TINY_MAGIC\|POOL_MAGIC" core/ --include="*.h" | grep "#define"

# Should show:
# core/tiny_region_id.h:  #define TINY_MAGIC 0xa0
# core/pool_tls.h:        #define POOL_MAGIC 0xb0

If TINY_MAGIC != 0xa0: Wrong magic constant!


Investigation 8: Check Class Index Calculation

Verify Class Mapping

// Add to header write
fprintf(stderr, "[CLASS_CHECK] size=%zu → class=%d (expected=%d)\n",
        /* original size */, class_idx, /* manual calculation */);

// For 128B: class should be 4 (g_tiny_class_sizes[4] = 128)

Decision Tree

START
  ↓
Are HEADER_WRITE logs present?
  ├─ NO → Headers NOT written (allocation bug)
  │       → Check HAK_RET_ALLOC macro
  │       → Check tiny_region_id_write_header() calls
  │
  └─ YES → Headers ARE written ✅
        ↓
    Are HEADER_READ logs present?
      ├─ NO → Headers not read (impossible, must be present)
      │
      └─ YES → Headers ARE read ✅
            ↓
        Is magic_match=1?
          ├─ NO → Validation failing
          │       → Check TINY_MAGIC constant (should be 0xa0)
          │       → Check validation logic ((header & 0xF0) == TINY_MAGIC)
          │
          └─ YES → Validation passes ✅
                ↓
            Is FREE_V2 result=1?
              ├─ NO → Function returns failure
              │       → Check class_idx extraction
              │       → Check TLS push logic
              │       → Check return value
              │
              └─ YES → Function succeeds ✅
                    ↓
                Is FREE_ROUTE showing header_fast?
                  ├─ NO → Dispatch priority wrong
                  │       → Pool TLS checked before Tiny?
                  │       → goto done not executed?
                  │
                  └─ YES → **PHASE 7 WORKING!** 🎉

Expected Outcomes

Scenario 1: Headers Not Written

Symptom: No HEADER_WRITE logs Cause: tiny_region_id_write_header() not called Fix: Check HAK_RET_ALLOC macro expansion


Scenario 2: Magic Validation Fails

Symptom: magic_match=0 in logs Cause: Wrong magic constant or validation logic Fix: Verify TINY_MAGIC=0xa0, check (header & 0xF0) == 0xa0


Scenario 3: Pool TLS Interference

Symptom: Disabling Pool TLS fixes it Cause: Pool TLS claims Tiny allocations Fix: Check dispatch priority, ensure Tiny checked first


Scenario 4: Class Index Corruption

Symptom: Class index doesn't match size Cause: Wrong class calculation or header corruption Fix: Verify hak_tiny_size_to_class() logic


Quick Fix Testing

Once root cause found, test fix:

# 1. Apply fix
# 2. Rebuild
make clean
./build.sh bench_random_mixed_hakmem

# 3. Verify routing (should show header_fast now!)
HAKMEM_FREE_ROUTE_TRACE=1 ./bench_random_mixed_hakmem 100 128 42 2>&1 | \
    grep "FREE_ROUTE" | sort | uniq -c

# Expected (success):
#  95 [FREE_ROUTE] header_fast
#   5 [FREE_ROUTE] header_16byte

# 4. Benchmark (should show 4-8x improvement!)
for i in 1 2 3; do
    ./bench_random_mixed_hakmem 100000 128 42 2>/dev/null | grep "Throughput"
done

# Expected (if header fast path works):
# Throughput = 18000000+ operations per second  (was 4.5M, now 18M+)

Success Criteria

Phase 7 Header Fast Free is WORKING when:

  1. HEADER_WRITE logs show magic 0xa4 (class 4)
  2. HEADER_READ logs show magic_match=1
  3. FREE_V2 logs show result=1
  4. FREE_ROUTE shows 90%+ header_fast (not ss_hit!)
  5. Benchmark shows 15-20M ops/s (4x improvement)

Good luck debugging! 🔍🐛

If you find the issue, document it in: PHASE7_HEADER_FREE_FIX.md