hakmem/scripts/run_larson_perf.sh

#!/usr/bin/env bash
set -euo pipefail
# Commit annotation (2025-11-07 01:27:04 +09:00):
# CRITICAL FIX: fully resolve the 4T SEGV caused by uninitialized TLS
#
# **Problem:**
# - Larson 4T crashed with SEGV 100% of the time (1T completed at 2.09M ops/s)
# - System/mimalloc ran normally at 33.52M ops/s with 4T
# - 4T still crashed with SEGV even with SS OFF + Remote OFF
#
# **Root cause (findings of the Task agent ultrathink investigation):**
#     CRASH: mov (%r15),%r13
#     R15 = 0x6261  ← ASCII "ba" (garbage value, uninitialized TLS)
# TLS variables in worker threads were uninitialized:
# - `__thread void* g_tls_sll_head[TINY_NUM_CLASSES];` ← no initializer
# - Not zero-initialized in threads created by pthread_create()
# - The NULL check passed (0x6261 != NULL) → dereference → SEGV
#
# **Fix:** add an explicit `= {0}` initializer to every TLS array:
# 1. **core/hakmem_tiny.c:**
#    - `g_tls_sll_head[TINY_NUM_CLASSES] = {0}`
#    - `g_tls_sll_count[TINY_NUM_CLASSES] = {0}`
#    - `g_tls_live_ss[TINY_NUM_CLASSES] = {0}`
#    - `g_tls_bcur[TINY_NUM_CLASSES] = {0}`
#    - `g_tls_bend[TINY_NUM_CLASSES] = {0}`
# 2. **core/tiny_fastcache.c:**
#    - `g_tiny_fast_cache[TINY_FAST_CLASS_COUNT] = {0}`
#    - `g_tiny_fast_count[TINY_FAST_CLASS_COUNT] = {0}`
#    - `g_tiny_fast_free_head[TINY_FAST_CLASS_COUNT] = {0}`
#    - `g_tiny_fast_free_count[TINY_FAST_CLASS_COUNT] = {0}`
# 3. **core/hakmem_tiny_magazine.c:**
#    - `g_tls_mags[TINY_NUM_CLASSES] = {0}`
# 4. **core/tiny_sticky.c:**
#    - `g_tls_sticky_ss[TINY_NUM_CLASSES][TINY_STICKY_RING] = {0}`
#    - `g_tls_sticky_idx[TINY_NUM_CLASSES][TINY_STICKY_RING] = {0}`
#    - `g_tls_sticky_pos[TINY_NUM_CLASSES] = {0}`
#
# **Effect:**
#     Before: 1T: 2.09M ✅ | 4T: SEGV 💀
#     After:  1T: 2.41M ✅ | 4T: 4.19M ✅ (+15% 1T, SEGV resolved)
#
# **Tests:**
#     # 1 thread: completes
#     ./larson_hakmem 2 8 128 1024 1 12345 1
#     → Throughput = 2,407,597 ops/s ✅
#     # 4 threads: completes (previously SEGV)
#     ./larson_hakmem 2 8 128 1024 1 12345 4
#     → Throughput = 4,192,155 ops/s ✅
#
# **Investigation credit:** root cause pinpointed by the Task agent (ultrathink mode)
# 🤖 Generated with [Claude Code](https://claude.com/claude-code)
# Co-Authored-By: Claude <noreply@anthropic.com>
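# Defensive: ensure timeout exists; if not, fall back to a best-effort shim.
# TIMEOUT is always called as: TIMEOUT <duration> <command> [args...]
if ! command -v timeout >/dev/null 2>&1; then
  echo "[warn] 'timeout' not found; runs may hang on bench bugs" >&2
  # No timeout available: drop the duration argument and run the command unbounded.
  TIMEOUT() { shift; "$@"; }
else
  TIMEOUT() { timeout --kill-after=2s "$@"; }
fi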
# Perf-annotated Larson runs for system/mimalloc/HAKMEM without LD_PRELOAD.
# Writes results under scripts/bench_results/larson_perf_*.txt
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")"/.. && pwd)"
cd "$ROOT_DIR"
OUT_DIR="scripts/bench_results"
mkdir -p "$OUT_DIR"
dur=${DUR:-2}
min=${MIN:-8}
max=${MAX:-128}
chunks=${CHUNKS:-1024}
rounds=${ROUNDS:-1}
seed=${SEED:-12345}
threads_csv=${THREADS:-1,4}
# Ensure builds
[[ -x ./larson_system ]] || make -s larson_system >/dev/null
MI_SO="${MIMALLOC_SO:-mimalloc-bench/extern/mi/out/release/libmimalloc.so}"
have_mi=0
if [[ -f "$MI_SO" ]]; then
  have_mi=1
  [[ -x ./larson_mi ]] || make -s larson_mi >/dev/null || have_mi=0
fi
[[ -x ./larson_hakmem ]] || make -s larson_hakmem >/dev/null
IFS=',' read -ra ts <<<"$threads_csv"
run_one() {
  local name=$1; shift
  local bin=$1; shift
  local thr=$1; shift
  local tag="${name}_${thr}T_${dur}s_${min}-${max}"
  local base="$OUT_DIR/larson_${tag}"
  local outfile="${base}.txt"
  local outlog="${base}.stdout"
  local errlog="${base}.stderr"
  : >"$outfile"; : >"$outlog"; : >"$errlog"
  echo "== $name threads=$thr ==" | tee -a "$outfile"
  # Warm-up quick run (avoids one-time init skew); skipped for hakmem. Always bounded by timeout.
  # Warm-up stdout is discarded so its Throughput line is not mistaken for the measured run's.
  if [[ "$name" != "hakmem" ]]; then
    TIMEOUT "$((dur+2))"s "$bin" 1 "$min" "$max" "$chunks" "$rounds" "$seed" "$thr" \
      >/dev/null 2>>"$errlog" || true
  fi
  # Throughput run with timeout; capture both stdout/stderr to logs
  echo "[cmd] $bin $dur $min $max $chunks $rounds $seed $thr" | tee -a "$outfile"
  TIMEOUT "$((dur+3))"s "$bin" "$dur" "$min" "$max" "$chunks" "$rounds" "$seed" "$thr" \
    >>"$outlog" 2>>"$errlog" || true
  # Extract a single Throughput line from the captured stdout
  local tput_line
  if command -v rg >/dev/null 2>&1; then
    tput_line=$(rg -n "Throughput" -m 1 "$outlog" || true)
  else
    tput_line=$(grep -n "Throughput" "$outlog" | head -n1 || true)
  fi
  # Explicit if/else so a failing tee cannot also trigger the fallback message
  if [[ -n "$tput_line" ]]; then
    echo "$tput_line" | tee -a "$outfile"
  else
    echo "(no Throughput line)" | tee -a "$outfile"
  fi
  # perf stat (optional; if perf not present, skip gracefully)
  if command -v perf >/dev/null 2>&1; then
    TIMEOUT "$((dur+3))"s perf stat -o "$outfile" -a -d -d --append -- \
      "$bin" "$dur" "$min" "$max" "$chunks" "$rounds" "$seed" "$thr" \
      >>"$outlog" 2>>"$errlog" || true
  else
    echo "[warn] perf not found; skipping perf stat" | tee -a "$outfile"
  fi
  echo "[logs] stdout=$outlog stderr=$errlog" | tee -a "$outfile"
}
for t in "${ts[@]}"; do
  run_one system ./larson_system "$t"
  if (( have_mi == 1 )); then
    run_one mimalloc ./larson_mi "$t"
  fi
  HAKMEM_QUIET=1 HAKMEM_DISABLE_BATCH=1 HAKMEM_TINY_META_ALLOC=1 HAKMEM_TINY_META_FREE=1 \
  HAKMEM_TINY_USE_SUPERSLAB=${HAKMEM_TINY_USE_SUPERSLAB:-1} \
  HAKMEM_TINY_MUST_ADOPT=${HAKMEM_TINY_MUST_ADOPT:-1} \
    run_one hakmem ./larson_hakmem "$t"
done
echo "Written perf outputs under $OUT_DIR"
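
Each result file written by the script records one `Throughput` line per run, prefixed with a line number from `grep -n` or `rg -n`. To compare allocators numerically, the ops/s figure has to be pulled out of that line. The following is a minimal sketch of one way to do this, assuming the `Throughput = 2,407,597 ops/s` format shown in the commit annotation. The helper name `parse_tput` is hypothetical and is not part of the script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of run_larson_perf.sh): print the ops/s figure
# from a Larson "Throughput" line with the thousands separators removed.
# Prints nothing if the line does not contain a Throughput value.
parse_tput() {
  local line=$1
  # Keep the regex in a variable so bash treats it as a pattern, not a quoted literal.
  local re='Throughput[[:space:]]*=[[:space:]]*([0-9,]+)'
  if [[ $line =~ $re ]]; then
    # BASH_REMATCH[1] holds the captured digits and commas; strip the commas.
    echo "${BASH_REMATCH[1]//,/}"
  fi
}

# Example with a line as it would appear after `grep -n`:
parse_tput "12:Throughput = 2,407,597 ops/s"   # prints 2407597
```

Applied to the files under `scripts/bench_results`, a loop over `larson_*.txt` could feed each extracted line to this helper to build a small comparison table.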