hakmem/POOL_TLS_QUICKSTART.md
Moe Charm (CI) 1010a961fb Tiny: fix header/stride mismatch and harden refill paths
- Root cause: header-based class indexing (HEADER_CLASSIDX=1) wrote a 1-byte
  header during allocation, but linear carve/refill and initial slab capacity
  still used bare class block sizes. This mismatch could overrun slab usable
  space and corrupt freelists, causing reproducible SEGV at ~100k iters.

Changes
- Superslab: compute capacity with effective stride (block_size + header for
  classes 0..6; class7 remains headerless) in superslab_init_slab(). Add a
  debug-only bound check in superslab_alloc_from_slab() to fail fast if carve
  would exceed usable bytes.
- Refill (non-P0 and P0): use header-aware stride for all linear carving and
  TLS window bump operations. Ensure alignment/validation in tiny_refill_opt.h
  also uses stride, not raw class size.
- Drain: keep existing defense-in-depth for remote sentinel and sanitize nodes
  before splicing into freelist (already present).
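
The capacity change above can be illustrated with a little arithmetic. The following is a rough sketch only, not the code in `superslab_init_slab()`: the function names, the 64 KiB usable size, and the pairing of class 5 with 256-byte blocks are assumptions made for illustration. The 1-byte header for classes 0..6 and the headerless 1024B class7 come from the description above.

```bash
# Hypothetical model of header-aware slab capacity (illustrative names and sizes).
HEADER_BYTES=1

# args: class_idx block_size -> effective stride in bytes
effective_stride() {
    if [ "$1" -le 6 ]; then
        echo $(( $2 + HEADER_BYTES ))   # classes 0..6 carry a 1-byte header
    else
        echo "$2"                       # class7 stays headerless
    fi
}

# args: usable_bytes class_idx block_size -> number of blocks that fit
slab_capacity() {
    echo $(( $1 / $(effective_stride "$2" "$3") ))
}

slab_capacity 65536 5 256     # assumed class 5 = 256B: stride 257
slab_capacity 65536 7 1024    # class7: stride 1024
```

Under these assumed numbers, a capacity computed from the bare block size would be 65536 / 256 = 256 blocks, but carving 256 blocks at stride 257 needs 65792 bytes, which is the kind of overrun the fix is meant to prevent.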

Notes
- This unifies the memory layout across alloc/linear-carve/refill with a single
  stride definition and keeps class7 (1024B) headerless as designed.
- Debug builds add fail-fast checks; release builds remain lean.

Next
- Re-run Tiny benches (256/1024B) in debug to confirm stability, then in
  release. If any remaining crash persists, bisect with HAKMEM_TINY_P0_BATCH_REFILL=0
  to isolate P0 batch carve, and continue reducing branch-miss as planned.
2025-11-09 18:55:50 +09:00

# Pool TLS Phase 1.5a - Quick Start Guide
Pool TLS Phase 1.5a is a TLS arena implementation that speeds up memory allocations in the 8KB-52KB range.
## 🚀 Quick Start
### 1. Development cycle (the easiest way!)
```bash
# Run Build + Verify + Smoke Test in a single step
./dev_pool_tls.sh test
# Result:
# ✅ All checks passed!
```
### 2. Running the benchmark
```bash
# Compare Pool TLS performance against system malloc
./run_pool_bench.sh
# Example output:
# HAKMEM (Pool TLS): 1790000 ops/s
# System malloc: 189000 ops/s
# Performance ratio: 947% (9.47x)
# 🏆 HAKMEM WINS!
```
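The performance ratio in the sample output is the HAKMEM throughput divided by the system malloc throughput. As a minimal sketch of that arithmetic (the helper `perf_ratio` is hypothetical and not part of `run_pool_bench.sh`; the two figures are taken from the sample output above):

```bash
# Hypothetical helper: derive the ratio from two throughput figures in ops/s.
perf_ratio() {
    awk -v h="$1" -v s="$2" 'BEGIN {
        if (s <= 0) { print "n/a"; exit 1 }   # avoid division by zero
        r = h / s
        printf "%.0f%% (%.2fx)\n", r * 100, r
    }'
}

perf_ratio 1790000 189000   # prints: 947% (9.47x)
```

This can be handy for sanity-checking numbers copied from separate runs, assuming both figures were measured with the same benchmark parameters.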
### 3. Building individual targets
```bash
# Build with Pool TLS Phase 1.5a enabled
./build_pool_tls.sh bench_mid_large_mt_hakmem
./build_pool_tls.sh larson_hakmem
./build_pool_tls.sh bench_random_mixed_hakmem
```
## 📋 Scripts
| Script | Purpose | Usage |
|--------|---------|-------|
| `dev_pool_tls.sh` | Integrated development cycle | `./dev_pool_tls.sh test` |
| `build_pool_tls.sh` | Pool TLS build | `./build_pool_tls.sh <target>` |
| `run_pool_bench.sh` | Performance benchmark | `./run_pool_bench.sh` |
| `build.sh` | General-purpose build (written by ChatGPT) | `./build.sh <target>` |
| `verify_build.sh` | Build verification (written by ChatGPT) | `./verify_build.sh <binary>` |
## 🎯 Recommended Workflow
### When changing code
```bash
# 1. Edit the code
vim core/pool_tls_arena.c
# 2. Quick test (5-10 seconds)
./dev_pool_tls.sh test
# 3. If it passes, run the detailed benchmark
./run_pool_bench.sh
```
### When debugging
```bash
# 1. Debug build
./build_debug.sh bench_mid_large_mt_hakmem gdb
# 2. Run under GDB
gdb ./bench_mid_large_mt_hakmem
(gdb) run 1 100 256 42
```
### Clean build
```bash
# Remove all build outputs and rebuild
./dev_pool_tls.sh clean
./dev_pool_tls.sh build
```
## 🔧 Enabled Features
A Pool TLS build automatically enables the following:

- `POOL_TLS_PHASE1=1` - Pool TLS Phase 1.5a (8-52KB)
- `HEADER_CLASSIDX=1` - Phase 7 header-based free
- `AGGRESSIVE_INLINE=1` - Phase 7 aggressive inlining
- `PREWARM_TLS=1` - Phase 7 TLS cache pre-warming

**No need to worry about forgetting a flag!** The scripts set all of them.
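
As a rough sketch of what "the scripts set all of them" could look like, the snippet below assembles a `make` command line from the four flags listed above. This is an assumption for illustration, not the contents of `build_pool_tls.sh`: the variable `POOL_TLS_FLAGS`, the function `pool_tls_make_cmd`, and the idea that the Makefile accepts these flags as command-line variable overrides are all hypothetical.

```bash
# Sketch only: the real build_pool_tls.sh may be organized differently.
# Flags taken from the list above; passing them as make variables is an assumption.
POOL_TLS_FLAGS="POOL_TLS_PHASE1=1 HEADER_CLASSIDX=1 AGGRESSIVE_INLINE=1 PREWARM_TLS=1"

# Print (rather than run) the make command line for a given target.
pool_tls_make_cmd() {
    printf 'make %s %s\n' "$POOL_TLS_FLAGS" "$1"
}

pool_tls_make_cmd bench_mid_large_mt_hakmem
```

Printing the command instead of executing it keeps the sketch safe to run anywhere. After a real build, `make print-flags` from the troubleshooting section is the way to see what was actually set.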
## 📊 Performance Targets
| Phase | Target | Current status |
|-------|--------|----------------|
| Phase 1.5a (baseline) | 1-2M ops/s | ✅ 1.79M ops/s |
| Phase 1.5b (optimized) | 5-15M ops/s | 🚧 In development |
| Phase 2 (learning) | 15-30M ops/s | 📅 Planned |
## ❓ Troubleshooting
### Build errors
```bash
# Check the flags
make print-flags
# Clean build
./dev_pool_tls.sh clean
./dev_pool_tls.sh build
```
### Poor performance
```bash
# Verify the build (check that the binary is not stale)
./verify_build.sh bench_mid_large_mt_hakmem
# Rebuild
./build_pool_tls.sh bench_mid_large_mt_hakmem
```
### SEGV crashes
```bash
# Debug build
./build_debug.sh bench_mid_large_mt_hakmem gdb
# Run under gdb
gdb ./bench_mid_large_mt_hakmem
(gdb) run 1 100 256 42
(gdb) bt
```
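## 📝 Development Notes
- **Dependency tracking**: detected automatically via `-MMD -MP` (implemented by ChatGPT)
- **Flag consistency check**: verified automatically by the Makefile (implemented by ChatGPT)
- **Build verification**: `verify_build.sh` checks timestamps (implemented by ChatGPT)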
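
A timestamp check of the kind mentioned for build verification can be sketched as follows. This is a hypothetical illustration, not the contents of `verify_build.sh`: the function name `is_binary_fresh` and the choice to compare against `*.c` and `*.h` files under a source directory are assumptions.

```bash
# Hypothetical staleness check: succeed only if no source file is newer than the binary.
# Returns 0 = fresh, 1 = stale, 2 = binary or source directory missing.
is_binary_fresh() {
    [ -f "$1" ] || return 2
    [ -d "$2" ] || return 2
    # find -newer lists files modified more recently than the reference file
    newer=$(find "$2" -type f \( -name '*.c' -o -name '*.h' \) -newer "$1" | head -n 1)
    [ -z "$newer" ]
}

if is_binary_fresh ./bench_mid_large_mt_hakmem core; then
    echo "binary is up to date"
else
    echo "binary is stale or missing: rebuild"
fi
```

A check like this only compares modification times, so it would not notice a binary built with different flags. That case is what the Makefile's flag consistency check is described as covering.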
## 🎓 Detailed Documentation
- `CLAUDE.md` - Development history
- `POOL_TLS_INVESTIGATION_FINAL.md` - Phase 1.5a investigation report
- `Makefile` - Build system details