# CURRENT TASK - Critical Bugs Fixed

**Last Updated**: 2025-11-29
**Branch**: `master` @ 6d40dc741
**Scope**: Header corruption and segfault root-caused and fixed; full stability achieved

---


## 🎉 2025-11-29 UPDATE: CRITICAL BUGS RESOLVED

### ✅ Completed Fixes

#### 1. Header Corruption Bug (Class 1) - **Root cause fixed**
- **Symptom**: `[TLS_SLL_HDR_RESET] cls=1 got=0x00 expect=0xa1`
- **Cause**: Two freelist → TLS SLL paths did not restore the header
- **Fix**: Added header restoration to both paths
- **Result**: Header corruption **completely eliminated** in 20-thread Larson ✅

**Commits:**
- `3c6c76cb1` - box_carve_and_push_with_freelist() fix
- `a94344c1a` - tiny_drain_freelist_to_sll_once() fix

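
The header-restoration fix can be illustrated with a minimal sketch. It is not the project's code: the type `tls_sll_t`, the helper names, the `0xa0 | class` encoding (inferred only from `cls=1 ... expect=0xa1`), and the placement of the freelist link at byte 0 and the SLL link at byte 1 are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed encoding: the log line only shows that class 1 expects 0xa1. */
#define HDR_MAGIC      0xa0u
#define HDR_CLASS_MASK 0x0fu

/* Hypothetical stand-in for a thread-local singly linked list (TLS SLL). */
typedef struct {
    void    *head;
    unsigned count;
} tls_sll_t;

static uint8_t hdr_expected(unsigned cls) {
    assert(cls <= HDR_CLASS_MASK);
    return (uint8_t)(HDR_MAGIC | cls);
}

/* Assumption: the freelist link lives at byte 0, where it clobbers the header. */
static void *freelist_pop(void **freelist) {
    void *blk = *freelist;
    if (blk != NULL) {
        memcpy(freelist, blk, sizeof(void *));   /* *freelist = blk->next */
    }
    return blk;
}

/* Assumption: the SLL link lives at byte 1, right after the 1-byte header. */
static void sll_push_restoring_header(tls_sll_t *sll, void *blk, unsigned cls) {
    uint8_t *p = (uint8_t *)blk;
    p[0] = hdr_expected(cls);                    /* restore the header before the block is reused */
    memcpy(p + 1, &sll->head, sizeof(void *));   /* unaligned-safe store of the next pointer */
    sll->head = blk;
    sll->count++;
}

/* Move every freelist block to the SLL, restoring each header on the way. */
static unsigned drain_freelist_to_sll(void **freelist, tls_sll_t *sll, unsigned cls) {
    unsigned moved = 0;
    void *blk;
    while ((blk = freelist_pop(freelist)) != NULL) {
        sll_push_restoring_header(sll, blk, cls);
        moved++;
    }
    return moved;
}
```

Without the `p[0] = hdr_expected(cls)` store, a block coming off the freelist would still carry link bytes where the header belongs, which is the kind of mismatch a `got=0x00 expect=0xa1` check reports.
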

#### 2. Segmentation Fault Bug - **Root cause fixed**
- **Symptom**: larson_hakmem crashed intermittently with SEGV (~50% probability)
- **Cause**: superslab_allocate() was implicitly declared as returning `int` → pointer corruption via sign extension
- **Fix**: Added `#include "box/ss_allocation_box.h"` to 2 files
- **Result**: larson_hakmem now runs completely stable ✅

**Commit:**
- `6d40dc741` - Add missing superslab_allocate() declaration

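
The corruption mechanism can be reproduced without an actual missing prototype. The sketch below only models the arithmetic on an LP64 target: when a pointer-returning function is treated as returning `int`, the caller keeps the low 32 bits and sign-extends them back to 64 bits. The function name is hypothetical and the addresses are made-up examples.

```c
#include <assert.h>
#include <stdint.h>

/* Models the value a caller ends up with when a 64-bit pointer is passed
 * through an implicit `int` return type: truncate to 32 bits, then
 * sign-extend when widening back to pointer width.
 * Note: the narrowing conversion is implementation-defined in C; mainstream
 * compilers wrap modulo 2^32, which is what this sketch relies on. */
static uint64_t seen_through_implicit_int(uint64_t real_ptr) {
    int32_t truncated = (int32_t)(uint32_t)real_ptr;
    return (uint64_t)(int64_t)truncated;
}
```

For example, `0x00007f5a90000000` comes back as `0xffffffff90000000` (bit 31 set, so the upper half fills with ones), while `0x00007f5a10000000` comes back as `0x10000000` (upper half lost). Only addresses that fit in 31 bits survive unchanged. Building with GCC or Clang's `-Werror=implicit-function-declaration` turns a missing prototype into a compile error instead of a runtime crash.
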

### 🔬 Task Agent Contributions
- Header corruption: exhaustively surveyed all freelist paths and even found a dead code path
- Segfault: assembly-level analysis with gdb + coredump; identified the sign-extension mechanism

### 📊 Current Stability
| Test | Status |
|------|--------|
| Larson 1T | ✅ Stable (51.95M ops/s) |
| Larson 4T | ✅ Stable (header validation enabled) |
| Larson 20T | ✅ Header corruption 0 errors |
| Random Mixed | ✅ Stable (66.82M ops/s) |
| SuperSlab expansion | ✅ Segfault eliminated |

---


## 📋 Stable Master Established (2025-11-26)

**Branch**: `master` (formerly `larson-master-rebuild`)
**Scope**: Stable master established - Larson working + 67M ops/s

---

## 🎯 Current Status Summary

### ✅ New master performance (stable)

| Benchmark | Performance | Status |
|-----------|-------------|--------|
| Larson 1T | **51.95M ops/s** | ✅ Stable (0% crash) |
| Random Mixed 256B | **66.82M ops/s** | ✅ Stable |

**Branch**: `master` @ d26dd092b
**Architecture**: E1-CORRECT (C0,C7 offset=0; C1-C6 offset=1)

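
The per-class offset rule can be written down as a small lookup. This is a sketch only: the enum and function names are hypothetical, the class count of 8 (C0..C7) is assumed, and the source does not say which field the offset applies to, so the helper just returns the byte offset. The all-classes-offset-1 case corresponds to the UNIFIED-HEADER layout of the preserved old master.

```c
#include <assert.h>
#include <stddef.h>

enum { TINY_NUM_CLASSES = 8 };  /* assumed: classes C0..C7 */

/* Hypothetical names for the two layouts mentioned in this document. */
typedef enum { LAYOUT_E1_CORRECT, LAYOUT_UNIFIED_HEADER } tiny_layout_t;

/* Byte offset used for a size class under the given layout. */
static size_t tiny_class_offset(tiny_layout_t layout, int class_idx) {
    assert(class_idx >= 0 && class_idx < TINY_NUM_CLASSES);
    if (layout == LAYOUT_UNIFIED_HEADER) {
        return 1;                                   /* every class uses offset 1 */
    }
    return (class_idx == 0 || class_idx == 7) ? 0 : 1;  /* E1-CORRECT */
}
```

Keeping the rule in one function makes the difference between the two layouts a one-line change rather than a condition repeated at every call site.
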

### 📚 Old master preserved (for reference)
- **Branch**: `master-80M-unstable` @ 328a6b722
- Random Mixed: ~80M ops/s
- Larson: **100% crash** (Step 2.5 bug)
- Architecture: UNIFIED-HEADER (all classes offset=1)
- **Path to 80M**: see `PERFORMANCE_HISTORY_62M_TO_80M.md`

---


## 📋 Work Plan

### Phase 0: Establish stable baseline ✅ DONE
- [x] Create `larson-master-rebuild` from the `larson-fix` branch
- [x] Verify Larson runs (51M ops/s)
- [x] Verify Random Mixed runs (62M ops/s)

### Phase 1: Cleanup & stabilization ✅ DONE
**Goal**: Tidy the codebase while it is in a stable state

#### 1.1 Cherry-picked (7 commits)
- [x] `9793f17d6` Remove legacy code (-1,159 LOC)
- [x] `cc0104c4e` Remove test files (-1,750 LOC)
- [x] `416930eb6` Remove backup files (-1,072 KB)
- [x] `225b6fcc7` Remove dead code: UltraHot, RingCache, etc. (-1,844 LOC)
- [x] `2c99afa49` Learning system bug documentation
- [x] `328a6b722` Larson bug analysis update
- [x] `0143e0fed` Add CONFIGURATION.md

#### 1.2 Additional optimizations
- [x] `a2e65716b` Inline tiny_get_max_size (+2M ops/s expected)
- [x] `d35504163` Port Superslab Min-Keep (later reverted)
- [x] `bea839add` Revert Min-Keep (Larson stabilization)
- [x] `d26dd092b` Create Performance History document

#### 1.3 Establish master
- [x] Back up old master as `master-80M-unstable`
- [x] Update master branch to the stable version (d26dd092b)
- [x] Confirm Larson 0% crash (51.95M ops/s)
- [x] Confirm Random Mixed 67M ops/s

### Phase 2: Port performance optimizations 📊 PENDING
**Goal**: Recover 62M → 80M+ ops/s

#### 2.1 Easy tuning (independent, low risk)
- [ ] `e81fe783d` Inline tiny_get_max_size (+2M)
- [ ] `04a60c316` Superslab/SharedPool tuning (+1M)
- [ ] `392d29018` Unified Cache capacity tuning (+1M)
- [ ] `dcd89ee88` Stage 1 lock-free (+0.3M)

#### 2.2 Main target (UNIFIED-HEADER)
- [ ] `472b6a60b` Phase UNIFIED-HEADER (+17%, unify C7 header)
- [ ] `d26519f67` UNIFIED-HEADER bug fix (+15-41%)
- [ ] `165c33bc2` Larson fallback fix (if needed)

#### 2.3 To skip
- ❌ `03d321f6b` Phase 27 Ultra-Inline → **10-15% regression**
- ❌ Step 2.5-related commits → **cause of the Larson crash**

### Phase 3: Verification & merge 🔀 PENDING
- [ ] Larson benchmark, 10-run average
- [ ] Random Mixed benchmark, 10-run average
- [ ] Update master branch

---


## 🔍 Root Cause Analysis

### Cause of the Larson crash
**First Bad Commit**: `19c1abfe7` "Fix Unified Cache TLS SLL bypass"

Step 2.5 was added to "fix" TLS_SLL_PUSH_DUP, but:
1. TLS_SLL_PUSH_DUP does not actually occur (tested over 10M iterations on the base)
2. Step 2.5 causes a cross-thread ownership problem in multithreaded environments
3. Conclusion: **an unnecessary "fix" broke Larson**

### Main contributors to 80M
| Commit | Description | Gain |
|---------|------|--------|
| `472b6a60b` | UNIFIED-HEADER (C7 unification) | **+17%** |
| `d26519f67` | UH bug fix | +15-41% |
| Other tuning | inline, policy, etc. | +4-5M |

---


## 📁 Related Files

### To be modified
- `core/front/tiny_unified_cache.c` - keep as-is, without Step 2.5
- `core/tiny_free_fast_v2.inc.h` - LARSON_FIX related
- `core/box/ptr_conversion_box.h` - to be changed for UNIFIED-HEADER

### Documentation
- `LEARNING_SYSTEM_BUGS_P0.md` - learning system bug log
- `CONFIGURATION.md` - ENV variable reference
- `PROFILES.md` - performance profiles

---


## ✅ Completed Milestones

1. **Larson stabilization** - running at 51M ops/s ✅
2. **Cherry-pick Phase 1** - 7 commits done ✅
3. **Baseline established** - stable at 62M/51M ✅

---

## Phase FREE-FRONT-V3-1: Free route snapshot infrastructure + build fix

### Summary
Implemented the Phase FREE-FRONT-V3 infrastructure to optimize the free hot path by:
1. Creating a snapshot-based route decision table (consolidating route logic)
2. Removing redundant ENV checks from the hot path
3. Preparing for future integration into hak_free_at()

### Key Changes
1. New files:
   - `core/box/free_front_v3_env_box.h`: Route snapshot definition & API
   - `core/box/free_front_v3_env_box.c`: Snapshot initialization & caching
2. Infrastructure details:
   - FreeRouteSnapshotV3: Maps class_idx → free_route_kind for all 8 classes
   - Routes defined: LEGACY, TINY_V3, CORE_V6_C6, POOL_V1
   - ENV-gated initialization (`HAKMEM_TINY_FREE_FRONT_V3_ENABLED`, default OFF)
   - Per-thread TLS caching to avoid repeated ENV reads
3. Design goals:
   - Consolidate tiny_route_for_class() results into the snapshot table
   - Remove C7 ULTRA / v4 / v5 / v6 ENV checks from the hot path
   - Limit lookup (ss_fast_lookup/slab_index_for) to paths that truly need it
   - Clear ownership boundary: front v3 handles routing, downstream handles free
4. Phase plan:
   - v3-1 ✅ COMPLETE: Infrastructure (snapshot table, ENV initialization, TLS cache)
   - v3-2 (INFRASTRUCTURE ONLY): Placeholder integration in hak_free_api.inc.h
   - v3-3 (FUTURE): Full integration + benchmark A/B to measure hot path improvement
5. Build fix:
   - Added missing `core/box/c7_meta_used_counter_box.o` to OBJS_BASE in the Makefile
   - This symbol was referenced but not linked, causing undefined reference errors
   - Benchmark targets now build cleanly without LTO

### Status
- Build: ✅ PASS (bench_allocators_hakmem builds without errors)
- Integration: Currently DISABLED (default OFF, ready for the v3-2 phase)
- No performance impact: Infrastructure-only, hot path unchanged

### Future Work
- Phase v3-2: Integrate snapshot routing into the hak_free_at() main path
- Phase v3-3: Measure free hot path performance improvement (target: 1-2% fewer branch mispredictions)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

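
A snapshot table of this kind might look roughly like the sketch below. Only the route names and the ENV variable name come from the notes above; the struct layout, function names, the "value starts with `1`" ENV convention, and the policy of routing class 6 to `CORE_V6_C6` when enabled are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum { FREE_NUM_CLASSES = 8 };

typedef enum {
    FREE_ROUTE_LEGACY,
    FREE_ROUTE_TINY_V3,
    FREE_ROUTE_CORE_V6_C6,
    FREE_ROUTE_POOL_V1
} free_route_kind_t;

/* Hypothetical layout: one route per size class plus init state. */
typedef struct {
    free_route_kind_t route[FREE_NUM_CLASSES];
    bool enabled;
    bool initialized;
} free_route_snapshot_t;

/* Pure builder, so the routing policy can be tested without touching ENV. */
static free_route_snapshot_t free_route_snapshot_build(bool enabled) {
    free_route_snapshot_t s;
    for (int c = 0; c < FREE_NUM_CLASSES; c++) {
        s.route[c] = FREE_ROUTE_LEGACY;        /* default OFF: everything stays legacy */
    }
    if (enabled) {
        s.route[6] = FREE_ROUTE_CORE_V6_C6;    /* illustrative policy only */
    }
    s.enabled = enabled;
    s.initialized = true;
    return s;
}

static _Thread_local free_route_snapshot_t g_free_snapshot;

/* Reads the ENV variable once per thread, then serves the cached table. */
static const free_route_snapshot_t *free_route_snapshot(void) {
    if (!g_free_snapshot.initialized) {
        const char *v = getenv("HAKMEM_TINY_FREE_FRONT_V3_ENABLED");
        g_free_snapshot = free_route_snapshot_build(v != NULL && v[0] == '1');
    }
    return &g_free_snapshot;
}

/* Hot-path lookup: a single array index, no ENV access. */
static free_route_kind_t free_route_for_class(int class_idx) {
    assert(class_idx >= 0 && class_idx < FREE_NUM_CLASSES);
    return free_route_snapshot()->route[class_idx];
}
```

After the first call on a thread, each routing decision costs one TLS load and one array index, which is the property the design goals above are after.
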

## Phase FREE-LEGACY-OPT Series (2025-12-11)

### Phase FREE-LEGACY-OPT-4-1: Legacy per-class analysis ✅ Done

**Purpose**: Break down the 49.2% Legacy fallback by class

**Measurements (Mixed 16-1024B)**:
- **C6 (513-1024B)**: 51.4% (137,319 / 266,942 Legacy calls)
- C5 (257-512B): 25.8%
- C4 (129-256B): 13.0%
- C3 (65-128B): 6.5%
- C2 (33-64B): 3.3%
- C0/C1/C7: 0.0%

**Largest target**: C6 accounts for the majority of Legacy calls

**Details**: see `docs/analysis/FREE_LEGACY_PATH_ANALYSIS.md`

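
The percentages above can be cross-checked with a few lines of arithmetic. The helper names are hypothetical; the inputs are the figures quoted in this section. The second helper gives a best-case estimate, assuming every C6 free leaves the Legacy path and nothing else changes.

```c
#include <assert.h>

/* Share of `part` in `total`, in percent. */
static double share_percent(unsigned long part, unsigned long total) {
    assert(total > 0);
    return 100.0 * (double)part / (double)total;
}

/* Legacy fallback rate left over if one class's share is removed entirely. */
static double legacy_after_removing(double legacy_pct, double class_share_pct) {
    return legacy_pct * (1.0 - class_share_pct / 100.0);
}
```

`share_percent(137319, 266942)` comes out at roughly 51.4%, matching the C6 line, and `legacy_after_removing(49.2, 51.4)` is roughly 23.9%. That is an upper bound on what a C6-only change could remove, since any C6 free that still falls back would keep the Legacy rate higher.
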

### Phase FREE-LEGACY-OPT-4-2: C6_ULTRA_FREE_BOX implementation (in progress)

**Purpose**: Catch only C6 frees in a C7 ULTRA-style TLS cache and halve the Legacy fallback

**Scope**:
- C6 only, free only (alloc stays on the existing route)
- TLS holds `c6_freelist[32]` + `c6_count` + segment range check
- ENV: `HAKMEM_TINY_C6_ULTRA_FREE_ENABLED=0` (research box, default OFF)

**Expected effect**:
- Legacy fallback: 49.2% → 24-27% (removing the C6 share)
- Mixed throughput: +5-8% improvement (44.8M → 47-48M ops/s)

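
A free-only TLS cache with a range check could be sketched as follows. The field names `c6_freelist` and `c6_count` and the capacity of 32 come from the scope list above; the struct name, the `seg_base`/`seg_end` fields, and both function names are assumptions, and the ENV gate is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { C6_ULTRA_CAP = 32 };

/* Hypothetical per-thread state; in real use this would be a TLS variable. */
typedef struct {
    void     *c6_freelist[C6_ULTRA_CAP];
    uint8_t   c6_count;
    uintptr_t seg_base;   /* assumed: start of the segment this cache accepts */
    uintptr_t seg_end;    /* assumed: one past the end of that segment */
} c6_ultra_tls_t;

static void c6_ultra_bind_segment(c6_ultra_tls_t *tls, void *base, size_t len) {
    tls->seg_base = (uintptr_t)base;
    tls->seg_end  = (uintptr_t)base + len;
}

/* Returns true if the pointer was cached; false means the caller should
 * take the existing (Legacy) free route instead. */
static bool c6_ultra_free_try(c6_ultra_tls_t *tls, void *ptr) {
    uintptr_t p = (uintptr_t)ptr;
    if (p < tls->seg_base || p >= tls->seg_end) {
        return false;                          /* outside the segment */
    }
    if (tls->c6_count >= C6_ULTRA_CAP) {
        return false;                          /* cache full */
    }
    tls->c6_freelist[tls->c6_count++] = ptr;
    return true;
}
```

Returning `false` instead of handling overflow inside the cache keeps the fast path to two comparisons and one store, and leaves every slow case to the route that already exists.
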

---

## 🎯 Next Actions

### Options at this point

1. **Option A: Keep the status quo (recommended)**
   - master @ 67M ops/s (Larson stable)
   - The 80M findings are preserved in `PERFORMANCE_HISTORY_62M_TO_80M.md` and `master-80M-unstable`
   - Phase 2 (performance optimization) is deferred as future work

2. **Option B: Port UNIFIED-HEADER (high difficulty)**
   - Main contributor to 80M (+17% and +15-41%)
   - Has compatibility issues with E1-CORRECT
   - Requires a large-scale rewrite
   - Details: `PERFORMANCE_HISTORY_62M_TO_80M.md`, section "Option 3"

3. **Option C: Revert Step 2.5 (already failed)**
   - Revert Step 2.5 from master-80M-unstable
   - Failed 35+ times due to a complex conflict (33 changed lines)
   - Not recommended