Update CLAUDE.md: Document 2025-11-21 bug fixes and performance status

## Updates

### Current Performance (2025-11-21)
- **HAKMEM**: 9.3M ops/s (Random Mixed 256B, 100K iterations)
- **System malloc**: 58.8M ops/s (baseline)
- **Performance gap**: 6.3x slower (15.8% of target)

### Bug Fixes Completed Today
1. **C7 Stride Upgrade Fix**
   - Fixed local stride table in tiny_block_stride_for_class() (1024→2048)
   - Disabled false positive NXT_MISALIGN checks
   - Removed redundant geometry validations

2. **C7 TLS SLL Corruption Fix**
   - Changed C7 offset from 1→0 (protect next pointer from user data)
   - Limited header restoration to C1-C6 only
   - Removed premature slab release from drain path

3. **Result**: 100% corruption elimination (0 errors / 200K iterations) 
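
The offset change in fix 2 can be illustrated with a minimal intrusive free-list sketch. This is not the HAKMEM implementation: `sll_t`, `sll_push`, `sll_pop`, and `next_offset_for_class` are hypothetical names, and the rule "C1-C6 store the next pointer after a 1-byte header, every other class stores it at offset 0" is an assumption inferred from the notes above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical rule (assumption, not taken from the HAKMEM source):
 * classes C1-C6 keep a 1-byte header at offset 0, so their next pointer
 * lives at offset 1; other classes (including C7) store it at offset 0. */
static size_t next_offset_for_class(int cls) {
    return (cls >= 1 && cls <= 6) ? 1 : 0;
}

/* Hypothetical per-class singly linked free list. */
typedef struct {
    void    *head;
    uint32_t count;
} sll_t;

/* Push a free block: write the old head into the block at the class offset. */
static void sll_push(sll_t *list, int cls, void *block) {
    assert(block != NULL);
    size_t off = next_offset_for_class(cls);
    /* memcpy avoids an unaligned pointer store when off == 1. */
    memcpy((unsigned char *)block + off, &list->head, sizeof(void *));
    list->head = block;
    list->count++;
}

/* Pop the most recently pushed block, or NULL if the list is empty. */
static void *sll_pop(sll_t *list, int cls) {
    void *block = list->head;
    if (block == NULL) {
        return NULL;
    }
    size_t off = next_offset_for_class(cls);
    void *next;
    memcpy(&next, (unsigned char *)block + off, sizeof(void *));
    list->head = next;
    list->count--;
    return block;
}
```

In this sketch the same offset must be used on push and pop for a given class. A mismatch between the two, or a write that lands on the stored pointer while the block sits in the list, corrupts the chain.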

### Performance Concern
- **Previous**: 25.1M ops/s (Phase 3d-C, 2025-11-20)
- **Current**: 9.3M ops/s (after bug fixes, 2025-11-21)
- **Drop**: -63% performance regression ⚠️

**Possible causes**:
- C7 offset=0 overhead (header sacrifice impact?)
- TLS SLL drain changes
- Measurement variance (System malloc: 90M→58.8M)

**Next action**: Investigate performance drop root cause
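
Because the System malloc baseline also moved between the two runs (90M→58.8M), it may help to compare throughput as a fraction of the baseline measured in the same run. The sketch below uses only the figures reported above; the function names are hypothetical. With these numbers the raw drop is about 63%, the baseline itself dropped about 35%, and the baseline-normalized drop is about 43%. That would suggest part of the regression comes from measurement conditions, if the baseline shift reflects machine-wide conditions rather than noise.

```c
#include <assert.h>
#include <stdio.h>

/* Throughput expressed as a fraction of the baseline from the same run. */
static double fraction_of_baseline(double ops, double baseline_ops) {
    assert(baseline_ops > 0.0);
    return ops / baseline_ops;
}

/* Relative change from `before` to `after`; negative values mean a drop. */
static double relative_change(double before, double after) {
    assert(before > 0.0);
    return (after - before) / before;
}

/* Print raw, baseline, and normalized changes for the two reported runs. */
static void print_regression_summary(void) {
    /* Figures in M ops/s, copied from the commit message. */
    const double hakmem_before = 25.1, system_before = 90.0;
    const double hakmem_after  = 9.3,  system_after  = 58.8;

    double frac_before = fraction_of_baseline(hakmem_before, system_before);
    double frac_after  = fraction_of_baseline(hakmem_after, system_after);

    printf("raw HAKMEM change:      %.1f%%\n",
           100.0 * relative_change(hakmem_before, hakmem_after));
    printf("System malloc change:   %.1f%%\n",
           100.0 * relative_change(system_before, system_after));
    printf("normalized change:      %.1f%%\n",
           100.0 * relative_change(frac_before, frac_after));
}
```

Calling `print_regression_summary()` from a `main` function prints the three percentages. Re-running both benchmarks back to back on the same machine would be a more direct way to separate variance from a real regression.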

📝 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Moe Charm (CI)
2025-11-21 23:49:59 +09:00
parent 8b67718bf2
commit e850e7cc42


@@ -11,15 +11,42 @@
---
## 📊 Current Performance (2025-11-20)
## 📊 Current Performance (2025-11-21)
### Benchmark Results (Random Mixed 256B)
```
HAKMEM (Phase 3d-C): 25.1M ops/s (+11.1% vs Phase 3d-B) ✅
System malloc: 90M ops/s (baseline)
Performance gap: 3.6x slower (27.9% of target)
HAKMEM (after bug fixes): 9.3M ops/s ⚠️
System malloc: 58.8M ops/s (baseline)
Performance gap: 6.3x slower (15.8% of target)
```
### 🔧 Today's Fixes (2025-11-21)
1. **C7 Stride Upgrade Fix**: Complete fix for the 1024B→2048B stride migration
   - Found and fixed a missed update in the local stride table
   - Disabled the false-positive NXT_MISALIGN check
   - Removed redundant geometry validation
2. **C7 TLS SLL Corruption Fix**: Prevents user data from overwriting the next pointer
   - Changed C7 offset from 1→0 (isolates the next pointer outside the user-accessible region)
   - Limited header restoration to C1-C6 only
   - Removed premature slab release
3. **Result**: 100% corruption elimination (0 errors / 200K iterations)
### ⚠️ Performance Regression Concern
```
Phase 3d-C (2025-11-20): 25.1M ops/s (27.9% of System)
Today, after bug fixes: 9.3M ops/s (15.8% of System)
Change: -63% drop
```
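**Possible causes**:
- Impact of C7 offset=0 (overhead from sacrificing the header)
- Impact of the TLS SLL drain changes
- Measurement variance (System malloc: 90M→58.8M)
**Next action**: The cause of the performance drop needs investigation 🔍
### Phase 3d Series Results 🎯
1. **Phase 3d-A (SlabMeta Box)**: Established Box boundaries by encapsulating metadata access
2. **Phase 3d-B (TLS Cache Merge)**: 22.6M ops/s, with L1D locality improved by merging g_tls_sll[]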