Claude
3429ed4457
Phase 6-7: Dual Free Lists (Phase 2) - Mixed results
Implementation:
Separate alloc/free paths to reduce cache line bouncing (mimalloc's strategy).
Changes:
1. Added g_tiny_fast_free_head[] - separate free staging area
2. Modified tiny_fast_alloc() - lazy migration from free_head
3. Modified tiny_fast_free() - push to free_head (separate cache line)
4. Modified tiny_fast_drain() - drain from free_head
Key design (inspired by mimalloc):
- alloc_head: Hot allocation path (g_tiny_fast_cache)
- free_head: Local frees staging (g_tiny_fast_free_head)
- Migration: Pointer swap when alloc_head empty (zero-cost batching)
- Benefit: alloc/free touch different cache lines → reduce bouncing
Results (Larson 2s 8-128B 1024):
- Phase 3 baseline: ST 0.474M, MT 1.712M ops/s
- Phase 2: ST 0.600M, MT 1.624M ops/s
- Change: **+27% ST, -5% MT** ⚠️
Analysis - Mixed results:
✅ Single-thread: +27% improvement
- Better cache locality (alloc/free separated)
- No contention, pure memory access pattern win
❌ Multi-thread: -5% regression (expected +30-50%)
- Migration logic overhead (extra branches)
- Dual arrays increase TLS size → more cache misses?
- Pointer swap cost on migration path
- May not help in Larson's specific access pattern
Comparison to system malloc:
- Current: 1.624M ops/s (MT)
- System: ~7.2M ops/s (MT)
- **Gap: Still 4.4x slower**
Key insights:
1. mimalloc's dual free lists help with *cross-thread* frees
2. Larson may be mostly *same-thread* frees → less benefit
3. Migration overhead > cache line bouncing reduction
4. ST improvement shows memory locality matters
5. Need to profile actual malloc/free patterns in Larson
Why mimalloc succeeds but HAKMEM doesn't:
- mimalloc has sophisticated remote free queue (lock-free MPSC)
- HAKMEM's simple dual lists don't handle cross-thread well
- Larson's workload may differ from mimalloc's target benchmarks
Next considerations:
- Verify Larson's same-thread vs cross-thread free ratio
- Consider combining all 3 phases (may have synergy)
- Profile with actual counters (malloc vs free hotspots)
- May need fundamentally different approach
2025-11-05 05:35:06 +00:00
..
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 05:10:02 +00:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 05:35:06 +00:00
2025-11-05 05:35:06 +00:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00
2025-11-05 12:31:14 +09:00