# Atomic Freelist Quick Start Guide

## TL;DR

**Problem**: The 589 freelist access sites originally feared turn out to be **90 sites** (much better!)
**Solution**: Hybrid approach: lock-free CAS for hot paths, relaxed atomics for cold paths
**Effort**: 5-8 hours across 3 phases (6-9 hours including prep and header setup)
**Risk**: Low (incremental, easy rollback)
**Impact**: 2-3% slower single-threaded, in exchange for multi-threaded (MT) stability

---

## Step-by-Step Implementation

### Step 1: Read Documentation (15 min)

1. **Strategy**: `ATOMIC_FREELIST_IMPLEMENTATION_STRATEGY.md`
   - Accessor function design
   - Memory ordering rationale
   - Performance projections

2. **Site Guide**: `ATOMIC_FREELIST_SITE_BY_SITE_GUIDE.md`
   - File-by-file conversion instructions
   - Common pitfalls
   - Testing checklist

3. **Analysis**: Run `scripts/analyze_freelist_sites.sh`
   - Validates site counts
   - Shows operation breakdown
   - Estimates effort

---

### Step 2: Create Accessor Header (30 min)

```bash
# Copy template to working file
cp core/box/slab_freelist_atomic.h.TEMPLATE core/box/slab_freelist_atomic.h

# Add the tiny_next_ptr_box.h include at the top of the new header
# (appending with >> would place it after the code that depends on it)
sed -i '1i#include "tiny_next_ptr_box.h"' core/box/slab_freelist_atomic.h

# Verify compile (grep prints nothing when there are no errors)
make clean
make bench_random_mixed_hakmem 2>&1 | grep -i error
```

**Expected**: Clean compile (no errors)

---

### Step 3: Phase 1 - Hot Paths (2-3 hours)

#### 3.1 Convert NULL Checks (30 min)

**Pattern**: `if (meta->freelist)` → `if (slab_freelist_is_nonempty(meta))`

**Files**:
- `core/tiny_superslab_alloc.inc.h` (4 sites)
- `core/hakmem_tiny_refill_p0.inc.h` (1 site)
- `core/box/carve_push_box.c` (2 sites)
- `core/hakmem_tiny_tls_ops.h` (2 sites)

**Commands**:
```bash
# Add include at top of each file
# For tiny_superslab_alloc.inc.h:
sed -i '1i#include "box/slab_freelist_atomic.h"' core/tiny_superslab_alloc.inc.h

# Replace NULL checks (review carefully!)
# Do this manually - automated sed is too risky
```

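For orientation, here is a minimal sketch of what the NULL-check accessor could look like. The struct is a stand-in with only the one field the sketch needs, and the relaxed memory order is an assumption; the real definition lives in `slab_freelist_atomic.h.TEMPLATE` and the strategy document is the authority on ordering.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for TinySlabMeta: only the field this sketch needs (assumption). */
typedef struct {
    _Atomic(void*) freelist;  /* head of the intrusive free list */
} TinySlabMeta;

/* Relaxed load, so the result is only a hint: another thread may empty or
 * refill the list right after this returns. Callers must still check the
 * result of the pop that follows. */
static inline bool slab_freelist_is_nonempty(TinySlabMeta* meta) {
    return atomic_load_explicit(&meta->freelist, memory_order_relaxed) != NULL;
}
```

Because the check is only a hint, a converted site typically keeps its existing fallback path for the case where the subsequent pop returns NULL.
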

---

#### 3.2 Convert POP Operations (1 hour)

**Pattern**:
```c
// BEFORE:
void* block = meta->freelist;
meta->freelist = tiny_next_read(class_idx, block);

// AFTER:
void* block = slab_freelist_pop_lockfree(meta, class_idx);
if (!block) goto fallback;  // Handle race: another thread may have emptied the list
```

**Files**:
- `core/tiny_superslab_alloc.inc.h:117-145` (1 critical site)
- `core/box/carve_push_box.c:173-174` (1 site)
- `core/hakmem_tiny_tls_ops.h:83-85` (1 site)

**Testing after each file**:
```bash
make bench_random_mixed_hakmem
./out/release/bench_random_mixed_hakmem 10000 256 42
```

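To make the AFTER pattern concrete, the following is a rough sketch of a CAS-based pop. It is not the template's implementation: `TinySlabMeta` and `tiny_next_read()` here are simplified stand-ins (the next pointer is assumed to sit in the first word of a free block), and the acquire ordering is an assumption.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for TinySlabMeta: only the field this sketch needs (assumption). */
typedef struct {
    _Atomic(void*) freelist;
} TinySlabMeta;

/* Stand-in for tiny_next_read(): assumes the next pointer is stored in the
 * first word of a free block. The real helper depends on class_idx. */
static inline void* tiny_next_read(int class_idx, const void* block) {
    (void)class_idx;
    return *(void* const*)block;
}

/* Pop the head block, or return NULL if the list is (or becomes) empty. */
static inline void* slab_freelist_pop_lockfree(TinySlabMeta* meta, int class_idx) {
    void* head = atomic_load_explicit(&meta->freelist, memory_order_acquire);
    while (head != NULL) {
        void* next = tiny_next_read(class_idx, head);
        /* On failure, head is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak_explicit(&meta->freelist, &head, next,
                                                  memory_order_acquire,
                                                  memory_order_acquire)) {
            return head;
        }
    }
    return NULL;
}
```

A plain CAS pop like this is exposed to the ABA problem when blocks can be popped and pushed back concurrently. Whether and how the real accessor guards against that is a question for the strategy document, not something this sketch settles.
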

---

#### 3.3 Convert PUSH Operations (1 hour)

**Pattern**:
```c
// BEFORE:
tiny_next_write(class_idx, node, meta->freelist);
meta->freelist = node;

// AFTER:
slab_freelist_push_lockfree(meta, class_idx, node);
```

**Files**:
- `core/box/carve_push_box.c` (6 sites - rollback paths)

**Testing**:
```bash
make bench_random_mixed_hakmem
./out/release/bench_random_mixed_hakmem 100000 256 42
```

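The push side can be sketched the same way. As before, the struct and `tiny_next_write()` are simplified stand-ins (next pointer assumed to be in the first word of the block), and the release/relaxed ordering is an assumption rather than the template's documented choice.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for TinySlabMeta: only the field this sketch needs (assumption). */
typedef struct {
    _Atomic(void*) freelist;
} TinySlabMeta;

/* Stand-in for tiny_next_write(): assumes the next pointer is stored in the
 * first word of the block. The real helper depends on class_idx. */
static inline void tiny_next_write(int class_idx, void* block, void* next) {
    (void)class_idx;
    *(void**)block = next;
}

/* Link node in front of the current head and publish it with a CAS. */
static inline void slab_freelist_push_lockfree(TinySlabMeta* meta, int class_idx,
                                               void* node) {
    void* head = atomic_load_explicit(&meta->freelist, memory_order_relaxed);
    do {
        /* Re-link on every attempt: a failed CAS refreshes head. */
        tiny_next_write(class_idx, node, head);
    } while (!atomic_compare_exchange_weak_explicit(&meta->freelist, &head, node,
                                                    memory_order_release,
                                                    memory_order_relaxed));
}
```

The release on success is what makes the write of the next pointer visible to a thread that later pops the node with an acquire load.
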

---

#### 3.4 Phase 1 Final Test (30 min)

```bash
# Single-threaded baseline
./out/release/bench_random_mixed_hakmem 10000000 256 42
# Record ops/s (expect: 24.4-24.8M, vs 25.1M baseline)

# Multi-threaded stability
make larson_hakmem
./out/release/larson_hakmem 8 100000 256
# Expect: No crashes, ~18-20M ops/s

# Race detection
./build.sh tsan larson_hakmem
./out/tsan/larson_hakmem 4 10000 256
# Expect: No TSan warnings
```

**Success Criteria**:
- ✅ Single-threaded regression <5% (24.0M+ ops/s)
- ✅ Larson 8T stable (no crashes)
- ✅ No TSan warnings
- ✅ Clean build

**If failed**: Roll back and debug
```bash
git diff > phase1.patch  # Save work
git checkout .           # Revert
# Review phase1.patch and fix issues
```

---

### Step 4: Phase 2 - Warm Paths (2-3 hours)

**Scope**: Convert the 40 warm-path sites in 10 files

**Files** (in order of priority):
1. `core/tiny_refill_opt.h` (refill chain ops)
2. `core/tiny_free_magazine.inc.h` (magazine push)
3. `core/refill/ss_refill_fc.h` (FC refill)
4. `core/slab_handle.h` (slab handle ops)
5. Remaining six files (see `ATOMIC_FREELIST_SITE_BY_SITE_GUIDE.md`)

**Testing** (after each file):
```bash
make bench_random_mixed_hakmem
./out/release/bench_random_mixed_hakmem 100000 256 42
```

**Phase 2 Final Test**:
```bash
# All sizes
for size in 128 256 512 1024; do
  ./out/release/bench_random_mixed_hakmem 1000000 "$size" 42
done

# MT scaling
for threads in 1 2 4 8 16; do
  ./out/release/larson_hakmem "$threads" 100000 256
done
```

---

### Step 5: Phase 3 - Cleanup (1-2 hours)

**Scope**: Convert/document remaining 25 sites

#### 5.1 Debug/Stats Sites (30 min)

**Pattern**: `meta->freelist` → `SLAB_FREELIST_DEBUG_PTR(meta)`

**Files**:
- `core/box/ss_stats_box.c`
- `core/tiny_debug.h`
- `core/tiny_remote.c`

---

#### 5.2 Init/Cleanup Sites (30 min)

**Pattern**: `meta->freelist = NULL` → `slab_freelist_store_relaxed(meta, NULL)`

**Files**:
- `core/hakmem_tiny_superslab.c`
- `core/hakmem_smallmid_superslab.c`

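The two cold-path accessors from 5.1 and 5.2 are thin wrappers. A possible shape, assuming both use relaxed ordering and using a one-field stand-in for `TinySlabMeta`, is sketched below; the actual definitions are in the template header.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for TinySlabMeta: only the field this sketch needs (assumption). */
typedef struct {
    _Atomic(void*) freelist;
} TinySlabMeta;

/* Init/cleanup store. Relaxed is enough only while no other thread can
 * observe the slab, e.g. before it is published or after it is retired. */
static inline void slab_freelist_store_relaxed(TinySlabMeta* meta, void* value) {
    atomic_store_explicit(&meta->freelist, value, memory_order_relaxed);
}

/* Debug/stats read: a snapshot for printing or counting, never for
 * dereferencing, since the head may change immediately afterwards. */
#define SLAB_FREELIST_DEBUG_PTR(meta) \
    atomic_load_explicit(&(meta)->freelist, memory_order_relaxed)
```

Keeping these as named accessors, rather than raw field reads, is what lets the final grep in 5.3 distinguish intentional cold-path accesses from unconverted ones.
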

---

#### 5.3 Final Verification (30 min)

```bash
# Full rebuild
make clean && make all

# Run all tests
./run_all_tests.sh

# Check for remaining direct accesses
grep -rn "meta->freelist" core/ --include="*.c" --include="*.h" | \
  grep -v "slab_freelist_" | grep -v "SLAB_FREELIST_DEBUG_PTR"
# Expect: 0 results (all converted or documented)
```

---

## Common Pitfalls

### Pitfall 1: Double-Converting POP
```c
// ❌ WRONG: slab_freelist_pop_lockfree already calls tiny_next_read!
void* p = slab_freelist_pop_lockfree(meta, class_idx);
void* next = tiny_next_read(class_idx, p);  // ❌ BUG!

// ✅ RIGHT: Use p directly
void* p = slab_freelist_pop_lockfree(meta, class_idx);
if (!p) goto fallback;
use(p);  // ✅ CORRECT
```

### Pitfall 2: Forgetting Race Handling
```c
// ❌ WRONG: Assuming pop always succeeds
void* p = slab_freelist_pop_lockfree(meta, class_idx);
use(p);  // ❌ SEGV if p == NULL!

// ✅ RIGHT: Always check for NULL
void* p = slab_freelist_pop_lockfree(meta, class_idx);
if (!p) goto fallback;  // ✅ CORRECT
use(p);
```

### Pitfall 3: Including Header Before Dependencies
```c
// ❌ WRONG: slab_freelist_atomic.h needs tiny_next_ptr_box.h
// (applies when the header does not pull in its dependency itself)
#include "box/slab_freelist_atomic.h"  // ❌ Compile error!
#include "box/tiny_next_ptr_box.h"

// ✅ RIGHT: Dependencies first
#include "box/tiny_next_ptr_box.h"  // ✅ CORRECT
#include "box/slab_freelist_atomic.h"
```

---

## Performance Expectations

### Single-Threaded

| Metric | Before | After | Change |
|--------|--------|-------|--------|
| Random Mixed 256B | 25.1M ops/s | 24.4-24.8M ops/s | -1.2% to -2.8% |
| Larson 1T | 2.76M ops/s | 2.68-2.73M ops/s | -1.1% to -2.9% |

**Acceptable**: <5% regression (relaxed atomics cost ~0%; CAS costs 60-140% more but is rare)

### Multi-Threaded

| Metric | Before | After | Change |
|--------|--------|-------|--------|
| Larson 8T | CRASH | ~18-20M ops/s | ✅ FIXED |
| MT Scaling (8T) | 0% (crashes) | 70-80% | ✅ GAIN |

**Expected**: Stability and MT scalability far outweigh the 2-3% single-threaded cost

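The percentages in the tables are plain ratios, which makes them easy to recompute from your own benchmark output. The helpers below are illustrative only (the function names are made up for this sketch); scaling efficiency is taken to mean N-thread throughput divided by N times the single-thread throughput.

```c
/* Percentage change from baseline to measured throughput.
 * Negative means the measured run is slower. */
static double percent_change(double baseline_ops, double measured_ops) {
    return (measured_ops - baseline_ops) / baseline_ops * 100.0;
}

/* Parallel efficiency in the range 0..1: how close N threads come to
 * N times the single-thread throughput. */
static double scaling_efficiency(double ops_1t, double ops_nt, int threads) {
    return ops_nt / (ops_1t * (double)threads);
}
```

For example, `percent_change(25.1, 24.4)` gives roughly -2.8, the low end of the Random Mixed row, and a 5.6x speedup on 8 threads corresponds to an efficiency of 0.70.
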

---

## Rollback Plan

If Phase 1 fails (>5% regression or instability):

```bash
# Option 1: Revert to master
git checkout master
git branch -D atomic-freelist-phase1

# Option 2: Alternative approach (per-slab spinlock)
# Add uint8_t lock field to TinySlabMeta (1 byte)
# Use __sync_lock_test_and_set() for spinlock (5-10% overhead)
# Guaranteed correctness, simpler implementation
```

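Option 2 can be pictured with the sketch below. It assumes GCC or Clang (for the `__sync_*` builtins), uses a two-field stand-in for `TinySlabMeta`, and assumes the next pointer sits in the first word of a free block; `slab_lock`, `slab_unlock`, and `slab_freelist_pop_locked` are hypothetical names, not existing functions in the codebase.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for TinySlabMeta with the extra 1-byte lock field (assumption). */
typedef struct {
    void*   freelist;
    uint8_t lock;      /* 0 = free, 1 = held */
} TinySlabMeta;

static inline void slab_lock(TinySlabMeta* meta) {
    /* Atomically writes 1 and returns the previous value (acquire barrier).
     * Keep spinning until the previous value was 0. */
    while (__sync_lock_test_and_set(&meta->lock, 1)) {
        /* spin */
    }
}

static inline void slab_unlock(TinySlabMeta* meta) {
    __sync_lock_release(&meta->lock);  /* writes 0 with release semantics */
}

/* Same pop as the BEFORE pattern in 3.2, but serialized by the lock. */
static inline void* slab_freelist_pop_locked(TinySlabMeta* meta) {
    slab_lock(meta);
    void* block = meta->freelist;
    if (block != NULL) {
        meta->freelist = *(void**)block;  /* next pointer in first word */
    }
    slab_unlock(meta);
    return block;
}
```

The appeal of this variant is that the critical section is the original two-line code, so there is no CAS retry loop to reason about; the cost is that every freelist access on a slab is serialized.
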

---

## Success Criteria

### Phase 1
- ✅ Larson 8T runs without crash (100K iterations)
- ✅ Single-threaded regression <5% (24.0M+ ops/s)
- ✅ No ASan/TSan warnings

### Phase 2
- ✅ All MT tests pass (1T, 2T, 4T, 8T, 16T)
- ✅ Single-threaded regression <3% (24.4M+ ops/s)
- ✅ MT scaling 70%+ (8T = 5.6x+ speedup)

### Phase 3
- ✅ All 90 sites converted or documented
- ✅ Full test suite passes (100% pass rate)
- ✅ Zero direct `meta->freelist` accesses (except in `slab_freelist_atomic.h`)

---

## Time Budget

| Phase | Description | Files | Sites | Time |
|-------|-------------|-------|-------|------|
| **Prep** | Read docs, setup | - | - | 15 min |
| **Header** | Create accessor API | 1 | - | 30 min |
| **Phase 1** | Hot paths (critical) | 5 | 25 | 2-3h |
| **Phase 2** | Warm paths (important) | 10 | 40 | 2-3h |
| **Phase 3** | Cold paths (cleanup) | 5 | 25 | 1-2h |
| **Total** | | **21** | **90** | **6-9h** |

**Realistic**: 6-9 hours with testing and debugging

---

## Next Steps

1. **Review strategy** (15 min)
   - `ATOMIC_FREELIST_IMPLEMENTATION_STRATEGY.md`
   - `ATOMIC_FREELIST_SITE_BY_SITE_GUIDE.md`

2. **Run analysis** (5 min)
   ```bash
   ./scripts/analyze_freelist_sites.sh
   ```

3. **Create branch** (2 min)
   ```bash
   git checkout -b atomic-freelist-phase1
   git stash  # Save any uncommitted work
   ```

4. **Create accessor header** (30 min)
   ```bash
   cp core/box/slab_freelist_atomic.h.TEMPLATE core/box/slab_freelist_atomic.h
   # Edit to add includes
   make bench_random_mixed_hakmem  # Test compile
   ```

5. **Start Phase 1** (2-3 hours)
   - Convert 5 files, ~25 sites
   - Test after each file
   - Final test with Larson 8T

6. **Evaluate results**
   - If pass: Continue to Phase 2
   - If fail: Debug or rollback

---

## Support Documents

- **ATOMIC_FREELIST_IMPLEMENTATION_STRATEGY.md** - Overall strategy, performance analysis
- **ATOMIC_FREELIST_SITE_BY_SITE_GUIDE.md** - Detailed conversion instructions
- **core/box/slab_freelist_atomic.h.TEMPLATE** - Accessor API implementation
- **scripts/analyze_freelist_sites.sh** - Automated site analysis

---

## Questions?

**Q: Why not just add a mutex to TinySlabMeta?**
A: A mutex adds 40 bytes of overhead per slab and a 10-20x performance hit. Lock-free CAS is 3-5x faster.

**Q: Why not use a global lock?**
A: It serializes all allocation and kills MT performance. Lock-free allows concurrency.

**Q: Why 3 phases instead of all at once?**
A: Risk management. Phase 1 fixes the Larson crash (2-3h), and you can stop there if needed.

**Q: What if performance regression is >5%?**
A: Roll back to master and review the strategy. Consider the spinlock alternative (5-10% overhead, simpler).

**Q: Can I skip Phase 3?**
A: Yes, but you'll have ~25 sites with direct access (debug/stats). Document them clearly.

---

## Recommendation

**Start with Phase 1 (2-3 hours)** and evaluate results:
- If Larson 8T is stable and regression is <5%: ✅ Continue to Phase 2
- If unstable or regression is >5%: ❌ Roll back and review

**Best case**: 6-9 hours for full MT safety with <3% regression
**Worst case**: 2-3 hours to prove feasibility, then roll back if needed

**Risk**: Low (incremental, easy rollback, well-documented)
**Benefit**: High (MT stability, scalability, future-proof architecture)