# Atomic Freelist Site-by-Site Conversion Guide

## Quick Reference

**Total Sites**: 90
**Phase 1 (Critical)**: 25 sites in 5 files
**Phase 2 (Important)**: 40 sites in 10 files
**Phase 3 (Cleanup)**: 25 sites in 5 files

---

## Phase 1: Critical Hot Paths (5 files, 25 sites)

### File 1: `core/box/slab_freelist_atomic.h` (NEW)

**Action**: CREATE new file with the accessor API (see ATOMIC_FREELIST_IMPLEMENTATION_STRATEGY.md section 1; a hedged sketch of the API appears after Site 2.4 below)
**Lines**: ~80 lines
**Time**: 30 minutes

---

### File 2: `core/tiny_superslab_alloc.inc.h` (8 sites)

**File**: `/mnt/workdisk/public_share/hakmem/core/tiny_superslab_alloc.inc.h`

#### Site 2.1: Line 26 (NULL check)

```c
// BEFORE:
if (meta->freelist == NULL && meta->used < meta->capacity) {

// AFTER:
if (slab_freelist_is_empty(meta) && meta->used < meta->capacity) {
```

**Reason**: Relaxed load for condition check

---

#### Site 2.2: Line 38 (remote drain check)

```c
// BEFORE:
if (__builtin_expect(atomic_load_explicit(&ss->remote_heads[slab_idx], memory_order_acquire) != 0, 0)) {

// AFTER: (no change - this is remote_heads, not freelist)
```

**Reason**: Already using atomic operations correctly

---

#### Site 2.3: Line 88 (fast path check)

```c
// BEFORE:
if (__builtin_expect(meta->freelist == NULL && meta->used < meta->capacity, 1)) {

// AFTER:
if (__builtin_expect(slab_freelist_is_empty(meta) && meta->used < meta->capacity, 1)) {
```

**Reason**: Relaxed load for fast path condition

---

#### Site 2.4: Lines 117-145 (freelist pop - CRITICAL)

```c
// BEFORE:
if (__builtin_expect(meta->freelist != NULL, 0)) {
    void* block = meta->freelist;
    if (meta->class_idx != class_idx) {
        // Class mismatch, abandon freelist
        meta->freelist = NULL;
        goto bump_path;
    }
    // Allocate from freelist
    meta->freelist = tiny_next_read(meta->class_idx, block);
    meta->used = (uint16_t)((uint32_t)meta->used + 1);
    ss_active_add(ss, 1);
    return (void*)((uint8_t*)block + 1);
}

// AFTER:
if (__builtin_expect(slab_freelist_is_nonempty(meta), 0)) {
    // Try lock-free pop
    void* block = slab_freelist_pop_lockfree(meta, meta->class_idx);
    if (!block) {
        // Another thread won the race, fall through to bump path
        goto bump_path;
    }
    if (meta->class_idx != class_idx) {
        // Class mismatch: return block, then abandon the freelist
        slab_freelist_push_lockfree(meta, meta->class_idx, block);
        slab_freelist_store_relaxed(meta, NULL);  // Clear freelist
        goto bump_path;
    }
    // Success
    meta->used = (uint16_t)((uint32_t)meta->used + 1);
    ss_active_add(ss, 1);
    return (void*)((uint8_t*)block + 1);
}
```

**Reason**: Lock-free CAS for hot path allocation
**CRITICAL**: Note that `slab_freelist_pop_lockfree()` already handles `tiny_next_read()` internally!
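---

Since File 1 exists only as a description in this guide, here is a minimal sketch of what the accessor API could look like. Everything in it is an assumption to be checked against ATOMIC_FREELIST_IMPLEMENTATION_STRATEGY.md: `SlabMeta` is a stand-in name for the real metadata struct, `class_idx` is typed `unsigned` for illustration, and `tiny_next_read()`/`tiny_next_write()` are presumed to exist elsewhere with the signatures used in this guide.

```c
// Minimal sketch of core/box/slab_freelist_atomic.h (illustrative,
// not the final header).
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SlabMeta {            // Stand-in for the real metadata type
    _Atomic(void*) freelist;         // Retyped from plain void*
    /* ... remaining fields unchanged ... */
} SlabMeta;

static inline void* slab_freelist_load_relaxed(SlabMeta* meta) {
    return atomic_load_explicit(&meta->freelist, memory_order_relaxed);
}

static inline void slab_freelist_store_relaxed(SlabMeta* meta, void* p) {
    atomic_store_explicit(&meta->freelist, p, memory_order_relaxed);
}

static inline bool slab_freelist_is_empty(SlabMeta* meta) {
    return slab_freelist_load_relaxed(meta) == NULL;
}

static inline bool slab_freelist_is_nonempty(SlabMeta* meta) {
    return slab_freelist_load_relaxed(meta) != NULL;
}

// Debug-print helper: yields a plain pointer suitable for %p.
#define SLAB_FREELIST_DEBUG_PTR(m) slab_freelist_load_relaxed(m)

// CAS-based pop/push; body sketches appear at Site 4.1-4.2 and
// under Pitfall 1 below.
void* slab_freelist_pop_lockfree(SlabMeta* meta, unsigned class_idx);
void  slab_freelist_push_lockfree(SlabMeta* meta, unsigned class_idx, void* node);
```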
---

#### Site 2.5: Line 134 (freelist clear)

```c
// BEFORE:
meta->freelist = NULL;

// AFTER:
slab_freelist_store_relaxed(meta, NULL);
```

**Reason**: Relaxed store for initialization

---

#### Site 2.6: Line 308 (bump path check)

```c
// BEFORE:
if (meta && meta->freelist == NULL && meta->used < meta->capacity && tls->slab_base) {

// AFTER:
if (meta && slab_freelist_is_empty(meta) && meta->used < meta->capacity && tls->slab_base) {
```

**Reason**: Relaxed load for condition check

---

#### Site 2.7: Line 351 (freelist update after remote drain)

```c
// BEFORE:
meta->freelist = next;

// AFTER:
slab_freelist_store_relaxed(meta, next);
```

**Reason**: Relaxed store after drain (single-threaded context)

---

#### Site 2.8: Line 372 (bump path check)

```c
// BEFORE:
if (meta && meta->freelist == NULL && meta->used < meta->capacity && meta->carved < meta->capacity) {

// AFTER:
if (meta && slab_freelist_is_empty(meta) && meta->used < meta->capacity && meta->carved < meta->capacity) {
```

**Reason**: Relaxed load for condition check

---

### File 3: `core/hakmem_tiny_refill_p0.inc.h` (3 sites)

**File**: `/mnt/workdisk/public_share/hakmem/core/hakmem_tiny_refill_p0.inc.h`

#### Site 3.1: Line 101 (prefetch)

```c
// BEFORE:
__builtin_prefetch(&meta->freelist, 0, 3);

// AFTER: (no change)
__builtin_prefetch(&meta->freelist, 0, 3);
```

**Reason**: Prefetching the address of the field works fine with an atomic type; no conversion needed

---

#### Site 3.2: Lines 252-253 (freelist check + prefetch)

```c
// BEFORE:
if (meta->freelist) {
    __builtin_prefetch(meta->freelist, 0, 3);
}

// AFTER:
if (slab_freelist_is_nonempty(meta)) {
    void* head = slab_freelist_load_relaxed(meta);
    __builtin_prefetch(head, 0, 3);
}
```

**Reason**: The pointer must be loaded explicitly before prefetching (an atomic field cannot be passed to the prefetch directly)

**Alternative** (if the prefetch is not critical):

```c
// Simpler: Skip prefetch
if (slab_freelist_is_nonempty(meta)) {
    // ... rest of logic
}
```

---

#### Site 3.3: Line ~260 (freelist pop in batch refill)

**Context**: Review the full function to find the freelist pop logic:

```bash
grep -A20 "if (meta->freelist)" core/hakmem_tiny_refill_p0.inc.h
```

**Expected Pattern**:

```c
// BEFORE:
while (taken < want && meta->freelist) {
    void* p = meta->freelist;
    meta->freelist = tiny_next_read(class_idx, p);
    // ... push to TLS
}

// AFTER:
while (taken < want && slab_freelist_is_nonempty(meta)) {
    void* p = slab_freelist_pop_lockfree(meta, class_idx);
    if (!p) break;  // Another thread drained it
    // ... push to TLS
}
```

---

### File 4: `core/box/carve_push_box.c` (10 sites)

**File**: `/mnt/workdisk/public_share/hakmem/core/box/carve_push_box.c`

#### Site 4.1-4.2: Lines 33-34 (rollback push)

```c
// BEFORE:
tiny_next_write(class_idx, node, meta->freelist);
meta->freelist = node;

// AFTER:
slab_freelist_push_lockfree(meta, class_idx, node);
```

**Reason**: Lock-free push for rollback (inside rollback_carved_blocks)
**IMPORTANT**: `slab_freelist_push_lockfree()` already calls `tiny_next_write()` internally!
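---

To make the IMPORTANT note above concrete, here is a plausible body for the push helper, under the same assumptions as the header sketch after Site 2.4 (stand-in `SlabMeta` type, illustrative memory orderings). The helper performs the `tiny_next_write()` link itself, which is exactly why callers must not repeat it (see Pitfall 2).

```c
// Sketch of a lock-free LIFO push (illustrative). The release CAS
// publishes the node together with its next-pointer link.
static inline void slab_freelist_push_lockfree(SlabMeta* meta,
                                               unsigned class_idx,
                                               void* node) {
    void* head = atomic_load_explicit(&meta->freelist, memory_order_relaxed);
    do {
        tiny_next_write(class_idx, node, head);   // Link node -> old head
    } while (!atomic_compare_exchange_weak_explicit(
                 &meta->freelist, &head, node,
                 memory_order_release,     // success: publish node + link
                 memory_order_relaxed));   // failure: head reloaded, retry
}
```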
---

#### Site 4.3-4.4: Lines 120-121 (rollback in box_carve_and_push)

```c
// BEFORE:
tiny_next_write(class_idx, popped, meta->freelist);
meta->freelist = popped;

// AFTER:
slab_freelist_push_lockfree(meta, class_idx, popped);
```

**Reason**: Same as 4.1-4.2

---

#### Site 4.5-4.6: Lines 128-129 (rollback remaining)

```c
// BEFORE:
tiny_next_write(class_idx, node, meta->freelist);
meta->freelist = node;

// AFTER:
slab_freelist_push_lockfree(meta, class_idx, node);
```

**Reason**: Same as 4.1-4.2

---

#### Site 4.7: Line 172 (freelist carve check)

```c
// BEFORE:
while (pushed < want && meta->freelist) {

// AFTER:
while (pushed < want && slab_freelist_is_nonempty(meta)) {
```

**Reason**: Relaxed load for loop condition

---

#### Site 4.8: Lines 173-174 (freelist pop)

```c
// BEFORE:
void* p = meta->freelist;
meta->freelist = tiny_next_read(class_idx, p);

// AFTER:
void* p = slab_freelist_pop_lockfree(meta, class_idx);
if (!p) break;  // Freelist exhausted
```

**Reason**: Lock-free pop for carve-with-freelist path

---

#### Site 4.9-4.10: Lines 179-180 (rollback on push failure)

```c
// BEFORE:
tiny_next_write(class_idx, p, meta->freelist);
meta->freelist = p;

// AFTER:
slab_freelist_push_lockfree(meta, class_idx, p);
```

**Reason**: Same as 4.1-4.2

---

### File 5: `core/hakmem_tiny_tls_ops.h` (4 sites)

**File**: `/mnt/workdisk/public_share/hakmem/core/hakmem_tiny_tls_ops.h`

#### Site 5.1: Line 77 (TLS drain check)

```c
// BEFORE:
if (meta->freelist) {

// AFTER:
if (slab_freelist_is_nonempty(meta)) {
```

**Reason**: Relaxed load for condition check

---

#### Site 5.2: Line 82 (TLS drain loop)

```c
// BEFORE:
while (local < need && meta->freelist) {

// AFTER:
while (local < need && slab_freelist_is_nonempty(meta)) {
```

**Reason**: Relaxed load for loop condition

---

#### Site 5.3: Lines 83-85 (TLS drain pop)

```c
// BEFORE:
void* node = meta->freelist;
// ... 1 line ...
meta->freelist = tiny_next_read(class_idx, node);

// AFTER:
void* node = slab_freelist_pop_lockfree(meta, class_idx);
if (!node) break;  // Freelist exhausted
// ... remove tiny_next_read line ...
```

**Reason**: Lock-free pop for TLS drain

---

#### Site 5.4: Line 203 (TLS freelist init)

```c
// BEFORE:
meta->freelist = node;

// AFTER:
slab_freelist_store_relaxed(meta, node);
```

**Reason**: Relaxed store for initialization (single-threaded context)

---

### Phase 1 Summary

**Total Changes**:
- 1 new file (`slab_freelist_atomic.h`)
- 4 modified files
- ~25 conversion sites
- ~8 POP operations converted to CAS
- ~6 PUSH operations converted to CAS
- ~11 NULL checks converted to relaxed loads

**Time Estimate**: 2-3 hours (with testing)

---

## Phase 2: Important Paths (10 files, 40 sites)

### File 6: `core/tiny_refill_opt.h`

#### Lines 199-230 (refill chain pop)

```c
// BEFORE:
while (taken < want && meta->freelist) {
    void* p = meta->freelist;
    // ... splice logic ...
    meta->freelist = next;
}

// AFTER:
while (taken < want && slab_freelist_is_nonempty(meta)) {
    void* p = slab_freelist_pop_lockfree(meta, class_idx);
    if (!p) break;
    // ... splice logic (remove next assignment) ...
}
```
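---

The refill loops in Site 3.3 and File 6 above share one batch-pop shape. A self-contained sketch of that pattern follows; `refill_take_batch`, `out`, and `want` are hypothetical stand-ins for the surrounding refill logic, and only the `slab_freelist_*` accessors come from this guide.

```c
// Sketch: drain up to `want` blocks from a slab freelist into a
// caller-supplied array, stopping early if a racing thread wins.
static size_t refill_take_batch(SlabMeta* meta, unsigned class_idx,
                                void** out, size_t want) {
    size_t taken = 0;
    while (taken < want && slab_freelist_is_nonempty(meta)) {
        void* p = slab_freelist_pop_lockfree(meta, class_idx);
        if (!p) break;        // Race: freelist drained between check and pop
        out[taken++] = p;     // Stand-in for the real "push to TLS" step
    }
    return taken;
}
```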
---

### File 7: `core/tiny_free_magazine.inc.h`

#### Lines 135-136, 328 (magazine push)

```c
// BEFORE:
tiny_next_write(meta->class_idx, it.ptr, meta->freelist);
meta->freelist = it.ptr;

// AFTER:
slab_freelist_push_lockfree(meta, meta->class_idx, it.ptr);
```

---

### File 8: `core/refill/ss_refill_fc.h`

#### Lines 151-153 (FC refill pop)

```c
// BEFORE:
if (meta->freelist != NULL) {
    void* p = meta->freelist;
    meta->freelist = tiny_next_read(class_idx, p);
}

// AFTER:
if (slab_freelist_is_nonempty(meta)) {
    void* p = slab_freelist_pop_lockfree(meta, class_idx);
    if (!p) {
        // Race: freelist drained, skip
    }
}
```

---

### File 9: `core/slab_handle.h`

#### Lines 211, 259, 302, 308, 334 (slab handle ops)

```c
// BEFORE (line 211):
return h->meta->freelist;
// AFTER:
return slab_freelist_load_relaxed(h->meta);

// BEFORE (line 259):
h->meta->freelist = ptr;
// AFTER:
slab_freelist_store_relaxed(h->meta, ptr);

// BEFORE (line 302):
h->meta->freelist = NULL;
// AFTER:
slab_freelist_store_relaxed(h->meta, NULL);

// BEFORE (line 308):
h->meta->freelist = next;
// AFTER:
slab_freelist_store_relaxed(h->meta, next);

// BEFORE (line 334):
return (h->meta->freelist != NULL);
// AFTER:
return slab_freelist_is_nonempty(h->meta);
```

---

### Files 10-15: Remaining Phase 2 Files

**Pattern**: Same conversions as above

- NULL checks → `slab_freelist_is_empty/nonempty()`
- Direct loads → `slab_freelist_load_relaxed()`
- Direct stores → `slab_freelist_store_relaxed()`
- POP operations → `slab_freelist_pop_lockfree()`
- PUSH operations → `slab_freelist_push_lockfree()`

**Files**:
- `core/hakmem_tiny_superslab.c`
- `core/hakmem_tiny_alloc_new.inc`
- `core/hakmem_tiny_free.inc`
- `core/box/ss_allocation_box.c`
- `core/box/free_local_box.c`
- `core/box/integrity_box.c`

**Time Estimate**: 2-3 hours (with testing)

---

## Phase 3: Cleanup (5 files, 25 sites)

### Debug/Stats Sites (NO CONVERSION — debug cast only)

**Files**:
- `core/box/ss_stats_box.c`
- `core/tiny_debug.h`
- `core/tiny_remote.c`

**Change**:

```c
// BEFORE:
fprintf(stderr, "freelist=%p", meta->freelist);

// AFTER:
fprintf(stderr, "freelist=%p", SLAB_FREELIST_DEBUG_PTR(meta));
```

**Reason**: The field is already converted; printf just needs an explicit load to a plain pointer

---

### Init/Cleanup Sites (RELAXED STORE)

**Files**:
- `core/hakmem_tiny_superslab.c` (init)
- `core/hakmem_smallmid_superslab.c` (init)

**Change**:

```c
// BEFORE:
meta->freelist = NULL;

// AFTER:
slab_freelist_store_relaxed(meta, NULL);
```

**Reason**: Single-threaded initialization, relaxed is sufficient

---

### Verification Sites (RELAXED LOAD)

**Files**:
- `core/box/integrity_box.c` (integrity checks)

**Change**:

```c
// BEFORE:
if (meta->freelist) {
    // ... integrity check ...
}

// AFTER:
if (slab_freelist_is_nonempty(meta)) {
    // ... integrity check ...
}
```

**Time Estimate**: 1-2 hours

---

## Common Pitfalls

### Pitfall 1: Double-Converting POP Operations

**WRONG**:

```c
// ❌ BAD: slab_freelist_pop_lockfree already calls tiny_next_read!
void* p = slab_freelist_pop_lockfree(meta, class_idx);
void* next = tiny_next_read(class_idx, p);  // ❌ WRONG!
```

**RIGHT**:

```c
// ✅ GOOD: slab_freelist_pop_lockfree returns the popped block directly
void* p = slab_freelist_pop_lockfree(meta, class_idx);
if (!p) break;  // Handle race
// Use p directly
```
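---

For reference, here is why the extra `tiny_next_read()` is wrong: a plausible body for the pop helper, under the same illustrative assumptions as the push sketch at Site 4.1-4.2 (stand-in `SlabMeta` type, illustrative orderings). The acquire load pairs with the release CAS in that push sketch.

```c
// Sketch of the lock-free LIFO pop (illustrative). It reads the next
// pointer itself, so a second tiny_next_read() in the caller would
// walk past the block it just received. A production version must
// also address ABA (e.g., only popping from slabs the thread owns).
static inline void* slab_freelist_pop_lockfree(SlabMeta* meta,
                                               unsigned class_idx) {
    void* head = atomic_load_explicit(&meta->freelist, memory_order_acquire);
    while (head != NULL) {
        void* next = tiny_next_read(class_idx, head);  // Internal link read
        if (atomic_compare_exchange_weak_explicit(
                &meta->freelist, &head, next,
                memory_order_acquire,      // success: see winner's link writes
                memory_order_acquire)) {   // failure: head reloaded, retry
            return head;                   // Caller uses this block directly
        }
    }
    return NULL;  // Empty, or a racing thread won
}
```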
---

### Pitfall 2: Double-Converting PUSH Operations

**WRONG**:

```c
// ❌ BAD: slab_freelist_push_lockfree already calls tiny_next_write!
tiny_next_write(class_idx, node, meta->freelist);  // ❌ WRONG!
slab_freelist_push_lockfree(meta, class_idx, node);
```

**RIGHT**:

```c
// ✅ GOOD: slab_freelist_push_lockfree does everything
slab_freelist_push_lockfree(meta, class_idx, node);
```

---

### Pitfall 3: Forgetting CAS Race Handling

**WRONG**:

```c
// ❌ BAD: Assuming pop always succeeds
void* p = slab_freelist_pop_lockfree(meta, class_idx);
use(p);  // ❌ SEGV if p == NULL!
```

**RIGHT**:

```c
// ✅ GOOD: Always check for NULL (race condition)
void* p = slab_freelist_pop_lockfree(meta, class_idx);
if (!p) {
    // Another thread won the race, handle gracefully
    break;  // or continue, or goto alternative path
}
use(p);
```

---

### Pitfall 4: Using Wrong Memory Ordering

**WRONG**:

```c
// ❌ BAD: seq_cst is stronger than this check needs (extra fences on
// weakly ordered CPUs such as ARM)
if (atomic_load_explicit(&meta->freelist, memory_order_seq_cst) != NULL) {
```

**RIGHT**:

```c
// ✅ GOOD: Use relaxed for benign checks
if (slab_freelist_is_nonempty(meta)) {  // Uses relaxed internally
```

---

## Testing Checklist (Per File)

After converting each file:

```bash
# 1. Compile check
make clean
make bench_random_mixed_hakmem 2>&1 | tee build.log
grep -i "error\|warning" build.log

# 2. Single-threaded correctness
./out/release/bench_random_mixed_hakmem 100000 256 42

# 3. Multi-threaded stress (if Phase 1 complete)
./out/release/larson_hakmem 8 10000 256

# 4. ASan check (if available)
./build.sh asan bench_random_mixed_hakmem
./out/asan/bench_random_mixed_hakmem 10000 256 42
```

---

## Progress Tracking

Use this checklist to track conversion progress:

### Phase 1 (Critical)
- [ ] File 1: `core/box/slab_freelist_atomic.h` (CREATE)
- [ ] File 2: `core/tiny_superslab_alloc.inc.h` (8 sites)
- [ ] File 3: `core/hakmem_tiny_refill_p0.inc.h` (3 sites)
- [ ] File 4: `core/box/carve_push_box.c` (10 sites)
- [ ] File 5: `core/hakmem_tiny_tls_ops.h` (4 sites)
- [ ] Phase 1 Testing (Larson 8T)

### Phase 2 (Important)
- [ ] File 6: `core/tiny_refill_opt.h` (5 sites)
- [ ] File 7: `core/tiny_free_magazine.inc.h` (3 sites)
- [ ] File 8: `core/refill/ss_refill_fc.h` (3 sites)
- [ ] File 9: `core/slab_handle.h` (7 sites)
- [ ] Files 10-15: Remaining files (22 sites)
- [ ] Phase 2 Testing (MT stress)

### Phase 3 (Cleanup)
- [ ] Debug/Stats sites (5 sites)
- [ ] Init/Cleanup sites (10 sites)
- [ ] Verification sites (10 sites)
- [ ] Phase 3 Testing (Full suite)

---

## Quick Reference Card

| Old Pattern | New Pattern | Use Case |
|-------------|-------------|----------|
| `if (meta->freelist)` | `if (slab_freelist_is_nonempty(meta))` | NULL check |
| `if (meta->freelist == NULL)` | `if (slab_freelist_is_empty(meta))` | Empty check |
| `void* p = meta->freelist;` | `void* p = slab_freelist_load_relaxed(meta);` | Simple load |
| `meta->freelist = NULL;` | `slab_freelist_store_relaxed(meta, NULL);` | Init/clear |
| `void* p = meta->freelist; meta->freelist = next;` | `void* p = slab_freelist_pop_lockfree(meta, cls);` | POP |
| `tiny_next_write(...); meta->freelist = node;` | `slab_freelist_push_lockfree(meta, cls, node);` | PUSH |
| `fprintf("...%p", meta->freelist)` | `fprintf("...%p", SLAB_FREELIST_DEBUG_PTR(meta))` | Debug print |
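---

As a worked example of the card, here is a hypothetical one-block pop helper before and after conversion; `try_pop_one_*` is an invented name, and the snippet combines three card rows (NULL check, POP, race handling).

```c
// BEFORE: two-step pop with a single-threaded assumption baked in
static void* try_pop_one_before(SlabMeta* meta, unsigned class_idx) {
    if (meta->freelist) {                             // NULL check row
        void* p = meta->freelist;                     // POP row, step 1
        meta->freelist = tiny_next_read(class_idx, p);// POP row, step 2
        return p;
    }
    return NULL;
}

// AFTER: one accessor call; NULL doubles as the race signal
static void* try_pop_one_after(SlabMeta* meta, unsigned class_idx) {
    if (slab_freelist_is_nonempty(meta)) {
        return slab_freelist_pop_lockfree(meta, class_idx);  // NULL on race
    }
    return NULL;
}
```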
---

## Time Budget Summary

| Phase | Files | Sites | Time |
|-------|-------|-------|------|
| Phase 1 (Hot) | 5 | 25 | 2-3h |
| Phase 2 (Warm) | 10 | 40 | 2-3h |
| Phase 3 (Cold) | 5 | 25 | 1-2h |
| **Total** | **20** | **90** | **5-8h** |

Add 20% buffer for unexpected issues: **6-10 hours total**

---

## Success Metrics

After full conversion:

- ✅ Zero direct `meta->freelist` accesses (except in atomic accessor functions)
- ✅ All tests pass (single-threaded + MT)
- ✅ ASan/TSan clean (no data races)
- ✅ Performance regression <3% (single-threaded)
- ✅ Larson 8T stable (no crashes)
- ✅ MT scaling 70%+ (good scalability)

---

## Emergency Rollback

If conversion fails at any phase:

```bash
git stash                              # Save work in progress
git checkout master
git branch -D atomic-freelist-phase1   # Or phase2/phase3
# Review strategy and try alternative approach
```