# Phase 28: Background Spill Queue Atomic Audit

**Date:** 2025-12-16
**Scope:** `core/hakmem_tiny_bg_spill.*`
**Objective:** Classify all atomic operations as CORRECTNESS or TELEMETRY

---

## Executive Summary

**Total Atomics Found:** 8 primary atomic operations (plus 5 initialization/exchange operations, inventoried separately below)
**CORRECTNESS:** 8 (100%)
**TELEMETRY:** 0 (0%)

**Result:** All atomic operations in the background spill queue are critical for correctness. **NO compile-out candidates found.**

---

## Atomic Operations Inventory

### File: `core/hakmem_tiny_bg_spill.h`

#### 1. `atomic_fetch_add_explicit(&g_bg_spill_len[class_idx], 1u, ...)`

**Location:** Line 32, `bg_spill_push_one()`
**Classification:** **CORRECTNESS**
**Reason:** Queue length tracking used for flow control

**Code Context:**
```c
static inline void bg_spill_push_one(int class_idx, void* p) {
    uintptr_t old_head;
    do {
        old_head = atomic_load_explicit(&g_bg_spill_head[class_idx], memory_order_acquire);
        tiny_next_write(class_idx, p, (void*)old_head);
    } while (!atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], &old_head,
                                                    (uintptr_t)p,
                                                    memory_order_release, memory_order_relaxed));
    atomic_fetch_add_explicit(&g_bg_spill_len[class_idx], 1u, memory_order_relaxed);  // <-- CORRECTNESS
}
```

**Usage Evidence:**
- `g_bg_spill_len` is checked in `tiny_free_magazine.inc.h:76-77`:
  ```c
  uint32_t qlen = atomic_load_explicit(&g_bg_spill_len[class_idx], memory_order_relaxed);
  if ((int)qlen < g_bg_spill_target) {
      // Build a small chain: include current ptr and pop from mag up to limit
  ```
- **This is flow control**: it decides whether to queue more work or take the alternate path
- **Correctness impact:** prevents unbounded queue growth

---

#### 2. `atomic_fetch_add_explicit(&g_bg_spill_len[class_idx], (uint32_t)count, ...)`

**Location:** Line 44, `bg_spill_push_chain()`
**Classification:** **CORRECTNESS**
**Reason:** Same as #1 - queue length tracking for flow control

**Code Context:**
```c
static inline void bg_spill_push_chain(int class_idx, void* head, void* tail, int count) {
    uintptr_t old_head;
    do {
        old_head = atomic_load_explicit(&g_bg_spill_head[class_idx], memory_order_acquire);
        tiny_next_write(class_idx, tail, (void*)old_head);
    } while (!atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], &old_head,
                                                    (uintptr_t)head,
                                                    memory_order_release, memory_order_relaxed));
    atomic_fetch_add_explicit(&g_bg_spill_len[class_idx], (uint32_t)count, memory_order_relaxed);  // <-- CORRECTNESS
}
```

---

#### 3-4. `atomic_load_explicit(&g_bg_spill_head[class_idx], ...)` (lines 27, 39)

**Classification:** **CORRECTNESS**
**Reason:** Lock-free queue head pointer - essential for the CAS loop

**Code Context:**
```c
do {
    old_head = atomic_load_explicit(&g_bg_spill_head[class_idx], memory_order_acquire);  // <-- CORRECTNESS
    tiny_next_write(class_idx, p, (void*)old_head);
} while (!atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], &old_head,
                                                (uintptr_t)p,
                                                memory_order_release, memory_order_relaxed));
```

**Analysis:**
- Part of the lock-free stack implementation
- Load-compare-swap pattern for thread-safe queue operations
- **Cannot be removed without breaking concurrency safety**

---

#### 5-6. `atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], ...)` (lines 29, 41)

**Classification:** **CORRECTNESS**
**Reason:** Lock-free synchronization primitive

**Analysis:**
- The CAS operation is the core of the lock-free queue
- Ensures atomic head pointer updates
- **Fundamental correctness operation - untouchable**

---
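To make the push-side pattern concrete, here is a minimal, self-contained model of the Treiber-stack push plus the flow-control gate. This is a sketch, not the production code: `node_t`, `g_head`, `g_len`, `g_target`, `push_one()`, and `should_spill()` are simplified stand-ins for the per-class `g_bg_spill_head[]`/`g_bg_spill_len[]` arrays, `bg_spill_push_one()`, and the free-path check, and the target value is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified model: one queue instead of per-class arrays, and the next
 * pointer lives in the node itself (as tiny_next_write() does with the
 * freed block's memory). */
typedef struct node { struct node* next; } node_t;

static _Atomic uintptr_t g_head = 0;  /* models g_bg_spill_head[class_idx] */
static _Atomic uint32_t  g_len  = 0;  /* models g_bg_spill_len[class_idx]  */
static const int g_target = 64;      /* models g_bg_spill_target (value is a guess) */

/* Treiber-stack push; mirrors the structure of bg_spill_push_one(). */
static void push_one(node_t* p) {
    uintptr_t old_head;
    do {
        old_head = atomic_load_explicit(&g_head, memory_order_acquire);
        p->next = (node_t*)old_head;  /* link before publishing */
    } while (!atomic_compare_exchange_weak_explicit(
                 &g_head, &old_head, (uintptr_t)p,
                 memory_order_release, memory_order_relaxed));
    /* Relaxed ordering is enough: the length only needs to be approximately
     * right. But the counter must exist - the free path gates on it. */
    atomic_fetch_add_explicit(&g_len, 1u, memory_order_relaxed);
}

/* Free-path gate; mirrors the qlen < g_bg_spill_target check. */
static int should_spill(void) {
    uint32_t qlen = atomic_load_explicit(&g_len, memory_order_relaxed);
    return (int)qlen < g_target;  /* flow control, not telemetry */
}

int main(void) {
    node_t a = {0}, b = {0};
    if (should_spill()) push_one(&a);
    if (should_spill()) push_one(&b);
    printf("queued=%u\n", (unsigned)atomic_load_explicit(&g_len, memory_order_relaxed));
    return 0;
}
```

The point the sketch makes is the audit's core claim: the `fetch_add` on the length counter is not an observation bolted onto the push, it is the producer half of the gate that `should_spill()` consumes.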
### File: `core/hakmem_tiny_bg_spill.c`

#### 7. `atomic_fetch_sub_explicit(&g_bg_spill_len[class_idx], (uint32_t)processed, ...)`

**Location:** Line 91, `bg_spill_drain_class()`
**Classification:** **CORRECTNESS**
**Reason:** Queue length decrement paired with the increments in the push operations

**Code Context:**
```c
void bg_spill_drain_class(int class_idx, pthread_mutex_t* lock) {
    uint32_t approx = atomic_load_explicit(&g_bg_spill_len[class_idx], memory_order_relaxed);
    if (approx == 0) return;
    uintptr_t chain = atomic_exchange_explicit(&g_bg_spill_head[class_idx], (uintptr_t)0,
                                               memory_order_acq_rel);
    if (chain == 0) return;
    // ... process nodes ...
    if (processed > 0) {
        atomic_fetch_sub_explicit(&g_bg_spill_len[class_idx], (uint32_t)processed,
                                  memory_order_relaxed);  // <-- CORRECTNESS
    }
}
```

**Analysis:**
- Maintains the queue length invariant
- Paired with the `fetch_add` in the push operations
- Used for flow control decisions (queue-full check)

---

#### 8. `atomic_load_explicit(&g_bg_spill_len[class_idx], ...)`

**Location:** Line 30, `bg_spill_drain_class()`
**Classification:** **CORRECTNESS**
**Reason:** Early-exit optimization, but semantically part of correctness

**Code Context:**
```c
void bg_spill_drain_class(int class_idx, pthread_mutex_t* lock) {
    uint32_t approx = atomic_load_explicit(&g_bg_spill_len[class_idx], memory_order_relaxed);  // <-- CORRECTNESS
    if (approx == 0) return;  // Early exit if queue empty
```

**Analysis:**
- While technically an optimization (the head pointer could be checked instead), this load is tightly coupled to the queue length semantics
- The counter `g_bg_spill_len` is **not a pure telemetry counter** - it is used for:
  1. Flow control in the free path (`qlen < g_bg_spill_target`)
  2. Early-exit optimization in the drain path
- **Cannot be removed without affecting behavior**

---

## Additional Atomic Operations (Initialization/Exchange)

#### 9-10. `atomic_store_explicit(&g_bg_spill_head/len[k], ...)` (lines 24-25)

**Location:** `bg_spill_init()`
**Classification:** **CORRECTNESS**
**Reason:** Initialization of the atomic variables

#### 11. `atomic_exchange_explicit(&g_bg_spill_head[class_idx], ...)`

**Location:** Line 33, `bg_spill_drain_class()`
**Classification:** **CORRECTNESS**
**Reason:** Atomic swap of the head pointer - the lock-free dequeue operation

#### 12-13. `atomic_load/compare_exchange_weak_explicit` (lines 100, 102)

**Location:** Re-prepend remainder logic
**Classification:** **CORRECTNESS**
**Reason:** Lock-free re-insertion of unprocessed nodes

---
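Continuing the simplified model from the push-side sketch above (reusing its `node_t`, `g_head`, and `g_len` declarations), the drain side can be modeled as follows. The `budget` parameter and the placeholder processing step are illustrative assumptions; the real `bg_spill_drain_class()` returns nodes to the central free list under `lock`.

```c
/* Drain side, modeled on bg_spill_drain_class(): detach the whole chain
 * with one exchange, process up to `budget` nodes, re-prepend the rest. */
static void drain(int budget) {
    uint32_t approx = atomic_load_explicit(&g_len, memory_order_relaxed);
    if (approx == 0) return;  /* early exit (op #8) */

    uintptr_t chain = atomic_exchange_explicit(&g_head, (uintptr_t)0,
                                               memory_order_acq_rel);  /* op #11 */
    if (chain == 0) return;

    node_t* cur = (node_t*)chain;
    uint32_t processed = 0;
    while (cur != NULL && (int)processed < budget) {
        node_t* next = cur->next;
        /* The real code returns `cur` to the central free list here, under
         * the class lock; the model just unlinks it. */
        cur = next;
        processed++;
    }

    if (cur != NULL) {
        /* Re-prepend the unprocessed remainder (ops #12-13): find its tail,
         * then CAS it back onto the head so nodes pushed concurrently by
         * other threads are not lost. */
        node_t* tail = cur;
        while (tail->next != NULL) tail = tail->next;
        uintptr_t old_head;
        do {
            old_head = atomic_load_explicit(&g_head, memory_order_acquire);
            tail->next = (node_t*)old_head;
        } while (!atomic_compare_exchange_weak_explicit(
                     &g_head, &old_head, (uintptr_t)cur,
                     memory_order_release, memory_order_relaxed));
    }

    /* Must pair with the fetch_add in the pushes (op #7); dropping it would
     * leave the free-path gate reading a permanently inflated length. */
    if (processed > 0)
        atomic_fetch_sub_explicit(&g_len, processed, memory_order_relaxed);
}
```

Between the exchange and the `fetch_sub`, the counter can momentarily disagree with the actual chain length; both readers treat it as approximate (`approx`, `qlen`), which is why relaxed ordering suffices even though the counter itself is load-bearing.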
## Key Finding: `g_bg_spill_len` is NOT Telemetry

### Flow Control Usage in `tiny_free_magazine.inc.h`

```c
// Background spill: queue to BG thread instead of locking (when enabled)
if (g_bg_spill_enable) {
    uint32_t qlen = atomic_load_explicit(&g_bg_spill_len[class_idx], memory_order_relaxed);
    if ((int)qlen < g_bg_spill_target) {  // <-- FLOW CONTROL DECISION
        // Build a small chain: include current ptr and pop from mag up to limit
        int limit = g_bg_spill_max_batch;
        // ... queue to background spill ...
    }
}
```

**Analysis:**
- `g_bg_spill_len` determines whether work is queued to the background thread or the alternate path is taken
- This is **not a debug counter** - it is an operational control variable
- Removing these atomics would break the queue semantics and could lead to unbounded growth

---

## Comparison with Previous Phases

| Phase | Path | TELEMETRY Found | CORRECTNESS Found | Compile-out Benefit |
|-------|------|-----------------|-------------------|---------------------|
| 24 | Alloc Gate | 4 | 0 | +1.62% |
| 25 | Free Path | 5 | 0 | +0.84% |
| 26 | Tiny Front (Hot) | 2 | 0 | NEUTRAL (+cleanliness) |
| 27 | Tiny Front (Stats) | 3 | 0 | +0.28% |
| **28** | **BG Spill Queue** | **0** | **8** | **N/A (NO-OP)** |

---

## Conclusion

**Phase 28 Result: NO-OP**

All atomic operations in the background spill queue subsystem - the 8 primary operations plus the 5 initialization/exchange operations - are classified as **CORRECTNESS**:

- Lock-free queue synchronization (CAS loops, head pointer management)
- Queue length tracking used for **flow control** (not telemetry)
- Removal would break concurrency safety or change behavior

**Recommendation:**
- **Do not proceed with compile-out**
- Document this phase as "Audit complete, all CORRECTNESS"
- Move to Phase 29 candidate search

---

## Next Steps

Search for Phase 29 candidates:

1. Check other subsystems with potential telemetry atomics
2. Review the cumulative report to identify remaining hot paths
3. Consider diminishing returns vs. code complexity

---

## References

- **Phase 24-27:** Cumulative +2.74% from telemetry pruning
- **Lock-free patterns:** All CAS operations are correctness-critical
- **Flow control vs. telemetry:** Queue length is used for operational decisions, not just observation