# Phase 28: Background Spill Queue Atomic Audit

**Date:** 2025-12-16
**Scope:** `core/hakmem_tiny_bg_spill.*`
**Objective:** Classify all atomic operations as CORRECTNESS or TELEMETRY

---

## Executive Summary

**Total Atomics Found:** 8 atomic operations
**CORRECTNESS:** 8 (100%)
**TELEMETRY:** 0 (0%)

**Result:** All atomic operations in the background spill queue are required for correctness, including the five initialization, exchange, and re-prepend operations listed separately as items 9-13. **No compile-out candidates found.**

---

## Atomic Operations Inventory

### File: `core/hakmem_tiny_bg_spill.h`

#### 1. `atomic_fetch_add_explicit(&g_bg_spill_len[class_idx], 1u, ...)`
**Location:** Line 32, `bg_spill_push_one()`
**Classification:** **CORRECTNESS**
**Reason:** Queue length tracking used for flow control

**Code Context:**
```c
static inline void bg_spill_push_one(int class_idx, void* p) {
    uintptr_t old_head;
    do {
        old_head = atomic_load_explicit(&g_bg_spill_head[class_idx], memory_order_acquire);
        tiny_next_write(class_idx, p, (void*)old_head);
    } while (!atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], &old_head,
                                                    (uintptr_t)p,
                                                    memory_order_release, memory_order_relaxed));
    atomic_fetch_add_explicit(&g_bg_spill_len[class_idx], 1u, memory_order_relaxed);  // <-- CORRECTNESS
}
```

**Usage Evidence:**
- `g_bg_spill_len` is checked in `tiny_free_magazine.inc.h:76-77`:
  ```c
  uint32_t qlen = atomic_load_explicit(&g_bg_spill_len[class_idx], memory_order_relaxed);
  if ((int)qlen < g_bg_spill_target) {
      // Build a small chain: include current ptr and pop from mag up to limit
  ```
- **This is flow control**: decides whether to queue more work or take alternate path
- **Correctness impact**: Prevents unbounded queue growth

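
The interaction between the push and the flow-control check can be illustrated with a minimal, self-contained sketch. It is not the project's code: the `sketch_*` names, `SKETCH_NUM_CLASSES`, and the embedded `next` pointer (standing in for `tiny_next_write`) are assumptions made only so the example compiles on its own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_CLASSES 8  /* hypothetical; the real class count is not shown in this audit */

/* Hypothetical node layout: the link lives in the first word of the block. */
typedef struct sketch_node { struct sketch_node* next; } sketch_node;

/* Static storage, so both arrays start zeroed (empty stacks, length 0). */
static _Atomic uintptr_t sketch_head[SKETCH_NUM_CLASSES];
static _Atomic uint32_t  sketch_len[SKETCH_NUM_CLASSES];

/* Push one node onto the per-class lock-free stack, then bump the length. */
static void sketch_push_one(int class_idx, sketch_node* n) {
    assert(class_idx >= 0 && class_idx < SKETCH_NUM_CLASSES);
    uintptr_t old_head;
    do {
        old_head = atomic_load_explicit(&sketch_head[class_idx], memory_order_acquire);
        n->next = (sketch_node*)old_head;  /* link before publishing */
    } while (!atomic_compare_exchange_weak_explicit(&sketch_head[class_idx], &old_head,
                                                    (uintptr_t)n,
                                                    memory_order_release, memory_order_relaxed));
    atomic_fetch_add_explicit(&sketch_len[class_idx], 1u, memory_order_relaxed);
}

/* Flow-control gate: admit more work only while the queue is below target. */
static int sketch_should_queue(int class_idx, int target) {
    uint32_t qlen = atomic_load_explicit(&sketch_len[class_idx], memory_order_relaxed);
    return (int)qlen < target;
}
```

With a target of 2, the gate in this sketch admits work until two nodes are queued and then closes, which is the behaviour the length counter exists to provide.
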
---

#### 2. `atomic_fetch_add_explicit(&g_bg_spill_len[class_idx], (uint32_t)count, ...)`
**Location:** Line 44, `bg_spill_push_chain()`
**Classification:** **CORRECTNESS**
**Reason:** Same as #1 - queue length tracking for flow control

**Code Context:**
```c
static inline void bg_spill_push_chain(int class_idx, void* head, void* tail, int count) {
    uintptr_t old_head;
    do {
        old_head = atomic_load_explicit(&g_bg_spill_head[class_idx], memory_order_acquire);
        tiny_next_write(class_idx, tail, (void*)old_head);
    } while (!atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], &old_head,
                                                    (uintptr_t)head,
                                                    memory_order_release, memory_order_relaxed));
    atomic_fetch_add_explicit(&g_bg_spill_len[class_idx], (uint32_t)count, memory_order_relaxed);  // <-- CORRECTNESS
}
```

---

#### 3-4. `atomic_load_explicit(&g_bg_spill_head[class_idx], ...)` (lines 27, 39)
**Classification:** **CORRECTNESS**
**Reason:** Lock-free queue head pointer - essential for CAS loop

**Code Context:**
```c
do {
    old_head = atomic_load_explicit(&g_bg_spill_head[class_idx], memory_order_acquire);  // <-- CORRECTNESS
    tiny_next_write(class_idx, p, (void*)old_head);
} while (!atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], &old_head,
                                                (uintptr_t)p,
                                                memory_order_release, memory_order_relaxed));
```

**Analysis:**
- Part of lock-free stack implementation
- Load-compare-swap pattern for thread-safe queue operations
- **Cannot be removed without breaking concurrency safety**

---

#### 5-6. `atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], ...)` (lines 29, 41)
**Classification:** **CORRECTNESS**
**Reason:** Lock-free synchronization primitive

**Analysis:**
- CAS operation is the core of lock-free queue
- Ensures atomic head pointer updates
- **Fundamental correctness operation - untouchable**

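
To make the load/CAS pairing concrete, the following small sketch shows the C11 compare-exchange contract the loop relies on: when the exchange fails, the value actually observed is written back into the caller's `expected` variable. The helper name `sketch_swing_head` is hypothetical, and the strong variant is used only so that a single-threaded demonstration cannot fail spuriously.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Try to swing *head from *expected to desired.
 * Returns true on success. On failure, *expected is overwritten with the
 * value that was actually found in *head, so the caller can relink and retry.
 */
static bool sketch_swing_head(_Atomic uintptr_t* head, uintptr_t* expected, uintptr_t desired) {
    assert(head != NULL && expected != NULL);
    return atomic_compare_exchange_strong_explicit(head, expected, desired,
                                                   memory_order_release,   /* on success */
                                                   memory_order_relaxed);  /* on failure */
}
```

Because a failed exchange already refreshes `old_head`, the explicit reload at the top of each iteration in the audited loop appears redundant but harmless; either way, both the load and the CAS operate on the shared head pointer and neither is a telemetry candidate.
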
---

### File: `core/hakmem_tiny_bg_spill.c`

#### 7. `atomic_fetch_sub_explicit(&g_bg_spill_len[class_idx], (uint32_t)processed, ...)`
**Location:** Line 91, `bg_spill_drain_class()`
**Classification:** **CORRECTNESS**
**Reason:** Queue length decrement paired with increment in push operations

**Code Context:**
```c
void bg_spill_drain_class(int class_idx, pthread_mutex_t* lock) {
    uint32_t approx = atomic_load_explicit(&g_bg_spill_len[class_idx], memory_order_relaxed);
    if (approx == 0) return;

    uintptr_t chain = atomic_exchange_explicit(&g_bg_spill_head[class_idx], (uintptr_t)0, memory_order_acq_rel);
    if (chain == 0) return;

    // ... process nodes ...

    if (processed > 0) {
        atomic_fetch_sub_explicit(&g_bg_spill_len[class_idx], (uint32_t)processed, memory_order_relaxed);  // <-- CORRECTNESS
    }
}
```

**Analysis:**
- Maintains queue length invariant
- Paired with `fetch_add` in push operations
- Used for flow control decisions (queue full check)

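
The pairing of `fetch_add` in the push path with `fetch_sub` in the drain path can be sketched as follows. This is a simplified single-class model under stated assumptions: the `drain_*` names are hypothetical, "processing" a node is reduced to counting it, and the mutex parameter of the real function is omitted because its use is not shown in this audit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef struct drain_node { struct drain_node* next; } drain_node;

static _Atomic uintptr_t drain_head;  /* zero-initialized: empty stack */
static _Atomic uint32_t  drain_len;   /* zero-initialized: length 0    */

/* Producer side: lock-free push followed by the length increment. */
static void drain_push(drain_node* n) {
    assert(n != NULL);
    uintptr_t old_head;
    do {
        old_head = atomic_load_explicit(&drain_head, memory_order_acquire);
        n->next = (drain_node*)old_head;
    } while (!atomic_compare_exchange_weak_explicit(&drain_head, &old_head, (uintptr_t)n,
                                                    memory_order_release, memory_order_relaxed));
    atomic_fetch_add_explicit(&drain_len, 1u, memory_order_relaxed);
}

/* Consumer side: detach the whole chain, walk it, then subtract what was consumed. */
static uint32_t drain_all(void) {
    if (atomic_load_explicit(&drain_len, memory_order_relaxed) == 0) return 0;

    uintptr_t chain = atomic_exchange_explicit(&drain_head, (uintptr_t)0, memory_order_acq_rel);
    if (chain == 0) return 0;

    uint32_t processed = 0;
    for (drain_node* n = (drain_node*)chain; n != NULL; n = n->next) {
        processed++;  /* stand-in for the real per-node work */
    }
    if (processed > 0) {
        atomic_fetch_sub_explicit(&drain_len, processed, memory_order_relaxed);
    }
    return processed;
}
```

If the decrement were compiled out while the increments stayed, the length in this model would only grow, and a gate of the form `qlen < target` would eventually stop admitting work altogether.
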
---

#### 8. `atomic_load_explicit(&g_bg_spill_len[class_idx], ...)`
**Location:** Line 30, `bg_spill_drain_class()`
**Classification:** **CORRECTNESS**
**Reason:** Early-exit optimization, but semantically part of correctness

**Code Context:**
```c
void bg_spill_drain_class(int class_idx, pthread_mutex_t* lock) {
    uint32_t approx = atomic_load_explicit(&g_bg_spill_len[class_idx], memory_order_relaxed);  // <-- CORRECTNESS
    if (approx == 0) return;  // Early exit if queue empty
```

**Analysis:**
- While technically an optimization (could check head pointer instead), this is tightly coupled to the queue length semantics
- The counter `g_bg_spill_len` is **not a pure telemetry counter** - it's used for:
  1. Flow control in free path (`qlen < g_bg_spill_target`)
  2. Early-exit optimization in drain path
- **Cannot be removed without affecting behavior**

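
The alternative mentioned in the analysis, checking the head pointer instead of the length, can be compared side by side in a short sketch. Both helpers are hypothetical and exist only for illustration; the point is that either early-exit test is itself an atomic load, so switching between them would not remove an atomic from the drain path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Early-exit test as audited: treat the queue as empty when the length is 0. */
static bool sketch_empty_by_len(_Atomic uint32_t* len) {
    assert(len != NULL);
    return atomic_load_explicit(len, memory_order_relaxed) == 0;
}

/* Hypothetical alternative: treat the queue as empty when the head is null. */
static bool sketch_empty_by_head(_Atomic uintptr_t* head) {
    assert(head != NULL);
    return atomic_load_explicit(head, memory_order_relaxed) == 0;
}
```

The two tests can briefly disagree: in the push functions shown earlier the head is published before the length is incremented, so a node may already be linked while the length still reads 0. A drain that exits early on the length simply picks that node up on a later pass.
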
---

## Additional Atomic Operations (Initialization/Exchange)

#### 9-10. `atomic_store_explicit(&g_bg_spill_head/len[k], ...)` (lines 24-25)
**Location:** `bg_spill_init()`
**Classification:** **CORRECTNESS**
**Reason:** Initialization of atomic variables

#### 11. `atomic_exchange_explicit(&g_bg_spill_head[class_idx], ...)`
**Location:** Line 33, `bg_spill_drain_class()`
**Classification:** **CORRECTNESS**
**Reason:** Atomic swap of head pointer - lock-free dequeue operation

#### 12-13. `atomic_load/compare_exchange_weak_explicit` (lines 100, 102)
**Location:** Re-prepend remainder logic
**Classification:** **CORRECTNESS**
**Reason:** Lock-free re-insertion of unprocessed nodes

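
The re-prepend code itself is not reproduced in this audit, so the following is only a guess at its general shape: a remainder chain that was detached but not processed is linked back in front of whatever producers have pushed in the meantime. The `rp_*` names and the node layout are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef struct rp_node { struct rp_node* next; } rp_node;

/*
 * Re-insert the unprocessed chain [rem_head .. rem_tail] at the front of the
 * shared stack. Nodes pushed by other threads since the drain detached the
 * chain end up behind the remainder.
 */
static void rp_reprepend(_Atomic uintptr_t* head, rp_node* rem_head, rp_node* rem_tail) {
    assert(head != NULL && rem_head != NULL && rem_tail != NULL);
    uintptr_t old_head;
    do {
        old_head = atomic_load_explicit(head, memory_order_acquire);
        rem_tail->next = (rp_node*)old_head;  /* splice current contents after the remainder */
    } while (!atomic_compare_exchange_weak_explicit(head, &old_head, (uintptr_t)rem_head,
                                                    memory_order_release, memory_order_relaxed));
}
```

In this model the length counter needs no adjustment on re-prepend, provided the drain subtracts only the nodes it actually processed, which matches the `processed` argument of the `fetch_sub` in item 7.
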
---

## Key Finding: `g_bg_spill_len` is NOT Telemetry

### Flow Control Usage in `tiny_free_magazine.inc.h`

```c
// Background spill: queue to BG thread instead of locking (when enabled)
if (g_bg_spill_enable) {
    uint32_t qlen = atomic_load_explicit(&g_bg_spill_len[class_idx], memory_order_relaxed);
    if ((int)qlen < g_bg_spill_target) {  // <-- FLOW CONTROL DECISION
        // Build a small chain: include current ptr and pop from mag up to limit
        int limit = g_bg_spill_max_batch;
        // ... queue to background spill ...
    }
}
```

**Analysis:**
- `g_bg_spill_len` determines whether to queue work to background thread or take alternate path
- This is **not a debug counter** - it's an operational control variable
- Removing these atomics would break queue semantics and could lead to unbounded growth

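
The unbounded-growth risk can be demonstrated with a toy model of the gate. Everything here is hypothetical: `gate_len` stands in for `g_bg_spill_len[class_idx]`, each admission queues a single item rather than a batch, and the `count_len` flag models a build in which the `fetch_add` has been compiled out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic uint32_t gate_len;  /* stand-in for g_bg_spill_len[class_idx] */

/* Same shape as the audited check: admit only while below target. */
static bool gate_admits(int target) {
    uint32_t qlen = atomic_load_explicit(&gate_len, memory_order_relaxed);
    return (int)qlen < target;
}

/*
 * Offer `attempts` items to the queue and return how many were admitted.
 * count_len == false models the counter increment being compiled out.
 */
static int gate_offer(int attempts, int target, bool count_len) {
    assert(attempts >= 0);
    int admitted = 0;
    for (int i = 0; i < attempts; i++) {
        if (!gate_admits(target)) continue;  /* caller would take the alternate path */
        admitted++;                          /* real code would push onto the stack here */
        if (count_len) {
            atomic_fetch_add_explicit(&gate_len, 1u, memory_order_relaxed);
        }
    }
    return admitted;
}
```

With the counter maintained and a target of 8, only 8 of 100 offers are admitted before the gate closes. With the increment removed, the length stays at 0 and all 100 are admitted, so nothing bounds the queue between drains.
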
---

## Comparison with Previous Phases

| Phase | Path | TELEMETRY Found | CORRECTNESS Found | Compile-out Benefit |
|-------|------|-----------------|-------------------|---------------------|
| 24 | Alloc Gate | 4 | 0 | +1.62% |
| 25 | Free Path | 5 | 0 | +0.84% |
| 26 | Tiny Front (Hot) | 2 | 0 | NEUTRAL (+cleanliness) |
| 27 | Tiny Front (Stats) | 3 | 0 | +0.28% |
| **28** | **BG Spill Queue** | **0** | **8** | **N/A (NO-OP)** |

---

## Conclusion

**Phase 28 Result: NO-OP**

All 8 inventoried atomic operations in the background spill queue subsystem, along with the 5 additional initialization, exchange, and re-prepend operations (items 9-13), are classified as **CORRECTNESS**:
- Lock-free queue synchronization (CAS loops, head pointer management)
- Queue length tracking used for **flow control** (not telemetry)
- Removal would break concurrency safety or change behavior

**Recommendation:**
- **Do not proceed with compile-out**
- Document this phase as "Audit complete, all CORRECTNESS"
- Move to Phase 29 candidate search

---

## Next Steps

Search for Phase 29 candidates:
1. Check other subsystems with potential telemetry atomics
2. Review cumulative report to identify remaining hot paths
3. Weigh diminishing returns against added code complexity

---

## References

- **Phase 24-27:** Cumulative +2.74% from telemetry pruning
- **Lock-free patterns:** All CAS operations are correctness-critical
- **Flow control vs. telemetry:** Queue length used for operational decisions, not just observation