# Phase 28: Background Spill Queue Atomic Prune Results

**Date:** 2025-12-16
**Status:** ✅ **COMPLETE (NO-OP)**
**Verdict:** **All CORRECTNESS - No compile-out candidates**

---

## Executive Summary

Phase 28 audited every atomic operation in the background spill queue subsystem (`core/hakmem_tiny_bg_spill.*`). **Result: All 8 atomics are CORRECTNESS-critical.** No telemetry atomics were found, so nothing was compiled out.

**Key Finding:** The `g_bg_spill_len` counter superficially resembles the telemetry counters from previous phases, but it is used for **flow control** (queue depth limiting) and therefore must not be compiled out.

---

## Audit Results

### Total Atomics: 8

| Atomic Operation | Location | Classification | Reason |
|-----------------|----------|----------------|--------|
| `atomic_load(&g_bg_spill_head)` | `hakmem_tiny_bg_spill.h:27` | CORRECTNESS | Lock-free queue CAS loop |
| `atomic_compare_exchange_weak(&g_bg_spill_head)` | `hakmem_tiny_bg_spill.h:29` | CORRECTNESS | Lock-free queue CAS |
| `atomic_fetch_add(&g_bg_spill_len, 1)` | `hakmem_tiny_bg_spill.h:32` | CORRECTNESS | Queue length (flow control) |
| `atomic_load(&g_bg_spill_head)` | `hakmem_tiny_bg_spill.h:39` | CORRECTNESS | Lock-free queue CAS loop |
| `atomic_compare_exchange_weak(&g_bg_spill_head)` | `hakmem_tiny_bg_spill.h:41` | CORRECTNESS | Lock-free queue CAS |
| `atomic_fetch_add(&g_bg_spill_len, count)` | `hakmem_tiny_bg_spill.h:44` | CORRECTNESS | Queue length (flow control) |
| `atomic_load(&g_bg_spill_len)` | `hakmem_tiny_bg_spill.c:30` | CORRECTNESS | Early-exit optimization |
| `atomic_fetch_sub(&g_bg_spill_len)` | `hakmem_tiny_bg_spill.c:91` | CORRECTNESS | Queue length decrement |

**CORRECTNESS:** 8/8 (100%)
**TELEMETRY:** 0/8 (0%)

---

## Critical Finding: `g_bg_spill_len` is NOT Telemetry

### The Trap

At first glance, `g_bg_spill_len` looks like a telemetry counter:
- Named with `_len` suffix (like stats counters)
- Incremented on push, decremented on drain
- Uses `atomic_fetch_add/sub` (same pattern as telemetry)

### The Reality

**`g_bg_spill_len` is used for flow control in the hot free path:**

```c
// core/tiny_free_magazine.inc.h:75-77
if (g_bg_spill_enable) {
    uint32_t qlen = atomic_load_explicit(&g_bg_spill_len[class_idx], memory_order_relaxed);
    if ((int)qlen < g_bg_spill_target) {  // <-- FLOW CONTROL DECISION
        // Build a small chain: include current ptr and pop from mag up to limit
        // ...
        bg_spill_push_chain(class_idx, head, tail, taken);
        return;
    }
}
```

**What this means:**
- If queue length < target: queue the work for the background thread
- If queue length >= target: take the alternate path (direct free)
- **Removing this atomic would change program behavior** (the depth limit would stop working, allowing unbounded queue growth)
- **This is an operational counter, not a debug counter**
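
The gate described above can be sketched in isolation. This is a minimal illustration, not the project's code: `sketch_len`, `sketch_target`, `sketch_try_enqueue`, and `sketch_drained` are hypothetical names standing in for `g_bg_spill_len`, `g_bg_spill_target`, and the push and drain paths, and the class count of 8 is assumed. It shows why the counter is operational: the enqueue decision depends on its value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_CLASSES 8   /* hypothetical class count */

/* Hypothetical stand-ins for g_bg_spill_len and g_bg_spill_target. */
static _Atomic uint32_t sketch_len[SKETCH_NUM_CLASSES];
static int sketch_target = 4;

/* Returns true if the caller may enqueue `n` blocks for class_idx,
 * false if it should fall back to the direct free path. */
static bool sketch_try_enqueue(int class_idx, uint32_t n) {
    assert(class_idx >= 0 && class_idx < SKETCH_NUM_CLASSES);
    uint32_t qlen = atomic_load_explicit(&sketch_len[class_idx],
                                         memory_order_relaxed);
    if ((int)qlen >= sketch_target) {
        return false;   /* queue is deep enough: take the direct path */
    }
    /* The real code pushes a chain onto the lock-free stack here;
     * this sketch only tracks the depth. */
    atomic_fetch_add_explicit(&sketch_len[class_idx], n, memory_order_relaxed);
    return true;
}

/* Called by the (hypothetical) drain side after freeing `n` blocks. */
static void sketch_drained(int class_idx, uint32_t n) {
    assert(class_idx >= 0 && class_idx < SKETCH_NUM_CLASSES);
    atomic_fetch_sub_explicit(&sketch_len[class_idx], n, memory_order_relaxed);
}
```

Because the load and the later increment are separate relaxed operations, concurrent callers can overshoot the target slightly, so in this sketch the counter acts as a soft bound rather than a hard limit.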

### Comparison with Telemetry Counters

| Counter | Phase | Purpose | Flow Control? | Classification |
|---------|-------|---------|---------------|----------------|
| `g_tiny_class_stats_*` | 24 | Observe cache hits | NO | TELEMETRY |
| `g_free_ss_enter` | 25 | Count free calls | NO | TELEMETRY |
| `g_unified_cache_*` | 27 | Measure cache perf | NO | TELEMETRY |
| **`g_bg_spill_len`** | **28** | **Queue depth limit** | **YES** | **CORRECTNESS** |

**Key Distinction:** Telemetry counters are **observational**: nothing in the program reads them to make a decision, so they can be compiled out. Operational counters are **functional**: program behavior depends on their value.
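
One way to picture the distinction is a build-time switch that only ever wraps the observational counter. The sketch below is an assumption for illustration: `SKETCH_STATS_COMPILED`, `SKETCH_STAT_INC`, `sketch_stat_hits`, `sketch_queue_len`, and `sketch_submit` are hypothetical names, and the project's real compile-out flags are not shown in this document. With the flag at 0 the telemetry increment disappears, while the operational counter is untouched because a branch reads it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical build flag; not the project's real flag name. */
#ifndef SKETCH_STATS_COMPILED
#define SKETCH_STATS_COMPILED 0
#endif

static _Atomic uint64_t sketch_stat_hits;  /* observational: nothing branches on it */
static _Atomic uint32_t sketch_queue_len;  /* operational: gates the submit decision */

#if SKETCH_STATS_COMPILED
#define SKETCH_STAT_INC(c) atomic_fetch_add_explicit(&(c), 1, memory_order_relaxed)
#else
#define SKETCH_STAT_INC(c) ((void)sizeof(c))  /* no atomic op is emitted */
#endif

/* Returns 1 if the item was accepted, 0 if the queue is at its limit. */
static int sketch_submit(uint32_t limit) {
    assert(limit > 0);
    SKETCH_STAT_INC(sketch_stat_hits);  /* safe to compile out */
    if (atomic_load_explicit(&sketch_queue_len, memory_order_relaxed) >= limit) {
        return 0;
    }
    /* Must stay in every build: the check above depends on it. */
    atomic_fetch_add_explicit(&sketch_queue_len, 1, memory_order_relaxed);
    return 1;
}
```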

---

## Lock-Free Queue Atomics

Six of the eight atomics sit in the two push functions of the lock-free stack in `hakmem_tiny_bg_spill.h`: four operate on `g_bg_spill_head` and two update `g_bg_spill_len`.

### Push Operation (`hakmem_tiny_bg_spill.h` lines 24-32 and 36-44)
```c
static inline void bg_spill_push_one(int class_idx, void* p) {
    uintptr_t old_head;
    do {
        old_head = atomic_load_explicit(&g_bg_spill_head[class_idx], memory_order_acquire);  // CORRECTNESS
        tiny_next_write(class_idx, p, (void*)old_head);
    } while (!atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], &old_head,  // CORRECTNESS
                                                    (uintptr_t)p,
                                                    memory_order_release, memory_order_relaxed));
    atomic_fetch_add_explicit(&g_bg_spill_len[class_idx], 1u, memory_order_relaxed);  // CORRECTNESS (flow control)
}
```

**Analysis:**
- Classic lock-free stack pattern (load → link → CAS loop)
- `atomic_load` + `atomic_compare_exchange_weak` are fundamental to correctness
- Cannot be removed without replacing the entire queue implementation
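
The same pattern extends to pushing a pre-linked chain and to draining. The following self-contained sketch is an assumption about the general technique, not a copy of `bg_spill_push_chain` or the drain code in `hakmem_tiny_bg_spill.c`: `sketch_node`, `sketch_head`, `sketch_len`, `sketch_push_chain`, and `sketch_drain_all` are hypothetical, and draining with a single `atomic_exchange` is one common choice that may differ from what the project does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node type; a real allocator stores the link inside the freed block. */
typedef struct sketch_node { struct sketch_node* next; } sketch_node;

static _Atomic uintptr_t sketch_head;  /* stands in for g_bg_spill_head[class_idx] */
static _Atomic uint32_t  sketch_len;   /* stands in for g_bg_spill_len[class_idx]  */

/* Push a pre-linked chain head..tail of `count` nodes with one CAS loop. */
static void sketch_push_chain(sketch_node* head, sketch_node* tail, uint32_t count) {
    assert(head != NULL && tail != NULL && count > 0);
    uintptr_t old_head = atomic_load_explicit(&sketch_head, memory_order_relaxed);
    do {
        /* Link the chain in front of the current stack. On CAS failure
         * old_head is refreshed automatically, so the link is redone. */
        tail->next = (sketch_node*)old_head;
    } while (!atomic_compare_exchange_weak_explicit(&sketch_head, &old_head,
                                                    (uintptr_t)head,
                                                    memory_order_release,
                                                    memory_order_relaxed));
    atomic_fetch_add_explicit(&sketch_len, count, memory_order_relaxed);
}

/* Detach the whole stack in one exchange; returns the number of nodes taken. */
static uint32_t sketch_drain_all(sketch_node** out) {
    uintptr_t taken = atomic_exchange_explicit(&sketch_head, (uintptr_t)0,
                                               memory_order_acquire);
    uint32_t n = 0;
    for (sketch_node* p = (sketch_node*)taken; p != NULL; p = p->next) {
        n++;
    }
    if (n > 0) {
        atomic_fetch_sub_explicit(&sketch_len, n, memory_order_relaxed);
    }
    *out = (sketch_node*)taken;
    return n;
}
```

In this sketch the head pointer and the length counter are updated by separate atomics, so the length is only approximately in sync with the stack at any instant. That is acceptable for a depth limit, and it is also why neither atomic can be dropped independently.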

---

## Decision: NO-OP

**Verdict:** Phase 28 is a **NO-OP**. No code changes required.

**Rationale:**
1. All atomics are CORRECTNESS-critical
2. `g_bg_spill_len` is used for flow control, not telemetry
3. Lock-free queue operations are untouchable
4. No A/B testing needed (nothing to test)

**Phase 28 Result:**
- **Atomics removed:** 0
- **Performance gain:** N/A
- **Code changes:** None
- **Documentation:** Audit complete, classification recorded

---

## Impact on Future Phases

### Lesson Learned

**Not all counters are telemetry.** Before classifying an atomic as TELEMETRY:
1. Search for all uses of the variable
2. Check if it's used in control flow (`if`, `while`, comparisons)
3. Determine if removal would change program behavior
4. Only compile-out if purely observational

### Similar Candidates to Audit Carefully

**Phase 29+ candidates that may have flow control:**
- `g_remote_target_len` (remote queue length - same pattern as bg_spill)
- `g_l25_pool.remote_count` (L25 pool remote counts)
- Any `*_len`, `*_count` that might be used for queue management

**Red flags for CORRECTNESS:**
- Used in `if (count < threshold)` statements
- Used to decide whether to queue work
- Used to prevent unbounded growth
- Paired with lock-free queue head pointers

---

## Phase 28 Files Analyzed

**No modifications:**
- `core/hakmem_tiny_bg_spill.h` (audit only)
- `core/hakmem_tiny_bg_spill.c` (audit only)
- `core/tiny_free_magazine.inc.h` (flow control usage identified)

**Documentation created:**
- `docs/analysis/PHASE28_BG_SPILL_ATOMIC_AUDIT.md` (detailed audit)
- `docs/analysis/PHASE28_BG_SPILL_ATOMIC_PRUNE_RESULTS.md` (this file)
- `docs/analysis/ATOMIC_PRUNE_CUMULATIVE_SUMMARY.md` (updated)

---

## Cumulative Progress

| Phase | Atomics Removed | Impact | Verdict |
|-------|-----------------|--------|---------|
| 24 | 5 | +0.93% | GO ✅ |
| 25 | 1 | +1.07% | GO ✅ |
| 26 | 5 | -0.33% | NEUTRAL ✅ |
| 27 | 6 | +0.74% | GO ✅ |
| **28** | **0** | **N/A** | **NO-OP ✅** |
| **Total** | **17** | **+2.74%** (sum of GO phases; excludes Phase 26) | **✅** |

**Next:** Phase 29 (remote target queue or pool hotbox v2 stats)

---

## Conclusion

Phase 28 successfully completed its audit objective:
1. ✅ All atomics identified (8 total)
2. ✅ All atomics classified (100% CORRECTNESS)
3. ✅ Flow control usage documented (`g_bg_spill_len`)
4. ✅ No compile-out candidates found
5. ✅ Cumulative summary updated

**Key Takeaway:** Audit phases are valuable even when they result in NO-OP. They document which atomics are untouchable and why, preventing future incorrect optimizations.

---

**Phase 28 Status:** ✅ **COMPLETE (NO-OP)**
**Next Phase:** 29 (TBD based on priority)
**Date:** 2025-12-16