Moe Charm (CI) 9ed8b9c79a Phase 27-28: Unified Cache stats validation + BG Spill audit

Phase 27: Unified Cache Stats A/B Test - GO (+0.74%)
- Target: g_unified_cache_* atomics (6 total) in WARM refill path
- Already implemented in Phase 23 (HAKMEM_TINY_UNIFIED_CACHE_MEASURE_COMPILED)
- A/B validation: Baseline 52.94M vs Compiled-in 52.55M ops/s
- Result: +0.74% mean, +1.01% median (both exceed +0.5% GO threshold)
- Impact: WARM path atomics have similar impact to HOT path
- Insight: Refill frequency is substantial, ENV check overhead matters

Phase 28: BG Spill Queue Atomic Audit - NO-OP
- Target: g_bg_spill_* atomics (8 total) in background spill subsystem
- Classification: 8/8 CORRECTNESS (100% untouchable)
- Key finding: g_bg_spill_len is flow control, NOT telemetry
  - Used in queue depth limiting: if (qlen < target) {...}
  - Operational counter (affects behavior), not observational
- Lesson: Counter name ≠ purpose, must trace all usages
- Result: NO-OP (no code changes, audit documentation only)

Cumulative Progress (Phase 24-28):
- Phase 24 (class stats): +0.93% GO
- Phase 25 (free stats): +1.07% GO
- Phase 26 (diagnostics): -0.33% NEUTRAL
- Phase 27 (unified cache): +0.74% GO
- Phase 28 (bg spill): NO-OP (audit only)
- Total: 17 atomics removed, +2.74% improvement

Documentation:
- PHASE27_UNIFIED_CACHE_STATS_RESULTS.md: Complete A/B test report
- PHASE28_BG_SPILL_ATOMIC_AUDIT.md: Detailed CORRECTNESS classification
- PHASE28_BG_SPILL_ATOMIC_PRUNE_RESULTS.md: NO-OP verdict and lessons
- ATOMIC_PRUNE_CUMULATIVE_SUMMARY.md: Updated with Phase 27-28
- CURRENT_TASK.md: Phase 29 candidate identified (Pool Hotbox v2)

Generated with Claude Code
https://claude.com/claude-code

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2025-12-16 06:12:17 +09:00

Phase 28: Background Spill Queue Atomic Prune Results

Date: 2025-12-16
Status: ✅ COMPLETE (NO-OP)
Verdict: All CORRECTNESS - No compile-out candidates

Executive Summary

Phase 28 conducted a thorough audit of all atomic operations in the background spill queue subsystem (core/hakmem_tiny_bg_spill.*). Result: All 8 atomics are CORRECTNESS-critical. No telemetry atomics were found, therefore no compile-out was performed.

Key Finding: The g_bg_spill_len counter, which superficially resembles telemetry counters from previous phases, is actually used for flow control (queue depth limiting) and is therefore untouchable.

Audit Results

Total Atomics: 8

| Atomic Operation | Location | Classification | Reason |
|---|---|---|---|
| `atomic_load(&g_bg_spill_head)` | `hakmem_tiny_bg_spill.h:27` | CORRECTNESS | Lock-free queue CAS loop |
| `atomic_compare_exchange_weak(&g_bg_spill_head)` | `hakmem_tiny_bg_spill.h:29` | CORRECTNESS | Lock-free queue CAS |
| `atomic_fetch_add(&g_bg_spill_len, 1)` | `hakmem_tiny_bg_spill.h:32` | CORRECTNESS | Queue length (flow control) |
| `atomic_load(&g_bg_spill_head)` | `hakmem_tiny_bg_spill.h:39` | CORRECTNESS | Lock-free queue CAS loop |
| `atomic_compare_exchange_weak(&g_bg_spill_head)` | `hakmem_tiny_bg_spill.h:41` | CORRECTNESS | Lock-free queue CAS |
| `atomic_fetch_add(&g_bg_spill_len, count)` | `hakmem_tiny_bg_spill.h:44` | CORRECTNESS | Queue length (flow control) |
| `atomic_load(&g_bg_spill_len)` | `hakmem_tiny_bg_spill.c:30` | CORRECTNESS | Early-exit optimization |
| `atomic_fetch_sub(&g_bg_spill_len)` | `hakmem_tiny_bg_spill.c:91` | CORRECTNESS | Queue length decrement |

CORRECTNESS: 8/8 (100%)
TELEMETRY: 0/8 (0%)

Critical Finding: `g_bg_spill_len` is NOT Telemetry

The Trap

At first glance, g_bg_spill_len looks like a telemetry counter:

Named with _len suffix (like stats counters)
Incremented on push, decremented on drain
Uses atomic_fetch_add/sub (same pattern as telemetry)

The Reality

g_bg_spill_len is used for flow control in the hot free path:

// core/tiny_free_magazine.inc.h:75-77
if (g_bg_spill_enable) {
    uint32_t qlen = atomic_load_explicit(&g_bg_spill_len[class_idx], memory_order_relaxed);
    if ((int)qlen < g_bg_spill_target) {  // <-- FLOW CONTROL DECISION
        // Build a small chain: include current ptr and pop from mag up to limit
        // ...
        bg_spill_push_chain(class_idx, head, tail, taken);
        return;
    }
}

What this means:

If queue length < target: queue work to background thread
If queue length >= target: take alternate path (direct free)
Removing this atomic would change program behavior (unbounded queue growth)
This is an operational counter, not a debug counter
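
To make the consequence concrete, the following is a minimal sketch of the same admission check, not the project's code. `demo_qlen`, `demo_target`, and `demo_count_queued` are hypothetical stand-ins for `g_bg_spill_len[class_idx]`, `g_bg_spill_target`, and the free path; the sketch assumes no drain happens between calls. With the counter kept, admissions stop at the target; with the increment compiled out, the length always reads 0 and every item is queued.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for g_bg_spill_len[class_idx] and g_bg_spill_target. */
static _Atomic uint32_t demo_qlen = 0;
static int demo_target = 4;

/* Simulates n frees with no background drain in between.
 * keep_counter == false models "compiling out" the length increment.
 * Returns how many items were admitted to the queue. */
static int demo_count_queued(int n, bool keep_counter) {
    atomic_store_explicit(&demo_qlen, 0u, memory_order_relaxed);
    int queued = 0;
    for (int i = 0; i < n; i++) {
        uint32_t qlen = atomic_load_explicit(&demo_qlen, memory_order_relaxed);
        if ((int)qlen < demo_target) {      /* flow-control decision */
            if (keep_counter) {
                atomic_fetch_add_explicit(&demo_qlen, 1u, memory_order_relaxed);
            }
            queued++;                       /* item goes to the queue */
        }
        /* else: caller would take the direct-free path */
    }
    return queued;
}
```

Calling `demo_count_queued(10, true)` admits only `demo_target` items, while `demo_count_queued(10, false)` admits all ten, which is the unbounded-growth behavior described above.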

Comparison with Telemetry Counters

| Counter | Phase | Purpose | Flow Control? | Classification |
|---|---|---|---|---|
| `g_tiny_class_stats_*` | 24 | Observe cache hits | NO | TELEMETRY |
| `g_free_ss_enter` | 25 | Count free calls | NO | TELEMETRY |
| `g_unified_cache_*` | 27 | Measure cache perf | NO | TELEMETRY |
| `g_bg_spill_len` | 28 | Queue depth limit | YES | CORRECTNESS |

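Key Distinction: Telemetry counters are observational: nothing in the program reads them to make a decision, so they can be compiled out without changing behavior. Operational counters are functional: the program reads them to decide what to do, so removing them changes behavior.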
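
The distinction can be sketched in code. This is an illustrative example under assumptions, not the project's implementation: `DEMO_STATS_COMPILED`, `DEMO_STAT_INC`, `demo_hits`, `demo_len`, and `demo_push_allowed` are all invented names (the flag only mimics the `*_COMPILED` naming seen in earlier phases). The telemetry increment disappears when the flag is 0, while the operational counter stays in every build because the function's return value depends on it.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical build flag, invented for this sketch. */
#ifndef DEMO_STATS_COMPILED
#define DEMO_STATS_COMPILED 0
#endif

#if DEMO_STATS_COMPILED
static _Atomic uint64_t demo_hits = 0;   /* telemetry: never read for decisions */
#define DEMO_STAT_INC(c) \
    ((void)atomic_fetch_add_explicit(&(c), 1u, memory_order_relaxed))
#else
#define DEMO_STAT_INC(c) ((void)0)       /* compiled out: no atomic emitted */
#endif

/* Operational counter: always present, regardless of the stats flag. */
static _Atomic uint32_t demo_len = 0;

/* Returns 1 if a push is admitted, 0 if the depth limit has been reached. */
static int demo_push_allowed(uint32_t limit) {
    DEMO_STAT_INC(demo_hits);            /* safe to remove: observational only */
    if (atomic_load_explicit(&demo_len, memory_order_relaxed) >= limit) {
        return 0;                        /* behavior depends on demo_len */
    }
    atomic_fetch_add_explicit(&demo_len, 1u, memory_order_relaxed);
    return 1;
}
```

The result of `demo_push_allowed` is identical whether `DEMO_STATS_COMPILED` is 0 or 1, which is the property a counter needs before it can be classified as TELEMETRY.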

Lock-Free Queue Atomics

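The six atomics in hakmem_tiny_bg_spill.h belong to the two lock-free push functions: four operate on g_bg_spill_head (the load and CAS in each push loop) and two increment g_bg_spill_len after a successful push: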

Push Operation (lines 24-32, 36-44)

static inline void bg_spill_push_one(int class_idx, void* p) {
    uintptr_t old_head;
    do {
        old_head = atomic_load_explicit(&g_bg_spill_head[class_idx], memory_order_acquire);  // CORRECTNESS
        tiny_next_write(class_idx, p, (void*)old_head);
    } while (!atomic_compare_exchange_weak_explicit(&g_bg_spill_head[class_idx], &old_head,  // CORRECTNESS
                                                    (uintptr_t)p,
                                                    memory_order_release, memory_order_relaxed));
    atomic_fetch_add_explicit(&g_bg_spill_len[class_idx], 1u, memory_order_relaxed);  // CORRECTNESS (flow control)
}

Analysis:

Classic lock-free stack pattern (load → link → CAS loop)
atomic_load + atomic_compare_exchange_weak are fundamental to correctness
Cannot be removed without replacing entire queue implementation
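
For readers who want to see the whole pattern in isolation, here is a self-contained sketch of a lock-free LIFO push with a paired length counter. All names (`demo_node`, `demo_head`, `demo_count`, `demo_push`, `demo_drain_all`) are hypothetical. The drain side is an assumption: the audit only records that the real drain in hakmem_tiny_bg_spill.c uses `atomic_load` and `atomic_fetch_sub` on the length, so the `atomic_exchange` detach shown here is one common technique, not necessarily what the project does.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef struct demo_node { struct demo_node* next; } demo_node;

static _Atomic uintptr_t demo_head = 0;   /* top of the lock-free stack */
static _Atomic uint32_t demo_count = 0;   /* length used for flow control */

/* Push one node: load head, link, CAS; retry until the CAS succeeds. */
static void demo_push(demo_node* n) {
    uintptr_t old_head = atomic_load_explicit(&demo_head, memory_order_acquire);
    do {
        /* On CAS failure old_head is refreshed with the current head,
         * so the link is rewritten before each retry. */
        n->next = (demo_node*)old_head;
    } while (!atomic_compare_exchange_weak_explicit(&demo_head, &old_head,
                                                    (uintptr_t)n,
                                                    memory_order_release,
                                                    memory_order_relaxed));
    atomic_fetch_add_explicit(&demo_count, 1u, memory_order_relaxed);
}

/* Assumed drain strategy: detach the whole list with one exchange,
 * then subtract the number of detached nodes from the length. */
static uint32_t demo_drain_all(demo_node** out) {
    demo_node* list = (demo_node*)atomic_exchange_explicit(
        &demo_head, (uintptr_t)0, memory_order_acquire);
    uint32_t n = 0;
    for (demo_node* p = list; p != NULL; p = p->next) n++;
    if (n != 0) {
        atomic_fetch_sub_explicit(&demo_count, n, memory_order_relaxed);
    }
    *out = list;
    return n;
}
```

Dropping either the load/CAS pair or the counter updates breaks the structure: without the CAS, concurrent pushes can lose nodes, and without the counter the admission check in the free path has nothing to compare against.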

Decision: NO-OP

Verdict: Phase 28 is a NO-OP. No code changes required.

Rationale:

All atomics are CORRECTNESS-critical
g_bg_spill_len is used for flow control, not telemetry
Lock-free queue operations are untouchable
No A/B testing needed (nothing to test)

Phase 28 Result:

Atomics removed: 0
Performance gain: N/A
Code changes: None
Documentation: Audit complete, classification recorded

Impact on Future Phases

Lesson Learned

Not all counters are telemetry. Before classifying an atomic as TELEMETRY:

Search for all uses of the variable
Check if it's used in control flow (if, while, comparisons)
Determine if removal would change program behavior
Only compile-out if purely observational

Similar Candidates to Audit Carefully

Phase 29+ candidates that may have flow control:

g_remote_target_len (remote queue length - same pattern as bg_spill)
g_l25_pool.remote_count (L25 pool remote counts)
Any *_len, *_count that might be used for queue management

Red flags for CORRECTNESS:

Used in if (count < threshold) statements
Used to decide whether to queue work
Used to prevent unbounded growth
Paired with lock-free queue head pointers

Phase 28 Files Analyzed

No modifications:

core/hakmem_tiny_bg_spill.h (audit only)
core/hakmem_tiny_bg_spill.c (audit only)
core/tiny_free_magazine.inc.h (flow control usage identified)

Documentation created:

docs/analysis/PHASE28_BG_SPILL_ATOMIC_AUDIT.md (detailed audit)
docs/analysis/PHASE28_BG_SPILL_ATOMIC_PRUNE_RESULTS.md (this file)
docs/analysis/ATOMIC_PRUNE_CUMULATIVE_SUMMARY.md (updated)

Cumulative Progress

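| Phase | Atomics Removed | Impact | Verdict |
|---|---|---|---|
| 24 | 5 | +0.93% | GO ✅ |
| 25 | 1 | +1.07% | GO ✅ |
| 26 | 5 | -0.33% | NEUTRAL ✅ |
| 27 | 6 | +0.74% | GO ✅ |
| 28 | 0 | N/A | NO-OP ✅ |
| Total | 17 | +2.74% | ✅ |

The impact total sums the three GO phases (0.93 + 1.07 + 0.74 = 2.74). Phase 26's -0.33% NEUTRAL result is not included in that figure, although its 5 atomics count toward the 17 removed.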

Next: Phase 29 (remote target queue or pool hotbox v2 stats)

Conclusion

Phase 28 successfully completed its audit objective:

✅ All atomics identified (8 total)
✅ All atomics classified (100% CORRECTNESS)
✅ Flow control usage documented (g_bg_spill_len)
✅ No compile-out candidates found
✅ Cumulative summary updated

Key Takeaway: Audit phases are valuable even when they result in NO-OP. They document which atomics are untouchable and why, preventing future incorrect optimizations.

Phase 28 Status: ✅ COMPLETE (NO-OP)
Next Phase: 29 (TBD based on priority)
Date: 2025-12-16