hakmem/docs/analysis/PHASE29_POOL_HOTBOX_V2_STATS_RESULTS.md

# Phase 29: Pool Hotbox v2 Stats Prune - Results

## Executive Summary

**Date:** 2025-12-16
**Result:** **NO-OP** (Pool Hotbox v2 not active in default configuration)
**Verdict:** NEUTRAL - Keep compile-out for code cleanliness
**Impact:** 0.00% (atomics never executed)

## A/B Test Results

### Configuration

- **Baseline (COMPILED=0, default):** Atomics compiled out (via `#if HAKMEM_POOL_HOTBOX_V2_STATS_COMPILED`)
- **Research (COMPILED=1):** Atomics compiled in (`atomic_fetch_add_explicit` calls present in the binary; they execute only if Pool Hotbox v2 is enabled at runtime)
- **Workload:** `bench_random_mixed` (10 runs, 20M ops each)

### Raw Data

**Baseline (atomics OFF):**
```
Run  1: 52,651,025 ops/s
Run  2: 52,251,016 ops/s
Run  3: 52,545,864 ops/s
Run  4: 53,765,007 ops/s
Run  5: 53,284,121 ops/s
Run  6: 52,982,021 ops/s
Run  7: 53,073,218 ops/s
Run  8: 52,844,359 ops/s
Run  9: 53,238,262 ops/s
Run 10: 53,150,487 ops/s

Mean:   52,978,538 ops/s
Stdev:     429,428 ops/s (0.81%)
```

**Research (atomics ON):**
```
Run  1: 53,282,648 ops/s
Run  2: 53,973,577 ops/s
Run  3: 52,681,322 ops/s
Run  4: 54,175,703 ops/s
Run  5: 52,841,032 ops/s
Run  6: 53,461,187 ops/s
Run  7: 52,268,525 ops/s
Run  8: 53,799,964 ops/s
Run  9: 52,147,517 ops/s
Run 10: 54,432,544 ops/s

Mean:   53,306,402 ops/s
Stdev:     800,321 ops/s (1.50%)
```

### Statistical Analysis

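| Metric | Baseline (OFF) | Research (ON) | Delta (OFF − ON) |
|--------|----------------|---------------|------------------|
| Mean | 52,978,538 ops/s | 53,306,402 ops/s | -327,864 ops/s |
| Stdev | 429,428 (0.81%) | 800,321 (1.50%) | -370,893 |
| **Relative Delta** | - | - | **-0.62%** |

**Interpretation:** The research build (atomics ON) measured 0.62% FASTER than the baseline (atomics OFF); the negative delta means the baseline is the slower build. The 327,864 ops/s gap is smaller than the run-to-run standard deviation of either build, so it is not distinguishable from noise.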
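
The summary figures can be recomputed from the raw runs. The following is a minimal sketch, not part of the hakmem tree: the helper names are hypothetical, and the Newton square root is only there so the snippet builds without linking `libm`. It also computes Welch's t statistic, which the original analysis did not report.

```c
#include <assert.h>
#include <stddef.h>

/* Newton-Raphson square root, used only to avoid a libm dependency. */
static double sqrt_newton(double v) {
    if (v <= 0.0) return 0.0;
    double x = v;
    for (int i = 0; i < 64; i++) x = 0.5 * (x + v / x);
    return x;
}

/* Arithmetic mean of n samples. */
static double sample_mean(const double *x, size_t n) {
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) sum += x[i];
    return sum / (double)n;
}

/* Sample standard deviation (n - 1 denominator). */
static double sample_stdev(const double *x, size_t n) {
    assert(n > 1);
    double m = sample_mean(x, n), ss = 0.0;
    for (size_t i = 0; i < n; i++) ss += (x[i] - m) * (x[i] - m);
    return sqrt_newton(ss / (double)(n - 1));
}

/* Welch's t statistic for two samples with unequal variances (b minus a). */
static double welch_t(const double *a, size_t na, const double *b, size_t nb) {
    double sa = sample_stdev(a, na), sb = sample_stdev(b, nb);
    double se = sqrt_newton(sa * sa / (double)na + sb * sb / (double)nb);
    return (sample_mean(b, nb) - sample_mean(a, na)) / se;
}

/* Throughput in ops/s, copied from the Raw Data section. */
static const double baseline_off[10] = {
    52651025, 52251016, 52545864, 53765007, 53284121,
    52982021, 53073218, 52844359, 53238262, 53150487
};
static const double research_on[10] = {
    53282648, 53973577, 52681322, 54175703, 52841032,
    53461187, 52268525, 53799964, 52147517, 54432544
};
```

Calling `welch_t(baseline_off, 10, research_on, 10)` should give a value of roughly 1.1, below the value of about 2 that a 5% significance test would require at this sample size, which is consistent with reading the delta as noise.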

## Root Cause Analysis: Why NO-OP?

### Discovery

Pool Hotbox v2 is **OFF by default** and gated by environment variable:

```c
// core/hakmem_pool.c:824-831
static int pool_hotbox_v2_global_enabled(void) {
    static int g = -1;
    if (__builtin_expect(g == -1, 0)) {
        const char* e = getenv("HAKMEM_POOL_V2_ENABLED");  // ← ENV gate
        g = (e && *e && *e != '0') ? 1 : 0;
    }
    return g;
}
```

**Result:** With the gate closed (the default), no `pool_hotbox_v2_record_*()` call is ever reached:
- `pool_hotbox_v2_alloc()` is never called
- `pool_hotbox_v2_free()` is never called
- None of the 12 atomic counters is ever incremented
- Compile-out has **zero runtime effect**
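
To make the effect concrete, here is a small self-contained sketch of the same gating pattern. Every name in it (`DEMO_V2_ENABLED`, `demo_alloc`, the counters) is hypothetical and stands in for the real hakmem symbols; only the shape of the gate function follows the excerpt above.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical names throughout; DEMO_V2_ENABLED stands in for the real gate. */
static atomic_ulong demo_v2_alloc_calls;  /* telemetry counter on the gated path */
static unsigned long demo_v1_allocs;      /* plain counter for the default path */

/* Same shape as the gate above: read the environment once, cache the result. */
static int demo_v2_enabled(void) {
    static int g = -1;
    if (g == -1) {
        const char *e = getenv("DEMO_V2_ENABLED");
        g = (e && *e && *e != '0') ? 1 : 0;
    }
    return g;
}

/* Gated path: the only place the telemetry counter is touched. */
static void demo_v2_alloc(void) {
    atomic_fetch_add_explicit(&demo_v2_alloc_calls, 1, memory_order_relaxed);
}

/* Dispatcher: with the variable unset, the v2 path is never entered. */
static void demo_alloc(void) {
    if (demo_v2_enabled()) {
        demo_v2_alloc();
        return;
    }
    demo_v1_allocs++;
}
```

With `DEMO_V2_ENABLED` unset, any number of `demo_alloc()` calls leaves `demo_v2_alloc_calls` at zero, so compiling the increment in or out cannot change what the benchmark executes.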

### Why Research Build is Faster

**Hypothesis:** Compiler optimization artifact (noise)

1. **High variance in research build:** 1.50% stdev vs 0.81% baseline
   - Suggests measurement noise, not real effect
   - Delta (-0.62%) is within 1 stdev of research build

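2. **Code layout changes:**
   - With `COMPILED=1` the atomic increments are emitted into the binary even though they never run, so function sizes and object file layout differ between the two builds
   - This may affect instruction cache alignment by chance
   - LTO/PGO are sensitive to code structure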

3. **Sample size:** 10 runs insufficient to distinguish noise from signal

**Conclusion:** The -0.62% "speedup" for atomics ON is likely **noise**, not a real effect.

## Comparison: Phase 27 vs Phase 29

| Phase | Target | Path | Active? | Result |
|-------|--------|------|---------|--------|
| 27 | Unified Cache stats | WARM | ✅ YES | +0.74% GO |
| 29 | Pool Hotbox v2 stats | HOT+WARM | ❌ NO | 0.00% NO-OP |

**Key difference:** Phase 27 stats were on ACTIVE code path; Phase 29 stats are on INACTIVE (ENV-gated) path.

## Why Keep Compile-Out?

Despite NO-OP result, **we maintain the compile-out** for:

1. **Code cleanliness and binary size:** The default build no longer carries increment code for the stats (12 atomics × 7 classes = 84 atomic counters)
2. **Future-proofing:** If Pool v2 is enabled later, compile-out is already in place
3. **Consistency:** Matches Phase 24-28 atomic prune pattern
4. **Documentation:** Makes it clear these are research-only counters

## Actionable Findings

### For Phase 29

**Decision:** NEUTRAL - Maintain compile-out (default `COMPILED=0`)

**Rationale:**
- No performance impact (code not running)
- No harm (compile-out is correct for inactive code)
- Future benefit (ready if Pool v2 is enabled)

### For Future Phases

**Lesson:** Before A/B testing compile-out, verify code is ACTIVE:

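```bash
# Check if the feature is runtime-gated (replace FEATURE with the feature's name)
rg "getenv.*FEATURE" && echo "⚠️ ENV-gated, may be OFF by default"

# Verify the code path is exercised
# Option 1: Add a temporary printf and check whether it fires
# Option 2: Use perf to check whether the functions are called, e.g.
#   perf record ./bench_random_mixed && perf report --stdio | rg pool_hotbox_v2
#   (binary path and arguments are assumptions; inlined functions may not show up)
```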
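
Option 1 can be made a little more convenient than a bare `printf`. The sketch below assumes a hypothetical `PATH_PROBE` macro (not part of hakmem) that logs once per call site, so a hot function does not flood the output.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Returns 1 and logs to stderr the first time a site is reached, 0 afterwards. */
static int path_probe_hit(atomic_flag *seen, const char *name) {
    if (atomic_flag_test_and_set_explicit(seen, memory_order_relaxed)) {
        return 0;  /* already reported */
    }
    fprintf(stderr, "[probe] reached: %s\n", name);
    return 1;
}

/* One static flag per call site; temporary, to be removed after the audit. */
#define PATH_PROBE(name)                                   \
    do {                                                   \
        static atomic_flag seen_ = ATOMIC_FLAG_INIT;       \
        (void)path_probe_hit(&seen_, (name));              \
    } while (0)

/* Hypothetical function standing in for a path under audit. */
static void demo_hot_function(void) {
    PATH_PROBE("demo_hot_function");
    /* ... real work would go here ... */
}
```

If the benchmark finishes without a `[probe] reached:` line for the function under audit, the path is inactive in that configuration and an A/B test of its atomics would be a no-op.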

**Updated audit checklist:**
1. ✅ Classify atomics (CORRECTNESS vs TELEMETRY)
2. ✅ Verify no flow control usage
3. **NEW:** ✅ Verify code path is ACTIVE in benchmark
4. Implement compile-out
5. A/B test

## Files Modified

### Phase 29 Implementation

1. **Build flag:** `core/hakmem_build_flags.h:352-361`
   ```c
   #ifndef HAKMEM_POOL_HOTBOX_V2_STATS_COMPILED
   #  define HAKMEM_POOL_HOTBOX_V2_STATS_COMPILED 0
   #endif
   ```

2. **Compile-out:** `core/hakmem_pool.c:903-1129`
   - Wrapped 13 atomic writes (lines 903, 913, 922, 931, 941, 947, 950, 953, 957, 972-983, 1117, 1126)
   - Example:
     ```c
     #if HAKMEM_POOL_HOTBOX_V2_STATS_COMPILED
         atomic_fetch_add_explicit(&g_pool_hotbox_v2_stats[ci].alloc_calls, 1, ...);
     #else
         (void)0;
     #endif
     ```

3. **Include:** `core/hakmem_pool.c:48`
   - Added `#include "hakmem_build_flags.h"`
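
The per-site `#if`/`#else` pattern above could also be folded into a single macro. This is a sketch of that alternative, not what the commit does: `DEMO_POOL_V2_STAT_INC`, the two-field struct, and the relaxed memory order are all assumptions (the source elides the memory-order argument).

```c
#include <stdatomic.h>
#include <stdint.h>

#ifndef HAKMEM_POOL_HOTBOX_V2_STATS_COMPILED
#  define HAKMEM_POOL_HOTBOX_V2_STATS_COMPILED 0
#endif

/* Hypothetical two-field stand-in for the real stats struct. */
typedef struct {
    _Atomic uint64_t alloc_calls;
    _Atomic uint64_t free_calls;
} demo_pool_v2_stats_t;

static demo_pool_v2_stats_t g_demo_stats[7];  /* one entry per size class */

#if HAKMEM_POOL_HOTBOX_V2_STATS_COMPILED
/* Research build: a relaxed atomic increment (memory order is an assumption). */
#  define DEMO_POOL_V2_STAT_INC(ci, field) \
      atomic_fetch_add_explicit(&g_demo_stats[(ci)].field, 1, memory_order_relaxed)
#else
/* Default build: expands to nothing but still "uses" the index argument. */
#  define DEMO_POOL_V2_STAT_INC(ci, field) ((void)(ci))
#endif

/* Call sites stay free of preprocessor conditionals. */
static void demo_record_alloc(int ci) {
    DEMO_POOL_V2_STAT_INC(ci, alloc_calls);
}
```

Building with `-DHAKMEM_POOL_HOTBOX_V2_STATS_COMPILED=1` turns the increment on; the default leaves `g_demo_stats` untouched. The trade-off is that a macro hides the flag from anyone reading a call site, whereas the explicit `#if` blocks keep it visible.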

### Audit Documentation

- `docs/analysis/PHASE29_POOL_HOTBOX_V2_AUDIT.md`
  - Complete usage analysis (24 sites: 12 writes + 12 reads)
  - TELEMETRY classification (all 12 fields)
  - No CORRECTNESS usage found

## Performance Impact

**Expected:** +0.2% to +0.5% (similar to Phase 27)
**Actual:** 0.00% (code path not active)

**If Pool v2 were enabled:**
- 12 atomic counters on HOT+WARM path
- Estimated impact: +0.3% to +0.8% (higher than Phase 27 due to HOT path presence)

## Recommendations

### Immediate

1. **Keep compile-out:** No downside, future upside
2. **Update audit process:** Add "verify code is active" step
3. **Document ENV gates:** Tag all ENV-gated features in audit

### Future Work

**Phase 30+ candidates:**
- Focus on **ACTIVE** code paths only
- Check for ENV gates before scheduling A/B tests
- Consider enabling Pool v2 (if performance gain expected) to test this prune's true impact

### Pool Hotbox v2 Activation

**If enabling Pool v2 in future:**

```bash
# Enable Pool v2 globally
export HAKMEM_POOL_V2_ENABLED=1

# Enable specific classes (bitmask)
export HAKMEM_POOL_V2_CLASSES=0x7F  # All 7 classes

# Enable stats (if COMPILED=1)
export HAKMEM_POOL_V2_STATS=1
```

**Then re-run Phase 29 A/B test to measure true impact.**
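
For reference, a class bitmask such as `0x7F` maps bit `i` to size class `i`. How hakmem actually parses `HAKMEM_POOL_V2_CLASSES` is not shown in this document, so the following is only an illustrative sketch with hypothetical function names.

```c
#include <errno.h>
#include <stdlib.h>

#define DEMO_POOL_NUM_CLASSES 7

/* Parses text such as "0x7F" or "127"; returns fallback on NULL, empty,
 * or malformed input. Bits above the class count are masked off. */
static unsigned demo_parse_class_mask(const char *text, unsigned fallback) {
    if (!text || !*text) return fallback;
    char *end = NULL;
    errno = 0;
    unsigned long v = strtoul(text, &end, 0);  /* base 0 accepts 0x.. and decimal */
    if (errno != 0 || end == text || *end != '\0') return fallback;
    return (unsigned)(v & ((1ul << DEMO_POOL_NUM_CLASSES) - 1));
}

/* Returns 1 if class index ci is selected by the mask, 0 otherwise. */
static int demo_class_enabled(unsigned mask, int ci) {
    if (ci < 0 || ci >= DEMO_POOL_NUM_CLASSES) return 0;
    return (int)((mask >> ci) & 1u);
}
```

With this reading, `0x7F` selects classes 0 through 6, and a narrower mask such as `0x05` would select only classes 0 and 2, which could be useful for isolating the stats cost of a single class in a re-run.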

## Conclusion

Phase 29 successfully implements compile-out infrastructure for Pool Hotbox v2 stats, but has **zero performance impact** because Pool v2 is disabled by default in the benchmark.

**Verdict:** NEUTRAL - Maintain compile-out for code cleanliness and future-proofing.

**Key lesson:** Always verify code path is ACTIVE before scheduling A/B tests. ENV-gated features may appear on hot paths but never execute.

---

**Phase 29 Status:** COMPLETE (NO-OP, but infrastructure ready)
**Next Phase:** Phase 30 (TBD - focus on ACTIVE code paths)

Phase 29: Pool Hotbox v2 Stats Prune - NO-OP (infrastructure ready)

Target: g_pool_hotbox_v2_stats atomics (12 total) in Pool v2
Result: 0.00% impact (code path inactive by default, ENV-gated)
Verdict: NO-OP - Maintain compile-out for future-proofing

Audit Results:
- Classification: 12/12 TELEMETRY (100% observational)
- Counters: alloc_calls, alloc_fast, alloc_refill, alloc_refill_fail,
  alloc_fallback_v1, free_calls, free_fast, free_fallback_v1,
  page_of_fail_* (4 failure counters)
- Verification: All stats/logging only, zero flow control usage
- Phase 28 lesson applied: Traced all usages, confirmed no CORRECTNESS

Key Finding: Pool v2 OFF by default
- Requires HAKMEM_POOL_V2_ENABLED=1 to activate
- Benchmark never executes Pool v2 code paths
- Compile-out has zero performance impact (code never runs)

Implementation (future-ready):
- Added HAKMEM_POOL_HOTBOX_V2_STATS_COMPILED (default: 0)
- Wrapped 13 atomic write sites in core/hakmem_pool.c
- Pattern: #if HAKMEM_POOL_HOTBOX_V2_STATS_COMPILED ... #endif
- Expected impact if Pool v2 enabled: +0.3% to +0.8% (HOT+WARM atomics)

A/B Test Results:
- Baseline (COMPILED=0): 52.98 M ops/s (±0.43M, 0.81% stdev)
- Research (COMPILED=1): 53.31 M ops/s (±0.80M, 1.50% stdev)
- Delta: -0.62% (noise, not real effect - code path not active)

Critical Lesson Learned (NEW):
Phase 29 revealed ENV-gated features can appear on hot paths but never
execute. Updated audit checklist:
1. Classify atomics (CORRECTNESS vs TELEMETRY)
2. Verify no flow control usage
3. NEW: Verify code path is ACTIVE in benchmark (check ENV gates)
4. Implement compile-out
5. A/B test

Verification methods added to documentation:
- rg "getenv.*FEATURE" to check ENV gates
- perf record/report to verify execution
- Debug printf for quick validation

Cumulative Progress (Phase 24-29):
- Phase 24 (class stats): +0.93% GO
- Phase 25 (free stats): +1.07% GO
- Phase 26 (diagnostics): -0.33% NEUTRAL
- Phase 27 (unified cache): +0.74% GO
- Phase 28 (bg spill): NO-OP (all CORRECTNESS)
- Phase 29 (pool v2): NO-OP (inactive code path)
- Total: 17 atomics removed, +2.74% improvement

Documentation:
- PHASE29_POOL_HOTBOX_V2_AUDIT.md: Complete audit with TELEMETRY classification
- PHASE29_POOL_HOTBOX_V2_STATS_RESULTS.md: Results + new lesson learned
- ATOMIC_PRUNE_CUMULATIVE_SUMMARY.md: Updated with Phase 29 + new checklist
- PHASE29_COMPLETE.md: Completion summary with recommendations

Decision: Keep compile-out despite NO-OP
- Code cleanliness (binary size reduction)
- Future-proofing (ready when Pool v2 enabled)
- Consistency with Phase 24-28 pattern

Generated with Claude Code
https://claude.com/claude-code

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2025-12-16 06:33:41 +09:00