hakmem/archive/analysis/RING_SIZE_INDEX.md

Ring Size Analysis: Document Index

Overview

This directory contains an in-depth analysis of why changing POOL_TLS_RING_CAP affects the mid_large_mt and random_mixed benchmarks differently, and proposes a solution (separate ring sizes per pool) that serves both benchmarks.

Documents

1. RING_SIZE_SUMMARY.md (Start Here!)

Length: 2.4 KB | Read Time: 2 minutes

Executive summary with:

  • Problem statement
  • Root cause explanation
  • Solution overview
  • Expected results
  • Key insights

Best for: Quick understanding of the issue and solution.

2. RING_SIZE_VISUALIZATION.txt

Length: 14 KB | Read Time: 5 minutes

Visual guide with ASCII art showing:

  • Pool routing diagrams
  • TLS memory footprint comparison
  • L1 cache pressure visualization
  • Performance bar charts
  • Implementation roadmap

Best for: Visual learners who want to see the problem graphically.

3. RING_SIZE_SOLUTION.md

Length: 7.6 KB | Read Time: 10 minutes

Step-by-step implementation guide with:

  • Exact code changes (line numbers)
  • sed commands for bulk replacement
  • Testing plan with scripts
  • Expected performance matrix
  • Rollback plan

Best for: Implementing the fix.

4. RING_SIZE_DEEP_ANALYSIS.md

Length: 18 KB | Read Time: 30 minutes

Complete technical analysis with 10 sections:

  1. Pool routing confirmation
  2. TLS memory footprint analysis
  3. Why ring size affects benchmarks differently
  4. Why Ring=128 hurts BOTH benchmarks
  5. Separate ring sizes per pool (solution)
  6. Optimal ring size sweep
  7. Other bottlenecks analysis
  8. Implementation guidance
  9. Recommended approach
  10. Conclusion + Appendix (cache analysis)

Best for: Deep understanding of the root cause and trade-offs.

Quick Navigation

Want to → Read:

  • Understand the problem in 2 min → RING_SIZE_SUMMARY.md
  • See visual diagrams → RING_SIZE_VISUALIZATION.txt
  • Implement the fix → RING_SIZE_SOLUTION.md
  • Deep technical dive → RING_SIZE_DEEP_ANALYSIS.md

Key Findings

Root Cause

POOL_TLS_RING_CAP controls the ring size of the L2 Pool (8-32 KB) only:

  • mid_large_mt uses the L2 Pool → benefits from larger rings
  • random_mixed uses the Tiny Pool → gains nothing from larger rings and is hurt when the L2 Pool's larger TLS footprint evicts its data from the L1 cache

Solution

Use separate ring sizes per pool:

  • L2 Pool: POOL_L2_RING_CAP=48 (balanced)
  • L2.5 Pool: POOL_L25_RING_CAP=16 (unchanged)
  • Tiny Pool: No ring (freelist-based, unchanged)
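
The per-pool split above could look roughly like the following in C. This is a minimal sketch, not the hakmem code: the macro names and values come from this index, but the struct layouts, the bounded LIFO behaviour, and the helper names (`L2RingSketch`, `L25RingSketch`, `l2_ring_push`, `l2_ring_pop`) are hypothetical and only illustrate how each pool's TLS array can be sized independently.

```c
#include <assert.h>
#include <stddef.h>

/* Per-pool capacities, overridable at build time
 * (e.g. -DPOOL_L2_RING_CAP=48 -DPOOL_L25_RING_CAP=16). */
#ifndef POOL_L2_RING_CAP
#define POOL_L2_RING_CAP 48
#endif
#ifndef POOL_L25_RING_CAP
#define POOL_L25_RING_CAP 16
#endif

static_assert(POOL_L2_RING_CAP > 0 && POOL_L25_RING_CAP > 0,
              "ring capacities must be positive");

/* Hypothetical layouts: each pool sizes its own array, so growing the
 * L2 ring no longer grows the L2.5 ring (or the total TLS footprint)
 * by the same amount. */
typedef struct {
    void *items[POOL_L2_RING_CAP];
    int   top;                     /* number of cached blocks */
} L2RingSketch;

typedef struct {
    void *items[POOL_L25_RING_CAP];
    int   top;
} L25RingSketch;

/* Cache a freed block; returns 1 on success, 0 when the ring is full
 * (a real allocator would then fall back to a slower shared path). */
static int l2_ring_push(L2RingSketch *r, void *p) {
    if (r->top >= POOL_L2_RING_CAP) return 0;
    r->items[r->top++] = p;
    return 1;
}

/* Take the most recently cached block, or NULL when the ring is empty. */
static void *l2_ring_pop(L2RingSketch *r) {
    return r->top > 0 ? r->items[--r->top] : NULL;
}
```

Because the capacities are guarded by `#ifndef`, the same source can be rebuilt with different `-D` values to sweep ring sizes without editing the file.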

Expected Results

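| Metric | Ring=16 | Ring=64 | L2=48 | L2=48 vs Ring=64 |
|---|---|---|---|---|
| mid_large_mt (ops/s) | 36.04M | 37.22M | 36.8M | -1.1% |
| random_mixed (ops/s) | 22.5M | 21.29M | 22.5M | +5.7% |
| Average (ops/s) | 29.27M | 29.26M | 29.65M | +1.3% |
| TLS/thread | 2.36 KB | 5.05 KB | 3.4 KB | -33% |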

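Win-Win: compared with Ring=16, mid_large_mt is expected to gain about 2.1% with random_mixed unchanged; compared with Ring=64, random_mixed gains 5.7% at a cost of 1.1% on mid_large_mt.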
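
The percentages in the expected-results figures can be reproduced with a few lines of C. This is an illustrative sketch: the throughput and TLS numbers are the ones quoted in this index, while the helper names (`pct_change`, `mean2`, `print_l2_48_vs_ring_64`) are hypothetical and not part of hakmem.

```c
#include <stdio.h>

/* Relative change of `candidate` versus `reference`, in percent. */
static double pct_change(double candidate, double reference) {
    return (candidate / reference - 1.0) * 100.0;
}

/* Arithmetic mean of the two benchmark throughputs (the "Average" row). */
static double mean2(double a, double b) {
    return (a + b) / 2.0;
}

/* Prints the L2=48 versus Ring=64 comparison from the quoted figures
 * (throughput in M ops/s, TLS footprint in KB per thread). */
static void print_l2_48_vs_ring_64(void) {
    printf("mid_large_mt: %+.1f%%\n", pct_change(36.8, 37.22));
    printf("random_mixed: %+.1f%%\n", pct_change(22.5, 21.29));
    printf("average:      %+.1f%%\n",
           pct_change(mean2(36.8, 22.5), mean2(37.22, 21.29)));
    printf("TLS/thread:   %+.1f%%\n", pct_change(3.4, 5.05));
}
```

Calling `pct_change(36.8, 36.04)` gives the same comparison against the Ring=16 baseline instead of Ring=64.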

Implementation Timeline

  • Code changes: 30 minutes
  • Testing: 2-3 hours
  • Documentation: 30 minutes
  • Total: 3-4 hours

Files to Modify

  1. core/hakmem_pool.c - Replace POOL_TLS_RING_CAP → POOL_L2_RING_CAP
  2. core/hakmem_l25_pool.c - Replace POOL_TLS_RING_CAP → POOL_L25_RING_CAP
  3. Makefile - Add -DPOOL_L2_RING_CAP=48 -DPOOL_L25_RING_CAP=16

Success Criteria

✓ mid_large_mt: ≥36.5M ops/s (+1.3% vs baseline)
✓ random_mixed: ≥22.4M ops/s (within ±1% of baseline)
✓ TLS footprint: ≤3.5 KB/thread
✓ No regressions in full benchmark suite
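
The three numeric criteria can be turned into an automated gate once measurements are available. The following is a hedged sketch: the thresholds are taken from the list above, but the `RingBenchResult` struct and `meets_success_criteria` function are hypothetical, and the fourth criterion (no regressions in the full benchmark suite) is not covered and still needs its own check.

```c
#include <stdbool.h>

/* Hypothetical container for one measured configuration. */
typedef struct {
    double mid_large_mt_mops;   /* mid_large_mt throughput, M ops/s */
    double random_mixed_mops;   /* random_mixed throughput, M ops/s */
    double tls_kb_per_thread;   /* TLS footprint, KB per thread */
} RingBenchResult;

/* Returns true when all three numeric success criteria hold. */
static bool meets_success_criteria(const RingBenchResult *r) {
    return r->mid_large_mt_mops >= 36.5    /* >= 36.5M ops/s */
        && r->random_mixed_mops >= 22.4    /* >= 22.4M ops/s */
        && r->tls_kb_per_thread <= 3.5;    /* <= 3.5 KB/thread */
}
```

Fed with the expected L2=48 figures (36.8, 22.5, 3.4) the check passes, while the Ring=16 and Ring=64 figures each fail at least one threshold.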