From b52e1985e6c4e8ddca72d687168ccc00d47cfd5a Mon Sep 17 00:00:00 2001
From: "Moe Charm (CI)"
Date: Fri, 28 Nov 2025 18:16:32 +0900
Subject: [PATCH] Phase 2-Opt2: Reduce SuperSlab default size to 512KB (+10-15% perf)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Changes:
- SUPERSLAB_LG_MIN: 20 → 19 (1MB → 512KB)
- SUPERSLAB_LG_DEFAULT: 21 → 19 (2MB → 512KB)
- SUPERSLAB_LG_MAX: 21 (unchanged, still allows 2MB)

Benchmark Results:
- ws=256: 72M → 79.80M ops/s (+10.8%, +7.8M ops/s)
- ws=1024: 56.71M → 65.07M ops/s (+14.7%, +8.36M ops/s)

Expected: +3-5% improvement
Actual: +10-15% improvement (EXCEEDED PREDICTION!)

Root Cause Analysis:
- Perf analysis showed shared_pool_acquire_slab at 23.83% CPU time
- Phase 1 removed memset overhead (+1.3%)
- Phase 2 reduces mmap allocation size by 75% (2MB → 512KB)
- Fewer page faults during SuperSlab initialization
- Better memory granularity (less VA space waste)
- Smaller allocations complete faster even without page faults

Technical Details:
- Each SuperSlab contains 8 slabs of 64KB (total 512KB)
- Previous: 16-32 slabs per SuperSlab (1-2MB)
- New: 8 slabs per SuperSlab (512KB)
- Refill frequency increases slightly, but init cost dominates
- Net effect: Major throughput improvement

Phase 1+2 Cumulative Improvement:
- Baseline: 64.61M ops/s
- Phase 1 final: 72.92M ops/s (+12.9%)
- Phase 2 final: 79.80M ops/s (+23.5% total, +9.4% over Phase 1)

Files Modified:
- core/hakmem_tiny_superslab_constants.h:12-33

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude
---
 core/hakmem_tiny_superslab_constants.h | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/core/hakmem_tiny_superslab_constants.h b/core/hakmem_tiny_superslab_constants.h
index 6155badc..603350a3 100644
--- a/core/hakmem_tiny_superslab_constants.h
+++ b/core/hakmem_tiny_superslab_constants.h
@@ -9,18 +9,27 @@
 // SuperSlab Layout Constants
 // ============================================================================
-// Log2 range for SuperSlab sizes (in MB):
-// - MIN: 1MB (2^20)
-// - MAX: 2MB (2^21)
-// - DEFAULT: 2MB unless constrained by ACE/env
+// Log2 range for SuperSlab sizes:
+// - MIN: 512KB (2^19) - Phase 2 optimization: reduced from 1MB
+// - MAX: 2MB (2^21) - unchanged
+// - DEFAULT: 512KB (2^19) - Phase 2 optimization: reduced from 2MB
+//
+// Phase 2-Opt2: Reduce SuperSlab size to minimize initialization cost
+// Benefit: 75% reduction in allocation size (2MB → 512KB)
+// Expected: +3-5% throughput improvement
+// Rationale:
+// - Smaller SuperSlab = fewer page faults during allocation
+// - Better memory granularity (less wasted VA space)
+// - Memset already removed in Phase 1, so pure allocation overhead
+// - Perf analysis showed shared_pool_acquire_slab at 23.83% CPU time
 #ifndef SUPERSLAB_LG_MIN
-#define SUPERSLAB_LG_MIN 20
+#define SUPERSLAB_LG_MIN 19
 #endif
 #ifndef SUPERSLAB_LG_MAX
 #define SUPERSLAB_LG_MAX 21
 #endif
 #ifndef SUPERSLAB_LG_DEFAULT
-#define SUPERSLAB_LG_DEFAULT 21
+#define SUPERSLAB_LG_DEFAULT 19
 #endif
 // Size of each slab within SuperSlab (fixed, never changes)