Major Features:
- Debug counter infrastructure for Refill Stage tracking
- Free Pipeline counters (ss_local, ss_remote, tls_sll)
- Diagnostic counters for early return analysis
- Unified larson.sh benchmark runner with profiles
- Phase 6-3 regression analysis documentation

Bug Fixes:
- Fix SuperSlab disabled by default (HAKMEM_TINY_USE_SUPERSLAB)
- Fix profile variable naming consistency
- Add .gitignore patterns for large files

Performance:
- Phase 6-3: 4.79 M ops/s (has OOM risk)
- With SuperSlab: 3.13 M ops/s (+19% improvement)

This is a clean repository without large log files.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
75 lines
3.0 KiB
C
#include <stdio.h>
#include <stdlib.h>

int main() {
    printf("=== Deep Analysis: The Real 24-byte Mystery ===\n\n");

    // Key insight: aligned_alloc() test showed ONLY 1.5 MB for 100 × 64KB
    // Expected: 6.4 MB
    // This means: RSS is NOT tracking all virtual memory!

    printf("Observation from aligned_alloc test:\n");
    printf(" 100 × 64 KB = 6.4 MB expected\n");
    printf(" Actual RSS: 1.5 MB\n");
    printf(" Ratio: 23%% (only touched pages counted!)\n\n");

    printf("HAKMEM test results:\n");
    printf(" 1M × 16B = 15.26 MB data\n");
    printf(" RSS: 39.6 MB\n");
    printf(" Overhead: 24.34 MB\n\n");

    printf("Hypothesis: SuperSlab pre-allocation\n");
    printf(" SuperSlab size: 2 MB\n");
    printf(" Blocks per slab (16B): 4096\n");
    printf(" If using SuperSlab:\n");
    printf(" - Each SuperSlab: 2 MB (32 × 64 KB slabs)\n");
    printf(" - Slabs needed: 245 regular OR 8 SuperSlabs\n");
    printf(" - SuperSlab total: 8 × 2 MB = 16 MB\n\n");

    printf("But wait! SuperSlab would HELP, not hurt!\n\n");

    printf("Alternative: The TLS Magazine is FILLING UP\n");
    printf(" TLS Magazine capacity: 2048 items per class\n");
    printf(" At steady state (1M allocations active):\n");
    printf(" - Magazine likely has ~1000-2000 items cached\n");
    printf(" - These are ALLOCATED blocks held in magazine\n");
    printf(" - 2048 × 16B × 8 classes = 256 KB\n");
    printf(" But that's only 0.25 MB, not 24 MB!\n\n");

    printf("REAL ROOT CAUSE: Working Set Effect\n");
    printf(" The test allocates 1M × 16B sequentially\n");
    printf(" RSS measures: Data + Pointer array + ALL touched pages\n\n");

    printf("Let's recalculate with page granularity:\n");
    printf(" Page size: 4 KB\n");
    printf(" Slab size: 64 KB = 16 pages\n");
    printf(" Slabs needed: 245\n");
    printf(" Total pages touched: 245 × 16 = 3920 pages\n");
    printf(" Total RSS from slabs: 3920 × 4 KB = 15.31 MB ✓\n\n");

    printf("But actual RSS = 39.6 MB, so where's the other 24 MB?\n\n");

    printf("=== THE ANSWER ===\n");
    printf("It's NOT the slabs! It's something else entirely.\n\n");

    printf("Checking test_memory_usage.c:\n");
    printf(" void** ptrs = malloc(1M × 8 bytes);\n");
    printf(" 1M allocations × 16 bytes each\n");
    printf(" BUT: Each malloc has HEADER overhead!\n\n");

    printf("Standard malloc overhead:\n");
    printf(" glibc malloc: 8-16 bytes per allocation\n");
    printf(" If glibc adds 16 bytes per block:\n");
    printf(" 1M × (16 data + 16 header) = 32 MB\n");
    printf(" Plus pointer array: 7.63 MB\n");
    printf(" Total: 39.63 MB ✓✓✓\n\n");

    printf("CONCLUSION:\n");
    printf("The 24-byte overhead is HAKMEM's OWN block headers!\n");
    printf("But wait... HAKMEM uses bitmap, not headers!\n\n");

    printf("Let me check if test is calling glibc malloc underneath...\n");

    return 0;
}