# Profiling Results (2025-10-22)
This document records the overheads currently observed in lightweight sampling runs. Throughput is ignored; the focus is on average nanoseconds (avg ns) per path.

Environment: larson, hakmem loaded via LD_PRELOAD, sampling profiler ON (HAKMEM_PROF=1), sample rates as indicated.
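
To make "sample rate" and "avg ns per path" concrete, here is a minimal Python sketch of 1-in-N sampled timing: only every N-th call is timed, and the average is the accumulated time divided by the number of samples. It illustrates the general idea only and is not hakmem's profiler; the `SampledTimer` class, its fields, and the use of `bytearray` as a stand-in workload are all assumptions made for this example.

```python
import time


class SampledTimer:
    """Times one call in every `rate` calls (hypothetical helper, not a hakmem API)."""

    def __init__(self, rate=256):
        self.rate = rate      # 256 means a 1/256 sample rate
        self.calls = 0        # all calls seen, sampled or not
        self.samples = 0      # calls that were actually timed
        self.total_ns = 0     # accumulated elapsed time of sampled calls

    def run(self, fn, *args):
        self.calls += 1
        if self.calls % self.rate != 0:
            return fn(*args)  # unsampled path: no clock reads
        start = time.perf_counter_ns()
        result = fn(*args)
        self.total_ns += time.perf_counter_ns() - start
        self.samples += 1
        return result

    def avg_ns(self):
        # Average over sampled calls only; 0.0 if nothing was sampled yet.
        return self.total_ns / self.samples if self.samples else 0.0


# Stand-in workload: allocate a small buffer many times.
timer = SampledTimer(rate=256)
for _ in range(100_000):
    timer.run(bytearray, 64)
print(f"samples={timer.samples}, avg={timer.avg_ns():.1f} ns")
```

With a scheme like this, a short run over a rarely taken path yields few samples, so averages for rare paths are noisier than those for hot paths.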

## 10s reference (subset)

- BURST 10s, 1T
  - tiny_alloc: avg ~28.8 ns (samples ~65k)
- BURST 10s, 4T
  - malloc_alloc: avg ~98.7 ns (samples ~16k)
- LOOP 10s, 4T
  - malloc_alloc: avg ~40.9 ns (samples ~35k)

## 2s sweep (1/256), 1T/4T (mid/large ranges)

- Mid 2–32KiB, 1T
  - ace_alloc: ~26–31 ns
  - malloc_alloc: ~150–220 ns
- Mid 2–32KiB, 4T
  - ace_alloc: ~31–35 ns
  - malloc_alloc: ~250–315 ns
- Large 64KiB–1MiB, 1T
  - ace_alloc: ~32–58 ns → 31–50 ns (after W_MAX tuning)
  - malloc_alloc: ~960–1,690 ns → 840–1,330 ns (slight drop)
- Large 64KiB–1MiB, 4T
  - ace_alloc: ~52–72 ns
  - malloc_alloc: ~2.5–4.1 µs

Notes:

- Tiny path is healthy: tiny_alloc ~18–29 ns, tiny_reg_lookup ~17–23 ns.
- Registry registration costs ~0.6–1.3 µs but is rare (slab creation only).
- Pool/L25 lock/refill paths are instrumented and do show up, but their sample counts are low in 2s runs; use focused size ranges and a higher sampling rate for deeper analysis.
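
To put the "slight drop" after W_MAX tuning in relative terms, the short Python sketch below computes the percentage change at each end of the Large 64KiB to 1MiB, 1T ranges. It assumes those figures read as 32 to 58 ns before and 31 to 50 ns after for ace_alloc, and 960 to 1,690 ns before and 840 to 1,330 ns after for malloc_alloc; `pct_drop` is a helper name introduced here for illustration.

```python
def pct_drop(before_ns, after_ns):
    """Relative decrease from before_ns to after_ns, in percent."""
    return (before_ns - after_ns) / before_ns * 100.0


# (low, high) bounds in ns for Large 64KiB-1MiB, 1T, as read from the 2s sweep.
before = {"ace_alloc": (32, 58), "malloc_alloc": (960, 1690)}
after = {"ace_alloc": (31, 50), "malloc_alloc": (840, 1330)}

for path in before:
    low = pct_drop(before[path][0], after[path][0])
    high = pct_drop(before[path][1], after[path][1])
    print(f"{path}: low bound -{low:.1f}%, high bound -{high:.1f}%")
```

Under that reading, malloc_alloc improves by 12.5% at the low end of the range and by roughly 21% at the high end, so most of the gain is in the slower cases.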

## 1s sweep (1/256), 1T (W_MAX_mid=1.40, W_MAX_large=1.30)

- Mid 2–32KiB, 1T
  - ace_alloc: ~26.6 ns
  - malloc_alloc: ~153.2 ns (slight drop)
- Large 64KiB–1MiB, 1T
  - ace_alloc: ~31.9–50.6 ns
  - malloc_alloc: ~927.9–1,334.1 ns (modest drop)