New Feature: ss_prefault_box.h
- Box for controlling SuperSlab page prefaulting policy
- ENV: HAKMEM_SS_PREFAULT (0=OFF, 1=POPULATE, 2=TOUCH)
- Default: OFF (safe mode until further optimization)
Bug Fix: 4MB MAP_POPULATE regression
- Problem: Fallback path allocated 4MB (2x size for alignment) with MAP_POPULATE, causing 52x slower mmap (0.585ms → 30.6ms) and a 35% throughput regression
- Solution: Remove MAP_POPULATE from the 4MB allocation; apply madvise(MADV_WILLNEED) only to the aligned 2MB region after trimming prefix/suffix
Changes:
- core/box/ss_prefault_box.h: New prefault policy box (header-only)
- core/box/ss_allocation_box.c: Integrate prefault box, call ss_prefault_region()
- core/superslab_cache.c: Fix fallback path (no MAP_POPULATE on 4MB, always munmap prefix/suffix, use MADV_WILLNEED for 2MB only)
- docs/specs/ENV_VARS*.md: Document HAKMEM_SS_PREFAULT
Performance:
- bench_random_mixed: 4.32M ops/s (regression fixed, slight improvement)
- bench_tiny_hot: 157M ops/s with prefault=1 (no crash)
Box Theory:
- OS layer (ss_os_acquire): "how to mmap"
- Prefault Box: "when to page-in"
- Allocation Box: "when to call prefault"
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
HAKMEM Environment Variables Complete Reference
Total Variables: 83 environment variables + multiple compile-time flags
Last Updated: 2025-11-01
Purpose: Complete reference for diagnosing memory issues and configuration
CRITICAL DISCOVERY: Statistics Disabled by Default
The Problem
Tiny Pool statistics are DISABLED unless you build with -DHAKMEM_ENABLE_STATS:
- Current behavior: alloc=0, free=0, slab=0 (statistics not collected)
- Impact: Memory diagnostics are blind
- Root cause: Build-time flag NOT set in Makefile
How to Enable Statistics
Option 1: Build with statistics (RECOMMENDED for debugging)
make clean
make CFLAGS="-DHAKMEM_ENABLE_STATS" bench_fragment_stress_hakmem
Option 2: Edit Makefile (add to line 18)
CFLAGS = -O3 ... -DHAKMEM_ENABLE_STATS ...
Why Statistics are Disabled
From /mnt/workdisk/public_share/hakmem/core/hakmem_tiny_stats.h:
// Purpose: Zero-overhead production builds by disabling stats collection
// Usage: Build with -DHAKMEM_ENABLE_STATS to enable (default: disabled)
// Impact: 3-5% speedup when disabled (removes 0.5ns TLS increment)
//
// Default: DISABLED (production performance)
// Enable: make CFLAGS=-DHAKMEM_ENABLE_STATS
When DISABLED: All stats_record_alloc() and stats_record_free() become no-ops
When ENABLED: Batched TLS counters track exact allocation/free counts
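The compile-time gating described above can be sketched roughly as follows. Only stats_record_alloc(), stats_record_free() and HAKMEM_ENABLE_STATS come from the header; the counter arrays and the stats_alloc_count() query are assumptions for illustration, not the actual contents of hakmem_tiny_stats.h (which batches its TLS counters).

```c
#include <stdint.h>

#ifdef HAKMEM_ENABLE_STATS
/* Hypothetical per-thread counters, one slot per Tiny class. */
static _Thread_local uint64_t t_alloc_count[8];
static _Thread_local uint64_t t_free_count[8];

static inline void stats_record_alloc(int class_idx) { t_alloc_count[class_idx]++; }
static inline void stats_record_free(int class_idx)  { t_free_count[class_idx]++; }
static inline uint64_t stats_alloc_count(int class_idx) { return t_alloc_count[class_idx]; }
#else
/* Stats disabled: the record calls do nothing and every query reports 0. */
static inline void stats_record_alloc(int class_idx) { (void)class_idx; }
static inline void stats_record_free(int class_idx)  { (void)class_idx; }
static inline uint64_t stats_alloc_count(int class_idx) { (void)class_idx; return 0; }
#endif
```

Built without -DHAKMEM_ENABLE_STATS, the counters never move, which is why the reported numbers stay at zero regardless of how much the pool is used.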
Environment Variable Categories
1. Tiny Pool Core (Critical)
HAKMEM_WRAP_TINY
- Default: 1 (enabled)
- Purpose: Enable Tiny Pool fast-path (bypasses wrapper guard)
- Impact: Controls whether malloc/free use Tiny Pool for ≤1KB allocations
- Usage: export HAKMEM_WRAP_TINY=1 (already default since Phase 7.4)
- Location: /mnt/workdisk/public_share/hakmem/core/hakmem_tiny_init.inc:25
- Notes: Without this, Tiny Pool returns NULL and falls back to L2/L25
HAKMEM_WRAP_TINY_REFILL
- Default: 0 (disabled)
- Purpose: Allow trylock-based magazine refill during wrapper calls
- Impact: Enables limited refill under trylock (no blocking)
- Usage: export HAKMEM_WRAP_TINY_REFILL=1
- Safety: OFF by default (avoids deadlock risk in recursive malloc)
HAKMEM_TINY_USE_SUPERSLAB
- Default: 1 (enabled)
- Purpose: Enable SuperSlab allocator for Tiny Pool slabs
- Impact: When OFF, Tiny Pool cannot allocate new slabs
- Critical: Must be ON for Tiny Pool to work
2. Tiny Pool TLS Caching (Performance Critical)
HAKMEM_TINY_MAG_CAP
- Default: Per-class (typically 512-2048)
- Purpose: Global TLS magazine capacity override
- Impact: Larger = fewer refills, more memory
- Usage: export HAKMEM_TINY_MAG_CAP=1024
HAKMEM_TINY_MAG_CAP_C{0..7}
- Default: None (uses class defaults)
- Purpose: Per-class magazine capacity override
- Example: HAKMEM_TINY_MAG_CAP_C3=512 (64B class)
- Classes: C0=8B, C1=16B, C2=32B, C3=64B, C4=128B, C5=256B, C6=512B, C7=1KB
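To make the class table concrete, here is a minimal sketch that maps a request size to a class index. The function name tiny_size_to_class() is hypothetical, and rounding every size up to the next power-of-two class is an assumption based on the list above, not a copy of the allocator's lookup.

```c
#include <stddef.h>

/* Hypothetical helper: C0=8B, C1=16B, ... C7=1KB (powers of two).
 * Returns -1 for sizes above 1KB, which Tiny Pool does not serve. */
static int tiny_size_to_class(size_t size)
{
    if (size > 1024) return -1;
    size_t class_size = 8;
    int class_idx = 0;
    while (class_size < size) {   /* round up to the next class size */
        class_size <<= 1;
        class_idx++;
    }
    return class_idx;
}
```

With this mapping, a 64-byte request lands in C3, so HAKMEM_TINY_MAG_CAP_C3 is the override that affects it.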
HAKMEM_TINY_TLS_SLL
- Default: 1 (enabled)
- Purpose: Enable TLS Single-Linked-List cache layer
- Impact: Fast-path cache before magazine
- Performance: Critical for tiny allocations (8-64B)
HAKMEM_SLL_MULTIPLIER
- Default: 2
- Purpose: SLL capacity = MAG_CAP × multiplier for small classes (0-3)
- Range: 1..16
- Impact: Higher = more TLS memory, fewer refills
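The capacity rule above can be written out as a small sketch. The function name sll_capacity() is hypothetical. Clamping out-of-range multipliers and leaving classes 4-7 at the magazine capacity are assumptions, since the text only documents classes 0-3.

```c
/* Hypothetical helper mirroring "SLL capacity = MAG_CAP x multiplier". */
static unsigned sll_capacity(unsigned mag_cap, int multiplier, int class_idx)
{
    /* Clamp to the documented range 1..16 (clamping is an assumption). */
    if (multiplier < 1)  multiplier = 1;
    if (multiplier > 16) multiplier = 16;

    /* The multiplier is documented for small classes (0-3) only. */
    if (class_idx >= 0 && class_idx <= 3)
        return mag_cap * (unsigned)multiplier;
    return mag_cap;   /* assumption: larger classes are not scaled */
}
```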
HAKMEM_TINY_REFILL_MAX
- Default: 64
- Purpose: Magazine refill batch size (normal classes)
- Impact: Larger = fewer refills, more memory spike
HAKMEM_TINY_REFILL_MAX_HOT
- Default: 192
- Purpose: Magazine refill batch size for hot classes (≤64B)
- Impact: Larger batches for frequently used sizes
HAKMEM_TINY_REFILL_MAX_C{0..7}
- Default: None
- Purpose: Per-class refill batch override
- Example: HAKMEM_TINY_REFILL_MAX_C2=96 (32B class)
HAKMEM_TINY_REFILL_MAX_HOT_C{0..7}
- Default: None
- Purpose: Per-class hot refill override (classes 0-3)
- Priority: Overrides HAKMEM_TINY_REFILL_MAX_HOT
3. SuperSlab Configuration
HAKMEM_TINY_SS_MAX_MB
- Default: Unlimited
- Purpose: Maximum SuperSlab memory per class (MB)
- Impact: Caps total slab allocation
- Usage: export HAKMEM_TINY_SS_MAX_MB=512
HAKMEM_TINY_SS_MIN_MB
- Default: 0
- Purpose: Minimum SuperSlab reservation per class (MB)
- Impact: Pre-allocates memory at startup
HAKMEM_TINY_SS_RESERVE
- Default: 0
- Purpose: Reserve SuperSlab memory at init
- Impact: Prevents initial allocation delays
HAKMEM_TINY_TRIM_SS
- Default: 0
- Purpose: Enable SuperSlab trimming/deallocation
- Impact: Returns memory to OS when idle
HAKMEM_TINY_SS_PARTIAL
- Default: 0
- Purpose: Enable partial slab reclamation
- Impact: Free partially-used slabs
HAKMEM_TINY_SS_PARTIAL_INTERVAL
- Default: 1000000 (1M allocations)
- Purpose: Interval between partial slab checks
- Impact: Lower = more aggressive trimming
HAKMEM_TINY_SS_CACHE
- Default: 0 (disabled)
- Purpose: Per-class SuperSlab cache capacity
- Impact: Limits how many freed SuperSlabs are kept in LRU cache before munmap
HAKMEM_TINY_SS_CACHE_C{0..7}
- Default: unset (inherits HAKMEM_TINY_SS_CACHE)
- Purpose: Per-class overrides for cache capacity
- Impact: Fine-grained control of cache size per Tiny class
HAKMEM_TINY_SS_PRECHARGE
- Default: 0
- Purpose: Precharge (pre-allocate) SuperSlabs into cache at startup/runtime
- Impact: Reduces first-use page faults by having warm SuperSlabs ready
HAKMEM_TINY_SS_PRECHARGE_C{0..7}
- Default: unset (inherits HAKMEM_TINY_SS_PRECHARGE)
- Purpose: Per-class precharge targets
- Impact: e.g., HAKMEM_TINY_SS_PRECHARGE_C0=4 precharges 4 SuperSlabs for class 0
HAKMEM_TINY_SS_POPULATE_ONCE
- Default: 0
- Purpose: Use MAP_POPULATE for the next SuperSlab allocation only
- Impact: One-shot prefault for A/B testing; superseded by HAKMEM_SS_PREFAULT for always-on use
HAKMEM_SS_PREFAULT
- Default: 0 (OFF, safety-first default)
- Type: integer (0–3)
- Purpose: Control SuperSlab prefault strategy to reduce kernel page fault overhead (enabled explicitly when tuning).
- Values:
  - 0 = OFF: legacy behavior; only HAKMEM_TINY_SS_POPULATE_ONCE may trigger a one-shot MAP_POPULATE (the current safe default).
  - 1 = POPULATE: always pass populate=1 to ss_os_acquire() (use MAP_POPULATE for every new SuperSlab). Requires perf verification.
  - 2 = TOUCH: POPULATE plus ss_prefault_region(), which touches each page once (4KB stride) after mmap (experimental).
  - 3 = ASYNC: reserved for a future background-prefault implementation (currently treated as TOUCH).
- Implementation:
  - Policy Box: core/box/ss_prefault_box.h
  - Integration: core/box/ss_allocation_box.c calls ss_prefault_policy() to set populate and ss_prefault_region() immediately after ss_os_acquire().
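A rough sketch of how the four values could be parsed and how a 4KB-stride touch works is shown below. The names ss_prefault_mode, ss_prefault_parse() and touch_pages() are hypothetical and are not the real ss_prefault_policy() / ss_prefault_region() from ss_prefault_box.h. Falling back to OFF for unset or unrecognized input is an assumption.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical enum; values follow the documented HAKMEM_SS_PREFAULT levels. */
typedef enum {
    SS_PREFAULT_OFF = 0,
    SS_PREFAULT_POPULATE = 1,
    SS_PREFAULT_TOUCH = 2
} ss_prefault_mode;

/* Parse the variable's text. Unset, empty, or unrecognized values fall back
 * to OFF (assumption); 3 (ASYNC) is currently treated as TOUCH. */
static ss_prefault_mode ss_prefault_parse(const char *value)
{
    if (value == NULL || *value == '\0') return SS_PREFAULT_OFF;
    char *end = NULL;
    long v = strtol(value, &end, 10);
    if (*end != '\0') return SS_PREFAULT_OFF;
    switch (v) {
    case 1:  return SS_PREFAULT_POPULATE;
    case 2:
    case 3:  return SS_PREFAULT_TOUCH;
    default: return SS_PREFAULT_OFF;
    }
}

/* Touch one byte per 4KB page so each page is faulted in up front.
 * Writing back the byte just read leaves the contents unchanged.
 * Returns the number of pages touched. */
static size_t touch_pages(void *base, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)base;
    size_t touched = 0;
    for (size_t off = 0; off < len; off += 4096) {
        p[off] = p[off];
        touched++;
    }
    return touched;
}
```

In a real caller the text would come from getenv("HAKMEM_SS_PREFAULT"), and the touch loop would run over the freshly mapped SuperSlab region.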
4. Remote Free & Background Processing
HAKMEM_TINY_REMOTE_DRAIN_THRESHOLD
- Default: 32
- Purpose: Trigger remote free drain when count exceeds threshold
- Impact: Controls when to process cross-thread frees
- Per-class: ACE can tune this per-class
HAKMEM_TINY_REMOTE_DRAIN_TRYRATE
- Default: 16
- Purpose: Probability (1/N) of attempting trylock drain
- Impact: Lower = more aggressive draining
5. Statistics & Profiling
HAKMEM_ENABLE_STATS (BUILD-TIME)
- Default: UNDEFINED (statistics DISABLED)
- Purpose: Enable batched TLS statistics collection
- Build: make CFLAGS=-DHAKMEM_ENABLE_STATS
- Impact: 0.5ns overhead per alloc/free when enabled
- Critical: Must be defined to see any statistics
HAKMEM_TINY_STAT_RATE_LG
- Default: 0 (no sampling)
- Purpose: Sample statistics at 1/2^N rate
- Example: HAKMEM_TINY_STAT_RATE_LG=4 → sample 1/16 allocs
- Requires: HAKMEM_ENABLE_STATS + HAKMEM_TINY_STAT_SAMPLING build flags
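The 1/2^N rate can be illustrated with a counter-and-mask check. This is a sketch under the assumption that sampling is counter-based; the actual allocator may select samples differently (for example, with a pseudo-random source). The function name stat_should_sample() is hypothetical.

```c
#include <stdint.h>

/* Hypothetical sampler: returns 1 for one event out of every 2^rate_lg.
 * rate_lg = 0 records every event (no sampling). rate_lg must be < 64. */
static int stat_should_sample(uint64_t event_index, unsigned rate_lg)
{
    uint64_t mask = (UINT64_C(1) << rate_lg) - 1;
    return (event_index & mask) == 0;
}
```

With rate_lg=4 the mask is 15, so events 0, 16, 32, ... are recorded, which matches the 1/16 rate in the example.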
HAKMEM_TINY_COUNT_SAMPLE
- Default: 8
- Purpose: Legacy sampling exponent (deprecated)
- Note: Replaced by batched stats in Phase 3
HAKMEM_TINY_PATH_DEBUG
- Default: 0
- Purpose: Enable allocation path debugging counters
- Requires: HAKMEM_DEBUG_COUNTERS=1 build flag
- Output: atexit() dump of path hit counts
6. ACE Learning System (Adaptive Control Engine)
HAKMEM_ACE_ENABLED
- Default: 0
- Purpose: Enable ACE learning system
- Impact: Adaptive tuning of Tiny Pool parameters
- Note: Already integrated but can be disabled
HAKMEM_ACE_OBSERVE
- Default: 0
- Purpose: Enable ACE observation logging
- Impact: Verbose output of ACE decisions
HAKMEM_ACE_DEBUG
- Default: 0
- Purpose: Enable ACE debug logging
- Impact: Detailed ACE internal state
HAKMEM_ACE_SAMPLE
- Default: Undefined (no sampling)
- Purpose: Sample ACE events at given rate
- Impact: Reduces ACE overhead
HAKMEM_ACE_LOG_LEVEL
- Default: 0
- Purpose: ACE logging verbosity (0-3)
- Levels: 0=off, 1=errors, 2=info, 3=debug
HAKMEM_ACE_FAST_INTERVAL_MS
- Default: 100ms
- Purpose: Fast ACE update interval
- Impact: How often ACE checks metrics
HAKMEM_ACE_SLOW_INTERVAL_MS
- Default: 1000ms
- Purpose: Slow ACE update interval
- Impact: Background tuning frequency
7. Intelligence Engine (INT)
HAKMEM_INT_ENGINE
- Default: 0
- Purpose: Enable background intelligence/adaptation engine
- Impact: Deferred event processing + adaptive tuning
- Pairs with: HAKMEM_TINY_FRONTEND
HAKMEM_INT_ADAPT_REFILL
- Default: 1 (when INT enabled)
- Purpose: Adapt REFILL_MAX dynamically (±16)
- Impact: Tunes refill sizes based on miss rate
HAKMEM_INT_ADAPT_CAPS
- Default: 1 (when INT enabled)
- Purpose: Adapt MAG/SLL capacities (±16/±32)
- Impact: Grows hot classes, shrinks cold ones
HAKMEM_INT_EVENT_TS
- Default: 0
- Purpose: Include timestamps in INT events
- Impact: Adds clock_gettime() overhead
HAKMEM_INT_SAMPLE
- Default: Undefined (no sampling)
- Purpose: Sample INT events at 1/2^N rate
- Impact: Reduces INT overhead on hot path
8. Frontend & Experimental Features
HAKMEM_TINY_FRONTEND
- Default: 0
- Purpose: Enable mimalloc-style frontend cache
- Impact: Adds FastCache layer before backend
- Experimental: A/B testing only
HAKMEM_TINY_FASTCACHE
- Default: 0
- Purpose: Low-level FastCache toggle
- Impact: Internal A/B switch
HAKMEM_TINY_QUICK
- Default: 0
- Purpose: Enable TinyQuickSlot (6-item single-cacheline stack)
- Impact: Ultra-fast path for ≤64B
- Experimental: Bench-only optimization
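HAKMEM_TINY_HOTMAG (removed)
- 2025-12 cleanup: The HotMag runtime ENV toggle was removed. HotMag is fixed to OFF by default and cannot be adjusted via ENV.
HAKMEM_TINY_HOTMAG_CAP (removed)
- 2025-12 cleanup: The HotMag capacity ENV was removed (fixed value 128).
HAKMEM_TINY_HOTMAG_REFILL (removed)
- 2025-12 cleanup: The HotMag refill batch ENV was removed (fixed value 32).
HAKMEM_TINY_HOTMAG_C{0..7} (removed)
- 2025-12 cleanup: The per-class HotMag enable/disable ENVs were removed (fixed OFF for all classes).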
9. Tiny Front Routing
HAKMEM_TINY_PROFILE
- Default: "conservative"
- Type: string
- Purpose: Control Tiny Front (TLS SLL / FastCache) vs Pool/backend routing per Tiny class via a simple profile.
- Profiles:
  - "conservative":
    - All classes (C0–C7) use TINY_FIRST: try Tiny Front first, then fall back to Pool/backend on miss.
  - "hot":
    - C0–C3: TINY_ONLY (small classes use Tiny exclusively via front gate)
    - C4–C6: TINY_FIRST
    - C7: POOL_ONLY (1KB headerless class uses Pool/backend)
  - "off":
    - All classes POOL_ONLY (Tiny Front is fully disabled, Pool-only allocator behaviour).
  - "full":
    - All classes TINY_ONLY (microbench-style, front gate always routes via Tiny).
- Implementation:
  - Box: core/box/tiny_route_box.h / tiny_route_box.c (per-class g_tiny_route[8] table).
  - Gate: tiny_alloc_gate_fast() reads TinyRoutePolicy and decides Tiny vs Pool on each allocation.
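The profile-to-route mapping above could be filled in along these lines. This is a sketch, not the contents of tiny_route_box.c: the function tiny_route_fill() is hypothetical, the enum is a stand-in for the real TinyRoutePolicy definition, and treating unset or unknown names as "conservative" is an assumption.

```c
#include <string.h>

/* Stand-in for the per-class route policy described above. */
typedef enum { TINY_FIRST, TINY_ONLY, POOL_ONLY } TinyRoutePolicy;

/* Fill route[0..7] from a profile name. Unset or unknown names keep the
 * "conservative" table (the unknown-name fallback is an assumption). */
static void tiny_route_fill(const char *profile, TinyRoutePolicy route[8])
{
    for (int c = 0; c < 8; c++) route[c] = TINY_FIRST;    /* conservative */
    if (profile == NULL) return;

    if (strcmp(profile, "hot") == 0) {
        for (int c = 0; c <= 3; c++) route[c] = TINY_ONLY; /* C0-C3 */
        route[7] = POOL_ONLY;                              /* C7; C4-C6 stay TINY_FIRST */
    } else if (strcmp(profile, "off") == 0) {
        for (int c = 0; c < 8; c++) route[c] = POOL_ONLY;
    } else if (strcmp(profile, "full") == 0) {
        for (int c = 0; c < 8; c++) route[c] = TINY_ONLY;
    }
}
```

A gate function would then index the table by class on each allocation, so the profile string is parsed once rather than on the hot path.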
10. Superslab Tiering & Registry Control
HAKMEM_SS_TIER_DOWN_THRESHOLD
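- Default: 0.25
- Range: 0.0–1.0
- Purpose: Lower bound: when a SuperSlab's utilization falls to or below this value, its tier transitions HOT → DRAINING.
- Impact:
  - SuperSlabs in the DRAINING tier are excluded from new allocations and are treated as drain/release candidates.
  - Avoids new allocations into low-utilization SuperSlabs and concentrates load on active SuperSlabs.
HAKMEM_SS_TIER_UP_THRESHOLD
- Default: 0.50
- Range: 0.0–1.0
- Purpose: Upper bound (hysteresis): when a DRAINING SuperSlab's utilization rises to or above this value, it returns DRAINING → HOT.
- Impact:
  - The gap between the down and up thresholds prevents a tier from oscillating frequently between HOT and DRAINING.
  - Only SuperSlabs that show a sustained increase in utilization are returned to HOT.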
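The two thresholds form a simple hysteresis loop, which the following sketch makes explicit. The type and function names (ss_tier, ss_tier_update) are hypothetical and only the HOT and DRAINING tiers named above are modeled.

```c
/* Hypothetical two-state tier model. */
typedef enum { SS_TIER_HOT, SS_TIER_DRAINING } ss_tier;

/* utilization is the used fraction of the SuperSlab (0.0 to 1.0).
 * HOT drops to DRAINING at or below down_threshold; DRAINING returns to
 * HOT at or above up_threshold; anything in between keeps the tier. */
static ss_tier ss_tier_update(ss_tier current, double utilization,
                              double down_threshold, double up_threshold)
{
    if (current == SS_TIER_HOT && utilization <= down_threshold)
        return SS_TIER_DRAINING;
    if (current == SS_TIER_DRAINING && utilization >= up_threshold)
        return SS_TIER_HOT;
    return current;
}
```

With the defaults 0.25 and 0.50, a SuperSlab hovering around 30-40% utilization keeps whatever tier it already has, which is the oscillation the gap is meant to prevent.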
11. Memory Efficiency & RSS Control
HAKMEM_TINY_RSS_BUDGET_KB
- Default: Unlimited
- Purpose: Total RSS budget for Tiny Pool (kB)
- Impact: When exceeded, shrinks MAG/SLL capacities
- INT interaction: Requires HAKMEM_INT_ENGINE=1
HAKMEM_TINY_INT_TIGHT
- Default: 0
- Purpose: Bias INT toward memory reduction
- Impact: Higher shrink thresholds, lower floor values
HAKMEM_TINY_DIET_STEP
- Default: 16
- Purpose: Capacity reduction step when over budget
- Impact: MAG -= step, SLL -= step×2
HAKMEM_TINY_CAP_FLOOR_C{0..7}
- Default: None (no floor)
- Purpose: Minimum MAG capacity per class
- Example: HAKMEM_TINY_CAP_FLOOR_C0=64 (8B class min)
- Impact: Prevents INT from shrinking below floor
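How the diet step and the per-class floor interact can be sketched as below. The struct tiny_caps and the function tiny_diet_step() are hypothetical. Stopping the SLL capacity at 0 is an assumption, since the text documents a floor for MAG only.

```c
/* Hypothetical capacity pair for one Tiny class. */
typedef struct {
    unsigned mag_cap;
    unsigned sll_cap;
} tiny_caps;

/* Apply one over-budget step: MAG -= step, SLL -= step * 2.
 * mag_floor is the per-class minimum for MAG (0 means no floor). */
static tiny_caps tiny_diet_step(tiny_caps caps, unsigned step, unsigned mag_floor)
{
    caps.mag_cap = (caps.mag_cap > step) ? caps.mag_cap - step : 0;
    if (caps.mag_cap < mag_floor) caps.mag_cap = mag_floor;

    unsigned sll_step = step * 2;
    caps.sll_cap = (caps.sll_cap > sll_step) ? caps.sll_cap - sll_step : 0;
    return caps;
}
```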
HAKMEM_TINY_MEM_DIET
- Default: 0
- Purpose: Enable memory diet mode (aggressive trimming)
- Impact: Reduces memory footprint at cost of performance
HAKMEM_TINY_SPILL_HYST
- Default: 0
- Purpose: Magazine spill hysteresis (avoid thrashing)
- Impact: Keep N extra items before spilling
11. Policy & Learning Parameters
HAKMEM_LEARN
- Default: 0 (OFF, unless HAKMEM_MODE=learning/research)
- Purpose: Legacy global learning toggle (CAP/WMAX Learner thread)
- Impact:
  - When HAKMEM_LEARN is set explicitly:
    - 0 → Learner disabled
    - != 0 → Learner enabled
  - When unset:
    - Learner is enabled only when HAKMEM_MODE=learning/research
    - Learner is disabled in all other modes (balanced/fast/minimal)
  - Implementation: core/box/learner_env_box.h (ENV Box for the learning layer)
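The resolution order between HAKMEM_LEARN and HAKMEM_MODE can be sketched as a small decision function. The name learner_enabled() is hypothetical and is not the API of learner_env_box.h. Treating an empty string as unset is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* learn_value and mode_value stand in for getenv("HAKMEM_LEARN") and
 * getenv("HAKMEM_MODE"); passing them in keeps the sketch easy to test. */
static int learner_enabled(const char *learn_value, const char *mode_value)
{
    /* An explicit HAKMEM_LEARN always wins: 0 disables, non-zero enables. */
    if (learn_value != NULL && *learn_value != '\0')
        return atoi(learn_value) != 0;

    /* Otherwise only the learning and research modes enable the Learner. */
    if (mode_value == NULL) return 0;
    return strcmp(mode_value, "learning") == 0 ||
           strcmp(mode_value, "research") == 0;
}
```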
HAKMEM_WMAX_MID
- Default: 256KB
- Purpose: Mid-size allocation working set max
- Impact: Pool cache size for mid-tier
HAKMEM_WMAX_LARGE
- Default: 2MB
- Purpose: Large allocation working set max
- Impact: Pool cache size for large-tier
HAKMEM_CAP_MID
- Default: Unlimited
- Purpose: Mid-tier pool capacity cap
- Impact: Maximum mid-tier pool size
HAKMEM_CAP_LARGE
- Default: Unlimited
- Purpose: Large-tier pool capacity cap
- Impact: Maximum large-tier pool size
HAKMEM_WMAX_LEARN
- Default: 0
- Purpose: Enable working set max learning
- Impact: Adaptively tune WMAX based on hit rate
HAKMEM_WMAX_CANDIDATES_MID
- Default: "128,256,512,1024"
- Purpose: Candidate WMAX values for mid-tier learning
- Format: Comma-separated KB values
HAKMEM_WMAX_CANDIDATES_LARGE
- Default: "1024,2048,4096,8192"
- Purpose: Candidate WMAX values for large-tier learning
- Format: Comma-separated KB values
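A comma-separated candidate list such as the defaults above could be parsed as follows. This is an illustrative sketch; parse_kb_list() is a hypothetical name, and stopping at the first malformed token is an assumption about error handling.

```c
#include <stdlib.h>

/* Hypothetical parser for lists such as "128,256,512,1024" (values in KB).
 * Stores up to max_out values in out and returns how many were parsed.
 * Parsing stops at the first token that is not a positive number. */
static size_t parse_kb_list(const char *text, long *out, size_t max_out)
{
    size_t n = 0;
    if (text == NULL) return 0;
    while (*text != '\0' && n < max_out) {
        char *end = NULL;
        long v = strtol(text, &end, 10);
        if (end == text || v <= 0) break;   /* no digits or invalid value */
        out[n++] = v;
        if (*end != ',') break;             /* end of list or stray character */
        text = end + 1;
    }
    return n;
}
```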
HAKMEM_WMAX_ADOPT_PCT
- Default: 0.01 (1%)
- Purpose: Adoption threshold for WMAX candidates
- Impact: How much better to switch candidates
HAKMEM_TARGET_HIT_MID
- Default: 0.65 (65%)
- Purpose: Target hit rate for mid-tier
- Impact: Learning objective
HAKMEM_TARGET_HIT_LARGE
- Default: 0.55 (55%)
- Purpose: Target hit rate for large-tier
- Impact: Learning objective
HAKMEM_GAIN_W_MISS
- Default: 1.0
- Purpose: Learning gain weight for misses
- Impact: How much to penalize misses
11. THP (Transparent Huge Pages)
HAKMEM_THP
- Default: "auto"
- Purpose: THP policy (off/auto/on)
- Values:
- "off" = MADV_NOHUGEPAGE for all
- "auto" = ≥2MB → MADV_HUGEPAGE
- "on" = MADV_HUGEPAGE for all ≥1MB
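The three policies reduce to a size check per mapping, sketched below. The enum and the function thp_advice_for() are hypothetical. Returning "no advice" for mappings below the size cutoff and for unknown policy names is an assumption, since the list above does not say what happens in those cases.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: which madvise() hint, if any, to apply. */
typedef enum {
    THP_ADVICE_NONE,
    THP_ADVICE_HUGEPAGE,     /* would map to MADV_HUGEPAGE */
    THP_ADVICE_NOHUGEPAGE    /* would map to MADV_NOHUGEPAGE */
} thp_advice;

static thp_advice thp_advice_for(const char *policy, size_t size)
{
    const size_t MB = (size_t)1 << 20;

    /* Unset behaves like the documented default "auto". */
    if (policy == NULL || strcmp(policy, "auto") == 0)
        return (size >= 2 * MB) ? THP_ADVICE_HUGEPAGE : THP_ADVICE_NONE;
    if (strcmp(policy, "off") == 0)
        return THP_ADVICE_NOHUGEPAGE;
    if (strcmp(policy, "on") == 0)
        return (size >= 1 * MB) ? THP_ADVICE_HUGEPAGE : THP_ADVICE_NONE;
    return THP_ADVICE_NONE;   /* unknown policy name: assumption */
}
```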
HAKMEM_THP_LEARN
- Default: 0
- Purpose: Enable THP policy learning
- Impact: Adaptively choose THP policy
HAKMEM_THP_CANDIDATES
- Default: "off,auto,on"
- Purpose: THP candidate policies for learning
- Format: Comma-separated
HAKMEM_THP_ADOPT_PCT
- Default: 0.015 (1.5%)
- Purpose: Adoption threshold for THP switch
- Impact: How much better to switch
12. L2/L25 Pool Configuration
HAKMEM_WRAP_L2
- Default: 0
- Purpose: Enable L2 pool wrapper bypass
- Impact: Allow L2 during wrapper calls
HAKMEM_WRAP_L25
- Default: 0
- Purpose: Enable L25 pool wrapper bypass
- Impact: Allow L25 during wrapper calls
HAKMEM_POOL_TLS_FREE
- Default: 1
- Purpose: Enable TLS-local free for L2 pool
- Impact: Lock-free fast path
HAKMEM_POOL_TLS_RING
- Default: 1
- Purpose: Enable TLS ring buffer for pool
- Impact: Batched cross-thread returns
HAKMEM_POOL_MIN_BUNDLE
- Default: 4
- Purpose: Minimum bundle size for L2 pool
- Impact: Batch refill size
HAKMEM_L25_MIN_BUNDLE
- Default: 4
- Purpose: Minimum bundle size for L25 pool
- Impact: Batch refill size
HAKMEM_L25_DZ
- Default: "64,256"
- Purpose: L25 size zones (comma-separated)
- Format: "size1,size2,..."
HAKMEM_L25_RUN_BLOCKS
- Default: 16
- Purpose: Run blocks per L25 slab
- Impact: Slab structure
HAKMEM_L25_RUN_FACTOR
- Default: 2
- Purpose: Run factor multiplier
- Impact: Slab allocation strategy
13. Debugging & Observability
HAKMEM_VERBOSE
- Default: 0
- Purpose: Enable verbose logging
- Impact: Detailed allocation logs
HAKMEM_QUIET
- Default: 0
- Purpose: Suppress all logging
- Impact: Overrides HAKMEM_VERBOSE
HAKMEM_TIMING
- Default: 0
- Purpose: Enable timing measurements
- Impact: Track allocation latency
HAKMEM_HIST_SAMPLE
- Default: 0
- Purpose: Size histogram sampling rate
- Impact: Track size distribution
HAKMEM_PROF
- Default: 0
- Purpose: Enable profiling mode
- Impact: Detailed performance tracking
HAKMEM_LOG_FILE
- Default: stderr
- Purpose: Redirect logs to file
- Impact: File path for logging output
14. Mode Presets
HAKMEM_MODE
- Default: "balanced"
- Purpose: High-level configuration preset
- Values:
- "minimal" = malloc/mmap only
- "fast" = pool fast-path + frozen learning
- "balanced" = BigCache + ELO + Batch (default)
- "learning" = ELO LEARN + adaptive
- "research" = all features + verbose
HAKMEM_PRESET
- Default: None
- Purpose: Evolution preset (from PRESETS.md)
- Impact: Load predefined parameter set
HAKMEM_FREE_POLICY
- Default: "batch"
- Purpose: Free path policy
- Values: "batch", "keep", "adaptive"
15. Build-Time Flags (Not Environment Variables)
HAKMEM_ENABLE_STATS
- Type: Compiler flag (-DHAKMEM_ENABLE_STATS)
- Default: NOT DEFINED
- Impact: Completely disables statistics when absent
- Critical: Must be set to collect any statistics
HAKMEM_BUILD_RELEASE
- Type: Compiler flag
- Default: NOT DEFINED (= 0)
- Impact: When undefined, enables debug paths
- Check: #if !HAKMEM_BUILD_RELEASE evaluates to true when the flag is not set
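The reason the check is true by default is a standard C preprocessor rule: inside #if, an identifier that is not defined as a macro is replaced by 0. A minimal sketch follows; WRAPPER_GUARD_COMPILED_IN and wrapper_guard_compiled_in() are hypothetical names used only to make the effect observable.

```c
/* In #if, an undefined identifier evaluates to 0, so !HAKMEM_BUILD_RELEASE
 * is true unless the build passes -DHAKMEM_BUILD_RELEASE=1. */
#if !HAKMEM_BUILD_RELEASE
#define WRAPPER_GUARD_COMPILED_IN 1   /* debug-only paths are built */
#else
#define WRAPPER_GUARD_COMPILED_IN 0   /* release build: paths removed */
#endif

/* Reports whether the debug-only guard code is part of this build. */
static int wrapper_guard_compiled_in(void)
{
    return WRAPPER_GUARD_COMPILED_IN;
}
```

Compiled with no extra flags this reports 1; compiled with -DHAKMEM_BUILD_RELEASE=1 it reports 0.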
HAKMEM_BUILD_DEBUG
- Type: Compiler flag
- Default: NOT DEFINED (= 0)
- Impact: Enables debug counters and logging
HAKMEM_DEBUG_COUNTERS
- Type: Compiler flag
- Default: 0
- Impact: Include path debug counters in build
HAKMEM_TINY_MINIMAL_FRONT
- Type: Compiler flag
- Default: 0
- Impact: Strip optional front-end layers (bench only)
HAKMEM_TINY_BENCH_FASTPATH
- Type: Compiler flag
- Default: 0
- Impact: Enable benchmark-optimized fast path
HAKMEM_TINY_BENCH_SLL_ONLY
- Type: Compiler flag
- Default: 0
- Impact: SLL-only mode (no magazines)
HAKMEM_USDT
- Type: Compiler flag
- Default: 0
- Impact: Enable USDT tracepoints for perf
- Requires: <sys/sdt.h> (systemtap-sdt-dev)
NULL Return Path Analysis
Why hak_tiny_alloc() Returns NULL
The Tiny Pool allocator returns NULL in these cases:
- Size > 1KB (line 97)
    if (class_idx < 0) return NULL; // >1KB
- Wrapper Guard Active (lines 88-91, only when !HAKMEM_BUILD_RELEASE)
    #if !HAKMEM_BUILD_RELEASE
    if (!g_wrap_tiny_enabled && g_tls_in_wrapper != 0) return NULL;
    #endif
  Note: HAKMEM_BUILD_RELEASE is NOT defined by default! This guard is ACTIVE in your build and returns NULL during malloc recursion.
- Wrapper Context Empty (line 73)
    return NULL; // empty → fallback to next allocator tier
  Called from hak_tiny_alloc_wrapper() when magazine is empty.
- Slow Path Exhaustion
  When all of these fail in hak_tiny_alloc_slow():
  - HotMag refill fails
  - TLS list empty
  - TLS slab refill fails
  - hak_tiny_alloc_superslab() returns NULL
When Tiny Pool is Bypassed
Given HAKMEM_WRAP_TINY=1 (default), Tiny Pool is still bypassed when:
- During wrapper recursion (if HAKMEM_BUILD_RELEASE not set)
  - malloc() calls getenv()
  - getenv() calls malloc()
  - Guard returns NULL → falls back to L2/L25
- Size > 1KB
  - Always falls through to L2 pool (1KB-32KB)
- All caches empty + SuperSlab allocation fails
  - Magazine empty
  - SLL empty
  - Active slabs full
  - SuperSlab cannot allocate new slab
  - Falls back to L2/L25
Memory Issue Diagnosis: 9GB Usage
Current Symptoms
- bench_fragment_stress_long_hakmem: 9GB RSS
- System allocator: 1.6MB RSS
- Tiny Pool stats: alloc=0, free=0, slab=0 (ZERO activity)
Root Cause Analysis
Hypothesis #1: Statistics Disabled (CONFIRMED)
Probability: 100%
Evidence:
- HAKMEM_ENABLE_STATS not defined in Makefile
- All stats show 0 (no data collection)
- Code in hakmem_tiny_stats.h:243-275 shows no-op when disabled
Impact:
- Cannot see if Tiny Pool is being used
- Cannot diagnose allocation patterns
- Blind to memory leaks
Fix:
make clean
make CFLAGS="-DHAKMEM_ENABLE_STATS" bench_fragment_stress_hakmem
Hypothesis #2: Wrapper Guard Blocking Tiny Pool
Probability: 90%
Evidence:
- HAKMEM_BUILD_RELEASE not defined → guard is ACTIVE
- Wrapper guard code at hakmem_tiny_alloc.inc:86-92
- During benchmark, many allocations may trigger wrapper context
Mechanism:
#if !HAKMEM_BUILD_RELEASE // This is TRUE (not defined)
if (!g_wrap_tiny_enabled && g_tls_in_wrapper != 0)
return NULL; // Bypass Tiny Pool!
#endif
Result:
- Tiny Pool returns NULL
- Falls back to L2/L25 pools
- L2/L25 may be leaking or over-allocating
Fix:
make CFLAGS="-DHAKMEM_BUILD_RELEASE=1"
Hypothesis #3: L2/L25 Pool Leak or Over-Retention
Probability: 75%
Evidence:
- If Tiny Pool is bypassed → L2/L25 handles ≤1KB allocations
- L2/L25 may have less aggressive trimming
- Fragment stress workload may trigger worst-case pooling
Verification:
- Enable L2/L25 statistics
- Check pool sizes: g_pool_* counters
- Look for unbounded pool growth
Fix: Tune L2/L25 parameters:
export HAKMEM_POOL_TLS_FREE=1
export HAKMEM_CAP_MID=256 # Cap mid-tier pool at 256 blocks
Recommended Diagnostic Steps
Step 1: Enable Statistics
make clean
make CFLAGS="-DHAKMEM_ENABLE_STATS -DHAKMEM_BUILD_RELEASE=1" bench_fragment_stress_hakmem
Step 2: Run with Diagnostics
export HAKMEM_WRAP_TINY=1
export HAKMEM_VERBOSE=1
./bench_fragment_stress_hakmem
Step 3: Check Statistics
# In benchmark output, look for:
# - Tiny Pool stats (should be non-zero now)
# - L2/L25 pool stats
# - Cache hit rates
# - RSS growth pattern
Step 4: Profile Memory
# Option A: Valgrind massif
valgrind --tool=massif --massif-out-file=massif.out ./bench_fragment_stress_hakmem
ms_print massif.out
# Option B: HAKMEM internal profiling
export HAKMEM_PROF=1
export HAKMEM_PROF_SAMPLE=100
./bench_fragment_stress_hakmem
Step 5: Compare Allocator Tiers
# Force Tiny-only (disable L2/L25 fallback)
export HAKMEM_TINY_USE_SUPERSLAB=1
export HAKMEM_CAP_MID=0 # Disable mid-tier
export HAKMEM_CAP_LARGE=0 # Disable large-tier
./bench_fragment_stress_hakmem
# Check if RSS improves → L2/L25 is the problem
Quick Reference: Must-Set Variables for Debugging
# Enable everything for debugging
export HAKMEM_WRAP_TINY=1 # Use Tiny Pool
export HAKMEM_VERBOSE=1 # See what's happening
export HAKMEM_ACE_DEBUG=1 # ACE diagnostics
export HAKMEM_TINY_PATH_DEBUG=1 # Path counters (if built with HAKMEM_DEBUG_COUNTERS)
# Build with statistics
make clean
make CFLAGS="-DHAKMEM_ENABLE_STATS -DHAKMEM_BUILD_RELEASE=1 -DHAKMEM_DEBUG_COUNTERS=1"
Summary: Critical Variables for Your Issue
| Variable | Current | Should Be | Impact |
|---|---|---|---|
| HAKMEM_ENABLE_STATS | undefined | -DHAKMEM_ENABLE_STATS | Enable statistics collection |
| HAKMEM_BUILD_RELEASE | undefined (=0) | -DHAKMEM_BUILD_RELEASE=1 | Disable wrapper guard |
| HAKMEM_WRAP_TINY | 1 ✓ | 1 | Already correct |
| HAKMEM_VERBOSE | 0 | 1 | See allocation logs |
Action: Rebuild with both flags, then re-run benchmark to see real statistics.