hakmem/docs/specs/ENV_VARS_COMPLETE.md
Moe Charm (CI) cba6f785a1 Add SuperSlab Prefault Box with 4MB MAP_POPULATE bug fix
New Feature: ss_prefault_box.h
- Box for controlling SuperSlab page prefaulting policy
- ENV: HAKMEM_SS_PREFAULT (0=OFF, 1=POPULATE, 2=TOUCH)
- Default: OFF (safe mode until further optimization)

Bug Fix: 4MB MAP_POPULATE regression
- Problem: Fallback path allocated 4MB (2x size for alignment) with MAP_POPULATE
  causing 52x slower mmap (0.585ms → 30.6ms) and 35% throughput regression
- Solution: Remove MAP_POPULATE from 4MB allocation, apply madvise(MADV_WILLNEED)
  only to the aligned 2MB region after trimming prefix/suffix

Changes:
- core/box/ss_prefault_box.h: New prefault policy box (header-only)
- core/box/ss_allocation_box.c: Integrate prefault box, call ss_prefault_region()
- core/superslab_cache.c: Fix fallback path - no MAP_POPULATE on 4MB,
  always munmap prefix/suffix, use MADV_WILLNEED for 2MB only
- docs/specs/ENV_VARS*.md: Document HAKMEM_SS_PREFAULT

Performance:
- bench_random_mixed: 4.32M ops/s (regression fixed, slight improvement)
- bench_tiny_hot: 157M ops/s with prefault=1 (no crash)

Box Theory:
- OS layer (ss_os_acquire): "how to mmap"
- Prefault Box: "when to page-in"
- Allocation Box: "when to call prefault"

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 20:11:24 +09:00


HAKMEM Environment Variables Complete Reference

Total Variables: 83 environment variables + multiple compile-time flags
Last Updated: 2025-11-01
Purpose: Complete reference for diagnosing memory issues and configuration


CRITICAL DISCOVERY: Statistics Disabled by Default

The Problem

Tiny Pool statistics are DISABLED unless you build with -DHAKMEM_ENABLE_STATS:

  • Current behavior: alloc=0, free=0, slab=0 (statistics not collected)
  • Impact: Memory diagnostics are blind
  • Root cause: Build-time flag NOT set in Makefile

How to Enable Statistics

Option 1: Build with statistics (RECOMMENDED for debugging)

make clean
make CFLAGS="-DHAKMEM_ENABLE_STATS" bench_fragment_stress_hakmem

Option 2: Edit Makefile (add to line 18)

CFLAGS = -O3 ... -DHAKMEM_ENABLE_STATS ...

Why Statistics are Disabled

From /mnt/workdisk/public_share/hakmem/core/hakmem_tiny_stats.h:

// Purpose: Zero-overhead production builds by disabling stats collection
// Usage:   Build with -DHAKMEM_ENABLE_STATS to enable (default: disabled)
// Impact:  3-5% speedup when disabled (removes 0.5ns TLS increment)
//
// Default: DISABLED (production performance)
// Enable:  make CFLAGS=-DHAKMEM_ENABLE_STATS

When DISABLED: All stats_record_alloc() and stats_record_free() calls become no-ops.
When ENABLED: Batched TLS counters track exact allocation/free counts.
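The compile-time switch described above can be sketched as follows. This is a minimal illustration, not the contents of hakmem_tiny_stats.h: every `sketch_` name is hypothetical, and the counter here is a plain per-thread increment rather than the batched scheme the real header uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: all sketch_ names are illustrative, not HAKMEM's. */
#define SKETCH_NUM_CLASSES 8

#ifdef HAKMEM_ENABLE_STATS
#define SKETCH_STATS_ENABLED 1
static _Thread_local uint64_t sketch_tls_alloc[SKETCH_NUM_CLASSES];

/* Enabled build: one thread-local increment per allocation. */
static inline void sketch_stats_record_alloc(int class_idx) {
    assert(class_idx >= 0 && class_idx < SKETCH_NUM_CLASSES);
    sketch_tls_alloc[class_idx]++;
}
static inline uint64_t sketch_stats_alloc_count(int class_idx) {
    assert(class_idx >= 0 && class_idx < SKETCH_NUM_CLASSES);
    return sketch_tls_alloc[class_idx];
}
#else
#define SKETCH_STATS_ENABLED 0
/* Disabled build: the call compiles to nothing and every counter reads 0,
 * which is why diagnostics show alloc=0, free=0. */
static inline void sketch_stats_record_alloc(int class_idx) { (void)class_idx; }
static inline uint64_t sketch_stats_alloc_count(int class_idx) {
    (void)class_idx;
    return 0;
}
#endif
```

Compiling this fragment with and without -DHAKMEM_ENABLE_STATS shows the same call site either counting or vanishing.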


Environment Variable Categories

1. Tiny Pool Core (Critical)

HAKMEM_WRAP_TINY

  • Default: 1 (enabled)
  • Purpose: Enable Tiny Pool fast-path (bypasses wrapper guard)
  • Impact: Controls whether malloc/free use Tiny Pool for ≤1KB allocations
  • Usage: export HAKMEM_WRAP_TINY=1 (already default since Phase 7.4)
  • Location: /mnt/workdisk/public_share/hakmem/core/hakmem_tiny_init.inc:25
  • Notes: Without this, Tiny Pool returns NULL and falls back to L2/L25

HAKMEM_WRAP_TINY_REFILL

  • Default: 0 (disabled)
  • Purpose: Allow trylock-based magazine refill during wrapper calls
  • Impact: Enables limited refill under trylock (no blocking)
  • Usage: export HAKMEM_WRAP_TINY_REFILL=1
  • Safety: OFF by default (avoids deadlock risk in recursive malloc)

HAKMEM_TINY_USE_SUPERSLAB

  • Default: 1 (enabled)
  • Purpose: Enable SuperSlab allocator for Tiny Pool slabs
  • Impact: When OFF, Tiny Pool cannot allocate new slabs
  • Critical: Must be ON for Tiny Pool to work

2. Tiny Pool TLS Caching (Performance Critical)

HAKMEM_TINY_MAG_CAP

  • Default: Per-class (typically 512-2048)
  • Purpose: Global TLS magazine capacity override
  • Impact: Larger = fewer refills, more memory
  • Usage: export HAKMEM_TINY_MAG_CAP=1024

HAKMEM_TINY_MAG_CAP_C{0..7}

  • Default: None (uses class defaults)
  • Purpose: Per-class magazine capacity override
  • Example: HAKMEM_TINY_MAG_CAP_C3=512 (64B class)
  • Classes: C0=8B, C1=16B, C2=32B, C3=64B, C4=128B, C5=256B, C6=512B, C7=1KB
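To make the class table concrete, here is a small sketch of a size-to-class mapping that follows the listed block sizes. The function names are hypothetical, and treating a zero-byte request as class 0 is an assumption; HAKMEM's actual lookup may be implemented differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: block size of Tiny class idx (C0=8B ... C7=1KB). */
static size_t sketch_class_block_size(int idx) {
    assert(idx >= 0 && idx < 8);
    return (size_t)8 << idx;
}

/* Hypothetical helper: smallest class whose block fits `size`.
 * Returns -1 for sizes above 1KB, which are not Tiny sizes.
 * Assumption: a 0-byte request is served from class 0. */
static int sketch_size_to_class(size_t size) {
    for (int idx = 0; idx < 8; idx++) {
        if (size <= sketch_class_block_size(idx)) return idx;
    }
    return -1;
}
```

With this mapping, HAKMEM_TINY_MAG_CAP_C3 would apply to requests of 33 to 64 bytes.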

HAKMEM_TINY_TLS_SLL

  • Default: 1 (enabled)
  • Purpose: Enable TLS Single-Linked-List cache layer
  • Impact: Fast-path cache before magazine
  • Performance: Critical for tiny allocations (8-64B)

HAKMEM_SLL_MULTIPLIER

  • Default: 2
  • Purpose: SLL capacity = MAG_CAP × multiplier for small classes (0-3)
  • Range: 1..16
  • Impact: Higher = more TLS memory, fewer refills
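The capacity rule above can be written out as a short sketch. The function name is hypothetical, clamping out-of-range multipliers into 1..16 is an assumption, and so is leaving classes 4-7 at the magazine capacity, since the text only specifies the multiplier for classes 0-3.

```c
#include <assert.h>

/* Hypothetical sketch of: SLL capacity = MAG_CAP x multiplier (classes 0-3). */
static int sketch_sll_capacity(int mag_cap, int multiplier, int class_idx) {
    assert(mag_cap >= 0 && class_idx >= 0 && class_idx < 8);
    /* Assumption: values outside the documented 1..16 range are clamped. */
    if (multiplier < 1) multiplier = 1;
    if (multiplier > 16) multiplier = 16;
    if (class_idx <= 3) return mag_cap * multiplier;
    /* Assumption: larger classes are not scaled. */
    return mag_cap;
}
```

For example, a magazine capacity of 512 with the default multiplier of 2 gives an SLL capacity of 1024 for the 8B to 64B classes.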

HAKMEM_TINY_REFILL_MAX

  • Default: 64
  • Purpose: Magazine refill batch size (normal classes)
  • Impact: Larger = fewer refills, more memory spike

HAKMEM_TINY_REFILL_MAX_HOT

  • Default: 192
  • Purpose: Magazine refill batch size for hot classes (≤64B)
  • Impact: Larger batches for frequently used sizes

HAKMEM_TINY_REFILL_MAX_C{0..7}

  • Default: None
  • Purpose: Per-class refill batch override
  • Example: HAKMEM_TINY_REFILL_MAX_C2=96 (32B class)
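One plausible way to resolve the per-class override against the global value is sketched below. The helper names are hypothetical, and the lookup order (per-class variable, then HAKMEM_TINY_REFILL_MAX, then the documented default of 64) is an assumption about how the override is layered, not a copy of HAKMEM's parser.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse a positive decimal integer, or return
 * `fallback` when the string is NULL, empty, non-numeric, or out of range. */
static int sketch_parse_or(const char *s, int fallback) {
    if (!s || !*s) return fallback;
    char *end = NULL;
    long v = strtol(s, &end, 10);
    if (*end != '\0' || v <= 0 || v > 1000000) return fallback;
    return (int)v;
}

/* Hypothetical resolver. Assumed order: per-class ENV > global ENV > 64. */
static int sketch_refill_max(int class_idx) {
    assert(class_idx >= 0 && class_idx < 8);
    char name[64];
    snprintf(name, sizeof name, "HAKMEM_TINY_REFILL_MAX_C%d", class_idx);
    int global = sketch_parse_or(getenv("HAKMEM_TINY_REFILL_MAX"), 64);
    return sketch_parse_or(getenv(name), global);
}
```

Under these assumptions, exporting HAKMEM_TINY_REFILL_MAX_C2=96 changes only the 32B class, while the other classes keep the global value.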

HAKMEM_TINY_REFILL_MAX_HOT_C{0..7}

  • Default: None
  • Purpose: Per-class hot refill override (classes 0-3)
  • Priority: Overrides HAKMEM_TINY_REFILL_MAX_HOT

3. SuperSlab Configuration

HAKMEM_TINY_SS_MAX_MB

  • Default: Unlimited
  • Purpose: Maximum SuperSlab memory per class (MB)
  • Impact: Caps total slab allocation
  • Usage: export HAKMEM_TINY_SS_MAX_MB=512

HAKMEM_TINY_SS_MIN_MB

  • Default: 0
  • Purpose: Minimum SuperSlab reservation per class (MB)
  • Impact: Pre-allocates memory at startup

HAKMEM_TINY_SS_RESERVE

  • Default: 0
  • Purpose: Reserve SuperSlab memory at init
  • Impact: Prevents initial allocation delays

HAKMEM_TINY_TRIM_SS

  • Default: 0
  • Purpose: Enable SuperSlab trimming/deallocation
  • Impact: Returns memory to OS when idle

HAKMEM_TINY_SS_PARTIAL

  • Default: 0
  • Purpose: Enable partial slab reclamation
  • Impact: Free partially-used slabs

HAKMEM_TINY_SS_PARTIAL_INTERVAL

  • Default: 1000000 (1M allocations)
  • Purpose: Interval between partial slab checks
  • Impact: Lower = more aggressive trimming

HAKMEM_TINY_SS_CACHE

  • Default: 0 (disabled)
  • Purpose: Per-class SuperSlab cache capacity
  • Impact: Limits how many freed SuperSlabs are kept in LRU cache before munmap

HAKMEM_TINY_SS_CACHE_C{0..7}

  • Default: unset (inherits HAKMEM_TINY_SS_CACHE)
  • Purpose: Per-class overrides for cache capacity
  • Impact: Fine-grained control of cache size per Tiny class

HAKMEM_TINY_SS_PRECHARGE

  • Default: 0
  • Purpose: Precharge (pre-allocate) SuperSlabs into cache at startup/runtime
  • Impact: Reduces first-use page faults by having warm SuperSlabs ready

HAKMEM_TINY_SS_PRECHARGE_C{0..7}

  • Default: unset (inherits HAKMEM_TINY_SS_PRECHARGE)
  • Purpose: Per-class precharge targets
  • Impact: e.g., HAKMEM_TINY_SS_PRECHARGE_C0=4 precharges 4 SuperSlabs for class 0

HAKMEM_TINY_SS_POPULATE_ONCE

  • Default: 0
  • Purpose: Use MAP_POPULATE for the next SuperSlab allocation only
  • Impact: One-shot prefault for A/B testing; superseded by HAKMEM_SS_PREFAULT for always-on use

HAKMEM_SS_PREFAULT

  • Default: 0 (OFF, safety-first default)
  • Type: integer (0-3)
  • Purpose: Control SuperSlab prefault strategy to reduce kernel page fault overhead (enabled explicitly when tuning).
  • Values:
    • 0 = OFF: legacy behavior; only HAKMEM_TINY_SS_POPULATE_ONCE may trigger a one-shot MAP_POPULATE (the current safe default).
    • 1 = POPULATE: always pass populate=1 to ss_os_acquire() (use MAP_POPULATE for every new SuperSlab). Verify with perf before relying on it.
    • 2 = TOUCH: POPULATE + ss_prefault_region() touches each page once (4KB stride) after mmap (experimental).
    • 3 = ASYNC: reserved for a future background-prefault implementation (currently treated as TOUCH).
  • Implementation:
    • Policy Box: core/box/ss_prefault_box.h
    • Integration: core/box/ss_allocation_box.c calls ss_prefault_policy() to set populate and ss_prefault_region() immediately after ss_os_acquire().

4. Remote Free & Background Processing

HAKMEM_TINY_REMOTE_DRAIN_THRESHOLD

  • Default: 32
  • Purpose: Trigger remote free drain when count exceeds threshold
  • Impact: Controls when to process cross-thread frees
  • Per-class: ACE can tune this per-class

HAKMEM_TINY_REMOTE_DRAIN_TRYRATE

  • Default: 16
  • Purpose: Probability (1/N) of attempting trylock drain
  • Impact: Lower = more aggressive draining

5. Statistics & Profiling

HAKMEM_ENABLE_STATS (BUILD-TIME)

  • Default: UNDEFINED (statistics DISABLED)
  • Purpose: Enable batched TLS statistics collection
  • Build: make CFLAGS=-DHAKMEM_ENABLE_STATS
  • Impact: 0.5ns overhead per alloc/free when enabled
  • Critical: Must be defined to see any statistics

HAKMEM_TINY_STAT_RATE_LG

  • Default: 0 (no sampling)
  • Purpose: Sample statistics at 1/2^N rate
  • Example: HAKMEM_TINY_STAT_RATE_LG=4 → sample 1/16 allocs
  • Requires: HAKMEM_ENABLE_STATS + HAKMEM_TINY_STAT_SAMPLING build flags
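A 1/2^N sampling gate of the kind described above is commonly built from a counter and a bit mask. The sketch below shows that technique under the assumption of a per-thread tick counter; the names are hypothetical and HAKMEM's sampling code may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread tick counter. */
static _Thread_local uint32_t sketch_sample_tick;

/* Returns 1 on every 2^rate_lg-th call in this thread, 0 otherwise.
 * rate_lg = 0 samples every call; rate_lg = 4 samples 1 call in 16. */
static int sketch_should_sample(unsigned rate_lg) {
    assert(rate_lg < 32);
    uint32_t mask = ((uint32_t)1 << rate_lg) - 1u;
    return (sketch_sample_tick++ & mask) == 0;
}
```

The mask test avoids a division on the hot path, which is the usual reason for expressing the rate as a power of two.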

HAKMEM_TINY_COUNT_SAMPLE

  • Default: 8
  • Purpose: Legacy sampling exponent (deprecated)
  • Note: Replaced by batched stats in Phase 3

HAKMEM_TINY_PATH_DEBUG

  • Default: 0
  • Purpose: Enable allocation path debugging counters
  • Requires: HAKMEM_DEBUG_COUNTERS=1 build flag
  • Output: atexit() dump of path hit counts

6. ACE Learning System (Adaptive Control Engine)

HAKMEM_ACE_ENABLED

  • Default: 0
  • Purpose: Enable ACE learning system
  • Impact: Adaptive tuning of Tiny Pool parameters
  • Note: Already integrated but can be disabled

HAKMEM_ACE_OBSERVE

  • Default: 0
  • Purpose: Enable ACE observation logging
  • Impact: Verbose output of ACE decisions

HAKMEM_ACE_DEBUG

  • Default: 0
  • Purpose: Enable ACE debug logging
  • Impact: Detailed ACE internal state

HAKMEM_ACE_SAMPLE

  • Default: Undefined (no sampling)
  • Purpose: Sample ACE events at given rate
  • Impact: Reduces ACE overhead

HAKMEM_ACE_LOG_LEVEL

  • Default: 0
  • Purpose: ACE logging verbosity (0-3)
  • Levels: 0=off, 1=errors, 2=info, 3=debug

HAKMEM_ACE_FAST_INTERVAL_MS

  • Default: 100ms
  • Purpose: Fast ACE update interval
  • Impact: How often ACE checks metrics

HAKMEM_ACE_SLOW_INTERVAL_MS

  • Default: 1000ms
  • Purpose: Slow ACE update interval
  • Impact: Background tuning frequency

7. Intelligence Engine (INT)

HAKMEM_INT_ENGINE

  • Default: 0
  • Purpose: Enable background intelligence/adaptation engine
  • Impact: Deferred event processing + adaptive tuning
  • Pairs with: HAKMEM_TINY_FRONTEND

HAKMEM_INT_ADAPT_REFILL

  • Default: 1 (when INT enabled)
  • Purpose: Adapt REFILL_MAX dynamically (±16)
  • Impact: Tunes refill sizes based on miss rate

HAKMEM_INT_ADAPT_CAPS

  • Default: 1 (when INT enabled)
  • Purpose: Adapt MAG/SLL capacities (±16/±32)
  • Impact: Grows hot classes, shrinks cold ones

HAKMEM_INT_EVENT_TS

  • Default: 0
  • Purpose: Include timestamps in INT events
  • Impact: Adds clock_gettime() overhead

HAKMEM_INT_SAMPLE

  • Default: Undefined (no sampling)
  • Purpose: Sample INT events at 1/2^N rate
  • Impact: Reduces INT overhead on hot path

8. Frontend & Experimental Features

HAKMEM_TINY_FRONTEND

  • Default: 0
  • Purpose: Enable mimalloc-style frontend cache
  • Impact: Adds FastCache layer before backend
  • Experimental: A/B testing only

HAKMEM_TINY_FASTCACHE

  • Default: 0
  • Purpose: Low-level FastCache toggle
  • Impact: Internal A/B switch

HAKMEM_TINY_QUICK

  • Default: 0
  • Purpose: Enable TinyQuickSlot (6-item single-cacheline stack)
  • Impact: Ultra-fast path for ≤64B
  • Experimental: Bench-only optimization

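HAKMEM_TINY_HOTMAG (removed)

  • 2025-12 cleanup: The HotMag runtime ENV toggle was removed. HotMag is fixed to OFF by default and cannot be tuned via ENV.

HAKMEM_TINY_HOTMAG_CAP (removed)

  • 2025-12 cleanup: The HotMag capacity ENV was removed (fixed value: 128).

HAKMEM_TINY_HOTMAG_REFILL (removed)

  • 2025-12 cleanup: The HotMag refill batch ENV was removed (fixed value: 32).

HAKMEM_TINY_HOTMAG_C{0..7} (removed)

  • 2025-12 cleanup: The per-class HotMag enable/disable ENVs were removed (fixed OFF for all classes).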

9. Tiny Front Routing

HAKMEM_TINY_PROFILE

  • Default: "conservative"
  • Type: string
  • Purpose: Control Tiny Front (TLS SLL / FastCache) vs Pool/backend routing per Tiny class via a simple profile.
  • Profiles:
    • "conservative":
      • All classes (C0-C7) use TINY_FIRST: try Tiny Front first, then fall back to Pool/backend on miss.
    • "hot":
      • C0-C3: TINY_ONLY (small classes use Tiny exclusively via front gate)
      • C4-C6: TINY_FIRST
      • C7: POOL_ONLY (1KB headerless class uses Pool/backend)
    • "off":
      • All classes POOL_ONLY (Tiny Front is fully disabled, Pool-only allocator behaviour).
    • "full":
      • All classes TINY_ONLY (microbench-style, front gate always routes via Tiny).
  • Implementation:
    • Box: core/box/tiny_route_box.h / tiny_route_box.c (per-class g_tiny_route[8] table).
    • Gate: tiny_alloc_gate_fast() reads TinyRoutePolicy and decides Tiny vs Pool on each allocation.
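The profile table can be expressed as a function that fills an 8-entry route array. This is a sketch under stated assumptions: the enum and function names are hypothetical stand-ins for TinyRoutePolicy and g_tiny_route[8], and mapping unset or unknown profile names to "conservative" is an assumption.

```c
#include <assert.h>
#include <string.h>

enum sketch_route { SKETCH_TINY_ONLY, SKETCH_TINY_FIRST, SKETCH_POOL_ONLY };

/* Hypothetical: fill a per-class route table from a HAKMEM_TINY_PROFILE value. */
static void sketch_fill_routes(const char *profile, enum sketch_route route[8]) {
    assert(route != NULL);
    if (!profile) profile = "conservative";   /* documented default */
    for (int c = 0; c < 8; c++) {
        if (strcmp(profile, "hot") == 0) {
            route[c] = (c <= 3) ? SKETCH_TINY_ONLY
                     : (c <= 6) ? SKETCH_TINY_FIRST
                                : SKETCH_POOL_ONLY;
        } else if (strcmp(profile, "off") == 0) {
            route[c] = SKETCH_POOL_ONLY;
        } else if (strcmp(profile, "full") == 0) {
            route[c] = SKETCH_TINY_ONLY;
        } else {
            /* "conservative", and (assumption) any unrecognized name. */
            route[c] = SKETCH_TINY_FIRST;
        }
    }
}
```

Building the table once at init keeps the per-allocation gate down to a single array lookup by class index.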

10. Superslab Tiering & Registry Control

HAKMEM_SS_TIER_DOWN_THRESHOLD

  • Default: 0.25
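  • Range: 0.0-1.0
  • Purpose: Lower bound. When a SuperSlab's utilization falls to or below this value, its Tier transitions HOT → DRAINING.
  • Impact:
    • SuperSlabs in the DRAINING Tier are excluded from new allocations and treated as drain/release candidates.
    • Avoids new allocations into low-utilization SuperSlabs and concentrates load on active SuperSlabs.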

HAKMEM_SS_TIER_UP_THRESHOLD

  • Default: 0.50
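  • Range: 0.0-1.0
  • Purpose: Upper bound (hysteresis). When the utilization of a DRAINING-Tier SuperSlab rises to or above this value, it transitions DRAINING → HOT.
  • Impact:
    • Keeping a gap between the Down/Up thresholds prevents the Tier from oscillating frequently between HOT and DRAINING.
    • Only SuperSlabs with an observed sustained increase in utilization are returned to HOT.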
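The two thresholds form a simple hysteresis state machine, sketched below. The names are hypothetical; the comparison directions follow the descriptions of HAKMEM_SS_TIER_DOWN_THRESHOLD and HAKMEM_SS_TIER_UP_THRESHOLD.

```c
#include <assert.h>

enum sketch_tier { SKETCH_TIER_HOT, SKETCH_TIER_DRAINING };

/* Hypothetical transition function: `util` is utilization in 0.0..1.0. */
static enum sketch_tier sketch_next_tier(enum sketch_tier cur, double util,
                                         double down, double up) {
    assert(down <= up);  /* a gap between the thresholds gives the hysteresis */
    if (cur == SKETCH_TIER_HOT && util <= down) return SKETCH_TIER_DRAINING;
    if (cur == SKETCH_TIER_DRAINING && util >= up) return SKETCH_TIER_HOT;
    return cur;          /* inside the gap: keep the current tier */
}
```

With the defaults (0.25 and 0.50), a SuperSlab whose utilization hovers around 0.3 stays in whichever tier it is already in instead of flipping on every sample.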

11. Memory Efficiency & RSS Control

HAKMEM_TINY_RSS_BUDGET_KB

  • Default: Unlimited
  • Purpose: Total RSS budget for Tiny Pool (kB)
  • Impact: When exceeded, shrinks MAG/SLL capacities
  • INT interaction: Requires HAKMEM_INT_ENGINE=1

HAKMEM_TINY_INT_TIGHT

  • Default: 0
  • Purpose: Bias INT toward memory reduction
  • Impact: Higher shrink thresholds, lower floor values

HAKMEM_TINY_DIET_STEP

  • Default: 16
  • Purpose: Capacity reduction step when over budget
  • Impact: MAG -= step, SLL -= step×2

HAKMEM_TINY_CAP_FLOOR_C{0..7}

  • Default: None (no floor)
  • Purpose: Minimum MAG capacity per class
  • Example: HAKMEM_TINY_CAP_FLOOR_C0=64 (8B class min)
  • Impact: Prevents INT from shrinking below floor
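The diet step and the per-class floor combine as in the following sketch. The struct and function names are hypothetical, and clamping the SLL capacity at zero is an assumption, since the text only documents a floor for the MAG capacity.

```c
#include <assert.h>

struct sketch_caps { int mag; int sll; };

/* Hypothetical: apply one over-budget reduction step.
 * MAG -= step, SLL -= step * 2, with MAG held at or above its floor. */
static struct sketch_caps sketch_diet_step(struct sketch_caps caps, int step,
                                           int mag_floor) {
    assert(step >= 0 && mag_floor >= 0);
    caps.mag -= step;
    caps.sll -= step * 2;
    if (caps.mag < mag_floor) caps.mag = mag_floor;  /* HAKMEM_TINY_CAP_FLOOR_C{n} */
    if (caps.sll < 0) caps.sll = 0;                  /* assumption: never negative */
    return caps;
}
```

For example, with HAKMEM_TINY_DIET_STEP=16 and a floor of 64, capacities of 128/256 shrink to 112/224 in one step.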

HAKMEM_TINY_MEM_DIET

  • Default: 0
  • Purpose: Enable memory diet mode (aggressive trimming)
  • Impact: Reduces memory footprint at cost of performance

HAKMEM_TINY_SPILL_HYST

  • Default: 0
  • Purpose: Magazine spill hysteresis (avoid thrashing)
  • Impact: Keep N extra items before spilling

12. Policy & Learning Parameters

HAKMEM_LEARN

  • Default: 0 (OFF, unless HAKMEM_MODE=learning/research)
  • Purpose: Legacy global learning toggle (CAP/WMAX Learner thread)
  • Impact:
    • When HAKMEM_LEARN is explicitly set:
      • 0 → Learner disabled
      • !=0 → Learner enabled
    • When unset:
      • Learner is enabled only when HAKMEM_MODE=learning or research
      • Learner is disabled in all other modes (balanced/fast/minimal)
    • Implementation: core/box/learner_env_box.h (ENV Box for the learning layer)
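The resolution rules above reduce to a small decision function. The sketch below is illustrative only and is not taken from learner_env_box.h: the function name is hypothetical, and treating an empty string as unset and a non-numeric value as 0 are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: decide whether the Learner runs, given the raw values of
 * HAKMEM_LEARN and HAKMEM_MODE (NULL when a variable is unset). */
static int sketch_learner_enabled(const char *learn, const char *mode) {
    /* An explicit HAKMEM_LEARN wins: 0 disables, any other number enables.
     * Assumption: atoi() semantics, so non-numeric text counts as 0. */
    if (learn && *learn) return atoi(learn) != 0;
    /* Unset: only the learning and research modes enable the Learner. */
    if (!mode) return 0;
    return strcmp(mode, "learning") == 0 || strcmp(mode, "research") == 0;
}
```

A caller would pass getenv("HAKMEM_LEARN") and getenv("HAKMEM_MODE") directly.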

HAKMEM_WMAX_MID

  • Default: 256KB
  • Purpose: Mid-size allocation working set max
  • Impact: Pool cache size for mid-tier

HAKMEM_WMAX_LARGE

  • Default: 2MB
  • Purpose: Large allocation working set max
  • Impact: Pool cache size for large-tier

HAKMEM_CAP_MID

  • Default: Unlimited
  • Purpose: Mid-tier pool capacity cap
  • Impact: Maximum mid-tier pool size

HAKMEM_CAP_LARGE

  • Default: Unlimited
  • Purpose: Large-tier pool capacity cap
  • Impact: Maximum large-tier pool size

HAKMEM_WMAX_LEARN

  • Default: 0
  • Purpose: Enable working set max learning
  • Impact: Adaptively tune WMAX based on hit rate

HAKMEM_WMAX_CANDIDATES_MID

  • Default: "128,256,512,1024"
  • Purpose: Candidate WMAX values for mid-tier learning
  • Format: Comma-separated KB values

HAKMEM_WMAX_CANDIDATES_LARGE

  • Default: "1024,2048,4096,8192"
  • Purpose: Candidate WMAX values for large-tier learning
  • Format: Comma-separated KB values

HAKMEM_WMAX_ADOPT_PCT

  • Default: 0.01 (1%)
  • Purpose: Adoption threshold for WMAX candidates
  • Impact: Minimum relative improvement a candidate must show before it replaces the current WMAX

HAKMEM_TARGET_HIT_MID

  • Default: 0.65 (65%)
  • Purpose: Target hit rate for mid-tier
  • Impact: Learning objective

HAKMEM_TARGET_HIT_LARGE

  • Default: 0.55 (55%)
  • Purpose: Target hit rate for large-tier
  • Impact: Learning objective

HAKMEM_GAIN_W_MISS

  • Default: 1.0
  • Purpose: Learning gain weight for misses
  • Impact: How much to penalize misses

13. THP (Transparent Huge Pages)

HAKMEM_THP

  • Default: "auto"
  • Purpose: THP policy (off/auto/on)
  • Values:
    • "off" = MADV_NOHUGEPAGE for all
    • "auto" = ≥2MB → MADV_HUGEPAGE
    • "on" = MADV_HUGEPAGE for all ≥1MB
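The three policy values map onto madvise hints roughly as sketched here. The function returns an enum instead of calling madvise() so that it stays portable; all names are hypothetical, and giving no hint below the size threshold, as well as treating unrecognized values as "auto", are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_thp_advice { SKETCH_THP_NONE, SKETCH_THP_HUGEPAGE, SKETCH_THP_NOHUGEPAGE };

/* Hypothetical: choose the THP hint for a mapping of `size` bytes. */
static enum sketch_thp_advice sketch_thp_pick(const char *policy, size_t size) {
    const size_t MB = (size_t)1 << 20;
    assert(policy != NULL);
    if (strcmp(policy, "off") == 0) return SKETCH_THP_NOHUGEPAGE;
    if (strcmp(policy, "on") == 0)
        return size >= 1 * MB ? SKETCH_THP_HUGEPAGE : SKETCH_THP_NONE;
    /* "auto" (the default) and, by assumption, any unrecognized value. */
    return size >= 2 * MB ? SKETCH_THP_HUGEPAGE : SKETCH_THP_NONE;
}
```

On Linux the caller would translate SKETCH_THP_HUGEPAGE and SKETCH_THP_NOHUGEPAGE into madvise() with MADV_HUGEPAGE and MADV_NOHUGEPAGE respectively.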

HAKMEM_THP_LEARN

  • Default: 0
  • Purpose: Enable THP policy learning
  • Impact: Adaptively choose THP policy

HAKMEM_THP_CANDIDATES

  • Default: "off,auto,on"
  • Purpose: THP candidate policies for learning
  • Format: Comma-separated

HAKMEM_THP_ADOPT_PCT

  • Default: 0.015 (1.5%)
  • Purpose: Adoption threshold for THP switch
  • Impact: Minimum relative improvement a candidate policy must show before the THP policy is switched

14. L2/L25 Pool Configuration

HAKMEM_WRAP_L2

  • Default: 0
  • Purpose: Enable L2 pool wrapper bypass
  • Impact: Allow L2 during wrapper calls

HAKMEM_WRAP_L25

  • Default: 0
  • Purpose: Enable L25 pool wrapper bypass
  • Impact: Allow L25 during wrapper calls

HAKMEM_POOL_TLS_FREE

  • Default: 1
  • Purpose: Enable TLS-local free for L2 pool
  • Impact: Lock-free fast path

HAKMEM_POOL_TLS_RING

  • Default: 1
  • Purpose: Enable TLS ring buffer for pool
  • Impact: Batched cross-thread returns

HAKMEM_POOL_MIN_BUNDLE

  • Default: 4
  • Purpose: Minimum bundle size for L2 pool
  • Impact: Batch refill size

HAKMEM_L25_MIN_BUNDLE

  • Default: 4
  • Purpose: Minimum bundle size for L25 pool
  • Impact: Batch refill size

HAKMEM_L25_DZ

  • Default: "64,256"
  • Purpose: L25 size zones (comma-separated)
  • Format: "size1,size2,..."

HAKMEM_L25_RUN_BLOCKS

  • Default: 16
  • Purpose: Run blocks per L25 slab
  • Impact: Slab structure

HAKMEM_L25_RUN_FACTOR

  • Default: 2
  • Purpose: Run factor multiplier
  • Impact: Slab allocation strategy

15. Debugging & Observability

HAKMEM_VERBOSE

  • Default: 0
  • Purpose: Enable verbose logging
  • Impact: Detailed allocation logs

HAKMEM_QUIET

  • Default: 0
  • Purpose: Suppress all logging
  • Impact: Overrides HAKMEM_VERBOSE

HAKMEM_TIMING

  • Default: 0
  • Purpose: Enable timing measurements
  • Impact: Track allocation latency

HAKMEM_HIST_SAMPLE

  • Default: 0
  • Purpose: Size histogram sampling rate
  • Impact: Track size distribution

HAKMEM_PROF

  • Default: 0
  • Purpose: Enable profiling mode
  • Impact: Detailed performance tracking

HAKMEM_LOG_FILE

  • Default: stderr
  • Purpose: Redirect logs to file
  • Impact: File path for logging output

16. Mode Presets

HAKMEM_MODE

  • Default: "balanced"
  • Purpose: High-level configuration preset
  • Values:
    • "minimal" = malloc/mmap only
    • "fast" = pool fast-path + frozen learning
    • "balanced" = BigCache + ELO + Batch (default)
    • "learning" = ELO LEARN + adaptive
    • "research" = all features + verbose

HAKMEM_PRESET

  • Default: None
  • Purpose: Evolution preset (from PRESETS.md)
  • Impact: Load predefined parameter set

HAKMEM_FREE_POLICY

  • Default: "batch"
  • Purpose: Free path policy
  • Values: "batch", "keep", "adaptive"

17. Build-Time Flags (Not Environment Variables)

HAKMEM_ENABLE_STATS

  • Type: Compiler flag (-DHAKMEM_ENABLE_STATS)
  • Default: NOT DEFINED
  • Impact: Completely disables statistics when absent
  • Critical: Must be set to collect any statistics

HAKMEM_BUILD_RELEASE

  • Type: Compiler flag
  • Default: NOT DEFINED (= 0)
  • Impact: When undefined, enables debug paths
  • Check: #if !HAKMEM_BUILD_RELEASE = true when not set
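The reason the check is true when the flag is absent is a standard C preprocessor rule: inside #if, an identifier that is not defined as a macro evaluates to 0. The fragment below demonstrates this; the `SKETCH_`/`sketch_` names are hypothetical.

```c
#include <assert.h>

/* HAKMEM_BUILD_RELEASE undefined  -> treated as 0 -> !0 is true  -> debug paths on.
 * -DHAKMEM_BUILD_RELEASE=1        -> !1 is false                 -> debug paths off. */
#if !HAKMEM_BUILD_RELEASE
#define SKETCH_DEBUG_PATHS 1
#else
#define SKETCH_DEBUG_PATHS 0
#endif

/* Returns 1 when this translation unit was compiled with debug paths active. */
static int sketch_debug_paths_enabled(void) {
    return SKETCH_DEBUG_PATHS;
}
```

Compiling with -Wundef makes the compiler warn about this implicit zero, which can help catch a forgotten -DHAKMEM_BUILD_RELEASE=1.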

HAKMEM_BUILD_DEBUG

  • Type: Compiler flag
  • Default: NOT DEFINED (= 0)
  • Impact: Enables debug counters and logging

HAKMEM_DEBUG_COUNTERS

  • Type: Compiler flag
  • Default: 0
  • Impact: Include path debug counters in build

HAKMEM_TINY_MINIMAL_FRONT

  • Type: Compiler flag
  • Default: 0
  • Impact: Strip optional front-end layers (bench only)

HAKMEM_TINY_BENCH_FASTPATH

  • Type: Compiler flag
  • Default: 0
  • Impact: Enable benchmark-optimized fast path

HAKMEM_TINY_BENCH_SLL_ONLY

  • Type: Compiler flag
  • Default: 0
  • Impact: SLL-only mode (no magazines)

HAKMEM_USDT

  • Type: Compiler flag
  • Default: 0
  • Impact: Enable USDT tracepoints for perf
  • Requires: <sys/sdt.h> (systemtap-sdt-dev)

NULL Return Path Analysis

Why hak_tiny_alloc() Returns NULL

The Tiny Pool allocator returns NULL in these cases:

  1. Size > 1KB (line 97)

    if (class_idx < 0) return NULL;  // >1KB
    
  2. Wrapper Guard Active (lines 88-91, only when !HAKMEM_BUILD_RELEASE)

    #if !HAKMEM_BUILD_RELEASE
    if (!g_wrap_tiny_enabled && g_tls_in_wrapper != 0) return NULL;
    #endif
    

    Note: HAKMEM_BUILD_RELEASE is NOT defined by default! This guard is ACTIVE in your build and returns NULL during malloc recursion.

  3. Wrapper Context Empty (line 73)

    return NULL;  // empty → fallback to next allocator tier
    

    Called from hak_tiny_alloc_wrapper() when magazine is empty.

  4. Slow Path Exhaustion

    When all of these fail in hak_tiny_alloc_slow():

    • HotMag refill fails
    • TLS list empty
    • TLS slab refill fails
    • hak_tiny_alloc_superslab() returns NULL

When Tiny Pool is Bypassed

Given HAKMEM_WRAP_TINY=1 (default), Tiny Pool is still bypassed when:

  1. During wrapper recursion (if HAKMEM_BUILD_RELEASE not set)

    • malloc() calls getenv()
    • getenv() calls malloc()
    • Guard returns NULL → falls back to L2/L25
  2. Size > 1KB

    • Always falls through to L2 pool (1KB-32KB)
  3. All caches empty + SuperSlab allocation fails

    • Magazine empty
    • SLL empty
    • Active slabs full
    • SuperSlab cannot allocate new slab
    • Falls back to L2/L25
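The bypass cases above can be modeled with a guard plus a fallback tier. This is a simplified sketch, not the code in hakmem_tiny_alloc.inc: the `sketch_` variables stand in for g_tls_in_wrapper and g_wrap_tiny_enabled, and malloc() is used as a placeholder for both the Tiny fast path and the L2/L25 pools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static _Thread_local int sketch_in_wrapper;  /* stands in for g_tls_in_wrapper */
static int sketch_wrap_tiny_enabled = 0;     /* stands in for g_wrap_tiny_enabled */

/* Returns NULL when the Tiny tier declines the request. */
static void *sketch_tiny_alloc(size_t size) {
    if (size > 1024) return NULL;                          /* >1KB: not Tiny */
    if (!sketch_wrap_tiny_enabled && sketch_in_wrapper != 0)
        return NULL;                                       /* wrapper guard */
    return malloc(size);                                   /* placeholder fast path */
}

/* Try Tiny first, then fall back. *tier is 0 for Tiny, 1 for the fallback. */
static void *sketch_alloc(size_t size, int *tier) {
    assert(tier != NULL);
    void *p = sketch_tiny_alloc(size);
    *tier = p ? 0 : 1;
    return p ? p : malloc(size);                           /* placeholder for L2/L25 */
}
```

Counting how often `tier` comes back as 1 for small sizes is one way to confirm whether the guard, rather than cache exhaustion, is diverting allocations.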

Memory Issue Diagnosis: 9GB Usage

Current Symptoms

  • bench_fragment_stress_long_hakmem: 9GB RSS
  • System allocator: 1.6MB RSS
  • Tiny Pool stats: alloc=0, free=0, slab=0 (ZERO activity)

Root Cause Analysis

Hypothesis #1: Statistics Disabled (CONFIRMED)

Probability: 100%

Evidence:

  • HAKMEM_ENABLE_STATS not defined in Makefile
  • All stats show 0 (no data collection)
  • Code in hakmem_tiny_stats.h:243-275 shows no-op when disabled

Impact:

  • Cannot see if Tiny Pool is being used
  • Cannot diagnose allocation patterns
  • Blind to memory leaks

Fix:

make clean
make CFLAGS="-DHAKMEM_ENABLE_STATS" bench_fragment_stress_hakmem

Hypothesis #2: Wrapper Guard Blocking Tiny Pool

Probability: 90%

Evidence:

  • HAKMEM_BUILD_RELEASE not defined → guard is ACTIVE
  • Wrapper guard code at hakmem_tiny_alloc.inc:86-92
  • During benchmark, many allocations may trigger wrapper context

Mechanism:

#if !HAKMEM_BUILD_RELEASE  // This is TRUE (not defined)
if (!g_wrap_tiny_enabled && g_tls_in_wrapper != 0)
    return NULL;  // Bypass Tiny Pool!
#endif

Result:

  • Tiny Pool returns NULL
  • Falls back to L2/L25 pools
  • L2/L25 may be leaking or over-allocating

Fix:

make CFLAGS="-DHAKMEM_BUILD_RELEASE=1"

Hypothesis #3: L2/L25 Pool Leak or Over-Retention

Probability: 75%

Evidence:

  • If Tiny Pool is bypassed → L2/L25 handles ≤1KB allocations
  • L2/L25 may have less aggressive trimming
  • Fragment stress workload may trigger worst-case pooling

Verification:

  1. Enable L2/L25 statistics
  2. Check pool sizes: g_pool_* counters
  3. Look for unbounded pool growth

Fix: Tune L2/L25 parameters:

export HAKMEM_POOL_TLS_FREE=1
export HAKMEM_CAP_MID=256  # Cap mid-tier pool at 256 blocks

Step 1: Enable Statistics

make clean
make CFLAGS="-DHAKMEM_ENABLE_STATS -DHAKMEM_BUILD_RELEASE=1" bench_fragment_stress_hakmem

Step 2: Run with Diagnostics

export HAKMEM_WRAP_TINY=1
export HAKMEM_VERBOSE=1
./bench_fragment_stress_hakmem

Step 3: Check Statistics

# In benchmark output, look for:
# - Tiny Pool stats (should be non-zero now)
# - L2/L25 pool stats
# - Cache hit rates
# - RSS growth pattern

Step 4: Profile Memory

# Option A: Valgrind massif
valgrind --tool=massif --massif-out-file=massif.out ./bench_fragment_stress_hakmem
ms_print massif.out

# Option B: HAKMEM internal profiling
export HAKMEM_PROF=1
export HAKMEM_PROF_SAMPLE=100
./bench_fragment_stress_hakmem

Step 5: Compare Allocator Tiers

# Force Tiny-only (disable L2/L25 fallback)
export HAKMEM_TINY_USE_SUPERSLAB=1
export HAKMEM_CAP_MID=0      # Disable mid-tier
export HAKMEM_CAP_LARGE=0    # Disable large-tier
./bench_fragment_stress_hakmem

# Check if RSS improves → L2/L25 is the problem

Quick Reference: Must-Set Variables for Debugging

# Enable everything for debugging
export HAKMEM_WRAP_TINY=1              # Use Tiny Pool
export HAKMEM_VERBOSE=1                # See what's happening
export HAKMEM_ACE_DEBUG=1              # ACE diagnostics
export HAKMEM_TINY_PATH_DEBUG=1        # Path counters (if built with HAKMEM_DEBUG_COUNTERS)

# Build with statistics
make clean
make CFLAGS="-DHAKMEM_ENABLE_STATS -DHAKMEM_BUILD_RELEASE=1 -DHAKMEM_DEBUG_COUNTERS=1"

Summary: Critical Variables for Your Issue

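| Variable             | Current        | Should Be                | Impact                       |
|----------------------|----------------|--------------------------|------------------------------|
| HAKMEM_ENABLE_STATS  | undefined      | -DHAKMEM_ENABLE_STATS    | Enable statistics collection |
| HAKMEM_BUILD_RELEASE | undefined (=0) | -DHAKMEM_BUILD_RELEASE=1 | Disable wrapper guard        |
| HAKMEM_WRAP_TINY     | 1 ✓            | 1                        | Already correct              |
| HAKMEM_VERBOSE       | 0              | 1                        | See allocation logs          |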

Action: Rebuild with both flags, then re-run benchmark to see real statistics.