Add comprehensive CONFIGURATION.md user documentation
(cherry-picked from 0143e0fed, conflict resolved)
392
CONFIGURATION.md
Normal file
@ -0,0 +1,392 @@
# HAKMEM Configuration Guide

**Last Updated**: 2025-11-26 (After Phase 2.2 - Learning Systems Consolidation)

This guide documents all canonical HAKMEM environment variables after Phase 0-2 cleanup.

---

## 📋 Quick Reference

Use the validation tool to check your configuration:

```bash
# Validate current environment
./scripts/validate_config.sh

# Strict mode (treat warnings as errors)
./scripts/validate_config.sh --strict

# Quiet mode (errors only)
./scripts/validate_config.sh --quiet
```

**Deprecated variables?** See [DEPRECATED.md](DEPRECATED.md) for migration guide.

---

## 🎯 Core Configuration

### Allocator Path Selection

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_WRAP_TINY` | 0, 1 | 1 | Enable TINY allocator (1-2048B) |
| `HAKMEM_WRAP_POOL` | 0, 1 | 1 | Enable POOL allocator (2-8KB) |
| `HAKMEM_WRAP_MID` | 0, 1 | 1 | Enable MID allocator (8-32KB) |
| `HAKMEM_WRAP_LARGE` | 0, 1 | 1 | Enable LARGE allocator (>32KB) |

**Example**:
```bash
# Disable all HAKMEM allocators (use system malloc)
export HAKMEM_WRAP_TINY=0 HAKMEM_WRAP_POOL=0 HAKMEM_WRAP_MID=0 HAKMEM_WRAP_LARGE=0
```
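
To see which `HAKMEM_WRAP_*` switch is relevant for a given request size, the size classes in the table can be sketched as a small shell helper. This is only an illustration: `hakmem_route` is a hypothetical name, not part of HAKMEM, and the exact boundary handling (2048, 8192, and 32768 bytes counted as the upper end of each class) is an assumption, since the table does not say which side the boundaries fall on.

```bash
# hakmem_route SIZE_BYTES
# Hypothetical helper: prints the allocator path whose documented size range
# covers SIZE_BYTES. Boundaries are assumed to be inclusive at the upper end.
hakmem_route() (
    size=$1
    if   [ "$size" -le 2048 ];  then echo TINY    # 1-2048B
    elif [ "$size" -le 8192 ];  then echo POOL    # 2-8KB
    elif [ "$size" -le 32768 ]; then echo MID     # 8-32KB
    else                             echo LARGE   # >32KB
    fi
)

# Example: a 4096-byte request falls in the POOL range,
# so HAKMEM_WRAP_POOL is the switch that governs it.
hakmem_route 4096
```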

---

## 🐛 Debug & Diagnostics

**Canonical Variables** (After P0.4 - Debug Consolidation):

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_DEBUG_LEVEL` | 0-3 | 0 | Verbosity (0=none, 1=errors, 2=info, 3=verbose) |
| `HAKMEM_DEBUG_TINY` | 0, 1 | 0 | Enable TINY allocator debug output |
| `HAKMEM_TRACE_ALLOCATIONS` | 0, 1 | 0 | Trace every alloc/free (expensive!) |
| `HAKMEM_INTEGRITY_CHECKS` | 0, 1 | 1 | Enable integrity validation (canary checks) |

**Examples**:
```bash
# Production (quiet, integrity only)
export HAKMEM_DEBUG_LEVEL=0
export HAKMEM_INTEGRITY_CHECKS=1

# Debug session (verbose + TINY debug + tracing)
export HAKMEM_DEBUG_LEVEL=3
export HAKMEM_DEBUG_TINY=1
export HAKMEM_TRACE_ALLOCATIONS=1
export HAKMEM_INTEGRITY_CHECKS=1

# Performance testing (all checks OFF)
export HAKMEM_DEBUG_LEVEL=0
export HAKMEM_INTEGRITY_CHECKS=0
```
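
Because `export` keeps a variable set for the rest of the shell session, expensive options such as `HAKMEM_TRACE_ALLOCATIONS=1` can leak into later commands. A minimal sketch of scoping the debug-session settings to a single command with the standard `env` utility follows; `hakmem_debug_run` is a hypothetical wrapper name and `./my_app` is a placeholder program.

```bash
# hakmem_debug_run CMD [ARGS...]
# Hypothetical wrapper: runs CMD with the debug-session variables set for
# that one process only; the calling shell's environment is left unchanged.
hakmem_debug_run() {
    env HAKMEM_DEBUG_LEVEL=3 \
        HAKMEM_DEBUG_TINY=1 \
        HAKMEM_TRACE_ALLOCATIONS=1 \
        HAKMEM_INTEGRITY_CHECKS=1 \
        "$@"
}

# Example (placeholder program name):
#   hakmem_debug_run ./my_app --some-flag
```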

---

## 🏗️ SuperSlab Management

**Canonical Variables** (After P0.1 - SuperSlab Unification):

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_SUPERSLAB_REUSE` | 0, 1 | 0 | Reuse empty slabs (reduces mmap/munmap syscalls) |
| `HAKMEM_SUPERSLAB_LAZY` | 0, 1 | 1 | Lazy deallocation (Phase 9, keep slabs cached) |
| `HAKMEM_SUPERSLAB_PREWARM` | 0-128 | 0 | Preallocate N SuperSlabs at startup |
| `HAKMEM_SUPERSLAB_LRU_CAP` | 0-1024 | 256 | Max cached SuperSlabs (LRU eviction) |
| `HAKMEM_SUPERSLAB_SOFT_CAP` | 0-1024 | 128 | Soft cap for SuperSlab pool (before eviction) |

**Examples**:
```bash
# High performance (aggressive reuse + large cache)
export HAKMEM_SUPERSLAB_REUSE=1
export HAKMEM_SUPERSLAB_LAZY=1
export HAKMEM_SUPERSLAB_PREWARM=16
export HAKMEM_SUPERSLAB_LRU_CAP=512

# Low memory footprint (minimal caching)
export HAKMEM_SUPERSLAB_REUSE=0
export HAKMEM_SUPERSLAB_LAZY=0
export HAKMEM_SUPERSLAB_LRU_CAP=32
export HAKMEM_SUPERSLAB_SOFT_CAP=16
```

**Note**: Phase 12 (Shared SuperSlab Pool) removed per-class registry population, which makes `HAKMEM_SUPERSLAB_REUSE` less effective. It is therefore OFF by default.

---

## 🧠 Learning Systems

**Canonical Variables** (After P2.2 - Learning Consolidation, 18→6 variables):

### Allocation Learning
Controls adaptive sizing for allocator caches (TLS, SFC, capacity tuning).

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_ALLOC_LEARN` | 0, 1 | 0 | Enable allocation pattern learning |
| `HAKMEM_ALLOC_LEARN_WINDOW` | 1-1000000 | 10000 | Learning window size (operations) |
| `HAKMEM_ALLOC_LEARN_RATE` | 0.0-1.0 | 0.1 | Learning rate (lower = slower adaptation) |

### Memory Learning
Controls THP (Transparent Huge Pages), RSS optimization, and max-size learning.

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_MEM_LEARN` | 0, 1 | 0 | Enable memory pattern learning (THP/RSS/WMAX) |
| `HAKMEM_MEM_LEARN_WINDOW` | 1-1000000 | 5000 | Learning window size (operations) |
| `HAKMEM_MEM_LEARN_THRESHOLD` | 0.0-1.0 | 0.8 | Activation threshold (80% confidence) |

### Advanced Overrides
**For troubleshooting only** - enables legacy advanced knobs that are auto-tuned by default.

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_LEARN_ADVANCED` | 0, 1 | 0 | Enable advanced override knobs (see DEPRECATED.md) |

**Examples**:
```bash
# Production (learning disabled, use static tuning)
export HAKMEM_ALLOC_LEARN=0
export HAKMEM_MEM_LEARN=0

# Adaptive workload (enable both learners)
export HAKMEM_ALLOC_LEARN=1
export HAKMEM_ALLOC_LEARN_WINDOW=20000
export HAKMEM_ALLOC_LEARN_RATE=0.05
export HAKMEM_MEM_LEARN=1
export HAKMEM_MEM_LEARN_WINDOW=10000
export HAKMEM_MEM_LEARN_THRESHOLD=0.75

# Migration troubleshooting (enable advanced overrides)
export HAKMEM_LEARN_ADVANCED=1
export HAKMEM_LEARN_DECAY=0.95  # Override auto-tuned decay
```

**Migration Note**: See [DEPRECATED.md](DEPRECATED.md) for mapping of 18 legacy variables → 6 canonical variables.

---

## 🎯 TINY Allocator (1-2048B)

### TLS Cache Configuration

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_TINY_TLS_CAP` | 16-1024 | 64 | Per-class TLS cache capacity |
| `HAKMEM_TINY_TLS_REFILL` | 4-256 | 16 | Batch refill size |
| `HAKMEM_TINY_DRAIN_THRESH` | 0-1024 | 128 | Remote free drain threshold |

### Super Front Cache (SFC)
**Note**: SFC is **ACTIVE** and provides 95%+ hit rate for hot allocations.

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_TINY_SFC_ENABLE` | 0, 1 | 1 | Enable Super Front Cache (ultra-fast TLS cache) |
| `HAKMEM_TINY_SFC_CAPACITY` | 32-512 | 128 | SFC slot count |
| `HAKMEM_TINY_SFC_HOT_CLASSES` | 1-16 | 8 | Number of hot classes to cache |

### P0 Batch Optimization

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_TINY_P0_ENABLE` | 0, 1 | 1 | Enable P0 batch refill (O(1) freelist pop) |
| `HAKMEM_TINY_P0_BATCH` | 4-128 | 16 | P0 batch size |
| `HAKMEM_TINY_P0_NO_DRAIN` | 0, 1 | 0 | Disable remote drain (debug only) |
| `HAKMEM_TINY_P0_LOG` | 0, 1 | 0 | Enable P0 counter validation logging |

### Header Configuration

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_TINY_HEADER_CLASSIDX` | 0, 1 | 1 | Store class_idx in header (Phase 7, enables fast free) |

**Examples**:
```bash
# High-throughput (large caches, aggressive batching)
export HAKMEM_TINY_TLS_CAP=256
export HAKMEM_TINY_TLS_REFILL=32
export HAKMEM_TINY_SFC_CAPACITY=256
export HAKMEM_TINY_P0_ENABLE=1
export HAKMEM_TINY_P0_BATCH=32

# Low-latency (small caches, fine-grained refill)
export HAKMEM_TINY_TLS_CAP=32
export HAKMEM_TINY_TLS_REFILL=4
export HAKMEM_TINY_SFC_CAPACITY=64
export HAKMEM_TINY_P0_BATCH=8

# Debug P0 issues
export HAKMEM_TINY_P0_LOG=1
export HAKMEM_TINY_P0_NO_DRAIN=1  # Isolate batch refill from remote free
```

---

## 🏊 Pool TLS Allocator (2-8KB)

### Arena Management

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_POOL_TLS_ARENA_MB_INIT` | 1-64 | 1 | Initial arena size (MB) |
| `HAKMEM_POOL_TLS_ARENA_MB_MAX` | 1-64 | 8 | Maximum arena size (MB) |
| `HAKMEM_POOL_TLS_ARENA_GROWTH_LEVELS` | 1-8 | 3 | Growth levels (1MB→2MB→4MB→8MB) |

**Example**:
```bash
# Large arena for high-throughput 8KB allocations
export HAKMEM_POOL_TLS_ARENA_MB_INIT=4
export HAKMEM_POOL_TLS_ARENA_MB_MAX=32
export HAKMEM_POOL_TLS_ARENA_GROWTH_LEVELS=5  # 4MB→8MB→16MB→32MB
```
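
The growth sequences shown above (1MB→2MB→4MB→8MB and 4MB→8MB→16MB→32MB) both double at each step. Assuming that doubling rule holds in general, the sizes an arena can reach for a given `MB_INIT` and `MB_MAX` can be previewed with a short helper. `arena_growth_steps` is a hypothetical name, and the sketch deliberately ignores `HAKMEM_POOL_TLS_ARENA_GROWTH_LEVELS`, whose exact interaction with the maximum is not specified here.

```bash
# arena_growth_steps INIT_MB MAX_MB
# Hypothetical helper: prints the arena sizes (in MB) reached by repeated
# doubling from INIT_MB, stopping before a size would exceed MAX_MB.
# Assumes doubling growth; does not model GROWTH_LEVELS.
arena_growth_steps() (
    size=$1
    max=$2
    # Guard against 0 or negative input, which would never grow.
    [ "$size" -ge 1 ] || return 1
    steps=$size
    while [ $((size * 2)) -le "$max" ]; do
        size=$((size * 2))
        steps="$steps $size"
    done
    echo "$steps"
)

# Example: preview the sequence for the "large arena" settings above.
arena_growth_steps 4 32
```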

---

## 📊 Statistics & Profiling

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_STATS_ENABLE` | 0, 1 | 0 | Enable statistics collection |
| `HAKMEM_STATS_VERBOSE` | 0, 1 | 0 | Verbose stats output |
| `HAKMEM_STATS_INTERVAL_SEC` | 1-3600 | 10 | Stats reporting interval (seconds) |
| `HAKMEM_PROFILE_SYSCALLS` | 0, 1 | 0 | Profile syscall counts (mmap/munmap/madvise) |

**Example**:
```bash
# Enable stats for performance analysis
export HAKMEM_STATS_ENABLE=1
export HAKMEM_STATS_VERBOSE=1
export HAKMEM_STATS_INTERVAL_SEC=5
export HAKMEM_PROFILE_SYSCALLS=1
```

---

## 🧪 Experimental Features

**Warning**: These features are experimental and may change or be removed.

| Variable | Values | Default | Description |
|----------|--------|---------|-------------|
| `HAKMEM_EXPERIMENTAL_ADAPTIVE_DRAIN` | 0, 1 | 0 | Adaptive remote free drain threshold |
| `HAKMEM_EXPERIMENTAL_CACHE_TUNING` | 0, 1 | 0 | Runtime cache capacity tuning |

---

## 🚀 Quick Start Examples

### 1. Production (Default Recommended)
```bash
# High performance, stable, integrity checks enabled
export HAKMEM_SUPERSLAB_LAZY=1
export HAKMEM_SUPERSLAB_LRU_CAP=256
export HAKMEM_TINY_P0_ENABLE=1
export HAKMEM_INTEGRITY_CHECKS=1
```

### 2. Debug Session
```bash
# Verbose logging, tracing, integrity checks
export HAKMEM_DEBUG_LEVEL=3
export HAKMEM_DEBUG_TINY=1
export HAKMEM_TRACE_ALLOCATIONS=1
export HAKMEM_INTEGRITY_CHECKS=1
export HAKMEM_TINY_P0_LOG=1
```

### 3. Low-Latency Workload
```bash
# Small caches, fine-grained batching, minimal syscalls
export HAKMEM_TINY_TLS_CAP=32
export HAKMEM_TINY_TLS_REFILL=4
export HAKMEM_TINY_SFC_CAPACITY=64
export HAKMEM_SUPERSLAB_LAZY=1
export HAKMEM_SUPERSLAB_LRU_CAP=128
```

### 4. High-Throughput Workload
```bash
# Large caches, aggressive batching, prewarm
export HAKMEM_TINY_TLS_CAP=256
export HAKMEM_TINY_TLS_REFILL=32
export HAKMEM_TINY_SFC_CAPACITY=256
export HAKMEM_TINY_P0_BATCH=32
export HAKMEM_SUPERSLAB_PREWARM=16
export HAKMEM_SUPERSLAB_LRU_CAP=512
```

### 5. Memory-Efficient (Low RSS)
```bash
# Minimal caching, eager deallocation
export HAKMEM_SUPERSLAB_LAZY=0
export HAKMEM_SUPERSLAB_LRU_CAP=32
export HAKMEM_SUPERSLAB_SOFT_CAP=16
export HAKMEM_TINY_TLS_CAP=32
export HAKMEM_TINY_SFC_CAPACITY=64
export HAKMEM_POOL_TLS_ARENA_MB_MAX=2
```

---

## ✅ Validation & Testing

### Validate Configuration
```bash
# Check for deprecated/invalid variables
./scripts/validate_config.sh

# Example output:
# [DEPRECATED] HAKMEM_LEARN is deprecated, use HAKMEM_ALLOC_LEARN instead
# Sunset date: 2026-05-26 (6 months from 2025-11-26)
# See DEPRECATED.md for migration guide
#
# [WARN] HAKMEM_TINY_TLS_CAP=2048 is outside typical range (16-1024)
#
# [OK] HAKMEM_DEBUG_LEVEL=2
# [OK] HAKMEM_SUPERSLAB_LAZY=1
```
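
For readers who want to see what a range check of this kind involves, here is a minimal sketch that reproduces the `[OK]` and `[WARN]` line formats from the example output. It is not the code of `validate_config.sh`; `check_range` is a hypothetical function, and skipping unset variables silently is an assumption.

```bash
# check_range NAME MIN MAX
# Hypothetical helper: prints a status line for an integer-valued variable.
# Unset or empty variables are skipped without output.
check_range() (
    name=$1 min=$2 max=$3
    eval "value=\${$name-}"          # read the variable whose name is in $name
    [ -n "$value" ] || return 0
    case $value in
        *[!0-9]*)                    # anything other than digits
            echo "[WARN] $name=$value is not a non-negative integer"
            return 1 ;;
    esac
    if [ "$value" -lt "$min" ] || [ "$value" -gt "$max" ]; then
        echo "[WARN] $name=$value is outside typical range ($min-$max)"
        return 1
    fi
    echo "[OK] $name=$value"
)
```

With `HAKMEM_TINY_TLS_CAP=2048` set, `check_range HAKMEM_TINY_TLS_CAP 16 1024` prints the same `[WARN]` line as in the example output above and returns a non-zero status, which a caller could treat as an error in a strict mode.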

### Test Performance
```bash
# Baseline (10M iterations, 10 runs recommended)
./out/release/bench_random_mixed_hakmem

# Custom workload
./out/release/bench_random_mixed_hakmem 10000000 256 42

# Multi-threaded (Larson benchmark)
./out/release/larson_hakmem 8  # 8 threads
```
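
Since 10 runs are recommended for a baseline, a small loop helper can save retyping the command. The sketch below is generic shell, not a HAKMEM script; `run_n` is a hypothetical name, and the progress line goes to stderr so that the benchmark's own output stays clean on stdout.

```bash
# run_n COUNT CMD [ARGS...]
# Hypothetical helper: runs CMD COUNT times in sequence and stops at the
# first failing run, returning a non-zero status.
run_n() (
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        echo "== run $i/$count ==" >&2
        "$@" || return 1
        i=$((i + 1))
    done
)

# Example: ten runs of the custom workload shown above.
#   run_n 10 ./out/release/bench_random_mixed_hakmem 10000000 256 42
```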

---

## ❓ FAQ

### Q: What's the difference between ALLOC_LEARN and MEM_LEARN?
**A**:
- `HAKMEM_ALLOC_LEARN`: Tunes **allocator behavior** (cache sizes, refill batches) based on allocation patterns
- `HAKMEM_MEM_LEARN`: Tunes **memory management** (THP usage, RSS optimization, max-size detection)

### Q: Should I enable learning in production?
**A**: **Generally NO**. Learning adds overhead (~5-10%) and is best for:
- Adaptive workloads with unpredictable patterns
- Benchmarking different configurations
- Initial tuning phase (then bake learned values into static config)

For production, use static tuning based on profiling.
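
One way to keep a static configuration reproducible is to store the chosen `HAKMEM_*` values in a file and source it before launching the application. The sketch below is an assumption about workflow, not a HAKMEM feature: `hakmem_save_env` and the file name `hakmem.env` are hypothetical, and the helper assumes values contain no spaces or shell metacharacters, which holds for the numeric settings documented in this guide.

```bash
# hakmem_save_env FILE
# Hypothetical helper: writes every HAKMEM_* variable from the current
# environment to FILE as "export NAME=value" lines, sorted for stable diffs.
# Assumes values need no quoting (plain numbers).
hakmem_save_env() {
    env | grep '^HAKMEM_' | sort | sed 's/^/export /' > "$1"
}
```

After tuning, `hakmem_save_env hakmem.env` captures the settings, and `. ./hakmem.env` restores them in a fresh shell.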

### Q: Why is SUPERSLAB_REUSE default OFF?
**A**: Phase 12 (Shared SuperSlab Pool) removed per-class registry population. Reuse is now less effective and can cause fragmentation. Use `HAKMEM_SUPERSLAB_LAZY=1` (default) instead for syscall reduction.

### Q: What's the performance impact of INTEGRITY_CHECKS?
**A**: ~2-5% overhead. Recommended for production (default ON) to catch memory corruption early. Disable only for performance testing.

### Q: How do I migrate from deprecated learning variables?
**A**: See [DEPRECATED.md](DEPRECATED.md), section "Learning Systems (P2.2 Consolidation)", for the complete mapping of 18→6 variables. The 6-month deprecation period provides backward compatibility.

### Q: What's SFC and why is it still active?
**A**: SFC (Super Front Cache) is an ultra-fast TLS cache (95%+ hit rate, 3-4 instructions). Unified Cache was tested in Phase 3d-B but found to be slower than SFC, so SFC remains the active implementation.

---

## 📚 See Also

- [DEPRECATED.md](DEPRECATED.md) - Deprecated variables and migration guide
- [BUILDING_QUICKSTART.md](BUILDING_QUICKSTART.md) - Build instructions
- [CLAUDE.md](CLAUDE.md) - Development history and performance benchmarks
- [hakmem_cleanup_proposal.txt](hakmem_cleanup_proposal.txt) - Cleanup roadmap

---

**Generated**: 2025-11-26 (Phase 2.2 - Learning Systems Consolidation)