Add Box 3 (Pointer Conversion Layer) and fix POOL_TLS_PHASE1 default
## Major Changes
### 1. Box 3: Pointer Conversion Module (NEW)
- File: core/box/ptr_conversion_box.h
- Purpose: Unified BASE ↔ USER pointer conversion (single source of truth)
- API: PTR_BASE_TO_USER(), PTR_USER_TO_BASE()
- Features: Zero-overhead inline, debug mode, NULL-safe, class 7 headerless support
- Design: Header-only, fully modular, no external dependencies
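A minimal sketch of what a BASE ↔ USER conversion layer of this kind might look like. The commit does not show the contents of ptr_conversion_box.h, so everything here is an assumption for illustration: the `sketch_` names, the 1-byte header size, the class range 0..7, and the debug assert. The actual PTR_BASE_TO_USER() / PTR_USER_TO_BASE() macros may take different arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values: the commit does not state the header size or class range. */
#define SKETCH_HEADER_SIZE      1u
#define SKETCH_HEADERLESS_CLASS 7

/* Bytes between the BASE pointer (start of the block) and the USER pointer. */
static inline size_t sketch_header_offset(int class_idx) {
    /* Debug-mode check (compiled out with NDEBUG); assumes classes 0..7. */
    assert(class_idx >= 0 && class_idx <= SKETCH_HEADERLESS_CLASS);
    return (class_idx == SKETCH_HEADERLESS_CLASS) ? 0u : SKETCH_HEADER_SIZE;
}

/* BASE -> USER: skip the header, unless the class is headerless. NULL-safe. */
static inline void* sketch_base_to_user(void* base, int class_idx) {
    if (base == NULL) return NULL;
    return (uint8_t*)base + sketch_header_offset(class_idx);
}

/* USER -> BASE: step back over the header, unless the class is headerless. NULL-safe. */
static inline void* sketch_user_to_base(void* user, int class_idx) {
    if (user == NULL) return NULL;
    return (uint8_t*)user - sketch_header_offset(class_idx);
}
```

Keeping the offset in one helper means the headerless special case lives in a single place, which is the "single source of truth" idea described above.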
### 2. POOL_TLS_PHASE1 Default OFF (CRITICAL FIX)
- File: build.sh
- Change: POOL_TLS_PHASE1 now defaults to 0 (was hardcoded to 1)
- Impact: Eliminates pthread_mutex overhead on every free() (was causing 3.3x slowdown)
- Usage: Set POOL_TLS_PHASE1=1 env var to enable if needed
### 3. Pointer Conversion Fixes (PARTIAL)
- Files: core/box/front_gate_box.c, core/tiny_alloc_fast.inc.h, etc.
- Status: Partial implementation using Box 3 API
- Note: Work in progress, some conversions still need review
### 4. Performance Investigation Report (NEW)
- File: HOTPATH_PERFORMANCE_INVESTIGATION.md
- Findings:
  - Hotpath works (+24% vs baseline) after POOL_TLS fix
  - Still 9.2x slower than system malloc due to:
    * Heavy initialization (23.85% of cycles)
    * Syscall overhead (2,382 syscalls per 100K ops)
    * Workload mismatch (C7 1KB is 49.8%, but only C5 256B has hotpath)
    * 9.4x more instructions than system malloc
### 5. Known Issues
- SEGV at 20K-30K iterations (pre-existing bug, not related to pointer conversions)
- Root cause: Likely active counter corruption or TLS-SLL chain issues
- Status: Under investigation
## Performance Results (100K iterations, 256B)
- Baseline (Hotpath OFF): 7.22M ops/s
- Hotpath ON: 8.98M ops/s (+24% improvement ✓)
- System malloc: 82.2M ops/s (still 9.2x faster)
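The percentages and ratios above follow directly from the three throughput figures. A small sketch of the arithmetic, using hypothetical helper names that are not part of the codebase:

```c
#include <assert.h>

/* Relative improvement of candidate over baseline, in percent. */
static double sketch_speedup_percent(double baseline, double candidate) {
    assert(baseline > 0.0);
    return (candidate / baseline - 1.0) * 100.0;
}

/* How many times faster the reference is than the candidate. */
static double sketch_slowdown_factor(double reference, double candidate) {
    assert(candidate > 0.0);
    return reference / candidate;
}
```

With the numbers reported here, `sketch_speedup_percent(7.22, 8.98)` is roughly 24.4 and `sketch_slowdown_factor(82.2, 8.98)` is roughly 9.15, matching the rounded +24% and 9.2x.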
## Next Steps
- P0: Fix 20K-30K SEGV bug (GDB investigation needed)
- P1: Lazy initialization (+20-25% expected)
- P1: C7 (1KB) hotpath (+30-40% expected, biggest win)
- P2: Reduce syscalls (+15-20% expected)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
@@ -1,4 +1,6 @@
// hakmem_tiny_init.inc
// Note: uses TLS ops inline helpers for prewarm when class5 hotpath is enabled
#include "hakmem_tiny_tls_ops.h"
// Phase 2D-2: Initialization function extraction
//
// This file contains the hak_tiny_init() function extracted from hakmem_tiny.c
@@ -12,6 +14,15 @@ void hak_tiny_init(void) {
    // Step 1: Simple initialization (static global is already zero-initialized)
    g_tiny_initialized = 1;

    // Hot-class toggle: class5 (256B) dedicated TLS fast path
    // Default ON; allow runtime override via HAKMEM_TINY_HOTPATH_CLASS5
    {
        const char* hp5 = getenv("HAKMEM_TINY_HOTPATH_CLASS5");
        if (hp5 && *hp5) {
            g_tiny_hotpath_class5 = (atoi(hp5) != 0) ? 1 : 0;
        }
    }

    // Reset fast-cache defaults and apply preset (if provided)
    tiny_config_reset_defaults();
    char* preset_env = getenv("HAKMEM_TINY_PRESET");
@@ -89,6 +100,37 @@ void hak_tiny_init(void) {
        tls->spill_high = tiny_tls_default_spill(base_cap);
        tiny_tls_publish_targets(i, base_cap);
    }
    // Optional: override TLS parameters for hot class 5 (256B)
    if (g_tiny_hotpath_class5) {
        TinyTLSList* tls5 = &g_tls_lists[5];
        int cap_def = 512; // thick cache for hot class
        int refill_def = 128; // refill low-water mark
        int spill_def = 0; // 0 → use cap as hard spill threshold
        const char* ecap = getenv("HAKMEM_TINY_CLASS5_TLS_CAP");
        const char* eref = getenv("HAKMEM_TINY_CLASS5_TLS_REFILL");
        const char* espl = getenv("HAKMEM_TINY_CLASS5_TLS_SPILL");
        if (ecap && *ecap) cap_def = atoi(ecap);
        if (eref && *eref) refill_def = atoi(eref);
        if (espl && *espl) spill_def = atoi(espl);
        if (cap_def < 64) cap_def = 64; if (cap_def > 4096) cap_def = 4096;
        if (refill_def < 16) refill_def = 16; if (refill_def > cap_def) refill_def = cap_def;
        if (spill_def < 0) spill_def = 0; if (spill_def > cap_def) spill_def = cap_def;
        tls5->cap = (uint32_t)cap_def;
        tls5->refill_low = (uint32_t)refill_def;
        tls5->spill_high = (uint32_t)spill_def; // 0 → use cap logic in helper
        tiny_tls_publish_targets(5, (uint32_t)cap_def);

        // Optional: one-shot TLS prewarm for class5
        // Env: HAKMEM_TINY_CLASS5_PREWARM=<n> (default 128, 0 disables)
        int prewarm = 128;
        const char* pw = getenv("HAKMEM_TINY_CLASS5_PREWARM");
        if (pw && *pw) prewarm = atoi(pw);
        if (prewarm < 0) prewarm = 0;
        if (prewarm > (int)tls5->cap) prewarm = (int)tls5->cap;
        if (prewarm > 0) {
            (void)tls_refill_from_tls_slab(5, tls5, (uint32_t)prewarm);
        }
    }
    if (mem_diet_enabled) {
        tiny_apply_mem_diet();
    }
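The class 5 overrides in this diff repeat one pattern four times: read an environment variable, fall back to a default when it is unset or empty, then clamp the result to a range. A minimal sketch of that pattern as a helper follows. The `sketch_` function names are hypothetical and do not exist in the codebase; the sketch keeps `atoi()` to mirror the diff, although `strtol()` would allow detecting malformed input.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse s as an int, use def when s is NULL or empty,
 * then clamp the result to [lo, hi]. Like the diff, this relies on atoi(),
 * which yields 0 for non-numeric text. */
static int sketch_parse_clamped(const char* s, int def, int lo, int hi) {
    assert(lo <= hi);
    int value = def;
    if (s && *s) value = atoi(s);
    if (value < lo) value = lo;
    if (value > hi) value = hi;
    return value;
}

/* Hypothetical wrapper: same as above, reading the text from the environment. */
static int sketch_env_int_clamped(const char* name, int def, int lo, int hi) {
    return sketch_parse_clamped(getenv(name), def, lo, hi);
}
```

Under these assumptions the cap override would read as `sketch_env_int_clamped("HAKMEM_TINY_CLASS5_TLS_CAP", 512, 64, 4096)`, and the refill and spill overrides would pass the resulting cap as their upper bound (with lower bounds 16 and 0 respectively).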