## Major Changes
### 1. Box 3: Pointer Conversion Module (NEW)
- File: core/box/ptr_conversion_box.h
- Purpose: Unified BASE ↔ USER pointer conversion (single source of truth)
- API: PTR_BASE_TO_USER(), PTR_USER_TO_BASE()
- Features: Zero-overhead inline, debug mode, NULL-safe, class 7 headerless support
- Design: Header-only, fully modular, no external dependencies
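The conversion rule above (classes 0-6 carry a 1-byte header, class 7 is headerless) can be checked with a small round-trip sketch. The `sketch_` functions below are hypothetical stand-ins written for this example so that it compiles on its own; they follow the rule as described in this commit but are not the functions from `ptr_conversion_box.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring the rule stated in the commit message:
 * classes 0-6 have a 1-byte header, class 7 is headerless. */
static inline void* sketch_base_to_user(void* base, uint8_t cls) {
    if (base == NULL) return NULL;                 /* NULL-safe */
    return (cls == 7) ? base : (void*)((uint8_t*)base + 1);
}

static inline void* sketch_user_to_base(void* user, uint8_t cls) {
    if (user == NULL) return NULL;                 /* NULL-safe */
    return (cls == 7) ? user : (void*)((uint8_t*)user - 1);
}

/* Byte distance between the USER and BASE addresses for a size class. */
static inline ptrdiff_t sketch_header_offset(uint8_t cls) {
    static uint8_t block[16];                      /* dummy storage block */
    uint8_t* user = (uint8_t*)sketch_base_to_user(block, cls);
    return user - block;
}

/* Returns 1 when BASE -> USER -> BASE lands back on the original address. */
static inline int sketch_roundtrip_ok(uint8_t cls) {
    static uint8_t block[16];                      /* dummy storage block */
    void* user = sketch_base_to_user(block, cls);
    return sketch_user_to_base(user, cls) == (void*)block;
}
```

At a real call site the same pairing would presumably go through `PTR_BASE_TO_USER()` on the allocation path and `PTR_USER_TO_BASE()` on the free path, always with the same class index on both sides; converting with a mismatched class is exactly the kind of off-by-one the module is meant to prevent.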
### 2. POOL_TLS_PHASE1 Default OFF (CRITICAL FIX)
- File: build.sh
- Change: POOL_TLS_PHASE1 now defaults to 0 (was hardcoded to 1)
- Impact: Eliminates pthread_mutex overhead on every free() (was causing 3.3x slowdown)
- Usage: Set POOL_TLS_PHASE1=1 env var to enable if needed
### 3. Pointer Conversion Fixes (PARTIAL)
- Files: core/box/front_gate_box.c, core/tiny_alloc_fast.inc.h, etc.
- Status: Partial implementation using Box 3 API
- Note: Work in progress, some conversions still need review
### 4. Performance Investigation Report (NEW)
- File: HOTPATH_PERFORMANCE_INVESTIGATION.md
- Findings:
  - Hotpath works (+24% vs baseline) after POOL_TLS fix
  - Still 9.2x slower than system malloc due to:
    * Heavy initialization (23.85% of cycles)
    * Syscall overhead (2,382 syscalls per 100K ops)
    * Workload mismatch (C7 1KB is 49.8%, but only C5 256B has hotpath)
    * 9.4x more instructions than system malloc
### 5. Known Issues
- SEGV at 20K-30K iterations (pre-existing bug, not related to pointer conversions)
  - Root cause: Likely active counter corruption or TLS-SLL chain issues
  - Status: Under investigation
## Performance Results (100K iterations, 256B)
- Baseline (Hotpath OFF): 7.22M ops/s
- Hotpath ON: 8.98M ops/s (+24% improvement ✓)
- System malloc: 82.2M ops/s (still 9.2x faster)
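The derived figures can be reproduced from the raw throughput numbers. The sketch below only redoes the arithmetic; the helper names are hypothetical and the inputs are the values quoted in this commit message (7.22M, 8.98M and 82.2M ops/s, and 2,382 syscalls per 100K ops).

```c
#include <assert.h>

/* Percentage gain of `candidate` over `baseline` (both in ops/s). */
static inline double sketch_gain_percent(double baseline, double candidate) {
    return (candidate / baseline - 1.0) * 100.0;
}

/* How many times faster `reference` is than `candidate`. */
static inline double sketch_speed_ratio(double reference, double candidate) {
    return reference / candidate;
}

/* Average number of operations executed per syscall. */
static inline double sketch_ops_per_syscall(double ops, double syscalls) {
    return ops / syscalls;
}
```

With the quoted inputs, `sketch_gain_percent(7.22, 8.98)` comes out at roughly 24.4, `sketch_speed_ratio(82.2, 8.98)` at roughly 9.15 (reported as 9.2x), and `sketch_ops_per_syscall(100000, 2382)` at about 42, i.e. one syscall for every 42 operations or so.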
## Next Steps
- P0: Fix 20K-30K SEGV bug (GDB investigation needed)
- P1: Lazy initialization (+20-25% expected)
- P1: C7 (1KB) hotpath (+30-40% expected, biggest win)
- P2: Reduce syscalls (+15-20% expected)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
90 lines
2.8 KiB
C
/**
 * @file ptr_conversion_box.h
 * @brief Box 3: Unified Pointer Conversion Layer
 *
 * MISSION: Fix BASE/USER pointer confusion across codebase
 *
 * DESIGN:
 * - BASE pointer: Points to start of block in storage (0-byte aligned)
 * - USER pointer: Points to usable memory (+1 byte for classes 0-6, +0 for class 7)
 * - Class 7 (2KB) is headerless (no +1 offset)
 * - Classes 0-6 have 1-byte header (need +1 offset)
 *
 * BOX BOUNDARIES:
 * - Box 1 (Front Gate) → Box 3 → Box 4 (User) [BASE to USER]
 * - Box 4 (User) → Box 3 → Box 1 (Front Gate) [USER to BASE]
 */

#ifndef HAKMEM_PTR_CONVERSION_BOX_H
#define HAKMEM_PTR_CONVERSION_BOX_H

#include <stdint.h>
#include <stddef.h>

#ifdef HAKMEM_PTR_CONVERSION_DEBUG
#include <stdio.h>
#define PTR_CONV_LOG(...) fprintf(stderr, "[PTR_CONV] " __VA_ARGS__)
#else
#define PTR_CONV_LOG(...) ((void)0)
#endif

/**
 * Convert BASE pointer (storage) to USER pointer (returned to caller)
 *
 * @param base_ptr Pointer to block in storage (no offset)
 * @param class_idx Size class (0-6: +1 offset, 7: +0 offset)
 * @return USER pointer (usable memory address)
 */
static inline void* ptr_base_to_user(void* base_ptr, uint8_t class_idx) {
    if (base_ptr == NULL) {
        return NULL;
    }

    /* Class 7 (2KB) is headerless - no offset */
    if (class_idx == 7) {
        PTR_CONV_LOG("BASE→USER cls=%u base=%p → user=%p (headerless)\n",
                     class_idx, base_ptr, base_ptr);
        return base_ptr;
    }

    /* Classes 0-6 have 1-byte header - skip it */
    void* user_ptr = (void*)((uint8_t*)base_ptr + 1);
    PTR_CONV_LOG("BASE→USER cls=%u base=%p → user=%p (+1 offset)\n",
                 class_idx, base_ptr, user_ptr);
    return user_ptr;
}

/**
 * Convert USER pointer (from caller) to BASE pointer (storage)
 *
 * @param user_ptr Pointer from user (may have +1 offset)
 * @param class_idx Size class (0-6: -1 offset, 7: -0 offset)
 * @return BASE pointer (block start in storage)
 */
static inline void* ptr_user_to_base(void* user_ptr, uint8_t class_idx) {
    if (user_ptr == NULL) {
        return NULL;
    }

    /* Class 7 (2KB) is headerless - no offset */
    if (class_idx == 7) {
        PTR_CONV_LOG("USER→BASE cls=%u user=%p → base=%p (headerless)\n",
                     class_idx, user_ptr, user_ptr);
        return user_ptr;
    }

    /* Classes 0-6 have 1-byte header - rewind it */
    void* base_ptr = (void*)((uint8_t*)user_ptr - 1);
    PTR_CONV_LOG("USER→BASE cls=%u user=%p → base=%p (-1 offset)\n",
                 class_idx, user_ptr, base_ptr);
    return base_ptr;
}

/**
 * Convenience macros for cleaner call sites
 */
#define PTR_BASE_TO_USER(base, cls) ptr_base_to_user((base), (cls))
#define PTR_USER_TO_BASE(user, cls) ptr_user_to_base((user), (cls))

#endif /* HAKMEM_PTR_CONVERSION_BOX_H */