Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of the Tiny Pool redesign, based on
a ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving the class_idx lookup out of band, into SuperSlab metadata.
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of the Header
- Falls back to a Header read if class_map returns an invalid value (see the sketch after this list)
- Fixed a Legacy Backend dynamic slab initialization bug
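
In sketch form, the gated free path now does roughly the following
(tiny_lookup_class_idx and tiny_header_read_class are illustrative names;
only class_map, the 255 UNASSIGNED sentinel, and the Header fallback are
from this change):

    static inline int tiny_lookup_class_idx(SuperSlab* ss, int slab_idx, void* ptr) {
        uint8_t cls = ss->class_map[slab_idx];  // P1.1: out-of-band metadata
        if (cls != 255) return (int)cls;        // fast path: no Header read
        return tiny_header_read_class(ptr);     // fallback: in-band Header
    }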
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00

// ptr_trace_box.h - Pointer Lifecycle Tracing System (Debug Only)
//
// Purpose:
// - Track the complete lifecycle of a pointer: allocation, free, TLS SLL operations, drain
// - Detect the root cause of double-free bugs (TLS SLL vs Freelist synchronization issues)
// - Zero overhead in release builds (compile-time gated)
//
// Features:
// - Tracks 8 event types: CARVE, ALLOC_FREELIST, ALLOC_TLS_POP, FREE_TLS_PUSH,
//   DRAIN_TO_FREELIST, SLAB_REUSE, REFILL, FREELIST_FREE
// - Environment variable control:
//   - HAKMEM_PTR_TRACE_ALL=1: trace all pointers (high overhead)
//   - HAKMEM_PTR_TRACE=0xADDR: trace a specific pointer only
//   - HAKMEM_PTR_TRACE_CLASS=N: trace a specific size class only
// - Configurable ring buffer (default: 4096 entries per thread)
// - Automatic dump at normal process exit (via atexit)
//
// Design:
// - Thread-local ring buffer (no locks, no contention)
// - Atomic operation counter for sequencing events across threads
// - Lazy initialization (on the first trace call per thread)
// - Header-only for inline performance
//
// Integration Points (each macro generates the operation number internally):
// - Linear carve:   PTR_TRACE_CARVE(ptr, class_idx, slab_idx)
// - Freelist alloc: PTR_TRACE_ALLOC_FREELIST(ptr, class_idx, fl_head)
// - TLS SLL pop:    PTR_TRACE_ALLOC_TLS_POP(ptr, class_idx, tls_count)
// - TLS SLL push:   PTR_TRACE_FREE_TLS_PUSH(ptr, class_idx, tls_count)
// - Drain:          PTR_TRACE_DRAIN_TO_FREELIST(ptr, class_idx, tls_count_before)
// - Slab reuse:     PTR_TRACE_SLAB_REUSE(slab_base, class_idx, slab_idx)
// - Refill:         PTR_TRACE_REFILL(class_idx, ss, slab_idx)
// - Freelist free:  PTR_TRACE_FREELIST_FREE(ptr, class_idx)
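//
// A dump line from ptr_trace_dump() looks like this (values illustrative):
//   [   7] op=000042 event=FREE_TLS_PUSH        cls=3 ptr=0x7f21c4001080 tls_count=17 from=tiny_free.c:123
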
#ifndef PTR_TRACE_BOX_H
#define PTR_TRACE_BOX_H

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdatomic.h>
#include <pthread.h>

#include "../hakmem_build_flags.h"
#include "../hakmem_tiny_config.h"
#include "../hakmem_trace_master.h" // Phase 4c: Master trace control
// Only enable in debug builds
#if !HAKMEM_BUILD_RELEASE

// ========== Configuration ==========

#ifndef PTR_TRACE_RING_SIZE
# define PTR_TRACE_RING_SIZE 4096
#endif

// Event types
typedef enum {
    PTR_EVENT_CARVE             = 1,  // Linear carve (new block from slab)
    PTR_EVENT_ALLOC_FREELIST    = 2,  // Allocated from freelist
    PTR_EVENT_ALLOC_TLS_POP     = 3,  // Allocated from TLS SLL (pop)
    PTR_EVENT_FREE_TLS_PUSH     = 4,  // Freed to TLS SLL (push)
    PTR_EVENT_DRAIN_TO_FREELIST = 5,  // Drained from TLS SLL to freelist
    PTR_EVENT_SLAB_REUSE        = 6,  // Slab reused (all pointers invalidated)
    PTR_EVENT_REFILL            = 7,  // Slab refill
    PTR_EVENT_FREELIST_FREE     = 8,  // Freed directly to freelist (slow path)
} ptr_trace_event_t;
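
// Reading a trace: a block normally cycles through events such as
//   CARVE -> FREE_TLS_PUSH -> ALLOC_TLS_POP -> ... -> DRAIN_TO_FREELIST -> ALLOC_FREELIST
// Two FREE_TLS_PUSH / FREELIST_FREE events on the same pointer with no ALLOC_*
// in between is the double-free signature this Box exists to catch.
// (Interpretation guide only; nothing here enforces the ordering.)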

// Event record
typedef struct {
    void*       ptr;        // Pointer address (BASE for allocations)
    uint64_t    op_num;     // Global operation number
    uint32_t    event;      // Event type (ptr_trace_event_t)
    uint8_t     class_idx;  // Class index
    uint8_t     _pad[3];    // Padding to an 8-byte boundary
    union {
        void*    freelist_head;  // Freelist head (ALLOC_FREELIST)
        uint32_t tls_count;      // TLS SLL count (TLS_PUSH/POP/DRAIN)
        int      slab_idx;       // Slab index (CARVE/REFILL/SLAB_REUSE)
    } aux;
    const char* file;       // Source file (__FILE__)
    int         line;       // Source line (__LINE__)
} ptr_trace_record_t;

// ========== TLS State ==========

static __thread ptr_trace_record_t g_ptr_trace_ring[PTR_TRACE_RING_SIZE];
static __thread uint32_t g_ptr_trace_ring_idx = 0;
static __thread int g_ptr_trace_initialized = 0;

// Trace modes (cached per thread)
static __thread int g_ptr_trace_mode = -1;          // -1=uninitialized, 0=off, 1=specific ptr, 2=specific class, 3=all
static __thread uintptr_t g_ptr_trace_target = 0;   // Target pointer address (mode 1)
static __thread int g_ptr_trace_target_class = -1;  // Target class (mode 2)

// ========== Global State ==========

// Global operation counter (atomic, shared across threads)
static _Atomic uint64_t g_ptr_trace_op_counter = 0;

// Dump registered flag (global, one-time setup)
static _Atomic int g_ptr_trace_dump_registered = 0;

// ========== Helpers ==========

static inline const char* ptr_event_name(ptr_trace_event_t ev) {
    switch (ev) {
        case PTR_EVENT_CARVE:             return "CARVE";
        case PTR_EVENT_ALLOC_FREELIST:    return "ALLOC_FREELIST";
        case PTR_EVENT_ALLOC_TLS_POP:     return "ALLOC_TLS_POP";
        case PTR_EVENT_FREE_TLS_PUSH:     return "FREE_TLS_PUSH";
        case PTR_EVENT_DRAIN_TO_FREELIST: return "DRAIN_TO_FREELIST";
        case PTR_EVENT_SLAB_REUSE:        return "SLAB_REUSE";
        case PTR_EVENT_REFILL:            return "REFILL";
        case PTR_EVENT_FREELIST_FREE:     return "FREELIST_FREE";
        default:                          return "UNKNOWN";
    }
}

// Initialize trace mode from environment variables
static inline void ptr_trace_init(void) {
    if (g_ptr_trace_initialized) return;
    g_ptr_trace_initialized = 1;

    // Check HAKMEM_PTR_TRACE_ALL
    const char* env_all = getenv("HAKMEM_PTR_TRACE_ALL");
    if (env_all && *env_all && *env_all != '0') {
        g_ptr_trace_mode = 3;  // Trace all
        fprintf(stderr, "[PTR_TRACE_INIT] Mode: ALL (high overhead)\n");
        return;
    }

    // Check HAKMEM_PTR_TRACE (specific pointer)
    const char* env_ptr = getenv("HAKMEM_PTR_TRACE");
    if (env_ptr && *env_ptr) {
        char* endp = NULL;
        uintptr_t addr = (uintptr_t)strtoull(env_ptr, &endp, 0);
        if (addr != 0) {
            g_ptr_trace_mode = 1;
            g_ptr_trace_target = addr;
            fprintf(stderr, "[PTR_TRACE_INIT] Mode: SPECIFIC_PTR target=%p\n", (void*)addr);
            return;
        }
    }

    // Check HAKMEM_PTR_TRACE_CLASS
    const char* env_cls = getenv("HAKMEM_PTR_TRACE_CLASS");
    if (env_cls && *env_cls) {
        int cls = atoi(env_cls);
        if (cls >= 0 && cls < TINY_NUM_CLASSES) {
            g_ptr_trace_mode = 2;
            g_ptr_trace_target_class = cls;
            fprintf(stderr, "[PTR_TRACE_INIT] Mode: SPECIFIC_CLASS class=%d\n", cls);
            return;
        }
    }

    // Default: OFF
    g_ptr_trace_mode = 0;
}
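
// Example invocations (illustrative; "./app" stands for any binary linked
// against a debug build of the allocator):
//   HAKMEM_PTR_TRACE_ALL=1          ./app   # trace every pointer
//   HAKMEM_PTR_TRACE=0x7f21c4001080 ./app   # follow one address
//   HAKMEM_PTR_TRACE_CLASS=3        ./app   # trace one size class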

// Check if we should trace this pointer/class
static inline int ptr_trace_should_log(void* ptr, int class_idx) {
    if (g_ptr_trace_mode == -1) {
        ptr_trace_init();
    }

    switch (g_ptr_trace_mode) {
        case 0: return 0;                                        // OFF
        case 1: return ((uintptr_t)ptr == g_ptr_trace_target);   // Specific pointer
        case 2: return (class_idx == g_ptr_trace_target_class);  // Specific class
        case 3: return 1;                                        // All
        default: return 0;
    }
}

// Dump the trace ring for the current thread
static inline void ptr_trace_dump(void) {
    fprintf(stderr, "\n========== PTR_TRACE_DUMP (thread=%lx) ==========\n",
            (unsigned long)pthread_self());
    fprintf(stderr, "Ring index: %u (size=%d)\n", g_ptr_trace_ring_idx, PTR_TRACE_RING_SIZE);

    uint32_t count = (g_ptr_trace_ring_idx < PTR_TRACE_RING_SIZE)
                         ? g_ptr_trace_ring_idx
                         : PTR_TRACE_RING_SIZE;
    uint32_t start_idx = (g_ptr_trace_ring_idx >= PTR_TRACE_RING_SIZE)
                             ? (g_ptr_trace_ring_idx % PTR_TRACE_RING_SIZE)
                             : 0;
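    // Worked example: with g_ptr_trace_ring_idx = 5000 and PTR_TRACE_RING_SIZE = 4096,
    // count = 4096 and start_idx = 5000 % 4096 = 904, so the loop below walks the
    // ring oldest-entry-first (904, 905, ..., 4095, 0, ..., 903).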
    fprintf(stderr, "Last %u events:\n", count);
    for (uint32_t i = 0; i < count; i++) {
        uint32_t idx = (start_idx + i) % PTR_TRACE_RING_SIZE;
        ptr_trace_record_t* r = &g_ptr_trace_ring[idx];

        fprintf(stderr, "[%4u] op=%06lu event=%-20s cls=%d ptr=%p",
                i, (unsigned long)r->op_num, ptr_event_name((ptr_trace_event_t)r->event),
                r->class_idx, r->ptr);

        // Print auxiliary info based on the event type
        switch (r->event) {
            case PTR_EVENT_ALLOC_FREELIST:
                fprintf(stderr, " fl_head=%p", r->aux.freelist_head);
                break;
            case PTR_EVENT_ALLOC_TLS_POP:
            case PTR_EVENT_FREE_TLS_PUSH:
            case PTR_EVENT_DRAIN_TO_FREELIST:
                fprintf(stderr, " tls_count=%u", r->aux.tls_count);
                break;
            case PTR_EVENT_CARVE:
            case PTR_EVENT_REFILL:
            case PTR_EVENT_SLAB_REUSE:
                fprintf(stderr, " slab_idx=%d", r->aux.slab_idx);
                break;
            default:
                break;
        }

        fprintf(stderr, " from=%s:%d\n", r->file ? r->file : "(null)", r->line);
    }
    fprintf(stderr, "========== END PTR_TRACE_DUMP ==========\n\n");
    fflush(stderr);
}

// Dump the current thread's trace ring at exit
// (atexit handlers run in the thread that calls exit(), so only that thread's
// ring is dumped; abort() bypasses atexit entirely)
static void ptr_trace_dump_atexit(void) {
    fprintf(stderr, "\n[PTR_TRACE] Automatic dump at exit\n");
    ptr_trace_dump();
}

// Register the atexit handler (once per process)
static inline void ptr_trace_register_dump(void) {
    int expected = 0;
    if (atomic_compare_exchange_strong(&g_ptr_trace_dump_registered, &expected, 1)) {
        atexit(ptr_trace_dump_atexit);
    }
}

// Record a trace event
static inline void ptr_trace_record_impl(
    ptr_trace_event_t event,
    void* ptr,
    int class_idx,
    uint64_t op_num,
    void* aux_ptr,
    uint32_t aux_u32,
    int aux_int,
    const char* file,
    int line)
{
    if (!ptr_trace_should_log(ptr, class_idx)) {
        return;
    }

    // Register the dump handler on the first trace
    ptr_trace_register_dump();

    uint32_t idx = g_ptr_trace_ring_idx % PTR_TRACE_RING_SIZE;
    ptr_trace_record_t* r = &g_ptr_trace_ring[idx];

    r->ptr = ptr;
    r->op_num = op_num;
    r->event = event;
    r->class_idx = (uint8_t)class_idx;

    // Fill auxiliary data based on the event type
    switch (event) {
        case PTR_EVENT_ALLOC_FREELIST:
            r->aux.freelist_head = aux_ptr;
            break;
        case PTR_EVENT_ALLOC_TLS_POP:
        case PTR_EVENT_FREE_TLS_PUSH:
        case PTR_EVENT_DRAIN_TO_FREELIST:
            r->aux.tls_count = aux_u32;
            break;
        case PTR_EVENT_CARVE:
        case PTR_EVENT_REFILL:
        case PTR_EVENT_SLAB_REUSE:
            r->aux.slab_idx = aux_int;
            break;
        default:
            r->aux.tls_count = 0;
            break;
    }

    r->file = file;
    r->line = line;

    g_ptr_trace_ring_idx++;

    // Optional: print each event in real time (very verbose)
    static __thread int s_verbose = -1;
    if (s_verbose == -1) {
        s_verbose = hak_trace_check("HAKMEM_PTR_TRACE_VERBOSE", "ptr");  // Phase 4c: master trace control
    }
    if (s_verbose) {
        fprintf(stderr, "[PTR_TRACE] op=%06lu event=%-20s cls=%d ptr=%p from=%s:%d\n",
                (unsigned long)op_num, ptr_event_name(event), class_idx, ptr,
                file ? file : "?", line);
    }
}

// ========== Public API (Macros) ==========

#define PTR_TRACE_CARVE(ptr, class_idx, slab_idx) do { \
    uint64_t _op = atomic_fetch_add_explicit(&g_ptr_trace_op_counter, 1, memory_order_relaxed); \
    ptr_trace_record_impl(PTR_EVENT_CARVE, (ptr), (class_idx), _op, \
                          NULL, 0, (slab_idx), __FILE__, __LINE__); \
} while (0)

#define PTR_TRACE_ALLOC_FREELIST(ptr, class_idx, fl_head) do { \
    uint64_t _op = atomic_fetch_add_explicit(&g_ptr_trace_op_counter, 1, memory_order_relaxed); \
    ptr_trace_record_impl(PTR_EVENT_ALLOC_FREELIST, (ptr), (class_idx), _op, \
                          (fl_head), 0, 0, __FILE__, __LINE__); \
} while (0)

#define PTR_TRACE_ALLOC_TLS_POP(ptr, class_idx, tls_count) do { \
    uint64_t _op = atomic_fetch_add_explicit(&g_ptr_trace_op_counter, 1, memory_order_relaxed); \
    ptr_trace_record_impl(PTR_EVENT_ALLOC_TLS_POP, (ptr), (class_idx), _op, \
                          NULL, (tls_count), 0, __FILE__, __LINE__); \
} while (0)

#define PTR_TRACE_FREE_TLS_PUSH(ptr, class_idx, tls_count) do { \
    uint64_t _op = atomic_fetch_add_explicit(&g_ptr_trace_op_counter, 1, memory_order_relaxed); \
    ptr_trace_record_impl(PTR_EVENT_FREE_TLS_PUSH, (ptr), (class_idx), _op, \
                          NULL, (tls_count), 0, __FILE__, __LINE__); \
} while (0)

#define PTR_TRACE_DRAIN_TO_FREELIST(ptr, class_idx, tls_count_before) do { \
    uint64_t _op = atomic_fetch_add_explicit(&g_ptr_trace_op_counter, 1, memory_order_relaxed); \
    ptr_trace_record_impl(PTR_EVENT_DRAIN_TO_FREELIST, (ptr), (class_idx), _op, \
                          NULL, (tls_count_before), 0, __FILE__, __LINE__); \
} while (0)

#define PTR_TRACE_SLAB_REUSE(slab_base, class_idx, slab_idx) do { \
    uint64_t _op = atomic_fetch_add_explicit(&g_ptr_trace_op_counter, 1, memory_order_relaxed); \
    ptr_trace_record_impl(PTR_EVENT_SLAB_REUSE, (slab_base), (class_idx), _op, \
                          NULL, 0, (slab_idx), __FILE__, __LINE__); \
} while (0)

#define PTR_TRACE_REFILL(class_idx, ss, slab_idx) do { \
    uint64_t _op = atomic_fetch_add_explicit(&g_ptr_trace_op_counter, 1, memory_order_relaxed); \
    ptr_trace_record_impl(PTR_EVENT_REFILL, (void*)(ss), (class_idx), _op, \
                          NULL, 0, (slab_idx), __FILE__, __LINE__); \
} while (0)

#define PTR_TRACE_FREELIST_FREE(ptr, class_idx) do { \
    uint64_t _op = atomic_fetch_add_explicit(&g_ptr_trace_op_counter, 1, memory_order_relaxed); \
    ptr_trace_record_impl(PTR_EVENT_FREELIST_FREE, (ptr), (class_idx), _op, \
                          NULL, 0, 0, __FILE__, __LINE__); \
} while (0)

// Manual dump (for debugging)
#define PTR_TRACE_DUMP() ptr_trace_dump()
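
// Usage sketch (illustrative only; tiny_free_push_example() and its
// parameters are hypothetical names, not actual HAKMEM call sites):
#if 0
static inline void tiny_free_push_example(void* ptr, int class_idx, uint32_t tls_count_after) {
    // ... push ptr onto the TLS singly-linked list here ...
    PTR_TRACE_FREE_TLS_PUSH(ptr, class_idx, tls_count_after);  // record the push
    if (tls_count_after == 0) {
        PTR_TRACE_DUMP();  // manual dump when the count looks impossible
    }
}
#endif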

#else  // HAKMEM_BUILD_RELEASE (release build: no-op macros)

// Zero-overhead stubs for release builds
#define PTR_TRACE_CARVE(ptr, class_idx, slab_idx)                     ((void)0)
#define PTR_TRACE_ALLOC_FREELIST(ptr, class_idx, fl_head)             ((void)0)
#define PTR_TRACE_ALLOC_TLS_POP(ptr, class_idx, tls_count)            ((void)0)
#define PTR_TRACE_FREE_TLS_PUSH(ptr, class_idx, tls_count)            ((void)0)
#define PTR_TRACE_DRAIN_TO_FREELIST(ptr, class_idx, tls_count_before) ((void)0)
#define PTR_TRACE_SLAB_REUSE(slab_base, class_idx, slab_idx)          ((void)0)
#define PTR_TRACE_REFILL(class_idx, ss, slab_idx)                     ((void)0)
#define PTR_TRACE_FREELIST_FREE(ptr, class_idx)                       ((void)0)
#define PTR_TRACE_DUMP()                                              ((void)0)

#endif // !HAKMEM_BUILD_RELEASE

#endif // PTR_TRACE_BOX_H