// tls_sll_box.h - Box TLS-SLL: Single-Linked List API (Unified Box version)
//
// Goal:
// - Single authoritative Box for TLS SLL operations.
// - All next pointer layout is decided by tiny_next_ptr_box.h (Box API).
// - Callers pass BASE pointers only; no local next_offset arithmetic.
// - Compatible with existing ptr_trace PTR_NEXT_* macros (off is logging-only).
//
// Invariants:
// - g_tiny_class_sizes[cls] is TOTAL stride (including 1-byte header when enabled).
// - For HEADER_CLASSIDX != 0, tiny_nextptr.h encodes:
//     class 0:   next_off = 0
//     class 1-7: next_off = 1
//   Callers MUST NOT duplicate this logic.
// - TLS SLL stores BASE pointers only.
// - Box provides: push / pop / splice with capacity & integrity checks.

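The next-pointer offset rule stated in the invariants above can be sketched as a small helper. This is an illustration only: `sketch_next_off` and `SKETCH_NUM_CLASSES` are hypothetical names that do not exist in the allocator, the class count of 8 is inferred from the "class 1-7" wording, and real callers are expected to go through the Box API rather than re-implement the rule.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical constant for illustration; the real code uses TINY_NUM_CLASSES.
enum { SKETCH_NUM_CLASSES = 8 };

// Mirrors the documented rule for header-enabled builds:
// class 0 keeps the next pointer at offset 0, classes 1-7 at offset 1
// (just past the 1-byte header).
static size_t sketch_next_off(int class_idx)
{
    assert(class_idx >= 0 && class_idx < SKETCH_NUM_CLASSES);
    return (class_idx == 0) ? 0u : 1u;
}
```

Keeping this decision in one place is what lets callers pass BASE pointers without doing any offset arithmetic of their own.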
#ifndef TLS_SLL_BOX_H
#define TLS_SLL_BOX_H

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>      // memcpy (used by the tls_sll_check_node diagnostics)
#include <stdatomic.h>
#include <execinfo.h>    // backtrace, backtrace_symbols_fd (glibc)

#include "../hakmem_internal.h"      // Phase 10: Type Safety (hak_base_ptr_t)
#include "../hakmem_tiny_config.h"
#include "../hakmem_build_flags.h"
#include "../hakmem_debug_master.h"  // For unified debug level control
#include "../tiny_remote.h"
#include "../tiny_region_id.h"
#include "../hakmem_tiny_integrity.h"
#include "../ptr_track.h"
#include "../ptr_trace.h"
Front-Direct implementation: SS→FC direct refill + SLL complete bypass
## Summary
Implemented Front-Direct architecture with complete SLL bypass:
- Direct SuperSlab → FastCache refill (1-hop, bypasses SLL)
- SLL-free allocation/free paths when Front-Direct enabled
- Legacy path sealing (SLL inline opt-in, SFC cascade ENV-only)
## New Modules
- core/refill/ss_refill_fc.h (236 lines): Standard SS→FC refill entry point
- Remote drain → Freelist → Carve priority
- Header restoration for C1-C6 (NOT C0/C7)
- ENV: HAKMEM_TINY_P0_DRAIN_THRESH, HAKMEM_TINY_P0_NO_DRAIN
- core/front/fast_cache.h: FastCache (L1) type definition
- core/front/quick_slot.h: QuickSlot (L0) type definition
## Allocation Path (core/tiny_alloc_fast.inc.h)
- Added s_front_direct_alloc TLS flag (lazy ENV check)
- SLL pop guarded by: g_tls_sll_enable && !s_front_direct_alloc
- Refill dispatch:
- Front-Direct: ss_refill_fc_fill() → fastcache_pop() (1-hop)
- Legacy: sll_refill_batch_from_ss() → SLL → FC (2-hop, A/B only)
- SLL inline pop sealed (requires HAKMEM_TINY_INLINE_SLL=1 opt-in)
## Free Path (core/hakmem_tiny_free.inc, core/hakmem_tiny_fastcache.inc.h)
- FC priority: Try fastcache_push() first (same-thread free)
- tiny_fast_push() bypass: Returns 0 when s_front_direct_free || !g_tls_sll_enable
- Fallback: Magazine/slow path (safe, bypasses SLL)
## Legacy Sealing
- SFC cascade: Default OFF (ENV-only via HAKMEM_TINY_SFC_CASCADE=1)
- Deleted: core/hakmem_tiny_free.inc.bak, core/pool_refill_legacy.c.bak
- Documentation: ss_refill_fc_fill() promoted as CANONICAL refill entry
## ENV Controls
- HAKMEM_TINY_FRONT_DIRECT=1: Enable Front-Direct (SS→FC direct)
- HAKMEM_TINY_P0_DIRECT_FC_ALL=1: Same as above (alt name)
- HAKMEM_TINY_REFILL_BATCH=1: Enable batch refill (also enables Front-Direct)
- HAKMEM_TINY_SFC_CASCADE=1: Enable SFC cascade (default OFF)
- HAKMEM_TINY_INLINE_SLL=1: Enable inline SLL pop (default OFF, requires AGGRESSIVE_INLINE)
## Benchmarks (Front-Direct Enabled)
```bash
ENV: HAKMEM_BENCH_FAST_FRONT=1 HAKMEM_TINY_FRONT_DIRECT=1
HAKMEM_TINY_REFILL_BATCH=1 HAKMEM_TINY_P0_DIRECT_FC_ALL=1
HAKMEM_TINY_REFILL_COUNT_HOT=256 HAKMEM_TINY_REFILL_COUNT_MID=96
HAKMEM_TINY_BUMP_CHUNK=256
bench_random_mixed (16-1040B random, 200K iter):
256 slots: 1.44M ops/s (STABLE, 0 SEGV)
128 slots: 1.44M ops/s (STABLE, 0 SEGV)
bench_fixed_size (fixed size, 200K iter):
256B: 4.06M ops/s (has debug logs, expected >10M without logs)
128B: Similar (debug logs affect)
```
## Verification
- TRACE_RING test (10K iter): **0 SLL events** detected ✅
- Complete SLL bypass confirmed when Front-Direct=1
- Stable execution: 200K iterations × multiple sizes, 0 SEGV
## Next Steps
- Disable debug logs in hak_alloc_api.inc.h (call_num 14250-14280 range)
- Re-benchmark with clean Release build (target: 10-15M ops/s)
- 128/256B shortcut path optimization (FC hit rate improvement)
Co-Authored-By: ChatGPT <chatgpt@openai.com>
Suggested-By: ultrathink
2025-11-14 05:41:49 +09:00
#include "../tiny_debug_ring.h"
#include "../hakmem_super_registry.h"
#include "ss_addr_map_box.h"
#include "../superslab/superslab_inline.h"
Add tiny_ptr_bridge_box for centralized pointer classification
Consolidates the logic for resolving Tiny BASE pointers into
(SuperSlab*, slab_idx, TinySlabMeta*, class_idx) tuples.
Box Theory compliance:
- Single Responsibility: ptr→(ss,slab,meta,class) resolution only
- No side effects: pure classification, no logging, no mutations
- Clear API: 4 functions (classify_raw/base, validate_raw/base_class)
- Fail-fast friendly: callers decide error handling policy
Implementation:
- core/box/tiny_ptr_bridge_box.h: New box (4.7 KB)
- core/box/tls_sll_box.h: Integrated into sanitize_head/check_node
Architecture:
- Used in 3 call sites within TLS SLL Box
- Ready for gradual migration to other code paths
- Foundation for future centralized validation
Testing: 150+ seconds stable (sh8bench)
- 30s test: exit code 0, 0 crashes
- 120s test: exit code 0, 0 crashes
- Behavior: identical to previous hand-rolled implementation
Benefits:
- Single point of authority for ptr→(ss,slab,meta,class) logic
- Easier to add validation rules in future (range check, magic, etc.)
- Consistent API for all ptr classification needs
- Foundation for removing code duplication across allocator
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 05:54:54 +09:00
#include "tiny_ptr_bridge_box.h"  // Box: ptr→(ss,slab,meta,class) bridge
#include "tiny_next_ptr_box.h"
#include "tiny_header_box.h"      // Header Box: Single Source of Truth for header operations

// Per-thread debug shadow: last successful push base per class (release-safe)
// Changed to extern to share across TUs (defined in hakmem_tiny.c)
extern __thread hak_base_ptr_t s_tls_sll_last_push[TINY_NUM_CLASSES];

// Per-thread callsite tracking: last push caller per class (debug-only)
#if !HAKMEM_BUILD_RELEASE
static __thread const char* s_tls_sll_last_push_from[TINY_NUM_CLASSES] = {NULL};
static __thread const char* s_tls_sll_last_pop_from[TINY_NUM_CLASSES] = {NULL};
#endif

// Phase 3d-B: Unified TLS SLL (defined in hakmem_tiny.c)
extern __thread TinyTLSSLL g_tls_sll[TINY_NUM_CLASSES];
extern __thread uint64_t g_tls_canary_before_sll;
extern __thread uint64_t g_tls_canary_after_sll;
extern __thread const char* g_tls_sll_last_writer[TINY_NUM_CLASSES];
extern int g_tls_sll_class_mask; // bit i=1 → SLL allowed for class i
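The comment on `g_tls_sll_class_mask` describes a per-class bitmask. A minimal sketch of how such a mask could be tested follows; `sketch_sll_class_allowed` is a hypothetical helper, not a function from this header, and the 32-bit limit is an assumption made so the shift stays well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: true when bit class_idx is set in mask.
// The mask is cast to uint32_t so the shift never operates on a negative value.
static bool sketch_sll_class_allowed(int mask, int class_idx)
{
    if (class_idx < 0 || class_idx >= 32) return false;
    return (((uint32_t)mask >> class_idx) & 1u) != 0;
}
```

With a mask of `0x05`, for example, classes 0 and 2 would be allowed and every other class rejected.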

#if !HAKMEM_BUILD_RELEASE
// Global callsite record (debug only; zero overhead in release)
static const char* g_tls_sll_push_file[TINY_NUM_CLASSES] = {0};
static int g_tls_sll_push_line[TINY_NUM_CLASSES] = {0};
#endif

// ========== Debug guard ==========

#if !HAKMEM_BUILD_RELEASE
static inline void tls_sll_debug_guard(int class_idx, hak_base_ptr_t base, const char* where)
{
    (void)class_idx;
    void* raw = HAK_BASE_TO_RAW(base);
    if ((uintptr_t)raw < 4096) {
        fprintf(stderr,
                "[TLS_SLL_GUARD] %s: suspicious ptr=%p cls=%d\n",
                where, raw, class_idx);
        abort();
    }
}
#else
static inline void tls_sll_debug_guard(int class_idx, hak_base_ptr_t base, const char* where)
{
    (void)class_idx; (void)base; (void)where;
}
#endif

// Normalize helper: callers are required to pass BASE already.
// With HAKMEM_TINY_HEADER_CLASSIDX enabled, a USER pointer (BASE+1) is detected
// by its offset within the class stride and rewound to BASE; the first few
// occurrences are logged with a backtrace. Without headers this is a no-op.
static inline hak_base_ptr_t tls_sll_normalize_base(int class_idx, hak_base_ptr_t node)
{
#if HAKMEM_TINY_HEADER_CLASSIDX
    if (!hak_base_is_null(node) && class_idx >= 0 && class_idx < TINY_NUM_CLASSES) {
        extern const size_t g_tiny_class_sizes[];
        size_t stride = g_tiny_class_sizes[class_idx];
        void* raw = HAK_BASE_TO_RAW(node);
        if (__builtin_expect(stride != 0, 1)) {
            uintptr_t delta = (uintptr_t)raw % stride;
            if (__builtin_expect(delta == 1, 0)) {
                // USER pointer passed in; normalize to BASE (= user-1) to avoid offset-1 writes.
                void* base = (uint8_t*)raw - 1;
                static _Atomic uint32_t g_tls_sll_norm_userptr = 0;
                uint32_t n = atomic_fetch_add_explicit(&g_tls_sll_norm_userptr, 1, memory_order_relaxed);
                if (n < 8) {
                    fprintf(stderr,
                            "[TLS_SLL_NORMALIZE_USERPTR] cls=%d node=%p -> base=%p stride=%zu\n",
                            class_idx, raw, base, stride);
                    void* bt[16];
                    int frames = backtrace(bt, 16);
                    backtrace_symbols_fd(bt, frames, fileno(stderr));
                }
                return HAK_BASE_FROM_RAW(base);
            }
        }
    }
#else
    (void)class_idx;
#endif
    return node;
}

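The USER-to-BASE rewind performed by `tls_sll_normalize_base` can be sketched on plain integers. `sketch_normalize_base` is a hypothetical stand-in that leaves out the logging and the typed pointer wrappers; like the original check, it assumes blocks start at addresses that are multiples of the stride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for the normalization rule: an address exactly one
// byte past a stride boundary is treated as a USER pointer and rewound to BASE.
static uintptr_t sketch_normalize_base(uintptr_t addr, size_t stride)
{
    if (addr == 0 || stride == 0) return addr; // nothing to normalize
    if (addr % stride == 1) return addr - 1;   // USER (BASE+1) -> BASE
    return addr;                               // already BASE, or some other offset
}
```

Any offset other than 1 is passed through unchanged, so the helper only corrects the single mistake it is designed for and leaves other bad pointers to the later integrity checks.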
// Narrow dump around TLS SLL array when corruption is detected (env-gated)
static inline void tls_sll_dump_tls_window(int class_idx, const char* stage)
{
    static _Atomic uint32_t g_tls_sll_diag_shots = 0;
    static int s_diag_enable = -1;
    if (__builtin_expect(s_diag_enable == -1, 0)) {
        const char* e = getenv("HAKMEM_TINY_SLL_DIAG");
        s_diag_enable = (e && *e && *e != '0') ? 1 : 0;
    }
    if (!__builtin_expect(s_diag_enable, 0)) return;

    uint32_t shot = atomic_fetch_add_explicit(&g_tls_sll_diag_shots, 1, memory_order_relaxed);
    if (shot >= 2) return; // limit noise

    if (shot == 0) {
        // Map TLS layout once to confirm index→address mapping during triage
        fprintf(stderr,
                "[TLS_SLL_ADDRMAP] before=%p sll=%p after=%p entry_size=%zu\n",
                (void*)&g_tls_canary_before_sll,
                (void*)g_tls_sll,
                (void*)&g_tls_canary_after_sll,
                sizeof(TinyTLSSLL));
        for (int c = 0; c < TINY_NUM_CLASSES; c++) {
            fprintf(stderr, " C%d: head@%p count@%p\n",
                    c,
                    (void*)&g_tls_sll[c].head,
                    (void*)&g_tls_sll[c].count);
        }
    }

    fprintf(stderr,
            "[TLS_SLL_INVALID_POP_DIAG] shot=%u stage=%s cls=%d head=%p count=%u last_push=%p last_writer=%s\n",
            shot + 1,
            stage ? stage : "(null)",
            class_idx,
            HAK_BASE_TO_RAW(g_tls_sll[class_idx].head),
            g_tls_sll[class_idx].count,
            HAK_BASE_TO_RAW(s_tls_sll_last_push[class_idx]),
            g_tls_sll_last_writer[class_idx] ? g_tls_sll_last_writer[class_idx] : "(null)");
    fprintf(stderr, " tls_sll snapshot (head/count):");
    for (int c = 0; c < TINY_NUM_CLASSES; c++) {
        fprintf(stderr, " C%d:%p/%u", c, HAK_BASE_TO_RAW(g_tls_sll[c].head), g_tls_sll[c].count);
    }
    fprintf(stderr, " canary_before=%#llx canary_after=%#llx\n",
            (unsigned long long)g_tls_canary_before_sll,
            (unsigned long long)g_tls_canary_after_sll);
}

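The dump routine above combines two small patterns: an environment flag parsed with a simple truthiness rule, and an atomic counter that caps how many reports are emitted. A minimal sketch of both, with hypothetical helper names (`sketch_env_flag`, `sketch_take_shot`) that are not part of this header:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

// Same truthiness rule as the getenv checks in this file:
// unset, empty, or a value starting with '0' means "off".
static int sketch_env_flag(const char* value)
{
    return (value && *value && *value != '0') ? 1 : 0;
}

// Returns 1 for the first `limit` calls on a given counter, 0 afterwards.
// Relaxed ordering is enough because only the count itself matters.
static int sketch_take_shot(_Atomic uint32_t* counter, uint32_t limit)
{
    assert(counter != NULL);
    uint32_t shot = atomic_fetch_add_explicit(counter, 1, memory_order_relaxed);
    return shot < limit;
}
```

One property worth knowing about this pattern: the counter keeps incrementing after the limit is reached, so it would wrap and allow reports again after 2^32 calls. For short triage runs that is unlikely to matter.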

static inline void tls_sll_record_writer(int class_idx, const char* who)
{
    if (__builtin_expect(class_idx >= 0 && class_idx < TINY_NUM_CLASSES, 1)) {
        g_tls_sll_last_writer[class_idx] = who;
    }
}

static inline int tls_sll_head_valid(hak_base_ptr_t head)
{
    uintptr_t a = (uintptr_t)HAK_BASE_TO_RAW(head);
    return (a >= 4096 && a <= 0x00007fffffffffffULL);
}

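The range test in `tls_sll_head_valid` can be exercised in isolation. The sketch below uses a hypothetical name (`sketch_addr_plausible`) and takes a plain 64-bit integer; the upper bound matches the constant in the original, which corresponds to the 47-bit user-space limit common on x86-64 Linux (that interpretation is an assumption, the source does not state it).

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical standalone version of the plausibility test: reject the first
// 4 KiB (NULL and near-NULL values) and anything above 0x00007fffffffffff.
static int sketch_addr_plausible(uint64_t a)
{
    return (a >= 4096 && a <= 0x00007fffffffffffULL);
}
```

A check like this only filters out obviously wrong values; an address that passes may still point outside any SuperSlab, which is why the later checks also consult the slab metadata.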
// Defensive: validate current TLS head before using it.
// If invalid, drop the list to avoid propagating corruption.
static inline void tls_sll_sanitize_head(int class_idx, const char* stage)
{
    if (__builtin_expect(class_idx < 0 || class_idx >= TINY_NUM_CLASSES, 0)) {
        return;
    }
    hak_base_ptr_t head = g_tls_sll[class_idx].head;
    if (hak_base_is_null(head)) return;

    void* raw = HAK_BASE_TO_RAW(head);
    TinyPtrBridgeInfo info = tiny_ptr_bridge_classify_raw(raw);
    SuperSlab* ss = info.ss;
    int idx = info.slab_idx;
    uint8_t meta_cls = info.meta_cls;

    int reset = 0;
    if (!ss || !info.meta || idx < 0 || meta_cls != (uint8_t)class_idx) {
        reset = 1;
    }
#if HAKMEM_TINY_HEADER_CLASSIDX
    if (!reset) {
        uint8_t hdr = *(uint8_t*)raw;
        uint8_t expect = HEADER_MAGIC | (class_idx & HEADER_CLASS_MASK);
        if (hdr != expect) {
            reset = 1;
        }
    }
#endif
    if (reset) {
        fprintf(stderr,
                "[TLS_SLL_SANITIZE] stage=%s cls=%d head=%p meta_cls=%u idx=%d ss=%p\n",
                stage ? stage : "(null)",
                class_idx,
                raw,
                (unsigned)meta_cls,
                idx,
                (void*)ss);
        g_tls_sll[class_idx].head = HAK_BASE_FROM_RAW(NULL);
        g_tls_sll[class_idx].count = 0;
        tls_sll_record_writer(class_idx, "sanitize");
    }
}

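The header comparison in `tls_sll_sanitize_head` builds the expected byte as a magic value OR-ed with the masked class index. A minimal sketch of that comparison follows. The constants `SKETCH_HEADER_MAGIC` and `SKETCH_HEADER_CLASS_MASK` are made-up values for illustration; the real `HEADER_MAGIC` and `HEADER_CLASS_MASK` are defined elsewhere in the allocator and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical values for illustration only; not the allocator's real constants.
enum { SKETCH_HEADER_MAGIC = 0xA0, SKETCH_HEADER_CLASS_MASK = 0x0F };

// Expected header byte for a class: magic bits combined with the class index.
static uint8_t sketch_expected_header(int class_idx)
{
    return (uint8_t)(SKETCH_HEADER_MAGIC | (class_idx & SKETCH_HEADER_CLASS_MASK));
}

// The header lives in the first byte at BASE, so the check is one load and one compare.
static int sketch_header_matches(const uint8_t* base, int class_idx)
{
    assert(base != NULL);
    return *base == sketch_expected_header(class_idx);
}
```

Because the class index is part of the expected byte, a block that was freed into the wrong class list fails this check even when its magic bits are intact.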
static inline int tls_sll_check_node(int class_idx, void* raw, void* from_base, const char* stage)
{
    if (!raw) return 1;
    TinyPtrBridgeInfo info = tiny_ptr_bridge_classify_raw(raw);
    SuperSlab* ss = info.ss;
    int idx = info.slab_idx;
    uint8_t meta_cls = info.meta_cls;
    if (!ss || !info.meta || idx < 0 || meta_cls != (uint8_t)class_idx) {
        goto bad;
    }
#if HAKMEM_TINY_HEADER_CLASSIDX
    {
        uint8_t hdr = *(uint8_t*)raw;
        uint8_t expect = HEADER_MAGIC | (class_idx & HEADER_CLASS_MASK);
        if (hdr != expect) {
            goto bad;
        }
    }
#endif
    return 1;
bad:
    ; // empty statement: before C23 a label cannot be followed directly by a declaration
    static _Atomic uint32_t g_head_set_diag = 0;
    uint32_t shot = atomic_fetch_add_explicit(&g_head_set_diag, 1, memory_order_relaxed);
    if (shot < 8) {
        uint8_t from_meta_cls = 0xff;
        int from_idx = -1;
        SuperSlab* from_ss = NULL;
        TinySlabMeta* from_meta = NULL;
        uint64_t from_meta_used = 0;
        void* from_meta_freelist = NULL;
        if (from_base) {
            TinyPtrBridgeInfo from_info = tiny_ptr_bridge_classify_raw(from_base);
            from_ss = from_info.ss;
            from_idx = from_info.slab_idx;
            from_meta = from_info.meta;
            from_meta_cls = from_info.meta_cls;
            if (from_meta) {
                from_meta_used = from_meta->used;
                from_meta_freelist = from_meta->freelist;
            }
        }
        // Dump raw next pointers stored in from_base for extra forensics
        uintptr_t from_next_off0 = 0;
        uintptr_t from_next_off1 = 0;
        size_t next_off_dbg = tiny_next_off(class_idx);
        if (from_base) {
            memcpy(&from_next_off0, from_base, sizeof(from_next_off0));
            memcpy(&from_next_off1, (uint8_t*)from_base + next_off_dbg, sizeof(from_next_off1));
        }

        fprintf(stderr,
                "[TLS_SLL_SET_INVALID] stage=%s cls=%d head=%p meta_cls=%u idx=%d ss=%p "
                "from_base=%p from_meta_cls=%u from_idx=%d from_ss=%p "
                "from_meta_used=%llu from_meta_freelist=%p next_off=%zu next_raw0=%p next_raw1=%p "
                "canary_before=%#llx canary_after=%#llx last_writer=%s last_push=%p\n",
                stage ? stage : "(null)",
                class_idx,
                raw,
                (unsigned)meta_cls,
                idx,
                (void*)ss, // %p expects void*
                from_base,
                (unsigned)from_meta_cls,
                from_idx,
                (void*)from_ss,
                (unsigned long long)from_meta_used,
                from_meta_freelist,
                next_off_dbg,
                (void*)from_next_off0,
                (void*)from_next_off1,
                (unsigned long long)g_tls_canary_before_sll,
                (unsigned long long)g_tls_canary_after_sll,
                g_tls_sll_last_writer[class_idx] ? g_tls_sll_last_writer[class_idx] : "(null)",
                HAK_BASE_TO_RAW(s_tls_sll_last_push[class_idx]));
        void* bt[16];
        int frames = backtrace(bt, 16);
        backtrace_symbols_fd(bt, frames, fileno(stderr));
        fflush(stderr);
    }
    return 0;
}

// Forward decl for head trace (definition below)
static inline void tls_sll_head_trace(int class_idx,
                                      void* old_head,
                                      void* new_head,
                                      void* from_base,
                                      const char* stage);

static inline void tls_sll_set_head(int class_idx, hak_base_ptr_t head, const char* stage)
{
    void* raw = HAK_BASE_TO_RAW(head);
    void* old_raw = HAK_BASE_TO_RAW(g_tls_sll[class_idx].head);
    tls_sll_head_trace(class_idx, old_raw, raw, NULL, stage);
    if (!tls_sll_check_node(class_idx, raw, NULL, stage)) {
        abort();
    }
    g_tls_sll[class_idx].head = head;
    tls_sll_record_writer(class_idx, stage ? stage : "set_head");
}

static inline void tls_sll_set_head_from(int class_idx, hak_base_ptr_t head, void* from_base, const char* stage)
{
    void* raw = HAK_BASE_TO_RAW(head);
    void* old_raw = HAK_BASE_TO_RAW(g_tls_sll[class_idx].head);
    tls_sll_head_trace(class_idx, old_raw, raw, from_base, stage);
    if (!tls_sll_check_node(class_idx, raw, from_base, stage)) {
        abort();
    }
    g_tls_sll[class_idx].head = head;
    tls_sll_record_writer(class_idx, stage ? stage : "set_head");
}

static inline void tls_sll_set_head_raw(int class_idx, void* raw_head, const char* stage)
{
    tls_sll_set_head(class_idx, HAK_BASE_FROM_RAW(raw_head), stage);
}

static inline void tls_sll_log_hdr_mismatch(int class_idx, hak_base_ptr_t base, uint8_t got, uint8_t expect, const char* stage)
{
    static _Atomic uint32_t g_hdr_mismatch_log = 0;
    uint32_t n = atomic_fetch_add_explicit(&g_hdr_mismatch_log, 1, memory_order_relaxed);
    if (n < 16) {
        fprintf(stderr,
                "[TLS_SLL_HDR_MISMATCH] stage=%s cls=%d base=%p got=0x%02x expect=0x%02x\n",
                stage ? stage : "(null)",
                class_idx,
                HAK_BASE_TO_RAW(base),
                got,
                expect);
    }
}

static inline void tls_sll_diag_next(int class_idx, hak_base_ptr_t base, hak_base_ptr_t next, const char* stage)
{
#if !HAKMEM_BUILD_RELEASE
    static int s_diag_enable = -1;
    if (__builtin_expect(s_diag_enable == -1, 0)) {
        const char* e = getenv("HAKMEM_TINY_SLL_DIAG");
        s_diag_enable = (e && *e && *e != '0') ? 1 : 0;
    }
    if (!__builtin_expect(s_diag_enable, 0)) return;

    // Narrow to target classes to preserve early shots
    if (class_idx != 4 && class_idx != 6 && class_idx != 7) return;

    void* raw_next = HAK_BASE_TO_RAW(next);
    int in_range = tls_sll_head_valid(next);
    if (in_range) {
        // Range check (abort on clearly bad pointers to catch first offender)
        validate_ptr_range(raw_next, "tls_sll_pop_next_diag");
    }

    SuperSlab* ss = hak_super_lookup(raw_next);
    int slab_idx = ss ? slab_index_for(ss, raw_next) : -1;
    TinySlabMeta* meta = (ss && slab_idx >= 0 && slab_idx < ss_slabs_capacity(ss)) ? &ss->slabs[slab_idx] : NULL;
    int meta_cls = meta ? (int)meta->class_idx : -1;
#if HAKMEM_TINY_HEADER_CLASSIDX
    int hdr_cls = raw_next ? tiny_region_id_read_header((uint8_t*)raw_next + 1) : -1;
#else
    int hdr_cls = -1;
#endif

    static _Atomic uint32_t g_next_diag_once = 0;
    uint32_t shot = atomic_fetch_add_explicit(&g_next_diag_once, 1, memory_order_relaxed);
    if (shot < 12) {
        fprintf(stderr,
                "[TLS_SLL_POP_NEXT_DIAG] shot=%u stage=%s cls=%d base=%p next=%p hdr_cls=%d meta_cls=%d slab=%d ss=%p\n",
                shot + 1,
                stage ? stage : "(null)",
                class_idx,
                HAK_BASE_TO_RAW(base),
                raw_next,
                hdr_cls,
                meta_cls,
                slab_idx,
                (void*)ss);
    }
#else
    (void)class_idx; (void)base; (void)next; (void)stage;
#endif
}

2025-12-04 04:15:10 +09:00
|
|
|
// Optional: trace head writes to locate corruption sources (env: HAKMEM_TINY_SLL_HEADLOG=1)
|
|
|
|
|
static inline void tls_sll_fetch_ptr_info(void* p, SuperSlab** out_ss, int* out_idx, uint8_t* out_cls)
|
|
|
|
|
{
|
|
|
|
|
SuperSlab* ss = hak_super_lookup(p);
|
|
|
|
|
int cap = ss ? ss_slabs_capacity(ss) : 0;
|
|
|
|
|
int idx = (ss && ss->magic == SUPERSLAB_MAGIC) ? slab_index_for(ss, p) : -1;
|
|
|
|
|
uint8_t cls = (idx >= 0 && idx < cap) ? ss->slabs[idx].class_idx : 0xff;
|
|
|
|
|
if (out_ss) *out_ss = ss;
|
|
|
|
|
if (out_idx) *out_idx = idx;
|
|
|
|
|
if (out_cls) *out_cls = cls;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
static inline void tls_sll_head_trace(int class_idx,
                                      void* old_head,
                                      void* new_head,
                                      void* from_base,
                                      const char* stage)
{
    static int g_headlog_en = 1;   // default ON for triage; disable with HAKMEM_TINY_SLL_HEADLOG=0
    static int g_headlog_cls = -2; // -1 = no filter; >=0 only that class
    if (__builtin_expect(g_headlog_en == -1, 0)) {
        const char* e = getenv("HAKMEM_TINY_SLL_HEADLOG");
        g_headlog_en = (e && *e && *e != '0') ? 1 : 0;
    } else {
        const char* e = getenv("HAKMEM_TINY_SLL_HEADLOG");
        if (e && *e == '0') g_headlog_en = 0;
    }
    if (g_headlog_cls == -2) {
        const char* c = getenv("HAKMEM_TINY_SLL_HEADCLS");
        if (c && *c) {
            g_headlog_cls = atoi(c);
        } else {
            g_headlog_cls = -1;
        }
    }
    if (!__builtin_expect(g_headlog_en, 0)) return;
    if (g_headlog_cls >= 0 && class_idx != g_headlog_cls) return;

    static _Atomic uint32_t g_headlog_shot = 0;
    uint32_t shot = atomic_fetch_add_explicit(&g_headlog_shot, 1, memory_order_relaxed);
    if (shot >= 256) return;

    uint32_t count_before = 0;
    if (class_idx >= 0 && class_idx < TINY_NUM_CLASSES) {
        count_before = g_tls_sll[class_idx].count;
    }

    SuperSlab *new_ss = NULL, *old_ss = NULL, *from_ss = NULL;
    int new_idx = -1, old_idx = -1, from_idx = -1;
    uint8_t new_cls = 0xff, old_cls = 0xff, from_cls = 0xff;
    tls_sll_fetch_ptr_info(new_head, &new_ss, &new_idx, &new_cls);
    tls_sll_fetch_ptr_info(old_head, &old_ss, &old_idx, &old_cls);
    tls_sll_fetch_ptr_info(from_base, &from_ss, &from_idx, &from_cls);

    fprintf(stderr,
            "[TLS_SLL_HEAD_SET] shot=%u stage=%s cls=%d count=%u old=%p new=%p from=%p "
            "new_ss=%p new_idx=%d new_cls=%u old_ss=%p old_idx=%d old_cls=%u "
            "from_ss=%p from_idx=%d from_cls=%u last_writer=%s last_push=%p\n",
            shot + 1,
            stage ? stage : "(null)",
            class_idx,
            (unsigned)count_before,
            old_head,
            new_head,
            from_base,
            (void*)new_ss,
            new_idx,
            (unsigned)new_cls,
            (void*)old_ss,
            old_idx,
            (unsigned)old_cls,
            (void*)from_ss,
            from_idx,
            (unsigned)from_cls,
            g_tls_sll_last_writer[class_idx] ? g_tls_sll_last_writer[class_idx] : "(null)",
            HAK_BASE_TO_RAW(s_tls_sll_last_push[class_idx]));
}
|
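The log budget used by tls_sll_head_trace (class filter first, then a relaxed atomic counter capped at 256 shots) can be sketched in isolation. The names demo_log_allowed, demo_count_logged and DEMO_LOG_BUDGET are hypothetical and not part of this header; this is a minimal sketch of the pattern under those assumptions, not the Box API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_LOG_BUDGET 256u /* mirrors the `shot >= 256` cutoff; name is hypothetical */

/* Hypothetical helper: returns 1 while the caller may still log, 0 otherwise.
 * filter_cls < 0 means "no class filter". */
static int demo_log_allowed(_Atomic uint32_t* shots, int filter_cls, int class_idx)
{
    /* Filter first, so calls for other classes do not consume the budget. */
    if (filter_cls >= 0 && class_idx != filter_cls) return 0;
    uint32_t shot = atomic_fetch_add_explicit(shots, 1, memory_order_relaxed);
    return shot < DEMO_LOG_BUDGET;
}

/* Hypothetical driver: counts how many of `calls` attempts would be logged. */
static uint32_t demo_count_logged(uint32_t calls, int filter_cls, int class_idx)
{
    _Atomic uint32_t shots = 0;
    uint32_t logged = 0;
    for (uint32_t i = 0; i < calls; i++) {
        if (demo_log_allowed(&shots, filter_cls, class_idx)) logged++;
    }
    assert(logged <= DEMO_LOG_BUDGET);
    return logged;
}
```

Because the counter is only 32 bits wide, it wraps after 2^32 increments and logging would resume at that point; for short triage runs this is unlikely to matter.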
// ========== Push ==========
//
// Push BASE pointer into TLS SLL for given class.
// Returns true on success, false if capacity full or input invalid.
//
// Implementation function with callsite tracking (where).
// Use tls_sll_push() macro instead of calling directly.

static inline bool tls_sll_push_impl(int class_idx, hak_base_ptr_t ptr, uint32_t capacity, const char* where)
{
    static _Atomic uint32_t g_tls_push_trace = 0;
    if (atomic_fetch_add_explicit(&g_tls_push_trace, 1, memory_order_relaxed) < 4096) {
        HAK_TRACE("[tls_sll_push_impl_enter]\n");
    }
Fix #16: Resolve double BASE→USER conversion causing header corruption
🎯 ROOT CAUSE: Internal allocation helpers were prematurely converting
BASE → USER pointers before returning to caller. The caller then applied
HAK_RET_ALLOC/tiny_region_id_write_header which performed ANOTHER BASE→USER
conversion, resulting in double offset (BASE+2) and header written at
wrong location.
📦 BOX THEORY SOLUTION: Establish clean pointer conversion boundary at
tiny_region_id_write_header, making it the single source of truth for
BASE → USER conversion.
🔧 CHANGES:
- Fix #16: Remove premature BASE→USER conversions (6 locations)
* core/tiny_alloc_fast.inc.h (3 fixes)
* core/hakmem_tiny_refill.inc.h (2 fixes)
* core/hakmem_tiny_fastcache.inc.h (1 fix)
- Fix #12: Add header validation in tls_sll_pop (detect corruption)
- Fix #14: Defense-in-depth header restoration in tls_sll_splice
- Fix #15: USER pointer detection (for debugging)
- Fix #13: Bump window header restoration
- Fix #2, #6, #7, #8: Various header restoration & NULL termination
🧪 TEST RESULTS: 100% SUCCESS
- 10K-500K iterations: All passed
- 8 seeds × 100K: All passed (42,123,456,789,999,314,271,161)
- Performance: ~630K ops/s average (stable)
- Header corruption: ZERO
📋 FIXES SUMMARY:
Fix #1-8: Initial header restoration & chain fixes (chatgpt-san)
Fix #9-10: USER pointer auto-fix (later disabled)
Fix #12: Validation system (caught corruption at call 14209)
Fix #13: Bump window header writes
Fix #14: Splice defense-in-depth
Fix #15: USER pointer detection (debugging tool)
Fix #16: Double conversion fix (FINAL SOLUTION) ✅
🎓 LESSONS LEARNED:
1. Validation catches bugs early (Fix #12 was critical)
2. Class-specific inline logging reveals patterns (Option C)
3. Box Theory provides clean architectural boundaries
4. Multiple investigation approaches (Task/chatgpt-san collaboration)
📄 DOCUMENTATION:
- P0_BUG_STATUS.md: Complete bug tracking timeline
- C2_CORRUPTION_ROOT_CAUSE_FINAL.md: Detailed root cause analysis
- FINAL_ANALYSIS_C2_CORRUPTION.md: Investigation methodology
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Task Agent <task@anthropic.com>
Co-Authored-By: ChatGPT <chatgpt@openai.com>
2025-11-12 10:33:57 +09:00
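The double-conversion bug described in the Fix #16 message can be reproduced in miniature. The sketch below assumes a 1-byte header at BASE and USER = BASE + 1; every demo_* name and the DEMO_HEADER_MAGIC value are hypothetical stand-ins, not the real tiny_region_id_write_header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_HEADER_MAGIC 0xA0u /* assumed value, for illustration only */

/* Hypothetical stand-in for the single BASE -> USER boundary:
 * writes the 1-byte header at base and returns base + 1. */
static uint8_t* demo_write_header(uint8_t* base, int class_idx)
{
    assert(class_idx >= 0 && class_idx <= 0x0F);
    base[0] = (uint8_t)(DEMO_HEADER_MAGIC | (unsigned)class_idx);
    return base + 1;
}

/* Buggy helper: converts to USER before the boundary function sees the pointer. */
static uint8_t* demo_alloc_buggy(uint8_t* block, int class_idx)
{
    uint8_t* premature_user = block + 1;                 /* first conversion  */
    return demo_write_header(premature_user, class_idx); /* second conversion */
}

/* Fixed helper: hands BASE to the boundary function unchanged. */
static uint8_t* demo_alloc_fixed(uint8_t* block, int class_idx)
{
    return demo_write_header(block, class_idx);
}

/* Returns the offset of the returned pointer from the block start and
 * reports whether the header byte landed at block[0]. */
static int demo_offset(int buggy, int* header_at_base_ok)
{
    uint8_t block[16];
    memset(block, 0, sizeof block);
    uint8_t* user = buggy ? demo_alloc_buggy(block, 2) : demo_alloc_fixed(block, 2);
    *header_at_base_ok = (block[0] == (DEMO_HEADER_MAGIC | 2u));
    return (int)(user - block);
}
```

In the buggy path the caller receives BASE+2 and the header byte sits at BASE+1, which matches the symptom the commit message reports.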
    HAK_CHECK_CLASS_IDX(class_idx, "tls_sll_push");

    // Class mask gate (narrow triage): if disallowed, reject push
    if (__builtin_expect(((g_tls_sll_class_mask & (1u << class_idx)) == 0), 0)) {
        return false;
    }

    // Defensive: ensure current head is sane before linking new node.
    tls_sll_sanitize_head(class_idx, "push");

    // Capacity semantics:
    // - capacity == 0 → disabled (reject)
    // - capacity > 1<<20 → treat as "unbounded" sentinel (no limit)
    if (capacity == 0) {
        return false;
    }
    const uint32_t kCapacityHardMax = (1u << 20);
    const int unlimited = (capacity > kCapacityHardMax);

    if (hak_base_is_null(ptr)) {
        return false;
    }

    // Base pointer only (callers must pass BASE; this is a no-op by design).
    ptr = tls_sll_normalize_base(class_idx, ptr);
    void* raw_ptr = HAK_BASE_TO_RAW(ptr);

    // Detect meta/class mismatch on push (first few only).
    bool push_valid = true;
    SuperSlab* ss_ptr = NULL;
    do {
        static _Atomic uint32_t g_tls_sll_push_meta_mis = 0;
        struct SuperSlab* ss = hak_super_lookup(raw_ptr);
        if (ss && ss->magic == SUPERSLAB_MAGIC) {
            ss_ptr = ss;
            int sidx = slab_index_for(ss, raw_ptr);
            if (sidx >= 0 && sidx < ss_slabs_capacity(ss)) {
                uint8_t meta_cls = ss->slabs[sidx].class_idx;
                if (meta_cls < TINY_NUM_CLASSES && meta_cls != (uint8_t)class_idx) {
                    push_valid = false;
                    uint32_t n = atomic_fetch_add_explicit(&g_tls_sll_push_meta_mis, 1, memory_order_relaxed);
                    if (n < 4) {
                        fprintf(stderr,
                                "[TLS_SLL_PUSH_META_MISMATCH] cls=%d meta_cls=%u base=%p slab_idx=%d ss=%p\n",
                                class_idx, (unsigned)meta_cls, raw_ptr, sidx, (void*)ss);
                        void* bt[8];
                        int frames = backtrace(bt, 8);
                        backtrace_symbols_fd(bt, frames, fileno(stderr));
                    }
                    fflush(stderr);
                }
            }
        } else {
            push_valid = false;
            static _Atomic uint32_t g_tls_sll_push_no_ss = 0;
            uint32_t n = atomic_fetch_add_explicit(&g_tls_sll_push_no_ss, 1, memory_order_relaxed);
            if (n < 4) {
                extern int g_super_reg_initialized;
                extern SSAddrMap g_ss_addr_map;
                fprintf(stderr,
                        "[TLS_SLL_PUSH_NO_SS] cls=%d base=%p from=%s reg_init=%d map_count=%zu\n",
                        class_idx,
                        raw_ptr,
                        where ? where : "(null)",
                        g_super_reg_initialized,
                        g_ss_addr_map.count);
                fflush(stderr);
            }
        }
    } while (0);
    if (!push_valid) {
        return false; // Drop malformed pointer instead of corrupting TLS SLL
    }
#if HAKMEM_TINY_HEADER_CLASSIDX
    // Validate header on push - detect blocks pushed without header write
    // Enabled via HAKMEM_DEBUG_LEVEL >= 3 (INFO level) or in debug builds
    // Legacy: HAKMEM_TINY_SLL_VALIDATE_HDR=1 still works for compatibility
    do {
        static int g_validate_hdr = -1;
        if (__builtin_expect(g_validate_hdr == -1, 0)) {
#if !HAKMEM_BUILD_RELEASE
            g_validate_hdr = 1; // Always on in debug
#else
            g_validate_hdr = hak_debug_check_level("HAKMEM_TINY_SLL_VALIDATE_HDR", 3);
#endif
        }

        if (__builtin_expect(g_validate_hdr, 0)) {
            static _Atomic uint32_t g_tls_sll_push_bad_hdr = 0;
            uint8_t hdr = *(uint8_t*)raw_ptr;
            uint8_t expected = HEADER_MAGIC | (class_idx & HEADER_CLASS_MASK);
            if (hdr != expected) {
                uint32_t n = atomic_fetch_add_explicit(&g_tls_sll_push_bad_hdr, 1, memory_order_relaxed);
                if (n < 10) {
                    fprintf(stderr,
                            "[TLS_SLL_PUSH_BAD_HDR] cls=%d base=%p got=0x%02x expect=0x%02x from=%s\n",
                            class_idx, raw_ptr, hdr, expected, where ? where : "(null)");
                    void* bt[8];
                    int frames = backtrace(bt, 8);
                    backtrace_symbols_fd(bt, frames, fileno(stderr));
                    fflush(stderr);
                }
            }
        }
    } while (0);
#endif

#if !HAKMEM_BUILD_RELEASE
    // Minimal range guard before we touch memory.
    if (!validate_ptr_range(raw_ptr, "tls_sll_push_base")) {
        fprintf(stderr,
                "[TLS_SLL_PUSH] FATAL invalid BASE ptr cls=%d base=%p\n",
                class_idx, raw_ptr);
        abort();
    }
#else
    // Release: drop malformed ptrs but keep running.
    uintptr_t ptr_addr = (uintptr_t)raw_ptr;
    if (ptr_addr < 4096 || ptr_addr > 0x00007fffffffffffULL) {
        extern _Atomic uint64_t g_tls_sll_invalid_push[];
        uint64_t cnt = atomic_fetch_add_explicit(&g_tls_sll_invalid_push[class_idx], 1, memory_order_relaxed);
        static __thread uint8_t s_log_limit_push[TINY_NUM_CLASSES] = {0};
        if (s_log_limit_push[class_idx] < 4) {
            fprintf(stderr, "[TLS_SLL_PUSH_INVALID] cls=%d base=%p dropped count=%llu\n",
                    class_idx, raw_ptr, (unsigned long long)cnt + 1);
            s_log_limit_push[class_idx]++;
        }
        return false;
    }
#endif

    // Capacity check BEFORE any writes.
    uint32_t cur = g_tls_sll[class_idx].count;
    if (!unlimited && cur >= capacity) {
        return false;
    }

    // Pin SuperSlab while node resides in TLS SLL (prevents premature free)
    if (ss_ptr && ss_ptr->magic == SUPERSLAB_MAGIC) {
        superslab_ref_inc(ss_ptr);
    }

    // DEBUG: Strict address check on push to catch corruption early
    uintptr_t ptr_val = (uintptr_t)raw_ptr;
    if (ptr_val < 4096 || ptr_val > 0x00007fffffffffffULL) {
        fprintf(stderr, "[TLS_SLL_PUSH_INVALID] cls=%d base=%p (val=%llx) from=%s\n",
                class_idx, raw_ptr, (unsigned long long)ptr_val, where ? where : "(null)");
        abort();
    }

    // Header restoration using Header Box (C1-C6 only; C0/C7 skip)
    // Safe mode (HAKMEM_TINY_SLL_SAFEHEADER=1): never overwrite header; reject on magic mismatch.
    // Default mode: restore expected header.
#if !HAKMEM_TINY_HEADERLESS
    if (tiny_class_preserves_header(class_idx)) {
        static int g_sll_safehdr = -1;
Front-Direct implementation: SS→FC direct refill + SLL complete bypass
## Summary
Implemented Front-Direct architecture with complete SLL bypass:
- Direct SuperSlab → FastCache refill (1-hop, bypasses SLL)
- SLL-free allocation/free paths when Front-Direct enabled
- Legacy path sealing (SLL inline opt-in, SFC cascade ENV-only)
## New Modules
- core/refill/ss_refill_fc.h (236 lines): Standard SS→FC refill entry point
- Remote drain → Freelist → Carve priority
- Header restoration for C1-C6 (NOT C0/C7)
- ENV: HAKMEM_TINY_P0_DRAIN_THRESH, HAKMEM_TINY_P0_NO_DRAIN
- core/front/fast_cache.h: FastCache (L1) type definition
- core/front/quick_slot.h: QuickSlot (L0) type definition
## Allocation Path (core/tiny_alloc_fast.inc.h)
- Added s_front_direct_alloc TLS flag (lazy ENV check)
- SLL pop guarded by: g_tls_sll_enable && !s_front_direct_alloc
- Refill dispatch:
- Front-Direct: ss_refill_fc_fill() → fastcache_pop() (1-hop)
- Legacy: sll_refill_batch_from_ss() → SLL → FC (2-hop, A/B only)
- SLL inline pop sealed (requires HAKMEM_TINY_INLINE_SLL=1 opt-in)
## Free Path (core/hakmem_tiny_free.inc, core/hakmem_tiny_fastcache.inc.h)
- FC priority: Try fastcache_push() first (same-thread free)
- tiny_fast_push() bypass: Returns 0 when s_front_direct_free || !g_tls_sll_enable
- Fallback: Magazine/slow path (safe, bypasses SLL)
## Legacy Sealing
- SFC cascade: Default OFF (ENV-only via HAKMEM_TINY_SFC_CASCADE=1)
- Deleted: core/hakmem_tiny_free.inc.bak, core/pool_refill_legacy.c.bak
- Documentation: ss_refill_fc_fill() promoted as CANONICAL refill entry
## ENV Controls
- HAKMEM_TINY_FRONT_DIRECT=1: Enable Front-Direct (SS→FC direct)
- HAKMEM_TINY_P0_DIRECT_FC_ALL=1: Same as above (alt name)
- HAKMEM_TINY_REFILL_BATCH=1: Enable batch refill (also enables Front-Direct)
- HAKMEM_TINY_SFC_CASCADE=1: Enable SFC cascade (default OFF)
- HAKMEM_TINY_INLINE_SLL=1: Enable inline SLL pop (default OFF, requires AGGRESSIVE_INLINE)
## Benchmarks (Front-Direct Enabled)
```bash
ENV: HAKMEM_BENCH_FAST_FRONT=1 HAKMEM_TINY_FRONT_DIRECT=1
HAKMEM_TINY_REFILL_BATCH=1 HAKMEM_TINY_P0_DIRECT_FC_ALL=1
HAKMEM_TINY_REFILL_COUNT_HOT=256 HAKMEM_TINY_REFILL_COUNT_MID=96
HAKMEM_TINY_BUMP_CHUNK=256
bench_random_mixed (16-1040B random, 200K iter):
256 slots: 1.44M ops/s (STABLE, 0 SEGV)
128 slots: 1.44M ops/s (STABLE, 0 SEGV)
bench_fixed_size (fixed size, 200K iter):
256B: 4.06M ops/s (has debug logs, expected >10M without logs)
128B: Similar (debug logs affect)
```
## Verification
- TRACE_RING test (10K iter): **0 SLL events** detected ✅
- Complete SLL bypass confirmed when Front-Direct=1
- Stable execution: 200K iterations × multiple sizes, 0 SEGV
## Next Steps
- Disable debug logs in hak_alloc_api.inc.h (call_num 14250-14280 range)
- Re-benchmark with clean Release build (target: 10-15M ops/s)
- 128/256B shortcut path optimization (FC hit rate improvement)
Co-Authored-By: ChatGPT <chatgpt@openai.com>
Suggested-By: ultrathink
2025-11-14 05:41:49 +09:00
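The "lazy ENV check" and the SLL pop guard mentioned in the Front-Direct message can be illustrated with a small sketch. It reuses the truthiness rule this header applies to its own getenv() results; the demo_* names and the per-thread cache variable are hypothetical, and the real s_front_direct_alloc flag may be implemented differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Same truthiness rule as the getenv() checks in this header:
 * set, non-empty, and not starting with '0'. */
static int demo_parse_flag(const char* e)
{
    return (e && *e && *e != '0') ? 1 : 0;
}

/* Hypothetical per-thread cache: -1 = not read yet, 0 = off, 1 = on. */
static _Thread_local int demo_front_direct = -1;

/* Reads HAKMEM_TINY_FRONT_DIRECT once per thread, then serves the cached value. */
static int demo_front_direct_enabled(void)
{
    if (demo_front_direct == -1) {
        demo_front_direct = demo_parse_flag(getenv("HAKMEM_TINY_FRONT_DIRECT"));
    }
    assert(demo_front_direct == 0 || demo_front_direct == 1);
    return demo_front_direct;
}

/* Guard from the commit message: SLL pop only when SLL is enabled
 * and Front-Direct is off. */
static int demo_sll_pop_allowed(int sll_enable, int front_direct)
{
    return sll_enable && !front_direct;
}
```

Keeping the parse step as a pure function makes the rule testable without touching the process environment.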
        static int g_sll_ring_en = -1; // optional ring trace for TLS-SLL anomalies
        if (__builtin_expect(g_sll_safehdr == -1, 0)) {
            const char* e = getenv("HAKMEM_TINY_SLL_SAFEHEADER");
            g_sll_safehdr = (e && *e && *e != '0') ? 1 : 0;
        }
        if (__builtin_expect(g_sll_ring_en == -1, 0)) {
            const char* r = getenv("HAKMEM_TINY_SLL_RING");
            g_sll_ring_en = (r && *r && *r != '0') ? 1 : 0;
        }
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
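The size table and the out-of-band class lookup described in P0.1 and P1.1/P1.2 can be sketched as follows. The table values and the 255 sentinel come from the commit message; the demo_* names, the linear scan, and the function signatures are assumptions made for illustration (the real code uses a LUT and SuperSlab fields).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_CLASSES 8
#define DEMO_CLASS_UNASSIGNED 255u /* class_map initial value per the commit message */

static const size_t demo_class_sizes[DEMO_NUM_CLASSES] =
    {16, 32, 64, 128, 256, 512, 1024, 2048};

/* Maps a request size to a class index: 1..16 -> 0, 17..32 -> 1, and so on.
 * Returns -1 for size 0 or sizes above the largest class. */
static int demo_size_to_class(size_t size)
{
    if (size == 0) return -1;
    for (int c = 0; c < DEMO_NUM_CLASSES; c++) {
        if (size <= demo_class_sizes[c]) return c;
    }
    return -1;
}

/* Out-of-band lookup: prefer class_map[slab_idx]; if the entry is invalid
 * (for example still UNASSIGNED), fall back to the class read from the header. */
static int demo_class_from_map(const uint8_t* class_map, int slab_idx, int header_cls)
{
    assert(slab_idx >= 0);
    uint8_t c = class_map[slab_idx];
    return (c < DEMO_NUM_CLASSES) ? (int)c : header_cls;
}
```

Any value outside 0..7, including the 255 sentinel, takes the fallback branch, so an uninitialized slab entry never yields an out-of-range class.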
        // ptr is BASE pointer, header is at ptr+0
        uint8_t* b = (uint8_t*)raw_ptr;
        uint8_t got_pre, expected;
        tiny_header_validate(b, class_idx, &got_pre, &expected);
        if (__builtin_expect(got_pre != expected, 0)) {
            tls_sll_log_hdr_mismatch(class_idx, ptr, got_pre, expected, "push_preheader");
        }
        if (g_sll_safehdr) {
            uint8_t got = *b;
            if ((got & 0xF0u) != HEADER_MAGIC) {
                // Reject push silently (fall back to slow path at caller)
                if (__builtin_expect(g_sll_ring_en, 0)) {
                    // aux encodes: bits 15..8 = got, bits 7..0 = expected
                    uintptr_t aux = ((uintptr_t)got << 8) | (uintptr_t)expected;
                    tiny_debug_ring_record(0x7F10 /*TLS_SLL_REJECT*/, (uint16_t)class_idx, raw_ptr, aux);
2025-11-14 05:41:49 +09:00
|
|
|
}
|
2025-11-14 01:29:55 +09:00
|
|
|
return false;
|
|
|
|
|
}
|
|
|
|
|
} else {
|
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
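The out-of-band lookup described in P1.1 and P1.2 can be sketched roughly as follows. This is a minimal illustration, not the allocator's actual code: `DemoSuperSlab`, `DEMO_SLABS_PER_SUPERSLAB`, `DEMO_NUM_CLASSES` and every `demo_*` function are hypothetical names, and the header layout (`0xa0 | class_idx`) is assumed from the header write in the push path of this file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes for illustration; the real constants live in the allocator. */
#define DEMO_SLABS_PER_SUPERSLAB 32
#define DEMO_NUM_CLASSES 8
#define DEMO_CLASS_UNASSIGNED 255u

typedef struct {
    uint8_t class_map[DEMO_SLABS_PER_SUPERSLAB]; /* slab_idx -> class_idx */
} DemoSuperSlab;

/* Mark every slab as UNASSIGNED (255) when the SuperSlab is created. */
static void demo_superslab_init(DemoSuperSlab* ss) {
    assert(ss != NULL);
    memset(ss->class_map, DEMO_CLASS_UNASSIGNED, sizeof ss->class_map);
}

/* In-band fallback: decode the class from the 1-byte header at BASE.
 * Returns -1 when the magic nibble or the class index is invalid. */
static int demo_class_from_header(const uint8_t* base) {
    uint8_t h = base[0];
    if ((h & 0xf0u) != 0xa0u) return -1;
    int cls = h & 0x0f;
    return cls < DEMO_NUM_CLASSES ? cls : -1;
}

/* Prefer the out-of-band class_map; fall back to the header when the
 * map entry is unassigned or out of range. */
static int demo_class_lookup(const DemoSuperSlab* ss, size_t slab_idx,
                             const uint8_t* base) {
    assert(ss != NULL && base != NULL);
    if (slab_idx < DEMO_SLABS_PER_SUPERSLAB) {
        uint8_t cls = ss->class_map[slab_idx];
        if (cls < DEMO_NUM_CLASSES) return (int)cls;
    }
    return demo_class_from_header(base);
}
```

With this shape the free path stays correct when the next pointer overwrites the header, as long as the slab's map entry has been set. The header read only serves as a safety net.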
2025-11-28 13:42:39 +09:00
|
|
|
PTR_TRACK_TLS_PUSH(b, class_idx);
|
|
|
|
|
PTR_TRACK_HEADER_WRITE(b, expected);
|
2025-12-03 10:57:16 +09:00
|
|
|
// GEMINI FIX: Always write header before push + memory barrier
|
|
|
|
|
// This prevents compiler/CPU reordering that might delay header write after next-ptr write
|
|
|
|
|
// or expose incomplete state to other threads (though TLS SLL should be private).
|
|
|
|
|
*(uint8_t*)b = (uint8_t)(0xa0 | (class_idx & 0x0f));
|
|
|
|
|
__atomic_thread_fence(__ATOMIC_RELEASE);
|
2025-11-14 01:29:55 +09:00
|
|
|
}
|
2025-11-14 01:02:00 +09:00
|
|
|
}
|
2025-12-03 12:11:27 +09:00
|
|
|
#endif
|
2025-11-14 01:02:00 +09:00
|
|
|
|
2025-11-10 23:41:53 +09:00
|
|
|
tls_sll_debug_guard(class_idx, ptr, "push");
|
Fix #16: Resolve double BASE→USER conversion causing header corruption
🎯 ROOT CAUSE: Internal allocation helpers were prematurely converting
BASE → USER pointers before returning to caller. The caller then applied
HAK_RET_ALLOC/tiny_region_id_write_header which performed ANOTHER BASE→USER
conversion, resulting in double offset (BASE+2) and header written at
wrong location.
📦 BOX THEORY SOLUTION: Establish clean pointer conversion boundary at
tiny_region_id_write_header, making it the single source of truth for
BASE → USER conversion.
🔧 CHANGES:
- Fix #16: Remove premature BASE→USER conversions (6 locations)
* core/tiny_alloc_fast.inc.h (3 fixes)
* core/hakmem_tiny_refill.inc.h (2 fixes)
* core/hakmem_tiny_fastcache.inc.h (1 fix)
- Fix #12: Add header validation in tls_sll_pop (detect corruption)
- Fix #14: Defense-in-depth header restoration in tls_sll_splice
- Fix #15: USER pointer detection (for debugging)
- Fix #13: Bump window header restoration
- Fix #2, #6, #7, #8: Various header restoration & NULL termination
🧪 TEST RESULTS: 100% SUCCESS
- 10K-500K iterations: All passed
- 8 seeds × 100K: All passed (42,123,456,789,999,314,271,161)
- Performance: ~630K ops/s average (stable)
- Header corruption: ZERO
📋 FIXES SUMMARY:
Fix #1-8: Initial header restoration & chain fixes (chatgpt-san)
Fix #9-10: USER pointer auto-fix (later disabled)
Fix #12: Validation system (caught corruption at call 14209)
Fix #13: Bump window header writes
Fix #14: Splice defense-in-depth
Fix #15: USER pointer detection (debugging tool)
Fix #16: Double conversion fix (FINAL SOLUTION) ✅
🎓 LESSONS LEARNED:
1. Validation catches bugs early (Fix #12 was critical)
2. Class-specific inline logging reveals patterns (Option C)
3. Box Theory provides clean architectural boundaries
4. Multiple investigation approaches (Task/chatgpt-san collaboration)
📄 DOCUMENTATION:
- P0_BUG_STATUS.md: Complete bug tracking timeline
- C2_CORRUPTION_ROOT_CAUSE_FINAL.md: Detailed root cause analysis
- FINAL_ANALYSIS_C2_CORRUPTION.md: Investigation methodology
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Task Agent <task@anthropic.com>
Co-Authored-By: ChatGPT <chatgpt@openai.com>
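The single conversion boundary described in Fix #16 might look like the sketch below. It assumes a 1-byte header, which is consistent with the "BASE+2" double offset mentioned above. The `demo_*` names are hypothetical stand-ins, not the signature of `tiny_region_id_write_header`.

```c
#include <assert.h>
#include <stdint.h>

enum { DEMO_HEADER_SIZE = 1 }; /* assumed 1-byte header */

/* The only place that turns BASE into USER: write the header at BASE and
 * return BASE + 1. Internal helpers hand BASE pointers to this function
 * and never add the offset themselves. */
static inline void* demo_write_header(void* base, int class_idx) {
    assert(base != 0 && class_idx >= 0 && class_idx < 16);
    uint8_t* p = (uint8_t*)base;
    p[0] = (uint8_t)(0xa0 | (class_idx & 0x0f));
    return p + DEMO_HEADER_SIZE;
}

/* Inverse conversion, used on the free path. */
static inline void* demo_user_to_base(void* user) {
    return (uint8_t*)user - DEMO_HEADER_SIZE;
}

/* Decode the class from the header at BASE, or -1 if the magic is wrong. */
static inline int demo_header_class(const void* base) {
    uint8_t h = *(const uint8_t*)base;
    return ((h & 0xf0) == 0xa0) ? (h & 0x0f) : -1;
}
```

If a helper had already returned `base + 1` and the caller then passed that value to `demo_write_header`, the header would land at `base[1]` and the caller would receive `base + 2`. That matches the corruption pattern the commit describes.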
2025-11-12 10:33:57 +09:00
|
|
|
|
|
|
|
|
#if !HAKMEM_BUILD_RELEASE
|
2025-11-14 01:02:00 +09:00
|
|
|
// Optional double-free detection: scan a bounded prefix of the list.
|
2025-11-22 08:43:18 +09:00
|
|
|
// Increased from 64 to 256 to catch orphaned blocks deeper in the chain.
|
2025-11-12 10:33:57 +09:00
|
|
|
{
|
2025-12-01 16:37:59 +09:00
|
|
|
hak_base_ptr_t scan = g_tls_sll[class_idx].head;
|
2025-11-14 01:02:00 +09:00
|
|
|
uint32_t scanned = 0;
|
2025-11-22 08:43:18 +09:00
|
|
|
const uint32_t limit = (g_tls_sll[class_idx].count < 256)
|
2025-11-20 07:32:30 +09:00
|
|
|
? g_tls_sll[class_idx].count
|
2025-11-22 08:43:18 +09:00
|
|
|
: 256;
|
2025-12-01 16:37:59 +09:00
|
|
|
while (!hak_base_is_null(scan) && scanned < limit) {
|
|
|
|
|
if (hak_base_eq(scan, ptr)) {
|
2025-11-14 01:02:00 +09:00
|
|
|
fprintf(stderr,
|
2025-11-22 11:30:46 +09:00
|
|
|
"[TLS_SLL_PUSH_DUP] cls=%d ptr=%p head=%p count=%u scanned=%u last_push=%p last_push_from=%s last_pop_from=%s last_writer=%s where=%s\n",
|
2025-11-22 08:43:18 +09:00
|
|
|
class_idx,
|
2025-12-01 16:37:59 +09:00
|
|
|
raw_ptr,
|
|
|
|
|
HAK_BASE_TO_RAW(g_tls_sll[class_idx].head),
|
2025-11-22 08:43:18 +09:00
|
|
|
g_tls_sll[class_idx].count,
|
|
|
|
|
scanned,
|
2025-12-01 16:37:59 +09:00
|
|
|
HAK_BASE_TO_RAW(s_tls_sll_last_push[class_idx]),
|
2025-11-22 11:30:46 +09:00
|
|
|
s_tls_sll_last_push_from[class_idx] ? s_tls_sll_last_push_from[class_idx] : "(null)",
|
|
|
|
|
s_tls_sll_last_pop_from[class_idx] ? s_tls_sll_last_pop_from[class_idx] : "(null)",
|
|
|
|
|
g_tls_sll_last_writer[class_idx] ? g_tls_sll_last_writer[class_idx] : "(null)",
|
|
|
|
|
where ? where : "(null)");
|
2025-11-22 08:43:18 +09:00
|
|
|
ptr_trace_dump_now("tls_sll_dup");
|
2025-11-27 05:57:22 +09:00
|
|
|
// ABORT to get backtrace showing exact double-free location
|
|
|
|
|
abort();
|
2025-11-12 10:33:57 +09:00
|
|
|
}
|
2025-12-01 16:37:59 +09:00
|
|
|
void* next_raw;
|
|
|
|
|
PTR_NEXT_READ("tls_sll_scan", class_idx, HAK_BASE_TO_RAW(scan), 0, next_raw);
|
|
|
|
|
scan = HAK_BASE_FROM_RAW(next_raw);
|
2025-11-14 01:02:00 +09:00
|
|
|
scanned++;
|
2025-11-12 10:33:57 +09:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
#endif
|
|
|
|
|
|
2025-11-14 01:02:00 +09:00
|
|
|
// Link new node to current head via Box API (offset is handled inside tiny_nextptr).
|
2025-12-01 16:37:59 +09:00
|
|
|
// Note: g_tls_sll[...].head is hak_base_ptr_t, but PTR_NEXT_WRITE takes void* val.
|
|
|
|
|
PTR_NEXT_WRITE("tls_push", class_idx, raw_ptr, 0, HAK_BASE_TO_RAW(g_tls_sll[class_idx].head));
|
2025-12-03 21:56:52 +09:00
|
|
|
tls_sll_set_head(class_idx, ptr, "push");
|
2025-11-20 07:32:30 +09:00
|
|
|
g_tls_sll[class_idx].count = cur + 1;
|
2025-11-21 23:00:24 +09:00
|
|
|
s_tls_sll_last_push[class_idx] = ptr;
|
2025-11-10 16:48:20 +09:00
|
|
|
|
2025-11-28 13:42:39 +09:00
|
|
|
#if !HAKMEM_BUILD_RELEASE
|
|
|
|
|
// Trace TLS SLL push (debug only)
|
|
|
|
|
extern void ptr_trace_record_impl(int event, void* ptr, int class_idx, uint64_t op_num,
|
|
|
|
|
void* aux_ptr, uint32_t aux_u32, int aux_int,
|
|
|
|
|
const char* file, int line);
|
|
|
|
|
extern _Atomic uint64_t g_ptr_trace_op_counter;
|
|
|
|
|
uint64_t _trace_op = atomic_fetch_add_explicit(&g_ptr_trace_op_counter, 1, memory_order_relaxed);
|
2025-12-01 16:37:59 +09:00
|
|
|
ptr_trace_record_impl(4 /*PTR_EVENT_FREE_TLS_PUSH*/, raw_ptr, class_idx, _trace_op,
|
2025-11-28 13:42:39 +09:00
|
|
|
NULL, g_tls_sll[class_idx].count, 0,
|
|
|
|
|
where ? where : __FILE__, __LINE__);
|
|
|
|
|
#endif
|
|
|
|
|
|
2025-11-22 11:30:46 +09:00
|
|
|
#if !HAKMEM_BUILD_RELEASE
|
|
|
|
|
// Record callsite for debugging (debug-only)
|
|
|
|
|
s_tls_sll_last_push_from[class_idx] = where;
|
|
|
|
|
#else
|
|
|
|
|
(void)where; // Suppress unused warning in release
|
|
|
|
|
#endif
|
|
|
|
|
|
2025-11-10 16:48:20 +09:00
|
|
|
return true;
|
|
|
|
|
}
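The bounded duplicate scan performed by the debug build of the push path above can be illustrated in isolation. This is a simplified sketch under stated assumptions: `DemoNode`, `DemoList` and the `demo_*` functions are hypothetical, the next pointer sits at offset 0, and a duplicate is reported by returning `false`, whereas the real code logs the duplicate and calls `abort()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct DemoNode { struct DemoNode* next; } DemoNode;
typedef struct { DemoNode* head; uint32_t count; } DemoList;

/* Scan at most min(count, max_scan) nodes from the head looking for n.
 * Bounding the walk keeps the check cheap and guards against cycles. */
static bool demo_contains_prefix(const DemoList* l, const DemoNode* n,
                                 uint32_t max_scan) {
    uint32_t limit = l->count < max_scan ? l->count : max_scan;
    uint32_t scanned = 0;
    for (const DemoNode* s = l->head; s != 0 && scanned < limit;
         s = s->next, scanned++) {
        if (s == n) return true;
    }
    return false;
}

/* Push n onto the list unless it is already within the scanned prefix. */
static bool demo_push(DemoList* l, DemoNode* n) {
    assert(l != 0 && n != 0);
    if (demo_contains_prefix(l, n, 256)) return false; /* double free */
    n->next = l->head;
    l->head = n;
    l->count++;
    return true;
}
```

A duplicate located deeper than the scan limit would go undetected, which is the trade-off behind raising the limit from 64 to 256.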
|
|
|
|
|
|
|
|
|
|
// ========== Pop ==========
|
|
|
|
|
//
|
2025-11-14 01:02:00 +09:00
|
|
|
// Pop BASE pointer from TLS SLL.
|
|
|
|
|
// Returns true on success and stores BASE into *out.
|
2025-11-22 11:30:46 +09:00
|
|
|
//
|
|
|
|
|
// Implementation function with callsite tracking (where).
|
|
|
|
|
// Use tls_sll_pop() macro instead of calling directly.
|
2025-11-14 01:02:00 +09:00
|
|
|
|
2025-12-01 16:37:59 +09:00
|
|
|
static inline bool tls_sll_pop_impl(int class_idx, hak_base_ptr_t* out, const char* where)
|
2025-11-14 01:02:00 +09:00
|
|
|
{
|
2025-12-04 10:38:19 +09:00
|
|
|
static _Atomic uint32_t g_tls_pop_trace = 0;
|
|
|
|
|
if (atomic_fetch_add_explicit(&g_tls_pop_trace, 1, memory_order_relaxed) < 4096) {
|
2025-12-03 20:42:28 +09:00
|
|
|
HAK_TRACE("[tls_sll_pop_impl_enter]\n");
|
|
|
|
|
}
|
Add Box I (Integrity), Box E (Expansion), and comprehensive P0 debugging infrastructure
## Major Additions
### 1. Box I: Integrity Verification System (NEW - 703 lines)
- Files: core/box/integrity_box.h (267 lines), core/box/integrity_box.c (436 lines)
- Purpose: Unified integrity checking across all HAKMEM subsystems
- Features:
* 4-level integrity checking (0-4, compile-time controlled)
* Priority 1: TLS array bounds validation
* Priority 2: Freelist pointer validation
* Priority 3: TLS canary monitoring
* Priority ALPHA: Slab metadata invariant checking (5 invariants)
* Atomic statistics tracking (thread-safe)
* Beautiful BOX_BOUNDARY design pattern
### 2. Box E: SuperSlab Expansion System (COMPLETE)
- Files: core/box/superslab_expansion_box.h, core/box/superslab_expansion_box.c
- Purpose: Safe SuperSlab expansion with TLS state guarantee
- Features:
* Immediate slab 0 binding after expansion
* TLS state snapshot and restoration
* Design by Contract (pre/post-conditions, invariants)
* Thread-safe with mutex protection
### 3. Comprehensive Integrity Checking System
- File: core/hakmem_tiny_integrity.h (NEW)
- Unified validation functions for all allocator subsystems
- Uninitialized memory pattern detection (0xa2, 0xcc, 0xdd, 0xfe)
- Pointer range validation (null-page, kernel-space)
### 4. P0 Bug Investigation - Root Cause Identified
**Bug**: SEGV at iteration 28440 (deterministic with seed 42)
**Pattern**: 0xa2a2a2a2a2a2a2a2 (uninitialized/ASan poisoning)
**Location**: TLS SLL (Single-Linked List) cache layer
**Root Cause**: Race condition or use-after-free in TLS list management (class 0)
**Detection**: Box I successfully caught invalid pointer at exact crash point
### 5. Defensive Improvements
- Defensive memset in SuperSlab allocation (all metadata arrays)
- Enhanced pointer validation with pattern detection
- BOX_BOUNDARY markers throughout codebase (beautiful modular design)
- 5 metadata invariant checks in allocation/free/refill paths
## Integration Points
- Modified 13 files with Box I/E integration
- Added 10+ BOX_BOUNDARY markers
- 5 critical integrity check points in P0 refill path
## Test Results (100K iterations)
- Baseline: 7.22M ops/s
- Hotpath ON: 8.98M ops/s (+24% improvement ✓)
- P0 Bug: Still crashes at 28440 iterations (TLS SLL race condition)
- Root cause: Identified but not yet fixed (requires deeper investigation)
## Performance
- Box I overhead: Zero in release builds (HAKMEM_INTEGRITY_LEVEL=0)
- Debug builds: Full validation enabled (HAKMEM_INTEGRITY_LEVEL=4)
- Beautiful modular design maintains clean separation of concerns
## Known Issues
- P0 Bug at 28440 iterations: Race condition in TLS SLL cache (class 0)
- Cause: Use-after-free or race in remote free draining
- Next step: Valgrind investigation to pinpoint exact corruption location
## Code Quality
- Total new code: ~1400 lines (Box I + Box E + integrity system)
- Design: Beautiful Box Theory with clear boundaries
- Modularity: Complete separation of concerns
- Documentation: Comprehensive inline comments and BOX_BOUNDARY markers
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
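The pattern and range checks listed under "Comprehensive Integrity Checking System" could be approximated as shown below. This is only a sketch: the `demo_*` names are hypothetical, the 4096-byte null page is an assumption, and the user-space upper bound assumes a 64-bit platform with 47-bit user addresses such as x86-64 Linux.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when every byte of v equals one of the fill bytes named in the
 * commit message (0xa2, 0xcc, 0xdd, 0xfe). */
static bool demo_is_fill_pattern(uintptr_t v) {
    static const uint8_t patterns[] = { 0xa2, 0xcc, 0xdd, 0xfe };
    const uintptr_t ones = UINTPTR_MAX / 0xff; /* 0x0101...01 */
    for (size_t i = 0; i < sizeof patterns / sizeof patterns[0]; i++) {
        if (v == ones * patterns[i]) return true;
    }
    return false;
}

/* Reject null-page addresses, fill patterns and (on 64-bit targets)
 * addresses above the assumed user-space limit. */
static bool demo_ptr_plausible(const void* p) {
    uintptr_t v = (uintptr_t)p;
    if (v < 4096u) return false;              /* null page (assumed 4 KiB) */
    if (demo_is_fill_pattern(v)) return false;
#if UINTPTR_MAX > 0xffffffffu
    if (v >= (uintptr_t)0x0000800000000000u) return false; /* assumed limit */
#endif
    return true;
}

/* Debug-style guard: trap immediately on an implausible pointer. */
static void* demo_require_plausible(void* p) {
    assert(demo_ptr_plausible(p));
    return p;
}
```

A value such as `0xa2a2a2a2a2a2a2a2` read from a freelist slot fails the pattern check before it is ever dereferenced. This is the kind of early detection the commit message attributes to Box I.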
2025-11-12 02:45:00 +09:00
|
|
|
HAK_CHECK_CLASS_IDX(class_idx, "tls_sll_pop");
|
2025-11-14 01:05:30 +09:00
|
|
|
// Class mask gate: if disallowed, behave as empty
|
|
|
|
|
if (__builtin_expect(((g_tls_sll_class_mask & (1u << class_idx)) == 0), 0)) {
|
|
|
|
|
return false;
|
|
|
|
|
}
|
2025-11-12 02:45:00 +09:00
|
|
|
atomic_fetch_add(&g_integrity_check_class_bounds, 1);
|
|
|
|
|
|
2025-12-03 21:01:25 +09:00
|
|
|
// Defensive: ensure current head is sane before accessing it.
|
|
|
|
|
tls_sll_sanitize_head(class_idx, "pop_enter");
|
|
|
|
|
|
2025-12-01 16:37:59 +09:00
|
|
|
hak_base_ptr_t base = g_tls_sll[class_idx].head;
|
|
|
|
|
if (hak_base_is_null(base)) {
|
2025-11-14 01:02:00 +09:00
|
|
|
return false;
|
2025-11-10 16:48:20 +09:00
|
|
|
}
|
2025-12-01 16:37:59 +09:00
|
|
|
void* raw_base = HAK_BASE_TO_RAW(base);
|
2025-11-10 16:48:20 +09:00
|
|
|
|
2025-11-14 01:02:00 +09:00
|
|
|
// Sentinel guard: remote sentinel must never be in TLS SLL.
|
2025-12-01 16:37:59 +09:00
|
|
|
if (__builtin_expect((uintptr_t)raw_base == TINY_REMOTE_SENTINEL, 0)) {
|
2025-12-03 21:56:52 +09:00
|
|
|
tls_sll_set_head(class_idx, HAK_BASE_FROM_RAW(NULL), "pop_sentinel");
|
2025-11-20 07:32:30 +09:00
|
|
|
g_tls_sll[class_idx].count = 0;
|
2025-11-21 23:00:24 +09:00
|
|
|
tls_sll_record_writer(class_idx, "pop_sentinel_reset");
|
Add Box I (Integrity), Box E (Expansion), and comprehensive P0 debugging infrastructure
## Major Additions
### 1. Box I: Integrity Verification System (NEW - 703 lines)
- Files: core/box/integrity_box.h (267 lines), core/box/integrity_box.c (436 lines)
- Purpose: Unified integrity checking across all HAKMEM subsystems
- Features:
* 4-level integrity checking (0-4, compile-time controlled)
* Priority 1: TLS array bounds validation
* Priority 2: Freelist pointer validation
* Priority 3: TLS canary monitoring
* Priority ALPHA: Slab metadata invariant checking (5 invariants)
* Atomic statistics tracking (thread-safe)
* Beautiful BOX_BOUNDARY design pattern
### 2. Box E: SuperSlab Expansion System (COMPLETE)
- Files: core/box/superslab_expansion_box.h, core/box/superslab_expansion_box.c
- Purpose: Safe SuperSlab expansion with TLS state guarantee
- Features:
* Immediate slab 0 binding after expansion
* TLS state snapshot and restoration
* Design by Contract (pre/post-conditions, invariants)
* Thread-safe with mutex protection
### 3. Comprehensive Integrity Checking System
- File: core/hakmem_tiny_integrity.h (NEW)
- Unified validation functions for all allocator subsystems
- Uninitialized memory pattern detection (0xa2, 0xcc, 0xdd, 0xfe)
- Pointer range validation (null-page, kernel-space)
### 4. P0 Bug Investigation - Root Cause Identified
**Bug**: SEGV at iteration 28440 (deterministic with seed 42)
**Pattern**: 0xa2a2a2a2a2a2a2a2 (uninitialized/ASan poisoning)
**Location**: TLS SLL (Single-Linked List) cache layer
**Root Cause**: Race condition or use-after-free in TLS list management (class 0)
**Detection**: Box I successfully caught invalid pointer at exact crash point
### 5. Defensive Improvements
- Defensive memset in SuperSlab allocation (all metadata arrays)
- Enhanced pointer validation with pattern detection
- BOX_BOUNDARY markers throughout codebase (beautiful modular design)
- 5 metadata invariant checks in allocation/free/refill paths
## Integration Points
- Modified 13 files with Box I/E integration
- Added 10+ BOX_BOUNDARY markers
- 5 critical integrity check points in P0 refill path
## Test Results (100K iterations)
- Baseline: 7.22M ops/s
- Hotpath ON: 8.98M ops/s (+24% improvement ✓)
- P0 Bug: Still crashes at 28440 iterations (TLS SLL race condition)
- Root cause: Identified but not yet fixed (requires deeper investigation)
## Performance
- Box I overhead: Zero in release builds (HAKMEM_INTEGRITY_LEVEL=0)
- Debug builds: Full validation enabled (HAKMEM_INTEGRITY_LEVEL=4)
- Beautiful modular design maintains clean separation of concerns
## Known Issues
- P0 Bug at 28440 iterations: Race condition in TLS SLL cache (class 0)
- Cause: Use-after-free or race in remote free draining
- Next step: Valgrind investigation to pinpoint exact corruption location
## Code Quality
- Total new code: ~1400 lines (Box I + Box E + integrity system)
- Design: Beautiful Box Theory with clear boundaries
- Modularity: Complete separation of concerns
- Documentation: Comprehensive inline comments and BOX_BOUNDARY markers
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
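The pointer checks listed in the commit message above (fill-pattern detection for 0xa2, 0xcc, 0xdd, 0xfe, plus null-page and kernel-space range validation) can be sketched roughly as follows. This is a minimal illustration, not the actual Box I API: the `sketch_*` names are hypothetical, and the 4096 and 0x00007fffffffffff bounds are borrowed from the release-mode check later in this file, which assumes a typical 64-bit user address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build a pointer-sized word made of one repeated byte (e.g. 0xa2a2...a2). */
uintptr_t sketch_fill_word(uint8_t byte) {
    uintptr_t w = 0;
    for (size_t i = 0; i < sizeof(uintptr_t); i++) {
        w = (w << 8) | byte;
    }
    return w;
}

/* True if v is entirely one of the fill bytes named in the commit message. */
bool sketch_is_fill_word(uintptr_t v) {
    static const uint8_t fills[] = { 0xa2, 0xcc, 0xdd, 0xfe };
    for (size_t i = 0; i < sizeof(fills); i++) {
        if (v == sketch_fill_word(fills[i])) return true;
    }
    return false;
}

/* Hypothetical plausibility check: reject fill patterns, the NULL page,
 * and addresses above the assumed user-space limit. */
bool sketch_ptr_plausible(const void* p) {
    uintptr_t v = (uintptr_t)p;
    if (sketch_is_fill_word(v)) return false;                 /* poisoned/uninitialized */
    if (v < 4096) return false;                               /* NULL page */
    if ((uint64_t)v > 0x00007fffffffffffULL) return false;    /* assumed kernel space */
    return true;
}

/* Small usage demo: a live stack address passes, NULL does not. */
void sketch_ptr_demo(void) {
    int local = 0;
    assert(sketch_ptr_plausible(&local));
    assert(!sketch_ptr_plausible(NULL));
}
```

A check of this shape only says that a value could be a pointer; it cannot prove the pointer belongs to the allocator, which is presumably why the real code pairs it with header and canary validation.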
2025-11-12 02:45:00 +09:00
|
|
|
#if !HAKMEM_BUILD_RELEASE
|
2025-11-14 01:02:00 +09:00
|
|
|
fprintf(stderr,
|
|
|
|
|
"[TLS_SLL_POP] Remote sentinel detected at head; SLL reset (cls=%d)\n",
|
|
|
|
|
class_idx);
|
2025-11-10 18:04:08 +09:00
|
|
|
#endif
|
Front-Direct implementation: SS→FC direct refill + SLL complete bypass
## Summary
Implemented Front-Direct architecture with complete SLL bypass:
- Direct SuperSlab → FastCache refill (1-hop, bypasses SLL)
- SLL-free allocation/free paths when Front-Direct enabled
- Legacy path sealing (SLL inline opt-in, SFC cascade ENV-only)
## New Modules
- core/refill/ss_refill_fc.h (236 lines): Standard SS→FC refill entry point
- Remote drain → Freelist → Carve priority
- Header restoration for C1-C6 (NOT C0/C7)
- ENV: HAKMEM_TINY_P0_DRAIN_THRESH, HAKMEM_TINY_P0_NO_DRAIN
- core/front/fast_cache.h: FastCache (L1) type definition
- core/front/quick_slot.h: QuickSlot (L0) type definition
## Allocation Path (core/tiny_alloc_fast.inc.h)
- Added s_front_direct_alloc TLS flag (lazy ENV check)
- SLL pop guarded by: g_tls_sll_enable && !s_front_direct_alloc
- Refill dispatch:
- Front-Direct: ss_refill_fc_fill() → fastcache_pop() (1-hop)
- Legacy: sll_refill_batch_from_ss() → SLL → FC (2-hop, A/B only)
- SLL inline pop sealed (requires HAKMEM_TINY_INLINE_SLL=1 opt-in)
## Free Path (core/hakmem_tiny_free.inc, core/hakmem_tiny_fastcache.inc.h)
- FC priority: Try fastcache_push() first (same-thread free)
- tiny_fast_push() bypass: Returns 0 when s_front_direct_free || !g_tls_sll_enable
- Fallback: Magazine/slow path (safe, bypasses SLL)
## Legacy Sealing
- SFC cascade: Default OFF (ENV-only via HAKMEM_TINY_SFC_CASCADE=1)
- Deleted: core/hakmem_tiny_free.inc.bak, core/pool_refill_legacy.c.bak
- Documentation: ss_refill_fc_fill() promoted as CANONICAL refill entry
## ENV Controls
- HAKMEM_TINY_FRONT_DIRECT=1: Enable Front-Direct (SS→FC direct)
- HAKMEM_TINY_P0_DIRECT_FC_ALL=1: Same as above (alt name)
- HAKMEM_TINY_REFILL_BATCH=1: Enable batch refill (also enables Front-Direct)
- HAKMEM_TINY_SFC_CASCADE=1: Enable SFC cascade (default OFF)
- HAKMEM_TINY_INLINE_SLL=1: Enable inline SLL pop (default OFF, requires AGGRESSIVE_INLINE)
## Benchmarks (Front-Direct Enabled)
```bash
ENV: HAKMEM_BENCH_FAST_FRONT=1 HAKMEM_TINY_FRONT_DIRECT=1
HAKMEM_TINY_REFILL_BATCH=1 HAKMEM_TINY_P0_DIRECT_FC_ALL=1
HAKMEM_TINY_REFILL_COUNT_HOT=256 HAKMEM_TINY_REFILL_COUNT_MID=96
HAKMEM_TINY_BUMP_CHUNK=256
bench_random_mixed (16-1040B random, 200K iter):
256 slots: 1.44M ops/s (STABLE, 0 SEGV)
128 slots: 1.44M ops/s (STABLE, 0 SEGV)
bench_fixed_size (fixed size, 200K iter):
256B: 4.06M ops/s (has debug logs, expected >10M without logs)
128B: Similar (also affected by debug logs)
```
## Verification
- TRACE_RING test (10K iter): **0 SLL events** detected ✅
- Complete SLL bypass confirmed when Front-Direct=1
- Stable execution: 200K iterations × multiple sizes, 0 SEGV
## Next Steps
- Disable debug logs in hak_alloc_api.inc.h (call_num 14250-14280 range)
- Re-benchmark with clean Release build (target: 10-15M ops/s)
- 128/256B shortcut path optimization (FC hit rate improvement)
Co-Authored-By: ChatGPT <chatgpt@openai.com>
Suggested-By: ultrathink
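The "lazy ENV check" mentioned for the s_front_direct_alloc TLS flag could look something like the sketch below. It is an assumption-laden illustration: the function names are hypothetical, and the truthiness rule (set, non-empty, not starting with '0') is copied from the HAKMEM_TINY_SLL_RING check in this file rather than confirmed for HAKMEM_TINY_FRONT_DIRECT.

```c
#include <assert.h>
#include <stdlib.h>

/* Truthiness rule assumed from the HAKMEM_TINY_SLL_RING check in this file:
 * the variable is set, non-empty, and does not start with '0'. */
int sketch_env_truthy(const char* v) {
    return (v && *v && *v != '0') ? 1 : 0;
}

/* Lazy per-thread flag: -1 means "not read yet"; afterwards the cached
 * 0/1 value is returned without calling getenv() again. */
int sketch_front_direct_enabled(void) {
    static _Thread_local int cached = -1;
    if (cached == -1) {
        cached = sketch_env_truthy(getenv("HAKMEM_TINY_FRONT_DIRECT"));
    }
    assert(cached == 0 || cached == 1);
    return cached;
}
```

Caching the result keeps getenv() off the hot path: after the first call per thread, the check costs one load and one compare. The trade-off is that changing the environment variable after that first call has no effect on that thread.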
2025-11-14 05:41:49 +09:00
|
|
|
{
|
|
|
|
|
static int g_sll_ring_en = -1;
|
|
|
|
|
if (__builtin_expect(g_sll_ring_en == -1, 0)) {
|
|
|
|
|
const char* r = getenv("HAKMEM_TINY_SLL_RING");
|
|
|
|
|
g_sll_ring_en = (r && *r && *r != '0') ? 1 : 0;
|
|
|
|
|
}
|
|
|
|
|
if (__builtin_expect(g_sll_ring_en, 0)) {
|
2025-12-01 16:37:59 +09:00
|
|
|
tiny_debug_ring_record(0x7F11 /*TLS_SLL_SENTINEL*/, (uint16_t)class_idx, raw_base, 0);
|
2025-11-14 05:41:49 +09:00
|
|
|
}
|
|
|
|
|
}
|
2025-11-14 01:02:00 +09:00
|
|
|
return false;
|
|
|
|
|
}
|
2025-11-12 02:45:00 +09:00
|
|
|
|
|
|
|
|
#if !HAKMEM_BUILD_RELEASE
|
2025-12-01 16:37:59 +09:00
|
|
|
if (!validate_ptr_range(raw_base, "tls_sll_pop_base")) {
|
2025-11-14 01:02:00 +09:00
|
|
|
fprintf(stderr,
|
|
|
|
|
"[TLS_SLL_POP] FATAL invalid BASE ptr cls=%d base=%p\n",
|
2025-12-01 16:37:59 +09:00
|
|
|
class_idx, raw_base);
|
2025-11-12 02:45:00 +09:00
|
|
|
abort();
|
|
|
|
|
}
|
2025-11-21 23:00:24 +09:00
|
|
|
#else
|
|
|
|
|
// Fail-fast even in release: drop malformed TLS head to avoid SEGV on bad base.
|
2025-12-01 16:37:59 +09:00
|
|
|
uintptr_t base_addr = (uintptr_t)raw_base;
|
2025-11-21 23:00:24 +09:00
|
|
|
if (base_addr < 4096 || base_addr > 0x00007fffffffffffULL) {
|
|
|
|
|
extern _Atomic uint64_t g_tls_sll_invalid_head[];
|
|
|
|
|
uint64_t cnt = atomic_fetch_add_explicit(&g_tls_sll_invalid_head[class_idx], 1, memory_order_relaxed);
|
|
|
|
|
static __thread uint8_t s_log_limit[TINY_NUM_CLASSES] = {0};
|
|
|
|
|
if (s_log_limit[class_idx] < 4) {
|
2025-12-03 12:43:02 +09:00
|
|
|
fprintf(stderr, "[TLS_SLL_POP_INVALID] cls=%d head=%p (val=%llx) dropped count=%llu\n",
|
|
|
|
|
class_idx, raw_base, (unsigned long long)base_addr, (unsigned long long)cnt + 1);
|
2025-11-21 23:00:24 +09:00
|
|
|
s_log_limit[class_idx]++;
|
2025-12-03 13:28:44 +09:00
|
|
|
tls_sll_dump_tls_window(class_idx, "invalid_head"); // Added dump
|
2025-11-21 23:00:24 +09:00
|
|
|
}
|
|
|
|
|
// Help triage: show last successful push base for this thread/class
|
2025-12-01 16:37:59 +09:00
|
|
|
if (!hak_base_is_null(s_tls_sll_last_push[class_idx]) && s_log_limit[class_idx] <= 4) {
|
2025-11-21 23:00:24 +09:00
|
|
|
fprintf(stderr, "[TLS_SLL_POP_INVALID] cls=%d last_push=%p\n",
|
2025-12-01 16:37:59 +09:00
|
|
|
class_idx, HAK_BASE_TO_RAW(s_tls_sll_last_push[class_idx]));
|
2025-11-21 23:00:24 +09:00
|
|
|
}
|
|
|
|
|
tls_sll_dump_tls_window(class_idx, "head_range");
|
2025-12-03 21:56:52 +09:00
|
|
|
tls_sll_set_head(class_idx, HAK_BASE_FROM_RAW(NULL), "pop_invalid_head");
|
2025-11-21 23:00:24 +09:00
|
|
|
g_tls_sll[class_idx].count = 0;
|
|
|
|
|
return false;
|
|
|
|
|
}
|
2025-11-12 02:45:00 +09:00
|
|
|
#endif
|
|
|
|
|
|
2025-11-21 23:00:24 +09:00
|
|
|
// Optional high-frequency canary check for target classes (e.g., 4/6)
|
|
|
|
|
static int s_canary_fast = -1;
|
|
|
|
|
if (__builtin_expect(s_canary_fast == -1, 0)) {
|
|
|
|
|
const char* e = getenv("HAKMEM_TINY_SLL_CANARY_FAST");
|
|
|
|
|
s_canary_fast = (e && *e && *e != '0') ? 1 : 0;
|
|
|
|
|
}
|
|
|
|
|
if (__builtin_expect(s_canary_fast && (class_idx == 4 || class_idx == 6), 0)) {
|
|
|
|
|
extern _Atomic uint64_t g_tls_sll_pop_counter[];
|
|
|
|
|
uint64_t pc = atomic_fetch_add_explicit(&g_tls_sll_pop_counter[class_idx], 1, memory_order_relaxed) + 1;
|
|
|
|
|
periodic_canary_check(pc, class_idx == 4 ? "tls_sll_pop_cls4" : "tls_sll_pop_cls6");
|
|
|
|
|
}
|
|
|
|
|
|
2025-11-10 23:41:53 +09:00
|
|
|
tls_sll_debug_guard(class_idx, base, "pop");
|
Fix #16: Resolve double BASE→USER conversion causing header corruption
🎯 ROOT CAUSE: Internal allocation helpers were converting BASE → USER
pointers before returning to the caller. The caller then applied
HAK_RET_ALLOC/tiny_region_id_write_header, which performed ANOTHER BASE→USER
conversion. The result was a double offset (BASE+2) and a header written at
the wrong location.
📦 BOX THEORY SOLUTION: Establish clean pointer conversion boundary at
tiny_region_id_write_header, making it the single source of truth for
BASE → USER conversion.
🔧 CHANGES:
- Fix #16: Remove premature BASE→USER conversions (6 locations)
* core/tiny_alloc_fast.inc.h (3 fixes)
* core/hakmem_tiny_refill.inc.h (2 fixes)
* core/hakmem_tiny_fastcache.inc.h (1 fix)
- Fix #12: Add header validation in tls_sll_pop (detect corruption)
- Fix #14: Defense-in-depth header restoration in tls_sll_splice
- Fix #15: USER pointer detection (for debugging)
- Fix #13: Bump window header restoration
- Fix #2, #6, #7, #8: Various header restoration & NULL termination
🧪 TEST RESULTS: 100% SUCCESS
- 10K-500K iterations: All passed
- 8 seeds × 100K: All passed (42,123,456,789,999,314,271,161)
- Performance: ~630K ops/s average (stable)
- Header corruption: ZERO
📋 FIXES SUMMARY:
Fix #1-8: Initial header restoration & chain fixes (chatgpt-san)
Fix #9-10: USER pointer auto-fix (later disabled)
Fix #12: Validation system (caught corruption at call 14209)
Fix #13: Bump window header writes
Fix #14: Splice defense-in-depth
Fix #15: USER pointer detection (debugging tool)
Fix #16: Double conversion fix (FINAL SOLUTION) ✅
🎓 LESSONS LEARNED:
1. Validation catches bugs early (Fix #12 was critical)
2. Class-specific inline logging reveals patterns (Option C)
3. Box Theory provides clean architectural boundaries
4. Multiple investigation approaches (Task/chatgpt-san collaboration)
📄 DOCUMENTATION:
- P0_BUG_STATUS.md: Complete bug tracking timeline
- C2_CORRUPTION_ROOT_CAUSE_FINAL.md: Detailed root cause analysis
- FINAL_ANALYSIS_C2_CORRUPTION.md: Investigation methodology
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Task Agent <task@anthropic.com>
Co-Authored-By: ChatGPT <chatgpt@openai.com>
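The double-conversion bug described in the commit message above can be reproduced in miniature. The sketch below assumes the 1-byte header stated in this file's invariants; the `sketch_*` names and the header encoding (0xA0 | class) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SKETCH_HEADER_BYTES = 1 };  /* 1-byte header, per the file's invariants */

/* Hypothetical header encoding; the real one is not shown in this section. */
uint8_t sketch_header_for(int cls) {
    return (uint8_t)(0xA0 | (cls & 0x0F));
}

/* Internal helper after the fix: hands back BASE untouched. */
void* sketch_pop_base(uint8_t* block) {
    return block;
}

/* The single conversion boundary: write the header at BASE and
 * return USER = BASE + 1. Callers must pass a BASE pointer. */
void* sketch_write_header(void* base, int cls) {
    assert(base != NULL);
    uint8_t* b = (uint8_t*)base;
    b[0] = sketch_header_for(cls);
    return b + SKETCH_HEADER_BYTES;
}
```

With this split, `sketch_write_header(sketch_pop_base(block), cls)` returns block + 1 with the header at block[0]. If a helper converts early and passes block + 1 instead, the header lands at block[1] and the caller receives block + 2, which is the BASE+2 symptom named in the commit message.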
2025-11-12 10:33:57 +09:00
|
|
|
|
2025-11-29 07:57:49 +09:00
|
|
|
// Header validation using Header Box (C1-C6 only; C0/C7 skip)
|
2025-12-03 12:11:27 +09:00
|
|
|
#if !HAKMEM_TINY_HEADERLESS
|
2025-11-29 07:57:49 +09:00
|
|
|
if (tiny_class_preserves_header(class_idx)) {
|
|
|
|
|
uint8_t got, expect;
|
2025-12-01 16:37:59 +09:00
|
|
|
PTR_TRACK_TLS_POP(raw_base, class_idx);
|
|
|
|
|
bool valid = tiny_header_validate(raw_base, class_idx, &got, &expect);
|
|
|
|
|
PTR_TRACK_HEADER_READ(raw_base, got);
|
2025-11-29 07:57:49 +09:00
|
|
|
if (__builtin_expect(!valid, 0)) {
|
2025-11-14 01:02:00 +09:00
|
|
|
#if !HAKMEM_BUILD_RELEASE
|
|
|
|
|
fprintf(stderr,
|
|
|
|
|
"[TLS_SLL_POP] CORRUPTED HEADER cls=%d base=%p got=0x%02x expect=0x%02x\n",
|
2025-12-01 16:37:59 +09:00
|
|
|
class_idx, raw_base, got, expect);
|
2025-11-14 01:02:00 +09:00
|
|
|
ptr_trace_dump_now("header_corruption");
|
|
|
|
|
abort();
|
|
|
|
|
#else
|
|
|
|
|
// In release, fail-safe: drop list.
|
2025-11-21 23:00:24 +09:00
|
|
|
// PERF DEBUG: Count header corruption resets
|
|
|
|
|
static _Atomic uint64_t g_hdr_reset_count = 0;
|
|
|
|
|
uint64_t cnt = atomic_fetch_add_explicit(&g_hdr_reset_count, 1, memory_order_relaxed);
|
2025-12-03 20:42:28 +09:00
|
|
|
// Narrow diagnostics for early shots to root-cause corruption.
|
|
|
|
|
static _Atomic uint32_t g_hdr_reset_diag = 0;
|
|
|
|
|
uint32_t shot = atomic_fetch_add_explicit(&g_hdr_reset_diag, 1, memory_order_relaxed);
|
|
|
|
|
if (shot < 8) {
|
2025-12-03 21:01:25 +09:00
|
|
|
// Extra diagnostics: dump raw next pointers at offsets 0 and tiny_next_off()
|
|
|
|
|
uintptr_t next_raw_off0 = 0;
|
|
|
|
|
uintptr_t next_raw_off1 = 0;
|
|
|
|
|
size_t next_off = tiny_next_off(class_idx);
|
|
|
|
|
memcpy(&next_raw_off0, raw_base, sizeof(next_raw_off0));
|
|
|
|
|
memcpy(&next_raw_off1, (uint8_t*)raw_base + next_off, sizeof(next_raw_off1));
|
|
|
|
|
uint8_t dump8[8] = {0};
|
|
|
|
|
memcpy(dump8, raw_base, sizeof(dump8));
|
|
|
|
|
|
2025-12-03 20:42:28 +09:00
|
|
|
SuperSlab* ss_diag = hak_super_lookup(raw_base);
|
|
|
|
|
int slab_idx = ss_diag ? slab_index_for(ss_diag, raw_base) : -1;
|
|
|
|
|
uint8_t meta_cls = 0xff;
|
|
|
|
|
if (ss_diag && slab_idx >= 0 && slab_idx < ss_slabs_capacity(ss_diag)) {
|
|
|
|
|
meta_cls = ss_diag->slabs[slab_idx].class_idx;
|
|
|
|
|
}
|
|
|
|
|
void* raw_next_diag = NULL;
|
|
|
|
|
PTR_NEXT_READ("tls_hdr_reset_diag", class_idx, raw_base, 0, raw_next_diag);
|
|
|
|
|
fprintf(stderr,
|
|
|
|
|
"[TLS_SLL_HDR_RESET] shot=%u cls=%d base=%p got=0x%02x expect=0x%02x "
|
2025-12-03 21:01:25 +09:00
|
|
|
"next=%p meta_cls=%u slab_idx=%d last_writer=%s last_push=%p count=%llu "
|
|
|
|
|
"next_off=%zu next_raw0=%p next_raw1=%p bytes=%02x%02x%02x%02x%02x%02x%02x%02x\n",
|
2025-12-03 20:42:28 +09:00
|
|
|
shot + 1,
|
|
|
|
|
class_idx,
|
|
|
|
|
raw_base,
|
|
|
|
|
got,
|
|
|
|
|
expect,
|
|
|
|
|
raw_next_diag,
|
|
|
|
|
(unsigned)meta_cls,
|
|
|
|
|
slab_idx,
|
|
|
|
|
g_tls_sll_last_writer[class_idx] ? g_tls_sll_last_writer[class_idx] : "(null)",
|
|
|
|
|
HAK_BASE_TO_RAW(s_tls_sll_last_push[class_idx]),
|
2025-12-03 21:01:25 +09:00
|
|
|
(unsigned long long)cnt,
|
|
|
|
|
next_off,
|
|
|
|
|
(void*)next_raw_off0,
|
|
|
|
|
(void*)next_raw_off1,
|
|
|
|
|
dump8[0], dump8[1], dump8[2], dump8[3],
|
|
|
|
|
dump8[4], dump8[5], dump8[6], dump8[7]);
|
2025-12-03 20:42:28 +09:00
|
|
|
} else if (cnt % 10000 == 0) {
|
2025-11-21 23:00:24 +09:00
|
|
|
fprintf(stderr, "[TLS_SLL_HDR_RESET] cls=%d base=%p got=0x%02x expect=0x%02x count=%llu\n",
|
2025-12-01 16:37:59 +09:00
|
|
|
class_idx, raw_base, got, expect, (unsigned long long)cnt);
|
2025-11-21 23:00:24 +09:00
|
|
|
}
|
2025-12-03 21:56:52 +09:00
|
|
|
tls_sll_set_head(class_idx, HAK_BASE_FROM_RAW(NULL), "header_reset");
|
2025-11-20 07:32:30 +09:00
|
|
|
g_tls_sll[class_idx].count = 0;
|
2025-11-14 05:41:49 +09:00
|
|
|
{
|
|
|
|
|
static int g_sll_ring_en = -1;
|
|
|
|
|
if (__builtin_expect(g_sll_ring_en == -1, 0)) {
|
|
|
|
|
const char* r = getenv("HAKMEM_TINY_SLL_RING");
|
|
|
|
|
g_sll_ring_en = (r && *r && *r != '0') ? 1 : 0;
|
|
|
|
|
}
|
|
|
|
|
if (__builtin_expect(g_sll_ring_en, 0)) {
|
|
|
|
|
// aux encodes: high 8 bits = got, low 8 bits = expect
|
|
|
|
|
uintptr_t aux = ((uintptr_t)got << 8) | (uintptr_t)expect;
|
2025-12-01 16:37:59 +09:00
|
|
|
tiny_debug_ring_record(0x7F12 /*TLS_SLL_HDR_CORRUPT*/, (uint16_t)class_idx, raw_base, aux);
|
Front-Direct implementation: SS→FC direct refill + SLL complete bypass
## Summary
Implemented Front-Direct architecture with complete SLL bypass:
- Direct SuperSlab → FastCache refill (1-hop, bypasses SLL)
- SLL-free allocation/free paths when Front-Direct enabled
- Legacy path sealing (SLL inline opt-in, SFC cascade ENV-only)
## New Modules
- core/refill/ss_refill_fc.h (236 lines): Standard SS→FC refill entry point
- Remote drain → Freelist → Carve priority
- Header restoration for C1-C6 (NOT C0/C7)
- ENV: HAKMEM_TINY_P0_DRAIN_THRESH, HAKMEM_TINY_P0_NO_DRAIN
- core/front/fast_cache.h: FastCache (L1) type definition
- core/front/quick_slot.h: QuickSlot (L0) type definition
## Allocation Path (core/tiny_alloc_fast.inc.h)
- Added s_front_direct_alloc TLS flag (lazy ENV check)
- SLL pop guarded by: g_tls_sll_enable && !s_front_direct_alloc
- Refill dispatch:
- Front-Direct: ss_refill_fc_fill() → fastcache_pop() (1-hop)
- Legacy: sll_refill_batch_from_ss() → SLL → FC (2-hop, A/B only)
- SLL inline pop sealed (requires HAKMEM_TINY_INLINE_SLL=1 opt-in)
## Free Path (core/hakmem_tiny_free.inc, core/hakmem_tiny_fastcache.inc.h)
- FC priority: Try fastcache_push() first (same-thread free)
- tiny_fast_push() bypass: Returns 0 when s_front_direct_free || !g_tls_sll_enable
- Fallback: Magazine/slow path (safe, bypasses SLL)
## Legacy Sealing
- SFC cascade: Default OFF (ENV-only via HAKMEM_TINY_SFC_CASCADE=1)
- Deleted: core/hakmem_tiny_free.inc.bak, core/pool_refill_legacy.c.bak
- Documentation: ss_refill_fc_fill() promoted as CANONICAL refill entry
## ENV Controls
- HAKMEM_TINY_FRONT_DIRECT=1: Enable Front-Direct (SS→FC direct)
- HAKMEM_TINY_P0_DIRECT_FC_ALL=1: Same as above (alt name)
- HAKMEM_TINY_REFILL_BATCH=1: Enable batch refill (also enables Front-Direct)
- HAKMEM_TINY_SFC_CASCADE=1: Enable SFC cascade (default OFF)
- HAKMEM_TINY_INLINE_SLL=1: Enable inline SLL pop (default OFF, requires AGGRESSIVE_INLINE)
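The lazy ENV check behind these controls can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `env_value_on`, `front_direct_enabled` and `s_front_direct` are hypothetical names, and the rule that any of the three variables enables Front-Direct is taken from the list above. The "set, non-empty, not starting with '0'" test mirrors the pattern used for other ENV gates in this file.

```c
#include <assert.h>
#include <stdlib.h>

/* A value counts as "on" when it is set, non-empty, and does not start with '0'. */
static int env_value_on(const char* v) {
    return (v && *v && *v != '0') ? 1 : 0;
}

/* Per-thread cache: -1 = not evaluated yet, 0 = off, 1 = on. */
static _Thread_local int s_front_direct = -1;

/* Hypothetical helper: any of the three variables enables Front-Direct. */
static int front_direct_enabled(void) {
    if (s_front_direct == -1) {
        /* getenv() runs only on the first call in each thread. */
        s_front_direct = env_value_on(getenv("HAKMEM_TINY_FRONT_DIRECT")) ||
                         env_value_on(getenv("HAKMEM_TINY_P0_DIRECT_FC_ALL")) ||
                         env_value_on(getenv("HAKMEM_TINY_REFILL_BATCH"));
    }
    assert(s_front_direct == 0 || s_front_direct == 1);
    return s_front_direct;
}
```

Because the result is cached per thread, changing the environment after the first call has no effect on that thread, which is usually what a hot-path gate wants.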
## Benchmarks (Front-Direct Enabled)
```bash
ENV: HAKMEM_BENCH_FAST_FRONT=1 HAKMEM_TINY_FRONT_DIRECT=1
HAKMEM_TINY_REFILL_BATCH=1 HAKMEM_TINY_P0_DIRECT_FC_ALL=1
HAKMEM_TINY_REFILL_COUNT_HOT=256 HAKMEM_TINY_REFILL_COUNT_MID=96
HAKMEM_TINY_BUMP_CHUNK=256
bench_random_mixed (16-1040B random, 200K iter):
256 slots: 1.44M ops/s (STABLE, 0 SEGV)
128 slots: 1.44M ops/s (STABLE, 0 SEGV)
bench_fixed_size (fixed size, 200K iter):
256B: 4.06M ops/s (has debug logs, expected >10M without logs)
128B: Similar (debug logs affect)
```
## Verification
- TRACE_RING test (10K iter): **0 SLL events** detected ✅
- Complete SLL bypass confirmed when Front-Direct=1
- Stable execution: 200K iterations × multiple sizes, 0 SEGV
## Next Steps
- Disable debug logs in hak_alloc_api.inc.h (call_num 14250-14280 range)
- Re-benchmark with clean Release build (target: 10-15M ops/s)
- 128/256B shortcut path optimization (FC hit rate improvement)
Co-Authored-By: ChatGPT <chatgpt@openai.com>
Suggested-By: ultrathink
2025-11-14 05:41:49 +09:00
                }
            }
2025-11-14 01:02:00 +09:00
            return false;
Fix #16: Resolve double BASE→USER conversion causing header corruption
🎯 ROOT CAUSE: Internal allocation helpers were prematurely converting
BASE → USER pointers before returning to caller. The caller then applied
HAK_RET_ALLOC/tiny_region_id_write_header which performed ANOTHER BASE→USER
conversion, resulting in double offset (BASE+2) and header written at
wrong location.
📦 BOX THEORY SOLUTION: Establish clean pointer conversion boundary at
tiny_region_id_write_header, making it the single source of truth for
BASE → USER conversion.
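The double conversion can be reproduced in miniature. The sketch below assumes a 1-byte header in front of the user data; the function names and the `0xA0 | class` header encoding are hypothetical and only stand in for the real helpers and tiny_region_id_write_header.

```c
#include <assert.h>
#include <stdint.h>

enum { HEADER_BYTES = 1 }; /* assumption: 1-byte header in front of user data */

/* Single conversion boundary: write the header at BASE, hand out USER = BASE + 1.
 * The 0xA0 | class encoding is hypothetical. */
static uint8_t* write_header_to_user(uint8_t* base, uint8_t class_idx) {
    assert(base != 0);
    base[0] = (uint8_t)(0xA0u | (class_idx & 0x0Fu));
    return base + HEADER_BYTES;
}

/* Pre-fix helper: converts BASE to USER too early. */
static uint8_t* helper_premature(uint8_t* base) { return base + HEADER_BYTES; }

/* Post-fix helper: returns BASE and leaves the conversion to the boundary. */
static uint8_t* helper_base_only(uint8_t* base) { return base; }
```

Feeding `helper_premature()` into `write_header_to_user()` yields BASE+2 with the header byte stored at BASE+1, while `helper_base_only()` yields BASE+1 with the header at BASE.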
🔧 CHANGES:
- Fix #16: Remove premature BASE→USER conversions (6 locations)
* core/tiny_alloc_fast.inc.h (3 fixes)
* core/hakmem_tiny_refill.inc.h (2 fixes)
* core/hakmem_tiny_fastcache.inc.h (1 fix)
- Fix #12: Add header validation in tls_sll_pop (detect corruption)
- Fix #14: Defense-in-depth header restoration in tls_sll_splice
- Fix #15: USER pointer detection (for debugging)
- Fix #13: Bump window header restoration
- Fix #2, #6, #7, #8: Various header restoration & NULL termination
🧪 TEST RESULTS: 100% SUCCESS
- 10K-500K iterations: All passed
- 8 seeds × 100K: All passed (42,123,456,789,999,314,271,161)
- Performance: ~630K ops/s average (stable)
- Header corruption: ZERO
📋 FIXES SUMMARY:
Fix #1-8: Initial header restoration & chain fixes (chatgpt-san)
Fix #9-10: USER pointer auto-fix (later disabled)
Fix #12: Validation system (caught corruption at call 14209)
Fix #13: Bump window header writes
Fix #14: Splice defense-in-depth
Fix #15: USER pointer detection (debugging tool)
Fix #16: Double conversion fix (FINAL SOLUTION) ✅
🎓 LESSONS LEARNED:
1. Validation catches bugs early (Fix #12 was critical)
2. Class-specific inline logging reveals patterns (Option C)
3. Box Theory provides clean architectural boundaries
4. Multiple investigation approaches (Task/chatgpt-san collaboration)
📄 DOCUMENTATION:
- P0_BUG_STATUS.md: Complete bug tracking timeline
- C2_CORRUPTION_ROOT_CAUSE_FINAL.md: Detailed root cause analysis
- FINAL_ANALYSIS_C2_CORRUPTION.md: Investigation methodology
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Task Agent <task@anthropic.com>
Co-Authored-By: ChatGPT <chatgpt@openai.com>
2025-11-12 10:33:57 +09:00
#endif
2025-11-14 01:02:00 +09:00
        }
2025-11-12 10:33:57 +09:00
    }
2025-12-03 12:11:27 +09:00
#endif
2025-11-12 10:33:57 +09:00
2025-11-14 01:02:00 +09:00
    // Read next via Box API.
2025-12-01 16:37:59 +09:00
    void* raw_next;
    PTR_NEXT_READ("tls_pop", class_idx, raw_base, 0, raw_next);
    hak_base_ptr_t next = HAK_BASE_FROM_RAW(raw_next);
2025-11-21 23:00:24 +09:00
    tls_sll_diag_next(class_idx, base, next, "pop_next");
2025-11-12 10:33:57 +09:00
2025-12-04 04:58:22 +09:00
    // Optional: misalignment guard to catch next-pointer skew caused by mixing BASE and USER pointers (for triage)
    do {
        static int g_misalign_guard = -1;
        if (__builtin_expect(g_misalign_guard == -1, 0)) {
            const char* e = getenv("HAKMEM_TINY_SLL_MISALIGN_GUARD");
            g_misalign_guard = (e && *e && *e != '0') ? 1 : 0;
        }
        if (!__builtin_expect(g_misalign_guard, 0)) break;
        if (hak_base_is_null(next)) break;
        extern const size_t g_tiny_class_sizes[];
        size_t stride = (class_idx >= 0 && class_idx < TINY_NUM_CLASSES)
                            ? g_tiny_class_sizes[class_idx]
                            : 0;
        if (stride == 0) break;
        uintptr_t next_addr = (uintptr_t)raw_next;
        if ((next_addr % stride) != 0) {
            static _Atomic uint32_t g_misalign_shot = 0;
            uint32_t shot = atomic_fetch_add_explicit(&g_misalign_shot, 1, memory_order_relaxed);
            if (shot < 8) {
                fprintf(stderr,
                        "[TLS_SLL_POP_MISALIGNED_NEXT] shot=%u cls=%d base=%p next=%p stride=%zu where=%s last_writer=%s\n",
                        shot + 1,
                        class_idx,
                        raw_base,
                        raw_next,
                        stride,
                        where ? where : "(null)",
                        g_tls_sll_last_writer[class_idx] ? g_tls_sll_last_writer[class_idx] : "(null)");
            }
            // Drop the list defensively: next is corrupted, so discard everything including head
            tls_sll_set_head(class_idx, HAK_BASE_FROM_RAW(NULL), "pop_next_misaligned");
            g_tls_sll[class_idx].count = 0;
            return false;
        }
    } while (0);
2025-12-03 21:56:52 +09:00
    // Validate next pointer before installing as new head.
    if (!hak_base_is_null(next)) {
        SuperSlab* next_ss = hak_super_lookup(raw_next);
        int next_cap = next_ss ? ss_slabs_capacity(next_ss) : 0;
        int next_idx = (next_ss && next_ss->magic == SUPERSLAB_MAGIC) ? slab_index_for(next_ss, raw_next) : -1;
        uint8_t next_meta_cls = (next_idx >= 0 && next_idx < next_cap) ? next_ss->slabs[next_idx].class_idx : 0xff;
        if (!next_ss || next_ss->magic != SUPERSLAB_MAGIC || next_idx < 0 || next_idx >= next_cap || next_meta_cls != (uint8_t)class_idx) {
            static _Atomic uint32_t g_next_invalid = 0;
            uint32_t shot = atomic_fetch_add_explicit(&g_next_invalid, 1, memory_order_relaxed);
            if (shot < 8) {
                fprintf(stderr,
                        "[TLS_SLL_NEXT_INVALID] cls=%d next=%p meta_cls=%u idx=%d ss=%p from_base=%p head=%p last_writer=%s\n",
                        class_idx,
                        raw_next,
                        (unsigned)next_meta_cls,
                        next_idx,
                        (void*)next_ss,
                        raw_base,
                        HAK_BASE_TO_RAW(g_tls_sll[class_idx].head),
                        g_tls_sll_last_writer[class_idx] ? g_tls_sll_last_writer[class_idx] : "(null)");
            }
            // Drop remainder of list to avoid chasing stale pointers.
            next = HAK_BASE_FROM_RAW(NULL);
            tls_sll_set_head(class_idx, next, "pop_next_invalid");
            g_tls_sll[class_idx].count = 0;
        }
    }
Add Box I (Integrity), Box E (Expansion), and comprehensive P0 debugging infrastructure
## Major Additions
### 1. Box I: Integrity Verification System (NEW - 703 lines)
- Files: core/box/integrity_box.h (267 lines), core/box/integrity_box.c (436 lines)
- Purpose: Unified integrity checking across all HAKMEM subsystems
- Features:
* 4-level integrity checking (0-4, compile-time controlled)
* Priority 1: TLS array bounds validation
* Priority 2: Freelist pointer validation
* Priority 3: TLS canary monitoring
* Priority ALPHA: Slab metadata invariant checking (5 invariants)
* Atomic statistics tracking (thread-safe)
* Beautiful BOX_BOUNDARY design pattern
### 2. Box E: SuperSlab Expansion System (COMPLETE)
- Files: core/box/superslab_expansion_box.h, core/box/superslab_expansion_box.c
- Purpose: Safe SuperSlab expansion with TLS state guarantee
- Features:
* Immediate slab 0 binding after expansion
* TLS state snapshot and restoration
* Design by Contract (pre/post-conditions, invariants)
* Thread-safe with mutex protection
### 3. Comprehensive Integrity Checking System
- File: core/hakmem_tiny_integrity.h (NEW)
- Unified validation functions for all allocator subsystems
- Uninitialized memory pattern detection (0xa2, 0xcc, 0xdd, 0xfe)
- Pointer range validation (null-page, kernel-space)
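A rough sketch of the two checks listed above, pattern detection and range validation, might look like this. The function names are hypothetical, and the range limits (a 4096-byte null page and an x86-64 user-space ceiling) are assumptions for illustration rather than values taken from hakmem_tiny_integrity.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build a pointer-sized word in which every byte equals b. */
static uintptr_t fill_word(uint8_t b) {
    uintptr_t w = 0;
    for (size_t i = 0; i < sizeof w; i++) w = (uintptr_t)(w << 8) | b;
    return w;
}

/* True when the pointer value consists entirely of one known fill byte. */
static bool ptr_looks_poisoned(uintptr_t v) {
    static const uint8_t fills[] = { 0xa2, 0xcc, 0xdd, 0xfe };
    for (size_t i = 0; i < sizeof fills; i++) {
        if (v == fill_word(fills[i])) return true;
    }
    return false;
}

/* Reject null-page and (on 64-bit targets) kernel-space addresses. */
static bool ptr_in_plausible_range(uintptr_t v) {
    if (v < 4096u) return false; /* assumed null-page size */
#if UINTPTR_MAX > 0xffffffffu
    if (v > (uintptr_t)0x00007fffffffffffULL) return false; /* assumed x86-64 user-space ceiling */
#endif
    return true;
}
```

A value such as 0xa2a2a2a2a2a2a2a2, the pattern reported in the P0 investigation below, would be flagged by `ptr_looks_poisoned()` before it is ever dereferenced.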
### 4. P0 Bug Investigation - Root Cause Identified
**Bug**: SEGV at iteration 28440 (deterministic with seed 42)
**Pattern**: 0xa2a2a2a2a2a2a2a2 (uninitialized/ASan poisoning)
**Location**: TLS SLL (Single-Linked List) cache layer
**Root Cause**: Race condition or use-after-free in TLS list management (class 0)
**Detection**: Box I successfully caught invalid pointer at exact crash point
### 5. Defensive Improvements
- Defensive memset in SuperSlab allocation (all metadata arrays)
- Enhanced pointer validation with pattern detection
- BOX_BOUNDARY markers throughout codebase (beautiful modular design)
- 5 metadata invariant checks in allocation/free/refill paths
## Integration Points
- Modified 13 files with Box I/E integration
- Added 10+ BOX_BOUNDARY markers
- 5 critical integrity check points in P0 refill path
## Test Results (100K iterations)
- Baseline: 7.22M ops/s
- Hotpath ON: 8.98M ops/s (+24% improvement ✓)
- P0 Bug: Still crashes at 28440 iterations (TLS SLL race condition)
- Root cause: Identified but not yet fixed (requires deeper investigation)
## Performance
- Box I overhead: Zero in release builds (HAKMEM_INTEGRITY_LEVEL=0)
- Debug builds: Full validation enabled (HAKMEM_INTEGRITY_LEVEL=4)
- Beautiful modular design maintains clean separation of concerns
## Known Issues
- P0 Bug at 28440 iterations: Race condition in TLS SLL cache (class 0)
- Cause: Use-after-free or race in remote free draining
- Next step: Valgrind investigation to pinpoint exact corruption location
## Code Quality
- Total new code: ~1400 lines (Box I + Box E + integrity system)
- Design: Beautiful Box Theory with clear boundaries
- Modularity: Complete separation of concerns
- Documentation: Comprehensive inline comments and BOX_BOUNDARY markers
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-12 02:45:00 +09:00
#if !HAKMEM_BUILD_RELEASE
2025-12-01 16:37:59 +09:00
    if (!hak_base_is_null(next) && !validate_ptr_range(raw_next, "tls_sll_pop_next")) {
2025-11-14 01:02:00 +09:00
        fprintf(stderr,
                "[TLS_SLL_POP] FATAL invalid next ptr cls=%d base=%p next=%p\n",
2025-12-01 16:37:59 +09:00
                class_idx, raw_base, raw_next);
2025-11-14 01:02:00 +09:00
        ptr_trace_dump_now("next_corruption");
2025-11-12 02:45:00 +09:00
        abort();
    }
#endif

2025-12-03 21:56:52 +09:00
    tls_sll_set_head_from(class_idx, next, raw_base, where ? where : "pop");
2025-12-01 16:37:59 +09:00
    if ((class_idx == 4 || class_idx == 6) && !hak_base_is_null(next) && !tls_sll_head_valid(next)) {
2025-11-21 23:00:24 +09:00
        fprintf(stderr, "[TLS_SLL_POP_POST_INVALID] cls=%d next=%p last_writer=%s\n",
                class_idx,
2025-12-01 16:37:59 +09:00
                raw_next,
2025-11-21 23:00:24 +09:00
                g_tls_sll_last_writer[class_idx] ? g_tls_sll_last_writer[class_idx] : "(null)");
        tls_sll_dump_tls_window(class_idx, "pop_post");
2025-12-03 21:56:52 +09:00
        tls_sll_set_head(class_idx, HAK_BASE_FROM_RAW(NULL), "pop_post");
2025-11-21 23:00:24 +09:00
        g_tls_sll[class_idx].count = 0;
        return false;
    }
2025-11-20 07:32:30 +09:00
    if (g_tls_sll[class_idx].count > 0) {
        g_tls_sll[class_idx].count--;
2025-11-10 16:48:20 +09:00
    }

2025-11-14 01:02:00 +09:00
    // Clear next inside popped node to avoid stale-chain issues.
2025-12-04 04:58:22 +09:00
    PTR_NEXT_WRITE("tls_pop_clear", class_idx, raw_base, 0, NULL);
2025-11-12 10:33:57 +09:00
2025-12-03 21:56:52 +09:00
    // Release SuperSlab pin now that node left TLS SLL
    do {
        SuperSlab* ss_pop = hak_super_lookup(raw_base);
        if (ss_pop && ss_pop->magic == SUPERSLAB_MAGIC) {
            superslab_ref_dec(ss_pop);
        }
    } while (0);

2025-11-22 11:30:46 +09:00
#if !HAKMEM_BUILD_RELEASE
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
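The size-to-class mapping described in P0.1 can be illustrated with a small lookup. This is a sketch only: `k_class_sizes` and `size_to_class` are hypothetical names, a linear scan stands in for the real LUT, and returning -1 for size 0 is an assumption, since the commit only states the mapping for sizes 1 and up.

```c
#include <assert.h>
#include <stddef.h>

enum { TINY_CLASSES = 8 };

/* Size table from P0.1: the smallest class is now 16B instead of 8B. */
static const size_t k_class_sizes[TINY_CLASSES] = {
    16, 32, 64, 128, 256, 512, 1024, 2048
};

/* Returns the smallest class whose size fits the request, or -1 if none does. */
static int size_to_class(size_t size) {
    if (size == 0) return -1; /* assumption: zero-size requests are handled elsewhere */
    for (int c = 0; c < TINY_CLASSES; c++) {
        if (size <= k_class_sizes[c]) return c;
    }
    return -1; /* larger than the biggest tiny class */
}
```

With this table, every request from 1 to 16 bytes lands in class 0, so a separate 8B layout is no longer needed.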
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
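The out-of-band lookup with Header fallback from P1.1 and P1.2 could look roughly like the sketch below. `DemoSuperSlab`, `DEMO_SLABS_MAX` and the function names are hypothetical stand-ins for the real SuperSlab structure; only the 255 (UNASSIGNED) initial value and the fallback rule come from the description above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { DEMO_SLABS_MAX = 32, DEMO_NUM_CLASSES = 8, CLASS_UNASSIGNED = 255 };

/* Reduced stand-in for SuperSlab: only the slab_idx -> class_idx map. */
typedef struct {
    uint8_t class_map[DEMO_SLABS_MAX];
} DemoSuperSlab;

/* On creation every slab is marked UNASSIGNED (255). */
static void demo_ss_init(DemoSuperSlab* ss) {
    memset(ss->class_map, CLASS_UNASSIGNED, sizeof ss->class_map);
}

/* Called when a slab is initialized for a given class. */
static void demo_slab_assign(DemoSuperSlab* ss, int slab_idx, uint8_t class_idx) {
    assert(slab_idx >= 0 && slab_idx < DEMO_SLABS_MAX && class_idx < DEMO_NUM_CLASSES);
    ss->class_map[slab_idx] = class_idx;
}

/* Free path: prefer class_map, fall back to the class read from the Header. */
static uint8_t demo_free_class(const DemoSuperSlab* ss, int slab_idx, uint8_t header_class) {
    if (slab_idx >= 0 && slab_idx < DEMO_SLABS_MAX) {
        uint8_t c = ss->class_map[slab_idx];
        if (c < DEMO_NUM_CLASSES) return c;
    }
    return header_class;
}
```

The fallback keeps the free path working for slabs whose map entry was never set, which matters while both lookup schemes coexist behind the ENV gate.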
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
    // Trace TLS SLL pop (debug only)
    extern void ptr_trace_record_impl(int event, void* ptr, int class_idx, uint64_t op_num,
                                      void* aux_ptr, uint32_t aux_u32, int aux_int,
                                      const char* file, int line);
    extern _Atomic uint64_t g_ptr_trace_op_counter;
    uint64_t _trace_op = atomic_fetch_add_explicit(&g_ptr_trace_op_counter, 1, memory_order_relaxed);
2025-12-01 16:37:59 +09:00
    ptr_trace_record_impl(3 /*PTR_EVENT_ALLOC_TLS_POP*/, raw_base, class_idx, _trace_op,
2025-11-28 13:42:39 +09:00
                          NULL, g_tls_sll[class_idx].count + 1, 0,
                          where ? where : __FILE__, __LINE__);

2025-11-22 11:30:46 +09:00
    // Record callsite for debugging (debug-only)
    s_tls_sll_last_pop_from[class_idx] = where;
2025-11-27 08:18:01 +09:00

    // Debug: Log pop operations (first 50, class 1 only)
    {
        extern _Atomic uint64_t g_debug_op_count;
        uint64_t op = atomic_load(&g_debug_op_count);
        if (op < 50 && class_idx == 1) {
            fprintf(stderr, "[OP#%04lu POP] cls=%d base=%p tls_count_after=%u\n",
2025-12-01 16:37:59 +09:00
                    (unsigned long)op, class_idx, raw_base,
2025-11-27 08:18:01 +09:00
                    g_tls_sll[class_idx].count);
            fflush(stderr);
        }
    }
2025-11-22 11:30:46 +09:00
#else
    (void)where; // Suppress unused warning in release
#endif

2025-11-14 01:02:00 +09:00
    *out = base;
2025-11-10 16:48:20 +09:00
    return true;
}
// ========== Splice ==========
2025-11-10 17:02:25 +09:00
//
2025-11-14 01:02:00 +09:00
// Splice a pre-linked chain of BASE pointers into TLS SLL head.
// chain_head is BASE; links are via Box API-compatible next layout.
// Returns number of nodes actually moved (<= capacity remaining).

static inline uint32_t tls_sll_splice(int class_idx,
2025-12-01 16:37:59 +09:00
                                      hak_base_ptr_t chain_head,
2025-11-14 01:02:00 +09:00
                                      uint32_t count,
                                      uint32_t capacity)
{
    HAK_CHECK_CLASS_IDX(class_idx, "tls_sll_splice");

2025-12-01 16:37:59 +09:00
    if (hak_base_is_null(chain_head) || count == 0 || capacity == 0) {
2025-11-14 01:02:00 +09:00
        return 0;
2025-11-10 16:48:20 +09:00
    }
2025-11-13 01:45:30 +09:00

2025-11-20 07:32:30 +09:00
    uint32_t cur = g_tls_sll[class_idx].count;
2025-11-14 01:02:00 +09:00
    if (cur >= capacity) {
        return 0;
2025-11-13 01:45:30 +09:00
    }

2025-11-14 01:02:00 +09:00
    uint32_t room = capacity - cur;
    uint32_t to_move = (count < room) ? count : room;

    // Traverse chain up to to_move, validate, and find tail.
2025-12-01 16:37:59 +09:00
    hak_base_ptr_t tail = chain_head;
2025-11-14 01:02:00 +09:00
    uint32_t moved = 1;
2025-11-10 16:48:20 +09:00

2025-11-14 01:02:00 +09:00
    tls_sll_debug_guard(class_idx, chain_head, "splice_head");
2025-11-10 16:48:20 +09:00

2025-11-29 07:57:49 +09:00
    // Restore header defensively on each node we touch (C1-C6 only; C0/C7 skip)
2025-12-01 16:37:59 +09:00
    tiny_header_write_if_preserved(HAK_BASE_TO_RAW(chain_head), class_idx);
Fix #16: Resolve double BASE→USER conversion causing header corruption
🎯 ROOT CAUSE: Internal allocation helpers were prematurely converting
BASE → USER pointers before returning to caller. The caller then applied
HAK_RET_ALLOC/tiny_region_id_write_header which performed ANOTHER BASE→USER
conversion, resulting in double offset (BASE+2) and header written at
wrong location.
📦 BOX THEORY SOLUTION: Establish clean pointer conversion boundary at
tiny_region_id_write_header, making it the single source of truth for
BASE → USER conversion.
🔧 CHANGES:
- Fix #16: Remove premature BASE→USER conversions (6 locations)
* core/tiny_alloc_fast.inc.h (3 fixes)
* core/hakmem_tiny_refill.inc.h (2 fixes)
* core/hakmem_tiny_fastcache.inc.h (1 fix)
- Fix #12: Add header validation in tls_sll_pop (detect corruption)
- Fix #14: Defense-in-depth header restoration in tls_sll_splice
- Fix #15: USER pointer detection (for debugging)
- Fix #13: Bump window header restoration
- Fix #2, #6, #7, #8: Various header restoration & NULL termination
🧪 TEST RESULTS: 100% SUCCESS
- 10K-500K iterations: All passed
- 8 seeds × 100K: All passed (42,123,456,789,999,314,271,161)
- Performance: ~630K ops/s average (stable)
- Header corruption: ZERO
📋 FIXES SUMMARY:
Fix #1-8: Initial header restoration & chain fixes (chatgpt-san)
Fix #9-10: USER pointer auto-fix (later disabled)
Fix #12: Validation system (caught corruption at call 14209)
Fix #13: Bump window header writes
Fix #14: Splice defense-in-depth
Fix #15: USER pointer detection (debugging tool)
Fix #16: Double conversion fix (FINAL SOLUTION) ✅
🎓 LESSONS LEARNED:
1. Validation catches bugs early (Fix #12 was critical)
2. Class-specific inline logging reveals patterns (Option C)
3. Box Theory provides clean architectural boundaries
4. Multiple investigation approaches (Task/chatgpt-san collaboration)
📄 DOCUMENTATION:
- P0_BUG_STATUS.md: Complete bug tracking timeline
- C2_CORRUPTION_ROOT_CAUSE_FINAL.md: Detailed root cause analysis
- FINAL_ANALYSIS_C2_CORRUPTION.md: Investigation methodology
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Task Agent <task@anthropic.com>
Co-Authored-By: ChatGPT <chatgpt@openai.com>
2025-11-12 10:33:57 +09:00
|
|
|
|
2025-11-14 01:02:00 +09:00
|
|
|
while (moved < to_move) {
|
|
|
|
|
tls_sll_debug_guard(class_idx, tail, "splice_traverse");
|
|
|
|
|
|
2025-12-01 16:37:59 +09:00
|
|
|
void* raw_next;
|
|
|
|
|
PTR_NEXT_READ("tls_splice_trav", class_idx, HAK_BASE_TO_RAW(tail), 0, raw_next);
|
|
|
|
|
hak_base_ptr_t next = HAK_BASE_FROM_RAW(raw_next);
|
|
|
|
|
|
|
|
|
|
if (!hak_base_is_null(next) && !tls_sll_head_valid(next)) {
|
2025-11-21 23:00:24 +09:00
|
|
|
static _Atomic uint32_t g_splice_diag = 0;
|
|
|
|
|
uint32_t shot = atomic_fetch_add_explicit(&g_splice_diag, 1, memory_order_relaxed);
|
|
|
|
|
if (shot < 8) {
|
|
|
|
|
fprintf(stderr,
|
|
|
|
|
"[TLS_SLL_SPLICE_INVALID_NEXT] cls=%d head=%p tail=%p next=%p moved=%u/%u\n",
|
2025-12-01 16:37:59 +09:00
|
|
|
class_idx, HAK_BASE_TO_RAW(chain_head), HAK_BASE_TO_RAW(tail), raw_next, moved, to_move);
|
2025-11-21 23:00:24 +09:00
|
|
|
}
break; // Invalid next pointer: stop at the last validated node instead of following or writing through it.
|
|
|
|
|
}
|
2025-11-14 01:02:00 +09:00
|
|
|
|
2025-12-01 16:37:59 +09:00
|
|
|
if (hak_base_is_null(next)) {
|
2025-11-10 16:48:20 +09:00
|
|
|
break;
|
|
|
|
|
}
|
|
|
|
|
|
2025-11-29 07:57:49 +09:00
|
|
|
// Restore header on each traversed node (C1-C6 only; C0/C7 skip)
|
2025-12-01 16:37:59 +09:00
|
|
|
tiny_header_write_if_preserved(raw_next, class_idx);
|
2025-11-12 10:33:57 +09:00
|
|
|
|
2025-11-14 01:02:00 +09:00
|
|
|
tail = next;
|
|
|
|
|
moved++;
|
|
|
|
|
}
|
2025-11-10 16:48:20 +09:00
|
|
|
|
2025-11-14 01:02:00 +09:00
|
|
|
// Link tail to existing head and install new head.
|
|
|
|
|
tls_sll_debug_guard(class_idx, tail, "splice_tail");
|
2025-12-01 16:37:59 +09:00
|
|
|
PTR_NEXT_WRITE("tls_splice_link", class_idx, HAK_BASE_TO_RAW(tail), 0, HAK_BASE_TO_RAW(g_tls_sll[class_idx].head));
|
2025-11-10 16:48:20 +09:00
|
|
|
|
2025-12-03 21:56:52 +09:00
|
|
|
tls_sll_set_head(class_idx, chain_head, "splice");
|
2025-11-20 07:32:30 +09:00
|
|
|
g_tls_sll[class_idx].count = cur + moved;
|
2025-11-10 16:48:20 +09:00
|
|
|
|
2025-11-14 01:02:00 +09:00
|
|
|
return moved;
|
2025-11-10 16:48:20 +09:00
|
|
|
}
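// A minimal, self-contained sketch of the capacity-bounded splice above. It is
// not the Box implementation: demo_node_t, demo_sll_t, and demo_splice are
// hypothetical names, and the next pointer is a plain struct field here rather
// than an in-block slot resolved through the next-pointer Box API. Debug
// guards and header restoration are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical node type (assumption): next pointer as an ordinary field.
typedef struct demo_node { struct demo_node* next; } demo_node_t;

// Hypothetical per-class list state, standing in for g_tls_sll[class_idx].
typedef struct { demo_node_t* head; uint32_t count; } demo_sll_t;

// Splice up to `count` nodes from `chain` onto the list head without
// letting the list grow beyond `capacity`. Returns the number moved.
static uint32_t demo_splice(demo_sll_t* l, demo_node_t* chain,
                            uint32_t count, uint32_t capacity) {
    if (chain == NULL || count == 0 || l->count >= capacity) {
        return 0;
    }
    uint32_t room = capacity - l->count;
    uint32_t to_move = (count < room) ? count : room;

    // Walk to the last node that will be moved.
    demo_node_t* tail = chain;
    uint32_t moved = 1;
    while (moved < to_move && tail->next != NULL) {
        tail = tail->next;
        moved++;
    }

    // Link the moved segment in front of the existing list.
    // Any nodes after `tail` in the input chain are detached by this write.
    tail->next = l->head;
    l->head = chain;
    l->count += moved;
    return moved;
}
```

// Because the tail's next pointer is overwritten, a caller that passes a chain
// longer than the remaining room needs its own reference to the remainder.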
|
|
|
|
|
|
2025-11-22 11:30:46 +09:00
|
|
|
// ========== Macro Wrappers ==========
|
|
|
|
|
//
|
2025-11-27 07:30:32 +09:00
|
|
|
// Box Theory: Callers use tls_sll_push/pop() macros which auto-insert callsite info (debug only).
|
|
|
|
|
// No changes required to call sites.
|
2025-11-22 11:30:46 +09:00
|
|
|
|
|
|
|
|
#if !HAKMEM_BUILD_RELEASE
|
2025-12-01 16:37:59 +09:00
|
|
|
static inline bool tls_sll_push_guarded(int class_idx, hak_base_ptr_t ptr, uint32_t capacity,
|
2025-11-27 07:30:32 +09:00
|
|
|
const char* where, const char* file, int line) {
|
|
|
|
|
// Enhanced duplicate guard (scan up to 256 nodes for deep duplicates)
|
|
|
|
|
uint32_t scanned = 0;
|
2025-12-01 16:37:59 +09:00
|
|
|
hak_base_ptr_t cur = g_tls_sll[class_idx].head;
|
2025-11-27 07:30:32 +09:00
|
|
|
const uint32_t limit = (g_tls_sll[class_idx].count < 256) ? g_tls_sll[class_idx].count : 256;
|
|
|
|
|
|
2025-12-01 16:37:59 +09:00
|
|
|
while (!hak_base_is_null(cur) && scanned < limit) {
|
|
|
|
|
if (hak_base_eq(cur, ptr)) {
|
2025-11-27 07:30:32 +09:00
|
|
|
// Enhanced error message with both old and new callsite info
|
|
|
|
|
const char* last_file = g_tls_sll_push_file[class_idx] ? g_tls_sll_push_file[class_idx] : "(null)";
|
|
|
|
|
fprintf(stderr,
|
|
|
|
|
"[TLS_SLL_DUP] cls=%d ptr=%p head=%p count=%u scanned=%u\n"
|
|
|
|
|
" Current push: where=%s at %s:%d\n"
|
|
|
|
|
" Previous push: %s:%d\n",
|
2025-12-01 16:37:59 +09:00
|
|
|
class_idx, HAK_BASE_TO_RAW(ptr), HAK_BASE_TO_RAW(g_tls_sll[class_idx].head), g_tls_sll[class_idx].count, scanned,
|
2025-11-27 07:30:32 +09:00
|
|
|
where, file, line,
|
|
|
|
|
last_file, g_tls_sll_push_line[class_idx]);
|
|
|
|
|
|
|
|
|
|
// Dump pointer trace for detailed analysis
|
|
|
|
|
ptr_trace_dump_now("tls_sll_dup");
|
|
|
|
|
abort();
|
|
|
|
|
}
|
2025-12-01 16:37:59 +09:00
|
|
|
void* raw_next = NULL;
|
|
|
|
|
PTR_NEXT_READ("tls_sll_dupcheck", class_idx, HAK_BASE_TO_RAW(cur), 0, raw_next);
|
|
|
|
|
cur = HAK_BASE_FROM_RAW(raw_next);
|
2025-11-27 07:30:32 +09:00
|
|
|
scanned++;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Delegate to impl. Its own duplicate check is redundant on this path:
|
|
|
|
|
// a duplicate within the scanned window has already aborted above.
|
|
|
|
|
bool ok = tls_sll_push_impl(class_idx, ptr, capacity, where);
|
|
|
|
|
if (ok) {
|
|
|
|
|
g_tls_sll_push_file[class_idx] = file;
|
|
|
|
|
g_tls_sll_push_line[class_idx] = line;
|
|
|
|
|
}
|
|
|
|
|
return ok;
|
|
|
|
|
}
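// The bounded duplicate scan in tls_sll_push_guarded can be sketched in
// isolation as follows. dup_node_t and demo_contains are hypothetical names
// used only for illustration, and the sketch returns a flag where the guarded
// push logs and aborts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical node type (assumption): next pointer as an ordinary field.
typedef struct dup_node { struct dup_node* next; } dup_node_t;

// Returns true if `ptr` appears among the first min(count, max_scan) nodes
// reachable from `head`. Bounding the walk keeps the debug check cheap and
// guarantees termination even if the list is corrupted into a cycle.
static bool demo_contains(const dup_node_t* head, uint32_t count,
                          const dup_node_t* ptr, uint32_t max_scan) {
    uint32_t limit = (count < max_scan) ? count : max_scan;
    uint32_t scanned = 0;
    const dup_node_t* cur = head;
    while (cur != NULL && scanned < limit) {
        if (cur == ptr) {
            return true;
        }
        cur = cur->next;
        scanned++;
    }
    return false;
}
```

// A duplicate that sits deeper than the scan limit is not detected, which is
// the trade-off the 256-node window makes.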
|
|
|
|
|
|
2025-11-22 11:30:46 +09:00
|
|
|
# define tls_sll_push(cls, ptr, cap) \
|
2025-11-27 07:30:32 +09:00
|
|
|
tls_sll_push_guarded((cls), (ptr), (cap), __func__, __FILE__, __LINE__)
|
2025-11-22 11:30:46 +09:00
|
|
|
# define tls_sll_pop(cls, out) \
|
|
|
|
|
tls_sll_pop_impl((cls), (out), __func__)
|
|
|
|
|
#else
|
|
|
|
|
# define tls_sll_push(cls, ptr, cap) \
|
2025-12-03 20:42:28 +09:00
|
|
|
tls_sll_push_impl((cls), (ptr), (cap), __func__)
|
2025-11-22 11:30:46 +09:00
|
|
|
# define tls_sll_pop(cls, out) \
|
2025-12-03 20:42:28 +09:00
|
|
|
tls_sll_pop_impl((cls), (out), __func__)
|
2025-11-22 11:30:46 +09:00
|
|
|
#endif
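// A reduced sketch of the wrapper pattern above: one macro name that records
// the callsite in debug builds and forwards straight to the implementation in
// release builds. DEMO_BUILD_RELEASE, demo_push, demo_push_impl, and
// demo_push_guarded are hypothetical stand-ins, not names from this header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef DEMO_BUILD_RELEASE
#define DEMO_BUILD_RELEASE 0  // assumption: debug build unless overridden
#endif

// Last successful callsite (only updated in debug builds).
static const char* g_demo_last_file = NULL;
static int g_demo_last_line = 0;

// Stand-in for the real push: accepts non-negative values only.
static bool demo_push_impl(int value, const char* where) {
    (void)where;
    return value >= 0;
}

#if !DEMO_BUILD_RELEASE
// Debug wrapper: run the push, then remember where it came from.
static bool demo_push_guarded(int value, const char* where,
                              const char* file, int line) {
    bool ok = demo_push_impl(value, where);
    if (ok) {
        g_demo_last_file = file;
        g_demo_last_line = line;
    }
    return ok;
}
#  define demo_push(v) \
    demo_push_guarded((v), __func__, __FILE__, __LINE__)
#else
#  define demo_push(v) \
    demo_push_impl((v), __func__)
#endif
```

// Call sites write demo_push(x) in both configurations; only the expansion
// differs, so switching build modes needs no source changes at the callers.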
|
|
|
|
|
|
2025-12-03 20:42:28 +09:00
|
|
|
#endif // TLS_SLL_BOX_H
|