Files
hakmem/core/hakmem_tiny_config_box.inc

202 lines
8.2 KiB
PHP
Raw Normal View History

Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
// hakmem_tiny_config_box.inc
// Box: Tiny allocator configuration, debug counters, and return helpers.
// Extracted from hakmem_tiny.c to reduce file size and isolate config logic.
Larson double-free investigation: Add full operation lifecycle logging **Diagnostic Enhancement**: Complete malloc/free/pop operation tracing for debug **Problem**: Larson crashes with TLS_SLL_DUP at count=18, need to trace exact pointer lifecycle to identify if allocator returns duplicate addresses or if benchmark has double-free bug. **Implementation** (ChatGPT + Claude + Task collaboration): 1. **Global Operation Counter** (core/hakmem_tiny_config_box.inc:9): - Single atomic counter for all operations (malloc/free/pop) - Chronological ordering across all paths 2. **Allocation Logging** (core/hakmem_tiny_config_box.inc:148-161): - HAK_RET_ALLOC macro enhanced with operation logging - Logs first 50 class=1 allocations with ptr/base/tls_count 3. **Free Logging** (core/tiny_free_fast_v2.inc.h:222-235): - Added before tls_sll_push() call (line 221) - Logs first 50 class=1 frees with ptr/base/tls_count_before 4. **Pop Logging** (core/box/tls_sll_box.h:587-597): - Added in tls_sll_pop_impl() after successful pop - Logs first 50 class=1 pops with base/tls_count_after 5. **Drain Debug Logging** (core/box/tls_sll_drain_box.h:143-151): - Enhanced drain loop with detailed logging - Tracks pop failures and drained block counts **Initial Findings**: - First 19 operations: ALL frees, ZERO allocations, ZERO pops - OP#0006: First free of 0x...430 - OP#0018: Duplicate free of 0x...430 → TLS_SLL_DUP detected - Suggests either: (a) allocations before logging starts, or (b) Larson bug **Debug-only**: All logging gated by !HAKMEM_BUILD_RELEASE (zero cost in release) **Next Steps**: - Expand logging window to 200 operations - Log initialization phase allocations - Cross-check with Larson benchmark source **Status**: Ready for extended testing
2025-11-27 08:18:01 +09:00
// ============================================================================
// Global Operation Counter (for debug logging)
// ============================================================================
#include <stdatomic.h>
_Atomic uint64_t g_debug_op_count = 0;
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
// ============================================================================
// Size class table (Box 3 dependency)
// ============================================================================
// Phase E1-CORRECT: ALL classes have 1-byte header
// These sizes represent TOTAL BLOCK SIZE (stride) = [Header 1B][Data N-1B]
// Usable data = stride - 1 (implicit)
const size_t g_tiny_class_sizes[TINY_NUM_CLASSES] = {
8, // Class 0: 8B total = [Header 1B][Data 7B]
16, // Class 1: 16B total = [Header 1B][Data 15B]
32, // Class 2: 32B total = [Header 1B][Data 31B]
64, // Class 3: 64B total = [Header 1B][Data 63B]
128, // Class 4: 128B total = [Header 1B][Data 127B]
256, // Class 5: 256B total = [Header 1B][Data 255B]
512, // Class 6: 512B total = [Header 1B][Data 511B]
Fix C0/C7 class confusion: Upgrade C7 stride to 2048B and fix meta->class_idx initialization Root Cause: 1. C7 stride was 1024B, unable to serve 1024B user requests (need 1025B with header) 2. New SuperSlabs start with meta->class_idx=0 (mmap zero-init) 3. superslab_init_slab() only sets class_idx if meta->class_idx==255 4. Multiple code paths used conditional assignment (if class_idx==255), leaving C7 slabs with class_idx=0 5. This caused C7 blocks to be misidentified as C0, leading to HDR_META_MISMATCH errors Changes: 1. Upgrade C7 stride: 1024B → 2048B (can now serve 1024B requests) 2. Update blocks_per_slab[7]: 64 → 32 (2048B stride / 64KB slab) 3. Update size-to-class LUT: entries 513-2048 now map to C7 4. Fix superslab_init_slab() fail-safe: only reinitialize if class_idx==255 (not 0) 5. Add explicit class_idx assignment in 6 initialization paths: - tiny_superslab_alloc.inc.h: superslab_refill() after init - hakmem_tiny_superslab.c: backend_shared after init (main path) - ss_unified_backend_box.c: unconditional assignment - ss_legacy_backend_box.c: explicit assignment - superslab_expansion_box.c: explicit assignment - ss_allocation_box.c: fail-safe condition fix Fix P0 refill bug: - Update obsolete array access after Phase 3d-B TLS SLL unification - g_tls_sll_head[cls] → g_tls_sll[cls].head - g_tls_sll_count[cls] → g_tls_sll[cls].count Results: - HDR_META_MISMATCH: eliminated (0 errors in 100K iterations) - 1024B allocations now routed to C7 (Tiny fast path) - NXT_MISALIGN warnings remain (legacy 1024B SuperSlabs, separate issue) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 13:44:05 +09:00
2048 // Class 7: 2048B total = [Header 1B][Data 2047B] (upgraded for 1024B requests)
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
};
// ============================================================================
// Phase 16: Dynamic Tiny Max Size (ENV: HAKMEM_TINY_MAX_CLASS)
// Phase 17-1: Auto-adjust when Small-Mid enabled
// ============================================================================
// Forward declaration for Small-Mid check
extern bool smallmid_is_enabled(void);
// Optimized: Cached max size for hot path
// Moved to hakmem_tiny.h for global inlining
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
// ============================================================================
// PRIORITY 1-4: Integrity Check Counters
// ============================================================================
_Atomic uint64_t g_integrity_check_class_bounds = 0;
_Atomic uint64_t g_integrity_check_freelist = 0;
_Atomic uint64_t g_integrity_check_canary = 0;
_Atomic uint64_t g_integrity_check_header = 0;
// Build-time gate for debug counters (path/ultra). Default OFF.
#ifndef HAKMEM_DEBUG_COUNTERS
#define HAKMEM_DEBUG_COUNTERS 0
#endif
int g_debug_fast0 = 0;
int g_debug_remote_guard = 0;
int g_remote_force_notify = 0;
// Tiny free safety (debug)
int g_tiny_safe_free = 0; // Default OFF for performance; env: HAKMEM_SAFE_FREE=1 でON
int g_tiny_safe_free_strict = 0; // env: HAKMEM_SAFE_FREE_STRICT=1
int g_tiny_force_remote = 0; // env: HAKMEM_TINY_FORCE_REMOTE=1
// Build-time gate: Minimal Tiny front (bench-only)
static inline int superslab_trace_enabled(void) {
static int g_ss_trace_flag = -1;
if (__builtin_expect(g_ss_trace_flag == -1, 0)) {
const char* tr = getenv("HAKMEM_TINY_SUPERSLAB_TRACE");
g_ss_trace_flag = (tr && atoi(tr) != 0) ? 1 : 0;
}
return g_ss_trace_flag;
}
// When enabled, physically excludes optional front tiers from the hot path
// (UltraFront/Quick/Frontend/HotMag/SS-try/BumpShadow), leaving:
// SLL → TLS Magazine → SuperSlab → (remaining slow path)
#ifndef HAKMEM_TINY_MINIMAL_FRONT
#define HAKMEM_TINY_MINIMAL_FRONT 1
#endif
// Strict front: compile-out optional front tiers but keep baseline structure intact
#ifndef HAKMEM_TINY_STRICT_FRONT
#define HAKMEM_TINY_STRICT_FRONT 0
#endif
// Bench-only fast path knobs (defaults)
#ifndef HAKMEM_TINY_BENCH_REFILL
#define HAKMEM_TINY_BENCH_REFILL 8
#endif
// Optional per-class overrides (bench-only)
#ifndef HAKMEM_TINY_BENCH_REFILL8
#define HAKMEM_TINY_BENCH_REFILL8 HAKMEM_TINY_BENCH_REFILL
#endif
#ifndef HAKMEM_TINY_BENCH_REFILL16
#define HAKMEM_TINY_BENCH_REFILL16 HAKMEM_TINY_BENCH_REFILL
#endif
#ifndef HAKMEM_TINY_BENCH_REFILL32
#define HAKMEM_TINY_BENCH_REFILL32 HAKMEM_TINY_BENCH_REFILL
#endif
#ifndef HAKMEM_TINY_BENCH_REFILL64
#define HAKMEM_TINY_BENCH_REFILL64 HAKMEM_TINY_BENCH_REFILL
#endif
// Bench-only warmup amounts (pre-fill TLS SLL on first alloc per class)
#ifndef HAKMEM_TINY_BENCH_WARMUP8
#define HAKMEM_TINY_BENCH_WARMUP8 64
#endif
#ifndef HAKMEM_TINY_BENCH_WARMUP16
#define HAKMEM_TINY_BENCH_WARMUP16 96
#endif
#ifndef HAKMEM_TINY_BENCH_WARMUP32
#define HAKMEM_TINY_BENCH_WARMUP32 160
#endif
#ifndef HAKMEM_TINY_BENCH_WARMUP64
#define HAKMEM_TINY_BENCH_WARMUP64 192
#endif
#ifdef HAKMEM_TINY_BENCH_FASTPATH
static __thread unsigned char g_tls_bench_warm_done[4];
#endif
#if HAKMEM_DEBUG_COUNTERS
#define HAK_PATHDBG_INC(arr, idx) do { if (g_path_debug_enabled) { (arr)[(idx)]++; } } while(0)
#define HAK_ULTRADBG_INC(arr, idx) do { (arr)[(idx)]++; } while(0)
#else
#define HAK_PATHDBG_INC(arr, idx) do { (void)(idx); } while(0)
#define HAK_ULTRADBG_INC(arr, idx) do { (void)(idx); } while(0)
#endif
// Simple scalar debug increment (no-op when HAKMEM_DEBUG_COUNTERS=0)
#if HAKMEM_DEBUG_COUNTERS
#define HAK_DBG_INC(var) do { (var)++; } while(0)
#else
#define HAK_DBG_INC(var) do { (void)0; } while(0)
#endif
// Return helper: record tiny alloc stat (guarded) then return pointer
static inline void tiny_debug_track_alloc_ret(int cls, void* ptr);
// ========== HAK_RET_ALLOC: Single Definition Point ==========
// Choose implementation based on HAKMEM_TINY_HEADER_CLASSIDX
// - Phase 7 enabled: Write header and return user pointer
// - Phase 7 disabled: Legacy behavior (stats + route + return)
#if HAKMEM_TINY_HEADER_CLASSIDX
#if HAKMEM_BUILD_RELEASE
// Phase E1-CORRECT: ALL classes have 1-byte headers (including C7)
// Ultra-fast inline macro (3-4 instructions)
#define HAK_RET_ALLOC(cls, base_ptr) do { \
Phase 1 Refactoring Complete: Box-based Logic Consolidation ✅ Summary: - Task 1.1 ✅: Created tiny_layout_box.h for centralized class/header definitions - Task 1.2 ✅: Updated tiny_nextptr.h to use layout Box (bitmasking optimization) - Task 1.3 ✅: Enhanced ptr_conversion_box.h with Phantom Types support - Task 1.4 ✅: Implemented test_phantom.c for Debug-mode type checking Verification Results (by Task Agent): - Box Pattern Compliance: ⭐⭐⭐⭐⭐ (5/5) - MISSION/DESIGN documented - Type Safety: ⭐⭐⭐⭐⭐ (5/5) - Phantom Types working as designed - Test Coverage: ⭐⭐⭐☆☆ (3/5) - Compile-time tests OK, runtime tests planned - Performance: 0 bytes, 0 cycles overhead in Release build - Build Status: ✅ Success (526KB libhakmem.so, zero warnings) Key Achievements: 1. Single Source of Truth principle fully implemented 2. Circular dependency eliminated (layout→header→nextptr→conversion) 3. Release build: 100% inlining, zero overhead 4. Debug build: Full type checking with Phantom Types 5. HAK_RET_ALLOC macro migrated to Box API Known Issues (unrelated to Phase 1): - TLS_SLL_HDR_RESET from sh8bench (existing, will be resolved in Phase 2) Next Steps: - Phase 2 readiness: ✅ READY - Recommended: Create migration guide + runtime test suite - Alignment guarantee will be addressed in Phase 2 (Headerless layout) 🤖 Generated with Claude Code + Gemini (implementation) + Task Agent (verification) Co-Authored-By: Gemini <gemini@example.com> Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 11:38:11 +09:00
tiny_header_write_for_alloc((base_ptr), (cls)); \
hak_base_ptr_t _base = HAK_BASE_FROM_RAW(base_ptr); \
hak_user_ptr_t _user = ptr_base_to_user(_base, (cls)); \
return (void*)HAK_USER_TO_RAW(_user); \
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
} while(0)
#else
Larson double-free investigation: Add full operation lifecycle logging **Diagnostic Enhancement**: Complete malloc/free/pop operation tracing for debug **Problem**: Larson crashes with TLS_SLL_DUP at count=18, need to trace exact pointer lifecycle to identify if allocator returns duplicate addresses or if benchmark has double-free bug. **Implementation** (ChatGPT + Claude + Task collaboration): 1. **Global Operation Counter** (core/hakmem_tiny_config_box.inc:9): - Single atomic counter for all operations (malloc/free/pop) - Chronological ordering across all paths 2. **Allocation Logging** (core/hakmem_tiny_config_box.inc:148-161): - HAK_RET_ALLOC macro enhanced with operation logging - Logs first 50 class=1 allocations with ptr/base/tls_count 3. **Free Logging** (core/tiny_free_fast_v2.inc.h:222-235): - Added before tls_sll_push() call (line 221) - Logs first 50 class=1 frees with ptr/base/tls_count_before 4. **Pop Logging** (core/box/tls_sll_box.h:587-597): - Added in tls_sll_pop_impl() after successful pop - Logs first 50 class=1 pops with base/tls_count_after 5. **Drain Debug Logging** (core/box/tls_sll_drain_box.h:143-151): - Enhanced drain loop with detailed logging - Tracks pop failures and drained block counts **Initial Findings**: - First 19 operations: ALL frees, ZERO allocations, ZERO pops - OP#0006: First free of 0x...430 - OP#0018: Duplicate free of 0x...430 → TLS_SLL_DUP detected - Suggests either: (a) allocations before logging starts, or (b) Larson bug **Debug-only**: All logging gated by !HAKMEM_BUILD_RELEASE (zero cost in release) **Next Steps**: - Expand logging window to 200 operations - Log initialization phase allocations - Cross-check with Larson benchmark source **Status**: Ready for extended testing
2025-11-27 08:18:01 +09:00
// Debug: Keep full validation via tiny_region_id_write_header() + operation logging
#define HAK_RET_ALLOC(cls, ptr) do { \
extern _Atomic uint64_t g_debug_op_count; \
extern __thread TinyTLSSLL g_tls_sll[]; \
void* base_ptr = (ptr); \
void* user_ptr = tiny_region_id_write_header(base_ptr, (cls)); \
uint64_t op = atomic_fetch_add(&g_debug_op_count, 1); \
if (op < 50 && (cls) == 1) { \
fprintf(stderr, "[OP#%04lu ALLOC] cls=%d ptr=%p base=%p from=alloc tls_count=%u\n", \
(unsigned long)op, (cls), user_ptr, base_ptr, \
g_tls_sll[(cls)].count); \
fflush(stderr); \
} \
return user_ptr; \
} while(0)
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
#endif
#else
// Legacy: Stats and routing before return
#ifdef HAKMEM_ENABLE_STATS
// Optional: samplingビルド時に有効化。ホットパスは直接インライン呼び出し間接分岐なし
#ifdef HAKMEM_TINY_STAT_SAMPLING
static __thread unsigned g_tls_stat_accum_alloc[TINY_NUM_CLASSES];
static int g_stat_rate_lg = 0; // 0=毎回、それ以外=2^lgごと
static inline __attribute__((always_inline)) void hkm_stat_alloc(int cls) {
if (__builtin_expect(g_stat_rate_lg == 0, 1)) { stats_record_alloc(cls); return; }
unsigned m = (1u << g_stat_rate_lg) - 1u;
if (((++g_tls_stat_accum_alloc[cls]) & m) == 0u) stats_record_alloc(cls);
}
#else
static inline __attribute__((always_inline)) void hkm_stat_alloc(int cls) { stats_record_alloc(cls); }
#endif
#define HAK_RET_ALLOC(cls, ptr) do { \
tiny_debug_track_alloc_ret((cls), (ptr)); \
hkm_stat_alloc((cls)); \
ROUTE_COMMIT((cls), 0x7F); \
return (ptr); \
} while(0)
#else
#define HAK_RET_ALLOC(cls, ptr) do { \
tiny_debug_track_alloc_ret((cls), (ptr)); \
ROUTE_COMMIT((cls), 0x7F); \
return (ptr); \
} while(0)
#endif
#endif // HAKMEM_TINY_HEADER_CLASSIDX
// Free-side stats: compile-time zero when stats disabled
#ifdef HAKMEM_ENABLE_STATS
#define HAK_STAT_FREE(cls) do { stats_record_free((cls)); } while(0)
#else
#define HAK_STAT_FREE(cls) do { } while(0)
#endif