Files
hakmem/core/hakmem_tiny_phase6_wrappers_box.inc

102 lines
4.2 KiB
PHP
Raw Normal View History

Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
// Phase 6-1.7: Box Theory Refactoring - Mutual exclusion check
#if HAKMEM_TINY_PHASE6_BOX_REFACTOR
Comprehensive legacy cleanup and architecture consolidation Summary of Changes: MOVED TO ARCHIVE: - core/hakmem_tiny_legacy_slow_box.inc → archive/ * Slow path legacy code preserved for reference * Superseded by Gatekeeper Box architecture - core/superslab_allocate.c → archive/superslab_allocate_legacy.c * Legacy SuperSlab allocation implementation * Functionality integrated into new Box system - core/superslab_head.c → archive/superslab_head_legacy.c * Legacy slab head management * Refactored through Box architecture REMOVED DEAD CODE: - Eliminated unused allocation policy variants from ss_allocation_box.c * Reduced from 127+ lines of conditional logic to focused implementation * Removed: old policy branches, unused allocation strategies * Kept: current Box-based allocation path ADDED NEW INFRASTRUCTURE: - core/superslab_head_stub.c (41 lines) * Minimal stub for backward compatibility * Delegates to new architecture - Enhanced core/superslab_cache.c (75 lines added) * Added missing API functions for cache management * Proper interface for SuperSlab cache integration REFACTORED CORE SYSTEMS: - core/hakmem_super_registry.c * Moved registration logic from scattered locations * Centralized SuperSlab registry management - core/hakmem_tiny.c * Removed 27 lines of redundant initialization * Simplified through Box architecture - core/hakmem_tiny_alloc.inc * Streamlined allocation path to use Gatekeeper * Removed legacy decision logic - core/box/ss_allocation_box.c/h * Dramatically simplified allocation policy * Removed conditional branches for unused strategies * Focused on current Box-based approach BUILD SYSTEM: - Updated Makefile for archive structure - Removed obsolete object file references - Maintained build compatibility SAFETY & TESTING: - All deletions verified: no broken references - Build verification: RELEASE=0 and RELEASE=1 pass - Smoke tests: 100% pass rate - Functional verification: allocation/free intact Architecture Consolidation: Before: Multiple overlapping allocation paths with legacy code branches After: Single unified path through Gatekeeper Boxes with clear architecture Benefits: - Reduced code size and complexity - Improved maintainability - Single source of truth for allocation logic - Better diagnostic/observability hooks - Foundation for future optimizations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 14:22:48 +09:00
#if defined(HAKMEM_TINY_PHASE6_METADATA)
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
#error "Cannot enable PHASE6_BOX_REFACTOR with other Phase 6 options"
#endif
// Box 1: Atomic Operations (Layer 0 - Foundation)
#include "tiny_atomic.h"
// Box 5: Allocation Fast Path (Layer 1 - 3-4 instructions)
#include "tiny_alloc_fast.inc.h"
// ---------------- Refill count (Front) global config ----------------
// Parsed once at init; hot path reads plain ints (no getenv).
int g_refill_count_global = 0; // HAKMEM_TINY_REFILL_COUNT
int g_refill_count_hot = 0; // HAKMEM_TINY_REFILL_COUNT_HOT
int g_refill_count_mid = 0; // HAKMEM_TINY_REFILL_COUNT_MID
int g_refill_count_class[TINY_NUM_CLASSES] = {0}; // HAKMEM_TINY_REFILL_COUNT_C{0..7}
// Export wrapper functions for hakmem.c to call
// Phase 6-1.7 Optimization: Remove diagnostic overhead, rely on LTO for inlining
void* hak_tiny_alloc_fast_wrapper(size_t size) {
WIP: Add TLS SLL validation and SuperSlab registry fallback ChatGPT's diagnostic changes to address TLS_SLL_HDR_RESET issue. Current status: Partial mitigation, but root cause remains. Changes Applied: 1. SuperSlab Registry Fallback (hakmem_super_registry.h) - Added legacy table probe when hash map lookup misses - Prevents NULL returns for valid SuperSlabs during initialization - Status: ✅ Works but may hide underlying registration issues 2. TLS SLL Push Validation (tls_sll_box.h) - Reject push if SuperSlab lookup returns NULL - Reject push if class_idx mismatch detected - Added [TLS_SLL_PUSH_NO_SS] diagnostic message - Status: ✅ Prevents list corruption (defensive) 3. SuperSlab Allocation Class Fix (superslab_allocate.c) - Pass actual class_idx to sp_internal_allocate_superslab - Prevents dummy class=8 causing OOB access - Status: ✅ Root cause fix for allocation path 4. Debug Output Additions - First 256 push/pop operations traced - First 4 mismatches logged with details - SuperSlab registration state logged - Status: ✅ Diagnostic tool (not a fix) 5. TLS Hint Box Removed - Deleted ss_tls_hint_box.{c,h} (Phase 1 optimization) - Simplified to focus on stability first - Status: ⏳ Can be re-added after root cause fixed Current Problem (REMAINS UNSOLVED): - [TLS_SLL_HDR_RESET] still occurs after ~60 seconds of sh8bench - Pointer is 16 bytes offset from expected (class 1 → class 2 boundary) - hak_super_lookup returns NULL for that pointer - Suggests: Use-After-Free, Double-Free, or pointer arithmetic error Root Cause Analysis: - Pattern: Pointer offset by +16 (one class 1 stride) - Timing: Cumulative problem (appears after 60s, not immediately) - Location: Header corruption detected during TLS SLL pop Remaining Issues: ⚠️ Registry fallback is defensive (may hide registration bugs) ⚠️ Push validation prevents symptoms but not root cause ⚠️ 16-byte pointer offset source unidentified Next Steps for Investigation: 1. Full pointer arithmetic audit (Magazine ⇔ TLS SLL paths) 2. Enhanced logging at HDR_RESET point: - Expected vs actual pointer value - Pointer provenance (where it came from) - Allocation trace for that block 3. Verify Headerless flag is OFF throughout build 4. Check for double-offset application in conversions Technical Assessment: - 60% root cause fixes (allocation class, validation) - 40% defensive mitigation (registry fallback, push rejection) Performance Impact: - Registry fallback: +10-30 cycles on cold path (negligible) - Push validation: +5-10 cycles per push (acceptable) - Overall: < 2% performance impact estimated Related Issues: - Phase 1 TLS Hint Box removed temporarily - Phase 2 Headerless blocked until stability achieved 🤖 Generated with Claude Code (https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 20:42:28 +09:00
static _Atomic int g_alloc_fast_trace = 0;
if (atomic_fetch_add_explicit(&g_alloc_fast_trace, 1, memory_order_relaxed) < 128) {
HAK_TRACE("[tiny_alloc_fast_wrapper_enter]\n");
}
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
// Phase E5: Ultra fast path (8-instruction alloc, bypasses all layers)
// Enable with: HAKMEM_ULTRA_FAST_PATH=1 (compile-time)
#if HAKMEM_ULTRA_FAST_PATH
void* ret = tiny_alloc_fast_ultra(size);
if (ret) return ret;
// Miss → fallback to full fast path
#endif
// Bench-only ultra-short path: bypass diagnostics and pointer tracking
// Enable with: HAKMEM_BENCH_FAST_FRONT=1
static int g_bench_fast_front = -1;
if (__builtin_expect(g_bench_fast_front == -1, 0)) {
const char* e = getenv("HAKMEM_BENCH_FAST_FRONT");
g_bench_fast_front = (e && *e && *e != '0') ? 1 : 0;
}
if (__builtin_expect(g_bench_fast_front, 0)) {
return tiny_alloc_fast(size);
}
static _Atomic uint64_t wrapper_call_count = 0;
uint64_t call_num = atomic_fetch_add(&wrapper_call_count, 1);
// Pointer tracking init (first call only)
PTR_TRACK_INIT();
// PRIORITY 3: Periodic canary validation (every 1000 ops)
periodic_canary_check(call_num, "hak_tiny_alloc_fast_wrapper");
// Box I: Periodic full integrity check (every 5000 ops)
#if HAKMEM_INTEGRITY_LEVEL >= 3
if ((call_num % 5000) == 0) {
extern void integrity_periodic_full_check(const char*);
integrity_periodic_full_check("periodic check in alloc wrapper");
}
#endif
#if !HAKMEM_BUILD_RELEASE
if (call_num > 14250 && call_num < 14280 && size <= 1024) {
fprintf(stderr, "[HAK_TINY_ALLOC_FAST_WRAPPER] call=%lu size=%zu\n", call_num, size);
fflush(stderr);
}
#endif
void* result = tiny_alloc_fast(size);
#if !HAKMEM_BUILD_RELEASE
if (call_num > 14250 && call_num < 14280 && size <= 1024) {
fprintf(stderr, "[HAK_TINY_ALLOC_FAST_WRAPPER] call=%lu returned %p\n", call_num, result);
fflush(stderr);
}
#endif
return result;
}
void hak_tiny_free_fast_wrapper(void* ptr) {
static _Atomic uint64_t free_call_count = 0;
uint64_t call_num = atomic_fetch_add(&free_call_count, 1);
if (call_num > 14135 && call_num < 14145) {
fprintf(stderr, "[HAK_TINY_FREE_FAST_WRAPPER] call=%lu ptr=%p\n", call_num, ptr);
fflush(stderr);
}
Remove legacy redundant code after Gatekeeper Box consolidation Summary of Deletions: - Remove core/box/unified_batch_box.c (26 lines) * Legacy batch allocation logic superseded by Alloc Gatekeeper Box * unified_cache now handles allocation aggregation - Remove core/box/unified_batch_box.h (29 lines) * Header declarations for deprecated unified_batch_box module - Remove core/tiny_free_fast.inc.h (329 lines) * Legacy fast-path free implementation * Functionality consolidated into: - tiny_free_gate_box.h (Fail-Fast layer + diagnostics) - malloc_tiny_fast.h (Free path integration) - unified_cache (return to freelist) * Code path now routes through Gatekeeper Box for consistency Build System Updates: - Update Makefile * Remove unified_batch_box.o from OBJS_BASE * Remove unified_batch_box_shared.o from SHARED_OBJS * Remove unified_batch_box.o from BENCH_HAKMEM_OBJS_BASE - Update core/hakmem_tiny_phase6_wrappers_box.inc * Remove unified_batch_box references * Simplify allocation wrapper to use new Gatekeeper architecture Impact: - Removes ~385 lines of redundant/superseded code - Consolidates allocation logic through unified Gatekeeper entry points - All functionality preserved via new Box-based architecture - Simplifies codebase and reduces maintenance burden Testing: - Build verification: make clean && make RELEASE=0/1 - Smoke tests: All pass (simple_alloc, loop 10M, pool_tls) - No functional regressions Rationale: After implementing Alloc/Free Gatekeeper Boxes with Fail-Fast layers and Unified Cache type safety, the legacy separate implementations became redundant. This commit completes the architectural consolidation and simplifies the allocator codebase. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 12:55:53 +09:00
// Box 6 v1 (tiny_free_fast) は v2 に置き換え済み。
// Wrapper からは Box 経路の slow/fast 判定に委ねる。
hak_tiny_free(ptr);
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
if (call_num > 14135 && call_num < 14145) {
fprintf(stderr, "[HAK_TINY_FREE_FAST_WRAPPER] call=%lu completed\n", call_num);
fflush(stderr);
}
}
Comprehensive legacy cleanup and architecture consolidation Summary of Changes: MOVED TO ARCHIVE: - core/hakmem_tiny_legacy_slow_box.inc → archive/ * Slow path legacy code preserved for reference * Superseded by Gatekeeper Box architecture - core/superslab_allocate.c → archive/superslab_allocate_legacy.c * Legacy SuperSlab allocation implementation * Functionality integrated into new Box system - core/superslab_head.c → archive/superslab_head_legacy.c * Legacy slab head management * Refactored through Box architecture REMOVED DEAD CODE: - Eliminated unused allocation policy variants from ss_allocation_box.c * Reduced from 127+ lines of conditional logic to focused implementation * Removed: old policy branches, unused allocation strategies * Kept: current Box-based allocation path ADDED NEW INFRASTRUCTURE: - core/superslab_head_stub.c (41 lines) * Minimal stub for backward compatibility * Delegates to new architecture - Enhanced core/superslab_cache.c (75 lines added) * Added missing API functions for cache management * Proper interface for SuperSlab cache integration REFACTORED CORE SYSTEMS: - core/hakmem_super_registry.c * Moved registration logic from scattered locations * Centralized SuperSlab registry management - core/hakmem_tiny.c * Removed 27 lines of redundant initialization * Simplified through Box architecture - core/hakmem_tiny_alloc.inc * Streamlined allocation path to use Gatekeeper * Removed legacy decision logic - core/box/ss_allocation_box.c/h * Dramatically simplified allocation policy * Removed conditional branches for unused strategies * Focused on current Box-based approach BUILD SYSTEM: - Updated Makefile for archive structure - Removed obsolete object file references - Maintained build compatibility SAFETY & TESTING: - All deletions verified: no broken references - Build verification: RELEASE=0 and RELEASE=1 pass - Smoke tests: 100% pass rate - Functional verification: allocation/free intact Architecture Consolidation: Before: Multiple overlapping allocation paths with legacy code branches After: Single unified path through Gatekeeper Boxes with clear architecture Benefits: - Reduced code size and complexity - Improved maintainability - Single source of truth for allocation logic - Better diagnostic/observability hooks - Foundation for future optimizations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 14:22:48 +09:00
// Metadata-only fallback when Box Refactor is disabled
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
#elif defined(HAKMEM_TINY_PHASE6_METADATA)
// Phase 6-1.6: Metadata header (recommended)
#include "hakmem_tiny_metadata.inc"
#endif