Files
hakmem/core/hakmem_tiny_phase6_wrappers_box.inc

102 lines
4.2 KiB
PHP
Raw Normal View History

Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
// Phase 6-1.7: Box Theory Refactoring - Mutual exclusion check
#if HAKMEM_TINY_PHASE6_BOX_REFACTOR
#if defined(HAKMEM_TINY_PHASE6_METADATA) || defined(HAKMEM_TINY_PHASE6_ULTRA_SIMPLE)
#error "Cannot enable PHASE6_BOX_REFACTOR with other Phase 6 options"
#endif
// Box 1: Atomic Operations (Layer 0 - Foundation)
#include "tiny_atomic.h"
// Box 5: Allocation Fast Path (Layer 1 - 3-4 instructions)
#include "tiny_alloc_fast.inc.h"
// ---------------- Refill count (Front) global config ----------------
// Parsed once at init; hot path reads plain ints (no getenv).
int g_refill_count_global = 0; // HAKMEM_TINY_REFILL_COUNT
int g_refill_count_hot = 0; // HAKMEM_TINY_REFILL_COUNT_HOT
int g_refill_count_mid = 0; // HAKMEM_TINY_REFILL_COUNT_MID
int g_refill_count_class[TINY_NUM_CLASSES] = {0}; // HAKMEM_TINY_REFILL_COUNT_C{0..7}
// Export wrapper functions for hakmem.c to call
// Phase 6-1.7 Optimization: Remove diagnostic overhead, rely on LTO for inlining
void* hak_tiny_alloc_fast_wrapper(size_t size) {
WIP: Add TLS SLL validation and SuperSlab registry fallback ChatGPT's diagnostic changes to address TLS_SLL_HDR_RESET issue. Current status: Partial mitigation, but root cause remains. Changes Applied: 1. SuperSlab Registry Fallback (hakmem_super_registry.h) - Added legacy table probe when hash map lookup misses - Prevents NULL returns for valid SuperSlabs during initialization - Status: ✅ Works but may hide underlying registration issues 2. TLS SLL Push Validation (tls_sll_box.h) - Reject push if SuperSlab lookup returns NULL - Reject push if class_idx mismatch detected - Added [TLS_SLL_PUSH_NO_SS] diagnostic message - Status: ✅ Prevents list corruption (defensive) 3. SuperSlab Allocation Class Fix (superslab_allocate.c) - Pass actual class_idx to sp_internal_allocate_superslab - Prevents dummy class=8 causing OOB access - Status: ✅ Root cause fix for allocation path 4. Debug Output Additions - First 256 push/pop operations traced - First 4 mismatches logged with details - SuperSlab registration state logged - Status: ✅ Diagnostic tool (not a fix) 5. TLS Hint Box Removed - Deleted ss_tls_hint_box.{c,h} (Phase 1 optimization) - Simplified to focus on stability first - Status: ⏳ Can be re-added after root cause fixed Current Problem (REMAINS UNSOLVED): - [TLS_SLL_HDR_RESET] still occurs after ~60 seconds of sh8bench - Pointer is 16 bytes offset from expected (class 1 → class 2 boundary) - hak_super_lookup returns NULL for that pointer - Suggests: Use-After-Free, Double-Free, or pointer arithmetic error Root Cause Analysis: - Pattern: Pointer offset by +16 (one class 1 stride) - Timing: Cumulative problem (appears after 60s, not immediately) - Location: Header corruption detected during TLS SLL pop Remaining Issues: ⚠️ Registry fallback is defensive (may hide registration bugs) ⚠️ Push validation prevents symptoms but not root cause ⚠️ 16-byte pointer offset source unidentified Next Steps for Investigation: 1. Full pointer arithmetic audit (Magazine ⇔ TLS SLL paths) 2. Enhanced logging at HDR_RESET point: - Expected vs actual pointer value - Pointer provenance (where it came from) - Allocation trace for that block 3. Verify Headerless flag is OFF throughout build 4. Check for double-offset application in conversions Technical Assessment: - 60% root cause fixes (allocation class, validation) - 40% defensive mitigation (registry fallback, push rejection) Performance Impact: - Registry fallback: +10-30 cycles on cold path (negligible) - Push validation: +5-10 cycles per push (acceptable) - Overall: < 2% performance impact estimated Related Issues: - Phase 1 TLS Hint Box removed temporarily - Phase 2 Headerless blocked until stability achieved 🤖 Generated with Claude Code (https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 20:42:28 +09:00
static _Atomic int g_alloc_fast_trace = 0;
if (atomic_fetch_add_explicit(&g_alloc_fast_trace, 1, memory_order_relaxed) < 128) {
HAK_TRACE("[tiny_alloc_fast_wrapper_enter]\n");
}
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
// Phase E5: Ultra fast path (8-instruction alloc, bypasses all layers)
// Enable with: HAKMEM_ULTRA_FAST_PATH=1 (compile-time)
#if HAKMEM_ULTRA_FAST_PATH
void* ret = tiny_alloc_fast_ultra(size);
if (ret) return ret;
// Miss → fallback to full fast path
#endif
// Bench-only ultra-short path: bypass diagnostics and pointer tracking
// Enable with: HAKMEM_BENCH_FAST_FRONT=1
static int g_bench_fast_front = -1;
if (__builtin_expect(g_bench_fast_front == -1, 0)) {
const char* e = getenv("HAKMEM_BENCH_FAST_FRONT");
g_bench_fast_front = (e && *e && *e != '0') ? 1 : 0;
}
if (__builtin_expect(g_bench_fast_front, 0)) {
return tiny_alloc_fast(size);
}
static _Atomic uint64_t wrapper_call_count = 0;
uint64_t call_num = atomic_fetch_add(&wrapper_call_count, 1);
// Pointer tracking init (first call only)
PTR_TRACK_INIT();
// PRIORITY 3: Periodic canary validation (every 1000 ops)
periodic_canary_check(call_num, "hak_tiny_alloc_fast_wrapper");
// Box I: Periodic full integrity check (every 5000 ops)
#if HAKMEM_INTEGRITY_LEVEL >= 3
if ((call_num % 5000) == 0) {
extern void integrity_periodic_full_check(const char*);
integrity_periodic_full_check("periodic check in alloc wrapper");
}
#endif
#if !HAKMEM_BUILD_RELEASE
if (call_num > 14250 && call_num < 14280 && size <= 1024) {
fprintf(stderr, "[HAK_TINY_ALLOC_FAST_WRAPPER] call=%lu size=%zu\n", call_num, size);
fflush(stderr);
}
#endif
void* result = tiny_alloc_fast(size);
#if !HAKMEM_BUILD_RELEASE
if (call_num > 14250 && call_num < 14280 && size <= 1024) {
fprintf(stderr, "[HAK_TINY_ALLOC_FAST_WRAPPER] call=%lu returned %p\n", call_num, result);
fflush(stderr);
}
#endif
return result;
}
void hak_tiny_free_fast_wrapper(void* ptr) {
static _Atomic uint64_t free_call_count = 0;
uint64_t call_num = atomic_fetch_add(&free_call_count, 1);
if (call_num > 14135 && call_num < 14145) {
fprintf(stderr, "[HAK_TINY_FREE_FAST_WRAPPER] call=%lu ptr=%p\n", call_num, ptr);
fflush(stderr);
}
Remove legacy redundant code after Gatekeeper Box consolidation Summary of Deletions: - Remove core/box/unified_batch_box.c (26 lines) * Legacy batch allocation logic superseded by Alloc Gatekeeper Box * unified_cache now handles allocation aggregation - Remove core/box/unified_batch_box.h (29 lines) * Header declarations for deprecated unified_batch_box module - Remove core/tiny_free_fast.inc.h (329 lines) * Legacy fast-path free implementation * Functionality consolidated into: - tiny_free_gate_box.h (Fail-Fast layer + diagnostics) - malloc_tiny_fast.h (Free path integration) - unified_cache (return to freelist) * Code path now routes through Gatekeeper Box for consistency Build System Updates: - Update Makefile * Remove unified_batch_box.o from OBJS_BASE * Remove unified_batch_box_shared.o from SHARED_OBJS * Remove unified_batch_box.o from BENCH_HAKMEM_OBJS_BASE - Update core/hakmem_tiny_phase6_wrappers_box.inc * Remove unified_batch_box references * Simplify allocation wrapper to use new Gatekeeper architecture Impact: - Removes ~385 lines of redundant/superseded code - Consolidates allocation logic through unified Gatekeeper entry points - All functionality preserved via new Box-based architecture - Simplifies codebase and reduces maintenance burden Testing: - Build verification: make clean && make RELEASE=0/1 - Smoke tests: All pass (simple_alloc, loop 10M, pool_tls) - No functional regressions Rationale: After implementing Alloc/Free Gatekeeper Boxes with Fail-Fast layers and Unified Cache type safety, the legacy separate implementations became redundant. This commit completes the architectural consolidation and simplifies the allocator codebase. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 12:55:53 +09:00
// Box 6 v1 (tiny_free_fast) は v2 に置き換え済み。
// Wrapper からは Box 経路の slow/fast 判定に委ねる。
hak_tiny_free(ptr);
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
if (call_num > 14135 && call_num < 14145) {
fprintf(stderr, "[HAK_TINY_FREE_FAST_WRAPPER] call=%lu completed\n", call_num);
fflush(stderr);
}
}
ENV Cleanup: Delete Ultra HEAP & BG Remote dead code (-1,096 LOC) Deleted files (11): - core/ultra/ directory (6 files: tiny_ultra_heap.*, tiny_ultra_page_arena.*) - core/front/tiny_ultrafront.h - core/tiny_ultra_fast.inc.h - core/hakmem_tiny_ultra_front.inc.h - core/hakmem_tiny_ultra_simple.inc - core/hakmem_tiny_ultra_batch_box.inc Edited files (10): - core/hakmem_tiny.c: Remove Ultra HEAP #includes, move ultra_batch_for_class() - core/hakmem_tiny_tls_state_box.inc: Delete TinyUltraFront, g_ultra_simple - core/hakmem_tiny_phase6_wrappers_box.inc: Delete ULTRA_SIMPLE block - core/hakmem_tiny_alloc.inc: Delete Ultra-Front code block - core/hakmem_tiny_init.inc: Delete ULTRA_SIMPLE ENV loading - core/hakmem_tiny_remote_target.{c,h}: Delete g_bg_remote_enable/batch - core/tiny_refill.h: Remove BG Remote check (always break) - core/hakmem_tiny_background.inc: Delete BG Remote drain loop Deleted ENV variables: - HAKMEM_TINY_ULTRA_HEAP (build flag, undefined) - HAKMEM_TINY_ULTRA_L0 - HAKMEM_TINY_ULTRA_HEAP_DUMP - HAKMEM_TINY_ULTRA_PAGE_DUMP - HAKMEM_TINY_ULTRA_FRONT - HAKMEM_TINY_BG_REMOTE (no getenv, dead code) - HAKMEM_TINY_BG_REMOTE_BATCH (no getenv, dead code) - HAKMEM_TINY_ULTRA_SIMPLE (references only) Impact: - Code reduction: -1,096 lines - Binary size: 305KB → 304KB (-1KB) - Build: PASS - Sanity: 15.69M ops/s (3 runs avg) - Larson: 1 crash observed (seed 43, likely existing instability) Notes: - Ultra HEAP never compiled (#if HAKMEM_TINY_ULTRA_HEAP undefined) - BG Remote variables never initialized (g_bg_remote_enable always 0) - Ultra SLIM (ultra_slim_alloc_box.h) preserved (active 4-layer path) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 04:35:47 +09:00
// HAKMEM_TINY_PHASE6_ULTRA_SIMPLE - REMOVED (dead code cleanup 2025-11-27)
Refactor: Extract 5 Box modules from hakmem_tiny.c (-52% size reduction) Split hakmem_tiny.c (2081 lines) into focused modules for better maintainability. ## Changes **hakmem_tiny.c**: 2081 → 995 lines (-1086 lines, -52% reduction) ## Extracted Modules (5 boxes) 1. **config_box** (211 lines) - Size class tables, integrity counters - Debug flags, benchmark macros - HAK_RET_ALLOC/HAK_STAT_FREE instrumentation 2. **publish_box** (419 lines) - Publish/Adopt counters and statistics - Bench mailbox, partial ring - Live cap/Hot slot management - TLS helper functions (tiny_tls_default_*) 3. **globals_box** (256 lines) - Global variable declarations (~70 variables) - TinyPool instance and initialization flag - TLS variables (g_tls_lists, g_fast_head, g_fast_count) - SuperSlab configuration (partial ring, empty reserves) - Adopt gate functions 4. **phase6_wrappers_box** (122 lines) - Phase 6 Box Theory wrapper layer - hak_tiny_alloc_fast_wrapper() - hak_tiny_free_fast_wrapper() - Diagnostic instrumentation 5. **ace_guard_box** (100 lines) - ACE Learning Layer (hkm_ace_set_drain_threshold) - FastCache API (tiny_fc_room, tiny_fc_push_bulk) - Tiny Guard debugging system (5 functions) ## Benefits - **Readability**: Giant 2k file → focused 1k core + 5 coherent modules - **Maintainability**: Each box has clear responsibility and boundaries - **Build**: All modules compile successfully ✅ ## Technical Details - Phase 1: ChatGPT extracted config_box + publish_box (-625 lines) - Phase 2-4: Claude extracted globals_box + phase6_wrappers_box + ace_guard_box (-461 lines) - All extractions use .inc files (same translation unit, preserves static/TLS linkage) - Fixed Makefile: Added tiny_sizeclass_hist_box.o to OBJS_BASE and BENCH_HAKMEM_OBJS_BASE 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-21 01:16:45 +09:00
#elif defined(HAKMEM_TINY_PHASE6_METADATA)
// Phase 6-1.6: Metadata header (recommended)
#include "hakmem_tiny_metadata.inc"
#endif