|
|
43015725af
|
ENV cleanup: Add RELEASE guards to DEBUG ENV variables (14 vars)
Added compile-time guards (#if HAKMEM_BUILD_RELEASE) to eliminate
DEBUG ENV variable overhead in RELEASE builds.
Variables guarded (14 total):
- HAKMEM_TINY_TRACE_RING, HAKMEM_TINY_DUMP_RING_ATEXIT
- HAKMEM_TINY_RF_TRACE, HAKMEM_TINY_MAILBOX_TRACE
- HAKMEM_TINY_MAILBOX_TRACE_LIMIT, HAKMEM_TINY_MAILBOX_SLOWDISC
- HAKMEM_TINY_MAILBOX_SLOWDISC_PERIOD
- HAKMEM_SS_PREWARM_DEBUG, HAKMEM_SS_FREE_DEBUG
- HAKMEM_TINY_FRONT_METRICS, HAKMEM_TINY_FRONT_DUMP
- HAKMEM_TINY_COUNTERS_DUMP, HAKMEM_TINY_REFILL_DUMP
- HAKMEM_PTR_TRACE_DUMP, HAKMEM_PTR_TRACE_VERBOSE
Files modified (9 core files):
- core/tiny_debug_ring.c (ring trace/dump)
- core/box/mailbox_box.c (mailbox trace + slowdisc)
- core/tiny_refill.h (refill trace)
- core/hakmem_tiny_superslab.c (superslab debug)
- core/box/ss_allocation_box.c (allocation debug)
- core/tiny_superslab_free.inc.h (free debug)
- core/box/front_metrics_box.c (frontend metrics)
- core/hakmem_tiny_stats.c (stats dump)
- core/ptr_trace.h (pointer trace)
Bug fixes during implementation:
1. mailbox_box.c - Fixed variable scope (moved 'used' outside guard)
2. hakmem_tiny_stats.c - Fixed incomplete declarations (on1, on2)
Impact:
- Binary size: -85KB total
- bench_random_mixed_hakmem: 319K → 305K (-14K, -4.4%)
- larson_hakmem: 380K → 309K (-71K, -18.7%)
- Performance: No regression (16.9-17.9M ops/s maintained)
- Functional: All tests pass (Random Mixed + Larson)
- Behavior: DEBUG ENV vars correctly ignored in RELEASE builds
Testing:
- Build: Compiles cleanly (only pre-existing warnings)
- 100K Random Mixed: 16.9-17.9M ops/s (PASS)
- 10K Larson: 25.9M ops/s (PASS)
- DEBUG ENV verification: Correctly ignored (PASS)
Result: 14 DEBUG ENV variables now have zero overhead in RELEASE builds.
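The guard pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: HAKMEM_BUILD_RELEASE and HAKMEM_TINY_TRACE_RING come from the message, while env_flag_on() and tiny_trace_ring_enabled() are hypothetical helper names, and the "non-empty and not starting with 0" parsing rule is an assumption.

```c
#include <stdlib.h>

/* Assumption: the build system defines HAKMEM_BUILD_RELEASE as 0 or 1. */
#ifndef HAKMEM_BUILD_RELEASE
#define HAKMEM_BUILD_RELEASE 0
#endif

/* Hypothetical parser: a value counts as "on" if it is non-empty
 * and does not start with '0'. */
static inline int env_flag_on(const char *value) {
    return (value != NULL && value[0] != '\0' && value[0] != '0') ? 1 : 0;
}

/* Hypothetical query for one of the guarded DEBUG variables. */
static inline int tiny_trace_ring_enabled(void) {
#if HAKMEM_BUILD_RELEASE
    /* RELEASE: compile-time constant, so no getenv() call and no
     * cached state; the compiler can drop dependent branches. */
    return 0;
#else
    /* DEBUG: read the environment once and cache the result. */
    static int cached = -1;
    if (cached < 0) {
        cached = env_flag_on(getenv("HAKMEM_TINY_TRACE_RING"));
    }
    return cached;
#endif
}
```

Because the RELEASE branch returns a constant, any `if (tiny_trace_ring_enabled()) { ... }` block at a call site becomes dead code, which is consistent with the binary size reduction reported above.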
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
2025-11-27 03:41:07 +09:00 |
|
|
|
a78224123e
|
Fix C0/C7 class confusion: Upgrade C7 stride to 2048B and fix meta->class_idx initialization
Root Cause:
1. C7 stride was 1024B, which cannot serve a 1024B user request (1025B is needed including the header)
2. New SuperSlabs start with meta->class_idx=0 (mmap zero-init)
3. superslab_init_slab() only sets class_idx if meta->class_idx==255
4. Multiple code paths used conditional assignment (if class_idx==255), leaving C7 slabs with class_idx=0
5. This caused C7 blocks to be misidentified as C0, leading to HDR_META_MISMATCH errors
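The interaction between zero-initialized metadata and the 255 sentinel can be reproduced in a few lines. This is a reduced sketch: SlabMetaMini, init_conditional() and init_explicit() are hypothetical names, and the real metadata struct has more fields than the single class_idx shown here.

```c
#include <stdint.h>

/* Hypothetical, reduced slab metadata (the real struct is larger). */
typedef struct {
    uint8_t class_idx;  /* 0..7 = size class, 255 = "unassigned" sentinel */
} SlabMetaMini;

/* Problematic pattern: assigns only when the sentinel is present.
 * Memory fresh from mmap is zero-filled, so class_idx is 0, not 255,
 * and the assignment is silently skipped. */
static void init_conditional(SlabMetaMini *m, uint8_t cls) {
    if (m->class_idx == 255) {
        m->class_idx = cls;
    }
}

/* Explicit pattern: the caller knows the class, so always assign. */
static void init_explicit(SlabMetaMini *m, uint8_t cls) {
    m->class_idx = cls;
}
```

With a zero-filled SlabMetaMini, init_conditional(&m, 7) leaves class_idx at 0, so a C7 slab looks like a C0 slab; init_explicit(&m, 7) sets it to 7.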
Changes:
1. Upgrade C7 stride: 1024B → 2048B (can now serve 1024B requests)
2. Update blocks_per_slab[7]: 64 → 32 (64KB slab / 2048B stride)
3. Update size-to-class LUT: entries 513-2048 now map to C7
4. Fix superslab_init_slab() fail-safe: only reinitialize if class_idx==255 (not 0)
5. Add explicit class_idx assignment in 6 initialization paths:
- tiny_superslab_alloc.inc.h: superslab_refill() after init
- hakmem_tiny_superslab.c: backend_shared after init (main path)
- ss_unified_backend_box.c: unconditional assignment
- ss_legacy_backend_box.c: explicit assignment
- superslab_expansion_box.c: explicit assignment
- ss_allocation_box.c: fail-safe condition fix
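The stride arithmetic behind changes 1 and 2 can be checked with a short sketch. The 64KB slab size and the 1-byte header (1024B request needs 1025B) are taken from the message; the macro and function names are hypothetical.

```c
#include <stddef.h>

#define SLAB_BYTES   (64u * 1024u)  /* 64KB slab, per the commit message */
#define HEADER_BYTES 1u             /* a 1024B request needs 1025B */

/* Number of blocks of the given stride that fit in one slab. */
static inline size_t blocks_per_slab_for(size_t stride) {
    return SLAB_BYTES / stride;
}

/* Nonzero if a user request plus its header fits in one block. */
static inline int stride_fits(size_t user_size, size_t stride) {
    return user_size + HEADER_BYTES <= stride;
}
```

Under these assumptions, a 1024B stride gives 64 blocks per slab but cannot hold a 1024B request, while a 2048B stride gives 32 blocks and can.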
Fix P0 refill bug:
- Update obsolete array access after Phase 3d-B TLS SLL unification
- g_tls_sll_head[cls] → g_tls_sll[cls].head
- g_tls_sll_count[cls] → g_tls_sll[cls].count
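The renamed accesses suggest that two parallel per-class arrays were merged into one array of structs. A minimal sketch of that layout, assuming 8 size classes (C0..C7) and a next pointer stored at the start of each free block (the real allocator may place it elsewhere because of the header); TinyTlsSllSketch, sll_push() and sll_pop() are hypothetical names, while g_tls_sll, head and count come from the message:

```c
#include <stddef.h>
#include <stdint.h>

#define TINY_NUM_CLASSES 8  /* assumption: classes C0..C7 */

/* One entry per size class; head and count now share a cache line
 * instead of living in two separate arrays. */
typedef struct {
    void    *head;
    uint32_t count;
} TinyTlsSllSketch;

static _Thread_local TinyTlsSllSketch g_tls_sll[TINY_NUM_CLASSES];

/* Push a free block onto the class's thread-local singly linked list.
 * Simplification: the next pointer lives at offset 0 of the block. */
static inline void sll_push(int cls, void *block) {
    *(void **)block = g_tls_sll[cls].head;
    g_tls_sll[cls].head = block;
    g_tls_sll[cls].count++;
}

/* Pop the most recently pushed block, or NULL if the list is empty. */
static inline void *sll_pop(int cls) {
    void *block = g_tls_sll[cls].head;
    if (block == NULL) {
        return NULL;
    }
    g_tls_sll[cls].head = *(void **)block;
    g_tls_sll[cls].count--;
    return block;
}
```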
Results:
- HDR_META_MISMATCH: eliminated (0 errors in 100K iterations)
- 1024B allocations now routed to C7 (Tiny fast path)
- NXT_MISALIGN warnings remain (legacy 1024B SuperSlabs, separate issue)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
2025-11-21 13:44:05 +09:00 |
|
|
|
38552c3f39
|
Phase 3d-A: SlabMeta Box boundary - Encapsulate SuperSlab metadata access
ChatGPT-guided Box theory refactoring (Phase A: Boundary only).
Changes:
- Created ss_slab_meta_box.h with 15 inline accessor functions
- HOT fields (8): freelist, used, capacity (fast path)
- COLD fields (6): class_idx, carved, owner_tid_low (init/debug)
- Legacy (1): ss_slab_meta_ptr() for atomic ops
- Migrated 14 direct slabs[] access sites across 6 files
- hakmem_shared_pool.c (4 sites)
- tiny_free_fast_v2.inc.h (1 site)
- hakmem_tiny.c (3 sites)
- external_guard_box.h (1 site)
- hakmem_tiny_lifecycle.inc (1 site)
- ss_allocation_box.c (4 sites)
Architecture:
- Zero overhead (static inline wrappers)
- Single point of change for future layout optimizations
- Enables Hot/Cold split (Phase C) without touching call sites
- A/B testing support via compile-time flags
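A thin accessor layer of the kind described above might look like the sketch below. Only ss_slab_meta_ptr() and the field names (freelist, used, capacity, class_idx) appear in the message; the struct layouts, the 32-slab array size, and the getter/setter names are assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical, reduced metadata layout. */
typedef struct {
    void    *freelist;   /* HOT */
    uint16_t used;       /* HOT */
    uint16_t capacity;   /* HOT */
    uint8_t  class_idx;  /* COLD */
} SlabMetaSketch;

typedef struct {
    SlabMetaSketch slabs[32];  /* assumption: slab count per SuperSlab */
} SuperSlabSketch;

/* HOT accessors: static inline, so they compile to the same loads and
 * stores as direct ss->slabs[i].field access. */
static inline uint16_t ss_slab_meta_used_get(SuperSlabSketch *ss, int i) {
    return ss->slabs[i].used;
}
static inline void ss_slab_meta_used_set(SuperSlabSketch *ss, int i, uint16_t v) {
    ss->slabs[i].used = v;
}

/* COLD accessor. */
static inline uint8_t ss_slab_meta_class_idx_get(SuperSlabSketch *ss, int i) {
    return ss->slabs[i].class_idx;
}

/* Legacy escape hatch for call sites that need the raw pointer,
 * for example for atomic operations. */
static inline SlabMetaSketch *ss_slab_meta_ptr(SuperSlabSketch *ss, int i) {
    return &ss->slabs[i];
}
```

If the fields are later split into separate hot and cold arrays, only the accessor bodies would need to change, not the call sites.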
Verification:
- Build: ✅ Success (no errors)
- Stability: ✅ All sizes pass (128B-1KB, 22-24M ops/s)
- Behavior: Unchanged (thin wrapper, no logic changes)
Next: Phase B (TLS Cache Merge, +12-18% expected)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
2025-11-20 02:01:52 +09:00 |
|