d9991f39ff
Phase ALLOC-TINY-FAST-DUALHOT-1 & Optimization Roadmap Update
...
Add comprehensive design docs and research boxes:
- docs/analysis/ALLOC_TINY_FAST_DUALHOT_1_DESIGN.md: ALLOC DUALHOT investigation
- docs/analysis/FREE_TINY_FAST_DUALHOT_1_DESIGN.md: FREE DUALHOT final specs
- docs/analysis/FREE_TINY_FAST_HOTCOLD_OPT_1_DESIGN.md: Hot/Cold split research
- docs/analysis/POOL_MID_INUSE_DEFERRED_DN_BATCH_DESIGN.md: Deferred batching design
- docs/analysis/POOL_MID_INUSE_DEFERRED_REGRESSION_ANALYSIS.md: Stats overhead findings
- docs/analysis/MID_DESC_CACHE_BENCHMARK_2025-12-12.md: Cache measurement results
- docs/analysis/LAST_MATCH_CACHE_IMPLEMENTATION.md: TLS cache investigation
Research boxes (SS page table):
- core/box/ss_pt_env_box.h: HAKMEM_SS_LOOKUP_KIND gate
- core/box/ss_pt_types_box.h: 2-level page table structures
- core/box/ss_pt_lookup_box.h: ss_pt_lookup() implementation
- core/box/ss_pt_register_box.h: Page table registration
- core/box/ss_pt_impl.c: Global definitions
Updates:
- docs/specs/ENV_VARS_COMPLETE.md: HOTCOLD, DEFERRED, SS_LOOKUP env vars
- core/box/hak_free_api.inc.h: FREE-DISPATCH-SSOT integration
- core/box/pool_mid_inuse_deferred_box.h: Deferred API updates
- core/box/pool_mid_inuse_deferred_stats_box.h: Stats collection
- core/hakmem_super_registry: SS page table integration
Current Status:
- FREE-TINY-FAST-DUALHOT-1: +13% improvement, ready for adoption
- ALLOC-TINY-FAST-DUALHOT-1: -2% regression, frozen as research box
- Next: Optimization roadmap per ROI (mimalloc gap 2.5x)
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-12-13 05:35:46 +09:00
e95e61f0ff
Phase POLICY-FAST-PATH-V2 complete + MID-V35-HOTPATH-OPT-1 design
...
## Phase POLICY-FAST-PATH-V2 (FROZEN)
- Implementation complete: free_policy_fast_v2_box.h + malloc_tiny_fast.h integration
- A/B Results:
- Mixed (ws=400): -1.6% regression ❌ (branch cost > skip benefit)
- C6-heavy (ws=200): +5.4% improvement ✅
- Decision: Default OFF, FROZEN (ws<300 / C6-heavy research only)
- Learning: Large WS causes branch misprediction to dominate
## Phase 3-GRADUATE + ENV probe fix
- 64-probe retry for getenv() stability during bench_profile putenv()
- C6 ULTRA intrusive freelist: FROZEN (research box)
## Phase MID-V35-HOTPATH-OPT-1-DESIGN
- Design doc for next optimization target
- Target: MID v3.5 alloc/free hot path (C5-C6)
- Boxes: Stats Gate, TLS Layout, Boundary Check elimination
- Expected: +3-9% on Mixed mainline
Files:
- core/box/free_policy_fast_v2_box.h (new)
- core/box/free_path_stats_box.h/c (policy_fast_v2_skip counter)
- core/front/malloc_tiny_fast.h (fast-path integration)
- docs/analysis/MID_V35_HOTPATH_OPT_1_DESIGN.md (new)
- docs/analysis/PHASE_3_GRADUATE_*.md (new)
- CURRENT_TASK.md (phase status update)
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-12-12 18:40:08 +09:00
cba6f785a1
Add SuperSlab Prefault Box with 4MB MAP_POPULATE bug fix
...
New Feature: ss_prefault_box.h
- Box for controlling SuperSlab page prefaulting policy
- ENV: HAKMEM_SS_PREFAULT (0=OFF, 1=POPULATE, 2=TOUCH)
- Default: OFF (safe mode until further optimization)
Bug Fix: 4MB MAP_POPULATE regression
- Problem: Fallback path allocated 4MB (2x size for alignment) with MAP_POPULATE
causing 52x slower mmap (0.585ms → 30.6ms) and 35% throughput regression
- Solution: Remove MAP_POPULATE from 4MB allocation, apply madvise(MADV_WILLNEED)
only to the aligned 2MB region after trimming prefix/suffix
Changes:
- core/box/ss_prefault_box.h: New prefault policy box (header-only)
- core/box/ss_allocation_box.c: Integrate prefault box, call ss_prefault_region()
- core/superslab_cache.c: Fix fallback path - no MAP_POPULATE on 4MB,
always munmap prefix/suffix, use MADV_WILLNEED for 2MB only
- docs/specs/ENV_VARS*.md: Document HAKMEM_SS_PREFAULT
Performance:
- bench_random_mixed: 4.32M ops/s (regression fixed, slight improvement)
- bench_tiny_hot: 157M ops/s with prefault=1 (no crash)
Box Theory:
- OS layer (ss_os_acquire): "how to mmap"
- Prefault Box: "when to page-in"
- Allocation Box: "when to call prefault"
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com >
2025-12-04 20:11:24 +09:00
d5e6ed535c
P-Tier + Tiny Route Policy: Aggressive Superslab Management + Safe Routing
...
## Phase 1: Utilization-Aware Superslab Tiering (案B実装済)
- Add ss_tier_box.h: Classify SuperSlabs into HOT/DRAINING/FREE based on utilization
- HOT (>25%): Accept new allocations
- DRAINING (≤25%): Drain only, no new allocs
- FREE (0%): Ready for eager munmap
- Enhanced shared_pool_release_slab():
- Check tier transition after each slab release
- If tier→FREE: Force remaining slots to EMPTY and call superslab_free() immediately
- Bypasses LRU cache to prevent registry bloat from accumulating DRAINING SuperSlabs
- Test results (bench_random_mixed_hakmem):
- 1M iterations: ✅ ~1.03M ops/s (previously passed)
- 10M iterations: ✅ ~1.15M ops/s (previously: registry full error)
- 50M iterations: ✅ ~1.08M ops/s (stress test)
## Phase 2: Tiny Front Routing Policy (新規Box)
- Add tiny_route_box.h/c: Single 8-byte table for class→routing decisions
- ROUTE_TINY_ONLY: Tiny front exclusive (no fallback)
- ROUTE_TINY_FIRST: Try Tiny, fallback to Pool if fails
- ROUTE_POOL_ONLY: Skip Tiny entirely
- Profiles via HAKMEM_TINY_PROFILE ENV:
- "hot": C0-C3=TINY_ONLY, C4-C6=TINY_FIRST, C7=POOL_ONLY
- "conservative" (default): All TINY_FIRST
- "off": All POOL_ONLY (disable Tiny)
- "full": All TINY_ONLY (microbench mode)
- A/B test results (ws=256, 100k ops random_mixed):
- Default (conservative): ~2.90M ops/s
- hot: ~2.65M ops/s (more conservative)
- off: ~2.86M ops/s
- full: ~2.98M ops/s (slightly best)
## Design Rationale
### Registry Pressure Fix (案B)
- Problem: DRAINING tier SS occupied registry indefinitely
- Solution: When total_active_blocks→0, immediately free to clear registry slot
- Result: No more "registry full" errors under stress
### Routing Policy Box (新)
- Problem: Tiny front optimization scattered across ENV/branches
- Solution: Centralize routing in single table, select profiles via ENV
- Benefit: Safe A/B testing without touching hot path code
- Future: Integrate with RSS budget/learning layers for dynamic profile switching
## Next Steps (性能最適化)
- Profile Tiny front internals (TLS SLL, FastCache, Superslab backend latency)
- Identify bottleneck between current ~2.9M ops/s and mimalloc ~100M ops/s
- Consider:
- Reduce shared pool lock contention
- Optimize unified cache hit rate
- Streamline Superslab carving logic
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-12-04 18:01:25 +09:00
984cca41ef
P0 Optimization: Shared Pool fast path with O(1) metadata lookup
...
Performance Results:
- Throughput: 2.66M ops/s → 3.8M ops/s (+43% improvement)
- sp_meta_find_or_create: O(N) linear scan → O(1) direct pointer
- Stage 2 metadata scan: 100% → 10-20% (80-90% reduction via hints)
Core Optimizations:
1. O(1) Metadata Lookup (superslab_types.h)
- Added `shared_meta` pointer field to SuperSlab struct
- Eliminates O(N) linear search through ss_metadata[] array
- First access: O(N) scan + cache | Subsequent: O(1) direct return
2. sp_meta_find_or_create Fast Path (hakmem_shared_pool.c)
- Check cached ss->shared_meta first before linear scan
- Cache pointer after successful linear scan for future lookups
- Reduces 7.8% CPU hotspot to near-zero for hot paths
3. Stage 2 Class Hints Fast Path (hakmem_shared_pool_acquire.c)
- Try class_hints[class_idx] FIRST before full metadata scan
- Uses O(1) ss->shared_meta lookup for hint validation
- __builtin_expect() for branch prediction optimization
- 80-90% of acquire calls now skip full metadata scan
4. Proper Initialization (ss_allocation_box.c)
- Initialize shared_meta = NULL in superslab_allocate()
- Ensures correct NULL-check semantics for new SuperSlabs
Additional Improvements:
- Updated ptr_trace and debug ring for release build efficiency
- Enhanced ENV variable documentation and analysis
- Added learner_env_box.h for configuration management
- Various Box optimizations for reduced overhead
Thread Safety:
- All atomic operations use correct memory ordering
- shared_meta cached under mutex protection
- Lock-free Stage 2 uses proper CAS with acquire/release semantics
Testing:
- Benchmark: 1M iterations, 3.8M ops/s stable
- Build: Clean compile RELEASE=0 and RELEASE=1
- No crashes, memory leaks, or correctness issues
Next Optimization Candidates:
- P1: Per-SuperSlab free slot bitmap for O(1) slot claiming
- P2: Reduce Stage 2 critical section size
- P3: Page pre-faulting (MAP_POPULATE)
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-12-04 16:21:54 +09:00
0546454168
WIP: Add TLS SLL validation and SuperSlab registry fallback
...
ChatGPT's diagnostic changes to address TLS_SLL_HDR_RESET issue.
Current status: Partial mitigation, but root cause remains.
Changes Applied:
1. SuperSlab Registry Fallback (hakmem_super_registry.h)
- Added legacy table probe when hash map lookup misses
- Prevents NULL returns for valid SuperSlabs during initialization
- Status: ✅ Works but may hide underlying registration issues
2. TLS SLL Push Validation (tls_sll_box.h)
- Reject push if SuperSlab lookup returns NULL
- Reject push if class_idx mismatch detected
- Added [TLS_SLL_PUSH_NO_SS] diagnostic message
- Status: ✅ Prevents list corruption (defensive)
3. SuperSlab Allocation Class Fix (superslab_allocate.c)
- Pass actual class_idx to sp_internal_allocate_superslab
- Prevents dummy class=8 causing OOB access
- Status: ✅ Root cause fix for allocation path
4. Debug Output Additions
- First 256 push/pop operations traced
- First 4 mismatches logged with details
- SuperSlab registration state logged
- Status: ✅ Diagnostic tool (not a fix)
5. TLS Hint Box Removed
- Deleted ss_tls_hint_box.{c,h} (Phase 1 optimization)
- Simplified to focus on stability first
- Status: ⏳ Can be re-added after root cause fixed
Current Problem (REMAINS UNSOLVED):
- [TLS_SLL_HDR_RESET] still occurs after ~60 seconds of sh8bench
- Pointer is 16 bytes offset from expected (class 1 → class 2 boundary)
- hak_super_lookup returns NULL for that pointer
- Suggests: Use-After-Free, Double-Free, or pointer arithmetic error
Root Cause Analysis:
- Pattern: Pointer offset by +16 (one class 1 stride)
- Timing: Cumulative problem (appears after 60s, not immediately)
- Location: Header corruption detected during TLS SLL pop
Remaining Issues:
⚠️ Registry fallback is defensive (may hide registration bugs)
⚠️ Push validation prevents symptoms but not root cause
⚠️ 16-byte pointer offset source unidentified
Next Steps for Investigation:
1. Full pointer arithmetic audit (Magazine ⇔ TLS SLL paths)
2. Enhanced logging at HDR_RESET point:
- Expected vs actual pointer value
- Pointer provenance (where it came from)
- Allocation trace for that block
3. Verify Headerless flag is OFF throughout build
4. Check for double-offset application in conversions
Technical Assessment:
- 60% root cause fixes (allocation class, validation)
- 40% defensive mitigation (registry fallback, push rejection)
Performance Impact:
- Registry fallback: +10-30 cycles on cold path (negligible)
- Push validation: +5-10 cycles per push (acceptable)
- Overall: < 2% performance impact estimated
Related Issues:
- Phase 1 TLS Hint Box removed temporarily
- Phase 2 Headerless blocked until stability achieved
🤖 Generated with Claude Code (https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-12-03 20:42:28 +09:00
7f9e4015da
docs: Update ENV_VARS.md with Phase 3 additions
...
Added documentation for new environment variables and build flags:
Benchmark Environment Variables:
- HAKMEM_BENCH_FAST_FRONT: Enable ultra-fast header-based free path
- HAKMEM_BENCH_WARMUP: Warmup cycles before timed run
- HAKMEM_FREE_ROUTE_TRACE: Debug trace for free() routing
- HAKMEM_EXTERNAL_GUARD_LOG: ExternalGuard debug logging
- HAKMEM_EXTERNAL_GUARD_STATS: ExternalGuard statistics at exit
Build Flags:
- HAKMEM_TINY_SS_TRUST_MMAP_ZERO: mmap zero-trust optimization
- Default: 0 (safe)
- Performance: +5.93% on bench_tiny_hot (allocation-heavy)
- Safety: Release-only, cache reuse always gets full memset
- Location: core/hakmem_build_flags.h:170-180
- Implementation: core/box/ss_allocation_box.c:37-78
Deprecated:
- HAKMEM_DISABLE_MINCORE_CHECK: Removed in Phase 3 (commit d78baf41c )
Each entry includes:
- Default value
- Usage example
- Effect description
- Source code location
- A/B testing guidance (where applicable)
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-11-29 09:58:14 +09:00
2a47624850
Document Phase 4c/4d master trace and stats control
...
Complete ENV cleanup Phase 4 documentation:
- Phase 4c: HAKMEM_TRACE unified trace control
- Phase 4d: HAKMEM_STATS unified stats control
- Summary of all 6 new master control variables
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-11-28 16:11:38 +09:00
322d94ac6a
Document Phase 4b master debug control in ENV_VARS.md
...
Add documentation for new HAKMEM_DEBUG_ALL and HAKMEM_DEBUG_LEVEL
environment variables introduced in Phase 4b.
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-11-28 16:03:53 +09:00
eec33ca37d
Document ENV Cleanup Phase 4a completion (20 variables total)
...
Phase 4a: Hot Path getenv Caching - COMPLETED
- All getenv() calls in hot paths verified as properly cached
- Key fix: hakmem_elo.c (10+ loop calls → cached is_quiet())
- Verified correct caching in 7 other critical files
Added ENV_VARIABLE_SURVEY.md:
- Comprehensive survey of 228 ENV variables
- Category breakdown and consolidation recommendations
- Target: ~80 variables (65% reduction)
Updated docs/specs/ENV_VARS.md:
- Added ENV Cleanup Progress section
- Documented Phase 4a findings
- Outlined Phase 4b+ future work (HAKMEM_DEBUG/TRACE/STATS unified vars)
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-11-28 15:29:16 +09:00
a6e681aae7
P2: TLS SLL Redesign - class_map default, tls_cached tracking, conditional header restore
...
This commit completes the P2 phase of the Tiny Pool TLS SLL redesign to fix the
Header/Next pointer conflict that was causing ~30% crash rates.
Changes:
- P2.1: Make class_map lookup the default (ENV: HAKMEM_TINY_NO_CLASS_MAP=1 for legacy)
- P2.2: Add meta->tls_cached field to track blocks cached in TLS SLL
- P2.3: Make Header restoration conditional in tiny_next_store() (default: skip)
- P2.4: Add invariant verification functions (active + tls_cached ≈ used)
- P0.4: Document new ENV variables in ENV_VARS.md
New ENV variables:
- HAKMEM_TINY_ACTIVE_TRACK=1: Enable active/tls_cached tracking (~1% overhead)
- HAKMEM_TINY_NO_CLASS_MAP=1: Disable class_map (legacy mode)
- HAKMEM_TINY_RESTORE_HEADER=1: Force header restoration (legacy mode)
- HAKMEM_TINY_INVARIANT_CHECK=1: Enable invariant verification (debug)
- HAKMEM_TINY_INVARIANT_DUMP=1: Enable periodic state dumps (debug)
Benchmark results (bench_tiny_hot_hakmem 64B):
- Default (class_map ON): 84.49 M ops/sec
- ACTIVE_TRACK=1: 83.62 M ops/sec (-1%)
- NO_CLASS_MAP=1 (legacy): 85.06 M ops/sec
- MT performance: +21-28% vs system allocator
No crashes observed. All tests passed.
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-11-28 14:11:37 +09:00
b1b2ab11c7
Update CONFIGURATION.md with ENV Cleanup Phase 1-3 results
...
Added new section "Debug Variables (Gated in Release Builds)" documenting:
- 13 debug variables now compiled out in HAKMEM_BUILD_RELEASE=1
- 4 production config variables preserved (intentional)
- Performance impact: +1.0% (30.2M → 30.5M ops/s)
Updated sections:
- Header: Last Updated 2025-11-28
- Recent Changes: Added Phase 1-3 entry
- FAQ: Added Q&A about gated debug variables
- See Also: Added link to ENV_CLEANUP_TASK.md
Variables documented:
- Core debug: TINY_ALLOC_DEBUG, TINY_PROFILE, WATCH_ADDR
- Trace/timing: PTR_TRACE_*, TIMING
- Freelist: TINY_SLL_DIAG, FREELIST_MASK, SS_FREE_DEBUG
- SuperSlab: SUPER_LOOKUP_DEBUG, SUPER_REG_DEBUG, SS_LRU_DEBUG, SS_PREWARM_DEBUG
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-11-28 04:22:37 +09:00
f4978b1529
ENV Cleanup Phase 5: Additional DEBUG guards + doc cleanup
...
Code changes:
- core/slab_handle.h: Add RELEASE guard for HAKMEM_TINY_FREELIST_MASK
- core/tiny_superslab_free.inc.h: Add guards for HAKMEM_TINY_ROUTE_FREE, HAKMEM_TINY_FREELIST_MASK
Documentation cleanup:
- docs/specs/CONFIGURATION.md: Remove 21 doc-only ENV variables
- docs/specs/ENV_VARS.md: Remove doc-only variables
Testing:
- Build: PASS (305KB binary, unchanged)
- Sanity: PASS (17.22M ops/s average, 3 runs)
- Larson: PASS (52.12M ops/s, 0 crashes)
Impact:
- 2 additional DEBUG ENV variables guarded (no overhead in RELEASE)
- Documentation accuracy improved
- Binary size maintained
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-11-27 03:55:17 +09:00
d511084c5b
ENV cleanup: Remove 21 doc-only variables from ENV_VARS.md
...
Removed 21 ENV variables that existed only in documentation with zero
code references (no getenv() calls in source):
Pool/Refill (1):
- HAKMEM_POOL_REFILL_BATCH
Intelligence Engine (3):
- HAKMEM_INT_ENGINE, HAKMEM_INT_EVENT_TS, HAKMEM_INT_SAMPLE
Frontend/FastCache (3):
- HAKMEM_TINY_FRONTEND, HAKMEM_TINY_FASTCACHE, HAKMEM_TINY_FAST
Wrapper/Safety/Debug (5):
- HAKMEM_WRAP_TINY_REFILL, HAKMEM_SAFE_FREE_STRICT, HAKMEM_TINY_GUARD
- HAKMEM_TINY_DEBUG_FAST0, HAKMEM_TINY_DEBUG_REMOTE_GUARD
Optimization/TLS/Memory (9):
- HAKMEM_TINY_QUICK, HAKMEM_USE_REGISTRY
- HAKMEM_TINY_TLS_LIST, HAKMEM_TINY_DRAIN_TO_SLL, HAKMEM_TINY_ALLOC_RING
- HAKMEM_TINY_MEM_DIET, HAKMEM_SLL_MULTIPLIER
- HAKMEM_TINY_PREFETCH, HAKMEM_TINY_SS_RESERVE
Impact:
- ENV_VARS.md: 327 lines → 285 lines (-42 lines, 12.8% reduction)
- Code impact: Zero (documentation-only cleanup)
- Variables were: planned features never implemented, replaced features,
or abandoned experiments
Documentation:
- Added SAFE_TO_DELETE_ENV_VARS.md to docs/analysis/
- Complete analysis of why each variable is obsolete
- Verification proof that variables don't exist in code
File: docs/specs/ENV_VARS.md
Status: Documentation cleanup - no code changes
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-11-27 02:52:35 +09:00
a9ddb52ad4
ENV cleanup: Remove BG/HotMag vars & guard fprintf (Larson 52.3M ops/s)
...
Phase 1 完了:環境変数整理 + fprintf デバッグガード
ENV変数削除(BG/HotMag系):
- core/hakmem_tiny_init.inc: HotMag ENV 削除 (~131 lines)
- core/hakmem_tiny_bg_spill.c: BG spill ENV 削除
- core/tiny_refill.h: BG remote 固定値化
- core/hakmem_tiny_slow.inc: BG refs 削除
fprintf Debug Guards (#if !HAKMEM_BUILD_RELEASE):
- core/hakmem_shared_pool.c: Lock stats (~18 fprintf)
- core/page_arena.c: Init/Shutdown/Stats (~27 fprintf)
- core/hakmem.c: SIGSEGV init message
ドキュメント整理:
- 328 markdown files 削除(旧レポート・重複docs)
性能確認:
- Larson: 52.35M ops/s (前回52.8M、安定動作✅ )
- ENV整理による機能影響なし
- Debug出力は一部残存(次phase で対応)
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com >
2025-11-26 14:45:26 +09:00
67fb15f35f
Wrap debug fprintf in !HAKMEM_BUILD_RELEASE guards (Release build optimization)
...
## Changes
### 1. core/page_arena.c
- Removed init failure message (lines 25-27) - error is handled by returning early
- All other fprintf statements already wrapped in existing #if !HAKMEM_BUILD_RELEASE blocks
### 2. core/hakmem.c
- Wrapped SIGSEGV handler init message (line 72)
- CRITICAL: Kept SIGSEGV/SIGBUS/SIGABRT error messages (lines 62-64) - production needs crash logs
### 3. core/hakmem_shared_pool.c
- Wrapped all debug fprintf statements in #if !HAKMEM_BUILD_RELEASE:
- Node pool exhaustion warning (line 252)
- SP_META_CAPACITY_ERROR warning (line 421)
- SP_FIX_GEOMETRY debug logging (line 745)
- SP_ACQUIRE_STAGE0.5_EMPTY debug logging (line 865)
- SP_ACQUIRE_STAGE0_L0 debug logging (line 803)
- SP_ACQUIRE_STAGE1_LOCKFREE debug logging (line 922)
- SP_ACQUIRE_STAGE2_LOCKFREE debug logging (line 996)
- SP_ACQUIRE_STAGE3 debug logging (line 1116)
- SP_SLOT_RELEASE debug logging (line 1245)
- SP_SLOT_FREELIST_LOCKFREE debug logging (line 1305)
- SP_SLOT_COMPLETELY_EMPTY debug logging (line 1316)
- Fixed lock_stats_init() for release builds (lines 60-65) - ensure g_lock_stats_enabled is initialized
## Performance Validation
Before: 51M ops/s (with debug fprintf overhead)
After: 49.1M ops/s (consistent performance, fprintf removed from hot paths)
## Build & Test
```bash
./build.sh larson_hakmem
./out/release/larson_hakmem 1 5 1 1000 100 10000 42
# Result: 49.1M ops/s
```
Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-11-26 13:14:18 +09:00
52386401b3
Debug Counters Implementation - Clean History
...
Major Features:
- Debug counter infrastructure for Refill Stage tracking
- Free Pipeline counters (ss_local, ss_remote, tls_sll)
- Diagnostic counters for early return analysis
- Unified larson.sh benchmark runner with profiles
- Phase 6-3 regression analysis documentation
Bug Fixes:
- Fix SuperSlab disabled by default (HAKMEM_TINY_USE_SUPERSLAB)
- Fix profile variable naming consistency
- Add .gitignore patterns for large files
Performance:
- Phase 6-3: 4.79 M ops/s (has OOM risk)
- With SuperSlab: 3.13 M ops/s (+19% improvement)
This is a clean repository without large log files.
🤖 Generated with [Claude Code](https://claude.com/claude-code )
Co-Authored-By: Claude <noreply@anthropic.com >
2025-11-05 12:31:14 +09:00