|
|
19056282b6
|
Phase 3 D2: Wrapper Env Cache - [DECISION: NO-GO]
Target: Reduce wrapper_env_cfg() overhead in malloc/free hot path
- Strategy: Cache wrapper env configuration pointer in TLS
- Approach: Fast pointer cache (TLS caches const wrapper_env_cfg_t*)
Implementation:
- core/box/wrapper_env_cache_env_box.h: ENV gate (HAKMEM_WRAP_ENV_CACHE)
- core/box/wrapper_env_cache_box.h: TLS cache layer (wrapper_env_cfg_fast)
- core/box/hak_wrappers.inc.h: Integration into malloc/free hot paths
- ENV gate: HAKMEM_WRAP_ENV_CACHE=0/1 (default OFF)
A/B Test Results (Mixed, 10-run, 20M iters):
- Baseline (D2=0): 46.52M ops/s (avg), 46.47M ops/s (median)
- Optimized (D2=1): 45.85M ops/s (avg), 45.98M ops/s (median)
- Improvement: avg -1.44%, median -1.05% (DECISION: NO-GO)
Analysis:
- Regression cause: TLS cache adds overhead (branch + TLS access)
- wrapper_env_cfg() is already minimal (pointer return after simple check)
- Adding TLS caching layer makes it worse, not better
- Branch prediction penalty outweighs any potential savings
Cumulative Phase 2-3:
- B3: +2.89%, B4: +1.47%, C3: +2.20%
- D1: +1.06% (opt-in), D2: -1.44% (NO-GO)
- Total: ~7.2% (excluding D2)
Decision: FREEZE as research box (default OFF, regression confirmed)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
|
2025-12-13 22:03:27 +09:00 |
|
|
|
acc64f2438
|
Phase ML1: Pool v1 memset 89.73% overhead 軽量化 (+15.34% improvement)
## Summary
- ChatGPT により bench_profile.h の setenv segfault を修正(RTLD_NEXT 経由に切り替え)
- core/box/pool_zero_mode_box.h 新設:ENV キャッシュ経由で ZERO_MODE を統一管理
- core/hakmem_pool.c で zero mode に応じた memset 制御(FULL/header/off)
- A/B テスト結果:ZERO_MODE=header で +15.34% improvement(1M iterations, C6-heavy)
## Files Modified
- core/box/pool_api.inc.h: pool_zero_mode_box.h include
- core/bench_profile.h: glibc setenv → malloc+putenv(segfault 回避)
- core/hakmem_pool.c: zero mode 参照・制御ロジック
- core/box/pool_zero_mode_box.h (新設): enum/getter
- CURRENT_TASK.md: Phase ML1 結果記載
## Test Results
| Iterations | ZERO_MODE=full | ZERO_MODE=header | Improvement |
|-----------|----------------|-----------------|------------|
| 10K | 3.06 M ops/s | 3.17 M ops/s | +3.65% |
| 1M | 23.71 M ops/s | 27.34 M ops/s | **+15.34%** |
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
|
2025-12-10 09:08:18 +09:00 |
|
|
|
3f461ba25f
|
Cleanup: Consolidate debug ENV vars to HAKMEM_DEBUG_LEVEL
Integrated 4 new debug environment variables added during bug fixes
into the existing unified HAKMEM_DEBUG_LEVEL system (expanded to 0-5 levels).
Changes:
1. Expanded HAKMEM_DEBUG_LEVEL from 0-3 to 0-5 levels:
- 0 = OFF (production)
- 1 = ERROR (critical errors)
- 2 = WARN (warnings)
- 3 = INFO (allocation paths, header validation, stats)
- 4 = DEBUG (guard instrumentation, failfast)
- 5 = TRACE (verbose tracing)
2. Integrated 4 environment variables:
- HAKMEM_ALLOC_PATH_TRACE → HAKMEM_DEBUG_LEVEL >= 3 (INFO)
- HAKMEM_TINY_SLL_VALIDATE_HDR → HAKMEM_DEBUG_LEVEL >= 3 (INFO)
- HAKMEM_TINY_REFILL_FAILFAST → HAKMEM_DEBUG_LEVEL >= 4 (DEBUG)
- HAKMEM_TINY_GUARD → HAKMEM_DEBUG_LEVEL >= 4 (DEBUG)
3. Kept 2 special-purpose variables (fine-grained control):
- HAKMEM_TINY_GUARD_CLASS (target class for guard)
- HAKMEM_TINY_GUARD_MAX (max guard events)
4. Backward compatibility:
- Legacy ENV vars still work via hak_debug_check_level()
- New code uses unified system
- No behavior changes for existing users
Updated files:
- core/hakmem_debug_master.h (level 0-5 expansion)
- core/hakmem_tiny_superslab_internal.h (alloc path trace)
- core/box/tls_sll_box.h (header validation)
- core/tiny_failfast.c (failfast level)
- core/tiny_refill_opt.h (failfast guard)
- core/hakmem_tiny_ace_guard_box.inc (guard enable)
- core/hakmem_tiny.c (include hakmem_debug_master.h)
Impact:
- Simpler debug control: HAKMEM_DEBUG_LEVEL=3 instead of 4 separate ENVs
- Easier to discover/use
- Consistent debug levels across codebase
- Reduces ENV variable proliferation (43+ vars surveyed)
Future work:
- Consolidate remaining 39+ debug variables (documented in survey)
- Gradual migration over 2-3 releases
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
2025-11-29 06:57:03 +09:00 |
|
|
|
bcfb4f6b59
|
Remove dead code: UltraHot, RingCache, FrontC23, Class5 Hotpath
(cherry-picked from 225b6fcc7, conflicts resolved)
|
2025-11-26 12:33:49 +09:00 |
|
|
|
fcf098857a
|
Phase12 debug: restore SUPERSLAB constants/APIs, implement Box2 drain boundary, fix tiny_fast_pop to return BASE, honor TLS SLL toggle in alloc/free fast paths, add fail-fast stubs, and quiet capacity sentinel. Update CURRENT_TASK with A/B results (SLL-off stable; SLL-on crash).
|
2025-11-14 01:02:00 +09:00 |
|