Phase 5 E4-2: Malloc Wrapper ENV Snapshot (+21.83% GO, ADOPTED)

Target: Consolidate malloc wrapper TLS reads + eliminate function calls
- malloc (16.13%) + tiny_alloc_gate_fast (19.50%) = 35.63% combined
- Strategy: E4-1 success pattern + function call elimination

Implementation:
- ENV gate: HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0/1 (default 0)
- core/box/malloc_wrapper_env_snapshot_box.{h,c}: New box
  - Consolidates multiple TLS reads → 1 TLS read
  - Pre-caches tiny_max_size() == 256 (eliminates function call)
  - Lazy init with probe window (bench_profile putenv sync)
- core/box/hak_wrappers.inc.h: Integration in malloc() wrapper
- Makefile: Add malloc_wrapper_env_snapshot_box.o to all targets

A/B Test Results (Mixed, 10-run, 20M iters):
- Baseline (SNAPSHOT=0): 35.74M ops/s (mean), 35.75M ops/s (median)
- Optimized (SNAPSHOT=1): 43.54M ops/s (mean), 43.92M ops/s (median)
- Improvement: +21.83% mean, +22.86% median (+7.80M ops/s)

Decision: GO (+21.83% >> +1.0% threshold, 21.8x over)
- Why 6.2x better than E4-1 (+3.51%)?
  - Higher malloc call frequency (allocation-heavy workload)
  - Function call elimination (tiny_max_size pre-cached)
  - Larger target: 35.63% vs free's 25.26%
- Health check: PASS (all profiles)
- Action: PROMOTED to MIXED_TINYV3_C7_SAFE preset

Phase 5 Cumulative (estimated):
- E1 (ENV Snapshot): +3.92%
- E4-1 (Free Wrapper Snapshot): +3.51%
- E4-2 (Malloc Wrapper Snapshot): +21.83%
- Estimated combined: ~+30% (needs validation)

Next Steps:
- Combined A/B test (E4-1 + E4-2 simultaneously)
- Measure actual cumulative effect
- Profile new baseline for next optimization targets

Deliverables:
- docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_1_DESIGN.md
- docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_1_AB_TEST_RESULTS.md
- docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md
- docs/analysis/PHASE5_E4_COMBINED_AB_TEST_NEXT_INSTRUCTIONS.md (next)
- docs/analysis/ENV_PROFILE_PRESETS.md (E4-2 added)
- CURRENT_TASK.md (E4-2 complete)
- core/bench_profile.h (E4-2 promoted to default)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@@ -1,5 +1,59 @@
# Mainline Tasks (Current)

## Update Notes (2025-12-14 Phase 5 E4-2 Complete - Malloc Gate Optimization)

### Phase 5 E4-2: malloc Wrapper ENV Snapshot ✅ GO (2025-12-14)

**Target**: Consolidate TLS reads in the malloc() wrapper to shrink the 35.63% combined hot spot
- Strategy: Apply E4-1 success pattern (ENV snapshot consolidation) to malloc() side
- Combined target: malloc (16.13%) + tiny_alloc_gate_fast (19.50%) = 35.63% self%
- Implementation: Single TLS snapshot with packed flags (wrap_shape + front_gate + tiny_max_size_256)
- Reduce: 2+ TLS reads → 1 TLS read, eliminate tiny_get_max_size() function call

**Implementation**:
- ENV gate: `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0/1` (default: 0, research box)
- Files: `core/box/malloc_wrapper_env_snapshot_box.{h,c}` (new ENV snapshot box)
- Integration: `core/box/hak_wrappers.inc.h` (lines 174-221, malloc() wrapper)
- Optimization: Pre-cache `tiny_max_size() == 256` to eliminate function call

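The consolidation described above can be sketched in C as follows. This is a minimal illustration, not the code in `malloc_wrapper_env_snapshot_box.{h,c}`: the struct layout, the `sketch_*` and `snapshot_*` names, and the `SKETCH_*` environment variables are all hypothetical, and the probe-window handling for `putenv` synchronization is left out. It only shows the general shape: one thread-local struct holds the packed flags plus a pre-computed "limit is 256" bit, so the hot path touches TLS once and needs no size-limit function call.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical snapshot layout; every field name is illustrative. */
typedef struct {
    unsigned char ready;             /* 0 until the first lazy init on this thread */
    unsigned char wrap_shape;        /* cached wrapper-shape flag */
    unsigned char front_gate;        /* cached front-gate flag */
    unsigned char tiny_max_size_256; /* 1 when the tiny size limit equals 256 */
} malloc_wrapper_snapshot_t;

/* One thread-local object replaces several separate TLS variables. */
static _Thread_local malloc_wrapper_snapshot_t g_snap;

/* Stand-in for the real size-limit query; assumed to return 256 here. */
static size_t sketch_tiny_max_size(void) { return 256; }

/* Treat an environment variable as enabled when its value starts with '1'. */
static int env_flag(const char *name) {
    const char *v = getenv(name);
    return v != NULL && v[0] == '1';
}

/* Slow path: runs once per thread and fills every cached field. */
static void snapshot_init(malloc_wrapper_snapshot_t *s) {
    s->wrap_shape = (unsigned char)env_flag("SKETCH_WRAP_SHAPE");
    s->front_gate = (unsigned char)env_flag("SKETCH_FRONT_GATE");
    s->tiny_max_size_256 = (unsigned char)(sketch_tiny_max_size() == 256);
    s->ready = 1;
}

/* Hot path: a single TLS address computation yields all flags. */
static inline const malloc_wrapper_snapshot_t *snapshot_get(void) {
    malloc_wrapper_snapshot_t *s = &g_snap;
    if (!s->ready) snapshot_init(s);
    return s;
}

/* Size check that uses the cached bit instead of calling the limit function. */
static bool is_tiny_request(size_t size) {
    const malloc_wrapper_snapshot_t *s = snapshot_get();
    return s->tiny_max_size_256 ? size <= 256 : size <= sketch_tiny_max_size();
}
```

With this shape, the wrapper's fast path reduces to one `snapshot_get()` call followed by plain field tests; only the first call on each thread pays the `getenv` cost.
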
**A/B Test Results** (Mixed, 10-run, 20M iters, ws=400):
- Baseline (SNAPSHOT=0): **35.74M ops/s** (mean), 35.75M ops/s (median), σ=0.43M
- Optimized (SNAPSHOT=1): **43.54M ops/s** (mean), 43.92M ops/s (median), σ=1.17M
- **Delta: +21.83% mean, +22.86% median** ✅

**Decision: GO** (+21.83% >> +1.0% threshold)
- EXCEEDED conservative estimate (+2-4%) → Achieved **+21.83%**
- 6.2x better than E4-1 (+3.51%) - malloc() has higher ROI than free()
- Action: Promote to default configuration (HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1)

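For reference, the delta and the GO decision come down to simple arithmetic, sketched below in C. The helper names are illustrative, and treating the +1.0% threshold as an inclusive bound is an assumption. Plugging in the rounded throughput figures listed above gives roughly +21.82% (mean) and +22.85% (median), so the reported +21.83% and +22.86% were presumably computed from unrounded measurements.

```c
#include <stdbool.h>

/* Relative throughput gain of `optimized` over `baseline`, in percent. */
static double gain_pct(double baseline, double optimized) {
    return (optimized / baseline - 1.0) * 100.0;
}

/* GO when the gain reaches the threshold (inclusive bound is an assumption). */
static bool is_go(double gain, double threshold_pct) {
    return gain >= threshold_pct;
}
```

For example, `gain_pct(35.74, 43.54)` evaluates to about 21.82, and `is_go` of that value against a threshold of 1.0 is true.
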
**Health Check**: ✅ PASS
- MIXED_TINYV3_C7_SAFE: 40.8M ops/s
- C6_HEAVY_LEGACY_POOLV1: 21.8M ops/s
- All profiles passed, no regressions

**Why 6.2x better than E4-1?**:
1. **Higher Call Frequency**: malloc() called MORE than free() in alloc-heavy workloads
2. **Function Call Elimination**: Pre-caching tiny_max_size()==256 removes function call overhead
3. **Better Branch Prediction**: size <= 256 is highly predictable for tiny allocations
4. **Larger Target**: 35.63% combined self% (malloc + tiny_alloc_gate_fast) vs free's 25.26%

**Key Insight**: malloc() wrapper optimization has **6.2x higher ROI** than free() wrapper. ENV snapshot pattern continues to dominate, with malloc side showing exceptional gains due to function call elimination and higher call frequency.

**Cumulative Status (Phase 5)**:
- E4-1 (Free Wrapper Snapshot): +3.51% (GO)
- E4-2 (Malloc Wrapper Snapshot): +21.83% (GO) ⭐ **MAJOR WIN**
- Combined estimate: ~+25-27% (to be measured with both enabled)
- Total Phase 5: **+21.83%** standalone (on top of Phase 4's +3.9%)

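The combined estimate is consistent with treating the individual gains as independent multiplicative speedups. That independence is an assumption, not a measured result, and the combined A/B run is what would confirm or refute it. The sketch below uses a hypothetical helper, `compound_gain_pct`, to show the arithmetic.

```c
#include <stddef.h>

/* Compound several percentage gains multiplicatively:
 * result = (product of (1 + g_i / 100) - 1) * 100.
 * Assumes the gains are independent, which may not hold in practice. */
static double compound_gain_pct(const double *gains_pct, size_t n) {
    double factor = 1.0;
    for (size_t i = 0; i < n; i++) {
        factor *= 1.0 + gains_pct[i] / 100.0;
    }
    return (factor - 1.0) * 100.0;
}
```

With E4-1 (+3.51%) and E4-2 (+21.83%) as inputs this yields about +26.1%, inside the ~+25-27% range quoted above. Adding E1's +3.92% from the commit message gives about +31% under the same assumption, close to the ~+30% quoted there.
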
**Next Steps**:
- Measure combined effect (E4-1 + E4-2 both enabled)
- Profile new bottlenecks at 43.54M ops/s baseline
- Update default presets with HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1
- Design doc: `docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_1_DESIGN.md`
- Results: `docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_1_AB_TEST_RESULTS.md`

---

## Update Notes (2025-12-14 Phase 5 E4-1 Complete - Free Gate Optimization)

### Phase 5 E4-1: Free Wrapper ENV Snapshot ✅ GO (2025-12-14)
@@ -43,11 +97,13 @@

**Next Steps**:
- ✅ Promoted: `HAKMEM_FREE_WRAPPER_ENV_SNAPSHOT=1` is now the default in `MIXED_TINYV3_C7_SAFE` (opt-out available)
- Next target: E4-2 (malloc wrapper snapshot), or pick a core hot spot with self% ≥ 5% from perf
- ✅ Promoted: `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1` is now the default in `MIXED_TINYV3_C7_SAFE` (opt-out available)
- Next: Run a single cumulative A/B check for E4-1+E4-2, then re-profile with perf on the new baseline
- Design doc: `docs/analysis/PHASE5_E4_FREE_GATE_OPTIMIZATION_1_DESIGN.md`
- Instructions:
  - `docs/analysis/PHASE5_E4_1_FREE_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md`
  - `docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md`
  - `docs/analysis/PHASE5_E4_COMBINED_AB_TEST_NEXT_INSTRUCTIONS.md`

---

Makefile (8 changed lines)
@@ -218,12 +218,12 @@ LDFLAGS += $(EXTRA_LDFLAGS)

# Targets
TARGET = test_hakmem
OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o 
core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/hakmem_env_snapshot_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o 
core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/hakmem_env_snapshot_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
OBJS = $(OBJS_BASE)

# Shared library
SHARED_LIB = libhakmem.so
SHARED_OBJS = hakmem_shared.o hakmem_config_shared.o hakmem_tiny_config_shared.o hakmem_ucb1_shared.o hakmem_bigcache_shared.o hakmem_pool_shared.o hakmem_l25_pool_shared.o hakmem_site_rules_shared.o hakmem_tiny_shared.o core/box/ss_allocation_box_shared.o superslab_stats_shared.o superslab_cache_shared.o superslab_ace_shared.o superslab_slab_shared.o superslab_backend_shared.o core/superslab_head_stub_shared.o hakmem_smallmid_shared.o core/box/superslab_expansion_box_shared.o core/box/integrity_box_shared.o core/box/mailbox_box_shared.o core/box/front_gate_box_shared.o core/box/front_gate_classifier_shared.o core/box/free_publish_box_shared.o core/box/capacity_box_shared.o core/box/carve_push_box_shared.o core/box/prewarm_box_shared.o core/box/ss_hot_prewarm_box_shared.o core/box/front_metrics_box_shared.o core/box/bench_fast_box_shared.o core/box/ss_addr_map_box_shared.o core/box/ss_pt_impl_shared.o core/box/slab_recycling_box_shared.o core/box/pagefault_telemetry_box_shared.o core/box/tiny_sizeclass_hist_box_shared.o core/box/tiny_env_box_shared.o core/box/tiny_route_box_shared.o core/box/free_front_v3_env_box_shared.o core/box/free_path_stats_box_shared.o core/box/free_dispatch_stats_box_shared.o core/box/alloc_gate_stats_box_shared.o core/box/tiny_page_box_shared.o core/box/tiny_class_policy_box_shared.o core/box/tiny_class_stats_box_shared.o core/box/tiny_policy_learner_box_shared.o core/box/ss_budget_box_shared.o core/box/tiny_mem_stats_box_shared.o core/box/wrapper_env_box_shared.o core/box/free_wrapper_env_snapshot_box_shared.o core/box/madvise_guard_box_shared.o core/box/libm_reloc_guard_box_shared.o core/box/hakmem_env_snapshot_box_shared.o core/page_arena_shared.o core/front/tiny_unified_cache_shared.o core/tiny_alloc_fast_push_shared.o core/tiny_c7_ultra_segment_shared.o core/tiny_c7_ultra_shared.o core/link_stubs_shared.o core/tiny_failfast_shared.o tiny_sticky_shared.o tiny_remote_shared.o tiny_publish_shared.o tiny_debug_ring_shared.o 
hakmem_tiny_magazine_shared.o hakmem_tiny_stats_shared.o hakmem_tiny_sfc_shared.o hakmem_tiny_query_shared.o hakmem_tiny_rss_shared.o hakmem_tiny_registry_shared.o hakmem_tiny_remote_target_shared.o hakmem_tiny_bg_spill_shared.o tiny_adaptive_sizing_shared.o hakmem_super_registry_shared.o hakmem_shared_pool_shared.o hakmem_shared_pool_acquire_shared.o hakmem_shared_pool_release_shared.o hakmem_elo_shared.o hakmem_batch_shared.o hakmem_p2_shared.o hakmem_sizeclass_dist_shared.o hakmem_evo_shared.o hakmem_debug_shared.o hakmem_sys_shared.o hakmem_whale_shared.o hakmem_policy_shared.o hakmem_ace_shared.o hakmem_ace_stats_shared.o hakmem_ace_controller_shared.o hakmem_ace_metrics_shared.o hakmem_ace_ucb1_shared.o hakmem_prof_shared.o hakmem_learner_shared.o hakmem_size_hist_shared.o hakmem_learn_log_shared.o hakmem_syscall_shared.o tiny_fastcache_shared.o core/box/super_reg_box_shared.o core/box/shared_pool_box_shared.o core/box/remote_side_box_shared.o core/tiny_destructors_shared.o
SHARED_OBJS = hakmem_shared.o hakmem_config_shared.o hakmem_tiny_config_shared.o hakmem_ucb1_shared.o hakmem_bigcache_shared.o hakmem_pool_shared.o hakmem_l25_pool_shared.o hakmem_site_rules_shared.o hakmem_tiny_shared.o core/box/ss_allocation_box_shared.o superslab_stats_shared.o superslab_cache_shared.o superslab_ace_shared.o superslab_slab_shared.o superslab_backend_shared.o core/superslab_head_stub_shared.o hakmem_smallmid_shared.o core/box/superslab_expansion_box_shared.o core/box/integrity_box_shared.o core/box/mailbox_box_shared.o core/box/front_gate_box_shared.o core/box/front_gate_classifier_shared.o core/box/free_publish_box_shared.o core/box/capacity_box_shared.o core/box/carve_push_box_shared.o core/box/prewarm_box_shared.o core/box/ss_hot_prewarm_box_shared.o core/box/front_metrics_box_shared.o core/box/bench_fast_box_shared.o core/box/ss_addr_map_box_shared.o core/box/ss_pt_impl_shared.o core/box/slab_recycling_box_shared.o core/box/pagefault_telemetry_box_shared.o core/box/tiny_sizeclass_hist_box_shared.o core/box/tiny_env_box_shared.o core/box/tiny_route_box_shared.o core/box/free_front_v3_env_box_shared.o core/box/free_path_stats_box_shared.o core/box/free_dispatch_stats_box_shared.o core/box/alloc_gate_stats_box_shared.o core/box/tiny_page_box_shared.o core/box/tiny_class_policy_box_shared.o core/box/tiny_class_stats_box_shared.o core/box/tiny_policy_learner_box_shared.o core/box/ss_budget_box_shared.o core/box/tiny_mem_stats_box_shared.o core/box/wrapper_env_box_shared.o core/box/free_wrapper_env_snapshot_box_shared.o core/box/malloc_wrapper_env_snapshot_box_shared.o core/box/madvise_guard_box_shared.o core/box/libm_reloc_guard_box_shared.o core/box/hakmem_env_snapshot_box_shared.o core/page_arena_shared.o core/front/tiny_unified_cache_shared.o core/tiny_alloc_fast_push_shared.o core/tiny_c7_ultra_segment_shared.o core/tiny_c7_ultra_shared.o core/link_stubs_shared.o core/tiny_failfast_shared.o tiny_sticky_shared.o tiny_remote_shared.o 
tiny_publish_shared.o tiny_debug_ring_shared.o hakmem_tiny_magazine_shared.o hakmem_tiny_stats_shared.o hakmem_tiny_sfc_shared.o hakmem_tiny_query_shared.o hakmem_tiny_rss_shared.o hakmem_tiny_registry_shared.o hakmem_tiny_remote_target_shared.o hakmem_tiny_bg_spill_shared.o tiny_adaptive_sizing_shared.o hakmem_super_registry_shared.o hakmem_shared_pool_shared.o hakmem_shared_pool_acquire_shared.o hakmem_shared_pool_release_shared.o hakmem_elo_shared.o hakmem_batch_shared.o hakmem_p2_shared.o hakmem_sizeclass_dist_shared.o hakmem_evo_shared.o hakmem_debug_shared.o hakmem_sys_shared.o hakmem_whale_shared.o hakmem_policy_shared.o hakmem_ace_shared.o hakmem_ace_stats_shared.o hakmem_ace_controller_shared.o hakmem_ace_metrics_shared.o hakmem_ace_ucb1_shared.o hakmem_prof_shared.o hakmem_learner_shared.o hakmem_size_hist_shared.o hakmem_learn_log_shared.o hakmem_syscall_shared.o tiny_fastcache_shared.o core/box/super_reg_box_shared.o core/box/shared_pool_box_shared.o core/box/remote_side_box_shared.o core/tiny_destructors_shared.o

# Pool TLS Phase 1 (enable with POOL_TLS_PHASE1=1)
ifeq ($(POOL_TLS_PHASE1),1)
@@ -250,7 +250,7 @@ endif
# Benchmark targets
BENCH_HAKMEM = bench_allocators_hakmem
BENCH_SYSTEM = bench_allocators_system
BENCH_HAKMEM_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o 
core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o bench_allocators_hakmem.o
BENCH_HAKMEM_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/free_publish_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o 
core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/page_arena.o core/front/tiny_unified_cache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o bench_allocators_hakmem.o
BENCH_HAKMEM_OBJS = $(BENCH_HAKMEM_OBJS_BASE)
ifeq ($(POOL_TLS_PHASE1),1)
BENCH_HAKMEM_OBJS += pool_tls.o pool_refill.o pool_tls_arena.o pool_tls_registry.o pool_tls_remote.o
@@ -427,7 +427,7 @@ test-box-refactor: box-refactor
	./larson_hakmem 10 8 128 1024 1 12345 4

# Phase 4: Tiny Pool benchmarks (properly linked with hakmem)
TINY_BENCH_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/free_publish_box.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/hakmem_env_snapshot_box.o core/page_arena.o core/front/tiny_unified_cache.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o 
tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
TINY_BENCH_OBJS_BASE = hakmem.o hakmem_config.o hakmem_tiny_config.o hakmem_ucb1.o hakmem_bigcache.o hakmem_pool.o hakmem_l25_pool.o hakmem_site_rules.o hakmem_tiny.o core/box/ss_allocation_box.o superslab_stats.o superslab_cache.o superslab_ace.o superslab_slab.o superslab_backend.o core/superslab_head_stub.o hakmem_smallmid.o core/box/superslab_expansion_box.o core/box/integrity_box.o core/box/mailbox_box.o core/box/front_gate_box.o core/box/front_gate_classifier.o core/box/free_publish_box.o core/box/capacity_box.o core/box/carve_push_box.o core/box/prewarm_box.o core/box/ss_hot_prewarm_box.o core/box/front_metrics_box.o core/box/bench_fast_box.o core/box/ss_addr_map_box.o core/box/ss_pt_impl.o core/box/slab_recycling_box.o core/box/pagefault_telemetry_box.o core/box/tiny_sizeclass_hist_box.o core/box/tiny_env_box.o core/box/tiny_route_box.o core/box/free_front_v3_env_box.o core/box/free_path_stats_box.o core/box/free_dispatch_stats_box.o core/box/alloc_gate_stats_box.o core/box/tiny_c6_ultra_free_box.o core/box/tiny_c5_ultra_free_box.o core/box/tiny_c4_ultra_free_box.o core/box/tiny_ultra_tls_box.o core/box/tiny_page_box.o core/box/tiny_class_policy_box.o core/box/tiny_class_stats_box.o core/box/tiny_policy_learner_box.o core/box/ss_budget_box.o core/box/tiny_mem_stats_box.o core/box/c7_meta_used_counter_box.o core/box/tiny_static_route_box.o core/box/tiny_metadata_cache_hot_box.o core/box/wrapper_env_box.o core/box/free_wrapper_env_snapshot_box.o core/box/malloc_wrapper_env_snapshot_box.o core/box/madvise_guard_box.o core/box/libm_reloc_guard_box.o core/box/ptr_trace_box.o core/box/link_missing_stubs.o core/box/super_reg_box.o core/box/shared_pool_box.o core/box/remote_side_box.o core/box/hakmem_env_snapshot_box.o core/page_arena.o core/front/tiny_unified_cache.o tiny_sticky.o tiny_remote.o tiny_publish.o tiny_debug_ring.o hakmem_tiny_magazine.o hakmem_tiny_stats.o hakmem_tiny_sfc.o hakmem_tiny_query.o hakmem_tiny_rss.o hakmem_tiny_registry.o 
hakmem_tiny_remote_target.o hakmem_tiny_bg_spill.o tiny_adaptive_sizing.o hakmem_super_registry.o hakmem_shared_pool.o hakmem_shared_pool_acquire.o hakmem_shared_pool_release.o hakmem_elo.o hakmem_batch.o hakmem_p2.o hakmem_sizeclass_dist.o hakmem_evo.o hakmem_debug.o hakmem_sys.o hakmem_whale.o hakmem_policy.o hakmem_ace.o hakmem_ace_stats.o hakmem_prof.o hakmem_learner.o hakmem_size_hist.o hakmem_learn_log.o hakmem_syscall.o hakmem_ace_metrics.o hakmem_ace_ucb1.o hakmem_ace_controller.o tiny_fastcache.o core/tiny_alloc_fast_push.o core/tiny_c7_ultra_segment.o core/tiny_c7_ultra.o core/link_stubs.o core/tiny_failfast.o core/tiny_destructors.o core/smallobject_hotbox_v3.o core/smallobject_hotbox_v4.o core/smallobject_hotbox_v5.o core/smallsegment_v5.o core/smallobject_cold_iface_v5.o core/smallsegment_v6.o core/smallobject_cold_iface_v6.o core/smallobject_core_v6.o core/region_id_v6.o core/smallsegment_v7.o core/smallobject_cold_iface_v7.o core/mid_hotbox_v3.o core/smallobject_policy_v7.o core/smallobject_segment_mid_v3.o core/smallobject_cold_iface_mid_v3.o core/smallobject_stats_mid_v3.o core/smallobject_learner_v2.o core/smallobject_mid_v35.o
|
||||||
TINY_BENCH_OBJS = $(TINY_BENCH_OBJS_BASE)
ifeq ($(POOL_TLS_PHASE1),1)
TINY_BENCH_OBJS += pool_tls.o pool_refill.o core/pool_tls_arena.o pool_tls_registry.o pool_tls_remote.o
@@ -37,6 +37,7 @@ void* realloc(void* ptr, size_t size) {
 #include "wrapper_env_box.h" // Wrapper env cache (step trace / LD safe / free trace)
 #include "wrapper_env_cache_box.h" // Phase 3 D2: TLS cache for wrapper_env_cfg pointer
 #include "free_wrapper_env_snapshot_box.h" // Phase 5 E4-1: Free wrapper ENV snapshot
+#include "malloc_wrapper_env_snapshot_box.h" // Phase 5 E4-2: Malloc wrapper ENV snapshot
 #include "../hakmem_internal.h" // AllocHeader helpers for diagnostics
 #include "../hakmem_super_registry.h" // Superslab lookup for diagnostics
 #include "../superslab/superslab_inline.h" // slab_index_for, capacity
@@ -170,6 +171,55 @@ void* malloc(size_t size) {
         // Fallback to normal path for large allocations
     }
 
+    // Phase 5 E4-2: Malloc Wrapper ENV Snapshot (optional, ENV-gated)
+    // Strategy: Consolidate 2+ TLS reads -> 1 TLS read (50%+ reduction)
+    // Expected gain: +2-4% (from malloc 16.13% + tiny_alloc_gate_fast 19.50% reduction)
+    if (__builtin_expect(malloc_wrapper_env_snapshot_enabled(), 0)) {
+        // Optimized path: Single TLS snapshot (1 TLS read instead of 2+)
+        const struct malloc_wrapper_env_snapshot* env = malloc_wrapper_env_get();
+
+        // Fast path: Front gate unified (LIKELY in current presets)
+        if (__builtin_expect(env->front_gate_unified, 1)) {
+            // Common case: size <= 256 (pre-cached, no function call)
+            if (__builtin_expect(env->tiny_max_size_256 && size <= 256, 1)) {
+                void* ptr = tiny_alloc_gate_fast(size);
+                if (__builtin_expect(ptr != NULL, 1)) {
+                    return ptr;
+                }
+            } else if (size <= tiny_get_max_size()) {
+                // Fallback for non-256 max sizes (rare)
+                void* ptr = tiny_alloc_gate_fast(size);
+                if (__builtin_expect(ptr != NULL, 1)) {
+                    return ptr;
+                }
+            }
+        }
+
+        // Slow path fallback: Wrap shape dispatch
+        if (__builtin_expect(env->wrap_shape, 0)) {
+            // Need to increment lock depth for malloc_cold path
+            g_hakmem_lock_depth++;
+
+            // Guard against recursion during initialization
+            int init_wait = hak_init_wait_for_ready();
+            if (__builtin_expect(init_wait <= 0, 0)) {
+                wrapper_record_fallback(FB_INIT_WAIT_FAIL, "[wrap] libc malloc: init_wait\n");
+                g_hakmem_lock_depth--;
+                extern void* __libc_malloc(size_t);
+                return __libc_malloc(size);
+            }
+
+            // Ensure initialization before cold path
+            if (!g_initialized) hak_init();
+
+            // Delegate to cold path
+            const wrapper_env_cfg_t* wcfg = wrapper_env_cfg_fast();
+            return malloc_cold(size, wcfg);
+        }
+
+        // Fall through to legacy path below
+    }
+
     // Phase 2 B4: Hot/Cold dispatch (HAKMEM_WRAP_SHAPE)
     // Phase 3 D2: Use wrapper_env_cfg_fast() to reduce hot path overhead
     const wrapper_env_cfg_t* wcfg = wrapper_env_cfg_fast();
44
core/box/malloc_wrapper_env_snapshot_box.c
Normal file
@@ -0,0 +1,44 @@
// malloc_wrapper_env_snapshot_box.c - Box: Malloc Wrapper ENV Snapshot Implementation
//
// Phase 5 E4-2: Malloc Gate Optimization

#include "malloc_wrapper_env_snapshot_box.h"
#include "wrapper_env_box.h"
#include "tiny_front_config_box.h"
#include "../front/malloc_tiny_fast.h"

#include <stdio.h>

// TLS storage (initialized to zero on thread creation)
__thread struct malloc_wrapper_env_snapshot g_malloc_wrapper_env = {0};

// Lazy init implementation: Called once per thread on first malloc() call
void malloc_wrapper_env_snapshot_init(void)
{
    // Read wrapper env config (wrap_shape flag)
    const wrapper_env_cfg_t* wcfg = wrapper_env_cfg();
    g_malloc_wrapper_env.wrap_shape = wcfg->wrap_shape;

    // Read front gate unified constant (compile-time macro)
    g_malloc_wrapper_env.front_gate_unified = TINY_FRONT_UNIFIED_GATE_ENABLED;

    // Read tiny max size (most common case: 256 bytes)
    g_malloc_wrapper_env.tiny_max_size_256 = (tiny_get_max_size() == 256) ? 1 : 0;

    // Mark as initialized (lazy init complete)
    g_malloc_wrapper_env.initialized = 1;

#if !HAKMEM_BUILD_RELEASE
    // Debug: Log snapshot initialization (first 5 threads only)
    static _Atomic uint32_t g_init_log_count = 0;
    uint32_t n = atomic_fetch_add_explicit(&g_init_log_count, 1, memory_order_relaxed);
    if (n < 5) {
        fprintf(stderr,
                "[MALLOC_WRAPPER_ENV_SNAPSHOT_INIT] wrap_shape=%d front_gate=%d tiny_max_256=%d\n",
                g_malloc_wrapper_env.wrap_shape,
                g_malloc_wrapper_env.front_gate_unified,
                g_malloc_wrapper_env.tiny_max_size_256);
        fflush(stderr);
    }
#endif
}
71
core/box/malloc_wrapper_env_snapshot_box.h
Normal file
@@ -0,0 +1,71 @@
// malloc_wrapper_env_snapshot_box.h - Box: Malloc Wrapper ENV Snapshot
//
// Phase 5 E4-2: Malloc Gate Optimization
//
// Purpose:
//   Consolidate multiple TLS reads in malloc() wrapper into a single snapshot
//   to reduce overhead (malloc 16.13% + tiny_alloc_gate_fast 19.50% -> target 33%)
//
// Strategy:
//   - Reuse E4-1 success pattern (ENV snapshot consolidation, +3.51%)
//   - Avoid E3-4 failure pattern (constructor init, -1.44%)
//   - 2+ TLS reads -> 1 TLS read (50%+ reduction)
//   - Eliminate tiny_get_max_size() function call in common case (size <= 256)
//
// Box Boundary:
//   - Input: None (thread-local initialization on first access)
//   - Output: const struct malloc_wrapper_env_snapshot* (cached snapshot)
//   - ENV gate: HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0/1 (default: 0, research box)
//
// Safety:
//   - TLS storage (thread-safe)
//   - Lazy init (once per thread)
//   - ENV-gated rollback (SNAPSHOT=0 disables)

#ifndef MALLOC_WRAPPER_ENV_SNAPSHOT_BOX_H
#define MALLOC_WRAPPER_ENV_SNAPSHOT_BOX_H

#include <stdint.h>
#include <stdlib.h>
#include "../hakmem_build_flags.h"

// Snapshot structure: Consolidates 3 ENV checks into 1 TLS read
// Size: 4 bytes (cache-friendly, fits in single cache line)
struct malloc_wrapper_env_snapshot {
    uint8_t wrap_shape;          // HAKMEM_WRAP_SHAPE (from wrapper_env_cfg)
    uint8_t front_gate_unified;  // TINY_FRONT_UNIFIED_GATE_ENABLED (compile-time constant)
    uint8_t tiny_max_size_256;   // tiny_get_max_size() == 256 (most common case)
    uint8_t initialized;         // Lazy init flag (0 = not initialized, 1 = initialized)
};

// Thread-local storage for snapshot (initialized on first access per thread)
extern __thread struct malloc_wrapper_env_snapshot g_malloc_wrapper_env;

// ENV gate: Enable/disable snapshot optimization (default: OFF, research box)
static inline int malloc_wrapper_env_snapshot_enabled(void)
{
    static __thread int s_enabled = -1;
    if (__builtin_expect(s_enabled == -1, 0)) {
        const char* env = getenv("HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT");
        s_enabled = (env && *env == '1') ? 1 : 0;
    }
    return s_enabled;
}

// Lazy init: Initialize snapshot on first access (once per thread)
void malloc_wrapper_env_snapshot_init(void);

// Primary API: Get snapshot (1 TLS read, lazy init on first call)
static inline const struct malloc_wrapper_env_snapshot* malloc_wrapper_env_get(void)
{
    // Fast path: Already initialized
    if (__builtin_expect(g_malloc_wrapper_env.initialized, 1)) {
        return &g_malloc_wrapper_env;
    }

    // Slow path: First access, initialize snapshot
    malloc_wrapper_env_snapshot_init();
    return &g_malloc_wrapper_env;
}

#endif // MALLOC_WRAPPER_ENV_SNAPSHOT_BOX_H
@@ -124,6 +124,13 @@ HAKMEM_FREE_WRAPPER_ENV_SNAPSHOT=1
   - **Status**: ✅ GO (Mixed 10-run: **+3.51% mean / +4.07% median**) → ✅ Promoted to `MIXED_TINYV3_C7_SAFE` preset default (opt-out possible)
   - **Effect**: Consolidates the `free()` wrapper's ENV checks (multiple TLS reads) into a single TLS snapshot, short-circuiting the early gate
   - **Rollback**: `HAKMEM_FREE_WRAPPER_ENV_SNAPSHOT=0`
+- **Phase 5 E4-2 (Malloc Wrapper ENV Snapshot)** ✅ GO (PROMOTION READY):
+  ```sh
+  HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1
+  ```
+  - **Status**: ✅ GO (Mixed 10-run: **+21.83% mean / +22.86% median**) → ✅ Promoted to `MIXED_TINYV3_C7_SAFE` preset default (opt-out possible)
+  - **Effect**: Short-circuits the `malloc()` wrapper's tiny fast check via the TLS snapshot, reducing function calls and checks on the hot path (notably `tiny_get_max_size()`)
+  - **Rollback**: `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0`
 - Do not touch the v2 family (Pool v2 / Tiny v2 are always OFF in C7_SAFE).
 - Experimental example that touches FREE_POLICY/THP (not required at the current HEAD; some combinations can be slightly negative):
   ```sh
@@ -0,0 +1,184 @@
# Phase 5 E4-2: malloc Wrapper ENV Snapshot - A/B Test Results

## Status
- Phase: 5 E4-2
- Decision: **GO** (mean +21.83%, exceeds +1.0% threshold)
- Date: 2025-12-14

## Summary

Applied the successful E4-1 pattern (ENV snapshot consolidation) to the malloc() wrapper hot path. Achieved **+21.83% mean gain** by consolidating multiple TLS reads into a single snapshot.

**Key Achievement**: This is 6.2x better than E4-1's +3.51% gain, demonstrating that malloc() optimization has higher ROI than free() due to higher call frequency in allocation-heavy workloads.

## Implementation

### Files Created
1. `/mnt/workdisk/public_share/hakmem/core/box/malloc_wrapper_env_snapshot_box.h` - API header
2. `/mnt/workdisk/public_share/hakmem/core/box/malloc_wrapper_env_snapshot_box.c` - Implementation
3. `/mnt/workdisk/public_share/hakmem/docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_1_DESIGN.md` - Design doc

### Files Modified
1. `/mnt/workdisk/public_share/hakmem/core/box/hak_wrappers.inc.h` - Integrated snapshot into malloc() hot path
2. `/mnt/workdisk/public_share/hakmem/Makefile` - Added `malloc_wrapper_env_snapshot_box.o` to all build targets

### Box Structure

```c
struct malloc_wrapper_env_snapshot {
    uint8_t wrap_shape;          // HAKMEM_WRAP_SHAPE (from wrapper_env_cfg)
    uint8_t front_gate_unified;  // TINY_FRONT_UNIFIED_GATE_ENABLED
    uint8_t tiny_max_size_256;   // tiny_get_max_size() == 256 (common case)
    uint8_t initialized;         // Lazy init flag
};
```

Size: 4 bytes (cache-friendly)
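
The 4-byte size can be pinned down at compile time so that a later field addition does not silently grow the snapshot. The following is a minimal sketch, assuming a C11 compiler and a mainstream ABI where four `uint8_t` members need no padding; the struct is copied from the box above, while `snapshot_needs_init()` is a hypothetical helper added only for illustration.

```c
#include <assert.h>   /* assert() for the runtime precondition */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

struct malloc_wrapper_env_snapshot {
    uint8_t wrap_shape;
    uint8_t front_gate_unified;
    uint8_t tiny_max_size_256;
    uint8_t initialized;
};

/* Compile-time checks: the build fails if the layout ever changes. */
_Static_assert(sizeof(struct malloc_wrapper_env_snapshot) == 4,
               "snapshot must stay 4 bytes");
_Static_assert(offsetof(struct malloc_wrapper_env_snapshot, initialized) == 3,
               "initialized is expected to be the last byte");

/* Hypothetical helper (not in the source): a zero-filled snapshot must
 * read as "not yet initialized", which is what lazy init relies on. */
static int snapshot_needs_init(const struct malloc_wrapper_env_snapshot *s) {
    assert(s != NULL);
    return s->initialized == 0;
}
```

Because the TLS instance starts zero-filled, `snapshot_needs_init()` would return 1 on the first access in each thread and 0 afterwards.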

### Integration Points

**ENV Gate**: `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0/1` (default: 0, research box)

**malloc() Hot Path**:
- Before: 2+ TLS reads (`wrapper_env_cfg_fast()`, `tiny_get_max_size()` function call)
- After: 1 TLS read (`malloc_wrapper_env_get()`)
- Reduction: 50%+ TLS overhead, 100% function call elimination in common case

**Optimization**:
- Pre-cache `tiny_max_size() == 256` flag (most common configuration)
- Avoid function call overhead for size <= 256 check (highly predictable branch)
- Single TLS read gates all configuration checks

## A/B Test Configuration

**Profile**: MIXED_TINYV3_C7_SAFE
**Workload**: bench_random_mixed_hakmem
**Parameters**: 20M iterations, 400 working set
**Runs**: 10 iterations each (baseline, optimized)

**Baseline**: `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0` (legacy path)
**Optimized**: `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1` (snapshot path)

## Results

### Raw Data

**Baseline (SNAPSHOT=0)**:
```
Run 1: 35418241 ops/s
Run 2: 36231356 ops/s
Run 3: 35261129 ops/s
Run 4: 35795498 ops/s
Run 5: 34962415 ops/s
Run 6: 36107583 ops/s
Run 7: 35671028 ops/s
Run 8: 36148172 ops/s
Run 9: 36133092 ops/s
Run 10: 35705495 ops/s
```

**Optimized (SNAPSHOT=1)**:
```
Run 1: 40316963 ops/s
Run 2: 43768340 ops/s
Run 3: 44094315 ops/s
Run 4: 43701884 ops/s
Run 5: 44158516 ops/s
Run 6: 43613064 ops/s
Run 7: 44147226 ops/s
Run 8: 44223019 ops/s
Run 9: 43346060 ops/s
Run 10: 44080131 ops/s
```

### Statistical Analysis

| Metric | Baseline | Optimized | Gain |
|--------|----------|-----------|------|
| **Mean** | 35.74 M ops/s | 43.54 M ops/s | **+21.83%** (+7.80 M ops/s) |
| **Median** | 35.75 M ops/s | 43.92 M ops/s | **+22.86%** (+8.17 M ops/s) |
| **StdDev** | 0.43 M ops/s (1.20%) | 1.17 M ops/s (2.69%) | - |

### Stability

- Baseline StdDev: 1.20% (excellent stability)
- Optimized StdDev: 2.69% (acceptable stability, slightly higher variance)
- All 10 optimized runs significantly outperformed best baseline run (36.23M vs 40.32-44.22M)
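
The mean and median gains in the table, and the no-overlap observation in the Stability list, can be recomputed from the raw runs. The sketch below is one way to do that; the sample values are copied from the Raw Data section, and all function names (`mean_of`, `median_of`, `gain_percent`, `min_of`, `max_of`) are hypothetical helpers, not part of the benchmark tooling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define RUNS 10

/* Raw throughput samples (ops/s) copied from the Raw Data section. */
static const double baseline_runs[RUNS] = {
    35418241, 36231356, 35261129, 35795498, 34962415,
    36107583, 35671028, 36148172, 36133092, 35705495};
static const double optimized_runs[RUNS] = {
    40316963, 43768340, 44094315, 43701884, 44158516,
    43613064, 44147226, 44223019, 43346060, 44080131};

static double mean_of(const double *v, size_t n) {
    double sum = 0.0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) sum += v[i];
    return sum / (double)n;
}

static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of up to RUNS samples; sorts a local copy so the input is untouched. */
static double median_of(const double *v, size_t n) {
    double tmp[RUNS];
    assert(n > 0 && n <= RUNS);
    memcpy(tmp, v, n * sizeof tmp[0]);
    qsort(tmp, n, sizeof tmp[0], cmp_double);
    return (n % 2) ? tmp[n / 2] : 0.5 * (tmp[n / 2 - 1] + tmp[n / 2]);
}

/* Relative gain of opt over base, in percent. */
static double gain_percent(double base, double opt) {
    return (opt / base - 1.0) * 100.0;
}

static double min_of(const double *v, size_t n) {
    double m = v[0];
    for (size_t i = 1; i < n; i++) if (v[i] < m) m = v[i];
    return m;
}

static double max_of(const double *v, size_t n) {
    double m = v[0];
    for (size_t i = 1; i < n; i++) if (v[i] > m) m = v[i];
    return m;
}
```

Calling `gain_percent(mean_of(baseline_runs, RUNS), mean_of(optimized_runs, RUNS))` reproduces the mean gain, and comparing `min_of(optimized_runs, RUNS)` with `max_of(baseline_runs, RUNS)` checks that the two distributions do not overlap. Standard deviation is left out only to keep the sketch free of a `libm` dependency.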

## Health Profile Verification

Ran `scripts/verify_health_profiles.sh`:
```
== Health Profile 1/2: MIXED_TINYV3_C7_SAFE ==
Throughput = 40801959 ops/s [iter=1000000 ws=400] time=0.025s

== Health Profile 2/2: C6_HEAVY_LEGACY_POOLV1 ==
Throughput = 21772562 operations per second, relative time: 0.046s

OK: health profiles passed
```

**Result**: All health profiles PASSED with no regressions.

## Analysis

### Why +21.83% vs E4-1's +3.51%?

1. **Higher Call Frequency**: malloc() is called MORE frequently than free() in allocation-heavy workloads
2. **Function Call Elimination**: Pre-caching `tiny_max_size() == 256` eliminates function call overhead entirely
3. **Branch Predictability**: Size <= 256 check is highly predictable for tiny allocations (better than free's header checks)
4. **malloc() Dominance**: Profile showed malloc (16.13%) + tiny_alloc_gate_fast (19.50%) = 35.63% combined self%

### TLS Read Reduction Impact

**Before (legacy path)**:
- `wrapper_env_cfg_fast()` - TLS read
- `tiny_get_max_size()` - function call (potential TLS read inside)
- Multiple branches: `wcfg->wrap_shape`, `TINY_FRONT_UNIFIED_GATE_ENABLED`, `size <= max`

**After (snapshot path)**:
- `malloc_wrapper_env_get()` - 1 TLS read
- Pre-cached `tiny_max_size_256` flag (no function call)
- Consolidated branches: `env->front_gate_unified`, `env->tiny_max_size_256 && size <= 256`

**Net Benefit**:
- 50%+ TLS read reduction
- 100% function call elimination (common case)
- Better branch prediction (size <= 256 is highly predictable)

## Decision: GO

**Criteria**: mean >= +1.0% for GO

**Result**: +21.83% mean gain **EXCEEDS** GO threshold by 20.83 percentage points

**Recommendation**:
1. **PROMOTE** to default configuration (flip `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1` by default)
2. **COMBINE** with E4-1 (free wrapper ENV snapshot) for maximum effect
3. **DOCUMENT** as Phase 5 E4 success pattern for future wrapper optimizations

## Comparison to E4-1

| Metric | E4-1 (free) | E4-2 (malloc) | Ratio |
|--------|-------------|---------------|-------|
| Mean Gain | +3.51% | +21.83% | **6.2x** |
| Median Gain | +3.59% | +22.86% | **6.4x** |
| Profile Self% | 25.26% | 35.63% | 1.4x |

**Insight**: malloc() optimization has **6.2x higher ROI** than free() optimization due to:
1. Higher call frequency in allocation-heavy workloads
2. Function call elimination opportunity (tiny_get_max_size())
3. Better branch predictability (size checks vs header checks)

## Next Steps

1. Update default configuration: `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1`
2. Verify combined effect with E4-1 (both snapshots enabled)
3. Profile new bottlenecks at 43.54 M ops/s baseline
4. Update CURRENT_TASK.md with E4-2 GO decision

## References

- Design: `/mnt/workdisk/public_share/hakmem/docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_1_DESIGN.md`
- E4-1 Results: `/mnt/workdisk/public_share/hakmem/docs/analysis/PHASE5_E4_1_FREE_WRAPPER_ENV_SNAPSHOT_1_AB_TEST_RESULTS.md` (+3.51%)
- Implementation: `core/box/malloc_wrapper_env_snapshot_box.{h,c}`, `core/box/hak_wrappers.inc.h`
@@ -0,0 +1,237 @@
# Phase 5 E4-2: malloc Wrapper ENV Snapshot - Design Document

## Status
- Phase: 5 E4-2
- Type: Research Box (ENV-gated optimization)
- Created: 2025-12-14

## Motivation

Apply the successful E4-1 pattern (+3.51% from the free wrapper ENV snapshot) to the malloc() hot path to reduce TLS read overhead.

### Current State

malloc() wrapper performs multiple TLS reads:
1. `wrapper_env_cfg_fast()` - wrapper config (wcfg)
2. `TINY_FRONT_UNIFIED_GATE_ENABLED` - compile-time constant (not TLS, but branch)
3. `tiny_get_max_size()` - size threshold check

Profiling shows malloc() + tiny_alloc_gate_fast() consuming 35.63% combined self%:
- malloc: 16.13% self%
- tiny_alloc_gate_fast: 19.50% self%

### E4-1 Success Pattern

E4-1 achieved +3.51% gain by:
1. Consolidating 2 TLS reads -> 1 TLS snapshot
2. Lazy initialization with probe window (bench_profile putenv sync)
3. ENV gate for safe rollback (HAKMEM_FREE_WRAPPER_ENV_SNAPSHOT=0/1)
4. 4-byte struct (cache-friendly)

## Objective

**Goal**: Apply E4-1 pattern to malloc() wrapper to reduce TLS overhead.

**Expected Gain**: +2-4% (similar to E4-1's +3.51%)
- malloc is called MORE frequently than free in allocation-heavy workloads
- Reducing TLS reads in malloc() hot path should have comparable or greater impact

**Risk**: Low
- E4-1 pattern proven successful
- ENV-gated allows safe rollback
- No constructor initialization (avoiding E3-4 failure pattern)

## Design

### Snapshot Structure

```c
struct malloc_wrapper_env_snapshot {
    uint8_t wrap_shape;          // HAKMEM_WRAP_SHAPE (from wrapper_env_cfg)
    uint8_t front_gate_unified;  // TINY_FRONT_UNIFIED_GATE_ENABLED (compile-time constant)
    uint8_t tiny_max_size_256;   // tiny_get_max_size() == 256 (most common case)
    uint8_t initialized;         // Lazy init flag (0 = not initialized, 1 = initialized)
};
```

Size: 4 bytes (cache-friendly, fits in single cache line with E4-1 snapshot)

### TLS Storage

```c
extern __thread struct malloc_wrapper_env_snapshot g_malloc_wrapper_env;
```

Initialized to zero on thread creation, lazy-init on first malloc() call per thread.

### ENV Gate

```c
static inline int malloc_wrapper_env_snapshot_enabled(void) {
    static __thread int s_enabled = -1;
    if (__builtin_expect(s_enabled == -1, 0)) {
        const char* env = getenv("HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT");
        s_enabled = (env && *env == '1') ? 1 : 0;
    }
    return s_enabled;
}
```

Default: OFF (s_enabled=0, research box)
Enable: `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1`
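
The gate only inspects the first character of the variable, which has a few consequences worth spelling out. The sketch below isolates that test so it can be exercised without touching the process environment; `gate_value_enables()` and `gate_enabled_from_env()` are hypothetical names introduced here, and unlike the real gate the second function does not cache its result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same test as the gate above: enabled only when the first character is '1'. */
static int gate_value_enables(const char *value) {
    return (value != NULL && *value == '1') ? 1 : 0;
}

/* Hypothetical uncached variant: consults the environment on every call. */
static int gate_enabled_from_env(const char *name) {
    assert(name != NULL);
    return gate_value_enables(getenv(name));
}
```

Under this rule an unset variable, an empty string, `0`, `true`, and `on` all leave the gate off, while any value beginning with `1` (including `10`) turns it on.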

### Lazy Initialization

```c
void malloc_wrapper_env_snapshot_init(void) {
    // Read wrapper env config (wrap_shape flag)
    const wrapper_env_cfg_t* wcfg = wrapper_env_cfg();
    g_malloc_wrapper_env.wrap_shape = wcfg->wrap_shape;

    // Read front gate unified constant (compile-time macro)
    g_malloc_wrapper_env.front_gate_unified = TINY_FRONT_UNIFIED_GATE_ENABLED;

    // Read tiny max size (most common case: 256 bytes)
    g_malloc_wrapper_env.tiny_max_size_256 = (tiny_get_max_size() == 256) ? 1 : 0;

    // Mark as initialized
    g_malloc_wrapper_env.initialized = 1;
}
```

Called once per thread on first malloc() call (probe window ensures bench_profile putenv sync).

### Primary API

```c
static inline const struct malloc_wrapper_env_snapshot* malloc_wrapper_env_get(void) {
    // Fast path: Already initialized
    if (__builtin_expect(g_malloc_wrapper_env.initialized, 1)) {
        return &g_malloc_wrapper_env;
    }

    // Slow path: First access, initialize snapshot
    malloc_wrapper_env_snapshot_init();
    return &g_malloc_wrapper_env;
}
```

Single TLS read (`g_malloc_wrapper_env.initialized`) gates entire snapshot.
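
To see the once-per-thread behavior in isolation, the pattern can be reduced to a standalone sketch. Everything prefixed `demo_` is hypothetical: the real config sources (`wrapper_env_cfg()`, `TINY_FRONT_UNIFIED_GATE_ENABLED`, `tiny_get_max_size()`) are replaced by fixed stand-in values, and an init counter is added purely for observation. `__thread` and `__builtin_expect` are GCC/Clang extensions, as in the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_snapshot {
    uint8_t wrap_shape;
    uint8_t front_gate_unified;
    uint8_t tiny_max_size_256;
    uint8_t initialized;
};

/* Zero-initialized per thread, so "initialized == 0" means "not yet set up". */
static __thread struct demo_snapshot g_demo_snapshot;
static __thread unsigned g_demo_init_calls;

/* Stand-in for tiny_get_max_size(); the value 256 is an assumption. */
static unsigned demo_tiny_max_size(void) { return 256; }

static void demo_snapshot_init(void) {
    g_demo_init_calls++;
    g_demo_snapshot.wrap_shape = 0;          /* assumed config value */
    g_demo_snapshot.front_gate_unified = 1;  /* assumed config value */
    g_demo_snapshot.tiny_max_size_256 = (demo_tiny_max_size() == 256) ? 1 : 0;
    g_demo_snapshot.initialized = 1;
}

static const struct demo_snapshot *demo_snapshot_get(void) {
    if (__builtin_expect(g_demo_snapshot.initialized, 1)) {
        return &g_demo_snapshot;             /* fast path: one TLS access */
    }
    demo_snapshot_init();                    /* slow path: first call only */
    assert(g_demo_snapshot.initialized == 1);
    return &g_demo_snapshot;
}

static unsigned demo_init_calls(void) { return g_demo_init_calls; }

/* Returns 1 when a request of this size would take the tiny fast path. */
static int demo_takes_tiny_fast_path(size_t size) {
    const struct demo_snapshot *env = demo_snapshot_get();
    return env->front_gate_unified && env->tiny_max_size_256 && size <= 256;
}
```

After any number of `demo_takes_tiny_fast_path()` calls on one thread, `demo_init_calls()` stays at 1, which is the property the lazy-init design depends on.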

## Integration Plan

### malloc() Hot Path Changes

**Before (legacy path)**:
```c
void* malloc(size_t size) {
    const wrapper_env_cfg_t* wcfg = wrapper_env_cfg_fast();  // TLS read 1
    if (__builtin_expect(wcfg->wrap_shape, 0)) {
        // ... hot/cold dispatch ...
        if (__builtin_expect(TINY_FRONT_UNIFIED_GATE_ENABLED, 1)) {  // Branch 1
            if (size <= tiny_get_max_size()) {  // Function call
                void* ptr = tiny_alloc_gate_fast(size);
                if (__builtin_expect(ptr != NULL, 1)) {
                    return ptr;
                }
            }
        }
        return malloc_cold(size, wcfg);
    }
    // ... legacy path ...
}
```

**After (snapshot path, ENV-gated)**:
```c
void* malloc(size_t size) {
    if (__builtin_expect(malloc_wrapper_env_snapshot_enabled(), 0)) {
        // Optimized path: Single TLS snapshot (1 TLS read instead of 2+)
        const struct malloc_wrapper_env_snapshot* env = malloc_wrapper_env_get();

        // Fast path: Front gate unified (LIKELY in current presets)
        if (__builtin_expect(env->front_gate_unified, 1)) {
            if (__builtin_expect(env->tiny_max_size_256 && size <= 256, 1)) {
                void* ptr = tiny_alloc_gate_fast(size);
                if (__builtin_expect(ptr != NULL, 1)) {
                    return ptr;
                }
            } else if (size <= tiny_get_max_size()) {  // Fallback for non-256 sizes
                void* ptr = tiny_alloc_gate_fast(size);
                if (__builtin_expect(ptr != NULL, 1)) {
                    return ptr;
                }
            }
        }

        // Slow path fallback: Wrap shape dispatch
        if (__builtin_expect(env->wrap_shape, 0)) {
            const wrapper_env_cfg_t* wcfg = wrapper_env_cfg_fast();
            return malloc_cold(size, wcfg);
        }

        // Fall through to legacy path below
    } else {
        // Legacy path (SNAPSHOT=0, default): Original behavior preserved
        // ... existing malloc() implementation ...
    }
}
```

### Benefit Analysis

**Baseline (legacy path)**:
- 2 TLS reads: `wrapper_env_cfg_fast()`, (tiny_get_max_size() not TLS but function call overhead)
- 2 branches: `wcfg->wrap_shape`, `TINY_FRONT_UNIFIED_GATE_ENABLED`
- 1 function call: `tiny_get_max_size()`

**Optimized (snapshot path)**:
- 1 TLS read: `malloc_wrapper_env_get()` (checks `g_malloc_wrapper_env.initialized`)
- 2 branches: `env->front_gate_unified`, `env->tiny_max_size_256 && size <= 256`
- 0 function calls in common case (256-byte threshold pre-cached)

**Reduction**:
- TLS reads: 2 -> 1 (50% reduction, same as E4-1)
- Function calls: 1 -> 0 (100% reduction in common case)
- Branch predictability: Improved (size <= 256 is highly predictable for tiny allocations)

## Implementation Steps

1. **Box Implementation**:
   - Create `core/box/malloc_wrapper_env_snapshot_box.h` (API header)
   - Create `core/box/malloc_wrapper_env_snapshot_box.c` (implementation)

2. **Integration**:
   - Modify `core/box/hak_wrappers.inc.h` (malloc() hot path)
   - Add ENV gate check at top of malloc()
   - Add snapshot fast path with size <= 256 optimization

3. **Build System**:
   - Add `malloc_wrapper_env_snapshot_box.o` to Makefile
   - Update all build targets (bench, tiny_bench, shared library)

4. **Testing**:
   - 10-run A/B test on Mixed profile (SNAPSHOT=0 vs SNAPSHOT=1)
   - Verify health profiles (no regressions)

5. **Decision**:
   - GO: mean >= +1.0%
   - NEUTRAL: -1.0% ~ +1.0%
   - NO-GO: mean < -1.0%
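
The decision rule in step 5 is simple enough to encode directly. The following is a sketch only: the enum and function names are hypothetical, and it assumes the boundaries read as written above, so a mean gain of exactly +1.0% is GO and exactly -1.0% is NEUTRAL.

```c
#include <assert.h>

enum ab_decision { AB_NO_GO = -1, AB_NEUTRAL = 0, AB_GO = 1 };

/* Classify a mean gain (in percent) using the thresholds from step 5. */
static enum ab_decision classify_mean_gain(double gain_percent) {
    assert(gain_percent == gain_percent);  /* reject NaN input */
    if (gain_percent >= 1.0) return AB_GO;
    if (gain_percent < -1.0) return AB_NO_GO;
    return AB_NEUTRAL;
}

static const char *ab_decision_name(enum ab_decision d) {
    switch (d) {
    case AB_GO:      return "GO";
    case AB_NO_GO:   return "NO-GO";
    case AB_NEUTRAL: return "NEUTRAL";
    }
    return "UNKNOWN";
}
```

With these thresholds, the E3-4 result quoted in this document (-1.44%) classifies as NO-GO and E4-1 (+3.51%) as GO.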

## Success Criteria

**GO Threshold**: +1.0% mean gain (conservative, E4-1 achieved +3.51%)

**Expected Result**: +2-4% based on:
1. E4-1 pattern proven (+3.51% from free wrapper)
2. malloc() called more frequently than free in many workloads
3. Additional function call elimination (tiny_get_max_size())

**Rollback Plan**: If NO-GO, disable via ENV gate (SNAPSHOT=0 is default)

## References

- E4-1 Success: `/mnt/workdisk/public_share/hakmem/docs/analysis/PHASE5_E4_1_FREE_WRAPPER_ENV_SNAPSHOT_1_AB_TEST_RESULTS.md` (+3.51%)
- E3-4 Failure: Constructor initialization pattern (-1.44%, avoided in this design)
- Profiling: malloc (16.13% self%) + tiny_alloc_gate_fast (19.50% self%) = 35.63% combined
@@ -1,64 +1,54 @@
|
|||||||
# Phase 5 E4-2: malloc Wrapper ENV Snapshot (Next Instructions)
|
# Phase 5 E4-2: malloc Wrapper ENV Snapshot (Next Instructions)
|
||||||
|
|
||||||
|
## Status (2025-12-14)
|
||||||
|
|
||||||
|
- ✅ GO (Mixed 10-run: **+21.83% mean / +22.86% median**)
|
||||||
|
- ENV gate: `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0/1` (default 0)
|
||||||
|
- Implementation:
|
||||||
|
- `core/box/malloc_wrapper_env_snapshot_box.h`
|
||||||
|
- `core/box/malloc_wrapper_env_snapshot_box.c`
|
||||||
|
- `core/box/hak_wrappers.inc.h` (a single boundary at the malloc wrapper entry)
|
||||||
|
- Results log: `docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_1_AB_TEST_RESULTS.md`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Goal
|
## Goal
|
||||||
|
|
||||||
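Following the same idea as E4-1 (free wrapper), consolidate the multiple ENV checks/TLS reads on the `malloc()` wrapper side into a single snapshot to cut overhead at the wrapper entry.
|
Promote E4-2 to the mainline, confirm the cumulative effect with E4-1 enabled at the same time, and decide on the next hotspot.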
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Box Theory (box decomposition)
|
## Step 1: Preset promotion (opt-out possible)
|
||||||
|
|
||||||
- L0: ENV gate(戻せる)
|
`core/bench_profile.h` の `MIXED_TINYV3_C7_SAFE` に追加:
|
||||||
- `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0/1`(default 0)
|
|
||||||
- L1: Snapshot box(責務 1 つ)
|
```c
|
||||||
- `malloc_wrapper_env_snapshot_box.{h,c}`
|
bench_setenv_default("HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT", "1");
|
||||||
- `__thread` に `wrap_shape/front_gate_unified/...` を保持
|
```
|
||||||
- init は “初回 malloc のみ”(lazy init、常時ログ禁止)
|
|
||||||
- 境界: wrapper の入口 1 箇所だけで snapshot を読む
|
Rollback:
|
||||||
|
```sh
|
||||||
|
HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0
|
||||||
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Step 1: 新規 Box を追加
|
## Step 2: 累積 A/B(E4-1/E4-2 同時 ON)
|
||||||
|
|
||||||
新規ファイル:
|
Mixed 10-run(iter=20M, ws=400):
|
||||||
- `core/box/malloc_wrapper_env_snapshot_box.h`
|
|
||||||
- `core/box/malloc_wrapper_env_snapshot_box.c`
|
|
||||||
|
|
||||||
要件:
|
|
||||||
- 1 TLS read で必要なフラグを全部取れること
|
|
||||||
- `getenv()` は init の 1 回だけ(hot で呼ばない)
|
|
||||||
- 失敗時は “既存経路にフォールバック” で挙動不変
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Step 2: wrapper に統合(境界 1 箇所)
|
|
||||||
|
|
||||||
対象:
|
|
||||||
- `core/box/hak_wrappers.inc.h` の `malloc()` hot path
|
|
||||||
|
|
||||||
方針:
|
|
||||||
- `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1` のときだけ snapshot 経由で “早期 return 可能な最短経路” を作る
|
|
||||||
- それ以外は既存の `wrapper_env_cfg_fast()` / 既存分岐のまま
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Step 3: ビルド定義の追加
|
|
||||||
|
|
||||||
- `Makefile` の object list に `malloc_wrapper_env_snapshot_box.o` を追加
|
|
||||||
- `hakmem.d` は `make` に任せる(repo が追跡している場合のみ差分を受け入れる)
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Step 4: A/B(Mixed 10-run)
|
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
# Baseline
|
# Baseline: both OFF
|
||||||
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0 \
|
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE \
|
||||||
./bench_random_mixed_hakmem 20000000 400 1
|
HAKMEM_FREE_WRAPPER_ENV_SNAPSHOT=0 \
|
||||||
|
HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0 \
|
||||||
|
./bench_random_mixed_hakmem 20000000 400 1
|
||||||
|
|
||||||
# Optimized
|
# Optimized: both ON
|
||||||
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1 \
|
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE \
|
||||||
./bench_random_mixed_hakmem 20000000 400 1
|
HAKMEM_FREE_WRAPPER_ENV_SNAPSHOT=1 \
|
||||||
|
HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1 \
|
||||||
|
./bench_random_mixed_hakmem 20000000 400 1
|
||||||
```
|
```
|
||||||
|
|
||||||
判定:
|
判定:
|
||||||
@@ -68,9 +58,15 @@ HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1 \
 
 ---
 
-## Step 5: Health check
+## Step 3: Health check
 
 ```sh
 scripts/verify_health_profiles.sh
 ```
 
+---
+
+## Step 4: Next candidates (in priority order)
+
+1. Re-run perf and pick a target with "self% ≥ 5%" (on the new baseline)
+2. Option: hot loops in alloc gate / tiny_unified_cache / pool (anything other than ENV/TLS)
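Step 1 of the updated instructions turns the gate on through `bench_setenv_default(...)` while still allowing rollback with `HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0`. That only works if the preset never overwrites a value the user has already exported. A minimal sketch of such a "default only" setter follows, assuming POSIX `setenv`. Both `sketch_setenv_default` and `sketch_flag_enabled` are hypothetical stand-ins, and the real helper in `core/bench_profile.h` may behave differently.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose setenv() under strict ISO C modes */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for bench_setenv_default(): set a preset default
 * without clobbering a value that is already present in the environment. */
static int sketch_setenv_default(const char *name, const char *value) {
    assert(name != NULL && value != NULL);
    return setenv(name, value, 0);   /* overwrite = 0 keeps the user's value */
}

/* Returns 1 only when the variable is set to exactly "1". */
static int sketch_flag_enabled(const char *name) {
    const char *v = getenv(name);
    return v != NULL && strcmp(v, "1") == 0;
}
```

Under this assumption, a run started with the variable exported as `0` keeps the snapshot disabled even when the preset asks for `1`, which is the opt-out described in Step 1.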
@@ -0,0 +1,48 @@
+# Phase 5 E4 (E4-1 + E4-2): Combined A/B (Next Instructions)
+
+## Purpose
+
+Confirm the "cumulative effect" of E4-1 (free wrapper snapshot) and E4-2 (malloc wrapper snapshot), and settle on the next perf target.
+
+---
+
+## A/B (Mixed 10-run)
+
+```sh
+# Baseline: both OFF
+HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE \
+  HAKMEM_FREE_WRAPPER_ENV_SNAPSHOT=0 \
+  HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=0 \
+  ./bench_random_mixed_hakmem 20000000 400 1
+
+# Optimized: both ON
+HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE \
+  HAKMEM_FREE_WRAPPER_ENV_SNAPSHOT=1 \
+  HAKMEM_MALLOC_WRAPPER_ENV_SNAPSHOT=1 \
+  ./bench_random_mixed_hakmem 20000000 400 1
+```
+
+Verdict:
+- GO: mean of **+1.0% or more**
+- Within ±1%: NEUTRAL (freeze)
+- -1% or below: NO-GO (freeze)
+
+---
+
+## Health check
+
+```sh
+scripts/verify_health_profiles.sh
+```
+
+---
+
+## Next action
+
+```sh
+HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE perf record -F 99 -- \
+  ./bench_random_mixed_hakmem 20000000 400 1
+perf report --stdio --no-children
+```
+
+Pick the next target from the boxes with "self% ≥ 5%".
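The verdict rule in the combined A/B instructions above (GO at +1.0% or more, NEUTRAL within ±1%, NO-GO below that) is simple enough to automate when summarizing a 10-run. A minimal sketch follows; all `sketch_*` names are hypothetical and nothing like it is confirmed to exist in the repository. It treats exactly -1.0% as NEUTRAL, following the design document's `mean < -1.0%` wording; the combined instructions say "-1% or below", so the boundary case is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdict type mirroring GO / NEUTRAL / NO-GO. */
typedef enum {
    SKETCH_NO_GO   = -1,
    SKETCH_NEUTRAL = 0,
    SKETCH_GO      = 1
} sketch_verdict_t;

/* Arithmetic mean of n throughput samples (for example, M ops/s per run). */
static double sketch_mean(const double *v, size_t n) {
    assert(v != NULL && n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += v[i];
    }
    return sum / (double)n;
}

/* Relative change of the optimized mean over the baseline mean, in percent. */
static double sketch_gain_pct(double base_mean, double opt_mean) {
    assert(base_mean > 0.0);
    return (opt_mean - base_mean) / base_mean * 100.0;
}

/* Apply the thresholds: >= +1.0% is GO, < -1.0% is NO-GO, else NEUTRAL. */
static sketch_verdict_t sketch_classify(double gain_pct) {
    if (gain_pct >= 1.0) {
        return SKETCH_GO;
    }
    if (gain_pct < -1.0) {
        return SKETCH_NO_GO;
    }
    return SKETCH_NEUTRAL;
}
```

Feeding in the rounded E4-2 means from the commit message (35.74M and 43.54M ops/s) gives a gain of roughly +21.8%, in line with the reported +21.83%, and a GO verdict. The -1.44% reported for E3-4 classifies as NO-GO.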
@@ -5,7 +5,8 @@
 - The winning box of Phase 4 is **E1 (ENV Snapshot)** (made the default in `MIXED_TINYV3_C7_SAFE`)
 - E3-4 (ENV CTOR) is **NO-GO / freeze**
 - Phase 5 winning box: **E4-1 (free wrapper snapshot)** (made the default in `MIXED_TINYV3_C7_SAFE`)
-- Next, instead of "shape", either trim the **ENV/TLS at the wrapper entry** (E4-2) or go after self% ≥ 5% targets found with perf
+- Phase 5 winning box: **E4-2 (malloc wrapper snapshot)** (made the default in `MIXED_TINYV3_C7_SAFE`)
+- Next, instead of "shape", re-run perf on the **new baseline** and go after the self% ≥ 5% targets
 
 ---
 
@@ -69,3 +70,4 @@ scripts/verify_health_profiles.sh
 
 - E4-1 promotion: `docs/analysis/PHASE5_E4_1_FREE_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md`
 - E4-2 design/implementation: `docs/analysis/PHASE5_E4_2_MALLOC_WRAPPER_ENV_SNAPSHOT_NEXT_INSTRUCTIONS.md`
+- E4 combined A/B: `docs/analysis/PHASE5_E4_COMBINED_AB_TEST_NEXT_INSTRUCTIONS.md`
4
hakmem.d
@@ -158,7 +158,8 @@ hakmem.o: core/hakmem.c core/hakmem.h core/hakmem_build_flags.h \
 core/box/tiny_alloc_gate_shape_env_box.h \
 core/box/tiny_front_config_box.h core/box/wrapper_env_box.h \
 core/box/wrapper_env_cache_box.h core/box/wrapper_env_cache_env_box.h \
-core/box/free_wrapper_env_snapshot_box.h core/box/../hakmem_internal.h
+core/box/free_wrapper_env_snapshot_box.h \
+core/box/malloc_wrapper_env_snapshot_box.h core/box/../hakmem_internal.h
 core/hakmem.h:
 core/hakmem_build_flags.h:
 core/hakmem_config.h:
@@ -398,4 +399,5 @@ core/box/wrapper_env_box.h:
 core/box/wrapper_env_cache_box.h:
 core/box/wrapper_env_cache_env_box.h:
 core/box/free_wrapper_env_snapshot_box.h:
+core/box/malloc_wrapper_env_snapshot_box.h:
 core/box/../hakmem_internal.h: