|
|
e4c5f05355
|
Phase 86: Free Path Legacy Mask (NO-GO, +0.25%)
## Summary
Implemented the Phase 86 "mask-only commit" optimization for the free path:
- Bitset mask (0x7f for C0-C6) to identify LEGACY classes
- Direct call to tiny_legacy_fallback_free_base_with_env()
- No indirect function pointers (avoids Phase 85's -0.86% regression)
- Fail-fast on LARSON_FIX=1 (cross-thread validation incompatibility)
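
A minimal sketch of the mask check described above, not the actual hakmem code. Only the 0x7f mask for C0-C6 and the two checks (mask_enabled, has_class) come from this message. The struct, the helper names, and the stub are hypothetical. The stub stands in for tiny_legacy_fallback_free_base_with_env(), whose real signature is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits 0..6 set: classes C0-C6 are treated as LEGACY (value from the commit). */
#define LEGACY_MASK_C0_C6 0x7fu

/* Hypothetical state holder; the real globals live in free_path_legacy_mask_box.h. */
typedef struct {
    bool    enabled; /* first branch: mask_enabled */
    uint8_t mask;    /* one bit per size class, tested by has_class */
} legacy_mask_state;

/* Stub for the direct call target; it only counts invocations. */
static int g_legacy_free_calls = 0;
static void legacy_free_stub(void *base, unsigned class_idx) {
    (void)base;
    (void)class_idx;
    g_legacy_free_calls++;
}

/* Second branch: is this class marked LEGACY in the bitset? */
static bool legacy_mask_has_class(const legacy_mask_state *s, unsigned class_idx) {
    assert(class_idx < 8); /* uint8_t mask holds at most 8 classes */
    return ((s->mask >> class_idx) & 1u) != 0;
}

/* Returns true when the free was handled by the legacy path. */
static bool free_try_legacy_mask(const legacy_mask_state *s, void *base,
                                 unsigned class_idx) {
    if (!s->enabled) return false;                          /* check 1 */
    if (!legacy_mask_has_class(s, class_idx)) return false; /* check 2 */
    legacy_free_stub(base, class_idx); /* direct call, no function pointer */
    return true;
}
```

With the mask set to 0x7f, classes 0 through 6 take the legacy call and class 7 falls through to whatever path follows. The two early returns are the "two branch checks" counted as overhead in the root cause section.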
## Results (10-run SSOT)
**NO-GO**: +0.25% improvement (threshold: +1.0%)
- Control: 51,750,467 ops/s (CV: 2.26%)
- Treatment: 51,881,055 ops/s (CV: 2.32%)
- Delta: +0.25% (mean), -0.15% (median)
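
The mean delta and verdict above can be reproduced from the two reported means. This is a sketch only: the function names are hypothetical, and treating "delta >= threshold" as the GO rule is an assumption based on the +1.0% threshold stated here.

```c
#include <assert.h>
#include <stdbool.h>

/* Relative change of treatment over control, in percent. */
static double delta_pct(double control_ops, double treatment_ops) {
    assert(control_ops > 0.0);
    return (treatment_ops - control_ops) / control_ops * 100.0;
}

/* Assumed decision rule: GO only when the delta reaches the threshold. */
static bool is_go(double delta, double threshold_pct) {
    return delta >= threshold_pct;
}
```

For the reported means, delta_pct(51750467.0, 51881055.0) is about 0.25, which is below the 1.0 threshold, so is_go() returns false.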
## Root Cause
Competing optimizations plateau:
1. Phase 9/10 MONO LEGACY (+1.89%) already captures most of the free path benefit
2. The remaining margin is insufficient to overcome:
   - Two branch checks (mask_enabled + has_class)
   - I-cache layout tax in the hot path
   - Direct function call overhead
## Phase 85 vs Phase 86
| Metric | Phase 85 | Phase 86 |
|--------|----------|----------|
| Approach | Indirect calls + table | Bitset mask + direct call |
| Result | -0.86% | +0.25% |
| Verdict | NO-GO (regression) | NO-GO (insufficient) |
Phase 86 correctly avoided the indirect call penalty but revealed an architectural
limit: the free path cannot escape the Phase 9/10 overlay without restructuring.
## Recommendation
The free path optimization layer has reached its practical ceiling:
- Phase 9/10 +1.89% + Phase 6/19/FASTLANE +16-27% ≈ 18-29% total
- Further attempts at ceremony elimination face the same constraints
- Recommend focusing on other optimization layers (malloc, etc.)
## Files Changed
### New
- core/box/free_path_legacy_mask_box.h (API + globals)
- core/box/free_path_legacy_mask_box.c (refresh logic)
### Modified
- core/bench_profile.h (added refresh call)
- core/front/malloc_tiny_fast.h (added Phase 86 fast path check)
- Makefile (added object files)
- CURRENT_TASK.md (documented result)
All changes are conditional on HAKMEM_FREE_PATH_LEGACY_MASK=1 (default OFF).
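
A hedged sketch of how such an opt-in gate could be read at refresh time. The function and enum names are hypothetical. The use of LARSON_FIX as a literal environment variable name is also an assumption (the real variable may carry a prefix). The real refresh logic in free_path_legacy_mask_box.c may fail fast by aborting, whereas this sketch reports a conflict code so the caller can decide.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* True only when the variable is set to exactly "1"; unset means OFF. */
static bool env_flag_on(const char *name) {
    const char *v = getenv(name);
    return v != NULL && strcmp(v, "1") == 0;
}

typedef enum {
    MASK_OFF,      /* default: feature not requested */
    MASK_ON,       /* feature requested and no conflict detected */
    MASK_CONFLICT  /* requested together with LARSON_FIX=1: caller should fail fast */
} mask_mode;

/* Hypothetical refresh: re-reads the environment and returns the mode. */
static mask_mode legacy_mask_refresh(void) {
    if (!env_flag_on("HAKMEM_FREE_PATH_LEGACY_MASK")) return MASK_OFF;
    if (env_flag_on("LARSON_FIX")) return MASK_CONFLICT; /* assumed variable name */
    return MASK_ON;
}
```

Reading the environment once in a refresh step, rather than on every free, keeps getenv() out of the hot path. The hot path then only tests cached state.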
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
|
2025-12-18 22:05:34 +09:00 |
|
|
|
89a9212700
|
Phase 83-1 + Allocator Comparison: Switch dispatch fixed (NO-GO +0.32%), PROFILE correction, SCORECARD update
Key changes:
- Phase 83-1: Switch dispatch fixed mode (tiny_inline_slots_switch_dispatch_fixed_box) - NO-GO (marginal +0.32%, branch reduction negligible)
  Reason: lazy-init pattern already optimal, Phase 78-1 pattern shows diminishing returns
- Allocator comparison baseline update (10-run SSOT, WS=400, ITERS=20M):
  - tcmalloc: 115.26M (92.33% of mimalloc)
  - jemalloc: 97.39M (77.96% of mimalloc)
  - system: 85.20M (68.24% of mimalloc)
  - mimalloc: 124.82M (baseline)
- hakmem PROFILE correction: scripts/run_mixed_10_cleanenv.sh + run_allocator_quick_matrix.sh
  - PROFILE explicitly set to MIXED_TINYV3_C7_SAFE for hakmem measurements
  - Result: baseline stabilized to 55.53M (44.46% of mimalloc)
  - Previous unstable measurement (35.57M) was due to profile leak
- Documentation:
  - PERFORMANCE_TARGETS_SCORECARD.md: Reference allocators + M1/M2 milestone status
  - PHASE83_1_SWITCH_DISPATCH_FIXED_RESULTS.md: Phase 83-1 analysis (NO-GO)
  - ALLOCATOR_COMPARISON_QUICK_RUNBOOK.md: Quick comparison procedure
  - ALLOCATOR_COMPARISON_SSOT.md: Detailed SSOT methodology
- M2 milestone status: 44.46% (target 55%, gap -10.54pp) - structural improvements needed
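
The scorecard arithmetic above (share of the mimalloc baseline, and the gap to the M2 target in percentage points) can be sketched as follows. The helper names are hypothetical. Recomputing a share from the rounded throughput figures listed here can differ from the reported percentage in the last digits, presumably because the reported values were derived from unrounded measurements.

```c
/* Throughput expressed as a percentage of a baseline allocator. */
static double pct_of_baseline(double value, double baseline) {
    return value / baseline * 100.0;
}

/* Gap to a target in percentage points; negative means below target. */
static double gap_pp(double current_pct, double target_pct) {
    return current_pct - target_pct;
}
```

For the M2 milestone, gap_pp(44.46, 55.0) gives about -10.54, matching the gap reported above.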
🤖 Generated with Claude Code
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
2025-12-18 18:50:00 +09:00 |
|