|
|
cbb35ee27f
|
Phase 13 v1 + E5-2 retest: Both NEUTRAL, freeze as research boxes
Phase 13 v1: Header Write Elimination (C7 preserve header)
- Verdict: NEUTRAL (+0.78%)
- Implementation: HAKMEM_TINY_C7_PRESERVE_HEADER ENV gate (default OFF)
- Makes C7 nextptr offset conditional (0→1 when enabled)
- 4-point matrix A/B test results:
* Case A (baseline): 51.49M ops/s
* Case B (WRITE_ONCE=1): 52.07M ops/s (+1.13%)
* Case C (C7_PRESERVE=1): 51.36M ops/s (-0.26%)
* Case D (both): 51.89M ops/s (+0.78% NEUTRAL)
- Action: Freeze as research box (default OFF, manual opt-in)
Phase 5 E5-2: Header Write-Once retest (promotion test)
- Verdict: NEUTRAL (+0.54%)
- Motivation: Phase 13 Case B showed +1.13%, re-tested with dedicated 20-run
- Results (20-run):
* Case A (baseline): 51.10M ops/s
* Case B (WRITE_ONCE=1): 51.37M ops/s (+0.54%)
- Previous test: +0.45% (consistent with NEUTRAL)
- Action: Keep as research box (default OFF, manual opt-in)
Key findings:
- Header write tax optimization shows consistent NEUTRAL results
- Neither Phase 13 v1 nor E5-2 reaches GO threshold (+1.0%)
- Both implemented as reversible ENV gates for future research
Files changed:
- New: core/box/tiny_c7_preserve_header_env_box.{c,h}
- Modified: core/box/tiny_layout_box.h (C7 offset conditional)
- Modified: core/tiny_nextptr.h, core/box/tiny_header_box.h (comments)
- Modified: core/bench_profile.h (refresh sync)
- Modified: Makefile (add new .o files)
- Modified: scripts/run_mixed_10_cleanenv.sh (add C7_PRESERVE ENV)
- Docs: PHASE13_*, PHASE5_E5_2_HEADER_WRITE_ONCE_* (design/results)
Next: Phase 14 (Pointer-chase reduction, tcache-style intrusive LIFO)
🤖 Generated with Claude Code
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
2025-12-15 00:32:25 +09:00 |
|
|
|
f7b18aaf13
|
Phase 5 E5-2: Header Write-Once (NEUTRAL, FROZEN)
Target: tiny_region_id_write_header (3.35% self%)
- Hypothesis: Headers redundant for reused blocks
- Strategy: Write headers ONCE at refill boundary, skip in hot alloc
Implementation:
- ENV gate: HAKMEM_TINY_HEADER_WRITE_ONCE=0/1 (default 0)
- core/box/tiny_header_write_once_env_box.h: ENV gate
- core/box/tiny_header_write_once_stats_box.h: Stats counters
- core/box/tiny_header_box.h: Added tiny_header_finalize_alloc()
- core/front/tiny_unified_cache.c: Prefill at 3 refill sites
- core/box/tiny_front_hot_box.h: Use finalize function
A/B Test Results (Mixed, 10-run, 20M iters):
- Baseline (WRITE_ONCE=0): 44.22M ops/s (mean), 44.53M ops/s (median)
- Optimized (WRITE_ONCE=1): 44.42M ops/s (mean), 44.36M ops/s (median)
- Improvement: +0.45% mean, -0.38% median
Decision: NEUTRAL (within ±1.0% threshold)
- Action: FREEZE as research box (default OFF, do not promote)
Root Cause Analysis:
- Header writes are NOT redundant - existing code writes only when needed
- Branch overhead (~4 cycles) cancels savings (~3-5 cycles)
- perf self% ≠ optimization ROI (3.35% target → +0.45% gain)
Key Lessons:
1. Verify assumptions before optimizing (inspect code paths)
2. Hot spot self% measures time IN function, not savings from REMOVING it
3. Branch overhead matters (even "simple" checks add cycles)
Positive Outcome:
- StdDev reduced 50% (0.96M → 0.48M) - more stable performance
Health Check: PASS (all profiles)
Next Candidates:
- free_tiny_fast_cold: 7.14% self%
- unified_cache_push: 3.39% self%
- hakmem_env_snapshot_enabled: 2.97% self%
Deliverables:
- docs/analysis/PHASE5_E5_2_HEADER_REFILL_ONCE_DESIGN.md
- docs/analysis/PHASE5_E5_2_HEADER_REFILL_ONCE_AB_TEST_RESULTS.md
- CURRENT_TASK.md (E5-2 complete, FROZEN)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
|
2025-12-14 06:22:25 +09:00 |
|