Moe Charm (CI) 9ffef0ac9a Phase 19-5 Investigation: Both getenv() consolidation attempts NO-GO
Testing Results:
- Phase 19-5 (Global ENV Cache): -4.28% regression (57.1M → 54.66M ops/s)
- Phase 19-5v2 (HakmemEnvSnapshot): -7.7% regression (57.1M → 52.71M ops/s)

Root Cause Analysis:
Phase 19-5 Failed: 400-byte global struct causes L1 cache layout conflicts
- Cache coherency overhead > getenv() call savings (getenv() is a libc lookup, not a syscall)
- False sharing on g_hak_env_cache struct

Phase 19-5v2 Failed (WORSE): Broke existing ultra-efficient per-thread TLS cache
- Original pattern: static __thread int g_larson_fix = -1
  - Cost: 1 getenv per thread (lazy init at first check)
  - Benefit: every subsequent check is a single thread-local load (typically an L1 hit)
  - Already near-optimal for runtime-configurable gates
- My change: Replaced with env->tiny_larson_fix access
  - Issue: every access needs env pointer NULL-safety handling, and the efficient TLS cache is lost
  - Result: Worse performance than both baseline and v1
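
For reference, the original per-thread pattern described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual hakmem code: the environment variable name `HAKMEM_TINY_LARSON_FIX`, the function name `larson_fix_enabled`, and the rule for parsing the value are hypothetical. `__thread` and `__builtin_expect` are GCC/Clang extensions.

```c
#include <stdlib.h>

/* Hypothetical environment variable name, for illustration only. */
#define LARSON_FIX_ENV "HAKMEM_TINY_LARSON_FIX"

/* Per-thread cached gate: -1 = not yet read on this thread, 0 = off, 1 = on. */
static __thread int g_larson_fix = -1;

/* Returns the gate value. getenv() runs at most once per thread;
 * later calls take only the thread-local load and a predictable branch. */
static inline int larson_fix_enabled(void) {
    if (__builtin_expect(g_larson_fix < 0, 0)) {
        const char *v = getenv(LARSON_FIX_ENV);
        /* Assumed parsing rule: unset, empty, or leading '0' means off. */
        g_larson_fix = (v != NULL && v[0] != '\0' && v[0] != '0') ? 1 : 0;
    }
    return g_larson_fix;
}
```

Each gate keeps its own small thread-local variable next to the code that reads it, so the hot path needs neither a shared struct nor a pointer dereference. Note that a change to the environment after a thread's first check is not observed by that thread.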

Key Discovery:
Original code's per-thread TLS cache pattern is already excellent.
Attempts to consolidate into global or snapshot-based caches failed
because they lose the amortization benefit and introduce layout conflicts.

Decision: DEFER Phase 19-5 series
- Current TLS pattern is near-optimal for runtime-configurable gates
- Focus remaining effort on other instruction reduction candidates:
  - Stats removal (+3-5%)
  - Header optimization (+2-3%)
  - Route fast path (+2-3%)

Updated: CURRENT_TASK.md with findings
Reverted: All Phase 19-5v2 code changes (git reset --hard HEAD~1)

Phase 19 Final Status (19-1b through 19-4c):
- Cumulative improvement: +9.65% (52.06M → 57.1M ops/s)
- GO phases: 19-1b (+5.88%), 19-3a (+4.42%), 19-3b (+2.76%), 19-4a (+0.16%), 19-4c (+0.88%)
- Stable state: Phase 19-4c

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-15 19:32:24 +09:00