Testing Results:
- Phase 19-5 (Global ENV Cache): -4.28% regression (57.1M → 54.66M ops/s)
- Phase 19-5v2 (HakmemEnvSnapshot): -7.7% regression (57.1M → 52.71M ops/s)

Root Cause Analysis:

Phase 19-5 Failed:
- 400B global struct causes L1 cache layout conflicts
- Cache coherency overhead > syscall savings
- False sharing on g_hak_env_cache struct

Phase 19-5v2 Failed (WORSE):
- Broke existing ultra-efficient per-thread TLS cache
- Original pattern: static __thread int g_larson_fix = -1
  - Cost: 1 getenv per thread (lazy init at first check)
  - Benefit: 1-cycle memory reads for all subsequent checks
  - Already near-optimal for runtime-configurable gates
- My change: Replaced with env->tiny_larson_fix access
  - Issue: env pointer NULL-safety, lost efficient TLS cache
  - Result: Worse performance than both baseline and v1

Key Discovery:
Original code's per-thread TLS cache pattern is already excellent.
Attempts to consolidate into global or snapshot-based caches failed
because they lose the amortization benefit and introduce layout conflicts.

Decision: DEFER Phase 19-5 series
- Current TLS pattern is near-optimal for runtime-configurable gates
- Focus remaining effort on other instruction reduction candidates:
  - Stats removal (+3-5%)
  - Header optimization (+2-3%)
  - Route fast path (+2-3%)

Updated: CURRENT_TASK.md with findings
Reverted: All Phase 19-5v2 code changes (git reset --hard HEAD~1)

Phase 19 Final Status (19-1b through 19-4c):
- Cumulative improvement: +9.65% (52.06M → 57.1M ops/s)
- GO phases: 19-1b (+5.88%), 19-3a (+4.42%), 19-3b (+2.76%),
  19-4a (+0.16%), 19-4c (+0.88%)
- Stable state: Phase 19-4c

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
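For reference, the per-thread TLS gate pattern that the commit message above describes as near-optimal can be sketched roughly as follows. This is a minimal illustration, not the actual hakmem code: only the `static __thread int g_larson_fix = -1` declaration comes from the message. The environment variable name, the parsing rule (any non-empty value not starting with `0` enables the gate), the `larson_fix_enabled` helper, and the lookup counter are all assumptions added for the sketch. It relies on the GCC/Clang extensions `__thread` and `__builtin_expect`.

```c
#include <stdlib.h>

/* Hypothetical variable name, chosen only for this sketch. */
#define LARSON_FIX_ENV "HAKMEM_TINY_LARSON_FIX"

/* -1 = not yet resolved on this thread, 0 = gate off, 1 = gate on. */
static __thread int g_larson_fix = -1;

/* Illustration only: counts how often this thread took the slow path. */
static __thread int g_larson_fix_lookups = 0;

/* Returns the gate value, calling getenv() at most once per thread. */
static int larson_fix_enabled(void)
{
    if (__builtin_expect(g_larson_fix < 0, 0)) {
        /* Slow path: first check on this thread, resolve and cache. */
        const char *v = getenv(LARSON_FIX_ENV);
        g_larson_fix = (v != NULL && v[0] != '\0' && v[0] != '0') ? 1 : 0;
        g_larson_fix_lookups++;
    }
    /* Fast path: a plain read of a thread-local int. */
    return g_larson_fix;
}
```

A hot path would call `larson_fix_enabled()` wherever the gate is checked. After the first call on a given thread, every later call reduces to one thread-local load and a well-predicted branch. That is the amortization the global and snapshot-based variants gave up.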
Docs Overview
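This folder is where hakmem's design, measurement, and operations notes are organized and managed.
- INDEX.md: table of contents (links to each document)
- benchmarks/: benchmark procedures and the storage location for sweep results
- specs/: the current specifications (SACS‑3/HW/ENV), consolidated in one place
- roadmap/: upcoming implementation plans, priorities, and tasks
Operating rules (proposed)
- One file (or one folder) per self-contained change or measurement
- Always record the reproduction commands, environment variables, and hardware configuration
- Save large continuous output to a file, and include only excerpts or a summary in the document body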