# CURRENT_TASK (Rolling)
## 0) Current sources of truth (Phase 39)
- **Performance comparison baseline**: **FAST build** (`make perf_fast`)
- **Safety/compatibility baseline**: Standard build (`make bench_random_mixed_hakmem`)
- **Observation baseline**: OBSERVE build (`make perf_observe`)
- **Scorecard**: `docs/analysis/PERFORMANCE_TARGETS_SCORECARD.md`
- **Measurement baseline (Mixed 10-run)**: `scripts/run_mixed_10_cleanenv.sh` (`ITERS=20000000 WS=400`)
## 1) Current status (latest snapshot)
- FAST v3: **56.04M ops/s** (**47.4%** of mimalloc)
- Standard: **53.50M ops/s** (**45.3%** of mimalloc)

Note: `docs/analysis/PERFORMANCE_TARGETS_SCORECARD.md` is authoritative for details; this section lists only the key points.
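## 2) Principles (Box Theory operation)
- Split changes into boxes (each revertible via ENV / build flag)
- Keep the boundary in one place (do not add conversion points)
- **"Delete to go faster" (link-out / large deletions) is sealed off** (layout/LTO can flip the sign of the result)
  - ✅ compile-out (`#if HAKMEM_*_COMPILED` / `#if HAKMEM_BENCH_MINIMAL`) is acceptable
  - ❌ As a rule, do not drop `.o` files from the Makefile or physically delete code (Phase 22-2 NO-GO)
- Run A/B tests by toggling within the **same binary** (ENV / build flag). Comparing separate binaries mixes layout effects into the result.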
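
As an illustration of the compile-out rule above, here is a minimal sketch of a telemetry counter guarded by a `HAKMEM_*_COMPILED`-style flag. The flag name `HAKMEM_EXAMPLE_STATS_COMPILED`, the counter, and both functions are hypothetical and exist only for this example; they are not taken from the hakmem sources. The point is that the code stays in the file and the object stays in the link, while the hot path pays nothing when the flag is 0.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical flag following the document's HAKMEM_*_COMPILED naming.
 * 0 = telemetry compiled out (default in this sketch), 1 = compiled in. */
#ifndef HAKMEM_EXAMPLE_STATS_COMPILED
#define HAKMEM_EXAMPLE_STATS_COMPILED 0
#endif

#if HAKMEM_EXAMPLE_STATS_COMPILED
/* Telemetry only: never used for flow control. */
static _Atomic unsigned long g_example_free_calls;
#endif

/* Hypothetical hot-path function. The atomic increment disappears at
 * compile time when the flag is 0; the source itself is not deleted. */
static inline int example_free_fast(void *p) {
#if HAKMEM_EXAMPLE_STATS_COMPILED
    atomic_fetch_add_explicit(&g_example_free_calls, 1, memory_order_relaxed);
#endif
    return p != NULL; /* stand-in for the real free-path logic */
}

/* Reads the counter, or reports 0 when telemetry is compiled out. */
static inline unsigned long example_free_calls(void) {
#if HAKMEM_EXAMPLE_STATS_COMPILED
    return atomic_load_explicit(&g_example_free_calls, memory_order_relaxed);
#else
    return 0;
#endif
}
```

Building once with `-DHAKMEM_EXAMPLE_STATS_COMPILED=1` and once without gives the two arms of a build-level comparison without removing any `.o` from the link.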
## 3) Next instructions (Phase 40)
**Phase 40: Constantize the remaining gate functions under BENCH_MINIMAL (continued)**

Phase 39 achieved +1.98%. Examining the remaining gate functions in the FAST v3 perf profile identified the following:
### Priority candidates (HOT path):
1. **tiny_header_mode()** (`core/tiny_region_id.h:180-200`)
   - **Hotspot**: `tiny_region_id_write_header` (4.56% self-time)
   - **Pattern**: lazy-init (`static int g_header_mode = -1` + `getenv()`)
   - **Default**: TINY_HEADER_MODE_FULL (0)
   - **BENCH_MINIMAL value**: fixed to FULL (0) (header writes enabled)
   - **Impact**: alloc hot path, executed on every call
   - **Expected**: +0.3~0.8%
2. **mid_v3_enabled()** (`core/box/mid_hotbox_v3_env_box.h:14-26`)
   - **Hotspot**: conditional branch on the free path (`g_free_dispatch_ssot` block)
   - **Pattern**: lazy-init (`static int g_enable = -1` + `getenv()`)
   - **Default**: 0 (OFF)
   - **BENCH_MINIMAL value**: fixed to 0
   - **Impact**: checked on every call on the free path
   - **Expected**: +0.2~0.5%
3. **mid_v3_debug_enabled()** (`core/box/mid_hotbox_v3_env_box.h:78-89`)
   - **Hotspot**: debug log check on the free path
   - **Pattern**: lazy-init (`static int g_debug = -1` + `getenv()`)
   - **Default**: 0 (OFF)
   - **BENCH_MINIMAL value**: fixed to 0
   - **Expected**: +0.1~0.3%
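
The lazy-init pattern shared by the candidates above, and what constantizing it under BENCH_MINIMAL might look like, can be sketched as follows. This is a rough illustration, not the actual hakmem code: the function `example_gate_enabled()` and the environment variable `HAKMEM_EXAMPLE_GATE` are hypothetical, and the real gates may parse their variables differently.

```c
#include <stdlib.h>

#ifndef HAKMEM_BENCH_MINIMAL
#define HAKMEM_BENCH_MINIMAL 0
#endif

/* Hypothetical gate modeled on the "static int g_enable = -1 + getenv()"
 * pattern described above. The env var name is invented for this sketch. */
static inline int example_gate_enabled(void) {
#if HAKMEM_BENCH_MINIMAL
    /* Constantized: the gate is pinned to its default (0 = OFF), so the
     * compiler can fold away the load, the branch, and the getenv() path. */
    return 0;
#else
    static int g_enable = -1; /* -1 means "not initialized yet" */
    if (g_enable == -1) {
        const char *e = getenv("HAKMEM_EXAMPLE_GATE");
        g_enable = (e != NULL && e[0] == '1') ? 1 : 0;
    }
    return g_enable; /* every later call still pays a load and a compare */
#endif
}
```

In the non-minimal build the gate stays controllable from the environment, which is what keeps the change revertible; only the BENCH_MINIMAL build trades that flexibility for a constant.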
### Deferred candidates:
4. **g_free_dispatch_ssot** (`core/box/hak_free_api.inc.h:236-240`)
   - Deferred in Phase 39 (compatibility first)
   - Default: 0 (backward compat)
   - To revisit: should BENCH_MINIMAL fix it to 1?
### Implementation plan:
- **Step 1**: A/B test tiny_header_mode() alone (highest-impact candidate)
- **Step 2**: A/B test mid_v3_enabled() alone
- **Step 3**: Add mid_v3_debug_enabled()
- **Step 4**: Confirm the cumulative effect (GO threshold: +0.5%)

**GO condition**: +0.5% or more, because this is a build-level change
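
The GO check in Step 4 amounts to comparing mean throughput between baseline and candidate runs. The sketch below shows one way to compute it, under assumptions: the type and function names are hypothetical, and treating a drop of the same magnitude as NO-GO (with everything in between NEUTRAL) is an assumed symmetric rule, since the text above only states the +0.5% GO threshold.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { AB_NO_GO, AB_NEUTRAL, AB_GO } ab_verdict;

/* Arithmetic mean of n throughput samples (e.g. M ops/s per run). */
static double ab_mean(const double *xs, size_t n) {
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) sum += xs[i];
    return sum / (double)n;
}

/* Relative change of the candidate mean versus the baseline mean, in percent. */
static double ab_delta_percent(const double *base, const double *cand, size_t n) {
    double b = ab_mean(base, n);
    double c = ab_mean(cand, n);
    return (c - b) / b * 100.0;
}

/* GO at or above the threshold. ASSUMPTION: a loss of the same size is
 * NO-GO and anything in between is NEUTRAL; the document does not say so. */
static ab_verdict ab_classify(double delta_pct, double go_threshold_pct) {
    if (delta_pct >= go_threshold_pct) return AB_GO;
    if (delta_pct <= -go_threshold_pct) return AB_NO_GO;
    return AB_NEUTRAL;
}
```

With a Mixed 10-run measurement, `base` and `cand` would each hold ten samples and the call would be `ab_classify(ab_delta_percent(base, cand, 10), 0.5)`.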
## 4) Recent log (key points only)
- Phase 30-31 commit message (2025-12-16 07:31:15 +09:00): Standard procedure + g_tiny_free_trace atomic prune
  - Phase 30: Standard Procedure Establishment
    - Created 4-step standardized methodology (Step 0-3)
    - Step 0: Execution Verification (NEW - Phase 29 lesson)
    - Step 1: CORRECTNESS/TELEMETRY Classification (Phase 28 lesson)
    - Step 2: Compile-Out Implementation (Phase 24-27 pattern)
    - Step 3: A/B Test (build-level comparison)
    - Executed audit_atomics.sh: 412 atomics analyzed
    - Identified Phase 31 candidate: g_tiny_free_trace (HOT path, TOP PRIORITY)
  - Phase 31: g_tiny_free_trace Compile-Out (HOT Path TELEMETRY)
    - Target: core/hakmem_tiny_free.inc:326 (trace-rate-limit atomic)
    - Added HAKMEM_TINY_FREE_TRACE_COMPILED (default: 0)
    - Classification: Pure TELEMETRY (trace output only, no flow control)
    - A/B Result: NEUTRAL (baseline -0.35% mean, +0.19% median)
    - Verdict: NEUTRAL → Adopted for code cleanliness (Phase 26 precedent)
    - Rationale: HOT path TELEMETRY removal improves code quality
  - A/B Test Details:
    - Baseline (COMPILED=0): 53.638M ops/s mean, 53.799M median
    - Compiled-in (COMPILED=1): 53.828M ops/s mean, 53.697M median
    - Conflicting signals within ±0.5% noise margin
    - Phase 25 comparison: g_free_ss_enter (+1.07% GO) vs g_tiny_free_trace (NEUTRAL)
    - Hypothesis: Rate-limited atomic (128 calls) optimized by compiler
  - Cumulative Progress (Phase 24-31):
    - Phase 24 (class stats): +0.93% GO
    - Phase 25 (free stats): +1.07% GO
    - Phase 26 (diagnostics): -0.33% NEUTRAL
    - Phase 27 (unified cache): +0.74% GO
    - Phase 28 (bg spill): NO-OP (all CORRECTNESS)
    - Phase 29 (pool v2): NO-OP (ENV-gated)
    - Phase 30 (procedure): PROCEDURE
    - Phase 31 (free trace): -0.35% NEUTRAL
    - Total: 18 atomics removed, +2.74% net improvement
  - Documentation Created:
    - PHASE30_STANDARD_PROCEDURE.md: Complete 4-step methodology
    - ATOMIC_AUDIT_FULL.txt: 412 atomics comprehensive audit
    - PHASE31_CANDIDATES_HOT/WARM.txt: Priority-sorted candidates
    - PHASE31_RECOMMENDED_CANDIDATES.md: TOP 3 with Step 0 verification
    - PHASE31_TINY_FREE_TRACE_ATOMIC_PRUNE_RESULTS.md: Complete A/B results
    - ATOMIC_PRUNE_CUMULATIVE_SUMMARY.md: Updated (Phase 30-31)
    - CURRENT_TASK.md: Phase 32 candidate identified (g_hak_tiny_free_calls)
  - Key Lessons:
    - Lesson 7 (Phase 30): Step 0 execution verification prevents wasted effort
    - Lesson 8 (Phase 31): NEUTRAL + code cleanliness = valid adoption
    - HOT path ≠ guaranteed performance win (rate-limited atomics may be optimized)
  - Next Phase: Phase 32 candidate (g_hak_tiny_free_calls)
    - Location: core/hakmem_tiny_free.inc:335 (9 lines below Phase 31 target)
    - Expected: +0.3~0.7% or NEUTRAL
  - Generated with Claude Code (https://claude.com/claude-code); Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Phase 24-34: atomic prune, cumulative **+2.74%** (diminishing returns afterwards)
- Phase 35-A: `HAKMEM_BENCH_MINIMAL=1` (gate prune) **GO +4.39%**
- Phase 36: FAST-only policy snapshot optimization **GO +0.71%**
- Phase 37: Standard TLS cache **NO-GO** (the runtime gate tax outweighs the gain)
- Phase 38: FAST/OBSERVE/Standard operation established (scorecard + Makefile targets)
- Phase 39: FAST v3 gate constantization **GO +1.98%**
  - Detailed results: `docs/analysis/PHASE39_FAST_V3_GATE_CONSTANTIZATION_RESULTS.md`
## 5) Archive
- `CURRENT_TASK.md` (detailed log) is archived at `archive/CURRENT_TASK_ARCHIVE_20251216.md`