# Phase 4 Perf Profiling - Files Index

**Date**: 2025-12-14
**Status**: Complete

## Created Documents

### 1. Primary Analysis

**File**: `/mnt/workdisk/public_share/hakmem/docs/analysis/PHASE4_PERF_PROFILE_ANALYSIS.md`
**Size**: ~5000 words
**Contents**:
- Detailed perf report breakdown
- Candidate analysis (tiny_alloc_gate_fast, free_tiny_fast_cold, ENV gates)
- Shape optimization plateau analysis
- E1 implementation plan (ENV snapshot consolidation)
- Alternative targets (E2/E3/E4)

### 2. Executive Summary

**File**: `/mnt/workdisk/public_share/hakmem/docs/analysis/PHASE4_PERF_PROFILE_FINAL_REPORT.md`
**Size**: ~3000 words
**Contents**:
- Executive summary
- Top hotspots analysis
- Selected target (E1 ENV Snapshot Consolidation)
- Implementation roadmap
- Success criteria checklist

### 3. Files Index (This Document)

**File**: `/mnt/workdisk/public_share/hakmem/docs/analysis/PHASE4_PROFILING_FILES_INDEX.md`
**Contents**:
- List of all created/modified files
- Quick reference guide

## Modified Documents

### 1. CURRENT_TASK.md

**File**: `/mnt/workdisk/public_share/hakmem/CURRENT_TASK.md`
**Changes**:
- Added Phase 4 perf profiling summary (lines 3-39)
- Key findings: ENV gate overhead (3.26%), shape plateau analysis
- Next target: Phase 4 E1 - ENV Snapshot Consolidation

## Perf Data Artifacts

### 1. Raw Perf Data

**File**: `/mnt/workdisk/public_share/hakmem/perf.data`
**Format**: Binary (perf record output)
**Size**: 0.059 MB
**Samples**: 922 @ 999Hz
**Command**:
```bash
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE \
perf record -F 999 -- ./bench_random_mixed_hakmem 40000000 400 1
```

### 2. Perf Report (Full)

**File**: `/tmp/perf_report_full.txt`
**Format**: Text (perf report --stdio output)
**Contents**: Full symbol-sorted report with self% breakdown

### 3. Perf Summary

**File**: `/tmp/perf_summary.txt`
**Format**: Text (quick reference)
**Contents**: Top hotspots, selected target, perf command reference

## Key Findings

### ENV Gate Overhead (3.26% Combined)

1. `tiny_c7_ultra_enabled_env()`: 1.28%
2. `tiny_front_v3_enabled()`: 1.01%
3. `tiny_metadata_cache_enabled()`: 0.97%

**Root Cause**: 3 separate TLS reads + lazy init checks on every hot path call

### Shape Optimization Plateau

- B3 (Routing Shape): +2.89% (first pass)
- D3 (Alloc Gate Shape): +0.56% NEUTRAL (diminishing returns)
- **Lesson**: Branch prediction saturated, next frontier is caching/structural changes

### Selected Next Target

**Phase 4 E1**: ENV Snapshot Consolidation
- Expected gain: +3.0-3.5%
- Approach: Consolidate all ENV gates into single TLS snapshot struct
- Precedent: `tiny_front_v3_snapshot` (proven pattern)

## Quick Navigation

### Detailed Analysis

```bash
cat /mnt/workdisk/public_share/hakmem/docs/analysis/PHASE4_PERF_PROFILE_ANALYSIS.md
```

### Executive Summary

```bash
cat /mnt/workdisk/public_share/hakmem/docs/analysis/PHASE4_PERF_PROFILE_FINAL_REPORT.md
```

### Current Task Status

```bash
head -100 /mnt/workdisk/public_share/hakmem/CURRENT_TASK.md
```

### Perf Commands (Re-run)

```bash
# Profile
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE \
perf record -F 999 -- ./bench_random_mixed_hakmem 40000000 400 1

# Report (top 80)
perf report --stdio --no-children --sort=symbol | head -80

# Annotate specific function
perf annotate --stdio tiny_alloc_gate_fast.lto_priv.0 | head -100
```

## Next Steps

1. **Phase 4 E1 Implementation** (2-3 days):
   - Create `core/box/hakmem_env_snapshot_box.h/c`
   - Migrate priority ENV gates (C7 ultra, front_v3, metadata_cache)
   - Refactor ~14 call sites
   - A/B test (Mixed 10-run, target +2.5%)
   - Health check, promote to default if GO

2. **Phase 4 E2** (SECONDARY, defer until E1 complete):
   - Per-class alloc fast path specialization
   - Expected gain: +2-3%

3. **Phase 4 E3** (TERTIARY, extends E1):
   - Free path ENV gate consolidation
   - Expected gain: +0.4-0.6%

## References

- **Baseline**: 46.37M ops/s (MIXED_TINYV3_C7_SAFE, Phase 3 + D1)
- **Target**: 47.8M ops/s (+3.0% via E1)
- **Profile**: MIXED_TINYV3_C7_SAFE (20M iterations, ws=400)
- **Workload**: bench_random_mixed_hakmem (50% alloc / 50% free)

---

**Status**: COMPLETE - Ready for Phase 4 E1
**Date**: 2025-12-14
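
**Illustrative sketch (E1 approach)**: The E1 idea of consolidating the three ENV gates into a single TLS snapshot struct might look roughly like the C below. This is a minimal sketch, not the hakmem implementation: the type `env_snapshot_t`, the functions `env_flag`, `env_snapshot_init`, `env_snapshot_get`, and `example_route`, and the `EXAMPLE_*` environment variable names are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical consolidated snapshot: one TLS struct instead of three
 * separately cached gates. All names here are illustrative. */
typedef struct {
    bool ready;           /* true once the environment has been read */
    bool c7_ultra;        /* stands in for the C7 ultra gate */
    bool front_v3;        /* stands in for the front_v3 gate */
    bool metadata_cache;  /* stands in for the metadata_cache gate */
} env_snapshot_t;

/* Zero-initialized per thread, so `ready` starts out false. */
static _Thread_local env_snapshot_t g_env_snapshot;

/* Treat unset, empty, or "0" as disabled; anything else as enabled. */
static bool env_flag(const char *name) {
    const char *v = getenv(name);
    return v != NULL && v[0] != '\0' && strcmp(v, "0") != 0;
}

/* Cold path: read every gate once per thread. */
static void env_snapshot_init(env_snapshot_t *s) {
    s->c7_ultra       = env_flag("EXAMPLE_C7_ULTRA");       /* assumed name */
    s->front_v3       = env_flag("EXAMPLE_FRONT_V3");       /* assumed name */
    s->metadata_cache = env_flag("EXAMPLE_METADATA_CACHE"); /* assumed name */
    s->ready = true;
}

/* Hot path: one TLS access and one lazy-init check cover all gates. */
static inline const env_snapshot_t *env_snapshot_get(void) {
    env_snapshot_t *s = &g_env_snapshot;
    if (!s->ready) {
        env_snapshot_init(s);
    }
    return s;
}

/* Example call site: a single snapshot fetch serves several gate checks. */
static int example_route(void) {
    const env_snapshot_t *env = env_snapshot_get();
    assert(env->ready);
    if (env->c7_ultra) return 2;
    if (env->front_v3) return 1;
    return 0;
}
```

In this sketch a call site fetches the snapshot once and reads plain struct fields afterwards, so the per-gate TLS reads and lazy-init checks named as the root cause collapse into one of each. One consequence of the design is that environment changes made after a thread's first call are not observed by that thread.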