Phase 5 E5-3: Candidate Analysis (All DEFERRED) + E5-4 Instructions
E5-3 Analysis Results:
- free_tiny_fast_cold (7.14%): DEFER - cold path, low ROI
- unified_cache_push (3.39%): DEFER - already optimized
- hakmem_env_snapshot_enabled (2.97%): DEFER - low headroom
Key Insight: perf self% is time-weighted, not frequency-weighted.
Cold paths appear hot but have low total impact.
Next: E5-4 (Malloc Tiny Direct Path)
- Apply E5-1 winning pattern to malloc side
- Target: tiny_alloc_gate_fast() gate tax elimination
- ENV gate: HAKMEM_MALLOC_TINY_DIRECT=0/1
Files added:
- docs/analysis/PHASE5_E5_3_ANALYSIS_AND_RECOMMENDATIONS.md
- docs/analysis/PHASE5_E5_4_MALLOC_TINY_DIRECT_NEXT_INSTRUCTIONS.md
- core/box/free_cold_shape_env_box.{h,c} (research box, not tested)
- core/box/free_cold_shape_stats_box.{h,c} (research box, not tested)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
@ -1,5 +1,68 @@
|
|||||||
# 本線タスク(現在)
|
# 本線タスク(現在)
|
||||||
|
|
||||||
|
## 更新メモ(2025-12-14 Phase 5 E5-3 Analysis - Strategic Pivot)
|
||||||
|
|
||||||
|
### Phase 5 E5-3: Candidate Analysis & Strategic Recommendations ⚠️ DEFER (2025-12-14)
|
||||||
|
|
||||||
|
**Decision**: **DEFER all E5-3 candidates** (E5-3a/b/c). Pivot to E5-4 (Malloc Direct Path, E5-1 pattern replication).
|
||||||
|
|
||||||
|
**Analysis**:
|
||||||
|
- **E5-3a (free_tiny_fast_cold 7.14%)**: NO-GO (cold path, low frequency despite high self%)
|
||||||
|
- **E5-3b (unified_cache_push 3.39%)**: MAYBE (already optimized, marginal ROI ~+1.0%)
|
||||||
|
- **E5-3c (hakmem_env_snapshot_enabled 2.97%)**: NO-GO (E3-4 precedent shows -1.44% regression)
|
||||||
|
|
||||||
|
**Key Insight**: **Profiler self% ≠ optimization opportunity**
|
||||||
|
- Self% is time-weighted (samples during execution), not frequency-weighted
|
||||||
|
- Cold paths appear hot due to expensive operations when hit, not total cost
|
||||||
|
- E5-2 lesson: 3.35% self% → +0.45% NEUTRAL (branch overhead ≈ savings)
|
||||||
|
|
||||||
|
**ROI Assessment**:
|
||||||
|
| Candidate | Self% | Frequency | Expected Gain | Risk | Decision |
|
||||||
|
|-----------|-------|-----------|---------------|------|----------|
|
||||||
|
| E5-3a (cold path) | 7.14% | LOW | +0.5% | HIGH | NO-GO |
|
||||||
|
| E5-3b (push) | 3.39% | HIGH | +1.0% | MEDIUM | DEFER |
|
||||||
|
| E5-3c (env snapshot) | 2.97% | HIGH | -1.0% | HIGH | NO-GO |
|
||||||
|
|
||||||
|
**Strategic Pivot**: Focus on **E5-1 Success Pattern** (wrapper-level deduplication)
|
||||||
|
- E5-1 (Free Tiny Direct): +3.35% (GO) ✅
|
||||||
|
- **Next**: E5-4 (Malloc Tiny Direct) - Apply E5-1 pattern to alloc side
|
||||||
|
- **Expected**: +2-4% (similar to E5-1, based on malloc wrapper overhead)
|
||||||
|
|
||||||
|
**Cumulative Status (Phase 5)**:
|
||||||
|
- E4-1 (Free Wrapper Snapshot): +3.51% standalone
|
||||||
|
- E4-2 (Malloc Wrapper Snapshot): +21.83% standalone
|
||||||
|
- E4 Combined: +6.43% (from baseline with both OFF)
|
||||||
|
- E5-1 (Free Tiny Direct): +3.35% (from E4 baseline)
|
||||||
|
- E5-2 (Header Write-Once): +0.45% NEUTRAL (frozen)
|
||||||
|
- **E5-3**: **DEFER** (analysis complete, no implementation/test)
|
||||||
|
- **Total Phase 5**: ~+9-10% cumulative (E4+E5-1 promoted, E5-2 frozen, E5-3 deferred)
|
||||||
|
|
||||||
|
**Implementation** (E5-3a research box, NOT TESTED):
|
||||||
|
- Files created:
|
||||||
|
- `core/box/free_cold_shape_env_box.{h,c}` (ENV gate, default OFF)
|
||||||
|
- `core/box/free_cold_shape_stats_box.{h,c}` (stats counters)
|
||||||
|
- `docs/analysis/PHASE5_E5_3_ANALYSIS_AND_RECOMMENDATIONS.md` (analysis)
|
||||||
|
- Files modified:
|
||||||
|
- `core/front/malloc_tiny_fast.h` (lines 418-437, cold path shape optimization)
|
||||||
|
- Pattern: Early exit for LEGACY path (skip LARSON check when !use_tiny_heap)
|
||||||
|
- **Status**: FROZEN (default OFF, pre-analysis shows NO-GO, not worth A/B testing)
|
||||||
|
|
||||||
|
**Key Lessons**:
|
||||||
|
1. **Profiler self% misleads** when frequency is low (cold path)
|
||||||
|
2. **Micro-optimizations plateau** in already-optimized code (E5-2, E5-3b)
|
||||||
|
3. **Branch hints are profile-dependent** (E3-4 failure, E5-3c risk)
|
||||||
|
4. **Wrapper-level deduplication wins** (E4-1, E4-2, E5-1 pattern)
|
||||||
|
|
||||||
|
**Next Steps**:
|
||||||
|
- **E5-4 Design**: Malloc Tiny Direct Path (E5-1 pattern for alloc)
|
||||||
|
- Target: malloc() wrapper overhead (~12.95% self% in E4 profile)
|
||||||
|
- Method: Single size check → direct call to malloc_tiny_fast_for_class()
|
||||||
|
- Expected: +2-4% (based on E5-1 precedent +3.35%)
|
||||||
|
- Design doc: `docs/analysis/PHASE5_E5_4_MALLOC_TINY_DIRECT_DESIGN.md`
|
||||||
|
- Next instructions: `docs/analysis/PHASE5_E5_4_MALLOC_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## 更新メモ(2025-12-14 Phase 5 E5-2 Complete - Header Write-Once)
|
## 更新メモ(2025-12-14 Phase 5 E5-2 Complete - Header Write-Once)
|
||||||
|
|
||||||
### Phase 5 E5-2: Header Write-Once Optimization ⚪ NEUTRAL (2025-12-14)
|
### Phase 5 E5-2: Header Write-Once Optimization ⚪ NEUTRAL (2025-12-14)
|
||||||
@ -120,12 +183,15 @@
|
|||||||
|
|
||||||
**Next Steps**:
|
**Next Steps**:
|
||||||
- ✅ Promote: `HAKMEM_FREE_TINY_DIRECT=1` to `MIXED_TINYV3_C7_SAFE` preset
|
- ✅ Promote: `HAKMEM_FREE_TINY_DIRECT=1` to `MIXED_TINYV3_C7_SAFE` preset
|
||||||
- Next: E5-2 (Header Prefill at Refill, 2.59% target) or E5-3 (ENV Snapshot Shape, 2.57% target)
|
- ✅ E5-2: NEUTRAL → FREEZE
|
||||||
|
- ✅ E5-3: DEFER(ROI 低)
|
||||||
|
- Next: **E5-4 (Malloc Tiny Direct)**(E5-1 パターンの alloc 側複製)
|
||||||
- Design docs:
|
- Design docs:
|
||||||
- `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_1_DESIGN.md`
|
- `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_1_DESIGN.md`
|
||||||
- `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_1_AB_TEST_RESULTS.md`
|
- `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_1_AB_TEST_RESULTS.md`
|
||||||
- `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
|
- `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
|
||||||
- `docs/analysis/PHASE5_E5_COMPREHENSIVE_ANALYSIS.md`
|
- `docs/analysis/PHASE5_E5_COMPREHENSIVE_ANALYSIS.md`
|
||||||
|
- `docs/analysis/PHASE5_E5_4_MALLOC_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|||||||
5
core/box/free_cold_shape_env_box.c
Normal file
5
core/box/free_cold_shape_env_box.c
Normal file
@ -0,0 +1,5 @@
|
|||||||
|
// free_cold_shape_env_box.c - Phase 5 E5-3a: Free Cold Path Shape Optimization
|
||||||
|
#include "free_cold_shape_env_box.h"
|
||||||
|
|
||||||
|
// Global gate state (-1: uninitialized, 0: OFF, 1: ON)
|
||||||
|
int g_free_cold_shape = -1;
|
||||||
57
core/box/free_cold_shape_env_box.h
Normal file
57
core/box/free_cold_shape_env_box.h
Normal file
@ -0,0 +1,57 @@
|
|||||||
|
// free_cold_shape_env_box.h - Phase 5 E5-3a: Free Cold Path Shape Optimization
|
||||||
|
//
|
||||||
|
// Purpose: Optimize free_tiny_fast_cold() branch structure for better prediction
|
||||||
|
// Target: free_tiny_fast_cold (7.14% self% in Mixed workload)
|
||||||
|
//
|
||||||
|
// Hypothesis:
|
||||||
|
// - Cold path has heavy branching overhead (route determination, LARSON check, ENV gates)
|
||||||
|
// - MIXED workload: LARSON=0 and use_tiny_heap=0 are COMMON (not rare)
|
||||||
|
// - Current branch hints assume LARSON/TinyHeap are rare, but profile shows otherwise
|
||||||
|
// - Reordering branches + fixing hints can reduce mispredictions
|
||||||
|
//
|
||||||
|
// Strategy:
|
||||||
|
// - Shape 1 (Optimized): Reorder branches to handle common LEGACY path first
|
||||||
|
// - Check use_tiny_heap==0 FIRST (LIKELY in Mixed, ~90%+ of cold path)
|
||||||
|
// - Short-circuit to LEGACY fallback when heap routing not needed
|
||||||
|
// - Defer LARSON/cross-thread checks to only when needed (heap routes)
|
||||||
|
// - Keep LARSON safety when needed (heap routes still do cross-thread check)
|
||||||
|
//
|
||||||
|
// Design:
|
||||||
|
// - ENV: HAKMEM_FREE_COLD_SHAPE=0/1 (default: 0, research box)
|
||||||
|
// - Shape 0 (baseline): Current structure (LARSON+heap check, then legacy)
|
||||||
|
// - Shape 1 (optimized): use_tiny_heap==0 early exit, LARSON only for heap
|
||||||
|
//
|
||||||
|
// Expected Benefit:
|
||||||
|
// - Reduce branch mispredictions in cold path (~7.14% self%)
|
||||||
|
// - Target gain: +1-3% (if branch prediction is bottleneck)
|
||||||
|
// - Conservative estimate: +0.5-1.5% (cold path is 7.14%, not dominant)
|
||||||
|
//
|
||||||
|
// Box Theory Compliance:
|
||||||
|
// - L0: ENV gate (default 0)
|
||||||
|
// - L1: Single boundary (free_tiny_fast_cold function)
|
||||||
|
// - Rollback: ENV=0 reverts to baseline
|
||||||
|
// - A/B testable: Same binary, ENV toggle
|
||||||
|
|
||||||
|
#ifndef HAK_FREE_COLD_SHAPE_ENV_BOX_H
|
||||||
|
#define HAK_FREE_COLD_SHAPE_ENV_BOX_H
|
||||||
|
|
||||||
|
#include <stdlib.h>
|
||||||
|
|
||||||
|
// Global gate state (defined in free_cold_shape_env_box.c)
|
||||||
|
extern int g_free_cold_shape;
|
||||||
|
|
||||||
|
// ENV gate: Check if optimized cold path shape is enabled
|
||||||
|
// Default: 0 (baseline), set HAKMEM_FREE_COLD_SHAPE=1 for optimized shape
|
||||||
|
static inline int free_cold_shape_enabled(void) {
|
||||||
|
if (__builtin_expect(g_free_cold_shape == -1, 0)) {
|
||||||
|
const char* e = getenv("HAKMEM_FREE_COLD_SHAPE");
|
||||||
|
if (e && *e) {
|
||||||
|
g_free_cold_shape = (*e == '1') ? 1 : 0;
|
||||||
|
} else {
|
||||||
|
g_free_cold_shape = 0; // default: OFF (research box)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return g_free_cold_shape;
|
||||||
|
}
|
||||||
|
|
||||||
|
#endif // HAK_FREE_COLD_SHAPE_ENV_BOX_H
|
||||||
29
core/box/free_cold_shape_stats_box.c
Normal file
29
core/box/free_cold_shape_stats_box.c
Normal file
@ -0,0 +1,29 @@
|
|||||||
|
// free_cold_shape_stats_box.c - Phase 5 E5-3a: Free Cold Shape Stats
|
||||||
|
#include "free_cold_shape_stats_box.h"
|
||||||
|
|
||||||
|
// Stats counters (global atomics)
|
||||||
|
_Atomic uint64_t g_free_cold_shape_legacy_fast = 0;
|
||||||
|
_Atomic uint64_t g_free_cold_shape_heap_path = 0;
|
||||||
|
_Atomic uint64_t g_free_cold_shape_enabled_count = 0;
|
||||||
|
|
||||||
|
void free_cold_shape_print_stats(void) {
|
||||||
|
#if !HAKMEM_BUILD_RELEASE
|
||||||
|
uint64_t legacy = atomic_load(&g_free_cold_shape_legacy_fast);
|
||||||
|
uint64_t heap = atomic_load(&g_free_cold_shape_heap_path);
|
||||||
|
uint64_t enabled = atomic_load(&g_free_cold_shape_enabled_count);
|
||||||
|
uint64_t total = legacy + heap;
|
||||||
|
|
||||||
|
if (total == 0) return; // No activity
|
||||||
|
|
||||||
|
fprintf(stderr, "\n[FREE-COLD-SHAPE] Stats:\n");
|
||||||
|
fprintf(stderr, " Shape enabled: %llu\n", (unsigned long long)enabled);
|
||||||
|
fprintf(stderr, " LEGACY fast path: %llu (%.1f%%)\n",
|
||||||
|
(unsigned long long)legacy,
|
||||||
|
100.0 * legacy / total);
|
||||||
|
fprintf(stderr, " Heap route path: %llu (%.1f%%)\n",
|
||||||
|
(unsigned long long)heap,
|
||||||
|
100.0 * heap / total);
|
||||||
|
fprintf(stderr, " Total cold hits: %llu\n", (unsigned long long)total);
|
||||||
|
fflush(stderr);
|
||||||
|
#endif
|
||||||
|
}
|
||||||
34
core/box/free_cold_shape_stats_box.h
Normal file
34
core/box/free_cold_shape_stats_box.h
Normal file
@ -0,0 +1,34 @@
|
|||||||
|
// free_cold_shape_stats_box.h - Phase 5 E5-3a: Free Cold Shape Stats
|
||||||
|
//
|
||||||
|
// Purpose: Track cold path branch distributions
|
||||||
|
// Metrics: legacy_fast_path, heap_path, shape_enabled
|
||||||
|
|
||||||
|
#ifndef HAK_FREE_COLD_SHAPE_STATS_BOX_H
|
||||||
|
#define HAK_FREE_COLD_SHAPE_STATS_BOX_H
|
||||||
|
|
||||||
|
#include <stdint.h>
|
||||||
|
#include <stdatomic.h>
|
||||||
|
#include <stdio.h>
|
||||||
|
|
||||||
|
// Forward declarations for HAKMEM_DEBUG_COUNTERS
|
||||||
|
#ifndef HAKMEM_DEBUG_COUNTERS
|
||||||
|
#define HAKMEM_DEBUG_COUNTERS 0
|
||||||
|
#endif
|
||||||
|
|
||||||
|
// Stats counters (global atomics, always compiled)
|
||||||
|
extern _Atomic uint64_t g_free_cold_shape_legacy_fast; // Optimized: LEGACY path (no heap)
|
||||||
|
extern _Atomic uint64_t g_free_cold_shape_heap_path; // Heap route path
|
||||||
|
extern _Atomic uint64_t g_free_cold_shape_enabled_count; // Shape=1 hits
|
||||||
|
|
||||||
|
// Increment macros (compile-out in release builds)
|
||||||
|
#if HAKMEM_DEBUG_COUNTERS
|
||||||
|
#define FREE_COLD_SHAPE_STAT_INC(name) \
|
||||||
|
atomic_fetch_add_explicit(&g_free_cold_shape_##name, 1, memory_order_relaxed)
|
||||||
|
#else
|
||||||
|
#define FREE_COLD_SHAPE_STAT_INC(name) ((void)0)
|
||||||
|
#endif
|
||||||
|
|
||||||
|
// Print stats (implemented in free_cold_shape_stats_box.c)
|
||||||
|
void free_cold_shape_print_stats(void);
|
||||||
|
|
||||||
|
#endif // HAK_FREE_COLD_SHAPE_STATS_BOX_H
|
||||||
@ -70,6 +70,8 @@
|
|||||||
#include "../box/tiny_metadata_cache_hot_box.h" // Phase 3 C2: Policy hot cache (metadata cache optimization)
|
#include "../box/tiny_metadata_cache_hot_box.h" // Phase 3 C2: Policy hot cache (metadata cache optimization)
|
||||||
#include "../box/tiny_free_route_cache_env_box.h" // Phase 3 D1: Free path route cache
|
#include "../box/tiny_free_route_cache_env_box.h" // Phase 3 D1: Free path route cache
|
||||||
#include "../box/hakmem_env_snapshot_box.h" // Phase 4 E1: ENV snapshot consolidation
|
#include "../box/hakmem_env_snapshot_box.h" // Phase 4 E1: ENV snapshot consolidation
|
||||||
|
#include "../box/free_cold_shape_env_box.h" // Phase 5 E5-3a: Free cold path shape optimization
|
||||||
|
#include "../box/free_cold_shape_stats_box.h" // Phase 5 E5-3a: Free cold shape stats
|
||||||
|
|
||||||
// Helper: current thread id (low 32 bits) for owner check
|
// Helper: current thread id (low 32 bits) for owner check
|
||||||
#ifndef TINY_SELF_U32_LOCAL_DEFINED
|
#ifndef TINY_SELF_U32_LOCAL_DEFINED
|
||||||
@ -413,6 +415,28 @@ static int free_tiny_fast_cold(void* ptr, void* base, int class_idx)
|
|||||||
}
|
}
|
||||||
#endif // !HAKMEM_BUILD_RELEASE
|
#endif // !HAKMEM_BUILD_RELEASE
|
||||||
|
|
||||||
|
// Phase 5 E5-3a: Optimized cold path shape
|
||||||
|
// Strategy: Handle common LEGACY path first (use_tiny_heap==0 in Mixed ~90%+)
|
||||||
|
// Defer expensive LARSON/cross-thread checks to only when heap routing needed
|
||||||
|
static __thread int g_cold_shape = -1;
|
||||||
|
if (__builtin_expect(g_cold_shape == -1, 0)) {
|
||||||
|
g_cold_shape = free_cold_shape_enabled() ? 1 : 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (g_cold_shape == 1) {
|
||||||
|
// Optimized shape: Check use_tiny_heap FIRST
|
||||||
|
if (__builtin_expect(!use_tiny_heap, 1)) {
|
||||||
|
// Most common case in Mixed: LEGACY path, no heap routing
|
||||||
|
// Skip LARSON/cross-thread check entirely (not needed for legacy)
|
||||||
|
FREE_COLD_SHAPE_STAT_INC(legacy_fast);
|
||||||
|
FREE_COLD_SHAPE_STAT_INC(enabled_count);
|
||||||
|
goto legacy_fallback;
|
||||||
|
}
|
||||||
|
// Rare: heap routing needed, do full validation
|
||||||
|
FREE_COLD_SHAPE_STAT_INC(heap_path);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Baseline shape: LARSON check first (current behavior)
|
||||||
// Cross-thread free detection (Larson MT crash fix, ENV gated) + TinyHeap free path
|
// Cross-thread free detection (Larson MT crash fix, ENV gated) + TinyHeap free path
|
||||||
{
|
{
|
||||||
static __thread int g_larson_fix = -1;
|
static __thread int g_larson_fix = -1;
|
||||||
@ -467,7 +491,7 @@ static int free_tiny_fast_cold(void* ptr, void* base, int class_idx)
|
|||||||
}
|
}
|
||||||
return 0; // remote push failed; fall back to normal path
|
return 0; // remote push failed; fall back to normal path
|
||||||
}
|
}
|
||||||
// Same-thread + TinyHeap route → route-based free
|
// Same-thread + TinyHeap route → route-based free
|
||||||
if (__builtin_expect(use_tiny_heap, 0)) {
|
if (__builtin_expect(use_tiny_heap, 0)) {
|
||||||
FREE_TINY_FAST_HOTCOLD_STAT_INC(cold_tinyheap);
|
FREE_TINY_FAST_HOTCOLD_STAT_INC(cold_tinyheap);
|
||||||
switch (route) {
|
switch (route) {
|
||||||
@ -541,6 +565,7 @@ static int free_tiny_fast_cold(void* ptr, void* base, int class_idx)
|
|||||||
#endif
|
#endif
|
||||||
|
|
||||||
// Phase REFACTOR-2: Legacy fallback (use unified helper)
|
// Phase REFACTOR-2: Legacy fallback (use unified helper)
|
||||||
|
legacy_fallback:
|
||||||
FREE_TINY_FAST_HOTCOLD_STAT_INC(cold_legacy_fallback);
|
FREE_TINY_FAST_HOTCOLD_STAT_INC(cold_legacy_fallback);
|
||||||
tiny_legacy_fallback_free_base(base, class_idx);
|
tiny_legacy_fallback_free_base(base, class_idx);
|
||||||
return 1;
|
return 1;
|
||||||
|
|||||||
@ -72,7 +72,7 @@ perf report --stdio --no-children
|
|||||||
```
|
```
|
||||||
|
|
||||||
判断基準(self% ≥ 5%):
|
判断基準(self% ≥ 5%):
|
||||||
- `tiny_region_id_write_header` が依然 5% 以上 → **E5-2** 優先
|
- `tiny_region_id_write_header` が依然 5% 以上 → **E5-2** は NEUTRAL で freeze 済み(次は E5-4 を優先)
|
||||||
- `hakmem_env_snapshot_enabled` / `tiny_get_max_size` が 5% 付近まで上がる → **E5-3** 優先
|
- `hakmem_env_snapshot_enabled` / `tiny_get_max_size` が 5% 付近まで上がる → **E5-3** 優先
|
||||||
|
|
||||||
---
|
---
|
||||||
@ -83,4 +83,3 @@ perf report --stdio --no-children
|
|||||||
- 目標: `tiny_region_id_write_header` の hot path stores を減らす(A3 の “always_inline” は NO-GO 済み)
|
- 目標: `tiny_region_id_write_header` の hot path stores を減らす(A3 の “always_inline” は NO-GO 済み)
|
||||||
- E5-3: `hakmem_env_snapshot_enabled()` の分岐形/配置を “enabled 前提” に寄せる
|
- E5-3: `hakmem_env_snapshot_enabled()` の分岐形/配置を “enabled 前提” に寄せる
|
||||||
- 目標: mispredict を避け、`malloc_tiny_fast.h` 内の繰り返し gate を軽くする
|
- 目標: mispredict を避け、`malloc_tiny_fast.h` 内の繰り返し gate を軽くする
|
||||||
|
|
||||||
|
|||||||
231
docs/analysis/PHASE5_E5_3_ANALYSIS_AND_RECOMMENDATIONS.md
Normal file
231
docs/analysis/PHASE5_E5_3_ANALYSIS_AND_RECOMMENDATIONS.md
Normal file
@ -0,0 +1,231 @@
|
|||||||
|
# Phase 5 E5-3: Candidate Analysis and Strategic Recommendations
|
||||||
|
|
||||||
|
## Executive Summary
|
||||||
|
|
||||||
|
**Recommendation**: **DEFER E5-3 optimization**. Continue with established winning patterns (E5-1 style wrapper-level optimizations) rather than pursuing diminishing-returns micro-optimizations in profiler hot spots.
|
||||||
|
|
||||||
|
**Rationale**:
|
||||||
|
- E5-2 (Header Write-Once, 3.35% self%) achieved only +0.45% NEUTRAL
|
||||||
|
- E5-3 candidates (7.14%, 3.39%, 2.97% self%) have similar or worse ROI profiles
|
||||||
|
- Profiler self% != optimization opportunity (time-weighted samples can mislead)
|
||||||
|
- Cumulative gains from E4+E5-1 (~+9-10%) represent significant progress
|
||||||
|
- Next phase should target higher-level structural opportunities
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## E5-3 Candidate Analysis
|
||||||
|
|
||||||
|
### Context: Post-E5-2 Baseline
|
||||||
|
- **E5-1 (Free Tiny Direct)**: +3.35% GO (adopted)
|
||||||
|
- **E5-2 (Header Write-Once)**: +0.45% NEUTRAL (frozen as research box)
|
||||||
|
- **New baseline**: 44.42M ops/s (Mixed, 20M iters, ws=400)
|
||||||
|
|
||||||
|
### Available Candidates (from perf profile)
|
||||||
|
|
||||||
|
| Candidate | Self% | Call Frequency | ROI Assessment |
|
||||||
|
|-----------|-------|----------------|----------------|
|
||||||
|
| free_tiny_fast_cold | 7.14% | LOW (cold path) | **NO-GO** |
|
||||||
|
| unified_cache_push | 3.39% | HIGH (every free) | **MAYBE** |
|
||||||
|
| hakmem_env_snapshot_enabled | 2.97% | HIGH (wrapper+gate) | **NO-GO** |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Detailed Analysis
|
||||||
|
|
||||||
|
### E5-3a: free_tiny_fast_cold (7.14% self%) ❌ **NO-GO**
|
||||||
|
|
||||||
|
**Hypothesis**: Cold path branch structure optimization (route determination, LARSON check)
|
||||||
|
|
||||||
|
**Why NO-GO**:
|
||||||
|
1. **Self% Misleading**: 7.14% is time-weighted, not frequency
|
||||||
|
- Cold path is called RARELY (only when hot path misses)
|
||||||
|
- High self% = expensive when hit, not = high total cost
|
||||||
|
- Optimizing cold path has minimal impact on overall throughput
|
||||||
|
|
||||||
|
2. **Branch Prediction Already Optimized**:
|
||||||
|
- Current implementation uses `__builtin_expect` hints
|
||||||
|
- LARSON/heap checks are already marked UNLIKELY
|
||||||
|
- Further branch reordering has marginal benefit (~0.1-0.5% at best)
|
||||||
|
|
||||||
|
3. **Similar to E5-2 Failure**:
|
||||||
|
- E5-2 targeted 3.35% self%, gained only +0.45%
|
||||||
|
- E5-3a targets 7.14% self% BUT lower frequency
|
||||||
|
- Expected gain: +0.3-1.0% (< +1.0% GO threshold)
|
||||||
|
|
||||||
|
4. **Structural Issues**:
|
||||||
|
- Goto-based early exit adds control flow complexity
|
||||||
|
- Potential I-cache pollution (similar to Phase 1 A3 failure)
|
||||||
|
- Safety risks (LARSON check bypass in optimized path)
|
||||||
|
|
||||||
|
**Conservative Estimate**: +0.5% ± 0.5% (NEUTRAL range)
|
||||||
|
|
||||||
|
**Decision**: **NO-GO / DEFER**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### E5-3b: unified_cache_push (3.39% self%) ⚠️ **MAYBE**
|
||||||
|
|
||||||
|
**Hypothesis**: Push operation overhead (TLS access, modulo arithmetic, bounds check)
|
||||||
|
|
||||||
|
**Why MAYBE**:
|
||||||
|
1. **Frequency**: Called on EVERY free (high frequency)
|
||||||
|
2. **Current Implementation**: Already highly optimized
|
||||||
|
- Ring buffer with power-of-2 masking (no division)
|
||||||
|
- Single TLS access (g_unified_cache[class_idx])
|
||||||
|
- Minimal branch count (1-2 branches)
|
||||||
|
|
||||||
|
3. **Potential Optimizations**:
|
||||||
|
- **Inline Expansion**: Force always_inline (may hurt I-cache)
|
||||||
|
- **TLS Caching**: Cache g_unified_cache base pointer (adds TLS variable)
|
||||||
|
- **Bounds Check Removal**: Assume capacity never changes (unsafe)
|
||||||
|
|
||||||
|
4. **Risk Assessment**:
|
||||||
|
- **High risk**: unified_cache_push is already in critical path
|
||||||
|
- **Low ROI**: 3.39% self% with limited optimization headroom
|
||||||
|
- **Similar to E5-2**: Micro-optimization with marginal benefit
|
||||||
|
|
||||||
|
**Conservative Estimate**: +0.5-1.5% (borderline NEUTRAL/GO)
|
||||||
|
|
||||||
|
**Decision**: **DEFER** (pursue only if E5-1 pattern exhausted)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### E5-3c: hakmem_env_snapshot_enabled (2.97% self%) ❌ **NO-GO**
|
||||||
|
|
||||||
|
**Hypothesis**: Branch hint optimization (enabled=1 is常用 in MIXED)
|
||||||
|
|
||||||
|
**Why NO-GO**:
|
||||||
|
1. **E3-4 Precedent**: Phase 4 E3-4 (ENV Constructor Init) **FAILED**
|
||||||
|
- Attempted to eliminate lazy check overhead (3.22% self%)
|
||||||
|
- Result: -1.44% regression (constructor mode added overhead)
|
||||||
|
- Root cause: Branch predictor tuning is profile-dependent
|
||||||
|
|
||||||
|
2. **Branch Hint Contradiction**:
|
||||||
|
- Default builds: enabled=0 → hint UNLIKELY is correct
|
||||||
|
- MIXED preset: enabled=1 → hint UNLIKELY is WRONG
|
||||||
|
- Changing hint helps MIXED but hurts default builds
|
||||||
|
|
||||||
|
3. **Optimization Space**: Already consolidated in E4-1 (E1)
|
||||||
|
- ENV snapshot reduced 3 TLS reads → 1 TLS read
|
||||||
|
- Remaining overhead is unavoidable (lazy init check)
|
||||||
|
- Further optimization requires constructor init (E3-4 showed this fails)
|
||||||
|
|
||||||
|
**Conservative Estimate**: -1.0% to +0.5% (high regression risk)
|
||||||
|
|
||||||
|
**Decision**: **NO-GO** (proven failure in E3-4)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Strategic Recommendations
|
||||||
|
|
||||||
|
### Priority 1: Exploit E5-1 Success Pattern ✅
|
||||||
|
|
||||||
|
**E5-1 Strategy (Free Tiny Direct)**:
|
||||||
|
- **Target**: Wrapper-level overhead (deduplication)
|
||||||
|
- **Method**: Single header check → direct call to free_tiny_fast()
|
||||||
|
- **Result**: +3.35% (GO)
|
||||||
|
|
||||||
|
**Replicable Patterns**:
|
||||||
|
1. **Malloc Tiny Direct**: Apply E5-1 pattern to malloc() side
|
||||||
|
- Single size check → direct call to malloc_tiny_fast_for_class()
|
||||||
|
- Eliminate: Size validation redundancy, ENV snapshot overhead
|
||||||
|
- Expected: +2-4% (similar to E5-1)
|
||||||
|
|
||||||
|
2. **Alloc Gate Specialization**: Per-class fast paths
|
||||||
|
- C0-C3: Direct to LEGACY (skip policy snapshot)
|
||||||
|
- C4-C7: Route-specific fast paths
|
||||||
|
- Expected: +1-3%
|
||||||
|
|
||||||
|
### Priority 2: Profile New Baseline
|
||||||
|
|
||||||
|
After E4+E5-1 adoption (~+9-10% cumulative):
|
||||||
|
1. **Re-profile Mixed workload** (new bottlenecks may emerge)
|
||||||
|
2. **Identify high-frequency, high-overhead** targets
|
||||||
|
3. **Focus on deduplication/consolidation** (proven pattern)
|
||||||
|
|
||||||
|
### Priority 3: Avoid Diminishing Returns
|
||||||
|
|
||||||
|
**Red Flags** (E5-2, E5-3 lessons):
|
||||||
|
- **Self% > 3%** but **low frequency** → misleading
|
||||||
|
- **Micro-optimizations** in already-optimized code → marginal ROI
|
||||||
|
- **Branch hint tuning** → profile-dependent, high regression risk
|
||||||
|
- **Cold path optimization** → time-weighted ≠ frequency-weighted
|
||||||
|
|
||||||
|
**Green Flags** (E4-1, E4-2, E5-1 successes):
|
||||||
|
- **Wrapper-level deduplication** → +3-6% per optimization
|
||||||
|
- **TLS consolidation** → +2-4% per consolidation
|
||||||
|
- **Direct path creation** → +2-4% per path
|
||||||
|
- **Structural changes** (not micro-tuning) → higher ROI
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Lessons from Phase 5
|
||||||
|
|
||||||
|
### Wins (E4-1, E4-2, E5-1)
|
||||||
|
1. **ENV Snapshot Consolidation** (E4-1): +3.51%
|
||||||
|
- 3 TLS reads → 1 TLS read
|
||||||
|
- Deduplication > micro-optimization
|
||||||
|
|
||||||
|
2. **Malloc Wrapper Snapshot** (E4-2): +21.83% standalone (+6.43% combined)
|
||||||
|
- Function call elimination (tiny_get_max_size)
|
||||||
|
- Pre-caching + TLS consolidation
|
||||||
|
|
||||||
|
3. **Free Tiny Direct** (E5-1): +3.35%
|
||||||
|
- Single header check → direct call
|
||||||
|
- Wrapper-level deduplication
|
||||||
|
|
||||||
|
**Common Pattern**: **Eliminate redundancy at architectural boundaries** (wrapper, gate, snapshot)
|
||||||
|
|
||||||
|
### Losses / Neutrals (E3-4, E5-2)
|
||||||
|
1. **ENV Constructor Init** (E3-4): -1.44%
|
||||||
|
- Constructor mode added overhead
|
||||||
|
- Branch prediction is profile-dependent
|
||||||
|
|
||||||
|
2. **Header Write-Once** (E5-2): +0.45% NEUTRAL
|
||||||
|
- Assumption incorrect (headers NOT redundant)
|
||||||
|
- Branch overhead ≈ savings
|
||||||
|
|
||||||
|
**Common Pattern**: **Micro-optimizations in hot functions** have limited ROI when code is already optimized
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Conclusion
|
||||||
|
|
||||||
|
**E5-3 Recommendation**: **DEFER all three candidates**
|
||||||
|
|
||||||
|
**Rationale**:
|
||||||
|
1. **E5-3a (cold path)**: Low frequency, high risk, estimated +0.5% NEUTRAL
|
||||||
|
2. **E5-3b (push)**: Already optimized, marginal ROI, estimated +1.0% borderline
|
||||||
|
3. **E5-3c (env snapshot)**: Proven failure (E3-4), estimated -1.0% NO-GO
|
||||||
|
|
||||||
|
**Next Steps**:
|
||||||
|
1. ✅ **Promote E5-1** to `MIXED_TINYV3_C7_SAFE` preset (if not already done)
|
||||||
|
2. ✅ **Profile new baseline** (E4+E5-1 ON) to find next high-ROI targets
|
||||||
|
3. ✅ **Design E5-4**: Malloc Tiny Direct (E5-1 pattern applied to alloc side)
|
||||||
|
- Expected: +2-4% based on E5-1 precedent
|
||||||
|
- Lower risk than E5-3 candidates
|
||||||
|
4. ✅ **Update roadmap**: Focus on wrapper-level optimizations, avoid diminishing returns
|
||||||
|
|
||||||
|
**Key Insight**: **Profiler self% is necessary but not sufficient** for optimization prioritization. Frequency, redundancy, and architectural seams matter more than raw self%.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Appendix: Implementation Notes (E5-3a - Not Executed)
|
||||||
|
|
||||||
|
**Files Created** (research box, not tested):
|
||||||
|
- `core/box/free_cold_shape_env_box.{h,c}` (ENV gate)
|
||||||
|
- `core/box/free_cold_shape_stats_box.{h,c}` (stats counters)
|
||||||
|
|
||||||
|
**Integration Point**:
|
||||||
|
- `core/front/malloc_tiny_fast.h` (lines 418-437, free_tiny_fast_cold)
|
||||||
|
|
||||||
|
**Decision**: **FROZEN** (default OFF, do not pursue A/B testing)
|
||||||
|
|
||||||
|
**Rationale**: Pre-analysis shows NO-GO (low frequency, high risk, marginal ROI < +1.0%)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
**Date**: 2025-12-14
|
||||||
|
**Phase**: 5 E5-3
|
||||||
|
**Status**: Analysis Complete → **DEFER E5-3**, Proceed to E5-4 (Malloc Direct Path)
|
||||||
|
**Cumulative**: E4+E5-1 = ~+9-10% (baseline: 44.42M ops/s Mixed)
|
||||||
@ -0,0 +1,122 @@
|
|||||||
|
# Phase 5 E5-4: Malloc Tiny Direct Path(次の指示書)
|
||||||
|
|
||||||
|
## Status(2025-12-14 / E5-2 FREEZE 後)
|
||||||
|
|
||||||
|
- E5-1(Free Tiny Direct)は ✅ GO(+3.35%)
|
||||||
|
- E5-2(Header refill write-once)は ⚪ NEUTRAL → FREEZE
|
||||||
|
- E5-3(env shape 等)は **DEFER**
|
||||||
|
- 次の芯: **E5-4(Malloc Tiny Direct)** = E5-1 の成功パターンを alloc 側へ複製
|
||||||
|
|
||||||
|
狙い:
|
||||||
|
- `malloc()` wrapper から `tiny_alloc_gate_fast()` 呼び出しの “ゲート税” を削り、
|
||||||
|
**wrapper → malloc_tiny_fast_for_class()** へ最短で入る。
|
||||||
|
|
||||||
|
前提:
|
||||||
|
- “Tiny を使ってはいけない” モード(POOL_ONLY 等)を壊さない(= `g_tiny_route[]` は必ず尊重)。
|
||||||
|
- fail-fast: 失敗したら既存経路へ即フォールバック。
|
||||||
|
- 戻せる: ENV gate default OFF。
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 0: 対象ホットの確認(perf)
|
||||||
|
|
||||||
|
E4/E5-1 を ON にした baseline で確認:
|
||||||
|
```sh
|
||||||
|
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE \
|
||||||
|
perf record -F 99 -- ./bench_random_mixed_hakmem 20000000 400 1
|
||||||
|
perf report --stdio --no-children
|
||||||
|
```
|
||||||
|
|
||||||
|
狙いの目安:
|
||||||
|
- `tiny_alloc_gate_fast` が self% **≥ 8%** なら E5-4 の ROI は高い
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 1: 箱の追加(ENV gate + optional stats)
|
||||||
|
|
||||||
|
### 1) ENV gate(必須)
|
||||||
|
- 新規: `core/box/malloc_tiny_direct_env_box.h`
|
||||||
|
- ENV: `HAKMEM_MALLOC_TINY_DIRECT=0/1`(default 0)
|
||||||
|
- `static inline bool malloc_tiny_direct_enabled(void)` を提供
|
||||||
|
|
||||||
|
### 2) stats(任意、compile-out 推奨)
|
||||||
|
- 新規: `core/box/malloc_tiny_direct_stats_box.h`
|
||||||
|
- `direct_total`, `direct_hit`, `direct_miss`, `route_pool_only`, `class_oob`, `fast_null`
|
||||||
|
- `HAKMEM_DEBUG_COUNTERS=0` で compile-out(観測税ゼロ)
|
||||||
|
|
||||||
|
Box Theory:
|
||||||
|
- L0: ENV gate(戻せる)
|
||||||
|
- L1: direct try(副作用ゼロ)
|
||||||
|
- 見える化: カウンタのみ
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 2: wrapper へ統合(境界1箇所)
|
||||||
|
|
||||||
|
対象: `core/box/hak_wrappers.inc.h` の `malloc()` hot path(E4-2 snapshot の中)
|
||||||
|
|
||||||
|
やること:
|
||||||
|
- 既存の
|
||||||
|
- `size <= 256` → `tiny_alloc_gate_fast(size)`
|
||||||
|
- `size <= tiny_get_max_size()` → `tiny_alloc_gate_fast(size)`
|
||||||
|
を “direct try” に置換/前段追加する。
|
||||||
|
|
||||||
|
**Direct try の条件(安全最優先)**:
|
||||||
|
1) `malloc_wrapper_env_snapshot_enabled()` が ON(E4-2 の経路内)
|
||||||
|
2) `env->front_gate_unified` が true(Tiny front を使う前提)
|
||||||
|
3) `size <= 256`(まず最頻だけ、範囲を狭く)
|
||||||
|
4) `class_idx = hak_tiny_size_to_class(size)` が [0..7]
|
||||||
|
5) `g_tiny_route[class_idx] != ROUTE_POOL_ONLY`(Tiny 禁止を尊重)
|
||||||
|
|
||||||
|
**Direct try の呼び出し**:
|
||||||
|
- `void* p = malloc_tiny_fast_for_class(size, class_idx);`
|
||||||
|
- `p != NULL` なら即 return
|
||||||
|
- `p == NULL` なら既存ルートにフォールバック(TinyFirst/Refill失敗を許容)
|
||||||
|
|
||||||
|
重要:
|
||||||
|
- `tiny_alloc_gate_fast()` の “診断/検証” は bypass されるので、
|
||||||
|
debug ビルドでは direct try を **tiny_alloc_gate_diag_enabled()==0 のときだけ**に限定する(推奨)。
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 3: A/B テスト(同一バイナリ)
|
||||||
|
|
||||||
|
### A: baseline(E5-4 OFF)
|
||||||
|
```sh
|
||||||
|
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE \
|
||||||
|
HAKMEM_MALLOC_TINY_DIRECT=0 \
|
||||||
|
./bench_random_mixed_hakmem 20000000 400 1
|
||||||
|
```
|
||||||
|
|
||||||
|
### B: optimized(E5-4 ON)
|
||||||
|
```sh
|
||||||
|
HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE \
|
||||||
|
HAKMEM_MALLOC_TINY_DIRECT=1 \
|
||||||
|
./bench_random_mixed_hakmem 20000000 400 1
|
||||||
|
```
|
||||||
|
|
||||||
|
判定(Mixed 10-run mean):
|
||||||
|
- GO: **+1.0% 以上**
|
||||||
|
- ±1.0%: NEUTRAL → freeze
|
||||||
|
- -1.0% 以下: NO-GO → freeze
|
||||||
|
|
||||||
|
追加で C6-heavy も 5-run だけ確認(回帰が無いこと)。
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 4: 健康診断(必須)
|
||||||
|
|
||||||
|
```sh
|
||||||
|
scripts/verify_health_profiles.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step 5: 昇格(GO のときだけ)
|
||||||
|
|
||||||
|
- `core/bench_profile.h`(`MIXED_TINYV3_C7_SAFE`)に:
|
||||||
|
- `bench_setenv_default("HAKMEM_MALLOC_TINY_DIRECT", "1");`
|
||||||
|
- `docs/analysis/ENV_PROFILE_PRESETS.md` に:
|
||||||
|
- 効果、A/B、rollback(`HAKMEM_MALLOC_TINY_DIRECT=0`)を追記
|
||||||
|
- `CURRENT_TASK.md` を更新
|
||||||
|
|
||||||
@ -1,6 +1,6 @@
|
|||||||
# Phase 5 E5: Post E4-Combined Next Instructions(次の指示書)
|
# Phase 5 E5: Post E4-Combined Next Instructions(次の指示書)
|
||||||
|
|
||||||
## Status(2025-12-14 / E4 Combined GO 後)
|
## Status(2025-12-14 / E5-2 FREEZE 反映)
|
||||||
|
|
||||||
- Baseline(Mixed, 20M iters, ws=400): **47.34M ops/s**(E4-1+E4-2 ON)
|
- Baseline(Mixed, 20M iters, ws=400): **47.34M ops/s**(E4-1+E4-2 ON)
|
||||||
- Hot spots(self%):
|
- Hot spots(self%):
|
||||||
@ -15,6 +15,9 @@
|
|||||||
|
|
||||||
Update:
|
Update:
|
||||||
- E5-1(Free Tiny Direct Path)✅ GO(+3.35% mean / +3.36% median)→ 指示書: `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
|
- E5-1(Free Tiny Direct Path)✅ GO(+3.35% mean / +3.36% median)→ 指示書: `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
|
||||||
|
- E5-2(Header write to refill boundary)⚪ NEUTRAL → FREEZE(追わない)
|
||||||
|
- E5-3(env shape 等)DEFER → 次は E5-4(malloc 側 direct)
|
||||||
|
- E5-4 指示書: `docs/analysis/PHASE5_E5_4_MALLOC_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@ -74,7 +77,15 @@ perf report --stdio --no-children --symbol free
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## E5-2(優先B): `tiny_region_id_write_header` を “毎回 alloc” から外す(refill 境界へ)
|
## E5-2: Header write-once(⚪ NEUTRAL → FROZEN)
|
||||||
|
|
||||||
|
結論:
|
||||||
|
- E5-2 は **NEUTRAL**(branch overhead ≈ savings)なので **freeze**。
|
||||||
|
- 以後は追わず、次は E5-4 を優先する。
|
||||||
|
|
||||||
|
参照:
|
||||||
|
- Design: `docs/analysis/PHASE5_E5_2_HEADER_REFILL_ONCE_DESIGN.md`
|
||||||
|
- Results: `docs/analysis/PHASE5_E5_2_HEADER_REFILL_ONCE_AB_TEST_RESULTS.md`
|
||||||
|
|
||||||
### 仮説
|
### 仮説
|
||||||
`tiny_region_id_write_header` は “正しいが高頻度”。
|
`tiny_region_id_write_header` は “正しいが高頻度”。
|
||||||
@ -96,7 +107,14 @@ perf report --stdio --no-children --symbol free
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## E5-3(優先C / 小パッチ): `hakmem_env_snapshot_enabled()` の分岐形を “enabled 前提” に寄せる
|
## E5-4(次の芯): Malloc Tiny Direct(E5-1 の alloc 側複製)
|
||||||
|
|
||||||
|
指示書:
|
||||||
|
- `docs/analysis/PHASE5_E5_4_MALLOC_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## E5-3(DEFER): `hakmem_env_snapshot_enabled()` の分岐形を “enabled 前提” に寄せる
|
||||||
|
|
||||||
### 背景
|
### 背景
|
||||||
`MIXED_TINYV3_C7_SAFE` では `HAKMEM_ENV_SNAPSHOT=1` が常用になったため、
|
`MIXED_TINYV3_C7_SAFE` では `HAKMEM_ENV_SNAPSHOT=1` が常用になったため、
|
||||||
|
|||||||
@ -73,3 +73,4 @@ scripts/verify_health_profiles.sh
|
|||||||
- E4 合算 A/B: `docs/analysis/PHASE5_E4_COMBINED_AB_TEST_NEXT_INSTRUCTIONS.md`
|
- E4 合算 A/B: `docs/analysis/PHASE5_E4_COMBINED_AB_TEST_NEXT_INSTRUCTIONS.md`
|
||||||
- E5 次の芯: `docs/analysis/PHASE5_E5_NEXT_INSTRUCTIONS.md`
|
- E5 次の芯: `docs/analysis/PHASE5_E5_NEXT_INSTRUCTIONS.md`
|
||||||
- E5-1 昇格: `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
|
- E5-1 昇格: `docs/analysis/PHASE5_E5_1_FREE_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
|
||||||
|
- E5-4 次: `docs/analysis/PHASE5_E5_4_MALLOC_TINY_DIRECT_NEXT_INSTRUCTIONS.md`
|
||||||
|
|||||||
Reference in New Issue
Block a user