2025-12-15 12:29:27 +09:00
|
|
|
|
# Phase 19-3: ENV Snapshot Consolidation — Revised Implementation Instructions
|
|
|
|
|
|
|
|
|
|
|
|
## 0. Design Corrections (from initial analysis)
|
|
|
|
|
|
|
|
|
|
|
|
**Key mistakes in initial design**:
|
|
|
|
|
|
|
|
|
|
|
|
1. **Wrong**: Create new TLS ctx
|
|
|
|
|
|
**Right**: Use existing `HakmemEnvSnapshot*` from `core/box/hakmem_env_snapshot_box.h`
|
|
|
|
|
|
|
|
|
|
|
|
2. **Wrong**: Option A with static `__thread g_init` (doesn't respect `hakmem_env_snapshot_refresh_from_env()`)
|
|
|
|
|
|
**Right**: Pass `const HakmemEnvSnapshot* env` down the call stack (refresh works automatically)
|
|
|
|
|
|
|
|
|
|
|
|
3. **Wrong**: Modify `core/wrapper.c`
|
|
|
|
|
|
**Right**: Modify `core/box/hak_wrappers.inc.h` (actual integration point)
|
|
|
|
|
|
|
|
|
|
|
|
4. **Wrong**: Keep `__builtin_expect(hakmem_env_snapshot_enabled(), 0)`
|
|
|
|
|
|
**Right**: Remove UNLIKELY hint (Phase 19-1 trap: snapshot is now ON by default, hint is backwards)
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
## 1. Strategy: Simplest Path (3 micro-phases)
|
|
|
|
|
|
|
|
|
|
|
|
### Phase 19-3a: Remove UNLIKELY hint from ENV checks
|
|
|
|
|
|
|
|
|
|
|
|
**Problem**: `__builtin_expect(hakmem_env_snapshot_enabled(), 0)` appears in:
|
|
|
|
|
|
- `core/front/malloc_tiny_fast.h:236` (C7 ULTRA alloc)
|
|
|
|
|
|
- `core/front/malloc_tiny_fast.h:403` (Front V3 free hotcold)
|
|
|
|
|
|
- `core/front/malloc_tiny_fast.h:624` (C7 ULTRA free)
|
|
|
|
|
|
- `core/front/malloc_tiny_fast.h:830` (C7 ULTRA free larson)
|
|
|
|
|
|
- `core/front/malloc_tiny_fast.h:910` (Front V3 free larson)
|
|
|
|
|
|
|
|
|
|
|
|
**Issue**: Snapshot is now ON in presets → UNLIKELY hint is backwards (same trap as Phase 19-1 NO-GO)
|
|
|
|
|
|
|
|
|
|
|
|
**Fix**: Replace `__builtin_expect(hakmem_env_snapshot_enabled(), 0)` with `hakmem_env_snapshot_enabled()`
|
|
|
|
|
|
|
|
|
|
|
|
**Expected**: +0-2% from correct branch prediction
|
|
|
|
|
|
|
|
|
|
|
|
**A/B Test**:
|
|
|
|
|
|
```sh
|
|
|
|
|
|
# Before (with UNLIKELY hint)
|
|
|
|
|
|
scripts/run_mixed_10_cleanenv.sh
|
|
|
|
|
|
|
|
|
|
|
|
# After (without hint)
|
|
|
|
|
|
scripts/run_mixed_10_cleanenv.sh
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**GO Criteria**: No regression (±1%)
|
|
|
|
|
|
|
2025-12-15 12:50:16 +09:00
|
|
|
|
**Status**: DONE ✅ (GO +4.42%)
|
|
|
|
|
|
|
2025-12-15 12:29:27 +09:00
|
|
|
|
---
|
|
|
|
|
|
|
2025-12-15 12:50:16 +09:00
|
|
|
|
### Phase 19-3b: Pass snapshot down (hot helper pass-down)
|
|
|
|
|
|
|
|
|
|
|
|
**Status**: DONE ✅ (GO +2.76% mean / +2.57% median)
|
|
|
|
|
|
Results: `docs/analysis/PHASE19_FASTLANE_INSTRUCTION_REDUCTION_3B_ENV_SNAPSHOT_PASSDOWN_AB_TEST_RESULTS.md`
|
2025-12-15 12:29:27 +09:00
|
|
|
|
|
|
|
|
|
|
**Current state**: Each callee calls `hakmem_env_snapshot_enabled()` independently
|
|
|
|
|
|
- 5 calls × 2 TLS reads each = **10 TLS reads/op**
|
|
|
|
|
|
|
2025-12-15 12:50:16 +09:00
|
|
|
|
**Implementation (landed)**:
|
|
|
|
|
|
- `core/front/malloc_tiny_fast.h`
|
|
|
|
|
|
- Capture `env` once per hot call and pass it down:
|
|
|
|
|
|
- `free_tiny_fast()` / `free_tiny_fast_hot()` capture `env` once
|
|
|
|
|
|
- `free_tiny_fast_cold(..., env)` consumes it
|
|
|
|
|
|
- `tiny_legacy_fallback_free_base_with_env(..., env)` consumes it
|
|
|
|
|
|
- Reuse the same snapshot for alloc route selection:
|
|
|
|
|
|
- `tiny_policy_hot_get_route_with_env(class_idx, env)`
|
|
|
|
|
|
- `core/box/tiny_legacy_fallback_box.h`: add `tiny_legacy_fallback_free_base_with_env(...)`
|
|
|
|
|
|
- `core/box/tiny_metadata_cache_hot_box.h`: add `tiny_policy_hot_get_route_with_env(...)`
|
|
|
|
|
|
|
|
|
|
|
|
**Optional extension (if chasing the last alloc-side gate)**:
|
|
|
|
|
|
Pass `env` down from `core/box/hak_wrappers.inc.h` entry. Keep the invariant: malloc miss must fall through (do not call `malloc_cold()` directly).
|
|
|
|
|
|
|
|
|
|
|
|
**Example (optional)**:
|
2025-12-15 12:29:27 +09:00
|
|
|
|
```c
|
|
|
|
|
|
// core/box/hak_wrappers.inc.h (malloc wrapper)
|
|
|
|
|
|
void* malloc(size_t size) {
|
|
|
|
|
|
// Entry-point: read ENV snapshot once
|
|
|
|
|
|
const HakmemEnvSnapshot* env = hakmem_env_snapshot_enabled() ? hakmem_env_snapshot() : NULL;
|
|
|
|
|
|
|
|
|
|
|
|
if (fastlane_direct_enabled()) {
|
|
|
|
|
|
// NOTE: Keep FastLane safety rule: do not use fast paths before init.
|
|
|
|
|
|
if (g_initialized) {
|
|
|
|
|
|
void* ptr = malloc_tiny_fast_with_env(size, env); // NEW: pass env
|
|
|
|
|
|
if (ptr != NULL) return ptr;
|
|
|
|
|
|
// IMPORTANT: malloc miss must fall through to existing wrapper path
|
|
|
|
|
|
// (do NOT call malloc_cold() directly; it expects lock_depth to be incremented).
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
void* ptr = front_fastlane_try_malloc(size);
|
|
|
|
|
|
if (__builtin_expect(ptr != NULL, 1)) return ptr;
|
|
|
|
|
|
// Not handled → continue to existing wrapper path below (wrap_shape / lock_depth / init waits / malloc_cold(...)).
|
|
|
|
|
|
// (Do not duplicate the full wrapper here; only the env pass-down is new.)
|
|
|
|
|
|
/* existing wrapper path */
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// core/box/hak_wrappers.inc.h (free wrapper)
|
|
|
|
|
|
void free(void* ptr) {
|
|
|
|
|
|
if (__builtin_expect(!ptr, 0)) return;
|
|
|
|
|
|
|
|
|
|
|
|
// Entry-point: read ENV snapshot once
|
|
|
|
|
|
const HakmemEnvSnapshot* env = hakmem_env_snapshot_enabled() ? hakmem_env_snapshot() : NULL;
|
|
|
|
|
|
|
|
|
|
|
|
if (fastlane_direct_enabled()) {
|
|
|
|
|
|
// NOTE: Keep FastLane safety rule: do not use fast paths before init.
|
|
|
|
|
|
if (g_initialized) {
|
|
|
|
|
|
if (free_tiny_fast_with_env(ptr, env)) return; // NEW: pass env
|
|
|
|
|
|
free_cold(ptr);
|
|
|
|
|
|
return;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
if (front_fastlane_try_free(ptr)) return;
|
|
|
|
|
|
free_cold(ptr, wrapper_env_cfg_fast());
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Propagate env down**:
|
|
|
|
|
|
```c
|
|
|
|
|
|
// core/front/malloc_tiny_fast.h (example)
|
|
|
|
|
|
static inline void* malloc_tiny_fast_with_env(size_t size, const HakmemEnvSnapshot* env) {
|
|
|
|
|
|
// ... existing logic ...
|
|
|
|
|
|
|
|
|
|
|
|
// OLD: if (__builtin_expect(hakmem_env_snapshot_enabled(), 0)) {
|
|
|
|
|
|
// NEW: if (env) {
|
|
|
|
|
|
if (env) {
|
|
|
|
|
|
// Use snapshot from env (NO additional TLS read)
|
|
|
|
|
|
if (tiny_c7_ultra_enabled_cached(env)) { // NEW: use cached check
|
|
|
|
|
|
// C7 ULTRA path
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// ... rest of logic ...
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Helper functions** (optional, if needed):
|
|
|
|
|
|
```c
|
|
|
|
|
|
// core/box/hakmem_env_snapshot_box.h
|
|
|
|
|
|
static inline bool tiny_c7_ultra_enabled_cached(const HakmemEnvSnapshot* env) {
|
|
|
|
|
|
if (!env) return 0;
|
|
|
|
|
|
// Read from snapshot (already in cache, no TLS read)
|
|
|
|
|
|
return env->tiny_c7_ultra_enabled;
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Expected**:
|
|
|
|
|
|
- TLS reads: 10/op → 2/op (just wrapper entry check)
|
|
|
|
|
|
- Instructions: -8.0 to -10.0/op
|
|
|
|
|
|
- Throughput: **+3-5%**
|
|
|
|
|
|
|
|
|
|
|
|
**ENV Gate** (optional, if conservative rollout needed):
|
|
|
|
|
|
```c
|
|
|
|
|
|
// core/box/env_snapshot_consolidation_env_box.h (NEW, if needed)
|
|
|
|
|
|
extern _Atomic int g_env_snapshot_consolidation_enabled;
|
|
|
|
|
|
|
|
|
|
|
|
static inline bool env_snapshot_consolidation_enabled(void) {
|
|
|
|
|
|
return atomic_load_explicit(&g_env_snapshot_consolidation_enabled, memory_order_relaxed);
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Alternative** (no new ENV gate):
|
|
|
|
|
|
- Just always pass env down when HAKMEM_FASTLANE_DIRECT=1
|
|
|
|
|
|
- Rely on existing FASTLANE_DIRECT gate for rollback
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
### Phase 19-3c: Propagate to all callees (if 19-3b is GO)
|
|
|
|
|
|
|
|
|
|
|
|
**Targets**:
|
|
|
|
|
|
- `tiny_legacy_fallback_box.h:28` (ENV snapshot check)
|
|
|
|
|
|
- `tiny_metadata_cache_hot_box.h:64` (metadata cache check)
|
|
|
|
|
|
- Other helper functions that currently call `hakmem_env_snapshot_enabled()`
|
|
|
|
|
|
|
|
|
|
|
|
**Expected cumulative**: **+5-8%**
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
## 2. Box Theory Compliance
|
|
|
|
|
|
|
|
|
|
|
|
**Boundary**:
|
|
|
|
|
|
- L0 (ENV gate): Optional `HAKMEM_ENV_SNAPSHOT_CONSOLIDATION=0/1` OR reuse `HAKMEM_FASTLANE_DIRECT=1`
|
|
|
|
|
|
- L1 (Hot inline): `const HakmemEnvSnapshot* env` parameter (NULL-safe)
|
|
|
|
|
|
- L2 (Fallback): If `env == NULL`, use old path (call `hakmem_env_snapshot_enabled()`)
|
|
|
|
|
|
|
|
|
|
|
|
**Rollback**:
|
|
|
|
|
|
- Runtime: `HAKMEM_ENV_SNAPSHOT_CONSOLIDATION=0` (or `HAKMEM_FASTLANE_DIRECT=0`)
|
|
|
|
|
|
- Compile-time: `#if` guards on env parameter propagation
|
|
|
|
|
|
- NULL-safe: `if (!env) { /* old path */ }`
|
|
|
|
|
|
|
|
|
|
|
|
**Observability**:
|
|
|
|
|
|
- Perf stat: TLS read count reduction
|
|
|
|
|
|
- `hakmem_env_snapshot_enabled()` samples should drop from 7% → <1%
|
|
|
|
|
|
|
|
|
|
|
|
**Refresh compatibility**:
|
|
|
|
|
|
- ✅ Works: Wrapper reads fresh snapshot on each operation
|
|
|
|
|
|
- ✅ `bench_profile` putenv() + refresh works (no static cache)
|
|
|
|
|
|
- ✅ No version tracking needed
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
## 3. Implementation Steps
|
|
|
|
|
|
|
|
|
|
|
|
### Step 1: Phase 19-3a (UNLIKELY hint removal)
|
|
|
|
|
|
|
|
|
|
|
|
**Files to modify**:
|
|
|
|
|
|
```sh
|
|
|
|
|
|
# Find all instances
|
|
|
|
|
|
grep -n "__builtin_expect(hakmem_env_snapshot_enabled(), 0)" core/front/malloc_tiny_fast.h
|
|
|
|
|
|
|
|
|
|
|
|
# Replace with
|
|
|
|
|
|
# hakmem_env_snapshot_enabled()
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Lines to change**:
|
|
|
|
|
|
- `malloc_tiny_fast.h:236`
|
|
|
|
|
|
- `malloc_tiny_fast.h:403`
|
|
|
|
|
|
- `malloc_tiny_fast.h:624`
|
|
|
|
|
|
- `malloc_tiny_fast.h:830`
|
|
|
|
|
|
- `malloc_tiny_fast.h:910`
|
|
|
|
|
|
|
|
|
|
|
|
**A/B Test**:
|
|
|
|
|
|
```sh
|
|
|
|
|
|
# Baseline (before)
|
|
|
|
|
|
scripts/run_mixed_10_cleanenv.sh
|
|
|
|
|
|
|
|
|
|
|
|
# Optimized (after)
|
|
|
|
|
|
scripts/run_mixed_10_cleanenv.sh
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**GO Criteria**: No regression (±1%)
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
### Step 2: Phase 19-3b (Pass env from wrapper)
|
|
|
|
|
|
|
|
|
|
|
|
**2.1 Update wrappers** (`core/box/hak_wrappers.inc.h`)
|
|
|
|
|
|
|
|
|
|
|
|
Find malloc wrapper (around line 674):
|
|
|
|
|
|
```c
|
|
|
|
|
|
void* malloc(size_t size) {
|
|
|
|
|
|
// ADD: Read snapshot once at entry
|
|
|
|
|
|
const HakmemEnvSnapshot* env = hakmem_env_snapshot_enabled() ? hakmem_env_snapshot() : NULL;
|
|
|
|
|
|
|
|
|
|
|
|
// ... rest of existing logic ...
|
|
|
|
|
|
// Replace malloc_tiny_fast(size) with malloc_tiny_fast_with_env(size, env)
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
Find free wrapper (around line 188):
|
|
|
|
|
|
```c
|
|
|
|
|
|
void free(void* ptr) {
|
|
|
|
|
|
// ADD: Read snapshot once at entry
|
|
|
|
|
|
const HakmemEnvSnapshot* env = hakmem_env_snapshot_enabled() ? hakmem_env_snapshot() : NULL;
|
|
|
|
|
|
|
|
|
|
|
|
// ... rest of existing logic ...
|
|
|
|
|
|
// Replace free_tiny_fast(ptr) with free_tiny_fast_with_env(ptr, env)
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**2.2 Add _with_env variants** (`core/front/malloc_tiny_fast.h`)
|
|
|
|
|
|
|
|
|
|
|
|
Option A: Rename existing functions (breaking change, requires full propagation)
|
|
|
|
|
|
Option B: Add new `_with_env` variants, keep old functions as wrappers (safer)
|
|
|
|
|
|
|
|
|
|
|
|
**Recommended: Option B** (incremental migration)
|
|
|
|
|
|
|
|
|
|
|
|
```c
|
|
|
|
|
|
// malloc_tiny_fast.h (add new variant)
|
|
|
|
|
|
static inline void* malloc_tiny_fast_with_env(size_t size, const HakmemEnvSnapshot* env) {
|
|
|
|
|
|
// Replace hakmem_env_snapshot_enabled() checks with (env && ...)
|
|
|
|
|
|
// ... existing logic ...
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// Keep old function as wrapper (for gradual migration)
|
|
|
|
|
|
static inline void* malloc_tiny_fast(size_t size) {
|
|
|
|
|
|
const HakmemEnvSnapshot* env = hakmem_env_snapshot_enabled() ? hakmem_env_snapshot() : NULL;
|
|
|
|
|
|
return malloc_tiny_fast_with_env(size, env);
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**2.3 Update ENV checks inside _with_env**
|
|
|
|
|
|
|
|
|
|
|
|
Replace:
|
|
|
|
|
|
```c
|
|
|
|
|
|
if (__builtin_expect(hakmem_env_snapshot_enabled(), 0)) {
|
|
|
|
|
|
// C7 ULTRA check
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
With:
|
|
|
|
|
|
```c
|
|
|
|
|
|
if (env) {
|
|
|
|
|
|
// Use env snapshot (NO TLS read)
|
|
|
|
|
|
if (env->tiny_c7_ultra_enabled) {
|
|
|
|
|
|
// C7 ULTRA path
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**2.4 Build & Verify**
|
|
|
|
|
|
|
|
|
|
|
|
```sh
|
|
|
|
|
|
make clean && make -j bench_random_mixed_hakmem
|
|
|
|
|
|
# Should compile without errors
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**2.5 A/B Test**
|
|
|
|
|
|
|
|
|
|
|
|
```sh
|
|
|
|
|
|
# Baseline (19-3a only)
|
|
|
|
|
|
scripts/run_mixed_10_cleanenv.sh
|
|
|
|
|
|
|
|
|
|
|
|
# Optimized (19-3b with env passing)
|
|
|
|
|
|
scripts/run_mixed_10_cleanenv.sh
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**GO Criteria**: +3% minimum
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
### Step 3: Phase 19-3c (Propagate to helpers)
|
|
|
|
|
|
|
|
|
|
|
|
**If 19-3b is GO**, propagate env to:
|
|
|
|
|
|
- `tiny_legacy_fallback_box.h`
|
|
|
|
|
|
- `tiny_metadata_cache_hot_box.h`
|
|
|
|
|
|
- Other helper functions
|
|
|
|
|
|
|
|
|
|
|
|
**Expected cumulative**: +5-8%
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
## 4. Safety Checklist
|
|
|
|
|
|
|
|
|
|
|
|
- [ ] NULL-safe: All `if (env)` checks handle NULL correctly
|
|
|
|
|
|
- [ ] Refresh works: Wrapper reads fresh snapshot each call
|
|
|
|
|
|
- [ ] Rollback: Can disable via ENV or compile flag
|
|
|
|
|
|
- [ ] No static cache: No version tracking needed
|
|
|
|
|
|
- [ ] Existing code paths preserved: Old functions still work
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
## 5. Expected Performance (Cumulative)
|
|
|
|
|
|
|
|
|
|
|
|
| Phase | TLS reads/op | Instructions/op | Throughput |
|
|
|
|
|
|
|-------|--------------|-----------------|------------|
|
|
|
|
|
|
| Baseline (19-1b) | 10 | 169.45 | 52.06M ops/s |
|
|
|
|
|
|
| 19-3a (hint fix) | 10 | ~169 | +0-2% |
|
|
|
|
|
|
| 19-3b (env pass) | 2 | ~159-161 | +3-5% |
|
|
|
|
|
|
| 19-3c (helpers) | 2 | ~157-159 | +5-8% |
|
|
|
|
|
|
|
|
|
|
|
|
**Target after Phase 19-3c**:
|
|
|
|
|
|
- Throughput: **54.7-56.2M ops/s** (vs 52.06M baseline)
|
|
|
|
|
|
- Instructions/op: **157-159** (vs 169.45 baseline)
|
|
|
|
|
|
- Gap to libc (135.92): **+15-17%** (vs +24.6% before 19-3)
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
## 6. Perf Validation
|
|
|
|
|
|
|
|
|
|
|
|
**Before Phase 19-3**:
|
|
|
|
|
|
```sh
|
|
|
|
|
|
perf stat -e cycles,instructions -- ./bench_random_mixed_hakmem 200000000 400 1
|
|
|
|
|
|
# Instructions: ~169.45/op
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**After Phase 19-3c**:
|
|
|
|
|
|
```sh
|
|
|
|
|
|
perf stat -e cycles,instructions -- ./bench_random_mixed_hakmem 200000000 400 1
|
|
|
|
|
|
# Instructions: ~157-159/op (target: -10.0 reduction)
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
**Perf record validation**:
|
|
|
|
|
|
```sh
|
|
|
|
|
|
perf record -g -- ./bench_random_mixed_hakmem 50000000 400 1
|
|
|
|
|
|
perf report --stdio --no-children | grep hakmem_env_snapshot_enabled
|
|
|
|
|
|
# Should show <1% samples (down from 7%)
|
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
## 7. Risk Assessment
|
|
|
|
|
|
|
|
|
|
|
|
**Phase 19-3a**: **LOW**
|
|
|
|
|
|
- Simple search-replace
|
|
|
|
|
|
- No algorithmic changes
|
|
|
|
|
|
- Worst case: ±0% (no-op)
|
|
|
|
|
|
|
|
|
|
|
|
**Phase 19-3b**: **MEDIUM**
|
|
|
|
|
|
- API signature changes (add env parameter)
|
|
|
|
|
|
- Mechanical changes (low semantic risk)
|
|
|
|
|
|
- Rollback: Keep old functions as wrappers
|
|
|
|
|
|
|
|
|
|
|
|
**Phase 19-3c**: **MEDIUM**
|
|
|
|
|
|
- Wider propagation
|
|
|
|
|
|
- More functions to update
|
|
|
|
|
|
- Same rollback strategy
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
## 8. Timeline
|
|
|
|
|
|
|
|
|
|
|
|
- **Phase 19-3a**: 1-2 hours (search-replace + A/B)
|
|
|
|
|
|
- **Phase 19-3b**: 4-6 hours (wrapper + _with_env variants + A/B)
|
|
|
|
|
|
- **Phase 19-3c**: 3-4 hours (helper propagation + A/B)
|
|
|
|
|
|
|
|
|
|
|
|
**Total**: 8-12 hours (1-2 days part-time)
|
|
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
|
|
## 9. Next Steps After This Phase
|
|
|
|
|
|
|
|
|
|
|
|
If Phase 19-3 achieves +5-8%:
|
|
|
|
|
|
- Current: 52.06M ops/s
|
|
|
|
|
|
- After 19-3: ~54.7-56.2M ops/s
|
|
|
|
|
|
- Gap to libc (79.72M): ~+42-46%
|
|
|
|
|
|
|
|
|
|
|
|
**Remaining candidates** (from Phase 19-1 Design):
|
|
|
|
|
|
- Candidate C: Stats removal (+3-5%, already done in BENCH_MINIMAL)
|
|
|
|
|
|
- Candidate D: Header inline (+2-3%)
|
|
|
|
|
|
- Candidate E: Route fast path (+2-3%)
|
|
|
|
|
|
|
|
|
|
|
|
**Or**: Re-profile with `perf record` to find next hot path (self% ≥ 5%)
|