Phase 4 E3-4: ENV Constructor Init (+4.75% GO)

Target: Eliminate E1 lazy init check overhead (3.22% self%)
- E1 consolidated ENV gates but lazy check remained in hot path
- Strategy: __attribute__((constructor(101))) for pre-main init

Implementation:
- ENV gate: HAKMEM_ENV_SNAPSHOT_CTOR=0/1 (default 0, research box)
- core/box/hakmem_env_snapshot_box.c: Constructor function added
  - Reads ENV before main() when CTOR=1
  - Refresh path also re-syncs the gate state, so ENV changes made via bench_profile putenv take effect
- core/box/hakmem_env_snapshot_box.h: Dual-mode enabled check
  - CTOR=1 fast path: direct global read (no lazy branch)
  - CTOR=0 fallback: legacy lazy init (rollback safe)
  - Branch hints adjusted for default OFF baseline
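
A minimal sketch of the constructor side, assuming a GCC/Clang toolchain. The .c file is not shown in this commit view, so the function names (`hakmem_env_flag_parse`, `hakmem_env_snapshot_gate_sync`, `hakmem_env_snapshot_ctor`) and the initial values of the globals are hypothetical; only the two global names and the two ENV variable names come from the source.

```c
#include <stdlib.h>

/* Gate state shared with the header-side check.
 * Assumed initial values: -1 = "not yet read", ctor mode off. */
int g_hakmem_env_snapshot_gate = -1;
int g_hakmem_env_snapshot_ctor_mode = 0;

/* Hypothetical helper: a value counts as ON only if it starts with '1'. */
int hakmem_env_flag_parse(const char* value) {
    return (value && *value == '1') ? 1 : 0;
}

/* Hypothetical sync routine: re-reads both ENV variables. A refresh path
 * could call this again after putenv() changes the environment. */
void hakmem_env_snapshot_gate_sync(void) {
    g_hakmem_env_snapshot_ctor_mode =
        hakmem_env_flag_parse(getenv("HAKMEM_ENV_SNAPSHOT_CTOR"));
    if (g_hakmem_env_snapshot_ctor_mode) {
        /* Resolve the gate eagerly so the hot path never sees -1. */
        g_hakmem_env_snapshot_gate =
            hakmem_env_flag_parse(getenv("HAKMEM_ENV_SNAPSHOT"));
    }
}

/* Priority 101 is the lowest value available to user code in GCC
 * (0-100 are reserved), so this runs early among constructors and
 * always before main(). */
__attribute__((constructor(101)))
static void hakmem_env_snapshot_ctor(void) {
    hakmem_env_snapshot_gate_sync();
}
```

With CTOR unset or 0, the sketch leaves the gate at -1 so the legacy lazy path still resolves it on first use.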

A/B Test Results (Mixed, 10-run, 20M iters, E1=1):
- Baseline (CTOR=0): 44.28M ops/s (mean), 44.60M ops/s (median)
- Optimized (CTOR=1): 46.38M ops/s (mean), 46.53M ops/s (median)
- Improvement: +4.75% mean, +4.35% median
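
The improvement figures can be cross-checked from the rounded throughput numbers above. This is only a sketch; `percent_gain` and `print_ab_summary` are hypothetical helper names. Recomputing from the rounded values lands within a few hundredths of a percentage point of the reported figures, and the small gap is presumably rounding in the quoted means and medians.

```c
#include <stdio.h>

/* Relative throughput change of `optimized` over `baseline`, in percent. */
double percent_gain(double baseline, double optimized) {
    return (optimized - baseline) / baseline * 100.0;
}

/* Prints both comparisons using the rounded M ops/s figures quoted above. */
void print_ab_summary(void) {
    printf("mean:   %+.2f%%\n", percent_gain(44.28, 46.38));
    printf("median: %+.2f%%\n", percent_gain(44.60, 46.53));
}
```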

Decision: GO (+4.75% >> +0.5% threshold)
- Expected +0.5-1.5%, achieved +4.75%
- Lazy init branch overhead was larger than expected
- Action: Keep as research box (default OFF), evaluate promotion

Phase 4 Cumulative:
- E1 (ENV Snapshot): +3.92%
- E2 (Alloc Per-Class): -0.21% (NEUTRAL, frozen)
- E3-4 (Constructor Init): +4.75%
- Total Phase 4: ~+8.5%
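
The total can be cross-checked in two ways: summing the per-step gains, or compounding them under the assumption that each step was measured on top of the previous one. The message does not say which was used, so both are shown; the helper names are hypothetical. Both results fall close to the quoted ~+8.5%.

```c
/* Sum of per-step gains (all values in percent). */
double additive_gain_percent(const double* gains, int n) {
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        total += gains[i];
    }
    return total;
}

/* Compounded gain in percent, assuming each step multiplies the
 * throughput reached by the previous step. */
double compound_gain_percent(const double* gains, int n) {
    double factor = 1.0;
    for (int i = 0; i < n; i++) {
        factor *= 1.0 + gains[i] / 100.0;
    }
    return (factor - 1.0) * 100.0;
}
```

Called with `{3.92, -0.21, 4.75}`, the additive form gives about 8.46 and the compounded form about 8.63.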

Deliverables:
- docs/analysis/PHASE4_E3_ENV_CONSTRUCTOR_INIT_DESIGN.md
- docs/analysis/PHASE4_E3_ENV_CONSTRUCTOR_INIT_NEXT_INSTRUCTIONS.md
- docs/analysis/PHASE4_COMPREHENSIVE_STATUS_ANALYSIS.md
- docs/analysis/PHASE4_EXECUTIVE_SUMMARY.md
- scripts/verify_health_profiles.sh (sanity check script)
- CURRENT_TASK.md (E3-4 complete, next instructions)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Moe Charm (CI)
2025-12-14 02:57:35 +09:00
parent 6a6744d065
commit 21e2e4ac2b
11 changed files with 1010 additions and 10 deletions

@@ -10,10 +10,16 @@
 // - Lazy init with version-based refresh (follows tiny_front_v3_snapshot pattern)
 // - Learner interlock: tiny_metadata_cache_eff = cache && !learner
 //
+// E3-4 Extension: Constructor init to eliminate lazy check overhead
+// - ENV: HAKMEM_ENV_SNAPSHOT_CTOR=0/1 (default 0)
+// - When =1: Gate init runs in constructor (before main)
+// - Eliminates 3.22% lazy init check overhead
+//
 // Benefits:
 // - 3 TLS reads → 1 TLS read (66% reduction)
 // - 3 lazy init checks → 1 lazy init check
-// - Expected gain: +1-3% (conservative from 3.26% overhead)
+// - E3-4: Lazy init check → no check (constructor init)
+// - Expected gain: +1-3% (E1) + +0.5-1.5% (E3-4)
 #ifndef HAK_ENV_SNAPSHOT_BOX_H
 #define HAK_ENV_SNAPSHOT_BOX_H
@@ -47,18 +53,29 @@ static inline const HakmemEnvSnapshot* hakmem_env_snapshot(void) {
     return &g_hakmem_env_snapshot;
 }
+// E3-4: Global gate state (defined in hakmem_env_snapshot_box.c)
+extern int g_hakmem_env_snapshot_gate;
+extern int g_hakmem_env_snapshot_ctor_mode;
 // ENV gate: default OFF (research box, set =1 to enable)
+// E3-4: Dual-mode - constructor init (fast) or legacy lazy init (fallback)
 static inline bool hakmem_env_snapshot_enabled(void) {
-    static int g = -1;
-    if (__builtin_expect(g == -1, 0)) {
+    // E3-4 Fast path: constructor mode (no lazy check, just global read)
+    // Default is OFF, so ctor_mode==1 is UNLIKELY.
+    if (__builtin_expect(g_hakmem_env_snapshot_ctor_mode == 1, 0)) {
+        return g_hakmem_env_snapshot_gate != 0;
+    }
+    // Legacy path: lazy init (fallback when HAKMEM_ENV_SNAPSHOT_CTOR=0)
+    if (__builtin_expect(g_hakmem_env_snapshot_gate == -1, 0)) {
         const char* e = getenv("HAKMEM_ENV_SNAPSHOT");
         if (e && *e) {
-            g = (*e == '1') ? 1 : 0;
+            g_hakmem_env_snapshot_gate = (*e == '1') ? 1 : 0;
         } else {
-            g = 0; // default: OFF (research box)
+            g_hakmem_env_snapshot_gate = 0; // default: OFF (research box)
         }
     }
-    return g != 0;
+    return g_hakmem_env_snapshot_gate != 0;
 }
 #endif // HAK_ENV_SNAPSHOT_BOX_H
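
For readability, the post-change check can be read as the following standalone sketch, reconstructed from the hunk above with the removed lines dropped. The two globals are defined here only so the snippet compiles on its own (in the commit they are declared `extern` and live in the .c file), their initial values are assumptions, and the env-parsing branch is condensed into one expression with the same behavior.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in definitions; assumed initial values. */
int g_hakmem_env_snapshot_gate = -1;     /* -1 = not yet resolved */
int g_hakmem_env_snapshot_ctor_mode = 0; /* 1 = constructor already ran init */

static inline bool hakmem_env_snapshot_enabled(void) {
    /* Constructor mode: the gate was resolved before main(), so a plain
     * global read is enough. Hinted unlikely because the default is OFF. */
    if (__builtin_expect(g_hakmem_env_snapshot_ctor_mode == 1, 0)) {
        return g_hakmem_env_snapshot_gate != 0;
    }
    /* Legacy mode: resolve the gate from the environment on first call. */
    if (__builtin_expect(g_hakmem_env_snapshot_gate == -1, 0)) {
        const char* e = getenv("HAKMEM_ENV_SNAPSHOT");
        g_hakmem_env_snapshot_gate = (e && *e == '1') ? 1 : 0;
    }
    return g_hakmem_env_snapshot_gate != 0;
}
```

Note that in constructor mode the function trusts the gate to already hold 0 or 1; a gate still at -1 would read as enabled, so the constructor has to set both globals together.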