Phase 68: PGO training set diversification (seed/WS expansion)

Changes:
- scripts/box/pgo_fast_profile_config.sh: Expanded WS patterns (3→5) and seeds (1→3)
  to reduce overfitting and better represent production workloads
- PERFORMANCE_TARGETS_SCORECARD.md: Phase 68 baseline promoted (61.614M = 50.93%)
- CURRENT_TASK.md: Phase 68 marked complete, Phase 67a (layout tax forensics) set Active

Results:
- 10-run verification: +1.19% vs Phase 66 baseline (GO, >+1.0% threshold)
- M1 milestone: 50.93% of mimalloc (target 50%, exceeded by +0.93pp)
- Stability: 10-run mean/median with <2.1% CV
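
The GO decision above rests on three figures: a 10-run mean/median, a relative gain against the previous baseline, and a CV bound. As a rough sketch of that arithmetic only (the names `run_stats`, `cv_within`, `gain_pct`, and `is_go` are hypothetical and not taken from the repository, and this is not the project's actual benchmark script), the check might look like:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical summary of N benchmark runs (e.g. throughput in M ops/s). */
typedef struct {
    double mean;
    double median;
    double var;    /* sample variance (n - 1 denominator) */
} run_stats_t;

static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Fills *out from runs[0..n-1]. Returns 0 on success, -1 on bad input. */
int run_stats(const double *runs, size_t n, run_stats_t *out) {
    if (!runs || !out || n < 2) return -1;
    double *sorted = malloc(n * sizeof *sorted);
    if (!sorted) return -1;
    memcpy(sorted, runs, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, cmp_double);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) sum += sorted[i];
    double mean = sum / (double)n;

    double ss = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = sorted[i] - mean;
        ss += d * d;
    }

    out->mean = mean;
    out->median = (n % 2) ? sorted[n / 2]
                          : 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);
    out->var = ss / (double)(n - 1);
    free(sorted);
    return 0;
}

/* True if CV = stddev / mean is below limit_pct percent.
 * Compares squared quantities, so no sqrt() (and no libm) is needed. */
int cv_within(const run_stats_t *s, double limit_pct) {
    double bound = limit_pct * s->mean / 100.0;
    return s->mean > 0.0 && s->var < bound * bound;
}

/* Relative gain of the candidate over the baseline, in percent. */
double gain_pct(double candidate_mean, double baseline_mean) {
    assert(baseline_mean > 0.0);
    return 100.0 * (candidate_mean - baseline_mean) / baseline_mean;
}

/* GO only if the gain strictly exceeds the threshold (assumed +1.0% here). */
int is_go(double gain, double threshold_pct) {
    return gain > threshold_pct;
}
```

With the thresholds quoted in this message, a caller would compute `run_stats` over the 10 runs, then require `is_go(gain_pct(new_mean, old_mean), 1.0)` and `cv_within(&stats, 2.1)`.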

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Moe Charm (CI)
2025-12-17 21:08:17 +09:00
parent 10fb0497e2
commit 84f5034e45
44 changed files with 1520 additions and 583 deletions

@@ -11,6 +11,7 @@
#include "tiny_c7_hotbox.h" // tiny_c7_alloc_fast wrapper
#include "mid_hotbox_v3_box.h" // Phase MID-V3: Mid/Pool HotBox v3 types
#include "mid_hotbox_v3_env_box.h" // Phase MID-V3: ENV gate for v3
#include "../hakmem_build_flags.h" // Phase 64: For backend pruning
#ifdef HAKMEM_POOL_TLS_PHASE1
#include "../pool_tls.h"
@@ -79,6 +80,7 @@ inline void* hak_alloc_at(size_t size, hak_callsite_t site) {
// Design: TLS lane cache with page-based allocation, RegionIdBox integration
// NOTE: Must come BEFORE Tiny to intercept specific size classes
// PERF: C6 shows +11% improvement, Mixed (257-768B) shows +19.8% improvement
#if !HAKMEM_FAST_PROFILE_PRUNE_BACKENDS
if (__builtin_expect(mid_v3_enabled() && size >= 257 && size <= 768, 0)) {
static _Atomic int entry_log_count = 0;
if (mid_v3_debug_enabled() && atomic_fetch_add(&entry_log_count, 1) < 3) {
@@ -115,6 +117,7 @@ inline void* hak_alloc_at(size_t size, hak_callsite_t site) {
}
}
}
#endif
// Phase 16: Dynamic Tiny max size (ENV: HAKMEM_TINY_MAX_CLASS)
// Default: 1023B (C0-C7), reduced to 255B (C0-C5) when Small-Mid enabled