# Phase 66: PGO (FAST minimal, GCC+LTO) — Instructions
## Goal
Use GCC PGO **without changing the toolchain** (keep GCC + `-flto`) to reduce layout tax and improve inline/layout decisions for the FAST minimal benchmark binary.
## Principles (Box Theory)
- No “link-out” pruning for performance (layout tax risk).
- A/B must remain fair: same compiler, linker, and LTO settings; only the PGO profile differs.
- Fail-fast: a profile collection failure aborts the run.
## Workflow (Makefile SSOT)
### Full pipeline
```sh
make pgo-fast-full
```
This runs:
1. `make pgo-fast-profile`: builds the profile-gen binaries (FAST minimal)
2. `make pgo-fast-collect`: collects `.gcda` files by running deterministic workloads
3. `make pgo-fast-build`: builds the PGO-optimized binary and renames it to `bench_random_mixed_hakmem_minimal_pgo`
4. Runs the Mixed 10-run with `BENCH_BIN=./bench_random_mixed_hakmem_minimal_pgo`
### Manual steps (debug)
```sh
make pgo-fast-profile
make pgo-fast-collect
make pgo-fast-build
BENCH_BIN=./bench_random_mixed_hakmem_minimal_pgo scripts/run_mixed_10_cleanenv.sh
```
## Profile workloads (SSOT)
- Config file: `scripts/box/pgo_fast_profile_config.sh`
- Collector: `scripts/box/pgo_tiny_profile_box.sh`

The collector enforces a per-workload timeout and verifies that `.gcda` files were generated.

Important:
- PGO lives or dies by **the match between the training workload and the benchmark preset/ENV**.
- `scripts/box/pgo_fast_profile_config.sh` collects the profile via `scripts/run_mixed_10_cleanenv.sh` (to avoid a mismatch).
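
The timeout-and-verify behavior of the collector can be sketched roughly as below. This is not the contents of `scripts/box/pgo_tiny_profile_box.sh`; the function name `collect_one` and its arguments are hypothetical, and it assumes `timeout` from GNU coreutils is available.

```sh
#!/bin/sh
# Hypothetical helper (not the real collector): run one profiling workload
# under a time limit, then confirm that profile data was written.
# Usage: collect_one <timeout_seconds> <gcda_dir> <command> [args...]
collect_one() {
    limit="$1"; gcda_dir="$2"; shift 2

    # timeout(1) returns non-zero if the command fails or the limit is hit.
    if ! timeout "$limit" "$@"; then
        echo "profile workload failed or timed out: $*" >&2
        return 1
    fi

    # Fail fast if the instrumented run left no .gcda file behind.
    if [ -z "$(find "$gcda_dir" -name '*.gcda' -print | head -n 1)" ]; then
        echo "no .gcda files found under $gcda_dir" >&2
        return 1
    fi
}
```

A caller would stop the pipeline on the first failure, for example `collect_one 60 . ./some_profile_gen_binary || exit 1` (the binary name here is a placeholder). Note that this sketch does not detect stale `.gcda` files left over from an earlier run; clearing them before collection would be one way to cover that case.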
## GO / NO-GO
- GO: Mixed 10-run mean is **+1.0%** or more vs `bench_random_mixed_hakmem_minimal`
- NEUTRAL: within ±1.0% → keep as a research target (do not promote)
- NO-GO: -1.0% or worse → investigate profile mismatch / layout tax / workload coverage
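
As an illustration of the thresholds above, the classification could be scripted as follows. The helper name `pgo_verdict` is hypothetical and not part of the repository, and the sketch assumes the two inputs are 10-run means where a higher value is better.

```sh
#!/bin/sh
# Hypothetical helper: classify a PGO result against the GO / NO-GO thresholds.
# Usage: pgo_verdict <baseline_mean> <pgo_mean>   (assumes higher is better)
pgo_verdict() {
    awk -v base="$1" -v pgo="$2" 'BEGIN {
        if (base <= 0) { print "ERROR: baseline must be positive"; exit 2 }
        delta = (pgo - base) * 100 / base    # relative change in percent
        if (delta >= 1.0)       verdict = "GO"
        else if (delta <= -1.0) verdict = "NO-GO"
        else                    verdict = "NEUTRAL"
        printf "%s %+.2f%%\n", verdict, delta
    }'
}
```

Calling `pgo_verdict <baseline_mean> <pgo_mean>` prints the verdict followed by the signed percentage change, so the result can be logged next to the raw benchmark numbers.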