Incremental improvements: mid_desc cache, pool hotpath optimization, and doc updates
**Changes:**
- core/box/pool_api.inc.h: Code organization and micro-optimizations
- CURRENT_TASK.md: Updated Phase MD1 (mid_desc TLS cache: +3.2% for C6-heavy)
- docs/analysis files: Various analysis and documentation updates
- AGENTS.md: Agent role clarifications
- TINY_FRONT_V3_FLATTENING_GUIDE.md: Flattening strategy documentation

**Verification:**
- random_mixed_hakmem: 44.8M ops/s (1M iterations, 400 working set)
- No segfaults or assertions across all benchmark variants
- Stable performance across multiple runs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@@ -78,6 +78,22 @@ Throughput: **12.39M ops/s** (DEBUG/-O0 equivalent)
- Other: free/malloc/main account for a little over 30%; the header-write functions were buried in this run's debug logging and could not be confirmed.

Observations:

## Phase HF1 (DEBUG, front v3 + LUT + fast classify + mid_desc_cache ON)
- Build: `CFLAGS='-O0 -g' USE_LTO=0 OPT_LEVEL=0 NATIVE=0`
- ENV: `HAKMEM_PROFILE=MIXED_TINYV3_C7_SAFE`, `HAKMEM_MID_DESC_CACHE_ENABLED=1`
- Command: `perf record -F 5000 --call-graph dwarf -e cycles:u -o perf.data.tiny_mixed_hf1 ./bench_random_mixed_hakmem 1000000 400 1`
- Top self% entries (excerpt from perf_tiny_mixed_hf1.txt):
  - tiny_alloc_gate_fast 16.85%
  - free 13.63% / malloc 13.34% / main 9.02% (benchmark harness)
  - __memset_avx2_unaligned_erms 5.65% (initialization)
  - hak_super_registry_init 5.57% (initialization)
  - so_alloc_fast 2.41%, unified_cache_push 2.23%
  - tiny_front_v3_enabled 2.23%, tiny_front_v3_lut_lookup 2.21%
  - smallobject_hotbox_v3_can_own_c7 1.94%
  - tiny_region_id_write_header 1.82%
  - ss_map_lookup 1.61%, mid_desc_lookup_cached 0.98%, classify_ptr 0.65%
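Observations: After applying TF3 + mid_desc_cache, the self% of ss_map_lookup dropped to 1.6%, while tiny_region_id_write_header remains near the top at ~1.8%. The next candidates for trimming are reducing the number of header writes or pruning small branches in the early stage of the front.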
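To make the idea behind the mid_desc TLS cache concrete, here is a minimal sketch of a thread-local one-entry cache placed in front of a slower descriptor lookup. It is an illustration only, not hakmem's implementation: the names `demo_desc_t`, `demo_lookup_slow`, `demo_lookup_cached`, and `demo_run`, the table contents, and the one-entry policy are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor; hakmem's real layout is not shown here. */
typedef struct {
    uintptr_t base;      /* first address covered by the region */
    size_t    size;      /* region length in bytes */
    int       class_idx; /* size class served by the region */
} demo_desc_t;

/* Stand-in for a global registry (assumed data, illustration only). */
static const demo_desc_t g_table[] = {
    { 0x100000, 0x10000, 5 },
    { 0x200000, 0x10000, 6 },
    { 0x300000, 0x10000, 7 },
};

static unsigned long g_slow_calls; /* counts slow-path lookups */

/* Slow path: linear scan over the registry. */
static const demo_desc_t *demo_lookup_slow(uintptr_t p) {
    g_slow_calls++;
    for (size_t i = 0; i < sizeof g_table / sizeof g_table[0]; i++) {
        /* Unsigned subtraction wraps when p < base, so one compare covers both bounds. */
        if (p - g_table[i].base < g_table[i].size) return &g_table[i];
    }
    return NULL;
}

/* One cached descriptor per thread: no locking is needed on a hit. */
static _Thread_local const demo_desc_t *t_last;

static const demo_desc_t *demo_lookup_cached(uintptr_t p) {
    const demo_desc_t *d = t_last;
    if (d && p - d->base < d->size) return d; /* hit: skip the registry */
    d = demo_lookup_slow(p);                  /* miss: fall back */
    if (d) t_last = d;                        /* remember for the next call */
    return d;
}

/* Runs 1000 lookups in one region, then one in another; returns slow-path count. */
unsigned long demo_run(void) {
    g_slow_calls = 0;
    t_last = NULL;
    for (int i = 0; i < 1000; i++) {
        const demo_desc_t *d = demo_lookup_cached(0x200000 + (uintptr_t)i * 16);
        assert(d && d->class_idx == 6);
    }
    const demo_desc_t *d = demo_lookup_cached(0x300010);
    assert(d && d->class_idx == 7);
    return g_slow_calls;
}
```

In this sketch `demo_run()` reaches the slow path only twice for 1001 lookups, once per region touched. A workload that keeps hitting the same region benefits most, which would be consistent with the C6-heavy gain noted in the commit message, though the real cache may be organized differently.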
- Even with front v3 + LUT ON, `ss_map_lookup` / `hak_super_lookup` on the free side still account for roughly 11%, leaving considerable room to bypass them directly with FAST classify.
- `classify_ptr` is under 1%, but if it can be eliminated together with `ss_map_lookup`, the +5-10% target looks within reach.
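One way a fast classify step could avoid a registry lookup on the free path is to read a class tag from a one-byte header written in front of the user pointer at allocation time. The sketch below assumes such a scheme purely for illustration; the magic value, bit layout, and every `demo_*` name are hypothetical and are not taken from hakmem's `classify_ptr` or `tiny_region_id_write_header`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical header layout: high nibble = magic tag, low nibble = size class. */
#define DEMO_HDR_MAGIC  0xA0u
#define DEMO_HDR_MASK   0xF0u
#define DEMO_CLASS_MASK 0x0Fu

/* Allocation side: stamp the byte before the user area and return the user pointer. */
static void *demo_write_header(unsigned char *raw, unsigned class_idx) {
    raw[0] = (unsigned char)(DEMO_HDR_MAGIC | (class_idx & DEMO_CLASS_MASK));
    return raw + 1;
}

/* Free side: returns the class index (0..15), or -1 when the magic is absent,
 * in which case the caller would fall back to the slower registry lookup. */
static int demo_classify_fast(const void *user) {
    unsigned char h = ((const unsigned char *)user)[-1];
    if ((h & DEMO_HDR_MASK) != DEMO_HDR_MAGIC) return -1;
    return (int)(h & DEMO_CLASS_MASK);
}

/* Writes a header for class_idx into a local buffer and classifies it back. */
int demo_roundtrip(unsigned class_idx) {
    unsigned char buf[1 + 64];
    void *user = demo_write_header(buf, class_idx);
    return demo_classify_fast(user);
}

/* Classifies a pointer whose preceding byte carries no magic tag. */
int demo_foreign(void) {
    unsigned char buf[2] = { 0x00, 0x00 };
    return demo_classify_fast(buf + 1);
}
```

The trade-off in a design like this is that every allocation pays for one header store, while every free saves a map lookup when the tag matches. A real allocator also has to guarantee that the byte before any pointer passed to free is readable, which this sketch does not address.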