tomoaki/hakmem

Moe Charm (CI) d5e6ed535c P-Tier + Tiny Route Policy: Aggressive Superslab Management + Safe Routing

## Phase 1: Utilization-Aware Superslab Tiering (Plan B, implemented)

- Add ss_tier_box.h: Classify SuperSlabs into HOT/DRAINING/FREE based on utilization
  - HOT (>25%): Accept new allocations
  - DRAINING (≤25%): Drain only, no new allocs
  - FREE (0%): Ready for eager munmap

- Enhanced shared_pool_release_slab():
  - Check tier transition after each slab release
  - If tier→FREE: Force remaining slots to EMPTY and call superslab_free() immediately
  - Bypasses LRU cache to prevent registry bloat from accumulating DRAINING SuperSlabs

- Test results (bench_random_mixed_hakmem):
  - 1M iterations: ✅ ~1.03M ops/s (previously passed)
  - 10M iterations: ✅ ~1.15M ops/s (previously: registry full error)
  - 50M iterations: ✅ ~1.08M ops/s (stress test)
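
The tier thresholds listed above can be sketched as a pure classification function. This is a minimal sketch, not the contents of ss_tier_box.h: the names `ss_tier_t`, `ss_tier_classify`, and `ss_tier_accepts_alloc` are hypothetical, and it assumes utilization is measured as active blocks over capacity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: the real definitions live in ss_tier_box.h. */
typedef enum { SS_TIER_FREE, SS_TIER_DRAINING, SS_TIER_HOT } ss_tier_t;

/* Classify a SuperSlab from its block counts, using the thresholds
 * stated in the commit message: 0% -> FREE, <=25% -> DRAINING, >25% -> HOT.
 * Assumption: utilization = active_blocks / capacity. */
static inline ss_tier_t ss_tier_classify(uint32_t active_blocks, uint32_t capacity)
{
    assert(capacity > 0 && active_blocks <= capacity);
    if (active_blocks == 0)
        return SS_TIER_FREE;
    /* active/capacity <= 25% is the same as 4*active <= capacity,
     * which avoids both division and floating point. */
    if ((uint64_t)active_blocks * 4 <= capacity)
        return SS_TIER_DRAINING;
    return SS_TIER_HOT;
}

/* Only HOT SuperSlabs accept new allocations; DRAINING ones drain only. */
static inline int ss_tier_accepts_alloc(ss_tier_t tier)
{
    return tier == SS_TIER_HOT;
}
```

With a capacity of 64 blocks, 16 active blocks is exactly 25% and still counts as DRAINING, while 17 is HOT.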

## Phase 2: Tiny Front Routing Policy (new Box)

- Add tiny_route_box.h/c: Single 8-byte table for class→routing decisions
  - ROUTE_TINY_ONLY: Tiny front exclusive (no fallback)
  - ROUTE_TINY_FIRST: Try Tiny, fallback to Pool if fails
  - ROUTE_POOL_ONLY: Skip Tiny entirely

- Profiles via HAKMEM_TINY_PROFILE ENV:
  - "hot": C0-C3=TINY_ONLY, C4-C6=TINY_FIRST, C7=POOL_ONLY
  - "conservative" (default): All TINY_FIRST
  - "off": All POOL_ONLY (disable Tiny)
  - "full": All TINY_ONLY (microbench mode)

- A/B test results (ws=256, 100k ops random_mixed):
  - Default (conservative): ~2.90M ops/s
  - hot: ~2.65M ops/s (more conservative)
  - off: ~2.86M ops/s
  - full: ~2.98M ops/s (slightly best)
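
The profile-to-table mapping described above could look roughly like the following. This is a sketch under stated assumptions, not the code in tiny_route_box.c: `g_tiny_route`, `tiny_route_apply_profile`, and `tiny_route_init_from_env` are hypothetical names, the enum values are arbitrary, and treating an unknown or unset profile as "conservative" is an assumption based on that profile being the default.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names and values; the real ones are in tiny_route_box.h/c. */
enum { ROUTE_TINY_ONLY = 0, ROUTE_TINY_FIRST = 1, ROUTE_POOL_ONLY = 2 };
#define TINY_NUM_CLASSES 8

/* One byte per size class C0..C7: the whole policy fits in 8 bytes. */
static uint8_t g_tiny_route[TINY_NUM_CLASSES];
static_assert(sizeof(g_tiny_route) == 8, "one byte per class, 8 bytes total");

/* Fill the table for a named profile. Unknown or NULL names fall back
 * to "conservative" (assumption: that is how the default is applied). */
static inline void tiny_route_apply_profile(const char *profile)
{
    uint8_t fill = ROUTE_TINY_FIRST;                 /* "conservative" */
    if (profile && strcmp(profile, "off") == 0)
        fill = ROUTE_POOL_ONLY;
    if (profile && strcmp(profile, "full") == 0)
        fill = ROUTE_TINY_ONLY;
    memset(g_tiny_route, fill, sizeof(g_tiny_route));

    if (profile && strcmp(profile, "hot") == 0) {
        for (int c = 0; c <= 3; c++)
            g_tiny_route[c] = ROUTE_TINY_ONLY;       /* C0-C3 */
        /* C4-C6 keep TINY_FIRST from the memset above. */
        g_tiny_route[7] = ROUTE_POOL_ONLY;           /* C7 */
    }
}

/* Read the profile once at startup from the HAKMEM_TINY_PROFILE variable. */
static inline void tiny_route_init_from_env(void)
{
    tiny_route_apply_profile(getenv("HAKMEM_TINY_PROFILE"));
}
```

Because the table is built once, the hot path only needs a single byte load per allocation to learn its route.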

## Design Rationale

### Registry Pressure Fix (Plan B)
- Problem: DRAINING tier SS occupied registry indefinitely
- Solution: When total_active_blocks→0, immediately free to clear registry slot
- Result: No more "registry full" errors under stress
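
The eager-free rule can be illustrated with a simplified model. This is only a sketch: `superslab_sketch_t`, `superslab_free_sketch`, and `release_and_maybe_free` are hypothetical stand-ins, and the real shared_pool_release_slab() and superslab_free() do considerably more (slot state, locking, munmap).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified SuperSlab: only the fields this sketch needs. */
typedef struct {
    uint32_t total_active_blocks;
    bool     in_registry;   /* true while it occupies a registry slot */
} superslab_sketch_t;

/* Stand-in for superslab_free(): here it only gives the slot back. */
static inline void superslab_free_sketch(superslab_sketch_t *ss)
{
    ss->in_registry = false;
}

/* Release `blocks` blocks. Returns true when the SuperSlab became empty
 * and was freed immediately instead of being parked in an LRU cache. */
static inline bool release_and_maybe_free(superslab_sketch_t *ss, uint32_t blocks)
{
    assert(blocks <= ss->total_active_blocks);
    ss->total_active_blocks -= blocks;
    if (ss->total_active_blocks != 0)
        return false;           /* still HOT or DRAINING: stays registered */
    superslab_free_sketch(ss);  /* FREE: clear the registry slot right away */
    return true;
}
```

The point of the design is that a SuperSlab never lingers in the registry once its last block is gone, so the registry cannot fill up with empty entries.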

### Routing Policy Box (new)
- Problem: Tiny front optimization scattered across ENV/branches
- Solution: Centralize routing in single table, select profiles via ENV
- Benefit: Safe A/B testing without touching hot path code
- Future: Integrate with RSS budget/learning layers for dynamic profile switching
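
To show why a central table keeps the hot path untouched, here is a sketch of a dispatcher that consults it. Everything here is an assumption for illustration: `route_alloc`, `alloc_fn`, and the two demo backends are hypothetical, and the real Tiny and Pool entry points have different signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values; the real enum is in tiny_route_box.h. */
enum { ROUTE_TINY_ONLY = 0, ROUTE_TINY_FIRST = 1, ROUTE_POOL_ONLY = 2 };

/* Stand-in signature for a backend allocator taking a size class. */
typedef void *(*alloc_fn)(int class_idx);

/* Dispatch one allocation according to the route stored for its class. */
static inline void *route_alloc(const uint8_t route_table[8], int class_idx,
                                alloc_fn tiny, alloc_fn pool)
{
    assert(class_idx >= 0 && class_idx < 8);
    switch (route_table[class_idx]) {
    case ROUTE_TINY_ONLY:
        return tiny(class_idx);           /* no fallback, may return NULL */
    case ROUTE_POOL_ONLY:
        return pool(class_idx);           /* Tiny is skipped entirely */
    case ROUTE_TINY_FIRST:
    default: {
        void *p = tiny(class_idx);
        return p ? p : pool(class_idx);   /* fall back only on failure */
    }
    }
}

/* Demo backends for trying the dispatcher: Tiny always fails, Pool
 * always succeeds by returning the address of a static byte. */
static char demo_pool_byte;
static inline void *demo_tiny_fail(int class_idx) { (void)class_idx; return NULL; }
static inline void *demo_pool_ok(int class_idx)   { (void)class_idx; return &demo_pool_byte; }
```

Switching profiles changes only the bytes in the table; the dispatcher itself stays the same, which is what makes A/B runs safe.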

## Next Steps (performance optimization)
- Profile Tiny front internals (TLS SLL, FastCache, Superslab backend latency)
- Identify the bottleneck behind the gap between the current ~2.9M ops/s and mimalloc's ~100M ops/s
- Consider:
  - Reduce shared pool lock contention
  - Optimize unified cache hit rate
  - Streamline Superslab carving logic

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-12-04 18:01:25 +09:00

P0 Optimization: Shared Pool fast path with O(1) metadata lookup

2025-12-04 16:21:54 +09:00

P0 Optimization: Shared Pool fast path with O(1) metadata lookup

2025-12-04 16:21:54 +09:00

WIP: Add TLS SLL validation and SuperSlab registry fallback

2025-12-03 20:42:28 +09:00

Phase 4-Step1: Add PGO workflow automation (+6.25% performance)

2025-11-29 11:28:38 +09:00

WIP: Add TLS SLL validation and SuperSlab registry fallback

2025-12-03 20:42:28 +09:00

Debug Counters Implementation - Clean History

2025-11-05 12:31:14 +09:00

P-Tier + Tiny Route Policy: Aggressive Superslab Management + Safe Routing

2025-12-04 18:01:25 +09:00

Doc: Add debug ENV consolidation plan and survey

2025-11-29 06:58:12 +09:00

ACE_LEARNING_LAYER_PLAN.md

Debug Counters Implementation - Clean History

2025-11-05 12:31:14 +09:00

ACE_LEARNING_LAYER.md

WIP: Add TLS SLL validation and SuperSlab registry fallback

2025-12-03 20:42:28 +09:00

ANALYSIS_SYMPTOM_PROLIFERATION.md

Critical analysis: symptom suppression vs root cause elimination

2025-12-04 03:09:28 +09:00

BENCH_REPORT_2025_11_09.md

Tiny: Enable P0→FC direct path for class7 (1KB) by default + docs

2025-11-09 23:15:02 +09:00

BREAKTHROUGH_STABILITY_ACHIEVED.md

Document breakthrough: sh8bench stability achieved with SuperSlab refcount pinning

2025-12-03 21:57:36 +09:00

BUILD_PHASE7_POOL_TLS.md

Tiny: fix header/stride mismatch and harden refill paths

2025-11-09 18:55:50 +09:00

BUILDING_QUICKSTART.md

Tiny: fix remote sentinel leak → SEGV; add defense-in-depth; PoolTLS: refill-boundary remote drain; build UX help; quickstart docs

2025-11-09 16:49:34 +09:00

CHATGPT_CONTEXT_SUMMARY.md

Add comprehensive ChatGPT handoff documentation for TLS SLL diagnosis

2025-12-03 20:41:34 +09:00

CHATGPT_HANDOFF_TLS_DIAGNOSIS.md

Add comprehensive ChatGPT handoff documentation for TLS SLL diagnosis

2025-12-03 20:41:34 +09:00

CHATGPT_PROGRESS_AND_ISSUES.md

Add ChatGPT progress analysis and remaining issues documentation

2025-12-03 20:44:18 +09:00

CRASH_180s_INVESTIGATION_GUIDE.md

Fix critical integer overflow bug in TLS SLL trace counters

2025-12-04 10:38:19 +09:00

CRITICAL_DISCOVERY_TLS_HEAD_CORRUPTION.md

Document critical discovery: TLS head corruption is not offset issue

2025-12-03 21:02:04 +09:00

DEFENSIVE_LAYERS_MAPPING.md

Add defensive layers mapping and diagnostic logging enhancements

2025-12-04 04:15:10 +09:00

DOCS_REORG_PLAN.md

ENV cleanup: Remove BG/HotMag vars & guard fprintf (Larson 52.3M ops/s)

2025-11-26 14:45:26 +09:00

FINAL_ROOT_CAUSE_AND_RESOLUTION.md

Add comprehensive final report on root cause fix

2025-12-04 05:40:50 +09:00

FREE_SAFETY.md

Debug Counters Implementation - Clean History

2025-11-05 12:31:14 +09:00

GEMINI_HANDOFF_SUMMARY.md

Add comprehensive ChatGPT handoff documentation for TLS SLL diagnosis

2025-12-03 20:41:34 +09:00

HEADERLESS_STABILITY_DEBUG_INSTRUCTIONS.md

Add comprehensive ChatGPT handoff documentation for TLS SLL diagnosis

2025-12-03 20:41:34 +09:00

INDEX.md

ENV cleanup: Remove BG/HotMag vars & guard fprintf (Larson 52.3M ops/s)

2025-11-26 14:45:26 +09:00

INTEGER_OVERFLOW_BUG_FIX.md

Fix critical integer overflow bug in TLS SLL trace counters

2025-12-04 10:38:19 +09:00

PERF_ANALYSIS_TINY_MIXED.md

Debug Counters Implementation - Clean History

2025-11-05 12:31:14 +09:00

PHASE1_TLS_HINT_BENCHMARK.md

Implement Phase 1: TLS SuperSlab Hint Box for Headerless performance

2025-12-03 18:06:24 +09:00

PHASE2_BENCHMARK_RESULTS.md

Add Phase 2 benchmark results: Headerless ON/OFF comparison

2025-12-03 17:23:32 +09:00

PHASE2_HEADERLESS_INSTRUCTION_FOR_GEMINI.md

Add Phase 2 Headerless implementation instruction for Gemini

2025-12-03 11:41:34 +09:00

PHASE_E2_EXECUTIVE_SUMMARY.md

Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets

2025-11-13 06:50:20 +09:00

PHASE_E2_REGRESSION_ANALYSIS.md

Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets

2025-11-13 06:50:20 +09:00

PHASE_E2_VISUAL_COMPARISON.md

Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets

2025-11-13 06:50:20 +09:00

PHASE_E3_IMPLEMENTATION_PLAN.md

Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets

2025-11-13 06:50:20 +09:00

RAPID_DIAGNOSIS_CANARY_SANDWICH.md

Fix critical integer overflow bug in TLS SLL trace counters

2025-12-04 10:38:19 +09:00

README_HANDOFF_CHATGPT.md

Add comprehensive ChatGPT handoff documentation for TLS SLL diagnosis

2025-12-03 20:41:34 +09:00

README.md

Debug Counters Implementation - Clean History

2025-11-05 12:31:14 +09:00

REFACTOR_PLAN_GEMINI_ENHANCED.md

Update REFACTOR_PLAN to mark Phase 2 complete and document Magazine Spill fix

2025-12-03 17:16:19 +09:00

REFACTORING_INSTRUCTION_FOR_GEMINI.md

Add detailed refactoring instruction for Gemini - Phase 1 implementation

2025-12-03 11:20:59 +09:00

SEGFAULT_INVESTIGATION_FOR_GEMINI.md

Add comprehensive ChatGPT handoff documentation for TLS SLL diagnosis

2025-12-03 20:41:34 +09:00

SEGV_INVESTIGATION.md

Implement Phase 2: Headerless Allocator Support (Partial)

2025-12-03 12:11:27 +09:00

SESSION_SUMMARY_2025_12_04_INTEGER_OVERFLOW_FIX.md

Implement Phantom typing for Tiny FastCache layer

2025-12-04 11:05:06 +09:00

SESSION_SUMMARY_2025_12_04.md

Add comprehensive session summary: root cause fix + Box theory implementation

2025-12-04 06:12:47 +09:00

sh8bench_debug_instruction.md

sh8bench fix: LRU registry non-registration issue + self-heal repair

2025-12-03 09:15:59 +09:00

SLAB_HANDLE.md

Debug Counters Implementation - Clean History

2025-11-05 12:31:14 +09:00

STATUS_2025_12_03_CURRENT.md

Add comprehensive ChatGPT handoff documentation for TLS SLL diagnosis

2025-12-03 20:41:34 +09:00

TINY_C7_1KB_SEGV_TRIAGE.md

Tiny C7(1KB) SEGV triage hardening: always-on lightweight free-time guards for headerless class7 in both hak_tiny_free_with_slab and superslab free path (alignment/range check, fail-fast via SIGUSR2). Leave C7 P0/direct-FC gated OFF by default. Add docs/TINY_C7_1KB_SEGV_TRIAGE.md for Claude with repro matrix, hypotheses, instrumentation and acceptance criteria.

2025-11-10 01:59:11 +09:00

TINY_MODULARIZATION_ANALYSIS.md

Debug Counters Implementation - Clean History

2025-11-05 12:31:14 +09:00

TINY_MODULARIZATION_SUMMARY.md

Debug Counters Implementation - Clean History

2025-11-05 12:31:14 +09:00

TINY_P0_BATCH_REFILL.md

ENV cleanup: Remove BG/HotMag vars & guard fprintf (Larson 52.3M ops/s)

2025-11-26 14:45:26 +09:00

TINY_REDESIGN_CHECKLIST.md

Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup

2025-11-28 13:42:39 +09:00

tls_sll_hdr_reset_final_report.md

Add final investigation report for TLS_SLL_HDR_RESET

2025-12-03 11:14:59 +09:00

tls_sll_hdr_reset_for_gemini.md

Save current state before investigating TLS_SLL_HDR_RESET

2025-12-03 10:34:39 +09:00

tls_sll_hdr_reset_investigation_report.md

Fix TLS SLL race condition with atomic fence and report investigation results

2025-12-03 10:57:16 +09:00

tls_sll_hdr_reset_investigation_v2.md

Save current state before investigating TLS_SLL_HDR_RESET

2025-12-03 10:34:39 +09:00

TLS_SLL_HEADER_CORRUPTION_DIAGNOSIS.md

Add comprehensive ChatGPT handoff documentation for TLS SLL diagnosis

2025-12-03 20:41:34 +09:00

tls_sll_header_corruption_investigation.md

Save current state before investigating TLS_SLL_HDR_RESET

2025-12-03 10:34:39 +09:00

TLS_SS_HINT_BOX_DESIGN.md

Add comprehensive ChatGPT handoff documentation for TLS SLL diagnosis

2025-12-03 20:41:34 +09:00

README.md

Docs Overview

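This folder is where hakmem's design, measurement, and operations notes are organized and managed.

INDEX.md: table of contents (links to each document)
benchmarks/: benchmark procedures and storage for sweep results
specs/: consolidated current specifications (SACS‑3/HW/ENV)
roadmap/: upcoming implementation plans, priorities, and tasks

Operating rules (proposed)

One file (or one folder) per unit of change or measurement
Always record the reproduction commands, environment variables, and hardware configuration
Save large continuous output to a file, and include only excerpts or a summary in the body text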