# Pool TLS + Learning Implementation Checklist

## Pre-Implementation Review

### Contract Understanding
- [ ] Read and understand all 4 contracts (A-D) in POOL_TLS_LEARNING_DESIGN.md
- [ ] Identify which contract applies to each code section
- [ ] Review enforcement strategies for each contract

## Phase 1: Ultra-Simple TLS Implementation

### Box 1: TLS Freelist (pool_tls.c)

#### Setup
- [ ] Create `core/pool_tls.c` and `core/pool_tls.h`
- [ ] Define TLS globals: `__thread void* g_tls_pool_head[POOL_SIZE_CLASSES]`
- [ ] Define TLS counts: `__thread uint32_t g_tls_pool_count[POOL_SIZE_CLASSES]`
- [ ] Define default refill counts array

#### Hot Path Implementation
- [ ] Implement `pool_alloc_fast()` - must be 5-6 instructions max
  - [ ] Pop from TLS freelist
  - [ ] Conditional header write (if enabled)
  - [ ] Call refill only on miss
- [ ] Implement `pool_free_fast()` - must be 5-6 instructions max
  - [ ] Header validation (if enabled)
  - [ ] Push to TLS freelist
  - [ ] Optional drain check
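
The hot-path items above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the value of `POOL_SIZE_CLASSES`, the convention that a free block stores its next pointer in its first word, and the stubbed `pool_refill_and_alloc()` are all assumptions, and the optional header write, header validation, and drain check are left out. `__thread` and `__builtin_expect` are GCC/Clang extensions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE_CLASSES 8   /* assumed value; the checklist does not specify it */

/* Per-thread freelist heads and block counts, one slot per size class. */
static __thread void*    g_tls_pool_head[POOL_SIZE_CLASSES];
static __thread uint32_t g_tls_pool_count[POOL_SIZE_CLASSES];

/* Stub for the slow path; the real one belongs to Box 2 (pool_refill.c). */
static void* pool_refill_and_alloc(int class_idx) {
    (void)class_idx;
    return NULL;
}

/* Pop the head block; fall through to the refill path only on a miss. */
static inline void* pool_alloc_fast(int class_idx) {
    assert(class_idx >= 0 && class_idx < POOL_SIZE_CLASSES);
    void* head = g_tls_pool_head[class_idx];
    if (__builtin_expect(head == NULL, 0))
        return pool_refill_and_alloc(class_idx);
    g_tls_pool_head[class_idx] = *(void**)head;  /* next pointer lives in the block */
    g_tls_pool_count[class_idx]--;
    return head;
}

/* Push the block back onto this thread's freelist. */
static inline void pool_free_fast(void* ptr, int class_idx) {
    assert(ptr != NULL && class_idx >= 0 && class_idx < POOL_SIZE_CLASSES);
    *(void**)ptr = g_tls_pool_head[class_idx];
    g_tls_pool_head[class_idx] = ptr;
    g_tls_pool_count[class_idx]++;
}
```

Because the list is LIFO, the most recently freed block is the next one returned, which tends to keep the hot block in cache. The `assert` calls compile away under `NDEBUG`, so they do not count against the instruction budget in release builds.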

#### Contract D Validation
- [ ] Verify Box1 has NO learning code
- [ ] Verify Box1 has NO metrics collection
- [ ] Verify Box1 only exposes public API and internal chain installer
- [ ] No includes of ace_learning.h or pool_refill.h in pool_tls.c

#### Testing
- [ ] Unit test: Allocation/free correctness
- [ ] Performance test: Target 40-60M ops/s
- [ ] Verify hot path is < 10 instructions with objdump

### Box 2: Refill Engine (pool_refill.c)

#### Setup
- [ ] Create `core/pool_refill.c` and `core/pool_refill.h`
- [ ] Import only pool_tls.h public API
- [ ] Define refill statistics (miss streak, etc.)

#### Refill Implementation
- [ ] Implement `pool_refill_and_alloc()`
  - [ ] Capture pre-refill state
  - [ ] Get refill count (default for Phase 1)
  - [ ] Batch allocate from backend
  - [ ] Install chain in TLS
  - [ ] Return first block
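
The five refill steps above might look like the sketch below. Everything beyond the name `pool_refill_and_alloc()` is an assumption made for illustration: the `pool_install_chain()` helper stands in for the "internal chain installer" of Box 1, `backend_batch_alloc()` is a hypothetical backend that carves blocks out of one `malloc`'d slab (which this sketch never frees), and the `64 << class_idx` size mapping and `DEFAULT_REFILL_COUNT` value are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_SIZE_CLASSES    8     /* assumed */
#define DEFAULT_REFILL_COUNT 32u   /* assumed Phase 1 default */

static __thread void*    g_tls_pool_head[POOL_SIZE_CLASSES];
static __thread uint32_t g_tls_pool_count[POOL_SIZE_CLASSES];

/* Hypothetical Box 1 chain installer: the only internal entry point Box 2 uses. */
static void pool_install_chain(int class_idx, void* chain, uint32_t count) {
    g_tls_pool_head[class_idx]  = chain;
    g_tls_pool_count[class_idx] = count;
}

/* Hypothetical backend: one slab, split into `count` blocks linked head to tail. */
static void* backend_batch_alloc(size_t block_size, uint32_t count) {
    char* slab = malloc(block_size * count);
    if (slab == NULL) return NULL;
    for (uint32_t i = 0; i < count; i++) {
        char* block = slab + (size_t)i * block_size;
        *(void**)block = (i + 1 < count) ? (void*)(block + block_size) : NULL;
    }
    return slab;
}

void* pool_refill_and_alloc(int class_idx) {
    assert(class_idx >= 0 && class_idx < POOL_SIZE_CLASSES);
    assert(g_tls_pool_head[class_idx] == NULL);        /* called only on a miss */

    uint32_t pre_count = g_tls_pool_count[class_idx];  /* 1. capture pre-refill state */
    (void)pre_count;                                   /*    (would feed the Phase 3 event) */
    uint32_t want = DEFAULT_REFILL_COUNT;              /* 2. refill count */
    size_t block_size = (size_t)64 << class_idx;       /*    assumed size-class mapping */

    void* first = backend_batch_alloc(block_size, want);  /* 3. batch allocate */
    if (first == NULL) return NULL;

    pool_install_chain(class_idx, *(void**)first, want - 1);  /* 4. install the rest */
    return first;                                              /* 5. return first block */
}
```

Handing the first block straight to the caller and installing only the remaining `want - 1` blocks avoids a redundant push followed by an immediate pop.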

#### Contract B Validation
- [ ] Verify refill NEVER blocks waiting for policy
- [ ] Verify refill only reads atomic policy values
- [ ] No immediate cache manipulation

#### Contract C Validation
- [ ] Event created on stack
- [ ] Event data copied, not referenced
- [ ] No dynamic allocation for events

## Phase 2: Metrics Collection

### Metrics Addition
- [ ] Add hit/miss counters to TLS state
- [ ] Add miss streak tracking
- [ ] Instrument hot path (with ifdef guard)
- [ ] Implement `pool_print_stats()`
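
One way to keep the instrumentation out of release builds is to hide the counters behind macros that compile to nothing when metrics are off. The sketch below assumes a hypothetical `POOL_ENABLE_METRICS` flag, a hypothetical `PoolTlsStats` layout, and hypothetical `POOL_STAT_HIT`/`POOL_STAT_MISS` macro names; only `pool_print_stats()` is named in the checklist.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define POOL_SIZE_CLASSES 8        /* assumed */
#ifndef POOL_ENABLE_METRICS
#define POOL_ENABLE_METRICS 1      /* hypothetical build flag */
#endif

#if POOL_ENABLE_METRICS
typedef struct {
    uint64_t hits;
    uint64_t misses;
    uint32_t miss_streak;          /* consecutive misses since the last hit */
    uint32_t max_miss_streak;
} PoolTlsStats;

static __thread PoolTlsStats g_tls_pool_stats[POOL_SIZE_CLASSES];

#define POOL_STAT_HIT(c) \
    do { g_tls_pool_stats[(c)].hits++; g_tls_pool_stats[(c)].miss_streak = 0; } while (0)
#define POOL_STAT_MISS(c) \
    do { \
        PoolTlsStats* s_ = &g_tls_pool_stats[(c)]; \
        s_->misses++; \
        if (++s_->miss_streak > s_->max_miss_streak) \
            s_->max_miss_streak = s_->miss_streak; \
    } while (0)
#else
#define POOL_STAT_HIT(c)  ((void)0)
#define POOL_STAT_MISS(c) ((void)0)
#endif

/* Prints the calling thread's counters only; classes with no traffic are skipped. */
void pool_print_stats(void) {
#if POOL_ENABLE_METRICS
    for (int c = 0; c < POOL_SIZE_CLASSES; c++) {
        const PoolTlsStats* s = &g_tls_pool_stats[c];
        if (s->hits + s->misses == 0) continue;
        printf("class %d: hits=%" PRIu64 " misses=%" PRIu64 " max_miss_streak=%" PRIu32 "\n",
               c, s->hits, s->misses, s->max_miss_streak);
    }
#endif
}
```

Since the counters are thread-local, the hot path needs no atomics; aggregating across threads would be a separate step that this sketch does not cover.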

### Performance Validation
- [ ] Measure regression with metrics enabled
- [ ] Must be < 2% performance impact
- [ ] Verify counters are accurate

## Phase 3: Learning Integration

### Box 3: ACE Learning (ace_learning.c)

#### Setup
- [ ] Create `core/ace_learning.c` and `core/ace_learning.h`
- [ ] Pre-allocate event ring buffer: `RefillEvent g_event_pool[QUEUE_SIZE]`
- [ ] Initialize MPSC queue structure
- [ ] Define policy table: `_Atomic uint32_t g_refill_policies[CLASSES]`

#### MPSC Queue Implementation
- [ ] Implement `ace_push_event()`
  - [ ] Contract A: Check for full queue
  - [ ] Contract A: DROP if full (never block!)
  - [ ] Contract A: Track drops with counter
  - [ ] Contract C: COPY event to ring buffer
  - [ ] Use proper memory ordering
- [ ] Implement `ace_consume_events()`
  - [ ] Read events with acquire semantics
  - [ ] Process and release slots
  - [ ] Sleep when queue empty
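
A bounded queue with per-slot sequence numbers is one common way to satisfy these items, and the sketch below uses it as a stand-in; the design document may specify something different. The `RefillEvent` fields, the `QUEUE_SIZE` value, `ace_queue_init()`, the `g_slot_seq` array, and the counter names are all assumptions. The push path retries a compare-and-swap when producers race, so it is lock-free rather than wait-free, but it never waits for the consumer: a full queue results in a counted drop. Sleeping on an empty queue is left to the caller of `ace_consume_events()`.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SIZE 256u   /* assumed; must be a power of two */
#define QUEUE_MASK (QUEUE_SIZE - 1u)
_Static_assert((QUEUE_SIZE & QUEUE_MASK) == 0, "QUEUE_SIZE must be a power of two");

typedef struct {          /* hypothetical event layout */
    uint32_t class_idx;
    uint32_t refill_count;
    uint32_t miss_streak;
} RefillEvent;

static RefillEvent      g_event_pool[QUEUE_SIZE];  /* pre-allocated ring (Contract C) */
static _Atomic uint64_t g_slot_seq[QUEUE_SIZE];    /* per-slot ownership ticket */
static _Atomic uint64_t g_tail;                    /* next position producers claim */
static uint64_t         g_head;                    /* touched by the consumer only */
static _Atomic uint64_t g_event_drops;             /* Contract A: drop counter */

void ace_queue_init(void) {
    for (uint64_t i = 0; i < QUEUE_SIZE; i++)
        atomic_store_explicit(&g_slot_seq[i], i, memory_order_relaxed);
    atomic_store_explicit(&g_tail, 0, memory_order_relaxed);
    g_head = 0;
}

/* Returns false (and counts a drop) when the ring is full. Never waits. */
bool ace_push_event(const RefillEvent* ev) {
    uint64_t pos = atomic_load_explicit(&g_tail, memory_order_relaxed);
    for (;;) {
        uint64_t seq = atomic_load_explicit(&g_slot_seq[pos & QUEUE_MASK], memory_order_acquire);
        int64_t diff = (int64_t)(seq - pos);
        if (diff == 0) {
            /* Slot is free: try to claim it. On failure `pos` is reloaded. */
            if (atomic_compare_exchange_weak_explicit(&g_tail, &pos, pos + 1,
                    memory_order_relaxed, memory_order_relaxed))
                break;
        } else if (diff < 0) {
            atomic_fetch_add_explicit(&g_event_drops, 1, memory_order_relaxed);
            return false;                       /* full: DROP */
        } else {
            pos = atomic_load_explicit(&g_tail, memory_order_relaxed);
        }
    }
    g_event_pool[pos & QUEUE_MASK] = *ev;       /* COPY, never store the pointer */
    atomic_store_explicit(&g_slot_seq[pos & QUEUE_MASK], pos + 1, memory_order_release);
    return true;
}

/* Single consumer: copies out up to `max` events and releases their slots. */
size_t ace_consume_events(RefillEvent* out, size_t max) {
    size_t n = 0;
    while (n < max) {
        uint64_t seq = atomic_load_explicit(&g_slot_seq[g_head & QUEUE_MASK], memory_order_acquire);
        if (seq != g_head + 1) break;           /* empty, or producer still writing */
        out[n++] = g_event_pool[g_head & QUEUE_MASK];
        atomic_store_explicit(&g_slot_seq[g_head & QUEUE_MASK], g_head + QUEUE_SIZE,
                              memory_order_release);
        g_head++;
    }
    return n;
}
```

The sequence number doubles as the slot ownership model: a value equal to the position means the slot belongs to producers, position plus one means it holds a published event, and position plus `QUEUE_SIZE` means it has been handed back for the next lap.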

#### Contract A Validation
- [ ] Push function NEVER blocks
- [ ] Drops are tracked
- [ ] Drop rate monitoring implemented
- [ ] Warning issued if drop rate > 1%
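
The drop-rate check can be as small as the sketch below, assuming the rate is defined as dropped events divided by all push attempts. The `AceQueueHealth` structure and both function names are hypothetical; only the 1% threshold comes from the checklist.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {           /* hypothetical snapshot of the queue counters */
    uint64_t pushed;       /* events accepted by the queue */
    uint64_t dropped;      /* events rejected because the queue was full */
} AceQueueHealth;

/* drop rate = dropped / (pushed + dropped); 0 when nothing was attempted */
static inline double ace_drop_rate(const AceQueueHealth* h) {
    uint64_t attempts = h->pushed + h->dropped;
    return attempts ? (double)h->dropped / (double)attempts : 0.0;
}

/* Returns true when healthy; warns on stderr when the rate exceeds 1%. */
bool ace_check_drop_rate(const AceQueueHealth* h) {
    double rate = ace_drop_rate(h);
    if (rate > 0.01) {
        fprintf(stderr, "[ACE] warning: drop rate %.2f%% exceeds 1%%\n", rate * 100.0);
        return false;
    }
    return true;
}
```

For example, 50 drops against 9,900 accepted events is a rate of about 0.5% and passes, while 100 drops against 900 accepted events is 10% and triggers the warning. This check belongs on the learning thread or in a periodic health pass, not in the push path.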

#### Contract B Validation
- [ ] ACE only writes to policy table
- [ ] No immediate actions taken
- [ ] No direct TLS manipulation
- [ ] No blocking operations

#### Contract C Validation
- [ ] Ring buffer pre-allocated
- [ ] Events copied, not moved
- [ ] No malloc/free in event path
- [ ] Clear slot ownership model

#### Contract D Validation
- [ ] ace_learning.c does NOT include pool_tls.h internals
- [ ] No direct calls to Box1 functions
- [ ] Only ace_push_event() exposed to Box2
- [ ] Make notify_learning() static in pool_refill.c

#### Learning Algorithm
- [ ] Implement UCB1 or similar
- [ ] Track per-class statistics
- [ ] Gradual policy adjustments
- [ ] Oscillation detection
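
The sketch below illustrates only the last two items, gradual adjustment and oscillation detection; it is not UCB1 and makes no claim about how the target value is chosen. All names, the 25% step limit, the clamp range, and the flip threshold are assumptions picked for illustration.

```c
#include <stdint.h>

#define ACE_MIN_REFILL 8u     /* assumed clamp range */
#define ACE_MAX_REFILL 256u
#define ACE_FLIP_LIMIT 3u     /* assumed: hold after this many reversals in a row */

typedef struct {              /* hypothetical per-class learner state */
    uint32_t current;         /* refill count currently published */
    int      last_dir;        /* -1, 0, or +1: direction of the previous request */
    uint32_t flips;           /* consecutive direction reversals */
} AceClassState;

/* Moves `current` at most 25% toward `target`; holds still while oscillating. */
uint32_t ace_adjust_policy(AceClassState* s, uint32_t target) {
    if (target < ACE_MIN_REFILL) target = ACE_MIN_REFILL;
    if (target > ACE_MAX_REFILL) target = ACE_MAX_REFILL;

    int dir = (target > s->current) - (target < s->current);
    if (dir == 0) return s->current;

    /* A reversal extends the streak; a repeat of the same direction resets it. */
    if (s->last_dir != 0 && dir != s->last_dir) s->flips++;
    else s->flips = 0;
    s->last_dir = dir;
    if (s->flips >= ACE_FLIP_LIMIT) return s->current;   /* oscillating: hold */

    uint32_t step = s->current / 4;
    if (step == 0) step = 1;
    uint32_t gap = (dir > 0) ? target - s->current : s->current - target;
    if (step > gap) step = gap;
    s->current = (dir > 0) ? s->current + step : s->current - step;
    return s->current;
}
```

Starting from 32 and alternating targets of 128 and 16, the published value goes 40, 30, 37 and then holds at 37 on the third reversal; a second consecutive request in the same direction resets the streak and lets it move again.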

### Integration Points

#### Box2 → Box3 Connection
- [ ] Add event creation in pool_refill_and_alloc()
- [ ] Call ace_push_event() after successful refill
- [ ] Make notify_learning() wrapper static

#### Box2 Policy Reading
- [ ] Replace DEFAULT_REFILL_COUNT with ace_get_refill_count()
- [ ] Atomic read of policy (no blocking)
- [ ] Fallback to default if no policy
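
A possible shape for the policy read, assuming a zero entry in `g_refill_policies` means "no policy yet" and that a relaxed load is sufficient because the value is only a hint for the next refill. The `ace_set_refill_count()` writer, the `CLASSES` value, and the default of 32 are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define CLASSES 8                   /* assumed */
#define DEFAULT_REFILL_COUNT 32u    /* assumed */

/* Policy table written by the learner; 0 is treated here as "no policy yet". */
static _Atomic uint32_t g_refill_policies[CLASSES];

/* Box 2 side: a single atomic load, no locks, falls back to the default. */
static inline uint32_t ace_get_refill_count(int class_idx) {
    assert(class_idx >= 0 && class_idx < CLASSES);
    uint32_t n = atomic_load_explicit(&g_refill_policies[class_idx], memory_order_relaxed);
    return n ? n : DEFAULT_REFILL_COUNT;
}

/* Box 3 side (hypothetical name): the learner's only effect on the allocator. */
static inline void ace_set_refill_count(int class_idx, uint32_t n) {
    assert(class_idx >= 0 && class_idx < CLASSES);
    atomic_store_explicit(&g_refill_policies[class_idx], n, memory_order_relaxed);
}
```

With this split, a policy change takes effect on the next refill of that class and never touches blocks already sitting in a thread's freelist, which matches the scope limit in Contract B.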

#### Startup
- [ ] Launch learning thread in hakmem_init()
- [ ] Initialize policy table with defaults
- [ ] Verify thread starts successfully

## Diagnostics Implementation

### Queue Monitoring
- [ ] Implement drop rate calculation
- [ ] Add queue health metrics structure
- [ ] Periodic health checks

### Debug Flags
- [ ] POOL_DEBUG_CONTRACTS - contract validation
- [ ] POOL_DEBUG_DROPS - log dropped events
- [ ] Add contract violation counters
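
A contract check macro guarded by `POOL_DEBUG_CONTRACTS` could count violations per contract and compile to nothing when the flag is off. The macro name `POOL_CONTRACT_CHECK`, the enum, and the counter array are hypothetical; only the flag name comes from the checklist.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#ifndef POOL_DEBUG_CONTRACTS
#define POOL_DEBUG_CONTRACTS 1   /* enabled here for the sketch; off in release builds */
#endif

enum { CONTRACT_A, CONTRACT_B, CONTRACT_C, CONTRACT_D, CONTRACT_COUNT };

/* One violation counter per contract, readable by a diagnostics report. */
static _Atomic uint64_t g_contract_violations[CONTRACT_COUNT];

#if POOL_DEBUG_CONTRACTS
#define POOL_CONTRACT_CHECK(id, cond) \
    do { \
        if (!(cond)) { \
            atomic_fetch_add_explicit(&g_contract_violations[(id)], 1, memory_order_relaxed); \
            fprintf(stderr, "[pool] contract %c violated: %s (%s:%d)\n", \
                    'A' + (id), #cond, __FILE__, __LINE__); \
        } \
    } while (0)
#else
#define POOL_CONTRACT_CHECK(id, cond) ((void)0)
#endif
```

A call such as `POOL_CONTRACT_CHECK(CONTRACT_C, ev != NULL)` records and logs the failure instead of aborting, so a long stress run can report every violation at the end.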

### Runtime Diagnostics
- [ ] Implement pool_print_diagnostics()
  - [ ] Per-class statistics
  - [ ] Queue health report
  - [ ] Contract violation summary

## Final Validation

### Performance
- [ ] Larson: 2.5M+ ops/s
- [ ] bench_random_mixed: 40M+ ops/s
- [ ] Background thread < 1% CPU
- [ ] Drop rate < 0.1%

### Correctness
- [ ] No memory leaks (Valgrind)
- [ ] Thread safety verified
- [ ] All contracts validated
- [ ] Stress test passes

### Code Quality
- [ ] Each box in separate .c file
- [ ] Clear API boundaries
- [ ] No cross-box includes
- [ ] < 1000 LOC total

## Sign-off Checklist

### Contract A (Queue Never Blocks)
- [ ] Verified ace_push_event() drops on full
- [ ] Drop tracking implemented
- [ ] No blocking operations in push path
- [ ] Approved by: _____________

### Contract B (Policy Scope Limited)
- [ ] ACE only adjusts next refill count
- [ ] No immediate actions
- [ ] Atomic reads only
- [ ] Approved by: _____________

### Contract C (Memory Ownership Clear)
- [ ] Ring buffer pre-allocated
- [ ] Events copied not moved
- [ ] No use-after-free possible
- [ ] Approved by: _____________

### Contract D (API Boundaries Enforced)
- [ ] Box files separate
- [ ] No improper includes
- [ ] Static functions where needed
- [ ] Approved by: _____________

## Notes

**Remember**: The goal is an ultra-simple hot path (5-6 instructions) with smart learning that never interferes with performance. When in doubt, favor simplicity and speed over completeness of telemetry.

**Key Principle**: "Learn only when growing the cache; push the event and leave the rest to another thread." Learning is triggered only during refill and is pushed asynchronously to another thread.