Tiny: Enable P0 batch refill by default + docs and task update

Summary
- Default P0 ON: Build-time HAKMEM_TINY_P0_BATCH_REFILL=1 remains; runtime gate now defaults to ON
  (HAKMEM_TINY_P0_ENABLE unset or not '0'). Kill switch preserved via HAKMEM_TINY_P0_DISABLE=1.
- Fix critical bug: After freelist→SLL batch splice, increment TinySlabMeta::used by 'from_freelist'
  to mirror non-P0 behavior (prevents under-accounting and follow-on carve invariants from breaking).
- Add low-overhead A/B toggles for triage: HAKMEM_TINY_P0_NO_DRAIN (skip remote drain),
  HAKMEM_TINY_P0_LOG (emit [P0_COUNTER_OK/MISMATCH] based on total_active_blocks delta).
- Keep linear carve fail-fast guards across simple/general/TLS-bump paths.

Perf (1T, 100k×256B)
- P0 OFF: ~2.73M ops/s (stable)
- P0 ON (no drain): ~2.45M ops/s
- P0 ON (normal drain): ~2.76M ops/s (fastest)

Known
- Rare [P0_COUNTER_MISMATCH] warnings persist (non-fatal). Continue auditing active/used
  balance around batch freelist splice and remote drain splice.

Docs
- Add docs/TINY_P0_BATCH_REFILL.md (runtime switches, behavior, perf notes).
- Update CURRENT_TASK.md with Tiny P0 status (default ON) and next steps.
This commit is contained in:
Moe Charm (CI)
2025-11-09 22:12:34 +09:00
parent 1010a961fb
commit d9b334b968
24 changed files with 1240 additions and 69 deletions

View File

@ -178,11 +178,18 @@ static inline uint32_t sll_cap_for_class(int class_idx, uint32_t mag_cap);
// Forward decl: used by tiny_spec_pop_path before its definition
// Phase 6-1.7: Export for box refactor (Box 5 needs access from hakmem.c)
// Note: Remove 'inline' to provide linkable definition for LTO
// P0 Fix: When P0 is enabled, use sll_refill_batch_from_ss instead
#if HAKMEM_TINY_P0_BATCH_REFILL
// P0 enabled: use batch refill
static inline int sll_refill_batch_from_ss(int class_idx, int max_take);
#else
// P0 disabled: use original refill
#ifdef HAKMEM_TINY_PHASE6_BOX_REFACTOR
int sll_refill_small_from_ss(int class_idx, int max_take);
#else
static inline int sll_refill_small_from_ss(int class_idx, int max_take);
#endif
#endif
static inline void hak_tiny_free_superslab(void* ptr, SuperSlab* ss);
static void* __attribute__((cold, noinline)) tiny_slow_alloc_fast(int class_idx);
static inline void tiny_remote_drain_owner(struct TinySlab* slab);
@ -1221,8 +1228,12 @@ void hak_tiny_prewarm_tls_cache(void) {
int count = HAKMEM_TINY_PREWARM_COUNT; // Default: 16 blocks per class
// Trigger refill to populate TLS cache
// Note: sll_refill_small_from_ss is available because BOX_REFACTOR exports it
// P0 Fix: Use appropriate refill function based on P0 status
#if HAKMEM_TINY_P0_BATCH_REFILL
sll_refill_batch_from_ss(class_idx, count);
#else
sll_refill_small_from_ss(class_idx, count);
#endif
}
}
#endif