|
|
3f738c0d6e
|
Phase B: TinyFrontC23Box - Ultra-simple front path for C2/C3
Implemented dedicated fast path for C2/C3 (128B/256B) to bypass
SFC/SLL/Magazine complexity and directly access FastCache + SuperSlab.
Changes:
- core/front/tiny_front_c23.h: New ultra-simple front path (NEW)
- Direct FC → SS refill (2 layers vs 5+ in generic path)
- ENV-gated: HAKMEM_TINY_FRONT_C23_SIMPLE=1
- Refill target: 64 blocks (optimized via A/B testing)
- core/tiny_alloc_fast.inc.h: Hook at entry point (+11 lines)
- Early return for C2/C3 when C23 path enabled
- Safe fallback to generic path on failure
Results (100K iterations, A/B tested refill=16/32/64/128):
- 128B: 8.27M → 9.55M ops/s (+15.5% with refill=64) ✅
- 256B: 7.90M → 8.61M ops/s (+9.0% with refill=32) ✅
- 256B: 7.90M → 8.47M ops/s (+7.2% with refill=64) ✅
Optimal Refill: 64 blocks
- Balanced performance across C2/C3
- 128B best case: +15.5%
- 256B good performance: +7.2%
- Simple single-value default
Architecture:
- Flow: FC pop → (miss) → ss_refill_fc_fill(64) → FC pop retry
- Bypassed layers: SLL, Magazine, SFC, MidTC
- Preserved: Box boundaries, safety checks, fallback paths
- Free path: Unchanged (TLS SLL + drain)
Box Theory Compliance:
- Clear Front ← Backend boundary (ss_refill_fc_fill)
- ENV-gated A/B testing (default OFF, opt-in)
- Safe fallback: NULL → generic path handles slow case
- Zero impact when disabled
Performance Gap Analysis:
- Current: 8-9M ops/s
- After Phase B: 9-10M ops/s (+10-15%)
- Target: 15-20M ops/s
- Remaining gap: ~2x (suggests deeper bottlenecks remain)
Next Steps:
- Perf profiling to identify next bottleneck
- Current hypotheses: classify_ptr, drain overhead, refill path
- Phase C candidates: FC-direct free, inline optimizations
ENV Usage:
# Enable C23 fast path (default: OFF)
export HAKMEM_TINY_FRONT_C23_SIMPLE=1
# Optional: Override refill target (default: 64)
export HAKMEM_TINY_FRONT_C23_REFILL=32
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
|
2025-11-14 19:27:45 +09:00 |
|