327 lines
10 KiB
Markdown
327 lines
10 KiB
Markdown
|
|
# Central Allocator Router Box Design & Pre-allocation Fix
|
|||
|
|
|
|||
|
|
## Executive Summary
|
|||
|
|
|
|||
|
|
Found CRITICAL bug in pre-allocation: condition is inverted (counts failures as successes). Also identified architectural issue: allocation routing is scattered across 3+ files with no central control, making debugging nearly impossible. Proposed Central Router Box architecture provides single entry point, complete visibility, and clean component boundaries.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Part 1: Central Router Box Design
|
|||
|
|
|
|||
|
|
### Architecture Overview
|
|||
|
|
|
|||
|
|
**Current Problem:** Allocation routing logic is scattered across multiple files:
|
|||
|
|
- `core/box/hak_alloc_api.inc.h` - primary routing (186 lines!)
|
|||
|
|
- `core/hakmem_ace.c:hkm_ace_alloc()` - secondary routing (106 lines)
|
|||
|
|
- `core/box/pool_core_api.inc.h` - tertiary routing (dead code, 300+ lines)
|
|||
|
|
- No single source of truth
|
|||
|
|
- No unified logging
|
|||
|
|
- Silent failures everywhere
|
|||
|
|
|
|||
|
|
**Solution:** Central Router Box with ONE clear responsibility: **Route allocations to the correct allocator based on size**
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
malloc(size)
|
|||
|
|
↓
|
|||
|
|
┌───────────────────┐
|
|||
|
|
│ Central Router │ ← SINGLE ENTRY POINT
|
|||
|
|
│ hak_router() │ ← Logs EVERY decision
|
|||
|
|
└───────────────────┘
|
|||
|
|
↓
|
|||
|
|
┌───────────────────────────────────────┐
|
|||
|
|
│ Size-based Routing │
|
|||
|
|
│ 0-1KB → Tiny │
|
|||
|
|
│ 1-8KB → ACE → Pool (or mmap) │
|
|||
|
|
│ 8-32KB → Mid │
|
|||
|
|
│ 32KB-2MB → ACE → Pool/L25 (or mmap) │
|
|||
|
|
│ 2MB+ → mmap direct │
|
|||
|
|
└───────────────────────────────────────┘
|
|||
|
|
↓
|
|||
|
|
┌─────────────────────────────┐
|
|||
|
|
│ Component Black Boxes │
|
|||
|
|
│ - Tiny allocator │
|
|||
|
|
│ - Mid allocator │
|
|||
|
|
│ - ACE allocator │
|
|||
|
|
│ - Pool allocator │
|
|||
|
|
│ - mmap wrapper │
|
|||
|
|
└─────────────────────────────┘
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### API Specification
|
|||
|
|
|
|||
|
|
```c
|
|||
|
|
// core/box/hak_router.h
|
|||
|
|
|
|||
|
|
// Single entry point for ALL allocations
|
|||
|
|
void* hak_router_alloc(size_t size, uintptr_t site_id);
|
|||
|
|
|
|||
|
|
// Single exit point for ALL frees
|
|||
|
|
void hak_router_free(void* ptr);
|
|||
|
|
|
|||
|
|
// Health check - are all components ready?
|
|||
|
|
typedef struct {
|
|||
|
|
bool tiny_ready;
|
|||
|
|
bool mid_ready;
|
|||
|
|
bool ace_ready;
|
|||
|
|
bool pool_ready;
|
|||
|
|
bool mmap_ready;
|
|||
|
|
uint64_t total_routes;
|
|||
|
|
uint64_t route_failures;
|
|||
|
|
uint64_t fallback_count;
|
|||
|
|
} RouterHealth;
|
|||
|
|
|
|||
|
|
RouterHealth hak_router_health_check(void);
|
|||
|
|
|
|||
|
|
// Enable/disable detailed routing logs
|
|||
|
|
void hak_router_set_verbose(bool verbose);
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Component Responsibilities
|
|||
|
|
|
|||
|
|
**Router Box (core/box/hak_router.c):**
|
|||
|
|
- Owns SIZE → ALLOCATOR routing logic
|
|||
|
|
- Logs every routing decision (when verbose)
|
|||
|
|
- Tracks routing statistics
|
|||
|
|
- Handles fallback logic transparently
|
|||
|
|
- NO allocation implementation (just routing)
|
|||
|
|
|
|||
|
|
**Allocator Boxes (existing):**
|
|||
|
|
- Tiny: Handles 0-1KB allocations
|
|||
|
|
- Mid: Handles 8-32KB allocations
|
|||
|
|
- ACE: Handles size → class rounding
|
|||
|
|
- Pool: Handles class-sized blocks
|
|||
|
|
- mmap: Handles large/fallback allocations
|
|||
|
|
|
|||
|
|
### File Structure
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
core/
|
|||
|
|
├── box/
|
|||
|
|
│ ├── hak_router.h # Router API (NEW)
|
|||
|
|
│ ├── hak_router.c # Router implementation (NEW)
|
|||
|
|
│ ├── hak_router_stats.h # Statistics tracking (NEW)
|
|||
|
|
│ ├── hak_alloc_api.inc.h # DEPRECATED - replaced by router
|
|||
|
|
│ └── [existing allocator boxes...]
|
|||
|
|
└── hakmem.c # Modified to use router
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Integration Plan
|
|||
|
|
|
|||
|
|
**Phase 1: Parallel Implementation (Safe)**
|
|||
|
|
1. Create `hak_router.c/h` alongside existing code
|
|||
|
|
2. Implement complete routing logic with verbose logging
|
|||
|
|
3. Add feature flag `HAKMEM_USE_CENTRAL_ROUTER`
|
|||
|
|
4. Test with flag enabled in development
|
|||
|
|
|
|||
|
|
**Phase 2: Gradual Migration**
|
|||
|
|
1. Replace `hak_alloc_at()` internals to call `hak_router_alloc()`
|
|||
|
|
2. Keep existing API for compatibility
|
|||
|
|
3. Add routing logs to identify issues
|
|||
|
|
4. Run comprehensive benchmarks
|
|||
|
|
|
|||
|
|
**Phase 3: Cleanup**
|
|||
|
|
1. Remove scattered routing from individual allocators
|
|||
|
|
2. Deprecate `hak_alloc_api.inc.h`
|
|||
|
|
3. Simplify ACE to just handle rounding (not routing)
|
|||
|
|
|
|||
|
|
### Migration Strategy
|
|||
|
|
|
|||
|
|
**Can be done gradually:**
|
|||
|
|
- Start with feature flag (no risk)
|
|||
|
|
- Replace one allocation path at a time
|
|||
|
|
- Keep old code as fallback
|
|||
|
|
- Full migration only after validation
|
|||
|
|
|
|||
|
|
**Example migration:**
|
|||
|
|
```c
|
|||
|
|
// In hak_alloc_at() - gradual migration
|
|||
|
|
void* hak_alloc_at(size_t size, hak_callsite_t site) {
|
|||
|
|
#ifdef HAKMEM_USE_CENTRAL_ROUTER
|
|||
|
|
return hak_router_alloc(size, (uintptr_t)site);
|
|||
|
|
#else
|
|||
|
|
// ... existing 186 lines of routing logic ...
|
|||
|
|
#endif
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Part 2: Pre-allocation Debug Results
|
|||
|
|
|
|||
|
|
### Root Cause Analysis
|
|||
|
|
|
|||
|
|
**CRITICAL BUG FOUND:** Return value check is INVERTED in `core/box/pool_init_api.inc.h:122`
|
|||
|
|
|
|||
|
|
```c
|
|||
|
|
// CURRENT CODE (WRONG):
|
|||
|
|
if (refill_freelist(5, s) == 0) { // Checks for FAILURE (0 = failure)
|
|||
|
|
allocated++; // But counts as SUCCESS!
|
|||
|
|
}
|
|||
|
|
|
|||
|
|
// CORRECT CODE:
|
|||
|
|
if (refill_freelist(5, s) != 0) { // Check for SUCCESS (non-zero = success)
|
|||
|
|
allocated++; // Count successes
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Failure Scenario Explanation
|
|||
|
|
|
|||
|
|
1. **refill_freelist() API:**
|
|||
|
|
- Returns 1 on success
|
|||
|
|
- Returns 0 on failure
|
|||
|
|
- Defined in `core/box/pool_refill.inc.h:31`
|
|||
|
|
|
|||
|
|
2. **Bug Impact:**
|
|||
|
|
- Pre-allocation IS happening successfully
|
|||
|
|
- But counter shows 0 because it's counting failures
|
|||
|
|
- This gives FALSE impression that pre-allocation failed
|
|||
|
|
- Pool is actually working but appears broken
|
|||
|
|
|
|||
|
|
3. **Why it still works:**
|
|||
|
|
- Even though counter is wrong, pages ARE allocated
|
|||
|
|
- Pool serves allocations correctly
|
|||
|
|
- Just the diagnostic message is wrong
|
|||
|
|
|
|||
|
|
### Concrete Fix (Code Patch)
|
|||
|
|
|
|||
|
|
```diff
|
|||
|
|
--- a/core/box/pool_init_api.inc.h
|
|||
|
|
+++ b/core/box/pool_init_api.inc.h
|
|||
|
|
@@ -119,7 +119,7 @@ static void hak_pool_init_impl(void) {
|
|||
|
|
if (g_class_sizes[5] != 0) {
|
|||
|
|
int allocated = 0;
|
|||
|
|
for (int s = 0; s < prewarm_pages && s < POOL_NUM_SHARDS; s++) {
|
|||
|
|
- if (refill_freelist(5, s) == 0) {
|
|||
|
|
+ if (refill_freelist(5, s) != 0) { // FIX: Check for SUCCESS (1), not FAILURE (0)
|
|||
|
|
allocated++;
|
|||
|
|
}
|
|||
|
|
}
|
|||
|
|
@@ -133,7 +133,7 @@ static void hak_pool_init_impl(void) {
|
|||
|
|
if (g_class_sizes[6] != 0) {
|
|||
|
|
int allocated = 0;
|
|||
|
|
for (int s = 0; s < prewarm_pages && s < POOL_NUM_SHARDS; s++) {
|
|||
|
|
- if (refill_freelist(6, s) == 0) {
|
|||
|
|
+ if (refill_freelist(6, s) != 0) { // FIX: Check for SUCCESS (1), not FAILURE (0)
|
|||
|
|
allocated++;
|
|||
|
|
}
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Verification Steps
|
|||
|
|
|
|||
|
|
1. **Apply the fix:**
|
|||
|
|
```bash
|
|||
|
|
# Edit the file
|
|||
|
|
vi core/box/pool_init_api.inc.h
|
|||
|
|
# Change line 122: == 0 to != 0
|
|||
|
|
# Change line 136: == 0 to != 0
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
2. **Rebuild:**
|
|||
|
|
```bash
|
|||
|
|
make clean && make HEADER_CLASSIDX=1 AGGRESSIVE_INLINE=1 PREWARM_TLS=1 bench_mid_large_mt_hakmem
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
3. **Test:**
|
|||
|
|
```bash
|
|||
|
|
HAKMEM_ACE_ENABLED=1 HAKMEM_WRAP_L2=1 ./bench_mid_large_mt_hakmem
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
4. **Expected output:**
|
|||
|
|
```
|
|||
|
|
[Pool] Pre-allocated 4 pages for Bridge class 5 (40 KB) ← Should show 4, not 0!
|
|||
|
|
[Pool] Pre-allocated 4 pages for Bridge class 6 (52 KB) ← Should show 4, not 0!
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
5. **Performance should improve** from 437K ops/s to potentially 50-80M ops/s (with pre-allocation working)
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Recommendations
|
|||
|
|
|
|||
|
|
### Short-term (Immediate)
|
|||
|
|
|
|||
|
|
1. **Apply the pre-allocation fix NOW** (1-line change × 2)
|
|||
|
|
- This will immediately improve performance
|
|||
|
|
- No risk - just fixing inverted condition
|
|||
|
|
|
|||
|
|
2. **Add verbose logging to understand flow:**
|
|||
|
|
```c
|
|||
|
|
fprintf(stderr, "[Pool] refill_freelist(5, %d) returned %d\n", s, result);
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
3. **Remove dead code:**
|
|||
|
|
- Delete `core/box/pool_core_api.inc.h` (not included anywhere)
|
|||
|
|
- This file has duplicate `refill_freelist()` causing confusion
|
|||
|
|
|
|||
|
|
### Long-term (1-2 weeks)
|
|||
|
|
|
|||
|
|
1. **Implement Central Router Box**
|
|||
|
|
- Start with feature flag for safety
|
|||
|
|
- Add comprehensive logging
|
|||
|
|
- Gradual migration path
|
|||
|
|
|
|||
|
|
2. **Clean up scattered routing:**
|
|||
|
|
- Remove routing from ACE (should only round sizes)
|
|||
|
|
- Simplify hak_alloc_api.inc.h to just call router
|
|||
|
|
- Each allocator should have ONE responsibility
|
|||
|
|
|
|||
|
|
3. **Add integration tests:**
|
|||
|
|
- Test each size range
|
|||
|
|
- Verify correct allocator is used
|
|||
|
|
- Check fallback paths work
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Architectural Insights
|
|||
|
|
|
|||
|
|
### The "Boxing" Problem
|
|||
|
|
|
|||
|
|
The user's insight **"バグがすぐ見つからないということは 箱化が足りない"** is EXACTLY right.
|
|||
|
|
|
|||
|
|
Current architecture violates Single Responsibility Principle:
|
|||
|
|
- ACE does routing AND rounding
|
|||
|
|
- Pool does allocation AND routing decisions
|
|||
|
|
- hak_alloc_api does routing AND fallback AND statistics
|
|||
|
|
|
|||
|
|
This creates:
|
|||
|
|
- **Invisible failures** (no central logging)
|
|||
|
|
- **Debugging nightmare** (must trace through 3+ files)
|
|||
|
|
- **Hidden dependencies** (who calls whom?)
|
|||
|
|
- **Silent bugs** (like the inverted condition)
|
|||
|
|
|
|||
|
|
### The Solution: True Boxing
|
|||
|
|
|
|||
|
|
Each box should have ONE clear responsibility:
|
|||
|
|
- **Router Box**: Routes based on size (ONLY routing)
|
|||
|
|
- **Tiny Box**: Allocates 0-1KB (ONLY tiny allocations)
|
|||
|
|
- **ACE Box**: Rounds sizes to classes (ONLY rounding)
|
|||
|
|
- **Pool Box**: Manages class-sized blocks (ONLY pool management)
|
|||
|
|
|
|||
|
|
With proper boxing:
|
|||
|
|
- Bugs become VISIBLE (central logging)
|
|||
|
|
- Components are TESTABLE (clear interfaces)
|
|||
|
|
- Changes are SAFE (isolated impact)
|
|||
|
|
- Performance improves (clear fast paths)
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Appendix: Additional Findings
|
|||
|
|
|
|||
|
|
### Dead Code Discovery
|
|||
|
|
|
|||
|
|
Found duplicate `refill_freelist()` implementation in `core/box/pool_core_api.inc.h` that is:
|
|||
|
|
- Never included by any file
|
|||
|
|
- Has identical logic to the real implementation
|
|||
|
|
- Creates confusion when debugging
|
|||
|
|
- Should be deleted
|
|||
|
|
|
|||
|
|
### Bridge Classes Confirmed Working
|
|||
|
|
|
|||
|
|
Verified that Bridge classes ARE properly initialized:
|
|||
|
|
- `g_class_sizes[5] = 40960` (40KB) ✓
|
|||
|
|
- `g_class_sizes[6] = 53248` (52KB) ✓
|
|||
|
|
- Not being overwritten by Policy (fix already applied)
|
|||
|
|
- ACE correctly routes 33KB → 40KB class
|
|||
|
|
|
|||
|
|
The ONLY issue was the inverted condition in pre-allocation counting.
|