# Hybrid Bitmap+Mini-Magazine Implementation Design

**Date**: 2025-10-26
**Goal**: Clean, modular implementation of the Hybrid approach
**Target**: 83 ns → 40-50 ns per operation (about 40-52% lower latency)
**Philosophy**: "Operation Squeaky Clean" (綺麗綺麗大作戦) - clean code from the start

---

## Current Structure Analysis

### Existing Implementation (Phase 6.X)

**Already Implemented** ✅:
1. **Two-Tier Bitmap**: `summary` + `bitmap` (lines 103-106 in hakmem_tiny.h)
2. **TLS Magazine**: 2048 items (lines 54-70 in hakmem_tiny.c)
3. **TLS Active Slabs**: A/B slabs (lines 72-73)
4. **Remote Free MPSC**: Lock-free remote free (lines 126-138)
5. **Hint Word**: Scan start position (line 102)

**Current Hot Path** (lines 656-661):
```c
if (mag->top > 0) {
    void* p = mag->items[--mag->top].ptr;
    t_tiny_rng ^= t_tiny_rng << 13;  // Statistics XOR (3 ns)
    t_tiny_rng ^= t_tiny_rng >> 17;
    t_tiny_rng ^= t_tiny_rng << 5;
    if ((t_tiny_rng & ...) == 0) g_tiny_pool.alloc_count[class_idx]++;
    return p;
}
```

**Bottlenecks Identified**:
1. ❌ Statistics in hot path: +10-15ns (lines 658-659, 677-678, 793-794)
2. ❌ Bitmap scan on TLS slab: +5-6ns (line 671, 776)
3. ❌ Multiple TLS reads: +2-3ns (mag, slab_a, slab_b)

**Current Performance**: 83 ns/op
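
To make bottleneck 1 concrete, the following is a minimal sketch of sampled counting driven by a xorshift32 generator, the pattern visible in the current hot path. The sampling mask is elided in the source, so `SAMPLE_MASK` (1-in-16 here) is an assumption, and `sampled_record` and `sampled_estimate` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical 1-in-16 sampling mask; the real mask is elided in the source.
#define SAMPLE_MASK 0xFu

static uint32_t rng_state = 2463534242u;  // any nonzero seed works
static uint64_t sampled_hits = 0;

// One xorshift32 step using the 13/17/5 shift triple from the hot path.
static inline uint32_t xorshift32(void) {
    assert(rng_state != 0);  // zero is a fixed point of xorshift
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 17;
    rng_state ^= rng_state << 5;
    return rng_state;
}

// Record one event: the RNG step and the branch run on every call,
// but the counter is only touched for roughly 1 in 16 events.
static inline void sampled_record(void) {
    if ((xorshift32() & SAMPLE_MASK) == 0) sampled_hits++;
}

// Scale the sampled hits back up to an estimate of the true event count.
static inline uint64_t sampled_estimate(void) {
    return sampled_hits * (SAMPLE_MASK + 1u);
}
```

The point of the sketch is that the three dependent shift/XOR pairs sit on every allocation, even though the counter is rarely updated, and the result is only an estimate.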

---

## Clean Modular Design

### Module 1: Page Mini-Magazine (Data Plane)

**Purpose**: O(1) LIFO free-list for fast allocation
**Location**: New file `hakmem_tiny_mini_mag.h`

```c
// ============================================================================
// Page Mini-Magazine: Fast LIFO Cache (Data Plane)
// ============================================================================

// Intrusive LIFO block
typedef struct MiniMagBlock {
    struct MiniMagBlock* next;  // 8 bytes in free block
} MiniMagBlock;

// Mini-magazine metadata (embedded in TinySlab)
typedef struct {
    MiniMagBlock* head;   // LIFO stack head
    uint16_t count;       // Current item count
    uint16_t capacity;    // Max items (16-32)
} PageMiniMag;

// Fast path: Pop from mini-magazine (1-2 ns)
static inline void* mini_mag_pop(PageMiniMag* mag) {
    MiniMagBlock* b = mag->head;
    if (!b) return NULL;

    mag->head = b->next;
    mag->count--;
    return (void*)b;
}

// Fast path: Push to mini-magazine (1-2 ns)
static inline int mini_mag_push(PageMiniMag* mag, void* ptr) {
    if (mag->count >= mag->capacity) return 0;  // Full

    MiniMagBlock* b = (MiniMagBlock*)ptr;
    b->next = mag->head;
    mag->head = b;
    mag->count++;
    return 1;
}
```

**Characteristics**:
- ✅ Zero metadata overhead (the next pointer is stored inside the free block itself, so a block must be at least pointer-sized)
- ✅ O(1) pop/push (1-2 ns)
- ✅ Cache-friendly (LIFO = temporal locality)
- ✅ No locks (owner-only access)
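
As a usage sketch for the intrusive layout, the block below carves fixed-size blocks out of a heap arena and runs them through the magazine. The helper `carve_into_mag`, the function `mini_mag_demo`, and the 16-byte block size are hypothetical and exist only for this illustration; the push/pop bodies are repeated so the block compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct MiniMagBlock { struct MiniMagBlock* next; } MiniMagBlock;
typedef struct { MiniMagBlock* head; uint16_t count; uint16_t capacity; } PageMiniMag;

static inline void* mini_mag_pop(PageMiniMag* mag) {
    MiniMagBlock* b = mag->head;
    if (!b) return NULL;
    mag->head = b->next;
    mag->count--;
    return b;
}

static inline int mini_mag_push(PageMiniMag* mag, void* ptr) {
    if (mag->count >= mag->capacity) return 0;
    MiniMagBlock* b = ptr;
    b->next = mag->head;
    mag->head = b;
    mag->count++;
    return 1;
}

// Hypothetical helper: carve up to n blocks out of `arena` and push each one.
// Returns how many blocks the magazine accepted.
static int carve_into_mag(PageMiniMag* mag, void* arena, size_t block_size, int n) {
    assert(block_size >= sizeof(MiniMagBlock));  // a free block must hold the link
    int pushed = 0;
    while (pushed < n &&
           mini_mag_push(mag, (char*)arena + (size_t)pushed * block_size))
        pushed++;
    return pushed;
}

// Demo: offer 4 blocks of 16 bytes to a 3-slot magazine; returns 1 on success.
static int mini_mag_demo(void) {
    unsigned char* arena = malloc(4 * 16);
    if (!arena) return 0;
    PageMiniMag mag = { .head = NULL, .count = 0, .capacity = 3 };
    int ok = carve_into_mag(&mag, arena, 16, 4) == 3  // 4th push is rejected
          && mini_mag_pop(&mag) == arena + 32          // last pushed, first popped
          && mini_mag_pop(&mag) == arena + 16
          && mini_mag_pop(&mag) == arena
          && mini_mag_pop(&mag) == NULL;
    free(arena);
    return ok;
}
```

Because the link lives in the block itself, the magazine needs no side array; the trade-off is that a block's first bytes are overwritten while it is free.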

---

### Module 2: Batch Refill from Bitmap (Control Plane)

**Purpose**: Batch refill mini-magazine from two-tier bitmap
**Location**: New file `hakmem_tiny_batch_refill.h`

```c
// ============================================================================
// Batch Refill: Two-Tier Bitmap → Mini-Magazine (Control Plane)
// ============================================================================

// Refill mini-magazine from bitmap (batch of N items)
// Returns number of items refilled
static inline int batch_refill_from_bitmap(
    TinySlab* slab,
    PageMiniMag* mag,
    int want
) {
    if (want <= 0 || mag->count >= mag->capacity) return 0;

    size_t block_size = g_tiny_class_sizes[slab->class_idx];
    int got = 0;

    // Two-tier bitmap scan (using existing summary).
    // Check capacity before claiming a block, so a block is never marked
    // used in the bitmap and then dropped because the magazine is full.
    while (got < want && mag->count < mag->capacity && slab->free_count > 0) {
        int block_idx = hak_tiny_find_free_block(slab);
        if (block_idx < 0) break;

        // Mark as used in bitmap
        hak_tiny_set_used(slab, block_idx);
        slab->free_count--;

        // Calculate block pointer
        void* ptr = (char*)slab->base + ((size_t)block_idx * block_size);

        // Push to mini-magazine (cannot fail: capacity was checked above)
        if (!mini_mag_push(mag, ptr)) break;
        got++;
    }

    return got;
}

// Spill mini-magazine back to bitmap (on slab eviction)
static inline void batch_spill_to_bitmap(
    TinySlab* slab,
    PageMiniMag* mag
) {
    size_t block_size = g_tiny_class_sizes[slab->class_idx];

    while (mag->count > 0) {
        void* ptr = mini_mag_pop(mag);
        if (!ptr) break;

        // Calculate block index
        uintptr_t offset = (uintptr_t)ptr - (uintptr_t)slab->base;
        int block_idx = (int)(offset / block_size);

        // Mark as free in bitmap
        hak_tiny_set_free(slab, block_idx);
        slab->free_count++;
    }
}
```

**Characteristics**:
- ✅ Batch processing (amortized cost: 3ns/item)
- ✅ Uses existing two-tier bitmap
- ✅ Preserves bitmap consistency (blocks held in the mini-magazine count as used until spilled)
- ✅ Minimal code (~40 lines)
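
The refill code relies on `hak_tiny_find_free_block`, `hak_tiny_set_used`, and `hak_tiny_set_free`, which are not shown in this document. The sketch below illustrates how a summary-plus-bitmap lookup of this kind can work. It assumes that a set bit means "free" and that summary bit `w` is set exactly when `bitmap[w]` has a free bit; the real polarity and layout may differ. The names `TwoTier` and `tt_*` are hypothetical, and `__builtin_ctzll` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

#define TT_WORDS 64  // 64 words x 64 bits = 4096 blocks (illustrative size)

typedef struct {
    uint64_t summary;           // bit w set => bitmap[w] has a free block
    uint64_t bitmap[TT_WORDS];  // bit set => block is free (assumed polarity)
} TwoTier;

static void tt_init_all_free(TwoTier* t) {
    t->summary = ~0ULL;
    for (int w = 0; w < TT_WORDS; w++) t->bitmap[w] = ~0ULL;
}

// Lowest free block index, or -1 if the slab is full: two count-trailing-zero
// operations instead of a linear scan over all bitmap words.
static int tt_find_free(const TwoTier* t) {
    if (t->summary == 0) return -1;
    int w = __builtin_ctzll(t->summary);
    assert(t->bitmap[w] != 0);  // invariant: summary bit implies a free bit
    return w * 64 + __builtin_ctzll(t->bitmap[w]);
}

static void tt_set_used(TwoTier* t, int idx) {
    int w = idx / 64;
    t->bitmap[w] &= ~(1ULL << (idx % 64));
    if (t->bitmap[w] == 0) t->summary &= ~(1ULL << w);  // word exhausted
}

static void tt_set_free(TwoTier* t, int idx) {
    int w = idx / 64;
    t->bitmap[w] |= 1ULL << (idx % 64);
    t->summary |= 1ULL << w;
}
```

Keeping the summary in step with the bitmap inside the set/clear helpers is what lets the batch refill loop call the finder repeatedly without rescanning exhausted words.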

---

### Module 3: Integrated TinySlab Structure

**Purpose**: Add mini-magazine to existing TinySlab
**Location**: `hakmem_tiny.h` (modification)

```c
// Modified TinySlab structure
typedef struct TinySlab {
    void* base;                 // Base address (64KB aligned)
    uint64_t* bitmap;           // Free block bitmap (dynamic size)
    uint16_t free_count;        // Number of free blocks
    uint16_t total_count;       // Total blocks in slab
    uint8_t class_idx;          // Size class index (0-7)
    uint8_t _padding[3];
    struct TinySlab* next;      // Next slab in list

    // MPSC remote-free stack head (lock-free)
    atomic_uintptr_t remote_head;
    atomic_uint remote_count;
    pthread_t owner_tid;

    // Two-tier bitmap (existing)
    uint16_t hint_word;
    uint8_t summary_words;
    uint8_t _pad_sum[1];
    uint64_t* summary;

    // NEW: Page Mini-Magazine (16-32 items)
    PageMiniMag mini_mag;       // 16 bytes (aligned)
} TinySlab;
```

**Changes**:
- ✅ Additive only (no breaking changes)
- ✅ Cache-line aligned (64 bytes)
- ✅ Backward compatible
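
One way to keep the "16 bytes" layout assumption honest is a compile-time check. The sketch below is illustrative only: `SlabTail` is a hypothetical stand-in for the last two fields of `TinySlab` (the real struct needs `pthread.h` and `stdatomic.h`), and the size check assumes a 64-bit target with natural alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct MiniMagBlock { struct MiniMagBlock* next; } MiniMagBlock;

typedef struct {
    MiniMagBlock* head;
    uint16_t count;
    uint16_t capacity;
} PageMiniMag;

// Hypothetical stand-in for the tail of TinySlab, just to show the pattern.
typedef struct {
    uint64_t* summary;
    PageMiniMag mini_mag;
} SlabTail;

// On 64-bit targets: 8 (head) + 2 + 2 = 12 bytes of fields, padded to 16.
static_assert(sizeof(void*) != 8 || sizeof(PageMiniMag) == 16,
              "PageMiniMag should occupy 16 bytes on 64-bit targets");

// The embedded magazine must start at a pointer-aligned offset.
static_assert(offsetof(SlabTail, mini_mag) % _Alignof(void*) == 0,
              "mini_mag must be pointer-aligned inside the slab");
```

Placing equivalent assertions next to the real `TinySlab` definition would turn an accidental layout change into a build error rather than a silent performance regression.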

---

### Module 4: Statistics Batching (Out-of-Band)

**Purpose**: Remove statistics from hot path
**Location**: New file `hakmem_tiny_stats.h`

```c
// ============================================================================
// Statistics: Batched TLS Counters (Out-of-Band)
// ============================================================================

// Per-thread batch counters (flushed periodically)
static __thread uint32_t t_alloc_batch[TINY_NUM_CLASSES];
static __thread uint32_t t_free_batch[TINY_NUM_CLASSES];

// Fast path: Increment TLS counter only (0.5 ns)
static inline void stats_record_alloc(int class_idx) {
    t_alloc_batch[class_idx]++;
}

static inline void stats_record_free(int class_idx) {
    t_free_batch[class_idx]++;
}

// Cold path: Flush batch to global counters
static inline void stats_flush_if_needed(int class_idx) {
    // Flush once at least 256 events have accumulated. Add the exact batch
    // size: this function runs only on the slow path, so the counter may
    // already be past 256 when it is checked.
    if (t_alloc_batch[class_idx] >= 256) {
        g_tiny_pool.alloc_count[class_idx] += t_alloc_batch[class_idx];
        t_alloc_batch[class_idx] = 0;
    }
    if (t_free_batch[class_idx] >= 256) {
        g_tiny_pool.free_count[class_idx] += t_free_batch[class_idx];
        t_free_batch[class_idx] = 0;
    }
}
```

**Characteristics**:
- ✅ Hot path: 0.5 ns (simple increment)
- ✅ Cold path: Batch flush (once 256 or more ops have accumulated)
- ✅ Accuracy: estimated at 99.6% (vs 93.75% with sampling); only events not yet flushed are missing from the global counters
- ✅ No XOR RNG overhead
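
The batching idea can be tried in isolation with the sketch below. It is not the project's implementation: the `demo_*` names and `DEMO_FLUSH_THRESHOLD` are hypothetical, and it assumes the shared counters are C11 atomics, which this document does not state for `g_tiny_pool`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_CLASSES 8
#define DEMO_FLUSH_THRESHOLD 256u

static _Atomic uint64_t g_demo_alloc_count[DEMO_CLASSES];        // shared
static _Thread_local uint32_t t_demo_alloc_batch[DEMO_CLASSES];  // per-thread

// Hot path: a single thread-local increment, no shared-memory traffic.
static inline void demo_stats_record_alloc(int class_idx) {
    assert(class_idx >= 0 && class_idx < DEMO_CLASSES);
    t_demo_alloc_batch[class_idx]++;
}

// Move everything this thread has accumulated into the shared counter.
static inline void demo_stats_flush(int class_idx) {
    uint32_t n = t_demo_alloc_batch[class_idx];
    if (n == 0) return;
    atomic_fetch_add_explicit(&g_demo_alloc_count[class_idx], n,
                              memory_order_relaxed);
    t_demo_alloc_batch[class_idx] = 0;
}

// Cold path: flush only once enough events have piled up.
static inline void demo_stats_flush_if_needed(int class_idx) {
    if (t_demo_alloc_batch[class_idx] >= DEMO_FLUSH_THRESHOLD)
        demo_stats_flush(class_idx);
}

static inline uint64_t demo_stats_read(int class_idx) {
    return atomic_load_explicit(&g_demo_alloc_count[class_idx],
                                memory_order_relaxed);
}
```

An unconditional flush such as `demo_stats_flush` at thread exit or before reporting would close the gap left by events that never reach the threshold.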

---

## Implementation Phases

### Phase 1: Add Mini-Magazine to TinySlab (2-3 hours)

**Goal**: Enable page-level fast path

**Files to modify**:
1. `hakmem_tiny.h`: Add `PageMiniMag` to `TinySlab`
2. Create `hakmem_tiny_mini_mag.h`: Mini-magazine operations
3. Create `hakmem_tiny_batch_refill.h`: Batch refill/spill

**Changes**:
```c
// hakmem_tiny.h (line 107, after summary)
typedef struct TinySlab {
    // ... existing fields ...
    uint64_t* summary;

    // NEW: Page Mini-Magazine
    PageMiniMag mini_mag;
} TinySlab;

// Initialize mini-magazine on slab creation (hakmem_tiny.c:allocate_new_slab)
slab->mini_mag.head = NULL;
slab->mini_mag.count = 0;
slab->mini_mag.capacity = 16;  // Start with 16 items
```

**Testing**:
```bash
# Compile
make clean && make -j4

# Unit test: Mini-magazine operations
# (create test_mini_mag.c)
./test_mini_mag

# Integration test: Verify no regressions
./bench_tiny --iterations=100000 --threads=1
```

**Expected**: No performance change (feature not used yet)

---

### Phase 2: Integrate Mini-Magazine into Hot Path (3-4 hours)

**Goal**: Use mini-magazine in allocation fast path

**Files to modify**:
1. `hakmem_tiny.c`: Modify `hak_tiny_alloc()` to use mini-mag

**New hot path logic**:
```c
void* hak_tiny_alloc(size_t size) {
    int class_idx = hak_tiny_size_to_class(size);
    if (class_idx < 0) return NULL;

    // 1. TLS Magazine (existing fast path)
    TinyTLSMag* mag = &g_tls_mags[class_idx];
    if (mag->top > 0) {
        void* p = mag->items[--mag->top].ptr;
        stats_record_alloc(class_idx);  // NEW: Batched
        return p;
    }

    // 2. TLS Active Slab Mini-Magazine (NEW: lock-free)
    TinySlab* tls = g_tls_active_slab_a[class_idx];
    if (tls && tls->mini_mag.count > 0) {
        void* p = mini_mag_pop(&tls->mini_mag);
        if (p) {
            stats_record_alloc(class_idx);
            return p;
        }
    }

    // 3. Refill mini-magazine from bitmap (medium path)
    if (tls && tls->free_count > 0) {
        int got = batch_refill_from_bitmap(tls, &tls->mini_mag, 16);
        if (got > 0) {
            void* p = mini_mag_pop(&tls->mini_mag);
            if (p) {
                stats_record_alloc(class_idx);
                return p;
            }
        }
    }

    // 4. Slow path (existing global pool logic)
    return tiny_alloc_slow_path(class_idx);
}
```

**Testing**:
```bash
# Performance test
./bench_tiny --iterations=1000000 --threads=1
# Expected: 83ns → 55-65ns

# Multi-threaded test
./bench_tiny_mt --iterations=100000 --threads=4
# Expected: No regressions

# Correctness test
./test_mf2
./test_mf2_warmup
```

**Expected**: 83ns → 55-65ns (about 22-34% lower latency)

---

### Phase 3: Remove Statistics from Hot Path (1 hour)

**Goal**: Eliminate XOR RNG overhead

**Files to modify**:
1. Create `hakmem_tiny_stats.h`: Batched statistics
2. `hakmem_tiny.c`: Replace XOR RNG with batched counters

**Changes**:
```c
// Remove lines 658-659, 677-678, 793-794 (XOR RNG)
// Replace with:
stats_record_alloc(class_idx);

// Add periodic flush (in slow path)
stats_flush_if_needed(class_idx);
```

**Testing**:
```bash
# Verify statistics still work
./test_mf2
# Then check the output of hak_tiny_get_stats(...): it should show reasonable counts

# Performance test
./bench_tiny --iterations=1000000 --threads=1
# Expected: 55-65ns → 45-55ns
```

**Expected**: 55-65ns → 45-55ns (about 10 ns faster)

---

### Phase 4: TLS Magazine Integration (1-2 hours)

**Goal**: Optimize TLS Magazine refill from mini-magazines

**Files to modify**:
1. `hakmem_tiny.c`: Refill TLS Magazine from slab mini-magazines

**Changes**:
```c
// When TLS Magazine is low, refill from multiple slab mini-magazines
static void refill_tls_magazine(int class_idx) {
    TinyTLSMag* mag = &g_tls_mags[class_idx];
    int room = mag->cap - mag->top;
    if (room <= 0) return;

    // Try TLS active slab A
    TinySlab* tls_a = g_tls_active_slab_a[class_idx];
    if (tls_a) {
        while (room > 0 && tls_a->mini_mag.count > 0) {
            void* p = mini_mag_pop(&tls_a->mini_mag);
            if (!p) break;
            mag->items[mag->top++].ptr = p;
            room--;
        }

        // If mini-mag empty, refill from bitmap
        if (tls_a->mini_mag.count == 0 && tls_a->free_count > 0) {
            batch_refill_from_bitmap(tls_a, &tls_a->mini_mag, 16);
        }
    }

    // Try TLS active slab B (if still room)
    if (room > 0) {
        TinySlab* tls_b = g_tls_active_slab_b[class_idx];
        // ... similar logic ...
    }
}
```

**Testing**:
```bash
# Full benchmark suite
./bench_tiny --iterations=1000000 --threads=1
./bench_tiny_mt --iterations=100000 --threads=4
./bench_allocators_hakmem --scenario json
./bench_allocators_hakmem --scenario mir

# Expected: 45-55ns → 40-50ns
```

**Expected**: 45-55ns → 40-50ns (roughly 5 ns faster)

---

## Code Organization (Squeaky Clean)

### File Structure

```
hakmem/
├── hakmem_tiny.h              # Main header (modified)
├── hakmem_tiny.c              # Main implementation (modified)
├── hakmem_tiny_mini_mag.h     # NEW: Mini-magazine operations
├── hakmem_tiny_batch_refill.h # NEW: Batch refill/spill
├── hakmem_tiny_stats.h        # NEW: Batched statistics
└── hakmem_tiny_superslab.h    # Existing (unchanged)
```

### Module Dependencies

```
hakmem_tiny.c
  ├── includes hakmem_tiny.h
  ├── includes hakmem_tiny_mini_mag.h      (Phase 1)
  ├── includes hakmem_tiny_batch_refill.h  (Phase 2)
  └── includes hakmem_tiny_stats.h         (Phase 3)
```

### Coding Standards

**Naming Convention**:
```c
// Prefix: mini_mag_*, batch_*, stats_*
mini_mag_pop()
mini_mag_push()
batch_refill_from_bitmap()
batch_spill_to_bitmap()
stats_record_alloc()
stats_flush_if_needed()
```

**Inline Policy**:
```c
// Hot path: always inline
static inline void* mini_mag_pop(...) __attribute__((always_inline));

// Medium path: let compiler decide
static inline int batch_refill_from_bitmap(...);

// Cold path: never inline (returns the block, as used by hak_tiny_alloc)
static void* tiny_alloc_slow_path(...) __attribute__((noinline));
```

**Alignment**:
```c
// Cache-line aligned structures
// Note: aligned(64) also rounds sizeof up to a multiple of 64 bytes
typedef struct __attribute__((aligned(64))) {
    // ...
} PageMiniMag;
```

**Comments**:
```c
// Module-level comment (box)
// ============================================================================
// Page Mini-Magazine: Fast LIFO Cache (Data Plane)
// ============================================================================

// Function-level comment (purpose + cost)
// Fast path: Pop from mini-magazine (1-2 ns)
static inline void* mini_mag_pop(PageMiniMag* mag) {
    // ...
}
```

---

## Testing Strategy

### Unit Tests

**test_mini_mag.c**:
```c
// Test mini-magazine operations
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include "hakmem_tiny_mini_mag.h"

void test_push_pop(void) {
    PageMiniMag mag = {.head = NULL, .count = 0, .capacity = 16};
    void* ptrs[16];

    // Push 16 items
    for (int i = 0; i < 16; i++) {
        ptrs[i] = malloc(64);
        assert(mini_mag_push(&mag, ptrs[i]) == 1);
    }
    assert(mag.count == 16);

    // Push when full (should fail)
    void* extra = malloc(64);
    assert(mini_mag_push(&mag, extra) == 0);
    free(extra);

    // Pop 16 items (LIFO order), releasing each block once checked
    for (int i = 15; i >= 0; i--) {
        void* p = mini_mag_pop(&mag);
        assert(p == ptrs[i]);  // LIFO
        free(p);
    }
    assert(mag.count == 0);

    // Pop when empty (should return NULL)
    assert(mini_mag_pop(&mag) == NULL);
}
```

**test_batch_refill.c**:
```c
// Test batch refill from bitmap
void test_refill(void) {
    TinySlab* slab = allocate_new_slab(0);  // 8B class
    assert(slab->free_count == 8192);

    // Refill 16 items
    int got = batch_refill_from_bitmap(slab, &slab->mini_mag, 16);
    assert(got == 16);
    assert(slab->mini_mag.count == 16);
    assert(slab->free_count == 8192 - 16);

    // Verify every item lies inside the slab
    for (int i = 0; i < 16; i++) {
        char* p = (char*)mini_mag_pop(&slab->mini_mag);
        assert(p != NULL);
        assert(p >= (char*)slab->base && p < (char*)slab->base + TINY_SLAB_SIZE);
    }
}
```

### Integration Tests

**Existing tests should pass**:
```bash
./test_mf2
./test_mf2_warmup
./bench_tiny --iterations=100000 --threads=1
./bench_tiny_mt --iterations=100000 --threads=4
```

### Performance Tests

**Before/After comparison**:
```bash
# Baseline (before optimization)
./bench_tiny --iterations=1000000 --threads=1 > baseline.txt

# After Phase 1 (mini-magazine added but not used)
./bench_tiny --iterations=1000000 --threads=1 > phase1.txt
diff baseline.txt phase1.txt  # Timings should match within run-to-run noise

# After Phase 2 (mini-magazine integrated)
./bench_tiny --iterations=1000000 --threads=1 > phase2.txt
# Expected: 83ns → 55-65ns

# After Phase 3 (statistics batched)
./bench_tiny --iterations=1000000 --threads=1 > phase3.txt
# Expected: 55-65ns → 45-55ns

# After Phase 4 (TLS integration)
./bench_tiny --iterations=1000000 --threads=1 > phase4.txt
# Expected: 45-55ns → 40-50ns
```

---

## Success Criteria

### Performance Targets

| Phase | Target | Pass Criteria |
|-------|--------|---------------|
| **Baseline** | 83 ns/op | Current performance |
| **Phase 1** | 83 ns/op | No regression |
| **Phase 2** | 55-65 ns/op | 22-34% lower latency vs. baseline |
| **Phase 3** | 45-55 ns/op | 34-46% lower latency vs. baseline |
| **Phase 4** | 40-50 ns/op | **40-52% lower latency vs. baseline** ✅ |
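
The percentage figures for each phase follow from the latency targets as (baseline - target) / baseline. A small helper, hypothetical and not part of the codebase, makes the arithmetic explicit against the 83 ns/op baseline:

```c
#include <assert.h>

// Percentage by which latency drops when going from baseline_ns to target_ns.
static double latency_reduction_pct(double baseline_ns, double target_ns) {
    assert(baseline_ns > 0.0);
    return (baseline_ns - target_ns) / baseline_ns * 100.0;
}

// Against the 83 ns/op baseline:
//   65 ns -> about 21.7%    55 ns -> about 33.7%
//   45 ns -> about 45.8%    40 ns -> about 51.8%
```

The same helper can be pointed at measured `bench_tiny` numbers to decide whether a phase met its pass criterion.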

### Functional Requirements

- [ ] All existing tests pass
- [ ] No memory leaks (valgrind clean)
- [ ] Thread-safe (helgrind clean)
- [ ] Statistics accurate (within 1% of actual)
- [ ] No regressions on L2/L2.5 pools

### Code Quality

- [ ] Clean compilation (`-Wall -Wextra -Werror`)
- [ ] Modular design (separate .h files)
- [ ] Inline hints appropriate
- [ ] Cache-line aligned critical structures
- [ ] Comments explain "why" not "what"

---

## Rollback Plan

Each phase lands as its own commit and can be reverted. Revert in reverse order, because later phases build on earlier ones:

```bash
# Revert Phase 4
git revert <phase4-commit>

# Revert Phase 3
git revert <phase3-commit>

# Revert Phase 2
git revert <phase2-commit>

# Revert Phase 1
git revert <phase1-commit>
```

All changes are backward-compatible (additive only).

---

**Last Updated**: 2025-10-26
**Status**: Design complete, ready for implementation
**Next**: Begin Phase 1 - Add Mini-Magazine to TinySlab