Files
hakmem/docs/analysis/GEMINI_BIGCACHE_ANALYSIS.md
Moe Charm (CI) 52386401b3 Debug Counters Implementation - Clean History
Major Features:
- Debug counter infrastructure for Refill Stage tracking
- Free Pipeline counters (ss_local, ss_remote, tls_sll)
- Diagnostic counters for early return analysis
- Unified larson.sh benchmark runner with profiles
- Phase 6-3 regression analysis documentation

Bug Fixes:
- Fix SuperSlab disabled by default (HAKMEM_TINY_USE_SUPERSLAB)
- Fix profile variable naming consistency
- Add .gitignore patterns for large files

Performance:
- Phase 6-3: 4.79 M ops/s (has OOM risk)
- With SuperSlab: 3.13 M ops/s (+19% improvement)

This is a clean repository without large log files.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-05 12:31:14 +09:00

6.4 KiB

Gemini Analysis: BigCache heap-buffer-overflow

Date: 2025-10-21 Status: Already Fixed - Root cause identified, fix confirmed in code


🎯 Summary

Gemini analyzed a heap-buffer-overflow detected by AddressSanitizer and identified the root cause as BigCache returning undersized blocks.

Critical finding: BigCache was returning cached blocks smaller than requested size, causing memset() overflow.

Fix status: Already implemented in hakmem_bigcache.c:151 with size check:

if (slot->valid && slot->site == site && slot->actual_bytes >= size) {
    // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Size check prevents undersize returns

🔍 Root Cause Analysis (by Gemini)

Error Sequence

  1. Iteration 0: Benchmark requests 2.000MB (2,097,152 bytes)

    • alloc_malloc() allocates 2.000MB block
    • Benchmark uses and frees the block
    • hak_free()hak_bigcache_put() caches it with actual_bytes = 2,000,000
    • Block stored in size-class "2MB class"
  2. Iteration 1: Benchmark requests 2.004MB (2,101,248 bytes)

    • Same size-class "2MB class" lookup
    • BUG: BigCache returns 2.000MB block without checking actual_bytes >= requested_size
    • Allocator returns 2.000MB block for 2.004MB request
  3. Overflow: memset() at bench_allocators.c:213

    • Tries to write 2.004MB (2,138,112 bytes in log)
    • Block is only 2.000MB
    • heap-buffer-overflow by ~4KB

AddressSanitizer Log

heap-buffer-overflow on address 0x7f36708c1000
WRITE of size 2138112 at 0x7f36708c1000
    #0 memset
    #1 bench_cold_churn bench_allocators.c:213

freed by thread T0 here:
    #1 bigcache_free_callback hakmem.c:526
    #2 evict_slot hakmem_bigcache.c:96
    #3 hak_bigcache_put hakmem_bigcache.c:182

previously allocated by thread T0 here:
    #1 alloc_malloc hakmem.c:426
    #2 allocate_with_policy hakmem.c:499

Note: "freed by thread T0" refers to BigCache internal "free slot" state, not OS-level deallocation.


🐛 Implementation Bug (Before Fix)

Problem

BigCache was checking only size-class match, not actual size sufficiency:

// WRONG (hypothetical buggy version)
int hak_bigcache_try_get(size_t size, uintptr_t site, void** out_ptr) {
    int site_idx = hash_site(site);
    int class_idx = get_class_index(size);  // Same class for 2.000MB and 2.004MB
    
    BigCacheSlot* slot = &g_cache[site_idx][class_idx];
    
    if (slot->valid && slot->site == site) {  // ❌ Missing size check!
        *out_ptr = slot->ptr;
        slot->valid = 0;
        return 1;  // Returns 2.000MB block for 2.004MB request
    }
    
    return 0;
}

Two checks needed

  1. Size-class match: Which class does the request belong to?
  2. Actual size sufficient: slot->actual_bytes >= requested_bytes? (MISSING)

Fix Implementation

Current Code (Fixed)

File: hakmem_bigcache.c:139-163

// Phase 6.4 P2: O(1) get - Direct table lookup
int hak_bigcache_try_get(size_t size, uintptr_t site, void** out_ptr) {
    if (!g_initialized) hak_bigcache_init();
    if (!is_cacheable(size)) return 0;

    // O(1) calculation: site_idx, class_idx
    int site_idx = hash_site(site);
    int class_idx = get_class_index(size);  // P3: branchless

    // O(1) lookup: table[site_idx][class_idx]
    BigCacheSlot* slot = &g_cache[site_idx][class_idx];

    // ✅ Check: valid, matching site, AND sufficient size (Segfault fix!)
    if (slot->valid && slot->site == site && slot->actual_bytes >= size) {
        // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ FIX: Size sufficiency check
        
        // Hit! Return and invalidate slot
        *out_ptr = slot->ptr;
        slot->valid = 0;

        g_stats.hits++;
        return 1;
    }

    // Miss (invalid, wrong site, or undersized)
    g_stats.misses++;
    return 0;
}

Key Addition

Line 151:

if (slot->valid && slot->site == site && slot->actual_bytes >= size) {
    // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Prevents undersize blocks

Comment confirms this was a known fix: "AND sufficient size (Segfault fix!)"


🧪 Verification

Test Scenario (cold-churn benchmark)

// bench_allocators.c cold_churn scenario
for (int i = 0; i < iterations; i++) {
    size_t size = base_size + (i * increment);
    // Iteration 0: 2,097,152 bytes (2.000MB)
    // Iteration 1: 2,101,248 bytes (2.004MB)  ← Would trigger bug
    // Iteration 2: 2,105,344 bytes (2.008MB)
    
    void* p = hak_alloc_cs(size);
    memset(p, 0xAA, size);  // ← Overflow point if undersized block
    hak_free_cs(p);
}

Expected Behavior (After Fix)

  1. Iteration 0: Allocate 2.000MB → Use → Free → BigCache stores (actual_bytes = 2,000,000)

  2. Iteration 1: Request 2.004MB

    • BigCache checks: slot->actual_bytes (2,000,000) >= size (2,004,000)FALSE
    • Cache miss → Allocate new 2.004MB block
    • No overflow
  3. Iteration 2: Request 2.008MB

    • Similar cache miss → New allocation
    • No overflow

📊 Gemini's Recommendations

Recommendation 1: Add size check DONE

Before:

if (slot->is_used) {
    // Return block without size check
    return slot->ptr;
}

After (Current implementation):

if (slot->is_used && slot->actual_bytes >= requested_bytes) {
    // Only return if size is sufficient
    return slot->ptr;
}

Recommendation 2: Fallback on undersize

If no suitable block found in cache:

// If loop finds no sufficient block
return NULL;  // Force new allocation via mmap

Current implementation handles this correctly by returning 0 (miss) on line 162.


🎯 Conclusion

Status: Bug already fixed

The heap-buffer-overflow issue identified by AddressSanitizer has been correctly diagnosed by Gemini and the fix is already implemented in the codebase.

Key lesson: Size-class caching requires two-level checking:

  1. Class match (performance)
  2. Actual size sufficiency (correctness)

Code location: hakmem_bigcache.c:151

Comment evidence: "AND sufficient size (Segfault fix!)" confirms this was a known issue that has been addressed.