Commit Graph

6 Commits

Author SHA1 Message Date
855ea7223c Phase E1-CORRECT: Fix USER/BASE pointer conversion bugs in slab_index_for calls
CRITICAL BUG FIX: Phase E1 introduced 1-byte headers for ALL size classes (C0-C7),
changing the pointer contract. However, many locations still called slab_index_for()
with USER pointers (storage+1) instead of BASE pointers (storage), causing off-by-one
slab index calculations that corrupted memory.

Root Cause:
- USER pointer = BASE + 1 (returned by malloc, points past header)
- BASE pointer = storage start (where 1-byte header is written)
- slab_index_for() expects BASE pointer for correct slab boundary calculations
- Passing USER pointer → wrong slab_idx → wrong metadata → freelist corruption

Impact Before Fix:
- bench_random_mixed crashes at ~14K iterations with SEGV
- Massive C7 alignment check failures (wrong slab classification)
- Memory corruption from writing to wrong slab freelists

Fixes Applied (8 locations):

1. core/hakmem_tiny_free.inc:137
   - Added USER→BASE conversion before slab_index_for()

2. core/hakmem_tiny_ultra_simple.inc:148
   - Added USER→BASE conversion before slab_index_for()

3. core/tiny_free_fast.inc.h:220
   - Added USER→BASE conversion before slab_index_for()

4-5. core/tiny_free_magazine.inc.h:126,315
   - Added USER→BASE conversion before slab_index_for() (2 locations)

6. core/box/free_local_box.c:14,22,62
   - Added USER→BASE conversion before slab_index_for()
   - Fixed delta calculation to use BASE instead of USER
   - Fixed debug logging to use BASE instead of USER

7. core/hakmem_tiny.c:448,460,473 (tiny_debug_track_alloc_ret)
   - Added USER→BASE conversion before slab_index_for() (2 calls)
   - Fixed delta calculation to use BASE instead of USER
   - This function is called on EVERY allocation in debug builds

Results After Fix:
 bench_random_mixed stable up to 66K iterations (~4.7x improvement)
 C7 alignment check failures eliminated (was: 100% failure rate)
 Front Gate "Unknown" classification dropped to 0% (was: 1.67%)
 No segfaults for workloads up to ~33K allocations

Remaining Issue:
 Segfault still occurs at iteration 66152 (allocs=33137, frees=33014)
   - Different bug from USER/BASE conversion issues
   - Likely capacity/boundary condition (further investigation needed)

Testing:
- bench_random_mixed_hakmem 1K-66K iterations: PASS
- bench_random_mixed_hakmem 67K+ iterations: FAIL (different bug)
- bench_fixed_size_hakmem 200K iterations: PASS

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 05:21:36 +09:00
862e8ea7db Infrastructure and build updates
- Update build configuration and flags
- Add missing header files and dependencies
- Update TLS list implementation with proper scoping
- Fix various compilation warnings and issues
- Update debug ring and tiny allocation infrastructure
- Update benchmark results documentation

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
2025-11-11 21:49:05 +09:00
5b31629650 tiny: fix TLS list next_off scope; default TLS_LIST=1; add sentinel guards; header-aware TLS ops; release quiet for benches 2025-11-11 10:00:36 +09:00
dde490f842 Phase 7: header-aware TLS front caches and FG gating
- core/hakmem_tiny_fastcache.inc.h: make tiny_fast_pop/push read/write next at base+1 for C0–C6; clear C7 next on pop
- core/hakmem_tiny_hot_pop.inc.h: header-aware next reads for g_fast_head pops (classes 0–3)
- core/tiny_free_magazine.inc.h: header-aware chain linking for BG spill chain (base+1 for C0–C6)
- core/box/front_gate_classifier.c: registry fallback classifies headerless only for class 7; others as headered

Build OK; bench_fixed_size_hakmem still SIGBUS right after init. FREE_ROUTE trace shows invalid frees (ptr=0xa0, etc.). Next steps: instrument early frees and audit remaining header-aware writes in any front caches not yet patched.
2025-11-10 18:04:08 +09:00
b09ba4d40d Box TLS-SLL + free boundary hardening: normalize C0–C6 to base (ptr-1) at free boundary; route all caches/freelists via base; replace remaining g_tls_sll_head direct writes with Box API (tls_sll_push/splice) in refill/magazine/ultra; keep C7 excluded. Fixes rbp=0xa0 free crash by preventing header overwrite and centralizing TLS-SLL invariants. 2025-11-10 16:48:20 +09:00
602edab87f Phase 1: Box Theory refactoring + include reduction
Phase 1-1: Split hakmem_tiny_free.inc (1,711 → 452 lines, -73%)
- Created tiny_free_magazine.inc.h (413 lines) - Magazine layer
- Created tiny_superslab_alloc.inc.h (394 lines) - SuperSlab alloc
- Created tiny_superslab_free.inc.h (305 lines) - SuperSlab free

Phase 1-2++: Refactor hakmem_pool.c (1,481 → 907 lines, -38.8%)
- Created pool_tls_types.inc.h (32 lines) - TLS structures
- Created pool_mf2_types.inc.h (266 lines) - MF2 data structures
- Created pool_mf2_helpers.inc.h (158 lines) - Helper functions
- Created pool_mf2_adoption.inc.h (129 lines) - Adoption logic

Phase 1-3: Reduce hakmem_tiny.c includes (60 → 46, -23.3%)
- Created tiny_system.h - System headers umbrella (stdio, stdlib, etc.)
- Created tiny_api.h - API headers umbrella (stats, query, rss, registry)

Performance: 4.19M ops/s maintained (±0% regression)
Verified: Larson benchmark 2×8×128×1024 = 4,192,128 ops/s

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-06 21:54:12 +09:00