Add Box I (Integrity), Box E (Expansion), and comprehensive P0 debugging infrastructure
## Major Additions
### 1. Box I: Integrity Verification System (NEW - 703 lines)
- Files: core/box/integrity_box.h (267 lines), core/box/integrity_box.c (436 lines)
- Purpose: Unified integrity checking across all HAKMEM subsystems
- Features:
* 4-level integrity checking (levels 1-4, with 0 disabling all checks; compile-time controlled)
* Priority 1: TLS array bounds validation
* Priority 2: Freelist pointer validation
* Priority 3: TLS canary monitoring
* Priority ALPHA: Slab metadata invariant checking (5 invariants)
* Atomic statistics tracking (thread-safe)
* Beautiful BOX_BOUNDARY design pattern
### 2. Box E: SuperSlab Expansion System (COMPLETE)
- Files: core/box/superslab_expansion_box.h, core/box/superslab_expansion_box.c
- Purpose: Safe SuperSlab expansion with TLS state guarantee
- Features:
* Immediate slab 0 binding after expansion
* TLS state snapshot and restoration
* Design by Contract (pre/post-conditions, invariants)
* Thread-safe with mutex protection
### 3. Comprehensive Integrity Checking System
- File: core/hakmem_tiny_integrity.h (NEW)
- Unified validation functions for all allocator subsystems
- Uninitialized memory pattern detection (0xa2, 0xcc, 0xdd, 0xfe)
- Pointer range validation (null-page, kernel-space)
### 4. P0 Bug Investigation - Root Cause Identified
**Bug**: SEGV at iteration 28440 (deterministic with seed 42)
**Pattern**: 0xa2a2a2a2a2a2a2a2 (uninitialized/ASan poisoning)
**Location**: TLS SLL (Singly-Linked List) cache layer
**Root Cause**: Race condition or use-after-free in TLS list management (class 0)
**Detection**: Box I successfully caught invalid pointer at exact crash point
### 5. Defensive Improvements
- Defensive memset in SuperSlab allocation (all metadata arrays)
- Enhanced pointer validation with pattern detection
- BOX_BOUNDARY markers throughout codebase (beautiful modular design)
- 5 metadata invariant checks in allocation/free/refill paths
## Integration Points
- Modified 13 files with Box I/E integration
- Added 10+ BOX_BOUNDARY markers
- 5 critical integrity check points in P0 refill path
## Test Results (100K iterations)
- Baseline: 7.22M ops/s
- Hotpath ON: 8.98M ops/s (+24% improvement ✓)
- P0 Bug: Still crashes at 28440 iterations (TLS SLL race condition)
- Root cause: Identified but not yet fixed (requires deeper investigation)
## Performance
- Box I overhead: Zero in release builds (HAKMEM_INTEGRITY_LEVEL=0)
- Debug builds: Full validation enabled (HAKMEM_INTEGRITY_LEVEL=4)
- Beautiful modular design maintains clean separation of concerns
## Known Issues
- P0 Bug at 28440 iterations: Race condition in TLS SLL cache (class 0)
- Cause: Use-after-free or race in remote free draining
- Next step: Valgrind investigation to pinpoint exact corruption location
## Code Quality
- Total new code: ~1400 lines (Box I + Box E + integrity system)
- Design: Beautiful Box Theory with clear boundaries
- Modularity: Complete separation of concerns
- Documentation: Comprehensive inline comments and BOX_BOUNDARY markers
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-12 02:45:00 +09:00
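The commit message says Box I costs nothing in release builds because the level is fixed at compile time. The sketch below shows one way such gating can work. Only `HAKMEM_INTEGRITY_LEVEL` comes from the message; the `INTEGRITY_CHECK_AT` macro and the `demo_*` names are hypothetical and are not taken from `integrity_box.h`.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the level is passed on the compiler command line,
 * e.g. -DHAKMEM_INTEGRITY_LEVEL=4 for debug builds. Default is 0 (off). */
#ifndef HAKMEM_INTEGRITY_LEVEL
#define HAKMEM_INTEGRITY_LEVEL 0
#endif

/* Hypothetical helper: evaluate `expr` only if the build level is at
 * least `lvl`. The condition is a compile-time constant, so at level 0
 * the compiler can drop the call entirely. */
#define INTEGRITY_CHECK_AT(lvl, expr) \
    ((HAKMEM_INTEGRITY_LEVEL >= (lvl)) ? (expr) : true)

static int g_demo_calls = 0;

/* Stand-in for a real validation routine; counts how often it runs. */
static bool demo_check(void) {
    g_demo_calls++;
    return true;
}

/* A call site guarded at level 1. */
static bool demo_guarded(void) {
    return INTEGRITY_CHECK_AT(1, demo_check());
}
```

With the default level of 0, `demo_guarded()` returns `true` without ever calling `demo_check()`; building with `-DHAKMEM_INTEGRITY_LEVEL=1` or higher makes the check run.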
// integrity_box.c - Box I: Integrity Verification System Implementation
// Purpose: Complete implementation of modular integrity checks
// Author: Claude + Task (2025-11-12)

#include "integrity_box.h"
#include "../hakmem_tiny.h"
#include "../superslab/superslab_types.h"
#include "../tiny_box_geometry.h"
#include <stdio.h>
#include <assert.h>
#include <stdatomic.h>
#include <string.h>
2025-11-15 14:35:44 +09:00
#include <stdlib.h>

// ============================================================================
// TLS Canary Magic
// ============================================================================

#define TLS_CANARY_MAGIC 0xDEADBEEFDEADBEEFULL

// External canaries from hakmem_tiny.c
2025-11-20 07:32:30 +09:00
// Phase 3d-B: TLS Cache Merge - Unified canaries for unified TLS SLL array
extern __thread uint64_t g_tls_canary_before_sll;
extern __thread uint64_t g_tls_canary_after_sll;
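The canaries declared above are guard words meant to sit on either side of the TLS SLL array, so that an overrun of the array clobbers a known magic value. The sketch below illustrates the idea in isolation. It groups the guards and the array in one struct so that their relative order is fixed by the language. That layout, the `Demo*` names, and the array size of 8 are assumptions for illustration; the excerpt only shows that `hakmem_tiny.c` defines the two canaries as separate `__thread` variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CANARY_MAGIC 0xDEADBEEFDEADBEEFULL

/* Hypothetical layout: struct members are laid out in declaration order,
 * so the guards are guaranteed to bracket the array. */
typedef struct {
    uint64_t canary_before;
    void*    sll_head[8];   /* assumed class count, illustration only */
    uint64_t canary_after;
} DemoTlsBlock;

static _Thread_local DemoTlsBlock demo_tls = {
    .canary_before = DEMO_CANARY_MAGIC,
    .sll_head      = {0},
    .canary_after  = DEMO_CANARY_MAGIC,
};

/* Returns true while both guard words still hold the magic value. */
static bool demo_canaries_intact(void) {
    return demo_tls.canary_before == DEMO_CANARY_MAGIC &&
           demo_tls.canary_after  == DEMO_CANARY_MAGIC;
}
```

A write that runs past the end of `sll_head` lands on `canary_after`, after which `demo_canaries_intact()` reports the corruption.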

// ============================================================================
// Global Statistics (atomic for thread safety)
// ============================================================================

static _Atomic uint64_t g_integrity_checks_performed = 0;
static _Atomic uint64_t g_integrity_checks_passed = 0;
static _Atomic uint64_t g_integrity_checks_failed = 0;
static _Atomic uint64_t g_integrity_tls_bounds_checks = 0;
static _Atomic uint64_t g_integrity_freelist_checks = 0;
static _Atomic uint64_t g_integrity_metadata_checks = 0;
static _Atomic uint64_t g_integrity_canary_checks = 0;
static _Atomic uint64_t g_integrity_full_system_checks = 0;

// ============================================================================
// Initialization
// ============================================================================

void integrity_box_init(void) {
    // Initialize statistics (atomic init is implicit)
    atomic_store(&g_integrity_checks_performed, 0);
    atomic_store(&g_integrity_checks_passed, 0);
    atomic_store(&g_integrity_checks_failed, 0);
    atomic_store(&g_integrity_tls_bounds_checks, 0);
    atomic_store(&g_integrity_freelist_checks, 0);
    atomic_store(&g_integrity_metadata_checks, 0);
    atomic_store(&g_integrity_canary_checks, 0);
    atomic_store(&g_integrity_full_system_checks, 0);
}
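The counters above use the default sequentially consistent atomics. For pure statistics that do not order any other memory accesses, relaxed ordering is a common alternative. The sketch below shows that variant on two stand-in counters; the `demo_*` names are hypothetical, and this is not a claim about how Box I should be tuned.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic uint64_t demo_checks_performed = 0;
static _Atomic uint64_t demo_checks_failed = 0;

/* Record one check outcome. Relaxed ordering keeps each increment atomic
 * but imposes no ordering on surrounding loads and stores. */
static void demo_record(bool passed) {
    atomic_fetch_add_explicit(&demo_checks_performed, 1, memory_order_relaxed);
    if (!passed) {
        atomic_fetch_add_explicit(&demo_checks_failed, 1, memory_order_relaxed);
    }
}

/* Snapshot of the failure counter. */
static uint64_t demo_failed_count(void) {
    return atomic_load_explicit(&demo_checks_failed, memory_order_relaxed);
}
```

The trade-off is that a reader may briefly see the two counters out of step with each other, which is usually acceptable for diagnostics.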

// ============================================================================
// Priority 1: TLS Bounds Validation
// ============================================================================

IntegrityResult integrity_validate_tls_bounds(
    uint8_t class_idx,
    const char* context) {

    atomic_fetch_add(&g_integrity_checks_performed, 1);
    atomic_fetch_add(&g_integrity_tls_bounds_checks, 1);

    if (class_idx >= TINY_NUM_CLASSES) {
        atomic_fetch_add(&g_integrity_checks_failed, 1);
        return (IntegrityResult){
            .passed = false,
            .check_name = "TLS_BOUNDS_OVERFLOW",
            .file = __FILE__,
            .line = __LINE__,
            .message = "class_idx out of bounds",
            .error_code = INTEGRITY_ERROR_TLS_BOUNDS_OVERFLOW
        };
    }

    atomic_fetch_add(&g_integrity_checks_passed, 1);
    return (IntegrityResult){
        .passed = true,
        .check_name = "TLS_BOUNDS_OK",
        .file = __FILE__,
        .line = __LINE__,
        .message = "TLS bounds check passed",
        .error_code = INTEGRITY_ERROR_OK
    };
}
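A caller of the bounds check would typically inspect the returned result and report the failure together with its own context string. The sketch below shows that pattern with stand-in definitions, since `integrity_box.h` is not part of this excerpt. `DemoResult`, `DEMO_NUM_CLASSES` (assumed to be 8), and both functions are hypothetical; only the field names `passed`, `check_name`, and `message` mirror the code above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_NUM_CLASSES 8  /* assumption; stands in for TINY_NUM_CLASSES */

/* Reduced stand-in for IntegrityResult. */
typedef struct {
    bool        passed;
    const char* check_name;
    const char* message;
} DemoResult;

/* Same shape as the bounds check above, without the statistics. */
static DemoResult demo_validate_bounds(uint8_t class_idx) {
    if (class_idx >= DEMO_NUM_CLASSES) {
        return (DemoResult){ false, "TLS_BOUNDS_OVERFLOW", "class_idx out of bounds" };
    }
    return (DemoResult){ true, "TLS_BOUNDS_OK", "TLS bounds check passed" };
}

/* Caller-side guard: log the failure with the call-site context and
 * let the caller decide whether to bail out. */
static bool demo_guard_class(uint8_t class_idx, const char* context) {
    DemoResult r = demo_validate_bounds(class_idx);
    if (!r.passed) {
        fprintf(stderr, "[%s] %s: %s (class_idx=%u)\n",
                context, r.check_name, r.message, (unsigned)class_idx);
    }
    return r.passed;
}
```

A hot path could then write `if (!demo_guard_class(idx, "refill")) return NULL;` before indexing any per-class TLS array.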

// ============================================================================
// Priority 2: Freelist Pointer Validation
// ============================================================================

IntegrityResult integrity_validate_freelist_ptr(
    void* ptr,
    void* slab_base,
    void* slab_end,
    uint8_t class_idx,
    const char* context) {

    atomic_fetch_add(&g_integrity_checks_performed, 1);
    atomic_fetch_add(&g_integrity_freelist_checks, 1);

    // NULL is valid (end of freelist)
    if (ptr == NULL) {
        atomic_fetch_add(&g_integrity_checks_passed, 1);
        return (IntegrityResult){
            .passed = true,
            .check_name = "FREELIST_PTR_NULL",
            .file = __FILE__,
            .line = __LINE__,
            .message = "NULL freelist pointer (valid)",
            .error_code = INTEGRITY_ERROR_OK
        };
    }

    // Check pointer is in valid range
    if (ptr < slab_base || ptr >= slab_end) {
        atomic_fetch_add(&g_integrity_checks_failed, 1);
        return (IntegrityResult){
            .passed = false,
            .check_name = "FREELIST_PTR_OUT_OF_BOUNDS",
            .file = __FILE__,
            .line = __LINE__,
            .message = "Freelist pointer outside slab bounds",
            .error_code = INTEGRITY_ERROR_FREELIST_PTR_OUT_OF_BOUNDS
        };
    }

    // Check stride alignment
    size_t stride = tiny_stride_for_class(class_idx);
    ptrdiff_t offset = (uint8_t*)ptr - (uint8_t*)slab_base;

    if (offset % stride != 0) {
        atomic_fetch_add(&g_integrity_checks_failed, 1);
        return (IntegrityResult){
            .passed = false,
            .check_name = "FREELIST_PTR_MISALIGNED",
            .file = __FILE__,
            .line = __LINE__,
            .message = "Freelist pointer not stride-aligned",
            .error_code = INTEGRITY_ERROR_FREELIST_PTR_MISALIGNED
        };
    }

    atomic_fetch_add(&g_integrity_checks_passed, 1);
    return (IntegrityResult){
        .passed = true,
        .check_name = "FREELIST_PTR_OK",
        .file = __FILE__,
        .line = __LINE__,
        .message = "Freelist pointer valid",
        .error_code = INTEGRITY_ERROR_OK
    };
}
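The two non-NULL tests above (range, then stride alignment) can be exercised in isolation. The sketch below restates them as one predicate over integer addresses. Comparing via `uintptr_t` and rejecting a zero stride are choices made for this illustration and are not behavior of the function above; `demo_ptr_in_slab` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if ptr lies in [base, end) and sits on a stride boundary
 * measured from base. */
static bool demo_ptr_in_slab(const void* ptr, const void* base,
                             const void* end, size_t stride) {
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t b = (uintptr_t)base;
    uintptr_t e = (uintptr_t)end;

    if (stride == 0) return false;        /* avoid modulo by zero */
    if (p < b || p >= e) return false;    /* outside the slab */
    return ((p - b) % stride) == 0;       /* must start a block */
}
```

For a 64-byte slab with a 16-byte stride, offsets 0, 16, 32, and 48 pass; offset 8 fails the alignment test and offset 64 fails the range test because the end bound is exclusive.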

// ============================================================================
// Priority 3: TLS Canary Validation
// ============================================================================

IntegrityResult integrity_validate_tls_canaries(const char* context) {
    atomic_fetch_add(&g_integrity_checks_performed, 1);
    atomic_fetch_add(&g_integrity_canary_checks, 1);

    // Phase 3d-B: Check canary before unified g_tls_sll array
    if (g_tls_canary_before_sll != TLS_CANARY_MAGIC) {
        atomic_fetch_add(&g_integrity_checks_failed, 1);
        return (IntegrityResult){
            .passed = false,
            .check_name = "CANARY_CORRUPTED_BEFORE_SLL",
            .file = __FILE__,
            .line = __LINE__,
            .message = "Canary before g_tls_sll corrupted",
            .error_code = INTEGRITY_ERROR_CANARY_CORRUPTED_BEFORE_HEAD
        };
    }

    // Phase 3d-B: Check canary after unified g_tls_sll array
    if (g_tls_canary_after_sll != TLS_CANARY_MAGIC) {
        atomic_fetch_add(&g_integrity_checks_failed, 1);
        return (IntegrityResult){
            .passed = false,
            .check_name = "CANARY_CORRUPTED_AFTER_SLL",
|
            .file = __FILE__,
            .line = __LINE__,
            .message = "Canary after g_tls_sll corrupted",
            // NOTE: the enum name still says AFTER_HEAD, although this check
            // reports the canary after g_tls_sll (see check_name above).
            .error_code = INTEGRITY_ERROR_CANARY_CORRUPTED_AFTER_HEAD
        };
    }

    atomic_fetch_add(&g_integrity_checks_passed, 1);
    return (IntegrityResult){
        .passed = true,
        .check_name = "CANARY_OK",
        .file = __FILE__,
        .line = __LINE__,
        .message = "All canaries intact",
        .error_code = INTEGRITY_ERROR_OK
    };
}
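
The canary check above only shows its result paths. As a rough, self-contained sketch of the general technique (not the project's actual layout), guard words can be placed around an array and compared against a known constant. `DemoGuardedTls`, `DEMO_CANARY`, and `DEMO_NUM_CLASSES` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CANARY      0xDEADBEEFCAFEBABEULL /* hypothetical canary value */
#define DEMO_NUM_CLASSES 8                     /* hypothetical class count  */

/* Guarded array: canary words sit directly before and after the payload,
 * so an out-of-bounds write on either side is likely to clobber one. */
typedef struct {
    uint64_t canary_before;
    void*    heads[DEMO_NUM_CLASSES];
    uint64_t canary_after;
} DemoGuardedTls;

/* Set both canaries and clear the payload. */
static void demo_guard_init(DemoGuardedTls* t) {
    t->canary_before = DEMO_CANARY;
    for (size_t i = 0; i < DEMO_NUM_CLASSES; i++) {
        t->heads[i] = NULL;
    }
    t->canary_after = DEMO_CANARY;
}

/* Returns true only if both canaries still hold their initial value. */
static bool demo_guard_intact(const DemoGuardedTls* t) {
    return t->canary_before == DEMO_CANARY &&
           t->canary_after == DEMO_CANARY;
}
```

A caller would run `demo_guard_init` once per thread and call `demo_guard_intact` at checkpoints. A `false` result means something wrote past one end of `heads`.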

// ============================================================================
// Priority ALPHA: Slab Metadata Validation (THE KEY!)
// ============================================================================

SlabMetadataState integrity_capture_slab_metadata(
    const void* meta_ptr,
    void* slab_base,
    uint8_t class_idx) {

    // Cast to TinySlabMeta type
    const TinySlabMeta* meta = (const TinySlabMeta*)meta_ptr;

    SlabMetadataState state = {0};

    if (meta == NULL) {
        // NULL metadata - return invalid state
        state.carved = 0xFFFF;
        state.used = 0xFFFF;
        state.capacity = 0;
        state.freelist = NULL;
        state.slab_base = NULL;
        state.class_idx = class_idx;
        state.free_count = 0xFFFF;
        state.is_virgin = false;
        state.is_full = false;
        state.is_empty = false;
        return state;
    }

    // Capture core fields
    state.carved = meta->carved;
    state.used = meta->used;
    state.capacity = meta->capacity;
    state.freelist = meta->freelist;
    state.slab_base = slab_base;
    state.class_idx = class_idx;

    // Compute derived fields
    if (state.carved >= state.used) {
        state.free_count = state.carved - state.used;
    } else {
        state.free_count = 0xFFFF;  // Invalid!
    }

    state.is_virgin = (state.carved == 0);
    state.is_full = (state.carved == state.capacity && state.used == state.capacity);
    state.is_empty = (state.used == 0);

    return state;
}
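
To make the derived fields concrete, here is a minimal standalone sketch of the same arithmetic on plain counters. `DemoMeta`, `DemoDerived`, and `demo_derive` are hypothetical stand-ins rather than the real `TinySlabMeta` or `SlabMetadataState`, and the 16-bit counter width is an assumption suggested by the `0xFFFF` sentinel above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the slab counters and the derived state. */
typedef struct { uint16_t carved, used, capacity; } DemoMeta;
typedef struct {
    uint16_t free_count;
    bool is_virgin, is_full, is_empty;
} DemoDerived;

/* Mirrors the derived-field logic: free_count = carved - used, or the
 * 0xFFFF sentinel when used exceeds carved (an inconsistent state). */
static DemoDerived demo_derive(DemoMeta m) {
    DemoDerived d;
    d.free_count = (m.carved >= m.used)
                       ? (uint16_t)(m.carved - m.used)
                       : (uint16_t)0xFFFF;
    d.is_virgin = (m.carved == 0);
    d.is_full   = (m.carved == m.capacity && m.used == m.capacity);
    d.is_empty  = (m.used == 0);
    return d;
}
```

For instance, `demo_derive((DemoMeta){10, 4, 64})` gives a `free_count` of 6, while `demo_derive((DemoMeta){3, 5, 64})` gives the `0xFFFF` sentinel because `used` exceeds `carved`.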

IntegrityResult integrity_validate_slab_metadata(
    const SlabMetadataState* state,
    const char* context) {

    atomic_fetch_add(&g_integrity_checks_performed, 1);
    atomic_fetch_add(&g_integrity_metadata_checks, 1);

    // Check 1: carved <= capacity
    if (state->carved > state->capacity) {
        atomic_fetch_add(&g_integrity_checks_failed, 1);
        return (IntegrityResult){
            .passed = false,
            .check_name = "METADATA_CARVED_OVERFLOW",
            .file = __FILE__,
            .line = __LINE__,
            .message = "carved > capacity (slab corruption)",
            .error_code = INTEGRITY_ERROR_METADATA_CARVED_OVERFLOW
        };
    }

    // Check 2: used <= carved
    if (state->used > state->carved) {
        atomic_fetch_add(&g_integrity_checks_failed, 1);
        return (IntegrityResult){
            .passed = false,
            .check_name = "METADATA_USED_GT_CARVED",
            .file = __FILE__,
            .line = __LINE__,
            .message = "used > carved (double-free or corruption)",
            .error_code = INTEGRITY_ERROR_METADATA_USED_GT_CARVED
        };
    }

    // Check 3: used <= capacity
    // (implied by checks 1 and 2; kept as a defensive cross-check)
    if (state->used > state->capacity) {
        atomic_fetch_add(&g_integrity_checks_failed, 1);
        return (IntegrityResult){
            .passed = false,
            .check_name = "METADATA_USED_OVERFLOW",
            .file = __FILE__,
            .line = __LINE__,
            .message = "used > capacity (counter corruption)",
            .error_code = INTEGRITY_ERROR_METADATA_USED_OVERFLOW
        };
    }

    // Check 4: free_count consistency
    // (used <= carved is guaranteed by check 2, so the subtraction cannot wrap)
    uint16_t expected_free = state->carved - state->used;
    if (state->free_count != expected_free) {
        atomic_fetch_add(&g_integrity_checks_failed, 1);
        return (IntegrityResult){
            .passed = false,
            .check_name = "METADATA_FREE_COUNT_MISMATCH",
            .file = __FILE__,
            .line = __LINE__,
            .message = "free_count != (carved - used)",
            .error_code = INTEGRITY_ERROR_METADATA_FREE_COUNT_MISMATCH
        };
    }

    // Check 5: Capacity is reasonable (not corrupted)
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
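The offset rule above can be sketched as follows, assuming the header-enabled configuration (`HAKMEM_TINY_HEADER_CLASSIDX != 0`). The `demo_*` names are hypothetical and this is not the actual body of `tiny_next_ptr_box.h`. The sketch uses `memcpy` because a pointer stored at offset 1 is not naturally aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Classes 0 and 7 keep the next pointer at offset 0; classes 1-6 keep it
 * at offset 1, right after the 1-byte header. */
static size_t demo_next_offset(int class_idx) {
    return (class_idx == 0 || class_idx == 7) ? 0 : 1;
}

/* Store next_value inside the free block at the class-specific offset.
 * memcpy avoids an unaligned pointer store at offset 1. */
static void demo_next_write(int class_idx, void* base, void* next_value) {
    memcpy((unsigned char*)base + demo_next_offset(class_idx),
           &next_value, sizeof next_value);
}

/* Load the next pointer back from the same offset. */
static void* demo_next_read(int class_idx, const void* base) {
    void* next;
    memcpy(&next,
           (const unsigned char*)base + demo_next_offset(class_idx),
           sizeof next);
    return next;
}
```

With a 16-byte class 1 block, a write leaves byte 0 (the header) untouched. With an 8-byte class 0 block, the pointer starts at byte 0 and overwrites the header while the block is on the freelist.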
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
    // Phase E1-CORRECT FIX: Tiny classes have varying capacities:
    //   - Class 0 (8B):   65536/8   = 8192 blocks per slab
    //   - Class 1 (16B):  65536/16  = 4096
    //   - Class 2 (32B):  65536/32  = 2048
    //   - Class 3 (64B):  65536/64  = 1024
    //   - Class 4 (128B): 65536/128 = 512
    // Use 10000 as safe upper bound (Class 0 max is 8192)
    if (state->capacity > 10000) {
        atomic_fetch_add(&g_integrity_checks_failed, 1);
        return (IntegrityResult){
            .passed = false,
            .check_name = "METADATA_CAPACITY_UNREASONABLE",
            .file = __FILE__,
            .line = __LINE__,
            .message = "capacity > 10000 (likely corrupted)",
            .error_code = INTEGRITY_ERROR_METADATA_CAPACITY_UNREASONABLE
        };
    }

    // Check 6: Freelist pointer validity
    // The freelist pointer should either be:
    //   - NULL (linear carving mode or empty freelist)
    //   - A valid pointer within the slab's address range
    //   - NOT uninitialized garbage like 0xa2a2a2a2a2a2a2a2
    if (state->freelist != NULL && state->slab_base != NULL) {
        uintptr_t freelist_addr = (uintptr_t)state->freelist;
        uintptr_t slab_start = (uintptr_t)state->slab_base;
        // NOTE: no slab-range check is performed below; slab_start is unused.
        (void)slab_start;

        // Detect obvious corruption patterns (0xa2, 0xcc, 0xdd, 0xfe are common debug fill patterns)
        // Assumes 8-byte pointers (x86-64).
        const uint8_t* freelist_bytes = (const uint8_t*)&freelist_addr;
        bool is_pattern_fill = (freelist_bytes[0] == freelist_bytes[1] &&
                                freelist_bytes[1] == freelist_bytes[2] &&
                                freelist_bytes[2] == freelist_bytes[3] &&
                                freelist_bytes[3] == freelist_bytes[4] &&
                                freelist_bytes[4] == freelist_bytes[5] &&
                                freelist_bytes[5] == freelist_bytes[6] &&
                                freelist_bytes[6] == freelist_bytes[7]);

        if (is_pattern_fill && (freelist_bytes[0] == 0xa2 ||
                                freelist_bytes[0] == 0xcc ||
                                freelist_bytes[0] == 0xdd ||
                                freelist_bytes[0] == 0xfe)) {
            atomic_fetch_add(&g_integrity_checks_failed, 1);
            fprintf(stderr, "[BOX I] CRITICAL: Uninitialized freelist detected!\n");
            fprintf(stderr, "[BOX I] freelist=%p (pattern: 0x%02x repeated)\n",
                    state->freelist, freelist_bytes[0]);
            fprintf(stderr, "[BOX I] carved=%u used=%u capacity=%u class=%u\n",
                    state->carved, state->used, state->capacity, state->class_idx);
            fprintf(stderr, "[BOX I] This indicates the slab was used before proper initialization!\n");
            return (IntegrityResult){
                .passed = false,
                .check_name = "METADATA_FREELIST_UNINITIALIZED",
                .file = __FILE__,
                .line = __LINE__,
                .message = "freelist contains uninitialized pattern (0xa2/0xcc/0xdd/0xfe)",
                .error_code = 0xA090
            };
        }

        // Basic range check (freelist should be within reasonable address space)
        // Kernel space on x86-64 starts at 0xffff800000000000
        if (freelist_addr >= 0xffff800000000000ULL) {
            atomic_fetch_add(&g_integrity_checks_failed, 1);
            return (IntegrityResult){
                .passed = false,
                .check_name = "METADATA_FREELIST_KERNEL_ADDR",
                .file = __FILE__,
                .line = __LINE__,
                .message = "freelist points to kernel space (corrupted)",
                .error_code = 0xA091
            };
        }
    }

    atomic_fetch_add(&g_integrity_checks_passed, 1);
    return (IntegrityResult){
        .passed = true,
        .check_name = "METADATA_OK",
        .file = __FILE__,
        .line = __LINE__,
        .message = "All metadata checks passed",
        .error_code = INTEGRITY_ERROR_OK
    };
}
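
The repeated-byte test in Check 6 can be isolated into a small helper. The following is a sketch under the assumption that only the four fill bytes named above matter. `demo_is_debug_fill` is a hypothetical name, and the loop form is a restructuring for illustration, not the code used in Box I.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns true when every byte of the pointer value is the same and that
 * byte is one of the debug fill patterns 0xa2, 0xcc, 0xdd, 0xfe. */
static bool demo_is_debug_fill(const void* p) {
    uintptr_t addr = (uintptr_t)p;
    unsigned char bytes[sizeof addr];
    memcpy(bytes, &addr, sizeof addr);  /* inspect the raw pointer bytes */

    for (size_t i = 1; i < sizeof addr; i++) {
        if (bytes[i] != bytes[0]) {
            return false;               /* not a single repeated byte */
        }
    }
    return bytes[0] == 0xa2 || bytes[0] == 0xcc ||
           bytes[0] == 0xdd || bytes[0] == 0xfe;
}
```

Looping over `sizeof addr` bytes avoids hard-coding an 8-byte pointer. `NULL` is all zero bytes and is therefore not reported, which matches the explicit `NULL` guard in the original check.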

// ============================================================================
// Periodic Full System Check
// ============================================================================

void integrity_periodic_full_check(const char* context) {
    atomic_fetch_add(&g_integrity_full_system_checks, 1);

    // Check all TLS canaries
    IntegrityResult canary_result = integrity_validate_tls_canaries(context);
    if (!canary_result.passed) {
        fprintf(stderr, "[INTEGRITY FAILURE] Periodic check failed: %s\n",
                canary_result.message);
        abort();
    }

    // Check TLS bounds for all classes
    for (uint8_t cls = 0; cls < TINY_NUM_CLASSES; cls++) {
        IntegrityResult bounds_result = integrity_validate_tls_bounds(cls, context);
        if (!bounds_result.passed) {
            fprintf(stderr, "[INTEGRITY FAILURE] Periodic check failed for class %u: %s\n",
                    cls, bounds_result.message);
            abort();
        }
    }
}

// ============================================================================
// Statistics API
// ============================================================================

IntegrityStatistics integrity_get_statistics(void) {
    IntegrityStatistics stats;
    stats.checks_performed = atomic_load(&g_integrity_checks_performed);
    stats.checks_passed = atomic_load(&g_integrity_checks_passed);
    stats.checks_failed = atomic_load(&g_integrity_checks_failed);
    stats.tls_bounds_checks = atomic_load(&g_integrity_tls_bounds_checks);
    stats.freelist_checks = atomic_load(&g_integrity_freelist_checks);
    stats.metadata_checks = atomic_load(&g_integrity_metadata_checks);
    stats.canary_checks = atomic_load(&g_integrity_canary_checks);
    stats.full_system_checks = atomic_load(&g_integrity_full_system_checks);
    return stats;
}

void integrity_print_statistics(void) {
    IntegrityStatistics stats = integrity_get_statistics();

    fprintf(stderr, "\n=== Box I: Integrity Statistics ===\n");
    fprintf(stderr, "Total checks performed: %lu\n", stats.checks_performed);
    fprintf(stderr, "  Passed: %lu (%.2f%%)\n", stats.checks_passed,
            stats.checks_performed > 0 ? 100.0 * stats.checks_passed / stats.checks_performed : 0.0);
    fprintf(stderr, "  Failed: %lu (%.2f%%)\n", stats.checks_failed,
            stats.checks_performed > 0 ? 100.0 * stats.checks_failed / stats.checks_performed : 0.0);
    fprintf(stderr, "\nBy check type:\n");
    fprintf(stderr, "  TLS bounds checks: %lu\n", stats.tls_bounds_checks);
    fprintf(stderr, "  Freelist checks: %lu\n", stats.freelist_checks);
    fprintf(stderr, "  Metadata checks: %lu (Priority ALPHA)\n", stats.metadata_checks);
    fprintf(stderr, "  Canary checks: %lu\n", stats.canary_checks);
    fprintf(stderr, "  Full system checks: %lu\n", stats.full_system_checks);
    fprintf(stderr, "===================================\n\n");
}
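
As a compact illustration of the counter pattern used by the statistics API, the sketch below keeps C11 atomic counters and takes a snapshot with `atomic_load`. The `demo_*` names and the `uint64_t` counter type are assumptions, since the real globals and `IntegrityStatistics` are declared elsewhere.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counters; static storage zero-initializes them. */
static _Atomic uint64_t demo_performed;
static _Atomic uint64_t demo_passed;
static _Atomic uint64_t demo_failed;

typedef struct { uint64_t performed, passed, failed; } DemoStats;

/* Record the outcome of one check; safe to call from multiple threads. */
static void demo_record(bool ok) {
    atomic_fetch_add(&demo_performed, 1);
    if (ok) {
        atomic_fetch_add(&demo_passed, 1);
    } else {
        atomic_fetch_add(&demo_failed, 1);
    }
}

/* Each counter is loaded atomically, but the three loads together are not
 * one consistent cut if other threads are still recording. */
static DemoStats demo_snapshot(void) {
    DemoStats s;
    s.performed = atomic_load(&demo_performed);
    s.passed    = atomic_load(&demo_passed);
    s.failed    = atomic_load(&demo_failed);
    return s;
}

/* Pass rate in percent, guarding against division by zero. */
static double demo_pass_percent(DemoStats s) {
    return s.performed > 0
               ? 100.0 * (double)s.passed / (double)s.performed
               : 0.0;
}
```

Because the snapshot is assembled from independent loads, `passed + failed` may briefly differ from `performed` under concurrency. For a debugging report printed at exit this is usually acceptable.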