Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
// tiny_nextptr.h - Authoritative next-pointer offset/load/store for tiny boxes
|
2025-11-11 21:49:05 +09:00
|
|
|
//
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
// Finalized Phase E1-CORRECT spec (物理制約込み):
|
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
|
|
|
// P0.1: C7 uses offset 0 (overwrites header), C0-C6 use offset 1 (header preserved)
|
2025-11-11 21:49:05 +09:00
|
|
|
//
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
// HAKMEM_TINY_HEADER_CLASSIDX != 0 のとき:
|
|
|
|
|
//
|
|
|
|
|
// Class 0:
|
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
|
|
|
// [1B header][15B payload] (total 16B)
|
|
|
|
|
// → headerは保持し、next は header直後 base+1 に格納
|
|
|
|
|
// → next_off = 1
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
//
|
|
|
|
|
// Class 1〜6:
|
|
|
|
|
// [1B header][payload >= 8B]
|
|
|
|
|
// → headerは保持し、next は header直後 base+1 に格納
|
|
|
|
|
// → next_off = 1
|
|
|
|
|
//
|
|
|
|
|
// Class 7:
|
2025-11-21 13:44:05 +09:00
|
|
|
// [1B header][payload 2047B]
|
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
|
|
|
// → headerは上書きし、next は base+0 に格納(最大サイズなので許容)
|
|
|
|
|
// → next_off = 0
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
//
|
|
|
|
|
// HAKMEM_TINY_HEADER_CLASSIDX == 0 のとき:
|
|
|
|
|
//
|
|
|
|
|
// 全クラス headerなし → next_off = 0
|
|
|
|
|
//
|
|
|
|
|
// このヘッダは上記仕様を唯一の真実として提供する。
|
|
|
|
|
// すべての tiny freelist / TLS / fast-cache / refill / SLL で
|
|
|
|
|
// tiny_next_off/tiny_next_load/tiny_next_store を経由すること。
|
|
|
|
|
// 直接の *(void**) アクセスやローカルな offset 分岐は使用禁止。
|
2025-11-11 21:49:05 +09:00
|
|
|
|
|
|
|
|
#ifndef TINY_NEXTPTR_H
|
|
|
|
|
#define TINY_NEXTPTR_H
|
|
|
|
|
|
|
|
|
|
#include <stdint.h>
|
|
|
|
|
#include <string.h>
|
2025-11-28 14:11:37 +09:00
|
|
|
#include <stdlib.h> // P2.3: for getenv()
|
2025-11-11 21:49:05 +09:00
|
|
|
#include "hakmem_build_flags.h"
|
2025-11-21 13:44:05 +09:00
|
|
|
#include "tiny_region_id.h" // HEADER_MAGIC/HEADER_CLASS_MASK for header repair/logging
|
|
|
|
|
#include "hakmem_super_registry.h" // hak_super_lookup
|
|
|
|
|
#include "superslab/superslab_inline.h" // slab_index_for
|
|
|
|
|
#include <stdio.h>
|
|
|
|
|
#include <stdatomic.h>
|
|
|
|
|
#include <dlfcn.h>
|
|
|
|
|
#include <execinfo.h> // backtrace for rare misalign diagnostics
|
2025-11-11 21:49:05 +09:00
|
|
|
|
|
|
|
|
// Compute freelist next-pointer offset within a block for the given class.
|
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
|
|
|
// P0.1: C7 uses offset 0 (overwrites header), C0-C6 use offset 1 (header preserved)
|
2025-11-11 21:49:05 +09:00
|
|
|
static inline __attribute__((always_inline)) size_t tiny_next_off(int class_idx) {
|
|
|
|
|
#if HAKMEM_TINY_HEADER_CLASSIDX
|
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
|
|
|
// C7 (2048B): offset 0 (overwrites header in freelist - largest class can tolerate)
|
|
|
|
|
// C0-C6: offset 1 (header preserved - user data is not disturbed)
|
|
|
|
|
return (class_idx == 7) ? 0u : 1u;
|
2025-11-11 21:49:05 +09:00
|
|
|
#else
|
|
|
|
|
(void)class_idx;
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
return 0u;
|
2025-11-11 21:49:05 +09:00
|
|
|
#endif
|
|
|
|
|
}
|
|
|
|
|
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
// Safe load of next pointer from a block base.
|
2025-11-11 21:49:05 +09:00
|
|
|
static inline __attribute__((always_inline)) void* tiny_next_load(const void* base, int class_idx) {
|
|
|
|
|
size_t off = tiny_next_off(class_idx);
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
|
|
|
|
|
if (off == 0) {
|
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
|
|
|
// Aligned access at base (header無し or C7 freelist時)
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
return *(void* const*)base;
|
2025-11-11 21:49:05 +09:00
|
|
|
}
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
|
|
|
|
|
// off != 0: use memcpy to avoid UB on architectures that forbid unaligned loads.
|
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
|
|
|
// C0-C6: offset 1 (header preserved)
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
void* next = NULL;
|
|
|
|
|
const uint8_t* p = (const uint8_t*)base + off;
|
|
|
|
|
memcpy(&next, p, sizeof(void*));
|
|
|
|
|
return next;
|
2025-11-11 21:49:05 +09:00
|
|
|
}
|
|
|
|
|
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
// Safe store of next pointer into a block base.
|
2025-11-28 14:11:37 +09:00
|
|
|
// P2.3: Header restoration is now conditional (default: skip when class_map is active)
|
|
|
|
|
// - When class_map is used for class_idx lookup (default), header restoration is unnecessary
|
|
|
|
|
// - Alloc path always writes fresh header before returning block to user (HAK_RET_ALLOC)
|
|
|
|
|
// - ENV: HAKMEM_TINY_RESTORE_HEADER=1 to force header restoration (legacy mode)
|
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
|
|
|
// P0.1: C7 uses offset 0 (overwrites header), C0-C6 use offset 1 (header preserved)
|
2025-11-11 21:49:05 +09:00
|
|
|
static inline __attribute__((always_inline)) void tiny_next_store(void* base, int class_idx, void* next) {
|
|
|
|
|
size_t off = tiny_next_off(class_idx);
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
|
2025-11-21 13:44:05 +09:00
|
|
|
#if HAKMEM_TINY_HEADER_CLASSIDX
|
2025-11-28 14:11:37 +09:00
|
|
|
// P2.3: Skip header restoration by default (class_map is now default for class_idx lookup)
|
|
|
|
|
// ENV: HAKMEM_TINY_RESTORE_HEADER=1 to force header restoration (legacy fallback mode)
|
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
|
|
|
if (off != 0) {
|
2025-11-28 14:11:37 +09:00
|
|
|
static int g_restore_header = -1;
|
|
|
|
|
if (__builtin_expect(g_restore_header == -1, 0)) {
|
|
|
|
|
const char* e = getenv("HAKMEM_TINY_RESTORE_HEADER");
|
|
|
|
|
g_restore_header = (e && *e && *e != '0') ? 1 : 0;
|
|
|
|
|
}
|
|
|
|
|
if (__builtin_expect(g_restore_header, 0)) {
|
|
|
|
|
// Legacy mode: Restore header for classes that preserve it (C0-C6)
|
|
|
|
|
*(uint8_t*)base = HEADER_MAGIC | (class_idx & HEADER_CLASS_MASK);
|
|
|
|
|
}
|
2025-11-21 13:44:05 +09:00
|
|
|
}
|
|
|
|
|
#endif
|
|
|
|
|
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
if (off == 0) {
|
Tiny Pool redesign: P0.1, P0.3, P1.1, P1.2 - Out-of-band class_idx lookup
This commit implements the first phase of Tiny Pool redesign based on
ChatGPT architecture review. The goal is to eliminate Header/Next pointer
conflicts by moving class_idx lookup out-of-band (to SuperSlab metadata).
## P0.1: C0(8B) class upgraded to 16B
- Size table changed: {16,32,64,128,256,512,1024,2048} (8 classes)
- LUT updated: 1..16 → class 0, 17..32 → class 1, etc.
- tiny_next_off: C0 now uses offset 1 (header preserved)
- Eliminates edge cases for 8B allocations
## P0.3: Slab reuse guard Box (tls_slab_reuse_guard_box.h)
- New Box for draining TLS SLL before slab reuse
- ENV gate: HAKMEM_TINY_SLAB_REUSE_GUARD=1
- Prevents stale pointers when slabs are recycled
- Follows Box theory: single responsibility, minimal API
## P1.1: SuperSlab class_map addition
- Added uint8_t class_map[SLABS_PER_SUPERSLAB_MAX] to SuperSlab
- Maps slab_idx → class_idx for out-of-band lookup
- Initialized to 255 (UNASSIGNED) on SuperSlab creation
- Set correctly on slab initialization in all backends
## P1.2: Free fast path uses class_map
- ENV gate: HAKMEM_TINY_USE_CLASS_MAP=1
- Free path can now get class_idx from class_map instead of Header
- Falls back to Header read if class_map returns invalid value
- Fixed Legacy Backend dynamic slab initialization bug
## Documentation added
- HAKMEM_ARCHITECTURE_OVERVIEW.md: 4-layer architecture analysis
- TLS_SLL_ARCHITECTURE_INVESTIGATION.md: Root cause analysis
- PTR_LIFECYCLE_TRACE_AND_ROOT_CAUSE_ANALYSIS.md: Pointer tracking
- TINY_REDESIGN_CHECKLIST.md: Implementation roadmap (P0-P3)
## Test results
- Baseline: 70% success rate (30% crash - pre-existing issue)
- class_map enabled: 70% success rate (same as baseline)
- Performance: ~30.5M ops/s (unchanged)
## Next steps (P1.3, P2, P3)
- P1.3: Add meta->active for accurate TLS/freelist sync
- P2: TLS SLL redesign with Box-based counting
- P3: Complete Header out-of-band migration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-28 13:42:39 +09:00
|
|
|
// Aligned access at base (overwrites header for C7).
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
*(void**)base = next;
|
2025-11-11 21:49:05 +09:00
|
|
|
return;
|
|
|
|
|
}
|
Phase E3-FINAL: Fix Box API offset bugs - ALL classes now use correct offsets
## Root Cause Analysis (GPT5)
**Physical Layout Constraints**:
- Class 0: 8B = [1B header][7B payload] → offset 1 = 9B needed = ❌ IMPOSSIBLE
- Class 1-6: >=16B = [1B header][15B+ payload] → offset 1 = ✅ POSSIBLE
- Class 7: 1KB → offset 0 (compatibility)
**Correct Specification**:
- HAKMEM_TINY_HEADER_CLASSIDX != 0:
- Class 0, 7: next at offset 0 (overwrites header when on freelist)
- Class 1-6: next at offset 1 (after header)
- HAKMEM_TINY_HEADER_CLASSIDX == 0:
- All classes: next at offset 0
**Previous Bug**:
- Attempted "ALL classes offset 1" unification
- Class 0 with offset 1 caused immediate SEGV (9B > 8B block size)
- Mixed 2-arg/3-arg API caused confusion
## Fixes Applied
### 1. Restored 3-Argument Box API (core/box/tiny_next_ptr_box.h)
```c
// Correct signatures
void tiny_next_write(int class_idx, void* base, void* next_value)
void* tiny_next_read(int class_idx, const void* base)
// Correct offset calculation
size_t offset = (class_idx == 0 || class_idx == 7) ? 0 : 1;
```
### 2. Updated 123+ Call Sites Across 34 Files
- hakmem_tiny_hot_pop_v4.inc.h (4 locations)
- hakmem_tiny_fastcache.inc.h (3 locations)
- hakmem_tiny_tls_list.h (12 locations)
- superslab_inline.h (5 locations)
- tiny_fastcache.h (3 locations)
- ptr_trace.h (macro definitions)
- tls_sll_box.h (2 locations)
- + 27 additional files
Pattern: `tiny_next_read(base)` → `tiny_next_read(class_idx, base)`
Pattern: `tiny_next_write(base, next)` → `tiny_next_write(class_idx, base, next)`
### 3. Added Sentinel Detection Guards
- tiny_fast_push(): Block nodes with sentinel in ptr or ptr->next
- tls_list_push(): Block nodes with sentinel in ptr or ptr->next
- Defense-in-depth against remote free sentinel leakage
## Verification (GPT5 Report)
**Test Command**: `./out/release/bench_random_mixed_hakmem --iterations=70000`
**Results**:
- ✅ Main loop completed successfully
- ✅ Drain phase completed successfully
- ✅ NO SEGV (previous crash at iteration 66151 is FIXED)
- ℹ️ Final log: "tiny_alloc(1024) failed" is normal fallback to Mid/ACE layers
**Analysis**:
- Class 0 immediate SEGV: ✅ RESOLVED (correct offset 0 now used)
- 66K iteration crash: ✅ RESOLVED (offset consistency fixed)
- Box API conflicts: ✅ RESOLVED (unified 3-arg API)
## Technical Details
### Offset Logic Justification
```
Class 0: 8B block → next pointer (8B) fits ONLY at offset 0
Class 1: 16B block → next pointer (8B) fits at offset 1 (after 1B header)
Class 2: 32B block → next pointer (8B) fits at offset 1
...
Class 6: 512B block → next pointer (8B) fits at offset 1
Class 7: 1024B block → offset 0 for legacy compatibility
```
### Files Modified (Summary)
- Core API: `box/tiny_next_ptr_box.h`
- Hot paths: `hakmem_tiny_hot_pop*.inc.h`, `tiny_fastcache.h`
- TLS layers: `hakmem_tiny_tls_list.h`, `hakmem_tiny_tls_ops.h`
- SuperSlab: `superslab_inline.h`, `tiny_superslab_*.inc.h`
- Refill: `hakmem_tiny_refill.inc.h`, `tiny_refill_opt.h`
- Free paths: `tiny_free_magazine.inc.h`, `tiny_superslab_free.inc.h`
- Documentation: Multiple Phase E3 reports
## Remaining Work
None for Box API offset bugs - all structural issues resolved.
Future enhancements (non-critical):
- Periodic `grep -R '*(void**)' core/` to detect direct pointer access violations
- Enforce Box API usage via static analysis
- Document offset rationale in architecture docs
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-13 06:50:20 +09:00
|
|
|
|
|
|
|
|
// off != 0: use memcpy for portability / UB-avoidance.
|
|
|
|
|
uint8_t* p = (uint8_t*)base + off;
|
|
|
|
|
memcpy(p, &next, sizeof(void*));
|
2025-11-11 21:49:05 +09:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
#endif // TINY_NEXTPTR_H
|