hakmem/core/box/smallsegment_v6_box.h


Phase v6-1/2/3/4: SmallObject Core v6 - C6-only implementation + refactor

Phase v6-1: C6-only route stub (v1/pool fallback)

Phase v6-2: Segment v6 + ColdIface v6 + Core v6 HotPath implementation
- 2MiB segment / 64KiB page allocation
- O(1) ptr→page_meta lookup with segment masking
- C6-heavy A/B: SEGV-free but -44% performance (15.3M ops/s)

Phase v6-3: Thin-layer optimization (TLS ownership check + batch header + refill batching)
- TLS ownership fast-path skip page_meta for 90%+ of frees
- Batch header writes during refill (32 allocs = 1 header write)
- TLS batch refill (1/32 refill frequency)
- C6-heavy A/B: v6-2 15.3M → v6-3 27.1M ops/s (±0% vs baseline) ✅

Phase v6-4: Mixed hang fix (segment metadata lookup correction)
- Root cause: metadata lookup was reading mmap region instead of TLS slot
- Fix: use TLS slot descriptor with in_use validation
- Mixed health: 5M iterations SEGV-free, 35.8M ops/s ✅

Phase v6-refactor: Code quality improvements (macro unification + inline + docs)
- Add SMALL_V6_* prefix macros (header, pointer conversion, page index)
- Extract inline validation functions (small_page_v6_valid, small_ptr_in_segment_v6)
- Doxygen-style comments for all public functions
- Result: 0 compiler warnings, maintained +1.2% performance

Files:
- core/box/smallobject_core_v6_box.h (new, type & API definitions)
- core/box/smallobject_cold_iface_v6.h (new, cold iface API)
- core/box/smallsegment_v6_box.h (new, segment type definitions)
- core/smallobject_core_v6.c (new, C6 alloc/free implementation)
- core/smallobject_cold_iface_v6.c (new, refill/retire logic)
- core/smallsegment_v6.c (new, segment allocator)
- docs/analysis/SMALLOBJECT_CORE_V6_DESIGN.md (new, design document)
- core/box/tiny_route_env_box.h (modified, v6 route added)
- core/front/malloc_tiny_fast.h (modified, v6 case in route switch)
- Makefile (modified, v6 objects added)
- CURRENT_TASK.md (modified, v6 status added)

Status:
- C6-heavy: v6 OFF 27.1M → v6-3 ON 27.1M ops/s (±0%) ✅
- Mixed: v6 ON 35.8M ops/s (C6-only, other classes via v1) ✅
- Build: 0 warnings, fully documented ✅

🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 15:29:59 +09:00
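The "O(1) ptr→page_meta lookup with segment masking" item in the commit message above can be illustrated with a minimal sketch. It assumes every segment is aligned to its own 2 MiB size, which the commit message does not state, and the names `SEG_SIZE`, `PAGE_SHIFT`, `sketch_segment_base`, `sketch_page_index`, and `sketch_self_check` are hypothetical. Per the v6-4 note, the real lookup validates against a TLS slot descriptor rather than reading the mapped region, so treat this only as the arithmetic behind the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_SIZE   ((uintptr_t)2 * 1024 * 1024)  /* 2 MiB segment, from the commit message */
#define PAGE_SHIFT 16                            /* log2(64 KiB page) */

/* Masking only works when the segment size is a power of two. */
_Static_assert((SEG_SIZE & (SEG_SIZE - 1)) == 0, "segment size must be a power of two");

/* Hypothetical helper: round an address down to its segment base.
 * ASSUMPTION: segments are SEG_SIZE-aligned; not confirmed by the source. */
static inline uintptr_t sketch_segment_base(uintptr_t addr) {
    return addr & ~(SEG_SIZE - 1);
}

/* Hypothetical helper: index of the 64 KiB page that holds addr,
 * counted from the start of its (assumed aligned) segment. */
static inline size_t sketch_page_index(uintptr_t addr) {
    return (size_t)((addr & (SEG_SIZE - 1)) >> PAGE_SHIFT);
}

/* Small self-check on fabricated addresses; nothing is dereferenced. */
static int sketch_self_check(void) {
    uintptr_t addr = (uintptr_t)0x40212345u;   /* base 0x40200000 + offset 0x12345 */
    assert(sketch_segment_base(addr) == (uintptr_t)0x40200000u);
    assert(sketch_page_index(addr) == 1);      /* 0x12345 >> 16 */
    return 0;
}
```

Calling `sketch_self_check()` from a test `main` exercises both helpers. Both are a single mask (plus a shift), which is what makes the lookup constant-time regardless of how many segments exist.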
// smallsegment_v6_box.h - SmallSegment v6 type definitions (Phase v6-2)
#ifndef HAKMEM_SMALLSEGMENT_V6_BOX_H
#define HAKMEM_SMALLSEGMENT_V6_BOX_H
#include <stddef.h>  // NULL (used by the inline helpers below)
#include <stdint.h>
// Segment constants
#define SMALL_SEGMENT_V6_SIZE (2 * 1024 * 1024) // 2 MiB
#define SMALL_PAGE_V6_SIZE (64 * 1024) // 64 KiB
#define SMALL_PAGES_PER_SEGMENT (SMALL_SEGMENT_V6_SIZE / SMALL_PAGE_V6_SIZE) // 32
#define SMALL_SEGMENT_V6_MAGIC 0xC06E56u // C0(re) v6
#define SMALL_PAGE_V6_SHIFT 16 // log2(64KiB)
// C6 configuration
#define SMALL_V6_C6_CLASS_IDX 6
#define SMALL_V6_C6_BLOCK_SIZE 512
// C5 configuration (Phase v6-5)
#define SMALL_V6_C5_CLASS_IDX 5
#define SMALL_V6_C5_BLOCK_SIZE 256
Phase FREE-FRONT-V3-1: Free route snapshot infrastructure + build fix

Summary:
========
Implemented Phase FREE-FRONT-V3 infrastructure to optimize free hotpath by:
1. Creating snapshot-based route decision table (consolidating route logic)
2. Removing redundant ENV checks from hot path
3. Preparing for future integration into hak_free_at()

Key Changes:
============
1. NEW FILES:
   - core/box/free_front_v3_env_box.h: Route snapshot definition & API
   - core/box/free_front_v3_env_box.c: Snapshot initialization & caching
2. Infrastructure Details:
   - FreeRouteSnapshotV3: Maps class_idx → free_route_kind for all 8 classes
   - Routes defined: LEGACY, TINY_V3, CORE_V6_C6, POOL_V1
   - ENV-gated initialization (HAKMEM_TINY_FREE_FRONT_V3_ENABLED, default OFF)
   - Per-thread TLS caching to avoid repeated ENV reads
3. Design Goals:
   - Consolidate tiny_route_for_class() results into snapshot table
   - Remove C7 ULTRA / v4 / v5 / v6 ENV checks from hot path
   - Limit lookup (ss_fast_lookup/slab_index_for) to paths that truly need it
   - Clear ownership boundary: front v3 handles routing, downstream handles free
4. Phase Plan:
   - v3-1 ✅ COMPLETE: Infrastructure (snapshot table, ENV initialization, TLS cache)
   - v3-2 (INFRASTRUCTURE ONLY): Placeholder integration in hak_free_api.inc.h
   - v3-3 (FUTURE): Full integration + benchmark A/B to measure hotpath improvement
5. BUILD FIX:
   - Added missing core/box/c7_meta_used_counter_box.o to OBJS_BASE in Makefile
   - This symbol was referenced but not linked, causing undefined reference errors
   - Benchmark targets now build cleanly without LTO

Status:
=======
- Build: ✅ PASS (bench_allocators_hakmem builds without errors)
- Integration: Currently DISABLED (default OFF, ready for v3-2 phase)
- No performance impact: Infrastructure-only, hotpath unchanged

Future Work:
============
- Phase v3-2: Integrate snapshot routing into hak_free_at() main path
- Phase v3-3: Measure free hotpath performance improvement (target: 1-2% less branch mispredict)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-11 19:17:30 +09:00
// C4 configuration (Phase v6-6)
#define SMALL_V6_C4_CLASS_IDX 4
#define SMALL_V6_C4_BLOCK_SIZE 128
// Page index calculation macro: 'seg' is a SmallSegmentV6*, 'addr' must lie inside that segment
#define SMALL_V6_PAGE_IDX(seg, addr) (((uintptr_t)(addr) - (seg)->base) >> SMALL_PAGE_V6_SHIFT)
// Forward declaration
typedef struct SmallPageMetaV6 SmallPageMetaV6;
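// Page metadata (the typedef name comes from the forward declaration above;
// repeating the typedef here is only valid from C11 onward)
struct SmallPageMetaV6 {
    void* free_list;    // freelist head (the start of each free block stores the next pointer)
    uint16_t used;      // number of slots currently in use
    uint16_t capacity;  // number of slots in the page
    uint8_t class_idx;  // size class
    uint8_t flags;      // FULL / PARTIAL / REMOTE_PENDING, etc.
    uint16_t page_idx;  // index within the segment
    void* segment;      // backpointer to the owning SmallSegmentV6*
};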
// Segment structure
typedef struct SmallSegmentV6 {
    uintptr_t base;      // Segment base address
    uint32_t num_pages;  // Number of pages (typically 32)
    uint32_t owner_tid;  // Owner thread ID
    uint32_t magic;      // SMALL_SEGMENT_V6_MAGIC (0xC06E56) once initialized
    SmallPageMetaV6 page_meta[SMALL_PAGES_PER_SEGMENT];
} SmallSegmentV6;
// ============================================================================
// Inline Helper Functions
// ============================================================================
/// Check if page is valid and active
static inline int small_page_v6_valid(SmallPageMetaV6* page) {
    return page != NULL && page->capacity > 0;
}
/// Check if pointer is within segment bounds
static inline int small_ptr_in_segment_v6(SmallSegmentV6* seg, void* ptr) {
    uintptr_t addr = (uintptr_t)ptr;
    // Compare the offset instead of base + size so the check cannot wrap around.
    return addr >= seg->base && (addr - seg->base) < (uintptr_t)SMALL_SEGMENT_V6_SIZE;
}
/// Check if segment is valid and initialized
static inline int small_segment_v6_valid(SmallSegmentV6* seg) {
    return seg != NULL && seg->magic == SMALL_SEGMENT_V6_MAGIC;
}
// ============================================================================
// API
// ============================================================================
SmallSegmentV6* small_segment_v6_acquire_for_thread(void);
void small_segment_v6_release(SmallSegmentV6* seg);
SmallPageMetaV6* small_page_meta_v6_of(void* ptr);
#endif // HAKMEM_SMALLSEGMENT_V6_BOX_H
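
The validation helpers and the `SMALL_V6_PAGE_IDX` macro in the header above are meant to be combined: check the magic, check the bounds, then compute the page index. The following is a minimal sketch of that order of operations, not the allocator's actual free path. `SketchSegment` and `sketch_checked_page_idx` are hypothetical names, and the macros are trimmed copies of the header's definitions so the block compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed copies of the header's constants so this sketch is self-contained. */
#define SMALL_SEGMENT_V6_SIZE  (2 * 1024 * 1024)
#define SMALL_PAGE_V6_SHIFT    16
#define SMALL_SEGMENT_V6_MAGIC 0xC06E56u
#define SMALL_V6_PAGE_IDX(seg, addr) \
    (((uintptr_t)(addr) - (seg)->base) >> SMALL_PAGE_V6_SHIFT)

/* Hypothetical stand-in for SmallSegmentV6 with only the fields used here. */
typedef struct {
    uintptr_t base;
    uint32_t  magic;
} SketchSegment;

/* Hypothetical helper: returns the page index of ptr inside seg, or -1 when
 * the segment is missing/uninitialized or ptr lies outside its 2 MiB range.
 * The pointer is only inspected as an integer, never dereferenced. */
static int sketch_checked_page_idx(const SketchSegment* seg, const void* ptr) {
    if (seg == NULL || seg->magic != SMALL_SEGMENT_V6_MAGIC) {
        return -1;  /* same test as small_segment_v6_valid() */
    }
    uintptr_t addr = (uintptr_t)ptr;
    if (addr < seg->base ||
        addr - seg->base >= (uintptr_t)SMALL_SEGMENT_V6_SIZE) {
        return -1;  /* same intent as small_ptr_in_segment_v6() */
    }
    return (int)SMALL_V6_PAGE_IDX(seg, addr);
}
```

For a segment whose `base` is `0x200000`, an address of `0x230200` lies `0x30200` bytes in, so the shift by 16 selects page 3, while `0x400000` is exactly one segment size past the base and is rejected. Doing the bounds check before the macro matters because the macro itself performs an unchecked subtraction.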