hakmem/ANALYSIS_INDEX.md
Moe Charm (CI) 1da8754d45 CRITICAL FIX: Fully resolve the 4T SEGV caused by uninitialized TLS
**Problem:**
- Larson 4T crashed with SEGV 100% of the time (1T ran to completion at 2.09M ops/s)
- System/mimalloc ran normally at 4T with 33.52M ops/s
- 4T still crashed with SEGV even with SS OFF + Remote OFF

**Root cause (findings from the Task agent ultrathink investigation):**
```
CRASH: mov (%r15),%r13
R15 = 0x6261  ← ASCII "ba" (garbage value, uninitialized TLS)
```

TLS variables in worker threads were uninitialized:
- `__thread void* g_tls_sll_head[TINY_NUM_CLASSES];`  ← no initializer
- Not zero-initialized in threads created by pthread_create()
- The NULL check passed (0x6261 != NULL) → dereference → SEGV

**Fix:**
Added an explicit `= {0}` initializer to every TLS array:

1. **core/hakmem_tiny.c:**
   - `g_tls_sll_head[TINY_NUM_CLASSES] = {0}`
   - `g_tls_sll_count[TINY_NUM_CLASSES] = {0}`
   - `g_tls_live_ss[TINY_NUM_CLASSES] = {0}`
   - `g_tls_bcur[TINY_NUM_CLASSES] = {0}`
   - `g_tls_bend[TINY_NUM_CLASSES] = {0}`

2. **core/tiny_fastcache.c:**
   - `g_tiny_fast_cache[TINY_FAST_CLASS_COUNT] = {0}`
   - `g_tiny_fast_count[TINY_FAST_CLASS_COUNT] = {0}`
   - `g_tiny_fast_free_head[TINY_FAST_CLASS_COUNT] = {0}`
   - `g_tiny_fast_free_count[TINY_FAST_CLASS_COUNT] = {0}`

3. **core/hakmem_tiny_magazine.c:**
   - `g_tls_mags[TINY_NUM_CLASSES] = {0}`

4. **core/tiny_sticky.c:**
   - `g_tls_sticky_ss[TINY_NUM_CLASSES][TINY_STICKY_RING] = {0}`
   - `g_tls_sticky_idx[TINY_NUM_CLASSES][TINY_STICKY_RING] = {0}`
   - `g_tls_sticky_pos[TINY_NUM_CLASSES] = {0}`
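
The pattern listed above can be sketched as follows. This is a minimal illustration, not the HAKMEM source: `DEMO_NUM_CLASSES`, `demo_tls_head`, `demo_worker`, and `demo_worker_sees_null_heads` are hypothetical names, and the array size of 8 is an assumption. It declares a TLS array with the explicit `= {0}` initializer and checks, from a thread started with `pthread_create()`, that every slot reads as NULL before first use.

```c
#include <pthread.h>
#include <stddef.h>

#define DEMO_NUM_CLASSES 8  /* hypothetical; stands in for TINY_NUM_CLASSES */

/* TLS free-list heads declared with the explicit initializer used in the fix. */
static __thread void* demo_tls_head[DEMO_NUM_CLASSES] = {0};

/* Thread body: count the slots that are not NULL on entry. */
static void* demo_worker(void* arg) {
    int* nonnull = (int*)arg;
    for (int i = 0; i < DEMO_NUM_CLASSES; i++) {
        if (demo_tls_head[i] != NULL) (*nonnull)++;
    }
    return NULL;
}

/* Returns 1 if a fresh worker thread sees all-NULL heads,
 * 0 if any slot is non-NULL, and -1 if the thread could not be run. */
int demo_worker_sees_null_heads(void) {
    int nonnull = 0;
    pthread_t t;
    if (pthread_create(&t, NULL, demo_worker, &nonnull) != 0) return -1;
    if (pthread_join(t, NULL) != 0) return -1;
    return nonnull == 0;
}
```

A check like this could run once at startup in a debug build; link with `-pthread` where the toolchain requires it.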

**Results:**
```
Before: 1T: 2.09M   |  4T: SEGV 💀
After:  1T: 2.41M   |  4T: 4.19M   (+15% 1T, SEGV resolved)
```

**Tests:**
```bash
# 1 thread: runs to completion
./larson_hakmem 2 8 128 1024 1 12345 1
# → Throughput = 2,407,597 ops/s

# 4 threads: runs to completion (previously SEGV)
./larson_hakmem 2 8 128 1024 1 12345 4
# → Throughput = 4,192,155 ops/s
```

**Investigation credit:** the Task agent (ultrathink mode) pinpointed the root cause precisely

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-07 01:27:04 +09:00

Large Files Analysis - Document Index

Overview

Comprehensive analysis of the files over 1,000 lines in the HAKMEM allocator codebase, with detailed refactoring recommendations and an implementation plan.

Analysis Date: 2025-11-06
Status: COMPLETE - Ready for Implementation
Scope: 5 large files, 9,008 lines (28% of codebase)


Documents

1. LARGE_FILES_ANALYSIS.md (645 lines) - Main Analysis Report

Length: 645 lines | Read Time: 30-40 minutes

Contents:

  • Executive summary with priority matrix
  • Detailed analysis of each of the 5 large files:
    • hakmem_pool.c (2,592 lines)
    • hakmem_tiny.c (1,765 lines)
    • hakmem.c (1,745 lines)
    • hakmem_tiny_free.inc (1,711 lines) - CRITICAL
    • hakmem_l25_pool.c (1,195 lines)

For each file:

  • Primary responsibilities
  • Code structure breakdown (line ranges)
  • Key functions listing
  • Include analysis
  • Cross-file dependencies
  • Complexity metrics
  • Refactoring recommendations with rationale

Key Findings:

  • hakmem_tiny_free.inc: Average 171 lines per function (EXTREME - should be 20-30)
  • hakmem_pool.c: 65 functions mixed across 4 responsibilities
  • hakmem_tiny.c: 35 header includes (extreme coupling)
  • hakmem.c: 38 includes, mixing API + dispatch + config
  • hakmem_l25_pool.c: Code duplication with MidPool

When to Use:

  • First-time readers who want the detailed analysis
  • Technical discussions and design reviews
  • Understanding current code structure

2. LARGE_FILES_REFACTORING_PLAN.md (577 lines) - Implementation Guide

Length: 577 lines | Read Time: 20-30 minutes

Contents:

  • Critical path timeline (5 phases)
  • Phase-by-phase implementation details:
    • Phase 1: Tiny Free Path (Week 1) - CRITICAL
    • Phase 2: Pool Manager (Week 2) - CRITICAL
    • Phase 3: Tiny Core (Week 3) - CRITICAL
    • Phase 4: Main Dispatcher (Week 4) - HIGH
    • Phase 5: Pool Core Library (Week 5) - HIGH

For each phase:

  • Specific deliverables
  • Metrics (before/after)
  • Build integration details
  • Dependency graphs
  • Expected results

Additional sections:

  • Before/after dependency graph visualization
  • Metrics comparison table
  • Risk mitigation strategies
  • Success criteria checklist
  • Time & effort estimates
  • Rollback procedures
  • Next immediate steps

Key Timeline:

  • Total: 14 days of effort (sum of the phases below), about 2 weeks for 1 developer or 1 week for 2 developers
  • Phase 1: 3 days (Tiny Free, CRITICAL)
  • Phase 2: 4 days (Pool, CRITICAL)
  • Phase 3: 3 days (Tiny core consolidation, CRITICAL)
  • Phase 4: 2 days (Dispatcher split, HIGH)
  • Phase 5: 2 days (Pool core library, HIGH)

When to Use:

  • Implementation planning
  • Work breakdown structure
  • Parallel work assignment
  • Risk assessment
  • Timeline estimation

3. LARGE_FILES_QUICK_REFERENCE.md (270 lines) - Quick Reference

Length: 270 lines | Read Time: 10-15 minutes

Contents:

  • TL;DR problem summary
  • TL;DR solution summary (5 phases)
  • Quick reference tables
  • Phase 1 quick start checklist
  • Key metrics to track (before/after)
  • FAQ section
  • File organization diagram
  • Next steps checklist

Key Checklists:

  • Phase 1 (Tiny Free): 10-point implementation checklist
  • Success criteria per phase
  • Metrics to establish baseline

When to Use:

  • Executive summary for stakeholders
  • Quick review before meetings
  • Team onboarding
  • Daily progress tracking
  • Decision-making checklist

Quick Navigation

By Role

Technical Lead:

  1. Start: LARGE_FILES_QUICK_REFERENCE.md (overview)
  2. Deep dive: LARGE_FILES_ANALYSIS.md (current state)
  3. Plan: LARGE_FILES_REFACTORING_PLAN.md (implementation)

Developer:

  1. Start: LARGE_FILES_QUICK_REFERENCE.md (quick reference)
  2. Checklist: Phase-specific section in LARGE_FILES_REFACTORING_PLAN.md
  3. Details: Relevant section in LARGE_FILES_ANALYSIS.md

Project Manager:

  1. Overview: LARGE_FILES_QUICK_REFERENCE.md (TL;DR)
  2. Timeline: LARGE_FILES_REFACTORING_PLAN.md (phase breakdown)
  3. Metrics: Metrics section in LARGE_FILES_QUICK_REFERENCE.md

Code Reviewer:

  1. Analysis: LARGE_FILES_ANALYSIS.md (current structure)
  2. Refactoring: LARGE_FILES_REFACTORING_PLAN.md (expected changes)
  3. Checklist: Success criteria in LARGE_FILES_REFACTORING_PLAN.md

By Priority

CRITICAL READS (required):

  • LARGE_FILES_ANALYSIS.md - Detailed problem analysis
  • LARGE_FILES_REFACTORING_PLAN.md - Implementation approach

HIGHLY RECOMMENDED (important):

  • LARGE_FILES_QUICK_REFERENCE.md - Overview and checklists

Key Statistics

Current State (Before)

  • Files over 1000 lines: 5
  • Total lines in large files: 9,008 (28% of 32,175)
  • Max file size: 2,592 lines
  • Avg function size: 31-171 lines depending on the file (extreme)
  • Worst file: hakmem_tiny_free.inc (171 lines/function)
  • Most includes in one file: 38 (hakmem.c); hakmem_tiny.c has 35

Target State (After)

  • Files over 1000 lines: 0
  • Files over 800 lines: 0
  • Max file size: 800 lines (-69%)
  • Avg function size: 25-35 lines (-60%)
  • Includes per file: 5-8 (-80%)
  • Compilation time: 2.5x faster
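
The percentages in the Current State and Target State lists follow from simple ratios. The sketch below is illustrative only: the function names `demo_percent_reduction` and `demo_time_saved_percent` are hypothetical, and it assumes that "2.5x faster" means build time drops to 1/2.5 of its original value. Under that assumption the time saved is 60%, and shrinking the largest file from 2,592 to 800 lines is a reduction of about 69%.

```c
/* Percent reduction when a metric goes from `before` to `after`,
 * e.g. 2592 -> 800 lines. Assumes before > 0. */
double demo_percent_reduction(double before, double after) {
    return (before - after) / before * 100.0;
}

/* Percent of time saved by a given speedup factor.
 * Assumption: "Nx faster" means the new time is old_time / N. */
double demo_time_saved_percent(double speedup) {
    return (1.0 - 1.0 / speedup) * 100.0;
}
```

The same helper gives 80% for a drop from 35 includes to 7, which sits inside the 5-8 target range.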

Quick Start

For Immediate Understanding

  1. Read LARGE_FILES_QUICK_REFERENCE.md (10 min)
  2. Review TL;DR sections in this index (5 min)
  3. Review the metrics comparison table in LARGE_FILES_REFACTORING_PLAN.md (5 min)

For Implementation Planning

  1. Review LARGE_FILES_QUICK_REFERENCE.md Phase 1 checklist (5 min)
  2. Read Phase 1 section in LARGE_FILES_REFACTORING_PLAN.md (10 min)
  3. Identify owner and schedule (5 min)

For Technical Deep Dive

  1. Read LARGE_FILES_ANALYSIS.md completely (40 min)
  2. Review before/after dependency graphs in LARGE_FILES_REFACTORING_PLAN.md (10 min)
  3. Review code structure sections per file (20 min)

Summary of Files

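| File | Lines | Functions | Avg lines/function | Priority | Phase |
|------|-------|-----------|--------------------|----------|-------|
| hakmem_pool.c | 2,592 | 65 | 40 | CRITICAL | 2 |
| hakmem_tiny.c | 1,765 | 57 | 31 | CRITICAL | 3 |
| hakmem.c | 1,745 | 29 | 60 | HIGH | 4 |
| hakmem_tiny_free.inc | 1,711 | 10 | 171 | CRITICAL | 1 |
| hakmem_l25_pool.c | 1,195 | 39 | 31 | HIGH | 5 |
| TOTAL | 9,008 | 200 | 45 | - | - |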
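
The per-function averages and the totals in this table can be re-derived from the line and function counts. A minimal sketch with the figures copied from the table; all `demo_` names are hypothetical, and rounding to the nearest integer is an assumption about how the averages were produced.

```c
#include <stddef.h>

/* One row of the summary table: file name, line count, function count. */
typedef struct {
    const char* name;
    int lines;
    int functions;
} demo_file_stat;

/* Figures copied from the table in this index. */
static const demo_file_stat demo_files[] = {
    {"hakmem_pool.c",        2592, 65},
    {"hakmem_tiny.c",        1765, 57},
    {"hakmem.c",             1745, 29},
    {"hakmem_tiny_free.inc", 1711, 10},
    {"hakmem_l25_pool.c",    1195, 39},
};
static const size_t demo_file_count = sizeof(demo_files) / sizeof(demo_files[0]);

/* Integer division rounded to nearest; valid for num >= 0 and den > 0. */
int demo_round_div(int num, int den) {
    return (num + den / 2) / den;
}

/* Sum of the line counts over all rows. */
int demo_total_lines(void) {
    int total = 0;
    for (size_t i = 0; i < demo_file_count; i++) total += demo_files[i].lines;
    return total;
}

/* Sum of the function counts over all rows. */
int demo_total_functions(void) {
    int total = 0;
    for (size_t i = 0; i < demo_file_count; i++) total += demo_files[i].functions;
    return total;
}

/* Average lines per function for row i, rounded to nearest. */
int demo_avg_per_function(size_t i) {
    return demo_round_div(demo_files[i].lines, demo_files[i].functions);
}
```

With these inputs the totals come to 9,008 lines and 200 functions, and `demo_round_div(9008 * 100, 32175)` reproduces the 28% share of the codebase quoted in the index.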

Implementation Roadmap

Week 1: Phase 1 - Split tiny_free.inc (3 days)
        Phase 2 - Split pool.c starts (parallel)
        
Week 2: Phase 2 - Split pool.c (1 more day)
        Phase 3 - Consolidate tiny.c starts
        
Week 3: Phase 3 - Consolidate tiny.c (1 more day)
        Phase 4 - Split hakmem.c starts
        
Week 4: Phase 4 - Split hakmem.c
        Phase 5 - Extract pool_core starts (parallel)
        
Week 5: Phase 5 - Extract pool_core (final polish)
        Final testing and merge

Parallel Work Possible: Yes, with careful coordination
Rollback Possible: Yes, simple git revert per phase
Risk Level: LOW (changes isolated, APIs unchanged)


Success Criteria

Phase Completion

  • All deliverable files created
  • Compilation succeeds without errors
  • Larson benchmark unchanged (±1%)
  • No valgrind errors
  • Code review approved
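
The ±1% benchmark criterion can be checked mechanically. A minimal sketch, assuming throughput is compared in ops/s against a recorded baseline; `demo_within_tolerance` is a hypothetical helper, not part of HAKMEM or the Larson benchmark.

```c
/* Returns 1 if `measured` is within `tol_percent` of `baseline`, else 0.
 * A non-positive baseline is treated as a failed check. */
int demo_within_tolerance(double baseline, double measured, double tol_percent) {
    if (baseline <= 0.0) return 0;
    double diff = measured - baseline;
    if (diff < 0.0) diff = -diff;  /* absolute difference without math.h */
    return diff <= baseline * tol_percent / 100.0;
}
```

For instance, if a run of 4,192,155 ops/s were taken as the baseline, any later run between roughly 4.15M and 4.23M ops/s would pass with `tol_percent` set to 1.0.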

Overall Success

  • 0 files over 1000 lines
  • Max file size: 800 lines
  • Avg function size: 25-35 lines
  • Compilation time: 60% improvement
  • Development speed: 3-6x faster for common tasks

Next Steps

  1. Today: Review this index + LARGE_FILES_QUICK_REFERENCE.md
  2. Tomorrow: Technical discussion + LARGE_FILES_ANALYSIS.md review
  3. Day 3: Phase 1 implementation planning
  4. Day 4: Phase 1 begins (estimated 3 days)
  5. Day 7: Phase 1 review + Phase 2 starts

Document Glossary

Phase: A 2-4 day work item splitting one or more large files

Deliverable: Specific file(s) to be created or modified in a phase

Metric: Quantifiable measure (lines, complexity, time)

Responsibility: A distinct task or subsystem within a file

Cohesion: How closely related functions are within a module

Coupling: How dependent a module is on other modules

Cyclomatic Complexity: Number of linearly independent paths through a function's control flow (lower is better)


Document Metadata

  • Created: 2025-11-06
  • Last Updated: 2025-11-06
  • Status: COMPLETE
  • Review Status: Ready for technical review
  • Implementation Status: Ready for Phase 1 kickoff

Contact & Questions

For questions about the analysis:

  1. Review the relevant document above
  2. Check FAQ section in LARGE_FILES_QUICK_REFERENCE.md
  3. Refer to corresponding phase in LARGE_FILES_REFACTORING_PLAN.md

For implementation support:

  • Use phase-specific checklists
  • Follow week-by-week breakdown
  • Reference success criteria

Generated by: Large Files Analysis System
Repository: /mnt/workdisk/public_share/hakmem
Codebase: HAKMEM Memory Allocator