
Selfhosting Dev 043472c170 docs(papers): Update MIR13 to MIR14 and create SSA construction paper

Major changes:
- Update all MIR13 references to MIR14 throughout paper-a-mir13-ir-design/
- Add evolution history: 27 → 13 → 14 instructions (UnaryOp restoration)
- Create new paper-d-ssa-construction/ for SSA implementation struggles
- Add PAPER_INDEX.md consolidating ChatGPT5's 3-paper analysis

MIR14 updates:
- README.md: Add instruction evolution timeline
- abstract.md: Emphasize practical balance over pure minimalism
- main-paper*.md: Update titles and core concepts
- MIR13_CORE13_SPEC.md: Add UnaryOp to instruction list
- chapters/01-introduction.md: Reframe as "14-Instruction Balance"
- RENAME_NOTE.md: Document folder naming consideration

SSA paper structure:
- README.md: Paper overview and positioning
- current-struggles.md: Raw implementation challenges
- technical-details.md: BuilderCursor, Sealed SSA, type normalization
- abstract.md: English/Japanese abstracts

LoopForm experiments continue in parallel (minor adjustments to detection).

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-09-12 15:58:20 +09:00

4.4 KiB


Chapter 1: Introduction

The 14-Instruction Balance

Can we build a practical programming language with just 14 intermediate representation (IR) instructions? This paper shows how we reduced the instruction set from 27 to 13 through aggressive minimization, then pragmatically restored one instruction (UnaryOp) to balance theoretical minimalism against practical efficiency.

The Complexity Crisis

Modern intermediate representations have grown alarmingly complex:

LLVM IR: 60+ opcodes (and growing)
JVM bytecode: ~200 instructions
CLR IL: ~100 instructions
WebAssembly: ~150 instructions
Even "minimal" VMs: 30-50 instructions

This complexity stems from decades of optimization-driven design, where each performance improvement adds new instructions. The result? Compiler implementations measured in millions of lines of code, optimization passes that few understand, and a barrier to entry that excludes most researchers and students.

The MIR-14 Evolution

We present MIR-14, which evolved through three distinct phases:

Initial design: 27 instructions (feature-driven)
Aggressive reduction: 13 instructions via BoxCall unification
Practical restoration: 14 instructions (Core-13 + UnaryOp)

Traditional:         MIR-14:
ArrayGet  →  }
ArraySet  →  }  BoxCall
RefGet    →  }  (unified)
RefSet    →  }
UnaryOp (restored for efficiency)

The key insight: array operations and field accesses are fundamentally the same—they're all Box method calls. By recognizing this pattern, we achieved dramatic instruction reduction. However, practical experience showed that certain primitive operations (negation, NOT) warrant direct representation, leading to our final 14-instruction set.
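The unification can be sketched as a small lowering pass. This is a minimal illustration, not the Nyash implementation: the tuple encoding of the legacy instructions, the `BoxCall` and `UnaryOp` field layouts, and the method names `get`, `set`, `getField`, and `setField` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


# Hypothetical instruction shapes, for illustration only.
@dataclass(frozen=True)
class BoxCall:
    receiver: str           # name of the value holding the Box
    method: str             # method name (assumed, e.g. "get")
    args: Tuple[str, ...]   # operand value names


@dataclass(frozen=True)
class UnaryOp:
    op: str                 # "neg" or "not"
    operand: str


def lower(instr):
    """Rewrite one legacy data-access instruction as a BoxCall.

    UnaryOp is kept as a direct instruction instead of a method call.
    """
    kind, *ops = instr
    if kind == "ArrayGet":
        array, index = ops
        return BoxCall(array, "get", (index,))
    if kind == "ArraySet":
        array, index, value = ops
        return BoxCall(array, "set", (index, value))
    if kind == "RefGet":
        obj, field = ops
        return BoxCall(obj, "getField", (field,))
    if kind == "RefSet":
        obj, field, value = ops
        return BoxCall(obj, "setField", (field, value))
    if kind == "UnaryOp":
        op, operand = ops
        return UnaryOp(op, operand)
    raise ValueError(f"unknown instruction: {kind}")


lowered_get = lower(("ArrayGet", "arr", "i"))
lowered_set = lower(("RefSet", "obj", "x", "v"))
lowered_neg = lower(("UnaryOp", "neg", "v"))
```

Four opcodes collapse into one instruction shape that differs only in the method name, while negation and NOT bypass method dispatch entirely.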

Performance Without Complexity

Critics might assume that fewer instructions mean worse performance. Our results indicate the opposite:

Inline Caching: 33x speedup for method dispatch
AOT Compilation: Near-native performance
Typed Array Specialization: Competitive with C arrays
Code Size Reduction: 20-50% smaller MIR output

The secret? Strategic optimization placement at Box boundaries rather than IR complexity.
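To make "optimization at Box boundaries" concrete, here is a minimal sketch of a monomorphic inline cache attached to a single BoxCall site. It assumes a Python-style method lookup as the slow path; the `InlineCache` class and the `ArrayBox` stand-in are hypothetical and do not reflect the actual Nyash runtime, and the sketch makes no attempt to reproduce the measured speedup.

```python
class InlineCache:
    """Monomorphic inline cache for one BoxCall site (illustrative only)."""

    def __init__(self, method_name):
        self.method_name = method_name
        self.cached_type = None     # receiver type seen on the last miss
        self.cached_method = None   # method resolved for that type
        self.hits = 0
        self.misses = 0

    def call(self, receiver, *args):
        receiver_type = type(receiver)
        if receiver_type is self.cached_type:
            # Fast path: same Box type as before, skip the lookup.
            self.hits += 1
        else:
            # Slow path: resolve the method by name and remember it.
            self.misses += 1
            self.cached_method = getattr(receiver_type, self.method_name)
            self.cached_type = receiver_type
        return self.cached_method(receiver, *args)


class ArrayBox:
    """Hypothetical stand-in for a Box that wraps an array."""

    def __init__(self, items):
        self.items = list(items)

    def get(self, index):
        return self.items[index]


site = InlineCache("get")
box = ArrayBox([10, 20, 30])
results = [site.call(box, i) for i in range(3)]
```

Only the first call at the site pays for name resolution. Because the cache lives at the call boundary, the IR itself needs no extra opcode for the fast path.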

Contributions

This paper makes five key contributions:

Evolution Methodology: A documented journey from Core-27 → Core-13 → Core-14, demonstrating both aggressive reduction and pragmatic restoration.
BoxCall Unification Architecture: A novel design pattern that elegantly absorbs data access operations into a single instruction.
Optimization Strategy: Demonstration that IR minimalism coupled with boundary optimization outperforms complex IR designs.
Implementation Evidence: A full compiler stack (Parser → MIR → VM/JIT/AOT/WASM) that stays within ±5% of baseline performance.
Educational Impact: A compiler design that students can understand in days, not months.

Paper Organization

The remainder of this paper is organized as follows:

Chapter 2 presents the Box Theory, our theoretical foundation for achieving expressive power through composition rather than instruction proliferation.
Chapter 3 details the MIR-14 design, explaining our process of reducing 27 instructions to 14 while maintaining full functionality.
Chapter 4 describes our implementation, including the unified architecture that enables four different backends to share the same minimal IR.
Chapter 5 evaluates our approach through GUI demonstrations, performance benchmarks, and instruction coverage analysis.
Chapter 6 discusses the implications of our findings and why this approach succeeds where conventional wisdom suggests it should fail.
Chapter 7 compares our work with related systems, highlighting the unique aspects of our minimalist approach.
Chapter 8 concludes with reflections on the future of minimal language design.

A Note on Simplicity

"Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away."
— Antoine de Saint-Exupéry

Nyash embodies this principle. By removing rather than adding, we have found that less can truly be more, not just philosophically but practically. The GUI application running on your screen with 14 instructions is not a limitation overcome, but a validation of simplicity as a first-class design principle.

Welcome to the minimal instruction revolution.
