Box-First JIT: A Methodology for AI-Assisted Development without Brute-Force Optimization

Abstract

In the era of AI-assisted software development, the challenge is not generating code but controlling its complexity. We present Box-First, a design methodology that enabled the implementation of a fully functional JIT compiler in just 24 hours. By encapsulating configuration, boundaries, and observability as first-class "boxes," we achieve strong reversibility and avoid the pitfall of AI-generated brute-force optimizations. Our implementation in the Nyash language demonstrates 100% compilation success, zero runtime failures, and 1.06-1.40x performance improvements over the VM baseline. More importantly, the methodology provides guardrails for AI assistance, ensuring generated code remains maintainable and evolvable. We argue that Box-First represents a new paradigm for human-AI collaboration in complex system development.

1. Introduction

On August 27, 2025, we implemented a production-ready JIT compiler with control flow and PHI support in a single day. This achievement was not about rushing or cutting corners—it was the result of applying a systematic design methodology we call "Box-First."

The proliferation of AI coding assistants has created a new challenge: while AI can rapidly generate large amounts of code, this often leads to monolithic, tightly-coupled implementations that are difficult to understand, debug, or extend. We experienced this firsthand when initial AI suggestions produced complex, optimization-heavy code that was impressive but unmaintainable.

This paper presents Box-First as a methodology for AI-assisted development that prioritizes:

Visibility: All system behavior is observable
Reversibility: Any change can be safely rolled back
Switchability: Features can be toggled without recompilation

Our key contributions are:

The Box-First design principle for managing AI-generated complexity
A concrete implementation demonstrating 24-hour JIT development
Empirical evidence of the methodology's effectiveness
Guidelines for applying Box-First to other complex systems

2. The Box-First Methodology

2.1 Core Principle

Box-First treats every system component as an isolated "box" with three properties:

Fixed interfaces: Clear input/output contracts
Failure isolation: Errors cannot escape the box
Observable state: All internal behavior can be monitored

This is not merely modularization—it's a discipline for creating "reversible scaffolding" before implementation.
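The three properties can be sketched in Rust as a trait plus a wrapper. This is a minimal illustration, not the Nyash implementation: `IsolatedBox`, `Guarded`, `BoxStats`, and `Divider` are hypothetical names introduced here, and panic capture via `catch_unwind` is only one possible way to keep failures inside a box (it assumes the default unwinding panic strategy).

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical trait: a box exposes one fixed input/output contract.
pub trait IsolatedBox {
    type Input;
    type Output;
    fn run(&mut self, input: Self::Input) -> Result<Self::Output, String>;
}

/// Observable state: counters maintained by the wrapper, not by the box.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct BoxStats {
    pub calls: u64,
    pub failures: u64,
}

/// Wrapper that adds failure isolation and observation to any box.
pub struct Guarded<B> {
    inner: B,
    stats: BoxStats,
}

impl<B: IsolatedBox> Guarded<B> {
    pub fn new(inner: B) -> Self {
        Self { inner, stats: BoxStats::default() }
    }

    /// Calls the box; a panic inside it is converted into an `Err`.
    pub fn call(&mut self, input: B::Input) -> Result<B::Output, String> {
        self.stats.calls += 1;
        let inner = &mut self.inner;
        let outcome = catch_unwind(AssertUnwindSafe(|| inner.run(input)));
        let result = match outcome {
            Ok(r) => r,
            Err(_) => Err("box panicked".to_string()),
        };
        if result.is_err() {
            self.stats.failures += 1;
        }
        result
    }

    pub fn stats(&self) -> &BoxStats {
        &self.stats
    }
}

/// Toy box used only for illustration: integer division.
pub struct Divider;

impl IsolatedBox for Divider {
    type Input = (i64, i64);
    type Output = i64;
    fn run(&mut self, (a, b): (i64, i64)) -> Result<i64, String> {
        Ok(a / b) // panics when b == 0; the wrapper contains it
    }
}
```

A caller that wraps `Divider` in `Guarded` sees a division by zero as an ordinary `Err`, and the failure shows up in `stats()` without any change to `Divider` itself.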

2.2 The Three Essential Boxes

Through our JIT implementation, we identified three fundamental box types:

Configuration Box: Centralizes all runtime options

Eliminates scattered environment variable reads
Provides capability probing and auto-adjustment
Enables consistent behavior across test/CLI/production

Boundary Box: Manages inter-component communication

Type-safe value conversion at boundaries
Handle-based indirection (no direct pointers)
Automatic resource cleanup via scoping

Observability Box: Makes system behavior visible

Unified statistics collection
Visual debugging (CFG/DOT generation)
Performance profiling without code changes
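As an illustration of the visual debugging item, the sketch below emits Graphviz DOT text for a control-flow graph. The `Cfg` type and `to_dot` function are hypothetical stand-ins; the paper does not show the data structures the actual DOT generator works on.

```rust
use std::fmt::Write;

/// Hypothetical minimal CFG: each entry is (block id, successor block ids).
pub struct Cfg {
    pub blocks: Vec<(u32, Vec<u32>)>,
}

/// Renders the CFG as Graphviz DOT text, one node per basic block.
pub fn to_dot(name: &str, cfg: &Cfg) -> String {
    let mut out = String::new();
    // Writing to a String cannot fail, so unwrap is safe here.
    writeln!(out, "digraph \"{name}\" {{").unwrap();
    for (id, succs) in &cfg.blocks {
        writeln!(out, "  bb{id};").unwrap();
        for s in succs {
            writeln!(out, "  bb{id} -> bb{s};").unwrap();
        }
    }
    out.push_str("}\n");
    out
}
```

The resulting text can be rendered with the Graphviz `dot` tool, so a diamond-shaped branch-and-merge (the shape that requires a PHI node) is visible at a glance.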

2.3 AI Collaboration Pattern

The Box-First methodology turns AI assistance from a liability into an asset:

Define boxes first: Before any implementation, establish the three boxes
AI implements within boxes: Constrained scope prevents sprawl
Validate via observability: Use built-in monitoring to verify behavior
Iterate safely: Reversibility allows experimentation

This approach prevented common AI pitfalls such as:

Premature optimization
Violation of existing conventions
Monolithic implementations
Hidden dependencies

3. Case Study: 24-Hour JIT Implementation

3.1 Timeline

Design Phase (Aug 13-26, 2 weeks):

Established Box-First architecture
Defined JitValue ABI (independent from VM)
Created handle registry design

Implementation Day (Aug 27):

01:03: Infrastructure setup with three boxes
17:06: Basic operations (arithmetic, constants)
17:39: Control flow (branches, conditions)
17:52: PHI node support
17:58: Testing complete, 100% success rate

3.2 Technical Architecture

Figure 1 illustrates the Box-First JIT architecture. The key insight is complete decoupling:

VM (VMValue) <---> Boundary Box <---> JIT (JitValue)
                        |
                   Configuration Box
                   Observability Box

The JIT never directly accesses VM internals. All interaction goes through the Boundary Box using opaque handles.
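A rough sketch of what conversion at the boundary could look like follows. The names `VMValue` and `JitValue` come from the paper, but their variants here are assumptions, as are the `Boundary` type and its `to_jit`/`to_vm` methods. Scalars are copied across, while anything else is parked on the VM side and replaced by an opaque handle.

```rust
/// Assumed VM-side value; the real VMValue definition is not shown in the paper.
#[derive(Debug, Clone, PartialEq)]
pub enum VMValue {
    Integer(i64),
    Float(f64),
    Bool(bool),
    BoxRef(String), // stand-in for any heap-allocated object
}

/// Assumed JIT-side value: plain scalars and opaque handles only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum JitValue {
    I64(i64),
    F64(f64),
    Bool(bool),
    Handle(u64),
}

/// Hypothetical boundary: owns the objects that handles refer to.
pub struct Boundary {
    objects: Vec<VMValue>, // handle h refers to objects[h - 1]
}

impl Boundary {
    pub fn new() -> Self {
        Self { objects: Vec::new() }
    }

    /// VM -> JIT: scalars pass by value, everything else becomes a handle.
    pub fn to_jit(&mut self, v: VMValue) -> JitValue {
        match v {
            VMValue::Integer(i) => JitValue::I64(i),
            VMValue::Float(f) => JitValue::F64(f),
            VMValue::Bool(b) => JitValue::Bool(b),
            other => {
                self.objects.push(other);
                JitValue::Handle(self.objects.len() as u64)
            }
        }
    }

    /// JIT -> VM: unknown handles yield None instead of undefined behavior.
    pub fn to_vm(&self, v: JitValue) -> Option<VMValue> {
        match v {
            JitValue::I64(i) => Some(VMValue::Integer(i)),
            JitValue::F64(f) => Some(VMValue::Float(f)),
            JitValue::Bool(b) => Some(VMValue::Bool(b)),
            JitValue::Handle(h) => {
                let idx = usize::try_from(h).ok()?.checked_sub(1)?;
                self.objects.get(idx).cloned()
            }
        }
    }
}
```

Because the JIT side only ever holds `JitValue`, a stale or forged handle is rejected at the boundary rather than dereferenced inside generated code.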

3.3 Implementation Highlights

Configuration Box (JitConfigBox):

// Before: Scattered environment checks
if std::env::var("NYASH_JIT_EXEC").as_deref() == Ok("1") { ... }

// After: Centralized configuration
if jit::config::current().exec { ... }
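One way such a centralized configuration might be built is sketched below. Only the `exec` field and the `NYASH_JIT_EXEC` variable appear in the paper; the `stats` field, the `NYASH_JIT_STATS` variable, and the `from_lookup`/`from_env` helpers are assumptions made for illustration. The environment is read once and cached, so the rest of the code never touches `std::env` directly.

```rust
use std::sync::OnceLock;

/// Assumed subset of JIT options; only `exec` is named in the paper.
#[derive(Debug, Clone, PartialEq)]
pub struct JitConfig {
    pub exec: bool,
    pub stats: bool, // hypothetical second option
}

impl JitConfig {
    /// Builds a config from any key lookup, so tests can supply a closure
    /// instead of mutating the process environment.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Self {
        let flag = |key: &str| lookup(key).as_deref() == Some("1");
        Self {
            exec: flag("NYASH_JIT_EXEC"),
            stats: flag("NYASH_JIT_STATS"), // hypothetical variable name
        }
    }

    /// Builds a config from the real process environment.
    pub fn from_env() -> Self {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}

static CURRENT: OnceLock<JitConfig> = OnceLock::new();

/// Returns the process-wide config, reading the environment on first use only.
pub fn current() -> &'static JitConfig {
    CURRENT.get_or_init(JitConfig::from_env)
}
```

Routing every read through `current()` gives tests, the CLI, and production the same view of the options, which is the consistency property the Configuration Box is meant to provide.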

Boundary Box (HandleRegistry):

// Opaque handle instead of raw pointer
pub fn to_handle(obj: Arc<dyn NyashBox>) -> u64
pub fn get(h: u64) -> Option<Arc<dyn NyashBox>>
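The two signatures above could be backed by a registry along the following lines. This is a sketch under assumptions: the `NyashBox` trait here is a one-method stand-in for the real one, the registry is an explicit struct rather than whatever global state the free functions use, and `release`, `HandleScope`, and `DemoBox` are hypothetical additions that show how scoping can provide automatic cleanup.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for the paper's NyashBox trait; the real definition is not shown.
pub trait NyashBox: Send + Sync {
    fn type_name(&self) -> &'static str;
}

#[derive(Default)]
pub struct HandleRegistry {
    next: u64,
    objects: HashMap<u64, Arc<dyn NyashBox>>,
}

impl HandleRegistry {
    /// Stores the object and returns an opaque handle (never 0).
    pub fn to_handle(&mut self, obj: Arc<dyn NyashBox>) -> u64 {
        self.next += 1;
        self.objects.insert(self.next, obj);
        self.next
    }

    /// Looks a handle up; unknown or released handles yield None.
    pub fn get(&self, h: u64) -> Option<Arc<dyn NyashBox>> {
        self.objects.get(&h).cloned()
    }

    /// Hypothetical helper: drops the registry's reference to the object.
    pub fn release(&mut self, h: u64) -> bool {
        self.objects.remove(&h).is_some()
    }

    pub fn len(&self) -> usize {
        self.objects.len()
    }
}

/// Hypothetical scope guard: releases every handle it created when dropped.
pub struct HandleScope<'r> {
    registry: &'r mut HandleRegistry,
    owned: Vec<u64>,
}

impl<'r> HandleScope<'r> {
    pub fn new(registry: &'r mut HandleRegistry) -> Self {
        Self { registry, owned: Vec::new() }
    }

    pub fn to_handle(&mut self, obj: Arc<dyn NyashBox>) -> u64 {
        let h = self.registry.to_handle(obj);
        self.owned.push(h);
        h
    }

    pub fn get(&self, h: u64) -> Option<Arc<dyn NyashBox>> {
        self.registry.get(h)
    }
}

impl Drop for HandleScope<'_> {
    fn drop(&mut self) {
        for h in self.owned.drain(..) {
            self.registry.release(h);
        }
    }
}

/// Toy object used only for illustration.
pub struct DemoBox;

impl NyashBox for DemoBox {
    fn type_name(&self) -> &'static str {
        "DemoBox"
    }
}
```

Handles created inside a `HandleScope` stop resolving once the scope ends, so a JIT call cannot leak objects even if it returns early.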

Observability Box (Statistics/DOT):

// Automatic tracking without code changes
[JIT] compiled fib -> handle=42
[JIT] stats: calls=1000 success=1000 fallback=0
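Counters like those in the log line above could be collected as follows. This is a hedged sketch: `JitStats`, `record`, `summary`, and `call_with_fallback` are hypothetical names, and the summary string simply mirrors the format of the log line shown in the paper.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counters matching the fields in the log line.
#[derive(Default)]
pub struct JitStats {
    calls: AtomicU64,
    success: AtomicU64,
    fallback: AtomicU64,
}

impl JitStats {
    /// Records one attempted JIT call; `ok == false` means the VM took over.
    pub fn record(&self, ok: bool) {
        self.calls.fetch_add(1, Ordering::Relaxed);
        let counter = if ok { &self.success } else { &self.fallback };
        counter.fetch_add(1, Ordering::Relaxed);
    }

    /// Formats the counters in the same shape as the log line.
    pub fn summary(&self) -> String {
        format!(
            "[JIT] stats: calls={} success={} fallback={}",
            self.calls.load(Ordering::Relaxed),
            self.success.load(Ordering::Relaxed),
            self.fallback.load(Ordering::Relaxed),
        )
    }
}

/// Runs the JIT path if it produces a value, otherwise the VM path,
/// and records which one happened.
pub fn call_with_fallback<T>(
    stats: &JitStats,
    jit: impl FnOnce() -> Option<T>,
    vm: impl FnOnce() -> T,
) -> T {
    match jit() {
        Some(v) => {
            stats.record(true);
            v
        }
        None => {
            stats.record(false);
            vm()
        }
    }
}
```

Because recording happens in the dispatch wrapper, neither the JIT nor the VM code needs instrumentation of its own.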

4. Evaluation

4.1 Development Efficiency

Metric	Traditional JIT	Box-First JIT
Implementation Time	2-6 months	24 hours
Lines of Code	50,000+	~3,000
Time to First Working Version	Weeks	Hours

4.2 Runtime Performance

Tests show a 1.06-1.40x speedup over the VM baseline (including compilation overhead). While modest, these gains came with no runtime failures and no fallbacks to the VM in our tests.

4.3 Maintainability

The true benefit emerges in evolution:

Adding boolean types: 1 line change in Boundary Box
New optimization: Isolated to JIT box
Performance regression: Instantly visible via Observability Box

4.4 The Human Factor: Simplicity as a Metric

In practice, code acceptance was guided not only by automated checks but also by the developer's intuitive "simplicity sensor." This qualitative filter proved extremely effective: most accepted changes required no rework, while rejected ones were identified almost instantly.

This phenomenon highlights an underexplored aspect of AI-assisted development: the role of human intuition as a real-time quality gate. The Box-First methodology amplified this intuition by providing clear boundaries—violations felt immediately "wrong" even before formal analysis.

The key insight is the complementary relationship between quantitative effects and qualitative judgments:

Quantitative: "Boxing enabled JIT implementation in one day"—measurable and reproducible outcomes
Qualitative: "Excessive boxing slows progress, requiring human intuitive judgment"—unmeasurable but essential quality control

We argue that this human-in-the-loop validation, while not quantifiable, is an essential component of the methodology. The combination of structural constraints (boxes) and human judgment (simplicity sensing) created a highly efficient filtering mechanism that traditional metrics fail to capture. This integration of quantitative and qualitative elements demonstrates the ideal division of labor between humans and AI in assisted development.

5. Related Work

JIT Compilers: Traditional JITs (V8, HotSpot) achieve higher performance through tight coupling. Box-First trades some optimization potential for dramatic complexity reduction.

Software Architecture: Box-First extends beyond existing patterns:

Unlike microservices: In-process, zero network overhead
Unlike dependency injection: Boxes are observable and reversible
Unlike plugins: First-class architectural elements

AI-Assisted Development: Recent work on prompt engineering and code generation focuses on output quality. We focus on structural constraints that make any output maintainable.

6. Future Work

Formal verification of box properties
Automated box generation from specifications
Performance optimization within box constraints
Application to other domains (databases, compilers, OS)

7. Conclusion

Box-First is not about making JIT implementation easy—it's about making it possible to build complex systems with AI assistance while maintaining human understanding and control. By establishing configuration, boundary, and observability boxes before implementation, we created guardrails that channeled AI capabilities productively.

The 24-hour JIT implementation demonstrates that the right abstractions can reduce complexity by orders of magnitude. More importantly, it shows a path forward for human-AI collaboration: not replacing human judgment but augmenting it with systematic constraints.

As AI coding assistants become more powerful, methodologies like Box-First become more critical. The question is not whether AI can generate a JIT compiler—it's whether humans can still understand and maintain what was generated. Box-First ensures the answer remains yes.

References

[1] Lattner, C. "LLVM: An Infrastructure for Multi-Stage Optimization", 2002.
[2] Würthinger, T. et al. "One VM to Rule Them All", Onward! 2013.
[3] Implementation available at: https://github.com/[redacted]/nyash

Acknowledgments: This work was completed in collaboration with AI assistants, demonstrating the methodology's practical application.
