- Decision recorded in CURRENT_TASK: replace peek with match (no compat alias)
- EBNF: add match_expr/pattern/guard; note legacy peek removal
- Guides/Reference: update examples and wording (null-coalesce/safe-access sugar via match)
## Summary
Investigated OpenAI's new GPT-5-Codex model and Codex GitHub PR review integration capabilities.
## GPT-5-Codex Analysis
### Benchmark Performance (Good)
- SWE-bench Verified: 74.5% (vs GPT-5's 72.8%)
- Refactoring tasks: 51.3% (vs GPT-5's 33.9%)
- Code review: Higher developer ratings
### Real-World Issues (Concerning)
- Users report degraded coding performance
- Scripts that previously worked now fail
- Less consistent than GPT-4.5
- Longer response times (minutes vs instant)
- "Creatively and emotionally flat"
- Basic errors (e.g., counting letters incorrectly)
### Key Finding
A classic case of optimizing for benchmarks over real usability: the model scores well on tests but performs poorly in practice.
## Codex GitHub PR Integration
### Setup Process
1. Enable MFA and connect GitHub account
2. Authorize Codex GitHub app for repos
3. Enable "Code review" in repository settings
### Usage Methods
- **Manual**: Comment `@codex review` on the PR
- **Automatic**: Triggers when PR moves from draft to ready
### Current Limitations
- One-way communication (doesn't respond to review comments)
- Prefers creating new PRs over updating existing ones
- Better for single-pass reviews than iterative feedback
## 'codex resume' Feature
New session management capability:
- Resume previous `codex exec` sessions
- Useful for continuing long tasks across days
- Maintains context from interrupted work
🐱 The investigation reveals that while GPT-5-Codex shows benchmark improvements, the practical developer experience has declined, a reminder that metrics don't always reflect real-world utility!
## Summary
Documented the "init block vs fields-at-top" design discussion as a valuable example of AI-human collaboration in language design.
## Changes
### Paper G (AI Collaboration)
- Added field-declaration-design.md documenting the entire discussion flow
- Showcased how the complex init block proposal evolved into the simple "fields at top" rule
- Demonstrates AI's tendency toward complexity vs human intuition for simplicity
### Paper H (AI Practical Patterns)
- Added Pattern #17: "Gradual Refinement Pattern" (段階的洗練型)
- Documents the process: Complex AI proposal → Detailed analysis → Human insight → Convergence
- Uses the field declaration design as a typical example
### Paper K (Explosive Incidents)
- Added Incident #046: "init block vs fields-at-top incident"
- Updated total count to 46 incidents
- Shows how a single human comment redirected entire design approach
## Design Decision
After analysis, decided that BoxIndex should remain a compiler-internal structure, not a core Box:
- Core Boxes: User-instantiable runtime values (String, Integer, Array, Map)
- Compiler internals: BoxIndex for name resolution (compile-time only)
- Clear separation of concerns between language features and compiler tools
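
To make the separation concrete, here is a minimal Python sketch under stated assumptions: the class names, fields, and methods (`BoxIndex.declare`/`resolve`, the string type tags) are illustrative only and are not the actual Nyash compiler code. Runtime code deals only in Box values; BoxIndex lives purely at compile time for name resolution.

```python
from dataclasses import dataclass, field

# Runtime side: user-instantiable core Boxes (String, Integer, Array, Map, ...).
@dataclass
class StringBox:
    value: str

@dataclass
class IntegerBox:
    value: int

# Compiler side: a compile-time-only index used for name resolution.
# It is never exposed to user programs and never becomes a Box itself.
@dataclass
class BoxIndex:
    # Maps a declared name to the metadata the compiler needs (here: a type tag).
    entries: dict[str, str] = field(default_factory=dict)

    def declare(self, name: str, box_type: str) -> None:
        self.entries[name] = box_type

    def resolve(self, name: str) -> str:
        # Name resolution happens during compilation, before any Box exists at runtime.
        return self.entries[name]

# The compiler builds and queries the index while lowering source code;
# the runtime only ever sees the resulting Box values.
index = BoxIndex()
index.declare("greeting", "StringBox")
assert index.resolve("greeting") == "StringBox"
greeting = StringBox("hello")  # runtime value, unrelated to the index object
```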
## Philosophy
This discussion exemplifies key principles:
- The best design needs no explanation
- Constraints provide clarity, not limitation
- "Everything is Box" doesn't mean "compiler internals are Boxes"
- AI tends toward theoretical completeness; humans toward practical simplicity
🐱 Sometimes the simplest answer is right in front of us!
Major implementation by ChatGPT:
- Complete JSON v0 Bridge layer with PHI generation for control flow
- If statement: Merge PHI nodes for variables updated in then/else branches
- Loop statement: Header PHI nodes for loop-carried dependencies
- Python MVP Parser Stage-2: Added local/if/loop/call/method/new support
- Full CFG guarantee: All blocks have proper terminators (branch/jump/return)
- Type metadata for string operations (+, ==, !=)
- Comprehensive PHI smoke tests for nested and edge cases
This enables MIR generation without the Rust MIR builder, a major step towards eliminating the Rust build dependency!
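
To illustrate the merge-PHI idea, here is a simplified Python sketch; the instruction shape below is invented for illustration and is not the real JSON v0 Bridge format. An if statement that assigns a variable in both branches needs a single PHI at the merge block that selects the reaching definition by predecessor.

```python
# Simplified sketch of merge-PHI emission for an if statement.
# The block/instruction layout is made up for illustration only.

def emit_if_merge_phi(var: str, then_block: str, then_val: str,
                      else_block: str, else_val: str, merge_block: str) -> dict:
    """Return a PHI instruction merging the two definitions of `var`
    that reach `merge_block` from the then/else predecessors."""
    return {
        "op": "phi",
        "dst": f"{var}.merged",
        "incoming": [
            {"from": then_block, "value": then_val},
            {"from": else_block, "value": else_val},
        ],
        "block": merge_block,
    }

# Example: `x` is assigned %x.then in the then branch and %x.else in the
# else branch; the merge block needs one PHI choosing between them.
phi = emit_if_merge_phi("x", "bb_then", "%x.then", "bb_else", "%x.else", "bb_merge")
print(phi)
```

A loop-header PHI has the same shape, with one incoming edge from the preheader (initial value) and one from the loop latch (the loop-carried value).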
🎉 ChatGPT spent over 30 minutes implementing this, nya!
Co-Authored-By: ChatGPT <noreply@openai.com>
Major updates:
- Updated progress section (2025-09-14) with recent achievements:
  - Python LLVM backend reaching production level (all tests passing)
  - Nyash parser implementation started by ChatGPT5
  - peek expression rediscovery (when→peek rename)
  - Box theory simplifying SSA construction (650→100 lines)
  - AI collaboration paper completed
Documentation improvements:
- Changed the 500-line limit to a "guideline" (not a strict requirement)
- Upgraded Python LLVM backend from "experimental" to "production level"
- Added peek expression examples and usage patterns
- Updated todo list status (parser planning completed, compiler MVP in progress)
These updates reflect significant progress towards self-hosting, especially the parser implementation, which uses peek expressions to avoid 19-line if-else nesting (illustrated in the sketch below).
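
For a rough feel of why this matters, here is a Python analogy (Python's `match` statement stands in for Nyash's peek expression; the actual Nyash syntax differs): a flat dispatch expression replaces a deeply nested if/else chain in the parser's statement dispatch.

```python
# Nested if/else dispatch (the style the parser is trying to avoid):
def stmt_kind_nested(tok: str) -> str:
    if tok == "local":
        return "local_decl"
    else:
        if tok == "if":
            return "if_stmt"
        else:
            if tok == "loop":
                return "loop_stmt"
            else:
                return "expr_stmt"

# Flat dispatch with a match expression (the role peek plays in the Nyash parser):
def stmt_kind_flat(tok: str) -> str:
    match tok:
        case "local":
            return "local_decl"
        case "if":
            return "if_stmt"
        case "loop":
            return "loop_stmt"
        case _:
            return "expr_stmt"

assert stmt_kind_nested("loop") == stmt_kind_flat("loop") == "loop_stmt"
```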
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>