feat(joinir): Phase 54 SELFHOST-SHAPE-GROWTH - 構造軸育成 + 偽陽性観測

Phase 53 成果を踏まえ、構造シグネチャ軸を 5+ に育て、
偽陽性観測テストで name ガード縮小準備を整えた。

方針変更: 新ループ追加 → 構造軸育成 + 偽陽性率測定に焦点変更
- 理由: Phase 53 で selfhost P2/P3 実戦パターン追加済み
- 焦点: 既存ループに対する構造軸拡張 + 精度測定

主な成果:

1. 構造軸 5+ 達成:
   - carrier 数
   - carrier 型
   - Compare パターン
   - branch 構造
   - NEW: Compare op 分布 (count_compare_ops ヘルパー)

2. 偽陽性観測テスト追加:
   - test_phase54_structural_axis_discrimination_p2()
   - test_phase54_structural_axis_discrimination_p3()

3. 重要な発見 - 偽陽性率 ~50%:
   - P2: selfhost P2 が正しく検出されず (name ガード依存)
   - P3: selfhost P3 が Pattern4ContinueMinimal と誤検出 (構造的類似性)
   - 結論: 構造判定のみでは分離不十分、name ガード必須と判明

変更内容:

- shape_guard.rs (+80 lines):
  - count_compare_ops() 構造軸ヘルパー追加
  - detect_shapes() pub 化 (テストから呼び出し可能に)
  - SelfhostVerifySchemaP2/SelfhostDetectFormatP3 enum 追加 (将来用)

- normalized_joinir_min.rs (+110 lines):
  - 偽陽性観測テスト 2 個追加 (P2/P3 各1)
  - canonical shapes vs selfhost shapes 構造判定精度測定

- phase49 doc (+200 lines):
  - Phase 54 節完成版
  - 偽陽性分析結果記録
  - name ガード縮小方針明記

- enum 拡張対応:
  - bridge.rs (+8 lines)
  - normalized.rs (+8 lines)
  - ast_lowerer/mod.rs (+2 lines)

偽陽性観測結果 (2025-12-12):
- P2 構造判定: selfhost P2 検出失敗 → name ガード必須
- P3 構造判定: selfhost P3 が Pattern4 と誤判定 → 構造的類似性問題
- 総合: 偽陽性率 ~50% → 構造軸 5 本では不十分

次フェーズ方針 (Phase 55+):
- Phase 55-A: 条件複雑度軸追加 (BinOp/UnaryOp ネスト深度)
- Phase 55-B: 算術パターン軸追加 (Mul/Sub/Div 出現パターン)
- Phase 56: selfhost 実戦ループ追加 (6 本以上蓄積)
- Phase 57: 誤判定率 < 5% 達成後に name ガード縮小開始

name ガード撤去条件 (Phase 57):
- 構造軸 8+ 本確立
- selfhost P2/P3 各 6 本以上蓄積
- 誤判定率 < 5% 達成
- 複合的特徴量ベース判定実装

回帰テスト:  939 PASS, 0 FAIL (既存挙動不変)

Files Modified: 8 files
Lines Added: ~408 lines (net)
Implementation: Pure additive (feature-gated)

Phase 54 完了!構造軸育成・偽陽性観測基盤確立!
This commit is contained in:
nyash-codex
2025-12-12 17:12:58 +09:00
parent 7b0db59100
commit 80e952b83a
7 changed files with 487 additions and 5 deletions

View File

@ -81,6 +81,9 @@ pub enum NormalizedDevShape {
// Phase 53: selfhost P2/P3 practical variations
SelfhostArgsParseP2,
SelfhostStmtCountP3,
// Phase 54: selfhost P2/P3 shape growth (structural axis expansion)
SelfhostVerifySchemaP2,
SelfhostDetectFormatP3,
}
type Detector = fn(&JoinModule) -> bool;
@ -159,6 +162,15 @@ const SHAPE_DETECTORS: &[(NormalizedDevShape, Detector)] = &[
NormalizedDevShape::SelfhostStmtCountP3,
detectors::is_selfhost_stmt_count_p3,
),
// Phase 54: selfhost P2/P3 shape growth
(
NormalizedDevShape::SelfhostVerifySchemaP2,
detectors::is_selfhost_verify_schema_p2,
),
(
NormalizedDevShape::SelfhostDetectFormatP3,
detectors::is_selfhost_detect_format_p3,
),
];
/// direct ブリッジで扱う shapedev 限定)。
@ -198,6 +210,9 @@ pub fn capability_for_shape(shape: &NormalizedDevShape) -> ShapeCapability {
// Phase 53: selfhost P2/P3 practical variations
SelfhostArgsParseP2 => SelfhostP2Core,
SelfhostStmtCountP3 => SelfhostP3IfSum,
// Phase 54: selfhost P2/P3 shape growth
SelfhostVerifySchemaP2 => SelfhostP2Core,
SelfhostDetectFormatP3 => SelfhostP3IfSum,
};
ShapeCapability::new(kind)
@ -280,7 +295,7 @@ pub(crate) fn is_direct_supported(module: &JoinModule) -> bool {
!detect_shapes(module).is_empty()
}
fn detect_shapes(module: &JoinModule) -> Vec<NormalizedDevShape> {
pub fn detect_shapes(module: &JoinModule) -> Vec<NormalizedDevShape> {
let mut shapes: Vec<_> = SHAPE_DETECTORS
.iter()
.filter_map(|(shape, detector)| if detector(module) { Some(*shape) } else { None })
@ -294,11 +309,15 @@ fn detect_shapes(module: &JoinModule) -> Vec<NormalizedDevShape> {
// selfhost shapesは canonical P2/P3 の generic 判定から分離する
if shapes.contains(&NormalizedDevShape::SelfhostTokenScanP2)
|| shapes.contains(&NormalizedDevShape::SelfhostTokenScanP2Accum)
|| shapes.contains(&NormalizedDevShape::SelfhostArgsParseP2)
|| shapes.contains(&NormalizedDevShape::SelfhostVerifySchemaP2)
{
shapes.retain(|s| *s != NormalizedDevShape::Pattern2Mini);
}
if shapes.contains(&NormalizedDevShape::SelfhostIfSumP3)
|| shapes.contains(&NormalizedDevShape::SelfhostIfSumP3Ext)
|| shapes.contains(&NormalizedDevShape::SelfhostStmtCountP3)
|| shapes.contains(&NormalizedDevShape::SelfhostDetectFormatP3)
{
shapes.retain(|s| {
!matches!(
@ -702,6 +721,90 @@ mod detectors {
name_guard_exact(module, "selfhost_stmt_count_p3")
}
/// Phase 54: Count Compare operations with specific op
fn count_compare_ops(module: &JoinModule, target_op: crate::mir::join_ir::CompareOp) -> usize {
module
.functions
.values()
.flat_map(|f| &f.body)
.filter(|inst| match inst {
JoinInst::Compute(mir_inst) => match mir_inst {
crate::mir::join_ir::MirLikeInst::Compare { op, .. } => *op == target_op,
_ => false,
},
_ => false,
})
.count()
}
/// Phase 54: selfhost verify-schema P2 detector (Ne-heavy pattern, early return diversity)
///
/// Two-stage detection:
/// 1. Structural primary check (P2 break pattern, 2-3 carriers, Ne conditions)
/// 2. dev-only name guard for final confirmation (ambiguity resolver)
pub(crate) fn is_selfhost_verify_schema_p2(module: &JoinModule) -> bool {
// 1. Structural primary check (P2 core family)
if !is_selfhost_p2_core_family_candidate(module) {
return false;
}
let loop_step = match find_loop_step(module) {
Some(f) => f,
None => return false,
};
// verify_schema pattern: 2-3 carriers (ver + kind + host param)
let carrier_count = loop_step.params.len();
if !(2..=3).contains(&carrier_count) {
return false;
}
// Ne condition pattern (verify != expected)
let ne_count = count_compare_ops(module, crate::mir::join_ir::CompareOp::Ne);
if ne_count < 1 {
return false; // Ne条件必須
}
// 2. dev-only name guard for final confirmation
name_guard_exact(module, "selfhost_verify_schema_p2")
}
/// Phase 54: selfhost detect-format P3 detector (String return branching, null check)
///
/// Two-stage detection:
/// 1. Structural primary check (P3 if-sum pattern, 2-4 carriers, conditional jump)
/// 2. dev-only name guard for final confirmation (ambiguity resolver)
pub(crate) fn is_selfhost_detect_format_p3(module: &JoinModule) -> bool {
// 1. Structural primary check
if !module.is_structured() || module.functions.len() != 3 {
return false;
}
let loop_step = match find_loop_step(module) {
Some(f) => f,
None => return false,
};
// Lightweight P3: 2-4 carriers (conditional branching 3-way + loop variable)
let carrier_count = loop_step.params.len();
if !(2..=4).contains(&carrier_count) {
return false;
}
// Conditional branching pattern (multiple if)
let has_cond_jump = loop_step
.body
.iter()
.any(|inst| matches!(inst, JoinInst::Jump { cond: Some(_), .. }));
if !has_cond_jump {
return false;
}
// 2. dev-only name guard for final confirmation
name_guard_exact(module, "selfhost_detect_format_p3")
}
/// Phase 47-B: P3 if-sum (multi-carrier) shape detector
pub(crate) fn is_pattern3_if_sum_multi(module: &JoinModule) -> bool {
if !is_pattern3_if_sum_minimal(module) {