Fix overly strict AI PR review prompt causing excessive review cycles #196
Address five failure modes observed after gpt-5.4 upgrade (PRs #192, #194, #195):
- documented deviations flagged as blockers
- deferred work not accepted
- moving goalposts on re-review
- undefined approval criteria
- valid implementation choices treated as methodology errors

Changes to pr_review.md:
- Exempt REGISTRY.md-documented deviations from P0/P1 (classify as P3)
- Add implementation choice exception for valid numerical approaches
- Add Deferred Work Acceptance section honoring TODO.md tracking
- Add Assessment Criteria with objective verdicts and mitigation rules
- Add Re-review Scope rules to prevent oscillation between rounds

Changes to ai_pr_review.yml:
- Add step to fetch previous AI review comment for re-review context
- Inject prior review findings into compiled prompt on /ai-review reruns

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
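The REGISTRY.md exemption above can be pictured as a small severity rule. The following is a minimal sketch, not the actual prompt logic (the real rule lives in prose inside pr_review.md). The `Finding` class, its field names, and the assumption that P0/P1 are the blocking levels are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One review finding (hypothetical structure, for illustration only)."""
    title: str
    severity: str  # "P0" (most severe) through "P3"
    documented_in_registry: bool = False  # deviation is recorded in REGISTRY.md


def effective_severity(finding: Finding) -> str:
    """Classify a REGISTRY.md-documented deviation as P3; otherwise keep the severity."""
    if finding.documented_in_registry:
        return "P3"
    return finding.severity


def has_blocker(findings: list[Finding]) -> bool:
    """Assumption: only P0/P1 findings block approval."""
    return any(effective_severity(f) in {"P0", "P1"} for f in findings)
```

Under this sketch, a finding that would otherwise be P0 no longer blocks approval once the deviation is documented, while undocumented P0/P1 findings still do.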
Overall Assessment
⛔ Blocker

Executive Summary

Methodology
No findings. This diff changes only the AI review prompt/workflow; it does not modify estimator math, weighting, variance/SE code, identification assumptions, or defaults.

Code Quality
No additional findings beyond the workflow/prompt issues below.

Performance
No findings. The added API call and prompt append are small relative to the rest of the job.

Maintainability
No additional findings beyond the policy inconsistency called out in Tech Debt.

Tech Debt
Security
Documentation/Tests
Path to Approval
Filter prior review comments to bot-authored only, wrap injected review text in untrusted XML delimiters, scope TODO.md P3 classification to deferrable categories, and add explicit P0/P1 exclusion from mitigation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
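The first two changes in this commit (bot-only filtering and untrusted delimiters) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code in ai_pr_review.yml: the bot login, the marker string, the `previous_ai_review` tag name, and all function names are hypothetical. The comment dictionaries follow the shape returned by the GitHub REST API for issue comments (`user.login`, `body`, `created_at`).

```python
from html import escape

BOT_LOGIN = "github-actions[bot]"     # assumption: account that posts the AI review
REVIEW_MARKER = "Overall Assessment"  # assumption: text present in every AI review


def latest_bot_review(comments, bot_login=BOT_LOGIN, marker=REVIEW_MARKER):
    """Return the body of the newest bot-authored review comment, or None."""
    bot_reviews = [
        c for c in comments
        if (c.get("user") or {}).get("login") == bot_login
        and marker in (c.get("body") or "")
    ]
    if not bot_reviews:
        return None
    # ISO 8601 UTC timestamps sort correctly as plain strings.
    return max(bot_reviews, key=lambda c: c["created_at"])["body"]


def wrap_untrusted(text, tag="previous_ai_review"):
    """Wrap text in delimiters, escaping <, > and & so it cannot close the tag early."""
    return f'<{tag} trust="untrusted">\n{escape(text, quote=False)}\n</{tag}>'


def build_prompt(base_prompt, comments):
    """Append the prior bot review, if there is one, to the compiled prompt."""
    prior = latest_bot_review(comments)
    if prior is None:
        return base_prompt
    return base_prompt + "\n\n" + wrap_untrusted(prior)
```

Filtering on the author keeps a human commenter from planting text that looks like an earlier review, and escaping the body keeps a quoted closing tag from ending the untrusted block prematurely.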
/ai-review
🔁 AI review rerun (requested by @igerber)
Head SHA:

Overall Assessment
✅ Looks good

Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Summary
/ai-review rounds

Methodology references (required if estimator / math changes)
Validation
/ai-review on a previously problematic PR (Add EfficientDiD estimator (Chen, Sant'Anna & Xie 2025) #192, Optimize CallawaySantAnna covariate path (5-19x speedup) #194, or Add (1-W) weight masking to TROP global method #195) to confirm the updated prompt produces more reasonable assessments

Security / privacy
Generated with Claude Code