Fix overly strict AI PR review prompt causing excessive review cycles #196

Merged

igerber merged 2 commits into main from fix/ai-review-prompt-strictness on Mar 14, 2026

Conversation

@igerber (Owner) commented Mar 14, 2026

Summary

  • Exempt REGISTRY.md-documented deviations from P0/P1 flags (classify as P3-informational)
  • Add implementation choice exception for valid numerical approaches (Cholesky vs QR, etc.)
  • Add Deferred Work Acceptance section honoring TODO.md-tracked tech debt
  • Add Assessment Criteria with objective ✅/⚠️/⛔ verdicts and mitigation rules
  • Add Re-review Scope rules to prevent moving goalposts across /ai-review rounds
  • Add workflow step to fetch and inject prior AI review into compiled prompt on reruns

Methodology references (required if estimator / math changes)

  • N/A — no methodology changes. Only the AI review prompt and workflow are modified.

Validation

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

Address five failure modes observed after gpt-5.4 upgrade (PRs #192, #194,
#195): documented deviations flagged as blockers, deferred work not accepted,
moving goalposts on re-review, undefined approval criteria, and valid
implementation choices treated as methodology errors.

Changes to pr_review.md:
- Exempt REGISTRY.md-documented deviations from P0/P1 (classify as P3)
- Add implementation choice exception for valid numerical approaches
- Add Deferred Work Acceptance section honoring TODO.md tracking
- Add Assessment Criteria with objective verdicts and mitigation rules
- Add Re-review Scope rules to prevent oscillation between rounds

Changes to ai_pr_review.yml:
- Add step to fetch previous AI review comment for re-review context
- Inject prior review findings into compiled prompt on /ai-review reruns
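A minimal sketch of the rerun path these two changes describe, assuming the marker string quoted in the review below and comment dicts shaped like the GitHub REST API response (`body`, `created_at`); the function names and the "Previous AI review" heading are illustrative, not the actual workflow code:

```python
# Select the newest issue comment carrying the AI-review marker and
# append it to the compiled prompt on /ai-review reruns.
MARKER = "<!-- ai-pr-review:codex:"

def find_prior_review(comments):
    """Return the body of the newest comment containing the marker, or None."""
    matches = [c for c in comments if MARKER in c.get("body", "")]
    if not matches:
        return None
    matches.sort(key=lambda c: c["created_at"])
    return matches[-1]["body"]

def compile_rerun_prompt(base_prompt, prior_review):
    """Append the prior review to the prompt only when one was found."""
    if prior_review is None:
        return base_prompt
    return base_prompt + "\n\n## Previous AI review\n\n" + prior_review
```

Note that matching on the marker alone, as sketched here, is exactly what the first review flags as spoofable: any commenter can include the marker.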

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions

Overall Assessment

⛔ Blocker

Executive Summary

  • ⛔ The new rerun path trusts any PR issue comment containing the <!-- ai-pr-review:codex: marker and injects that body into the next prompt verbatim, so a non-bot commenter can spoof a “previous AI review” and steer future /ai-review reruns.
  • That injected text is not treated as untrusted context; the prompt explicitly gives that treatment only to the PR title/body, which makes the spoofed comment higher-trust than the actual PR text.
  • ⚠️ The new TODO-based mitigation rule is broader than the non-deferrable-statistics rule. As written, a tracked or newly-added TODO.md entry can cause future NaN/inference or variance/SE defects to be treated as “mitigated.”
  • No estimator, weighting, variance/SE implementation, or identification logic changes are in scope here; the methodology risk is in the reviewer policy, not the library runtime.
  • No automated regression coverage was added for the new rerun prompt assembly or for the new mitigation precedence rules.

Methodology

No findings. This diff changes only the AI review prompt/workflow; it does not modify estimator math, weighting, variance/SE code, identification assumptions, or defaults.

Code Quality

No additional findings beyond the workflow/prompt issues below.

Performance

No findings. The added API call and prompt append are small relative to the rest of the job.

Maintainability

No additional findings beyond the policy inconsistency called out in Tech Debt.

Tech Debt

  • Severity: P1. Impact: .github/codex/prompts/pr_review.md:L61-L67 says missing NaN guards and incorrect statistical output are not deferrable, but .github/codex/prompts/pr_review.md:L95-L99 then treats any TODO-tracked limitation or PR-added TODO as “MITIGATED.” That lets a future PR downgrade exactly the P0/P1 statistical defects this reviewer is supposed to keep blocking. Concrete fix: narrow TODO-based mitigation to deferrable P2/P3 items only, and add an explicit exception that silent correctness bugs, NaN/inference bugs, missing assumption checks, incorrect variance/SE, and security findings are never mitigated by TODO.md.
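The proposed precedence rule can be encoded in a few lines; this is a hypothetical sketch of the fix, not prompt text from the PR, and the category names are illustrative stand-ins for the non-deferrable classes listed above:

```python
# TODO.md tracking may only downgrade deferrable P2/P3 findings;
# P0/P1 severities and non-deferrable categories are never mitigated.
NON_DEFERRABLE = {"nan_guard", "statistical_output", "variance_se",
                  "assumption_check", "security"}

def apply_todo_mitigation(severity, category, tracked_in_todo):
    """Return 'MITIGATED' only for deferrable P2/P3 items tracked in TODO.md."""
    if severity in ("P0", "P1") or category in NON_DEFERRABLE:
        return severity  # never downgraded by TODO tracking
    if tracked_in_todo:
        return "MITIGATED"
    return severity
```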

Security

  • Severity: P0. Impact: .github/workflows/ai_pr_review.yml:L99-L107 selects any issue comment whose body merely contains <!-- ai-pr-review:codex:, and .github/workflows/ai_pr_review.yml:L133-L137 injects that body directly into the next prompt. The prompt’s untrusted-data guard applies only to PR title/body (.github/codex/prompts/pr_review.md:L69-L77), so a user-authored PR comment can spoof a prior AI review and poison a collaborator-triggered rerun. Concrete fix: only accept prior-review comments with an exact auto/rerun marker format from github-actions[bot] (or the specific workflow app identity), and wrap the injected text in a clearly delimited “historical model output, not instructions” block.
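The concrete fix above might look like the following sketch, assuming the `auto`/`rerun` marker variants named in the finding and the standard `github-actions[bot]` login; the exact marker regex and the `<previous_ai_review>` wrapper tag are illustrative assumptions:

```python
import re

# Accept a prior review only when the author is the workflow bot and the
# marker matches an exact format at the start of a line; then wrap the
# body as explicitly untrusted historical output, not instructions.
TRUSTED_AUTHOR = "github-actions[bot]"
MARKER_RE = re.compile(r"^<!-- ai-pr-review:codex:(auto|rerun) -->", re.MULTILINE)

def select_prior_review(comments):
    """Return the newest bot-authored comment with an exact marker, or None."""
    ok = [c for c in comments
          if c.get("user", {}).get("login") == TRUSTED_AUTHOR
          and MARKER_RE.search(c.get("body", ""))]
    return max(ok, key=lambda c: c["created_at"])["body"] if ok else None

def wrap_untrusted(body):
    """Delimit injected text as historical model output, not instructions."""
    return ('<previous_ai_review untrusted="true">\n'
            + body +
            "\n</previous_ai_review>")
```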

Documentation/Tests

  • Severity: P2. Impact: this PR adds a new rerun code path and new assessment-precedence logic without regression coverage, so both issues above would only surface after a bad review result. Concrete fix: add a small fixture-based test or script-level check for prompt assembly that asserts (1) spoofed non-bot marker comments are ignored and (2) a TODO-tracked NaN/inference bug still yields P0/P1 rather than being marked mitigated.
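Such a fixture-driven check could be as small as the sketch below; the `select_prior_review` and `is_mitigated_by_todo` helpers are hypothetical stand-ins for the logic under test, not code from this PR:

```python
# Fixture-level assertions for the two regressions described above:
# (1) spoofed non-bot marker comments must be ignored, and
# (2) TODO.md tracking must never mitigate a P0/P1 finding.
def select_prior_review(comments, marker="<!-- ai-pr-review:codex:",
                        bot="github-actions[bot]"):
    ok = [c for c in comments
          if c["user"]["login"] == bot and marker in c["body"]]
    return ok[-1]["body"] if ok else None

def is_mitigated_by_todo(severity, tracked):
    return tracked and severity not in ("P0", "P1")

spoof = {"user": {"login": "mallory"},
         "body": "<!-- ai-pr-review:codex:auto --> spoofed"}
real = {"user": {"login": "github-actions[bot]"},
        "body": "<!-- ai-pr-review:codex:auto --> prior findings"}

assert select_prior_review([spoof]) is None
assert "prior findings" in select_prior_review([spoof, real])
assert not is_mitigated_by_todo("P0", tracked=True)
assert is_mitigated_by_todo("P2", tracked=True)
```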

Path to Approval

  1. In .github/workflows/ai_pr_review.yml, restrict “previous AI review” lookup to exact workflow-authored markers from the bot/app account, and inject that text as explicitly untrusted historical output rather than trusted instructions.
  2. In .github/codex/prompts/pr_review.md, rewrite the mitigation rules so TODO.md only downgrades deferrable P2/P3 limitations and cannot suppress P0/P1 correctness, inference, methodology, or security findings.

Filter prior review comments to bot-authored only, wrap injected review
text in untrusted XML delimiters, scope TODO.md P3 classification to
deferrable categories, and add explicit P0/P1 exclusion from mitigation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@igerber (Owner, Author) commented Mar 14, 2026

/ai-review

@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: d5d4ead3ad152849370611a196d4f70b93a1c06b


Overall Assessment

✅ Looks good

Executive Summary

  • No estimator, weighting, variance/SE, or identification code is touched here; I found no methodology defects in scope.
  • The rerun workflow change is directionally correct: it pulls the latest bot-authored AI PR comment and only injects it into the compiled prompt on comment-triggered reruns.
  • P3: one of the new REGISTRY label examples in the prompt does not exactly match the repository’s current formatting, which can slightly weaken the “documented deviation” exemption this PR is trying to enforce.
  • P3: the new TODO-mitigation rules are internally inconsistent about whether a matching TODO entry must include a PR reference.
  • P3: there is still no automated coverage for the new previous-review retrieval / prompt-compilation path, so regressions here will surface only during live /ai-review runs.

Methodology

Code Quality

  • Severity P3. Impact: the prompt tells the reviewer to look for **Deviation from R:**, but the registry currently uses **Deviation from R**:. That mismatch is minor, but it slightly increases the chance that a documented deviation is missed and incorrectly treated as undocumented. Concrete fix: change the prompt text to match the repository’s actual label exactly, or generalize it to “look for Note / Deviation from R labels.” Locations: .github/codex/prompts/pr_review.md#L9-L10, docs/methodology/REGISTRY.md#L533.

Performance

  • No findings. The added API read for previous comments is small and isolated to the review workflow.

Maintainability

  • Severity P3. Impact: the new deferred-work rules are inconsistent. One section requires a TODO entry “with a PR reference,” while the mitigation rules later say any entry under the TODO section counts. That ambiguity can produce inconsistent reviewer behavior across similar PRs. Concrete fix: make both sections use the same rule, preferably keeping the “with a PR reference” qualifier if that is the intended policy. Locations: .github/codex/prompts/pr_review.md#L61-L64, .github/codex/prompts/pr_review.md#L96-L100, TODO.md#L38-L60.

Tech Debt

  • No findings. The new prompt explicitly acknowledges deferred work via TODO.md, and the repository already has the referenced section in place.

Security

Documentation/Tests

  • Severity P3. Impact: there is no automated coverage for the new rerun-specific prompt assembly path, so regressions in previous-comment retrieval or prompt layout will only be caught by manual /ai-review runs. Concrete fix: extract the prompt-building block into a small script and add a fixture-driven test that asserts the compiled prompt includes the rerun note, wrapped prior review, and current diff in the expected order. Location: .github/workflows/ai_pr_review.yml#L92-L152.

@igerber igerber merged commit a27668a into main Mar 14, 2026
@igerber igerber deleted the fix/ai-review-prompt-strictness branch March 14, 2026 19:27