Skip to content

Fix TROP nuclear norm solver step size + FISTA acceleration#187

Closed
igerber wants to merge 3 commits intomainfrom
trop-additional-analysis
Closed

Fix TROP nuclear norm solver step size + FISTA acceleration#187
igerber wants to merge 3 commits intomainfrom
trop-additional-analysis

Conversation

@igerber
Copy link
Owner

@igerber igerber commented Mar 2, 2026

Summary

  • Fix factor-of-2 error in proximal gradient soft-thresholding threshold for L matrix: λ_nn/max(δ)λ_nn/(2·max(δ)), derived from Lipschitz constant L_f = 2·max(δ)
  • Fix applied to all 4 code paths: Python joint, Python twostep, Rust joint, Rust twostep
  • Add FISTA/Nesterov momentum acceleration to twostep inner solver (O(1/k²) convergence)

Methodology references (required if estimator / math changes)

  • Method name(s): TROP (Triply Robust Panel) — proximal gradient for nuclear norm minimization
  • Paper / source link(s): Arkhangelsky et al. (2021) "Doubly Robust Difference-in-Differences"; Beck (2012) "A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems" (FISTA)
  • Any intentional deviations from the source (and why): FISTA applied only to twostep inner solver (where α,β are fixed and objective is constant), NOT to joint solver's alternating loop where convergence guarantees don't hold

Validation

  • Tests added/updated: tests/test_trop.py — added TestTROPNuclearNormSolver with:
    • test_proximal_step_size_correctness: verifies convergence to analytical prox_{λ/2}(R) for uniform weights
    • test_lowrank_objective_decreases: verifies objective monotonicity across iterations
  • Full test suite: 81/81 TROP tests pass (pure Python), 20/20 Rust parity tests pass
  • Backtest / simulation / notebook evidence (if applicable): Comparison notebook in analysis/ (not committed) shows L Frobenius norm difference vs CVXPY decreases significantly

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

…ISTA acceleration

The soft-thresholding threshold for the L matrix was λ_nn/max(δ) when the
correct value is λ_nn/(2·max(δ)), derived from the Lipschitz constant
L_f = 2·max(δ) of the quadratic gradient. This over-shrinking caused
singular values to be reduced too aggressively.

Fix applied to all four code paths: Python joint, Python twostep, Rust
joint, Rust twostep. Also adds FISTA/Nesterov acceleration to the twostep
inner solver for faster L convergence (O(1/k²) vs O(1/k)).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Copy link

github-actions bot commented Mar 2, 2026

Overall assessment: ⚠️ Needs changes

Executive summary

  • TROP joint and twostep nuclear-norm solvers were modified (Python/Rust); joint looks aligned with the updated registry step-size logic, but the Python twostep path appears to miss max(W) scaling, creating backend divergence.
  • Methodology registry updates are partial: the core TROP section still documents unscaled soft-thresholding, which no longer matches the new proximal-gradient/FISTA implementation.
  • Tests cover the joint-step-size fix, but there’s no direct coverage for the twostep/FISTA path or Python↔Rust parity under non-uniform weights.

Methodology

  • Severity: P1. Impact: Python twostep low-rank update uses lambda_nn / 2 after weight normalization, but the registry and Rust implementation scale the threshold by max(W); when W_max < 1 (possible after masking with non-uniform weights), Python under-regularizes and diverges from Rust, yielding inconsistent L and ATT. Fix: compute eta = 1/(2*W_max) and use soft_threshold_svd(gradient_step, eta * lambda_nn) (or keep unnormalized W and explicit eta). Locations: diff_diff/trop.py:L2041-L2072, rust/src/trop.rs:L663-L772, docs/methodology/REGISTRY.md:L1064-L1071.
  • Severity: P1. Impact: The core TROP registry still states L̂ = soft_threshold(Σ, λ_nn) without step-size scaling or FISTA, which no longer matches the updated solver logic; this is an undocumented deviation from the Methodology Registry. Fix: update docs/methodology/REGISTRY.md to describe the proximal-gradient step with η = 1/(2·max(W)) and the FISTA acceleration in the twostep inner solver (or explicitly scope the formula to the unweighted special case). Locations: docs/methodology/REGISTRY.md:L1064-L1071, diff_diff/trop.py:L2053-L2071.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • Severity: P2. Impact: New FISTA twostep path and corrected step-size behavior lack direct tests, so regressions or Python↔Rust mismatches under non-uniform weights could slip. Fix: add a twostep test with non-uniform weights (where W_max < 1) and assert objective decrease and/or parity between Python and Rust. Location: tests/test_trop.py:L2595-L2652.

Tests not run (not requested).

…ghts

The proximal threshold was hardcoded as λ/(2) which is only correct when
W_max=1. Changed to λ/(2·W_max) to match the joint solver and Rust backend.
Added test with non-uniform weights and updated REGISTRY.md algorithm docs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@igerber
Copy link
Owner Author

igerber commented Mar 7, 2026

/ai-review

@github-actions
Copy link

github-actions bot commented Mar 7, 2026

🔁 AI review rerun (requested by @igerber)

Head SHA: 5b97192a3f6625d9c143535fc512625035d166f6


Overall Assessment: ⚠️ Needs changes

Executive Summary

  • P1: Python twostep L-update divides by W_max without a zero guard, so LOOCV with an empty control set can return finite scores instead of the Registry-required Q(λ)=∞, and it diverges from Rust’s guarded behavior.
  • P2: In-code docstrings/comments still describe the old step-size/prox formula (η ≤ 1/max(W) and prox_{λ}), which now conflicts with the Registry and implementation.
  • The step-size correction and FISTA placement are otherwise consistent with the updated Methodology Registry for TROP (joint vs twostep behavior).
  • Tests not run in this review.

Methodology

  • P1: W_max == 0 in the twostep L-solver yields lambda_nn / (2.0 * W_max) and can silently produce finite estimates instead of a LOOCV failure, violating the Registry’s “empty control observations → Q(λ)=∞” requirement and breaking Python/Rust parity. Impact: incorrect λ selection and ATT in sparse/degenerate control sets; edge-case instability. Fix: guard for empty effective controls (e.g., if W_max <= 0 or W_masked.sum()==0: raise ValueError(...) so LOOCV returns ) or explicitly short-circuit with a documented failure path; also avoid division by zero. Location: diff_diff/trop.py:L2041-L2072 and Registry requirement docs/methodology/REGISTRY.md:L1120-L1132.

Code Quality
No issues found.

Performance
No issues found.

Maintainability
No issues found.

Tech Debt
No issues found.

Security
No issues found.

Documentation/Tests

  • P2: Docstrings/comments still describe the pre-fix step size and prox operator (η ≤ 1/max(W) and prox_{λ}), now inconsistent with the updated method (η = 1/(2·max(W)), FISTA in twostep). Impact: methodological confusion for readers and reviewers; can cause incorrect re-implementations. Fix: update the docstrings/comments to mirror the Registry and actual code (include L_f = 2·max(W) and FISTA note). Locations: diff_diff/trop.py:L1987-L1996, rust/src/trop.rs:L619-L626.
  • P3: Add a regression test for the empty-control/W_max==0 LOOCV edge case to ensure the solver fails cleanly (or returns ), preventing silent finite scores. Suggested location: tests/test_trop.py near tests/test_trop.py:L2595-L2694.

…rings

- Add conditional threshold when W_max==0 to prevent ZeroDivisionError,
  matching Rust backend behavior (trop.rs:665)
- Update Python and Rust docstrings to reflect correct FISTA/Nesterov
  acceleration formulas (L_f = 2·max(W), η = 1/(2·max(W)))
- Add regression test for all-zero weights edge case

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@igerber
Copy link
Owner Author

igerber commented Mar 8, 2026

/ai-review

@github-actions
Copy link

github-actions bot commented Mar 8, 2026

🔁 AI review rerun (requested by @igerber)

Head SHA: 5e9ca252c19ed8e424aa8ba1446f729a945394d8


Overall assessment: ✅ Looks good

Executive summary:

  • TROP joint and twostep low‑rank updates now consistently use the corrected step size/threshold across Python and Rust, matching the registry update. (diff_diff/trop.py:L878-L894, diff_diff/trop.py:L2043-L2073, rust/src/trop.rs:L664-L666, rust/src/trop.rs:L1362-L1386, docs/methodology/REGISTRY.md:L1064-L1076, docs/methodology/REGISTRY.md:L1188-L1202)
  • Twostep L‑update now uses FISTA/Nesterov acceleration in both implementations, and this scope is documented in the registry. (diff_diff/trop.py:L2054-L2077, rust/src/trop.rs:L725-L781, docs/methodology/REGISTRY.md:L1075-L1076)
  • New Python tests cover step‑size correctness, objective decrease, non‑uniform weights, and zero‑weight handling. (tests/test_trop.py:L2595-L2714)

Methodology

  • Severity: P3 (info) — Impact: The corrected step size/threshold and FISTA scope align with the Methodology Registry for TROP joint/twostep; no mismatches found. Concrete fix: None. (diff_diff/trop.py:L878-L894, diff_diff/trop.py:L2043-L2077, rust/src/trop.rs:L664-L666, rust/src/trop.rs:L725-L781, rust/src/trop.rs:L1362-L1386, docs/methodology/REGISTRY.md:L1064-L1076, docs/methodology/REGISTRY.md:L1188-L1202)

Code Quality

  • Severity: P3 (info) — Impact: No code‑quality issues observed in the modified logic. Concrete fix: None.

Performance

  • Severity: P3 — Impact: FISTA adds up to 10 SVDs per outer iteration in twostep (Python/Rust), which could increase runtime on large panels despite faster convergence. Concrete fix: Consider exposing max_inner_iter as a tunable parameter or adding adaptive stopping based on objective improvement if performance regressions appear. (diff_diff/trop.py:L2054-L2077, rust/src/trop.rs:L748-L781)

Maintainability

  • Severity: P3 (info) — Impact: Docstrings and registry updates track the algorithm changes, keeping implementation and documentation aligned. Concrete fix: None. (diff_diff/trop.py:L1987-L1998, docs/methodology/REGISTRY.md:L1064-L1076, docs/methodology/REGISTRY.md:L1188-L1202)

Tech Debt

  • Severity: P3 (info) — Impact: No new tech debt observed. Concrete fix: None.

Security

  • Severity: P3 (info) — Impact: No secrets or unsafe operations introduced in the diff. Concrete fix: None.

Documentation/Tests

  • Severity: P3 — Impact: Python tests cover the corrected step size and several solver edge cases; Rust FISTA path isn’t directly unit‑tested in this PR. Concrete fix: Consider adding a Rust unit test for prox‑threshold correctness or objective decrease to reduce regression risk. (tests/test_trop.py:L2595-L2714, rust/src/trop.rs:L725-L781)

@igerber
Copy link
Owner Author

igerber commented Mar 8, 2026

Closing, additional work being done instead at #195

@igerber igerber closed this Mar 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant