[Repo Assist] Fix Wilcoxon signed-rank test: incorrect rank multiplication in tie correction #358

Draft
github-actions[bot] wants to merge 2 commits into developer from repo-assist/fix-issue-347-wilcoxon-tie-correction-9540179245de0c96

🤖 This is an automated PR from Repo Assist.

Closes #347

## Root Cause

The `tieCorrection` function in `Testing/Wilcoxon.fs` multiplied the correction term by the rank value `i` whenever a tie group had size greater than 2:

```fsharp
// Before (buggy)
let tieCorrection (i,j) = 
    if j = 2.0 then 
        (j**3. - j) / 48.
    else i * ((j**3. - j) / 48.)  // ← rank `i` multiplied incorrectly
```

The standard Wilcoxon signed-rank tie correction formula is:

$$\text{correction} = \sum_k \frac{t_k^3 - t_k}{48}$$

where $t_k$ is the number of observations in the $k$-th tied group; the rank value plays no role. The special case for `j = 2.0` happened to be correct because it skipped the multiplication, but any tie group with 3 or more equal values produced an inflated correction term, which led to incorrect z-statistics and p-values.

## Fix

```fsharp
// After (correct)
let tieCorrection (_,j) = 
    (j**3. - j) / 48.
```

This aligns with [SciPy's implementation](https://github.com/scipy/scipy/blob/17603e519b3fe2cb3a94dcda99475f3100f23828/scipy/stats/_wilcoxon.py#L168) referenced in both the documentation and the issue.

## Impact

The fix only changes results for datasets in which at least one value appears **3 or more times** in the ranked differences. Datasets with only pairwise ties (`j = 2`) were unaffected.
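To make the size of the effect concrete, here is a minimal Python sketch (not the library's F# code) that compares the old and new correction terms on the regression dataset from this PR. The function names are hypothetical, and the sketch assumes, following the description above, that `i` is the average rank assigned to the tie group and `j` is the group size.

```python
from collections import Counter

def tie_correction_fixed(count):
    # Standard term: depends only on the size of the tie group.
    return (count**3 - count) / 48

def tie_correction_buggy(rank, count):
    # Mirrors the pre-fix logic: groups of size 2 were special-cased,
    # every other group had its term multiplied by the rank.
    if count == 2:
        return (count**3 - count) / 48
    return rank * (count**3 - count) / 48

def average_ranks(values):
    # Map each distinct value to its average (mid) rank, plus group sizes.
    ordered = sorted(values)
    counts = Counter(ordered)
    first_pos = {}
    for pos, v in enumerate(ordered, start=1):
        first_pos.setdefault(v, pos)
    ranks = {v: first_pos[v] + (counts[v] - 1) / 2 for v in counts}
    return ranks, counts

abs_diffs = [abs(d) for d in [1, 1, 1, 2, -3]]
ranks, counts = average_ranks(abs_diffs)

fixed = sum(tie_correction_fixed(c) for c in counts.values())
buggy = sum(tie_correction_buggy(ranks[v], c) for v, c in counts.items())
print(fixed, buggy)  # 0.5 1.0
```

Under these assumptions the three tied values share rank 2, so the old code subtracted twice the intended amount from the variance.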

## Test Status

All existing Wilcoxon tests continue to pass. Two new regression tests were added using `differences = [1, 1, 1, 2, -3]` (a tie group of size 3), validated against SciPy:

```
SciPy wilcoxon([1,1,1,2,-3], mode='approx', correction=False) → p ≈ 0.4922
SciPy wilcoxon([1,1,1,2,-3], mode='approx', correction=True)  → p ≈ 0.5827
```

✅ Build: succeeded
✅ Tests: 8/8 passed (6 existing + 2 new)
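
The SciPy reference values quoted above can be cross-checked without SciPy. The following is a rough standard-library sketch of the two-sided normal approximation with the corrected tie term; the function name `wilcoxon_p_normal` is hypothetical, and the sketch assumes there are no zero differences and that the statistic is the smaller of the two signed rank sums.

```python
import math
from collections import Counter

def wilcoxon_p_normal(diffs, continuity=False):
    """Two-sided p-value via the normal approximation (no zero differences)."""
    n = len(diffs)
    abs_sorted = sorted(abs(d) for d in diffs)
    counts = Counter(abs_sorted)

    # Average ranks for tied absolute differences.
    first_pos = {}
    for pos, v in enumerate(abs_sorted, start=1):
        first_pos.setdefault(v, pos)
    rank = {v: first_pos[v] + (counts[v] - 1) / 2 for v in counts}

    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank[abs(d)] for d in diffs if d < 0)
    t = min(w_plus, w_minus)

    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    # Corrected tie term: group size only, no rank factor.
    var -= sum((c**3 - c) / 48 for c in counts.values())

    # Optional continuity correction: shift the statistic 0.5 toward the mean.
    d = 0.0
    if continuity and t != mean:
        d = 0.5 * math.copysign(1.0, t - mean)

    z = (t - mean - d) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))

print(round(wilcoxon_p_normal([1, 1, 1, 2, -3]), 4))
print(round(wilcoxon_p_normal([1, 1, 1, 2, -3], continuity=True), 4))
```

For this dataset the rank sums are 10 and 5, the mean is 7.5, and the tie-corrected variance is 13.75 - 0.5 = 13.25, which gives p-values that agree with the figures above to four decimal places.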

Generated by Repo Assist ·

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@d88ca0e8ee2b080fcba4490ac5b657c98a0eb26b

The tieCorrection function incorrectly multiplied the correction term
by the rank value (i) when tie group size j > 2. The standard formula
for the Wilcoxon tie correction is sum((t^3 - t) / 48) where t is the
count of tied observations — the rank plays no role.

This bug only affected datasets where at least one tie group had 3 or
more equal values; pairs (j=2) were handled correctly by a special case.

Closes #347

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Wilcoxon Test Tie Correction Multiplies by Rank Incorrectly
