Skip to content

tests: eliminate port-allocation races in SSE/StreamableHTTP tests#2264

Draft
maxisbey wants to merge 1 commit intomainfrom
maxisbey/max-156-flaky-convert-ssestreamablehttp-tests-to-httpxasgitransport
Draft

tests: eliminate port-allocation races in SSE/StreamableHTTP tests#2264
maxisbey wants to merge 1 commit intomainfrom
maxisbey/max-156-flaky-convert-ssestreamablehttp-tests-to-httpxasgitransport

Conversation

@maxisbey
Copy link
Contributor

Motivation and Context

Fixes #1573.

Several test files use the pattern socket.bind(("", 0)) → get port → start uvicorn in multiprocessing.Processwait_for_server() → connect. This has a TOCTOU window between port selection and bind: with pytest-xdist parallel workers, two tests can grab the same port, leading to connections landing on the wrong server (the 404 != 200 failure in test_response) or "address already in use" errors.

Prior attempt #1528 used worker-specific port ranges but was closed unmerged.

How Has This Been Tested?

  • Full suite: 1136 passed, 100% coverage
  • Stress: 5× runs with 14 parallel workers, all pass
  • test_response flakefinder: 20/20 pass
  • Runtime: ~4.7s for 104 affected tests (was ~25s+ with multiprocessing)

Breaking Changes

None. Test infrastructure only.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

Approach

New helper run_server_in_thread (tests/test_helpers.py): runs uvicorn in a daemon thread with port=0. The kernel atomically assigns a free port at bind time, then we read it back from the bound socket via server.servers[0].sockets[0].getsockname(). No window between check and use.

Why not httpx.ASGITransport everywhere?

I tested this — httpx.ASGITransport buffers the full response body before returning (await self.app(scope, receive, send) blocks until the app returns). This works for StreamableHTTP POST→SSE cycles where each request completes, but deadlocks for:

  • Legacy SSE (sse_client + SseServerTransport) — the GET /sse stream stays open indefinitely
  • StreamableHTTP standalone GET stream — same problem for server-initiated notifications
  • Any SSE security test expecting a 200 response (the stream never completes)

So I used a hybrid:

File Approach
test_streamable_http_security.py httpx.ASGITransport (all POST-based, responses complete)
test_http_unicode.py httpx.ASGITransport (POST→SSE cycles only)
test_sse_security.py run_server_in_thread (several tests need 200 on long-lived stream)
test_sse.py run_server_in_thread (legacy SSE requires real HTTP)
test_streamable_http.py run_server_in_thread (fixtures only — test bodies unchanged)

Net: −412 lines (495 added, 907 deleted).

Side effects

  • Server-side handler code now runs in-process (same PID, different thread), so coverage.py tracks it. Removed some # pragma: no cover annotations and cleaned up unused handlers that were never actually exercised.
  • The event_store in event_server fixture is now the same object the server uses (not a pickled copy), so tests can inspect server-side state if needed.
  • tests/shared/test_ws.py and tests/server/mcpserver/test_integration.py still use the old pattern — out of scope for this PR.

AI Disclaimer

Replace the multiprocessing.Process + socket.bind(("",0)) pattern with a
run_server_in_thread helper that uses uvicorn with port=0 in a background
thread. The kernel atomically assigns the port at bind time and we read it
back from the bound socket, eliminating the TOCTOU window that caused
test_response and others to intermittently connect to the wrong server
under pytest-xdist parallel execution.

Where possible (StreamableHTTP security tests, Unicode tests), use
httpx.ASGITransport for fully in-process testing with no real sockets.
Legacy SSE tests keep real HTTP because ASGITransport buffers responses
and cannot support the long-lived GET /sse stream pattern.

Github-Issue: #1573
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix flaky test test_response in test_streamable_http.py

1 participant