Skip to content

caddytest: make TestReverseProxyHealthCheck deterministic with poll instead of sleep#7474

Open
Amirhf1 wants to merge 1 commit intocaddyserver:masterfrom
Amirhf1:fix/reverseproxy-healthcheck-test-deterministic
Open

caddytest: make TestReverseProxyHealthCheck deterministic with poll instead of sleep#7474
Amirhf1 wants to merge 1 commit intocaddyserver:masterfrom
Amirhf1:fix/reverseproxy-healthcheck-test-deterministic

Conversation

@Amirhf1
Copy link

@Amirhf1 Amirhf1 commented Feb 12, 2026

the test was flaky because health checks can still be warming up when the first request is made. The 100ms sleep was not reliable: sometimes the upstream was still reported unhealthy, and the request returned 503 instead of 200, causing CI noise and test failures.

Solution:

  • Removed the fixed time.Sleep(100ms) and its TODO comment.
  • Added a poll loop that:
    • Sends a GET request to the proxy every 20ms.
    • Succeeds as soon as the response is 200 with body "Hello, World!".
    • Fails with a clear message if no success within a 5s timeout.

This keeps the test deterministic, removes arbitrary wait times, and aligns with the maintainer’s request for a polling mechanism with a timeout.

Fixes #7468

@CLAassistant
Copy link

CLAassistant commented Feb 12, 2026

CLA assistant check
All committers have signed the CLA.

@Amirhf1 Amirhf1 force-pushed the fix/reverseproxy-healthcheck-test-deterministic branch from 281f340 to 7c8ad4c Compare February 12, 2026 22:15
@francislavoie
Copy link
Member

Looks fine I guess, but I'd want to understand why the upstream is considered unhealthy to start in the first place. It's running as part of the same server, so it doesn't make sense to me that it would return 5xx at all if the server is reachable.

@Amirhf1
Copy link
Author

Amirhf1 commented Feb 12, 2026

with active health checks the upstream is only marked healthy after the first health check passes.
that first check runs in a goroutine right after config load so there’s a race: if the test hits the proxy before that check completes the upstream is still unhealthy and we get 503 (no upstreams available) the poll just waits until the health state has converged (we get 200) making the test deterministic

@francislavoie
Copy link
Member

I think we should probably assume configured/known upstreams are healthy unless discovered otherwise, to at least allow early requests a chance at making their way through 🤔 WDYT @mholt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

make reverse proxy health check integration test deterministic

3 participants