Lately I have been seeing odd behaviour where a specific Cypress test, which never fails locally, started failing with timeouts when run in CircleCI (CCI):
CypressError: Timed out retrying after 10000ms: cy.wait() timed out waiting 10000ms for the 1st request to the route: getSomeData. No request ever occurred.
Observed Behavior:
- Test fails on initial CI run (3 out of 6 attempts); cancelled it during 3rd attempt
- After cancelling the failed build and rerunning from failure, all tests pass
- Rerunning from start or failure after a complete failed run just leads to consistent failures
- The test verifies UI state after an API call, and the UI shows the correct state even when the test fails
Environment Details:
- CI Provider: CircleCI
- Test Framework: Cypress
- Resource Usage During Failure:
  - CPU: Peaks at 79% during installation
  - RAM: Peaks at 63% in the first 5 minutes
- Successful Run Metrics:
  - CPU: Peaks at 74%
  - RAM: Peaks at 58%
What I’ve Tried:
- Increased timeouts in the test configuration
- Split the test into 3 fragments (initially it was a single monolithic fragment handling 3 test cases)
- Confirmed the application behaves correctly despite test failures in CCI (the test always passes locally)
- Compared resource usage between failed and successful runs
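
For context on the timeout change, here is a sketch of what raising timeouts can look like in `cypress.config.ts`. This is an illustration, not the project's actual configuration. The only value taken from the error above is the 10000 ms request wait; the other numbers are assumed. `requestTimeout` is the setting that controls how long `cy.wait()` waits for an aliased request to be sent, which is the timeout the error reports.

```typescript
// cypress.config.ts (illustrative sketch; values other than 10000 are assumptions)
import { defineConfig } from "cypress";

export default defineConfig({
  e2e: {
    // How long cy.wait("@alias") waits for the request to leave the browser.
    // Cypress defaults to 5000 ms; the error message above reports 10000 ms.
    requestTimeout: 10000,
    // How long cy.wait("@alias") then waits for the response (Cypress default: 30000 ms).
    responseTimeout: 30000,
    // How long most DOM commands and assertions retry (Cypress default: 4000 ms).
    defaultCommandTimeout: 10000,
  },
});
```

Since the error says "No request ever occurred", a longer `requestTimeout` only helps if the request is sent late, not if it is never sent or is sent before the route is registered.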
Key Questions:
- Could the slight differences in resource utilisation (CPU 79% vs 74%, RAM 63% vs 58%) be significant enough to cause these failures and flaky tests?
- Are there known issues with cold starts in CircleCI that could affect network request timing?
- Are there specific CircleCI configurations or optimisations I should consider for more consistent test execution?
Additional Context:
- The test involves mocking API responses and verifying UI state
- The application appears to be working correctly even when tests fail
- The issue is consistently reproducible in CI but never occurs locally