Inconsistent Test Results - Possibly Due to Shared Load?

Hi folks. We have some Selenium-based tests that have been fairly stable for a while, but in the past few months we've seen a growing number of failures: connection pool timeouts against the database, webdriver assertion timeouts, and so on. The same tests pass just fine when run locally. We used to just re-run the failed jobs from the Circle dashboard and they'd eventually pass, but it has gotten progressively worse, to the point where a run now has less than a 50% chance of passing. I've tried splitting the jobs into parallel workflows to speed things up, but it doesn't seem to make a difference.
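For context, the parallel setup I'm describing is roughly the standard CircleCI `parallelism` approach. A minimal sketch (the image, job name, and test command here are placeholders, not our actual config):

```yaml
version: 2.1
jobs:
  selenium-tests:
    docker:
      # Placeholder image; any browser-enabled convenience image works
      - image: cimg/python:3.12-browsers
    # Fan the job out across 4 identical containers; each container
    # gets CIRCLE_NODE_INDEX / CIRCLE_NODE_TOTAL to pick its share of tests
    parallelism: 4
    steps:
      - checkout
      - run: make test  # placeholder test command
```

Note that `parallelism` only shortens wall-clock time per job; each container is still a separate Docker executor subject to whatever load its underlying host is seeing.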

Is it possible that our use of the docker executor type combined with additional load on the worker instance we’re using is causing these timeouts? If so, what solution do you recommend?

Kind regards,
Joe

Did you figure this out? I think we’re seeing something similar.

We’re seeing the same thing. Our build times are “consistently” inconsistent. We’ve honed our build configuration for years, splitting tests across parallel workers by timings so that each parallel job takes roughly the same amount of time to run. It was good for a while, but in recent months our builds have been taking anywhere from 10 to 30 minutes, and we can’t find an explanation for it.
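For anyone following along, timings-based splitting is the documented `circleci tests split --split-by=timings` mechanism. A rough sketch of the relevant step (the glob pattern and `pytest` runner are placeholders for whatever your project uses):

```yaml
steps:
  - checkout
  - run:
      name: Run tests split by historical timings
      command: |
        # Each parallel container picks its own subset of files,
        # balanced using timing data from previous runs
        TESTFILES=$(circleci tests glob "tests/**/test_*.py" | circleci tests split --split-by=timings)
        pytest $TESTFILES --junitxml=test-results/results.xml
  # Timing data is only collected if test results are stored
  - store_test_results:
      path: test-results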

Hey CircleCI folks, did you switch to AWS Spot instances to save costs?

I don’t blame you, I just need to know, so that I can switch to a CI that runs on my own hardware and worker nodes. For example, GitLab CI works great and lets you bring your own hosts.

Hi! Let me raise this to the team to see if we can find out more.