I doubt anyone will be able to help you in response to such a brief post. Would you supply your config.yml and your log output? Please supply both in text format, with block/code formatting applied.
Are you able to do some debugging to find out where your test gets stuck? Is it a browser test? The more information you can share here, the more helpful responses might be.
Is it possible that your test program could produce no output for 10 minutes and not actually be stuck? I am not familiar with --with-nicedots, but I assume it prints dots to the screen to show that the test system is still running and has not crashed?
I don't think that is possible. It runs in a few seconds locally, and also in the next successful run that happened after this failed test run.
The dots don't print continuously to show whether it is running or not.
It prints a dot, or a letter (such as E for error, F for fail).
You can read more about it here.
In my case, it does this:
test_cases.py:TestCase.test_list ... Too long with no output (exceeded 10m0s)
More info: the test run was failing in CircleCI 1.0, with a different test case. After that I migrated to 2.0, which ran smoothly for about a month. Then it started to fail again, and now it's fine again.
It might be that your tests are too sensitive to their environment, and they need to be made more robust. Passing sometimes and failing sometimes is a classic sign of flakey tests (don’t worry about it, everyone gets 'em).
Can you add debug commands to your tests to log how far they get on the occasions it gets stuck? If it stops on the same test every time it fails, that would be an indicator of a problem to dig into.
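As a rough illustration of the kind of debug logging I mean, here is a minimal sketch. It assumes your tests are based on unittest.TestCase (Django's TestCase is); the names ProgressLoggingMixin and ExampleTest are made up for this example and are not part of any library.

```python
import sys
import time
import unittest


class ProgressLoggingMixin:
    """Hypothetical mixin: writes a flushed line when each test starts and ends."""

    # Assumption: the CI job captures stderr in its log output.
    progress_stream = sys.stderr

    def setUp(self):
        super().setUp()
        self._started_at = time.monotonic()
        self._log("START")

    def tearDown(self):
        elapsed = time.monotonic() - self._started_at
        self._log("END after %.2fs" % elapsed)
        super().tearDown()

    def _log(self, message):
        # self.id() is the dotted name of the test that is currently running.
        self.progress_stream.write("[progress] %s %s\n" % (self.id(), message))
        # Flush right away so the line is not lost in a buffer if the test hangs.
        self.progress_stream.flush()


class ExampleTest(ProgressLoggingMixin, unittest.TestCase):
    """Stand-in test case; list the mixin before the TestCase base class."""

    def test_list(self):
        self.assertEqual([1, 2], [1, 2])
```

If the last line in the log is a START with no matching END, that is the test that hung. You can add similar flushed writes around the API call inside the test to narrow it down further.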
Well, if the test was “flakey”, then it wouldn’t run after I SSH-ed into the CircleCI box either, right? I managed to run it multiple times in the box and it didn’t get stuck.
EDIT:
This Django Testing Example is very similar to the test which is failing. The linked example makes a POST request; I make a GET request and compare the response with the expected output. That’s it.
You might say that it may be stuck due to the API call, but that same API call is being made just in the previous test and that worked fine.
I think you’re partly right: your tests are brittle only on start-up. Do your tests use the MySQL server? If so, I wonder if you need a wait command before your tests to ensure MySQL is ready before you try connecting. Of course, in an SSH session, that’s already done, which might be why it works there.
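A wait step could look something like the sketch below. It is only an assumption about your setup: wait_for_port is a made-up helper, and the host and port (127.0.0.1 and 3306, the MySQL default) would need to match your config. Note that an open TCP port only shows the server is listening, not that it is ready for queries, so treat this as a first approximation.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a connection; success means something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: give up only once the deadline has passed.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

You would call wait_for_port("127.0.0.1", 3306) from a small script that runs before the tests, and fail the job early if it returns False.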
If you are reliably running tests that use the database, and it gets stuck on something in the middle, then it is not likely to be a database start-up issue, and instead you need to find out where it gets stuck. You will need to add something to log to a file (and export as an artefact) or examine the file in a post-fail SSH session.
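One way to get that file without editing every test is Python's standard faulthandler module, which can dump the traceback of every thread if the process is still running after a timeout. This is a sketch under assumptions: arm_hang_dump and disarm_hang_dump are made-up helper names, the log path is up to you, and the 300 second default is just a value below the 10 minute no-output limit.

```python
import faulthandler


def arm_hang_dump(path, timeout=300):
    """Dump all thread tracebacks to `path` every `timeout` seconds while running."""
    log_file = open(path, "w")
    # faulthandler writes straight to the file descriptor, so the file
    # has to stay open until the timer is cancelled.
    faulthandler.dump_traceback_later(timeout, repeat=True, file=log_file)
    return log_file


def disarm_hang_dump(log_file):
    """Cancel the pending dump and close the log file."""
    faulthandler.cancel_dump_traceback_later()
    log_file.close()
```

Call arm_hang_dump once when the test run starts (for example from the test settings or a package-level setup hook) and point a store_artifacts step at the file. If a test hangs, the dump should show the exact line it is blocked on.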