Workflow stops/crashes seemingly randomly

I’m experiencing that the automated test runs we run on PRs sometimes fails/crashes.
To me it seems the docker instance just kills itself for no apparent reason, and there’s no consistent reason/“location” in the run where it happens.
Haven’t found any clues in the logs while running in SSH mode, and the resource usage seems to not be pushed at all (image used: ubuntu-1604:202010-01)

Would appreciate any pointers on where to look next, not even sure if I’ve looked in the right logs.

Thanks!

Hi @kskarra,

Welcome to the CircleCI community!

Could you check if the job’s memory usage doesn’t reach the limit for the resource class you’re using?

Hi, and thanks!

I included the memory usage step from the link you provided, and it looks like only a few GB is in use (25,2% of 15GB) at peak right before the run crashes.

I’ve also SSH’d into a run while it was going, and looked at the htop intently, and I couldn’t see any spikes or unwanted behavior while it ran.

I am now able to reproduce this locally, so it’s likely not a issue within CircleCI.

Thanks for the help and feedback!

Thanks for the update, @kskarra!

Once you’ve managed to fix the issue locally, let me know if you have any trouble running the build on CircleCI.

1 Like

Was able to locate the issue, and now things are looking green again!

1 Like