No output from a container's step in a parallel job workflow

Reference: https://circleci.com/gh/OpenNMS/opennms/8182#tests/containers/5

We seem to be experiencing intermittent issues where a step for a particular container produces no output. This causes the container to abort after the preset timeout, yet subsequent steps that save artifacts and analysis show that the job is indeed running. Examining the other containers running in parallel shows no issues.

The script in question, smoke.sh, first runs an if to test a condition, then a binary if that runs an echo command regardless of which branch is taken; this should always produce output at the beginning. Looking at the referenced link above, you can see no output for the "Smoke Tests" step in container 5, while the other containers have valid output.
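For illustration, here is a minimal sketch of the structure described above. The variable and role names are hypothetical, not taken from the real smoke.sh; the point is only that both branches of the second if echo something, so the step should never be silent at startup:

```shell
#!/bin/sh
# Hypothetical sketch of the smoke.sh structure described in this thread.
# CIRCLE_NODE_INDEX is the container index CircleCI sets for parallel jobs;
# the ROLE variable and its values are illustrative assumptions.
if [ "$CIRCLE_NODE_INDEX" = "0" ]; then
  ROLE="coordinator"
else
  ROLE="worker"
fi

# Binary if: both branches run echo, so output always appears at the start.
if [ "$ROLE" = "coordinator" ]; then
  echo "smoke.sh: starting as coordinator on container $CIRCLE_NODE_INDEX"
else
  echo "smoke.sh: starting as worker on container $CIRCLE_NODE_INDEX"
fi
```

With a structure like this, a completely empty step log points at output collection on CircleCI's side rather than at the script itself.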

This has occurred for us across different jobs/steps and cannot seem to be isolated to something controllable on our side of the fence.


I have a ticket in support for this exact issue, and your output is very helpful. You have a 30m timeout, and it ran for 76m! That looks like a red flag to me… :thinking:
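For context, a per-step timeout like the 30m one mentioned here is normally set with `no_output_timeout` on a `run` step in the CircleCI 2.0 config, which is what makes a 76m run without output so surprising. A hedged sketch of what such a config might look like (job name, parallelism, and command are hypothetical):

```yaml
version: 2
jobs:
  smoke:                     # hypothetical job name
    parallelism: 5
    steps:
      - checkout
      - run:
          name: Smoke Tests
          command: ./smoke.sh
          no_output_timeout: 30m   # step is killed if it emits no output for 30 minutes
```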

Perhaps you could flag this to support@circleci.com and mention that it might be the same issue as ticket 54295?


We’ve had this happen on some other jobs too like this one: https://circleci.com/gh/OpenNMS/opennms/8294#tests/containers/6

My suspicion is that if/when the container times out, the logs aren’t properly gathered.

There’s a similar problem that occurs when the container run time exceeds the 5 hour limit. Some of the logs are visible in the step output, but the complete log file cannot be downloaded.


For the past 4-5 days, our jobs have been hanging completely in the container, and the node command takes forever. It just shows a blank command and doesn't print anything after that. Sometimes a job starts and runs fine, but the console log vanishes partway through. Unfortunately, not a single workflow has PASSED in the last 4-5 days for us! Our parallelism varies from 3 to 5 containers, and at least one job hangs in each workflow.

It has affected our team's productivity and we can't move forward. A container hangs until the timeout limit in every job; we are blocked!

CircleCI, will you fix this blocking issue ASAP? I have opened a support ticket but have received no reply from you either.


It may not be a blocker for enough customers yet. The issue I have is intermittent, only affects one repo, and a rebuild always fixes it. I also notice that you do not have a "Too long with no output" message, so I wonder if you have a different problem.

A rebuild doesn't fix the issue for us, and one container out of the 3-4 running in parallel always fails in any case. The issue is common, as described in the original post, and our tests work the same way as far as parallelism and test distribution are concerned. It is a private repo, otherwise I would have shared the URLs. I did get replies from CircleCI, and they are still investigating.

We received a report from CircleCI support that this particular issue has been resolved, and our recent builds no longer seem to be affected by it.