mgurney
September 26, 2024, 1:13am
1
I have a particular Job in a Workflow that hangs. In the CCI UI, I see Queued / Preparing with 0 / 20m 46s and going up.
Under the steps output, I just see Preparing environment with a blue spinner.
It never progresses.
This is running on a local executor.
I have other Jobs in my Workflow which use the same executor and they work fine. It is always that specific Job where the issue happens.
In the Kubernetes pod where we run the container-agent, I see HTTP 409 status code errors on the URL https://runner.circleci.com/api/v2/step/end
mgurney
September 26, 2024, 1:43am
2
Here is a slightly redacted version of the log output:
{
"data": {
"duration_ms": 100.041198,
"http.attempt": 1,
"http.base_url": "https://runner.circleci.com",
"http.client_name": "runner-agent",
"http.host": "runner.circleci.com",
"http.method": "POST",
"http.request_content_length": 0,
"http.response_content_length": "0",
"http.retry": false,
"http.route": "/api/v2/step/end",
"http.scheme": "https",
"http.status_code": 409,
"http.target": "/api/v2/step/end",
"http.url": "https://runner.circleci.com/api/v2/step/end",
"http.user_agent": "",
"meta.beeline_version": "1.15.0",
"meta.local_hostname": "circleci-runner-container-agent-123123-vgkkm",
"meta.span_type": "root",
"meta.type": "http_client",
"mode": "agent",
"name": "httpclient: runner-agent /api/v2/step/end",
"result": "success",
"service": "circleci-runner",
"service.name": "circleci-runner",
"service_name": "circleci-runner",
"span.kind": "Client",
"trace.span_id": "123",
"trace.trace_id": "123",
"version": "3.0.24-6207-17b540e",
"warning": "the response from POST /api/v2/step/end was 409 (Conflict) (1 attempts)"
},
"time": "2024-09-26T00:35:12.765978102Z",
"dataset": "unknown_dataset"
}
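In case it helps anyone else hunting for these entries, here is a minimal Python sketch that filters container-agent log output for 409 responses on the step/end route. It assumes the agent emits one JSON object per line (the entry above may only look multi-line because of how it was pasted), and the function name `find_conflicts` is my own, not part of any CircleCI tooling. The field names are taken from the log entry above.

```python
import json
from typing import Iterable, Iterator


def find_conflicts(lines: Iterable[str], route: str = "/api/v2/step/end") -> Iterator[dict]:
    """Yield a short summary for each log entry that reports HTTP 409 on `route`.

    Assumption: each element of `lines` is one complete JSON object.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not JSON
        if not isinstance(entry, dict):
            continue
        data = entry.get("data")
        if not isinstance(data, dict):
            continue
        if data.get("http.status_code") == 409 and data.get("http.route") == route:
            yield {
                "time": entry.get("time"),
                "host": data.get("meta.local_hostname"),
                "warning": data.get("warning"),
            }
```

To use it, pass any iterable of log lines, for example `for hit in find_conflicts(sys.stdin): print(hit)` with the pod logs piped in. Comparing the timestamps against the times the stuck Jobs were triggered is one way to check whether the two are correlated.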
mgurney
September 26, 2024, 2:01am
3
It is not just this Job within this Workflow that has the issue. I see the same problem with unrelated Jobs in other Workflows. The pattern is the same, though: on re-run, it is always the same Jobs that hit the issue, whilst others run fine. All Jobs use the same executor and none use any caching.
mgurney
September 26, 2024, 2:03am
4
I have tried rerunning the Workflow in 2 ways:
CCI UI “Rerun from start”
Push a git change and allow it to trigger a Workflow
In both cases the result is the same: the other Jobs in the Workflow complete and this Job hangs.
mgurney
September 26, 2024, 2:13am
5
We tried restarting the pod which hosts our container-agent, but it has not helped.
I have tried retriggering Jobs and I think that the HTTP 409 status code errors do correlate with Jobs getting stuck in the Preparing environment state.
mgurney
September 26, 2024, 4:06am
6
The Jobs were not running because they pointed to a self-hosted runner that had been decommissioned. Changing the Job to point to a different self-hosted runner resolved the issue. I don’t know whether the HTTP 409 status codes are related.
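For anyone who wants to check their own config for the same mistake, here is a rough Python sketch that lists `resource_class` values in a config file that are not in a set of runner resource classes you know to be active. It is only an illustration: the function name `stale_resource_classes` is hypothetical, the set of active classes has to be supplied by you (for example, copied from the self-hosted runners page in the UI), and it uses a simple regex rather than a real YAML parser, so it can miss unusual layouts. It treats only `namespace/name` style values as self-hosted runner classes.

```python
import re
from typing import List, Set

# Matches lines such as "    resource_class: my-namespace/my-runner".
_RESOURCE_CLASS = re.compile(r"^[ \t]*resource_class:[ \t]*(\S+)", re.MULTILINE)


def stale_resource_classes(config_text: str, active: Set[str]) -> List[str]:
    """Return namespace/name resource classes in the config that are not in `active`."""
    found = {match.strip("'\"") for match in _RESOURCE_CLASS.findall(config_text)}
    # Values without a "/" (e.g. "medium") are not self-hosted runner classes.
    return sorted(rc for rc in found if "/" in rc and rc not in active)
```

Calling it with the text of `.circleci/config.yml` and the set of active classes returns the names that Jobs still reference but that no longer exist, which is the situation that caused the hang described above.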