Container-agent error on job execution

Environment #1
Environment: Azure AKS (v1.21.2)
Helm chart: circleci/container-agent
Helm chart version: 1.0.10775

Environment #2
Environment: Docker Desktop (v1.21.3)
Helm chart: circleci/container-agent
Helm chart version: 1.0.10775

When jobs run we get the following error (occurs in both environments):

20:47:45 347f5 55801.845ms service-work error=1 error occurred:
* could not start task containers: exec into build container "primary" failed: error executing command /bin/bash -c echo 'Starting circleci-agent'; mkdir -p /home/circleci/ccita : Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "9790": OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: "/bin/bash": stat /bin/bash: no such file or directory: unknown

mode=agent result=error service.name=container-agent service_name=container-agent

The container-agent-* pod seems healthy and shows up in the CCI UI under self-hosted runners with the expected version. When jobs execute, I can see the ephemeral ccita-* pod get scheduled, but this pod doesn’t seem to produce any log output. When I follow the logs of the container-agent-* pod, I get the error above (it also appears in the CCI job UI).

Thanks for the help.

It’s possible that the image does not have bash installed.

For now, container agent needs images to meet the requirements described here: Using Custom-Built Docker Images - CircleCI

We’re looking into whether we can (1) make the warning more obvious and (2) fall back to sh if bash is not present. The workaround for now is to add bash to the image, either in its Dockerfile directly or in a new Dockerfile that uses that image as its base.
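
For anyone hitting the same error, here is a minimal sketch of the second option (a new Dockerfile built on top of the existing image). It uses the hashicorp/terraform:1.1.9 image mentioned later in this thread and assumes that image is Alpine-based, so apk is the package manager. If your base image uses a different distribution, swap in its package manager. Treat this as a starting point, not an official fix.

```dockerfile
# Hypothetical wrapper image: extends the job image so that /bin/bash exists.
# Assumption: the base image is Alpine-based, so packages are installed with apk.
FROM hashicorp/terraform:1.1.9

# Install bash; on Alpine this provides /bin/bash, the path container agent
# tries to exec according to the error above.
RUN apk add --no-cache bash
```

You would then build this, push it to a registry your cluster can pull from, and reference the new tag in the job's config instead of the original image. The tag and registry are up to you.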

I’ll investigate. FWIW, it had been working on 1.0.7638 of the Helm chart. Thanks @sebastian-lerner

Interesting. And no change of image at all?

Yea, hashicorp/terraform:1.1.9. I can DM you job links showing a successful run on Container-agent version 1.0.7638 deployed via Helm v1.0.7638 vs a failed job on Container-agent version 1.0.10775 deployed via Helm v1.0.10775.

Job links would be super helpful, thank you. I’ll take it back to the team.

done. thanks for the help.