We’re excited to announce that Container Agent (final name TBD), a more scalable, container-friendly self-hosted runner, is now in Open Preview: Container runner - CircleCI
With container agent, self-hosted runner users will have:
The ability to easily define, publish, and use custom Docker images during job execution
The ability to easily manage dependencies and libraries through custom Docker images by using the Docker executor in config.yml
Seamless orchestration of ephemeral Kubernetes pods for every Docker job on self-hosted compute
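To make the Docker executor point concrete, here is a minimal sketch of a `.circleci/config.yml` job that would run on a container agent. The resource class namespace/name below is a placeholder; substitute the one you created during installation, and any custom image you publish can be used in place of the base image:

```yaml
version: 2.1

jobs:
  build:
    docker:
      - image: cimg/base:stable        # or your own custom Docker image
    # Placeholder: use the self-hosted resource class you registered
    resource_class: my-namespace/my-container-rc
    steps:
      - checkout
      - run: echo "Running inside an ephemeral Kubernetes pod"

workflows:
  main:
    jobs:
      - build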
If you need to run CI/CD jobs on your own infrastructure with Kubernetes, or are using the existing self-hosted runner installation on Kubernetes, visit the container agent docs today to get started.
Container Agent does not replace the existing self-hosted runner, but is instead a complement. The existing self-hosted runner is meant for customers needing to use the Machine executor. Container Agent is the equivalent of the Docker executor for self-hosted runners.
New additions in the past week, mainly improvements to scenarios when deviating from the happy path:
Previously, if a job using container agent failed, the workflow did not always fail gracefully as well. This has now been fixed
When the underlying node for a task pod is removed from the cluster (by kubectl delete node, an unexpected shutdown, or a variety of other reasons), the container-agent garbage collection loop is now able to detect that the node is no longer available and clean up the pod
Because container agent allows you to configure task pods with the full range of Kubernetes settings, pods can be configured in a way that cannot be scheduled due to their constraints. We’ve added a constraint checker which periodically validates each resource class configuration against the current state of the cluster to ensure pods can be scheduled. This prevents container agent from claiming jobs it cannot schedule, which would then fail
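As a hedged sketch of the kind of configuration the constraint checker guards against (the exact values.yaml layout may differ from what’s shown; the resource class name and token are placeholders, and the pod-level fields follow the standard Kubernetes pod spec): a resource class whose nodeSelector matches no node in the cluster could never be scheduled, and the checker would flag it rather than letting container agent claim jobs for it.

```yaml
agent:
  resourceClasses:
    my-namespace/my-container-rc:
      token: "<resource class token>"      # placeholder
      spec:
        # If no node in the cluster carries this label,
        # task pods for this resource class can never be scheduled.
        nodeSelector:
          disktype: ssd
        containers:
          - resources:
              limits:
                cpu: "2"
                memory: 4Gi
```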
I am having issues installing the helm chart into multiple namespaces with different resource class names.
The ClusterRole and ClusterRoleBinding resources conflict between the installations.
Will the console interface gain any features to allow more control over a runner? Currently it is possible to create a resource class and runner in the GUI, but there is no way to delete them.
There is also a lack of reporting in terms of runner usage, but that is a longer-term issue for when more people are using runners.
@rit1010 Yup, management of resource classes & resource class tokens via the UI is something on the near-term roadmap. We hope to have something out in the next ~3 months.
Showing Runner usage is also on the roadmap, but further down the line.
Is there a way to intercept the ephemeral task pod creation process?
In my case, I’d like to append a label to the ephemeral task pods so that they can claim a Managed Service Identity (MSI) during deployment to Azure.
@yuft Right now the only customization to task pods is through the resource class configuration process. I’m not as familiar with how one goes about appending that label; is that something that can be added to the pod spec?
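If the resource class configuration accepts pod metadata (I believe the helm chart’s values.yaml supports a per-resource-class metadata block, but treat that as an assumption, and the resource class name and token below are placeholders), appending the aad-pod-identity binding label might look something like:

```yaml
agent:
  resourceClasses:
    my-namespace/my-container-rc:
      token: "<resource class token>"      # placeholder
      metadata:
        labels:
          # aad-pod-identity matches pods on this label
          # to assign the named Azure identity
          aadpodidbinding: my-azure-identity
```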
The task is still cleaned up, but the container agent restarts after this. Has anyone else reported this issue? Apologies if this is the wrong spot to toss this.
@uplight-james Can you share the version of container agent you’re using? It should be visible in the “task lifecycle” step from the Job Details page for a job that was run. Or if you go to your inventory screen (“Self-hosted Runners” on the left-hand nav of your UI) it should have the version as well
@sebastian-lerner thanks for the reply! We are using circleci/container-agent:1.0.8569-ccd6594. Let me know what else I can provide.
We see this directly after a task finishes, with garbage collection on or off. The container spun up for the task DOES get removed from the cluster properly, but this error still occurs.
It results in the container-agent exiting with code 2 (according to kubectl describe pod) and restarting; the container-agent does come back up and start working after that.
Folks, a couple of updates to share in the recent helm chart upgrades:
container-agent can now be run on ARM nodes, for both the container-agent pod itself and the “task pods”. No need to specify this in values.yaml; there’s logic built in to pick up the right architecture and work accordingly
We now fall back to a generic shell if bash is not included in the image provided. @jpi I think this should fix the issue you were seeing in this thread.
If you upgrade to the latest helm chart these should be available.
Also coming very soon, some logging improvements to the errors that we output to be more actionable.
We just pushed a fix in the latest version of the helm chart for issues some users were seeing in this thread: a limitation was preventing some images that worked just fine on CCI-hosted compute from being used with container-agent. This limitation should no longer exist. Reach out to me if there are still issues you’re seeing.
This is in open preview, but is it considered stable enough for production usage? To clarify, we probably have fairly plain-vanilla use cases, nothing complex.
We have customers today using it in production. It is also used heavily as part of our internal development process in CI, so we are using it in production within CircleCI ourselves.
That being said, we’re still making many changes over the next couple of weeks before declaring it “generally available”, which may introduce bugs. So our official stance is “be careful when using in production and use at your own risk”.