A more complete example for using container agent self-hosted runner?

denis · September 30, 2022, 4:19pm

I’m having a little trouble getting a sample of using the new container agent self-hosted runner working, on GCP/GKE. I’m new to CircleCI in general so I think I’m just missing some understanding.

I’ve installed the container agent to our cluster and configured it with the token, and from the logs, it’s clearly talking to CircleCI just fine: it’s receiving jobs and attempting to run them.

I’m trying to run a job to deploy into our cluster. So I have this job:

  deploy:
    docker:
      - image: google/cloud-sdk
    resource_class: our-organization/denis-test-resource-class
    steps:
      - checkout
      - run:
          name: authenticate gcloud CLI and set project
          command: |
            echo $GCLOUD_SERVICE_KEY | gcloud auth activate-service-account --key-file=-
            gcloud --quiet config set project ${GOOGLE_PROJECT_ID}
            gcloud --quiet config set compute/zone ${GOOGLE_COMPUTE_ZONE}
      - gcp-gke/update-kubeconfig-with-credentials:
          cluster: em-alpha
      - gcp-gke/rollout-image:
          cluster: em-alpha
          container: $IMAGE_NAME
          deployment: $IMAGE_NAME
          image: $DOCKER_FULL_IMAGE_NAME
          tag: $BRANCH
          namespace: em-services-alpha-01

When I run the job, looks like the authenticate gcloud CLI and set project step runs fine:

#!/bin/bash -eo pipefail
echo $GCLOUD_SERVICE_KEY | gcloud auth activate-service-account --key-file=-
gcloud --quiet config set project ${GOOGLE_PROJECT_ID}
gcloud --quiet config set compute/zone ${GOOGLE_COMPUTE_ZONE}

Activated service account credentials for: [circleci-gcp-access@*******.iam.gserviceaccount.com]
Updated property [core/project].
WARNING: Property validation for compute/zone was skipped.
Updated property [compute/zone].
CircleCI received exit code 0

but I think gcp-gke/update-kubeconfig-with-credentials is failing, and I’m not sure I even need it. The step in the UI output is labelled “Install latest gcloud CLI version, if not available”:

#!/bin/bash -eo pipefail
install () {
  # Set sudo to work whether logged in as root user or non-root user
  if [[ $EUID == 0 ]]; then export SUDO=""; else export SUDO="sudo"; fi
  cd ~/
  curl -Ss --retry 5 https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-283.0.0-linux-x86_64.tar.gz | tar xz
  echo 'source ~/google-cloud-sdk/path.bash.inc' >> $BASH_ENV
}

if grep 'docker\|lxc' /proc/1/cgroup > /dev/null 2>&1; then
  if [[ $(command -v gcloud) == "" ]]; then
    install
  else
    echo "gcloud CLI is already installed."
  fi
else
  echo "----------------------------------------------------------------------------------------------------"
  echo "this is a machine executor job, replacing default installation of gcloud CLI"
  echo "----------------------------------------------------------------------------------------------------"
  sudo rm -rf /opt/google-cloud-sdk
  install
fi

----------------------------------------------------------------------------------------------------
this is a machine executor job, replacing default installation of gcloud CLI
----------------------------------------------------------------------------------------------------
/bin/bash: line 18: sudo: command not found

Exited with code exit status 127
CircleCI received exit code 127

Is there a more complete example of something like this somewhere? I’m piecing together bits and pieces from here and there…

(cc @sebastian-lerner )

sebastian-lerner · September 30, 2022, 5:32pm

denis:

      - gcp-gke/update-kubeconfig-with-credentials:
          cluster: em-alpha
      - gcp-gke/rollout-image:
          cluster: em-alpha
          container: $IMAGE_NAME
          deployment: $IMAGE_NAME
          image: $DOCKER_FULL_IMAGE_NAME
          tag: $BRANCH
          namespace: em-services-alpha-01

Hi @denis

In your .circleci/config.yml file, are the two commands after the run step meant to be specific “jobs” you want to run, or are they configuration for your cluster?

The way you redirect the job to the cluster itself is by using the resource class associated with your container-agent. So you shouldn’t need to specify cluster: or namespace:, etc.

Here’s a full example CircleCI config file that uses a container-agent: https://circleci.com/docs/runner-faqs#sample-configuration-container-agent

Let me know if you still have questions, happy to help

denis · September 30, 2022, 6:46pm

I left out some details I should have made more clear.

The gcp-gke/update-kubeconfig-with-credentials and gcp-gke/rollout-image steps are from the gcp-gke orb (circleci/gcp-gke on the CircleCI Developer Hub).

So all those references to cluster, etc., are just parameters to those steps. I want to deploy a Docker image from GCP’s Artifact Registry to our cluster (which is behind a firewall, which is why I’m using the container agent self-hosted runner for this).

I’m pretty sure the job is running in the container agent on our cluster. I have the resource_class defined at the job level:

    docker:
      - image: google/cloud-sdk
    resource_class: our-organization/denis-test-resource-class

Should I not be using the google/cloud-sdk docker image?

It’s weird that at one step (where I’m explicitly specifying commands) the gcloud CLI command is working fine… but in the steps from the orb, it seems to think gcloud is not installed and is trying to install it.

Thanks!

sebastian-lerner · September 30, 2022, 7:24pm

Interesting, thanks for clarifying. Let me ask the engineering team I work with since I am stumped at this point. I’ll let you know when I have more, thanks for your patience.

sebastian-lerner · October 3, 2022, 4:05pm

@denis we think we found the issue and it seems to be a problem with some assumptions in the orb logic that aren’t compatible with a k8s environment.

We’re working internally to see if we can get the orb updated in a way to get around this issue. I’ll reach out when I have more. Thanks for the patience.
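For anyone following along, here is a rough sketch of why the check in the orb step output above can go wrong on a Kubernetes pod. It wraps the same `grep 'docker\|lxc'` pattern in a function that takes a file path, so it can be tried against sample contents instead of the real /proc/1/cgroup. The function name and both sample cgroup lines are illustrative assumptions, not captured from a real cluster; what a given GKE node actually reports may differ.

```shell
#!/bin/bash
# Sketch of the orb's executor check, parameterised on the cgroup file.
# The sample lines below are made-up illustrations, not real cluster output.

looks_like_docker_executor() {
  # Same pattern the orb step runs against /proc/1/cgroup.
  grep 'docker\|lxc' "$1" > /dev/null 2>&1
}

sample_dir="$(mktemp -d)"
printf '12:devices:/docker/0123abcd\n' > "$sample_dir/docker_cgroup"
printf '12:devices:/kubepods/besteffort/pod0123/abcd\n' > "$sample_dir/k8s_cgroup"

for f in docker_cgroup k8s_cgroup; do
  if looks_like_docker_executor "$sample_dir/$f"; then
    echo "$f: treated as docker executor"
  else
    echo "$f: treated as machine executor (sudo branch)"
  fi
done

rm -rf "$sample_dir"
```

With these samples, the first file takes the docker branch and the second falls through to the machine-executor branch. In the orb script that branch calls `sudo rm -rf /opt/google-cloud-sdk` unconditionally, which matches the `sudo: command not found` error in the job output.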

sebastian-lerner · October 17, 2022, 8:11pm

@denis an update here: an engineer at CircleCI took a look at this issue this week.

We think a workaround is simply to install sudo, which should turn the build green. We’ll be updating the orb to check whether sudo is installed and, if not, fail the job with a proper message, since the current error is cryptic.
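A minimal sketch of that workaround, under stated assumptions: it assumes the job image is Debian-based with `apt-get` available (which I believe is the case for the default google/cloud-sdk tag) and that the step runs as root. The helper names `has_cmd` and `ensure_sudo` are hypothetical, not part of the orb.

```shell
#!/bin/bash
# Sketch: make sure `sudo` exists before orb steps that call it.
# Assumes a Debian-based image (apt-get available) and a root user.

has_cmd() {
  # Succeeds if the named command is on PATH.
  command -v "$1" > /dev/null 2>&1
}

ensure_sudo() {
  if has_cmd sudo; then
    echo "sudo is already installed."
  elif has_cmd apt-get; then
    apt-get update && apt-get install -y sudo
  else
    echo "sudo is missing and apt-get is not available; install sudo another way." >&2
    return 1
  fi
}
```

The idea would be to call `ensure_sudo` from a `run` step placed before the gcp-gke orb steps, so that the orb's `sudo rm -rf` line has something to invoke.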
