SSH access breaks in Docker build job, but works in debugging session

Hi, I’m using CircleCI to build a Dockerfile which pulls from a private Git repo.

I’ve set up a custom SSH key in my CircleCI project according to this CircleCI tutorial, and have verified that this key functions from my local machine — I can use it to clone a private Git repository from the host.

My CircleCI config for the relevant job first adds the SSH key in question (the only one in my account), and then runs a Docker build:

jobs:
  rnng:
    machine:
      image: circleci/classic:latest
      docker_layer_caching: true
    steps:
    - checkout
    - run:
        name: Docker login
        command: |
          echo "$DOCKER_HUB_PWD" | docker login -u "$DOCKER_HUB_USER_ID" --password-stdin
    - add_ssh_keys:
        fingerprints:
        - "88:36:05:01:d6:98:05:43:e3:e4:e3:d3:a3:67:e9:29"
    - run:
        name: Build
        working_directory: models/RNNG
        command: docker build -t cpllab/language-models:rnng .

The Dockerfile has a step that clones a repository from my private server. This step fails with an authentication error; see below (IP address anonymized).

Step 7/18 : RUN git clone cpl@1.2.3.4:rnng-incremental.git /opt/rnng-incremental
 ---> Running in e1c5d570eb0b
Cloning into '/opt/rnng-incremental'...
Warning: Permanently added '1.2.3.4' (ECDSA) to the list of known hosts.
Permission denied, please try again.
Permission denied, please try again.
Permission denied (publickey,password).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
The command '/bin/sh -c git clone cpl@1.2.3.4:rnng-incremental.git /opt/rnng-incremental' returned a non-zero code: 128
Exited with code 128

The weird thing is, I can SSH into this failed job on CircleCI directly afterwards and clone the repository with no issues.

$ ssh -p 54782 104.196.157.107
The authenticity of host '[104.196.157.107]:54782 ([104.196.157.107]:54782)' can't be established.
RSA key fingerprint is SHA256:TRHsnF3L7U62SNS3ncAmbVyundtGUdnGWIN6sdFEfTk.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '[104.196.157.107]:54782' (RSA) to the list of known hosts.

circleci@default-4490a638-8f8f-4f6a-9621-09676decc79c:~$ git clone cpl@1.2.3.4:rnng-incremental.git
Cloning into 'rnng-incremental'...
Warning: Permanently added '1.2.3.4' (ECDSA) to the list of known hosts.
remote: Counting objects: 588, done.
remote: Compressing objects: 100% (493/493), done.
remote: Total 588 (delta 67), reused 588 (delta 67)
Receiving objects: 100% (588/588), 3.52 MiB | 20.61 MiB/s, done.
Resolving deltas: 100% (67/67), done.

I’m not sure why these results would differ. The successful clone ran outside of my Docker build context, so perhaps something in the container is making this break (e.g. a different Git version)? Otherwise I can’t imagine why the automated build should fail while a manual clone directly afterwards succeeds.

Any ideas from the community would be appreciated … thanks!

I may have misunderstood the function of add_ssh_keys — I suppose I need to then also make the SSH keys available within the Docker build context? How should I do this cleanly?

I’m not sure. This answer suggests I should be able to simply use the CircleCI add_ssh_keys directive, I think?

Got it in one (or two) :grin:

Your build machine has the keys, but each step of your docker build runs inside its own container, which does not have them. The build will only get the keys if you give them to it.

The way I tend to do it is to convert the key to Base64, pop it in an environment variable, supply it as a build arg, and then declare it as an ARG in the Dockerfile. Inside the build you can then Base64-decode that variable back into a key file.
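
The Base64 approach described above can be sketched roughly as follows. This is a minimal sketch, not a definitive recipe: the helper names encode_key and decode_key and the variable name SSH_KEY_B64 are made up for illustration, the build is assumed to run as root with the key stored as /root/.ssh/id_rsa, and base64 -d assumes a GNU or BusyBox style base64.

```shell
#!/bin/sh
# Sketch only. encode_key, decode_key and SSH_KEY_B64 are hypothetical names,
# not part of CircleCI or Docker.

# Print a private key file as one line of Base64, ready to paste into a
# project environment variable. tr removes the line wrapping that some
# base64 implementations add.
encode_key() {
    base64 < "$1" | tr -d '\n'
}

# Write a Base64 string ($1) back out as a key file ($2), readable only by
# its owner; ssh ignores private keys with looser permissions.
decode_key() {
    mkdir -p "$(dirname "$2")"
    printf '%s' "$1" | base64 -d > "$2"
    chmod 600 "$2"
}

# On the CI host, pass the variable through as a build arg:
#   docker build --build-arg SSH_KEY_B64="$SSH_KEY_B64" \
#     -t cpllab/language-models:rnng .
#
# In the Dockerfile, declare the arg and decode it in the same RUN step
# that needs it (assumes the build runs as root):
#   ARG SSH_KEY_B64
#   RUN mkdir -p /root/.ssh \
#    && printf '%s' "$SSH_KEY_B64" | base64 -d > /root/.ssh/id_rsa \
#    && chmod 600 /root/.ssh/id_rsa \
#    && git clone cpl@1.2.3.4:rnng-incremental.git /opt/rnng-incremental \
#    && rm /root/.ssh/id_rsa
```

Run encode_key once on your own machine and store its output as a project environment variable. Deleting the key in the same RUN step keeps the file out of the resulting layer, but build args are still recorded in the image history, so a dedicated key that can be revoked is probably safer than a personal one.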