How to debug docker push failures?

Sometimes a docker push command fails with an error while producing very little log output. This article introduces several ways to debug such failures.

Example Situation

A Docker image upload to Google Container Registry (GCR) fails intermittently with the following error, which you have never seen on your local machine. The job uses Remote Docker with the Docker executor.

denied: Unable to write blob sha256:29416d5f02a649cf688b551c289c535c2177de6263d43c5279db2cba514315bc

Step1 - Check DLC (Docker Layer Caching) usage

If the failed builds all use the same DLC volume, that volume may be related to the issue: a DLC volume that stores a broken cache can cause problems like this. When you use Remote Docker, you can check which volume a build used in the Setup a remote Docker engine step, which shows Using volume: <volume number>.

If you use the Machine executor, that message appears in the Spin up environment step instead.
In this case, you can temporarily disable DLC and see whether the failure still occurs.
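
As a minimal sketch, assuming the job uses Remote Docker and currently enables DLC through the docker_layer_caching key of setup_remote_docker, the temporary change could look like this:

     - setup_remote_docker:
         # Temporarily disabled to rule out a broken DLC volume; set back to true afterwards.
         docker_layer_caching: false

If the push succeeds with DLC disabled, the cached volume is a likely suspect.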

Step2 - Check detailed docker logs

You can see more detailed logs by enabling debug mode. These logs help you determine whether the Docker daemon is crashing internally or the registry side is rejecting your request.

If you use Remote Docker, you can add the following step after the setup_remote_docker step.

     - run:
          name: Enable debug mode
          command: |
              # Install jq, set "debug": true in /etc/docker/daemon.json, then restart the Docker daemon.
              ssh remote-docker -- sudo bash -c "'apt-get update; apt-get install -y jq; if [ ! -f /etc/docker/daemon.json ]; then echo \"{}\" > /etc/docker/daemon.json; fi; cat \<<< \$(jq \".\\\"debug\\\" = true\" /etc/docker/daemon.json) > /etc/docker/daemon.json; systemctl restart docker.service'"

Then, add the following step after the docker push step to show the detailed logs. Because it uses when: always, it runs even if the push fails.

     - run:
         name: Show docker logs 
         command: ssh remote-docker -- sudo journalctl -ae -u docker.service
         when: always

In the following example logs, the Docker daemon attempted the push but GCR rejected the upload, so in this case the problem lies in the GCR configuration rather than in the Docker daemon.

Apr 14 06:46:35 default-ccaf4e66-848d-4dad-8737-0b1e72ab2bf3 dockerd[7711]: time="2021-04-14T06:46:35.867086336Z" level=error msg="Upload failed: denied: Unable to write blob sha256:29416d5f02a649cf688b551c289c535c2177de6263d43c5279db2cba514315bc"
Apr 14 06:46:35 default-ccaf4e66-848d-4dad-8737-0b1e72ab2bf3 dockerd[7711]: time="2021-04-14T06:46:35.986918010Z" level=info msg="Attempting next endpoint for push after error: denied: Unable to write blob sha256:29416d5f02a649cf688b551c289c535c2177de6263d43c5279db2cba514315bc"
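
When the journal is long, it may help to print only the error-level entries. The following step is a sketch of one way to do that, assuming the same remote-docker SSH host as in the steps above; the step name is arbitrary.

     - run:
         name: Show docker error logs
         command: |
           # "|| true" keeps the step green when grep finds no error-level lines.
           ssh remote-docker -- "sudo journalctl -u docker.service --no-pager | grep 'level=error' || true"
         when: always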

For more details on Docker daemon configuration, see the following Docker documentation.

Configure and troubleshoot the Docker daemon | Docker Documentation

Step3 - Check IAM permission

GCR and some other Docker registry services use IAM settings to control which actions are allowed. Pushing a Docker image to GCR requires specific permissions. You can see example settings in the following GCR doc.

Configuring access control  |  Container Registry documentation
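
To see which members currently hold roles on the storage bucket behind GCR, a step like the following may help. This is only a sketch under several assumptions: gsutil is installed and authenticated in the job, the image is pushed to the gcr.io host (whose bucket is named artifacts.<project ID>.appspot.com; regional hosts use a different bucket name), and GCP_PROJECT_ID is a hypothetical environment variable holding your project ID.

     - run:
         name: Show IAM policy of the GCR storage bucket
         command: |
           # GCP_PROJECT_ID is a hypothetical variable; replace it with your own project ID.
           gsutil iam get "gs://artifacts.${GCP_PROJECT_ID}.appspot.com"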

Step4 - Check Registry configuration

Registries also have security features, and these sometimes cause issues. In this example, the Google Cloud Storage (GCS) Retention Policy, which prevents objects in a bucket from being deleted or modified for a specified minimum period after they are uploaded, caused the failure when uploading a large image.

When pushing a relatively large blob, for example around 400MB, the docker push command uploads the data in smaller pieces. However, the Retention Policy does not allow modifications to an existing object in GCS, so it declines the second request and the docker push fails. In this case, you will need to disable the Retention Policy.

Retention policies and retention policy locks  |  Cloud Storage
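
To check whether a Retention Policy is set on the bucket, a step along these lines could be used. It is a sketch with the same assumptions as any gsutil call from a job: gsutil is installed and authenticated, the image is pushed to the gcr.io host (bucket artifacts.<project ID>.appspot.com), and GCP_PROJECT_ID is a hypothetical environment variable.

     - run:
         name: Show retention policy of the GCR storage bucket
         command: |
           # Prints the bucket's retention policy, if one is set.
           # GCP_PROJECT_ID is a hypothetical variable; replace it with your own project ID.
           gsutil retention get "gs://artifacts.${GCP_PROJECT_ID}.appspot.com"

If a policy is reported and you decide to remove it, gsutil retention clear with the same bucket URL removes an unlocked policy; a locked policy cannot be removed.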

Step5 - Submit a support ticket

If the steps above don’t solve your problem, please feel free to submit a support ticket, and we can debug your issue together.

Submit a request – CircleCI Support Center