Docker Hub rate limiting - how to prepare?

_joe · August 25, 2020, 2:22am

Starting November 1st, Docker Hub will change its ToS and rate limit anonymous users to 100 image pulls per six hours.

The CircleCI convenience images are all hosted on Docker Hub, and looking at our own runs we see that the chance of having your images cached locally is close to 0%. Given that many builds run concurrently on your machines, we must assume that this limit will be exhausted quickly and builds will start failing as a result.
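A rough way to sanity-check this concern is a back-of-the-envelope calculation. The sketch below assumes the anonymous limit of 100 pulls per 6 hours quoted later in this thread; the job rate and pulls per job are made-up illustrative inputs, not CircleCI measurements.

```python
from typing import Optional


def minutes_until_exhausted(limit: int, window_hours: float,
                            jobs_per_hour: float,
                            pulls_per_job: float) -> Optional[float]:
    """Minutes until a shared IP uses up its anonymous pull budget.

    Returns None when the jobs never exceed the budget within one window.
    All inputs are hypothetical; plug in your own numbers.
    """
    pulls_per_hour = jobs_per_hour * pulls_per_job
    if pulls_per_hour * window_hours <= limit:
        return None
    return limit * 60 / pulls_per_hour


# Hypothetical shared host: 60 jobs per hour, 2 uncached image pulls each.
print(minutes_until_exhausted(100, 6, jobs_per_hour=60, pulls_per_job=2))
```

With these assumed inputs the budget lasts under an hour, after which every further anonymous pull from that IP would be rejected until the window moves on.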

What is the official recommendation to your users?

Do you have an invisible caching proxy for image pulls?
Do you have any plans to provide the convenience images from a CircleCI registry?
Should we start mirroring the images we use to our own registry?
Should we get a Docker Hub subscription?

thekatertot · August 25, 2020, 9:36pm

Thank you for posting @_joe! We are working on answers to all of these questions, and will update as soon as we can. We know how important this is. I will update you here when I know more.

ruffsl · August 25, 2020, 11:03pm

For our OSS project, while the number of pulls from CI jobs can exceed the current limits, the number of triggered CI workflows is well below this. It would be nice if dependent jobs in a workflow could be queued to the same CI worker, so that the image needn’t be re-pulled if it is within a user-defined time window for that executor. This would have the added benefit of speeding up consecutive jobs that use the same image, by skipping the minutes of spin-up time spent downloading the docker image/file caches for the next dependent job container. It would also help ensure that all jobs in the same workflow use the same sha of the docker image tag, resolving related issues where an image may update between jobs in the same workflow.

Alternatively, CircleCI could erect a transparent registry between CI workers and DockerHub and interpret the same user-defined time window parameter (e.g. 06h00m00sec) in that executor’s config to determine when the cached image tag should be considered stale, re-pulling the manifest before discarding old/unused layers. This would differ from the current Docker Layer Caching feature in that it would cache the pull of the docker image used to run the job, not the layers created when building in the job.
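To make the time-window idea concrete, here is a minimal Python sketch of the staleness check such a cache might perform. The `06h00m00sec` format is taken from the example above; the function names and the idea of tracking a last-pulled timestamp per image tag are assumptions for illustration, not an existing CircleCI feature.

```python
import re
from datetime import datetime, timedelta

# Matches the hypothetical window format from the post, e.g. "06h00m00sec".
_WINDOW = re.compile(r"^(\d+)h(\d+)m(\d+)sec$")


def parse_window(text: str) -> timedelta:
    """Turn a string like '06h00m00sec' into a timedelta."""
    match = _WINDOW.match(text)
    if match is None:
        raise ValueError(f"unrecognised time window: {text!r}")
    hours, minutes, seconds = (int(part) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def is_stale(last_pulled: datetime, now: datetime, window: str) -> bool:
    """True when the cached tag is older than the user-defined window,
    meaning the manifest should be re-pulled from the upstream registry."""
    return now - last_pulled >= parse_window(window)
```

A cache would call `is_stale` before serving a tag and only contact Docker Hub when it returns `True`, so repeated jobs inside the window cost no upstream pulls.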

https://circleci.com/docs/2.0/docker-layer-caching/

As a concrete example, we use a nightly job to update the CI image on DockerHub, then pull that image repeatedly for the graph of jobs in every workflow, sparing resources and maximizing caching.

https://app.circleci.com/pipelines/github/ros-planning/navigation2?branch=main

_joe · August 26, 2020, 12:05am

The problem is that you have no knowledge of what other users are doing on the shared hosts. Just for the fun of it, I could create a job that pulls a hundred different images in a few seconds, breaking builds for everybody who gets a build scheduled on the same host.
We don’t pull that much in a single run either, but the machines are shared and the limit is per IP.

But before I start changing the config of hundreds of projects I’d prefer to know what the preferred way of going forward is. IMHO the cleanest solution would be for CircleCI to have their own registry. They could even intercept pulls of their own images transparently so that no change on the user side is necessary.
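For anyone weighing the mirroring route in the meantime, the rewrite of image references can be sketched in a few lines of Python. The mirror host `registry.example.com/hub` is a placeholder, `circleci/node:12` is used only as an example image name, and the host detection follows Docker's usual heuristic (a first path component containing `.` or `:`, or equal to `localhost`, is treated as a registry). Tags and digests are passed through untouched.

```python
def to_mirror(image: str, mirror: str) -> str:
    """Rewrite a Docker Hub image reference to point at a mirror registry.

    References that already name a non-Docker-Hub registry are returned
    unchanged. `mirror` is a hypothetical registry prefix.
    """
    first, _, rest = image.partition("/")
    # A first component with "." or ":" (or "localhost") names a registry host.
    if rest and ("." in first or ":" in first or first == "localhost"):
        if first != "docker.io":
            return image  # hosted elsewhere, nothing to rewrite
        image = rest
    # Official images such as "node" live under the implicit "library/" namespace.
    if "/" not in image:
        image = "library/" + image
    return f"{mirror}/{image}"


print(to_mirror("circleci/node:12", "registry.example.com/hub"))
```

Running a function like this over every `image:` entry is the per-project change being discussed; a transparent intercept on CircleCI's side would apply the same mapping without touching user configs.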

ruffsl · August 26, 2020, 12:59am

I was thinking of executors using docker images, as I don’t think CircleCI supports docker-in-docker.

As soon as the first job is finished, the next dependent job running the same executor could take its place on the same host. This would require more intelligent scheduling, and would still be conditional upon workflow graph and sequential job ordering, but biasing jobs towards workers with a relevantly warm docker engine could save local bandwidth and cpu time for the CI cluster.
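As a rough illustration of what "biasing towards a warm docker engine" could mean, here is a small Python sketch of a worker picker. The `Worker` type, its fields, and the tie-breaking rule are all invented for this example; nothing here reflects how CircleCI's scheduler actually works.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical view of a CI host as seen by the scheduler."""
    name: str
    cached_images: set[str] = field(default_factory=set)
    running_jobs: int = 0


def pick_worker(workers: list[Worker], image: str) -> Worker:
    """Prefer workers that already hold the image; break ties by load."""
    return min(
        workers,
        key=lambda w: (image not in w.cached_images, w.running_jobs),
    )
```

Because `False` sorts before `True`, any worker with the image cached wins over an idle but cold one. A real scheduler would likely cap how much extra load it accepts in exchange for a warm cache.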

At first I was going to clarify that the limit was per docker hub repo:

But on second read, an independent 100 pulls/6hr limit is applied to each unauthenticated IPv4/v6 address?

For anonymous (unauthenticated) users, pull rates are limited based on the individual IP address.

Ooftah! That makes docker image caching even more necessary, even just for free tier CI users.
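To see how much of the per-IP budget a given host has left, Docker's rate limit documentation describes (as far as I know) `ratelimit-limit` and `ratelimit-remaining` response headers of the form `100;w=21600`, returned for a manifest request against the `ratelimitpreview/test` repository. The Python sketch below parses that format; the URLs and header names are taken from that documentation as I remember it and should be verified before relying on them.

```python
from __future__ import annotations

import json
import urllib.request

# Endpoints as described in Docker's rate limit docs (assumed, please verify).
TOKEN_URL = ("https://auth.docker.io/token?service=registry.docker.io"
             "&scope=repository:ratelimitpreview/test:pull")
MANIFEST_URL = ("https://registry-1.docker.io/v2/"
                "ratelimitpreview/test/manifests/latest")


def parse_ratelimit(value: str) -> tuple[int, int]:
    """Parse a header value like '100;w=21600' into (pulls, window_seconds)."""
    count, _, window = value.partition(";w=")
    return int(count), int(window)


def fetch_anonymous_limits() -> dict[str, tuple[int, int]]:
    """Ask Docker Hub for the anonymous limits that apply to this IP.

    Performs real network requests; fails if the headers are absent.
    """
    with urllib.request.urlopen(TOKEN_URL) as resp:
        token = json.load(resp)["token"]
    request = urllib.request.Request(
        MANIFEST_URL, method="HEAD",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as resp:
        return {name: parse_ratelimit(resp.headers[name])
                for name in ("ratelimit-limit", "ratelimit-remaining")}
```

A window of 21600 seconds is the 6 hours mentioned above. On a shared CI host the remaining count reflects everybody's pulls from that IP, not just your own.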

adrianmui · August 26, 2020, 11:50pm

This is really important and if there’s no alternative it’s hard to imagine using Circle as our preferred CI tool. We’re not able to automate what we can already produce with our existing Jenkins infrastructure.

Please advise.

alexey · August 27, 2020, 5:22pm

Hi folks, I’m one of the product managers here at CircleCI. We’re taking the Docker Hub Terms of Service change seriously, and our engineering team is working on a detailed plan for how we’ll handle this change.

Our goal is to minimise any disruption for our customers.

We will inform all customers, if they are affected by the change, about any action they need to take with as much advance notice as possible.

As I mentioned, we are working on a plan and will share more details as soon as we have them. Expect an update from us next week.

Kate and others from the team are, of course, available to hear your questions and concerns in the meantime.

mqchau · August 28, 2020, 4:57am

I have an idea: the CircleCI team could use a dedicated paid DockerHub user to pull images for jobs that do not explicitly provide a username and password.

This DockerHub user would stand on its own and have no access to any private images. The cost is small: $60/year. There is no security risk here because it can only access public images anyway.

This method would only apply to the docker executor, not to docker commands like docker build. For docker build, I think developers need to log in with some credentials to push anyway, so the limit then applies to that user.

_joe · August 28, 2020, 2:30pm

This could (and should) be considered account sharing.

CircleCI should do the right thing here. They have basically shifted cost to Docker Hub for years, and this is clearly one of the reasons why the terms are changing for everybody.

_joe · September 10, 2020, 12:38pm

Two weeks have passed, any update?

thekatertot · September 10, 2020, 3:43pm

We’re still actively working on this @_joe!

_joe · September 25, 2020, 3:39pm

One month has passed, we still haven’t heard from you.
Time is running out!

Edit: We are now also contacting our account executive, time is running out.

thekatertot · September 25, 2020, 8:51pm

@_joe Thanks for checking in. As you can imagine, there’s a bit of red tape for us to get through as we navigate this change. We’re absolutely working on it, and will have statements out as soon as we can. The team is hard at work on this!

thekatertot · September 29, 2020, 8:11pm

Hi! We have our official update on Docker Hub rate limiting here: “Authenticate with Docker to avoid impact of Nov. 1st rate limits”
