DLC teardown - control over deleting of images when over 15GB

Context
I have a job that produces two docker images: one is ~1GB and is used for most of our ELT logic; the second is ~14.5GB because it also includes machine learning dependencies. Although the final few layers are created in the same way in each case, they start from very different bases (the ML one uses a heavyweight nvidia base image plus some custom additional heavy stuff).

As explained in the docs (support.circleci.com/hc/en-us/articles/4860926181787), it is possible to check what the docker layer caching teardown does, and indeed I see that it deletes one or more images every time I run the job. This is a pain because when the caching is working the huge image only takes a few seconds to build and push (because it’s only the last layer that needs updating). The other image isn’t quite as bad because the worst case is only 2-3 mins anyway, but that can still be annoying if you are deploying multiple times per day.

Is there any way to increase the 15GB limit, or in some way control what the teardown does? Perhaps I need to run some docker cleanup commands explicitly so that I’m not leaving the decision making up to the teardown logic (I’m not clear how it picks which images to delete).
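To make the "explicit cleanup" idea concrete, here is a minimal Python sketch of the kind of policy I have in mind. It is not how the DLC teardown actually works (I don't know that logic); the function name `plan_cleanup`, the image names, and the oldest-first ordering are all hypothetical. It also just sums image sizes, which overcounts any layers shared between images.

```python
GB = 1024 ** 3  # bytes per gigabyte (binary)

def plan_cleanup(images, keep, budget_bytes=15 * GB):
    """Pick images to delete so the remaining total fits in the budget.

    images: list of (name, size_bytes) tuples, ordered oldest first
            (hypothetical input, e.g. parsed from `docker image ls`).
    keep:   set of image names that must never be deleted.
    Returns the names to delete, in deletion order.
    Assumption: sizes are simply summed, ignoring shared layers.
    """
    total = sum(size for _, size in images)
    to_delete = []
    for name, size in images:
        if total <= budget_bytes:
            break  # already under the limit, nothing more to remove
        if name in keep:
            continue  # protected image, skip it
        to_delete.append(name)
        total -= size
    return to_delete

# Hypothetical example: protect the big ML image, let everything else go first.
example = [
    ("old-scratch:1", 2 * GB),
    ("elt:latest", 1 * GB),
    ("ml:latest", int(14.5 * GB)),
]
print(plan_cleanup(example, keep={"ml:latest"}))
```

The returned names could then be passed to `docker image rm` in a step at the end of the job, so that the cache is already under the limit before the teardown runs.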

thanks

Support here is provided by a mix of volunteers and staff, but if you are using 14.5GB docker images that use nvidia based workloads you are likely to have good access to direct CircleCI support packages with SLAs. As such you are likely to get a quicker answer going via the support team, rather than the forums.

The logical way to resolve this would be to split your large docker image into two parts: the OS/application environment and the data set. The data set can then be stored via an independent solution such as CircleCI’s cache.

@dan-man

Any chance you can share a link to the job in question to sebastian @ circleci.com?

DLC teardown logic is a bit complex, so it’s hard to give a blanket recommendation. I can take a look at the job and see if I can provide a specific one.

@sebastian-lerner - thanks for the help.

here’s a link
https://app.circleci.com/pipelines/github/AtomHan/bohr/18450/workflows/c3b86489-07c0-4e2b-b1ed-f60a6383033e/jobs/145336

For now I have worked around the problem with conditional logic, using circleci/path-filtering@0.1.6 to restrict the occasions on which we bother to build the big image. That is a bit hacky though, so it would be nice to fix the cache if possible.

@dan-man sorry for the delay here. We are still checking some things out internally and we have a lead, so we hope to respond here soon with more information.