Cache paths are absolute, breaking cache restores between executors

Cache paths are treated as absolute (anchored at the saving job’s working directory), not relative as described in the docs. This means a cache saved in the “machine” executor cannot be restored in a “docker” executor.
This was reported here before by another user: Unable to restore cache saved in Docker to machine type job

config.yaml snippet:

  - save_cache:
      key: xyz-{{ .Branch }}-{{ .Revision }}
      paths:
        - .git

“Machine” executor saving cache:

Creating cache archive...
Uploading cache archive...
Stored Cache to xyz
  * /home/circleci/project/.git

Docker executor restoring cache:

Found a cache from build 615568 at [...]
Size: 5.3 GB
Cached paths:
  * /home/circleci/project/.git

Downloading cache archive...
Unarchiving cache...
tar: home/circleci: Cannot mkdir: Permission denied
tar: home/circleci/project/.git: Cannot mkdir: No such file or directory
tar: home/circleci: Cannot mkdir: Permission denied
[...]

The “Cached paths” output is clearly wrong: the absolute path made it into the cache even though the config specifies a relative path.

I have run into the same issue, only in my case between Docker and macOS. For now I have been resorting to duplicating the work in the first macOS job of my workflow, but that increases the time required to execute the workflow, possibly by quite a lot depending on how much work is being redone.

Part of the issue here is that when you move things between executors, you run into file permission problems because the users have different UIDs, even if the same usernames exist on both executors.

For this reason we recommend against it. I’ll make sure we add a note to our docs that this isn’t recommended.

Using different executors is actually the reason why I’m using caching:

  • There’s a command (C1) which I can easily run with executor E1
  • There’s another command (C2) which I can easily run with executor E2, but not E1
  • Command C2 needs what command C1 generates (e.g. C1 might generate an AWS credentials file)

Given that I want to avoid manually installing the different CLIs (e.g. kubectl, aws, terraform, etc.), I’d rather use the cache to pass data between executors.
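
To make the pattern concrete, here is a rough sketch of what I mean; the image names, cache key and scripts are placeholders, not my actual config:

jobs:
  run-c1:                                  # job using executor E1
    docker:
      - image: example/cli-one             # placeholder image that ships C1's tooling
    steps:
      - checkout
      - run: ./c1-generate-credentials.sh  # C1: produces e.g. .aws/credentials
      - save_cache:
          key: c1-output-{{ .Revision }}
          paths:
            - .aws                         # relative path in the config...
  run-c2:                                  # job using executor E2
    docker:
      - image: example/cli-two             # placeholder image that ships C2's tooling
    steps:
      - checkout
      - restore_cache:
          key: c1-output-{{ .Revision }}
      # ...but the archive records the absolute path from run-c1, so this only
      # lines up if both executors use the same working directory and user
      - run: ./c2-consume-credentials.sh   # C2: needs what C1 generated
workflows:
  pass-data:
    jobs:
      - run-c1
      - run-c2:
          requires:
            - run-c1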

Is there a better way to accomplish this?

Sorry for necroposting, but the topic is still valid and needs a bump once a year :slight_smile:

I ran into the same issue. I build my Go binary with the cimg/go image and cache the resulting binary with paths: my_binary for the next job, which runs on the google/cloud-sdk image. The cache is restored there so that Docker can copy the binary into an image, build it, and push it to the GCP registry.
But because of the absolute path, the cached binary is saved as /home/circleci/project/my_binary, while the working dir on the gcloud image is /root/project, so the cache is restored to /home/....
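
Simplified, the two jobs look roughly like this (the image tag, cache key and registry path are placeholders, and the GCP auth steps are left out):

jobs:
  build:
    docker:
      - image: cimg/go:1.21            # default working dir is /home/circleci/project
    steps:
      - checkout
      - run: go build -o my_binary .
      - save_cache:
          key: my-binary-{{ .Revision }}
          paths:
            - my_binary                # ends up stored as /home/circleci/project/my_binary
  docker-build-and-publish:
    docker:
      - image: google/cloud-sdk        # default working dir is /root/project
    steps:
      - checkout
      - setup_remote_docker            # needed to run docker build inside a docker job
      - restore_cache:
          key: my-binary-{{ .Revision }}
      # the binary comes back under /home/circleci/project/, not /root/project/,
      # so the docker build context never sees it without an extra copy step
      - run: docker build -t gcr.io/my-project/my-image .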
I could copy the files manually, but that would mean hardcoding paths, and my docker-build-and-publish job is reusable and parameterized, taking the cache restore key as a parameter. There is no point in adding an extra parameter for every file I want to cache between executors.
We really need relative paths in the cache.

Same here. I tried to use workspaces, but at 300-500 GB I almost hit the storage limit after 5 builds. I could live with that if I could mark a workspace to live for e.g. 30 minutes, at most 1 hour.
The benefit of a cache over a workspace is that I can reuse the same key, and since most of the data stays the same between builds I can save a lot of storage. But when sharing my data (e.g. a rather large .git directory) to first build the JS on Docker and then continue on macOS to build the Capacitor app, I cannot use the cache because of the path/ownership issues. One workaround I see is to create a writable /home on macOS (there is no such path on macOS at all), restore the cache there and move the content into place, but that is so ugly, and I would also need to move everything back later to save the cache again…
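For illustration, the workaround I mean would look something like this in the macOS job (the keys and paths are placeholders, and making /home/circleci writable on macOS is the ugly prerequisite that is not shown here):

steps:
  - restore_cache:
      keys:
        - shared-data-v1-            # placeholder key prefix, picks up the newest match
  - run:
      name: Move cached data into the macOS working directory
      command: mv /home/circleci/project/.git .   # assumes no .git in the working dir yet
  # ... build the Capacitor app ...
  - run:
      name: Move data back so save_cache archives the same absolute path
      command: mv .git /home/circleci/project/.git
  - save_cache:
      key: shared-data-v1-{{ .Revision }}
      paths:
        - /home/circleci/project/.git
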
Ideally, I’d love to have a reusable workspace.