Use the same cache key for jobs running in parallel?

Hi,

I have a project in which I want to run 3 jobs in parallel, but I want to install npm dependencies only once and cache the resulting node_modules:

  • test-unit - run unit tests
  • test-integration - run integration tests
  • build-sam - build the project with AWS SAM (Serverless Application Model)

Installing the npm dependencies and caching them is encapsulated in the install-deps-with-cache command, which I use in all 3 jobs above.

My intuition is that if the cache key already exists (i.e. neither package.json nor package-lock.json changed since a previous commit), then all 3 parallel jobs will be able to use it.

However, what happens if the cache key doesn’t exist (because package.json or package-lock.json changed)?
Will each of the 3 parallel jobs install the npm dependencies, which seems a bit wasteful?
Or does CircleCI somehow manage to reuse the cache even though the jobs run in parallel?

commands:
  install-deps-with-cache:
    description: Install dependencies, using the cache if it exists
    steps:
      - restore_cache:
          key: v1-{{ checksum "package.json" }}-{{ checksum "package-lock.json" }}
          working_directory: ~/project

      - run:
          name: Install dependencies
          command: |
            if [ -d 'node_modules' ]
            then 
              echo "restored node_modules from cache!"
            else
              npm ci
            fi
          working_directory: ~/project

      - save_cache:
          key: v1-{{ checksum "package.json" }}-{{ checksum "package-lock.json" }}
          paths:
            - node_modules
          working_directory: ~/project
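
One caveat about the command above (my observation, not from the CircleCI docs): because the run step skips `npm ci` whenever `node_modules` exists, adding a prefix-fallback entry to `restore_cache` would be unsafe here, since a stale partial match would silently skip the install. If partial reuse across lockfile changes is wanted, one pattern is to always run `npm ci` and cache npm's download directory (`~/.npm`) instead. Sketch only, untested:

```yaml
commands:
  install-deps-with-npm-cache:
    description: Always run npm ci, but reuse npm's download cache across lockfile changes
    steps:
      - restore_cache:
          keys:
            # exact match first, then fall back to the newest cache with this prefix
            - v1-npm-{{ checksum "package-lock.json" }}
            - v1-npm-
      - run:
          name: Install dependencies
          # npm ci always wipes node_modules, but package downloads hit ~/.npm
          command: npm ci
          working_directory: ~/project
      - save_cache:
          key: v1-npm-{{ checksum "package-lock.json" }}
          paths:
            - ~/.npm
```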

jobs:
  test-unit:
    docker:
      - image: cimg/node:18.16.1
    steps:
      - checkout
      - install-deps-with-cache
      - run:
          name: Test Unit
          command: npm run test:unit
          working_directory: ~/project

  test-integration:
    docker:
      - image: cimg/node:18.16.1
    steps:
      - checkout
      - install-deps-with-cache
      - run:
          name: Test Integration
          command: npm run test:integration
          working_directory: ~/project
  
  build-sam:
    docker:
      - image: 563186419109.dkr.ecr.us-east-1.amazonaws.com/build-images:sam-node-18    
    steps:
      - checkout
      - install-deps-with-cache
      - run:
          name: Build template
          command: sam build
          working_directory: ~/project

workflows:
  version: 2
  package:
    jobs:
      - test-unit:
          context: all
      - test-integration:
          context: all
      - build-sam:
          context: all
      - deploy:
          name: deploy-staging
          context: all
          deploy-env: staging
          notify-slack: false
          requires:
            - build-sam

There is no documented ‘advanced’ cache management. All indications are that the cache steps run within each job’s own environment, with no independent coordinating process. So in your example, if no cache object exists when the 3 jobs start, each one would install dependencies and save its own copy.

You could modify your process so that before executing the parallel tasks you run a task that makes sure that the cached environment exists and is up to date.

@rit1010 thanks for your reply.

You could modify your process so that before executing the parallel tasks you run a task that makes sure that the cached environment exists and is up to date.

When you write “task”, do you mean a job?
Basically, something like this?

jobs:
  install-deps-job:
    docker:
      - image: cimg/node:18.16.1

    steps:
      - npmregistry
      - checkout
      - install-deps-with-cache

workflows:
  version: 2
  package:
    jobs:
      - install-deps-job:
          context: all

      - test-unit:
          context: all
          requires:
            - install-deps-job

      - test-integration:
          context: all
          requires:
            - install-deps-job

      - build-sam:
          context: all
          requires:
            - install-deps-job

That would probably work; the downside is that it performs a git checkout of the project just for the sake of installing dependencies and populating the cache. But I guess a git checkout is much cheaper than installing dependencies, so it’s probably worth it.

Are there any recommended approaches for things like this?
For example, is it considered a best practice to have a job that just installs dependencies and populates the cache?
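
For what it’s worth, a variant I’m considering (sketch only, untested) is to have the install job persist node_modules to a workspace, so the downstream jobs attach it instead of each restoring the cache separately:

```yaml
jobs:
  install-deps-job:
    docker:
      - image: cimg/node:18.16.1
    steps:
      - checkout
      - install-deps-with-cache
      # share the installed modules with downstream jobs in this workflow
      - persist_to_workspace:
          root: ~/project
          paths:
            - node_modules

  test-unit:
    docker:
      - image: cimg/node:18.16.1
    steps:
      - checkout
      # restores the node_modules persisted by install-deps-job
      - attach_workspace:
          at: ~/project
      - run:
          name: Test Unit
          command: npm run test:unit
          working_directory: ~/project
```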

Yes, that is the type of thing I was proposing.

As for recommendations, I’ve not come across any documents or past forum posts that cover your use case, so there is no best practice I can point to.