Share data with cron job

Context

I am using CircleCI to deploy review apps for a React app. Essentially, when a PR is opened, CircleCI does these steps:

  • Install Dependencies (npm install)
  • Run tests (npm t)
  • Build & Deploy

During the “Build & Deploy” step, I have to do a number of things like:

  • Create AWS S3 bucket
  • Create AWS CloudFront Distribution
  • Configure Okta to allow logging into the app

All of the steps above use the PR number from GitHub to construct the URL. For instance, the URL will be project-name-<PR_NUMBER>.

The Problem

I have a cron job that runs every night and checks whether the PR has been closed. If it has, I need to run my cleanup script, which removes the S3 bucket, the CloudFront distribution, and the config in Okta. The problem is that I depend on the PR number to know which S3 bucket, CloudFront distribution, and Okta URL to delete. However, the PR number is not available because the PR has been closed. I get the PR number from the CircleCI environment variable CIRCLE_PULL_REQUEST in a Python script. Here is the error I get when attempting to access that environment variable: KeyError: 'CIRCLE_PULL_REQUEST'.
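For reference, here is a minimal sketch of reading that variable without raising KeyError. It assumes CIRCLE_PULL_REQUEST holds the PR's URL ending in /pull/<number>; the function names and the "project-name" default are placeholders for illustration, not code from my actual scripts.

```python
import os
import re
from typing import Mapping, Optional


def pr_number_from_env(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the PR number, or None when the variable is missing or malformed."""
    # .get() avoids the KeyError that env["CIRCLE_PULL_REQUEST"] raises
    # when the job was not started from a pull request.
    url = env.get("CIRCLE_PULL_REQUEST", "")
    match = re.search(r"/pull/(\d+)$", url)
    return int(match.group(1)) if match else None


def review_app_name(pr_number: int, project: str = "project-name") -> str:
    """Build the project-name-<PR_NUMBER> identifier used for the review app."""
    return f"{project}-{pr_number}"
```

This only makes the failure explicit (None instead of an exception). It does not solve the underlying problem that the scheduled job has no PR to read the number from.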

The Question

How can I persist the PR number so that my cleanup cron job has access to it? I looked at Artifacts but it looks like in order to get the Artifact, I need to know the build number and I am not sure how I would get that in the cleanup job since there could be multiple builds run.

Also, if there is a better way to trigger a cleanup (ideally it would be triggered when the PR is merged), then please let me know.


Here is my config.yml (I removed the test step):

version: 2.1
executors:
  docker-executor:
    docker:
      - image: circleci/node:12
    resource_class: xlarge

orbs:
  aws-cli: circleci/aws-cli@0.1.17

commands:
  install-python:
    steps:
      - run: sudo apt install python3-pip && sudo pip3 install -r ./scripts/requirements.txt
  attach-workspace:
    steps:
      - attach_workspace:
          at: ~/
  persist-workspace:
    steps:
      - persist_to_workspace:
          root: ~/
          paths: ./

# CircleCI PR-only option must be enabled in the job settings
dev_only: &dev_only
  filters:
    branches:
      ignore:
        - main
        - develop
        - staging

staging_only: &staging_only
  context: frontend
  deploy_env: 'staging'
  filters:
    branches:
      only: main

jobs:
  install:
    executor: docker-executor
    steps:
      - checkout
      - restore_cache:
          keys:
            - npm-deps-{{ checksum "package-lock.json" }}
      - run: npm i
      - save_cache:
          key: npm-deps-{{ checksum "package-lock.json" }}
          paths:
            - ~/.npm
            - node_modules
      - persist-workspace
  build-deploy:
    executor: docker-executor
    steps:
      - attach-workspace
      - install-python
      - aws-cli/setup:
          aws-region: AWS_REGION
      - add_ssh_keys:
          fingerprints:
            - "fingerprint here"
      - run:
          name: Clone/Install SG1 Admin
          command: |
            # Export the variable so that git (a child process) actually sees it
            export GIT_SSH_COMMAND='ssh -i ~/.ssh/id_rsa'
            git clone git@github.com:shipt/sg1-admin.git ~/sg1-admin
            cd ~/sg1-admin
            npm i
      - run:
          name: Setup Review App and Env Vars
          command: |
            ./scripts/create_reviewapp.py
            cat .env.txt >> $BASH_ENV
            source $BASH_ENV
            cat $BASH_ENV
      - store_artifacts:
          path: /tmp/bucket_name
          destination: bucket_name
      - run:
          name: Build SG1 Admin
          command: |
            cd ~/sg1-admin
            cat .env.development
            REACT_APP_ENV=development npm run build
      - run:
          name: Build and Deploy
          command: |
            cat .env.txt >> $BASH_ENV
            source $BASH_ENV
            cat $BASH_ENV
            REACT_APP_ENV=development npm run build
            ./scripts/deploy_app.sh
      - run:
          name: Comment on PR
          command: |
            cat .env.txt >> $BASH_ENV
            source $BASH_ENV
            cat $BASH_ENV
            ./scripts/comment_on_pr.py $REVIEW_APP_URL
  cleanup:
    executor: docker-executor
    steps:
      - install-python
      - run: ./scripts/cleanup_reviewapp.py

workflows:
  install_test_deploy:
    jobs:
      - install:
          name: "Install Dependencies"
      - build-deploy:
          <<: *dev_only
          name: "Deploy Review App"
          requires:
            - "Install Dependencies"
  cleanup:
    triggers:
      - schedule:
          cron: "0 0 * * *"
          <<: *dev_only
    jobs:
      - cleanup:
          name: "Cleanup"

I have a similar need. Generically, the request is: how can you persist a piece of data from one workflow to a later job in the same CircleCI project? In this specific instance, the question is how to persist a piece of data from a GitHub-triggered workflow to a cron (time-triggered) job that runs later. However, I think the same general solution would apply regardless of whether the workflows or jobs are triggered by VCS or cron.

You can do this using the CircleCI API. @dericgw asks:

I looked at Artifacts but it looks like in order to get the Artifact, I need to know the build number and I am not sure how I would get that in the cleanup job since there could be multiple builds run.

With the API, you can list recent jobs and iterate through that JSON to find the one you want. It’s sometimes tedious with API v1.1 because there aren’t sufficient filtering options (I don’t know about API v2, maybe it got better), but you can do it.
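As a rough sketch of that iteration (not tested against a live server): the code below assumes the v1.1 "recent builds" endpoint returns a JSON list whose entries carry build_num, status, workflows.job_name and pull_requests[].url, and that the token can be sent in a Circle-Token header. The function names are my own, so check the field names against what your server actually returns.

```python
import json
import urllib.request


def fetch_recent_builds(base_url, org, repo, token, limit=100):
    """GET the most recent builds of a GitHub project via API v1.1."""
    url = f"{base_url}/api/v1.1/project/gh/{org}/{repo}?limit={limit}"
    request = urllib.request.Request(
        url, headers={"Circle-Token": token, "Accept": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def matching_build_nums(builds, job_name, pr_url):
    """Return build numbers of successful runs of job_name for the given PR URL.

    The order of the input list is preserved.
    """
    found = []
    for build in builds:
        workflow = build.get("workflows") or {}
        pr_urls = [pr.get("url") for pr in build.get("pull_requests") or []]
        succeeded = build.get("status") in ("success", "fixed")
        if succeeded and workflow.get("job_name") == job_name and pr_url in pr_urls:
            found.append(build["build_num"])
    return found


def artifacts_url(base_url, org, repo, build_num):
    """URL that lists the artifacts stored by one build."""
    return f"{base_url}/api/v1.1/project/gh/{org}/{repo}/{build_num}/artifacts"
```

With the config above, job_name would be "Deploy Review App". Each matching build number can then be fed to artifacts_url to locate the stored bucket_name artifact.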

However, to use the API, you need an API token, which is associated with a user. If it’s a real user’s API key, then the job will break when that user rotates their key, or leaves the organization. You could use a service account dedicated to that CircleCI project, but I’m certainly not fond of the idea of having to use a whole new user license every time I want a CircleCI job to be able to read some data from a past job or workflow in the same project!!

So, how can we persist some piece of data from a workflow so that a job that runs later, and knows what to look for, can access it without requiring a user license for a new API key?

Updating to say I found an answer last year, but I forgot to come back here to say so. At the time, CircleCI Server was still stuck on version 2.x, so I had to work with API v1.1; not sure if there’s a better solution now that API v2 is available. But here’s what I did last year:

  1. CircleCI supports a “Project API token” which gives permissions to that project only, and is not tied to a user. So I used a project token for the API.

  2. Using the API, I had the workflow communicate with its past and future runs using project environment variables. A workflow can write a value to a variable that it knows the next run will try to read, by POSTing to https://circleci.$domain.com/api/v1.1/project/gh/$org/$reponame/envvar

  3. All project environment variables written by previous runs are automatically available to all subsequent workflows, directly in their shell’s environment.
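The three steps above can be sketched roughly as follows. This is not my exact script. The variable name REVIEW_APP_PRS and the helper names are made up for illustration, and I am assuming that the project token is allowed to write env vars, that it can be passed in a Circle-Token header, and that POSTing an existing name overwrites its value.

```python
import json
import urllib.request

# Hypothetical project env var holding open review-app PR numbers, e.g. "7,42".
ENVVAR_NAME = "REVIEW_APP_PRS"


def _parse(current):
    """Turn "7,42" into {7, 42}; an empty string gives an empty set."""
    return {int(part) for part in current.split(",") if part.strip()}


def add_pr(current, pr_number):
    """Return the comma-separated list with pr_number included."""
    return ",".join(str(n) for n in sorted(_parse(current) | {pr_number}))


def remove_pr(current, pr_number):
    """Return the comma-separated list with pr_number dropped."""
    return ",".join(str(n) for n in sorted(_parse(current) - {pr_number}))


def envvar_request(base_url, org, repo, token, name, value):
    """Build (but do not send) the POST that writes a project env var."""
    url = f"{base_url}/api/v1.1/project/gh/{org}/{repo}/envvar"
    body = json.dumps({"name": name, "value": value}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Circle-Token": token, "Content-Type": "application/json"},
    )
```

In the deploy job, read os.environ.get(ENVVAR_NAME, ""), pass it through add_pr, and send the request with urllib.request.urlopen. The nightly cleanup job reads the same variable, cleans up each closed PR, and writes back the result of remove_pr. Two workflows writing at nearly the same time could overwrite each other's update, so treat this as a sketch rather than something race-free.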