I have three repos that I run migrations on before doing downstream integration tests. We use Postgres as our database. The migrated repos are django & flask. In one job, we run the migrations in each repo, call pg_dump
to dump the data to files on an attached ‘workspace’, then call persist_to_workspace
to save the files. This process takes around 10 minutes. In parallel downstream jobs, we call attach_workspace
to access those files and restore the database before running the integration tests.
I’d like to come up with a strategy to cache those ‘pg_dump’ files based on the last migration in each repo: if no new migrations have been introduced, use the cached files; otherwise, run all of the migrations and cache the new ‘pg_dump’ files.
Has anyone done something similar?
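For context, the current (uncached) upstream job looks roughly like this; the images, job name, and make targets are genericized placeholders rather than our real config, and the per-repo dump files are collapsed into a single dump for brevity:

version: 2
jobs:
  migrate_and_dump:
    docker:
      - image: circleci/python:3.6      # placeholder app image
      - image: circleci/postgres:9.6    # placeholder Postgres service
    steps:
      - checkout
      - run:
          name: Run migrations in each repo
          command: |
            # placeholder entry points; each repo has its own migration command
            make -C repo1 migrations
            make -C repo2 migrations
            make -C repo3 migrations
      - run:
          name: Dump the migrated database to the workspace
          command: |
            mkdir -p /tmp/workspace
            pg_dump "$DATABASE_URL" -Fc > /tmp/workspace/db.dump
      - persist_to_workspace:
          root: /tmp/workspace
          paths:
            - db.dump

The parallel downstream jobs attach_workspace at /tmp/workspace and pg_restore that dump before running their tests.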
halfer
February 9, 2019, 9:56pm
You could use the caching feature, using the hash of the migration file as a cache key. If the migration file changes, the old cache becomes invalid and you have to re-do the migration.
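Something along these lines, as an untested sketch placed after checkout (the migration file name, make target, and $DATABASE_URL are placeholders):

- restore_cache:
    keys:
      - pg-dump-v1-{{ checksum "migrations/0001_initial.sql" }}
- run:
    name: Migrate and dump only when the cache missed
    command: |
      if [ ! -f /tmp/cache/db.dump ]; then
        make migrations
        mkdir -p /tmp/cache
        pg_dump "$DATABASE_URL" -Fc > /tmp/cache/db.dump
      fi
- save_cache:
    key: pg-dump-v1-{{ checksum "migrations/0001_initial.sql" }}
    paths:
      - /tmp/cache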
Unfortunately there isn’t a single migration file. Each ‘app’ in the repo has a ‘migrations’ directory where migrations for that app live.
halfer
February 11, 2019, 6:34pm
That’s probably OK - just write something that creates an overall hash of all of them and use that as your cache key.
This one set my brain turning. I haven’t tried this, but I think you can use MD5 to checksum the directory, save that as an exported ENV and then use {{ .Environment.variableName }}
in the cache key.
Please let me know if this works
https://circleci.com/docs/2.0/caching/#using-keys-and-templates
ETA: When I said export, I meant exporting it to $BASH_ENV, as in here: https://circleci.com/docs/2.0/env-vars/#example-configuration-of-environment-variables
This didn’t end up being quite what I wanted, so here’s the working code I have; hope it helps:
- run: |
    # hash each migration file, sort so the combined hash is deterministic, then hash the list
    echo $(find ./migrations -type f -exec md5sum {} \; | sort | md5sum | cut -d' ' -f1) > CACHE_KEY1
- run: cat CACHE_KEY1
- save_cache:
    key: test-cache-v1-{{ checksum "CACHE_KEY1" }}
    paths:
      - migrations
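For completeness, the matching restore is keyed the same way; a sketch, assuming CACHE_KEY1 is regenerated at the start of the job before the cache is looked up:

- run: |
    echo $(find ./migrations -type f -exec md5sum {} \; | sort | md5sum | cut -d' ' -f1) > CACHE_KEY1
- restore_cache:
    keys:
      - test-cache-v1-{{ checksum "CACHE_KEY1" }}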
I’ve been given some time to revisit our CI workflows and I’m about to do something very similar. I’ll post my solution once I’ve got it running.
Here’s what I did in my integration test setup job:
integration_test_setup:
  <<: *defaults
  resource_class: small
  steps:
    - checkout
    - run: *checkout_services # this clones all dependent repos locally
    - run:
        name: Create cache directory
        command: |
          mkdir -p /tmp/cache
    - run:
        name: Create migration checksum files
        command: |
          # The scripts below contain one or more 'ls' and 'find' commands that return a list of all
          # migration files for the repo. To make 'find' commands deterministic, pipe the
          # output to 'sort'.
          shasum -a 256 `../repo1/scripts/ci/integration-test-list-migration-files` > flask_migrations.txt
          shasum -a 256 `../repo2/scripts/ci/integration-test-list-migration-files` > django_migrations.txt
    - run:
        name: Create fixtures checksum files
        command: |
          # The scripts below contain one or more 'ls' and 'find' commands that return a list of all
          # fixture files for the repo. To make 'find' commands deterministic, pipe the
          # output to 'sort'.
          shasum -a 256 `../repo1/scripts/ci/integration-test-list-fixture-files` > flask_fixtures.txt
          shasum -a 256 `../repo2/scripts/ci/integration-test-list-fixture-files` > django_fixtures.txt
    - restore_cache:
        keys:
          - test-v1-migrations-flask_{{ checksum "flask_migrations.txt" }}-django_{{ checksum "django_migrations.txt" }}
    - restore_cache:
        keys:
          - test-v1-fixtures-flask_{{ checksum "flask_fixtures.txt" }}-django_{{ checksum "django_fixtures.txt" }}
    - setup_remote_docker:
        docker_layer_caching: true
    - build_or_pull_images: # pull docker image from image repo if it exists, otherwise build & push to image repo
        repos: "repo1 repo2"
    - run:
        name: Stop and remove containers
        command: |
          docker-compose down --remove-orphans
    - run: *initialize_db # start database container and wait until it accepts connections
    - run:
        name: Load migrations
        command: |
          if [ -f /tmp/cache/db1.schema.dump ]; then
            echo "Restoring schema from cache"
            # copy the dump file into the database container
            docker cp /tmp/cache/db1.schema.dump db_container:/tmp/db1.dump
            # restore the schema; connect via db2 because --create/--clean drops and recreates db1
            docker exec db_container pg_restore -d 'postgresql://postgres:somepass@db_container/db2' -Fc --create --clean --if-exists /tmp/db1.dump
          else
            echo "Cache not found. Running migrations"
            services_to_migrate='repo1 repo2'
            for service in $services_to_migrate; do
              echo "Running migrations for $service..."
              docker-compose run -e CIRCLECI=$CIRCLECI --rm $service make migrations
            done
            # dump the schema to the cache directory
            docker exec db_container pg_dump 'postgresql://postgres:somepass@db_container/db1' -Fc > /tmp/cache/db1.schema.dump
          fi
    - save_cache:
        key: test-v1-migrations-flask_{{ checksum "flask_migrations.txt" }}-django_{{ checksum "django_migrations.txt" }}
        paths:
          - /tmp/cache/db1.schema.dump
    - run:
        name: Load fixtures
        command: |
          if [ -f /tmp/cache/db1.fixtures.dump ]; then
            echo "Restoring fixtures from cache"
            # copy the dump file into the database container
            docker cp /tmp/cache/db1.fixtures.dump db_container:/tmp/db1.dump
            # restore the fixtures, again connecting via db2
            docker exec db_container pg_restore -d 'postgresql://postgres:somepass@db_container/db2' -Fc --create --clean --if-exists /tmp/db1.dump
          else
            echo "Cache not found. Loading fixtures"
            services='repo1 repo2'
            for service in $services; do
              echo "Loading fixtures for $service..."
              docker-compose run -e CIRCLECI=$CIRCLECI --rm $service make fixtures
            done
            # dump the fixtures to the cache directory
            docker exec db_container pg_dump 'postgresql://postgres:somepass@db_container/db1' -Fc > /tmp/cache/db1.fixtures.dump
          fi
    - save_cache:
        key: test-v1-fixtures-flask_{{ checksum "flask_fixtures.txt" }}-django_{{ checksum "django_fixtures.txt" }}
        paths:
          - /tmp/cache/db1.fixtures.dump
    - run:
        name: Dump database state
        command: |
          mkdir -p /tmp/workspace
          docker exec db_container pg_dump 'postgresql://postgres:somepass@db_container/db1' -Fc > /tmp/workspace/db1.dump
          docker-compose down
    - persist_to_workspace:
        root: /tmp/workspace
        paths:
          - db1.dump
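The integration-test-list-migration-files and integration-test-list-fixture-files scripts aren't shown above; a hypothetical version might be no more than this, with the path pattern and extension being illustrative:

#!/bin/sh
# Print every migration file in the repo, sorted so the resulting
# checksum is the same on every run.
find . -path '*/migrations/*' -type f -name '*.py' | sort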
In a subsequent job, I restore the database from the dump file saved in the workspace before running the integration tests.
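A sketch of that downstream job, assuming the same db_container, connection string, and anchors as the setup job (the job name and test command are placeholders):

integration_test_flask:
  <<: *defaults
  steps:
    - checkout
    - run: *checkout_services
    - attach_workspace:
        at: /tmp/workspace
    - setup_remote_docker
    - build_or_pull_images:
        repos: "repo1"
    - run: *initialize_db
    - run:
        name: Restore database from the workspace dump
        command: |
          docker cp /tmp/workspace/db1.dump db_container:/tmp/db1.dump
          # connect via db2 so --create/--clean can drop and recreate db1
          docker exec db_container pg_restore -d 'postgresql://postgres:somepass@db_container/db2' -Fc --create --clean --if-exists /tmp/db1.dump
    - run:
        name: Run integration tests
        command: |
          docker-compose run -e CIRCLECI=$CIRCLECI --rm repo1 make integration-tests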