I have three repos that I run migrations on before doing downstream integration tests. We use Postgres as our database. The migrated repos are Django and Flask apps. In one job, we run the migrations in each repo, call
pg_dump to dump the data to files on an attached ‘workspace’, then call
persist_to_workspace to save the files. This process takes around 10 minutes. In parallel downstream jobs, we call
attach_workspace to access those files and restore the database before running the integration tests.
I’d like to come up with a strategy to cache those pg_dump files based on the last migration in each repo: if no new migrations have been introduced, use the cached files; otherwise, run all of the migrations and cache the new pg_dump files.
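For context, here’s the rough shape of what I’m imagining, using CircleCI’s restore_cache/save_cache with a checksum key. Everything here is a sketch with placeholder names (repo paths, image, db name, dump filenames) rather than our actual config: the idea is to write a manifest listing the migration files in each repo, so the cache key only changes when a new migration lands.

```yaml
jobs:
  migrate_and_dump:
    docker:
      - image: cimg/postgres:14.9   # placeholder image
    steps:
      - checkout
      - run:
          name: Build cache key from migration files
          command: |
            # Listing (not hashing contents) is enough here, since
            # migrations are append-only; sort for a stable key.
            ls django-repo/*/migrations/*.py \
               flask-repo/migrations/versions/*.py \
              | sort > migrations-manifest.txt
      - restore_cache:
          keys:
            - pg-dumps-v1-{{ checksum "migrations-manifest.txt" }}
      - run:
          name: Migrate and dump only on cache miss
          command: |
            if [ ! -f dumps/db.sql ]; then
              mkdir -p dumps
              # ... run the migrations in each repo here ...
              pg_dump mydb > dumps/db.sql
            fi
      - save_cache:
          key: pg-dumps-v1-{{ checksum "migrations-manifest.txt" }}
          paths:
            - dumps
      - persist_to_workspace:
          root: .
          paths:
            - dumps
```

The downstream jobs would stay exactly as they are today, since they still get the dumps via attach_workspace; the cache only short-circuits the 10-minute migrate-and-dump step.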
Has anyone done something similar?