Sbt recompiles everything between workflow steps

drobert · March 21, 2019, 2:47pm

Cross-posted on Stack Overflow: https://stackoverflow.com/questions/55282629/sbt-always-recompiles-full-project-in-ci-even-with-caching

I can’t seem to find a way to compile sources using scala/sbt in one workflow step and avoid full project recompilation on the next step.

I’ve looked at posts like “How to cache SBT incremental compilation”, but they are not relevant to my setup, or at least do not solve the problem.

My approach is basically this:

attach workspace /home/circleci/myorg
checkout code (to /home/circleci/myorg/myproj)
compile the project (all compilation artifacts should reside at or below the git/checkout directory)
persist myorg/myproj, ~/.sbt, ~/.ivy2/cache to the workspace

In the next workflow step (job):

Restore workspace
Move .sbt and .ivy2/cache back to the /home/circleci dir from the workspace
run sbt test

However, sbt test recompiles the full project every time, and I am unable to determine why. All source code and the resulting compiled .class files should still exist in the workspace; from sbt’s point of view, nothing should appear to have changed.
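The assumption that nothing changes between the two jobs can be tested directly. The following is a minimal sketch, not part of the original setup: it records the path, size and modification time of every file under a directory so that a manifest written in the first job can be diffed against one written in the second. The function names `write_manifest` and `compare_manifests` are hypothetical, and GNU `find` (for `-printf`) is assumed, as found on most Linux images.

```shell
# Sketch only: helper names are hypothetical; assumes GNU find (-printf).

# write_manifest DIR OUT
# Writes one line per file under DIR: "<mtime-seconds> <size-bytes> <relative-path>",
# sorted by path. OUT should live outside DIR so the manifest does not list itself.
write_manifest() {
  local dir="$1" out="$2"
  (cd "$dir" && find . -type f -printf '%T@ %s %p\n' | LC_ALL=C sort -k3) > "$out"
}

# compare_manifests BEFORE AFTER
# Prints the differing lines; returns 0 only when both manifests are identical.
compare_manifests() {
  diff "$1" "$2"
}
```

Running `write_manifest` after the compile step, persisting the resulting file, and running it again before `sbt test` would show whether paths, sizes or timestamps differ between the jobs. Whether sbt treats any such difference as a change depends on the sbt/Zinc version, so this only checks the assumption; it does not identify the cause.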

Relevant circleci config:

---
version: 2

jobs:
  # compile and cache compilation
  test-compile:
    working_directory: /home/circleci/myteam/myproj
    docker:
      - image: myorg/teika-myproj-base:sbt-1.2.8
    steps:
      # the directory to be persisted (cached/restored) to the next step
      - attach_workspace:
          at: /home/circleci/myteam
      # git pull to /home/circleci/myteam/myproj
      - checkout
      - restore_cache:
          # look for a pre-existing set of ~/.ivy2/cache, ~/.sbt dirs 
          # from a prior build
          keys:
            - sbt-artifacts-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}
      - restore_cache:
          # look for pre-existing set of 'target' dirs from a prior build
          keys:
            - build-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}
      - run:
          # the compile step
          working_directory: /home/circleci/myteam/myproj
          command: sbt test:compile
      # per: https://www.scala-sbt.org/1.0/docs/Travis-CI-with-sbt.html
      # Cleanup the cached directories to avoid unnecessary cache updates
      - run:
          working_directory: /home/circleci
          command: |
            rm -rf /home/circleci/.ivy2/.sbt.ivy.lock
            find /home/circleci/.ivy2/cache -name "ivydata-*.properties" -print -delete
            find /home/circleci/.sbt -name "*.lock" -print -delete
      - save_cache:
          # cache ~/.ivy2/cache and ~/.sbt for subsequent builds
          key: sbt-artifacts-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}-{{ .Revision }}
          paths:
            - /home/circleci/.ivy2/cache
            - /home/circleci/.sbt
      - save_cache:
          # cache the `target` dirs for subsequent builds
          key: build-{{ checksum "project/build.properties"}}-{{ checksum "build.sbt" }}-{{ checksum "project/Dependencies.scala" }}-{{ checksum "project/plugins.sbt" }}-{{ .Branch }}-{{ .Revision }}
          paths:
            - /home/circleci/myteam/myproj/target
            - /home/circleci/myteam/myproj/project/target
            - /home/circleci/myteam/myproj/project/project/target
      # in circle, a 'workflow' undergoes several jobs, this first one 
      # is 'compile', the next will run the tests (see next 'job' section
      # 'test-run' below). 
      # 'persist to workspace' takes any files from this job and ensures 
      # they 'come with' the workspace to the next job in the workflow
      - persist_to_workspace:
          root: /home/circleci/myteam
          # bring the git checkout, including all target dirs
          paths:
            - myproj
      - persist_to_workspace:
          root: /home/circleci
          # bring the big stuff
          paths:
            - .ivy2/cache
            - .sbt

  # actually runs the tests compiled in the previous job
  test-run:
    environment:
      SBT_OPTS: -XX:+UseConcMarkSweepGC -XX:+UnlockDiagnosticVMOptions  -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -Duser.timezone=Etc/UTC -Duser.language=en -Duser.country=US
    docker:
      # run tests in the same image as before, but technically 
      # a different instance
      - image: myorg/teika-myproj-base:sbt-1.2.8
    steps:
      # bring over all files 'persist_to_workspace' in the last job
      - attach_workspace:
          at: /home/circleci/myteam
      # restore ~/.sbt and ~/.ivy2/cache via `mv` from the workspace 
      # back to the home dir
      - run:
          working_directory: /home/circleci/myteam
          command: |
            [[ ! -d /home/circleci/.ivy2 ]] && mkdir /home/circleci/.ivy2

            for d in .ivy2/cache .sbt; do
              [[ -d "/home/circleci/$d" ]] && rm -rf "/home/circleci/$d"
              if [ -d "$d"  ]; then
                mv -v "$d" "/home/circleci/$d"
              else
                echo "$d does not exist" >&2
                ls -la . >&2
                exit 1
              fi
            done
      - run:
          # run the tests, already compiled
          # note: recompiles everything every time!
          working_directory: /home/circleci/myteam/myproj
          command: sbt test
          no_output_timeout: 3900s

workflows:
  version: 2
  build-and-test:
    jobs:
      - test-compile
      - test-run:
          requires:
            - test-compile

drazisil · March 21, 2019, 2:52pm

I’m not an sbt wizard, so hoping someone else can help here.

From the other thread:

Is this something that would help, or was that project specific to them? I’m assuming sbt has to cache the compiled state somewhere that you aren’t adding to the workspace, but I have no clue where that would be.

drobert · March 21, 2019, 6:12pm

I don’t see how it’s applicable here. I’m persisting the entirety of my checkout to the workspace, and that would bring lib_managed along with it, wouldn’t it?

Another issue is that “lib_managed” only applies if you set retrieveManaged := true in build.sbt, which I understand to mean jars are downloaded here instead of (or in addition to) ~/.ivy2/cache. I don’t have this setting enabled, so there is no lib_managed, plus I should still have these artifacts in ~/.ivy2/cache persisted in the workspace.

drazisil · March 21, 2019, 7:03pm

True, I missed that and thought you were persisting a single sub-directory.

We’ll have to wait for someone who knows sbt better than I do.

drobert · March 29, 2019, 3:10pm

I haven’t solved the problem, so I’m replying rather than updating the original post, but I do have a workaround for my immediate problem that I might as well share.

The gist is that I’m trying to separate ‘test compile’ from ‘test run’ so that I can customize JVM properties appropriately and spin up dependencies at different times to lower total machine memory pressure.

What I’ve done, in a nutshell, is run scalatest via scala -cp ... org.scalatest.tools.Runner rather than via sbt test, which avoids any attempt at recompilation. The runner can operate against a directory of .class files.

The short version is this:

docker container: augmented to include a scala CLI install instead of using the one sbt pulls down (unfortunate, as I now need to keep these versions in sync)
build phase: sbt test:compile 'inspect run' 'export test:fullClasspath' | tee >(grep -F '.jar' > test-classpath.txt)
- compiles, but also records a copy-pasteable classpath string, suitable for passing to scala -cp VALUE_HERE to run the tests
test phase: scala -cp "$(cat test-classpath.txt)" org.scalatest.tools.Runner -R target/scala-2.12/test-classes/ -u target/test-reports -oD
- runs scalatest via the runner, using the compiled .class files in target/scala-2.12/test-classes and the classpath recorded in the build phase, and printing to stdout as well as to a reports directory
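The classpath-recording part of the build phase can be sketched as a small helper. This is an illustration under assumptions, not the exact script used above: the function name `extract_classpath` is hypothetical, and it assumes the exported classpath is the last line of sbt output that mentions a .jar (the `tail -n 1` is an addition to the plain `grep -F '.jar'` shown in the build phase).

```shell
# Sketch only: extract_classpath is a hypothetical helper name.
# Reads sbt output on stdin and prints the last line that mentions a .jar,
# assuming that line is the classpath printed by 'export test:fullClasspath'.
extract_classpath() {
  grep -F '.jar' | tail -n 1
}

# Build phase (needs sbt, so shown as a comment):
#   sbt test:compile 'inspect run' 'export test:fullClasspath' \
#     | tee >(extract_classpath > test-classpath.txt)
#
# Test phase (needs a scala CLI on the PATH):
#   scala -cp "$(cat test-classpath.txt)" org.scalatest.tools.Runner \
#     -R target/scala-2.12/test-classes/ -u target/test-reports -oD
```

Keeping only the last matching line guards against earlier log lines that happen to mention a jar, but if the classpath is not the last such line the file will hold the wrong value, so it is worth checking test-classpath.txt before relying on it.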

I don’t love this and it has some problems, but figured I’d share this workaround.

xuwei-k · February 21, 2020, 5:04am

see https://github.com/sbt/sbt/issues/4168
