Caching apt-get packages

caching
apt
performance

#1

Many of you have asked about caching apt-get installs.

I’m here to tell you that there’s a way!

Basically it involves downloading the apt packages into a directory and caching that directory between builds.

Additionally, I will show you how to use a custom sources.list to speed up apt-get update for your builds.

A full example

We’ll start with a full, commented example and then walk through the what and why.

dependencies:
  cache_directories:
    # We will store packages in this directory
    - "vendor/apt"

  pre:
    # Move aside the extra repositories from the base image that we don't need
    - sudo mv /etc/apt/sources.list.d /etc/apt/sources.list.d.save
    - sudo mkdir /etc/apt/sources.list.d

    # Copy the repository list from source tree to replace default
    - sudo cp sources.list /etc/apt/sources.list
    # Refresh the package index so the new sources take effect
    - sudo apt-get update

    # Make the cache dir if it doesn't exist (mkdir -p is a no-op if it does)
    - mkdir -p vendor/apt

    # Install aspell if it isn't already installed
    - |
      if [[ ! -e /usr/bin/aspell ]]; then
        # First check for the archives cache
        if ! [[ -d vendor/apt/archives ]]; then
          # It doesn't exist, so download the packages without installing them
          sudo apt-get install -y --download-only aspell
          # Then copy them to our cache directory
          sudo cp -R /var/cache/apt vendor/
          # Make sure our user owns the files so they can be cached
          sudo chown -R ubuntu:ubuntu vendor/apt
        fi

        # Install all packages in the cache
        sudo dpkg -i vendor/apt/archives/*.deb
      fi

  # Just echo something to avoid inference
  override:
    - echo "Nothing to do here."

# Another echo, so the build will succeed
test:
  override:
    - echo "Hello, world"

And here’s the sources.list file we’ve added to the project:

deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu trusty main restricted
deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu trusty main restricted

deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu trusty-updates main restricted
deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu trusty-updates main restricted

deb http://security.ubuntu.com/ubuntu trusty-security main
deb-src http://security.ubuntu.com/ubuntu trusty-security main

The what and why

Now let’s explain what’s going on here, so you have a better understanding before you put it to use.

Sources

By default, our Ubuntu Linux images contain a number of extra package sources for installing common libraries and programs, for example MySQL or PostgreSQL.

This means that if you run apt-get update in your build, you also have to fetch and update these extra sources, which adds time to your build.

In this example, we’ve replaced the default sources with only the ones we care about: the core Ubuntu packages and security updates.

When using this approach, be sure to know which packages you need and where they come from.

You may have to modify the sources.list file above depending on your needs.
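
If you're not sure which entries are still active after swapping in your own file, here is a minimal sketch that prints them. The function name list_active_sources is made up for illustration, and it assumes plain one-line entries without bracketed options such as [arch=amd64].

```shell
# Hypothetical helper, not part of apt: print the URL and suite of every
# active binary ("deb") entry in a sources.list-style file.
# Assumes entries have no bracketed options like [arch=amd64].
list_active_sources() {
  # Commented-out lines and deb-src lines have a different first field,
  # so they are skipped.
  awk '$1 == "deb" { print $2, $3 }' "$1"
}
```

Running list_active_sources sources.list against the file above should print one line per deb entry, for example "http://security.ubuntu.com/ubuntu trusty-security". To see which repository a particular package would come from, apt-cache policy aspell is also handy.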

Caching

This strategy uses the “cache_directories” option of the configuration to keep a copy of the packages between builds, which saves the time spent downloading them on every build.

For the packages to be cached properly, the files in the cache directory must be owned by the ubuntu user, which is why the example runs chown after copying them.

The last part, sudo dpkg -i vendor/apt/archives/*.deb, should Just Work, but may require some tweaking depending on your needs.

For a small number of dependencies this strategy works perfectly. With that said, Your Mileage May Vary.
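
One tweak you might want: the example only checks that vendor/apt/archives exists, so an empty directory left behind by a failed download would leave the dpkg glob with nothing to match. A minimal sketch of a stricter check follows. The name has_cached_debs is hypothetical, not something provided by the build environment.

```shell
# Hypothetical helper: succeed only if the directory contains at least one
# .deb file, rather than merely existing.
has_cached_debs() {
  for f in "$1"/*.deb; do
    # When the glob matches nothing, $f is the literal pattern and -e fails.
    [ -e "$f" ] && return 0
  done
  return 1
}
```

You could then write if ! has_cached_debs vendor/apt/archives; then ... in place of the directory test in the example.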


#2

Will this work with 2.0?


#3

This guide was targeted at 1.0.

If you’re using 2.0, we highly recommend you roll your own image including the apt packages you need pre-installed.


#4