Racing containers

2.0

#1

We are trying out 2.0 and have successfully built two of our projects on the 2.0 platform. However, before the builds became stable, I had to deal with strange and inconsistent behavior from some tools. In the end I managed to stabilize things with… sleep.

We have a custom python-2.7-slim based container and an additional postgres container:

version: 2
executorType: docker
containerInfo:
  - image: our-custom-base-image/ci:3
  - image: postgres:9.6
    env:
      - POSTGRES_DB=circle_test
      - POSTGRES_PASSWORD=
stages:
  build:
    workDir: ~/console-api
    steps:
      - type: checkout
      - type: shell
        name: Prepare Database
        shell: /bin/bash --login
        command: psql -h localhost -U postgres -d circle_test -c 'CREATE EXTENSION hstore'
      ...

With this setup, every 2nd or 3rd build fails randomly: either git is not found in the base image, or psql is unable to connect to postgres (connection refused) to execute the CREATE statement.

I got the impression that the platform is so fast that the psql step and checkout are executed before the postgres container has fully started (or the base image has fully resolved?..).

The only thing that helped me stabilize the builds was putting a sleep as the very first step:

    steps:
      - type: shell
        name: Spin up Time Machine
        command: sleep 4 && echo "The barman asks what the first one wants, two race conditions walk into a bar."

This does not really sound fun though :slight_smile:

Is there anything I’m missing here? Does the platform check whether a container is fully up and running before diving into the steps?


#2

You’re absolutely correct that it runs before Postgres is booted, but we have no way to determine if the container is “fully up”. We start all the containers and your commands start to run.
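Since the platform does not wait for service containers, the job itself can poll until Postgres accepts connections instead of sleeping for a fixed time. A minimal sketch of such a retry helper follows; the function name `wait_for` is hypothetical, and the commented usage assumes `pg_isready` (shipped with the Postgres client tools) is available in the base image:

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds or the attempt budget runs out.
# Usage: wait_for <max_attempts> <delay_seconds> <command...>
# (wait_for is a hypothetical helper, not part of the platform.)
wait_for() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Assumed usage before the "Prepare Database" step:
# wait_for 30 1 pg_isready -h localhost -U postgres
```

Unlike a fixed `sleep 4`, this returns as soon as the service answers and fails the step with a clear message if it never does.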

For the “git is not found” failures, I recommend referencing your image by sha id. It sounds like you’ve had more than one image with that version tag, and if one of them is cached on one of our hosts, the build will use the cached version. Using the sha would force the download and use of that specific version.
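As a rough illustration, pinning by digest uses Docker's standard `name@sha256:…` reference form. The fragment below assumes the config accepts such a reference in the `image` field; `<digest>` is a placeholder for the real value, which is not given in this thread:

```yaml
containerInfo:
  # <digest> is a placeholder: substitute the sha256 digest of the
  # exact image build you want every host to use.
  - image: our-custom-base-image/ci@sha256:<digest>
  - image: postgres:9.6
```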

We can’t run a container before it “fully resolves”. Git is either installed in the image or it is not. That said, you can run a step before checkout to install git.
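A sketch of such a step, assuming the base image is Debian-based (as python slim images are) and that steps run as root; adjust the package manager command if either assumption does not hold:

```yaml
    steps:
      - type: shell
        name: Install git
        # Assumes apt-get is available and the step runs as root.
        command: apt-get update && apt-get install -y git
      - type: checkout
```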


#3

Another way to solve this issue is to use Dockerize; see “Prevent race conditions by waiting for services with Dockerize”.
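For example, a wait step along these lines could replace the sleep. This is a sketch that assumes the `dockerize` binary is already installed in the base image and that Postgres listens on its default port 5432 on localhost:

```yaml
      - type: shell
        name: Wait for Postgres
        # Blocks until the TCP port accepts connections, or fails after 1 minute.
        command: dockerize -wait tcp://localhost:5432 -timeout 1m
```

Note that an open TCP port only shows that the server is listening, so a short retry around the first `psql` call may still be worthwhile.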


#4