Building a pipeline in CircleCI 2.0


#1

Hi there, I’ve been using CircleCI 1.0 to run a simple pipeline, which did something like this:

# pipeline.sh

./custom-command ./dist/downloaded-file.data

docker run \
  -v "$(pwd)":/opt \
  junjunzhang/spark2 \
  /bin/bash -c \
  "spark-submit --driver-memory 126G /opt/my-spark-script.py"

docker run \
  -v "$(pwd)":/opt \
  fuzzytolerance/tippecanoe \
  /opt/run-some-other-script.sh

That is, it first runs a custom script to download some data, then runs two scripts over that data, each in its own Docker container.

I feel this should be a lot cleaner in CircleCI 2.0, but I’m confused about how the Docker executors work. They are started in the “Spin up environment” step, so should I create three jobs for the above? What is the recommended way to achieve the same thing while relying on CircleCI’s new Docker support, so that it doesn’t have to fetch the Docker images on each run?

Or should I just run my pipeline.sh as is?


#2

Mounting volumes only works with the machine executor, but there you can just execute the script as-is.
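
As a rough sketch, assuming a build where the docker command is available and bind mounts from the build host work, the two docker run calls in pipeline.sh could share one helper. The names run_in_container and DOCKER are hypothetical, introduced here for illustration; they are not something CircleCI provides.

```shell
#!/bin/sh
# Sketch only: assumes `docker` is on PATH and that bind mounts from
# the build host work (e.g. on the machine executor).

# DOCKER is a hypothetical override so the command can be swapped out
# (for example DOCKER=echo for a dry run); it defaults to the real CLI.
DOCKER="${DOCKER:-docker}"

# Run a command in a container with the current working directory
# mounted at /opt, as the original pipeline.sh does.
# Usage: run_in_container IMAGE COMMAND [ARGS...]
run_in_container() {
  image="$1"
  shift
  "$DOCKER" run -v "$(pwd):/opt" "$image" "$@"
}
```

The Spark step would then read run_in_container junjunzhang/spark2 /bin/bash -c "spark-submit --driver-memory 126G /opt/my-spark-script.py", and the second step run_in_container fuzzytolerance/tippecanoe /opt/run-some-other-script.sh. Setting DOCKER=echo prints the commands instead of running them, which is handy for checking the script without a Docker daemon.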


#3

It looks like the docker command is not found in buildpack-deps:trusty … I was hoping to avoid having to build my own custom Docker image, but it looks like I might have to.
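
For anyone hitting the same thing, a small guard at the top of pipeline.sh makes the failure obvious before any work is done. This is only a sketch; require_cmd is a hypothetical helper name, and it checks nothing beyond whether the command is on PATH in the image.

```shell
#!/bin/sh
# Return non-zero with a readable message if a command is missing
# from PATH in the current image.
# Usage: require_cmd NAME
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "error: '$1' not found on PATH in this image" >&2
    return 1
  fi
}
```

Calling require_cmd docker || exit 1 before the first docker run stops the build with a clear message instead of a bare "command not found" halfway through.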


#4

I always advise building your own image.


#5

OK, I’m now building my own image and using it, which seems to work. However, I noticed that during the “Spin up environment” phase, it says:
Build-agent version 0.0.2879-01e30b5 (2017-03-29T13:03:39+0000)
Starting container savvynavvy/buildpack
image not cached, downloading savvynavvy/buildpack

Is there a way for CircleCI to cache these images? It seems a waste to have to download the image on every build.


#6

Ignore that… it now seems to get the image from the cache correctly; not sure why it didn’t earlier.


#7

And now it downloaded the image again… it seems to use the cache only sporadically; not sure why/how.


#8

The image is cached each time it is pulled, but the cache is per host: whenever your build lands on a new host, nothing is cached there yet.


#9