Docker image build is 3 times slower than the normal build

docker

#1

We are migrating to Docker and we are trying to build our app inside a docker image. The app is a Node app and is built with webpack.

Until now we were building the app inside the CircleCI image, which took about 130 seconds. After switching to Docker and building the image on the remote Docker machine, the build time rose from 130 seconds to 405 seconds. That is a threefold increase in build time!

On the local development machine we saw no difference between building the app on the host or with docker.

The docker build machine is running at 200% CPU when building the app inside Docker (checked with docker stats and by SSHing into the docker machine and running top), so the processor seems to be fully used. Memory doesn’t seem to be an issue (only 4 GB of 8 GB is used), the network is not used at all, and the disk is only used at the end of the build to save the compiled app.

I’m not sure that this is the problem, but I saw that CircleCI uses different machines for running tests and for building images for deployment. The test machines seem to be running Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz and the build/Docker machines are running Intel(R) Xeon(R) CPU @ 2.30GHz. Full specs from lscpu here.

Can anything be done to speed up those machines or the build process?


#2

Could you examine the build time for the layers in your Dockerfile, to see if one specific layer is taking up the increased time? Or are all layers affected proportionately?

You could try switching to the Machine Executor for the build phase to see if that behaves differently. I believe with the new executors in 2.1, you can mix Docker and Machine builds in the same config.
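A rough way to get per-layer timings is to timestamp the build output. The sketch below is only an illustration: the helper name `stamp` is made up for this example, and it assumes the builder prints one line as each step starts, so the gap between two step lines approximates how long the earlier layer took.

```shell
#!/usr/bin/env bash
# Prefix each line read from stdin with the seconds elapsed since stamp
# started, so the gap between two step lines shows how long a layer took.
# "stamp" is a hypothetical helper name, not part of Docker or CircleCI.
stamp() {
  local start now line
  start=$(date +%s)
  # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    now=$(date +%s)
    printf '[%4ds] %s\n' "$((now - start))" "$line"
  done
}

# Example, assuming a Dockerfile in the current directory:
#   docker build . 2>&1 | stamp
```

Running the same pipe locally and on CI should make it easier to see whether one layer accounts for the difference or all of them do.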


#3

Thanks for the reply and sorry for the late response!
The time I quoted was only for the layer that builds the app in the Dockerfile. All the layers slowed down by about 10%, but the build layer slowed down the most (it is the most resource-intensive layer).

This has got to be a Docker issue with the machines.

We tried running on the same base image, circleci/node:8.11, in Docker as in the normal tests. The instructions are the same as in the normal test. The problem persisted.

We don’t want to use the machine executor because the spin-up takes too long. I could build the app in the “test” environment and then pass it to the docker build, but I don’t think this is good for consistency.

Thank you!


#4

Would you be able to share the Docker step here? It will be hard for anyone to give more than general advice unless they have something to dig into.


#5

Sorry, but the project is not open source. I will try to see if we can share the docker and the test steps.

I will try to compile an open-source project and compare the build time in Docker and in the normal build.


#6

Sharing one line of [Docker] code should not be a problem, especially given that any one line of code can be trivially rewritten by a competent engineer. If your employer prohibits you from sharing even the smallest piece, you need to explain to them that this hampers your efforts to seek volunteer help on the web.

Yes, that is what I meant. See above for why I don’t think you should need to ask for such a small amount of code (minor amendments for redaction purposes are fine).


#7

I have tried with this open source project.
I saw roughly a 100% increase in build time.

And relevant times:

| Command | Normal | Docker |
|---|---|---|
| npm i | 14.357s | 23.891s |
| npm run build | 35.363s | 73.123s |

As you can see, npm i is about 1.7 times slower and npm run build is about 2 times slower.

config.yml:

# Testing the performance of docker build vs normal build
version: 2

jobs:
  job1:
    docker:
    - image: circleci/node:8.11.1
    steps:
    #      Normal build
    - run:
        name: Clone code
        command: git clone https://github.com/cezerin/cezerin
    - run:
        name: Install dependencies
        command: cd cezerin && npm i
    - run:
        name: Build
        command: cd cezerin && npm run build
    #        Docker build
    - setup_remote_docker
    - run:
        name: Build from dockerfile
        command: docker build https://gist.githubusercontent.com/PaulGgithub/c6508f706eb098fe367a4653b7fa177c/raw/67ddeef925c5d4833bdb363c3f427b9f244299f8/Dockerfile



workflows:
  version: 2

  tests_to_run:
    jobs:
    - job1

Dockerfile from here:

# Same base image as in config.yml
FROM circleci/node:8.11.1

# equivalent of checkout
WORKDIR /home/circleci
RUN git clone https://github.com/cezerin/cezerin

# cd to project
WORKDIR /home/circleci/cezerin
# Install dependencies
RUN npm i
# Build
RUN npm run build

If helpful, the full CI test is here:

I can share our code if you consider it relevant, but at this point I don’t think that our implementation is the problem.


#8

That’s excellent research - good work! For the avoidance of doubt, I am just a volunteer here, not an employee. I would be interested in their feedback too.

The free build minute allowance recently dropped 33% to 1000 minutes, predicated on a move to significantly faster CPUs. If that’s not the case then CircleCI will want to know about it.

I’ll take the liberty of reclassifying this as a bug, to attract attention to it.


#9

We are not using the free tier. We are using 10 instances for our tests. This is why I’m disappointed that the build speed dropped when we migrated to Docker.

Thanks for your support :slight_smile: !


#10

Sure, but I would assume that your paid tier would have received the same speed advantage.

(For my own case, I am trying to get CI into an organisation slowly, and I hope my free minutes budget lasts each month before I can persuade them to pay for it :grinning:).


#11

Hi Paul,

Can you please open a ticket so we can take a look and figure out why this is behaving as you observe?


#12

Sorry for the late reply.
I will open a ticket and keep the thread updated if we find something.


#13

Later edit:

I’ve tested 2 other ways:

  1. SSH into the docker machine and run the install commands:

     ssh remote-docker
     git clone https://github.com/cezerin/cezerin && cd cezerin && npm i && npm run build
    
  2. Run the circleci image and execute the commands inside the container. This should replicate the way circleci runs all the commands.

     version: 2
     
     jobs:
       job1:
         docker:
         - image: circleci/node:8.11.1
         steps:
         #      Normal build
         - run:
             name: Clone code
             command: git clone https://github.com/cezerin/cezerin
         - run:
             name: Install dependencies
             command: cd cezerin && npm i
         - run:
             name: Build
             command: cd cezerin && npm run build
         #            Docker build inside image
         - setup_remote_docker
         - run:
             name: Build from dockerfile
             command: docker build https://gist.githubusercontent.com/PaulGgithub/c6508f706eb098fe367a4653b7fa177c/raw/67ddeef925c5d4833bdb363c3f427b9f244299f8/Dockerfile
         #    Build inside container
         - run:
             name: Build inside container
             command: >
               docker run -it circleci/node:8.11.1 /bin/bash -c
               "cd home/circleci/ &&
               git clone https://github.com/cezerin/cezerin &&
               cd cezerin &&
               npm i &&
               npm run build"
     
     
     workflows:
       version: 2
     
       tests_to_run:
         jobs:
         - job1
    

And the results are mostly the same:

| Command | Normal | Docker build | Docker run in container | SSH inside the docker machine |
|---|---|---|---|---|
| npm i | 13.47s | 23.04s | 23.03s | 28.315s |
| npm run build | 34.72s | 70.99s | 69.55s | 68.19s |

I’m sure that the machines are the problem, not Docker.
Can anything be done to make them as fast as the “normal test” machines?
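To put a number on "mostly the same", the slowdown factors can be computed directly from the timings in the table. This is just a small arithmetic sketch; the function name `slowdown` is made up for the example.

```shell
#!/usr/bin/env bash
# slowdown BASELINE CANDIDATE: print CANDIDATE / BASELINE to two decimals.
# "slowdown" is a hypothetical helper name used only for this illustration.
slowdown() {
  awk -v base="$1" -v cand="$2" 'BEGIN { printf "%.2f\n", cand / base }'
}

# Timings in seconds, taken from the table above.
slowdown 13.47 23.04   # npm i, Docker build vs normal           -> 1.71
slowdown 34.72 70.99   # npm run build, Docker build vs normal   -> 2.04
slowdown 34.72 69.55   # npm run build, docker run vs normal     -> 2.00
slowdown 34.72 68.19   # npm run build, SSH on machine vs normal -> 1.96
```

All three ways of running on the remote Docker machine land at roughly 2x for the build step, including the plain SSH run with no container involved.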


#14

It doesn’t look like you are caching dependencies, which is something that may be happening locally.

If you try that (https://circleci.com/docs/2.0/caching/#npm-node), does it run any faster? You would restore the cache, run npm, then save again to catch any updates.


#15

We are caching dependencies. I can’t share our code because it is private, so I’ve replicated the problem with an open-source project. It is a simple example, which is why I haven’t cached the dependencies there.

Even if I had cached the dependencies, the build would still be twice as slow as on the non-Docker machine.

I don’t care about the dependency install speed, only about the build speed.

I also opened a support ticket yesterday, number 46614.

Thanks!


#16

Understood. I was confused by what you shared, sorry about that. Have you opened a ticket? You can share the build URL there privately and we can dig deeper to see if we can spot the cause of the slowness.


#17

I have opened a ticket and shared the build URL.
The ticket number is 46614.


#18

Thank you, I was having trouble locating it in our queue. I’ll make sure one of our support engineers takes a look at it for us.


#19

I am also facing a similar problem (though in a more complicated build, and using the machine executor), and I’m also on a paid plan.
The webpack build is taking a lot more time inside Docker than outside it, in similar circumstances (both not using a cache).
I currently suspect the Docker containers are getting a smaller share of the CPUs, but I’ll try to keep digging. In the meantime, can you please write here if you have updates on this? The solution might be helpful for others.
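One thing that can be ruled out fairly quickly is a hard CPU quota on the container. The sketch below assumes the usual Linux cgroup file locations (cgroup v2 `cpu.max`, or cgroup v1 `cpu.cfs_quota_us` and `cpu.cfs_period_us`); the function names are made up for this example, and the paths may differ on a given image. It only covers hard quotas, not relative CPU shares or weights.

```shell
#!/usr/bin/env bash
# Convert a CFS quota/period pair into a CPU count. A quota of "-1"
# (cgroup v1) or "max" (cgroup v2) means that no hard limit is set.
# Both function names here are hypothetical, chosen for this illustration.
effective_cpus() {
  case "$1" in
    -1|max) echo "unlimited" ;;
    *) awk -v q="$1" -v p="$2" 'BEGIN { printf "%.2f\n", q / p }' ;;
  esac
}

# Read the quota from whichever cgroup layout is present (assumed paths).
container_cpu_limit() {
  local quota period
  if [ -r /sys/fs/cgroup/cpu.max ]; then
    # cgroup v2: a single line of the form "<quota> <period>"
    read -r quota period < /sys/fs/cgroup/cpu.max
  elif [ -r /sys/fs/cgroup/cpu/cpu.cfs_quota_us ]; then
    # cgroup v1: quota and period live in separate files
    quota=$(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us)
    period=$(cat /sys/fs/cgroup/cpu/cpu.cfs_period_us)
  else
    echo "unknown"
    return 0
  fi
  effective_cpus "$quota" "$period"
}

# Run inside the container that performs the build:
#   container_cpu_limit
```

If this prints "unlimited" both inside and outside Docker, a hard quota is probably not the cause, and the difference is more likely to come from the host itself.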


#20

I noticed yesterday that this is a problem even with Docker builds inside the machine executor. They seem to be using the same machines for remote Docker machines and for machine executors.

In the case of remote Docker, I think it is a cost-saving measure, because they need to run 2 machines: the docker executor machine and the remote docker machine (the one initialised with setup_remote_docker from the docker executor).
Ironically, the remote docker machine should be the most powerful one, since the docker executor is only running the docker client (the docker CLI). Maybe they could check if the job contains a setup_remote_docker step and use the powerful machine for the docker daemon and the other machine for the docker executor.

We would run Docker in Docker on the docker executor, but that’s not possible because the container would need to run with --privileged.

I’ve opened a ticket, but they don’t seem to understand the problem. I’m still waiting for a response that will improve the webpack build speed. I will keep the thread updated :slight_smile: .