Build with circleci/python:3 has been failing since yesterday

Hello,

I have a build that uses the circleci/python:3 image:

  some_job:
    docker:
      - image: circleci/python:3
    steps:
      ...

In this job I build a Docker container with a long-ish requirements.txt file. The build fails when using the circleci/python:3 image.

I’ve just cancelled my latest attempt using circleci/python:3.6, where the docker build process had only reached the “installing pandas” stage after 20 minutes. This is much longer than it usually takes (and should take).

I have tried testing with this short Dockerfile, which spends an unusually long time downloading the h5py package:

requirements.txt:

h5py==2.8.0

Dockerfile:

FROM circleci/python:3
COPY main.py .
COPY requirements.txt .
RUN pip install -r requirements.txt
CMD python main.py

This fails for me locally. Now compare this to a Dockerfile that uses the (non-circleci) python:3 image. The latter build succeeds locally as well, and is much, much faster.

According to Twitter, I am not the only one facing this issue. Docker Hub lists the circleci/python:3 image as “Last updated 14 hours ago by circlecipublicimage”.

Try pinning your image to a previously working hash. This doc will help https://support.circleci.com/hc/en-us/articles/115015742147-Did-something-change-in-the-CircleCI-Docker-Images-

Those tags are mutable, so you could be picking up incompatible minor and patch versions unknowingly. If you pin the image to the hash from a previously working job and it still fails, that would indicate the issue isn’t with the image. It’s a good place to start checking though.
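
To check whether the tag has actually moved, one option is to print the digest that the tag currently resolves to and compare it with the digest from the last passing job. This is only a minimal sketch, assuming the image has already been pulled locally; image_digest is a hypothetical helper name, not something from the CircleCI docs.

```shell
# image_digest IMAGE
# Print the repo digest (name@sha256:...) of a locally pulled image.
# Hypothetical helper; assumes `docker pull IMAGE` has already been run.
image_digest() {
    docker inspect --format '{{index .RepoDigests 0}}' "$1"
}

# Example usage (not run here):
#   docker pull circleci/python:3
#   image_digest circleci/python:3
```

If the printed digest differs from the one used by your last green build, the tag has been repointed, and the name@sha256:... form can be used in place of the tag.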

Thanks for your reply.

I tried as you suggested and used the last working tag: circleci/python@sha256:ebdf0ff085dbe11bee9f17c272c5311404b468d65121c5e412609cb0bedfb799, but it still fails (hangs at h5py). I will let the job continue for a while, but I suspect it will fail because the other failed jobs also hung at this point.

Edit: this has now also failed, after 29 minutes. The last successful run of this job, yesterday, took 2:35.

Edit edit: This cannot be a package version issue either since the job does not install anything from the requirements.txt file. What fails is the remote docker when building a new image. Can it be a network issue?

Steps to reproduce:

  • Create Dockerfile:
      FROM circleci/python:3
      COPY requirements.txt .
      RUN sudo pip install -r requirements.txt
      CMD python main.py
  • Create requirements.txt file:
      h5py==2.8.0
  • Build docker image:
      docker build -t test .

Your repro instructions would need readers to have a requirements.txt file, so they won’t be able to try that. Perhaps you could add your Docker build logs here, so we can see which line is failing?

Okay. Here is the shortest possible reproduction:

Dockerfile:

FROM circleci/python:3
RUN sudo pip install h5py==2.8.0

And it works just fine with another FROM:

FROM python:3
RUN pip install h5py==2.8.0

OK. How does it fail? Please add your Docker build logs, as I asked earlier.

Sorry I forgot the logs. They are quite long: 2658 lines. Here they are on hastebin: https://hastebin.com/widonopafa.sql (not sure why the link says .sql)

Looks like it’s something to do with gcc, but I’m not sure where to start looking in the logs.
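
One way to find a starting point in a log that long is to list the first lines that mention an error, together with their line numbers. This is a rough sketch, assuming the build output has been saved to a file; first_errors and its search pattern are illustrative guesses, not something pip or gcc defines.

```shell
# first_errors LOGFILE [COUNT]
# Print the first COUNT (default 20) lines of LOGFILE that mention
# "error", "fatal" or "failed" (case-insensitive), prefixed with their
# line numbers. The pattern is an assumption and may need widening.
first_errors() {
    fe_log_file="$1"
    fe_count="${2:-20}"
    grep -n -i -E 'error|fatal|failed' "$fe_log_file" | head -n "$fe_count"
}

# Example usage (build.log is a placeholder file name):
#   docker build -t test . > build.log 2>&1
#   first_errors build.log
```

The earliest match is usually the most useful one, since later errors are often just consequences of the first.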

Could you try some of the things on the installation page? The manual seems to recommend --no-binary, but there is also an installer called Conda.

Try playing around with the version number too - maybe there is an issue with that release.

I will try, but my guess is that the issue is with the network connectivity or something from within the circleci/python:3 image. Otherwise I should see the same issue with the python:3 image, shouldn’t I?

I’m not sure how the circleci job works with setup_remote_docker, but from within my circleci script I ask it to build an image based on the python:3 image (as specified in my docker file). Only the executor is circleci/python:3.

So if I can build a working image locally with python:3, I would expect the same from the circleci job that uses the same image. But somehow that is not the case, and I don’t understand why. :frowning:

The result is the same with --no-binary option.

I also just tried creating a new project where the job relies on python:3 as the executor. It’s the same result and same error.

Now I’m completely confused! Is the issue related to the underlying machines used for the circleci jobs?

Update: I just tried to take the Dockerfile contents for the image circleci/python:3.7 (found here) and add my pip install h5py==2.8.0 to the end instead of CMD ["/bin/sh"] to try and debug the problem.

My thought process was that if building a docker image on top of a given base fails, then simply extending that base’s own Dockerfile should also fail… right? The extended dockerfile can be seen here.

This runs fine and does not fail. I’m at a complete loss here :frowning: I don’t understand anything.

Update update: Just to make matters worse, I can now successfully build the FROM circleci/python:3.7 Dockerfile locally, but my colleague cannot?? I even tried building with --no-cache and it still works. Running on circleci as a job is still broken though.

I would suggest capturing the console output of each of your local build processes and running a diff between them. See what the differences are.
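
That comparison could look something like the sketch below. It assumes each build's output was saved to a file. It also assumes the classic docker build output, where lines such as " ---> Running in <id>" carry 12-character hex IDs that change on every run. normalize_log and diff_logs are hypothetical helper names.

```shell
# normalize_log FILE
# Replace per-run container/image IDs with a fixed token so that they
# do not show up as differences. Assumes classic `docker build` output.
normalize_log() {
    sed -E 's/(---> (Running in )?|Removing intermediate container )[0-9a-f]{12}/\1<id>/' "$1"
}

# diff_logs LOG_A LOG_B
# Print a unified diff of the two normalized logs.
# Returns 0 when they match and 1 when they differ, like diff itself.
diff_logs() {
    dl_a=$(mktemp)
    dl_b=$(mktemp)
    normalize_log "$1" > "$dl_a"
    normalize_log "$2" > "$dl_b"
    diff -u "$dl_a" "$dl_b"
    dl_status=$?
    rm -f "$dl_a" "$dl_b"
    return "$dl_status"
}

# Example usage (file names are placeholders):
#   docker build --no-cache -t test . > build-mine.log 2>&1
#   diff_logs build-mine.log build-colleague.log
```

The first hunk where the two logs diverge is a reasonable place to start reading.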