Docker Executor Infrastructure Upgrade

We’re seeing roughly 2x degraded performance, which has meant doubling our resource class, and we’re still getting killed processes.

This is an urgent priority for us, as it means our CI won’t run at all, on top of doubling our costs.
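
For reference, “doubling our resource class” just means bumping the job’s resource_class in .circleci/config.yml, along these lines (the image and class names here are illustrative, not our exact config):

```yaml
jobs:
  build:
    docker:
      - image: cimg/base:stable
    # previously: resource_class: large (4 vCPU / 8 GB)
    resource_class: xlarge   # 8 vCPU / 16 GB
    steps:
      - checkout
      - run: make test
```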

V2 run with double resource + killed:
bcafd6c0-65ca-4bf0-80a3-32f4a2a912cc

V2 run without doubling resource class:
2bd09640-83c3-4d62-8125-e27370b99d41

V1 - no issues:
fdb9393e-403d-485d-9bfc-07508c833996

We’d appreciate an engineer looking at this urgently, as well as clarification on whether you’ll issue billing corrections for the extra resources and failed runs. This is the kind of thing that erodes trust and costs an enormous amount of staff time while we can’t operate correctly.

Thanks @bpetetot,

Those two should help with our investigations.

Dom

Please opt our org out. Our builds are almost twice as slow now.

https://app.circleci.com/pipelines/github/tankfarm/tankfarm.io/10016/workflows/4edc2cdb-f1d1-43ae-aa0f-4e122a565e1d/jobs/59783

Please opt our organization out as well. We have not had our Cypress tests pass once since the upgrade.

7ff12f72-4604-4914-ae1f-33157fda029d
aefa059e-4a2c-4dd6-9372-215fb1d1e13b

Hi @capnfuzz,

Sorry you’re having issues.

I’ve opted your org out; it should take about 10 minutes to apply. Please let me know if you’re still seeing Jobs running on the V2 runtime after that.

Thanks,

Dom

Hi, we’re currently facing an issue where the first run always fails with signal: killed. After rerunning the CI, it passes. Can I get any advice on solving this?

https://app.circleci.com/pipelines/github/kaiachain/kaia/2330/workflows/3cff1f7c-05ef-4c6f-8b36-a70ec66ca9c5/jobs/12892
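
For what it’s worth, signal: killed usually means the process was OOM-killed when the container hit its memory limit. A minimal sketch of a step we could add to confirm peak usage after the tests (the cgroup path assumes cgroup v1 and may differ on the new runtime):

```yaml
      - run:
          name: Report peak container memory usage
          command: cat /sys/fs/cgroup/memory/memory.max_usage_in_bytes
          when: always
```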

Hi @hyeonLewis,

Sorry you’re having issues.

Thanks for providing that Job link. I have opted your org out of the rollout for now whilst we investigate.

Many thanks,

Dom

Hello, @DomParfitt. Can you please opt our org out as well? We are having multiple builds fail with JVM out-of-memory errors, and our CPU usage stays at 100%. Thank you!
https://app.circleci.com/pipelines/github/SectorLabs/cheetah/56105/workflows/039170d6-d6cb-4847-8f61-b2de34023431/jobs/901254
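
In case it helps the investigation: a common way to keep the JVM heap inside the container’s memory limit is a container-aware max-heap setting, roughly like the sketch below (image, resource class, and percentage are illustrative, not our exact setup):

```yaml
jobs:
  test:
    docker:
      - image: cimg/openjdk:17.0
    resource_class: large
    environment:
      # Cap the heap at 75% of the container's memory limit so the
      # JVM is less likely to be killed by the OOM killer.
      JAVA_TOOL_OPTIONS: -XX:MaxRAMPercentage=75.0
```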

Hi @ValentinCondurache,

Sorry you’re having issues.

I’ve just opted your org out. It may take around 10 minutes to apply but after that you should see your Jobs running on V1 again.

Many thanks,

Dom

Hello!
Our org seems to have been upgraded to the v2 container runtime the other day.
A CI run that used to take 6 or 7 minutes took 3 hours and ultimately failed.
Please opt our org out.

Docker image: cimg/ruby:3.2.2-browsers
Testing library: rspec (parallel_rspec)
Job: https://app.circleci.com/pipelines/github/smartcamp/boxil/38081/workflows/080c6315-e7af-4353-baea-50895b16e911/jobs/313312/parallel-runs/1

In addition, CI was also failing after updating the image to cimg/ruby:3.2-browsers.
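
For clarity, the image change mentioned above is just the executor tag in .circleci/config.yml, roughly:

```yaml
    docker:
      # originally pinned to a patch release:
      # - image: cimg/ruby:3.2.2-browsers
      # later switched to the floating minor tag:
      - image: cimg/ruby:3.2-browsers
```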

Hi @maaaaakoto35,

Sorry to hear that! I’ve opted out the boxil project whilst we investigate.

Dom

Hey @DominicLavery
We are seeing tests failing without any apparent reason (it seems related to a general slowdown in some cases).

https://app.circleci.com/pipelines/github/Consensys/teku/35887/workflows/2c80f3cd-ce08-4d3f-a8fe-cb783899a162/jobs/270349

Any hint?

It has been happening systematically since yesterday.

Hi @tbenr,

It’s a tricky one. It doesn’t line up with when the project was opted in to v2, but looking at the timeline of your issues it could be related to a small bug that was introduced yesterday.

A fix for it has just been rolled out; would you be able to retry your failing job, please?

Thanks
Dom

Okay, retrying.

Still failing.
I suspect it is related to slow CPU: another job that normally takes 13 minutes is now taking 31 minutes.

Can someone please opt our organization out? We are experiencing some issues with Rails and Capybara.

Our org id is: 51d1ac41-f636-4691-a993-7440a6b5b8d7

Sorry @tbenr. I’ve fully reverted the change from yesterday. Please could you give your build one more go?

Hi @tiagobabo.

Sorry to hear that. Could you please provide a link to an affected job and some details about what you are seeing?

Thanks
Dom

Yes, here it is: https://app.circleci.com/pipelines/github/carwow/quotes_site/93051/workflows/41b15f02-70ab-4f46-bece-5b32b1195482/jobs/2032725

We started seeing these failures across our Capybara specs.