CircleCI is upgrading the kernel version of the operating system that is used to run customer containers as part of the Docker executor on CircleCI. This change is critical to ensuring the underlying infrastructure that runs your jobs continues to provide reliable and performant execution.
This upgrade should be effectively invisible to customers. The current kernel version used is 4.15 and it will be upgraded to 5.4.
CircleCI is gradually rolling out this change for all Docker executor customers beginning on December 1, 2021.
- CircleCI does not expect this to be a breaking change for the overwhelming majority of Docker executor jobs as the jobs are isolated in their own containers.
- There is a small chance that a job results in an Out of Memory (OOM) failure more frequently because of a change in the newer kernel’s OOM Killer. If a customer is experiencing more frequent OOM errors with the newer kernel version, CircleCI recommends upgrading to a resource class that offers more memory.
- You can check which kernel version a job uses by running “uname -r” in a step during execution.
- If there are other issues observed as a result of this change, please comment on this post immediately so they can be analyzed and addressed.
- Once the kernel version is upgraded to 5.4 for 100% of customers, CircleCI will begin the gradual roll-out process of updating the full underlying operating system that is used to run customer containers as part of the Docker executor from Ubuntu 18.04 to 20.04.
- Similar communication will be shared at that time.
- This upgrade will not impact Remote Docker jobs.
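As a rough illustration of the kernel check mentioned in the list above, here is a minimal Ruby sketch that reads the same release string “uname -r” prints and reports whether the job is on the 5.4 series. The helper name kernel_series is hypothetical and not part of any CircleCI tooling; it assumes the release string starts with a major.minor number.

```ruby
require "etc"

# Hypothetical helper: extract [major, minor] from a kernel release string
# such as the one printed by `uname -r`.
def kernel_series(release)
  match = release.match(/\A(\d+)\.(\d+)/)
  raise ArgumentError, "unrecognized kernel release: #{release.inspect}" unless match

  [match[1].to_i, match[2].to_i]
end

release = Etc.uname[:release] # same value that `uname -r` reports
major, minor = kernel_series(release)
puts "Kernel release: #{release}"
puts "This job is running on the 5.4 kernel series" if [major, minor] == [5, 4]
```

Running “uname -r” directly in a job step gives the same information; the script form is only useful if you want to log or branch on the version from within a Ruby test suite.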
Please comment on this post or reach out to your support contact if there are any questions about this change. Thank you!
Hi Sebastian! It seems like this upgrade hit our team recently and it caused some issues.
In particular, we’re a Rails app using ActiveStorage and we’ve started getting ActiveStorage::IntegrityError failures on many of our tests that touch ActiveStorage.
It seems there’s an issue with Docker and certain versions of the Linux kernel, which is well described in this GitHub issue: Very specific set of circumstances leads to zero-byte (empty) file being created · Issue #1015 · docker/for-linux · GitHub
We’re seeing the same things that person described; when Rails tries to copy a file from /tmp/something to /tmp/something-else with IO.copy_stream, the destination file is created but is zero bytes.
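For anyone who wants to probe for this symptom, a minimal sketch of a sanity check follows. It writes a small file to a temporary directory, copies it with IO.copy_stream, and compares sizes. The helper name copy_stream_sizes and the file names are hypothetical, and because the linked issue describes a very specific set of circumstances, a passing result here does not prove a job is unaffected.

```ruby
require "tmpdir"

# Hypothetical check: write a small file, copy it with IO.copy_stream,
# and report the source and destination sizes in bytes.
def copy_stream_sizes(dir, payload: "x" * 1024)
  src = File.join(dir, "source.bin")
  dst = File.join(dir, "copy.bin")
  File.binwrite(src, payload)
  IO.copy_stream(src, dst) # both arguments may be file names
  { source_bytes: File.size(src), copied_bytes: File.size(dst) }
end

Dir.mktmpdir do |dir|
  sizes = copy_stream_sizes(dir)
  if sizes[:copied_bytes] == sizes[:source_bytes]
    puts "copy looks intact (#{sizes[:copied_bytes]} bytes)"
  else
    puts "size mismatch: #{sizes.inspect}"
  end
end
```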
For now, we’ve worked around it in our test suite by defining a custom ActiveStorage service that doesn’t use IO.copy_stream when writing from file → file, but for the sake of others on the platform it might be good to figure out why this is happening and if a different kernel version would help.
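The core of a workaround like this can be sketched as a file-to-file copy that uses plain read and write calls rather than IO.copy_stream. This is only an illustration, not the actual service from the post: the helper name copy_without_copy_stream and the chunk size are assumptions, and wiring it into a custom ActiveStorage service is not shown.

```ruby
require "tmpdir"

# Hypothetical helper: copy src_path to dst_path with explicit read/write
# calls instead of IO.copy_stream. Returns the number of bytes written.
def copy_without_copy_stream(src_path, dst_path, chunk_size: 64 * 1024)
  bytes = 0
  File.open(src_path, "rb") do |src|
    File.open(dst_path, "wb") do |dst|
      # IO#read with a length returns nil at end of file, ending the loop.
      while (chunk = src.read(chunk_size))
        bytes += dst.write(chunk)
      end
    end
  end
  bytes
end

Dir.mktmpdir do |dir|
  src = File.join(dir, "something")
  dst = File.join(dir, "something-else")
  File.binwrite(src, "example payload")
  written = copy_without_copy_stream(src, dst)
  puts "copied #{written} bytes, destination is #{File.size(dst)} bytes"
end
```

The chunked loop keeps memory use bounded for large uploads; for small test fixtures, reading the whole file at once would work just as well.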
@tgrathwell Thank you so much for the detailed report and for bringing this to our attention. We’re looking into ways to resolve this issue swiftly.
The option we’re exploring involves expediting our schedule to upgrade the entire underlying OS to Ubuntu 20.04, which would include the 5.11 kernel where this issue is fixed. I should have more information within the next couple of days on whether this option is feasible in the immediate short term. I will update this thread as I learn more.
In parallel, we’re trying to get the fix backported to 18.04’s 5.4 kernel: Bug #1953199 “0-byte files created in overlay filesystem” : Bugs : linux-base package : Ubuntu
To add to the googleability here:
We suddenly started getting failed tests in CircleCI (but not locally) where the “Prawn” Ruby library failed to render images stored on disk using the “CarrierWave” library.
CircleCI support suggested the issue could be due to this upgrade.
The errors were specifically “Prawn::Errors::UnsupportedImageType” because the images were empty on disk.
We were able to work around it by creating (FactoryBot.create) instead of just building the records storing the images.
This upgrade and the upgrade of the underlying operating system described in this post: Upgrading the underlying operating system for the Docker executor - 18.04 to 20.04 are now complete.
The upgrade to the underlying OS included an upgrade of the kernel from 5.4 to 5.11. You can check the kernel version being used by a job by running “uname -r” in a step.
If you are experiencing issues with the new kernel and/or OS, please comment below.
Thanks, @sebastian-lerner. I removed our workaround and CI is still happy, so the upgrade appears to have fixed our issue.
Details on the performance improvements from this latest OS upgrade are available here.