Orb running out of entropy while signing jars

jasperroel · May 10, 2023, 2:17pm

During our build we sign and timestamp (with the help of an external TSA) a list of jars in rapid succession. Most of the time, this takes a few minutes.

Over the last few months, we’ve also seen instances where the same operation takes hours. We thought the external call was slow, but after re-running the job with SSH we can see that the VM (orb) is actually running out of available entropy (watched via /proc/sys/kernel/random/entropy_avail) and blocking.

Others have experienced something similar (see "Extremely slow jarsigner on Centos7 build server" from 2016), and we seem to be running into the same issue here.

The solution there seems to be to run a service that seeds /dev/random, but the filesystem on CircleCI is read-only and I cannot get that tool to work.

Most of the time (when builds are fine), entropy_avail reads 256, but during the slow builds we see it drop to somewhere between 0 and 50.
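For anyone who wants to capture this in the build log rather than over SSH, here is a minimal sketch of a sampler. The function name log_entropy and its optional file argument are my own (hypothetical); the only thing taken from the above is the /proc path.

```shell
# Print the kernel's available-entropy estimate with a UTC timestamp.
# The optional argument exists only so the function can be pointed at a
# test file; by default it reads the path mentioned above.
log_entropy() {
  local file="${1:-/proc/sys/kernel/random/entropy_avail}" value
  if [ -r "$file" ]; then
    value="$(cat "$file")"
  else
    value="unavailable"
  fi
  printf '%s entropy_avail=%s\n' "$(date -u +%H:%M:%S)" "$value"
}

# Hypothetical use in a build step: sample once per second in the
# background while signing runs, then stop the sampler.
#   while true; do log_entropy; sleep 1; done &
#   sampler_pid=$!
#   <signing command>
#   kill "$sampler_pid"
log_entropy
```

Logging the value next to the signing output should make it easier to tell whether a slow run lines up with a low reading.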

Has somebody dealt with this before? Is this a CircleCI problem, or is there something we can do to prevent the entropy from running out during our builds?

rit1010 · May 10, 2023, 3:30pm

To be clear, these are not ‘solutions’ but possible ‘workarounds’. With luck, someone from within CircleCI can provide more long-term help.

Does the signing tool you are using allow you to specify /dev/urandom instead of /dev/random?
Are you able to move the process into a docker-based container that you run within the CircleCI instance? By doing this you would be able to remap /dev/random to /dev/urandom.
Have you considered a self-hosted runner for this task as it would provide you with the low-level access you need to modify /dev/random with the third party tools?
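A sketch of the Docker remapping idea, assuming a Docker daemon is available to the job (I have not verified this on a CircleCI Docker executor). The helper name remapped_run_cmd and the script ./sign-jars.sh are hypothetical, and cimg/openjdk:18.0 is only an example image. The helper just prints the command so you can review it before running anything.

```shell
# Build (and only print) a docker run command that bind-mounts the host's
# /dev/urandom over /dev/random inside the container.
# Usage: remapped_run_cmd <image> <command> [args...]
remapped_run_cmd() {
  local image="$1"
  shift
  printf 'docker run --rm -v /dev/urandom:/dev/random %s %s\n' "$image" "$*"
}

# Placeholder image and script, shown for illustration only.
remapped_run_cmd cimg/openjdk:18.0 ./sign-jars.sh
# To run it for real, invoke docker directly:
#   docker run --rm -v /dev/urandom:/dev/random cimg/openjdk:18.0 ./sign-jars.sh
```

With the bind mount in place, anything inside the container that opens /dev/random is actually reading the host's /dev/urandom.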

Also, you do not state which OS image you are deploying as your CircleCI system. The Linux kernel version can have a major impact on the way that /dev/random operates when combined with a CSPRNG library install/init.

jasperroel · May 10, 2023, 4:41pm

Thanks @rit1010 for your suggestions. I can offer some of our context and will discuss your suggestions with the team!

We are running on Executor / Resource Class: Docker / X-Large
The environment/orb in CircleCI we’re using is cimg/openjdk:18.0:

Build-agent version 1.0.171204-ffb19313 (2023-05-09T14:27:45+0000)
System information:
 Server Version: 20.10.18
 Storage Driver: overlay2
  Backing Filesystem: xfs
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Kernel Version: 5.15.0-1030-aws
 Operating System: Ubuntu 20.04.5 LTS
 OSType: linux
 Architecture: x86_64

Starting container cimg/openjdk:18.0

We’re basically using the standard jarsigner from openjdk:18, which internally uses the default SecureRandom implementation (as far as I can tell). So I don’t believe I can tell it to use a different source.

An “issue” is also reproducibility - sometimes it just works, and I can’t actually get it into a “broken” state (which is good for the build, but not for debugging the issue).

rit1010 · May 10, 2023, 8:21pm

/dev/random has in the past suffered badly at times because it collected/collects entropy from disk, mouse, and keyboard events - all things in limited supply in a Docker container, in a VM, on a dedicated server with no directly connected hard disks, mice or keyboards.
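A rough way to check whether /dev/random is actually blocking on a given executor is to attempt a small, time-limited read from each device. This is only a sketch: it assumes GNU coreutils (timeout and head -c) are installed, and probe_device is a made-up name.

```shell
# Try to read a number of bytes from a device within a time limit and
# report whether the read completed.
# Usage: probe_device <device> [bytes] [seconds]
probe_device() {
  local device="$1" bytes="${2:-64}" limit="${3:-5}"
  if timeout "$limit" head -c "$bytes" "$device" > /dev/null 2>&1; then
    echo "$device: read $bytes bytes within ${limit}s"
  else
    echo "$device: no $bytes bytes within ${limit}s (blocked or unreadable)"
  fi
}

probe_device /dev/urandom
probe_device /dev/random
```

If the second line reports a timeout while the first succeeds, the blocking is on the /dev/random side rather than in the external call.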

In terms of Java, you can pass the following link on to whoever understands Java security the best and whoever signs off changes to your security model.

 https://www.baeldung.com/java-security-egd

java-security-egd basically allows you to change the source file for entropy when a Java application is run. NOT TO BE USED unless you fully understand what that means. Personally, I am aware that the feature is reported as existing, but I provide no statement regarding it being used as a solution for your needs.

rit1010 · May 10, 2023, 8:32pm

If you start looking at a self-hosted runner, moving up from Ubuntu 20.04.5 LTS will likely solve the issue. 20.04.5 is based on Linux kernel 5.15. Wikipedia reports that a change was made from version 5.6, with a link to the following

This does not help with the CircleCI provided images as all the openjdk 18 images seem to be based on 20.04.5.

jasperroel · May 11, 2023, 8:16am

Many thanks @rit1010. After looking over the issue and reading up on the various problems that cloud environments can have with /dev/random, we have indeed moved over to using /dev/urandom.

Since we’re using jarsigner, the way to use it is slightly different from “normal”, but not overly complicated.

Basically, it becomes jarsigner -J-Djava.security.egd=file:/dev/./urandom.
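Spelled out a bit more, a loop over the jars could look roughly like the sketch below. KEYSTORE, KEY_ALIAS and TSA_URL are placeholders (not our real values), sign_jars and DRY_RUN are made-up names, and keystore password handling is left out on purpose.

```shell
# Sign every jar in a directory, passing the egd property through to the
# JVM that jarsigner starts (-J forwards an option to that JVM).
# With DRY_RUN set, the commands are printed instead of executed.
sign_jars() {
  local dir="$1" jar
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    ${DRY_RUN:+echo} jarsigner \
      -J-Djava.security.egd=file:/dev/./urandom \
      -keystore "$KEYSTORE" \
      -tsa "$TSA_URL" \
      "$jar" "$KEY_ALIAS"
  done
}

# Placeholder values, shown for illustration only.
KEYSTORE=build.jks
KEY_ALIAS=release
TSA_URL=https://tsa.example.com
DRY_RUN=1
sign_jars ./dist
```

Unset DRY_RUN to run jarsigner for real.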

Or in case of using ant:

<signjar ...>
  <path>
    <fileset includes="*.jar" />
  </path>
  <sysproperty key="java.security.egd" value="file:/dev/./urandom" />
</signjar>

Another option to force the JVM over to urandom is to update $JAVA_HOME/conf/security/java.security and set the securerandom.source=file:/dev/urandom option. But for now just getting jarsigner to behave is good enough.
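We have not gone down that route ourselves, so treat the following as an untested sketch of the java.security edit. set_securerandom_source is a made-up helper, and the demo runs against a scratch file rather than the real $JAVA_HOME/conf/security/java.security, which would need to be writable.

```shell
# Set securerandom.source in a java.security-style properties file.
# Replaces an existing uncommented entry, or appends one if none exists.
# Usage: set_securerandom_source <file> <value>
set_securerandom_source() {
  local file="$1" value="$2" tmp
  tmp="$(mktemp)"
  if grep -q '^securerandom\.source=' "$file"; then
    sed "s|^securerandom\.source=.*|securerandom.source=$value|" "$file" > "$tmp"
  else
    cat "$file" > "$tmp"
    printf 'securerandom.source=%s\n' "$value" >> "$tmp"
  fi
  cat "$tmp" > "$file"   # overwrite in place, keeping the file's permissions
  rm -f "$tmp"
}

# Demo on a scratch copy with a sample entry.
demo="$(mktemp)"
printf '# sample file\nsecurerandom.source=file:/dev/random\n' > "$demo"
set_securerandom_source "$demo" "file:/dev/urandom"
grep '^securerandom\.source=' "$demo"
rm -f "$demo"
```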

Many thanks for your suggestions and help!

rit1010 · May 11, 2023, 11:24am

Thanks for posting a concrete example of what you ended up implementing, as this will greatly help the next person who is impacted by the issue.

system · May 18, 2023, 11:25am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
