Orb running out of entropy while signing jars

During our build we sign and timestamp (with the help of an external TSA) a list of jars rapidly. Most of the time, this takes a few minutes.

For the last few months, we’ve also seen runs where the same operation takes hours. We initially thought the external call was slow, but after re-running the job with SSH we can see that the VM (orb) is actually running out of available entropy (watched via /proc/sys/kernel/random/entropy_avail) and blocking.

Others have experienced something similar (see "Extremely slow jarsigner on Centos7 build server" from 2016), and we seem to be running into the same issue here.

The solution there seems to be to run a service to seed /dev/random, but the filesystem on CircleCI is readonly and I cannot get that tool to work.

Most of the time (when builds are fine), entropy_avail reads 256, but during the slow builds we see it drop to somewhere between 0 and 50.
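The entropy_avail check described above can be scripted rather than watched by hand. The following is a minimal sketch, not something taken from our actual build: the class name EntropyProbe, the ten-sample loop and the one-second interval are all assumptions chosen for illustration. It only reads the counter from /proc/sys/kernel/random/entropy_avail and prints it with a timestamp.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalTime;

// Hypothetical helper: samples the kernel's entropy counter once per second.
final class EntropyProbe {

    // The file holds a single integer followed by a newline.
    static int readEntropy(Path source) throws IOException {
        return Integer.parseInt(Files.readString(source).trim());
    }

    public static void main(String[] args) throws Exception {
        Path source = Path.of("/proc/sys/kernel/random/entropy_avail");
        // Ten samples is an arbitrary choice for the sketch.
        for (int i = 0; i < 10; i++) {
            System.out.println(LocalTime.now() + " entropy_avail=" + readEntropy(source));
            Thread.sleep(1000);
        }
    }
}
```

With JDK 11 or later this can be started directly with `java EntropyProbe.java`, for example in a background step while the signing runs, so the log shows whether the counter drops at the moment the build stalls.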

Has anybody dealt with this before? Is this a CircleCI problem, or is there something we can do to prevent the entropy from running out during our builds?

To be clear, these are not ‘solutions’ but possible ‘workarounds’. With luck, someone from within CircleCI can provide more long-term help.

  • Does the signing tool you are using allow you to specify /dev/urandom instead of /dev/random?

  • Are you able to move the process into a docker-based container that you run within the CircleCI instance? By doing this you would be able to remap /dev/random to /dev/urandom.

  • Have you considered a self-hosted runner for this task as it would provide you with the low-level access you need to modify /dev/random with the third party tools?

Also, you do not state which OS image you are deploying as your CircleCI system. The Linux kernel version can have a major impact on the way that /dev/random operates when combined with a CSPRNG library install/init.

Thanks @rit1010 for your suggestions. I can offer some of our context and will discuss your suggestions with the team!

We are running on Executor / Resource Class: Docker / X-Large
The environment/orb in CircleCI we’re using is cimg/openjdk:18.0:

Build-agent version 1.0.171204-ffb19313 (2023-05-09T14:27:45+0000)
System information:
 Server Version: 20.10.18
 Storage Driver: overlay2
  Backing Filesystem: xfs
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Kernel Version: 5.15.0-1030-aws
 Operating System: Ubuntu 20.04.5 LTS
 OSType: linux
 Architecture: x86_64

Starting container cimg/openjdk:18.0

We’re basically using the standard jarsigner from openjdk:18, which internally uses the default SecureRandom implementation (as far as I can tell). So I don’t believe I can tell it to use a different source.
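To check what the JVM in the image actually resolves as its default, something like the following sketch could be run inside the container. It is only an illustration: RandomSourceReport and describe are made-up names, and I make no claim about what it prints on cimg/openjdk:18.0. The calls themselves (SecureRandom.getAlgorithm(), Security.getProperty("securerandom.source") and the java.security.egd system property) are standard JDK.

```java
import java.security.SecureRandom;
import java.security.Security;

// Hypothetical helper: reports which SecureRandom the JVM picks by default
// and which seed-source settings are currently in effect.
final class RandomSourceReport {

    static String describe() {
        SecureRandom rng = new SecureRandom();
        return "algorithm=" + rng.getAlgorithm()
            + " provider=" + rng.getProvider().getName()
            // Value from the java.security configuration file (may be null).
            + " securerandom.source=" + Security.getProperty("securerandom.source")
            // Command-line override, null unless -Djava.security.egd=... was passed.
            + " java.security.egd=" + System.getProperty("java.security.egd");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once plainly and once with a -Djava.security.egd=... argument shows whether an override is visible to the JVM at all.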

An “issue” is also reproducibility - sometimes it just works, and I can’t actually get it into a “broken” state (which is good for the build, but not for debugging the issue).

/dev/random has in the past suffered badly at times because it collected/collects entropy from disk, mouse, and keyboard events - all things in limited supply in a docker container, in a VM on a dedicated server with no directly connected hard disks, mice or keyboards.

In terms of Java, you can pass the following link on to whoever understands Java security the best and whoever signs off changes to your security model.

 https://www.baeldung.com/java-security-egd

The java.security.egd property basically allows you to change the source file for entropy when a Java application is run. NOT TO BE USED unless you fully understand what that means. Personally, I am aware that the feature is reported as existing, but I make no statement about whether it is a suitable solution for your needs.

If you start looking at a self-hosted runner, moving up from Ubuntu 20.04.5 LTS will likely solve the issue. 20.04.5 is based on Linux kernel 5.15; Wikipedia reports that a change was made from version 5.6, with a link to the following

This does not help with the CircleCI provided images as all the openjdk 18 images seem to be based on 20.04.5.

Many thanks @rit1010. After looking over the issue and reading up on the various problems that cloud environments can have with /dev/random, we have indeed moved over to using /dev/urandom.

Since we’re using jarsigner, the way to use it is slightly different from “normal”, but not overly complicated.

Basically, it becomes jarsigner -J-Djava.security.egd=file:/dev/./urandom.

Or in case of using ant:

<signjar ...>
  <path>
    <fileset includes="*.jar" />
  </path>
  <sysproperty key="java.security.egd" value="file:/dev/./urandom" />
</signjar>

Another option to force the JVM over to urandom is to update $JAVA_HOME/conf/security/java.security and set the securerandom.source=file:/dev/urandom option. But for now just getting jarsigner to behave is good enough.
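One way to check whether either override actually reaches a JVM is to time a seed request with and without it. This is a rough sketch under assumptions: SeedTimer and millisToSeed are made-up names, 32 bytes is an arbitrary size, and it only exercises SecureRandom.generateSeed, which may or may not be the exact call jarsigner blocks on.

```java
import java.security.SecureRandom;

// Hypothetical helper: measures how long the JVM takes to hand out seed bytes.
final class SeedTimer {

    // Returns the elapsed wall-clock time of one generateSeed call, in milliseconds.
    static long millisToSeed(SecureRandom rng, int numBytes) {
        long start = System.nanoTime();
        rng.generateSeed(numBytes);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Null unless the property was passed on the command line.
        System.out.println("java.security.egd=" + System.getProperty("java.security.egd"));
        System.out.println("generateSeed(32) took "
            + millisToSeed(new SecureRandom(), 32) + " ms");
    }
}
```

Running `java -Djava.security.egd=file:/dev/./urandom SeedTimer.java` and then the same command without the flag, on a machine that is showing the slow behaviour, gives a quick comparison.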

Many thanks for your suggestions and help!

Thanks for posting a concrete example of what you ended up implementing as this will greatly help the next person who is impacted by the issue.
