We are in the process of setting up our large project workflow in CircleCI 2.0, and we ran across an intermittent issue where some of our build binaries return an “Illegal Instruction” error.
Logging into the containers and debugging reveals that the errors are due to the underlying CircleCI 2.0 server hardware not supporting the AVX2 and BMI2 instruction sets, which Intel introduced in Q2 2013 as part of its Haswell microarchitecture and which have been present (and expanded) in all Intel server CPUs since then.
The errors have nothing to do with caching or workspace persistence; rather, the CircleCI 2.0 server chosen to build/execute our code does not support these instruction sets from 2013.
Here are the errors seen:
Support for MULX missing (part of the BMI2 instruction set introduced in Haswell in 2013):
We have tried running our binaries on CircleCI entirely within the same job (a workflow with a single job/container), and it still fails randomly with the illegal instruction error (I guess depending on whether the underlying server is pre-2014 or not).
Ideally, the ability to select an {{ arch }} for the entire workflow would solve the incompatibility issue, as long as the {{ arch }} covers at least the architectures of the past 5 years (Haswell, Broadwell, Skylake, Kaby Lake, Coffee Lake?).
This is impacting our ability to test our product in CircleCI 2.0. Any assistance would be greatly appreciated!
It is possible that your compiler needs to be upgraded. For example, GCC only started supporting AVX2 in v4.7. Here’s a reference link: https://gcc.gnu.org/gcc-4.7/changes.html
In my case, /proc/cpuinfo showed that CircleCI 2.0’s build server had neither the AVX2 nor the BMI2 instruction set (correlating the CPU type, Xeon E5-2680, with AWS instance types suggests the CircleCI servers were most likely c3 generation servers).
What I mean, though, is that I build a binary with AVX2 support, then move the binary onto the machine and get an illegal instruction error even if the machine supports AVX2.
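For anyone who wants to repeat the /proc/cpuinfo check, here is a minimal sketch. The cpu_has_flag helper is a name I made up for illustration; it only looks at the first "flags" line, on the assumption that all cores on the host report the same set.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named CPU flag appears in a cpuinfo file.
# The second argument defaults to /proc/cpuinfo; it is a parameter only so the
# check can also be run against a saved copy from another machine.
cpu_has_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # The flags line repeats once per logical CPU, so the first one is enough.
    # Split it into one token per line and require an exact match, so that
    # "avx" is not mistaken for "avx2".
    grep -m1 '^flags' "$cpuinfo" | tr ' \t' '\n\n' | grep -qx "$flag"
}

# Report the two instruction sets discussed in this thread.
for f in avx2 bmi2; do
    if cpu_has_flag "$f"; then
        echo "$f: supported"
    else
        echo "$f: MISSING"
    fi
done
```

Run as the first step of a job, something like this would at least turn a random "Illegal Instruction" into a clear message about which host the build landed on.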
I assume you’re on the Docker executor, @rcfaria01. I think Machine is closer to bare metal, and I wonder if that would fix this? It’s a free option, for now at least. AFAICR, Circle is based on AWS, and their CPUs would be bang up to date.
Hi Jon. Indeed, I have not tried that option, and it is certainly a good suggestion. Thanks!
What still concerns me, though, is that the AWS servers that get allocated by CircleCI for my builds are always c3 servers (3rd generation servers, based on the CPU type in /proc/cpuinfo). However, AWS only introduced AVX2 support with instances from c4 onwards (and AVX512 with c5, last year).
My concern is that even if I have docker machine, or even bare-metal server access, if the CPU (Xeon 2650) doesn’t support AVX2, it will not properly execute the test binaries.
Righto. I wonder if CircleCI would need some sort of allocation system to pick server types from the farm based on CPU/hardware requirements. I don’t know what types they have in their farm at present. Could that perhaps be logged as an idea?
Hi @giacomodabisias, I think one possibility for your particular scenario is that the CircleCI AMI on which your binary is placed is running in paravirtual mode (pure PV, or hybrid PV on HVM). If this is the case, AVX2 will not work properly.
Also, you can try checking which instruction sets the server you are running the binary on supports by executing this command: gcc -O2 -march=native -E -v - </dev/null 2>&1 | grep cc1
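If the cc1 line is hard to read, a variation is to ask GCC which target options -march=native switches on, using its -Q --help=target listing. This is only a sketch: target_option_enabled is a made-up helper name, and it assumes the listing prints each option followed by [enabled] or [disabled], which may differ between GCC versions.

```shell
#!/bin/sh
# Hypothetical helper: reads `gcc -Q --help=target` output on stdin and succeeds
# only if the given option (e.g. -mavx2) is reported as [enabled].
target_option_enabled() {
    awk -v opt="$1" '$1 == opt { found = ($2 == "[enabled]") } END { exit !found }'
}

# Check the two options relevant to this thread on the current host.
for opt in -mavx2 -mbmi2; do
    if gcc -march=native -Q --help=target 2>/dev/null | target_option_enabled "$opt"; then
        echo "$opt: enabled by -march=native"
    else
        echo "$opt: not enabled (or gcc unavailable)"
    fi
done
```

Note that this reports what the compiler detects on the machine where it runs, so it needs to be executed on the server that runs the binary, not only on the one that built it.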