I followed the documentation's canonical example of how to use the GPU instances (Using the GPU execution environment - CircleCI) and launched the following job:
version: 2.1
jobs:
  build:
    machine:
      image: ubuntu-2004-cuda-11.4:202110-01
    steps:
      - run: nvidia-smi
but it failed with the error:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Exited with code exit status 9
and I don't understand why. Running a couple of ls -al commands, for example ls -al /usr/local/cuda-11.4, makes it seem like the drivers and the toolkit are installed but the instance itself doesn't have a GPU (meaning the error is genuine and not a side effect of some other failure).
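For comparison, here is a minimal sketch of the same job with an explicit GPU resource class, which the job I ran does not set. It assumes that gpu.nvidia.small is available on the plan and that a missing resource class could be why the instance has no GPU; I have not confirmed that this is the cause. The lspci step is an extra diagnostic of my own (assuming lspci is present on the image) to check whether any NVIDIA device is visible at all.

```yaml
version: 2.1
jobs:
  build:
    machine:
      image: ubuntu-2004-cuda-11.4:202110-01
    # Assumption: without a GPU resource class, the machine executor
    # may be scheduled on a default instance that has no GPU attached.
    resource_class: gpu.nvidia.small
    steps:
      # Diagnostic (not part of the documented example): look for an
      # NVIDIA device on the PCI bus before calling nvidia-smi.
      - run: lspci | grep -i nvidia || echo "no NVIDIA device visible"
      - run: nvidia-smi
```

If the lspci step prints the fallback message, the instance really has no GPU and the nvidia-smi error is expected.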