Nvidia-smi fails on image: ubuntu-2004-cuda-11.4:202110-01

DinoBektesevic · November 10, 2022, 7:23pm

Following the canonical example in the documentation on how to use GPU instances (Using the GPU execution environment - CircleCI), I launched the following job:

version: 2.1

jobs:
  build:
    machine:
      image: ubuntu-2004-cuda-11.4:202110-01
    steps:
      - run: nvidia-smi

but it failed with the error:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Exited with code exit status 9

and I don’t understand why. A few ls -al checks, namely ls -al /usr/local/cuda-11.4, make it seem like the drivers and the toolkit are installed, but the instance itself doesn’t have a GPU (meaning the error is real and not an artifact of a different problem).

DinoBektesevic · November 14, 2022, 5:50pm

I heard back from customer support on this issue, so I thought I’d post the answer for anyone else who runs into the same problem:

When specifying a GPU machine image, it is also necessary to include a special resource class in your config.yml file.

You can see a list of available resource classes on the following page:

Using the GPU execution environment - CircleCI
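For illustration, here is a sketch of how the job from my first post might look with a resource class added. The class name gpu.nvidia.small is an assumption on my part, so check the list on the page above for the names currently available to your plan:

```yaml
version: 2.1

jobs:
  build:
    machine:
      image: ubuntu-2004-cuda-11.4:202110-01
    # Assumed class name; pick one from the GPU resource class list in the docs.
    resource_class: gpu.nvidia.small
    steps:
      - run: nvidia-smi
```

Without a GPU resource class the job appears to land on a regular machine with no GPU attached, which would explain why nvidia-smi could not reach the driver.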

Using GPU-enabled resource classes also requires a paid Scale plan as mentioned on the following page:

CircleCI Resource Classes - CircleCI

An error message regarding the plan type should appear in the UI if you try to add the GPU resource class to your job.

Thanks to customer support for the quick response!
