How do I select which GPU to run a job on?

Question

In a multi-GPU computer, how do I designate which GPU a CUDA job should run on?

As an example, when installing CUDA, I opted to install the NVIDIA CUDA Samples, then ran several instances of the nbody simulation, but they all ran on one GPU, GPU 0; GPU 1 was completely idle (monitored using watch -n 1 nvidia-smi). Checking CUDA_VISIBLE_DEVICES using echo $CUDA_VISIBLE_DEVICES, I found this was not set. I tried setting it using CUDA_VISIBLE_DEVICES=1 and then running nbody again, but it also went to GPU 0.

I looked at the related question, "how to choose designated GPU to run CUDA program?", but the deviceQuery command is not in the CUDA 8.0 bin directory. In addition to $CUDA_VISIBLE_DEVICES, I saw other posts refer to the environment variable $CUDA_DEVICES, but it was not set and I did not find information on how to use it.

While not directly related to my question, using nbody -device=1 I was able to get the application to run on GPU 1, but using nbody -numdevices=2 did not run on both GPU 0 and 1.

I am testing this on a system using the bash shell, on CentOS 6.8, with CUDA 8.0, 2 GTX 1080 GPUs, and NVIDIA driver 367.44.

I know that when writing CUDA code you can manage and control which CUDA resources to use, but how would I manage this from the command line when running a compiled CUDA executable?

User · Accepted Answer

The problem was caused by not setting the CUDA_VISIBLE_DEVICES variable within the shell correctly.

To specify CUDA device 1, for example, you would set CUDA_VISIBLE_DEVICES using

export CUDA_VISIBLE_DEVICES=1

or

CUDA_VISIBLE_DEVICES=1 ./cuda_executable

The former sets the variable for the life of the current shell; the latter sets it only for the lifespan of that particular executable invocation.

If you want to specify more than one device, use

export CUDA_VISIBLE_DEVICES=0,1

or

CUDA_VISIBLE_DEVICES=0,1 ./cuda_executable
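The per-invocation form above can also be reproduced from a launcher script. The following is a minimal sketch in Python, assuming you start the CUDA program as a child process; the helper names env_for_gpus and run_on_gpus are made up for this illustration, and the child command here is only a stand-in that echoes the variable so the sketch runs without a GPU.

```python
import os
import subprocess
import sys


def env_for_gpus(gpu_ids):
    """Return a copy of the current environment restricted to the given GPU indices."""
    env = os.environ.copy()
    # Environment values must be strings; several devices are comma-separated.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def run_on_gpus(cmd, gpu_ids):
    """Run cmd (a list of arguments) with CUDA_VISIBLE_DEVICES set for that child only."""
    return subprocess.run(cmd, env=env_for_gpus(gpu_ids), capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in child process; replace with e.g. ["./cuda_executable"] on a real system.
    child = [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
    print(run_on_gpus(child, [1]).stdout.strip())
    print(run_on_gpus(child, [0, 1]).stdout.strip())
```

Like CUDA_VISIBLE_DEVICES=1 ./cuda_executable, this leaves the parent's environment untouched, so each launched process can be given a different device list.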

User · Answer

You can also set the GPU on the command line so that you don't need to hard-code the device into your script (which may fail on systems without multiple GPUs). Say you want to run your script on GPU number 5: you can type the following on the command line and it will run your script just this once on GPU 5:

CUDA_VISIBLE_DEVICES=5 python test_script.py

User · Answer

Set the following two environment variables:

NVIDIA_VISIBLE_DEVICES=$gpu_id
CUDA_VISIBLE_DEVICES=0

where gpu_id is the ID of your selected GPU, as seen in the host system's nvidia-smi (a 0-based integer), that will be made available to the guest system (e.g. to the Docker container environment).

You can verify that a different card is selected for each value of gpu_id by inspecting the Bus-Id parameter in nvidia-smi, run in a terminal in the guest system.

More info

This method, based on NVIDIA_VISIBLE_DEVICES, exposes only a single card to the system (with local ID zero), hence we also hard-code the other variable, CUDA_VISIBLE_DEVICES, to 0 (mainly to prevent it from defaulting to an empty string that would indicate no GPU).

Note that the environment variable should be set before the guest system is started (so there is no chance of doing it in your Jupyter Notebook's terminal), for instance using docker run -e NVIDIA_VISIBLE_DEVICES=0 or env in Kubernetes or Openshift.

If you want GPU load-balancing, make gpu_id random at each guest system start.

If setting this with Python, make sure you are using strings for all environment variables, including numerical ones.

The accepted solution, based on CUDA_VISIBLE_DEVICES alone, does not hide other cards (different from the pinned one), and thus causes access errors if you try to use them in your GPU-enabled Python packages. With this solution, other cards are not visible to the guest system, but other users can still access them and share their computing power on an equal basis, just like with CPUs (verified).

This is also preferable to solutions using Kubernetes / Openshift controllers (resources.limits.nvidia.com/gpu), which would impose a lock on the allocated card, removing it from the pool of available resources (so the number of containers with GPU access could not exceed the number of physical cards).

This has been tested under CUDA 8.0, 9.0 and 10.1 in Docker containers running Ubuntu 18.04, orchestrated by Openshift 3.11.
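As a rough illustration of the points about string values and random load-balancing, here is a minimal Python sketch that builds the two variables and turns them into -e arguments for docker run. The function names and the image name my-image are hypothetical, and the sketch only prints the command rather than starting a container.

```python
import random


def container_gpu_env(gpu_id):
    """Environment for a guest pinned to one host GPU (index as shown by the host's nvidia-smi)."""
    return {
        # Values are converted to strings, as environment variables must be.
        "NVIDIA_VISIBLE_DEVICES": str(gpu_id),
        # Inside the guest the single exposed card has local ID 0.
        "CUDA_VISIBLE_DEVICES": "0",
    }


def docker_env_flags(env):
    """Turn an env dict into a flat list of `-e NAME=value` arguments."""
    flags = []
    for name, value in env.items():
        flags += ["-e", f"{name}={value}"]
    return flags


def pick_random_gpu(num_gpus, rng=random):
    """Pick a host GPU index at random, a simple form of load-balancing at guest start."""
    return rng.randrange(num_gpus)


if __name__ == "__main__":
    gpu_id = pick_random_gpu(2)  # assumes a host with 2 GPUs
    # "my-image" is a placeholder image name for this illustration.
    print(["docker", "run"] + docker_env_flags(container_gpu_env(gpu_id)) + ["my-image"])
```

In Kubernetes or Openshift the same two name/value pairs would go into the container's env section instead of -e flags.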

User · Answer

In case someone else is doing it in Python and it is not working, try setting it before importing pycuda and tensorflow, i.e.:

import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import pycuda.autoinit
import tensorflow as tf
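One way to catch the ordering mistake early is a small guard that refuses to set the variables once a GPU framework has already been imported. This is a sketch under assumptions: the helper name select_gpu is made up, and checking sys.modules is only a heuristic, since what actually matters is that the variables are set before the CUDA runtime is initialized.

```python
import os
import sys


def select_gpu(device_ids, frameworks=("tensorflow", "pycuda")):
    """Set the CUDA device variables, failing loudly if a GPU framework is already imported."""
    already = [name for name in frameworks if name in sys.modules]
    if already:
        # Heuristic check: an imported framework may already have initialized CUDA.
        raise RuntimeError(
            "Set CUDA_VISIBLE_DEVICES before importing: " + ", ".join(already)
        )
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Values must be strings; several devices are comma-separated.
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)


if __name__ == "__main__":
    select_gpu([0])
    print(os.environ["CUDA_VISIBLE_DEVICES"])
    # Only now would you import pycuda.autoinit or tensorflow.
```

Calling select_gpu at the very top of the entry script, before any other imports, keeps the ordering requirement in one place.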
