[python] How to get current available GPUs in tensorflow?

I have a plan to use distributed TensorFlow, and I saw that TensorFlow can use GPUs for training and testing. In a cluster environment, each machine could have 0, 1, or more GPUs, and I want to run my TensorFlow graph on GPUs in as many machines as possible.

I found that when running tf.Session(), TensorFlow gives information about the GPUs in log messages like the ones below:

I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)

My question is, how do I get information about the currently available GPUs from TensorFlow? I can get the loaded GPU information from the log, but I want to do it in a more sophisticated, programmatic way. I could also restrict the visible GPUs intentionally using the CUDA_VISIBLE_DEVICES environment variable, so I am not looking for a way of getting GPU information from the OS kernel.

In short, I want a function like tf.get_available_gpus() that will return ['/gpu:0', '/gpu:1'] if there are two GPUs available in the machine. How can I implement this?

This question is related to: python, gpu, tensorflow

The answers are:


There is also a method in the test utilities, so all that has to be done is:

tf.test.is_gpu_available()

and/or

tf.test.gpu_device_name()

Look up the TensorFlow docs for the arguments.
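
For example, a minimal sketch (note that tf.test.is_gpu_available() is deprecated in later TF 2.x releases in favor of tf.config.list_physical_devices('GPU')):

import tensorflow as tf

# True if a GPU device is registered and usable
print(tf.test.is_gpu_available())

# Name of the first GPU device, e.g. '/device:GPU:0', or '' if there is none
print(tf.test.gpu_device_name())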


In TensorFlow Core v2.3.0, the following code should work.

import tensorflow as tf

# List every device (CPU and GPU) that TensorFlow can currently see
visible_devices = tf.config.get_visible_devices()
for device in visible_devices:
  print(device)

Depending on your environment, this code will produce the following results.

PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')
PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')
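
If you only care about GPUs, tf.config.get_visible_devices also accepts a device_type filter; a minimal sketch:

import tensorflow as tf

# Restrict the listing to GPUs; returns an empty list on a CPU-only machine
gpus = tf.config.get_visible_devices('GPU')
print(gpus)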


Use this approach and check all the parts:

from __future__ import absolute_import, division, print_function, unicode_literals

import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds


version = tf.__version__
executing_eagerly = tf.executing_eagerly()
hub_version = hub.__version__
available = tf.config.experimental.list_physical_devices("GPU")

print("Version: ", version)
print("Eager mode: ", executing_eagerly)
print("Hub Version: ", h_version)
print("GPU is", "available" if avai else "NOT AVAILABLE")

You can check the full device list using the following code:

from tensorflow.python.client import device_lib

device_lib.list_local_devices()
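
To turn that into the tf.get_available_gpus()-style helper the question asks for, you can filter the returned DeviceAttributes entries by device type; a short sketch (keep in mind that list_local_devices() allocates GPU memory by default, as a later answer points out):

from tensorflow.python.client import device_lib

def get_available_gpus():
    # Each entry is a DeviceAttributes proto with .name and .device_type
    local_devices = device_lib.list_local_devices()
    return [d.name for d in local_devices if d.device_type == 'GPU']

print(get_available_gpus())  # e.g. ['/device:GPU:0', '/device:GPU:1']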

I am working with TF-2.1 and torch, so I don't want to hard-code this automatic choosing into any one ML framework. I just use plain nvidia-smi and os.environ to get a vacant GPU.

import os
import subprocess

def auto_gpu_selection(usage_max=0.01, mem_max=0.05):
    """Auto-set CUDA_VISIBLE_DEVICES to the first vacant GPU.

    :param usage_max: max fraction of GPU utilization to count as vacant
    :param mem_max: max fraction of used GPU memory to count as vacant
    :return:
    """
    os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'
    # Parse the per-GPU rows of the nvidia-smi table (one row every 3 lines)
    log = str(subprocess.check_output("nvidia-smi", shell=True)).split(r"\n")[6:-1]
    gpu = 0

    # Maximum of 8 GPUs, which is enough for most machines
    for i in range(8):
        idx = i * 3 + 2
        if idx > len(log) - 1:
            break
        inf = log[idx].split("|")
        if len(inf) < 3:
            break
        usage = int(inf[3].split("%")[0].strip())
        mem_now = int(str(inf[2].split("/")[0]).strip()[:-3])
        mem_all = int(str(inf[2].split("/")[1]).strip()[:-3])
        if usage < 100 * usage_max and mem_now < mem_max * mem_all:
            os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu)
            print("\nAuto choosing vacant GPU-%d : Memory:[%dMiB/%dMiB] , GPU-Util:[%d%%]\n" %
                  (gpu, mem_now, mem_all, usage))
            return
        print("GPU-%d is busy: Memory:[%dMiB/%dMiB] , GPU-Util:[%d%%]" %
              (gpu, mem_now, mem_all, usage))
        gpu += 1
    print("\nNo vacant GPU, use CPU instead\n")
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

If it can find a vacant GPU, it will set CUDA_VISIBLE_DEVICES to the index of that GPU (in PCI bus order):

GPU-0 is busy: Memory:[5738MiB/11019MiB] , GPU-Util:[60%]
GPU-1 is busy: Memory:[9688MiB/11019MiB] , GPU-Util:[78%]

Auto choosing vacant GPU-2 : Memory:[1MiB/11019MiB] , GPU-Util:[0%]

Otherwise, it is set to -1 so the CPU is used:

GPU-0 is busy: Memory:[8900MiB/11019MiB] , GPU-Util:[95%]
GPU-1 is busy: Memory:[4674MiB/11019MiB] , GPU-Util:[35%]
GPU-2 is busy: Memory:[9784MiB/11016MiB] , GPU-Util:[74%]

No vacant GPU, use CPU instead

Note: Use this function before you import any ML framework that requires a GPU, so it can automatically choose one. This also makes it easy to run multiple tasks.
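
As a less fragile variant, here is a sketch that relies on nvidia-smi's --query-gpu CSV interface instead of parsing the human-readable table layout (the helper name pick_vacant_gpu is just for illustration):

import os
import subprocess

def pick_vacant_gpu(usage_max=0.01, mem_max=0.05):
    # Query machine-readable fields directly, one CSV line per GPU
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=index,utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader,nounits"]).decode()
    for line in out.strip().splitlines():
        idx, util, used, total = [int(v) for v in line.split(",")]
        if util < 100 * usage_max and used < mem_max * total:
            os.environ["CUDA_VISIBLE_DEVICES"] = str(idx)
            return idx
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # no vacant GPU: fall back to CPU
    return None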


Ensure you have the latest TensorFlow 2.x GPU build installed on your GPU-equipped machine, then execute the following code in Python:

from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf 

print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

You will get output that looks like this:

2020-02-07 10:45:37.587838: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-07 10:45:37.588896: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0, 1, 2, 3, 4, 5, 6, 7
Num GPUs Available: 8


In TensorFlow 2.0, you can use tf.config.experimental.list_physical_devices('GPU'):

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    print("Name:", gpu.name, "  Type:", gpu.device_type)

If you have two GPUs installed, it outputs this:

Name: /physical_device:GPU:0   Type: GPU
Name: /physical_device:GPU:1   Type: GPU

From 2.1, you can drop experimental:

gpus = tf.config.list_physical_devices('GPU')

See: https://www.tensorflow.org/api_docs/python/tf/config/list_physical_devices
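
If you want names closer to the '/gpu:0' style the question asks for, the logical device names are the nearest equivalent; a sketch (names look like '/device:GPU:0' rather than '/gpu:0', and listing logical devices initializes the GPUs):

import tensorflow as tf

def get_available_gpus():
    # Logical devices carry names such as '/device:GPU:0'
    return [d.name for d in tf.config.list_logical_devices('GPU')]

print(get_available_gpus())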


The accepted answer gives you the number of GPUs, but it also allocates all the memory on those GPUs, which may be unwanted for some applications. You can avoid this by creating a session with a fixed, lower memory limit before calling device_lib.list_local_devices().
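
For reference, a TF 1.x-style sketch of that workaround, capping the per-process memory fraction before the first device query:

import tensorflow as tf
from tensorflow.python.client import device_lib

# Create a session with a small memory fraction first, so the device
# query does not grab (nearly) all the memory on every GPU
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.01
sess = tf.Session(config=config)
print(device_lib.list_local_devices())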

I ended up using nvidia-smi to get the number of GPUs without allocating any memory on them.

import subprocess

# nvidia-smi -L prints one line per GPU, each containing a UUID
n = str(subprocess.check_output(["nvidia-smi", "-L"])).count('UUID')

I have a GPU called NVIDIA GeForce GTX 1650 Ti on my machine, with tensorflow-gpu==2.2.0.

Run the following two lines of code:

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

Output:

Num GPUs Available:  1


Apart from the excellent explanation by Mrry, who suggested using device_lib.list_local_devices(), I can show you how to check for GPU-related information from the command line.

Because currently only Nvidia GPUs work with NN frameworks, this answer covers only them. Nvidia has a page where they document how you can use the /proc filesystem interface to obtain run-time information about the driver, any installed NVIDIA graphics cards, and the AGP status.

/proc/driver/nvidia/gpus/0..N/information

These files provide information about each of the installed NVIDIA graphics adapters (model name, IRQ, BIOS version, Bus Type). Note that the BIOS version is only available while X is running.

So you can run cat /proc/driver/nvidia/gpus/0/information from the command line and see information about your first GPU. It is easy to run this from Python, and you can also check the second, third, fourth GPU, and so on, until the read fails.
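
A quick Linux-only sketch of that loop in Python (the glob pattern is an assumption that covers both numeric and PCI-bus-ID directory names):

import glob

# Each installed Nvidia GPU gets its own directory under /proc/driver/nvidia/gpus/
for path in sorted(glob.glob('/proc/driver/nvidia/gpus/*/information')):
    with open(path) as f:
        print(path)
        print(f.read())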

Mrry's answer is definitely more robust, and I am not sure whether mine will work on a non-Linux machine, but Nvidia's page provides other interesting information which not many people know about.

