I'm searching for a way to use the GPU from inside a docker container.
The container will execute arbitrary code so i don't want to use the privileged mode.
Any tips?
From previous research i understood that run -v
and/or LXC cgroup
was the way to go but i'm not sure how to pull that off exactly
Writing an updated answer since most of the already present answers are obsolete as of now.
Versions earlier than Docker 19.03
used to require nvidia-docker2
and the --runtime=nvidia
flag.
Since Docker 19.03
, you need to install nvidia-container-toolkit
package and then use the --gpus all
flag.
So, here are the basics,
Package Installation
Install the nvidia-container-toolkit
package as per official documentation at Github.
For Redhat based OSes, execute the following set of commands:
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
$ sudo yum install -y nvidia-container-toolkit
$ sudo systemctl restart docker
For Debian based OSes, execute the following set of commands:
# Add the package repositories
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo systemctl restart docker
Running the docker with GPU support
docker run --name my_all_gpu_container --gpus all -t nvidia/cuda
Please note, the flag --gpus all
is used to assign all available gpus to the docker container.
To assign specific gpu to the docker container (in case of multiple GPUs available in your machine)
docker run --name my_first_gpu_container --gpus device=0 nvidia/cuda
Or
docker run --name my_first_gpu_container --gpus '"device=0"' nvidia/cuda
To use GPU from docker container, instead of using native Docker, use Nvidia-docker. To install Nvidia docker use following commands
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64/nvidia-
docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker
sudo pkill -SIGHUP dockerd # Restart Docker Engine
sudo nvidia-docker run --rm nvidia/cuda nvidia-smi # finally run nvidia-smi in the same container
I would not recommend installing CUDA/cuDNN on the host if you can use docker. Since at least CUDA 8 it has been possible to "stand on the shoulders of giants" and use nvidia/cuda
base images maintained by NVIDIA in their Docker Hub repo. Go for the newest and biggest one (with cuDNN if doing deep learning) if unsure which version to choose.
A starter CUDA container:
mkdir ~/cuda11
cd ~/cuda11
echo "FROM nvidia/cuda:11.0-cudnn8-devel-ubuntu18.04" > Dockerfile
echo "CMD [\"/bin/bash\"]" >> Dockerfile
docker build --tag mirekphd/cuda11 .
docker run --rm -it --gpus 1 mirekphd/cuda11 nvidia-smi
Sample output:
(if nvidia-smi
is not found in the container, do not try install it there - it was already installed on thehost with NVIDIA GPU driver and should be made available from the host to the container system if docker has access to the GPU(s)):
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.57 Driver Version: 450.57 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 108... Off | 00000000:01:00.0 On | N/A |
| 0% 50C P8 17W / 280W | 409MiB / 11177MiB | 7% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
Prerequisites
Appropriate NVIDIA driver with the latest CUDA version support to be installed first on the host (download it from NVIDIA Driver Downloads and then mv driver-file.run driver-file.sh && chmod +x driver-file.sh && ./driver-file.sh
). These are have been forward-compatible since CUDA 10.1.
GPU access enabled in docker
by installing sudo apt get update && sudo apt get install nvidia-container-toolkit
(and then restarting docker daemon using sudo systemctl restart docker
).
Install docker https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-16-04
Build the following image that includes the nvidia drivers and the cuda toolkit
Dockerfile
FROM ubuntu:16.04
MAINTAINER Jonathan Kosgei <[email protected]>
# A docker container with the Nvidia kernel module and CUDA drivers installed
ENV CUDA_RUN https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda_8.0.44_linux-run
RUN apt-get update && apt-get install -q -y \
wget \
module-init-tools \
build-essential
RUN cd /opt && \
wget $CUDA_RUN && \
chmod +x cuda_8.0.44_linux-run && \
mkdir nvidia_installers && \
./cuda_8.0.44_linux-run -extract=`pwd`/nvidia_installers && \
cd nvidia_installers && \
./NVIDIA-Linux-x86_64-367.48.run -s -N --no-kernel-module
RUN cd /opt/nvidia_installers && \
./cuda-linux64-rel-8.0.44-21122537.run -noprompt
# Ensure the CUDA libs and binaries are in the correct environment variables
ENV LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64
ENV PATH=$PATH:/usr/local/cuda-8.0/bin
RUN cd /opt/nvidia_installers &&\
./cuda-samples-linux-8.0.44-21122537.run -noprompt -cudaprefix=/usr/local/cuda-8.0 &&\
cd /usr/local/cuda/samples/1_Utilities/deviceQuery &&\
make
WORKDIR /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo docker run -ti --device /dev/nvidia0:/dev/nvidia0 --device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia-uvm:/dev/nvidia-uvm <built-image> ./deviceQuery
You should see output similar to:
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GRID K520
Result = PASS
My goal was to make a CUDA enabled docker image without using nvidia/cuda as base image. Because I have some custom jupyter image, and I want to base from that.
The host machine had nvidia driver, CUDA toolkit, and nvidia-container-toolkit already installed. Please refer to the official docs, and to Rohit's answer.
Test that nvidia driver and CUDA toolkit is installed correctly with: nvidia-smi
on the host machine, which should display correct "Driver Version" and "CUDA Version" and shows GPUs info.
Test that nvidia-container-toolkit is installed correctly with: docker run --rm --gpus all nvidia/cuda:latest nvidia-smi
I found what I assume to be the official Dockerfile for nvidia/cuda here I "flattened" it, appended the contents to my Dockerfile and tested it to be working nicely:
FROM sidazhou/scipy-notebook:latest
# FROM ubuntu:18.04
###########################################################################
# See https://gitlab.com/nvidia/container-images/cuda/-/blob/master/dist/10.1/ubuntu18.04-x86_64/base/Dockerfile
# See https://sarus.readthedocs.io/en/stable/user/custom-cuda-images.html
###########################################################################
USER root
###########################################################################
# base
RUN apt-get update && apt-get install -y --no-install-recommends \
gnupg2 curl ca-certificates && \
curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub | apt-key add - && \
echo "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda.list && \
echo "deb https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/nvidia-ml.list && \
apt-get purge --autoremove -y curl \
&& rm -rf /var/lib/apt/lists/*
ENV CUDA_VERSION 10.1.243
ENV CUDA_PKG_VERSION 10-1=$CUDA_VERSION-1
# For libraries in the cuda-compat-* package: https://docs.nvidia.com/cuda/eula/index.html#attachment-a
RUN apt-get update && apt-get install -y --no-install-recommends \
cuda-cudart-$CUDA_PKG_VERSION \
cuda-compat-10-1 \
&& ln -s cuda-10.1 /usr/local/cuda && \
rm -rf /var/lib/apt/lists/*
# Required for nvidia-docker v1
RUN echo "/usr/local/nvidia/lib" >> /etc/ld.so.conf.d/nvidia.conf && \
echo "/usr/local/nvidia/lib64" >> /etc/ld.so.conf.d/nvidia.conf
ENV PATH /usr/local/nvidia/bin:/usr/local/cuda/bin:${PATH}
ENV LD_LIBRARY_PATH /usr/local/nvidia/lib:/usr/local/nvidia/lib64
###########################################################################
#runtime next
ENV NCCL_VERSION 2.7.8
RUN apt-get update && apt-get install -y --no-install-recommends \
cuda-libraries-$CUDA_PKG_VERSION \
cuda-npp-$CUDA_PKG_VERSION \
cuda-nvtx-$CUDA_PKG_VERSION \
libcublas10=10.2.1.243-1 \
libnccl2=$NCCL_VERSION-1+cuda10.1 \
&& apt-mark hold libnccl2 \
&& rm -rf /var/lib/apt/lists/*
# apt from auto upgrading the cublas package. See https://gitlab.com/nvidia/container-images/cuda/-/issues/88
RUN apt-mark hold libcublas10
###########################################################################
#cudnn7 (not cudnn8) next
ENV CUDNN_VERSION 7.6.5.32
RUN apt-get update && apt-get install -y --no-install-recommends \
libcudnn7=$CUDNN_VERSION-1+cuda10.1 \
&& apt-mark hold libcudnn7 && \
rm -rf /var/lib/apt/lists/*
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES all
ENV NVIDIA_REQUIRE_CUDA "cuda>=10.1"
###########################################################################
#docker build -t sidazhou/scipy-notebook-gpu:latest .
#docker run -itd -gpus all\
# -p 8888:8888 \
# -p 6006:6006 \
# --user root \
# -e NB_UID=$(id -u) \
# -e NB_GID=$(id -g) \
# -e GRANT_SUDO=yes \
# -v ~/workspace:/home/jovyan/work \
# --name sidazhou-jupyter-gpu \
# sidazhou/scipy-notebook-gpu:latest
#docker exec sidazhou-jupyter-gpu python -c "import tensorflow as tf; print(tf.config.experimental.list_physical_devices('GPU'))"
We just released an experimental GitHub repository which should ease the process of using NVIDIA GPUs inside Docker containers.
Regan's answer is great, but it's a bit out of date, since the correct way to do this is avoid the lxc execution context as Docker has dropped LXC as the default execution context as of docker 0.9.
Instead it's better to tell docker about the nvidia devices via the --device flag, and just use the native execution context rather than lxc.
These instructions were tested on the following environment:
See CUDA 6.5 on AWS GPU Instance Running Ubuntu 14.04 to get your host machine setup.
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9
$ sudo sh -c "echo deb https://get.docker.com/ubuntu docker main > /etc/apt/sources.list.d/docker.list"
$ sudo apt-get update && sudo apt-get install lxc-docker
ls -la /dev | grep nvidia
crw-rw-rw- 1 root root 195, 0 Oct 25 19:37 nvidia0
crw-rw-rw- 1 root root 195, 255 Oct 25 19:37 nvidiactl
crw-rw-rw- 1 root root 251, 0 Oct 25 19:37 nvidia-uvm
I've created a docker image that has the cuda drivers pre-installed. The dockerfile is available on dockerhub if you want to know how this image was built.
You'll want to customize this command to match your nvidia devices. Here's what worked for me:
$ sudo docker run -ti --device /dev/nvidia0:/dev/nvidia0 --device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia-uvm:/dev/nvidia-uvm tleyden5iwx/ubuntu-cuda /bin/bash
This should be run from inside the docker container you just launched.
Install CUDA samples:
$ cd /opt/nvidia_installers
$ ./cuda-samples-linux-6.5.14-18745345.run -noprompt -cudaprefix=/usr/local/cuda-6.5/
Build deviceQuery sample:
$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
$ make
$ ./deviceQuery
If everything worked, you should see the following output:
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = GRID K520
Result = PASS
Recent enhancements by NVIDIA have produced a much more robust way to do this.
Essentially they have found a way to avoid the need to install the CUDA/GPU driver inside the containers and have it match the host kernel module.
Instead, drivers are on the host and the containers don't need them. It requires a modified docker-cli right now.
This is great, because now containers are much more portable.
A quick test on Ubuntu:
# Install nvidia-docker and nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb
# Test nvidia-smi
nvidia-docker run --rm nvidia/cuda nvidia-smi
For more details see: GPU-Enabled Docker Container and: https://github.com/NVIDIA/nvidia-docker
Use x11docker by mviereck:
https://github.com/mviereck/x11docker#hardware-acceleration says
Hardware acceleration
Hardware acceleration for OpenGL is possible with option -g, --gpu.
This will work out of the box in most cases with open source drivers on host. Otherwise have a look at wiki: feature dependencies. Closed source NVIDIA drivers need some setup and support less x11docker X server options.
This script is really convenient as it handles all the configuration and setup. Running a docker image on X with gpu is as simple as
x11docker --gpu imagename
Source: Stackoverflow.com