How to use GPU resources with nvidia-docker2 and docker swarm

Using nvidia-docker2

This is NVIDIA's native toolkit for Docker. Once it is installed, the user simply needs to add --gpus all when launching a docker run command:

docker run --rm -it --network host --gpus all tensorflow/tensorflow:latest-gpu nvidia-smi

pros This toolkit automatically takes care of the shared resources and drivers, so the GPU can be used easily.

cons This is not supported in swarm mode, so services cannot access the GPU resources this way.

Using docker services (swarm)

It is possible to provide docker swarm with the required GPU resources by making the following changes. First, add the keys below to the Docker daemon configuration (keeping any entries already present):

sudo nano /etc/docker/daemon.json
"runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "default-runtime": "nvidia",
  "node-generic-resources": [
    "NVIDIA-GPU=GPU-45cbf7b"
    ]

To get the ID of the GPU, run nvidia-smi -a
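
The UUID can also be extracted directly, for example (assuming a standard nvidia-smi installation):

nvidia-smi --query-gpu=uuid --format=csv,noheader
# or filter it out of the full report:
nvidia-smi -a | grep UUID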

Another file also needs to be changed:

sudo nano /etc/nvidia-container-runtime/config.toml

and add or uncomment the following line:

swarm-resource = "DOCKER_RESOURCE_GPU"
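
In a default installation this line is usually already present near the top of the file but commented out; after uncommenting it, the relevant part of config.toml should look roughly like this (surrounding keys may differ between versions):

disable-require = false
swarm-resource = "DOCKER_RESOURCE_GPU"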

Once this is done, docker needs to be restarted:

sudo systemctl restart docker.service
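
To confirm that the nvidia runtime was picked up as the default, the runtime lines of docker info can be checked, for example:

docker info | grep -i runtime

The output should list nvidia among the available runtimes and as the default runtime.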

Start the swarm

docker swarm init
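
To check that the GPU has been advertised to the swarm, the node description can be inspected. A sketch, with the field path taken from the Docker Engine node object (it may vary between API versions):

docker node inspect self --format '{{ json .Description.Resources.GenericResources }}'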

The service that needs the GPU can then be launched with:

docker service create --replicas 1 --name test-gpu --generic-resource "NVIDIA-GPU=0" tensorflow/tensorflow:latest-gpu sh -c "nvidia-smi"
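
To verify that the task ran and that nvidia-smi saw the GPU inside the container, the service state and logs can be checked:

docker service ps test-gpu
docker service logs test-gpu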

pros This works nicely for deploying docker swarm services with GPU resources. These modifications do not create conflicts with nvidia-docker2.

cons Several configuration files have to be modified before anything can be run.