# Frequently Asked Questions

## Setting up

#### Which Docker packages are supported?

* All stable releases of [`docker-ce`](https://docs.docker.com/release-notes/docker-ce/) installed from https://docs.docker.com/install/, starting from Docker 19.03.
* The package provided by Canonical: `docker.io`, starting from Docker 19.03.
* The package provided by Red Hat: `docker`, starting from Docker 19.03.

Note that Edge, Test, and Nightly releases are not officially supported, but we will provide best-effort support.

#### What is the minimum supported Docker version?

Docker 19.03, which adds support for the `--gpus` option.

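As a quick sanity check, assuming the NVIDIA runtime is already set up, you can run a minimal smoke test (the image tag is illustrative):

```
# Should print the GPU table from inside a container
$ docker run --rm --gpus all nvidia/cuda:9.0-base nvidia-smi
```
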
#### How do I install the NVIDIA driver?

The recommended way is to use your [package manager](http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#package-manager-installation) and install the `cuda-drivers` package (or equivalent).\
When no packages are available, you should use an official ["runfile"](http://www.nvidia.com/object/unix.html).

Alternatively, and as a technology preview, the NVIDIA driver can be deployed through a container.\
Refer to the [documentation](https://github.com/NVIDIA/nvidia-docker/wiki/Driver-containers-(EXPERIMENTAL)) for more information.

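For example, on Ubuntu with the CUDA repository already configured (the repository setup itself is covered by the installation guide linked above):

```
# Install the driver meta-package from the CUDA repository
$ sudo apt-get update
$ sudo apt-get install -y cuda-drivers
```
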
#### I'm getting `The following signatures were invalid: EXPKEYSIG` while trying to install the packages, what do I do?

Make sure you fetched the latest GPG key from the repositories. Refer to the [repository instructions](https://nvidia.github.io/nvidia-docker/) for your distribution.

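On an APT-based distribution, re-fetching the key typically looks like this (check the repository instructions for the exact steps for your distribution):

```
# Refresh the repository GPG key, then update the package index
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ sudo apt-get update
```
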
## Platform support

#### Is macOS supported?

No, we do not support macOS (regardless of the version); however, you can use the native macOS Docker client to deploy your containers remotely (refer to the [dockerd documentation](https://docs.docker.com/engine/reference/commandline/dockerd/#description)).

#### Is Microsoft Windows supported?

No, we do not support Microsoft Windows (regardless of the version); however, you can use the native Microsoft Windows Docker client to deploy your containers remotely (refer to the [dockerd documentation](https://docs.docker.com/engine/reference/commandline/dockerd/#description)).

#### Do you support Microsoft native container technologies (e.g. Windows server, Hyper-v)?

No, we do not support native Microsoft container technologies.

#### Do you support Optimus (i.e. NVIDIA dGPU + Intel iGPU)?

Yes, from the CUDA perspective there is no difference as long as your dGPU is powered on and you are following the official driver instructions.

#### Do you support Tegra platforms (arm64)?

No, we do not support Tegra platforms and can’t easily port the code to them.\
The driver stack on arm64 is radically different and would require a complete architecture overhaul.

#### What distributions are officially supported?

For your host distribution, the list of supported platforms is available [here](http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#system-requirements).\
For your container images, both the [Docker Hub](https://github.com/NVIDIA/nvidia-docker/wiki/Docker-Hub) and [NGC registry](https://github.com/NVIDIA/nvidia-docker/wiki/NGC) images are officially supported.

#### Do you support PowerPC64 (ppc64le)?

Yes, little-endian only.

## Container runtime

#### Does it have a performance impact on my GPU workload?

No, the impact is usually on the order of 1% or less and hardly noticeable.\
However, be aware of the following (non-exhaustive list):

* _GPU topology and CPU affinity_\
You can query the topology using `nvidia-smi topo` and use [Docker CPU sets](https://docs.docker.com/engine/admin/resource_constraints/#cpu) to pin CPU cores (see the sketch after this list).
* _Compiling your code for your device architecture_\
Your container might be compiled for the wrong architecture and could fall back to JIT compilation of PTX code (refer to the [official documentation](http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#gpu-compilation) for more information).\
Note that you can express [these constraints](https://github.com/nvidia/nvidia-container-runtime#nvidia_require_) in your container image.
* _Container I/O overhead_\
By default, Docker containers rely on an overlay filesystem and bridged/NATed networking.\
Depending on your workload this can be a bottleneck; we recommend using [Docker volumes](https://docs.docker.com/engine/admin/volumes/volumes/) and experimenting with different [Docker networks](https://docs.docker.com/engine/userguide/networking/).
* _Linux kernel accounting and security overhead_\
In rare cases, you may notice that some kernel subsystems induce overhead.\
This will likely depend on your kernel version and can include things like cgroups, LSMs, seccomp filters, and netfilter.

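As a sketch of the CPU-affinity point above (the GPU index and core range are illustrative; check the topology matrix for your actual machine):

```
# Show the GPU/CPU topology matrix on the host
$ nvidia-smi topo -m

# Pin the container to the CPU cores local to GPU 0 (core range is illustrative)
$ docker run --gpus device=0 --cpuset-cpus=0-9 nvidia/cuda:9.0-base nvidia-smi
```
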
#### Is OpenGL supported?

Yes, [EGL](https://devblogs.nvidia.com/parallelforall/egl-eye-opengl-visualization-without-x-server/) is supported for headless rendering, but this is a **beta** feature. There is no plan to support GLX in the near future.\
Images are available at [`nvidia/opengl`](https://hub.docker.com/r/nvidia/opengl/). If you need CUDA+OpenGL, use [`nvidia/cudagl`](https://hub.docker.com/r/nvidia/cudagl/).\
If you are an [NGC](https://github.com/NVIDIA/nvidia-docker/wiki/NGC) subscriber and require GLX for your workflow, please fill out a [feature request](https://devtalk.nvidia.com/default/board/221/feature-requests/) for support consideration.

#### How do I fix `unsatisfied condition: cuda >= X.Y`?

Your CUDA container image is incompatible with your driver version.\
Upgrade your driver or choose an [image tag](https://hub.docker.com/r/nvidia/cuda/) which is supported by your driver (see also [CUDA requirements](https://github.com/NVIDIA/nvidia-docker/wiki/CUDA#requirements)).

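To find out what your driver supports, you can query its version on the host and then pick a matching image tag (the tag below is illustrative):

```
# Query the host driver version
$ nvidia-smi --query-gpu=driver_version --format=csv,noheader

# Run an image whose CUDA version this driver supports (tag is illustrative)
$ docker run --gpus all nvidia/cuda:9.0-base nvidia-smi
```
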
#### Do you support CUDA Multi Process Service (a.k.a. MPS)?

No, MPS is not supported at the moment. However, we plan to support this feature in the future, and [this issue](https://github.com/NVIDIA/nvidia-docker/issues/419) will be updated accordingly.

#### Do you support running a GPU-accelerated X server inside the container?

No, running an X server inside the container is not supported at the moment, and there is no plan to support it in the near future (see also [OpenGL support](#is-opengl-supported)).

#### I have multiple GPU devices, how can I isolate them between my containers?

GPU isolation is achieved through the `--gpus` CLI option. Devices can be referenced by index (following the PCI bus order) or by UUID.

For example:

```
# If you have 4 GPUs, isolate GPUs 3 and 4 (/dev/nvidia2 and /dev/nvidia3):
$ docker run --gpus '"device=2,3"' nvidia/cuda:9.0-base nvidia-smi
```

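Indices follow the PCI bus order and can change across reboots or driver upgrades, whereas UUIDs are stable. A sketch of UUID-based isolation (the UUID is illustrative):

```
# List the UUIDs of all GPUs on the host
$ nvidia-smi -L

# Isolate a single GPU by UUID (value is illustrative)
$ docker run --gpus device=GPU-8d6a6158-2a79-4d64-b0f0-e4f2c9d4a6f2 nvidia/cuda:9.0-base nvidia-smi
```
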
#### Why is `nvidia-smi` inside the container not listing the running processes?

`nvidia-smi` and NVML are not compatible with [PID namespaces](http://man7.org/linux/man-pages/man7/pid_namespaces.7.html).\
We recommend monitoring your processes on the host or inside a container using `--pid=host`.

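For example, to run a monitoring container in the host PID namespace:

```
# Host processes are visible because the container shares the host PID namespace
$ docker run --gpus all --pid=host nvidia/cuda:9.0-base nvidia-smi
```
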
#### Can I share a GPU between multiple containers?

Yes, this is no different from sharing a GPU between multiple processes outside of containers.\
Scheduling and compute preemption vary from one GPU architecture to another (e.g. CTA-level, instruction-level).

#### Can I limit the GPU resources (e.g. bandwidth, memory, CUDA cores) taken by a container?

No. Your only option is to set the GPU clocks at a lower frequency before starting the container.

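A sketch of lowering the application clocks before starting the container (this requires administrative privileges, and the clock values are illustrative; query the supported pairs first):

```
# List the supported <memory,graphics> clock pairs
$ nvidia-smi -q -d SUPPORTED_CLOCKS

# Pin the application clocks to a lower pair (values are illustrative)
$ sudo nvidia-smi -ac 2505,562
```
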
#### Can I enforce exclusive access for a GPU?

This is not currently supported but you can enforce it:

* At the container orchestration layer (Kubernetes, Swarm, Mesos, Slurm…) since this is tied to resource allocation.
* At the driver level by setting the [compute mode](http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-modes) of the GPU, as sketched below.

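A sketch of the driver-level approach (requires administrative privileges; the GPU index is illustrative):

```
# Allow only one process at a time on GPU 0
$ sudo nvidia-smi -i 0 -c EXCLUSIVE_PROCESS

# Revert to the default shared mode
$ sudo nvidia-smi -i 0 -c DEFAULT
```
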
#### Why is my container slow to start?

You probably need to enable [persistence mode](http://docs.nvidia.com/deploy/driver-persistence/index.html) to keep the kernel modules loaded and the GPUs initialized.\
The recommended way is to start the `nvidia-persistenced` daemon on your host.

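How the daemon is started depends on your init system and driver packaging; with systemd this might look like the sketch below. The legacy `nvidia-smi` fallback is also shown:

```
# Preferred: run the persistence daemon (unit name may vary with your driver package)
$ sudo systemctl enable --now nvidia-persistenced

# Legacy fallback: enable persistence mode directly
$ sudo nvidia-smi -pm 1
```
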
#### Can I use it with Docker-in-Docker (a.k.a. DinD)?

If you are running a Docker client inside a container: simply mount the Docker socket and proceed as usual.\
If you are running a Docker daemon inside a container: this case is untested.

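A sketch of the client case, using the official `docker` image (the image choice is illustrative):

```
# The inner client talks to the host daemon through the mounted socket
$ docker run -v /var/run/docker.sock:/var/run/docker.sock docker \
    docker run --gpus all nvidia/cuda:9.0-base nvidia-smi
```
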
#### Why is my application inside the container slow to initialize?

Your application was probably not compiled for the compute architecture of your GPU, and thus the driver must [JIT](https://devblogs.nvidia.com/parallelforall/cuda-pro-tip-understand-fat-binaries-jit-caching/) all the CUDA kernels from PTX.\
In addition to the slow start, the JIT compiler might generate less efficient code than directly targeting your compute architecture (see also [performance impact](#does-it-have-a-performance-impact-on-my-gpu-workload)).

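When you control the build, compiling native code for every architecture you deploy on, plus PTX for the newest one, avoids the JIT entirely. A sketch (the architectures are illustrative):

```
# Fat binary: SASS for sm_60 and sm_70, plus PTX for forward compatibility
$ nvcc app.cu -o app \
    -gencode arch=compute_60,code=sm_60 \
    -gencode arch=compute_70,code=sm_70 \
    -gencode arch=compute_70,code=compute_70
```
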
#### Is the JIT cache shared between containers?

No. You would have to handle this manually with [Docker volumes](https://docs.docker.com/engine/admin/volumes/volumes/).

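A sketch using a named volume and the `CUDA_CACHE_PATH` environment variable (`my-cuda-app` is a hypothetical image):

```
# Create a volume shared by all containers that should reuse the JIT cache
$ docker volume create jit-cache

# Point the CUDA JIT cache at the shared volume (my-cuda-app is hypothetical)
$ docker run --gpus all -v jit-cache:/jit-cache -e CUDA_CACHE_PATH=/jit-cache my-cuda-app
```
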
#### What is causing the CUDA `invalid device function` error?

Your application was not compiled for the compute architecture of your GPU, and no PTX was generated at build time, so JIT compilation is impossible (see also [slow to initialize](#why-is-my-application-inside-the-container-slow-to-initialize)).

#### Why do I get `Insufficient Permissions` for some `nvidia-smi` operations?

Some device management operations require extra privileges (e.g. setting clock frequencies).\
After learning about the security implications of doing so, you can add extra [capabilities](https://docs.docker.com/engine/security/security/#linux-kernel-capabilities) to your container using `--cap-add` on the command line (`--cap-add=SYS_ADMIN` will allow most operations).

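For example, granting `SYS_ADMIN` so that `nvidia-smi` can set application clocks from inside the container (the clock values are illustrative):

```
# SYS_ADMIN permits privileged management operations such as setting clocks
$ docker run --gpus all --cap-add=SYS_ADMIN nvidia/cuda:9.0-base nvidia-smi -ac 2505,562
```
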
#### Can I profile and debug my GPU code inside a container?

Yes, but as stated above you might need extra privileges: extra [capabilities](https://docs.docker.com/engine/security/security/#linux-kernel-capabilities) like `CAP_SYS_PTRACE`, or a tweaked [seccomp profile](https://docs.docker.com/engine/security/seccomp/) that allows the syscalls your tools need.

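For example, a `cuda-gdb` session might require something like the following (`./app` is a hypothetical binary; the `devel` images ship the debugger):

```
# ptrace-based tools need CAP_SYS_PTRACE; seccomp may also block required syscalls
$ docker run -it --gpus all --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
    nvidia/cuda:9.0-devel cuda-gdb ./app
```
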
#### Is OpenCL supported?

Yes, we now provide images on [Docker Hub](https://hub.docker.com/r/nvidia/opencl/).

#### Is Vulkan supported?

No, Vulkan is not supported at the moment. However, we plan to support this feature in the future.

## Container images

#### What do I have to install in my container images?

Library dependencies vary from one application to another. To make things easier for developers, we provide a set of [official images](#do-you-provide-official-docker-images) to base your images on.

#### Do you provide official Docker images?

Yes, container images are available on [Docker Hub](https://github.com/NVIDIA/nvidia-docker/wiki/Docker-Hub) and on the [NGC registry](https://github.com/NVIDIA/nvidia-docker/wiki/NGC).

#### Can I use the GPU during a container build (i.e. `docker build`)?

Yes, as long as you [configure your Docker daemon](https://github.com/NVIDIA/nvidia-docker/wiki/Advanced-topics#default-runtime) to use the `nvidia` runtime as the default, you will have build-time GPU support. However, be aware that this can render your images non-portable (see also [invalid device function](#what-is-causing-the-cuda-invalid-device-function-error)).

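A sketch of the daemon configuration, mirroring the nvidia-docker2 setup (adjust the path if your installation differs, and restart the daemon afterwards):

```
$ sudo tee /etc/docker/daemon.json <<'EOF'
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
$ sudo systemctl restart docker
```
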
#### Are my container images built for version 1.0 compatible with 2.0 and 3.0?

Yes, for most cases. The main difference is that we don’t mount all driver libraries by default in 2.0 and 3.0. You might need to set the `NVIDIA_DRIVER_CAPABILITIES` environment variable in your Dockerfile or when starting the container. Check the documentation of [nvidia-container-runtime](https://github.com/nvidia/nvidia-container-runtime#environment-variables-oci-spec).

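For example, to request the video driver libraries in addition to the compute defaults (the capability list is illustrative; see the runtime documentation above for the valid values):

```
# Request additional driver capabilities at container start
$ docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility,video \
    nvidia/cuda:9.0-base nvidia-smi
```
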
#### How do I link against driver APIs at build time (e.g. `libcuda.so` or `libnvidia-ml.so`)?

Use the library stubs provided in `/usr/local/cuda/lib64/stubs/`. Our official images already take care of setting [`LIBRARY_PATH`](https://gitlab.com/nvidia/cuda/blob/ubuntu16.04/9.0/devel/Dockerfile#L12).\
However, do not set `LD_LIBRARY_PATH` to this folder; the stubs must not be used at runtime.

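For example, linking an NVML tool at build time against the stub (`nvml-tool.c` is hypothetical; at runtime the real driver library is mounted into the container):

```
# Link against the stub at build time; the real libnvidia-ml.so is used at run time
$ gcc -I/usr/local/cuda/include -o nvml-tool nvml-tool.c \
    -L/usr/local/cuda/lib64/stubs -lnvidia-ml
```
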
#### The official CUDA images are too big, what do I do?

The `devel` [image tags](https://hub.docker.com/r/nvidia/cuda/) are large since the CUDA toolkit ships with many libraries, a compiler and various command-line tools.\
As a general rule of thumb, you shouldn’t ship your application with its build-time dependencies. We recommend using [multi-stage builds](https://docs.docker.com/engine/userguide/eng-image/multistage-build/) for this purpose. Your final container image should use our `runtime` or `base` images.\
As of CUDA 9.0 we now ship a `base` [image tag](https://hub.docker.com/r/nvidia/cuda/) which bundles the strict minimum of dependencies.

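A minimal multi-stage sketch (the build step and binary name are hypothetical):

```
$ cat > Dockerfile <<'EOF'
# Build stage: full toolkit, compiler and headers
FROM nvidia/cuda:9.0-devel AS build
WORKDIR /src
COPY . .
RUN make

# Final stage: only the runtime dependencies
FROM nvidia/cuda:9.0-base
COPY --from=build /src/myapp /usr/local/bin/myapp
CMD ["myapp"]
EOF
$ docker build -t myapp .
```
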
#### Why aren't CUDA 10 images working with nvidia-docker v1?

Starting from CUDA 10.0, the CUDA images require using nvidia-docker v2 and won't trigger the GPU enablement path from nvidia-docker v1.

## Ecosystem enablement

#### Do you support Docker Swarm mode?

Not currently; support for SwarmKit is still being worked on in the upstream Moby project. You can track our progress [here](https://github.com/moby/moby/issues/33439).

#### Do you support Docker Compose?

Yes, use Compose format `2.3` and add `runtime: nvidia` to your GPU service. Docker Compose must be version [1.19.0](https://github.com/docker/compose/releases/tag/1.19.0) or higher. You can find an example [here](https://github.com/NVIDIA/gpu-monitoring-tools/blob/master/exporters/prometheus-dcgm/docker/docker-compose.yml).

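A minimal compose file might look like this (the service name is illustrative):

```
$ cat > docker-compose.yml <<'EOF'
version: '2.3'
services:
  gpu-test:
    image: nvidia/cuda:9.0-base
    runtime: nvidia
    command: nvidia-smi
EOF
$ docker-compose up
```
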
#### Do you support Kubernetes?

Since Kubernetes 1.8, the recommended way is to use our official [device plugin](https://github.com/NVIDIA/k8s-device-plugin).