Fork of NVIDIA/ansible-role-nvidia-driver with support for Debian

Adam DeConinck e80bcdb2ce Expand documentation in README - Add a note that the role should be run from a separate ansible control node - Document the role variables available to change		2020-09-11 19:50:38 +00:00
defaults	Add method for setting driver module parameters	2019-07-15 14:39:29 -07:00
files	Add CUDA repo package pin file to files	2020-01-07 23:17:26 +00:00
meta	Updates to role metadata	2019-04-30 13:17:34 -07:00
tasks	Fixes RedHat setup with specific version	2020-05-12 16:38:54 +02:00
templates	Add method for setting driver module parameters	2019-07-15 14:39:29 -07:00
tests	Add README to tests/	2019-04-29 10:12:42 -07:00
vars	Update var names; allow state=absent in tasks	2019-04-29 10:08:05 -07:00
.gitignore	Add gitignore	2019-04-29 10:08:05 -07:00
LICENSE	Add nvidia license	2019-04-25 10:50:56 -07:00
README.md	Expand documentation in README	2020-09-11 19:50:38 +00:00

ansible-role-nvidia-driver

An Ansible role to install the NVIDIA driver from the NVIDIA CUDA repositories.

Requirements

While installing the NVIDIA driver, this role reboots the nodes where it runs. Because of this, we strongly recommend that you run ansible-playbook from a control node that is separate from the GPU nodes where you are installing the driver.

If you attempt to run Ansible on the same node where you are installing the driver, this role will either:

Refuse to proceed with an error like `Running reboot with local connection would reboot the control node` (if running with the `local` connection)
Reboot the node you're running on, interrupting the playbook execution (if running over an SSH connection against localhost)
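
One way to keep the control node out of the play is an inventory in which the `gpu_nodes` group lists only remote hosts. The following is a minimal sketch in Ansible's YAML inventory format. The hostnames are hypothetical placeholders, and `gpu_nodes` is simply the group name used by the example playbook in this README.

```yaml
# inventory.yml (hypothetical file name)
# The control node is deliberately NOT listed here, so the role's reboot
# only ever affects the remote GPU nodes.
all:
  children:
    gpu_nodes:
      hosts:
        gpu01.example.com:   # placeholder hostname
        gpu02.example.com:   # placeholder hostname
```

With an inventory like this, the playbook would be run from the control node with `ansible-playbook -i inventory.yml <your playbook>`.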

Installing

This role can be installed using Ansible Galaxy:

$ ansible-galaxy install nvidia.nvidia_driver
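
If you track role dependencies in a requirements file instead of installing them one at a time, the same role can be listed there. This is a sketch that assumes the conventional `requirements.yml` file name; no version is pinned.

```yaml
# requirements.yml
roles:
  # Same Galaxy role name as in the command above
  - name: nvidia.nvidia_driver
```

The file can then be installed with `ansible-galaxy install -r requirements.yml`.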

Role variables

Variable	Default value	Description
`nvidia_driver_package_state`	`"present"`	Package state for NVIDIA driver packages
`nvidia_driver_package_version`	`""`	Package version to install. Note that this should match the actual version of the deb or RPM package to be installed.
`nvidia_driver_persistence_mode_on`	`yes`	Whether to enable persistence mode (boolean)
`nvidia_driver_skip_reboot`	`no`	Whether to skip rebooting the node during the install
`nvidia_driver_module_file`	`"/etc/modprobe.d/nvidia.conf"`	Filename to use for NVIDIA driver parameters
`nvidia_driver_module_params`	`""`	Parameters to pass to the NVIDIA driver
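
As an illustration of how these variables might be overridden, the sketch below sets two of them at the play level. The chosen values are examples only, not recommendations, and the `gpu_nodes` group name is taken from the example playbook in this README.

```yaml
- hosts: gpu_nodes
  vars:
    # Install the driver but leave the reboot to a later maintenance window
    nvidia_driver_skip_reboot: yes
    # Example only: turn persistence mode off (the default is yes)
    nvidia_driver_persistence_mode_on: no
  roles:
    - nvidia.nvidia_driver
```

Any variable left unset keeps the default value listed in the table.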

Red Hat specific variables

Variable	Default value	Description
`nvidia_driver_rhel_epel_repo_baseurl`	`"https://download.fedoraproject.org/pub/epel/$releasever/$basearch/"`	Base URL to use for EPEL repo
`nvidia_driver_rhel_epel_repo_gpgkey`	`"https://epel.mirror.constant.com//RPM-GPG-KEY-EPEL-{{ ansible_distribution_major_version }}"`	GPG key for the EPEL repo
`nvidia_driver_rhel_cuda_repo_baseurl`	`"https://developer.download.nvidia.com/compute/cuda/repos/{{ _rhel_repo_dir }}/"`	Base URL to use for CUDA repo
`nvidia_driver_rhel_cuda_repo_gpgkey`	`"https://developer.download.nvidia.com/compute/cuda/repos/{{ _rhel_repo_dir }}/7fa2af80.pub"`	GPG key for the CUDA repo

Ubuntu specific variables

Variable	Default value	Description
`nvidia_driver_ubuntu_cuda_repo_baseurl`	`"http://developer.download.nvidia.com/compute/cuda/repos/{{ _ubuntu_repo_dir }}"`	Base URL to use for CUDA repo
`nvidia_driver_ubuntu_cuda_repo_gpgkey_url`	`"https://developer.download.nvidia.com/compute/cuda/repos/{{ _ubuntu_repo_dir }}/7fa2af80.pub"`	GPG key for the CUDA repo
`nvidia_driver_ubuntu_cuda_repo_gpgkey_id`	`"7fa2af80"`	GPG key ID for the CUDA repo
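
The repository variables are mainly useful when the GPU nodes cannot reach NVIDIA's servers directly and packages are served from an internal mirror. The sketch below assumes a hypothetical mirror at `mirror.example.com` that copies the layout of the upstream CUDA repository for Ubuntu 18.04. Both the host and the path are placeholders to adapt to your environment.

```yaml
- hosts: gpu_nodes
  vars:
    # Hypothetical internal mirror; replace host and path with your own
    nvidia_driver_ubuntu_cuda_repo_baseurl: "https://mirror.example.com/cuda/repos/ubuntu1804/x86_64"
    # Assumes the mirror also serves the repository signing key
    nvidia_driver_ubuntu_cuda_repo_gpgkey_url: "https://mirror.example.com/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub"
  roles:
    - nvidia.nvidia_driver
```

The Red Hat variables can be overridden in the same way on RHEL and CentOS nodes.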

Example playbook

- hosts: gpu_nodes
  roles:
  - nvidia.nvidia_driver

Supported distributions

Currently, this role supports the following Linux distributions:

NVIDIA DGX OS 4
Ubuntu 18.04 LTS
CentOS 7
Red Hat Enterprise Linux 7