NVIDIA Driver Dependency Conflict Incident Report
Incident Summary
A dependency conflict occurred while installing NVIDIA drivers on a Debian 12 system. The conflict arose because two repository sources (the official Debian repository and the NVIDIA CUDA repository) provided different versions of the NVIDIA driver packages.
Root Cause
- The system had two competing package sources:
  - Official Debian repository: nvidia-driver version 535.216.01-1~deb12u1
  - NVIDIA CUDA repository: nvidia-driver versions up to 570.124.06-1
- The CUDA repository was added via /etc/apt/sources.list.d/cuda-debian12-x86_64.list
- When attempting to install the driver, apt was unable to resolve dependencies because it was mixing package versions from both repositories.
Resolution Steps
The issue was resolved by temporarily disabling the CUDA repository to ensure consistent package installation:
# Disable the CUDA repository
sudo mv /etc/apt/sources.list.d/cuda-debian12-x86_64.list /etc/apt/sources.list.d/cuda-debian12-x86_64.list.bak
# Update package index
sudo apt update
# Install the Debian version of the driver
sudo apt install nvidia-driver
# Verify installation
nvidia-smi
After the drivers were installed from a single consistent source, the CUDA repository was re-enabled to install the CUDA toolkit without the driver components:
# Re-enable repository after driver install
sudo mv /etc/apt/sources.list.d/cuda-debian12-x86_64.list.bak /etc/apt/sources.list.d/cuda-debian12-x86_64.list
sudo apt update
# Install CUDA toolkit without the driver
sudo apt install cuda-toolkit
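Re-enabling the CUDA repository makes its newer driver packages visible to apt again, so it may be worth simulating the toolkit install before running it. The sketch below is one way to do that. The function name `list_driver_changes` and the package-name patterns are assumptions made for illustration and do not cover every driver package.

```shell
#!/bin/sh
# Sketch: filter apt's simulated install plan for driver-related packages.
# Reads `apt-get -s install ...` output on stdin; lines starting with
# "Inst" name the packages that would be installed or upgraded.
# The name patterns below are illustrative assumptions, not exhaustive.
list_driver_changes() {
    awk '$1 == "Inst" && $2 ~ /^(nvidia-driver|nvidia-kernel|libcuda1)/ { print $2 }'
}

# Usage (a simulation changes no packages):
#   apt-get -s install cuda-toolkit | list_driver_changes
```

If the function prints nothing, the simulated plan does not touch any package matching those patterns.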
Lessons Learned
- Repository Conflicts: Multiple repositories providing the same packages can cause dependency conflicts, especially when they offer different versions.
- Driver-CUDA Compatibility: NVIDIA drivers and CUDA toolkit versions must be compatible. Installing them from separate repositories can lead to version mismatches.
- Diagnostic Commands: The following commands were essential for diagnosing the issue:
  apt-cache policy nvidia-driver
  apt-cache madison nvidia-driver
  ls -la /etc/apt/sources.list.d/
- Separation of Concerns: It’s often better to install drivers from the distribution’s repositories and then add CUDA separately, rather than using NVIDIA’s repositories for both.
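The driver/CUDA compatibility point can be checked mechanically once the required minimum driver version is known. The following is a minimal sketch, assuming GNU `sort -V` is available. The helper `version_ge` and the `MIN_DRIVER` default are hypothetical: the default simply reuses the Debian driver version from this report as a placeholder and is not a real requirement of any toolkit release.

```shell
#!/bin/sh
# Sketch: check that the installed driver meets a required minimum version.
# version_ge A B succeeds when version A >= version B (relies on `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# MIN_DRIVER is a placeholder; take the real minimum from NVIDIA's
# compatibility table for the CUDA toolkit release being installed.
MIN_DRIVER="${MIN_DRIVER:-535.216.01}"

# Only query the driver when nvidia-smi is actually present.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if version_ge "$installed" "$MIN_DRIVER"; then
        echo "driver $installed satisfies minimum $MIN_DRIVER"
    else
        echo "driver $installed is older than $MIN_DRIVER"
    fi
fi
```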
Prevention Strategies
- Pin Package Versions: Use apt preferences to pin specific package versions when working with multiple sources.
- Document Repository Changes: Maintain documentation when adding third-party repositories.
- Test in Isolation: Test driver installations in isolation before adding CUDA repositories.
- Check Compatibility Matrix: Always verify compatibility between NVIDIA driver versions and CUDA toolkit versions before installation.
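As an illustration of the pinning strategy, the sketch below generates an apt preferences file that lowers the priority of everything served by the CUDA repository. This is one possible approach, not necessarily the one used on the affected system. The origin hostname `developer.download.nvidia.com` and the file name `cuda-repo.pref` are assumptions; confirm the origin in the output of `apt-cache policy` before relying on it.

```shell
#!/bin/sh
# Sketch: write an apt preferences file that lowers the priority of all
# packages from the CUDA repository, so Debian's packages win whenever
# both repositories offer the same package name.
# Assumptions: the origin hostname and the file name are illustrative.
PREF_FILE="${PREF_FILE:-${TMPDIR:-/tmp}/cuda-repo.pref}"

cat > "$PREF_FILE" <<'EOF'
Package: *
Pin: origin developer.download.nvidia.com
Pin-Priority: 100
EOF

echo "Wrote $PREF_FILE; review it, then copy it to /etc/apt/preferences.d/"
```

With a priority of 100, packages that exist only in the CUDA repository (such as cuda-toolkit) should remain installable, while Debian's default priority of 500 takes precedence for package names that both repositories provide. Running `apt-cache policy nvidia-driver` after `sudo apt update` should show whether the pin took effect.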