Overview
This article explains how to resolve the "Failed to initialize NVML: Driver/library version mismatch" error that occurs when running nvidia-smi. The error means that the loaded NVIDIA kernel driver and the user-space NVIDIA Management Library (NVML) are different versions. It commonly appears after a driver update or a system reboot.
Affected Systems
Any Exxact server or workstation
Any NVIDIA GPU
Symptom:
Check the current system status:
nvidia-smi
You should see the error:
Failed to initialize NVML: Driver/library version mismatch
NVML library version: 570.133
This means that the NVIDIA kernel driver and the NVIDIA user-space libraries (like libnvidia-ml.so) are from different versions. Specifically, the system is trying to use NVML from version 570.133, but the loaded kernel driver is from a different version.
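To see which version the user-space side belongs to, the installed libnvidia-ml.so file can be inspected. The following is a minimal sketch, not an official procedure. It assumes the library is registered with ldconfig and that libnvidia-ml.so.1 is a symlink to a file named libnvidia-ml.so.<version>. library_version_from_path is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the version suffix of a resolved libnvidia-ml.so file name.
# Assumption: the real file is named libnvidia-ml.so.<version>,
# for example libnvidia-ml.so.570.133.07.
library_version_from_path() {
    local name
    name="$(basename -- "$1")"
    case "$name" in
        # Require at least two dot-separated parts so that an unresolved
        # symlink such as libnvidia-ml.so.1 is rejected.
        libnvidia-ml.so.[0-9]*.*) printf '%s\n' "${name#libnvidia-ml.so.}" ;;
        *) return 1 ;;
    esac
}

# Example usage on a live system (left commented out so that sourcing
# this file has no side effects):
#   lib="$(ldconfig -p | awk '/libnvidia-ml\.so\.1/ { print $NF; exit }')"
#   library_version_from_path "$(readlink -f "$lib")"
```

If the printed version differs from the one in /proc/driver/nvidia/version, the user-space libraries and the loaded kernel module come from different driver releases.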
What causes this?
- Driver upgrade or downgrade was incomplete, leaving mismatched components.
- The kernel module wasn't reloaded after a driver change.
- Multiple NVIDIA driver versions are present and conflicting.
- The system was rebooted, but dkms didn't properly compile the driver for the current kernel.
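To check the DKMS cause from the list above, the output of dkms status can be examined for the running kernel. This is a minimal sketch under two assumptions: the driver was installed through DKMS (some packaged or .run installs are not), and dkms status prints one line per build that starts with the module name, contains the kernel release followed by a comma, and ends with a state such as installed. nvidia_dkms_installed_for is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read `dkms status` output on stdin and succeed only if an nvidia module
# is reported as installed for the given kernel release.
# Assumption: lines look roughly like
#   nvidia/<version>, <kernel release>, <arch>: installed
nvidia_dkms_installed_for() {
    local kernel="$1"
    grep -i '^nvidia' | grep -F -- "${kernel}," | grep -qw 'installed'
}

# Example usage on a live system (commented out so that sourcing this
# file has no side effects):
#   if dkms status | nvidia_dkms_installed_for "$(uname -r)"; then
#       echo "DKMS reports an installed nvidia module for this kernel"
#   else
#       echo "No installed nvidia DKMS module found for $(uname -r)"
#   fi
```

A negative result only suggests that DKMS did not build the module for the running kernel. It does not rule out the other causes listed above.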
Prerequisites
- Root or sudo access to the system
- Basic knowledge of Linux command line
- System running Rocky Linux or Ubuntu
- NVIDIA GPU installed in the system
Solution:
Check loaded kernel driver version:
cat /proc/driver/nvidia/version
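Check installed user-space driver version (while the mismatch is present, nvidia-smi fails, but its error message reports the NVML library version):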
nvidia-smi
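The comparison of the two versions can also be scripted. The sketch below assumes that the NVRM line of /proc/driver/nvidia/version contains the driver version as the first dotted number on that line. kernel_module_version and versions_agree are hypothetical helper names. The comparison accepts a shorter prefix such as 570.133 as well as a full version string, since the value you have at hand may be abbreviated.

```shell
#!/usr/bin/env bash
# Read /proc/driver/nvidia/version-style text on stdin and print the
# first dotted version number found on the "NVRM version:" line.
kernel_module_version() {
    grep -m1 '^NVRM version:' | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n1
}

# Succeed if the library version equals the kernel module version or is
# a dot-aligned prefix of it (570.133 agrees with 570.133.07).
# Returns 2 if either argument is empty.
versions_agree() {
    local kernel="$1" library="$2"
    [ -n "$kernel" ] && [ -n "$library" ] || return 2
    case "${kernel}." in
        "$library".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage on a live system (commented out so that sourcing this
# file has no side effects); replace 570.133 with the NVML library
# version printed in your error message:
#   kver="$(kernel_module_version < /proc/driver/nvidia/version)"
#   versions_agree "$kver" 570.133 && echo match || echo mismatch
```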
If they don’t match, reinstall the driver cleanly:
Uninstall NVIDIA drivers:
Ubuntu Operating System
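# Stop the GUI (use lightdm, sddm, etc. if that is your display manager)
sudo systemctl stop gdm
# Unload the NVIDIA kernel modules
sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia
# Remove existing NVIDIA drivers (the pattern is quoted so the shell does not expand it)
sudo apt remove --purge 'nvidia-*'
sudo apt autoremove
sudo apt update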
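rmmod reports an error for any module that is not currently loaded, so it can help to build the module list from lsmod first. This is a minimal sketch that keeps the same unload order as the rmmod command above and assumes the usual lsmod layout (one header line, module name in the first column). nvidia_unload_order is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read `lsmod` output on stdin and print the loaded NVIDIA modules,
# one per line, in the order in which they should be unloaded.
nvidia_unload_order() {
    local loaded m
    # Skip the header line and keep only the module-name column.
    loaded="$(awk 'NR > 1 { print $1 }')"
    for m in nvidia_uvm nvidia_drm nvidia_modeset nvidia; do
        if printf '%s\n' "$loaded" | grep -qx -- "$m"; then
            printf '%s\n' "$m"
        fi
    done
}

# Example usage on a live system (commented out so that sourcing this
# file has no side effects):
#   mods="$(lsmod | nvidia_unload_order)"
#   [ -n "$mods" ] && sudo rmmod $mods   # unquoted on purpose: one argument per module
```

Unloading still fails while a process holds the GPU, so stop GPU workloads and the display manager first.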
Rocky Linux Operating System
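# Remove existing NVIDIA drivers (the pattern is quoted so the shell does not expand it)
sudo dnf remove 'nvidia-*'
# Add NVIDIA repository (this URL targets RHEL 9-compatible releases)
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo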
Manual Driver Installation
- Download the latest driver from NVIDIA:
  - Visit the NVIDIA driver download page
  - Select your GPU model and download the .run file
- Stop the display manager:
sudo systemctl stop gdm # For GNOME
sudo systemctl stop lightdm # For LightDM (substitute your own display manager if different)
- Install the driver:
chmod +x NVIDIA-Linux-x86_64-*.run
sudo ./NVIDIA-Linux-x86_64-*.run --silent
- Restart the display manager:
sudo systemctl start gdm # or the display manager you stopped earlier
Verification
- Confirm nvidia-smi works without errors:
nvidia-smi
Expected output should show GPU information without error messages.
- Verify driver version consistency:
cat /proc/driver/nvidia/version
nvidia-smi --query-gpu=driver_version --format=csv,noheader
Both commands should report the same driver version.
- Test GPU functionality:
nvidia-smi -q
Additional Notes & Tips
- Estimated time: 10-30 minutes depending on method used
- System behavior: GPU workloads will be interrupted during driver reload/reinstall
- User impact: All GPU-accelerated applications must be restarted after resolution
- Best practice: Always reboot after NVIDIA driver installation or updates
- Warning: Unloading NVIDIA modules will terminate all running CUDA applications
- Recommendation: Create a system backup before manual driver installation