Overview
This article explains how to resolve the "Failed to initialize NVML: Driver/library version mismatch" error that occurs when running nvidia-smi. This error indicates a compatibility issue between the NVIDIA driver and the NVIDIA Management Library (NVML), commonly occurring after driver updates or system reboots.
Prerequisites
- Root or sudo access to the system
- Basic knowledge of Linux command line
- System running Rocky Linux or Ubuntu
- NVIDIA GPU installed in the system
Check current system status
nvidia-smi
You should see the error:
Failed to initialize NVML: Driver/library version mismatch NVML library version: 570.133
This means that the NVIDIA kernel driver and the NVIDIA user-space libraries (like libnvidia-ml.so) are from different versions. Specifically, the system is trying to use NVML from version 570.133, but the loaded kernel driver is from a different version.
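You can compare the two versions without nvidia-smi by reading the loaded kernel module version and the version encoded in the NVML library's file name. The library path below is the usual Ubuntu location and may differ on your distribution:
# Version of the currently loaded kernel module
cat /proc/driver/nvidia/version
# Version encoded in the user-space NVML library file name
# (path assumes Ubuntu; on Rocky Linux look under /usr/lib64/)
ls -l /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.*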
What causes this?
- Driver upgrade or downgrade was incomplete, leaving mismatched components.
- The kernel module wasn't reloaded after a driver change.
- Multiple NVIDIA driver versions are present and conflicting.
- The system was rebooted, but dkms didn't properly compile the driver for the current kernel (see the check below).
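If you suspect the DKMS cause, a quick check (assuming the driver was installed through DKMS) is:
# Show whether the nvidia module is built and installed for the running kernel
dkms status
# Rebuild it for the current kernel if it is missing; the version below is an
# example, use whatever dkms status reports
sudo dkms install nvidia/570.133.07 -k "$(uname -r)"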
How to fix it:
Check loaded kernel driver version:
cat /proc/driver/nvidia/version
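The output looks similar to the following (version and build details will vary):
NVRM version: NVIDIA UNIX x86_64 Kernel Module  570.133.07  ...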
Check installed user-space driver version (when NVML fails to initialize, nvidia-smi cannot report GPU details, but the error message itself shows the library version, e.g. "NVML library version: 570.133"):
nvidia-smi
If they don't match, first try reloading the kernel modules, as sketched below. If the mismatch persists, reinstall the driver cleanly using the steps for your distribution.
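Reloading modules is often enough when the correct user-space libraries are already installed. A minimal sketch, assuming no processes still hold the GPU (check with sudo lsof /dev/nvidia* first):
# Stop the GUI so it releases the GPU (use lightdm, sddm, etc. if applicable)
sudo systemctl stop gdm
# Unload the stale NVIDIA kernel modules
sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia
# Load the freshly installed module and re-test
sudo modprobe nvidia
nvidia-smi
sudo systemctl start gdm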
For Ubuntu Systems:
# Stop the GUI so it releases the GPU (use lightdm, sddm, etc. if applicable)
sudo systemctl stop gdm
# Unload the NVIDIA kernel modules
sudo rmmod nvidia_uvm nvidia_drm nvidia_modeset nvidia
# Remove existing NVIDIA drivers (quote the glob so the shell doesn't expand it)
sudo apt remove --purge 'nvidia-*'
sudo apt autoremove
sudo apt update
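The commands above only remove the old driver. A reinstall sketch for Ubuntu, assuming the ubuntu-drivers utility is available, is:
# List the drivers Ubuntu recommends for the detected GPU
ubuntu-drivers devices
# Install the recommended driver, then reboot
sudo ubuntu-drivers autoinstall
sudo reboot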
For Rocky Linux Systems:
# Remove existing NVIDIA drivers (quote the glob so the shell doesn't expand it)
sudo dnf remove 'nvidia-*'
# Add NVIDIA repository
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
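Adding the repository does not install anything by itself. One way to install the driver from it (a sketch based on NVIDIA's DKMS module stream for rhel9; stream names may differ on your setup) is:
# Install the driver through the DKMS module stream, then reboot
sudo dnf module install nvidia-driver:latest-dkms
sudo reboot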
Driver Installation:
Manual Driver Installation (Advanced)
- Download the latest driver from NVIDIA:
  - Visit the NVIDIA driver download page (https://www.nvidia.com/Download/index.aspx)
  - Select your GPU model and download the .run file
- Stop the display manager:
sudo systemctl stop gdm      # For GNOME
sudo systemctl stop lightdm  # For LightDM-based systems
- Install the driver:
chmod +x NVIDIA-Linux-x86_64-*.run
sudo ./NVIDIA-Linux-x86_64-*.run --silent
- Restart the display manager:
sudo systemctl start gdm
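The steps above assume GDM. If you are unsure which display manager your system runs, you can query the standard display-manager.service alias:
# Show which display manager service is active on this system
systemctl status display-manager --no-pager | head -n 3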
Verification
- Confirm nvidia-smi works without errors:
nvidia-smi
Expected output should show GPU information without error messages.
- Verify driver version consistency:
cat /proc/driver/nvidia/version
nvidia-smi --query-gpu=driver_version --format=csv,noheader
Both commands should report the same driver version.
- Test GPU functionality:
nvidia-smi -q
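To automate the consistency check, here is a minimal sketch; the sed pattern assumes the usual "NVRM version: ... Kernel Module <version> ..." format of /proc/driver/nvidia/version:
# Extract the kernel module version and the user-space driver version
kver=$(sed -n 's/.*Kernel Module *\([0-9.]*\).*/\1/p' /proc/driver/nvidia/version)
uver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
# Compare them and report
[ "$kver" = "$uver" ] && echo "OK: $kver" || echo "MISMATCH: kernel=$kver user-space=$uver"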
Additional Notes / Tips
- Estimated time: 10-30 minutes, depending on the method used
- System behavior: GPU workloads will be interrupted during driver reload/reinstall
- User impact: All GPU-accelerated applications must be restarted after resolution
- Best practice: Always reboot after NVIDIA driver installation or updates
- Warning: Unloading NVIDIA modules will terminate all running CUDA applications
- Recommendation: Create a system backup before manual driver installation