
nvidia-smi Shows No Devices: Troubleshooting

Fix nvidia-smi showing no devices or failing to communicate with the NVIDIA driver. Covers driver reinstallation, kernel module loading, secure boot issues, and hardware verification.

The Problem: nvidia-smi Sees Nothing

You run nvidia-smi on your GPU server and get one of these frustrating outputs:

No devices were found
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.
Make sure that the latest NVIDIA driver is properly installed.

This means the NVIDIA kernel driver is either not installed, not loaded, or cannot bind to your GPU. Without nvidia-smi working, nothing else does — no PyTorch, no TensorFlow, no vLLM, no inference of any kind.

Systematic Diagnosis

Work through these checks in order. Stop at the first failure — that is your root cause.

Check 1: Does the hardware exist?

lspci | grep -i nvidia

If this returns nothing, the system does not see an NVIDIA GPU on the PCIe bus. This could mean the GPU is not physically seated, the server needs a BIOS update, or the card is faulty. On a dedicated GPU server, contact your provider.
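If the card does appear, it is also worth checking which kernel driver (if any) has claimed it. A slightly fuller query (guarded so it prints a message instead of failing on machines with no NVIDIA device):

```shell
# Show the NVIDIA device together with the kernel driver bound to it.
# "Kernel driver in use: nvidia" is the healthy state; "nouveau" means the
# open-source driver has claimed the card and must be blacklisted first.
lspci -nnk 2>/dev/null | grep -iA3 nvidia || echo "no NVIDIA device visible"
```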

Check 2: Is the driver package installed?

dpkg -l | grep nvidia-driver
# or
rpm -qa | grep nvidia-driver

If no package appears, install the driver:

sudo apt update && sudo apt install nvidia-driver-550

Check 3: Is the kernel module loaded?

lsmod | grep nvidia

You should see nvidia, nvidia_modeset, nvidia_drm, and nvidia_uvm. If none appear:

sudo modprobe nvidia

If modprobe fails, the module was most likely not built for your running kernel: install matching headers and reinstall the driver (the exact commands are in the kernel-update section below). If the module exists but still refuses to load, continue to Check 4.

Check 4: Is Secure Boot blocking the module?

mokutil --sb-state

If Secure Boot is enabled, unsigned kernel modules cannot load. Either disable Secure Boot in the BIOS or sign the NVIDIA module with a Machine Owner Key (MOK). On cloud GPU servers, Secure Boot is typically disabled by default.
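If you need to keep Secure Boot on, the signing route looks roughly like this. The key file names are illustrative, and the sign-file path shown is the Ubuntu/Debian location (RHEL-family systems keep it under /usr/src/kernels/):

```shell
# 1. Generate a signing key pair (names MOK.priv/MOK.der are illustrative)
openssl req -new -x509 -newkey rsa:2048 -keyout MOK.priv -outform DER \
  -out MOK.der -nodes -days 36500 -subj "/CN=NVIDIA module signing/"

# 2. Enroll the public key; mokutil asks for a one-time password, then you
#    confirm enrollment in the blue MOK manager screen on the next reboot
sudo mokutil --import MOK.der

# 3. After the enrollment reboot, sign each NVIDIA module for this kernel
for mod in $(modinfo -n nvidia nvidia_modeset nvidia_drm nvidia_uvm); do
  sudo /usr/src/linux-headers-$(uname -r)/scripts/sign-file sha256 \
    MOK.priv MOK.der "$mod"
done

sudo modprobe nvidia
```

Note that the signing step has to be repeated whenever the module is rebuilt for a new kernel, which is one more argument for simply disabling Secure Boot on a dedicated inference box.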

Clean Driver Reinstallation

When diagnosis points to a broken driver, the most reliable path is a clean reinstall:

# Remove all existing NVIDIA packages
sudo apt purge 'nvidia-*' -y
sudo apt autoremove -y

# Reboot to ensure all modules are unloaded
sudo reboot

# After reboot, install fresh
sudo apt update
sudo apt install nvidia-driver-550
sudo reboot

After the second reboot, nvidia-smi should display your GPU. Our CUDA installation guide covers the full driver installation process including CUDA toolkit setup.

When a Kernel Update Breaks the Driver

This is the most common reason for nvidia-smi to suddenly stop working on a previously functioning server. Ubuntu's unattended upgrades can install a new kernel for which the NVIDIA module has never been built, typically because the matching linux-headers package is missing and DKMS cannot rebuild the module at boot.

# Check if the running kernel matches the installed headers
uname -r
apt list --installed 2>/dev/null | grep linux-headers

If they do not match:

sudo apt install linux-headers-$(uname -r)
sudo apt install --reinstall nvidia-driver-550
sudo reboot

To prevent this in future, hold the driver and the kernel meta-packages (holding only the currently running linux-image-$(uname -r) package does not stop new kernels being pulled in):

sudo apt-mark hold nvidia-driver-550 linux-image-generic linux-headers-generic

# Confirm the holds
apt-mark showhold

Verification After the Fix

# nvidia-smi should now show your GPU
nvidia-smi

# Verify CUDA works end-to-end (get_device_name is guarded so the
# check reports failure cleanly instead of raising when no GPU is seen)
python3 -c "
import torch
print(f'GPU detected: {torch.cuda.is_available()}')
if torch.cuda.is_available():
    print(f'Device: {torch.cuda.get_device_name(0)}')
"

Preventing Future Detection Failures

  • Pin your NVIDIA driver version with apt-mark hold to prevent automatic updates from breaking it.
  • Set up GPU monitoring that alerts when nvidia-smi fails — catching the issue before it impacts production.
  • Use Docker containers for inference workloads so that driver updates do not affect running services.
  • Schedule driver updates during planned maintenance windows, not through unattended-upgrades.
  • For multi-GPU setups running PyTorch or TensorFlow, test nvidia-smi before and after any system update.
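The monitoring point above can be as simple as a cron job. A minimal sketch (the function name and logger tag are illustrative):

```shell
# gpu_check: minimal watchdog sketch. Prints an ALERT line and returns
# non-zero when nvidia-smi is missing or cannot talk to the driver;
# prints an OK line with the detected GPU name otherwise.
gpu_check() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "ALERT: nvidia-smi not installed"
    return 1
  fi
  if ! nvidia-smi >/dev/null 2>&1; then
    echo "ALERT: nvidia-smi cannot communicate with the NVIDIA driver"
    return 1
  fi
  echo "OK: $(nvidia-smi --query-gpu=name --format=csv,noheader | head -n 1)"
}

# Example: run from cron every few minutes and log failures for your alerter
gpu_check || logger -t gpu-check "GPU detection failure"
```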

GPU Servers That Just Work

GigaGPU pre-configures NVIDIA drivers on every dedicated server. nvidia-smi shows your GPU from the first login.

Browse GPU Servers
