
Ollama GPU Not Detected: Fix Guide

Fix Ollama running on CPU instead of GPU. Covers NVIDIA driver detection, CUDA library issues, Docker configuration, and environment variable settings for GPU acceleration.

Symptom: Ollama Is Running on CPU

You installed Ollama on your GPU server, pulled a model, and ran it. But generation is painfully slow, and when you check resource usage, the GPU sits idle at 0% utilisation while the CPU is maxed out. Running ollama ps shows no GPU assignment, or the logs contain:

level=WARN msg="no NVIDIA GPU detected"
msg="inference compute" id=0 library=cpu

Ollama should detect NVIDIA GPUs automatically and use them. When it falls back to CPU, the cause is almost always one of a small set of configuration issues between Ollama, the NVIDIA driver, and the CUDA libraries.

Diagnostic Steps

# 1. Confirm the GPU is visible to the OS
nvidia-smi

# 2. Check Ollama's GPU detection
ollama serve  # Run in the foreground so detection logs print to the terminal
# Look for "NVIDIA GPU detected" or "no GPU" messages

# 3. Check which compute library Ollama selected
curl http://localhost:11434/api/ps

If nvidia-smi fails, fix the driver first using our CUDA installation guide.
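The log triage in step 2 can be scripted. Below is a minimal sketch: feed Ollama's startup log lines on stdin and classify the result. The function name is hypothetical; the messages it matches are the ones shown in the symptom section.

```shell
# Classify Ollama startup logs read from stdin.
check_gpu_log() {
  if grep -q 'no NVIDIA GPU detected'; then
    # The WARN line from the symptom section was found: CPU fallback.
    echo "cpu-fallback: no NVIDIA GPU detected - work through the fixes below"
  else
    echo "gpu-ok"
  fi
}
```

Pipe the service logs through it, e.g. `journalctl -u ollama --no-pager | check_gpu_log` on a systemd install.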

Fix 1: Install Missing CUDA Libraries

Ollama bundles its own CUDA runtime, but it still needs the NVIDIA driver and the driver's shared libraries (libnvidia-ml and libcuda):

# Ensure the NVIDIA driver is installed
sudo apt install nvidia-driver-550

# Ollama needs these libraries accessible
ls /usr/lib/x86_64-linux-gnu/libnvidia-ml.so*
ls /usr/lib/x86_64-linux-gnu/libcuda.so*

If these libraries are missing, the driver installation is incomplete. Reinstall the driver and reboot.
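The check above can be wrapped in a small reusable function. This is a sketch: the function name is an assumption, and the default path is the Debian/Ubuntu multiarch layout, so pass a different directory on other distros.

```shell
# Verify the NVIDIA shared libraries Ollama loads at startup are present.
# Usage: check_nvidia_libs [libdir]
check_nvidia_libs() {
  libdir="${1:-/usr/lib/x86_64-linux-gnu}"
  status=0
  for lib in libnvidia-ml.so libcuda.so; do
    # ls with a glob fails when no matching file exists
    if ! ls "$libdir/$lib"* >/dev/null 2>&1; then
      echo "missing: $lib (reinstall the NVIDIA driver and reboot)"
      status=1
    fi
  done
  return $status
}
```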

Fix 2: Docker GPU Passthrough for Ollama

If running Ollama in Docker, the container must have GPU access:

# Wrong: no GPU access
docker run -d ollama/ollama

# Correct: with GPU passthrough
docker run -d --gpus all -p 11434:11434 ollama/ollama

The NVIDIA Container Toolkit must be installed on the host. Our Docker GPU guide covers the full setup.
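If you deploy with Docker Compose instead, the equivalent GPU reservation looks like the sketch below. The service name and port mapping are illustrative, and the host still needs the NVIDIA Container Toolkit.

```yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all   # or an integer to limit how many GPUs the container sees
              capabilities: [gpu]
```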

Fix 3: Environment Variable Override

Force Ollama to use specific GPUs:

# Use GPU 0 only
CUDA_VISIBLE_DEVICES=0 ollama serve

# Use GPUs 0 and 1
CUDA_VISIBLE_DEVICES=0,1 ollama serve

If CUDA_VISIBLE_DEVICES is set to an empty string or an invalid index elsewhere in your environment, Ollama will see no GPUs.
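A quick way to audit this before starting Ollama is a sketch like the following; the function name is hypothetical, and it only distinguishes the "set but empty" case (no GPUs visible) from unset or populated values.

```shell
# Warn if CUDA_VISIBLE_DEVICES is set to an empty string in the current
# environment, which hides every GPU from Ollama.
check_cuda_visible() {
  if [ "${CUDA_VISIBLE_DEVICES+set}" = "set" ] && [ -z "$CUDA_VISIBLE_DEVICES" ]; then
    echo "CUDA_VISIBLE_DEVICES is set but empty - Ollama will see no GPUs"
    return 1
  fi
  echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-(unset, all GPUs visible)}"
}
```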

Fix 4: Update Ollama

Older Ollama versions had limited GPU support. Update to the latest release:

curl -fsSL https://ollama.com/install.sh | sh

Or download the latest binary directly. Each Ollama release improves GPU detection and adds support for newer driver versions.

Verification

# Pull a small model and test
ollama pull llama3.2:1b
ollama run llama3.2:1b "Hello, are you using my GPU?"

# While it runs, check GPU utilisation
nvidia-smi

You should see Ollama’s process in the nvidia-smi output, consuming VRAM and GPU compute. If GPU utilisation climbs during generation, the fix worked.
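If you want to script this verification, here is a small sketch. It reads utilisation percentages as printed by `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits` (one line per GPU); the function name and the default 10% threshold are arbitrary choices.

```shell
# Report "busy" if any GPU utilisation figure on stdin exceeds the threshold.
# Usage: nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits | any_gpu_busy [threshold]
any_gpu_busy() {
  threshold="${1:-10}"
  while read -r util; do
    if [ "$util" -gt "$threshold" ]; then
      echo "busy"
      return 0
    fi
  done
  echo "idle"
  return 1
}
```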

Multi-GPU Configuration

On multi-GPU servers, Ollama can split large models across GPUs automatically. Ensure all GPUs are visible:

CUDA_VISIBLE_DEVICES=0,1,2,3 ollama serve

For dedicated Ollama hosting, run it as a systemd service with the correct GPU environment, and use GPU monitoring to verify that all assigned GPUs are active during inference. For higher-throughput production workloads, vLLM is an alternative serving option; the tutorials section has related setup guides.
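The systemd setup mentioned above can be sketched as a drop-in override, assuming the `ollama.service` unit created by the official install script:

```ini
# /etc/systemd/system/ollama.service.d/override.conf
# Create or edit with: sudo systemctl edit ollama
[Service]
Environment="CUDA_VISIBLE_DEVICES=0,1,2,3"
```

Apply it with `sudo systemctl daemon-reload && sudo systemctl restart ollama`.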

GPU Servers for Ollama

GigaGPU dedicated servers with NVIDIA GPUs and pre-installed drivers — Ollama detects your GPU automatically.

Browse GPU Servers


We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
