Every dedicated GPU operator knows nvidia-smi. Most use only the default summary view. The tool has far more capability – process tracking, ECC error reporting, topology queries, and scriptable output. On our dedicated GPU hosting these less-used modes are worth knowing.
Process Listing
nvidia-smi pmon -c 1
Shows per-process GPU activity: PID, process name, and SM/memory utilisation, sampled once (-c 1). Useful for finding which vLLM replica is using which GPU on a multi-process server.
For machine-readable per-process output, including memory per process:
nvidia-smi --query-compute-apps=pid,process_name,gpu_bus_id,used_memory --format=csv
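The CSV output above is easy to post-process. A minimal sketch, grouping processes by GPU bus ID; the sample text and its PIDs are made-up examples, not output from a real host:

```python
import csv
import io

# Hypothetical sample of `nvidia-smi --query-compute-apps ... --format=csv`
# output; on a real host you would read this from the command instead.
sample = """pid, process_name, gpu_bus_id, used_memory [MiB]
41235, /usr/bin/python3, 00000000:17:00.0, 22314 MiB
41236, /usr/bin/python3, 00000000:65:00.0, 22310 MiB
"""

def procs_by_gpu(text):
    """Group compute processes by the GPU bus ID they run on."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    out = {}
    for row in reader:
        out.setdefault(row["gpu_bus_id"], []).append(
            (int(row["pid"]), row["used_memory [MiB]"])
        )
    return out

print(procs_by_gpu(sample))
```

The same pattern works for any --query-* mode: keep the header row, feed it to csv.DictReader, and index columns by name rather than position.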
Machine-Readable
For scripting or monitoring:
nvidia-smi --query-gpu=name,temperature.gpu,utilization.gpu,memory.used,memory.total \
--format=csv,noheader,nounits
Output is parsable CSV. Good for one-off checks without setting up Prometheus.
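With noheader,nounits the fields come back in the order you listed them, so a single split is enough. A sketch, using a made-up sample line; the 90% headroom threshold is an arbitrary example, not a recommendation:

```python
# Hypothetical sample line from the query above (csv,noheader,nounits);
# field order matches the --query-gpu list.
line = "NVIDIA A100-SXM4-40GB, 41, 87, 30522, 40960"

name, temp, util, mem_used, mem_total = [f.strip() for f in line.split(",")]
stats = {
    "name": name,
    "temperature_c": int(temp),
    "utilization_pct": int(util),
    "memory_used_mib": int(mem_used),
    "memory_total_mib": int(mem_total),
}

# Flag GPUs that are nearly out of memory (90% is an arbitrary example)
low_headroom = stats["memory_used_mib"] / stats["memory_total_mib"] > 0.9
```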
Topology
nvidia-smi topo -m
Shows how GPUs connect – PCIe root complex, NUMA node, interconnect type. Critical for multi-GPU tensor-parallel setups: two GPUs on the same NUMA node communicate faster than cross-socket pairs.
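The matrix cells use short labels for each link type (NV# for NVLink, PIX for same PCIe switch, NODE for same NUMA node, SYS for cross-socket). A sketch of how you might rank pairs by those labels when choosing tensor-parallel groups; the pairs and labels below are made-up examples, not real topo output:

```python
# Rough speed ranking of `nvidia-smi topo -m` link labels (lower = faster).
RANK = {"NV": 0, "PIX": 1, "PXB": 2, "PHB": 3, "NODE": 4, "SYS": 5}

# Hypothetical GPU pairs and their link labels, as read off the matrix.
pairs = {("GPU0", "GPU1"): "NV4", ("GPU0", "GPU2"): "NODE", ("GPU0", "GPU3"): "SYS"}

def score(label):
    # NVLink entries appear as NV1, NV2, ...; strip the digit for ranking
    return RANK["NV"] if label.startswith("NV") else RANK[label]

# Pick the pair with the fastest interconnect
best = min(pairs, key=lambda p: score(pairs[p]))
print(best)
```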
Continuous Logging
nvidia-smi dmon -s u,m,p,t -c 300 > gpu-log.txt
Samples utilisation, memory, power and temperature, and PCIe throughput once per second (the default interval) for 300 samples. Note the output is space-aligned columns with # header rows, not CSV. Useful for post-mortem on a load test or identifying thermal throttling.
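A sketch of scanning such a log after the fact for the hottest sample. The log lines below are a made-up example; real dmon column order can vary by driver version, which is why the code indexes columns by header name rather than position:

```python
# Hypothetical dmon log excerpt: '#' header rows, then space-aligned data.
log = """\
# gpu   pwr  gtemp    sm   mem     fb  rxpci  txpci
# Idx     W      C     %     %    MB   MB/s   MB/s
    0   312     74    98    61  30522    812   1650
    0   318     79    99    63  30524    790   1702
"""

header = None
peak_temp = 0
for line in log.splitlines():
    if line.startswith("#"):
        if header is None:
            header = line.lstrip("#").split()  # first header row has names
        continue
    row = dict(zip(header, line.split()))
    peak_temp = max(peak_temp, int(row["gtemp"]))

print(peak_temp)  # hottest GPU temperature seen in the log
```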
Flags:
-s u: utilisation (SM, memory, encoder, decoder)
-s m: frame buffer and BAR1 memory
-s p: power draw and temperature
-s t: PCIe throughput
-s c: clocks
-s e: ECC and PCIe replay errors
GPU Server Tooling Ready
Preinstalled nvidia-smi, DCGM, and Prometheus on UK dedicated GPU hosting.
Browse GPU Servers. See DCGM Exporter and GPU power management.