GPU Power Management on a Dedicated Server

Power limits, clock speeds, and persistence mode: the nvidia-smi settings that affect both cost and performance on a dedicated GPU.

Tutorials April 23, 2026 1 min read admin

A modern GPU draws 150-575 W depending on model and workload. On our dedicated GPU hosting, power is bundled, but the settings still matter – they affect thermal headroom, performance consistency, and in some cases longevity.

Persistence mode
Power limit tuning
Clock locking
Tradeoffs

Persistence Mode

Persistence mode keeps the NVIDIA driver loaded and the GPU initialised even when no process is using it. Without it, the driver tears down GPU state when the last client exits and reinitialises on the next process launch – adding 1-3 seconds of cold-start latency. Always enable it on a server.

sudo nvidia-smi -pm 1

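The setting does not survive a reboot, so apply it from a systemd unit at boot. NVIDIA also ships the nvidia-persistenced daemon for the same purpose.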
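A minimal sketch of such a unit, assuming nvidia-smi is installed at /usr/bin/nvidia-smi (check with `command -v nvidia-smi`). The function name and the unit file name in the usage note are hypothetical. The function only prints the unit text, so nothing is installed until you redirect the output yourself.

```shell
#!/usr/bin/env bash
# Print a oneshot systemd unit that enables persistence mode at boot.
# Assumption: nvidia-smi lives at /usr/bin/nvidia-smi on this host.
render_persistence_unit() {
  cat <<'EOF'
[Unit]
Description=Enable NVIDIA persistence mode

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-smi -pm 1
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
EOF
}
```

To install it, one option is `render_persistence_unit | sudo tee /etc/systemd/system/nvidia-persistence-mode.service`, followed by `sudo systemctl daemon-reload` and `sudo systemctl enable --now nvidia-persistence-mode.service`.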

Power Limit

You can cap power draw below the card’s default. This is useful when thermals are marginal or when you want predictable power consumption:

sudo nvidia-smi -pl 300  # cap at 300 W

On an RTX 5090 (575 W default), capping at 400 W reduces performance by roughly 10-15% but cuts power by about 30%. For batch workloads not dominated by peak throughput, this is often a favourable trade.
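Because -pl takes watts rather than a percentage, it can help to do the conversion explicitly and to check the range the card accepts before setting a cap. The sketch below uses two hypothetical helper names: one converts a percentage of a reference wattage into whole watts, the other prints the card's minimum, maximum and default limits using nvidia-smi's query fields. nvidia-smi rejects values outside the reported minimum and maximum.

```shell
#!/usr/bin/env bash
# Convert "percent of a reference wattage" into a whole-watt value for
# nvidia-smi -pl. Integer arithmetic, rounded down.
watts_for_percent() {
  local reference_watts=$1 percent=$2
  echo $(( reference_watts * percent / 100 ))
}

# Print the min, max and default power limits (watts) for one GPU index.
# Needs the NVIDIA driver, so it is defined here but not called.
show_power_limits() {
  local gpu_index=${1:-0}
  nvidia-smi -i "$gpu_index" \
    --query-gpu=power.min_limit,power.max_limit,power.default_limit \
    --format=csv,noheader,nounits
}
```

For the 575 W card above, `watts_for_percent 575 70` prints 402, which is close to the 400 W cap discussed, and `watts_for_percent 575 80` prints 460.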

Clock Locking

For benchmark repeatability you can lock clocks:

sudo nvidia-smi --lock-gpu-clocks=1500,1980
sudo nvidia-smi --reset-gpu-clocks  # unlock

Locking removes boost-clock variance between benchmark runs, provided the locked range is low enough that the card stays inside its thermal and power limits. For production serving, leave auto-boost enabled – the GPU will push clocks higher than your lock would allow.
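A forgotten clock lock silently caps production performance, so it is worth wrapping the benchmark so the reset always runs. This is a sketch under stated assumptions: the function name is hypothetical, it must be run as root because it does not call sudo itself, and the NVIDIA_SMI variable is an illustrative override for the binary path, not an nvidia-smi feature. It does not handle signals, so an interrupted run still needs a manual reset.

```shell
#!/usr/bin/env bash
# Run a command with GPU clocks locked, then reset the clocks afterwards
# whether the command succeeded or failed.
# Assumptions: invoked as root; min/max MHz are values the GPU supports.
run_with_locked_clocks() {
  local min_mhz=$1 max_mhz=$2 status
  shift 2
  # NVIDIA_SMI is a hypothetical override for the nvidia-smi binary.
  "${NVIDIA_SMI:-nvidia-smi}" --lock-gpu-clocks="${min_mhz},${max_mhz}" || return 1
  "$@"
  status=$?
  "${NVIDIA_SMI:-nvidia-smi}" --reset-gpu-clocks
  # Report the benchmark's own exit status, not the reset's.
  return "$status"
}
```

Usage would look like `run_with_locked_clocks 1500 1980 ./my_benchmark`, where `./my_benchmark` stands in for your own binary. `nvidia-smi -q -d SUPPORTED_CLOCKS` lists the clock values a given card accepts.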

Tradeoffs

Setting	Performance	Power	Thermal
Default	100%	100%	Baseline
-pl 80% of max	~90-95%	80%	Lower temps
-pl 60% of max	~75%	60%	Much lower

Most production deployments run at default. Apply a power limit when the chassis runs hot or when you want deterministic thermal behaviour.
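The percentages in the table are rough guides, so it is worth measuring your own workload before and after changing a limit. The sketch below logs power draw, temperature and SM clock at a fixed interval, then averages the power column from a saved log. Both function names are hypothetical, and NVIDIA_SMI is an illustrative override for the binary path.

```shell
#!/usr/bin/env bash
# Stream one CSV line per GPU every N seconds (default 5):
#   timestamp, index, power draw (W), temperature (C), SM clock (MHz)
# Runs until interrupted; redirect the output to a file to keep a log.
sample_gpu_telemetry() {
  local interval_s=${1:-5}
  "${NVIDIA_SMI:-nvidia-smi}" \
    --query-gpu=timestamp,index,power.draw,temperature.gpu,clocks.sm \
    --format=csv,noheader,nounits -l "$interval_s"
}

# Read lines in the format above from stdin and print the mean power draw
# in watts to one decimal place. Prints nothing for empty input.
mean_power_watts() {
  awk -F', *' '{ sum += $3; n++ } END { if (n) printf "%.1f\n", sum / n }'
}
```

A typical session would be `sample_gpu_telemetry 5 > gpu.log` while the workload runs, Ctrl-C to stop, then `mean_power_watts < gpu.log`.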

GPU Servers with Sensible Defaults

UK dedicated hosting with persistence mode and thermal envelopes preconfigured.

See also: TDP across the lineup and the nvidia-smi deep dive.

admin

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
