
AI Workload Power Consumption: What Each GPU Actually Draws Under Real Load

GPU TDP is the rated maximum. Real AI workloads draw close to TDP continuously. Numbers, energy cost per million tokens, and rack planning math.

For 24/7 AI inference, GPUs run near their TDP continuously. Power isn't free — it's a real line item.

TL;DR

RTX 5090 sustained AI load: ~555 W. At £0.20/kWh: ~£80/month in power. RTX 6000 Pro: ~£84/mo. Smaller cards draw proportionally less. For multi-GPU clusters, power is the second-biggest line item after the GPU rental itself.

Real draw by GPU

| GPU | TDP | Real AI sustained | Monthly @ £0.20/kWh |
|---|---|---|---|
| RTX 3050 | 130 W | ~125 W | £18 |
| RTX 3060 12 GB | 170 W | ~165 W | £24 |
| RTX 5060 Ti | 180 W | ~175 W | £26 |
| RTX 3090 | 350 W | ~340 W | £49 |
| RTX 4090 | 450 W | ~430 W | £62 |
| RTX 5080 | 360 W | ~345 W | £50 |
| RTX 5090 | 575 W | ~555 W | £80 |
| RTX 6000 Pro | 600 W | ~580 W | £84 |
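The monthly figures above follow directly from sustained draw × hours × tariff. A quick sketch, assuming a 30-day month (720 hours) and the £0.20/kWh tariff used throughout this guide:

```python
def monthly_power_cost(watts: float, rate_per_kwh: float = 0.20,
                       hours: float = 720) -> float:
    """Monthly energy cost (£) for a GPU running 24/7 at a sustained draw."""
    kwh = watts * hours / 1000  # total energy over the month in kWh
    return kwh * rate_per_kwh

# RTX 5090 at ~555 W sustained:
print(round(monthly_power_cost(555)))  # ~£80/month
```

Small rounding differences against the table come from whether you count a 30- or 31-day month.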

Energy per million tokens

For Mistral 7B FP8 on an RTX 5090: 555 W × (1M / 1,920 tok/s) ≈ 290 kJ ≈ 80 Wh — under £0.02 in energy alone at £0.20/kWh.
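The same per-token arithmetic as a sketch, using the ~555 W sustained draw and 1,920 tok/s throughput figures from this guide:

```python
def energy_per_million_tokens(watts: float, tokens_per_sec: float,
                              rate_per_kwh: float = 0.20):
    """Energy (Wh) and cost (£) to generate 1M tokens at a sustained draw."""
    seconds = 1_000_000 / tokens_per_sec  # time to produce 1M tokens
    wh = watts * seconds / 3600           # watt-seconds -> watt-hours
    cost = wh / 1000 * rate_per_kwh       # kWh x tariff
    return wh, cost

wh, cost = energy_per_million_tokens(555, 1920)
print(f"{wh:.0f} Wh, £{cost:.3f}")  # ~80 Wh, ~£0.016
```

At these throughputs, energy is pennies per million tokens — the GPU itself dominates cost per token.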

Verdict

Power is a meaningful but secondary cost next to the GPU itself. For dedicated rentals, power is included — it's embedded in the monthly price, so it adds nothing to your bill. For an on-prem buyout, budget roughly £50-85/month per high-end GPU, proportionally less for smaller cards.
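Folding power into the buyout math can be sketched as below. The purchase price and amortisation window here are hypothetical placeholders, not quotes:

```python
def on_prem_monthly_tco(purchase_price: float, amortisation_months: int,
                        power_cost_per_month: float) -> float:
    """Rough monthly cost of an owned GPU: hardware amortisation plus power."""
    return purchase_price / amortisation_months + power_cost_per_month

# Hypothetical: a £2,000 card amortised over 36 months, ~£80/mo in power
print(round(on_prem_monthly_tco(2000, 36, 80)))  # ~£136/month all-in
```

Compare that all-in figure against a dedicated rental price (power included) to see which side of the buy-vs-rent line you're on.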

Bottom line

Energy is real. For dedicated rental, it's already factored in. For on-prem, add it to the buyout math. See our guide on power and cooling.

Need a Dedicated GPU Server?

Deploy from RTX 3050 to RTX 5090. Full root access, NVMe storage, 1Gbps — UK datacenter.

Browse GPU Servers

gigagpu

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
