For 24/7 AI inference, GPUs run near their TDP continuously. Power isn't free — it's a real line item.
RTX 5090 sustained AI load: ~555 W. At £0.20/kWh that's ~£80/month in power; an RTX 6000 Pro comes in at ~£84/mo, and smaller cards proportionally less. For multi-GPU clusters, power is the second-biggest line item after the GPU rental itself.
## Real draw by GPU
| GPU | TDP | Sustained AI draw | Monthly @ £0.20/kWh |
|---|---|---|---|
| RTX 3050 | 130 W | ~125 W | £18 |
| RTX 3060 12 GB | 170 W | ~165 W | £24 |
| RTX 5060 Ti | 180 W | ~175 W | £25 |
| RTX 3090 | 350 W | ~340 W | £49 |
| RTX 4090 | 450 W | ~430 W | £62 |
| RTX 5080 | 360 W | ~345 W | £50 |
| RTX 5090 | 575 W | ~555 W | £80 |
| RTX 6000 Pro | 600 W | ~580 W | £84 |
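The monthly figures above follow from one formula: sustained watts × hours in a month × tariff. A minimal sketch, assuming a 720-hour (30-day) month and a flat £0.20/kWh tariff; real tariffs vary by time of day, so treat both constants as assumptions:

```python
TARIFF_GBP_PER_KWH = 0.20
HOURS_PER_MONTH = 24 * 30  # 720 h; a 30-day month, for simplicity

def monthly_cost(sustained_watts: float) -> float:
    """Monthly energy cost in GBP for a GPU running 24/7 at a constant draw."""
    kwh = sustained_watts * HOURS_PER_MONTH / 1000
    return kwh * TARIFF_GBP_PER_KWH

for name, watts in [("RTX 3090", 340), ("RTX 5090", 555), ("RTX 6000 Pro", 580)]:
    print(f"{name}: £{monthly_cost(watts):.0f}/month")
```

Plugging in the sustained-draw column reproduces the table: 340 W → £49, 555 W → £80, 580 W → £84.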
## Energy per million tokens
For Mistral 7B FP8 on RTX 5090: 555 W × (1M / 1,920 tok/s ≈ 521 s) ≈ 289 kJ ≈ 80 Wh = ~£0.016 in energy alone.
## Verdict
Power is a real but secondary cost next to the GPU itself. For dedicated rentals, power is included, so it's zero as a separate line on your bill — it's embedded in the monthly rate. For an on-prem buyout, budget roughly £50-90/mo per high-end GPU.
## Bottom line
Energy is real. For dedicated rentals it's already factored in; for on-prem, add it to the buyout math. See power and cooling.