TDP and Power Draw Across the GigaGPU Lineup

Every GPU we host, ranked by rated TDP, with the implications for hosting cost, cooling, and tokens per watt.

Power draw matters more on a dedicated server than on a desktop because you are paying for it continuously. On our UK hosting that cost is bundled into your monthly price, but it still shapes which cards we can provision together in one server and which workloads pay back quickest. Here is every GPU in the lineup ranked by TDP.

TDP Ordered

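| GPU | TDP | VRAM |
| --- | --- | --- |
| RTX 5090 | 575 W | 32 GB |
| RTX 5080 | 360 W | 16 GB |
| RTX 3090 | 350 W | 24 GB |
| RTX 6000 Pro | ~300 W | 96 GB |
| R9700 | ~260 W | 32 GB |
| AMD RX 9070 XT | ~250 W | 16 GB |
| Intel Arc Pro B70 | ~220 W | 32 GB |
| RTX 4060 Ti | 165 W | 16 GB |
| RTX 5060 | 150 W | 8 GB |
| Ryzen AI Max+ 395 (SoC) | ~120 W | 96 GB unified |
| RTX 4060 | 115 W | 8 GB |
| RTX 3050 | 115 W | 6 GB |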

Real vs Datasheet

TDP is a thermal design number, not a measured draw. At idle even the biggest cards pull 20-40 W. Under sustained compute-bound load, such as batch prefill or training, most cards run at 85-100% of their rated TDP. LLM decode tends to be lighter: decoding Llama 3 8B on a 5090 pulls maybe 300 W, not 575 W.
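To see where a card actually sits relative to its rated TDP, you can sample the power the driver reports while your workload runs. The following is a minimal sketch, assuming an NVIDIA card with `nvidia-smi` on the PATH. The function names are hypothetical helpers for illustration, not part of any tool we ship.

```python
import subprocess


def parse_power_samples(text):
    """Parse nvidia-smi output (one wattage per line) into a list of floats."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            samples.append(float(line))
        except ValueError:
            # Skip non-numeric readings such as "[N/A]".
            continue
    return samples


def fraction_of_tdp(samples, tdp_watts):
    """Return the mean measured draw as a fraction of the rated TDP."""
    if not samples:
        raise ValueError("no power samples")
    return (sum(samples) / len(samples)) / tdp_watts


def read_power_once():
    """Query the current draw of every GPU in watts (requires an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_power_samples(result.stdout)
```

As an illustration with made-up readings, `fraction_of_tdp([290.0, 310.0, 300.0], 575)` comes out at roughly 0.52, which matches the decode example above: about 300 W on a 575 W card.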

Tokens Per Watt

The interesting ranking is efficiency. For Llama 3 8B INT8 decode, rough tokens per second per watt:

| GPU | Rough tokens/sec/W |
| --- | --- |
| RTX 6000 Pro | ~0.30 |
| RTX 5090 | ~0.27 |
| RTX 5080 | ~0.26 |
| RTX 4060 Ti | ~0.22 |
| RTX 3090 | ~0.18 |

Newer silicon is more efficient even if the TDP numbers look higher. A 5090 draws more watts but puts them to better use. See our tokens per watt deep dive.
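Tokens per second per watt is the same thing as tokens per joule, so the figures in the table convert directly into energy per unit of output. The sketch below does that arithmetic using the rough table values. The function names are illustrative, and the results inherit all the roughness of the inputs.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 1000 W * 3600 s

# Rough tokens/sec/W for Llama 3 8B INT8 decode, copied from the table above.
TOKENS_PER_SEC_PER_WATT = {
    "RTX 6000 Pro": 0.30,
    "RTX 5090": 0.27,
    "RTX 5080": 0.26,
    "RTX 4060 Ti": 0.22,
    "RTX 3090": 0.18,
}


def tokens_per_kwh(tokens_per_sec_per_watt):
    """Convert tokens/sec/W (tokens per joule) into tokens per kWh."""
    return tokens_per_sec_per_watt * JOULES_PER_KWH


def kwh_per_million_tokens(tokens_per_sec_per_watt):
    """Energy in kWh needed to generate one million tokens."""
    return 1_000_000 / tokens_per_kwh(tokens_per_sec_per_watt)


if __name__ == "__main__":
    for gpu, eff in TOKENS_PER_SEC_PER_WATT.items():
        print(f"{gpu}: {kwh_per_million_tokens(eff):.2f} kWh per million tokens")
```

On these rough numbers the RTX 6000 Pro needs a little under 1 kWh per million tokens and the RTX 3090 about 1.5 kWh, which is the efficiency gap expressed in energy terms.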

What It Means

First, the 5090 is the hungriest card in the lineup. Two of them in one box can draw up to 1,150 W at rated TDP on the GPUs alone, which needs a serious PSU, and we price that accordingly. Second, the 6000 Pro is surprisingly efficient given its capacity: 96 GB of VRAM at roughly 300 W, well under the 5090's 575 W. Third, the entry cards (3050, 4060, 5060) can fit in compact servers with modest power infrastructure, which is why they dominate the “your first AI server” bucket.
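To put a rough number on "a serious PSU", here is a back-of-the-envelope sizing sketch. The 300 W allowance for CPU, drives, and fans and the 80% maximum sustained load are assumptions chosen for illustration, not our provisioning rules, and the function name is hypothetical.

```python
def psu_watts_needed(gpu_tdps_watts, platform_watts=300.0, max_load_fraction=0.8):
    """Estimate a PSU rating from GPU TDPs.

    platform_watts: assumed allowance for CPU, drives, and fans.
    max_load_fraction: assumed ceiling on sustained PSU load (0 < f <= 1).
    """
    if not 0 < max_load_fraction <= 1:
        raise ValueError("max_load_fraction must be in (0, 1]")
    total_watts = sum(gpu_tdps_watts) + platform_watts
    return total_watts / max_load_fraction
```

Under those assumptions, `psu_watts_needed([575, 575])` lands at roughly 1,800 W for a dual-5090 box, while `psu_watts_needed([115])` is roughly 520 W for a single entry card.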

For purchasing strategy, see VRAM per pound.


