If your primary constraint is “what model can I fit,” then the most honest benchmark is pounds per gigabyte of VRAM. That number does not tell you how fast the card runs, but it does tell you whether you are paying a premium for silicon you do not need. Below is the 2026 ranking across our UK dedicated GPU hosting.
Sections
- The ranking
- Why pounds-per-GB is not the only number
- Best value at each tier
- How to actually use this
The Ranking
Verdicts are based on approximate monthly rates as of Q2 2026; lower £/GB is better:
| GPU | VRAM | £/GB verdict |
|---|---|---|
| RTX 3090 | 24 GB | Best value in the 20+ GB tier |
| Ryzen AI Max+ 395 | 96 GB unified | Lowest overall £/GB (with bandwidth caveats) |
| Intel Arc Pro B70 | 32 GB | Best in 32 GB class outside CUDA |
| R9700 | 32 GB | Close second in 32 GB |
| RTX 4060 Ti 16GB | 16 GB | Best 16 GB on CUDA |
| RTX 5090 | 32 GB | Premium – you pay for speed, not capacity |
| RTX 6000 Pro | 96 GB | Cheapest path to 96 GB on one CUDA card |
| RTX 5080 | 16 GB | Premium 16 GB – speed tax |
| RTX 5060 | 8 GB | Premium entry – Blackwell tax |
| RTX 4060 | 8 GB | Best pure 8 GB value |
| RTX 3050 | 6 GB | Lowest monthly outlay, highest £/GB |
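The metric behind the table is nothing more than monthly price divided by VRAM capacity. A minimal sketch, using hypothetical monthly rates purely for illustration (not our actual pricing):

```python
# Hypothetical monthly rates in GBP — illustrative only, not real pricing.
cards = {
    "RTX 3090": (24, 120),   # (VRAM in GB, monthly GBP)
    "RTX 5090": (32, 400),
    "RTX 4060": (8, 45),
}

# Sort ascending by £/GB: the cheapest memory comes first.
ranked = sorted(cards.items(), key=lambda kv: kv[1][1] / kv[1][0])
for name, (vram, price) in ranked:
    print(f"{name}: £{price / vram:.2f}/GB")
```

With these made-up numbers the 3090 tops the list, which matches the pattern in the real table: older high-VRAM cards tend to win on £/GB even when newer cards win on speed.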
Why Pounds-per-GB Is Not the Only Number
Pounds per gigabyte ignores bandwidth, CUDA ecosystem, and FP8 availability. The Ryzen AI Max+ 395 wins on paper because 96 GB is cheap when it is system RAM. But that RAM runs at 256 GB/s – roughly a quarter of a 3090’s bandwidth. A large model decodes slowly there. See our bandwidth ranking for the speed side of the equation.
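The "decodes slowly" claim follows from a common rule of thumb: for a dense model, generating each token reads roughly every weight once, so decode throughput is capped at memory bandwidth divided by the model's size in memory. A rough sketch of that ceiling, assuming a hypothetical 70B-class model quantized to about 40 GB:

```python
def max_decode_tps(bandwidth_gb_s: float, model_gb: float) -> float:
    """Rough upper bound on tokens/sec for a dense model: each token
    reads ~every weight once, so bandwidth / model size is the ceiling."""
    return bandwidth_gb_s / model_gb

# Hypothetical 40 GB model (e.g. a quantized 70B-class checkpoint).
print(max_decode_tps(256, 40))  # Ryzen AI Max+ 395 class: ~6 tok/s ceiling
print(max_decode_tps(936, 40))  # RTX 3090 (936 GB/s): ~23 tok/s ceiling
```

Real throughput lands below these ceilings, but the ratio holds: a quarter of the bandwidth means roughly a quarter of the decode speed at the same capacity.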
Best Value at Each Tier
Budget tier: The 4060 is the pure value winner on absolute cost, with the 4060 Ti 16GB winning on cost per usable GB once you need real production capacity.
Mid tier: The 3090 with 24 GB remains unbeaten for CUDA workloads that need memory. Five years old, still the right answer.
Large tier: Arc Pro B70 wins £/GB in the 32 GB class if you can live outside CUDA. R9700 is close.
Flagship: The 6000 Pro is the only one-card path to 96 GB on CUDA. Priced accordingly, but nothing else competes.
Pay for the VRAM You Will Use
Our sizing team matches servers to workloads so you never pay for capacity you cannot saturate.
Browse GPU Servers
How to Actually Use This
Start by pinning down your model's VRAM requirement using our blog guides. Add 20-30% headroom for KV cache and batching. Then find the cheapest card in the ranking that clears that target. If that card is an older generation (3090, 4060 Ti), check whether its bandwidth supports your latency target. If yes, buy. If no, step up one rung.
For the full tier ladder with workload mapping, see the 2026 GPU tier ladder.