The RTX 5080 and RTX 5060 Ti 16GB both have 16 GB of VRAM and Blackwell FP8 tensor cores, and both are available on our hosting. Here is how they compare.
Specs
| Spec | 5060 Ti 16GB | 5080 16GB |
|---|---|---|
| CUDA cores | 4,608 | 10,752 |
| VRAM | 16 GB GDDR7 | 16 GB GDDR7 |
| Bandwidth | 448 GB/s | 960 GB/s |
| TDP | 180 W | 360 W |
| PCIe | Gen 5 x8 | Gen 5 x16 |
| Price (UK hosting) | Lower tier | Higher tier |
LLM Decode (Llama 3.1 8B FP8)
| Batch | 5060 Ti t/s | 5080 t/s | Ratio |
|---|---|---|---|
| 1 | 112 | 185 | 1.65x |
| 8 | 510 | 810 | 1.59x |
| 32 | 720 | 1,150 | 1.60x |
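The 5080 is roughly 60% faster at every batch size tested. That is a smaller gap than the 2.14x memory bandwidth ratio (960 vs 448 GB/s) would suggest, so decode throughput does not scale one-to-one with bandwidth here.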
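As a quick sanity check on those ratios, the sketch below recomputes the per-batch speedup and the raw bandwidth ratio from the figures in the two tables above. The dictionary and function names are illustrative only, not part of any benchmark tooling.

```python
# Figures copied from the Specs and LLM Decode tables above.
BANDWIDTH_GBPS = {"5060 Ti": 448, "5080": 960}

# batch size -> (5060 Ti tokens/s, 5080 tokens/s)
DECODE_TPS = {1: (112, 185), 8: (510, 810), 32: (720, 1150)}


def speedups(decode):
    """Return {batch: 5080 throughput / 5060 Ti throughput}, rounded to 2 places."""
    return {batch: round(fast / slow, 2) for batch, (slow, fast) in decode.items()}


ratios = speedups(DECODE_TPS)
bandwidth_ratio = round(BANDWIDTH_GBPS["5080"] / BANDWIDTH_GBPS["5060 Ti"], 2)

print("decode speedup by batch:", ratios)
print("bandwidth ratio:", bandwidth_ratio)
```

Swapping in your own measured tokens/s for `DECODE_TPS` gives the same comparison for a different model or quantisation.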
Aggregate Concurrency Ceiling
- 5060 Ti: ~720 t/s aggregate at ~35 concurrent Llama 3.1 8B chats
- 5080: ~1,150 t/s aggregate at ~55 concurrent Llama 3.1 8B chats
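To see what those ceilings mean for an individual user, a rough sketch can divide the aggregate rate by the number of chats. It assumes throughput is shared evenly across concurrent chats, which is a simplification (real schedulers and uneven prompt lengths will vary), and the names are hypothetical.

```python
# Aggregate ceilings from the list above: GPU -> (aggregate tokens/s, concurrent chats)
CEILINGS = {"5060 Ti": (720, 35), "5080": (1150, 55)}


def per_chat_tps(aggregate_tps, concurrent_chats):
    """Tokens/s each chat would see if the aggregate rate were split evenly (an assumption)."""
    return aggregate_tps / concurrent_chats


per_chat = {
    gpu: round(per_chat_tps(aggregate, chats), 1)
    for gpu, (aggregate, chats) in CEILINGS.items()
}

for gpu, tps in per_chat.items():
    print(f"{gpu}: ~{tps} tokens/s per chat at the ceiling")
```

Under that assumption both cards land at about the same per-chat rate at their ceiling, so the practical difference is how many users a single card can hold at that rate.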
Per-Watt Efficiency
| Metric | 5060 Ti | 5080 |
|---|---|---|
| Peak t/s | 720 | 1,150 |
| Draw at peak | 155 W | 305 W |
| tokens/Joule | 4.6 | 3.8 |
The 5060 Ti wins on energy efficiency (tokens per joule). The 5080 can still win on cost per unit of throughput if your workload needs the concurrency.
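The tokens/Joule row follows directly from the other two rows: a watt is one joule per second, so tokens per second divided by watts gives tokens per joule. A minimal sketch using the table's figures (the names are illustrative):

```python
# Peak throughput and power draw from the table above: GPU -> (tokens/s, watts)
PEAK = {"5060 Ti": (720, 155), "5080": (1150, 305)}


def tokens_per_joule(tokens_per_second, watts):
    """1 W = 1 J/s, so (tokens/s) / (J/s) = tokens/J."""
    return tokens_per_second / watts


efficiency = {gpu: tokens_per_joule(tps, watts) for gpu, (tps, watts) in PEAK.items()}
rounded = {gpu: round(value, 1) for gpu, value in efficiency.items()}

# Relative efficiency advantage of the 5060 Ti over the 5080.
advantage = round(efficiency["5060 Ti"] / efficiency["5080"], 2)

print(rounded)
print(f"5060 Ti delivers {advantage}x the tokens per joule of the 5080")
```

By these figures the 5060 Ti produces a little over a fifth more tokens per joule at peak load.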
Verdict
- 5060 Ti: best when you run multiple separate workloads, want value hosting, or care about efficiency
- 5080: best when one workload needs the higher single-card throughput, more concurrent users, or slightly larger models at tighter quantisation
Both have identical VRAM so they serve the same model catalogue. The upgrade is purely about throughput.
Blackwell Value vs Blackwell Performance
Both cards are available on UK dedicated hosting.
Order the RTX 5060 Ti 16GB
See also: 5060 Ti or 5080 decision, vs 3090, vs 4060, upgrading to 5090, tokens/watt.