Benchmarks

SDXL Turbo Images/sec by GPU

Benchmark results for SDXL Turbo single-step image generation speed across six GPUs with cost-efficiency data for dedicated GPU hosting.

Benchmarks April 14, 2026 2 min read admin

Table of Contents

SDXL Turbo Benchmark Overview
Images/sec Results by GPU
1-Step vs 4-Step Comparison
Cost Efficiency Analysis
GPU Recommendations
Conclusion

SDXL Turbo Benchmark Overview

SDXL Turbo uses adversarial diffusion distillation to generate images in as few as one sampling step, making it one of the fastest high-quality image generation models available. For real-time applications on a dedicated GPU server, SDXL Turbo can deliver sub-second image generation on the right hardware. We benchmark images per second across six GPUs.

All tests were run on GigaGPU servers at 512×512 resolution (SDXL Turbo’s optimal resolution) using 1-step and 4-step generation. SDXL Turbo requires approximately 6.5 GB of VRAM. For model comparisons, see the SD 1.5 vs SDXL speed benchmark.
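A throughput figure such as images per second is usually obtained by timing repeated generations after a few warm-up runs. The following is a minimal sketch of that kind of harness, not the exact script used for these results. The names `measure_images_per_second` and `fake_generate` are hypothetical, and the stub sleeps instead of running a model so the example can run anywhere.

```python
import time
from typing import Callable


def measure_images_per_second(
    generate_one: Callable[[], object],
    warmup_runs: int = 3,
    timed_runs: int = 20,
) -> float:
    """Return throughput in images/sec for a callable that produces one image."""
    # Warm-up iterations are excluded so one-off costs (model load,
    # kernel compilation, cache fills) do not distort the measurement.
    for _ in range(warmup_runs):
        generate_one()

    start = time.perf_counter()
    for _ in range(timed_runs):
        generate_one()
    elapsed = time.perf_counter() - start
    return timed_runs / elapsed


def fake_generate() -> bytes:
    # Hypothetical stand-in for a real pipeline call: sleeps about 10 ms.
    time.sleep(0.01)
    return b""


if __name__ == "__main__":
    ips = measure_images_per_second(fake_generate)
    print(f"{ips:.1f} img/s")
```

When swapping in a real GPU pipeline, the callable needs to block until the image is actually finished. GPU work is often queued asynchronously, so a call that returns early would inflate the measured throughput.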

Images/sec Results by GPU

GPU	VRAM	SDXL Turbo 1-Step (img/s)	Time per Image
RTX 3050	6 GB	1.8 img/s	~556ms
RTX 4060	8 GB	3.5 img/s	~286ms
RTX 4060 Ti	16 GB	4.8 img/s	~208ms
RTX 3090	24 GB	6.2 img/s	~161ms
RTX 5080	16 GB	9.5 img/s	~105ms
RTX 5090	32 GB	13.8 img/s	~72ms

SDXL Turbo is fast on every card tested. Even the RTX 3050 delivers nearly 2 images per second at 1 step, and the RTX 5090 reaches 13.8 images/sec, or about 72ms per image, which is fast enough for interactive applications.
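The "Time per Image" column in the table above is simply the reciprocal of throughput, expressed in milliseconds. A short sketch of that conversion, using the 1-step figures from the table (the dictionary and function names are illustrative):

```python
# 1-step throughput in img/s, copied from the results table.
THROUGHPUT_1_STEP = {
    "RTX 3050": 1.8,
    "RTX 4060": 3.5,
    "RTX 4060 Ti": 4.8,
    "RTX 3090": 6.2,
    "RTX 5080": 9.5,
    "RTX 5090": 13.8,
}


def ms_per_image(images_per_second: float) -> float:
    """Convert throughput (img/s) to average latency per image (ms)."""
    return 1000.0 / images_per_second


for gpu, ips in THROUGHPUT_1_STEP.items():
    print(f"{gpu}: ~{ms_per_image(ips):.0f}ms")
```

This treats throughput and latency as interchangeable, which holds when images are generated one at a time. With batching, per-image latency would be higher than the reciprocal suggests.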

1-Step vs 4-Step Comparison

More steps improve quality at the cost of speed. Below we compare 1-step and 4-step generation.

GPU	1-Step (img/s)	4-Step (img/s)
RTX 3050	1.8	0.48
RTX 4060	3.5	0.92
RTX 4060 Ti	4.8	1.25
RTX 3090	6.2	1.62
RTX 5080	9.5	2.48
RTX 5090	13.8	3.60

At 4 steps, the RTX 5090 still manages 3.6 images/sec (~278ms per image), which remains fast relative to conventional multi-step diffusion sampling. For applications where quality matters more than speed, 4-step generation is recommended.
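In the table above, 4-step throughput is between about 3.75 and 3.84 times lower than 1-step throughput, slightly less than the 4x a pure per-step cost would predict. One way to read this, assuming time per image is a fixed overhead plus a constant cost per sampling step, is to solve for both terms from the two measurements. The linear model and the function name below are assumptions made for illustration, not something the benchmark measured directly.

```python
# (1-step img/s, 4-step img/s), copied from the comparison table.
RESULTS = {
    "RTX 3050": (1.8, 0.48),
    "RTX 4060": (3.5, 0.92),
    "RTX 4060 Ti": (4.8, 1.25),
    "RTX 3090": (6.2, 1.62),
    "RTX 5080": (9.5, 2.48),
    "RTX 5090": (13.8, 3.60),
}


def split_fixed_and_per_step(ips_1_step: float, ips_4_step: float) -> tuple[float, float]:
    """Estimate (fixed_ms, per_step_ms) assuming t(n) = fixed + n * per_step."""
    t1 = 1000.0 / ips_1_step  # ms per image at 1 step
    t4 = 1000.0 / ips_4_step  # ms per image at 4 steps
    per_step = (t4 - t1) / 3.0  # three extra steps account for the difference
    fixed = t1 - per_step
    return fixed, per_step


for gpu, (ips1, ips4) in RESULTS.items():
    fixed, per_step = split_fixed_and_per_step(ips1, ips4)
    print(f"{gpu}: fixed ~{fixed:.1f}ms, per step ~{per_step:.1f}ms")
```

Because the published figures are rounded, the estimated fixed overhead is only indicative. It mainly shows that the sampling steps, not the surrounding work, dominate generation time.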

Cost Efficiency Analysis

GPU	1-Step (img/s)	Approx. Monthly Cost	img/s per £ of Monthly Cost
RTX 3050	1.8	~£45	0.040
RTX 4060	3.5	~£60	0.058
RTX 4060 Ti	4.8	~£75	0.064
RTX 3090	6.2	~£110	0.056
RTX 5080	9.5	~£160	0.059
RTX 5090	13.8	~£250	0.055

The RTX 4060 Ti leads on cost efficiency at 0.064 img/s per pound of monthly cost. Among the six GPUs tested, it offers the best value for SDXL Turbo.
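The efficiency column is 1-step throughput divided by the approximate monthly cost. The sketch below reproduces that arithmetic from the table and ranks the GPUs. The names are illustrative, and the monthly costs are the approximate figures quoted above.

```python
# (1-step img/s, approximate monthly cost in GBP), copied from the table.
GPUS = {
    "RTX 3050": (1.8, 45),
    "RTX 4060": (3.5, 60),
    "RTX 4060 Ti": (4.8, 75),
    "RTX 3090": (6.2, 110),
    "RTX 5080": (9.5, 160),
    "RTX 5090": (13.8, 250),
}


def img_per_sec_per_pound(images_per_second: float, monthly_cost_gbp: float) -> float:
    """Throughput bought per pound of monthly hosting cost."""
    return images_per_second / monthly_cost_gbp


# Sort from most to least cost-efficient.
ranking = sorted(
    GPUS,
    key=lambda name: img_per_sec_per_pound(*GPUS[name]),
    reverse=True,
)

for name in ranking:
    print(f"{name}: {img_per_sec_per_pound(*GPUS[name]):.3f} img/s per pound")
```

The spread between cards is fairly narrow, so the ranking could shift with small changes in price.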

GPU Recommendations

Budget: RTX 4060 — 3.5 img/s at 1-step for development and moderate-traffic APIs.
Best value: RTX 4060 Ti — top cost efficiency with 4.8 img/s.
Real-time: RTX 5090 — 72ms per image enables truly interactive generation.
High throughput: RTX 5080 — excellent balance of speed and cost for production.
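One way to apply these recommendations is to start from a latency budget and pick the cheapest GPU that meets it. The sketch below does this using the 1-step throughput and approximate monthly cost figures from the tables above. The function name and the selection rule are illustrative assumptions, not part of the benchmark.

```python
from typing import Optional

# (1-step img/s, approximate monthly cost in GBP), copied from the tables.
GPU_DATA = {
    "RTX 3050": (1.8, 45),
    "RTX 4060": (3.5, 60),
    "RTX 4060 Ti": (4.8, 75),
    "RTX 3090": (6.2, 110),
    "RTX 5080": (9.5, 160),
    "RTX 5090": (13.8, 250),
}


def cheapest_gpu_within(target_ms: float) -> Optional[str]:
    """Return the lowest-cost GPU whose 1-step latency is within target_ms."""
    candidates = [
        (cost, name)
        for name, (ips, cost) in GPU_DATA.items()
        if 1000.0 / ips <= target_ms
    ]
    if not candidates:
        return None  # no tested GPU meets the budget
    return min(candidates)[1]


for budget in (600, 300, 100, 50):
    print(f"{budget}ms budget -> {cheapest_gpu_within(budget)}")
```

A real deployment would also need to account for concurrency, queueing, and network time, which this single-image latency figure does not capture.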

For higher-quality image generation at lower speed, see our Flux.1 benchmark. Compare SDXL Turbo with the full SDXL model in our SD 1.5 vs SDXL comparison. Browse all results in the Benchmarks category.

Conclusion

SDXL Turbo is the fastest high-quality image generation model we have benchmarked. Its single-step capability means even budget GPUs can serve images in under a second, while the RTX 5090 approaches 14 images per second. For applications requiring instant visual feedback, SDXL Turbo on dedicated GPU hardware is a strong choice.

