
SDXL Turbo Images/sec by GPU

Benchmark results for SDXL Turbo single-step image generation speed across six GPUs with cost-efficiency data for dedicated GPU hosting.

SDXL Turbo Benchmark Overview

SDXL Turbo uses adversarial diffusion distillation to generate images in as few as one sampling step, making it one of the fastest high-quality image generation models available. For real-time applications on a dedicated GPU server, SDXL Turbo can deliver sub-second image generation on the right hardware. We benchmark images per second across six GPUs.

All tests were run on GigaGPU servers at 512×512 resolution (SDXL Turbo’s optimal resolution) using 1-step and 4-step generation. SDXL Turbo requires approximately 6.5 GB of VRAM. For model comparisons, see the SD 1.5 vs SDXL speed benchmark.
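A measurement of this kind could be reproduced roughly as follows. This is a minimal sketch, not the harness used for the numbers in this article: it assumes the Hugging Face diffusers library, the `stabilityai/sdxl-turbo` checkpoint, fp16 weights, and a CUDA device. The helper names `measure_images_per_second`, `load_sdxl_turbo`, and `make_generator`, and the example prompt, are hypothetical.

```python
import time


def measure_images_per_second(generate, warmup=3, runs=20):
    """Time `generate()` and return throughput in images per second.

    `generate` is any zero-argument callable that produces one image.
    """
    for _ in range(warmup):  # warm-up runs are excluded from timing
        generate()
    start = time.perf_counter()
    for _ in range(runs):
        generate()
    elapsed = time.perf_counter() - start
    return runs / elapsed


def load_sdxl_turbo():
    """Load SDXL Turbo in fp16 on a CUDA GPU (assumed setup, see lead-in)."""
    # Imported here so the timing helper above works without these packages.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
    )
    return pipe.to("cuda")


def make_generator(pipe, steps=1, size=512, prompt="a photo of a red fox"):
    """Return a zero-argument callable that generates one image."""
    def generate():
        # SDXL Turbo is run without classifier-free guidance.
        return pipe(
            prompt=prompt,
            num_inference_steps=steps,
            guidance_scale=0.0,
            height=size,
            width=size,
        ).images[0]
    return generate
```

On a machine with a suitable GPU, `measure_images_per_second(make_generator(load_sdxl_turbo(), steps=1))` would return a 1-step throughput figure, and `steps=4` the 4-step equivalent. Results will vary with driver, library versions, and prompt length.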

Images/sec Results by GPU

| GPU | VRAM | SDXL Turbo 1-Step (img/s) | Time per Image |
|---|---|---|---|
| RTX 3050 | 6 GB | 1.8 | ~556ms |
| RTX 4060 | 8 GB | 3.5 | ~286ms |
| RTX 4060 Ti | 16 GB | 4.8 | ~208ms |
| RTX 3090 | 24 GB | 6.2 | ~161ms |
| RTX 5080 | 16 GB | 9.5 | ~105ms |
| RTX 5090 | 32 GB | 13.8 | ~72ms |

SDXL Turbo is fast even on entry-level hardware. The RTX 3050 delivers 1.8 images per second at 1 step, while the RTX 5090 reaches 13.8 images/sec (about 72ms per image), fast enough for interactive, real-time applications.
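The Time per Image column follows directly from throughput: assuming images are generated one at a time, latency in milliseconds is 1000 divided by images per second. A short sketch of that arithmetic, using the 1-step figures from the results table (the function and variable names are illustrative):

```python
def ms_per_image(images_per_second):
    """Convert throughput to per-image latency, assuming one image at a time."""
    return 1000.0 / images_per_second


# 1-step throughput figures copied from the results table.
ONE_STEP_IMG_PER_SEC = {
    "RTX 3050": 1.8,
    "RTX 4060": 3.5,
    "RTX 4060 Ti": 4.8,
    "RTX 3090": 6.2,
    "RTX 5080": 9.5,
    "RTX 5090": 13.8,
}

for gpu, ips in ONE_STEP_IMG_PER_SEC.items():
    print(f"{gpu}: ~{ms_per_image(ips):.0f} ms per image")
```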

1-Step vs 4-Step Comparison

More steps improve quality at the cost of speed. Below we compare 1-step and 4-step generation.

| GPU | 1-Step (img/s) | 4-Step (img/s) |
|---|---|---|
| RTX 3050 | 1.8 | 0.48 |
| RTX 4060 | 3.5 | 0.92 |
| RTX 4060 Ti | 4.8 | 1.25 |
| RTX 3090 | 6.2 | 1.62 |
| RTX 5080 | 9.5 | 2.48 |
| RTX 5090 | 13.8 | 3.60 |

At 4 steps, the RTX 5090 still manages 3.6 images/sec (~278ms per image), which is faster than most models at any step count. For applications where quality matters more than speed, 4-step generation is recommended.
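In these results, four steps cost a little less than four times one step (roughly 3.8x), which suggests a fixed per-image cost in addition to the per-step cost. The sketch below fits a simple two-point linear model, time = fixed + steps × per_step, to the figures in the comparison table. The linear model is an assumption made for illustration, not something measured in this benchmark, and rounding in the published figures makes the fixed-cost estimate approximate.

```python
# (1-step img/s, 4-step img/s) copied from the comparison table.
RESULTS = {
    "RTX 3050": (1.8, 0.48),
    "RTX 4060": (3.5, 0.92),
    "RTX 4060 Ti": (4.8, 1.25),
    "RTX 3090": (6.2, 1.62),
    "RTX 5080": (9.5, 2.48),
    "RTX 5090": (13.8, 3.60),
}


def split_latency(ips_1_step, ips_4_step):
    """Estimate (fixed_ms, per_step_ms) from 1-step and 4-step throughput."""
    t1 = 1000.0 / ips_1_step
    t4 = 1000.0 / ips_4_step
    per_step = (t4 - t1) / 3.0  # three extra steps separate the two runs
    fixed = t1 - per_step
    return fixed, per_step


for gpu, (one, four) in RESULTS.items():
    fixed, per_step = split_latency(one, four)
    print(f"{gpu}: slowdown {one / four:.2f}x, "
          f"~{per_step:.0f} ms/step, ~{fixed:.0f} ms fixed")
```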

Cost Efficiency Analysis

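| GPU | 1-Step (img/s) | Approx. Monthly Cost | img/s per Pound |
|---|---|---|---|
| RTX 3050 | 1.8 | ~£45 | 0.040 |
| RTX 4060 | 3.5 | ~£60 | 0.058 |
| RTX 4060 Ti | 4.8 | ~£75 | 0.064 |
| RTX 3090 | 6.2 | ~£110 | 0.056 |
| RTX 5080 | 9.5 | ~£160 | 0.059 |
| RTX 5090 | 13.8 | ~£250 | 0.055 |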

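The RTX 4060 Ti leads on cost efficiency at 0.064 img/s per pound of monthly cost, ahead of the RTX 5080 (0.059) and RTX 4060 (0.058). It offers outstanding value for SDXL Turbo; for a broader comparison, see our guide to the best GPU for Stable Diffusion.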
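The efficiency column is 1-step throughput divided by approximate monthly cost. The sketch below reproduces it and also derives a best-case cost per million images. The 30-day month and sustained full load are assumptions for illustration only; real utilisation will be lower, and the function names are hypothetical.

```python
# (1-step img/s, approx. monthly cost in GBP) copied from the cost table.
GPUS = {
    "RTX 3050": (1.8, 45),
    "RTX 4060": (3.5, 60),
    "RTX 4060 Ti": (4.8, 75),
    "RTX 3090": (6.2, 110),
    "RTX 5080": (9.5, 160),
    "RTX 5090": (13.8, 250),
}

SECONDS_PER_MONTH = 30 * 24 * 3600  # assumes a 30-day month


def img_per_sec_per_pound(ips, monthly_cost):
    """Throughput bought per pound of monthly cost."""
    return ips / monthly_cost


def pounds_per_million_images(ips, monthly_cost):
    """Best-case cost per million images, assuming sustained full load."""
    images_per_month = ips * SECONDS_PER_MONTH
    return monthly_cost / images_per_month * 1_000_000


best = max(GPUS, key=lambda gpu: img_per_sec_per_pound(*GPUS[gpu]))
for gpu, (ips, cost) in GPUS.items():
    print(f"{gpu}: {img_per_sec_per_pound(ips, cost):.3f} img/s per pound, "
          f"~£{pounds_per_million_images(ips, cost):.2f} per million images")
print("Most cost-efficient:", best)
```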

GPU Recommendations

  • Budget: RTX 4060 — 3.5 img/s at 1-step for development and moderate-traffic APIs.
  • Best value: RTX 4060 Ti — top cost efficiency with 4.8 img/s.
  • Real-time: RTX 5090 — 72ms per image enables truly interactive generation.
  • High throughput: RTX 5080 — excellent balance of speed and cost for production.
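One way to apply these recommendations is to pick the cheapest card that meets a latency target. A minimal sketch of that selection, using the 1-step figures and approximate monthly costs from this article; the function name and the example targets are hypothetical:

```python
# (1-step img/s, approx. monthly cost in GBP) copied from the tables above.
GPUS = {
    "RTX 3050": (1.8, 45),
    "RTX 4060": (3.5, 60),
    "RTX 4060 Ti": (4.8, 75),
    "RTX 3090": (6.2, 110),
    "RTX 5080": (9.5, 160),
    "RTX 5090": (13.8, 250),
}


def cheapest_gpu_within(max_ms_per_image):
    """Return the lowest-cost GPU whose 1-step latency meets the target, or None."""
    candidates = [
        (cost, gpu)
        for gpu, (ips, cost) in GPUS.items()
        if 1000.0 / ips <= max_ms_per_image
    ]
    return min(candidates)[1] if candidates else None


print(cheapest_gpu_within(300))  # a 300 ms budget
print(cheapest_gpu_within(100))  # a 100 ms budget
```

The same filter could be run on the 4-step figures where quality matters more than speed.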

For higher-quality image generation at lower speed, see our Flux.1 benchmark. Compare SDXL Turbo with the full SDXL model in our SD 1.5 vs SDXL comparison. Browse all results in the Benchmarks category.

Conclusion

SDXL Turbo is the fastest high-quality image generation model we have benchmarked. Its single-step capability means even budget GPUs can serve images in under a second (about 556ms on the RTX 3050), while high-end cards reach up to 13.8 images per second. For applications requiring instant visual feedback, SDXL Turbo on dedicated GPU hardware is a strong choice.


admin

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
