RTX 5060 Ti 16GB FLUX.1 Schnell Benchmark

FLUX.1-schnell on Blackwell 16GB: 4-step distilled image generation, with FP16 and FP8 throughput numbers.

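FLUX.1-schnell from Black Forest Labs is the 4-step distilled variant of FLUX.1, built to keep near state-of-the-art image quality while generating in very few steps. On the RTX 5060 Ti 16GB via our hosting, FLUX.1-schnell fits in VRAM: tightly at FP16 (14.8 GB peak) and with headroom at FP8 (9.2 GB peak).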

Setup

  • Diffusers 0.30, PyTorch 2.5
  • Model: black-forest-labs/FLUX.1-schnell
  • 12B-param diffusion transformer (DiT), T5 + CLIP text encoders
  • Resolution: 1024×1024
  • Licence: Apache 2.0 (schnell variant)
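
The post does not include its generation script, so the following is only a minimal sketch of how the setup above might be loaded with Diffusers. `FluxPipeline` and the model ID are real; the bfloat16 dtype, the `guidance_scale` and `max_sequence_length` values (taken from the public model card example, not from this benchmark), the prompt, and the output filename are assumptions.

```python
# Sketch only: dtype, prompt, and output path are assumptions,
# not the exact script behind the numbers in this post.
MODEL_ID = "black-forest-labs/FLUX.1-schnell"

# Generation settings for the timestep-distilled schnell variant.
GEN_KWARGS = {
    "height": 1024,
    "width": 1024,
    "num_inference_steps": 4,    # the setting this post recommends
    "guidance_scale": 0.0,       # schnell is run without classifier-free guidance
    "max_sequence_length": 256,  # T5 prompt length used in the model card example
}


def load_pipeline():
    # Heavy imports live inside the function so the settings above
    # can be inspected on a machine without a GPU.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    return pipe.to("cuda")


def generate(pipe, prompt, seed=0):
    import torch

    # A fixed seed keeps runs comparable across step counts.
    generator = torch.Generator("cpu").manual_seed(seed)
    return pipe(prompt, generator=generator, **GEN_KWARGS).images[0]


# Example use on a GPU machine (hypothetical prompt and filename):
#   pipe = load_pipeline()
#   generate(pipe, "a lighthouse at dusk").save("flux_schnell.png")
```

Changing `num_inference_steps` to 1 or 2 would correspond to the other rows in the throughput tables.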

FP16 Throughput

| Steps | Time  | VRAM peak |
|-------|-------|-----------|
| 1     | 1.9 s | 14.8 GB   |
| 2     | 2.5 s | 14.8 GB   |
| 4     | 3.8 s | 14.8 GB   |

FP16 just fits. 1-step produces passable images; 4-step is the recommended setting – under 4 seconds per 1024×1024 image.
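
The measurement method is not described in the post. One plausible way to collect time and peak-VRAM figures like these is sketched below; the `benchmark` helper, its warmup and run counts, and the use of PyTorch's allocator statistics (which can read lower than what `nvidia-smi` reports) are all assumptions.

```python
import time


def benchmark(fn, warmup=1, runs=3):
    """Return (mean seconds per call, peak allocated VRAM in GB or None)."""
    try:
        import torch
        cuda = torch.cuda.is_available()
    except ImportError:
        torch, cuda = None, False

    # Warm-up calls absorb one-off costs such as kernel compilation.
    for _ in range(warmup):
        fn()

    if cuda:
        torch.cuda.synchronize()
        torch.cuda.reset_peak_memory_stats()

    start = time.perf_counter()
    for _ in range(runs):
        fn()
    if cuda:
        # GPU work is asynchronous; wait for it before stopping the clock.
        torch.cuda.synchronize()
    elapsed = (time.perf_counter() - start) / runs

    peak_gb = torch.cuda.max_memory_allocated() / 1e9 if cuda else None
    return elapsed, peak_gb
```

Passing a zero-argument callable that wraps a single pipeline call, for example `lambda: pipe(prompt, num_inference_steps=4)`, would time one image per run.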

FP8 Throughput

| Steps | Time  | VRAM peak |
|-------|-------|-----------|
| 1     | 1.2 s | 9.2 GB    |
| 2     | 1.6 s | 9.2 GB    |
| 4     | 2.4 s | 9.2 GB    |

FP8 drops peak VRAM from 14.8 GB to 9.2 GB and cuts time per image by roughly 35-37% on Blackwell (3.8 s to 2.4 s at 4 steps). Quality is essentially indistinguishable from FP16 at 1024×1024. FP8-quantised weights are available from the community, or can be produced locally with ComfyUI's FP8 nodes.
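
A percentage like this can be read two ways: as less time per image or as more images per second. The arithmetic below works both out from the two throughput tables. The function names are illustrative only.

```python
# Seconds per 1024x1024 image, keyed by step count (from the tables above).
fp16 = {1: 1.9, 2: 2.5, 4: 3.8}
fp8 = {1: 1.2, 2: 1.6, 4: 2.4}


def time_saved(fp16_s, fp8_s):
    """Fraction of wall-clock time removed by switching to FP8."""
    return 1 - fp8_s / fp16_s


def throughput_gain(fp16_s, fp8_s):
    """Fractional increase in images per second from switching to FP8."""
    return fp16_s / fp8_s - 1


for steps in fp16:
    saved = time_saved(fp16[steps], fp8[steps])
    gain = throughput_gain(fp16[steps], fp8[steps])
    print(f"{steps} step(s): {saved:.0%} less time, {gain:.0%} more images/s")
```

On these numbers the time saved is about 36-37% at every step count, which is the same thing as roughly 56-58% more images per second.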

Fit and VRAM

  • FP16: 14.8 GB peak – no room for batch
  • FP8: 9.2 GB – fits batch 2 comfortably
  • CPU-offloaded text encoder (T5): reclaims ~3 GB at the cost of ~500 ms extra latency when encoding a prompt
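
The headroom behind these bullets is simple subtraction. The sketch below assumes the card's nominal 16 GB is fully usable (in practice it is slightly less) and that the ~3 GB T5 saving applies to the FP16 configuration, which the post does not state explicitly.

```python
CARD_GB = 16.0  # nominal capacity; usable VRAM is somewhat lower in practice


def headroom_gb(peak_gb, offloaded_gb=0.0):
    """Free VRAM left after the model's peak usage, minus anything offloaded to CPU."""
    return CARD_GB - (peak_gb - offloaded_gb)


configs = {
    "FP16": headroom_gb(14.8),
    "FP16 + T5 on CPU (assumed ~3 GB saved)": headroom_gb(14.8, offloaded_gb=3.0),
    "FP8": headroom_gb(9.2),
}

for name, free in configs.items():
    print(f"{name}: {free:.1f} GB free")
```

This gives about 1.2 GB free at FP16 and about 6.8 GB free at FP8, which is why only the FP8 configuration has room for a second image in the batch.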

vs SDXL

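| Metric                     | FLUX.1-schnell FP8 4-step | SDXL 30-step |
|----------------------------|---------------------------|--------------|
| Time/image @ 1024×1024     | 2.4 s                     | 3.4 s        |
| Peak VRAM                  | 9.2 GB                    | 9.2 GB       |
| Prompt adherence           | Noticeably better         | Baseline     |
| Typography / text in image | Much better               | Weak         |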

FLUX.1-schnell is faster AND higher quality than SDXL for most prompts. Unless you have an SDXL-specific LoRA or checkpoint you need, FLUX is the new default.
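
The speed advantage comes from step count, not from cheaper steps. The rough arithmetic below uses the table's figures; dividing total time by steps slightly overstates per-step cost, since the totals presumably also include text encoding and VAE decoding.

```python
flux = {"seconds": 2.4, "steps": 4}   # FLUX.1-schnell FP8
sdxl = {"seconds": 3.4, "steps": 30}  # SDXL


def per_step(run):
    """Approximate seconds per denoising step (includes fixed overheads)."""
    return run["seconds"] / run["steps"]


def images_per_hour(run):
    """Images per hour at batch size 1, back to back."""
    return 3600 / run["seconds"]


for name, run in (("FLUX.1-schnell", flux), ("SDXL", sdxl)):
    print(f"{name}: {per_step(run):.2f} s/step, {images_per_hour(run):.0f} images/hour")
```

Each FLUX step costs roughly five times as much as an SDXL step (about 0.6 s against about 0.11 s), but needing 4 steps instead of 30 more than makes up for it.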

See also: SDXL benchmark, SD 1.5 benchmark, ComfyUI setup, image studio, SD setup.

admin

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.