FLUX.1 Schnell is the fastest of the top-tier diffusion models in 2026: Apache-2.0 licensed, four inference steps per image, and output quality that rivals or beats SDXL. Every tier of our dedicated GPU hosting runs it; below is measured throughput per card.
VRAM
FLUX.1 Schnell needs roughly 22 GB of VRAM at FP16, ~12 GB at FP8, and ~7 GB quantized to 4-bit with bitsandbytes.
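These figures line up with simple arithmetic on Schnell's roughly 12B-parameter transformer (the parameter count is an approximation, and activations plus the T5 text encoder add overhead, so treat the result as a floor for the weights alone):

```python
def model_weight_gb(n_params: float, bits_per_param: int) -> float:
    """Rough size of the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30

N = 12e9  # approximate FLUX transformer parameter count (assumption)
for name, bits in [("FP16", 16), ("FP8", 8), ("INT4", 4)]:
    print(f"{name}: ~{model_weight_gb(N, bits):.1f} GiB of weights")
```

FP16 comes out to ~22.4 GiB of weights, which matches the ~22 GB figure once you remember the runtime total also includes text encoders and activations.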
Setup
```python
from diffusers import FluxPipeline
import torch

# Schnell is distilled for few-step sampling; bfloat16 is the recommended dtype.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    "A detailed urban sketch, high contrast ink on paper",
    num_inference_steps=4,    # Schnell is trained for 4 steps
    guidance_scale=0.0,       # guidance is distilled away; keep at 0
    max_sequence_length=256,  # shorter T5 context saves memory
).images[0]
image.save("sketch.png")
```
Benchmarks
1024×1024, 4 steps, time per image:
| GPU | Precision | Time |
|---|---|---|
| 4060 Ti 16GB | FP8 | ~2.8s |
| 3090 24GB | BF16 | ~1.6s |
| 5080 16GB | FP8 | ~1.1s |
| 5090 32GB | BF16 | ~0.7s |
| 6000 Pro 96GB | BF16 | ~0.6s |
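To turn the per-image times above into capacity-planning numbers, a small helper is enough (the hourly rates below are illustrative placeholders, not our actual pricing):

```python
def images_per_hour(seconds_per_image: float) -> int:
    """Sustained throughput, ignoring batching and model load time."""
    return int(3600 / seconds_per_image)

def cost_per_image(seconds_per_image: float, hourly_rate_usd: float) -> float:
    return hourly_rate_usd / images_per_hour(seconds_per_image)

# Times from the benchmark table; hourly rates are assumptions.
for gpu, t, rate in [("3090", 1.6, 0.30), ("5090", 0.7, 0.80)]:
    print(f"{gpu}: {images_per_hour(t)} img/h, ${cost_per_image(t, rate):.5f}/image")
```

At ~1.6 s/image a 3090 sustains over 2,000 images per hour from a single pipeline, before any batching.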
Schnell vs Dev
FLUX.1 Dev is a 28-step variant with higher quality and a non-commercial license. For commercial use, stick with Schnell; for portfolio or other non-commercial work, Dev can be worth the extra steps. In our comparisons on commercial workflows, Schnell typically lands within 5-10% of Dev's quality on most prompts while generating images 7-8x faster.
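The speed ratio follows almost directly from the step counts, assuming per-step cost is comparable between the two models (Dev's true classifier-free guidance adds some overhead on top, which is where the "8x" end of the range comes from):

```python
dev_steps, schnell_steps = 28, 4
ratio = dev_steps / schnell_steps
print(f"Dev runs {ratio:.0f}x the denoising steps of Schnell")
```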