Black Forest Labs released Flux.1 in two open-weight flavours: Dev for maximum quality and Schnell for maximum speed. They share the same architecture, a 12-billion-parameter rectified flow transformer, but diverge sharply in how many sampling steps they need and which licence they carry. Picking the wrong variant wastes either GPU cycles or output quality. Here is how to choose for your GPU hosting deployment.
Variant Overview
| Feature | Flux.1 Dev | Flux.1 Schnell |
|---|---|---|
| Parameters | 12B | 12B |
| Architecture | Rectified Flow Transformer | Rectified Flow Transformer |
| Recommended Steps | 20-50 | 1-4 |
| Distillation | Guidance-distilled | Timestep-distilled (few-step) |
| VRAM (FP16) | ~24 GB | ~24 GB |
| VRAM (FP8/NF4) | ~12 GB | ~12 GB |
| Native Resolution | Up to 2MP | Up to 2MP |
| Licence | Non-commercial (Flux.1-dev) | Apache 2.0 |
The licence difference matters enormously. Schnell ships under Apache 2.0 — use it however you like, including commercial products. Dev carries a non-commercial licence that restricts production use unless you purchase an enterprise agreement from BFL. That single factor may make your decision for you.
Quality Analysis
At their respective optimal step counts, both variants produce excellent images. Dev at 28 steps delivers the sharpest detail, most accurate text rendering, and best prompt adherence of the two variants. Schnell at 4 steps produces images that are slightly softer but still superior to SDXL at 30 steps.
The interesting comparison is Dev at low step counts versus Schnell. At 4 steps, Dev produces noticeably worse results than Schnell because Dev was not distilled for ultra-low step counts. Schnell’s aggressive distillation specifically targets the 1-4 step regime.
| Metric | Flux.1 Dev (28 steps) | Flux.1 Schnell (4 steps) | SDXL (30 steps) |
|---|---|---|---|
| Human Preference Rate | 82% | 74% | 61% |
| Text Rendering Accuracy | High | Medium-High | Low |
| Prompt Adherence | Excellent | Good | Good |
| Generation Time (RTX 5090) | ~12 seconds | ~1.5 seconds | ~4 seconds |
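As a rough illustration of how these step counts translate into code, here is a minimal sketch using the diffusers `FluxPipeline`. The repository ids and the `guidance_scale` and `max_sequence_length` values follow the public model cards as best understood and should be treated as assumptions, not tuned settings; `VARIANTS`, `pipeline_kwargs`, and `generate` are illustrative names.

```python
# Per-variant settings. Step counts match the comparison above; the other
# values are assumptions based on the public model cards.
VARIANTS = {
    "dev": {
        "repo": "black-forest-labs/FLUX.1-dev",
        "num_inference_steps": 28,
        "guidance_scale": 3.5,
        "max_sequence_length": 512,
    },
    "schnell": {
        "repo": "black-forest-labs/FLUX.1-schnell",
        "num_inference_steps": 4,
        "guidance_scale": 0.0,  # Schnell is typically run without guidance
        "max_sequence_length": 256,
    },
}


def pipeline_kwargs(variant: str) -> dict:
    """Return the sampling arguments for a variant, without the repo id."""
    cfg = VARIANTS[variant]
    return {key: value for key, value in cfg.items() if key != "repo"}


def generate(variant: str, prompt: str, seed: int = 0):
    """Load the chosen variant and render one image (needs a GPU and the weights)."""
    # Imported lazily so the settings above can be inspected without torch installed.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        VARIANTS[variant]["repo"], torch_dtype=torch.bfloat16
    )
    # Offloads idle components to CPU RAM to ease VRAM pressure (requires accelerate).
    pipe.enable_model_cpu_offload()
    generator = torch.Generator("cpu").manual_seed(seed)
    result = pipe(prompt, generator=generator, **pipeline_kwargs(variant))
    return result.images[0]


if __name__ == "__main__":
    image = generate("schnell", "a lighthouse at dusk, film photograph")
    image.save("flux-schnell.png")
```

Swapping `"schnell"` for `"dev"` changes only the repository and sampling arguments. Downloading the Dev weights may first require accepting the licence on Hugging Face.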
GPU Requirements
Both variants have identical VRAM footprints because they share the same architecture and parameter count. The difference is purely in compute time.
| GPU | Dev (28 steps) img/min | Schnell (4 steps) img/min | VRAM at FP16 |
|---|---|---|---|
| RTX 3090 | ~2 | ~12 | 24 GB (tight) |
| RTX 5090 | ~5 | ~30 | 24 GB (fits in 32 GB) |
| RTX 6000 Pro 96 GB | ~4 | ~25 | 24 GB (comfortable) |
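To make the throughput gap concrete, the sketch below converts the approximate images-per-minute figures from the table into wall-clock hours for a batch job. The 10,000-image batch size is an arbitrary example, and `batch_hours` is an illustrative helper that ignores model loading and queueing overhead.

```python
# Approximate images per minute, copied from the table above.
IMAGES_PER_MINUTE = {
    "RTX 3090": {"dev": 2, "schnell": 12},
    "RTX 5090": {"dev": 5, "schnell": 30},
    "RTX 6000 Pro 96 GB": {"dev": 4, "schnell": 25},
}


def batch_hours(gpu: str, variant: str, n_images: int) -> float:
    """Hours needed to render n_images on one GPU at the tabulated rate."""
    minutes = n_images / IMAGES_PER_MINUTE[gpu][variant]
    return minutes / 60


for gpu, rates in IMAGES_PER_MINUTE.items():
    for variant in rates:
        hours = batch_hours(gpu, variant, 10_000)
        print(f"{gpu:<20} {variant:<8} {hours:6.1f} h")
```

At these rates, a 10,000-image job on an RTX 5090 takes roughly 33 hours with Dev and under 6 hours with Schnell.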
At FP16, Flux.1 uses nearly all of the RTX 3090's 24 GB and leaves only modest headroom on the 32 GB RTX 5090. For comfortable headroom, use FP8 or NF4 quantisation (~12 GB) or deploy on an RTX 6000 Pro. In these figures, the 5090 holds a meaningful throughput advantage over the RTX 6000 Pro at this model size.
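The FP16 figure follows directly from the parameter count. The back-of-the-envelope sketch below estimates the size of the transformer weights alone at different precisions. It is a lower bound, not a measurement: text encoders, the VAE, activations, and quantisation metadata all add to it, so real-world usage at a given precision will be higher. `weight_gb` is an illustrative helper.

```python
PARAMS = 12e9  # transformer parameter count from the overview table

# Storage cost per parameter at each precision, in bytes.
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "fp8": 1.0, "nf4": 0.5}


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Size of the weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for precision, nbytes in BYTES_PER_PARAM.items():
    print(f"{precision:<10} ~{weight_gb(PARAMS, nbytes):.0f} GB of weights")
```

The 16-bit result of about 24 GB is why a 24 GB card is tight at full precision.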
Decision Framework
Choose Schnell when: You need commercial licensing, real-time or interactive generation, high-volume batch processing, or an image generation API where latency matters. Schnell at 4 steps outperforms SDXL at 30 steps on most quality metrics while being faster.
Choose Dev when: You have a non-commercial use case (research, internal tools, personal projects), maximum image quality justifies longer generation times, or you need the best possible text rendering in generated images.
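The two rules above can be condensed into a small helper. This is only a sketch of the decision logic as described here; the function and parameter names are hypothetical, and it is no substitute for reading the actual licence terms.

```python
def choose_variant(
    commercial_use: bool,
    latency_sensitive: bool,
    has_bfl_commercial_licence: bool = False,
) -> str:
    """Pick "schnell" or "dev" following the decision framework above."""
    # Dev's licence rules it out for commercial use without an agreement from BFL.
    if commercial_use and not has_bfl_commercial_licence:
        return "schnell"
    # Interactive, API, or high-volume workloads favour the 1-4 step model.
    if latency_sensitive:
        return "schnell"
    # Otherwise quality wins and the longer generation time is acceptable.
    return "dev"


print(choose_variant(commercial_use=True, latency_sensitive=False))
print(choose_variant(commercial_use=False, latency_sensitive=False))
```

The licence check comes first because it is a hard constraint, whereas latency is a preference.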
For comparison with the broader image model landscape, see SD 1.5 vs SDXL vs Flux.1 and SDXL Turbo vs SDXL.
Deployment Tips
Both variants can be served through ComfyUI, the diffusers library, or a dedicated API. For production deployments on image generation hosting, consider wrapping either variant behind FastAPI or Flask. Pair with an LLM for automated content workflows or social media generation.
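A minimal sketch of such a wrapper, using FastAPI, might look like the following. The `/generate` route, the request fields, and the `generate_png` callable are all hypothetical; the callable stands in for whatever code drives the loaded pipeline and returns PNG bytes. Step limits come from the "Recommended Steps" row of the overview table.

```python
from typing import Callable

# Recommended step ranges from the overview table, plus the defaults used in this article.
STEP_LIMITS = {"dev": (20, 50), "schnell": (1, 4)}
DEFAULT_STEPS = {"dev": 28, "schnell": 4}


def clamp_steps(variant: str, requested: int) -> int:
    """Clamp a requested step count into the variant's recommended range."""
    if variant not in STEP_LIMITS:
        raise ValueError(f"unknown variant: {variant!r}")
    low, high = STEP_LIMITS[variant]
    return max(low, min(high, requested))


def create_app(variant: str, generate_png: Callable[[str, int], bytes]):
    """Build a FastAPI app around a caller-supplied (prompt, steps) -> PNG bytes function."""
    # Imported lazily so clamp_steps can be used without FastAPI installed.
    from fastapi import FastAPI
    from fastapi.responses import Response
    from pydantic import BaseModel

    class GenerateRequest(BaseModel):
        prompt: str
        steps: int = DEFAULT_STEPS[variant]

    app = FastAPI()

    @app.post("/generate")
    def generate(req: GenerateRequest):
        # Keep client-supplied step counts inside the range the model was built for.
        steps = clamp_steps(variant, req.steps)
        png = generate_png(req.prompt, steps)
        return Response(content=png, media_type="image/png")

    return app
```

Clamping on the server side stops a client from requesting 50 steps from Schnell or 2 steps from Dev, both of which waste time or quality. A real deployment would also need to serialise access to the GPU, for example with a queue, which this sketch leaves out.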
Check the best GPU for image generation guide for hardware recommendations and the benchmark tool for real-time performance data.