Flux.1 Benchmark Overview
Flux.1 by Black Forest Labs has emerged as one of the leading open image generation models, delivering exceptional prompt adherence and visual quality. Running Flux.1 inference on a dedicated GPU server requires a card with at least 12 GB of VRAM, as the model is significantly larger than Stable Diffusion. We benchmark images per second across six GPUs to help you choose the right hardware.
All tests were run on GigaGPU servers at 1024×1024 resolution with 20 sampling steps using the default Euler scheduler. Flux.1 Dev was used for all measurements. For comparisons with other image models, see our SD 1.5 vs SDXL speed benchmark.
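The measurement procedure described above can be sketched as a small timing harness. This is a minimal sketch under stated assumptions, not the harness used for the published numbers: `measure_throughput` and its `generate` callback are hypothetical names, and the warm-up count is an assumption, since the article does not describe its warm-up policy or software stack.

```python
import time
from typing import Callable


def measure_throughput(generate: Callable[[], object],
                       n_images: int = 10,
                       warmup: int = 2) -> float:
    """Return images per second for a zero-argument `generate` callable.

    `generate` is a hypothetical stand-in for one full 1024x1024,
    20-step Flux.1 Dev generation; it must block until the image is done.
    """
    if n_images <= 0:
        raise ValueError("n_images must be positive")

    # Warm-up runs are excluded so one-off costs (model load, kernel
    # compilation, cache fills) do not skew the average.
    for _ in range(warmup):
        generate()

    start = time.perf_counter()
    for _ in range(n_images):
        generate()
    elapsed = time.perf_counter() - start

    return n_images / elapsed


if __name__ == "__main__":
    # Dummy workload so the sketch runs without a GPU.
    rate = measure_throughput(lambda: time.sleep(0.01), n_images=5, warmup=1)
    print(f"{rate:.2f} img/s ({1 / rate:.3f} s per image)")
```

In a real run, `generate` would wrap whatever inference call is in use. If that call launches GPU work asynchronously, the callback should synchronise before returning (in PyTorch, for example, via `torch.cuda.synchronize()`), otherwise the timer stops too early.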
Images/sec Results by GPU
| GPU | VRAM | Flux.1 Dev 1024×1024 (images/sec) | Notes |
|---|---|---|---|
| RTX 3050 | 6 GB | N/A | Insufficient VRAM |
| RTX 4060 | 8 GB | N/A | Insufficient VRAM |
| RTX 4060 Ti | 16 GB | 0.12 | ~8.3 s per image |
| RTX 3090 | 24 GB | 0.19 | ~5.3 s per image |
| RTX 5080 | 16 GB | 0.28 | ~3.6 s per image |
| RTX 5090 | 32 GB | 0.42 | ~2.4 s per image |
Flux.1 is considerably more compute-intensive than SDXL. The RTX 5090 at 0.42 images/sec (~2.4 seconds per image) is the only GPU tested that approaches real-time generation speeds. The RTX 3090 at 5.3 seconds per image is still practical for batch generation.
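The per-image times quoted here are simply the reciprocal of the throughput figures. A short sketch of that conversion, using the numbers from the results table (the images-per-hour figure is a derived illustration that assumes the GPU is kept continuously busy):

```python
# Throughput figures (images/sec) from the results table above.
throughput = {
    "RTX 4060 Ti": 0.12,
    "RTX 3090": 0.19,
    "RTX 5080": 0.28,
    "RTX 5090": 0.42,
}


def seconds_per_image(images_per_sec: float) -> float:
    """Latency for one image is the reciprocal of throughput."""
    return 1.0 / images_per_sec


latency = {gpu: seconds_per_image(rate) for gpu, rate in throughput.items()}

for gpu, secs in latency.items():
    # Images per hour assumes back-to-back generation with no idle time.
    print(f"{gpu}: {secs:.1f} s per image, {3600 / secs:.0f} images per hour")
```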
Resolution Impact on Speed
Higher resolutions dramatically reduce throughput. Below we compare 512×512, 1024×1024, and 1536×1536 on the RTX 3090 and RTX 5090.
| Resolution | RTX 3090 (img/s) | RTX 5090 (img/s) |
|---|---|---|
| 512×512 | 0.52 | 1.15 |
| 1024×1024 | 0.19 | 0.42 |
| 1536×1536 | 0.08 | 0.18 |
At 512×512 the RTX 5090 breaks one image per second, while 1536×1536 drops to roughly one every 5.5 seconds. Choose your resolution target carefully based on your application needs.
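To see how throughput scales with pixel count, the figures in the resolution table can be compared against the 512×512 baseline. The sketch below is an illustration derived only from those numbers; the helper name `scaling_vs_base` is hypothetical.

```python
# Throughput (images/sec) from the resolution table above, keyed by side length.
results = {
    "RTX 3090": {512: 0.52, 1024: 0.19, 1536: 0.08},
    "RTX 5090": {512: 1.15, 1024: 0.42, 1536: 0.18},
}


def scaling_vs_base(rates, base=512):
    """For each resolution return (pixel ratio, slowdown) relative to `base`."""
    base_rate = rates[base]
    return {
        side: ((side / base) ** 2, base_rate / rate)
        for side, rate in rates.items()
    }


for gpu, rates in results.items():
    for side, (pixels, slowdown) in scaling_vs_base(rates).items():
        print(f"{gpu} {side}x{side}: {pixels:.2f}x pixels, {slowdown:.2f}x slower")
```

On these numbers the slowdown is smaller than the growth in pixel count: going from 512×512 to 1024×1024 quadruples the pixels but costs roughly 2.7× in throughput on both cards, and 1536×1536 (9× the pixels) costs roughly 6.4× to 6.5×.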
Cost Efficiency Analysis
| GPU | 1024×1024 (img/s) | Approx. Monthly Cost | img/s per £ of monthly cost |
|---|---|---|---|
| RTX 4060 Ti | 0.12 | ~£75 | 0.0016 |
| RTX 3090 | 0.19 | ~£110 | 0.0017 |
| RTX 5080 | 0.28 | ~£160 | 0.0018 |
| RTX 5090 | 0.42 | ~£250 | 0.0017 |
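The RTX 5080 offers the best cost efficiency for Flux.1, although the margin is narrow: all four cards fall between 0.0016 and 0.0018 img/s per pound. If you are looking for the best GPU for Flux, it represents an excellent balance of speed and price.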
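The efficiency column is throughput divided by monthly cost. The sketch below reproduces that calculation and adds a rough cost per 1,000 images. The second figure is not from the benchmark: it assumes a 30-day month with the GPU generating images 100% of the time, which real workloads rarely achieve.

```python
# (throughput in img/s, approximate monthly cost in GBP) from the table above.
gpus = {
    "RTX 4060 Ti": (0.12, 75),
    "RTX 3090": (0.19, 110),
    "RTX 5080": (0.28, 160),
    "RTX 5090": (0.42, 250),
}

# Assumption: 30-day month, GPU busy the whole time.
SECONDS_PER_MONTH = 30 * 24 * 3600


def efficiency(rate, monthly_cost):
    """Return (img/s per pound, pounds per 1,000 images at full utilisation)."""
    per_pound = rate / monthly_cost
    cost_per_1000 = monthly_cost / (rate * SECONDS_PER_MONTH) * 1000
    return per_pound, cost_per_1000


table = {gpu: efficiency(*spec) for gpu, spec in gpus.items()}
best = max(table, key=lambda gpu: table[gpu][0])

for gpu, (per_pound, cost_per_1000) in table.items():
    print(f"{gpu}: {per_pound:.5f} img/s per pound, "
          f"~£{cost_per_1000:.2f} per 1,000 images")
print(f"Most cost-efficient: {best}")
```

Printing five decimal places rather than four makes the small gaps between cards easier to see.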
GPU Recommendations
- Budget: RTX 3090 — 5.3s per image at 1024×1024 is workable for batch generation pipelines.
- Best value: RTX 5080 — highest images per pound with sub-4-second generation.
- Best speed: RTX 5090 — 2.4s per image for near-real-time generation APIs.
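One way to apply these recommendations is to pick the cheapest card that meets a per-image latency target. The helper below is a hypothetical sketch built only on the figures in this article; `cheapest_gpu` is an illustrative name, and it ignores factors such as VRAM headroom and availability.

```python
from typing import Optional

# (seconds per image at 1024x1024, approximate monthly cost in GBP),
# taken from the benchmark and cost tables in this article.
CANDIDATES = {
    "RTX 4060 Ti": (8.3, 75),
    "RTX 3090": (5.3, 110),
    "RTX 5080": (3.6, 160),
    "RTX 5090": (2.4, 250),
}


def cheapest_gpu(max_seconds_per_image: float) -> Optional[str]:
    """Return the lowest-cost GPU meeting the latency target, or None."""
    eligible = [
        (cost, gpu)
        for gpu, (secs, cost) in CANDIDATES.items()
        if secs <= max_seconds_per_image
    ]
    # Tuples sort by cost first, so min() picks the cheapest eligible card.
    return min(eligible)[1] if eligible else None


if __name__ == "__main__":
    for target in (10.0, 6.0, 4.0, 3.0, 2.0):
        print(f"<= {target:.0f} s per image: {cheapest_gpu(target)}")
```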
Compare Flux.1 with other image generation models in our SDXL Turbo benchmark or the SD 1.5 vs SDXL comparison. Browse all results in the Benchmarks category.
Conclusion
Flux.1 delivers superior image quality, but with higher compute requirements than Stable Diffusion models. In our tests every GPU with 16 GB or more of VRAM ran it, with the RTX 5080 and RTX 5090 providing the best experience. For high-volume image generation, consider batching requests or using lower resolutions to maximise throughput.