Yes: with 16GB of GDDR7 VRAM, the RTX 5080 runs Flux.1 well in quantised modes, handling Flux.1 Schnell in FP8 at 1024×1024 and Flux.1 Dev in NF4 quantisation. Full FP16 requires 24GB+, but the quantised variants deliver excellent quality, helped by the 5080's fast memory bandwidth.
## The Short Answer
YES for Schnell FP8, Dev FP8, and Dev NF4. NO for Dev FP16 (needs ~26GB).
Flux.1 is a 12B parameter diffusion transformer from Black Forest Labs. In FP16, weights alone consume about 24GB, ruling out the RTX 5080 at full precision. However, FP8 quantisation halves that to roughly 12GB, and NF4 brings it to about 7GB. Both fit within 16GB with room for generation overhead. The Schnell (fast) variant needs only 4 inference steps compared to 20-50 for Dev, making it the faster option. For a full memory analysis, see our image model VRAM requirements guide.
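The weight-memory arithmetic above is just parameter count × bits per weight; a quick sketch (real checkpoints land slightly higher than the NF4 figure because some layers are kept in higher precision):

```python
def weight_vram_gb(params_billion: float, bits_per_weight: int) -> float:
    """Weight-only footprint in GB: parameters × bits-per-weight / 8."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Flux.1's 12B-parameter transformer at each precision
for name, bits in [("FP16", 16), ("FP8", 8), ("NF4", 4)]:
    print(f"{name}: ~{weight_vram_gb(12, bits):.0f} GB")
```

This reproduces the ~24GB FP16 and ~12GB FP8 figures; NF4 comes out at ~6GB for the weights alone, with the remaining ~1GB in the table covering layers left unquantised.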
## VRAM Analysis
| Configuration | Model VRAM | Generation Overhead | Total (1024×1024) | RTX 5080 (16GB) |
|---|---|---|---|---|
| Flux.1 Dev FP16 | ~24GB | ~2GB | ~26GB | No |
| Flux.1 Dev FP8 | ~12GB | ~1.5GB | ~13.5GB | Fits |
| Flux.1 Dev NF4 | ~7GB | ~1.5GB | ~8.5GB | Fits well |
| Flux.1 Schnell FP8 | ~12GB | ~1.2GB | ~13.2GB | Fits |
| Flux.1 Schnell NF4 | ~7GB | ~1GB | ~8GB | Fits well |
FP8 is the recommended quantisation level on the 5080 since it retains nearly all of FP16’s quality while fitting comfortably. NF4 introduces slight softness in fine details but frees up VRAM for ControlNet or IP-Adapter workflows.
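Before downloading a checkpoint, you can sanity-check the table's totals against your card's VRAM; a minimal sketch using the approximate figures from the table above:

```python
# Approximate totals from the VRAM table above (model + generation overhead, GB)
TOTALS = {
    "Dev FP16": 26.0,
    "Dev FP8": 13.5,
    "Dev NF4": 8.5,
    "Schnell FP8": 13.2,
    "Schnell NF4": 8.0,
}

def fits(vram_gb: float) -> list[str]:
    """Return the configurations whose estimated total fits in the given VRAM."""
    return [name for name, total in TOTALS.items() if total <= vram_gb]

print(fits(16))  # RTX 5080: everything except Dev FP16
```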
## Performance Benchmarks
| GPU | Schnell FP8 1024×1024 (4 steps) | Dev FP8 1024×1024 (20 steps) |
|---|---|---|
| RTX 4060 Ti (16GB) | ~18s | ~85s |
| RTX 3090 (24GB) | ~12s | ~55s |
| RTX 5080 (16GB) | ~7s | ~32s |
| RTX 5090 (32GB) | ~4.5s | ~20s |
The RTX 5080 generates a Flux.1 Schnell image in about 7 seconds, making it practical for interactive use and small-batch production. Dev mode at 32 seconds per image is suitable for batch pipelines where quality is paramount. For more image generation benchmarks, see our Flux.1 images/sec benchmark.
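Those per-image times translate directly into batch throughput; a quick estimate using the 5080 figures from the benchmark table:

```python
def batch_minutes(seconds_per_image: float, n_images: int) -> float:
    """Wall-clock minutes for a batch, ignoring model load time."""
    return seconds_per_image * n_images / 60

# RTX 5080: Schnell FP8 ~7 s/image, Dev FP8 ~32 s/image
print(f"100 Schnell images: ~{batch_minutes(7, 100):.0f} min")  # ~12 min
print(f"100 Dev images: ~{batch_minutes(32, 100):.0f} min")     # ~53 min
```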
## Setup Guide
ComfyUI is the recommended way to run Flux.1 on the RTX 5080:
```bash
# Clone ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt

# Launch (no lowvram needed for FP8 on 16GB)
python main.py --listen 0.0.0.0 --port 8188
```
Download the Flux.1 Schnell FP8 checkpoint and place it in `models/unet/`. Use the built-in FP8 loader nodes. The 5080 does not need the `--lowvram` flag for FP8 models, as everything stays within 16GB without offloading.
For the GGUF/NF4 variant, install the ComfyUI GGUF nodes extension and use the appropriate NF4 checkpoint file.
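Once the server is running, generations can also be queued programmatically: ComfyUI exposes an HTTP `/prompt` endpoint that accepts a workflow graph (export one from the UI with "Save (API Format)"). A minimal stdlib sketch, assuming the default host and port from the launch command:

```python
import json
import urllib.request

COMFY_HOST = "127.0.0.1:8188"  # matches --listen/--port above

def build_request(workflow: dict, host: str = COMFY_HOST) -> urllib.request.Request:
    """Wrap a workflow graph in the JSON body ComfyUI's /prompt endpoint expects."""
    body = json.dumps({"prompt": workflow}).encode()
    return urllib.request.Request(
        f"http://{host}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def queue_prompt(workflow: dict) -> dict:
    """POST the workflow; the response includes a prompt_id for tracking."""
    with urllib.request.urlopen(build_request(workflow)) as resp:
        return json.loads(resp.read())
```

Pass the dict loaded from your exported API-format JSON to `queue_prompt`; images land in ComfyUI's `output/` directory.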
## Recommended Alternative
For Flux.1 Dev in full FP16 quality, the RTX 3090 with 24GB is the minimum viable card. The RTX 5090 with 32GB runs Dev FP16 with room for ControlNet. For the RTX 5090 Flux.1 FP16 analysis, see our dedicated guide.
For SDXL on the same card, check the RTX 5080 SDXL guide. For LLM workloads, see DeepSeek on the 5080 or Mistral 7B FP16 on the 5080. Find the best card for image generation in our best GPU guide and browse servers on our dedicated GPU hosting page.
## Deploy This Model Now
Dedicated GPU servers with the VRAM you need. UK datacenter, full root access.
Browse GPU Servers