Can RTX 5090 Run Flux.1 in FP16?

Yes, the RTX 5090 runs Flux.1 Dev in full FP16 with 32GB VRAM. Here are the benchmarks, VRAM usage, and setup guide for maximum quality.

GPU Comparisons | April 14, 2026 | 3 min read | gigagpu

Yes, the RTX 5090 runs Flux.1 in full FP16 precision. With 32GB GDDR7 VRAM, the RTX 5090 is one of the few single-GPU options that can load Flux.1 Dev at full FP16 quality without any quantisation. This gives you the highest possible image quality from Flux.1 with fast generation times.

Table of Contents

The Short Answer
VRAM Analysis
Performance Benchmarks
Setup Guide
Recommended Alternative

The Short Answer

YES. Flux.1 Dev FP16 needs ~26GB peak at 1024×1024. The RTX 5090’s 32GB handles this with 6GB to spare.

Flux.1 is a 12B-parameter diffusion transformer. In FP16 each parameter takes 2 bytes, so the model weights consume approximately 24GB. During 1024×1024 generation with 20 steps, peak VRAM usage including latent tensors, attention maps, and the text encoders (T5-XXL + CLIP-L) reaches roughly 26GB. The RTX 5090 fits this comfortably. For comparison, the RTX 3090’s 24GB cannot fit Flux.1 FP16 reliably, making the 5090 the entry point for full-precision Flux. See our image model VRAM guide for comparisons with other models.
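The 24GB weight figure follows directly from parameter count times bytes per parameter. A minimal sketch of that arithmetic is below; the `weights_gb` helper is a hypothetical name introduced for illustration, and it uses decimal gigabytes (1GB = 1e9 bytes), so it is an estimate rather than a measurement.

```shell
#!/usr/bin/env bash
# Estimate weight memory: parameters (in billions) x bytes per parameter.
# With decimal gigabytes (1 GB = 1e9 bytes), billions x bytes gives GB directly.
# weights_gb is an illustrative helper, not part of any tool.
weights_gb() {
  local params_billion="$1" bytes_per_param="$2"
  awk -v p="$params_billion" -v b="$bytes_per_param" \
    'BEGIN { printf "%g\n", p * b }'
}

weights_gb 12 2   # FP16: 2 bytes per parameter
weights_gb 12 1   # FP8: 1 byte per parameter
```

The first call prints 24 and the second prints 12, which is why FP8 is the usual route for 16GB cards.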

VRAM Analysis

Configuration	Model VRAM	Generation Overhead	Total Peak VRAM (1024×1024 unless noted)	Fits RTX 5090 (32GB)?
Flux.1 Dev FP16	~24GB	~2GB	~26GB	Fits
Flux.1 Dev FP16 + ControlNet	~24GB	~4.5GB	~28.5GB	Fits
Flux.1 Dev FP16 (1536×1536)	~24GB	~4GB	~28GB	Fits
Flux.1 Dev FP16 (2048×2048)	~24GB	~7GB	~31GB	Tight
Flux.1 Schnell FP16	~24GB	~1.5GB	~25.5GB	Fits well

At 1024×1024, you have 6GB of headroom for ControlNet, IP-Adapter, or other add-ons. At 1536×1536, the fit is still comfortable. Only at 2048×2048 resolution does it start to get tight. Batch size is limited to 1 at FP16 due to the model’s large footprint.
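The headroom figures can be reproduced by subtracting model VRAM and generation overhead from the card's capacity. The sketch below does that for a few rows of the table; `headroom_gb` is a hypothetical helper, and the inputs are the article's approximate figures, not fresh measurements.

```shell
#!/usr/bin/env bash
# Headroom = card capacity - (model weights + generation overhead), all in GB.
# Capacity defaults to 32 (RTX 5090). headroom_gb is an illustrative helper.
headroom_gb() {
  local model_gb="$1" overhead_gb="$2" capacity_gb="${3:-32}"
  awk -v m="$model_gb" -v o="$overhead_gb" -v c="$capacity_gb" \
    'BEGIN { printf "%g\n", c - (m + o) }'
}

headroom_gb 24 2      # Flux.1 Dev FP16 at 1024x1024
headroom_gb 24 4.5    # with ControlNet
headroom_gb 24 7      # at 2048x2048
headroom_gb 24 2 24   # same 1024x1024 workload on a 24GB card
```

A negative result, as in the last call, means the configuration does not fit without offloading or quantisation.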

Performance Benchmarks

GPU	Flux.1 Dev FP16 1024×1024 (20 steps)	Schnell FP16 1024×1024 (4 steps)
RTX 3090 (24GB)	OOM / borderline	OOM / borderline
RTX 5080 (16GB)	N/A (insufficient VRAM)	N/A (insufficient VRAM)
RTX 5090 (32GB)	~20s	~4.5s

The RTX 5090 generates a full-quality Flux.1 Dev FP16 image in about 20 seconds at 1024×1024 with 20 steps. Schnell at 4 steps takes roughly 4.5 seconds. These are true FP16 results with no quality loss from quantisation. For FP8 benchmarks on more GPUs, see our Flux.1 images/sec benchmark.
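To compare the two models on a per-step basis, or to estimate hourly throughput, the benchmark times can be converted as sketched below. The helper names are hypothetical, and the throughput figure assumes back-to-back single-image generation with the model already loaded, which is an assumption rather than something the benchmarks state.

```shell
#!/usr/bin/env bash
# Convert per-image times into seconds per step and images per hour.
# Both helpers are illustrative; inputs are the approximate benchmark figures.
sec_per_step() {
  local total_sec="$1" steps="$2"
  awk -v t="$total_sec" -v s="$steps" 'BEGIN { printf "%g\n", t / s }'
}

images_per_hour() {
  local total_sec="$1"
  # Truncates to whole images; ignores model load and queueing time.
  awk -v t="$total_sec" 'BEGIN { printf "%d\n", 3600 / t }'
}

sec_per_step 20 20     # Dev, 20 steps
sec_per_step 4.5 4     # Schnell, 4 steps
images_per_hour 20     # Dev
images_per_hour 4.5    # Schnell
```

Per step the two models cost about the same (roughly 1 second versus 1.125 seconds), so Schnell's speed advantage comes almost entirely from needing fewer steps.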

Setup Guide

ComfyUI is the recommended interface for Flux.1 FP16:

# Clone and set up ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt

# Launch without lowvram (not needed on 32GB)
python main.py --listen 0.0.0.0 --port 8188

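Download the Flux.1 Dev FP16 checkpoint (flux1-dev.safetensors), the T5-XXL and CLIP-L text encoder files, and the Flux VAE (ae.safetensors). Place them in the matching models/ subdirectories: in current ComfyUI layouts this is typically models/diffusion_models/ for the checkpoint (models/unet/ in older installs), models/text_encoders/ for the encoders (models/clip/ in older installs), and models/vae/ for the VAE. Do NOT use the --lowvram flag, as the RTX 5090 has enough VRAM to keep everything resident.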
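To check on your own card that everything stays resident, you can sample VRAM usage while a generation is running. The sketch below wraps a standard nvidia-smi query in a hypothetical `vram_report` helper; it assumes the NVIDIA driver tools are on the PATH and reports values in MiB.

```shell
#!/usr/bin/env bash
# Print GPU name, used VRAM and total VRAM (MiB) as CSV, one line per GPU.
# vram_report is an illustrative wrapper around nvidia-smi's query mode.
vram_report() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found; is the NVIDIA driver installed?" >&2
    return 1
  fi
  nvidia-smi --query-gpu=name,memory.used,memory.total \
             --format=csv,noheader,nounits
}

# Run once; repeat in a second terminal during generation to watch usage climb.
vram_report || true
```

A single reading is only a snapshot, so take several during a run to catch the highest value.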

For the best quality workflow, use the full Dev model with 20-30 steps and the Euler scheduler. The FP16 model produces sharper details and better text rendering than FP8 or NF4 quantised versions.

Recommended Alternative

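If 32GB is not enough (for example, batch generation or 2048×2048), a multi-GPU setup with two RTX 3090 cards provides 48GB in total. That memory is spread across two 24GB cards rather than pooled, so the model has to be split between them instead of loaded whole onto one. For Flux.1 on a budget, the RTX 5080 runs Flux.1 in FP8 with good quality at a lower price point.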

For SDXL on the 5090, check our SD 1.5 vs SDXL speed comparison. For LLM workloads, see the LLaMA 3 70B INT4 guide or multi-LLM guide. Browse all GPU options in the GPU Comparisons category or on our dedicated GPU hosting page.

gigagpu

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
