
How to Run Flux.1 on a Dedicated GPU Server

Deploy Black Forest Labs' Flux.1 image generation model on a dedicated GPU server. Covers VRAM requirements, ComfyUI and diffusers setup, CLI commands, and optimization.

What Makes Flux.1 Different

Flux.1 from Black Forest Labs represents a significant leap in text-to-image generation. Built on a flow matching architecture with a 12-billion-parameter transformer backbone, it delivers exceptional prompt adherence, natural text rendering, and photorealistic output. Running Flux.1 on a dedicated GPU server gives you the high VRAM and compute needed for fast generation without the queue times of shared services.

GigaGPU’s Flux.1 hosting provides servers with RTX 5090 and RTX 6000 Pro GPUs that meet Flux.1’s substantial VRAM requirements. Whether you are building a production image generation API, creating marketing content at scale, or fine-tuning with LoRAs, dedicated hardware ensures consistent throughput. This guide covers both ComfyUI and Python diffusers workflows for Flux.1 deployment.

GPU VRAM Requirements for Flux.1

Flux.1 is more VRAM-hungry than Stable Diffusion XL due to its larger transformer architecture. For a full GPU comparison, see our best GPU for Stable Diffusion and Flux benchmark.

| Model Variant | Precision | VRAM Required | Recommended GPU |
|---|---|---|---|
| Flux.1 Schnell | FP16 | ~16 GB | 1x RTX 5090 |
| Flux.1 Schnell | FP8 | ~10 GB | 1x RTX 3090 |
| Flux.1 Dev | FP16 | ~24 GB | 1x RTX 5090 32 GB |
| Flux.1 Dev | FP8 | ~14 GB | 1x RTX 5090 |
| Flux.1 Dev + ControlNet | FP16 | ~32 GB | 1x RTX 6000 Pro |
| Flux.1 Pro (via API) | N/A | N/A | API-only |

For the highest resolution and batch sizes, GigaGPU’s image generator hosting offers RTX 6000 Pro 96 GB servers.
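A quick sanity check before provisioning: the table's figures can be encoded in a small helper that tells you whether a given variant should fit on a given card. The numbers below are the article's approximate values, and the 2 GB headroom default is an assumption to account for activations; real usage varies with resolution and batch size.

```python
# Approximate VRAM figures from the table above (GB, weights + typical overhead).
VRAM_GB = {
    ("schnell", "fp16"): 16,
    ("schnell", "fp8"): 10,
    ("dev", "fp16"): 24,
    ("dev", "fp8"): 14,
    ("dev+controlnet", "fp16"): 32,
}

def fits(variant: str, precision: str, gpu_vram_gb: float, headroom_gb: float = 2.0) -> bool:
    """Return True if the variant should fit with some headroom for activations."""
    return VRAM_GB[(variant, precision)] + headroom_gb <= gpu_vram_gb

print(fits("dev", "fp16", 32))  # RTX 5090 32 GB: fits
print(fits("dev", "fp16", 24))  # 24 GB card: too tight once activations are counted
```

This is why the table recommends the 32 GB RTX 5090 for Dev at FP16: the weights alone land at ~24 GB, leaving nothing for activations on a 24 GB card.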

Preparing Your GPU Server

Update your system and verify GPU access:

sudo apt update && sudo apt upgrade -y
sudo apt install -y python3 python3-pip python3-venv git wget
nvidia-smi

Create a virtual environment for the Flux.1 stack:

python3 -m venv ~/flux-env
source ~/flux-env/bin/activate
pip install --upgrade pip

Install PyTorch with CUDA. Our PyTorch GPU installation guide covers driver compatibility:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
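Before downloading multi-gigabyte weights, it is worth confirming the install actually sees the GPU. A minimal check:

```python
import torch

print("torch:", torch.__version__)
print("CUDA build:", torch.version.cuda)       # e.g. "12.1" for the cu121 wheel
print("GPU visible:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```

If `GPU visible` prints `False`, fix the driver or CUDA wheel mismatch before proceeding; everything downstream depends on it.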

Authenticate with Hugging Face to download Flux.1 weights:

pip install huggingface_hub
huggingface-cli login

Running Flux.1 with ComfyUI

ComfyUI is the most popular interface for Flux.1 thanks to its node-based workflow system. Read our full ComfyUI setup guide for detailed installation steps.

git clone https://github.com/comfyanonymous/ComfyUI.git ~/ComfyUI
cd ~/ComfyUI
python3 -m venv venv && source venv/bin/activate
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

Download the Flux.1 Dev checkpoint and supporting models. Note that the FLUX.1-dev repository is gated: accept the licence on Hugging Face first, then pass your access token to wget for the gated files:

# Flux.1 Dev diffusion model (ComfyUI stores it under models/unet/)
wget --header="Authorization: Bearer $HF_TOKEN" -P ~/ComfyUI/models/unet/ \
  https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors

# Text encoders (CLIP-L and T5-XXL)
wget -P ~/ComfyUI/models/clip/ \
  https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors
wget -P ~/ComfyUI/models/clip/ \
  https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors

# VAE
wget --header="Authorization: Bearer $HF_TOKEN" -P ~/ComfyUI/models/vae/ \
  https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/ae.safetensors

Launch ComfyUI and load a Flux.1 workflow:

python main.py --listen 0.0.0.0 --port 8188
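Besides the web UI, ComfyUI exposes an HTTP API on the same port, so you can queue generations from scripts. A sketch using only the standard library, assuming a workflow graph exported from the UI with "Save (API Format)" (the filename below is a placeholder):

```python
import json
import urllib.request
import uuid

def build_prompt_payload(workflow, client_id=None):
    """Wrap an API-format workflow graph in the JSON body that POST /prompt expects."""
    body = {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}
    return json.dumps(body).encode("utf-8")

def queue_prompt(workflow, host="127.0.0.1", port=8188):
    """Queue a workflow on a running ComfyUI instance; returns JSON with a prompt_id."""
    req = urllib.request.Request(
        f"http://{host}:{port}/prompt",
        data=build_prompt_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (against a running server):
#   with open("flux_workflow_api.json") as f:   # exported via "Save (API Format)"
#       print(queue_prompt(json.load(f)))
```

Generated images land in ComfyUI's `output/` directory as usual; you can poll `/history` with the returned `prompt_id` to find them.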

Running Flux.1 with Diffusers

For programmatic access, use the Hugging Face diffusers library directly:

pip install diffusers transformers accelerate sentencepiece protobuf

Generate an image with a simple Python script:

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16  # Flux.1 weights are published in bfloat16
)
pipe.to("cuda")

image = pipe(
    prompt="A majestic snowy owl perched on a branch at twilight, photorealistic",
    num_inference_steps=28,
    guidance_scale=3.5,
    height=1024,
    width=1024
).images[0]

image.save("flux_output.png")

For Flux.1 Schnell (faster, distilled variant):

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16  # Flux.1 weights are published in bfloat16
)
pipe.to("cuda")

image = pipe(
    prompt="Abstract watercolour landscape with mountains",
    num_inference_steps=4,    # Schnell is distilled for ~4 steps
    guidance_scale=0.0,       # Schnell ignores classifier-free guidance
    height=1024,
    width=1024,
    max_sequence_length=256   # Schnell caps T5 prompts at 256 tokens
).images[0]

Creating an API Endpoint

Wrap Flux.1 in a FastAPI server for production use:

pip install fastapi uvicorn python-multipart

Create a minimal server script and run it:

uvicorn flux_api:app --host 0.0.0.0 --port 8000

Test the endpoint:

curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A cyberpunk cityscape at night", "steps": 28}' \
  --output result.png

For managed API deployments, explore GigaGPU’s API hosting platform.

Performance Optimization

Get faster Flux.1 generation with these tips:

  • Use FP8 quantisation — Flux.1 Dev at FP8 fits on a single RTX 5090 with minimal quality loss.
  • Try Schnell for speed — The distilled Schnell variant generates quality images in just 4 steps, ideal for real-time applications.
  • Enable torch.compile — Add pipe.transformer = torch.compile(pipe.transformer) for up to 30% faster inference after warmup.
  • Offload T5 encoder — Use pipe.enable_model_cpu_offload() to free VRAM after text encoding for larger batch sizes.
  • Pick the right GPU — Read our RTX 3090 vs RTX 5090 comparison for image generation benchmarks.

If you also work with Stable Diffusion, our guide on deploying Stable Diffusion covers SDXL setup. For a broader overview of GigaGPU’s Stable Diffusion hosting options, visit the dedicated landing page. Browse more tutorials in our model guides category.

Run Flux.1 on High-VRAM GPU Servers

Generate stunning images with Flux.1 on dedicated RTX 5090 and RTX 6000 Pro GPUs. Full root access, pre-installed CUDA, and no queue times.

Browse GPU Servers


admin

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
