
How to Run Flux.1 on a Dedicated GPU Server

Deploy Black Forest Labs' Flux.1 image generation model on a dedicated GPU server. Covers VRAM requirements, ComfyUI and diffusers setup, CLI commands, and optimization.

What Makes Flux.1 Different

Flux.1 from Black Forest Labs represents a significant leap in text-to-image generation. Built on a flow matching architecture with a 12-billion-parameter transformer backbone, it delivers exceptional prompt adherence, natural text rendering, and photorealistic output. Running Flux.1 on a dedicated GPU server gives you the high VRAM and compute needed for fast generation without the queue times of shared services.

GigaGPU’s Flux.1 hosting provides servers with RTX 5090 and RTX 6000 Pro GPUs that meet Flux.1’s substantial VRAM requirements. Whether you are building a production image generation API, creating marketing content at scale, or fine-tuning with LoRAs, dedicated hardware ensures consistent throughput. This guide covers both ComfyUI and Python diffusers workflows for Flux.1 deployment.

GPU VRAM Requirements for Flux.1

Flux.1 is more VRAM-hungry than Stable Diffusion XL due to its larger transformer architecture. For a full GPU comparison, see our best GPU for Stable Diffusion and Flux benchmark.

| Model Variant | Precision | VRAM Required | Recommended GPU |
|---|---|---|---|
| Flux.1 Schnell | FP16 | ~16 GB | 1x RTX 5090 |
| Flux.1 Schnell | FP8 | ~10 GB | 1x RTX 3090 |
| Flux.1 Dev | FP16 | ~24 GB | 1x RTX 5090 32 GB |
| Flux.1 Dev | FP8 | ~14 GB | 1x RTX 5090 |
| Flux.1 Dev + ControlNet | FP16 | ~32 GB | 1x RTX 6000 Pro |
| Flux.1 Pro (via API) | N/A | N/A | API-only |

For the highest resolution and batch sizes, GigaGPU’s image generator hosting offers RTX 6000 Pro 96 GB servers.
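A quick sanity check before provisioning: the table's figures can be encoded in a small helper that tells you whether a given variant should fit on a given card. The numbers below are the article's approximate values, and the 2 GB headroom default is an assumption to account for activations; real usage varies with resolution and batch size.

```python
# Approximate VRAM figures from the table above (GB, weights + typical overhead).
VRAM_GB = {
    ("schnell", "fp16"): 16,
    ("schnell", "fp8"): 10,
    ("dev", "fp16"): 24,
    ("dev", "fp8"): 14,
    ("dev+controlnet", "fp16"): 32,
}

def fits(variant: str, precision: str, gpu_vram_gb: float, headroom_gb: float = 2.0) -> bool:
    """Return True if the variant should fit with some headroom for activations."""
    return VRAM_GB[(variant, precision)] + headroom_gb <= gpu_vram_gb

print(fits("dev", "fp16", 32))  # RTX 5090 32 GB: fits
print(fits("dev", "fp16", 24))  # 24 GB card: too tight once activations are counted
```

This is why the table recommends the 32 GB RTX 5090 for Dev at FP16: the weights alone land at ~24 GB, leaving nothing for activations on a 24 GB card.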

Preparing Your GPU Server

Update your system and verify GPU access:

sudo apt update && sudo apt upgrade -y
sudo apt install -y python3 python3-pip python3-venv git wget
nvidia-smi

Create a virtual environment for the Flux.1 stack:

python3 -m venv ~/flux-env
source ~/flux-env/bin/activate
pip install --upgrade pip

Install PyTorch with CUDA. Our PyTorch GPU installation guide covers driver compatibility:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
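Before downloading multi-gigabyte weights, it is worth confirming the install actually sees the GPU. A minimal check:

```python
import torch

print("torch:", torch.__version__)
print("CUDA build:", torch.version.cuda)       # e.g. "12.1" for the cu121 wheel
print("GPU visible:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```

If `GPU visible` prints `False`, fix the driver or CUDA wheel mismatch before proceeding; everything downstream depends on it.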

Authenticate with Hugging Face to download Flux.1 weights:

pip install huggingface_hub
huggingface-cli login

Running Flux.1 with ComfyUI

ComfyUI is the most popular interface for Flux.1 thanks to its node-based workflow system. Read our full ComfyUI setup guide for detailed installation steps.

git clone https://github.com/comfyanonymous/ComfyUI.git ~/ComfyUI
cd ~/ComfyUI
python3 -m venv venv && source venv/bin/activate
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

Download the Flux.1 Dev checkpoint and supporting models. Note that the FLUX.1-dev repository is gated: accept the licence on Hugging Face first, then pass your access token to wget for the gated files:

# Flux.1 Dev diffusion model (ComfyUI stores it under models/unet/)
wget --header="Authorization: Bearer $HF_TOKEN" -P ~/ComfyUI/models/unet/ \
  https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors

# Text encoders (CLIP-L and T5-XXL)
wget -P ~/ComfyUI/models/clip/ \
  https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors
wget -P ~/ComfyUI/models/clip/ \
  https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors

# VAE
wget --header="Authorization: Bearer $HF_TOKEN" -P ~/ComfyUI/models/vae/ \
  https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/ae.safetensors

Launch ComfyUI and load a Flux.1 workflow:

python main.py --listen 0.0.0.0 --port 8188
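Besides the web UI, ComfyUI exposes an HTTP API on the same port, so you can queue generations from scripts. A sketch using only the standard library, assuming a workflow graph exported from the UI with "Save (API Format)" (the filename below is a placeholder):

```python
import json
import urllib.request
import uuid

def build_prompt_payload(workflow, client_id=None):
    """Wrap an API-format workflow graph in the JSON body that POST /prompt expects."""
    body = {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}
    return json.dumps(body).encode("utf-8")

def queue_prompt(workflow, host="127.0.0.1", port=8188):
    """Queue a workflow on a running ComfyUI instance; returns JSON with a prompt_id."""
    req = urllib.request.Request(
        f"http://{host}:{port}/prompt",
        data=build_prompt_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (against a running server):
#   with open("flux_workflow_api.json") as f:   # exported via "Save (API Format)"
#       print(queue_prompt(json.load(f)))
```

Generated images land in ComfyUI's `output/` directory as usual; you can poll `/history` with the returned `prompt_id` to find them.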

Running Flux.1 with Diffusers

For programmatic access, use the Hugging Face diffusers library directly:

pip install diffusers transformers accelerate sentencepiece protobuf

Generate an image with a simple Python script:

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16  # Flux.1 weights are published in bfloat16
)
pipe.to("cuda")

image = pipe(
    prompt="A majestic snowy owl perched on a branch at twilight, photorealistic",
    num_inference_steps=28,
    guidance_scale=3.5,
    height=1024,
    width=1024
).images[0]

image.save("flux_output.png")

For Flux.1 Schnell (faster, distilled variant):

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16  # Flux.1 weights are published in bfloat16
)
pipe.to("cuda")

image = pipe(
    prompt="Abstract watercolour landscape with mountains",
    num_inference_steps=4,    # Schnell is distilled for ~4 steps
    guidance_scale=0.0,       # Schnell ignores classifier-free guidance
    height=1024,
    width=1024,
    max_sequence_length=256   # Schnell caps T5 prompts at 256 tokens
).images[0]

Creating an API Endpoint

Wrap Flux.1 in a FastAPI server for production use:

pip install fastapi uvicorn python-multipart

Create a minimal server script and run it:

uvicorn flux_api:app --host 0.0.0.0 --port 8000

Test the endpoint:

curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A cyberpunk cityscape at night", "steps": 28}' \
  --output result.png

For managed API deployments, explore GigaGPU’s API hosting platform.

Performance Optimization

Get faster Flux.1 generation with these tips:

  • Use FP8 quantisation — Flux.1 Dev at FP8 fits on a single RTX 5090 with minimal quality loss.
  • Try Schnell for speed — The distilled Schnell variant generates quality images in just 4 steps, ideal for real-time applications.
  • Enable torch.compile — Add pipe.transformer = torch.compile(pipe.transformer) for up to 30% faster inference after warmup.
  • Offload T5 encoder — Use pipe.enable_model_cpu_offload() to free VRAM after text encoding for larger batch sizes.
  • Pick the right GPU — Read our RTX 3090 vs RTX 5090 comparison for image generation benchmarks.

If you also work with Stable Diffusion, our guide on deploying Stable Diffusion covers SDXL setup. For a broader overview of GigaGPU’s Stable Diffusion hosting options, visit the dedicated landing page. Browse more tutorials in our model guides category.

Run Flux.1 on High-VRAM GPU Servers

Generate stunning images with Flux.1 on dedicated RTX 5090 and RTX 6000 Pro GPUs. Full root access, pre-installed CUDA, and no queue times.

Browse GPU Servers


admin

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
