Stable Diffusion 3.5 Large (SD 3.5L) is Stability AI’s 8B-parameter diffusion transformer. It is a clear quality step up from SDXL, with better text rendering, photorealism, and handling of complex compositions. On our dedicated GPU hosting it needs more VRAM than SDXL but fits on a 24 GB+ card.
VRAM
| Precision | Total VRAM (approx.) | Fits on |
|---|---|---|
| FP16 | ~20 GB | 24 GB+ card |
| FP8 | ~10 GB | 12 GB+ card |
| INT4 | ~6 GB | 8 GB+ card |
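
As a rough cross-check on the table, the weights-only footprint of the transformer follows from parameter count times bytes per parameter. The sketch below is illustrative arithmetic, not a measurement: the 8B figure is the parameter count quoted above, `weights_gb` is a hypothetical helper name, and attributing the rest of each table total to the other pipeline components and activations is an assumption.

```python
# Weights-only memory estimate for an 8B-parameter model.
# Assumption: 1 GB = 1e9 bytes; the helper name is illustrative.
PARAMS = 8e9

BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weights_gb(precision: str, params: float = PARAMS) -> float:
    """Return the size of the weights alone, in GB."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for name in BYTES_PER_PARAM:
    print(f"{name}: {weights_gb(name):.0f} GB of weights")
```

This gives 16 GB, 8 GB, and 4 GB of weights, which leaves roughly 2 to 4 GB of each table total for everything else.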
Deployment
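```python
import torch
from diffusers import StableDiffusion3Pipeline

# Load the weights in 16-bit precision and move the pipeline to the GPU
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Generate one image: 28 denoising steps, guidance scale 3.5
image = pipe(
    "A realistic photo of a fox in a snow-covered forest at dawn",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("fox.png")  # write the result to disk
```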
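
To approach the lower rows of the VRAM table, the transformer can be loaded quantized. The following is a minimal sketch, assuming a recent diffusers release with bitsandbytes quantization support and the `bitsandbytes` package installed. It uses 4-bit NF4, which is one particular 4-bit scheme and may not be how the INT4 figure above was obtained. `load_nf4_pipeline` and `NF4_SETTINGS` are hypothetical names.

```python
MODEL_ID = "stabilityai/stable-diffusion-3.5-large"

# Settings passed to diffusers' BitsAndBytesConfig (4-bit NF4 weights).
NF4_SETTINGS = {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}


def load_nf4_pipeline(model_id: str = MODEL_ID):
    """Load SD 3.5 Large with a 4-bit transformer (needs a CUDA GPU)."""
    # Imported lazily so the helper can be defined without the GPU stack.
    import torch
    from diffusers import (
        BitsAndBytesConfig,
        SD3Transformer2DModel,
        StableDiffusion3Pipeline,
    )

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16, **NF4_SETTINGS
    )
    # Quantize only the diffusion transformer, the largest component.
    transformer = SD3Transformer2DModel.from_pretrained(
        model_id,
        subfolder="transformer",
        quantization_config=quant_config,
        torch_dtype=torch.bfloat16,
    )
    pipe = StableDiffusion3Pipeline.from_pretrained(
        model_id, transformer=transformer, torch_dtype=torch.bfloat16
    )
    # Keep idle components in system RAM and move each to the GPU on use.
    pipe.enable_model_cpu_offload()
    return pipe
```

The returned pipeline takes the same call arguments as the full-precision example. Actual memory use depends on resolution and library versions, so it is worth measuring on the target card.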
For the Turbo variant, which needs fewer steps, use stabilityai/stable-diffusion-3.5-large-turbo with num_inference_steps=4.
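
A sketch of the Turbo call, for comparison with the base model. It assumes the same `StableDiffusion3Pipeline` class loads the Turbo checkpoint, and it sets `guidance_scale=0.0`, an assumption based on the Turbo model card's example rather than anything stated here, so treat it as a starting point. `turbo_kwargs` and `generate_turbo` are hypothetical helper names.

```python
TURBO_ID = "stabilityai/stable-diffusion-3.5-large-turbo"


def turbo_kwargs(prompt: str) -> dict:
    """Call arguments for the 4-step Turbo variant."""
    return {
        "prompt": prompt,
        "num_inference_steps": 4,
        # Assumption: guidance disabled, following the model card example.
        "guidance_scale": 0.0,
    }


def generate_turbo(prompt: str):
    """Load the Turbo checkpoint and render one image (needs a CUDA GPU)."""
    # Imported lazily so turbo_kwargs can be used without the GPU stack.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        TURBO_ID, torch_dtype=torch.bfloat16
    ).to("cuda")
    return pipe(**turbo_kwargs(prompt)).images[0]
```

Keeping the call arguments in one helper makes it easy to switch between the base and Turbo settings without touching the loading code.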
Quality
SD 3.5L visibly outperforms SDXL on:
- Text rendering inside images
- Complex multi-object scenes
- Hands and human anatomy
- Following long, detailed prompts
It slightly underperforms SDXL on stylised art, where SDXL’s richer LoRA ecosystem still dominates.
Speed
On an RTX 5090 at 1024×1024 and 28 steps, SD 3.5L takes about 4 seconds per image. SD 3.5 Large Turbo at 4 steps takes about 0.8 seconds. SD 3.5L is slower than SDXL per image but produces higher-quality output.
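
These per-image latencies translate directly into throughput. The sketch below is plain arithmetic plus a small timing helper for checking the figures on your own card. `images_per_hour` and `seconds_per_image` are hypothetical names, and `generate` stands for any zero-argument callable that renders one image.

```python
import time
from typing import Callable


def images_per_hour(seconds_per_image: float) -> float:
    """Convert a per-image latency into sequential hourly throughput."""
    return 3600.0 / seconds_per_image


def seconds_per_image(generate: Callable[[], object], runs: int = 5) -> float:
    """Average wall-clock time of `generate`, after one warm-up call."""
    generate()  # warm-up: keeps one-time setup cost out of the average
    start = time.perf_counter()
    for _ in range(runs):
        generate()
    return (time.perf_counter() - start) / runs


print(round(images_per_hour(4.0)))  # base model at ~4 s per image
print(round(images_per_hour(0.8)))  # Turbo at ~0.8 s per image
```

With the figures quoted here, that is about 900 images per hour for the base model and about 4,500 for Turbo on a single GPU, assuming one image at a time with no batching.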
See SDXL Lightning vs Turbo and Flux Schnell.