The simplest Stable Diffusion setup is Hugging Face Diffusers in a Python venv. It's the right choice for custom scripts or API wrappers on the RTX 5060 Ti 16GB at our hosting.
Install
uv venv --python 3.12 ~/.venvs/sd
source ~/.venvs/sd/bin/activate
# Blackwell cards (RTX 50-series) need the CUDA 12.8 build of PyTorch
uv pip install torch --index-url https://download.pytorch.org/whl/cu128
uv pip install diffusers transformers accelerate safetensors
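Before downloading any models, it's worth a quick sanity check that the CUDA wheel actually sees the card. A minimal check, run inside the activated venv (the `cuda_status` helper name is mine, not part of any library):

```python
import importlib.util

def cuda_status() -> str:
    """Report whether the installed PyTorch build can use the GPU."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch
    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA not available (check driver and wheel)"
    return f"torch {torch.__version__}: {torch.cuda.get_device_name(0)}"

print(cuda_status())
```

On a working install this prints the torch version and "NVIDIA GeForce RTX 5060 Ti"; anything else means the wrong wheel or a driver problem.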
First Generation (SDXL)
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
).to("cuda")

image = pipe(
    prompt="a golden retriever wearing glasses, hyperreal, studio lighting",
    num_inference_steps=30,
).images[0]
image.save("output.png")
Expected time: ~3-4 seconds for 1024×1024.
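For repeatable results, Diffusers pipelines accept a seeded `torch.Generator`: same seed, prompt and step count reproduces the same image. A small sketch (`make_generator` is my own helper name, not a Diffusers API):

```python
def make_generator(seed: int, device: str = "cuda"):
    """Return a seeded torch.Generator for reproducible sampling."""
    import torch  # imported lazily so the helper can be defined anywhere
    return torch.Generator(device=device).manual_seed(seed)

# Usage with the pipeline loaded above (GPU session assumed):
# image = pipe(prompt="...", num_inference_steps=30,
#              generator=make_generator(42)).images[0]
```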
FLUX.1-schnell
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16,
)
# The full bf16 checkpoint is larger than 16 GB, so stream weights to the
# GPU module-by-module instead of calling pipe.to("cuda")
pipe.enable_model_cpu_offload()

image = pipe(
    prompt="a neon-lit cyberpunk street at night",
    num_inference_steps=4,
    guidance_scale=0.0,
).images[0]
image.save("flux.png")
Expected time: ~3.8 s at FP16. For FP8 quantised weights see FLUX benchmark.
Expose as API
import base64
import io

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PromptIn(BaseModel):
    prompt: str
    steps: int = 30

@app.post("/generate")
def generate(p: PromptIn):
    # `pipe` is the pipeline loaded above
    img = pipe(prompt=p.prompt, num_inference_steps=p.steps).images[0]
    # return the PNG as base64 so the response stays plain JSON
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return {"image": base64.b64encode(buf.getvalue()).decode("ascii")}
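On the client side, decoding that response is plain stdlib base64. A minimal sketch, assuming the endpoint returns the PNG base64-encoded under an `"image"` key (the helper name is mine):

```python
import base64

def decode_image_response(b64_png: str) -> bytes:
    """Decode the base64 PNG string from the /generate JSON response."""
    return base64.b64decode(b64_png)

# round-trip check with dummy bytes standing in for a real PNG:
payload = base64.b64encode(b"\x89PNG\r\n").decode("ascii")
assert decode_image_response(payload) == b"\x89PNG\r\n"
```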
If you'd rather not roll your own API, ComfyUI and A1111 ship a UI (and their own APIs) out of the box – see ComfyUI setup or A1111 setup.
Stable Diffusion on Blackwell 16GB
SDXL, FLUX, SD 1.5 all fit. UK dedicated hosting.
Order the RTX 5060 Ti 16GB
See also: ComfyUI, A1111, SDXL benchmark, FLUX benchmark.