RTX 5060 Ti 16GB Stable Diffusion 1.5 Benchmark

SD 1.5 on Blackwell 16GB: fast at 512×512, strong at high batch sizes, and backed by a thriving LoRA ecosystem.

Benchmarks April 23, 2026 1 min read admin

Stable Diffusion 1.5 is more than three years old but still widely used because of the depth of its LoRA ecosystem and its raw speed. On the RTX 5060 Ti 16GB on our hosting, single-image SD 1.5 generation is effectively CPU-bound: the GPU is barely working.

Setup

Diffusers 0.30, PyTorch 2.5, xFormers
Model: runwayml/stable-diffusion-v1-5
Sampler: DPM++ 2M Karras
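
A minimal sketch of how a setup like this might be assembled in Diffusers. The model ID and sampler come from the list above; the fp16 dtype, the `build_pipeline` and `timed_generate` helper names, and the timing method are assumptions for illustration, since the article does not state them. Substitute whichever copy of the SD 1.5 checkpoint you have access to.

```python
import time

# Values taken from the Setup list above.
MODEL_ID = "runwayml/stable-diffusion-v1-5"
# Assumption: half precision. The article does not state the dtype used.
USE_FP16 = True


def build_pipeline(model_id: str = MODEL_ID):
    """Load SD 1.5 with a DPM++ 2M Karras scheduler and xFormers attention."""
    # Imported here so the module can be inspected without a GPU stack installed.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16 if USE_FP16 else torch.float32,
    )
    # DPMSolverMultistepScheduler defaults to second-order DPM-Solver++;
    # use_karras_sigmas=True switches it to the Karras noise schedule.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config, use_karras_sigmas=True
    )
    pipe = pipe.to("cuda")
    pipe.enable_xformers_memory_efficient_attention()
    return pipe


def timed_generate(pipe, prompt: str, steps: int = 30, size: int = 512, batch: int = 1):
    """Generate `batch` images and return (images, wall-clock seconds)."""
    import torch

    torch.cuda.synchronize()  # wait for any earlier GPU work to finish
    start = time.perf_counter()
    images = pipe(
        prompt,
        num_inference_steps=steps,
        height=size,
        width=size,
        num_images_per_prompt=batch,
    ).images
    torch.cuda.synchronize()  # wait for this generation to finish
    return images, time.perf_counter() - start
```

Calling `timed_generate(build_pipeline(), "a lighthouse at dusk", steps=30)` once as a warm-up and then again for the measurement avoids counting one-off initialisation in the result.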

512×512

Steps	Time	VRAM
20	0.65 s	3.0 GB
30	0.95 s	3.0 GB
50	1.55 s	3.0 GB
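
The three rows above lie on a straight line, so they can be split into a per-step cost and a fixed per-image overhead. The sketch below fits that line with ordinary least squares; `fit_line` is an illustrative helper, and reading the intercept as text encoding plus VAE decode is an assumption, not something the article measured.

```python
# (steps, seconds) pairs from the 512×512 table above.
TIMINGS = [(20, 0.65), (30, 0.95), (50, 1.55)]


def fit_line(points):
    """Least-squares fit of seconds = overhead + per_step * steps."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    per_step = sxy / sxx
    overhead = mean_y - per_step * mean_x
    return overhead, per_step


overhead, per_step = fit_line(TIMINGS)
print(f"per-step cost:  {per_step * 1000:.0f} ms")
print(f"fixed overhead: {overhead * 1000:.0f} ms")
print(f"sampler rate:   {1 / per_step:.1f} steps/s")
```

On these figures the fit gives about 30 ms per step and about 50 ms of fixed overhead per image, or roughly 33 sampler steps per second.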

768×768 (via img2img or fine-tuned checkpoints)

Steps	Time	VRAM
30	2.2 s	4.5 GB
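
To put the 768×768 figure in context, it helps to compare it with the 30-step 512×512 row. A small sketch of that arithmetic, using only numbers from the two tables (the variable names are illustrative):

```python
# 30-step rows from the 512×512 and 768×768 tables.
BASE = {"side": 512, "seconds": 0.95, "vram_gb": 3.0}
LARGE = {"side": 768, "seconds": 2.2, "vram_gb": 4.5}

pixel_ratio = (LARGE["side"] / BASE["side"]) ** 2   # how many more pixels
time_ratio = LARGE["seconds"] / BASE["seconds"]     # how much slower
vram_ratio = LARGE["vram_gb"] / BASE["vram_gb"]     # how much more memory

print(f"pixels: {pixel_ratio:.2f}x")
print(f"time:   {time_ratio:.2f}x")
print(f"VRAM:   {vram_ratio:.2f}x")
```

The image has 2.25 times the pixels, takes about 2.3 times as long, and uses 1.5 times the VRAM, so generation time grows slightly faster than pixel count while memory grows more slowly.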

Batch Throughput

512×512, 30 steps:

Batch	Total time	Time per image
1	0.95 s	0.95 s
4	1.8 s	0.45 s
8	3.1 s	0.39 s
16	5.4 s	0.34 s
24	7.8 s	0.33 s

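At batch 24, throughput reaches roughly 180 images per minute (24 images in 7.8 s, about 3.1 images per second). Gains flatten beyond batch 16: per-image time only improves from 0.34 s to 0.33 s.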
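
The per-image and per-minute figures follow directly from the batch table. A short sketch of the arithmetic, with `throughput` as an illustrative helper name:

```python
# Batch size -> total seconds, from the batch table above (512×512, 30 steps).
BATCH_TOTALS = {1: 0.95, 4: 1.8, 8: 3.1, 16: 5.4, 24: 7.8}


def throughput(batch: int, total_seconds: float):
    """Return (seconds per image, images per minute) for one batch run."""
    per_image = total_seconds / batch
    return per_image, 60.0 / per_image


baseline, _ = throughput(1, BATCH_TOTALS[1])
for batch, total in BATCH_TOTALS.items():
    per_image, per_minute = throughput(batch, total)
    speedup = baseline / per_image  # gain over generating one image at a time
    print(f"batch {batch:>2}: {per_image:.3f} s/image, "
          f"{per_minute:.0f} images/min, {speedup:.2f}x vs batch 1")
```

Before rounding, batch 24 works out to about 185 images per minute, close to three times the single-image rate.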

When to Use SD 1.5

Anime or stylised art where you depend on specific LoRAs or DreamBooth models
High-volume thumbnail generation
Interactive tools where sub-second generation matters (0.65 s per image at 20 steps, under 0.4 s per image when batching 8 or more)
Fine-tuning, where SD 1.5's small model size speeds up training

For new photorealism projects, skip to FLUX.1 or SDXL. For existing LoRA-heavy workflows, SD 1.5 still has no equal on speed.

SD 1.5 on Blackwell 16GB

180 images/min at 512, LoRAs galore. UK dedicated hosting.

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
