Benchmarks

Coqui XTTS-v2 on RTX 5090: TTS Speed & Cost

Coqui XTTS-v2 benchmarked on RTX 5090: RTF 0.08, 12.5x real-time processing, VRAM usage, and cost per audio hour.

Twelve-and-a-half times faster than real-time. That means Coqui XTTS-v2 on the RTX 5090 synthesises a one-minute voice clip in under five seconds — with full voice cloning from a short reference sample. We pushed this combination through our benchmark suite on GigaGPU because the numbers almost seemed too good.

Peak TTS Speed

Metric | Value
Real-Time Factor (lower = faster) | 0.08
Synthesis speed | 12.5x real-time
Audio hours processed per GPU-hour | 12.5
Precision | FP16
Performance rating | Very Good

Benchmark conditions: FP16 inference, single-stream processing, 24 kHz output, English, single speaker, served via the XTTS-v2 streaming server.
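
To reproduce the real-time factor figure yourself, the sketch below times a single synthesis call with the Coqui TTS Python package and divides wall-clock time by audio duration. The model name is the published XTTS-v2 identifier; the reference clip path and test text are placeholders, and the soundfile package is only used to read back the output duration.

import time
import soundfile as sf
from TTS.api import TTS

# Load XTTS-v2 onto the GPU. The first run downloads the weights and may
# prompt you to accept the Coqui model licence.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to("cuda")

text = "A few sentences of representative copy, long enough to smooth out per-call overhead."

start = time.perf_counter()
tts.tts_to_file(
    text=text,
    speaker_wav="reference_voice.wav",  # placeholder: a short clip of the voice to clone
    language="en",
    file_path="out.wav",
)
synthesis_time = time.perf_counter() - start

audio, sample_rate = sf.read("out.wav")
audio_seconds = len(audio) / sample_rate

rtf = synthesis_time / audio_seconds  # lower is faster; 0.08 corresponds to 12.5x real-time
print(f"RTF: {rtf:.3f} ({audio_seconds / synthesis_time:.1f}x real-time)")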

29 GB Free After Loading

Component | VRAM
Model weights (FP16) | 2.4 GB
Audio buffer + runtime | ~0.4 GB
Total RTX 5090 VRAM | 32 GB
Free headroom | ~29.2 GB

XTTS-v2 consumes under 9% of the 5090’s VRAM. The remaining ~29 GB is enough to co-host Whisper Large-v3, a 13B-parameter LLM, and Flux.1 simultaneously. If you are building a multi-modal AI product, the 5090 is the one card that can genuinely host your entire inference stack.
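
To verify the headroom on your own server, a quick check with PyTorch's CUDA memory query after the model loads is enough; exact figures will drift slightly with driver and runtime overhead.

import torch
from TTS.api import TTS

# Load the model, then ask the driver how much VRAM remains on the device.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to("cuda")

free_bytes, total_bytes = torch.cuda.mem_get_info()  # (free, total) in bytes
print(f"Free VRAM: {free_bytes / 1e9:.1f} GB of {total_bytes / 1e9:.1f} GB")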

Economics of Scale

Cost Metric | Value
Server cost | £1.50/hr (£299/mo)
Cost per audio hour | £0.120
Audio hours per £ | 8.3

Twelve pence per hour of synthesised voice. At 12.5x speed, a single 5090 generates 300 hours of audio per day. Audiobook publishers, large-scale accessibility services, and enterprise notification systems all benefit from this throughput. See how it compares in our cross-GPU benchmark.
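
The economics table reduces to straightforward arithmetic; the snippet below simply re-derives the quoted figures from the hourly price and the 12.5x synthesis speed.

gpu_cost_per_hour = 1.50           # £/hr server price
audio_hours_per_gpu_hour = 12.5    # 12.5x real-time synthesis

cost_per_audio_hour = gpu_cost_per_hour / audio_hours_per_gpu_hour    # £0.12
audio_hours_per_pound = audio_hours_per_gpu_hour / gpu_cost_per_hour  # ~8.3
audio_hours_per_day = audio_hours_per_gpu_hour * 24                   # 300

print(cost_per_audio_hour, audio_hours_per_pound, audio_hours_per_day)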

Maximum Performance, Maximum Flexibility

The 5090 is not the cheapest way to run XTTS-v2 — the RTX 5080 offers similar per-hour costs at 8.3x speed. What the 5090 offers is headroom for growth. Start with TTS, add speech recognition, layer in language understanding. As your product complexity increases, the 5090 absorbs new models without needing a second server. That flexibility has a value that does not show up in the per-hour cost alone. Full comparison: best GPU for TTS.

Quick deploy:

docker run --gpus all -p 8000:8000 ghcr.io/coqui-ai/xtts-streaming-server:latest
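
Once the container is up, a client can request audio over HTTP. The sketch below follows our understanding of the streaming server's API (a /studio_speakers listing plus a /tts_stream endpoint); treat the endpoint names and payload fields as assumptions and confirm them against the image's README before relying on them.

import requests

BASE = "http://localhost:8000"

# Assumed endpoint: returns bundled studio voices with their embeddings.
speakers = requests.get(f"{BASE}/studio_speakers").json()
name, voice = next(iter(speakers.items()))

payload = {
    "text": "Hello from the RTX 5090.",
    "language": "en",
    "speaker_embedding": voice["speaker_embedding"],
    "gpt_cond_latent": voice["gpt_cond_latent"],
    "add_wav_header": True,
    "stream_chunk_size": 20,
}

# Assumed endpoint: streams synthesised audio chunks as they are generated.
with requests.post(f"{BASE}/tts_stream", json=payload, stream=True) as resp:
    resp.raise_for_status()
    with open("hello.wav", "wb") as f:
        for chunk in resp.iter_content(chunk_size=4096):
            f.write(chunk)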

Explore: Coqui hosting guide, all benchmarks, SD hosting.

Deploy Coqui XTTS-v2 on RTX 5090

Order this exact configuration. UK datacenter, full root access.

Order RTX 5090 Server



We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
