Fish Speech v1.5 Self-Hosted

Fish Speech is a zero-shot voice-cloning TTS model that clones a voice from 10-30 seconds of reference audio and runs on modest GPUs.

Model Guides | April 23, 2026 | 1 min read | gigagpu

Fish Speech v1.5 is a text-to-speech model from Fish Audio with zero-shot voice cloning: given 10-30 seconds of reference audio, it synthesises new speech in that voice. On our dedicated GPU hosting it fits an 8 GB card comfortably.

VRAM
Deployment
Cloning Workflow
Ethics

VRAM

Roughly 4-6 GB of VRAM at FP16, so it runs on any card from the RTX 4060 (8 GB) up.
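
To check a card against that figure before deploying, a minimal sketch along these lines may help. It assumes the NVIDIA driver is installed so that nvidia-smi is on the PATH, and it uses 6 GB (the upper end of the estimate) as the threshold; the function names are ours, not part of Fish Speech.

```python
import subprocess

# Assumption: 6 GB is the upper end of the ~4-6 GB FP16 estimate.
REQUIRED_MIB = 6 * 1024


def parse_total_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse one integer (MiB) per GPU from nvidia-smi's CSV output."""
    return [
        int(line.strip())
        for line in nvidia_smi_output.splitlines()
        if line.strip()
    ]


def gpus_with_enough_vram(
    totals_mib: list[int], required_mib: int = REQUIRED_MIB
) -> list[int]:
    """Return the indices of GPUs whose total memory covers the requirement."""
    return [i for i, total in enumerate(totals_mib) if total >= required_mib]


def query_total_vram_mib() -> list[int]:
    """Ask nvidia-smi for the total memory of each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_total_vram_mib(result.stdout)
```

Calling gpus_with_enough_vram(query_total_vram_mib()) on the server lists the usable GPU indices. Note that this compares total memory, not free memory, so a card already shared with another workload may still fall short.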

Deployment

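# Fetch the code and install it into the current Python environment
git clone https://github.com/fishaudio/fish-speech
cd fish-speech
pip install -e .

# Start the HTTP API server. The v1.5 weights must already be in
# checkpoints/fish-speech-1.5; adjust the decoder path if your download
# names the decoder weights file differently.
python tools/api_server.py \
  --llama-checkpoint-path checkpoints/fish-speech-1.5 \
  --decoder-checkpoint-path checkpoints/fish-speech-1.5/decoder.pth \
  --decoder-config-name firefly_gan_vq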

The API exposes an HTTP endpoint that accepts reference audio plus target text.

Cloning Workflow

1. Record 10-30 seconds of the target speaker reading a varied text.
2. POST the reference audio and the target text to the Fish Speech API.
3. Receive synthesised audio in the cloned voice.
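
The steps above can be sketched as a small Python client using only the standard library. Treat it as an illustration rather than the server's documented API: the URL, port, path, JSON encoding and field names below are all assumptions, so check them against the API of the Fish Speech version you are running before relying on them.

```python
import base64
import json
import urllib.request

# Assumption: hypothetical host, port and path. Adjust to your server.
API_URL = "http://127.0.0.1:8080/v1/tts"


def build_tts_request(
    reference_audio: bytes, target_text: str, api_url: str = API_URL
) -> urllib.request.Request:
    """Package reference audio and target text as a JSON POST request.

    The field names "text" and "reference_audio" are hypothetical.
    """
    payload = {
        "text": target_text,
        # Raw bytes are not valid JSON, so the audio is base64-encoded.
        "reference_audio": base64.b64encode(reference_audio).decode("ascii"),
    }
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesise(reference_path: str, target_text: str, output_path: str) -> None:
    """Send the request and write the returned audio bytes to disk."""
    with open(reference_path, "rb") as f:
        request = build_tts_request(f.read(), target_text)
    with urllib.request.urlopen(request, timeout=120) as response:
        audio = response.read()
    with open(output_path, "wb") as f:
        f.write(audio)
```

With the server running, synthesise("speaker.wav", "Hello from a cloned voice.", "out.wav") would cover steps 2 and 3; the file names here are placeholders.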

Quality improves with cleaner reference audio. Room echo, background noise, and references shorter than 10 seconds all degrade cloning fidelity.
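
The length requirement is easy to check before uploading. The sketch below, which assumes the reference is an uncompressed PCM WAV file, measures its duration with Python's standard wave module and compares it with the 10-30 second range. It does not detect echo or background noise, and the helper names are our own.

```python
import io
import wave

MIN_SECONDS = 10.0
MAX_SECONDS = 30.0


def wav_duration_seconds(source) -> float:
    """Duration of a PCM WAV; `source` is a path string or binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_reference(source) -> str:
    """Classify a reference clip against the 10-30 second guideline."""
    duration = wav_duration_seconds(source)
    if duration < MIN_SECONDS:
        return f"too short ({duration:.1f}s)"
    if duration > MAX_SECONDS:
        return f"above the 10-30s guideline ({duration:.1f}s)"
    return f"ok ({duration:.1f}s)"


def make_silent_wav(seconds: float, rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence, for trying the check."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)
    return buffer
```

For a real recording, pass the file path instead, for example check_reference("speaker.wav") with a placeholder file name.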

Ethics

Voice cloning technology is easily abused. For UK-facing products, get documented consent from anyone whose voice you clone. Do not synthesise voices of public figures or deceased persons without permission. Add watermarking to synthetic audio where possible. These are not legal requirements in every jurisdiction but represent basic professional conduct.

Self-Hosted Voice Cloning

Fish Speech on UK dedicated GPUs with clear operational logging.

See also RVC voice cloning and Parler-TTS.

Need a Dedicated GPU Server?

Deploy from RTX 3050 to RTX 5090. Full root access, NVMe storage, and 1 Gbps networking in our UK datacenter.

gigagpu

We benchmark, deploy, and optimise GPU infrastructure for AI workloads. All data in our guides comes from real-world testing on our UK-based dedicated GPU servers.
