MeloTTS from MyShell is a compact multilingual TTS library with support for English, Spanish, French, Chinese, Japanese, and Korean. On our dedicated GPU hosting it is the fastest permissively licensed option that still delivers reasonable quality.
Install
pip install git+https://github.com/myshell-ai/MeloTTS.git
python -m unidic download
Example
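from melo.api import TTS
# Load the English model on the GPU; spk2id maps speaker names to numeric IDs.
tts = TTS(language="EN", device="cuda")
speaker_ids = tts.hps.data.spk2id
text = "Hello from a GigaGPU dedicated server."
# Synthesise with the US English voice at normal speed and write a WAV file.
tts.tts_to_file(text, speaker_ids["EN-US"], "out.wav", speed=1.0)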
Pre-shipped English speakers: EN-US, EN-BR (British), EN_INDIA, EN-AU, EN-Default. The other languages use the codes ES, FR, ZH, JP, and KR, both as the language argument and as the speaker key.
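To compare the shipped accents side by side, one option is to loop over the speaker table and write one file per voice. This is a minimal sketch, not part of MeloTTS: `render_all_speakers` is a hypothetical helper name, and it assumes only that the model exposes `hps.data.spk2id` and `tts_to_file` as in the example above, and that the speaker table supports `.items()`.

```python
from pathlib import Path


def render_all_speakers(model, text, out_dir="accents", speed=1.0):
    """Write one WAV per speaker known to `model` and return the paths.

    Hypothetical helper: assumes `model.hps.data.spk2id` behaves like a
    mapping of speaker name -> speaker ID, and that
    `model.tts_to_file(text, speaker_id, path, speed=...)` writes the audio.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for name, speaker_id in model.hps.data.spk2id.items():
        path = out / f"{name}.wav"
        model.tts_to_file(text, speaker_id, str(path), speed=speed)
        written.append(path)
    return written
```

With the `tts` object from the example, `render_all_speakers(tts, "Hello from a GigaGPU dedicated server.")` would produce files such as `accents/EN-US.wav`, one per entry in the speaker table.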
Throughput
On an RTX 4060 8GB:
- 10-word sentence: ~0.3 seconds generation
- Real-time factor: ~0.05x (20x faster than playback)
- Throughput: up to ~1,200 seconds of audio per minute of compute at that real-time factor (single stream)
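The figures above are linked by simple arithmetic: the real-time factor (RTF) is generation time divided by audio duration, so the speed-up over playback is its reciprocal. The sketch below only illustrates that relationship; the function names are made up for this example, and it ignores model loading, batching, and I/O overhead.

```python
def speedup_over_playback(rtf):
    """How many times faster than real time generation runs at a given RTF."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


def audio_seconds_per_minute(rtf):
    """Seconds of audio produced per 60 s of pure generation time."""
    return 60.0 * speedup_over_playback(rtf)


rtf = 0.05  # the approximate figure quoted above
print(f"{speedup_over_playback(rtf):.0f}x faster than playback")
print(f"{audio_seconds_per_minute(rtf):.0f} s of audio per minute of compute")
```

Treat the result as an upper bound for a single stream; sustained pipeline throughput will be lower once text preprocessing and file writes are included.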
MeloTTS is notably faster than Parler-TTS or Bark because the model is smaller and the architecture is optimised for speed over voice variety.
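To check these numbers on your own hardware, or to compare models on equal terms, you can time a synthesis call and divide by the duration of the resulting file. This is a rough sketch using only the standard library; `measure_rtf` is a hypothetical helper, and it assumes the engine writes a PCM WAV file that Python's `wave` module can read.

```python
import time
import wave


def measure_rtf(synthesize, wav_path):
    """Time `synthesize(wav_path)` and return (gen_seconds, audio_seconds, rtf).

    `synthesize` is any callable that writes a PCM WAV file to `wav_path`.
    """
    start = time.perf_counter()
    synthesize(wav_path)
    gen_seconds = time.perf_counter() - start

    # Audio duration = number of frames / sample rate.
    with wave.open(wav_path, "rb") as wav:
        audio_seconds = wav.getnframes() / wav.getframerate()
    if audio_seconds == 0:
        raise ValueError("output file contains no audio")

    return gen_seconds, audio_seconds, gen_seconds / audio_seconds
```

With the objects from the example, a call might look like `measure_rtf(lambda p: tts.tts_to_file(text, speaker_ids["EN-US"], p), "out.wav")`. The first call on a GPU usually includes one-off initialisation, so it is worth discarding it and averaging several later runs.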
Alternatives
| Model | Voice Quality | Speed | Licence |
|---|---|---|---|
| MeloTTS | Good | Very fast | MIT |
| Parler-TTS | Excellent | Fast | Apache 2.0 |
| Fish Speech | Excellent (clonable) | Moderate | CC-BY-NC-SA (non-commercial) |
| Bark | Expressive | Slow | MIT |
For high-throughput voice needs (notifications, narration pipelines), MeloTTS wins on cost. For expressive voiceovers, Parler-TTS or Fish Speech sound better.
See Parler-TTS and Fish Speech.