Stable Audio Open is a generative audio model from Stability AI that produces sound effects, loops, and short music clips from text prompts. The released weights are under Stability's community licence, which is non-commercial; commercial use requires a separate Stability licence. The model runs on a modest GPU on our dedicated GPU hosting.
VRAM
Roughly 8 GB at FP16 for Stable Audio Open 1.0, which fits GPUs from the RTX 4060 upward.
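To check a card against that figure before loading the model, a minimal sketch follows. It assumes PyTorch with CUDA support is installed; the helper names `has_enough_vram` and `report_gpu` are illustrative, and the 8 GB default is simply the estimate quoted above, not a measured minimum.

```python
def has_enough_vram(total_bytes: int, required_gb: float = 8.0) -> bool:
    """Return True if a GPU with `total_bytes` of memory meets the estimate.

    Uses decimal gigabytes (10**9 bytes) so that cards sold as "8 GB" pass.
    The 8.0 default is the approximate FP16 figure, treated as an assumption.
    """
    return total_bytes >= required_gb * 10**9


def report_gpu(device_index: int = 0) -> None:
    """Print whether the given CUDA device looks large enough."""
    import torch  # imported lazily so has_enough_vram works without PyTorch

    if not torch.cuda.is_available():
        print("No CUDA device visible")
        return
    props = torch.cuda.get_device_properties(device_index)
    verdict = "OK" if has_enough_vram(props.total_memory) else "too small"
    print(f"{props.name}: {props.total_memory / 10**9:.1f} GB ({verdict})")
```

Calling `report_gpu()` on the server prints the card name, its total memory, and a verdict. Total memory is an upper bound: other processes on the same GPU reduce what is actually free.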
Deployment
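pip install stable-audio-tools

import torch
import torchaudio
from stable_audio_tools import get_pretrained_model
from stable_audio_tools.inference.generation import generate_diffusion_cond

# Download weights and config (you may need to accept the licence on Hugging Face and log in first)
model, config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
model = model.to("cuda")

# The model is conditioned on timing as well as text; 30 seconds is an illustrative value
conditioning = [{
    "prompt": "upbeat synthwave loop, 120 BPM",
    "seconds_start": 0,
    "seconds_total": 30,
}]

audio = generate_diffusion_cond(
    model,
    steps=100,
    cfg_scale=7,
    conditioning=conditioning,
    sample_size=config["sample_size"],
    device="cuda",
)

# audio has shape (batch, channels, samples); save the first clip at the model's own sample rate
torchaudio.save("output.wav", audio[0].to(torch.float32).clamp(-1, 1).cpu(), config["sample_rate"])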
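As far as we understand the generator, it fills the whole `sample_size` window even when the conditioning asks for a shorter clip, so short outputs tend to end in silence. A small sketch of the arithmetic for trimming is below. The function name `samples_for_duration` is hypothetical, and 2,097,152 is assumed here as the window size; read the real value from `config["sample_size"]`.

```python
def samples_for_duration(seconds: float, sample_rate: int, max_samples: int) -> int:
    """Number of samples to keep for a clip of `seconds`, capped at the window size."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return min(int(round(seconds * sample_rate)), max_samples)


# Illustrative numbers: a 10 second clip at 44.1 kHz.
keep = samples_for_duration(10, 44_100, max_samples=2_097_152)
print(keep)  # 441000

# With the tensor from the snippet above, of shape (batch, channels, samples):
# trimmed = audio[0][:, :keep]
```

Trimming before `torchaudio.save` keeps file sizes down and avoids shipping loops with dead air at the end.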
Use Cases
- Background music for apps and videos
- Sound effect generation (explosions, footsteps, ambient tones)
- Loop generation for game development
- Prototyping audio ideas before hiring a composer
Limits
Output is capped at 47 seconds, so the model is not suitable for full songs or long-form audio. Vocal generation is poor and there is no lyrics support. Music with vocals requires a different model such as Suno or Udio, neither of which is self-hostable as of 2026.
Licence: Stability’s community licence is non-commercial for the released weights. Commercial use requires either a paid Stability licence or a switch to an older, permissively licensed alternative.
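Because of the cap, it can help to reject over-long requests before they reach the GPU. The sketch below assumes the `seconds_start` and `seconds_total` timing keys used in Stable Audio Open's conditioning; `build_conditioning` and `MAX_SECONDS` are illustrative names, not part of `stable-audio-tools`.

```python
MAX_SECONDS = 47  # output cap quoted above


def build_conditioning(prompt: str, seconds_total: float) -> list[dict]:
    """Build a one-item conditioning list, rejecting durations over the cap."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 0 < seconds_total <= MAX_SECONDS:
        raise ValueError(f"seconds_total must be in (0, {MAX_SECONDS}]")
    return [{
        "prompt": prompt,
        "seconds_start": 0,
        "seconds_total": seconds_total,
    }]


conditioning = build_conditioning("rain on a tin roof, distant thunder", 20)
```

The returned list can be passed as the `conditioning` argument of `generate_diffusion_cond`. Failing early with a `ValueError` is a design choice for a service front end; silently clamping to 47 seconds would be the alternative.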
Self-Hosted Audio Generation
Stable Audio Open on UK dedicated GPUs for prototyping and non-commercial use.
Browse GPU Servers
See MusicGen (longer-form alternative).