Stability AI unveils sound generator Stable Audio Open
Stability AI has released Stable Audio Open, an open-source text-to-audio model optimized for generating audio samples and sound effects up to 47 seconds long, giving sound designers, musicians, and other creatives a way to experiment with AI-generated audio for production and sound design.
Stability AI has thrown its hat into the ring of sound effects generation with its recent release of Stable Audio Open, an open-source text-to-audio model optimized to generate short audio samples and sound effects up to 47 seconds long from text prompts. The release gives sound designers, musicians, and creative communities access to more of Stability AI's audio generation capabilities through a tool that produces high-quality samples for audio production and sound design, including drum beats, instrument riffs, ambient sounds, and foley recordings. Stable Audio Open can also be fine-tuned on custom audio data, so users can train the model on, for instance, samples of their own work and experiment with generating new material.
Unlike Stable Audio Open, which only produces samples up to 47 seconds long, Stability AI's commercial audio generation offering, Stable Audio, can handle high-quality, structured full tracks up to three minutes long, audio-to-audio tasks, and coherent multi-part compositions. The models differ in two further respects: Stable Audio Open specializes exclusively in audio samples and production elements, and it was trained on a different dataset, drawn from Freesound and the Free Music Archive. Stability AI describes the model as a first step towards generative AI audio production paired with responsible development that respects creator rights. The model weights are available for download on Hugging Face, and Stability AI encourages anyone who might find the model useful to download it, experiment with it, and provide feedback.