Stable Diffusion 3 is now available via an early preview waitlist
Stability AI recently announced the early preview of Stable Diffusion 3, a text-to-image generation model family ranging from 800M to 8B parameters and based on a diffusion transformer architecture with flow matching. SD3 stands out for its ability to render legible text within generated images.
Stability AI announced last week that its most recent text-to-image generation family, Stable Diffusion 3 (SD3), comprises models ranging in size from 800M to 8B parameters. The range of sizes is meant to give users scalability and quality options to suit a variety of use cases, in line with Stability AI's stated commitment to democratizing access to generative AI. Stable Diffusion 3 is built on a diffusion transformer architecture combined with flow matching. Details on the process will appear in a forthcoming technical report.
Since Stable Diffusion 3 is not generally available, it cannot be tested first-hand yet. However, two things stand out in the (possibly cherry-picked) samples featured in the early preview announcement. The first is that SD3's generations appear to be approaching the quality of its competitors, in particular popular closed-source tools such as DALL-E. The second is that SD3 seems to be especially good at rendering text within images, a task that most other text-to-image models have struggled with at one point or another. Stability AI CEO Emad Mostaque and Media Lead André Kerygma have been heavily showcasing SD3's text rendering capabilities on the social media platform X.
A definite launch date remains unannounced, but Stability AI has said that it has introduced several safeguards into SD3 in preparation for the early preview. The company expects the feedback gathered during the preview to be essential in preparing the model for public release. In the meantime, those interested in trying SD3 can sign up for the waitlist.
It will be interesting to follow the evolution of Stable Diffusion 3. Once publicly available, it is set to be released alongside an ecosystem of tools, perhaps following the standard set by Microsoft, OpenAI, and others that have integrated out-of-the-box image generation into their text-based assistants, as is the case with ChatGPT, Microsoft Copilot, and even Imagine with Meta AI.