Stability AI calls Stable Diffusion 3 Medium its 'most advanced text-to-image open model'
Stable Diffusion 3 Medium, the most recent installment in the Stable Diffusion 3 series, is a two-billion-parameter text-to-image generation model that shows bigger is not always better. Its standout features include:
- the ability to deliver high-quality, photorealistic images, with improved rendering of challenging subjects such as faces and hands, and to handle a variety of non-photorealistic styles;
- enhanced prompt comprehension, including an understanding of spatial relationships, compositional elements, actions, and styles;
- improved typography generation capabilities that deliver fewer spelling, spacing, kerning, and other related errors;
- a size small enough to run on consumer PCs as well as enterprise-grade GPUs without significant performance compromises; and
- weights that are openly available for non-commercial use, a low-cost Creator License for limited commercial use, introduced with professional artists, designers, developers, and AI enthusiasts in mind, and the option to arrange licensing for large-scale commercial use.
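To put the size claim in perspective, here is a rough back-of-the-envelope sketch of how much memory two billion parameters occupy at common numeric precisions. The parameter count comes from the article; the function and constant names are illustrative. The estimate covers the weights of the image model alone and ignores text encoders, the image decoder, and working memory during generation, so real-world requirements will be higher.

```python
# Rough weight-memory estimate for a model with a given parameter count.
# Assumption: 2 billion parameters, as stated in the article. The
# bytes-per-parameter values are the standard sizes of each numeric format.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

SD3_MEDIUM_PARAMS = 2_000_000_000  # "two billion parameters"


def weight_memory_gib(num_params: int, precision: str) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        gib = weight_memory_gib(SD3_MEDIUM_PARAMS, precision)
        print(f"{precision}: {gib:.2f} GiB")
    # prints:
    # fp32: 7.45 GiB
    # fp16: 3.73 GiB
    # int8: 1.86 GiB
```

At half precision the weights come to under 4 GiB, which is consistent with the claim that the model fits on consumer hardware, though the figure is only a lower bound.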
Stable Diffusion 3 Medium can be tried through the API on the Stability Platform, a three-day trial of Stable Assistant, or Stable Artisan on Discord. On the latter two, users can also try other models in the SD3 series, such as the Large and Ultra models.
The Stable Diffusion 3 models have been optimized for NVIDIA® RTX GPUs and TensorRT in collaboration with NVIDIA, and Stable Diffusion 3 Medium is no exception. The TensorRT-optimized version of Stable Diffusion 3 Medium can be downloaded from Hugging Face. Similarly, AMD has optimized Stable Diffusion 3 Medium for devices including its latest APUs, consumer GPUs, and MI300X enterprise GPUs.