Google has expanded the availability of Imagen 3 to more users through ImageFX
Google's Imagen 3 advances AI-generated imagery with improved detail, versatility, and prompt understanding, while the company emphasizes responsible development and plans deployment across various Google products.
Google has expanded the availability of the Imagen 3 text-to-image model, which can now be tried out through ImageFX. Although several reports state that it is available to US-based accounts only, availability seems to be wider than that. Previously, the model was available to a more limited number of users via Vertex AI. In the long term, Google plans to release several versions of Imagen 3 optimized for different tasks. In parallel, the company will integrate Imagen 3 into additional products, such as the Gemini assistant, Workspace, and Ads, and it will bring the Imagen 2 editing features, such as inpainting and outpainting, to Imagen 3.
Imagen 3 boasts improved detail, richer lighting, and fewer artifacts when compared to its predecessor. Additional key Imagen 3 features include:
- Versatility, as the model can generate high-quality images across various formats and styles, from photorealistic landscapes to whimsical claymation scenes.
- Improved prompt adherence and understanding, which leads to more accurate output without the need for prompt engineering. This is one of the most commented-on features on social media by those who have experimented with the model.
- Enhanced detail and texture rendering, with the model delivering outputs that feature details such as camera angles and specific compositions.
- Text rendering, a feature that has become one of the standard tests for image generation models, is vastly improved, so much so that some users have reportedly circumvented the model's guardrails to generate logos with copyright-infringing accuracy.
- Safety and responsibility measures implemented include filtering, data labeling, and red teaming to reduce the likelihood of harmful outputs. Additionally, Imagen 3 is shipping with SynthID technology, a digital watermark invisible to the human eye that makes Imagen 3 generations detectable as AI-generated content.
In addition to praising Imagen 3's prompt adherence and understanding, users have commented that the model's guardrails may be too strict, rejecting what seem to be harmless prompts. On the other hand, as often happens with new image-generation models, reports have also emerged that while the model usually refuses to generate copyrighted characters by name, the restriction is easily circumvented by describing the character instead. In any case, in contrast to the recent controversy over the Grok-2 image generator, Google seems to have decided to play it safe, as most other model providers do. That caution is especially understandable given the incident that led Google to suspend Gemini's people-generation capabilities.