On Tuesday, OpenAI unveiled its most advanced image generation capabilities yet by integrating image creation directly into its GPT-4o model. The new system represents a significant upgrade from DALL·E 3, with GPT-4o now capable of creating images that are not only appealing but also useful. According to OpenAI, the integration allows for more precise and accurate image generation, with improvements in several key areas:

  • Superior text rendering within images.
  • Better prompt adherence, including the ability to handle more objects in a single image than rival systems, which OpenAI claims struggle with around 5-8 objects.
  • Context awareness that maintains consistency across multiple iterations.
  • In-context learning that enables the model to integrate details from user-uploaded images into its generations.
  • Convincing image creation and editing, enabled by rigorous training on diverse image styles.

Performance Trade-offs

Compared with DALL·E 3, GPT-4o's native image generation takes longer to render images, up to one minute in some cases, because it creates more detailed and accurate output. The company also noted that although GPT-4o will become the default image generator for all ChatGPT users, DALL·E will remain available through a dedicated DALL·E GPT.

Safety Measures

OpenAI has implemented several safety features with this release:

  • C2PA metadata tagging on all generated images identifies them as coming from GPT-4o, and an internal search tool lets OpenAI verify whether a generation came from GPT-4o.
  • Restrictions that apply when images of people are supplied as context, blocking the creation of images involving nudity and graphic violence.
  • An LLM trained on human-written, interpretable safety specifications, which helps identify ambiguities in the company's policies.

Access and Availability

The feature, which can also be used in Sora, began rolling out on Tuesday to ChatGPT Pro subscribers, with Plus and Team users to follow closely. OpenAI initially planned to give free users immediate access. On Wednesday, however, OpenAI CEO Sam Altman confirmed that the feature would reach free users later than expected, citing overwhelming demand.

OpenAI also plans to roll out 4o image generation to Edu and Enterprise users, and will make the feature available to developers via API in the coming weeks.