DALL-E is now accessible via conversation for paying ChatGPT users

ChatGPT can now create imagery with the help of DALL-E. Starting Oct 19, Plus and Enterprise users can ask ChatGPT to generate and refine images within a conversation. OpenAI announced the feature alongside further details on DALL-E's risk mitigation and safety measures.

by Ellie Ramirez-Camara

Updated October 20, 2023

Starting October 19, ChatGPT Plus and Enterprise users can generate images using DALL-E in conversations with the popular chatbot. Once prompted, ChatGPT offers a selection of pictures matching the user's request, which can then be refined or iterated on without leaving the conversation. This release follows two other groundbreaking advances OpenAI delivered just weeks earlier: the release of DALL-E 3 itself and the image recognition and speech synthesis features incorporated into ChatGPT Plus and Enterprise.

Previous coverage of the DALL-E 3 release mentioned the model's improvements over DALL-E 2: greater accuracy and detail in rendering, and closer attention to long prompts containing precise descriptions. Complementing that announcement, OpenAI has now released a research paper detailing how those improvements were achieved. The team hypothesized that DALL-E struggled with longer prompts because of inaccuracies and noise in the captions of the original training dataset. To address this, they recaptioned the complete dataset with a bespoke image captioner and then retrained the model on the newly captioned data.

Now that DALL-E is headed toward wider availability, it makes sense that OpenAI has finally discussed in more detail the safety measures behind the model's deployment. To limit harmful content generation, OpenAI states that safety checks are run on both the prompt and the resulting imagery before anything is presented to the user. User feedback was also taken into account to identify gaps and edge cases not covered by previous versions of the safety system. Armed with this knowledge, the team stress-tested the model to prepare it for public deployment. According to OpenAI, these steps should prevent DALL-E from fulfilling prompts that ask for content in the style of a living artist, images of public figures, or harmful or biased depictions of people.
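To make the two-stage check described above concrete, here is a minimal sketch in Python. It is not OpenAI's implementation: the function name and the three callables (generate, prompt_is_allowed, image_is_allowed) are hypothetical stand-ins, and the only thing taken from the announcement is the order of operations, with the prompt screened first and the generated image screened before the user sees it.

```python
from typing import Callable, Optional


def generate_with_checks(
    prompt: str,
    generate: Callable[[str], bytes],
    prompt_is_allowed: Callable[[str], bool],
    image_is_allowed: Callable[[bytes], bool],
) -> Optional[bytes]:
    """Return an image only if both the prompt and the output pass their checks.

    All three callables are hypothetical placeholders for a real image model
    and real safety classifiers.
    """
    # Stage 1: screen the prompt before any image is generated.
    if not prompt_is_allowed(prompt):
        return None

    image = generate(prompt)

    # Stage 2: screen the generated image before it is shown to the user.
    if not image_is_allowed(image):
        return None

    return image
```

In this sketch a rejected request simply returns None. A real system would more likely return an explanation or a refusal message.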

User and artist feedback remains the main channel for reporting offensive, harmful, or copyright-infringing content. However, new details about OpenAI's provenance classifier have also surfaced. This is an internal tool that helps identify DALL-E-generated content. OpenAI claims that early testing places the classifier at over 99% accuracy when an image has not been modified and over 95% accuracy when it has undergone common modifications, such as cropping, resizing, JPEG compression, or small superpositions of text or images over the original picture. The provenance classifier therefore cannot offer certainty, although being able to conclude that an image was likely generated by AI is progress that may lay the foundations for stronger safety measures.
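A short calculation illustrates why a high accuracy figure still falls short of certainty. The sketch below applies Bayes' rule under two assumptions that do not come from OpenAI: that the reported 95% figure holds equally for generated and non-generated images (sensitivity and specificity both 0.95), and that 10% of the images being checked were actually generated by DALL-E. Both numbers are hypothetical, as is the function name.

```python
def probability_ai_generated(
    base_rate: float, sensitivity: float, specificity: float
) -> float:
    """Bayes' rule: probability an image is AI-generated given that it was flagged."""
    # Generated images that are correctly flagged.
    true_positives = sensitivity * base_rate
    # Non-generated images that are wrongly flagged.
    false_positives = (1.0 - specificity) * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs: 10% of checked images are generated, 95% accuracy both ways.
posterior = probability_ai_generated(base_rate=0.10, sensitivity=0.95, specificity=0.95)
print(f"{posterior:.2f}")  # prints 0.68
```

Under these assumed numbers, roughly one in three flagged images would not be AI-generated, which is why a positive result is better read as "likely" than as proof.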
