OpenAI launched several new tools for ChatGPT along with the new GPT-4o model

OpenAI recently introduced GPT-4o (for omni), a model with a performance comparable to GPT-4, but natively trained across text, audio, image, and video. This allows GPT-4o to accept any combination of text, audio, image, and video as input and generates combinations of text, audio, and image as output.

GPT-4o can also reply to audio input in an average of 320 milliseconds, similar to the human response time in conversations. This means that audio conversations with GPT-4o no longer rely on three different models, as they did with previous GPT iterations: Voice mode relied on one model to transcribe the user input into text, then had GPT-4 process the transcribed text, generate an answer which was relayed to a further model that converted the text to audio in a process lasting several seconds.

The GPT-4o announcement fully introduces the model along with some performance highlights, evaluation details, and safety information. ChatGPT users in all tiers will have access to GPT-40's text and image capabilities starting May 13. Plus users will have higher message limits than free users. Team and Enterprise users will have even higher limits, although GPT-4o is only rolling out for Team users for now. A new Voice Mode will be available for Plus users shortly. Developers can access GPT-4o as a text and vision model in the API, with the audio and video capabilities rolling out shortly to selected trusted partners.

To match GPT-4o's improved language skills, the ChatGPT user interface is now available in over 50 languages. Moreover, free and paid users will soon have access to ChatGPT desktop apps. The macOS app is already rolling out to Plus users and will be broadly available soon. The Windows app has no set launch date but will be released before year's end. Those still awaiting the desktop app can start experiencing the improved ChatGPT in the web application, where they'll find a new friendlier, and more conversational layout and home page, among other improvements.