OpenAI is also jumping on the small model trend with GPT-4o mini
OpenAI has launched GPT-4o mini, a highly cost-efficient and capable small language model that outperforms comparably-sized models on various benchmarks, offers significantly reduced pricing, and aims to democratize AI by making advanced intelligence more accessible.
OpenAI is following up on its commitment to making artificial intelligence accessible to everyone by releasing GPT-4o mini, its most cost-efficient small model yet. Priced at 15 cents per million input tokens and 60 cents per million output tokens, GPT-4o mini is 99% more affordable than the less capable text-davinci-003, a model released in 2022. OpenAI expects the affordability of GPT-4o mini to unlock use cases that chain or parallelize multiple model calls, process a large volume of context, or provide fast text-based responses to user queries.
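To make the quoted pricing concrete, the following is a minimal sketch of how a per-workload cost could be estimated from the two rates above. The function name and the example workload (10,000 requests of 2,000 input and 500 output tokens each) are illustrative assumptions, not figures from OpenAI.

```python
# Prices quoted in the article, in US dollars per one million tokens.
INPUT_USD_PER_MILLION = 0.15
OUTPUT_USD_PER_MILLION = 0.60


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in US dollars for the given token counts.

    Hypothetical helper: it only applies the two per-million-token rates
    and ignores any discounts or other billing details.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_USD_PER_MILLION
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION
    return (input_cost + output_cost) / 1_000_000


if __name__ == "__main__":
    # Illustrative workload (an assumption): 10,000 requests,
    # each with 2,000 input tokens and 500 output tokens.
    requests = 10_000
    cost = estimate_cost_usd(requests * 2_000, requests * 500)
    print(f"Estimated cost: ${cost:.2f}")
```

Under these assumptions the workload totals 20 million input tokens and 5 million output tokens, which works out to about $3 for input plus $3 for output.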
In addition to these capabilities, GPT-4o mini features API-based text and vision support (with plans to eventually support further multimodal inputs and outputs), a context window accepting up to 128K tokens, the capability to generate up to 16K tokens of output per request, knowledge up to October 2023, and the same tokenizer as GPT-4o, which makes non-English queries more cost-effective. GPT-4o mini outperforms GPT-4 on the LMSYS leaderboard and scores 82% on MMLU, compared to 77.9% for Gemini Flash and 73.8% for Claude Haiku.
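The two limits above interact: a long prompt leaves less room for the response. The sketch below shows one way to compute how many output tokens a request could ask for. It rests on several assumptions: "128K" and "16K" are read as the round numbers 128,000 and 16,000 (the exact limits should be taken from OpenAI's model documentation), prompt and output tokens are assumed to share the same context window, and the helper name is hypothetical.

```python
# Assumption: "128K" and "16K" from the article read as round numbers;
# consult the model documentation for the exact limits.
CONTEXT_WINDOW_TOKENS = 128_000
MAX_OUTPUT_TOKENS = 16_000


def max_completion_budget(prompt_tokens: int) -> int:
    """Return the largest output-token budget for a prompt of this size.

    Hypothetical helper. Assumes the prompt and the generated output
    must fit in the context window together.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW_TOKENS - prompt_tokens
    # The budget is capped by the per-request output limit and can
    # never be negative, even when the prompt alone is too long.
    return max(0, min(MAX_OUTPUT_TOKENS, remaining))


if __name__ == "__main__":
    for prompt_tokens in (1_000, 120_000, 200_000):
        budget = max_completion_budget(prompt_tokens)
        print(f"{prompt_tokens} prompt tokens: up to {budget} output tokens")
```

A budget of zero signals that the prompt would need to be shortened or split before the request is sent.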
More generally, GPT-4o mini excels in various tasks, including reasoning, math, coding, and multimodal processing, outperforming GPT-3.5 Turbo and other small models on benchmarks testing textual intelligence and multimodal reasoning. The model handles the same range of languages as GPT-4o and performs strongly in function calling. In addition to its MMLU score, GPT-4o mini scored 87.0% on MGSM (mathematical reasoning), 87.2% on HumanEval (coding), and 59.4% on MMMU (multimodal reasoning). Gemini Flash and Claude Haiku scored below 80% on the first two benchmarks, while Flash scored 56.1% on MMMU, and Haiku scored 50.2%.
OpenAI has prioritized safety in GPT-4o mini's development, implementing built-in safety measures and leveraging insights from expert evaluations. The model also introduces new techniques like the instruction hierarchy method to enhance reliability and safety at scale. GPT-4o mini is available on OpenAI's APIs as a text and vision model and will be available to ChatGPT Free, Plus, and Team users in place of GPT-3.5, starting immediately. Enterprise users will have access to the model shortly.