OpenAI is continuing to make advanced AI accessible to all with the launch of GPT-4o mini, now available in the API and rolling out in ChatGPT today.
GPT-4o mini is OpenAI’s most intelligent and affordable small model, available today in the API. GPT-4o mini is significantly smarter and cheaper than GPT-3.5 Turbo.
OpenAI remains dedicated to making artificial intelligence broadly accessible. Today, OpenAI announces the launch of GPT-4o mini, its most cost-efficient small model to date. This model is expected to significantly expand the range of applications that can be built with AI by making intelligence more affordable. The GPT-4o mini scores 82% on the MMLU benchmark and currently outperforms GPT-41 in chat preferences on the LMSYS leaderboard. It is priced at 15 cents per million input tokens and 60 cents per million output tokens, making it an order of magnitude more affordable than previous frontier models and over 60% cheaper than GPT-3.5 Turbo.
GPT-4o mini enables a wide array of tasks due to its low cost and latency. These tasks include applications that chain or parallelize multiple model calls (e.g., calling multiple APIs), provide a large volume of context to the model (e.g., full code base or conversation history), or interact with customers through fast, real-time text responses (e.g., customer support chatbots).
GPT-4o mini API
Currently, GPT-4o mini supports text and vision in the API, with plans to include support for text, image, video, and audio inputs and outputs in the future. The model has a context window of 128K tokens and knowledge up to October 2023. With the improved tokenizer shared with GPT-4o, handling non-English text is now even more cost-effective.
As a small model with superior textual intelligence and multimodal reasoning, GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks across both textual intelligence and multimodal reasoning. It supports the same range of languages as GPT-4o and demonstrates strong performance in function calling, enabling developers to build applications that fetch data or take actions with external systems. It also offers improved long-context performance compared to GPT-3.5 Turbo.
GPT-4o mini Evaluation
GPT-4o mini has been evaluated across several key benchmarks:
- Reasoning tasks: GPT-4o mini outperforms other small models in reasoning tasks involving both text and vision, scoring 82.0% on MMLU, a textual intelligence and reasoning benchmark, compared to 77.9% for Gemini Flash and 73.8% for Claude Haiku.
- Math and coding proficiency: GPT-4o mini excels in mathematical reasoning and coding tasks, outperforming previous small models on the market. On MGSM, which measures math reasoning, GPT-4o mini scored 87.0%, compared to 75.5% for Gemini Flash and 71.7% for Claude Haiku. GPT-4o mini also scored 87.2% on HumanEval, a measure of coding performance, compared to 71.5% for Gemini Flash and 75.9% for Claude Haiku.
- Multimodal reasoning: GPT-4o mini demonstrates strong performance on MMMU, a multimodal reasoning evaluation, scoring 59.4% compared to 56.1% for Gemini Flash and 50.2% for Claude Haiku.
As part of their model development process, OpenAI collaborated with a select group of trusted partners to better understand the use cases and limitations of GPT-4o mini. They partnered with companies like Ramp and Superhuman, who found GPT-4o mini to significantly outperform GPT-3.5 Turbo in tasks such as extracting structured data from receipt files and generating high-quality email responses when provided with thread history.
Built-in Safety Measures
OpenAI integrates safety into their models from the outset, reinforcing it at every development step. During pre-training, they filter out information that they do not want their models to learn from or output, such as hate speech, adult content, sites that primarily aggregate personal information, and spam. Post-training, they align the model’s behavior with their policies using techniques like reinforcement learning with human feedback (RLHF) to enhance the accuracy and reliability of the models’ responses.
GPT-4o mini incorporates the same safety mitigations as GPT-4o, carefully assessed through both automated and human evaluations according to OpenAI’s Preparedness Framework and voluntary commitments. More than 70 external experts in fields like social psychology and misinformation tested GPT-4o to identify potential risks, which OpenAI addressed. The details of these assessments will be shared in the forthcoming GPT-4o system card and Preparedness scorecard. Insights from these expert evaluations have helped improve the safety of both GPT-4o and GPT-4o mini.
Building on these learnings, OpenAI’s teams worked to enhance the safety of GPT-4o mini using new techniques informed by their research. GPT-4o mini in the API is the first model to apply the instruction hierarchy method, which improves the model’s ability to resist jailbreaks, prompt injections, and system prompt extractions, making its responses more reliable and safer to use in applications at scale.
OpenAI will continue to monitor how GPT-4o mini is used and improve the model’s safety as they identify new risks.
Availability and Pricing
GPT-4o mini is now available as a text and vision model in the Assistants API, Chat Completions API, and Batch API. Developers pay 15 cents per million input tokens and 60 cents per million output tokens (roughly equivalent to 2500 pages in a standard book). Fine-tuning for GPT-4o mini will be rolled out in the coming days.
In ChatGPT, Free, Plus, and Team users will have access to GPT-4o mini starting today, replacing GPT-3.5. Enterprise users will also gain access starting next week, in line with OpenAI’s mission to make the benefits of AI accessible to all.
What’s Next
In recent years, OpenAI has witnessed remarkable advancements in AI intelligence paired with substantial reductions in cost. For example, the cost per token of GPT-4o mini has dropped by 99% since the introduction of text-davinci-003, a less capable model, in 2022. OpenAI is committed to continuing this trajectory of driving down costs while enhancing model capabilities.
OpenAI envisions a future where models are seamlessly integrated into every app and website. GPT-4o mini is paving the way for developers to build and scale powerful AI applications more efficiently and affordably. The future of AI is becoming more accessible, reliable, and embedded in daily digital experiences, and OpenAI is excited to continue leading the way.
Read related articles: